Operating Systems 2230
Computer Science & Software Engineering
Lecture 8: Complexity of I/O Devices
Modern operating systems are expected to support a wide variety of different
input and output (I/O) devices.
However, the variety of these devices makes their consistent, logical, and efficient
support difficult. In particular, their I/O transfer rates and transfer sizes vary
enormously, and these are the characteristics of most concern for the efficient
operation of an operating system.
The types of I/O devices supported can be grouped roughly into three distinct
categories:
Human interface devices communicate with the user, including keyboards
(10Bps), mice (50Bps), laser printers (4MBps), and video displays (50MBps).
Machine-readable devices communicate within a single computer system, or
provide digital input from external sensors, including digital-analogue
converters (100KBps), floppy disks (10KBps), magneto-optical disks (1MBps),
magnetic tape (5MBps), and magnetic disks (10MBps).
Communication devices connect computer systems to each other, including
modems (5KBps), standard Ethernet (1.25MBps), ATM networks (19.375MBps),
and very fast Ethernet (125MBps).
Note that the (approximate) I/O transfer speeds shown above are presented
in multiples of bytes-per-second (Bps).
Many I/O devices, typically communications/networking devices, report their
transfer speeds in bits-per-second (bps). Moreover, communication speeds are
given in multiples of 1000bps, not 1024bps, so transferring one megabyte
(2^20 bytes) of data over a 1Mbps link takes nearly 5% longer than a naive
calculation would suggest!
The diversity of uses to which I/O devices are put makes it difficult for an
operating system to adopt a (single) uniform and consistent approach to their
management. We can highlight many of these differences:
Data rate: our previous examples have highlighted transfer rates spanning
7-8 orders of magnitude.
Application: the expected use of a device affects the software policies and
priorities employed by an operating system. For example, different output
devices may be supported at different priorities (particularly in a real-time,
alert-based system) and otherwise identical disk drives may be managed
differently if one is the swapping device, and the other “only” stores users’
files.
Complexity of control: devices such as mice and keyboards require little
control (being essentially read-only devices), whereas bidirectional, mirrored
disk drives are much more complex.
Data transfer models: transfers are typically either character stream-based
(keyboards) or block-based (disks and tapes).
Error management: complex I/O devices often recover from their own errors,
and the operating system only hears of catastrophic failures. In addition, some
errors may be handled by the operating system, whereas others
must “percolate” up to the user’s application.
Stallings describes I/O management as the “messiest aspect of operating system
design”.
Types of I/O Functions
An operating system may be expected to support I/O using one of three methods. Which method is employed depends on the complexity of the I/O device:
Programmed, or polled, I/O
The processor issues an I/O-based instruction on behalf of the currently executing process. The process loops incessantly until the I/O request is satisfied.
Of course, in a multi-programmed environment, all other processes are delayed,
too. This technique is often termed busy-waiting.
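As a concrete illustration, the sketch below busy-waits on the status register of a
hypothetical memory-mapped device; the register addresses and status bit are invented
for the example and do not correspond to any real hardware.

    /* A minimal sketch of programmed (polled) I/O for a hypothetical
     * memory-mapped device.  The addresses and the READY bit below are
     * made up; real register layouts are device-specific. */
    #include <stdint.h>

    #define DEV_STATUS   ((volatile uint8_t *)0x40001000)  /* hypothetical */
    #define DEV_DATA     ((volatile uint8_t *)0x40001004)  /* hypothetical */
    #define STATUS_READY 0x01                              /* hypothetical bit */

    /* Read one byte from the device by busy-waiting on its status register. */
    static uint8_t polled_read_byte(void)
    {
        /* The processor loops ("busy-waits") until the device reports ready;
         * no other useful work is done on this CPU while it spins. */
        while ((*DEV_STATUS & STATUS_READY) == 0)
            ;                         /* spin */
        return *DEV_DATA;             /* the transfer itself is one load */
    }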
Interrupt-driven I/O
The processor again issues an I/O-based instruction on behalf of the currently
executing process.
The process either continues execution until it is informed that the I/O has
completed (termed asynchronous I/O), or it is blocked so that another process
may execute; the original process is eventually marked as Ready when
the operating system receives the appropriate “I/O done” hardware interrupt.
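From the process’s point of view, the two cases look roughly like the sketch below,
using standard POSIX calls (on older glibc, link with -lrt for the aio_* calls). Error
handling is omitted, and fd is assumed to be an already-opened device or file.

    #include <aio.h>
    #include <string.h>
    #include <unistd.h>

    static char buf[512];

    void blocking_read(int fd)
    {
        /* Synchronous case: the process is marked Blocked inside read();
         * other processes run until the "I/O done" interrupt arrives and
         * this process is made Ready again. */
        read(fd, buf, sizeof buf);
    }

    void asynchronous_read(int fd)
    {
        /* Asynchronous case: the request is merely queued and the process
         * keeps executing, collecting the result later. */
        struct aiocb cb;
        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        cb.aio_buf    = buf;
        cb.aio_nbytes = sizeof buf;
        aio_read(&cb);

        /* ... other useful computation happens here ... */

        const struct aiocb *list[1] = { &cb };
        aio_suspend(list, 1, NULL);   /* block only once the data is needed */
        (void)aio_return(&cb);        /* number of bytes actually read */
    }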
Direct Memory Access (DMA)
The processor issues an I/O-based instruction on behalf of the currently executing process, but directs the request to a DMA (hardware) module. The DMA
module manages the data transfer between main memory and the I/O device
without processor intervention.
When the whole data block has been transferred, the DMA module interrupts
the processor (as described above).
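The sketch below shows roughly what “directing the request to a DMA module” looks
like from a driver’s perspective, assuming a hypothetical DMA controller with source,
destination, count, and control registers at made-up addresses; real controllers differ
widely in their register layouts.

    #include <stdint.h>

    struct dma_regs {
        volatile uint32_t src;      /* bus address to read from               */
        volatile uint32_t dst;      /* bus address to write to                */
        volatile uint32_t count;    /* number of bytes to transfer            */
        volatile uint32_t control;  /* start bit, direction, interrupt enable */
    };

    #define DMA0       ((struct dma_regs *)0x40002000)    /* hypothetical */
    #define DMA_START  0x1                                /* hypothetical bits */
    #define DMA_IRQ_EN 0x2

    void dma_start_transfer(uint32_t device_addr, uint32_t mem_addr, uint32_t nbytes)
    {
        /* The processor only describes the transfer; the DMA module moves the
         * block itself and raises a single interrupt when it has finished. */
        DMA0->src     = device_addr;
        DMA0->dst     = mem_addr;
        DMA0->count   = nbytes;
        DMA0->control = DMA_START | DMA_IRQ_EN;
        /* The calling process can now be blocked (or continue); the "DMA done"
         * interrupt handler will mark it Ready again. */
    }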
Complexity of I/O Responsibilities
As computer hardware and operating systems have evolved, the methods employed to manage I/O have increased in complexity. These evolutionary steps
have been:
1. the processor directly controls the I/O device.
2. an I/O controller or module is employed. The processor communicates
with this controller using programmed I/O (no interrupts), but is unaware
of the device’s external interface.
3. an I/O controller is again used, but efficiency is increased because the
controller interrupts the processor when ready.
4. the I/O controller communicates with a DMA module, relieving the
processor of almost all I/O responsibilities.
5. the I/O controller supports its own processor and I/O instruction set. The
main processor initiates I/O by providing the controller with the address
of a sequence of I/O instructions in main memory. When the I/O channel
has fetched and executed this sequence, it interrupts the main processor.
6. the I/O controller supports its own processor and local memory. The I/O
controller is now directed from the main processor by simply providing
a description of the I/O task required. When this I/O processor has
performed its task, it interrupts the main processor.
In this evolutionary sequence, the I/O controller becomes increasingly “intelligent”, and the main processor is increasingly relieved of any I/O responsibilities,
leaving it to perform other computations.
Direct Memory Access
The Direct Memory Access (DMA) controller has the responsibility of transferring blocks of data between main memory and any I/O devices. Typically,
data sizes are kilobytes or megabytes.
The DMA module uses the main bus to perform the transfer. Ideally this use
will be when the processor does not require the bus, otherwise the DMA module
must suspend the processor’s execution.
The latter technique is often termed cycle-stealing, as the DMA module effectively steals bus cycles from the processor.
Figure 1: DMA and Interrupt Breakpoints.
As can be seen from Figure 1, the processor does not need to use the bus all
of the time (and certainly not for data stored entirely in its registers). This
provides opportunities for the DMA controller to transfer data using the bus
when it is otherwise idle.
The DMA module is able to suspend the processor during its instruction fetch-decode-execute cycle. During the DMA breakpoints, the DMA module suspends the processor and transfers a single unit (typically only one byte) between
memory and the I/O module.
Notice that the DMA module does not interrupt the processor, it just suspends
it. There is no need for the processor to save its execution context and execute
another routine, as the DMA module does not alter this context. The processor
runs slower due to its suspension, but not as slow as if it were interrupted at
the completion of each byte’s transfer or if polling were being used.
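A back-of-the-envelope comparison makes the point; the parameter values below are
purely illustrative assumptions, not measurements of any particular machine.

    #include <stdio.h>

    int main(void)
    {
        const double block_bytes        = 4096;  /* one block transferred        */
        const double stolen_cycles_byte = 1;     /* bus cycles stolen per byte   */
        const double irq_cycles         = 100;   /* cycles to enter and leave an
                                                    interrupt handler (hypothetical) */

        double dma_cost = block_bytes * stolen_cycles_byte;
        double irq_cost = block_bytes * irq_cycles;

        printf("cycle stealing : %.0f processor cycles lost\n", dma_cost);
        printf("per-byte IRQs  : %.0f processor cycles lost (%.0fx worse)\n",
               irq_cost, irq_cost / dma_cost);
        return 0;
    }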
Direct Memory Access Organisation
At the hardware level, the main processor, DMA module, and other I/O modules may be configured in different ways (Figure 11.3; this and all subsequent
figures are taken from Stallings’ web-site):
In 11.3(a), DMA modules are separate devices, on the same bus as the processor
and I/O modules.
In 11.3(b), more sophisticated (faster) devices have their own DMA controller,
or a single DMA controller (here, often termed an I/O channel) may directly
support multiple devices.
In 11.3(c), the DMA module acts as an I/O bridge between the main system
bus and a new I/O bus.
It may now be possible for a single system to have a variety of different I/O bus
standards, such as IDE (Integrated Drive Electronics) or SCSI (Small Computer System Interface) for disk drives, a USB v2.0 bus (Universal Serial Bus,
60MBps) for external drives and scanners, and a Firewire bus (IEEE 1394,
50MBps) for digital multimedia.
This decreases the number of direct interfaces that the DMA module must
support, and removes half of the main bus traffic which can potentially interfere
with the processor’s execution.
Moreover, some devices may now only be contacted via their DMA (now, bus)
controller.
Figure 11.3 Alternative DMA Configurations: (a) single-bus, detached DMA; (b) single-bus, integrated DMA-I/O; (c) I/O bus.
Logical Organisation of I/O
As with most aspects of operating system design, a hierarchical structure can
be employed to decompose I/O responsibilities into manageable subproblems.
At the lowest level of the I/O hierarchy are sections of code which must interact
directly with hardware, and complete their activities in a few billionths of a
second.
At the highest level, application programs wish to communicate their I/O requirements at a more logical level, and wish to be isolated from hardware
specifics. We see these more “logical” interfaces represented in a consistent,
small set of system calls.
There are three distinct levels in this hierarchy:
Logical I/O: treats the I/O device as a logical resource, ignoring the details
of how the device is actually controlled. Operating system interfaces allow
application programs to deal with the device through familiar operations such as
open, seek, read, write, and close (a small example follows this list).
Device I/O: the required I/O operations and the data’s location are converted to sequences of instructions and accesses.
Scheduling and control: the operating system schedules the I/O requests
in an attempt to maximise their throughput (based on the I/O device’s
characteristics). “Returning” interrupts are also handled at this level, with
control information (status) being returned to the invoking processes.
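The example promised above: at the logical I/O level an application manipulates a
device (here an ordinary file stands in for one) purely through that small set of
system calls, with no knowledge of the hardware underneath. The file name is a
placeholder.

    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        char block[512];
        const char msg[] = "hello, device-independent world\n";

        int fd = open("/tmp/example.dat", O_RDWR | O_CREAT, 0644);
        if (fd < 0)
            return 1;

        write(fd, msg, sizeof msg - 1);       /* logical write  */
        lseek(fd, 0, SEEK_SET);               /* logical "seek" */
        read(fd, block, sizeof block);        /* logical read   */
        close(fd);

        /* Nothing above says whether the bytes ended up on an IDE disk, a
         * SCSI disk, or something else entirely; that is the point of the
         * logical I/O layer. */
        return 0;
    }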
Examples of Logical I/O Organisation
How much operating system code is “squeezed” above Device I/O depends
mostly on how the device is represented to an application (Figure 11.4). For
example, the Communications port’s code may include a full TCP/IP protocol
suite, requiring tens of thousands of lines of code.
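Even so, the logical interface that the application sees remains small. The sketch
below opens a TCP connection using standard Berkeley sockets; the host name and port
number are placeholders, not real services, and error handling is minimal.

    #include <netdb.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        struct addrinfo hints, *res;
        memset(&hints, 0, sizeof hints);
        hints.ai_family   = AF_UNSPEC;
        hints.ai_socktype = SOCK_STREAM;

        if (getaddrinfo("some.server.example", "7", &hints, &res) != 0)
            return 1;

        int s = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
        if (s >= 0 && connect(s, res->ai_addr, res->ai_addrlen) == 0) {
            char reply[64];
            write(s, "ping\n", 5);          /* the whole protocol stack is      */
            read(s, reply, sizeof reply);   /* hidden behind read() and write() */
        }
        if (s >= 0)
            close(s);
        freeaddrinfo(res);
        return 0;
    }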
Figure 11.4 A Model of I/O Organization: (a) local peripheral device, (b) communications port, (c) file system.
The Need For I/O Buffering
As we have seen, if a process wishes to input a block of characters, it may either
poll the I/O module until the request is satisfied, or employ interrupt-driven
I/O. The second approach is more efficient as other processes may execute while
we wait for the I/O.
However, the interrupt-driven approach interferes with the desirable use of
swapping: while the I/O blocked process is waiting, the operating system may
choose to suspend the process by moving some of its physical memory to the
swapping device.
The memory (page) holding the buffer to receive the input must not be swapped
out, as it must be resident when the I/O request is satisfied. It is thus impossible
to completely swap out the process.
Moreover, the operating system must avoid the single-process deadlock condition under which the process is blocked on the I/O event, and the I/O event is
blocked waiting for the process to be swapped in!
To avoid this deadlock, the region of memory receiving the I/O result must be
locked in memory, however long this takes.
There is an analogous condition for processes wishing to perform output.
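The operating system pins such pages internally, but the same idea is visible to
applications through the POSIX mlock() call. The sketch below is a user-level
illustration of pinning a buffer, not a description of how the kernel itself does it.

    #include <stdlib.h>
    #include <sys/mman.h>

    #define BUFSZ 4096

    int main(void)
    {
        char *buf = malloc(BUFSZ);
        if (buf == NULL)
            return 1;

        if (mlock(buf, BUFSZ) != 0)   /* lock the pages into physical memory */
            return 1;

        /* ... issue I/O that reads into or writes from buf; its pages cannot
         *     be swapped out underneath the pending transfer ... */

        munlock(buf, BUFSZ);          /* allow the pages to be swapped again */
        free(buf);
        return 0;
    }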
Types Of I/O Buffering
The operating system must ensure that the pages of memory involved in I/O
requests are not swapped out while requests are pending. The standard solution
is for the operating system to reserve some memory, collectively termed I/O
buffers, for all I/O requests (Figure 11.5).
An application’s data is either initially copied to an I/O buffer (for an output
operation), or finally copied from an I/O buffer (for an input operation). Under
Linux, the free command shows how much memory the kernel is currently devoting
to such buffers and to the page cache.
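Figure 11.5 describes buffers held by the operating system, but the benefit of double
buffering can be sketched at user level too. The example below uses POSIX asynchronous
I/O to fill one buffer while the previous one is being processed; "input.dat" and
process_block() are placeholders for the application’s real input file and work.

    #include <aio.h>
    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    #define BLK 4096

    static void process_block(const char *data, ssize_t n) { (void)data; (void)n; }

    int main(void)
    {
        static char buf[2][BLK];
        int fd = open("input.dat", O_RDONLY);
        if (fd < 0)
            return 1;

        struct aiocb cb;
        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        cb.aio_buf    = buf[0];
        cb.aio_nbytes = BLK;
        cb.aio_offset = 0;
        aio_read(&cb);                          /* start filling buffer 0 */

        for (int cur = 0; ; cur ^= 1) {
            const struct aiocb *list[1] = { &cb };
            aio_suspend(list, 1, NULL);         /* wait for the current buffer */
            ssize_t n = aio_return(&cb);
            if (n <= 0)
                break;
            off_t next_off = cb.aio_offset + n;

            /* Start filling the other buffer before processing this one. */
            memset(&cb, 0, sizeof cb);
            cb.aio_fildes = fd;
            cb.aio_buf    = buf[cur ^ 1];
            cb.aio_nbytes = BLK;
            cb.aio_offset = next_off;
            aio_read(&cb);

            process_block(buf[cur], n);         /* overlap work with input */
        }
        close(fd);
        return 0;
    }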
Figure 11.5 I/O Buffering Schemes (input): (a) no buffering, (b) single buffering, (c) double buffering, (d) circular buffering.
Disk I/O and Its Performance
Over the last thirty years, speed improvements of processors and memory have
far outstripped the improvements in disk speeds. Disks have become much
larger, but not correspondingly faster.
Electromechanical disks still provide the best overall combination of access time,
cost of storage, and storage capacity for read/write digital storage.
Figure 11.16 Disk Data Layout (tracks and sectors, with inter-sector and inter-track gaps).
The speed of all file servers and user applications is limited by the number of
disk drive operations (reads and writes) that can be performed per second.
When operating at full capacity, a disk is spinning with a constant angular
velocity. To read or write information from a disk, (one of) its disk head(s)
must be positioned over the correct track, and then must wait for the correct
sector to spin underneath the head (Figure 11.16).
Disk Access Speeds
On a fixed-head disk, track selection simply involves electronically selecting the
correct head over the required track (standard PC-BIOS supports up to 255
heads per drive).
More typically, on a moving head disk, one head must first be positioned over
the required track (Enhanced IDE drives support a maximum of 16 heads per
disk).
The time spent waiting for the head to position itself is termed the seek time.
Typical high-performance disk drives can move the read/write head over half
the disk surface (an average seek) in 8-12ms.
The time taken for the correct sector to spin under the head is termed the rotational delay. At 5,400 revolutions per minute (a typical value for inexpensive
drives), the disk completes a full revolution in 1/90s, i.e. about 11ms. The
average rotational delay, half a revolution, is thus about 5.5ms.
Older disks (usually those less than 540 MBytes) spin at 3,600 RPM. More
expensive drives are typically rated at 7,200 RPM or 10,000 RPM.
The sum of the seek time and the rotational delay is termed the access time.
Once the correct sector is positioned under the head, the data may be transferred
between the disk and the disk controller (either read or written).
While an IDE interface transfers data at a burst rate of about 2 MBytes/s and
SCSI typically transfers data at 10 MBytes/s, the data is read from the actual
disk drive at 5 Mbits/s to 48 Mbits/s. A 4,096 byte transfer (a typical cluster
size) would therefore take roughly 0.7-6.5 ms to come off the platter.
Finally, the data may be transferred from the disk’s controller to the main
memory (hopefully via the DMA module).
This transfer rate depends on the bus used in the PC. The ISA specification
transfers at about 2 MBytes/s, and the PCI specification at up to 132 MBytes/s.
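Putting the figures above together gives an order-of-magnitude estimate of a single
4-KByte disk read; the particular values chosen below (a 10 ms average seek, 5,400 RPM,
a 24 Mbit/s media rate) are illustrative picks from the ranges quoted, not measurements.

    #include <stdio.h>

    int main(void)
    {
        const double seek_ms     = 10.0;                 /* average seek           */
        const double rpm         = 5400.0;
        const double rot_ms      = 0.5 * 60000.0 / rpm;  /* half a revolution      */
        const double media_bps   = 24e6;                 /* bits/s off the platter */
        const double transfer_ms = 4096 * 8 / media_bps * 1000.0;

        double total = seek_ms + rot_ms + transfer_ms;
        printf("seek %.1f ms + rotation %.1f ms + transfer %.1f ms = %.1f ms\n",
               seek_ms, rot_ms, transfer_ms, total);
        printf("=> roughly %.0f such operations per second per disk\n",
               1000.0 / total);
        return 0;
    }

With these assumptions the total comes to roughly 17 ms, i.e. only about 60 such
operations per second per disk, which is why disk requests are worth scheduling
carefully (next section).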
We can summarise the times required for a typical disk operation:
35% Head seek to required track.
25% Rotational latency (for correct sector to spin under head).
25% Data transfer from disk to controller.
10% Disk driver software handling.
5% Data transfer from controller to memory.
Disk Scheduling Policies
Minimising the disk’s seek time is clearly important in minimising the total access
time for disk data.
On a single-user workstation, a typical sequence of disk read requests (track + sector)
will arrive (relatively slowly) from a single application, often in sequential order.
However, on a multi-user system, disk requests may arrive in an effectively random
sequence of (track + sector) pairs.
We could consider the random arrival order as the worst-case access pattern,
giving the worst possible performance.
The simplest disk scheduling strategy is the obvious first-in-first-out (FIFO),
in which all requests are serviced in order. It is a fair strategy as all requests
are eventually satisfied and have a bounded response time.
Figure 11.7 introduces an example track request sequence of 55, 58, 39, 18,
90, 160, 150, 38, 184. We can consider the effect of some simple disk
scheduling policies on this single data set.
We can improve the scheduling strategy by appreciating that the selection function (in software) can execute much faster than the required hardware positioning: track numbers can be sorted very quickly. Stallings further introduces
three simple strategies:
Shortest Seek Time First (SSTF) selects the next disk request which
results in the least movement of the disk head.
The SCAN policy keeps moving the disk head in the same direction until there are no more requests in that direction. This policy overcomes
potential starvation in the SSTF policy.
The C-SCAN policy restricts scanning to one direction only, thus reducing
the maximum expected delay experienced by new requests. (A small simulation
of these policies on the example sequence above follows.)
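The following sketch computes the total head movement for FIFO, SSTF, and SCAN on the
example request sequence, assuming the head starts at track 100 (the starting position
used in Stallings’ example) and that SCAN first moves towards higher track numbers;
C-SCAN is omitted for brevity.

    #include <stdio.h>
    #include <stdlib.h>

    #define N 9
    static const int requests[N] = { 55, 58, 39, 18, 90, 160, 150, 38, 184 };
    #define START 100

    static int fifo(void)
    {
        int pos = START, moved = 0;
        for (int i = 0; i < N; i++) {
            moved += abs(requests[i] - pos);
            pos = requests[i];
        }
        return moved;
    }

    static int sstf(void)
    {
        int pending[N], pos = START, moved = 0;
        for (int i = 0; i < N; i++) pending[i] = requests[i];

        for (int served = 0; served < N; served++) {
            int best = -1;
            for (int i = 0; i < N; i++)          /* pick the closest request */
                if (pending[i] >= 0 &&
                    (best < 0 || abs(pending[i] - pos) < abs(pending[best] - pos)))
                    best = i;
            moved += abs(pending[best] - pos);
            pos = pending[best];
            pending[best] = -1;                  /* mark as served */
        }
        return moved;
    }

    static int cmp(const void *a, const void *b) { return *(const int *)a - *(const int *)b; }

    static int scan(void)
    {
        int sorted[N], pos = START, moved = 0;
        for (int i = 0; i < N; i++) sorted[i] = requests[i];
        qsort(sorted, N, sizeof sorted[0], cmp);

        for (int i = 0; i < N; i++)              /* sweep upwards first ...  */
            if (sorted[i] >= START) { moved += sorted[i] - pos; pos = sorted[i]; }
        for (int i = N - 1; i >= 0; i--)         /* ... then sweep back down */
            if (sorted[i] < START)  { moved += pos - sorted[i]; pos = sorted[i]; }
        return moved;
    }

    int main(void)
    {
        printf("FIFO: %d tracks moved\n", fifo());
        printf("SSTF: %d tracks moved\n", sstf());
        printf("SCAN: %d tracks moved\n", scan());
        return 0;
    }

On this sequence FIFO moves the head 498 tracks in total, whereas SSTF (248 tracks) and
SCAN (250 tracks) need roughly half as much movement, consistent with the average seek
lengths reported in Table 11.3.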
Figure 11.7 Comparison of Disk Scheduling Algorithms (see Table 11.3): track number against time under (a) FIFO, (b) SSTF, (c) SCAN, and (d) C-SCAN.