Download Scheduling Techniques for Operating Systems

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

VS/9 wikipedia , lookup

CP/M wikipedia , lookup

Distributed operating system wikipedia , lookup

DNIX wikipedia , lookup

Unix security wikipedia , lookup

OS 2200 wikipedia , lookup

Burroughs MCP wikipedia , lookup

Process management (computing) wikipedia , lookup

Transcript
Scheduling
Techniques for
Operating Systems
R. B. Bunt
University of Saskatchewan
I notice so many people slipping away
And many more waiting in the lines
Paul Simon, "Congratulations"
Charing Cross Music, ( 1971
Introduction
One of the primary functions of an operating system is
to distribute the resources under its control among the
users of the system in such a way as to achieve
installation standards of performance (including service).
One of the most important resources in a computer
system is, undeniably, the processor. For example, all
system activities require time on at least one processor.
So it is hardly surprising that processor scheduling has
received considerable attention since the very early days
of computing, and that many techniques for accomplishing
this essential task have been developed. These have been
variously simulated, analyzed mathematically, and occasionally, implemented in actual systems. This paper
looks at some techniques for scheduling processors and
compares their implementations in a number of familiar
operating systems.
Figure 1. Both the operating constraints and the performance objectives change with the levels. Although the
terminology may differ from author to author, the basic
division remains much the same. At the lowest level, decisions are made concerning the allocation of physical
resources such as CPU cycles to processes* in the system.
This level of scheduling will be referred to as process
management. Since actual physical resources of the system are being managed at this level, the performance
objectives should be given in terms of measures of
resource utilization and efficiency. In effect, the process
manager takes a real processor and through its scheduling
presents the illusion of (or simulates) a number of independent virtual processors. Higher-level decisions, concerning the allocation of these virtual processors, are
made at the level of job management. At this level it is
assumed that a certain number of these virtual processors
exists (the maximum allowed level of multiprogramming).
0
A general model of processor scheduling
According to one definition, given by Hellerman and
Conroy,' an operating system scheduler "is an algorithm
that uniquely specifies which job is to receive next
service by a resource."
Schedulers are usually described in isolation (if at all),
and consequently it is sometimes difficult to see how
they are related to one another. In this section, to provide a common framework for the description of actual
implementations, a general model will be presented. For
simplicity, this discussion is based on a single-processor
system; extensions to multiprocessor systems are
straightforward.
For the sake of convenience, the overall scheduling
function is often divided into distinct levels, as shown in
10
Mange
-
0
~
-
anager-Fi
Physical
------. -Vi,...
I .. - PrcsosRsouc
I---------
User Jobs
Figure 1. The division of the scheduling function.
*It is assumed that the reader is familiar with the concept of
process or task. The literature abounds with definitions (see,
for example, Dijkstral and Horning and Randell3). For the
purpose of this paper it suffices to view a process simply as a
program in execution.
COMPUTER
The job manager sees a set of user-submitted jobs
competing for these virtual processors and allocates
them according to some predetermined policy. The process manager sees a community of sequential processes,
each executing on its own virtual processor, requesting
actual physical resources such as CPU cycles. Whereas
the process manager is primarily concerned with measures of resource utilization, the job manager, since it
deals directly with users of the system, ought to have
its performance assessed in terms of measures of service
to the users (such as turnaround time).
It is possible now to chart the history of a user request under this model. A user requests some action
of the system by submitting a job to the system. The
job manager allocates a virtual processor to this job if
one is free, and a corresponding process (or possibly
more than one) is created to perform the requested
action. If there are no free virtual processors, the job
is queued until one becomes available. At this level it
is assumed that the virtual processor will actually execute the process. The process manager, however, must
allocate to the process whatever resources it may need
(including CPU time) for the action to be performed.
Allocation strategies are required for each level. Again,
it is important to keep in mind both the operating constraints and the performance objectives at each level.
Decisions by the process manager must be made very
quickly and generally last for only a brief period. Strategies
involving complicated queuing methods, complex data
structures, or a lengthy analysis of process characteristics
may consume unwarranted quantities of the very resource
being scheduled. At the job level more effort can be (and
ought to be) expended on making decisions resulting in
the best service to the users of the system. In terms
of performance objectives, evaluation of the job manager
should focus on job-oriented measures such as turnaround or throughput. Hellerman and Conroy' refer to
these as "job performance measures," normal indicators
of the level of service afforded users of the system. On
the other hand, the process manager is most appropriately evaluated in terms of system-oriented measuressuch as CPU utilization. Hellerman and Conroy refer
to these as "resource performance measures." In the
remaining portions of this section some classical allocation strategies used at both the process and job levels will
be described. (A very good presentation along the same
lines is given by Muntz.4)
Strategies for process management. Process managers
appear in the literature under a number of aliases. The
process manager is known variously as the "dispatcher,"5
the "traffic controller,"6 the "CPU scheduler,"7 the "shortterm scheduler,"8 the "process scheduler,"9 or simply "the
scheduler."10
Most strategies operate by moving processes through a
series of well-defined states. For the model of the paper, a
process having instructions executed by its virtual processor is said to be in the running state; it is blocked if it is
waiting for some wakeup signal (from another process or
an external agent) to resume execution. For example, if
a running process wants to perform an input operation,
it issues the appropriate input command and blocks,
remaining blocked until a signal that the requested input
operation has been completed permits it to resume. The
prpcess manager handles the block and wakeup instruc-
tiES.
Idthough all the virtual processors are logically operatin' in parallel, the real processor (here, the CPU) can in
fact be executing only one instruction at a time. Thus,
while several processes may be running, only one proOctober 1976
cess at a time can actually be executing. Therefore,
from the point of view of the actual processor, three
states are possible for each process. Processes that are
neither executing nor blocked are said to be ready. If
the executing process' becomes blocked, the process
manager decides which of the ready processes is to run
next. The movement through states is summarized in
Figure 2.
PROCESS
MANAGEMENT
JOB
MANAGEMENT
Figure 2. A two-level scheduling model.
The running processes are often ordered in priority
according to some criterion, and each time the executing
process blocks, the process manager passes control to the
highest-priority ready process. Priority may be purchased
by the user, deserved by his status or rank, or earned
by way of the executing process displaying certain desirable characteristics.'1 For example, in some systems
the I/O activity of the running processes is monitored
dynamically and higher priority is given to those processes that seem to be doing a lot of I/O. An I/Obound process will frequently release control of the CPU
for its I/O operations. The time necessary to complete
the requested operation can then be overlapped with processing of a lower-priority process. This technique improves system throughput and at the same time alleviates
a possible system bottleneck by keeping I/O devices busy.
This scheme can be elaborated to include other characteristics of running processes as well.
A second method of distributing the CPU is through
time slicing. Rather than allow processes to run until
they block themselves, each ready process receives in turn
some small amount of CPU time known as a time slice
(or quantum), at the end of which it is interrupted by a signal from an interval timer (or clocking process). The time
slice may be fixed or varied, and the sequence in which
ready processes are allocated time slices (as well as the
time slice duration) may be determined in a variety of
ways. For example, the ready processes may be allocated
fixed-length time slices in a cyclical or round robin fashion.
In theoretical or simulation studies, true parallel execution is often assumed. A method known as processor
sharing serves all the ready processes simultaneously
at a rate inversely proportional to their current number.
Thus a process that would run at a rate r if it were the
11
only one being processed, runs at a rate r/n if there are
n ready processes. This assumption makes possible the
parallel advancement of jobs without requiring the complication introduced by the mechanics of process switching. Although this strategy cannot be implemented in
conventional computers (and therefore will be considered
no further in this paper), it is often useful as a yardstick against which other strategies can be compared
in theoretical or simulation studies. A good discussion of
processor sharing is given by Coffman and Kleinrock."
Strategies for job management. At the level of job
management, the real processor has been replaced logically
by some number of virtual processors (as determined by
the level of multiprogramming). In a standard batch processing environment, users submit jobs to the computer
system in the form of programs. The number of virtual
processors is usually considerably smaller than the number of jobs competing for them. If there are no available
virtual processors, the jobs are queued by the job manager
according to some priority. When a virtual processor
becomes available, the job manager allocates it to the
waiting job with the highest priority.
The decision algorithm used by the job manager,
commonly referred to as the "scheduling algorithm,"
enforces a sequencing discipline on the jobs waiting for
virtual processors and determines the order in which they
will be allowed access to the virtual processors. Although
in large part a political decision, the choice of scheduling
discipline is a significant factor in the performance of the
system and should be made according to the needs of the
particular system. For example, in a system devoted to
interactive use, emphasis may be placed on minimizing
worst-cage response to short terminal requests. In other
systems average turnaround might be considered most
important. These two systems would require different
scheduling strategies at the job level.
The simplest scheduling discipline is "first-come-firstserved" (FCFS), also known as "first-in-first-out" (FIFO).*
All jobs are assumed to be equally preferred, and thus
are serviced to completion in the order that they arrive.
Although it has been said'3 that "an inherent sense of
fair play has elevated [the FCFS rule] to an eminence
out of all proportion to its basic virtue," it can be
quite adequate in certain situations, and is often used as
a basis of comparison for other disciplines. Little is
required in the way of system overhead for queue
management, but the system performance can be very
erratic, particularly under heavy loading.
An important class of rules selects certain users as
preferred and gives them better service (possibly at
increased cost). A valuable principle to be borne in mind
is known as Kleinrock's conservation law'4 which states,
informally, that for given arrival and service patterns, a
particular weighted sum of average waiting times for all
jobs is invariant to the scheduling discipline used. This
says that scheduling can only improve the service afforded some jobs at the expense of that given others.
Preferential scheduling algorithms differ in their choice of
users to be given preferential treatment.
A common preferential discipline is "shortest first."
This rule requires a priori knowledge of each job's
running time and bases the job's priority on this information. Each time a job is completed, its virtual processor
is allocated to the waiting job having the smallest
requirement. A distinction may be made'3 between the case in which the processing requirements are
known exactly [called "shortest processing time" (SPT)
or "shortest job first" (SJF)] and the case wherein they
are estimated in some fashion [such as "shortest expected
processing
12
processing time" (SEPT)]. This is not a difficult rule to
implement but requires more overhead for queue management than is required by FCFS. It gives much better
service to jobs with small processing requirements, but
does so by giving poorer service to the longer jobs. Since
users with short jobs would be quite unhappy with
delays that might be tolerated by users with much longer
jobs, this seems a reasonable approach. In job mixes
distributed in such a way that there are many more
short jobs than long jobs (which is often the case in the
normal operating environment"**), this rule makes many
people happy at the expense of relatively few. Of all the
rules not using preemption, this rule yields the smallest
mean turnaround time'3 provided that accurate a priori
knowledge is available.
In both FCFS and SPT, a job once scheduled (or
allocated a virtual processor) is served until it is completed. The introduction of a preemption mechanism leads
to interesting variations of nonpreemptive rules. Preemption involves interrupting the job currently executing
on a particular virtual processor, recording the current state of its execution, perhaps rolling the job out
to secondary storage, and allocating the virtual processor
to a new job. A certain amount of processor time (called
"preemption overhead" or cost) is consumed by this
operation.*** Normally, preempted jobs are returned to the
same queues in which the arriving jobs are held. When
a preempted job again comes up for service, execution
resumes at the point of interruption; consequently, this
technique is known as resume preemption.
The incorporation of preemption in the SPT rule,
yielding "shortest remaining processing time" (SRPT)
or "preemptive shortest job first" (PSJF), results in still
sharper service discrimination between short jobs and
long jobs with the added cost of some preemption overhead. SRPT is simply the natural extension to SPT,
applying the "shortest first" rule at every arrival as well
as every completion. If the new arrival has a smaller
processing requirement than that remaining for the job
currently in service, the job being serviced is preempted
and replaced by the new arrival. It can be shown'7
that SRPT scheduling yields the smallest average turnaround time (but with the highest variance) when arrivals
occur intermittently. However, the cost of preemption in
some systems may negate the advantage of SRPT over
SPT.
Both SPT and SRPT require the scheduler to have
exact a priori knowledge of each job's processing requirements. Such information is not generally available in most
systems. In some systems, estimates provided by the
users themselves form the basis of the scheduling decisions. Although experienced users can, with some practice, learn to estimate fairly closely, in many cases for a
variety of reasons (such as rapidly changing environments,
novice users, weak penalties for bad estimates) the overall accuracy of estimates may be questionable.
*In a multiprogramming system, jobs may not complete in
the order they arrive because of the interleaved execution
enforced by the process manager. Thus "first-in" does not
necessarily imply "first-out."
**In one study of a CP-67 system" it was observed that 85%
of the jobs comprised only 7% of the demand for CPU.
Similar findings have been reported elsewhere.
***The cost of preemption varies from system to system depending on the amount of work involved, the number of
programs in the system, and the proportion of time that can
be overlapped.'6
COMPUTER
A final class of rules requires no a prior7i knowledge
of processing requirements for scheduling decisions. In
the round robin discipline shown in Figure 3, each job,
in turn, is allocated a quantum of uninterrupted service.
If it does not complete within the quantum, it is preempted and returned to the waiting queues, and a new
job is started. The strategy is basically one of sampling
the jobs in turn to see which of them can complete
with a small amount of additional time.
preempted jobs
completions
arrivals
processor
preempted jobs
ompletions
arrivals
processor
system of waiting
queues
Figure 3. The round robin scheduling strategy.
Figure 4. The FBN scheduling strategy.
Some form of round robin scheduling is used in many
interactive systems, and is appropriate since most of the
jobs (or requests for service) are short (perhaps requiring
only a single quantum) and fast response is essential.
The size of the quantum is a design parameter and a
critical factor in the performance of the algorithm. The
expected time any given job might wait to receive its
next quantum of service is proportional to the number of
active jobs n, the quantum size q, and the overhead due
to switching jobs s (assume s << q). This overhead
comprises the cost required for preempting the job currently running and starting a new one, and is clearly
associated with the sampling operation. As n increases,
the expected response- time must also increase if q
and s remain constant. A variable quantum size can
allow the algorithm to be more responsive to the current
load. For example, under light loads n is small and adequate response can be achieved with a large quantum;
therefore, the amount of switching done (and thus-the
total sampling overhead) can be reduced by choosing a
fairly large quantum size. As the load increases, response
degrades; once it becomes unacceptable the quantum
size should be reduced. If q becomes too small, however,
the s term will become significant and the cost of repeated
switching will limit the achievable response.
An important variation of the round robin rule is the
multilevel feedback rule (FB, sometimes called "multilevel foreground/background") shown in Figure 4. An
ordering is imposed on the system of waiting queues corresponding to the number of service periods that the jobs
in the queue have already had. Thus QUEUEo contains
new arrivals, QUEUE1 contains jobs that have been preempted once, and so on. Each of the queues is ordered
FCFS. After the quantum of a job being serviced is
exhausted, the first job in the lowest-numbered nonempty queue is selected for service, and the preempted
job is returned to a queue one level higher than the queue
from which it was chosen. If there are at most N
waiting queues, the rule is sometimes known as FBN. The
FB rule results in sharper short-job discrimination than
the round robin rule by ensuring that long jobs do not
interfere. The result of the movement to lower-priority
queues is an implicit ordering of the jobs by length of
running time (within the limits of accuracy of the quantum
size). Thus the effect is similar to that of a "shortest
first" rule, but is achieved without any advance knowledge of running times. Like round robin, however, there
is associated sampling overhead.
In this section, a number of classical scheduling
methods have been described. These can be classified in a
variety of ways. First, we have non-preemptive rules,
such as FCFS and SPT, versus preemptive rules, such as
SRPT, round robin, and FB. The preemption mechanism
enhances the discriminatory capability of the scheduling
rule, at a cost of a certain amount of system overhead
required for the preemption. If the cost of preemption can
be kept reasonably small, preemptive rules should outperform non-preemptive rules.7
Scheduling rules could also be classified according to
the information they require about the jobs a priori.
Rules such as FCFS, round robin, and FB require no
information, whereas rules such as SPT and SRPT require exact information. As expected, schedulers perform
better if they can take advantage of information about
the nature of the jobs they are scheduling, but in many
systems exact knowledge is not available. Rules such as
SEPT ("shortest expected processing time") rely on
estimates.
Most operating system schedulers resemble one (or
may combine several) of the classical rules. These are
normally altered to meet certain special requirements of
the particular system. In the remainder of this paper,
several important operating systems will be described.
In the descriptions, an attempt will be made to separate
the scheduling component from other aspects of resource
allocation, an approach that is neither easy to do nor
advisable in practice. This approach has been taken in
the interests of space, and hopefully, clarity of description.
Apologies are extended in advance to those who might
rightfully object to this simplification.
October 1976
IBM SystemI360 and
SystemI370 Operating Systems
IBM provides a family of operating systems for its
System/360 and System/370 lines of computers. More
complete descriptions of these systems, the services they
provide, and details of their design are found in the
literature.",9 In this section the techniques for both job
13
management and process management in some of the
more common systems will be described.
The terminology used by IBM differs in some respects
from both that used elsewhere and that used earlier in
this paper. Here we will be as uniform as possibleand that means some liberties will have to be taken with
actual system terminology. In IBM systems, a job is
actually submitted by a user as a collection of job
steps. For purposes of simplification each job step will
be referred to simply as a job in this discussion.
Similarly, the terms "process" will be used in place of
"task," and "virtual processor" in place of "initiator,"
"main storage," or "region."
The IBM operating systems operate essentially on a
job classification basis. The user is allowed considerable
opportunity to make input to the classification decision.
In some of the more sophisticated systems, scheduling
parameters are modified dynamically, with the user information giving starting values.
OS/MFT. The operating system OS/MFT (multiprogramming with a fixed number of tasks) is the simplest
of the systems offering a multiprogramming capability.
Essentially, a fixed number of virtual processors (no
more than 15, typically less) are made available for users
of the system. Associated with each is a fixed amount
of main memory. The virtual processors are numbered
PO, Pl, .... PN with the index used to determine
the dispatching (or process management) priority.
The user classifies his job according to the nature of its
resource demands according to the class definitions
established by the installation. For example, class A
might indicate a job that is I/O-bound, class B a job
that is CPU-bound, class C a short express run, class
D a job requiring tape and/or disk mounts, and so forth.
A system queue is established for each class. Since
several jobs will often belong to the same class, a scheme
is needed -to break ties. First, the user is allowed to
specify a priority for his job. If this fails to produce
a unique candidate, the jobs are selected in the order
they entered the queue (i.e., FCFS within priority and
class).
The system operator assigns up to three of the possible
job classes to each virtual processor. The order the
classes are assigned indicates the job scheduling priority
for that particular virtual processor. That is, the first
class assigned to a virtual processor has highest priority
for the use of that processor, the second class (if there is
one) has second priority, and so on. A job may not be
scheduled on a virtual processor unless it is from a class
assigned to that virtual processor.
In the example illustrated in Figure 5, virtual processor P0 has been assigned, in order of decreasing
priority, classes C, A, and B. This means that first call
on this virtual processor goes to class C jobs, second
call goes to class A jobs, and in the event that both
class C and A queues are empty, a class B job is
scheduled. Class D jobs are not eligible to be scheduled
on P0. Classes have been assigned to the remaining virtual
processors in a similar manner.
The process management technique used in OS/MFT is
known as "highest static priority first" (with preemption).
Associated with each virtual processor is a fixed priority
with processes running on P0 having the highest priority
and processes running on P the lowest. The highestpriority-ready process is scheduled and executes until one
of two events occurs:
(1) the process blocks, for example, on a request for an
I/O operation, or
14
Virtual
Job Queues
Processors
class A
e
class B
third
p
riority
CAB
p1
class C
class D
Assigned Job
Closses
0
|
ACB
DB
Figure 5. Scheduling from classes in OS/M FT.
(2) a process with a higher priority (i.e., running on a
higher-priority virtual processor) becomes ready, for
example, by completing an I/O operation (signaled
by an interrupt).
The assignment of classes to virtual processors has a
significant effect on performance as a result of the
actions of the process manager, illustrating the effect
of interactions between the two levels of scheduling. For
example, if CPU-bound processes are assigned to highpriority virtual processors they will seldom relinquish
control of the real CPU, and I/O-bound processes with
lower priority will have little chance to execute. As a
result, system throughput will be low. Note also that
the user has only indirect control over the attention his
job receives at the process management (or dispatching)
level. The definition of the classes and the assignment
of priorities to these classes (through their assignment
to certain virtual processors) is out of his hands. He is
allowed to specify a job management priority that
affects the service he receives relative to other jobs of the
same class. It is important for the performance of the
entire system for the installation to ensure the users do
not attempt to misrepresent their jobs.
OS/MVT. OS/MVT (multiprogramming with a variable
number of tasks) is similar in many ways to OS/MFT.
The job class and priority within class are once again
determined at the time of job submission. The assignment
of classes to virtual processors and the selection of jobs
are done as in MFT. Because the amount of main memory
associated .with a virtual processor in MVT can vary,
further processing of a scheduled job may have to be postponed until the requested amount of memory can be made
available. The additional complication is described by
Hellerman and Conroy' and will not be dealt with here.
Process management techniques in MVT differ from
those of MFT. Once again the "highest-static-priorityfirst" rule is used, but in MVT the virtual processors
have no inherent priority; the process management
priority is taken from the priority specified by the user on
submission of his job (explicitly or by default), which now
serves two purposes:
(1) for breaking ties within classes at the level of job
management, and
(2) as the process priority at the level of process
management.
This now allows the user to assign high process management priority to I/O-bound jobs and low priority to CPUbound jobs to achieve the high resource usage and improved throughput described earlier. Of course, this presumes that the user knows his job characteristics, and
further, that they remain constant throughout the job's
COMPUTER
execution. Neither of these may be the case. Once again,
care must be taken to prevent abuse of this system.
It is possible for an MVT (or MFT) installation to
employ the time slice option as well as the "higheststatic-priority-first" rule at the level of process management. Under this option, all processes at a certain installation-specified priority are scheduled in a round robin
fashion as described earlier. Processes with priorities
above or below this value are handled in the normal
manner. When the priority of the time slice group
becomes the highest among all the ready processes, each
ready process of the time slice group receives a fixed
quantum of CPU time in turn until interrupted by a
higher-priority process or until all processes in the time
slice group enter the blocked state. Conventional scheduling then takes over.
Among the features offered by HASP is an option
called heuristic dispatching, which tries to improve resource utilization and increase system throughput by
giving high priority to I/O-bound processes. This is done
by monitoring process characteristics as the processes
execute. Each executing process is given a quantum of
CPU time. If the process uses the entire quantum, it is
assumed to be CPU-bound and placed in the CPU subgroup. If the process blocks for an I/O operation during
its quantum, it is assumed to be I/O-bound and placed
in the I/O subgroup. The heuristic dispatcher gives higher
priority to processes in the I/O subgroup and schedules
the CPU subgroup only if all processes in the I/O subgroup are blocked. Processes in the I/O subgroup are
allowed to preempt processes in the CPU subgroup. As
processes change their characteristics during execution,
HASP will change their subgroup. The effectiveness of
the current quantum size in making the distinction between I/O-bound processes and CPU-bound processes is
also monitored at specified long intervals (of many quanta).
If the proportion of processes identified as I/O-bound is
more than a proportion specified as desired by the installation, the quantum size is shortened so as to increase
the number of processes identified as CPU-bound and
bring the ratio down. If the observed ratio is less than
desired, the quantum size is lengthened. The adjustments
are made within specified upper and lower bounds. The
technique of heuristic dispatching has been found to be
very effective, with throughput improvements of almost
19 percent reported by Marshall.'9
As mentioned, it is often difficult for users to make
judgments on the execution characteristics of jobs,
particularly when these characteristics change as the job
executes (for example, jobs may alternate between periods
of CPU-boundedness and 1/0-boundedness) or when inferences on the characteristics of jobs other than the
user's own are required. The HASP (Houston Automatic
Spooling Priority) system offers an enhancement that
attempts to meet these difficulties.5
HASP was originally developed as part of an enhancement to OS/360 for real-time spaceflight control for
NASA's Apollo spaceflights (see the work of Johnstone'8
for a description of the extensions made to OS/360
for this purpose), but it soon became a popular addition
to many OS/360 installations. HASP is primarily concerned with peripheral functions, such as the collecting
of the job stream and its output (following execution) on
direct access devices, and the scheduling of printing and
punching of this output from the direct access devices.
Many of the HASP functions have been designed directly
into the more recent IBM operating systems.
IBM also offers a number of operating systems capable
of supporting virtual storage on the System/370 (The
October 1976
reader not familiar with the concept of virtual storage is
referred to treatments by Doran,'0 Shaw," or Hellerman
and Conroy.'). Basically they are enhanced versions of
OS/MFT and OS/MVT, originally developed for the
System/360. The major differences are in the area of
memory management; the scheduling techniques are
similar to those already described for MFT and MVT.
OS/VS2 is the enhanced version of OS/MVT (OSIVS1
is the enhanced MFT). The major differences in job
management between VS2 and MVT are the support of
more virtual processors (up to 63 as compared to 15), and
the inclusion of techniques to reduce contention of I/O
devices (see IBM guide22 for details). The process manager
incorporates a facility called "automatic priority grouping,"
based on HASP's heuristic dispatching. A particular
priority level can be specified as an automatic priority
group to which the techniques described earlier are
applied. A restriction imposed is that this priority level
cannot also be specified as a time-slicing group. Job
management in VS2 is closely tied to memory management. For example, a "load leveler" can interrupt and
temporarily halt an active job if the paging rate is
assessed to be too high. Thus the number of running
virtual processors is dynamically varied. A good description of the facility is given by Hellerman and Conroy.
MULTICS
The IBM operating systems described in the previous
section are primarily oriented to a batch environment,
although options such as TSO (timesharing option) are
available. An example of an operating system design to
meet a somewhat different need is MULTICS (Multi
plexed Information and Computing Service), developed
jointly by MIT and General Electric. MULTICS offers
both interactive and batch service with considerable emphasis placed on the concept of information sharing.
Performance objectives vary in such a system, and as a
result the scheduling function is handled in a different
fashion. A very complete discussion of all aspects of the
MULTICS system is given by Organick." The MULTICS
virtual storage structure is discussed by Doran.'0
In a system oriented toward interactive timesharing,
the distinction between jobs and processes is somewhat
fuzzy. "Jobs" are normally very short requests entered
from some type of terminal. The request might be for
an edit of some line of text, or it might be for the
execution of some previously saved job, such as a compiler.
Once a request is received, a "process" is created to
perform the requested action. For the sake of uniformity
in the presentation, the distinction between job management and process management will be retained in this
description. In general, the processor time required to
service a request is not known in advance. Consequently,
the scheduler is usually one that assumes no knowledge.
The MULTICS job manager is a modified FBN scheduler.
To provide for different service requirements, each job
(or request) is assigned, on submission, a range of
priority levels (l, 12) and given the initial priority 11. The
range of priorities indicates roughly the type of service
the job will receive. Highly interactive jobs (such as line
edits) will require very fast response and therefore will
be given high priority. Longer-running interactive Jobs
(such as compilations) or "absentee user" jobs (such as
batch jobs) are given a lower-priority range. The levels
may, in fact, overlap. Corresponding to the priority
levels is a set of N queues from which jobs are scheduled
15
according to the FBN rule, with the additional complication that each job begins at the queue corresponding to
its assigned 11 value, and is not allowed to drop to queues
lower than its assigned 12 value. As described earlier,
the FB rule implicitly determines the amount of service
requireZ by a job and relegates longer jobs to lowerpriority queues. The quantum allocated in MULTICS
varies with the level, doubling at each successive lowerpriority level. This policy tends to reduce total sampling
overhead. The number of jobs permitted to be active at
any time (or the multiprogramming level) is determined
dynamically from an assessment of the current memory
demands of all the active jobs. This is similar to the load
leveler embodied in OS/VS2.
At the process management level, control is given to
the highest-priority process that is ready to run. If the
process blocks, control passes to the next highestpriority process. If the quantum allocated to the job by
the job manager expires, the job is deactivated and returned to the system of FB queues. The policy is
designed to increase the amount of "effective work"
done, or minimize resource wastage. Service considerations,
such as fast response, are the province of the job
manager.
Concluding remarks
In an attempt to provide a common framework for the
description of diverse schedulers, a general model was
proposed. A number of classical scheduling techniques
were described using this model and their characteristics
were assessed. Actual implementations of these techniques
often compromise the classical definitions to accommodate some special requirements or constraints of the
particular system. A common problem is that of balancing
the total resource demand of the jobs in the system against
the resources available. For example, a scheduling decision resulting from the application of one of the classical
rules may have to be overridden because of insufficient
available memory. Clearly, any discussion that attempts
to concentrate solely on processor scheduling will be
deficient in some of these areas. A scheduler must be
an integral part of the resource allocation component of
an operating system.
In this paper the scheduling methods of a number of
popular operating systems have been described. The IBM
systems' described (OS/MFT, OS/MVT, OS/VS2) are all
primarily oriented to an environment of batch submissions. The MULTICS system and the UNIX system,
offering different types of service, have different performance objectives and hence employ a different scheduling
approach to meet these objectives. U
UNIX
UNIX is a general-purpose multiuser, interactive
operating system developed by a group at Bell Laboratories for the Digital Equipment Corporation's PDP-1 1/40
and PDP-11/45 computers.24 Unlike the systems described previously, it is quite feasible to run UNIX on
relatively small and inexpensive machines, yet the system
still offers very effective interactive service. It was primarily designed with objectives such as simplicity, elegance, and ease of use in mind.
Because it is devoted entirely to interactive use,
scheduling decisions are somewhat simpler than those
made by the MULTICS system. The UNIX job manager
examines the jobs in the waiting queue and selects the
one that has been waiting the longest. If there is sufficient main memory available to accommodate the needs of
this job, it is immediately transferred to the running
state. If this is not the case, the job manager tries to find
enough "easy core," that is, memory occupied by processes currently in the blocked state. If this additional
search fails to meet the specified needs, then a decision
is made to acquire the needed memory by deactivating the
job that has been active for the longest uninterrupted
period, provided it has been active for more than 2
seconds. If all efforts to acquire memory fail, the job
manager itself is put to sleep until either a period of 1
second elapses, or until one of the executing jobs blocks,
at which time the job manager is reinvoked and the
above operation is repeated.
The objective of this job manager is to attempt to give
each user a "fair crack" at the processor; that is, no
user will wait excessively long for a virtual processor, and
once given a virtual processor, each user has a chance to
do a reasonable amount of work before deactivation. This
objective is consistent with the overall response needs of
an interactive system.
The UNIX process manager is very simple. The CPU
is given to the highest-priority ready process which retains control for up to 1 second. If the process should
block before 1 second has elapsed, control will pass to the
next process in the ready queue. This is much the same
strategy as that employed in the MULTICS system.
16
Acknowledgments
The preparation of this paper was supported in part by
the Defence Research Board of Canada, Grant No. 993140. The author is grateful for the help of Chris Thomson
for gathering material and of Dianne Good for typing the
manuscript.
References
1.
H. Hellerman and T. F. Conroy, Computer System Performance, McGraw-Hill, New York, 1975.
2.
E. W. Dijkstra, "Cooperating Sequential Processes," Programming Languages (F. Genuys ed.), Academic Press,
New York, 1968, pp. 43-112.
3.
J. J. Horning and B. Randell, "Process Structuring,"
ACM Computing Surveys, Vol. 5, No. 1 (March 1973),
pp. 5-30.
4.
R. R. Muntz, Software Systems Principles: A Survey
(P. Freeman, ed), Chapter 7, Science Research Associates,
1975.
5.
K. D. Ryder, "A Heuristic Approach to Task Dispatching,"
IBMSystems Journal, Vol.9, No.3 (1970), pp. 189-198.
6.
J. H. Saltzer, "Traffic Control in a Multiplexed Computer
System," Sc.D. thesis, Dept. of EE, MIT, Cambridge,
Massachusetts, 1966.
7.
S. Sherman, F. Baskett, and J. C. Browne, "TraceDriven Modeling and Analysis of CPU Scheduling in a
Multiprogramming System," CACM, Vol. 15, No. 12
(December 1972), pp. 1063-1069.
8.
P. Brinch Hansen, Operating Systems Principles, PrenticeHall, Englewood Cliffs, New Jersey, 1973.
COMPUTER
9.
S. E. Madnick and J. J. Donovan, Operating Systems,
McGraw-Hill, New York, 1974.
10.
B. W. Lampson, "A Scheduling Philosophy for Multiprocessing Systems," CACM, Vol. 11, No. 5 (May 1968),
pp. 347-360.
11.
E. G. Coffman, Jr., and L. Kleinrock, "Computer Scheduling Methods and Their Countermeasures," Proc. AFIPS,
SJCC, Vol. 32 (1968), pp. 11-21.
12.
E. G. Coffman, Jr., and L. Kleinrock, "Feedback Queueing
Models for Time-Shared Systems," JACM, Vol. 15, No.
4 (October 1968), pp. 549-576.
13.
R. W. Conway, W. L. Maxwell, and L. W. Miller, Theory
of Scheduling, Addison-Wesley, Reading, Massachusetts,
1967.
14.
L. Kleinrock, "A Conservation Law for a Wide Class
of Queueing Disciplines," Naval Research Logistics
Quarterly, Vol. 12, No. 2 (June 1965), pp. 181-192.
15.
J. Rodriguez-Rosell and J. Dupuy, "The Design, Implementation and Evaluation of a Working Set Dispatcher,"
CACM, Vol. 16, No.4 (April 1973), pp. 247-253.
16.
U. N. Bhat and R. E. Nance, "Dynamic Quantum Allocation and Swap Time Variability in Time-Sharing Operating
Systems," Tech. Report No. CP-73009, Department of
Computer Science and Operations Research, Southern
Methodist University, Dallas, Texas, April 1973.
17.
L. Schrage, "A Proof of the Optimality of the Shortest
Remaining Processing Time Discipline," Operations Research, Vol. 16, No. 3 (May-June 1968), pp. 687-690.
18.
J. L. Johnstone, Software Systems Principles: A Survey
(P. Freeman, ed.), Chapter 17, Science Research Associates, 1975.
19.
B. S. Marshall, "Dynamic Calculation of Dispatching
Priorities Under OS/360 MVT," Datamation (August
1969), pp. 93-97.
20.
R. W. Doran, "Virtual Memory," Computer, Vol. 9, No. 10
(October 1976), pp. 27-37.
21.
A. C. Shaw, The Logical Design of Operating Systems,
Prentice-Hall, Englewood Cliffs, New Jersey, 1974.
22.
IBM Corporation, OS/VS2 Planning and Use Guide,
Form No. GC28-0600-2, September 1972.
23.
E. I, Organick, The MULTICS System: An Examination
of Its Structure, MIT Press, Cambridge, Massachusetts,
1972.
24.
D. M. Ritchie and K. Thompson, "The UNIX TimeSharing System," CACM, Vol. 17, No. 7 (July 1974),
pp. 365-375.
Rick Bunt is an associate professor in the
Department of Computational Science at the
University of Saskatchewan, Saskatoon,
Canada. He has been at the university since
1972. His research interests include operating
systems, simulation, and the study of programming and programmers. He received
the B.Sc. degree from Queen's University,
Kingston, Ontario, and the M.Sc. and Ph.D
degrees in computer science from the University *of Toronto. He is a member of ACM, the IEEE
Computer Society, and the Computer Science Association
(Canada).
October 1976
I
o
system &
software
Oflgmn
Texas Instruments has immediate
openings for highly motivated,
talented individuals in their Corporate
Engineering Center. In this highly
visible and dynamic organization you
will be a member of a team whose
function is to evaluate emerging
technologies, design and implement
prototype systems, and develop
comprehensive strategies for future
business growth.
If you are a computer professional
with a MS or PhD and are looking for a
challenging opportunity, we have
positions for experienced individuals
with records of achievement in:
* Computer architecture
* Memory system design
* Multiprocessor systems
* Peripheral technology
* Data Communications
* Software development
* Programming language design
* Software engineering
* Operating system design
If you qualify, send your resume in
complete confidence to: Harry Moseley/
Texas Instruments/P.O. Box 5474, M.S.
217/Dallas,'rexas 75222.
TEXAS INSTRUMENTS
I
L
INCORPORATED
An equal opportunity
employer.
r
17