Effects of Clock Resolution on the Scheduling of
Interactive and Soft Real-Time Processes
Yoav Etsion
Dan Tsafrir
Dror G. Feitelson
School of Computer Science and Engineering
The Hebrew University, 91904 Jerusalem, Israel
{etsman, dants, feit}@cs.huji.ac.il
ABSTRACT
It is commonly agreed that scheduling mechanisms in general-purpose operating systems do not provide adequate support for modern interactive applications, notably multimedia applications. The common solution to this problem is to devise specialized scheduling mechanisms that take the specific needs of such applications into account. A much simpler alternative is to better tune existing systems. In particular, we show that conventional scheduling algorithms typically have only little and possibly misleading information regarding the CPU usage of processes, because increasing CPU rates have caused the common 100 Hz clock interrupt rate to be coarser than most application time quanta. We therefore conduct an experimental analysis of what happens if this rate is significantly increased. Results indicate that much higher clock interrupt rates are possible with acceptable overheads, and lead to much better information. In addition we show that increasing the clock rate can provide a measure of support for soft real-time requirements, even when using a general-purpose operating system. For example, we achieve sub-millisecond latency under heavily loaded conditions.

Categories and Subject Descriptors
D.4.1 [Process Management]: Scheduling; D.4.8 [Performance]: Measurements; C.4 [Performance of Systems]: Design studies

General Terms
Measurement, Performance

Keywords
Clock interrupt rate, Interactive process, Linux, Overhead, Scheduling, Soft real-time, Tuning

Supported by a Usenix scholastic grant.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SIGMETRICS'03, June 10-14, 2003, San Diego, California, USA.
Copyright 2003 ACM 1-58113-664-1/03/0006 ...$5.00.

1. INTRODUCTION
Contemporary computer workloads, especially on the desktop, contain a significant multimedia component: playing of music and sound effects, displaying video clips and animations, etc. These workloads are not well supported by conventional operating system schedulers, which prioritize processes according to recent CPU usage [18]. This deficiency is often attributed to the lack of specific support for real-time features, and to the fact that multimedia applications consume significant CPU resources themselves.

The common solution to this problem has been to design specialized programming APIs that enable applications to request special treatment, and schedulers that respect these requests [19, 8, 22]. For example, applications may be required to specify timing constraints such as deadlines. To support such deadlines, the conventional operating system scheduler has to be modified, or a real-time system can be used.

While this approach solves the problem, it suffers from two drawbacks. One is price. Real-time operating systems are much more expensive than commodity desktop operating systems like Linux or Windows. The price reflects the difficulty of implementing industrial-strength real-time scheduling. This difficulty, and the requirement for careful testing of all important scenarios, are the reasons that many interesting proposals made in academia do not make it into production systems. The other drawback is the need for specialized interfaces, which may reduce the portability of applications, and require a larger learning and coding effort.

An alternative is to stick with commodity desktop operating systems, and tune them to better support modern workloads. While this may lead to sub-optimal results, it has the important benefit of being immediately applicable to the vast majority of systems installed around the world. It is therefore worthwhile to perform a detailed analysis of this approach, including what can be done, what results may be expected, and what are its inherent limitations.
1.1 Commodity Scheduling Algorithms
Prevalent commodity systems (as opposed to research systems) use a simple scheduler that has not changed much in 30 years. The basic idea is that processes are run in priority order. Priority has a static component (e.g. operating system processes have a higher initial priority than user processes) and a dynamic part. The dynamic part depends on CPU usage: the more CPU cycles used by a process, the lower its priority becomes. This negative feedback (running reduces your priority to run more) ensures that all processes get a fair share of the resources. CPU usage is forgotten after some time, in order to focus on recent activity and not on distant history.

While the basic ideas are the same, specific systems employ different variations. For example, in Solaris the priorities of processes that wake up after waiting for an event are set according to a table, and the allocated quantum duration is longer if the priority is lower [17]. In Linux the relationship goes the other way, with the same number serving as both the allocation and the priority [5, 4]. In Windows NT and 2000, the priority and quanta allocated to threads are determined by a set of rules rather than a formula, but the effect is the same [24]. For example, threads that seem to be starved get a double quantum at the highest possible priority, and threads waiting for I/O or user input also get a priority boost.

In all cases, processes that do not use the CPU very much, such as I/O-bound processes, enjoy a higher priority for those short bursts in which they want it. This was sufficient for the interactive applications of twenty years ago. It is no longer sufficient for modern multimedia applications (a class of applications that did not exist when these schedulers were designed), because their CPU usage is relatively high.

1.2 The Resolution of Clock Interrupts
Computer systems have two clocks: a hardware clock that governs the instruction cycle, and an operating system clock that governs system activity. Unlike the hardware clock, the frequency of the system clock is not predefined: rather, it is set by the operating system on startup. Thus the system can decide for itself what frequency it wants to use. It is this tunability that is the focus of the present paper.

The importance of the system clock (also called the timer interrupt rate) lies in the fact that commodity systems measure time using this clock, including CPU usage and when timers should go off. The reason that timers are aligned with clock ticks is to simplify their implementation and bound the overhead. The alternative of setting a special interrupt for each timer event requires more bookkeeping and risks high overhead if many timers are set with very short intervals.

The most common frequency used today is 100 Hz: it is used in Linux, the BSD family, Solaris, the Windows family, and Mac OS X. This hasn't changed much in the last 30 years. For example, back in 1976 Unix version 6 running on a PDP-11 used a clock interrupt rate of 60 Hz [16]. Since that time the hardware clock rate has increased by about 3 orders of magnitude, from several megahertz to over 3 gigahertz [23]. As a consequence, the size of an operating system tick has increased a lot, and is now on the order of 10 million cycles or instructions. Simple interactive applications such as text editors don't require that many cycles per quantum, making the tick rate obsolete: it is too coarse for measuring the running time of an interactive process. (Interestingly, this same consideration has also motivated the approach of making the hardware clock slower, rather than making the operating system clock faster as we propose; this has the benefit of reducing power consumption [10].) For example, the operating system cannot distinguish between processes that run for a thousand cycles and those that run for a million cycles, because using 100 Hz ticks on a 1 GHz processor both look like 0 time.

A special case of time measurement is setting the time that a process may run before it is preempted. This duration, called the allocation quantum, is also measured in clock ticks. Changing the clock resolution therefore implicitly affects the quantum size. However, in reality these two parameters need not be correlated, and they can be set independently of each other. The question is how to set each one.

Figure 1: Desired and achieved frame rate for the Xine MPEG viewer, on systems with 100 Hz and 1000 Hz clock interrupt rates.
A related issue is providing support for soft real-time applications, such as games with realistic video rendering, that require accurate timing down to several milliseconds. These applications require significant CPU resources, but in a fragmented manner, and are barely served by a 100 Hz tick rate. In some cases, the limited clock interrupt rate may actually prevent the operating system from providing required services.

An example is given in Figure 1. This shows the desired and achieved frame rates of the Xine MPEG viewer showing 500 frames of a short clip that is already loaded into memory, when running on a Linux system with clock interrupt rates of 100 Hz and 1000 Hz. For this benchmark the disk and CPU power are not bottlenecks, and the desired frame rates can all be achieved. However, when using a 100 Hz system, the viewer repeatedly discards frames because the system does not wake it up in time to display them if the desired frame rate is 60 frames per second. This is an important deficiency, as 60 frames/sec is mandated by the MPEG standard.

Even finer timing services are required in other, non-desktop applications. Video rates of up to 1000 frames per second are used for recording high-speed events, such as vehicle crash experiments [26]. Similar high rates can also be expected for sampling sensors in various situations. Even higher rates are necessary in networking, for the implementation of rate-based transmission [25, 2]. Full utilization of a 100 Mb/s Fast Ethernet with 1500-byte packets requires a packet to be transmitted every 120 µs, i.e. 8333 times a second. On a gigabit link, the interval drops to 12 µs, and the rate jumps up to 83,333 times a second.

Increasing the clock interrupt rate may be expected to provide much better timing support than that available today with 100 Hz. However, this comes at the possible expense of additional overhead, and has therefore been discouraged by Linux developers (this will probably change as the 2.5 development kernel has switched to 1000 Hz for the prevalent Intel architecture; in the past, such a rate was recommended only for the Alpha processor, which according to the kernel mailing list was "strong enough to handle it") and by Sun documentation ("exercise great care if you are considering setting high-resolution ticks ... this setting should never, ever be made on a production system without extensive testing first" [17, p. 56]; high-resolution ticks here specifically means 1000 Hz). Our goal is to investigate this tradeoff more thoroughly.
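To make the tick-alignment point concrete, here is a minimal sketch in C (ours, not the actual Linux implementation) of tick-driven timekeeping: time and timers advance only in whole ticks, so no measurement or alarm can be finer than 1/HZ, and a 4 ms request on a 100 Hz system silently becomes a 10 ms one.

/* Minimal sketch of tick-driven timekeeping; not the actual kernel code. */
#include <stdio.h>

#define HZ 100                        /* clock interrupt rate being modeled */

struct timer {
    unsigned long expires;            /* absolute expiry time, in ticks */
    void (*fn)(void);                 /* callback to run on expiry      */
    int armed;
};

static unsigned long jiffies;         /* ticks since boot */
static struct timer timers[16];

/* Invoked once per clock interrupt: advance time and expire due timers. */
static void tick(void)
{
    jiffies++;
    for (int i = 0; i < 16; i++)
        if (timers[i].armed && timers[i].expires <= jiffies) {
            timers[i].armed = 0;
            timers[i].fn();
        }
}

static void alarm_fired(void) { printf("alarm fired at tick %lu\n", jiffies); }

int main(void)
{
    /* Ask for a 4 ms alarm: it is rounded up to whole ticks,
       i.e. one 10 ms tick when HZ is 100.                    */
    unsigned long ticks = (4 * HZ + 999) / 1000;
    if (ticks == 0)
        ticks = 1;
    timers[0] = (struct timer){ jiffies + ticks, alarm_fired, 1 };

    for (int t = 0; t < 3; t++)
        tick();
    return 0;
}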
1.3 Related Work
Other approaches to improving the soft real-time service provided by commodity systems include RT-Linux, one-shot timers, soft timers, firm timers, and priority adjustments.

The RT-Linux project uses virtual machine technology to run a real-time executive under Linux, only allowing Linux to run when there are no urgent real-time tasks that need the processor [3]. Thus Linux does not run on the native hardware, but on a virtual machine. The result is a juxtaposition of a hard real-time system and a Linux system. In particular, the real-time services are not available to Linux processes, so real-time applications must be partitioned into two independent parts. However, communication between the two parts is supported.

One-shot timers do not have a pre-defined periodicity. Instead, they are set according to need. The system stores timer requests sorted by time. Whenever a timer event is fired, the system sets a timer interrupt for the next event. Variants of one-shot timers have been used in several systems, including the Pebble operating system, the Nemesis operating system for multimedia [15], and the KURT real-time system [25]. The problem is that this may lead to high overhead if many timing events are requested with fine resolution.

In soft timers the timing of system events is also not tied to periodic clock interrupts [2]. Instead, the system opportunistically makes use of convenient circumstances in order to provide higher-resolution services. For example, on each return from a system call the system may check whether any timer has become ready, and fire the respective events. As such opportunities occur at a much higher rate than the timer interrupts, the average resolution is much improved (in other words, soft timers are such a good idea specifically because the resolution of clock interrupts is so outdated). However, the timing of a specific event cannot be guaranteed, and the original low-resolution timer interrupts serve as a fallback. Using a higher clock rate, as we suggest, can guarantee a much smaller maximal deviation from the desired time.

Firm timers combine soft timers with one-shot timers [13]. This combination reduces the need for timer interrupts, alleviating the risk of excessive overheads. Firm timers together with a preemptible kernel and suitable scheduling have been shown to be effective in supporting time-sensitive applications on a commodity operating system.

Priority adjustments allow a measure of control over when processes will run, enabling the emulation of real-time services [1]. This is essentially similar to the implementation of hard real-time support in the kernel, except for the fact that it is done by an external process, and can only use the primitives provided by the underlying commodity system.
Finally, there are also various programming projects to improve the responsiveness and performance of the Linux kernel. One is the preemptible kernel patch, which has been adopted as part of the 2.5 development kernel. It reduces interrupt processing latency by allowing long kernel operations to be preempted.

A major difference between the above approaches and ours is that they either require special APIs, make non-trivial modifications to the system, or both. Such modifications cannot be made by any user, and require a substantial review process before they are incorporated in standard software releases (if at all). For example, one-shot timers and soft timers have been known since the mid '90s, but are yet to be incorporated in a major system. In contradistinction, we focus on a single simple tuning knob, the clock interrupt rate, and investigate the benefits and the costs of turning it to much higher values than is commonly done. Previous work on multimedia scheduling, with the exception of [19], has made no mention of the underlying system clock, and focused on designs for meeting deadline and latency constraints.
1.4 Preview of Results
Our goal is to show that increasing the clock interrupt rate is both possible and desirable. Measurements of the overheads involved in interrupt handling and context switching indicate that current CPUs can tolerate much higher clock interrupt rates than those common today (Section 3). We then go on to demonstrate the following:

Using a higher tick rate allows the system to perform much more accurate billing, thus giving a better discrimination among processes with different CPU usage levels (Section 4).

Using a higher tick rate also allows the system to provide a certain "best effort" style of real-time processing, in which applications can obtain high-resolution timing measurements and alarms (as exemplified in Figure 1, and expanded in Section 5). For applications that use time scales that are related to human perception, a modest increase in tick rate to 1000 Hz may suffice. Applications that operate at smaller time scales, e.g. to monitor certain sensors, may require much higher rates and shortening of scheduling quantum lengths (Section 7).

We conclude that improved clock resolution, and the shorter quanta that it makes possible, should be a part of any solution to the problem of scheduling soft real-time applications, and should be taken into account explicitly.
2. METHODOLOGY AND APPLICATIONS
Before presenting detailed measurement results, we first describe the experimental platform and introduce the applications used in the measurements.
2.1 The Test Platform
Most measurements were done on a 664 MHz Pentium-III machine, equipped with 256 MB RAM, and a 3DFX Voodoo3 graphics accelerator with 16 MB RAM that supports OpenGL in hardware. In addition, we performed cross-platform comparisons using machines ranging from a Pentium 90 to a Pentium-IV 2.4 GHz. The operating system is a 2.4.8 Linux kernel (RedHat 7.0), with the XFree86 4.1 X server. The same kernel was compiled for all the different architectures, which may result in minor differences in the generated code due to architecture-specific ifdefs. The default clock interrupt rate is 100 Hz. We modified the kernel to run at up to 20,000 Hz. The modifications were essentially straightforward, and involved extending kernel ifdefs to this range and correcting the calculation of bogomips (bogomips are an estimate of the clock rate computed by the Linux kernel upon booting; the correction prevents a division by zero in this calculation).
The measurements were conducted using klogger, a kernel logger we developed that supports fine-grain events. While the code is integrated into the kernel, its activation at runtime is controlled by applying a special sysctl call using the /proc file system. In order to reduce interference and overhead, logged events are stored in a sizeable buffer in memory (we typically use 4 MB), and only exported at large intervals. This export is performed by a daemon that wakes up every few seconds (the interval is reduced for higher clock rates to ensure that events are not lost). The implementation is based on inlined code to access the CPU's cycle counter and store the logged data. Each event has a 20-byte header including a serial number and a timestamp with cycle resolution, followed by event-specific data. The overhead of each event is only a few hundred cycles (we estimate that at 100 Hz the overhead for logging is 0.63%, at 1000 Hz it is 0.95%, and at 20,000 Hz 1.18%). In our use, we log all scheduling-related events: context switching, recalculation of priorities, forks, execs, and changes in the state of processes.
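As an illustration of the kind of mechanism klogger relies on (this sketch is ours; the record layout and names are hypothetical, not the actual klogger code), the cycle counter can be read with the IA-32 rdtsc instruction and events appended to a preallocated memory buffer:

/* Hypothetical cycle-counter event logging, in the spirit of klogger. */
#include <stdint.h>
#include <stdio.h>

static inline uint64_t rdtsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}

struct klog_event {                 /* hypothetical record layout      */
    uint64_t timestamp;             /* cycle counter at logging time   */
    uint32_t serial;                /* running event number            */
    uint32_t type;                  /* e.g. context switch, fork, exec */
    uint8_t  payload[12];           /* event-specific data             */
};

#define LOG_BYTES (4u << 20)        /* 4 MB in-memory buffer           */
#define LOG_SLOTS (LOG_BYTES / sizeof(struct klog_event))

static struct klog_event log_buf[LOG_SLOTS];
static uint32_t log_head;

/* Append one event; a separate daemon would drain log_buf periodically. */
static void klog(uint32_t type)
{
    struct klog_event *e = &log_buf[log_head % LOG_SLOTS];
    e->timestamp = rdtsc();
    e->serial    = log_head++;
    e->type      = type;
}

int main(void)
{
    klog(1);
    klog(2);
    printf("delta between events: %llu cycles\n",
           (unsigned long long)(log_buf[1].timestamp - log_buf[0].timestamp));
    return 0;
}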
2.2 The Workload
The system's behavior was measured with different clock rates and different workloads. The workloads were composed of the following applications:

A classic interactive application: the Emacs text editor. During the test the editor was used for standard typing at a rate of about 8 characters per second.

The Xine MPEG viewer, which was used to show a short video clip in a loop. Xine's implementation is multithreaded, making it a suitable representative of this growing class of applications [11]. Specifically, Xine uses 6 distinct processes. The two most important ones are the decoder, which reads the data stream from the disk and generates frames for display, and the displayer, which displays the frames at the appropriate rate. The displayer keeps track of time using alarms with a resolution of 4 ms. On each alarm it checks whether the next frame should be displayed, and if so, sends the frame to the X server. If it is too late, the frame is discarded. If it is very late, the displayer can also notify the decoder to skip certain frames.

In the experiments, audio output was sent to /dev/null rather than to the sound card, to allow focus on interactions with the X server.
Quake 3, which represents a modern interactive application (a role-playing game). Quake uses the X server's Direct Rendering Infrastructure (DRI) [21] feature, which enables the OpenGL graphics library to access the hardware directly, without proxying all the requests through the X server. This results in some of the graphics processing being done by the Graphical Processor Unit (GPU) on the accelerator.

Another interesting feature of Quake is that it is adaptive: it can change its frame rate based on how much CPU time it gets. Thus when Quake competes with other processes, its frame rate will drop. In our experiments, when running alone it is always ready to run and can use all available CPU time.
CPU-bound processes that serve as a background load that can absorb any number of available CPU cycles, and compete with the interactive and real-time processes.

In addition, the system ran a host of default processes, mostly various daemons. Of these, the most important with regard to interactive processes is obviously the X server.
3. CLOCK RESOLUTION AND OVERHEADS
A major concern regarding increasing the clock interrupt rate is the resulting increase in overheads: with more clock interrupts more time will be wasted on processing them, and there may also be more context switches (as will be explained below in Section 6), which in turn lead to reduced cache and TLB efficiency. This is the reason why today only the Alpha version of Linux employs a rate of 1024 Hz by default. This is compounded by the concern that operating systems in general become less efficient on machines with higher hardware clock rates [20]. We will show that these concerns are unfounded, and that a clock interrupt rate of 1000 Hz or more is perfectly possible.
The overhead caused by clock interrupts may be divided into two parts: direct overhead for running the interrupt handling routine, and indirect overhead due to reduced cache and TLB efficiency. The direct overhead can easily be measured using klogger. We have performed such measurements on a range of Pentium processors with clock rates from 90 MHz to 2.4 GHz, and on an Athlon XP1700+ at 1.467 GHz with DDR-SDRAM memory.
The results are shown in Table 1. We find that the overhead for interrupt processing is dropping at a much slower rate than expected according to the CPU clock rate; in fact, it is relatively stable in terms of absolute time. This is due to an optimization in the Linux implementation of gettimeofday(), whereby overhead is reduced by accessing the 8253 timer chip on each clock interrupt, rather than when gettimeofday() itself is called, and extrapolating using the cycle counter register. This takes a constant amount of time and therefore adds overhead to the interrupt handling that is not related to the CPU clock rate. Even so, the overhead is still short enough to allow many more interrupts than are used today, up to an order of 10,000 Hz. Alternatively, by removing this optimization, the overhead of clock interrupt processing can be reduced considerably, to allow much higher rates. A good compromise might be to increase the clock interrupt rate but leave the rate at which the 8253 is accessed at 100 Hz. This would amortize the overhead of the off-chip access, thus reducing the overhead per clock interrupt.
Table 1: Interrupt processing overheads on different processor generations (average±standard deviation).

              Default                 Without 8253
Processor     Cycles        µs        Cycles       µs
P-90          814±180       9.02      498±466      5.53
PP-200        1654±553      8.31      462±762      2.32
PII-350       2342±303      6.71      306±311      0.88
PIII-664      3972±462      5.98      327±487      0.49
PIII-1.133    6377±602      5.64      426±914      0.38
PIV-2.4       14603±436     6.11      445±550      0.19
A1.467        10494±396     7.15      202±461      0.14
Table 2: Other overheads on different processor generations (average±standard deviation).

              Context switch          Cache BW     Trap
Processor     Cycles        µs        MB/s         Cycles      µs
P-90          1871±656      20.75     281          153±24      1.70
PP-200        1530±389      7.69      705±26       379±75      1.91
PII-350       1327±331      3.80      1314±29      343±68      0.98
PIII-664      1317±424      1.98      2512±32      348±163     0.52
PIII-1.133    1330±441      1.18      4286±82      364±278     0.32
PIV-2.4       3792±857      1.59      3016±47      1712±32     0.72
A1.467        1436±477      0.98      3962±63      274±20      0.19
A related issue is the overhead for running the scheduler. More clock interrupts imply more calls to the scheduler. More serious is the fact that in Linux the scheduler overhead is proportional to the number of processes in the ready queue. However, this only becomes an important factor for very large numbers of processes. It is also partly offset by the fact that with more ready processes it takes longer to complete a scheduling epoch, and therefore priority recalculations are done less frequently.
As a side note, it is interesting to compare clock interrupt processing overhead to other types of overhead. Ousterhout has claimed that in general operating systems do not become faster as fast as hardware [20]. We have repeated some of his measurements on the platforms listed above. The results (Table 2) show that the overhead for context switching (measured using two processes that exchange a byte via a pipe) takes roughly the same number of cycles, regardless of CPU clock speed (except on the P-IV, which uses DDR-SDRAM memory at 266 MHz and not the newer RDRAM). It therefore does become faster as fast as the hardware. We also found that the trap overhead (measured by the repeated invocation of getpid) and cache bandwidth (measured using memcpy) behave similarly. This is more optimistic than Ousterhout's results. The difference may be due to the fact that Ousterhout compared RISC vs. CISC architectures, and there is also a difference in methodology: we measure time and cycles directly, whereas Ousterhout based his results on performance relative to a MicroVAX II and on estimated MIPS ratings.
Figure 2: Increase in overhead due to increasing the clock interrupt rate from a base case of 100 Hz, for 1 and 8 competing processes on the different platforms. The basic quantum is 50 ms.

The indirect overhead of clock interrupt processing can only be assessed by measuring the total overhead in the context of a specific application (as was done, for example, in [2]). The application we used is sorting of a large array that occupies about half of the L2 cache (the L2 cache was 256 KB on all platforms except for the P-II 350, which had an L2 cache of 512 KB). The sorting algorithm was introsort, which is used by the STL that ships with gcc. The sorting was done repeatedly, where each iteration first initializes the array randomly and then sorts it (but the same random sequences were used to compare the different platforms). By measuring the time per iteration under different conditions, we can factor out the added total overhead due to additional clock interrupts (as is shown below). To also check the overhead caused by additional context switching among processes, we used different multiprogramming levels, running 1, 2, 4, or 8 copies of the test application at the same time. All this was repeated for different CPU generations with different (hardware) clock rates.
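The benchmark itself can be sketched roughly as follows (our reconstruction, not the authors' code; qsort stands in for the STL introsort, and the array size targets half of a 256 KB L2 cache):

/* Sketch of the sorting benchmark used to expose scheduling overhead. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (128 * 1024 / sizeof(int))    /* ~half of a 256 KB L2 cache */

static int cmp(const void *a, const void *b)
{
    return *(const int *)a - *(const int *)b;
}

int main(void)
{
    static int a[N];
    for (int iter = 0; iter < 100; iter++) {
        srand(42);                       /* same sequence on every run */
        for (size_t i = 0; i < N; i++)
            a[i] = rand();

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        qsort(a, N, sizeof(a[0]), cmp);  /* stand-in for STL introsort */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ms = (t1.tv_sec - t0.tv_sec) * 1e3 +
                    (t1.tv_nsec - t0.tv_nsec) / 1e6;
        printf("iteration %d: %.3f ms\n", iter, ms);
    }
    return 0;
}

Comparing the per-iteration times reported under different clock interrupt rates (and multiprogramming levels) yields the added overhead discussed next.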
Assuming that the amount of work to sort the array once is essentially fixed, measuring this time as a function of the clock interrupt rate will show how much time was added due to overhead. Figure 2 shows this added overhead as a percentage of the total time required at 100 Hz. From this we see that the added overhead at 1000 Hz is negligible, and even at 5000 Hz it is quite low. Note, however, that this is after removing the gettimeofday() optimization, i.e. without accessing the 8253 chip on each interrupt. For higher clock rates, the overhead increases linearly, with a slope that becomes flatter with each new processor generation (except for the P-IV). Essentially the same results are obtained with a multiprogramming level of 8. Thus we can expect higher clock interrupt rates to be increasingly acceptable.
Figure 3: Increase in overhead due to increasing the clock interrupt rate from a base case of 100 Hz, for 1 and 8 competing processes. Quanta are 6 clock ticks, so they become shorter for high clock rates.

The overhead also depends on the length of the quanta, i.e. on how much time is allocated to a process each time it runs. In Linux, the default allocation is 50 ms, which translates to 5 ticks (the actual allocation is 5 ticks plus one, to ensure that the allocation is strictly positive, as the 5 is derived from the integral quotient of two constants). When raising the clock interrupt rate, the question is whether to stick with the allocation of 50 ms, or to reduce it by defining the allocation in terms of ticks, so as to improve responsiveness. The results shown in Figure 2 were for 50 ms. Figure 3 shows the same experiments when using 5 ticks, meaning that the quanta are 10 or 100 times shorter when using 1000 Hz or 10,000 Hz interrupt rates, respectively. As shown in the graphs this leads to much higher overheads, especially under higher loads, probably because there are many more context switches. This may limit the realistic clock interrupt rate to 1000 Hz or a bit more, but probably not as high as 5000 Hz (in this case the P-IV is substantially better than the other platforms, but this is due to using performance relative to 100 Hz, which was worse than for other platforms for an unknown reason). Note, however, that 1000 Hz is an order of magnitude above what is common today, and already leads to significant benefits, as shown in subsequent sections; the added overhead in this case is just a few percentage points, much less than the 10-30% which were the norm a mere decade ago [7].

Our measurements also allow for an assessment of the relative costs of direct and indirect overhead.
Table 3: Scheduler billing success rate.

               Billing ratio         Missed quanta
Application    100 Hz    1000 Hz     100 Hz    1000 Hz
Emacs          1.0746    0.9468      95.96%    73.42%
Xine           1.2750    1.0249      89.46%    74.81%
Quake          1.0310    1.0337      54.17%    23.23%
X Server*      0.0202    0.9319      99.43%    64.05%
CPU-bound      1.0071    1.0043       7.86%     7.83%
CPU+Quake      1.0333    1.0390      26.71%     2.36%
* When running Xine
For example, when switching from 100 Hz to 10,000 Hz, the extra time can be attributed to 9900 additional clock interrupts each second. By subtracting the cost of 9900 calls to the interrupt processing routine (from Table 1), we can find how much of this extra time should be attributed to indirect overhead, that is mainly to cache effects.

For example, consider the case of a P-III 664 MHz machine running a single sorting process with 50 ms quanta. The average time to sort an array once is 12.675 ms on the 100 Hz system, and 13.397 ms on the 10,000 Hz system. During this time the 10,000 Hz system suffered an additional 9900 × 0.013397 ≈ 133 interrupts. According to Table 1 the overhead for each one (without accessing the 8253 chip) is 0.49 µs, so the total additional overhead was 133 × 0.49 ≈ 65 µs. But the difference in the time to sort an array is 13397 - 12675 = 722 µs! Thus 722 - 65 = 657 µs are unaccounted for, and should be attributed to cache effects and scheduler overhead. In other words, 657/722 = 91% of the overhead is indirect, and only 9% is direct. This number is typical of many of the configurations checked. The indirect overheads on the P-IV and Athlon machines, and when using shorter quanta on all machines, are higher, and may even reach 99%. This means that the figures given in Table 1 should be multiplied by at least 10 (and in some extreme cases by as much as 100) to derive the real cost of increasing the clock interrupt rate.
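For reference, the same decomposition can be written as a small calculation (our code; the numbers are the ones quoted above for the P-III 664 MHz, 50 ms quanta case):

/* Direct vs. indirect overhead decomposition, using the values from the text. */
#include <stdio.h>

int main(void)
{
    double t_low_ms  = 12.675;           /* time to sort once at 100 Hz    */
    double t_high_ms = 13.397;           /* time to sort once at 10,000 Hz */
    double extra_irq_per_s = 9900.0;     /* 10,000 - 100                   */
    double cost_per_irq_us = 0.49;       /* Table 1, without 8253          */

    double extra_us    = (t_high_ms - t_low_ms) * 1000.0;         /* 722  */
    double direct_us   = extra_irq_per_s * (t_high_ms / 1000.0)
                         * cost_per_irq_us;                        /* ~65  */
    double indirect_us = extra_us - direct_us;                     /* ~657 */

    printf("extra %.0f us, direct %.0f us, indirect %.0f us (%.0f%%)\n",
           extra_us, direct_us, indirect_us, 100.0 * indirect_us / extra_us);
    return 0;
}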
4. CLOCK RESOLUTION AND BILLING
Practically all commodity operating systems use priority-based schedulers, and factor CPU usage into their priority calculations. CPU usage is measured in ticks, and is based on sampling: the process running when a clock interrupt occurs is billed for this tick. But the coarse granularity of ticks implies that billing may be inaccurate, leading to inaccurate information being used by the scheduler.
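The sampling rule itself is easy to state as code; the following is a minimal sketch (ours, not kernel code) that charges whoever is running at each interrupt one full tick, and shows how a quantum shorter than a tick is billed either 0 or 1 ticks depending on where it falls:

/* Sampling-based CPU accounting on a 100 Hz tick grid; illustrative only. */
#include <stdio.h>

#define TICK_US 10000                     /* 100 Hz => 10 ms ticks */

/* Bill a quantum that occupies [start_us, start_us + len_us). */
static int ticks_billed(long start_us, long len_us)
{
    long first_tick = (start_us / TICK_US + 1) * TICK_US;  /* next interrupt */
    int billed = 0;
    for (long t = first_tick; t < start_us + len_us; t += TICK_US)
        billed++;                         /* one full tick charged per hit  */
    return billed;
}

int main(void)
{
    printf("2 ms quantum starting at 3 ms: %d ticks\n", ticks_billed(3000, 2000));
    printf("2 ms quantum starting at 9 ms: %d ticks\n", ticks_billed(9000, 2000));
    printf("25 ms quantum starting at 4 ms: %d ticks\n", ticks_billed(4000, 25000));
    return 0;
}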
The relationship between actual CPU consumption and billing on a 100 Hz system is shown at the top of Figure 4. The X axis in these graphs is the effective quantum length: the exact time from when the process is scheduled to run until it is preempted or blocked. While the effective quantum tends to be widely distributed, billing is done in an integral number of ticks. In particular, for Emacs and X the typical quantum is very short, and they are practically never billed!

Using klogger, we can tabulate all the times each application is scheduled, for how much time, and whether or not this was billed. The data is summarized in Table 3. The billing ratio is the time for which an application was billed by the scheduler, divided by the total time actually consumed by it during the test. The miss percentage is the percentage
of the application's quanta that were totally missed by the scheduler and not billed for at all.

Figure 4: The relationship between effective quantum durations and how much the process is billed, for different applications, using a kernel running at 100 Hz and at 1000 Hz. Concentrations of data points are rendered as larger disks; otherwise the graphs would have a clean steps shape, because the billing (Y axis) is in whole ticks. Note also that the optimal would be a diagonal line with slope 1.
The table shows that even though very many quanta are totally missed by the scheduler, especially for interactive applications, most applications are actually billed with reasonable accuracy in the long run. This is a result of the probabilistic nature of the sampling. Since most of the quanta are shorter than one clock tick, and the scheduler can only count in complete tick units, many of the quanta are not billed at all. But when a short quantum does happen to include a clock interrupt, it is over-billed and charged a full tick. On average, these two effects tend to cancel out, because the probability that a quantum includes a tick is proportional to its duration. The same averaging happens also for quanta that are longer than a tick: some are rounded up to the next whole tick, while others are rounded down.

A notable exception is the X server when running with Xine (we used Xine because it intensively uses the X server, as opposed to Quake, which uses DRI). As shown below in Section 6, when running at 100 Hz this application has quanta that are either extremely short (around 68% of the quanta), or 0.8-0.9 of a tick (the remaining 32%). Given the distribution of quanta, we should expect over 30% of them to include a tick and be counted. But the scheduler misses over 99% of them, and only bills about 2% of the consumed time! This turns out to be the result of synchronization with the operating system ticks. Specifically, the long quanta always occur after a very short quantum of a Xine process that was activated by a timer alarm. This is the displayer, which checks whether to display the next frame. When it decides that the time is right, it passes the frame to X. The X server then awakes and takes a relatively long time to actually display the frame, but just less than a full tick. As the timer alarm is carried out on a tick, these long quanta always start very soon after one tick, and complete just before the next tick. Thus, despite being nearly a tick long, they are hardly ever counted.
When running the kernel at 1000 Hz we can see that the situation improves dramatically: the effective quantum length, even for interactive applications, is typically several ticks long, so the scheduler bills the process an amount that reflects the actually consumed time much more accurately. In particular, on a 1000 Hz system X is billed for over 93% of the time it consumed, with the missed quanta percentage dropping to 64%, which is the fraction of quanta that are indeed very short.

An alternative to this whole discussion is of course the option to measure runtime accurately, rather than sampling on clock interrupts. This can be done easily by accessing the CPU cycle counter [6]. However, this involves modifying the operating system, whereas we are only interested in the effects obtainable by simple tuning of the clock interrupt rate.
5. CLOCK RESOLUTION AND TIMING

Figure 5: Relationship of clock interrupts to frame display times that causes frames to be skipped. In this example the relative shift is 5 5/6 ms, and frame 2 is skipped.

Increasing the kernel's clock resolution also yields a major benefit in terms of the system's ability to provide accurate timing services. Specifically, with a high-resolution clock it
is possible to deliver high-resolution timer interrupts. This is especially significant for soft real-time applications such as multimedia players, which rely on timer events to keep correct time.

A striking example was given in the introduction, where it was shown that the Xine MPEG player was sometimes unable to display a movie at a rate of 60 frames per second (which is mandated by the MPEG standard). This is somewhat surprising, because the underlying system clock rate is 100 Hz, higher than the desired rate.
The problem stems from the relative timing of the clock interrupts and the times at which frames are to be displayed. Xine operates according to two rules: it does not display a frame ahead of its time, and it skips frames that are late by more than half a frame duration. A frame will therefore be displayed only if the clock interrupt that causes Xine's timer signal to be delivered occurs in the first half of the frame's scheduled display time. In the case of 60 frames per second on a 100 Hz system, the smallest common multiple of the frame duration (1000/60 = 16 2/3 ms) and the clock interval (10 ms) is 50 ms. Such an interval is shown in Figure 5. In this example frame 2 will be skipped, because interrupt 2 is a bit too early, whereas interrupt 3 is already too late. In general, the question of whether this will indeed happen depends on the relative shift between the scheduled frame times and the clock interrupts. A simple inspection of the figure indicates that frame 1 will be skipped if the shift (between the first clock interrupt and the first frame) is in the range of 8 1/3 to 10 ms, frame 2 will be skipped for shifts in the range 5 to 6 2/3 ms, and frame 3 will be skipped for shifts in the range 1 2/3 to 3 1/3 ms, for a total of 5 ms out of the 10 ms between ticks. Assuming the initial shift is random, there is therefore a 50% chance of entering a pattern in which a third of the frames are skipped, leading to the observed frame rate of about 40 frames per second (in reality, though, this happens much less than 50% of the time, because the initial program startup tends to be synchronized with a clock tick).
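This analysis is easy to reproduce with a small simulation (our code, applying the two rules quoted above): sweep the shift between the tick grid and the frame schedule, and count how many of 600 frames at 60 frames/sec get a 100 Hz tick within the first half of their slot.

/* Frame-skip simulation under the display rules described above. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    const double tick_ms  = 10.0;           /* 100 Hz system           */
    const double frame_ms = 1000.0 / 60.0;  /* 60 frames per second    */
    const int    frames   = 600;            /* 10 seconds of video     */

    for (double shift = 0.0; shift < tick_ms; shift += 0.5) {
        int shown = 0;
        for (int k = 0; k < frames; k++) {
            double due = shift + k * frame_ms;
            /* earliest clock interrupt not before the frame's due time */
            double tick = ceil(due / tick_ms) * tick_ms;
            if (tick < due + frame_ms / 2.0)    /* falls in first half? */
                shown++;
        }
        printf("shift %4.1f ms: %3d of %3d frames shown\n",
               shift, shown, frames);
    }
    return 0;
}

With the skip rules as stated, shifts covering about half of the 10 ms tick interval cause every third frame to be dropped, reproducing the roughly 40 frames per second observed above.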
To check this analysis we also tried a much more extreme case: running a movie at 50 frames per second on a 50 Hz system. In this case, either all clock interrupts fall in the first half of their respective frames, and all frames are shown, or else all interrupts fall in the second half of their frames, and all are skipped. And indeed, we observed runs in which all frames were skipped and the screen remained black throughout the entire movie.

Table 4: Average quanta per second achieved by each application when running in isolation.

                         Quanta/sec
Application              100 Hz    1000 Hz
Emacs                     22.36      34.60
Xine (all processes)     470.67     695.94
Quake                    187.88     273.85
X Server (w/Xine)         71.35     148.21
CPU-bound                 28.81      38.97

Table 5: CPU usage distribution when running Xine.

Application   100 Hz    1000 Hz
Xine          39.42%     40.42%
X Server      20.10%     20.79%
idle loop     31.46%     31.58%
other          9.02%      7.21%
The implication of the above is that the timing service has to have a much finer resolution than that of the requests. For Xine to display a movie at 60 Hz, the timing service needs a resolution of 4 ms. This is required for the application to function correctly, not for the actual viewing, and therefore applies despite the fact that this clock resolution is much higher than the screen refresh rate.
6. CLOCK RESOLUTION AND THE INTERLEAVING OF APPLICATIONS
Recall that we define the effective quantum length to be the interval from when a process is scheduled until it is descheduled for some reason. On our Linux system, the allocation for a quantum is 50 ms plus one tick. However, as we can see from Figures 4 and 6 (introduced below), our interactive applications never even approach this limit. They are always preempted or blocked much sooner, often quite soon in their first tick. In other words, the effective quantum length is very short. This enables the system to support more than 100 quanta per second, even if the clock interrupt rate is only 100 Hz, as shown in Table 4. It also explains the success of soft timers [2].
The distributions of the effective quantum length for the different applications are shown in Figure 6, for 100 Hz and 1000 Hz systems. An interesting observation is that when running the kernel at 1000 Hz the effective quanta become even shorter. This happens because the system has more opportunities to intervene and preempt a process, either because it woke up another process that has a higher priority, or due to a timer alarm that has expired. However, the total CPU usage does not change significantly (Table 5). Thus increasing the clock rate did not change the amount of computation performed, but the way in which it is partitioned into quanta, and the granularity at which the processes are interleaved with each other.
Figure 6: Cumulative distribution plots of the effective quantum durations of the different applications, at 100 Hz and 1000 Hz.

A specific example is provided by Xine. One of the Xine processes sets a 4 ms alarm that is used to synchronize the video stream. In a 100 Hz system, the alarm signal is only delivered every 10 ms, because this is the size of a tick. But when using a 1000 Hz clock the system can actually deliver the signals on time. As a result the maximal effective quanta of X and the other Xine processes are reduced to 4 ms, because they get interrupted by the Xine process with the 4 ms timer.
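For concreteness, a 4 ms periodic alarm of this sort can be requested as follows (our sketch using the POSIX setitimer interface, not Xine's actual code); on a 100 Hz kernel each 4 ms request is rounded up to the 10 ms tick, so 250 alarms take roughly 2.5 seconds instead of 1:

/* Requesting a 4 ms periodic alarm; illustrative, not Xine's implementation. */
#include <signal.h>
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

static volatile sig_atomic_t alarms;
static void on_alarm(int sig) { (void)sig; alarms++; }

int main(void)
{
    struct sigaction sa = { .sa_handler = on_alarm };
    sigaction(SIGALRM, &sa, NULL);

    struct itimerval iv = {
        .it_interval = { 0, 4000 },    /* re-arm every 4 ms   */
        .it_value    = { 0, 4000 },    /* first alarm in 4 ms */
    };
    setitimer(ITIMER_REAL, &iv, NULL);

    struct timeval t0, t1;
    gettimeofday(&t0, NULL);
    while (alarms < 250)               /* 250 * 4 ms = 1 second, ideally */
        pause();
    gettimeofday(&t1, NULL);

    double s = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    printf("250 alarms took %.2f s\n", s);  /* ~2.5 s at 100 Hz, ~1 s at 1000 Hz */
    return 0;
}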
Likewise, the service received by CPU-bound applications is not independent of the interactive processes that accompany them. To investigate this effect, these processes were measured both alone and running with Quake. When running alone, their quanta are typically indeed an integral number of ticks long. Most of the time the number of ticks is less than the full allocation, due to interruptions from system daemons or klogger, but a sizeable fraction do achieve the allocated 50 ms plus one tick (which is an additional 10 ms at 100 Hz, but only 1 ms at 1000 Hz). But when Quake is added, the quanta of the CPU-bound processes are shortened to the same range as those of Quake, and moreover, they become less predictable. This also leads to an increase in the number of quanta that are missed for billing (Table 3), unless the higher clock rate of 1000 Hz is used.
7. TOWARDS BEST-EFFORT SUPPORT FOR REAL-TIME
In this section we set out to explore how close a general-purpose system can come to supporting real-time processes in terms of timing delays, only by tuning the clock interrupt rate and reducing the allocated quanta. The metric that we use in order to perform such an evaluation is latency: the difference between the time at which an alarm requested by a process should expire, and the time at which this process was actually assigned a CPU.
Without worrying about overhead (for the moment), our aim is to show that under loads of up to 8 processes, we can bound the latency to be less than 1 millisecond. As there are very many types of soft real-time applications, we sample the possible space by considering three types of processes:
1. BLK: A process repeatedly sets alarms without performing any type of computation. Our experiments involved processes that requested an alarm signal 500 times, with delays that are uniformly distributed between 1 and 1000 milliseconds.

2. N%: Same as BLK, with the difference that the process computes for a certain fraction (N%) of the time till the next alarm. Specifically, we checked computation of N = 1, 2, 4, and 8% of this interval. Note for example that a combination of 8 processes computing for 8% of the time leads to an average of 64% CPU utilization. To check what happens when the CPU is not left idle, we also added CPU-bound processes that do not set timers.

3. CONT: Same as N% where N = 100%, i.e. the process computes continuously.
For each of the above 3 types, we checked combinations of 1, 2, 4, and 8 processes. All the processes that set timers were assigned to the (POSIX) Round-Robin class. Note that a combination of more than one CONT process constitutes the worst-case scenario, because, contrary to the other workloads, the CPU is always busy and there are always alternative processes with similar priorities (in the Round-Robin queue) that are waiting to run.
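One such benchmark process might look roughly like this (our reconstruction under stated assumptions, not the authors' code; it sleeps until the target time rather than using an explicit alarm signal, which measures the same wakeup latency, and SCHED_RR requires root privileges):

/* Sketch of a BLK / N% style latency-measurement process. */
#include <math.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

static double now_us(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec * 1e6 + tv.tv_usec;
}

static void burn_us(double us)             /* busy-loop for 'us' microseconds */
{
    double end = now_us() + us;
    while (now_us() < end)
        ;
}

int main(int argc, char **argv)
{
    int pct = argc > 1 ? atoi(argv[1]) : 0;        /* 0 = BLK, 100 = CONT */

    struct sched_param sp = { .sched_priority = 1 };
    sched_setscheduler(0, SCHED_RR, &sp);          /* needs root privileges */

    srand(getpid());
    for (int i = 0; i < 500; i++) {
        long delay_ms = 1 + rand() % 1000;         /* uniform in 1..1000 ms */
        double due = now_us() + delay_ms * 1000.0;

        burn_us(delay_ms * 1000.0 * pct / 100.0);  /* compute for pct% of it */

        double rem = due - now_us();
        if (rem > 0) {
            struct timespec ts = { .tv_sec  = (time_t)(rem / 1e6),
                                   .tv_nsec = (long)(fmod(rem, 1e6) * 1000.0) };
            nanosleep(&ts, NULL);
        }
        printf("latency: %.0f us\n", now_us() - due);
    }
    return 0;
}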
Figure 7: Distributions of latencies till a timer signal is delivered, for 1, 2, 4, and 8 processes that compute continuously and also set timers for random intervals of up to one second, at 100 Hz (default 60 ms quanta) and at 20,000 Hz (100 µs quanta).
The base system we used is the default configuration of Linux, with a 100 Hz clock interrupt rate and a 60 ms (6 ticks) maximal quantum duration. In order to achieve our sub-millisecond latency goal, we compared this with a rather aggressive alternative: a 20,000 Hz clock interrupt rate and a 100 µs (2 ticks) quantum (note that we are changing two parameters at once: both the clock resolution and the number of ticks in a quantum). Theoretically, for this configuration the maximal latency would be 100 µs × 7 = 700 µs < 1 ms, because even if a process is positioned at the end of the run-queue it only needs to wait for seven other processes to run for 100 µs each.
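Stated generally (our formulation of the argument above, not an equation from the paper): with N competing round-robin processes and a per-process quantum Q, the worst-case wait is roughly

    L_max ≈ (N - 1) × Q

which gives 7 × 100 µs = 700 µs for the extreme configuration, and 7 × 60 ms = 420 ms for the base configuration with 8 processes.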
The results shown in Figure 7 confirm our expectations. This figure is associated with the worst-case scenario of a workload composed solely of CONT processes. Examining the results for the original 100 Hz system (left of Figure 7), we see that a single process receives the signal within one tick, as may be expected. When more processes are present, there is also a positive probability that a process will nevertheless receive the signal within a tick: 1/2, 1/4 and 1/8 for 2, 4 and 8 processes, respectively. The Y axis of the figure shows that the actual fractions were 0.53, 0.30, and 0.16 (respectively), slightly more than the associated probabilities. But a process may also be forced to wait for other processes that precede it to exhaust their quanta. This leads to the step-like shape of the graphs, because the wait is typically an integral number of ticks. The maximal wait is a full quantum for each of the other processes. In the case of 8 competing processes, for example, the maximum is 60 ms for each of the other 7, for a total of 420 ms (= 420,000 µs).
The situation on the 20,000 Hz system is essentially the same, except that the time scale is much, much shorter: the latency is almost always less than a millisecond, as expected. In other words, the high clock interrupt rate and rapid context switching allow the system to deliver timer signals in a timely manner, despite having to cycle through all competing processes.
Table 6 shows that this is the case for all our experiments (for brevity only selected experiments are shown). Note that using the higher clock rate also provides significantly improved latencies in the experiments where processes only
compute for a fraction of the time till the timer event. With 100 Hz even this scenario sometimes causes conflicts, despite the relatively low overall CPU utilization. The relatively few long-latency events that remain in the high clock-rate case are attributed to conflicts with system daemons that perform disk I/O, such as the pager. Similar effects have been noted in other systems [14]. These problems are expected to go away in the next Linux kernel, which is preemptive; they should not be an issue in other kernels that are already preemptive (such as Solaris).
But what about overheads? As shown in Figure 3, when running continuously computing processes (in that case, a sorting application) with a 20,000 Hz clock interrupt rate and quanta of 6 ticks, the additional overhead can reach 35% on contemporary architectures. The overhead for the shorter 2-tick quanta used here may be even higher.
Table 6: Tails of distributions of latencies to deliver timer signals in different experimental settings. Table values are latencies in microseconds, for various percentiles of the distribution.

Processes           100 Hz                                    20,000 Hz
Type  Number    0.9       0.95      0.99      max         0.9    0.95    0.99     max
BLK   2              5         8        11         40       13      14      21      23
BLK   8              5        12        22        420        7       9      13      25
CONT  2         50,003    60,003    60,004    160,006      102     103  18,468  60,448
CONT  8        370,014   400,014   420,015    740,025      656     706  15,096  68,139
2%    2              6         9     9,193     19,153       13      15      23     837
2%    8          2,910     8,419    17,940     32,944       12      52      53   1,809
8%    2              9    12,431    39,512     60,003       14      19      53   3,797
8%    8         40,003    60,005   130,006    294,291       53      53      54  37,328
4%    1+2CPU    50,003    50,003    50,004     50,005       55      56     200     256
4%    1+8CPU    50,003    50,003   170,014    280,010       56      57      59     856
This seems like an expensive and unacceptable price to pay. However, if we examine the application throughput on different platforms the picture is not so bleak. Figure 8 compares the achieved throughput, as measured by numbers sorted per second, for two configurations. The base configuration uses 100 Hz interrupts and 60 ms quanta. The extreme configuration uses 20,000 Hz interrupts and 100 µs quanta. While performance drops dramatically when comparing the two configurations on the same platform, the extreme configuration of each platform still typically outperforms the base configuration on the previous platform. For example, a PIII-664 running the base configuration manages to sort about 2,559,000 numbers per second, while the PIII-1.133 with the extreme configuration sorts about 3,136,000 numbers per second (the P-IV consistently performs worse than previous generations). This is an optimistic result which means that in order to get the same or even better performance than on an existing platform, while achieving sub-millisecond latency, all one has to do is upgrade to the next generation. This is usually much cheaper than purchasing the industrial hard real-time alternative.

Figure 8: Throughput of the sort application, measured as how many millions of numbers were sorted per second, with 8 competing processes, comparing the base and extreme configurations on each platform.
8. CONCLUSIONS AND FUTURE WORK
General-purpose systems, such as Linux and Windows, are already often used for soft real-time applications such as viewing video, playing music, or burning CDs. Other less common applications include various control functions, ranging from laboratory experiment control to traffic-light control. Such applications are not critical to the degree that they require a full-fledged real-time system. However, they may face problems on a typical commodity system due to the lack of adequate support for high-resolution timing services. A special case is "timeline gaps", where the processor is totally unavailable for a relatively long time [14].
Various solutions have been proposed for this problem, typically based on explicit support for timing functions. In particular, very good results are obtained by using soft timers or one-shot timers. The idea there is to change the kernel's timing mechanism from the current periodic time sampling to event-based time sampling. However, since this event-based approach calls for a massive redesign of a major kernel subsystem, it has remained more of an academic exercise and has yet to make it into the world of mainstream operating systems.
The goal of this paper is to check the degree to which existing systems can provide reasonable soft real-time services, specifically for interactive applications, just by leveraging the very fast hardware that is now routinely available, without any sophisticated modifications to the system. The mechanism is simply to increase the frequency of the periodic timer sampling. We show that this solution, although suffering from non-negligible overhead, is viable on today's ultra-fast CPUs. We also show that implementing this solution in mainstream operating systems is as trivial as turning a tuning knob, possibly even at system runtime.
We started with the observation that there is a large and growing gap between the CPU clock rates, which grow exponentially, and the system clock interrupt rates, which have remained rather stable at 100 Hz. We showed that by increasing the clock interrupt rate by a mere order of magnitude, to 1000 Hz, one achieves significant advantages in terms of timing and billing services, while keeping the overheads acceptably low. The modifications required to the system are rather trivial: to increase the clock interrupt rate, and to reduce the default quantum length. As multimedia applications typically operate in this range (i.e. with timers of several milliseconds), such an increase may be enough to satisfy this important class of applications. A similar observation has been made by Nieh and Lam with regard to the scheduling of multimedia applications in the SMART scheduler [19]. A rate of 1000 Hz is used in the experimental Linux 2.5 kernel, and also on the personal systems of some kernel hackers [12].

For more demanding applications, we experimented with raising the clock interrupt rate up to 20,000 Hz, and found that by doing so applications are guaranteed to receive timer signals within one millisecond of the correct times with high probability, even under loaded conditions.
In addition to suggesting that 1000 Hz be used as the minimal default clock rate, we also propose that the HZ value and the quantum length be settable parameters, rather than compiled constants. This will enable users of systems that are dedicated to a time-sensitive task to configure them so as to bound the latency, by shortening the quantum so that when multiplied by the expected number of processes in the system the product is less than the desired bound. Of course, this functionality has to be traded off against the overhead it entails. Such detailed considerations can only be made by knowledgeable users on a case-by-case basis. Even so, this is expected to be cost-effective relative to the alternative of procuring a hard real-time system.
The last missing piece is the correct prioritization of applications under heavy load conditions. The problem is that modern interactive applications may use quite a lot of CPU power to generate realistic graphics and video in real time, and may therefore be hard to distinguish from low-priority CPU-bound applications. This is especially hard when faced with multi-threaded applications (like Xine), or if applications are adaptive (as Quake is) and can always use additional compute power to improve their output. Our future work therefore deals with alternative mechanisms for the identification of interactive processes. The mechanisms we are considering involve tracking the interactions of applications with the X server, and thus with the input and output devices that represent the local user [9].
Acknowledgements
Many thanks are due to Danny Braniss and Tomer Klainer for providing access to various platforms and helping make them work.
9. REFERENCES
[1] B. Adelberg, H. Garcia-Molina, and B. Kao, "Emulating soft real-time scheduling using traditional operating system schedulers". In Real-Time Systems Symp., Oct 1994.
[2] M. Aron and P. Druschel, "Soft timers: efficient microsecond software timer support for network processing". ACM Trans. Comput. Syst. 18(3), pp. 197-228, Aug 2000.
[3] M. Barabanov and V. Yodaiken, "Introducing real-time Linux". Linux Journal 34, Feb 1997. http://www.linuxjournal.com/article.php?sid=0232.
[4] M. Beck, H. Bohme, M. Dziadzka, U. Kunitz, R. Magnus, and D. Verworner, Linux Kernel Internals. Addison-Wesley, 2nd ed., 1998.
[5] D. P. Bovet and M. Cesati, Understanding the Linux Kernel. O'Reilly, 2001.
[6] J. B. Chen, Y. Endo, K. Chan, D. Mazieres, A. Dias, M. Seltzer, and M. D. Smith, "The measured performance of personal computer operating systems". ACM Trans. Comput. Syst. 14(1), pp. 3-40, Feb 1996.
[7] R. T. Dimpsey and R. K. Iyer, "Modeling and measuring multiprogramming and system overheads on a shared memory multiprocessor: case study". J. Parallel & Distributed Comput. 12(4), pp. 402-414, Aug 1991.
[8] K. J. Duda and D. R. Cheriton, "Borrowed-virtual-time (BVT) scheduling: supporting latency-sensitive threads in a general-purpose scheduler". In 17th Symp. Operating Systems Principles, pp. 261-276, Dec 1999.
[9] Y. Etsion, D. Tsafrir, and D. G. Feitelson, Human-Centered Scheduling of Interactive and Multimedia Applications on a Loaded Desktop. Technical Report, Hebrew University, Mar 2003.
[10] K. Flautner and T. Mudge, "Vertigo: automatic performance-setting for Linux". In 5th Symp. Operating Systems Design & Implementation, pp. 105-116, Dec 2002.
[11] K. Flautner, R. Uhlig, S. Reinhardt, and T. Mudge, "Thread-level parallelism and interactive performance of desktop applications". In 9th Intl. Conf. Architect. Support for Prog. Lang. & Operating Syst., pp. 129-138, Nov 2000.
[12] FreeBSD Documentation Server, thread on "clock granularity (kernel option HZ)". http://docs.freebsd.org/mail/archive/2002/freebsd-hackers/20020203.freebsd-hackers.html, Feb 2002.
[13] A. Goel, L. Abeni, C. Krasic, J. Snow, and J. Walpole, "Supporting time-sensitive applications on a commodity OS". In 5th Symp. Operating Systems Design & Implementation, pp. 165-180, Dec 2002.
[14] J. Gwinn, "Some measurements of timeline gaps in VAX/VMS". Operating Syst. Rev. 28(2), pp. 92-96, Apr 1994.
[15] I. Leslie, D. McAuley, R. Black, T. Roscoe, P. Barham, D. Evers, R. Fairbairns, and E. Hyden, "The design and implementation of an operating system to support distributed multimedia applications". IEEE J. Select Areas in Commun. 14(7), pp. 1280-1297, Sep 1996.
[16] J. Lions, Lions' Commentary on UNIX 6th Edition, with Source Code. Annabooks, 1996.
[17] J. Mauro and R. McDougall, Solaris Internals. Prentice Hall, 2001.
[18] J. Nieh, J. G. Hanko, J. D. Northcutt, and G. A. Wall, "SVR4 UNIX scheduler unacceptable for multimedia applications". In 4th Int'l Workshop on Network & Operating System Support for Digital Audio and Video, Nov 1993.
[19] J. Nieh and M. S. Lam, "The design, implementation and evaluation of SMART: a scheduler for multimedia applications". In 16th Symp. Operating Systems Principles, pp. 184-197, Oct 1997.
[20] J. K. Ousterhout, "Why aren't operating systems getting faster as fast as hardware?". In USENIX Summer Conf., pp. 247-256, Jun 1990.
[21] B. Paul, "Introduction to the Direct Rendering Infrastructure". http://dri.sourceforge.net/doc/DRIintro.html, August 2000.
[22] M. A. Rau and E. Smirni, "Adaptive CPU scheduling policies for mixed multimedia and best-effort workloads". In Modeling, Anal. & Simulation of Comput. & Telecomm. Syst., pp. 252-261, Oct 1999.
[23] R. Ronen, A. Mendelson, K. Lai, S-L. Lu, F. Pollack, and J. P. Shen, "Coming challenges in microarchitecture and architecture". Proc. IEEE 89(3), pp. 325-340, Mar 2001.
[24] D. A. Solomon and M. E. Russinovich, Inside Microsoft Windows 2000. Microsoft Press, 3rd ed., 2000.
[25] B. Srinivasan, S. Pather, R. Hill, F. Ansari, and D. Niehaus, "A firm real-time system implementation using commercial off-the-shelf hardware and free software". In 4th IEEE Real-Time Technology & App. Symp., pp. 112-119, Jun 1998.
[26] D. Tyrell, K. Severson, A. B. Perlman, B. Brickle, and C. Vaningen-Dunn, "Rail passenger equipment crashworthiness testing requirements and implementation". In Intl. Mechanical Engineering Congress & Exposition, Nov 2000.