Computer Systems Course:
the Operating System
Machine Level
Vrije Universiteit Amsterdam
PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information.
Contents

Articles

1. Introduction
    Operating system
    Kernel (computing)
    Booting
2. Processes
    Computer multitasking
    Process (computing)
    Process management (computing)
    Context switch
    Scheduling (computing)
3. I/O
    Input/output
    Device driver
4. Memory
    Memory management
    Virtual memory
    Page (computer memory)
    Paging
    Page fault
5. Files
    File system
    Virtual file system

References

Article Sources and Contributors
Image Sources, Licenses and Contributors

Article Licenses

License
1. Introduction
Operating system
An operating system (OS) is a collection of software that manages computer hardware resources and provides
common services for computer programs. The operating system is a vital component of the system software in a
computer system. Application programs usually require an operating system to function.
Time-sharing operating systems schedule tasks for efficient use of the system and may also include accounting for
cost allocation of processor time, mass storage, printing, and other resources.
For hardware functions such as input and output and memory allocation, the operating system acts as an
intermediary between programs and the computer hardware,[1][2] although the application code is usually executed
directly by the hardware and will frequently make a system call to an OS function or be interrupted by it. Operating
systems can be found on almost any device that contains a computer, from cellular phones and video game consoles
to supercomputers and web servers.
Examples of popular modern operating systems include Android, BSD, iOS, Linux, Mac OS X, Microsoft
Windows,[3] Windows Phone, and IBM z/OS. All these, except Windows and z/OS, share roots in UNIX.
Types of operating systems
Real-time
A real-time operating system is a multitasking operating system that aims at executing real-time applications.
Real-time operating systems often use specialized scheduling algorithms so that they can achieve a
deterministic nature of behavior. The main objective of real-time operating systems is their quick and
predictable response to events. They have an event-driven or time-sharing design and often aspects of both. An
event-driven system switches between tasks based on their priorities or external events while time-sharing
operating systems switch tasks based on clock interrupts.
Multi-user
A multi-user operating system allows multiple users to access a computer system at the same time.
Time-sharing systems and Internet servers can be classified as multi-user systems as they enable multiple-user
access to a computer through the sharing of time. Single-user operating systems have only one user but may
allow multiple programs to run at the same time.
Multi-tasking vs. single-tasking
A multi-tasking operating system allows more than one program to be running at a time, from the point of
view of human time scales. A single-tasking system has only one running program. Multi-tasking can be of
two types: pre-emptive and co-operative. In pre-emptive multitasking, the operating system slices the CPU
time and dedicates one slot to each of the programs. Unix-like operating systems such as Solaris and Linux
support pre-emptive multitasking, as does AmigaOS. Cooperative multitasking is achieved by relying on each
process to give time to the other processes in a defined manner. 16-bit versions of Microsoft Windows used
cooperative multi-tasking. 32-bit versions of both Windows NT and Win9x used pre-emptive multi-tasking.
Mac OS prior to OS X used to support cooperative multitasking.
Distributed
A distributed operating system manages a group of independent computers and makes them appear to be a
single computer. The development of networked computers that could be linked and communicate with each
other gave rise to distributed computing. Distributed computations are carried out on more than one machine.
When computers in a group work in cooperation, they make a distributed system.
Embedded
Embedded operating systems are designed to be used in embedded computer systems. They are designed to
operate on small machines like PDAs with less autonomy. They are able to operate with a limited number of
resources. They are very compact and extremely efficient by design. Windows CE and Minix 3 are some
examples of embedded operating systems.
History
Early computers were built to perform a series of single tasks, like a calculator. Operating systems did not exist in
their modern and more complex forms until the early 1960s.[4] Basic operating system features were developed in the
1950s, such as resident monitor functions that could automatically run different programs in succession to speed up
processing. Hardware features were added that enabled use of runtime libraries, interrupts, and parallel processing.
When personal computers became popular in the 1980s, operating systems were made for them similar in concept to
those used on larger computers.
In the 1940s, the earliest electronic digital systems had no operating systems. Electronic systems of this time were
programmed on rows of mechanical switches or by jumper wires on plug boards. These were special-purpose
systems that, for example, generated ballistics tables for the military or controlled the printing of payroll checks from
data on punched paper cards. After programmable general purpose computers were invented, machine languages
(consisting of strings of the binary digits 0 and 1 on punched paper tape) were introduced that sped up the
programming process (Stern, 1981).
In the early 1950s, a computer could execute only one program at a
time. Each user had sole use of the computer for a limited period of
time and would arrive at a scheduled time with program and data on
punched paper cards and/or punched tape. The program would be
loaded into the machine, and the machine would be set to work until
the program completed or crashed. Programs could generally be
debugged via a front panel using toggle switches and panel lights. It is
said that Alan Turing was a master of this on the early Manchester
Mark 1 machine, and he was already deriving the primitive conception
of an operating system from the principles of the Universal Turing
machine.[4]
Later machines came with libraries of programs, which would be
linked to a user's program to assist in operations such as input and
output and generating computer code from human-readable symbolic
code. This was the genesis of the modern-day operating system.
However, machines still ran a single job at a time. At Cambridge
University in England the job queue was at one time a washing line
from which tapes were hung with different colored clothes-pegs to
indicate job-priority.
[Figure: OS/360 was used on most IBM mainframe computers beginning in 1966, including the computers that helped
NASA put a man on the moon.]
Mainframes
Through the 1950s, many major features were pioneered in the field of operating systems, including batch
processing, input/output interrupt, buffering, multitasking, spooling, runtime libraries, link-loading, and programs for
sorting records in files. These features were included or not included in application software at the option of
application programmers, rather than in a separate operating system used by all applications. In 1959 the SHARE
Operating System was released as an integrated utility for the IBM 704, and later in the 709 and 7090 mainframes,
although it was quickly supplanted by IBSYS/IBJOB on the 709, 7090 and 7094.
During the 1960s, IBM's OS/360 introduced the concept of a single OS spanning an entire product line, which was
crucial for the success of the System/360 machines. IBM's current mainframe operating systems are distant
descendants of this original system and applications written for OS/360 can still be run on modern machines.
OS/360 also pioneered the concept that the operating system keeps track of all of the system resources that are used,
including program and data space allocation in main memory and file space in secondary storage, and file locking
during update. When the process is terminated for any reason, all of these resources are re-claimed by the operating
system.
The alternative CP-67 system for the S/360-67 started a whole line of IBM operating systems focused on the concept
of virtual machines. Other operating systems used on IBM S/360 series mainframes included systems developed by
IBM: COS/360 (Compatibility Operating System), DOS/360 (Disk Operating System), TSS/360 (Time Sharing
System), TOS/360 (Tape Operating System), BOS/360 (Basic Operating System), and ACP (Airline Control
Program), as well as a few non-IBM systems: MTS (Michigan Terminal System), MUSIC (Multi-User System for
Interactive Computing), and ORVYL (Stanford Timesharing System).
Control Data Corporation developed the SCOPE operating system in the 1960s, for batch processing. In cooperation
with the University of Minnesota, the Kronos and later the NOS operating systems were developed during the 1970s,
which supported simultaneous batch and timesharing use. Like many commercial timesharing systems, its interface
was an extension of the Dartmouth BASIC operating systems, one of the pioneering efforts in timesharing and
programming languages. In the late 1970s, Control Data and the University of Illinois developed the PLATO
operating system, which used plasma panel displays and long-distance time sharing networks. Plato was remarkably
innovative for its time, featuring real-time chat, and multi-user graphical games. Burroughs Corporation introduced
the B5000 in 1961 with the MCP, (Master Control Program) operating system. The B5000 was a stack machine
designed to exclusively support high-level languages with no machine language or assembler, and indeed the MCP
was the first OS to be written exclusively in a high-level language, ESPOL, a dialect of ALGOL. MCP also
introduced many other ground-breaking innovations, such as being the first commercial implementation of virtual
memory. During development of the AS/400, IBM approached Burroughs to license MCP to run on the AS/400
hardware. This proposal was declined by Burroughs management to protect its existing hardware production.
MCP is still in use today in the Unisys ClearPath/MCP line of computers.
UNIVAC, the first commercial computer manufacturer, produced a series of EXEC operating systems. Like all early
main-frame systems, this was a batch-oriented system that managed magnetic drums, disks, card readers and line
printers. In the 1970s, UNIVAC produced the Real-Time Basic (RTB) system to support large-scale time sharing,
also patterned after the Dartmouth BASIC system.
General Electric and MIT developed General Electric Comprehensive Operating Supervisor (GECOS), which
introduced the concept of ringed security privilege levels. After acquisition by Honeywell it was renamed to General
Comprehensive Operating System (GCOS).
Digital Equipment Corporation developed many operating systems for its various computer lines, including
TOPS-10 and TOPS-20 time sharing systems for the 36-bit PDP-10 class systems. Prior to the widespread use of
UNIX, TOPS-10 was a particularly popular system in universities, and in the early ARPANET community.
In the late 1960s through the late 1970s, several hardware capabilities evolved that allowed similar or ported
software to run on more than one system. Early systems had utilized microprogramming to implement features on
their systems in order to permit different underlying computer architectures to appear to be the same as others in a
series. In fact most 360s after the 360/40 (except the 360/165 and 360/168) were microprogrammed
implementations. But soon other means of achieving application compatibility were proven to be more significant.
The enormous investment in software for these systems made since 1960s caused most of the original computer
manufacturers to continue to develop compatible operating systems along with the hardware. The notable supported
mainframe operating systems include:
• Burroughs MCP: B5000, 1961 to Unisys ClearPath/MCP, present.
• IBM OS/360: IBM System/360, 1966 to IBM z/OS, present.
• IBM CP-67: IBM System/360, 1967 to IBM z/VM, present.
• UNIVAC EXEC 8: UNIVAC 1108, 1967, to OS 2200 on Unisys ClearPath Dorado, present.
Microcomputers
The first microcomputers did not have the capacity or need for the elaborate operating systems that had been
developed for mainframes and minis; minimalistic operating systems were developed, often loaded from ROM and
known as monitors. One notable early disk operating system was CP/M, which was supported on many early
microcomputers and was closely imitated by Microsoft's MS-DOS, which became wildly popular as the operating
system chosen for the IBM PC (IBM's version of it was called IBM DOS or PC DOS). In the '80s, Apple Computer
Inc. (now Apple Inc.) abandoned its popular Apple II series of microcomputers to introduce the Apple Macintosh
computer with an innovative Graphical User Interface (GUI) to the Mac OS.

[Figure: PC-DOS was an early personal computer OS that featured a command line interface.]
The introduction of the Intel 80386 CPU chip, with its 32-bit architecture and paging capabilities, provided personal
computers with the ability to run multitasking operating systems like those of earlier minicomputers and mainframes.
Microsoft responded to this progress by hiring Dave Cutler, who had developed the VMS operating system for
Digital Equipment Corporation. He would lead the development of the Windows NT operating system, which
continues to serve as the basis for Microsoft's operating systems line. Steve Jobs, a co-founder of Apple Inc., started
NeXT Computer Inc., which developed the NEXTSTEP operating system. NEXTSTEP would later be acquired by
Apple Inc. and used, along with code from FreeBSD as the core of Mac OS X.
The GNU Project was started by activist and programmer Richard Stallman with the goal of creating a complete free
software replacement to the proprietary UNIX operating system. While the project was highly successful in
duplicating the functionality of various parts of UNIX, development of the GNU Hurd kernel proved to be
unproductive. In 1991, Finnish computer science student Linus Torvalds, with cooperation from volunteers
collaborating over the Internet, released the first version of the Linux kernel. It was soon merged with the GNU user
space components and system software to form a complete operating system. Since then, the combination of the two
major components has usually been referred to as simply "Linux" by the software industry, a naming convention that
Stallman and the Free Software Foundation remain opposed to, preferring the name GNU/Linux. The Berkeley
Software Distribution, known as BSD, is the UNIX derivative distributed by the University of California, Berkeley,
starting in the 1970s. Freely distributed and ported to many minicomputers, it eventually also gained a following for
use on PCs, mainly as FreeBSD, NetBSD and OpenBSD.
Examples of operating systems
UNIX and UNIX-like operating systems
Unix was originally written in assembly language.[5]
Ken Thompson wrote B, mainly based on BCPL,
based on his experience in the MULTICS project. B
was replaced by C, and Unix, rewritten in C,
developed into a large, complex family of
inter-related operating systems which have been
influential in every modern operating system (see
History).
The UNIX-like family is a diverse group of operating systems, with several major sub-categories including
System V, BSD, and Linux. The name "UNIX" is a trademark of The Open Group, which licenses it for use with any
operating system that has been shown to conform to their definitions. "UNIX-like" is commonly used to refer to the
large set of operating systems which resemble the original UNIX.

[Figure: Evolution of Unix systems]
Unix-like systems run on a wide variety of computer architectures. They are used heavily for servers in business, as
well as workstations in academic and engineering environments. Free UNIX variants, such as Linux and BSD, are
popular in these areas.
Four operating systems are certified by The Open Group (holder of the Unix trademark) as Unix. HP's HP-UX
and IBM's AIX are both descendants of the original System V Unix and are designed to run only on their respective
vendor's hardware. In contrast, Sun Microsystems's Solaris Operating System can run on multiple types of hardware,
including x86 and Sparc servers, and PCs. Apple's OS X, a replacement for Apple's earlier (non-Unix) Mac OS, is a
hybrid kernel-based BSD variant derived from NeXTSTEP, Mach, and FreeBSD.
Unix interoperability was sought by establishing the POSIX standard. The POSIX standard can be applied to any
operating system, although it was originally created for various Unix variants.
BSD and its descendants
A subgroup of the Unix family is the Berkeley
Software Distribution family, which includes
FreeBSD, NetBSD, and OpenBSD. These operating
systems are most commonly found on webservers,
although they can also function as a personal
computer OS. The Internet owes much of its existence
to BSD, as many of the protocols now commonly
used by computers to connect, send and receive data
over a network were widely implemented and refined
in BSD. The world wide web was also first
demonstrated on a number of computers running an
OS based on BSD called NextStep.
BSD has its roots in Unix. In 1974, the University of California, Berkeley installed its first Unix system.

[Figure: The first server for the World Wide Web ran on NeXTSTEP, based on BSD.]
Over time, students and staff in the computer science department there began adding new programs to make things
easier, such as text editors. When Berkeley received new VAX computers in 1978 with Unix installed, the school's
undergraduates modified Unix even more in order to take advantage of the computer's hardware possibilities. The
Defense Advanced Research Projects Agency of the US Department of Defense took interest, and decided to fund
the project. Many schools, corporations, and government organizations took notice and started to use Berkeley's
version of Unix instead of the official one distributed by AT&T.
Steve Jobs, upon leaving Apple Inc. in 1985, formed NeXT Inc., a company that manufactured high-end computers
running on a variation of BSD called NeXTSTEP. One of these computers was used by Tim Berners-Lee as the first
webserver to create the World Wide Web.
Developers like Keith Bostic encouraged the project to replace any non-free code that originated with Bell Labs.
Once this was done, however, AT&T sued. Eventually, after two years of legal disputes, the BSD project came out
ahead and spawned a number of free derivatives, such as FreeBSD and NetBSD.
OS X
OS X (formerly "Mac OS X") is a line of open core graphical operating systems developed, marketed, and sold by
Apple Inc., the latest of which is pre-loaded on all currently shipping Macintosh computers. OS X is the successor to
the original Mac OS, which had been Apple's primary operating system since 1984. Unlike its predecessor, OS X is a
UNIX operating system built on technology that had been developed at NeXT through the second half of the 1980s
and up until Apple purchased the company in early 1997. The operating system was first released in 1999 as Mac OS
X Server 1.0, with a desktop-oriented version (Mac OS X v10.0 "Cheetah") following in March 2001. Since then, six
more distinct "client" and "server" editions of OS X have been released, the most recent being OS X 10.8 "Mountain
Lion", which was first made available on February 16, 2012 for developers, and was then released to the public on
July 25, 2012. Releases of OS X are named after big cats.
Prior to its merging with OS X, the server edition, OS X Server, was architecturally identical to its desktop
counterpart and usually ran on Apple's line of Macintosh server hardware. OS X Server included work group
management and administration software tools that provide simplified access to key network services, including a
mail transfer agent, a Samba server, an LDAP server, a domain name server, and others. With Mac OS X v10.7 Lion,
all server aspects of Mac OS X Server have been integrated into the client version and the product re-branded as "OS
X" (dropping "Mac" from the name). The server tools are now offered as an application.[6]
Linux and GNU
Linux (or GNU/Linux) is a Unix-like operating system
that was developed without any actual Unix code,
unlike BSD and its variants. Linux can be used on a
wide range of devices from supercomputers to
wristwatches. The Linux kernel is released under an
open source license, so anyone can read and modify its
code. It has been modified to run on a large variety of
electronics. Although estimates suggest that Linux is used on 1.82% of all personal computers,[7][8] it has been
widely adopted for use in servers[9] and embedded systems[10] (such as cell phones). Linux has superseded Unix in
most places, and is used on the 10 most powerful supercomputers in the world.[11] The Linux kernel is used in some
popular distributions, such as Red Hat, Debian, Ubuntu, Linux Mint and Google's Android.

[Figure: Ubuntu, a desktop Linux distribution]
The GNU project is a mass collaboration of programmers who seek to create a completely free and open operating
system that is similar to Unix but uses completely original code. It was started in 1983 by Richard Stallman, and is
responsible for many of the parts of most Linux variants. Thousands of pieces of software for virtually every
operating system are licensed under the GNU General Public License. Meanwhile, the Linux kernel began as a side
project of Linus Torvalds, a university student from Finland. In 1991, Torvalds began work on it, and posted
information about his project on a newsgroup for computer students and programmers. He received a wave of
support and volunteers who ended up creating a full-fledged kernel. Programmers from GNU took notice, and
members of both projects worked to integrate the finished GNU parts with the Linux kernel in order to create a
full-fledged operating system.
Google Chromium OS
Chromium OS is an operating system based on the Linux kernel and designed by Google. Since Chromium OS targets
computer users who spend most of their time on the Internet, it is mainly a web browser with limited ability to run
local applications, though it has a built-in file manager and media player. Instead, it relies on Internet applications (or
Web apps) used in the web browser to accomplish tasks such as word processing, as well as online storage for
storing most files.
Microsoft Windows
Microsoft Windows is a family of proprietary operating systems
designed by Microsoft Corporation and primarily targeted to Intel
architecture based computers, with an estimated 88.9 percent total
usage share on Web connected computers.[8][12][13][14] The newest
version is Windows 8 for workstations and Windows Server 2012 for
servers. Windows 7 recently overtook Windows XP as the most used
OS.[15][16][17]
[Figure: Bootable Windows To Go USB flash drive]
Microsoft Windows originated in 1985 as an operating environment running on top of MS-DOS, which was the
standard operating system shipped on most Intel architecture personal computers at the time. In 1995, Windows 95
was released which only used MS-DOS as a bootstrap. For backwards compatibility, Win9x could run real-mode
MS-DOS[18][19] and 16-bit Windows 3.x[20] drivers. Windows ME, released in 2000, was the last version in the
Win9x family. Later versions have all been based on the Windows NT kernel. Current versions of Windows run on
IA-32 and x86-64 microprocessors, although Windows 8 also supports the ARM architecture. In the past, Windows NT
supported non-Intel architectures.
Server editions of Windows are widely used. In recent years, Microsoft has expended significant capital in an effort
to promote the use of Windows as a server operating system. However, Windows' usage on servers is not as
widespread as on personal computers, as Windows competes against Linux and BSD for server market share.[21][22]
Other
There have been many operating systems that were significant in their day but are no longer so, such as AmigaOS;
OS/2 from IBM and Microsoft; Mac OS, the non-Unix precursor to Apple's Mac OS X; BeOS; XTS-300; RISC OS;
MorphOS and FreeMint. Some are still used in niche markets and continue to be developed as minority platforms for
enthusiast communities and specialist applications. OpenVMS formerly from DEC, is still under active development
by Hewlett-Packard. Yet other operating systems are used almost exclusively in academia, for operating systems
education or to do research on operating system concepts. A typical example of a system that fulfills both roles is
MINIX, while for example Singularity is used purely for research.
Other operating systems have failed to win significant market share, but have introduced innovations that have
influenced mainstream operating systems, not least Bell Labs' Plan 9.
Components
The components of an operating system all exist in order to make the different parts of a computer work together. All
user software needs to go through the operating system in order to use any of the hardware, whether it be as simple
as a mouse or keyboard or as complex as an Internet component.
Kernel
With the aid of the firmware and device drivers, the kernel provides the
most basic level of control over all of the computer's hardware devices.
It manages memory access for programs in the RAM, it determines
which programs get access to which hardware resources, it sets up or
resets the CPU's operating states for optimal operation at all times, and
it organizes the data for long-term non-volatile storage with file
systems on such media as disks, tapes, flash memory, etc.
[Figure: A kernel connects the application software to the hardware of a computer.]

Program execution

The operating system provides an interface between an application program and the computer hardware, so that an
application program can interact with the hardware only by obeying rules and procedures programmed into the
operating system. The operating system is also a set of services which simplify development and execution of
application programs. Executing an application program involves the creation of a process by the operating system
kernel, which assigns memory space and other resources, establishes a priority for the process in multi-tasking
systems, loads program binary code into memory, and initiates execution of the application program, which then
interacts with the user and with hardware devices.
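
The POSIX interface makes this sequence visible from user space. Below is a minimal sketch, assuming a Unix-like
system (/bin/ls is just an illustrative program): the parent asks the kernel to create a process, and the child then asks
the kernel to load a program binary into it.

    /* A sketch of process creation on a POSIX system: fork() asks the
     * kernel to create a new process; exec() asks it to load a program
     * binary into that process; wait() collects the result. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();               /* kernel creates a child process */
        if (pid < 0) {
            perror("fork");
            return EXIT_FAILURE;
        }
        if (pid == 0) {                   /* child: load a new program image */
            execl("/bin/ls", "ls", "-l", (char *)NULL);
            perror("execl");              /* reached only if exec failed */
            _exit(EXIT_FAILURE);
        }
        int status;                       /* parent: wait for the child */
        waitpid(pid, &status, 0);
        printf("child exited with status %d\n", WEXITSTATUS(status));
        return EXIT_SUCCESS;
    }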
Interrupts
Interrupts are central to operating systems, as they provide an efficient way for the operating system to interact with
and react to its environment. The alternative, having the operating system "watch" the various sources of input for
events that require action (polling), can be found in older systems with very small stacks (50 or 60 bytes) but is
unusual in modern systems with large stacks. Interrupt-based programming is directly supported by most modern
CPUs. Interrupts provide a computer with a way of automatically saving local register contexts, and running specific
code in response to events. Even very basic computers support hardware interrupts, and allow the programmer to
specify code which may be run when that event takes place.
When an interrupt is received, the computer's hardware automatically suspends whatever program is currently
running, saves its status, and runs computer code previously associated with the interrupt; this is analogous to
placing a bookmark in a book in response to a phone call. In modern operating systems, interrupts are handled by the
operating system's kernel. Interrupts may come from either the computer's hardware or from the running program.
When a hardware device triggers an interrupt, the operating system's kernel decides how to deal with this event,
generally by running some processing code. The amount of code being run depends on the priority of the interrupt
(for example: a person usually responds to a smoke detector alarm before answering the phone). The processing of
hardware interrupts is a task that is usually delegated to software called a device driver, which may be either part of
the operating system's kernel, part of another program, or both. Device drivers may then relay information to a
running program by various means.
A program may also trigger an interrupt to the operating system. If a program wishes to access hardware, for
example, it may interrupt the operating system's kernel, which causes control to be passed back to the kernel. The
kernel will then process the request. If a program wishes additional resources (or wishes to shed resources) such as
memory, it will trigger an interrupt to get the kernel's attention.
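
On Linux, for instance, this trap is visible even through the C library: a sketch (SYS_write and the syscall() wrapper
are Linux-specific) in which the wrapped write() call and a raw system call request exactly the same kernel service.

    /* Sketch (Linux): a program deliberately trapping into the kernel.
     * Both lines request the same kernel service; the second bypasses
     * the C library wrapper and issues the trap directly. */
    #define _GNU_SOURCE
    #include <string.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    int main(void) {
        const char msg1[] = "via the C library wrapper\n";
        const char msg2[] = "via a raw system call\n";
        write(STDOUT_FILENO, msg1, strlen(msg1));              /* wrapper */
        syscall(SYS_write, STDOUT_FILENO, msg2, strlen(msg2)); /* direct trap */
        return 0;
    }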
Modes
Modern CPUs support multiple modes of operation. CPUs with this capability use at least two modes: protected
mode and supervisor mode. The supervisor mode is used by the operating system's kernel for low level tasks that
need unrestricted access to hardware, such as controlling how memory is written and erased, and communication
with devices like graphics cards. Protected mode, in contrast, is used for almost everything else. Applications operate
within protected mode, and can only use hardware by communicating with the kernel, which controls everything in
supervisor mode. CPUs might have other modes similar to protected mode as well, such as the virtual modes used to
emulate older processor types, such as 16-bit processors on a 32-bit one, or 32-bit processors on a 64-bit one.

[Figure: Privilege rings for the x86, available in protected mode. Operating systems determine which processes run
in each mode.]
When a computer first starts up, it is automatically running in supervisor mode. The first few programs to run on the
computer, being the BIOS or EFI, bootloader, and the operating system, have unlimited access to hardware, and this
is required because, by definition, initializing a protected environment can only be done outside of one. However,
when the operating system passes control to another program, it can place the CPU into protected mode.
In protected mode, programs may have access to a more limited set of the CPU's instructions. A user program may
leave protected mode only by triggering an interrupt, causing control to be passed back to the kernel. In this way the
operating system can maintain exclusive control over things like access to hardware and memory.
The term "protected mode resource" generally refers to one or more CPU registers, which contain information that
the running program isn't allowed to alter. Attempts to alter these resources generally cause a switch to supervisor
mode, where the operating system can deal with the illegal operation the program was attempting (for example, by
killing the program).
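
The division of privilege can be observed directly. The sketch below is x86-specific and assumes Linux signal
semantics: it attempts the privileged HLT instruction from user mode, the CPU refuses and traps to the kernel, and
the kernel reports the violation back to the process as a signal.

    /* Sketch (x86, Linux): attempting a privileged instruction from
     * protected (user) mode. The CPU traps to the kernel, which
     * delivers a signal instead of halting the machine. */
    #include <signal.h>
    #include <stdlib.h>
    #include <unistd.h>

    static void on_fault(int sig) {
        (void)sig;
        static const char note[] = "privileged instruction refused by the kernel\n";
        write(STDOUT_FILENO, note, sizeof note - 1);  /* async-signal-safe */
        _exit(0);
    }

    int main(void) {
        signal(SIGSEGV, on_fault);        /* general protection fault */
        signal(SIGILL, on_fault);         /* some systems report SIGILL */
        __asm__ volatile("hlt");          /* only supervisor mode may halt */
        return 1;                         /* never reached */
    }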
Memory management
Among other things, a multiprogramming operating system kernel must be responsible for managing all system
memory which is currently in use by programs. This ensures that a program does not interfere with memory already
in use by another program. Since programs time share, each program must have independent access to memory.
Cooperative memory management, used by many early operating systems, assumes that all programs make voluntary
use of the kernel's memory manager, and do not exceed their allocated memory. This system of memory
management is almost never seen any more, since programs often contain bugs which can cause them to exceed their
allocated memory. If a program fails, it may cause memory used by one or more other programs to be affected or
overwritten. Malicious programs or viruses may purposefully alter another program's memory, or may affect the
operation of the operating system itself. With cooperative memory management, it takes only one misbehaved
program to crash the system.
Memory protection enables the kernel to limit a process' access to the computer's memory. Various methods of
memory protection exist, including memory segmentation and paging. All methods require some level of hardware
support (such as the 80286 MMU), which doesn't exist in all computers.
In both segmentation and paging, certain protected mode registers specify to the CPU what memory address it
should allow a running program to access. Attempts to access other addresses will trigger an interrupt which will
cause the CPU to re-enter supervisor mode, placing the kernel in charge. This is called a segmentation violation, or
Seg-V for short. Since it is difficult to assign a meaningful result to such an operation, and because it is usually a
sign of a misbehaving program, the kernel will generally resort to terminating the offending program and will
report the error.
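
This behaviour is easy to provoke deliberately. A sketch (POSIX; MAP_ANONYMOUS is assumed to be available):
a page is mapped with no access permissions, the program touches it, the MMU faults into the kernel, and the
process is killed with a segmentation violation.

    /* Sketch (POSIX): hardware memory protection. Touching a page
     * mapped PROT_NONE makes the MMU fault into the kernel, which by
     * default kills the process with a segmentation violation. */
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        long pagesize = sysconf(_SC_PAGESIZE);
        char *page = mmap(NULL, (size_t)pagesize, PROT_NONE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (page == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        printf("touching a page the kernel marked inaccessible...\n");
        page[0] = 'x';                    /* faults: kernel raises SIGSEGV */
        printf("never printed\n");
        return 0;
    }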
Windows 3.1-Me had some level of memory protection, but programs could easily circumvent the need to use it. A
general protection fault would be produced, indicating a segmentation violation had occurred; however, the system
would often crash anyway.
Virtual memory
The use of virtual memory addressing (such as paging or
segmentation) means that the kernel can choose what
memory each program may use at any given time, allowing
the operating system to use the same memory locations for
multiple tasks.
If a program tries to access memory that isn't in its current
range of accessible memory, but nonetheless has been
allocated to it, the kernel will be interrupted in the same
way as it would if the program were to exceed its allocated
memory. (See section on memory management.) Under
UNIX this kind of interrupt is referred to as a page fault.
When the kernel detects a page fault it will generally adjust
the virtual memory range of the program which triggered it,
granting it access to the memory requested. This gives the
kernel discretionary power over where a particular
application's memory is stored, or even whether or not it
has actually been allocated yet.
In modern operating systems, memory which is accessed
less frequently can be temporarily stored on disk or other
media to make that space available for use by other
programs. This is called swapping, as an area of memory
can be used by multiple programs, and what that memory
area contains can be swapped or exchanged on demand.
"Virtual memory" provides the programmer or the user
with the perception that there is a much larger amount of
RAM in the computer than is really there.[23]
[Figure: Many operating systems can "trick" programs into using memory scattered around the hard disk and RAM
as if it is one continuous chunk of memory, called virtual memory.]
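
Demand paging can also be made visible from user space. A sketch (Linux/POSIX; the getrusage() minor-fault
counter and a 4 KiB page size are assumptions): the kernel grants a large mapping immediately, but physical frames
are allocated only as pages are first touched, each first touch being a page fault the kernel services transparently.

    /* Sketch (Linux/POSIX): demand paging. The mapping is granted at
     * once; physical memory is allocated page by page, on first touch,
     * via page faults the kernel services behind the program's back. */
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/resource.h>

    static long minor_faults(void) {
        struct rusage ru;
        getrusage(RUSAGE_SELF, &ru);
        return ru.ru_minflt;              /* faults serviced without disk I/O */
    }

    int main(void) {
        size_t len = 64 * 1024 * 1024;    /* 64 MiB of virtual address space */
        char *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (mem == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        long before = minor_faults();
        for (size_t i = 0; i < len; i += 4096)
            mem[i] = 1;                   /* first touch of each page */
        printf("page faults serviced: %ld\n", minor_faults() - before);
        munmap(mem, len);
        return 0;
    }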
Multitasking
Multitasking refers to the running of multiple independent computer programs on the same computer, giving the
appearance that it is performing the tasks at the same time. Since most computers can do at most one or two things at
one time, this is generally done via time-sharing, which means that each program uses a share of the computer's time
to execute.
An operating system kernel contains a piece of software called a scheduler which determines how much time each
program will spend executing, and in which order execution control should be passed to programs. Control is passed
to a process by the kernel, which allows the program access to the CPU and memory. Later, control is returned to the
kernel through some mechanism, so that another program may be allowed to use the CPU. This passing of
control between the kernel and applications is called a context switch.
An early model which governed the allocation of time to programs was called cooperative multitasking. In this
model, when control is passed to a program by the kernel, it may execute for as long as it wants before explicitly
returning control to the kernel. This means that a malicious or malfunctioning program may not only prevent any
other programs from using the CPU, but it can hang the entire system if it enters an infinite loop.
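
The cooperative model can be sketched in a few lines of user-space C using the POSIX ucontext interface
(obsolescent but still widely available); the "task" below runs only until it voluntarily yields, which is exactly the
property a malfunctioning task can abuse.

    /* Sketch (POSIX ucontext): cooperative multitasking in miniature.
     * The task keeps the CPU until it voluntarily yields; a task that
     * never yields would hang this little scheduler. */
    #include <stdio.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, task_ctx;

    static void task(void) {
        for (int i = 0; i < 3; i++) {
            printf("task: step %d, yielding\n", i);
            swapcontext(&task_ctx, &main_ctx);   /* voluntary yield */
        }
    }

    int main(void) {
        static char stack[64 * 1024];            /* the task's own stack */
        getcontext(&task_ctx);
        task_ctx.uc_stack.ss_sp = stack;
        task_ctx.uc_stack.ss_size = sizeof stack;
        task_ctx.uc_link = &main_ctx;            /* resume here if task returns */
        makecontext(&task_ctx, task, 0);

        for (int i = 0; i < 3; i++) {
            printf("scheduler: dispatching task\n");
            swapcontext(&main_ctx, &task_ctx);   /* context switch to the task */
        }
        return 0;
    }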
Modern operating systems extend the concepts of application preemption to device drivers and kernel code, so that
the operating system has preemptive control over internal run-times as well.
The philosophy governing preemptive multitasking is that of ensuring that all programs are given regular time on the
CPU. This implies that all programs must be limited in how much time they are allowed to spend on the CPU
without being interrupted. To accomplish this, modern operating system kernels make use of a timed interrupt. A
protected mode timer is set by the kernel which triggers a return to supervisor mode after the specified time has
elapsed. (See above sections on Interrupts and Modes.)
On many single user operating systems cooperative multitasking is perfectly adequate, as home computers generally
run a small number of well tested programs. The AmigaOS is an exception, having pre-emptive multitasking from its
very first version. Windows NT was the first version of Microsoft Windows which enforced preemptive
multitasking, but it didn't reach the home user market until Windows XP (since Windows NT was targeted at
professionals).
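
A user-space analogy of that timed interrupt (a sketch, assuming POSIX setitimer and signals): the SIGALRM
handler fires no matter what the busy loop is doing, just as the kernel's timer interrupt wrests control from a
running program.

    /* Sketch (POSIX): a user-space analogy to preemption. The timer
     * signal interrupts the busy loop even though the loop never
     * yields, mirroring the kernel's timed interrupt. */
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/time.h>

    static volatile sig_atomic_t ticks = 0;

    static void on_tick(int sig) {
        (void)sig;
        ticks++;                          /* "scheduler" ran; loop resumes */
    }

    int main(void) {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_tick;
        sigaction(SIGALRM, &sa, NULL);

        struct itimerval tv = { { 0, 100000 }, { 0, 100000 } }; /* 100 ms */
        setitimer(ITIMER_REAL, &tv, NULL);

        while (ticks < 10)                /* busy work that never yields */
            ;
        printf("interrupted %d times by the timer\n", (int)ticks);
        return 0;
    }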
Disk access and file systems
Access to data stored on disks is a central feature of
all operating systems. Computers store data on disks
using files, which are structured in specific ways in
order to allow for faster access, higher reliability, and
to make better use out of the drive's available space.
The specific way in which files are stored on a disk is
called a file system, and enables files to have names
and attributes. It also allows them to be stored in a
hierarchy of directories or folders arranged in a
directory tree.
[Figure: Filesystems allow users and programs to organize and sort files on a computer, often through the use of
directories (or "folders").]

Early operating systems generally supported a single type of disk drive and only one kind of file system. Early file
systems were limited in their capacity, speed, and in the kinds of file names and directory structures they could use.
These limitations often reflected limitations in the operating systems they were designed for, making it very difficult
for an operating system to support more than one file system.
While many simpler operating systems support a limited range of options for accessing storage systems, operating
systems like UNIX and Linux support a technology known as a virtual file system or VFS. An operating system such
as UNIX supports a wide array of storage devices, regardless of their design or file systems, allowing them to be
accessed through a common application programming interface (API). This makes it unnecessary for programs to
have any knowledge about the device they are accessing. A VFS allows the operating system to provide programs
with access to an unlimited number of devices with an infinite variety of file systems installed on them, through the
use of specific device drivers and file system drivers.
A connected storage device, such as a hard drive, is accessed through a device driver. The device driver understands
the specific language of the drive and is able to translate that language into a standard language used by the operating
system to access all disk drives. On UNIX, this is the language of block devices.
When the kernel has an appropriate device driver in place, it can then access the contents of the disk drive in raw
format, which may contain one or more file systems. A file system driver is used to translate the commands used to
access each specific file system into a standard set of commands that the operating system can use to talk to all file
systems. Programs can then deal with these file systems on the basis of filenames, and directories/folders, contained
within a hierarchical structure. They can create, delete, open, and close files, as well as gather various information
about them, including access permissions, size, free space, and creation and modification dates.
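
From a program's point of view the VFS is simply the ordinary file API. A sketch (POSIX; the default path is
illustrative): the same stat() call answers regardless of which file system driver sits underneath the file.

    /* Sketch (POSIX): the uniform API a VFS presents. stat() works the
     * same whether the file lives on ext3, NTFS via a driver, or a
     * network file system. */
    #include <stdio.h>
    #include <sys/stat.h>
    #include <time.h>

    int main(int argc, char **argv) {
        const char *path = argc > 1 ? argv[1] : "/etc/hosts"; /* any file */
        struct stat st;
        if (stat(path, &st) != 0) {
            perror("stat");
            return 1;
        }
        printf("%s: %lld bytes, mode %o, modified %s",
               path, (long long)st.st_size,
               (unsigned)(st.st_mode & 07777), ctime(&st.st_mtime));
        return 0;
    }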
Various differences between file systems make supporting all file systems difficult. Allowed characters in file
names, case sensitivity, and the presence of various kinds of file attributes makes the implementation of a single
interface for every file system a daunting task. Operating systems tend to recommend using (and so support natively)
file systems specifically designed for them; for example, NTFS in Windows and ext3 and ReiserFS in Linux.
However, in practice, third-party drivers are usually available to give support for the most widely used file systems in
most general-purpose operating systems (for example, NTFS is available in Linux through NTFS-3g, and ext2/3 and
ReiserFS are available in Windows through third-party software).
Support for file systems is highly varied among modern operating systems, although there are several common file
systems which almost all operating systems include support and drivers for. Operating systems vary on file system
support and on the disk formats they may be installed on. Under Windows, each file system is usually limited in
application to certain media; for example, CDs must use ISO 9660 or UDF, and as of Windows Vista, NTFS is the
only file system which the operating system can be installed on. It is possible to install Linux onto many types of file
systems. Unlike other operating systems, Linux and UNIX allow any file system to be used regardless of the media it
is stored in, whether it is a hard drive, a disc (CD, DVD, ...), a USB flash drive, or even contained within a file located
on another file system.
Device drivers
A device driver is a specific type of computer software developed to allow interaction with hardware devices.
Typically this constitutes an interface for communicating with the device, through the specific computer bus or
communications subsystem that the hardware is connected to, providing commands to and/or receiving data from the
device, and on the other end, the requisite interfaces to the operating system and software applications. It is a
specialized, hardware-dependent computer program, specific to a given operating system, that enables another
program (typically the operating system, an applications software package, or a program running under the
operating system kernel) to interact transparently with a hardware device, and it usually provides the interrupt
handling required by asynchronous, time-dependent hardware interfaces.
The key design goal of device drivers is abstraction. Every model of hardware (even within the same class of device)
is different. Newer models also are released by manufacturers that provide more reliable or better performance and
these newer models are often controlled differently. Computers and their operating systems cannot be expected to
know how to control every device, both now and in the future. To solve this problem, operating systems essentially
dictate how every type of device should be controlled. The function of the device driver is then to translate these
operating system mandated function calls into device specific calls. In theory a new device, which is controlled in a
new manner, should function correctly if a suitable driver is available. This new driver will ensure that the device
appears to operate as usual from the operating system's point of view.
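
The Linux character-device interface illustrates this translation step. The sketch below is a toy driver (the device
name and message are invented for illustration, and a kernel build environment is assumed): the kernel dictates the
file_operations table, and the driver fills in the device-specific behaviour behind it.

    /* Sketch: a toy Linux character-device driver. The kernel dictates
     * the file_operations interface; the driver supplies the
     * device-specific behaviour behind each entry point. */
    #include <linux/fs.h>
    #include <linux/init.h>
    #include <linux/module.h>

    static int major;
    static const char msg[] = "hello from a toy driver\n";

    /* The OS-mandated "read" call, translated into this device's terms. */
    static ssize_t toy_read(struct file *filp, char __user *buf,
                            size_t count, loff_t *ppos) {
        return simple_read_from_buffer(buf, count, ppos, msg, sizeof msg - 1);
    }

    static const struct file_operations toy_fops = {
        .owner = THIS_MODULE,
        .read  = toy_read,
    };

    static int __init toy_init(void) {
        major = register_chrdev(0, "toy", &toy_fops); /* 0: pick any major */
        return major < 0 ? major : 0;
    }

    static void __exit toy_exit(void) {
        unregister_chrdev(major, "toy");
    }

    module_init(toy_init);
    module_exit(toy_exit);
    MODULE_LICENSE("GPL");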
Under versions of Windows before Vista and versions of Linux before 2.6, all driver execution was co-operative,
meaning that if a driver entered an infinite loop it would freeze the system. More recent revisions of these operating
systems incorporate kernel preemption, where the kernel interrupts the driver to give it tasks, and then separates
itself from the process until it receives a response from the device driver, or gives it more tasks to do.
Networking
Currently most operating systems support a variety of networking protocols, hardware, and applications for using
them. This means that computers running dissimilar operating systems can participate in a common network for
sharing resources such as computing, files, printers, and scanners using either wired or wireless connections.
Networks can essentially allow a computer's operating system to access the resources of a remote computer to
support the same functions as it could if those resources were connected directly to the local computer. This includes
everything from simple communication, to using networked file systems or even sharing another computer's graphics
or sound hardware. Some network services allow the resources of a computer to be accessed transparently, such as
SSH which allows networked users direct access to a computer's command line interface.
Client/server networking allows a program on a computer, called a client, to connect via a network to another
computer, called a server. Servers offer (or host) various services to other network computers and users. These
services are usually provided through ports or numbered access points beyond the server's network address. Each
port number is usually associated with a maximum of one running program, which is responsible for handling
requests to that port. A daemon, being a user program, can in turn access the local hardware resources of that
computer by passing requests to the operating system kernel.
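
The sockets API makes the client side of this arrangement concrete. A sketch (POSIX sockets; example.com and
port 80 are illustrative): the client asks the operating system to connect it to whatever program is listening on the
server's port.

    /* Sketch (POSIX sockets): a client connecting to a numbered port on
     * a server and exchanging a little data with whatever program is
     * listening there. */
    #include <netdb.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        struct addrinfo hints, *res;
        memset(&hints, 0, sizeof hints);
        hints.ai_socktype = SOCK_STREAM;          /* TCP */
        if (getaddrinfo("example.com", "80", &hints, &res) != 0) {
            fprintf(stderr, "name lookup failed\n");
            return 1;
        }
        int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
        if (fd < 0 || connect(fd, res->ai_addr, res->ai_addrlen) != 0) {
            perror("connect");
            return 1;
        }
        const char req[] = "HEAD / HTTP/1.0\r\nHost: example.com\r\n\r\n";
        write(fd, req, strlen(req));              /* talk to the server's port */
        char buf[256];
        ssize_t n = read(fd, buf, sizeof buf - 1);
        if (n > 0) { buf[n] = '\0'; printf("%s", buf); }
        close(fd);
        freeaddrinfo(res);
        return 0;
    }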
Many operating systems support one or more vendor-specific or open networking protocols as well, for example,
SNA on IBM systems, DECnet on systems from Digital Equipment Corporation, and Microsoft-specific protocols
(SMB) on Windows. Specific protocols for specific tasks may also be supported such as NFS for file access.
Protocols like ESound, or esd can be easily extended over the network to provide sound from local applications, on a
remote system's sound hardware.
Security
The security of a computer depends on a number of technologies working properly. A modern operating system
provides access to a number of resources, which are available to software running on the system, and to external
devices like networks via the kernel.
The operating system must be capable of distinguishing between requests which should be allowed to be processed,
and others which should not be processed. While some systems may simply distinguish between "privileged" and
"non-privileged", systems commonly have a form of requester identity, such as a user name. To establish identity
there may be a process of authentication. Often a username must be quoted, and each username may have a
password. Other methods of authentication, such as magnetic cards or biometric data, might be used instead. In some
cases, especially connections from the network, resources may be accessed with no authentication at all (such as
reading files over a network share). Also covered by the concept of requester identity is authorization; the particular
services and resources accessible by the requester once logged into a system are tied to either the requester's user
account or to the variously configured groups of users to which the requester belongs.
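
A sketch of how identity and authorization look through the POSIX interface (/etc/shadow is simply an example of a
file most users are not authorized to read): the kernel knows the requester's identity and answers authorization
questions against it.

    /* Sketch (POSIX): requester identity and authorization. The kernel
     * tracks who the process is running as; access() asks whether that
     * identity may read a given file. */
    #include <pwd.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        uid_t uid = getuid();                 /* who is making the request? */
        struct passwd *pw = getpwuid(uid);    /* map the ID to a user name */
        printf("running as %s (uid %d)\n",
               pw ? pw->pw_name : "?", (int)uid);

        const char *path = "/etc/shadow";     /* normally root-only */
        if (access(path, R_OK) == 0)
            printf("authorized to read %s\n", path);
        else
            perror(path);                     /* typically: Permission denied */
        return 0;
    }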
In addition to the allow/disallow model of security, a system with a high level of security will also offer auditing
options. These would allow tracking of requests for access to resources (such as, "who has been reading this file?").
Internal security, or security from an already running program, is only possible if all possibly harmful requests must
be carried out through interrupts to the operating system kernel. If programs can directly access hardware and
resources, they cannot be secured.
External security involves a request from outside the computer, such as a login at a connected console or some kind
of network connection. External requests are often passed through device drivers to the operating system's kernel,
where they can be passed onto applications, or carried out directly. Security of operating systems has long been a
concern because of highly sensitive data held on computers, both of a commercial and military nature. The United
States Government Department of Defense (DoD) created the Trusted Computer System Evaluation Criteria
(TCSEC) which is a standard that sets basic requirements for assessing the effectiveness of security. This became of
vital importance to operating system makers, because the TCSEC was used to evaluate, classify and select trusted
operating systems being considered for the processing, storage and retrieval of sensitive or classified information.
Network services include offerings such as file sharing, print services, email, web sites, and file transfer protocols
(FTP), most of which can have compromised security. At the front line of security are hardware devices known as
firewalls or intrusion detection/prevention systems. At the operating system level, there are a number of software
firewalls available, as well as intrusion detection/prevention systems. Most modern operating systems include a
software firewall, which is enabled by default. A software firewall can be configured to allow or deny network traffic
to or from a service or application running on the operating system. Therefore, one can install and be running an
insecure service, such as Telnet or FTP, and not have to be threatened by a security breach because the firewall
would deny all traffic trying to connect to the service on that port.
An alternative strategy, and the only sandbox strategy available in systems that do not meet the Popek and Goldberg
virtualization requirements, is for the operating system not to run user programs as native code, but instead to either
emulate a processor or provide a host for a p-code based system such as Java.
Internal security is especially relevant for multi-user systems; it allows each user of the system to have private files
that the other users cannot tamper with or read. Internal security is also vital if auditing is to be of any use, since a
program can potentially bypass the operating system, inclusive of bypassing auditing.
User interface
Every computer that is to be operated by an individual
requires a user interface. The user interface is usually
referred to as a shell and is essential if human
interaction is to be supported. The user interface
views the directory structure and requests services
from the operating system that will acquire data from
input hardware devices, such as a keyboard, mouse or
credit card reader, and requests operating system
services to display prompts, status messages and such
on output hardware devices, such as a video monitor
or printer. The two most common forms of a user
interface have historically been the command-line
interface, where computer commands are typed out
line-by-line, and the graphical user interface, where a
visual environment (most commonly a WIMP) is
present.
[Figure: A screenshot of the Bourne Again Shell command line. Each command is typed out after the 'prompt', and
then its output appears below, working its way down the screen. The current command prompt is at the bottom.]
Graphical user interfaces
Most modern computer systems support graphical user interfaces (GUI), and often include them. In some computer
systems, such as the original implementation of Mac OS, the GUI is integrated into the kernel.

[Figure: A screenshot of the KDE Plasma Desktop graphical user interface. Programs take the form of images on the
screen, and the files, folders (directories), and applications take the form of icons and symbols. A mouse is used to
navigate the computer.]

While technically a graphical user interface is not an operating system service, incorporating support for one into the
operating system kernel can allow the GUI to be more responsive by reducing the number of context switches
required for the GUI to perform its output functions. Other operating systems are modular, separating the graphics
subsystem from the kernel and the operating system. In the 1980s UNIX, VMS and many others had operating
systems that were built this way. Linux and Mac OS X are also built this way. Modern releases of Microsoft
Windows such as Windows Vista implement a graphics subsystem that is mostly in user-space; however the graphics
drawing routines of versions between Windows NT 4.0 and Windows Server 2003 exist mostly in kernel space.
Windows 9x had very little distinction between the interface and the kernel.
Many computer operating systems allow the user to install or create any user interface they desire. The X Window
System in conjunction with GNOME or KDE Plasma Desktop is a commonly found setup on most Unix and
Unix-like (BSD, Linux, Solaris) systems. A number of Windows shell replacements have been released for
Microsoft Windows, which offer alternatives to the included Windows shell, but the shell itself cannot be separated
from Windows.
Numerous Unix-based GUIs have existed over time, most derived from X11. Competition among the various
vendors of Unix (HP, IBM, Sun) led to much fragmentation, though an effort to standardize in the 1990s to COSE
and CDE failed for various reasons, and were eventually eclipsed by the widespread adoption of GNOME and K
Desktop Environment. Prior to free software-based toolkits and desktop environments, Motif was the prevalent
toolkit/desktop combination (and was the basis upon which CDE was developed).
Graphical user interfaces evolve over time. For example, Windows has modified its user interface almost every time
a new major version of Windows is released, and the Mac OS GUI changed dramatically with the introduction of
Mac OS X in 1999.[24]
Real-time operating systems
A real-time operating system (RTOS) is a multitasking operating system intended for applications with fixed
deadlines (real-time computing). Such applications include some small embedded systems, automobile engine
controllers, industrial robots, spacecraft, industrial control, and some large-scale computing systems.
An early example of a large-scale real-time operating system was Transaction Processing Facility developed by
American Airlines and IBM for the Sabre Airline Reservations System.
Embedded systems that have fixed deadlines use a real-time operating system such as VxWorks, PikeOS, eCos,
QNX, MontaVista Linux and RTLinux. Windows CE is a real-time operating system that shares similar APIs to
desktop Windows but shares none of desktop Windows' codebase. Symbian OS also has an RTOS kernel (EKA2)
starting with version 8.0b.
Some embedded systems use operating systems such as Palm OS, BSD, and Linux, although such operating systems
do not support real-time computing.
Operating system development as a hobby
Operating system development is one of the most complicated activities in which a computing hobbyist may engage.
A hobby operating system may be classified as one whose code has not been directly derived from an existing
operating system, and has few users and active developers.[25]
In some cases, hobby development is in support of a "homebrew" computing device, for example, a simple
single-board computer powered by a 6502 microprocessor. Or, development may be for an architecture already in
widespread use. Operating system development may come from entirely new concepts, or may commence by
modeling an existing operating system. In either case, the hobbyist is his/her own developer, or may interact with a
small and sometimes unstructured group of individuals who have like interests.
Examples of a hobby operating system include ReactOS and Syllable.
Diversity of operating systems and portability
Application software is generally written for use on a specific operating system, and sometimes even for specific
hardware. When porting the application to run on another OS, the functionality required by that application may be
implemented differently by that OS (the names of functions, meaning of arguments, etc.) requiring the application to
be adapted, changed, or otherwise maintained.
This cost in supporting operating systems diversity can be avoided by instead writing applications against software
platforms like Java or Qt. These abstractions have already borne the cost of adaptation to specific operating systems
and their system libraries.
Another approach is for operating system vendors to adopt standards. For example, POSIX and OS abstraction layers
provide commonalities that reduce porting costs.
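As a concrete illustration of the POSIX approach, the following C fragment uses only POSIX interfaces, so it should build unchanged on most Unix-like systems (Linux, the BSDs, Mac OS X, Solaris); a Windows port would need a different API. This is an illustrative sketch (the file path is invented), not code from any particular system:

#include <fcntl.h>
#include <unistd.h>

int main(void) {
    /* open(2), write(2) and close(2) are specified by POSIX, so this code
       is not tied to any single operating system's native API. */
    int fd = open("/tmp/example.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return 1;
    write(fd, "portable\n", 9);
    close(fd);
    return 0;
}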
References
[1] Stallings (2005). Operating Systems, Internals and Design Principles. Pearson: Prentice Hall. p. 6.
[2] Dhotre, I.A. (2009). Operating Systems. Technical Publications. p. 1.
[3] "Operating System Market Share" (http://marketshare.hitslink.com/operating-system-market-share.aspx?qprid=10). Net Applications.
[4] Hansen, Per Brinch, ed. (2001). Classic Operating Systems (http://books.google.com/?id=-PDPBvIPYBkC&lpg=PP1&pg=PP1#v=onepage&q). Springer. pp. 4–7. ISBN 0-387-95113-X.
[5] Ritchie, Dennis. "Unix Manual, first edition" (http://cm.bell-labs.com/cm/cs/who/dmr/1stEdman.html). Lucent Technologies. Retrieved 22 November 2012.
[6] "OS X Mountain Lion – Move your Mac even further ahead" (http://www.apple.com/macosx/lion/). Apple. Retrieved 2012-08-07.
[7] Usage share of operating systems
[8] "Top 5 Operating Systems from January to April 2011" (http://gs.statcounter.com/#os-ww-monthly-201101-201104-bar). StatCounter. October 2009. Retrieved November 5, 2009.
[9] "IDC report into Server market share" (http://www.idc.com/about/viewpressrelease.jsp?containerId=prUS22360110&sectionId=null&elementId=null&pageType=SYNOPSIS). Idc.com. Retrieved 2012-08-07.
[10] Linux still top embedded OS (http://www.linuxdevices.com/news/NS4920597981.html)
[11] Jermoluk, Tom (2012-08-03). "TOP500 List – November 2010 (1–100) | TOP500 Supercomputing Sites" (http://www.top500.org/list/2010/11/100). Top500.org. Retrieved 2012-08-07.
[12] "Global Web Stats" (http://marketshare.hitslink.com/operating-system-market-share.aspx?qprid=8). Net Market Share, Net Applications. May 2011. Retrieved 2011-05-07.
[13] "Global Web Stats" (http://www.w3counter.com/globalstats.php). W3Counter, Awio Web Services. September 2009. Retrieved 2009-10-24.
[14] "Operating System Market Share" (http://marketshare.hitslink.com/operating-system-market-share.aspx?qprid=8). Net Applications. October 2009. Retrieved November 5, 2009.
[15] "w3schools.com OS Platform Statistics" (http://www.w3schools.com/browsers/browsers_os.asp). Retrieved October 30, 2011.
[16] "Stats Count Global Stats Top Five Operating Systems" (http://gs.statcounter.com/#os-ww-monthly-201010-201110). Retrieved October 30, 2011.
[17] "Global statistics at w3counter.com" (http://www.w3counter.com/globalstats.php). Retrieved 23 January 2012.
[18] "Troubleshooting MS-DOS Compatibility Mode on Hard Disks" (http://support.microsoft.com/kb/130179/EN-US). Support.microsoft.com. Retrieved 2012-08-07.
[19] "Using NDIS 2 PCMCIA Network Card Drivers in Windows 95" (http://support.microsoft.com/kb/134748/en). Support.microsoft.com. Retrieved 2012-08-07.
[20] "INFO: Windows 95 Multimedia Wave Device Drivers Must be 16 bit" (http://support.microsoft.com/kb/163354/en). Support.microsoft.com. Retrieved 2012-08-07.
[21] "Operating System Share by Groups for Sites in All Locations January 2009" (http://news.netcraft.com/SSL-Survey/CMatch/osdv_all).
[22] "Behind the IDC data: Windows still No. 1 in server operating systems" (http://blogs.zdnet.com/microsoft/?p=5408). ZDNet. 2010-02-26.
[23] Stallings, William (2008). Computer Organization & Architecture. New Delhi: Prentice-Hall of India Private Limited. p. 267. ISBN 978-81-203-2962-1.
[24] Poisson, Ken. "Chronology of Personal Computer Software" (http://www.islandnet.com/~kpolsson/compsoft/soft1998.htm). Retrieved on 2008-05-07. Last checked on 2009-03-30.
[25] "My OS is less hobby than yours" (http://www.osnews.com/story/22638/My_OS_Is_Less_Hobby_than_Yours). OSNews. December 21, 2009. Retrieved December 21, 2009.
Further reading
• Auslander, Marc A.; Larkin, David C.; Scherr, Allan L. (1981). The evolution of the MVS Operating System (http://www.research.ibm.com/journal/rd/255/auslander.pdf). IBM J. Research & Development.
• Deitel, Harvey M.; Deitel, Paul; Choffnes, David. Operating Systems. Pearson/Prentice Hall. ISBN 978-0-13-092641-8.
• Bic, Lubomur F.; Shaw, Alan C. (2003). Operating Systems. Pearson: Prentice Hall.
• Silberschatz, Avi; Galvin, Peter; Gagne, Greg (2008). Operating Systems Concepts. John Wiley & Sons. ISBN 0-470-12872-0.
External links
• Operating Systems (http://www.dmoz.org/Computers/Software/Operating_Systems/) at the Open Directory
Project
• Multics History (http://www.cbi.umn.edu/iterations/haigh.html) and the history of operating systems
• How Stuff Works – Operating Systems (http://computer.howstuffworks.com/operating-system.htm)
• Help finding your Operating System type and version (http://whatsmyos.com)
Kernel (computing)
In computing, the kernel is the main component of most computer
operating systems; it is a bridge between applications and the actual
data processing done at the hardware level. The kernel's
responsibilities include managing the system's resources (the
communication between hardware and software components).[1]
Usually, as a basic component of an operating system, a kernel can
provide the lowest-level abstraction layer for the resources (especially
processors and I/O devices) that application software must control to
perform its function. It typically makes these facilities available to
application processes through inter-process communication
mechanisms and system calls.
[Figure: A kernel connects the application software to the hardware of a computer.]
Operating system tasks are done differently by different kernels,
depending on their design and implementation. While monolithic kernels execute all the operating system code in the
same address space to increase the performance of the system, microkernels run most of the operating system
services in user space as servers, aiming to improve maintainability and modularity of the operating system.[2] A
range of possibilities exists between these two extremes.
Kernel basic facilities
The kernel's primary function is to manage the computer's resources and allow other programs to run and use these
resources.[1] Typically, the resources consist of:
• The Central Processing Unit (CPU). This is the central part of a computer system, responsible for running or
executing programs. The kernel takes responsibility for deciding at any time which of the many running programs
should be allocated to the processor or processors (each of which can usually run only one program at a time).
• The computer's memory. Memory is used to store both program instructions and data. Typically, both need to be
present in memory in order for a program to execute. Often multiple programs will want access to memory,
frequently demanding more memory than the computer has available. The kernel is responsible for deciding
which memory each process can use, and determining what to do when not enough is available.
• Any Input/Output (I/O) devices present in the computer, such as keyboard, mouse, disk drives, USB devices,
printers, displays, network adapters, etc. The kernel allocates requests from applications to perform I/O to an
appropriate device (or subsection of a device, in the case of files on a disk or windows on a display) and provides
convenient methods for using the device (typically abstracted to the point where the application does not need to
know implementation details of the device).
Key aspects necessary in resource management are the definition of an execution domain (address space) and the
protection mechanism used to mediate the accesses to the resources within a domain.[1]
Kernels also usually provide methods for synchronization and communication between processes called
inter-process communication (IPC).
A kernel may implement these features itself, or rely on some of the processes it runs to provide the facilities to other
processes, although in this case it must provide some means of IPC to allow processes to access the facilities
provided by each other.
Finally, a kernel must provide running programs with a method to make requests to access these facilities.
Process management
The main task of a kernel is to allow the execution of applications and support them with features such as hardware
abstractions. A process defines which memory portions the application can access.[3] (For this introduction, process,
application and program are used as synonyms.) Kernel process management must take into account the hardware
built-in equipment for memory protection.[4]
To run an application, a kernel typically sets up an address space for the application, loads the file containing the
application's code into memory (perhaps via demand paging), sets up a stack for the program and branches to a given
location inside the program, thus starting its execution.[5]
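From user space on a Unix-like system, this whole sequence is requested through system calls: fork() asks the kernel to clone the calling process, and the exec family asks it to replace the child's address space with a new program image, which the kernel loads before setting up a stack and branching to the entry point. A minimal sketch, assuming a POSIX environment and using /bin/ls purely as an example program:

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();                  /* kernel clones this process */
    if (pid == 0) {
        /* Child: the kernel builds a new address space from the program
           file, sets up its stack and branches to its entry point. */
        char *argv[] = { "/bin/ls", "-l", NULL };
        execv(argv[0], argv);
        perror("execv");                 /* reached only if exec failed */
        _exit(1);
    }
    waitpid(pid, NULL, 0);               /* parent waits for the child */
    return 0;
}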
Multi-tasking kernels are able to give the user the illusion that the number of processes being run simultaneously on
the computer is higher than the maximum number of processes the computer is physically able to run
simultaneously. Typically, the number of processes a system may run simultaneously is equal to the number of CPUs
installed (however this may not be the case if the processors support simultaneous multithreading).
In a pre-emptive multitasking system, the kernel will give every program a slice of time and switch from process to
process so quickly that it will appear to the user as if these processes were being executed simultaneously. The
kernel uses scheduling algorithms to determine which process is running next and how much time it will be given.
The algorithm chosen may allow for some processes to have higher priority than others. The kernel generally also
provides these processes a way to communicate; this is known as inter-process communication (IPC) and the main
approaches are shared memory, message passing and remote procedure calls (see concurrent computing).
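To make the time-slice idea concrete, the following toy simulation hands each runnable task a fixed quantum in strict rotation. The task names and tick counts are invented for the illustration; real schedulers add priorities, run queues and accounting:

#include <stdio.h>

enum { NTASKS = 3, QUANTUM = 2 };        /* invented sizes for the demo */

int main(void) {
    const char *name[NTASKS] = { "editor", "compiler", "player" };
    int remaining[NTASKS]    = { 3, 5, 2 };   /* ticks of CPU work left */
    int unfinished = NTASKS;

    /* Round-robin: visit tasks in order, give each at most one quantum. */
    for (int cur = 0; unfinished > 0; cur = (cur + 1) % NTASKS) {
        if (remaining[cur] == 0)
            continue;                         /* task already finished */
        int slice = remaining[cur] < QUANTUM ? remaining[cur] : QUANTUM;
        remaining[cur] -= slice;
        printf("run %-8s for %d tick(s)%s\n", name[cur], slice,
               remaining[cur] == 0 ? " [done]" : "");
        if (remaining[cur] == 0)
            unfinished--;
    }
    return 0;
}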
Other systems (particularly on smaller, less powerful computers) may provide co-operative multitasking, where each
process is allowed to run uninterrupted until it makes a special request that tells the kernel it may switch to another
process. Such requests are known as "yielding", and typically occur in response to requests for interprocess
communication, or for waiting for an event to occur. Older versions of Windows and Mac OS both used co-operative
multitasking but switched to pre-emptive schemes as the power of the computers to which they were targeted
grew.[6]
The operating system might also support multiprocessing (SMP or Non-Uniform Memory Access); in that case,
different programs and threads may run on different processors. A kernel for such a system must be designed to be
re-entrant, meaning that it may safely run two different parts of its code simultaneously. This typically means
providing synchronization mechanisms (such as spinlocks) to ensure that no two processors attempt to modify the
same data at the same time.
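The essence of a spinlock can be sketched in portable C11 with an atomic test-and-set flag; a CPU that fails to take the lock busy-waits until the holder releases it. This is only an illustration of the idea; a real kernel's spinlock also deals with preemption, interrupts and fairness:

#include <stdatomic.h>

typedef struct { atomic_flag held; } spinlock_t;
#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

static void spin_lock(spinlock_t *l) {
    /* Atomically set the flag; if it was already set, another CPU holds
       the lock, so keep spinning until it is released. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;
}

static void spin_unlock(spinlock_t *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Usage: bracket every access to the shared data, e.g.
   spin_lock(&lock); shared_counter++; spin_unlock(&lock); */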
Memory management
The kernel has full access to the system's memory and must allow processes to safely access this memory as they
require it. Often the first step in doing this is virtual addressing, usually achieved by paging and/or segmentation.
Virtual addressing allows the kernel to make a given physical address appear to be another address, the virtual
address. Virtual address spaces may be different for different processes; the memory that one process accesses at a
particular (virtual) address may be different memory from what another process accesses at the same address. This
allows every program to behave as if it is the only one (apart from the kernel) running and thus prevents applications
from crashing each other.[5]
On many systems, a program's virtual address may refer to data which is not currently in memory. The layer of
indirection provided by virtual addressing allows the operating system to use other data stores, like a hard drive, to
store what would otherwise have to remain in main memory (RAM). As a result, operating systems can allow
programs to use more memory than the system has physically available. When a program needs data which is not
currently in RAM, the CPU signals to the kernel that this has happened, and the kernel responds by writing the
contents of an inactive memory block to disk (if necessary) and replacing it with the data requested by the program.
The program can then be resumed from the point where it was stopped. This scheme is generally known as demand
paging.
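The decision the kernel makes on each access can be mimicked in a small user-space simulation: a few physical frames back a larger set of virtual pages, and touching a non-resident page triggers a "fault" that evicts a victim. All structures are invented for the toy model, and FIFO replacement is used purely for simplicity:

#include <stdbool.h>
#include <stdio.h>

enum { NFRAMES = 3, NPAGES = 8 };     /* toy sizes, invented for the demo */

int  frame_to_page[NFRAMES];          /* which virtual page a frame holds */
bool resident[NPAGES];                /* is the page in "RAM"?            */
int  next_victim = 0;                 /* FIFO replacement pointer         */

void touch(int page) {
    if (resident[page]) { printf("page %d: hit\n", page); return; }
    int old = frame_to_page[next_victim];
    if (old >= 0) {                   /* "write inactive block to disk"   */
        resident[old] = false;
        printf("page %d: fault, evicting page %d\n", page, old);
    } else {
        printf("page %d: fault, using a free frame\n", page);
    }
    frame_to_page[next_victim] = page;    /* "read requested page in"     */
    resident[page] = true;
    next_victim = (next_victim + 1) % NFRAMES;
}

int main(void) {
    for (int i = 0; i < NFRAMES; i++) frame_to_page[i] = -1;
    int refs[] = { 0, 1, 2, 0, 3, 4, 1 };   /* arbitrary reference string */
    for (unsigned i = 0; i < sizeof refs / sizeof refs[0]; i++)
        touch(refs[i]);
    return 0;
}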
Virtual addressing also allows creation of virtual partitions of memory in two disjoint areas, one being reserved for
the kernel (kernel space) and the other for the applications (user space). The applications are not permitted by the
processor to address kernel memory, thus preventing an application from damaging the running kernel. This
fundamental partition of memory space has contributed much to current designs of actual general-purpose kernels
and is almost universal in such systems, although some research kernels (e.g. Singularity) take other approaches.
Device management
To perform useful functions, processes need access to the peripherals connected to the computer, which are
controlled by the kernel through device drivers. A device driver is a computer program that enables the operating
system to interact with a hardware device. It provides the operating system with information on how to control and
communicate with a certain piece of hardware. The driver is a vital link between the operating system and the
applications that use a device. The design goal of a driver is abstraction; the function of the driver is to translate the
OS-mandated function calls (programming calls) into device-specific calls. In theory, a device should work correctly with a suitable driver.
Device drivers are used for such things as video cards, sound cards, printers, scanners, modems, and LAN cards. The
common levels of abstraction of device drivers are:
1. On the hardware side:
• Interfacing directly.
• Using a high-level interface (Video BIOS).
• Using a lower-level device driver (file drivers using disk drivers).
• Simulating work with hardware, while doing something entirely different.
2. On the software side:
• Allowing the operating system direct access to hardware resources.
• Implementing only primitives.
• Implementing an interface for non-driver software (example: TWAIN).
• Implementing a language, sometimes high-level (example: PostScript).
For example, to show the user something on the screen, an application would make a request to the kernel, which
would forward the request to its display driver, which is then responsible for actually plotting the character/pixel.[5]
A kernel must maintain a list of available devices. This list may be known in advance (e.g. on an embedded system
where the kernel will be rewritten if the available hardware changes), configured by the user (typical on older PCs
and on systems that are not designed for personal use) or detected by the operating system at run time (normally
called plug and play). In a plug and play system, a device manager first performs a scan on different hardware buses,
such as Peripheral Component Interconnect (PCI) or Universal Serial Bus (USB), to detect installed devices, then
searches for the appropriate drivers.
As device management is a very OS-specific topic, these drivers are handled differently by each kind of kernel
design, but in every case, the kernel has to provide the I/O to allow drivers to physically access their devices through
some port or memory location. Very important decisions have to be made when designing the device management
system, as in some designs accesses may involve context switches, making the operation very CPU-intensive and
easily causing a significant performance overhead.
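One common way kernels organize this is a table of function pointers that every driver fills in, so the rest of the kernel can talk to any device through the same entry points. The sketch below is loosely modelled on the Unix character-device idea; the type and function names are hypothetical:

#include <stddef.h>

/* Hypothetical driver interface: the kernel defines one table of entry
   points and each driver supplies its own implementations. */
struct device_ops {
    int  (*open) (void *dev);
    long (*read) (void *dev, void *buf, size_t len);
    long (*write)(void *dev, const void *buf, size_t len);
    void (*close)(void *dev);
};

/* Generic kernel-side helper: works for any device whose driver filled in
   the table, without knowing anything device-specific. */
static long kernel_write(const struct device_ops *ops, void *dev,
                         const void *buf, size_t len) {
    return (ops && ops->write) ? ops->write(dev, buf, len) : -1;
}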
System calls
A system call is a mechanism that is used by the application program to request a service from the operating system.
They use a machine-code instruction that causes the processor to change mode; an example would be a switch from
user mode to supervisor mode. This is where the operating system performs actions like accessing hardware
devices or the memory management unit. Generally the operating system provides a library that sits between the
operating system and normal programs. Usually it is a C library such as Glibc or Windows API. The library handles
the low-level details of passing information to the kernel and switching to supervisor mode. System calls include
close, open, read, wait and write.
To actually perform useful work, a process must be able to access the services provided by the kernel. This is
implemented differently by each kernel, but most provide a C library or an API, which in turn invokes the related
kernel functions.[7]
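On Linux with glibc, for instance, the division of labor is easy to see: write() is a thin library wrapper around the kernel's write system call, and the Linux-specific syscall() helper exposes the same trap with an explicit call number. A minimal sketch (assuming Linux and glibc):

#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>          /* SYS_write; Linux-specific */

int main(void) {
    /* Through the C library: the wrapper switches to kernel mode for us. */
    write(STDOUT_FILENO, "via libc\n", 9);

    /* The same request made explicitly with the raw call number. */
    syscall(SYS_write, STDOUT_FILENO, "via syscall()\n", 14);
    return 0;
}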
The method of invoking the kernel function varies from kernel to kernel. If memory isolation is in use, it is
impossible for a user process to call the kernel directly, because that would be a violation of the processor's access
control rules. A few possibilities are:
• Using a software-simulated interrupt. This method is available on most hardware, and is therefore very common (see the sketch after this list).
• Using a call gate. A call gate is a special address stored by the kernel in a list in kernel memory at a location
known to the processor. When the processor detects a call to that address, it instead redirects to the target location
without causing an access violation. This requires hardware support, but the hardware for it is quite common.
• Using a special system call instruction. This technique requires special hardware support, which common
architectures (notably, x86) may lack. System call instructions have been added to recent models of x86
processors, however, and some operating systems for PCs make use of them when available.
• Using a memory-based queue. An application that makes large numbers of requests but does not need to wait for
the result of each may add details of requests to an area of memory that the kernel periodically scans to find
requests.
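As a hedged illustration of the first method, the legacy software interrupt on 32-bit x86 Linux enters the kernel through the int $0x80 instruction, with the call number in EAX (4 is __NR_write on i386) and arguments in EBX, ECX and EDX. The fragment below is specific to that one ABI and is shown only to make the mechanism visible:

/* 32-bit x86 Linux only: invoke the write system call via int $0x80. */
long write_int80(int fd, const void *buf, unsigned long len) {
    long ret;
    __asm__ volatile ("int $0x80"          /* trap into the kernel     */
                      : "=a" (ret)         /* result comes back in EAX */
                      : "a" (4),           /* __NR_write on i386       */
                        "b" (fd), "c" (buf), "d" (len)
                      : "memory");
    return ret;
}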
Kernel design decisions
Issues of kernel support for protection
An important consideration in the design of a kernel is the support it provides for protection from faults (fault
tolerance) and from malicious behaviors (security). These two aspects are usually not clearly distinguished, and the
adoption of this distinction in the kernel design leads to the rejection of a hierarchical structure for protection.[1]
The mechanisms or policies provided by the kernel can be classified according to several criteria, including: static
(enforced at compile time) or dynamic (enforced at run time); pre-emptive or post-detection; according to the
protection principles they satisfy (i.e. Denning[8][9]); whether they are hardware supported or language based;
whether they are more an open mechanism or a binding policy; and many more.
Support for hierarchical protection domains[10] is typically implemented using CPU modes. An efficient and simple way to
provide hardware support of capabilities is to delegate to the MMU the responsibility of checking access rights for
every memory access, a mechanism called capability-based addressing.[11] Most commercial computer architectures
lack MMU support for capabilities. An alternative approach is to simulate capabilities using commonly supported
hierarchical domains; in this approach, each protected object must reside in an address space that the application
does not have access to; the kernel also maintains a list of capabilities in such memory. When an application needs to
access an object protected by a capability, it performs a system call and the kernel performs the access for it. The
performance cost of address space switching limits the practicality of this approach in systems with complex
interactions between objects, but it is used in current operating systems for objects that are not accessed frequently or
which are not expected to perform quickly.[12][13] Approaches where the protection mechanism is not firmware
supported but is instead simulated at higher levels (e.g. simulating capabilities by manipulating page tables on
hardware that does not have direct support) are possible, but there are performance implications.[14] Lack of
hardware support may not be an issue, however, for systems that choose to use language-based protection.[15]
An important kernel design decision is the choice of the abstraction levels where the security mechanisms and
policies should be implemented. Kernel security mechanisms play a critical role in supporting security at higher
levels.[11][16][17][18][19]
One approach is to use firmware and kernel support for fault tolerance (see above), and build the security policy for
malicious behavior on top of that (adding features such as cryptography mechanisms where necessary), delegating
some responsibility to the compiler. Approaches that delegate enforcement of security policy to the compiler and/or
the application level are often called language-based security.
The lack of many critical security mechanisms in current mainstream operating systems impedes the implementation
of adequate security policies at the application abstraction level.[16] In fact, a common misconception in computer
security is that any security policy can be implemented in an application regardless of kernel support.[16]
Hardware-based protection or language-based protection
Typical computer systems today use hardware-enforced rules about what programs are allowed to access what data.
The processor monitors the execution and stops a program that violates a rule (e.g., a user process that is about to
read or write to kernel memory, and so on). In systems that lack support for capabilities, processes are isolated from
each other by using separate address spaces.[20] Calls from user processes into the kernel are regulated by requiring
them to use one of the above-described system call methods.
An alternative approach is to use language-based protection. In a language-based protection system, the kernel will
only allow code to execute that has been produced by a trusted language compiler. The language may then be
designed such that it is impossible for the programmer to instruct it to do something that will violate a security
requirement.[15]
Advantages of this approach include:
• No need for separate address spaces. Switching between address spaces is a slow operation that causes a great
deal of overhead, and much optimization work in current operating systems aims to prevent unnecessary
switches. Switching is completely unnecessary in a language-based protection system, as all
code can safely operate in the same address space.
• Flexibility. Any protection scheme that can be designed to be expressed via a programming language can be
implemented using this method. Changes to the protection scheme (e.g. from a hierarchical system to a
capability-based one) do not require new hardware.
Disadvantages include:
• Longer application start up time. Applications must be verified when they are started to ensure they have been
compiled by the correct compiler, or may need recompiling either from source code or from bytecode.
• Inflexible type systems. On traditional systems, applications frequently perform operations that are not type safe.
Such operations cannot be permitted in a language-based protection system, which means that applications may
need to be rewritten and may, in some cases, lose performance.
Examples of systems with language-based protection include JX and Microsoft's Singularity.
Process cooperation
Edsger Dijkstra proved that from a logical point of view, atomic lock and unlock operations operating on binary
semaphores are sufficient primitives to express any functionality of process cooperation.[21] However this approach
is generally held to be lacking in terms of safety and efficiency, whereas a message passing approach is more
flexible.[22] A number of other approaches (either lower- or higher-level) are available as well, with many modern
kernels providing support for systems such as shared memory and remote procedure calls.
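Message passing of this kind is visible even from user space. In the POSIX sketch below, two processes exchange a message through a pipe provided by the kernel; the "client" and "server" labels are purely illustrative:

#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int fds[2];
    if (pipe(fds) != 0)                    /* kernel-provided channel */
        return 1;
    if (fork() == 0) {                     /* child acts as the "server" */
        char buf[32] = { 0 };
        read(fds[0], buf, sizeof buf - 1); /* block until a message arrives */
        printf("server received: %s\n", buf);
        _exit(0);
    }
    const char *msg = "hello";             /* parent acts as the "client" */
    write(fds[1], msg, strlen(msg));
    wait(NULL);
    return 0;
}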
I/O devices management
The idea of a kernel where I/O devices are handled uniformly with other processes, as parallel co-operating
processes, was first proposed and implemented by Brinch Hansen (although similar ideas were suggested in
1967[23][24]). In Hansen's description of this, the "common" processes are called internal processes, while the I/O
devices are called external processes.[22]
Similar to physical memory, allowing applications direct access to controller ports and registers can cause the
controller to malfunction or the system to crash. Moreover, depending on the complexity of the device, some devices
can get surprisingly complex to program, and may use several different controllers. Because of this, providing a more
abstract interface to manage the device is important. This interface is normally provided by a device driver or hardware
abstraction layer. Frequently, applications will require access to these devices. The kernel must maintain the list of
these devices by querying the system for them in some way. This can be done through the BIOS, or through one of
the various system buses (such as PCI/PCIe or USB). When an application requests an operation on a device (such
as displaying a character), the kernel needs to send this request to the currently active video driver. The video driver, in
turn, needs to carry out this request. This is an example of inter-process communication (IPC).
Kernel-wide design approaches
Naturally, the above listed tasks and features can be provided in many ways that differ from each other in design and
implementation.
The principle of separation of mechanism and policy is the substantial difference between the philosophy of micro
and monolithic kernels.[25][26] Here a mechanism is the support that allows the implementation of many different
policies, while a policy is a particular "mode of operation". For instance, a mechanism may provide for user log-in
attempts to call an authorization server to determine whether access should be granted; a policy may be for the
authorization server to request a password and check it against an encrypted password stored in a database. Because
the mechanism is generic, the policy could more easily be changed (e.g. by requiring the use of a security token) than
if the mechanism and policy were integrated in the same module.
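The log-in example can be sketched with a function pointer: the mechanism (a generic hook) never changes, while the policy behind it can be swapped freely. All names below are invented for illustration:

#include <stdio.h>
#include <string.h>

/* Mechanism: a generic log-in hook that defers the decision to whatever
   policy has been installed. */
typedef int (*auth_policy_fn)(const char *user, const char *credential);

static int login(const char *user, const char *credential,
                 auth_policy_fn policy) {
    return policy(user, credential);       /* mechanism stays unchanged */
}

/* Policy: one concrete rule. Replacing it (say, with a security-token
   check) requires no change to the mechanism above. */
static int password_policy(const char *user, const char *credential) {
    (void)user;
    return strcmp(credential, "secret") == 0;   /* toy check only */
}

int main(void) {
    printf("access %s\n",
           login("alice", "secret", password_policy) ? "granted" : "denied");
    return 0;
}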
In a minimal microkernel just some very basic policies are included,[26] and its mechanisms allow what is running on
top of the kernel (the remaining part of the operating system and the other applications) to decide which policies to
adopt (such as memory management, high-level process scheduling, file system management, etc.).[1][22] A monolithic
kernel instead tends to include many policies, therefore restricting the rest of the system to rely on them.
Per Brinch Hansen presented arguments in favor of separation of mechanism and policy.[1][22] The failure to
properly fulfill this separation is one of the major causes of the lack of substantial innovation in existing operating
systems,[1] a problem common in computer architecture.[27][28][29] The monolithic design is induced by the "kernel
mode"/"user mode" architectural approach to protection (technically called hierarchical protection domains), which
is common in conventional commercial systems;[30] in fact, every module needing protection is therefore preferably
included into the kernel.[30] This link between monolithic design and "privileged mode" can be traced back to the
key issue of mechanism-policy separation;[1] in fact the "privileged mode" architectural approach conflates the
protection mechanism with the security policies, while the major alternative architectural approach, capability-based
addressing, clearly distinguishes between the two, leading naturally to a microkernel design[1] (see Separation of
protection and security).
While monolithic kernels execute all of their code in the same address space (kernel space) microkernels try to run
most of their services in user space, aiming to improve maintainability and modularity of the codebase.[2] Most
kernels do not fit exactly into one of these categories, but are rather found in between these two designs. These are
called hybrid kernels. More exotic designs such as nanokernels and exokernels are available, but are seldom used for
production systems. The Xen hypervisor, for example, is an exokernel.
Monolithic kernels
In a monolithic kernel, all OS services run along with the main kernel thread,
thus also residing in the same memory area. This approach provides rich and
powerful hardware access. Some developers, such as UNIX developer Ken
Thompson, maintain that it is "easier to implement a monolithic kernel"[31] than
microkernels. The main disadvantages of monolithic kernels are the
dependencies between system components (a bug in a device driver might
crash the entire system) and the fact that large kernels can become very
difficult to maintain.
[Figure: Diagram of a monolithic kernel]
Monolithic kernels, which have traditionally been used by Unix-like operating
systems, contain all the operating system core functions and the device drivers
(small programs that allow the operating system to interact with hardware
devices, such as disk drives, video cards and printers). This is the traditional
design of UNIX systems. A monolithic kernel is one single program that contains all of the code necessary to
perform every kernel related task. Every part which is to be accessed by most programs which cannot be put in a
library is in the kernel space: Device drivers, Scheduler, Memory handling, File systems, Network stacks. Many
system calls are provided to applications to allow them to access all those services. A monolithic kernel, while
initially loaded with subsystems that may not be needed, can be tuned to a point where it is as fast as or faster than
one specifically designed for the hardware, although it remains more general-purpose. Modern monolithic
kernels, such as those of Linux and FreeBSD, both of which fall into the category of Unix-like operating systems,
feature the ability to load modules at runtime, thereby allowing easy extension of the kernel's capabilities as required,
while helping to minimize the amount of code running in kernel space. In the monolithic kernel, some advantages
hinge on these points:
• Since there is less software involved it is faster.
• As it is one single piece of software it should be smaller both in source and compiled forms.
• Less code generally means fewer bugs which can translate to fewer security problems.
Most work in the monolithic kernel is done via system calls. These are interfaces, usually kept in a tabular structure,
that access some subsystem within the kernel, such as disk operations. Essentially, calls are made within programs
and a checked copy of the request is passed through the system call; the request thus does not have far to travel. The monolithic
Linux kernel can be made extremely small not only because of its ability to dynamically load modules but also
because of its ease of customization. In fact, there are some versions that are small enough to fit together with a large
number of utilities and other programs on a single floppy disk and still provide a fully functional operating system
(one of the most popular of which is muLinux). This ability to miniaturize its kernel has also led to a rapid growth in
the use of Linux in embedded systems.
These types of kernels consist of the core functions of the operating system and the device drivers, with the ability to
load modules at runtime. They provide rich and powerful abstractions of the underlying hardware: this
approach defines a high-level virtual interface over the hardware, with a set of system calls to implement
operating system services such as process management, concurrency and memory management in several modules
that run in supervisor mode. This design has several flaws and limitations:
• Coding in kernel can be challenging, in part because one cannot use common libraries (like a full-featured libc),
and because one needs to use a source-level debugger like gdb. Rebooting the computer is often required. This is
not just a problem of convenience to the developers. When debugging is harder, and as difficulties mount, it
becomes more likely that code will be "buggier".
• Bugs in one part of the kernel have strong side effects; since every function in the kernel has all the privileges, a
bug in one function can corrupt the data structures of another, totally unrelated part of the kernel, or of any running
program.
• Kernels often become very large and difficult to maintain.
• Even if the modules servicing these operations are separate from the whole, the code integration is tight and
difficult to do correctly.
• Since the modules run in the same address space, a bug can bring down the entire system.
• Monolithic kernels are not portable; therefore, they must be rewritten for each new architecture that the operating
system is to be used on.
Microkernels
Microkernel (also abbreviated μK or uK) is the term describing an approach to operating system design by which the
functionality of the system is moved out of the traditional "kernel", into a set of "servers" that communicate through
a "minimal" kernel, leaving as little as possible in "system space" and as much as possible in "user space". A
microkernel that is designed for a specific platform or device is only ever going to have what it needs to operate. The
microkernel approach consists of defining a simple abstraction over the hardware, with a set of primitives or system
calls to implement minimal OS services such as memory management, multitasking, and inter-process
communication. Other services, including those normally provided by the kernel, such as networking, are
implemented in user-space programs, referred to as servers. Microkernels are easier to
maintain than monolithic kernels, but the large number of system calls and context switches might slow down the
system because they typically generate more overhead than plain function calls.
[Figure: In the microkernel approach, the kernel itself only provides basic functionality that allows the execution of
servers, separate programs that assume former kernel functions, such as device drivers, GUI servers, etc.]
Only parts which really require being in a privileged mode are in kernel space: IPC (Inter-Process Communication),
Basic scheduler, or scheduling primitives, Basic memory handling, Basic I/O primitives. Many critical parts are now
running in user space: The complete scheduler, Memory handling, File systems, and Network stacks. Micro kernels
were invented as a reaction to traditional "monolithic" kernel design, whereby all system functionality was put in
one static program running in a special "system" mode of the processor. In the microkernel, only the most
fundamental tasks are performed, such as being able to access some (not necessarily all) of the hardware, managing
memory and coordinating message passing between the processes. Some systems that use microkernels are QNX and
the HURD. In the case of QNX and Hurd, user sessions can be entire snapshots of the system itself, or "views" as they are
referred to. The very essence of the microkernel architecture illustrates some of its advantages:
• Maintenance is generally easier.
• Patches can be tested in a separate instance, and then swapped in to take over a production instance.
• Rapid development time; new software can be tested without having to reboot the kernel.
• More persistence in general; if one instance goes haywire, it is often possible to substitute it with an operational
mirror.
Most micro kernels use a message passing system of some sort to handle requests from one server to another. The
message passing system generally operates on a port basis with the microkernel. As an example, if a request for more
memory is sent, a port is opened with the microkernel and the request sent through. Once within the microkernel, the
steps are similar to system calls. The rationale was that it would bring modularity to the system architecture, which
would entail a cleaner system that is easier to debug or dynamically modify, customizable to users' needs, and higher
performing. Microkernels are part of operating systems such as AIX, BeOS, Hurd, Mach, Mac OS X, MINIX, and QNX.
Although micro kernels are very small by themselves, in combination with all their required auxiliary code they are,
in fact, often larger than monolithic kernels. Advocates of monolithic kernels also point out that the two-tiered
structure of microkernel systems, in which most of the operating system does not interact directly with the hardware,
creates a not-insignificant cost in terms of system efficiency. These types of kernels normally provide only the
minimal services such as defining memory address spaces, Inter-process communication (IPC) and the process
management. The other functions such as running the hardware processes are not handled directly by micro kernels.
Proponents of microkernels point out that monolithic kernels have the disadvantage that an error in the kernel can
cause the entire system to crash. However, with a microkernel, if a kernel process crashes, it is still possible to
prevent a crash of the system as a whole by merely restarting the service that caused the error. Although this sounds
sensible, it is questionable how important it is in reality, because operating systems with monolithic kernels such as
Linux have become extremely stable and can run for years without crashing.
Other services provided by the kernel such as networking are implemented in user-space programs referred to as
servers. Servers allow the operating system to be modified by simply starting and stopping programs. For a machine
without networking support, for instance, the networking server is not started. The task of moving in and out of the
kernel to move data between the various applications and servers creates overhead which is detrimental to the
efficiency of micro kernels in comparison with monolithic kernels.
Disadvantages in the microkernel exist however. Some are:
• Larger running memory footprint
• More software for interfacing is required, there is a potential for performance loss.
• Messaging bugs can be harder to fix due to the longer trip messages take, versus the one-off copy in a
monolithic kernel.
• Process management in general can be very complicated.
• The disadvantages for micro kernels are extremely context based. As an example, they work well for small single
purpose (and critical) systems because if not many processes need to run, then the complications of process
management are effectively mitigated.
A microkernel allows the implementation of the remaining part of the operating system as a normal application
program written in a high-level language, and the use of different operating systems on top of the same unchanged
kernel.[22] It is also possible to dynamically switch among operating systems and to have more than one active
simultaneously.[22]
Monolithic kernels vs. microkernels
As the computer kernel grows, a number of problems become evident. One of the most obvious is that the memory
footprint increases. This is mitigated to some degree by perfecting the virtual memory system, but not all computer
architectures have virtual memory support.[32] To reduce the kernel's footprint, extensive editing has to be performed
to carefully remove unneeded code, which can be very difficult with non-obvious interdependencies between parts of
a kernel with millions of lines of code.
By the early 1990s, due to the various shortcomings of monolithic kernels versus microkernels, monolithic kernels
were considered obsolete by virtually all operating system researchers. As a result, the design of Linux as a
monolithic kernel rather than a microkernel was the topic of a famous debate between Linus Torvalds and Andrew
Tanenbaum.[33] There is merit on both sides of the argument presented in the Tanenbaum–Torvalds debate.
Performance
Monolithic kernels are designed to have all of their code in the same address space (kernel space), which some
developers argue is necessary to increase the performance of the system.[34] Some developers also maintain that
monolithic systems are extremely efficient if well-written.[34] The monolithic model tends to be more efficient
through the use of shared kernel memory, rather than the slower IPC system of microkernel designs, which is
typically based on message passing.
The performance of microkernels constructed in the 1980s and early 1990s was
poor.[35][36] Studies that empirically measured the performance of these microkernels did not analyze the reasons for
such inefficiency.[35] The explanations of this data were left to "folklore", with the assumption that they were due to
the increased frequency of switches from "kernel-mode" to "user-mode",[35] to the increased frequency of
inter-process communication[35] and to the increased frequency of context switches.[35]
In fact, as conjectured in 1995, the reasons for the poor performance of microkernels might as well have been: (1) an
actual inefficiency of the whole microkernel approach, (2) the particular concepts implemented in those
microkernels, and (3) the particular implementation of those concepts.[35] It therefore remained to be studied whether the
way to build an efficient microkernel was, unlike previous attempts, to apply the correct construction
techniques.[35]
On the other hand, the hierarchical protection domains architecture that leads to the design of a monolithic kernel[30]
has a significant performance drawback each time there's an interaction between different levels of protection (i.e.
when a process has to manipulate a data structure both in 'user mode' and 'supervisor mode'), since this requires
message copying by value.[37]
By the mid-1990s, most researchers had abandoned the belief that careful tuning could reduce this overhead
dramatically, but recently, newer microkernels, optimized for performance, such as L4[38] and K42 have addressed
these problems.
Hybrid (or modular) kernels
[Figure: The hybrid kernel approach combines the speed and simpler design of a monolithic kernel with the
modularity and execution safety of a microkernel.]
Hybrid kernels are used in most commercial operating systems such as Microsoft Windows NT, 2000, XP, Vista,
and 7. Apple Inc's own Mac OS X uses a hybrid kernel called XNU, which is based upon code from Carnegie
Mellon's Mach kernel and FreeBSD's monolithic kernel. They are similar to microkernels, except they include some
additional code in kernel space to increase performance. These kernels represent a compromise that was
implemented by some developers before it was demonstrated that pure microkernels can provide high performance.
These types of kernels are extensions of microkernels with some properties of monolithic kernels. Unlike monolithic kernels, these
types of kernels are unable to load modules at runtime on their own. Hybrid kernels are micro kernels that have some
"non-essential" code in kernel-space in order for the code to run more quickly than it would were it to be in
user-space. Hybrid kernels are a compromise between the monolithic and microkernel designs. This implies running
some services (such as the network stack or the filesystem) in kernel space to reduce the performance overhead of a
traditional microkernel, but still running kernel code (such as device drivers) as servers in user space.
Many traditionally monolithic kernels are now at least adding (if not actively exploiting) the module capability. The
most well known of these kernels is the Linux kernel. The modular kernel essentially can have parts of it that are
built into the core kernel binary or binaries that load into memory on demand. It is important to note that a
code-tainted module has the potential to destabilize a running kernel. Many people become confused on this point when
discussing micro kernels. It is possible to write a driver for a microkernel in a completely separate memory space
and test it before "going live". When a kernel module is loaded, it accesses the monolithic portion's memory space by
adding to it what it needs, therefore opening the doorway to possible pollution. A few advantages of the modular
(or hybrid) kernel are:
• Faster development time for drivers that can operate from within modules. No reboot required for testing
(provided the kernel is not destabilized).
• On demand capability versus spending time recompiling a whole kernel for things like new drivers or subsystems.
• Faster integration of third party technology (related to development but pertinent unto itself nonetheless).
Modules, generally, communicate with the kernel using a module interface of some sort (a sketch of one follows the
list below). The interface is generalized (although particular to a given operating system), so it is not always possible
to use modules. Often the device drivers may need more flexibility than the module interface affords. Essentially,
crossing the interface amounts to two system calls, and often the safety checks that only have to be done once in the
monolithic kernel may now be done twice. Some of the disadvantages of the modular approach are:
• With more interfaces to pass through, the possibility of increased bugs exists (which implies more security holes).
• Maintaining modules can be confusing for some administrators when dealing with problems like symbol
differences.
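As mentioned before the list, a module interface can be pictured as an init/exit pair that the kernel calls at load and unload time, through which the module registers and unregisters its services. The sketch below echoes the pattern used by module-loading kernels but is hypothetical, not any real kernel's ABI:

/* Hypothetical loadable-module interface. */
struct kmodule {
    const char *name;
    int  (*init)(void);   /* called at load; nonzero means "refuse to load" */
    void (*exit)(void);   /* called at unload; must undo what init did      */
};

static int  hello_init(void) { /* register devices, hooks, ... */ return 0; }
static void hello_exit(void) { /* unregister them again */ }

struct kmodule hello_module = { "hello", hello_init, hello_exit };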
Nanokernels
A nanokernel delegates virtually all services, including even the most basic ones like interrupt controllers or the
timer, to device drivers to make the kernel memory requirement even smaller than that of a traditional microkernel.[39]
Exokernels
Exokernels are a still experimental approach to operating system design. They differ from the other types of kernels
in that their functionality is limited to the protection and multiplexing of the raw hardware, providing no hardware
abstractions on top of which to develop applications. This separation of hardware protection from hardware
management enables application developers to determine how to make the most efficient use of the available
hardware for each specific program.
Exokernels in themselves are extremely small. However, they are accompanied by library operating systems,
providing application developers with the functionalities of a conventional operating system. A major advantage of
exokernel-based systems is that they can incorporate multiple library operating systems, each exporting a different
API, for example one for high level UI development and one for real-time control.
History of kernel development
Early operating system kernels
Strictly speaking, an operating system (and thus, a kernel) is not required to run a computer. Programs can be
directly loaded and executed on the "bare metal" machine, provided that the authors of those programs are willing to
work without any hardware abstraction or operating system support. Most early computers operated this way during
the 1950s and early 1960s; they were reset and reloaded between the execution of different programs. Eventually,
small ancillary programs such as program loaders and debuggers were left in memory between runs, or loaded from
ROM. As these were developed, they formed the basis of what became early operating system kernels. The "bare
metal" approach is still used today on some video game consoles and embedded systems,[40] but in general, newer
computers use modern operating systems and kernels.
In 1969 the RC 4000 Multiprogramming System introduced the system design philosophy of a small nucleus "upon
which operating systems for different purposes could be built in an orderly manner",[41] in what would later be called the
microkernel approach.
Time-sharing operating systems
In the decade preceding Unix, computers had grown enormously in power, to the point where computer operators
were looking for new ways to get people to use the spare time on their machines. One of the major developments
during this era was time-sharing, whereby a number of users would get small slices of computer time, at a rate at
which it appeared they were each connected to their own, slower, machine.[42]
The development of time-sharing systems led to a number of problems. One was that users, particularly at
universities where the systems were being developed, seemed to want to hack the system to get more CPU time. For
this reason, security and access control became a major focus of the Multics project in 1965.[43] Another ongoing
issue was properly handling computing resources: users spent most of their time staring at the screen and thinking
instead of actually using the resources of the computer, and a time-sharing system should give the CPU time to an
active user during these periods. Finally, the systems typically offered a memory hierarchy several layers deep, and
partitioning this expensive resource led to major developments in virtual memory systems.
Amiga
The Commodore Amiga was released in 1985, and was among the first (and certainly the most successful) home
computers to feature a hybrid architecture. The Amiga's kernel executive component, exec.library, uses a microkernel
message-passing design, but there are other kernel components, like graphics.library, that have direct access to the
hardware. There is no memory protection, and the kernel is almost always running in user mode. Only special
actions are executed in kernel mode, and user-mode applications can ask the operating system to execute their code in
kernel mode.
Unix
During the design phase of Unix, programmers decided to
model every high-level device as a file, because they believed
the purpose of computation was data transformation.[44]
For instance, printers were represented as a "file" at a known
location: when data was copied to the file, it printed out.
Other systems, to provide a similar functionality, tended to
virtualize devices at a lower level; that is, both devices and
files would be instances of some lower level concept.
Virtualizing the system at the file level allowed users to
manipulate the entire system using their existing file
management utilities and concepts, dramatically simplifying
operation. As an extension of the same paradigm, Unix
allows programmers to manipulate files using a series of
small programs, using the concept of pipes, which allowed
users to complete operations in stages, feeding a file through
a chain of single-purpose tools. Although the end result was
the same, using smaller programs in this way dramatically
increased flexibility as well as ease of development and use,
allowing the user to modify their workflow by adding or
removing a program from the chain.
[Figure: A diagram of the predecessor/successor family relationship for Unix-like systems.]
In the Unix model, the Operating System consists of two parts: first, the huge collection of utility programs that
drive most operations; the other, the kernel that runs the
programs.[44] Under Unix, from a programming standpoint,
the distinction between the two is fairly thin; the kernel is a
program, running in supervisor mode,[45] that acts as a program loader and supervisor for the small utility programs
making up the rest of the system, and that provides locking and I/O services for these programs; beyond that, the kernel
didn't intervene at all in user space.
Over the years the computing model changed, and Unix's treatment of everything as a file or byte stream no longer
was as universally applicable as it was before. Although a terminal could be treated as a file or a byte stream, which
is printed to or read from, the same did not seem to be true for a graphical user interface. Networking posed another
problem. Even if network communication can be compared to file access, the low-level packet-oriented architecture
dealt with discrete chunks of data and not with whole files. As the capability of computers grew, Unix became
increasingly cluttered with code; this was possible partly because the modularity of the Unix kernel is extensively scalable.[46]
While kernels might have had 100,000 lines of code in the seventies and eighties, kernels of modern Unix successors
like Linux have more than 13 million lines.[47]
Modern Unix-derivatives are generally based on module-loading monolithic kernels. Examples of this are the Linux
kernel in its many distributions as well as the Berkeley software distribution variant kernels such as FreeBSD,
DragonflyBSD, OpenBSD, NetBSD, and Mac OS X. Apart from these alternatives, amateur developers maintain an
active operating system development community, populated by self-written hobby kernels which mostly end up
sharing many features with Linux, FreeBSD, DragonflyBSD, OpenBSD or NetBSD kernels and/or being compatible
with them.[48]
Mac OS
Apple Computer first launched Mac OS in 1984, bundled with its Apple Macintosh personal computer. Apple moved
to a nanokernel design in Mac OS 8.6. In contrast, Mac OS X is based on Darwin, which uses a hybrid kernel
called XNU, which was created combining the 4.3BSD kernel and the Mach kernel.[49]
Microsoft Windows
Microsoft Windows was first released in 1985 as an add-on to MS-DOS. Because of its dependence on another
operating system, initial releases of Windows, prior to Windows 95, were considered an operating environment (not
to be confused with an operating system). This product line continued to evolve through the 1980s and 1990s,
culminating with the release of the Windows 9x series (upgrading the system's capabilities to 32-bit addressing and
pre-emptive multitasking) through the mid 1990s and ending with the release of Windows Me in 2000. Microsoft
also developed Windows NT, an operating system intended for high-end and business users. This line started with
the release of Windows NT 3.1 in 1993, and has continued into the 2000s with Windows 7 and Windows
Server 2008.
The release of Windows XP in October 2001 brought the NT kernel version of Windows to general users, replacing
Windows 9x with a completely different operating system. The architecture of Windows NT's kernel is considered a
hybrid kernel because the kernel itself contains tasks such as the Window Manager and the IPC Managers, with a
client/server layered subsystem model.[50]
Development of microkernels
Although Mach, developed at Carnegie Mellon University from 1985 to 1994, is the best-known general-purpose
microkernel, other microkernels have been developed with more specific aims. The L4 microkernel family (mainly
the L3 and the L4 kernel) was created to demonstrate that microkernels are not necessarily slow.[38] Newer
implementations such as Fiasco and Pistachio are able to run Linux next to other L4 processes in separate address
spaces.[51][52]
QNX is a real-time operating system with a minimalistic microkernel design that has been developed since 1982,
having been far more successful than Mach in achieving the goals of the microkernel paradigm.[53] It is principally
used in embedded systems and in situations where software is not allowed to fail, such as the robotic arms on the
space shuttle and machines that control grinding of glass to extremely fine tolerances, where a tiny mistake may cost
hundreds of thousands of dollars.
Notes
[1]
[2]
[3]
[4]
Wulf 74 pp.337•345
Roch 2004
Levy 1984, p.5
Needham, R.M., Wilkes, M. V. Domains of protection and the management of processes (http:/ / comjnl. oxfordjournals. org/ cgi/ content/
abstract/ 17/ 2/ 117), Computer Journal, vol. 17, no. 2, May 1974, pp 117•120.
[5] Silberschatz 1990
[6] http://www.answers.com/topic/operating-system
[7] Tanenbaum, Andrew S. (2008). Modern Operating Systems (3rd ed.). Prentice Hall. pp. 50–51. ISBN 0-13-600663-9. "... nearly all system calls [are] invoked from C programs by calling a library procedure ... The library procedure ... executes a TRAP instruction to switch from user mode to kernel mode and start execution ..."
[8] Denning 1976
[9] Swift 2005, p. 29 quote: "isolation, resource control, decision verification (checking), and error recovery."
[10] Schroeder 72
[11] Linden 76
[12] Stephane Eranian and David Mosberger, Virtual Memory in the IA-64 Linux Kernel (http://www.informit.com/articles/article.aspx?p=29961), Prentice Hall PTR, 2002
[13] Silberschatz & Galvin, Operating System Concepts, 4th ed., pp. 445–446
[14] Hoch, Charles; J. C. Browne (University of Texas, Austin) (July 1980). "An implementation of capabilities on the PDP-11/45" (http://portal.acm.org/citation.cfm?id=850701) (PDF). ACM SIGOPS Operating Systems Review 14 (3): 22–32. doi:10.1145/850697.850701. Retrieved 2007-01-07.
[15] A Language-Based Approach to Security (http://www.cs.cmu.edu/~rwh/papers/langsec/dagstuhl.pdf), Schneider F., Morrissett G. (Cornell University) and Harper R. (Carnegie Mellon University)
[16] P. A. Loscocco, S. D. Smalley, P. A. Muckelbauer, R. C. Taylor, S. J. Turner, and J. F. Farrell. The Inevitability of Failure: The Flawed Assumption of Security in Modern Computing Environments (http://www.jya.com/paperF1.htm). In Proceedings of the 21st National Information Systems Security Conference, pages 303–314, Oct. 1998. (http://csrc.nist.gov/nissc/1998/proceedings/paperF1.pdf).
[17] J. Lepreau et al. The Persistent Relevance of the Local Operating System to Global Applications (http://doi.acm.org/10.1145/504450.504477). Proceedings of the 7th ACM SIGOPS European Workshop. Also: Information Security: An Integrated Collection of Essays, IEEE Comp. 1995.
[18] J. Anderson, Computer Security Technology Planning Study (http://csrc.nist.gov/publications/history/ande72.pdf), Air Force Elect. Systems Div., ESD-TR-73-51, October 1972.
[19] Jerry H. Saltzer, Mike D. Schroeder (September 1975). "The protection of information in computer systems" (http://web.mit.edu/Saltzer/www/publications/protection/). Proceedings of the IEEE 63 (9): 1278–1308. doi:10.1109/PROC.1975.9939.
[20] Jonathan S. Shapiro; Jonathan M. Smith; David J. Farber (1999). "EROS: a fast capability system" (http://portal.acm.org/citation.cfm?doid=319151.319163). Proceedings of the seventeenth ACM symposium on Operating systems principles 33 (5): 170–185. doi:10.1145/319344.319163.
[21] Dijkstra, E. W. Cooperating Sequential Processes. Math. Dep., Technological U., Eindhoven, Sept. 1965.
[22] Brinch Hansen 70 pp. 238–241
[23] "SHARER, a time sharing system for the CDC 6600" (http://portal.acm.org/citation.cfm?id=363778). Retrieved 2007-01-07.
[24] "Dynamic Supervisors – their design and construction" (http://portal.acm.org/citation.cfm?id=811675). Retrieved 2007-01-07.
[25] Baiardi 1988
[26] Levin 75
[27] Denning 1980
[28] Jürgen Nehmer, The Immortality of Operating Systems, or: Is Research in Operating Systems still Justified? (http://portal.acm.org/citation.cfm?id=723612). Lecture Notes in Computer Science, Vol. 563. Proceedings of the International Workshop on Operating Systems of the 90s and Beyond. pp. 77–83 (1991). ISBN 3-540-54987-0 (http://www.sigmod.org/dblp/db/conf/dagstuhl/os1991.html). Quote: "The past 25 years have shown that research on operating system architecture had a minor effect on existing main stream systems." (http://www.soe.ucsc.edu/~brucem/soft_ins/dissert.html)
[29] Levy 84, p. 1 quote: "Although the complexity of computer applications increases yearly, the underlying hardware architecture for applications has remained unchanged for decades."
[30] Levy 84, p. 1 quote: "Conventional architectures support a single privileged mode of operation. This structure leads to monolithic design; any module needing protection must be part of the single operating system kernel. If, instead, any module could execute within a protected domain, systems could be built as a collection of independent modules extensible by any user."
[31] Open Sources: Voices from the Open Source Revolution (http://oreilly.com/catalog/opensources/book/appa.html)
[32] Virtual addressing is most commonly achieved through a built-in memory management unit.
[33] Recordings of the debate between Torvalds and Tanenbaum can be found at dina.dk (http://www.dina.dk/~abraham/Linus_vs_Tanenbaum.html), groups.google.com (http://groups.google.com/group/comp.os.minix/browse_thread/thread/c25870d7a41696d2/f447530d082cd95d?tvc=2#f447530d082cd95d), oreilly.com (http://www.oreilly.com/catalog/opensources/book/appa.html) and Andrew Tanenbaum's website (http://www.cs.vu.nl/~ast/reliable-os/)
[34] Matthew Russell. "What Is Darwin (and How It Powers Mac OS X)" (http://oreilly.com/pub/a/mac/2005/09/27/what-is-darwin.html?page=2). O'Reilly Media. Quote: "The tightly coupled nature of a monolithic kernel allows it to make very efficient use of the underlying hardware [...] Microkernels, on the other hand, run a lot more of the core processes in userland. [...] Unfortunately, these benefits come at the cost of the microkernel having to pass a lot of information in and out of the kernel space through a process known as a context switch. Context switches introduce considerable overhead and therefore result in a performance penalty."
[35] Liedtke 95
[36] Härtig 97
[37] Hansen 73, section 7.3 p. 233 "interactions between different levels of protection require transmission of messages by value"
[38] The L4 microkernel family – Overview (http://os.inf.tu-dresden.de/L4/overview.html)
[39] KeyKOS Nanokernel Architecture (http://www.cis.upenn.edu/~KeyKOS/NanoKernel/NanoKernel.html)
[40] Ball: Embedded Microprocessor Designs, p. 129
[41] Hansen 2001 (os), pp. 17–18
[42] BSTJ version of C.ACM Unix paper (http://cm.bell-labs.com/cm/cs/who/dmr/cacm.html)
[43] Introduction and Overview of the Multics System (http://www.multicians.org/fjcc1.html), by F. J. Corbató and V. A. Vissotsky.
[44] The UNIX System – The Single Unix Specification (http://www.unix.org/what_is_unix/single_unix_specification.html)
[45] The highest privilege level has various names throughout different architectures, such as supervisor mode, kernel mode, CPL0, DPL0, Ring 0, etc. See Ring (computer security) for more information.
[46] Unix's Revenge by Horace Dediu (http://www.asymco.com/2010/09/29/unixs-revenge/)
[47] Linux Kernel 2.6: It's Worth More! (http://www.dwheeler.com/essays/linux-kernel-cost.html), by David A. Wheeler, October 12, 2004
[48] This community mostly gathers at Bona Fide OS Development (http://www.osdever.net), The Mega-Tokyo Message Board (http://www.mega-tokyo.com/forum) and other operating system enthusiast web sites.
[49] XNU: The Kernel (http://www.kernelthread.com/mac/osx/arch_xnu.html)
[50] Windows History: Windows Desktop Products History (http://www.microsoft.com/windows/WinHistoryDesktop.mspx)
[51] The Fiasco microkernel – Overview (http://os.inf.tu-dresden.de/fiasco/overview.html)
[52] L4Ka – The L4 microkernel family and friends (http://www.l4ka.org)
[53] QNX Realtime Operating System Overview (http://www.qnx.com/products/rtos/microkernel.html)
References
• Roch, Benjamin (2004). "Monolithic kernel vs. Microkernel" (http://www.vmars.tuwien.ac.at/courses/akti12/journal/04ss/article_04ss_Roch.pdf) (PDF). Retrieved 2006-10-12.
• Silberschatz, Abraham; James L. Peterson, Peter B. Galvin (1991). Operating system concepts (http://portal.acm.org/citation.cfm?id=95329). Boston, Massachusetts: Addison-Wesley. p. 696. ISBN 0-201-51379-X.
• Ball, Stuart R. (2002) [2002]. Embedded Microprocessor Systems: Real World Designs (first ed.). Elsevier Science. ISBN 0-7506-7534-9.
• Deitel, Harvey M. (1984) [1982]. An introduction to operating systems (http://portal.acm.org/citation.cfm?id=79046) (revisited first ed.). Addison-Wesley. p. 673. ISBN 0-201-14502-2.
• Denning, Peter J. (December 1976). "Fault tolerant operating systems" (http://portal.acm.org/citation.cfm?id=356680). ACM Computing Surveys 8 (4): 359–389. doi:10.1145/356678.356680. ISSN 0360-0300.
• Denning, Peter J. (April 1980). "Why not innovations in computer architecture?" (http://portal.acm.org/citation.cfm?id=859506). ACM SIGARCH Computer Architecture News 8 (2): 4–7. doi:10.1145/859504.859506. ISSN 0163-5964.
• Hansen, Per Brinch (April 1970). "The nucleus of a Multiprogramming System" (http://portal.acm.org/citation.cfm?id=362278). Communications of the ACM 13 (4): 238–241. doi:10.1145/362258.362278. ISSN 0001-0782.
• Hansen, Per Brinch (1973). Operating System Principles (http://portal.acm.org/citation.cfm?id=540365). Englewood Cliffs: Prentice Hall. p. 496. ISBN 0-13-637843-9.
• Hansen, Per Brinch (2001) (PDF). The evolution of operating systems (http://brinch-hansen.net/papers/2001b.pdf). Retrieved 2006-10-24. Included in: Per Brinch Hansen, ed. (2001). "1" (http://brinch-hansen.net/papers/2001b.pdf). Classic operating systems: from batch processing to distributed systems (http://portal.acm.org/citation.cfm?id=360596). New York: Springer-Verlag. pp. 1–36. ISBN 0-387-95113-X.
• Härtig, Hermann; Michael Hohmuth, Jochen Liedtke, Sebastian Schönberg, Jean Wolter. "The performance of µ-kernel-based systems" (http://os.inf.tu-dresden.de/pubs/sosp97/) (http://doi.acm.org/10.1145/268998.266660). ACM SIGOPS Operating Systems Review, v. 31 n. 5, pp. 66–77, Dec. 1997. Retrieved 2010-06-19.
• Houdek, M. E., Soltis, F. G., and Hoffman, R. L. 1981. IBM System/38 support for capability-based addressing (http://portal.acm.org/citation.cfm?id=800052.801885). In Proceedings of the 8th ACM International Symposium on Computer Architecture. ACM/IEEE, pp. 341–348.
• Intel Corporation (2002). The IA-32 Architecture Software Developer's Manual, Volume 1: Basic Architecture (http://www.intel.com/design/pentium4/manuals/24547010.pdf)
• Levin, R.; E. Cohen, W. Corwin, F. Pollack, William Wulf (1975). "Policy/mechanism separation in Hydra" (http://portal.acm.org/citation.cfm?id=806531). ACM Symposium on Operating Systems Principles / Proceedings of the fifth ACM symposium on Operating systems principles 9 (5): 132–140. doi:10.1145/1067629.806531.
• Levy, Henry M. (1984). Capability-based computer systems (http://www.cs.washington.edu/homes/levy/capabook/index.html). Maynard, Mass: Digital Press. ISBN 0-932376-22-3.
• Liedtke, Jochen. On µ-Kernel Construction (http://i30www.ira.uka.de/research/publications/papers/index.php?lid=en&docid=642), Proc. 15th ACM Symposium on Operating System Principles (SOSP), December 1995.
• Linden, Theodore A. (December 1976). "Operating System Structures to Support Security and Reliable Software" (http://portal.acm.org/citation.cfm?id=356682). ACM Computing Surveys 8 (4): 409–445. doi:10.1145/356678.356682. ISSN 0360-0300. Also available as PDF: http://csrc.nist.gov/publications/history/lind76.pdf. Retrieved 2010-06-19.
• Lorin, Harold (1981). Operating systems (http://portal.acm.org/citation.cfm?id=578308). Boston, Massachusetts: Addison-Wesley. pp. 161–186. ISBN 0-201-14464-6.
• Schroeder, Michael D.; Jerome H. Saltzer (March 1972). "A hardware architecture for implementing protection rings" (http://portal.acm.org/citation.cfm?id=361275). Communications of the ACM 15 (3): 157–170. doi:10.1145/361268.361275. ISSN 0001-0782.
• Shaw, Alan C. (1974). The logical design of Operating systems (http://portal.acm.org/citation.cfm?id=540329). Prentice-Hall. p. 304. ISBN 0-13-540112-7.
• Tanenbaum, Andrew S. (1979). Structured Computer Organization. Englewood Cliffs, New Jersey: Prentice-Hall. ISBN 0-13-148521-0.
• Wulf, W.; E. Cohen, W. Corwin, A. Jones, R. Levin, C. Pierson, F. Pollack (June 1974). "HYDRA: the kernel of a multiprocessor operating system" (http://www.cs.virginia.edu/papers/p337-wulf.pdf). Communications of the ACM 17 (6): 337–345. doi:10.1145/355616.364017. ISSN 0001-0782.
• Baiardi, F.; A. Tomasi, M. Vanneschi (http://www.di.unipi.it/~vannesch/) (1988) (in Italian). Architettura dei Sistemi di Elaborazione, volume 1 (http://www.pangloss.it/libro.php?isbn=882042746X). Franco Angeli. ISBN 88-204-2746-X.
• Swift, Michael M.; Brian N. Bershad, Henry M. Levy. "Improving the reliability of commodity operating systems" (http://nooks.cs.washington.edu/nooks-tocs.pdf) (http://doi.acm.org/10.1145/1047915.1047919). ACM Transactions on Computer Systems (TOCS), v. 23 n. 1, pp. 77–110, February 2005. doi:10.1002/spe.4380201404. Retrieved 2010-06-19.
Further reading
• Andrew Tanenbaum, Operating Systems – Design and Implementation (Third edition);
• Andrew Tanenbaum, Modern Operating Systems (Second edition);
• Daniel P. Bovet, Marco Cesati, The Linux Kernel;
• David A. Peterson, Nitin Indurkhya, Patterson, Computer Organization and Design, Morgan Koffman (ISBN 1-55860-428-6);
• B.S. Chalk, Computer Organisation and Architecture, Macmillan P. (ISBN 0-333-64551-0).
External links
• Detailed comparison between most popular operating system kernels (http://widefox.pbwiki.com/Kernel
Comparison Linux vs Windows)
Booting
In computing, booting (also known as booting up) is the initial set of operations that a computer system performs
when electrical power to the CPU is switched on. The process begins when a computer is turned on for the first time
or is re-energized after being turned off, and ends when the computer is ready to perform its normal operations. On
modern general purpose computers, this can take tens of seconds and typically involves performing a power-on
self-test, locating and initializing peripheral devices, and then finding, loading and starting an operating system.
Many computer systems also allow these operations to be initiated by a software command without cycling power, in
what is known as a soft reboot, though some of the initial operations might be skipped on a soft reboot. A boot
loader is a computer program that loads the main operating system or runtime environment for the computer after
completion of the self-tests.
The computer term boot is short for bootstrap[1][2] or bootstrap load and derives from the phrase to pull oneself up
by one's bootstraps.[3] The usage calls attention to the requirement that, if most software is loaded onto a computer
by other software already running on the computer, some mechanism must exist to load initial software onto the
computer.[4] Early computers used a variety of ad-hoc methods to get a small program into memory to solve this
problem. The invention of integrated circuit read-only memory (ROM) of various types solved this paradox by
allowing computers to be shipped with a start up program that could not be erased. Growth in the capacity of ROM
has allowed ever more elaborate start up procedures to be implemented.
On general purpose computers, the boot process begins with the execution of an initial program stored in boot ROMs
or read in another fashion. In some older computers, the initial program might have been the application to run, if no
operating system was used, or the operating system. In other computers, the initial program is a boot loader that loads into random-access memory (RAM), from nonvolatile secondary storage (such as a hard disk drive) or, in some older computers, from a medium such as punched cards, punched tape, or magnetic tape, the binary code of an operating system or runtime environment, and then executes it. If the boot loader is limited in its
size and capabilities, it may, instead, load a larger and more capable secondary boot loader, which would then load
the operating system or runtime environment. Some embedded systems do not require a noticeable boot sequence to
begin functioning and when turned on may simply run operational programs that are stored in ROM.
History
There are many different methods available to load a short initial
program into a computer. These methods range from simple physical
input to removable media that can hold more complex programs.
Pre integrated-circuit-ROM examples
[Image: Switches and cables used to program ENIAC (1946)]
Early computers
Early computers in the 1940s and 1950s were one-of-a-kind engineering efforts that could take weeks to program, and program loading was one of many problems that had to be solved. An early computer, ENIAC, as initially built, was not programmable in the modern sense; its interconnections were made via cables to configure the hardware for the current problem, so bootstrapping as understood for stored-program computers simply did not apply. In 1960, the
Ballistic Missile Early Warning System Display Information Processor (DIP) in Colorado Springs (before Cheyenne
Mountain) ran only one program, which carried its own startup code. The program was stored as a bit image on a
continuously running magnetic drum, and loaded in a fraction of a second. Core memory was probably cleared
manually via the maintenance console, and startup from when power was fully up was very fast, only a few seconds.
In its general design, the DIP compared roughly with a DEC PDP-8.
First commercial computers
The first programmable computers for commercial sale, such as the UNIVAC I and the IBM 701,[5] included features
to make their operation simpler. They typically included instructions that performed a complete input or output
operation. The same hardware logic could be used to load the contents of a punch card or other input media that
contained a bootstrap program by pressing a single button. This booting concept was called "Initial Program Load"
for IBM computers of the 1950s, and the term is still in use in IBM's z/Architecture mainframes.
The IBM 701 computer (1952•1956) had a "Load" button that initiated
reading of the first 36-bit word into main memory from a punched card
in a card reader, a magnetic tape in a tape drive, or a magnetic drum
unit, depending on the position of the Load Selector switch. The left
18-bit half-word was then executed as an instruction, which usually
read additional words into memory.[6][7] The loaded boot program was
then executed, which, in turn, loaded a larger program from that
medium into memory without further help from the human operator.
The term "boot" has been used in this sense since at least 1958.[8]
Other IBM computers of that era had similar features. For example, the
IBM 1401 system (c. 1958) used a card reader to load a program from
a punched card. The 80 characters stored in the punched card were read
into memory locations 001 to 080, then the computer would branch to
memory location 001 to read its first stored instruction. This instruction
was always the same: move the information in these first 80 memory
locations to an assembly area where the information in punched cards
2, 3, 4, and so on, could be combined to form the stored program. Once
this information was moved to the assembly area, the machine would
branch to an instruction in location 080 (read a card) and the next card
would be read and its information processed.
[Image: Initial program load punched card for the IBM 1130 (1965)]
[Image: IBM System/3 console from the 1970s. Program load selector switch is lower left; Program load switch is lower right.]
Another example was the IBM 650 (1953), a decimal machine, which had a group of ten 10-position switches on its
operator panel which were addressable as a memory word (address 8000) and could be executed as an instruction.
Thus setting the switches to 7004000400 and pressing the appropriate button would read the first card in the card
reader into memory (op code 70), starting at address 400 and then jump to 400 to begin executing the program on
that card.[9]
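For illustration, the arithmetic behind that 650 console word can be unpacked in a few lines of C. This is only a sketch of the decoding, not period code; it follows the 650's two-digit op code, four-digit data address, four-digit next-instruction address layout described above.

#include <stdio.h>
#include <stdint.h>

/* Decode a 10-digit IBM 650 instruction word. The switches set to
 * 7004000400 mean: op 70 (read a card) into the buffer at data
 * address 0400, then continue execution at address 0400. */
int main(void) {
    uint64_t word = 7004000400ULL;                   /* value keyed on the switches */
    unsigned op        = (unsigned)(word / 100000000ULL);   /* leading 2 digits  */
    unsigned data_addr = (unsigned)((word / 10000ULL) % 10000); /* middle 4 digits */
    unsigned next_addr = (unsigned)(word % 10000);           /* trailing 4 digits */
    printf("op=%02u data=%04u next=%04u\n", op, data_addr, next_addr);
    return 0;
}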
IBM's competitors also offered single button program load.
• The CDC 6600 (c. 1964) had a dead start panel with 144 toggle switches; the dead start switch entered 12 words
from the toggle switches to the memory of peripheral processor (PP) 0 and initiated the load sequence. PP 0
loaded the necessary code into its own memory and then initialized the other PPs.
• The GE 645 (c. 1965) had a "SYSTEM BOOTLOAD" button that, when pressed, caused one of the I/O
controllers to load a 64-word program into memory from a diode read-only memory and deliver an interrupt to
cause that program to start running.[10]
• The first model of the PDP-10 had a "READ IN" button that, when pressed, reset the processor and started an I/O
operation on a device specified by switches on the control panel, reading in a 36-bit word giving a target address
and count for subsequent word reads; when the read completed, the processor started executing the code read in
by jumping to the last word read in.[11]
A noteworthy variation of this is found on the Burroughs B1700 where there is neither a bootstrap ROM nor a
hardwired IPL operation. Instead, after the system is reset it reads and executes opcodes sequentially from a tape
drive mounted on the front panel; this sets up a boot loader in RAM which is then executed. However, since this
makes few assumptions about the system it can equally well be used to load diagnostic (Maintenance Test Routine)
tapes which display an intelligible code on the front panel even in cases of gross CPU failure.
IBM System/360 and successors
In the IBM System/360 and its successors, including the current z/Architecture machines, the boot process is known
as Initial Program Load (IPL).
This term was coined by IBM for the design of the System/360 (ca. 1965) and continues to be used in those
environments today.[12] In the System/360 processors, an IPL is initiated by the computer operator by selecting the
three hexadecimal digit device address (CUu; C=I/O Channel address, U=Control unit address and u=Device
address[13]) followed by pressing the LOAD button. On System/370 and some later systems, the functions of the
switches and the LOAD button are simulated using selectable areas on the screen of a graphics console, often an
IBM 2250-like device or an IBM 3270-like device. For example, on the System/370 Model 158, the keyboard
sequence 0-7-X (zero, seven and X, in that order) results in an IPL from the device address which was keyed into the
input area. Amdahl 470V/6 and related CPUs supported four hexadecimal digits on those CPUs which had the
optional second channel unit installed, for a total of 32 channels. Later, IBM would also support more than 16
channels.
The IPL function in the System/360 and its successors, and its compatibles such as Amdahl's, reads 24 bytes from an
operator-specified device into main storage starting at real address zero. The second and third groups of eight bytes
are treated as Channel Command Words (CCWs) to continue loading the startup program (the first CCW is always
simulated by the CPU and consists of a READ IPL command, 02h, with command chaining and suppress incorrect
length implied). When the I/O channel commands are complete, the first group of eight bytes is then loaded into the
processor's Program Status Word (PSW) and the startup program begins execution at the location designated by that
PSW.[12] The IPL device is usually a disk drive, but exactly the same procedure is also used to IPL from other
input-type devices, such as tape drives, or even card readers, in a device-independent manner, allowing, for example,
the installation of an operating system on a brand-new computer from an OS initial distribution magnetic tape (for
disk controllers, the 02h command also causes the selected device to seek to cylinder 0000h, head 0000h, and to
search for record 01h, thereby also simulating a stand-alone seek command, 07h, and a search ID equal command,
31h; seeks and searches are not simulated by tape and card controllers).
The disk, tape or card deck must contain a special program to load the actual operating system into main storage, and
for this specific purpose "IPL Text" is placed on the disk by the stand-alone DASDI (Direct Access Storage Device
Initialization) program or an equivalent program running under an operating system, e.g., ICKDSF, but IPL-able
tapes and card decks are usually distributed with this "IPL Text" already present.
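As a rough sketch, the 24-byte IPL record described above can be pictured as a C structure. The field names here are illustrative, not IBM's; see the cited Principles of Operation for the authoritative formats.

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Sketch of the 24 bytes read to real address zero during a
 * System/360 IPL: a doubleword loaded into the PSW when the channel
 * program completes, followed by two Channel Command Words that
 * continue loading the startup program. */
struct ipl_record {
    uint8_t psw[8];   /* becomes the Program Status Word */
    uint8_t ccw1[8];  /* second group of 8 bytes: a CCW */
    uint8_t ccw2[8];  /* third group of 8 bytes: a CCW */
};
_Static_assert(sizeof(struct ipl_record) == 24, "IPL record is 24 bytes");

int main(void) {
    printf("IPL record: %zu bytes (PSW at %zu, CCW1 at %zu, CCW2 at %zu)\n",
           sizeof(struct ipl_record),
           offsetof(struct ipl_record, psw),
           offsetof(struct ipl_record, ccw1),
           offsetof(struct ipl_record, ccw2));
    return 0;
}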
Minicomputers
Minicomputers, starting with the Digital Equipment Corporation
(DEC) PDP-5 and PDP-8 (1965) simplified design by using the CPU to
assist input and output operations. This saved cost but made booting
more complicated than pressing a single button. Minicomputers
typically had some way to toggle in short programs by manipulating an
array of switches on the front panel. Since the early minicomputers
used magnetic core memory, which did not lose its information when
power was off, these bootstrap loaders would remain in place unless
they were erased. Erasure sometimes happened accidentally when a
program bug caused a loop that overwrote all of memory.
[Image: PDP-8/E front panel showing the switches used to load the bootstrap program]
Other examples include early models of the Data General Nova (1969), PDP-11 (1970) and early microcomputers
such as the Altair 8800 (1975). The Nova used 16 front panel switches and an enter pushbutton to manually load the
first 22 addresses into a core memory. DEC later added an optional diode matrix read-only memory for the PDP-11
that stored a bootstrap program of up to 32 words (64 bytes). It consisted of a printed circuit card, the M792, that
plugged in to the Unibus and held a 32 by 16 array of semiconductor diodes. With all 512 diodes in place, the
memory contained all one bits; the card was programmed by cutting off each diode whose bit was to be zero. DEC
also sold versions of the card, the BM792-Yx series, pre-programmed for many standard input devices by simply
omitting the unneeded diodes.[14][15]
Following the older approach, the earlier PDP-1 has a hardware loader, such that an operator need only push the
"load" switch to instruct the paper tape reader to load a program directly into core memory.
Early minicomputer boot loader examples
In a minicomputer with a paper tape reader, the first program to run in the boot process, the boot loader, would read
into core memory either the second-stage boot loader (often called a Binary Loader) that could read paper tape with
checksum or the operating system from an outside storage medium. Pseudocode for the boot loader might be as
simple as the following eight instructions:
1. Set the P register to 9
2. Check paper tape reader ready
3. If not ready, jump to 2
4. Read a byte from paper tape reader to accumulator
5. Store accumulator to address in P register
6. If end of tape, jump to 9
7. Increment the P register
8. Jump to 2
A related example is based on a loader for a Nicolet Instrument Corporation minicomputer of the 1970s, using a
Teletype Model 33 ASR teleprinter as a paper tape reader. Note that the bytes of the second-stage loader are read
from paper tape in reverse order.
1. Set the P register to 106
2. Check paper tape reader ready
3. If not ready, jump to 2
4. Read a byte from paper tape reader to accumulator
5. Store accumulator to address in P register
6. Decrement the P register
7. Jump to 2
The length of the second stage loader is such that the final byte overwrites location 7. After the instruction in
location 6 executes, location 7 starts the second stage loader executing. The second stage loader then waits for the
much longer tape containing the operating system to be placed in the tape reader. The difference between the boot
loader and second stage loader is the addition of checking code to trap paper tape read errors, a frequent occurrence
with relatively low-cost, "part-time-duty" hardware such as the Teletype Model 33 ASR. (Friden Flexowriters were
far more reliable, but also comparatively costly.)
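For clarity, the first, eight-step loader above can be rendered in C. This is a minimal simulation, not machine code: the tape contents are arbitrary demo bytes, and the busy-wait on reader readiness (steps 2-3) is instantaneous here.

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

/* Simulated hardware: a short "paper tape" and 4 KB of core memory,
 * standing in for the real machine's reader and core store. */
static const uint8_t tape[] = { 0x3A, 0x21, 0x07 };  /* arbitrary demo bytes */
static size_t tape_pos;
static uint8_t core[4096];

static bool    tape_at_end(void)    { return tape_pos >= sizeof tape; }
static uint8_t tape_read_byte(void) { return tape[tape_pos++]; }

/* Copy bytes from the tape into consecutive core locations starting at
 * address 9 (steps 4, 5 and 7); on end of tape, control would transfer
 * to the freshly loaded code at address 9 (step 6). */
int main(void) {
    unsigned p = 9;                      /* the P register */
    while (!tape_at_end()) {
        core[p] = tape_read_byte();
        p++;
    }
    printf("loaded %u bytes at core[9..%u]\n", (unsigned)sizeof tape, p - 1);
    return 0;
}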
Booting the first microcomputers
The earliest microcomputers, such as the Altair 8800 and an even earlier, similar machine (based on the Intel 8008 CPU), had no bootstrapping hardware as such. When started, the CPU would see memory that contained only binary zeros as executable code; memory was cleared by resetting when powering up. The front
panels of these machines carried toggle switches, one switch per bit of the computer memory word. Simple additions
to the hardware permitted one memory location at a time to be loaded from those switches to store bootstrap code.
Meanwhile, the CPU was kept from attempting to execute memory content. Once correctly loaded, the CPU was
enabled to execute the bootstrapping code. This process was tedious and had to be error-free.
Integrated circuit read-only memory era
The boot process was revolutionized by the introduction of integrated
circuit read-only memory (ROM), with its many variants, including
mask-programmed ROMs, programmable ROM (PROM), erasable
ROM (EPROM), and flash memory. These allowed firmware boot
programs to be shipped installed on the computer.
Apple Inc.'s first computer, the Apple 1 introduced in 1976, featured
PROM chips that eliminated the need for a front panel. According to Apple's ad announcing it: "No More Switches, No More Lights ... the
firmware in PROMS enables you to enter, display and debug programs
(all in hex) from the keyboard."[16]
[Image: A UV-erasable ROM "chip" on a circuit board. This UV EPROM chip is under the transparent window in the gold-and-white "package" that transmits short-wave ultraviolet light to erase data stored in the chip.]
Some operating systems, most notably pre-1995 Macintosh systems from Apple, are so closely interwoven with their hardware that it is impossible to natively boot an operating system other than the standard one. This is the opposite extreme of the scenario using switches
mentioned above; it is highly inflexible but relatively error-proof and
foolproof as long as all hardware is working normally. A common
solution in such situations is to design a boot loader that works as a program belonging to the standard OS that
hijacks the system and loads the alternative OS. This technique was used by Apple for its A/UX Unix
implementation and copied by various freeware operating systems and BeOS Personal Edition 5.
Some machines, like the Atari ST microcomputer, were "instant-on", with the operating system executing from a
ROM. Retrieval of the OS from secondary or tertiary store was thus eliminated as one of the characteristic operations
for bootstrapping. To allow system customizations, accessories, and other support software to be loaded
automatically, the Atari's floppy drive was read for additional components during the boot process. There was a
timeout delay that provided time to manually insert a floppy as the system searched for the extra components. This
could be avoided by inserting a blank disk. The Atari ST hardware was also designed so the cartridge slot could
provide native program execution for gaming purposes as a holdover from Atari's legacy of making electronic games;
by inserting the Spectre GCR cartridge with the Macintosh system ROM in the game slot and turning the Atari on, it
could "natively boot" the Macintosh operating system rather than Atari's own TOS system.
The IBM Personal Computer included ROM-based firmware called the BIOS; one of the functions of that firmware
was to perform a power-on self test when the machine was powered up, and then to read software from a boot device
and execute it. Firmware compatible with the BIOS on the IBM Personal Computer is used in IBM PC compatible
computers. The Extensible Firmware Interface was developed by Intel, originally for Itanium-based machines, and
later also used as an alternative to the BIOS in x86-based machines, including Apple Macs using Intel processors.
Unix workstations originally had vendor-specific ROM-based firmware. Sun Microsystems later developed
OpenBoot, later known as Open Firmware, which incorporated a Forth interpreter, with much of the firmware being
written in Forth. It was standardized by the IEEE as IEEE standard 1275-1994; firmware that implements that
standard was used in PowerPC-based Macs and some other PowerPC-based machines, as well as Sun's own
SPARC-based computers. The Advanced RISC Computing specification defined another firmware standard, which
was implemented on some MIPS-based and Alpha-based machines and the SGI Visual Workstation x86-based
workstations.
Modern boot loaders
When a modern computer is turned off, software, including operating systems, application code, and data, is stored
on nonvolatile data storage devices such as hard drives, CDs, DVDs, flash memory cards (like SD cards), USB flash
drives, and floppy disks. When the computer is powered on, it typically does not have an operating system in random
access memory (RAM). The computer first executes a relatively small program stored in read-only memory (ROM)
along with a small amount of needed data, to access the nonvolatile device or devices from which the operating
system programs and data can be loaded into RAM.
The small program that starts this sequence is known as a bootstrap loader, bootstrap or boot loader. This small
program's only job is to load other data and programs which are then executed from RAM. Often, multiple-stage
boot loaders are used, during which several programs of increasing complexity load one after the other in a process
of chain loading.
Some computer systems, upon receiving a boot signal from a human operator or a peripheral device, may load a very
small number of fixed instructions into memory at a specific location, initialize at least one CPU, and then point the
CPU to the instructions and start their execution. These instructions typically start an input operation from some
peripheral device (which may be switch-selectable by the operator). Other systems may send hardware commands
directly to peripheral devices or I/O controllers that cause an extremely simple input operation (such as "read sector
zero of the system device into memory starting at location 1000") to be carried out, effectively loading a small
number of boot loader instructions into memory; a completion signal from the I/O device may then be used to start
execution of the instructions by the CPU.
Smaller computers often use less flexible but more automatic boot loader mechanisms to ensure that the computer
starts quickly and with a predetermined software configuration. In many desktop computers, for example, the
bootstrapping process begins with the CPU executing software contained in ROM (for example, the BIOS of an IBM
PC) at a predefined address (some CPUs, including the Intel x86 series, are designed to execute this software after
reset without outside help). This software contains rudimentary functionality to search for devices eligible to
participate in booting, and load a small program from a special section (most commonly the boot sector) of the most
promising device.
Boot loaders may face peculiar constraints, especially in size; for instance, on the IBM PC and compatibles, a boot
sector should typically work in only 32 KB[17] (later relaxed to 64 KB[18]) of system memory and not use
instructions not supported by the original 8088/8086 processors. The first stage of boot loaders located on fixed disks
and removable drives must fit into the first 446 bytes of the Master Boot Record in order to leave room for the
default 64-byte partition table with four partition entries and the two-byte boot signature, which the BIOS requires
for a proper boot loader – or even less, when additional features like more than four partition entries (up to 16 with
16 bytes each), a disk signature (6 bytes), a disk timestamp (6 bytes), an Advanced Active Partition (18 bytes) or
special multi-boot loaders have to be supported as well in some environments. In floppy and superfloppy Volume
Boot Records, up to 59 bytes are occupied for the Extended BIOS Parameter Block on FAT12 and FAT16 volumes
since DOS 4.0, whereas the FAT32 EBPB introduced with DOS 7.1 requires as many as 71 bytes, leaving only 441 bytes
for the boot loader when assuming a sector size of 512 bytes. Microsoft boot sectors therefore traditionally imposed
certain restrictions on the boot process, for example, the boot file had to be located at a fixed position in the root
directory of the file system and stored as consecutive sectors, conditions taken care of by the SYS command and
slightly relaxed in later versions of DOS. The boot loader was then able to load the first three sectors of the file into
memory, which happened to contain another embedded boot loader able to load the remainder of the file into
memory. When Microsoft added LBA and FAT32 support, it even switched to a two-sector boot loader using 386
instructions. At the same time other vendors managed to squeeze much more functionality into a single boot sector
without relaxing the original constraints on the only minimal available memory and processor support. For example,
DR-DOS boot sectors are able to locate the boot file in the FAT12, FAT16 and FAT32 file system, and load it into
memory as a whole via CHS or LBA, even if the file is not stored in a fixed location and in consecutive sectors.
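As an illustration of these size constraints, the classic 512-byte MBR layout can be sketched as a C structure. This shows only the basic split into boot code, the default four-entry partition table, and the boot signature; the optional extended fields mentioned above are omitted.

#include <stdio.h>
#include <stdint.h>

/* Sketch of the classic Master Boot Record: 446 bytes of first-stage
 * boot code, four 16-byte partition entries, and the two-byte boot
 * signature (55h AAh on disk) that the BIOS requires. */
struct mbr {
    uint8_t boot_code[446];
    uint8_t partition_table[4][16];
    uint8_t signature[2];        /* must be 0x55, 0xAA */
};
_Static_assert(sizeof(struct mbr) == 512, "MBR occupies one 512-byte sector");

int main(void) {
    printf("MBR: %zu bytes total, %zu for code\n",
           sizeof(struct mbr), sizeof((struct mbr){0}.boot_code));
    return 0;
}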
Second-stage boot loader
Second-stage boot loaders, such as GNU GRUB, BOOTMGR, Syslinux, or NTLDR, are not themselves operating
systems, but are able to load an operating system properly and transfer execution to it; the operating system
subsequently initializes itself and may load extra device drivers.
Many boot loaders (like GNU GRUB, Windows's BOOTMGR, and Windows NT/2000/XP's NTLDR) can be
configured to give the user multiple booting choices. These choices can include different operating systems (for dual
or multi-booting from different partitions or drives), different versions of the same operating system (in case a new
version has unexpected problems), different operating system loading options (e.g., booting into a rescue or safe
mode), and some standalone programs that can function without an operating system, such as memory testers (e.g.,
memtest86+) or even games (see List of PC Booter games).[19] Some boot loaders can also load other boot loaders;
for example, GRUB loads BOOTMGR instead of loading Windows directly. Usually a default choice is preselected
with a time delay during which a user can press a key to change the choice; after this delay, the default choice is
automatically run so normal booting can occur without interaction.
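As a concrete illustration, a GRUB 2 configuration expressing such a menu might look roughly like the fragment below; the menu entry names, device names, and file paths are illustrative only, not taken from any particular system.

# Preselect the first entry and boot it after a 5-second delay
set default=0
set timeout=5

# First choice: a Linux kernel on the first partition of the first disk
menuentry "Linux" {
    set root=(hd0,1)
    linux /vmlinuz root=/dev/sda1
    initrd /initrd.img
}

# Second choice: chain-load another boot loader (e.g. BOOTMGR), as
# described above for GRUB loading Windows indirectly
menuentry "Windows" {
    set root=(hd0,2)
    chainloader +1
}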
The boot process can be considered complete when the computer is ready to interact with the user, or the operating
system is capable of running system programs or application programs. Typical modern personal computers boot in
about one minute, of which about 15 seconds are taken by a power-on self-test (POST) and a preliminary boot
loader, and the rest by loading the operating system and other software. Time spent after the operating system
loading can be considerably shortened to as little as 3 seconds[20] by bringing the system up with all cores at once, as
with coreboot.[21] Large servers may take several minutes to boot and start all their services.
Many embedded systems must boot immediately. For example, waiting a minute for a digital television or a GPS
satellite to start is generally unacceptable. Therefore such devices have software systems in ROM or flash memory
so the device can begin functioning immediately; little or no loading is necessary, because the loading can be
precomputed and stored on the ROM when the device is made.
Large and complex systems may have boot procedures that proceed in multiple phases until finally the operating
system and other programs are loaded and ready to execute. Because operating systems are designed as if they never
start or stop, a boot loader might load the operating system, configure itself as a mere process within that system, and
then irrevocably transfer control to the operating system. The boot loader then terminates normally as any other
process would.
Network booting
Most computers are also capable of booting over a computer network. In this scenario, the operating system is stored
on the disk of a server, and certain parts of it are transferred to the client using a simple protocol such as the Trivial
File Transfer Protocol. After these parts have been transferred, the operating system then takes over control of the
booting process.
Boot devices (IBM PC)
[Image: Windows To Go bootable flash drive, a Live USB example]
The boot device is the device from which the operating system is loaded. A modern PC BIOS supports booting from various devices, typically a local hard disk drive via the Master Boot Record (MBR) (and one of several MS-DOS partitions on such a disk, or GPT through GRUB 2), an optical disc drive (using El Torito), a USB mass storage device (FTL-based flash drive, SD card, or multi-media card slot; hard disk drive, optical disc drive, etc.), or a network interface card (using PXE). Older, less common BIOS-bootable devices include floppy disk drives, SCSI devices, Zip drives, and LS-120 drives.
Typically, the BIOS will allow the user to configure a boot order. If the boot order is set to "first, the DVD drive;
second, the hard disk drive", then the BIOS will try to boot from the DVD drive, and if this fails (e.g. because there
is no DVD in the drive), it will try to boot from the local hard drive.
For example, on a PC with Windows XP installed on the hard drive, the user could set the boot order to the one
given above, and then insert a Linux Live CD in order to try out Linux without having to install an operating system
onto the hard drive. This is an example of dual booting – the user choosing which operating system to start after the computer has performed its POST. In this example of dual booting, the user chooses by inserting or removing the CD from the computer, but it is more common to choose which operating system to boot by selecting from a menu using the computer keyboard (typically F11 or ESC).
Boot sequence of IBM-PC compatibles
Upon starting, an IBM-compatible personal computer's x86 CPU runs, in real mode, the instruction located at the
physical memory address FFFFFFF0h, usually pointing to the BIOS entry point inside the ROM.[22] This memory
location typically contains a jump instruction that transfers execution to the location of the BIOS start-up program.
This program runs a power-on self-test (POST) to check and initialize required devices such as DRAM and the PCI
bus (including running embedded ROMs). The most complicated step is setting up DRAM over SPI, made more
difficult by the fact that at this point memory is very limited.
After initializing required hardware, the BIOS goes through a pre-configured list of non-volatile storage devices
("boot device sequence") until it finds one that is bootable. A bootable device is defined as one that can be read from,
and where the last two bytes of the first sector contain the little-endian word AA55h, found as byte sequence 55h,
AAh on disk (also known as the MBR boot signature), or where it is otherwise established that the code inside the
sector is executable on x86 PCs.
Coreboot splits the initialization and boot services into distinct parts, supporting "payloads" such as SeaBIOS,
TianoCore, GRUB, and Linux directly (from flash).
Once the BIOS has found a bootable device it loads the boot sector to
linear address 7C00h (usually Segment:Offset 0000h:7C00h, but
some BIOSes use 07C0h:0000h) and transfers execution to the boot
code. In the case of a hard disk, this is referred to as the Master Boot
Record (MBR) and is by definition not operating-system specific. The
conventional MBR code checks the MBR's partition table for a
partition set as bootable (the one with active flag set). If an active
partition is found, the MBR code loads the boot sector code from that
partition, known as Volume Boot Record (VBR), and executes it.
The VBR is often operating-system specific; however, in most
operating systems its main function is to load and execute the operating
system kernel, which continues startup.
[Image: A hex dump of FreeBSD's boot0 MBR]
If there is no active partition, or the active partition's boot sector is invalid, the MBR may load a secondary boot
loader which will select a partition (often via user input) and load its boot sector, which usually loads the
corresponding operating system kernel. In some cases, the MBR may also attempt to load secondary boot loaders
before trying to boot the active partition. If all else fails, it should issue an INT 18h[18] BIOS interrupt call (followed
by an INT 19h just in case INT 18h would return) in order to give back control to the BIOS, which would then
attempt to boot off other devices, attempt a remote boot via network or invoke ROM BASIC.
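A sketch of this active-partition scan in C follows, for readability (real MBR code is 16-bit assembly running in real mode). The entry layout follows the classic 16-byte partition entry format, in which byte 0 is the status flag and 0x80 marks the active partition.

#include <stdio.h>
#include <stdint.h>

/* Classic 16-byte MBR partition table entry. */
struct part_entry {
    uint8_t  status;        /* 0x80 = active (bootable), 0x00 = inactive */
    uint8_t  chs_first[3];  /* CHS address of first sector */
    uint8_t  type;          /* partition type ID */
    uint8_t  chs_last[3];   /* CHS address of last sector */
    uint32_t lba_first;     /* LBA of first sector */
    uint32_t sectors;       /* number of sectors in the partition */
};
_Static_assert(sizeof(struct part_entry) == 16, "entry is 16 bytes");

/* Return the index of the active partition, or -1 if none is marked
 * active (the MBR would then fall back to a secondary loader or
 * INT 18h, as described above). */
static int find_active(const struct part_entry table[4]) {
    for (int i = 0; i < 4; i++)
        if (table[i].status & 0x80)
            return i;
    return -1;
}

int main(void) {
    struct part_entry table[4] = {0};
    table[1].status = 0x80;              /* mark the second entry active */
    printf("active partition index: %d\n", find_active(table));
    return 0;
}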
Some systems (particularly newer Macintoshes and new editions of Microsoft Windows) use Intel's EFI. Also
coreboot allows a computer to boot without having the firmware/BIOS constantly running in system management
mode. 16-bit BIOS interfaces are required by certain x86 operating systems, such as DOS and Windows 3.1/95/98
(and all when not booted via UEFI). However, most boot loaders retain 16-bit BIOS call support.[23][24][25]
Other kinds of boot sequences
Some other processors have other kinds of boot modes.
There are alternative techniques for booting CPUs and microcontrollers:
• Some modern CPUs and microcontrollers (for example, TI OMAP) or sometimes even DSPs may have boot
ROM with boot code integrated directly into their silicon, so such a processor could perform quite a sophisticated
boot sequence on its own and load boot programs from various sources like NAND flash, SD or MMC card and
so on. It is hard to hardwire all the required logic for handling such devices, so an integrated boot ROM is used
instead in such scenarios. Boot ROM usage enables more flexible boot sequences than hardwired logic could
provide. For example, the boot ROM could try to perform boot from multiple boot sources. Also, a boot ROM is
often able to load a boot loader or diagnostic program via serial interfaces like UART, SPI, USB and so on. This
feature is often used for system recovery purposes when, for some reason, the usual boot software in non-volatile memory has been erased. This technique could also be used for initial non-volatile memory programming when clean non-volatile memory is installed and hence no software is available in the system yet.
• It is also possible to take control of a system by using a hardware debug interface such as JTAG. Such an
interface may be used to write the boot loader program into bootable non-volatile memory (e.g. flash) by
instructing the processor core to perform the necessary actions to program non-volatile memory. Alternatively,
the debug interface may be used to upload some diagnostic or boot code into RAM, and then to start the processor
core and instruct it to execute the uploaded code. This allows, for example, the recovery of embedded systems
where no software remains on any supported boot device, and where the processor does not have any integrated
boot ROM. JTAG is a standard and popular interface; many CPUs, microcontrollers and other devices are
manufactured with JTAG interfaces (as of 2009).
• Some microcontrollers provide special hardware interfaces which can't be used to take arbitrary control of a
system or directly run code, but instead they allow the insertion of boot code into bootable non-volatile memory
(like flash memory) via simple protocols. Then at the manufacturing phase, such interfaces are used to inject boot
code (and possibly other code) into non-volatile memory. After system reset, the microcontroller begins to
execute code programmed into its non-volatile memory, just like usual processors are using ROMs for booting.
Most notably this technique is used by Atmel AVR microcontrollers, and by others as well. In many cases such
interfaces are implemented by hardwired logic. In other cases such interfaces could be implemented by software running in an integrated on-chip boot ROM, using GPIO pins.
Most digital signal processors have the following boot modes:
• Serial mode boot
• Parallel mode boot, such as the host port interface (HPI boot)
In case of DSPs there is often a second microprocessor or microcontroller present in the system design, and this is
responsible for overall system behavior, interrupt handling, dealing with external events, user interface, etc. while
the DSP is dedicated to signal processing tasks only. In such systems the DSP could be booted by another processor
which is sometimes referred to as the host processor (giving name to a Host Port). Such a processor is also sometimes referred to as the master, since it usually boots first from its own memories and then controls overall system behavior,
including booting of the DSP, and then further controlling the DSP's behavior. The DSP often lacks its own boot
memories and relies on the host processor to supply the required code instead. The most notable systems with such a
design are cell phones, modems, audio and video players and so on, where a DSP and a CPU/microcontroller are
co-existing.
Many FPGA chips load their configuration from an external serial EEPROM ("configuration ROM") on power-up.
Quick boot
Several devices are available that enable the user to "quick-boot" into a (usually Linux-powered) OS for various simple tasks such as Internet access (such as Splashtop and Latitude ON).[26][27][28][29][30][31][32][33][34]
Notes
[1] "Bootstrap" (http://dictionary.reference.com/search?r=2&q=bootstrap). Dictionary.com.
[2] "Bootstrap" (http://www.thefreedictionary.com/bootstrap). TheFreeDictionary.com.
[3] http://en.wiktionary.org/wiki/pull_oneself_up_by_one's_bootstraps
[4] "Phrase Finder" (http://www.phrases.org.uk/meanings/290800.html). phrases.org.uk.
[5] Buchholz, Werner (1953). "The System Design of the IBM Type 701 Computer" (http://bitsavers.org/pdf/ibm/701/Buchholz_IBM_701_System_Design_Oct53.pdf). Proceedings of the I.R.E. 41 (10): 1273.
[6] Principles of Operation Type 701 And Associated Equipment (http://bitsavers.org/pdf/ibm/701/24-6042-1_701_PrincOps.pdf). IBM. 1953. p. 26. Retrieved November 9, 2012.
[7] From Gutenberg to the Internet, Jeremy M. Norman, 2005, page 436, ISBN 0-930405-87-0
[8] Oxford English Dictionary. Oxford University.
[9] IBM 650 (http://bitsavers.trailing-edge.com/pdf/ibm/650/22-6060-2_650_OperMan.pdf)
[10] "GE-645 System Manual" (http://bitsavers.org/pdf/ge/GE-645/GE-645_SystemMan_Jan68.pdf). Retrieved November 6, 2012.
[11] PDP-10 System Reference Manual, Part 1 (http://bitsavers.org/pdf/dec/pdp10/1970_PDP-10_Ref/1970PDP10Ref_Part1.pdf). Digital Equipment Corporation. 1969. p. 2-72. Retrieved November 9, 2012.
[12] z/Architecture Principles of Operation (http://publibz.boulder.ibm.com/epubs/pdf/a2278324.pdf) (PDF). IBM. September 2005. Chapter 17. Retrieved 2007-04-14.
[13] Some control units attached only 8 devices; some attached more than 16. Indeed, the 3830 DASD controller offered 32-drive-addressing as an option.
[14] PDP-11 Peripherals Handbook, DEC, 1975, p. 4-25
[15] Photos of M792-YB Diode ROM bootstrapping card (http://decpicted.blogspot.com/2010/07/m792-yb-bootstrap-diode-matrix.html)
[16] Apple Ad, Interface Age, October 1976 (http://en.wikipedia.org/wiki/File:Apple_1_Advertisement_Oct_1976.jpg)
[17] Masahiko Sakamoto (May 13, 2010). "Why BIOS loads MBR into 7C00h in x86?" (http://www.glamenv-septzen.net/en/view/6). Glamenv-Septzen.net. Retrieved 2012-08-22.
[18] Compaq Computer Corporation, Phoenix Technologies Ltd, Intel Corporation (1996-01-11). BIOS Boot Specification 1.01 (https://www.acpica.org/download/specsbbs101.pdf).
[19] "Tint" (http://www.coreboot.org/Tint). coreboot. Retrieved 20 November 2010.
[20] "FAQ - Why do we need coreboot?" (http://www.coreboot.org/FAQ#Why_do_we_need_coreboot_for_cluster_maintainance.3F). coreboot. Retrieved 20 November 2010.
[21] "Google tech talks - coreboot (aka LinuxBIOS): The Free/Open-Source x86 Firmware" (http://www.youtube.com/watch?v=X72LgcMpM9k). YouTube.
[22] "9.1.4 First Instruction Executed" (http://download.intel.com/products/processor/manual/325462.pdf). Intel 64 and IA-32 Architectures Software Developer's Manual. Intel Corporation. May 2012. p. 2611. Retrieved August 23, 2012. "The first instruction that is fetched and executed following a hardware reset is located at physical address FFFFFFF0h. This address is 16 bytes below the processor's uppermost physical address. The EPROM containing the software-initialization code must be located at this address."
[23] "Intel Platform Innovation Framework for EFI" (http://www.intel.com/technology/framework/). Intel. Retrieved 2008-01-07.
[24] Intel Macintosh computers all have firmware with compatibility mode for legacy BIOS operations.
[25] "?" (http://www.coreboot.org/OpenBIOS).
[26] Brown, Eric (2008-10-02). "MontaVista Linux drives Dell's quick-boot feature" (http://www.linuxdevices.com/news/NS2560585344.html). linuxdevices.com. Retrieved 20 November 2010.
[27] Larabel, Michael (June 14, 2008). "SplashTop Linux On HP, Dell Notebooks?" (http://www.phoronix.com/scan.php?page=article&item=splashtop_voodoo&num=1). Phoronix. Retrieved 20 November 2010.
[28] "Voodoo Envy's Instant-On IOS (powered by Splashtop)" (http://www.youtube.com/watch?v=InUpF5Uetfc). YouTube. Retrieved 20 November 2010.
[29] "Voodoo Envy 133 Laptop vs MacBook Air" (http://www.gadgets-reviews.com/voodoo-envy-133.html). gadgets-reviews.com. July 29, 2008. Retrieved 20 November 2010.
[30] "Voodoopc homepage" (http://www.voodoopc.com/). Retrieved 20 November 2010.
[31] Brown, Eric (2008-10-03). "5-second Linux boots on low-powered hardware" (http://www.linuxdevices.com/news/NS7654890804.html). Retrieved 20 November 2010.
[32] "Latitude ON" (http://www.youtube.com/watch?v=y40Z1mvGOt8). YouTube. Retrieved 20 November 2010.
[33] Brown, Eric (2008-11-07). "Linux boots in 2.97 seconds" (http://www.linuxdevices.com/news/NS5185504436.html). linuxdevices.com. Retrieved 20 November 2010.
[34] "News" (http://www.linuxdevices.com/news/NS8282586707.html?kc=rss). linuxdevices.com. Retrieved 20 November 2010.
Further reading
• How Computers Boot Up (http://duartes.org/gustavo/blog/post/how-computers-boot-up)
• Practical boot loader tutorial for ATmega microcontrollers (http://www.societyofrobots.com/
bootloader_50_robot.shtml)
• Booting with Grub (http://www.osdcom.info/content/view/33/39/) at OSDEV Community
• Tutorial on writing hello world boot loader (http://viralpatel.net/taj/tutorial/hello_world_bootloader.php)
• x86 BootStrap Programming Tutorial (http://www.vnutz.com/content/program_a_bootstrap_loader)
• Bootstrapping FreeBSD (http://www.khmere.com/freebsd_book/html/ch02.html)
• The Linux boot process unveiled (http://lateral.netmanagers.com.ar/stories/23.html)
• Mac OS X Boot Process (http://www.kernelthread.com/mac/osx/arch_boot.html)
• Jonathan de Boyne Pollard (2006). "The EFI boot process" (http://homepage.ntlworld.com./jonathan.
deboynepollard/FGA/efi-boot-process.html). Frequently Given Answers.
• Jonathan de Boyne Pollard (2006). "The ARC boot process" (http://homepage.ntlworld.com./jonathan.
deboynepollard/FGA/arc-boot-process.html). Frequently Given Answers.
• Jonathan de Boyne Pollard (1996). "The DOS and DOS/Windows boot processes" (http://homepage.ntlworld.
com./jonathan.deboynepollard/FGA/dos-windows-boot-process.html). Frequently Given Answers.
• Jonathan de Boyne Pollard (2006). "The Windows NT 6 boot process" (http://homepage.ntlworld.com./
jonathan.deboynepollard/FGA/windows-nt-6-boot-process.html). Frequently Given Answers.
• Windows Mobile 5.0 Soft Reset (http://www.pocketpcfaq.com/faqs/5.0/reset.htm)
• Pocket PC devices hard reset and soft reset (http://www.hardreset.eu/index_en.html)
• Cell phone, Tablet and Pocket PC devices hard reset and soft reset (http://www.hard-reset.com/)
• Understanding Multibooting (http://www.goodells.net/multiboot/)
• Code of a simple boot loader for students (http://code.google.com/p/akernelloader/)
2. Processes
Computer multitasking
In computing, multitasking is a method where multiple tasks, also known as processes, are performed during the
same period of time. The tasks share common processing resources, such as a CPU and main memory. In the case of
a computer with a single CPU, only one task is said to be running at any point in time, meaning that the CPU is
actively executing instructions for that task. Multitasking solves the problem by scheduling which task may be the
one running at any given time, and when another waiting task gets a turn. The act of reassigning a CPU from one
task to another one is called a context switch. When context switches occur frequently enough the illusion of
parallelism is achieved. Even on computers with more than one CPU (called multiprocessor machines), multitasking
allows many more tasks to be run than there are CPUs. The term "multitasking" has become international: the same
word is used in many other languages, such as German, Italian, Dutch, Danish, and Norwegian.
Operating systems may adopt one of many different scheduling strategies, which generally fall into the following
categories:
• In multiprogramming systems, the running task keeps running until it performs an operation that requires waiting
for an external event (e.g. reading from a tape) or until the computer's scheduler forcibly swaps the running task
out of the CPU. Multiprogramming systems are designed to maximize CPU usage.
• In time-sharing systems, the running task is required to relinquish the CPU, either voluntarily or by an external
event such as a hardware interrupt. Time sharing systems are designed to allow several programs to execute
apparently simultaneously.
• In real-time systems, some waiting tasks are guaranteed to be given the CPU when an external event occurs. Real
time systems are designed to control mechanical devices such as industrial robots, which require timely
processing.
Multiprogramming
In the early days of computing, CPU time was expensive, and peripherals were very slow. When the computer ran a
program that needed access to a peripheral, the Central processing unit (CPU) would have to stop executing program
instructions while the peripheral processed the data. This was deemed very inefficient. The first computer using a
multiprogramming system was the British Leo III, owned by J. Lyons and Co. Several different programs in batch
were loaded in the computer memory, and the first one began to run. When the first program reached an instruction
waiting for a peripheral, the context of this program was stored away, and the second program in memory was given
a chance to run. The process continued until all programs finished running.
Multiprogramming doesn't give any guarantee that a program will run in a timely manner. Indeed, the very first
program may very well run for hours without needing access to a peripheral. As there were no users waiting at an
interactive terminal, this was no problem: users handed in a deck of punched cards to an operator, and came back a
few hours later for printed results. Multiprogramming greatly reduced wait times when multiple batches were being
processed.
Cooperative multitasking/time-sharing
The expression 'time sharing' was usually used to designate computers shared by interactive users at terminals, such
as IBM's TSO and VM/CMS. The term time-sharing is no longer commonly used, having been displaced by plain
multitasking and by the shift from shared interactive systems to personal computers and workstations. When
computer usage evolved from batch mode to interactive mode, multiprogramming was no longer a suitable approach.
Each user wanted to see his program running as if it were the only program in the computer. The use of time sharing
made this possible, with the qualification that the computer would not seem as fast to any one user as it really would
be if it were running only that user's program.
Early multitasking systems used applications that voluntarily ceded time to one another. This approach, which was
eventually supported by many computer operating systems, is known today as cooperative multitasking. Although it
is now rarely used in larger systems, cooperative multitasking was once the scheduling scheme employed by
Microsoft Windows (prior to Windows 95 and Windows NT) and Mac OS (prior to Mac OS X) in order to enable
multiple applications to be run simultaneously. Windows 9x also used cooperative multitasking, but only for 16-bit
legacy applications, much the same way as pre-Leopard PowerPC versions of Mac OS X used it for Classic
applications. The network operating system NetWare used cooperative multitasking up to NetWare 6.5. Cooperative
multitasking is still used today on RISC OS systems.
Because a cooperatively multitasked system relies on each process regularly giving up time to other processes on the
system, one poorly designed program can consume all of the CPU time for itself or cause the whole system to hang.
In a server environment, this is a hazard that makes the entire network brittle and fragile. All software must be
evaluated and cleared for use in a test environment before being installed on the main server, or a misbehaving
program on the server slows down or freezes the entire network.
Despite the difficulty of designing and implementing cooperatively multitasked systems, time-constrained, real-time
embedded systems (such as spacecraft) are often implemented using this paradigm. This allows highly reliable,
deterministic control of complex real time sequences, for instance, the firing of thrusters for deep space course
corrections.
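To make the cooperative model concrete, here is a minimal C sketch of the run-to-completion "superloop" style common in small embedded systems; the task names and bodies are hypothetical, and each task must return promptly, since nothing can preempt it:

    /* Cooperative multitasking as a run-to-completion "superloop":
     * each task does a small amount of work and voluntarily returns,
     * yielding the processor to the next task in the table. */
    static void sensor_task(void)  { /* e.g., read sensors */ }
    static void control_task(void) { /* e.g., update actuators */ }
    static void log_task(void)     { /* e.g., write telemetry */ }

    int main(void) {
        void (*tasks[])(void) = { sensor_task, control_task, log_task };
        const int ntasks = sizeof tasks / sizeof tasks[0];

        for (;;)                        /* loop forever */
            for (int i = 0; i < ntasks; i++)
                tasks[i]();             /* a task that never returns hangs everything */
    }

The fragility described above is directly visible in this structure: a single task that fails to return stops the entire table.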
Preemptive multitasking/time-sharing
Preemptive multitasking allows the computer system to more reliably guarantee each process a regular "slice" of
operating time. It also allows the system to deal rapidly with important external events like incoming data, which
might require the immediate attention of one or another process.
Operating systems were developed to take advantage of hardware capabilities such as timer interrupts and run
multiple processes preemptively. Digital Equipment Corporation was a leader in this. For example, preemptive multitasking was
implemented in the earliest version of Unix [1] in 1969, and is standard in Unix and Unix-like operating systems,
including Linux, Solaris and BSD with its derivatives.
At any specific time, processes can be grouped into two categories: those that are waiting for input or output (called
"I/O bound"), and those that are fully utilizing the CPU ("CPU bound"). In primitive systems, the software would
often "poll", or "busywait" while waiting for requested input (such as disk, keyboard or network input). During this
time, the system was not performing useful work. With the advent of interrupts and preemptive multitasking, I/O
bound processes could be "blocked", or put on hold, pending the arrival of the necessary data, allowing other
processes to utilize the CPU. As the arrival of the requested data would generate an interrupt, blocked processes
could be guaranteed a timely return to execution.
The earliest preemptive multitasking OS available to home users was Sinclair QDOS on the Sinclair QL, released in
1984, but very few people bought the machine. Commodore's powerful Amiga, released the following year, was the
first commercially successful home computer to use the technology, and its multimedia abilities make it a clear
ancestor of contemporary multitasking personal computers. Microsoft made preemptive multitasking a core feature
of their flagship operating system in the early 1990s when developing Windows NT 3.1 and then Windows 95. It
was later adopted on the Apple Macintosh by Mac OS 9.x [2] as an additional API, i.e. the application could be
programmed to use the preemptive or cooperative model, and all legacy applications were multitasked cooperatively
within a single process. Mac OS X, being a Unix-like system, uses preemptive multitasking for all native
applications, although Classic applications are multitasked cooperatively in a Mac OS 9 environment that itself is
running as an OS X process (and is subject to preemption like any other OS X process).
A similar model is used in Windows 9x and the Windows NT family, where native 32-bit applications are
multitasked preemptively, and legacy 16-bit Windows 3.x programs are multitasked cooperatively within a single
process, although in the NT family it is possible to force a 16-bit application to run as a separate preemptively
multitasked process.[3] 64-bit editions of Windows, both for the x86-64 and Itanium architectures, no longer provide
support for legacy 16-bit applications, and thus provide preemptive multitasking for all supported applications.
Real time
Another reason for multitasking was the design of real-time computing systems, where a number of possibly
unrelated external activities need to be controlled by a single processor system. In such systems a hierarchical
interrupt system is coupled with process prioritization to ensure that key activities are given a greater share of
available process time.
Multithreading
As multitasking greatly improved the throughput of computers, programmers started to implement applications as
sets of cooperating processes (e.g., one process gathering input data, one process processing input data, one process
writing out results on disk). This, however, required some tools to allow processes to efficiently exchange data.
Threads were born from the idea that the most efficient way for cooperating processes to exchange data would be to
share their entire memory space. Thus, threads are basically processes that run in the same memory context. Threads
are described as lightweight because switching between threads does not involve changing the memory context.
While threads are scheduled preemptively, some operating systems provide a variant to threads, named fibers, that
are scheduled cooperatively. On operating systems that do not provide fibers, an application may implement its own
fibers using repeated calls to worker functions. Fibers are even more lightweight than threads, and somewhat easier
to program with, although they tend to lose some or all of the benefits of threads on machines with multiple
processors.
Some systems directly support multithreading in hardware.
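As a concrete illustration of threads sharing a single memory context, the following C sketch uses POSIX threads (compile with -pthread); two threads update the same global variable directly, coordinated by a mutex. This is an illustrative example, not code from any particular system:

    #include <pthread.h>
    #include <stdio.h>

    /* Both threads run in the same address space, so they can exchange
     * data through an ordinary global variable instead of messages. */
    static long counter = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg) {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);   /* synchronize the shared access */
            counter++;
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        pthread_create(&a, NULL, worker, NULL);
        pthread_create(&b, NULL, worker, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("counter = %ld\n", counter);  /* 200000: both saw the same memory */
        return 0;
    }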
Memory protection
When multiple programs are present in memory, an ill-behaved program may (inadvertently or deliberately)
overwrite memory belonging to another program, or even to the operating system itself.
The operating system therefore restricts the memory accessible to the running program. A program trying to access
memory outside its allowed range is immediately stopped before it can change memory belonging to another
process.
Another key innovation was the idea of privilege levels. Low privilege tasks are not allowed some kinds of memory
access and are not allowed to perform certain instructions. When a task tries to perform a privileged operation a trap
occurs and a supervisory program running at a higher level is allowed to decide how to respond.
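The effect of this protection can be observed from user space. In the POSIX-specific C sketch below, a deliberately invalid access is trapped by the hardware and delivered to the process as a SIGSEGV signal; without the handler, the OS would simply terminate the process before it could disturb anyone else's memory:

    #include <signal.h>
    #include <unistd.h>

    /* The kernel converts the hardware trap for an out-of-range access
     * into SIGSEGV; by default this kills the offending process. */
    static void on_segv(int sig) {
        (void)sig;
        write(2, "caught SIGSEGV: invalid memory access\n", 38);
        _exit(1);                        /* async-signal-safe exit */
    }

    int main(void) {
        signal(SIGSEGV, on_segv);
        int *p = (int *)0x1;             /* an address this process does not own */
        *p = 42;                         /* traps before any memory is changed */
        return 0;
    }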
Memory swapping
Use of a swap file or swap partition is a way for the operating system to provide more memory than is physically
available by keeping portions of the primary memory in secondary storage. While multitasking and memory
swapping are two completely unrelated techniques, they are very often used together, as swapping memory allows
more tasks to be loaded at the same time. Typically, a multitasking system allows another process to run when the
running process hits a point where it has to wait for some portion of memory to be reloaded from secondary storage.
Programming in a multitasking environment
Processes that are entirely independent are not much trouble to program. Most of the complexity in multitasking
systems comes from the need to share computer resources between tasks and to synchronize the operation of
co-operating tasks. Various concurrent computing techniques are used to avoid potential problems caused by
multiple tasks attempting to access the same resource.
Bigger systems were sometimes built with a central processor(s) and some number of I/O processors, a kind of
asymmetric multiprocessing.
Over the years, multitasking systems have been refined. Modern operating systems generally include detailed
mechanisms for prioritizing processes, while symmetric multiprocessing has introduced new complexities and
capabilities.
Notes
[1] But Unix was far later than Digital. The Digital Research Initiative (http://www.ibiblio.org/team/intro/unix/what.html)
[2] Technical Note TN2006: MP-Safe Routines (http://developer.apple.com/technotes/tn/tn2006.html)
[3] Smart Computing Article - Windows 2000 & 16-Bit Applications (http://www.smartcomputing.com/editorial/article.asp?article=articles/2005/s1606/08s06/08s06.asp)
Process (computing)
In computing, a process is an instance of a computer program that is
being executed. It contains the program code and its current activity.
Depending on the operating system (OS), a process may be made up of
multiple threads of execution that execute instructions
concurrently.[1][2]
A computer program is a passive collection of instructions; a process is the actual execution of those instructions.
Several processes may be associated with the same program; for example, opening up several instances of the same
program often means more than one process is being executed.

[Figure: A list of processes being shown on htop.]
Multitasking is a method to allow multiple processes to share processors (CPUs) and other system resources. Each
CPU executes a single task at a time. However, multitasking allows each processor to switch between tasks that are
being executed without having to wait for each task to finish. Depending on the operating system implementation,
switches could be performed when tasks perform input/output operations, when a task indicates that it can be
switched, or on hardware interrupts.
A common form of multitasking is time-sharing. Time-sharing is a method to allow fast response for interactive user
applications. In time-sharing systems, context switches are performed rapidly. This makes it seem like multiple
processes are being executed simultaneously on the same processor. The execution of multiple processes seemingly
simultaneously is called concurrency.
For security and reliability reasons most modern operating systems prevent direct communication between
independent processes, providing strictly mediated and controlled inter-process communication functionality.
Representation
In general, a computer system process consists of (or is said to 'own') the following resources:
• An image of the executable machine code associated with a program.
• Memory (typically some region of virtual memory); which includes the executable code, process-specific data
(input and output), a call stack (to keep track of active subroutines and/or other events), and a heap to hold
intermediate computation data generated during run time.
• Operating system descriptors of resources that are allocated to the process, such as file descriptors (Unix
terminology) or handles (Windows), and data sources and sinks.
• Security attributes, such as the process owner and the process' set of permissions (allowable operations).
• Processor state (context), such as the content of registers, physical memory addressing, etc. The state is typically
stored in computer registers when the process is executing, and in memory otherwise.[1]
The operating system holds most of this information about active processes in data structures called process control
blocks.
Any subset of resource, but typically at least the processor state, may be associated with each of the process' threads
in operating systems that support threads or 'daughter' processes.
The operating system keeps its processes separated and allocates the resources they need, so that they are less likely
to interfere with each other and cause system failures (e.g., deadlock or thrashing). The operating system may also
provide mechanisms for inter-process communication to enable processes to interact in safe and predictable ways.
Process management in multi-tasking operating systems
A multitasking operating system may just switch between processes to give the appearance of many processes
executing concurrently or simultaneously, though in fact only one process can be executing at any one time on a
single-core CPU (unless using multithreading or other similar technology).[3]
It is usual to associate a single process with a main program, and 'daughter' ('child') processes with any spin-off,
parallel processes, which behave like asynchronous subroutines. A process is said to own resources, of which an
image of its program (in memory) is one such resource. (Note, however, that in multiprocessing systems, many
processes may run off of, or share, the same reentrant program at the same location in memory, but each process is
said to own its own image of the program.)
Processes are often called "tasks" in embedded operating systems. The sense of "process" (or task) is "something that
takes up time", as opposed to 'memory', which is "something that takes up space".[4]
The above description applies to both processes managed by an operating system, and processes as defined by
process calculi.
If a process requests something for which it must wait, it will be blocked. When the process is in the Blocked State,
it is eligible for swapping to disk, but this is transparent in a virtual memory system, where blocks of memory values
may really be on disk and not in main memory at any time.
(executing programs) are eligible for swapping to disk. All parts of an executing program and its data do not have to
be in physical memory for the associated process to be active.
Process states
An operating system kernel that allows multi-tasking needs processes to have certain states. Names for these states
are not standardised, but they have similar functionality.[1]

[Figure: The various process states, displayed in a state diagram, with arrows indicating possible transitions between states.]

• First, the process is "created" - it is loaded from a secondary storage device (hard disk or CD-ROM...) into main
memory. After that the process scheduler assigns it the state "waiting".
• While the process is "waiting", it waits for the scheduler to do a so-called context switch and load the process into
the processor. The process state then becomes "running", and the processor executes the process instructions.
• If a process needs to wait for a resource (user input, a file to open, ...), it is assigned the "blocked" state. The
process state is changed back to "waiting" when the process no longer needs to wait.
• Once the process finishes execution, or is terminated by the operating system, it is no longer needed. The process
is either removed immediately or moved to the "terminated" state, where it waits to be removed from main
memory.[1][5]
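A kernel records the current state of each process, typically in its process control block. The C fragment below is one plausible, purely illustrative encoding of the states just described; real kernels use their own names (Linux, for instance, has TASK_RUNNING and TASK_INTERRUPTIBLE):

    /* Hypothetical process-state encoding mirroring the list above. */
    enum proc_state {
        PROC_CREATED,     /* loaded from secondary storage, not yet schedulable */
        PROC_WAITING,     /* in the run queue, awaiting a context switch */
        PROC_RUNNING,     /* instructions currently executing on a CPU */
        PROC_BLOCKED,     /* waiting for user input, a file, or another event */
        PROC_TERMINATED   /* finished or killed; awaiting removal from memory */
    };

    struct process {
        int pid;
        enum proc_state state;
        /* ... registers, memory map, open files, accounting ... */
    };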
Inter-process communication
When processes communicate with each other it is called "inter-process communication" (IPC). Processes frequently
need to communicate: in a shell pipeline, for instance, the output of the first process must be passed to the second
one, and so on down the line. Communication is preferably done in a well-structured way that does not rely on interrupts.
It is even possible for the two processes to be running on different machines. The operating system (OS) may differ
from one process to the other, so some mediators (called protocols) are needed.
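A shell pipeline rests on the kernel's pipe mechanism. The hedged C sketch below (POSIX pipe() and fork()) shows one process passing output to another through a kernel-mediated channel; it illustrates the idea rather than any particular shell's implementation:

    #include <stdio.h>
    #include <string.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        int fd[2];
        if (pipe(fd) == -1) return 1;   /* fd[0] = read end, fd[1] = write end */

        if (fork() == 0) {              /* child: the "second process" */
            char buf[64];
            close(fd[1]);
            ssize_t n = read(fd[0], buf, sizeof buf - 1);
            buf[n > 0 ? n : 0] = '\0';
            printf("child received: %s\n", buf);
            return 0;
        }

        close(fd[0]);                   /* parent: the "first process" */
        const char *msg = "hello through the kernel";
        write(fd[1], msg, strlen(msg));
        close(fd[1]);                   /* signals end-of-file to the reader */
        wait(NULL);
        return 0;
    }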
History
By the early 1960s computer control software had evolved from Monitor control software, e.g., IBSYS, to Executive
control software. Computers got "faster" and computer time was still neither "cheap" nor fully used; this made
multiprogramming possible and necessary.
Multiprogramming means that several programs run "at the same time" (concurrently, including parallel and
non-parallel). At first they ran on a single processor (i.e., uniprocessor) and shared scarce resources.
Multiprogramming is also a basic form of multiprocessing, a much broader term.
Programs consist of sequences of instructions for processors. A single processor can run only one instruction at a
time, so it is impossible to run more than one program simultaneously. A program might need a resource (input, for
example) that has a large delay, or it might start some slow operation (such as output to a printer), leaving the
processor "idle" (unused). To keep the processor busy at all times, the execution of such a program is halted, and a
second (or nth) program is started or restarted. To the user, it will appear that the programs run at the same time
(hence the term, concurrent).
Shortly thereafter, the notion of a 'program' was expanded to the notion of an 'executing program and its context'.
The concept of a process was born.
This became necessary with the invention of re-entrant code.
Threads came somewhat later. However, with the advent of time-sharing; computer networks; multiple-CPU, shared
memory computers; etc., the old "multiprogramming" gave way to true multitasking, multiprocessing and, later,
multithreading.
Notes
[1] Silberschatz, Abraham; Galvin, Peter Baer; Gagne, Greg (2004). "Chapter 4 - Processes". Operating System Concepts with Java (Sixth Edition). John Wiley & Sons, Inc. ISBN 0-471-48905-0.
[2] Vahalia, Uresh (1996). "2 - The Process and the Kernel". UNIX Internals - The New Frontiers. Prentice-Hall Inc. ISBN 0-13-101908-2.
[3] Some modern CPUs combine two or more independent processors and can execute several processes simultaneously - see Multi-core for more information. Another technique called simultaneous multithreading (used in Intel's Hyper-threading technology) can simulate simultaneous execution of multiple processes or threads.
[4] Tasks and processes refer essentially to the same entity. And, although they have somewhat different terminological histories, they have come to be used as synonyms. Today, the term process is generally preferred over task, except when referring to 'multitasking', since the alternative term, 'multiprocessing', is too easy to confuse with multiprocessor (which is a computer with two or more CPUs).
[5] Stallings, William (2005). Operating Systems: Internals and Design Principles (5th Edition). Prentice Hall. ISBN 0-13-127837-1. Particularly chapter 3, section 3.2, "process states", including figure 3.9 "process state transition with suspend states".
References
• Gary D. Knott (1974). A proposal for certain process management and intercommunication primitives (http://doi.acm.org/10.1145/775280.775282). ACM SIGOPS Operating Systems Review. Volume 8, Issue 4 (October 1974). pp. 7-44.
External links
• processlibrary.com - Online Resource For Process Information (http://www.processlibrary.com/)
• file.net - Computer Process Information Database and Forum (http://www.file.net/)
Process management (computing)
Process management is an integral part of any modern day operating system (OS). The OS must allocate resources
to processes, enable processes to share and exchange information, protect the resources of each process from other
processes and enable synchronisation among processes. To meet these requirements, the OS must maintain a data
structure for each process, which describes the state and resource ownership of that process, and which enables the
OS to exert control over each process.
Multiprogramming
In many modern operating systems, there can be more than one instance of a program loaded in memory at the same
time; for example, more than one user could be executing the same program, each user having separate copies of the
program loaded into memory. With some programs, it is possible to have one copy loaded into memory, while
several users have shared access to it so that they each can execute the same program-code. Such a program is said to
be re-entrant. The processor at any instant can only be executing one instruction from one program but several
processes can be sustained over a period of time by assigning each process to the processor at intervals while the
remainder become temporarily inactive. A number of processes being executed over a period of time instead of at the
same time is called concurrent execution.
A multiprogramming or multitasking OS is a system executing many processes concurrently. Multiprogramming
requires that the processor be allocated to each process for a period of time and de-allocated at an appropriate
moment. If the processor is de-allocated during the execution of a process, it must be done in such a way that it can
be restarted later as easily as possible.
There are two possible ways for an OS to regain control of the processor during a program's execution in order for
the OS to perform de-allocation or allocation:
1. The process issues a system call (sometimes called a software interrupt); for example, an I/O request occurs
requesting to access a file on hard disk.
2. A hardware interrupt occurs; for example, a key was pressed on the keyboard, or a timer runs out (used in
pre-emptive multitasking).
The stopping of one process and starting (or restarting) of another process is called a context switch or context
change. In many modern operating systems, processes can consist of many sub-processes. This introduces the
concept of a thread. A thread may be viewed as a sub-process; that is, a separate, independent sequence of execution
within the code of one process. Threads are becoming increasingly important in the design of distributed and
client-server systems and in software run on multi-processor systems.
How multiprogramming increases efficiency
A common trait observed among processes associated with most computer programs is that they alternate between
CPU cycles and I/O cycles. For the portion of the time required for CPU cycles, the process is being executed; i.e. is
occupying the CPU. During the time required for I/O cycles, the process is not using the processor. Instead, it is
either waiting to perform Input/Output, or is actually performing Input/Output. An example of this is the reading
from or writing to a file on disk. Prior to the advent of multiprogramming, computers operated as single-user
systems. Users of such systems quickly became aware that for much of the time that a computer was allocated to a
single user, the processor was idle; when the user was entering information or debugging programs for example.
Computer scientists observed that overall performance of the machine could be improved by letting a different
process use the processor whenever one process was waiting for input/output. In a uni-programming system, if N
users were to execute programs with individual execution times of t1, t2, ..., tN, then the total time, tuni, to service the
N processes (consecutively) of all N users would be:
tuni = t1 + t2 + ... + tN.
However, because each process consumes both CPU cycles and I/O cycles, the time which each process actually
uses the CPU is a very small fraction of the total execution time for the process. So, for process i:
ti (processor) ≪ ti (execution)
where
ti (processor) is the time process i spends using the CPU, and
ti (execution) is the total execution time for the process; i.e. the time for CPU cycles plus I/O cycles to be carried out
(executed) until completion of the process.
In fact, the sum of the processor time used by all N processes rarely exceeds a small fraction of the time needed to
execute any one of the processes.
Therefore, in uni-programming systems, the processor lay idle for a considerable proportion of the time. To
overcome this inefficiency, multiprogramming is now implemented in modern operating systems such as Linux,
UNIX and Microsoft Windows. This enables the processor to switch from one process, X, to another, Y, whenever X
is involved in the I/O phase of its execution. Since the processing time is much less than a single job's runtime, the
total time to service all N users with a multiprogramming system can be reduced to approximately:
tmulti = max(t1, t2, ..., tN)
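As a rough illustration with invented numbers: suppose three jobs have execution times t1 = 4, t2 = 5 and t3 = 6 hours, each spending only a small fraction of that time in CPU cycles. Run consecutively they need tuni = 4 + 5 + 6 = 15 hours, whereas multiprogrammed, with the CPU cycles of one job overlapping the I/O cycles of the others, the total approaches tmulti = max(4, 5, 6) = 6 hours.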
Process creation
Operating systems need some ways to create processes. In a very simple system designed for running only a single
application (e.g., the controller in a microwave oven), it may be possible to have all the processes that will ever be
needed be present when the system comes up. In general-purpose systems, however, some way is needed to create
and terminate processes as needed during operation.
There are four principal events that cause a process to be created:
• System initialization.
• Execution of a process creation system call by a running process.
• A user request to create a new process.
• Initiation of a batch job.
When an operating system is booted, typically several processes are created. Some of these are foreground processes,
which interact with a (human) user and perform work for them. Others are background processes, which are not
associated with particular users but instead have some specific function. For example, one background process may
be designed to accept incoming e-mails, sleeping most of the day but suddenly springing to life when an incoming
e-mail arrives. Another background process may be designed to accept an incoming request for web pages hosted on
the machine, waking up when a request arrives to service that request.
Process creation in UNIX and Linux is done through the fork() or clone() system calls. There are several steps involved
in process creation. The first step is the validation of whether the parent process has sufficient authorization to create
a process. Upon successful validation, the parent process is copied almost entirely, with changes only to the unique
process id, parent process, and user-space. Each new process gets its own user space.[1]
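For illustration, the minimal POSIX C program below shows fork() producing an almost exact copy of the parent; the two copies continue from the same point and are distinguished only by the return value:

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();             /* duplicate the calling process */
        if (pid == -1) {
            perror("fork");             /* e.g., resource limits exceeded */
            return 1;
        }
        if (pid == 0) {                 /* child sees 0 */
            printf("child:  pid=%d, parent=%d\n", (int)getpid(), (int)getppid());
        } else {                        /* parent sees the child's new pid */
            printf("parent: pid=%d, child=%d\n", (int)getpid(), (int)pid);
            wait(NULL);                 /* reap the child */
        }
        return 0;
    }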
Process termination
There are many reasons for process termination:
• Batch job issues halt instruction
• User logs off
• Process executes a service request to terminate
• Error and fault conditions
• Normal completion
• Time limit exceeded
• Memory unavailable
• Bounds violation; for example: attempted access of (non-existent) 11th element of a 10-element array
• Protection error; for example: attempted write to read-only file
• Arithmetic error; for example: attempted division by zero
• Time overrun; for example: process waited longer than a specified maximum for an event
• I/O failure
• Invalid instruction; for example: when a process tries to execute data (text)
• Privileged instruction
• Data misuse
• Operating system intervention; for example: to resolve a deadlock
• Parent terminates so child processes terminate (cascading termination)
• Parent request
• Fatal error
Two-state process management model
The operating system‚s principal responsibility is in controlling the execution of processes. This includes
determining the interleaving pattern for execution and allocation of resources to processes. One part of designing an
OS is to describe the behaviour that we would like each process to exhibit. The simplest model is based on the fact
that a process is either being executed by a processor or it is not. Thus, a process may be considered to be in one of
two states, RUNNING or NOT RUNNING. When the operating system creates a new process, that process is initially
labeled as NOT RUNNING, and is placed into a queue in the system in the NOT RUNNING state. The process (or
some portion of it) then exists in main memory, and it waits in the queue for an opportunity to be executed. After
some period of time, the currently RUNNING process will be interrupted, and moved from the RUNNING state to the
NOT RUNNING state, making the processor available for a different process. The dispatch portion of the OS will
then select, from the queue of NOT RUNNING processes, one of the waiting processes to transfer to the processor.
The chosen process is then relabeled from a NOT RUNNING state to a RUNNING state, and its execution is either
begun if it is a new process, or is resumed if it is a process which was interrupted at an earlier time.
From this model we can identify some design elements of the OS:
• The need to represent, and keep track of, each process.
• The state of a process.
• The queuing of NOT RUNNING processes.
Three-state process management model
Although the two-state process management model is a perfectly valid design for an operating system, the absence of
a BLOCKED state means that the processor lies idle when the active process changes from CPU cycles to I/O cycles.
This design does not make efficient use of the processor. The three-state process management model is designed to
overcome this problem, by introducing a new state called the BLOCKED state. This state describes any process
which is waiting for an I/O event to take place. In this case, an I/O event can mean the use of some device or a signal
from another process. The three states in this model are:
• RUNNING: The process that is currently being executed.
• READY: A process that is queuing and prepared to execute when given the opportunity.
• BLOCKED: A process that cannot execute until some event occurs, such as the completion of an I/O operation.
At any instant, a process is in one and only one of the three states. For a single processor computer, only one process
can be in the RUNNING state at any one instant. There can be many processes in the READY and BLOCKED states,
and each of these states will have an associated queue for processes.
Processes entering the system must go initially into the READY state, processes can only enter the RUNNING state
via the READY state. Processes normally leave the system from the RUNNING state. For each of the three states, the
process occupies space in main memory. While the reason for most transitions from one state to another might be
obvious, some may not be so clear.
• RUNNING → READY The most common reason for this transition is that the running process has reached the
maximum allowable time for uninterrupted execution; i.e. a time-out occurs. Other reasons can be the imposition of
priority levels as determined by the scheduling policy used for the Low Level Scheduler, and the arrival of a
higher priority process into the READY state.
• RUNNING → BLOCKED A process is put into the BLOCKED state if it requests something for which it must
wait. A request to the OS is usually in the form of a system call (i.e. a call from the running process to a function
that is part of the OS code). For example, requesting a file from disk, or saving a section of code or data from
memory to a file on disk.
Five-state process management model
While the three state model is sufficient to describe the behavior of processes with the given events, we have to
extend the model to allow for other possible events, and for more sophisticated design. In particular, the use of a
portion of the hard disk to emulate main memory (so called virtual memory) requires additional states to describe the
state of processes which are suspended from main memory, and placed in virtual memory (on disk). Of course, such
processes can, at a future time, be resumed by being transferred back into main memory. The Medium Level
Scheduler controls these events. A process can be suspended from the RUNNING, READY or BLOCKED state,
giving rise to two other states, namely, READY SUSPEND and BLOCKED SUSPEND. A RUNNING process that is
suspended becomes READY SUSPEND, and a BLOCKED process that is suspended becomes BLOCKED SUSPEND.
A process can be suspended for a number of reasons; the most significant of which arises from the process being
swapped out of memory by the memory management system in order to free memory for other processes. Other
common reasons for a process being suspended are when one suspends execution while debugging a program, or
when the system is monitoring processes. For the five-state process management model, consider the following
transitions described in the next sections.
• BLOCKED → BLOCKED SUSPEND If a process in the RUNNING state requires more memory, then at
least one BLOCKED process can be swapped out of memory onto disk. The transition can also be made for a
BLOCKED process if there are READY processes available, and the OS determines that the READY process it
would like to dispatch requires more main memory to maintain adequate performance.
• BLOCKED SUSPEND → READY SUSPEND A process in the BLOCKED SUSPEND state is moved to the
READY SUSPEND state when the event for which it has been waiting occurs. Note that this requires that the state
information concerning suspended processes be accessible to the OS.
• READY SUSPEND → READY When there are no READY processes in main memory, the OS will need to
bring one in to continue execution. In addition, it might be the case that a process in the READY SUSPEND state
has higher priority than any of the processes in the READY state. In that case, the OS designer may dictate that it
is more important to get at the higher priority process than to minimise swapping.
• READY → READY SUSPEND Normally, the OS would be designed so that the preference would be to suspend a
BLOCKED process rather than a READY one.
Process description and control
Each process in the system is represented by a data structure called a Process Control Block (PCB), or Process
Descriptor in Linux, which performs the same function as a traveller's passport. The PCB contains the basic
information about the job including:
• What it is
• Where it is going
• How much of its processing has been completed
• Where it is stored
• How much it has "spent" in using resources
Process Identification: Each process is uniquely identified by the user's identification and a pointer connecting it to
its descriptor.
Process Status: This indicates the current status of the process; READY, RUNNING, BLOCKED, READY
SUSPEND, BLOCKED SUSPEND.
Process State: This contains all of the information needed to indicate the current state of the job.
Accounting: This contains information used mainly for billing purposes and for performance measurement. It
indicates what kind of resources the process has used and for how long.
Processor modes
Contemporary processors incorporate a mode bit to define the execution capability of a program in the processor.
This bit can be set to kernel mode or user mode. Kernel mode is also commonly referred to as supervisor mode,
monitor mode or ring 0. In kernel mode, the processor can execute every instruction in its hardware repertoire,
whereas in user mode, it can only execute a subset of the instructions. Instructions that can be executed only in
kernel mode are called kernel, privileged or protected instructions to distinguish them from the user mode
instructions. For example, I/O instructions are privileged. So, if an application program executes in user mode, it
cannot perform its own I/O. Instead, it must request the OS to perform I/O on its behalf. The system may logically
extend the mode bit to define areas of memory to be used when the processor is in kernel mode versus user mode. If
the mode bit is set to kernel mode, the process executing in the processor can access either the kernel or user
partition of the memory. However, if user mode is set, the process can reference only the user memory space. We
frequently refer to two classes of memory: user space and system space (or kernel, supervisor or protected space). In
general, the mode bit extends the operating system's protection rights. The mode bit is set by the user mode trap
instruction, also called a supervisor call instruction. This instruction sets the mode bit, and branches to a fixed
location in the system space. Since only system code is loaded in the system space, only system code can be invoked
via a trap. When the OS has completed the supervisor call, it resets the mode bit to user mode prior to the return.
The kernel concept
The parts of the OS critical to its correct operation execute in kernel mode, while other software (such as generic
system software) and all application programs execute in user mode. This fundamental distinction is usually the
irrefutable distinction between the operating system and other system software. The part of the system executing in
kernel supervisor state is called the kernel, or nucleus, of the operating system. The kernel operates as trusted
software, meaning that when it was designed and implemented, it was intended to implement protection mechanisms
that could not be covertly changed through the actions of untrusted software executing in user space. Extensions to
the OS execute in user mode, so the OS does not rely on the correctness of those parts of the system software for
correct operation of the OS. Hence, a fundamental design decision for any function to be incorporated into the OS is
whether it needs to be implemented in the kernel. If it is implemented in the kernel, it will execute in kernel
(supervisor) space, and have access to other parts of the kernel. It will also be trusted software by the other parts of
the kernel. If the function is implemented to execute in user mode, it will have no access to kernel data structures.
However, the advantage is that it will normally require very limited effort to invoke the function. While
kernel-implemented functions may be easy to implement, the trap mechanism and authentication at the time of the
call are usually relatively expensive. The kernel code runs fast, but there is a large performance overhead in the
actual call. This is a subtle, but important point.
Requesting system services
There are two techniques by which a program executing in user mode can request the kernel's services:
• System call
• Message passing
Operating systems are designed with one or the other of these two facilities, but not both. First, assume that a user
process wishes to invoke a particular target system function. For the system call approach, the user process uses the
trap instruction. The idea is that the system call should appear to be an ordinary procedure call to the application
program; the OS provides a library of user functions with names corresponding to each actual system call. Each of
these stub functions contains a trap to the OS function. When the application program calls the stub, it executes the
trap instruction, which switches the CPU to kernel mode, and then branches (indirectly through an OS table), to the
entry point of the function which is to be invoked. When the function completes, it switches the processor to user
mode and then returns control to the user process; thus simulating a normal procedure return.
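The stub-and-trap sequence can be observed from an ordinary program. In the Linux-specific C sketch below, getpid() is such a library stub, and syscall() issues the same trap explicitly; the trap instruction itself varies by architecture and is hidden inside the C library:

    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    int main(void) {
        /* Both lines trap into kernel mode and return like a normal call. */
        printf("via stub:    %ld\n", (long)getpid());
        printf("via syscall: %ld\n", (long)syscall(SYS_getpid));  /* Linux-specific */
        return 0;
    }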
In the message passing approach, the user process constructs a message, that describes the desired service. Then it
uses a trusted send function to pass the message to a trusted OS process. The send function serves the same purpose
as the trap; that is, it carefully checks the message, switches the processor to kernel mode, and then delivers the
message to a process that implements the target functions. Meanwhile, the user process waits for the result of the
service request with a message receive operation. When the OS process completes the operation, it sends a message
back to the user process.
The distinction between two approaches has important consequences regarding the relative independence of the OS
behavior, from the application process behavior, and the resulting performance. As a rule of thumb, operating systems
based on a system call interface can be made more efficient than those requiring messages to be exchanged between
distinct processes. This is the case, even though the system call must be implemented with a trap instruction; that is,
even though the trap is relatively expensive to perform, it is more efficient than the message passing approach, where
there are generally higher costs associated with process multiplexing, message formation and message copying. The
system call approach has the interesting property that there is not necessarily any OS process. Instead, a process
executing in user mode changes to kernel mode when it is executing kernel code, and switches back to user mode
when it returns from the OS call. If, on the other hand, the OS is designed as a set of separate processes, it is usually
easier to design it so that it gets control of the machine in special situations, than if the kernel is simply a collection
of functions executed by user processes in kernel mode. Even procedure-based operating systems usually find it
necessary to include at least a few system processes (called daemons in UNIX) to handle situations in which the
machine is otherwise idle, such as scheduling and handling the network.
Sources
• Operating System incorporating Windows and UNIX, Colin Ritchie. ISBN 0-8264-6416-5
• Operating Systems, William Stallings, Prentice Hall, (4th Edition, 2000)
• Multiprogramming, Process Description and Control
• Operating Systems - A Modern Perspective, Gary Nutt, Addison Wesley, (2nd Edition, 2001)
• Process Management Models, Scheduling, UNIX System V Release 4
• Modern Operating Systems, Andrew Tanenbaum, Prentice Hall, (2nd Edition, 2001)
• Operating System Concepts, Silberschatz, Galvin & Gagne, John Wiley & Sons, (6th Edition, 2003)
References
[1] http://sunnyeves.blogspot.com/2010/09/sneak-peek-into-linux-kernel-chapter-2.html
Context switch
A context switch is the computing process of storing and restoring the state (context) of a process so that execution
can be resumed from the same point at a later time. This enables multiple processes to share a single CPU. The
context switch is an essential feature of a multitasking operating system. Context switches are usually
computationally intensive, and much of the design of operating systems is aimed at optimizing the use of context switches. A
context switch can mean a register context switch, a task context switch, a thread context switch, or a process context
switch. What constitutes the context is determined by the processor and the operating system. Switching from one
process to another requires a certain amount of time for doing the administration - saving and loading registers and
memory maps, updating various tables and lists etc.
When to switch?
There are three potential triggers for a context switch:
Multitasking
Most commonly, within some scheduling scheme, one process needs to be switched out of the CPU so another
process can run. This context switch can be triggered by the process making itself unrunnable, such as by waiting for
an I/O or synchronization operation to complete. On a pre-emptive multitasking system, the scheduler may also
switch out processes which are still runnable. To prevent other processes from being starved of CPU time,
preemptive schedulers often configure a timer interrupt to fire when a process exceeds its time slice. This interrupt
ensures that the scheduler will gain control to perform a context switch.
Interrupt handling
Modern architectures are interrupt driven. This means that if the CPU requests data from a disk, for example, it does
not need to busy-wait until the read is over; it can issue the request and continue with some other execution. When
the read is over, the CPU can be interrupted and presented with the read. For interrupts, a program called an
interrupt handler is installed, and it is the interrupt handler that handles the interrupt from the disk.
When an interrupt occurs, the hardware automatically switches a part of the context (at least enough to allow the
handler to return to the interrupted code). The handler may save additional context, depending on details of the
particular hardware and software designs. Often only a minimal part of the context is changed in order to minimize
the amount of time spent handling the interrupt. The kernel does not spawn or schedule a special process to handle
interrupts, but instead the handler executes in the (often partial) context established at the beginning of interrupt
handling. Once interrupt servicing is complete, the context in effect before the interrupt occurred is restored so that
the interrupted process can resume execution in its proper state.
User and kernel mode switching
When a transition between user mode and kernel mode is required in an operating system, a context switch is not
necessary; a mode transition is not by itself a context switch. However, depending on the operating system, a context
switch may also take place at this time.
Context switch: steps
In a switch, the state of the first process must be saved somehow, so that, when the scheduler gets back to the
execution of the first process, it can restore this state and continue.
The state of the process includes all the registers that the process may be using, especially the program counter, plus
any other operating system specific data that may be necessary. This data is usually stored in a data structure called a
process control block (PCB), or switchframe.
In order to switch processes, the PCB for the first process must be created and saved. The PCBs are sometimes
stored upon a per-process stack in kernel memory (as opposed to the user-mode call stack), or there may be some
specific operating system defined data structure for this information.
Since the operating system has effectively suspended the execution of the first process, it can now load the PCB and
context of the second process. In doing so, the program counter from the PCB is loaded, and thus execution can
continue in the new process. New processes are chosen from a queue or queues. Process and thread priority can
influence which process continues execution, with processes of the highest priority checked first for ready threads to
execute.
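As a loose user-space analogy (not how a kernel actually switches processes), C's setjmp() and longjmp() save and restore a minimal execution context, including the program counter and stack pointer, much as the kernel saves and restores a PCB:

    #include <setjmp.h>
    #include <stdio.h>

    static jmp_buf saved_context;        /* plays the role of a tiny PCB */

    static void do_work(void) {
        puts("in do_work: jumping back to the saved context");
        longjmp(saved_context, 1);       /* "restore" and resume */
    }

    int main(void) {
        if (setjmp(saved_context) == 0) {   /* "save" the current context */
            puts("context saved; transferring control");
            do_work();
        } else {
            puts("execution resumed from the saved point");
        }
        return 0;
    }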
Software vs hardware context switching
Context switching can be performed primarily by software or hardware. Some processors, like the Intel 80386 and its
successors,[1] have hardware support for context switches, by making use of a special data segment designated the
Task State Segment or TSS. A task switch can be explicitly triggered with a CALL or JMP instruction targeted at a
TSS descriptor in the global descriptor table. It can occur implicitly when an interrupt or exception is triggered if
there's a task gate in the interrupt descriptor table. When a task switch occurs the CPU can automatically load the
new state from the TSS. As with other tasks performed in hardware, one would expect this to be rather fast; however,
mainstream operating systems, including Windows and Linux,[2] do not use this feature.
This is due to mainly two reasons:
1. hardware context switching does not save all the registers (only general purpose registers, not floating point
registers, although the TS bit is automatically turned on in the CR0 control register, resulting in a fault when
executing floating point instructions and giving the OS the opportunity to save and restore the floating point state
as needed).
2. associated performance issues, e.g., software context switching can be selective and store only those registers that
need storing, whereas hardware context switching stores nearly all registers whether they're required or not.
References
[1] http://www.linfo.org/context_switch.html
[2] Bovet, Daniel Pierre; Cesati, Marco (2006). Understanding the Linux Kernel, Third Edition (http://books.google.com/?id=h0lltXyJ8aIC&lpg=PA104&dq=Linux+hardware+TSS&pg=PA104#v=onepage&q=Linux+hardware+TSS). O'Reilly Media. p. 104. ISBN 978-0-596-00565-8. Retrieved 2009-11-23.
External links
• Context Switching (http://wiki.osdev.org/Context_Switching) - at OSDev.org
• Context Switch Definition (http://www.linfo.org/context_switch.html) - by The Linux Information Project
(LINFO)
• Context Switches (http://msdn.microsoft.com/en-us/library/ms682105(VS.85).aspx) - from the Microsoft
Developer Network (MSDN)
• General Architecture and Design - Interrupt Handling (http://www.freebsd.org/doc/en/books/arch-handbook/smp-design.html) at FreeBSD.org
• Measuring Basic Linux Operations (http://people.virginia.edu/~chg5w/page3/assets/MeasuringUnix.pdf)
Scheduling (computing)
In computer science, scheduling is the method by which threads, processes or data flows are given access to system
resources (e.g. processor time, communications bandwidth). This is usually done to load balance a system effectively
or achieve a target quality of service. The need for a scheduling algorithm arises from the requirement for most
modern systems to perform multitasking (execute more than one process at a time) and multiplexing (transmit
multiple flows simultaneously).
The scheduler is concerned mainly with:
• Throughput - The total number of processes that complete their execution per time unit.
• Latency, specifically:
• Turnaround time - total time between submission of a process and its completion.
• Response time - amount of time it takes from when a request was submitted until the first response is
produced.
• Fairness / Waiting time - fairness means giving equal CPU time to each process (or, more generally, times
appropriate to each process' priority); waiting time is the time for which a process remains in the ready queue.
In practice, these goals often conflict (e.g. throughput versus latency), thus a scheduler will implement a suitable
compromise. Preference is given to any one of the above mentioned concerns depending upon the user's needs and
objectives.
In real-time environments, such as embedded systems for automatic control in industry (for example robotics), the
scheduler also must ensure that processes can meet deadlines; this is crucial for keeping the system stable. Scheduled
tasks can also be distributed to remote devices across a network and managed through an administrative back end.
Types of operating system schedulers
Operating systems may feature up to three distinct types of scheduler: a long-term scheduler (also known as an
admission scheduler or high-level scheduler), a mid-term or medium-term scheduler and a short-term scheduler. The
names suggest the relative frequency with which these functions are performed. The scheduler is an operating system
module that selects the next jobs to be admitted into the system and the next process to run.
Long-term scheduling
The long-term, or admission scheduler, decides which jobs or processes are to be admitted to the ready queue (in the
Main Memory); that is, when an attempt is made to execute a program, its admission to the set of currently executing
processes is either authorized or delayed by the long-term scheduler. Thus, this scheduler dictates what processes are
to run on a system, the degree of concurrency to be supported at any one time - i.e. whether a high or low number
of processes are to be executed concurrently - and how the split between I/O-intensive and CPU-intensive
processes is to be handled. In modern operating systems, this is used to make sure that real-time processes
get enough CPU time to finish their tasks. Without proper real-time scheduling, modern GUIs would seem
sluggish. The long-term queue exists on the hard disk or in virtual memory.
Long-term scheduling is also important in large-scale systems such as batch processing systems, computer clusters,
supercomputers and render farms. In these cases, special purpose job scheduler software is typically used to assist
these functions, in addition to any underlying admission scheduling support in the operating system.
Medium-term scheduling
The medium-term scheduler temporarily removes processes from main memory and places them on secondary
memory (such as a disk drive) or vice versa. This is commonly referred to as "swapping out" or "swapping in" (also
incorrectly as "paging out" or "paging in"). The medium-term scheduler may decide to swap out a process which has
not been active for some time, or a process which has a low priority, or a process which is page faulting frequently,
or a process which is taking up a large amount of memory in order to free up main memory for other processes,
swapping the process back in later when more memory is available, or when the process has been unblocked and is
no longer waiting for a resource. [Stallings, 396] [Stallings, 370]
In many systems today (those that support mapping virtual address space to secondary storage other than the swap
file), the medium-term scheduler may actually perform the role of the long-term scheduler, by treating binaries as
"swapped out processes" upon their execution. In this way, when a segment of the binary is required it can be
swapped in on demand, or "lazy loaded". [Stallings, 394]
Short-term scheduling
The short-term scheduler (also known as the CPU scheduler) decides which of the ready, in-memory processes are to
be executed (allocated a CPU) next following a clock interrupt, an I/O interrupt, an operating system call or another
form of signal. Thus the short-term scheduler makes scheduling decisions much more frequently than the long-term
or mid-term schedulers - a scheduling decision will at a minimum have to be made after every time slice, and these
are very short. This scheduler can be preemptive, implying that it is capable of forcibly removing processes from a
CPU when it decides to allocate that CPU to another process, or non-preemptive (also known as "voluntary" or
"co-operative"), in which case the scheduler is unable to "force" processes off the CPU.
A preemptive scheduler relies upon a programmable interval timer which invokes an interrupt handler that runs in kernel mode and implements the scheduling function.
Dispatcher
Another component involved in the CPU-scheduling function is the dispatcher. The dispatcher is the module that
gives control of the CPU to the process selected by the short-term scheduler. This function involves the following:
• Switching context
• Switching to user mode
• Jumping to the proper location in the user program to restart that program.
The dispatcher analyses the value of the program counter and fetches instructions and data into registers.
The dispatcher should be as fast as possible, since it is invoked during every process switch. During the context
switches, the processor is idle for a fraction of time. Hence, unnecessary context switches should be avoided. The
time it takes for the dispatcher to stop one process and start another running is known as the dispatch latency.
[Galvin, 155].
Scheduling disciplines
Scheduling disciplines are algorithms used for distributing resources among parties which simultaneously and
asynchronously request them. Scheduling disciplines are used in routers (to handle packet traffic) as well as in
operating systems (to share CPU time among both threads and processes), disk drives (I/O scheduling), printers
(print spooler), most embedded systems, etc.
The main purposes of scheduling algorithms are to minimize resource starvation and to ensure fairness amongst the
parties utilizing the resources. Scheduling deals with the problem of deciding which of the outstanding requests is to
be allocated resources. There are many different scheduling algorithms. In this section, we introduce several of them.
In packet-switched computer networks and other statistical multiplexing, the notion of a scheduling algorithm is
used as an alternative to first-come first-served queuing of data packets.
The simplest best-effort scheduling algorithms are round-robin, fair queuing (a max-min fair scheduling algorithm),
proportionally fair scheduling and maximum throughput. If differentiated or guaranteed quality of service is offered,
as opposed to best-effort communication, weighted fair queuing may be utilized.
In advanced packet radio wireless networks such as HSDPA (High-Speed Downlink Packet Access) 3.5G cellular systems, channel-dependent scheduling may be used to take advantage of channel state information. If the channel conditions are favourable, the throughput and system spectral efficiency may be increased. In even more advanced systems such as LTE, scheduling is combined with channel-dependent packet-by-packet dynamic channel allocation, or with the assignment of OFDMA multi-carriers or other frequency-domain equalization components to the users that can best utilize them.
First in first out
First in, first out (FIFO), also known as First Come, First Served (FCFS), is the simplest scheduling algorithm: it simply queues processes in the order that they arrive in the ready queue.
• Since context switches only occur upon process termination, and no reorganization of the process queue is
required, scheduling overhead is minimal.
• Throughput can be low, since long processes can hold the CPU.
• Turnaround time, waiting time and response time can be high for the same reason.
• No prioritization occurs, thus this system has trouble meeting process deadlines.
• The lack of prioritization means that as long as every process eventually completes, there is no starvation. In an
environment where some processes might not complete, there can be starvation.
• It is based on a simple queue.
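A minimal sketch of a FIFO ready queue in C (the types and names are invented for illustration, not taken from any real kernel):

#include <stddef.h>

/* Illustrative process descriptor with an intrusive queue link. */
struct proc {
    int pid;
    struct proc *next;
};

/* FIFO ready queue: enqueue at the tail, dispatch from the head. */
static struct proc *head = NULL, *tail = NULL;

void enqueue(struct proc *p) {        /* called when a process becomes ready */
    p->next = NULL;
    if (tail != NULL) tail->next = p; else head = p;
    tail = p;
}

struct proc *dispatch(void) {         /* pick the next process to run */
    struct proc *p = head;
    if (p != NULL) {
        head = p->next;
        if (head == NULL) tail = NULL;
    }
    return p;                         /* runs until it terminates or blocks */
}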
Shortest remaining time
Similar to Shortest Job First (SJF). With this strategy the scheduler arranges processes with the least estimated
processing time remaining to be next in the queue. This requires advanced knowledge or estimations about the time
required for a process to complete.
• If a shorter process arrives during another process' execution, the currently running process may be interrupted
(known as preemption), dividing that process into two separate computing blocks. This creates excess overhead
through additional context switching. The scheduler must also place each incoming process into a specific place
in the queue, creating additional overhead.
• This algorithm is designed for maximum throughput in most scenarios.
• Waiting time and response time increase as the process's computational requirements increase. Since turnaround time is based on waiting time plus processing time, longer processes are significantly affected by this. Overall waiting time is smaller than under FIFO, however, since no process has to wait for the termination of the longest process.
• No particular attention is given to deadlines, the programmer can only attempt to make processes with deadlines
as short as possible.
• Starvation is possible, especially in a busy system with many small processes being run.
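A sketch of the selection step in C, assuming each process carries an estimate of its remaining service time (the fields are illustrative):

#include <stddef.h>

struct job { int pid; double remaining; };  /* estimated service time left */

/* Pick the ready job with the least estimated remaining time; if it beats
 * the currently running job, that job is preempted. */
struct job *shortest_remaining(struct job *ready, size_t n) {
    struct job *best = NULL;
    for (size_t i = 0; i < n; i++)
        if (best == NULL || ready[i].remaining < best->remaining)
            best = &ready[i];
    return best;
}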
Fixed priority pre-emptive scheduling
The OS assigns a fixed priority rank to every process, and the scheduler arranges the processes in the ready queue in
order of their priority. Lower priority processes get interrupted by incoming higher priority processes.
• Overhead is not minimal, nor is it significant.
• FPPS has no particular advantage in terms of throughput over FIFO scheduling.
• Waiting time and response time depend on the priority of the process. Higher priority processes have smaller
waiting and response times.
• Deadlines can be met by giving processes with deadlines a higher priority.
• Starvation of lower priority processes is possible with large amounts of high priority processes queuing for CPU
time.
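The selection rule itself is simple; a hedged sketch with invented field names:

#include <stddef.h>

struct task { int pid; int priority; };  /* higher value = higher priority */

/* Scan the ready set for the highest fixed priority; an arriving task with
 * a higher priority preempts the currently running one. */
struct task *highest_priority(struct task *ready, size_t n) {
    struct task *best = NULL;
    for (size_t i = 0; i < n; i++)
        if (best == NULL || ready[i].priority > best->priority)
            best = &ready[i];
    return best;
}

A production scheduler would keep one queue per priority level instead of scanning, making the selection O(1).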
Round-robin scheduling
The scheduler assigns a fixed time unit per process, and cycles through them.
• RR scheduling involves extensive overhead, especially with a small time unit.
• Balanced throughput between FCFS and SJF, shorter jobs are completed faster than in FCFS and longer processes
are completed faster than in SJF.
• Poor average response time, waiting time is dependent on number of processes, and not average process length.
• Because of high waiting times, deadlines are rarely met in a pure RR system.
• Starvation can never occur, since no priority is given. Order of time unit allocation is based upon process arrival
time, similar to FCFS.
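A sketch of the rotation in C, assuming a timer interrupt marks the end of each time unit (all names are illustrative):

/* Ready processes kept on a circular list; each timer tick ends the current
 * time unit and hands the CPU to the next process in arrival order. */
struct rr_proc { int pid; struct rr_proc *next; };

static struct rr_proc *current;   /* the process now holding the CPU */

void timer_tick(void) {
    if (current != NULL)
        current = current->next;  /* preempt and rotate */
    /* a real kernel would context-switch and rearm the interval timer here */
}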
Multilevel queue scheduling
This is used for situations in which processes are easily divided into different groups. For example, a common
division is made between foreground (interactive) processes and background (batch) processes. These two types of
processes have different response-time requirements and so may have different scheduling needs. It is very useful for
shared memory problems.
Overview
Scheduling algorithm          CPU overhead   Throughput   Turnaround time   Response time
First In First Out            Low            Low          High              Low
Shortest Job First            Medium         High         Medium            Medium
Priority based scheduling     Medium         Low          High              High
Round-robin scheduling        High           Medium       Medium            High
Multilevel Queue scheduling   High           High         Medium            Medium
How to choose a scheduling algorithm
When designing an operating system, a programmer must consider which scheduling algorithm will perform best for
the use the system is going to see. There is no universal "best" scheduling algorithm, and many operating systems use extended versions or combinations of the scheduling algorithms above. For example, Windows NT/XP/Vista uses a multilevel feedback queue, a combination of fixed priority preemptive scheduling, round-robin, and first in first out. In this system, processes can dynamically increase or decrease in priority depending on whether they have been serviced recently or have been waiting extensively. Every priority level is represented by its own queue, with round-robin
scheduling amongst the high priority processes and FIFO among the lower ones. In this sense, response time is short
for most processes, and short but critical system processes get completed very quickly. Since processes can only use
one time unit of the round robin in the highest priority queue, starvation can be a problem for longer high priority
processes.
Operating system process scheduler implementations
The algorithm used may be as simple as round-robin, in which each process is given equal time (for instance 1 ms, usually between 1 ms and 100 ms) in a cycling list. So, process A executes for 1 ms, then process B, then process C, then back to process A.
More advanced algorithms take into account process priority, or the importance of the process. This allows some
processes to use more time than other processes. The kernel always uses whatever resources it needs to ensure proper
functioning of the system, and so can be said to have infinite priority. In SMP (symmetric multiprocessing) systems,
processor affinity is considered to increase overall system performance, even if it may cause a process itself to run
more slowly. This generally improves performance by reducing cache thrashing.
Windows
Very early MS-DOS and Microsoft Windows systems were non-multitasking, and as such did not feature a
scheduler. Windows 3.1x used a non-preemptive scheduler, meaning that it did not interrupt programs. It relied on
the program to end or tell the OS that it didn't need the processor so that it could move on to another process. This is
usually called cooperative multitasking. Windows 95 introduced a rudimentary preemptive scheduler; however, for legacy support it opted to let 16-bit applications run without preemption.[1]
Windows NT-based operating systems use a multilevel feedback queue. 32 priority levels are defined, 0 through 31, with priorities 0 through 15 being "normal" priorities and priorities 16 through 31 being soft real-time priorities, requiring privileges to assign. Priority 0 is reserved for the operating system. Users can select 5 of these priorities to assign
to a running application from the Task Manager application, or through thread management APIs. The kernel may
change the priority level of a thread depending on its I/O and CPU usage and whether it is interactive (i.e. accepts
and responds to input from humans), raising the priority of interactive and I/O bounded processes and lowering that
of CPU bound processes, to increase the responsiveness of interactive applications.[2] The scheduler was modified in
Windows Vista to use the cycle counter register of modern processors to keep track of exactly how many CPU
cycles a thread has executed, rather than just using an interval-timer interrupt routine.[3] Vista also uses a priority
scheduler for the I/O queue so that disk defragmenters and other such programs don't interfere with foreground
operations.[4]
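For instance, an application can request a higher base priority for one of its threads through the Win32 API; a minimal sketch (error handling omitted):

#include <windows.h>

int main(void) {
    /* Raise the priority class of the whole process... */
    SetPriorityClass(GetCurrentProcess(), ABOVE_NORMAL_PRIORITY_CLASS);
    /* ...and the relative priority of the current thread within that class. */
    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST);
    /* The kernel combines class and relative priority into one of the 32
     * levels described above, and may still boost or decay it dynamically. */
    return 0;
}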
Mac OS
Mac OS 9 uses cooperative scheduling for threads, where one process controls multiple cooperative threads, and also
provides preemptive scheduling for MP tasks. The kernel schedules MP tasks using a preemptive scheduling
algorithm. All Process Manager processes run within a special MP task, called the "blue task". Those processes are
scheduled cooperatively, using a round-robin scheduling algorithm; a process yields control of the processor to
another process by explicitly calling a blocking function such as WaitNextEvent. Each process has its own copy
of the Thread Manager that schedules that process's threads cooperatively; a thread yields control of the processor to
another thread by calling YieldToAnyThread or YieldToThread.[5]
Mac OS X uses a multilevel feedback queue, with four priority bands for threads - normal, system high priority,
kernel mode only, and real-time.[6] Threads are scheduled preemptively; Mac OS X also supports cooperatively
scheduled threads in its implementation of the Thread Manager in Carbon.[5]
AIX
In AIX Version 4 there are three possible values for thread scheduling policy:
• FIFO: Once a thread with this policy is scheduled, it runs to completion unless it is blocked, it voluntarily yields
control of the CPU, or a higher-priority thread becomes dispatchable. Only fixed-priority threads can have a FIFO
scheduling policy.
• RR: This is similar to the AIX Version 3 scheduler round-robin scheme based on 10ms time slices. When a RR
thread has control at the end of the time slice, it moves to the tail of the queue of dispatchable threads of its
priority. Only fixed-priority threads can have a RR scheduling policy.
• OTHER: This policy is defined by POSIX1003.4a as implementation-defined. In AIX Version 4, this policy is
defined to be equivalent to RR, except that it applies to threads with non-fixed priority. The recalculation of the
running thread's priority value at each clock interrupt means that a thread may lose control because its priority
value has risen above that of another dispatchable thread. This is the AIX Version 3 behavior.
Threads are primarily of interest for applications that currently consist of several asynchronous processes. These
applications might impose a lighter load on the system if converted to a multithreaded structure.
AIX 5 implements the following scheduling policies: FIFO, round robin, and a fair round robin. The FIFO policy has
three different implementations: FIFO, FIFO2, and FIFO3. The round robin policy is named SCHED_RR in AIX,
and the fair round robin is called SCHED_OTHER. This link provides additional information on AIX 5 scheduling:
http://www.ibm.com/developerworks/aix/library/au-aix5_cpu/index.html#N100F6 .
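These policies correspond to the POSIX thread-scheduling interfaces, so a thread's policy can be set portably; a hedged sketch (the real-time policies typically require elevated privileges):

#include <pthread.h>
#include <sched.h>

/* Request round-robin scheduling at the lowest real-time priority for the
 * calling thread; on AIX this selects the RR policy described above. */
int make_round_robin(void) {
    struct sched_param sp;
    sp.sched_priority = sched_get_priority_min(SCHED_RR);
    return pthread_setschedparam(pthread_self(), SCHED_RR, &sp);
}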
Linux
Linux 2.4
In Linux 2.4, an O(n) scheduler with a multilevel feedback queue with priority levels ranging from 0-140 was used.
0-99 are reserved for real-time tasks and 100-140 are considered nice task levels. For real-time tasks, the time
quantum for switching processes was approximately 200 ms, and for nice tasks approximately 10 ms. The scheduler
ran through the run queue of all ready processes, letting the highest priority processes go first and run through their time slices, after which they were placed in an expired queue. When the active queue became empty, the expired queue became the active queue and vice versa.
However, some Enterprise Linux distributions such as SUSE Linux Enterprise Server replaced this scheduler with a
backport of the O(1) scheduler (which was maintained by Alan Cox in his Linux 2.4-ac Kernel series) to the Linux
2.4 kernel used by the distribution.
Linux 2.6.0 to Linux 2.6.22
From versions 2.6 to 2.6.22, the kernel used an O(1) scheduler developed by Ingo Molnár and many other kernel developers during the Linux 2.5 development. For many kernel versions in this time frame, Con Kolivas developed patch sets which improved interactivity with this scheduler or even replaced it with his own schedulers.
Since Linux 2.6.23
Con Kolivas's work, most significantly his implementation of "fair scheduling" named "Rotating Staircase
Deadline", inspired Ingo Molnˆr to develop the Completely Fair Scheduler as a replacement for the earlier O(1)
scheduler, crediting Kolivas in his announcement.[7]
The Completely Fair Scheduler (CFS) uses a well-studied, classic scheduling algorithm called fair queuing originally
invented for packet networks. Fair queuing had been previously applied to CPU scheduling under the name stride
scheduling.
The fair queuing CFS scheduler has a scheduling complexity of O(log N), where N is the number of tasks in the
runqueue. Choosing a task can be done in constant time, but reinserting a task after it has run requires O(log N)
operations, because the run queue is implemented as a red-black tree.
CFS is the first implementation of a fair queuing process scheduler widely used in a general-purpose operating
system.[8]
The Brain Fuck Scheduler (BFS) is an alternative to the CFS.
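Under CFS, ordinary threads run with the SCHED_OTHER policy and their CPU share is shaped by the nice value, which CFS converts into a scheduling weight; a small sketch using the standard interfaces:

#include <stdio.h>
#include <sys/resource.h>

int main(void) {
    /* Nice values range from -20 (most favoured) to 19 (least favoured). */
    if (setpriority(PRIO_PROCESS, 0, 10) != 0)
        perror("setpriority");
    printf("current nice value: %d\n", getpriority(PRIO_PROCESS, 0));
    return 0;
}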
FreeBSD
FreeBSD uses a multilevel feedback queue with priorities ranging from 0-255. 0-63 are reserved for interrupts,
64-127 for the top half of the kernel, 128-159 for real-time user threads, 160-223 for time-shared user threads, and
224-255 for idle user threads. Also, like Linux, it uses the active queue setup, but it also has an idle queue.[9]
NetBSD
NetBSD uses a multilevel feedback queue with priorities ranging from 0-223. 0-63 are reserved for time-shared
threads (default, SCHED_OTHER policy), 64-95 for user threads which entered kernel space, 96-128 for kernel
threads, 128-191 for user real-time threads (SCHED_FIFO and SCHED_RR policies), and 192-223 for software
interrupts.
Solaris
Solaris uses a multilevel feedback queue with priorities ranging from 0-169. 0-59 are reserved for time-shared
threads, 60-99 for system threads, 100-159 for real-time threads, and 160-169 for low priority interrupts. Unlike
Linux, when a process is done using its time quantum, it's given a new priority and put back in the queue.
Summary
Operating System                                    Preemption   Algorithm
Amiga OS                                            Yes          Prioritized Round-robin scheduling
FreeBSD                                             Yes          Multilevel feedback queue
Linux pre-2.6                                       Yes          Multilevel feedback queue
Linux 2.6-2.6.23                                    Yes          O(1) scheduler
Linux post-2.6.23                                   Yes          Completely Fair Scheduler
Mac OS pre-9                                        None         Cooperative Scheduler
Mac OS 9                                            Some         Preemptive for MP tasks, Cooperative Scheduler for processes and threads
Mac OS X                                            Yes          Multilevel feedback queue
NetBSD                                              Yes          Multilevel feedback queue
Solaris                                             Yes          Multilevel feedback queue
Windows 3.1x                                        None         Cooperative Scheduler
Windows 95, 98, Me                                  Half         Preemptive for 32-bit processes, Cooperative Scheduler for 16-bit processes
Windows NT (including 2000, XP, Vista, 7, Server)   Yes          Multilevel feedback queue
References
[1] Early Windows (http://web.archive.org/web/*/www.jgcampbell.com/caos/html/node13.html)
[2] Sriram Krishnan. "A Tale of Two Schedulers Windows NT and Windows CE" (http://sriramk.com/schedulers.html).
[3] Inside the Windows Vista Kernel: Part 1 (http://technet.microsoft.com/en-us/magazine/cc162494.aspx), Microsoft Technet.
[4] "Vista Kernel Improvements" (http://blog.gabefrost.com/?p=25).
[5] "Technical Note TN2028 - Threading Architectures" (http://developer.apple.com/technotes/tn/tn2028.html).
[6] "Mach Scheduling and Thread Interfaces" (http://developer.apple.com/mac/library/documentation/Darwin/Conceptual/KernelProgramming/scheduler/scheduler.html).
[7] Molnár, Ingo (2007-04-13). "[patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]" (http://lwn.net/Articles/230501/). linux-kernel mailing list.
[8] Efficient and Scalable Multiprocessor Fair Scheduling Using Distributed Weighted Round-Robin (http://happyli.org/tongli/papers/dwrr.pdf)
[9] "Comparison of Solaris, Linux, and FreeBSD Kernels" (http://cn.opensolaris.org/files/solaris_linux_bsd_cmp.pdf).
• Błażewicz, Jacek; Ecker, K.H.; Pesch, E.; Schmidt, G.; Weglarz, J. (2001). Scheduling computer and manufacturing processes (2 ed.). Berlin [u.a.]: Springer. ISBN 3-540-41931-4.
• Stallings, William (2004). Operating Systems Internals and Design Principles (fifth international edition). Prentice Hall. ISBN 0-13-147954-7.
• Stallings, William (2004). Operating Systems Internals and Design Principles (fourth edition). Prentice Hall. ISBN 0-13-031999-6.
• Information on the Linux 2.6 O(1)-scheduler (http://joshaas.net/linux/)
Further reading
• Brief discussion of Job Scheduling algorithms (http://www.cs.sunysb.edu/~algorith/files/scheduling.shtml)
• Understanding the Linux Kernel: Chapter 10 Process Scheduling (http://www.oreilly.com/catalog/linuxkernel/chapter/ch10.html)
• Kerneltrap: Linux kernel scheduler articles (http://kerneltrap.org/scheduler)
• AIX CPU monitoring and tuning (http://www.ibm.com/developerworks/aix/library/au-aix5_cpu/index.html#N100F6)
• Josh Aas' introduction to the Linux 2.6.8.1 CPU scheduler implementation (http://joshaas.net/linux/)
• Peter Brucker, Sigrid Knust. Complexity results for scheduling problems (http://www.mathematik.uni-osnabrueck.de/research/OR/class/)
• TORSCHE Scheduling Toolbox for Matlab (http://rtime.felk.cvut.cz/scheduling-toolbox) is a toolbox of scheduling and graph algorithms.
3. I/O
Input/output
In computing, input/output or I/O is the communication between an information processing system (such as a
computer) and the outside world, possibly a human or another information processing system. Inputs are the signals
or data received by the system, and outputs are the signals or data sent from it. The term can also be used as part of
an action; to "perform I/O" is to perform an input or output operation. I/O devices are used by a person (or other
system) to communicate with a computer. For instance, a keyboard or a mouse may be an input device for a
computer, while monitors and printers are considered output devices for a computer. Devices for communication
between computers, such as modems and network cards, typically serve for both input and output.
Note that the designation of a device as either input or output depends on the perspective. Mice and keyboards take as input the physical movement that the human user outputs and convert it into signals that a computer can understand. The output from these devices is input for the computer. Similarly, printers and monitors take as input signals that a computer outputs. They then convert these signals into representations that human users can see or read. For a human user, the process of reading or seeing these representations is receiving input. These interactions between computers and humans are studied in a field called human-computer interaction.
In computer architecture, the combination of the CPU and main memory (i.e. memory that the CPU can read and
write to directly, with individual instructions) is considered the brain of a computer, and from that point of view any
transfer of information from or to that combination, for example to or from a disk drive, is considered I/O. The CPU
and its supporting circuitry provide memory-mapped I/O that is used in low-level computer programming, such as
the implementation of device drivers. An I/O algorithm is one designed to exploit locality and perform efficiently
when data reside on secondary storage, such as a disk drive.
Interface
An I/O interface is required whenever the I/O device is driven by the processor. The interface must have necessary
logic to interpret the device address generated by the processor. Handshaking should be implemented by the
interface using appropriate commands (like BUSY, READY, and WAIT), and the processor can communicate with
an I/O device through the interface. If different data formats are being exchanged, the interface must be able to
convert serial data to parallel form and vice-versa. There must be provision for generating interrupts and the
corresponding type numbers for further processing by the processor if required.
A computer that uses memory-mapped I/O accesses hardware by reading and writing to specific memory locations, using the same assembly language instructions that the computer would normally use to access memory.
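In C, such memory-mapped device registers are conventionally accessed through volatile pointers; a bare-metal sketch (the base address and register layout are invented for illustration):

#include <stdint.h>

#define UART_BASE 0x10000000u   /* hypothetical device address */

typedef struct {
    volatile uint32_t data;     /* write: byte to transmit (assumed layout) */
    volatile uint32_t status;   /* bit 0: transmitter ready (assumed layout) */
} uart_regs;

static void uart_putc(char c) {
    uart_regs *uart = (uart_regs *)UART_BASE;
    while ((uart->status & 1u) == 0)   /* busy-wait until the device is ready */
        ;
    uart->data = (uint32_t)c;          /* an ordinary store performs the I/O */
}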
Higher-level implementation
Higher-level operating system and programming facilities employ separate, more abstract I/O concepts and
primitives. For example, most operating systems provide application programs with the concept of files. The C and
C++ programming languages, and operating systems in the Unix family, traditionally abstract files and devices as
streams, which can be read or written, or sometimes both. The C standard library provides functions for
manipulating streams for input and output.
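For example, the same stream calls work whether the name refers to a regular file or to a device node; a minimal sketch (the file name is illustrative):

#include <stdio.h>

int main(void) {
    FILE *f = fopen("example.txt", "w");
    if (f == NULL) { perror("fopen"); return 1; }
    fputs("hello, stream\n", f);   /* buffered write through the stream */
    fclose(f);                     /* flushes and releases the stream */
    return 0;
}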
In the context of the ALGOL 68 programming language, the input and output facilities are collectively referred to as
transput. The ALGOL 68 transput library recognizes the following standard files/devices: stand in, stand out,
stand error and stand back.
An alternative to special primitive functions is the I/O monad, which permits programs to just describe I/O, and the
actions are carried out outside the program. This is notable because the I/O functions would introduce side-effects to
any programming language, but this allows purely functional programming to be practical.
Addressing mode
There are many ways through which data can be read or stored in the memory. Each method is an addressing mode,
and has its own advantages and limitations.
There are many types of addressing modes, such as direct addressing, indirect addressing, immediate addressing, index addressing, based addressing, based-index addressing, implied addressing, etc.
Direct addressing
In this mode, the address of the data is part of the instruction itself. When the processor interprets the instruction, it gets the memory address from which the required information can be read or written. For example:[1]
MOV register, [address] ; to read
MOV [address], register ; to write
; similarly
IN register, [address] ; to read as input
OUT [address], register ; to write as output
Here the address operand points to a memory location which holds the data and copies it into/from the specified
register. A pair of brackets is a dereference operator.
Indirect addressing
In contrast to the above example, the address can instead be stored in a register; the instruction then names the register that holds the address. To fetch the data, the instruction is interpreted, the appropriate register is selected, and the value of the register is used to address the memory location from which the data is read or to which it is written. This addressing method has an advantage over the direct mode in that the register value is changeable, so the appropriate memory location can be selected dynamically.
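The same distinction is visible from C, where a pointer dereference compiles down to an indirect access (a small illustrative sketch):

#include <stdio.h>

int main(void) {
    int data = 42;
    int *addr = &data;     /* 'addr' plays the role of the address register */
    *addr = 7;             /* indirect write: the address comes from 'addr' */
    printf("%d\n", data);  /* direct access to 'data' prints 7 */
    return 0;
}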
Port-mapped I/O
Port-mapped I/O, unlike memory-mapped I/O, usually requires the use of instructions which are specifically designed to perform I/O operations (for example, the IN and OUT instructions on x86).
References
[1] LINUX assembly language programming (http://books.google.com/books?id=pbB8Z1ewwEgC&lpg=PA67&ots=pq2zZDeqdf&dq=dereference operator assembly&hl=th&pg=PA66#v=onepage&q&f=false)
Device driver
In computing, a device driver is a computer program that operates or controls a particular type of device that is
attached to a computer.[1] A driver typically communicates with the device through the computer bus or
communications subsystem to which the hardware connects. When a calling program invokes a routine in the driver,
the driver issues commands to the device. Once the device sends data back to the driver, the driver may invoke
routines in the original calling program. Drivers are hardware-dependent and operating-system-specific. They
usually provide the interrupt handling required for any necessary asynchronous time-dependent hardware
interface.[2]
Purpose
A device driver simplifies programming by acting as translator between a hardware device and the applications or
operating systems that use it.[1] Programmers can write the higher-level application code independently of whatever
specific hardware the end-user is using. Physical layers communicate with specific device instances. For example, a
serial port needs to handle standard communication protocols such as XON/XOFF that are common for all serial port
hardware. This would be managed by a serial port logical layer. However, the physical layer needs to communicate
with a particular serial port chip. 16550 UART hardware differs from PL-011. The physical layer addresses these
chip-specific variations. Conventionally, OS requests go to the logical layer first. In turn, the logical layer calls upon
the physical layer to implement OS requests in terms understandable by the hardware. Conversely, when a hardware
device needs to respond to the OS, it uses the physical layer to speak to the logical layer.
In Linux environments, programmers can build device drivers either as parts of the kernel or separately as loadable
modules. Makedev includes a list of the devices in Linux: ttyS (terminal), lp (parallel port), hd (disk), loop (loopback
disk device), sound (these include mixer, sequencer, dsp, and audio)...[3]
The Microsoft Windows .sys files and Linux .ko modules contain loadable device drivers. The advantage of loadable
device drivers is that they can be loaded only when necessary and then unloaded, thus saving kernel memory.
Development
Writing a device driver requires an in-depth understanding of how the hardware and the software of a given platform
function. Drivers operate in a highly privileged environment and can cause disaster if they get things wrong. In
contrast, most user-level software on modern operating systems can be stopped without greatly affecting the rest of
the system. Even drivers executing in user mode can crash a system if the device is erroneously programmed. These
factors make it more difficult and dangerous to diagnose problems.[4]
The task of writing drivers thus usually falls to software engineers or computer engineers who work for
hardware-development companies. This is because they have better information than most outsiders about the design
of their hardware. Moreover, it was traditionally considered in the hardware manufacturer's interest to guarantee that
their clients can use their hardware in an optimum way. Typically, the logical device driver (LDD) is written by the
operating system vendor, while the physical device driver (PDD) is implemented by the device vendor. But in recent
years non-vendors have written numerous device drivers, mainly for use with free and open source operating
systems. In such cases, it is important that the hardware manufacturer provides information on how the device
communicates. Although this information can instead be learned by reverse engineering, this is much more difficult
with hardware than it is with software.
Microsoft has attempted to reduce system instability due to poorly written device drivers by creating a new
framework for driver development, called Windows Driver Foundation (WDF). This includes User-Mode Driver Framework (UMDF) that encourages development of certain types of drivers (primarily those that implement a message-based protocol for communicating with their devices) as user-mode drivers. If such drivers malfunction,
they do not cause system instability. The Kernel-Mode Driver Framework (KMDF) model continues to allow
development of kernel-mode device drivers, but attempts to provide standard implementations of functions that are
known to cause problems, including cancellation of I/O operations, power management, and plug and play device
support.
Apple has an open-source framework for developing drivers on Mac OS X called the I/O Kit.
Kernel mode vs. user mode
Device drivers, particularly on modern Microsoft Windows platforms, can run in kernel mode (Ring 0 on x86 CPUs) or in user mode (Ring 3 on x86 CPUs).[5] The primary benefit of running a driver in user mode is improved stability, since a poorly written user-mode device driver cannot crash the system by overwriting kernel memory.[6] On the other hand, user/kernel-mode transitions usually impose a considerable performance overhead, making user-mode drivers unsuitable for low-latency and high-throughput workloads.
Kernel space can be accessed by user modules only through the use of system calls. End-user programs like the UNIX shell or other GUI-based applications are part of the user space. These applications interact with hardware through kernel-supported functions.
Applications
Because of the diversity of modern hardware and operating systems, drivers operate in many different
environments.[7] Drivers may interface with:
• printers
• video adapters
• Network cards
• Sound cards
• Local buses of various sorts (in particular, for bus mastering on modern systems)
• Low-bandwidth I/O buses of various sorts (for pointing devices such as mice, keyboards, USB, etc.)
• Computer storage devices such as hard disk, CD-ROM, and floppy disk buses (ATA, SATA, SCSI)
• Implementing support for different file systems
• Image scanners
• Digital cameras
Common levels of abstraction for device drivers include:
• For hardware:
  • Interfacing directly
  • Writing to or reading from a device control register
  • Using some higher-level interface (e.g. Video BIOS)
  • Using another lower-level device driver (e.g. file system drivers using disk drivers)
  • Simulating work with hardware, while doing something entirely different
• For software:
  • Allowing the operating system direct access to hardware resources
  • Implementing only primitives
  • Implementing an interface for non-driver software (e.g., TWAIN)
  • Implementing a language, sometimes quite high-level (e.g., PostScript)
Choosing and installing the correct device drivers for given hardware is often a key component of computer system
configuration.
Virtual device drivers
Virtual device drivers represent a particular variant of device drivers. They are used to emulate a hardware device,
particularly in virtualization environments, for example when a DOS program is run on a Microsoft Windows
computer or when a guest operating system is run on, for example, a Xen host. Instead of enabling the guest
operating system to dialog with hardware, virtual device drivers take the opposite role and emulate a piece of
hardware, so that the guest operating system and its drivers running inside a virtual machine can have the illusion of
accessing real hardware. Attempts by the guest operating system to access the hardware are routed to the virtual
device driver in the host operating system as e.g.,€function calls. The virtual device driver can also send simulated
processor-level events like interrupts into the virtual machine.
Virtual devices may also operate in a non-virtualized environment. For example, a virtual network adapter is used with a virtual private network, while a virtual disk device is used with iSCSI. A well-known example of a virtual device driver is Daemon Tools.
There are several variants of virtual device drivers.
Open drivers
• Printers: CUPS
• RAIDs: CCISS[8] (Compaq Command Interface for SCSI-3 Support[9])
• Scanners: SANE
• Video: Vidix, Direct Rendering Infrastructure
Solaris descriptions of commonly used device drivers
• fas: Fast/wide SCSI controller
• hme: Fast (10/100 Mbit/s) Ethernet
• isp: Differential SCSI controllers and the SunSwift card
• glm: (Gigabaud Link Module[10]) UltraSCSI controllers
• scsi: Small Computer System Interface (SCSI) devices
• sf: soc+ or socal Fibre Channel Arbitrated Loop (FCAL)
• soc: SPARC Storage Array (SSA) controllers
• socal: Serial optical controllers for FCAL (soc+)
APIs
• Windows Display Driver Model (WDDM) - the graphic display driver architecture for Windows Vista
• Windows Driver Foundation (WDF)
• Windows Driver Model (WDM)
• Network Driver Interface Specification (NDIS) - a standard network card driver API
• Advanced Linux Sound Architecture (ALSA) - as of 2009 the standard Linux sound-driver interface
• Scanner Access Now Easy (SANE) - a public-domain interface to raster-image scanner hardware
• I/O Kit - an open-source framework from Apple for developing Mac OS X device drivers
• Installable File System (IFS) - a filesystem API for IBM OS/2 and Microsoft Windows NT
• Open Data-Link Interface (ODI) - a network card API similar to NDIS
• Uniform Driver Interface (UDI) - a cross-platform driver interface project
• Dynax Driver Framework (dxd) - C++ open source cross-platform driver framework for KMDF and IOKit
Identifiers
A device on the PCI bus or USB is identified by two IDs, each consisting of four hexadecimal digits. The vendor ID identifies the vendor of the device. The device ID identifies a specific device from that manufacturer/vendor.
A PCI device often has an ID pair for the main chip of the device, and also a subsystem ID pair which identifies the vendor, which may be different from the chip manufacturer.
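On Linux, these IDs are exposed through sysfs and can be read like ordinary files; a sketch (the device path is illustrative):

#include <stdio.h>

int main(void) {
    /* Path is illustrative: the device in bus 0, slot 0, function 0. */
    FILE *f = fopen("/sys/bus/pci/devices/0000:00:00.0/vendor", "r");
    if (f == NULL) { perror("fopen"); return 1; }
    unsigned vendor_id;
    if (fscanf(f, "%x", &vendor_id) == 1)       /* file holds e.g. "0x8086" */
        printf("vendor ID: %04x\n", vendor_id);
    fclose(f);
    return 0;
}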
References
[1] "What is a device driver?, The purpose of device drivers" (https:/ / www. pc-gesund. de/ it-wissen/ what-is-a-device-driver). PC Gesund. .
Retrieved November 8, 2012.
[2] EMC Education Services (2010). Information Storage and Management: Storing, Managing, and Protecting Digital Information (http:/ /
books. google. com. ph/ books?id=sCCfRAj3aCgC& dq=device+ driver+ hardware+ dependent& hl=fil& source=gbs_navlinks_s). John
Wiley & Sons. .
[3] "MAKEDEV€€ Linux Command€€ Unix Command" (http:/ / linux. about. com/ od/ commands/ l/ blcmdl8_MAKEDEV. htm).
Linux.about.com. 2009-09-11. . Retrieved 2009-09-17.
[4] Burke, Timothy (1995). Writing device drivers: tutorial and reference (http:/ / books. google. com. ph/ books?id=9aFQAAAAMAAJ&
q=writing+ a+ device+ driver& dq=writing+ a+ device+ driver& hl=fil& sa=X& ei=yA2jUJyXEKmfiAes44CIAQ& ved=0CC8Q6AEwAQ).
Digital Press. .
[5] "User-mode vs. Kernel-mode Drivers" (http:/ / technet2. microsoft. com/ windowsserver/ en/ library/
eb1936c0-e19c-4a17-a1a8-39292e4929a41033. mspx?mfr=true). Microsoft. 2003-03-01. . Retrieved 2008-03-04.
[6] "Introduction to the User-Mode Driver Framework (UMDF)" (http:/ / blogs. msdn. com/ iliast/ archive/ 2006/ 10/ 10/
Introduction-to-the-User_2D00_Mode-Driver-Framework. aspx). Microsoft. 2006-10-10. . Retrieved 2008-03-04.
[7] Deborah Morley (2009). Understanding Computers 2009: Today and Tomorrow (http:/ / books. google. com. ph/ books?id=tfU7RtI7OX8C&
dq=applications+ device+ driver& hl=fil& source=gbs_navlinks_s). Cengage Learning. .
[8] "CCISS" (http:/ / sourceforge. net/ projects/ cciss/ ). SourceForge. 2010. . Retrieved 2010-08-11. "Drivers for the HP (previously Compaq)
Smart Array controllers which provide hardware RAID capability."
[9] Russell, Steve; et al. (2003-10-21). "Abbreviations and acronyms" (http:/ / www. redbooks. ibm. com/ redbooks/ pdfs/ sg246852. pd). Server
Consolidation with the IBM eserver xSeries 440 and VMware ESX Serve. IBM International Technical Support Organization. p.€207.
ISBN€0-7384-2684-9. . Retrieved 2011-08-14.
[10] "US Patent 5969841 - Gigabaud link module with received power detect signal" (http:/ / www. patentstorm. us/ patents/ 5969841. html).
PatentStorm LLC. . Retrieved 2009-09-08. "An improved Gigabaud Link Module (GLM) is provided for performing bi-directional data
transfers between a host device and a serial transfer medium."
External links
• Microsoft Windows Hardware Developer Central (http://www.microsoft.com/whdc)
• Linux Hardware Compatibility Lists and Linux Drivers (http://www.linux-drivers.org)
• Understanding Modern Device Drivers(Linux) (http://pages.cs.wisc.edu/~kadav/study/study.pdf)
4. Memory
Memory management
Memory management is the act of managing computer memory. The essential requirement of memory management is to provide ways to dynamically allocate portions of memory to programs at their request, and to free memory for reuse when it is no longer needed. This is critical to the computer system.
Several methods have been devised that increase the effectiveness of memory management. Virtual memory systems
separate the memory addresses used by a process from actual physical addresses, allowing separation of processes
and increasing the effectively available amount of RAM using paging or swapping to secondary storage. The quality
of the virtual memory manager can have an extensive effect on overall system performance.
Dynamic memory allocation
Details
The task of fulfilling an allocation request consists of locating a block
of unused memory of sufficient size. Memory requests are satisfied by
allocating portions from a large pool of memory called the heap. At
any given time, some parts of the heap are in use, while some are
"free" (unused) and thus available for future allocations. Several issues
complicate implementation, such as internal and external
External Fragmentation
fragmentation, which arises when there are many small gaps between
allocated memory blocks, which invalidates their use for an allocation
request. The allocator's metadata can also inflate the size of (individually) small allocations. This is managed often
by chunking. The memory management system must track outstanding allocations to ensure that they do not overlap
and that no memory is ever "lost" as a memory leak.
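A minimal C illustration of the request/release cycle and the leak hazard just described:

#include <stdlib.h>
#include <string.h>

int main(void) {
    char *buf = malloc(64);          /* request a 64-byte block from the heap */
    if (buf == NULL) return 1;
    strcpy(buf, "heap-allocated");   /* use the block */
    free(buf);                       /* return it; omitting this leaks the block */
    return 0;
}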
Efficiency
The specific dynamic memory allocation algorithm implemented can impact performance significantly. A study
conducted in 1994 by Digital Equipment Corporation illustrates the overheads involved for a variety of allocators.
The lowest average instruction path length required to allocate a single memory slot was 52 (as measured with an
instruction level profiler on a variety of software).[1]
Implementations
Since the precise location of the allocation is not known in advance, the memory is accessed indirectly, usually
through a pointer reference. The specific algorithm used to organize the memory area and allocate and deallocate
chunks is interlinked with the kernel, and may use any of the following methods.
Fixed-size-blocks allocation
Fixed-size-blocks allocation, also called memory pool allocation, uses a free list of fixed-size blocks of memory
(often all of the same size). This works well for simple embedded systems where no large objects need to be
allocated, but suffers from fragmentation, especially with long memory addresses.
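A sketch of such a pool allocator in C, with the free list threaded through the free blocks themselves (block size and count are illustrative):

#define NUM_BLOCKS 128

typedef union block {
    union block *next;        /* link, valid while the block is free */
    unsigned char data[64];   /* payload, valid while the block is in use */
} block_t;

static block_t pool[NUM_BLOCKS];
static block_t *free_list;

void pool_init(void) {
    free_list = NULL;
    for (int i = 0; i < NUM_BLOCKS; i++) {   /* thread every block onto the list */
        pool[i].next = free_list;
        free_list = &pool[i];
    }
}

void *pool_alloc(void) {            /* O(1): pop the head of the free list */
    block_t *b = free_list;
    if (b != NULL) free_list = b->next;
    return b;
}

void pool_free(void *p) {           /* O(1): push the block back on the list */
    block_t *b = p;
    b->next = free_list;
    free_list = b;
}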
Buddy blocks
In this system, memory is allocated into several pools of memory instead of just one, where each pool represents
blocks of memory of a certain power of two in size. All blocks of a particular size are kept in a sorted linked list or
tree and all new blocks that are formed during allocation are added to their respective memory pools for later use. If
a smaller size is requested than is available, the smallest available size is selected and halved. One of the resulting
halves is selected, and the process repeats until the request is complete. When a block is allocated, the allocator will
start with the smallest sufficiently large block to avoid needlessly breaking blocks. When a block is freed, it is
compared to its buddy. If they are both free, they are combined and placed in the next-largest size buddy-block list.
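The operations are cheap because buddies can be located with address arithmetic; a sketch of the two core calculations (a full allocator would add a free list per order):

#include <stddef.h>
#include <stdint.h>

#define MIN_ORDER 5   /* smallest block: 2^5 = 32 bytes (illustrative) */

/* Round a request up to the next power-of-two order. */
unsigned order_for(size_t size) {
    unsigned order = MIN_ORDER;
    while (((size_t)1 << order) < size)
        order++;
    return order;
}

/* A block's buddy lies at the block's offset (from the start of the managed
 * region) with the bit for its order flipped. */
uintptr_t buddy_of(uintptr_t offset, unsigned order) {
    return offset ^ ((uintptr_t)1 << order);
}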
Systems with virtual memory
Virtual memory is a method of decoupling the memory organization from the physical hardware. Applications address memory through virtual addresses. Each time an attempt to access stored data is made, the virtual memory hardware translates the virtual address to a physical address. In this way, the addition of virtual memory enables granular control over memory systems and methods of access.
Protection
In virtual memory systems the operating system limits how a process can access the memory. This feature can be
used to disallow a process to read or write to memory that is not allocated to it, preventing malicious or
malfunctioning code in one program from interfering with the operation of another.
Sharing
Even though the memory allocated for specific processes is normally isolated, processes sometimes need to be able
to share information. Shared memory is one of the fastest techniques for Inter-process communication.
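A sketch of cross-process sharing via the POSIX shared-memory API (the object name is invented; on some systems the program must be linked with -lrt):

#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    /* Create a named object; another process can shm_open the same name. */
    int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
    if (fd < 0) return 1;
    if (ftruncate(fd, 4096) != 0) return 1;      /* size the region */
    char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) return 1;
    strcpy(p, "visible to every process that maps /demo_shm");
    munmap(p, 4096);
    close(fd);
    shm_unlink("/demo_shm");                     /* remove the name when done */
    return 0;
}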
Physical organization
Memory is usually classed by access rate as with primary storage and secondary storage. Memory management
systems handle moving information between these two levels of memory.
Notes
[1] http://www.eecs.northwestern.edu/~robby/uc-courses/15400-2008-spring/spe895.pdf
References
• Donald Knuth. Fundamental Algorithms, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89683-4. Section 2.5: Dynamic Storage Allocation, pp. 435-456.
• Simple Memory Allocation Algorithms (http://buzzan.tistory.com/m/post/view/id/428) (originally published on OSDEV Community)
• Wilson, P.R.; Johnstone, M.S.; Neely, M.; Boles, D. (1995). "Dynamic Storage Allocation: A Survey and Critical Review" (http://books.google.com/?id=m0yZN2bA3TcC&pg=PA1&dq=paul+wilson). Memory Management: International Workshop, Iwmm'95, Kinross, Uk, September 27-29, 1995: Proceedings (Springer). ISBN 978-3-540-60368-9. Retrieved 2008-01-06.
• Berger, E.D.; Zorn, B.G.; McKinley, K.S. (2001). "Composing high-performance memory allocators" (http://portal.acm.org/citation.cfm?id=381694.378821). ACM SIGPLAN Notices 36 (5): 114-124. doi:10.1145/381694.378821.
• Berger, E.D.; Zorn, B.G.; McKinley, K.S. (2002). "Reconsidering custom memory allocation" (http://portal.acm.org/citation.cfm?id=582419.582421). Proceedings of the 17th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications. ACM Press New York, NY, USA. pp. 1-12.
• memorymanagement.org (http://www.memorymanagement.org/) A small old site dedicated to memory management.
Further reading
• "Dynamic Storage Allocation: A Survey and Critical Review" (http://www.cs.northwestern.edu/~pdinda/
icsclass/doc/dsa.pdf), Department of Computer Sciences University of Texas at Austin
External links
• "Generic Memory Manager" C++ library (http://memory-mgr.sourceforge.net/)
• Sample bit-mapped arena memory allocator in C (http://code.google.com/p/arena-memory-allocation/
downloads/list)
• TLSF: a constant time allocator for real-time systems (http://www.gii.upv.es/tlsf/)
• Slides on Dynamic memory allocation (https://users.cs.jmu.edu/bernstdh/web/common/lectures/
slides_cpp_dynamic-memory.php)
• Inside A Storage Allocator (http://www.flounder.com/inside_storage_allocation.htm)
• The Memory Management Reference (http://www.memorymanagement.org/)
• The Memory Management Reference, Beginner's Guide Allocation (http://www.memorymanagement.org/
articles/alloc.html)
• Linux Memory Management (http://linux-mm.org/)
• Memory Management For System Programmers (http://www.enderunix.org/docs/memory.pdf)
• VMem - general malloc/free replacement. Fast thread safe C++ allocator (http://www.puredevsoftware.com/)
Virtual memory
In computing, virtual memory is a memory management technique developed for multitasking kernels. This technique virtualizes a computer architecture's various forms of computer data storage (such as random-access memory and disk storage), allowing a program to be designed as though there is only one kind of memory, "virtual" memory, which behaves like directly addressable, contiguous read/write memory.
Properties
Virtual memory makes application programming easier by
hiding fragmentation of physical memory; by delegating to
the kernel the burden of managing the memory hierarchy
(eliminating the need for the program to handle overlays
explicitly); and, when each process is run in its own
dedicated address space, by obviating the need to relocate
program code or to access memory with relative
addressing.
Memory virtualization can be seen as a generalization of
the concept of virtual memory.
Usage
[Figure: Virtual memory combines active RAM and inactive memory on DASD to form a large range of contiguous addresses.[1]]
Virtual memory is an integral part of a modern computer
architecture; implementations require hardware support, typically in the form of a memory management unit built
into the CPU. While not necessary, emulators and virtual machines can employ hardware support to increase
performance of their virtual memory implementations.[2] Consequently, older operating systems, such as those for
the mainframes of the 1960s, and those for personal computers of the early to mid 1980s (e.g. DOS),[3] generally
have no virtual memory functionality, though notable exceptions for mainframes of the 1960s include:
• the Atlas Supervisor for the Atlas
• MCP for the Burroughs B5000
• MTS, TSS/360 and CP/CMS for the IBM System/360 Model 67
• Multics for the GE 645
• the Time Sharing Operating System for the RCA Spectra 70/46
The Apple Lisa is an example of a personal computer of the 1980s that features virtual memory.
Most modern operating systems that support virtual memory also run each process in its own dedicated address
space. Each program thus appears to have sole access to the virtual memory. However, some older operating systems
(such as OS/VS1 and OS/VS2 SVS) and even modern ones (such as IBM i) are single address space operating
systems that run all processes in a single address space composed of virtualized memory.
Embedded systems and other special-purpose computer systems that require very fast and/or very consistent
response times may opt not to use virtual memory due to decreased determinism; virtual memory systems trigger
unpredictable traps that may produce unwanted "jitter" during I/O operations. This is because embedded hardware
costs are often kept low by implementing all such operations with software (a technique called bit-banging) rather
than with dedicated hardware.
History
In the 1940s and 1950s, all larger programs had to contain logic for managing primary and secondary storage, such
as overlaying. Virtual memory was therefore introduced not only to extend primary memory, but to make such an
extension as easy as possible for programmers to use.[4] To allow for multiprogramming and multitasking, many
early systems divided memory between multiple programs without virtual memory, such as early models of the
PDP-10 via registers.
The concept of virtual memory was developed by the German physicist Fritz-Rudolf Güntsch at the Technische Universität Berlin in 1956.[5][6] Paging was first developed at the University of Manchester as a way to extend the
Atlas Computer's working memory by combining its 16 thousand words of primary core memory with an additional
96 thousand words of secondary drum memory. The first Atlas was commissioned in 1962 but working prototypes of
paging had been developed by 1959.[4](p2)[7][8] In 1961, the Burroughs Corporation independently released the first
commercial computer with virtual memory, the B5000, with segmentation rather than paging.[9][10]
Before virtual memory could be implemented in mainstream operating systems, many problems had to be addressed.
Dynamic address translation required expensive and difficult to build specialized hardware; initial implementations
slowed down access to memory slightly.[4] There were worries that new system-wide algorithms utilizing secondary
storage would be less effective than previously used application-specific algorithms. By 1969, the debate over virtual
memory for commercial computers was over;[4] an IBM research team led by David Sayre showed that their virtual
memory overlay system consistently worked better than the best manually controlled systems. The first
minicomputer to introduce virtual memory was the Norwegian NORD-1; during the 1970s, other minicomputers
implemented virtual memory, notably VAX models running VMS.
Virtual memory was introduced to the x86 architecture with the protected mode of the Intel 80286 processor, but its
segment swapping technique scaled poorly to larger segment sizes. The Intel 80386 introduced paging support
underneath the existing segmentation layer, enabling the page fault exception to chain with other exceptions without
double fault. However, loading segment descriptors was an expensive operation, causing operating system designers
to rely strictly on paging rather than a combination of paging and segmentation.
Paged virtual memory
Nearly all implementations of virtual memory divide a virtual address space into pages, blocks of contiguous virtual
memory addresses. Pages are usually at least 4 kilobytes in size; systems with large virtual address ranges or
amounts of real memory generally use larger page sizes.
Page tables
Page tables are used to translate the virtual addresses seen by the application into physical addresses used by the
hardware to process instructions; such hardware that handles this specific translation is often known as the memory
management unit. Each entry in the page table holds a flag indicating whether the corresponding page is in real
memory or not. If it is in real memory, the page table entry will contain the real memory address at which the page is
stored. When a reference is made to a page by the hardware, if the page table entry for the page indicates that it is not
currently in real memory, the hardware raises a page fault exception, invoking the paging supervisor component of
the operating system.
Systems can have one page table for the whole system, separate page tables for each application and segment, a tree
of page tables for large segments or some combination of these. If there is only one page table, different applications
running at the same time use different parts of a single range of virtual addresses. If there are multiple page or
segment tables, there are multiple virtual address spaces and concurrent applications with separate page tables
redirect to different real addresses.
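A sketch of the lookup the hardware performs, for a single-level table and 4 KiB pages (the structure is illustrative; real MMUs use multi-level tables and TLB caches):

#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12                    /* 4 KiB pages */
#define PAGE_SIZE  (1u << PAGE_SHIFT)

typedef struct {
    bool     present;  /* is the page currently in real memory? */
    uint32_t frame;    /* physical frame number, valid if present */
} pte_t;

extern pte_t page_table[];               /* one entry per virtual page */
extern void page_fault(uint32_t vpage);  /* paging supervisor entry point */

uint32_t translate(uint32_t vaddr) {
    uint32_t vpage  = vaddr >> PAGE_SHIFT;
    uint32_t offset = vaddr & (PAGE_SIZE - 1);
    if (!page_table[vpage].present)
        page_fault(vpage);               /* brings the page in, updates the PTE */
    return (page_table[vpage].frame << PAGE_SHIFT) | offset;
}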
Paging supervisor
This part of the operating system creates and manages page tables. If the hardware raises a page fault exception, the
paging supervisor accesses secondary storage, returns the page that has the virtual address that resulted in the page
fault, updates the page tables to reflect the physical location of the virtual address and tells the translation mechanism
to restart the request.
When all physical memory is already in use, the paging supervisor must free a page in primary storage to hold the
swapped-in page. The supervisor uses one of a variety of page replacement algorithms such as least recently used to
determine which page to free.
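A sketch of least-recently-used victim selection, assuming each frame records the time of its last reference (real kernels approximate this with per-page reference bits):

#include <stdint.h>

#define NUM_FRAMES 1024                  /* illustrative */

extern uint64_t last_used[NUM_FRAMES];   /* timestamp of each frame's last use */

/* Pick the frame whose page was referenced longest ago. */
unsigned choose_victim(void) {
    unsigned victim = 0;
    for (unsigned f = 1; f < NUM_FRAMES; f++)
        if (last_used[f] < last_used[victim])
            victim = f;
    return victim;   /* its page is written out if dirty, then the frame is reused */
}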
Pinned/Locked/Fixed pages
Operating systems have memory areas that are pinned (never swapped to secondary storage). For example, interrupt
mechanisms rely on an array of pointers to their handlers, such as I/O completion and page fault. If the pages
containing these pointers or the code that they invoke were pageable, interrupt-handling would become far more
complex and time-consuming, particularly in the case of page fault interruptions. Hence, some part of the page table
structures is not pageable.
Some pages may be pinned for short periods of time, others may be pinned for long periods of time, and still others
may need to be permanently pinned. For example:
• The paging supervisor code and drivers for secondary storage devices on which pages reside must be permanently
pinned, as otherwise paging wouldn't even work because the necessary code wouldn't be available.
• Timing-dependent components may be pinned to avoid variable paging delays.
• Data buffers that are accessed directly by peripheral devices that use direct memory access or I/O channels must
reside in pinned pages while the I/O operation is in progress because such devices and the buses to which they are
attached expect to find data buffers located at physical memory addresses; regardless of whether the bus has a
memory management unit for I/O, transfers cannot be stopped if a page fault occurs and then restarted when the
page fault has been processed.
In IBM's operating systems for System/370 and successor systems, the term is "fixed", and pages may be long-term
fixed or short-term fixed. Control structures are often long-term fixed, in wall-clock terms (i.e., fixed for seconds or
longer), whereas I/O buffers are usually short-term fixed (often for only a few milliseconds). Indeed, the OS has a
special facility for "fast fixing" these short-term fixed data buffers, performed without the expense of a
time-consuming Supervisor Call instruction. Additionally, the OS has yet another facility for converting an
application from being long-term fixed to being fixed for an indefinite period, possibly for days, months or even
years. This facility implicitly requires that the application first be swapped out, possibly from preferred memory or a
mixture of preferred and non-preferred memory, and then be swapped in to non-preferred memory, where it resides
for the duration, however long that might be; it uses a documented Supervisor Call instruction.
Multics used the term "wired". OpenVMS and Windows refer to pages temporarily made nonpageable (as for I/O
buffers) as "locked", and simply "nonpageable" for those that are never pageable.
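Applications on POSIX systems can request a comparable guarantee for their own buffers with the mlock() call. The
following is a minimal illustrative sketch, not specific to any of the systems above; the call may fail without
sufficient privileges or against the RLIMIT_MEMLOCK resource limit.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h> /* mlock, munlock */

int main(void) {
    size_t len = 4096;
    char *buf = malloc(len);
    if (buf == NULL)
        return 1;
    /* Lock the buffer into physical memory so it cannot be paged out. */
    if (mlock(buf, len) != 0) {
        perror("mlock");
        return 1;
    }
    memset(buf, 0, len); /* accesses here cannot hard-fault to disk */
    munlock(buf, len);   /* make the buffer pageable again */
    free(buf);
    return 0;
}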
Virtual-real operation
In OS/VS1 and similar OSes, some parts of system memory are managed in virtual-real mode, in which every
virtual address corresponds to a real address. This applies to interrupt mechanisms, to the paging supervisor and
page tables in older systems, and to application programs using non-standard I/O management. For example, IBM's
z/OS has 3 modes (virtual-virtual, virtual-real and virtual-fixed).[11]
Thrashing
When paging is used, a problem called "thrashing" can occur, in which the computer spends an excessive amount of
time swapping pages to and from a backing store, slowing down useful work. Adding real memory is the
simplest response, but improving application design, scheduling, and memory usage can help.
Segmented virtual memory
Some systems, such as the Burroughs B5500,[12] use segmentation instead of paging, dividing virtual address spaces
into variable-length segments. A virtual address here consists of a segment number and an offset within the segment.
The Intel 80286 supports a similar segmentation scheme as an option, but it is rarely used. Segmentation and paging
can be used together by dividing each segment into pages; systems with this memory structure, such as Multics and
IBM System/38, are usually paging-predominant, segmentation providing memory protection.[13][14][15]
In the Intel 80386 and later IA-32 processors, the segments reside in a 32-bit linear, paged address space. Segments
can be moved in and out of that space; pages there can "page" in and out of main memory, providing two levels of
virtual memory; few if any operating systems do so, instead using only paging. Early non-hardware-assisted x86
virtualization solutions combined paging and segmentation because x86 paging offers only two protection domains
whereas a VMM / guest OS / guest applications stack needs three.[16]:22 The difference between paging and
segmentation systems is not only about memory division; segmentation is visible to user processes, as part of
memory model semantics. Hence, instead of memory that looks like a single large vector, it is structured into
multiple spaces.
This difference has important consequences; a segment is not a page with variable length or a simple way to lengthen
the address space. Segmentation can provide a single-level memory model, in which there is no distinction between
process memory and the file system: a process's potential address space consists only of a list of segments (files)
mapped into it.[17]
This is not the same as the mechanisms provided by calls such as mmap and Win32's MapViewOfFile, because
inter-file pointers do not work when mapping files into semi-arbitrary places. In Multics, a file (or a segment from a
multi-segment file) is mapped into a segment in the address space, so files are always mapped at a segment
boundary. A file's linkage section can contain pointers for which an attempt to load the pointer into a register or
make an indirect reference through it causes a trap. The unresolved pointer contains an indication of the name of the
segment to which the pointer refers and an offset within the segment; the handler for the trap maps the segment into
the address space, puts the segment number into the pointer, changes the tag field in the pointer so that it no longer
causes a trap, and returns to the code where the trap occurred, re-executing the instruction that caused the trap.[18]
This eliminates the need for a linker completely[4] and works when different processes map the same file into
different places in their private address spaces.[19]
Further reading
• Hennessy, John L.; and Patterson, David A.; Computer Architecture, A Quantitative Approach (ISBN
1-55860-724-2)
• Martignetti, E.; What Makes It Page?: The Windows 7 (x64) Virtual Memory Manager (ISBN 978-1479114290)
Notes
[1] Early systems used drums; contemporary systems use disks or solid state memory.
[2] "AMD-V™ Nested Paging" (http://developer.amd.com/assets/NPT-WP-1 1-final-TM.pdf). AMD. Retrieved 11 May 2012.
[3] "Windows Version History" (http://support.microsoft.com/kb/32905). Microsoft. Last Review: July 19, 2005. Retrieved 2008-12-03.
[4] Denning, Peter (1997). "Before Memory Was Virtual" (http://cs.gmu.edu/cne/pjd/PUBS/bvm.pdf) (PDF). In the Beginning: Recollections of Software Pioneers.
[5] Jessen, E.: "Origin of the Virtual Memory Concept". IEEE Annals of the History of Computing, Vol. 26, 4/2004, p. 71 ff.
[6] Jessen, Eike (1996). "Die Entwicklung des virtuellen Speichers" (http://dx.doi.org/10.1007/s002870050034) (in German). Informatik-Spektrum (Springer Berlin / Heidelberg) 19 (4): 216-219. doi:10.1007/s002870050034. ISSN 0170-6012.
[7] R. J. Creasy, "The origin of the VM/370 time-sharing system" (http://pages.cs.wisc.edu/~stjones/proj/vm_reading/ibmrd2505M.pdf), IBM Journal of Research & Development, Vol. 25, No. 5 (September 1981), p. 486.
[8] "Atlas design includes virtual memory" (http://www.computer50.org/kgill/atlas/atlas.html).
[9] "Ian Joyner on Burroughs B5000" (http://web.mac.com/joynerian/iWeb/Ian Joyner/Burroughs.html).
[10] Cragon, Harvey G. (1996). Memory Systems and Pipelined Processors (http://books.google.com/?id=q2w3JSFD7l4C). Jones and Bartlett Publishers. p. 113. ISBN 0-86720-474-5.
[11] "z/OS Basic Skills Information Center: z/OS Concepts" (http://publib.boulder.ibm.com/infocenter/zoslnctr/v1r7/topic/com.ibm.zconcepts.doc/zconcepts.pdf) (PDF).
[12] Burroughs. Burroughs B5500 Information Processing System Reference Manual. 1021326.
[13] GE-645 System Manual (http://computer-refuge.org/bitsavers/pdf/ge/GE-645/GE-645_SystemMan_Jan68.pdf) (PDF). January 1968. pp. 21-30. Retrieved 2007-11-13.
[14] Corbató, F. J.; Vyssotsky, V. A. "Introduction and Overview of the Multics System" (http://www.multicians.org/fjcc1.html). Retrieved 2007-11-13.
[15] Glaser, Edward L.; Couleur, John F.; Oliver, G. A. "System Design of a Computer for Time Sharing Applications" (http://www.multicians.org/fjcc2.html).
[16] J. E. Smith, R. Uhlig (August 14, 2005). Virtual Machines: Architectures, Implementations and Applications, HOTCHIPS 17, Tutorial 1, part 2 (http://www.hotchips.org/archives/hc17/1_Sun/HC17.T1P2.pdf).
[17] Bensoussan, André; Clingen, Charles T.; Daley, Robert C. (May 1972). "The Multics Virtual Memory: Concepts and Design" (http://www.multicians.org/multics-vm.html). Communications of the ACM 15 (5): 308-318. doi:10.1145/355602.361306.
[18] "Multics Execution Environment" (http://www.multicians.org/exec-env.html).
[19] Organick, Elliott I. (1972). The Multics System: An Examination of Its Structure. MIT Press. ISBN 0-262-15012-3.
References
• This article is based on material taken from the Free On-line Dictionary of Computing prior to 1 November 2008
and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.
External links
• "Time-Sharing Supervisor Programs" (http://archive.michigan-terminal-system.org/documentation/
documents/timeSharingSupervisorPrograms-1971.pdf) by Michael T. Alexander in Advanced Topics in Systems
Programming, University of Michigan Engineering Summer Conference 1970 (revised May 1971), compares the
scheduling and resource allocation approaches, including virtual memory and paging, used in four mainframe
operating systems: CP-67, TSS/360, MTS, and Multics.
• LinuxMM: Linux Memory Management (http://linux-mm.org/).
• Birth of Linux Kernel (http://gnulinuxclub.org/index.php?option=com_content&task=view&id=161&
Itemid=32), mailing list discussion.
• The Virtual-Memory Manager in Windows NT, Randy Kath, Microsoft Developer Network Technology Group,
12 December 1992 (http://web.archive.org/20100622062522/http://msdn2.microsoft.com/en-us/library/
ms810616.aspx) at the Wayback Machine (archived June 22, 2010)
Page (computer memory)
"Page size" redirects to this article. For information on paper see Paper size
A page, memory page, or virtual page is a fixed-length contiguous block of virtual memory, and it is the smallest
unit of data for the following:
• memory allocation performed by the operating system for a program; and
• transfer between main memory and any other auxiliary store, such as a hard disk drive.
Virtual memory allows a page that does not currently reside in main memory to be addressed and used. If a program
tries to access a location in such a page, an exception called a page fault is generated. The hardware or operating
system is notified and automatically loads the required page from the auxiliary store. The program addressing the
memory has no knowledge of the page fault or of the process that handles it. Thus a program can address more (virtual) RAM
than physically exists in the computer. Virtual memory is a scheme that gives users the illusion of working with a
large block of contiguous memory space (perhaps even larger than real memory), when in actuality most of their
work is on auxiliary storage (disk). Fixed-size blocks (pages) or variable-size blocks of the job are read into main
memory as needed.
A transfer of pages between main memory and an auxiliary store, such as a hard disk drive, is referred to as paging
or swapping.[1]
Page size trade-off
Page size is usually determined by processor architecture. Traditionally, pages in a system had uniform size, for
example 4096 bytes. However, processor designs often allow two or more, sometimes simultaneous, page sizes
because of the trade-offs involved. There are several points that factor into choosing the best page size.
Page size versus page table size
A system with a smaller page size uses more pages, requiring a page table that occupies more space. For example, if
a 2^32-byte virtual address space is mapped to 4 KB (2^12-byte) pages, the number of virtual pages is 2^20
(= 2^32 / 2^12). However, if the page size is increased to 32 KB (2^15 bytes), only 2^17 pages are required. A
multi-level paging algorithm can decrease the memory cost of allocating a large page table for each process by
further dividing the page table up into smaller tables, effectively paging the page table.
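As an illustration of the arithmetic above, this small C program computes the number of page table entries needed to
map a 2^32-byte address space with 4 KB and with 32 KB pages:

#include <stdio.h>

int main(void) {
    unsigned long long vspace = 1ULL << 32; /* 2^32-byte virtual address space */
    unsigned int shift[] = { 12, 15 };      /* 4 KB (2^12) and 32 KB (2^15) pages */
    for (int i = 0; i < 2; i++) {
        unsigned long long pages = vspace >> shift[i];
        printf("%llu-byte pages -> %llu page table entries\n",
               1ULL << shift[i], pages);
    }
    return 0;
}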
Page size versus TLB usage
Since every access to memory must be mapped from virtual to physical address, reading the page table every time
can be quite costly. Therefore, a very fast kind of cache, the Translation Lookaside Buffer (TLB), is often used. The
TLB is of limited size, and when it cannot satisfy a given request (a TLB miss) the page tables must be searched
manually (either in hardware or software, depending on the architecture) for the correct mapping. Larger page sizes
mean that a TLB cache of the same size can keep track of larger amounts of memory, which avoids the costly TLB
misses.
Internal fragmentation of pages
Rarely do processes require the use of an exact number of pages. As a result, the last page will likely only be
partially full, wasting some amount of memory. Larger page sizes increase the potential for wasted memory this
way, as more potentially unused portions of memory are loaded into main memory. Smaller page sizes ensure a
closer match to the actual amount of memory required in an allocation.
As an example, assume the page size is 1024 KB. If a process allocates 1025 KB, two pages must be used, resulting in
1023 KB of unused space (where one page fully consumes 1024 KB and the other only 1 KB).
Page size versus disk access
When transferring from a rotational disk, much of the delay is caused by seek time, the time it takes to correctly
position the read/write heads above the disk platters. Because of this, large sequential transfers are more efficient
than several smaller transfers. Transferring the same amount of data from disk to memory often requires less time
with larger pages than with smaller pages.
Determining the page size in a program
Most operating systems allow programs to discover the page size at runtime. This allows programs to use memory
more efficiently by aligning allocations to this size and reducing overall internal fragmentation of pages.
Unix and POSIX-based operating systems
Unix and POSIX-based systems may use the system function sysconf(), as illustrated in the following example
written in the C programming language.
#include <stdio.h>
#include <unistd.h> /* sysconf(3) */

int main(void) {
    printf("The page size for this system is %ld bytes.\n",
           sysconf(_SC_PAGESIZE)); /* _SC_PAGE_SIZE is OK too. */
    return 0;
}
In many Unix systems the command line utility getconf can be used. For example getconf PAGESIZE will
return the page size in bytes.
Windows-based operating systems
Win32-based operating systems, such as those in the Windows 9x and Windows NT families, may use the system
function GetSystemInfo() from kernel32.dll.
#include <stdio.h>
#include <windows.h>

int main(void) {
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    printf("The page size for this system is %u bytes.\n",
           si.dwPageSize);
    return 0;
}
Huge pages
Huge page size depends on processor architecture, processor type, and operating (addressing) mode. The operating
system selects one from the sizes supported by the architecture. Note that not all processors implement all defined
Huge/Large page sizes.
Architecture   Page Size   Huge Page Size                         Large Page Size
i386           4 KB        4M (2M in PAE mode)                    1 GB
IA-64          4 KB        4K, 8K, 64K, 256K, 1M, 4M, 16M, 256M   -
ppc64          4 KB        -                                      16M
sparc          8 KB        -                                      8K, 64K, 4M, 256M, 2G

Information from: http://wiki.debian.org/Hugepages.[2]
Some instruction set architectures can support multiple page sizes, including pages significantly larger than the
standard page size. Starting with the Pentium Pro, x86 processors support 4 MB pages (called Page Size Extension)
(2 MB pages if using PAE) in addition to their standard 4 KB pages; newer x86-64 processors, such as AMD's newer
AMD64 processors and Intel's Westmere[3] processors, can use 1 GB pages in long mode. IA-64 supports as many as
eight different page sizes, from 4 KB up to 256 MB, and some other architectures have similar features. This support
for huge pages (known as superpages in FreeBSD, and large pages in Microsoft Windows terminology) allows for
"the best of both worlds", reducing the pressure on the TLB cache (sometimes increasing speed by as much as 15%,
depending on the application and the allocation size) for large allocations while still keeping memory usage at a
reasonable level for small allocations.
Huge pages, despite being available in the processors used in most contemporary personal computers, are not in
common use except in large servers and computational clusters. Commonly, their use requires elevated privileges,
cooperation from the application making the large allocation (usually setting a flag to ask the operating system for
huge pages), or manual administrator configuration; operating systems commonly, sometimes by design, cannot
page them out to disk.
However, SGI IRIX has general purpose support for multiple page sizes. Each individual process can provide hints
and the operating system will automatically use the largest page size possible for a given segment of address
space.[4]
Linux has supported huge pages on several architectures since the 2.6 series via the hugetlbfs filesystem[5] and
without hugetlbfs since 2.6.38.[6] Windows Server 2003 (SP1 and newer), Windows Vista and Windows Server 2008
support huge pages under the name of large pages. Windows 2000 and Windows XP support large pages
internally,[7] but do not expose them to applications. Solaris beginning with version 9 supports large pages on
SPARC and x86.[8][9] FreeBSD 7.2-RELEASE features superpages.[10] Note that until recently in Linux,
applications needed to be modified in order to use huge pages. The 2.6.38 kernel introduced support for transparent
use of huge pages.[11] On Linux kernels supporting transparent huge pages, as well as FreeBSD and Solaris,
applications take advantage of huge pages automatically, without the need for modification.[10]
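As an illustration of explicit huge page allocation on Linux, the following sketch requests an anonymous 2 MB huge
page with mmap() and the Linux-specific MAP_HUGETLB flag (kernel 2.6.32 and later). It assumes the administrator
has reserved huge pages, e.g. through /proc/sys/vm/nr_hugepages, and fails otherwise:

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    size_t len = 2 * 1024 * 1024; /* one 2 MB huge page on typical x86-64 */
    /* MAP_HUGETLB requests explicit huge pages; the call fails unless
       huge pages have been reserved by the administrator. */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap(MAP_HUGETLB)");
        return 1;
    }
    ((char *)p)[0] = 1; /* touch the page */
    munmap(p, len);
    return 0;
}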
References
[1] Belzer, Jack; Holzman, Albert G.; Kent, Allen, eds. (1981), "Virtual memory systems" (http://books.google.com/books?id=KUgNGCJB4agC&printsec=frontcover), Encyclopedia of Computer Science and Technology, 14, CRC Press, p. 32, ISBN 0-8247-2214-0.
[2] "AIX documentation, Large Pages" (http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.prftungd/doc/prftungd/large_page_ovw.htm). IBM. Retrieved 2012-12-14.
[3] "The Intel Xeon 5670: Six Improved Cores" (http://anandtech.com/show/2964/). AnandTech. Retrieved 2012-11-03.
[4] "General Purpose Operating System Support for Multiple Page Sizes" (http://static.usenix.org/publications/library/proceedings/usenix98/full_papers/ganapathy/ganapathy.pdf). Static.usenix.org. Retrieved 2012-11-02.
[5] "Pages - dankwiki, the wiki of nick black" (http://dank.qemfd.net/dankwiki/index.php/Pages). Dank.qemfd.net. Retrieved 2012-11-03.
[6] "Transparent huge pages in 2.6.38" (http://lwn.net/Articles/423584/). Lwn.net. Retrieved 2012-11-03.
[7] "AGP program may hang when using page size extension on Athlon processor" (http://support.microsoft.com/kb/270715). Support.microsoft.com. 2007-01-27. Retrieved 2012-11-03.
[8] "Supporting Multiple Page Sizes in the Solaris Operating System" (http://www.sun.com/blueprints/0304/817-5917.pdf). Sun BluePrints Online. Sun Microsystems. Retrieved 2008-01-19.
[9] "Supporting Multiple Page Sizes in the Solaris Operating System Appendix" (http://www.sun.com/blueprints/0304/817-6242.pdf). Sun BluePrints Online. Sun Microsystems. Retrieved 2008-01-19.
[10] "FreeBSD 7.2-RELEASE Release Notes" (http://www.freebsd.org/releases/7.2R/relnotes-detailed.html). FreeBSD Foundation. Retrieved 2009-05-03.
[11] Jonathan Corbet. "Transparent huge pages in 2.6.38" (http://lwn.net/Articles/423584). LWN. Retrieved 2011-03-02.
Further reading
• Dandamudi, Sivarama P. (2003). Fundamentals of Computer Organization and Design (1st ed.). Springer.
pp. 740-741. ISBN 0-387-95211-X.
Paging
In computer operating systems, paging is one of the memory-management schemes by which a computer can store
and retrieve data from secondary storage for use in main memory. In the paging memory-management scheme, the
operating system retrieves data from secondary storage in same-size blocks called pages. The main advantage of
paging over memory segmentation is that it allows the physical address space of a process to be noncontiguous.
Before paging came into use, systems had to fit whole programs into storage contiguously, which caused various
storage and fragmentation problems.[1]
Paging is an important part of virtual memory implementation in most contemporary general-purpose operating
systems, allowing them to use disk storage for data that does not fit into physical random-access memory (RAM).
Overview
The main functions of paging are performed when a program tries to access pages that are not currently mapped to
physical memory (RAM). This situation is known as a page fault. The operating system must then take control and
handle the page fault, in a manner invisible to the program. Therefore, the operating system must:
1. Determine the location of the data in auxiliary storage.
2. Obtain an empty page frame in RAM to use as a container for the data.
3. Load the requested data into the available page frame.
4. Update the page table to show the new data.
5. Return control to the program, transparently retrying the instruction that caused the page fault.
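The following self-contained C program is a toy simulation of this sequence; arrays stand in for the page table, the
physical page frames and the backing store, and eviction is omitted (step 2 simply takes the next free frame). It is an
illustrative sketch, not how any real kernel is structured:

#include <stdio.h>
#include <string.h>

#define NPAGES   8
#define NFRAMES  4
#define PAGESIZE 64
#define NOT_PRESENT (-1)

static int  page_table[NPAGES];        /* virtual page -> frame index */
static char frames[NFRAMES][PAGESIZE]; /* simulated physical memory   */
static char disk[NPAGES][PAGESIZE];    /* simulated auxiliary storage */
static int  next_free_frame = 0;

static void handle_page_fault(int page) {
    int frame = next_free_frame++;               /* 2. obtain a frame (no eviction)  */
    memcpy(frames[frame], disk[page], PAGESIZE); /* 1+3. locate and load the data    */
    page_table[page] = frame;                    /* 4. update the page table         */
}

static char read_byte(int page, int offset) {
    if (page_table[page] == NOT_PRESENT)  /* access to an unmapped page: page fault */
        handle_page_fault(page);
    return frames[page_table[page]][offset];     /* 5. retry the access              */
}

int main(void) {
    for (int i = 0; i < NPAGES; i++) {
        page_table[i] = NOT_PRESENT;
        snprintf(disk[i], PAGESIZE, "data of page %d", i);
    }
    /* The first read faults and loads page 3; the second read hits. */
    printf("%c%c\n", read_byte(3, 0), read_byte(3, 1));
    return 0;
}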
As long as there is enough free RAM to store all the data needed, obtaining an empty page frame does not involve
removing another page from RAM. If all page frames are non-empty, obtaining an empty page frame requires
choosing a page frame containing data to empty. If the data in that page frame has been modified since it
was read into RAM (i.e., if it has become "dirty"), it must be written back to its location in secondary storage before
being freed; otherwise, the contents of the page's page frame in RAM are the same as the contents of the page in
secondary storage, so it does not need to be written back to secondary storage. If a reference is then made to that
page, a page fault will occur, and an empty page frame must be obtained and the contents of the page in secondary
storage again read into that page frame.
Efficient paging systems must determine the page frame to empty by choosing one that is least likely to be needed
within a short time. There are various page replacement algorithms that try to do this. Most operating systems use
some approximation of the least recently used (LRU) page replacement algorithm (exact LRU is too costly to
implement on current hardware) or a working set-based algorithm.
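One widely used LRU approximation is the "clock" (second-chance) algorithm. The sketch below simulates victim
selection in user space; the data structures are illustrative, not taken from any real kernel:

#include <stdio.h>

#define NFRAMES 4

static int page_in_frame[NFRAMES]; /* which virtual page occupies each frame       */
static int referenced[NFRAMES];    /* reference bit, set by the hardware on access */
static int hand = 0;               /* the clock hand                               */

/* Sweep past frames whose reference bit is set, clearing the bit as we go,
   and evict the first frame found unreferenced. Recently used pages thus
   get a "second chance" before being chosen. */
static int choose_victim(void) {
    for (;;) {
        if (!referenced[hand]) {
            int victim = hand;
            hand = (hand + 1) % NFRAMES;
            return victim;
        }
        referenced[hand] = 0;
        hand = (hand + 1) % NFRAMES;
    }
}

int main(void) {
    for (int i = 0; i < NFRAMES; i++) {
        page_in_frame[i] = i;
        referenced[i] = 1;
    }
    referenced[2] = 0; /* frame 2 was not recently used */
    printf("evicting frame %d\n", choose_victim()); /* prints 2 */
    return 0;
}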
To further increase responsiveness, paging systems may employ various strategies to predict which pages will be
needed soon. Such systems will attempt to load pages into main memory preemptively, before a program references
them.
Page replacement algorithms
Demand paging
When pure demand paging is used, page loading only occurs at the time of the data request, and not before. In
particular, when a demand pager is used, a program usually begins execution with none of its pages pre-loaded in
RAM. Pages are copied from the executable file into RAM the first time the executing code references them, usually
in response to page faults. As a consequence, pages of the executable file containing code not executed during a
particular run will never be loaded into memory.
Anticipatory paging
This technique, sometimes called "swap prefetch", preloads a process's non-resident pages that are likely to be
referenced in the near future (taking advantage of locality of reference). Such strategies attempt to reduce the number
of page faults a process experiences. Some of those strategies are "if a program references one virtual address which
causes a page fault, perhaps the next few pages' worth of virtual address space will soon be used" and "if one big
program just finished execution, leaving lots of free RAM, perhaps the user will return to using some of the
programs that were recently paged out".
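Applications can give the operating system similar hints. On POSIX systems, madvise() with MADV_WILLNEED asks
the kernel to read the pages of a mapping in ahead of use; in this sketch the file name data.bin is a placeholder:

#define _GNU_SOURCE
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(void) {
    int fd = open("data.bin", O_RDONLY); /* "data.bin" is a placeholder */
    if (fd < 0) {
        perror("open");
        return 1;
    }
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0)
        return 1;
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    /* Hint that the whole mapping will be needed soon, so the kernel
       may start reading its pages in before the first access. */
    madvise(p, (size_t)st.st_size, MADV_WILLNEED);
    /* ... work with p ... */
    munmap(p, (size_t)st.st_size);
    close(fd);
    return 0;
}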
Free page queue
The free page queue is a list of page frames that are available for assignment after a page fault. Some operating
systems[2] support page reclamation; if a page fault occurs for a page that had been stolen and the page frame was
never reassigned, then the operating system avoids the necessity of reading the page back in by assigning the
unmodified page frame.
Page stealing
Some operating systems periodically look for pages that have not been recently referenced and add them to the Free
page queue, after paging them out if they have been modified.
Pre-cleaning
Unix operating systems periodically use sync to pre-clean all dirty pages, that is, to save all modified pages to hard
disk. Windows operating systems do the same thing via "modified page writer" threads.
Pre-cleaning makes starting a new program or opening a new data file much faster. The hard drive can immediately
seek to that file and consecutively read the whole file into pre-cleaned page frames. Without pre-cleaning, the hard
drive is forced to seek back and forth between writing a dirty page frame to disk, and then reading the next page of
the file into that frame.
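A user program can force its own dirty pages out early with fsync() (for one file) or sync() (system-wide). A
minimal sketch, with out.dat as a placeholder file name:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void) {
    int fd = open("out.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (write(fd, "hello\n", 6) != 6)
        perror("write");
    /* Push this file's dirty pages to stable storage now, instead of
       waiting for the kernel's periodic writeback; sync() would do the
       same for all dirty pages system-wide. */
    if (fsync(fd) != 0)
        perror("fsync");
    close(fd);
    return 0;
}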
Thrashing
Most programs reach a steady state in their demand for memory locality both in terms of instructions fetched and
data being accessed. This steady state is usually much less than the total memory required by the program. This
steady state is sometimes referred to as the working set: the set of memory pages that are most frequently accessed.
Virtual memory systems work most efficiently when the ratio of the working set to the total number of pages that
can be stored in RAM is low enough that the time spent resolving page faults is not a dominant factor in the
workload's performance. A program that works with huge data structures will sometimes require a working set that is
too large to be efficiently managed by the page system resulting in constant page faults that drastically slow down
the system. This condition is referred to as thrashing: pages are swapped out and then accessed causing frequent
faults.
An interesting characteristic of thrashing is that as the working set grows, there is very little increase in the number
of faults until the critical point (when faults go up dramatically and the majority of the system's processing power is
spent on handling them).
An extreme example of this sort of situation occurred on the IBM System/360 Model 67 and IBM System/370 series
mainframe computers, on which a particular instruction sequence could consist of an execute instruction that crosses
a page boundary and points to a move instruction that itself also crosses a page boundary, with the move's source
operand crossing a page boundary and its target operand also crossing a page boundary. The total number of pages
used by this particular instruction sequence is thus eight, and all eight pages must be present in memory at the same
time. If the operating system allocates fewer than eight pages of actual memory in this example, when it attempts to
swap out some part of the instruction or data to bring in the remainder, the instruction will again page fault, and it
will thrash on every attempt to restart the failing instruction.
To decrease excessive paging, and thus possibly resolve the thrashing problem, a user can do any of the following:
• Increase the amount of RAM in the computer (generally the best long-term solution).
• Decrease the number of programs being concurrently run on the computer.
The term thrashing is also used in contexts other than virtual memory systems, for example to describe cache issues
in computing or silly window syndrome in networking.
Sharing
In multi-programming or multi-user environments it is common for many users to execute the same program. If
each user had an individual copy of the program, much of primary storage would be wasted. The solution is to share
those pages that can be shared.
Sharing must be carefully controlled to prevent one process from modifying data that another process is accessing. In
most systems the shared programs are divided into separate pages, i.e., code and data are kept separate. Sharing is
achieved by having the page map table entries of different processes point to the same page frame; that page frame is
then shared among those processes.
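The same mechanism is visible to user programs through shared mappings. In this sketch, a parent and a child
process map one anonymous page MAP_SHARED, so both sets of page table entries point at the same physical frame:

#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>

int main(void) {
    /* One anonymous page shared between two processes. */
    char *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    if (fork() == 0) {                  /* child writes ...           */
        strcpy(page, "written by the child");
        _exit(0);
    }
    wait(NULL);
    printf("parent reads: %s\n", page); /* ... and the parent sees it */
    munmap(page, 4096);
    return 0;
}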
Terminology
Historically, paging sometimes referred to a memory allocation scheme that used fixed-length pages as opposed to
variable-length segments, without implicit suggestion that virtual memory techniques were employed at all or that
those pages were transferred to disk.[3] [4] Such usage is rare today.
Some modern systems use the term swapping along with paging. Historically, swapping referred to moving a whole
program at a time from/to secondary storage, in a scheme known as roll-in/roll-out.[5][6] In the 1960s, after the
concept of virtual memory was introduced, in two variants (using either segments or pages), the term swapping was
applied to moving, respectively, either segments or pages between disk and memory. Today, with virtual memory
mostly based on pages rather than segments, swapping has become a fairly close synonym of paging, although with
one difference.
In systems that support memory-mapped files, when a page fault occurs, a page may be then transferred to or from
any ordinary disk file, not necessarily a dedicated space. Page in is transferring a page from the disk to RAM. Page
out is transferring a page from RAM to the disk. Swap in and out only refer to transferring pages between RAM and
dedicated swap space or swap file or scratch disk space, and not any other place on disk.
On Windows NT based systems, dedicated swap space is known as a page file and paging/swapping are often used
interchangeably.
Implementations
Ferranti Atlas
The first computer to support paging was the Atlas,[7][8][9] jointly developed by Ferranti, the University of
Manchester and Plessey. The machine had an associative (content-addressable) memory with one entry for each 512
word page. The Supervisor[10] handled non-equivalence interruptions[11] and managed the transfer of pages between
core and drum in order to provide a one-level store[12] to programs.
Windows 3.x and Windows 9x
Paging has been a feature of Microsoft Windows since Windows 3.0 in 1990. Windows 3.x creates a hidden file
named 386SPART.PAR or WIN386.SWP for use as a swap file. It is generally found in the root directory, but it
may appear elsewhere (typically in the WINDOWS directory). Its size depends on how much swap space the system
has (a setting selected by the user under Control Panel → Enhanced, under "Virtual Memory"). If the user moves or
deletes this file, a blue screen will appear the next time Windows is started, with the error message "The permanent
swap file is corrupt". The user will be prompted to choose whether or not to delete the file (whether or not it exists).
Windows 95, Windows 98 and Windows Me use a similar file, and the settings for it are located under Control Panel
→ System → Performance tab → Virtual Memory. Windows automatically sets the size of the page file to start at
1.5× the size of physical memory, and expand up to 3× physical memory if necessary. If a user runs
memory-intensive applications on a system with low physical memory, it is preferable to manually set these sizes to
a value higher than default.
Windows NT
In NT-based versions of Windows (such as Windows XP, Windows Vista, and Windows 7), the file used for paging
is named pagefile.sys. The default location of the page file is in the root directory of the partition where
Windows is installed. Windows can be configured to use free space on any available drives for pagefiles. It is
required, however, for the boot partition (i.e. the drive containing the Windows directory) to have a pagefile on it if
the system is configured to write either kernel or full memory dumps after a crash. Windows uses the paging file as
temporary storage for the memory dump. When the system is rebooted, Windows copies the memory dump from the
pagefile to a separate file and frees the space that was used in the pagefile.[13]
Fragmentation
In Windows' default configuration the pagefile is allowed to expand beyond its initial allocation when necessary. If
this happens gradually, it can become heavily fragmented which can potentially cause performance problems.[14]
The common advice given to avoid this is to set a single "locked" pagefile size so that Windows will not expand it.
However, the pagefile only expands when it has been filled, which, in its default configuration, is 150% the total
amount of physical memory.[15] Thus the total demand for pagefile-backed virtual memory must exceed 250% of the
computer's physical memory before the pagefile will expand.
The fragmentation of the pagefile that occurs when it expands is temporary. As soon as the expanded regions are no
longer in use (at the next reboot, if not sooner) the additional disk space allocations are freed and the pagefile is back
to its original state.
Locking a page file's size can be problematic in the case that a Windows application requests more memory than the
total size of physical memory and the page file. In this case, requests to allocate memory fail, which may cause
applications and system processes to fail. Supporters of this view will note that the page file is rarely read or written
in sequential order, so the performance advantage of having a completely sequential page file is minimal. However,
it is generally agreed that a large page file will allow use of memory-heavy applications, and there is no penalty
except that more disk space is used.
The extra disk space may be trivial on systems using current specifications, e.g., a system with 3 GB of memory
having a 6 GB fixed-size swap file on a 750 GB disk drive, or a system with 6 GB of memory, a 16 GB fixed swap
file and 2 TB of disk space. In both examples the system uses about 0.8% of the disk space, with the swap file
pre-extended to its maximum.
Defragmenting the page file is also occasionally recommended to improve performance when a Windows system is
chronically using much more memory than its total physical memory. This view ignores the fact that, aside from the
temporary results of expansion, the pagefile does not become fragmented over time. In general, performance
concerns related to pagefile access are much more effectively dealt with by adding more physical memory.
Unix and Unix-like systems
Unix systems, and other Unix-like operating systems, use the term "swap" to describe both the act of moving
memory pages between RAM and disk, and the region of a disk the pages are stored on. In some of those systems, it
is common to dedicate an entire partition of a hard disk to swapping. These partitions are called swap partitions.
Many systems have an entire hard drive dedicated to swapping, separate from the data drive(s), containing only a
swap partition. A hard drive dedicated to swapping is called a "swap drive" or a "scratch drive" or a "scratch disk".
Some of those systems only support swapping to a swap partition; others also support swapping to files.
Linux
From a software point of view with the 2.6 Linux kernel, swap files are just as fast[16][17] as swap partitions. The
kernel keeps a map of where the swap file exists, and accesses the disk directly, bypassing caching and filesystem
overhead.[17] Red Hat recommends using a swap partition.[18] With a swap partition one can choose where on the
disk it resides and place it where the disk throughput is highest. The administrative flexibility of swap files can
outweigh the other advantages of swap partitions. For example, a swap file can be placed on any drive, can be set to
any desired size, and can be added or changed as needed. A swap partition, however, requires that it be set for the
entire hard drive, and once the size of a swap partition is set, it can't be changed without using tools to resize the
entire drive.
Linux supports using a virtually unlimited number of swapping devices, each of which can be assigned a priority.
When the operating system needs to swap pages out of physical memory, it uses the highest-priority device with free
space. If multiple devices are assigned the same priority, they are used in a fashion similar to a RAID 0
arrangement, which improves performance as long as the devices can be accessed efficiently in parallel. Therefore,
care should be taken when assigning the priorities: for example, swap areas located on the same physical disk should
not be used in parallel, but accessed in order from the fastest to the slowest (i.e., the fastest having the highest
priority).
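On Linux, a privileged program can attach a swap area with a chosen priority through the swapon(2) system call.
In the sketch below, /swapfile2 is a placeholder that must already have been prepared with mkswap, and the call
requires root privileges:

#include <stdio.h>
#include <sys/swap.h> /* Linux-specific */

int main(void) {
    /* The priority (here 10) is encoded in the low bits of the flags. */
    int flags = SWAP_FLAG_PREFER |
                ((10 << SWAP_FLAG_PRIO_SHIFT) & SWAP_FLAG_PRIO_MASK);
    if (swapon("/swapfile2", flags) != 0) {
        perror("swapon");
        return 1;
    }
    return 0;
}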
OS X
OS X uses multiple swap files. The default (and Apple-recommended) installation places them on the root partition,
though it is possible to place them instead on a separate partition or device.[19]
Solaris
Solaris allows swapping to raw disk slices as well as files. The traditional method is to use slice 1 (i.e. the second
slice) on the OS disk to house swap. Swap setup is managed by the system boot process if there are entries in the
"vfstab" file, but can also be managed manually through the use of the "swap" command. While it is possible to
remove, at runtime, all swap from a lightly loaded system, Sun does not recommend it. Recent additions to the ZFS
file system allow creation of ZFS devices that can be used as swap partitions. Swapping to normal files on ZFS file
systems is not supported.
AmigaOS 4
AmigaOS 4.0 introduced a new system for allocating RAM and defragmenting physical memory. It still uses a flat
shared address space that cannot be defragmented. The system is based on the slab allocation method and on paged
memory that allows swapping. Paging was implemented in AmigaOS 4.1, but the system may lock up if all physical
memory is used up.[20] Swap memory can be activated and deactivated at any moment, allowing the user to choose
to use only physical RAM.
Performance
The backing store for a virtual memory operating system is typically many orders of magnitude slower than RAM.
Additionally, using mechanical storage devices introduces delay, several milliseconds for a hard disk. Therefore, it is
desirable to reduce or eliminate swapping, where practical. Some operating systems offer settings to influence the
kernel's decisions.
1. Linux offers the /proc/sys/vm/swappiness parameter, which changes the balance between swapping out
runtime memory, as opposed to dropping pages from the system page cache (a sketch of reading this setting appears
after this list).
2. Windows 2000, XP, and Vista offer the DisablePagingExecutive registry setting, which controls
whether kernel-mode code and data can be eligible for paging out.
3. Mainframe computers frequently used head-per-track disk drives or drums for page and swap storage to eliminate
seek time, and several technologies[21] to have multiple concurrent requests to the same device in order to reduce
rotational latency.
4. Flash memory has a finite number of erase-write cycles (see Limitations of flash memory), and the smallest
amount of data that can be erased at once might be very large (128 KiB for an Intel X25-M SSD[22]), seldom
coinciding with the page size. Therefore, flash memory may wear out quickly if used as swap space under tight
memory conditions. On the positive side, flash memory has practically no access delay compared to hard disks
and, unlike RAM chips, is not volatile. Schemes like ReadyBoost and Intel Turbo Memory are made to exploit these
characteristics.
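As referenced in item 1 above, a program can inspect the current swappiness setting by reading the proc file; a
minimal sketch:

#include <stdio.h>

int main(void) {
    /* Reading is unprivileged; writing a new value (as root) changes how
       eagerly the kernel swaps runtime memory rather than dropping
       page cache pages. */
    FILE *f = fopen("/proc/sys/vm/swappiness", "r");
    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    int value;
    if (fscanf(f, "%d", &value) == 1)
        printf("vm.swappiness = %d\n", value);
    fclose(f);
    return 0;
}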
Many Unix-like operating systems (for example AIX, Linux and Solaris) allow using multiple storage devices for
swap space in parallel, to increase performance.
Tuning swap space size
In some older virtual memory operating systems, space in swap backing store is reserved when programs allocate
memory for runtime data. OS vendors typically issue guidelines about how much swap space should be allocated.
Reliability
Swapping can decrease system reliability by some amount. If swapped data gets corrupted on the disk (or at any
other location, or during transfer), the memory will also have incorrect contents after the data has later been returned.
Addressing limits on 32-bit hardware
Paging is one way of allowing the size of the addresses used by a process, the process's "virtual address space" or
"logical address space", to be different from the amount of main memory actually installed on a particular
computer, the physical address space.
Main memory smaller than virtual memory
In most systems, the size of a process's virtual address space is much larger than the available main memory.[23]
In these systems, the amount of main memory used by a process is, at most, the amount of physical main memory
available. The amount of physical main memory available is limited by the number of address bits on the address bus
that connects the CPU to main memory. For example, the 68000 CPU and the i386SX CPU both internally use
32-bit virtual addresses, but both have only 24 pins connected to the address bus, limiting addressing to at most
16 MB of physical main memory.
Even on systems that have the same or more physical address bits as virtual address bits, often the actual amount of
physical main memory installed is much less than the size that can potentially be addressed, for financial reasons or
because the hardware address map reserves large regions for I/O or other hardware features, so main memory cannot
be placed in those regions.
Main memory the same size as virtual memory
It is not uncommon to find 32-bit computers with 4 GB of RAM, the maximum amount of RAM addressable unless
the page table entry format supports physical addresses larger than 32 bits. For example, on 32-bit x86 processors,
the Physical Address Extension (PAE) feature is required to access more than 4 GB of RAM. For some machines,
e.g., the IBM S/370 in XA mode, the upper bit was not part of the address and only 2 GB could be addressed.
Paging and swap space can be used beyond this 4 GB limit, because they are addressed in terms of disk locations
rather than memory addresses.
While 32-bit programs on machines with linear address spaces will continue to be limited to the 4 GB they are
capable of addressing, because they each exist in their own virtual address space, a group of programs can together
grow beyond this limit.
On machines with segment registers, e.g., the access registers on an IBM System/370 in ESA mode,[24] the address
space size is limited only by OS constraints, e.g., the need to fit the mapping tables into the available storage.
Main memory larger than virtual address space
A few computers have a main memory larger than the virtual address space of a process, such as the Magic-1, some
PDP-11 machines, and some 32-bit processors with Physical Address Extension.[23]
This nullifies the main advantage of virtual memory, since a single process can't use more main memory than the
amount of its virtual address space. Such systems often use paging techniques to obtain secondary benefits:
• The "extra memory" can be used in the page cache to cache frequently used files and metadata, such as directory
information, from secondary storage.
• If the processor and operating system support multiple virtual address spaces, the "extra memory" can be used to
run more processes. Paging allows the cumulative total of virtual address spaces to exceed physical main
memory.
The size of the cumulative total of virtual address spaces is still limited by the amount of secondary storage
available.
Notes
[1] Belzer, Jack; Holzman, Albert G.; Kent, Allen, eds. (1981). "Virtual memory systems" (http://books.google.com/?id=KUgNGCJB4agC&printsec=frontcover). Encyclopedia of Computer Science and Technology. 14. CRC Press. p. 32. ISBN 0-8247-2214-0.
[2] E.g., MVS.
[3] Deitel, Harvey M. (1983). An Introduction to Operating Systems. Addison-Wesley. pp. 181, 187. ISBN 0-201-14473-5.
[4] Belzer, Jack; Holzman, Albert G.; Kent, Allen, eds. (1981). "Operating systems" (http://books.google.com/?id=uTFirmDlSL8C&printsec=frontcover). Encyclopedia of Computer Science and Technology. 11. CRC Press. p. 433. ISBN 0-8247-2261-2.
[5] Belzer, Jack; Holzman, Albert G.; Kent, Allen, eds. (1981). "Operating systems" (http://books.google.com/?id=uTFirmDlSL8C&printsec=frontcover). Encyclopedia of Computer Science and Technology. 11. CRC Press. p. 442. ISBN 0-8247-2261-2.
[6] Cragon, Harvey G. (1996). Memory Systems and Pipelined Processors (http://books.google.com/?id=q2w3JSFD7l4C). Jones and Bartlett Publishers. p. 109. ISBN 0-86720-474-5.
[7] Sumner, F. H.; Haley, G.; Chenh, E. C. Y. (1962), "The Central Control Unit of the 'Atlas' Computer", Information Processing 1962, IFIP Congress Proceedings, Proceedings of IFIP Congress 62, Spartan.
[8] "The Atlas" (http://www.computer50.org/kgill/atlas/atlas.html), University of Manchester: Department of Computer Science.
[9] "Atlas Architecture" (http://www.chilton-computing.org.uk/acl/technology/atlas/p005.htm), Atlas Computer, Chilton: Atlas Computer Laboratory.
[10] Kilburn, T.; Payne, R. B.; Howarth, D. J. (December 1961), "The Atlas Supervisor" (http://www.chilton-computing.org.uk/acl/technology/atlas/p019.htm), Computers - Key to Total Systems Control, Conference Proceedings, Volume 20, Proceedings of the Eastern Joint Computer Conference, Washington, D.C., Macmillan, pp. 279-294.
[11] A non-equivalence interruption occurs when the high order bits of an address do not match any entry in the associative memory.
[12] Kilburn, T.; Edwards, D. B. G.; Lanigan, M. J.; Sumner, F. H. (April 1962), "One-Level Storage System", IRE Transactions on Electronic Computers (Institute of Radio Engineers).
[13] Tsigkogiannis, Ilias (December 11, 2006). "Crash Dump Analysis" (http://blogs.msdn.com/iliast/archive/2006/12/11/crash-dump-analysis.aspx). Ilias Tsigkogiannis' Introduction to Windows Device Drivers. MSDN Blogs. Retrieved 2008-07-22.
[14] "Windows Sysinternals PageDefrag" (http://technet.microsoft.com/en-us/sysinternals/bb897426). Sysinternals. Microsoft. November 1, 2006. Retrieved 2010-12-20.
[15] "How to determine the appropriate page file size for 64-bit versions of Windows Server 2003 or Windows XP (MSKB 889654)" (http://support.microsoft.com/kb/889654). Knowledge Base. Microsoft. November 7, 2007. Retrieved 2007-12-26.
[16] "'Jesper Juhl': Re: How to send a break? - dump from frozen 64bit linux" (http://lkml.org/lkml/2006/5/29/3). LKML. 2006-05-29. Retrieved 2010-10-28.
[17] "Andrew Morton: Re: Swap partition vs swap file" (http://lkml.org/lkml/2005/7/7/326). LKML. Retrieved 2010-10-28.
[18] "Chapter 7. Swap Space" - Red Hat Customer Portal (https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Deployment_Guide/ch-swapspace.html): "Swap space can be a dedicated swap partition (recommended), a swap file, or a combination of swap partitions and swap files."
[19] John Siracusa (October 15, 2001). "Mac OS X 10.1" (http://arstechnica.com/reviews/os/macosx-10-1.ars/7). Ars Technica. Retrieved 2008-07-23.
[20] AmigaOS Core Developer (2011-01-08). "Re: Swap issue also on Update 4?" (http://forum.hyperion-entertainment.biz/viewtopic.php?f=14&t=755#p9346). Hyperion Entertainment. Retrieved 2011-01-08.
[21] E.g., Rotational Position Sensing on a Block Multiplexor channel.
[22] "Aligning filesystems to an SSD's erase block size | Thoughts by Ted" (http://thunk.org/tytso/blog/2009/02/20/aligning-filesystems-to-an-ssds-erase-block-size). Thunk.org. 2009-02-20. Retrieved 2010-10-28.
[23] Bill Buzbee. "Magic-1 Minix Demand Paging Design" (http://www.homebrewcpu.com/demand_paging.htm).
[24] IBM (January 1987), IBM System/370 Extended Architecture Principles of Operation, Second Edition, SA22-7085-1.
External links
• Windows Server - Moving Pagefile to another partition or disk (http://it.toolbox.com/blogs/
microsoft-infrastructure/moving-the-pagefilesys-to-another-partition-or-disk-35772) by David Nudelman
• How Virtual Memory Works (http://computer.howstuffworks.com/virtual-memory.htm) from
HowStuffWorks.com (in fact explains only the swapping concept, not virtual memory)
• Linux swap space management (http://www.faqs.org/docs/linux_admin/x1762.html) (outdated, as the author
admits)
• Guide On Optimizing Virtual Memory Speed (http://www.techarp.com/showarticle.aspx?artno=143) (outdated)
• Virtual Memory Page Replacement Algorithms (http://people.msoe.edu/~durant/courses/cs384/papers0405/
mccrawt.pdf)
• Windows XP. How to manually change the size of the virtual memory paging file (http://support.microsoft.
com/kb/308417/)
• Windows XP. Factors that may deplete the supply of paged pool memory (http://support.microsoft.com/
?id=312362)
• SwapFs (http://www.acc.umu.se/~bosse/) driver that can be used to save the paging file of Windows on a
swap partition of Linux.
Page fault
A page fault (sometimes #pf or pf) is a trap to the software raised by the hardware when a program accesses a page
that is mapped in the virtual address space, but not loaded in physical memory. In the typical case the operating
system tries to handle the page fault by making the required page accessible at a location in physical memory or kills
the program in the case of an illegal access. The hardware that detects a page fault is the memory management unit
in a processor. The exception handling software that handles the page fault is generally part of the operating system.
Contrary to what the name 'page fault' might suggest, page faults are not always errors and are common and
necessary to increase the amount of memory available to programs in any operating system that utilizes virtual
memory, including Microsoft Windows, Unix-like systems (including Mac OS X, Linux, *BSD, Solaris, AIX, and
HP-UX), and z/OS. Microsoft uses the term hard fault in more recent versions of the Resource Monitor (e.g.,
Windows Vista) to mean 'page fault'.[1]
Types
Minor
If the page is loaded in memory at the time the fault is generated, but is not marked in the memory management unit
as being loaded in memory, then it is called a minor or soft page fault. The page fault handler in the operating system
merely needs to make the entry for that page in the memory management unit point to the page in memory and
indicate that the page is loaded in memory; it does not need to read the page into memory. This could happen if the
memory is shared by different programs and the page is already brought into memory for other programs. The page
could also have been removed from a process's Working Set, but not yet written to disk or erased, such as in
operating systems that use Secondary Page Caching. For example, HP OpenVMS may remove a page that does not
need to be written to disk (if it has remained unchanged since it was last read from disk, for example) and place it on
a Free Page List if the working set is deemed too large. However, the page contents are not overwritten until the page
is assigned elsewhere, meaning it is still available if it is referenced by the original process before being allocated.
Since these faults do not involve disk latency, they are faster and less expensive than major page faults.
Major
If the page is not loaded in memory at the time the fault is generated, then it is called a major or hard page fault. The
page fault handler in the operating system needs to find a free page in memory, or choose another non-free page in
memory to be used for this page's data, which might be used by another process. In this latter case, the OS first needs
to write out the data in that page if it hasn't already been written out since it was last modified, and mark that page as
not being loaded into memory in its process page table. Once the page has thus been made available, the OS can read
the data for the new page into the physical page, and then make the entry for that page in the memory management
unit point to the page in memory and indicate that the page is loaded in memory. Major faults are more expensive
than minor page faults and add disk latency to the interrupted program's execution. This is the mechanism used by an
operating system to increase the amount of program memory available on demand. The operating system delays
loading parts of the program from disk until the program attempts to use it and the page fault is generated.
Invalid
If a page fault occurs for a reference to an address that's not part of the virtual address space, so that there can't be a
page in memory corresponding to it, then it is called an invalid page fault. The page fault handler in the operating
system then needs to terminate the code that made the reference, or deliver an indication to that code that the
reference was invalid. A null pointer is usually represented as a pointer to address 0 in the address space; many
operating systems set up the memory management unit to indicate that the page that contains that address is not in
memory, and do not include that page in the virtual address space, so that attempts to read or write the memory
referenced by a null pointer get an invalid page fault.
Handling illegal accesses and invalid page faults
Illegal accesses and invalid page faults can result in a program crash, segmentation error, bus error or core dump
depending on the operating system environment. Often these problems are caused by software bugs, but hardware
memory errors, such as those caused by overclocking, may corrupt pointers and make correct software fail.
Operating systems such as Windows and UNIX (and other UNIX-like systems) provide differing mechanisms for
reporting errors caused by page faults. Windows uses structured exception handling to report page fault-based
invalid accesses as access violation exceptions, and UNIX (and UNIX-like) systems typically use signals, such as
SIGSEGV, to report these error conditions to programs.
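A minimal sketch of catching such a signal on a UNIX-like system with sigaction(): the handler reports the
faulting address supplied in the siginfo_t structure and exits, since returning would simply re-run the faulting
instruction:

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static void on_segv(int sig, siginfo_t *info, void *context) {
    (void)sig; (void)context;
    /* fprintf is not strictly async-signal-safe, but suffices for a demo. */
    fprintf(stderr, "invalid access at address %p\n", info->si_addr);
    _exit(1); /* returning would re-execute the faulting instruction */
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    volatile int *null_ptr = NULL;
    *null_ptr = 42; /* invalid page fault -> SIGSEGV */
    return 0;
}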
If the program receiving the error does not handle it, the operating system performs a default action, typically
involving the termination of the running process that caused the error condition, and notifying the user that the
program has malfunctioned. Recent versions of Windows often report such problems by simply stating something
like "this program must close" (an experienced user or programmer with access to a debugger can still retrieve
detailed information). Recent Windows versions also write a minidump (similar in principle to a core dump)
describing the state of the crashed process. UNIX and UNIX-like operating systems report these conditions to the
user with error messages such as "segmentation violation", or "bus error", and may also produce a core dump.
Performance
Page faults, by their very nature, degrade the performance of a program or operating system and in the degenerate
case can cause thrashing. Optimizations to programs and the operating system that reduce the number of page faults
improve the performance of the program or even the entire system. The two primary focuses of the optimization
effort are reducing overall memory usage and improving memory locality. Generally, making more physical memory
available also reduces page faults. Many page replacement algorithms have been proposed, such as implementing
heuristic-based algorithms to reduce the incidence of page faults.
An average hard disk has an average rotational latency of 3 ms, a seek time of 5 ms, and a transfer time of
0.05 ms/page. Therefore, the total time for paging comes to nearly 8 ms (8,000,000 ns). If the memory access time is
200 ns, then the page fault would make the operation about 40,000 times slower. To reduce the page faults in the system,
programmers must make use of an appropriate page replacement algorithm that suits the current requirements and
maximizes the page hits.
References
• John L. Hennessy, David A. Patterson, Computer Architecture, A Quantitative Approach (ISBN 1-55860-724-2)
• Tanenbaum, Andrew S. Operating Systems: Design and Implementation (Second Edition). New Jersey:
Prentice-Hall 1997.
• Intel Architecture Software Developer's Manual, Volume 3: System Programming
[1] cf. Resource View Help in Microsoft operating systems
External links
• " So What Is A Page Fault? (http://www.osronline.com/article.cfm?article=222)" from OSR Online (a
Windows-specific explanation)
• " Virtual Memory Details (http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/3/html/
Introduction_to_System_Administration/s1-memory-virt-details.html)" from the Red Hat website.
• " UnhandledExceptionFilter (Windows) (http://msdn2.microsoft.com/en-us/library/ms681401.aspx)" from
MSDN Online (http://msdn.microsoft.com/library).
5. Files
File system
A file system (or filesystem) is an abstraction to store, retrieve and update a set of files. The term also identifies the
data structures specified by some of those abstractions, which are designed to organize multiple files as a single
stream of bytes, and the network protocols specified by some other of those abstractions, which are designed to
allow files on a remote machine to be accessed. By extension, the term also identifies software or firmware
components that implement the abstraction (i.e. that actually access the data source on behalf of other software or
firmware that uses those components).
The file system manages access to the data and the metadata of the files, and manages the available space of the
device(s) which contain it. Ensuring reliability is a major responsibility of a file system. A file system organizes data
in an efficient manner, and may be tuned to the characteristics of the backing device.
Some file systems are used on data storage devices, to maintain the locations of the files on the device (which is seen
as a stream of bytes). Others provide access to files residing on a server, by acting as clients for a network protocol
(e.g. NFS, SMB, or 9P clients). Others provide access to data that is not stored on a persistent device, and/or may be
computed on request (e.g. procfs). This is distinguished from a directory service and registry.
Aspects of file systems
Space management
Note: this only applies to file systems used in storage devices.
File systems allocate space in a granular manner, usually multiple
physical units on the device. The file system is responsible for
organizing files and directories, and keeping track of which areas of
the media belong to which file and which are not being used. For
example, in Apple DOS of the early 1980s, 256-byte sectors on a 140-kilobyte floppy disk used a track/sector map.
[Figure: Example of slack space, demonstrated with 4,096-byte NTFS clusters: 100,000 files, each 5 bytes per file, equals 500,000 bytes of actual data but requires 409,600,000 bytes of disk space to store.]
This results in unused space when a file is not an exact multiple of the allocation unit, sometimes referred to as
slack space. For a 512-byte allocation, the average unused space is 255 bytes. For 64 KB clusters,
the average unused space is 32 KB. The size of the allocation unit is
chosen when the file system is created. Choosing the allocation size based on the average size of the files expected to
be in the file system can minimize the amount of unusable space. Frequently the default allocation may provide
reasonable usage. Choosing an allocation size that is too small results in excessive overhead if the file system will
contain mostly very large files.
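The slack-space figures quoted in the caption above can be verified with a short calculation; the constants are taken from that example, not general values:

#include <stdio.h>

int main(void) {
    /* Numbers from the slack-space example: 100,000 five-byte files
       on a file system with 4,096-byte clusters. */
    long long files = 100000, file_size = 5, cluster = 4096;
    long long per_file = (file_size + cluster - 1) / cluster;  /* clusters, rounded up */
    long long consumed = files * per_file * cluster;
    long long data = files * file_size;
    printf("actual data:   %lld bytes\n", data);       /* 500,000 */
    printf("disk consumed: %lld bytes\n", consumed);   /* 409,600,000 */
    printf("slack space:   %lld bytes\n", consumed - data);
    return 0;
}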
File system fragmentation occurs when unused space or single files are not contiguous. As a file system is used, files
are created, modified and deleted. When a file is created the file system allocates space for the data. Some file
systems permit or require specifying an initial space allocation and subsequent incremental allocations as the file
grows. As files are deleted the space they were allocated eventually is considered available for use by other files.
This creates alternating used and unused areas of various sizes. This is free space fragmentation. When a file is
created and there is not an area of contiguous space available for its initial allocation the space must be assigned in
fragments. When a file is modified such that it becomes larger, it may exceed the space initially allocated to it;
another allocation must then be assigned elsewhere, and the file becomes fragmented.
Filenames
A filename (or file name) is used to identify a storage location in the file system. Most file systems have restrictions
on the length of filenames. In some file systems, filenames are case-insensitive (i.e., filenames such as FOO and
foo refer to the same file); in others, filenames are case-sensitive (i.e., the names FOO and foo refer to two
separate files).
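As a rough illustration (not any particular file system's lookup code), case-insensitive matching behaves like a case-folding string comparison, while case-sensitive matching compares names exactly; note that strcasecmp is POSIX rather than standard C:

#include <stdio.h>
#include <string.h>
#include <strings.h>   /* strcasecmp is POSIX, not ISO C */

int main(void) {
    const char *a = "FOO", *b = "foo";
    /* A case-sensitive file system distinguishes the two names;
       a case-insensitive one treats them as the same file. */
    printf("case-sensitive lookup:   %s\n",
           strcmp(a, b) == 0 ? "same file" : "different files");
    printf("case-insensitive lookup: %s\n",
           strcasecmp(a, b) == 0 ? "same file" : "different files");
    return 0;
}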
Most modern file systems allow filenames to contain a wide range of characters from the Unicode character set.
Most file system interface utilities, however, have restrictions on the use of certain special characters, disallowing
them within filenames (the file system may use these special characters to indicate a device, device type, directory
prefix, or file type). However, these special characters might be allowed by, for example, enclosing the filename
with double quotes ("). For simplicity, special characters are generally discouraged within filenames.
Directories
File systems typically have directories (also called folders) which allow the user to group files into separate
collections. This may be implemented by associating the file name with an index in a table of contents or an inode in
a Unix-like file system. Directory structures may be flat (i.e. linear), or allow hierarchies where directories may
contain subdirectories. The first file system to support arbitrary hierarchies of directories was used in the Multics
operating system.[1] The native file systems of Unix-like systems also support arbitrary directory hierarchies, as do,
for example, Apple's Hierarchical File System, and its successor HFS+ in classic Mac OS (HFS+ is still used in Mac
OS X), the FAT file system in MS-DOS 2.0 and later and Microsoft Windows, the NTFS file system in the Windows
NT family of operating systems, and the ODS-2 and higher levels of the Files-11 file system in OpenVMS.
Metadata
Other bookkeeping information is typically associated with each file within a file system. The length of the data
contained in a file may be stored as the number of blocks allocated for the file or as a byte count. The time that the
file was last modified may be stored as the file's timestamp. File systems might store the file creation time, the time it
was last accessed, the time the file's meta-data was changed, or the time the file was last backed up. Other
information can include the file's device type (e.g. block, character, socket, subdirectory, etc.), its owner user ID and
group ID, its access permissions and other file attributes (e.g. whether the file is read-only, executable, etc.).
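On POSIX systems, much of this bookkeeping is exposed through the stat() call; the following minimal sketch prints a few of the attributes named above (the file name is hypothetical):

#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

int main(void) {
    struct stat st;
    const char *path = "example.txt";   /* hypothetical file */
    if (stat(path, &st) != 0) { perror("stat"); return 1; }
    printf("size:     %lld bytes\n", (long long)st.st_size);
    printf("owner:    uid %d, gid %d\n", (int)st.st_uid, (int)st.st_gid);
    printf("mode:     %o\n", (unsigned)(st.st_mode & 0777));   /* permission bits */
    printf("modified: %s", ctime(&st.st_mtime));               /* last-modified timestamp */
    return 0;
}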
Additional attributes can be associated on file systems, such as NTFS, XFS, ext2, ext3, some versions of UFS, and
HFS+, using extended file attributes. Some file systems provide for user defined attributes such as the author of the
document, the character encoding of a document or the size of an image.
Some file systems allow for different data collections to be associated with one file name. These separate collections
may be referred to as streams or forks. Apple has long used a forked file system on the Macintosh, and Microsoft
supports streams in NTFS. Some file systems maintain multiple past revisions of a file under a single file name; the
filename by itself retrieves the most recent version, while prior saved versions can be accessed using a special naming
convention such as "filename;4" or "filename(-4)" to access the version from four saves ago.
File system as an abstract user interface
In some cases, a file system may not make use of a storage device but can be used to organize and represent access to
any data, whether it is stored or dynamically generated (e.g. procfs).
Utilities
The difference between a utility and a built-in core command function is arbitrary, depending on the design of the
operating system, and the memory and space limitations of the hardware. For example, in Microsoft MS-DOS,
formatting is performed by a utility and simple file copying is a built-in command, while in the Apple DOS,
formatting is a built-in command but simple file copying is performed with a utility.
File systems include utilities to initialize, alter parameters of and remove an instance of the file system. Some
include the ability to extend or truncate the space allocated to the file system.
Directory utilities create, rename and delete directory entries and alter metadata associated with a directory. They
may include a means to create additional links to a directory (hard links in Unix), rename parent links (".." in
Unix-like OS), and create bidirectional links to files.
File utilities create, list, copy, move and delete files, and alter metadata. They may be able to truncate data, truncate
or extend space allocation, append to, move, and modify files in-place. Depending on the underlying structure of the
file system, they may provide a mechanism to prepend to, or truncate from, the beginning of a file, insert entries into
the middle of a file or delete entries from a file.
Also in this category are utilities to free space for deleted files if the file system provides an undelete function.
Some file systems defer reorganization of free space, secure erasing of free space and rebuilding of hierarchical
structures. They provide utilities to perform these functions at times of minimal activity. Included in this category is
the infamous defragmentation utility.
Some of the most important features of file system utilities involve supervisory activities which may involve
bypassing ownership or direct access to the underlying device. These include high-performance backup and
recovery, data replication and reorganization of various data structures and allocation tables within the file system.
Restricting and permitting access
There are several mechanisms used by file systems to control access to data. Usually the intent is to prevent reading
or modifying files by a user or group of users. Another reason is to ensure data is modified in a controlled way so
access may be restricted to a specific program. Examples include passwords stored in the metadata of the file or
elsewhere and file permissions in the form of permission bits, access control lists, or capabilities. The need for file
system utilities to be able to access the data at the media level to reorganize the structures and provide efficient
backup usually means that these are only effective for polite users but are not effective against intruders.
Methods for encrypting file data are sometimes included in the file system. This is very effective since there is no
need for file system utilities to know the encryption seed to effectively manage the data. The risks of relying on
encryption include the fact that an attacker can copy the data and use brute force to decrypt the data. Losing the seed
means losing the data.
Maintaining integrity
One significant responsibility of a file system is to ensure that, regardless of the actions by programs accessing the
data, the structure remains consistent. This includes actions taken if a program modifying data terminates abnormally
or neglects to inform the file system that it has completed its activities. This may include updating the metadata, the
directory entry and handling any data that was buffered but not yet updated on the physical storage media.
Other failures which the file system must deal with include media failures or loss of connection to remote systems.
In the event of an operating system failure or "soft" power failure, special routines in the file system must be invoked
similar to when an individual program fails.
The file system must also be able to correct damaged structures. These may occur as a result of an operating system
failure for which the OS was unable to notify the file system, power failure or reset.
The file system must also record events to allow analysis of systemic issues as well as problems with specific files or
directories.
User data
The most important purpose of a file system is to manage user data. This includes storing, retrieving and updating
data.
Some file systems accept data for storage as a stream of bytes which are collected and stored in a manner efficient
for the media. When a program retrieves the data it specifies the size of a memory buffer and the file system transfers
data from the media to the buffer. Sometimes a runtime library routine may allow the user program to define a
record based on a library call specifying a length. When the user program reads the data the library retrieves data via
the file system and returns a record.
Some file systems allow the specification of a fixed record length which is used for all writes and reads. This
facilitates updating records.
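A fixed record length makes in-place updates simple, because a record's byte offset is just its number times the record length. A sketch, with an invented record size and file name:

#include <stdio.h>
#include <string.h>

#define RECLEN 64   /* invented fixed record length */

/* Overwrite record `recno` in place: with fixed-length records its
   offset is simply recno * RECLEN, so no other data has to move. */
static int update_record(FILE *f, long recno, const char *data) {
    char buf[RECLEN] = {0};
    strncpy(buf, data, RECLEN - 1);
    if (fseek(f, recno * RECLEN, SEEK_SET) != 0) return -1;
    return fwrite(buf, RECLEN, 1, f) == 1 ? 0 : -1;
}

int main(void) {
    FILE *f = fopen("records.dat", "r+b");   /* hypothetical record file */
    if (!f) { perror("fopen"); return 1; }
    if (update_record(f, 3, "updated payload") != 0)   /* rewrite the 4th record */
        fprintf(stderr, "update failed\n");
    fclose(f);
    return 0;
}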
An identification for each record, also known as a key, makes for a more sophisticated file system. The user program
can read, write and update records without regard to their location. This requires complicated management of
blocks of media, usually separating key blocks and data blocks. Very efficient algorithms, typically built on a
pyramid (tree-like) index structure, can be developed for locating records.
Using a file system
Utilities, language specific run-time libraries and user programs use file system APIs to make requests of the file
system. These include data transfer, positioning, updating metadata, managing directories, managing access
specifications, and removal.
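On a POSIX system, for instance, several of these request types map directly onto standard library calls; the file and directory names below are invented for illustration:

#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    /* Each call below is one of the request types listed above;
       the names are invented for illustration. */
    if (mkdir("archive", 0755) != 0) perror("mkdir");          /* directory management */
    if (rename("report.txt", "archive/report.txt") != 0)
        perror("rename");                                      /* namespace/metadata update */
    if (chmod("archive", 0700) != 0) perror("chmod");          /* access specification */
    if (unlink("archive/report.txt") != 0) perror("unlink");   /* removal */
    return 0;
}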
Multiple file systems within a single system
Frequently retail systems are configured with a single file system occupying the entire hard disk.
Another approach is to partition the disk so that several file systems with different attributes can be used. One file
system, for use as browser cache, might be configured with a small allocation size. This has the additional advantage
of keeping the frantic activity of creating and deleting files typical of browser activity in a narrow area of the disk
and not interfering with allocations of other files. A similar partition might be created for email. Another partition,
and file system might be created for the storage of audio or video files with a relatively large allocation. One of the
file systems may normally be set read-only and only periodically be set writable.
A third approach, which is mostly used in cloud systems, is to use "disk images" to house additional file systems,
with the same attributes or not, within another (host) file system as a file. A common example is virtualization: one
user can run an experimental Linux distribution (using the ext4 file system) in a virtual machine under his/her
production Windows environment (using NTFS). The ext4 file system resides in a disk image, which is treated as a
file (or multiple files, depending on the hypervisor and settings) in the NTFS host file system.
Having multiple file systems on a single system has the additional benefit that in the event of a corruption of a single
partition, the remaining file systems will frequently still be intact. This includes virus destruction of the system
partition or even a system that will not boot. File system utilities which require dedicated access can effectively be
completed piecemeal. In addition, defragmentation may be more effective. Several system maintenance utilities,
such as virus scans and backups, can also be processed in segments. For example, it is not necessary to back up the
file system containing videos along with all the other files if none have been added since the last backup. As for the
image files, one can easily "spin off" differential images which contain only "new" data written to the master
(original) image. Differential images can be used both for safety (as a "disposable" system that can be quickly
restored if destroyed or contaminated by a virus, since the old image can be removed and a new image created in a
matter of seconds, even without automated procedures) and for quick virtual machine deployment (since the
differential images can be quickly spawned using a script, in batches).
Design limitations
All file systems have some functional limit that defines the maximum storable data capacity within that system.
These functional limits are a best-guess effort by the designer based on how large the storage systems are right now
and how large storage systems are likely to become in the future. Disk storage has continued to increase at near
exponential rates (see Moore's law), so after a few years, file systems have kept reaching design limitations that
require computer users to repeatedly move to a newer system with ever-greater capacity.
File system complexity typically varies proportionally with the available storage capacity. The file systems of early
1980s home computers with 50€KB to 512€KB of storage would not be a reasonable choice for modern storage
systems with hundreds of gigabytes of capacity. Likewise, modern file systems would not be a reasonable choice for
these early systems, since the complexity of modern file system structures would consume most or all of the very
limited capacity of the early storage systems.
Types of file systems
File system types can be classified into disk/tape file systems, network file systems and special-purpose file systems.
Disk file systems
A disk file system takes advantages of the ability of disk storage media to randomly address data in a short amount of
time. Additional considerations include the speed of accessing data following that initially requested and the
anticipation that the following data may also be requested. This permits multiple users (or processes) access to
various data on the disk without regard to the sequential location of the data. Examples include FAT (FAT12,
FAT16, FAT32), exFAT, NTFS, HFS and HFS+, HPFS, UFS, ext2, ext3, ext4, btrfs, ISO 9660, Files-11, Veritas
File System, VMFS, ZFS, ReiserFS and UDF. Some disk file systems are journaling file systems or versioning file
systems.
Optical discs
ISO 9660 and Universal Disk Format (UDF) are two common formats that target Compact Discs, DVDs and Blu-ray
discs. Mount Rainier is an extension to UDF supported by Linux 2.6 series and Windows Vista that facilitates
rewriting to DVDs.
Flash file systems
A flash file system considers the special abilities, performance and restrictions of flash memory devices. Frequently a
disk file system can use a flash memory device as the underlying storage media but it is much better to use a file
system specifically designed for a flash device.
Tape file systems
A tape file system is a file system and tape format designed to store files on tape in a self-describing form. Magnetic
tapes are sequential storage media with significantly longer random data access times than disks, posing challenges
to the creation and efficient management of a general-purpose file system.
In a disk file system there is typically a master file directory, and a map of used and free data regions. Any file
additions, changes, or removals require updating the directory and the used/free maps. Random access to data
regions is measured in milliseconds so this system works well for disks.
Tape requires linear motion to wind and unwind potentially very long reels of media. This tape motion may take
several seconds to several minutes to move the read/write head from one end of the tape to the other.
Consequently, a master file directory and usage map can be extremely slow and inefficient with tape. Writing
typically involves reading the block usage map to find free blocks for writing, updating the usage map and directory
to add the data, and then advancing the tape to write the data in the correct spot. Each additional file write requires
updating the map and directory and writing the data, which may take several seconds to occur for each file.
Tape file systems instead typically allow for the file directory to be spread across the tape intermixed with the data,
referred to as streaming, so that time-consuming and repeated tape motions are not required to write new data.
However, a side effect of this design is that reading the file directory of a tape usually requires scanning the entire
tape to read all the scattered directory entries. Most data archiving software that works with tape storage will store a
local copy of the tape catalog on a disk file system, so that adding files to a tape can be done quickly without having
to rescan the tape media. The local tape catalog copy is usually discarded if not used for a specified period of time, at
which point the tape must be re-scanned if it is to be used in the future.
IBM has developed a file system for tape called the Linear Tape File System. The IBM implementation of this file
system has been released as the open-source IBM Linear Tape File System - Single Drive Edition (LTFS-SDE)
product. The Linear Tape File System uses a separate partition on the tape to record the index meta-data, thereby
avoiding the problems associated with scattering directory entries across the entire tape.
Tape formatting
Writing data to a tape is often a significantly time-consuming process that may take several hours. Similarly,
completely erasing or formatting a tape can also take several hours. With many data tape technologies it is not
necessary to format the tape before over-writing new data to the tape. This is due to the inherently destructive nature
of overwriting data on sequential media.
Because of the time it can take to format a tape, typically tapes are pre-formatted so that the tape user does not need
to spend time preparing each new tape for use. All that is usually necessary is to write an identifying media label to
the tape before use, and even this can be automatically written by software when a new tape is used for the first time.
Database file systems
Another concept for file management is the idea of a database-based file system. Instead of, or in addition to,
hierarchical structured management, files are identified by their characteristics, like type of file, topic, author, or
similar rich metadata. [2]
IBM DB2 for i [3] (formerly known as DB2/400 and DB2 for i5/OS) is a database file system that is part of the
object-based IBM i [4] operating system (formerly known as OS/400 and i5/OS), incorporating a single-level store and
running on IBM Power Systems (formerly known as AS/400 and iSeries). It was designed by Frank G. Soltis, IBM's
former chief scientist for IBM i. From around 1978 to 1988, Frank G. Soltis and his team at IBM Rochester
successfully designed and applied technologies like the database file system where others, such as Microsoft, later
failed to do so [5]. These technologies are informally known as 'Fortress Rochester' and were in a few basic aspects
extended from early mainframe technologies, but are in many ways more advanced from a technology perspective.
Some other projects that aren't "pure" database file systems but that use some aspects of a database file system:
• Many web content management systems use a relational DBMS to store and retrieve files. Examples: XHTML files are stored as XML
or text fields, image files are stored as blob fields; SQL SELECT (with optional XPath) statements retrieve the
files, and allow the use of sophisticated logic and richer information associations than "usual file systems".
• Very large file systems, embodied by applications like Apache Hadoop and Google File System, use some
database file system concepts.
Transactional file systems
Some programs need to update multiple files "all at once". For example, a software installation may write program
binaries, libraries, and configuration files. If the software installation fails, the program may be unusable. If the
installation is upgrading a key system utility, such as the command shell, the entire system may be left in an
unusable state.
Transaction processing introduces the isolation guarantee, which states that operations within a transaction are
hidden from other threads on the system until the transaction commits, and that interfering operations on the system
will be properly serialized with the transaction. Transactions also provide the atomicity guarantee, that operations
inside of a transaction are either all committed, or the transaction can be aborted and the system discards all of its
partial results. This means that if there is a crash or power failure, after recovery, the stored state will be consistent.
Either the software will be completely installed or the failed installation will be completely rolled back, but an
unusable partial install will not be left on the system.
Windows, beginning with Vista, added transaction support to NTFS, in a feature called Transactional NTFS, but its
use is now discouraged.[6] There are a number of research prototypes of transactional file systems for UNIX systems,
including the Valor file system,[7] Amino,[8] LFS,[9] and a transactional ext3 file system on the TxOS kernel,[10] as
well as transactional file systems targeting embedded systems, such as TFFS.[11]
Ensuring consistency across multiple file system operations is difficult, if not impossible, without file system
transactions. File locking can be used as a concurrency control mechanism for individual files, but it typically does
not protect the directory structure or file metadata. For instance, file locking cannot prevent TOCTTOU race
conditions on symbolic links. File locking also cannot automatically roll back a failed operation, such as a software
upgrade; this requires atomicity.
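Short of full transactions, applications on POSIX systems often obtain single-file atomicity with the write-temporary-then-rename idiom sketched below. This is not a transactional file system API: it covers one file only, and a durable version would also call fsync(); but rename() within a single file system is atomic, so readers never observe a partial file:

#include <stdio.h>

/* Write a temporary file, then rename() it over the target. Within a
   single file system rename() is atomic on POSIX, so readers see either
   the old contents or the new, never a half-written file. */
static int replace_file_atomically(const char *path, const char *contents) {
    char tmp[4096];
    snprintf(tmp, sizeof tmp, "%s.tmp", path);   /* invented temp-name scheme */
    FILE *f = fopen(tmp, "w");
    if (!f) return -1;
    if (fputs(contents, f) == EOF) { fclose(f); remove(tmp); return -1; }
    if (fclose(f) == EOF) { remove(tmp); return -1; }
    return rename(tmp, path);                    /* the atomic commit point */
}

int main(void) {
    if (replace_file_atomically("app.conf", "mode=safe\n") != 0)
        perror("replace_file_atomically");
    return 0;
}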
Journaling file systems are one technique used to introduce transaction-level consistency to file system structures.
Journal transactions are not exposed to programs as part of the OS API; they are only used internally to ensure
consistency at the granularity of a single system call.
Data backup systems typically do not provide support for direct backup of data stored in a transactional manner,
which makes recovery of reliable and consistent data sets difficult. Most backup software simply notes what files
have changed since a certain time, regardless of the transactional state shared across multiple files in the overall
dataset. As a workaround, some database systems simply produce an archived state file containing all data up to that
point, and the backup software only backs that up and does not interact directly with the active transactional
databases at all. Recovery requires separate recreation of the database from the state file, after the file has been
restored by the backup software.
Network file systems
A network file system is a file system that acts as a client for a remote file access protocol, providing access to files
on a server. Examples of network file systems include clients for the NFS, AFS, SMB protocols, and file-system-like
clients for FTP and WebDAV.
Shared disk file systems
A shared disk file system is one in which a number of machines (usually servers) all have access to the same external
disk subsystem (usually a SAN). The file system arbitrates access to that subsystem, preventing write collisions.
Examples include GFS2 from Red Hat, GPFS from IBM, and SFS from DataPlow.
Special file systems
A special file system presents non-file elements of an operating system as files so they can be acted on using file
system APIs. This is most commonly done in Unix-like operating systems, but devices are given file names in some
non-Unix-like operating systems as well.
Device file systems
A device file system represents I/O devices and pseudo-devices as files, called device files. Examples in Unix-like
systems include devfs and, in Linux 2.6 systems, udev. In non-Unix-like systems, such as TOPS-10 and other
operating systems influenced by it, where the full filename or pathname of a file can include a device prefix, devices
other than those containing file systems are referred to by a device prefix specifying the device, without anything
following it.
Other special file systems
• In the Linux kernel, configfs and sysfs provide files that can be used to query the kernel for information and
configure entities in the kernel.
• procfs maps processes and, on Linux, other operating system structures into a filespace.
Minimal file system / Audio-cassette storage
The late 1970s saw the development of the microcomputer, but disk and digital tape devices were too
expensive for hobbyists. An inexpensive basic data storage system was devised that used common audio cassette
tape.
When the system needed to write data, the user was notified to press "RECORD" on the cassette recorder, then press
"RETURN" on the keyboard to notify the system that the cassette recorder was recording. The system wrote a sound
to provide time synchronization, then modulated sounds that encoded a prefix, the data, a checksum and a suffix.
When the system needed to read data, the user was instructed to press "PLAY" on the cassette recorder. The system
would listen to the sounds on the tape waiting until a burst of sound could be recognized as the synchronization. The
system would then interpret subsequent sounds as data. When the data read was complete, the system would notify
the user to press "STOP" on the cassette recorder. It was primitive, but it worked (a lot of the time). Data was stored
sequentially in an unnamed format. Multiple sets of data could be written and located by fast-forwarding the tape and
observing the tape counter to find the approximate start of the next data region on the tape. The user might have to
listen to the sounds to find the right spot to begin playing the next data region. Some implementations even included
audible sounds interspersed with the data.
Flat file systems
In a flat file system, there are no subdirectories.
When floppy disk media was first available this type of file system was adequate due to the relatively small amount
of data space available. CP/M machines featured a flat file system, where files could be assigned to one of 16 user
areas, and generic file operations could be narrowed to work on one area instead of defaulting to work on all of them.
These user areas were no more than special attributes associated with the files; that is, it was not necessary to define
a specific quota for each of these areas, and files could be added to groups for as long as there was still free storage
space on the disk. The Apple Macintosh also featured a flat file system, the Macintosh File System (MFS). It was
unusual in that the file management program (Macintosh Finder) created the illusion of a partially hierarchical filing
system on top of MFS. This structure required every file to have a unique name, even if it appeared to be in a separate folder.
While simple, flat file systems become awkward as the number of files grows and make it difficult to organize data
into related groups of files.
A recent addition to the flat file system family is Amazon's S3, a remote storage service, which is intentionally
simplistic to allow users the ability to customize how their data is stored. The only constructs are buckets (imagine a
disk drive of unlimited size) and objects (similar, but not identical, to the standard concept of a file). Advanced file
management is enabled by the ability to use nearly any character (including '/') in an object's name, and the ability
to select subsets of the bucket's content based on shared prefixes.
File systems and operating systems
Many operating systems include support for more than one file system. Sometimes the OS and the file system are so
tightly interwoven it is difficult to separate out file system functions.
There needs to be an interface provided by the operating system software between the user and the file system. This
interface can be textual (such as the command line interface provided by the Unix shell or OpenVMS DCL)
or graphical (such as a file browser provided by a graphical user interface). If graphical, the metaphor of the
folder, containing documents, other files, and nested folders is often used (see also: directory and folder).
Unix-like operating systems
Unix-like operating systems create a virtual file system, which makes all the files on all the devices appear to exist in
a single hierarchy. This means, in those systems, there is one root directory, and every file existing on the system is
located under it somewhere. Unix-like systems can use a RAM disk or network shared resource as their root directory.
Unix-like systems assign a device name to each device, but this is not how the files on that device are accessed.
Instead, to gain access to files on another device, the operating system must first be informed where in the directory
tree those files should appear. This process is called mounting a file system. For example, to access the files on a
CD-ROM, one must tell the operating system "Take the file system from this CD-ROM and make it appear under
such-and-such directory". The directory given to the operating system is called the mount pointۥ it might, for
example, be /media. The /media directory exists on many Unix systems (as specified in the Filesystem
Hierarchy Standard) and is intended specifically for use as a mount point for removable media such as CDs, DVDs,
USB drives or floppy disks. It may be empty, or it may contain subdirectories for mounting individual devices.
Generally, only the administrator (i.e. root user) may authorize the mounting of file systems.
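On Linux, for example, the mount(8) command ultimately issues the mount(2) system call; the sketch below mounts a CD-ROM read-only (device, directory and file system type are illustrative, and root privileges are required):

#include <stdio.h>
#include <sys/mount.h>   /* Linux-specific mount(2) */

int main(void) {
    /* Roughly what "mount -t iso9660 -o ro /dev/cdrom /media/cdrom" does. */
    if (mount("/dev/cdrom", "/media/cdrom", "iso9660", MS_RDONLY, "") != 0) {
        perror("mount");
        return 1;
    }
    printf("CD-ROM file system now appears under /media/cdrom\n");
    return 0;
}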
Unix-like operating systems often include software and tools that assist in the mounting process and provide new
functionality. Some of these strategies have been coined "auto-mounting" as a reflection of their purpose.
1. In many situations, file systems other than the root need to be available as soon as the operating system has
booted. All Unix-like systems therefore provide a facility for mounting file systems at boot time. System
administrators define these file systems in the configuration file fstab (vfstab in Solaris), which also indicates
options and mount points; a sample entry appears after this list.
2. In some situations, there is no need to mount certain file systems at boot time, although their use may be desired
thereafter. There are some utilities for Unix-like systems that allow the mounting of predefined file systems upon
demand.
3. Removable media have become very common with microcomputer platforms. They allow programs and data to
be transferred between machines without a physical connection. Common examples include USB flash drives,
CD-ROMs, and DVDs. Utilities have therefore been developed to detect the presence and availability of a
medium and then mount that medium without any user intervention.
4. Progressive Unix-like systems have also introduced a concept called supermounting; see, for example, the Linux
supermount-ng project [12]. For example, a floppy disk that has been supermounted can be physically removed
from the system. Under normal circumstances, the disk should have been synchronized and then unmounted
before its removal. Provided synchronization has occurred, a different disk can be inserted into the drive. The
system automatically notices that the disk has changed and updates the mount point contents to reflect the new
medium. Similar functionality is found on Windows machines.
5. An automounter will automatically mount a file system when a reference is made to the directory atop which it
should be mounted. This is usually used for file systems on network servers, rather than relying on events such as
the insertion of media, as would be appropriate for removable media.
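The fstab file referenced in item 1 is a plain-text table with six fields per line: device, mount point, file system type, mount options, dump flag, and fsck pass number. The entries below are hypothetical examples, not a prescription:

/dev/sda1            /          ext4  defaults  0  1
/dev/sda2            /home      ext4  defaults  0  2
server:/export/home  /mnt/nfs   nfs   noauto    0  0

The third line illustrates a network file system that is defined in advance but, because of the noauto option, mounted only on demand (item 2 above).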
Linux
Linux supports many different file systems, but common choices for the system disk on a block device include the
ext* family (such as ext2, ext3 and ext4), XFS, JFS, ReiserFS and btrfs. For raw flash without a flash translation
layer (FTL) or Memory Technology Device (MTD), there is UBIFS, JFFS2, and YAFFS, among others. SquashFS is
a common compressed read-only file system.
Solaris
In earlier releases, the Sun Microsystems Solaris operating system defaulted to (non-journaled or non-logging) UFS
for bootable and supplementary file systems; Solaris supported and extended UFS.
Support for other file systems and significant enhancements were added over time, including Veritas Software Corp.
(Journaling) VxFS, Sun Microsystems (Clustering) QFS, Sun Microsystems (Journaling) UFS, and Sun
Microsystems (open source, poolable, 128 bit compressible, and error-correcting) ZFS.
Kernel extensions were added to Solaris to allow for bootable Veritas VxFS operation. Logging or Journaling was
added to UFS in Sun's Solaris 7. Releases of Solaris 10, Solaris Express, OpenSolaris, and other open source variants
of the Solaris operating system later supported bootable ZFS.
Logical Volume Management allows for spanning a file system across multiple devices for the purpose of adding
redundancy, capacity, and/or throughput. Legacy environments in Solaris may use Solaris Volume Manager
(formerly known as Solstice DiskSuite). Multiple operating systems (including Solaris) may use Veritas Volume
Manager. Modern Solaris-based operating systems eliminate the need for separate volume management by leveraging
virtual storage pools in ZFS.
OS X
OS X uses a file system that it inherited from classic Mac OS called HFS Plus, sometimes called Mac OS Extended.
HFS Plus is a metadata-rich and case-preserving but (usually) case-insensitive file system. Due to the Unix roots of
OS X, Unix permissions were added to HFS Plus. Later versions of HFS Plus added journaling to prevent corruption
of the file system structure and introduced a number of optimizations to the allocation algorithms in an attempt to
defragment files automatically without requiring an external defragmenter.
Filenames can be up to 255 characters. HFS Plus uses Unicode to store filenames. On OS X, the filetype can come
from the type code, stored in the file's metadata, or from the filename extension.
HFS Plus has three kinds of links: Unix-style hard links, Unix-style symbolic links and aliases. Aliases are designed
to maintain a link to their original file even if they are moved or renamed; they are not interpreted by the file system
itself, but by the File Manager code in userland.
OS X also supports the UFS file system, derived from the BSD Unix Fast File System via NeXTSTEP. However, as
of Mac OS X 10.5 (Leopard), OS X can no longer be installed on a UFS volume, nor can a pre-Leopard system
installed on a UFS volume be upgraded to Leopard.[13]
Newer versions of OS X are capable of reading and writing to the legacy FAT file systems (FAT16 and FAT32) common on
Windows. They are also capable of reading the newer NTFS file systems for Windows. In order to write to NTFS
file systems on OS X versions prior to 10.6 (Snow Leopard) third party software is necessary. Mac OS X 10.6 (Snow
Leopard) and later allows writing to NTFS file systems, but only after a non-trivial system setting change (third party
software exists that automates this).
PC-BSD
PC-BSD is a desktop version of FreeBSD, which inherits FreeBSD's ZFS support, similarly to FreeNAS. The new
graphical installer of PC-BSD can handle / (root) on ZFS and RAID-Z pool installs and disk encryption using Geli
right from the start in an easy, convenient (GUI) way. The current PC-BSD 9.0+ 'Isotope Edition' has ZFS filesystem
version 5 and ZFS storage pool version 28.
Plan 9
Plan 9 from Bell Labs treats everything as a file, accessed as a file would be (i.e., no ioctl or mmap): networking,
graphics, debugging, authentication, capabilities, encryption, and other services are accessed via I/O operations on
file descriptors. The 9P protocol removes the difference between local and remote files.
These file systems are organized with the help of private, per-process namespaces, allowing each process to have a
different view of the many file systems that provide resources in a distributed system.
The Inferno operating system shares these concepts with Plan 9.
Microsoft Windows
Windows makes use of the FAT, NTFS, exFAT and ReFS file systems (the latter is only supported and usable in
Windows Server 2012; Windows cannot boot from it).
[Figure: Directory listing in a Windows command shell]
Windows uses a drive letter abstraction at the user level to distinguish one disk or partition from another. For
example, the path C:\WINDOWS represents a directory WINDOWS on the partition represented by the letter C. Drive
C: is most commonly used for the primary hard disk partition (since at the advent of hard disks many computers had
two floppy drives, A: and B:), on which Windows is usually installed and from which it boots. This
"tradition" has become so firmly ingrained that bugs exist in many applications which make assumptions that the
drive that the operating system is installed on is C. The use of drive letters, and the tradition of using "C" as the drive
letter for the primary hard disk partition, can be traced to MS-DOS, where the letters A and B were reserved for up
to two floppy disk drives. This in turn derived from CP/M in the 1970s, and ultimately from IBM's CP/CMS of 1967.
FAT
The family of FAT file systems is supported by almost all operating systems for personal computers, including all
versions of Windows and MS-DOS/PC DOS and DR-DOS. (PC DOS is an OEM version of MS-DOS, MS-DOS was
originally based on SCP's 86-DOS. DR-DOS was based on Digital Research's Concurrent DOS, a successor of
CP/M-86.) The FAT file systems are therefore well-suited as a universal exchange format between computers and
devices of most any type and age.
The FAT file system traces its roots back to an (incompatible) 8-bit FAT precursor in Stand-alone Disk BASIC and
the short-lived MDOS/MIDAS project.
Over the years, the file system has been expanded from FAT12 to FAT16 and FAT32. Various features have been
added to the file system including subdirectories, codepage support, extended attributes, and long filenames.
Third-parties such as Digital Research have incorporated optional support for deletion tracking, and
volume/directory/file-based multi-user security schemes to support file and directory passwords and permissions
such as read/write/execute/delete access rights. Most of these extensions are not supported by Windows.
The FAT12 and FAT16 file systems had a limit on the number of entries in the root directory of the file system and
had restrictions on the maximum size of FAT-formatted disks or partitions.
FAT32 addresses the limitations in FAT12 and FAT16, except for the file size limit of close to 4 GB, but it remains
limited compared to NTFS.
FAT12, FAT16 and FAT32 also have a limit of eight characters for the file name, and three characters for the
extension (such as .exe). This is commonly referred to as the 8.3 filename limit. VFAT, an optional extension to
FAT12, FAT16 and FAT32, introduced in Windows 95 and Windows NT 3.5, allowed long file names (LFN) to be
stored in the FAT file system in a backwards compatible fashion.
NTFS
NTFS, introduced with the Windows NT operating system, allowed ACL-based permission control. Other features
also supported by NTFS include hard links, multiple file streams, attribute indexing, quota tracking, sparse files,
encryption, compression, and reparse points (directories working as mount-points for other file systems, symlinks,
junctions, remote storage links), though not all these features are well-documented.
exFAT
exFAT is a proprietary and patent-protected file system with certain advantages over NTFS with regards to file
system overhead.
exFAT is not backwards compatible with FAT file systems such as FAT12, FAT16 or FAT32. The file system is
supported with newer Windows systems, such as Windows 2003, Windows Vista, Windows 2008, Windows 7 and
more recently, support has been added for Windows XP.[14] Support in other operating systems is sparse since
Microsoft has not published the specifications of the file system and implementing support for exFAT requires a
license.
Other file systems
• The Prospero File System is a file system based on the Virtual System Model.[15] The system was created by Dr.
B. Clifford Neuman of the Information Sciences Institute at the University of Southern California.[16]
• RSRE FLEX file system - written in ALGOL 68
• The file system of the Michigan Terminal System (MTS) is interesting because: (i) it provides "line files" where
record lengths and line numbers are associated as metadata with each record in the file, lines can be added,
replaced, updated with the same or different length records, and deleted anywhere in the file without the need to
read and rewrite the entire file; (ii) using program keys files may be shared or permitted to commands and
programs in addition to users and groups; and (iii) there is a comprehensive file locking mechanism that protects
both the file's data and its metadata.[17][18]
Limitations
Converting the type of a file system
It may be advantageous or necessary to have files in a different file system than the one in which they currently exist.
Reasons include the need for an increase in the space requirements beyond the limits of the current file system. The
depth of path may need to be increased beyond the restrictions of the file system. There may be performance or
reliability considerations. Providing access to another operating system which does not support the existing file
system is another reason.
In-place conversion
In some cases conversion can be done in-place, although migrating the file system is more conservative, as it
involves creating a copy of the data, and is therefore recommended.[19] On Windows, FAT and FAT32 file systems can be
converted to NTFS via the convert.exe utility, but not the reverse.[19] On Linux, ext2 can be converted to ext3 (and
converted back), and ext3 can be converted to ext4 (but not back),[20] and both ext3 and ext4 can be converted to
btrfs, and converted back until the undo information is deleted.[21] These conversions are possible due to using the
same format for the file data itself, and relocating the metadata into empty space, in some cases using sparse file
support.[21]
Migrating to a different file system
Migration has the disadvantage of requiring additional space although it may be faster. The best case is if there is
unused space on media which will contain the final file system.
For example, to migrate a FAT32 file system to an ext2 file system: first create a new ext2 file system, then copy the
data to it, then delete the FAT32 file system.
An alternative, when there is not sufficient space to retain the original file system until the new one is created, is to
use a work area (such as a removable media). This takes longer but a backup of the data is a nice side effect.
Long file paths and long file names
In hierarchical file systems, files are accessed by means of a path that is a branching list of directories containing the
file. Different file systems have different limits on the depth of the path. File systems also have a limit on the length
of an individual filename.
Copying files with long names or located in paths of significant depth from one file system to another may cause
undesirable results. This depends on how the utility doing the copying handles the discrepancy. See also pathmunge
[22]
References
Cited references
[1] R. C. Daley; P. G. Neumann (1965). "A General-Purpose File System For Secondary Storage" (http://www.multicians.org/fjcc4.html). Fall Joint Computer Conference. AFIPS. pp. 213-229. doi:10.1145/1463891.1463915. Retrieved 2011-07-30.
[2] http://www.theregister.co.uk/2002/03/29/windows_on_a_database_sliced/
[3] http://www-03.ibm.com/systems/i/software/db2/index.html
[4] http://www.ibm.com/developerworks/ibmi/newto/
[5] http://www.theregister.co.uk/2002/01/28/xp_successor_longhorn_goes_sql/
[6] http://msdn.microsoft.com/en-us/library/windows/desktop/hh802690(v=vs.85).aspx
[7] Spillane, Richard; Gaikwad, Sachin; Chinni, Manjunath; Zadok, Erez and Wright, Charles P.; 2009; "Enabling transactional file access via lightweight kernel extensions" (http://www.fsl.cs.sunysb.edu/docs/valor/valor_fast2009.pdf); Seventh USENIX Conference on File and Storage Technologies (FAST 2009)
[8] Wright, Charles P.; Spillane, Richard; Sivathanu, Gopalan; Zadok, Erez; 2007; "Extending ACID Semantics to the File System" (http://www.fsl.cs.sunysb.edu/docs/amino-tos06/amino.pdf); ACM Transactions on Storage
[9] Seltzer, Margo I.; 1993; "Transaction Support in a Log-Structured File System" (http://www.eecs.harvard.edu/~margo/papers/icde93/paper.pdf); Proceedings of the Ninth International Conference on Data Engineering
[10] Porter, Donald E.; Hofmann, Owen S.; Rossbach, Christopher J.; Benn, Alexander and Witchel, Emmett; 2009; "Operating System Transactions" (http://www.sigops.org/sosp/sosp09/papers/porter-sosp09.pdf); in the Proceedings of the 22nd ACM Symposium on Operating Systems Principles (SOSP '09), Big Sky, MT, October 2009
[11] Gal, Eran; Toledo, Sivan; "A Transactional Flash File System for Microcontrollers" (http://www.usenix.org/event/usenix05/tech/general/full_papers/gal/gal.pdf)
[12] http://sourceforge.net/projects/supermount-ng
[13] Mac OS X 10.5 Leopard: Installing on a UFS-formatted volume (http://docs.info.apple.com/article.html?artnum=306516)
[14] Microsoft WinXP exFAT patch (http://www.microsoft.com/downloads/details.aspx?FamilyID=1cbe3906-ddd1-4ca2-b727-c2dff5e30f61&displaylang=en)
[15] The Prospero File System: A Global File System Based on the Virtual System Model (http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.132.7982)
[16] cs.ucsb.edu (http://www.cs.ucsb.edu/~ravenben/papers/fsml/prospero-gfsvsm.ps.gz)
[17] "A file system for a general-purpose time-sharing environment" (http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1451786), G. C. Pirkola, Proceedings of the IEEE, June 1975, volume 63 no. 6, pp. 918-924, ISSN 0018-9219
[18] "The Protection of Information in a General Purpose Time-Sharing Environment" (https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxtaWNoaWdhbnRlcm1pbmFsc3lzdGVtfGd4Ojc5MTAxNzg1NTVmMjg5Mzk), Gary C. Pirkola and John Sanguinetti, Proceedings of the IEEE Symposium on Trends and Applications 1977: Computer Security and Integrity, vol. 10 no. 4, pp. 106-114
[19] How to Convert FAT Disks to NTFS (http://technet.microsoft.com/en-us/library/bb456984.aspx), Microsoft, October 25, 2001
[20] Converting an ext3 filesystem to ext4 (https://ext4.wiki.kernel.org/index.php/Ext4_Howto#Converting_an_ext3_filesystem_to_ext4)
[21] Conversion from Ext3 (https://btrfs.wiki.kernel.org/index.php/Conversion_from_Ext3), Btrfs wiki
[22] http://www.cyberciti.biz/faq/redhat-linux-pathmunge-command-in-shell-script/
General references
• Jonathan de Boyne Pollard (1996). "Disc and volume size limits" (http://homepage.ntlworld.com./jonathan.
deboynepollard/FGA/os2-disc-and-volume-size-limits.html). Frequently Given Answers. Retrieved February 9,
2005.
• IBM. "OS/2 corrective service fix JR09427" (ftp://service.boulder.ibm.com/ps/products/os2/fixes/v4warp/
english-us/jr09427/JR09427.TXT). Retrieved February 9, 2005.
• "Attribute - $EA_INFORMATION (0xD0)" (http://linux-ntfs.sourceforge.net/ntfs/attributes/ea_information.
html). NTFS Information, Linux-NTFS Project. Retrieved February 9, 2005.
• "Attribute - $EA (0xE0)" (http://linux-ntfs.sourceforge.net/ntfs/attributes/ea.html). NTFS Information,
Linux-NTFS Project. Retrieved February 9, 2005.
• "Attribute - $STANDARD_INFORMATION (0x10)" (http://linux-ntfs.sourceforge.net/ntfs/attributes/
standard_information.html). NTFS Information, Linux-NTFS Project. Retrieved February 21, 2005.
• Apple Computer Inc. "Technical Note TN1150: HFS Plus Volume Format" (http://developer.apple.com/
technotes/tn/tn1150.html). Detailed HFS Plus and HFSX description. Retrieved May 2, 2006.
• File System Forensic Analysis (http://www.digital-evidence.org/fsfa/), Brian Carrier, Addison Wesley, 2005.
Further reading
Books
• Carrier, Brian (2005). File System Forensic Analysis (http://www.digital-evidence.org/fsfa/). Addison-Wesley. ISBN 0-321-26817-2.
• Custer, Helen (1994). Inside the Windows NT File System. Microsoft Press. ISBN 1-55615-660-X.
• Giampaolo, Dominic (1999) (PDF). Practical File System Design with the Be File System (http://www.nobius.org/~dbg/practical-file-system-design.pdf). Morgan Kaufmann Publishers. ISBN 1-55860-497-9. Retrieved 2010-01-22.
• McCoy, Kirby (1990). VMS File System Internals. VAX - VMS Series. Digital Press. ISBN 1-55558-056-4.
• Mitchell, Stan (1997). Inside the Windows 95 File System (http://oreilly.com/catalog/156592200X). O'Reilly. ISBN 1-56592-200-X.
• Nagar, Rajeev (1997). Windows NT File System Internals: A Developer's Guide (http://oreilly.com/catalog/9781565922495). O'Reilly. ISBN 978-1-56592-249-5.
• Pate, Steve D. (2003). UNIX Filesystems: Evolution, Design, and Implementation (http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0471164836.html). Wiley. ISBN 0-471-16483-6.
• Rosenblum, Mendel (1994). The Design and Implementation of a Log-Structured File System. The Springer International Series in Engineering and Computer Science. Springer. ISBN 0-7923-9541-7.
• Russinovich, Mark; Solomon, David A.; Ionescu, Alex (2009). "File Systems". Windows Internals (5th ed.). Microsoft Press. ISBN 0-7356-2530-1.
• Prabhakaran, Vijayan (2006). IRON File Systems (http://www.cs.wisc.edu/~vijayan/vijayan-thesis.pdf). PhD dissertation, University of Wisconsin-Madison.
• Silberschatz, Abraham; Galvin, Peter Baer; Gagne, Greg (2004). "Storage Management". Operating System Concepts (7th ed.). Wiley. ISBN 0-471-69466-5.
• Tanenbaum, Andrew S. (2007). Modern Operating Systems (http://www.pearsonhighered.com/product?ISBN=0136006639) (3rd ed.). Prentice Hall. ISBN 0-13-600663-9.
• Tanenbaum, Andrew S.; Woodhull, Albert S. (2006). Operating Systems: Design and Implementation (http://www.pearsonhighered.com/pearsonhigheredus/educator/product/products_detail.page?isbn=0-13-142938-8) (3rd ed.). Prentice Hall. ISBN 0-13-142938-8.
Online
• Benchmarking Filesystems (outdated) (http://linuxgazette.net/102/piszcz.html) by Justin Piszcz, Linux
Gazette 102, May 2004
• Benchmarking Filesystems Part II (http://linuxgazette.net/122/piszcz.html) using kernel 2.6, by Justin Piszcz,
Linux Gazette 122, January 2006
• Filesystems (ext3, ReiserFS, XFS, JFS) comparison on Debian Etch (http://www.debian-administration.org/
articles/388) 2006
• Interview With the People Behind JFS, ReiserFS & XFS (http://www.osnews.com/story.php?news_id=69)
• Journal File System Performance (outdated) (http://www.open-mag.com/features/Vol_18/filesystems/
filesystems.htm): ReiserFS, JFS, and Ext3FS show their merits on a fast RAID appliance
• Journaled Filesystem Benchmarks (outdated) (http://staff.osuosl.org/~kveton/fs/): A comparison of ReiserFS,
XFS, JFS, ext3 & ext2
• Large List of File System Summaries (most recent update 2006-11-19) (http://www.osdata.com/system/
logical/logical.htm)
• Linux File System Benchmarks (http://fsbench.netnation.com/) v2.6 kernel with a stress on CPU usage
• Linux Filesystem Benchmarks (http://www.techyblog.com/linux-news/linux-26-filesystem-benchmarks-older.
html)
• Linux large file support (outdated) (http://www.suse.de/~aj/linux_lfs.html)
• Local Filesystems for Windows (http://www.microsoft.com/whdc/device/storage/LocFileSys.mspx)
• Overview of some filesystems (outdated) (http://osdev.berlios.de/osd-fs.html)
• Sparse files support (outdated) (http://www.lrdev.com/lr/unix/sparsefile.html)
• Jeremy Reimer (March 16, 2008). "From BFS to ZFS: past, present, and future of file systems" (http://
arstechnica.com/articles/paedia/past-present-future-file-systems.ars). arstechnica.com. Retrieved 2008-03-18.
External links
• Filesystem Specifications - Links & Whitepapers (http://www.forensics.nl/filesystems)
• Interesting File System Projects (http://filesystems.org/all-projects.html)
Virtual file system
A virtual file system (VFS) or virtual filesystem switch is an abstraction layer on top of a more concrete file
system. The purpose of a VFS is to allow client applications to access different types of concrete file systems in a
uniform way. A VFS can, for example, be used to access local and network storage devices transparently without the
client application noticing the difference. It can be used to bridge the differences in Windows, Mac OS and Unix
filesystems, so that applications can access files on local file systems of those types without having to know what
type of file system they are accessing.
A VFS specifies an interface (or a "contract") between the kernel and a concrete file system. Therefore, it is easy to
add support for new file system types to the kernel simply by fulfilling the contract. The terms of the contract might
change incompatibly from release to release, which would require that concrete file system support be recompiled,
and possibly modified before recompilation, to allow it to work with a new release of the operating system; or the
supplier of the operating system might make only backward-compatible changes to the contract, so that concrete file
system support built for a given release of the operating system would work with future versions of the operating
system.
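The contract can be pictured as a table of function pointers that each concrete file system fills in. The sketch below invents names purely for illustration; the real Linux VFS, for instance, uses much richer structures such as file_operations and super_block:

#include <stdio.h>
#include <stddef.h>

struct vfs_file;   /* opaque per-file state owned by the concrete FS */

/* The per-type operations table: the kernel's generic code calls
   through these pointers without knowing which file system it is. */
struct vfs_operations {
    struct vfs_file *(*open)(const char *path, int flags);
    long (*read)(struct vfs_file *f, void *buf, size_t len, long offset);
    long (*write)(struct vfs_file *f, const void *buf, size_t len, long offset);
    int (*close)(struct vfs_file *f);
};

struct fs_type { const char *name; const struct vfs_operations *ops; };
static struct fs_type registry[8];
static int nregistered;

/* "Fulfilling the contract": adding a new file system type is just
   registering a filled-in operations table. */
static int register_filesystem(const char *name, const struct vfs_operations *ops) {
    if (nregistered == (int)(sizeof registry / sizeof registry[0])) return -1;
    registry[nregistered].name = name;
    registry[nregistered].ops = ops;
    return nregistered++;
}

int main(void) {
    static const struct vfs_operations myfs_ops = {0};   /* stub callbacks omitted */
    register_filesystem("myfs", &myfs_ops);
    printf("%d file system type(s) registered\n", nregistered);
    return 0;
}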
Implementations
One of the first virtual file system mechanisms on Unix-like systems was introduced by Sun Microsystems in SunOS
2.0 in 1985. It allowed Unix system calls to access local UFS file systems and remote NFS file systems
transparently. For this reason, Unix vendors who licensed the NFS code from Sun often copied the design of Sun's
VFS. Other file systems could be plugged into it also: there was an implementation of the MS-DOS FAT file system
developed at Sun that plugged into the SunOS VFS, although it wasn't shipped as a product until SunOS 4.1. The
SunOS implementation was the basis of the VFS mechanism in System V Release 4.
John Heidemann developed a stacking VFS under SunOS 4.0 for the experimental Ficus file system. This design
provided for code reuse among file system types with differing but similar semantics (e.g., an encrypting file system
could reuse all of the naming and storage-management code of a non-encrypting file system). Heidemann adapted
this work for use in 4.4BSD as a part of his thesis research; descendants of this code underpin the file system
implementations in modern BSD derivatives including Mac OS X.
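Stacking can be pictured as one layer delegating to the layer below it and transforming only what it must. The
fragment below is a loose sketch of that idea, with invented names and a toy XOR transform standing in for real
encryption; it does not reproduce Heidemann's actual Ficus or 4.4BSD code.

    /* Loose sketch of a stacking layer: an "encrypting" file system that
       reuses a lower layer's naming and storage code unchanged and only
       transforms the data passing through. Names are invented; the XOR
       "cipher" is a placeholder, not real encryption. */
    #include <stddef.h>
    #include <sys/types.h>

    typedef ssize_t (*read_op)(void *lower_node, void *buf,
                               size_t len, off_t off);

    struct stacked_fs {
        read_op       lower_read;  /* storage code reused from below */
        void         *lower_node;  /* the lower layer's file handle  */
        unsigned char key;         /* toy transform parameter        */
    };

    static ssize_t crypt_read(struct stacked_fs *fs, void *buf,
                              size_t len, off_t off)
    {
        /* Delegate storage management to the lower layer... */
        ssize_t n = fs->lower_read(fs->lower_node, buf, len, off);
        /* ...and add only this layer's behavior on top. */
        for (ssize_t i = 0; i < n; i++)
            ((unsigned char *)buf)[i] ^= fs->key;
        return n;
    }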
Other Unix virtual file systems include the File System Switch in System V Release 3, the Generic File System in
Ultrix, and the VFS in Linux. In OS/2 and Microsoft Windows, the virtual file system mechanism is called the
Installable File System.
114
Virtual file system
The Filesystem in Userspace (FUSE) mechanism allows userland code to plug into the virtual file system mechanism
in Linux, NetBSD, FreeBSD, OpenSolaris, and Mac OS X.
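As a concrete illustration, an entire read-only FUSE file system fits in a few dozen lines of user-space C. The
sketch below is modelled on the well-known libfuse 2.x "hello" example and exposes a single file, /hello, whose
contents exist only in memory; build and mount details vary by system, so treat it as a sketch rather than a
drop-in program.

    /* Minimal read-only FUSE file system exposing one file, /hello.
       Modelled on the classic libfuse 2.x "hello" example. */
    #define FUSE_USE_VERSION 26
    #include <fuse.h>
    #include <string.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <sys/stat.h>

    static const char *hello_str  = "Hello from a virtual file system!\n";
    static const char *hello_path = "/hello";

    static int hello_getattr(const char *path, struct stat *st)
    {
        memset(st, 0, sizeof(*st));
        if (strcmp(path, "/") == 0) {
            st->st_mode = S_IFDIR | 0755;
            st->st_nlink = 2;
        } else if (strcmp(path, hello_path) == 0) {
            st->st_mode = S_IFREG | 0444;
            st->st_nlink = 1;
            st->st_size = strlen(hello_str);
        } else {
            return -ENOENT;
        }
        return 0;
    }

    static int hello_readdir(const char *path, void *buf,
                             fuse_fill_dir_t filler, off_t off,
                             struct fuse_file_info *fi)
    {
        (void)off; (void)fi;
        if (strcmp(path, "/") != 0)
            return -ENOENT;
        filler(buf, ".", NULL, 0);
        filler(buf, "..", NULL, 0);
        filler(buf, hello_path + 1, NULL, 0);  /* name without '/' */
        return 0;
    }

    static int hello_open(const char *path, struct fuse_file_info *fi)
    {
        if (strcmp(path, hello_path) != 0)
            return -ENOENT;
        if ((fi->flags & O_ACCMODE) != O_RDONLY)
            return -EACCES;                    /* read-only file */
        return 0;
    }

    static int hello_read(const char *path, char *buf, size_t size,
                          off_t off, struct fuse_file_info *fi)
    {
        size_t len = strlen(hello_str);
        (void)fi;
        if (strcmp(path, hello_path) != 0)
            return -ENOENT;
        if ((size_t)off >= len)
            return 0;
        if (off + size > len)
            size = len - off;
        memcpy(buf, hello_str + off, size);
        return (int)size;
    }

    static struct fuse_operations hello_oper = {
        .getattr = hello_getattr,
        .readdir = hello_readdir,
        .open    = hello_open,
        .read    = hello_read,
    };

    int main(int argc, char *argv[])
    {
        return fuse_main(argc, argv, &hello_oper, NULL);
    }

Mounted with, say, ./hello /tmp/vfs, the mount point then appears to contain an ordinary file even though no
concrete file system backs it.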
In Microsoft Windows, virtual filesystems can also be implemented through userland Shell namespace extensions;
however, they do not support the lowest-level file system access application programming interfaces in Windows, so
not all applications will be able to access file systems that are implemented as namespace extensions. KIO and
GVFS/GIO provide similar mechanisms in the KDE and GNOME desktop environments (respectively), with similar
limitations, although they can be made to use FUSE techniques and therefore integrate smoothly into the system.
Single-file virtual file systems
Sometimes Virtual File System refers to a file or a group of files (not necessarily inside a concrete file system)
that acts as a manageable container and provides the functionality of a concrete file system through software.
Examples of such containers include SolFS and the single-file disk images used by emulators and virtual machines
such as PCTask, WinUAE, Oracle's VirtualBox, Microsoft's Virtual PC, and VMware.
The primary benefit of this type of file system is that it is centralized and easy to remove. A single-file virtual
file system may include all the basic features expected of any file system (virtual or otherwise), but access to the
internal structure of these file systems is often limited to programs specifically written to use the single-file
virtual file system (instead of implementation through a driver allowing universal access). Another major drawback is
relatively low performance compared to other virtual file systems, mostly due to the cost of shuffling virtual files
when data is written to or deleted from the virtual file system.
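The shuffling cost follows from how such containers are commonly laid out: entries are packed back to back inside
one host file, so deleting or growing an entry in the middle forces every byte after it to move. The layout below is
purely hypothetical, sketched only to make that point.

    /* Purely hypothetical on-disk layout of a single-file container.
       Entries are packed back to back, so removing or growing entry i
       means rewriting every byte that follows it -- the source of the
       write/delete cost noted above. */
    #include <stdint.h>

    struct container_header {
        uint32_t magic;        /* identifies the container format */
        uint32_t entry_count;  /* number of virtual files inside  */
    };

    struct entry_header {
        char     name[64];     /* virtual file name                 */
        uint64_t size;         /* payload length in bytes           */
        /* 'size' payload bytes follow, then the next entry_header  */
    };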
Implementation of single-file virtual filesystems
Direct examples of single-file virtual file systems include emulators, such as PCTask and WinUAE, which encapsulate
not only the filesystem data but also the emulated disk layout. This makes it easy to treat an OS installation like
any other piece of software: it can be transferred with removable media or over the network.
PCTask
The Amiga emulator PCTask emulated an Intel 8088-based PC clocked at 4.77 MHz (and later an 80486SX clocked at
25 MHz). Users of PCTask could create a large file on the Amiga filesystem, and the emulator would access that file
as if it were a real PC hard disk. The file could be formatted with the FAT16 filesystem to store normal MS-DOS or
Windows files.[1][2]
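The underlying trick is plain offset arithmetic on the host file: with 512-byte sectors, sector n of the virtual
disk lives at byte offset n * 512 of the image file. A minimal sketch (the function name is invented for
illustration):

    /* Sketch: read one sector of a disk-image "hardfile" stored on the
       host file system, assuming 512-byte sectors. read_sector() is an
       invented name; real emulators use 64-bit seeks (fseeko or
       _fseeki64) so that images larger than 2 GB work. */
    #include <stdio.h>

    int read_sector(FILE *hardfile, long lba, unsigned char buf[512])
    {
        if (fseek(hardfile, lba * 512L, SEEK_SET) != 0)
            return -1;                               /* seek failed */
        return fread(buf, 1, 512, hardfile) == 512 ? 0 : -1;
    }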
WinUAE
The UAE for Windows, WinUAE, allows for large single files on Windows to be treated as Amiga file systems. In
WinUAE this file is called a hardfile.[3]
UAE could also treat a directory on the host filesystem (Windows, Linux, Mac OS, or AmigaOS) as an Amiga
filesystem.[4]
Notes
1. Emulation on Amiga [1]: a comparison between PCX and PCTask, Amiga PC emulators.
2. See also this article [2], which explains how PCTask works.
3. Help About WinUAE [3] (see the Hardfile section).
4. Help About WinUAE [3] (see the Add Directory section).
References
• Put virtual filesystems to work [4]
• Vnodes: An Architecture for Multiple File System Types in Sun UNIX [5]
• Linux kernel's Virtual File System [6]
• Rodriguez, R.; M. Koehler; R. Hyde (June 1986). "The Generic File System". Proceedings of the USENIX Summer
Technical Conference. Atlanta, Georgia: USENIX Association. pp. 260–269.
• Karels, M.; M. K. McKusick (September 1986). "Towards a Compatible File System Interface". Proceedings of the
European UNIX Users Group Meeting. Manchester, England: EUUG. pp. 481–496.
• Heidemann, John (1995). Stackable Design of File Systems [7] (Technical report CSD-950032). UCLA.
• The Linux VFS, Chapter 4 of Linux File Systems by Moshe Bar (McGraw-Hill, 2001). ISBN 0-07-212955-7
• Chapter 12 of Understanding the Linux Kernel by Daniel P. Bovet, Marco Cesati (O'Reilly Media, 2005). ISBN
0-596-00565-2
• The Linux VFS Model: Naming structure [8]
External links
• Embedded File System (EFS) [9] - Open Source cross-platform C++ implementation of Virtual File System
• AVFS [10] - A Virtual File System for mounting compressed or remote files
• fs-driver [11] - Ext2 Installable File System for Microsoft Windows
• Anatomy of the Linux file system by M. Tim Jones [12]
• Solid File System (SolFS) [13] - cross-platform single-file virtual file system with encryption and compression
• Callback File System [14] - SDK that lets developers create virtual file systems for Windows in user mode
• FUSE - Filesystem in Userspace [15] - virtual filesystem for Linux
• LUFS - Linux Userland FileSystem [16] - virtual filesystem with support for localfs, sshfs, ftpfs, gnutellafs,
locasefs, gvfs, cardfs, cefs and more. Latest file release: 2003-10-29
• TrueVFS [17] - Virtual File System for Java, with thread-safe read/write access to ZIP, ZIP.RAES, TAR, TAR.BZ2,
TAR.GZ, TAR.XZ, ODF, HTTP(S), etc.
• Commons VFS [18] - virtual filesystem for Java, with support for CIFS, FTP, HTTP, Zip (file format), Tar (file
format), gzip, bzip2, and more
• MillScript VFS [19] - virtual filesystem for Java, influenced by the KIO subsystem in KDE, Steve Leach's work on a
VFS in JSpice and, to a limited extent, the Apache Commons VFS
• KIO (KDE IO) [20] - a network-enabled file management system
• flipcode - Programming a Virtual File System [21]
• Dokan [22] - A free and open source virtual filesystem for Windows (includes C, .NET, and Ruby bindings)
References
[1] http://www.simon.mooli.org.uk/AF/8.html
[2] http://www.unitechelectronics.com/emul.htm
[3] http://winuaehelp.back2roots.org/gui/hard-drives.htm
[4] http://www.ibm.com/developerworks/library/l-sc12.html
[5] http://www.arl.wustl.edu/~fredk/Courses/cs523/fall01/Papers/kleiman86vnodes.pdf
[6] http://www.science.unitn.it/~fiorella/guidelinux/tlk/node102.html#SECTION001120000000000000000
[7] http://www.isi.edu/~johnh/PAPERS/Heidemann95e.html
[8] http://www.atalon.cz/vfs-m/linux-vfs-model/
[9] http://www.scalingweb.com/embedded_file_system.php
[10] http://www.inf.bme.hu/~mszeredi/avfs/
[11] http://www.fs-driver.org/
[12] http://www.ibm.com/developerworks/linux/library/l-linux-filesystem/
[13] http://www.eldos.com/solfs/
[14] http://www.eldos.com/cbfs/
[15] http://fuse.sourceforge.net/
[16] http://sourceforge.net/projects/lufs
[17] http://truevfs.java.net
[18] http://jakarta.apache.org/commons/vfs/index.html
[19] http://millscript.sourceforge.net/projects/millscript-vfs/index.html
[20] http://developer.kde.org/documentation/library/3.5-api/kdelibs-apidocs/kio/html/index.html
[21] http://www.flipcode.com/articles/article_vfs01.shtml
[22] http://dokan-dev.net/en/
License
Creative Commons Attribution-Share Alike 3.0 Unported
//creativecommons.org/licenses/by-sa/3.0/