Improving the Reliability of
Commodity Operating Systems
Allows existing OS extensions to execute safely
in commodity kernels
Use lightweight kernel protection domains
Restricted write access to kernel memory
Track and validate all modifications to kernel data
Computer reliability is an unsolved problem
Cost of failures continues to rise
OS extensions have become prevalent
70% of Linux kernel code
35,000 drivers on Windows XP
Written by people who are less experienced in
kernel organization
Extensions are a leading cause of failures
In Windows XP, drivers cause 85% of failures
In Linux, device drivers introduce 7x more errors
than the rest of the kernel
Extended OS cannot be tested completely
Nooks Approach
Target existing extension architecture
Use conventional C instead of type-safe languages
Aim to reduce the number of crashes due
to drivers and extensions
Prototype implemented in Linux
Showed graceful recovery from 99% of faults
Related Work
Hardware approaches
Capability-based architectures
Recovery difficult for shared resources
Segment architectures
Difficult to program
New OS structures
Good fault isolation
Rebooting required to restart services
Transaction-based systems
Works well for file systems
Language-based approaches
Limited applicability
Core principles
Design for fault resistance, not fault tolerance
Prevent and recover from most, not all
Design for mistakes, not abuse
Extensions are generally well-behaved (not
malicious)
Can explore the design space between unprotected
and safe
+ Can define an architecture that supports
existing drivers with moderate performance
- Malicious code can bypass these mechanisms
Isolation of kernel from extension failures
Need to detect failures before they spread
Automatic recovery from failures
Backward compatibility
Reliability layer inserted between the
extensions and the OS kernel
Intercepts all interactions between the
extensions and the OS kernel
Major functions
Object tracking
Lightweight kernel protection domain
Write access to a limited portion of the kernel’s
address space
Major tasks
Creation, manipulation, and maintenance of
lightweight kernel protection domains
Inter-domain control transfer
Extension procedure call (XPC)
Similar to lightweight RPC
Assume trusted interactions
Asymmetric relationship
Kernel has more privileges
The Nooks interposition mechanisms
Make sure that
All control flows between the kernel and extensions
are through the XPC mechanism
All data flows between the kernel and extensions are
managed by Nooks’ object-tracking code
Extensions and the kernel communicate
through wrapper stubs
Object Tracking
Maintains a list of kernel data structures
that are manipulated by an extension
Controls all modifications to those structures
Provides object info for cleanup when an
extension fails
Object Tracking
An object must be copied into an
extension before it is modified
Object-tracking code verifies the type and
accessibility of each parameter being passed
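The copy-before-modify rule can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions: `ObjectTracker`, `copy_in`, `copy_out`, and `release_all` are hypothetical names, and a Python dict stands in for a kernel data structure. It is not the Nooks implementation.

```python
import copy


class ObjectTracker:
    """Toy tracker for kernel objects handed to one extension (hypothetical)."""

    def __init__(self):
        # id(kernel object) -> (kernel object, extension-side copy)
        self.tracked = {}

    def copy_in(self, kernel_obj, expected_type):
        """Verify the parameter type, then return the extension's copy."""
        if not isinstance(kernel_obj, expected_type):
            raise TypeError(f"expected {expected_type.__name__}")
        if id(kernel_obj) not in self.tracked:
            self.tracked[id(kernel_obj)] = (kernel_obj, copy.deepcopy(kernel_obj))
        return self.tracked[id(kernel_obj)][1]

    def copy_out(self, kernel_obj):
        """Propagate the extension's changes back to the kernel object."""
        _, ext_copy = self.tracked[id(kernel_obj)]
        kernel_obj.clear()
        kernel_obj.update(copy.deepcopy(ext_copy))

    def release_all(self):
        """On extension failure, drop the copies and report what was held."""
        held = [kernel_obj for kernel_obj, _ in self.tracked.values()]
        self.tracked.clear()
        return held


tracker = ObjectTracker()
packet = {"len": 0}                      # stands in for a kernel structure
shadow = tracker.copy_in(packet, dict)
shadow["len"] = 64                       # the extension edits only its copy
len_before_sync = packet["len"]          # kernel copy is still unchanged here
tracker.copy_out(packet)                 # changes reach the kernel on sync
```

Because the extension only ever touches its own copy, a failure before `copy_out` leaves the kernel object untouched, and `release_all` gives the cleanup code the list of objects the extension was holding.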
Nooks detects software faults
When kernel services are invoked incorrectly
When an extension consumes too many resources
Return to the extension
Generate an error code
Nooks detects hardware faults
Processor raises an exception during extension
execution
Attempts to read unmapped memory
Attempts to write memory outside of its protection
domain
A user or a program can trigger Nooks
recovery explicitly
Since extensions are decoupled from
kernel, Nooks can freely release
extension-held kernel structures, such as
objects or locks, during the recovery
[Figure: Nooks architecture diagram. Labels: applications (Apache, Navigator, Quake3D); operating system kernel with file system and Nooks kernel runtime; network nook and video nook, each with a per-nook runtime; TCP/IP, Ethernet, video, and SCSI drivers; Ethernet card, video card, SCSI controller.]
Linux 2.4.18
Worst-case target
18 months of development
22,000 lines of Nooks code (vs. 2.4 million lines
of Linux code and 50 million lines of Windows
2003 code)
Two parts
Memory management
Extension procedure call
Memory Management
Kernel has read-write access to the entire
address space
Each extension is restricted to read-only
kernel access and read-write access to its
local domain
Nooks maintains a copy of the kernel page
table for each domain
Memory Management
Changing protection domains is not as
costly as changing processes
Protection domains share kernel address space
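A toy model can make the per-domain page tables concrete. The sketch below is a user-space simulation under stated assumptions: `Domain`, `ProtectionFault`, and the page names are hypothetical, and a dict of permissions stands in for a hardware page table.

```python
class ProtectionFault(Exception):
    """Stands in for the hardware exception raised on an illegal access."""


class Domain:
    def __init__(self, name, kernel_page_table, local_pages, is_kernel=False):
        self.name = name
        # Each domain gets its own copy of the kernel page table.
        self.page_table = dict(kernel_page_table)
        if not is_kernel:
            # Extensions see every kernel page read-only.
            for page in self.page_table:
                self.page_table[page] = "r"
        # A domain has read-write access to its own local pages.
        for page in local_pages:
            self.page_table[page] = "rw"

    def write(self, memory, page, value):
        if self.page_table.get(page) != "rw":
            raise ProtectionFault(f"{self.name} cannot write {page}")
        memory[page] = value


memory = {"kernel_data": 0, "driver_heap": 0}
kernel_page_table = {"kernel_data": "rw", "driver_heap": "rw"}
kernel = Domain("kernel", kernel_page_table, [], is_kernel=True)
driver = Domain("driver", kernel_page_table, ["driver_heap"])

driver.write(memory, "driver_heap", 1)   # allowed: the driver's local domain
kernel.write(memory, "kernel_data", 2)   # allowed: kernel writes anywhere
try:
    driver.write(memory, "kernel_data", 99)
    fault = None
except ProtectionFault as exc:
    fault = str(exc)                     # the stray write is caught
```

All domains map the same pages and differ only in permissions, which is consistent with the point that protection domains share the kernel address space.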
Extension Procedure Call
Transparent to both the kernel and its extensions
Managed by two functions
nooks_driver_call(func_ptr, arg_list, domain)
nooks_kernel_call(func_ptr, arg_list, domain)
Deferred call mechanisms available
Useful for network drivers to queue up packets
and perform bulk transfers
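The benefit of deferred calls can be seen in a small simulation that counts domain changes. This is a sketch under stated assumptions: only the two call names and their `(func_ptr, arg_list, domain)` parameters come from the slides. `XpcRuntime`, `defer`, `flush_deferred`, and the reading of `domain` in `nooks_kernel_call` as the calling extension's domain are hypothetical.

```python
class XpcRuntime:
    def __init__(self):
        self.current_domain = "kernel"
        self.switches = 0        # number of domain changes so far
        self.deferred = []       # queued (func_ptr, arg_list) pairs

    def _transfer(self, target, func_ptr, arg_list):
        caller = self.current_domain
        self.current_domain = target     # enter the target domain
        self.switches += 1
        try:
            return func_ptr(*arg_list)
        finally:
            self.current_domain = caller  # always return to the caller
            self.switches += 1

    def nooks_driver_call(self, func_ptr, arg_list, domain):
        """Kernel calls into the extension that lives in `domain`."""
        return self._transfer(domain, func_ptr, arg_list)

    def nooks_kernel_call(self, func_ptr, arg_list, domain):
        """Extension calls into the kernel; `domain` is assumed to be the caller."""
        return self._transfer("kernel", func_ptr, arg_list)

    def defer(self, func_ptr, arg_list):
        self.deferred.append((func_ptr, arg_list))

    def flush_deferred(self, domain):
        """Run every queued call inside a single control transfer."""
        batch, self.deferred = self.deferred, []
        return self._transfer(domain, lambda: [f(*a) for f, a in batch], [])


direct = XpcRuntime()
batched = XpcRuntime()
sent = []


def send_packet(payload):
    sent.append(payload)


for payload in ("a", "b", "c"):
    direct.nooks_driver_call(send_packet, [payload], "net_nook")
    batched.defer(send_packet, [payload])
batched.flush_deferred("net_nook")
```

In this model three direct calls cost six domain changes, while the three queued calls cost two, which is the kind of saving a network driver gets from batching packets.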
Changes to Linux Kernel
Maintain coherency between the kernel
and extension page tables
Detect exceptions that occur within
Nooks’ protection domains
Locate the task structure, which is no longer
co-located on the kernel stack due to isolation
Provides wrapper stubs between
extensions and the kernel
Transparent to the kernel and drivers
Kernel modifications
Make the standard module loader bind extensions
to wrappers instead of kernel functions
Kernel initialization code interposes on the
calls into extensions
Some data references are interposed
Certain objects are linked directly into the
extension for reading
Kernel modification calls are wrapped
Performance-critical data structures
Shadow objects in the extension that are
synchronized before and after XPCs
Otherwise, just XPCs
Within the kernel’s protection domain
Three basic tasks
Check parameters for validity
Create a copy of kernel objects in the
extension’s protection domain
No serialization/deserialization necessary
Synchronization code placed in wrappers
Perform an XPC into the kernel or extension
Automatically generated
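The three wrapper tasks can be sketched as a stub generator. This is an illustration under stated assumptions, not generated Nooks code: `make_wrapper` and `driver_open` are hypothetical, a plain function call stands in for the XPC, and dicts stand in for kernel objects.

```python
import copy
import errno


def make_wrapper(target_func, param_types):
    """Build a stub that guards calls from the kernel into an extension."""

    def wrapper(*kernel_args):
        # Task 1: check parameters for validity.
        if len(kernel_args) != len(param_types):
            return -errno.EINVAL
        for arg, expected in zip(kernel_args, param_types):
            if not isinstance(arg, expected):
                return -errno.EINVAL
        # Task 2: copy kernel objects into the extension's domain.
        copies = [copy.deepcopy(arg) for arg in kernel_args]
        # Task 3: perform the XPC (a plain call stands in for it here).
        result = target_func(*copies)
        # Synchronize mutable objects back after the call returns.
        for original, shadow in zip(kernel_args, copies):
            if isinstance(original, dict):
                original.clear()
                original.update(shadow)
        return result

    return wrapper


def driver_open(device):
    """Hypothetical extension entry point."""
    device["open_count"] += 1
    return 0


wrapped_open = make_wrapper(driver_open, [dict])
device = {"open_count": 0}
ok = wrapped_open(device)              # valid call goes through
bad = wrapped_open("not a device")     # invalid parameter yields an error code
```

The invalid call is rejected with an error code before the extension runs, so a bad parameter never reaches the driver.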
Wrapper Code Sharing
50% of Nooks code base
Shared among multiple drivers
Object Tracking
Supports 43 kernel object types
Records the addresses of all objects in
use by an extension
Records the association between the
kernel and the extension versions of
writable objects
Performs garbage collection
Determines whether to copy an object
Recovery manager releases resources
Unloading the extension
Releasing its kernel and physical resources
Reloading and restarting the extension
User-mode agent coordinates recovery
Each object type is associated with a recovery
function
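The recovery sequence can be sketched as a dispatch table of cleanup functions. This is a simulation under stated assumptions: `RecoveryManager`, `register`, and `recover` are hypothetical names, the user-mode agent is not modeled, and dicts stand in for the extension and its kernel resources.

```python
class RecoveryManager:
    def __init__(self):
        self.recovery_funcs = {}   # object type name -> cleanup function
        self.steps = []            # record of actions taken, for inspection

    def register(self, type_name, func):
        self.recovery_funcs[type_name] = func

    def recover(self, extension, held_objects):
        # Step 1: unload the failed extension.
        self.steps.append("unload")
        extension["loaded"] = False
        # Step 2: release each resource with its type's cleanup function.
        for type_name, obj in held_objects:
            self.recovery_funcs[type_name](obj)
            self.steps.append(f"release {type_name}")
        held_objects.clear()
        # Step 3: reload and restart the extension.
        self.steps.append("reload and restart")
        extension["loaded"] = True
        extension["restarts"] += 1


manager = RecoveryManager()
manager.register("lock", lambda lock: lock.update(held=False))
manager.register("buffer", lambda buf: buf.clear())

driver = {"name": "pcnet32", "loaded": True, "restarts": 0}
lock = {"held": True}
buffer = {"data": b"partial packet"}
held = [("lock", lock), ("buffer", buffer)]
manager.recover(driver, held)
```

Keying cleanup on the object type means the manager needs no knowledge of the failed extension itself, only of what the object tracker says it was holding.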
Implementation Limitations
Nooks does not handle all possible errors
Deliberate corruption of system state
Infinite loops
However, a moderate reduction of system
crashes is a significant contribution
Achieving Transparency
Wrapper stubs for every call in the
extension-kernel interface
Object-tracking code for every object type
that is passed between the extension and
the kernel
Nooks transparent to both the extension
and the kernel
Nooks can detect and recover from 99% of
extension faults
Test Methodology
Synthetic fault injection
Automatically changes single instructions in the
extension code to emulate common errors
Uninitialized variables
Bad parameters
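Single-instruction fault injection can be illustrated on a toy instruction list. This is a sketch under stated assumptions: the tuple-based instruction format and `inject_fault` are hypothetical, and a real injector would rewrite machine instructions in the extension binary.

```python
import random


def inject_fault(program, rng):
    """Return (index, mutated copy) with exactly one instruction changed."""
    mutated = list(program)
    index = rng.randrange(len(mutated))
    op, *operands = mutated[index]
    if op == "init":
        # Dropping the initialization emulates an uninitialized variable.
        mutated[index] = ("nop",)
    else:
        # Corrupting the last operand emulates a bad parameter.
        mutated[index] = (op, *operands[:-1], "BAD")
    return index, mutated


program = [
    ("init", "len", 0),
    ("call", "alloc_buffer", "len"),
    ("call", "send_packet", "len"),
]
index, mutated = inject_fault(program, random.Random(0))
# Positions where the mutated program differs from the original.
changed = [i for i, (a, b) in enumerate(zip(program, mutated)) if a != b]
```

Seeding the random generator makes each trial repeatable, which helps when the same fault has to be run against both the native and the isolated configuration.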
Types of Extensions Isolated
Device drivers (network, sound cards)
Optional kernel subsystems (VFAT)
Application-specific kernel extension
Test Environment
Allows automation of crash testing without
5 extensions
400 tests each
Test Results
Not all fault-injection trials cause faulty behavior
System Crashes
A system crash is easiest to detect
OS panics
Linux experienced 317 crashes
Nooks eliminated 313 crashes, or 99%
The remaining 4 were deadlocks
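The 99% figure follows directly from the two crash counts above. A quick check of the arithmetic, using only those numbers:

```python
native_crashes = 317          # crashes observed in native Linux
eliminated = 313              # crashes that Nooks eliminated

remaining = native_crashes - eliminated
rate = eliminated / native_crashes
rate_percent = round(rate * 100, 1)
print(f"{remaining} remaining, {rate_percent}% eliminated")
```

The exact ratio is about 98.7%, which the slides round to 99%.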
System Crashes
Sound blaster and VFAT extensions are
Fewer crashes
kHTTPd, pcnet32, e1000 are interrupt-based
More crashes
Non-Fatal Extension Failures
Nooks cannot detect erroneous extension behavior
Network could disappear
Mounted file system hangs
Recovery Errors
A faulting extension is unloaded, reloaded,
and restarted
Works well with kHTTPd
Not as well with VFAT
Corruption can propagate to disk if not detected in
time
Summary of Reliability Experiments
Nooks eliminated 99% of the system
crashes in extensions
Nooks eliminated nearly 60% of non-fatal
extension failures
Dell 1.7 GHz Pentium 4
890 MB of RAM
SoundBlaster 16
Intel Pro/1000 Gb Ethernet Adapter
7200 RPM, 41 GB IDE HD
Linux 2.4.18
Sound Benchmark
Plays an MP3 file at 128 Kb/sec
150 XPCs/sec
Nooks imposes little overhead
Network Benchmark
netperf performance tool
A node sends/receives a stream of 32 KB
TCP messages via a 256 KB buffer
10% overhead
Compile Benchmark
Linux kernel compilation on VFAT
25% slowdown
Web Server Benchmarks
Repeatedly request a 1-KB file and measure
the maximum request rate
60% slowdown
CPU bound
3% slowdown
If the computation is not CPU bound, the
penalty may not be important
Nooks is achievable with modest
engineering effort
Extensions such as device drivers can be
isolated without changes to extension code
Isolation and recovery can dramatically
improve the system’s ability to survive
extension faults