Disco: Running Commodity Operating Systems on Scalable Multiprocessors
E. Bugnion, S. Devine, K. Govil, M. Rosenblum
Computer Systems Laboratory, Stanford University
Presented by Xiaofei Wang
Background
• Hardware evolves faster than system software
  – System software arrives late, incompatible, and possibly buggy
• Multiprocessors entered the market (1990s)
  – Innovative hardware
• Modifying the OS is resource-intensive
  (hard and time-consuming, given its size, etc.)
• Insert a virtual machine monitor (a software layer) between the OS and the hardware to resolve the problem [an old idea]
Two Opposite Approaches for System Software
• Address the challenges in the operating system: OS-intensive
  – Hive, Hurricane, Cellular IRIX, etc.
  – Innovative, single system image
  – But a large effort
• Hard-partition the machine into independent failure units: OS-light
  – Sun Enterprise 10000 machine
  – Only a partial single system image
  – Cannot dynamically adapt the partitioning
A Compromise between OS-Intensive and OS-Light
• Disco was introduced to allow trading off performance costs against development cost
Target of Disco
• Run commodity operating systems efficiently on innovative hardware: scalable shared-memory machines
• Small implementation effort, with no major changes to the OS
• Three challenges
  – Overhead
  – Resource management
  – Communication & sharing
Outline
• Disco Architecture
• Disco Implementation
  1. Virtual CPUs
  2. Virtual Physical Memory
  3. NUMA Memory Management
  4. Copy-On-Write Disks
  5. Virtual I/O Devices
  6. Virtual Network Interfaces
• Running Commodity Operating Systems
• Experience
• Related Work
Disco Architecture (1)
Disco Architecture (2)
• Interface
– Processors: Emulates MIPS R10000
– Physical Memory: An abstraction of main
memory residing in a contiguous physical
address space starting at address zero.
– I/O Devices: Each virtual machine is created
with a specified set of I/O devices
Virtual CPUs
• Disco schedules each virtual CPU as a task
• Sets the real machine's registers to the virtual CPU's
• Jumps to the current PC of the virtual CPU: direct execution on the real CPU
• Detects and quickly emulates operations that cannot be safely executed by the virtual machine (privileged instructions or accesses to physical memory); see the sketch below
  – TLB modification
  – Direct access to physical memory and I/O devices
• Result: controlled access to memory and privileged instructions
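A minimal sketch in C of the trap-and-emulate idea described above: the guest kernel runs unprivileged, so privileged instructions trap into the monitor, which emulates them against shadow state. The structures and helper names are my assumptions, not Disco's actual (unpublished) code; only the MIPS COP0 instruction encoding is real.

#include <stdint.h>

enum op { OP_MFC0, OP_MTC0, OP_TLBWR, OP_OTHER };

/* Decode just enough of a MIPS coprocessor-0 instruction to dispatch. */
static enum op decode_cop0(uint32_t insn)
{
    if ((insn >> 26) != 0x10) return OP_OTHER;            /* not COP0 */
    if (insn & (1u << 25))                                /* CO bit: TLB ops */
        return ((insn & 0x3f) == 0x06) ? OP_TLBWR : OP_OTHER;
    switch ((insn >> 21) & 0x1f) {
    case 0x00: return OP_MFC0;                            /* move from CP0 */
    case 0x04: return OP_MTC0;                            /* move to CP0 */
    default:   return OP_OTHER;
    }
}

static int rt(uint32_t insn) { return (insn >> 16) & 0x1f; }

typedef struct vcpu {
    uint64_t regs[32];    /* saved general-purpose registers */
    uint64_t pc;          /* virtual CPU's program counter */
    uint64_t cp0_status;  /* shadow copy of a privileged register */
} vcpu_t;

static void emulate_tlb_write(vcpu_t *v)     { (void)v; /* memory code */ }
static void raise_guest_exception(vcpu_t *v) { (void)v; /* reflect to guest */ }

/* Privileged instructions trap here; Disco-style monitors emulate them
 * against the virtual CPU's shadow state, never real hardware state. */
void handle_privileged_trap(vcpu_t *v, uint32_t insn)
{
    switch (decode_cop0(insn)) {
    case OP_MFC0:  v->regs[rt(insn)] = v->cp0_status; break;
    case OP_MTC0:  v->cp0_status = v->regs[rt(insn)]; break;
    case OP_TLBWR: emulate_tlb_write(v);              break;
    default:       raise_guest_exception(v);          return;
    }
    v->pc += 4;  /* skip the emulated instruction */
}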
Virtual Physical Memory (1)
• Performs address translation and maintains a physical-to-machine address mapping
• Virtual machines use physical addresses
• Disco maps physical addresses to machine addresses (FLASH's 40-bit addresses)
• Relies on the software-reloaded translation-lookaside buffer (TLB) of the MIPS processor
  (hardware-reloaded or software-reloaded?)
Virtual Physical Memory (2)
• When the OS wants to update the TLB, Disco emulates the operation, computes the corrected TLB entry, and inserts it
• pmap → quick computation of TLB entries (see the sketch below)
• Each VM has an associated pmap in the monitor
• Each pmap entry also has a back pointer to its virtual address, used to invalidate mappings in the TLB
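A sketch in C of how a software TLB miss might be serviced through a per-VM pmap, as described above. The layout, sizes, and names are assumptions for illustration, not Disco's real data structures.

#include <stdint.h>

#define PAGE_SHIFT 12
#define VM_PAGES   (1u << 16)   /* "physical" pages per VM, sized for the sketch */

typedef struct pmap_entry {
    uint64_t machine_page;  /* machine page backing this physical page */
    uint64_t back_vaddr;    /* back pointer: virtual address mapping it,
                               used later to invalidate the TLB entry */
} pmap_entry_t;

typedef struct vm {
    pmap_entry_t pmap[VM_PAGES];  /* per-VM physical-to-machine map */
} vm_t;

/* On a software TLB miss, rewrite the guest's physical-address PTE
 * into a machine-address entry before inserting it in the real TLB. */
uint64_t make_tlb_entry(vm_t *vm, uint64_t guest_pte, uint64_t vaddr)
{
    uint64_t ppage = guest_pte >> PAGE_SHIFT;        /* guest physical page */
    pmap_entry_t *pe = &vm->pmap[ppage];
    pe->back_vaddr = vaddr;                          /* remember for shootdown */
    uint64_t flags = guest_pte & ((1ull << PAGE_SHIFT) - 1);
    return (pe->machine_page << PAGE_SHIFT) | flags; /* machine TLB entry */
}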
Virtual Physical Memory (3)
• MIPS has a tagged TLB, using address space identifiers (ASIDs)
• ASIDs are not virtualized, so the TLB must be flushed on VM context switches
• On TLB misses: a second-level software TLB (an idea like a cache? see the sketch below)
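Because flushing on every VM switch makes hardware TLB misses more frequent, a larger second-level software TLB caches recent translations. A direct-mapped sketch in C; the size and names are my own assumptions:

#include <stdint.h>

#define L2_SIZE 4096   /* entries; power of two for cheap indexing */

typedef struct l2_entry {
    uint64_t tag;    /* virtual page number combined with a VM/ASID tag */
    uint64_t entry;  /* precomputed machine TLB entry */
} l2_entry_t;

static l2_entry_t l2_tlb[L2_SIZE];

/* Consulted on a hardware TLB miss; returns 1 and fills *out on a hit.
 * A miss falls through to the slow path (guest page table + pmap). */
int l2_lookup(uint64_t tag, uint64_t *out)
{
    l2_entry_t *e = &l2_tlb[tag & (L2_SIZE - 1)];
    if (e->tag == tag) { *out = e->entry; return 1; }
    return 0;
}

void l2_insert(uint64_t tag, uint64_t entry)
{
    l2_entry_t *e = &l2_tlb[tag & (L2_SIZE - 1)];
    e->tag = tag;
    e->entry = entry;
}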
NUMA Memory Management (1)
• NUMA: memory access time depends on the memory's location relative to the processor; local memory is faster than non-local memory (SGI)
• CC-NUMA: all NUMA computers sold on the market use special-purpose hardware to maintain cache coherence (non-CC-NUMA systems are complex to program)
• Cache misses should be satisfied from local memory (fast) rather than remote memory (slow)
• Dynamic page migration and page replication system
NUMA Memory Management (2)
• Pages heavily accessed by one node are migrated there; read-shared pages are replicated on the nodes that frequently access them
• Write-shared pages are not, since maintaining consistency would require remote accesses anyway
• The migration and replication policy is driven by the cache-miss-counting facility provided by the FLASH hardware (see the policy sketch below)
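A sketch in C of such a miss-driven policy. FLASH really does count cache misses per page in hardware, but the thresholds and structure below are invented for illustration:

#include <stdint.h>

#define NODES 8
#define HOT   100   /* miss threshold before a page is considered hot */

typedef struct page_stats {
    uint32_t miss[NODES];  /* cache misses to this page, per node */
    int      writable;     /* does any writable mapping exist? */
} page_stats_t;

enum action { KEEP, MIGRATE, REPLICATE };

enum action numa_policy(const page_stats_t *p)
{
    uint32_t total = 0, hot_nodes = 0;
    for (int n = 0; n < NODES; n++) {
        total += p->miss[n];
        if (p->miss[n] >= HOT) hot_nodes++;
    }
    if (total < HOT)    return KEEP;      /* not hot enough to bother */
    if (p->writable)    return KEEP;      /* write-shared: consistency
                                             would need remote access */
    if (hot_nodes <= 1) return MIGRATE;   /* one hot node: move it there */
    return REPLICATE;                     /* read-shared: local copies */
}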
NUMA Memory Management (3)
• Two different virtual processors of the same virtual machine logically read-share the same physical page, but each virtual processor accesses a local copy
• memmap tracks which virtual pages reference each physical page; used during TLB shootdown (see the sketch below)
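A sketch in C of the memmap back-map and a TLB shootdown driven by it. The fixed-size mapper list and function names are assumptions:

#include <stdint.h>

#define MAX_MAPPERS 8   /* small fixed back-map for the sketch */

typedef struct memmap_entry {
    int nmappers;
    struct {
        int      vcpu;   /* virtual CPU holding the mapping */
        uint64_t vpage;  /* virtual page that maps this machine page */
    } mapper[MAX_MAPPERS];
} memmap_entry_t;

/* Hypothetical primitive: remove one entry from one virtual CPU's TLB
 * (and from the second-level software TLB sketched earlier). */
static void tlb_invalidate_on(int vcpu, uint64_t vpage)
{
    (void)vcpu; (void)vpage;
}

/* Before migrating machine page `mpage` (or reclaiming a replica),
 * invalidate every TLB entry that still points at it. */
void shootdown(memmap_entry_t *memmap, uint64_t mpage)
{
    memmap_entry_t *e = &memmap[mpage];
    for (int i = 0; i < e->nmappers; i++)
        tlb_invalidate_on(e->mapper[i].vcpu, e->mapper[i].vpage);
    e->nmappers = 0;  /* no mappings remain */
}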
Disco Memory Management
Virtual I/O Devices
• Disco intercepts all device accesses from the virtual machine and forwards them to the physical devices
• Each Disco device defines a monitor call used by the device driver to pass all command arguments in a single trap
• DMA requests need their physical addresses translated into machine addresses (a sketch follows)
Copy-On-Write Disks
• Disco intercepts every disk request that DMAs data into memory
• If the requested data is already in memory, the existing page is mapped read-only and shared across virtual machines instead of being read again
• Attempts to modify a shared page result in a copy-on-write fault handled internally by the monitor (sketched below)
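A sketch in C of the monitor's copy-on-write fault handler; the allocation and mapping helpers are hypothetical stand-ins:

#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Hypothetical monitor primitives. */
extern uint64_t alloc_machine_page(void);
extern void    *machine_va(uint64_t mpage);   /* machine page -> monitor VA */
extern void     drop_reference(uint64_t mpage);

/* A VM wrote to a read-only shared page (e.g. a shared disk buffer):
 * give it a private copy and repoint its physical-to-machine mapping. */
void cow_fault(uint64_t *p2m, uint64_t ppage)
{
    uint64_t shared = p2m[ppage];
    uint64_t priv   = alloc_machine_page();
    memcpy(machine_va(priv), machine_va(shared), PAGE_SIZE);
    p2m[ppage] = priv;        /* VM now owns a writable private copy */
    drop_reference(shared);   /* the shared copy is reference-counted */
    /* stale TLB entries for the shared copy are shot down here,
       using the memmap back-map sketched earlier */
}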
Virtual Network Interface
• Communication between virtual machines by sharing data pages in memory (and hence in the caches)
• Avoids duplication of data
• Uses sharing whenever possible (a zero-copy sketch follows)
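A sketch in C of the zero-copy delivery idea on the virtual subnet: instead of copying a packet's payload between VMs, map the sender's machine page read-only into the receiver's physical address space. The primitives are hypothetical:

#include <stdint.h>

/* Hypothetical monitor primitives. */
extern void map_readonly(int dst_vm, uint64_t dst_ppage, uint64_t mpage);
extern void bump_reference(uint64_t mpage);

/* Deliver a payload page from one VM to another without a memcpy;
 * a later write by either side triggers a copy-on-write fault
 * (see the disk sketch above). */
void deliver_packet(int dst_vm, uint64_t dst_ppage, uint64_t src_mpage)
{
    map_readonly(dst_vm, dst_ppage, src_mpage);
    bump_reference(src_mpage);  /* page is now shared by both VMs */
}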
Running Commodity Operating Systems
• Minor changes to kernel code and the data segment (MIPS architecture)
• Disco uses the original device drivers
• Code added to the HAL to pass hints to the monitor for better decisions
• IRIX
SPLASHOS
• A thin, specialized library OS supported directly by Disco (no need for a virtual memory subsystem)
Weakness of Disco
• Virtual physical memory is implemented in Disco by catching TLB misses, but this works only on the MIPS architecture: nearly all other architectures use hardware-loaded TLBs. The paper doesn't propose a way to provide virtual physical memory without the ability to catch TLB misses, and there seems to be no simple solution, so this may be a considerable problem for implementing virtual machines efficiently on other architectures.
Experiments (1)
• No FLASH hardware → simulation on SimOS
• Four workloads
• Execution overheads
• Memory overheads
• Scalability
• Dynamic page migration and replication
Experiments (2)
• Real hardware → SGI Origin 200
Related Work
• System Software for Scalable Shared
Memory Machines
• Virtual Machine Monitors
• Other System Software Structuring
Techniques
• CC-NUMA Memory Management
Problem
• On p. 3, the authors say that "..., the changes for CC-NUMA machine represent a significant development cost". Why is it so difficult to adapt an existing OS to multiprocessors? In the case of Linux, machine-dependent code is only about 2% of the entire system. Do you think it requires less effort to implement virtual machines than to adapt an existing OS to multiple processors?
• Are virtual machines the real solution for truly utilizing the power of multiple processors?
• Or do people use them for the benefits of running multiple OSes on the same machine, and the ability to run several specialized OSes instead of a general-purpose one?
• This seems like a good idea. How has it been improved since then? Is it still in use? Out of curiosity, how many people and how much time were spent on Disco?
• Why didn't this CC-NUMA multiprocessor architecture take off? I mean, it is not a very popular system nowadays, right?
• (No further information about it; the project page went dead around 2000.)
• The performance studies presented in the paper are largely short workloads, because the simulation environment made it impossible to run long-running workloads. How well would this system run on long workloads? How much does the simulation environment affect the results of running short workloads?
• In the introduction, the authors mention that operating systems specialized for scalable shared-memory multiprocessors require significant changes that result in late, incompatible, and buggy system software. They also mention a few prototype systems that try to deal with this. My question is: why were none of these compared to Disco in the performance studies?
• Why did the use of virtual machine monitors fade away after the 1970s (i.e., what problems did they have), and what changes/improvements today have brought them back? Will they stick around this time?
• Can we use CC-NUMA-type machines across the Internet, so that everybody contributes their machine to combine into one giant computer? (NC? Pervasive computing?)
1. The paper used shared-memory multiprocessors as an example to show that a VMM is the right solution for handling hardware innovations. Do you think a VMM is still the best choice for the multi-core CPU structure (one of the latest hardware innovations)? (yes)
2. In Section 7.3, the paper mentions the similarities between Disco and the microkernel structure. So, are virtual machine monitors microkernels done right? (reference: http://scholar.google.ca/scholar?hl=en&lr=&safe=off&q=Are+virtual+machine+monitors+microkernels+done+right%3F&btnG=Search) (Andrew Warfield)
Thank you~