Disco: Running Commodity Operating Systems on Scalable Multiprocessors
Presented by: Pierre LaBorde, Jordan Deveroux, Imran Ali, Yazen Ghannam, Tzu-Wei Kuo
Paper by: Edouard Bugnion, Scott Devine, Kinshuk Govil, Mendel Rosenblum

Introduction
Pierre LaBorde

Introduction
• CC-NUMA
  o Cache-Coherent Non-Uniform Memory Access
• Coupling with standard distributed protocols
  o TCP/IP
  o NFS
• Global Buffer Cache

Introduction
• Hide NUMA-ness
  o Page placement
  o Dynamic page migration
  o Dynamic page replication

Problem
• Operating systems for innovative hardware
  o Scalable shared memory multiprocessors
• Significant changes required
  o Operating systems typically have millions of lines of code

Solution: Virtual Machine Monitors
• Instead of modifying the existing OS
  o Additional layer of software between the hardware and the OS
  o Multiple copies of existing operating systems: supports a variety of workloads
  o Virtualizes all of the resources: exports a conventional hardware interface
  o Schedules virtual resources on the physical processor and memory

Virtual Machine Monitor
• Monitor and distributed protocols need to scale
  o Simplicity of the monitor
  o Fault containment
  o NUMA memory management issues
• Global policies
  o Fine-grained resource sharing

Challenges
• Overheads
  o Privileged instructions
  o I/O devices
• Resource Management
  o Instruction execution stream: idle loop, lock busy-waiting
• Communication and Sharing
  o Virtual disk

Disco: A Virtual Machine Monitor
Jordan Deveroux

Disco's Interface
• Processors
  o Abstraction of a MIPS R10000 processor
  o Does not support complete virtualization of the kernel virtual address space
  o Extends the architecture to support efficient access to some processor functions
• Physical Memory
  o Abstraction of main memory that resides in a contiguous physical address space
  o Uses dynamic page migration and replication to export a nearly uniform memory architecture to the software
• I/O Devices
  o Each virtual machine has a specified set of I/O devices
  o Intercepts communication from all of its
    I/O devices for translation or emulation
  o Virtualizes access to the networking devices of the underlying system

Implementing Disco
• Multithreaded, shared memory program
• Disco vs. Other Systems
  o NUMA memory placement
  o Cache-aware data structures
  o Interprocessor communication patterns
• NUMA memory management
  o Copy Disco into all memories of the FLASH machine
• Cache-aware data structures
  o Partitioned so that parts accessed only by a certain processor are in memory near that processor
• Interprocessor communication patterns
  o Very few locks
  o Wait-free synchronization

Implementing Disco: Virtual CPUs
• Emulates virtual CPUs by direct execution on the real CPUs
• Same execution speed as running on the real CPUs
• Each virtual CPU has a data structure like a process table entry in a traditional OS
  o Contains the state of the virtual CPU
• Disco runs in kernel mode with full access to the hardware
• Simple scheduler allows virtual processors to be shared

Implementing Disco: Virtual Physical Memory
• Adds a level of address translation and maintains physical-to-machine address mappings
• Translation is performed using the translation-lookaside buffer (TLB)
• Memory references are translated through this mapping from then on
• Each TLB entry is marked with an address space identifier to avoid flushing the TLB on context switches
• Each TLB miss is more expensive
  o Emulation of the trap architecture
  o Emulation of privileged instructions
  o Remapping of physical addresses

Implementing Disco: NUMA Memory Management
• Optimization that enhances data locality
• Fast translation of physical-to-machine addresses
• Allocation of real memory to virtual machines
• Only moves pages when the move will yield a performance benefit
• Contains a memmap data structure with an entry for each real machine memory page

Two different virtual processors of the same virtual machine logically read-share the same physical page, but each virtual processor accesses a local copy.

Implementing Disco: Virtual I/O
• Intercepts all device accesses from the
  virtual machine and forwards them to the physical devices
• Each Disco device defines a monitor call used by the device driver to pass all command arguments
• Disks and network interfaces include a map as part of their arguments
  o List of address pairs that specify the source and destination of I/O operations

VM Sharing
Imran Ali

Copy-on-Write Disks
• Uses virtual memory addressing to map data to physical memory
• Multiple virtual machines (VMs) share machine memory
• Copy-on-write keeps each VM unaware that machine memory is being shared

VM Sharing Pages

Virtual Network Interfaces
• Virtual machines cannot communicate with each other directly
• They communicate using standard protocols through Ethernet-type addressing
• All read-only pages can be shared across virtual machines, reducing memory overhead
• Pages are shared whenever possible and are replicated when needed to improve performance

Transparent Sharing of Pages

Experimental Results
Yazen Ghannam

Experimental Setup
• Experiments are simulated, not run on real hardware
• Used four different workloads
  o Software Development (Pmake): OS and I/O intensive
  o Hardware Development (Engineering): OS light; large memory footprint
  o Scientific Computing (Raytrace, Radix): OS light; uses shared memory regions
  o Commercial Database: I/O light; single memory intensive

Execution Overheads

Memory Overheads

Scalability

Page Migration and Replication

Experiences and Related Work
Tzu-Wei Kuo

Experiences on Real Hardware
• Disco was ported to run on real hardware in order to confirm the simulation results
• Runs on an SGI Origin200 board, which forms the basis of the FLASH machine
  o Single 180 MHz MIPS R10000 processor
  o 128 MB of memory

Experiences on Real Hardware
• Overheads of Virtualization
• Two workloads
  o Pmake: compiles Disco itself using the SGI development tools, two files at a time
  o Engineering: simulates the memory system of the FLASH machine

Experiences on Real Hardware
• This table shows a breakdown of the execution
  time for the two workloads and a comparison between IRIX and Disco running IRIX. The execution time is broken down into user, system, and idle time.

Related Work
• System Software for Scalable Shared Memory Machines
• Virtual Machine Monitors
• Other System Software Structuring Techniques
• CC-NUMA Memory Management

Conclusion
• Developing system software for scalable shared memory multiprocessors without massive development effort
• Experimental results show that the overhead of virtualization is modest in both processing time and memory footprint
• Disco provides a simple solution for scalability and reliability
• Lower implementation cost
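
Sketch: Virtual Physical Memory

The extra level of translation described under "Implementing Disco: Virtual Physical Memory" can be illustrated with a small model. This is only a sketch under stated assumptions: the names `VirtualMachine`, `Tlb`, `pmap`, `page_tables`, and the 4 KB page size are hypothetical and are not taken from Disco's source. The guest OS maps virtual pages to physical pages, the monitor maps physical pages to machine pages, and the TLB caches the combined result tagged by address space identifier (ASID).

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class VirtualMachine:
    def __init__(self, pmap):
        # Monitor-owned map: physical page number -> machine page number.
        self.pmap = dict(pmap)
        # Guest-owned page tables: asid -> {virtual page -> physical page}.
        self.page_tables = {}


class Tlb:
    def __init__(self):
        self.entries = {}  # (asid, virtual page) -> machine page
        self.misses = 0

    def translate(self, vm, asid, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        key = (asid, vpn)
        if key not in self.entries:
            # Miss path: look up the guest mapping, then remap the
            # physical page to a machine page before filling the TLB.
            self.misses += 1
            ppn = vm.page_tables[asid][vpn]
            self.entries[key] = vm.pmap[ppn]
        return self.entries[key] * PAGE_SIZE + offset


vm = VirtualMachine(pmap={1: 3})
vm.page_tables[1] = {5: 1}   # address space 1
vm.page_tables[2] = {5: 1}   # address space 2, same virtual page
tlb = Tlb()
machine_addr = tlb.translate(vm, asid=1, vaddr=5 * PAGE_SIZE + 12)
```

Because entries are keyed by `(asid, vpn)`, switching address spaces in this model does not require clearing `entries`. The miss path does two lookups instead of one, which is the remapping cost listed on the slide.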
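
Sketch: Copy-on-Write Disks

The sharing described under "Copy-on-Write Disks" can be modeled roughly as follows. This is a hypothetical illustration, not Disco's implementation: the `Monitor` class, its fields, and the per-VM `pmap` dictionaries are assumed names. A disk block already in machine memory is mapped read-only into any VM that reads it, and a write gives the writing VM its own private copy.

```python
class Monitor:
    def __init__(self, disk):
        self.disk = disk      # block number -> bytes (shared disk image)
        self.memory = []      # machine pages, indexed by machine page number
        self.cached = {}      # block number -> machine page holding it
        self.readonly = set() # machine pages mapped copy-on-write

    def _new_page(self, data):
        self.memory.append(bytes(data))
        return len(self.memory) - 1

    def read_block(self, pmap, ppn, block):
        # The first reader brings the block into machine memory; later
        # readers (from any VM) are mapped to the same machine page.
        if block not in self.cached:
            mpn = self._new_page(self.disk[block])
            self.cached[block] = mpn
            self.readonly.add(mpn)
        pmap[ppn] = self.cached[block]

    def write_page(self, pmap, ppn, data):
        mpn = pmap[ppn]
        if mpn in self.readonly:
            # Copy-on-write fault: give this VM a private page so the
            # shared copy stays unchanged for everyone else.
            mpn = self._new_page(self.memory[mpn])
            pmap[ppn] = mpn
        self.memory[mpn] = bytes(data)


monitor = Monitor(disk={0: b"kernel"})
vm_a, vm_b = {}, {}               # per-VM physical -> machine maps
monitor.read_block(vm_a, 10, 0)
monitor.read_block(vm_b, 20, 0)
pages_after_reads = len(monitor.memory)
monitor.write_page(vm_a, 10, b"patched")
```

Neither VM needs to know about the other: each sees only its own physical page numbers, which matches the slide's point that the sharing is invisible to the VM.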
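
Sketch: Page Migration and Replication Policy

The slide "Implementing Disco: NUMA Memory Management" says that pages are only moved when a move will yield a performance benefit. One way such a decision could look is sketched below. The function name `choose_action`, the per-node miss counters, the write counter, and the threshold of 64 are all assumptions made for illustration; the slides do not give the actual policy parameters.

```python
def choose_action(misses_by_node, writes, home_node, hot_threshold=64):
    """Decide what to do with one page.

    misses_by_node: node id -> cache misses to this page from that node
    writes:         number of writes observed to the page
    home_node:      node whose memory currently holds the page
    Returns ("migrate", node), ("replicate", None), or ("none", None).
    """
    total = sum(misses_by_node.values())
    if total < hot_threshold:
        return ("none", None)          # page is not hot enough to bother

    active = [n for n, count in misses_by_node.items() if count > 0]
    if len(active) == 1:
        node = active[0]
        if node != home_node:
            return ("migrate", node)   # used by one remote node only
        return ("none", None)          # already local to its only user

    if writes == 0:
        return ("replicate", None)     # read-shared: give each node a copy

    return ("none", None)              # write-shared: moving would not help


one_remote_user = choose_action({0: 0, 1: 100}, writes=5, home_node=0)
read_shared = choose_action({0: 50, 1: 60}, writes=0, home_node=0)
write_shared = choose_action({0: 50, 1: 60}, writes=3, home_node=0)
cold_page = choose_action({1: 10}, writes=0, home_node=0)
```

The read-shared case corresponds to the caption on that slide, where two virtual processors logically share one physical page but each accesses a local copy.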