Download Virtual machine - Duke Computer Science

A Survey of Virtual Machine Research
Landon Cox
April 12, 2017
How do we virtualize?
• Key technique: “trap and emulate”
• Untrusted/user code tries to do something it can’t
• Transfer control to something that can do it
• Evaluate whether thing is allowed
• If so, do it and return control. Else, kill process or throw exception.
• Ways to do this?
• Rely on HW
• Rewrite code
• Where have we seen trap and emulate?
• Virtual memory
• Process tries to access non-resident memory
• Trap to OS
• OS makes virtual page resident
• Retry instruction that caused fault
• Generally useful technique, crucial for virtual machines
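The trap-and-emulate steps above can be sketched in a few lines of Python. This is a toy model, not how any real monitor is written: the instruction names (`add`, `set_interrupts`, `load_page_table`) and the `Monitor` class are made up for illustration, and a Python exception stands in for the hardware trap.

```python
class Trap(Exception):
    """Stands in for a hardware trap raised by a privileged instruction."""
    def __init__(self, op, arg):
        super().__init__(op)
        self.op = op
        self.arg = arg

# Hypothetical privileged instruction names, chosen only for this sketch.
PRIVILEGED = {"set_interrupts", "load_page_table"}

class Monitor:
    """Toy monitor: evaluates whether a trapped operation is allowed."""
    def __init__(self, allowed):
        self.allowed = set(allowed)
        self.virtual_state = {}  # the guest's virtual privileged state

    def emulate(self, trap):
        if trap.op not in self.allowed:
            # "Else, kill process or throw exception."
            raise PermissionError(f"guest killed: {trap.op} not allowed")
        # Emulate against virtual state instead of touching real hardware.
        self.virtual_state[trap.op] = trap.arg

def execute(instr, arg, regs):
    """Pretend hardware: runs safe instructions directly, traps on the rest."""
    if instr in PRIVILEGED:
        raise Trap(instr, arg)
    if instr == "add":
        regs["acc"] += arg
    else:
        raise ValueError(f"unknown instruction {instr!r}")

def run(program, monitor):
    regs = {"acc": 0}
    for instr, arg in program:
        try:
            execute(instr, arg, regs)      # direct execution
        except Trap as trap:
            monitor.emulate(trap)          # transfer control, then return
    return regs
```

For example, `run([("add", 2), ("set_interrupts", False), ("add", 3)], Monitor({"set_interrupts"}))` executes both `add` instructions directly and routes only the privileged one through the monitor.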
Coarser abstraction: virtual machine
• We’ve already seen a kind of virtual machine
• OS gives processes virtual memory
• Each process runs on a virtualized CPU
• Virtual machine
• An execution environment
• May or may not correspond to physical reality
• Emulate the parts that don’t correspond to reality
Virtual machine options
• Approaches to implementing VMs
1. Interpreted virtual machines
• Translate every VM instruction
• Kind of like on-the-fly compilation
• VM instruction → HW instruction(s)
2. Direct execution
• Execute instructions directly
• Emulate the hard ones
Interpreted virtual machines
• Implement the machine in software
• Must translate emulated to physical
• Java: byte codes → x86, PPC, ARM, etc
• Software fetches/executes instructions
[Figure: Program (foo.class) supplies byte code to the Interpreter (java), which runs on x86]
Looks like a dynamic virtual memory translator
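To make "software fetches/executes instructions" concrete, here is a minimal sketch of an interpreted stack machine. The instruction set (`push`, `add`, `mul`, `jump_if_zero`, `halt`) is invented for this example and is not real Java byte code. It only shows the shape of the fetch/decode/execute loop.

```python
def interpret(code):
    """Run a list of (opcode, *operands) tuples on a toy stack machine."""
    stack = []
    pc = 0                              # virtual program counter
    while pc < len(code):
        op, *args = code[pc]            # fetch and decode
        pc += 1
        if op == "push":                # execute
            stack.append(args[0])
        elif op == "add":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        elif op == "jump_if_zero":
            if stack.pop() == 0:
                pc = args[0]            # branch by rewriting the virtual PC
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return stack

# (2 + 3) * 4
program = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]
result = interpret(program)
```

Every virtual instruction costs several host instructions here. That overhead is what direct execution avoids.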
Java virtual machine
• What is the low-level interface to programs?
• Java byte-code (or Dalvik) instructions
• What abstraction does this interface provide?
• JVM: Stack-machine architecture
• Dalvik (Android): Register-based architecture
• The Java programming language
• High-level language compiled into byte code
• Library of services (kind of like a kernel)
• Like C++/STL, C#
Direct execution
• What is the interface?
• Hardware ISA (e.g., x86 instructions)
• What is the abstraction?
• Physical machine (e.g., x86 processor)
[Figure: Program (XP kernel) issues x86 instructions to the Monitor (VMware)]
Different techniques
• Full emulation
• Bochs, QEMU
• Partial emulation
• VMware
• Para-virtualization
• Xen
• Dynamic recompilation
• Virtual PC
Views of the CPU
• How is a process’s view of the CPU different than the OS’s?
• Kernel mode
• Access to physical memory
• Manipulation of page tables
• Other “privileged instructions”
• Turn off interrupts
• Traps
• Keep these in mind when thinking about virtual machines
Traditional OS structure
[Figure: Apps run in Ring 3; the Operating System runs in Ring 0, on the Host Machine]
Virtual machine structure
[Figure: Guest Apps run in Ring 3; Guest OSes run in “Ring 3?”; the Virtual Machine Monitor (Hypervisor) runs in Ring 0, on the Host Machine]
Virtual machine structure
[Figure: Guest Apps run in Ring 3; Guest OSes run in Ring 1; the Virtual Machine Monitor (Hypervisor) runs in Ring 0, on the Host Machine]
Why are hypervisors useful?
• Code reuse
• Can run old operating systems + apps on new hardware
• Original purpose of VMs by IBM in the 60s
• Encapsulation
• Can put entire state of an “application” in one thing
• Move it, restore it, copy it, etc
• Isolation, security
• All interactions with hardware are mediated
• Hypervisor can keep one VM from affecting another
• Hypervisor cannot be corrupted by guest operating systems
Encapsulation
• Say I want to suspend/restore an application
• Write the process mem + regs to disk
• Can I restart the process later?
• Yes, this is just like switching threads or processes
• Just restore address space and registers
• Jump to saved PC
Encapsulation
• Say I want to suspend/restore an application
• Write the process mem + regs to disk
• What if I reboot my kernel and restart the process?
• No, application state is spread out in many places
• Application might depend on other processes
• Applications have state in the kernel (lost on reboot), e.g., open files, locks, process ids, driver states, etc
Encapsulation
• Virtual machines capture all of this state
• Can suspend/restore an application
• On same machine between boots
• On different machines
• Very useful in server farms
• We’ll talk more about this with Xen
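A minimal sketch of why capturing the whole machine makes suspend/restore work. The `ToyVM` class and its fields are hypothetical. The point is only that the guest kernel's state (open files, process ids) lives inside the saved image alongside memory and registers, so nothing is left behind in a host kernel.

```python
import json
from dataclasses import dataclass, field

@dataclass
class ToyVM:
    """Hypothetical VM image: everything the guest needs is in one object."""
    registers: dict = field(default_factory=lambda: {"pc": 0})
    memory: bytearray = field(default_factory=lambda: bytearray(16))
    # Guest-kernel state that a process-only checkpoint would lose on reboot.
    open_files: dict = field(default_factory=dict)   # fd -> (path, offset)
    next_pid: int = 1

def suspend(vm):
    """Serialize the entire VM to a string that can be stored or shipped."""
    return json.dumps({
        "registers": vm.registers,
        "memory": vm.memory.hex(),
        "open_files": {str(fd): list(entry) for fd, entry in vm.open_files.items()},
        "next_pid": vm.next_pid,
    })

def resume(image):
    """Rebuild the VM, possibly after a host reboot or on another machine."""
    state = json.loads(image)
    return ToyVM(
        registers=state["registers"],
        memory=bytearray.fromhex(state["memory"]),
        open_files={int(fd): tuple(entry) for fd, entry in state["open_files"].items()},
        next_pid=state["next_pid"],
    )

vm = ToyVM()
vm.registers["pc"] = 42
vm.memory[0] = 0xFF
vm.open_files[3] = ("/tmp/log", 128)
restored = resume(suspend(vm))
```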
Security
• Can user processes corrupt the kernel? Which ones?
• Privileged user processes can (running as super user)
• Can overwrite logs
• Overwrite kernel file
• Can boot a new kernel
• Exploit a bug in the system call interface
• Ok, so I’ll use a hypervisor. Is my data any less vulnerable?
• All the state in the guest is still vulnerable (file systems, etc)
• So what’s the point?
• Hypervisors can observe the guest OS
• Security services in hypervisor are safe, makes detection easier
Security
• Hypervisors buggy too, why trust them more than kernels?
• Narrower interface to malicious code (no system calls)
• No way for kernel to call into hypervisor
• Smaller, (hopefully) less complex codebase
• Should be fewer bugs
• Anything wrong with this argument?
• Hypervisors are still complex
• May be able to take over hypervisor via non-syscall interfaces
• E.g., what if hypervisor is running IP-accessible services?
• Para-virtualization (in Xen) may compromise this
VMware architecture
[Figure: Host World (Host App and VM App on the Host OS, with the VM Driver) and VMM World (Target Apps on the Target OS, on the Virtual Machine Monitor), both on the Host Machine]
SimOS (proto-VMware) arch.
[Figure: Target Apps run on the Target OS, which runs on SimOS; SimOS and Host Apps run on the Host OS, on the Host Machine]
SimOS memory
[Figure: SimOS VMemory holds SimOS code and data, Target OS code and data, and Target App code and data; SimOS also provides a Virtual MMU and a SimDisk; the Host OS on the Host Machine holds the SimDisk File and the Mem File]
SimOS page fault
[Figure: an access to an unmapped address in SimOS VMemory involves the SimOS fault handler and the Target OS fault handler; below are the Virtual MMU, SimDisk, Host OS, Host Machine, SimDisk File, and Mem File]
• What if I want to suspend and migrate the target OS?
Full vs interpreted
• Why would I use VMware instead of Java?
• Support for legacy applications
• Do not force users to use a particular language
• Do not force users to use a particular OS
• Why would I use Java instead of VMware?
• Lighter weight
• Nice properties of type-safe language
• Can prove safety at compile time
Full vs interpreted
• What about protection?
• What does Java use for protection? VMware?
• Java relies on language features (cannot express unsafe data access)
• VMware: hardware (like an OS) and bin. rewriting (like link-loader)
• What are the trade-offs? Which protection model is better?
• Java gives you stronger (i.e., provable) safety guarantees
• Hardware protection doesn’t constrain programming expressiveness
• What about sharing (the opposite of protection)?
• Sharing among components in Java is easy (call a function, compiler makes sure it is safe)
• Sharing between address spaces is more work, has higher overhead (use sockets, have to context switch, flush TLB, etc)
• Singularity (could try both)
In 1974 …
• Virtual machines have finally arrived!
• (except not really)
• Why did it take until the 2000s for VMs to actually arrive?
• Data centers are the main reason for widespread adoption of VMs
• Nice to run multiple OSes on your desktop
• VMs allow infrastructure owners to safely rent their resources
• Could just hand out accounts. Why are VMs easier?
• VMs encapsulate all of an app’s dependencies
• Includes kernel and libraries w/ correct versions
• Can move VMs around
• Can consolidate VMs on one server, and shut others down
Sharing machines among users
• When?
• Scientific computing (testbeds, “the grid”)
• Data centers (three-tier web applications)
• “The Cloud”
• Why would you want to do this?
Sharing machines among users
• Consolidate under-utilized servers to reduce CapEx and OpEx
• Avoid downtime with relocation
• Dynamically re-balance workload to guarantee application SLAs
• Enforce security policy
What should the interface be?
Amazon EC2
Does anyone know what EC2 uses?
Xen hypervisor (para-virtualized Linux)
Xen architecture
[Figure: Guest Apps run on Guest OSes; a Host App runs on Domain 0; all run on Xen, on the Host Machine]
X86_32 address space
When is each set of virtual addresses valid?
[Figure: x86_32 virtual address space from 0GB to 4GB, with a boundary at 3GB]
• Xen (S): all address spaces
• Kernel (S): all of a VM’s address spaces
• User (U): each guest app
When does the hypervisor need to flush the TLB?
When a new guest VM or guest app needs to be run.
Xen physical memory
• Allocated by hypervisor when VM is created
• Why can’t we allow guests to update PTBR?
• Might map virtual addrs to physical addrs they don’t own
• VMware and Xen used to handle this differently
• VMware maintained “shadow page tables”
• Xen used “hypercalls”
• (Xen and VMware support both mechanisms now)
VMware guest page tables
[Figure: the Guest OS updates a PTE in its own page tables; the VMM maintains the shadow page table (Virtual → Machine), which the hardware MMU uses]
• How does VMM grab control when PTE is updated?
• Marks PTE pages read-only, generates page fault.
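A rough sketch of the shadow page table idea, under simplifying assumptions: page tables are plain dictionaries, a method call stands in for the write fault on a read-only PTE page, and the class and method names are invented for illustration. The guest's table maps virtual pages to what the guest believes are physical pages. The shadow table, the one the MMU would actually walk, maps virtual pages to machine pages.

```python
class ShadowPagingVMM:
    """Toy VMM that keeps a shadow page table in sync with the guest's."""

    def __init__(self, phys_to_machine):
        # Guest-"physical" page -> real machine page, chosen by the VMM.
        self.phys_to_machine = dict(phys_to_machine)
        self.guest_pt = {}    # guest's view: virtual -> guest-physical
        self.shadow_pt = {}   # MMU's view:   virtual -> machine
        self.faults = 0

    def guest_write_pte(self, vpage, gppage):
        """Guest OS stores to its page table.

        The PTE page is mapped read-only, so on real hardware this store
        would page-fault into the VMM; here we call the handler directly.
        """
        self.faults += 1
        self._on_write_fault(vpage, gppage)

    def _on_write_fault(self, vpage, gppage):
        if gppage not in self.phys_to_machine:
            raise PermissionError("guest tried to map a page it does not own")
        self.guest_pt[vpage] = gppage                         # apply guest's write
        self.shadow_pt[vpage] = self.phys_to_machine[gppage]  # update shadow

    def mmu_translate(self, vpage):
        """What the hardware MMU sees: only the shadow table."""
        return self.shadow_pt[vpage]

vmm = ShadowPagingVMM({0: 100, 1: 205})
vmm.guest_write_pte(vpage=8, gppage=1)
machine_page = vmm.mmu_translate(8)
```

The guest never learns the machine page number. It sees only its own table, which is why an unmodified guest OS can run this way.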
Xen physical memory
• Guest OSes allocate and manage own PTs
• “Hypercall” to change PT base
• Like a system call between guest OS and Xen
• Xen must validate PT updates before use
• What are the validation rules?
1. Guest may only map phys. pages it owns
2. PT pages may only be mapped RO
Xen guest page tables
[Figure: the Guest OS issues an update-PTE hypercall (like a syscall); the VMM (1) runs a validation check and (2) performs the update on the page table (Virtual → Machine) used by the hardware MMU]
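The two validation rules can be sketched as a check inside a hypercall handler. This is an illustration only: `ToyXen`, `hypercall_update_pte`, and the dictionary-based bookkeeping are hypothetical and do not reflect Xen's real interface or data structures.

```python
class ToyXen:
    """Toy hypervisor that validates page-table updates before applying them."""

    def __init__(self):
        self.owner = {}         # machine page -> owning domain id
        self.pt_pages = set()   # machine pages currently used as page tables
        self.page_tables = {}   # domain id -> {vpage: (mpage, writable)}

    def hypercall_update_pte(self, dom, vpage, mpage, writable):
        # 1) Validation check.
        # Rule 1: guest may only map phys. pages it owns.
        if self.owner.get(mpage) != dom:
            raise PermissionError("domain does not own that page")
        # Rule 2: PT pages may only be mapped RO.
        if writable and mpage in self.pt_pages:
            raise PermissionError("page-table pages must be mapped read-only")
        # 2) Perform update.
        self.page_tables.setdefault(dom, {})[vpage] = (mpage, writable)

xen = ToyXen()
xen.owner.update({10: 1, 11: 1, 20: 2})   # domain 1 owns pages 10 and 11
xen.pt_pages.add(11)                      # page 11 holds a page table
xen.hypercall_update_pte(dom=1, vpage=0, mpage=10, writable=True)
xen.hypercall_update_pte(dom=1, vpage=1, mpage=11, writable=False)
```

Rule 2 is what stops a guest from bypassing rule 1. If a guest could write its page tables directly, it could install any mapping without going through the check.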
Para-virtualized CPU
• Hypervisor runs at higher privilege than guest OS
• Why is having only two levels a problem?
• Guest OSes must be protected from guest applications
• Hypervisor must be protected from guest OS
• What do we do if we only have two privilege levels?
• OS shares lower privilege level with guest applications
• Run guest apps and guest OS in different address spaces
• Why would this be slow?
• VMM must flush the TLB on system calls, page faults
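A toy model of why separate address spaces are slow. It assumes an untagged TLB (every address-space switch empties it) and a made-up workload in which each system call touches three application pages and two kernel pages. The numbers mean nothing beyond showing the trend.

```python
class ToyTLB:
    """Untagged TLB: a flush discards every cached translation."""
    def __init__(self):
        self.entries = set()
        self.misses = 0
        self.flushes = 0

    def flush(self):
        self.entries.clear()
        self.flushes += 1

    def access(self, vpage):
        if vpage not in self.entries:
            self.misses += 1          # would trigger a page-table walk
            self.entries.add(vpage)

def run_syscalls(n, separate_address_spaces):
    """Simulate n system calls; return (TLB misses, TLB flushes)."""
    tlb = ToyTLB()
    app_pages = [0, 1, 2]             # hypothetical working sets
    kernel_pages = [100, 101]
    for _ in range(n):
        for page in app_pages:
            tlb.access(page)
        if separate_address_spaces:
            tlb.flush()               # switch into the guest OS address space
        for page in kernel_pages:
            tlb.access(page)
        if separate_address_spaces:
            tlb.flush()               # switch back to the guest application
    return tlb.misses, tlb.flushes

shared = run_syscalls(10, separate_address_spaces=False)
separate = run_syscalls(10, separate_address_spaces=True)
```

With a shared address space the translations stay cached after the first call. With separate address spaces every call pays for two flushes and refills both working sets.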
Google App Engine
What is the interface?
Python and Java API and runtime