Virtual Machines

Virtual Machine (VM)
Layered model of computation
Software and hardware divided into logical layers
Layer n
Receives services from server layer n – 1
Provides services to client layer n + 1
Layers interact through well-defined programming interface
Virtual layer
Software emulation of hardware or software layer n
Transparent to layer n + 1
Provides service to layer n + 1 as expected from real layer n
Virtual layer n can run at some layer m ≠ n in real system
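The transparency property can be sketched in a few lines of Python. This is only an illustration: the class names, the block size, and the disk/file scenario are assumptions made for the example, not part of the slides. A client at layer n + 1 calls the same interface whether layer n is real or a software emulation running on top of some other layer.

```python
class RealDisk:
    """Real layer n: a block device holding fixed-size blocks."""

    def __init__(self, blocks):
        self._blocks = list(blocks)

    def read_block(self, index):
        return self._blocks[index]


class HostFile:
    """A different real layer (m - 1): a byte store offered by a host OS."""

    def __init__(self, data):
        self._data = data

    def read_bytes(self, offset, length):
        return self._data[offset:offset + length]


class VirtualDisk:
    """Virtual layer n: same interface as RealDisk, emulated over HostFile."""

    BLOCK_SIZE = 4  # hypothetical block size for the example

    def __init__(self, host_file):
        self._host_file = host_file

    def read_block(self, index):
        return self._host_file.read_bytes(index * self.BLOCK_SIZE, self.BLOCK_SIZE)


def client(disk):
    """Layer n + 1: uses only the layer-n interface."""
    return disk.read_block(1)


real = RealDisk([b"boot", b"data"])
virtual = VirtualDisk(HostFile(b"bootdata"))
```

The client cannot tell the two apart: both calls return the same block.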
Modern Microprocessors — Fall 2012
Virtual Machines
Dr. Martin Land
Examples of Virtual Systems
virtual
Hardware Virtual System
Real System
Server OS
Protocol Stack
Network
Client real
Runs above primary OS / below guest OS
Provides guest OS with software emulation of real hardware system
Hardware
System Virtual Machine
Emulation of system-level hardware environment
Server
Cloud computing
Virtual
Runs above physical hardware and below one or more OSs
Service level agreement (SLA) specifies infrastructure requirements
User sees hardware / software configuration / performance
Application
Application
Application
Application
OS
OS
Guest OS
Provider assembles virtual configuration
Meets SLA requirements
May be implemented in any way
OS
VMM
VMM
Hardware
OS
Hardware
Basic System
Virtual Machines
Application
VM
Real
Process Virtual Machine
VM provides application interpretation above OS
Hosted Virtual Machine
Virtual machine monitor (VMM)
Web server
Local OS
Protocol Stack
real
n + 1
virtual n = m
m – 1
Types of Virtual Machine
Web browser exchanges data with server
Browser n + 1
n
n – 1
System VM
Hardware
Hardware
Hosted VM
Process VM
Process VM Example — Java
Hosted VM Example — Guest OS Over OS
DOS command line interface over Windows
Windows allocates 1 MB virtual memory space
Copies DOS kernel into low memory
Designed for program portability between platforms
Provides standard interface to software
Java VM located above a standard OS
Interface to hardware implementation dependent
I/O operations performed by calls to OS
Java compiled to bytecode
Bytecode usually run (interpreted) in Java VM
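A process VM that interprets bytecode can be sketched as a small stack machine. The opcodes below are hypothetical and far simpler than real Java bytecode. The point is only that the same program runs on any platform that provides the interpreter.

```python
# Hypothetical opcodes for the sketch (not real JVM opcodes).
PUSH, ADD, MUL, HALT = range(4)


def run(bytecode):
    """Interpret a list of opcodes and operands; return the top of stack at HALT."""
    stack = []
    pc = 0  # program counter into the bytecode list
    while True:
        op = bytecode[pc]
        if op == PUSH:
            stack.append(bytecode[pc + 1])  # operand follows the opcode
            pc += 2
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
            pc += 1
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op} at {pc}")


# (2 + 3) * 4
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]
result = run(program)
```

A processor whose native instruction set is the bytecode would execute `program` directly, with no `run` loop in between.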
debug
Windows
Application
System calls handled by guest DOS kernel
Virtual 86
Windows
DOS accesses to hardware
Trapped and served by Windows host OS
Responses returned to DOS
Hardware
Concurrent DOS windows
Multiple allocations of 1 MB virtual memory spaces
DEBUG
Application running in virtual DOS machine
Sees 1 MB memory space allocated by Windows
Register values
Windows emulates real values to DOS
Debug emulates DOS values to user
Java without VM
Java bytecode processor in IBM mainframes
Native machine language (ISA) is Java bytecode
Execute Java bytecode without interpretation
Parallels, VirtualBox, VMware, DOSBox, ...
Host Windows, Linux, DOS, … as guest OSs over host OS
http://java.sun.com/docs/books/tutorial/getStarted/intro/definition.html
Virtual Machine in IBM z/990 Mainframe
…
User User
OS — LPAR
…
User User
OS — LPAR
…
User User
OS — LPAR
…
…
User User
Resource management
Hardware redundancy
High availability
Recovery management
…
Hardware pooling
User
App1
App2
App2
App3
OS
OS
OS
OS
VMM
VMM
Server
Server
Assemble hardware cluster
Map applications to hardware efficiently
OS — LPAR
Load balancing
Remap applications to hardware
Hardware
Isolate user environments on single hardware platform
Multiple copies of single operating system running independently
Multiple operating systems running concurrently
Maintain higher security
VMM — Systems Manager — Hypervisor
VM as System Management Tool
Hardware
CPUs, I/O system, internal communication network
VMM (hypervisor)
Operator console for partitioning/configuring CPUs and I/O
Provides hardware emulation as abstraction to OS layer
OS
Logical partition (LPAR) runs separate instance of operating system
Run z/OS, MVS, VM, Unix, Linux, Windows, … instances in parallel
Non-Windows OS versions expect to see hypervisor (not hardware)
User
User sees single-user interface provided by one OS
User
z/990 Parallel Sysplex Model
Virtualization for Server Systems
Parallel Sysplex
Merge 2 to 32 instances of z/OS into a single system
Applications divide work and data among LPARs
High capacity for very large workloads
Resource sharing
Dynamic workload balancing
Old file server model
Run one application per physical server
Server specified for worst case load
Large number of typically underutilized servers
Huge aggregate spare capacity
Competition from mainframes
Geographical diversity
Coupled LPARs on remote physical systems
Physical backup
User … User User … User User … User User
Automatic failure recovery
LPAR - OS
LPAR - OS
LPAR - OS
Continuous availability
Systems Manager
…
…
User User
…
User
LPAR - OS
Virtualization in server
Hardware (processors, RAM, I/O)
User
…
User User
LPAR - OS
Coupling
Facility
…
User User
LPAR - OS
…
User User
LPAR - OS
…
…
User User
…
User
LPAR - OS
Partition hardware resources to run independent applications
Intel virtualization
IA-32 and IA-64 ISA support
I/O chipset support
Systems Manager
Hardware (processors, RAM, I/O)
VMM provides dynamic load balancing
Hardware provides centralized power, cooling, monitoring, backup
High SAR — scalability, availability, reliability
Lower cost per served client than server farm
HP Virtual Partitions (vPars)
System VM Organization
Hypervisor
Virtual machine monitor (VMM)
Lowest layer above physical hardware (host)
Uniprocessor or multiprocessor system
Creates virtual machine (VM) environments for guest OSs
Allocates physical host resources to virtual resources
VM overhead
Processor intensive applications — low overhead
Infrequent use of OS calls
Most instructions run directly on hardware
I/O intensive applications — high overhead
Frequent use of OS calls
OS calls for I/O services run in emulation
Boot
Order
I/O-limited applications
Program throughput limited by I/O latency
Emulation adds relatively small overhead
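The difference between processor-intensive and I/O-intensive workloads can be made concrete with a rough cost model. The linear formula and the numbers below are assumptions chosen for illustration; they are not measurements from the slides.

```python
def relative_runtime(trap_fraction, emulation_cost):
    """Runtime under a VMM relative to native execution (1.0 means no overhead).

    trap_fraction  -- fraction of instructions that need VMM emulation
    emulation_cost -- cost of one emulated instruction, in native instructions
    """
    if not 0.0 <= trap_fraction <= 1.0:
        raise ValueError("trap_fraction must be between 0 and 1")
    # Instructions that do not trap run directly on hardware at native speed.
    return (1.0 - trap_fraction) + trap_fraction * emulation_cost


# Hypothetical workloads: few OS calls versus many OS calls.
cpu_bound = relative_runtime(trap_fraction=0.001, emulation_cost=100)
io_heavy = relative_runtime(trap_fraction=0.05, emulation_cost=100)
```

With these assumed figures the processor-intensive program slows by about 10 percent, while the program that makes frequent OS calls runs almost six times slower.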
Hewlett-Packard, "Installing and Managing HP-UX Virtual Partitions (vPars)"
VMM Requirements
Virtualization Awareness
Hardware abstraction
Guest environment must replicate hardware
VMM must present well-defined software interface to OS
Virtualization-aware guest OS
OS written to run above VMM/hypervisor
Expects to interact with virtual host
Does not expect full or direct control of physical hardware
Protection
OS code interfaces with hypervisor code
No need to remap (bluff) pointers intended for real hardware
Isolate guests from one another
Protect VMM from guest OS and application software
Guest software cannot change allocation of physical resources
May be presented with view of real system for limited operations
Example — mainframe OS
Privilege
Writes I/O outputs to hypervisor interface
Does not attempt to configure I/O hardware devices
Particular OS may be given direct control of particular I/O device
VMM runs in kernel mode
Guest OSs and applications run in user mode
Virtualization-unaware guest OS
Hardware support for VMM
OS written to run above physical hardware
Expects full and direct control of real hardware
Requires extensive intervention and remapping by VMM
Virtualization primitives built into mainframe ISA
Any OS or application access to hardware causes trap to VMM
VMM catches every access to hardware abstraction layer (HAL)
Hardware Emulation Activities
VMM Emulation
CPU Memory Access
Operation: read data or instruction, write data
Translate data/instruction from guest to host format
Remap address space
Read/Write to real host memory
CPU I/O Device Access
Operation: read data or instruction, write data
Translate data/instruction from guest to host format
Remap I/O port space
Read/Write to real host I/O device
I/O device actions
OS
VMM
CPU
Translates guest ISA to host ISA
Memory
Translates memory size and organization
Chipset
Translates guest configuration instructions to host
I/O devices
Translates guest driver to host driver
CPU emulation example
Hardware
Run Nintendo game on PC
Translate each Nintendo instruction to IA-32 instruction set
Partial system emulation
Part of host hardware presented to OS unchanged
VMM passes guest operations to host with minimal intervention
Most system VMs emulate subset/superset of real host hardware
CPU emulation only in special cases
VMM manages I/O device DMA
DMA or IRQ: Translate OS interrupt handlers from guest format to host format
Full system emulation
VMM intervenes in every OS access to hardware
Application
Operation
Full/Partial System Emulation
OS sees hardware through operations
OS instructions cause CPU to initiate memory and I/O operations
I/O devices initiate DMA operations and interrupts
Real Hardware
Software Emulation of I/O Hardware
Bootstrap Process in System VM
Advantages
VMM provides emulation of widely supported device hardware
Guest OS runs available device drivers without modification
Difficulties
Requires very accurate device emulation
Includes hardware revisions and "bug emulation"
Performance issues
VMM intervention on every guest OS access to I/O device
Workstation without VMM
Context switch from guest OS to VMM
VMM emulates I/O access and accesses real I/O device
Context switch back to guest OS with response
Adds considerable overhead
Emulation is compute-intensive — increases CPU utilization
Least-bad case
Virtual device = real device
Remap I/O ports — no change to driver operation
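The least-bad case, where the virtual device is the real device and only the port numbers differ, can be sketched as follows. The class names and port numbers are hypothetical. Each guest access counts one trap, standing in for the context switch from guest OS to VMM and back.

```python
class HostPorts:
    """Stand-in for the real I/O port space of the host."""

    def __init__(self):
        self.regs = {}

    def out(self, port, value):
        self.regs[port] = value

    def inp(self, port):
        return self.regs.get(port, 0)


class Vmm:
    """Intercepts guest port accesses and remaps them to host ports."""

    def __init__(self, host, port_map):
        self.host = host
        self.port_map = dict(port_map)  # guest port -> host port
        self.traps = 0                  # one trap per guest I/O access

    def _translate(self, guest_port):
        if guest_port not in self.port_map:
            raise PermissionError(f"guest has no device at port {guest_port:#x}")
        return self.port_map[guest_port]

    def guest_out(self, guest_port, value):
        self.traps += 1
        self.host.out(self._translate(guest_port), value)

    def guest_in(self, guest_port):
        self.traps += 1
        return self.host.inp(self._translate(guest_port))


host = HostPorts()
vmm = Vmm(host, {0x3F8: 0x2F8})  # guest sees the device at 0x3F8
vmm.guest_out(0x3F8, 0x41)
echoed = vmm.guest_in(0x3F8)
```

The guest driver is unchanged, but every access still passes through the VMM, which is where the overhead listed above comes from.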
Virtualization Difficulties for IA‐32
Expect to have highest privilege
Can easily discover their lower privilege
System boot without VMM
CPU loads initial system loader (ISL)
ISL points to system boot device
Boot device contains OS
System boot with VMM
CPU loads initial system loader (ISL)
ISL points to system boot device
Boot device contains VMM
Device discovery without VMM
OS loader writes to host I/O space
Chipset and I/O devices respond
OS loads drivers for host devices
OS provides user interface
Device discovery with VMM
VMM loader writes to host I/O space
Chipset and I/O devices respond
VMM loads drivers for host devices
VMM provides administrator interface
Secondary
Boot
Administrator configures VM partitions
Administrator points VMM to device
containing OS boot image
VMM boots OS into partition
Device Discovery
OS loader writes to virtual I/O space
VMM responds for I/O devices
OS loads drivers for virtual devices
OS provides user interface
Memory Resource Compression
OS manages resources using IA-32 system tables
Assigns pointer to page table root (directory)
Manages page table entries
Manages memory segmentation with descriptor tables limited to 8 K entries
IA-32 designed to provide hardware support to OS
Memory segmentation
Virtual memory and paging
Task management
Interrupt management
Protection and privilege for segmentation, paging, interrupts
Workaround virtualization
Treat OS like user application
Can create a kludge on IA-32 systems
IA-32 operating systems
Workstation with VMM
Application
OS
Hardware Virtual
User
Kernel
Global descriptor table (GDT)
Map segment pointer to virtual address
Define segment type (code, data, system) and privilege level
Interrupt descriptor table (IDT)
Map interrupts and traps to service routines
Application
OS
VMM
Hardware
Real
Memory compression
VMM must reserve part of guest virtual memory for management
OS expects to see the full virtual memory space
Table resource compression
VMM requires entries in GDT and IDT for management of OS
VMM must prevent OS access to its descriptors
OS expects full control of all 8 K table entries
Ring Aliasing
Non‐Faulting Access to Privileged State
Privilege rings
Memory segments assigned privilege from 0 (highest) to 3 (lowest)
Privileged registers
Control configuration of hardware systems
VMM must
Stored in segment descriptor (table entry defining segment)
Access rights for code limited to segments of same or lower privilege
Copied into code segment selector (pointer to segment via descriptor)
User mode ~ ring 3
OS kernel mode ~ ring 0
Ring aliasing
Deprivileging
Intercept OS access to privileged registers
Provide virtual values determined for guest environment
Access to privileged registers in IA-32
Access by unprivileged software usually prevented
Access
Granted
Run VMM at ring 0 and OS at ring 1
Issues
Causes protection fault
VMM emulates response to guest instruction
Access
Denied
Some unprivileged instructions access privileged state and do not fault
CPL
CPL
Paging protection distinguishes only two privilege levels (user and supervisor)
4 level privilege not supported in 64-bit systems
OS can read its CPL from code segment selector
DPL
CPL
DPL 0 1
CPL
2
3
DPL
DPL
GDTR
pointer to GDT
IDTR
pointer to IDT
LDTR
pointer to LDT
TR
pointer to current task segment
Guest OS can determine that it does not control CPU
CPL — privilege level of code segment
DPL — privilege level for data access or branch target
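The ring check and the aliasing problem can be illustrated with a simplified rule. This sketch assumes a data access is allowed when the CPL is numerically less than or equal to the DPL, and it ignores the requested privilege level and conforming segments that the full IA-32 rules include.

```python
def data_access_allowed(cpl, dpl):
    """Simplified IA-32 check: 0 is the most privileged ring, 3 the least.

    Code running at CPL may access a segment whose DPL marks it as the
    same or lower privilege (numerically equal or higher).
    """
    for level in (cpl, dpl):
        if level not in range(4):
            raise ValueError("privilege levels are 0..3")
    return cpl <= dpl


# Native system: the OS kernel at ring 0 reaches everything.
native_kernel_ok = data_access_allowed(cpl=0, dpl=0)

# Deprivileged guest: the OS kernel runs at ring 1 under a ring-0 VMM.
guest_kernel_reaches_vmm = data_access_allowed(cpl=1, dpl=0)
guest_kernel_reaches_apps = data_access_allowed(cpl=1, dpl=3)
```

The deprivileged guest kernel is kept out of ring-0 segments as intended, but it can also read its own CPL and find 1 where it expects 0, which is the ring aliasing problem.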
On user access to system state
Protection fault on write
No fault on read
System Calls and Interrupts
Intel Virtualization Technology (VT)
System calls
Application in ring 3 invokes OS in ring 0
Require indirect mechanism (call gate)
Virtual machine monitor
Hardware boots (3rd party) VMM software instead of OS
VMM configures hardware resources among guest systems
Redirects to hidden ring 0 address
VMM must emulate call gates
Remaps hardware locations to virtual pointers for guests
OSs boot within guest partitions
SYSENTER instruction provides fast calls to ring 0
Hardware support for virtualization
VT-enabled processors alternate between operating modes
Will call VMM instead of guest OS
SYSEXIT instruction ends SYSENTER routine
Root mode grants full hardware control to VMM
Non-root mode presents virtual pointers to guest OS
Faults to ring 0 if executed from lower privilege
VMM must emulate response to SYSENTER/SYSEXIT
Interrupts
Interrupts can be masked by controlling interrupt flag (IF)
VMM must mask interrupts and handle interrupts by emulation
Some OSs toggle IF frequently requiring many VMM interventions
VT-enabled chipset
Grants control of I/O to root mode
Remaps I/O channels for non-root mode
VMX non‐root
User
Ring 3
User Privilege
OS
Ring 0 Virtual Full Privilege
Operating system
VMX root VMM
Sees virtual machine as real system
Operates in ring 0 for maximum privilege
Sends instructions to hardware pointers in usual way
Real Full Privilege
http://www.intel.com/technology/itj/2006/v10i3/index.htm
System Issues in Virtualization
CPU virtualization support
Handles operations initiated by CPU
Memory access by guest software
VMM assigns virtual address
space to guest OS
I/O access by guest software
VMM translates OS driver
output for host device
VT‐x for IA‐32 Processor Virtualization
Virtual machine extensions (VMX)
VMX root operation
PCI Host-to-Bus Bridge
(bus controller)
CPU
Graphics
ROM
VMX non-root operation
RAM
Operating mode designed for guest OS
Presents OS with virtual host configured by VMM
OS sees standard ring 0 access to virtual IA-32 resources
OS access to privileged state trapped by VMM
PCI (expansion) bus
I/O
Chipset virtualization support
Handles operations initiated
by I/O device
Interrupts and DMA accesses by I/O device
Intercepted by VMM and remapped
Operating mode designed for VMM
Grants highest privilege access to host CPU hardware state
I/O
I/O
I/O
ISA
Bridge
Mode transitions
VM entry
VMX root operation → VMX non-root operation
ISA/EISA bus
VM exit
VMX non-root operation → VMX root operation
disk
I/O
VM
Entry
Virtual Machine Control Structure
User
VMX root
VMM
Hardware
Host
OS
Referenced by physical address
No page table entry in any guest address space
Location determined by VMM software
VMCS structure
Not determined by architecture
Defined as set of VMCS access host instructions
VMM author chooses implementation
VM entry
Loads table pointers from VMCS
Pointer updates cause context shift to VM process
VMM can optionally inject virtual event (interrupt) to cause VM
response
VM exit
VM saves context to memory
All VMs exit to common entry point in VMM
VM exit records details of reason for exit in VMCS
VMM provides detailed response to VM exit
Saves processor state to VMCS host-state area
Loads processor state from VMCS guest-state area
VM exit
Saves processor state to VMCS guest-state area
Loads processor state from VMCS host-state area
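The state swap on VM entry and VM exit, as the slides describe it, can be sketched with a toy model. The `Vmcs` and `Cpu` classes and the dictionary of register values are assumptions made for the example. The real VMCS layout is not defined by the architecture and is reached only through dedicated instructions.

```python
from dataclasses import dataclass, field


@dataclass
class Vmcs:
    guest_state: dict = field(default_factory=dict)
    host_state: dict = field(default_factory=dict)
    exit_reason: str = ""


class Cpu:
    def __init__(self, state):
        self.state = dict(state)  # e.g. system table pointers
        self.mode = "root"        # the VMM runs in VMX root operation

    def vm_entry(self, vmcs):
        """Root to non-root: save host state, load guest state."""
        vmcs.host_state = dict(self.state)
        self.state = dict(vmcs.guest_state)
        self.mode = "non-root"

    def vm_exit(self, vmcs, reason):
        """Non-root to root: save guest state, record reason, load host state."""
        vmcs.guest_state = dict(self.state)
        vmcs.exit_reason = reason
        self.state = dict(vmcs.host_state)
        self.mode = "root"


cpu = Cpu({"gdtr": "vmm_gdt", "idtr": "vmm_idt"})
vmcs = Vmcs(guest_state={"gdtr": "guest_gdt", "idtr": "guest_idt"})

cpu.vm_entry(vmcs)
mode_in_guest = cpu.mode
cpu.state["gdtr"] = "guest_gdt_v2"  # guest changes its own virtual pointer
cpu.vm_exit(vmcs, reason="I/O access")
```

After the exit the VMM sees its own table pointers again, while the guest's modified value is preserved in the guest-state area for the next entry.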
VMCS host-state area
Segment register selectors for VMM operations
Privileged system table pointers (GDTR, IDTR, TR, page table root)
VMCS guest-state area
Segment register selectors for OS operations
Virtual system table pointers determined by VMM
VMM physical address space not mapped to guest OS virtual address
space
Interrupt flag (IF)
VMX non‐root
VMCS Details
Virtual-machine control structure (VMCS)
Used for mode transition management
VM entry
VM
Exit
VMCS Control Fields
VT‐x Solves Virtualization Problems
Settable options for interrupt virtualization
External‐interrupt exiting
VM exit on external interrupt
External interrupts not maskable by guest
Interrupt‐window exiting
VM exit if guest allows interrupts
Ring aliasing and compression
Guest software runs at intended privilege level
Address-Space Compression
Guest/VMM transitions can change virtual address space
Guest software has full use of its own address space
VMCS resides in physical address space
Guest/Host mask for control register virtualization
Status flags in control registers determine processor options
VMM masks selected flags to prevent write by guest
Guest write to masked flag causes VM exit
Guest reads flag value specified by VMM in VMCS
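The guest/host mask can be sketched as bit operations on a control register. The class below is a hypothetical model. The name `read_shadow` stands for the flag values that the VMM specifies in the VMCS, and the sketch assumes that a VM exit occurs only when the guest tries to change a masked flag away from the value it reads.

```python
class VmExit(Exception):
    """Raised where real hardware would transfer control to the VMM."""


class VirtualControlRegister:
    def __init__(self, value, host_mask, read_shadow):
        self.value = value              # real register contents
        self.host_mask = host_mask      # flags owned by the VMM
        self.read_shadow = read_shadow  # what the guest reads for owned flags

    def guest_read(self):
        unmasked = self.value & ~self.host_mask
        masked = self.read_shadow & self.host_mask
        return unmasked | masked

    def guest_write(self, new_value):
        # Changing any VMM-owned flag hands control to the VMM.
        if (new_value ^ self.guest_read()) & self.host_mask:
            raise VmExit("guest wrote a masked flag")
        # Unmasked flags are written directly; masked flags keep real values.
        self.value = (self.value & self.host_mask) | (new_value & ~self.host_mask)


cr = VirtualControlRegister(value=0b1001, host_mask=0b0001, read_shadow=0b0000)
seen_by_guest = cr.guest_read()   # bit 0 is hidden behind the shadow
cr.guest_write(0b1100)            # touches only unmasked flags
```

The guest never sees that bit 0 is really set, and any attempt to set it from the guest raises `VmExit`.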
VM exit bitmaps
VMM chooses subset of guest actions that cause VM exit
Exception bitmap — 32 exceptions that optionally cause VM exit
I/O bitmap — each 16-bit I/O port can be set to VM exit on guest access
Instruction bitmap — selects privileged instructions that cause VM exit
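The exit bitmaps can be modeled as plain bit arrays. This is a sketch with hypothetical class and function names. It assumes one bit per port across the 16-bit port space and one bit per exception vector in a 32-bit word.

```python
class IoBitmap:
    """One bit per I/O port; a set bit means VM exit on guest access."""

    PORT_COUNT = 1 << 16  # ports are addressed with 16 bits

    def __init__(self):
        self._bits = bytearray(self.PORT_COUNT // 8)

    def _check(self, port):
        if not 0 <= port < self.PORT_COUNT:
            raise ValueError("port out of range")

    def set_exit(self, port):
        self._check(port)
        self._bits[port >> 3] |= 1 << (port & 7)

    def causes_exit(self, port):
        self._check(port)
        return bool(self._bits[port >> 3] & (1 << (port & 7)))


def exception_causes_exit(exception_bitmap, vector):
    """exception_bitmap is a 32-bit integer, one bit per exception vector."""
    if not 0 <= vector < 32:
        raise ValueError("vector out of range")
    return bool((exception_bitmap >> vector) & 1)


io_bitmap = IoBitmap()
io_bitmap.set_exit(0x3F8)   # VMM wants to see accesses to this port only
exception_bitmap = 1 << 14  # VMM wants to see page faults (vector 14)
```

Everything not selected in a bitmap is handled by the guest without VMM intervention, which is how the VMM limits the number of VM exits.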
Does not use linear address space
Nonfaulting access to privileged state
VMCS controls interrupts
VMM allows guest OS access to privileged registers
Accesses cause transition to VMM
System calls
Guest OS runs at ring 0 as intended
Interrupts
VMM controls response to interrupts through VMCS
VT‐x Exception Handling
not set
in bitmap
Interrupt Virtualization
Set option external-interrupt exiting
OS handles
OS continues
VMM services
exception
Exception
set in
bitmap
VM entry
Interrupt
VM exit
to VMM
VMM prepares
system tables
event
injection
VM entry
VM exit
to VMM
Event injection replicates interrupt
VMM updates
system tables
event
injection
Possible updates — interrupt tables,
system registers, I/O
configuration, ...
Event injection replicates
exception
Possible updates — page tables,
system registers, I/O
configuration, ...
VT‐d for PCI Chipset Virtualization
VMM allocates resources to guest OSs
Virtual address space
CPU
Virtual I/O devices mapped to real I/O devices
OS accesses real I/O device through VMM mapping
DMA remapping
OS configures virtual I/O devices
DMA Protection Domains
Bridge
I/O
RAM
I/O
Protection domain
Subset of physical memory allocated for device-initiated DMA
Protection domains may be allocated to
VMM
Guest OS
Driver process running under guest OS
I/O
I/O device
May be assigned to a protection domain
Can only perform DMA to assigned protection domain
DMA address translation
I/O device DMA request to bridge contains memory address
VT-d treats request address as DMA virtual address (DVA)
Enables device-initiated DMA operations to guest address space
Real I/O device must write to guest OS through emulation mapping
Interrupt remapping
Real I/O devices may interrupt CPU
Interrupt intended for one guest OS
Real I/O device must deliver interrupt
to guest OS through emulation
mapping
Guest Physical Address (GPA) of guest OS
General software-generated virtual I/O address
DVA translated to Host Physical Address (HPA)
http://www.intel.com/technology/itj/2006/v10i3/index.htm
Mapping I/O Devices to Protection Domains
Device
Address Space Overview
PCI device requester ID
Identifies DMA device and request
PCI Bus
VMM
Function
Page
Tables
Assigned by PCI configuration software during device discovery
Root-Entry Table
Index — 8-bit bus number from requester ID
Entry — Pointer to context-entry table
Guest
Virtual
Memory
Page
Structures
HPA
Host
Physical
Memory
GVA
HPA
GPA
DVA
Context
Entry
Table
DMA
Virtual
Memory
Root
Entry
Table
Guest
Physical
Memory
Context-Entry Table
Index — 8-bit device/function number from requester ID
Entry — pointer to page structure used to translate DVA
PCI Bus
Device Function
I/O device DMA Request ID
Page structure
Multilevel table structure
Similar to IA-32 page tables
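The lookup from requester ID to host physical address can be sketched with nested dictionaries. This is a simplification under stated assumptions: the function names are hypothetical, the page structure is flattened to a single level, and a 4 KB page size is assumed.

```python
def split_requester_id(requester_id):
    """Split a 16-bit requester ID into (bus, device/function) bytes."""
    if not 0 <= requester_id <= 0xFFFF:
        raise ValueError("requester ID is 16 bits")
    return requester_id >> 8, requester_id & 0xFF


def translate_dma(root_table, requester_id, dva, page_size=4096):
    """Translate a DMA virtual address (DVA) to a host physical address (HPA).

    root_table[bus] -> context table
    context_table[device/function] -> page structure (here: page -> frame)
    A missing entry raises KeyError, standing in for a blocked DMA request.
    """
    bus, devfn = split_requester_id(requester_id)
    context_table = root_table[bus]
    page_structure = context_table[devfn]
    page, offset = divmod(dva, page_size)
    return page_structure[page] * page_size + offset


# Device on bus 0x02, device/function 0x18, allowed one page of DMA.
root_table = {0x02: {0x18: {0x10: 0x80}}}
hpa = translate_dma(root_table, requester_id=0x0218, dva=0x10123)
```

Because each device has its own page structure, a device can reach only the protection domain assigned to it.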
Virtual
Memory
Emulated
Physical
Memory
Physical
Memory
Virtual
Memory
IA‐32 Interrupt Handling
Message Interrupt Handling
Legacy interrupts
Interrupt controller in chipset handles device interrupts
Local APIC
CPU interrupt controller
Receives/decodes local interrupt signals
Receives interrupt messages from
I/O APIC
Programmable Interrupt Controller (PIC) integrated into ISA chipset
APIC (Advanced PIC) integrated into PCI chipset
I/O device assigned interrupt request (IRQ) connection to APIC
APIC
Translates device IRQ to 8-bit CPU interrupt number n
Sends hardware interrupt signal (INTR) to processor
I/O APIC
PCI chipset interrupt controller
Receives/decodes device IRQ signals
Sends/receives interrupt messages
CPU
Loads 64-bit entry n from Interrupt Descriptor Table (IDT)
Entry points to Interrupt Service Routine (ISR)
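The step from interrupt number n to the service routine can be sketched as address arithmetic plus decoding of one 64-bit entry. The function names are hypothetical. The field positions follow the IA-32 protected-mode interrupt gate layout, with the entry read as a little-endian 64-bit integer.

```python
IDT_ENTRY_BYTES = 8  # each IDT entry is 64 bits


def idt_entry_address(idt_base, n):
    """Address of entry n, given the IDT base address held in IDTR."""
    if not 0 <= n <= 255:
        raise ValueError("interrupt number is 8 bits")
    return idt_base + n * IDT_ENTRY_BYTES


def decode_gate(descriptor):
    """Return (segment selector, 32-bit ISR offset) from a 64-bit gate."""
    offset_low = descriptor & 0xFFFF
    selector = (descriptor >> 16) & 0xFFFF
    offset_high = (descriptor >> 48) & 0xFFFF
    return selector, (offset_high << 16) | offset_low


# Example gate: ISR at 0x12345678 in the code segment with selector 0x08.
gate = (0x1234 << 48) | (0x8E00 << 32) | (0x0008 << 16) | 0x5678
address = idt_entry_address(0x1000, 0x21)
selector, isr_offset = decode_gate(gate)
```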
Message signaled interrupts (MSI)
I/O APIC in PCI chipset formats IRQ signal into structured message
Message transferred on PCI bus as device-initiated DMA operation
Local APIC in CPU receives and decodes message
IA-32 Intel Architecture Software Developer’s Manual
Interprocessor Interrupts
Interrupt Remapping
Interprocessor Interrupt (IPI)
Subset of APIC interrupt message table
CPU writes to interrupt command register (ICR) in local APIC
Local APIC issues IPI message on system bus
Used to boot and spawn threads in multiprocessor system
Message signaled interrupt (MSI)
Encodes interrupt vector and destination processor
Real I/O device not aware of guest OS view of emulated I/O device
VMM must intercept MSI
VMM redefines interrupt message format
Provides substitute MSI
DMA write request contains
Message identifier
No interrupt attributes (vector and destination processor)
Requester ID of real I/O device generating interrupt
Requester ID mapped through table structure (root/context tables)
Points to interrupt remapping table (IRT)
Entry provides vector and destination processor
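The remapped interrupt path can be sketched as two table lookups. The names are hypothetical, and the root/context table walk is collapsed into one dictionary keyed by requester ID. The device supplies only an identifier, and the VMM-owned table supplies the vector and destination.

```python
from typing import NamedTuple


class IrtEntry(NamedTuple):
    vector: int
    destination_cpu: int


def remap_interrupt(irt_by_requester, requester_id, message_id):
    """Resolve a device interrupt message to (vector, destination CPU).

    irt_by_requester maps a requester ID to that device's interrupt
    remapping table; message_id selects the entry within the table.
    """
    irt = irt_by_requester[requester_id]
    return irt[message_id]


# VMM setup: device 0x0218 belongs to a guest running on CPU 2.
tables = {0x0218: {0: IrtEntry(vector=0x41, destination_cpu=2)}}
entry = remap_interrupt(tables, requester_id=0x0218, message_id=0)
```

Since the message carries no interrupt attributes, the device cannot target a vector or processor that the VMM did not assign to it.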
Caching of Remapping Structures
VirtualBox
VT-d supports hardware caching of remapping tables
Root/Context tables
Paging structures
IOTLB
Interrupt remapping table entries
Open source hosted VMM by Oracle (Sun Microsystems)
Runs on Intel and AMD x86 hardware
Runs above Windows, Linux, Mac OS X (Intel), Solaris
Provides VM with guest OS
Standard DOS, Windows, Linux, OS/2, FreeBSD, Solaris
Uses hardware virtualization support if available (not required)
VMM responsible for maintaining remapping cache
Must invalidate stale cache entries
Scheduling
Host OS grants timeslice to VM
VM sub-processes scheduled by guest OS
Remapping errors
DMA access request returns error message
Application
Device response to error implementation dependent
Application
Errors logged to VMM
Application
Guest OS
VirtualBox Hypervisor
VMM may reset cache or I/O device configuration tables
Host OS
x86 Hardware
VirtualBox Architecture
CPU Operating Modes
Front-end (client)
VirtualBox hypervisor
Runs above host OS
Without Intel VT performs workaround virtualization
Runs native (not emulated) on CPU
Host applications at ring 3
Host OS code at ring 0
Guest "safe" application code at ring 3
Hypervisor runs in ring 0 of guest context
Guest OS runs as user program in ring 1 of guest context
Limited use of Intel VT if available
Non-system activities
Makes system calls to guest OS
Runs emulated on CPU at ring 3
Guest application code that causes guest OS interventions
Application
Guest OS
Disable interrupts
Trap of prohibited accesses
Executes real mode code
Hypervisor
Back-end (server)
Host OS
Ring 0 driver in host OS
VirtualBox Driver
Copes with "gory details of x86 architecture"
Allocates physical memory for VM (guest OS)
Saves/restores guest CPU context during host interrupt
Each instruction interpreted by VirtualBox driver
Interpreted code run in CPU instead of native code
Runs native on CPU at ring 1
Guest OS ring 0 code
VirtualBox driver handles "gory details" of workaround
Registers and descriptor tables
No intervention in guest OS process management
Xen
Xen Architecture
Open source system VMM
Runs on Intel and AMD x86 hardware
Runs directly above hardware
Xen hypervisor
Directly above hardware
Boots system on start-up
Domain 0
Initialized by hypervisor on boot
Runs XenLinux — modified Linux kernel
Provides Domain Management and Control (DMC)
Domain U
VM running guest OS
Linux required to build and install Xen
Provides VMs with guest OSs
Linux, Solaris, Windows XP, 2003 Server
Hardware virtualization support required for Windows guest OS
Para-virtualization for Linux/Unix guest OS
OS kernel modified to support Xen explicitly
Operating systems ported to run on Xen
Similar effort to porting OS to new hardware platform
Para-virtual machine architecture very similar to native hardware
User space applications and libraries not modified
DMC
Application
Application
Application
XenLinux OS
Domain 0
Guest OS
Domain U
Guest OS
Domain U
Guest OS
Domain U
Xen Hypervisor
x86 Hardware
Xen Architecture Overview, http://wiki.xensource.com/xenwiki
Hypervisor
Domain 0
Full privilege
Operates directly on hardware in ring 0
Functions
CPU scheduling for virtual machines
Memory partitioning
Provides hardware abstraction to virtual machines
No awareness of
Networking
External storage devices
Video
Domain U
Domain U
Common I/O
Xen Hypervisor
x86 Hardware
Scheduler CPU
XenLinux
Modified Linux kernel running in unique VM over hypervisor
Direct privileged access rights to physical I/O resources
Provides I/O virtualization to Domain U guest VMs
Generic I/O drivers
Network Backend Driver
Manages local networking hardware
Processes all VM networking requests
from Domain U guests
Block Backend Driver
Domain U
Domain 0
Partitioner
Process
List
Page
Tables
Manages local storage disk
Processes all read/write data requests
from Domain U guests
Domain U
Domain U
Scheduler I/O
CPU
Memory
Domain 0
I/O Drivers
Partitioner
Process
List
Page
Tables
I/O
Memory
Domain U PV
Domain U HVM
Domain U PV guests
Paravirtualized VM running modified Linux/UNIX kernels
OS expectations
No direct access to host hardware
Shares host hardware with other VMs
Guest drivers provide I/O access
Access backend drivers in Domain 0
PV Network Driver
PV Block Driver
Domain U HVM Guests
Fully virtualized machines
Run standard Windows or other unmodified OS
OS runs as VMX non-root operation with VT-x
OS expectations
No hardware virtualization
Not sharing with other VMs
Normal hardware access for boot
Domain 0
Domain 0
Domain U
OS Driver
OS Driver
daemon
Xen virtual firmware runs as VMX root operation with VT-x
Simulates BIOS expected by OS on initial startup
Domain 0
Domain 0
Domain U
PV Driver
PV Driver
Backend Driver
I/O support
No special drivers
Domain 0 runs Qemu-dm daemon for each HVM Guest
Supports Domain U HVM Guest for networking and disk access requests
Domain Management
Domain U PV to Domain 0 Communication
Xend daemon
Python application running in Domain 0
System manager for Xen environment
Processes requests as XML remote procedure call (RPC)
Domain U PV Guest requests I/O from Domain 0 via hypervisor
No direct support in hypervisor for I/O
Inter-Domain event channel
Domain 0 and each Domain U have shared memory area
Asynchronous inter-domain interrupts implemented in hypervisor
Qemu-dm
Daemon handles networking and disk requests from Domain U HVM
Provides full emulation of hardware for standard OS I/O drivers
Example — Domain U PV Guest data write to hard disk
Guest OS sends write request to PV block driver
Guest PV block driver
Virtual firmware
Writes data to Domain 0 shared memory through hypervisor
Sends inter-domain interrupt to Domain-0 through hypervisor
Provides full emulation of BIOS for Domain U HVM Guest OS
Xend
Qemu
Unix Application
Linux Application
Windows Application
XenLinux OS
Domain 0
XenUnix
Domain U PV
XenLinux
Domain U PV
Standard Windows Domain U HVM
Triggers PV Block Backend Driver access to shared memory
Reads blocks from Domain U PV Guest shared memory
Writes data to hard disk
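The write path from a Domain U PV guest to the disk can be sketched with a shared list and a callback. All class names are hypothetical, and the shared memory area and event channel are reduced to a Python list and a function call. The hypervisor only delivers the notification; it never touches the data.

```python
class Hypervisor:
    """Delivers inter-domain interrupts; has no I/O support of its own."""

    def __init__(self):
        self._handlers = {}

    def bind(self, domain, handler):
        self._handlers[domain] = handler

    def send_event(self, domain):
        self._handlers[domain]()


class BlockBackendDriver:
    """Runs in Domain 0 with direct access to the physical disk."""

    def __init__(self, shared_memory, disk):
        self.shared_memory = shared_memory
        self.disk = disk

    def on_event(self):
        # Drain every pending request from the shared area to the disk.
        while self.shared_memory:
            block_number, data = self.shared_memory.pop(0)
            self.disk[block_number] = data


class PvBlockDriver:
    """Runs in the Domain U PV guest; has no direct hardware access."""

    def __init__(self, shared_memory, hypervisor):
        self.shared_memory = shared_memory
        self.hypervisor = hypervisor

    def write(self, block_number, data):
        self.shared_memory.append((block_number, data))
        self.hypervisor.send_event("Domain 0")


shared_memory, disk = [], {}
hypervisor = Hypervisor()
backend = BlockBackendDriver(shared_memory, disk)
hypervisor.bind("Domain 0", backend.on_event)

guest_driver = PvBlockDriver(shared_memory, hypervisor)
guest_driver.write(7, b"hello")
```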
x86 Hardware
Domain 0 receives interrupt from hypervisor
Backend Driver
Xen Hypervisor
I/O Driver Communication
Xen PV and HVM Performance
Test Configuration
Intel Xeon @ 2.3 GHz
4 GB DDR2 533 MHz memory
160 GB Seagate SATA disk
Intel E100 Ethernet controller
Unix Application
DMC
Write request
Write disk
Interrupt
Backend block Driver
XenLinux
Domain 0
Read shared memory
PV block driver
XenLinux
Domain U PV
Interrupt
Write shared memory
Xen Hypervisor
Interrupt
Interrupt
x86 Hardware
Dong et al., "Extending Xen with Intel Virtualization Technology", Intel Technical Journal
I/O Bottleneck
Bottleneck — Single Ethernet controller
Guest OS tasks wait for I/O access, which hides performance degradation caused by virtualization
Web server running over
native Linux without Xen
Threads compete above
2.5 Gbps
Web server running over
XenLinux in Domain 0
Threads compete above
1.9 Gbps
Web server running over
XenLinux in Domain U PV
Threads compete above
0.9 Gbps
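The three figures can be compared as ratios to the native case. This assumes that each value is the throughput level at which that configuration saturates, which is how the slide labels appear to read. The dictionary keys are labels chosen for the example.

```python
# Figures as given on the slide, in Gbps.
saturation_gbps = {
    "native Linux": 2.5,
    "Xen Domain 0": 1.9,
    "Xen Domain U PV": 0.9,
}


def relative_to_native(figures, baseline="native Linux"):
    """Express each figure as a fraction of the baseline, to two decimals."""
    base = figures[baseline]
    return {name: round(value / base, 2) for name, value in figures.items()}


ratios = relative_to_native(saturation_gbps)
```

Under that reading, Domain 0 reaches about three quarters of the native level and a Domain U PV guest a little over one third.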