SYS-T312: Intel's Vision for Virtualization and Benchmarking

Transcript
Fernando Martins
Director
Virtualization Strategy and Planning
Tom Adelmeyer
Principal Engineer
Virtualization Performance
and Benchmarking
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS.
NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL
PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S
TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY
WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO
SALE AND/OR USE OF INTEL® PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING
TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY
PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE
NOT INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS.
Intel may make changes to specifications and product descriptions at any time, without notice.
All products, dates, and figures specified are preliminary based on current expectations, and are subject
to change without notice.
Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata,
which may cause the product to deviate from published specifications. Current characterized errata are
available on request.
Intel, the Intel logo, Intel Leap ahead, Intel Leap ahead logo, Intel vPro, Intel vPro logo, Intel VIIV, Intel
VIIV logo, Intel Centrino Duo, Intel Centrino Duo logo, Intel Xeon, Intel Xeon Inside logo, Intel Itanium 2
and Intel Itanium 2 Inside logo are trademarks or registered trademarks of Intel Corporation or its
subsidiaries in the United States and other countries.
*Other names and brands may be claimed as the property of others.
Copyright © 2006 Intel Corporation.
Throughout this presentation:
VT-x refers to Intel® VT for IA-32 and Intel® 64
VT-i refers to Intel® VT for IA-64, and
VT-d refers to Intel® VT for Directed I/O and its extensions
The confluence of compelling usage models and robust
solutions is driving virtualization to mainstream adoption
New usage models require radically new approaches to
performance measurement and capacity planning
This session will describe Intel’s portfolio of virtualization
technologies, and through practical examples provide a
deep technical dive into the challenging problem of
meaningful benchmarking in a virtualized environment
We will discuss Intel’s research in the space and share
our latest results, including vConsolidate - Intel’s seed
contribution to a vendor-agnostic standard virtualization
benchmark currently being developed by SPEC
Intel’s Strategy for Virtualization
Intel® Virtualization Technology Evolution
Current and Emerging Usage models
Usage Model based Benchmarking
[Chart: IDC worldwide forecasts (Sep-06 WW Forecast, Feb-07 and Apr-07 updates) of the share of virtualized servers, 2005-2010, rising from near 0% toward the 15-20% range]
Server virtualization is now considered a mainstream technology among IT buyers.
IT professionals are bullish on future use: driving 45% server use in 12 months
- IDC End User Study, Jun-06
41% of new x86 servers purchased in 2007 will be virtualized
- IDC Directions 2007, Feb-07
>81% of businesses are using virtualization in production environments
- 451 Group Special Report, Dec-06
Platform of Choice for Virtualization
Broad Ecosystem Support
Remove Adoption Barriers
Leadership in HW assists for Virtualization
CPU virtualization (VT-x and VT-i)
IO virtualization (VT-d)
Networking virtualization (IOAT and VMDq)
Better Platform Reliability Features
Leader in Reliability features
Proven Platform Architecture: 40X more
Intel servers
More Power/Performance Headroom
Quad-Core 4-way NICs
Q4’05 IDC server Tracker, 1996-2005 total system shipped
IA-based System Virtualization Today
Requires Frequent VMM Software Intervention
Standards for IO-device sharing:
Multi-Context I/O Devices
Endpoint Address Translation Caching
Under definition in the PCI-SIG* IOV WG
Hardware support for I/O virtualization
Device DMA remapping
Direct assignment of I/O devices to VMs
Interrupt Routing and Remapping
Establish foundation
for virtualization in the
IA-32 and
Itanium architectures…
Software-only VMMs
Binary translation
Paravirtualization
Simpler
and more Secure
VMM through
foundation
of virtualizable ISAs
*Other names and brands may be claimed as the property of others
… followed by on-going evolution of support:
Micro-architectural (e.g., lower VM switch times)
Architectural (e.g., Extended Page Tables)
Increasingly better CPU and I/O virtualization
performance and functionality as I/O devices
and VMMs exploit infrastructure provided
by VT-x, VT-i, VT-d
New CPU operating mode
VMX root operation (for VMM)
Non-root operation (for guest)
Eliminates ring deprivileging
New transitions
VM entry to guest OS
VM exit to VMM
VM Control Structure (VMCS)
Configured by VMM software
Specifies guest Operating System (OS) state
Controls when VM exits occur (eliminates over- and under-exiting)
Supported by on-die CPU hardware
[Diagram: guest OSes (e.g., WinXP and Linux) and their apps run in non-root operation; VM entry and VM exit transitions between guest and VMM are configured through the hardware VM Control Structure (VMCS)]
Extended Page Table (EPT)
A new page-table structure under the control of the VMM
Maps guest-physical to host-physical addresses (for memory accesses)
Performance benefit
Guest OS is able to freely modify its own page tables
Eliminates VM exits due to page faults, INVLPG, or CR3 accesses
Memory savings
Without EPT, shadow page tables are required for each guest user process
A single EPT supports an entire VM
[Diagram: with VT-x and EPT, guest page-table operations proceed with no VM exits]
Platform implementation for I/O
virtualization
Defines an architecture for
DMA remapping
Implemented as part of core
logic chipset
Will be supported broadly in Intel
server and client chipsets
Improves system reliability
Contains and reports errant
DMA to software
Basic infrastructure for I/O
virtualization
Enable direct assignment of
I/O devices to unmodified or
paravirtualized VMs
DMA-remapping
Improves reliability and security through device
isolation
Improves I/O performance through direct assignment
of devices
Improves I/O performance for 32-bit devices that would otherwise require bounce buffers
Interrupt-remapping
Interrupt isolation: isolate interrupts across VMs
Interrupt migration: efficiently migrate interrupts
across CPUs
Address Translation Services (ATS)
Support for ATS capable endpoint devices
DMA remapping performance improvements
Processor
Chipset
Network
Intel’s holistic design approach delivers
Platforms built to excel in Virtualization
Hardware Virtualization
Mechanisms under VMM Control
[Diagram: virtualization configurations — one or more VMs (VM1…VMn), each running applications on its own guest OS, hosted by a VMM on the underlying hardware; shown across single-platform and multi-platform (HW0…HWn) arrangements]
Traditional benchmarking covers Performance,
Power, Scalability
Metrics: throughput (MB/s), response time, number of users, etc.
Micro-architecture focus: cache sizing, frequency,
bandwidth, etc.
New technology requires new areas of analysis
and metrics
Areas of focus driven by use models.
E.g., VM migration time, VM utilization
Need to measure how Intel® Virtualization technology
benefits end-users and ISVs
Virtualization presents unique challenges
Which configurations to focus on:
Homogeneous or heterogeneous OS
Number of virtual machines
Configuration of individual VMs (CPU, memory, NIC, HBA, HDD)
Measuring performance:
Virtual clock accuracy induces platform-dependent error
Availability of performance monitoring capabilities
The consolidation use case adds further testing challenges:
Synchronicity: use automation scripts
Utilization: avoid harmonic bottlenecks
Steady state: easy, repeatable measurements
The only way to overcome these challenges is to develop the benchmarks
Tier consolidation using SAP SD
vConsolidate: a server application consolidation benchmark
SAP SD (Sales and
Distribution)
OLTP-style benchmark that
measures performance of a
server running the Enterprise
Resource Planning (ERP)
solution from SAP AG
Tier Consolidation
Database and app server
run in VMs
Benefits of 3-Tier (isolation,
maintainability), cost of 2-Tier
Benchmark value
Reuse existing Metrics
New focus area
Inter-VM communication
[Diagram: application-server VM and database VM (VM1…VMn) run side by side on a shared VMM and hardware]
vConsolidate
Description
A benchmark representing the predominant use case: server application consolidation
Application types selected for consolidation were guided by market data
vConsolidate provides
A methodology for measuring
performance in a consolidated
environment
A means for fellow travelers to
publish virtualization performance
proof points
The ability to analyze performance
across VMMs and hardware
platforms
Knowledge obtained feeds into the SPEC virtualization workload
[Chart: server application mix — Business Processing 26.2%, Database 28.5%, Decision Support 9.2%, Collaborative 8.4%, Application Development 12.0%, Web Infrastructure 6.8%, IT Infrastructure 4.8%, Technical 3.5%, Other 0.6%]
5 Virtual Machines
3 Clients: Controller, Mail, and Web
*Other names and brands may be claimed as the property of others
Consolidation Stack Unit (CSU)
Smallest granule in vConsolidate
Consists of 5 virtual machines:
Database
Commercial mail
Web server
Java application server
Idle
Each CSU produces a single score
The final score is an aggregate of the individual CSU scores
Per-VM configurations for the four CSU profiles:

Profile #1
Workload   Benchmark   vCPUs   vMemory   OS               App
Web        WebBench    1       1.0 GB    Windows 32-bit   IIS
Mail       LoadSim     1       1.0 GB    Windows 32-bit   Exchange
Database   Sysbench    1       1.0 GB    Windows 32-bit   MS SQL
Java       SPECjbb     1       1.7 GB    Windows 32-bit   BEA JVM
Idle       -           1       0.4 GB    Windows 32-bit   -

Profile #2
Workload   Benchmark   vCPUs   vMemory   OS               App
Web        WebBench    2       1.5 GB    Windows 32-bit   IIS
Mail       LoadSim     1       1.5 GB    Windows 32-bit   Exchange
Database   Sysbench    2       1.5 GB    Windows 64-bit   MS SQL
Java       SPECjbb     2       2.0 GB    Windows 64-bit   BEA JVM
Idle       -           1       0.4 GB    Windows 32-bit   -

Profile #3
Workload   Benchmark   vCPUs   vMemory   OS               App
Web        WebBench    2       1.5 GB    Linux 32-bit     Apache
Mail       LoadSim     1       1.5 GB    Windows 32-bit   Exchange
Database   Sysbench    2       1.5 GB    Linux 64-bit     MySQL
Java       SPECjbb     2       2.0 GB    Linux 64-bit     BEA JVM
Idle       -           1       0.4 GB    Windows 32-bit   -

Profile #4
Workload   Benchmark   vCPUs   vMemory   OS               App
Web        WebBench    2       2.0 GB    Windows 32-bit   IIS
Mail       LoadSim     2       2.0 GB    Windows 32-bit   Exchange
Database   Sysbench    4       2.0 GB    Windows 64-bit   MS SQL
Java       SPECjbb     4       2.0 GB    Windows 64-bit   BEA JVM
Idle       -           1       0.4 GB    Windows 32-bit   -
Running vConsolidate
Controller application
Starts the tests via helper scripts; Runs for 30 minutes
Stops the test and reports score
Time measured in the "Controller Client" → external timer
Scoring
The “Controller” application
calculates final score
SpecJBB, Sysbench and
Loadsim - transactions/
second
WebBench – throughput
CSU Final Score = GEOMEAN
(VM Relative Perf[i])
Example Scoring (1 CSU, 65% CPU utilization; higher relative scores are better)

Workload   Reference   Raw measured   Relative
Web        319         1124           3.52
Java       14236       14842          1.04
Database   201         229            1.14
Mail       13.5        15.6           1.16

VM relative scores = Measured/Reference
(E.g., WebBench = 1124 / 319 = 3.52)
1 CSU score: GEOMEAN (3.52, 1.04, 1.14, 1.16) = 1.48
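The scoring arithmetic above can be sketched in Python. This is illustrative, not the vConsolidate controller's code: the reference values come from the worked example, and the aggregation across CSUs is assumed here to be a simple sum.

```python
from math import prod

# Reference (baseline) results per workload VM, from the worked example.
REFERENCE = {"web": 319.0, "java": 14236.0, "database": 201.0, "mail": 15.6 / 1.16}

def csu_score(measured):
    """One CSU's score: geometric mean of the per-VM relative scores
    (measured / reference). The idle VM contributes no score."""
    relative = [measured[w] / REFERENCE[w] for w in REFERENCE]
    return prod(relative) ** (1.0 / len(relative))

def final_score(all_csus):
    """Aggregate of the individual CSU scores (assumed: simple sum)."""
    return sum(csu_score(m) for m in all_csus)

measured = {"web": 1124.0, "java": 14842.0, "database": 229.0, "mail": 15.6}
print(round(csu_score(measured), 2))  # → 1.48
```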
Seeding Industry with Benchmark Workloads
vConsolidate – a consolidated stack of business workloads consisting of
server-side Java, a commercial database, commercial mail, and a commercial
web server on 4 VMs
Collaborating with Virtualization leaders
Microsoft and OEMs - consolidation workloads, methodology & metrics
VMware – VMmark* consolidation stack
Establishing benchmarks with ISV/OSVs
Contributing to standard benchmarks through SPEC (long term)
*Other names and brands may be claimed as the property of others.
Platform of Choice for Virtualization
Dedicated HW support
Reliability Leadership
High Performance / Energy Efficient
Broader Ecosystem Support
VMM vendors, ISVs, OEMs, SIGs,
Standards
Removing Adoption Barriers
Education Programs / Best Practices
New Benchmarks
Dual Port 10/100/1000 x4 PCI Express*
Gigabit Ethernet Controller
PCIe
x4, x2, x1
SMBus
RMII
External Interfaces
Dual 1000BASE-T, SerDes, and SGMII
interfaces
PCIe ver 1.1 x4
Intel® I/O Acceleration Technology
(IOAT2)
MSI-X
Low Latency Interrupt
Direct Cache Access
Header-splitting and replication
Virtualization support (VMDq): 4 TX/RX
Queues (per port)
I/O Enhancements
Offloads compatible with IPv4, IPv6 &
multiple VLAN tags
Receive Side Scaling
Manageability
[Block diagram: per port, a PCI Express DMA/host interface feeds TX and RX FIFOs plus a management FIFO/RAM, then a GbE MAC with SerDes and PHY providing the 1000BASE-T/SerDes/SGMII interfaces; both ports share the PCIe and management connections]
PXE, iSCSI Boot
RMII, SMBus Interfaces
ECC on all memory
25mm x 25mm FCBGA
Schedule
Sampling now
Production: Q2'07
Unique Intel x86 Reliability Features

Feature                        Description                                                     Benefit
Memory ECC                     Detects & corrects single-bit errors                            Data Integrity & Availability
Enhanced Memory ECC            Retries double-bit errors vs. standard memory ECC that          Data Integrity & Availability
                               handles single-bit errors only
Memory CRC (FBD)               Address & command transmissions are automatically retried       Continued Operation & Availability
                               if a transient error occurs vs. the potential of silent
                               data corruption
Memory Sparing                 Predicts a "failing" DIMM & copies the data to a spare          Data Availability
                               memory DIMM, maintaining server availability & uptime
Memory Mirroring               Data is written to 2 locations in system memory so that if      Data Protection
                               a DRAM device fails, mirrored memory enables continued
                               operation and data availability
Symmetric Access to all CPUs   Enables a system to restart and operate if the primary          Server Continuity
                               processor fails

[The original table also indicated, per feature, availability on Intel Xeon-based servers vs. other x86 processor-based servers]
A Better Business Foundation
Less Downtime, Higher Service Availability and Improved Confidence
Enabled by a combination of processor, chipset and platform memory technologies. Data as of March 6, 2006
Intel Virtualization Technology
For Directed I/O
Monolithic Model
Device drivers run in the hypervisor; devices are shared across guest VMs (VM0…VMn)
Pro: Higher performance
Pro: I/O device sharing
Pro: VM migration
Con: Larger hypervisor
Service VM Model
I/O services and device drivers run in dedicated service VMs; devices are shared
Pro: High security
Pro: I/O device sharing
Pro: VM migration
Con: Lower performance
Pass-through Model
Device drivers run in the guest VMs, with devices directly assigned to them
Pro: Highest performance
Pro: Smaller hypervisor
Pro: Device-assisted sharing
Con: Migration challenges
VT-d Goal: Support all Models
VT-d is platform infrastructure for I/O virtualization
Defines architecture for DMA remapping
Implemented as part of platform core logic
Will be supported broadly in Intel server and client chipsets
[Platform diagram: CPUs on the system bus connect to a north bridge containing the VT-d logic, the DRAM interface, integrated devices, and PCIe root ports; a south bridge attaches PCI, LPC, and legacy devices]
Basic infrastructure for I/O virtualization
Enable direct assignment of I/O devices to unmodified
or paravirtualized VMs
Improves system reliability
Contain and report errant DMA to software
Enhances security
Support multiple protection domains under SW control
Provide foundation for building trusted I/O capabilities
Other usages
Generic facility for DMA scatter/gather
Overcome addressability limitations on legacy devices
[Diagram: DMA requests carry a device ID (bus/device/function), virtual address, and length; the DMA remapping engine, using its context cache and translation cache, consults memory-resident device-assignment structures and per-device 4 KB page tables to perform the memory access with a system physical address, or generates a fault for errant requests]
VT-d hardware selects the page table based on the source of the DMA request
The requestor ID (bus / device / function) in the request identifies the DMA source
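As a sketch, the 16-bit requestor ID decomposes into bus, device, and function fields (8/5/3 bits, per the bit layout shown on the requestor-ID slide). Illustrative Python, not Intel code:

```python
def parse_requestor_id(rid):
    """Split a 16-bit PCI requestor ID into (bus, device, function).

    Bit layout per the slide: bits 15:8 = bus, 7:3 = device, 2:0 = function.
    VT-d uses these fields to index the device-assignment tables and pick
    the page tables for the requesting device.
    """
    bus = (rid >> 8) & 0xFF
    device = (rid >> 3) & 0x1F
    function = rid & 0x7
    return bus, device, function

# Example: bus 0, device 31, function 7 (as in the diagram)
print(parse_requestor_id((0 << 8) | (31 << 3) | 7))  # → (0, 31, 7)
```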
VT-d Device Assignment Entry (128 bits)
Bits 127:64 — Rsvd | Domain ID | Rsvd | Address Width
Bits 63:0 — Page-Table Root Pointer | Rsvd | Ext. Controls | Controls | P (present)
VT-d supports hierarchical page tables for address translation
Page directories and page tables are 4 KB in size
4KB base page size with support for larger page sizes
Support for DMA snoop control through page table entries
VT-d Page Table Entry (64 bits)
Bits 63:0 — Rsvd | Page-Frame / Page-Table Address | Available | SP (super page) | Rsvd | Ext. Controls | W (write) | R (read)
Requestor ID (16 bits): bits 15:8 = Bus, 7:3 = Device, 2:0 = Function
DMA Virtual Address (64 bits, 48-bit address width): bits 63:48 = 0, 47:39 = Level-4 table offset, 38:30 = Level-3 table offset, 29:21 = Level-2 table offset, 20:12 = Level-1 table offset, 11:0 = page offset
[Diagram: the requestor ID indexes the base device-assignment tables to find an entry specifying a 4-level page table; the four 9-bit offsets select entries in the Level-4 through Level-1 page tables, yielding the page addressed by the page offset]
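The 4-level address split above can be sketched in Python (illustrative, assuming the 48-bit layout shown on the slide; not Intel code):

```python
def split_dma_address(addr):
    """Decompose a 48-bit DMA virtual address into its four 9-bit
    page-table offsets (levels 4 down to 1) and the 12-bit page offset,
    per the slide layout (bits 47:39, 38:30, 29:21, 20:12, 11:0)."""
    assert addr >> 48 == 0, "bits 63:48 must be zero"
    offsets = [(addr >> shift) & 0x1FF for shift in (39, 30, 21, 12)]
    return offsets, addr & 0xFFF

# Example: encode offsets 1..4 and page offset 5, then split them back out
addr = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0x5
print(split_dma_address(addr))  # → ([1, 2, 3, 4], 5)
```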
Architecture supports caching of remapping structures
Context Cache: Caches frequently used device-assignment
entries
IOTLB: Caches frequently used translations (results of page
walk)
Non-leaf Cache: Caches frequently used page-directory entries
When updating VT-d translation structures, software
enforces consistency of these caches
Architecture supports global, domain-selective, and page-range
invalidations of these caches
Primary invalidation interface through MMIO registers for
synchronous invalidations
Extended invalidation interface for queued invalidations
PCI Express protocol extensions being defined by
PCISIG for Address Translation Services (ATS)
Enables scaling of translation caches to devices
Devices may request translations from root complex and cache
Protocol extensions to invalidate translation caches on devices
VT-d extended capabilities
Support for ATS
Enables VMM software to control device participation in ATS
Returns translations for valid ATS translation requests
Supports ATS invalidations
Provides capability to isolate, remap and route interrupts to VMs
Support device-specific demand paging by ATS capable devices
VT-d Extended features utilize PCI Express enhancements
being pursued within the PCI-SIG
A VMM must protect host physical memory
Multiple guest operating systems share the
same host physical memory
VMM typically implements protections through
“page-table shadowing” in software
Page-table shadowing accounts for a large
portion of virtualization overheads
VM exits due to: #PF, INVLPG, MOV CR3
Goal of EPT is to reduce these overheads
[Diagram: a guest linear address is translated by the guest's IA-32 page tables (rooted at the guest CR3) into a guest-physical address, which the extended page tables (rooted at the EPT base pointer, EPTP) translate into a host-physical address]
Extended Page Table
A new page-table structure, under the control of the VMM
Defines mapping between guest- and host-physical addresses
EPT base pointer (new VMCS field) points to the EPT page tables
EPT (optionally) activated on VM entry, deactivated on VM exit
Guest has full control over its own IA-32 page tables
No VM exits due to guest page faults, INVLPG, or CR3 changes
[Diagram: 2-level example — the guest CR3 and each guest page-directory and page-table entry yield guest-physical page base addresses, and every one of these guest-physical addresses is translated through the EPT tables before the corresponding host-physical access]
All guest-physical memory addresses go through the EPT tables (CR3, PDE, PTE, etc.)
The example above is for a 2-level table covering a 32-bit address space
Translation is possible for other page-table formats (e.g., PAE)
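A toy model of this two-stage walk in Python. It is an illustrative sketch, not the hardware algorithm: dictionaries stand in for 4 KB page-table pages, table entries are word-indexed rather than byte-addressed, and permission bits and fault handling are ignored.

```python
PAGE_SHIFT = 12

def ept_translate(gpa, ept):
    """Guest-physical -> host-physical through a flat stand-in EPT map
    (guest frame number -> host frame number)."""
    return (ept[gpa >> PAGE_SHIFT] << PAGE_SHIFT) | (gpa & 0xFFF)

def translate(glin, guest_cr3, mem, ept):
    """2-level guest walk of a 32-bit linear address. Note that every
    guest-physical reference (CR3, PDE, PTE) goes through the EPT."""
    pd = ept_translate(guest_cr3, ept)        # page directory, host address
    pde = mem[pd + ((glin >> 22) & 0x3FF)]    # PDE: page-table base, guest-physical
    pt = ept_translate(pde, ept)
    pte = mem[pt + ((glin >> 12) & 0x3FF)]    # PTE: frame base, guest-physical
    return ept_translate(pte | (glin & 0xFFF), ept)

# Hypothetical mappings: guest frames 0, 1, 2 -> host frames 0x10, 0x11, 0x12
ept = {0: 0x10, 1: 0x11, 2: 0x12}
mem = {0x10001: 0x1000,   # PDE[1] -> page table at guest-physical 0x1000
       0x11000: 0x2000}   # PTE[0] -> frame at guest-physical 0x2000
print(hex(translate(0x00400A04, 0x0, mem, ept)))  # → 0x12a04
```

Only the EPT map is consulted on guest page-table edits, which is why the guest can change its own tables without VM exits.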