Storage Virtualization
Discussion led by Larry Rudolph
MIT IAP 2010, The Science of Virtualization

Virtualizing a storage device
The guest OS's device driver for the storage device talks to an emulated device, and the virtual disk is emulated as a single file on the real file system:
• Guest OS device driver -> device emulation in the VMM
• Device emulation maps the virtual disk to a file on the physical disk
• Host device driver -> real disk
Simple:
• Use direct file access
• Guest issues block reads & writes
• The guest issues other commands too, but those are just details

Details, details, details
How big is the VM's file system? Does each virtual disk have a fixed size?
• If it is a file on the host file system, how is that done? Should all the space be preallocated?
What does the guest OS tell the user?
• How much space is left on the disk?
The guest OS maintains a buffer cache
• A cache of blocks already read, plus read-ahead
• The host OS also maintains a block cache
• It is silly to keep multiple copies of the same block
• What if there are several VMs? The host block cache may thrash

Guest OS does disk I/O scheduling
The guest thinks it has a physical disk attached, so it schedules I/O to optimize arm movement
• Is this a problem? Does it matter?
• The guest does not know the truth
The guest issues a command to the virtual disk via its disk driver; the command is captured by the VMM's emulated device, (translated and) issued to the real device, and the result returns back up to the guest OS, which issues the next command
• But by then it might be too late to schedule well

Optimization: simple sharing
Assume all n VMs run the same version of Windows XP
• No need to keep n copies of Windows' read-only code
• Virtualization can be used to save disk space. How?
• Virtualization is a level of indirection, just like shared pages
• How can this be done efficiently? What about other blocks?
• COW – copy-on-write
Can / should storage be shared?
• Can the host "see" files in the guest? What if the guest is not running? Think about how virus protection works.
• Can the guest "see" files on the host? What happened to isolation?

Consider VMs in a data center
Two broad models of storage: local disks vs. a Storage Area Network (SAN)
Google, for example, uses the local disk model
• A query is sent to lots of servers, each of which searches its local disk
• To access data on a remote disk, ask its server
What happens when (virtualized) servers move from one physical machine to another?
• It matters if there is a real connection between the two

SANs: Storage Area Networks
Not as simple as it seems: servers and disks connected to a network
The devices on a SAN need not be commodities – they can provide many services
• These services are physically based
Logical Unit Number – LUN
• Think file system or file volume; SANs deal in these units
Protocols: Fibre Channel & SCSI
What services can storage arrays provide?
• RAID (e.g., use 9 disks to provide fault tolerance)
• Lots of other things, but at LUN granularity

SAN quality of service – multiqueue
Many VMs share one physical connection. If the VMM has only a single queue, one VM can dominate, leaving another VM starved for service
• We want to provide quality of service to all VMs
Potential solution
• Multiple queues; the VMM pops the queues appropriately to maintain QoS (a queue-popping sketch appears at the end of these notes)
• Best effort vs. fairness in scheduling
Resource sharing
• Bandwidth, time, ports, …

De-duplication
Why store multiple copies of the same block? Just use pointers.
Good idea – let's do it in S/W (a sketch follows below). Details?
• Deletion?
• Migration?
• Backup?
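A minimal Python sketch of the de-duplicated block store idea, assuming fixed-size blocks and an in-memory fingerprint index; BlockStore and its methods are illustrative names, not any real product's API. Reference counting is one answer to the deletion question above: a shared block is freed only when its last pointer disappears.

```python
# A minimal de-duplication sketch: blocks are indexed by content
# fingerprint and stored once; reference counting handles deletion.
# BlockStore and its methods are illustrative, not a real product's API.
import hashlib

BLOCK = 4096

class BlockStore:
    def __init__(self):
        self.blocks = {}    # fingerprint -> block data (stored once)
        self.refcount = {}  # fingerprint -> number of pointers to it

    def write(self, data):
        """Store a block; identical content is never stored twice."""
        fp = hashlib.sha256(data).hexdigest()
        if fp not in self.blocks:
            self.blocks[fp] = data
        self.refcount[fp] = self.refcount.get(fp, 0) + 1
        return fp           # the caller keeps a pointer, not a copy

    def read(self, fp):
        return self.blocks[fp]

    def delete(self, fp):
        """The subtle part: the block may still be shared, so free the
        data only when the last pointer goes away."""
        self.refcount[fp] -= 1
        if self.refcount[fp] == 0:
            del self.blocks[fp]
            del self.refcount[fp]

store = BlockStore()
p1 = store.write(b"\x00" * BLOCK)
p2 = store.write(b"\x00" * BLOCK)      # same content -> same fingerprint
assert p1 == p2 and len(store.blocks) == 1
store.delete(p1)                       # block survives: p2 still points at it
assert len(store.blocks) == 1
```

Migration and backup are harder for the same reason: a block's liveness is a property of the whole pointer graph, not of any one virtual disk.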
Snapshots (versioning)
Application-consistent versioning:
• The VMM must flush all block caches / buffers
• The guest OS must flush all block caches / buffers
• The guest application must flush all block caches / buffers
• Everyone must be told when it is safe to proceed
• Who does the new mapping?

Atomicity – locking (reservations)
There are times when it is desirable to lock the whole file system
• Why?
SANs provide locking on a per-LUN basis

Replication
We want to maintain two copies of the file system
• Geographic separation in case of disaster
• What other reasons?
SANs provide this on a per-LUN basis
• Multicast the write-block messages
What are the problems with replication at the file-system level?

Disaster recovery
Replication
• Asynchronous
• Synchronous (higher latency)
Fail-over mechanism

Thin provisioning
Allocate blocks on demand
• Map each virtual block to a physical block on some disk in the storage array
• Permits over-commitment of storage
• When usage gets close to real capacity, add more physical disks to the storage array
Good idea – let's also do it in S/W (a sketch follows below)
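As a companion to the slide above, a minimal thin-provisioning sketch in Python, assuming a flat pool of physical blocks allocated strictly on first write; ThinDisk, Pool, and the 4 KB block size are illustrative assumptions, not a real array's API.

```python
# A minimal thin-provisioning sketch: physical blocks come from a
# shared pool and are allocated only on first write.
BLOCK = 4096

class Pool:
    def __init__(self, physical_blocks):
        self.free = list(range(physical_blocks))
        self.data = {}

    def allocate(self):
        if not self.free:
            # Over-commitment caught up with reality: the admin must
            # add physical disks to the array, or writes start failing.
            raise RuntimeError("physical pool exhausted")
        return self.free.pop()

    def store(self, pblock, data): self.data[pblock] = data
    def load(self, pblock): return self.data[pblock]

class ThinDisk:
    def __init__(self, virtual_blocks, pool):
        self.virtual_blocks = virtual_blocks  # advertised (virtual) size
        self.mapping = {}                     # virtual block -> physical block
        self.pool = pool

    def write(self, vblock, data):
        if vblock not in self.mapping:        # allocate on first write only
            self.mapping[vblock] = self.pool.allocate()
        self.pool.store(self.mapping[vblock], data)

    def read(self, vblock):
        if vblock not in self.mapping:
            return b"\x00" * BLOCK            # unwritten blocks read as zeros
        return self.pool.load(self.mapping[vblock])

# Two 1000-block virtual disks over-committed onto a 1200-block pool.
pool = Pool(1200)
d1, d2 = ThinDisk(1000, pool), ThinDisk(1000, pool)
d1.write(0, b"boot sector")
assert d2.read(0) == b"\x00" * BLOCK          # d2 still consumes no space
```

The over-commitment in the last lines is the point: the two disks advertise 2000 blocks in total against 1200 physical ones, which works until the guests actually write more than the pool holds.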
Predictive read and other common-case optimizations
There are non-volatile buffers in front of the disk, so we can do read-ahead to avoid latency costs
• E.g., during boot-up it is easy to predict which blocks are read next
• The buffer is finite; we cannot read everything into it
• Assume some logical flow of reads
The pattern of reads exists at the file-system level, not the LUN level

vMotion – migrate a VM and its storage
A VM can be migrated from server to server while it is still running. We want to do the same for its (virtual) file system, and to migrate it live
• Eager or lazy?

Summary
Storage is a big part of the data center (server, network & storage)
Today, the physical world deals in LUNs, the virtual world in files
• Agree on a standard set of APIs
VMFS (the virtual machine file system) provides APIs to be supported by physical SANs:
• Block copy
• Zeroing
• Atomic test & set
• Thin provisioning
• Block release
• …

Storage virtualization for virtual machines
Christos Karamanolis, VMware, Inc.

VMware products
"Hosted" products
• VMware Workstation (WS), VMware Player – free
• ACE
• VMware GSX Server, VMware Server – free
"Bare metal" products
• VMware ESX Server
• Virtual Center
• Services: HA, DRS, Backup

ESX Server architecture
A platform for running virtual machines (VMs)
Virtual Machine Monitor (VMM)
The VMkernel takes full control of the physical hardware
• Resource management
• High-performance network and storage stacks
• Drivers for physical hardware
Service console
• For booting the machine and as an external interface

vSphere storage virtualization overview

Storage virtualization: virtual disks
• Virtual disks are presented as SCSI disks to the VM
• The guest OS sees a h/w Host Bus Adapter (HBA)
  • BusLogic or LSI Logic
  • Proprietary "PVSCSI" adapter (optimized for virtualization)
• HBA functionality is emulated by the VMM
• Virtual disks are stored on either
  • Locally attached disks
  • A Storage Area Network (SAN)
  • Network-attached storage (NFS, iSCSI)

VMkernel: virtual disks -> physical storage
[Diagram: the VMkernel storage stack – vSCSI below the VM, then COW/virtual-disk support, then the VMFS / raw LUN / NFS back-ends, then the core SCSI layer (disk scheduling, LUN rescan, multipathing), then the FC and iSCSI device drivers.]
Support for virtual disks
• Process SCSI commands from the VM
• Hot-add virtual disks to VMs
• Snapshots (COW): versioning, rollback
Map virtual disk I/O to physical storage I/O
• Handle physical heterogeneity
• A different physical storage type per virtual disk
Advanced storage features
• Disk scheduling
• Multipathing, LUN discovery
• Pluggable architecture for 3rd parties
Device drivers for the real h/w

VMware File System (VMFS)
An efficient FS for virtual disks on SANs
• Performance close to native
Physical storage (LUN) management
• VMFS file systems are automatically recognized when LUNs are discovered
• Lazy provisioning of virtual disks
Clustered: sharing across ESX hosts
• Per-file locking
• Uses only the SAN, not the IP network
[Diagram: .vmdk virtual disks and a .redo log on a shared VMFS volume on the SAN, accessed by multiple ESX hosts.]
Supports mobility & HA of VMs
Scalability: 100s of VMs on 10s of hosts sharing one VMFS file system

VMFS performance – sequential read (old graphs)
[Graph: sequential-read throughput (MBps, 0–180) vs. block size (4k–64k), physical vs. virtual; the virtual disk tracks physical performance closely.]
ESX 2.5 on an HP ProLiant DL580, 4 processors, 16 GB memory. VM/OS: Windows 2003 Enterprise Edition, uniprocessor, 3.6 GB memory, 4 GB virtual disk on one 5-disk RAID-5 LUN on a CLARiiON CX500. Testware: IOmeter, 8 outstanding IOs.

VMFS performance – random read (old graphs)
[Graph: random-read throughput (MBps, 0–30) vs. block size (4k–64k), physical vs. virtual, same setup as above.]

Raw access to LUNs
Clustering applications in VMs
• Sharing between VMs using SCSI reservations
• Sharing between VMs and physical machines
[Diagram: a VMFS volume holding 1.vmdk plus a raw device mapping (4.rdm) pointing directly at a SAN LUN.]

In-band control of array features
• Some disk arrays provide snapshots, mirroring, etc., that are controllable from VMs via special SCSI commands
• Similar for access to tape drives
• We provide a pass-through mode that allows the VM to control this functionality

Lower layers of the storage stack
Disk scheduling
• Controls each VM's share of the disk bandwidth
Multipathing
• Fail over to a new path when the existing storage path fails
• Transparently provides a reliable connection to storage for all VMs (a sketch follows below)
• Improves portability & ease of management
• No multipathing software required in the guest OS
LUN rescanning and VMFS file system discovery
• Dynamically add/remove LUNs on ESX
Storage device drivers
• May need to copy/map guest data to low memory because of driver DMA restrictions
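A minimal sketch of the transparent-failover idea from the multipathing bullets, assuming each path raises IOError when its route to the LUN is down; Path and MultipathDevice are illustrative classes, not ESX's actual pluggable multipathing architecture.

```python
# A minimal transparent path-failover sketch: try the active path,
# and on failure retry the I/O on the next path, below the guest's view.
class Path:
    def __init__(self, name, up=True):
        self.name, self.up = name, up

    def issue(self, scsi_cmd):
        if not self.up:
            raise IOError(f"{self.name} is down")
        return f"{scsi_cmd} completed via {self.name}"

class MultipathDevice:
    def __init__(self, paths):
        self.paths = paths   # e.g. two FC paths through different fabrics
        self.active = 0      # index of the currently preferred path

    def issue(self, scsi_cmd):
        """Try the active path; on failure, fail over so the VM above
        never sees the outage."""
        for attempt in range(len(self.paths)):
            i = (self.active + attempt) % len(self.paths)
            try:
                result = self.paths[i].issue(scsi_cmd)
                self.active = i          # stick with the path that worked
                return result
            except IOError:
                continue                 # path down: try the next one
        raise IOError("all paths to the LUN failed")

dev = MultipathDevice([Path("fc0"), Path("fc1")])
dev.paths[0].up = False        # the primary path fails...
print(dev.issue("READ(10)"))   # ...the I/O still completes via fc1
```

The guest never participates: the retry happens below the virtual SCSI layer, which is why no multipathing software is needed in the guest OS.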
Storage Solutions

Enabling solutions
ESX's storage virtualization platform provides:
• VM state encapsulation
• A generic snapshot/rollback tool
• Highly available access to physical storage
• Safe sharing of the physical storage
It is an enabler for virtualization solutions:
• Ease of management in the data center
• VM high availability and mobility
• Protection of VMs: application + data

Future directions: I/O resource control
The typical usage scenario is a single-application VM
• Leverage knowledge of the application/workload?
• Provisioning, resource management, metadata, format
Advanced I/O scheduling
• Adapt to the dynamic nature of SANs and workloads
• Combine with control of CPU, memory, and network
• Local resource control vs. global goals
• Target high-level 'business' QoS goals

Storage VMotion
• Migrate VM disks (a sketch of the pre-copy idea follows below)
• Non-disruptive: runs while the VM is running
• VM granularity, LUN independent
• Can be combined with VMotion
• Uses:
  • Upgrade to new arrays (e.g., move LUNs from Array A, off lease, to the new Array B)
  • Migrate to a different class of storage (up/down tier)
  • Change VM -> VMFS volume mappings
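The "eager or lazy?" question from the vMotion slide in the first deck comes down to when blocks move. Below is a minimal sketch of the eager (iterative pre-copy) approach in Python, assuming the hypervisor can report which blocks the running VM dirtied since the last pass; the function names, the convergence threshold, and the scripted dirty sets are all illustrative assumptions, not VMware's actual Storage VMotion algorithm.

```python
# A minimal eager (iterative pre-copy) disk-migration sketch: bulk-copy
# while the VM runs, then repeatedly re-copy dirtied blocks until the
# remaining dirty set is small enough to finish during a brief pause.
def migrate(src, dst, nblocks, get_dirty):
    """Copy src to dst while the VM keeps writing to src."""
    for b in range(nblocks):            # pass 1: bulk copy, VM still running
        dst[b] = src[b]
    dirty = get_dirty()                 # blocks written during pass 1
    while len(dirty) > 8:               # iterate until the dirty set is small
        for b in dirty:
            dst[b] = src[b]
        dirty = get_dirty()
    # Final pass: briefly pause I/O, copy the last few blocks, then switch
    # the virtual disk's mapping to dst and resume the VM.
    for b in dirty:
        dst[b] = src[b]

src = [bytes([i % 256]) for i in range(64)]
dst = [b""] * 64
# Scripted dirty sets standing in for a real dirty-block tracker; they
# shrink, simulating a workload that lets the copy converge.
passes = [set(range(32)), set(range(12)), {3, 5}]
get_dirty = lambda: passes.pop(0) if passes else set()

migrate(src, dst, 64, get_dirty)
assert dst == src   # src is static in this simulation, so the copy is exact
```

A lazy variant switches the mapping first and pulls blocks from the source on demand, trading a shorter switchover for a longer window of dependence on the old array.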
Business continuity
A virtual machine (configuration, disks) encapsulates all the state needed for recovery
Safe sharing of VM state on the SAN allows immediate recovery when physical hardware fails – high availability
• Fail over the VM upon failure; fully automated
• No h/w dependencies, no re-installation, no idle h/w
Extend this model to non-shared and even remote storage

Consolidated Backup (VCB) framework
A backup framework for VMs
Proxy backup server
• May be in a VM or not
• Copies virtual disks off the SAN/NAS
Disk consistency
• Delta disks for a consistent disk image
• Optional application quiescing (VSS)
3rd-party backup software
• A number of vendors
Restore virtual disks from backup and start the VM
• Use VMware Converter

Disaster Recovery (DR)
Site Recovery Manager (SRM)
• Automates disaster-recovery workflows: setup, testing, failover, failback
• Central management of recovery plans from Virtual Center
• Manual recovery processes -> automated recovery plans
• Uses 3rd-party storage replication
  • Array-based replication
  • Replication is set up at VMFS volume granularity (a group of LUNs)

Breaking virtual disk encapsulation?
Share/clone/manage VM state
• Encapsulation vs. manageability
Opaque virtual disks
• Pros: encapsulation, isolation, generic versioning and rollback, simple to manage
• Cons: too coarse a granularity, proliferation of VM versions, redundancy, no sharing
Explicit support for VM "templates": share VM boot virtual disks
A VM-state-aware FS that allows transparent sharing (NSDI '06)

Storage trends and challenges

Storage trends: converged data center networks
Driving forces
• The need to reduce costs
• The need to simplify and consolidate management
• The need to support next-generation data center architectures
Industry responses
• Converged fabrics
• Data center ethernet
• FCoE
• New data center network topologies
• New rack architectures
• New management paradigms

Storage trends: information growth
Driving forces
• 60% raw growth
• A shift to rich content vs. transactions
• Consumed globally and via mobile
• Longer retention periods
• More use of "archival" information in business processes
• A desire to move beyond tape
Industry responses
• Increased focus on policy
  • New governance functions
  • Categorization software
  • ILM, archiving, etc.
• Larger disk drives: 1 TB -> 2 TB -> 3 TB -> 4 TB
• Data deduplication: removing redundancy
• "Content" clouds: moving information closer to users

Storage vision
Today:
• Manual configuration and management of physical storage, based on vendors' tools
• Multiple tiers of magnetic media; disk and tape co-existing
• Manual mapping of workloads/apps to storage
• Manual performance optimization
Tomorrow:
• Zero physical storage management
• Manage apps / workloads, not media
• Automatic, dynamic storage tiering
• Scale-out (distributed) storage architectures based on commodity h/w
• Policy-based, automated placement and movement of data

Towards the vision: storage awareness and integration
VMware Infrastructure – the Virtual Datacenter OS from VMware (vCompute, vStorage, vNetwork, vCloud, vServices), integrated with storage partners' infrastructure
Storage operations
• VMFS
• Storage VMotion
• Linked Clones
• Thin Provisioning
• VMDirectPath
Storage management
• Storage Virtual Appliances
• vStorage APIs
• VM storage management

Consolidated I/O fabrics
Common storage and network
• Built on low-latency, lossless 10GbE
• FCoE, iSCSI, NAS
• Both cost-effective bandwidth and increased bandwidth per VM
Simplified infrastructure and management
• Simplify the physical infrastructure
• Unify the LAN, SAN, and VMotion networks
• Scale VM performance to 10GbE

Scale-out storage architectures
Clients: ESX hosts
• Virtual Name Space (VNS): hides object-placement details
• Single-pane management; specify per-object policy
• Asymmetric access protocol, e.g., pNFS
Storage cluster
• Object mobility, availability, reliability, snapshots, thin provisioning, dedup, etc.
• Storage containers provide different tiers of storage with different capabilities
• White-box storage h/w; per-object policy enforcement
Automated, adaptive storage provisioning and tiering
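Finally, the queue-popping sketch promised back at the multiqueue QoS slide: that slide and the I/O resource control slide both reduce to how the VMM pops per-VM queues. A minimal weighted round-robin sketch in Python; the class name, VM names, and weights are illustrative assumptions.

```python
# A minimal weighted round-robin sketch of the "multiqueue" QoS idea:
# one queue per VM, popped by the VMM in proportion to per-VM weights,
# so no single VM can starve the others.
from collections import deque

class QoSScheduler:
    def __init__(self, weights):
        self.weights = weights                         # vm -> share (tickets)
        self.queues = {vm: deque() for vm in weights}  # one queue per VM

    def submit(self, vm, io):
        self.queues[vm].append(io)    # a busy VM fills only its own queue

    def pop_round(self):
        """One scheduling round: issue up to `weight` I/Os from each VM."""
        issued = []
        for vm, w in self.weights.items():
            for _ in range(w):
                if self.queues[vm]:
                    issued.append(self.queues[vm].popleft())
        return issued

sched = QoSScheduler({"vmA": 3, "vmB": 1})   # vmA gets 3x the bandwidth
for i in range(10):
    sched.submit("vmA", f"A{i}")
    sched.submit("vmB", f"B{i}")
print(sched.pop_round())                     # ['A0', 'A1', 'A2', 'B0']
```

Fairness here is proportional to static weights; a best-effort scheduler would instead let any backlog drain opportunistically, which is exactly the trade-off the multiqueue slide flags.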