Elden Christensen
Senior Program Manager Lead
Microsoft
Session Code: SVR319

Session Objectives and Takeaways
Session Objective(s):
- Understanding the need for and benefit of multi-site clusters
- What to consider as you plan, design, and deploy your first multi-site cluster
- Windows Server Failover Clustering is a great solution for not only high availability, but also disaster recovery

Multi-Site Clustering
- Introduction
- Networking
- Storage
- Quorum
- Workloads

Is My Cluster Resilient to Site Failures?
- Site A: nodes and SAN in the same physical location
- But what if there is a catastrophic event? Fire, flood, earthquake …

Multi-Site Clusters for DR
- Extends a cluster from being a High Availability solution to also being a Disaster Recovery solution
- A node is moved to a physically separate site
- Applications are failed over to a separate physical location
[Diagram: Site A and Site B, each with its own SAN]

Benefits of a Multi-Site Cluster
- Protects against loss of an entire datacenter
- Automates failover
  - Reduced downtime
  - Lower complexity disaster recovery plan
- Reduces administrative overhead
  - Automatically synchronize application and cluster changes
  - Easier to keep consistent than standalone servers
- The primary reason DR solutions fail is dependence on people

Multi-Site Clustering: Networking

Network Considerations
Network Options:
1. Stretch VLANs across sites
2. Cluster nodes can reside in different subnets
[Diagram: Site A and Site B connected by a public network and a separate network]

Stretching the Network
- Longer distance traditionally means greater network latency
- Too many missed health checks can cause false failover
- Heartbeating is fully configurable
  - SameSubnetDelay (default = 1 second): how often heartbeats are sent
  - SameSubnetThreshold (default = 5 heartbeats): missed heartbeats before an interface is considered down
  - CrossSubnetDelay (default = 1 second): how often heartbeats are sent to nodes on dissimilar subnets
  - CrossSubnetThreshold (default = 5 heartbeats): missed heartbeats before an interface is considered down to nodes on dissimilar subnets
- Command Line: Cluster.exe /prop
- PowerShell (R2): Get-Cluster | fl *

Security over the WAN
- Encrypt inter-node traffic (cluster property SecurityLevel)
  - 0 = clear text
  - 1 = signed (default)
  - 2 = encrypted

Enhanced Dependencies: OR
- Network Name resource stays up if either IP Address Resource A OR IP Address Resource B is up
[Diagram: Network Name resource depends on IP Address Resource A OR IP Address Resource B]

Client Reconnect Considerations
- Nodes in dissimilar subnets
- Failover changes the resource's IP address
- Clients need that new IP address from DNS to reconnect
[Diagram: DNS replication between DNS Server 1 and DNS Server 2 across Site A and Site B; FS = 10.10.10.111; labels: Record Created, Record Updated, Record Obtained]

Solution #1: Configure NN (Network Name) Settings
- RegisterAllProvidersIP (default = 0 for FALSE)
  - Determines if all IP addresses for a Network Name will be registered in DNS
  - TRUE (1): IP addresses can be online or offline and will still be registered
  - Ensure the application is set to try all IP addresses, so clients can connect quicker
- HostRecordTTL (default = 1200 seconds)
  - Controls how long the DNS record for a cluster network name lives on the client
  - Shorter TTL: DNS records for clients are updated sooner

Solution #2: Prefer Local Failover
- Local failover for higher availability
  - No change in IP address
- Cross-site failover for disaster recovery
[Diagram: DNS Server 1 and DNS Server 2; FS = 10.10.10.111 at Site A]

Solution #3: Stretch VLANs
- Deploying a VLAN minimizes client reconnection times
[Diagram: a VLAN spans Site A and Site B; FS = 10.10.10.111 at both sites]

Solution #4: Abstraction in Device
- Network device uses a 3rd IP
- The 3rd IP is the one registered in DNS and used by the client
- Example: http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/App_Networking/extmsftw2k8vistacisco.pdf

This is generic guidance… If you have other creative ideas, that’s OK!

Multi-Site Clustering: Storage

Storage in Multi-Site Clusters
Different than local clusters:
- Multiple storage arrays, independent per site
- Nodes commonly access their own site's storage
- No “true” shared disk visible to all nodes

Storage Considerations
- Changes are made on Site A and replicated to Site B
- Need a data replication mechanism between sites
[Diagram: Site A storage replicated to a replica at Site B]

Replication Options
Replication levels:
- Hardware storage-based replication
- Software host-based replication
- Application-based replication

Synchronous Replication
- Host receives the “write complete” response from the storage after the data is successfully written on both storage devices
[Diagram: write request to primary storage, replication to secondary storage, acknowledgement from secondary storage, then write complete]

Asynchronous Replication
- Host receives the “write complete” response from the storage after the data is successfully written to the primary storage device
[Diagram: write request to primary storage, write complete, then replication to secondary storage]

Synchronous vs. Asynchronous
Synchronous:
- No data loss
- Requires a high bandwidth/low latency connection
- Stretches over shorter distances
- Write latencies impact application performance
Asynchronous:
- Potential data loss on hard failures
- Needs enough bandwidth to keep up with data replication
- Stretches over longer distances
- No significant impact on application performance

Storage Resource Dependencies
- Resource Group: the group determines the smallest unit of failover
- Dependencies establish start order timing
- Resources shown: Workload Resource (example: File Server), Network Name Resource, IP Address Resources*, Disk Resource, Custom Resource
- Diagram callouts:
  - Ensures node is communicating with local storage and array state
  - Ensures application comes online after replication is complete

Cluster Validation and Replication
- Multi-site clusters are not required to pass the Storage tests to be supported
- Validation Guide and Policy: http://go.microsoft.com/fwlink/?LinkID=119949

HP’s Multi-Site Implementation & Demo
Matthias Popp
Architect
HP

HP's Multi-Site Implementation: CLX for Windows
- All Physical Disk resources of one Resource Group (VM) depend on a CLX resource
- Very smooth integration
[Diagram: Virtual Machine, VM Config File, Physical Disk, HP CLX]

HP Cluster Extension: What’s new?
- Support for Hyper-V Live Migration across disk arrays
- Support for Windows 2008 R2
- Support for Windows Hyper-V Server 2008 R2
- TT337AAE: HP StorageWorks Cluster Extension EVA for Windows e-LTU
- There is no change to current CLX product pricing
- XP Cluster Extension does not yet support Live Migration; planned for 2010

Live Migration with Storage Failover
1. Initiate Live Migration
2. Create VM on target node
3. Copy memory pages from source server to target server via Ethernet
4. Check disk array for replication link and disk pair states
5. Final state transfer
6. Pause virtual machine
7. Move storage connectivity from source server to target server
8. Change storage replication direction
9. Run new VM on target server; delete VM on source server
[Diagram: Host 1 and Host 2, each attached to HP EVA Storage, with storage-based remote replication between the arrays]

HP Storage for Virtualization
Hyper-V Live Migration between Replicated Disk Arrays
- End-user transparent app migration across data centers; across servers and storage
- Zero Downtime Array Load Balancing (IOPS, cache utilization, response times, power consumption, etc.)
- Zero Downtime Maintenance
  - Firmware/HBA/Server updates without user interruption
  - Plan maintenance without the need to check for downtimes
- Follow the sun/moon data center access model
  - Move the app/VM closest to the users or closest to the cheapest power source
- Failover, failback, Quick and Live Migration using the same management software
  - No need to learn x different tools and their limitations

EVA CLX with Exchange 2010 Live Migration
- Hyper-V Geo Cluster with Exchange
- Automatically re-direct storage replication during Live Migration

Additional HP Resources
- HP website for Hyper-V: www.hp.com/go/hyper-v
- HP and Microsoft Frontline Partnership website: www.hp.com/go/microsoft
- HP website for Windows Server 2008 R2: www.hp.com/go/ws2008r2
- HP website for management tools: www.hp.com/go/insight
- HP OS Support Matrix: www.hp.com/go/osssupport
- Information on HP ProLiant Network Adapter Teaming for Hyper-V: http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01663264/c01663264.pdf
- Technical overview on HP ProLiant Network Adapter Teaming: http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01415139/c01415139.pdf?jumpid=reg_R1002_USEN
- Whitepaper: Disaster Tolerant Virtualization Architecture with HP StorageWorks Cluster Extension and Microsoft Hyper-V™: http://h20195.www2.hp.com/V2/getdocument.aspx?docname=4AA26905ENW.pdf

Multi-Site Clustering: Quorum

Quorum Overview
- Majority is greater than 50%
- Possible voters: nodes (1 each) + 1 witness (disk or file share)
- 4 quorum types:
  - Disk only (not recommended)
  - Node and Disk majority
  - Node majority
  - Node and File Share majority

Replicated Disk Witness
- A witness is a decision maker when nodes lose network connectivity
- When a witness is not a single decision maker, problems occur
- Do not use in multi-site clusters unless directed by vendor
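The majority rule from the Quorum Overview slide can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how the cluster service implements quorum; the function names votes_needed and has_quorum and their parameters are hypothetical and do not correspond to any cluster API.

```python
def votes_needed(total_voters: int) -> int:
    """Smallest vote count that is strictly greater than 50% of all voters."""
    if total_voters < 1:
        raise ValueError("a cluster needs at least one voter")
    return total_voters // 2 + 1


def has_quorum(reachable_nodes: int, total_nodes: int,
               witness_configured: bool = False,
               witness_reachable: bool = False) -> bool:
    """Return True if the reachable voters form a majority.

    Assumption for this sketch: each node carries one vote, and a
    configured witness (disk or file share) adds exactly one more voter.
    """
    if not 0 <= reachable_nodes <= total_nodes:
        raise ValueError("reachable_nodes must be between 0 and total_nodes")
    total_voters = total_nodes + (1 if witness_configured else 0)
    reachable = reachable_nodes
    if witness_configured and witness_reachable:
        reachable += 1
    return reachable >= votes_needed(total_voters)


if __name__ == "__main__":
    print(votes_needed(5))                 # 3
    print(has_quorum(3, 5))                # True: 3 of 5 votes
    print(has_quorum(2, 5))                # False: 2 of 5 votes
    print(has_quorum(2, 4, True, True))    # True: 2 nodes + witness = 3 of 5 votes
    print(has_quorum(2, 4))                # False: 2 of 4 is not more than 50%
```

The last two calls show why an even node count benefits from a witness: two nodes out of four are exactly 50%, which is not a majority, while the same two nodes plus a reachable witness hold three of five votes.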
Node Majority
[Diagram: 5-node cluster, majority = 3; majority of nodes in the primary site; cross-site network connectivity broken]
- Site A: Can I communicate with a majority of the nodes in the cluster? Yes, then stay up
- Site B: Can I communicate with a majority of the nodes in the cluster? No, drop out of cluster membership

Node Majority
[Diagram: 5-node cluster, majority = 3; majority of nodes in the primary site; disaster at Site A]
- Site A: We are down!
- Site B: Can I communicate with a majority of the nodes in the cluster? No, drop out of cluster membership
- Need to force quorum manually

Forcing Quorum
- Always understand why quorum was lost
- Used to bring the cluster online without quorum
- Cluster starts in a special “forced” state
- Once majority is achieved, no more “forced” state
- Command Line: net start clussvc /fixquorum (or /fq)
- PowerShell (R2): Start-ClusterNode -FixQuorum (or -fq)

Multi-Site With File Share Witness
- Complete resiliency and automatic recovery from the loss of any 1 site
[Diagram: Site A and Site B, each with a SAN and replicated storage, connected over the WAN; file share witness \\Foo\Cluster1 at Site C]

Multi-Site With File Share Witness
- Complete resiliency and automatic recovery from the loss of connection between sites
- Site A: Can I communicate with a majority of the nodes (+FSW) in the cluster? Yes, then stay up
- Site B: Can I communicate with a majority of the nodes in the cluster? No (lock failed), drop out of cluster membership
[Diagram: Site A and Site B, each with a SAN and replicated storage, connected over the WAN; file share witness \\Foo\Cluster1 at Site C]

FSW Considerations
- Simple Windows File Server
- A single file server can serve as a witness for multiple clusters
  - Each cluster requires its own share
  - Can be clustered in a second cluster
- Recommended to be at a 3rd separate site so that there is no single point of failure
- FSW cannot be on a node in the same cluster

Quorum Model Summary
- No Majority: Disk Only
  - Not recommended
  - Use as directed by vendor
- Node and Disk Majority
  - Use as directed by vendor
- Node Majority
  - Odd number of nodes
  - More nodes in primary site
- Node and File Share Majority
  - Even number of nodes
  - Best availability solution: FSW in 3rd site

Multi-Site Clustering: Workloads

Hyper-V in a Multi-Site Cluster
Area: Considerations
Network:
- On cross-subnet failover, if guest is …
  - DHCP, then IP updated automatically
  - Statically configured IP, then admin needs to configure new IP
- Use of VLAN preferred with live migration between sites
Storage:
- 3rd party replication solution required
- Configuration with CSV (explained next)
Quorum:
- No special considerations
Links: http://technet.microsoft.com/en-us/library/dd197488.aspx

CSV in a Multi-Site Cluster
Architectural assumptions collide…
- Replication solutions assume only 1 array is accessed at a time
- CSV assumes all nodes can concurrently access the LUN
- CSV is not required for Live Migration
- Talk to your storage vendor for their support story
- CSV requires VLANs
[Diagram: nodes in the primary site access the VHD read/write; the VHD is replicated to the disaster recovery site, where the replica is read-only; a VM attempts to access the replica]

SQL in a Multi-Site Cluster
Area: Considerations
Network:
- SQL does not support OR dependency
- Need to stretch VLAN between sites
Storage:
- No special considerations
- 3rd party replication solution required
Quorum:
- No special considerations
Links:
http://technet.microsoft.com/en-us/library/ms189134.aspx
http://technet.microsoft.com/en-us/library/ms178128.aspx

Exchange in a Multi-Site Cluster
Area: Considerations
Network:
- No VLAN needed
- Change HostRecordTTL from 20 minutes to 5 minutes
- CCR supports 2 nodes, one per site
Storage:
- Exchange CCR provides application-based replication
Quorum:
- File share witness on the Hub Transport server on primary site
Links:
http://technet.microsoft.com/en-us/library/bb124721.aspx
http://technet.microsoft.com/en-us/library/aa998848.aspx

Session Summary
- Multi-Site Failover Clustering has many benefits
- Redundancy is needed everywhere
- Understand your replication needs
- Compare VLANs with multiple subnets
- Plan quorum model & nodes before deployment
- Follow the checklist and best practices

Resources
- www.microsoft.com/teched: Sessions On-Demand & Community
- www.microsoft.com/learning: Microsoft Certification & Training Resources
- http://microsoft.com/technet: Resources for IT Professionals
- http://microsoft.com/msdn: Resources for Developers

Related Content
Breakout Sessions:
- SVR208 Gaining Higher Availability with Windows Server 2008 R2 Failover Clustering
- SVR319 Multi-Site Clustering with Windows Server 2008 R2
- DAT312 All You Needed to Know about Microsoft SQL Server 2008 Failover Clustering
- UNC307 Microsoft Exchange Server 2010 High Availability
- SVR211 The Challenges of Building and Managing a Scalable and Highly Available Windows Server 2008 R2 Virtualisation Solution
- SVR314 From Zero to Live Migration. How to Set Up a Live Migration
Demo Sessions:
- SVR01-DEMO Free Live Migration and High Availability with Microsoft Hyper-V Server 2008 R2
Hands-on Labs:
- UNC12-HOL Microsoft Exchange Server 2010 High Availability and Storage Scenarios

Multi-Site Clustering Content
- Design guide: http://technet.microsoft.com/en-us/library/dd197430.aspx
- Deployment guide/checklist: http://technet.microsoft.com/en-us/library/dd197546.aspx

Complete an evaluation on CommNet and enter to win an Xbox 360 Elite!

© 2009 Microsoft Corporation. All rights reserved.
Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.