Automated Disaster Recovery Solution for Microsoft SharePoint server using Azure Site Recovery

Summary: This document provides technical guidance for implementing a one-click disaster recovery solution for Microsoft SharePoint server using Azure Site Recovery.

Published: September 2015
Applies to: Microsoft SharePoint server, Azure Site Recovery

Copyright and Disclaimer

© 2015 Microsoft Corporation. All rights reserved. This document is provided "as-is". Information and views expressed in this document, including URL and other Internet Web site references, may change without notice. You bear the risk of using it. This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You may copy and use this document for your internal, reference purposes. You may modify this document for your internal, reference purposes.

Automated Disaster Recovery Solution for Microsoft SharePoint server using Azure Site Recovery
    Overview
    SharePoint architecture
    Supported Azure Site Recovery Deployment Options
    Prerequisites
Enable DR of SharePoint server using ASR
    Protect your SharePoint Server farm
        Setup AD and DNS replication
        Setup SQL Server Protection (Site to Site)
        Setup SQL Protection (Site to Azure)
        Enable protection for web and app servers
        Configure Networking
Create a recovery plan
Perform a Test Failover
Perform an Unplanned Failover
Perform a Planned Failover
Perform a Failback
Best Practices
    Capacity planning and readiness assessment
    Implementation Checklist
Summary
Appendix (Scripts)

Overview

Microsoft SharePoint is a powerful application that can help a group or department organize, collaborate, and share information. SharePoint can provide intranet portals, document and file management, collaboration, social networks, extranets, websites, enterprise search, and business intelligence.
It also has system integration, process integration, and workflow automation capabilities. Typically, organizations consider it an enterprise-class Tier-1 application that is sensitive to downtime and data loss.

Today, Microsoft SharePoint [1] does not provide any out-of-the-box disaster recovery capabilities. Regardless of the type and scale of a disaster, recovery involves the use of a standby data center that you can recover the farm to. Standby data centers are required for scenarios where local redundant systems and backups cannot recover from the outage at the primary data center.

Azure Site Recovery [2] is an Azure-based service that provides disaster recovery capabilities by orchestrating replication, failover, and recovery of virtual machines. Azure Site Recovery supports a number of replication technologies to consistently replicate, protect, and seamlessly fail over virtual machines and applications to private, public, or service provider clouds. The Azure Site Recovery based disaster recovery solution is fully tested, certified, and recommended by Microsoft SharePoint.

This document explains in detail how you can create a disaster recovery solution for your SharePoint server farm and perform planned, unplanned, and test failovers using a one-click recovery plan. It also covers supported configurations and prerequisites. The audience is expected to be familiar with SharePoint and Azure Site Recovery.

SharePoint architecture

SharePoint can be deployed on one or more servers using tiered topologies and server roles to implement a farm design that meets specific goals and objectives. A typical large, high-demand SharePoint server farm that supports a high number of concurrent users and a large number of content items uses service grouping as part of its scalability strategy. This approach involves running services on dedicated servers, grouping these services together, and then scaling out the servers as a group.
The following topology illustrates the service and server grouping for a three-tier SharePoint server farm. Please refer to the SharePoint documentation and product line architectures for detailed guidance on different SharePoint topologies.

Figure: Service and server grouping in a three-tier farm

[1] SharePoint Resources for IT Pros
[2] Azure Site Recovery documentation

Supported Azure Site Recovery Deployment Options

Customers can deploy a SharePoint farm as virtual machines running on Hyper-V or VMware, or as physical servers. Azure Site Recovery can protect both physical and virtual deployments to either a secondary site or to Azure.

Deployment    Site to Site    Site to Azure
Hyper-V       Yes             Yes
VMware        Yes             Yes
Physical      Yes             Yes

Prerequisites

Implementing disaster recovery for SharePoint using Azure Site Recovery requires the following prerequisites to be completed:

• An on-premises SharePoint production farm has been set up.
• An Azure Site Recovery Services vault has been created in the Microsoft Azure subscription [3].
• Use Windows Azure Backup (or other backup methodologies) to take a scheduled backup of the ‘Search’ application for SharePoint if search continuity post failover is desired.
• If Azure is your recovery site, run the Azure Virtual Machine Readiness Assessment tool [4] on the VMs to ensure that they are compatible with Azure VMs and Azure Site Recovery Services.

[3] Create Azure Site Recovery vault in Microsoft Azure subscription

Enable DR of SharePoint server using ASR

Protect your SharePoint Server farm

Each component of the SharePoint farm needs to be protected to enable farm replication and recovery. This section covers:

• Protection of Active Directory
• Protection of the SQL tier
• Protection of the app and web tiers
• Networking configuration

Setup AD and DNS replication

Active Directory is required on the DR site for the SharePoint farm to function. There are two recommended choices based on the complexity of the customer’s on-premises environment.
Option 1
If the customer has a small number of applications and a single domain controller for the entire on-premises site, and will be failing over the entire site together, then we recommend using ASR replication to replicate the domain controller machine to the secondary site (applicable for both Site to Site and Site to Azure).

Option 2
If the customer has a large number of applications, is running an Active Directory forest, and will fail over a few applications at a time, then we recommend setting up an additional domain controller on the DR site (secondary site or in Azure).

Please refer to the companion guide [5] on making a domain controller available on the DR site. For the remainder of this document, we assume that a DC is available on the DR site.

[4] Azure Virtual Machine Readiness Assessment
[5] Setting up AD for a DR environment

Setup SQL Server Protection (Site to Site)

Please refer to the companion guide [6] for detailed technical guidance on the recommended option for protecting the SQL tier.

If SQL availability groups are being used in your SharePoint server farm, follow the guidance below to group the databases into availability groups.

Availability Group Name    Databases                 Comments
AG_Search                  The 4 search databases    Search databases cannot be replicated asynchronously and are therefore separated.
AG_Content                 Only content databases    Content databases usually have a high volume of change and are therefore separated.
AG_Services                All other databases       Databases for Configuration DB and Service Applications change infrequently and therefore can be grouped together.
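The grouping rule in the table above can be sketched as a small PowerShell function. This is only an illustration under stated assumptions: the function name Get-AvailabilityGroupName, the sample database names, and the idea that search and content databases can be recognised by the words "Search" and "Content" in their names are all hypothetical and not part of the original guidance.

```powershell
# Hypothetical helper, not from the original guide: picks an availability
# group for a database based on its name.
function Get-AvailabilityGroupName {
    param (
        [Parameter(Mandatory = $true)]
        [string]$DatabaseName
    )

    # Assumption: search databases have "Search" in their name and content
    # databases have "Content" in their name. Adjust to your naming scheme.
    if ($DatabaseName -like '*Search*') {
        return 'AG_Search'
    }
    if ($DatabaseName -like '*Content*') {
        return 'AG_Content'
    }

    # Everything else (configuration and service application databases)
    return 'AG_Services'
}

# Illustrative database names only
$databases = 'Search_AdminDB', 'Search_CrawlStoreDB', 'WSS_Content_Intranet',
             'SharePoint_Config', 'Managed_Metadata_DB'

foreach ($database in $databases) {
    '{0} -> {1}' -f $database, (Get-AvailabilityGroupName -DatabaseName $database)
}
```

A name-based rule like this is easy to review before the availability groups are created, but it is only as reliable as the naming convention in your farm, so the resulting list should be checked by hand.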
[6] Protect SQL Tier

Figure: Failover Clustering and SQL AlwaysOn AGs in SharePoint server farm. A Windows Server Failover Cluster (WSFC) with a file share quorum spans the primary datacenter (network subnet 1) and the alternate datacenter (network subnet 2). Nodes 1, 2, and 3 each run a default instance with its own storage. The availability groups AG_Content, AG_Search, and AG_Services each have a primary replica, secondary replicas, and an availability group listener (virtual network name).

Setup SQL Protection (Site to Azure)

Please refer to the companion guide [7] for detailed technical guidance on the recommended option for protecting the SQL tier.

Configuring multiple SQL availability groups in Azure is challenging, since each availability group requires a dedicated listener and the configuration of each listener requires a separate cloud service. The recommendation, in this case, is to configure one availability group with all the databases included. The recommendation not to replicate search databases asynchronously still applies.

Enable protection for web and app servers

Enable protection of the web and app tier VMs in ASR. Perform the relevant Azure Site Recovery configuration based on whether the VMs are deployed on Hyper-V or on VMware. The recommended crash-consistent frequency to configure is 15 minutes.

The below snapshot shows the protection status of SharePoint web and app tier VMs in the ‘Hyper-V site to Azure’ protection scenario.

Configure Networking

Configure VM Network Settings

For the app and web tier VMs, configure network settings in ASR so that the VM networks get attached to the right DR network after failover.
Ensure the DR network for these tiers is routable to the SQL tier. You can select the VM in the ‘VMM Cloud’ or the ‘Protection Group’ to configure the network settings as shown in the snapshot below.

[7] Protect SQL Tier

Configure DNS and Traffic Routing

For internet facing sites, create an instance of Traffic Manager in the Azure subscription and configure it and your DNS in the following manner.

Where              Source                                                           Target
Public DNS         Public DNS for SharePoint sites (Ex: sharepoint.contoso.com)     Traffic Manager (contososharepoint.trafficmanager.net)
On-premises DNS    sharepointonprem.contoso.com                                     <Public IP on the on-premises farm>

Traffic Manager settings:

Load balancing method: Failover
Failover priority list:
1. <URL configured for Primary farm>
2. <URL configured for Recovery farm>
Example:
1. sharepointpri.contoso.com
2. sharepointrec.contoso.com

Host a test page on a specific port (e.g. 800) in the SharePoint web tier so that Traffic Manager can automatically detect availability post failover. This is a workaround in case you cannot enable anonymous authentication on any of your SharePoint sites.

Once failover to the recovery cloud happens, this policy takes effect and traffic starts getting routed to the recovery farm.

For internal only sites, skip Traffic Manager and create DNS entries in the following manner.

Where              Source                                                      Target
On-premises DNS    Internal URL (Ex: https://sharepointonprem.contoso.com)     Default site name (Ex: https://<WebTierVMName>) or load balancer IP

Add the URL configured as an alternate access mapping for SharePoint so that no reconfiguration is required post failover [8].

[8] Setting up alternate access mapping in SharePoint

The following pictures illustrate the network topology of the SharePoint application in the Site to Site (E2E) and Site to Azure (E2A) scenarios once complete protection is enabled using Azure Site Recovery.
Figure: Network topology for Site to Site scenario
Figure: Network topology for Site to Azure scenario

Create a recovery plan

You can create a recovery plan in ASR to automate the failover process. Add the app tier and web tier VMs to the recovery plan. Order them in different groups so that the front end shuts down before the app tier.

1. Select the ASR vault in your subscription and click on the ‘Recovery Plans’ tab.
2. Click on ‘Create’ and specify a name.
3. Select the ‘Source’ and ‘Target’. The target can be Azure or a secondary site.
4. Select the app tier and web tier VMs to add to the recovery plan and click ✓.

You can customize the recovery plan for SharePoint server by adding various steps as detailed below. The above snapshot shows the complete recovery plan after adding all the steps.

Steps:

1. SQL Server failover steps
   Refer to the ‘SQL Server DR Solution’ companion guide [9] for details about recovery steps specific to SQL Server.

2. Failover Group 1: Failover the App tier VMs
   Make sure that the recovery point selected is as close as possible to the database point in time (PIT) but not ahead of it.

3. Script 1: Configure availability set (only Site to Azure)
   Add a script (via Azure Automation) after the app tier group comes up to create an availability set and add the app tier VMs to it.

4. Failover Group 2: Failover the web tier VMs
   Fail over the web tier VMs as part of the recovery plan.

5. Script 2: Configure availability set (only Site to Azure)
   Add a script (via Azure Automation) after the web tier group comes up to create an availability set and add the web tier VMs to it.

6. Script 3: Clear SharePoint config cache
   Execute the SharePoint config flush script on each SharePoint VM. In Azure, this can be automated using Automation runbooks. Refer to the SharePoint config flush script in the Appendix section.

7. Manual step 4: Update the DNS records to point to the new farm in Azure
   For internet facing sites, no DNS update should be required post failover. Follow the steps described in the previous section to configure Traffic Manager. If Traffic Manager has been set up as described in the previous section, add a script to open the dummy port (800 in the example) on the recovery side.
   For internal facing sites, add a manual step to update the DNS record to point to the new front-end VM’s load balancer IP.

8. Script 4: Open port 80 (only Site to Azure)
   Add an Azure Automation script to add an HTTP endpoint at port 80 on the front-end VMs. Repeat the same for the Traffic Manager port added in the previous section. Refer to the Open Azure endpoints script in the Appendix section.

9. Manual step 5: Restore search application from a backup or start a new search service

   Restore Search Service Application from a backup [10]
   1. This method assumes that a backup of the Search Service Application was performed prior to the catastrophic event and that the backup is available at the DR site. This can easily be achieved by scheduling the backup (for example, once daily) and using a copy procedure to place the backup at the DR site. Copy procedures could include scripted programs such as AzCopy (Azure Copy) or setting up DFSR (Distributed File System Replication).
   2. Now that the SharePoint farm is running, navigate to Central Administration, Backup and Restore, and select Restore. The restore will interrogate the backup location specified (you may need to update the value). Select the Search Service Application backup you would like to restore.
   3. Search will be restored. Keep in mind that the restore expects to find the same topology (same number of servers) and the same hard drive letters assigned to those servers. For more information see Restore

   Start with a new Search Service Application
   1. This method assumes that a backup of the “Search Administration” database is available at the DR site.
   2. Since the other Search Service Application databases are not replicated, they will need to be re-created. To do so, navigate to Central Administration and delete the Search Service Application. On any servers which host the Search Index, delete the index files.
   3. Re-create the Search Service Application; this will re-create the databases. It is strongly recommended to have a prepared script that will re-create this service application, since it is not possible to perform all actions via the GUI. For example, setting the index drive location and configuring the search topology are only possible by using SharePoint PowerShell cmdlets. Use the Windows PowerShell cmdlet Restore-SPEnterpriseSearchServiceApplication and specify the log-shipped and replicated Search Administration database, Search_Service__DB. This cmdlet restores the search configuration, schema, managed properties, rules, and sources, and creates a default set of the other components.
   4. Once the Search Service Application has been re-created, you must start a full crawl for each content source to restore the Search Service. Note that you lose some analytics information from the on-premises farm, such as search recommendations.

[9] Protect SQL Server
[10] Restore Search application in SharePoint 2013

Perform a Test Failover

Refer to the ‘AD DR Solution’ [11] and ‘SQL Server DR Solution’ [12] companion guides for considerations specific to AD and SQL Server respectively during a test failover.

1. Go to the Azure management portal and select your Site Recovery vault.
2. Click on the recovery plan created for SharePoint.
3. Click on ‘Test Failover’.
4. Select the virtual network to start the test failover process.
5. Once the secondary environment is up, you can perform your validations.
6. Once the validations are complete, you can select ‘Validations complete’ and the test failover environment will be cleaned up.

[11] Protect AD
[12] Protect SQL Server

Perform an Unplanned Failover

1. Go to the Azure management portal and select your Site Recovery vault.
2. Click on the recovery plan created for SharePoint.
3. Click on ‘Failover’ and select ‘Unplanned Failover’.
4. Select the target network and click ✓ to start the failover process.

Perform a Planned Failover

1. Go to the Azure management portal and select your Site Recovery vault.
2. Click on the recovery plan created for SharePoint.
3. Click on ‘Failover’ and select ‘Planned Failover’.
4. Select the target network and click ✓ to start the failover process.

Perform a Failback

Refer to the ‘SQL Server DR Solution’ [13] companion guide for considerations specific to SQL Server during failback.

1. Go to the Azure management portal and select your Site Recovery vault.
2. Click on the recovery plan created for SharePoint.
3. Click on ‘Failover’ and select planned/unplanned failover.
4. Click on ‘Change Direction’.
5. Select the appropriate options: data synchronization and VM creation options.
6. Click ✓ to start the ‘Failback’ process.

[13] Protect SQL Server

Best Practices

Capacity planning and readiness assessment

Hyper-V site
Use the Capacity Planner tool [14] to design the server, storage, and network infrastructure for your Hyper-V Replica environment.

Azure
You can run the Azure Virtual Machine Readiness Assessment tool [15] on VMs to ensure that they are compatible with Azure VMs and Azure Site Recovery Services. The Readiness Assessment tool checks VM configurations and warns when configurations are incompatible with Azure. For example, it issues a warning if a C: drive is larger than 127 GB.

Capacity planning is made up of at least two important components:

• Mapping on-premises Hyper-V VMs to Azure VM sizes (such as A6, A7, A8, and A9).
• Determining the required Internet bandwidth.

Implementation Checklist

Step 1
• Create an Azure Site Recovery vault in the Microsoft Azure subscription.
• Check the prerequisites to protect your SharePoint server farm.
Step 2
• Hyper-V only step - Download the Microsoft Azure Site Recovery Provider, and install it on the VMM server / Hyper-V host.
• VMware only step - Configure the Protection server, Configuration server, and Master Target servers appropriately.

Step 3
• Prepare resources. Add an Azure Storage account.
• Hyper-V only step - Download the Microsoft Azure Recovery Services Agent, and install it on the Hyper-V host servers.
• VMware only step - Make sure the mobility service is installed on all the VMs.

Step 4
• Enable protection for VMs in VMM clouds / Hyper-V sites / VMware sites.

Step 5
• Map resources. Map on-premises networks to the Azure VNET.

Step 7
• Create the recovery plan.
• Perform a test failover using the recovery plan.
• Ensure that all VMs have access to required resources, such as Active Directory.
• Ensure that network redirections for SharePoint are working.

Step 8
• Perform a DR drill using planned and unplanned failovers.
• Ensure that all VMs have access to required resources, such as Active Directory.
• Ensure that network redirections for SharePoint are working.

[14] Hyper-V Replica Capacity Planner tool
[15] Azure Virtual Machine Readiness Assessment tool

Summary

Using Azure Site Recovery, you can create a complete automated disaster recovery plan for your SharePoint server farm. You can initiate the failover within seconds from anywhere in the event of a disruption and get the application up and running in a few minutes.
Appendix (Scripts)

Script to flush SharePoint Cache

    # Load the SharePoint snap-in if it is not already loaded
    Add-PSSnapin -Name Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

    # Stop the timer service before touching the configuration cache
    Stop-Service SPTimerV4

    # Find the configuration cache folder (the one that contains cache.ini)
    $folders = Get-ChildItem C:\ProgramData\Microsoft\SharePoint\Config
    foreach ($folder in $folders) {
        $items = Get-ChildItem $folder.FullName -Recurse
        foreach ($item in $items) {
            if ($item.Name.ToLower() -eq "cache.ini") {
                $cachefolder = $folder.FullName
            }
        }
    }

    # Delete the cached XML files
    $cachefolderitems = Get-ChildItem $cachefolder -Recurse
    foreach ($cachefolderitem in $cachefolderitems) {
        if ($cachefolderitem -like "*.xml") {
            $cachefolderitem.Delete()
        }
    }

    # Reset cache.ini to 1
    Set-Content -Path $cachefolder\cache.ini -Value 1

    Read-Host "Do this on all your SharePoint Servers - and THEN press ENTER"
    Start-Service SPTimerV4

Script to Open End Points in Azure

    workflow OpenPort80
    {
        param (
            [Object]$RecoveryPlanContext
        )

        $Cred = Get-AutomationPSCredential -Name 'contosoLogin'

        # Connect to Azure
        $AzureAccount = Add-AzureAccount -Credential $Cred
        Select-AzureSubscription -SubscriptionName "DevTesting2"

        $AEProtocol = "TCP"
        $AELocalPort = 80
        $AEPublicPort = 80

        Write-Output $AzureAccount

        $vmMap = $RecoveryPlanContext.VmMap.PsObject.Properties

        foreach ($VMProperty in $vmMap)
        {
            $VM = $VMProperty.Value
            $AEName = "Port80" + $VM.RoleName
            Write-Output ("Processing for " + $VM.RoleName)
            $vms = Get-AzureVM
            Write-Output $vms

            # Invoke pipeline commands within an InlineScript
            $EndpointStatus = InlineScript {
                # Invoke the necessary pipeline commands to add an Azure endpoint to a specified virtual machine
                # This set of commands includes: Get-AzureVM | Add-AzureEndpoint | Update-AzureVM (including necessary parameters)
                $Status = Get-AzureVM -ServiceName $Using:VM.CloudServiceName -Name $Using:VM.RoleName |
                    Add-AzureEndpoint -Name $Using:AEName -Protocol $Using:AEProtocol -PublicPort $Using:AEPublicPort -LocalPort $Using:AELocalPort |
                    Update-AzureVM
                Write-Output $Status
            }

            # Use the next public port for the next VM
            $AEPublicPort = $AEPublicPort + 1
        }
    }
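The OpenPort80 workflow gives the first VM public port 80 and raises the public port by one for every further VM, while the local port stays at 80. A minimal sketch of that allocation, without any calls to Azure, can help you check which public ports to expect before running the workflow. The function name Get-EndpointPlan and the VM names SPWeb1 and SPWeb2 are hypothetical and only used for illustration.

```powershell
# Hypothetical helper, not part of the original scripts: lists the endpoints
# the OpenPort80 workflow would request, without contacting Azure.
function Get-EndpointPlan {
    param (
        [string[]]$RoleNames,        # VM role names, in the order the workflow visits them
        [int]$LocalPort = 80,        # port the site listens on inside each VM
        [int]$FirstPublicPort = 80   # public port given to the first VM
    )

    $publicPort = $FirstPublicPort
    foreach ($roleName in $RoleNames) {
        [pscustomobject]@{
            Name       = "Port$LocalPort" + $roleName
            Protocol   = 'TCP'
            LocalPort  = $LocalPort
            PublicPort = $publicPort
        }

        # Same rule as the workflow: the next VM gets the next public port
        $publicPort = $publicPort + 1
    }
}

# Illustrative VM names only
Get-EndpointPlan -RoleNames 'SPWeb1', 'SPWeb2' | Format-Table
```

With these inputs, only the first VM is reachable on public port 80 and the second on 81, so it is worth confirming that this matches how you expect clients and the Traffic Manager probe to reach the front-end VMs. Passing -LocalPort 800 and -FirstPublicPort 800 would sketch the same allocation for the test page port, assuming you reuse the workflow logic for it.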