June 2013
Michael Otey

The Path to Five 9s

Special Advertising Supplement to Windows IT Pro
Sponsored by NEC and Intel

Without a doubt, availability is the DBA’s first priority. Even performance ceases to matter if the database isn’t available. High availability isn’t just for the enterprise. Many small and medium-sized organizations have the same availability needs, but they may not have the same resources as larger enterprises. SQL Server offers several high availability features that each provide a different level of availability to your organization. Each option has different costs and a different level of complexity. In addition, hardware-based fault tolerant solutions like NEC’s FT Server systems can also provide very high levels of availability for your mission-critical database applications. NEC’s FT Server systems can provide up to 99.999% availability right out of the box using specially designed fault tolerant hardware that’s completely compatible with all the current Windows Server and SQL Server releases.

The availability solution that’s right for your organization depends a lot on the size and characteristics of your business. In this whitepaper you’ll see how fault tolerant systems like NEC’s FT Server can provide five 9s of availability without the complexity and associated licensing costs of some of SQL Server’s built-in high availability options. You’ll also get an overview of the different SQL Server high availability technologies, starting from the lowest levels of availability and working to the highest.

Availability and Fault Tolerance

There are several different technologies that you can use to provide high availability to your organization. The right solution depends on the needs of your organization.

Figure 1 - Availability and Fault Tolerance

If your organization’s mission-critical systems cannot tolerate downtime, then you definitely need to look into fault tolerant systems like NEC’s FT Server.
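An availability percentage translates directly into an allowance of downtime per year. The short Python sketch below shows the arithmetic; the function name and the choice of a 365-day year are illustrative assumptions, not something taken from the whitepaper.

```python
# Minutes in a non-leap year (assumption: 365 days).
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime, in minutes, implied by an availability percentage."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


for label, percent in [("three 9s", 99.9), ("four 9s", 99.99), ("five 9s", 99.999)]:
    minutes = downtime_minutes_per_year(percent)
    print(f"{label} ({percent}%): {minutes:.2f} minutes/year, {minutes / 60:.2f} hours/year")
```

Under these assumptions, 99.999% works out to roughly 5.26 minutes of downtime per year and 99.9% to roughly 8.76 hours.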
You can see an IDC comparison of the availability delivered by fault tolerant systems versus clustering solutions and standalone servers in Figure 1. Fault tolerant systems are designed from the ground up to provide extreme availability by using fully redundant system components, and they can provide continuous availability even in the event of a system failure. Fault tolerant systems can provide up to 99.999% uptime, which equates to just a little more than 5 minutes of downtime per year. Other solutions are not capable of providing these same levels of availability. For instance, while clustering solutions like Microsoft Windows Failover Clusters can provide up to 99.9% uptime, they will still typically have more than 8 hours of downtime per year.

Five 9s of Availability with Fault Tolerant Systems

NEC is currently offering its 6th generation of FT servers and has been providing fault tolerant technology to the market since 2000. NEC’s FT Servers are capable of providing even higher levels of availability than software-only availability solutions. They are designed with built-in redundancy, enabling them to withstand component and hardware failures with no loss of availability. In addition, they do not have the same degree of management complexity as SQL Server’s built-in software-based solutions. There is no need to create clusters or to implement other involved solutions. NEC’s FT Server systems provide very high levels of availability right out of the box. Managing the FT Server systems is almost identical to managing any typical x86 or x64 Windows Server system. The FT Server systems run off-the-shelf versions of the Windows and Linux operating systems, and no application changes are required. In addition, the FT Server systems provide these high levels of availability no matter what edition of the Windows Server operating system or SQL Server you are running.
You do not have to be using the more expensive SQL Server Enterprise edition. This combination of management simplicity and extremely high out-of-the-box availability makes the NEC FT Servers a particularly good fit for small and medium-sized businesses and branch office scenarios where the organization may not have the expertise required to configure and operate a Windows Server cluster. You can see an example of one of NEC’s FT Server systems in Figure 2.

Figure 2 - NEC’s Fault Tolerant (FT) Server

NEC Express5800 FT Server

First introduced in 2000 and now in its sixth generation, the NEC Express5800/R320 is the product of a hardware partnership between NEC and Intel®, and of a software partnership with Intel®. It is a fault tolerant server that’s ideally suited for mission-critical virtualization, database, and e-mail services. The R320 is a 4U rack-mounted server with two six-core Intel® Xeon® CPU modules that are kept in lockstep. The CPU lockstep design results in a fully fault tolerant server that can endure a CPU, motherboard, network, or storage hardware failure with no interruption to applications and end users. The continuous high availability of the Express5800 works transparently with applications, operating systems, and virtual server software, and it eliminates the need for host-based clustering software, cluster-aware applications, and SAN-attached storage.

Figure 3 - The FT Server System Architecture
Express5800/R320 Series Specifications:
• Fully redundant 4U chassis
• One or two socket Multi-Core Intel® 5500 and 5600 series Xeon® Processors
• Up to 96 GB of memory
• Up to 2.4 TB of storage
• Two or four PCIe slots per customer replaceable unit

The high-end Express5800/R320c’s optional ExpressCluster X feature provides additional protection for software services by detecting when the application stops and then automatically restarting it. NEC’s ExpressCluster technology can be used to provide disaster recovery services and can increase uptime beyond five 9s.

It’s important to point out that these are the logical specifications for each of the different FT Server system models. The NEC FT Server systems employ completely redundant server components, so the actual physical count for each of these specifications is doubled.

FT Server Systems Architecture

The NEC FT Server systems provide complete redundancy for all of the system components, including the motherboard, CPU, RAM, and power supplies. NEC’s FT Server systems use a proprietary lockstep technology to keep all of the duplicate system components completely in sync. This lockstep technology ensures that the redundant CPUs are executing exactly the same instructions. Because the redundant components are completely in sync all of the time, there is no interruption in processing even if a component fails. In addition, there is no loss of performance and zero loss of data integrity. If a component failure occurs, the FT Server systems automatically switch over to using the redundant components. In Figure 3 you can see an overview of the fault tolerant architecture used by the NEC FT Server systems.

The NEC FT Series servers provide nonstop operation in the event of hardware failures. They also provide the ability to easily conduct system repairs without interrupting daily operations.
As you can see in Figure 3, the NEC FT servers have a different architecture from a normal x64 system. The FT Servers have two identical component groups called CPU/IO modules. Each CPU/IO module consists of the same components as a normal general-purpose server, and each is capable of running as a single server. NEC’s LSI GeminiEngine provides fault tolerance by keeping the two CPU/IO modules in sync while monitoring for system faults and isolating any failed components. The CPU subsystems work in lockstep, and the GeminiEngine processes the requests from the two subsystems as one request. This allows the system to operate as if there were only one CPU subsystem in operation. Likewise, the two IO subsystems share a common structure, and one serves as a standby during normal operations. The storage drivers detect any failures in the active I/O devices and will automatically fail over to the standby subsystem in the event of an error.

Considerations for the NEC FT Server Systems

NEC’s FT Server systems provide several important availability features.
• Zero data loss or corruption
• Zero downtime and no loss of connectivity from a hardware failure
• Completely transparent to the OS and applications
• Lower cost than cluster-based solutions
• No complicated configuration and licensing like clustering solutions
• No complex and expensive applications

NEC Fault Tolerant Server

As you might expect, the NEC FT Server system’s redundant system components and fault tolerant architecture make it somewhat more expensive than a standard x64 server that does not have fault tolerant capabilities. However, this extra cost is often offset by the reduced complexity and lower licensing costs provided by the FT Server systems.
Deploying and managing the FT Server systems is exactly like deploying a standard server: there is no added technology complexity required to realize its higher levels of availability. In addition, unlike Windows Server Clustering and AlwaysOn Availability Groups, there is no need to purchase multiple server systems along with the extra required licenses for the Windows Server operating system. There is also no need to purchase the more expensive SQL Server Enterprise edition in order to reach five 9s of availability. It’s important to point out that while the FT Server systems deliver five 9s of availability out of the box, they can also be combined with any of the other SQL Server high availability technologies.

SQL Server Availability Technologies

Microsoft provides a number of different technologies that can be used to increase the availability of SQL Server instances. In the following section you’ll learn about the different high availability technologies that Microsoft provides for SQL Server. For each of these technologies you’ll see the type of downtime that it is primarily intended to address as well as its requirements and limitations.

Log Shipping

Log Shipping is one of the most basic and important SQL Server availability technologies. While Windows Failover Clustering and Database Mirroring are focused primarily on increasing availability, log shipping is primarily designed as a disaster recovery technology. Log shipping provides protection at the database level, and it works by forwarding transaction log entries from the primary server to one or more secondary servers. Like Database Mirroring, log shipping doesn’t require any specialized hardware and can be implemented on any system that’s capable of running SQL Server. Log shipping is supported by the SQL Server Standard Edition and higher. You can see an overview of SQL Server log shipping in Figure 4.

Figure 4 - Log Shipping

Log shipping is implemented using a primary SQL Server system and one or more secondary SQL Server systems. The primary server contains the production database. Log shipping is initialized by taking a full backup of the database that will be protected on the primary server. That database backup is then restored to the secondary servers. Then a SQL Server Agent job runs a stored procedure on the primary system that performs transaction log backups of the production database. These transaction log backups are forwarded to the secondary servers. Another SQL Server Agent job runs on the secondary servers to periodically apply the transaction log backups to the backup database on the secondary servers.

Considerations for Log Shipping

Unlike Windows Failover Clustering, AlwaysOn Availability Groups, or Database Mirroring, SQL Server Log Shipping has no automated failover or failback process. All of the steps to connect end users to the standby server must be performed manually. There can be substantial downtime associated with log shipping, and data loss is possible. In addition, because the protection provided by Log Shipping is at the database level, you must manually ensure that the secondary SQL Server systems have compatible server-level configuration settings. You must manually provide any logins and system database settings needed to support the application.

Database Mirroring

Database Mirroring is a high availability technology that’s primarily designed to protect against unplanned downtime, and it can also be implemented as a disaster recovery technology. Database Mirroring provides database-level protection for a specific database. It provides a higher level of availability than log shipping because it provides automated failover with minimal downtime.
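The log shipping flow described earlier (full backup and restore, then periodic log backup, copy, and restore jobs) can be illustrated with a small Python simulation. This is a toy model under stated assumptions, not SQL Server code: each database is just an ordered list of committed transactions, and the class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LogShippingPair:
    """Toy model: each 'database' is an ordered list of committed transactions."""
    primary: list = field(default_factory=list)
    secondary: list = field(default_factory=list)
    backed_up_to: int = 0                                # primary log position covered by log backups
    copied_backups: list = field(default_factory=list)   # log backups waiting on the secondary

    def initialize(self) -> None:
        """Full backup of the primary, restored onto the secondary."""
        self.secondary = list(self.primary)
        self.backed_up_to = len(self.primary)

    def commit(self, txn: str) -> None:
        """A new transaction commits on the primary (production) database."""
        self.primary.append(txn)

    def backup_and_copy_job(self) -> None:
        """Back up the log records written since the last backup and forward them."""
        new_records = self.primary[self.backed_up_to:]
        if new_records:
            self.copied_backups.append(new_records)
            self.backed_up_to = len(self.primary)

    def restore_job(self) -> None:
        """Apply every forwarded log backup to the secondary, in order."""
        for backup in self.copied_backups:
            self.secondary.extend(backup)
        self.copied_backups.clear()

    def not_yet_shipped(self) -> list:
        """Transactions that would be lost if the primary failed right now."""
        return self.primary[self.backed_up_to:]


pair = LogShippingPair()
pair.commit("t1")
pair.commit("t2")
pair.initialize()            # full backup + restore
pair.commit("t3")
pair.backup_and_copy_job()   # scheduled job on the primary
pair.restore_job()           # scheduled job on the secondary
pair.commit("t4")            # committed after the last log backup
print(pair.secondary)
print(pair.not_yet_shipped())
```

In this run the secondary holds t1 through t3, while t4 exists only on the primary. That gap between log backups is the source of the possible data loss noted under the log shipping considerations.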
The automatic failover time is typically 3-5 seconds, and when database mirroring is implemented in High Safety mode it also protects against all data loss. Figure 5 shows an overview of SQL Server’s Database Mirroring.

Figure 5 - Database Mirroring

When Database Mirroring is implemented for high availability, three SQL Server systems are required: the principal server, the mirror server, and a witness server. The principal server is the system that is initially providing the database services. The mirror server maintains a copy of the databases from the principal server. The witness server determines when the principal server becomes unavailable and an automatic failover is required. Database mirroring works by capturing transaction log records from the principal server and automatically forwarding them to the mirror server. On the mirror server the database is in a constant state of recovery and can’t be used until a failover occurs. However, the mirror server is not restricted to just providing mirroring services. The mirror server can also be actively supporting other databases that are not participating in the mirroring operation.

Considerations for Database Mirroring

Database Mirroring is supported on the SQL Server Standard Edition and above. However, the SQL Server Standard Edition is limited to using Database Mirroring only in High Safety mode. To use High Performance mode you must have the SQL Server Enterprise edition. Database Mirroring has several important limitations. It is limited to a single failover partner. While you can set up multiple Database Mirroring partnerships per SQL Server instance, each partnership can only consist of two systems and the optional witness (which doesn’t actually maintain a copy of the mirrored database). Further, Database Mirroring is only capable of protecting a single database at a time.
While this is adequate for some applications, it doesn’t adequately protect more complex multi-database applications. In addition, you have to choose whether Database Mirroring is implemented synchronously or asynchronously. You can’t use both; you have to pick one or the other. Finally, the databases on the mirror server are in a state of constant recovery and can’t be directly accessed. To use the data in the mirrored database you must take point-in-time snapshots of the mirrored databases. In addition, because Database Mirroring is not a server-level technology, you must manually make sure that each server has the required server-level configuration, logins, and system database settings to support the application that makes use of the mirrored database.

Failover Clustering

Windows Failover Clustering is Microsoft’s primary high availability technology. It is designed to address server-level and site-level unplanned downtime, and it provides automated failover with very high levels of availability. A Windows Server Failover Cluster is composed of multiple systems, where each system is configured using the Windows Failover Clustering feature. In addition, Windows Server Failover Clustering requires a shared storage solution. Each physical server that participates in the cluster is called a node. If one node in a cluster fails, another node in the cluster can automatically take over the services running on the failed node. This process is called a failover. When the failed node is restored, you can fail back the service to the original node. With Windows Server 2012, Failover Clustering is provided in both the Windows Server Standard and Datacenter editions, with support for up to 64 nodes in a cluster.
For Windows Server 2008 and 2008 R2, Windows Failover Clustering is only provided in the Windows Server Enterprise Edition and higher, and there is support for clusters of up to 16 nodes. The number of nodes that are allowed in a cluster also depends on the edition of SQL Server that is in use. The SQL Server 2012 Enterprise edition supports the maximum number of nodes supported by the Windows Server operating system. The SQL Server 2012 Business Intelligence and Standard editions are limited to two-node clusters. The SQL Server 2008 R2 Enterprise edition supports clusters of up to 16 nodes, while the SQL Server 2008 R2 Standard Edition supports two-node clusters. You can see an overview of Windows Failover Clustering in Figure 6.

Figure 6 - Windows Server Failover Cluster and SQL Server

Figure 6 illustrates a two-node cluster. Each cluster node requires the installation of the Windows Server operating system on a local hard drive. In addition, for Windows Server 2012, 2008 R2, and 2008 you must install the Failover Cluster Feature. The cluster requires a private TCP/IP network to send the heartbeat between cluster nodes. The cluster’s heartbeat determines if a cluster node has failed. Networked clients connect to the clustered resources using the organization’s external network. A Windows Failover Cluster also requires shared storage for the clustered services and the cluster quorum. During the failover process, any client systems connected to the failed node are disconnected. When the failover is complete, client systems will reconnect to the cluster resources running on the backup node.

Considerations for Failover Clustering

Windows Server Failover Clustering is an effective technology to provide protection from server-level unplanned downtime. However, to accomplish that you need to deal with the cost and complexity of the Windows Failover Clustering technology.
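The heartbeat mechanism mentioned above can be sketched in a few lines of Python. This is only an illustration of the general idea of timeout-based failure detection followed by failover; the class, the function, the node names, and the 5-second timeout are all hypothetical and do not reflect how Windows Server Failover Clustering is actually implemented or configured.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each node (timestamps passed in explicitly)."""

    def __init__(self, timeout_seconds: float):
        self.timeout_seconds = timeout_seconds   # hypothetical threshold
        self.last_heartbeat = {}

    def record_heartbeat(self, node: str, timestamp: float) -> None:
        self.last_heartbeat[node] = timestamp

    def failed_nodes(self, now: float) -> list:
        """A node is considered failed once its heartbeat is older than the timeout."""
        return sorted(node for node, seen in self.last_heartbeat.items()
                      if now - seen > self.timeout_seconds)


def plan_failover(owners: dict, failed: list, healthy: list) -> dict:
    """Reassign every service owned by a failed node to the first healthy node."""
    if not healthy:
        return dict(owners)   # nothing to fail over to
    return {service: (healthy[0] if node in failed else node)
            for service, node in owners.items()}


monitor = HeartbeatMonitor(timeout_seconds=5.0)
monitor.record_heartbeat("node1", 0.0)
monitor.record_heartbeat("node2", 0.0)
monitor.record_heartbeat("node2", 4.0)   # node1 goes silent after t=0

failed = monitor.failed_nodes(now=6.0)
healthy = [node for node in monitor.last_heartbeat if node not in failed]
owners = plan_failover({"SQL Server": "node1"}, failed, healthy)
print(failed, owners)
```

Here node1 misses its heartbeat window, so the clustered service it owned is reassigned to node2, which mirrors the failover behavior the section describes.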
From the hardware standpoint, you need to purchase a minimum of two systems complete with Windows Server licenses. You also need a private and a public network as well as shared storage that can be accessed by all of the nodes in the cluster. While configuring Windows Server Failover Clusters is easier than it was in the past, it can still be challenging, especially for small and medium-sized businesses that may not have a great deal of in-house technical expertise.

The length of time for a failover depends on the level of database activity that was occurring at the time of failover. SQL Server writes all of its database transactions to the transaction log. In the event of a failover, all committed transactions in the transaction log must be applied to the database and all uncommitted transactions must be rolled back to ensure database integrity. The length of time that this recovery period takes varies depending on the performance capabilities of the system and on the number of transactions in the log. Larger, more active database applications will have more transactions and will therefore take longer to complete the restart process than a smaller system that has fewer transactions to recover.

AlwaysOn Availability Groups

SQL Server AlwaysOn Availability Groups were first introduced with SQL Server 2012, and they are essentially the next evolution of Database Mirroring. AlwaysOn Availability Groups provide high availability for multiple databases. They are capable of automated failover, and the downtime is typically 3-5 seconds. Availability Groups provide database-level protection, and they can protect against planned and unplanned downtime. SQL Server 2012 AlwaysOn Availability Groups support up to four secondary replicas, where each replica is located on a separate SQL Server instance running on a different Windows Failover Cluster node. AlwaysOn Availability Groups can contain multiple databases, all of which can be automatically failed over as a unit. This means that AlwaysOn Availability Groups can protect multiple related databases and fail them over simultaneously. AlwaysOn Availability Groups support both synchronous and asynchronous replicas simultaneously. Synchronous connections are typically used in high availability scenarios where there is a requirement for fast automatic failover. Asynchronous connections are typically used in disaster recovery scenarios where there is geographical distance between the different servers. You can see an overview of AlwaysOn Availability Groups in Figure 7.

Figure 7 - SQL Server AlwaysOn Availability Groups

Like Database Mirroring, AlwaysOn Availability Groups work by automatically forwarding transaction log entries from the primary system to the different replicas. However, unlike database mirroring, the AlwaysOn Availability Group replica databases are able to provide read-only access. This enables the replicas to be used for reporting as well as backup purposes, potentially offloading some of the workload and I/O from the primary server.

Considerations for AlwaysOn Availability Groups

AlwaysOn Availability Groups can provide protection from planned and unplanned downtime for multiple databases. However, they also involve a high degree of cost and complexity to implement. First, you must implement a Windows Server Failover Cluster, which can be difficult for small and medium-sized businesses. This means that AlwaysOn Availability Groups require multiple Windows Server and SQL Server instances as well as a shared storage solution. After setting up the Windows Server Failover Cluster you must then set up the SQL Server Availability Groups. In addition, although the failover process is very fast, there is a small amount of downtime associated with AlwaysOn Availability Groups.
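The difference between synchronous and asynchronous replicas comes down to whether the primary waits for a replica before acknowledging a commit. The Python sketch below models that difference in the simplest possible way; it is an illustrative toy with hypothetical class and function names, not a description of SQL Server internals.

```python
class Replica:
    def __init__(self, name: str, synchronous: bool):
        self.name = name
        self.synchronous = synchronous
        self.hardened = []   # log records safely written on this replica
        self.pending = []    # records sent but not yet written (asynchronous only)


class Primary:
    def __init__(self, replicas: list):
        self.log = []
        self.replicas = replicas

    def commit(self, record: str) -> None:
        """Commit on the primary and forward the log record to every replica."""
        self.log.append(record)
        for replica in self.replicas:
            if replica.synchronous:
                replica.hardened.append(record)   # commit waits for this replica
            else:
                replica.pending.append(record)    # commit does not wait

    def deliver_pending(self) -> None:
        """Asynchronous replicas catch up some time after the commit."""
        for replica in self.replicas:
            replica.hardened.extend(replica.pending)
            replica.pending.clear()


def lost_on_failover(primary: Primary, replica: Replica) -> list:
    """Records committed on the primary that the replica would not have after a failure."""
    return primary.log[len(replica.hardened):]


sync_replica = Replica("sync", synchronous=True)
async_replica = Replica("async", synchronous=False)
primary = Primary([sync_replica, async_replica])

primary.commit("r1")
primary.deliver_pending()   # the asynchronous replica catches up with r1
primary.commit("r2")        # the primary fails before r2 reaches the asynchronous replica

print(lost_on_failover(primary, sync_replica))
print(lost_on_failover(primary, async_replica))
```

In this model the synchronous replica never lags the primary, while the asynchronous replica can be missing the most recent commits at the moment of failure. That trade-off is why asynchronous replicas are a better fit for geographically distant disaster recovery sites than for zero-data-loss failover.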
Also, like database mirroring, there is the possibility of data loss when using asynchronous replicas. While replica databases can be actively used to offload backup and reporting workloads, replicas that are actively used require additional SQL Server licenses.

The Road to Five 9s

Maximizing availability is one of the most important goals for the database administrator. Microsoft provides several high availability technologies, such as Log Shipping, Database Mirroring, Windows Failover Clustering, and AlwaysOn Availability Groups. You can see a summary of the different SQL Server availability technologies discussed in this paper in Figure 8. While SQL Server’s built-in technologies can provide high levels of availability for your tier 1 database instances, they each have associated costs and complexity. For instance, technologies like SQL Server’s AlwaysOn Availability Groups provide protection from unexpected database-level failures, but there is also significant complexity involved in setting up the prerequisite Windows Failover Cluster. There are also licensing considerations involved in implementing AlwaysOn Availability Groups because you need to have the SQL Server 2012 Enterprise edition. Like many of the SQL Server high availability options, AlwaysOn Availability Groups are not available in the lower-cost SQL Server 2012 Standard edition.

Figure 8 - Summary of SQL Server high availability technologies

FT Server Value Proposition

The NEC FT Server line of systems can provide five 9s of availability right out of the box with no additional complexity or licensing cost beyond what you would pay for a standard x64 server. The key value propositions offered by NEC’s FT Server solutions are:
• Continuous Availability: The FT Server line provides continuous availability and 99.999% uptime right out of the box.
Any hardware failures are completely transparent to the end user and there is no data loss.
• Mission Critical Application Protection: NEC’s fault tolerant FT Servers provide protection from the single point of failure problem that can occur in virtualization hosts that run many VMs or in mission-critical database servers that support many applications. The FT Server’s fault tolerant design enables the host to withstand failures and provide continuous availability for these mission-critical applications.
• Operational Simplicity: Operating and managing NEC’s FT Server systems is almost identical to managing a standard standalone server. There’s no need to learn new technologies or to perform complex clustering configurations.
• Reduced Maintenance Costs: Hot swap technology allows maintenance to be performed with no end user interruption. The FT Server systems incorporate a Customer Replaceable Unit (CRU) design that allows IT personnel to replace most system components on site.
• Reduced TCO: NEC’s FT Servers provide greater uptime and reduced management time, as well as reduced infrastructure and software licensing costs, compared to other availability solutions.

Server Cost Comparison

This combination of simplicity and high availability makes the NEC FT Server systems well suited for small and medium-sized businesses or remote office deployments that require high availability but may lack the resources or technical capabilities to implement more complex solutions.