June 2013
Michael Otey
The Path to Five 9s
Special Advertising Supplement to Windows IT Pro
Without a doubt, availability is the DBA’s first priority. Even performance ceases to matter if the database isn’t available. High availability isn’t just for the enterprise. Many small and medium-sized organizations have the same availability needs but may not have the same resources as larger enterprises. SQL Server offers several high availability features, each of which provides a different level of availability at a different cost and level of complexity. In addition, hardware-based fault tolerant solutions like NEC’s FT Server systems can also provide very high levels of availability for your mission critical database applications. NEC’s FT Server systems can provide up to 99.999% availability right out of the box using specially designed fault tolerant hardware that’s completely compatible with all the current Windows Server and SQL Server releases. The availability solution that’s right for your organization depends a lot on the size and characteristics of your business. In this whitepaper you’ll see how fault tolerant systems like NEC’s FT Server can provide five 9s of availability without the complexity and associated licensing costs of some of SQL Server’s built-in high availability options. You’ll also get an overview of the different SQL Server high availability technologies, starting from the lowest levels of availability and working to the highest.
Availability and Fault Tolerance
There are several different technologies
that you can use to provide high availability to your organization. The right solution
depends on the needs of your organization.
Sponsored by NEC and Intel
Figure 1 - Availability and Fault Tolerance
If your organization’s mission critical systems cannot tolerate downtime then you definitely need to look into fault tolerant systems like NEC’s FT Server. You can see an IDC comparison of the availability delivered by fault tolerant systems versus clustering solutions and standalone servers in Figure 1.
Fault tolerant systems are designed from the ground up to provide extreme availability by using fully redundant system components and can provide continuous availability even in the event of a system failure. Fault tolerant systems can provide up to 99.999% uptime, which equates to just a little more than 5 minutes of downtime per year. Other solutions are not capable of providing these same levels of availability. For instance, while clustering solutions like Microsoft Windows Failover Clusters can provide up to 99.9% uptime they will still typically have more than 8 hours of downtime per year.
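The downtime figures quoted above follow directly from the availability percentages. The short Python sketch below shows the arithmetic; the function name and the choice of an average 365.25-day year are illustrative assumptions, not part of the original paper.

```python
# Assumption: an average year of 365.25 days (leap years included).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime implied by an availability percentage."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return (1.0 - availability_percent / 100.0) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Three 9s, four 9s, and five 9s of availability.
    for pct in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(pct)
        print(f"{pct}% uptime: {minutes:.1f} minutes ({minutes / 60:.2f} hours) per year")
```

Under these assumptions, five 9s works out to a little over 5 minutes per year and three 9s to a little under 9 hours, which matches the figures in the text.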
Five 9s of Availability
with Fault Tolerant Systems
NEC is currently offering their 6th generation of FT servers. They have been providing fault tolerant technology to the market
since 2000. NEC’s FT Servers are capable
of providing even higher levels of availability than software-only based availability
solutions. NEC’s FT Servers are designed
with built-in redundancy enabling them to
withstand component and hardware failure with no loss of availability. In addition, they do not have the same degree of
management complexity as SQL Server’s
built-in software-based solutions. There is
no need to create clusters or to implement
other involved solutions. NEC’s FT Server
systems provide very high levels of availability right out-of-the-box. Managing the
FT Server systems is almost identical to
managing any typical x86 or x64 Windows
Server system. The FT Server systems run
off-the-shelf versions of the Windows and
Linux operating systems and no application changes are required. In addition, the
FT Server systems provide these high levels of availability no matter what edition
of the Windows Server operating system
or SQL Server that you are running. You do not have to be using the more expensive SQL Server Enterprise edition. This combination of management simplicity and extremely high out-of-the-box availability makes the NEC FT Servers a particularly good fit for small and medium-sized business and branch office scenarios where the organization may not have the expertise required to configure and operate a Windows Server cluster. You can see an example of one of NEC’s FT Server systems in Figure 2.
Figure 2 - NEC’s Fault Tolerant (FT) Server
NEC Express5800 FT Server
First introduced in 2000 and now in its sixth generation, the NEC Express5800/R320 is the product of a hardware partnership between NEC and Intel®, and of a software partnership with Intel®. It is a fault tolerant server that’s ideally suited for mission-critical virtualization, database, and e-mail services. The R320 is a 4U rack-mounted server with two six-core Intel® Xeon® CPU modules that are kept in lockstep. The CPU lockstep design results in a fully fault tolerant server that can endure a CPU, motherboard, network, or storage hardware failure with no interruption to applications and end-users. The continuous high availability of the Express5800 works transparently with applications, operating systems, and virtual server software as it eliminates the need for host-based clustering software, cluster-aware applications, and SAN attached storage.
Figure 3 - The FT Server System Architecture
Express5800/R320 Series Specifications:
• Fully redundant 4U chassis
• One or two socket Multi-Core Intel® 5500 and 5600 series Xeon® Processors
• Up to 96 GB of memory
• Up to 2.4 TB of storage
• Two or four PCIe slots per customer replaceable unit
The high-end Express5800/R320c’s
optional ExpressCluster X feature provides
additional protection for software services
by detecting when the application stops
and then automatically restarting it. NEC’s
ExpressCluster technology can be used to
provide disaster recovery services and can
increase uptime beyond 5 9s.
It’s important to point out that these are
the logical specifications for each of the
different FT Server system models. The
NEC FT Server systems employ completely redundant server components so the actual physical count for each of these specifications is doubled.
FT Server Systems Architecture
The NEC FT Server systems provide complete redundancy for all of the system components including the motherboard, CPU,
RAM, and power supplies. NEC’s FT Server systems use a proprietary lockstep technology to keep all of the duplicate system components completely in sync. This lockstep technology ensures that the redundant CPUs are executing exactly the same instructions. Because the redundant components are all completely in sync all of the time, there is no interruption in processing even if a component fails. In addition, there is no loss of performance or data integrity. If a
component failure occurs the FT Server systems automatically switch over to using the
redundant components. In Figure 3 you can
see an overview of the fault tolerant architecture used by the NEC FT Server systems.
The NEC FT Series servers provide nonstop operation in the event of hardware failures. They also provide the ability to easily conduct system repairs without interrupting daily operations. As you can see in Figure 3, the NEC FT servers have a different architecture from a normal x64 system. The FT Servers have two identical component groups called CPU/IO modules. Each CPU/IO module consists of the same components as a normal general purpose server and each is capable of running as a single server. NEC’s LSI GeminiEngine provides fault tolerance by keeping the two CPU/IO modules in sync while monitoring for system faults and isolating any failed components.
The CPU subsystems work in lockstep and the GeminiEngine processes the
requests from the two subsystems as one
request. This allows the system to operate
as if there was only one CPU subsystem
in operation. Likewise, the two IO subsystems share a common structure and
one serves as a standby during normal
operations. The storage drivers detect any
failures in the active I/O devices and will
automatically fail over to the standby subsystem in the event of an error.
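As a rough illustration of the idea described above, the following Python sketch models two redundant modules that execute the same work and are presented to the caller as a single system. It is a toy model only: the class names are hypothetical, and it makes no claim about how NEC’s GeminiEngine hardware is actually implemented.

```python
class Module:
    """Hypothetical stand-in for one CPU/IO module."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.accumulator = 0

    def execute(self, value):
        if not self.healthy:
            raise RuntimeError(f"{self.name} has failed")
        self.accumulator += value
        return self.accumulator


class LockstepPair:
    """Runs every operation on all surviving modules and returns one result."""

    def __init__(self):
        self.modules = [Module("module-0"), Module("module-1")]

    def execute(self, value):
        results = []
        for module in list(self.modules):
            try:
                results.append(module.execute(value))
            except RuntimeError:
                # Isolate the failed module; the survivor keeps running.
                self.modules.remove(module)
        if not results:
            raise RuntimeError("both modules have failed")
        return results[0]


if __name__ == "__main__":
    pair = LockstepPair()
    print(pair.execute(1))            # both modules do the same work
    pair.modules[0].healthy = False   # simulate a hardware fault
    print(pair.execute(2))            # the caller sees no interruption
    print(len(pair.modules))          # one module remains in service
```

Because both modules hold identical state at every step, the surviving module can continue from exactly where the failed one left off, which is the property the text attributes to lockstep operation.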
Considerations
for the NEC FT Server Systems
NEC’s FT Server systems provide several important availability features.
• Zero data loss or corruption
• Zero downtime and no loss of connectivity from a hardware failure
• Completely transparent to the OS and applications
• Lower cost than cluster-based solutions
• No complicated configuration and licensing as with clustering solutions
• No complex and expensive applications
NEC Fault Tolerant Server
As you might expect, the NEC FT Server system’s redundant system components and fault tolerant architecture make it somewhat more expensive than a standard x64 server that does not have fault tolerant capabilities. However, this extra cost is often offset by the reduced complexity and lower licensing costs provided by the FT Server systems. Deploying and managing the FT Server systems is exactly like deploying a standard server; there is no added technology complexity required to realize its higher levels of availability. In addition, unlike Windows Server Clustering and AlwaysOn Availability Groups, there is no need to purchase multiple server systems along with the extra required licenses for the Windows Server operating system. There is also no need to purchase the more expensive SQL Server Enterprise edition in order to reach 5 nines of availability. It’s important to point out that while the FT Server systems deliver 5 nines of availability out of the box they can also be combined with any of the other SQL Server high availability technologies.
SQL Server Availability Technologies
Microsoft provides a number of different technologies that can be used to increase the availability of SQL Server instances. In the following section you’ll learn about the different high availability technologies that Microsoft provides for SQL Server. For each of these technologies you’ll see the type of downtime that it is primarily intended to address as well as its requirements and limitations.
Log Shipping
Log Shipping is one of the most basic SQL Server availability technologies. While Windows Failover Clustering and Database Mirroring are focused primarily on increasing availability, log shipping is primarily designed as a disaster recovery technology. Log shipping provides protection at the database level. It works by forwarding transaction log entries from the primary server to one or more secondary servers. Like Database Mirroring, log shipping doesn’t require any specialized hardware and can be implemented on any system that’s capable of running SQL Server. Log shipping is supported by the SQL Server Standard Edition and higher. You can see an overview of SQL Server log shipping in Figure 4.
Figure 4 - Log Shipping
Log shipping is implemented using a primary SQL Server system and one or more
secondary SQL Server systems. The primary
server contains the production database. Log
shipping is initialized by taking a full backup
of the database that will be protected on
the primary server. That database backup is
then restored to the secondary servers. Then
a SQL Server Agent job runs a stored procedure on the primary system that performs
transaction log backups of the production
database. These transaction log backups are
forwarded to the secondary servers. Another
SQL Server Agent job runs on the secondary
servers to periodically apply the transaction
log backups to the backup database on the
secondary servers.
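The cycle described above (full backup and restore, then repeated log backup, copy, and apply) can be sketched as a small in-memory simulation. This is a simplified model written in Python, not SQL Server code; the class and method names are hypothetical and chosen only for illustration.

```python
class LogShippingPair:
    """Toy model of one primary and one secondary database."""

    def __init__(self):
        self.primary_rows = []
        self.secondary_rows = []
        self.pending_log = []   # changes made since the last log backup
        self.shipped = []       # log backups copied but not yet applied

    def initialize(self):
        # Full backup of the primary, restored on the secondary.
        self.secondary_rows = list(self.primary_rows)
        self.pending_log.clear()

    def write(self, row):
        # Normal production activity on the primary.
        self.primary_rows.append(row)
        self.pending_log.append(row)

    def backup_and_ship_log(self):
        # Job on the primary: back up the log and forward it.
        if self.pending_log:
            self.shipped.append(list(self.pending_log))
            self.pending_log.clear()

    def restore_logs(self):
        # Job on the secondary: apply the shipped log backups in order.
        for backup in self.shipped:
            self.secondary_rows.extend(backup)
        self.shipped.clear()

    def transactions_at_risk(self):
        # Changes that exist only on the primary right now.
        return len(self.pending_log)


if __name__ == "__main__":
    pair = LogShippingPair()
    pair.write("order-1")
    pair.initialize()
    pair.write("order-2")
    pair.write("order-3")
    pair.backup_and_ship_log()
    pair.write("order-4")
    pair.restore_logs()
    print(pair.secondary_rows)
    print(pair.transactions_at_risk())
```

In this model, anything written after the most recent log backup exists only on the primary, which is why the interval between log backup jobs bounds the potential data loss.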
Considerations for Log Shipping
Unlike Windows Failover Clustering, AlwaysOn Availability Groups, or Database Mirroring, SQL Server Log Shipping has no automated failover or failback process. All of the steps to connect end users to the standby server must be performed manually.
There can be substantial downtime associated with log shipping and possible data
loss. In addition, because the protection
provided by Log Shipping is at the database level you must manually ensure that
the secondary SQL Server systems have
compatible server-level configuration settings. You must manually provide any
logins and system database settings to
support the application.
Database Mirroring
Database Mirroring is a high availability
technology that’s primarily designed to
protect against unplanned downtime and
it can also be implemented as a disaster
recovery technology. Database Mirroring
provides database-level protection for a
specific database. Database mirroring provides a higher level of availability than
log shipping because Database Mirroring
provides automated failover with minimal
downtime. The automatic failover time is typically 3-5 seconds, and when Database Mirroring is implemented in High Safety mode it also protects against all data loss. Figure 5 shows an overview of SQL Server’s Database Mirroring.
Figure 5 - Database Mirroring
When Database Mirroring is implemented for high availability three SQL Server systems are required: the principal server, the mirror server, and a witness server. The principal server is the system that is initially providing the database services. The mirror server maintains a copy of the databases from the principal server. The witness server determines when the principal server becomes unavailable and an automatic failover is required.
Database mirroring works by capturing
transaction log records from the principal server and automatically forwarding
them to the mirror server. On the mirror server the database is in a constant
state of recovery and can’t be used until
a failover occurs. However, the mirror
server is not restricted to just providing
mirroring services. The mirror server can
also be actively supporting other databases that are not participating in the mirroring operation.
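The witness’s role in deciding when an automatic failover is required can be illustrated with a small decision function. This Python sketch is a simplified model under stated assumptions (the mirror and witness must both lose contact with the principal, must still see each other, and the mirror database must be synchronized); the names are hypothetical and it is not the actual SQL Server implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MirroringView:
    """Connectivity as observed at one moment (hypothetical structure)."""
    mirror_sees_principal: bool
    witness_sees_principal: bool
    mirror_sees_witness: bool
    mirror_synchronized: bool


def should_fail_over(view: MirroringView) -> bool:
    """Return True when the mirror may safely take over automatically."""
    principal_lost = (not view.mirror_sees_principal
                      and not view.witness_sees_principal)
    # The mirror needs the witness's agreement, otherwise it cannot tell a
    # failed principal apart from its own network being cut off.
    return principal_lost and view.mirror_sees_witness and view.mirror_synchronized


if __name__ == "__main__":
    principal_down = MirroringView(False, False, True, True)
    mirror_isolated = MirroringView(False, True, False, True)
    print(should_fail_over(principal_down))
    print(should_fail_over(mirror_isolated))
```

Requiring agreement between the mirror and the witness is what prevents both servers from acting as the principal at the same time when only the network link between them has failed.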
Considerations
for Database Mirroring
Database Mirroring is supported on the SQL Server Standard Edition and above. However, the SQL Server Standard Edition is limited to using Database Mirroring only in High Safety mode. To use High Performance mode you must have the SQL Server Enterprise edition. Database Mirroring has several important limitations. Database Mirroring is limited to a single failover partner. While you can set up multiple Database Mirroring partnerships per SQL Server instance, each partnership can only consist of two systems and the optional witness (which doesn’t actually maintain a copy of the mirrored database). Further, Database Mirroring is only capable of protecting a single database at a time. While this is adequate for some applications it doesn’t adequately protect more complex multi-database applications. In addition, you have to choose whether Database Mirroring is implemented synchronously or asynchronously; you can’t choose both. Finally, the databases on the mirror server are in a state of constant recovery and they can’t be directly accessed. To use the data in the mirrored database you must take point-in-time snapshots of the mirrored databases. In addition, because Database Mirroring is not a server-level technology you must manually make sure that each server has the required server-level configuration, logins, and system database settings to support the application that makes use of the mirrored database.
Failover Clustering
Windows Failover Clustering is Microsoft’s
primary high availability technology. Windows Failover Clustering is a high availability technology that’s designed to address
server and site level unplanned downtime.
Windows Failover Clustering provides
automated failover with very high levels
of availability. A Windows Server Failover
Cluster is comprised of multiple systems
where each system is configured using the
Windows Failover Clustering feature. In
addition Windows Server Failover Clustering also requires a shared storage solution.
Each physical server that participates in
the cluster is called a node. If one node in
a cluster fails another node in the cluster
can automatically take over the services
running on the failed node. This process is
called a failover. When the failed node is restored you can fail back the service to the original node.
With Windows Server 2012, Failover Clustering is provided in both the Windows Server Standard and Datacenter editions with support for up to 64 nodes in a cluster. For Windows Server 2008 and 2008 R2, Windows Failover Clustering is only provided in the Windows Server Enterprise Edition and higher, and there is support for up to 16-node clusters.
The number of nodes that are allowed in a cluster is also dependent on the edition of SQL Server that is in use. The SQL Server 2012 Enterprise edition supports the maximum number of nodes supported by the Windows Server operating system. The SQL Server 2012 Business Intelligence and Standard editions are limited to two-node clusters. The SQL Server 2008 R2 Enterprise edition supports up to 16-node clusters while the SQL Server 2008 R2 Standard Edition supports two-node clusters. You can see an overview of Windows Failover Clustering in Figure 6.
Figure 6 - Windows Server Failover Cluster and SQL Server
Figure 6 illustrates a two-node cluster. Each cluster node requires the installation of the Windows Server operating system on a local hard drive. In addition, for Windows Server 2012, 2008 R2 and 2008 you must install the Failover Cluster Feature. The cluster requires a private TCP/IP network to send the heartbeat between cluster nodes. The cluster’s heartbeat determines if a cluster node has failed. Networked clients connect to the clustered resources using the organization’s external network. A Windows Failover Cluster also requires shared storage for the clustered services and cluster quorum.
During the failover process any client systems connected to the failed node are disconnected. When the failover is complete client systems will reconnect to the cluster resources running on the backup node.
Considerations for Failover Clustering
Windows Server Failover Clustering is an effective technology to provide protection from server-level unplanned downtime. However, to accomplish that you need to deal with the cost and complexity of the Windows Failover Clustering technology. From the hardware standpoint you need to purchase a minimum of two systems complete with Windows Server licenses. You also need a private and public network as well as shared storage that can be accessed by all of the nodes in the cluster. While configuring Windows Server Failover Clusters is easier than it was in the past it can still be challenging, especially for small and medium-sized businesses that may not have a great deal of in-house technical expertise.
The length of time for a failover depends
on the level of database activity that was
occurring at the time of failover. SQL Server
writes all of its database transactions in the
transaction log. In the event of a failover all
committed transactions in the transaction
log must be applied to the database and all
the uncommitted transactions must be rolled
back to ensure database integrity. The length
of time that this recovery period takes varies
depending on the performance capabilities
of the system and on the number of transactions in the log. Larger more active database
applications will have more transactions and
therefore will take longer to complete the
restart process than a smaller system that
has fewer transactions to recover.
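The recovery step described above, in which committed work is reapplied and uncommitted work is rolled back, can be sketched in a few lines of Python. This is a deliberately simplified model: the record layout, the function name, and the assumption that every updated key already exists in the checkpoint are illustrative choices, not SQL Server’s actual log format or recovery algorithm.

```python
def recover(checkpoint, log):
    """Rebuild a consistent database from a checkpoint and a log.

    Log records are tuples (hypothetical layout):
      ("UPDATE", txn_id, key, old_value, new_value)
      ("COMMIT", txn_id)
    """
    database = dict(checkpoint)
    committed = {record[1] for record in log if record[0] == "COMMIT"}
    updates = [record for record in log if record[0] == "UPDATE"]

    # Redo phase: reapply every logged change in order.
    for _, txn_id, key, old_value, new_value in updates:
        database[key] = new_value

    # Undo phase: reverse changes from transactions that never committed,
    # newest first, by restoring the logged old values.
    for _, txn_id, key, old_value, new_value in reversed(updates):
        if txn_id not in committed:
            database[key] = old_value

    return database


if __name__ == "__main__":
    checkpoint = {"balance_a": 100, "balance_b": 50}
    log = [
        ("UPDATE", "T1", "balance_a", 100, 80),
        ("UPDATE", "T2", "balance_b", 50, 75),
        ("COMMIT", "T1"),
        # T2 never committed before the failure.
    ]
    print(recover(checkpoint, log))
```

Both phases loop over the log records, so the work grows with the number of logged changes. That is the reason a busier database takes longer to come back online after a failover.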
AlwaysOn Availability Groups
SQL Server AlwaysOn Availability Groups were first introduced with SQL Server 2012 and they are essentially the next evolution of Database Mirroring. AlwaysOn Availability Groups provide high availability for multiple databases. They are capable of automated failover and the downtime is typically 3-5 seconds. Availability Groups provide database-level protection and they can protect against planned and unplanned downtime. SQL Server 2012 AlwaysOn Availability Groups support up to four secondary replicas, where each replica is located on a separate SQL Server instance running on different Windows Failover Cluster nodes. AlwaysOn Availability Groups can contain multiple databases, all of which can be automatically failed over as a unit. This means that AlwaysOn Availability Groups can protect multiple related databases and fail them over simultaneously. AlwaysOn Availability Groups support both synchronous and asynchronous replicas simultaneously. Synchronous connections are typically used in high availability scenarios where there is a requirement for fast automatic failover. Asynchronous connections are typically used in disaster recovery scenarios where there is geographical distance between the different servers. You can see an overview of AlwaysOn Availability Groups in Figure 7.
Figure 7 - SQL Server AlwaysOn Availability Groups
Like Database Mirroring, AlwaysOn Availability Groups work by automatically forwarding transaction log entries from the primary system to the different replicas. In
addition, unlike database mirroring, the
AlwaysOn Availability Group replica databases are able to provide read-only access.
This enables the replicas to be used for
both reporting as well as backup purposes
potentially offloading some of the workload and I/O from the primary server.
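The practical difference between synchronous and asynchronous replicas can be shown with a small simulation. In this Python sketch, a commit waits for synchronous replicas but not for asynchronous ones; the class names are hypothetical, and the model ignores networking, timeouts, and everything else a real availability group has to handle.

```python
class Replica:
    def __init__(self, name, synchronous):
        self.name = name
        self.synchronous = synchronous
        self.hardened = []    # log records safely written on this replica
        self.in_flight = []   # records sent but not yet written (async only)


class Primary:
    def __init__(self, replicas):
        self.log = []
        self.replicas = replicas

    def commit(self, record):
        self.log.append(record)
        for replica in self.replicas:
            if replica.synchronous:
                # The commit is acknowledged only after this replica has it.
                replica.hardened.append(record)
            else:
                # The commit is acknowledged without waiting for this replica.
                replica.in_flight.append(record)

    def deliver_async(self):
        # Asynchronous replicas catch up some time later.
        for replica in self.replicas:
            replica.hardened.extend(replica.in_flight)
            replica.in_flight.clear()


def records_lost_on_failover(primary, replica):
    """Committed records missing if we fail over to this replica right now."""
    return len(primary.log) - len(replica.hardened)


if __name__ == "__main__":
    local = Replica("local-sync", synchronous=True)
    remote = Replica("remote-async", synchronous=False)
    primary = Primary([local, remote])
    for record in ("r1", "r2", "r3"):
        primary.commit(record)
    print(records_lost_on_failover(primary, local))
    print(records_lost_on_failover(primary, remote))
    primary.deliver_async()
    print(records_lost_on_failover(primary, remote))
```

The synchronous replica never falls behind acknowledged commits, at the price of waiting on every commit. The asynchronous replica avoids that wait, which suits long-distance links, but can be missing recent records at the moment of a failure.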
Considerations
for AlwaysOn Availability Groups
AlwaysOn Availability Groups can provide protection from planned and unplanned downtime for multiple databases. However, they also involve a high degree of cost and complexity to implement. First, they require that you implement a Windows Server Failover Cluster, which can be difficult for small and medium-sized businesses. This means that AlwaysOn Availability Groups require multiple Windows Server and SQL Server instances as well as a shared storage solution. After setting up the Windows Server Failover Cluster you must then set up the SQL Server Availability Groups. In addition, although the failover process is very fast there is a small amount of downtime associated with AlwaysOn Availability Groups. Also, like Database Mirroring, there is the possibility of data loss when using asynchronous replicas. While replica databases can be actively used to offload backup and reporting workloads, active databases require additional SQL Server licenses.
The Road to Five 9s
Maximizing availability is one of the most
important goals for the database administrator. Microsoft provides several high availability technologies such as Log Shipping,
Database Mirroring, Windows Failover Clustering and AlwaysOn Availability Groups.
You can see a summary of the different SQL
Server availability technologies discussed in
this paper in Figure 8.
While SQL Server’s built-in technologies can provide high levels of availability for your tier 1 database instances they also each have associated costs and complexity. For instance, technologies like SQL Server’s AlwaysOn Availability Groups provide protection from unexpected database-level failures, but there is also significant complexity involved in setting up the prerequisite Windows Failover Cluster. There are also licensing considerations involved in implementing AlwaysOn Availability Groups because you need to have the SQL Server 2012 Enterprise edition. Like many of the SQL Server high availability options, AlwaysOn Availability Groups are not available in the lower cost SQL Server 2012 Standard edition.
Figure 8 - Summary of SQL Server high availability technologies
FT Server Value Proposition
The NEC FT Server line of systems can provide 5 nines of availability right out of the box with no additional complexity or licensing cost beyond what you would pay for a standard x64 server. The key value propositions offered by NEC’s FT Server solutions are:
• Continuous Availability – The FT Server line provides continuous availability and 99.999% uptime right out of the box. Any hardware failures are completely transparent to the end user and there is no data loss.
• Mission Critical Application Protection – NEC’s fault tolerant FT Servers provide protection from the single point of failure problem that can occur in virtualization hosts that run many VMs or mission critical database servers that support many applications. The FT Server’s fault tolerant design enables the host to withstand failures and provide continuous availability for these mission critical applications.
• Operational Simplicity – Operating and managing NEC’s FT Server systems is almost identical to managing a standard standalone server. There’s no need to learn new technologies or to perform complex clustering configurations.
• Reduced Maintenance Costs – Hot swap technology allows maintenance to be performed with no end user interruption. The FT Server systems incorporate a Customer Replaceable Unit (CRU) design that allows IT personnel to replace most system components on site.
• Reduced TCO – NEC’s FT Servers provide greater uptime and reduced management time, as well as reduced infrastructure and software licensing costs, compared to other availability solutions.
This combination of simplicity and high availability makes the NEC FT Server systems well suited for small and medium-sized businesses or remote office deployments that have requirements for high availability but may lack the resources or technical capabilities to implement more complex solutions.
Server Cost Comparison