High Availability and Disaster Recovery SQL Server Solution

May 08 – 09 2012, Kongresshaus Berchtesgaden
SQL Server High Availability
Concepts & Solution
Guidance (2008 R2 & 2012)
Satya SK Jayanty
Director & Principal Architect
D BI A Solutions
About me
IT Experience
• Principal Architect & Consultant – D Bi A Solutions : Europe
• In the IT field for over 20 years (using SQL Server from ver. 4.2 onwards)
Publications
• Author: Microsoft SQL Server 2008 R2 Administration Cookbook – Packt Publishers (May 2011)
• Co-author of MVP Deep Dives Volume II – Manning Publications (October 2011)
Community Contributions
• SQL Server MVP since 2006
• Founder (SQLMaster) & blogs at www.sqlserver-qa.net (SQL Server Knowledge Sharing Network)
• Contributing Editor & Moderator - www.sql-server-performance.com [SSP]
• Quiz Master & Blogger: www.beyondrelational.com & www.sqlservergeeks.com
• Active participation in assorted forums such as SSP, SQL Server Central, MSDN, SQL Server magazine, dbforums etc.
www.sqlserver-qa.net
@sqlmaster
www.packtpub.com
Agenda
• Understanding High Availability
• Common terms
• Planned and Unplanned Downtime
• High Availability vs. Disaster Recovery
• Specific SQL Server 2008 + R2 features for High Availability
• SQL Server 2012 – What’s New
• AlwaysOn
Describing High Availability
Number of 9's | Availability Percentage
2 | 99%
3 | 99.9%
4 | 99.99%
5 | 99.999%
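To make the "number of 9's" concrete, the sketch below converts each availability percentage into the downtime it permits per year. It assumes a 365-day year and counts all downtime, planned or unplanned; the function name is illustrative and not from the slides.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: 365-day year, leap years ignored


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Maximum downtime per year, in minutes, for a given availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# Print the downtime budget for the four levels listed on the slide.
for nines, pct in [(2, 99.0), (3, 99.9), (4, 99.99), (5, 99.999)]:
    minutes = allowed_downtime_minutes(pct)
    print(f"{nines} nines ({pct}%): {minutes:,.2f} minutes/year")
```

Under these assumptions, two nines allow roughly 3.65 days of downtime per year, while five nines allow a little over 5 minutes.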
IT Lifecycle
[Diagram labels: Database Service Management, Build, IT Operations, Operate, Maintenance]
High Availability Definition
[Diagram: timelines for Database Service Management and IT Operations alternating between Online and Offline; labels: Time between failure, Time to repair, Uptime, Downtime]
MTBF – Mean Time Between Failures, MTTR – Mean Time To Repair
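MTBF and MTTR are commonly combined into a single availability figure. The sketch below uses the conventional steady-state formula, availability = MTBF / (MTBF + MTTR), which is not stated on the slide; it treats MTBF as the mean online time between failures and MTTR as the mean offline time per failure.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability as a fraction between 0 and 1.

    Assumption: MTBF is the mean uptime between failures and MTTR is the
    mean time to restore service after a failure.
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must be non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures: one failure every 999 hours, one hour to repair.
print(f"{availability(999, 1):.3%}")
```

The formula shows that availability can be improved from either side: by failing less often (higher MTBF) or by recovering faster (lower MTTR).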
Database Server Availability
Tables
Database
SQL Server Instance
Operating System
Hardware, Network and Storage
Total Availability = product of the availability of each component
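The product rule can be illustrated with a short calculation. The per-layer availability figures below are hypothetical and chosen only for illustration; the layers follow the stack listed on the slide.

```python
from math import prod

# Hypothetical availability of each layer (fractions, not percentages).
layers = {
    "Hardware, network and storage": 0.9999,
    "Operating system": 0.9995,
    "SQL Server instance": 0.9995,
    "Database": 0.9999,
}


def total_availability(component_availabilities) -> float:
    """Availability of a stack in which every component must be up."""
    return prod(component_availabilities)


total = total_availability(layers.values())
print(f"Total availability: {total:.4%}")
```

Because every factor is below 1, the total is always lower than the weakest layer, so each layer needs a higher target than the service as a whole.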
Downtime
Planned | Unplanned
Software releases | Hardware component failure
OS patch releases | Security breaches
SQL Server service packs and hotfixes | Human error
Database maintenance and upgrades | Natural disasters
Consider:
• RTO – Duration of Outage (Recovery Time Objective)
• RPO – Measure of acceptable data loss (Recovery Point Objective)
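As a rough illustration of how RTO and RPO are applied, the sketch below checks a single outage against both objectives. The `Outage` fields and the function name are hypothetical; the only assumption is that downtime is measured from failure to restoration, and data loss from the last recoverable point to the failure.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Outage:
    failed_at: datetime            # when the service went offline
    restored_at: datetime          # when the service was back online
    last_recoverable_at: datetime  # newest point in time present in the restored data


def meets_objectives(outage: Outage, rto: timedelta, rpo: timedelta):
    """Return (rto_met, rpo_met) for one outage."""
    downtime = outage.restored_at - outage.failed_at
    data_loss_window = outage.failed_at - outage.last_recoverable_at
    return downtime <= rto, data_loss_window <= rpo


# Hypothetical incident: 45 minutes offline, last usable log backup 10 minutes before failure.
incident = Outage(
    failed_at=datetime(2012, 5, 8, 10, 0),
    restored_at=datetime(2012, 5, 8, 10, 45),
    last_recoverable_at=datetime(2012, 5, 8, 9, 50),
)
print(meets_objectives(incident, rto=timedelta(hours=1), rpo=timedelta(minutes=5)))
```

In this hypothetical incident the service returned within the one-hour RTO, but ten minutes of data were lost against a five-minute RPO.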
Justify ROI
• Avoid downtime
• Automating recovery
• Resource utilization
Disaster recovery
[Diagram: Business Continuity Plan spanning Business Function, Database Service Management and IT Operations timelines (Online/Offline); labels: Disaster recovery plan, Recovery Time Objective, Data loss, Recovery Point Objective]
High Availability vs. Disaster Recovery
High Availability | Disaster Recovery
Increased uptime | Designed to meet RTO and RPO
Typically focuses within a site | Recovery from site-level disasters
Component level redundancy | Data and Operations Recovery
Windows Failover Clustering | Data backup and Restore
Downtime: limit investigation time | Recovered: find the root cause
• High Availability (HA) – prevent an outage
• Disaster Recovery (DR) – address & re-establish HA after outage
• HA is a feature and DR is an implementation (must be tested)
• RCA is highly essential in both aspects.
SQL SERVER 2008 R2 HIGH AVAILABILITY FEATURES
• Failover clustering (Storage Area Network, Direct Attached Storage)
• Geographically dispersed failover clustering
• Database mirroring
• Log shipping
• Transactional Replication
• Peer-to-peer transactional replication
WINDOWS HIGH AVAILABILITY FEATURES
Other features
Edition specific
• Online Indexing
• Hot Add CPU
• Hot Add Memory
– Adjust memory online without restart of SQL Services
COMMON CONFIGURATION SCENARIOS: Failover clustering
[Diagrams: (1) SQL Instance A on Node A and Node B (MSCS) with a Storage Area Network; (2) SQL Instance A and SQL Instance B on Node A and Node B (MSCS) with a Storage Area Network]
COMMON CONFIGURATION SCENARIOS: Geographically dispersed failover clustering
[Diagram: SQL Instance A on Node A, Node B, Node C and Node D]
COMMON CONFIGURATION SCENARIOS: Database mirroring and log shipping
[Diagrams: two configurations, each with SQL Instance A on Node A and Node B]
COMMON CONFIGURATION SCENARIOS: Peer-to-peer transactional replication
[Diagram: SQL Instance A on Node A1 and Node A2; SQL Instance B on Node B1 and Node B2; SQL Instance C on Node C1 and Node C2]
COMPARING HIGH AVAILABILITY SOLUTIONS: Failover
Failover clustering
• SQL Server Instance
• Automated
• Within minutes
• No application redirection required
Geographically dispersed failover cluster
• SQL Server Instance
• Automated
• Within minutes
• No application redirection required
Database mirroring
• Database
• Manual or automated
• Few seconds to minutes
• Application redirection required (can be automated)
Log shipping
• Database
• Manual
• Based on configuration
• Forced application redirection required
Peer-to-peer replication
• Publication
• Application redirection
• Few seconds
• Forced application redirection
COMPARING HIGH AVAILABILITY SOLUTIONS: Secondary server
Failover clustering
• Instance
• Multiple nodes
• Server in same LAN
• Hot standby server
• Available for use
• Shared storage
• Status check using heartbeat
Geographically dispersed failover cluster
• Instance
• Multiple nodes
• Span across WAN
• Hot standby
• Available for use
• Shared storage at site-level
• Status check using heartbeat
Database mirroring
• Database
• Principal and Mirror
• Span across WAN
• Hot standby
• Available for use
• Storage need not be shared
• No status check unless using witness
Log shipping
• Database
• Multiple secondary nodes
• Span across WAN
• Warm standby
• Available for use
• Storage need not be shared
• No status checking
Peer-to-peer replication
• Publication
• Multiple nodes
• Span across WAN
• Hot standby
• Available for use
• Storage need not be shared
• No status checking
COMPARING HIGH AVAILABILITY SOLUTIONS: Data considerations
Failover clustering
• All instance databases are shared
• Loss limited to last hardened record
• No data copy across network
Geographically dispersed failover cluster
• All instance databases are shared
• Loss limited to last copied storage unit
• Based on technology
Database mirroring
• Configured database is mirrored
• Loss limited to last mirrored transaction
• Synchronous or Asynchronous mirroring
Log shipping
• Configured database logs are applied
• Loss limited to last applied transaction log backup
• Transaction log Backup, Copy and Restore
Peer-to-peer replication
• Configured publication is replicated
• Loss limited to last replicated record
• Replication agents replicate changes
COMPARING HIGH AVAILABILITY SOLUTIONS: Network considerations
Failover clustering
• Network connectivity needs to meet heartbeat requirements
• Data is not transferred over network
Geographically dispersed failover cluster
• Nodes can be geographically dispersed
• Based on the storage replication configuration
Database mirroring
• Network latency to meet I/U/D SLAs
• Data is transferred over network
Log shipping
• Network latency to meet RTO
• Transaction Log file records transferred over network
Peer-to-peer replication
• Network should meet RTO
• Data changes transferred over network
High Availability Solution
• Database mirroring/Log shipped
• Interoperability: Database mirroring and Log shipping combination
• Peer-to-peer transactional replication
SQL Server 2012 :: HA What’s new
AlwaysOn :: Configuring availability at both database & instance level.
• AlwaysOn Availability Groups (AG)
– Log based data movement without shared disks
– Zero data loss
– Automatic & manual failover of a logical group of databases
– Support up to 4 secondary replicas
– Automatic page repair (continuing from SQL2008 R2)
• AlwaysOn Failover Cluster Instances (FCI)
– Multi-site clustering across subnets
– Enables cross data-center failover of SQL instances
– Faster failover for application availability
What’s new:: Reduce Downtime
Helps reduce planned downtime! (Comes with a cost)
• Windows Server Core
• Online operations (SQL Server)
• Rolling upgrade & patches (AlwaysOn)
• SQL Server on Hyper-V (benefit of Live Migration)
– migrate virtual machines between hosts with zero downtime.
• Easy deployment
– Configuration Wizard, Windows PowerShell command-line interface,
dashboards, dynamic management views (DMVs), policy-based management,
and System Center integration help simplify deployment and management of
availability groups.
AlwaysOn: RPO & RTO Capabilities
High Availability and Disaster Recovery SQL Server Solution | Potential Data Loss (RPO) | Potential Recovery Time (RTO) | Automatic Failover | Readable Secondaries(1)
AlwaysOn Availability Group - synchronous-commit | Zero | Seconds | Yes(4) | 0-2
AlwaysOn Availability Group - asynchronous-commit | Seconds | Minutes | No | 0-4
AlwaysOn Failover Cluster Instance | NA(5) | Seconds-to-minutes | Yes | NA
Database Mirroring(2) - High-safety (sync + witness) | Zero | Seconds | Yes | NA
Database Mirroring(2) - High-performance (async) | Seconds(6) | Minutes(6) | No | NA
Log Shipping | Minutes(6) | Minutes-to-hours(6) | No | Not during a restore
Backup, Copy, Restore(3) | Hours(6) | Hours-to-days(6) | No | Not during a restore
1. An AlwaysOn Availability Group can have no more than a total of four secondary replicas, regardless of type.
2. This feature will be removed in a future version of Microsoft SQL Server. Use AlwaysOn Availability Groups instead.
3. Backup, Copy, Restore is appropriate for disaster recovery, but not for high availability.
4. Automatic failover of an availability group is not supported to or from a failover cluster instance.
5. The FCI itself doesn’t provide data protection; data loss is dependent upon the storage system implementation.
6. Highly dependent upon the workload, data volume, and failover procedures.
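The capability table above can be used as a simple lookup when shortlisting solutions. The sketch below transcribes its rows (footnote markers dropped) and filters for zero potential data loss combined with automatic failover; the data structure and function name are illustrative only.

```python
# Rows transcribed from the capability table: (solution, RPO, RTO, automatic failover).
SOLUTIONS = [
    ("AlwaysOn Availability Group - synchronous-commit", "Zero", "Seconds", True),
    ("AlwaysOn Availability Group - asynchronous-commit", "Seconds", "Minutes", False),
    ("AlwaysOn Failover Cluster Instance", "NA", "Seconds-to-minutes", True),
    ("Database Mirroring - High-safety (sync + witness)", "Zero", "Seconds", True),
    ("Database Mirroring - High-performance (async)", "Seconds", "Minutes", False),
    ("Log Shipping", "Minutes", "Minutes-to-hours", False),
    ("Backup, Copy, Restore", "Hours", "Hours-to-days", False),
]


def zero_loss_with_automatic_failover(solutions):
    """Names of solutions listing 'Zero' RPO together with automatic failover."""
    return [name for name, rpo, _rto, auto in solutions if rpo == "Zero" and auto]


for name in zero_loss_with_automatic_failover(SOLUTIONS):
    print(name)
```

The failover cluster instance is excluded here because its RPO is listed as NA: per footnote 5, its data protection depends on the storage system rather than on the FCI itself.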
AlwaysOn Layers of Protection
Provides fault tolerance and disaster recovery across several logical and physical layers
of infrastructure and application components
• Infrastructure level
– Windows Server Failover Clustering (WSFC): Server-level-fault-tolerance & intra-node
network
• SQL Server instance level
– FCI: attached to symmetric shared storage
• Database level
– AG – Availability Groups: a primary replica & up to 4 secondary replicas
– Each replica is hosted by an instance (FCI or non-FCI) on a different node of the WSFC
• Client Connectivity
– Clients can connect directly to a SQL Server instance network name, or
– they may connect to a virtual network name (VNN) that is bound to an availability group listener
– Logical redirection to the appropriate SQL Server instance and database replica
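From the client side, connecting through an availability group listener mostly comes down to the connection string. The sketch below builds an ODBC connection string using the `MultiSubnetFailover` and `ApplicationIntent` keywords of the Microsoft ODBC Driver for SQL Server. The listener name, database name, driver version and port 1433 are assumptions for illustration, and the function itself is hypothetical; nothing here opens a connection.

```python
def listener_connection_string(
    listener: str,
    database: str,
    read_only: bool = False,
    driver: str = "ODBC Driver 17 for SQL Server",  # assumed installed driver
    port: int = 1433,                               # assumed listener port
) -> str:
    """Build an ODBC connection string that targets an availability group listener."""
    parts = [
        f"Driver={{{driver}}}",
        f"Server=tcp:{listener},{port}",
        f"Database={database}",
        "Trusted_Connection=Yes",
        "MultiSubnetFailover=Yes",  # try all listener IP addresses in parallel
    ]
    if read_only:
        # Declares read-only intent; redirection to a readable secondary
        # also requires read-only routing to be configured on the group.
        parts.append("ApplicationIntent=ReadOnly")
    return ";".join(parts) + ";"


# Hypothetical listener and database names.
print(listener_connection_string("ag-listener.example.com", "Sales", read_only=True))
```

Because the application refers only to the listener name, a failover to another replica does not require a connection string change.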
AlwaysOn: Storage Considerations
Direct-attached vs. remote
• HBA
• SAN – iSCSI or Fibre channel
• SMB (Server Message Block)
Symmetric or Asymmetric
• Storage devices are considered symmetric
• SSDs are good
Dedicated vs. Shared
• Dedicated storage is reserved for use by, and assigned to, a single node in the cluster
• Shared storage is accessible to multiple nodes in the cluster
• WSFC supports cluster shared volumes – file sharing
• SQL Server does not support concurrent multi-node access to a shared volume
Availability Improvements
• Flexible Failover Policy
– sp_server_diagnostics uses FailureConditionLevel
• Failover Policy for Failover Cluster Instances (http://msdn.microsoft.com/en-us/library/ff878664(SQL.110).aspx)
• Enhanced logging and instrumentation
– Specific system configuration views, DMVs, performance counters, and
an extended event health session
• AlwaysOn Availability Groups Dynamic Management Views and Functions
(http://msdn.microsoft.com/en-us/library/ff877943(SQL.110).aspx), &
• sys.dm_os_cluster_nodes (http://msdn.microsoft.com/en-us/library/ms187341(SQL.110).aspx)
• SMB file share support:
– SQL Databases on File Shares - It's time to reconsider the scenario
(http://blogs.msdn.com/b/sqlserverstorageengine/archive/2011/10/18/sql-databases-on-file-shares-it-s-time-to-reconsider-the-scenario.aspx)
Using storage replication technologies effectively
• Test the solution prior to deployment
– Test database failover and ensure the databases can be brought online
every single time
– Test the entire solution to ensure that the required operation
processes and documents are in place
• Understand the performance impact of the solution
implemented
– Synchronous replication can reduce RPO to zero but impacts
performance based on the network latency
– Asynchronous replication can reduce performance impact, but
increases RPO
– Benchmark the solution performance prior to the deployment
• Implement vendor specified best practices
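The trade-off between synchronous and asynchronous replication can be sketched with a deliberately simplified latency model. It assumes the local and remote writes proceed in parallel and that a synchronous write is acknowledged only after both complete; real storage replication products differ, so the numbers are illustrative, not a benchmark.

```python
def write_latency_ms(
    local_write_ms: float,
    remote_write_ms: float,
    round_trip_ms: float,
    synchronous: bool,
) -> float:
    """Latency seen by the application for one replicated write (simplified model)."""
    if synchronous:
        # Acknowledged only after the remote copy is written and confirmed.
        return max(local_write_ms, round_trip_ms + remote_write_ms)
    # Asynchronous: acknowledged after the local write; the remote copy lags behind.
    return local_write_ms


# Hypothetical figures: 1 ms writes at each site, 20 ms network round trip.
print(write_latency_ms(1.0, 1.0, 20.0, synchronous=True))
print(write_latency_ms(1.0, 1.0, 20.0, synchronous=False))
```

In this model the synchronous write pays the full network round trip on every commit, which is why benchmarking on the actual link matters; the asynchronous write stays fast, but whatever has not yet reached the remote site is the potential data loss (RPO).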
Using storage replication technologies effectively
• Data growth
– Understand the impact of dynamically increasing
the size of the LUNs if available
• Follow SQL Server best practices
– Keep each database's data and log files on their own
devices
– Avoid replicating the TempDB
– A simple physical layout for user databases helps simplify
maintenance
Any Questions?