Achieving High Availability with SQL Server using EMC SRDF
Prem Mehra – SQL Server Development, Microsoft
Art Ullman – CSC
Topics Covered
• Share experiences gained deploying SQL Server and SAN for a Highly Available Data Warehouse. Emphasis is on:
  • The intersection of SAN and SQL Server technologies
  • Not on large database implementation or data warehouse best practices
• Project overview
• Best practices in a SAN environment
• Remote site fail-over using EMC SRDF and SQL Server Log Shipping
USDA GDW Project Overview
Project: Build a Geo-spatial Data Warehouse for two sites with remote fail-over
Client: USDA
Storage: EMC SAN (46 terabytes)
Database: SQL Server 2000
Implementation: USDA / CSC
Consultants: EMC / CSC / Microsoft / ESRI
Geo-spatial Software: ESRI Data Management Software
Application Requirements
• A large (46 TB total storage) geo-spatial data warehouse for two USDA sites: Salt Lake City and Fort Worth
• Provide database fail-over and fail-back between the remote sites
• Run data replication across the DS3 network between sites (45 Mb/sec)
• Support read-only access at the fail-over sites on an ongoing basis
SAN Implementation 1
• Understand your throughput, response time, and availability requirements, and the potential bottlenecks and issues
• Work with your storage vendor
  • Get best practices
  • Get design advice on LUN size, sector alignment, etc.
  • Understand the available backend monitoring tools
• Do not try to over-optimize; keep the LUN, filegroup, and file design simple, if possible
SAN Implementation 2
• Balance I/O across all HBAs when possible using balancing software (e.g., EMC's PowerPath)
  • Provides redundant data paths
  • Offers the most flexibility and is much easier to design than static mapping
  • Some vendors are now offering implementations that use Microsoft's MPIO (multi-path I/O), which permits more flexibility in heterogeneous storage environments
• Managing growth
  • Some configurations offer dynamic growth of existing LUNs for added flexibility (e.g., Veritas Volume Manager or SAN vendor utilities)
  • Working with SAN vendor engineers is highly recommended
SAN Implementation 3 – Benchmarking the I/O System
• Before implementing SQL Server, benchmark the SAN to shake out hardware and driver problems
  • Test a variety of I/O types and sizes: combinations of read/write and sequential/random
  • Include I/O sizes of at least 8K, 64K, 128K, and 256K
  • Ensure test files are significantly larger than the SAN cache – at least 2 to 4 times its size
  • Test each I/O path individually and in combination to cover all paths
  • Ideally, throughput (MB/s) scales up linearly as paths are added
  • Save the benchmark data for comparison when SQL Server is being deployed (see the query sketched below)
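Once SQL Server is running on the benchmarked SAN, its own per-file I/O statistics give a baseline to compare against the saved benchmark numbers. The query below is a minimal sketch using the SQL Server 2000 function ::fn_virtualfilestats; the database name 'GDW' and file id 1 are placeholders, not names from this project.

    -- Hedged sketch (SQL Server 2000): per-file I/O statistics for one data file,
    -- for comparison with the raw SAN benchmark results. 'GDW' and file id 1
    -- (the primary data file) are placeholders; repeat per file of interest.
    SELECT *
    FROM   ::fn_virtualfilestats(DB_ID('GDW'), 1)

The NumberReads/NumberWrites, BytesRead/BytesWritten, and IoStallMS columns indicate whether the deployed workload is seeing I/O service comparable to what the pre-deployment benchmark achieved.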
SAN Implementation 4 – Benchmarking the I/O System
• Share results with your vendor: is the performance reasonable for the configuration?
• SQLIO.exe is an internal Microsoft tool
  • Ongoing discussions to post it as an unsupported tool at http://www.microsoft.com/sql/techinfo/administration/2000/scalability.asp
SAN Implementation 5
• Among other factors, parallelism is also influenced by:
  • The number of CPUs on the host
  • The number of LUNs
• For optimizing Create Database and Backup/Restore performance, consider:
  • As many volumes as (or more volumes than) the number of CPUs
  • These could be volumes created by dividing a dynamic disk, or separate LUNs
• Database and TempDB files
  • Internal file structures require synchronization; consider the number of processors on the server
  • The number of data files should be >= the number of processors (see the sketch below)
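As a rough illustration of the "data files >= processors" guideline, the T-SQL below creates a database with four data files for a hypothetical four-CPU host and adds an extra tempdb file. All database, file, and path names, and all sizes, are invented for the example; they are not the project's actual layout.

    -- Hedged sketch for a hypothetical 4-CPU server: four equally sized data
    -- files in the primary filegroup (one per CPU) plus one log file.
    CREATE DATABASE GDW
    ON PRIMARY
        ( NAME = GDW_Data1, FILENAME = 'E:\GDW\GDW_Data1.mdf', SIZE = 10240MB ),
        ( NAME = GDW_Data2, FILENAME = 'F:\GDW\GDW_Data2.ndf', SIZE = 10240MB ),
        ( NAME = GDW_Data3, FILENAME = 'G:\GDW\GDW_Data3.ndf', SIZE = 10240MB ),
        ( NAME = GDW_Data4, FILENAME = 'H:\GDW\GDW_Data4.ndf', SIZE = 10240MB )
    LOG ON
        ( NAME = GDW_Log,   FILENAME = 'I:\GDW\GDW_Log.ldf',   SIZE = 4096MB )
    GO

    -- The same guideline applies to tempdb; additional files can be added:
    ALTER DATABASE tempdb
    ADD FILE ( NAME = tempdev2, FILENAME = 'E:\TempDB\tempdev2.ndf', SIZE = 1024MB )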
Remote Site Fail-over with SQL Server and EMC SRDF

USDA Geo-spatial Database – Data Requirements
• 23 terabytes of EMC SAN storage per site (46 TB total storage)
• 2 primary SQL Servers and 2 fail-over servers per site
• 15 TB of image data in SQL Server at the Salt Lake City site, with fail-over to Fort Worth
• 3 TB of vector data in SQL Server at the Fort Worth site, with fail-over to Salt Lake City
• 80 GB of daily updates that need to be processed and pushed to the fail-over site
Solution
• Combination of SRDF and SQL Server Log Shipping
• Initial synchronization using SRDF
• Push updates using SQL Server Log Shipping
• Use SRDF incremental update to fail back after a fail-over
• Use SRDF to move log backups to the remote site
Hardware Infrastructure
(Diagram: site configuration, identical at each site)
Technical Overview – EMC Devices
• The EMC SAN is partitioned into Hyper-Volumes and Meta-Volumes (collections of Hyper-Volumes) through the BIN file configuration
• All drives are either mirrored or RAID 7+1
• Hypers and/or Metas are masked to hosts and are viewable as LUNs to the OS
• EMC devices are identified by Sym ID
• EMC devices are defined as R1, R2, Local, or BCV devices in the BIN file
Technical Overview – Device Mapping
(Screenshot: Windows Device Manager and SYMPD LIST output)
Technical Overview – SRDF 1
SRDF provides track-to-track data mirroring between remote EMC SAN devices. BCVs are for local copies.
• Track-to-track replication (independent of the host)
• The R1 device is the source
• The R2 device is the target
• R2 is read/write disabled until the mirror is split
Technical Overview – SRDF 2
• Synchronous mode
• Semi-Synchronous – synchronous with some lag
• Adaptive Copy mode – asynchronous
• Adaptive Copy A – asynchronous with a guaranteed write sequence using buffered track copies
Note: only Adaptive Copy A requires additional storage space. All other SRDF replication modes simply keep a table of the tracks that have changed.
Technical Overview – SRDF 3
• SRDF replicates by Sym device (Hyper or Meta).
• SRDF devices can be "grouped" for synchronizing.
• SQL Server databases are replicated "by database", or "by groupings of databases" if TSIMSNAP2 is used.
(Diagram: R1 devices on the Primary Host are mirrored to R2 devices on the Fail-over Host; R1 Group A holds Database 1 and R1 Group B holds Database 2.)
Process Overview
• Initial synchronization using SRDF in Adaptive Copy mode (all database files).
• Use TSIMSNAP(2) to split the SRDF group after synchronization is complete.
• Restore the fail-over databases using TSIMSNAP(2) after splitting the SRDF mirror.
• Use SQL Server Log Shipping to push all updates to the fail-over server (after the initial sync).
• The fail-over database is up and running at all times, giving you confidence that the fail-over server is working.
Planning
• Install SQL Server and the system databases on the Primary and Fail-over servers (on local, non-replicated devices)
• Create user databases on R1 devices (MDF, NDF and LDF) on the Primary host
• Important: don't share devices among databases if you need to keep the databases independent for fail-over and fail-back (the query sketched below can help verify file placement)
• Database volumes can be drive letters or mount points
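To double-check that no R1 device is shared between databases that must fail over independently, the file placement of every database can be listed from the SQL Server 2000 catalog. This is a hedged, generic query; it assumes nothing beyond the standard master.dbo.sysaltfiles table.

    -- Hedged sketch: list every database's physical files so you can confirm
    -- that each database's MDF/NDF/LDF files sit on volumes (drive letters or
    -- mount points) backed by R1 devices dedicated to that database.
    SELECT  DB_NAME(dbid) AS database_name,
            name          AS logical_name,
            filename      AS physical_path
    FROM    master.dbo.sysaltfiles
    ORDER BY filename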
Initial Step
• Create databases on R1 devices
• Load data

Synchronize to Fail-over Host 1
• Create an SRDF group for the database on R1
• Set the group to Adaptive Copy mode
• Establish the SRDF mirror to R2
Synchronize to Fail-over Host 2
• Wait until the Adaptive Copy is "synchronized"
• Use the TSIMSNAP command to split the SRDF group after device synchronization is complete. Use TSIMSNAP2 for multiple databases.
• TSIMSNAP writes metadata about the databases to R1, which is used for recovering the databases on the R2 host.
(Diagram: break the mirror, write the metadata)
Attach Database on Fail-over Host
• Verify SQL Server is installed and running on the Fail-over host.
• Mount the R2 volumes on the remote host.
• Run the TSIMSNAP RESTORE command on the Fail-over host, specifying either standby (read-only) or norecovery mode (see the check sketched below).
  • The database is now available for log shipping on the fail-over host.
  • The SRDF mirror is now broken, but track changes are still tracked (for an incremental mirror and/or for fail-back).
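The choice between standby and norecovery determines what the fail-over database can do between log restores. A quick way to confirm the resulting state is the standard DATABASEPROPERTYEX function; the database name 'GDW_Images' below is a placeholder, and the TSIMSNAP syntax itself is not shown here.

    -- Hedged check of the fail-over database's state after the TSIMSNAP restore.
    -- A standby database is readable (Updateability = READ_ONLY, IsInStandBy = 1)
    -- and still accepts RESTORE LOG; a norecovery database reports a restoring
    -- status and accepts only further restores.
    SELECT  DATABASEPROPERTYEX('GDW_Images', 'Status')        AS status,
            DATABASEPROPERTYEX('GDW_Images', 'Updateability') AS updateability,
            DATABASEPROPERTYEX('GDW_Images', 'IsInStandBy')   AS is_in_standby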
Log Shipping – at Primary Site
• Put the log shipping volume on a separate R1 device (not the same device as the database R1)
• Create a log backup maintenance plan to back up logs to the log shipping volume, which is an R1 device (see the sketch below)
• Set the R1 to Adaptive Copy mode
• Establish the R1/R2 mirror; logs automatically get copied to R2
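The log backup step of that maintenance plan amounts to a routine BACKUP LOG to the log-shipping volume. The sketch below is a hedged illustration only: the database name 'GDW_Images' and the L:\LogShip path (standing in for the R1-backed log-shipping volume) are placeholders.

    -- Hedged sketch of the scheduled log backup to the log-shipping R1 volume.
    -- The file name embeds a timestamp so successive backups do not collide.
    DECLARE @file varchar(260)
    SET @file = 'L:\LogShip\GDW_Images_'
              + CONVERT(varchar(8), GETDATE(), 112)                       -- yyyymmdd
              + REPLACE(CONVERT(varchar(8), GETDATE(), 108), ':', '')     -- hhmmss
              + '.trn'
    BACKUP LOG GDW_Images TO DISK = @file WITH INIT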
Log Shipping – at Fail-over Site
• BCV (mirror) of the R2 device
• Schedule a script that splits and mounts the BCV, then restores the logs to the SQL Server database(s) (see the sketch below)
• Flush, un-mount, and re-establish the BCV mirror after the logs have been restored
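The restore step of that scheduled script is an ordinary RESTORE LOG against the standby database, reading from the mounted BCV copy of the shipped backups. In this hedged sketch the database name, the M:\LogShip mount path, the backup file name, and the undo file are all placeholders; logs must be applied in sequence.

    -- Hedged sketch of one log restore on the fail-over server.
    RESTORE LOG GDW_Images
    FROM DISK = 'M:\LogShip\GDW_Images_20040615_1200.trn'
    WITH STANDBY = 'M:\LogShip\GDW_Images_undo.dat'   -- keeps the database read-only between restores
    -- (or WITH NORECOVERY if read-only access at the fail-over site is not required)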
Process Overview Summary
• Initial synchronization using SRDF in Adaptive Copy mode.
• Use TSIMSNAP(2) to split the SRDF group after synchronization is complete.
• Use SQL Server Log Shipping to push updates to the fail-over server.
• The fail-over database is up and running at all times, giving you confidence that the fail-over server is working.
Fail-over Process
• Read-only fail-over – Required action: no server action; clients would need to point to the fail-over server.
• Full-update fail-over – Required action: SQL command RESTORE DATABASE DBName WITH RECOVERY (see the sketch below).
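For a full-update fail-over, the last shipped log backup (if any) is applied and the standby database is then recovered so it becomes read/write. A hedged sketch, with placeholder database and file names:

    -- Hedged sketch of a full-update fail-over on the fail-over server:
    -- apply the final available log backup, then bring the database online.
    RESTORE LOG GDW_Images
    FROM DISK = 'M:\LogShip\GDW_Images_final.trn'
    WITH NORECOVERY

    RESTORE DATABASE GDW_Images WITH RECOVERY   -- database is now read/write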
Fail-back Process
• From a read-only fail-over – Required action: none required; point clients back to the Primary.
• From a full-update fail-over – Required actions (steps 2 and 6 are sketched in T-SQL below):
  1. Run the SYMRDF UPDATE command to copy from R2 to R1 in Adaptive Copy mode.
  2. Detach the database on R2 after the update is complete.
  3. Flush and un-mount the volumes on R2.
  4. Run SYMRDF FAILBACK to replicate the final changes back to R1 and write-enable R1.
  5. Mount the R1 volumes.
  6. Attach the database on the Primary host.
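Steps 2 and 6 are plain SQL Server operations; the SYMRDF steps are EMC SYMCLI commands and are not reproduced here. In this hedged sketch the database name and file paths are placeholders.

    -- Step 2, on the fail-over server: detach the database that lives on R2.
    EXEC sp_detach_db @dbname = 'GDW_Images'

    -- Step 6, on the primary server, after SYMRDF FAILBACK has completed and
    -- the R1 volumes are mounted: re-attach the database from its files.
    -- List the MDF first, then any NDF and LDF files.
    EXEC sp_attach_db @dbname    = 'GDW_Images',
                      @filename1 = 'E:\GDW\GDW_Images_Data1.mdf',
                      @filename2 = 'F:\GDW\GDW_Images_Data2.ndf',
                      @filename3 = 'I:\GDW\GDW_Images_Log.ldf'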
Closing Observations
• So far, SQL Server 2000 has met the High Availability objectives.
• Network traffic across the WAN was minimized by shipping only SQL Server log copies once the initial synchronization was completed.
• The dual Nishan fiber-to-IP switches allowed data transfer at about 16 GB/hour, taking full advantage of the DS3. This transfer rate easily met USDA's needs for initial synchronization, daily log shipping, and the fail-back process.
• The working read-only version of the fail-over database meant that the administrators always knew the status of their fail-over system.
• The USDA implementation did not require a large number of BCV volumes, as some other replication schemes do.
Closing Observations
• After the R1/R2 mirror has been split, SRDF continues to track updates to R1 (from normal processing) and R2 (from the log restore process). SRDF is then able to ship only the modified tracks during fail-back or re-synchronization. This process, called an Incremental Establish or an Incremental Fail-back, is much more efficient than a Full Establish or Full Fail-back.
• After fail-back, the R1 and R2 devices will be in sync and ready for log shipping startup with a minimal amount of effort.
• Since all SRDF use (initial synchronization, fail-back, and log shipping) runs in Adaptive Copy mode, performance on the primary server is not impacted.
Software
• SQL Server 2000 Enterprise Edition
• Windows 2000 / Windows Server 2003
• EMC SYM Command Line Interface
• EMC Resource Pack
Call To Action
• Understand your HA requirements
• Work with your SAN vendor to architect and design for the SQL Server deployment
  • Plan your device and database allocation before requesting a BIN file
  • Decide whether to share devices among databases (use TSIMSNAP or TSIMSNAP2); the decision affects convenience, space, and flexibility of operations
  • Stress test the subsystem prior to deploying SQL Server
For more information, please email [email protected]
You can download all presentations at www.microsoft.com/usa/southcentral/
SQL Server Summit
Brought To You By:
© 2004 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.