Microsoft SQL Server on EMC Symmetrix Storage Systems
Version 3.0

• Layout, Configuration and Performance
• Replication, Backup and Recovery
• Disaster Restart and Recovery

Txomin Barturen

Copyright © 2007, 2010, 2011 EMC Corporation. All rights reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

For the most up-to-date regulatory document for your product line, go to the Technical Documentation and Advisories section on EMC Powerlink. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. All other trademarks used herein are the property of their respective owners.

P/N h2203.2

Contents

Preface

Chapter 1  Microsoft SQL Server
  Microsoft SQL Server overview
  Microsoft SQL Server instances and databases
  Microsoft SQL Server logical components
    A SQL Server database
  Data access
  Microsoft SQL Server physical components
    Data files
    Transaction log
  Microsoft SQL Server system databases
  Microsoft SQL Server instances
  Microsoft Windows Clustering installations
  Backup and recovery interfaces — VDI and VSS
  Microsoft SQL Server and EMC integration
  Advanced storage system functionality
  Additional Microsoft SQL Server tools

Chapter 2  EMC Foundation Products
  Introduction
  Symmetrix hardware and EMC Enginuity features
    Symmetrix VMAX platform
    EMC Enginuity operating environment
  EMC Solutions Enabler base management
  EMC Change Tracker
  EMC Symmetrix Remote Data Facility
    SRDF benefits
    SRDF modes of operation
    SRDF device groups and composite groups
    SRDF consistency groups
    SRDF terminology
    SRDF control operations
    Failover and failback operations
    EMC SRDF/Cluster Enabler solutions
  EMC TimeFinder
    TimeFinder/Mirror establish operations
    TimeFinder split operations
    TimeFinder restore operations
    TimeFinder consistent split
    Enginuity Consistency Assist
    TimeFinder/Mirror reverse split
    TimeFinder/Clone operations
    TimeFinder/Snap operations
  EMC Storage Resource Management
  EMC Storage Viewer
  EMC PowerPath
    PowerPath/VE
  EMC Replication Manager
  EMC Open Replicator
  EMC Virtual Provisioning
    Thin device
    Data device
    Symmetrix VMAX specific features
  EMC Virtual LUN migration
  EMC Fully Automated Storage Tiering for Disk Pools
  EMC Fully Automated Storage Tiering for Virtual Pools

Chapter 3  Storage Provisioning
  Storage provisioning
  SAN storage provisioning
    LUN mapping
    LUN masking
    Auto-provisioning with Symmetrix VMAX
    Host LUN discovery
  Challenges of traditional storage provisioning
    How much storage to provide
    How to add storage to growing applications
    How to balance usage of the LUNs
    How to configure for performance
  Virtual Provisioning
    Terminology
    Thin devices
    Thin pool
    Data devices
    I/O activity to a thin device
    Virtual Provisioning requirements
    Windows NTFS considerations
    SQL Server components on thin devices
    Approaches for replication
    Performance considerations
    Thin pool management
    Thin pool monitoring
    Exhaustion of oversubscribed pools
    Recommendations
  Fully Automated Storage Tiering
    Evolution of Storage Tiering
    FAST implementation
  Deploying FAST DP with SQL Server databases
  Deploying FAST VP with SQL Server databases

Chapter 4  Creating Microsoft SQL Server Database Clones
  Overview
  Recoverable versus restartable copies of databases
    Recoverable database copies
    Restartable database copies
  Copying the database with Microsoft SQL Server shutdown
    Using TimeFinder/Mirror
    Using TimeFinder/Clone
    Using TimeFinder/Snap
  Copying the database using EMC Consistency technology
    Using TimeFinder/Mirror
    Using TimeFinder/Clone
    Using TimeFinder/Snap
  Copying the database using SQL Server VDI and VSS
    Using TimeFinder/Mirror
    Using TimeFinder/Clone
    Using TimeFinder/Snap
  Copying the database using Replication Manager
  Transitioning disk copies to SQL Server database clones
    Instantiating clones from consistent split or shutdown images
    Using SQL Server VDI to process the database image
  Reinitializing the cloned environment
  Choosing a database cloning methodology

Chapter 5  Backing up Microsoft SQL Server Databases
  EMC Consistency technology and backup
  SQL Server backup functionality
    Microsoft SQL Server recovery models
    Types of SQL Server backups
  SQL Server log markers
  EMC products for SQL Server backup
    Integrating TimeFinder and Microsoft SQL Server
    EMC Storage Resource Management
  TF/SIM VDI and VSS backup
    Using TimeFinder/Mirror
    Using TimeFinder/Clone
    Using TimeFinder/Snap
  SYMIOCTL VDI backup
    Using TimeFinder/Mirror
  Replication Manager VDI backup
  Saving the VDI or VSS backup to long-term media

Chapter 6  Restoring and Recovering Microsoft SQL Server Databases
  SQL Server restore functionality
    SQL Server – RESTORE WITH RECOVERY
    SQL Server – RESTORE WITH NORECOVERY
    SQL Server – RESTORE WITH STANDBY
  EMC products for SQL Server recovery
  EMC Consistency technology and restore
  TF/SIM VDI and VSS restore
    Using TimeFinder/Mirror
    Using TimeFinder/Clone
    Using TimeFinder/Snap
  SYMIOCTL VDI restore
    Executing TimeFinder restore
    SYMIOCTL with NORECOVERY
    SYMIOCTL with STANDBY
    SYMIOCTL with RECOVERY
  Replication Manager VDI restore
  Applying logs up to timestamps or marked transactions

Chapter 7  Microsoft SQL Server Disaster Restart and Disaster Recovery
  Definitions
    Dependent-write consistency
    Database restart
    Database recovery
    Roll forward recovery
  Considerations for disaster restart and disaster recovery
    Recovery Point Objective (RPO)
    Recovery Time Objective (RTO)
    Operational complexity
    Source server activity
    Production impact
    Target server activity
    Number of copies
    Distance for solution
    Bandwidth requirements
    Federated consistency
    Testing the solution
    Cost
  Tape-based solutions
    Tape-based disaster recovery
    Tape-based disaster restart
  Local high-availability solutions
  Multisite high-availability solutions
  Remote replication challenges
    Propagation delay
    Bandwidth requirements
    Network infrastructure
    Method of instantiation
    Method of re-instantiation
    Change rate at the source site
    Locality of reference
    Expected data loss
    Failback operations
  Array-based remote replication
  Planning for array-based replication
  SQL Server specific issues
  SRDF/S: Single Symmetrix to single Symmetrix
    How to restart in the event of production site loss
  SRDF/S and consistency groups
    Rolling disaster
    Protecting against a rolling disaster
    SRDF/S with multiple source Symmetrix arrays
  SRDF/A
    SRDF/A using single source Symmetrix
    SRDF/A using multiple source Symmetrix
    Restart processing
  SRDF/AR single hop
    Restart processing
  SRDF/AR multi hop
    Restart processing
  Database log-shipping solutions
    Overview of log shipping
    Log shipping considerations
    Log shipping and the remote database
    Shipping logs with SRDF
    SQL Server Database Mirroring
  Running database solutions
    SQL Server transactional replication
  Other transactional systems

Chapter 8  Microsoft SQL Server Database Layouts on EMC Symmetrix
  The performance stack
    Optimizing I/O
    Storage system layout considerations
  SQL Server layout recommendations
    File system partition alignment
    General principles for layout
    SQL Server layout and replication considerations
  Symmetrix performance guidelines
    Front-end connectivity
    Symmetrix cache
    Back-end considerations
    Additional layout considerations
    Configuration recommendations
  RAID considerations
    Types of RAID
    RAID recommendations
    Symmetrix metavolumes
  Host versus array-based striping
    Host-based striping
    Symmetrix based striping (metavolumes)
    Striping recommendation
  Data placement considerations
    Disk performance considerations
    Hypervolume contention
    Maximizing data spread across back-end devices
    Minimizing disk head movement
  Other layout considerations
    Database layout considerations with SRDF/S
    Database cloning, TimeFinder, and sharing spindles
    Database clones using TimeFinder/Snap
  Database-specific settings for SQL Server
    Remote replication performance considerations
    TEMPDB storage and replication

Appendix A  Related Documents
  Related documents

Appendix B  References
  Sample SYMCLI group creation commands

Glossary

Index

Figures
  1. Microsoft SQL Server architecture overview
  2. SQL Server database architecture
  3. SQL Server internal data file logical structure
  4. Multiple SQL Server instance installation directories
  5. SQL Server Enterprise Manager with multiple instances
  6. Connecting to default and named instances
  7. SRDF/CE for MSCS resource group for a SQL Server instance
  8. Symmetrix VMAX logical diagram
  9. Basic synchronous SRDF configuration
  10. SRDF consistency group
  11. SRDF establish and restore control operations
  12. SRDF failover and failback control operations
  13. Geographically distributed four-node EMC SRDF/CE clusters
  14. EMC Symmetrix configured with standard volumes and BCVs
  15. ECA consistent split across multiple database-associated hosts
  16. ECA consistent split on a local Symmetrix system
  17. Creating a copy session using the symclone command
  18. TimeFinder/Snap copy of a standard device to a VDEV
  19. SRM commands
  20. EMC Storage Viewer
  21. PowerPath/VE vStorage API for multipathing plug-in
  22. Output of the rpowermt display command on a Symmetrix VMAX device
  23. Device ownership in vCenter Server
  24. Virtual Provisioning components
  25. Virtual LUN Eligibility Tables
  26. Simple storage area network configuration
  27. LUN masking relationship
  28. Auto-provisioning with Symmetrix VMAX
  29. Virtual Provisioning component relationships
  30. Windows NTFS volume format with Quick Format option
  31. Thin pool display after NTFS format operations
  32. Creation of a SQL Server database
  33. Thin pool display after database creation
  34. SQL Server Management Studio view of the database
  35. Windows event log entry created by SYMAPI event daemon
  36. SQL Server I/O request informational message
  37. FAST relationships
  38. Overview of relationships of filegroups, files and physical drives
  39. Defining Storage Types within Symmetrix Management Console
  40. Tier definition within Symmetrix Management Console
  41. Allocating a Storage Group to a policy in Symmetrix Management Console
  42. Symmetrix Optimizer collect and swap/move windows
  43. SQL Server Performance Warehouse workload view
  44. SQL Server Performance Warehouse virtual file statistics
  45. Read I/O rates for data file volumes
  46. Read latency for data file volumes
  47. FAST generated performance movement plan
  48. FAST policy compliance view
  49. FAST generated compliance movement plan
  50. User approval of a suggested FAST plan
  51. FAST migration in progress
  52. Read latencies for data volumes - post migration
  53. Read I/O rates for data volumes - post migration
  54. Performance Warehouse virtual file statistics - post migration
  55. Comparison of improvement for major metrics
  56. Overview of relationships of filegroups, files and thin devices
  57. Relationship of thin LUNs to data devices and physical drives
  58. Detail of FC_SQL thin pool allocations for bound thin devices
  59. Defining FAST VP storage tiers within Symmetrix Management Console
  60. FAST VP policy definition within Symmetrix Management Console
  61. Allocating a storage group to a policy in Symmetrix Management Console
  62. Symmetrix FAST Configuration Wizard performance time window
  63. Symmetrix FAST Configuration Wizard movement time window
  64. Windows performance counters for reads and writes IOPs
  65. Windows performance counters for read and write latencies
  66. Read and write workload before and during migrations
  67. Read and write latencies before and during migrations
  68. Output from demand association report
  69. Summary of thin pool allocations over time
  70. Storage allocations for a LUN used by a Broker file
  71. Storage allocations for the SATA tier
  72. Read I/O rates for data volumes - Post-FAST VP activation
  73. Comparison of improvement for major metrics
  74. Copying a shutdown SQL Server database with TimeFinder/Mirror
  75. Copying a shutdown SQL Server database with TimeFinder/Clone
  76. Copying a shutdown SQL Server database with TimeFinder/Snap
  77. Copying a running SQL Server database with TimeFinder/Mirror
  78. Copying a running SQL Server database with TimeFinder/Clone
  79. Copying a running SQL Server database with TimeFinder/Snap
  80. Creating a TimeFinder/Mirror backup of a SQL Server database
  81. Sample TimeFinder/SIM backup with TimeFinder/Mirror
  82. Sample SYMIOCTL backup with TimeFinder/Mirror
  83. Creating a backup of a SQL Server database with TimeFinder/Clone
  84. Sample TF/SIM VSS backup using TimeFinder/Clone
  85. Sample SYMIOCTL backup using TF/Clone
  86. Creating a VDI or VSS backup of a SQL Server database with TimeFinder/Snap
  87. Sample TF/SIM backup with TF/Snap
  88. Sample SYMIOCTL backup with TF/SNAP
  89. Using RM to make a backup of a SQL Server database
  90. Attaching a consistent split image to SQL Server
  91. TF/SIM and SQL Server VDI to create a clone
  92. Using SYMIOCTL and SQL Server VDI to create a clone
  93. Using TF/SIM and SQL Server VDI to create a standby database
  94. Using SYMIOCTL and SQL Server VDI to create a standby database
  95. Attaching a cloned database with relocated data and log locations
  96. SQL Query Analyzer executing sp_helpdb
  97. Mapping database logical components to new file locations
  98. Using TF/SIM and VDI to restore a database to a new location
  99. SQL Server Management Studio backup interface
  100. DATABASE BACKUP Transact-SQL execution
  101. Recovery model options for a SQL Server database
  102. Setting recovery model via Transact-SQL
  103. SYMRDB listing SQL database file locations
  104. SYMCLONE query of a device group
  105. Creating a TimeFinder/Mirror VDI or VSS backup of a SQL Server database
  106. Creating a VDI or VSS backup of a SQL Server database with TimeFinder/Clone
  107. TF/SIM VDI backup using TimeFinder/Clone
  108. TF/SIM remote VSS backup using TimeFinder/Clone
  109. Creating a VDI backup of a SQL Server database with TimeFinder/Snap
  110. TF/SIM backup using TimeFinder/Snap
  111. Sample SYMIOCTL backup using TF/Mirror
  112. Using RM to make a TimeFinder replica of a SQL Server database
  113. SQL Enterprise Manager restore interface
  114. Additional Enterprise Manager restore options
284 DATABASE RESTORE Transact-SQL execution..................................... 286 SQL log from attaching a Consistent split image to SQL Server .......... 291 TF/SIM restore process using TimeFinder/Mirror ................................ 292 TF/SIM restore database with TimeFinder/Mirror and NORECOVERY .............................................................................................. 295 SQL Management Studio view of a RESTORING database .................. 297 Restore of incremental transaction log with NORECOVERY ............... 298 TF/SIM restore with TimeFinder/Mirror and STANDBY .................... 300 SQL Management Studio view of a STANDBY (read-only) database . 301 Restore of incremental transaction log with STANDBY ........................ 302 Execution of TimeFinder/Clone restore................................................... 305 TF/SIM restore database with TimeFinder/Clone and NORECOVERY .............................................................................................. 306 SQL Management Studio view of a RESTORING database .................. 307 Restore of incremental transaction log with NORECOVERY ............... 308 TF/SIM restore with TimeFinder/Mirror and STANDBY .................... 310 SQL Enterprise Manager view of a STANDBY (read-only) database .. 311 Restore of incremental transaction log with STANDBY ........................ 312 TF/SIM restore using TimeFinder/Snap ................................................. 315 TimeFinder/Snap listing restore session.................................................. 316 TF/SIM restore with TimeFinder/SNAP and NORECOVERY............ 318 SQL Management Studio view of a RESTORING database .................. 319 Restore of incremental transaction log with NORECOVERY ............... 320 TF/SIM restore with TimeFinder/SNAP and STANDBY..................... 322 SQL Management Studio view of a STANDBY (read-only) database . 
323 Restore of incremental transaction log with STANDBY ........................ 324 Using SYMIOCTL for TimeFinder/Mirror restore of production database ........................................................................................................... 328 SYMIOCTL restore and NORECOVERY.................................................. 331 SQL Management Studio view of a RESTORING database .................. 332 Restore of incremental transaction log with NORECOVERY ............... 333 SYMIOCTL restore and STANDBY.......................................................... 334 SQL Management Studio view of a STANDBY (read-only) database . 335 Restore of incremental transaction log with STANDBY ........................ 336 SYMIOCTL restore and recovery mode ................................................... 337 Replication Manager/Local restore overview ......................................... 338 Microsoft SQL Server on EMC Symmetrix Storage Systems Figures 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 SRDF/S replication process ........................................................................ 365 Rolling disaster with multiple production Symmetrix arrays ............... 370 SRDF consistency group protection against rolling disaster ................. 371 SRDF/S with multiple source Symmetrix and ConGroup protection . 373 SRDF/Asynchronous replication internals .............................................. 375 SRDF/AR single-hop replication internals............................................... 382 SRDF/AR multi-hop replication internals ............................................... 386 Log shipping configuration - destination dialog ..................................... 390 Log Shipping configuration - restoration mode ...................................... 391 Log shipping implementation overview................................................... 
395 Restore of incremental transaction log with NORECOVERY................ 396 Restore of incremental transaction log with STANDBY......................... 397 SQL Server Database Mirroring with SRDF/S ........................................ 399 SQL Server Database Mirroring flow overview – SYNC mode............. 401 The performance stack................................................................................. 412 Relationship between I/O size, operations per second, and throughput...................................................................................................... 421 Performance Manager graph of write pending for single hypervolume .................................................................................................. 427 Performance Manager graph of write pending for four member striped metavolume .......................................................................................428 Comparison of write workload single hyper and striped metavolume.................................................................................................... 429 RAID 5 (3+1) layout detail .......................................................................... 436 Anatomy of a RAID 5 write ........................................................................ 437 Disk performance factors ............................................................................ 447 Microsoft SQL Server on EMC Symmetrix Storage Systems 15 Figures 16 Microsoft SQL Server on EMC Symmetrix Storage Systems Tables Title 1 2 3 4 5 6 7 8 9 10 11 12 13 Page Microsoft SQL Server system databases ...................................................... 35 SYMCLI base commands ............................................................................... 57 TimeFinder device type summary................................................................ 87 Data object SRM commands .......................................................................... 
91 Data object mapping commands .................................................................. 91 File system SRM commands to examine file system mapping ................ 92 File system SRM command to examine logical volume mapping ........... 93 SRM statistics command ................................................................................ 93 Virtual Provisioning terminology............................................................... 129 Storage Class definition................................................................................ 148 A comparison of database cloning technologies ...................................... 242 Database cloning requirements and solutions .......................................... 243 SQL Server data and log response guidelines ......................................... 414 Microsoft SQL Server on EMC Symmetrix Storage Systems 17 Tables 18 Microsoft SQL Server on EMC Symmetrix Storage Systems Preface As part of an effort to improve and enhance the performance and capabilities of its product lines, EMC periodically releases revisions of its hardware and software. Therefore, some functions described in this document may not be supported by all versions of the software or hardware currently in use. For the most up-to-date information on product features, refer to your product release notes. If a product does not function properly or does not function as described in this document, please contact your EMC representative. Note: This document was accurate as of the time of publication. However, as information is added, new versions of this document may be released to the EMC Powerlink website. Check the Powerlink website to ensure that you are using the latest version of this document. Purpose This document describes how the EMC Symmetrix storage system operates and interfaces with Microsoft SQL Server. 
The information in this document is based on Microsoft SQL Server 2005, Microsoft SQL Server 2008, and Microsoft SQL Server 2008 R2 on Symmetrix storage systems running Solutions Enabler Version 7.x, and current releases of Symmetrix Enginuity microcode. This document provides an overview of Microsoft SQL Server 2005, Microsoft SQL Server 2008, and Microsoft SQL Server 2008 R2, along with a general description of EMC products and utilities that can be used for SQL Server administration.

EMC Symmetrix storage systems and EMC software products can be used to manage Microsoft SQL Server environments and to enhance database and storage management, backup/recovery, and restart procedures. Using EMC products and utilities to manage Microsoft SQL Server environments can help reduce database and storage management administration, reduce system CPU resource consumption, and reduce the time required to clone, back up, recover, or restart Microsoft SQL Server databases.

In this document, the product names Microsoft SQL Server 2005, Microsoft SQL Server 2008, and Microsoft SQL Server 2008 R2 may be referred to as SQL Server. The acronym SQL refers to the Structured Query Language, and should not be confused with the product Microsoft SQL Server. The Structured Query Language (SQL) is used within many relational database management systems (RDBMS) to store, retrieve, and manipulate data. Microsoft provides a specific implementation of the SQL language called Transact-SQL (T-SQL). T-SQL is used throughout this document in the various examples.

Microsoft provides extensive documentation on SQL Server through its website, and through the SQL Server Books Online documentation set. This should be the primary source of information on SQL Server-specific commands and T-SQL syntax.
The Books Online documentation may be installed through the SQL Server installation process, and may also be installed independently of the SQL Server database engine. Updated versions of the Books Online documentation are available for free download from the Microsoft SQL Server website at http://www.microsoft.com/sql.

Audience

The intended audience is SQL Server systems administrators, database administrators, and storage management personnel responsible for managing SQL Server systems.

Conventions used in this document

EMC uses the following conventions for special notices.

Note: A note presents information that is important, but not hazard-related.

CAUTION: A caution contains information essential to avoid data loss or damage to the system or equipment.

IMPORTANT: An important notice contains information essential to operation of the software or hardware.

Typographical conventions

EMC uses the following type style conventions in this document:

Normal
Used in running (nonprocedural) text for:
• Names of interface elements (such as names of windows, dialog boxes, buttons, fields, and menus)
• Names of resources, attributes, pools, Boolean expressions, buttons, DQL statements, keywords, clauses, environment variables, functions, utilities
• URLs, pathnames, filenames, directory names, computer names, links, groups, service keys, file systems, notifications

Bold
Used in running (nonprocedural) text for:
• Names of commands, daemons, options, programs, processes, services, applications, utilities, kernels, notifications, system calls, man pages
Used in procedures for:
• Names of interface elements (such as names of windows, dialog boxes, buttons, fields, and menus)
• What the user specifically selects, clicks, presses, or types

Italic
Used in all text (including procedures) for:
• Full titles of publications referenced in text
• Emphasis (for example, a new term)
• Variables

Courier
Used for:
• System output,
such as an error message or script
• URLs, complete paths, filenames, prompts, and syntax when shown outside of running text

Courier bold
Used for:
• Specific user input (such as commands)

Courier italic
Used in procedures for:
• Variables on command line
• User input variables

<> Angle brackets enclose parameter or variable values supplied by the user
[] Square brackets enclose optional values
| Vertical bar indicates alternate selections - the bar means "or"
{} Braces indicate content that you must specify (that is, x or y or z)
... Ellipses indicate nonessential information omitted from the example

Your feedback on our TechBooks is important to us! We want our books to be as helpful and relevant as possible, so please feel free to send us your comments, opinions and thoughts on this or any other TechBook: [email protected]

1 Microsoft SQL Server

This chapter presents these topics:
◆ Microsoft SQL Server overview ... 24
◆ Microsoft SQL Server instances and databases ... 26
◆ Microsoft SQL Server logical components ... 27
◆ Data access ... 29
◆ Microsoft SQL Server physical components ... 31
◆ Microsoft SQL Server system databases ... 35
◆ Microsoft SQL Server instances ... 36
◆ Microsoft Windows Clustering installations ... 40
◆ Backup and recovery interfaces — VDI and VSS ... 42
◆ Microsoft SQL Server and EMC integration ... 43
◆ Advanced storage system functionality ... 45
◆ Additional Microsoft SQL Server tools ... 47

Microsoft SQL Server overview

Microsoft SQL Server is Microsoft Corporation's premier relational database management system (RDBMS). Developed over several years, Microsoft SQL Server has grown to become one of the most scalable and best-performing database systems currently available, as shown by several industry-leading Transaction Processing Performance Council (TPC) benchmarks.

The origin of the RDBMS stems from initial co-development work between Microsoft and Sybase, which focused on developing a database management system for the then-evolving OS/2 environment. Later, a variation of this initial release became available on Microsoft's LAN Manager platform. In 1995, Microsoft developed and released Microsoft SQL Server 6.0 on the Windows platform. In the intervening years, a number of subsequent releases providing increased functionality, performance, and scalability have been widely adopted.

As of February 2011, Microsoft SQL Server 2008 R2 is the latest in a series of SQL Server product releases that specifically cater to the Microsoft Windows platform. Currently, Microsoft SQL Server does not support any platform other than Windows Server.

In 2003, Microsoft introduced support for the Itanium 64-bit version of Windows Server; coinciding with this support, Microsoft released an Itanium 64-bit (IA-64) version of Microsoft SQL Server 2000. In April of 2010, Microsoft announced that support of Itanium-based systems would be limited to the then-current version, Windows Server 2008 R2. As a result of this position, no future versions of Microsoft SQL Server are expected to be developed for Itanium-based implementations. Ongoing support for existing deployments will be maintained in accordance with Microsoft's product lifecycle guidelines.
Microsoft SQL Server introduced support for native 64-bit EM64T/AMD64 environments with the introduction of Microsoft SQL Server 2005. The EM64T/AMD64 environment is generically referred to as x64, and has become the main Windows platform for Windows Server 2008. Indeed, as of Windows Server 2008 R2, 32-bit architectures ceased to be supported as target platforms. Subsequently, future SQL Server releases will only be made available on the x64 platform.

Further information and support may be found at the appropriate Microsoft product and support website locations.

Microsoft SQL Server instances and databases

A Microsoft SQL Server instance is defined as a discrete set of files and executables that operate autonomously to provide service. SQL Server supports multiple instances operating independently on a given operating system. Within any given SQL Server instance there are typically a number of independent user databases. It is important to understand this relationship of database-to-SQL Server instance, and the capability to support multiple SQL Server instances on the server.

A single database cannot physically exist across multiple database instances, but can be logically represented across multiple instances and/or servers. This logical extension of a single database across multiple SQL Server instances is referred to as a federated database environment. This style of architecture utilizes distributed partitioned views to create a single logical database entity. However, while the database appears to be a single entity, each member server in the federated environment maintains a discrete set of data files and transaction logs for its database instance.
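As a hedged illustration of the distributed partitioned view approach described above (the server, database, table, and column names here are hypothetical, not taken from this document), each member server would define a view along these lines:

```sql
-- Hypothetical sketch of a distributed partitioned view.
-- Each member server holds one horizontal partition of the Customers data,
-- enforced by a CHECK constraint on the partitioning column (CustomerID).
-- Linked servers NODE2 and NODE3 are assumed to be defined already
-- (for example, via sp_addlinkedserver).

-- On member server NODE1:
CREATE VIEW dbo.Customers_All
AS
SELECT * FROM NODE1.SalesDB.dbo.Customers_1   -- CustomerID      1- 99999
UNION ALL
SELECT * FROM NODE2.SalesDB.dbo.Customers_2   -- CustomerID 100000-199999
UNION ALL
SELECT * FROM NODE3.SalesDB.dbo.Customers_3;  -- CustomerID 200000-299999
```

Consistent with the description above, only the view presents the partitions as a single logical table; each member server still maintains its own data files and transaction log for its partition.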
Microsoft SQL Server logical components

Microsoft SQL Server may be divided into two main logical components and a number of subsidiary subsystems:
◆ Relational engine — Responsible for verifying SQL statements, and selecting the most efficient means of retrieving the data requested
◆ Storage engine — Responsible for executing physical I/O requests, which return the rows requested by the relational engine
Together, these components create a complete relational database environment providing continuous availability and data integrity. Figure 1 on page 27 provides an overview of the Microsoft SQL Server architecture.

Figure 1 Microsoft SQL Server architecture overview

A SQL Server database

A SQL Server database exists as a collection of physical objects (data files and transaction log) that contain the data in the form of tables. As previously mentioned, it is possible to create many databases within a SQL Server instance. Typically, within any SQL Server instance, there are by default four system databases and one or more user databases. Each database has a defined owner who may grant or revoke access permissions to other users. Typical SQL Server logical database objects are:
◆ Tables
◆ Indexes
◆ Views
The database owner is associated with a user within a database instance. All permissions and ownership of objects in the database are controlled by the owner's user account. Typically, the owner of a SQL Server database is referred to as user dbo.
This means that, for example, a user xyz of database myDB_1 and a user abc of database myDB_2 may both be referred to as the dbo user for their respective databases. Microsoft SQL Server 2005 and 2008 include the ability to use Integrated Windows Security with the Windows Active Directory. This allows access and ownership to be defined using Windows login accounts stored within the Active Directory, rather than the earlier SQL Server internal user accounts. It is possible to run SQL Server authentication, integrated Windows-based authentication, or a combination of both.

Note: It is the responsibility of the various programs and utilities to utilize the appropriate authentication methods. Applications may fail to function correctly if they do not support the installed authentication method. Implemented authentication methods can be changed through the SQL Server properties page.

Data access

Microsoft SQL Server utilizes a page as the basic physical representation of data. A SQL Server data page is 8 KB (8192 bytes). Data pages are then logically aggregated to form tables, which are collections of data suitable for quick reference. Each table is a data structure defined with a table name and a set of columns and rows, with data occupying each cell formed by a row/column intersection. A row is a collection of column information corresponding to a single record.

In SQL Server, an index is an optional structure associated with a table that may improve data retrieval performance. Indexes are created on one or more columns of a table. Indexes are useful when an application often needs to query a table for a range of rows or a specific row. There are two forms of indexes within SQL Server: a clustered index and a non-clustered index. There can be only one clustered index for a given table, as the clustered index defines the order in which the data is stored in the table.
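The two index forms can be shown in a minimal T-SQL sketch (the table and index names below are hypothetical, not from this document):

```sql
-- Hypothetical example of the two index forms.
CREATE TABLE dbo.Orders (
    OrderID    INT      NOT NULL,
    CustomerID INT      NOT NULL,
    OrderDate  DATETIME NOT NULL
);

-- Only one clustered index is allowed per table;
-- it defines the order in which rows are physically stored.
CREATE CLUSTERED INDEX IX_Orders_OrderID
    ON dbo.Orders (OrderID);

-- Any additional indexes must be non-clustered; these are
-- independent of the table data and can be created or dropped freely.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
    ON dbo.Orders (CustomerID);
```

Had the clustered index been omitted, dbo.Orders would remain a heap, with rows stored in no defined order.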
Non-clustered indexes are logically and physically independent of the table data and can therefore be created or dropped at any time. If no clustered index is defined for a table, then the table is referred to as a heap.

A view is best considered a virtualized table, though it should be noted that a view does not persist as an object within the database. The Transact-SQL statement that defines the view is stored, and the view is materialized when referenced. This can be useful to generate a large virtual table that joins data from different tables.

A filegroup is a named storage pool that aggregates the physical database data files. As shown in Figure 2 on page 30, one or more table and index structures make up the database files of a filegroup. The data is stored logically in filegroups and physically in data files that are associated with the corresponding filegroups.

Figure 2 SQL Server database architecture

Note: Every SQL Server instance contains four system databases named master, tempdb, msdb, and model, which a SQL Server instance creates automatically when it is installed. The master database contains the data dictionary tables and views for the entire SQL Server instance in addition to security information relating to the user databases defined within the instance.

Microsoft SQL Server physical components

Data files

SQL Server maintains its data and index information in data files. Figure 3 on page 32 represents the physical layout of a single data file object, and shows the relationship of pages and extents.
A page is a logical storage structure that is the smallest unit of storage and I/O used by the SQL Server database. The data page size for both SQL Server 2005 and 2008 is 8 KB (8192 bytes). When data is added to a given table (or index), space must be allocated for the new data. SQL Server allocates additional space within the relevant data file in a unit size of eight contiguous data pages. SQL Server refers to this eight 8 KB page allocation as an extent. Therefore the extent size for SQL Server is 64 KB (65536 bytes). Extents are either mixed or uniform, where mixed extents contain both data and index pages which may belong to multiple tables within the database, and uniform extents only contain data or index pages for a single table. Typically, as data is added to a table within a database, only uniform extents are used to store the relevant index or data pages. As shown in Figure 2 on page 30, several data files may be aggregated into filegroups. Utilizing this functionality provides a facility to constrain given data tables and/or indexes to a particular filegroup. There is always at least one filegroup, the PRIMARY filegroup. As data or index information is added to a table or index, extents are allocated from the data files which represent the filegroup wherein the table or index exists. The first file created in the first (PRIMARY) filegroup for the database is unique in that it contains additional information (metadata) regarding the structure of the database itself. This file is referred to as the primary file, and typically is assigned the .mdf extension to signify this functionality. Subsequent data files are assigned the .ndf extension and log files are assigned the .ldf extension. In actuality, file extensions are somewhat irrelevant to SQL Server itself, and are provided simply as a means to make the function of the file more obvious to the user/administrator. 
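The file and filegroup concepts above can be drawn together in a hedged T-SQL sketch (the database name, file names, paths, and sizes are illustrative assumptions only):

```sql
-- Hypothetical example: a database with a primary file (.mdf), a named
-- filegroup containing two equally sized secondary files (.ndf), and a
-- transaction log file (.ldf).
CREATE DATABASE SalesDB
ON PRIMARY
    (NAME = SalesDB_sys,   FILENAME = 'E:\MSSQL\Data\SalesDB_sys.mdf',   SIZE = 100MB),
FILEGROUP UserData
    (NAME = SalesDB_data1, FILENAME = 'F:\MSSQL\Data\SalesDB_data1.ndf', SIZE = 500MB),
    (NAME = SalesDB_data2, FILENAME = 'G:\MSSQL\Data\SalesDB_data2.ndf', SIZE = 500MB)
LOG ON
    (NAME = SalesDB_log,   FILENAME = 'H:\MSSQL\Log\SalesDB_log.ldf',    SIZE = 250MB);
```

Sizing the two secondary files in the UserData filegroup equally complements the proportional fill allocation behavior discussed in this chapter, giving a more even round-robin distribution of extents.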
Figure 3 SQL Server internal data file logical structure

It is possible to allocate specific data or index information to a named (other than the default) filegroup, such that the named filegroup will only contain the given data or index information. This can be used to isolate particular data or index information that is unique in some manner, or has a different I/O profile and therefore requires a specific storage location. However, this is not generally required in the majority of SQL Server deployments, and Microsoft fully supports mixing both data and index information within the filegroups—this is the default behavior.

When allocating new data or index storage space, SQL Server uses extents allocated from the filegroup in a proportional fill fashion, such that data and index information are evenly distributed amongst the files within the filegroup as data is added. This ensures a more even distribution of data and index information, and therefore of I/O load. Proportional fill ensures that over time, all the data files have storage allocated based on the available free space in each data file. As such, if one data file within a filegroup has twice as much free space as the other data files within the same filegroup, twice as many extent allocations will be made from the larger data file. It is therefore optimal to size all data files within a filegroup equally, as this results in a more even round-robin allocation of space across all the data files.

Transaction log

In addition to the data and index storage areas, which are the data files, Microsoft SQL Server also creates and maintains a transaction log for each database.
The transaction log maintains records relating to changes within the data files. It is possible to define multiple transaction log files for a given SQL Server database; however, only one transaction log file actively receives writes at any given time. SQL Server takes a serial approach to logging, such that the first log file is fully written before switching to another transaction log file. The transaction log does not use allocations such as the extents used by the data files. Once created, any single physical transaction log is logically segmented into virtual logs based on internal SQL Server algorithms and the initial size of the transaction log.

Once transactional activity begins, transactional information is recorded into a virtual log within the physical log file. A log sequence number (LSN) is assigned to each transaction log record. Additional information is also recorded in the transaction log record, including the transaction ID of the transaction to which this record belongs, as well as a pointer to any preceding log record for the transaction.

The transaction log is also considered to have an active log component. This active log area is the sequence of log records (which may span multiple virtual logs) relating to transactions in process. This is required to facilitate any recovery operations on pending transactions at the database level. Those virtual logs that contain data relating to the active log portion cannot be marked for reuse until their state changes.

The information recorded within the transaction log records is used by Microsoft SQL Server to resolve inconsistencies within the database, either in an operational state or subsequent to an unexpected server failure. In general these are:
◆ rollback operations (when data must be returned to the state that existed before a transaction executed).
◆ roll forward changes into the data files (when a transaction has completed successfully but the data files have not yet been updated).

Microsoft SQL Server maintains relational integrity by utilizing in-memory structures and the data recorded within the transaction log, combined with the data in the data files. Transaction log records always contain any updates to data pages that have been modified by committed transactions. They may also contain updated pages that belong to a transaction that has not yet been committed. In the event of a server crash, SQL Server utilizes the information recorded in the transaction log to return to a transactionally consistent point in time.

To ensure that the log always maintains the current state of data, the transaction log is written to synchronously, bypassing all file system buffering. Once updates are recorded in the transaction log, the subsequent updates to the data files may occur asynchronously. The state of the data files and the transaction log is therefore not synchronized, but the log always maintains information ahead of the data files. A point of consistency is created by SQL Server when a checkpoint occurs. A checkpoint forces updated (or dirty) pages out of the memory data buffers and onto disk. At the point when a checkpoint completes, the state of the transaction log and the data files is consistent. The log is required to identify the state of data pages belonging to those transactions that have not been committed and therefore could potentially be rolled back if they abort.

Microsoft SQL Server system databases

Microsoft SQL Server maintains a number of system databases which are used internally by various systems and functions. A list of these databases is provided in Table 1 on page 35, including a description of the function of each database.
Table 1 Microsoft SQL Server system databases

Database Name / Function
MASTER: Records system information on all user databases, login accounts, security information, etc.
MODEL: Default blank template for new databases.
TEMPDB: Used to maintain temporary sort areas, stored procedures, etc. This database is rebuilt each time the SQL Server instance is started.
MSDB: Used for recording backup/restore events, scheduling, and alerts.

System databases themselves comprise a data file and a transaction log. In most instances, there is minimal activity to the system databases, so placement is not performance critical. There may, however, be operational issues that dictate that these databases be placed on SAN devices. As an example, Microsoft Failover Clustering requires that these system databases be located on a shared SAN environment to facilitate restart operations.

Of all the system databases, TEMPDB has specific functionality that differentiates it from the other system databases. The TEMPDB data file(s) and transaction log(s) may need to be appropriately located, as they can be the destination of significant I/O load. In general, the default location of these databases is on the %SYSTEMDRIVE% drive. Discussion of placement and requirements for these databases is covered in Chapter 8, "Microsoft SQL Server Database Layouts on EMC Symmetrix."

Microsoft SQL Server instances

It is possible to install multiple separate instances of the SQL Server software on a given Windows Server environment. In general, SQL Server can be installed as a default instance, or as a named instance. Each instance can be considered a completely autonomous SQL Server environment. The instances can be started or shut down independently, except for actions such as a Windows Server shutdown, which will obviously affect all instances.
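When working with multiple autonomous instances, a connected session can confirm which instance it is using. The following short T-SQL sketch uses standard built-in functions (shown here as an illustration, not an example from this document):

```sql
-- @@SERVERNAME returns the server\instance name of the current instance;
-- SERVERPROPERTY('InstanceName') returns the instance name alone,
-- or NULL when connected to the default instance.
SELECT @@SERVERNAME                   AS server_instance,
       SERVERPROPERTY('InstanceName') AS instance_name;
```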
Each instance is installed with its own system databases, as previously described, and a separate set of executables for the server components. Thus, it is possible to implement completely separate security mechanisms for each environment, and even to run each instance at a different version or SQL Server service pack level.

In Figure 4 on page 37, two instances have been installed on a single Windows Server, resulting in two sets of executables. The default instance is referenced by the MSSQL10.MSSQLSERVER directory; the second, named instance was called DEV and is therefore referenced as MSSQL10.DEV.

Figure 4 Multiple SQL Server instance installation directories

Because multiple instances can execute on a given Windows Server, Microsoft extended its client connectivity to allow the specific SQL Server instance required to be defined. The default SQL Server instance is referenced by simply defining the connection to be the Windows Server hostname itself. For a named instance, the server name needs to include the instance name. All Microsoft management tools support and display multiple instances installed on a server. In Figure 5 on page 38, Enterprise Manager is used to display the two SQL Server instances created on the LICOC211 server.

Figure 5 SQL Server Enterprise Manager with multiple instances

For example, on the sample Windows Server (named LICOC211) documented in Figure 6 on page 39, it is possible to connect to the two instances from the osql command line interface. Each connection specifies the relevant instance: a connection to the default instance provides only the server name, while a connection to the named instance DEV appends the instance name to the server name.
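The naming scheme (the host name alone for the default instance, host\instance for a named one) can be sketched as follows. The helper function is hypothetical and is not part of any Microsoft tool:

```python
# Hypothetical helper illustrating SQL Server instance naming; not part of
# any Microsoft tool. The default instance is addressed by the host name
# alone; a named instance is addressed as HOST\INSTANCE.

def server_string(host, instance=None):
    return host if instance is None else f"{host}\\{instance}"

print(server_string("LICOC211"))          # LICOC211      -> default instance
print(server_string("LICOC211", "DEV"))   # LICOC211\DEV  -> named instance
```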
In both cases, a statement is executed to return the name of the server instance.

Figure 6 Connecting to default and named instances

All EMC products that interact with Microsoft SQL Server support multiple SQL Server instances on a given Windows server.

Microsoft Windows Clustering installations

SQL Server supports installation in Microsoft Windows Cluster deployments, both Windows Server 2003 Cluster Service (MSCS) and Windows Server 2008 Failover Cluster environments. When installed into a cluster configuration, the installation locations differ from those of a standard local installation.

Note: As of Windows Server 2008, the name for the clustering functionality was changed to Windows Failover Clustering. For the purposes of this document, MSCS and Failover Clustering are considered synonymous, except where indicated.

All data and transaction log files, including those for the system databases, must be located on disk devices viewed by the cluster as shared disks. This requirement ensures that all nodes within a given cluster are able to access all required databases within the instance. By default, the SQL Server instance implements a resource dependency on the shared disk resources installed within the resource group. This dependency must be maintained: any modification to the database structure that introduces an additional disk resource, such as adding a data file on a new shared LUN, must be reflected within the resource group by adding the new disk as a resource within the group and making it a dependency for SQL Server.

Solutions built using EMC's geographically dispersed cluster product SRDF®/CE for MSCS implement identical functionality and restrictions. In Figure 7 on page 41, an SRDF/CE for MSCS implementation is displayed.
The dependencies described above must still be maintained, but the disks now additionally depend on the SRDF/CE for MSCS resource. In this manner, the states of the RDF devices are managed appropriately before upper-level services attempt to start. In many ways, the implementation of SRDF/CE for MSCS is transparent to a standard MSCS installation. Figure 7 on page 41 details an MSCS resource group. In this instance, the view is actually of an SRDF/CE for MSCS resource group, and the SRDF/CE for MSCS resource may be seen as a group member.

Figure 7 SRDF/CE for MSCS resource group for a SQL Server instance

Unlike the shared data files and logs for all the databases within the SQL Server instance, the SQL Server executables are installed locally on each node within the cluster, in a layout similar to that described in the preceding section. This cluster requirement is also imposed for EMC's geographically dispersed cluster implementation, SRDF/CE for MSCS.

Backup and recovery interfaces — VDI and VSS

Microsoft SQL Server provides application programming interfaces (APIs) to control the interaction of SQL Server backup/restore operations with a disk mirror split. These programmatic interfaces are utilized by vendors such as EMC to enable specific storage array functionality. For EMC, this facilitates integration of backup/restore operations with disk mirroring technology for local (TimeFinder®) or remote (SRDF) execution.

The first API is the Virtual Device Interface (VDI). This interface manages the required operations to perform a controlled quiesce of I/O operations to the data files and transaction logs before suspending write activity, such that disk mirrors may be split. In addition to the VDI interface, Microsoft SQL Server 2008 provides support for the Volume Shadow Copy Service (VSS).
Similar in nature to the VDI implementation, VSS allows applications to implement Writer components that can interact with and control database states. Both VDI- and VSS-based implementations result in valid, SQL Server-compliant backup states being created on the target storage devices. Both implementations provide support for Symmetrix® BCV/R2/Clone or SNAP devices as implemented within the specific backup application.

Note: Applications may choose to implement only one form of the backup API. The resulting solutions are still considered valid backup environments. Specific application documentation outlines the supported interfaces.

Microsoft SQL Server and EMC integration

Operationally, the primary level of integration is provided around backup/recovery and restart operations. EMC provides a number of utilities that can be used to facilitate the administration of backup and recovery procedures. These utilities are designed to reduce operating system overhead and mitigate the amount of time required to perform backup and recovery operations. Products such as the EMC TimeFinder Integration Module for Microsoft SQL Server (TF/SIM), EMC Replication Manager (RM/Local), and the EMC NetWorker® Module for SQL Server facilitate low-impact backup and recovery procedures that utilize disk mirror technologies to optimize execution time and enhance availability.

The backup images created by these products represent a valid backup image for a SQL Server database by using either SQL Server's Virtual Device Interface (VDI) or Volume Shadow Copy Service (VSS). When executed, the VDI or VSS interfaces cause the SQL Server instance to issue a checkpoint for the respective databases (which ensures a consistent on-disk image of the database) and suspend write activity to the transaction log and data files for the duration of a disk mirror split.
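The quiesce sequence just described (checkpoint, suspend writes, split the mirror, resume) can be sketched as a simple ordered sequence. The step names are invented for illustration and do not correspond to actual VDI or VSS calls:

```python
# Toy ordering model of a mirror-split backup (step names invented for
# illustration; these are not actual VDI or VSS calls).

def split_backup_sequence():
    steps = []
    steps.append("checkpoint")      # flush dirty pages: consistent on-disk image
    steps.append("suspend_writes")  # freeze I/O to data files and transaction log
    steps.append("split_mirror")    # array splits the BCV/clone while I/O is frozen
    steps.append("resume_writes")   # production I/O continues
    return steps

seq = split_backup_sequence()
# The split must occur strictly inside the frozen window.
assert seq.index("suspend_writes") < seq.index("split_mirror") < seq.index("resume_writes")
print(seq)
```

The invariant checked at the end is the whole point of the APIs: the mirror is split only while writes are suspended, so the split copy is a valid backup state.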
Once the split is complete, I/O activity is resumed and a backup indicator is entered into the system databases. This backup marker is a critical requirement for initiating incremental transaction log backups for those databases that use the full recovery model. It is necessary to process log backups in this environment to ensure that the transaction log does not run out of space.

These disk mirror-based backup images may also be used to satisfy additional business requirements such as reporting database instances or log-shipping instances. Utilizing EMC Symmetrix Remote Data Facility (SRDF), customers can create or migrate the backup images to remote arrays, providing the capability of creating disaster recovery or disaster restart solutions.

EMC provides a series of solutions based on consistency technology, which can be used to provide a valid restart point for individual databases or SQL Server instances. More importantly, this functionality allows customers to create larger federated restart solutions. These are generally represented by a number of disparate and/or heterogeneous database environments tied together to create a business application or process.

Restart solutions provide functionality beyond standard backup/recovery processes. It is often impossible to utilize standard backup and recovery processes to create a business point of consistency across multiple environments.

Advanced storage system functionality

EMC Symmetrix VMAX™ storage arrays provide additional value for Microsoft SQL Server environments by introducing support for advanced functionality such as Virtual Provisioning™, Enterprise Flash Drives (EFD), and Fully Automated Storage Tiering (FAST).
These features assist in optimizing storage infrastructure for a range of business demands, and are fully supported for Microsoft SQL Server implementations.

Virtual Provisioning, generally known in the industry as "thin provisioning," enables organizations to improve ease of use, enhance performance, and increase capacity utilization for certain applications and workloads. The implementation of Virtual Provisioning for Symmetrix DMX-3, DMX-4, and VMAX storage arrays directly addresses improvements in storage infrastructure utilization, as well as associated operational requirements and efficiencies. These efficiencies can be achieved in a Microsoft SQL Server environment when the database files are located on virtually provisioned storage devices from a Symmetrix array.

With the introduction of Enterprise Flash Drives in the Symmetrix DMX and VMAX storage arrays, EMC created a new "Tier 0" ultra-performance storage tier that removes the performance limitations previously imposed by magnetic disk drives. EFDs provide substantial total cost of ownership (TCO) advantages over traditional disk drives by virtue of lower power consumption, reduced weight, and lowered heat dissipation requirements. By combining EFDs with EMC technology and advanced Symmetrix functionality, organizations now have options previously unavailable from any enterprise storage vendor. EFDs dramatically increase performance for read-latency-sensitive applications implemented with Microsoft SQL Server. EFDs, also known as solid state drives (SSDs), contain no moving parts and appear as standard Fibre Channel drives to existing Symmetrix management tools, allowing administrators to manage Tier 0 without special processes or custom tools.
Tier 0 Enterprise Flash storage is ideally suited to applications with high transaction rates and those requiring the fastest possible retrieval and storage of data, such as currency exchange and electronic trading systems, or real-time data acquisition and processing. A Symmetrix or VMAX array with Flash drives can deliver single-millisecond application response times and significantly more input/output operations per second (IOPS) than traditional Fibre Channel hard disk drives (HDDs). Additionally, because there are no mechanical components, EFDs consume up to 98 percent less energy per I/O than traditional hard disk drives.

Finally, with the introduction of differing disk storage types, including EFD, Fibre Channel, and Serial ATA (SATA), each can be optimized for specific operational and business needs. Manually provisioning the appropriate storage for needs that change over time became an increasingly complex and time-consuming task for administrators. This manual approach to storage tiering can now be automated using Fully Automated Storage Tiering (FAST). FAST uses policies to manage sets of logical devices and available storage types. Based on the policy guidance and the actual workload profile over time, the FAST controller recommends, and can even execute automatically, the movement of managed devices between storage types. These technologies are discussed in detail in subsequent chapters of this document.

Additional Microsoft SQL Server tools

Throughout this document, reference is made to the tools and services provided by Microsoft SQL Server and deployed by default:

◆ SQL Server Management Studio for SQL Server 2005 and SQL Server 2008. This is the graphical user interface for the management and maintenance functions of SQL Server.

◆ Query Analyzer.
In SQL Server 2005 and SQL Server 2008, the ability to execute discrete queries is included within SQL Server Management Studio. This mechanism lets a user generate and execute Transact-SQL (T-SQL) statements against databases. A command line interface is also available in the form of the osql tool. As of SQL Server 2005, and continuing in SQL Server 2008, the sqlcmd utility provides an additional method of interacting with the SQL Server storage engine.

◆ Books Online. This documentation set covers all aspects of SQL Server deployment and management. It provides exhaustive coverage of the correct syntax and options for all T-SQL statements, as well as discussion of options and features that may be deployed for SQL Server database environments.

All of these tools (as appropriate) are available in the Microsoft SQL Server menu once Microsoft SQL Server has been installed.

Chapter 2 EMC Foundation Products

This chapter introduces the EMC foundation products discussed in this document that work in Microsoft infrastructure environments:

◆ Introduction ........................................................................................ 50
◆ Symmetrix hardware and EMC Enginuity features...................... 53
◆ EMC Solutions Enabler base management .................................... 57
◆ EMC Change Tracker......................................................................... 60
◆ EMC Symmetrix Remote Data Facility ........................................... 61
◆ EMC TimeFinder ................................................................................ 76
◆ EMC Storage Resource Management.............................................. 89
◆ EMC Storage Viewer.......................................................................... 94
◆ EMC PowerPath ................................................................................. 96
◆ EMC Replication Manager.............................................................. 105
◆ EMC Open Replicator ..................................................................... 107
◆ EMC Virtual Provisioning............................................................... 108
◆ EMC Virtual LUN migration........................................................... 111
◆ EMC Fully Automated Storage Tiering for Disk Pools .............. 114
◆ EMC Fully Automated Storage Tiering for Virtual Pools .......... 116

Introduction

EMC provides many hardware and software products that support Microsoft SQL Server environments on Symmetrix® systems. This chapter provides a technical overview of the EMC products referenced in this document. The following products, which are highlighted and discussed, were used and/or tested with Microsoft SQL Server deployed on EMC Symmetrix.

EMC offers an extensive product line of high-end storage solutions targeted to meet the requirements of mission-critical databases and applications. The Symmetrix product line includes the DMX Direct Matrix Architecture™ series and the VMAX® Virtual Matrix™ series. EMC Symmetrix is a fully redundant, high-availability storage processor, providing nondisruptive component replacements and code upgrades. The Symmetrix system features high levels of performance, data integrity, reliability, and availability.

EMC Enginuity™ Operating Environment — Enginuity enables interoperation between the latest Symmetrix platforms and previous generations of Symmetrix systems, and enables them to connect to a large number of server types, operating systems and storage software products, and a broad selection of network connectivity elements and other devices, ranging from HBAs and drivers to switches and tape systems.
EMC Solutions Enabler — Solutions Enabler is a package that contains the SYMAPI runtime libraries and the SYMCLI command line interface. SYMAPI provides the interface to the EMC Enginuity operating environment. SYMCLI is a set of commands that can be invoked from the command line or within scripts. These commands can be used to monitor device configuration and status, and to perform control operations on devices and data objects within a storage complex.

EMC Symmetrix Remote Data Facility (SRDF®) — SRDF is a business continuity software solution that replicates and maintains a mirror image of data at the storage block level in a remote Symmetrix system. The SRDF component extends the basic SYMCLI command set of Solutions Enabler to include commands that specifically manage SRDF.

EMC SRDF consistency groups — An SRDF consistency group is a collection of related Symmetrix devices that are configured to act in unison to maintain data integrity. The devices in consistency groups can be spread across multiple Symmetrix systems.

EMC TimeFinder® — TimeFinder is a family of products that enable LUN-based replication within a single Symmetrix system. Data is copied from Symmetrix devices using array-based resources without using host CPU or I/O. The source Symmetrix devices remain online for regular I/O operations while the copies are created. The TimeFinder family has three separate and distinct software products: TimeFinder/Mirror, TimeFinder/Clone, and TimeFinder/Snap.

• TimeFinder/Mirror enables users to configure special devices, called business continuance volumes (BCVs), to create a mirror image of Symmetrix standard devices. Using BCVs, TimeFinder creates a point-in-time copy of data that can be repurposed. The TimeFinder/Mirror component extends the basic SYMCLI command set of Solutions Enabler to include commands that specifically manage Symmetrix BCVs and standard devices.
• TimeFinder/Clone enables users to make copies of data simultaneously on multiple target devices from a single source device. The data is available to a target's host immediately upon activation, even if the copy process has not completed. Data may be copied from a single source device to as many as 16 target devices. A source device can be either a Symmetrix standard device or a TimeFinder BCV device.

• TimeFinder/Snap enables users to configure special devices in the Symmetrix array called virtual devices (VDEVs) and save area devices (SAVDEVs). These devices can be used to make pointer-based, space-saving copies of data simultaneously on multiple target devices from a single source device. The data is available to a target's host immediately upon activation. Data may be copied from a single source device to as many as 128 VDEVs. A source device can be either a Symmetrix standard device or a TimeFinder BCV device; a target device is a VDEV. A SAVDEV is a special device without a host address that is used to hold the changing contents of the source or target device.

EMC Change Tracker — EMC Symmetrix Change Tracker software measures changes to data on a Symmetrix volume or group of volumes. Change Tracker software is often used as a planning tool in the analysis and design of configurations that use the EMC TimeFinder or SRDF components to store data at remote sites.

Solutions Enabler Storage Resource Management (SRM) component — The SRM component extends the basic SYMCLI command set of Solutions Enabler to include commands that allow users to systematically find and examine attributes of various objects on the host, within a specified relational database, or in the EMC enterprise storage. The SRM commands provide mapping support for relational databases, file systems, logical volumes and volume groups, as well as performance statistics.

EMC PowerPath® — PowerPath is host-based software that provides I/O path management.
PowerPath operates with several storage systems, on several enterprise operating systems, and provides failover and load balancing that are transparent to the host application and database.

Symmetrix hardware and EMC Enginuity features

Symmetrix hardware architecture and the EMC Enginuity operating environment are the foundation of the Symmetrix storage platform. This environment consists of the following components:

◆ Symmetrix hardware
◆ Enginuity-based operating functions
◆ Solutions Enabler
◆ Symmetrix application program interface (API) for mainframe
◆ Symmetrix-based applications
◆ Host-based Symmetrix applications
◆ Independent software vendor (ISV) applications

All Symmetrix systems provide advanced data replication capabilities, full mainframe and open systems support, and flexible connectivity options, including Fibre Channel, FICON, ESCON, Gigabit Ethernet, and iSCSI. Interoperability between Symmetrix storage systems enables customers to migrate storage solutions from one generation to the next, protecting their investment even as their storage demands expand.

Symmetrix enhanced cache director technology allows configurations of up to 512 GB of cache. The cache can be logically divided into 32 independent regions, providing up to 32 concurrent 500 MB/s transaction throughput.

The Symmetrix on-board data integrity features include:

◆ Continuous cache and on-disk data integrity checking and error detection/correction
◆ Fault isolation
◆ Nondisruptive hardware and software upgrades
◆ Automatic diagnostics and phone-home capabilities

At the software level, advanced integrity features ensure information is always protected and available.
By choosing a mix of RAID 1 (mirroring), RAID 1/0, high-performance RAID 5 (3+1 and 7+1), and RAID 6 protection, users have the flexibility to choose the protection level most appropriate to the value and performance requirements of their information. The Symmetrix DMX and VMAX are EMC's latest generation of high-end storage solutions.

From the perspective of the host operating system, a Symmetrix system appears to be multiple physical devices connected through one or more I/O controllers. The host operating system addresses each of these devices using a physical device name. Each physical device includes attributes, vendor ID, product ID, revision level, and serial ID. The host physical device maps to a Symmetrix device. In turn, the Symmetrix device is a virtual representation of a portion of the physical disk called a hypervolume.

Symmetrix VMAX platform

The EMC Symmetrix VMAX Series with Enginuity is a new entry in the Symmetrix product line. Built on the strategy of simple, intelligent, modular storage, it incorporates a new scalable fabric interconnect design that allows the storage array to seamlessly grow from an entry-level configuration into the world's largest storage system. The Symmetrix VMAX provides improved performance and scalability for demanding enterprise storage environments while maintaining support for EMC's broad portfolio of platform software offerings.

The Enginuity operating environment for Symmetrix version 5874 is a feature-rich Enginuity release supporting Symmetrix VMAX storage arrays. With the release of Enginuity 5874, Symmetrix VMAX systems deliver new software capabilities that improve capacity utilization, ease of use, business continuity, and security. The Symmetrix VMAX also maintains customer expectations for high-end storage in terms of availability.
High-end availability is more than just redundancy; it means nondisruptive operations and upgrades, and being "always online." Symmetrix VMAX provides:

◆ Nondisruptive expansion of capacity and performance at a lower price point
◆ Sophisticated migration for multiple storage tiers within the array
◆ The power to maintain service levels and functionality as consolidation grows
◆ Simplified control for provisioning in complex environments

Many of the features provided by the Symmetrix VMAX platform can reduce operational costs for customers deploying Microsoft SQL Server environments, as well as enhance functionality to enable greater benefits. This document details those features that provide significant benefits to such customers. Figure 8 on page 55 illustrates the architecture and interconnection of the major components in the Symmetrix VMAX storage system.

Figure 8 Symmetrix VMAX logical diagram

EMC Enginuity operating environment

EMC Enginuity is the operating environment for all Symmetrix storage systems. Enginuity manages and ensures the optimal flow and integrity of data through the different hardware components. It also manages Symmetrix operations associated with monitoring and optimizing internal data flow. This ensures the fastest response to the user's requests for information, along with protecting and replicating data. Enginuity provides the following services:

◆ Manages system resources to intelligently optimize performance across a wide range of I/O requirements.
◆ Ensures system availability through advanced fault monitoring, detection, and correction capabilities, and provides concurrent maintenance and serviceability features.
◆ Offers the foundation for specific software features available through EMC disaster recovery, business continuity, and storage management software.
◆ Provides functional services for both Symmetrix-based functionality and for a large suite of EMC storage application software.
◆ Defines the priority of each task, including basic system maintenance, I/O processing, and application processing.
◆ Provides uniform access through APIs for internal calls, and provides an external interface to allow integration with other software providers and ISVs.

EMC Solutions Enabler base management

The EMC Solutions Enabler kit contains all the base management software that provides a host with the SYMAPI shared libraries and the basic Symmetrix command line interface (SYMCLI). Other optional subcomponents in the Solutions Enabler (SYMCLI) series enable users to extend the functionality of Symmetrix systems. Three principal subcomponents are:

◆ Solutions Enabler SYMCLI SRDF, SRDF/CG, and SRDF/A
◆ Solutions Enabler SYMCLI TimeFinder/Mirror, TimeFinder/CG, TimeFinder/Snap, and TimeFinder/Clone
◆ Solutions Enabler SYMCLI Storage Resource Management (SRM)

These components are discussed later in this chapter.

SYMCLI resides on a host system to monitor and perform control operations on Symmetrix storage arrays. SYMCLI commands are invoked from the host operating system command line or via scripts. SYMCLI commands invoke low-level channel commands to specialized devices on the Symmetrix called gatekeepers. Gatekeepers are very small devices carved from disks in the Symmetrix that act as SCSI targets for the SYMCLI commands. SYMCLI is used in single command line entries or in scripts to monitor and perform control operations on devices and data objects toward the management of the storage complex. It also monitors device configuration and the status of the devices that make up the storage environment.
To reduce the number of inquiries from the host to the Symmetrix systems, configuration and status information is maintained in a host database file. Table 2 on page 57 lists the SYMCLI base commands discussed in this document.

Table 2 SYMCLI base commands

Command  Argument  Description

symdg    (performs operations on a device group)
         create    Creates an empty device group
         delete    Deletes a device group
         rename    Renames a device group
         release   Releases a device external lock associated with all devices in a device group
         list      Displays a list of all device groups known to this host
         show      Shows detailed information about a device group and any gatekeeper or BCV devices associated with the device group

symcg    (performs operations on a composite group)
         create    Creates an empty composite group
         add       Adds a device to a composite group
         remove    Removes a device from a composite group
         delete    Deletes a composite group
         rename    Renames a composite group
         release   Releases a device external lock associated with all devices in a composite group
         hold      Holds devices in a composite group
         unhold    Unholds devices in a composite group
         list      Displays a list of all composite groups known to this host
         show      Shows detailed information about a composite group and any gatekeeper or BCV devices associated with the group

symld    (performs operations on a device in a device group)
         add       Adds devices to a device group and assigns the device a logical name
         list      Lists all devices in a device group and any associated BCV devices
         remove    Removes a device from a device group
         rename    Renames a device in the device group
         show      Shows detailed information about a device in the device group

symbcv   (performs support operations on BCV pairs)
         list              Lists BCV devices
         associate         Associates BCV devices with a device group; required to perform operations on the BCV device
         disassociate      Disassociates BCV devices from a device group
         associate -rdf    Associates remotely attached BCV devices with an SRDF device group
         disassociate -rdf Disassociates remotely attached BCV devices from an SRDF device group

EMC Change Tracker

The EMC Symmetrix Change Tracker software is also part of the base Solutions Enabler SYMCLI management offering. Change Tracker commands are used to measure changes to data on a Symmetrix volume or group of volumes. Change Tracker functionality is often used as a planning tool in the analysis and design of configurations that use the EMC SRDF and TimeFinder components to create copies of production data.

The Change Tracker command (symchg) is used to monitor the amount of change to a group of hypervolumes. The command timestamps and marks specific volumes for tracking and maintains a bitmap to record which tracks have changed on those volumes. The bitmap can be interrogated to gain an understanding of how the data on the volume changes over time and to assess the locality of reference of applications.

EMC Symmetrix Remote Data Facility

The Symmetrix Remote Data Facility (SRDF) component of EMC Solutions Enabler extends the basic SYMCLI command set to enable users to manage SRDF. SRDF is a business continuity solution that provides a host-independent, mirrored data storage solution for duplicating production site data to one or more physically separated target Symmetrix systems. In basic terms, SRDF is a configuration of multiple Symmetrix systems whose purpose is to maintain multiple copies of logical volume data in more than one location.
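The per-track change bitmap that Change Tracker maintains, described above, can be illustrated with a small sketch. This is a toy model, not the symchg implementation:

```python
# Toy model of a track-change bitmap (illustrative only; this is not the
# actual symchg implementation).

class ChangeBitmap:
    def __init__(self, num_tracks):
        self.bits = [False] * num_tracks  # one bit per track on the volume

    def mark_changed(self, track):
        self.bits[track] = True           # a write touched this track

    def changed_tracks(self):
        return [t for t, bit in enumerate(self.bits) if bit]

    def percent_changed(self):
        return 100.0 * sum(self.bits) / len(self.bits)

bm = ChangeBitmap(num_tracks=8)
for track in (0, 3, 3, 7):    # repeated writes to the same track count once
    bm.mark_changed(track)
print(bm.changed_tracks())    # [0, 3, 7]
print(bm.percent_changed())   # 37.5
```

Interrogating such a bitmap over time is what lets the tool estimate change rates and locality of reference when sizing TimeFinder or SRDF configurations.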
SRDF replicates production or primary (source) site data to a secondary (target) site transparently to users, applications, databases, and host processors. The local SRDF device, known as the source (R1) device, is configured in a partner relationship with a remote target (R2) device, forming an SRDF pair. While the R2 device is mirrored with the R1 device, the R2 device is write-disabled to the remote host. After the R2 device synchronizes with its R1 device, the R2 device can be split from the R1 device at any time, making the R2 device fully accessible again to its host. After the split, the target (R2) device contains valid data and is available for performing business continuity tasks through its original device address.

SRDF requires configuration of specific source Symmetrix volumes (R1) to be mirrored to target Symmetrix volumes (R2). If the primary site is no longer able to continue processing when SRDF is operating in synchronous mode, data at the secondary site is current up to the last committed transaction. When primary systems are down, SRDF enables fast failover to the secondary copy of the data so that critical information becomes available in minutes. Business operations and related applications may resume full functionality with minimal interruption.

Figure 9 illustrates a basic SRDF configuration where connectivity between the two Symmetrix systems is provided using ESCON, Fibre Channel, or Gigabit Ethernet. The connection between the R1 and R2 volumes is through a logical grouping of devices called a remote adapter (RA) group. The RA group is independent of the device and composite groups defined and discussed in "SRDF device groups and composite groups" later in this chapter.
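RA groups and SRDF pair states can be inspected from the host with Solutions Enabler. A minimal sketch follows; the Symmetrix ID and group name are placeholders, and option spellings should be checked against the symcfg and symrdf documentation for the installed version.

```shell
# List the RA (SRDF) groups configured on an array (example serial number)
symcfg -sid 000190101234 list -ra all

# Show the SRDF pair states for the devices in a device group
symrdf -g MyDevGrp query
```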
Figure 9  Basic synchronous SRDF configuration (source and target Symmetrix systems connected over ESCON, Fibre Channel, or Gigabit Ethernet at distances under 200 km)

SRDF benefits

SRDF offers the following features and benefits:
◆ High data availability
◆ High performance
◆ Flexible configurations
◆ Host and application software transparency
◆ Automatic recovery from a component or link failure
◆ Significantly reduced recovery time after a disaster
◆ Increased integrity of recovery procedures
◆ Reduced backup and recovery costs
◆ Reduced disaster recovery complexity, planning, testing, etc.
◆ Support for business continuity across and between multiple databases on multiple servers and Symmetrix systems

SRDF modes of operation

SRDF currently supports the following modes of operation:

◆ Synchronous mode (SRDF/S) provides real-time mirroring of data between the source Symmetrix system(s) and the target Symmetrix system(s). Data is written simultaneously to the cache of both systems in real time before the application I/O is completed, thus ensuring the highest possible data availability. Data must be successfully stored in both the local and remote Symmetrix systems before an acknowledgment is sent to the local host. This mode is used mainly for metropolitan area network distances less than 200 km.

◆ Asynchronous mode (SRDF/A) maintains a dependent-write consistent copy of data at all times across any distance with no host application impact. Applications needing to replicate data across long distances historically have had limited options. SRDF/A delivers high-performance, extended-distance replication and reduced telecommunication costs while leveraging existing management capabilities with no host performance impact.

◆ Adaptive copy mode transfers data from source devices to target devices regardless of order or consistency, and without host performance impact.
This is especially useful when transferring large amounts of data during data center migrations, consolidations, and in data mobility environments. Adaptive copy mode is the data movement mechanism of the Symmetrix Automated Replication (SRDF/AR) solution.

SRDF device groups and composite groups

Applications running on Symmetrix systems normally involve a number of Symmetrix devices, so any Symmetrix operation must ensure that all related devices are operated upon as a logical group. Defining device or composite groups achieves this.

A device group or a composite group is a user-defined group of devices on which SYMCLI commands can execute. Device groups are limited to a single Symmetrix system and RA group (also known as an SRDF group). A composite group, on the other hand, can span multiple Symmetrix systems and RA groups. The device or composite group type may contain R1 or R2 devices and may contain various device lists for standard, BCV, virtual, and remote BCV devices. The symdg/symld and symcg commands are used to create and manage device and composite groups.

SRDF consistency groups

An SRDF consistency group is a collection of devices defined by a composite group that has been enabled for consistency protection. Its purpose is to protect data integrity for applications that span multiple RA groups and/or multiple Symmetrix systems. The protected applications may comprise multiple heterogeneous data resource managers across multiple host operating systems.

An SRDF consistency group uses PowerPath or Enginuity Consistency Assist (SRDF-ECA) to provide synchronous disaster restart with zero data loss. Disaster restart solutions that use consistency groups provide remote restart with short recovery time objectives. Zero data loss implies that all transactions completed at the onset of a disaster will be available at the target.
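A minimal sketch of building a composite group for consistency protection follows. The group name, Symmetrix IDs, and device numbers are invented, and the exact enablement behavior (PowerPath versus SRDF-ECA) depends on the installed environment, so this should be verified against the Solutions Enabler SRDF documentation.

```shell
# Create a composite group of type RDF1 that can span arrays and RA groups
symcg create MyConGrp -type RDF1

# Add R1 devices from one or more Symmetrix systems (example IDs)
symcg -cg MyConGrp -sid 000190101234 add dev 00A5
symcg -cg MyConGrp -sid 000190105678 add dev 01C2

# Enable consistency protection on the composite group
symcg -cg MyConGrp enable

# Confirm membership and consistency state
symcg show MyConGrp
```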
When the amount of data for an application becomes very large, the time and resources required for host-based software to protect, back up, or run decision-support queries against these databases become critical factors. The time required to quiesce or shut down the application for offline backup is no longer acceptable. SRDF consistency groups allow users to remotely mirror the largest data environments and automatically split off dependent-write consistent, restartable copies of applications in seconds without interruption to online service.

A consistency group is a composite group of SRDF devices (R1 or R2) that act in unison to maintain the integrity of applications distributed across multiple Symmetrix systems or multiple RA groups within a single Symmetrix system. If a source (R1) device in the consistency group cannot propagate data to its corresponding target (R2) device, EMC software suspends data propagation from all R1 devices in the consistency group, halting all data flow to the R2 targets. This suspension, called tripping the consistency group, ensures that a dependent-write consistent R2 copy of the database exists as of the point in time at which the consistency group tripped.

Tripping a consistency group can occur either automatically or manually. Scenarios in which an automatic trip would occur include:
◆ One or more R1 devices cannot propagate changes to their corresponding R2 devices
◆ The R2 device fails
◆ The SRDF directors on the R1 side or R2 side fail

In an automatic trip, the Symmetrix system completes the write to the R1 device, but indicates that the write did not propagate to the R2 device. EMC software intercepts the I/O and instructs the Symmetrix system to suspend all R1 source devices in the consistency group from propagating any further writes to the R2 side.
Once the suspension is complete, writes to all of the R1 devices in the consistency group continue normally, but they are not propagated to the target side until normal SRDF mirroring resumes.

An explicit trip occurs when the command symrdf -cg suspend or split is invoked. Suspending or splitting the consistency group creates an on-demand, restartable copy of the database at the R2 target site. BCV devices that are synchronized with the R2 devices are then split after the consistency group is tripped, creating a second dependent-write consistent copy of the data. During the explicit trip, SYMCLI issues the command to create the dependent-write consistent copy, but may require assistance from PowerPath or SRDF-ECA if I/O is received on one or more R1 devices, or if the SYMCLI commands issued are abnormally terminated before the explicit trip.

An EMC consistency group maintains consistency within applications spread across multiple Symmetrix systems in an SRDF configuration by monitoring data propagation from the source (R1) devices in a consistency group to their corresponding target (R2) devices, as depicted in Figure 10. Consistency groups provide data integrity protection during a rolling disaster.

Figure 10  SRDF consistency group (X = DBMS data, Y = application data, Z = logs)

Consistency group protection is defined containing volumes X, Y, and Z on the source Symmetrix system.
This consistency group definition must contain all of the devices that need to maintain dependent-write consistency and reside on all participating hosts involved in issuing I/O to these devices. A mix of CKD (mainframe) and FBA (UNIX/Windows) devices can be logically grouped together. In some cases, the entire processing environment may be defined in a consistency group to ensure dependent-write consistency.

The rolling disaster described previously begins, preventing the replication of changes from volume Z to the remote site. Since the predecessor log write to volume Z cannot be propagated to the remote Symmetrix system, a consistency group trip occurs. A ConGroup trip holds the write that could not be replicated, along with all subsequent writes to the logically grouped devices. The writes are held by PowerPath on UNIX/Windows hosts and by IOS on mainframe hosts (or by SRDF-ECA for both UNIX/Windows and mainframe hosts) long enough to issue two I/Os to all of the Symmetrix systems involved in the consistency group. The first I/O changes the state of the devices to a suspend-pending state. The second I/O performs the suspend actions on the R1/R2 relationships for the logically grouped devices, which immediately disables all replication to the remote site. Other devices outside of the group can continue replicating, provided the communication links are available.

After the second I/O per Symmetrix system completes, the held writes are released and the predecessor log write is acknowledged back to the issuing host. The dependent data write is then issued by the DBMS and arrives at X, but is not replicated to R2(X). When a complete failure occurs from this rolling disaster, the dependent-write consistency at the remote site is preserved.
If a complete disaster does not occur and the failed links are reactivated, consistency group replication can be resumed. EMC recommends creating a copy of the dependent-write consistent image while the resume takes place. Once the SRDF pairs reach synchronization, a dependent-write consistent copy again exists at the remote site.

SRDF terminology

This section describes various terms related to SRDF operations.

Suspend and resume operations

Practical uses of suspend and resume operations usually involve unplanned situations in which an immediate suspension of I/O between the R1 and R2 devices over the SRDF links is desired. In this way, data propagation problems can be stopped. When suspend is used with consistency groups, immediate backups can be performed off the R2s without affecting I/O from the local host application. I/O can then be resumed between the R1 and R2 devices, returning to normal operation.

Establish and split operations

The establish and split operations are normally used in planned situations in which use of the R2 copy of the data is desired without interfering with normal write operations to the R1 device. Splitting a point-in-time copy of data allows access to the data on the R2 device for various business continuity tasks. The ease of splitting SRDF pairs to provide exact database copies makes it convenient to perform scheduled backup operations, reporting operations, or new application testing from the target Symmetrix data while normal processing continues on the source Symmetrix system. The R2 copy can also be used to test disaster recovery plans without manually intensive recovery drills, complex procedures, and application service interruptions. Upgrades to new versions can be tested, or changes to actual code can be made, without affecting the online production server.
For example, modified server code can be run on the R2 copy of the database until the upgraded code runs error-free, before upgrading the production server. In cases where an absolute real-time copy of the production data is not essential, users may choose to split the SRDF pair periodically and use the R2 copy for queries and report generation. The SRDF pair can be re-established periodically to provide incremental updating of data on the R2 device. The ability to refresh the R2 device periodically provides the latest information for data processing and reporting.

Failover and failback operations

Practical uses of failover and failback operations usually involve the need to switch business operations from the production site to a remote site (failover) or the opposite (failback). Once failover occurs, normal operations continue using the remote (R2) copy of synchronized application data. Scheduled maintenance at the production site is one example of where failover to the R2 site might be needed. Testing of disaster recovery plans is the primary reason to temporarily fail over to a remote site.

Traditional disaster recovery routines involve customized software and complex procedures. Offsite media must be either electronically transmitted or physically shipped to the recovery site. Time-consuming restores and the application of logs usually follow. SRDF failover/failback operations significantly reduce recovery time by incrementally updating only the specific tracks that have changed; this accomplishes in minutes what might take hours for a complete load from dumped database volumes.

Update operation

The update operation allows users to resynchronize the R1s after a failover while continuing to run application and database services on the R2s. This function helps reduce the time that a failback to the R1 side takes. The update operation is a subset of the failover/failback functionality.
Practical uses of the R1 update operation usually involve situations in which the R1 needs to be nearly synchronized with the R2 data before a failback, while the R2 side is still online to its host. The -until option, when used with update, specifies the target number of invalid tracks that are allowed to be out of sync before the resynchronization to the R1 completes.

Concurrent SRDF

Concurrent SRDF means having two target R2 devices configured as concurrent mirrors of one source R1 device. Using a concurrent SRDF pair allows the creation of two copies of the same data at two remote locations. When the two R2 devices are split from their source R1 device, each target site copy of the application can be accessed independently.

R1/R2 swap

Swapping the R1/R2 devices of an SRDF pair causes the source R1 device to become a target R2 device and vice versa. Swapping SRDF devices allows the R2 site to take over operations while retaining a remote mirror on the original source site. Swapping is especially useful after failing over an application from the R1 site to the R2 site. SRDF swapping is available with Enginuity version 5567 or later.

Data mobility

Data mobility is an SRDF configuration that restricts SRDF devices to operating only in adaptive copy mode. This is a lower-cost licensing option that is typically used for data migrations. It allows data to be transferred in adaptive copy mode from source to target, and is not designed as a solution for DR requirements unless used in combination with TimeFinder.

Dynamic SRDF

Dynamic SRDF allows the creation of SRDF pairs from non-SRDF devices while the Symmetrix system is in operation. Historically, source and target SRDF device pairing has been static, and changes required assistance from EMC personnel. This feature provides greater flexibility in deciding where to copy protected data.
Dynamic RA groups can be created in an SRDF switched fabric environment. An RA group represents a logical connection between two Symmetrix systems. Historically, RA groups were limited to those static RA groups defined at configuration time. However, RA groups can now be created, modified, and deleted while the Symmetrix system is in operation. This provides greater flexibility in forming SRDF-pair-associated links.

SRDF control operations

This section describes typical control operations that can be performed by the Solutions Enabler symrdf command. Solutions Enabler SYMCLI SRDF commands perform the following basic control operations on SRDF devices:

◆ Establish synchronizes an SRDF pair by initiating a data copy from the source (R1) side to the target (R2) side. This operation can be a full or incremental establish. Changes on the R2 volumes are discarded by this process.

◆ Restore resynchronizes a data copy from the target (R2) side to the source (R1) side. This operation can be a full or incremental restore. Changes on the R1 volumes are discarded by this process.

◆ Split stops mirroring for the SRDF pair(s) in a device group and write-enables the R2 devices.

◆ Swap exchanges the source (R1) and target (R2) designations on the source and target volumes.

◆ Failover switches data processing from the source (R1) side to the target (R2) side. The source side volumes (R1), if still available, are write-disabled.

◆ Failback switches data processing from the target (R2) side to the source (R1) side. The target side volumes (R2), if still available, are write-disabled.

Establishing an SRDF pair

Establishing an SRDF pair initiates remote mirroring: the copying of data from the source (R1) device to the target (R2) device. SRDF pairs come into existence in two different ways:

◆ At configuration time, through the pairing of SRDF devices. This is the static pairing configuration discussed earlier.
◆ Anytime during a dynamic pairing configuration, in which SRDF pairs are created on demand.

A full establish (symrdf establish -full) is typically performed after an SRDF pair is initially configured and connected via the SRDF links. After the first full establish, users can perform an incremental establish, where the R1 device copies to the R2 device only the new data that was updated while the relationship was split or suspended.

To initiate an establish operation on all SRDF pairs in a device or composite group, all pairs must be in the split or suspended state. The symrdf query command is used to check the state of SRDF pairs in a device or composite group.

When the establish operation is initiated, the system write-disables the R2 device to its host and merges the track tables. The merge creates a bitmap of the tracks that need to be copied to the target volumes, discarding the changes on the target volumes. When the establish operation is complete, the SRDF pairs are in the synchronized state. The R1 device and R2 device contain identical data, and continue to do so until interrupted by administrative command or unplanned disruption.

Figure 11 depicts SRDF establish and restore operations:

Figure 11  SRDF establish and restore control operations

The establish operation may be initiated by any host connected to either Symmetrix system, provided that an appropriate device group has been built on that host. The following command initiates an incremental establish operation for all SRDF pairs in the device group named MyDevGrp:

symrdf -g MyDevGrp establish -noprompt

Splitting an SRDF pair

When read/write access to a target (R2) device is necessary, the SRDF pair can be split.
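Whether establishing or preparing to split, scripts normally confirm the pair state before proceeding. A sketch, assuming a device group named MyDevGrp and the verify action described in the Solutions Enabler documentation:

```shell
# Start an incremental establish without prompting
symrdf -g MyDevGrp establish -noprompt

# Block until all pairs in the group reach the Synchronized state,
# rechecking every 30 seconds (-i sets the polling interval)
symrdf -g MyDevGrp verify -synchronized -i 30

# Inspect the individual pair states at any time
symrdf -g MyDevGrp query
```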
When the split completes, the target host can access the R2 device for write operations. The R2 device contains valid data and is available for business continuity tasks or for restoring data to the R1 device if there is a loss of data on that device.

While an SRDF pair is in the split state, local I/O to the R1 device can still occur. These updates are not propagated to the R2 device immediately. Changes on each Symmetrix system are tracked through bitmaps and are reconciled when normal SRDF mirroring operations are resumed.

To initiate a split, an SRDF pair must already be in one of the following states:
◆ Synchronized
◆ Suspended
◆ R1 updated
◆ SyncInProg (if the -symforce option is specified for the split, resulting in a set of R2 devices that are not dependent-write consistent and are not usable)

The split operation may be initiated from either host. The following command initiates a split operation on all SRDF pairs in the device group named MyDevGrp:

symrdf -g MyDevGrp split -noprompt

The symrdf split command provides exactly the same functionality as the symrdf suspend and symrdf rw_enable R2 commands together. Furthermore, the split and suspend operations have exactly the same consistency characteristics as SRDF consistency groups. Therefore, when SRDF pairs are in a single device group, users can split the SRDF pairs in the device group as shown previously and have restartable copies on the R2 devices. If the application data spans multiple Symmetrix systems or multiple RA groups, include the SRDF pairs in a consistency group to achieve the same results.

Restoring an SRDF pair

When the target (R2) data must be copied back to the source (R1) device, the SRDF restore command is used (see Figure 11). After an SRDF pair is split, the R2 device contains valid data and is available for business continuance tasks (such as running a new application) or for restoring data to the R1 device.
Moreover, if the results of running a new application on the R2 device need to be preserved, moving the changed data and the new application to the R1 device is another option.

Users can perform a full or incremental restore. A full restore operation copies the entire contents of the R2 device to the R1 device. An incremental restore operation is much faster because it copies only the new data that was updated on the R2 device while the SRDF pair was split. Any tracks on the R1 device that changed while the SRDF pair was split are replaced with the corresponding tracks on the R2 device. To initiate a restore, an SRDF pair must already be in the split state.

The restore operation can be initiated from either host. The following commands initiate an incremental and a full restore operation, respectively, on all SRDF pairs in the device group named MyDevGrp:

symrdf -g MyDevGrp restore -noprompt
symrdf -g MyDevGrp restore -noprompt -full

The restore operation is complete when the R1 and R2 devices contain identical data. The SRDF pair is then in a synchronized state and may be reestablished by initiating the following command:

symrdf -g MyDevGrp establish

Failover and failback operations

Having a synchronized SRDF pair allows users to switch data processing operations from the source site to the target site if operations at the source site are disrupted or if downtime must be scheduled for maintenance. This switchover from source to target is enabled through the use of the failover command. When the situation at the source site is back to normal, a failback operation is used to reestablish the I/O communications links between source and target, resynchronize the data between the sites, and resume normal operations on the R1 devices, as shown in Figure 12.
Figure 12  SRDF failover and failback control operations

The failover and failback operations relocate processing from the source site to the target site or vice versa. This may or may not imply movement of data.

Failover

Scheduled maintenance or storage system problems can disrupt access to production data at the source site. In this case, a failover operation can be initiated from either host to make the R2 device read/write-enabled to its host. Before issuing the failover, all application services on the R1 volumes must be stopped, because the failover operation makes the R1 volumes read-only. The following command initiates a failover on all SRDF pairs in the device group named MyDevGrp:

symrdf -g MyDevGrp failover -noprompt

To initiate a failover, the SRDF pair must already be in one of the following states:
◆ Synchronized
◆ Suspended
◆ R1 updated
◆ Partitioned (when invoking this operation at the target site)

The failover operation:
◆ Suspends data traffic on the SRDF links
◆ Write-disables the R1 devices
◆ Write-enables the R2 volumes

Failback

To resume normal operations on the R1 side, a failback (R1 device takeover) operation is initiated. This means read/write operations on the R2 device must be stopped, and read/write operations on the R1 device must be started. When the failback command is initiated, the R2 becomes read-only to its host, while the R1 becomes read/write-enabled to its host.
The following command performs a failback operation on all SRDF pairs in the device group named MyDevGrp:

symrdf -g MyDevGrp failback -noprompt

The SRDF pair must already be in one of the following states for the failback operation to succeed:
◆ Failed over
◆ Suspended and write-disabled at the source
◆ Suspended and not ready at the source
◆ R1 Updated
◆ R1 UpdInProg

The failback operation:
◆ Write-enables the R1 devices
◆ Performs a track table merge to discard changes on the R1s
◆ Transfers the changes from the R2s
◆ Resumes traffic on the SRDF links
◆ Write-disables the R2 volumes

EMC SRDF/Cluster Enabler solutions

EMC SRDF/Cluster Enabler (SRDF/CE) for MSCS is an integrated solution that combines SRDF and clustering protection over distance. EMC SRDF/CE provides disaster-tolerant capabilities that enable a cluster to span geographically separated Symmetrix systems. It operates as a software extension (an MMC snap-in) to the Microsoft Cluster Service (MSCS). SRDF/CE achieves this capability by exploiting SRDF disaster restart capabilities. SRDF allows the MSCS cluster to have two identical sets of application data in two different locations. When cluster services are failed over or failed back, SRDF/CE is invoked automatically to perform the SRDF functions necessary to enable the requested operation. Figure 13 illustrates the hardware configuration of two four-node, geographically distributed EMC SRDF/CE clusters using bidirectional SRDF.
Figure 13  Geographically distributed four-node EMC SRDF/CE clusters (primary and secondary site nodes attached over Fibre Channel or SCSI, with bidirectional SRDF links between the sites)

EMC TimeFinder

The SYMCLI TimeFinder component extends the basic SYMCLI command set to include TimeFinder, or business continuity, commands that allow control operations on device pairs within a local replication environment. This section specifically describes the functionality of:

◆ TimeFinder/Mirror — General monitor and control operations for business continuance volumes (BCVs)
◆ TimeFinder/CG — Consistency groups
◆ TimeFinder/Clone — Clone copy sessions
◆ TimeFinder/Snap — Snap copy sessions

Commands such as symmir and symbcv perform a wide spectrum of monitor and control operations on standard/BCV device pairs within a TimeFinder/Mirror environment. The TimeFinder/Clone command, symclone, creates a point-in-time copy of a source device on nonstandard device pairs (such as standard/standard or BCV/BCV). The TimeFinder/Snap command, symsnap, creates virtual device copy sessions between a source device and multiple virtual target devices. These virtual devices store only pointers to changed data blocks from the source device, rather than a full copy of the data.

Each product requires a specific license for monitoring and control operations. Configuring and controlling remote BCV pairs requires the EMC SRDF business continuity software discussed previously. The combination of TimeFinder with SRDF provides multiple local and remote copies of production data. Figure 14 illustrates application usage for a TimeFinder/Mirror configuration in a Symmetrix system.
Figure 14  EMC Symmetrix configured with standard volumes and BCVs (target data uses include backup, data warehousing, regression testing, and data protection)

TimeFinder/Mirror establish operations

A BCV device can be fully or incrementally established. After configuration and initialization of a Symmetrix system, BCV devices contain no data. BCV devices, like standard devices, can have unique host addresses and can be online and ready to the host(s) to which they are connected. A full establish operation must be used the first time the standard devices are paired with the BCV devices. An incremental establish of a BCV device can be performed to resynchronize only the data that has changed on the standard device since the last establish operation.

Note: When BCVs are established, they are inaccessible to any host.

Symmetrix systems allow up to four mirrors for each hypervolume. The mirror positions are commonly designated M1, M2, M3, and M4. An unprotected BCV can be the second, third, or fourth mirror position of the standard device. A host, however, logically views the Symmetrix M1/M2 mirrored devices as a single device.

To assign a BCV as a mirror of a standard Symmetrix device, the symmir establish command is used. One method of establishing a BCV pair is to allow the standard/BCV device-pairing algorithm to arbitrarily create BCV pairs from multiple devices within a device group:

symmir -g MyDevGrp establish -full -noprompt

With this method, TimeFinder/Mirror first checks for any attach assignments (specifying a preferred BCV match from among multiple BCVs in a device group). TimeFinder/Mirror then checks whether there are any pairing relationships among the devices. If either of these conditions exists, TimeFinder/Mirror uses those assignments.
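In practice, the initial full establish is followed by a wait for synchronization before the BCVs are used. A sketch, assuming a device group named MyDevGrp and the symmir verify action described in the Solutions Enabler TimeFinder documentation:

```shell
# First-time pairing requires a full establish
symmir -g MyDevGrp establish -full -noprompt

# Wait until all BCV pairs are synchronized, polling every 30 seconds
symmir -g MyDevGrp verify -synched -i 30

# Subsequent refreshes copy only the tracks changed since the last establish
symmir -g MyDevGrp establish -noprompt
```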
TimeFinder split operations

Splitting a BCV pair is a TimeFinder/Mirror action that detaches the BCV from its standard device and makes the BCV ready for host access. When splitting a BCV, the system must perform housekeeping tasks that may require a few milliseconds on a busy Symmetrix system. These tasks involve a series of steps that result in the separation of the BCV from its paired standard:

◆ I/O is suspended briefly to the standard device.
◆ Write pending tracks for the standard device that have not yet been written out to the BCV are duplicated in cache to be written to the BCV.
◆ The BCV is split from the standard device.
◆ The BCV device status is changed to ready.

Regular split

A regular split is the type of split that has existed for TimeFinder/Mirror since its inception. With a regular split (before Enginuity version 5568), I/O activity from the production hosts to a standard volume was not accepted until it was split from its BCV pair. Therefore, applications attempting to access the standard or the BCV would experience a short wait during a regular split. Once the split was complete, no further overhead was incurred.

Beginning with Enginuity version 5568, any split operation is an instant split. A regular split is still valid for earlier versions and for current applications that perform regular split operations. However, current applications that perform regular splits with Enginuity version 5568 actually perform an instant split. With Enginuity versions 5x66 and 5x67, an instant split can be performed by specifying the -instant option on the command line. Since version 5568, this option is no longer required because instant split mode has become the default behavior. It is nevertheless beneficial to continue to supply the -instant flag with later Enginuity versions; otherwise, the command waits for the background split to complete before returning.
Instant split

An instant split shortens the wait during a split by dividing the process into a foreground split and a background split. During an instant split, the system executes the foreground split almost instantaneously and returns a successful status to the host. This near-instantaneous execution allows minimal I/O disruption to the production volumes, and the BCVs are accessible to their hosts as soon as the foreground process is complete. The background split continues until the BCV pair is fully split. When the -instant option is included or defaulted, SYMCLI returns immediately after the foreground split, allowing other operations while the BCV pair is splitting in the background.

The following operation performs an instant split on all BCV pairs in MyDevGrp and returns control to the server process while the background split is in progress:

symmir -g MyDevGrp split -instant -noprompt

The following symmir query command checks the progress of a split on the composite group named MyConGrp. The -bg option queries the status of the background split:

symmir -cg MyConGrp query -bg

TimeFinder restore operations

A BCV device can be used to fully or incrementally restore data on the standard volume. Like the full establish operation, a full restore operation copies the entire contents of the BCV devices to the standard devices. The devices on which the restore operates may be defined in a device group, composite group, or device file. For example:

symmir -g MyDevGrp -full restore -noprompt
symmir -cg MyConGrp -full restore -noprompt
symmir -f MyFile -full -sid 109 restore -noprompt

The incremental restore process accomplishes the same thing as the full restore process, with a major time-saving exception: the BCV copies to the standard device only data that was updated on the BCV device while the BCV pair was split.
The data on the corresponding tracks of the BCV device also overwrites any changed tracks on the standard device, which maximizes the efficiency of the resynchronization process. This process is useful, for example, when a user has finished testing or validating an updated version of a database or a new application on the BCV device and wants to migrate and use a copy of the tested data or application on the production standard device.

Note: An incremental restore of a BCV volume to a standard volume is only possible when the two volumes have an existing TimeFinder relationship.

TimeFinder consistent split

TimeFinder consistent split allows you to split off a dependent-write consistent, restartable image of an application without interrupting online services. Consistent split helps avoid the inconsistencies and restart problems that can occur when splitting an application-related BCV without first quiescing or halting the application. Consistent split is implemented using the Enginuity Consistency Assist (ECA) feature and requires a TimeFinder/CG license.

Enginuity Consistency Assist

The Enginuity Consistency Assist (ECA) feature of the Symmetrix operating environment can be used to perform consistent split operations across multiple heterogeneous environments. This functionality requires a TimeFinder/CG license and uses the -consistent option of the symmir command.

To use ECA to consistently split BCV devices from the standards, a control host with no database, or a database host with a dedicated channel to gatekeeper devices, must be available. The dedicated channel cannot be used for servicing other devices or to freeze I/O. For example, to split a device group, execute:

symmir -g MyDevGrp split -consistent -noprompt

Figure 15 on page 81 illustrates an ECA split across three database hosts that access devices on a Symmetrix system.
Figure 15 ECA consistent split across multiple database-associated hosts (diagram: a controlling host running SYMAPI issues a consistent split against the STD/BCV pairs in prodgrp, which are accessed by database servers Host A, Host B, and Host C)

Device groups or composite groups must be created on the controlling host for the target application to be consistently split. Device groups can be created to include all of the required devices for maintaining business continuity. For example, if a device group is defined that includes all of the devices being accessed by Hosts A, B, and C (see Figure 15 on page 81), then all of the BCV pairs related to those hosts can be consistently split with a single command. However, if a device group is defined that includes only the devices accessed by Host A, then the BCV pairs related to Host A can be split without affecting the other hosts.

The solid vertical line in Figure 15 on page 81 represents the ECA holding of I/Os during an instant split process, creating a dependent-write consistent image in the BCVs.

Figure 16 on page 82 illustrates the use of local consistent split with a database management system (DBMS).

Figure 16 ECA consistent split on a local Symmetrix system (diagram: a host running SYMCLI/SYMAPI with a DBMS and PowerPath or ECA, splitting BCVs that hold application data, logs, and other data)

When a split command is issued with ECA from the production host, a consistent database image is created through the following sequence of events, shown in Figure 16 on page 82:

1. The device group, device file, or composite group identifies the standard devices that hold the database.
2. SYMAPI communicates with Symmetrix Enginuity to validate that all identified BCV pairs can be split.
3. SYMAPI communicates with Symmetrix Enginuity to open the ECA window (the interval during which Symmetrix Enginuity defers writes), issue the instant split, and release the writes by closing the window; steps 4 through 6 detail this sequence.
4. ECA suspends writes to the standard devices that hold the database. The DBMS cannot write to the devices and waits for them to become available before resuming any further write activity. Read activity to a device is not affected unless the read targets a device with a write queued against it.
5. SYMAPI sends an instant split request to all BCV pairs in the specified device group and waits for the Symmetrix to acknowledge that the foreground split has occurred. SYMAPI then communicates with Symmetrix Enginuity to resume writes, that is, to close the ECA window.
6. The application resumes writing to the production devices.

The BCV devices now contain a restartable copy of the production data that is consistent up to the time of the instant split. The production application is unaware that the split or suspend/resume operation occurred. When the application on the secondary host is started using the BCVs, there is no record of a successful shutdown. The secondary application instance therefore views the BCV copy as a crashed instance and performs its normal crash recovery sequence to restart.

When performing a consistent split, it is good practice to issue host-based commands that commit any data not yet written to disk before the split, to reduce the time spent in restart. For example, on UNIX systems the sync command can be run. From a database perspective, a checkpoint or equivalent should be executed.

TimeFinder/Mirror reverse split

BCVs can be mirrored to guard against data loss through physical drive failures. A reverse split is applicable to a BCV that is configured with two local mirrors.
It is generally used to recover from an unsuccessful restore operation. When data is restored from the BCV to the standard device, any writes that occur while the standard is being restored also alter the original copy of data on the BCV's primary mirror. If the original copy of BCV data is needed again later, it can be restored to the BCV's primary mirror from the BCV's secondary mirror using a reverse split.

For example, whenever logical corruption is reintroduced to a database during a recovery process (following a BCV restore), both the standard device and the primary BCV mirror are left with corrupted data. In this case, a reverse split can restore the original BCV data from the BCV's secondary mirror to its primary mirror. This is particularly useful when performing a restore and immediately restarting processing on the standard devices, since the process may have to be repeated many times.

Note: Reverse split is not available when protected restore is used to return the data from the BCVs to the standards.

TimeFinder/Clone operations

Symmetrix TimeFinder/Clone operations using SYMCLI can create up to 16 copies from a source device onto target devices. Unlike TimeFinder/Mirror, TimeFinder/Clone does not require the traditional standard-to-BCV device pairing; any combination of source and target devices is allowed. For example, a BCV can be used as the source device while another BCV is used as the target device. Additionally, TimeFinder/Clone does not use the traditional mirror positions the way TimeFinder/Mirror does, which makes TimeFinder/Clone a useful option when more than three copies of a source device are desired. Normally, one of the three copies is used to protect the data against hardware failure. The source and target devices must be the same emulation type (FBA or CKD).
The target device must be equal in size to the source device. Clone copies of striped or concatenated metavolumes can also be created, provided the source and target metavolumes are identical in configuration. Once activated, the target device can be accessed instantly by a target's host, even before the data is fully copied to the target device.

TimeFinder/Clone copies are appropriate in situations where multiple copies of production data are needed for testing, backups, or report generation. Clone copies can also reduce disk contention and improve data access speed by assigning users to copies of data rather than to the one production copy. A single source device may maintain as many as 16 relationships, which can be a combination of BCVs, clones, and snaps.

Clone copy sessions

TimeFinder/Clone functionality is controlled through copy sessions, which pair the source and target devices. Sessions are maintained on the Symmetrix system and can be queried to verify the current state of the device pairs. A copy session must first be created to define and set up the TimeFinder/Clone devices. The session is then activated, enabling the target device to be accessed by its host. When the information is no longer needed, the session can be terminated. TimeFinder/Clone operations are controlled from the host by using the symclone command to create, activate, and terminate the copy sessions.

Figure 17 on page 85 illustrates a copy session where the controlling host creates a TimeFinder/Clone copy of standard device DEV001 on target device DEV005, using the symclone command.

Figure 17 Creating a copy session using the symclone command (diagram: a server running SYMCLI creates a clone of DEV001 onto DEV005, which is accessed by a target host)

The symclone command is used to enable cloning operations. The cloning operation happens in two phases: creation and activation.
The creation phase builds bitmaps of the source and target that are later used during the activation or copy phase. Creating a symclone pairing does not start copying the source volume to the target volume unless the -precopy keyword is used. For example, to create clone sessions on all the standards and BCVs in the device group MyDevGrp, use the following command:

symclone -g MyDevGrp create -noprompt

The activation of a clone enables the copying of the data. The data may start copying immediately if the -copy keyword is used. If the -copy keyword is not used, tracks are copied only when they are accessed from the target volume or when they are changed on the source volume. The clone session established by the previous create command can be activated with the following command:

symclone -g MyDevGrp activate -noprompt

New Symmetrix VMAX TimeFinder/Clone features

Solutions Enabler 7.1 and Enginuity 5874 SR1 introduce the ability to clone from thick to thin devices using TimeFinder/Clone. Thick-to-thin TimeFinder/Clone allows application data to be moved from standard Symmetrix volumes to Virtually Provisioned storage within the same array. For some workloads, Virtually Provisioned volumes offer advantages in allocation utilization, ease of use, and performance through automatic wide striping. Thick-to-thin TimeFinder/Clone provides an easy way to move workloads that benefit from Virtual Provisioning into that storage paradigm. Migration from thin devices back to fully provisioned devices is also possible. The source and target of the migration may be of different protection types and disk technologies, offering versatility with protection schemes and disk tier options. Thick-to-thin TimeFinder/Clone does not disrupt hosts or internal array replication sessions during the copy process.

TimeFinder/Snap operations

Symmetrix arrays provide another technique to create copies of application data.
The functionality, called TimeFinder/Snap, allows users to make pointer-based, space-saving copies of data simultaneously on multiple target devices from a single source device. The data is available for access instantly. TimeFinder/Snap allows data to be copied from a single source device to as many as 128 target devices. A source device can be either a Symmetrix standard device or a BCV device controlled by TimeFinder/Mirror, the exception being a BCV working in clone emulation mode. The target device is a Symmetrix virtual device (VDEV) that consumes negligible physical storage, using pointers to track changed data.

The VDEV is a host-addressable Symmetrix device with special attributes created when the Symmetrix system is configured. However, unlike a BCV, which contains a full volume of data, a VDEV is a logical-image device that offers a space-saving way to create instant, point-in-time copies of volumes. Any update to a source device after its activation with a virtual device causes the pre-update image of the changed tracks to be copied to a save device. The virtual device's indirect pointer is then updated to point to the original track data on the save device, preserving a point-in-time image of the volume. TimeFinder/Snap uses this copy-on-first-write technique to conserve disk space, since only changes to tracks on the source cause any incremental storage to be consumed.

The symsnap create and symsnap activate commands are used to create a source/target snap pair. Table 3 on page 87 summarizes some of the differences between devices used in TimeFinder/Snap operations.

Table 3 TimeFinder device type summary

Virtual device: A logical-image device that saves disk space through the use of pointers to track data and that is immediately accessible after activation. Snapping data to a virtual device uses a copy-on-first-write technique.
Save device: A device that is not host-accessible and is accessed only through the virtual devices that point to it. Save devices provide a pool of physical space for storing snap copy data to which virtual devices point.

BCV: A full-volume mirror that has valid data after fully synchronizing with its source device. It is accessible only when split from the source device it mirrors.

Snap copy sessions

TimeFinder/Snap functionality is managed via copy sessions, which pair the source and target devices. Sessions are maintained on the Symmetrix system and can be queried to verify the current state of the devices. A copy session must first be created, a process that defines the snap devices in the operation. On subsequent activation, the target virtual devices become accessible to their hosts. Unless the data is changed by the host accessing the virtual device, the virtual device always presents a frozen, point-in-time copy of the source device as of the moment of activation. When the information is no longer needed, the session can be terminated.

TimeFinder/Snap operations are controlled from the host by using the symsnap command to create, activate, terminate, and restore the TimeFinder/Snap copy sessions. The TimeFinder/Snap operations described in this section explain how to manage the devices participating in a copy session through SYMCLI. Figure 18 on page 88 illustrates a virtual copy session where the controlling host creates a copy of standard device DEV001 on target device VDEV005.

Figure 18 TimeFinder/Snap copy of a standard device to a VDEV (diagram: DEV001 and VDEV005 with a save device; device pointers run from the VDEV to the original data, and data is copied to the save area on first write)

The symsnap command is used to enable TimeFinder/Snap operations. The snap operation happens in two phases: creation and activation.
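The copy-on-first-write behavior described above can be sketched as follows. This is a simplified, assumed model with invented names and dictionary-based tracks, not the actual VDEV pointer implementation:

```python
# Simplified model of TimeFinder/Snap copy-on-first-write (illustrative only):
# the first write to a source track preserves its pre-update image in the
# save pool, so the VDEV keeps presenting the point-in-time copy.

class SnapSession:
    def __init__(self, source):
        self.source = source      # track -> data on the source device
        self.save_pool = {}       # pre-update images, filled on first write

    def write_source(self, track, data):
        if track not in self.save_pool:                 # first write to this track?
            self.save_pool[track] = self.source[track]  # save the old image first
        self.source[track] = data

    def read_vdev(self, track):
        # VDEV pointer: the save pool if the track has changed, else the source
        return self.save_pool.get(track, self.source[track])

snap = SnapSession({"t0": "A", "t1": "B"})
snap.write_source("t0", "X")   # "A" is copied to the save pool before the update
print(snap.read_vdev("t0"))    # A: the point-in-time image is preserved
print(snap.read_vdev("t1"))    # B: unchanged tracks are read from the source
```

Note that only the first write to a track consumes save-pool space; subsequent writes to the same track leave the preserved image untouched, which is why snap storage consumption grows with the number of changed tracks rather than the number of writes.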
The creation phase builds bitmaps of the source and target that are later used to manage the changes on the source and target. Creating a snap pairing does not copy the data from the source volume to the target volume. To create snap sessions on all the standards and BCVs in the device group MyDevGrp, use the following command:

symsnap -g MyDevGrp create -noprompt

The activation of a snap enables the protection of the source data tracks. When protected tracks are changed on the source volume, they are first copied into the save pool and the VDEV pointers are updated to point to those preserved tracks in the save pool. When tracks are changed on the VDEV itself, the data is written directly to the save pool and the VDEV pointers are updated in the same way. The snap session created by the previous create command can be activated with the following command:

symsnap -g MyDevGrp activate -noprompt

EMC Storage Resource Management

The Storage Resource Management (SRM) component of EMC Solutions Enabler extends the basic SYMCLI command set with SRM commands that allow users to discover and examine attributes of various objects on a host or in the EMC storage enterprise.

Note: The acronym for EMC Storage Resource Management (SRM) can be easily confused with the acronym for VMware Site Recovery Manager. To avoid any confusion, this document always refers to VMware Site Recovery Manager as VMware SRM.

SYMCLI commands support SRM in the following areas:

◆ Data objects and files
◆ Relational databases
◆ File systems
◆ Logical volumes and volume groups
◆ Performance statistics

SRM allows users to examine the mapping of storage devices and the characteristics of data files and objects. These commands allow the examination of relationships between extents and data files or data objects, and of how they are mapped on storage devices.
Frequently, SRM commands are used with TimeFinder and SRDF to create point-in-time copies for backup and restart. Figure 19 on page 90 outlines how SRM commands are used with TimeFinder in a database environment.

Figure 19 SRM commands (diagram: SRM mapping commands on the host invoke database APIs, identify devices, map database objects between the database metadata and the SYMCLI database, and then TimeFinder splits the data and log BCVs)

EMC Solutions Enabler with a valid license for TimeFinder and SRM is installed on the host. In addition, the host must also have PowerPath or use ECA, and must be used with a supported DBMS. As discussed in the section "TimeFinder split operations" on page 78, when splitting a BCV, the system must perform housekeeping tasks that may require a few seconds on a busy Symmetrix system. These tasks involve a series of steps (shown in Figure 19 on page 90) that result in the separation of the BCV from its paired standard:

1. Using the SRM base mapping commands, first query the Symmetrix system to display the logical-to-physical mapping information for any physical device, logical volume, file, directory, and/or file system.
2. Using the database mapping command, query the Symmetrix system to display physical and logical database information.
3. Next, use the database mapping command to translate:
• The devices of a specified database into a device group or a consistency group, or
• The devices of a specified table space into a device group or a consistency group.
4. The BCV is split from the standard device.

Table 4 lists the SYMCLI commands used to examine the mapping of data objects.
Table 4 Data object SRM commands

symrslv pd: Displays logical-to-physical mapping information about any physical device.
symrslv lv: Displays logical-to-physical mapping information about a logical volume.
symrslv file: Displays logical-to-physical mapping information about a file.
symrslv dir: Displays logical-to-physical mapping information about a directory.
symrslv fs: Displays logical-to-physical mapping information about a file system.

SRM commands allow users to examine the host database mapping and the characteristics of a database. The commands provide listings and attributes that describe various databases, their structures, files, table spaces, and user schemas. Typically, the database commands work with Oracle, Informix, SQL Server, Sybase, Microsoft Exchange, SharePoint Portal Server, and DB2 LUW database applications. Table 5 on page 91 lists the SYMCLI commands used to examine the mapping of database objects.

Table 5 Data object mapping commands

symrdb list: Lists various physical and logical database objects: current relational database instances available; table spaces, tables, files, or schemas of a database; files, segments, or tables of a database table space or schema.
symrdb show: Shows information about a database object: a table space, table, file, or schema of a database; a file, segment, or table of a specified table space or schema.
symrdb rdb2dg: Translates the devices of a specified database into a device group.
symrdb rdb2cg: Translates the devices of a specified database into a composite group or a consistency group.
symrdb tbs2cg: Translates the devices of a specified table space into a composite group. Only data database files are translated.
symrdb tbs2dg: Translates the devices of a specified table space into a device group. Only data database files are translated.
The SYMCLI file system SRM command allows users to investigate the file systems in use on the operating system. The command provides listings and attributes that describe file systems, directories, and files, and their mapping to physical devices and extents. Table 6 on page 92 lists the SYMCLI command that can be used to examine file system mapping.

Table 6 File system SRM commands to examine file system mapping

symhostfs list: Displays a list of file systems, files, or directories.
symhostfs show: Displays more detailed information about a file system or file system object.

SYMCLI logical volume SRM commands allow users to map logical volumes to display a detailed view of the underlying storage devices. A logical volume architecture defined by a Logical Volume Manager (LVM) is a means for advanced applications to improve performance through the strategic placement of data. Table 7 on page 93 lists the SYMCLI commands that can be used to examine logical volume mapping.

Table 7 Logical volume SRM commands to examine logical volume mapping

symvg deport: Deports a specified volume group so it can be imported later.
symvg import: Imports a specified volume group.
symvg list: Displays a list of volume groups defined on the host system by the logical volume manager.
symvg rescan: Rescans all the volume groups.
symvg show: Displays more detailed information about a volume group.
symvg vg2cg: Translates volume groups to composite groups.
symvg vg2dg: Translates volume groups to device groups.
symlv list: Displays a list of logical volumes in a specified volume group.
symlv show: Displays detailed information (including extent data) about a logical volume.

SRM performance statistics commands allow users to retrieve statistics about a host's CPU, disk, and memory. Table 8 on page 93 lists the statistics commands.
Table 8 SRM statistics commands

symhost show: Displays host configuration information.
symhost stats: Displays performance statistics.

EMC Storage Viewer

EMC Storage Viewer (SV) for vSphere Client extends the vSphere Client to facilitate discovery and identification of EMC Symmetrix storage devices that are allocated to VMware ESX/ESXi hosts and virtual machines. The Storage Viewer for vSphere Client presents the underlying storage details to the virtual datacenter administrator, merging the data of several different storage mapping tools into a few seamless vSphere Client views.

The Storage Viewer for vSphere Client enables you to resolve the underlying storage of Virtual Machine File System (VMFS) datastores and virtual disks, as well as raw device mappings (RDMs). In addition, you are presented with lists of storage arrays and devices that are accessible to the ESX and ESXi hosts in the virtual datacenter. Previously, these details were available only through separate storage management applications.

Once installed and configured, Storage Viewer provides four different views:

◆ The global EMC Storage view. This view configures the global settings for the Storage Viewer, including the Solutions Enabler client/server settings, log settings, and version information. Additionally, an arrays tab lists all of the storage arrays currently being managed by Solutions Enabler, and allows for the discovery of new arrays and the deletion of previously discovered arrays.
◆ The EMC Storage tab for hosts. This tab appears when an ESX/ESXi host is selected. It provides insight into the storage that is configured and allocated for a given ESX/ESXi host.
◆ The SRDF SRA tab for hosts. This view also appears when an ESX/ESXi host is selected on a vSphere Client running on a VMware Site Recovery Manager server.
It allows you to configure device pair definitions for the EMC SRDF Storage Replication Adapter (SRA), for use when testing VMware Site Recovery Manager recovery plans or when creating gold copies before VMware Site Recovery Manager recovery plans are executed.

◆ The EMC Storage tab for virtual machines. This view appears when a virtual machine is selected. It provides insight into the storage that is allocated to a given virtual machine, including both virtual disks and raw device mappings (RDMs).

A typical view of the Storage Viewer for vSphere Client can be seen in Figure 20 on page 95.

Figure 20 EMC Storage Viewer

EMC PowerPath

EMC PowerPath is host-based software that works with networked storage systems to intelligently manage I/O paths. PowerPath manages multiple paths to a storage array. Supporting multiple paths enables recovery from path failure, because PowerPath automatically detects path failures and redirects I/O to other available paths. PowerPath also uses sophisticated algorithms to provide dynamic load balancing for several kinds of user-settable path management policies. With PowerPath, systems administrators can ensure that applications on the host have highly available access to storage and perform optimally at all times.

A key feature of path management in PowerPath is dynamic multipath load balancing. Without PowerPath, an administrator must statically load balance paths to logical devices to improve performance. For example, based on current usage, the administrator might configure three heavily used logical devices on one path, seven moderately used logical devices on a second path, and 20 lightly used logical devices on a third path. As I/O patterns change, these statically configured paths may become unbalanced, causing performance to suffer.
The administrator must then reconfigure the paths, and continue to reconfigure them as I/O traffic between the host and the storage system shifts in response to usage changes.

Designed to use all paths concurrently, PowerPath distributes I/O requests to a logical device across all available paths, rather than requiring a single path to bear the entire I/O burden. PowerPath can distribute the I/O for all logical devices over all paths shared by those logical devices, so that all paths are equally burdened. PowerPath load balances I/O on a host-by-host basis, and maintains statistics on all I/O for all paths. For each I/O request, PowerPath intelligently chooses the least-burdened available path, depending on the load-balancing and failover policy in effect.

In addition to improving I/O performance, dynamic load balancing reduces management time and downtime, because administrators no longer need to manage paths across logical devices. With PowerPath, configurations of paths and policies for an individual device can be changed dynamically, taking effect immediately, without any disruption to applications.

PowerPath provides the following features and benefits:

◆ Multiple paths, for higher availability and performance — PowerPath supports multiple paths between a logical device and a host bus adapter (HBA, a device through which a host can issue I/O requests). Having multiple paths enables the host to access a logical device even if a specific path is unavailable. Also, multiple paths can share the I/O workload to a given logical device.
◆ Dynamic multipath load balancing — Through continuous I/O balancing, PowerPath improves a host's ability to manage heavy I/O loads. PowerPath dynamically tunes paths for performance as workloads change, eliminating the need for repeated static reconfigurations.
◆ Proactive I/O path testing and automatic path recovery — PowerPath periodically tests failed paths to determine whether they have become available. A path is restored automatically when available, and PowerPath resumes sending I/O to it. PowerPath also periodically tests available but unused paths to ensure they are operational.
◆ Automatic path failover — PowerPath automatically redirects data from a failed I/O path to an alternate path. This eliminates application downtime; failovers are transparent and nondisruptive to applications.
◆ Enhanced high-availability cluster support — PowerPath is particularly beneficial in cluster environments because it can prevent interruptions to operations and costly downtime. PowerPath's path failover capability avoids node failover, maintaining uninterrupted application support on the active node in the event of a path disconnect (as long as another path is available).
◆ Consistent split — PowerPath allows users to perform TimeFinder consistent splits by suspending device writes at the host level for a fraction of a second while the foreground split occurs. PowerPath software provides a suspend-and-resume capability that avoids the inconsistencies and restart problems that can occur if a database-related BCV is split without first quiescing the database.
◆ Consistency groups — Consistency groups are composite groups of Symmetrix devices specially configured to act in unison to maintain the integrity of a database distributed across multiple SRDF arrays controlled by an open systems host computer.

PowerPath/VE

EMC PowerPath/VE delivers PowerPath multipathing features to optimize VMware vSphere virtual environments. With PowerPath/VE, you can standardize path management across heterogeneous physical and virtual environments. PowerPath/VE enables you to automate optimal server, storage, and path utilization in a dynamic virtual environment.
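PowerPath's least-burdened path choice and automatic failover, described above, can be sketched as a small selection function. The policy shown is an assumption for illustration; PowerPath's actual algorithms and bookkeeping are proprietary and policy-dependent:

```python
# Illustrative least-burdened path selection with failover (assumed policy,
# not PowerPath's actual algorithm).

def choose_path(paths):
    """Pick the available path with the fewest outstanding I/Os."""
    alive = [p for p in paths if p["alive"]]        # failover: skip failed paths
    if not alive:
        raise RuntimeError("no available path to the logical device")
    return min(alive, key=lambda p: p["outstanding_io"])

paths = [
    {"name": "hba0", "alive": True,  "outstanding_io": 12},
    {"name": "hba1", "alive": True,  "outstanding_io": 3},
    {"name": "hba2", "alive": False, "outstanding_io": 0},  # failed path: skipped
]
print(choose_path(paths)["name"])   # hba1: the least-burdened available path
```

Because the choice is made per I/O request against live statistics, the load distribution adapts automatically as traffic shifts, which is exactly what static per-device path assignment cannot do.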
With hyper-consolidation, a virtual environment may have hundreds or even thousands of independent virtual machines running, including virtual machines with varying levels of I/O intensity. I/O-intensive applications can disrupt I/O from other applications, and before the availability of PowerPath/VE, load balancing on an ESX host had to be configured manually to correct for this. Manual load-balancing operations to ensure that all virtual machines receive their individual required response times are time-consuming and logistically difficult to achieve effectively. PowerPath/VE works with VMware ESX and ESXi as a multipathing plug-in (MPP) that provides enhanced path management capabilities to ESX and ESXi hosts. PowerPath/VE is supported with vSphere (ESX4) only; previous versions of ESX do not have the PSA, which is required by PowerPath/VE. PowerPath/VE installs as a kernel module on the vSphere host and plugs in to the vSphere I/O stack framework to bring the advanced multipathing capabilities of PowerPath - dynamic load balancing and automatic failover - to the VMware vSphere platform (Figure 21 on page 99).

Figure 21 PowerPath/VE vStorage API for multipathing plug-in

At the heart of PowerPath/VE path management is server-resident software inserted between the SCSI device-driver layer and the rest of the operating system. This driver software creates a single "pseudo device" for a given array volume (LUN) regardless of how many physical paths it appears on. The pseudo device, or logical volume, represents all physical paths to a given device. It is then used for creating virtual disks, and for raw device mapping (RDM), which is then used for application and database access. PowerPath/VE's value fundamentally comes from its architecture and position in the I/O stack.
PowerPath/VE sits above the HBA, allowing heterogeneous support of operating systems and storage arrays. By integrating with the I/O drivers, all I/Os run through PowerPath, allowing it to serve as a single point of I/O control and management. Since PowerPath/VE resides in the ESX kernel, it sits below the Guest OS level, application level, database level, and file system level. PowerPath/VE's unique position in the I/O stack makes it an infrastructure manageability and control point - bringing more value going up the stack.

PowerPath/VE features

PowerPath/VE provides the following features: ◆ Dynamic load balancing - PowerPath is designed to use all paths at all times. PowerPath distributes I/O requests to a logical device across all available paths, rather than requiring a single path to bear the entire I/O burden. ◆ Auto-restore of paths - Periodic auto-restore reassigns logical devices when restoring paths from a failed state. Once restored, the paths automatically rebalance the I/O across all active channels. ◆ Device prioritization - Setting a high priority for a single device or several devices improves their I/O performance at the expense of the remaining devices, while otherwise maintaining the best possible load balancing across all paths. This is especially useful when there are multiple virtual machines on a host with varying application performance and availability requirements. ◆ Automated performance optimization - PowerPath/VE automatically identifies the type of storage array and sets the highest-performing optimization mode by default. For Symmetrix, the mode is SymmOpt (Symmetrix Optimized). ◆ Dynamic path failover and path recovery - If a path fails, PowerPath/VE redistributes I/O traffic from that path to functioning paths. PowerPath/VE stops sending I/O to the failed path and checks for an active alternate path. If an active path is available, PowerPath/VE redirects I/O along that path.
PowerPath/VE can compensate for multiple faults in the I/O channel (for example, HBAs, fiber-optic cables, Fibre Channel switch, storage array port). Microsoft SQL Server on EMC Symmetrix Storage Systems EMC Foundation Products ◆ Monitor/report I/O statistics - While PowerPath/VE load balances I/O, it maintains statistics for all I/O for all paths. The administrator can view these statistics using rpowermt. ◆ Automatic path testing - PowerPath/VE periodically tests both live and dead paths. By testing live paths that may be idle, a failed path may be identified before an application attempts to pass I/O down it. By marking the path as failed before the application becomes aware of it, timeout and retry delays are reduced. By testing paths identified as failed, PowerPath/VE will automatically restore them to service when they pass the test. The I/O load will be automatically balanced across all active available paths. PowerPath/VE management PowerPath/VE uses a command set, called rpowermt, to monitor, manage, and configure PowerPath/VE for vSphere. The syntax, arguments, and options are very similar to the traditional powermt commands used on all the other PowerPath Multipathing supported operating system platforms. There is one significant difference in that rpowermt is a remote management tool. Not all vSphere installations have a service console interface. In order to manage an ESXi host, customers have the option to use vCenter Server or vCLI (also referred to as VMware Remote Tools) on a remote server. PowerPath/VE for vSphere uses the rpowermt command line utility for both ESX and ESXi. PowerPath/VE for vSphere cannot be managed on the ESX host itself. There is neither a local nor remote GUI for PowerPath on ESX. Administrators must designate a Guest OS or a physical machine to manage one or multiple ESX hosts. rpowermt is supported on Windows 2003 (32-bit) and Red Hat 5 Update 2 (64-bit). 
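The pseudo-device concept can be sketched in a few lines. The device and path names below follow the emcpowerN and vmhba naming conventions mentioned in the text, but the data structure itself is a hypothetical illustration, not the PowerPath/VE implementation:

```python
class PseudoDevice:
    """One host-visible logical volume representing every physical
    path to a single array LUN (illustrative sketch only)."""
    def __init__(self, name, lun_wwn):
        self.name = name          # e.g. "emcpower0"
        self.lun_wwn = lun_wwn    # the array volume all paths lead to
        self.paths = []           # physical paths, e.g. "vmhba1:C0:T0:L10"

    def add_path(self, path):
        self.paths.append(path)

    def path_count(self):
        return len(self.paths)

# One LUN seen over four physical paths collapses into one pseudo device
# (the WWN below is a made-up example identifier):
dev = PseudoDevice("emcpower0", "60060480000190101261533030304130")
for p in ["vmhba1:C0:T0:L10", "vmhba1:C0:T1:L10",
          "vmhba2:C0:T0:L10", "vmhba2:C0:T1:L10"]:
    dev.add_path(p)
```

The host, and anything layered above it (virtual disks, RDMs, databases), addresses only `emcpower0`; which physical path carries each I/O is decided underneath.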
When the vSphere host server is connected to the Symmetrix system, the PowerPath/VE kernel module running on the vSphere host will associate all paths to each device presented from the array and associate a pseudo device name (as discussed earlier). An example of this is shown in Figure 22, which shows the output of rpowermt display host=x.x.x.x dev=emcpower0. Note in the output that the device has four paths and displays the optimization mode (SymmOpt = Symmetrix optimization).

Figure 22 Output of the rpowermt display command on a Symmetrix VMAX device

As more VMAX Engines or Symmetrix DMX directors become available, the connectivity can be scaled as needed. PowerPath/VE supports up to 32 paths to a device. These methodologies for connectivity ensure all front-end directors and processors are utilized, providing maximum potential performance and load balancing for vSphere hosts connected to the Symmetrix VMAX/DMX storage arrays in combination with PowerPath/VE.

PowerPath/VE in vCenter Server

PowerPath/VE for vSphere is managed, monitored, and configured using rpowermt as discussed in the previous section. This CLI-based management is common across all PowerPath platforms, and there is presently very little integration with VMware management tools. However, LUN ownership is presented in the GUI. As seen in Figure 23 on page 103, under the ESX Configuration tab and within the Storage Devices list, the owner of the device is shown.

Figure 23 Device ownership in vCenter Server

Figure 23 on page 103 shows a number of different devices owned by PowerPath. A set of claim rules is added to the vSphere PSA, which enables PowerPath/VE to manage supported storage arrays. As part of the initial installation process and claiming of devices by PowerPath/VE, the system must be rebooted.
Nondisruptive installation is discussed in the following section.

Nondisruptive installation of PowerPath/VE using VMotion

Installing PowerPath/VE on a vSphere host requires a reboot. Just as with other PowerPath platforms, either the host must be rebooted or the I/O to applications running on the host must be stopped. In the case of vSphere, the migration capability built into the hypervisor allows members of the cluster to have PowerPath/VE installed without disrupting active virtual machines. VMware VMotion technology leverages the complete virtualization of servers, storage, and networking to move an entire running virtual machine instantaneously from one server to another. VMware VMotion uses the VMware cluster file system to control access to a virtual machine's storage. During a VMotion operation, the active memory and precise execution state of a virtual machine are rapidly transmitted over a high-speed network from one physical server to another, and access to the virtual machine's disk storage is instantly switched to the new physical host. It is therefore advised, in order to eliminate any downtime, to use VMotion to move all running virtual machines off the ESX host server before the installation of PowerPath/VE. If the ESX host server is in a fully automated High Availability (HA) cluster, put the ESX host into maintenance mode, which will immediately begin migrating all of the virtual machines off the ESX host to other servers in the cluster. As always, it is necessary to perform a number of checks before evacuating virtual machines from an ESX host to make sure that the virtual machines can actually be migrated. These checks include making sure that: ◆ VMotion is properly configured and functioning. ◆ The datastores containing the virtual machines are shared over the cluster.
◆ No virtual machines are using physical media from their ESX host system (that is, CD-ROMs, USB drives). ◆ The remaining ESX hosts in the cluster will be able to handle the additional load of the temporarily migrated virtual machines. Performing these checks will help to ensure the successful (and error-free) migration of the virtual machines. Additionally, this due diligence will greatly reduce the risk of degraded virtual machine performance resulting from overloaded ESX host systems. For more information on configuring and using VMotion, refer to VMware documentation. This process should be repeated on all ESX hosts in the cluster until all PowerPath installations are complete.

EMC Replication Manager

EMC Replication Manager is an EMC software application that dramatically simplifies the management and use of disk-based replications to improve the availability of users' mission-critical data and to enable rapid recovery of that data in case of corruption. Note: Not all functionality offered by EMC Replication Manager is supported in a VMware Infrastructure environment. The EMC Replication Manager Support Matrix available on Powerlink® (EMC's password-protected customer- and partner-only website) provides further details on supported configurations. Replication Manager helps users manage replicas as if they were tape cartridges in a tape library unit. Replicas may be scheduled or created on demand, with predefined expiration periods and automatic mounting to alternate hosts for backups or scripted processing. Individual users with different levels of access ensure system and replica integrity. In addition to these features, Replication Manager is fully integrated with many critical applications such as DB2 LUW, Oracle, and Microsoft Exchange.
Replication Manager makes it easy to create point-in-time, disk-based replicas of applications, file systems, or logical volumes residing on existing storage arrays. It can create replicas of information stored in the following environments: ◆ Oracle databases ◆ DB2 LUW databases ◆ Microsoft SQL Server databases ◆ Microsoft Exchange databases ◆ UNIX file systems ◆ Windows file systems ◆ VMware file systems The software utilizes a Java-based client-server architecture. Replication Manager can: ◆ Create point-in-time replicas of production data in seconds. ◆ Facilitate quick, frequent, and non-destructive backups from replicas. ◆ Mount replicas to alternate hosts to facilitate offline processing (for example, decision-support services, integrity checking, and offline reporting). ◆ Restore deleted or damaged information quickly and easily from a disk replica. ◆ Set the retention period for replicas so that storage is made available automatically. Replication Manager has a generic storage technology interface that allows it to connect to and invoke replication methodologies available on: ◆ EMC Symmetrix arrays ◆ EMC CLARiiON® arrays ◆ HP StorageWorks arrays Replication Manager uses Symmetrix API (SYMAPI) Solutions Enabler software and interfaces to the storage array's native software to manipulate the supported disk arrays. Replication Manager automatically controls the complexities associated with creating, mounting, restoring, and expiring replicas of data. Replication Manager performs all of these tasks and offers a logical view of the production data and corresponding replicas. Replicas are managed and controlled with the easy-to-use Replication Manager console.
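The replica lifecycle that Replication Manager automates, creation with a predefined expiration period followed by automatic retirement so that storage is made available again, can be sketched as follows. This is an illustrative model only; the class and field names are hypothetical and not the Replication Manager API:

```python
from datetime import datetime, timedelta

class ReplicaCatalog:
    """Sketch of a replica catalog with retention-based expiry
    (hypothetical structure, not Replication Manager internals)."""
    def __init__(self):
        self.replicas = []

    def create(self, app, created, retention_days):
        """Register a new point-in-time replica with its expiry date."""
        self.replicas.append({
            "app": app,
            "created": created,
            "expires": created + timedelta(days=retention_days),
        })

    def expire(self, now):
        """Retire replicas past their retention period; return how
        many were freed so their storage can be reused."""
        before = len(self.replicas)
        self.replicas = [r for r in self.replicas if r["expires"] > now]
        return before - len(self.replicas)
```

The tape-library analogy in the text maps directly onto this: creating a replica is loading a cartridge, and expiry returns the "slot" (the storage) to the pool.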
106 Microsoft SQL Server on EMC Symmetrix Storage Systems EMC Foundation Products EMC Open Replicator EMC Open Replicator enables distribution and/or consolidation of remote point-in-time copies between EMC Symmetrix DMX and qualified storage systems such as the EMC CLARiiON storage arrays. By leveraging the high-end Symmetrix DMX storage architecture, Open Replicator offers unmatched deployment flexibility and massive scalability. Open Replicator can be used to provide solutions to business processes that require high-speed data mobility, remote vaulting and data migration. Specifically, Open Replicator enables customers to: ◆ Rapidly copy data between Symmetrix, CLARiiON and third-party storage arrays. ◆ Perform online migrations from qualified storage to Symmetrix DMX arrays with minimal disruption to host applications. ◆ Push a point-in-time copy of applications from Symmetrix DMX arrays to a target volume on qualified storage arrays with incremental updates. ◆ Copy from source volumes on qualified remote arrays to Symmetrix DMX volumes. Open Replicator is tightly integrated with the EMC TimeFinder and SRDF family of products, providing enterprises with highly flexible and lower-cost options for remote protection and migration. Open Replicator is ideal for applications and environments where economics and infrastructure flexibility outweigh RPO and RTO requirements. Open Replicator enables businesses to: ◆ Provide a cost-effective and flexible solution to protect lower-tier applications. ◆ Reduce TCO by pushing or pulling data from Symmetrix DMX systems to other qualified storage arrays in conventional SAN/WAN environments. ◆ Create remote point-in-time copies of production applications for many ancillary business operations such as data vaulting. ◆ Obtain cost-effective application restore capabilities with minimal RPO/RTO impact. ◆ Comply with industry policies and government regulations. 
EMC Open Replicator 107 EMC Foundation Products EMC Virtual Provisioning Virtual Provisioning™ (commonly known as Thin Provisioning) was released with the 5773 Enginuity operating environment. Virtual Provisioning allows for storage to be allocated/accessed on-demand from a pool of storage servicing one or many applications. This type of approach has multiple benefits: ◆ Enables LUNs to be “grown” into over time with no impact to the host or application as space is added to the thin pool ◆ Only delivers space from the thin pool when it is written to, that is, on-demand. Overallocated application components only use space that is written to — not requested. ◆ Provides for thin-pool wide striping and for the most part relieves the storage administrator of the burden of physical device/LUN configuration Virtual Provisioning introduces two new devices to the Symmetrix. The first device is a thin device and the second device is a data device. These are described in the following two sections. Thin device A thin device is a “Host accessible device” that has no storage directly associated with it. Thin devices have pre-configured sizes and appear to the host to have that exact capacity. Storage is allocated in chunks when a block is written to for the first time. Zeros are provided to the host for data that is read from chunks that have not yet been allocated. Data device Data devices are specifically configured devices within the Symmetrix that are containers for the written-to blocks of thin devices. Any number of data devices may comprise a data device pool. Blocks are allocated to the thin devices from the pool on a round robin basis. This allocation block size is 768K. 
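The thin/data device relationship just described can be modeled in a few lines. This sketch uses hypothetical classes, not Enginuity internals; it shows the three behaviors the text describes: capacity advertised up front, extents allocated round-robin across the pool's data devices only on first write, and zero-filled reads of never-written regions:

```python
EXTENT_KB = 768  # allocation unit cited in the text (12 tracks)

class ThinPool:
    """Pool of data devices backing thin devices (illustrative)."""
    def __init__(self, data_devices):
        self.alloc = {d: 0 for d in data_devices}  # extents per device
        self._order = list(data_devices)
        self._rr = 0

    def allocate_extent(self):
        """Round-robin one extent across the pool's data devices."""
        d = self._order[self._rr % len(self._order)]
        self._rr += 1
        self.alloc[d] += 1
        return d

class ThinDevice:
    """Host-visible device: full size is advertised, but pool space
    is consumed only when a region is first written."""
    def __init__(self, pool):
        self.pool = pool
        self.extents = {}          # extent index -> backing data device

    def write(self, extent_index):
        if extent_index not in self.extents:   # first write: allocate
            self.extents[extent_index] = self.pool.allocate_extent()

    def read(self, extent_index):
        # unallocated regions read back as zeros, per the text
        return "data" if extent_index in self.extents else "zeros"
```

Writing six distinct extents against a three-device pool lands two extents on each data device; rewriting an already-allocated extent consumes nothing further.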
Figure 24 on page 109 depicts the components of a Virtual Provisioning configuration.

Figure 24 Virtual Provisioning components (thin devices drawing their storage from the data devices in pools A and B)

Symmetrix VMAX specific features

Solutions Enabler 7.1 and Enginuity 5874 SR1 introduce two new features to Symmetrix Virtual Provisioning: thin pool write balancing and zero space reclamation. Thin pool write balancing provides the ability to automatically rebalance allocated extents on data devices over the entire pool when new data devices are added. Zero space reclamation allows users to reclaim space from tracks of data devices that are all zeros.

Thin pool write rebalance

Thin pool write rebalancing for Virtual Provisioning pools extends the functionality of the Virtual Provisioning feature by implementing a method to normalize the used capacity levels of data devices within a virtual data pool after new data drives are added or existing data drives are drained. This feature introduces a background optimization task that scans the used capacity levels of the data devices within a virtual pool and moves multiple track groups from the most utilized pool data devices to the least utilized pool data devices. The process can be scheduled to run only when changes to virtual pool composition make it necessary, and user controls exist to specify what utilization delta will trigger track group movement.

Zero space reclamation

Zero space reclamation, or Virtual Provisioning space reclamation, provides the ability to free, also referred to as "de-allocate," storage extents found to contain all zeros. This feature is an extension of the existing Virtual Provisioning space de-allocation mechanism. Previous versions of Enginuity and Solutions Enabler allowed for reclaiming allocated (reserved but unused) thin device space from a thin pool.
Administrators now have the ability to reclaim both allocated/unwritten extents as well as extents filled with host-written zeros within a thin pool. The space reclamation process is nondisruptive and can be executed with the targeted thin device ready and read/write to operating systems and applications. Starting the space reclamation process spawns a back-end disk director (DA) task that will examine the allocated thin device extents on specified thin devices. A thin device extent is 768 KB (or 12 tracks) in size and is the default unit of storage at which allocations occur. For each allocated extent, all 12 tracks will be brought into Symmetrix cache and examined to see if they contain all zero data. If the entire extent contains all zero data, the extent will be de-allocated and added back into the pool, making it available for a new extent allocation operation. An extent that contains any non-zero data is not reclaimed.

EMC Virtual LUN migration

This feature offers system administrators the ability to transparently migrate both host-visible LUNs and Thin LUNs across the differing tiers of storage that are available in the Symmetrix VMAX. Enginuity 5875 is required for migration of Thin LUNs between Thin Pools. When used for migration of Thin LUNs, only the allocated Thin extents of the LUN being migrated are transferred to the target Thin Pool. The storage tiers can represent differing hardware capability as well as differing tiers of protection. Traditional full LUNs, constructed from disk groups, can be migrated either to unallocated space (also referred to as unconfigured space) or to configured space, which is defined as existing Symmetrix LUNs that are not currently assigned to a server (existing, not-ready volumes) within the same subsystem. For Thin devices, migrations are specified to a target Thin Pool, and assume that sufficient space is available in the target pool.
Thin LUN migrations do not support swap operations. For both disk group and Thin LUN migrations, the data on the original source LUN or source Thin Pool is cleared using instant VTOC once the migration has been deemed successful. The migrations do not require swap or DVR space, and are nondisruptive to the attached hosts or other internal Symmetrix applications such as TimeFinder and SRDF. Figure 25 on page 111 shows the valid combinations of drive types and protection types that are available for migration.

Figure 25 Virtual LUN eligibility tables (migration is supported between any of the drive types - Flash, Fibre Channel, and SATA - and between the supported RAID protection types, but not to an unprotected target)

The device migration is completely transparent to the host on which an application is running, since the operation is executed against the Symmetrix device; the target and LUN number are not changed and applications are uninterrupted. Furthermore, in SRDF environments, the migration does not require customers to re-establish their disaster recovery protection after the migration. The Virtual LUN feature leverages the newly designed virtual RAID architecture introduced with Enginuity 5874, which abstracts device protection from its logical representation to a server. This powerful approach allows a device to have multiple simultaneous protection types such as BCVs, SRDF, Concurrent SRDF, and spares. It also enables seamless transition from one protection type to another while servers and their associated applications and Symmetrix software are accessing the device. The Virtual LUN feature offers customers the ability to effectively utilize SATA storage - a much cheaper, yet reliable, form of high-capacity storage.
It also facilitates fluid movement of data across the various storage tiers present within the subsystem - the realization of true "tiered storage in the box." Thus, Symmetrix VMAX becomes the first enterprise storage subsystem to offer a comprehensive "tiered storage in the box" ILM capability that complements the customer's tiering initiatives. Customers can now achieve varied cost/performance profiles by moving lower-priority application data to less expensive storage or, conversely, moving higher-priority or critical application data to higher-performing storage as their needs dictate. Specific use cases for customer applications enable the moving of data volumes transparently from tier to tier based on changing performance (moving to faster or slower disks) or availability requirements (changing RAID protection on the array). This migration can be performed transparently, without interrupting those applications or host systems utilizing the array volumes, and with only a minimal impact to performance during the migration. The following sample commands show how to move two LUNs of a host environment from RAID 6 protection on Fibre Channel 15k rpm drives to Enterprise Flash drives. The symmigrate command, introduced in EMC Solutions Enabler 7.0, is used to perform the migrate operation. The source Symmetrix hypervolume numbers are 200 and 201, and the target Symmetrix hypervolumes on the Enterprise Flash drives are A00 and A01.

1. A file (migrate.ctl) is created that contains the two LUNs to be migrated. The file has the following content:

   200 A00
   201 A01

2. The following command is executed to perform the migration:

   symmigrate -sid 1261 -name <ds_mig> -f <migrate.ctl> establish

   The ds_mig name associated with this migration can be used to interrogate the progress of the migration.

3.
To inquire on the progress, use the following command:

   symmigrate -sid 1261 -name <ds_mig> query

The two host-accessible LUNs are migrated without impacting application or server availability.

EMC Fully Automated Storage Tiering for Disk Pools

With the release of Enginuity 5874, EMC now offers the first generation of Fully Automated Storage Tiering technology. EMC Symmetrix VMAX Fully Automated Storage Tiering for Disk Pools (FAST DP) for standard provisioned environments automates the identification of data volumes for the purposes of allocating or re-allocating application data across different performance tiers within an array. FAST proactively monitors workloads at the volume (LUN) level in order to identify "busy" volumes that would benefit from being moved to higher-performing drives. FAST will also identify less "busy" volumes that could be relocated to higher-capacity drives, without existing performance being affected. This promotion/demotion activity is based on policies that associate a storage group with multiple drive technologies, or RAID protection schemes, based on the performance requirements of the application contained within the storage group. Data movement executed during this activity is performed nondisruptively, without affecting business continuity and data availability. The primary benefits of FAST include: • Automating the process of identifying volumes that can benefit from Enterprise Flash Drives and/or that can be kept on higher-capacity, less-expensive drives without impacting performance • Improving application performance at the same cost, or providing the same application performance at lower cost. Cost is defined as space, energy, acquisition, management and operational expense.
• Optimizing and prioritizing business applications, which allows customers to dynamically allocate resources within a single array • Delivering greater flexibility in meeting different price/performance ratios throughout the lifecycle of the information stored Management and operation of FAST are provided by SMC, as well as by the Solutions Enabler Command Line Interface (SYMCLI). Also, detailed performance trending, forecasting, alerts, and resource utilization are provided through Symmetrix Performance Analyzer (SPA). Ionix™ ControlCenter® provides the capability for advanced reporting and analysis to be used for chargeback and capacity planning.

EMC Fully Automated Storage Tiering for Virtual Pools

With the release of Enginuity 5874, EMC now offers the benefits of both Fully Automated Storage Tiering technology and the efficiencies of the Virtual Provisioning model. EMC Symmetrix VMAX Fully Automated Storage Tiering for Virtual Pools (FAST VP) for virtually provisioned environments automates the identification of granular data extents from host-accessible thin volumes for the purposes of allocating or re-allocating application data across different performance tiers within an array. FAST VP proactively monitors workloads at the sub-LUN level in order to identify "busy" extents within "busy" volumes that would benefit from being moved to higher-performing thin pools. FAST VP will also identify less "busy" extents within thin volumes that could be re-allocated to higher-capacity drives, without existing performance being affected. This sub-LUN promotion/demotion activity is based on policies that associate a storage group with multiple drive technologies, or RAID protection schemes, based on the performance requirements of the application contained within the storage group.
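The sub-LUN promotion/demotion decision can be sketched as a simple classification pass over per-extent statistics. The thresholds, device numbers, and data shapes here are hypothetical illustrations, not the FAST VP algorithm; the point they make is that only the busy extents of a thin volume are promoted, rather than the entire LUN:

```python
def classify_extents(extent_iops, promote_at, demote_at):
    """Classify each (volume, extent) pair for tier movement:
    extents at or above promote_at are promotion candidates to a
    faster pool, extents at or below demote_at are demotion
    candidates to higher-capacity drives, and the rest stay put."""
    promote, demote = [], []
    for vol, extents in extent_iops.items():
        for ext, iops in sorted(extents.items()):
            if iops >= promote_at:
                promote.append((vol, ext))
            elif iops <= demote_at:
                demote.append((vol, ext))
    return promote, demote
```

For a volume with one hot extent, one idle extent, and one moderately active extent, only the hot extent moves up and only the idle extent moves down, which is the "proportionally more efficient" behavior the text attributes to FAST VP.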
With FAST VP implementations, significant benefits are derived from the identification of the sub-LUN extents that are generating the specific workloads. The sub-LUN movements are proportionally more efficient in their use of target pools, as well as in the time taken to execute movements. Data movement executed during this activity is performed nondisruptively, without affecting business continuity and data availability. The primary benefits of FAST VP expand upon the benefits provided by FAST DP by: ◆ Significantly faster movements of sub-LUN extents to the most cost-effective or performant tier of storage. ◆ More efficient utilization of storage system resources by only requiring allocations based on FAST VP sub-LUN extents. Management and operation of FAST VP are provided by SMC, as well as by the Solutions Enabler Command Line Interface (SYMCLI). Also, detailed performance trending, forecasting, alerts, and resource utilization are provided through Symmetrix Performance Analyzer (SPA). Ionix ControlCenter provides the capability for advanced reporting and analysis to be used for chargeback and capacity planning.

3 Storage Provisioning

This chapter describes storage provisioning used when deploying Symmetrix in a storage area network.
◆ Storage provisioning
◆ SAN storage provisioning
◆ Challenges of traditional storage provisioning
◆ Virtual Provisioning
◆ Fully Automated Storage Tiering
◆ Deploying FAST DP with SQL Server databases
◆ Deploying FAST VP with SQL Server databases
Storage provisioning

External storage can be attached to hosts in various ways. Some of these ways are listed below: ◆ Fibre Channel Switched (FC-SW) ◆ Fibre Channel Arbitrated Loop (FC-AL) ◆ Network Attached Storage (NAS) ◆ iSCSI The most common way to provision storage from a Symmetrix storage controller is using switched Fibre Channel (FC-SW). The Storage Area Network (SAN), with multiple Fibre Channel switches and dual host bus adapters (HBAs), allows for the highest throughput and availability. Provisioning Symmetrix storage across a SAN involves multiple steps: ◆ Creation of a bin file — a bin file is a special file that is used to configure a Symmetrix. The bin file configuration for the Symmetrix describes how all the physical disks are divided into host-addressable and non-host-addressable volumes. It also describes how the front end and back end are configured, how the hypervolumes are protected, which hypervolumes are addressed by which front-end directors, and what the host addresses are (SCSI target and LUN). ◆ Fabric zoning — a software-enforced path isolation methodology on a SAN to define the valid paths through the fabric(s) for the Fibre Channel HBAs and the Fibre Channel adapters on the Symmetrix. ◆ LUN masking — a real-time software filtering mechanism for volumes that are presented on a specific front-end director and port, allowing and disallowing hosts to access those volumes. The following section presents a brief overview of the typical storage provisioning tasks used when implementing a Symmetrix in a Storage Area Network (SAN).

SAN storage provisioning

The initial tasks for deploying a Symmetrix in a SAN are usually performed by storage administrators and server administrators.
They can either work with EMC Customer Engineers to create the initial bin file or use EMC software tools to design the Symmetrix configuration. Storage administrators must understand many aspects of the Symmetrix to deploy storage that fits the server and application requirements regarding:
◆ Performance
◆ Availability
◆ Capacity
  • Number of LUNs needed
  • LUN sizes
◆ RAID protection
◆ Replication requirements
Not all of these factors may be immediately obvious, especially when deploying new applications, so the administrator can be reduced to a best-guess approach. Once these initial storage provisioning tasks are completed, a server can be connected through a Fibre Channel network to the front-end adapter or host adapter of the Symmetrix. In order to navigate a path through the SAN fabric, single-HBA zoning needs to be set up. This is accomplished by defining a relationship between the unique World Wide Name (WWN) of the HBA and the WWN of the Fibre Channel director in the Symmetrix (FA). This pairing, called a zone, defines an end-to-end path through one or more switches from the server to the Symmetrix. Figure 26 depicts a simple, single-switch SAN.

Figure 26  Simple storage area network configuration (a zone pairing the server HBA WWN 10000000c944cd47 with the Symmetrix FA WWN 50060482d52e64a9 through a Fibre Channel switch)

Multiple zones are usually implemented in a SAN, and these are grouped into what is referred to as a zone set. The zone set is activated on the fabric to define the end-to-end paths for all servers to all Symmetrix arrays in the SAN. The WWN of the FA is derived from the Symmetrix serial number in combination with the FA slot number and port number. This means that should an FA be replaced, its WWN stays the same, and all previous zoning will still work with the new hardware.
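The single-HBA zoning relationship described above can be sketched as a small model. This is an illustrative sketch only, not EMC software; the WWNs are the example values used in the figure, and the helper names are hypothetical.

```python
# Illustrative sketch (not EMC software): modeling single-HBA zoning as
# (HBA WWN, FA WWN) pairs grouped into a zone set activated on the fabric.

def make_zone(hba_wwn: str, fa_wwn: str) -> tuple:
    """A zone pairs one HBA WWN with one FA WWN (single-HBA zoning)."""
    return (hba_wwn.lower(), fa_wwn.lower())

# A zone set is the collection of zones activated on the fabric.
zone_set = {
    make_zone("10000000C944CD47", "50060482D52E64A9"),
    make_zone("10000000C944CD48", "50060482D52E64A9"),
}

def path_exists(hba_wwn: str, fa_wwn: str) -> bool:
    """An HBA can reach an FA port only if a zone joins the two WWNs."""
    return make_zone(hba_wwn, fa_wwn) in zone_set

print(path_exists("10000000c944cd47", "50060482d52e64a9"))  # True
print(path_exists("10000000c944cd99", "50060482d52e64a9"))  # False
```

The model also shows why FA replacement is transparent: because the FA WWN is derived from the serial number, slot, and port rather than the hardware itself, the zone entries remain valid.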
LUN mapping

Before a LUN can be used by a host, it must be given a SCSI target address on the FA (commonly called target and LUN). This address must not conflict with any other addresses on the FA, and may also have to conform to specific host addressing requirements. For example, the Microsoft Windows Server SCSI driver stack typically can only address LUNs with a LUN ID less than 255. Presenting a LUN on an FA is a process called LUN mapping. This process is accomplished either by EMC customer service personnel during bin file creation, by an administrative user using the SYMCONFIGURE command (part of the Solutions Enabler program set), or through the Symmetrix Management Console or EMC Ionix™ ControlCenter® applications.

An issue often encountered by administrators on Symmetrix DMX systems is managing the relationship of SCSI target addressing on the FA to the resulting SCSI ID as seen by the host. Since the SCSI target address on the FA must be unique, very large configurations where hundreds of targets are mapped to an individual FA can result in LUN IDs greater than 255, which subsequently causes access issues for Windows servers. To resolve this issue, an administrator can use the LUN offset functionality in Symmetrix masking operations to present a low-order LUN address to a given HBA.

LUN masking

Once a server HBA can log in to a Symmetrix FA, devices mapped to that FA can be provisioned for the server. Since multiple servers can be zoned to the same FA director and port, a masking protocol must be employed to prevent one server from seeing or using another server's volumes. This protocol is called LUN masking. LUN masking can be accomplished using SYMCLI, ECC, or SMC. The process of LUN masking relates specific Symmetrix devices on the FA director and port to the HBA WWN zoned to that FA director and port.
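The LUN offset approach described above can be illustrated with simple arithmetic. This is a hypothetical sketch of the idea, not the Symmetrix implementation; the function name, the offset value, and the example FA addresses are assumptions.

```python
# Illustrative sketch (hypothetical helper, not Symmetrix internals): how a
# per-HBA LUN offset can re-base high FA LUN addresses into the low range a
# Windows host can address (LUN IDs below 255).

WINDOWS_MAX_LUN = 255  # typical Windows SCSI driver stack limit

def host_visible_lun(fa_lun_id: int, offset: int) -> int:
    """Re-base an FA LUN address for one HBA by subtracting a per-HBA offset."""
    host_lun = fa_lun_id - offset
    if not 0 <= host_lun < WINDOWS_MAX_LUN:
        raise ValueError(f"LUN {fa_lun_id} with offset {offset} "
                         f"is outside the host-addressable range")
    return host_lun

# FA addresses 320-322 would be invisible to Windows as-is; an offset of
# 320 presents them to this HBA as LUNs 0, 1 and 2.
print([host_visible_lun(fa_lun, offset=320) for fa_lun in (320, 321, 322)])
# -> [0, 1, 2]
```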
Figure 27 depicts this three-way relationship.

Figure 27  LUN masking relationship (a zone joins the host WWN and the FA WWN; masking defines the allowed devices for that pairing)

Auto-provisioning with Symmetrix VMAX

While the above processes of zoning, mapping, and masking may seem simple enough on the surface, in large environments the effort to perform these steps can be laborious, and they are often automated through customer-maintained scripts. In addition, many of the steps are similar; for instance, it is likely that all HBAs in a server are going to see all LUNs that are masked to the same FA. Symmetrix VMAX with Enginuity 5874 provides storage and system administrators with a simplified model for mapping and masking LUNs to servers. This storage provisioning model is referred to as Auto-provisioning Groups. Auto-provisioning Groups allow sets of HBAs (identified by World Wide Names, or WWNs) to be associated with a set of Symmetrix devices and a set of FA ports on the Symmetrix. This enables the processing of the groups as a unit and automates the storage provisioning steps of LUN mapping and LUN masking for each WWN-to-FA-port-to-device relationship. The following steps show the typical progression of actions using SYMACCESS to define the groups of resources that are processed together:

1. Create the storage group, which defines the specific Symmetrix devices that will be presented to the host.

   symaccess -sid <SID> create -name <StorGroupName> -type storage devs <SymmDevs>

2. Create the port group, which defines the director ports to which the devices are to be mapped, and through which the host will be able to access the devices defined in the storage group.

   symaccess -sid <SID> create -name <PortGroupName> -type port -dirport <SymmPorts>

3. Create the host initiator group, which defines the WWNs of the host bus adapters that are used by the host.
   symaccess -sid <SID> create -name <InitGroupName> -type initiator -wwn <HBA_WWN>
   symaccess -sid <SID> add -name <InitGroupName> -type initiator -wwn <HBA_WWN>

4. Define the view, which binds the previously defined groups. The creation of the view causes the Symmetrix to execute the mapping and masking operations necessary to make the LUNs available on the specified ports to the specified WWNs.

   symaccess -sid <SID> create view -name <ViewName> -storgrp <StorGroupName> -portgrp <PortGroupName> -initgrp <InitGroupName>

Auto-provisioning Groups functionality implicitly implements dynamic LUN addressing for the view that is created. This functionality automatically assigns LUN IDs to the devices presented to the specified WWNs, in increasing value beginning at address 0 (zero). The assignment takes into account all views incorporating the specified HBA WWNs on a given FA port, such that no conflicts occur. Additionally, user-specified options may be applied during the masking phase to ensure that LUN ID assignments are consistent across all ports. This option exists because device assignments to a given HBA WWN may not be uniform, allowing a given device to have one LUN ID on one FA port and a different LUN ID when presented through another FA port. Such discrepancies are handled by host multipath solutions such as EMC PowerPath®. The implementation of Auto-provisioning Groups is depicted in Figure 28.

Figure 28  Auto-provisioning with Symmetrix VMAX

The figure depicts a host initiator group containing four WWNs, a port group containing four FA ports, and a storage group of 1 to n Symmetrix devices.

Host LUN discovery

Once devices have been masked to the HBA WWN, the host needs to scan the Fibre Channel bus to discover the new devices.
The process of discovering new devices on Windows servers may require both the HBA driver to rescan the fabric for new devices and Windows Device Manager to enumerate the newly presented storage devices.

Challenges of traditional storage provisioning

The activities required when provisioning storage in SANs were introduced in the previous section. Here they are discussed in further detail, focusing on the management, performance, and operational aspects of storage provisioning.

How much storage to provide

A major challenge facing storage administrators is provisioning storage for new applications. Administrators typically allocate space based on anticipated future growth of applications. This is done to mitigate recurring operational tasks, such as incrementally increasing storage allocations or adding discrete blocks of storage as existing space is consumed. Often, this approach results in more physical storage being allocated to the application than is needed for a significant amount of time, and at a higher cost than necessary. This overprovisioning of physical storage also leads to increased power, cooling, and floor space requirements. Even with the most careful planning, it may be necessary to provision additional storage beyond what was originally planned, which could potentially require an application outage. A second layer of storage overprovisioning happens when a database administrator overallocates storage for a table space. The operating system sees the space as completely allocated, but internally only a fraction of the allocated space might be used.
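The compounding effect of these two layers of overprovisioning can be made concrete with a little arithmetic. The figures below are illustrative assumptions, not measurements from the source.

```python
# Illustrative arithmetic (example numbers are assumptions): how two layers
# of overprovisioning compound, so only a fraction of purchased disk holds
# real data.

provisioned_gb = 1000   # physical storage allocated to the host up front
fs_allocated_gb = 600   # space the DBA pre-allocates for table spaces
data_written_gb = 150   # rows actually stored inside those table spaces

host_utilization = fs_allocated_gb / provisioned_gb         # layer 1
tablespace_utilization = data_written_gb / fs_allocated_gb  # layer 2
effective = data_written_gb / provisioned_gb

print(f"host-level utilization:     {host_utilization:.0%}")        # 60%
print(f"table-space utilization:    {tablespace_utilization:.0%}")  # 25%
print(f"effective disk utilization: {effective:.0%}")               # 15%
```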
The penalty of under-provisioning is so large in terms of management and people cost that it is typical for requests to demand far more storage than will actually be needed, as a way to minimize the number of times administrative action is required to provision additional storage.

How to add storage to growing applications

Beyond the steps of the initial allocation are considerations for adding storage to running applications. Such things as host-level striping have to be taken into account. How can storage be added to a logical volume that is already striped? If multiple files within a SQL Server file group are being used, how will the addition of new space impact the tables and indexes that exist on those storage devices? Does the existing data in the file group need to be rebalanced? Notwithstanding the technical considerations, the management and procedural aspects also have to be understood. How long does it take to get the additional storage? Will the amount added be sufficient? What is the growth rate of the application?

How to balance usage of the LUNs

One of the biggest problems facing application deployments is skewed access to storage. It is not infrequent to find configurations where 20 percent of the disks are performing 80 percent of the workload. While products such as Symmetrix Optimizer can balance the workload of individual hypervolumes on the physical spindles, their granularity is at the Symmetrix device level. If an individual LUN is receiving a significant workload, Symmetrix Optimizer is not able to redistribute that workload across additional spindles. The cost in terms of downtime and effort to redistribute the activity on that LUN from the host side is high, so it is desirable to avoid this situation if at all possible.
How to configure for performance

Beyond simple storage allocation sizing, application deployments bring with them a need to address the anticipated I/O workload, raising issues such as the specific performance requirements of the application and how many IOPS or how much throughput may be needed. Administrators may also be concerned about how, in an enterprise-class array running multiple different applications, the appropriate service levels can be delivered. Again, a best-guess approach is probably most common, and because of the advanced features of the Symmetrix, such as cache management and prefetch, most configurations perform well. However, relying on features such as cache management and prefetch to mitigate poor storage allocation processes should be avoided. Such mechanisms can only mask poor storage design to a point; when workloads exceed their ability to compensate for inefficient storage allocation, performance will be adversely affected.

The use of PowerPath for front-end load balancing helps eliminate bottlenecks at the Fibre Channel adapters in the Symmetrix, but front-end load balancing will not resolve imbalances in activity on the back end. For some, the SAME (stripe and mirror everywhere) principle might be optimal. Certainly for OLTP databases with close to 100 percent random read and write characteristics, this approach stands a good chance of working. Sizing for storage allocation and sizing for I/O demands need to occur at the same time; provisioning on the basis of only one of these attributes is likely to have adverse effects on the other.
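Sizing for capacity and for I/O at the same time reduces to a simple rule: compute the spindle count each demand requires and take the larger. All per-disk figures below are assumed rule-of-thumb values for illustration, not vendor specifications.

```python
# Illustrative sizing sketch (all figures are assumptions): spindle count
# must satisfy BOTH the capacity demand and the IOPS demand - take the max.

import math

required_gb = 4000
required_iops = 6000
disk_gb = 300      # assumed usable capacity per spindle
disk_iops = 180    # assumed sustainable random IOPS per spindle

for_capacity = math.ceil(required_gb / disk_gb)   # 14 spindles
for_iops = math.ceil(required_iops / disk_iops)   # 34 spindles

spindles = max(for_capacity, for_iops)
print(f"capacity alone: {for_capacity}, IOPS alone: {for_iops}, "
      f"required: {spindles}")
```

Here capacity alone would suggest 14 disks, but the workload needs 34; sizing on capacity only would leave the back end more than twice as busy as it can sustain.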
Virtual Provisioning

Symmetrix DMX-3, DMX-4, and VMAX storage arrays continue to provide increased storage utilization and optimization, enhanced capabilities, greater interoperability and security, and multiple ease-of-use improvements. One feature introduced with Enginuity release 5773 that provides increased storage utilization and optimization is Symmetrix Virtual Provisioning. Virtual Provisioning, generally known in the industry as "thin provisioning," enables organizations to improve ease of use, enhance performance, and increase capacity utilization for certain applications and workloads. EMC Virtual Provisioning can address both forms of overprovisioning previously discussed by allowing more storage to be presented to an application than is physically available. More importantly, Virtual Provisioning consumes physical storage only when the storage is actually written to. This allows more flexibility in predicting future growth, reduces the initial costs of provisioning storage to an application, and can obviate the inherent waste in overallocation of space and the administrative management of storage allocations. Additionally, Virtual Provisioning introduces a storage allocation implementation that allows a broad distribution of workloads across a large pool of physical disk resources. Such implementations can ensure that workloads do not create hot spots on a smaller pool of devices, and that all thin devices are able to benefit from the cumulative performance capacity of the storage pool.

Terminology

The introduction of Virtual Provisioning brings with it new terminology to describe the components, features, and processes that enable it.
Table 9 describes the components of Virtual Provisioning:

Table 9  Virtual Provisioning terminology

Thin device — A host-accessible device that has no storage directly associated with it.
Data device — An internal device that provides storage capacity to be used by thin devices.
Extent mapping — Specifies the relationship between thin device and data device extents. The extent sizes of a thin device and a data device do not need to be the same.
Thin pool — A collection of data devices that provide storage capacity for thin devices.
Thin pool capacity — The sum of the capacities of the member data devices.
Bind — The process by which one or more thin devices are associated with a thin pool.
Unbind — The process by which a thin device is disassociated from a given thin pool. When unbound, all previous extent allocations from the data devices are erased and returned for reuse.
Enabled data device — A data device belonging to a thin pool on which extents can be allocated for thin devices bound to that thin pool.
Disabled data device — A data device belonging to a thin pool from which capacity cannot be allocated for thin devices. This state is under user control. If a data device has existing extent allocations when a disable operation is executed against it, the extents are relocated to other enabled data devices with available free space within the thin pool.
Thin pool enabled capacity — The sum of the capacities of the enabled data devices belonging to a thin pool.
Thin pool allocated capacity — A subset of thin pool enabled capacity that has been allocated for the exclusive use of all thin devices bound to that thin pool.
Thin device pre-allocated capacity — The initial amount of capacity that is allocated when a thin device is bound to a thin pool. This property is under user control.
Table 9  Virtual Provisioning terminology (continued)

Thin device minimum pre-allocated capacity — The minimum amount of capacity that is pre-allocated to a thin device when it is bound to a thin pool. This property is not under user control.
Thin device written capacity — The capacity on a thin device that has been written to by a host. In most implementations this is a subset of the thin device allocated capacity.
Thin device subscribed capacity — The total capacity that a thin device is entitled to withdraw from a thin pool, which may be equal to or less than the thin device capacity.
Thin device allocation limit — The capacity limit that a thin device is entitled to withdraw from a thin pool, which may be equal to or less than the thin device subscribed capacity.

Thin devices

Symmetrix thin devices are logical devices that can be used in many of the same ways that Symmetrix devices have traditionally been used. They are provisioned in the same way as traditional devices: by mapping them to a front-end Symmetrix director and port, and then LUN-masking them to the WWN of an HBA in the server requiring the storage. Unlike traditional Symmetrix devices, thin devices do not need to have physical storage completely allocated at the time the device is created and presented to a host. Symmetrix thin devices are in fact cache-based devices, and may be considered a level of storage indirection (a collection of pointers). When storage for a bound thin device is required, the physical storage that supplies disk space to thin devices comes from a shared storage pool called a thin pool. At the point where an allocation occurs, a pointer is updated for the thin device, pointing to the specific location within the thin pool where the data physically resides. This allocation is referred to as a thin extent.
Thin devices do not themselves implement a RAID protection level, although they can be, and often are, implemented as metavolumes. The RAID protection scheme for a thin device is inherited from the thin pool to which it has been bound. A thin device can only be bound to one thin pool at any given time.

Thin pool

A thin pool is a Symmetrix construct that provides the underlying storage that supports thin devices. Thin pools are created by the end user and not by EMC Customer Engineers. The user can define up to 255 thin pools. Thin pool storage is made up of data devices.

Data devices

Data devices are specific devices in the Symmetrix that are not addressable by a host and that provide capacity for a thin pool. One or more data devices can be associated with a thin pool. All devices within a given thin pool must have a common RAID protection scheme and must come from the same type of storage technology. For example, a single thin pool cannot contain data devices created from traditional Fibre Channel drives and SATA drives. When data devices are added to a thin pool they can be in an enabled or disabled state. For a data device to be used for thin extent allocation, it needs to be in the enabled state; for it to be removed from the thin pool, it needs to be in the disabled state.

I/O activity to a thin device

When a write is performed to a part of the thin device for which physical storage has not yet been allocated, the Symmetrix allocates physical storage from the thin pool for that portion of the thin device only. The Symmetrix operating environment, Enginuity, satisfies the requirement by providing a block of storage from the thin pool called a thin device extent. This approach reduces the amount of storage that is actually consumed.
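The allocate-on-first-write behavior can be sketched as a toy simulation. This is an illustrative model only, not Enginuity internals; the class and its methods are hypothetical, and only the 768 KB extent size comes from this chapter.

```python
# Illustrative simulation (not Enginuity internals): a thin device allocates
# a pool extent only on the first write to an extent-sized region; reads of
# unallocated regions return zeros without a back-end read.

EXTENT_KB = 768  # thin extent size documented later in this chapter

class ThinDevice:
    def __init__(self):
        self.extents = {}  # extent index -> bytearray backed by the pool

    def write(self, offset_kb: int, data: bytes):
        idx = offset_kb // EXTENT_KB
        if idx not in self.extents:            # first touch: allocate extent
            self.extents[idx] = bytearray(EXTENT_KB * 1024)
        start = (offset_kb % EXTENT_KB) * 1024
        self.extents[idx][start:start + len(data)] = data

    def read(self, offset_kb: int, length: int) -> bytes:
        idx = offset_kb // EXTENT_KB
        if idx not in self.extents:            # unallocated: zeros returned
            return bytes(length)
        start = (offset_kb % EXTENT_KB) * 1024
        return bytes(self.extents[idx][start:start + length])

dev = ThinDevice()
dev.write(0, b"page")        # allocates extent 0 only
dev.write(100, b"page")      # same extent: no new allocation
print(len(dev.extents))      # 1 extent consumed, however large the device
print(dev.read(900_000, 4))  # zeros from an unallocated region
```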
The minimum amount of physical storage that can be reserved at a time for the dedicated use of a thin device is referred to as the thin device extent. The entire thin device extent is physically allocated to the thin device at the time the thin storage allocation is made, from any one of the enabled data devices in the associated thin pool. Note that the thin extent represents a contiguous set of blocks of the LUN itself as seen by the host. Subsequent writes to logical block addresses (LBAs) within the allocated range will not require a new extent allocation; only writes to LBAs that are currently unallocated result in a net new allocation. When a read is performed on a thin device, the data being read is retrieved from the appropriate data device in the thin pool to which the thin device is associated. If for some reason a read is performed against an unallocated portion of the thin device, zeros are returned to the reading process and no back-end read is generated. When more data device storage is required to service existing or future thin devices, for example when a thin pool is approaching full storage allocation, data devices can be added to existing thin pools dynamically without a system outage. New thin devices can also be created and associated with existing thin pools. Figure 29 depicts the relationships between thin devices and their associated thin pools: nine devices are associated with thin pool A and three thin devices with thin pool B.

Figure 29  Virtual Provisioning component relationships

The way thin extents are allocated across the data devices results in a form of striping across all enabled data devices within the thin pool.
The more data devices that are in the thin pool, the wider the striping and the greater the number of devices that can participate in application I/O. The thin extent size is currently 768 KB; this may change in future versions of the Enginuity operating environment. The maximum size of a thin device is the same as the maximum hypervolume size, currently around 64 GB for a DMX array and around 240 GB for a VMAX array. If a LUN larger than the maximum hypervolume size is needed, a metavolume comprised of thin devices can be created. It is recommended that the metavolume be concatenated rather than striped, since the thin pool is already striped using thin extents.

Virtual Provisioning requirements

Virtual Provisioning requires Enginuity code level 5773 or later. To create and manage thin pools and devices, Solutions Enabler 6.5.0 or later is required. For Symmetrix VMAX deployments, Enginuity code level 5874 or later and Solutions Enabler 7.0 or later are required. If Symmetrix Management Console (SMC) is being used to manage Virtual Provisioning components, version 6.1 or later is needed. Thin pools can only be created by the end user and cannot be created during the bin file (Symmetrix configuration) creation process.

Windows NTFS considerations

Host file system and volume management routines behave in subtly different ways. These behaviors are usually invisible to end users, but the introduction of thin devices puts them in a new perspective. For example, a Windows file system might show as empty when viewed from within Windows Explorer, yet many blocks may have been written to the thin device in support of file system metadata such as the Master File Table (MFT).
From a Virtual Provisioning point of view, this thin device could be using a substantial number of thin extents even though the space shows as available to the operating system and logical volume manager. While this kind of pre-formatting activity diminishes the value of Virtual Provisioning in one area, it does not negate the other values that Virtual Provisioning offers. For Windows Server deployments that wish to implement the space-saving aspects of thin devices, administrators should avoid those functions that may cause full device allocations to occur. The first, and most aggressive, of these is the formatting of NTFS volumes when they are created. It is highly recommended that administrators ensure that the "Quick Format" option is always selected when creating and formatting NTFS volumes in Windows Server environments. Figure 30 demonstrates the use of the FORMAT command line utility with the "/Q" parameter to execute a quick format operation.

Figure 30  Windows NTFS volume format with the Quick Format option

The resulting extent allocations of three thin devices after a Windows NTFS volume format are shown in Figure 31. The three devices are targeted for a SQL Server database instance deployment to demonstrate the operation of EMC Virtual Provisioning. During the format phase, the NTFS Quick Format option was used for the Data1 (0432), Data2 (0435), and Log (0438) volumes. Data1 and Data2 are thin devices of the same size, and the Log device is physically smaller. When comparing the allocations of Data1 and Data2 it is clear that both thin devices have the same number of Pool Allocated Tracks and Pool Written Tracks. In all cases, the actual allocated storage from the data devices is smaller than the space represented by the thin devices themselves.
Figure 31  Thin pool display after NTFS format operations

In addition to Windows NTFS behaviors, Microsoft SQL Server database files and transaction logs can behave differently when laid down on virtually provisioned devices. Specifically, to obtain the benefits of a thinly provisioned storage allocation for database files, administrators should ensure that their configuration allows for the use of SQL Server Instant File Initialization. Instant File Initialization is documented in the Microsoft SQL Server Books Online documentation. The functionality is enabled when the "Perform volume maintenance tasks" security policy is granted to the account used to run the SQL Server instance. By default this permission is assigned to the local Administrators group on Windows Server installations.

SQL Server components on thin devices

There are certain considerations that a DBA must be aware of when deploying a Microsoft SQL Server database using Virtual Provisioning. How the various components work in this kind of configuration depends on the component itself, its usage, and sometimes the underlying host data structure supporting the component. This section describes how the various SQL Server database components interact with Virtual Provisioning devices.

SQL Server data files

To deploy a SQL Server database in a given environment, DBAs will typically execute a statement of the form shown in Figure 32. In this example a database create operation is executed to create the various files using mount point locations that reside on thin devices. The sample T-SQL CREATE DATABASE statement for the myThinDB database does not require any special options when using thin devices. In this environment, Instant File Initialization was utilized.
Figure 32  Creation of a SQL Server database

The example T-SQL statement results in the creation of four physical files. The first is the primary file group, which in this instance is configured with only 8 MB of space. Two data files, each of 15 GB, are also defined. The data files myThinDB_Data1.ndf and myThinDB_Data2.ndf are located on the two mount points previously defined. Finally, the transaction log is defined to be 10 GB in size and is located on the mount point prepared for the transaction log. After the execution of the CREATE DATABASE statement, a display of the thin pool devices and associated allocations is shown in Figure 33. Again, Symmetrix devices 0432 and 0435 shown in the Thin Devices section represent the data file locations for myThinDB_Data1.ndf and myThinDB_Data2.ndf, and Symmetrix device 0438 is the location of the transaction log file myThinDB_Log.ldf.

Figure 33  Thin pool display after database creation

It is noticeable that the Pool Allocated Tracks and Pool Written Tracks counters for the transaction log device 0438 are substantially higher than those for the data file locations. This is a result of Microsoft SQL Server writing every page of the defined transaction log, which is 10 GB in size. It is clear, therefore, that the data files, which are defined to be 15 GB each, are not fully allocated, and that SQL Server 2005 Instant File Initialization is recommended for Symmetrix Virtual Provisioning. This functionality allows customers to fully leverage the benefits offered by the technology.
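The gap between the log and data allocations above can be approximated with extent arithmetic. Only the 768 KB extent size and the 10 GB log size come from the text; the amount of data-file metadata actually written at creation is an assumption for illustration.

```python
# Illustrative arithmetic (assumes the documented 768 KB thin extent): why a
# 10 GB transaction log is fully allocated at creation, while Instant File
# Initialization leaves 15 GB data files nearly unallocated.

import math

EXTENT_KB = 768

def extents_for(written_mb: float) -> int:
    """Thin extents consumed by a given amount of written data."""
    return math.ceil(written_mb * 1024 / EXTENT_KB)

log_extents = extents_for(10 * 1024)   # SQL Server zeroes every log page
data_written_mb = 8                    # assumed: only metadata pages touched
data_extents = extents_for(data_written_mb)

print(f"10 GB log:  {log_extents} extents fully allocated")   # 13654
print(f"15 GB data: {data_extents} extents allocated (IFI)")  # 11
```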
When using Microsoft SQL Server Management Studio to view the resulting database, as shown in Figure 34, note that SQL Server is unaffected by the use of thin devices and shows full allocation of the files as defined in the CREATE DATABASE statement.

Figure 34  SQL Server Management Studio view of the database

Transaction log files

Active log files are always fully allocated and zeroed when the transaction log for a given database is created, or when it is extended by a grow operation. Every block of the log files is written at this time, so the log files become fully provisioned when they are initialized and will not cause any thin extent allocations afterward. One thing to remember is that the log files become striped across all the devices in the thin pool, so no single physical disk will incur the overhead of all the writes that come to the logs. If the standards for a particular site require that logs be separated from the data, this can be achieved by creating a second thin pool dedicated to the active log files.

TempDB database

The usage cycle of Microsoft SQL Server TEMPDB storage deployed on virtually provisioned storage is interesting to follow and understand. Since TEMPDB is simply another SQL Server database, the rules for data files and log files apply: the transaction log files will be fully allocated at creation, and data files will be allocated on an as-needed basis. When temporary space is required from TEMPDB for the first time, thin extents are allocated out of the associated thin pool. However, when the temporary space is released by the database, the thin extents remain allocated.
This behavior is consistent across restarts of the Microsoft SQL Server instance, as extent allocations from thin pools persist for the life of the thin devices. The fact that TEMPDB may be reinitialized when the SQL Server instance is restarted does not affect thin extent allocations. The total thin extent allocation will be based on the maximum TEMPDB utilization, but will never exceed the maximum size of TEMPDB itself (assuming maximum size parameters have been set for the data files and the transaction log).

SQL Server AUTOGROW and thin devices

Microsoft SQL Server supports the concept of data file and transaction log auto-growth. This functionality is defined at the time of creation of the database, or may be modified during the life of the database using the ALTER DATABASE T-SQL statement. In general, the Auto-Grow feature of SQL Server is independent of the implementation of Virtual Provisioning: Auto-Grow behaves in exactly the same manner in a thinly provisioned environment as it does in a traditional, fully allocated environment. Microsoft SQL Server and EMC guidance is to avoid depending on Auto-Grow, especially in larger environments, where proactive administrator management of file growth is highly recommended. While Auto-Grow helps databases avoid an out-of-space condition when there is available space in the volume where a data file is located, it does not grow all data files of a file group at the same time. Microsoft SQL Server uses a proportional fill mechanism for allocating space for a table or index across a given file group. If one of the data files executes an Auto-Grow operation, it will be seen to have proportionally more free space, and will therefore have more table extent allocations made from it. This in turn may skew workloads.
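The skew that proportional fill produces after a lone Auto-Grow can be demonstrated with a small simulation. This is an illustrative model of the described behavior, not the SQL Server engine; the greedy most-free-space policy and all sizes are assumptions.

```python
# Illustrative simulation (not the SQL Server engine): proportional fill
# favors files with more free space, so a file that auto-grows alone
# attracts a disproportionate share of new allocations.

def allocate(free_per_file, n_extents):
    """Greedily place each new extent in the file with the most free space."""
    counts = [0] * len(free_per_file)
    free = list(free_per_file)
    for _ in range(n_extents):
        target = max(range(len(free)), key=lambda i: free[i])
        counts[target] += 1
        free[target] -= 1
    return counts

# Two evenly filled data files: new allocations stay balanced.
print(allocate([50, 50], 20))    # [10, 10]
# File 2 auto-grew by 100 extents: it now absorbs all new allocations.
print(allocate([50, 150], 20))   # [0, 20]
```

This is why the guidance that follows recommends growing all files of a file group together: equal free space keeps the fill, and therefore the I/O, evenly distributed.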
To ensure even distribution of allocations and workload patterns in more complex database environments that implement multiple data files within a filegroup, it is highly recommended to manually extend all the data files within a filegroup at the same time. This ensures that the proportional fill mechanism continues to distribute data evenly across all data files.

Approaches for replication

Organizations can perform "thin-to-thin" replication with Symmetrix thin devices by using standard TimeFinder, SRDF, and Open Replicator operations. The current release of Virtual Provisioning does not support all replication modes; refer to the release notes for further details. In addition, thin devices can be used as control devices for hot and cold pull and cold push Open Replicator copy operations. If a push operation is done using a thin device as the source, zeros will be sent for any regions of the thin device that have not been allocated, or that have been allocated but not written to. Open Replicator can also be used to copy data from a standard device to a thin device. If a pull or push operation is initiated from a standard device that targets a thin device, then a portion of the target thin device, equal in size to the reported size of the source volume, will become allocated.

Performance considerations

The architecture of Virtual Provisioning creates a naturally striped environment where the thin extents are allocated across all volumes in the assigned storage pool. The larger the storage pool for the allocations, the greater the number of devices that can be leveraged for Microsoft SQL Server database I/O. One possible consequence of using a large pool of data devices, and potentially sharing these devices with other applications, is variability in performance.
In other words, possible contention with other applications for physical disk resources may cause inconsistent performance levels. If this variability is not desirable for a particular application, that application can be dedicated to its own thin pool. The Symmetrix supports up to 512 thin pools.

When a new data extent is required from the thin pool, there is additional latency introduced to the write while the thin extent location is assigned and formatted. This latency is approximately 1 millisecond, and is incurred only when a new thin extent allocation is required. Thus, for a sequential write stream, a new extent allocation will occur on the first new allocation, and again when the current stripe has been fully written and the writes move to a new extent. If the application cannot tolerate this occasional additional latency, it is recommended to preallocate storage to the thin device when the thin device is bound to the thin pool.

Thin pool management

When storage is provisioned from a thin pool to support multiple thin devices, there is usually more "virtual" storage provisioned to hosts than is supported by the underlying data devices. This oversubscription is one of the main reasons for using Virtual Provisioning. However, there is a possibility that applications using a thin pool may grow rapidly and request more storage capacity from the thin pool than is actually available, exhausting the pool. This is an undesirable situation, and the next section discusses the steps necessary to avoid it.

Thin pool monitoring

Virtual Provisioning comes with several methodologies to monitor the capacity consumption of thin pools. Solutions Enabler provides the symcfg monitor command. There are also event thresholds that can be monitored through the SYMAPI event daemon.
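The kind of threshold evaluation this monitoring performs can be sketched as follows. The warning levels and pool figures below are hypothetical, site-specific assumptions; real utilization numbers would come from Solutions Enabler (for example, symcfg) or the SYMAPI event daemon rather than from this code.

```python
# Sketch of thin-pool capacity alerting logic. Threshold levels and
# pool sizes are hypothetical assumptions for illustration only.

WARN_PCT, CRITICAL_PCT = 65, 80   # assumed alerting thresholds

def pool_status(enabled_gb, allocated_gb):
    """Classify a thin pool by percentage of enabled capacity allocated."""
    used_pct = 100.0 * allocated_gb / enabled_gb
    if used_pct >= CRITICAL_PCT:
        return "CRITICAL", used_pct   # add data devices immediately
    if used_pct >= WARN_PCT:
        return "WARNING", used_pct    # plan to grow the pool
    return "OK", used_pct

print(pool_status(enabled_gb=2048, allocated_gb=1024))
print(pool_status(enabled_gb=2048, allocated_gb=1740))
```

Whatever the exact thresholds, the point is to act on the warning level well before the pool approaches full, for the reasons discussed below.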
Thresholds can be set with Symmetrix Management Console, and SNMP traps can be sent for monitoring using EMC Ionix ControlCenter or any data center management product. As thresholds for thin pool storage are met or exceeded, events are raised by the storage array. In a configuration where the SYMAPI event daemon is implemented on any system with appropriate connectivity to the Symmetrix DMX array, the event daemon propagates events to the configured location. This location can include one or more of the following: the Windows event log, an SNMP target, or a defined log file. When configured to log events into the Windows application event log, events of the form shown in Figure 35 on page 143 will be posted when threshold values are met or exceeded.

Figure 35 Windows event log entry created by SYMAPI event daemon

System administrators and storage administrators must put processes in place to monitor the capacity of thin pools to make sure that they do not fill. The pools can be dynamically expanded to include more data devices without application impact. These devices should be added in groups, long before the thin pool approaches a full condition. If devices are added individually, hot spots can be created on the disks when much of the write activity is suddenly directed to the newly added disks because other disks in the pool are full.

Exhaustion of oversubscribed pools

If a thin pool is oversubscribed and has no available space for new extent allocations, attempts by SQL Server to write to a page that would require a new data extent to be allocated will result in continual re-driving of the I/O operation. Informational messages of the form shown in Figure 36 on page 144 will be posted to the Windows application event log.
Figure 36 SQL Server I/O request informational message

SQL Server will continue to buffer changed pages in memory until SQL Server memory is completely filled, or until the outstanding writes can be serviced. Storage administrators should take immediate action in the event that these errors are logged as a result of exhaustion of thin pool storage. In concert with the I/O delay message, the Windows System log may also report disk errors with event ID 11. These messages are the result of write requests to the thin device being rejected by the Symmetrix target, as new allocations are not possible when the pool is full.

In the event that additional data devices are added and enabled within the thin pool, Microsoft SQL Server will write all outstanding data pages to the data files, and will return to normal operating mode for the database. The resubmission of the outstanding write requests will succeed once additional storage space is available to the thin pool. Subsequently, all informational messages from SQL Server and disk event ID 11 messages will cease to be generated.

The database is protected at all times from any inconsistencies by the Write Ahead Logging (WAL) mechanism used by Microsoft SQL Server. As stated in "Transaction log files" on page 139, the transaction log will be fully allocated from any thin pool it may be bound to. As a result, the transaction log itself will not suffer from an overallocation issue. At all times, transaction log records will be written to the transaction log, ensuring that transactional consistency is maintained even in the event of an abnormal termination of the server.

Recommendations

EMC Symmetrix Virtual Provisioning provides a simple, noninvasive, and economical way to provide storage for Microsoft SQL Server databases.
Microsoft Windows Server 2003 and Microsoft Windows Server 2008, in concert with Microsoft SQL Server, provide the specific features required to obtain the maximum benefits of Virtual Provisioning. With Microsoft Windows "thin-friendly" NTFS formatting capabilities and Microsoft SQL Server's Instant File Initialization mechanism, customers are able to derive the maximum benefits of Virtual Provisioning. The following summarizes the optimal configurations of Virtual Provisioning and Microsoft SQL Server:

◆ Microsoft Windows Server 2003 or Windows Server 2008 with NTFS format storage allocation
◆ Microsoft SQL Server configurations with Instant File Initialization
◆ Systems where overallocation of storage is typical
◆ Systems where rapid growth is expected over time but downtime is limited

The following are situations where Virtual Provisioning may not be optimal:

◆ Systems where shared allocations from a common pool of thin devices are not desired
◆ Systems where data is deleted and re-created rapidly and the underlying data management structure does not allow for reuse of the deleted space
◆ Systems that cannot tolerate an occasional response time increase of approximately 1 millisecond, due to writes to uninitialized blocks

Fully Automated Storage Tiering

Customers are under increasing pressure to improve performance as well as the return on investment (ROI) for storage technologies, and ensuring that data is placed on the most appropriate Storage Type is a key requirement. It is common to find applications utilizing the services of the SQL Server environment having differing access patterns to different files, or even to different portions of a data file. This style of access will often direct a significant workload to a relatively small subset of the data stored within the database. Other data files, and the data they contain, may be less frequently accessed.
This phenomenon is referred to as data access skewing. Optimizing the placement of the various components of the database on the Storage Type that best matches their needs helps increase performance, reduce the overall number of drives, and improve the total cost of ownership (TCO) and ROI for storage deployments.

As companies increase deployment of multiple drive types and protection types in their high-end storage arrays, storage and database administrators face the challenge of selecting the correct storage configuration for each application. Often, a single Storage Type is selected for all the data in a given database, effectively placing both active and idle data portions on fast FC drives. This approach can be expensive and inefficient, because infrequently accessed data will reside unnecessarily on high-performance drives. Alternatively, making use of high-density, low-cost SATA drives for the less active data, FC drives for the moderately active data, and Enterprise Flash Drives (EFD) for the very active data enables efficient use of the storage resources, and reduces the overall cost and the number of drives necessary. This, in turn, also helps to reduce energy requirements and floor space, allowing the business to grow more rapidly.

This tiering can often be complicated by the fact that there may exist a skew within the data files, where only certain ranges of data or index information are heavily accessed. The remainder of the data or index information may be less frequently accessed. Administrators often cannot differentiate the data based on access patterns, and indeed, the access patterns may change over time.
To achieve this "tiered" approach with Microsoft SQL Server databases, administrators can use Symmetrix Enhanced Virtual LUN technology to move either entire logical devices, or extents from virtually provisioned devices, between drive types and RAID protections seamlessly inside the storage array. Symmetrix Virtual LUN technology does not require application downtime. It preserves the device IDs, which means there is no need to change file system mount points, volume manager settings, database file locations, or scripts. It also maintains any TimeFinder or SRDF business continuity operations even as the data migration takes place.

This manual and often time-consuming approach to Storage Tiering can be automated using Fully Automated Storage Tiering, or FAST. FAST uses policies to manage sets of logical devices and available Storage Types. Based on the policy guidance and the actual workload profile over time, the FAST controller will recommend, and can even execute automatically, the movement of the managed devices between the Storage Types.

Two forms of FAST deployment are available with Symmetrix VMAX storage systems. For fully provisioned storage, FAST DP provides a mechanism that manages and optimizes the movement of full LUN allocations between storage tiers. For implementations utilizing virtually provisioned devices, FAST VP provides a much more granular mechanism to migrate sub-LUN allocations (or extents) between the tiers defined by a FAST Policy.
Symmetrix VMAX tiered storage architecture can enhance customer deployments for SQL Server databases, providing "right-sizing" mechanisms that nondisruptively move database storage, using either Enhanced Virtual LUN or FAST technologies, in order to put the right data on the right storage at the right time.

Evolution of Storage Tiering

From a business perspective, "Storage Tiering" generally means that policies, coupled with storage resources having distinct performance, availability, and other characteristics, are used to meet the service level objective (SLO) for a given application. By SLO we mean a targeted I/O service goal, that is, performance for an application. This remains the case with FAST.

For administrators, the definition of Storage Tiering is evolving. Initially, different storage platforms met different SLOs. For example:

◆ Gold Tier - Symmetrix
◆ Silver Tier - CLARiiON®
◆ Bronze Tier - Tape

More recently, Storage Tiering has meant that multiple SLOs are achievable in the same array:

◆ Gold Tier - 15k rpm FC, RAID 1
◆ Silver Tier - 10k rpm FC, RAID 5
◆ Bronze Tier - 7.2k rpm SATA, RAID 6

FAST changes the categories further. Because multiple Storage Types can support the same application, "tier" is not used to describe a category of storage in the context of FAST.
Rather, EMC is using new terms:

◆ Storage Group - a logical grouping of volumes (often by application) for common management
◆ Storage Class - a combination of Storage Types and FAST Policies to meet service level objectives for Storage Groups
◆ FAST Policy - a policy that manages data placement and movement across Storage Types to achieve service levels for one or more Storage Groups
◆ Storage Type - a shared storage resource with common technologies, namely drive type and RAID scheme

For example, users may establish a Gold Storage Class as follows:

Table 10 Storage Class definition

Storage Class: Gold
Service Level Objective: read/write response time objective
FAST Policy (percentage per Storage Type):
  10% - 15k rpm, RAID 1
  40% - 10k rpm, RAID 5
  50% - 7.2k rpm SATA, RAID 6

A summary of the relationship between Storage Types, FAST Policies, and Storage Groups is provided in Figure 37 on page 149.

Figure 37 FAST relationships

In subsequent discussions, Storage Type will at times be referred to as the Symmetrix Tier, in order to map clearly to specific commands in tools such as SMC. "FAST Policies" and "Storage Groups" will retain the meanings defined above.

FAST implementation

Fully Automated Storage Tiering within the Symmetrix VMAX storage system is offered in two main forms, based on the style of underlying storage allocation. Fully Automated Storage Tiering for Disk Pools (FAST DP) refers to the FAST implementation used for non-thin deployments; storage allocation in the FAST DP model is traditional LUN provisioning from a disk group. Fully Automated Storage Tiering for Virtual Pools (FAST VP) utilizes the Virtual Provisioning storage model for LUNs. In this model, storage allocations for thin LUNs are made as host writes to the thin devices occur.
One significant distinction between the two models is the mechanism and granularity of the migrations employed by the two implementations. FAST DP migrations occur at the LUN level, requiring that an entire LUN defined within the FAST Policy be migrated as a single unit. FAST VP migrations between storage tiers occur at a sub-LUN allocation size. Discrete FAST VP migrations, as a result, can occur in a significantly more granular manner, providing both a higher effective utilization of the defined Storage Tiers and a much more responsive model.

Fully Automated Storage Tiering for Disk Pools (FAST DP) was introduced in the Enginuity 5874 Q4 service release. The solution was delivered as Symmetrix software that utilizes intelligent algorithms to continuously analyze device I/O activity and generate plans for moving and swapping logical devices for the purposes of allocating or re-allocating application data across different performance Storage Types within a Symmetrix array.

Enginuity 5875 for EMC Symmetrix VMAX introduced Fully Automated Storage Tiering for Virtual Pools (FAST VP). The FAST VP solution utilizes a functional model similar to that of FAST DP, employing the same concepts of Tiers and Policies. The mechanisms utilized by FAST VP provide a more granular, efficient, and dynamic model for storage tiering based on actual SQL Server workloads to individual tables and indexes within SQL Server data files.

In both forms, FAST proactively monitors workloads at the Symmetrix device level (LUN for FAST DP and sub-LUN for FAST VP) in order to identify "busy" devices and extents that would benefit from being moved to higher-performing pools, such as those provisioned from EFD. FAST will also identify less "busy" devices and extents that could be relocated to higher-capacity, more cost-effective storage such as SATA pools without altering performance.
Time windows can be defined to specify when FAST should collect performance statistics (upon which the analysis to determine the appropriate Storage Type for devices and extents is based), and when FAST should perform the changes necessary to move devices and extents between Storage Types as appropriate. Movement is based on user-defined Storage Types and FAST Policies and the style of FAST implementation in use.

The primary benefits of FAST include:

◆ Automating the process of identifying workloads that can benefit from EFD and/or that can be kept on higher-capacity, less-expensive pools without impacting performance
◆ Improving application performance at the same cost, or providing the same application performance at lower cost. Cost is defined as space, energy, acquisition, management, and operational expense
◆ Optimizing and prioritizing business applications, allowing customers to dynamically allocate resources within a single array
◆ Delivering greater flexibility in meeting different price/performance ratios throughout the lifecycle of the information stored

The management and operation of FAST can be conducted using either the Symmetrix Management Console (SMC) or the Solutions Enabler Command Line Interface (SYMCLI). Additionally, detailed performance trending, forecasting, alerts, and resource utilization are provided through Symmetrix Performance Analyzer (SPA). If so desired, EMC Ionix ControlCenter provides the capability for advanced reporting and analysis that can be used for chargeback and capacity planning.

Configuring FAST

FAST is configured by defining three distinct objects:

◆ Storage Group. A Storage Group is a logical grouping of up to 4,096 Symmetrix devices. Storage Groups are shared between FAST and Auto-provisioning Groups; however, a Symmetrix device may only belong to one Storage Group that is under FAST control.
◆ Storage Type.
Storage Types are a combination of a drive technology (for example, EFD, FC 15k rpm, or SATA) and a RAID protection type (for example, RAID 1, RAID 5 3+1, or RAID 6 6+2). There are two types of Storage Types: static and dynamic. A static type contains explicitly specified Symmetrix disk groups, while a dynamic type automatically contains all Symmetrix disk groups of the same drive technology. A Storage Type will contain at least one physical disk group from the Symmetrix, but can contain more than one. If more than one disk group is contained in a Storage Type, the disk groups must be of a single drive technology type.

◆ FAST Policy. FAST Policies associate a set of Storage Groups with up to three Storage Tiers, and include the maximum percentage of a Storage Group's volumes that can occupy each of the Storage Tiers. The percentages of storage specified for each tier in the policy, when aggregated, must total at least 100 percent. They may, however, total more than 100 percent. For example, if the Storage Groups associated with the policy are allowed 100 percent in any of the tiers, FAST can recommend that all the storage devices be placed together on any one tier (the capacity limit on the tiers is not enforced). In another example, to force the Storage Group onto one of the tiers, simply set the policy to 100 percent on that tier and 0 percent on all other tiers. At the time of association, a Storage Group may also be given a priority (between 1 and 3) with a policy. If a conflict arises between multiple active FAST Policies, the FAST Policy priority helps determine which policy takes precedence.

FAST can be configured to operate in a "set and forget" mode (Automatic) in which the system will continually gather statistics, analyze, and recommend and execute moves and swaps to maintain an optimal configuration based on policy.
This Automatic mode of operation is required for FAST VP deployments, but is optional for FAST DP deployments. FAST DP may optionally be set in a "user approval" mode (User Approved) in which all configuration change plans made by FAST must be approved before a FAST-suggested plan is executed.

Data movement

There are two methods by which data will be relocated to another Storage Type: move or swap (swap is only an option for FAST DP). For FAST DP, a device move occurs when unconfigured (free) space exists in the target tier. Only one device is involved in a move, and a DRV (a special Symmetrix device used for device swapping) is not required. Moves are performed by creating new devices in unconfigured space on the appropriate Storage Type, moving the data to the new devices, and deleting the old device. For FAST DP, a device swap occurs when there is no unconfigured space in the target type, and results in a corresponding device being moved out of the target Storage Type. In order to preserve data on both devices involved in the swap, a single DRV is used (the DRV should therefore be sized to fit the largest FAST-controlled devices).

For FAST VP deployments, an extent movement occurs when unconfigured thin pool space exists in the target tier. Movements occur at extent granularity. Similar to FAST DP movements, new extents are allocated from the thin pool, data contents are moved to the new extents, and the existing extents in the old Tier are deallocated. In practice, the movement of extents occurs as an extent group movement, where an extent group is comprised of multiple, contiguous thin device extents. The extent group size for VMAX, for the purpose of data movement, is 7,680 KB.

Moves and swaps, as appropriate, are completely transparent to the host and applications and can be performed nondisruptively, without affecting business continuity and data availability.
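A back-of-the-envelope sketch using the 7,680 KB extent-group size stated above shows the granularity FAST VP works at. The 120 GB device size is an assumption for illustration (it matches the LUN size used in the FAST DP test configuration later in this chapter).

```python
import math

# FAST VP data-movement granularity on VMAX, per the text above.
EXTENT_GROUP_KB = 7680

def extent_groups(device_gb):
    """Extent groups needed to cover a fully allocated thin device."""
    return math.ceil(device_gb * 1024 * 1024 / EXTENT_GROUP_KB)

# A hypothetical 120 GB thin device spans 16,384 extent groups, each of
# which FAST VP can place on a different tier independently.
print(extent_groups(120))   # 16384
```

Compared with FAST DP, which must move the whole 120 GB LUN as one unit, FAST VP has thousands of independently placeable units per device, which is the basis of its efficiency claim.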
With FAST DP, Symmetrix metavolumes are moved as a complete entity; therefore, metavolume members cannot exist in different physical disk groups. As FAST VP works at a sub-LUN level, a single thin metavolume may have allocations from all applicable Tiers defined within the Policy concurrently.

FAST optimizes application performance in Symmetrix VMAX arrays that contain drives of different technologies. It is expected that customers will have their arrays configured with Enterprise Flash, Fibre Channel, and/or SATA drives, resulting in storage tiers with different service level objectives. Rather than leave applications and data statically configured to reside on the same Storage Type, FAST allows customers to establish the definitions and parameters necessary for automating data movement from one type to another according to current data usage.

Skewed LUN access

Skewed LUN access is a phenomenon that is present in most databases. Skewing occurs when a small number of SQL Server LUNs receive a large percentage of the I/O from the applications. This skewed activity can be transient, that is to say, lasting only a short period of time, or persistent, lasting much longer periods of time. Skewed, persistent access to volumes makes those volumes good candidates to place on a higher-performing Storage Type, if they are not already on the highest-performing type. Skewing by its nature means that there are SQL Server LUNs that are receiving lower activity than others. If these volumes are persistently receiving fewer requests than others, they may be good candidates to move to a lower-performing Storage Type.

Microsoft SQL Server database environments are created by defining one or more filegroups (the PRIMARY filegroup will always exist). Each filegroup is subsequently defined to exist on one or more data files located on NTFS volumes.
Due to the proportional fill mechanism used by SQL Server, data files within a given filegroup will generate almost identical I/O patterns. This highly correlated workload also leads to contention, and is the primary motivation for the best practice recommendation to distribute these files across volumes and LUNs. It is commonplace, however, to have storage allocations for these multiple LUNs provided from within the same storage pool. Thus, while the workload is distributed across different LUNs and NTFS volumes at the host level, the workload is generated against the same physical spindles defined within the storage pool.

Skewed data access

Skewed data access is a phenomenon that is present in most user storage allocations, irrespective of the application or database system in use. Data access skewing occurs when a specific portion of a storage allocation made to an application, such as a SQL Server data file, receives a large percentage of the I/O from the applications. This may occur when a specific range of data within a data file is accessed more frequently than other data ranges. Data access skews and LUN access skews invariably occur together.

Certain data files may be accessed more frequently based on the tables for which they provide the storage allocation. In a SQL Server context, multiple LUNs will invariably contain data files that may belong to a single filegroup. Workloads to different filegroups will differ based on the tables and indexes that they contain. Within a given table or index, access patterns may differ based on logical aggregations of the data itself. For example, time-based or geographical dimensions on the data may cause certain ranges of data within a data file to be accessed more frequently. As with access patterns to LUNs, this skewed activity can be transient, and in general is typically expected to change over time.
Skewed, persistent access to specific data ranges makes the extents supporting those data ranges good candidates to place on a higher-performing Storage Type. While FAST DP allows for the migration of full LUNs between Storage Types, FAST VP provides the ability to move sub-LUN extents between Storage Tiers. This provides a significantly more efficient and effective use of the Storage Tiers defined within the Symmetrix VMAX array.

Deploying FAST DP with SQL Server databases

To validate the mechanisms and show the efficacy of the FAST DP solution for Microsoft SQL Server environments, a test environment was defined to simulate a user environment. A moderate to high user workload was generated against a SQL Server 2008 environment. Storage allocations were initially provided in a manner consistent with current provisioning practices deployed within the SQL Server user community. Storage Classes were defined within the environment, and a FAST Policy was defined for the SQL Server database. FAST monitoring was implemented, and user approval mode was utilized to identify the actions the FAST controller determined. These suggested migrations were subsequently approved, and the system was monitored for the resulting performance changes.

For all testing, Microsoft Windows Server 2008 R2 and Microsoft SQL Server 2008 SP1 were utilized. The primary user database was implemented to be comprised of several filegroups, where each filegroup mapped to one or more files located over eight LUNs, as shown in Figure 38 on page 156. Using this information, it is possible to map the locations of the SQL Server logical structures, such as tables and indexes, to physical locations. Given the proportional fill algorithm used by SQL Server, each logical entity, such as the Broker filegroup, will have allocations from each of the constituent files.
Subsequently, all I/O activity generated against tables and indexes defined within the Broker filegroup will be serviced by the physical files. As a result, this I/O will be directed at the LUNs that provide the storage for the files.

Figure 38 Overview of relationships of filegroups, files, and physical drives

Each LUN was defined as approximately 120 GB in size. Each LUN contained a single NTFS volume, and each NTFS volume was mounted at a mount point located under the directory C:\SQLDB. Most volumes contained only a single data file, except DATA1, which contained the Misc_1.ndf file as well as the Broker_1.ndf file.

In addition to the listed storage allocation areas for the user database, other volumes were presented to the host and had SQL Server database files located on them. Specifically, volumes were added to support the TEMPDB data and transaction log files. These LUNs were included within the Storage Group for the host, although due to the nature of the workload, they were not actively used in the testing conducted. It is anticipated that TEMPDB usage may be an important aspect for many customer configurations, and the functionality demonstrated for the user database described here will apply to the TEMPDB environment.

The simulated environment was that of a TPC-E-like environment. As such, it represents the workload anticipated of an online brokerage environment. Simulated users perform a range of processing tasks, such as stock transactions and look-ups, as well as the generation of larger reports. The workload generated a significant read/write requirement against the various portions of the database. The Windows Perfmon instrumentation was set to a 15-second interval, and the results before, during, and after the volume relocation were collected and analyzed.
Additionally, the SQL Server Performance Warehouse was utilized to collect instrumentation from the environment. The SQL Server Performance Warehouse is a management feature available within the SQL Server 2008 environment. In addition to providing drive-based statistics in a similar manner to Windows Performance Monitor, it also correlates internal SQL Server metrics that can assist in determining SQL-specific attributes. The results are presented in the following sections.

Each metavolume presented to the host as a LUN was created as a striped configuration using four members, whereby each member hypervolume was 30 GB in size. Hypervolumes were of a RAID 5 (3+1) configuration; thus each hypervolume was located on a set of four physical spindles. All hypervolumes, and subsequently all metavolumes, were created in a single disk group that contained 96 drives. These drives were 450 GB 15k rpm. As each hypervolume was defined across four physical spindles, and each metavolume was configured from four hypervolumes, each metavolume was potentially located over 16 physical spindles. With nine such metavolumes (eight for data, and one for the transaction log), this would have effectively required 144 spindles. However, the disk group was constrained to a set of 96 drives, and thus there was sharing of spindles.

In addition to the single database environment, an additional constant workload was generated against the 96 physical spindles. IOMETER was used to generate a moderate (22,000 IOPS at or above a 70 percent read hit) but constant read workload on the physical spindles. The IOMETER workload resulted in an average of 68 read requests per spindle. The IOMETER workload was executed throughout the testing and is thus a constant in the configuration, representing the other user workloads that are expected in a customer environment.
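The drive-count arithmetic above can be checked directly; all values below are taken from the test configuration as described, with no additional assumptions.

```python
# Reproduce the spindle arithmetic of the test configuration described
# above.
MEMBERS_PER_META = 4        # striped metavolume members, 30 GB each
SPINDLES_PER_HYPER = 4      # a RAID 5 (3+1) hypervolume spans 4 drives
METAVOLUMES = 9             # eight data LUNs plus one transaction-log LUN
DISK_GROUP_DRIVES = 96

spindles_per_meta = MEMBERS_PER_META * SPINDLES_PER_HYPER
potential_spindles = METAVOLUMES * spindles_per_meta

print(spindles_per_meta)    # 16 spindles per metavolume
print(potential_spindles)   # 144, versus the 96-drive disk group
```

Since the 144 potentially distinct spindles exceed the 96 drives actually in the disk group, spindle sharing between metavolumes is unavoidable, as the text notes.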
Configuring FAST for the environment

When initially configured, the storage allocations for all LUNs were made from a single pool of physical spindles with a single RAID protection scheme. This is synonymous with a Storage Type, as previously defined. The Symmetrix VMAX array contained a number of different technologies, including Enterprise Flash Drives (EFD), 450 GB 15k rpm drives, and 1 TB SATA drives. Using these differing technologies and available RAID levels, it is possible to construct the various Storage Types that may be applied. In the tested environment, multiple Storage Types were defined, as shown in Figure 39 on page 158. A Storage Type may differ in the technology implemented (the physical drive characteristics) or in the RAID protection scheme implemented. The tiers created in this configuration varied in both respects: the tier "FLASH" was defined as RAID 5 3+1 protection on EFD, FC_R53 was defined as a RAID 5 3+1 protection scheme on Fibre Channel 450 GB 15k rpm drives, and SATA_5 was defined as a RAID 5 7+1 protection scheme on 1 TB 5,400 rpm drives. Other entries also exist, but these three tiers were subsequently used to define a policy for the SQL Server environment.

Figure 39 Defining Storage Types within Symmetrix Management Console

The Storage Types were applied to create a policy named "SQL_PROD". This policy defined the applicable Storage Types and applied capacity percentages to the various types, referred to as tiers in SMC. The percentages define how much of the storage space used by the Storage Group defined in the policy can be utilized from each of the tiers.
Thus, the values shown in Figure 40 on page 159 indicate that when applied to a Storage Group, 90 percent of the storage allocation for the group may come from the FC_R53 Storage Type, 20 percent may come from the FLASH Storage Type, and 20 percent may be allocated from the SATA_R5 Storage Type. The total allocation in this instance is 130 percent, which allows the FAST policy to allocate storage from the most appropriate tier based on its statistical sampling.

Figure 40 Tier definition within Symmetrix Management Console

Associating a Storage Group with a defined policy binds the policy to the devices contained within the Storage Group. The Storage Group is defined when implementing Auto-provisioning Groups, and defines the storage devices that will be available to a host when bound to a view. For the tested workload, the Storage Group was named "SQL_FAST_PRD" and contained all storage devices that were presented to the SQL Server host. This Storage Group definition is typically created when initially provisioning storage to the SQL Server host during initial deployment. In Figure 41 on page 160 the Storage Group is displayed with its component devices.

Figure 41 Allocating a Storage Group to a policy in Symmetrix Management Console

To complete the implementation of the FAST mechanisms, it is necessary to set appropriate time periods for the policy engine to collect statistics for workload analysis. Statistics collection is defined within the Symmetrix Optimizer environment, and may also be set through Symmetrix Management Console, as shown in Figure 42 on page 161.

Figure 42 Symmetrix Optimizer collect and swap/move windows

It is possible to configure Symmetrix Optimizer to execute swaps automatically or to require user approval for swaps.
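As a rough sketch of what the SQL_PROD policy percentages imply, the per-tier capacity caps can be computed against the storage group size. The 1,080 GB group size is an inference from the nine roughly 120 GB LUNs described earlier, not a figure stated for the policy itself:

```python
# Per-tier capacity limits implied by the SQL_PROD policy percentages.
# The group size is an assumption: nine LUNs of roughly 120 GB each.

group_gb = 9 * 120   # 1,080 GB presented to the host
policy = {"FC_R53": 0.90, "FLASH": 0.20, "SATA_R5": 0.20}

limits = {tier: group_gb * pct for tier, pct in policy.items()}
for tier, cap in limits.items():
    print(f"{tier}: up to {cap:.0f} GB")    # FC_R53 972, FLASH 216, SATA_R5 216

# The percentages total 130 percent, giving FAST headroom in tier selection.
print(round(sum(policy.values()), 2))       # 1.3
```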
If automatic mode is set for Symmetrix Optimizer, the FAST-suggested movements will be executed, as Optimizer is used to effect the movement. In the tested configuration, Optimizer was left in user approval mode to show the planned device migrations. Having defined the Storage Group association to the Storage Types through the policy, and having scheduled the application time periods within Symmetrix Optimizer, the environment was configured appropriately. To validate the selection of devices and the planned movement, operating system and SQL Server monitoring were also configured.

Monitoring SQL Server workloads

Microsoft SQL Server database administrators and Windows Server administrators will typically utilize Windows performance counters to monitor overall system performance. The instrumentation provided by the Windows performance counters represents a valid performance profile of the behavior of the storage devices presented to the host. Additionally, Microsoft SQL Server provides a Performance Warehouse implementation that can be used to specifically monitor those portions of the environment utilized by a SQL Server database. The SQL Server Performance Warehouse can be a valuable tool for database administrators to identify specific performance and transaction counters.

The workload was executed for an extended period of time to ensure that it reached a steady state, and both SQL Server Performance Warehouse and Windows Performance Monitor counters were collected over the time interval, as shown in Figure 43 on page 162. This view represents a time period of 4 hours while the workload was executing.

Figure 43 SQL Server Performance Warehouse workload view

In the case of the selected workload, SQL Server waits, and specifically the "Buffer I/O" wait, were a significant contributor to the overall workload profile.
This wait state is typically incurred as a result of long read times from storage components. It is possible to drill into the specific waits for SQL Server by selecting the graph and expanding the view to examine specific I/O activity for the database environment. The resulting display for the test environment can be seen in Figure 44 on page 163, where the logical file devices are listed in order of workload and I/O latency. For the tested environment, the Broker4 and Broker5 logical files can be seen to have the greatest latencies, 8 ms and 6 ms respectively, at throughput rates of around 60 MB/s. By correlating the logical files with physical storage devices, it is possible to identify that Broker4 and Broker5 are located on NTFS volumes DATA4 and DATA5, which in turn are provisioned from metavolumes 119 and 11D.

Figure 44 SQL Server Performance Warehouse virtual file statistics

It is also reasonable to assume, given the initial configuration in which all LUNs are effectively provisioned from the same pool of physical drives, that the resulting latency differences may be attributed to contention at the drive level. The I/O wait and latency numbers displayed by SQL Server Performance Warehouse can easily be cross-referenced with Windows performance counters. Figure 45 on page 164 shows the reads-per-second workload for the data volumes (there is little read activity on the transaction log in comparison to the data files).

Figure 45 Read I/O rates for data file volumes

Reads-per-second numbers alone are rarely sufficient to determine whether there is any issue with a given configuration.
Rather, it is necessary to put the read and write activity into context with the latency for the given workload profile, as well as the size of the I/O itself. In the case of the read workload to the data files, the average I/O size was 8 KB (the SQL Server page size). The latency numbers for the read workload are shown in Figure 46 on page 164. These latency statistics match those shown in the SQL Server Performance Warehouse, and the larger latencies are generated by the LUNs containing the Broker files.

Figure 46 Read latency for data file volumes

It is clear, therefore, that there are differing performance characteristics for the various data files. The Broker files have a significant workload generated against them compared to the other data files and their underlying LUNs. These files contribute to the increased SQL Server waits shown in Figure 44 on page 163.

Selection of migration targets

Selection of the migration targets is based on the relevant policy being applied. In the tested configuration, all devices were initially created within a single Storage Type, from the one pool of 96 physical drives. All hypervolumes were configured with RAID 5 3+1 protection, and each LUN was configured as a striped metavolume of four member hypervolumes. This style of design is consistent with user implementations for SQL Server database environments. Latencies for the environment were within general best practice guidance, although, as was shown, the configuration was suffering from high wait times within SQL Server. Through the use of both Windows Performance Monitor and SQL Server Performance Warehouse statistics, the most underperforming storage locations could be identified. Specifically, the metavolumes providing storage locations for Broker4 (DATA4) and Broker5 (DATA5) suffered from the greatest latency.
In a manually managed environment, such devices would require administrator intervention. In a policy-based environment such as that provided by FAST, identification of underperforming devices, and actionable movements within the policy, can improve overall performance. The monitoring time interval used by the policy engine was set so as to analyze all devices within the system during the course of the workload run. The resulting statistics were utilized by the policy engine to generate plans, and a plan for movement was created by the FAST controller. The plan can be viewed either through the Symmetrix Management Console Swap/Move list, or via the symfast CLI, as shown in Figure 47 on page 166. The suggested plan is uniquely identified by the Plan Identifier, and details the reason for the plan (Performance) and the suggested movement. Because the system was configured to require user approval, the plan shows a Plan State of NotApproved.

Figure 47 FAST generated performance movement plan

The targeted devices include the storage location for Broker4 (DATA4), which is the metavolume defined by device 119 (the metavolume head) and its three members (devices 11A, 11B, and 11C). Also included is the storage location for Broker5 (DATA5), which is the metavolume defined by device 11D (the metavolume head) and its three members (devices 11E, 11F, and 120). The plan suggests that the devices be relocated to the FLASH Storage Type, and displays the associated disk group number and name. Additionally, the protection type is displayed, in this case R5(3+1), as this was the protection type defined for this Storage Type.

Other styles of policy compliance also apply to a migration. In Figure 48 on page 167, a further migration is suggested, and in this instance the Group status indicates that the move is based on compliance needs.
In this case the storage allocation is out of compliance with the FAST policy, which, as defined, indicated that 20 percent of the storage allocation for the Storage Group could be satisfied from EFD storage, and 90 percent from the RAID 5 Storage Type. While two devices were moved to the EFD Storage Type, this left the policy out of compliance, as shown in Figure 48 on page 167.

Figure 48 FAST policy compliance view

In such cases, the FAST controller will determine which storage devices can be moved to ensure compliance with the policy. Where a movement is to a lower-performing Storage Type, the devices that exhibit the lowest workload patterns will be identified to be moved. Such movements are scheduled in the same manner as performance moves, and a special designation for the plan is shown, as seen in Figure 49 on page 167.

Figure 49 FAST generated compliance movement plan

In the tested configuration, the device selected was the storage device allocated to one of the TEMPDB data file locations. As TEMPDB was not utilized in this configuration, the three TEMPDB device locations (two TEMPDB data files and one TEMPDB log) were equal candidates.

Scheduling migrations

Migrations, whether awaiting user approval or automatically approved, are subject to a further scheduling policy defined within the Optimizer environment. The migration time period, especially for up-tiering events, which are generally intended to address underperforming configurations, should be scheduled during hours when heavy production utilization is not typical.
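The compliance-driven selection described above — demote the least-active devices until the tier falls back under its cap — can be sketched as follows. Device names, sizes, and workload figures are hypothetical, and the real FAST controller applies richer statistics than a single IOPS number:

```python
# Sketch: pick the lowest-workload devices to demote until a tier's
# allocation falls back within its policy cap. Illustrative only.

def pick_demotions(devices, tier_gb, cap_gb):
    """devices: {name: (size_gb, iops)} currently on the over-allocated tier."""
    demote = []
    # Least-busy devices first, mirroring FAST's preference for demoting
    # the devices with the lowest workload patterns.
    for name, (size, _) in sorted(devices.items(), key=lambda d: d[1][1]):
        if tier_gb <= cap_gb:
            break
        demote.append(name)      # idle devices (e.g. unused TEMPDB) go first
        tier_gb -= size
    return demote

devices = {"TEMPDB_DATA1": (120, 0), "TEMPDB_LOG": (120, 0),
           "BROKER4": (120, 4000)}
print(pick_demotions(devices, tier_gb=360, cap_gb=240))  # ['TEMPDB_DATA1']
```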
Quality of Service (QoS) mechanisms within the Symmetrix VMAX environment will ensure that user workloads are not significantly affected while migrations occur, but scheduling movements such that they complete outside of production periods is recommended. A given FAST plan may be approved either through Symmetrix Management Console in the Symmetrix Optimizer section, or by utilizing the symfast CLI, as shown in Figure 50 on page 168.

Figure 50 User approval of a suggested FAST plan

Approved plans will wait for the defined swap window to arrive before execution. Once executing, a migration cannot be terminated, and will run until concluded. Depending on volume sizes and the number of migrations in process, the time for migration completion will vary. Symmetrix QoS mechanisms will prevent the migration from adversely affecting production workloads should the migration continue into normal working hours. It is also possible to apply manual priority settings to further limit the copy process by utilizing the symqos CLI and lowering the Mirror Pace setting. While the migration is executing, it is possible to query its progress, again using Symmetrix Management Console or the symfast CLI. In Figure 51 on page 169, additional information is displayed for a plan that is executing a migration, including the time at which the actual migration began.

Figure 51 FAST migration in progress

Once all devices have fully migrated, the migration will automatically terminate, and the targeted devices will reside on the selected Storage Type. The performance attributes of the Storage Type will automatically apply to the devices.

Assessing the benefits of FAST

The migrations implemented by the FAST policy engine improved overall performance of the more heavily accessed volumes by migrating them to a higher-performing Storage Type.
In the example configuration, two LUNs were migrated to the EFD Storage Type, resulting in improved workload throughput and a reduction in I/O latency for these storage devices. The migration of these heavily accessed volumes also reduced contention within the original storage pool, thereby improving response times for the remaining devices. Post-migration response times were significantly better than before the move, as shown in Figure 52 on page 170. Average latencies for all volumes dropped well below 4 ms, whereas before the move average latencies for a number of volumes were 6 ms and higher.

Figure 52 Read latencies for data volumes - post migration

In addition to improved latency for read requests, overall throughput for the LUNs increased. Figure 53 on page 170 shows that read requests per second increased from a peak of around 7,000 reads/s to over 9,000 reads/s after migration.

Figure 53 Read I/O rates for data volumes - post migration

The performance improvements were corroborated by the data collected by the SQL Server Performance Warehouse, shown in Figure 54 on page 171.

Figure 54 Performance Warehouse virtual file statistics - post migration

As a direct result of higher throughput and lower latencies, the operation of the SQL Server environment improved across all major attributes. Figure 55 on page 171 details the relative improvement in performance for the environment. Batch requests/s, which provide a general view of SQL Server throughput, increased by 62 percent.

Figure 55 Comparison of improvement for major metrics

Microsoft SQL Server database environments often exhibit a skewed workload across the files comprising different filegroups.
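The before-and-after figures quoted above translate into the following relative changes. This is a simple check using only the endpoint values given in the text (peak reads/s and worst-volume latency):

```python
# Relative improvement computed from the before/after numbers in the text.

def pct_change(before, after):
    return (after - before) / before * 100

reads_before, reads_after = 7_000, 9_000     # peak reads/s, pre/post migration
print(f"reads/s: {pct_change(reads_before, reads_after):+.0f}%")   # about +29%

latency_before, latency_after = 6.0, 4.0     # ms, worst volumes pre/post
print(f"latency: {pct_change(latency_before, latency_after):+.0f}%")  # about -33%
```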
In many implementations, the LUNs used as storage devices are allocated from the same storage pools, which often leads to contention for physical resources. In enterprise-class storage arrays, tiered storage configurations are implemented to provide differing performance and cost-effective storage solutions. Migrating the most heavily accessed volumes from a lower-performing Storage Type to a higher-performing Storage Type improves performance not only for the volumes being migrated, but also for the other volumes remaining in the previous Storage Type, which then experience less contention. Moving lightly accessed volumes from a higher-performing Storage Type to a more appropriate Storage Type further improves utilization and drives down cost.

Deploying FAST VP with SQL Server databases

Validation of FAST VP for a SQL Server environment was conducted in a similar manner to that described for FAST DP. A moderate-to-high user workload was generated against a SQL Server 2008 R2 environment. Storage allocations were initially provided in a manner consistent with current provisioning practices deployed within the SQL Server user community. Specifically, in this instance, Virtual Provisioning was utilized to implement a thin environment for the SQL Server database. Storage used to support all database data files and transaction log space was configured from virtually provisioned storage. Storage tiers were defined within the environment, and a FAST VP policy was defined for the SQL Server database. FAST monitoring was implemented, and automatic mode for movements was enabled. For all testing, Microsoft Windows Server 2008 R2 and Microsoft SQL Server 2008 SP1 were utilized.
The primary user database comprised several filegroups, where each filegroup mapped to one or more files located across eight LUNs, as shown in Figure 56 on page 174. Each of the eight LUNs utilized for the workload was configured from a single thin pool. Given the proportional fill algorithm used by SQL Server, each logical entity, such as the Broker filegroup, will have allocations from each of its constituent files. Subsequently, all I/O activity generated against tables and indexes defined within the Broker filegroup will be serviced by the physical files. Workloads directed at the LUNs will in turn be directed to the thin pool used for the storage allocation. EMC Symmetrix Virtual Provisioning ensures that the allocations are distributed across the physical disk resources enabled within the thin pool. In this manner, workloads can be satisfied from the much larger resource pool comprised by the thin pool than might be configured through traditional storage allocations. Utilizing multiple files and LUNs continues to be a recommendation even in configurations where all workloads resolve to a single thin pool. The value of multiple files and LUNs is gained at the host operating system and SQL Server level, where multiple physical disk and file objects ensure sufficient bandwidth to the storage array.

Figure 56 Overview of relationships of filegroups, files and thin devices

Each thin LUN was defined as approximately 360 GB in size, although actual allocations from the thin pool were based on the actual data written to the files located on each LUN. All configured LUNs contained a single NTFS volume, and each NTFS volume was mounted at a mount point located under the directory C:\SQLDB. All volumes contained only a single data file, defined by the database creation script.
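The proportional fill behavior noted above can be illustrated with a small sketch. SQL Server distributes new allocations within a filegroup to its files in proportion to each file's free space; the file names and free-space figures below are illustrative, not taken from the tested database:

```python
# Hedged sketch of SQL Server's proportional fill: new allocations within a
# filegroup are spread across its files in proportion to their free space.
# File names and free-space values are hypothetical.

def proportional_fill(free_mb, total_alloc_mb):
    """Split total_alloc_mb across files in proportion to their free space."""
    total_free = sum(free_mb.values())
    return {f: total_alloc_mb * free / total_free for f, free in free_mb.items()}

# Four equally sized Broker files with equal free space receive equal shares,
# so I/O against the Broker filegroup spreads across all of its LUNs.
free = {"Broker_1.ndf": 20_000, "Broker_2.ndf": 20_000,
        "Broker_3.ndf": 20_000, "Broker_4.ndf": 20_000}
shares = proportional_fill(free, 1_000)
print(shares)   # each file receives 250 MB of the 1,000 MB allocation
```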
As deployed, all storage LUNs were initially bound to a single thin pool called FC_SQL. Figure 57 on page 175 displays the relationship of the thin devices to the data devices supporting the thin pool, and subsequently to the physical disks from which the data devices were configured. In this environment, 96 physical drives were utilized to create the data devices in a RAID 5 3+1 configuration. The drives were 450 GB 15k rpm devices. From the physical drive pool, 48 data devices were constructed with the RAID 5 3+1 protection scheme, and these data devices were added to the FC_SQL thin pool. Each drive therefore provisioned storage for two data devices.

Figure 57 Relationship of thin LUNs to data devices and physical drives

Each LUN, as presented to the host, was created as a striped metavolume configuration using six members, whereby each member hypervolume was 30 GB in size. As all LUNs were thin, there is no specific RAID protection at the thin LUN level; rather, protection is implemented at the thin pool level, and in the tested configuration all data devices within the FC_SQL pool were of a RAID 5 (3+1) configuration.

While it would be typical for a production database system to require additional storage allocations for system databases such as TEMPDB, the workload used for the testing did not generate activity against this environment. As a result, these additional storage allocations, while created, were not considered in this testing. Two additional devices were constructed for TEMPDB in the configuration, and are shown as Symmetrix devices 0149 and 014F in subsequent outputs. It is anticipated that TEMPDB usage may be an important aspect for customer configurations, and the functionality demonstrated for the user database described here will apply to the TEMPDB environment where workloads are driven to that database.
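The data device arithmetic above can be confirmed directly: each RAID 5 (3+1) group spans four drives, so 48 data devices consume 192 drive slices across the 96 drives:

```python
# Check: how many data-device slices land on each physical drive.

data_devices = 48
drives_per_raid_group = 4     # RAID 5 (3+1) spans four drives
physical_drives = 96

slices = data_devices * drives_per_raid_group
print(slices)                          # 192 drive slices in total
print(slices // physical_drives)       # 2 data devices provisioned per drive
```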
After the creation of the user database on the allocated volumes, and the loading of data into the environment, the allocation against the LUNs can be seen in Figure 58 on page 176. LUN allocations are based on the actual usage of the data files defined, as previously discussed. The actual usage can be seen in the Pool Allocated Tracks value for any given pool-bound thin device.

Figure 58 Detail of FC_SQL thin pool allocations for bound thin devices

The simulated workload was that of a TPC-E-like environment. As such, it represents the workload anticipated of an online brokerage environment. Simulated users perform a range of processing tasks, such as stock transactions and look-ups, as well as the generation of larger reports. The workload generated a significant read/write requirement against the various portions of the database.

When implementing FAST VP, it is important to consider the granularity of storage movement. Unlike standard FAST DP, where an entire LUN is moved to an alternate tier, FAST VP moves only portions of the thin devices defined within the policy. As a result, some components may be located on any of the tiers and will exhibit the performance characteristics of the tier on which they are located. As workloads change on the storage allocations, FAST VP will transparently migrate the data to the most appropriate tier.

Configuring FAST VP for the environment

In the initial starting configuration, all LUNs were bound to a single thin pool (FC_SQL), resulting in all allocations being provided by this single pool. All thin pools were configured to utilize a RAID 5 3+1 protection type. The Symmetrix VMAX array contained a number of different technologies, including EFD, 450 GB 15k rpm drives, and 1 TB SATA drives. In the tested environment, multiple storage tiers were defined, as shown in Figure 59 on page 178.
The tiers created in this configuration varied in terms of the underlying technology: the tier "EFD_5R3" was defined as RAID 5 3+1 protection on EFD, "FC_5R3" was defined as a RAID 5 3+1 protection scheme on Fibre Channel 450 GB 15k rpm drives, and "SATA_5R3" was defined as a RAID 5 3+1 protection scheme on 1 TB 5,400 rpm drives. These three tiers were subsequently used to define a policy for the SQL Server environment.

Figure 59 Defining FAST VP storage tiers within Symmetrix Management Console

The storage tiers were applied to create a policy named "SQLPRD_POLICY". This policy defined the applicable storage tiers and applied capacity percentages to the various tiers in SMC. The percentages define how much of the storage space currently allocated by the storage group defined in the policy can be utilized from each of the tiers. Thus, the values shown in Figure 60 on page 179 indicate that when applied to a storage group, 100 percent of the storage allocation for the group may come from the FC_5R3 storage tier, 30 percent may come from the EFD_5R3 storage tier, and 100 percent may be allocated from the SATA_5R3 storage tier. The total allocation in this instance is 230 percent, which allows the FAST policy to allocate storage from the most appropriate tier based on its statistical sampling.

Figure 60 FAST VP policy definition within Symmetrix Management Console

The association of a storage group to a defined policy binds the policy to the devices contained within the storage group. The storage group is defined when implementing Auto-provisioning Groups, and defines the storage devices that will be available to a host when bound to a view. For the tested workload the storage group name was "SQLPRD_Thin_Devs" and defined all storage devices that were presented to the SQL Server host.
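As with the DP policy, the SQLPRD_POLICY percentages cap per-tier usage, but for FAST VP they are applied against the capacity currently allocated from the thin pools rather than the configured LUN sizes, as noted above. A small sketch follows; the allocated-capacity value is an assumption, not a figure from the tested configuration:

```python
# FAST VP per-tier limits are computed against allocated thin capacity.
# The allocated_gb value is illustrative, not taken from the document.

allocated_gb = 1_500
policy = {"FC_5R3": 1.00, "EFD_5R3": 0.30, "SATA_5R3": 1.00}

limits = {tier: allocated_gb * pct for tier, pct in policy.items()}
print(round(limits["EFD_5R3"]))       # 450 GB of extents may sit on flash
print(round(sum(policy.values()), 1))  # 2.3 -> 230 percent total headroom
```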
This storage group definition is typically created when initially provisioning storage to the SQL Server host during initial deployment. In Figure 61 on page 180 the storage group is displayed with its component thin devices.

Figure 61 Allocating a storage group to a policy in Symmetrix Management Console

To complete the implementation of the FAST VP mechanisms, it is necessary to set appropriate time periods for the policy engine to collect statistics for workload analysis. Statistics collection is defined within the Symmetrix Optimizer environment, and may also be set through SMC during execution of the FAST Configuration Wizard, as shown in Figure 62 on page 181.

Figure 62 Symmetrix FAST Configuration Wizard performance time window

When implementing FAST VP, all movement operations are fully automatic; there is no user approval mode. All storage movements relating to FAST VP policies are executed during the periods specified in the Movement Time Window. Figure 63 on page 182 displays the configuration settings available through the Symmetrix Management Console FAST Configuration Wizard. Multiple time windows may be specified for movements to occur.

Figure 63 Symmetrix FAST Configuration Wizard movement time window

Having defined the storage group association to the storage tiers through the policy, and having scheduled the application time periods within SMC, the environment was configured appropriately.

Monitoring Workload Performance

Microsoft SQL Server database administrators and Windows Server administrators will typically utilize Windows performance counters to monitor overall system performance.
The instrumentation provided by the Windows performance counters represents a valid performance profile of the behavior of the storage devices presented to the host. Additionally, SQL Server implements a Performance Warehouse that can be used to specifically monitor those portions of the environment utilized by a SQL Server database. The SQL Server Performance Warehouse can be a valuable tool for database administrators to identify specific performance and transaction counters. Only Windows Performance Monitor counters were collected during this testing.

Identification of database workloads

The user workload was executed for an extended period of time to ensure that it reached a steady state and that the initial performance statistics collection time interval, as defined within Symmetrix Management Console for FAST VP, had passed. Windows Performance Monitor counters were collected over the time interval. Figure 64 on page 183 shows overviews of the disk reads and writes per second, covering a 4-hour time period during which the initial workload was executed.

Figure 64 Windows performance counters for reads and writes IOPS

During the initial workload period, read operations per second averaged 4,000 for the Broker filegroup. Other filegroups produced significantly less workload.
Write operations did not approach the levels seen with reads, but did exceed 1,000 IOPS during checkpoint intervals. Latencies for the LUNs were also recorded, and are displayed in Figure 65. They show that read latencies for the most heavily accessed LUNs (the Broker filegroup) were approaching 20 milliseconds. Write latencies were lower, with the most heavily accessed LUNs maintaining an 11 ms latency.

Figure 65 Windows performance counters for read and write latencies

It is clear, therefore, that there are differing performance characteristics for the various data files. The Broker files have a significant workload generated against them compared to the other files and their underlying LUNs.

Effects of FAST VP

Migrations of storage extents in the LUNs defined for the FAST VP storage group will begin executing after the initial statistics collection period and within a valid movement time window. In the tested configuration, the movement time window was set for a 24-hour period, seven days a week. The limiting factor was the initial statistics collection time interval, which was set to 7 hours. After this time period had elapsed, the movements began. The size of the movements is based on the utilization of, and access patterns to, the various storage allocations for the LUNs. Details on the sizes and mechanisms used are covered in the Implementing Fully Automated Storage Tiering with Virtual Pools (FAST VP) for EMC Symmetrix VMAX Series Arrays Technical Notes, available on Powerlink®. As storage extents are moved between the various tiers defined within the policy, performance of the thin devices will begin to exhibit the characteristics of the tier to which the extents have been moved.
The resulting workload profile can be seen in Figure 66 on page 185, which covers the initial statistics collection time interval, the period during which the majority of migrations occurred, and the period during which the workload reached a stable point. Extent moves for the LUNs under FAST VP control continue even after the initial relocation activity.
Figure 66 Read and write workload before and during migrations
The read workload exhibits the largest change in profile as extents are migrated to the various storage tiers. In the tested environment, reads per second for the Broker FileGroup increased from a rate of around 4,000 operations per second to a level just below 5,000 operations per second. Other SQL Server FileGroups can also be seen to have an increase in throughput, but to a lesser degree than the Broker FileGroup. As with the changes in read and write throughputs, the latency numbers were also positively affected; these results can be seen in Figure 67. There is little impact to the latencies during the period of extent migration, and as a result of the migrations, the latency figures for both read and write operations improve.
Figure 67 Read and write latencies before and during migrations
Using Solutions Enabler commands, it is possible to monitor the migrations between the tiers defined within the policy. The symfast command can be utilized to show the current allocations across the defined tiers. In the tested configuration, the command symfast -sid 1667 list -association -demand -mb was used to collect information on the tiers being used. The command provides details on the current allocation in the format shown in Figure 68 on page 186.
Figure 68 Output from demand association report
The output from this command was collected over time to monitor the change in allocations. The results are summarized in Figure 69 on page 186 and show that extent migrations occurred between the Fibre Channel and EFD tiers in a highly correlated manner.
Figure 69 Summary of thin pool allocations over time
The FAST VP policy in place monitors all thin LUNs allocated to the storage group, and statistical sampling occurs for all LUNs. Migrations between the tiers defined in the policy are based on the level of activity. It should be expected that the most heavily accessed storage allocations, or those that contain the most heavily accessed data, will be identified for migration to a higher-performing storage tier. Figure 70 on page 187 displays the storage allocations for one of the metavolumes used for a Broker file during a similar time period as previously shown for all tiers.
Figure 70 Storage allocations for a LUN used by a Broker file
Storage allocations also occurred against the SATA storage tier; however, this was not to the same degree as the migration into the EFD tier. SATA tier allocations are shown in Figure 71 on page 188, and comprise a significantly smaller allocation rate and overall size than the Fibre Channel and EFD pools.
Figure 71 Storage allocations for the SATA tier
The minimal usage of the SATA tier is mainly a consequence of the workload used during the testing, which generates a constant load across all data files comprising the database storage. This constant load limits the ability to move extents to a higher-density storage pool, since demotion requires that no workload be generated against the candidate storage extents.
The only demotions that could occur are for NTFS metadata structures that generate no access activity over time. It is these structures that tend to be demoted to the SATA tier. The storage allocations for these structures are relatively small, and account for some of the storage allocations on the LUNs. Additionally, two LUNs were provided to store TEMPDB in the environment. These LUNs generated no workload during the testing, and were fully migrated to the SATA pool. It should be noted that these LUNs were thin devices, and were configured using best-practice implementation for Windows. These best practices result in minimal storage allocations occurring on the thin devices themselves. Thus these storage allocations are insignificant compared to the approximately 1.3 TB allocated for the database files and transaction logs.
Assessing the benefits of FAST VP
The migrations automatically implemented by the FAST VP policy engine improved overall performance of the total environment by migrating heavily accessed storage allocations to a higher-performing storage tier. In the example configuration, a large proportion of the more active storage allocations of the database (Broker FileGroup) were migrated to the EFD storage tier, resulting in improved workload throughput and a reduction in I/O latency for these LUNs. The migration of these heavily accessed storage allocations also resulted in a reduction of contention within the original storage pool, thereby improving response times for the remaining devices. Post-migration I/O throughput rates were significantly better than pre-migration rates, and were achieved with significantly reduced latencies. Figure 72 on page 189 shows that overall system read requests per second increased over the duration of the testing.
Reads increased from a rate of around 17,000 to around 21,000 reads/sec, while latency also improved as the time per read dropped from around 12 ms to an average of 8 ms.
Figure 72 Read I/O rates for data volumes - Post-FAST VP activation
As a direct result of higher throughput and lower latencies, the operation of the SQL Server environment improved across all major attributes. Figure 73 on page 190 details the relative improvement in performance for the environment. The performance improvements were obtained after only 28 percent of the total storage allocations across all LUNs had been migrated to an alternate storage tier.
Figure 73 Comparison of improvement for major metrics
The movement of storage extents to a lower, more cost-effective tier was limited by the utilization of the storage devices. SQL Server ensures that there is a uniform distribution of I/O across all data files. This style of activity ensures that disk resources are equally accessed. SQL partitioning, and refinements in the allocation of data and indexes to discrete data files, may improve the overall efficiency of the system. Microsoft SQL Server database environments often exhibit a skewed workload across the files comprising differing filegroups. In many implementations, the LUNs used as storage devices are allocated from the same storage pools, which often leads to contention for physical resources. In enterprise-class storage arrays, tiered storage configurations are implemented to provide differing performance and cost-effective storage solutions. Implementing FAST VP can improve performance across the array. The sub-LUN blocks migrated to higher-performance storage can significantly improve application performance, while also improving performance of the remaining non-migrated data due to diminished contention.
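The symfast demand report described earlier lends itself to periodic collection when tracking migrations like these. The sketch below simply rebuilds the command line used in the tested configuration (the Symmetrix ID 1667 and the option set come from the text above); scheduling the command and parsing its output are left out, since the report format is product-specific.

```python
# Sketch: rebuild the Solutions Enabler command used in the tested
# configuration to report FAST VP per-tier allocation demand in MB.
# Only the command text is taken from this section; actually running it
# requires a host with Solutions Enabler and access to the array.
def symfast_demand_command(sid):
    return ["symfast", "-sid", str(sid),
            "list", "-association", "-demand", "-mb"]

cmd_line = " ".join(symfast_demand_command(1667))
```

Capturing this output at regular intervals, as was done for Figure 69, is what reveals the correlated movement of extents between the Fibre Channel and EFD pools.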
Moving lightly accessed storage allocations from a higher-performing storage tier to a more appropriate storage tier helps improve capacity utilization. The cost savings from tiered storage implementations can be realized in lower acquisition costs and long-term operational savings, especially in lower long-term power usage. Symmetrix VMAX with Enginuity 5875 enables a further refinement in operational efficiencies to ensure that data placement is optimized based on workload requirements.

4 Creating Microsoft SQL Server Database Clones

This chapter presents these topics:
◆ Overview
◆ Recoverable versus restartable copies of databases
◆ Copying the database with Microsoft SQL Server shutdown
◆ Copying the database using EMC Consistency technology
◆ Copying the database using SQL Server VDI and VSS
◆ Copying the database using Replication Manager
◆ Transitioning disk copies to SQL Server database clones
◆ Reinitializing the cloned environment
◆ Choosing a database cloning methodology

This chapter describes the Microsoft SQL Server database cloning process using various EMC products. Determining which replication products to use depends on the customer’s requirements and database environment.
Products such as the TimeFinder Integration Module for SQL Server, the TimeFinder CLI tools themselves, Replication Manager, and EMC NetWorker provide an easy method of copying Microsoft SQL Server databases within a single Symmetrix array. All storage-based technologies used to create cloned databases depend on some form of restart or recovery processing. This methodology is used by many existing SQL Server solutions, including implementations such as Microsoft Failover Clustering, which behave as restart environments. The mechanisms used by EMC clone processing are not unique, and utilize tried and tested methodologies. A database cloning process typically includes some or all of the following steps, depending on the copying mechanism selected and the desired usage of the database clone:
◆ Preparing the array for replication
◆ Conditioning the source database
◆ Making a copy of the database volumes
◆ Resetting the source database
◆ Presenting the target database copy to a server
◆ Conditioning the target database copy

Overview
There are many choices when cloning databases with EMC array-based replication software. Each software product has differing characteristics that affect the final deployment. A thorough understanding of the options available leads to an optimal replication choice. A Microsoft SQL Server database can be in one of three data states when it is being copied:
◆ Shutdown
◆ Processing normally
◆ Conditioned using SQL Server Virtual Device Interface (VDI) or Volume Shadow Copy Service (VSS)
Depending on the data state of the database at the time it is copied, the database copy may be restartable or recoverable. This chapter begins with a discussion of recoverable and restartable database clones.
It then describes various approaches to data replication using EMC software products, and how the replication techniques can be used in combination with the different database data states to facilitate the database cloning process. Following that, database clone usage considerations are discussed, along with descriptions of the procedures used to deploy database clones.

Recoverable versus restartable copies of databases
The Symmetrix-based replication technologies described in this section can create two types of database copies: recoverable and restartable. A significant amount of confusion exists between these two types of database copies; a clear understanding of their differences is critical to ensure the appropriate application of each method when a cloned SQL Server environment is required.
Recoverable database copies
A recoverable database copy is one to which logs can be applied, rolling the database forward to a point in time after the copy was created. A recoverable SQL Server database copy is intuitively easy for DBAs to understand, since maintaining recoverable copies in the form of backups is an important DBA function. In the event of a failure of the production database, the ability to recover the database not only to the point in time when the last backup was taken, but also to roll forward subsequent transactions up to the point of failure, is a key feature of the SQL Server database. In general, recoverable copies of SQL Server databases must utilize the BACKUP DATABASE Transact-SQL statement, utilize the Virtual Device Interface (VDI), or integrate with the Volume Shadow Copy Service (VSS) subsystem.
Restartable database copies
If a disk mirror copy of a running SQL Server database is created using EMC Consistency technology without utilizing a utility that implements VDI or VSS backup functionality, the copy is considered a DBMS restartable copy.
This means that when the files representing the database are attached on the restartable copy, SQL Server performs crash recovery. First, all transactions that were recorded as committed and written to the transaction log, but which may not have had corresponding data pages written to the data files, are rolled forward. This is the redo phase. Second, when the redo phase is complete, SQL Server enters the undo phase, where it looks for database changes that were recorded (a dirty page flushed by a lazy write, for instance) but which were never actually committed by a transaction. These changes are rolled back, or undone. The state attained is often referred to as a transactionally consistent point in time. It is essentially the same process that the RDBMS would undergo had the server suffered an unanticipated interruption such as a power failure. Roll-forward recovery using incremental transaction log backups to a point in time after the database copy was created is not supported on a Microsoft SQL Server restartable database copy.

Copying the database with Microsoft SQL Server shutdown
It is possible to take a copy of a Microsoft SQL Server database while the database is shut down. Taking a copy after the database has been shut down normally ensures a clean copy for streaming to tape or for fast startup of the cloned database. In addition, a cold copy of a database is in a known transactional data state which, for some application requirements, is exceedingly important. Copies of running databases are in unknown transactional data states.
Note: A cold copy of a SQL Server database does not constitute a valid SQL Server backup, as no record of a backup event is ever made by the SQL Server instance.
Consequently, it is also not possible to process the files with functions such as the Transact-SQL RESTORE DATABASE command set. The primary method of creating cold copies of a Microsoft SQL Server database is through the use of EMC’s local replication product, TimeFinder. TimeFinder is also used by Replication Manager and EMC NetWorker to make database copies. Replication Manager and EMC NetWorker facilitate the automation and management of database clones. Additionally, the EMC TimeFinder SQL Server Integration Module (TF/SIM) also facilitates the creation of copies based on TimeFinder functionality. TimeFinder operations are implemented in three different forms: TimeFinder/Mirror, TimeFinder/Clone, and TimeFinder/Snap. These were discussed in general terms in Chapter 2, “EMC Foundation Products.” Here, they are utilized in a database context.
Using TimeFinder/Mirror
TimeFinder/Mirror is an EMC Symmetrix array implementation that allows an additional hardware mirror to be attached to a source volume. The additional mirror is a specially designated volume in the Symmetrix configuration called a business continuance volume, or BCV. The BCV is synchronized to the source volume through a process referred to as an establish. While the BCV is established, it is not ready to any hosts to which it may be presented. At an appropriate time, the BCV can be split from the source volume to create a complete point-in-time copy of the source data that can be used for different purposes, including backup, decision support, and regression testing.
Note: For VMAX array implementations, TimeFinder/Mirror is generally replaced with TimeFinder/Clone operations. Coverage is provided here for completeness.
Groups of BCVs are managed together using SYMCLI device or composite groups. Solutions Enabler commands are executed to create SYMCLI groups for TimeFinder/Mirror operations.
If the database spans more than one Symmetrix, a composite group is used. Figure 74 on page 199 depicts the steps necessary to make a database copy of a cold Microsoft SQL Server database using TimeFinder/Mirror:
Figure 74 Copying a shutdown SQL Server database with TimeFinder/Mirror
1. Establish the BCVs to the standard devices. This operation occurs in the background and should be executed in advance of when the BCV copy is needed:
symmir -g <device_group> establish -full -noprompt
Note: The first iteration of the establish needs to be a full synchronization. Subsequent iterations are by default incremental if the -full keyword is omitted.
Once the command is issued, the array begins the synchronization process using only Symmetrix resources. Since this operation occurs independently from the host, the process must be interrogated to see when it completes. The command to interrogate the synchronization process is:
symmir -g <device_group> verify
This command returns a 0 return code when the synchronization operation is complete. Alternatively, synchronization can be verified using the following:
symmir -g <device_group> query
After the volumes are synchronized, the split command can be issued at any time.
2. Once BCV synchronization is complete, the database needs to be brought down in order to make a copy of a cold database. SQL Server management tools may be used to take the database offline or shut down the instance.
3. When the database is deactivated, split the BCV mirrors using the following command:
symmir -g <device_group> split -noprompt
The split command takes a few seconds to process.
The database copy on the BCVs is now ready for further processing.
4. The source database can now be activated and made available to users once again. Use the SQL Server management tools to reinstate the database, or restart the SQL Server instance.
Using TimeFinder/Clone
TimeFinder/Clone is an EMC software product that copies data internally in the Symmetrix array. A TimeFinder/Clone session is created between a source data volume and a target volume. The target volume needs to be equal to or greater in size than the source volume. The source and target for TimeFinder/Clone sessions can be any hypervolumes in the Symmetrix configuration. TimeFinder/Clone devices are managed together using SYMCLI device or composite groups. Solutions Enabler commands are executed to create SYMCLI groups for TimeFinder/Clone operations. If the database spans more than one Symmetrix, a composite group is used. Examples of these commands can be found in Appendix A, “Related Documents.” Figure 75 on page 202 depicts the steps necessary to make a copy of a cold SQL Server database onto BCV devices using TimeFinder/Clone:
1. The first action is to create the TimeFinder/Clone pairs. The following command creates the TimeFinder/Clone pairings and protection bitmaps. No data is copied or moved at this time:
symclone -g <device_group> create -noprompt
Unlike TimeFinder/Mirror, the TimeFinder/Clone relationship is created and activated when it is needed. No prior synchronization of data is necessary. After the TimeFinder/Clone session is created, it can be activated consistently.
2. Once the create command has completed, the database needs to be shut down to make a cold disk copy of the database. SQL Server management tools may be used to take the database offline or shut down the instance.
3.
With the database down, the TimeFinder/Clone copy can now be activated:
symclone -g <device_group> activate -noprompt
After the activate, the database copy provided by TimeFinder/Clone is immediately available for further processing, even though the copying of data may not have completed.
4. The source database can now be activated and made available to users once again.
In current releases of Symmetrix Enginuity, databases copied using TimeFinder/Clone are no longer subject to Copy on First Write (COFW) penalties, but will suffer from Copy on Access (COA) penalties. With later releases, Enginuity implements an Asynchronous Copy on First Write (ACOFW) process. Asynchronous Copy on First Write allows a track to be written to the source volume by servicing the update from cache. The update is protected by the Symmetrix cache and battery backup systems, and is subsequently destaged to disk once the original pre-changed track has been copied to the clone target. Copy on Access means that if a track on a TimeFinder/Clone volume is accessed before it has been copied, it must first be copied from the source volume to the target volume. This causes additional disk read activity on the source volumes and could be a source of disk contention on busy systems.
Figure 75 Copying a shutdown SQL Server database with TimeFinder/Clone
Using TimeFinder/Snap
TimeFinder/Snap enables users to create complete copies of their data while consuming only a fraction of the disk space required by the original copy.
TimeFinder/Snap is an EMC software product that maintains space-saving, pointer-based copies of disk volumes using virtual devices (VDEVs) and save devices (SAVDEVs). The VDEVs contain pointers either to the source data (when it is unchanged) or to the SAVDEVs (when the data has been changed). TimeFinder/Snap devices are managed together using SYMCLI device or composite groups. Solutions Enabler commands are executed to create SYMCLI groups for TimeFinder/Snap operations. If the database spans more than one Symmetrix, a composite group is used. Examples of these commands can be found in Appendix B, “References.” Figure 76 on page 203 depicts the steps necessary to make a copy of a cold SQL Server database using TimeFinder/Snap:
Figure 76 Copying a shutdown SQL Server database with TimeFinder/Snap
1. The first action is to create the TimeFinder/Snap pairs. The following command creates the TimeFinder/Snap pairings and protection bitmaps. No data is copied or moved at this time:
symsnap -g <device_group> create -noprompt
2. Once the create operation has completed, the database needs to be shut down to make a cold TimeFinder/Snap copy of the DBMS. SQL Server management tools may be used to take the database offline or shut down the instance, whichever is most applicable.
3. With the database down, the TimeFinder/Snap copy can now be activated:
symsnap -g <device_group> activate -noprompt
After the activate, the pointer-based database copy on the VDEVs is available for further processing.
4. The source database can now be restarted.
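The four-step cold-copy sequence above can be scripted. The sketch below only generates the SYMCLI command strings in order; the device group name is a hypothetical placeholder, and steps 2 and 4 (stopping and restarting the SQL Server instance) happen outside SYMCLI, so they appear only as comments.

```python
# Sketch: the cold TimeFinder/Snap sequence expressed as the SYMCLI
# commands it issues. "sqlprod_dg" below is a hypothetical device group
# name; the command syntax itself comes from the steps above.
def cold_snap_copy_commands(device_group):
    return [
        f"symsnap -g {device_group} create -noprompt",    # 1. create pairings/bitmaps
        # 2. shut down the SQL Server instance (outside SYMCLI)
        f"symsnap -g {device_group} activate -noprompt",  # 3. activate the snap
        # 4. restart the SQL Server instance
    ]

cmds = cold_snap_copy_commands("sqlprod_dg")
```

Because the create step copies no data, the database outage is bounded by the activate, which is why the instance can be restarted almost immediately in step 4.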
In current releases of Symmetrix Enginuity, databases copied using TimeFinder/Snap are no longer subject to a Copy on First Write (COFW) penalty while the snap is activated. Asynchronous Copy on First Write (ACOFW) has been implemented for TimeFinder/Snap devices, allowing updates to the source devices to be serviced from cache. The update is protected by the Symmetrix cache and battery backup systems, and is subsequently destaged to the source device once the original pre-changed track has been copied to the snap save area.

Copying the database using EMC Consistency technology
The replication of a running database system involves a database copying technique that is employed while the database is servicing applications and users. The technique uses EMC Consistency technology in the form of a consistency group (CG), combined with an appropriate data copy process such as TimeFinder/Mirror or TimeFinder/Clone. TimeFinder/CG allows the running database copy to be created in an instant through use of the -consistent keyword on the split or activate commands. The image created in this way is in a dependent-write consistent data state and can be utilized as a restartable copy of the database. Database management systems enforce a principle of dependent-write I/O: no dependent write is issued until the predecessor write on which it depends has completed. This type of programming discipline is used to coordinate database and log updates within a database management system, and allows those systems to be restartable in the event of a power failure. Dependent-write consistent data states are created when database management systems are exposed to power failures.
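The dependent-write principle described above can be illustrated with a toy model: a deliberately simplified sketch in which a write refuses to be issued until its predecessor has completed. Real DBMS I/O scheduling is far more involved; this only demonstrates the ordering guarantee.

```python
# Toy illustration of dependent-write I/O: a dependent write is not
# issued until the predecessor write it depends on has completed.
completed = []

def issue_write(name, depends_on=None):
    if depends_on is not None and depends_on not in completed:
        raise RuntimeError(f"{name} issued before predecessor {depends_on}")
    completed.append(name)  # treat the write as complete once issued

issue_write("log_record")                          # predecessor completes first
issue_write("data_page", depends_on="log_record")  # dependent write follows
```

Because a consistent split or activate lands between dependent writes in exactly this way, the copy it produces is in the same dependent-write consistent state a power failure would have left behind.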
Using EMC Consistency technology options during the database cloning process also creates a database copy that has a dependent-write consistent data state. See Chapter 2, “EMC Foundation Products,” for more discussion of EMC Consistency technology. Microsoft SQL Server databases can be copied while they are running and processing transactions. The following sections describe how to copy a running Microsoft SQL Server database using TimeFinder technology.
Using TimeFinder/Mirror
TimeFinder/Mirror is an EMC software product that allows an additional hardware mirror to be attached to a source volume. The additional mirror is a specially designated volume in the Symmetrix configuration called a business continuance volume, or BCV. The BCV is synchronized to the source volume through a process called an establish. While the BCV is established, it is not ready to any hosts. At an appropriate time, the BCV can be split from the source volume to create a complete point-in-time copy of the source data that can be used for different purposes, including backup, decision support, and regression testing.
Note: For VMAX array implementations, TimeFinder/Mirror is generally replaced with TimeFinder/Clone operations. Coverage is provided here for completeness.
Groups of BCVs are managed together using SYMCLI device or composite groups. Solutions Enabler commands are executed to create SYMCLI groups for TimeFinder/Mirror operations. If the database spans more than one Symmetrix, a composite group is used. Examples of these commands can be found in Appendix B, “References.” Figure 77 on page 206 depicts the steps necessary to make a database copy of a running SQL Server database using TimeFinder/Mirror:
Figure 77 Copying a running SQL Server database with TimeFinder/Mirror
1. The first action is to establish the BCVs to the standard devices. This operation occurs in the background and should be executed in advance of when the BCV copy is needed:
symmir -g <device_group> establish -full -noprompt
Note: The first iteration of the establish needs to be a full synchronization. Subsequent iterations are incremental and do not need the -full keyword.
Once the command is issued, the array begins the synchronization process using only Symmetrix resources. Since this operation occurs independently from the host, the process must be interrogated to see when it completes. The command to interrogate the synchronization process is:
symmir -g <device_group> verify
This command returns a 0 return code when the synchronization operation is complete. Alternatively, synchronization can be verified using the following:
symmir -g <device_group> query
2. When the volumes are synchronized, the split command can be issued:
symmir -g <device_group> split -consistent -noprompt
The -consistent keyword tells the Symmetrix to use Enginuity Consistency Assist (ECA) to momentarily suspend writes to the disks while the split is being processed. The effect is to create a point-in-time copy of the database on the BCVs, similar to the image created when a power outage causes the server to crash. This image is a restartable copy, a term defined previously. The database copy on the BCVs is then available for further processing. Since there is no specific coordination between the database state and the execution of the consistent split, the copy is taken independent of the database activity.
In this way, EMC Consistency technology can be used to make point-in-time copies of multiple systems atomically, resulting in a consistent point in time with respect to all applications and databases included in the consistent split.
Using TimeFinder/Clone
TimeFinder/Clone is an EMC software product that copies data internally in the Symmetrix array. A TimeFinder/Clone session is created between a source data volume and a target volume. The target volume needs to be equal to or greater in size than the source volume. The source and target for TimeFinder/Clone sessions can be any hypervolumes in the Symmetrix configuration. TimeFinder/Clone devices are managed together using SYMCLI device or composite groups. Solutions Enabler commands are executed to create SYMCLI groups for TimeFinder/Clone operations. If the database spans more than one Symmetrix, a composite group is used. Examples of these commands can be found in Appendix B, “References.” Figure 78 on page 209 depicts the steps necessary to make a copy of a running SQL Server database onto BCV devices using TimeFinder/Clone:
1. The first action is to create the TimeFinder/Clone pairs. The following command creates the TimeFinder/Clone pairings and protection bitmaps. No data is copied or moved at this time:
symclone -g <device_group> create -noprompt
Unlike TimeFinder/Mirror, the TimeFinder/Clone relationship is created and activated when it is needed. No prior copying of data is necessary.
2. After the TimeFinder/Clone relationship is created, it can be activated consistently:
symclone -g <device_group> activate -consistent -noprompt
The -consistent keyword tells the Symmetrix to use Enginuity Consistency Assist (ECA) to momentarily suspend writes to the source disks while the TimeFinder/Clone is being activated. The effect is to create a point-in-time copy of the database on the target volumes.
It is a copy similar in state to that created when a power outage results in a server crash. This copy is a restartable copy, a term defined previously. After the activate command, the database copy on the TimeFinder/Clone devices is available for further processing. Since there was no specific coordination between the database state and the execution of the consistent activate, the copy is taken independent of the database activity. In this way, EMC Consistency technology can be used to make point-in-time copies of multiple systems atomically, resulting in a consistent point in time with respect to all applications and databases included in the consistent split.

Figure 78 Copying a running SQL Server database with TimeFinder/Clone

Using TimeFinder/Snap

TimeFinder/Snap enables users to create complete copies of their data while consuming only a fraction of the disk space required by the original copy. TimeFinder/Snap is an EMC software product that maintains space-saving, pointer-based copies of disk volumes using virtual devices (VDEVs) and save devices (SAVDEVs). The VDEVs contain pointers either to the source data (when it is unchanged) or to the SAVDEVs (when the data has been changed).

TimeFinder/Snap devices are managed together using SYMCLI device or composite groups. Solutions Enabler commands are executed to create SYMCLI groups for TimeFinder/Snap operations. If the database spans more than one Symmetrix, a composite group is used.
Examples of these commands can be found in Appendix B, "References."

Figure 79 on page 210 depicts the necessary steps to make a copy of a running Microsoft SQL Server database using TimeFinder/Snap:

Figure 79 Copying a running SQL Server database with TimeFinder/Snap

1. The first action is to create the TimeFinder/Snap pairs. The following command creates the TimeFinder/Snap pairings and protection bitmaps. No data is copied or moved at this time:

symsnap -g <device_group> create -noprompt

After the TimeFinder/Snap is created, all the pointers from the VDEVs point at the source volumes. No data has been copied at this point. The snap can then be activated consistently.

2. Once the create operation has completed, the activate command can be executed with the -consistent option. Execute the following command to perform the consistent snap:

symsnap -g <device_group> activate -consistent -noprompt

The -consistent keyword tells the Symmetrix to use Enginuity Consistency Assist (ECA) to momentarily suspend writes to the disks while the activate command is being processed. The effect is to create a point-in-time copy of the database on the VDEVs, similar to the state created when a power outage causes the server to crash. This image is a restartable copy, a term defined previously. The database copy on the VDEVs is available for further processing. Since there was no specific coordination between the database state and the execution of the consistent activate, the copy is taken independent of the database activity.
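The two snap steps above can be sketched as a short script. As before, this is a sketch rather than the product procedure: the symsnap stub stands in for the real SYMCLI command, and the device group name is a placeholder.

```shell
#!/bin/sh
# Sketch of the TimeFinder/Snap create/activate sequence.
DG=sql_dg
SESSION=none

# Stub (assumption): records the requested operation instead of driving the
# array; replace with the real symsnap binary in a live environment.
symsnap() {
    SESSION=$3
    return 0
}

# Step 1: build the pairings and protection bitmaps; no data moves yet,
# so this can run well ahead of the copy being needed.
symsnap -g "$DG" create -noprompt

# Step 2: ECA briefly suspends writes, leaving a restartable point-in-time
# image addressed through the VDEV pointers.
symsnap -g "$DG" activate -consistent -noprompt

echo "snap session state: $SESSION"
```

Because the create step moves no data, the only window that matters to the application is the momentary write suspension during the consistent activate.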
In this way, EMC Consistency technology can be used to make point-in-time copies of multiple systems atomically, resulting in a consistent point in time with respect to all applications and databases included in the consistent split.

Copying the database using SQL Server VDI and VSS

Microsoft SQL Server 2005 and later support a programmatic interface that provides the capability to use split mirror disk technology while a SQL Server database is online, and to create a recoverable database on the copied devices. During this process, the database is fully available for reads. Write operations are suspended during the Virtual Device Interface (VDI) split operation. When a transaction issues a commit, it is suspended until the VDI operation has been issued and the subsequent disk mirror is split.

SQL Server 2008 added support for Volume Shadow Copy Service (VSS) backup operations. The VSS implementation functions in a similar manner to the VDI implementation, coordinating with the SQL Server instance to create valid on-disk backup images.

Note: In contrast to the procedures of the previous section, there is no need to specify the -consistent option to any TimeFinder command, as SQL Server, through the VDI and VSS operations, ensures database consistency is maintained.

The copies made utilizing either the VDI or VSS interface are considered valid full database backups of the target database(s). Microsoft SQL Server records that a valid full backup has been completed in the system tables and associated database files. Because both the VDI and VSS mechanisms are programmatic interfaces, they cannot be called directly by a user.
Custom applications are typically created by storage vendors such as EMC; these applications interface with VDI and VSS to provide the integration with the storage array technology. EMC provides a number of products that support these mechanisms for creating valid backups on EMC Symmetrix. These products include the graphical Replication Manager and the command line interface-based TimeFinder SQL Server Integration Module (TF/SIM). Additionally, the Storage Resource Management (SRM) suite of tools provides the SYMIOCTL command line utility.

Note: SYMIOCTL only provides support for the VDI implementation.

In the following examples, TF/SIM and SYMIOCTL are used to demonstrate the various forms of TimeFinder functionality. These examples assume local execution of the TF/SIM product on the production server. TF/SIM VSS operations are implemented in a client/server model, and therefore assume remote backup operations from a backup client. Specific, detailed implementation requirements are available in the respective product guides.

Using TimeFinder/Mirror

TimeFinder/Mirror is an EMC software product that allows an additional hardware mirror to be attached to a source volume. The additional mirror is a specially designated volume in the Symmetrix configuration called a business continuance volume, or BCV. The BCV is synchronized to the source volume through a process called an establish. While the BCV is established, it is in a not ready state to all hosts. At an appropriate time, the BCV can be split from the source volume to create a complete point-in-time copy of the source data that can be used for multiple purposes, including backup, decision support, and regression testing.

Note: For VMAX array implementations, TimeFinder/Mirror is generally replaced with TimeFinder/Clone operations. Coverage is provided here for completeness.
In general, those EMC products that utilize the SQL Server VDI and VSS mechanisms also implement EMC's Storage Resource Management (SRM) technology, which dynamically maps database files to array volumes. Thus, the operations of these tools are independent of SYMCLI device groups, which administrators use to monitor volume states and relationships.

Executing TF/SIM

The process detailed in Figure 80 on page 214 demonstrates the steps required to create a split-mirror database backup of a Microsoft SQL Server database using TF/SIM to manage TimeFinder/Mirror operations.

Figure 80 Creating a TimeFinder/Mirror backup of a SQL Server database

1. The first action is to establish the BCVs to the standard devices. This operation occurs in the background and should be executed in advance of when the BCV copy is needed:

symmir -g <device_group> establish -full -noprompt

Note: The first iteration of the establish needs to be a full synchronization. Subsequent iterations are incremental and do not need the -full keyword.

Once the command is issued, the array begins the synchronization process using only Symmetrix resources. Since this is asynchronous to the host, the process must be interrogated to see when it is finished. The command to interrogate the synchronization process is:

symmir -g <device_group> verify

This command returns a 0 return code when the synchronization operation is complete.

2. When the volumes are synchronized, the TF/SIM backup operation may be executed.
Note: TF/SIM does not actually use the device group to manage its internal operations; rather, it examines the relationships between source and mirror devices that administrators have created through their use of the various SYMCLI device group operations.

tsimsnap backup -d <database> -f <location of metafile>

In the example in Figure 81 on page 215, the target production database is provided after the -d command line option. The -f command line option specifies the location where TF/SIM will store the VDI metadata file.

Figure 81 Sample TimeFinder/SIM backup with TimeFinder/Mirror

Executing SYMIOCTL

Similar to the TF/SIM execution, it is possible to utilize the SYMIOCTL command line utility to execute VDI processing for the SQL Server database. Unlike TF/SIM, however, all processing of the device group, and of the state of the mirror devices within the group, must be managed through command line operations. Also, there is no default ability to flush the file system buffers.

Note: In the preceding TF/SIM examples, TF/SIM ensures that file system buffers are also flushed. To implement this functionality with SYMIOCTL, it is necessary to use the Symmetrix Integration Utility (SIU) to perform the flush operation. The flush operation ensures that other file system-level operations are successfully written to the disk.

The execution of SYMIOCTL proceeds in the following manner:

1. The first action is to establish the BCVs to the standard devices. This operation occurs in the background and should be executed in advance of when the BCV copy is needed:

symmir -g <device_group> establish -full -noprompt

Note: The first iteration of the establish needs to be a full synchronization. Subsequent iterations are incremental and do not need the -full keyword.
Once the command is issued, the array begins the synchronization process using only Symmetrix resources. Since this is asynchronous to the host, the process must be interrogated to see when it is finished. The command to interrogate the synchronization process is:

symmir -g <device_group> verify

This command returns a 0 return code when the synchronization operation is complete.

2. When the volumes are synchronized, the SYMIOCTL backup operation may be executed:

symioctl -type SQLServer begin snapshot <database> SAVEFILE <location of metafile> -noprompt

In this example, the SAVEFILE command line option specifies the location where SYMIOCTL will store the VDI metadata file.

3. Flush all database locations, whether drive letters or mount points, by using SIU. In this example, the locations are mount points:

symntctl flush -path <mount point for volume>

Execute this for each and every database file and log location.

4. Split the devices:

symmir -g <device_group> split -instant -noprompt

In this case, the -instant option to the symmir command is used to facilitate a much faster execution of the split operation. With this option, the Symmetrix array maintains the logical state of the devices and returns control immediately to the command line.

5. Finalize the SYMIOCTL backup command and thaw I/O operations on the SQL Server database:

symioctl -type SQLServer end snapshot <database> -noprompt

It is the responsibility of the administrator to ensure that the SYMIOCTL end snapshot command is executed within the script; there is no automatic termination of the command. Leaving the database in the snapshot state will suspend all user connections and may cause SQL Server to take the database offline.
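Steps 2 through 5 above can be sketched as a script whose cleanup handler guarantees that end snapshot runs even if the script dies between begin and end. This is a sketch only: the symioctl, symntctl, and symmir stubs stand in for the real utilities, and the database name, device group, metafile path, and mount points are placeholders.

```shell
#!/bin/sh
# Sketch of a SYMIOCTL VDI backup with guaranteed end-snapshot cleanup.
DB=proddb
DG=sql_dg
ENDED=no

# Stubs (assumption): record or ignore the calls so the flow runs standalone.
symioctl() { [ "$3" = "end" ] && ENDED=yes; return 0; }
symntctl() { return 0; }
symmir()   { return 0; }

cleanup() {
    # Thaw I/O even on abnormal exit; leaving the database in snapshot
    # state suspends user connections. Safe to call more than once.
    symioctl -type SQLServer end snapshot "$DB" -noprompt
}
trap cleanup EXIT

# Step 2: freeze the database and write the VDI metadata file.
symioctl -type SQLServer begin snapshot "$DB" SAVEFILE /backup/meta.vdi -noprompt

# Step 3: flush every data and log location with SIU before splitting.
for MP in /sqldata /sqllog; do
    symntctl flush -path "$MP"
done

# Step 4: instant split returns control immediately.
symmir -g "$DG" split -instant -noprompt

# Step 5: normal-path finalization, then clear the trap.
cleanup
trap - EXIT
```

The trap is the script-level expression of the warning above: there is no automatic termination of the snapshot, so the script itself must own the cleanup path.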
Any user-created process or script must cater for abnormal termination of the process or script itself, and ensure that the end snapshot is executed as a cleanup operation.

Figure 82 Sample SYMIOCTL backup with TimeFinder/Mirror

Using TimeFinder/Clone

TimeFinder/Clone is an EMC software product that copies data internally in the Symmetrix array. A TimeFinder/Clone session is created between a source data volume and a target volume. The target volume must be equal to or greater in size than the source volume. The source and target for TimeFinder/Clone sessions can be any hypervolumes in the Symmetrix configuration.

TimeFinder/Clone devices are managed together using SYMCLI device or composite groups. Solutions Enabler commands are executed to create SYMCLI groups for TimeFinder/Clone operations. If the database spans more than one Symmetrix, a composite group is used. Examples of these commands can be found in Appendix B, "References."

Executing TF/SIM

The TimeFinder SQL Server Integration Module provides support for the Symmetrix TimeFinder/Clone mode to facilitate backup operations for a SQL Server database. In deployments with DMX systems, all TimeFinder/Mirror operations are mapped transparently to TimeFinder/Clone operations utilizing Symmetrix Clone Emulation mode.

For Symmetrix DMX deployments, TF/SIM requires that for local operation the source and target volumes be pre-established; for these deployments it is necessary to establish the devices in advance. In Clone Emulation mode, the symmir establish command is mapped to the appropriate symclone operation. For standard TimeFinder/Clone operations, TF/SIM does not require devices to be pre-established.
TF/SIM will execute a full copy operation by default at the time of the clone device activation.

Figure 83 Creating a backup of a SQL Server database with TimeFinder/Clone

Figure 83 on page 219 depicts the necessary steps to make a backup of a Microsoft SQL Server database onto clone devices using TimeFinder/Clone. TF/SIM also supports the execution of SQL Server VSS backup operations using TimeFinder/Clone. In Figure 84 on page 220, the execution of a TF/SIM VSS backup utilizing clone devices is demonstrated. VSS backup operations are typically executed from a remote backup host, rather than locally on the production system.

Figure 84 Sample TF/SIM VSS backup using TimeFinder/Clone

The typical sequence of operations includes the following:

1. The first action is to create the TimeFinder/Clone pairs. The following command creates the TimeFinder/Clone pairings and protection bitmaps:

symclone -g <device_group> create -noprompt

If Clone Emulation mode is required, the appropriate environment variable must be set before executing the symmir command:

set SYMCLI_CLONE_EMULATION=ENABLED
symmir -g <device_group> establish -full -noprompt

In Clone Emulation mode, TF/SIM requires that the clone be fully synchronized with the source volumes; the standard symmir commands should be used with the environment variable set.
The command to interrogate the process is:

symmir -g <device_group> verify

This command returns a 0 return code when the synchronization operation is complete. For native TimeFinder/Clone operations, the status of the clone devices can be verified by executing the following command:

symclone -g <device_group> verify

2. If using Clone Emulation mode, once the volumes are synchronized, the TF/SIM backup operation may be executed:

tsimsnap backup -d <database> -f <location of metadata>

In this example, the target production database is provided after the -d command line option. The -f command line option specifies the location where TF/SIM will store the VDI metadata file. For native TimeFinder/Clone operations, no prior synchronization of devices is required.

3. The backup operation may be completed using either the TF/SIM VDI or VSS mode of operation. For a VDI backup operation, the TF/SIM tsimsnap command may be executed in the following manner:

tsimsnap backup -d <database> -clone -f <location of metadata>

For VSS backup operations, the TF/SIM tsimvss command may be executed in the following manner from a remote backup client:

tsimvss backup -ph <production server> -d <database> -clone -bcd <location of bcd metadata> -wmd <location of wmd metadata>

Executing SYMIOCTL

Similar to the TF/SIM execution, it is possible to utilize the SYMIOCTL command line utility to execute VDI processing for the SQL Server database. Unlike TF/SIM, however, all processing of the device group, and of the state of the mirror devices within the group, must be managed through command line operations. Also, there is no default ability to flush the file system buffers.
Note: SYMIOCTL does not implement VSS backup operations.

Note: In the preceding TF/SIM examples, TF/SIM ensures that file system buffers are also flushed. SYMIOCTL does not implement such functionality, and it is necessary to cater for this operation in any script created. To implement this functionality with SYMIOCTL, use the Symmetrix Integration Utility (SIU) to perform the flush operation. The flush operation ensures that other file system-level operations are successfully written to the disk. An example of the execution of this process is shown in Figure 85 on page 223.

The execution of the SYMIOCTL process proceeds in the following manner:

1. The first action is to create the TimeFinder/Clone session. This operation should be executed in advance of when the clone copy is needed:

symclone -g <device_group> create -noprompt

2. Once the clone session is created, the SYMIOCTL backup operation may be executed:

symioctl -type SQLServer begin snapshot <database> SAVEFILE <location of metafile> -noprompt

In this example, the SAVEFILE command line option specifies the location where SYMIOCTL will store the VDI metadata file.

3. Flush all database locations, whether drive letters or mount points, by using SIU. In this example, the locations are mount points:

symntctl flush -path <mount point for volume>

Execute this for each and every database file and log location.

4. Activate the clone session:

symclone -g <device_group> activate -noprompt

5. Finalize the SYMIOCTL backup command and thaw I/O operations on the SQL Server database:
symioctl -type SQLServer end snapshot <database> -noprompt

It is the responsibility of the administrator to ensure that the SYMIOCTL end snapshot command is executed within the script; there is no automatic termination of the command. Leaving the database in the snapshot state will suspend all user connections and may cause SQL Server to take the database offline. Any user-created process or script must cater for abnormal termination of the process or script itself, and ensure that the end snapshot is executed as a cleanup operation.

Figure 85 Sample SYMIOCTL backup using TF/Clone

Using TimeFinder/Snap

TimeFinder/Snap enables users to create complete copies of their data while consuming only a fraction of the disk space required by the original copy. TimeFinder/Snap is an EMC software product that maintains space-saving, pointer-based copies of disk volumes using virtual devices (VDEVs) and save devices (SAVDEVs). The VDEVs contain pointers either to the source data (when it is unchanged) or to the SAVDEVs (when the data has been changed).

TimeFinder/Snap devices are managed together using SYMCLI device or composite groups. Solutions Enabler commands are executed to create SYMCLI groups for TimeFinder/Snap operations. If the database spans more than one Symmetrix, a composite group is used. Examples of these commands can be found in Appendix B, "References."

Executing TF/SIM

Figure 86 on page 224 depicts the necessary steps to make a backup of a SQL Server database using TimeFinder/Snap.

Figure 86 Creating a VDI or VSS backup of a SQL Server database with TimeFinder/Snap

1. The first action is to create the TimeFinder/Snap pairs. The following command creates the TimeFinder/Snap pairings and protection bitmaps. No data is copied or moved at this time:

symsnap -g <device_group> create -noprompt

Unlike TimeFinder/Mirror, the snap relationship is created and activated when it is needed. No prior copying of data is necessary. The create operation establishes the relationship between the standard devices and the virtual devices (VDEVs), and it also creates the protection metadata.

2. The backup operation may be completed using either the TF/SIM VDI or VSS mode of operation. For VDI operations, after the snap is created, the following TF/SIM tsimsnap backup operation may be executed:

tsimsnap backup -d <database> -f <location of metafile> -snap

In this example, the target production database is provided after the -d command line option. The -f command line option specifies the location where TF/SIM will store the VDI metadata file. An example of the execution of this process is shown in Figure 87 on page 226.
For VSS backup operations, the TF/SIM tsimvss command may be executed in the following manner from a remote backup client:

tsimvss backup -ph <production server> -d <database> -snap -bcd <location of bcd metadata> -wmd <location of wmd metadata>

Figure 87 Sample TF/SIM backup with TF/Snap

Executing SYMIOCTL

Similar to the TF/SIM execution, it is possible to utilize the SYMIOCTL command line utility to execute VDI processing for the SQL Server database. Unlike TF/SIM, however, all processing of the device group, and of the state of the mirror devices within the group, must be managed through command line operations. Also, there is no default ability to flush the file system buffers.

Note: SYMIOCTL does not implement VSS backup operations.

Note: In the preceding TF/SIM examples, TF/SIM ensures that file system buffers are also flushed. SYMIOCTL does not implement such functionality, and it is necessary to cater for this operation in any script created. To implement this functionality with SYMIOCTL, use the Symmetrix Integration Utility (SIU) to perform the flush operation. The flush operation ensures that other file system-level operations are successfully written to the disk.

To implement the SYMIOCTL process, the following steps should be executed:

1. The first action is to create the TimeFinder/Snap pairs. The following command creates the TimeFinder/Snap pairings and protection bitmaps. No data is copied or moved at this time:

symsnap -g <device_group> create -noprompt

2. Once the Snap session is created, the SYMIOCTL backup operation may be executed:
symioctl -type SQLServer begin snapshot <database> SAVEFILE <location of metafile> -noprompt

In this example, the SAVEFILE command line option specifies the location where SYMIOCTL will store the VDI metadata file.

3. Flush all database locations, whether drive letters or mount points, by using SIU. In this example, the locations are mount points:

symntctl flush -path <mount point for volume>

Execute this for each and every database file and log location.

4. Activate the Snap session:

symsnap -g <device_group> activate

5. Finalize the SYMIOCTL backup command and thaw I/O operations on the SQL Server database:

symioctl -type SQLServer end snapshot <database> -noprompt

It is the responsibility of the administrator to ensure that the SYMIOCTL end snapshot command is executed within the script; there is no automatic termination of the command. Leaving the database in the snapshot state will suspend all user connections and may cause SQL Server to take the database offline. Any user-created process or script must cater for abnormal termination of the process or script itself, and ensure that the end snapshot is executed as a cleanup operation. An example of the execution of the backup process is shown in Figure 88 on page 228.

Figure 88 Sample SYMIOCTL backup with TF/SNAP

Copying the database using Replication Manager

EMC Replication Manager (RM) can be used to manage and control TimeFinder copies of a SQL Server database using the VDI interface.
The RM product has both a GUI and a command line interface, and provides the capability to:

◆ Auto-discover the standard volumes holding the database
◆ Identify the path names for all data and transaction log file locations

Using this information, RM can set up TimeFinder groups with BCVs or VDEVs, schedule TimeFinder operations, and manage the creation of database copies, expiring older versions as needed. Figure 89 on page 229 demonstrates the steps performed by Replication Manager using TimeFinder/Mirror to create a database copy that can be used for multiple other purposes.

Figure 89 Using RM to make a backup of a SQL Server database

1. In the first step, Replication Manager maps the database locations of all the data files and logs on all Symmetrix devices.

Note: The dynamic nature of this activity handles the situation when extra volumes are added to the database; the procedure does not have to change.

2. Replication Manager then establishes the BCVs to the standard volumes in the Symmetrix. Replication Manager polls the progress of the establish until the BCVs are synchronized, and then moves on to the next step.

3. Replication Manager executes a SQL Server VDI database call to checkpoint and write-suspend the database.

4. Replication Manager then issues a TimeFinder split to detach the BCVs from the standard devices.

5. Next, Replication Manager executes another SQL Server VDI call to resume all I/O operations to the database.

6.
Finally, Replication Manager copies the VDI metadata file from the source host to an RM holding area for later use in a restore.

Transitioning disk copies to SQL Server database clones

How a database copy can be enabled for use depends on how the copy was created. A database copy created using the SQL Server VDI or VSS processes can be processed using the restore operations of the utility used for its creation. These utilities implement the appropriate SQL Server VDI and VSS functionality to facilitate a restore procedure using the appropriate metadata files.

Note: TF/SIM VSS backups may only be restored to the production host from which they were created. Manual methods may be utilized to implement workaround operations, but these steps are beyond the scope of this paper. EMC recommends using the TF/SIM VDI implementation for creating copies of a given source database.

If a copy of a running database was created using EMC Consistency technology without the SQL Server VDI or VSS functions, it can only be restarted. Equally, an image created while the SQL Server instance or database was shut down cannot be processed with VDI or VSS functionality. In both of these instances, whether a consistent split of a running database or a copy of a shutdown database, no roll-forward log application is possible. This restricts the instantiation of the database to a point-in-time copy of the source database. The copy created with the database shut down, or using EMC Consistency technology, can be attached to any SQL Server instance that has appropriate connectivity to the Symmetrix array.

Note: Care must be taken if the copy is to be re-presented to the source server, especially when the server is a node in a Microsoft Cluster configuration, because of duplicate signature issues in a Microsoft Cluster configuration.
In general, presenting mirror devices that represent source devices belonging to a Microsoft Cluster configuration is not supported unless specifically documented by the relevant product.

To attach a copy created from a shutdown database, or one created using EMC Consistency technology, it is possible to utilize the SQL Server Transact-SQL sp_attach_db procedure. This procedure allows for the relocation of the data files and logs, and for the renaming of the copied database. This section details how to use the sp_attach_db procedure, how to restart a database copy created using EMC Consistency technology, and how to deal with host-related issues when processing the database copy. Also discussed is the use of SQL Server VDI processes to create database instances where the initial state was created using a VDI-compliant utility.

Instantiating clones from consistent split or shutdown images

In most cases, when creating a copy of a database and utilizing it on a different server, the database manager on the new server must be made aware of the new database coming under its control. This is done by telling the database manager the locations of the copied files. In the SQL Server environment, this may be accomplished by using the SQL Server graphical user interface to attach the database, or by using SQL Server Transact-SQL statements. The sp_attach_db stored procedure looks similar to this:

sp_attach_db <new_database_alias>,
@filename1='<new_file_location>',
@filename<n>='<new_file_location>'

where <new_file_location> represents the locations of all database data files and logs for the database copy. Refer to SQL Server Books Online for complete documentation on this Transact-SQL stored procedure. In Figure 90 on page 233, a consistent split image is attached to a SQL Server instance.
It assumes that all the mirrored devices, be they BCVs, clones, or snaps, are presented to the target server and mounted in appropriate locations. If the mount locations are identical to those on the production instance, then it is only necessary to specify the first file location (the SQL Server metadata file for the database). Since the metadata file contains information about the locations of all subsequent files and logs, the stored procedure will use this information. If the file locations have been modified, then all the new locations must be specified. SQL Server will perform the necessary roll-forward/roll-back recovery on the database instance. Since the Write Ahead Logging functionality of SQL Server has been maintained, the resulting cloned instance will return to a transactionally consistent state and become a fully independent clone of the production instance. Note in Figure 90 on page 233 that transaction roll-forward and roll-back operations are shown during the attach process as viewed through Query Analyzer.

Figure 90 Attaching a consistent split image to SQL Server

Using SQL Server VDI to process the database image

A database copy of a SQL Server database created by using the SQL Server VDI split mirror backup process can be subsequently restored for use in a number of ways, using the respective VDI utility that was utilized in the initial creation.
The SQL Server VDI process can initialize the database copy into a number of different modes:

◆ As a fully independent cloned instance
◆ As a standby database, providing read-only access

Creating a cloned database using VDI processing

The command to create a cloned database instance on a separate target server using TF/SIM VDI processing is:

tsimsnap restore -d <DB_alias> -f <metafile> -m -recovery

This process needs to be invoked with the -m command line option, as it is considered a manual recovery process. As a result, this command expects that the file locations for the various data and log files are the same on the target server as they were on the production system. The VDI metadata file created during the backup process for TF/SIM, which relates to this mirrored device set, must be accessible on the host executing the restore operation, and is identified using the -f option. The -recovery option indicates that SQL Server should perform any and all recovery processes on the data files and thus transform the database into a new instance. An example of the execution of this process is shown in Figure 91 on page 234.

Using the mirror devices to instantiate a database clone will modify the state of the files on those devices. Once a VDI backup image has been modified in this way, it will no longer be possible to restore this specific instance back to the production system, and it should no longer be considered a backup image. This process results in the creation of a new transaction log chain. In-flight transactions that were active at the time the database copy was created are backed out to the last commit. This cloned database instance is now available for access and represents the state of the database at the point in time the VDI backup was created.
Figure 91 TF/SIM and SQL Server VDI to create a clone

Equally, a VDI backup image created by using SYMIOCTL may also be restored in a similar manner by using the following command:

symioctl -type SQLServer restore snapshot <database alias> SAVEFILE <file location> -noprompt

The VDI metadata file created during the backup process for SYMIOCTL, which relates to this mirrored device set, must be accessible on the host executing the restore operation and identified using the SAVEFILE option. An example of the execution of this process is shown in Figure 92 on page 235.

Figure 92 Using SYMIOCTL and SQL Server VDI to create a clone

Creating a STANDBY database using VDI processing

SQL Server supports the implementation of a STANDBY database environment, where a target database instance created from a backup image of the production system may apply subsequent incremental transaction log backups. In the STANDBY state, the database is available for read-only access during the periods when transaction logs are not being applied to the standby instance. This process implements a data state that is continually updated as transaction log backups are processed. Databases placed in this recovery mode are only accessible for read operations, and can therefore be used for those processes that need only access a clone of the production database for interrogation or data extraction purposes. The following is an example of the TF/SIM command to create a standby database, which can be used for other purposes such as decision support and regression testing:

tsimsnap restore -d <database_alias> -f <metafile> -m -standby

The -standby keyword specifies that the database is to remain in a standby state; such a standby database can also serve as a backup copy from which the primary database may be restored.
After execution of this command, the cloned database is left in a READ-ONLY state, and may also have subsequent incremental transaction logs applied. An example of the execution of this process is shown in Figure 93 on page 236.

Figure 93 Using TF/SIM and SQL Server VDI to create a standby database

Similarly, a VDI backup created using SYMIOCTL may be restored in a manner that allows for subsequent transaction logs to be applied. The SYMIOCTL command to create a recovering database from a VDI backup is:

symioctl -type SQLServer restore snapshot <database alias> SAVEFILE <metafile> -standby -noprompt

The -standby option implements the same functionality as documented for the preceding TF/SIM command. The VDI metadata file created during the backup process for SYMIOCTL, which relates to this mirrored device set, must be accessible on the host executing the restore operation, and identified using the SAVEFILE option. An example of the execution of this process is shown in Figure 94 on page 237.

Figure 94 Using SYMIOCTL and SQL Server VDI to create a standby database

As indicated, it is possible to apply incremental transaction log backups to this standby database. In the context of creating a cloned environment, this may be useful to occasionally update the database state, and thereby provide more current information from the clone. Log shipping functionality and behavior is documented in Figure 93 on page 236.

Relocating a database copy

Relocating the data files and logs of the copied SQL Server database is a requirement if mounting the database back to the same server that hosts the source database, or if database locations have changed for whatever reason.
Both the TF/SIM utility and the sp_attach_db Transact-SQL stored procedure support database relocation. The components that can be identified in this way are:

◆ Database name
◆ Data file paths
◆ Log file paths

For the sp_attach_db stored procedure, it is only necessary to provide the new locations for the cloned database. This should be done for each logical database component. The example shown in Figure 95 on page 238 attaches a new ProdDB_Clone database with relocated data and log file locations.

Figure 95 Attaching a cloned database with relocated data and log locations

In the case of TF/SIM, the logical component names for the various devices are required. It is possible to interrogate the existing database to find the logical names by executing the sp_helpdb stored procedure.

Figure 96 SQL Query Analyzer executing sp_helpdb

The logical SQL names for the physical files comprising the database are listed in the name column. Using this list, it is possible to create a text file that lists each logical name with a new location, as shown in Figure 97 on page 239.

Figure 97 Mapping database logical components to new file locations

When relocating a database using TF/SIM, as shown in Figure 98 on page 240, the command should be constructed as follows:

tsimsnap restore -d <database_alias> -f <metafile> -rf <relocate mapping file> -m -<recovery style>

Figure 98 Using TF/SIM and VDI to restore a database to a new location

Reinitializing the cloned environment

In many cases, it may be necessary to reinitialize the clone environment.
To facilitate this process, it is necessary to reverse some of the steps that were executed to create the database clone. A number of steps are required to remove the clone database from the target server so that a reinitialization may be processed:

1. Drop or detach the cloned database instance from the target server. This may be executed via the appropriate management UI, or by executing the sp_detach_db Transact-SQL statement.
2. Unmount the volumes from the target server using the appropriate symntctl commands.
3. Reestablish the BCV relationship or recreate sessions as appropriate.
4. Execute the appropriate split functionality as initially used.
5. Remount the volumes using the appropriate symntctl commands.
6. Attach the cloned database in the same manner as originally processed.

Choosing a database cloning methodology

The replication technologies described in previous sections each have pros and cons with respect to their applicability to a given business problem. The matrix in Table 11 on page 242 contrasts the different methods that can be used and their differing attributes.

Table 11 A comparison of database cloning technologies

                               TF/Snap          TF/Clone            TF/Mirror           Replication Manager
Maximum no. of copies          128              Incremental: 16     Incremental: 16     Incremental: 16
                                                Non-inc: Unlimited  Non-inc: Unlimited  Non-inc: Unlimited
No. of simultaneous copies     128              16                  2                   2
Production impact              ACOFW            ACOFW & COA         None                None
Scripting                      Required         Required            Required            Automated
Clone needed a long time       Not recommended  Recommended         Recommended         Recommended
High write usage to DB clone   Not recommended  Recommended         Recommended         Recommended

ACOFW = Asynchronous Copy on First Write
COFW = Copy on First Write
COA = Copy on Access
Table 12 on page 243 shows examples of the choices you might make for database cloning based upon this matrix.

Table 12 Database cloning requirements and solutions

System requirements: The application on the source volumes is very performance sensitive and the slightest degradation will cause responsiveness of the system to miss SLAs.
Replication choices: TimeFinder/Mirror, TimeFinder/Clone, Replication Manager

System requirements: Space and economy are a real concern. Multiple copies are needed and retained only a short period of time, with performance not critical.
Replication choices: TimeFinder/Snap, Replication Manager

System requirements: More than 2 simultaneous copies need to be made. The copies will live for up to a month's time.
Replication choices: TimeFinder/Clone

5 Backing up Microsoft SQL Server Databases

This chapter presents these topics:

◆ EMC Consistency technology and backup
◆ SQL Server backup functionality
◆ SQL Server log markers
◆ EMC products for SQL Server backup
◆ TF/SIM VDI and VSS backup
◆ SYMIOCTL VDI backup
◆ Replication Manager VDI backup
◆ Saving the VDI or VSS backup to long-term media
To mitigate the possibility of complete or partial data loss of a production database, it is necessary to implement some form of backup and recovery process to ensure that a separate, logically valid copy of the database is always available. If the production database is rendered unusable, the logically valid copy may be restored to the production system. Clearly, the currency of the data state of the copied database is important, and backup processes typically implement functionality to ensure that there will be minimal data loss.

For most production databases, the requirements for availability are such that the database must be accessible 24 hours a day, seven days a week. Thus, all backup operations must be processed online with minimal impact to the production system. Inevitably, as the production database grows, processing a full backup becomes more time consuming. While online streaming backup processes can have minimal impact, their duration and timing become a substantial issue.

This chapter describes SQL Server backup processes, and how database administrators can leverage EMC technology in a backup strategy to:

◆ Reduce production system impact during backup processing
◆ Create consistent backup images
◆ Integrate SQL Server backup/recovery processes

EMC Consistency technology and backup

In Chapter 4, "Creating Microsoft SQL Server Database Clones," the use of EMC Consistency technology was described as a method to create restartable copies of a source SQL Server database. This type of technology is noninvasive to production database operations and occurs entirely at the storage array level.
The restartable images represent dependent-write consistent images of all related objects that have been defined within the consistency group. Restartable images created using Consistency technology are particularly valuable in federated environments, where the creation of a consistent restart point for all databases and other objects within the environment is the primary goal. Independently created recoverable images of each individual component of a federated environment make it inordinately complex, if not impossible, to resolve a single business point of consistency across all systems.

Copies of a SQL Server database created by procedures that do not use SQL Server's BACKUP DATABASE Transact-SQL statement, such as those created from a SQL Server shutdown state or from EMC consistent split images, can be maintained as local images or streamed to longer-term media. The Recovery Point Objective (RPO) of such a restartable image is based on a static point in time, and the exposure will increase as the source production environment changes. It is only when the next point of consistency is created that the RPO can be reset to its initial state. During the intervening time between cycles, the RPO point will vary. It is important to control the cycle of such a process to ensure that the solution falls within the maximum RPO.

SQL Server backup functionality

Microsoft SQL Server provides a number of standard tools and interfaces to facilitate backup processing. Within the standard product, administrators can use the built-in backup tools presented by both SQL Server Management Studio and the SQL Server T-SQL programmatic interface. SQL Server Management Studio provides a graphical user interface (GUI) to the major administrative functions within the database environment.
In Figure 99 on page 248, an example of the interface to the backup functionality is provided.

Figure 99 SQL Server Management Studio backup interface

Standard SQL Server backup functionality typically allows for backup of a database to a tape device, or to a file within a file system. In Figure 99 on page 248, a disk file device is to be used for the backup operation. Additionally, SQL Server Management Studio provides a scheduling mechanism (and a maintenance plan) that allows backup operations of this form to be processed. Within the Transact-SQL command interface, a similar process could be invoked through the BACKUP DATABASE statement.

Figure 100 BACKUP DATABASE Transact-SQL execution

Executing regular backup processes, as shown in Figure 100 on page 249, provides a recovery path in the event of serious corruption or failure of the production server and/or the database itself. Backups can be executed in a number of ways, but at the core is the full database backup. It is from a full backup that recoveries are generally processed when required. Full backups also facilitate other incremental backup functions such as incremental transaction log backups. Restoration of these incremental backups then allows for a more efficient method of ensuring the currency of the state of the database.

Microsoft SQL Server fully supports online backup processes, and these are utilized throughout the examples in this chapter. When a production database is relatively small, full online backup operations are very fast and present little operational issue. As database environments grow, such operations can begin to impact the production system. Backups can begin to interfere with user activity, and this is when the value of split mirror disk technology becomes more relevant.
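As a minimal sketch of the streamed backups just described, the Transact-SQL below performs a full backup followed by an incremental transaction log backup. The backup file paths are assumptions for illustration; PROD_DB is the example database name used in this chapter.

```sql
-- Full database backup of the example PROD_DB database to a disk file.
-- The path is hypothetical.
BACKUP DATABASE PROD_DB
    TO DISK = N'H:\Backups\PROD_DB_Full.bak'
    WITH INIT, NAME = N'PROD_DB full backup';

-- Incremental transaction log backup; valid once a full backup exists
-- and the database is using the full (or bulk-logged) recovery model.
BACKUP LOG PROD_DB
    TO DISK = N'H:\Backups\PROD_DB_Log.trn'
    WITH NOINIT, NAME = N'PROD_DB log backup';
```

Replaying the full backup and then the unbroken chain of log backups, in creation order, returns the database to the point in time of the last log backup.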
Before looking at the various EMC technologies applicable to the backup/recovery space for SQL Server, it is important to understand the various recovery models available to a SQL Server database.

Microsoft SQL Server recovery models

The recovery models defined by SQL Server affect the behavior of transaction log management for a database. SQL Server does not implement functionality in the same manner as other RDBMS environments in this respect. There is no functionality that could be referred to as an automatic archive log. The only mechanism that provides similar functionality is the execution of incremental transaction log backups. This behavior is controlled by the recovery models, as defined in the next sections. The recovery models are important to understand as they affect backup, and thus recovery, operations.

SIMPLE recovery model

In this model, SQL Server attempts to manage the size of the transaction log by reclaiming transaction log space that is no longer required for committed transactions. This reclaimed space is then used to record additional in-process transactions. Other RDBMS environments may refer to this as circular logging. In this model, SQL Server can protect the database from events such as a server restart, but is unable to recover from a backup and roll forward. Transaction log backups are not required or supported in this mode; therefore, there is no chain of transaction log backups to replay. The Recovery Point Objective (RPO) of this state in the event of complete system loss is that of the last full backup.

FULL recovery model

In this model, all updates to the database are recorded in the transaction log, and the log is neither automatically truncated nor is the space within the transaction log reused dynamically. Reuse is only implemented after incremental transaction log backups.
The use of the full recovery model ensures the creation of a continuous sequence of transaction log records that allows for recovery of a given database to essentially any required point. Incremental transaction log backups can be used in combination with full backups to recover from a total system loss to the point in time of the last transaction log backup. In general, until a full database backup has been completed for a database in the full recovery model, the database behaves as if it were in the simple recovery model. After the first full backup, the full recovery functionality is implemented.

BULK-LOGGED recovery model

One major problem for customers is the size of the transaction log file when certain operations are executed. For example, large data load processes may be executed in some environments. When utilizing the full recovery model, the transaction log may grow to a significant size. To mitigate this effect, certain operations can be minimally logged in the bulk-logged model. This provides much of the recoverability of the full recovery model and addresses some of the transaction log size issues under certain conditions.

It is possible to switch between the various recovery models. Microsoft documents the restrictions on this switching process in the Books Online documentation under Switching Recovery Models. For the purposes of the remainder of this document, we will assume that the database is in the full recovery model, since this is the most applicable form. EMC does not recommend or support the use of the simple recovery model with any production database.

The recovery model for a database may be set through SQL Server Management Studio by selecting the properties of the required database. In the case of our example, the PROD_DB database has the full recovery model selected.
In Figure 101 on page 252, the other recovery models are displayed in the drop-down menu.

Figure 101 Recovery model options for a SQL Server database

Modification of the recovery model used for a given database is also supported via the Transact-SQL command ALTER DATABASE. Customers may find the need to switch between the full recovery and bulk-logged models to cater for operational requirements and to mitigate transaction log growth rates during load operations. Switching between the full and bulk-logged models is fully supported and does not require any specific change in operational behavior. Customers should understand the limitations of the bulk-logged model of operation, and should utilize the full recovery model where possible.

Figure 102 Setting recovery model via Transact-SQL

Types of SQL Server backups

Microsoft SQL Server supports multiple forms of database backups. The simplest to consider is the full backup. This is a complete copy of the data and transaction log files for the specified database. Clearly, this form of creating a full database backup can be expensive in terms of processing time and impact to the production environment. In the case of online backup to tape, SQL Server's database engine requires CPU and I/O resources to stream all database pages out of the data files and transaction log.

In addition to the full backup, SQL Server supports a differential backup process. This form of backup copies only the data changed since the last full backup. It can be restored after a full backup to return the database to the state at the time the differential backup was executed. This form of backup can be used in combination with transaction log backups to return to a consistent state. Incremental transaction log backups are the last form of backup discussed here.
Transaction log backups copy all the current transactional activity recorded for a database from the transaction log. Once the transaction log data has been backed up and is no longer required for transactional recovery, the SQL Server database engine can reuse space within the transaction log file.

As discussed in "Microsoft SQL Server physical components" on page 31, a physical transaction log is subdivided by SQL Server into a number of logical virtual log files. Once a virtual log is full and does not contain any information pertaining to the transaction log records that constitute the active portion of the transaction log, the virtual log may be marked for reuse. This occurs after an incremental transaction log backup has been executed and the log records have been backed up. The mechanisms that control this reuse are more complex than described here. Additional, extensive discussion of the transaction log, log records, virtual logs, and the active log definition may be found in the SQL Server Books Online documentation.

SQL Server allows for the restoration of a full database backup and the application of incremental transaction log backups created since that full backup. In this way, the amount of data loss is minimized, and any data loss will be related to the amount of time since the last incremental backup. The full chain of incremental logs must be replayed in the order in which they were created.

Both differential backups and incremental transaction log backups help to mitigate the processing time required for a logical backup process; this time relates directly to a requirement often referred to as the recovery time objective (RTO). The RTO determines how much time is allowed to return the database to service. This is independent of what actual process may be used.
For example, RTO does not speak to whether a D/R solution is being used, or whether a restore to the production system is required. It is simply concerned with the amount of time required to restore service.

SQL Server log markers

SQL Server automatically maintains timestamp information for transaction log records. The timestamps are based on the system clock of the production system itself. This automatic inclusion of timestamps facilitates rolling forward to a specific point in time when subsequently applying incremental transaction log backups.

In addition to the timestamps, it is possible to record a marked transaction within the transaction log. This typically represents a logical point in time; a point that may reflect the state before some operational process, rather than one based on a specific time. For example, it may be useful to record the logical point before a large update process within the database. Then, in the event of a system failure, it is possible to return to the state before the start of the update. Recording a marked transaction may be executed in a manner similar to the following example:

BEGIN TRANSACTION myUpdate WITH MARK 'Before_Update'
go
COMMIT TRANSACTION myUpdate
go

Restoration to a logical transaction marker is discussed in Chapter 6, "Restoring and Recovering Microsoft SQL Server Databases."

EMC products for SQL Server backup

In general, EMC products are positioned to facilitate the core full backup process implemented by SQL Server. Since the functionality of storage arrays is generally based on the logical unit number (LUN), they facilitate backup and restore operations at the full LUN level.
This means that incremental backup functions, which are typically processed by the SQL Server engine, cannot be executed by a disk mirror environment. However, it is possible to implement functions such as file group backup processes, which allow for the creation of backup images of only parts of the entire database. It should be noted that the backup of a file group is only implemented as a full backup of the file group itself, rather than an incremental backup of the file group. This then allows for the separate restoration of a single file group should that be appropriate, rather than the restoration of the entire physical database.

It is expected that customers will combine EMC tools that create full backups with additional processes that serve to enhance backup and restore functions. Therefore, it is typical for full backups created at the storage level to be enhanced with regular transaction log backups, used to mitigate any data loss should a restore be necessary.

To create a valid disk mirror backup image of a SQL Server database, it is necessary to interface with the owning SQL Server instance itself. This functionality is required to quiesce the database and suspend I/O so that a split mirror image may be created. EMC has created a number of products and utilities to support the broad range of operational modes that various customer environments require. These products and utilities range from graphical user applications, such as Replication Manager, to command line products such as the TimeFinder SQL Server Integration Module (TF/SIM). Additionally, within the Solutions Enabler product, command line utilities exist that can initiate these processes in discrete steps. This range of products allows customers to select the best solution to match their specific requirements.
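For comparison with the disk mirror approach, a native file group backup issued through SQL Server itself takes a form such as the following sketch. The file group name and backup path are assumptions for illustration, not from this document.

```sql
-- Full backup of a single file group; note this is a full copy of that
-- file group, not an incremental backup of it.
BACKUP DATABASE PROD_DB
    FILEGROUP = N'SALES_FG'            -- hypothetical file group name
    TO DISK = N'H:\Backups\PROD_DB_SalesFG.bak'
    WITH INIT;
```

This lets a single volatile file group be backed up (and later restored) on its own, while static file groups are excluded from the daily cycle.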
Integrating TimeFinder and Microsoft SQL Server

Integration of EMC's TimeFinder functionality with Microsoft SQL Server for backup and restore operations centers on the use of the SQL Server Virtual Device Interface (VDI). Within SQL Server 2005 and SQL Server 2008, Microsoft provides the VDI functionality to support disk mirror snapshot backup operations, which are enabled by storage arrays. Additionally, for SQL Server 2008, Microsoft introduced full support of the Volume Shadow Copy Service (VSS) for implementing backup operations. EMC TimeFinder technology provides support for clones, snaps, and traditional TimeFinder BCVs. These technologies are described in detail in Chapter 2, "EMC Foundation Products."

Since the SQL Server VDI and VSS implementations are programmatic interfaces, users cannot use these functions directly, and vendors are required to provide end-user tools that make use of the respective API. EMC provides a number of end-user tools that integrate the functionality of the Virtual Device Interface with TimeFinder operations.

Microsoft SQL Server VDI also provides support for a more granular form of file group-level operation. Many of the EMC products support the creation and subsequent restoration of these file groups where implemented. The ability to target specific file groups for backup and restore operations may be beneficial in configurations in which only a specific file group needs to be backed up or restored. Operationally, it is more efficient to designate only that file group, and exclude the remaining file groups, which may be static and unchanging, from daily backup cycles. Using file group operations, administrators can optimize backup and restore operations against a specific target file group.
It is also possible to specify a particular file group for restore operations from a disk mirror backup based on a full backup of the database. This allows for maximum flexibility for all customer requirements.

The tools typically fall into two categories: command line interface (CLI) tools and graphical user interface (GUI) tools. In the CLI area, EMC provides the TimeFinder SQL Server Integration Module (TF/SIM) as a separate product, and the SYMIOCTL tool as a part of the Solutions Enabler package. In the GUI area, users can take advantage of Replication Manager (RM) to control the creation and management of backup images (referred to as replicas). Additionally, RM allows for the automation and scheduling of replica creation.

Both TF/SIM and RM provide interfaces to back up and restore databases using database snapshot backup/restore functionality through the SQL Server Virtual Device Interface (VDI). TF/SIM also implements support for the SQL Server Volume Shadow Copy Service (VSS) interface. Both RM and TF/SIM provide an integrated method for quickly creating point-in-time backups of databases, which can then be used for decision support systems, database maintenance commands, and disaster recovery.

Disk mirror backups vs. streamed backups

A VDI- or VSS-enabled snapshot database backup differs from a streamed backup because the actual data pages that are normally streamed out through SQL Server itself during a standard backup do not have to be processed in this manner. This is because the disk mirror (BCV, snap, or clone) results in an identical copy of the database. During the VDI or VSS snapshot creation, SQL Server ensures that the image on the production volumes (and therefore on the relevant disk mirrors) is placed in a valid backup state.
Creation of this valid image requires SQL Server to execute a checkpoint to flush all updated pages to the data files, and then suspend all updates. Because all updates are suspended for the snapshot creation, all write activity to the transaction log is also suspended. This provides a logically consistent disk state for the database, because the transaction log and data file states are flushed to disk. The time it takes to checkpoint the database, write the metadata file to disk, and split or activate the disk mirror devices from their source volumes determines the total duration of the snapshot backup process. In comparison, the duration of a traditional backup is determined by the time it takes to write every allocated data page in the database to disk or tape.

Note: The duration of the I/O suspension covers only the disk mirror split or activate process.

During the VDI and VSS snapshot backup, some additional data, referred to as metadata, is recorded. This metadata is usually on the order of tens of kilobytes (KB) in total size and contains information about the various files that make up the database, plus some additional log information. These metadata files are used during a restoration to automate much of the processing.

Effects of SQL Server VDI backups on SQL Server databases

A VDI snapshot backup generally has minimal impact on database performance during its execution. Most importantly, user connections to the SQL Server are not broken during this process. Read access is maintained while write operations are suspended. Transactions that execute a commit operation while the VDI backup is being processed may be suspended because of their write requirement. Most VDI backup operations execute within a matter of seconds, although the VDI implementation itself does not implement a timeout value.
TF/SIM introduced enhanced support for a manual timeout setting for VDI operations, which is documented in the product guide.

Effects of VSS backups on SQL Server databases

The VSS implementation provided with SQL Server 2008 offers a structured framework for executing SQL Server disk-based backup operations. The VSS framework has requirements similar to those of VDI. Operationally, the effects on I/O operations are also similar, although the VSS framework does implement a threshold value for processing the backup. The threshold value for VSS operations is 10 seconds. If a disk mirror-based backup exceeds this limit, the backup aborts and I/O operations proceed as normal.

EMC Storage Resource Management

EMC provides additional tools to facilitate creation and management of TimeFinder configurations when using a database environment such as Microsoft SQL Server. A number of these tools are available within the EMC Solutions Enabler package, as described in Chapter 2, “EMC Foundation Products,” and form part of the Storage Resource Management (SRM) infrastructure. These tools are discussed in this section to provide a more complete picture of the various operations. Products such as RM/Local use similar mechanisms, but may not expose the underlying operations to the user.

Note: The utilization of the SRM commands is not a specific requirement, but is used here to demonstrate functionality. Products such as Replication Manager monitor and manage required devices in a much more dynamic manner, and do not require that administrators utilize device groups.

As a part of the SRM utility set, the symrdb command line executable provides specific functionality for discovering, managing, and reporting on Microsoft SQL Server instances, as well as other database platforms.
In Figure 103 on page 261, the symrdb utility is used to list the database files used by the PROD_DB database. In this case, SRM's mapping functionality shows the file locations for the database itself. In this form, the utility provides the same locations as those shown through the SQL Server Management Studio GUI and through the Transact-SQL sp_helpdb stored procedure.

The symrdb command line utility can also be used to define any required TimeFinder device groups for managing a split mirror environment. SRM, through the mapping functions, is able to relate SQL Server data and transaction log files to Symmetrix volumes, and then use this information to populate the respective device groups. In the following example, symrdb is used to define a new device group for those Symmetrix devices used by the data and transaction log files of the specified database. Alternate manual methods to create the device groups and allocate volumes are also possible, although utilizing SRM ensures that all devices are appropriately captured.

symrdb -type SQLServer -db <database> rdb2dg <device_group>

It is not generally possible to use data files independently of the current transaction log, nor is the transaction log of significant value on its own. Backup/restore processes typically require both the data and transaction log files to be available together. Therefore, creating and managing a single device group is recommended.

The final step is to add the specific type of mirror devices to the device group to implement split mirror functionality. BCV devices must be the same size as the source (STD) devices to support TimeFinder operations. Clone devices may be larger than the source STD volumes. For utilization of TimeFinder/Snap functionality, the relevant VDEV devices must be added to the device group.
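The device group creation step above can be scripted. The following POSIX-shell sketch is illustrative only: the database name PROD_DB, the group name prod_dg, and the RUN dry-run variable are assumptions introduced for this example, not values mandated by the products.

```shell
# Hedged sketch: populate a device group from SRM's database-to-device
# mapping. Set RUN=echo to print the command instead of invoking
# Solutions Enabler.
RUN="${RUN:-}"

make_sql_devgroup() {
    db="$1"; dg="$2"
    # SRM maps the data and log files of the database to Symmetrix
    # devices and places those devices into the device group.
    $RUN symrdb -type SQLServer -db "$db" rdb2dg "$dg"
}
```

For example, `RUN=echo make_sql_devgroup PROD_DB prod_dg` previews the command that would be issued.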
Figure 103 SYMRDB listing SQL database file locations

Details regarding adding BCV, clone, and snap devices, and establishing TimeFinder relationships, are documented in the relevant EMC documentation listed in Appendix A, “Related Documents.” The disk mirror devices are also typically presented through the SAN to a backup or target host, which is then able to mount or use the BCV devices in some manner. Mapping and presenting these split mirror devices is beyond the scope of this document, and is covered in the relevant EMC technical documentation.

Once the relevant mirror devices have been selected, mapped to the appropriate alternate host, and synchronized, it is possible to obtain information regarding the volumes in use. Using the respective SYMCLI command, it is possible to query the status of the device group, as shown in Figure 104 on page 262. In this instance, the symclone command is utilized, but it would also be possible to interrogate using the symmir or symsnap command line utilities, should their respective style of devices be in use.

In general, before initiating any synchronization, resynchronization, creation, activation, or restore process, ensure that any secondary target or backup system is shut down or has the volumes unmounted. Failure to do so leads to a corrupted file system image on the secondary system, and may produce unpredictable results for the database clone image.

Figure 104 SYMCLONE query of a device group

Prior to activating the clone targets against the source volumes, a create operation is required to associate the clone targets with the source devices.
For example:

symclone -g <device_group> create -tgt -noprompt

The create operation sets up protection bitmaps for the devices and prepares the environment for activation. It is also possible to execute a pre-copy operation, which results in all tracks being synchronized to the clone targets, providing behavior similar to that of BCV operations. It is also possible to utilize incremental operations for TimeFinder/Clone devices through the -differential command line option, which later supports incremental update functions.

TF/SIM VDI and VSS backup

The TimeFinder SQL Server Integration Module is a command line utility that interfaces with the SQL Server Virtual Device Interface and Volume Shadow Copy Service frameworks. Additionally, TF/SIM ensures the appropriate storage-level states for the supported split mirror devices.

Note: The TimeFinder SQL Server Integration Module can be configured for local or remote modes of execution. The discussions in this chapter refer only to the local form of backup operation for VDI processing, and remote processing for VSS. The different forms of execution have different preconditions for the mirror devices being used. For example, local mode requires TimeFinder/Mirror devices to be synchronized, whereas remote mode requires that they are not currently established. Refer to the product guide for the TimeFinder SQL Server Integration Module, which lists precondition requirements and execution modes. Refer to Appendix A, “Related Documents,” for information on related documentation.

Using TimeFinder/Mirror

TimeFinder/Mirror is an EMC software product that allows an additional hardware mirror to be attached to a source volume. The additional mirror is a specially designated volume in the Symmetrix configuration called a business continuance volume, or BCV.
The BCV is synchronized to the source volume through a process called an establish. While the BCV is established, it is Not Ready to all hosts. At an appropriate time, the BCV can be split from the source volume to create a complete point-in-time copy of the source data that can be used for multiple purposes, including backup, decision support, and regression testing.

Note: For VMAX array implementations, TimeFinder/Mirror is generally replaced with TimeFinder/Clone operations. Coverage is provided here for completeness.

In general, those EMC products that utilize the SQL Server VDI and VSS also implement EMC Storage Resource Management (SRM), which dynamically maps database files to array volumes. Thus, the operations of these tools are independent of the SYMCLI device groups that administrators use to monitor volume states and relationships. However, device groups may still be used to manage the TimeFinder state when certain preconditions must be met. For example, local execution of TF/SIM on production database servers requires that TimeFinder/Mirror devices are in a synchronized state; device groups would be used to ensure that state is created.

The process demonstrated in Figure 105 on page 265 shows the steps required to create a split-mirror database backup of a Microsoft SQL Server database using TimeFinder/Mirror and TF/SIM executed in local mode on the production server:

1. The first action is to establish the BCVs to the standard devices. This operation occurs in the background and should be executed in advance of when the BCV copy is needed.

symmir -g <device_group> establish -full -noprompt

Note: The first iteration of the establish needs to be a full synchronization. Subsequent iterations are incremental and do not need the -full keyword.
Once the command is issued, the array begins the synchronization process using only Symmetrix resources. Since this is asynchronous to the host, the process must be interrogated to see when it is finished. The command to interrogate the synchronization process is:

symmir -g <device_group> verify

This command returns a 0 return code when the synchronization operation is complete.

2. When the volumes are synchronized, the TF/SIM backup operation may be executed:

tsimsnap backup -d <database> -f <location for metadata>

The target production database is provided after the -d command line option. The -f command line option specifies the location where TF/SIM will store the VDI metadata file.

Figure 105 summarizes the sequence: (1) establish the BCVs to the standard devices; (2) execute the TF/SIM backup command, during which (2.1) TF/SIM validates the BCV state, (2.2) SQL Server executes a checkpoint, (2.3) SQL Server suspends write I/O, (2.4) TF/SIM splits the BCVs, and (2.5) SQL Server resumes I/O.

Figure 105 Creating a TimeFinder/Mirror VDI or VSS backup of a SQL Server database

Using TimeFinder/Clone

TimeFinder/Clone is an EMC software product that copies data internally in the Symmetrix array. A TimeFinder/Clone session is created between a source data volume and a target volume. The target volume must be equal to or greater in size than the source volume. The source and target for TimeFinder/Clone sessions can be any hypervolumes in the Symmetrix configuration. TimeFinder/Clone devices are managed together using SYMCLI device or composite groups. Solutions Enabler commands are executed to create SYMCLI groups for TimeFinder/Clone operations. If the database spans more than one Symmetrix, a composite group is used.
Examples of these commands can be found in Appendix A, “Related Documents.” Figure 106 on page 266 depicts the steps required to make a VDI backup of a Microsoft SQL Server database onto clone devices using TimeFinder/Clone: (1) create the clone session; (2) execute the TF/SIM backup command, during which (2.1) TF/SIM validates the clone state, (2.2) SQL Server executes a checkpoint, (2.3) SQL Server suspends write I/O, (2.4) TF/SIM splits the clones, and (2.5) SQL Server resumes I/O.

Figure 106 Creating a VDI or VSS backup of a SQL Server database with TimeFinder/Clone

1. The first action is to create and optionally synchronize the TimeFinder/Clone pairs. The following command creates the TimeFinder/Clone pairings and protection bitmaps:

symclone -g <device_group> create -tgt -noprompt

This operation occurs instantaneously. In cases where a pre-copy of all tracks is required, the execution should be initiated in advance of when the activation is needed.

Note: The clone session creation supports the use of a pre-copy operation to execute a full synchronization prior to activation. Additionally, it is possible to execute incremental resynchronization by using the -differential option. Subsequent iterations will then be incremental.

If pre-copy is requested, the array begins the synchronization process using only Symmetrix resources. The command to interrogate the precopy process is:

symclone -g <device_group> verify

2. When the volumes are suitably prepared, the TF/SIM backup operation may be executed.
To execute a TF/SIM backup using VDI and clone devices, the tsimsnap command may be used in the following manner:

tsimsnap backup -d <database> -clone -f <location for metadata>

In this example, the target production database is provided after the -d command line option. The -clone option specifies that TimeFinder/Clone is to be utilized for the backup operation. The -f command line option specifies the location where TF/SIM will store the VDI metadata file.

To execute a TF/SIM backup using VSS and clone devices, the tsimvss command may be used from a remote backup host in the following manner:

tsimvss backup -ps <production host> -d <database> -clone -bcd <location for bcd metadata> -wmd <location for wmd metadata>

Databases copied using TimeFinder/Clone are subject to Copy on Access (COA) penalties. Copy on Access means that if a track on a target volume is accessed before it has been copied, it must first be copied from the source volume to the target volume. This causes additional disk read activity on the source volumes and can be a source of disk contention on busy systems.

An example of the execution of a TF/SIM VDI clone backup operation is shown in Figure 107 on page 268.

Figure 107 TF/SIM VDI backup using TimeFinder/Clone

Execution from a remote host of a TF/SIM VSS backup operation using TimeFinder/Clone is demonstrated in Figure 108 on page 269.

Figure 108 TF/SIM remote VSS backup using TimeFinder/Clone

Using TimeFinder/Snap

TimeFinder/Snap enables users to create complete copies of their data while consuming only a fraction of the disk space required by the original copy.
TimeFinder/Snap is an EMC software product that maintains space-saving, pointer-based copies of disk volumes using virtual devices (VDEVs) and save devices (SAVDEVs). The VDEVs contain pointers either to the source data (when it is unchanged) or to the SAVDEVs (when the data has been changed). TimeFinder/Snap devices are managed together using SYMCLI device or composite groups. Solutions Enabler commands are executed to create SYMCLI groups for TimeFinder/Snap operations. If the database spans more than one Symmetrix, a composite group is used. Examples of these commands can be found in Appendix A, “Related Documents.”

Figure 109 on page 271 depicts the steps required to make a VDI backup of a SQL Server database using TimeFinder/Snap:

1. The first action is to create the TimeFinder/Snap pairs. The following command creates the TimeFinder/Snap pairings and protection bitmaps. No data is copied or moved at this time:

symsnap -g <device_group> create -svp <pool name> -noprompt

It is necessary to specify the save pool that will store the pre-update contents of tracks changed by update activity against the source volumes once the snap session is activated.

2. No prior copying of data is necessary. The create operation establishes the relationship between the standard devices and the virtual devices (VDEVs), and also creates the protection metadata. After the snap is created, the TF/SIM backup operation may be executed as appropriate.

To execute a TF/SIM backup using VDI and snap devices, the tsimsnap command may be used in the following manner:

tsimsnap backup -d <database> -snap -f <location for metadata>

In this example, the target production database is provided after the -d command line option. The -f command line option specifies the location where TF/SIM will store the VDI metadata file. A sample execution of this TF/SIM process is shown in Figure 110 on page 272.
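The two snap steps above can be combined in a small script. The following POSIX-shell sketch is illustrative only: the device group, save pool, database, and metadata file names are assumed placeholders, and the RUN dry-run variable is a scripting convenience introduced here, not part of the products.

```shell
# Hedged sketch of the TimeFinder/Snap VDI backup sequence above.
# All argument values are placeholders; set RUN=echo to preview the
# commands without Solutions Enabler or TF/SIM installed.
RUN="${RUN:-}"

snap_vdi_backup() {
    dg="$1"; pool="$2"; db="$3"; meta="$4"

    # 1. Create the snap session; only pointers and protection
    #    metadata are set up, and no data is copied at this time.
    $RUN symsnap -g "$dg" create -svp "$pool" -noprompt || return 1

    # 2. TF/SIM drives the VDI checkpoint/write-suspend, activates
    #    the snap session, and writes the VDI metadata file.
    $RUN tsimsnap backup -d "$db" -snap -f "$meta"
}
```

For example, `RUN=echo snap_vdi_backup prod_dg sql_pool PROD_DB meta.bcd` prints the two commands in sequence.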
To execute a TF/SIM backup using VSS and snap devices, the tsimvss command may be used from a remote backup host in the following manner:

tsimvss backup -ps <production host> -d <database> -snap -bcd <location for bcd metadata> -wmd <location for wmd metadata>

Figure 109 summarizes the snap sequence: (1) create the snap session; (2) execute the TF/SIM backup command, during which (2.1) TF/SIM validates the snap state, (2.2) SQL Server executes a checkpoint, (2.3) SQL Server suspends write I/O, (2.4) TF/SIM activates the snap session, and (2.5) SQL Server resumes I/O. Device pointers run from the VDEV to the original data, and changed data is copied to the save devices due to copy on write.

Figure 109 Creating a VDI backup of a SQL Server database with TimeFinder/Snap

Figure 110 TF/SIM backup using TimeFinder/Snap

SYMIOCTL VDI backup

SYMIOCTL is a part of the EMC Storage Resource Management framework, which is a separately licensed EMC product. Implemented as a command line interface, SYMIOCTL breaks VDI snapshot functionality for SQL Server environments into discrete steps, allowing specific actions to be implemented between the begin and end snapshot commands. This is in contrast to products such as TF/SIM and Replication Manager, which provide structured approaches for managing known styles of execution. Serious consideration should be given to solutions based on this style of functionality, and great care should be taken with user-created scripts. Consider issues such as the abnormal termination of a user-generated script after the begin snapshot and before the end snapshot.
Such an abnormal termination would leave the target database in a VDI-suspended state and could result in termination of the database itself by the SQL Server instance. Products such as TF/SIM and Replication Manager implement functionality in a well-structured and controlled manner to ensure that impact to production databases is mitigated.

Using TimeFinder/Mirror

The SYMIOCTL command line utility allows for execution of VDI processing against SQL Server databases in a TimeFinder environment. Unlike TF/SIM, however, all processing of the device group and the state of the mirror devices within the group must be managed through command line operations. Since the execution of SYMIOCTL is somewhat independent of any TimeFinder operations, it can simply be executed at the appropriate time to place the database in the required state. Additionally, there is no default ability to flush the file system buffers.

Note: In the preceding TF/SIM examples, TF/SIM ensures that file system buffers are also flushed.

To implement this functionality with SYMIOCTL, it is necessary to use the Symmetrix Integration Utility (SIU) to perform the flush operation. The flush operation ensures that other file system-level operations are successfully written to disk. The SYMIOCTL process therefore executes in the following manner, assuming that the device group has been correctly defined and that the requisite mirror devices (BCVs, clones, or snap VDEVs) have been implemented:

1. The first action is to implement the required disk mirror device state.
For TimeFinder/Mirror devices, the BCVs should be synchronized with the standard volumes using the following TimeFinder command:

symmir -g <device_group> establish [-full] -noprompt

For TimeFinder/Clone devices, the clone session should be created using the following TimeFinder command:

symclone -g <device_group> create -noprompt

For TimeFinder/Snap devices, the snap session should be created using the following TimeFinder command:

symsnap -g <device_group> create -noprompt

2. When the volumes are synchronized and/or the sessions have been created, the SYMIOCTL backup operation may be executed:

symioctl -type SQLServer begin snapshot <database> SAVEFILE <location of metafile> -noprompt

In this example, the SAVEFILE command line option specifies the location where SYMIOCTL will store the VDI metadata file.

3. Flush all database locations, either drive letters or mount points, in use by using SIU. In this example, the locations are mount points:

symntctl flush -path <mount point for volume>

Execute this for each and every database file and log location.

4. Split the devices.

For TimeFinder/Mirror:

symmir -g <device_group> split -instant -noprompt

For TimeFinder/Clone:

symclone -g <device_group> activate -noprompt

For TimeFinder/Snap:

symsnap -g <device_group> activate -noprompt

5. Finalize the SYMIOCTL backup command and thaw I/O operations on the SQL Server database:

symioctl -type SQLServer end snapshot <database> -noprompt

It is the responsibility of the administrator to ensure that the SYMIOCTL end snapshot command is executed within the script; there is no automatic termination of the command. Leaving the database in snapshot state will suspend all user connections and may cause SQL Server to take the database offline.
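One way to honor that responsibility is to wrap the sequence in a script whose cleanup handler always issues the end snapshot. The POSIX-shell sketch below covers the TimeFinder/Mirror case; the argument names and the RUN dry-run variable (set RUN=echo to preview the commands) are illustrative assumptions, not product requirements.

```shell
# Hedged sketch: a SYMIOCTL TimeFinder/Mirror backup wrapper that
# guarantees "end snapshot" runs even on abnormal termination.
# Arguments and RUN (set RUN=echo for a dry run) are illustrative.
RUN="${RUN:-}"

symioctl_mirror_backup() {
    db="$1"; dg="$2"; meta="$3"; mount="$4"

    # Step 2: freeze - place the database in the VDI snapshot state.
    $RUN symioctl -type SQLServer begin snapshot "$db" SAVEFILE "$meta" -noprompt || return 1

    # Safety net: from here on, thaw the database if the script is
    # killed or exits before the explicit end snapshot below.
    trap "$RUN symioctl -type SQLServer end snapshot $db -noprompt" EXIT INT TERM

    # Step 3: flush the file system buffers for the database location.
    $RUN symntctl flush -path "$mount"

    # Step 4: split the mirrors while writes are suspended.
    $RUN symmir -g "$dg" split -instant -noprompt

    # Step 5: thaw - resume I/O, then clear the safety-net trap.
    $RUN symioctl -type SQLServer end snapshot "$db" -noprompt
    trap - EXIT INT TERM
}
```

The trap is the design point: between begin and end snapshot, any exit path still thaws the database, which addresses the abnormal-termination scenario described above.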
Any user-created process or script must cater for abnormal termination of the process or script itself, and ensure that the end snapshot is executed as a cleanup operation. An example of the execution of the SYMIOCTL command using TimeFinder/Mirror is shown in Figure 111 on page 276.

Figure 111 Sample SYMIOCTL backup using TF/Mirror

Replication Manager VDI backup

EMC Replication Manager (RM) can be used to manage and control the TimeFinder copies of a SQL Server database. The RM product has a GUI and a command line, and provides the capability to:

◆ Auto-discover the standard volumes holding the database
◆ Identify the path names for all data and transaction log file locations

Using this information, RM can set up TimeFinder groups with BCVs or VDEVs, schedule TimeFinder operations, and manage the creation of database copies, expiring older versions as needed. Figure 112 on page 277 demonstrates the steps performed by Replication Manager using TimeFinder/Mirror to create a database copy that can be used for multiple other purposes: (1) RM/Local discovers and maps database files and logs; (2) RM/Local establishes BCVs to STD devices; (3) RM/Local executes a SQL VDI call, whereupon (3.1) SQL Server executes a checkpoint and suspends writes; (4) RM/Local splits the BCVs from the STDs; (5) RM/Local executes another SQL VDI call, whereupon (5.1) SQL Server resumes I/O; and (6) RM/Local copies the VDI metadata file.

Figure 112 Using RM to make a TimeFinder replica of a SQL Server database

1. Replication Manager maps the database locations of all the data files and logs. The Symmetrix standard volumes holding those locations are discovered.
Note: The dynamic nature of this activity handles the situation when extra volumes are added to the database; the procedure does not have to change.

2. Replication Manager then establishes the BCVs to the standard volumes in the Symmetrix. Replication Manager polls the progress of the establish until the BCVs are synchronized, and then moves on to the next step.

3. Replication Manager executes a SQL Server VDI database call to checkpoint and write-suspend the database.

4. Replication Manager then issues a TimeFinder split to detach the BCVs from the standard devices.

5. Next, Replication Manager executes another SQL Server VDI call to resume all I/O operations to the database.

6. Finally, Replication Manager copies the VDI metadata file from the source host to an RM holding area for use later in a restore.

Saving the VDI or VSS backup to long-term media

Once the VDI or VSS backup has completed, the result is a consistent set of data and transaction log files on the mirror devices. These devices are typically presented to a secondary (or backup) server, where they can be mounted as valid NTFS volumes, as either drive letter locations or mount points, as required.

Note: To retain the validity of the backup state, the files should not be attached to a SQL Server instance for any further processing. Instead, the file system objects (the files) themselves should be copied to whatever long-term storage media is required. The backup state represented by the files should be maintained until the backup needs to be restored or processed in some manner.

It is also important to maintain a copy of the metadata files created during the VDI or VSS backup process.
Each VDI or VSS metadata file is required to facilitate recovery against the volumes containing the associated backup image. The VDI or VSS metadata files should be stored in a location accessible to the host executing the recovery process.

6 Restoring and Recovering Microsoft SQL Server Databases

This chapter presents these topics:

◆ SQL Server restore functionality ............................................. 283
◆ EMC Products for SQL Server recovery ................................. 288
◆ EMC Consistency technology and restore .............................. 289
◆ TF/SIM VDI and VSS restore ................................................... 292
◆ SYMIOCTL VDI restore ............................................................ 327
◆ Replication Manager VDI restore ............................................ 338
◆ Applying logs up to timestamps or marked transactions .... 340

Recovering a production database is an event that all DBAs hope is never required. Nevertheless, database administrators must be prepared for unforeseen events such as media failures or user errors, which require database recovery operations. The keys to a successful database recovery include:

◆ Identifying database recovery time objectives
◆ Planning the appropriate recovery strategy based upon the backup type (full, incremental)
◆ Documenting the recovery procedures
◆ Validating the recovery process

Microsoft SQL Server database recovery depends upon the backup methodology used.
With the appropriate backup procedures in place, a SQL Server database can be recovered to any point in time between the end of the full backup and the point of failure, using a combination of the full backup data and log files together with any incremental transaction log backups. Recovery typically involves copying the previously backed-up files into their appropriate locations and, if necessary, performing recovery operations to ensure that the database is recovered to the appropriate point in time and is consistent.

To mitigate the amount of data loss sustained by a production database, the first step is to ensure that a complete chain of database backups and incremental log backups is maintained.

Note: The first step before any restoration procedure on a production system is to create an incremental backup of the current transaction log state. This final incremental transaction log backup becomes the last log sequence available, and is required to minimize data loss. If it is not possible to create a backup of the current transaction log contents in a form that enables replay, loss of changes will result.

This section assumes that EMC technology has been used in the backup process as described in Chapter 5, “Backing up Microsoft SQL Server Databases.”

SQL Server restore functionality

Microsoft SQL Server provides a number of standard tools and interfaces to facilitate restore processing. Within the standard product, administrators can use the built-in backup tools presented by both SQL Server Management Studio and the SQL Server T-SQL programmatic interface. SQL Server Management Studio provides a graphical user interface (GUI) to the major administrative functions within the database environment.
Figure 113 on page 283 provides an example of the interface to the restore functionality.

Figure 113 SQL Enterprise Manager restore interface

Restore operations present a significantly larger array of alternatives than backup operations. These options implement functions such as partial restore operations that, for example, allow only a single filegroup within the database to be restored. It is also possible, when restoring a full database backup, to relocate the files during the restore operation. Additionally, since a full database backup may have been followed by incremental transaction log backups, options are provided to allow subsequent restoration of these transaction log backups against the recovered database to mitigate data loss. Figure 114 on page 284 shows the Options tab of the restore window within SQL Server Management Studio, displaying these abilities. Of special note in Figure 114 is the Recovery State. These completion states facilitate the ability to apply subsequent incremental transaction logs to the restored database. These options are described next.

Figure 114 Additional Enterprise Manager restore options

SQL Server – RESTORE WITH RECOVERY

This is the "Leave the database ready to use by rolling back uncommitted transactions" option shown in Figure 114 on page 284. When the WITH RECOVERY keywords are used during a SQL Server restore operation, no subsequent transaction log backups are to be applied to the restored database. SQL Server then executes internal roll-forward and rollback operations to ensure a transactionally consistent state of the database. Once completed, the database is placed in a standard online state.
SQL Server – RESTORE WITH NORECOVERY

This is the Leave the database non-operational, and do not roll back uncommitted transactions option shown in Figure 114 on page 284. When the WITH NORECOVERY keywords are used during a SQL Server restore operation, subsequent incremental transaction logs are expected to be applied to the database being restored. Therefore, at the end of the database restore, or of subsequent incremental transaction log restores, SQL Server will not carry out any rollback recovery operations. A database that is placed in a NORECOVERY state is not available for any user access. Finally, any sequence of restore operations executed with NORECOVERY must be concluded with a RESTORE … WITH RECOVERY statement, which initiates the process outlined in “SQL Server – RESTORE WITH RECOVERY” on page 284.

SQL Server – RESTORE WITH STANDBY

This is the Leave the database in read-only mode option shown in Figure 114 on page 284. Unlike the NORECOVERY process, which does not allow access to the database being restored, the STANDBY mode implements additional steps to allow read-only access to the database during those times when incremental transaction logs are not being applied. The RESTORE LOG … WITH STANDBY statement requires a file location in which SQL Server may store those pages whose state has been rolled back because the transaction that changed the data had not yet committed as of the end of the last log. The database state at the end of this undo process is a transactionally consistent point in time. Users can read from the database while logs are not being applied. At the start of the application of a subsequent incremental transaction log, the saved page states are restored from the standby file, and the subsequent incremental transaction log backup is applied. This cycle proceeds until a WITH RECOVERY clause is executed.
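The three recovery states can be illustrated with hedged T-SQL sketches; the database name and file paths below are hypothetical, not taken from this document:

```sql
-- Restore the full backup and bring the database online immediately
-- (no further log backups can be applied afterward).
RESTORE DATABASE SalesDB
    FROM DISK = N'G:\Backups\SalesDB_full.bak'
    WITH RECOVERY;

-- Restore the full backup but leave the database in the RESTORING state,
-- ready to accept incremental transaction log backups.
RESTORE DATABASE SalesDB
    FROM DISK = N'G:\Backups\SalesDB_full.bak'
    WITH NORECOVERY;

-- Restore the full backup and leave the database readable between log
-- applications; the undo file holds pages rolled back for open transactions.
RESTORE DATABASE SalesDB
    FROM DISK = N'G:\Backups\SalesDB_full.bak'
    WITH STANDBY = N'G:\Backups\SalesDB_undo.dat';
```

Only one of the three statements would be executed for a given restore; they are shown together to contrast the resulting database states.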
At that point, SQL Server will initiate the process detailed in “SQL Server – RESTORE WITH RECOVERY” on page 284.

The goal of the NORECOVERY and STANDBY recovery modes is to limit any potential loss of work (or data) between the time at which the full backup occurred and the time of the database failure that necessitated the restore. Clearly, the STANDBY mode of operation does allow for additional functionality.

The Transact-SQL statement RESTORE DATABASE facilitates all of these operations. The full syntax for the RESTORE DATABASE statement can be found in the Microsoft SQL Server Books Online documentation. Only a subset of the available options is used in this chapter for illustration purposes.

Generally, a restore operation will necessitate the SQL Server instance taking the database offline during the restore process, such as the process shown in Figure 115 on page 286. As of SQL Server 2005 and SQL Server 2008, certain restore operations may be executed while the database remains online, though typically a given filegroup may need to be taken offline while the remainder of the database is accessible. Clearly, taking a production database offline will impact any service-level agreements that may be in place. In all cases, the primary concern is to mitigate any impact to the production system. This includes facilitating the fastest possible restoration process as well as reducing any potential data loss exposure.

Figure 115 DATABASE RESTORE Transact-SQL execution

In almost all cases, disk mirror operations that facilitate a restore process will be significantly more expeditious than a comparable streaming restore operation, whether from a disk-based backup file as in the example above, or from streaming tape.
It is for this reason that disk mirror backup operations are generally appealing. The ability to almost instantaneously restore a corrupted database and return it to an operational state is far preferable to waiting for a streaming restore from disk or tape. Until the restore is complete and any applicable incremental transaction log backups are replayed, the production database is considered offline.

EMC Products for SQL Server recovery

Disk mirror-based functionality of storage arrays can significantly decrease the amount of time required to execute restore operations. As documented in Chapter 5, “Backing up Microsoft SQL Server Databases,” EMC provides a number of products that can be used to facilitate restore operations. EMC products that facilitate SQL Server backup operations currently utilize the SQL Server VDI and VSS interfaces made available with Microsoft SQL Server 2005 and Microsoft SQL Server 2008.

Because disk mirror restore operations necessitate the reverse synchronization of the mirror devices back to the standard devices, a number of issues need to be addressed. Restoration of this type at the LUN level cannot be an online operation. The LUN must be dismounted before the reverse synchronization (restore). This ensures that Windows does not retain a pre-restore view of the specific LUNs. Should a restore be executed with the LUN mounted, Windows will mark the volume as questionable, and data corruption may result. In all circumstances, it is necessary to follow the guidelines provided by the specific EMC product being used to execute the restore.

In the case of Microsoft Failover Cluster configurations, this restoration may become more complex because of the dependencies built into the Cluster Service itself.
In general, disk resources are one of the dependencies for the SQL Server instance resource. To facilitate a restore operation, the disk resources will need to be taken offline before the TimeFinder restore operation. Failure to do so may result in the Cluster Service incorrectly detecting a disk resource failure and initiating a failover operation. However, before taking the disk resources offline, it is necessary to delete the database targeted for restore.

EMC Consistency technology and restore

EMC provides a number of solutions built on a foundation of Consistency technology. Using this technology, it is possible to create valid restartable images of a SQL Server database environment using dependent-write principles. These solutions are described in Chapter 4, “Creating Microsoft SQL Server Database Clones.”

Images created through Consistency technology can be used as valid restart points. It is also possible to stream these images to tape or other media for longer-term storage and subsequent restoration. However, while these images represent valid restart points, they do not support any of the SQL Server recovery forms, such as NORECOVERY or STANDBY, which are described in the remainder of this chapter. That is, subsequent incremental transaction logs can be applied only to a recoverable image, not to a restartable one.

Restartable images created using Consistency technology are particularly valuable in federated environments, where the creation of a consistent restart point for all databases and other objects within the environment is the primary goal.
Independently created recoverable images of each individual component of a federated environment make it inordinately complex, if not impossible, to resolve a single business point of consistency across all systems. The Recovery Point Objective (RPO) of a restartable image is based on a static point in time, and the exposure increases as changes to the source production environment proceed. Only when the next point of consistency is created can the RPO be reset to its initial state. Because the RPO varies during the intervals between cycles, it is important to control the cycle of such a process to ensure that the solution falls within the maximum acceptable RPO. The RTO of such federated environments is significantly easier to manage, since they typically require only restart processes, which have minimal administrative overhead. This combination of controlled RPO and very low RTO, when considered across the scope of all systems within the federation, makes an exceptionally compelling business restart solution.

Restore operations for images created by Consistency technology require five major steps:

1. Drop or detach the production database using sp_detach_db.
2. Unmount the file systems used as locations for the database data and log files.
3. Use the appropriate restore operation for the mirror environment being utilized.
4. Mount the volumes.
5. Reattach the database files using sp_attach_db.

The sp_attach_db stored procedure looks similar to this:

sp_attach_db <database>, @filename1=’<file_location>’, @filename<n>=’<file_location>’

Where <file_location> represents the locations of all database data files and logs for the database copy. Typically these will be the same locations from which the original consistent split was taken. Refer to SQL Server Books Online for complete documentation on this Transact-SQL stored procedure.
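The detach and reattach steps (1 and 5 above) can be sketched in T-SQL as follows; the database name and file paths are hypothetical:

```sql
-- Step 1: detach the production database before the array-level restore.
EXEC sp_detach_db @dbname = N'SalesDB';

-- (Steps 2-4: unmount the volumes, restore the mirror devices,
--  and remount the volumes at their original locations.)

-- Step 5: reattach the restored files. If the mount locations are unchanged,
-- only the primary data file needs to be specified.
EXEC sp_attach_db @dbname = N'SalesDB',
    @filename1 = N'E:\SQLData\SalesDB.mdf';
```

SQL Server performs roll-forward and rollback recovery during the attach, returning the database to a transactionally consistent state.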
In the following example, a consistent split image is attached to the SQL Server instance. It assumes that all the mirrored devices, be they BCVs, clones, or snaps, have been appropriately restored to the STD devices and remounted in the appropriate locations. If the mount locations are identical to those on the production instance, then it is only necessary to specify the first file location (the SQL Server metadata file for the database). Since the metadata file contains the locations of all subsequent files and logs, the stored procedure will use this information. If the file locations have been modified, then all the new locations must be specified. SQL Server will perform the necessary roll-forward and rollback recovery on the database. Since the Write-Ahead Logging functionality of SQL Server has been maintained, the resulting instance will return to a transactionally consistent state and become a fully independent clone of the production instance.

Note: In the example in Figure 116 on page 291, transaction roll-forward and rollback operations are shown as logged in the SQL Server activity logs during the attach process. In this instance, 33,692 transactions were rolled forward, and 2 were rolled back.

Figure 116 SQL log from attaching a Consistent split image to SQL Server

TF/SIM VDI and VSS restore

The following examples assume that a restoration is being executed for the production server, to restore a full backup image to its original location. Prior to restoring any production database on the production server, it is always recommended to maintain an existing copy of the database where possible, and to make every effort to create a final transaction log backup.
It will be necessary to restore the appropriate VDI or VSS metadata files created as a result of the backup operation. The metadata files are required to execute the restore operation.

Using TimeFinder/Mirror

Figure 117 on page 292 depicts the necessary steps to execute a VDI or VSS restore of a Microsoft SQL Server database using TimeFinder/Mirror:

1. Prepare SQL Server database state.
2. Dismount volumes.
3. Restore BCVs to standard devices.
4. Split BCVs from standard devices.
5. Mount volumes.
6. Complete SQL Server restore with specified state.

Figure 117 TF/SIM restore process using TimeFinder/Mirror

For TF/SIM VDI restore processing, the following steps are executed.

Note: Prior to executing a restore operation, it is recommended to ensure that there is a valid backup state, and that all incremental transaction log backups exist, to mitigate any data loss conditions.

1. Communication is initiated with the SQL Server instance, and a restore operation is executed for the specified database.
2. The volumes or mount points are dismounted on the production host in preparation for the disk mirror restore operation.
3. The disk mirrors are incrementally restored to the standard devices. Only those tracks that were modified on the standard volumes will be copied back from the BCV devices.
4. Once all changes have been restored, the BCV devices are split from the standards to protect the state of the image on the BCVs.
5. The volumes are remounted on the production host.
6. SQL Server then processes the final restore operation and places the database in the appropriate state as defined on the initial call.

For TF/SIM VSS restore processing, the following steps are executed.

1.
Communication is initiated from the remote backup node to the production SQL Server instance.
2. The administrator is prompted to detach the database targeted for restore.
3. The administrator is prompted to dismount any of the BCV devices from the backup host that will be used for the restore operation.
4. The administrator is prompted to dismount volumes or mount points on the production host in preparation for the disk mirror restore operation.
5. The disk mirrors are incrementally restored to the standard devices. Only those tracks that were modified on the standard volumes will be copied back from the BCV devices.
6. The administrator will be prompted to remount the restored volumes or mount points to proceed with the database restore operation.
7. The volumes are processed in order to revert them to an operational state.
8. SQL Server then processes the final restore operation and places the database in the appropriate state as defined on the initial call.

The specified state for the TF/SIM restore operation is one of NORECOVERY, STANDBY, or RECOVERY. The option selected will depend on the specific requirements for the restore process. The options are detailed in the following sections.

Restore with NORECOVERY

In general, the NORECOVERY mode is utilized when a chain of log backups is known to follow a full database backup. This allows for the restoration of the data and transaction logs of the database, and then the subsequent application of transaction log backups. It is also possible to restore a backup without requiring the application of transaction logs; this is the RECOVERY option, which is discussed in a subsequent section. A user database that is being restored is not available for user processes until the restore procedures and any log applications have been completed.
The command to restore a database instance back to the production server using TF/SIM VDI processing is:

tsimsnap restore –d <DB_alias> -f <metafile> -norecovery

This process results in the source volumes being restored to the state created by the backup operation last executed for the BCV. Additionally, TF/SIM manages a number of other processes, including the unmount operation of the LUN devices used by the database data and log files. The execution of this command is shown in Figure 118 on page 295.

Figure 118 TF/SIM restore database with TimeFinder/Mirror and NORECOVERY

It is also possible to use the –protbcvrest option on the TF/SIM call to optimize the restore operation. With this option, the BCV restore operation begins as a background task, and access to the restored state on the STD is immediately available. If a host process references a track marked to be restored from the BCV, the track is copied immediately and returned to the calling process. Moreover, updates do not flow to the BCV devices. In this way, the validity of the BCV backup image is maintained. Once completely restored, the BCV devices may be split from the STDs.

Note: Refer to the Product Guide for the TF/SIM product for preconditions and usage of the –protbcvrest option. Details on the location of the Product Guide are provided in Appendix A, “Related Documents.”

To maintain the backup validity of the BCV image when the –protbcvrest option is not used or is not available, the BCV devices are split after all required tracks are restored and before the mount operation. This ensures that the fidelity of the backup image on the BCV devices is maintained.
The command to restore a database instance back to the production server as executed from a remote host using TF/SIM VSS processing is:

tsimvss restore -ps <production host> –d <DB_alias> -bcd <bcd metafile> -wmd <wmd metafile> -norecovery

This process results in the source volumes being restored to the state created by the backup operation last executed for the BCV. Additionally, TF/SIM manages a number of other processes, including the unmount operation of the LUN devices used by the database data and log files.

The resulting state of the database after the execution of either a VDI or VSS restore operation is RESTORING. A RESTORING database is not available for direct access, and is therefore different from the read-only state created by a process such as STANDBY. The RESTORING state does allow subsequent transaction log backup files to be restored into the database. This application of incremental log backups minimizes data loss between the time the original database backup occurred and the time of the failure that necessitated the restore operation. The database state can easily be seen using SQL Server Management Studio, as shown in Figure 119 on page 297.

Figure 119 SQL Management Studio view of a RESTORING database

Transaction log backups need to be applied to this database in sequence, specifically in the order in which they were created. The time required to apply the incremental transaction log backups relates directly to the amount of change that occurred on the production database during normal operations. Clearly, a restoration of the production database in this manner will necessitate an outage.
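As a hedged sketch (the database name is hypothetical), the RESTORING state can also be confirmed from Transact-SQL on SQL Server 2005 and later, rather than through Management Studio:

```sql
-- Check the state of a database left in NORECOVERY mode by the restore.
SELECT name, state_desc
FROM sys.databases
WHERE name = N'SalesDB';
-- A database restored WITH NORECOVERY reports state_desc = 'RESTORING'.
```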
While much of the time requirement may be mitigated by using disk mirror technology to restore the last full backup image, the incremental transaction logs will still need to be replayed against the restored database. The time required to restore the database and apply transaction logs is referred to as the Recovery Time Interval. Figure 120 on page 298 demonstrates the Transact-SQL statement execution for the application of an incremental transaction log backup. The NORECOVERY option used indicates that further logs may be applied to this restoring database.

Figure 120 Restore of incremental transaction log with NORECOVERY

The application of incremental transaction logs progresses in this manner until the last available transaction log is processed. For the last incremental transaction log restore operation, the RECOVERY keyword should be used. This indicates to the SQL Server instance that final recovery processing should be executed against the database. This will roll back uncommitted transactions in the database and return it to an online state. It is also possible to execute the following Transact-SQL against the database in this state, should there be no further transaction logs:

RESTORE LOG <database> WITH RECOVERY
GO

This causes recovery processing to be executed without the need to specify a valid transaction log, when no further logs are available. Should it be necessary to restore to some point before the end of a given transaction log, refer to “Applying logs up to timestamps or marked transactions” on page 340 for information relating to restoring to timestamps or marked transactions.

Restore with STANDBY

The STANDBY mode is not a typical mode used for restoration of a production database.
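The log application sequence described above can be sketched in T-SQL as follows; the database name and backup file paths are hypothetical:

```sql
-- Apply the chain of incremental transaction log backups in creation order,
-- leaving the database in the RESTORING state between each application.
RESTORE LOG SalesDB FROM DISK = N'G:\Backups\SalesDB_log1.trn' WITH NORECOVERY;
RESTORE LOG SalesDB FROM DISK = N'G:\Backups\SalesDB_log2.trn' WITH NORECOVERY;

-- Final log in the chain: run recovery and bring the database online.
RESTORE LOG SalesDB FROM DISK = N'G:\Backups\SalesDB_log3.trn' WITH RECOVERY;
```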
Typically, either the NORECOVERY or RECOVERY modes are used because they result in the required states. However, the STANDBY mode of recovery does allow for the creation of a read-only state, and a database administrator may want to query the state of the database during the intervals between incremental transaction log applications. For that purpose, it is discussed here.

To facilitate the application of incremental transaction logs in a similar manner to the NORECOVERY option, but also to provide read-only access to the restored database, the STANDBY option may be used. In this way, the restored database is available for those operations that do not attempt to update the target database. Any attempt to execute updates will return a failure, as the database is in a read-only state.

The command to restore a database instance back to the production server using TF/SIM VDI processing, as shown in Figure 121 on page 300, is:

tsimsnap restore –d <DB_alias> -f <metafile> -standby

Note: TF/SIM does not support the notion of a standby database mode when using VSS processing.

During the application of incremental transaction logs, it is necessary, however, to terminate any access from other client connections. Exclusive access to the database is required to apply an incremental transaction log created from the production system. The STANDBY state leaves the target database in a read-only state, which is shown through SQL Server Management Studio in Figure 122 on page 301. Similar to the application of incremental transaction logs to a NORECOVERY state, incremental transaction logs may be applied to the STANDBY database. However, to maintain the STANDBY state, it is also necessary to specify the STANDBY option on the Transact-SQL RESTORE LOG statement. It is also required that the location for an undo file is provided.
The undo file is used to record those pages rolled back for uncommitted transactions after the application of a transaction log. The rollback operation ensures that a transactionally consistent view of the database state is available.

Figure 121 TF/SIM restore with TimeFinder/Mirror and STANDBY

Figure 122 SQL Management Studio view of a STANDBY (read-only) database

Before the application of subsequent incremental transaction logs, the undo file changes are reapplied to the database. Figure 123 on page 302 demonstrates the use of the STANDBY option and the inclusion of the standby file location.

Figure 123 Restore of incremental transaction log with STANDBY

The application of incremental transaction logs progresses in this manner until the last available transaction log is processed. For the last incremental transaction log restore operation, the RECOVERY keyword should be used. This indicates to the SQL Server instance that final recovery processing should be executed against the database. This will roll back uncommitted transactions in the database and return it to an online state. It is also possible to execute the following Transact-SQL against the database in this state, should there be no further transaction logs:

RESTORE LOG <database> WITH RECOVERY
GO

This causes recovery processing to be executed without the need to specify a valid transaction log, when no further logs are available.
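A log application that preserves the STANDBY (read-only) state between applies might look like the following hedged sketch; the database name, backup path, and undo file path are hypothetical:

```sql
-- Apply an incremental log backup while keeping the database readable
-- between applications; the undo file records pages rolled back for
-- transactions that were still open at the end of this log.
RESTORE LOG SalesDB
    FROM DISK = N'G:\Backups\SalesDB_log1.trn'
    WITH STANDBY = N'G:\Backups\SalesDB_undo.dat';
```

Each subsequent log in the chain is applied the same way, with WITH RECOVERY used on the final application to bring the database fully online.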
Should it be necessary to restore to some point before the end of a given transaction log, refer to “Applying logs up to timestamps or marked transactions” on page 340 for information relating to restoring to timestamps or marked transactions.

Restore with RECOVERY

In situations where there are no incremental transaction log backups available for replay against the full database backup, it may be appropriate to simply restore the last full backup and allow SQL Server to bring the database online. In this case, the end state of the operation is an online, fully accessible database. No additional incremental transaction logs can be applied to this database once recovered, as a new chain of transaction logs will be initiated. This process is identical to that executed in “Restore with NORECOVERY” on page 294, except for the use of the RECOVERY option.

The command to restore a database instance back to the production server using TF/SIM VDI processing is:

tsimsnap restore –d <DB_alias> -f <metafile> -recovery

The command to restore a database instance back to the production server from a remote host using TF/SIM VSS processing is:

tsimvss restore -ps <production server> –d <DB_alias> -bcd <bcd metafile> -wmd <wmd metafile> -recovery

Using TimeFinder/Clone

Restore operations with TF/SIM when utilizing TimeFinder/Clone devices are supported by both the TF/SIM VDI and VSS implementations. When using TimeFinder/Clone restore processes, TF/SIM requires the use of manual restore operations through the -m command line option.

Note: Prior to executing a restore operation, it is recommended to ensure that there is a valid backup state, and that all incremental transaction log backups exist, to mitigate any data loss conditions.

For TF/SIM VDI restore processing, the following steps are executed.

1.
The administrator must detach or otherwise remove the targeted database from the production host.
2. The volumes or mount points must be dismounted on the production host in preparation for the disk mirror restore operation.
3. The disk mirrors are incrementally restored to the standard devices. Only those tracks that were modified on the standard volumes will be copied back from the clone devices.
4. The volumes are remounted on the production host.
5. Once all changes have been restored, the clone devices are left in a RESTORED state. The RESTORED state may be terminated once all data has been restored.
6. SQL Server then processes the final restore operation and places the database in the appropriate state as defined on the initial call.

For TF/SIM VSS restore processing, the following steps are executed.

1. The administrator must detach or otherwise remove the targeted database from the production host.
2. The administrator is prompted to detach the database targeted for restore.
3. The administrator is prompted to dismount any of the clone devices from the backup host that will be used for the restore operation.
4. The administrator is prompted to dismount volumes or mount points on the production host in preparation for the disk mirror restore operation.
5. The disk mirrors are incrementally restored to the standard devices. Only those tracks that were modified on the standard volumes will be copied back from the clone devices. Once all changed tracks have been copied back to the source volumes, the clone devices will remain in a RESTORED state.
6. The administrator will be prompted to remount the restored volumes or mount points to proceed with the database restore operation.
7. The volumes are processed in order to revert them to an operational state.
8.
SQL Server then processes the final restore operation and places the database in the appropriate state as defined on the initial call.

The specified state for the TF/SIM restore operation is one of NORECOVERY, STANDBY, or RECOVERY. The option selected will depend on the specific requirements for the restore process. The options are detailed in the following sections.

Restore with NORECOVERY

In general, the NORECOVERY mode is utilized when a chain of log backups is known to follow a full database backup. This allows for the restoration of the data and transaction logs of the database, and then the subsequent application of transaction log backups. It is also possible to restore a backup without requiring the application of transaction logs; this is the RECOVERY option, which is discussed in a subsequent section. A user database that is being restored is not available for user processes until the restore procedures and any log applications have been completed.

To initiate a restore operation using TimeFinder/Clone devices, it will first be necessary to detach or otherwise remove the user database to be restored from the SQL Server instance. It will subsequently be necessary to unmount the volumes from the production server to ensure that Windows Server does not retain information regarding the state of the file systems, which could subsequently cause corruption. Once the volumes are dismounted, the reverse synchronization of the TimeFinder/Clone devices is executed. In Figure 124 on page 305, a TimeFinder/Clone restore operation is executed, using a device file containing the source and clone target pairings used to create the initial backup state.
Figure 124 Execution of TimeFinder/Clone restore

Once the restore process has been initiated, the volumes on the production host need to be remounted to the drive locations initially used by the source database. This makes the files accessible to execute the manual restore operation. The command to initiate a manual restore for the target database on the production server using TF/SIM VDI processing is:

tsimsnap restore –d <DB_alias> -f <metafile> -m -norecovery

This process results in the specified database being placed into a NORECOVERY mode using the files manually restored. The execution of this command is shown in Figure 125 on page 306.

Figure 125 TF/SIM restore database with TimeFinder/Clone and NORECOVERY

TimeFinder/Clone devices will be left in a RESTORED state after the execution of the TF/SIM restore operation. TimeFinder/Clone devices will not receive updates from the source devices while they are in the RESTORED state.

The command to restore a database instance back to the production server as executed from a remote host using TF/SIM VSS processing is:

tsimvss restore -ps <production host> –d <DB_alias> -bcd <bcd metafile> -wmd <wmd metafile> -norecovery

This process results in the source devices being restored to the state created by the backup operation last executed for the clone targets.

The resulting state of the database after the execution of either a VDI or VSS restore operation is RESTORING. A RESTORING database is not available for direct access, and is therefore different from the read-only state created by a process such as STANDBY. The RESTORING state does allow subsequent transaction log backup files to be restored into the database.
This application of incremental log backups minimizes data loss between the time the original database backup occurred and the time of the failure that necessitated the restore operation. The database state can easily be seen by using SQL Server Management Studio, as shown in Figure 126.

Figure 126 SQL Management Studio view of a RESTORING database

Transaction log backups need to be applied to this database in sequence, specifically in the order in which they were created. The time required to apply the incremental transaction log backups relates directly to the amount of change that occurred on the production database during normal operations. Clearly, a restoration of the production database in this manner necessitates an outage. While much of the time requirement may be mitigated by using disk mirror technology to restore the last full backup image, the incremental transaction logs still need to be replayed against the restored database. The time required to restore the database and apply transaction logs is referred to as the Recovery Time Interval. Figure 127 demonstrates the Transact-SQL statement execution for the application of an incremental transaction log backup. The NORECOVERY option used indicates that further logs may be applied to this restoring database.

Figure 127 Restore of incremental transaction log with NORECOVERY

The application of incremental transaction logs progresses in this manner until the last available transaction log is processed. For the last incremental transaction log restore operation, the RECOVERY keyword should be used. This indicates to the SQL Server instance that final recovery processing should be executed against the database.
This rolls back uncommitted transactions in the database and returns it to an online state. It is also possible to execute the following Transact-SQL against the database in this state, should there be no further transaction logs:

RESTORE LOG <database> WITH RECOVERY
GO

This causes recovery processing to be executed without the need to specify a valid transaction log, when no further logs are available. Should it be necessary to restore to some point before the end of a given transaction log, refer to "Applying logs up to timestamps or marked transactions" for information on restoring to timestamps or marked transactions.

Restore with STANDBY

The STANDBY mode is not a typical mode for restoration of a production database. Typically, either the NORECOVERY or RECOVERY modes are used because they result in the required states. However, the STANDBY mode of recovery does allow for the creation of a read-only state, and a database administrator may want to query the state of the database during the intervals between incremental transaction log applications. For that purpose, it is discussed here.

To facilitate the application of incremental transaction logs in a similar manner to the NORECOVERY option, while also providing read-only access to the restored database, the STANDBY option may be used. In this way, the restored database is available for operations that do not attempt to update the target database. Any attempt to execute updates returns a failure, as the database is in a read-only state.

To initiate a restore operation using TimeFinder/Clone devices, it is first necessary to detach or otherwise remove the user database to be restored from the SQL Server instance.
The volumes must then be unmounted from the production server to ensure that Windows Server does not retain information about the state of the file systems and subsequently cause corruption. Once dismounted, the reverse synchronization of the TimeFinder/Clone devices can be executed. Once the restore process has been initiated, the volumes on the production host need to be remounted to the drive locations initially used by the source database. This makes the files accessible for the manual restore operation. The command to restore a database instance back to the production server using TF/SIM VDI processing, as shown in Figure 121, is:

tsimsnap restore -d <DB_alias> -f <metafile> -m -standby

Note: TF/SIM does not support the notion of a standby database mode when using VSS processing.

During the application of incremental transaction logs, it is necessary to terminate any access from other client connections. Exclusive access to the database is required to apply an incremental transaction log created from the production system. The STANDBY state leaves the target database in a read-only state, as shown through SQL Server Management Studio in Figure 128.

Similar to the application of incremental transaction logs to a NORECOVERY state, incremental transaction logs may be applied to the STANDBY database. However, to maintain the STANDBY state it is also necessary to specify the STANDBY option on the Transact-SQL RESTORE LOG statement, together with the location of an undo file. The undo file records the pages rolled back for uncommitted transactions after the application of a transaction log. The rollback operation ensures that a transactionally consistent view of the database state is available.
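The RESTORE LOG statement with the STANDBY option can be assembled as shown below. The database name, backup path, and undo-file path used here are illustrative placeholders, not values from this chapter.

```python
# T-SQL builder for applying a log backup while keeping read-only STANDBY
# access. Database name, backup path, and undo-file path are illustrative
# placeholders.

def restore_log_standby(database, log_backup, undo_file):
    """RESTORE LOG statement that preserves the STANDBY (read-only) state."""
    return (
        f"RESTORE LOG [{database}] "
        f"FROM DISK = N'{log_backup}' "
        f"WITH STANDBY = N'{undo_file}'"
    )
```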
Figure 128 TF/SIM restore with TimeFinder/Mirror and STANDBY

Figure 129 SQL Enterprise Manager view of a STANDBY (read-only) database

Before the application of subsequent incremental transaction logs, the undo file changes are reapplied to the database. Figure 130 demonstrates the use of the STANDBY option and the inclusion of the standby file location.

Figure 130 Restore of incremental transaction log with STANDBY

The application of incremental transaction logs progresses in this manner until the last available transaction log is processed. For the last incremental transaction log restore operation, the RECOVERY keyword should be used. This indicates to the SQL Server instance that final recovery processing should be executed against the database. This rolls back uncommitted transactions in the database and returns it to an online state. It is also possible to execute the following Transact-SQL against the database in this state, should there be no further transaction logs:

RESTORE LOG <database> WITH RECOVERY
GO

This causes recovery processing to be executed without the need to specify a valid transaction log, when no further logs are available. Should it be necessary to restore to some point before the end of a given transaction log, refer to "Applying logs up to timestamps or marked transactions" for information on restoring to timestamps or marked transactions.

Restore with RECOVERY

In situations where no incremental transaction log backups are available for replay against the full database backup, it may be appropriate to simply restore the last full backup and allow SQL Server to bring the database online.
In this case, the end state of the operation is an online, fully accessible database. No additional incremental transaction logs can be applied to this database instance once it is recovered, as a new chain of transaction logs is initiated. This process is identical to that executed in "Restore with NORECOVERY", except for the use of the RECOVERY option.

To initiate a restore operation using TimeFinder/Clone devices, it is first necessary to detach or otherwise remove the user database to be restored from the SQL Server instance. The volumes must then be unmounted from the production server to ensure that Windows Server does not retain information about the state of the file systems and subsequently cause corruption. Once dismounted, the reverse synchronization of the TimeFinder/Clone devices can be executed. Once the restore process has been initiated, the volumes on the production host need to be remounted to the drive locations initially used by the source database. This makes the files accessible for the manual restore operation. The command to restore a database instance back to the production server using TF/SIM VDI processing is:

tsimsnap restore -d <DB_alias> -f <metafile> -m -recovery

The command to restore a database instance back to the production server from a remote host using TF/SIM VSS processing is:

tsimvss restore -ps <production server> -d <DB_alias> -bcd <bcd metafile> -wmd <wmd metafile> -recovery

Using TimeFinder/Snap

The execution of TF/SIM restore operations from TimeFinder/Snap devices differs slightly in manner of execution from that of TimeFinder/Mirror. Once the TimeFinder/Snap restore process is started, the STD devices become available. A restore session is then created for this process, and the original copy session remains intact.
Any changed tracks recorded since the start of the TF/Snap session are restored to the STD devices. Figure 131 depicts the steps necessary to restore a VDI backup of a Microsoft SQL Server database using TimeFinder/Snap:

1. The administrator must detach or otherwise remove the targeted database from the production host.
2. The volumes or mount points must be dismounted on the production host in preparation for the disk mirror restore operation.
3. The restore process is initiated by the creation of a new session, which makes the view of the restored state immediately available.
4. The volumes are remounted on the production host.
5. SQL Server then processes the final restore operation and places the database in the state specified on the initial call.

After the execution of the TF/SIM restore operation using TimeFinder/Snap, the database becomes immediately available in the designated state. This immediate access is provided while the actual copy of tracks is processed as a background operation. The state of the STD devices appears as it was at the time of TimeFinder/Snap activation. If data is referenced on a track that has not yet been restored from the SAVE area, that track is restored immediately, and the data then becomes accessible to the host. The validity of the state is protected and remains consistent.

Figure 131 TF/SIM restore using TimeFinder/Snap

In actuality, TimeFinder/Snap creates a new restore session to revert the changed tracks.
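The restore-on-access behavior described above can be modeled in a few lines. This is a toy sketch, not EMC code: a background task copies preserved tracks from the SAVE area back to the STD device, but any track the host touches first is restored on demand, which is why the restored view is usable immediately.

```python
# Toy model of TimeFinder/Snap restore-on-access: tracks copy back from
# the SAVE area in the background, but a host read of a pending track
# restores that track immediately.

class SnapRestoreSession:
    def __init__(self, save_area):
        self.save_area = dict(save_area)  # track number -> preserved data
        self.std = {}                     # tracks already restored to the STD

    def host_read(self, track):
        # Host access to a pending track triggers an immediate restore
        if track not in self.std:
            self.std[track] = self.save_area[track]
        return self.std[track]

    def background_copy(self):
        # Background operation completes any remaining tracks
        for track, data in self.save_area.items():
            self.std.setdefault(track, data)
```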
It is possible to view the two sessions using the following command:

symsnap -g <device_group> query -multi

The -multi option lists all device sessions. This is shown in Figure 132.

Figure 132 TimeFinder/Snap listing restore session

In the example shown in Figure 132, each source STD device is listed in the first column. Two VDEV devices are associated with each STD device: one is shown with a Restored state in the second-to-last column, and the other is listed with a CopyOnWrite state. The devices listed with the Restored state are the VDEVs created by the restore operation. Once the restore is complete, the session provides no further value in this environment, and the VDEV devices created by the restore session may be terminated with the following command:

symsnap -g <device_group> terminate -restored

The specified state for the TF/SIM restore operation is one of NORECOVERY, STANDBY, or RECOVERY. The option selected depends on the specific requirements of the restore process. The options are detailed in the following sections.

Restore with NORECOVERY

In general, the NORECOVERY mode is used when a chain of log backups is known to follow a full database backup. This allows the data and transaction logs of the database to be restored, followed by the application of transaction log backups. It is possible to restore a backup without applying transaction logs; this is the RECOVERY option, discussed in a subsequent section.

To initiate a restore operation using TimeFinder/Snap devices, it is first necessary to detach or otherwise remove the user database to be restored from the SQL Server instance.
The volumes must then be unmounted from the production server to ensure that Windows Server does not retain information about the state of the file systems and subsequently cause corruption. Once dismounted, the reverse synchronization of the TimeFinder/Snap devices can be executed. Once the restore process has been initiated, the volumes on the production host need to be remounted to the drive locations initially used by the source database. This makes the files accessible for the manual restore operation. The command to restore a database instance back to the production server using TF/SIM VDI processing with TF/Snap is:

tsimsnap restore -d <DB_alias> -f <metafile> -m -norecovery

The command to restore a database instance back to the production server, as executed from a remote host using TF/SIM VSS processing, is:

tsimvss restore -ps <production host> -d <DB_alias> -bcd <bcd metafile> -wmd <wmd metafile> -norecovery

This process results in the source devices being restored to the state created by the backup operation last executed for the snap devices. Tracks that were copied to the save area as a result of being updated on the STD are restored to their pre-updated state. Ultimately, this initiates a restore process against the STD devices to return them to the state at which the TF/Snap session was activated, that is, the state of the backup. Additionally, TF/SIM manages a number of other processes, including the unmount operation of the STD LUN devices used by the database data and log files. An example of the execution process of the TF/SIM command is shown in Figure 133.

Figure 133 TF/SIM restore with TimeFinder/Snap and NORECOVERY

The resulting state of the database immediately after the TF/SIM snap restore operation is that of a LOADING database.
A LOADING database is not available for direct access, and is therefore different from a read-only state such as that created by a STANDBY restore. The state of the database does, however, allow subsequent transaction log backup files to be restored into the database instance. This application of incremental log backups minimizes data loss between the time the original database backup occurred and the time of the failure that necessitated the restore operation. The database state can easily be seen by using SQL Server Management Studio, as shown in Figure 134.

Figure 134 SQL Management Studio view of a RESTORING database

Transaction log backups need to be applied to this database in sequence, specifically in the order in which they were created. The time required to apply the incremental transaction log backups relates directly to the amount of change that occurred on the production database during normal operations. Clearly, a restoration of the production database in this manner necessitates an outage. While much of the time requirement may be mitigated by using disk mirror technology to restore the last full backup image, the incremental transaction logs still need to be replayed against the restored database. The time required to restore the database and apply transaction logs is referred to as the Recovery Time Interval. Typically, customers state this as a Recovery Time Objective (RTO): the period, dictated by their service-level agreements (SLAs), for which the production database may be unavailable for restoration purposes in the event of a failure. Application of transaction logs is typically a requirement of any recovery process for a production database.
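The log-application rule described above (every incremental log but the last applied WITH NORECOVERY, the final one WITH RECOVERY) can be sketched as a statement generator. File names here are illustrative, and the list is assumed to already be in creation order.

```python
# Sketch of the log-application rule: all incremental logs but the last
# are restored WITH NORECOVERY so further logs can follow; the final log
# uses WITH RECOVERY to roll back uncommitted work and bring the
# database online. File names are illustrative placeholders.

def restore_log_sequence(database, log_backups):
    stmts = []
    for i, backup in enumerate(log_backups):
        option = "RECOVERY" if i == len(log_backups) - 1 else "NORECOVERY"
        stmts.append(
            f"RESTORE LOG [{database}] FROM DISK = N'{backup}' WITH {option}"
        )
    return stmts
```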
An example of the application of an incremental transaction log backup executed from within Query Analyzer is shown in Figure 135.

Figure 135 Restore of incremental transaction log with NORECOVERY

The application of incremental transaction logs progresses in this manner until the last available transaction log is processed. For the last incremental transaction log restore operation, the RECOVERY keyword should be used. This indicates to the SQL Server instance that final recovery processing should be executed against the database. This rolls back uncommitted transactions in the database and returns it to an online state. It is also possible to execute the following Transact-SQL against the database in this state, should there be no further transaction logs:

RESTORE LOG <database> WITH RECOVERY
GO

This causes recovery processing to be executed without the need to specify a valid transaction log, when no further logs are available. Should it be necessary to restore to some point before the end of a given transaction log, refer to "Applying logs up to timestamps or marked transactions" for information on restoring to timestamps or marked transactions.

Note: Remember to terminate the TimeFinder/Snap restore session once it has been completely restored.

Restore with STANDBY

The STANDBY mode is not a typical mode for restoration of a production database. Typically, either the NORECOVERY or RECOVERY modes are used because they result in the required states. However, a database administrator may want to query the state of the database during the intervals between incremental transaction log applications. For that purpose, it is discussed here.
To facilitate the application of incremental transaction logs in a similar manner to the NORECOVERY option, while also providing read-only access to the restored database, the STANDBY option may be used. In this way, the restored database is available for operations that do not attempt to update the target database. Any attempt to execute updates returns a failure, as the database is in a read-only state.

To initiate a restore operation using TimeFinder/Snap devices, it is first necessary to detach or otherwise remove the user database to be restored from the SQL Server instance. The volumes must then be unmounted from the production server to ensure that Windows Server does not retain information about the state of the file systems and subsequently cause corruption. Once dismounted, the reverse synchronization of the TimeFinder/Snap devices can be executed. Once the restore process has been initiated, the volumes on the production host need to be remounted to the drive locations initially used by the source database. This makes the files accessible for the manual restore operation. The command to restore a database instance back to the production server using TF/SIM VDI processing is:

tsimsnap restore -d <DB_alias> -f <metafile> -m -standby

The command to restore a database instance back to the production server, as executed from a remote host using TF/SIM VSS processing, is:

tsimvss restore -ps <production host> -d <DB_alias> -bcd <bcd metafile> -wmd <wmd metafile> -norecovery

This process results in the source devices being restored to the state created by the backup operation last executed for the snap devices. An example of the execution of the TF/SIM command is shown in Figure 136.
Figure 136 TF/SIM restore with TimeFinder/Snap and STANDBY

During the application of incremental transaction logs, it is necessary to terminate any access from other client connections. Exclusive access to the database is required to apply an incremental transaction log created from the production system. The STANDBY state leaves the target database in a read-only state, as can be seen through SQL Server Enterprise Manager in Figure 137.

Figure 137 SQL Management Studio view of a STANDBY (read-only) database

Similar to the application of incremental transaction logs to a NORECOVERY state, incremental transaction logs may be applied to the STANDBY database, as shown in Figure 138. However, to maintain the STANDBY state it is also necessary to specify the STANDBY option on the Transact-SQL RESTORE LOG statement, together with the location of an undo file. The undo file records the pages rolled back for uncommitted transactions after the application of a transaction log. The rollback operation ensures that a transactionally consistent view of the database state is available. Before the application of subsequent incremental transaction logs, the undo file changes are reapplied to the database.

Figure 138 Restore of incremental transaction log with STANDBY

The application of incremental transaction logs progresses in this manner until the last available transaction log is processed. For the last incremental transaction log restore operation, the RECOVERY keyword should be used. This indicates to the SQL Server instance that final recovery processing should be executed against the database.
This rolls back uncommitted transactions in the database and returns it to an online state. It is also possible to execute the following Transact-SQL against the database in this state, should there be no further transaction logs:

RESTORE LOG <database> WITH RECOVERY
GO

This causes recovery processing to be executed without the need to specify a valid transaction log, when no further logs are available. Should it be necessary to restore to some point before the end of a given transaction log, refer to "Applying logs up to timestamps or marked transactions" for information on restoring to timestamps or marked transactions.

Note: Remember to terminate the TimeFinder/Snap restore session once it has been completely restored.

Restore with RECOVERY

In situations where no incremental transaction log backups are available for replay against the full database backup, it may be appropriate to simply restore the last full backup and allow SQL Server to bring the database online. In this case, the end state of the operation is an online, fully accessible database. No additional incremental transaction logs can be applied to this database instance once it is recovered, as a new chain of transaction logs is initiated.

To initiate a restore operation using TimeFinder/Snap devices, it is first necessary to detach or otherwise remove the user database to be restored from the SQL Server instance. The volumes must then be unmounted from the production server to ensure that Windows Server does not retain information about the state of the file systems and subsequently cause corruption. Once dismounted, the reverse synchronization of the TimeFinder/Snap devices can be executed.
Once the restore process has been initiated, the volumes on the production host need to be remounted to the drive locations initially used by the source database. This makes the files accessible for the manual restore operation. This process is identical to that executed in "Restore with NORECOVERY", except for the use of the RECOVERY option. The command to restore a database instance back to the production server using TF/SIM VDI processing is:

tsimsnap restore -d <DB_alias> -f <metafile> -m -recovery

The command to restore a database instance back to the production server, as executed from a remote host using TF/SIM VSS processing, is:

tsimvss restore -ps <production host> -d <DB_alias> -bcd <bcd metafile> -wmd <wmd metafile> -recovery

This process results in the source devices being restored to the state created by the backup operation last executed for the snap devices.

Note: Remember to terminate the TimeFinder/Snap restore session once it has been completely restored.

SYMIOCTL VDI restore

Backup images created using SYMIOCTL may also be used to restore the production database instance. However, SYMIOCTL provides no automation of activities for the restore process itself, and the SYMIOCTL execution is entirely independent of the TimeFinder operations. In all cases, it is necessary to remove the database object from SQL Server as the first step. This is required because the file system objects must be unmounted before restoring the BCV image. Before restoring any production database on the production server, it is always recommended to maintain an existing copy of the database where possible, and to make every effort to create a final transaction log backup.
Figure 139 depicts the following steps, which are required for any restore using SYMIOCTL:

1. Detach or delete the target database.
2. Unmount the drives or mount points that are to be restored from mirror devices.
3. Execute the TimeFinder restore operation.
4. Mount the drives or mount points.
5. Execute the SYMIOCTL restore operation.
6. Apply logs where applicable.

Figure 139 Using SYMIOCTL for TimeFinder/Mirror restore of production database

It is necessary to restore the VDI metadata file created as a result of the backup operation so that it too is accessible for the SYMIOCTL command execution. The following sections detail only the actual execution of the SYMIOCTL command instance. Detaching and/or deleting the database instance can be managed through SQL Server Enterprise Manager or by executing the appropriate sp_detach_db or DROP DATABASE Transact-SQL statement. For correct usage and syntax, refer to the Microsoft SQL Server Books Online documentation. It is also assumed that the Symmetrix Integration Utility (SIU) command line process SYMNTCTL will be used for managing the unmount and mount operations. Examples of the execution of SYMNTCTL are provided in Chapter 4, "Creating Microsoft SQL Server Database Clones." Additional documentation on the SIU can be found in the product guide for the TimeFinder Integration Modules. Refer to Appendix A, "Related Documents," for details on locating relevant documentation.
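Step 5 of the procedure above, the SYMIOCTL restore call, can be assembled from its parts using the symioctl syntax shown later in this section. The database name and metadata file in this sketch are placeholders.

```python
# Builder for the SYMIOCTL VDI restore invocation (step 5 above), using
# the symioctl syntax shown later in this section. Database name and
# metadata file are illustrative placeholders.

def symioctl_restore_cmd(database, metadata_file, state="norecovery"):
    """Assemble the SYMIOCTL restore call for the given end state."""
    return (
        f"symioctl -type SQLServer restore snapshot {database} "
        f"SAVEFILE {metadata_file} -{state} -noprompt"
    )
```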
Executing TimeFinder restore

Once the production database has been detached or otherwise removed from the SQL Server instance, and the volumes have been dismounted to facilitate the restore operation, it is necessary to select the restore process appropriate to the style of TimeFinder operations used to create the initial backup. The three forms of TimeFinder restore covered here are TimeFinder/Mirror, TimeFinder/Clone, and TimeFinder/Snap.

TimeFinder/Mirror restore

The restore operation is a standard TimeFinder/Mirror process of the form:

symmir -g <device_group> restore -noprompt

The execution results in all changed tracks recorded for the STD devices being rolled back to the state at which the backup image was created. To check on the restore process, it is possible to query the restore operation in the following way:

symmir -g <device_group> query

While it is possible to access the STD volumes once the BCV restore process has been initiated, it is recommended that the restore be fully completed, and the BCV devices split from the STDs, before proceeding. Alternatively, where possible, use the -protect option for the operation. The goal is to ensure that the BCV state is not changed when the STD volumes are accessed. This ensures the continued validity of the BCV backup image. After the volumes have been restored appropriately and subsequently remounted to the production host, the SYMIOCTL utility may be used to execute the restore operation. It is assumed that the VDI metadata file relating to the original backup operation will be restored to a location on the production host.
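Waiting for the mirror restore to complete before splitting the BCVs can be automated with a simple polling loop. This sketch injects the query function so it can run without a Symmetrix; in practice the function would invoke `symmir -g <device_group> query` via subprocess and parse the reported state.

```python
import time

# Polling sketch for the symmir restore: query the device group until it
# reports Restored. query_state is injected so the loop can be exercised
# without hardware; a real implementation would shell out to
# `symmir -g <device_group> query` and parse the output.

def wait_for_restore(query_state, max_polls=60, delay_s=5):
    """Return True once query_state() reports 'Restored', else False."""
    for _ in range(max_polls):
        if query_state() == "Restored":
            return True
        time.sleep(delay_s)
    return False
```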
TimeFinder/Clone restore

The restore operation is a standard TimeFinder/Clone process of the form:

symclone -g <device_group> restore -noprompt

The execution results in all changed tracks recorded for the STD devices being rolled back to the state at which the backup image was created on the clone devices. To check on the restore process, it is possible to query the restore operation in the following way:

symclone -g <device_group> query

Unlike TimeFinder/Mirror restore operations, TimeFinder/Clone devices do not allow updates to be propagated to the clone devices that have been restored. Thus, it is possible to access the STD volumes once the clone restore process has been initiated. The validity of the clone backup image is maintained. After the restore process has been initiated and the volumes subsequently remounted to the production host, the SYMIOCTL utility may be used to execute the restore operation. It is assumed that the VDI metadata file relating to the original backup operation will be restored to a location on the production host. It will also be necessary to terminate the restored clone process by executing the following command:

symclone -g <device_group> terminate -noprompt

TimeFinder/Snap restore

The restore operation is a standard TimeFinder/Snap process of the form:

symsnap -g <device_group> restore -noprompt

The execution results in all changed tracks recorded for the STD devices being rolled back to the state at which the backup image was created on the snap devices. To check on the restore process, it is possible to query the restore operation in the following way:

symsnap -g <device_group> query

Unlike TimeFinder/Mirror restore operations, TimeFinder/Snap devices do not allow updates to be propagated to the VDEV devices that have been restored. Thus, it is possible to access the STD volumes once the snap restore process has been initiated.
The validity of the snap backup image is maintained. After the restore process has been initiated and the volumes subsequently remounted to the production host, the SYMIOCTL utility may be used to execute the restore operation. It is assumed that the VDI metadata file relating to the original backup operation has been restored to a location on the production host.

It will also be necessary to terminate the restored snap session by executing the following command:

symsnap -g <device_group> terminate -restored -noprompt

SYMIOCTL with NORECOVERY

The SYMIOCTL restore command will be of the form:

symioctl -type SQLServer restore snapshot <database name> SAVEFILE <metadata file> -norecovery -noprompt

Figure 140 provides an example of the execution of the SYMIOCTL command.

Figure 140 SYMIOCTL restore and NORECOVERY

The state of the database immediately after the SYMIOCTL restore operation is that of a LOADING database. A LOADING database is not available for direct access, and is therefore different from the read-only state created by a STANDBY restore. This state does, however, allow subsequent transaction log backup files to be restored into the database instance. Applying these incremental log backups minimizes data loss between the time the original database backup was taken and the time of the failure that necessitated the restore operation. The database state can easily be seen in SQL Server Management Studio, as shown in Figure 141.

Figure 141 SQL Management Studio view of a RESTORING database

Transaction log backups need to be applied to this database in sequence, specifically in the order in which they were created.
The time required to apply the incremental transaction log backups relates directly to the amount of change that occurred on the production database during normal operations. Clearly, a restoration of the production database in this manner necessitates an outage. While much of the time requirement may be mitigated by using disk mirror technology to restore the last full backup image, the incremental transaction logs will still need to be replayed against the restored database.

The time required to restore the database and apply transaction logs is referred to as the recovery time interval. Customers typically state this as a Recovery Time Objective (RTO): the maximum period, dictated by their service-level agreements (SLAs), for which the production database may be unavailable for restoration purposes in the event of a failure. Application of transaction logs is typically a requirement of any recovery process for a production database. Figure 142 demonstrates the application of an incremental transaction log using SQL Server Query Analyzer.

Figure 142 Restore of incremental transaction log with NORECOVERY

The application of incremental transaction logs proceeds in this manner until the last available transaction log is processed. For the last incremental transaction log restore operation, the RECOVERY keyword should be used. This indicates to the SQL Server instance that final recovery processing should be executed against the database, rolling back uncommitted transactions and returning the database to an online state.
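The apply-in-order, RECOVERY-last rule can be sketched as a small helper that emits the Transact-SQL for an ordered set of log backups. This is an illustrative sketch only; the database name and backup file naming convention are hypothetical.

```python
def restore_log_statements(database, log_backups):
    """Emit RESTORE LOG statements for a set of log backups, oldest first.

    Every log but the last is applied WITH NORECOVERY so that further logs
    can follow; the final log uses WITH RECOVERY, which rolls back
    uncommitted transactions and brings the database online.
    """
    # Assumes backups are named so that sort order matches creation order.
    ordered = sorted(log_backups)
    stmts = []
    for i, backup in enumerate(ordered):
        option = "RECOVERY" if i == len(ordered) - 1 else "NORECOVERY"
        stmts.append(f"RESTORE LOG {database} FROM DISK='{backup}' WITH {option}")
    return stmts
```

For example, passing the backups out of order still yields `tranlog_00001.trn` applied with NORECOVERY before `tranlog_00002.trn` with RECOVERY.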
It is also possible to execute the following Transact-SQL against the database, should there be no further transaction logs:

RESTORE LOG <database> WITH RECOVERY
GO

This causes recovery processing to be executed without the need to specify a transaction log when no further logs are available. Should it be necessary to restore to some point before the end of a given transaction log, refer to “Applying logs up to timestamps or marked transactions” for information relating to restoring to timestamps or marked transactions.

SYMIOCTL with STANDBY

The SYMIOCTL restore command will be of the form:

symioctl -type SQLServer restore snapshot <database name> SAVEFILE <metadata file> -standby -noprompt

An example of the execution of the SYMIOCTL restore command is shown in Figure 143.

Figure 143 SYMIOCTL restore and STANDBY

Once completed, the database is placed in a read-only state. However, during the application of incremental transaction logs it is necessary to terminate any access from other client connections; exclusive access to the database is required to apply an incremental transaction log created from the production system. Since the STANDBY state leaves the target database read-only, this can be seen in SQL Server Management Studio, as shown in Figure 144.

Figure 144 SQL Management Studio view of a STANDBY (read-only) database

Similar to the application of incremental transaction logs to a NORECOVERY state, incremental transaction logs may be applied to the STANDBY database. However, to maintain the STANDBY state it is also necessary to specify the STANDBY option on the Transact-SQL RESTORE LOG statement, and to provide the location for an undo file.
The undo file records the pages rolled back for uncommitted transactions after the application of a transaction log. The rollback operation ensures that a transactionally consistent view of the database state is available. Figure 145 demonstrates the application of an incremental transaction log to a STANDBY database using the relevant Transact-SQL commands from within SQL Server Query Analyzer. Before the application of each subsequent incremental transaction log, the undo file changes are reapplied to the database.

Figure 145 Restore of incremental transaction log with STANDBY

The application of incremental transaction logs proceeds in this manner until the last available transaction log is processed. For the last incremental transaction log restore operation, the RECOVERY keyword should be used. This indicates to the SQL Server instance that final recovery processing should be executed against the database, rolling back uncommitted transactions and returning the database to an online state.

It is also possible to execute the following Transact-SQL against the database, should there be no further transaction logs:

RESTORE LOG <database> WITH RECOVERY
GO

This causes recovery processing to be executed without the need to specify a transaction log when no further logs are available. Should it be necessary to restore to some point before the end of a given transaction log, refer to “Applying logs up to timestamps or marked transactions” for information relating to restoring to timestamps or marked transactions.
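The STANDBY variant of the statement differs from the NORECOVERY form only in the option clause, which must name the undo file. A minimal sketch, with hypothetical file names:

```python
def restore_log_standby(database, log_backup, undo_file):
    """Build a RESTORE LOG statement that keeps the database in STANDBY.

    The undo file lets SQL Server roll back uncommitted transactions so the
    database is readable, then reapply them before the next log is restored.
    """
    return (f"RESTORE LOG {database} FROM DISK='{log_backup}' "
            f"WITH STANDBY='{undo_file}'")
```

Each incremental log in the chain would be applied with a statement of this shape until the final log, which uses WITH RECOVERY instead.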
SYMIOCTL with RECOVERY

In situations where there are no incremental transaction log backups available for replay against the full database backup, it may be appropriate to simply restore the last full backup and allow SQL Server to bring the database online. In this case, the end state of the operation is an online, fully accessible database. No additional incremental transaction logs can be applied to this database once recovered, as a new chain of transaction logs is initiated.

As shown in Figure 146, the command to restore a database back to the production server using SYMIOCTL VDI processing is:

symioctl -type SQLServer restore snapshot <database name> SAVEFILE <metadata file> -noprompt

The recovery option is implicit when no other form of recovery is specified.

Figure 146 SYMIOCTL restore and recovery mode

Replication Manager VDI restore

Replication Manager provides automated restore procedures through a graphical user interface to simplify the restore process and to allow for the selection of the appropriate replica if multiple replicas exist. Replication Manager controls the recovery process with respect to the SQL Server instance state for the database, performs unmount and mount operations, and also manages the availability on the production system of the VDI metadata file required to complete the restore operation. This process flow is detailed in Figure 147. As part of the restore request, the administrator is prompted to identify the required state of the database being restored. As with the preceding restore operations, the choices will typically be RECOVERY, NORECOVERY, or STANDBY.

1. RM/Local initiates restore of selected replica
2. RM/Local transfers VDI metadata file to server
3. RM/Local initiates VDI restore
4. Unmount volumes
5. RM/Local restores Mirrors to STD devices
6. Mount volumes
7. RM/Local completes VDI restore
8. After complete restore, Mirrors are disconnected from STDs

Figure 147 Replication Manager/Local restore overview

If Replication Manager was used to create a TimeFinder copy of the database, the restore process must be initiated through the utility. A TimeFinder restore of a Replication Manager copy of a SQL Server database will not be successful.

A more complete discussion of the options available in Replication Manager for SQL Server restoration may be found in the relevant Replication Manager product, user, and administrator guides. Refer to Appendix A, “Related Documents,” for details on locating relevant documentation.

Applying logs up to timestamps or marked transactions

In those instances where it is necessary to recover to a point in time before some event, additional options may be provided to the RESTORE LOG Transact-SQL statement such that processing of log records within the log backup terminates at a given location. In general, either timestamps, which are automatically recorded in the transaction log records, or marked transactions can be utilized. Marked transactions must have been explicitly created on the production database to be used in this manner. The markers placed in the transaction log as a result of executing a marked transaction allow a full database backup to be rolled forward up to the logical point in time when the marked transaction was executed.
Given a full database restore executed in NORECOVERY or STANDBY mode as described in the previous sections, the appropriate Transact-SQL statement may be used to stop processing transaction log records from incremental transaction log backups at the required point.

To restore an incremental transaction log backup and terminate at a point in time recorded in the transaction log records:

RESTORE LOG ProdDB FROM DISK='D:\SQL\tranlog_00001.trn'
WITH RECOVERY, STOPAT = 'Dec 10, 2005 10:30 AM'
GO

To restore an incremental transaction log backup and terminate at a logical point in time referenced by a marked transaction, such as one created in “SQL Server log markers,” use something similar to the following:

RESTORE LOG ProdDB FROM DISK='D:\SQL\tranlog_00001.trn'
WITH RECOVERY, STOPATMARK = 'Before_Update'
GO

7 Microsoft SQL Server Disaster Restart and Disaster Recovery

This chapter presents these topics:

◆ Definitions
◆ Considerations for disaster restart and disaster recovery
◆ Tape-based solutions
◆ Local high-availability solutions
◆ Multisite high-availability solutions
◆ Remote replication challenges
◆ Array-based remote replication
◆ Planning for array-based replication
◆ SQL Server specific issues
◆ SRDF/S: Single Symmetrix to single Symmetrix
◆ SRDF/S and consistency groups
◆ SRDF/A
◆ SRDF/AR single hop
◆ SRDF/AR multi hop
◆ Database log-shipping solutions
◆ Running database solutions
◆ Other transactional systems

A critical part of managing a database is planning for unexpected loss of data. The loss can occur from a disaster like fire or flood, or it can come from hardware or software failures. It can even be caused by human error or malicious intent. In each instance, the database must be restored to some usable point before application services can be resumed. The effectiveness of any plan for restart or recovery involves answering the following questions:

◆ How much down time is acceptable to the business?
◆ How much data loss is acceptable to the business?
◆ How complex is the solution?
◆ Does the solution accommodate the data architecture?
◆ How much does the solution cost?
◆ What disasters does the solution protect against?
◆ Is there protection against logical corruption?
◆ Is there protection against physical corruption?
◆ Is the database restartable or recoverable?
◆ Can the solution be tested?
◆ If failover happens, will failback work?

All restart and recovery plans include a replication component. In its simplest form, the replication process may be as easy as making a tape copy of the database and application.
In a more sophisticated form, it could be real-time replication of all changed data to some remote location. Remote replication of data has its own challenges centered on:

◆ Distance
◆ Propagation delay (latency)
◆ Network infrastructure
◆ Data loss

This chapter provides an introduction to the spectrum of disaster recovery and disaster restart solutions for SQL Server databases on EMC Symmetrix arrays.

Definitions

In the next sections, the terms dependent-write consistency, database restart, database recovery, and roll-forward recovery are used. A clear definition of these terms is required to understand the context of this section.

Dependent-write consistency

A dependent-write I/O is one that cannot be issued until a related predecessor I/O has completed. Dependent-write consistency is a data state where data integrity is guaranteed by dependent-write I/Os embedded in application logic. Database management systems are good examples of the practice of dependent-write consistency.

Database management systems must devise protection against abnormal termination to successfully recover from one. The most common technique used is to guarantee that a dependent-write cannot be issued until a predecessor write has completed. Typically, the dependent-write is a data or index write, while the predecessor write is a write to the log. Because the write to the log must complete before the dependent-write is issued, the application thread is synchronous to the log write; that is, it waits for that write to complete before continuing. The result of this kind of strategy is a dependent-write consistent database.

Database restart

Database restart is the implicit application of database logs during the database's normal initialization process to ensure a transactionally consistent data state.
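The log-before-data ordering can be illustrated with a toy write sequence. The sketch below is not a DBMS implementation; it simply shows that if a crash truncates the I/O stream at any point, every data write present is preceded by its log write, which is what makes the resulting image restartable.

```python
def issue_writes(transactions):
    """Issue dependent writes: the log write always precedes the data write."""
    stream = []
    for txn in transactions:
        stream.append(("log", txn))   # predecessor write: must complete first
        stream.append(("data", txn))  # dependent write: issued only afterward
    return stream

def is_dependent_write_consistent(stream):
    """A crash-truncated stream is consistent if every data write's
    transaction already appears as a log write earlier in the stream."""
    logged = set()
    for kind, txn in stream:
        if kind == "log":
            logged.add(txn)
        elif txn not in logged:
            return False
    return True
```

Truncating the stream after any prefix leaves it consistent, whereas a data write with no preceding log write (the ordering a DBMS forbids) does not.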
If a database is shut down normally, the process of getting to a point of consistency during restart requires minimal work. If the database terminates abnormally, the restart process takes longer, depending on the number and size of in-flight transactions at the time of termination. An image of a running database created by using EMC Consistency technology, without conditioning the database, will be in a dependent-write consistent data state, similar to that created by a local power failure. This is also known as a DBMS restartable image. The restart of this image transforms it to a transactionally consistent data state by completing committed transactions and rolling back uncommitted transactions during the normal database initialization process.

Database recovery

Database recovery is the process of rebuilding a database from a backup image, and then explicitly applying subsequent logs to roll the data state forward to a designated point of consistency. Database recovery is only possible with databases configured with the appropriate level of database logging, typically using the FULL recovery model. A recoverable database copy can be taken in one of three ways:

1. With the database shut down, copying the database components using external tools.
2. With the database running, using the SQL Server VDI or VSS frameworks.
3. Streamed backup using SQL Server BACKUP processes.

Roll-forward recovery

With some databases, it may be possible to take a DBMS restartable image of the database and apply subsequent incremental transaction log backups to roll the database forward to a point in time after the image was created. This means that the image can be used in a backup strategy in combination with transaction log backups. At the time of printing, a DBMS restartable image of a SQL Server database cannot use subsequent logs to roll forward transactions.
In most cases, during a disaster, the storage array image at the remote site will be a SQL Server restartable image and cannot have any incremental transaction log backups applied to it.

Considerations for disaster restart and disaster recovery

Loss of data or loss of application availability has a varying impact from one business type to another. For instance, data loss in the form of transactions for a bank could cost millions, whereas system downtime may not have a major fiscal impact. On the other hand, businesses that are primarily web-based may require 100 percent application availability in order to survive. The two factors, loss of data and loss of uptime, are the business drivers that form the baseline requirements for a DR solution. When quantified, these two factors are more formally known as Recovery Point Objective (RPO) and Recovery Time Objective (RTO), respectively.

When evaluating a solution, the RPO and RTO requirements of the business need to be met. In addition, the solution needs to consider operational complexity, cost, and the ability to return the whole business to a point of consistency. Each of these aspects is discussed in the following sections.

Recovery Point Objective (RPO)

The RPO is a point of consistency to which a user wants to recover or restart. It is measured as the amount of time from when the point of consistency was created or captured to the time the disaster occurred. This time equates to the acceptable amount of data loss. Zero data loss (no loss of committed transactions from the time of the disaster) is the ideal goal, but the high cost of implementing such a solution must be weighed against the business impact and cost of a controlled data loss.

Some organizations, like banks, have zero data loss requirements. The database transactions entered at one location must be replicated immediately to another location.
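Measured this way, the data-loss exposure of a periodic replication scheme is simply the age of the newest usable consistency point at the moment of disaster. A minimal sketch (the timestamps are hypothetical):

```python
from datetime import datetime, timedelta

def data_loss_exposure(consistency_points, disaster_time):
    """Return the RPO exposure: the age of the newest usable consistency
    point (replica, split image, or backup) at the time of the disaster."""
    usable = [t for t in consistency_points if t <= disaster_time]
    if not usable:
        raise ValueError("no consistency point predates the disaster")
    return disaster_time - max(usable)
```

For example, with daily splits taken at 02:00, a disaster at 14:00 exposes twelve hours of committed transactions; the exposure must fit within the agreed RPO.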
This can have an impact on application performance when the two locations are far apart. On the other hand, keeping the two locations close together might not protect against regional disasters like a power outage in the northeastern United States or hurricanes in Florida. Defining the required RPO is usually a compromise between the needs of the business, the cost of the solution, and the risk of a particular event happening.

Recovery Time Objective (RTO)

The RTO is the maximum amount of time allowed for recovery or restart to a specified point of consistency. This time involves many factors, including the time it takes to:

◆ Provision power, utilities, and so on
◆ Provision servers with the application and database software
◆ Configure the network
◆ Restore the data at the new site
◆ Roll forward the data to a known point of consistency
◆ Validate the data

Some delays can be reduced or eliminated by choosing certain DR options, like having a hot site where servers are preconfigured and on standby. Also, if storage-based replication is used, the time it takes to restore the data to a usable state is completely eliminated. As with RPO, each solution for RTO has a different cost profile. Defining the RTO is usually a compromise between the cost of the solution and the cost to the business when database and applications are unavailable.

Operational complexity

The operational complexity of a DR solution may be the most critical factor in determining the success or failure of a DR activity. The complexity of a DR solution can be considered in three separate phases:

1. Initial setup of the implementation.
2. Maintenance and management of the running solution.
3. Execution of the DR plan in the event of a disaster.
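Since the overall recovery time is the sum of these largely sequential phases, it can be budgeted phase by phase. The phase names and durations below are hypothetical; the sketch simply shows how eliminating a phase (for example, the data restore, when storage-based replication is used) shortens the total.

```python
def total_recovery_time(phases):
    """Sum the sequential recovery phases (in hours) that make up the RTO."""
    return sum(phases.values())

phases = {
    "provision power and utilities": 2.0,
    "provision servers and software": 4.0,
    "configure network": 1.0,
    "restore data at new site": 6.0,
    "roll forward logs": 2.0,
    "validate data": 1.0,
}

# With storage-based replication, the restore-from-tape phase disappears.
replicated = {k: v for k, v in phases.items() if k != "restore data at new site"}
```

Comparing `total_recovery_time(phases)` against `total_recovery_time(replicated)` makes the RTO benefit of storage-based replication explicit for a given cost profile.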
While initial configuration complexity and running complexity can be a demand on people resources, the third phase, execution of the plan, is where automation and simplicity must be the focus. When a disaster is declared, key personnel may be unavailable, in addition to the loss of servers, storage, networks, buildings, and so on. If the complexity of the DR solution is such that skilled personnel with an intimate knowledge of all systems involved are required to restore, recover, and validate application and database services, the solution has a high probability of failure.

Multiple database environments grow organically over time into complex federated database architectures. In these federated database environments, reducing the complexity of DR is absolutely critical. Validation of transactional consistency within a complex database architecture is time consuming, costly, and requires application and database familiarity. One reason for the complexity is the heterogeneous databases and operating systems involved in these federated environments. Across multiple heterogeneous platforms, it is hard to establish a common clock and therefore hard to determine a business point of consistency across all platforms. This business point of consistency has to be created from intimate knowledge of the transactions and data flows.

Source server activity

DR solutions may or may not require additional processing activity on the source servers. The extent of that activity can impact both response time and throughput of the production application. This effect should be understood and quantified for any given solution to ensure that the impact to the business is minimized.
The effect for some solutions is continuous while the production application is running; for other solutions, the impact is sporadic, where bursts of write activity are followed by periods of inactivity.

Production impact

Some DR solutions delay the host activity while taking actions to propagate the changed data to another location. This action only affects write activity, and although the introduced delay may only be on the order of a few milliseconds, it can impact response time in a high-write environment. Synchronous solutions introduce delay into write transactions at the source site; asynchronous solutions do not.

Target server activity

Some DR solutions require a target server at the remote location to perform DR operations. The server has both software and hardware costs and needs personnel with physical access to it for basic operational functions like power on and power off. Ideally, this server could have some other use, such as running development or test databases and applications. Some DR solutions require more target server activity and some require none.

Number of copies

DR solutions require replication of data in one form or another. Replication of a database and associated files can be as simple as making a tape backup and shipping the tapes to a DR site, or as sophisticated as asynchronous array-based replication. Some solutions require multiple copies of the data to support DR functions. More copies of the data may be required to perform testing of the DR solution in addition to those that support the DR process.

Distance for solution

Disasters, when they occur, have differing ranges of impact. For instance, a fire may take out a building, an earthquake may destroy a city, or a tidal wave may devastate a region. The level of protection for a DR solution should address the probable disasters for a given location.
For example, when protecting against an earthquake, the DR site should not be in the same locale as the production site. For regional protection, the two sites need to be in two different regions. The distance associated with the DR solution affects the kind of DR solution that can be implemented.

Bandwidth requirements

One of the largest costs for DR is provisioning bandwidth for the solution. Bandwidth costs are an operational expense; this makes solutions with reduced bandwidth requirements very attractive to customers. It is important to recognize in advance the bandwidth consumption of a given solution in order to anticipate the running costs. Incorrect provisioning of bandwidth for DR solutions can adversely affect production performance and can invalidate the overall solution.

Federated consistency

Databases are rarely isolated islands of information with no interaction or integration with other applications or databases. Most commonly, databases are loosely and/or tightly coupled to other databases using triggers, database links, and stored procedures. Some databases provide information downstream for other databases using information-distribution middleware; other databases receive feeds and inbound data from message queues and EDI transactions. The result can be a complex interwoven architecture with multiple interrelationships. This is referred to as a federated database architecture.

With a federated database architecture, making a DR copy of a single database without regard to the other components invites consistency issues and creates logical data integrity problems. All components in a federated architecture need to be recovered or restarted to the same dependent-write consistent point in time to avoid these problems.
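The bandwidth provisioning discussed under "Bandwidth requirements" can be roughed out from the write workload. The sketch below is a back-of-the-envelope estimate only; the peak-to-average ratio is a hypothetical planning factor, not a product recommendation.

```python
def required_bandwidth_mbps(changed_gb_per_day, peak_to_avg_ratio=3.0):
    """Rough link-rate estimate for continuous replication.

    Converts a daily changed-data volume into an average rate in megabits
    per second, then scales by a peak-to-average ratio so that bursts of
    write activity do not overrun the link.
    """
    avg_mbps = changed_gb_per_day * 8 * 1024 / 86400  # GB/day -> Mb/s
    return avg_mbps * peak_to_avg_ratio
```

An estimate like this is only a starting point; actual sizing must account for protocol overhead and the measured write burst profile of the production application.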
With this in mind, it is possible that point database solutions for DR, like log shipping, do not provide the required business point of consistency in a federated database architecture. Federated consistency solutions guarantee that all components, databases, applications, middleware, flat files, and so on are recovered or restarted to the same dependent-write consistent point in time.

Testing the solution

Tested, proven, and documented procedures are also required for a DR solution. Many times, the DR test procedures are operationally different from a true disaster set of procedures. Operational procedures need to be clearly documented. In the best-case scenario, companies should periodically execute the actual set of procedures for DR. This can be costly to the business because of the application downtime required to perform such a test, but it is necessary to ensure the validity of the DR solution.

Cost

The cost of doing DR can be justified by comparing it to the cost of not doing it. What does it cost the business when the database and application systems are unavailable to users? For some companies this is easily measurable, and revenue loss can be calculated per hour of downtime or per hour of data loss. Whatever the business, the DR cost is going to be an extra expense item and, in many cases, with little in return.
The costs include, but are not limited to:

◆ Hardware (storage, servers, and maintenance)
◆ Software licenses and maintenance
◆ Facility leasing/purchase
◆ Utilities
◆ Network infrastructure
◆ Personnel

Tape-based solutions

Tape-based disaster recovery

Traditionally, the most common form of disaster recovery was to make a copy of the database onto tape and, using the Pickup Truck Access Method (PTAM), take the tapes offsite to a hardened facility. In most cases, the database and application needed to be available to users during the backup process. Taking a backup of a running database created a fuzzy image of the database on tape, one that required database recovery after the image had been restored. Recovery usually involved application of logs that were active during the time the backup was in process. These logs had to be archived and kept with the backup image to ensure successful recovery.

The rapid growth of data over the last two decades has made this method unmanageable. Making a hot copy of the database is now the standard, but this method has its own challenges. How can a consistent copy of the database and supporting files be made when they are changing throughout the duration of the backup? What exactly is the content of the tape backup at completion? The reality is that the tape data is a fuzzy image of the disk data, and considerable expertise is required to restore the database back to a database point of consistency. In addition, the challenge of returning the data to a business point of consistency, where a particular database must be recovered to the same point as other databases or applications, is making this solution less viable.
Tape-based disaster restart

Tape-based disaster restart is a more recent development in disaster recovery strategies and is used to avoid the fuzziness of a backup taken while the database and application are running. A restart copy of the system data is created by locally mirroring the disks that contain the production data, and splitting off the mirrors to create a dependent-write consistent point-in-time image on the disks. This image is a DBMS restartable image as previously described. Thus, if this image were restored and the database brought up, the database would perform an implicit recovery to attain transactional consistency. Roll-forward recovery using incremental transaction logs from this database image is not possible for Microsoft SQL Server.

The restartable image on the disks can be backed up to tape and moved offsite to a secondary facility. If this image is created and shipped offsite on a daily basis, the maximum amount of data loss is 24 hours. The time it takes to restore the database is a factor to consider, since reading from tape is typically slow. Consequently, this solution can be effective for customers with relaxed RTOs.

Local high-availability solutions

Microsoft provides a well-known high-availability solution called Windows Failover Clustering. This is an implementation of a shared-disk environment, though in many ways it is also a shared-nothing environment: while storage devices must be accessible to multiple nodes within a given cluster, an individual disk resource cannot be accessible to more than one node at any given time. Furthermore, Microsoft Failover Clustering provides the capability to aggregate resources into resource groups and define dependencies.
In this way, resource groups may be defined that logically describe an application. In the case of a SQL Server cluster instance, the resource group contains a number of resources such as a network name, an IP address, disk resources, and the SQL Server processes themselves. With the ability to define dependencies between resources, it is possible to provide structured start and stop processing to ensure reliable operations. For example, the SQL Server service is dependent on both the disk resources and the network name. The network name in turn depends on the IP address. Thus, the higher-level functions ensure that lower-level services are operational before proceeding. The Microsoft Failover Cluster itself is actually a restart environment at its core. In the event that a node running a SQL Server instance fails, for example, the Cluster Service will attempt to move all resources to a peer node to restart the services. During this restart process, SQL Server services will open the database files on the shared disks, detect the failure state, and execute implicit recovery operations against the database to ensure that a transactionally consistent state is attained. This is identical to the process performed for images created by consistent split processing using EMC consistency technology. Since single-array Microsoft Failover Cluster environments use a single set of shared disks, a disk failure, or any other physical or logical corruption of the disk objects or the data maintained within them, will affect that cluster resource group irrespective of the node on which it is operating. There is no protection against this style of logical or physical failure in single-array solutions.
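The dependency chain just described (the SQL Server service depends on the disk resources and the network name; the network name depends on the IP address) is what gives the cluster a deterministic start order. As a rough illustration only, not using any real cluster API, the start order implied by a set of dependency pairs can be derived with a topological sort; all resource names below are hypothetical:

```shell
# Each input line reads "<dependency> <dependent>": the left resource
# must be online before the right one can be started. tsort emits a
# valid start order honoring every such constraint.
cluster_start_order() {
    tsort <<'EOF'
ip_address network_name
network_name sql_server
disk_data sql_server
disk_log sql_server
EOF
}

cluster_start_order
```

Whatever order tsort chooses among the independent resources, the IP address always precedes the network name, and the SQL Server service is always last, which mirrors how the cluster brings a resource group online.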
Multisite high-availability solutions

To extend the functionality of single-site Windows Failover Cluster configurations and provide additional multisite protection, EMC provides the SRDF®/Cluster Enabler for MSCS geographically dispersed clustering product. This solution extends the single-array Windows Failover Cluster configuration to allow for multiple sites with array-based replication between the two sites. Certification of the solution is provided by Microsoft in a similar manner to single-site Windows Failover Cluster configurations. Certified configurations may be found in the Windows Catalog available from the Microsoft website at http://www.microsoft.com/. With the introduction of Windows Server 2008, Windows Failover Cluster configurations may now be self-certified by customers by running the Cluster Validation Wizard and validating that the cluster can be supported. SRDF/CE for MSCS provides a level of abstraction at the storage layer such that Windows Failover Clustering believes it is executing in a standard mode. All typical Failover Clustering functions, procedures, and restrictions remain in place. The benefit provided by SRDF/CE for MSCS is that it can protect against single-site failure conditions and enhance the RTO for applications by automatically managing recovery of resource groups. It is important to understand the SRDF/CE for MSCS operating modes, and specifically the default behavior in the event of a site failure. The default behavior is No New Onlines, which indicates that surviving nodes may continue to run their current resource groups but will not be able to start new resource groups. Overrides are provided, and are documented in the product guide for SRDF/CE for MSCS.
Geographically dispersed clustering solutions such as SRDF/CE for MSCS based on SRDF/S provide zero data loss with an extremely small RTO, since most processes are automated. EMC also provides the AutoStart™ resource management product, which provides similar functionality for geographically dispersed configurations. AutoStart is not based on Microsoft Failover Clustering, and utilizes its own mechanisms for managing state and ensuring availability of resources. Please refer to the product guide for EMC AutoStart for additional information. Additional details on reference documentation are provided in Appendix A, “Related Documents.”

Remote replication challenges

Replicating database information over long distances for the purpose of disaster recovery is challenging. Synchronous replication over distances greater than 200 km may not be feasible because of the negative impact of propagation delay on write performance; in such cases, some form of asynchronous replication must be adopted. The considerations in this section apply to all forms of remote replication technology, whether array-based, host-based, or managed by the database. Remote replication solutions usually start with copying a full database image to the remote location. This is called instantiation of the database. There are a variety of ways to perform this. After instantiation, only the changes from the source site are replicated to the target site in an effort to keep the target up to date. Some methodologies may not send all of the changes (certain log shipping techniques, for instance), by omission rather than design. These methodologies may require periodic re-instantiation of the database at the remote site.
The following considerations apply to remote replication of databases:

◆ Propagation delay (latency because of distance)
◆ Bandwidth requirements
◆ Network infrastructure
◆ Method of instantiation
◆ Method of re-instantiation
◆ Change rate at the source site
◆ Locality of reference
◆ Expected data loss
◆ Failback operations

Propagation delay

Electronic operations execute at nearly the speed of light. The speed of light in a vacuum is 186,000 miles per second. The speed of light through glass (in the case of fiber optic media) is lower, approximately 115,000 miles per second. In other words, in an optical network such as SONET, it takes about 1 millisecond to send a data packet 125 miles, or 8 milliseconds for 1,000 miles. All remote replication solutions need to be designed with a clear understanding of the impact of propagation delay.

Bandwidth requirements

All remote replication solutions have bandwidth requirements because the changes from the source site must be propagated to the target site. The more changes there are, the greater the bandwidth that is needed. It is the change rate and replication methodology that determine the bandwidth requirement, not necessarily the size of the database. Data compression can help reduce the quantity of data transmitted and therefore the size of the pipe required. Certain network devices, like switches and routers, provide native compression, some in software and some in hardware. GigE directors provide native compression in a VMAX to VMAX SRDF pairing. The amount of compression achieved depends on the type of data being compressed. Typical character and numeric database data compresses at about a 2-to-1 ratio. A good way to estimate how the data will compress is to assess how much tape space is required to store the database during a full backup process.
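The propagation-delay arithmetic above can be checked with a quick calculation. Taking the approximate speed of light in fiber as 115,000 miles per second, the one-way delay for a given distance is simply distance divided by that speed; a synchronous write pays at least one full round trip on top of this. A minimal sketch:

```shell
# One-way propagation delay in milliseconds for a distance in miles,
# assuming light travels about 115,000 miles per second in optical fiber.
one_way_delay_ms() {
    awk -v miles="$1" 'BEGIN { printf "%.1f", miles / 115000 * 1000 }'
}

one_way_delay_ms 125; echo     # about 1.1 ms, matching the 1 ms per 125 miles rule of thumb
one_way_delay_ms 1000; echo    # about 8.7 ms for 1,000 miles
```

These are idealized figures; real latency is higher once switching, routing, and protocol overheads are included.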
Tape drives perform hardware compression on the data before writing it. For instance, if a 300 GB database takes 200 GB of space on tape, the compression ratio is 1.5 to 1. For most customers, a major consideration in the disaster recovery design is cost. It is important to recognize that some components of the end solution represent a capital expenditure and some an operational expenditure. Bandwidth costs are operational expenses, and thus any reduction in this area, even at the cost of some capital expense, is highly desirable.

Network infrastructure

The choice of channel extension equipment, network protocols, switches, and routers ultimately determines the operational characteristics of the solution. EMC has a proprietary BC Design Tool to assist customers in their analysis of the source systems and to determine the network infrastructure required to support a remote replication solution.

Method of instantiation

In all remote replication solutions, a common requirement is for an initial, consistent copy of the complete database to be replicated to the remote site. The initial copy from source to target is called instantiation of the database at the remote site. Following instantiation, only the changes made at the source site are replicated. For large databases, sending only the changes after the initial copy is the only practical and cost-effective approach to remote database replication. In some solutions, instantiation of the database at the remote site uses a process similar to the one that replicates the changes. Some solutions do not provide for instantiation at the remote site at all (log shipping, for instance). In all cases, it is critical to understand the pros and cons of the complete solution.

Method of re-instantiation

Some methods of remote replication require periodic refreshing of the remote system with a full copy of the database.
This is called re-instantiation. Technologies such as log shipping frequently require this, since not all activity on the production database may be represented in the log. In these cases, the disaster recovery plan must account for re-instantiation, and also for the possibility of a disaster occurring during the refresh. The business objectives of RPO and RTO must likewise be met under those circumstances.

Change rate at the source site

After instantiation of the database at the remote site, only changes to the database are replicated remotely. There are many methods of replicating to the remote site, and each has differing operational characteristics. The changes can be replicated using logging technology, or hardware or software mirroring, for example. Before designing a solution with remote replication, it is important to quantify the average change rate. It is also important to quantify the change rate during periods of burst write activity. These periods might correspond to end-of-month/quarter/year processing, billing, or payroll cycles. The solution needs to be designed to allow for peak write workloads.

Locality of reference

Locality of reference is a factor that needs to be measured to understand whether there will be a reduction in bandwidth consumption when any form of asynchronous transmission is used. Locality of reference is a measurement of how much write activity on the source is skewed. For instance, a high locality of reference application may make many updates to a few tables in the database, whereas a low locality of reference application rarely updates the same rows in the same tables during a given period of time.
It is important to understand that while the activity on the tables may have a low locality of reference, the write activity into an index might be clustered when inserted rows have the same or similar index column values, producing a high locality of reference on the index components. In some asynchronous replication solutions, updates are batched into periods of time and sent to the remote site to be applied. In a given batch, only the last image of a given row/block is replicated to the remote site. So, for highly skewed application writes, this results in bandwidth savings. Generally, the longer the period over which updates are batched, the greater the savings on bandwidth. Log shipping technologies do not take locality of reference into account. For example, a row updated 100 times is transmitted 100 times to the remote site, whether the solution is synchronous or asynchronous.

Expected data loss

Synchronous DR solutions are zero data loss solutions; that is to say, there is no loss of committed transactions from the time of the disaster. Synchronous solutions may, however, be impacted by a rolling disaster, in which case work completed at the source site after the rolling disaster started may be lost. Rolling disasters are discussed in detail in a later section. Non-synchronous DR solutions have the potential for data loss. How much data is lost depends on many factors, most of which have been previously defined. For asynchronous replication, where updates are batched and sent to the remote site, the maximum amount of data lost will be two cycles or two batches worth. The two cycles that may be lost are the cycle currently being captured on the source site and the one currently being transmitted to the remote site. With inadequate network bandwidth, data loss could increase because of the increased transmission time.
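The two-cycle rule above gives a simple worst-case estimate of the data-loss window for cycle-based asynchronous replication: the capture cycle plus the transmit cycle. A sketch of that arithmetic, assuming the network keeps pace with the change rate (a longer effective cycle, and therefore more exposure, results when it does not):

```shell
# Worst-case data loss window in seconds for cycle-based asynchronous
# replication: the cycle being captured plus the cycle being transmitted.
max_data_loss_seconds() {
    cycle_seconds="$1"
    echo $(( 2 * cycle_seconds ))
}

max_data_loss_seconds 30    # a 30-second cycle exposes up to 60 seconds of writes
```

So shortening the cycle interval tightens the worst-case RPO, at the cost of less batching and therefore higher bandwidth demand.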
Failback operations

If there is the slightest chance that failover to the DR site may be required, then there is a 100 percent chance that failback to the primary site will also be required, unless the primary site is lost permanently. The DR architecture should be designed in such a way as to make failback simple, efficient, and low-risk. If failback is not planned for, there may be no reasonable or acceptable way to move the processing from the DR site, where the applications may be running on tier 2 servers and tier 2 networks, back to the production site. Ideally, the DR process should be tested once a quarter, with database and application services fully failed over to the DR site. The integrity of the application and database needs to be verified at the remote site to ensure that all required data was copied successfully. Ideally, production services are brought up at the DR site as the ultimate test. This means that production data would be maintained on the DR site, requiring a failback when the DR test completes. While this is not always possible, it is the ultimate test of a DR solution. It not only validates the DR process, but also trains the staff in managing the DR process should a catastrophic failure ever occur. The downside of this approach is that duplicate sets of servers and storage need to be present to make the test effective and meaningful. This tends to be an expensive proposition.

Array-based remote replication

Customers can use the capabilities of a Symmetrix storage array to replicate the database from the production location to a secondary location. No host CPU cycles are used for this, leaving the host dedicated to running the production application and database.
In addition, no host I/O is required to facilitate this; the array takes care of all replication, and no hosts are required at the target location to manage the target array. EMC provides multiple solutions for remote replication of databases:

◆ SRDF/S: SRDF/Synchronous
◆ SRDF/A: SRDF/Asynchronous
◆ SRDF/AR: SRDF/Automated Replication

Each of these solutions is discussed in detail in the next sections. In order to use any of the array-based solutions, it is necessary to coordinate the disk layout of the databases with this kind of replication in mind.

Planning for array-based replication

All Symmetrix solutions replicating data from one array to another are disk based. This allows the Symmetrix to be agnostic to the volume manager, file system, database system, and so on. However, this does not mean that file system and volume manager concerns can be ignored. It is important to understand the relationship of Windows NT File System (NTFS) volumes to LUNs. On Windows, the smallest unit of granularity for storage-based replication is the LUN, a volume set, or a disk group, depending on how the disks are set up in Disk Manager. In addition, if a database is to be replicated independently of other databases, it should have its own dedicated disks (LUNs). That is, the disks used by a database should not be shared with other applications or databases. When a set of volumes has been defined for a database for remote replication, care must be taken to ensure that the disks contain everything that is needed to restart the database at the remote site. For SQL Server databases, this must be the complete set of data and transaction log files comprising the database. Administrators should ensure that the same versions of SQL Server, including service packs and hot fixes, are installed on both the production and target servers.
In instances where the application binaries are replicated along with the databases, this requirement is satisfied by the replication process itself.

SQL Server specific issues

A number of issues need to be addressed in configurations that utilize remote replication technologies, specifically those that are designed to replicate only a given SQL Server database. One of the most important considerations is that of security, access, and authentication information. SQL Server maintains security and authentication information within the master database. If the replication solution does not include replicating the master database and the other system databases, then some process must exist that adequately replicates this information. Should this authentication information not be replicated, it is possible to reach a state where the copy of the user database is made available, but users are unable to access the copied database because of authentication failures.

SRDF/S: Single Symmetrix to single Symmetrix

SRDF/Synchronous, or SRDF/S, is a method of replicating production data changes between locations that are no greater than 200 km apart. Synchronous replication takes writes that are inbound to the source Symmetrix and copies them to the target Symmetrix. The write operation is not acknowledged as complete to the host until both Symmetrix arrays have the data in cache. It is important to realize that while the following examples involve Symmetrix, the fundamentals of synchronous replication described here are true for all synchronous replication solutions. Figure 148 depicts the process:

1. A write is received into the source Symmetrix cache. At this time, the host has not received acknowledgement that the write is complete.

2.
The source Symmetrix uses SRDF/S to push the write to the target Symmetrix.

3. The target Symmetrix sends an acknowledgement back to the source that the write was received.

4. Ending status of the write is presented to the host.

These four steps cause a delay in the processing of writes as perceived by the database on the source server. The amount of delay depends on the exact configuration of the network, the storage, the write block size, and the distance between the two locations.

Note: Reads from the source Symmetrix are not affected by the replication.

[Figure 148: SRDF/S replication process. 1. Write is issued from the host/server into the cache of the source array. 2. Write is transmitted to the cache of the target array. 3. Receipt of the write is acknowledged to the source array. 4. Acknowledgement of the write is returned to the host/server.]

The following steps outline the process of setting up synchronous replication using Solutions Enabler (SYMCLI) commands:

1. Before the synchronous mode of SRDF can be established, initial instantiation of the database has to take place. In other words, a baseline full copy of all the volumes that are going to participate in the synchronous replication must be executed first. This is usually accomplished using the adaptive copy mode of SRDF. The following command creates a group, where <device_group> is a user-specified name:

symdg create <device_group> -type rdf1

The type of the device group depends on the location of the system being used to define the device group. The rdf1 type is used if the host is connected locally to the Source/R1 array.

2. Add devices to the group as described in Appendix B, “References.”

3. The following command puts the group into adaptive copy mode:

symrdf -g <device_group> set mode acp_disk -noprompt

4.
The following command causes the source Symmetrix to send all the tracks on the source site to the target site using the current mode:

symrdf -g <device_group> establish -full -noprompt

The adaptive copy mode of SRDF has no impact on host application performance. It transmits tracks to the remote site that have never been sent before or that have changed since the last time the track was sent. It does not preserve write order or dependent-write consistency.

5. When both sides are synchronized, SRDF can then be put into synchronous mode. The following command puts the device group into synchronous mode:

symrdf -g <device_group> set mode sync -noprompt

Note: There is no requirement for host availability at the remote site during the synchronous replication. The target Symmetrix itself manages the in-bound writes and updates the appropriate volumes in the array. Dependent-write consistency is inherent in a synchronous relationship, as the target R2 volumes are at all times equal to the source, provided that a single RA group is used. If multiple RA groups are used, or if multiple Symmetrix arrays are used on the source site, SRDF/Consistency Groups (SRDF/CG) must be used to guarantee consistency.

How to restart in the event of production site loss

In the event of a disaster where the primary source Symmetrix is lost, it becomes necessary to run database and application services from the DR site. A host at the DR site is required for this. The first requirement is to write enable the R2 devices. If the device group is not yet built on the remote host, it must be created using the R2 devices that were mirrors of the R1 devices on the source Symmetrix. Group Name Services (GNS) can be used to propagate the device group to the remote site if there is a host being utilized there.
For more details on GNS, see the Solutions Enabler Symmetrix Base Management CLI Product Guide. The following command write enables the R2 devices in a group:

symld -g <device_group> rw_enable -noprompt

At this point, the host can issue the necessary commands to access the disks. Once the data is available to the host, the database can be restarted. The database will perform an implicit recovery when restarted. Transactions that were committed but not completed are rolled forward and completed using the information in the transaction log. Transactions that had updates applied to the database but were not committed are rolled back. The result is a transactionally consistent database.

SRDF/S and consistency groups

Zero data loss disaster recovery techniques tend to use straightforward database and application restart procedures. These procedures work well if all processing and data mirroring at the production site stop at the same instant in time when a disaster happens, as is the case with a site power failure. However, in most cases, it is unlikely that all data processing ceases at an instant in time. Computing operations can be measured in nanoseconds, and even if a disaster takes only a millisecond to complete, many such computing operations could be completed between the start of a disaster and the point when all data processing ceases. This gives us the notion of a rolling disaster. A rolling disaster is a series of events taking place over a period of time that comprise a true disaster. The specific period of time that makes up a rolling disaster could be milliseconds (in the case of an explosion) or minutes (in the case of a fire). In both cases, the DR site must be protected against data inconsistency.
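The data inconsistency this section guards against reduces to one ordering rule: at the target, the data image must never reflect a write whose predecessor log write is missing. A toy check of that invariant, using hypothetical replicated sequence numbers rather than any real DBMS or SRDF structure, can make the "data ahead of log" condition concrete:

```shell
# Dependent-write invariant: the newest write sequence number present on
# the replicated data devices must not exceed the newest sequence number
# present on the replicated log devices. Sequence numbers are made up
# for illustration.
check_dependent_write_consistency() {
    log_seq="$1"    # newest log write that reached the target
    data_seq="$2"   # newest data write that reached the target
    if [ "$data_seq" -gt "$log_seq" ]; then
        echo "INCONSISTENT: data ahead of log"
    else
        echo "CONSISTENT"
    fi
}

check_dependent_write_consistency 100 99     # log ahead of data: restartable image
check_dependent_write_consistency 100 101    # data ahead of log: not restartable
```

A log that is ahead of the data is harmless, since implicit recovery rolls the database forward or back from the log; the reverse ordering is what restart processing cannot repair.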
Rolling disaster

Protection against a rolling disaster is required when the data for a database resides on more than one Symmetrix array or on multiple RA groups. Figure 149 depicts a dependent-write I/O sequence where a predecessor log write happens before a page flush from a database buffer pool. The log device and data device are on different Symmetrix arrays with different replication paths. Figure 149 demonstrates how rolling disasters can affect this dependent-write sequence:

1. This example of a rolling disaster starts with a loss of the synchronous links between the bottom source Symmetrix and the target Symmetrix. This prevents the remote replication of data on the bottom source Symmetrix.

2. The Symmetrix that is no longer replicating receives a predecessor log write of a dependent-write I/O sequence. The local I/O is completed; however, it is not replicated to the remote Symmetrix, and the tracks are marked as being owed to the target Symmetrix. Nothing prevents the predecessor log write from completing to the host, completing the acknowledgement process.

3. Now that the predecessor log write has completed, the dependent data write is issued. This write is received on both the source Symmetrix and the target Symmetrix, because the rolling disaster has not yet affected those communication links.

4. If the rolling disaster ends in a complete disaster, the condition of the data at the remote site is such that it creates a data-ahead-of-log condition, which is an inconsistent state for a database. The severity of the situation is that when the database is restarted, performing an implicit recovery, it may not detect the inconsistencies. A person extremely familiar with the transactions running at the time of the rolling disaster might be able to detect the inconsistencies.
Database utilities could also be run to detect some of the inconsistencies. A rolling disaster can happen in such a manner that the data links providing remote mirroring support are disabled in a staggered fashion, while application and database processing continues at the production site. The sustained replication during the time when some Symmetrix units are communicating with their remote partners through their respective links while other Symmetrix units are not (because of link failures) can cause data integrity exposure at the recovery site. Some data integrity problems caused by a rolling disaster cannot be resolved through normal database restart processing and may require a full database recovery using appropriate backups, journals, and logs. A full database recovery elongates overall application restart time at the recovery site.

[Figure 149: Rolling disaster with multiple production Symmetrix arrays. X = DBMS data, Y = application data, Z = logs. 1. Rolling disaster begins. 2. Log write. 3. Dependent data write. 4. Inconsistent data (data ahead of log).]

Protecting against a rolling disaster

SRDF/Consistency Group (SRDF/CG) technology provides protection against rolling disasters. A consistency group is a set of Symmetrix volumes spanning multiple RA groups and/or multiple Symmetrix frames that replicate as a logical group to other Symmetrix arrays using synchronous SRDF. It is not a requirement to span multiple RA groups and/or Symmetrix frames when using consistency groups. Consistency group technology guarantees that if a single source volume is unable to replicate to its partner for any reason, then all the volumes in the group stop replicating. This ensures that the image of the data on the target Symmetrix is consistent from a dependent-write perspective.
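The guarantee just stated is all-or-nothing: if any member volume of the group cannot replicate, replication is suspended for every member, which is the "trip" described in the next steps. A minimal sketch of that decision, with made-up volume states rather than real SRDF status queries:

```shell
# Consistency-group trip logic: if any member volume has lost its
# replication link, suspend the R1/R2 relationship for all members,
# preserving dependent-write consistency at the target.
congroup_action() {
    for state in "$@"; do
        if [ "$state" != "ok" ]; then
            echo "trip: suspend all R1/R2 pairs"
            return 0
        fi
    done
    echo "continue replicating"
}

congroup_action ok ok ok        # all links healthy: replication continues
congroup_action ok failed ok    # one failed link trips the whole group
```

The point of suspending everything, rather than only the failed volume, is that a partially replicated dependent-write sequence is worse than a slightly older but consistent image.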
Figure 150 depicts a dependent-write I/O sequence where a predecessor log write happens before a page flush from a database buffer pool. The log device and data device are on different Symmetrix arrays with different replication paths. Figure 150 demonstrates how rolling disasters can be prevented using EMC consistency group technology:

[Figure 150: SRDF consistency group protection against a rolling disaster. X = DBMS data, Y = application data, Z = logs. 1. ConGroup protection. 2. Rolling disaster begins. 3. Log write. 4. ConGroup "trip". 5. Suspend R1/R2. 6. Dependent data write. 7. Dependent-write consistent.]

1. Consistency group protection is defined containing volumes X, Y, and Z on the source Symmetrix. This consistency group definition must contain all of the devices that need to maintain dependent-write consistency, and must reside on all participating hosts involved in issuing I/O to these devices. A mix of CKD (mainframe) and FBA (UNIX/Windows) devices can be logically grouped together. In some cases, the entire processing environment may be defined in a consistency group to ensure dependent-write consistency.

2. The rolling disaster previously described begins, preventing the replication of changes from volume Z to the remote site.

3. The predecessor log write occurs to volume Z, causing a consistency group (ConGroup) trip.

4. A ConGroup trip holds the I/O that could not be replicated, along with all of the I/O to the logically grouped devices.
The I/O is held by PowerPath on the UNIX or Windows hosts, and by IOS on the mainframe host. It is held long enough to issue two (2) I/Os per Symmetrix. The first I/O puts the devices in a suspend-pending state.

5. The second I/O performs the suspend of the R1/R2 relationship for the logically grouped devices, which immediately disables all replication to the remote site. This allows other devices outside of the group to continue replicating, provided the communication links are available.

6. After the R1/R2 relationship is suspended, all deferred write I/Os are released, allowing the predecessor log write to complete to the host. The dependent data write is issued by the DBMS and arrives at X, but is not replicated to R2(X).

7. If a complete failure occurred from this rolling disaster, dependent-write consistency at the remote site is preserved. If a complete disaster did not occur and the failed links are activated again, the consistency group replication can be resumed once synchronous mode is achieved. It is recommended to create a copy of the dependent-write consistent image while the resume takes place. Once the SRDF process reaches synchronization, the dependent-write consistent copy is achieved at the remote site.

SRDF/S with multiple source Symmetrix arrays

The implications of spreading a database across multiple Symmetrix frames or across multiple RA groups and replicating in synchronous mode were discussed in previous sections. The challenge in this type of scenario is to protect against a rolling disaster. SRDF consistency groups can be used to avoid data corruption in a rolling disaster situation. Consider the architecture depicted in Figure 151.
[Figure 151: SRDF/S with multiple source Symmetrix and ConGroup protection — a SQL Server host writes data and log volumes spread across multiple source/R1 arrays, each replicating synchronously to a corresponding target/R2 array.]

To protect against a rolling disaster, a consistency group can be created that encompasses all the volumes on all Symmetrix arrays participating in replication, as shown by the dotted oval in the figure. The following steps outline the process of setting up synchronous replication with consistency groups using Solutions Enabler (SYMCLI) commands:

1. To create a consistency group for the source side of the synchronous relationship, that is, the R1 side:

   symcg create <composite_group> -type rdf1 -ppath

2. Add devices to the group as described in Appendix B, "References."

3. Before the synchronous mode of SRDF can be established, the initial instantiation of the database has to have taken place. In other words, the baseline full copy of all the volumes that are going to participate in the synchronous replication must be executed first. This is usually accomplished using the adaptive copy mode of SRDF.

4. The following command puts the consistency group into adaptive copy mode:

   symrdf -cg <composite_group> set mode acp_disk -noprompt

5. The following command causes the source Symmetrix to send all tracks at the source site to the target site using the current mode:

   symrdf -cg <composite_group> establish -full -noprompt

6. Adaptive copy mode has no host impact. It transmits tracks to the remote site that have never been sent before or that have changed since the last time the track was sent. It does not preserve write order or consistency. When both sides are synchronized, SRDF can be put into synchronous mode.
The following command puts the consistency group into synchronous mode:

   symrdf -cg <composite_group> set mode sync -noprompt

7. To enable consistency protection, use the following command:

   symcg -cg <composite_group> enable -noprompt

Note: There is no requirement for a host at the remote site during synchronous replication. The target Symmetrix manages the in-bound writes and updates the appropriate disks in the array.

SRDF/A

SRDF/A, or SRDF/Asynchronous, is a method of replicating production data changes from one Symmetrix to another using delta set technology. Delta sets are collections of changed blocks grouped together by a time interval that can be configured at the source site; the default interval is 30 seconds. The delta sets are transmitted from the source site to the target site in the order in which they were created. SRDF/A preserves the dependent-write consistency of the database at the remote site at all times. The distance between the source and target Symmetrix is unlimited, and there is no host impact: writes are acknowledged immediately when they hit the cache of the source Symmetrix. SRDF/A is available only on the DMX and VMAX families of Symmetrix. Figure 152 on page 375 depicts the process:

[Figure 152: SRDF/Asynchronous replication internals — writes move through four delta sets: capture (state N) and transmit (N-1) on the source, receive (N-1) and apply (N-2) on the target. Legend: 1. CAPTURE delta set collects application write I/Os; 2. Delta set switch: dependent-write consistent state; 3. TRANSMIT delta set sends the N-1 set of I/Os to the RECEIVE delta set on the target; 4. APPLY delta set: after transmit is complete, data is applied to disk; 5. Cycle repeats.]

1. Writes are received into the source Symmetrix cache.
The host receives immediate acknowledgement that the write is complete. Writes are gathered into the capture delta set for 30 seconds.

2. A delta set switch occurs: the current capture delta set becomes the transmit delta set by changing a pointer in cache, and a new, empty capture delta set is created.

3. SRDF/A sends the changed blocks in the transmit delta set to the remote Symmetrix, where they collect in the receive delta set. When replication of the transmit delta set is complete, another delta set switch occurs: a new, empty capture delta set is created, the current capture delta set becomes the new transmit delta set, and the receive delta set becomes the apply delta set.

4. The apply delta set marks all the changes in the delta set against the appropriate volumes as invalid tracks and begins destaging the blocks to disk.

5. The cycle repeats continuously.

With sufficient bandwidth for the source database write activity, SRDF/A transmits all changed data within the default 30 seconds. This means that the maximum time the target data will be behind the source is 60 seconds (two replication cycles). At times of high write activity, it may not be possible to transmit all the changes that occur during a 30-second interval, and the target Symmetrix will fall behind the source by more than 60 seconds. Careful design of the SRDF/A infrastructure and a thorough understanding of write activity at the source site are necessary to design a solution that meets the RPO requirements of the business at all times.

Consistency is maintained throughout the replication process on a delta set boundary. The Symmetrix will not apply a partial delta set, which would invalidate consistency. Dependent-write consistency is preserved by placing a dependent write in either the same delta set as the write it depends on or a subsequent delta set.
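The delta-set mechanics in steps 1 through 5 can be sketched as a toy pipeline. The class below is purely illustrative (not an EMC interface); it models why a dependent write can never be applied at the target ahead of its predecessor: a write lands in the capture set of the cycle in which it occurs, and a delta set is applied only whole and only in cycle order.

```python
# Conceptual sketch of the SRDF/A delta-set cycle described above.
# The four-stage capture/transmit/receive/apply pipeline follows the
# text; the class itself is illustrative, not an EMC API.

class SRDFASession:
    def __init__(self):
        self.capture = []      # N:   collecting new writes at the source
        self.transmit = []     # N-1: being sent across the link
        self.receive = []      # N-1: accumulating at the target
        self.target_disk = []  # N-2: applied, dependent-write consistent

    def write(self, block):
        self.capture.append(block)  # host gets immediate acknowledgement

    def cycle_switch(self):
        # Switch only when the previous transmit delta set has fully
        # drained; a partial delta set is never applied at the target.
        assert not self.transmit
        self.target_disk.extend(self.receive)   # receive -> apply
        self.receive = []
        self.transmit, self.capture = self.capture, []

    def transmit_all(self):
        # Send the whole transmit delta set to the target's receive set.
        self.receive.extend(self.transmit)
        self.transmit = []

s = SRDFASession()
s.write("log: commit T1")      # predecessor write, cycle N
s.cycle_switch()               # T1's commit is now in transmit (N-1)
s.write("data: page for T1")   # dependent write lands in a later delta set
s.transmit_all()
s.cycle_switch()               # commit applied; page now in transmit

# The dependent write is never applied ahead of its predecessor.
assert "log: commit T1" in s.target_disk
assert "data: page for T1" not in s.target_disk
```

Because a dependent write can only enter the same delta set as its predecessor or a later one, applying delta sets whole and in order is sufficient to keep the target dependent-write consistent at every cycle boundary.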
Note: There is no requirement for a host at the remote site during asynchronous replication. The target Symmetrix manages in-bound writes and updates the appropriate disks in the array.

Different command sets are used to enable SRDF/A depending on whether the SRDF/A group of devices is contained within a single Symmetrix or is spread across multiple Symmetrix arrays.

SRDF/A using single source Symmetrix

Before the asynchronous mode of SRDF can be established, initial instantiation of the database has to take place. In other words, a baseline full copy of all the volumes that are going to participate in the asynchronous replication must be executed first. This is usually accomplished using the adaptive copy mode of SRDF. The following steps outline the process of setting up asynchronous replication using Solutions Enabler (SYMCLI) commands:

1. To create an SRDF device group for the source side of the asynchronous relationship, that is, the R1 side:

   symdg create <device_group> -type rdf1

2. Add devices to the group as described in Appendix B, "References."

3. The following command puts the device group into adaptive copy mode:

   symrdf -g <device_group> set mode acp_disk -noprompt

4. The following command causes the source Symmetrix to send all the tracks at the source site to the target site using the current mode:

   symrdf -g <device_group> establish -full -noprompt

5. The adaptive copy mode of SRDF has no impact on host application performance. It transmits tracks to the remote site that have never been sent before or that have changed since the last time the track was sent. It does not preserve write order or consistency. When both sides are synchronized, SRDF can be put into asynchronous mode.
The following command puts the device group into asynchronous mode:

   symrdf -g <device_group> set mode async -noprompt

Note: There is no requirement for a host at the remote site during asynchronous replication. The target Symmetrix manages the in-bound writes and updates the appropriate disks in the array.

SRDF/A using multiple source Symmetrix

When a database is spread across multiple Symmetrix arrays and SRDF/A is used for long-distance replication, separate software must be used to coordinate the delta set boundaries between the participating Symmetrix arrays and to stop replication if any of the volumes in the group cannot replicate for any reason. The software must ensure that all delta set boundaries on every participating Symmetrix in the configuration are coordinated to give a dependent-write consistent point-in-time image of the database.

SRDF/A multisession consistency (MSC) provides this consistency across multiple RA groups and/or multiple Symmetrix arrays. MSC is available with the 5671 microcode and later, together with Solutions Enabler V6.0 and later. SRDF/A with MSC is supported by an SRDF process daemon that performs cycle-switching and cache recovery operations across all SRDF/A sessions in the group. This ensures that a dependent-write consistent R2 copy of the database exists at the remote site at all times. A composite group must be created using the SRDF consistency protection option (-rdf_consistency) and must be enabled using the symcg enable command before the RDF daemon begins monitoring and managing the MSC consistency group. The RDF process daemon must be running on all hosts that can write to the set of SRDF/A volumes being protected. At the time of an interruption (an SRDF link failure, for instance), MSC analyzes the status of all SRDF/A sessions and either commits the last cycle of data to the R2 targets or discards it.
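The commit-or-discard rule can be expressed compactly. The function below is a conceptual illustration of the decision just described, not EMC's implementation: at an interruption, the last delta-set cycle is committed at the R2 side only if every SRDF/A session in the group received it completely; otherwise the whole cycle is discarded, leaving every array at the previous consistent boundary.

```python
# Illustrative sketch of the MSC decision described above: the last
# cycle is committed only when ALL sessions in the group hold it
# completely. A single incomplete session invalidates the cycle for
# everyone, because applying it on some arrays but not others would
# break dependent-write consistency across the group.

def resolve_last_cycle(sessions):
    """sessions maps session name -> True if its last cycle was fully
    received at the target, False otherwise."""
    if all(sessions.values()):
        return "commit"   # every array holds the cycle: safe to apply
    return "discard"      # roll everyone back to the prior boundary

assert resolve_last_cycle({"array1": True, "array2": True}) == "commit"
assert resolve_last_cycle({"array1": True, "array2": False}) == "discard"
```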
The following steps outline the process of setting up asynchronous replication with MSC consistency protection using Solutions Enabler (SYMCLI) commands:

1. The replication composite group for the SRDF/A devices can be created using the command:

   symcg create <composite_group> -rdf_consistency -type rdf1

   The -rdf_consistency option indicates that the volumes in the group are to be protected by MSC.

2. Add devices to the group as described in Appendix B, "References."

3. Before the asynchronous mode of SRDF can be established, the initial copy of the database has to take place. In other words, the baseline full copy of all the volumes that are going to participate in the asynchronous replication must be executed first. This is usually accomplished using the adaptive copy mode of SRDF. The following command puts the composite group into adaptive copy mode:

   symrdf -cg <composite_group> set mode acp_disk -noprompt

4. The following command causes the source Symmetrix to send all the tracks at the source site to the target site using the current mode:

   symrdf -cg <composite_group> establish -full -noprompt

5. The adaptive copy mode of SRDF has no impact on host application performance. It transmits tracks to the remote site that have never been sent before or that have changed since the last time the track was sent. It does not preserve write order or consistency. When both sides are synchronized, SRDF can be put into asynchronous mode. The following command puts the composite group into asynchronous mode:

   symrdf -cg <composite_group> set mode async -noprompt

6. To enable multisession consistency for the group, execute the following command:

   symcg -cg <composite_group> enable

Note: There is no requirement for a host at the remote site during asynchronous replication.
The target Symmetrix manages the in-bound writes and updates the appropriate disks in the array.

Restart processing

In the event of a disaster in which the primary source Symmetrix is lost, database and application services must be run from the DR site, and a host at the DR site is required for this. If the device or composite group is not yet defined on the remote host, it must first be created using the R2 devices that were the mirrors of the R1 devices on the source Symmetrix.

The first step is to write-enable the R2 devices. For R2s on a single Symmetrix:

   symld -g <device_group> rw_enable -noprompt

For R2s on multiple Symmetrix arrays:

   symcg -cg <composite_group> rw_enable -noprompt

At this point, the host can issue the necessary commands to access the disks, including the steps required to mount the volumes as mount points or drive letters. Once the data is available to the host, the database can be restarted. The database performs crash recovery when it is attached: transactions that were committed but not completed are rolled forward and completed using the information in the active logs, and transactions that had updates applied to the database but were not committed are rolled back. The result is a transactionally consistent database.

SRDF/AR single hop

SRDF/Automated Replication, or SRDF/AR, is a continuous movement of dependent-write consistent data to a remote site using SRDF adaptive copy mode and TimeFinder consistent split technology. TimeFinder BCVs are used to create a dependent-write consistent point-in-time image of the data to be replicated. The BCVs also have an R1 personality, which means that SRDF in adaptive copy mode can be used to replicate the data from the BCVs to the target site.
Since the BCVs are not changing, replication completes in a finite length of time. The length of time depends on the size of the network pipe between the two locations, the distance between the two locations, the quantity of changed data tracks, and the locality of reference of the changed tracks. On the remote Symmetrix, another BCV copy of the data is made from the R2s. This is necessary because the next SRDF/AR iteration replaces the R2 image in a nonordered fashion; if a disaster were to occur while the R2s were synchronizing, there would not otherwise be a valid copy of the data at the DR site. This BCV copy of the data in the remote Symmetrix is commonly called the gold copy of the data. The whole process then repeats. With SRDF/AR, there is no host impact: writes are acknowledged immediately when they hit the cache of the source Symmetrix. Figure 153 on page 382 depicts the process:

[Figure 153: SRDF/AR single-hop replication internals — STD volumes are paired with BCV/R1s in the source array; the BCV/R1s replicate to R2s in the target array, which are paired with gold-copy BCVs. Legend: 1. Consistent split on source; 2. SRDF mirroring resumed; 3. Incremental establish initiated on both source and target BCVs; 4. BCV split on target; 5. Cycle repeats based on cycle parameters.]

1. Writes are received into the source Symmetrix cache and are acknowledged immediately. The BCVs are already synchronized with the STDs at this point. A consistent split is executed against the STD-BCV pairing to create a point-in-time image of the data on the BCVs.

2. SRDF transmits the data on the BCV/R1s to the R2s in the remote Symmetrix.

3. When the BCV/R1 volumes are synchronized with the R2 volumes, they are reestablished with the standards in the source Symmetrix. This causes the SRDF links to be suspended. At the same time, an incremental establish is performed on the target Symmetrix to create a gold copy on the BCVs in that frame.

4. When the BCVs in the remote Symmetrix are fully synchronized with the R2s, they are split, and the configuration is ready to begin another cycle.

5. The cycle repeats based on configuration parameters. The parameters can specify that cycles begin at specific times, at specific intervals, or back to back.

Note: Cycle times for SRDF/AR are usually in the minutes-to-hours range. The RPO is double the cycle time in a worst-case scenario, which may be a good fit for customers with relaxed RPOs. An added benefit of a longer cycle time is that the locality of reference will likely increase, because there is a much greater chance of a track being updated more than once in a one-hour interval than in, say, a 30-second interval. The increase in locality of reference shows up as reduced bandwidth requirements for the final solution.

Before SRDF/AR can be started, instantiation of the database has to take place. In other words, a baseline full copy of all the volumes that are going to participate in the SRDF/AR replication must be executed first. This means a full establish to the BCVs in the source array, a full SRDF establish of the BCV/R1s to the R2s, and a full establish of the R2s to the BCVs in the target array are required. There is an option to automate the initial setup of the relationship.

As with other SRDF solutions, SRDF/AR does not require a host at the DR site. The commands to update the R2s and manage the synchronization of the BCVs at the remote site are all managed in-band from the production site. Refer to Appendix A, "Related Documents," for details on locating the SRDF/AR solutions guide.

Restart processing

In the event of a disaster, it is necessary to determine whether the most current copy of the data is located on the BCVs or on the R2s at the remote site.
Depending on when in the replication cycle the disaster occurs, the most current version could be on either set of disks.

SRDF/AR multi hop

SRDF/Automated Replication multi-hop, or SRDF/AR multi-hop, is an architecture that allows long-distance replication with zero seconds of data loss through the use of a bunker Symmetrix. Production data is replicated synchronously to the bunker Symmetrix, which is within 200 km of the production Symmetrix: close enough to allow synchronous replication, but far enough away that potential disasters at the primary site may not affect it. Typically, the bunker Symmetrix is placed in a hardened computing facility. BCVs in the bunker frame are periodically synchronized to the R2s and consistently split to provide a dependent-write consistent point-in-time image of the data. These bunker BCVs also have an R1 personality, which means that SRDF in adaptive copy mode can be used to replicate the data from the bunker array to the target site.

Since the BCVs are not changing, the replication can be completed in a finite length of time. The length of time depends on the size of the network pipe between the bunker location and the DR location, the distance between the two locations, the quantity of changed data, and the locality of reference of the changed data. On the remote Symmetrix, another BCV copy of the data is made from the R2s. This is because the next SRDF/AR iteration replaces the R2 image in a nonordered fashion; if a disaster were to occur while the R2s were synchronizing, there would not otherwise be a valid copy of the data at the DR site. This BCV copy of the data in the remote Symmetrix is commonly called the gold copy of the data. The whole process then repeats. With SRDF/AR multi-hop, there is minimal host impact.
Writes are acknowledged to the host only when they hit the cache of the bunker Symmetrix and a positive acknowledgment is returned to the source Symmetrix. Figure 154 on page 386 depicts the process:

1. BCVs are synchronized with, and consistently split against, the R2s in the bunker Symmetrix. Write activity is momentarily suspended on the source Symmetrix to get a dependent-write consistent point-in-time image on the R2s in the bunker Symmetrix, which creates a dependent-write consistent point-in-time copy of the data on the BCVs.

2. SRDF transmits the data on the bunker BCV/R1s to the R2s in the DR Symmetrix.

3. When the BCV/R1 volumes are synchronized with the R2 volumes in the target Symmetrix, the bunker BCV/R1s are established again with the R2s in the bunker Symmetrix. This causes the SRDF links between the bunker Symmetrix and the DR Symmetrix to be suspended. At the same time, an incremental establish is performed on the DR Symmetrix to create a gold copy on the BCVs in that frame.

4. When the BCVs in the DR Symmetrix are fully synchronized with the R2s, they are split, and the configuration is ready to begin another cycle.

5. The cycle repeats based on configuration parameters. The parameters can specify that cycles begin at specific times, at specific intervals, or immediately after the previous cycle completes.

It should be noted that even though cycle times for SRDF/AR multi-hop are usually in the minutes-to-hours range, the most current data is always in the bunker Symmetrix. Unless there is a regional disaster that destroys both the primary site and the bunker site, the bunker Symmetrix will transmit all data to the remote DR site. This means zero data loss up to the point at which the rolling disaster began, that is, an RPO of 0 seconds. This solution is a good fit for customers who require both zero data loss and long-distance DR.
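The bandwidth effect of cycle time and locality of reference in these cycle-based solutions can be illustrated with a rough calculation. All numbers below are hypothetical (including the assumed 64 KB track size); the function is a back-of-the-envelope sketch, not a sizing tool.

```python
# Rough illustration of the locality-of-reference effect: with longer
# cycle times, a track rewritten several times within one cycle is
# transmitted only once, reducing the bandwidth needed to drain the
# cycle. All numbers are hypothetical.

TRACK_KB = 64  # assumed track size for this example only

def required_bandwidth_mbps(writes_per_sec, cycle_secs, rewrite_factor):
    """Bandwidth (Mbit/s) needed to send one cycle's unique tracks
    within one cycle. rewrite_factor is the average number of times a
    dirty track is rewritten inside the cycle (locality of reference)."""
    unique_tracks = writes_per_sec * cycle_secs / rewrite_factor
    return unique_tracks * TRACK_KB * 8 / 1024 / cycle_secs

# Same write rate; short cycle with little re-writing vs. a one-hour
# cycle where hot tracks are rewritten about four times on average.
short = required_bandwidth_mbps(1000, 30, rewrite_factor=1.2)
long_ = required_bandwidth_mbps(1000, 3600, rewrite_factor=4.0)
assert long_ < short  # longer cycles -> fewer unique tracks per second
```

The cycle length itself cancels out of the formula; what lowers the bandwidth requirement is the higher rewrite factor that longer cycles tend to produce, which is exactly the locality-of-reference argument made in the text.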
An added benefit of a longer cycle time is that the locality of reference will likely increase, because there is a much greater chance of a track being updated more than once in a one-hour interval than in, say, a 30-second interval. The increase in locality of reference shows up as reduced bandwidth requirements for the network segment between the bunker Symmetrix and the DR Symmetrix.

Before SRDF/AR can be initiated, initial instantiation of the database has to take place. In other words, a baseline full copy of all the volumes that are going to participate in the SRDF/AR replication must be executed first. This means a full establish of the R1s at the source location to the R2s in the bunker Symmetrix; the R1s and R2s need to be synchronized continuously. Then a full establish from the R2s to the BCVs in the bunker Symmetrix, a full SRDF establish of the BCV/R1s to the R2s in the DR Symmetrix, and a full establish of the R2s to the BCVs in the DR Symmetrix are performed. There is an option to automate this process of instantiation.

Refer to Appendix A, "Related Documents," for details on locating the SRDF/AR solutions guide.

[Figure 154: SRDF/AR multi-hop replication internals — production STDs replicate over a short distance to R2s in the bunker array; bunker BCV/R1s replicate over a long distance to R2s in the DR array, which are paired with gold-copy BCVs. Legend: 1. Consistent split in bunker array; 2. SRDF mirroring resumed; 3. Incremental establish initiated on both source and target BCVs; 4. BCV split on target; 5. Cycle repeats based on cycle parameters.]

Restart processing

In the event of a disaster, it is necessary to determine whether the most current copy of the data is on the R2s at the remote site or on the BCV/R1s in the bunker Symmetrix. Depending on when the disaster occurs, the most current version could be on either set of disks.
This determination is simple and is outlined in the SRDF/AR solutions guide.

Database log-shipping solutions

Log shipping is a strategy that some companies employ for disaster recovery. The process works only for databases that use the full recovery model and for which incremental transaction log backups are created.

Note: Microsoft SQL Server implements a specific form of log shipping using SQL Server Agent processes to manage the required steps. This section discusses the log shipping process itself rather than SQL Server's specific implementation.

The essence of log shipping is that changes to the database at the source site, as recorded in the transaction log, are propagated to the target site. Incremental transaction log backups are applied to a database at the target site to maintain a consistent image of the database that can be used for DR purposes. The target database is in either a NORECOVERY or STANDBY state, having been restored from a full backup.

Overview of log shipping

Log shipping is commonly used by RDBMS vendors to provide a mechanism for high availability and DR of the database environment. One of the major benefits of a log shipping environment is that it mitigates the time and effort required to completely restore the database in the event of total loss of the production system.

Typically, a log shipping environment is created from a full backup of the production database. This backup is restored to a server separate from the production system, using the NORECOVERY or STANDBY form of the RESTORE Transact-SQL statement. The result is a completely independent copy of the source database's data and transaction log files. In this way, the log shipping target system is protected from any physical damage to the data files that may occur on the production system.
The restored backup on the log shipping target system becomes increasingly out of date with respect to the production data as changes occur on the production system. The log shipping environment, as the name suggests, therefore implements a mechanism by which logged changes are shipped from the production system to the target log shipping server. Those changes are the data manipulation language (DML) statements that have been logged in the transaction log. For this reason, the full recovery model is used, so that all changes are logged and subsequently available for replay on the target server. In the case of Microsoft SQL Server, the changes are propagated through the creation of incremental backups of the transaction log. Because all changes to the data files are recorded in the transaction log, as a result of SQL Server's implementation of the write-ahead logging (WAL) protocol, replaying the log records in order propagates the same changes to a database restored from a full backup. The log shipping environment replays those transaction log backups against the separate restored database, placing that database in the same state as the production database as of the end of the last applied transaction log backup.

Incremental transaction log backups created from the production database need to be replayed against the log shipping server in the order in which they were created. To ensure that this requirement is met, SQL Server enforces that the Log Sequence Numbers (LSNs) recorded in the transaction log are played back in sequential order. SQL Server will not allow out-of-order restores of incremental transaction logs, so a reliable mechanism must be used to uniquely identify and sequence transaction log backups.
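The ordering rule can be sketched as follows. This is a simplified, illustrative model (the class and method names are hypothetical): real SQL Server compares the LSN ranges recorded in each backup against the database's current redo point, but the effect is the same — a backup that does not pick up exactly where the previous restore ended is rejected.

```python
# Conceptual sketch of the LSN sequencing rule described above:
# incremental log backups must be restored in the order they were
# taken; a gap or overlap in the LSN chain is rejected.

class StandbyDatabase:
    def __init__(self, restored_to_lsn):
        self.lsn = restored_to_lsn   # state after the full-backup restore

    def restore_log(self, first_lsn, last_lsn):
        # An out-of-order backup leaves a gap (or overlap) in the chain.
        if first_lsn != self.lsn:
            raise ValueError("log backup out of sequence")
        self.lsn = last_lsn

db = StandbyDatabase(restored_to_lsn=100)
db.restore_log(100, 150)     # next backup in the chain: accepted
try:
    db.restore_log(200, 250) # gap between 150 and 200: rejected
except ValueError:
    pass
assert db.lsn == 150         # the out-of-sequence restore changed nothing
```

This is why the shipping mechanism must name backups so their creation order is unambiguous: the restore side has no latitude to reorder them.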
In the event of a total production system failure, once all available transaction logs have been applied, the log shipping database can be brought online and made available for use. Utilizing the STANDBY mode of operation for the log shipping target server allows access in a read-only mode. During periods when incremental transaction logs are being applied to the standby database, no user connections are allowed, and all existing connections must be terminated. Such a configuration allows administrators to extract data from the standby instance, or users to utilize the standby instance for reporting requirements.

For example, should data be inadvertently or maliciously deleted from the production system, the log shipping target will typically be at some point behind the production system's data state. If the administrator is alerted to the issue in time, it may be possible to restore incremental log backups to the log shipping database only up to a point just prior to the logical data corruption. It would then be possible to extract the deleted data from the log shipping target database in STANDBY mode and insert it back into the production system.

The creation of incremental transaction logs, their shipment between the production server and the log shipping target server, and ultimately the application of those incremental logs can be managed manually by an administrator or by automated processes. The Microsoft SQL Server environment provides a structured, automated process for the creation, monitoring, and ongoing processing of a log shipping environment. Information on the definition, implementation, and ongoing management of the SQL Server solution is available in the SQL Server Books Online documentation.
It is possible for EMC disk mirror VDI restore operations to create the target environment from a full database backup that has been processed in the appropriate manner. The target database environment must have been left in either a NORECOVERY or STANDBY mode to be integrated into the SQL Server managed log shipping environment. To allow the SQL Server management tools to utilize this restored target database, when configuring the log shipping environment via the Maintenance Plan subsystem, simply select the No, the secondary database is initialized option provided in the Secondary Database Settings dialog box, as shown in Figure 155 on page 390.

Figure 155 Log shipping configuration - destination dialog

Also note in Figure 156 on page 391 the options presented under the Database State when restoring backups selection, which allow configuration of the target database, specifically the STANDBY option for read-only access. The Disconnect users in the database option relates to STANDBY mode and ensures that exclusive access can be obtained when applying subsequent transaction logs.

Figure 156 Log shipping configuration - restoration mode

If a disaster occurred at the primary site, the target log shipping server could be brought online and made available to users, albeit with some loss of data. The amount of data loss is determined by the processes that manage the creation of the incremental transaction log backups and their propagation to the remote server.
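The dependence of data loss on the backup and shipping cadence can be captured in a one-line estimate. This is an illustrative rule of thumb (consistent with the data loss expectations discussed later in this section), not a guarantee: roughly one interval of changes can be accumulating while the previous backup may not yet have been shipped.

```python
# Rough worst-case data-loss (RPO) estimate for a log shipping
# configuration: about two backup intervals' worth of changes can be
# lost (one interval still accumulating in the active log, plus one
# backup possibly created but not yet shipped). Numbers are
# hypothetical; actual loss varies with write activity and shipping lag.

def worst_case_rpo_minutes(backup_interval_min):
    return 2 * backup_interval_min

assert worst_case_rpo_minutes(15) == 30   # 15-minute backups -> ~30 min RPO
```

Shortening the backup interval lowers the worst-case RPO, at the cost of more frequent backup, copy, and restore jobs to monitor.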
Log shipping considerations

When considering a log shipping strategy, it is important to understand:

◆ What log shipping covers
◆ What log shipping does not cover
◆ Exposure to data loss during periods of change in recovery modes
◆ Server requirements
◆ How to instantiate and re-instantiate the target database
◆ How failback works
◆ Federated consistency requirements
◆ Amount of data loss in the event of a site disaster
◆ Manageability of the solution
◆ Scalability of the solution

Log shipping limitations

Log shipping transfers only those changes that are recorded in the transaction log and subsequently saved to an incremental transaction log backup. Consequently, operations that do not generate transaction log records are not propagated to the target server. SQL Server limits the ability to replay logs in certain conditions, depending on the version of SQL Server in use and the recovery model selected. Refer to the SQL Server Books Online for the version being used for specific information.

Log shipping is a database-centric strategy; it does not address changes that occur outside of the database. These changes include, but are not limited to:

◆ Application and binary files (service packs)
◆ Database configuration changes
◆ Database binaries
◆ Operating system changes
◆ External flat files
◆ Addition of new data files on new LUNs

To sustain a working environment at the DR site, procedures must be executed to keep these objects up to date.

Server requirements

Log shipping requires a server at the remote DR site to receive and apply the logs to the standby database. It may be possible to offset this cost by using the server for other functions when it is not being used for DR. Database licensing fees for the standby database also apply.
How to instantiate and re-instantiate the target database

Log shipping architectures need to be supported by a method of instantiating the database at the remote site. The method needs to be manageable and timely. For example, shipping 200 tapes from the primary site to the DR site may not be an adequate approach, considering the transfer time and database restore time. Periodically, it may be necessary to re-instantiate the database at the DR site. The process should be easily managed but should also provide continuous DR protection. That is to say, there must be a contingency plan for a disaster during re-instantiation.

How failback works

An important component of any DR solution is designing a failback procedure. If the DR setup is tested with any frequency, this method should be simple and risk free. Log shipping can be implemented in the reverse direction and works well when the primary site is still available. In the case of a disaster where the primary site data is lost, the database has to be re-instantiated at the production site. To facilitate this process, the SQL Server log shipping implementation documents the procedures required to perform a log shipping role change. Refer to SQL Server Books Online for details on this role change process.

Federated consistency requirements

Most databases are not isolated islands of information. They frequently have upstream inputs and downstream outputs, triggers, and stored procedures that reference other databases. There may also be a message queuing system such as Microsoft Message Queue (MSMQ), IBM MQ Series, or TIBCO queues between RDBMS environments or other data stores; externalized data stores for BLOBs; and so on.
This entire environment is a federated structure that needs to be recovered to the same point in time to obtain a transactionally consistent disaster restart point. Log shipping solutions are single-database centric and are not adequate solutions in federated database environments.

Data loss expectations

If sufficient bandwidth is provisioned for the solution, the amount of data lost in a disaster is going to be approximately two logs' worth of information. In terms of time, it would be approximately twice the length of time it takes to create an incremental transaction log backup. This time will most likely vary during the course of the day because of fluctuations in write activity.

Manageability of the solution

The manageability of a DR solution is a key to its success. Log shipping solutions have many components to manage, including servers, databases, and external objects as previously noted. Some of the questions that need to be answered to make a clear determination of the manageability of a log shipping solution are:

◆ How much effort does it take to set up log shipping?
◆ How much effort is needed to keep it running on an ongoing basis?
◆ What is the risk if something required at the target site is missing?
◆ If Windows file shares (CIFS) are used to ship the incremental transaction log backups, what kind of monitoring is needed to guarantee success?

In many of these cases, the specific SQL Server implementation mitigates the operational complexity through the implementation of SQL Agent processes and a replication monitor system, which reports on any failures.

Scalability of the solution

The scalability of a solution is directly linked to its complexity. To successfully scale the DR solution, the following questions must be addressed:

◆ How much more effort does it take to add more databases?
◆ How easy is the solution to manage when the databases grow larger?
◆ What happens if the quantity of updates increases dramatically?

Log shipping and the remote database

The remote database of a log shipping implementation is one that has been created from a restore of a full backup of the source database. The remote database will have been restored with either the NORECOVERY or STANDBY option of the RESTORE DATABASE Transact-SQL statement.

Figure 157 on page 395 depicts the SQL Server log shipping environment:

1. Database instantiation from SRDF-propagated backup
2. Transaction log backup to file location through SQL Server
3. Periodic copy of log backups to remote site via network
4. Application of log backups to target database in NORECOVERY or STANDBY state through SQL Server

Figure 157 Log shipping implementation overview

1. The database must be instantiated at the DR site, either by tape or by using SRDF, or, if it is small enough, by shipping the backup over the network. The database is restored into either the NORECOVERY or STANDBY state.
2. Incremental transaction log backups are created on local storage at the production site (or on a remote file share from the DR site).
3. If stored locally, the incremental backup logs are copied to the DR site.
4. Periodically, the logs are applied to the remote database.

The remote database is continually rolled forward by the application of the incremental transaction log backups applied to it. If the database is in STANDBY mode, then all user connections must be terminated during the application of the transaction log backup.
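Steps 2 and 4 of the process above map to standard Transact-SQL. The database name and paths below are illustrative examples only:

```sql
-- Step 2: on the production instance, create an incremental
-- transaction log backup (example path).
BACKUP LOG SalesDB
TO DISK = N'E:\logship\SalesDB_001.trn';

-- Step 4: on the DR instance, after the file has been copied across
-- (step 3), apply it while remaining ready for further logs.
RESTORE LOG SalesDB
FROM DISK = N'E:\logship\SalesDB_001.trn'
WITH NORECOVERY;
```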
Note: The SQL Server implementation of the log shipping environment manages the creation, transmission, and application of incremental transaction log backups. The steps are explained here to detail the log shipping process in general.

In NORECOVERY mode, logs are applied incrementally, and each restore must be executed with the NORECOVERY keyword, as shown in Figure 158 on page 396.

Figure 158 Restore of incremental transaction log with NORECOVERY

In STANDBY mode, logs are applied incrementally, and each restore must be executed with the STANDBY keyword and the specification of the standby file location, as shown in Figure 159 on page 397. SQL Server uses the standby file to copy uncommitted pages from the data files and roll those data pages back to a consistent state, typically the state represented by viewing only committed updates.

Figure 159 Restore of incremental transaction log with STANDBY

In the case of STANDBY mode, before the application of subsequent incremental transaction logs, all user connections must be terminated, and the data pages stored in the standby file are reapplied to their respective data files. This restores the state that existed as of the end of the application of the previous transaction log backup, and thus allows the application of a subsequent log to continue. The application of incremental transaction logs progresses in this manner until the last available transaction log is processed. For the last incremental transaction log restore operation, the RECOVERY keyword should be used. This indicates to the SQL Server instance that final recovery processing should be executed against the database. This rolls back uncommitted transactions in the database and returns it to an online state.
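Assuming illustrative names and paths, the STANDBY restore and the final recovering restore described above look like:

```sql
-- Apply a log backup in STANDBY mode; the standby (undo) file holds
-- the uncommitted pages rolled back to give a consistent read-only view.
RESTORE LOG SalesDB
FROM DISK = N'E:\logship\SalesDB_002.trn'
WITH STANDBY = N'E:\standby\SalesDB_undo.dat';

-- For the last available log, run recovery to roll back uncommitted
-- transactions and bring the database online.
RESTORE LOG SalesDB
FROM DISK = N'E:\logship\SalesDB_003.trn'
WITH RECOVERY;
```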
It is also possible to execute the following Transact-SQL against the database, should there be no further transaction logs:

RESTORE LOG <database> WITH RECOVERY
GO

This causes recovery processing to be executed without the need to specify a valid transaction log when no further logs are available.

Shipping logs with SRDF

Rather than using Windows network shares to ship the logs from the source to the target site, SRDF can be used as the transfer mechanism. SRDF/Synchronous can be used to ship incremental transaction log backups when the Symmetrix arrays are less than 200 km apart. Synchronous mode ensures that when a transaction log backup has been completed it is guaranteed to be available on the remote site for replay, thus mitigating some of the data loss exposure created by network transfers. In addition, federated DR requirements can be satisfied by synchronously replicating data external to the database. The advantages of using SRDF to ship the logs follow:

◆ Guaranteed delivery — SRDF brings deterministic protocols into play, an important advantage over FTP. This means that SRDF guarantees that what is sent is what is received. If packets are lost on the network, SRDF retransmits the packets as necessary.

◆ Restartability — SRDF also brings restartability to the log shipping functionality. Should a network outage or interruption cause data transfer to stop, SRDF can restart from where it left off after the network is returned to normal operations.

While replicating the incremental log backups to the remote site, the receiving volumes (R2s) cannot be used by the host, as they are read-only. If the business requirements are such that a standby database should be continually applying logs, BCVs can periodically be synchronized against the R2s and split.
Then the BCVs can be accessed by a host at the remote site and the incremental transaction log backups retrieved and applied to the standby database.

SQL Server Database Mirroring

Database Mirroring is a high availability solution introduced with SQL Server 2005. Database Mirroring, as with SQL Server log shipping, is implemented with two database instances, referred to as the principal (the production instance) and the mirror (the remote instance). It is also possible to configure a third SQL Server instance as a witness to facilitate quorum processing. The ability to create a quorum with the witness provides automatic failover capabilities. Unlike log shipping, which uses a coarse granularity based on the incremental transaction log intervals, where each backup can include thousands of discrete transactions, Database Mirroring works at the individual log record level and can be configured to run in a synchronous or asynchronous mode. The ability to run in synchronous mode, and thus implement a zero data loss solution, depends on both the rate of update and the distance (latency) between the principal and mirror. In its simplest form, the Database Mirroring environment can be implemented as shown in Figure 160 on page 399:

1. Database instantiation from SRDF-propagated backup
2. Remote database is restored with the NORECOVERY option
3. Principal state is enabled on the source database
4. Mirror state is enabled on the target database
5. Log updates continually processed via network

Figure 160 SQL Server Database Mirroring with SRDF/S

1. The database must be instantiated at the DR site, either by using some form of restore from media or by utilizing SRDF.
If small enough, the database backup may be transferred via the network.
2. The remote database instance is restored in NORECOVERY mode.
3. Utilizing either SQL Server Management Studio or the appropriate Transact-SQL statements, mirroring is enabled on the production database.
4. The remote database is also entered into mirroring mode by utilizing the SQL Server Management Studio GUI or the appropriate Transact-SQL statements.
5. SQL Server Database Mirroring processing then enables each transaction log write submitted to the local transaction log file on the production instance to also be transmitted via network connectivity to the remote mirror.

As with typical log shipping environments, the target (remote) database may require re-instantiation periodically. SRDF can be used in combination with disk mirror backup operations to facilitate this process when required. In addition, if SRDF was used in the initial creation of the remote database backup, subsequent re-instantiations become incremental operations. This may allow for more efficient processing, depending on the amount of data change in the intervening time period.

SQL Server Database Mirroring modes of operation

SQL Server 2005 and 2008 Database Mirroring provides two effective forms of replication between the principal and mirror. One of SYNC or ASYNC must be selected as the operational mode.
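Steps 3 and 4 above are performed with ALTER DATABASE ... SET PARTNER. The server names, port, and database below are hypothetical, and mirroring endpoints must already exist on each instance:

```sql
-- Step 4 first: on the mirror instance (database restored WITH
-- NORECOVERY), name the principal's mirroring endpoint.
ALTER DATABASE SalesDB
SET PARTNER = N'TCP://principal.example.com:5022';

-- Step 3: on the principal, name the mirror's endpoint; the mirroring
-- session starts once both sides are configured.
ALTER DATABASE SalesDB
SET PARTNER = N'TCP://mirror.example.com:5022';
```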
Figure 161 on page 401 details the process flow used by the Database Mirroring implementation:

Figure 161 SQL Server Database Mirroring flow overview – SYNC mode

1. An operation occurs which necessitates a log write. In this example, a commit of a transaction is executed by a user process.
2. The principal submits both a local log write operation and a log write operation to the mirror server.
3. The local log write I/O operation completes, and the I/O is acknowledged to the server.
4. If in SYNC mode, I/O completion status is not returned to the calling process until the log write is also executed against the mirror's log device.
5. The mirror log write is acknowledged to the mirror server.
6. Acknowledgement of the successful log write operation is sent back to the principal.
7. The principal SQL Server instance reports success to the calling process.

Should the ASYNC form of operation be employed, steps 2 through 6 are not in the critical path for log write operations. As a result, only the local log write operation is required before acknowledgement is returned to the calling process. For environments where there is high latency on network traffic, or where the distance creates latency, SYNC mode can seriously impact production operations. The ASYNC form of database mirroring attempts to mitigate performance issues by removing the dependency on the log write of the mirror instance. It is possible that over longer-latency network links a backlog of writes may occur, leading to greater data loss.
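The choice between the two flows is made per database. A sketch, with a hypothetical database name:

```sql
-- SYNC (high-safety): the commit path includes steps 2 through 6 above.
ALTER DATABASE SalesDB SET PARTNER SAFETY FULL;

-- ASYNC (high-performance): only the local log write gates the commit.
ALTER DATABASE SalesDB SET PARTNER SAFETY OFF;
```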
SQL Server Database Mirroring client connectivity

The SQL client connectivity stack has been enhanced to ensure a high-availability solution in the event of principal server failure. New functionality was added to the SQL client to allow for the designation of both the principal and mirror instances. In this way, should a failover occur, the client application can automatically be redirected to the alternate server. Clearly, this functionality assumes that automatic failover operations have been implemented.

Initiation of the mirror database

The mirror database state is derived from a restore operation from a valid backup of the source production database. EMC disk mirror backup technology may be utilized to create the required state of a NORECOVERY database. Once created, standard SQL Server processes may be utilized to configure, manage, and monitor the ongoing operations.

SQL Server Database Mirroring high availability mode

The Database Mirroring solution for SQL Server 2005 and SQL Server 2008 provides the ability to use an additional server with a SQL Server instance to implement automatic failover capabilities. This third server is referred to as the witness. The witness server allows for the management of service availability by forming quorum between the three servers. At a minimum, quorum is formed by at least two of the available servers. In the event that either the principal or the mirror fails, the witness will form quorum with the remaining SQL Server instance; in the case of forming quorum with the principal, it retains ongoing service, and in the case of forming quorum with the mirror, it promotes the mirror to provide service as the new principal.
Additionally, protection is provided for split-brain scenarios, where connectivity is lost between the hosts rather than a physical server failing. In this case, a server that is unable to form quorum with at least its partner server or the witness will not continue (or start) service for the database. This does mean that if connectivity is lost between all three servers, then service to the production database will be lost, because the principal is unable to form quorum and therefore takes the database offline. The mirror is likewise unable to form quorum with the witness, and thus is not able to promote itself to principal status.

SQL Server Database Mirroring failover operations

Given the nature of the replication cycle of transaction information between the principal and the mirror, it is an extremely fast operation to transition the mirror to the principal state. There is no requirement to apply additional logs, since all log records will have already been recorded. The only internal recovery processing required by the database in this state is to roll back uncommitted transactions, which is a relatively efficient process. Enhanced by the new recovery features available in SQL Server 2005 and 2008, the database instance can be made instantly available to users while the rollback operations continue. Transactional consistency for users is always maintained. Once all rollback operations complete, the instance becomes the new principal. The capability also exists for the old principal to assume the role of the mirror. The flow of information is then reversed, re-creating the original level of availability and protection. Network connectivity, latency, and the amount of data flow are the primary considerations for deploying a Database Mirroring solution.
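Both the fast role change and the forced promotion discussed above are exposed through Transact-SQL; the database name is hypothetical:

```sql
-- Planned failover, issued on the principal (requires SAFETY FULL and
-- a synchronized mirror): the mirror becomes the new principal.
ALTER DATABASE SalesDB SET PARTNER FAILOVER;

-- Forced service, issued on the mirror when the principal is lost;
-- may lose transactions that had not yet reached the mirror.
ALTER DATABASE SalesDB SET PARTNER FORCE_SERVICE_ALLOW_DATA_LOSS;
```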
Running database solutions

Running database solutions attempt to use DR resources actively. Instead of having the database and server sitting idle waiting for a disaster to occur, the idea of having the database running and serving a useful purpose at the DR site is an attractive one. Most hardware and software replication technologies that utilize backup/restore or recovery processes typically require exclusive access to the database, not allowing users to access the target database. The solutions presented in this section perform replication at the database transaction level or application layer and therefore allow user access even while the database is being updated by the replication process. They typically lend themselves to single database solutions, but provide the mechanisms to provision scale-out solutions, where multiple discrete instances provide service.

SQL Server transactional replication

SQL Server provides several transactional replication implementations. These are built on articles, which are the basic unit of replication. An article may be a table, a partitioned part of a table, or another valid SQL Server object allowed for replication between SQL Server instances. In general, a publication is created for the article on the source SQL Server instance (the publisher). A distributor is an intermediate SQL Server instance that provides a distribution database which stores information relating to the publications. Subscribers are the target SQL Server instances that subscribe to the various publications; they may communicate with the distributor and the publisher to obtain required information regarding the publication or, in the case of updateable or merge replication, to submit transactional changes.
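A minimal publisher-side sketch of the publication/article model described above, assuming a distributor has already been configured and using hypothetical names:

```sql
-- Mark the database as publishable for transactional replication.
EXEC sp_replicationdboption
    @dbname = N'SalesDB', @optname = N'publish', @value = N'true';

-- Create a transactional publication.
EXEC sp_addpublication
    @publication = N'SalesDB_Orders',
    @repl_freq = N'continuous',
    @status = N'active';

-- Add a table as an article of the publication.
EXEC sp_addarticle
    @publication = N'SalesDB_Orders',
    @article = N'Orders',
    @source_object = N'Orders';
```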
Replication is configurable through the SQL Server Management user interfaces, and definitive information on implementation and functionality can be found in the SQL Server Books Online documentation.

Snapshot Replication

Snapshot Replication forms the basis from which all other forms of SQL Server transactional replication are derived. Once the definition of the article(s) to be published is determined, SQL Server replication processes obtain both the schema for the article and the data for the defined article. To provide a consistent state, a lock is held across the defined data for the duration of the creation of the snapshot. The schema is obtained to ensure that the required target tables and columns can be created correctly prior to importing the data. The data is exported from the source database in BulkCopy (bcp) format for SQL Server-only environments. It is possible to configure an Oracle database as the destination for SQL Server replication; in that case, the data is exported as text for later import. Once the schema and data have been exported, they are copied to and stored in the distribution database. As subscribers are introduced into the environment, they are initialized with this snapshot data. Because the snapshot data is static, over time it will differ from the current data within the production environment. To this end, SQL Server provides updateable snapshots. The implementation and details of updateable snapshots are beyond the scope of this document and are covered in the SQL Server Books Online documentation.

Transactional Replication

During the implementation of SQL Server replication, it is possible to define the replication to be a form of Transactional Replication.
Again, this is based on Snapshot Replication, but it further extends the implementation to allow incremental updates to be propagated to the subscribers on a defined schedule. Updates flow from the publisher to subscribers via the distributor, and updates are not expected from the subscribers. An updateable form of this implementation is available, but is not discussed here. After the creation of the snapshot, SQL Server replication agents extract INSERT, UPDATE, and DELETE transactions from the transaction log of the publisher. These transactions are stored in the distributor and, on a cyclical basis, copied by the subscribers and applied. In this way, the state of the data within the snapshot is maintained in a more up-to-date manner. New subscribers are able to use the last snapshot created, plus all subsequent transactional updates, to implement a replica of the source publication without interacting with the production system in any manner—this is the value of the distributor system.

Merge Replication

The last form of replication discussed here is Merge Replication. Again, this form of replication is based on Snapshot Replication to define the initial subscriber state. Unlike the other forms of transactional replication discussed, updates are expected and allowed on both the publisher and subscribers. All systems interact and have conflict managers implemented to ensure consistent outcomes of conflicting updates to the same data. It is typically most suitable to have well-partitioned data in this form of replication to mitigate update conflicts. Merge Replication may be implemented across all forms of SQL Server, including Mobile editions. Therefore, this form of replication lends itself very well to mobile workers who need to collect and update information while disconnected from the production database.
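A comparable merge publication sketch, with hypothetical names and default conflict resolution, where subscribers such as mobile devices can update data and synchronize later:

```sql
-- Enable the database for merge publishing.
EXEC sp_replicationdboption
    @dbname = N'FieldDB', @optname = N'merge publish', @value = N'true';

-- Create the merge publication.
EXEC sp_addmergepublication
    @publication = N'FieldDB_Visits';

-- Add a table as a merge article; updates at any node are merged.
EXEC sp_addmergearticle
    @publication = N'FieldDB_Visits',
    @article = N'Visits',
    @source_object = N'Visits';
```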
Other transactional systems

Another form of propagating data between systems exists in the form of message queuing systems. Microsoft provides the Microsoft Message Queuing (MSMQ) environment as a mechanism for passing information between source and target applications. Many other message queuing systems are provided by third-party companies such as TIBCO and IBM. These systems provide a method of persisting information in an appropriate store, and can subsequently be used as a mechanism to provide updates across multiple platforms. In this way, these systems may be able to maintain a common (although perhaps not synchronous, depending on configuration) state across multiple database environments. These solutions are typically customized for specific implementations. Additionally, other forms of transactional systems, such as distributed transaction coordinators, may be used to ensure consistent states across multiple database environments using mechanisms such as two-phase commit processing. Microsoft provides such a mechanism in its MSDTC product. Again, while typically a customized implementation, solutions built on two-phase commit operations may be implemented in such a way as to ensure that multiple databases are synchronized. It should be noted that implementing two-phase commit operations across long-latency links will adversely affect the performance of the source system. When considering the usage of such message queuing systems and distributed transaction coordinators, it is important to include them as a central part of any DR solution. Data that persists within the queuing system, or records of commit outcomes maintained by a transaction coordinator, will typically be required for restart/recovery purposes at the remote site.
These implementations are a key indicator of a federated environment, and are often much better suited to restart operations than recovery operations in a site failure situation.

8 Microsoft SQL Server Database Layouts on EMC Symmetrix

This chapter presents these topics:

◆ The performance stack .................................................................... 411
◆ SQL Server layout recommendations ........................................... 415
◆ Symmetrix performance guidelines .............................................. 419
◆ RAID considerations ....................................................................... 435
◆ Host versus array-based striping................................................... 441
◆ Data placement considerations ...................................................... 445
◆ Other layout considerations ........................................................... 451
◆ Database-specific settings for SQL Server .................................... 454

Monitoring and managing database performance should be a continuous process in all SQL Server environments. Establishing baselines, and then collecting database performance statistics for comparison against those baselines, is important, as is monitoring performance trends. This chapter discusses the performance stack and how database performance should be managed in general.
Subsequent sections discuss Symmetrix DMX and VMAX specific layout and configuration issues to help ensure the database meets the required performance levels.

The performance stack

Performance tuning involves the identification and elimination of bottlenecks in the various resources that make up the system. Resources include the application, the code (SQL) that drives the application, the database, the host, and the storage. Tuning performance involves analyzing each of the individual components that make up an application, identifying bottlenecks or potential optimizations that can be made to improve performance, implementing changes that eliminate the bottlenecks or improve performance, and verifying that the change has improved overall performance. This is an iterative process and is performed until the benefits to be gained by continued tuning are outweighed by the effort required to tune the system. Figure 162 on page 412 shows the various layers that need to be examined as a part of any performance analysis. The potential benefits achieved by analyzing and tuning a particular layer of the performance stack are not equal, however. In general, tuning the upper layers of the performance stack, that is, the application and SQL statements, provides a much better return on investment than tuning the lower layers, such as the host or storage layers. For example, implementing a new index on a heavily used table that changes logical access from a full table scan to an index lookup with individual row selection can vastly improve database performance if the statement is run many times (thousands or millions) a day. When tuning a SQL Server database application, developers, DBAs, system administrators, and storage administrators need to work together to monitor and manage the process.
Efforts should begin at the top of the stack and address application and SQL statement tuning before moving down into the database and host-based tuning parameters. After all of these have been addressed, storage-related tuning efforts should then be performed.

Figure 162 The performance stack:
◆ Application: poorly written application, inefficient code
◆ SQL Statements: SQL logic errors, missing index
◆ DB Engine: database resource contention
◆ Operating System: file system parameter settings, kernel tuning, I/O distribution
◆ Storage System: storage allocation errors, volume contention

Optimizing I/O

The primary goal at all levels of the performance stack is to optimize I/O. Optimization at each level ensures that all system resources are used appropriately to service workload that is valid and required. In theory, an ideal database environment is one in which most I/Os are satisfied from memory rather than going to disk to retrieve the required data. In practice, however, this is not realistic; careful consideration of the disk I/O subsystem is necessary. Poorly performing transactions, which result in major table scan operations, can adversely affect the least recently used (LRU) caching mechanisms utilized by both RDBMS buffer managers and storage array controllers. As a result, viable pages may be discarded to satisfy the table scan operation, only to result in a re-read of the data so that changes may be made. Furthermore, the storage array controller may also have discarded cache buffers that previously contained relevant data. It is this style of activity whose negative effects can significantly affect overall optimal operation of the system.
These activities, which adversely affect optimization mechanisms such as LRU chains, create massive demands on storage subsystems and provide limited value as a standard operating policy. These types of demanding workloads may also make it extremely difficult to meet the performance guidelines provided in the next section. Optimizing performance of a SQL Server database on EMC Symmetrix storage arrays involves a detailed evaluation of the I/O requirements of the proposed application or environment. A thorough understanding of the performance characteristics and best practices of the Symmetrix, including the underlying storage components (disks and directors, for instance), is also needed. Additionally, knowledge of complementary software products such as EMC SRDF, EMC TimeFinder, EMC Symmetrix Optimizer, and backup software, along with how utilizing these products will affect the database, is important for maximizing performance. Ensuring optimal configuration for a given SQL Server database requires a holistic approach to application, host, and storage configuration planning. Configuration considerations for host- and application-specific parameters are beyond the scope of this document.

Storage system layout considerations

What is the best way to configure SQL Server environments on EMC Symmetrix storage? Customers frequently ask this question. However, before recommendations can be given, details of the configuration and requirements for the database, host(s), and storage environment need to be understood. The principal goal for optimizing any layout on the Symmetrix array is to maximize the spread of I/O across the components of the array, reducing or eliminating any potential bottlenecks in the system. The next sections examine the trade-offs between optimizing storage performance and manageability for SQL Server.
They also discuss recommendations for laying out SQL Server databases on EMC Symmetrix DMX and VMAX arrays. At the heart of all recommendations are some measurable performance characteristics, which are typically represented as providing optimal response for the SQL Server database engine. These recommendations take the form of response times for the various parts of a SQL Server database and are listed in Table 13 on page 414.

Table 13 SQL Server data and log response guidelines

Data file read latency:
◆ Response 10 ms: Very good
◆ Response 15 ms: Acceptable
◆ Response 20 ms: Need for investigation and improvement

Log write operations:
◆ Response 5 ms: Very good
◆ Response 5 to 10 ms: Acceptable
◆ Response 10 to 15 ms: Needs improvement
◆ Response 15 to 20 ms: Investigate and improve
◆ Response > 20 ms: Immediate action should be initiated

SQL Server layout recommendations

Traditional best practices for database layouts focus on avoiding incidents of contention for storage-related resources. Eliminating contention involves understanding how the database manages the data flow process and ensuring that concurrent or near-concurrent storage resource requests are separated onto different physical spindles. Many of these recommendations still have value in a Symmetrix environment. Before examining other storage-based optimizations, a brief digression to discuss these recommendations is made.

File system partition alignment

It is EMC's common recommendation, when utilizing LUNs presented from storage arrays, to ensure that partitions created on those LUNs are aligned to 64 KB boundaries during the partition creation phase.
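The guidelines in Table 13 can be expressed as a simple classification routine. This is only an illustrative sketch (the function names are not part of any EMC or Microsoft tooling); the thresholds and verdict strings are taken directly from the table.

```python
# Rate observed SQL Server I/O latencies against the Table 13 guidelines.
# Thresholds come from the table; the helper names are illustrative only.

def classify_data_read(latency_ms: float) -> str:
    """Rate an average data file read latency, in milliseconds."""
    if latency_ms <= 10:
        return "Very good"
    if latency_ms <= 15:
        return "Acceptable"
    return "Need for investigation and improvement"

def classify_log_write(latency_ms: float) -> str:
    """Rate an average transaction log write latency, in milliseconds."""
    if latency_ms <= 5:
        return "Very good"
    if latency_ms <= 10:
        return "Acceptable"
    if latency_ms <= 15:
        return "Needs improvement"
    if latency_ms <= 20:
        return "Investigate and improve"
    return "Immediate action should be initiated"

print(classify_data_read(8))    # Very good
print(classify_log_write(12))   # Needs improvement
```

Note that log writes are held to a tighter standard than data file reads, reflecting how sensitive transaction commit times are to log latency.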
As of the introduction of Windows Server 2008, Microsoft Windows automatically caters for partition alignment by creating volumes that have a 1 MB offset from the start of the disk. This 1 MB offset is acceptable, since it is a multiple of 64 KB, and it ensures that I/O operations are optimized within the Symmetrix storage array. As such, new volumes created with Windows Server 2008 and higher do not need specific intervention by storage or system administrators. This partition alignment, and the resulting volume alignment, should not be confused with the Windows allocation unit size specified when formatting the volume; these are two different settings. A number of EMC best practice documents exist on this matter and are available on Powerlink. These documents discuss the issue and describe processes to implement corrective measures.

General principles for layout

There are some general recommendations that can be made for any kind of database and are thus worth mentioning here:

◆ Place transaction logs on separate hypers and spindles. This minimizes contention for the logs as new writes come in from the database and old transaction log information is streamed out during incremental transaction log backups. It also isolates the sequential write and random read activity for these members from other volumes with differing access characteristics. In more recent storage allocation technologies, such as EMC Virtual Provisioning, the necessity to separate data files from transaction logs for performance reasons has to a great degree been mitigated. Since storage allocations from a thin pool can come from a large collection of physical spindles, the performance impact of co-locating data and log is not as severe as it is in a traditional provisioning environment.
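The alignment rule above is easy to verify mechanically: a partition start offset is correctly aligned when it is an exact multiple of 64 KB. A minimal sketch (the function name is hypothetical; the 31.5 KB case models the 63-sector offset that older Windows versions used by default):

```python
ALIGNMENT = 64 * 1024  # the 64 KB boundary discussed above

def is_aligned(partition_offset_bytes: int) -> bool:
    """True if a partition start offset falls on a 64 KB boundary."""
    return partition_offset_bytes % ALIGNMENT == 0

# Windows Server 2008+ default: 1 MB offset, a multiple of 64 KB -> aligned.
print(is_aligned(1 * 1024 * 1024))   # True
# Pre-2008 default: 63 sectors x 512 bytes = 32256 bytes -> misaligned.
print(is_aligned(63 * 512))          # False
```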
◆ Isolate TEMPDB data and log files from other user databases, and optionally allocate multiple data files for TEMPDB data. Here again, storage allocation technologies such as Virtual Provisioning may mitigate the necessity to isolate TEMPDB workloads in cases where the storage allocations are created from a sufficiently large pool. Additionally, customers considering deploying Enterprise Flash Drives in a SQL Server environment may consider the performance characteristics provided by these devices as a means to improve TEMPDB operations.

◆ Utilize multiple files within a filegroup to distribute I/O load. SQL Server allows certain parallel operations when tables have been created on multiple files. Full table scans represent one such parallel operation.

◆ Implement only a single file on any given LUN. In general, this provides the best performing configuration, though it may not always be possible. Windows Server and HBA configurations create device I/O queues based on LUNs. Ensuring that queue structures are scaled out, and that workloads are optimized for a LUN, will in turn result in performance scaling.

While these recommendations may be considered best practices, in certain circumstances they may not be possible to implement. In such cases, it is possible to co-locate these files in shared locations. However, this will require constant monitoring and management to ensure that the overall performance of the environment does not suffer as a result of these competing workloads.

If it is planned to use array replication technologies such as TimeFinder and SRDF, it is prudent to organize the database data files in such a way as to facilitate recovery.
Since array replication techniques copy volumes at the physical disk level (as seen by the host), all data files for a given user database should be created on a set of disks dedicated to the database and should not be shared with other applications and databases. Therefore, co-locating data and log files, or other files, on LUNs (viewed as physical disks by Windows servers) will affect functionality. Specifically, an attempt to restore a database located on a LUN may result in the inadvertent and erroneous restoration of other unrelated data.

Note: This discussion is with respect to a LUN device. Windows Disk Management allows multiple volumes to be created on a given LUN. Restoration of disk mirror devices restores the LUN, and therefore any and all volumes that exist on that LUN. It is recommended in a SQL Server environment that separate LUNs contain only volumes holding database files from the same database, or from databases that will be backed up and restored together.

The key point of any recommendation for optimizing the layout is that it is critical to understand the type (sequential or random), size (large or small), and quantity (low, medium, or high) of I/O against the various data files and other elements (logs, tempdb, and so on) of the database. Without a clear understanding of the data elements and the access patterns expected against them, serious contention issues on the back-end directors or physical spindles may arise that can negatively impact SQL Server performance. Knowledge of the application, its data elements, and its access patterns is critical to ensuring high performance in the database environment.

SQL Server layout and replication considerations

If it is planned to use array replication technologies such as TimeFinder and SRDF, it is recommended to organize the database in such a way as to facilitate optimal performance.
While changes are being made to optimize transmission of write operations over SRDF links, there are still locking mechanisms in place that may need to be addressed. In configurations such as SRDF/S, write ordering is preserved by managing device states while update operations are transmitted to the remote array. In recent Enginuity releases for DMX-4 and VMAX, features such as Concurrent J0 Write operations and Single Round Trip operations for Fibre Channel deployments have been introduced. While these new features and functions provide significant performance improvements in replicated environments, it is still necessary to consider the impact of replication technologies as they may relate to any given environment.

Traditionally, the method employed to improve performance in an SRDF/Synchronous deployment was to add more hypervolumes to support the I/O workload, in order to distribute the workload across multiple devices and allow for parallelization across the resources. This strategy still has value in synchronous replication environments, even when the new features that improve performance are considered. Clearly, this becomes a trade-off of size against performance. Moreover, larger hypervolumes will result in additional storage allocation for the database location. If this is not a suitable outcome, then it may be appropriate to create large numbers of smaller hypervolumes so that the optimization for parallelism is attained and storage allocation is managed. This may need to be well planned for disk mirror backup operations to ensure that BCV or clone devices are also sized appropriately.
Symmetrix performance guidelines

Optimizing performance for SQL Server in an EMC Symmetrix environment is very similar to optimizing performance for any application on the storage array. In order to maximize performance, a clear understanding of the I/O requirements of the applications accessing storage is required. The overall goal when laying out an application on disk devices in the back end of the Symmetrix array is to reduce or eliminate bottlenecks in the storage system by spreading out the I/O across all of the array's resources. Inside a Symmetrix DMX array, there are a number of areas to consider:

◆ Front-end connections into the Symmetrix system: This includes the number of connections from the host to the Symmetrix array that are required, and whether front-end Fibre Channel ports will be directly connected or a SAN will be deployed to share ports between hosts.

◆ Memory cache in the Symmetrix array: All host I/Os pass through cache on the Symmetrix array. I/O can be adversely affected if insufficient cache is configured in the Symmetrix array for the environment. Also, writes to individual hypervolumes or to the array as a whole may be throttled when a threshold, known as the write pending limit (discussed next), is reached.

◆ Back-end considerations: There are two sources of possible contention in the back end of the Symmetrix: the back-end directors and the physical spindles. Proper layout of the data on the disks is needed to ensure satisfactory performance.

Front-end connectivity

Optimizing front-end connectivity requires an understanding of the number and size of I/Os (both reads and writes) that will be sent between the hosts and the Symmetrix array. There are limitations to the amount of I/O that each front-end director port, each front-end director processor, and each front-end director board can handle.
Additionally, SAN fan-out counts, that is, the number of hosts that can be attached through a Fibre Channel switch to a single front-end port, need to be carefully managed. A key concern when optimizing front-end performance is determining which of the following I/O characteristics is more important in the customer's environment:

◆ Input/output operations per second (IOPS)
◆ Throughput (MB/s)
◆ A combination of IOPS and throughput

In OLTP database applications, where I/Os are typically small and random, IOPS is the more important factor. In DSS applications, where transactions in general require large sequential table or index scans, throughput is the more critical factor. Some databases require a combination of OLTP-like and DSS-like I/O. Optimizing performance in each type of environment requires tuning the host I/O size. Figure 163 on page 421 depicts the relationships between the page size of a random read request from the host and both the IOPS and the throughput needed to fulfill that request from the Symmetrix DMX. SQL Server utilizes an 8 KB page size, and it can be seen that the Symmetrix DMX provides high IOPS at this size.

Currently, each Fibre Channel port on a Symmetrix array is theoretically capable of 400 MB/s of throughput. In practice, however, the throughput available per port is significantly less and depends on the I/O size and on the shared utilization of the port and processor on the director. Increasing the size of the I/O from the host perspective decreases the number of IOPS that can be performed, but increases the overall throughput (MB/s) of the port. Limiting total throughput to a fraction of the theoretical maximum will ensure that enough bandwidth is available for connectivity between the Symmetrix array and the host.
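The inverse relationship between IOPS and I/O size for a fixed port bandwidth reduces to simple arithmetic. The sketch below uses the theoretical 400 MB/s port figure cited above purely for illustration; as the text notes, real-world ports deliver considerably less.

```python
def max_iops(port_mb_per_s: float, io_size_kb: float) -> float:
    """Theoretical upper bound on IOPS a port can carry at a given I/O size."""
    return (port_mb_per_s * 1024) / io_size_kb

# At SQL Server's 8 KB page size, a theoretical 400 MB/s port tops out at:
print(max_iops(400, 8))    # 51200.0 IOPS (theoretical ceiling, never achieved)
# Larger I/Os trade IOPS for throughput:
print(max_iops(400, 64))   # 6400.0 IOPS, but each I/O moves 8x the data
```

This is why OLTP sizing conversations center on IOPS while DSS sizing centers on MB/s: the same port budget is spent very differently depending on I/O size.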
It is always recommended in SQL Server environments to provide connectivity to at least two, and potentially more, front-end Fibre Channel connectors, not only to provide scale-out by distributing workload across multiple adapters, but also to provide high availability when used in conjunction with EMC PowerPath.

Figure 163 Relationship between I/O size, operations per second, and throughput (IOPS and throughput versus block size, random read cache hit, for block sizes from 512 bytes to 64 KB)

Symmetrix cache

The Symmetrix cache plays a key role in improving I/O performance in the storage subsystem. The cache improves performance by allowing write acknowledgements to be returned to a host when data is received in solid-state cache, rather than when it is fully destaged to the physical disk drives. Additionally, reads benefit from cache when sequential requests from the host allow follow-on reads to be prestaged in cache. The following sections briefly describe how the Symmetrix cache is used for writes and reads, and then discuss performance considerations for it.

Write operations and the Symmetrix cache

All write operations on a Symmetrix array are serviced by cache. When a write is received by the front-end director, a cache slot must be found to service the write operation. Since cache slots are a representation of the underlying hypervolume, if a prior read or write operation caused the required data to already be loaded into cache, the existing cache slot may be used to store the write I/O. The cache slot is marked write pending if it is not already in this state.
If a cache slot representing the storage area is not currently allocated, a call is made to locate a free cache slot from the global pool for the write. The write operation is moved to the cache slot and the slot is then marked write pending. At some later point, Enginuity™ will destage the write to physical disk. The decision of when to destage is based on overall system load, physical disk activity (read operations to the physical disk, for example), and availability of cache.

Cache is used to service the write operation to optimize the performance of the host system. Because write operations to cache are significantly faster than physical writes to disk media, the write is reported as complete to the host operating system much more quickly. Battery backup and priority destage functions within the Symmetrix ensure that no data loss occurs in the event of system power failure.

If the write operation to a given disk is delayed because of higher-priority operations (read activity is one such operation), the write pending slot remains in cache for a longer period of time. Cache slots are allocated as needed to a volume for this purpose. Enginuity calculates thresholds for allocations to limit the saturation of cache by a single hypervolume. These limits are referred to as write pending limits. Cache allocations are made on a per-hypervolume basis. As write pending thresholds are reached, additional allocations may occur, as well as re-prioritization of write activity. As a result, write operations to the physical disks may increase in priority to ensure that excessive cache allocations do not occur. This is discussed next in more detail.

In the manner described, the cache enables buffering of write I/Os and allows for a steady stream of write activity to service the destaging of write operations from a host. In a bursty write environment, this serves to even out the write activity.
Should the write activity constantly exceed the low write priority to the physical disk, Enginuity will raise the priority of write operations to attempt to meet the write demand. Ultimately, if the write I/O load from the host exceeds the physical disk's ability to write, the volume's maximum write pending limit may be reached. In this condition, new cache slots will only be allocated for writes to a particular volume once a currently allocated slot is freed by destaging it to disk. This condition, if reached, may severely impact write operations to a single hypervolume.

Read operations and the Symmetrix cache

As discussed in the previous section, read operations typically have an elevated priority for service from the physical disks, as user processes in general need to wait for an I/O operation to complete before continuing. This is generally a good practice for storage arrays, especially those able to satisfy write operations from cache.

When a read request is received from a host system, Enginuity checks to see if a corresponding cache slot representing the storage area exists. If so, a further check is made to determine if the required data to service the read has been loaded into the cache slot. If it has, the read request may be serviced immediately; this is considered a read hit. If the cache slot has been allocated but the data is not available, this is considered a short read miss, and a request is made to load the data from disk to the available cache slot. The read request must wait for a transfer from disk. If a cache slot does not already exist for this storage location, but free slots are available, the read operation must wait for a cache slot to be allocated and for the transfer from disk; this is referred to as a long read miss. Although cache slots themselves are 64 KB, a cache slot may contain only the requested data.
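The read hit, short read miss, and long read miss distinction described above reduces to two lookups: does a cache slot exist for the track, and if so, does it already hold the requested data. A deliberately simplified sketch, using a plain dict as a stand-in for the cache directory (real Enginuity slot management is far more involved):

```python
def classify_read(cache: dict, track: str, block: int) -> str:
    """Toy model of the Enginuity read-path decisions described above.

    cache maps a track identifier to the set of blocks already staged
    in that track's cache slot.
    """
    if track not in cache:
        return "long read miss"    # must allocate a slot, then read from disk
    if block not in cache[track]:
        return "short read miss"   # slot exists, but data must come from disk
    return "read hit"              # serviced immediately from cache

cache = {"track-A": {0, 1}}
print(classify_read(cache, "track-A", 0))  # read hit
print(classify_read(cache, "track-A", 5))  # short read miss
print(classify_read(cache, "track-B", 0))  # long read miss
```

The three outcomes have strictly increasing latency: a hit avoids disk entirely, a short miss waits for one disk transfer, and a long miss additionally waits for slot allocation.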
That is, if a read request is made for an 8 KB block, then only that 8 KB block will be transferred into the cache slot, as opposed to reading the entire 64 KB track from disk. The smallest read request unit is 8 KB, as this is the size of a Symmetrix sector.

Note: In the DMX-2 and earlier arrays, the cache slot size is 32 KB. Sector size for these arrays is 4 KB.

Symmetrix cache performance considerations

An important performance consideration is to ensure that an appropriate amount of cache is installed in the Symmetrix array. All I/O requests from hosts attached to the array are serviced from the Symmetrix global cache. Symmetrix cache can be thought of as an extension to database buffering mechanisms. As such, many database application environments can benefit from additional Symmetrix cache. With newly purchased arrays, appropriately sizing the cache is performed by the sales team, based upon the number and size of physical spindles, configuration (including number and type of volumes), replication requirements (SRDF, for example), and customer requirements.

Cache usage can be monitored through a number of Symmetrix DMX monitoring tools. Primary among these are ControlCenter Performance Manager (formerly known as Workload Analyzer) and Symmetrix Performance Analyzer (SPA). Performance Manager contains a number of views that analyze Symmetrix cache utilization at both the hypervolume and overall system level. Symmetrix Performance Analyzer can also assist in viewing cache effectiveness. Views in both products provide detailed information on specific component utilizations, including disks, directors (front end and back end), and cache.

Symmetrix cache plays a key role in host I/O read and write performance. Read performance can be improved through prefetching by the Symmetrix array if the reads are sequential in nature.
Enginuity algorithms detect sequential read activity and prestage reads from disk into cache before the data is requested. Write performance is greatly enhanced because all writes are acknowledged back to the host when they reach Symmetrix cache, rather than when they are written to disk. While reads from a specific hypervolume can use as much cache as is required to satisfy host requests, assuming free cache slots are available, the Symmetrix array limits the number of writes that can be pending to a single volume (that is, the write pending limit previously discussed).

Understanding the Enginuity write pending limits is important when planning for optimal performance. As previously discussed, the write pending limit is used to prevent high write rates to a single hypervolume from consuming all of the storage array cache for its own use, at the expense of performance for reads or writes to other volumes. The write pending limit for each hypervolume is determined at system startup and depends on the number and type of volumes configured and the amount of cache available. The limit is not dependent on the actual size of each volume. The more cache available, the more write requests that can be serviced in cache by each individual volume. While some sharing of unused cache may take place (although this is not guaranteed), an upper limit of three times the initial write pending limit assigned to a volume is the maximum amount of cache any hypervolume can acquire for changed tracks. If the maximum write pending limit is reached, destaging to disk must take place before new writes can come in. This forced destaging to disk before a new write can be received into cache limits writes to that particular volume to physical disk write speeds. Forced destage of writes can significantly reduce performance to a hypervolume if the write pending limit is reached.
If performance problems with a particular volume are identified, an initial step in determining the source of the problem should include verification of the number of writes and the write pending limit for that volume.

In addition to limits imposed at the hypervolume level, there are additional write pending limits imposed at the system level. Two key cache utilization points for the Symmetrix DMX are reached when 40 percent and 80 percent of the cache is used for pending writes. Under normal operating conditions, satisfying read requests from a host has greater priority than satisfying write requests. However, when pending writes consume 40 percent of cache, the Symmetrix DMX then prioritizes reads and writes equally. This re-prioritization can have a profound effect on database performance. The degradation is even more pronounced if cache utilization for writes reaches 80 percent. At that point, the DMX begins a forced destage of writes to disk, with discernible performance degradation of both writes and reads. If this threshold is reached, it is a clear indicator that both the cache and the total I/O on the array need to be re-examined.

Symmetrix VMAX arrays differ from Symmetrix DMX arrays with respect to cache limits. Symmetrix VMAX arrays set the maximum usable write cache for a single volume to 5 percent of total usable global cache.

Write pending limits are also established for Symmetrix metavolumes. Metavolumes are created by combining two or more individual hypervolumes into a single logical device that is then presented to a host as a single logical unit (LUN). Metavolumes can be created as concatenated or striped metavolumes. Striped metavolumes use a stripe size of 960 KB. Concatenated metavolumes write data to the first hyper in the metavolume (the metahead) and fill it before beginning to write to the next member of the meta.
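Given the 960 KB stripe size, the member that services a given logical offset on a striped metavolume can be computed directly. The sketch below is a toy model of striping only; it is not a SYMCLI or Enginuity interface.

```python
STRIPE_KB = 960  # striped metavolume stripe size noted above

def member_for_offset(offset_kb: int, members: int) -> int:
    """Index of the metavolume member holding a logical offset (striped meta).

    Successive 960 KB stripes rotate round-robin across the members.
    """
    return (offset_kb // STRIPE_KB) % members

# Four-member striped meta:
print(member_for_offset(0, 4))      # 0 (first stripe, on the metahead)
print(member_for_offset(960, 4))    # 1 (second stripe, next member)
print(member_for_offset(3840, 4))   # 0 (fifth stripe wraps back around)
```

A concatenated meta, by contrast, would place every offset on member 0 until that hyper fills, which is why striped metas distribute write pending load so much more evenly in the Performance Manager comparison that follows.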
Write pending limits for a metavolume are calculated on a member-by-member (hypervolume) basis.

Determining the write pending limit and the current number of writes pending per hypervolume can be done simply using SYMCLI commands. The following SYMCLI command returns the write pending limits for hypervolumes in a Symmetrix:

symcfg -sid <sid> -v list | findstr Pending
Max # of system write pending slots : 162901
Max # of DA write pending slots     : 81450
Max # of device write pending slots : 4719

For DMX arrays, depending on cache availability, the maximum number of write pending slots an individual hypervolume can use is up to three times the maximum number of device write pending slots previously listed, that is, 3 * 4,719 = 14,157 write pending tracks. For Symmetrix VMAX, the maximum number of write pending slots for a single hypervolume is up to 5 percent of the total available global cache.

The number of write pending slots utilized by a host's devices can be found using the SYMCLI command symstat as shown next:

symstat -i 30

                                  IO/sec     KB/sec    % Hits  %Seq  Num WP
13:09:52  DEVICE                READ WRITE READ WRITE  RD WRT  READ  Tracks
13:09:52  035A (Not Visible)       0     0    0     0 N/A N/A   N/A       2
          0430 (Not Visible)       0     0    0     0 100   0   100    2679
          0431 (Not Visible)       0     0    0     0 100   0   100    2527
          0432 (Not Visible)       0     0    0     0  82  28     0    2444
          0434 (Not Visible)       0     0    0     0   0 100     0   14157
          0435 (Not Visible)       0     0    0     0   0 100     0   14157
          043A (Not Visible)       0     0    0     0 N/A N/A   N/A      49
          043C (Not Visible)       0     0    0     0 N/A N/A   N/A      54
          043E (Not Visible)       0     0    0     0 N/A N/A   N/A      15
          043F (Not Visible)       0     0    0     0 N/A N/A   N/A      10
          0440 (Not Visible)       0     0    0     0 N/A N/A   N/A     807
          0441 (Not Visible)       0     0    0     0 N/A N/A   N/A     328
          0442 (Not Visible)      13     1   66     0   0 100     0      17
          0443 (Not Visible)       0     0    0     0 100   0   100    1597
          0444 (Not Visible)       0     0    0     0 N/A N/A   N/A       4

From this output, we can see that devices 0434 and 0435 have reached the device write pending limit of 14,157 for this DMX array.
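The per-device ceiling quoted above follows directly from the symcfg output: up to three times the per-device slot count on DMX, or 5 percent of usable global cache on VMAX. A small sketch of that arithmetic (the helper names are illustrative, not SYMCLI functions):

```python
def dmx_max_wp_slots(device_wp_slots: int) -> int:
    """DMX: a hypervolume may acquire up to 3x its base WP slot allocation."""
    return 3 * device_wp_slots

def vmax_max_wp_slots(global_cache_slots: int) -> int:
    """VMAX: a single volume's write cache is capped at 5% of global cache."""
    return global_cache_slots * 5 // 100

# Using the device slot count from the symcfg output shown above:
print(dmx_max_wp_slots(4719))   # 14157, the limit devices 0434 and 0435 hit
```

Comparing each device's Num WP Tracks against this computed ceiling is exactly the check described in the text: a device sitting at the ceiling is being throttled to physical disk write speeds.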
Further analysis should be made of the cause of the excessive writes, and of methods for alleviating this performance bottleneck against these devices.

Alternatively, Performance Manager may be used to determine the device write pending limit, and whether device limits are being reached. Figure 164 on page 427 is a Performance Manager view displaying both the device write pending limit and the device write pending count for a given device; this example shows Symmetrix DMX device 055. For the Symmetrix DMX in this example, the number of write pending slots per device was 9,776 and thus the maximum write pending limit was 29,328 slots (3 * 9,776). In general, a distinct flat line in such graphs indicates that a limit has been reached.

Figure 164 Performance Manager graph of write pending for a single hypervolume (device 055 write pending count plotted against its maximum write pending threshold over time)

The benefit of using metavolumes can be seen when we compare the same workload run against a four-member striped metavolume. In Figure 165 on page 428, the hypervolumes comprising the metavolume were used for the same workload as generated in Figure 164 on page 427. It can be seen that each of the volumes serviced a proportion of the workload. Because of the location of the file being created and the stripe depth used, one of the volumes incurred more write activity. However, even in this case, it did not exceed the lowest of the write pending thresholds, let alone reach the maximum threshold limit.
Figure 165 Performance Manager graph of write pending for a four-member striped metavolume (write pending counts for devices 00E, 00F, 010, and 011 plotted against the maximum write pending threshold)

At the same time, the throughput and overall performance of the workload were substantially improved. Figure 166 on page 429 shows a comparison of certain metrics in this configuration. It should be obvious that this is not truly a fair comparison, since we are comparing a single hypervolume against four hypervolumes within the metavolume. However, it does show that multiple disks are better able to satisfy an intense workload, which can clearly exceed the capability of a single device. It also serves to demonstrate the management and dynamic allocation of cache resources for volumes.

Figure 166 Comparison of write workload on a single hyper and a striped metavolume (read IOPS, write IOPS, and transactions per second for the metavolume versus the single hypervolume)

The number of cache boards can also have a minor effect on performance. When comparing Symmetrix DMX arrays that have the same amount of cache, increasing the number of boards (for example, 4 cache boards with 16 GB each as opposed to 2 cache boards with 32 GB each) has a small positive effect on performance in DSS applications. This is because the increased number of paths between front-end directors and cache has the effect of improving overall throughput. However, configuring additional boards is only helpful in high-throughput environments such as DSS applications.
For OLTP workloads, where IOPS are more critical, additional cache directors provide no added performance benefit. This is because the number of IOPS per port or director is limited by the processing power of the CPUs on each board.

Note: In the DMX-3, the write pending limit for individual volumes is modified. Instead of allowing writes up to 3x the initial write pending limit, up to approximately 1/20 of the cache can be utilized by any individual hypervolume.

Symmetrix VMAX distributes cache resources across all engines within the array. As a result, VMAX systems have a natural distribution, and therefore optimization, of cache resources.

Back-end considerations

Back-end considerations are typically the most important part of optimizing performance on the Symmetrix array. Advances in spinning disk technologies have not kept up with performance increases in other parts of the storage array, such as director and bandwidth (that is, Direct Matrix™ versus bus) performance. However, with the introduction of Enterprise Flash Drives (EFDs), individual flash drives are able to support I/O rates of thousands of operations per second. Conversely, spinning disk access speeds have increased only by a factor of three to seven in the last decade, while other components have easily increased one to three orders of magnitude. As such, for arrays utilizing only spinning disk drives, most performance bottlenecks in the Symmetrix array are attributable to physical spindle limitations.

An important consideration for back-end performance is the number of physical spindles available to handle the anticipated I/O load. Each disk is capable of a finite number of I/O operations at a given I/O size. Algorithms in the Symmetrix Enginuity operating environment optimize I/Os to the disks.
Although these algorithms help to reduce the number of reads and writes to disk, access to disk, particularly for random reads, is still required, so the I/O operations and counts directed at individual spindles must be kept within what each spindle can service. If an insufficient number of physical disks is available to handle the anticipated I/O workload, performance will suffer.

Note: It is critical to determine the number of spindles required for a SQL Server database implementation based upon I/O performance requirements, and not solely on physical space considerations.

In order to reduce or eliminate back-end performance issues on the Symmetrix array, take care to spread access to the disks across as many back-end directors and physical spindles as possible. EMC has long recommended that application data placement go wide before going deep. This means that performance is improved by spreading data across the back-end directors and disks, rather than allocating specific applications to specific physical spindles. Significant attention should be given to balancing the I/O on the physical spindles. Understanding the I/O characteristics of each data file and separating high application I/O volumes onto separate physical disks will minimize contention and improve performance.

Implementing Symmetrix Optimizer may also help to reduce I/O contention between hypervolumes on a physical spindle. Symmetrix Optimizer identifies I/O contention on individual hypervolumes and nondisruptively moves one of the hypers to a new location on another disk. Symmetrix Optimizer is an invaluable tool in helping to reduce contention on physical spindles should workload requirements change in an environment. As covered in "Storage Provisioning" on page 117, additional options such as Fully Automated Storage Tiering may be used in Symmetrix arrays to optimize resource utilization.
FAST provides the ability to have the Symmetrix array automatically migrate volumes between storage tiers as a means to optimize performance and storage utilization.

Placement of data on the disks is another performance consideration. Because of the rotational properties of disk platters, tracks on the outer parts of the disk perform better than inner tracks. While the Symmetrix Enginuity algorithms smooth out much of this variation, small performance increases can be achieved by placing high I/O objects on the outer parts of the disk. Of more importance, however, is minimizing the seek times associated with the disk head moving between hypervolumes on a spindle. Physically locating higher I/O devices together on the disks can significantly improve performance. Disk head movement across the platters (seek time) is a large source of latency in I/O performance. By placing higher I/O devices contiguously, disk head movement may be reduced, increasing the I/O performance of that physical spindle. Enterprise Flash Drives do not suffer from traditional disk head seek latencies, and introduce the ability to service I/O requests at electronic speeds. Even when considering the implementation of EFDs within Symmetrix arrays, it will be necessary to distribute workloads across a range of devices to service the aggregate workload.

It is also important to understand that the Symmetrix array is designed to provide shared access to resources. As a result, items such as physical disk spindles are subdivided into hypervolumes. Multiple hypervolumes exist on any given disk spindle. Each hypervolume may either be presented to a host connected to the Symmetrix, or combined as part of a metavolume, which is subsequently presented to a host. The key point here is that spindles are not specifically designated to a given host; rather, they can be shared among different hosts.
The overall load and performance of any individual spindle results from the cumulative load of all the workloads targeted to all hypervolumes on that spindle. Expectations should be set appropriately for the anticipated performance of these resources. Administrators should understand that their particular system may not be the only system generating workload to the spindles, and that they may be seeing the performance impact of other hosts' workloads.

Additional layout considerations

There are a number of additional factors that determine the best layout for a given hardware and software configuration. It is important to evaluate each of these factors to create an optimal layout for a SQL Server database.

Host bus adapters

A host bus adapter (HBA) is a circuit board and/or integrated circuit adapter that provides I/O processing and physical connectivity between a server and a storage device. The connection may route through Fibre Channel switches if Fibre Channel FC-SW is used. Because the HBA relieves the host microprocessor of both data storage and retrieval tasks, it can improve the server's performance time. An HBA and its associated disk subsystems are sometimes referred to as a disk channel.

HBAs can be a bottleneck if an insufficient number of them are provisioned for a given throughput environment. When configuring SQL Server systems, estimate the throughput required and provision sufficient HBAs accordingly. It is recommended that there be a minimum of two HBAs in any production system. This will, at a minimum, provide a load-sharing capability across these resources, and also provide a level of fault protection when used in combination with products such as PowerPath.

Host addressing limitations

HBAs also have limitations on how many LUNs can be addressed on a given channel. Windows HBAs in general support up to 255 logical units (LUNs).
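The "estimate the throughput, then provision sufficient HBAs" advice above can be sketched as a quick calculation. This is a hedged illustration only: the 400 MB/s figure for a 4 Gb/s Fibre Channel HBA and the 80 percent planning headroom are assumptions, not vendor specifications.

```python
import math

# Rough HBA-count estimate from a required host-to-array throughput.
# Assumes ~400 MB/s per 4 Gb/s FC HBA, planned to only 80 percent
# utilization; both numbers are illustrative assumptions.
def hbas_needed(required_mb_s: float, port_mb_s: float = 400.0,
                headroom: float = 0.8, minimum: int = 2) -> int:
    usable = port_mb_s * headroom             # do not plan to run ports flat out
    count = math.ceil(required_mb_s / usable)
    return max(count, minimum)                # at least two for load sharing and failover

print(hbas_needed(900))   # 900 MB/s workload needs three such HBAs
print(hbas_needed(100))   # a small workload still gets the two-HBA minimum
```

The `minimum=2` floor mirrors the recommendation above: even when one HBA could carry the load, a second provides path protection when combined with PowerPath.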
These factors must be weighed when designing the database storage infrastructure. The final architecture will always be a compromise between what is ideal and what is economically feasible within the constraints of the implementation environment.

It is recommended to utilize Symmetrix-based striped metavolume LUNs to provide storage allocations for SQL Server database file locations in OLTP environments. Striped metavolumes provide a method of balancing back-end workloads because of the stripe used across member devices. This methodology attempts to ensure that a single volume does not become overly saturated with I/O activity.

Configuration recommendations

The key recommendations for configuring the Symmetrix DMX for optimal performance include the following:

◆ Understand the I/O — It is critical to understand the characteristics of the database I/O, including the number, type (read or write), size, location (that is, data files, logs), and sequentiality of the I/Os. Empirical data or estimates are needed to assist in planning.

◆ Multiple host bus adapters — Implementing multiple Fibre Channel HBAs zoned to separate front-end adapters on the Symmetrix can provide an efficient, balanced path to the array. When used in conjunction with PowerPath, they also provide a level of fault protection.

◆ Physical spindles — The number of disk drives in the Symmetrix array should first be determined by calculating the number of I/Os required, rather than solely based on the physical space needs. The key is to ensure that the front-end needs of the applications can be satisfied by the flow of data from the back end.

◆ Spread out the I/O — Both reads and writes should be spread across the physical resources (front-end and back-end ports, physical spindles, hypervolumes) of the Symmetrix array.
This helps to prevent bottlenecks, such as hitting port or spindle I/O limits or reaching write pending limits on a hypervolume, from developing.

◆ Bandwidth — A key consideration when configuring connectivity between a host and the Symmetrix array is the expected bandwidth required to support database activity. This requires an understanding of the size and number of I/Os between the host and the Symmetrix. Connectivity considerations for both the number of HBAs and Symmetrix front-end ports are required.

RAID considerations

Integrated cached storage arrays provide multiple ways to protect and manage database data. The options chosen at the array level can affect the operation of the running database. The next sections provide a brief description of the RAID configurations provided by the Symmetrix.

Types of RAID

The following defines the RAID configurations that are available on Symmetrix arrays:

◆ Unprotected — This configuration is not typically used in a Symmetrix environment for production volumes. BCVs and occasionally R2 devices (used as target devices for SRDF) can be configured as unprotected volumes.

◆ RAID 1 — These are mirrored devices and are the most common RAID type in a Symmetrix array. Mirrored devices require writes to both physical spindles. However, intelligent algorithms in the Enginuity operating environment can use both copies of the data to satisfy read requests that are not already in the cache of the Symmetrix. RAID 1 offers optimal availability and performance, but at an increased cost over other RAID protection options.

◆ RAID 5 — A relatively recent addition to Symmetrix data protection (Enginuity 5670+), RAID 5 stripes parity information across all volumes in the RAID group. RAID 5 offers good performance and availability at a decreased cost.
Data is striped using a stripe width of four tracks (128 KB). RAID 5 is configured in either RAID 5 (3+1) (75 percent usable) or RAID 5 (7+1) (87.5 percent usable) configurations. Figure 167 on page 436 shows the configuration for 3+1 RAID 5. Figure 168 on page 437 shows how a random write in a RAID 5 environment is performed.

RAID 5 (3+1) track layout, with parity rotating across the four disks (each cell is one four-track, 128 KB stripe member):

  Disk 1        Disk 2        Disk 3        Disk 4
  Parity 1-12   Data 1-4      Data 5-8      Data 9-12
  Data 13-16    Parity 13-24  Data 17-20    Data 21-24
  Data 25-28    Data 29-32    Parity 25-36  Data 33-36
  Data 37-40    Data 41-44    Data 45-48    Parity 37-48

Figure 167 RAID 5 (3+1) layout detail

◆ RAID 6 — RAID 6 provides dual-parity configurations, with parity information calculated in a manner that protects data from up to two drive failures within the same RAID stripe. RAID 6 functionality is optimized and supported in either 6+2 or 14+2 configurations.

◆ RAID 1/0 — These are striped and mirrored devices. This configuration is only used in mainframe environments. However, RAID 1/0 can also be configured by creating striped metavolumes, as described next.

RAID 5 and RAID 6 write penalty

RAID 5 and RAID 6 are generally understood to suffer from a performance penalty for write operations. The penalty is imposed as a result of recalculating the parity information, which forms the basis of the protection scheme.

[Figure: a host write lands in a cache data slot, is destaged to disk where the old data is read and XORed with the new data, and the resulting parity is written to a cache parity slot and then to the parity member]

Figure 168 Anatomy of a RAID 5 write

The following describes the process of a random write operation to a RAID 5 volume:

1.
A random write is received from the host and is placed into a data slot in cache to be destaged to disk.

2. The write is destaged from cache to the physical spindle. When it is received, parity information is calculated on the drive by reading the old data and using an exclusive OR (XOR) calculation with the new data.

3. The new parity information is written back to Symmetrix cache.

4. The new parity information is written to the appropriate parity location on another physical spindle.

RAID 6 parity calculations include all the calculations outlined for a RAID 5 device, but also include an additional parity calculation. This secondary parity calculation requires additional I/O operations to create the additional parity information.

It is also important to note some of the optimizations implemented within Enginuity for large sequential batch update (write) operations. As previously discussed, when random write operations are processed, there may be a requirement to generate a background read operation in order to read the old data and regenerate new parity information. With subsequent optimizations, when large sequential writes are being generated by the host application, Enginuity is able to calculate parity information based solely on the data being written. It can then write the new parity information at the same time as the data is destaged to disk. In this way, the write penalty is removed. The size of the write operation must be at least a complete RAID 5 stripe. Since each stripe member is 128 KB, in a RAID 5 (3+1) environment the write must be 3 x 128 KB, or 384 KB. For RAID 5 (7+1), the write must be 7 x 128 KB, or 896 KB. Thus, large sequential write operations, which may be typical of large batch updates, can benefit from this optimization.
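The overheads described above reduce to simple accounting. The sketch below tallies back-end I/Os per random host write for each protection type and the usable-capacity and full-stripe figures quoted in this section; the per-write counts are the classic parity-RAID figures implied by the steps above, not measured Symmetrix values.

```python
# Back-end I/Os generated by one random host write, per protection scheme:
# RAID 1 writes both mirrors; RAID 5 reads old data and old parity, then
# writes new data and new parity; RAID 6 adds a read and a write for the
# second parity. Classic parity-RAID accounting, stated as an assumption.
WRITE_PENALTY = {"RAID 1": 2, "RAID 5": 4, "RAID 6": 6}

def usable_fraction(data_members: int, parity_members: int) -> float:
    """Fraction of raw capacity available for data in a RAID group."""
    return data_members / (data_members + parity_members)

def full_stripe_kb(data_members: int, stripe_unit_kb: int = 128) -> int:
    """Minimum sequential write size for the parity shortcut described above."""
    return data_members * stripe_unit_kb

print(usable_fraction(3, 1), usable_fraction(7, 1))   # 0.75 and 0.875
print(full_stripe_kb(3), full_stripe_kb(7))           # 384 KB and 896 KB
print(1000 * WRITE_PENALTY["RAID 5"])                 # back-end I/Os for 1,000 host writes
```

Note how quickly the penalty compounds: 1,000 random host writes become 4,000 back-end operations on RAID 5 and 6,000 on RAID 6, while a full-stripe sequential write avoids the read-modify-write entirely.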
Determining the appropriate level of RAID to configure in an environment depends on the availability and performance requirements of the applications that will utilize the Symmetrix array. Combinations of RAID types are configurable in Symmetrix arrays, allowing the environment to be optimized based on specific application requirements.

Until recently, RAID 1 was the predominant choice for RAID protection in Symmetrix storage environments. RAID 1 provides maximum availability and enhanced performance over other available RAID protections. In addition, performance optimizations such as Symmetrix Optimizer, which reduces contention on the physical spindles by nondisruptively migrating hypervolumes, and dynamic mirror service policy, which improves read performance by optimizing reads from both mirrors, were only available with mirrored volumes, not with parity RAID devices.

While mirrored storage is still the recommended choice for RAID configurations in Symmetrix environments, the space efficiency of RAID 5 storage and the higher levels of protection of RAID 6 provide customers with a reliable, economical alternative for their production storage needs. RAID 5 storage protection provides economic advantages over RAID 1, while at the same time providing high availability and performance. RAID 5 implements standard data striping and rotating parity across all members of the RAID group (either 3+1 or 7+1). RAID 6 provides the highest levels of drive redundancy. Additionally, Symmetrix Optimizer functionality is available with RAID 5 and RAID 6 to reduce spindle contention. RAID 5 provides customers with a flexible data protection option for dealing with varying workloads and service-level requirements.
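Because the write penalty differs by RAID type, the choice of protection feeds directly into the spindle-count sizing that the earlier note recommends be driven by I/O requirements rather than capacity. The following is a hedged, first-pass sizing sketch; the 180 IOPS-per-15k-spindle figure is an assumed planning number, not a specification.

```python
import math

def backend_iops(read_iops: int, write_iops: int, write_multiplier: int) -> int:
    # reads are serviced by one member; each host write fans out into
    # write_multiplier back-end I/Os (2 for RAID 1, 4 for RAID 5, 6 for RAID 6)
    return read_iops + write_iops * write_multiplier

def spindles_required(read_iops: int, write_iops: int,
                      write_multiplier: int, iops_per_spindle: int = 180) -> int:
    """Minimum spindle count to absorb the back-end load; assumes cache
    misses dominate, i.e. a deliberately conservative estimate."""
    return math.ceil(backend_iops(read_iops, write_iops, write_multiplier)
                     / iops_per_spindle)

# Same host workload (6,000 random reads + 2,000 random writes),
# sized under two different protection schemes:
print(spindles_required(6000, 2000, 2))   # RAID 1
print(spindles_required(6000, 2000, 4))   # RAID 5
```

Under these assumptions the identical workload needs roughly 40 percent more spindles on RAID 5 than on RAID 1, which is the quantitative version of the OLTP-versus-DSS guidance in the next section.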
RAID recommendations

While EMC recommends RAID 1 as the primary choice of RAID configuration for reasons of reliability and availability, SQL Server databases can be deployed on RAID 5- or RAID 6-protected disks for many database environments, excluding those requiring support for high I/O demands and performance-intensive applications. Databases used for test, development, QA, or reporting are likely candidates for RAID 5- or RAID 6-protected volumes.

Another potential candidate for deployment on RAID 5 storage is DSS applications. In many DSS environments, read performance greatly outweighs the need for rapid writes. This is because data warehouses typically perform loads off-hours or infrequently (once a week or month); read performance, in the form of database user queries, is significantly more important. Since the RAID 5 penalty applies only to write performance, not read performance, these types of applications are generally good candidates for RAID 5 storage deployments. Conversely, production OLTP applications typically require small random writes to the database and, as such, are generally more suited to RAID 1 storage.

An important consideration when deploying RAID 5 is disk failures. When disks containing RAID 5 members fail, two primary issues arise: performance and data availability. Performance will be affected while the RAID group operates in degraded mode, because the missing data must be reconstructed using parity and data information from the other members of the RAID group. Performance will also be affected when the disk rebuild process is initiated after the failed drive is replaced or a hot spare disk is activated. Potential data loss is the other important consideration when using RAID 5. Multiple drive failures that cause the loss of multiple members of a single RAID group will result in loss of data.
While the probability of such an event is small, it is much higher in a RAID 5 (7+1) environment than for RAID 1. As such, the probability of data loss through the loss of multiple members of a RAID 5 group should be carefully weighed against the benefits of using RAID 5. To provide higher levels of protection against drive failures, RAID 6 protection allows up to two drive failures to occur within a given RAID stripe while maintaining availability. RAID 6 similarly suffers from performance degradation in the event of a drive failure and during rebuild operations. RAID 6 significantly reduces the probability of data loss as a result of multiple drive failures.

The bottom line in choosing a RAID type is ensuring that the configuration meets the needs of the customer's environment. Considerations include read and write performance, balancing the I/O across the spindles and the back end of the Symmetrix, tolerance for reduced application performance when a drive fails, and the consequences of losing data in the event of multiple disk failures. It should also be noted that this is not an all-or-nothing approach. It is feasible to implement a combination of RAID 5 and RAID 1 devices to support a given database; selecting the optimal configuration for each type of workload is the key consideration. In general, EMC recommends RAID 1 for all types of customer data, including SQL Server databases. However, RAID 5 and RAID 6 configurations may be beneficial for many applications and should be considered.

Symmetrix metavolumes

Individual Symmetrix hypervolumes of the same RAID type (RAID 1, RAID 5, or RAID 6) may be combined to form a virtualized device called a Symmetrix metavolume.
Metavolumes are created for a number of reasons, including:

◆ A desire to create devices that are larger than the largest hypervolume available (DMX arrays have a maximum hypervolume size of around 60 GB; for VMAX arrays the maximum hypervolume size is around 240 GB).

◆ To reduce the number of volumes presented down a front-end director or to an HBA. A metavolume presented to an HBA only counts as a single LUN, even though the device may comprise a large number of individual hypers.

◆ To increase the performance of a LUN by spreading I/O across more physical spindles.

There are two types of metavolumes: concatenated and striped. With concatenated metavolumes, the individual hypers are combined to form a single volume such that data is written to the first hypervolume sequentially before moving to the next. Writes to the metavolume start with the metahead and proceed on that hypervolume until it is full, then move on to the next hypervolume. Striped metavolumes, on the other hand, write data across all members of the device. For Symmetrix DMX arrays, the stripe size is set at two cylinders, or 960 KB. For Symmetrix VMAX arrays, the stripe size is set to one cylinder, though due to internal structure changes, the resulting stripe size is also 960 KB.

In nearly all cases, striped metavolumes are recommended over concatenated volumes in SQL Server database environments. The exception to this general rule occurs in specific DSS environments, where metavolumes may obscure the sequential nature of the I/Os from the Enginuity prefetching algorithms.

Host versus array-based striping

Another hotly disputed issue when configuring a storage environment for maximum performance is whether to use host-based or array-based striping in SQL Server environments.
Striping data across the physical disks is critically important to database performance because it allows the I/O to be distributed across multiple spindles. Although disk drive sizes and speeds have increased dramatically in recent years, spindle technologies have not kept pace with host CPU and memory improvements. Performance bottlenecks in the disk subsystem can develop if the data storage requirements and configuration are not closely watched. In general, the discussion concerns trade-offs between the performance and the manageability of the storage components. The following sections present in more depth the trade-offs of host-based and array-based striping.

Host-based striping

Host-based striping is configured through the Logical Volume Manager utilized on most open systems hosts. The base Windows operating system provides support for Logical Volume Management of this type when utilizing dynamic disks. Two important things to consider when creating host-based striping are the number of disks to configure in a stripe set and an appropriate stripe size. While no definitive answer can be given that optimizes these settings for any given configuration, the following are general guidelines to use when creating host-based stripes:

◆ Ensure that the stripe size used is a power-of-two multiple of the track size configured on the Symmetrix array (that is, a multiple of 32 KB for DMX-2 and 64 KB for DMX-3, DMX-4, and VMAX), the database block size, and the host I/O size. Alignment of database blocks, Symmetrix tracks, host I/O size, and the stripe size can have considerable impact on database performance. Typical stripe sizes are 64 KB to 256 KB, although the stripe size can be as high as 512 KB or even 1 MB.

◆ Multiples of four physical devices for the stripe width are generally recommended, although this may be increased to eight or 16 as required by LUN presentation or SAN configuration restrictions.
Take care with RAID 5 metavolumes to ensure that members do not end up on the same physical spindles (a phenomenon known as vertical striping), because this may adversely affect performance. In general, RAID 5 metavolumes are not recommended.

◆ When configuring an SRDF environment, smaller stripe sizes, particularly for the active logs, are recommended. This enhances performance in synchronous SRDF environments because of the limit of having only one outstanding I/O per hypervolume on the link.

◆ Data alignment (that is, along block boundaries) can play a significant role in performance, particularly in Windows environments. Refer to operating system-specific documentation to learn how to align data blocks from the host along Symmetrix array track boundaries.

Symmetrix-based striping (metavolumes)

An alternative to host-based striping is to stripe at the Symmetrix array level. Striping in the Symmetrix is accomplished through the use of striped metavolumes, as discussed in the previous section. Individual hypervolumes are selected and striped together, forming a single LUN that is presented through the front-end director to the host. At the Symmetrix level, all writes to this single LUN are striped. Currently, the only stripe size available for a metavolume is 960 KB. It is possible to create metavolumes with up to 255 hypervolumes, although in practice metavolumes are usually created with four to sixteen devices.

Striping recommendation

Determining the appropriate striping method depends on a number of factors. In general, striping is a trade-off between manageability and performance. With host-based striping, CPU cycles are used to manage the stripes; Symmetrix metavolumes require no host cycles to stripe the data.
This small host-side cost of host-based striping is offset, however, by the fact that each device in a striped volume group maintains its own I/O queue, thereby increasing performance over a Symmetrix metavolume, which has only a single I/O queue on the host. EMC has performed tests showing that striping at the host level provides somewhat better performance than comparable Symmetrix-based striping, and it is generally recommended if performance is paramount. Host-based striping may also be recommended in environments utilizing synchronous SRDF, since stripe sizes in the host can be tuned to smaller increments than is currently available with Symmetrix metavolumes, thereby increasing performance. This host-based striping may be most applicable for write-intensive files such as the transaction log.

Management considerations, however, generally favor Symmetrix-based metavolumes over host-based stripes. In many environments, customers have achieved high-performance back-end layouts on the Symmetrix by allocating all of the storage as four-way striped metavolumes. This has the advantage that any volume selected for host data is always striped, with reduced chances for contention on any given physical spindle. Additional storage allocations for any host volume group, since the additional storage is configured as a metavolume, are also striped. Managing added storage in an existing volume group that utilizes host-based striping may be significantly more difficult, in some cases requiring a full backup, reconfiguration of the volume group, and restore of the data in order to successfully expand the stripe.

An alternative gaining popularity recently in Windows environments is the combined use of both host-based and array-based striping.
Known as double striping or a plaid, this configuration utilizes striped metavolumes in the Symmetrix array, which are then presented to a volume group and striped at the host level. This has many advantages in database environments where read access is small and highly random in nature. Since the I/O patterns are pseudo-random, access to data is spread across a large quantity of physical spindles, thereby decreasing the probability of contention on any given disk. Double striping, in some cases, can interfere with data prefetching at the Symmetrix array level when large, sequential data reads are predominant; this configuration may not be appropriate for DSS workloads.

Another method of double striping the data is through the use of Symmetrix metavolumes and RAID 5 or RAID 6. For example, a RAID 5 hypervolume stripes data across either four or eight physical disks using a stripe size of 128 KB. Striped metavolumes stripe data across two or more hypers using a stripe size of 960 KB. When using striped metavolumes in conjunction with RAID 5 devices, ensure that members do not end up on the same physical spindles, because this will adversely affect performance. In some cases, double striping using this method may also affect prefetching for long, sequential reads.

While host-based, array-based, and double striping each have positive and negative factors, the important thing is to ensure that some form of striping is used for the storage layout. The layer chosen for disk striping can have a significant impact on the overall performance and manageability of the database system. Deciding which form of striping to use depends on the specific nature and requirements of the database environment in which it is configured.
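The alignment and stripe-layout rules above can be made concrete with a small sketch: one helper checks that a proposed host stripe size is a power-of-two multiple of the Symmetrix track size (64 KB assumed here, as for DMX-3, DMX-4, and VMAX), and another shows how a striped metavolume's fixed 960 KB stripe rotates consecutive chunks across its members. Both model only the layout rules stated in the text, not Enginuity internals.

```python
STRIPE_BYTES = 960 * 1024  # fixed metavolume stripe size described above

def is_aligned_stripe(stripe_kb: int, track_kb: int = 64) -> bool:
    """True if stripe_kb is a power-of-two multiple of the array track size."""
    if stripe_kb % track_kb:
        return False
    multiple = stripe_kb // track_kb
    return multiple & (multiple - 1) == 0   # power-of-two test

def meta_member(offset_bytes: int, members: int) -> int:
    """Which member of a striped metavolume services a given byte offset."""
    return (offset_bytes // STRIPE_BYTES) % members

# 192 KB fails: it is a multiple of 64 KB, but 3x is not a power of two
print([s for s in (64, 128, 192, 256, 1024) if is_aligned_stripe(s)])

# consecutive 960 KB chunks of a four-member meta rotate member 0, 1, 2, 3, 0, ...
print([meta_member(i * STRIPE_BYTES, 4) for i in range(6)])
```

The round-robin rotation in `meta_member` is why a striped metavolume keeps any single hypervolume from saturating, and also why it can hide long sequential runs from the prefetch algorithms in DSS workloads, as noted earlier.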
With the advent of RAID 5 and RAID 6 data protection in Symmetrix arrays, an additional option of triple striping data, combining array RAID protection, host-based striping, and metavolumes, is now available. However, triple striping increases data layout complexity, and in testing has shown no performance benefit over other forms of striping. In fact, it has actually been shown to be detrimental to performance. As a result, it is not recommended in any Symmetrix array configuration.

Host-based striping and limitations

When considering the use of host-based striping in a Windows environment, it is important to understand the limitations that this may impose. Windows Server is provided with a built-in Logical Volume Manager, which supports the use of dynamic disks. It is only when using dynamic disks that host-based striping mechanisms may be implemented. Using the base Windows dynamic disk format is prohibited in environments such as Windows Cluster configurations; for these environments, a third-party volume manager is required. Additionally, certain EMC products, utilities, and software do not interoperate with the base Windows dynamic volume implementation. It is important to check the requirements of the proposed solution to ensure that the ultimate configuration provides support for the various software products. For EMC products, consult the relevant product release notes and guides for detailed information.

Data placement considerations

Placement of the data on the physical spindles can potentially have a significant impact on SQL Server database performance. Placement factors that affect database performance include:

◆ Hypervolume selection for specific database files on the physical spindles themselves.

◆ The spread of database files across the spindles to minimize contention.
◆ The placement of high I/O devices contiguously on the spindles to minimize head movement (seek time).

◆ The spread of files across the spindles and back-end directors to reduce component bottlenecks.

Each of these factors is discussed next.

Disk performance considerations

As shown in Figure 169 on page 447, there are five main considerations for traditional spinning disk spindle performance:

◆ Actuator positioning (seek time) — This is the time it takes the actuating mechanism to move the heads from their present position to a new position. This delay averages a few milliseconds in length and depends on the type of drive. For example, a 15k drive has an average seek time of approximately 3.5 ms for reads and 4 ms for writes; the full-disk seek is 7.4 ms for reads and 7.9 ms for writes.

Note: Disk drive characteristics can be found at the appropriate disk vendor websites, such as http://www.seagate.com.

◆ Rotational speed — This delay is due to the need for the platter to rotate underneath the head to correctly position the data that needs to be accessed. Rotational speeds for spindles in the Symmetrix range from 7,200 rpm to 15,000 rpm. The average rotational delay is the time it takes for one half of a revolution of the disk. In the case of a 15k drive, this is about 2 milliseconds.

◆ Interface speed — This is a measure of the transfer rate from the drive into the Symmetrix cache. It is important to ensure that the transfer rate between the drive and cache is greater than the drive's rate to deliver data. The delay caused by this is typically a very small value, on the order of a fraction of a millisecond.

◆ Areal density — This is a measure of the number of bits of data that fit in a given surface area on the disk. The greater the density, the more data per second can be read from the disk as it passes under the disk head.
◆ Cache capacity and algorithms — Newer disk drives have improved read and write algorithms, as well as on-drive cache, to improve the transfer of data in and out of the drive and to speed parity calculations for RAID 5 and RAID 6.

Delay caused by the movement of the disk head across the platter surface is called seek time. The time associated with a data track rotating to the required location under the disk head is referred to as rotational delay. The cache capacity on the drive, disk algorithms, interface speed, and areal density (or zoned bit recording) combine to produce a disk transfer time. The time it takes to complete an I/O (the disk latency) therefore consists of three elements: the seek time, the rotational delay, and the transfer time. Data transfer times are typically fractions of a millisecond, so rotational delay and the delay from repositioning the actuator heads are the primary sources of latency on a physical spindle. Although rotational speeds of disk drives have increased from top speeds of 7,200 rpm up to 15,000 rpm, rotational delay still averages on the order of a few milliseconds, and seek time continues to be the largest source of latency in disk assemblies when using the entire disk.

Transfer rates are lower on the inner parts of the drive: more data can be read per second from the outer parts of the drive than from the inner regions, so performance is significantly better on the outer parts of the disk. In many cases, performance improvements of more than 50 percent can be realized on the outer cylinders of a physical spindle. This performance differential typically leads customers to place high I/O objects on the outer portions of the drive. While placing high I/O objects such as active logs on the outer edges of the spindles has merit, performance differences across the drives inside the Symmetrix are significantly smaller than the stand-alone disk characteristics would suggest.
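The latency components described above can be combined into a rough service-time estimate. The sketch below uses the seek and rotational figures quoted earlier for a 15k rpm drive; the transfer-time value is an assumption for illustration only.

```python
# Rough disk service-time estimate from the three latency components
# discussed above. Figures are illustrative, not measured values.

def rotational_delay_ms(rpm: float) -> float:
    """Average rotational delay: time for half a revolution, in ms."""
    ms_per_rev = 60_000.0 / rpm
    return ms_per_rev / 2.0

def service_time_ms(seek_ms: float, rpm: float, transfer_ms: float) -> float:
    """Total latency = seek time + rotational delay + transfer time."""
    return seek_ms + rotational_delay_ms(rpm) + transfer_ms

# A 15k rpm drive: ~3.5 ms average read seek, ~2 ms rotational delay,
# and an assumed 0.2 ms transfer time for a small I/O.
latency = service_time_ms(seek_ms=3.5, rpm=15_000, transfer_ms=0.2)
print(f"{latency:.1f} ms")  # seek dominates: 5.7 ms
```

The arithmetic makes the point in the text concrete: even at 15,000 rpm, the mechanical components (seek plus rotation) account for nearly all of the service time.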
Enginuity operating environment algorithms, particularly the algorithms that optimize the ordering of I/O as the disk heads scan across the disk, greatly reduce differences in hypervolume performance across the drive. Although this smoothing of disk latency may actually increase the delay of a particular I/O, overall performance characteristics of I/Os to hypervolumes across the face of the spindle will be more uniform.

Figure 169 Disk performance factors (actuator positioning, rotational speed, interface speed, areal density, and cache and algorithms)

Hypervolume contention

Disk drives are capable of receiving only a limited number of read or write I/Os before performance degradation occurs. While disk improvements and cache, both on the physical drives and in disk arrays, have improved disk read and write performance, the physical devices can still become a critical bottleneck in SQL Server database environments. Eliminating contention on the physical spindles is a key factor in ensuring optimal SQL Server performance on Symmetrix arrays. Contention can occur on a physical spindle when I/O (read or write) to one or more hypervolumes exceeds the I/O capacity of the disk. While contention on a physical spindle is undesirable, it can be rectified by migrating high I/O data onto other devices with lower utilization. This can be accomplished using a number of methods, depending on the type of contention that is found. For example, when two or more hypervolumes on the same physical spindle have excessive I/O, contention may be eliminated by migrating one of the hypervolumes to another, less utilized physical spindle.
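The contention check described above can be pictured as a simple aggregation: sum the anticipated I/O of the hypervolumes on each spindle and flag any spindle whose total exceeds what the drive can service. A minimal sketch (device names and the IOPS ceiling are hypothetical):

```python
# Hypothetical sketch: flag physical spindles where the combined IOPS of
# their hypervolumes exceeds the drive's service capacity.

def find_contended(spindles, max_iops_per_disk):
    """spindles: {spindle: {hyper: anticipated_iops}} -> list of hot spindles."""
    hot = []
    for spindle, hypers in spindles.items():
        if sum(hypers.values()) > max_iops_per_disk:
            hot.append(spindle)
    return hot

plan = {
    "disk01": {"hyper_00": 120, "hyper_01": 90},  # 210 IOPS total
    "disk02": {"hyper_02": 60, "hyper_03": 40},   # 100 IOPS total
}
# Assume a drive comfortably services ~180 IOPS (illustrative figure).
print(find_contended(plan, max_iops_per_disk=180))  # ['disk01']
```

In the flagged case, migrating one of disk01's hypervolumes to a less utilized spindle restores headroom, which is exactly the remediation the text describes.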
Migration of this kind can be accomplished through processes such as LVM mirroring at the host level, or by using tools such as EMC Symmetrix Optimizer to nondisruptively migrate data off impacted devices. One method of reducing hypervolume contention is careful layout of the data across the physical spindles on the back end of the Symmetrix. Other methods include striping, either at the host level or inside the Symmetrix.

Hypervolume contention can be found in a number of ways. SQL Server-specific data collection and analysis tools, such as the database SNAP feature, as well as host server tools, can identify areas of reduced I/O performance in the database. Additionally, EMC tools such as Performance Manager can help identify performance bottlenecks in the Symmetrix array. Establishing baselines of the system and proactive monitoring are essential to maintaining an efficient, high-performance database.

Commonly, tuning database performance on the Symmetrix system is performed post-implementation. This is unfortunate, because with a small amount of up-front effort and detailed planning, significant I/O contention issues can be minimized or eliminated in a new implementation. While detailed I/O patterns of a database environment are not always well known, particularly in the case of a new system implementation, careful layout of a database on the back end of the Symmetrix can save time and future effort in identifying and eliminating I/O contention on the disk drives.

Maximizing data spread across back-end devices

A long-standing data layout recommendation at EMC has been to go wide before going deep. This means that data placement on the Symmetrix array should be spread across the back-end directors and physical spindles before locating data on the same physical drives. By spreading the I/O across the back end of the Symmetrix, I/O bottlenecks in any one array component can be minimized or eliminated.
Considering recent improvements in Symmetrix array component technologies, such as CPU performance on the directors and the Virtual Matrix architecture, the most common bottleneck in new implementations is contention on the physical spindles and the back-end directors. To reduce these contention issues, a detailed examination of the I/O requirements of each application that will utilize the Symmetrix storage should be made. From this analysis, a detailed layout that balances the anticipated I/O requirements across both back-end directors and physical spindles should be produced. Before data is laid out on the back end of a Symmetrix array, it is helpful to understand the I/O requirements for each of the file systems or volumes being laid out.

Many methods for optimizing layout on the back-end directors and spindles are available. One time-consuming method involves creating a map of the hypervolumes on physical storage, including hypervolume presentation by director and physical spindle, based upon information available in EMC Ionix ControlCenter. This involves documenting the environment using a tool such as Excel, with each hypervolume marked on its physical spindle and disk director. Using this map of the back end and volume information for the database elements, preferably categorized by I/O requirement (high/medium/low, or by anticipated reads and writes), the physical data elements and I/Os can be evenly spread across the directors and physical spindles.

This type of layout can be extremely complex and time-consuming. Additional complexity is added when RAID 5 or RAID 6 hypers are added to the configuration. Since each hypervolume in these RAID environments is actually placed on multiple physical volumes, uniquely mapping out each data file or database element is beyond what most customers feel provides value.
In these cases, one alternative is to rank each of the database elements or volumes in terms of anticipated I/O. Once ranked, each element may be assigned a hypervolume in order on the back end. Since BIN file creation tools almost always spread contiguous hypervolume numbers across different elements of the back end, this method of assigning the ranked database elements usually provides a reasonable spread of I/O across the spindles and back-end directors in the Symmetrix array. Combined with Symmetrix Optimizer, this method is normally effective in maximizing the spread of I/O across Symmetrix components.

Minimizing disk head movement

Perhaps the key performance consideration controllable by a customer when laying out a database on Symmetrix arrays is minimizing head movement on the physical spindles. Head movement is minimized by positioning high I/O hypervolumes contiguously on the physical spindles. Disk latency caused by interface or rotational speeds cannot be controlled by layout decisions. The only disk drive performance factors that can be controlled are the placement of data onto specific, higher-performing areas of the drive (discussed in a previous section), and the reduction of actuator movement by placing high I/O objects in adjacent hypervolumes on the physical spindles.

The previous section describes how volumes can be ranked by anticipated I/O requirements. Utilizing a documented map of the back-end spindles, high I/O objects can be placed on the physical spindles with the highest I/O objects grouped together. Recommendations differ as to whether it is optimal to place the highest I/O objects together on the outer parts of the spindle (that is, the highest-performing parts of a physical spindle) or in the center of a spindle.
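The ranking approach described above can be sketched in a few lines: rank the volumes by anticipated I/O, then deal them out in order across the back end, so that load spreads across components while the hottest objects land first on each spindle. All names and figures below are hypothetical:

```python
# Hypothetical sketch of the ranking method: volumes are ranked by
# anticipated I/O and dealt round-robin across spindles, spreading load
# while the hottest objects are placed first on each spindle.

def assign_round_robin(volumes, num_spindles):
    """volumes: list of (name, anticipated_iops) -> {spindle: [names]}."""
    ranked = sorted(volumes, key=lambda v: v[1], reverse=True)
    placement = {s: [] for s in range(num_spindles)}
    for i, (name, _) in enumerate(ranked):
        placement[i % num_spindles].append(name)
    return placement

vols = [("log", 900), ("data1", 700), ("data2", 650),
        ("tempdb", 400), ("backup", 100), ("archive", 50)]
print(assign_round_robin(vols, num_spindles=3))
# {0: ['log', 'tempdb'], 1: ['data1', 'backup'], 2: ['data2', 'archive']}
```

This mirrors the effect the text attributes to BIN file creation tools spreading contiguous hypervolume numbers across the back end: assigning ranked elements in order yields a reasonable spread without per-file mapping.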
Since there is no real consensus on outer versus center placement, and any definitive answer is likely to be "it depends," the historical recommendation of putting high I/O objects together on the outer part of the spindle is still a reasonable suggestion. Placing these high I/O objects together on the outer parts of the spindle helps reduce disk actuator movement during reads and writes to each hypervolume on the spindle, thereby improving a controllable parameter in any data layout exercise.

Other layout considerations

Besides the layout considerations described in previous sections, a few additional factors may be important to DBAs or storage administrators who want to optimize database performance and general maintenance operations. Additional configuration factors to consider include:

◆ Implementing SRDF/S for the database.

◆ Creating database clones using TimeFinder/Mirror or TimeFinder/Clone.

◆ Creating database clones using TimeFinder/Snap.

These additional layout considerations are discussed in the next sections.

Database layout considerations with SRDF/S

There are two primary concerns when SRDF/S is implemented. The first is the latency inherently added to each write to the database. Latency occurs because each write must be written to both the local and remote Symmetrix caches before the write can be acknowledged to the host. This latency must always be considered as part of any SRDF/S implementation; because the speed of light cannot be circumvented, there is little that can be done to mitigate it. The second consideration is more amenable to DBA mitigation: each hypervolume configured in the Symmetrix is only allowed to send a single update I/O across the SRDF link at a time.
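The serialization concern can be illustrated numerically. With one outstanding SRDF write per hypervolume, a large write benefits in proportion to the number of hypervolumes it spans, which is what host striping controls. The write and stripe sizes below are illustrative only:

```python
# Illustrative arithmetic only: with one outstanding SRDF write per
# hypervolume, a large write lands on as many hypers as it spans stripes,
# so smaller stripes let more of the write cross the link in parallel.

def hypers_spanned(write_kb: int, stripe_kb: int, stripe_width: int) -> int:
    """Number of distinct hypervolumes a contiguous write touches."""
    chunks = -(-write_kb // stripe_kb)          # ceiling division
    return min(chunks, stripe_width)

# A 256 KB write with a 64 KB stripe across 4 hypervolumes touches all
# 4 hypers, allowing up to 4 concurrent SRDF writes ...
print(hypers_spanned(256, 64, 4))   # 4
# ... while the same write on a single (unstriped) hypervolume serializes.
print(hypers_spanned(256, 1024, 1)) # 1
```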
Performance degradation results when multiple I/Os are written to a single hypervolume, since subsequent writes must wait for their predecessors to complete. Striping at the host level can be particularly helpful in these situations. Utilizing a smaller stripe size (32 KB to 128 KB) ensures that larger writes are spread across multiple hypervolumes, reducing the chance that SRDF serializes writes across the link.

Database cloning, TimeFinder, and sharing spindles

Database cloning is useful when DBAs wish to create backup or other business continuance images of a database. A common question when laying out a database is whether BCVs or clone targets should share the same physical spindles as the production volumes or be isolated on separate physical disks. There are pros and cons to each approach; the optimal solution generally depends on the anticipated workload.

The primary benefit of spreading BCVs and clone targets across all physical spindles is performance. By spreading I/Os across more spindles, there is a reduced chance of developing bottlenecks on the physical disks. Workloads that utilize BCVs and clone targets, such as backups and reporting databases, may generate high I/O rates. Spreading this workload across more physical spindles may significantly improve performance in these environments.

The main drawbacks to spreading BCVs and clone targets across all spindles in the Symmetrix are that synchronization may cause spindle contention, and that BCV and clone target workloads may negatively impact production database performance. When resynchronizing the BCVs or clone targets, data is read from the production hypers and copied into cache; from there, it is destaged to the target devices.
When production and target devices share the same physical disks, synchronization rates can be greatly reduced by increased seek times, due to the conflict between reading from one part of the disk and writing to another. The other drawback to sharing physical disks is the increased workload on the spindles, which may impact performance on the production volumes: sharing the spindles increases the chance that contention arises, decreasing database performance.

Determining the appropriate location for BCVs and clone targets, whether sharing the same physical spindles or isolated on their own disks, depends on customer preference and workload. In general, it is recommended that the BCVs and clone targets share the same physical spindles. However, in cases where BCV and clone synchronization and utilization may negatively impact applications (for example, databases that run 24-by-7 with high I/O requirements), it may be beneficial to isolate the BCVs and clone targets on their own physical disks.

Database clones using TimeFinder/Snap

TimeFinder/Snap provides many of the benefits of full-volume replication techniques such as TimeFinder/Mirror or TimeFinder/Clone, but at greatly reduced cost. However, a performance consideration must be taken into account when using TimeFinder/Snap to make database clones for backups or other business continuance functions. The implementation of TimeFinder/Snap protection can incur Copy on Access (COA) penalties. This penalty affects the source devices when update access to the snap volumes occurs: protected tracks must be copied to the save devices as updates are processed against the TimeFinder/Snap devices. As updates occur against the source devices, an Asynchronous Copy on First Write (ACOFW) will also occur.
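The copy-on-first-write behavior can be pictured with a small simulation (purely illustrative; the track granularity and class structure are hypothetical): the original contents of a track are preserved in the save pool only on the first write to that track, so subsequent writes to the same track incur no further copy.

```python
# Hypothetical sketch of copy-on-first-write: the original contents of a
# track are copied to the save pool only on the first write to that track
# after the snap is created; later writes to the same track copy nothing.

class SnapSession:
    def __init__(self, source_tracks):
        self.source = dict(source_tracks)   # track -> data on the source
        self.save_pool = {}                 # track -> preserved original data

    def write_source(self, track, new_data):
        # Preserve the original version once; later writes copy nothing.
        if track not in self.save_pool:
            self.save_pool[track] = self.source[track]
        self.source[track] = new_data

    def read_snap(self, track):
        # The snap sees preserved data if the track changed, else the source.
        return self.save_pool.get(track, self.source[track])

s = SnapSession({0: "a", 1: "b"})
s.write_source(0, "a2")   # first write: "a" preserved in the save pool
s.write_source(0, "a3")   # second write: no additional copy
print(s.read_snap(0), s.read_snap(1), len(s.save_pool))  # a b 1
```

The key property shown is that save-pool consumption grows with the number of distinct changed tracks, not with the total number of writes.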
The ACOFW mechanism copies protected tracks into the save pool as updates occur from the production system; because the copy is asynchronous, impact to the production system is mitigated.

Database-specific settings for SQL Server

Microsoft SQL Server is a dynamically tuning database environment. Most interaction with the disk storage system is managed automatically by SQL Server, and thus few tuning options are provided. Issues that may affect the overall performance of a SQL Server database environment on a Symmetrix array are discussed next.

Remote replication performance considerations

As previously discussed, synchronous replication of the storage on which SQL Server data and transaction log files reside can introduce increased latency. It may also be important to monitor the setting of the recovery time interval. The recovery time interval controls the anticipated amount of time it takes to resolve transactional state for the database in the event of a production server failure. Specifically, this setting affects any Recovery Time Objective. It does not affect the Recovery Point Objective, since the use of SRDF/S and the SQL Server write-ahead log guarantees a zero data loss solution (RPO = zero).

With smaller recovery time intervals, the checkpoint mechanisms of the SQL Server database become more aggressive, generating additional write activity to the data files. In higher-latency SRDF/S environments this may have a negative effect on overall performance. Also note that increasing the recovery time interval may result in a larger dirty buffer pool, since dirty pages are allowed to persist in memory for longer periods of time. However, lazy writer operations continue, so dirty pages are still flushed to disk on an ongoing basis, although not as aggressively as by the checkpoint mechanism.
The recovery time interval may be configured with the sp_configure Transact-SQL stored procedure. More information on changing this setting may be found in the SQL Server Books Online documentation.

TEMPDB storage and replication

The TEMPDB system database is utilized by SQL Server to provide semi-persistent storage for certain operations. The amount of activity against the TEMPDB data and log files may be significant in certain environments, depending on the style of Transact-SQL statements being executed and on SQL Server functionality such as the isolation levels introduced by SQL Server 2005 and SQL Server 2008. SQL Server re-creates the TEMPDB storage whenever a new instance of the SQL Server environment is started; thus, no viable persistent data is stored within TEMPDB. Consequently, there is no value in replicating the storage space allocated for TEMPDB data and log files over SRDF links. It is important, however, to have similarly allocated TEMPDB configurations on the target storage, so that in the event of site failover, performance is not adversely affected by insufficient allocation of TEMPDB space. In geographically dispersed clustering solutions such as SRDF/CE for Windows Failover Clustering, the replication of TEMPDB space is unavoidable because all storage space allocation must be shared.

Appendix A Related Documents

Related documents

The following is a list of related documents that may assist readers with more detailed information on topics described in this TechBook.
Many of these documents may be found on the EMC Powerlink site at http://Powerlink.EMC.com.

For Microsoft SQL Server information, refer to the Microsoft websites, including http://www.microsoft.com/sql/, or go directly to the documentation pages for the various versions at http://msdn.microsoft.com/sql/sqlref/docs/default.aspx. Microsoft SQL Server Books Online documentation provides extensive coverage of features and functions and may be installed through the SQL Server installation process, independently of the SQL Server database engine. Updated versions of the Books Online documentation are available for free download from the Microsoft SQL Server website at http://www.microsoft.com/sql.

SYMCLI

◆ Solutions Enabler Release Notes (by release)
◆ Solutions Enabler Support Matrix (by release)
◆ Solutions Enabler Symmetrix Device Masking CLI Product Guide (by release)
◆ Solutions Enabler Symmetrix Base Management CLI Product Guide (by release)
◆ Solutions Enabler Symmetrix CLI Command Reference (by release)
◆ Solutions Enabler Symmetrix Configuration Change CLI Product Guide (by release)
◆ Solutions Enabler Symmetrix SRM CLI Product Guide (by release)
◆ Solutions Enabler Symmetrix Double Checksum CLI Product Guide (by release)
◆ Solutions Enabler Installation Guide (by release)
◆ Solutions Enabler Symmetrix CLI Quick Reference (by release)

TimeFinder

◆ Solutions Enabler Symmetrix TimeFinder Family CLI Product Guide (by release)
◆ TimeFinder/Integration Modules Product Guide (by release)

SRDF

◆ Solutions Enabler Symmetrix SRDF Family CLI Product Guide (by release)
◆ Symmetrix Remote Data Facility (SRDF) Product Guide
◆ Symmetrix Automated Replication UNIX and Windows

Replication Manager

◆ Replication Manager Product Guide
◆ Replication Manager Support Matrix

AutoStart

◆ AutoStart Data Source for SRDF Administrator Guide
◆ AutoStart Module for SQL Server 2005 Administrators Guide

Microsoft Windows Server

◆ EMC Symmetrix with Microsoft
Windows Server 2003 and 2008

Appendix B References

Sample SYMCLI group creation commands

The following shows how Symmetrix device groups and composite groups are created for the TimeFinder family of products, including TimeFinder/Mirror, TimeFinder/Clone, and TimeFinder/Snap.

This example shows how to build and populate a device group and a composite group for TimeFinder/Mirror usage:

◆ Device group:

1. To create the device group, execute the command:

symdg create dbgroup –type regular

2. The standard devices need to be added to the group. The database containers reside on five Symmetrix devices. The device numbers for these are 0CF, 0F9, 0FA, 0FB, and 101:

symld –g dbgroup add dev 0CF
symld –g dbgroup add dev 0F9
symld –g dbgroup add dev 0FA
symld –g dbgroup add dev 0FB
symld –g dbgroup add dev 101

3. Associate the BCV devices with the group. The number of BCV devices should be the same as the number of standard devices, and they should be the same size. The device serial numbers of the BCVs used in the example are 00C, 00D, 063, 064, and 065:

symbcv –g dbgroup associate dev 00C
symbcv –g dbgroup associate dev 00D
symbcv –g dbgroup associate dev 063
symbcv –g dbgroup associate dev 064
symbcv –g dbgroup associate dev 065

◆ Composite group:

1. To create the composite group, execute the command:

symcg create dbgroup –type regular

2. The standard devices need to be added to the composite group. The database containers reside on five Symmetrix devices on two different Symmetrix arrays. The device numbers for these are 0CF and 0F9 on the Symmetrix with the last three digits of 123, and 0FA, 0FB, and 101 on the Symmetrix with the last three digits of 456.

3.
Associate the BCV devices with the composite group. The number of BCV devices should be the same as the number of standard devices, and they should be the same size. The device serial numbers of the BCVs used in the example are 00C, 00D, 063, 064, and 065:

symbcv –cg dbgroup associate dev 00C –sid 123
symbcv –cg dbgroup associate dev 00D –sid 123
symbcv –cg dbgroup associate dev 063 –sid 456
symbcv –cg dbgroup associate dev 064 –sid 456
symbcv –cg dbgroup associate dev 065 –sid 456

This example shows how to build and populate a device group and a composite group for TimeFinder/Clone usage:

◆ Device group:

1. To create the device group dbgroup, execute the command:

symdg create dbgroup –type regular

2. The standard devices need to be added to the group. The database containers reside on five Symmetrix devices. The device numbers for these are 0CF, 0F9, 0FA, 0FB, and 101:

symld –g dbgroup add dev 0CF
symld –g dbgroup add dev 0F9
symld –g dbgroup add dev 0FA
symld –g dbgroup add dev 0FB
symld –g dbgroup add dev 101

3. The target clone devices need to be added to the group. The targets for the clones can be standard devices or BCV devices; in this example, BCV devices are used. The number of BCV devices should be the same as the number of standard devices, and they should be the same size as or larger than the paired standard device. The device serial numbers of the BCVs used in the example are 00C, 00D, 063, 064, and 065:

symbcv –g dbgroup associate dev 00C
symbcv –g dbgroup associate dev 00D
symbcv –g dbgroup associate dev 063
symbcv –g dbgroup associate dev 064
symbcv –g dbgroup associate dev 065

◆ Composite group:

1. To create the composite group dbgroup, execute the command:

symcg create dbgroup –type regular

2. The standard devices need to be added to the group. The database containers reside on five Symmetrix devices on two different Symmetrix arrays.
The device numbers for these are 0CF and 0F9 on the Symmetrix with the last 3 digits of 123, and 0FA, 0FB, and 101 on the Symmetrix with the last 3 digits of 456:

symcg –g dbgroup add dev 0CF –sid 123
symcg –g dbgroup add dev 0F9 –sid 123
symcg –g dbgroup add dev 0FA –sid 456
symcg –g dbgroup add dev 0FB –sid 456
symcg –g dbgroup add dev 101 –sid 456

3. Add the targets for the clones to the composite group. In this example, BCV devices are added to the composite group to simplify the later symclone commands. The number of BCV devices should be the same as the number of standard devices, and they should be the same size. The device serial numbers of the BCVs used in the example are 00C, 00D, 063, 064, and 065:

symbcv –cg dbgroup associate dev 00C –sid 123
symbcv –cg dbgroup associate dev 00D –sid 123
symbcv –cg dbgroup associate dev 063 –sid 456
symbcv –cg dbgroup associate dev 064 –sid 456
symbcv –cg dbgroup associate dev 065 –sid 456

This example shows how to build and populate a device group and a composite group for TimeFinder/Snap usage:

◆ Device group:

1. To create the device group dbgroup, execute the command:

symdg create dbgroup –type regular

2. Add the standard devices to the group. The database containers reside on five Symmetrix devices. The device numbers for these are 0CF, 0F9, 0FA, 0FB, and 101:

symld –g dbgroup add dev 0CF
symld –g dbgroup add dev 0F9
symld –g dbgroup add dev 0FA
symld –g dbgroup add dev 0FB
symld –g dbgroup add dev 101

3.
The device serial numbers of the VDEVs used in the example are 291, 292, 394, 395, and 396: symld symld symld symld symld ◆ –g –g –g –g –g dbgroup dbgroup dbgroup dbgroup dbgroup add add add add add dev dev dev dev dev 291 292 394 395 396 -vdev -vdev -vdev -vdev –vdev Composite group: 1. To create the composite group dbgroup execute the command: symcg create dbgroup –type regular 2. The standard devices need to be added to the composite group. The database containers reside on five Symmetrix devices on two different Symmetrix arrays. The device numbers for these are 0CF, 0F9 on Symmetrix with the last 3 digits of 123, and device numbers 0FA, 0FB and 101 on the Symmetrix with the last 3 digits of 456: symcg symcg symcg symcg symcg –g –g –g –g –g dbgroup dbgroup dbgroup dbgroup dbgroup add add add add add dev dev dev dev dev 0CF 0F9 0FA 0FB 101 –sid –sid –sid –sid –sid 123 123 456 456 456 3. Then the virtual devices or VDEVs need to be added to the composite group. The number of VDEVs should be the same as the number of standard devices. They should also be the same size. The device serial numbers of the VDEVs used in the example are 291, 292, 394, 395, and 396: symld symld symld symld symld –cg –cg –cg –cg –cg dbgroup dbgroup dbgroup dbgroup dbgroup add add add add add dev dev dev dev dev 291 292 394 395 396 –sid –sid –sid –sid –sid 123 123 456 456 456 -vdev -vdev -vdev -vdev -vdev Sample SYMCLI group creation commands 465 References 466 Microsoft SQL Server on EMC Symmetrix Storage Systems Glossary This glossary contains terms related to disk storage subsystems. Many of these terms are used in this manual. A actuator A set of access arms and their attached read/write heads, which move as an independent component within a head and disk assembly (HDA). adapter Card that provides the physical interface between the director and disk devices (SCSI adapter), director and parallel channels (Bus & Tag adapter), director and serial channels (Serial adapter). 
alternate track — A track designated to contain data in place of a defective primary track. See also "primary track."

C

cache — Random access electronic storage used to retain frequently used data for faster access by the channel.

cache slot — Unit of cache equivalent to one track.

channel director — The component in the Symmetrix subsystem that interfaces between the host channels and data storage. It transfers data between the channel and cache.

CKD — Count Key Data, a data recording format employing self-defining record formats in which each record is represented by a count area that identifies the record and specifies its format, an optional key area that may be used to identify the data area contents, and a data area that contains the user data for the record. CKD can also refer to a set of channel commands that are accepted by a device that employs the CKD recording format.

controller ID — Controller identification number of the director the disks are channeled to for EREP usage. There is only one controller ID for Symmetrix.

D

DASD — Direct access storage device, a device that provides nonvolatile storage of computer data and random access to that data.

data availability — Access to any and all user data by the application.

delayed fast write — There is no room in cache for the data presented by the write operation.

destage — The asynchronous write of new or updated data from cache to disk device.

device — A uniquely addressable part of the Symmetrix subsystem that consists of a set of access arms, the associated disk surfaces, and the electronic circuitry required to locate, read, and write data. See also "volume."

device address — The hexadecimal value that uniquely defines a physical I/O device on a channel path. See also "unit address."

device number — The value that logically identifies a disk device in a string.

diagnostics — System-level tests or firmware designed to inspect, detect, and correct failing components. These tests are comprehensive and self-invoking.

director — The component in the Symmetrix subsystem that allows Symmetrix to transfer data between the host channels and disk devices. See also "channel director."

disk director — The component in the Symmetrix subsystem that interfaces between cache and the disk devices.

dual-initiator — A Symmetrix feature that automatically creates a backup data path to the disk devices serviced directly by a disk director, if that disk director or the disk management hardware for those devices fails.

dynamic sparing — A Symmetrix feature that automatically transfers data from a failing disk device to an available spare disk device without affecting data availability. This feature supports all non-mirrored devices in the Symmetrix subsystem.

E

ESCON — Enterprise Systems Connection, a set of IBM and vendor products that connect mainframe computers with each other and with attached storage, locally attached workstations, and other devices using optical fiber technology and dynamically modifiable switches called ESCON Directors. See also "ESCON director."

ESCON director — Device that provides a dynamic switching function and extended link path lengths (with XDF capability) when attaching an ESCON channel to a Symmetrix serial channel interface.

F

fast write — In Symmetrix, a write operation at cache speed that does not require immediate transfer of data to disk. The data is written directly to cache and is available for later destaging.

FBA — Fixed Block Architecture, disk device data storage format using fixed-size data blocks.

frame — Data packet format in an ESCON environment. See also "ESCON."

FRU — Field Replaceable Unit, a component that is replaced or added by service personnel as a single entity.
G

gatekeeper: A small logical volume on a Symmetrix storage subsystem used to pass commands from a host to the Symmetrix storage subsystem. Gatekeeper devices are configured on standard Symmetrix disks.

GB: Gigabyte, 10^9 bytes.

H

head and disk assembly: A field replaceable unit in the Symmetrix subsystem containing the disk and actuator.

home address: The first field on a CKD track that identifies the track and defines its operational status. The home address is written after the index point on each track. See also "CKD."

hyper-volume extension: The ability to define more than one logical volume on a single physical disk device, making use of its full formatted capacity. These logical volumes are user-selectable in size. The minimum volume size is one cylinder and the maximum size depends on the disk device capacity and the emulation mode selected.

I

ID: Identifier, a sequence of bits or characters that identifies a program, device, controller, or system.

IML: Initial microcode program loading.

index marker: Indicates the physical beginning and end of a track.

index point: The reference point on a disk surface that determines the start of a track.

INLINES: An EMC-provided host-based Cache Reporter utility for viewing short- and long-term cache statistics at the system console.

I/O device: An addressable input/output unit, such as a disk device.

K

K: Kilobyte, 1024 bytes.

L

least recently used algorithm (LRU): The algorithm used to identify and make available cache space by removing the least recently used data.

logical volume: A user-defined storage device. In the Model 5200, the user can define a physical disk device as one or two logical volumes.

long miss: Requested data is not in cache and is not in the process of being fetched.

longitudinal redundancy code (LRC): Exclusive OR (XOR) of the accumulated bytes in the data record.
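The LRC entry above is simply a byte-wise XOR across a record. A minimal Python sketch of the idea (the function name and sample record are illustrative only, not part of any EMC implementation):

```python
def lrc(record: bytes) -> int:
    """Longitudinal redundancy code: XOR of every byte in the record."""
    check = 0
    for b in record:
        check ^= b
    return check

# A record followed by its own LRC byte always XORs to zero,
# which is the usual verification step on read-back.
record = b"CKD data area"
assert lrc(record + bytes([lrc(record)])) == 0
```

Because XOR is its own inverse, recomputing the LRC over the record plus its stored check byte and comparing against zero detects any single-bit corruption.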
M

MB: Megabyte, 10^6 bytes.

mirroring: The Symmetrix maintains two identical copies of a designated volume on separate disks. Each volume automatically updates during a write operation. If one disk device fails, Symmetrix automatically uses the other disk device.

mirrored pair: A logical volume with all data recorded twice, once on each of two different physical devices.

P

physical ID: Physical identification number of the Symmetrix director for EREP usage. This value automatically increments by one for each director installed in Symmetrix. This number must be unique in the mainframe system. It should be an even number. This number is referred to as the SCU_ID.

primary track: The original track on which data is stored. See also "alternate track."

promotion: The process of moving data from a track on the disk device to a cache slot.

R

read hit: Data requested by the read operation is in cache.

read miss: Data requested by the read operation is not in cache.

record zero: The first record after the home address.

S

scrubbing: The process of reading, checking the error correction bits, and writing corrected data back to the source.

SCSI adapter: Card in the Symmetrix subsystem that provides the physical interface between the disk director and the disk devices.

short miss: Requested data is not in cache, but is in the process of being fetched.

SSID: For 3990 storage control emulations, this value identifies the physical components of a logical DASD subsystem. The SSID must be a unique number in the host system. It should be an even number and start on a zero boundary.

stage: The process of writing data from a disk device to cache.

storage control unit: The component in the Symmetrix subsystem that connects Symmetrix to the host channels. It performs channel commands and communicates with the disk directors and cache. See also "channel director."

string: A series of connected disk devices sharing the same disk director.
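Several of the cache terms defined above (read hit and miss, stage, destage, and LRU replacement) fit together in one small model. The following Python sketch is a toy illustration under assumed names; it does not reflect actual Symmetrix cache management:

```python
from collections import OrderedDict

class CacheSketch:
    """Toy model of the glossary's cache terms: read hit/miss,
    stage, destage, and least-recently-used (LRU) replacement."""

    def __init__(self, slots: int):
        self.slots = slots
        self.cache = OrderedDict()  # track -> data; order doubles as LRU order
        self.disk = {}              # backing store

    def read(self, track):
        if track in self.cache:                 # read hit: data already cached
            self.cache.move_to_end(track)
            return self.cache[track]
        # read miss: stage the track from disk into a cache slot
        return self._fill(track, self.disk.get(track))

    def write(self, track, data):
        # "room in cache" is a write hit; otherwise the LRU slot is
        # destaged to disk first to make room (the write-miss case)
        self._fill(track, data)

    def _fill(self, track, data):
        if track not in self.cache and len(self.cache) >= self.slots:
            lru_track, lru_data = self.cache.popitem(last=False)  # destage LRU
            self.disk[lru_track] = lru_data
        self.cache[track] = data
        self.cache.move_to_end(track)
        return data

# toy usage: two slots, three tracks
c = CacheSketch(slots=2)
c.write("t1", "a")
c.write("t2", "b")
c.write("t3", "c")          # cache full, so LRU track "t1" is destaged
assert c.disk == {"t1": "a"}
assert c.read("t1") == "a"  # read miss: staged back from disk
```

An `OrderedDict` keyed by track is a common way to get LRU ordering for free: moving an entry to the end on every access leaves the least recently used entry at the front, ready to be popped when a slot is needed.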
U

unit address: The hexadecimal value that uniquely defines a physical I/O device on a channel path. See also "device address."

V

volume: A general term referring to a storage device. In the Symmetrix subsystem, a volume corresponds to a single disk device.

W

write hit: There is room in cache for the data presented by the write operation.

write miss: There is no room in cache for the data presented by the write operation.

Index

A
Adaptive copy 63
Asynchronous SRDF 63
AUTOGROW 140
Autoprovisioning Groups 122

B
BCV 76, 77, 90
bin file 118, 119, 133

C
Cache 53, 62
Change Tracker 52, 60
CKD 66
Composite groups 63
Con group trip 64, 66
Concurrent SRDF 68
Consistency group 66
Consistency groups 51, 63, 64, 65, 66, 72, 90
Crash recovery 83

D
Data devices 129, 131
Dependent-write consistency 66, 71
Device group 63
device move 152
device swap 152
DR 69

E
Enginuity 50, 53, 55, 133
Enginuity Consistency Assist 64, 65, 66, 80, 81, 82
ESCON 53, 61
extent group 152
extent movement 152

F
FAST 147
FAST DP 114
FAST Policies 148
FAST Policy 151
FBA 66
Fibre Channel 53, 61, 118, 119
FICON 53

G
Gigabit Ethernet 53, 61

H
HBA 118, 119, 121, 124, 130

I
Instant File Initialization 135, 145
iSCSI 53, 118

L
LUN masking 118, 121
LUN Offset 121

M
Metavolumes 133
Mirror positions 77

O
Open Replicator 141

P
Path failover 97
Path load balancing 96, 97
Path management 96
PowerPath 52, 64, 96, 127

R
RA group 61, 64, 72
RAID 1 53
RAID 5 53
RAID 6 53
Remote adapter 61
Restartable databases 72
Rolling disaster 65, 66

S
SAN 118, 119, 120
Skewed data access 154
Skewed LUN access 153
SNMP 142
Solutions Enabler 50, 52, 57
SRDF 50, 60, 61, 62, 63, 66, 67, 69, 71, 72, 76, 141
SRDF adaptive copy 63
SRDF Data Mobility 69
SRDF establish 70, 71
SRDF failback 73, 74
SRDF failover 73
SRDF restore 72
SRDF Split 71
SRDF/A 63
SRDF/AR 63
SRDF/CE 75
SRM 52
Storage Class 148
Storage Group 148, 151
Storage Type 148, 151
SYMAPI 50, 57, 82, 142
SYMCLI 57, 60, 65, 76, 89, 92, 93, 121
symclone 76, 84, 85
symmir 76, 77, 79, 80
symsnap 76, 86, 87, 88
Synchronous SRDF 62

T
Temporary table spaces 140
Thin device 129, 130, 131, 136, 141
Thin pool 129, 131
TimeFinder 51, 60, 69, 76, 141
TimeFinder/CG 76
TimeFinder/Clone 51, 76, 83, 84
TimeFinder/Mirror 51, 76, 77, 78
TimeFinder/Mirror establish 77
TimeFinder/Mirror restore 79
TimeFinder/Mirror split 78
TimeFinder/Snap 51, 76, 86, 87

V
VDEV 86, 88
Virtual Provisioning 108, 191

W
WWN 119, 120, 121, 124, 130

Z
Zoning 118