Lowell Noodle Company
Oracle 11g Physical Standby Data Guard

GROUP 4
Minh Vo, Susan Champigny, Ganapathi S. Santhana
10/13/2012
1 University Avenue, Lowell, MA 01854
Phone: 978 934-4000

Introduction

The solution that best fits the LNC requirements at this point in time is Oracle Data Guard. We are confident that this is the best approach without implementing RMAN. Data Guard provides an active standby database for the production database. This environment protects against planned and unplanned downtime, as well as against data loss if the primary database environment becomes unavailable. This is extremely important in a 24x7 production environment, and it weighed heavily in our recommendation. With proper hardware and technology, the standby database can also be kept synchronized with the primary database, providing continued database availability. In the event of problems (including hardware failures, application issues, user errors, and unforeseen disasters), the standby database environment can be quickly activated to maintain availability.

In addition, configuring Flashback on both the primary and the standby enables rapid role transitions and reduces the effort required to re-establish database roles after a transition. Following best practices, we reduced the Flashback retention target from the default of 24 hours to 2 hours. Using Data Guard will ensure an effective recovery protection strategy.

When ready, we strongly suggest integrating RMAN operations into your environment. RMAN provides performance enhancements such as automatic parallelization of backup, restore, and recovery operations. Furthermore, due to your 24x7 environment, it would be wise to consider configuring a recovery catalog schema, created in a separate database on a separate server.
Using a recovery catalog allows report generation and the storage of RMAN scripts in a repository. The benefit of a recovery catalog over RMAN alone is the availability of metadata about the production (target) database; the catalog can hold multiple databases if needed in the future. Because it stores information about more than one database incarnation, it allows reporting on the production (target) database from a non-current incarnation. If the recovery catalog itself has to be recovered, you can create a database from a previous backup. Alternatively, the catalog can be relocated to another database by importing an export of the previous catalog owner into the schema of a newly created user. In addition, a new database can be created and the entire database imported from an export of the recovery catalog database.

Plan of Action

PREPARATION
o Prepare the Primary database to be used to create a Physical Standby
o Create PFILES for Production and Standby

PHYSICAL STANDBY CREATION
o LNCSB_CREATE_STANDBY.sh
o Configure Listener and TNSNAMES
o Create/Copy Password Files

STARTUP AND TESTING
o Startup Primary and Standby Databases
o Check Archive Log Working for Primary
o Monitoring Standby Managed Recovery Environment

CONTROLLED FAILOVER
o Create Flashback Point on Standby
o Disconnect Primary and Startup Standby as New Primary Database
o Testing and Verification
o Revert Standby from Flashback
o Reconnect Primary and Standby Databases

APPENDIX
o Using ORAPWD Does Not Work!
o Syntax Error and Missing PFILE entries
o Startup in Active Data Guard Mode

Preparation

PREPARING PRIMARY DATABASE (ORA11) FOR SAME MACHINE ACTIVE DATA GUARD

After checking the Production database and verifying basic functionality, we edited the Primary pfile and created a new one for lncsb, which will be our Standby.
We took extra care to make sure the log archive destination parameters and data file locations are properly set. Care was also taken in defining and referencing the db_unique_name parameters, keeping in mind that we're planning to add lncsb (Standby SID), lnc_fc12ora112 (Service to Primary), and lncsb_fc12ora112 (Service to Standby) as SIDs/services. Notice also the FAL_SERVER and FAL_CLIENT settings. The pfiles look as follows:

INITORA11.ORA

audit_file_dest="/u01/app/oracle/admin/ora11/adump"
audit_trail=NONE
compatible=11.2.0.0.0
control_files=/u02/oradata/ora11/ora11_ctrl01.ctl, /u02/oradata/ora11/ora11_ctrl02.ctl
db_block_size=8192
db_cache_size=243269632
db_domain=fc12ora112.uml.edu
db_name="ora11"
db_recovery_file_dest_size=4196401152
db_recovery_file_dest="/u01/app/oracle/fast_recovery_area"
diagnostic_dest=/u01/app/oracle
java_pool_size=67108864
large_pool_size=37748736
## log_archive_dest_1='LOCATION=/u02/oradata/ora11/arch/'
## log_archive_dest_state_1='ENABLE'
## log_archive_format=ora11_%s_%t_%r.arc
open_cursors=300
pga_aggregate_target=127926272
processes=150
shared_pool_size=247463936
undo_tablespace=UNDOTBS1
event=""
## add parameters missing, but ignore deprecated ones
db_file_multiblock_read_count=8
db_flashback_retention_target=1440
job_queue_processes=2
utl_file_dir=/u01/app/oracle/admin/ora11/utl
undo_management=AUTO
## Data Guard config
log_archive_config='DG_CONFIG=(lnc_fc12ora112,lncsb_fc12ora112)'
log_archive_dest_1='LOCATION=/u02/oradata/ora11/arch/ VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=lnc_fc12ora112'
log_archive_dest_2='SERVICE=lncsb_fc12ora112 VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=lncsb_fc12ora112 LGWR ASYNC REOPEN=10'
log_archive_dest_state_1=ENABLE
log_archive_dest_state_2=ENABLE
db_unique_name='lnc_fc12ora112'
service_names=lnc_fc12ora112
log_archive_format=ora11_%s_%t_%r.arc
FAL_CLIENT='lnc_fc12ora112'
FAL_SERVER='lncsb_fc12ora112'
## standby_archive_dest='/u04/oradata/lncsb/arch'
## deprecated...
standby_file_management='AUTO'
remote_login_passwordfile='SHARED'

INITLNCSB.ORA

audit_file_dest="/u01/app/oracle/admin/lncsb/adump"
audit_trail=NONE
compatible=11.2.0.0.0
control_files=/u04/oradata/lncsb/stdby.ctl
db_block_size=8192
db_cache_size=243269632
db_domain=fc12ora112.uml.edu
db_name="ora11"
db_recovery_file_dest_size=4196401152
db_recovery_file_dest="/u01/app/oracle/fast_recovery_area"
diagnostic_dest=/u01/app/oracle
java_pool_size=67108864
large_pool_size=37748736
open_cursors=300
pga_aggregate_target=127926272
processes=150
shared_pool_size=247463936
undo_tablespace=UNDOTBS1
event=""
## add parameters missing, but ignore deprecated ones
db_file_multiblock_read_count=8
db_flashback_retention_target=1440
db_file_name_convert=('/u02/oradata/ora11','/u04/oradata/lncsb')
job_queue_processes=2
utl_file_dir=/u01/app/oracle/admin/lncsb/utl
undo_management=AUTO
## Data Guard config
log_archive_config='DG_CONFIG=(lnc_fc12ora112,lncsb_fc12ora112)'
log_archive_dest_1='LOCATION=/u04/oradata/lncsb/arch/ VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=lncsb_fc12ora112'
log_archive_dest_2='SERVICE=lnc_fc12ora112 VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=lnc_fc12ora112 LGWR ASYNC REOPEN=10'
log_archive_dest_state_1=ENABLE
log_archive_dest_state_2=ENABLE
db_unique_name='lncsb_fc12ora112'
service_names=lncsb_fc12ora112
log_archive_format=lncsb_%s_%t_%r.arc
log_file_name_convert=('/u02/oradata/ora11','/u04/oradata/lncsb')
FAL_CLIENT='lncsb_fc12ora112'
FAL_SERVER='lnc_fc12ora112'
## standby_archive_dest='/u04/oradata/lncsb/arch'
## deprecated...
standby_file_management='AUTO'
remote_login_passwordfile='SHARED'

Next, on the Primary, we created an spfile from the modified pfile, then moved both files to /u01/app/oracle/admin/ora11/pfile/ and created symbolic links.

We verified that permissions are at least 775 and set them where they were not.
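As a rough sketch, the spfile step described above could be run from SQL*Plus as SYSDBA along the following lines. The pfile name and location are assumptions based on the directories named in this report, and the statements are an illustration rather than a record of the exact commands that were typed.

```sql
-- Sketch only: run as SYSDBA on the primary (ora11) while the instance is shut down.
-- The pfile path below is an assumption based on the directories named in this report.
CREATE SPFILE FROM PFILE='/u01/app/oracle/admin/ora11/pfile/initora11.ora';

-- Start the instance; Oracle looks for an spfile before falling back to a pfile.
STARTUP

-- A non-empty VALUE here indicates that the instance was started from an spfile.
SHOW PARAMETER spfile

-- Spot-check a few of the Data Guard settings taken from the pfile.
SHOW PARAMETER log_archive_config
SHOW PARAMETER db_unique_name
SHOW PARAMETER fal
```

SHOW PARAMETER matches on substrings, so the last command lists both FAL_CLIENT and FAL_SERVER.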
We started up ora11 and verified that the spfile is being used and that the new configuration parameters took effect.

Verify other parameters:

Turn on archivelog mode, open the database and check. Optional: make sure that Flashback is on for added safety.

Our primary database is now ready for us to run scripts to create a Standby from it. That's exactly what we'll do next.

Standby Creation

CREATE STANDBY DATABASE (LNCSB) FROM PRIMARY (ORA11)

Make the directory structure for lncsb as follows:

Change permissions to 775 for these directories, then do the same for the /u04/oradata/lncsb and /u04/oradata/lncsb/arch directories.

Create the lncsb_create_standby.sh script used to copy the necessary files over to the standby location. Despite the .sh extension, it is a SQL*Plus script. Ours looked like this:

LNCSB_CREATE_STANDBY.SH

-- mikec - mv modified 10/11/2012
-- execute in primary database (ora11) as SYS user
-- ensure SID is set to ora11 (we're using this as primary db)
-- Set SQL*Plus variables to manipulate output
set feedback on heading off verify off
set pagesize 0 linesize 200
-- Set SQL*Plus user variables used in script
-- Linux User variables
define dir = '/u04/oradata/lncsb'
define fil = '/tmp/lncsb_coldbkup.sql'
define pdir = '/u01/app/oracle/admin/ora11/pfile'
alter database backup controlfile to trace;
prompt *** Spooling to &fil
spool &fil
select 'host cp '|| name ||' &dir' from v$datafile order by 1;
select 'host cp '|| member ||' &dir' from v$logfile order by 1;
-- select 'host cp '|| name ||' &dir' from v$controlfile order by 1;
select 'host cp '|| name ||' &dir' from v$tempfile order by 1;
spool off;
-- Shutdown the database cleanly
!echo Database: $ORACLE_SID shutting down.
shutdown immediate;
!echo .
!echo Instance shutdown...
-- Run the copy file commands
!echo Copying database files...
@&fil
-- Start the database again
!echo Database Copied...Starting Up in MOUNT Mode Now
startup mount;
alter database create standby controlfile as '/u04/oradata/lncsb/stdby.ctl';
!echo Standby Database Control File Created for lncSB
!echo
!echo ora11 Database Shutting Down...
shutdown immediate
!echo ora11 Database Shutdown...
!echo DBA Should Startup lncSB in Standby Mode First
!echo Then Startup lnc in Normal Mode Second
!echo Manually Switch Logfile in lnc To Activate Standby Recovery Mode
set feedback off
exit

The script is stored in /tmp and is run inside sqlplus as sysdba.

Verify that permissions for the standby database data files are set to 775.

Ensure that a standby control file has been created from the primary database. The lncsb_create_standby.sh script already created the control file. We then shut down the database. The next time we start it up will be after the standby database is already running in standby managed recovery mode.

CONFIGURE LISTENER AND TNSNAMES

Now we set up communication between the Primary and Standby databases. In our case, they are both on the same server. The key values in listener.ora are SID_NAME and GLOBAL_DBNAME; make sure each SID_LIST item is set correctly.

LISTENER.ORA

# listener.ora Network Configuration File: /u01/app/oracle/product/11.2.0/db_2/network/admin/listener.ora
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = fc12ora112.uml.edu)(PORT = 1521))
    )
  )

ADR_BASE_LISTENER = /u01/app/oracle

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME=ora11)
      (SID_NAME=ora11)
      (ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_2)
    )
    (SID_DESC =
      (GLOBAL_DBNAME=lncsb_fc12ora112)
      (SID_NAME=lnc_fc12ora112)
      (ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_2)
    )
    (SID_DESC =
      (GLOBAL_DBNAME=lnc_fc12ora112)
      (SID_NAME=lnc_fc12ora112)
      (ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_2)
    )
    (SID_DESC =
      (GLOBAL_DBNAME=rman)
      (SID_NAME=rman)
      (ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_2)
    )
  )

TNSNAMES.ORA

ORA11 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = fc12ora112.uml.edu)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ora11)
    )
  )

LNCSB_FC12ORA112 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = fc12ora112.uml.edu)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = lncsb_fc12ora112)
    )
  )

LNC_FC12ORA112 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = fc12ora112.uml.edu)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = lnc_fc12ora112)
    )
  )

RMAN =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = fc12ora112.uml.edu)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = rman)
    )
  )

CREATE PASSWORD FILES FOR STANDBY AND MAKE SURE PASSWORDS MATCH

Note: This caused some FAL connection failures in our test environments. Creating the file with the orapwd options force=y ignorecase=y solves the problem. Otherwise, copy orapwora11 straight to orapwlncsb. [see Appendix]

Startup

STARTUP PRIMARY AND STANDBY DATABASES

We made a new pfile for lncsb during our preparation stages.
Now we'll create a link to it:

Make sure /etc/oratab has the Oracle home for lncsb set.

Create the spfile from the pfile and start up the standby database in standby managed recovery mode.

Make sure the data files have at least 775:

Now start up the Primary (ora11). Sometimes we got:

ORA-01157: cannot identify/lock data file 1 - see DBWR trace file
ORA-01110: data file 1: '/u02/oradata/ora11/ora11_system01.dbf'

Restarting sometimes worked, but making sure that DG_CONFIG and db_file_name_convert were properly set fixed it permanently. [see Appendix]

Everything looks good so far. Proceed to test. On the Primary database, we issued alter system switch logfile; a few times. Then we checked whether the sequence numbers were incrementing correctly, which they were.

As far as we could tell, things were looking okay. But then we checked whether the logs were being received correctly on the standby. They were not, so we fixed it. [see Appendix]

After fixing it, we checked Data Guard status on both databases with

SQL> select message from v$dataguard_status

Looks like it's working. Now we'll continue testing the Standby environment.

MONITORING THE STANDBY ENVIRONMENT

On the primary database, run alter system switch logfile; a few times (again) and select from v$archived_log to see that the changes are logged.

Check the overall status of managed recovery:

Determine whether the standby site failed to receive any log files:

About 20 logs were still to be processed.

Now on the standby database, check whether archive logs are received properly.

Check whether archive logs are being applied successfully on the standby database:

Check managed standby status with

SQL> select process, client_process, sequence#, status from v$managed_standby

We did one more alter system switch logfile; for good measure (on the Primary).
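The standby startup and the monitoring checks in this section could be sketched as follows. The queries against v$archived_log and v$managed_standby are the ones quoted in this report; the remaining statements are standard Data Guard commands added for illustration, and their use here is an assumption about what was run.

```sql
-- Sketch only; run each part as SYSDBA on the instance named in the comment.

-- On the standby (lncsb): mount it as a standby and start managed recovery.
STARTUP NOMOUNT
ALTER DATABASE MOUNT STANDBY DATABASE;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;

-- On the primary (ora11): force a few log switches so there is redo to ship.
ALTER SYSTEM SWITCH LOGFILE;
ALTER SYSTEM SWITCH LOGFILE;

-- On the primary: check that the standby destination reports no error.
SELECT dest_id, status, error FROM v$archive_dest WHERE dest_id = 2;

-- On the standby: confirm that logs arrive and are applied.
SELECT sequence#, applied FROM v$archived_log ORDER BY sequence#;

-- On the standby: look at the state of the recovery and transport processes.
SELECT process, client_process, sequence#, status FROM v$managed_standby;

-- On the standby: any rows returned here indicate a gap in the received sequence.
SELECT thread#, low_sequence#, high_sequence# FROM v$archive_gap;
```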
We then verified that the new sequence# (120) has been applied on the standby.

Finally, make sure there are no gaps on the physical standby database.

Activate Physical Standby: Controlled failover

Now for the good stuff: we'll test the Physical Standby to see if it can properly act as Primary. From a fresh reboot, we ran lsnrctl start, started the standby in mounted managed recovery mode, started the Primary normally, and verified Data Guard status.

On the standby, we create a restore point so that after testing we can revert the standby to its original purpose. Verify that flashback is on and, if not, enable it. Cancel the standby's managed recovery.

Start the standby back up, then on the Primary do a couple of logfile switches and DEFER log_archive_dest_state_2.

Now on the Standby, cancel managed standby recovery mode and stop shipping any logfiles to the old primary, since we're going to change its role.

Convert the standby database to a primary database. By now the standby database is open for testing! Note that no changes will be sent to the primary database since the destination is disabled. Cool. Query a few tables in the ora1 schema to test.

Testing of the standby acting as Primary looks good. Everything seems accessible. Optionally, we could change the Data Guard protection mode to maximum performance if needed (alter database set standby database to maximize performance;).

Perform Flashback to undo any changes and reactivate the database as a physical standby. Start it back up in managed standby recovery mode and re-enable log_archive_dest_state_2. Make sure to also re-enable log_archive_dest_state_2 on the old Primary when it is brought back up.

On the Primary, after enabling log destination 2, run alter system switch logfile; a few more times and verify that the two databases are once again communicating. Looking pretty good at this point.
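The controlled failover test described above follows the general pattern of activating a physical standby for read/write testing and then flashing it back. A minimal sketch is given here. The restore point name before_test is hypothetical, and the exact statements and their order are an assumption rather than a transcript of the session.

```sql
-- Sketch only. The restore point name "before_test" is hypothetical.

-- 1. On the standby: stop redo apply and create a guaranteed restore point.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
SELECT flashback_on FROM v$database;
CREATE RESTORE POINT before_test GUARANTEE FLASHBACK DATABASE;

-- 2. On the primary: switch logs, then stop shipping redo to the standby.
ALTER SYSTEM SWITCH LOGFILE;
ALTER SYSTEM SWITCH LOGFILE;
ALTER SYSTEM SET log_archive_dest_state_2=DEFER;

-- 3. On the standby: stop it from shipping to the old primary, then activate and open it.
ALTER SYSTEM SET log_archive_dest_state_2=DEFER;
ALTER DATABASE ACTIVATE STANDBY DATABASE;
ALTER DATABASE OPEN;

-- 4. On the standby, after testing: flash back and return to the standby role.
STARTUP MOUNT FORCE
FLASHBACK DATABASE TO RESTORE POINT before_test;
ALTER DATABASE CONVERT TO PHYSICAL STANDBY;
STARTUP MOUNT FORCE
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
ALTER SYSTEM SET log_archive_dest_state_2=ENABLE;

-- 5. On the primary: resume redo transport.
ALTER SYSTEM SET log_archive_dest_state_2=ENABLE;

-- 6. On the standby, once apply has caught up: drop the restore point.
--    (Redo apply may need to be cancelled first and restarted afterwards.)
DROP RESTORE POINT before_test;
```

Because the restore point is guaranteed, the flashback logs it needs are kept until it is dropped, so step 6 should not be skipped on a system with a small fast recovery area.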
We checked the messages in v$dataguard_status and ran select sequence#, applied from v$archived_log order by sequence#;

The sequence numbers show that the logs have been sent and applied correctly! Data Guard is back in working status.

Finally, drop the restore point on the standby. It's no longer needed.

Appendix

USING ORAPWD TO GENERATE BOTH PASSWORD FILES DOES NOT WORK IN 11G

Our archive logs were not being sent over to the standby database. This is how we fixed it.

Okay, we have a problem. We spent about 10-12 hours on this issue. At first we thought it might be due to a bad listener configuration, so we went there and changed db_host="fc12ora112.uml.edu" to db_host="" (this had no effect). It must be misconfigured somewhere. We checked v$dataguard_status and the logs.

Next we looked at our alert logs: Error 1017 received logging on to the standby (on production).

Only after seeing this error in the Primary alert logs did we find the cause (which has been noted earlier in this report). Google led us to http://remigium.blogspot.com/2012/02/error-1017-received-logging-on-to.html, which solved our issues. On our databases we were receiving:

Error 1041 from standby
Error 1017 on primary

Using orapwd to create the two password files, even with the same password, does not work! It worked in 10g, but in 11g it's better to copy the password file from the primary to the standby and rename it:

Shut down the primary, then cancel managed recovery on the standby, shut down the standby, and turn everything back on so that the new Oracle password file is used.

SYNTAX ERROR IN ONE OF OUR PFILES

LOG_FILE_NAME_CONVERT WAS MISSING IN ONE OF OUR PFILES

This caused some erratic behavior and possibly some file locking. Adding it fixed the problem.
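One way to catch a missing conversion parameter like this is to query the running instance instead of re-reading the pfile. The following is a sketch under the assumption that the standby uses an spfile; the directory pair is the one from the standby pfile shown earlier in this report.

```sql
-- Sketch only: run as SYSDBA on the standby (lncsb).

-- Show the current conversion settings; an empty VALUE means the parameter is unset.
SELECT name, value
  FROM v$parameter
 WHERE name IN ('db_file_name_convert', 'log_file_name_convert');

-- Both parameters are static, so the change is written to the spfile
-- and takes effect at the next instance restart.
ALTER SYSTEM SET log_file_name_convert='/u02/oradata/ora11','/u04/oradata/lncsb' SCOPE=SPFILE;
```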
STARTUP IN ACTIVE DATA GUARD MODE

On the Standby database, to turn it into Active mode, cancel the managed recovery, then use:

SQL> alter database open read only;

And start recovery again with:

SQL> alter database recover managed standby database nodelay disconnect;

This allows the Standby database to be queried while managed recovery is still running.
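Put together, the sequence above might look like the following sketch. The cancel statement and the final check against v$database are additions for illustration and were not part of the commands quoted in this report.

```sql
-- Sketch only: run as SYSDBA on the standby (lncsb).

-- Stop redo apply, open the standby read-only, then restart redo apply.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE OPEN READ ONLY;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE NODELAY DISCONNECT;

-- Confirm the role and that the database is open read-only.
SELECT database_role, open_mode FROM v$database;
```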