Download DataGuard SwitchOver steps

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Global serializability wikipedia , lookup

Commitment ordering wikipedia , lookup

DBase wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Serializability wikipedia , lookup

Microsoft Access wikipedia , lookup

IMDb wikipedia , lookup

Btrieve wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

Functional Database Model wikipedia , lookup

Oracle Database wikipedia , lookup

Ingres (database) wikipedia , lookup

Microsoft SQL Server wikipedia , lookup

SQL wikipedia , lookup

Open Database Connectivity wikipedia , lookup

Database wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Versant Object Database wikipedia , lookup

Concurrency control wikipedia , lookup

PL/SQL wikipedia , lookup

Relational model wikipedia , lookup

Database model wikipedia , lookup

Clusterpoint wikipedia , lookup

ContactPoint wikipedia , lookup

Transcript
11.2 Data Guard Physical Standby Switchover Best Practices using
SQL*Plus (Doc ID 1304939.1)
In this Document
Goal
Fix
I. Prerequisites / Preparation
Apply Latest Patch Bundle.
Review Primary Database Initialization Parameters
Verify the Setup
Understand and Test Fallback Options
II. Pre-Switchover Checks
Verify Managed Recovery is running on the standby
Verify Sufficient Number of Archiver Processes
Ensure Online Redo Log Files on the Target Physical Standby have been cleared
Verify there are no large Gaps
Verify Primary and Standby tempfiles match and all datafiles are ONLINE
III. Switchover
Clear Potential Blocking Parameters & Jobs
Shutdown all mid-tiers (optional)
Turn on Data Guard tracing on primary and standby
Tail Alert Logs (optional) on all instances
Create Guaranteed Restore Points (optional)
Verify that the primary database can be switched to the standby role
If The Primary is a RAC, then shutdown all secondary primary instances
Switchover the primary to a standby database
Verify the standby has received the end-of-redo (EOR) log(s)
Verify that the standby database can be switched to the primary role
Switchover the standby database to a primary
Open the new primary database
Correct any tempfile mismatch
Restart the new standby
Contingency or Fallback
IV. Post-Switchover Steps
Set Trace to Prior Value
Reset Jobs
Drop any Switchover Guaranteed Restore Points
References
Applies to:
Oracle Server - Enterprise Edition - Version 11.2.0.1 and later
Information in this document applies to any platform.
Goal
Perform trouble free Data Guard switchover.
Note: for Data Guard switchover using the Broker please refer to Note 1305019.1 - "11.2 Data
Guard Physical Standby Switchover Best Practices using the Broker"
Fix
I. Prerequisites / Preparation
These are items that should only have to be done once during configuration and setup.
Alert: If you upgraded your databases to 11.2.0.2 from a prior release (e.g. 10.2, 11.1, 11.2.0.1)
it is imperative that you refer to Note 1288640.1 "Managed Recovery (MRP) Fails w/ ORA-328
After Upgrade to 11.2.0.2 and Switchover" before continuing.
Apply Latest Patch Bundle.


Review Note 785351.1 "11gR2 Upgrade Companion"
o Make sure to check the “Patches Recommended” tab.
See Note 756671.1 for the latest available patches or patchset updates.
Review Primary Database Initialization Parameters
Ensure that the LOG_ARCHIVE_CONFIG & DG_CONFIG initialization parameters are
established at the primary database. See Section 3.1.4 Set Primary Database Initialization
Parameters
Verify the Setup Physical Standby Database Is Performing Properly
Once you create the physical standby database and set up redo transport services, you may want
to verify database modifications are being successfully transmitted from the primary database to
the standby database.
To see that redo data is being received on the standby database, you should first identify the
existing archived redo log files on the standby database, force a log switch and archive a few
online redo log files on the primary database, and then check the standby database again. The
following steps show how to perform these tasks.
Step 1 Identify the existing archived redo log files.
On the standby database, query the V$ARCHIVED_LOG view to identify existing files in the
archived redo log. For example:
SQL> SELECT SEQUENCE#, FIRST_TIME, NEXT_TIME > FROM V$ARCHIVED_LOG ORDER BY SEQUENCE#;
SEQUENCE#
---------8
9
10
FIRST_TIME
-----------------11-JUL-07 17:50:45
11-JUL-07 17:50:53
11-JUL-07 17:50:58
NEXT_TIME
-----------------11-JUL-07 17:50:53
11-JUL-07 17:50:58
11-JUL-07 17:51:03
3 rows selected.
Step 2 Force a log switch to archive the current online redo log file.
On the primary database, issue the ALTER SYSTEM SWITCH LOGFILE statement to force a
log switch and archive the current online redo log file group:
SQL> ALTER SYSTEM SWITCH LOGFILE;
Step 3 Verify the new redo data was archived on the standby database.
On the standby database, query the V$ARCHIVED_LOG view to verify the redo data was
received and archived on the standby database:
SQL> SELECT SEQUENCE#, FIRST_TIME, NEXT_TIME > FROM V$ARCHIVED_LOG ORDER BY SEQUENCE#;
SEQUENCE# FIRST_TIME
---------- -----------------8 11-JUL-07 17:50:45
9 11-JUL-07 17:50:53
10 11-JUL-07 17:50:58
11 11-JUL-07 17:51:03
4 rows selected.
NEXT_TIME
-----------------11-JUL-07 17:50:53
11-JUL-07 17:50:58
11-JUL-07 17:51:03
11-JUL-07 18:34:11
The archived redo log files are now available to be applied to the physical standby
database.
Step 4 Verify that received redo has been applied.
On the standby database, query the V$ARCHIVED_LOG view to verify that received redo
has been applied:
SQL> SELECT SEQUENCE#,APPLIED FROM V$ARCHIVED_LOG > ORDER BY SEQUENCE#;
SEQUENCE#
--------8
9
10
11
APP
--YES
YES
YES
IN-MEMORY
4 rows selected.
Note:
The value of the APPLIED column for the most recently received log file will be either INMEMORY or YES if that log file has been applied.
Understand and Test Fallback Options
See Following link for Failed Data guard steps
http://docs.oracle.com/cd/E11882_01/server.112/e17022/troubleshooting.htm#SBYDB4916
II. Pre-Switchover Checks
These steps should be completed before the switchover planned maintenance window begins.
Our recommendation is that these are done a couple days in advance.
Verify Managed Recovery is running on the standby
The following query at the standby verifies that managed recovery is running:
SQL> SELECT PROCESS FROM V$MANAGED_STANDBY WHERE PROCESS LIKE 'MRP%';
The following query at the Primary verifies that recovery is running with “REAL TIME
APPLY” option. In the example below, LOG_ARCHIVE_DEST_2 is established to ship redo to
the target standby (dest_id=2):
SQL> SELECT RECOVERY_MODE FROM V$ARCHIVE_DEST_STATUS WHERE DEST_ID=2;
RECOVERY_MODE
----------------------MANAGED REAL TIME APPLY
If managed standby recovery is not running or not started with real-time apply, restart managed
recovery with real-time apply enabled:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE
DISCONNECT;
Note: If you previously defined a delay for this standby the delay is ignored when you start real
time apply
Verify Sufficient Number of Archiver Processes
Make sure LOG_ARCHIVE_MAX_PROCESSES is set to 4 or higher on every primary and
standby database in the Data Guard configuration. Take care not to set it too high as the
additional archivers can increase the time to shutdown the database. This parameter can be set
dynamically via ALTER SYSTEM.
Ensure Online Redo Log Files on the Target Physical Standby have been cleared
Online redo logs on the target physical standby need to be cleared before that standby database
can become a primary database. Although this will automatically happen as part of the
SWITCHOVER TO PRIMARY command, it is recommended that the logs are cleared prior to
the switchover.
Setting the LOG_FILE_NAME_CONVERT parameter at the physical standby will cause the
online redo logs to be automatically cleared when managed recovery is started on the standby.
If your databases are using Oracle Managed Files (OMF) or you have already set the parameter
LOG_FILE_NAME_CONVERT you can skip this step as the online log files will always be
cleared automatically.
Clearing online redo logs as part of the SWITCHOVER TO PRIMARY command can make the
switchover command susceptible to termination by another process that is waiting on access to
the CONTROLFILE. The CONTROLFILE waiter will attempt to kill the switchover after a
timeout is 15 minutes.
Oracle recommends setting LOG_FILE_NAME_CONVERT to automatically clear online redo
logs on the physical standby database. In the event the primary database and the physical standby
database have the exact same directory path to the online redo logs, it is acceptable to set
LOG_FILE_NAME_CONVERT such that the entry pairs have the same value.
As an example, if the online redo logs are stored in /oradata/order_db/redo for both the primary
and physical standby databases on their respective servers, you can set the parameter value as
LOG_FILE_NAME_CONVERT=’/oradata/order_db/redo/’,’/oradata/order_db/redo/’
This will initiate automatic clearing of the online redo logs on the physical standby database
when managed recovery is started.
Since the LOG_FILE_NAME_CONVERT parameter is not dynamic you must restart the
standby database for the property change to take affect.
If you have not set your environment to automatically clear the online redo logs and you do not
want to restart the standby database, you should manually clear them at some point prior to the
switchover. This can be done at any time.
On the target physical standby run the following query to determine if the online redo logs have
not been cleared:
SQL> SELECT DISTINCT L.GROUP# FROM V$LOG L, V$LOGFILE LF
WHERE L.GROUP# = LF.GROUP#
AND L.STATUS NOT IN (‘UNUSED’, ‘CLEARING’,’CLEARING_CURRENT’);
If the above query returns rows, on the target physical standby issue the following statement for
each GROUP# returned:
SQL> ALTER DATABASE CLEAR LOGFILE GROUP <ORL GROUP# from the query above>;
Note that later when you do the actual switchover if it is terminated by a CONTROLFILE waiter
timeout, just re-issue the SWITCHOVER TO PRIMARY command until it completes
successfully.
You should monitor your alert log to ensure your online redo logs are being cleared and you are
not experiencing some other issue.
Verify there are no large Gaps
Identify the current sequence number for each thread on the primary database
SQL> SELECT THREAD#, SEQUENCE# FROM V$THREAD;
Verify the target physical standby database has applied up to, but not including the logs from the
primary query. On the standby the following query should be within 1 or 2 of the primary query
result.
SQL> SELECT THREAD#, MAX(SEQUENCE#) FROM V$ARCHIVED_LOG
WHERE APPLIED = 'YES'
AND RESETLOGS_CHANGE# = (SELECT RESETLOGS_CHANGE#
FROM V$DATABASE_INCARNATION WHERE STATUS = ‘CURRENT’)
GROUP BY THREAD#;
Verify Primary and Standby tempfiles match and all datafiles are ONLINE
For each temporary tablespace on the standby, verify that temporary files associated with that
tablespace on the primary database also exist on the standby database. Tempfiles added after
initial standby creation are not propagated to the standby. Run this query on both the primary and
target physical standby databases and verify that they match.
SQL> SELECT TMP.NAME FILENAME, BYTES, TS.NAME TABLESPACE
FROM V$TEMPFILE TMP, V$TABLESPACE TS WHERE TMP.TS#=TS.TS#;
If the queries do not match then you can correct the mismatch now or immediately after the open
of the new primary database.
Prior to switchover, on the target standby, verify that all datafiles necessary for updates after role
transition to primary are ONLINE.
On the target standby:
SQL> SELECT NAME FROM V$DATAFILE WHERE STATUS=’OFFLINE’;
If there are any OFFLINE datafiles, and these are needed after switchover, bring them ONLINE:
SQL> ALTER DATABASE DATAFILE ‘datafile-name’ ONLINE;
III. Switchover
These steps are completed as part of the switchover process on the day of the planned outage.
Clear Potential Blocking Parameters & Jobs
Capture current job state on the primary:
SQL> SELECT * FROM DBA_JOBS_RUNNING;
Depending on what the running job is, be ready to terminate the job if necessary.
SQL> SELECT OWNER, JOB_NAME, START_DATE, END_DATE, ENABLED FROM
DBA_SCHEDULER_JOBS WHERE ENABLED=’TRUE’ AND OWNER <> ‘SYS”;
SQL> SHOW PARAMETER job_queue_processes
Note: Job candidates to be disabled among others: oracle text sync and optimizer, RMAN
backups, application garbage collectors, application background agents.
Block further job submission
SQL> ALTER SYSTEM SET job_queue_processes=0 SCOPE=BOTH SID=’*’;
Disable any jobs that may interfere.
SQL> EXECUTE DBMS_SCHEDULER.DISABLE( <job_name> );
Shutdown all mid-tiers (optional)
This can be done in parallel to the switchover.
$ opmnctl stopall
Turn on Data Guard tracing on primary and standby
Tracing is turned on to have diagnostic information available in case any issues arise. Turning on
tracing does not have any noticeable impact on switchover time but does require space for the
trace output.
Capture the current value on both the primary and the target physical standby databases
SQL> SHOW PARAMETER log_archive_trace
Set Data Guard trace level to 8191 on both the primary and the target physical standby databases
SQL> ALTER SYSTEM SET log_archive_trace=8191;
Trace output will appear under the destination pointed to by the database parameter
BACKGROUND_DUMP_DEST with “mrp” in the file name.
Tail Alert Logs (optional) on all instances
Locate alert logs by showing database parameter background_dump_dest
SQL> SHOW PARAMETER background_dump_dest
Tail the alert logs
> tail –f <background_dump_dest location>/alert*
Create Guaranteed Restore Points (optional)
The standard switchover fallback options should suffice for successfully backing out of a
switchover. However, if you want an additional fallback option then you can create a guaranteed
restore point on the primary and standby database participating in the switchover.
On the standby
Stop the apply process
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
Create a guaranteed restore point
SQL> CREATE RESTORE POINT SWITCHOVER_START_GRP GUARANTEE FLASHBACK DATABASE;
Start the apply process
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
USING CURRENT LOGFILE DISCONNECT;
On the primary
Create a guaranteed restore point
SQL> CREATE RESTORE POINT SWITCHOVER_START_GRP GUARANTEE FLASHBACK DATABASE;
Note: If a guaranteed restore points are created, make sure they are dropped post-switchover!
Verify that the primary database can be switched to the standby role
Query the SWITCHOVER_STATUS column of the V$DATABASE view on the primary
database:
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;
SWITCHOVER_STATUS
----------------TO STANDBY
A value of TO STANDBY or SESSIONS ACTIVE (which requires the WITH SESSION
SHUTDOWN clause on the switchover command) indicates that the primary database can be
switched to the standby role. If neither of these values is returned, a switchover is not possible
because redo transport is either mis-configured or is not functioning properly.
If The Primary is a RAC, then shutdown all secondary primary instances
A normal or immediate shutdown can be done, but to expedite the shutdown issue a
SHUTDOWN ABORT on secondary RAC instances on the primary cluster only leaving one
Primary instance up. Wait until the remaining Primary instance has completed cluster
reconfiguration (and performed recovery if you chose to abort the secondary instances) before
continuing.
Switchover the primary to a standby database
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY WITH SESSION SHUTDOWN;
If an ORA-16139 error is encountered, as long as
V$DATABASE.DATABASE_ROLE=’PHYSICAL STANDBY’, then you can proceed. A
common case where this can occur is when there are a large number of data files. Once managed
recovery is started on the new standby, the database will recover.
If the role was not changed then you need to cancel the switchover and review the alert logs and
trace files further.
Verify the standby has received the end-of-redo (EOR) log(s)
In the primary alert log you will see messages like these:
Switchover: Primary controlfile converted to standby controlfile succesfully.
Tue Mar 15 16:12:15 2011
MRP0 started with pid=17, OS id=2717
MRP0: Background Managed Standby Recovery process started (SFO)
Serial Media Recovery started
Managed Standby Recovery not using Real Time Apply
Online logfile pre-clearing operation disabled by switchover
Media Recovery Log
/u01/app/flash_recovery_area/SFO/archivelog/2011_03_15/o1_mf_1_133_6qzl0yvd_.arc
Identified End-Of-Redo for thread 1 sequence 133
Resetting standby activation ID 0 (0x0)
Media Recovery End-Of-Redo indicator encountered
Media Recovery Applied until change 4314801
MRP0: Media Recovery Complete: End-Of-REDO (SFO)
MRP0: Background Media Recovery process shutdown (SFO)
Tue Mar 15 16:12:21 2011
Switchover: Complete - Database shutdown required (SFO)
Completed: ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY
WITH SESSION SHUTDOWN
And correspondingly in the standby alert log file you should see messages like these:
Tue Mar 15 16:12:15 2011
RFS[8]: Assigned to RFS process 2715
RFS[8]: Identified database type as 'physical standby': Client is Foreground pid 2568
Media Recovery Log
/u01/app/flash_recovery_area/NYC/archivelog/2011_03_15/o1_mf_1_133_6qzl0yjp_.arc
Identified End-Of-Redo for thread 1 sequence 133
Resetting standby activation ID 2680651518 (0x9fc77efe)
Media Recovery End-Of-Redo indicator encountered
Media Recovery Continuing
Resetting standby activation ID 2680651518 (0x9fc77efe)
Media Recovery Waiting for thread 1 sequence 134
In versions prior to Oracle Database 11g Release 2, the MRP (Redo Apply coordinator) would
stop automatically after processing the End-of-Redo marker. With Oracle Database 11g Release
2, it no longer stops leaving all bystander standby databases still ready to apply redo from the
new primary database without having to be restarted. The MRP process will be shut down
automatically by the switchover command when executed at the target standby database.
Verify that the standby database can be switched to the primary role
Query the SWITCHOVER_STATUS column of the V$DATABASE view on the standby
database:
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;
SWITCHOVER_STATUS
----------------TO PRIMARY
A value of TO PRIMARY or SESSIONS ACTIVE indicates that the standby database is ready to
be switched to the primary role. If neither of these values is returned, verify that redo apply is
active and that redo transport is configured and working properly. Continue to query this column
until the value returned is either TO PRIMARY or SESSIONS ACTIVE.
Switchover the standby database to a primary
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
In the standby alert log file you should see messages like these:
Tue Mar 15 16:16:44 2011
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION
SHUTDOWN
ALTER DATABASE SWITCHOVER TO PRIMARY (NYC)
Maximum wait for role transition is 15 minutes.
Switchover: Media recovery is still active
Role Change: Canceling MRP - no more redo to apply
Tue Mar 15 16:16:45 2011
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /u01/app/diag/rdbms/nyc/NYC/trace/NYC_pr00_2467.trc:
ORA-16037: user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Waiting for MRP0 pid 2460 to terminate
Errors in file /u01/app/diag/rdbms/nyc/NYC/trace/NYC_pr00_2467.trc:
ORA-16037: user requested cancel of managed recovery operation
Tue Mar 15 16:16:45 2011
MRP0: Background Media Recovery process shutdown (NYC)
Role Change: Canceled MRP
Open the new primary database
SQL> ALTER DATABASE OPEN;
Note: There will be an increase in I/O activity while the new primary’s standby redo logs are
cleared.
Correct any tempfile mismatch
If there was a tempfile that was not corrected during the pre-switchover check, then correct it
now on the new primary.
Restart the new standby
If the new standby database (former primary database) was not shutdown since switching it to
standby, bring it to the mount state and start managed recovery. This can be done in parallel to
the new primary open.
SQL> SHUTDOWN ABORT;
Note: If you use IMMEDIATE, an ABORT will be performed anyway as of 11.2.0.2 and you
would see the following in the alert log:
Performing implicit shutdown abort due to switchover to physical standby
Shutting down instance (abort)
License high water mark = 15
USER (ospid: 14665): terminating the instance
Instance terminated by USER, pid = 14665
SQL> STARTUP MOUNT;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE
DISCONNECT;
Note: If you were using a delay for your standby then you would restart the apply without real
time apply:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;
Finally, if the database is a RAC, then start all secondary instances on the new standby.
Contingency or Fallback
IV. Post-Switchover Steps
Set Trace to Prior Value
For each instance on the Primary and Standby:
SQL> ALTER SYSTEM SET log_archive_trace=<prior value>;
Reset Jobs
Set the job queue processes to its original value on the new standby.
SQL> ALTER SYSTEM SET job_queue_processes=<value saved> scope=both sid=’*’
Enable any jobs that were disabled.
SQL> EXECUTE DBMS_SCHEDULER.ENABLE(<for each job name captured>);
Drop any Switchover Guaranteed Restore Points
On all databases where a Guaranteed Restore point was created
SQL> DROP RESTORE POINT SWITCHOVER_START_GRP;