Lowell Noodle Company
Oracle 11g Physical Standby
Data Guard
GROUP 4
Minh Vo, Susan Champigny, Ganapathi S. Santhana
10/13/2012
1 University Avenue
Lowell, MA 01854
Phone: 978 934-4000
Introduction
The solution that best fits the LNC requirements at this point in time is Oracle Data Guard. We are confident
that this is the best approach short of implementing RMAN. Data Guard provides an active standby database
for the production database. This environment protects against planned or unplanned downtime as well as
against data loss if the primary database environment becomes unavailable. This is extremely important in a
24x7 production environment and weighed heavily in our decision. In conjunction with proper hardware and
technology, the standby database can also be kept synchronized with the primary database, providing
continued database availability.
In the event of problems (including hardware failures, application issues, user errors, and unforeseen disasters), the
standby database environment can be quickly activated to maintain availability. In addition, configuring Flashback
on both the primary and the standby enables rapid role transitions and reduces the effort required to re-establish
database roles after a transition. Following best practices, we reduced the Flashback retention target from the
default of 24 hours to 2 hours. Using Data Guard will ensure an effective recovery protection strategy.
When ready, we highly suggest integrating RMAN operations into your environment. RMAN provides
performance enhancements such as automatic parallelization of backup, restore, and recovery operations.
Furthermore, due to your 24x7 environment, it would be wise to consider configuring a recovery catalog schema,
created in a separate database on a separate server. A Recovery Catalog allows report generation and the
storage of RMAN scripts in a repository.
The benefit of using a Recovery Catalog rather than RMAN alone is the availability of metadata about the
production (target) database; the catalog can also hold metadata for additional databases if needed in the future.
Because the catalog stores information about more than one database incarnation, reports on the production
(target) database can be run from a non-current incarnation.
If the recovery catalog itself has to be recovered, it can be re-created from a previous backup. The catalog can
also be relocated to another database by importing an export of the previous catalog owner's data into the
schema of a newly created user. Alternatively, a new database can be created and the entire database imported
from an export of the recovery catalog database.
Plan of Action
PREPARATION
o Prepare the Primary database to be used to create a Physical Standby
o Create PFILES for Production and Standby
PHYSICAL STANDBY CREATION
o LNCSB_CREATE_STANDBY.sh
o Configure Listener and TNSNAMES
o Create/Copy Password Files
STARTUP AND TESTING
o Startup Primary and Standby Databases
o Check Archive Log Working for Primary
o Monitoring Standby Managed Recovery Environment
CONTROLLED FAILOVER
o Create Flashback Point on Standby
o Disconnect Primary and Startup Standby as New Primary Database
o Testing and Verification
o Revert Standby from Flashback
o Reconnect Primary and Standby Databases
APPENDIX
o Using ORAPWD Does Not Work!
o Syntax Error and Missing PFILE entries.
o Startup in Active Data Guard Mode
Preparation
PREPARING PRIMARY DATABASE (ORA11) FOR SAME MACHINE ACTIVE DATA GUARD
After checking the Production database and verifying basic functionality, we edited the Primary pfile and
created a new one for lncsb, which will be our Standby. We took extra care to make sure the log archive
destination parameters and data file locations were set properly.
Care was also taken in defining and referencing the db_unique_name parameters, keeping in mind that we are
adding lncsb (Standby SID), lnc_fc12ora112 (Service to Primary), and lncsb_fc12ora112 (Service to Standby).
Notice also the FAL_SERVER and FAL_CLIENT settings. The pfiles look as follows:
INITORA11.ORA
audit_file_dest="/u01/app/oracle/admin/ora11/adump"
audit_trail=NONE
compatible=11.2.0.0.0
control_files=/u02/oradata/ora11/ora11_ctrl01.ctl, /u02/oradata/ora11/ora11_ctrl02.ctl
db_block_size=8192
db_cache_size=243269632
db_domain=fc12ora112.uml.edu
db_name="ora11"
db_recovery_file_dest_size=4196401152
db_recovery_file_dest="/u01/app/oracle/fast_recovery_area"
diagnostic_dest=/u01/app/oracle
java_pool_size=67108864
large_pool_size=37748736
## log_archive_dest_1='LOCATION=/u02/oradata/ora11/arch/'
## log_archive_dest_state_1='ENABLE'
## log_archive_format=ora11_%s_%t_%r.arc
open_cursors=300
pga_aggregate_target=127926272
processes=150
shared_pool_size=247463936
undo_tablespace=UNDOTBS1
event=""
## add parameters missing, but ignore deprecated ones
db_file_multiblock_read_count=8
db_flashback_retention_target=1440
job_queue_processes=2
utl_file_dir=/u01/app/oracle/admin/ora11/utl
undo_management=AUTO
## Data Guard config
log_archive_config='DG_CONFIG=(lnc_fc12ora112,lncsb_fc12ora112)'
log_archive_dest_1='LOCATION=/u02/oradata/ora11/arch/ VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=lnc_fc12ora112'
log_archive_dest_2='SERVICE=lncsb_fc12ora112 VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=lncsb_fc12ora112 LGWR ASYNC REOPEN=10'
log_archive_dest_state_1=ENABLE
log_archive_dest_state_2=ENABLE
db_unique_name='lnc_fc12ora112'
service_names=lnc_fc12ora112
log_archive_format=ora11_%s_%t_%r.arc
FAL_CLIENT='lnc_fc12ora112'
FAL_SERVER='lncsb_fc12ora112'
## standby_archive_dest='/u04/oradata/lncsb/arch'
## deprecated...
standby_file_management='AUTO'
remote_login_passwordfile='SHARED'
INITLNCSB.ORA
audit_file_dest="/u01/app/oracle/admin/lncsb/adump"
audit_trail=NONE
compatible=11.2.0.0.0
control_files=/u04/oradata/lncsb/stdby.ctl
db_block_size=8192
db_cache_size=243269632
db_domain=fc12ora112.uml.edu
db_name="ora11"
db_recovery_file_dest_size=4196401152
db_recovery_file_dest="/u01/app/oracle/fast_recovery_area"
diagnostic_dest=/u01/app/oracle
java_pool_size=67108864
large_pool_size=37748736
open_cursors=300
pga_aggregate_target=127926272
processes=150
shared_pool_size=247463936
undo_tablespace=UNDOTBS1
event=""
## add parameters missing, but ignore deprecated ones
db_file_multiblock_read_count=8
db_flashback_retention_target=1440
db_file_name_convert=('/u02/oradata/ora11','/u04/oradata/lncsb')
job_queue_processes=2
utl_file_dir=/u01/app/oracle/admin/lncsb/utl
undo_management=AUTO
## Data Guard config
log_archive_config='DG_CONFIG=(lnc_fc12ora112,lncsb_fc12ora112)'
log_archive_dest_1='LOCATION=/u04/oradata/lncsb/arch/ VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=lncsb_fc12ora112'
log_archive_dest_2='SERVICE=lnc_fc12ora112 VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=lnc_fc12ora112 LGWR ASYNC REOPEN=10'
log_archive_dest_state_1=ENABLE
log_archive_dest_state_2=ENABLE
db_unique_name='lncsb_fc12ora112'
service_names=lncsb_fc12ora112
log_archive_format=lncsb_%s_%t_%r.arc
log_file_name_convert=('/u02/oradata/ora11','/u04/oradata/lncsb')
FAL_CLIENT='lncsb_fc12ora112'
FAL_SERVER='lnc_fc12ora112'
## standby_archive_dest='/u04/oradata/lncsb/arch'
## deprecated...
standby_file_management='AUTO'
remote_login_passwordfile='SHARED'
Next, on the Primary, we created an spfile from the modified pfile, then moved them to
/u01/app/oracle/admin/ora11/pfile/ and created symbolic links.
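The move-and-link step might look roughly like the sketch below. The helper name relocate_with_symlink is hypothetical, and the assumption that the parameter files start out in $ORACLE_HOME/dbs is ours; the report does not say where the links were created.

```shell
# Hypothetical helper: move one parameter file into an admin directory
# and leave a symbolic link behind at its old location.
# Before this, the spfile would be created in SQL*Plus with something like:
#   create spfile from pfile='<path to the edited pfile>';
relocate_with_symlink() {
    src="$1"        # e.g. $ORACLE_HOME/dbs/initora11.ora (assumed location)
    dest_dir="$2"   # e.g. /u01/app/oracle/admin/ora11/pfile
    mkdir -p "$dest_dir" || return 1
    mv "$src" "$dest_dir/" || return 1
    # Oracle still looks in the old location, so point it at the new file.
    ln -s "$dest_dir/$(basename "$src")" "$src"
}
```

Usage (paths are illustrative): relocate_with_symlink "$ORACLE_HOME/dbs/initora11.ora" /u01/app/oracle/admin/ora11/pfile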
We verified that permissions are at least 775 and set them where they were not.
We then started up ora11 and verified that the spfile is being used and that the new configuration parameters took effect.
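One way to run this check is sketched below. The helper name verify_params_sql is hypothetical and the choice of parameters is ours, taken from the pfile listing above; it only prints SQL*Plus commands, so nothing touches the database until the output is piped into sqlplus.

```shell
# Hypothetical helper: print SQL*Plus commands that show whether an spfile
# is in use and whether the Data Guard parameters from the pfile took effect.
verify_params_sql() {
cat <<'EOF'
show parameter spfile
show parameter db_unique_name
show parameter log_archive_config
show parameter log_archive_dest_2
show parameter fal_
EOF
}
```

Usage, with ORACLE_SID set to ora11: verify_params_sql | sqlplus -s "/ as sysdba". An empty value for spfile would mean the instance was started from a pfile.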
Verify other parameters:
Turn on archivelog mode, open the database and check. Optional: make sure that Flashback is on for added safety.
Our primary database is now ready for us to run scripts to create a Standby from it. That’s exactly what we’ll do
next.
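A minimal sketch of the archivelog step follows, assuming the database can be bounced at this point. The helper name archivelog_sql is hypothetical; the statements themselves are standard SQL*Plus and Oracle SQL.

```shell
# Hypothetical helper: print the SQL*Plus statements that switch the
# primary into archivelog mode and then report the result.
archivelog_sql() {
cat <<'EOF'
shutdown immediate
startup mount
alter database archivelog;
-- optional, for added safety: alter database flashback on;
alter database open;
archive log list
select log_mode, flashback_on from v$database;
EOF
}
```

Usage, with ORACLE_SID set to ora11: archivelog_sql | sqlplus -s "/ as sysdba". The final query should report ARCHIVELOG in the log_mode column.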
Standby Creation
CREATE STANDBY DATABASE (LNCSB) FROM PRIMARY (ORA11)
Make the directory structure for lncsb as follows:
Change permissions to 775 for these directories, then do the same for /u04/oradata/lncsb and
/u04/oradata/lncsb/arch directories.
Create the lncsb_create_standby.sh script used to copy necessary files over to standby location. Ours looked like
this:
LNCSB_CREATE_STANDBY.SH
-- mikec - mv modified 10/11/2012
-- execute in primary database (ora11) as SYS user
-- ensure SID is set to ora11 (we're using this as primary db)
-- Set SQL*Plus variables to manipulate output
set feedback on heading off verify off
set pagesize 0 linesize 200
-- Set SQL*Plus user variables used in script
-- Linux User variables
define dir = '/u04/oradata/lncsb'
define fil = '/tmp/lncsb_coldbkup.sql'
define pdir = '/u01/app/oracle/admin/ora11/pfile'
alter database backup controlfile to trace;
prompt *** Spooling to &fil
spool &fil
select 'host cp '|| name ||' &dir' from v$datafile order by 1;
select 'host cp '|| member ||' &dir' from v$logfile order by 1;
-- select 'host cp '|| name ||' &dir' from v$controlfile order by 1;
select 'host cp '|| name ||' &dir' from v$tempfile order by 1;
spool off;
-- Shutdown the database cleanly
!echo Database: $ORACLE_SID shutting down.
shutdown immediate;
!echo .
!echo Instance shutdown...
-- Run the copy file commands
!echo Copying database files...
@&fil
-- Start the database again
!echo Database Copied...Starting Up in MOUNT Mode Now
startup mount;
alter database create standby controlfile as '/u04/oradata/lncsb/stdby.ctl';
!echo Standby Database Control File Created for lncSB
!echo
!echo ora11 Database Shutting Down...
shutdown immediate
!echo ora11 Database Shutdown...
!echo DBA Should Startup lncSB in Standby Mode First
!echo Then Startup lnc in Normal Mode Second
!echo Manually Switch Logfile in lnc To Activate Standby Recovery Mode
set feedback off
exit
The script is stored in /tmp and run from inside sqlplus as sysdba.
Verify permissions for the standby database data files are set to 775.
Ensure that a standby control file has been created from the primary database.
The lncsb_create_standby.sh script already created the standby control file. We then shut down the database.
The next time we start it up will be after the standby database is already running in standby managed recovery
mode.
CONFIGURE LISTENER AND TNSNAMES
Now we proceed to set up communication between the Primary and Standby databases. In our case, they are both
on the same server. Key values in listener.ora here are SID_NAME and GLOBAL_DBNAME… making sure each
SID_LIST item is set correctly.
LISTENER.ORA
# listener.ora Network Configuration File: /u01/app/oracle/product/11.2.0/db_2/network/admin/listener.ora
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = fc12ora112.uml.edu)(PORT = 1521))
)
)
ADR_BASE_LISTENER = /u01/app/oracle
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(GLOBAL_DBNAME=ora11)
(SID_NAME=ora11)
(ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_2)
)
(SID_DESC =
(GLOBAL_DBNAME=lncsb_fc12ora112)
(SID_NAME=lnc_fc12ora112)
(ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_2)
)
(SID_DESC =
(GLOBAL_DBNAME=lnc_fc12ora112)
(SID_NAME=lnc_fc12ora112)
(ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_2)
)
(SID_DESC =
(GLOBAL_DBNAME=rman)
(SID_NAME=rman)
(ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_2)
)
)
TNSNAMES.ORA
ORA11 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = fc12ora112.uml.edu)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = ora11)
)
)
LNCSB_FC12ORA112 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = fc12ora112.uml.edu)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = lncsb_fc12ora112)
)
)
LNC_FC12ORA112 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = fc12ora112.uml.edu)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = lnc_fc12ora112)
)
)
RMAN =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = fc12ora112.uml.edu)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = rman)
)
)
CREATE PASSWORD FILES FOR STANDBY AND MAKE SURE PASSWORDS MATCH
Note: This caused some FAL connection failures in our test environments. Using the orapwd options force=y
ignorecase=y solves the problem. Otherwise, copy orapwora11 straight to orapwlncsb. [see Appendix]
Startup
STARTUP PRIMARY AND STANDBY DATABASES
We made a new pfile for lncsb during our preparation stages. Now we’ll create a link to it:
Make sure /etc/oratab has an entry with the Oracle home for lncsb.
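A quick check for the oratab entry could look like the sketch below. The helper name oratab_has_sid is hypothetical; it relies only on the standard oratab line format SID:ORACLE_HOME:Y|N.

```shell
# Hypothetical helper: succeed if the given SID has an entry in oratab.
# The second argument is optional and defaults to /etc/oratab.
oratab_has_sid() {
    sid="$1"
    oratab="${2:-/etc/oratab}"
    # An entry starts with "<SID>:" at the beginning of a line.
    grep -q "^${sid}:" "$oratab"
}
```

Usage: oratab_has_sid lncsb || echo "lncsb is missing from /etc/oratab"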
Create an spfile from the pfile and start up the standby database in standby managed recovery mode.
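The sequence for bringing a physical standby into managed recovery might be sketched as follows. The helper name standby_startup_sql is hypothetical and the pfile path is supplied by the caller, since the exact file location is not reproduced here.

```shell
# Hypothetical helper: print the SQL*Plus statements that build the spfile
# and bring the standby up in managed recovery mode.
standby_startup_sql() {
    pfile="$1"   # path to the standby pfile (caller-supplied)
    cat <<EOF
create spfile from pfile='$pfile';
startup nomount
alter database mount standby database;
alter database recover managed standby database disconnect from session;
EOF
}
```

Usage, with ORACLE_SID set to lncsb (the pfile path shown is an assumption): standby_startup_sql /u01/app/oracle/admin/lncsb/pfile/initlncsb.ora | sqlplus -s "/ as sysdba"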
Make sure the data files have permissions of at least 775:
Now start up the Primary (ora11). Sometimes we got:
ORA-01157: cannot identify/lock data file 1 - see DBWR trace file
ORA-01110: data file 1: '/u02/oradata/ora11/ora11_system01.dbf'
Restarting sometimes worked, but making sure dg_config and db_file_name_convert were properly set fixed it
permanently. [see Appendix]
Everything looks good so far. Proceed to test. On the Primary database, we issued alter system switch logfile; and
ran it a few times. Then we checked to see if the sequence numbers were incrementing correctly, which they were.
As far as we could tell, things were looking okay. But when we checked whether the logs were being received
correctly on the standby, they were not, so we fixed it. [see Appendix]
After the fix, we checked Data Guard status on both databases with
SQL> select message from v$dataguard_status;
Looks like it’s working. Now we’ll continue testing the Standby environment.
MONITORING THE STANDBY ENVIRONMENT
On the primary database, run alter system switch logfile; a few times (again) and select from
v$archived_log to see that the changes are logged.
Check overall status of managed recovery:
Determine whether the standby site failed to receive any log files:
About 20 logs are still outstanding.
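A query along these lines lists the logs archived locally on the primary (destination 1) that have no matching entry for the standby destination (destination 2). This is a sketch of the commonly documented pattern, not necessarily the exact query we ran; the helper name missing_logs_sql is hypothetical.

```shell
# Hypothetical helper: print a query, to be run on the primary, that lists
# sequences archived to dest 1 but not recorded for dest 2 (the standby).
missing_logs_sql() {
cat <<'EOF'
select local.thread#, local.sequence#
  from (select thread#, sequence# from v$archived_log where dest_id = 1) local
 where local.sequence# not in
       (select sequence# from v$archived_log
         where dest_id = 2 and thread# = local.thread#)
 order by local.thread#, local.sequence#;
EOF
}
```

Usage, with ORACLE_SID set to ora11: missing_logs_sql | sqlplus -s "/ as sysdba". No rows returned means every locally archived log was also sent to the standby destination.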
Now on the standby database, check if archive logs are received properly.
Check if archive logs are being applied successfully on the standby database:
Check managed standby status with
SQL> select process, client_process, sequence#, status from v$managed_standby;
We did one more alter system switch logfile; for good measure (on the Primary).
We then verified that the new sequence# (120) had been applied on the standby.
Finally, make sure there are no gaps on the physical standby database.
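The gap check can be sketched as a query against v$archive_gap on the standby. The helper name gap_check_sql is hypothetical; the view and its columns are standard.

```shell
# Hypothetical helper: print a query, to be run on the standby, that
# reports any archive gap currently blocking recovery.
gap_check_sql() {
cat <<'EOF'
select thread#, low_sequence#, high_sequence# from v$archive_gap;
EOF
}
```

Usage, with ORACLE_SID set to lncsb: gap_check_sql | sqlplus -s "/ as sysdba". No rows returned means no gap was detected.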
Activate Physical Standby: Controlled failover
Now for the good stuff… We’ll test the Physical Standby to see if it can properly act as Primary.
From a fresh reboot, we started the listener (lsnrctl start), started up the standby in mount managed recovery
mode, started up the Primary normally, and verified Data Guard status.
On the standby, we’re going to create a restore point so that after testing we can revert the standby back to its
original purpose. Verify that flashback is on and if not, enable it. Cancel the standby’s managed recovery.
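These steps might be sketched as follows. The helper name restore_point_sql and the default restore point name before_failover_test are hypothetical; a guaranteed restore point is used so the database can be flashed back regardless of the retention target.

```shell
# Hypothetical helper: print the SQL*Plus statements, for the standby, that
# stop redo apply and create a guaranteed restore point.
# The restore point name is an illustrative default, not from the report.
restore_point_sql() {
    rp="${1:-before_failover_test}"
    cat <<EOF
select flashback_on from v\$database;
alter database recover managed standby database cancel;
-- if flashback_on reported NO: alter database flashback on;
create restore point $rp guarantee flashback database;
select name, scn, guarantee_flashback_database from v\$restore_point;
EOF
}
```

Usage, with ORACLE_SID set to lncsb: restore_point_sql | sqlplus -s "/ as sysdba"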
Start the standby back up, then on the Primary do a couple of logfile switches and DEFER log_archive_dest_state_2.
Now on the Standby, cancel managed standby recovery mode and stop shipping any logfiles to the old primary, since
we’re going to change its role.
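As a sketch, the statements for each side could be generated like this. Both helper names are hypothetical. Without an explicit SCOPE clause, alter system changes both memory and the spfile when an spfile is in use.

```shell
# Hypothetical helper: statements for the Primary (ORACLE_SID=ora11).
primary_defer_sql() {
cat <<'EOF'
alter system switch logfile;
alter system switch logfile;
alter system set log_archive_dest_state_2=defer;
EOF
}

# Hypothetical helper: statements for the Standby (ORACLE_SID=lncsb).
standby_detach_sql() {
cat <<'EOF'
alter database recover managed standby database cancel;
alter system set log_archive_dest_state_2=defer;
EOF
}
```

Usage: primary_defer_sql | sqlplus -s "/ as sysdba" on the Primary, then standby_detach_sql | sqlplus -s "/ as sysdba" on the Standby.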
Convert the standby database to a primary database.
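A minimal sketch of the conversion, assuming the standby is mounted and redo apply has been cancelled. The helper name activate_standby_sql is hypothetical.

```shell
# Hypothetical helper: print the SQL*Plus statements that activate the
# standby as a primary, open it, and confirm its new role.
activate_standby_sql() {
cat <<'EOF'
alter database activate standby database;
alter database open;
select database_role, open_mode from v$database;
EOF
}
```

Usage, with ORACLE_SID set to lncsb: activate_standby_sql | sqlplus -s "/ as sysdba". The final query should report PRIMARY and READ WRITE.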
By now the standby database is open for testing! Note that no changes will be sent to primary database since
destination is disabled. Cool.
Query a few tables in ora1 schema to test.
Testing of the Standby acting as Primary looks good; everything seems accessible. Optionally we could
change the Data Guard protection mode to maximum performance if needed (alter database set standby database to
maximize performance;).
Perform a Flashback to undo any changes and convert the database back to a physical standby.
Start it back up in managed standby recovery mode and re-enable log_archive_dest_state_2. Make sure to also re-enable log_archive_dest_state_2 on the old Primary when it is brought back up.
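The revert could be sketched as below, following the usual flashback-then-convert pattern for a physical standby. The helper name revert_to_standby_sql and the default restore point name before_failover_test are hypothetical; the name must match whatever restore point was created before the test.

```shell
# Hypothetical helper: print the SQL*Plus statements that flash the test
# database back and return it to the physical standby role.
revert_to_standby_sql() {
    rp="${1:-before_failover_test}"   # illustrative restore point name
    cat <<EOF
startup mount force
flashback database to restore point $rp;
alter database convert to physical standby;
startup mount force
alter database recover managed standby database disconnect from session;
alter system set log_archive_dest_state_2=enable;
EOF
}
```

Usage, with ORACLE_SID set to lncsb: revert_to_standby_sql | sqlplus -s "/ as sysdba". The second startup mount force is there because the convert statement leaves the instance unmounted.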
On the Primary, after enabling log destination 2, run alter system switch logfile; a few more times and verify that
the two databases are once again communicating.
Looking pretty good at this point. We checked the messages in v$dataguard_status and ran select sequence#,
applied from v$archived_log order by sequence#;
Sequence numbers show that they’ve been sent and applied correctly! Things are back to Data Guard working
status.
Finally drop the restore point on standby. It’s no longer needed.
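A sketch of the cleanup follows. The helper name drop_restore_point_sql and the default name before_failover_test are hypothetical; use the name the restore point was actually created with.

```shell
# Hypothetical helper: print the statements that drop the restore point
# and confirm that none is left behind.
drop_restore_point_sql() {
    rp="${1:-before_failover_test}"   # illustrative restore point name
    cat <<EOF
drop restore point $rp;
select name from v\$restore_point;
EOF
}
```

Usage, with ORACLE_SID set to lncsb: drop_restore_point_sql | sqlplus -s "/ as sysdba". Dropping a guaranteed restore point also releases the flashback logs it was holding in the fast recovery area.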
Appendix
USING ORAPWD TO GENERATE BOTH PASSWORD FILES DOES NOT WORK IN 11G
Our archive logs were not being sent over to the standby database. This is how we fixed it.
Okay, we have a problem. We spent about 10-12 hours on this issue. At first we thought it might be due to a bad
listener configuration, so we went there and changed db_host=”fc12ora112.uml.edu” to db_host=”” (this had no effect).
Something had to be misconfigured somewhere, so we checked v$dataguard_status and the logs.
Next we looked at our alert logs:
Error 1017 received logging on to the standby (on production). Only after seeing this error in Primary alert logs did
we find the cause (which has been noted earlier in this report). Google led us to
http://remigium.blogspot.com/2012/02/error-1017-received-logging-on-to.html, which solved our issues.
On our databases we were receiving:
Error 1041 from standby
Error 1017 on primary
Using orapwd to create the two password files, even with the same password, does not work! It worked in 10g, but
in 11g it’s better to copy the password file from the primary to the standby and rename it:
Shut down the primary, then cancel managed recovery on the standby, shut down the standby, and start everything
back up so that the new Oracle password file is used.
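The copy-and-rename can be sketched as below. The helper name copy_password_file is hypothetical, and the assumption that both password files live in $ORACLE_HOME/dbs (as orapwora11 and orapwlncsb) is ours.

```shell
# Hypothetical helper: copy the primary's password file to the name the
# standby instance expects, preserving mode and timestamps.
copy_password_file() {
    dbs_dir="$1"       # e.g. $ORACLE_HOME/dbs (assumed location)
    primary_sid="$2"   # e.g. ora11
    standby_sid="$3"   # e.g. lncsb
    cp -p "$dbs_dir/orapw$primary_sid" "$dbs_dir/orapw$standby_sid"
}
```

Usage: copy_password_file "$ORACLE_HOME/dbs" ora11 lncsb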
SYNTAX ERROR IN ONE OF OUR PFILES
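One cheap way to catch this class of mistake before startup is to scan a pfile for parameter lines that contain no '=' at all. The sketch below is a rough heuristic, not a parser: the helper name pfile_lines_missing_equals is hypothetical, and a value continued across several lines could produce false positives.

```shell
# Hypothetical helper: print "line_number: text" for every pfile line that
# is neither blank nor a comment and contains no '=' sign.
pfile_lines_missing_equals() {
    awk '
        /^[ \t]*#/ { next }          # skip comment lines
        /^[ \t]*$/ { next }          # skip blank lines
        !/=/       { print NR ": " $0 }
    ' "$1"
}
```

Usage (the file location is illustrative): pfile_lines_missing_equals "$ORACLE_HOME/dbs/initlncsb.ora". No output means every parameter line has an assignment.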
LOG_FILE_NAME_CONVERT WAS MISSING IN ONE OF OUR PFILES
This caused some erratic behavior and possibly some locking. Adding the parameter fixed it.
STARTUP IN ACTIVE DATA GUARD MODE
On the Standby database, to turn it into Active mode, cancel the managed recovery then use:
SQL> alter database open read only;
Then restart managed recovery with:
SQL> alter database recover managed standby database nodelay disconnect;
This allows the Standby database to be queried while managed recovery is still running.