Database Disaster Recovery
and More
Presented by: Stephen Rea
University of Arkansas Cooperative Extension Service
April 3, 2006
Evaluation Code 082
April 2-5 Orlando, Florida
Introduction
“Backups? We don’t NEED no steenkin’ backups!” - Famous last words of an ex-DBA
- Stephen Rea - Oracle® Database Administrator
- University of Arkansas Cooperative Extension Service
- This session will show you step-by-step instructions for:
  - Jaunt: Bulletproofing your database against data loss
  - Jaunt: Creating database backups (cold and hot backups)
  - Recovering from various disaster scenarios
  - Bonus: Implementing a standby database (if time permits)
Examples shown are for Oracle 9i (and 8i) and AIX UNIX 5.2 (not using RMAN here; originally developed under Oracle 7.3.4 and AIX UNIX 4.1.5).
Oracle Backup and Recovery Classes (instructor-led training):
- Enterprise DBA Part 1B: Backup and Recovery (8i)
- Oracle9i Database Administration Fundamentals II
- 4-5 days
- $2,000-$2,500 registration
- $???? transportation, hotel, meals, etc.
Benefits of Attending
- After this session you will be able to:
  - Minimize the potential for loss of your data.
  - Recognize Oracle problems that you may encounter.
  - Know step-by-step how to recover from those problems.
  - Gain confidence in your disaster recovery ability.
  - Increase your worth as a DBA.
- And you’ve saved a bundle of money ($2,000+) as well as several days of your valuable time.
Topics of Discussion
- Jaunt: Bulletproofing against data loss
- Jaunt: Creating backups (cold and hot)
- Recovering from disaster scenarios
- Bonus: Implementing a standby database
Bulletproofing
- To lessen the possibility of data loss, you can:
  - Enable archivelog mode (to reapply the changes during recovery; required by most disaster recovery scenarios)
  - Separate the archive logs from the redo logs (allocate to separate drives; likewise for the following items)
  - Separate the redo logs and archive logs from the datafiles
  - Multiplex (mirror) the redo log groups and members
  - Multiplex (mirror) the control file
  - Multiplex (mirror) the archive log files
Oracle 9i Database Administrator's Guide (chapters 6 - 8, 12)
The Basics - Shutting Down Your Database
Position to the correct database (setting the Oracle SID):
For example, in UNIX:
$ . oraenv
PROD
Or, in NT:
c:\> set ORACLE_SID=PROD
c:\> set ORACLE_HOME=d:\oracle\v9204
Connect as sysdba (from a dba group user login, such as oracle):
Oracle 8.1.5 and above:
$ sqlplus "/ as sysdba"
(SQL> connect / as sysdba)
Prior to Oracle 9i, use Server Manager:
$ svrmgrl
SVRMGR> connect internal
Shut down the database (connected as sysdba):
SQL> shutdown immediate (kills sessions, rolls back pending changes)
If “shutdown immediate” fails or hangs (do this in another session):
SQL> shutdown abort
SQL> startup
SQL> shutdown immediate
The Basics - Starting Up Your Database in Various Stages
Database Startup Sequence (from shutdown state; connected as
sysdba in sqlplus):
1) nomount – reads init<SID>.ora (or spfile), allocates SGA memory,
starts background processes (such as pmon - process monitor)
SQL> startup nomount (from shutdown state)
2) mount – opens control files
SQL> startup mount (from shutdown state, or, …)
SQL> alter database mount; (from nomount state)
3) open – opens datafiles and online redo logs, performs crash
recovery
SQL> startup [open] (from shutdown state, or, …)
SQL> alter database open; (from nomount or mount state)
Note: The pfile=init<SID>.ora option on the startup command can be used
to specify the database’s init.ora file pathname.
Enabling Archiving (to recover database changes)
Steps: Update init.ora, create archive log directory,
shutdown, mount, set archivelog mode, open:
Edit the init.ora file to add the archive log parameters:
$ vi $ORACLE_HOME/dbs/initPROD.ora
log_archive_dest = /u01/oradata/PROD/archivelogs/
log_archive_format = arch_PROD_%S.arc
log_archive_start = true
$ mkdir /u01/oradata/PROD/archivelogs
From the mount state after shutdown (shutdown immediate, then
startup mount), set archivelog mode and open the database:
SQL> alter database archivelog;
SQL> alter database open;
SQL> archive log list (or: select * from v$database;)
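Before shutting down to enable archiving, it can help to confirm that the parameter file really contains the three archive parameters listed above. A minimal sketch, assuming a plain-text init.ora; the function name missing_archive_params is hypothetical and not part of the original scripts:

```shell
#!/bin/sh
# Hypothetical helper (illustration only): print each archive log
# parameter that is NOT set in the given init<SID>.ora file.
#   $1 = path to the parameter file
missing_archive_params() {
  for param in log_archive_dest log_archive_format log_archive_start; do
    # Match "param = value" at the start of a line; lines that begin
    # with "#" (commented out) do not count as set.
    if ! grep -Eq "^[[:space:]]*${param}[[:space:]]*=" "$1"; then
      echo "$param"
    fi
  done
}
```

For example, missing_archive_params $ORACLE_HOME/dbs/initPROD.ora prints nothing when all three parameters are present.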
Moving Datafiles (with shutdown)
Steps: Shutdown, move files, mount, rename files, open:
After shutdown, use an O/S command to move the datafile (done
from within the unix sqlplus session here; “!” is the host command
for unix; “$” is the host command for NT):
SQL> !mv /u03/oradata/PROD/devl_PROD_01.dbf
/u04/oradata/PROD
From the mount state, rename the datafile that you moved:
SQL> alter database rename file
'/u03/oradata/PROD/devl_PROD_01.dbf' to
'/u04/oradata/PROD/devl_PROD_01.dbf';
Then, open the database and look at the change made:
SQL> alter database open;
SQL> select * from v$datafile;
Moving Datafiles (without shutdown)
Steps: Offline tablespace, move files, rename files, online
tablespace:
$ sqlplus "/ as sysdba"
SQL> alter tablespace development offline;
SQL> !mv /u03/oradata/PROD/devl_PROD_01.dbf
/u04/oradata/PROD
SQL> alter database rename file
'/u03/oradata/PROD/devl_PROD_01.dbf' to
'/u04/oradata/PROD/devl_PROD_01.dbf';
SQL> alter tablespace development online;
SQL> select * from v$datafile;
Moving Datafiles (using control file)
Steps: Create textual control file, shutdown, move files,
update datafile pathnames in control file, run control file:
$ sqlplus "/ as sysdba"
SQL> alter database backup controlfile to trace;
SQL> show parameter user_dump
SQL> shutdown immediate
SQL> exit
$ mv /u03/oradata/PROD/*PROD*.dbf
/u04/oradata/PROD
$ cd /u00/oracle/admin/PROD/udump
$ ls -ltr *.trc
$ vi prod_ora_16060.trc (or similar name, like ora_16060.trc)
Delete the lines before the STARTUP NOMOUNT line
Rename the datafiles (:g/DATAFILE/,/;/s/u03/u04/)
# RECOVER (comment out the RECOVER line)
Execute the edited control file, which starts up the database:
$ sqlplus "/ as sysdba"
SQL> @prod_ora_16060.trc
SQL> select * from v$datafile;
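The three vi edits above (drop the header, rename the datafile paths, comment out RECOVER) can also be scripted. A minimal sketch, assuming a 9i-style trace in which STARTUP NOMOUNT, DATAFILE and RECOVER each begin a line; the function name edit_controlfile_trace is hypothetical:

```shell
#!/bin/sh
# Sketch only: print an edited copy of a textual control file trace.
#   $1 = trace file, $2 = old path fragment, $3 = new path fragment
# First sed:  keep everything from the STARTUP NOMOUNT line onward.
# Second sed: within the DATAFILE section (up to the closing ";"),
#             replace the old path fragment with the new one, then
#             comment out the RECOVER line.
edit_controlfile_trace() {
  sed -n '/^STARTUP NOMOUNT/,$p' "$1" |
    sed -e "/^DATAFILE/,/;/s|$2|$3|" -e 's/^RECOVER /# RECOVER /'
}
```

A possible use would be edit_controlfile_trace prod_ora_16060.trc /u03/ /u04/ >newctl.sql, followed by a review of newctl.sql before running it from sqlplus.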
Adding Redo Log Members (mirror across drives)
Steps: Add member file to given group:
SQL> alter database add logfile member
'/u04/oradata/PROD/log_PROD_1C.rdo' to group 1;
SQL> select * from v$logfile;
Increasing Maximum Number of Members (to add)
Steps: Create textual control file, update maxlogmembers,
shutdown, run control file:
$ sqlplus "/ as sysdba"
SQL> alter database backup controlfile to trace;
SQL> !ls -ltr /u00/oracle/admin/PROD/udump
SQL> !vi /u00/oracle/admin/PROD/udump/prod_ora_16060.trc
Delete the lines before the STARTUP NOMOUNT line
Edit maxlogmembers value (such as changing from 2 to 3)
# RECOVER (comment out the RECOVER line)
SQL> shutdown immediate
SQL> @/u00/oracle/admin/PROD/udump/prod_ora_16060.trc
Adding Redo Log Groups (to limit log switch waits)
Steps: Add group with member files of given size:
SQL> alter database add logfile group 4
('/u00/oradata/PROD/log_PROD_4A.rdo',
'/u01/oradata/PROD/log_PROD_4B.rdo')
size 500K;
SQL> select * from v$logfile;
Multiplexing Control Files (to mirror across drives)
Steps: Shutdown, make control file copies, update init.ora,
startup:
$ sqlplus "/ as sysdba"
SQL> shutdown immediate
SQL> !cp -p /u03/oradata/PROD/ctrl_PROD_01.ctl
/u01/oradata/PROD/ctrl_PROD_02.ctl
SQL> !vi $ORACLE_HOME/dbs/initPROD.ora
control_files = (/u03/oradata/PROD/ctrl_PROD_01.ctl,
/u01/oradata/PROD/ctrl_PROD_02.ctl)
SQL> startup
SQL> select * from v$controlfile;
Topics of Discussion
- Jaunt: Bulletproofing against data loss
- Jaunt: Creating backups (cold and hot)
- Recovering from disaster scenarios
- Bonus: Implementing a standby database
What To Back Up
- Files to back up during one backup cycle:
  - Datafiles (for all tablespaces)
  - Control Files (binary and textual versions)
  - Redo Log Files (cold backups only, not hot backups)
  - Archive Log Files (archived redo logs, if archivelog mode is enabled)
  - Parameter File (init.ora and/or spfile; init.ora is not in the database; like $ORACLE_HOME/dbs/initPROD.ora)
  - Password File (if used; it is not in the database; named like $ORACLE_HOME/dbs/orapwPROD)
Oracle9i User-Managed Backup and Recovery Guide (chapter 2)
What To Back Up (DB-based SQL)
Getting names of datafiles, temp files, control files, and
redo log files:
SQL> select name from v$datafile;
SQL> select name from v$tempfile;
SQL> select name from v$controlfile;
SQL> select member from v$logfile;
Getting names of datafile and temp file tablespaces:
SQL> select tablespace_name,file_name from
dba_data_files order by tablespace_name;
SQL> select tablespace_name,file_name from
dba_temp_files;
What To Back Up (DB-based SQL)
Getting locations of archive logs and archive parameters:
SQL> select name,value from v$parameter
where name in ('log_archive_dest',
'log_archive_format','log_archive_start');
SQL> show parameter archive
SQL> archive log list
For just the archive log files completed within the last five days:
SQL> select name from v$archived_log
where trunc(completion_time) >= trunc(sysdate)-5;
What To Back Up (SID-based - preferred)
Getting files using SID-based standard naming convention (preferred):
$ find / -name '*PROD*' ! -type d 2>/dev/null >backemup.dat
- Scripts for getting the list of backup files:
  - backup_list.shl (SID-based; best for script)
  - backup_list.sql (DB-based - but, don’t use this in an actual backup script for cold backups)
Cold Backups (with database down)
- Cold Backup Requirements:
  - Database is shut down (shutdown immediate)
  - Complete backup from same point in time
  - Database not available for queries or updates
  - Either archivelog mode or noarchivelog mode
- All associated files are backed up:
  - Datafiles
  - Control Files
  - Redo Log Files
  - Archive Log Files
  - Parameter File
  - Password File
Cold Backups - Getting the files to back up
Create the textual control file (then, shutdown):
$ sqlplus "/ as sysdba"
SQL> alter database backup controlfile to trace;
SQL> shutdown immediate
SQL> exit
$ ls -ltr /u00/oracle/admin/PROD/udump/* | tail -1 |
sed 's/^.* \//\//' >>backemup.dat
Get the SID-based file list with archivelogs at end of list:
$ find / -name '*PROD*' ! -type d 2>/dev/null |
grep -v 'arc$' | grep -v 'gz$' >>backemup.dat
$ ls /u01/oradata/PROD/archivelogs/*.arc.gz >>backemup.dat
$ ls /u01/oradata/PROD/archivelogs/*.arc >>backemup.dat
Cold Backups - File Backup Options
- Copy the files to a staging (backup) directory:
$ cat backemup.dat | sed 's/\(^.*\)\/\(.*\)$/cp -p \1\/\2 \/u03\/backups\/prod1\/\2/' | sh
- Or, compress (zip) the files to a staging directory:
$ cat backemup.dat | sed 's/\(^.*\)\/\(.*\)$/gzip -cv1 \1\/\2 >\/u03\/backups\/prod1\/\2.gz; touch -r \1\/\2 \/u03\/backups\/prod1\/\2.gz/' | sh
- Then, start the database and back up the staging directory:
SQL> startup
$ ls /u03/backups/prod1/* | cpio -ovC64 >/dev/rmt0
- Or just copy the files directly to tape (no staging directory):
$ cat backemup.dat | cpio -ovC64 >/dev/rmt0
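The sed-generated copy commands above can also be written as a read loop, which avoids building shell text from file names. A minimal sketch, not the presenter's script; the function name stage_files is hypothetical:

```shell
#!/bin/sh
# Sketch: copy every file named in a list into a staging directory,
# preserving timestamps and permissions (cp -p), one path per line.
#   $1 = list file (such as backemup.dat), $2 = staging directory
stage_files() {
  while IFS= read -r path; do
    # Skip blank lines and entries that are not existing regular files
    [ -n "$path" ] && [ -f "$path" ] || continue
    # Stop at the first failed copy so a partial backup is noticed
    cp -p "$path" "$2/" || return 1
  done < "$1"
}
```

Usage might look like stage_files backemup.dat /u03/backups/prod1. Like the sed version, this flattens all files into one directory, so two files with the same base name would overwrite each other.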
Hot Backups (with database up)
- Hot Backup Requirements:
  - Database is up (tablespace begin/end backup)
  - Backup of datafiles per tablespace over time
  - Database is available for queries and updates
  - Archivelog mode required
- Special backup processing required for files:
  - Datafiles (by tablespace)
  - BACKUP Control Files
  - NO Redo Logs
  - Archive Logs (switch)
  - Parameter File
  - Password File
Hot Backups of Tablespaces
Steps: For each tablespace: begin tablespace backup,
copy (or zip) datafiles to a staging (backup) directory,
end tablespace backup:
SQL> alter tablespace development begin backup;
SQL> !cp -p /u03/oradata/PROD/devl_PROD_*.dbf
/u03/backups/prod1
SQL> alter tablespace development end backup;
Note that no shutdown or startup is needed
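Since the same three commands repeat for every tablespace, one option is to generate them. A minimal sketch that only prints the SQL*Plus commands and does not run them; the function name hot_backup_sql and the output file name are hypothetical:

```shell
#!/bin/sh
# Sketch: print the SQL*Plus commands for a hot backup of one tablespace.
#   $1 = tablespace name, $2 = datafile pattern, $3 = staging directory
hot_backup_sql() {
  printf 'alter tablespace %s begin backup;\n' "$1"
  # "!" is the SQL*Plus host command on UNIX, as used above
  printf '!cp -p %s %s\n' "$2" "$3"
  printf 'alter tablespace %s end backup;\n' "$1"
}
```

For example, hot_backup_sql development '/u03/oradata/PROD/devl_PROD_*.dbf' /u03/backups/prod1 >hot_devl.sql writes the three commands shown on this slide to a file that could then be reviewed and run from sqlplus.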
Hot Backups of Log Files
Steps: Switch logfile, get list of archivelogs, copy log files:
SQL> alter system switch logfile;
$ ls /u01/oradata/PROD/archivelogs/*.arc.gz >logfile.lst
$ ls /u01/oradata/PROD/archivelogs/*.arc >>logfile.lst
$ sleep 5 (for script)
$ cat logfile.lst | sed "s/\(.*\/\)\([^\/].*\)/cp -p \1\2 \/u03\/backups\/prod1\/\2/" >logfile.shl
$ sh logfile.shl
Hot Backups of Control Files
Steps: Create binary and textual control files, find and copy
textual control (back up the control file last):
SQL> alter database backup controlfile to
'/u03/backups/prod1/controlfile.ctl';
SQL> alter database backup controlfile to trace;
$ ls -ltr /u00/oracle/admin/PROD/udump
$ cp -p /u00/oracle/admin/PROD/udump/prod_ora_16060.trc
/u03/backups/prod1
Backing Up the Staging (Backup) Directory
Steps: Change to the staging directory, compress or zip
the files (optional), copy resulting files to tape:
$ cd /u03/backups/prod1
$ gzip -vN1 *
$ find /u03/backups/prod1 | cpio -ovC64 >/dev/rmt0
Compressing while Exporting using UNIX Pipes:
$ mknod /tmp/exp_pipe p
$ gzip -cNf </tmp/exp_pipe >prod.dmp.gz &
$ exp " '/ as sysdba' " file=/tmp/exp_pipe full=y
compress=n log=prod.dmp.log
$ rm -f /tmp/exp_pipe
Additional Nightly Processing (optional)
- Create and keep the current copy of your textual control file
- Back up the current copy of your textual init.ora file (if you are using a server parameter file (spfile), use the "create pfile from spfile;" command to create it)
- Make a full database export for quick table restores and to check datafile integrity
- Generate SQL definitions for all of your tables and indexes (using the indexfile option of the import command)
- Gather statistics on datafile and index space usage and table extent growth for proactive maintenance (shown on next slide)
Other Nightly Processing
Gathering space usage statistics:
SQL> select segment_name,segment_type,extents,max_extents
from sys.dba_segments where extents + 5 > max_extents or
extents > 50 order by segment_name,segment_type desc;
SQL> col "Free" format 9,999,999,999
SQL> col "Total Size" format 9,999,999,999
SQL> select tablespace_name,sum(bytes) "Total Size" from
dba_data_files group by tablespace_name;
SQL> select tablespace_name,sum(bytes) "Free" from
dba_free_space group by tablespace_name;
For each index, find the space taken by deleted leaf rows (a large value may warrant a rebuild):
SQL> validate index <index name>;
SQL> select name,del_lf_rows_len from index_stats;
Topics of Discussion
- Jaunt: Bulletproofing against data loss
- Jaunt: Creating backups (cold and hot)
- Recovering from disaster scenarios
- Bonus: Implementing a standby database
Suddenly Your Database Dies
SQL> @mywebstats.sql
ORA-01115: IO error reading block from file 13 (block # 66771)
ORA-01110: data file 13: '/ndxs/oradata/PROD/medi_PROD_01.dbf'
ORA-27091: skgfqio: unable to queue I/O
ORA-27072: skgfdisp: I/O error
IBM AIX RISC System/6000 Error: 5: I/O error
Additional information: 66770
$ ls -ltr /ndxs/oradata/PROD
ls: /ndxs/oradata/PROD: There is an input or
output error.
total 0
Disaster Recovery – What Happened?
- Things to check to determine the problem:
  - System error messages during normal processing
  - Messages from database startup (backup first)
  - Alert log in the database's bdump directory (/u00/oracle/admin/PROD/bdump/alert_PROD.log)
  - Recent bdump and udump trace (*.trc) files (background_dump_dest, user_dump_dest)
  - Oracle processes ($ ps -ef | grep ora, including: ora_smon_PROD, *pmon*, *dbwr*, *lgwr*)
  - Recent /home/jobsub/*.lis and *.log files
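When checking the alert log, a quick first pass is to pull out only the most recent ORA- messages. A minimal sketch using grep and tail; the function name recent_ora_errors is hypothetical:

```shell
#!/bin/sh
# Sketch: show the last N lines of a log that contain an ORA-nnnnn
# error code, each prefixed with its line number in the file.
#   $1 = alert log path, $2 = number of matching lines to show
recent_ora_errors() {
  grep -n 'ORA-[0-9][0-9]*' "$1" | tail -n "$2"
}
```

For example, recent_ora_errors /u00/oracle/admin/PROD/bdump/alert_PROD.log 20 lists the last twenty error lines; the line numbers make it easy to jump to the surrounding context in the full log.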
What To Restore (from backup)
- ONLY the affected datafile for full recovery
- ALL datafiles for incomplete recovery
- All archivelog files generated since the datafile backup was made
- DON'T restore control files (unless all of them are lost)
- DON'T restore online redo log files (unless they are not used during recovery - then must resetlogs:)
SQL> connect / as sysdba
SQL> startup mount
SQL> alter database open resetlogs;
Disaster Recovery Overview
- Backup first, then try startup (shutdown after)
- Primary recovery options:
  - Recover database (complete or incomplete recovery)
  - Recover datafile (one datafile; complete recovery only)
  - Recover tablespace (all of its datafiles; complete only)
- Generic recovery steps:
1. Shutdown database or offline tablespace or datafile
2. Restore datafile(s) (and archivelogs) from backup
3. Issue recover command (database, datafile, or
tablespace)
4. Bring tablespace or datafiles online
Oracle9i User-Managed Backup and Recovery Guide
Basic Recover Database – Complete Recovery
State: mount (from shutdown state), datafiles online
Steps: Restore datafile(s), mount, online the datafile(s)
if needed, recover database, open:
$ cp -p /u03/backups/prod1/devl_PROD_01.dbf
/u03/oradata/PROD (restore datafile(s))
$ sqlplus "/ as sysdba"
SQL> startup mount
SQL> select * from v$datafile;
if any offline, then:
SQL> alter database datafile
'<full offline datafile pathname>' online;
SQL> set autorecovery on
(or recover automatic below)
SQL> recover database;
SQL> alter database open;
Basic Recover Database – Incomplete Recovery
State: mount, datafiles online
Steps: Restore ALL datafiles, mount, online the datafiles,
recover database until ..., open. Like complete recovery,
except for "recover database" command (use only one):
SQL> recover automatic database until time
'2005-02-14:15:45:00';
SQL> recover database until cancel;
(no autorecovery; steps through logs until CANCEL
entered)
SQL> recover automatic database until change 43047423;
(SCN (system change number) for archivelogs is in
v$archived_log as first_change# and next_change# - 1;
online redo logs is in v$log as first_change# (subtract 1))
SQL> alter database open resetlogs;
(then, backup)
Basic Recover Datafile – From Mount State
State: mount, datafile online or offline
Steps: Restore datafile, mount, recover datafile, online the
datafile if needed, open:
$ cp -p /u03/backups/prod1/devl_PROD_01.dbf
/u03/oradata/PROD (restore datafile)
$ sqlplus "/ as sysdba"
SQL> startup mount
SQL> recover automatic datafile
'/u03/oradata/PROD/devl_PROD_01.dbf';
SQL> select * from v$datafile;
if datafile is offline, then:
SQL> alter database datafile
'/u03/oradata/PROD/devl_PROD_01.dbf' online;
SQL> alter database open;
Basic Recover Datafile – From Open State
State: open (with database up), datafile offline
Steps: Offline the datafile, restore datafile, recover datafile,
online the datafile (not for system datafile):
$ sqlplus "/ as sysdba"
SQL> alter database datafile
'/u03/oradata/PROD/devl_PROD_01.dbf' offline;
SQL> !cp -p /u03/backups/prod1/devl_PROD_01.dbf
/u03/oradata/PROD (restore datafile)
SQL> recover automatic datafile
'/u03/oradata/PROD/devl_PROD_01.dbf';
SQL> alter database datafile
'/u03/oradata/PROD/devl_PROD_01.dbf' online;
Basic Recover Datafile – From Open After Startup
State: open, datafile offline
Steps: Mount, offline the datafile, open, restore datafile,
recover datafile, online the datafile (not for system datafile):
$ sqlplus "/ as sysdba"
SQL> startup mount
SQL> alter database datafile
'/u03/oradata/PROD/devl_PROD_01.dbf' offline;
SQL> alter database open;
SQL> !cp -p /u03/backups/prod1/devl_PROD_01.dbf
/u03/oradata/PROD (restore datafile)
SQL> recover automatic datafile
'/u03/oradata/PROD/devl_PROD_01.dbf';
SQL> alter database datafile
'/u03/oradata/PROD/devl_PROD_01.dbf' online;
Basic Recover Tablespace – From Open State
State: open, tablespace offline
Steps: Offline the tablespace, restore tablespace’s datafiles,
recover tablespace, online the tablespace (not for system
tablespace):
$ sqlplus "/ as sysdba"
SQL> alter tablespace development offline immediate;
(immediate skips the checkpoint on the tablespace's datafiles, so media
recovery is required before the tablespace can be brought back online)
SQL> !cp -p /u03/backups/prod1/devl_PROD*
/u03/oradata/PROD (restore datafiles)
SQL> recover automatic tablespace development;
SQL> alter tablespace development online;
Basic Recover Tablespace – From Open After Startup
State: open, tablespace offline
Steps: Mount, offline any bad datafiles, open, offline the
tablespace, restore tablespace’s datafiles, recover tablespace,
online the tablespace (not for system tablespace):
$ sqlplus "/ as sysdba"
SQL> startup mount
SQL> alter database datafile
'/u03/oradata/PROD/devl_PROD_01.dbf' offline;
SQL> alter database open;
SQL> alter tablespace development offline;
SQL> !cp -p /u03/backups/prod1/devl_PROD*
/u03/oradata/PROD (restore datafiles)
SQL> recover automatic tablespace development;
SQL> alter tablespace development online;
Checking Logs and Trace Files
Checking alert log and trace files for pmon, lgwr, dbwr, arch:
SQL> select value from v$parameter where name =
'background_dump_dest';
$ grep background_dump_dest $ORACLE_HOME/dbs/initPROD.ora
$ cd /u00/oracle/admin/PROD/bdump
$ tail -200 alert_PROD.log
$ ls -ltr *.trc
(look at latest pmon, lgwr, dbwr, arch trace files, if any)
$ cat prod_pmon_13612.trc
(process monitor trace)
$ cat prod_lgwr_32306.trc
(redo log writer trace)
$ cat prod_dbwr_43213.trc
(database writer trace)
$ cat prod_arch_22882.trc
(archiver trace)
Archivelogs Disk Volume Filled Up
- Symptoms:
  Database freezes. No space for archivelogs (df -k).
- Messages:
  Users: None (sessions freeze).
  Logins: ERROR: ORA-00257: archiver error. Connect internal only, until freed.
- Logs:
  Alert log: ORA-00272: error writing archive log.
  Arch trace: ORA-00272: error writing archive log.
Note: Use oerr for error description, such as: $ oerr ora 257
Archivelogs Disk Volume Filled Up
Steps: Free up space on archivelogs volume by moving files or
older archivelogs off, or, by deleting old archive log files that have
already been backed up:
# File: remove_old_logs.shl
echo "You must be logged in as user Oracle to run this script,"
echo "which removes all archivelog files older than X days."
echo "Enter number of days to keep: \c"
read DAYS_KP; export DAYS_KP
find /u01/oradata/PROD/archivelogs -name '*.arc.gz' -mtime +$DAYS_KP -exec rm {} \;
find /u01/oradata/PROD/archivelogs -name '*.arc' -mtime +$DAYS_KP -exec rm {} \;
echo "Results after deletions:"
du -k
df -k
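Because deleting archive logs cannot be undone, a cautious variant lists the candidates first and removes them only in a second step. A minimal sketch along the lines of remove_old_logs.shl; the function names are hypothetical, and it still assumes the files have already been backed up:

```shell
#!/bin/sh
# Sketch: list archive logs (plain or gzipped) older than N days.
#   $1 = archive log directory, $2 = number of days to keep
list_old_archivelogs() {
  find "$1" -type f \( -name '*.arc' -o -name '*.arc.gz' \) -mtime +"$2" -print
}

# Remove exactly the files that list_old_archivelogs reports.
remove_old_archivelogs() {
  list_old_archivelogs "$1" "$2" | while IFS= read -r f; do
    rm -f -- "$f"
  done
}
```

Running list_old_archivelogs /u01/oradata/PROD/archivelogs 7 first gives a chance to review the list before calling remove_old_archivelogs with the same arguments.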
Loss of Control Files
- Symptoms:
  May be none until shutdown and/or startup.
- Messages:
  Shutdown: If deleted: ORA-00210: cannot open control file '/u03/oradata/PROD/ctrl_PROD_01.ctl'. If overwritten: ORA-00201: control file version incompatible with ORACLE version. (May have to shutdown abort.)
  Startup: ORA-00205: error in identifying control file '/u03/oradata/PROD/ctrl_PROD_01.ctl'. Also, if overwritten: ORA-07366: sfifi: invalid file, file does not have valid header block.
Loss of Control Files – Textual Control File Recovery
Steps: Edit out header from latest textual control file you have
available (in udump directory), run control file (as long as no
datafiles have been added):
$ sqlplus "/ as sysdba"
SQL> shutdown abort
SQL> !ls -ltr /u00/oracle/admin/PROD/udump/*.trc
SQL> !vi /u00/oracle/admin/PROD/udump/prod_ora_31494.trc
Delete the lines before the STARTUP NOMOUNT line
SQL> @/u00/oracle/admin/PROD/udump/prod_ora_31494.trc
Loss of Control Files – Backup Control File Recovery
Steps: Restore ALL datafiles AND control files (NOT online
redo log files), recover using backup controlfile:
$ sqlplus "/ as sysdba"
SQL> shutdown abort
At this point, restore ALL datafiles AND control files from the
last backup, but, NOT the online redo log files.
SQL> connect / as sysdba
SQL> startup mount
SQL> recover automatic database using backup controlfile;
(AUTO on "ORA-00308: cannot open archived log ...")
SQL> alter database open resetlogs; (then, backup)
Loss of TEMP Datafile
- Symptoms:
  Error message on large sorts (select distinct, order by, group by, union).
- Messages:
  Loss during: ORA-01157: cannot identify data file 3 - file not found.
  Loss before: ORA-01116: error in opening database file 3.
  Both followed by: ORA-01110: data file 3: '/u03/oradata/PROD/temp_PROD_01.dbf'.
- Logs:
  No alert.log entries or trace files generated.
Loss of TEMP Datafile
State: open or mount
Steps: Offline datafile, drop and recreate TEMP tablespace
(here we are using locally managed temporary tablespace):
SQL> alter database datafile
'/u03/oradata/PROD/temp_PROD_01.dbf' offline;
If from mount state: SQL> alter database open;
SQL> select file_name,bytes/1024 kbytes from dba_temp_files;
SQL> select initial_extent/1024 kbytes from dba_tablespaces
where tablespace_name = 'TEMP';
SQL> drop tablespace temp;
SQL> !rm /u03/oradata/PROD/temp_PROD_01.dbf
SQL> create temporary tablespace temp
tempfile '/u03/oradata/PROD/temp_PROD_01.dbf'
size 40064K extent management local uniform size 640K;
TEMP Datafile Offline
- Symptoms:
  Error message on large sorts (select distinct, order by, group by, union). (Like Loss of TEMP Datafile.)
- Messages:
  ORA-00376: file 3 cannot be read at this time.
  ORA-01110: data file 3: '/u03/oradata/PROD/temp_PROD_01.dbf'.
Steps: Recover datafile (DB open), online the datafile:
SQL> recover automatic datafile
'/u03/oradata/PROD/temp_PROD_01.dbf';
SQL> alter database datafile
'/u03/oradata/PROD/temp_PROD_01.dbf' online;
Loss of INACTIVE Online Redo Log Group (Archived)
- Symptoms:
  Database crashes switching to lost log group.
- Messages:
  Users: ORA-01092: ORACLE instance terminated. Disconnection forced.
  Logins: ERROR: ORA-03114: not connected to ORACLE. ERROR: ORA-00472: PMON process terminated with error.
- Logs:
  Alert log: No indication.
  Pmon trace: ORA-00470: LGWR process terminated with error.
  Lgwr trace: ORA-00313: open failed for members of log group 3 of thread 1. ORA-00312: online log 3 thread 1: '/u03/oradata/PROD/log_PROD_3B.rdo'.
Loss of INACTIVE Online Redo Log Group (Archived)
Steps: Mount, drop logfile group, add logfile group, open:
$ sqlplus "/ as sysdba"
SQL> startup mount
If "ORA-01081: cannot start already-running ORACLE –
shut it down first", then "startup force" (or "shutdown
abort" and "startup mount").
Shows "Database mounted." and "ORA-00313: open failed
for members of log group 3 of thread 1".
SQL> select bytes/1024 K from v$log where group# = 3;
SQL> select member from v$logfile where group# = 3;
SQL> alter database drop logfile group 3;
SQL> alter database add logfile group 3
('/u03/oradata/PROD/log_PROD_3A.rdo',
'/u03/oradata/PROD/log_PROD_3B.rdo') size 500K;
SQL> alter database open;
Loss of INACTIVE Online Redo Log Group Member
- Symptoms:
  No apparent symptoms.
- Logs:
  Alert log: ORA-00313: open failed for members of log group 3 of thread 1. ORA-00312: online log 3 thread 1: '/u03/oradata/PROD/log_PROD_3A.rdo'. (Each switch.)
Steps: Drop logfile member, add logfile member to group:
SQL> select * from v$log where group# = 3;
If status is active or current for group, first do:
SQL> alter system switch logfile;
SQL> alter database drop logfile member
'/u03/oradata/PROD/log_PROD_3A.rdo';
SQL> alter database add logfile member
'/u03/oradata/PROD/log_PROD_3A.rdo' to group 3;
Loss of CURRENT Online Redo Log Group (Unarchived)
- Symptoms:
  Database freezes (after cycling back). Plenty of space for archivelogs (df -k).
- Messages:
  Users: None (sessions freeze).
  Logins: ERROR: ORA-00257: archiver error. Connect internal only, until freed.
- Logs:
  Alert log: ORA-00286: No members available, or no member contains valid data. ORACLE Instance PROD - Can not allocate log, archival required. Thread 1 cannot allocate new log, sequence 21. All online logs needed archiving.
  Arch trace: ORA-00286: No members available, or no member contains valid data.
Loss of CURRENT Online Redo Log Group (Unarchived)
Steps: Startup to get log group number, then do an incomplete
recovery (as described earlier) up to the start time for the lost
log group (Note: all changes in all redo log groups will be lost):
$ sqlplus "/ as sysdba"
SQL> shutdown abort
SQL> startup
Shows: ORA-00313: open failed for members of log
group 2 of thread 1.
SQL> shutdown abort
At this point, restore ALL datafiles AND the lost online redo
log group's files from the latest backup, but, NOT the control
files.
SQL> connect / as sysdba
SQL> startup mount
SQL> select sequence#, bytes, first_change#,
to_char(first_time, 'DD-MON-YY HH24:MI:SS'), status
from v$log where group# = 2;
SQL> recover automatic database until time
'2005-02-14:12:59:59'; (1 second before loss)
SQL> alter database open resetlogs; (and backup)
Failure During Hot Backup
Steps: Mount, determine datafiles in hot backup state,
end backup (Oracle 7.2 and above), open:
$ sqlplus "/ as sysdba"
SQL> startup mount
SQL> select df.name,bk.time from v$datafile df,v$backup bk
where df.file# = bk.file# and bk.status ='ACTIVE';
SQL> alter database datafile
'/u03/oradata/PROD/devl_PROD_01.dbf' end backup;
SQL> alter database open;
Topics of Discussion
- Jaunt: Bulletproofing against data loss
- Jaunt: Creating backups (cold and hot)
- Recovering from disaster scenarios
- Bonus: Implementing a standby database
Bonus Topic: Data Guard - Oracle's Answer to Disaster Recovery
- See how to quickly implement a Data Guard physical standby database step-by-step in a day.
- Learn how to switch over or fail over to your standby database in minutes.
- Possibly offload your batch reporting workload to your standby database.
- Replace your forebodings about crashes with "Don't worry ... be happy!"
Oracle Data Guard Concepts and Administration Release 2 (9.2)
[Figure: Data Guard Flow (Oracle 9i)]
[Figure: Data Guard Flow (Oracle 10g)]
Data Guard Protection Modes
 Maximum Performance
 Updates committed to primary and sent to standby without waiting
to see if they were applied to standby
 Pros: Little or no effect on performance of primary
 Cons: Slight chance of lost transactions on failover
 Maximum Availability (we will implement this one)
 Attempts to apply updates to standby before committed to primary
 Lowers protection to Maximum Performance temporarily if
updates can't be applied to standby
 Pros: Primary continues unaffected if connection to standby is lost
or the updates are delayed
 Cons: Slight performance hit on primary; lost transactions on
failover possible only if the standby has been unreachable
Data Guard Protection Modes
 Maximum Protection
 Assures updates are written to standby before they are committed
to primary
 Pros: No chance of lost transactions
 Cons: Primary will freeze if connection to standby is lost
or the updates are delayed
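The trade-offs of the three protection modes can be summarized as a toy decision function. This is only an illustrative Python model of the behavior described on the slides, not Oracle code; the function name, mode strings, and outcome strings are assumptions:

```python
MODES = ("maximum performance", "maximum availability", "maximum protection")


def commit_outcome(mode: str, standby_reachable: bool) -> str:
    """Describe what a commit on the primary does in the given mode."""
    if mode not in MODES:
        raise ValueError(f"unknown protection mode: {mode}")
    if mode == "maximum performance":
        # The primary never waits for the standby.
        return "commit proceeds without waiting for the standby"
    if standby_reachable:
        return "commit waits for the standby"
    if mode == "maximum availability":
        return "commit proceeds; protection drops to maximum performance for now"
    # Maximum protection with no reachable standby.
    return "primary freezes until the standby is reachable"


for mode in MODES:
    print(mode, "->", commit_outcome(mode, standby_reachable=False))
```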
Primary Database Requirements for Data Guard
 FORCE LOGGING must be enabled:
SQL> select force_logging from v$database;
SQL> alter database force logging;
 ARCHIVELOG mode and automatic archiving must be
enabled:
SQL> archive log list
 MAXLOGFILES >= (2 * Current Redo Log Groups) + 1:
SQL> select records_used "Current Groups",
records_total "Max Groups"
from v$controlfile_record_section
where type = 'REDO LOG';
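The MAXLOGFILES rule above is simple arithmetic on the two values returned by the query. A short Python sketch of the check; the function names are hypothetical:

```python
def required_maxlogfiles(current_groups: int) -> int:
    """Minimum MAXLOGFILES: twice the current redo log groups, plus one."""
    return 2 * current_groups + 1


def maxlogfiles_ok(current_groups: int, max_groups: int) -> bool:
    """True if the control file leaves room for the standby redo log groups."""
    return max_groups >= required_maxlogfiles(current_groups)


# Assumed example: 3 current groups and MAXLOGFILES of 16.
print(required_maxlogfiles(3), maxlogfiles_ok(3, 16))
```

With 3 current groups the minimum is 7, which covers the 3 online groups plus the 4 standby redo log groups added later in the session.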
listener.ora Additions
 Define the standby database SID on the standby site:
(SID_DESC=
  (SID_NAME=PROD2)
  (ORACLE_HOME=/pgms/oracle/product/v9204)
)
(in $ORACLE_HOME/network/admin/listener.ora)
tnsnames.ora Additions
 Define the standby database connect string on the primary site:
# Host below is whatever host IP has PROD2
myserver_prod2 =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS =
        (PROTOCOL = TCP)
        (Host = 123.45.67.89)
        (Port = 1521)
      )
    )
    (CONNECT_DATA = (SID = PROD2))
  )
(define myserver_prod and myserver_prod2 on both
primary and standby sites for quick switchovers)
sqlnet.ora and /etc/oratab Additions
 Enable dead connection detection on the primary and
standby sites:
sqlnet.expire_time=2
(in $ORACLE_HOME/network/admin/sqlnet.ora)
 Add the standby database's entry to /etc/oratab on the
standby site:
PROD2:/pgms/oracle/product/v9204:N
Standby Database Parameter File
 Create the initPROD2.ora parameter file to be used for the
standby database (done from primary database):
 If your primary is using an spfile:
$ sqlplus "/ as sysdba"
SQL> create pfile='$ORACLE_HOME/dbs/initPROD2.ora'
from spfile;
 Else, if your primary is using a pfile:
$ cp -p $ORACLE_HOME/dbs/initPROD.ora \
$ORACLE_HOME/dbs/initPROD2.ora
 Note: We will be modifying both the primary and standby
parameter files to handle being in either the primary or the
standby mode for quick switchovers.
Standby Database Parameters (changes in
copy of primary's values)
Change pathnames, such as control_files, background_dump_dest,
core_dump_dest, user_dump_dest, and audit_file_dest, and add:
# log_archive_dest = /orcl/oradata/PROD2/archivelogs
log_archive_dest_1 = 'LOCATION=/orcl/oradata/PROD2/archivelogs MANDATORY'
# for switchover
log_archive_dest_state_1 = ENABLE # for switchover
log_archive_dest_2 = 'SERVICE=myserver_prod LGWR SYNC' # for switchover
log_archive_dest_state_2 = ENABLE # for switchover
standby_archive_dest = /orcl/oradata/PROD2/archivelogs
standby_file_management = AUTO # or MANUAL for raw devices
remote_archive_enable = TRUE # TRUE or RECEIVE, change RECEIVE to SEND on switchover
instance_name = PROD2
lock_name_space = PROD2 # use when primary and standby on same system; same as instance_name
fal_server = myserver_prod # "fal" is Fetch Archive Log, for log gap resolution
fal_client = myserver_prod2
db_file_name_convert = ('/PROD/','/PROD2/')
log_file_name_convert = ('/PROD/','/PROD2/')
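The two name-convert parameters map primary file names to standby file names by substituting one path fragment for another. A simplified Python model of that substitution, assuming plain substring replacement using the first pattern that matches; the function name and the sample datafile name are hypothetical:

```python
def convert_name(path, pairs):
    """Apply the first (pattern, replacement) pair whose pattern occurs in path."""
    for pattern, replacement in pairs:
        if pattern in path:
            # Replace only the first occurrence of the pattern.
            return path.replace(pattern, replacement, 1)
    return path  # no pattern matched: the name is left unchanged


pairs = [("/PROD/", "/PROD2/")]
print(convert_name("/orcl/oradata/PROD/system_PROD_01.dbf", pairs))
```

Because the pattern includes the surrounding slashes, only the directory changes; the "_PROD_" inside the file name is untouched, which matches the standby file names used later in the session.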
Primary Database Parameters (changes in
primary's values)
#log_archive_dest = /orcl/oradata/PROD/archivelogs
log_archive_dest_1 = 'LOCATION=/orcl/oradata/PROD/archivelogs MANDATORY'
log_archive_dest_state_1 = ENABLE
log_archive_dest_2 = 'SERVICE=myserver_prod2 LGWR SYNC'
log_archive_dest_state_2 = ENABLE
standby_archive_dest = /orcl/oradata/PROD/archivelogs # for switchover
standby_file_management = AUTO # for switchover; or MANUAL for raw devices
remote_archive_enable = TRUE # TRUE or SEND, change SEND to RECEIVE on switchover
instance_name = PROD
lock_name_space = PROD # use when primary and standby on same system; same as instance_name
fal_server = myserver_prod2 # for switchover
fal_client = myserver_prod # for switchover
db_file_name_convert = ('/PROD2/','/PROD/') # for switchover
log_file_name_convert = ('/PROD2/','/PROD/') # for switchover
(If primary uses spfile, wait until after the standby database
files are copied/created to make these parameter changes.)
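The primary and standby parameter lists mirror each other: each site archives locally, ships redo to its peer, and converts the peer's path names to its own. A Python sketch that builds both sets from one hypothetical helper, using the SIDs, service names, and base directory from the slides; lock_name_space is left out because it only applies when both databases share a system:

```python
def role_params(own_sid, peer_sid, own_service, peer_service,
                base_dir="/orcl/oradata"):
    """Build the switchover-ready Data Guard parameters for one site."""
    archive_dir = f"{base_dir}/{own_sid}/archivelogs"
    return {
        "log_archive_dest_1": f"'LOCATION={archive_dir} MANDATORY'",
        "log_archive_dest_state_1": "ENABLE",
        "log_archive_dest_2": f"'SERVICE={peer_service} LGWR SYNC'",
        "log_archive_dest_state_2": "ENABLE",
        "standby_archive_dest": archive_dir,
        "standby_file_management": "AUTO",
        "remote_archive_enable": "TRUE",
        "instance_name": own_sid,
        "fal_server": peer_service,
        "fal_client": own_service,
        "db_file_name_convert": f"('/{peer_sid}/','/{own_sid}/')",
        "log_file_name_convert": f"('/{peer_sid}/','/{own_sid}/')",
    }


primary = role_params("PROD", "PROD2", "myserver_prod", "myserver_prod2")
standby = role_params("PROD2", "PROD", "myserver_prod2", "myserver_prod")
for name, value in standby.items():
    print(f"{name} = {value}")
```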
Standby Database Datafiles, etc.
 Create the standby control file from the primary database:
SQL> alter database create standby controlfile as
'/orcl/oradata/PROD2/ctrl_PROD_01.ctl';
 Shut down the primary database and copy or FTP its
datafiles, redo log files, and the just-created standby
parameter file and standby control file, to the standby site.
Standby Database Datafiles, etc.
 Copy the standby control file on the standby site to the
other file names listed in the control_files init.ora
parameter.
 Create the standby's password file, if needed, on the
standby site:
$ orapwd file=$ORACLE_HOME/dbs/orapwPROD2 \
password=<sys password> entries=5
 Reload the listener on the primary and standby sites:
$ lsnrctl reload
Standby Database Startup
 Create the spfile if wanted, start the standby database in
nomount mode, mount the standby database, and change
to managed recovery:
$ . oraenv
PROD2
$ sqlplus "/ as sysdba"
SQL> create spfile from pfile;
SQL> startup nomount
SQL> alter database mount standby database;
SQL> alter database recover managed standby database
disconnect from session;
SQL> exit
Primary Database Startup
 If your primary is using an spfile, set the primary database
parameters in the spfile as listed earlier:
SQL> startup nomount
SQL> alter system reset log_archive_dest
scope=spfile sid='*';
SQL> alter system set log_archive_dest_1 =
'LOCATION=/orcl/oradata/PROD/archivelogs MANDATORY' scope=spfile;
… etc …
SQL> shutdown
Primary Database Startup
 Start up the primary database with the new parameters:
SQL> startup
 Start archiving to the standby database by issuing a log
switch:
SQL> alter system switch logfile;
Congratulations! You now have a working
standby database for your primary database.
But wait … There's more …
Add Standby Redo Log Groups to Standby
Database
 Create standby redo log groups on standby database (one more than
current redo log groups):
$ sqlplus "/ as sysdba"
SQL> alter database recover managed standby database cancel;
SQL> alter database open read only;
SQL> select max(group#) maxgroup from v$logfile;
SQL> select max(bytes) / 1024 "size (K)" from v$log;
SQL> alter database add standby logfile group 4
( '/orcl/oradata/PROD2/stby_log_PROD_4A.rdo',
'/orcl/oradata/PROD2/stby_log_PROD_4B.rdo')
size 4096K;
… etc …
SQL> column member format a55
SQL> select vs.group#,vs.bytes,vl.member from v$standby_log vs,
v$logfile vl where vs.group# = vl.group# order by
vs.group#,vl.member;
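The "… etc …" above stands for one add standby logfile statement per new group. A Python sketch that generates the whole set under the rule "one more than current redo log groups", assuming group numbers continue after the current maximum and each group has A and B members named as on the slide; the function name and its defaults are hypothetical:

```python
def standby_logfile_statements(current_groups, max_group, size_kb,
                               directory, sid="PROD"):
    """Build ADD STANDBY LOGFILE statements for current_groups + 1 new groups."""
    statements = []
    for offset in range(1, current_groups + 2):  # current_groups + 1 groups
        group = max_group + offset
        members = ", ".join(
            f"'{directory}/stby_log_{sid}_{group}{suffix}.rdo'"
            for suffix in ("A", "B")
        )
        statements.append(
            f"alter database add standby logfile group {group} "
            f"({members}) size {size_kb}K;"
        )
    return statements


# Assumed example: 3 online groups numbered 1-3, each 4096K.
for stmt in standby_logfile_statements(3, 3, 4096, "/orcl/oradata/PROD2"):
    print(stmt)
```

The same call with the directory set to /orcl/oradata/PROD would produce the statements for the primary database.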
Add Tempfile To Standby
 Add a tempfile to the standby database for switchover or
read-only access, then switch back to managed recovery:
SQL> alter tablespace temp add tempfile
'/data/oradata/PROD2/temp_PROD_01.dbf'
size 400064K reuse;
SQL> alter database recover managed standby database
disconnect from session;
SQL> select * from v$tempfile;
SQL> exit
Add Standby Redo Log Groups to Primary Database
 Create standby logfile groups on the primary database for
switchovers (one more than current redo log groups):
$ sqlplus "/ as sysdba"
SQL> select max(group#) maxgroup from v$logfile;
SQL> select max(bytes) / 1024 "size (K)" from v$log;
SQL> alter database add standby logfile group 4
( '/orcl/oradata/PROD/stby_log_PROD_4A.rdo',
'/orcl/oradata/PROD/stby_log_PROD_4B.rdo')
size 4096K;
… etc …
SQL> column member format a55
SQL> select vs.group#,vs.bytes,vl.member
from v$standby_log vs,v$logfile vl
where vs.group# = vl.group#
order by vs.group#,vl.member;
Switch To Maximum Availability Protection Mode
 Switch to the desired "maximum availability" protection
mode on the primary database (from the default "maximum
performance"):
SQL> select value from v$parameter where name =
'log_archive_dest_2'; -- must show LGWR SYNC
SQL> shutdown normal
SQL> startup mount
SQL> alter database set standby database to
maximize availability;
SQL> alter database open;
SQL> select protection_mode from v$database;
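The first query above is a precondition check: the value of log_archive_dest_2 must include both LGWR and SYNC before the protection mode is raised. A minimal Python sketch of that check on the returned string, assuming the attributes appear as bare space-separated words as in the parameter files shown earlier; the function name is hypothetical:

```python
def uses_lgwr_sync(dest_value: str) -> bool:
    """True if the destination string contains both LGWR and SYNC attributes."""
    tokens = dest_value.upper().split()
    # Exact token match, so ASYNC is not mistaken for SYNC.
    return "LGWR" in tokens and "SYNC" in tokens


print(uses_lgwr_sync("SERVICE=myserver_prod2 LGWR SYNC"))
print(uses_lgwr_sync("SERVICE=myserver_prod2 LGWR ASYNC"))
```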
Test Updates Propagating to Standby
 Try some edits on the primary and check to see that the changes made
it to the standby:
 On the primary:
SQL> update spriden set spriden_first_name = 'James'
where spriden_pidm = 1234 and spriden_change_ind
is null;
SQL> commit;
SQL> alter system switch logfile;
 On the standby (wait a few seconds first):
SQL> alter database recover managed standby database cancel;
SQL> alter database open read only;
SQL> select * from spriden where spriden_pidm = 1234 and
spriden_change_ind is null;
SQL> alter database recover managed standby database
disconnect from session;
Running Reports with a Standby
 Set standby to Read Only to run reports:
SQL> alter database recover managed standby database
cancel;
SQL> alter database open read only;
SQL> @myreport.sql
SQL> alter database recover managed standby database
disconnect from session;
Shutdown and Startup for Standby Database
 To shut down a standby database:
 If in read-only access, switch back to managed recovery (after
terminating any other active sessions):
SQL> alter database recover managed standby database
disconnect from session;
 Cancel managed recovery and shutdown:
SQL> alter database recover managed standby database cancel;
SQL> shutdown immediate
 To start up a standby database:
SQL> startup nomount
SQL> alter database mount standby database;
SQL> alter database recover managed standby database
disconnect from session;
Switchover - Swapping Primary and Standby
 End all activities on the primary and standby databases.
 On the primary (switchover status "TO STANDBY"):
SQL> select database_role,switchover_status from v$database;
SQL> alter database commit to switchover to physical standby;
SQL> shutdown immediate
SQL> startup nomount
SQL> alter database mount standby database;
 On the standby (switchover status "SWITCHOVER PENDING"):
SQL> select database_role,switchover_status from v$database;
SQL> alter database commit to switchover to primary;
SQL> shutdown normal
SQL> startup
 On the original primary (now the standby):
SQL> alter database recover managed standby database
disconnect from session;
 On the original standby (now the primary):
SQL> alter system archive log current;
 Change tnsnames.ora entry on all servers to swap the connect strings
(myserver_prod and myserver_prod2).
Failover - Standby Becomes Primary
 End all activities on the standby database.
 May need to resolve redo log gaps (not shown here).
 On the standby:
SQL> alter database recover managed standby database finish;
SQL> alter database commit to switchover to primary;
SQL> shutdown immediate
SQL> startup
 Change tnsnames.ora entry on all servers to point the primary
connect string to the standby database.
 New standby needs to be created. Old primary is no longer
functional.
Summary
 Jaunt: Bulletproofing against data loss
 Enabling archiving of redo log files
 Separating types of database files
 Moving datafiles
 Adding redo log members
 Mirroring log files and control files
 Jaunt: Creating backups (cold and hot)
 Identifying files to back up
 Performing cold backups (DB down)
 Performing hot backups (DB up)
 Statistics, exports, and other tasks
Summary
 Recovering from disaster scenarios
 Determining cause of failure
 Identifying files to restore
 Basic disaster recovery scenarios
 Complete and incomplete
 Database, datafile, tablespace
 Specific file type recovery scenarios
 Step-by-step recovery commands
Summary
 Bonus Topic: Implementing a standby database
 Data Guard provides an automated standby
database which can essentially eliminate
downtime of your production data.
 Setup is easy and fairly straightforward.
 Maintenance is minimal.
 Switchovers and failovers can be done within a
few minutes.
 Reporting can be offloaded to the standby to ease
the workload on the primary.
 And … It's Free! (Included with Enterprise Edition)
Summary
 Backup, backup, backup
 Practice, practice, practice your
disaster recovery plans
 Don’t Panic! Follow your plan
http://www.uaex.edu/srea/bkupreco.htm
http://www.uaex.edu/srea/dataguard.htm
Questions and Answers
Whew! Glad That’s Over!
Any Questions?
Thank You!
Stephen Rea
More Oracle and Banner information can be found at
my Oracle Tips, Tricks, and Scripts web site:
http://www.uaex.edu/srea
Please complete the on-line Evaluation Form
Evaluation Code 082