* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download IT 21003 Database Administration
Microsoft Access wikipedia , lookup
Global serializability wikipedia , lookup
Commitment ordering wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Serializability wikipedia , lookup
Oracle Database wikipedia , lookup
Ingres (database) wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Functional Database Model wikipedia , lookup
Relational model wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Versant Object Database wikipedia , lookup
Database model wikipedia , lookup
Clusterpoint wikipedia , lookup
IT 21003 Database Administration Section 08 Protecting the Database from Failure Backup and Recovery  Questions to ask:      Who has the responsibility? Who does it? Who plans the procedures? Who takes the precautions? Who makes it happen?  Possibilities:  End Users  Operators  Power Users  Application or Systems Programmers  Systems Managers Tape Librarians  DBA Protecting the Database from Failure Backup and Recovery: DBA Activities  The DBA has the responsibility for planning backup and recovery procedures/strategies to cope with hardware and software failure  Disasters cause high-stress situtations  The plan must be simple, preferably automated  Should minimize human intervention  Can use the Recovery Manager to help automate these procedures  The Recovery Plans must be well documented  The recovery procedures must:  Bring the database back in a consistent state at the time of the crash  Recover the database in a reasonable time period so that downtime is minimized Protecting the Database from Failure Backup and Recovery Requirements  Recovery plans should be discussed in the early stages of database development  Some key issues: Is the database volatile? Can the users afford to lose a day’s work? Can the database be taken down on a regular basis?  It is important to establish a Service-Level Agreement (SLA) with the user departments  Backup and recovery strategies are much simpler if users understand the process  7x24 databases require complex backup and recovery routines and carefully planned downtime for maintenance Protecting the Database from Failure Backup and Recovery Techniques  Dumping and Logging  Most sites cannot allow the lose of any work  File dumps of the database are taken at set intervals, and logs of the changes between the dumps are produced  These files can then be used to perform full recovery of a database  If the database (or parts of it) are lost  Restore from the latest dump  Roll Forward in time by applying the logs of all the changes  All these facilities are available in the Oracle Recoverer Protecting the Database from Failure Possible Failures  User Errors – hard to plan for  Transaction Errors  User Process Failures  Network Failures  Instance Failures  CPU (Host) Failures  Media Failures Protecting the Database from Failure User Errors  Typical user errors An unwanted change is committed to the database, but the mistake is not realized until some time later Accidental drop of a table (cannot be rolled back) Performing wholesale deletes or updates without a WHERE clause  Prevention Users should be trained in database and application principles  Recovery  Committed data cannot simply be backed out  Other transactions may have already used the bad data  Even if the user knows the correct value, user management should probably be consulted before any change is made  Recovery Manager can be used to recover a tablespace to the point before the error occured Protecting the Database from Failure Transaction Errors  Typical transactions errors Lack of space when a table needs to expand Lack of privileges on underlying objects Deadlock with conflicting transactions Programming errors within and application  Prevention  Cannot guarantee prevention of any of the above  Careful planning of storage needs will help  Recovery  Intervention by the DBA is not normally required  Recovery performed by Oracles automatic rollback processes Protecting the Database from Failure User Process Failures  Typical causes of user process failures Abnormal disconnection of user process Loss of client due to user action Addressing exception  Prevention Not much can be done by the DBA  Recovery  Intervention by the DBA is not normally required  Aborted processes are detected by PMON which performs a rollback of the active transaction and releases all resources  PMON cleans out all caches and frees up memory occupied by the failed process Protecting the Database from Failure Network Failures  Typical types of network failures Network or phone connections lost Client workstation becomes disconnected  Recovery  Intervention by the DBA is not normally required  Loss of the connection is detected automatically by PMON  PMON forces a rollback of the client’s transaction Protecting the Database from Failure Instance Failures  Typical causes of instance failure Loss of an Oracle background process  Prevention Not much can be done Monitor the trace files for clues to the cause Recovery  Intervention by the DBA is not normally required except to shutdown and restart the database  SMON process automatically applies redo log information and rolls back all uncommitted changes on instance startup Protecting the Database from Failure CPU (Host) Failure  Typical causes of CPU failure Loss of power  Prevention Utilize a UPS Recovery  Intervention by the DBA is not normally required except to have already created scripts that restart the database on computer bootup  SMON process automatically applies redo log information and rolls back all uncommitted changes on instance startup Protecting the Database from Failure Media Failure – with no data loss  Typical failures Loss of disk controller  Prevention Utilize a UPS Recovery  Intervention by the DBA is not normally required except to have already created scripts that restart the database on computer bootup  SMON process automatically applies redo log information and rolls back all uncommitted changes on instance startup Protecting the Database from Failure Media Failure – with data loss  Typical failures  Disk Head crash  Disk or data destruction  Fire  Natural disasters  Accidental datafile deletion  Serious Failures Requires DBA action Requires plans and procedures to be in place Recovery  A strategy should be in place to handle all possible losses of data  Data loss requires both rollforward and rollback procedures  Same steps as instance recovery but probably from an earlier point in time  Need to establish which files are lost  Backup method and type of files lost dictate the recovery method used Protecting the Database from Failure Media Failure – with data loss (further protections)  Mirroring  Some sites are able to use mirror disks  Offers fault tolerance against media failure  Cannot protect against user and software errors  Mirrored Systems Expensive Whole systems mirrored for full fault tolerance  Hot standby databases  Oracle can keep a standby database in permanent recovery mode  It is a clone of the target database  Archived redo logs generated by the target are applied to the standby database as they are produced  RAID  Allows the use of additional disks to protect database files  Numerous configurations possible Protecting the Database from Failure Backup Strategies  All files need to be protected Datafiles Redo Log Files Control Files A backup strategy must be developed A choice of techniques is available to consider Rehearsals should be carried out periodically of both backup and recovery procedures  Need to consider:  The volatility of the database?  Can users afford to lose any data?  Can the database have planned downtime?  How large is the database? Protecting the Database from Failure Archive Logging  When active online redo log file is full, it is written out to an archive log file  Requires an additional background process ARCH – Archiver This is an optional process  Archiving allows the following features: Full recovery of the database to a state of consistency as of the time of the crash No loss of any committed data Important for transaction-oriented volatile databases  Oracle databases can run in either  ARCHIVELOG mode or  NOARCHIVELOG mode Protecting the Database from Failure Archive Logging  ARCHIVELOG mode  Allows full recovery of database  Hot backups can be performed while the database remains available  Management overhead concerns because the archive disk can become full  Writing to tape can be slow so archive to disk first  Highly active systems can fill redo log files faster than the archive to tape process  This will halt the instance until the archiver catches up Protecting the Database from Failure Archive Logging  NOARCHIVELOG mode  Allows the fastest performance of database  Advantages  Redo log files are cyclically reused because no archive logs are produced  No management overhead  Mass updates possible without fear of filling up the archive disk  Disadvantages  Cold, whole database backup only allowed  Damaged tablespaces stop the instance until dropped or restored  Recovery from media failures is generally not possible  Recovery Options  Restore from a previous backup  Restore from a previous export using the import utility Protecting the Database from Failure Switch to Archive Logging  The database can be switched to ARCHIVELOG mode after creation  Steps: Shutdown Normal Startup Mount db1 Alter Database db1 Archivelog; Alter Database Open;  The database is placed into ARCHIVELOG mode while in the MOUNTed state  This does not automatically enable archiving Protecting the Database from Failure Switch to Archive Logging  When a database is put into ARCHIVELOG mode the redo log files are not automatically archived  Two methods to turn on automatic archiving  Use the parameter file to enable permanent automatic archiving  Log_Archive_Start = True (False is the default)  Or while the instance is running issue:  Alter System Archive Log Start;  This starts automatic archiving but does not survive a shutdown and startup  Remember this only activates automatic archiving if the database is already in ARCHIVELOG mode  Stopping automatic archiving:  Alter System Archive Log Stop; Protecting the Database from Failure Setting a destination for Archive Log Files  The Log_Archive_Dest parameter  Log_Archive_Dest = d:\archive  This can be a single point of failure so be careful  Log_Archive_Dest_1 = d:\archive  Log_Archive_Dest_2 = e:\archive  Log_Archive_Dest_3 = e:\backup  Log_Archive_Dest_4 = f:\archive  Up to five destinations can be specified  No single point of failure Protecting the Database from Failure Archive Summary Information  Archive Log List;  This statement generates the following output:  Database log mode  Automatic archival  Archive destination  Oldest online log sequence  Next log sequence to archive  Current log sequence number Protecting the Database from Failure Redo Log Management  How many?  How big?  Where are they stored?  How are they protected/recovered?  What do they contain?  How will they be used?  Which mode of operation? Protecting the Database from Failure Redo Log Management  Minimum of two log files are required Redo log files are used in a cyclic nature Should use enough log groups to allow archiving to take place before log files need to be reused  Default size is 500 KB Minimum size is 50 KB There is no maximum size Usual size is 1 - 10 MB All log files should be the same size  Should be positioned on separate disks Protecting the Database from Failure Redo Log Management  Should be duplexed (Mirrored)  Oracle is vulnerable to the loss of redo logs  Database stops and full recovery is impossible  Redo logs contain both before and after images of changed data Redo logs provide protection for both data and rollback segments to allow total recovery  Redo logs are used to recover from instance and media failures Can be applied to restored backup copies of the database files Database changes are written to redo logs on transaction commits When redo log is full, a log switch and checkpoint occurs  Redo logs can be archived as they are filled Protecting the Database from Failure Redo Log File Switches  A checkpoint occurs at each switch and each new log is assigned a sequence number Redo Log Group 1 Log 1 Redo Log Group 2 Archive Redo Logs LGWR ARCH Log 1 LGWR 1 Log 2 ARCH Log 3 LGWR 2 Log 2 ARCH Log 3 LGWR 1 3 Log 4 2 1 Protecting the Database from Failure Redo Log Group  Number the groups 1, 2, 3 not 10, 20 , 30 Takes up less space in the control file which references the redo log files  Use just enough groups to prevent waits for checkpoints and archive operations  If Oracle loses a member of a group, it is recorded as STALE and an error is recorded in the ALERT file and LGWR trace file System is not halted, LGWR continues to write to the other members of the group  The archiver can archive any one member of a group  If the log file is unavailable archiver simply chooses an alternative  LGWR cannot reuse a log file currently being archived Protecting the Database from Failure Redo Log Group  Adding a new redo log group  Alter Database db_name Add Logfile Group x ‘filespec’ Size n (K | M);  Adding a new redo log member  Alter Database db_name Add Logfile Member ‘filespec’ To Group n;  Dropping Redo Log Groups  Alter Database db_name Drop Logfile Group n;  At least three log file groups must be present  Must have at least two for LGWR to function Protecting the Database from Failure Managing and Protecting Control Files  Control Files  Relatively small binary files that are critical to the operation of a database  Contain database definitions:  Database name  Names and locations of redo log files and datafiles  Current log sequence numbers and status  Checkpoint timestamp information  Database creation timestamp  Beginning and ending blocks of rollback segments  Information required by the Recovery Manager  All control files must be available while the database is open  A database cannot be opened if a control file is lost Protecting the Database from Failure Control Files: Current Advice  Store at least two copies on separate disks  Better to have a copy on each disk  Oracle mirrors any changes to the control files on all copies  Control files are updated at each checkpoint with a slight impact on performance  If you have only one disk available place the control files in different areas to avoid problems from bad sectors  Backup the control file whenever a change is made to database or redo log files  If primary control files are lost than a full recovery is possible from backup control files Protecting the Database from Failure Adding a New Control File  Shutdown the database  Copy the existing (good) control file to a file(s) on another disk(s) using the operating system  Specify the new control file name in the CONTROL_FILES parameter in the parameter file  Start up the instance and open the database using the changed parameter file Protecting the Database from Failure Safeguarding Control Files  Perform a control file backup each time a database change is made Alter Database Backup Controlfile To ‘filespec’;  Trace out the control file to produce a humanreadable file  Alter Database Backup Controlfile To Trace;  This produces a file in the trace directory that can be read, edited and used to reconfigure the control file if needed Protecting the Database from Failure Loss of a Control File  If any control files are lost, the database cannot function correctly  Shutdown the database (may require a Shutdown Abort) Using the OS make a copy of the good control file but not in the same location as the problem control file  Specify the location of the new control file in the parameter file and remove reference to the bad control file  Start up the database and return to normal function Protecting the Database from Failure Loss of All Control Files  Serious Problem Call Oracle Support  Media recovery is likely necessary  Can restore from a backup and then use an appropriate backup control file to recover the database  It is sometimes possible to re-create a control file using the CREATE CONTROLFILE statement Protecting the Database from Failure The Alert Log File  Written by Oracle and stored in oraclehome/admin/oracle_sid/bdump directory  Contains information on  Database creation  Startup of each background process  Log switches  Datafile and logfile manipulation  Internal errors  Block corruption  Deadlocks  The file grows without bound  Must decide what to keep online and offline Protecting the Database from Failure Questions?
 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                            