Download PPT - Ajay Ardeshana

Unit - 2
Database Backup and Recovery

Introduction :
• Concurrency control and database recovery are both part of transaction management.
• Recovery is required to protect the database from data inconsistencies and data loss.
• It ensures the atomicity and durability properties of transactions.
• These characteristics of a DBMS help it recover from a failure and restore the database to a consistent state.
Database Recovery Concepts :
• Database recovery is the process of restoring the database to a correct state in the event of a failure.
• It restores the database to the most recent consistent state that existed shortly before the time of the system failure.
• The failure may be the result of a system crash due to hardware or software errors, a media failure such as a head crash, or a software error in the application, such as a logical error in a program that is accessing the database.
• Many of the recovery techniques in use are based on the atomicity property of transactions.
• A transaction is considered a single unit of work in which all operations must be applied and completed to produce a consistent database.
• If, for some reason, any transaction operation cannot be completed, the transaction must be aborted and any changes to the database must be rolled back (undone).
• Thus, transaction recovery reverses all the changes that the transaction has made to the database before it was aborted.
• If the entire database needs to be recovered to a consistent state, the recovery uses the most recent backup copy of the database in a known consistent state.
• The backup copy is then rolled forward to restore all subsequent transactions by using the transaction log information.
• If the database needs to be recovered but the committed portion of the database is still usable, the recovery process uses the transaction log to undo all the transactions that were not committed.
Database Backup :
• Some DBMSs provide functions that allow the database administrator to schedule automatic database backups to secondary storage devices such as disks, CDs and tapes.
• Database backups can be taken at the following levels :
  – A full backup, or dump, of the database.
  – A differential backup of the database, in which only the modifications made since the previous backup copy are copied.
  – A backup of the transaction log only. This level backs up all the transaction log operations that are not reflected in a previous backup copy of the database.
Types of Database Failure :
• There are many types of failure that can affect database processing.
• Some failures affect main memory only, while others involve secondary storage.
1. Hardware Failure :
• Hardware failures include memory errors, disk crashes, bad disk sectors, disk-full errors and so on.
• Hardware failures can also be attributed to design errors, poor quality control during fabrication, overloading and wear-out of mechanical parts.
2. Software Failure :
• Software failures include failures related to software such as the operating system, the DBMS software and application programs.
3. System Crashes :
• System crashes are due to hardware or software errors and result in the loss of main memory.
• The system may also have entered an undesirable state, such as deadlock, which prevents the program from continuing with normal processing.
4. Network Failure :
• Network failures can occur in a client-server configuration or in a distributed database system where multiple database servers are connected by a common network.
• Network failures such as communication software failures or aborted asynchronous connections interrupt the normal operation of the database system.
5. Media Failure :
• Such failures are due to head crashes or unreadable media and result in the loss of parts of secondary storage.
• They are the most dangerous failures.
6. Application Software Error :
• These are logical errors in the program that is accessing the database, which cause one or more transactions to fail.
7. Natural Physical Disasters :
• These are failures such as fires, floods, earthquakes or power failures.
8. Carelessness :
• These are failures due to unintentional destruction of data or facilities by operators or users.
9. Sabotage :
• These are failures due to intentional corruption or destruction of data, hardware or software facilities.
Types of Database Recovery :
• In case of any type of failure, a transaction must be either aborted or committed to maintain data integrity.
• The transaction log plays an important role in database recovery and in bringing the database to a consistent state.
• During recovery from a failure, the recovery manager ensures that either all the effects of a given transaction are permanently recorded in the database or none of them are.
• A transaction begins with the successful execution of a BEGIN TRANSACTION statement and ends with the successful execution of a COMMIT statement.
• The following two types of transaction recovery are used :
  – Forward Recovery.
  – Backward Recovery.
Forward Recovery (or REDO) :
• Forward recovery is the recovery procedure used in case of physical damage, for example a failure of secondary storage, a failure while writing data to the database buffers, or a failure while transferring buffers to secondary storage.
• The intermediate results of a transaction are written to the database buffers, which occupy an area of main memory. From these buffers, the data are transferred to the secondary storage of the database.
• An update operation is regarded as permanent only when the buffers are flushed to the secondary storage of the database.
• The flushing operation can be triggered by the COMMIT operation of the transaction, or automatically when the buffers become full.
• If a failure occurs between writing to the buffers and flushing the buffers to secondary storage, the recovery manager must determine the status of the transaction that performed the WRITE at the time of failure.
• If the transaction had already issued COMMIT, the recovery manager performs a redo, so that the transaction’s updates are applied to the database.
• This redoing of transaction updates is also known as roll-forward.
• Forward recovery guarantees the durability property of transactions.
• To recreate the lost disk, the system begins by reading the most recent copy of the lost data and the transaction log of the changes made to it.
• A program then starts reading log entries, starting from the first one that was recorded after the copy of the database was made and continuing through to the last one that was recorded just before the failure.
• For each of these entries, the program changes the data value concerned in the copy of the database to the ‘after value’ shown in the log entry.
• This means that whatever processing took place in the transaction that caused the log entry to be made, the net result on the database after that transaction will be stored.
• The operation is performed for each entry in the log that caused a change in the database since the copy was taken, in the same order in which these transactions were originally executed.
• This brings the database copy up to date with the database that was destroyed.
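The roll-forward procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log format (transaction, item, after value), the function name roll_forward and the sample values are hypothetical and are not taken from any particular DBMS.

```python
# Minimal roll-forward sketch. The log format is an assumption:
# each entry is (transaction, item, after_value), oldest entry first.
def roll_forward(backup, log_entries):
    """Return a copy of the backup with every logged after-value applied in order."""
    db = dict(backup)                 # work on a copy, never on the backup itself
    for _txn, item, after_value in log_entries:
        db[item] = after_value        # later entries overwrite earlier ones
    return db

# Hypothetical data: backup taken first, then three logged changes.
backup = {"A": 70000, "B": 80000}
log_since_backup = [
    ("T1", "A", 90000),
    ("T1", "B", 60000),
    ("T2", "A", 85000),
]
restored = roll_forward(backup, log_since_backup)
print(restored)
```

Because the entries are replayed in their original order, the last logged after-value of each item is the one that survives.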
Backward Recovery (or UNDO) :
• Backward recovery is the recovery procedure used when an error occurs in the middle of normal operation on the database.
• If a transaction had not committed at the time of failure, it leaves the database inconsistent, because other programs may read the incorrect data and make use of it.
• The recovery manager must then undo (roll back) any effects of the transaction on the database.
• Backward recovery guarantees the atomicity property of transactions.
• In backward recovery, the recovery starts with the database in its current state and the transaction log positioned at the last entry that was made in it.
• A program then reads ‘backward’ through the log, resetting each updated data value in the database to its previous value as recorded in the transaction log, until it reaches the point where the error was made.
• Thus the program undoes each transaction in the reverse order from that in which it was made.
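A backward pass over the log can be sketched as follows. The log format (transaction, item, before value, after value), the function name roll_back and the stop_index parameter marking where the error was made are all assumptions chosen for illustration.

```python
# Minimal backward-recovery sketch. Assumed log entry format:
# (transaction, item, before_value, after_value), oldest entry first.
def roll_back(db, log_entries, stop_index):
    """Undo log entries from the last one back to stop_index (inclusive)."""
    db = dict(db)                                   # leave the caller's dict untouched
    for i in range(len(log_entries) - 1, stop_index - 1, -1):
        _txn, item, before_value, _after_value = log_entries[i]
        db[item] = before_value                     # reset to the logged previous value
    return db

# Hypothetical data: the error is assumed to start at log entry 1 (transaction T2).
log = [
    ("T1", "A", 70000, 90000),
    ("T2", "A", 90000, 95000),
    ("T2", "B", 80000, 60000),
]
current_state = {"A": 95000, "B": 60000}
recovered = roll_back(current_state, log, stop_index=1)
print(recovered)
```

Reading the entries in reverse matters when one item is updated more than once: the oldest before-value within the undone range is the one left in the database.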
Example :-
  – ts : starting time of a transaction
  – tc : time of the disk crash
  – tf : time of the transaction failure
• In this example all the transactions T1, T2, … T6 are executing concurrently.
• Let us assume that the data for transactions T2 and T4 have already been written to disk before the failure at time tf.
• It can be observed that transactions T1 and T6 had not committed at the point of the disk crash. Therefore the recovery manager must undo transactions T1 and T6 at restart time.
• However, it is not clear to what extent the changes made by the other, already committed transactions T2, T3, T4 and T5 have been propagated to the database on secondary storage.
• This uncertainty arises because the buffers may or may not have been flushed to secondary storage.
• Thus, the recovery manager would be forced to redo transactions T2, T3, T4 and T5.
Recovery Techniques :
• The database recovery technique used depends on the type and extent of the damage that has occurred to the database.
• These techniques are based on the atomicity property of transactions.
• The following two types of damage can occur to the database.
• Physical Damage :
  – If the database has been physically damaged, for example by a disk crash, the last backup copy of the database is restored and the update operations of committed transactions are reapplied using the transaction log file.
  – Note that restoration in this case is possible only if the transaction log has not been damaged.
• Non-Physical or Transaction Failure :
  – If the database has become inconsistent due to a system crash during the execution of transactions, the changes that caused the inconsistency are rolled back (undo).
  – It may also be necessary to roll forward (redo) some transactions to ensure that the updates performed by them have reached secondary storage.
  – In this case the database is restored to a consistent state using the before-images and after-images held in the transaction log file.
  – This technique is also known as the log-based recovery technique.
  – For this, the following two techniques are used :
      – Deferred Update
      – Immediate Update
Deferred Update :
• In the deferred update technique, updates are not written to the database until after a transaction has reached its COMMIT point. In other words, the updates to the database are deferred (postponed) until the transaction completes its execution successfully and reaches its commit point.
• During transaction execution the updates are recorded only in the transaction log and in the cache buffers.
• After the transaction reaches its commit point and the transaction log is force-written to disk, the updates are recorded in the database.
• If a transaction fails before it reaches this point, it will not have modified the database, so no undoing of changes will be necessary. However, it may be necessary to redo the updates of committed transactions, as their effects may not have reached the database.
• In deferred update, the transaction log file is used in the following ways :
  – When a transaction T begins, a transaction-begin record <T, BEGIN> is written to the transaction log.
  – During the execution of transaction T, each write adds a new log record containing the log data specified previously. E.g. writing the new value ai for attribute A is logged as “<WRITE(A,ai)>”. Each record consists of the transaction name T, the attribute name A and the new value ai for attribute A.
  – When all the operations comprising transaction T have completed successfully, we say that transaction T partially commits, and the record “<T,COMMIT>” is written to the transaction log. After transaction T partially commits, the records associated with T in the transaction log are used to execute the actual updates by writing to the appropriate records in the database. If transaction T aborts, its transaction log records are ignored and the writes are not performed.
Time     Transaction         Action
Time-1   READ(A, a1)         Read current loan balance
Time-2   a1 := a1 + 20000    Increase loan balance by 20000
Time-3   WRITE(A, a1)        Write new (updated) loan balance
Time-4   READ(B, b1)         Read current loan cash balance
Time-5   b1 := b1 - 20000    Reduce loan cash balance by 20000
Time-6   WRITE(B, b1)        Write new (updated) loan cash balance
Normal Execution of Transaction T

• The transaction updates an attribute called the employee’s loan balance (EMP_LOAN_BAL) in the table EMPLOYEE.
• Assume that the current values are EMP_LOAN_BAL (A) = 70000 and CUR_LOAN_CASH_BAL (B) = 80000.
• The transaction makes a loan payment of 20000 to the employee, so A becomes 70000 + 20000 = 90000 and B becomes 80000 - 20000 = 60000.
• After a failure has occurred, the DBMS examines the transaction log to determine which transactions need to be redone.
• If the transaction log contains both the start record <T,BEGIN> and the commit record <T,COMMIT> for transaction T, then transaction T must be redone.
• That means the database may have been corrupted, but the transaction execution was completed and the new values for the relevant data items are contained in the transaction log.
• Therefore the transaction needs to be reprocessed.
• Redo sets the value of all data items updated by transaction T to the new values that are recorded in the transaction log.
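The redo rule for deferred update can be sketched as below. The record layout (tuples tagged "BEGIN", "WRITE" and "COMMIT") and the function name recover_deferred are hypothetical; the sketch only assumes that WRITE records carry the new value, as described above.

```python
# Deferred-update recovery sketch. Assumed record shapes:
#   (txn, "BEGIN"), (txn, "WRITE", item, new_value), (txn, "COMMIT")
def recover_deferred(db, log):
    """Redo the writes of every transaction that has a COMMIT record in the log."""
    committed = {rec[0] for rec in log if rec[1] == "COMMIT"}
    db = dict(db)
    for rec in log:
        if rec[1] == "WRITE" and rec[0] in committed:
            _txn, _kind, item, new_value = rec
            db[item] = new_value      # redo: set the item to the logged new value
    return db                         # uncommitted writes are simply ignored

stored = {"A": 70000, "B": 80000}
full_log = [("T", "BEGIN"), ("T", "WRITE", "A", 90000),
            ("T", "WRITE", "B", 60000), ("T", "COMMIT")]

after_commit = recover_deferred(stored, full_log)        # COMMIT present: redo
before_commit = recover_deferred(stored, full_log[:3])   # no COMMIT: nothing to do
print(after_commit, before_commit)
```

No undo step appears because, under deferred update, an uncommitted transaction has not yet modified the database.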
Time                          Log Entries        Database Stored Value
Before start of transaction                      A = 70000, B = 80000
Time-1                        <T, BEGIN>
Time-2                        <T, A, 90000>
Time-3                        <T, B, 60000>
Time-4                        <T, COMMIT>
After transaction                                A = 90000, B = 60000
Database Update Log Entries for Transaction T
• Now let us assume that a database failure occurs under each of the following conditions :
  – Just after the COMMIT record is entered in the transaction log and before the updated records are written to the database.
  – Just before the execution of the WRITE operation.
Time                          Log Entries        Database Value
Before start of transaction                      A = 70000, B = 80000
Time-1                        <T, BEGIN>
Time-2                        <T, A, 90000>
Time-3                        <T, B, 60000>
Time-4                        <T, COMMIT>
Failure occurs just after the COMMIT record is entered and before the updated records are written to the database.

• Suppose the failure occurs just after the <T,COMMIT> record is entered in the transaction log and before the updated records are written to the database.
• When the system comes back up, the transaction log contains both the <T,BEGIN> and <T,COMMIT> records for transaction T, so T must be redone.
• The REDO operation is executed, resulting in the values 90000 and 60000 being written to the database as the updated values of A and B.
Time                          Log Entries        Database Value
Before start of transaction                      A = 70000, B = 80000
Time-1                        <T, BEGIN>
Time-2                        <T, A, 90000>
Time-3                        <T, B, 60000>
Time-4                        <T, COMMIT>
Failure occurs before the execution of the WRITE operation (log entries after the point of failure, including <T, COMMIT>, are never written).

• In this case, when the system comes back up, no action is necessary because no COMMIT record for transaction T appears in the transaction log.
• So the values of A and B in the database remain 70000 and 80000.
• In this case the transaction must be restarted.
Immediate Update :
• In the immediate update technique, all updates to the database are applied immediately as they occur, without waiting to reach the COMMIT point, and a record of all changes is kept in the transaction log.
• In this technique, when the transaction begins, a <T,BEGIN> record is written to the transaction log, and each update operation is written to the transaction log on disk before it is applied to the database.
• This type of recovery method requires two procedures, namely :
  – Redoing transaction T, REDO(T), and
  – Undoing transaction T, UNDO(T).
• The first procedure redoes the operations in the same way as in deferred update.
• The second restores the values of all attributes updated by transaction T to their old values.
Time                          Log Entries               Database Value
Before start of transaction                             A = 70000, B = 80000
Time-1                        <T, BEGIN>
Time-2                        <T, A, 70000, 90000>
Time-3                                                  A = 90000
Time-4                        <T, B, 80000, 60000>
Time-5                                                  B = 60000
Time-6                        <T, COMMIT>
Immediate Update Log Entries for Transaction T.
• In immediate update, the transaction log file is used in the following way :
  – When a transaction T begins, <T, BEGIN> is written to the log file.
  – When a write operation is performed, a record containing the necessary data is written to the transaction log file.
  – Once the transaction log record is written, the update is written to the database buffers.
  – The updates to the database itself are written when the buffers are next flushed to secondary storage.
  – When transaction T commits, a <T,COMMIT> record is written to the transaction log file.
• If the transaction log contains the record <T,BEGIN> but does not contain <T,COMMIT>, transaction T is undone: the old values of the affected data items are restored and transaction T is restarted. If the log contains both records, T is redone.
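The UNDO and REDO decisions for immediate update can be sketched together. The record layout (WRITE records carrying both the old and the new value) follows the log entries shown in this section, but the tuple format and the function name recover_immediate are assumptions made for illustration.

```python
# Immediate-update recovery sketch. Assumed record shapes:
#   (txn, "BEGIN"), (txn, "WRITE", item, old_value, new_value), (txn, "COMMIT")
def recover_immediate(db, log):
    """Undo transactions without a COMMIT record, then redo committed ones."""
    db = dict(db)
    committed = {rec[0] for rec in log if rec[1] == "COMMIT"}
    # UNDO pass: scan the log backwards and restore old values.
    for rec in reversed(log):
        if rec[1] == "WRITE" and rec[0] not in committed:
            _txn, _kind, item, old_value, _new_value = rec
            db[item] = old_value
    # REDO pass: scan the log forwards and reapply new values.
    for rec in log:
        if rec[1] == "WRITE" and rec[0] in committed:
            _txn, _kind, item, _old_value, new_value = rec
            db[item] = new_value
    return db

# Case 1: failure just before WRITE(B, b1); A has already reached the database.
partial_log = [("T", "BEGIN"), ("T", "WRITE", "A", 70000, 90000)]
undone = recover_immediate({"A": 90000, "B": 80000}, partial_log)

# Case 2: failure just after COMMIT was logged; B had not reached the database yet.
full_log = partial_log + [("T", "WRITE", "B", 80000, 60000), ("T", "COMMIT")]
redone = recover_immediate({"A": 90000, "B": 80000}, full_log)
print(undone, redone)
```

Running the undo pass before the redo pass is a design choice of this sketch; with a single transaction, as here, the order does not affect the result.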
• Now suppose that a database failure occurs under each of the following conditions :
  – Just before the WRITE action “WRITE (B, b1)”.
  – Just after “<T, COMMIT>” is written to the transaction log but before the new values are written to the database.
Time                          Log Entries               Stored Value
Before start of transaction                             A = 70000, B = 80000
Time-1                        <T, BEGIN>
Time-2                        <T, A, 70000, 90000>      A = 90000
Transaction T fails before the WRITE action to the database in immediate update

• When a failure occurs just before the execution of the WRITE operation, the system comes back up and finds the record <T, BEGIN> but no corresponding <T, COMMIT>.
• This means that transaction T must be undone, so an UNDO(T) operation is executed. This restores the value of A to 70000, and the transaction can be restarted.
Time                          Log Entries               Stored Value
Before start of transaction                             A = 70000, B = 80000
Time-1                        <T, BEGIN>
Time-2                        <T, A, 70000, 90000>
Time-3                                                  A = 90000
Time-4                        <T, B, 80000, 60000>
Time-5                                                  B = 60000
Time-6                        <T, COMMIT>
Immediate Update Log Entries for T when the failure occurs after the COMMIT action

• The table above shows the transaction log when a failure occurs just after <T, COMMIT> is written to the transaction log but before the new values are written to the database.
• When the system comes back up, a scan of the transaction log shows the corresponding <T, BEGIN> and <T, COMMIT> records.
• Thus a REDO(T) operation is executed.
• This results in the values of A and B being 90000 and 60000 respectively.
Shadow Paging :
• The shadow paging technique does not require the use of a transaction log in a single-user environment.
• However, in a multiuser environment a transaction log may be needed for the concurrency control method.
• In the shadow paging scheme, the database is considered to be made up of logical units of storage called fixed-size disk pages (or blocks).
• The pages are mapped onto physical blocks of storage by means of a page table, with one entry for each logical page of the database.
• This entry contains the block number of the physical storage where the page is stored.
• Thus, the shadow paging scheme is one possible form of indirect page allocation.
• The shadow paging scheme is similar to the one used by the operating system for virtual memory management.
• In virtual memory management, memory is divided into pages that are assumed to be of a certain size.
• The virtual or logical pages are mapped onto physical memory blocks of the same size as the pages.
[Figure: page table address, page table and physical blocks]
• The mapping is provided by means of a table known as the page table.
• The page table contains one entry for each logical page of the process’s virtual address space.
• The shadow paging technique maintains two page tables during the life of a transaction that is going to modify the database, namely the current page table and the shadow page table.
• The shadow page table is the original page table, and the transaction addresses the database using the current page table.
• At the start of the transaction the two tables are the same and both point to the same blocks of physical storage.
• The shadow page table is never changed thereafter, and is used to restore the database in the event of a system failure.
• However, the current page table entries may change during the execution of the transaction.
• The current page table is used to record all updates to the database. When the transaction completes, the current page table becomes the shadow page table.
• The pages that are affected by the transaction are copied to new blocks of physical storage, and these blocks, along with the blocks not modified, are accessible to the transaction via the current page table.
• The old versions of the changed pages remain unchanged, and these pages continue to be accessible via the shadow page table.
• The shadow page table contains the entries that existed in the page table before the start of the transaction and points to the blocks that were never changed by the transaction.
• The shadow page table remains unaltered by the transaction and is used for undoing the transaction.
Advantages :
• The overhead of maintaining the transaction log file is eliminated.
• Since there is no need for UNDO or REDO operations, recovery is significantly faster.
Disadvantages :
• Data fragmentation or scattering.
• The need for periodic garbage collection to reclaim inaccessible blocks.
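The two page tables can be sketched with a small in-memory model. The class name ShadowPagedDB, its methods and the use of a Python list for the physical blocks are all hypothetical; a real implementation would keep the tables and blocks on disk.

```python
# Shadow paging sketch: one list of physical blocks and two page tables.
class ShadowPagedDB:
    def __init__(self, pages):
        self.blocks = list(pages)                  # physical blocks
        self.shadow = list(range(len(pages)))      # shadow page table (never edited in place)
        self.current = list(self.shadow)           # current page table, initially identical

    def read(self, page_no):
        return self.blocks[self.current[page_no]]

    def write(self, page_no, data):
        if self.current[page_no] == self.shadow[page_no]:
            # First update of this page: copy it to a new block so that
            # the old version stays reachable through the shadow table.
            self.blocks.append(data)
            self.current[page_no] = len(self.blocks) - 1
        else:
            self.blocks[self.current[page_no]] = data

    def commit(self):
        self.shadow = list(self.current)           # current table becomes the shadow table

    def abort(self):
        self.current = list(self.shadow)           # discard changes by reverting the table

db = ShadowPagedDB(["a", "b"])
db.write(0, "A1")
db.abort()                       # page 0 is back to its old contents
page0_after_abort = db.read(0)
db.write(1, "B1")
db.commit()
page1_after_commit = db.read(1)
block_count = len(db.blocks)     # includes the orphaned block left by the aborted write
```

After the abort, the block that held "A1" is no longer referenced by either table, which illustrates why periodic garbage collection is listed as a disadvantage.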
Checkpoints :
• The point of synchronization between the database and the transaction log file is called a checkpoint.
• The general method of database recovery uses the information in the transaction log file. The main difficulty with this method is knowing how far back to search in the transaction log in case of failure.
• In the absence of this exact information, we may end up redoing transactions that have already been safely written to the database, which is very time-consuming and wasteful.
• A better way is to find a point that is sufficiently far back to ensure that any item written before that point has been written correctly and stored safely.
• This method is called checkpointing.
• In checkpointing, all buffers are force-written to secondary storage.
• The checkpoint technique is used to limit :
  – The volume of log information.
  – The amount of searching.
  – The subsequent processing that needs to be carried out on the transaction log file.
• During the execution of transactions, the DBMS maintains the transaction log but periodically performs checkpoints.
• Checkpoints are scheduled at predetermined intervals and involve the following operations :
  – Writing a start-of-checkpoint record, along with the time and date, to the log on the stable storage device, identifying it as a checkpoint.
  – Writing all transaction log file records in main memory to secondary storage (SS).
  – Writing the modified blocks in the database buffers to SS.
  – Writing a checkpoint record to the transaction log file. This record contains the identifiers of all transactions that are active at the time of the checkpoint.
  – Writing an end-of-checkpoint record and saving the address of the checkpoint record in a file accessible to the recovery routine on start-up after a system crash.
• At the time of the checkpoint, all database modifications that were until then reflected only in the database buffers are propagated to the appropriate storage.
• A checkpoint can be taken at fixed intervals of time.
• In case of a failure during the serial operation of transactions, the transaction log file is checked to find the last transaction that started before the last checkpoint.
• Any earlier transactions would have committed previously and would have been written to the database at the checkpoint.
• Therefore it is necessary to redo only :
  a) The transaction that was active at the checkpoint,
  b) Any subsequent transactions for which both start and commit records appear in the transaction log.
• If a transaction is active at the time of failure, that transaction must be undone.
• If transactions are performed concurrently, redo all transactions that have committed since the checkpoint and undo all transactions that were active at the time of failure.
• In the example, only transaction T1 needs no action.
• Transactions T2 and T4 will be redone, and T3 and T5 will be undone.
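The redo/undo rule for concurrent transactions can be sketched as a scan of the log from the last checkpoint onward. The record format, the function name classify and the exact timeline below are assumptions; the timeline is only one ordering that is consistent with the T1 to T5 outcome stated above, and the sketch assumes the log contains at least one checkpoint record.

```python
# Checkpoint-based classification sketch. Assumed record shapes:
#   ("BEGIN", txn), ("COMMIT", txn), ("CHECKPOINT", [active transactions])
def classify(log):
    """Return (redo, undo) lists based on the last checkpoint in the log."""
    last_cp = max(i for i, rec in enumerate(log) if rec[0] == "CHECKPOINT")
    active = set(log[last_cp][1])      # transactions active at the checkpoint
    redo = []
    for kind, txn in log[last_cp + 1:]:
        if kind == "BEGIN":
            active.add(txn)
        elif kind == "COMMIT":
            active.discard(txn)
            redo.append(txn)           # committed after the checkpoint: redo
    return redo, sorted(active)        # still active at the failure: undo

# Hypothetical timeline: T1 commits before the checkpoint, T2 and T3 are active at it.
log = [
    ("BEGIN", "T1"), ("BEGIN", "T2"), ("COMMIT", "T1"), ("BEGIN", "T3"),
    ("CHECKPOINT", ["T2", "T3"]),
    ("COMMIT", "T2"), ("BEGIN", "T4"), ("BEGIN", "T5"), ("COMMIT", "T4"),
]   # failure happens here
redo, undo = classify(log)
print("redo:", redo, "undo:", undo)
```

T1 appears in neither list because it committed before the checkpoint, so its updates were already forced to the database.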
Buffer Management :
• Buffers are reserved blocks of main memory.
• DBMS application programs require I/O operations, which are performed by a component of the OS.
• These I/O operations normally use buffers to match the speed of the processor and the relatively fast main memory with the slower secondary storage, and also to minimize the number of I/O operations between main and secondary memory.
• The assignment and management of memory blocks is called buffer management, and the component of the OS that performs this task is called the buffer manager.
• The buffer manager is responsible for the efficient management of the database buffers that are used to transfer pages between the buffers and secondary storage.
• It ensures that as many as possible of the data requests made by programs are satisfied from data already copied from secondary storage into the buffers.
• The buffer manager reads pages from disk into the buffers until the buffers become full, and then uses a replacement strategy to decide which buffer to force-write to disk to make space for new pages that need to be read from disk.
• Some of the replacement strategies used by the buffer manager are :
  – First-In-First-Out (FIFO) and
  – Least Recently Used (LRU).
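An LRU replacement strategy can be sketched with Python's collections.OrderedDict. The class name LRUBufferPool, the dictionary standing in for secondary storage and the disk_reads counter are hypothetical; the sketch force-writes every evicted page, whereas a real buffer manager would typically track which pages are dirty.

```python
from collections import OrderedDict

# LRU buffer pool sketch. "disk" is a plain dict standing in for secondary storage.
class LRUBufferPool:
    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk
        self.frames = OrderedDict()    # page_id -> contents, least recently used first
        self.disk_reads = 0

    def get(self, page_id):
        if page_id in self.frames:
            self.frames.move_to_end(page_id)          # mark as most recently used
            return self.frames[page_id]
        if len(self.frames) >= self.capacity:
            victim, contents = self.frames.popitem(last=False)   # evict the LRU page
            self.disk[victim] = contents              # force-write the victim to disk
        self.disk_reads += 1
        self.frames[page_id] = self.disk[page_id]     # read the requested page
        return self.frames[page_id]

disk = {1: "page-1", 2: "page-2", 3: "page-3"}
pool = LRUBufferPool(capacity=2, disk=disk)
pool.get(1)
pool.get(2)
pool.get(1)      # buffer hit: page 1 becomes the most recently used
pool.get(3)      # buffers full: page 2 is the least recently used and is evicted
resident = list(pool.frames)
```

Replacing popitem(last=False) with eviction by load order, and dropping the move_to_end call, would turn the same sketch into FIFO.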
• A computer system uses buffers that are in fact virtual memory buffers. Thus, a mapping is required between a virtual memory buffer and physical memory.
• The physical memory is managed by the memory management component of the OS.
• Under virtual memory management, the buffers containing pages of the database that are being modified by a transaction could be written out to secondary storage.
• The timing of this premature writing of a buffer is decided by the memory management component of the OS and is independent of the state of the transaction.
• To reduce the number of buffer faults, the LRU algorithm is used for buffer replacement.
• Buffer management effectively provides a temporary copy of a database page.
• Therefore, it is used in database recovery systems in which the modifications are made to this temporary copy while the original page remains unchanged in secondary storage.
• Both the transaction log and the database pages are written to buffer pages in virtual memory.
• The COMMIT operation of a transaction takes place in two phases, and thus it is called a two-phase commit.
• In the first phase of the COMMIT operation, the transaction log buffers are written out, and in the second phase, the data buffers are written out.
• Thus premature writing does not cause any problem, because the log is always forced during the first phase of COMMIT.
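The ordering of the two COMMIT phases can be sketched as follows. The function name commit, the crash_after_log flag used to simulate a failure between the phases, and the record format are all hypothetical; the point is only that the log reaches stable storage before the data buffers do.

```python
# Two-phase COMMIT ordering sketch: log buffers first, data buffers second.
def commit(log_buffer, data_buffer, stable_log, database, crash_after_log=False):
    # Phase 1: force the transaction log buffers to stable storage.
    stable_log.extend(log_buffer)
    log_buffer.clear()
    if crash_after_log:
        data_buffer.clear()            # simulated crash: main memory contents are lost
        return
    # Phase 2: write the data buffers to the database.
    database.update(data_buffer)
    data_buffer.clear()

stable_log = []
database = {"A": 70000}
commit([("T", "WRITE", "A", 90000), ("T", "COMMIT")], {"A": 90000},
       stable_log, database, crash_after_log=True)
after_crash = dict(database)           # the new value never reached the database

# Recovery: the forced log still holds everything needed to redo T.
committed = {rec[0] for rec in stable_log if rec[1] == "COMMIT"}
for rec in stable_log:
    if rec[1] == "WRITE" and rec[0] in committed:
        database[rec[2]] = rec[3]
```

Had the data buffers been written first, a crash between the phases could leave updates in the database with no log record to redo or undo them.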