Download Recovery

Chapter 20: Recovery Failure Types Transaction Failures: local recovery  System Failure: Global recovery   Main memory is lost  Disk survives  Media Failure 421B: Database Systems - Recovery 2 Motivation  Atomicity: All-or-nothing  Transactions may abort (“Rollback”).  Durability:  Changes survive server crash  Desired Behavior after system restarts: – T1, T2 & T3 should be durable. – T4 & T5 should be aborted (effects not seen). 421B: Database Systems - Recovery T1 T2 T3 T4 T5 c crash! c c a 3 Update Approaches  In-place updating  Change value of object and write it back to the same place on stable storage  Used in most DB  Multiversion System  Old object version is kept, new version is created  E.g., in PostgreSQL  Vacuum procedure that from time to time deletes old versions that are no more needed 421B: Database Systems - Recovery 4 Handling the Buffer Pool  Force/NoForce  If transaction T has written object X on page P: Force page P to disk (flush) before T commits  All changes of T are in the database on stable storage before T commits  NoForce: Flushing pages to disk is only determined by replacement policy of buffer manager;   some of the changes of T might not be in the stable database at commit Steal/NoSteal  NoSteal: If transaction T has updated object X on page P: do NOT flush P before T commits;  No change of an active, uncommitted transaction is on stable storage  Steal: Replacements strategy is allowed to replace and flush a page even if the page contains update of uncommitted transaction  Changes of uncommitted transactions might be in the stable database 421B: Database Systems - Recovery 5 Combinations NoSteal Force No Force Steal Atomic flush at Commit Time flush any time before commit flush any time after commit Flush any time 421B: Database Systems - Recovery 6 More on Steal   STEAL (why enforcing Atomicity is hard)  To steal frame F: Current page in F (say P) is written to disk; some transaction T holds lock on object A on P.  What if the T with the lock on A aborts? What if the system fails directly after the flush and before T commits?  Must remember the old value of A at steal time (to support UNDOing the write to page P).  Crash case: we have to do something with transaction T4 ACTIVE at the time of the crash No Steal (no uncommitted changes are in the stable database)  At restart after failure we are sure that none of the changes of T4 are in DB  nothing to do for ACTIVE transactions 421B: Database Systems - Recovery 7 More on Force   Force (write changes before transaction commits)  At restart after failure we are sure that changes of T1, T2, and T3 are in DB and changes of T5 are NOT in the DB; nothing to do for TERMINATED transactions NO FORCE (why enforcing Durability is hard)  Assume a transaction T has modified tuple on page P and T committed but update is not yet in the stable database? Now system crashes before modified page is written to disk?  Write as little as possible, in a convenient place, at commit time,to support REDOing modifications. 421B: Database Systems - Recovery 8 Combinations  Ideal: FORCE/NO-Steal:  nothing has to be done at recovery  Problem: basically not possible with update-in-place  In reality: mostly NOFORCE/STEAL 421B: Database Systems - Recovery 9 Basic Idea: Logging      A log is a read/append data structure maintained on stable storage (survives failures) UNDO information: when transaction T updates an object it stores the old version of object (before-image); when transaction aborts, copy before-image to current objectlocation. REDO information: when transaction T updates an object it stores the new version of object (after-image); can be used to redo updates of transactions that committed. In total: whenever a transaction T updates an object, both before- and after-image are written as one log-record and appended to the log. Additionally: when transaction starts, a BEGIN record is appended to the log; when transaction commits/aborts, a commit/abort record is appended to the log. 421B: Database Systems - Recovery 10 Architecture II Secondary Storage (stable) Upper Layer Cache/Buffer Manager Access cost: ~15 ms Log Disk Access cost: ~1 ms 421B: Database Systems - Recovery Buffer Pool (random access) Log (append/read) 11 DB pages and Log pages Db page: Rid = (i,N) Page i Rid = (i,2) Rid = (i,1) 20 N T255: w(x) ... 16 2 24 N 1 # slots Log page: T3: before(y), after (y) … … T255: begin T3: commit T255: before(z), after (z) … T255: before(x), after (x) 421B: Database Systems - Recovery Log tail 12 When to flush a log page  The Write-Ahead Logging Protocol:  Must force the log entry for an update before the corresponding data page gets to disk.  Must write all log entries for a Xact before commit.  Note: flushing log page is much cheaper than flushing DB page!  #1 guarantees Atomicity  Assume active T has changed X; page with X get flushed to disk (steal); now system crashes before T commits => must undo T changes; need before image of X!  #2 guarantees Durability  Assume T has changed X and committed; page with X does not get flushed to disk (no-force); now system crashes => must redo T changes; need after image of X! 421B: Database Systems - Recovery 13 Types of Recovery  Local UNDO during normal processing  whenever a transactions aborts, undo updates of aborted Xact by installing before-images.  Log-records are probably still in main memory; scan backwards starting from log-tail;  Global UNDO: at restart after system crash  Xacts that aborted before the crash (we find abort record in log)  Xacts that were active at the time of the crash (we find neither abort nor commit record in log)  Whenever pages on the disk have updates of such Xacts (we say the update is reflected in the database), undo these updates by installing before-images  Pages contain additional information to detect this! 421B: Database Systems - Recovery 14 Types of Recovery  Partial REDO: at restart after system crash  Xacts that committed before the crash (we find a commit record in the log)  Whenever pages on the disk do not have the updates of such Xacts (we say the update is not reflected in the database), redo the updates by installing after-images  Page contains additional information so that we can detect this.  Global REDO: after disk failure  Make snapshot of database (once a day /once a week)  Duplicate log and keep on two disks  Keep log on a second storage  After disk failure  Start with snapshot and then apply log 421B: Database Systems - Recovery 15 Recovery after Crash  Simple procedure:  Backward pass: Scan log from tail to head; For each record     If commit of T, include T in list of committed transactions C If abort of T, include T in list of aborted transactions A If update record of T and T is neither in A or C, include T in list of aborted transactions If update record of T on object X and T in A  Read in page P with object X  If update on X performed, install before-image  Forward pass: Scan log from head to tail: for each record  If update record of T on object X and T in C  Read in page P with object X  If update on X not yet performed, install after-image 421B: Database Systems - Recovery 16 Example of Recovery LOG 1 2 3 4 5 6 7 BM Backward pass:  7: Put T3 in C  6: Put T2 in A update: T1 writes A on P5  6: Read P5; nothing update T2 writes B on P3 P5 is flushed has to be done T1 Abort  4,5: nothing update: T3 writes C on P1  3: put T1 in A update: T3 write D on P3  2: read p3; install update: T2 writes Z P5 before-image of B P3 is flushed T3 commit  1: read p5: install CRASH, RESTART before image (the write on A was flushed to disk but  Forward pass: not the undo during  Step 4: read P1 install normal processing) after-image of P1  Step 5: read P3; nothing has to be done 421B: Database Systems - Recovery  17 Checkpointing     Log becomes longer and longer => recovery has to read entire log! Periodically, the DBMS creates a checkpoint, in order to minimize the time taken to recover in the event of a system crash. Simple checkpoint:  Goal: only log that was written after the checkpoint has to be analyzed  Algorithm:  Prevent new transactions from starting  Wait until all transactions have terminated  Flush all dirty pages to stable storage  Write a checkpoint log entry  Start new transactions  Upon recovery: backward pass only goes to last checkpoint entry In real life more complicated; transaction processing is not interrupted; no big flush in one step 421B: Database Systems - Recovery 18 Example Write Flush Chkpt Buffer record start T1 T2 T3 T4 T5 c a 421B: Database Systems - Recovery c 19 Further Issues    Crash during Recovery Logical logging: instead of physical before/after image redo operation / inverse operation (e.g. increment by one, decrement by one) Hard disk failures:  mirror disk or  Archive copy (consistent copy of database on tape, created e.g. once every night when no transaction processing) + archive log (similar to log shown here)  Real Life: much more complicated: see textbook with ARIES 421B: Database Systems - Recovery 20

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download Recovery