Download 4_sql_recovery_1

Document related concepts

Tandem Computers wikipedia , lookup

Database wikipedia , lookup

Oracle Database wikipedia , lookup

Consistency model wikipedia , lookup

Database model wikipedia , lookup

Clusterpoint wikipedia , lookup

Commitment ordering wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Relational model wikipedia , lookup

Open Database Connectivity wikipedia , lookup

Microsoft SQL Server wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

SQL wikipedia , lookup

PL/SQL wikipedia , lookup

Concurrency control wikipedia , lookup

Serializability wikipedia , lookup

Transcript
EXEC SQL
• An embedded SQL statement is
distinguished from the host language
statements
– by enclosing it between EXEC SQL or EXEC
SQL BEGIN
– and a matching EXEC SQL END (or
semicolon)
• Shared variables used in both languages.
– Have to be declared in SQL
– Variables in DECLARE shared and can appear
(prefixed by a colon :) in SQL statements
1
Example: Variable Declaration
in Language C
• SQLCODE (or SQLSTATE) used to communicate
errors/exceptions between the database and the
program
int loop;
EXEC SQL BEGIN DECLARE SECTION;
varchar dname[16], fname[16], …;
char ssn[10], bdate[11], …;
int dno, dnumber, SQLCODE, …;
EXEC SQL END DECLARE SECTION;
2
[En] Singleton Select eg Fig 13.2
• Read SSN of Emp, print some information about that Emp
loop = 1;
while (loop) {
prompt (“Enter SSN: “, ssn);
EXEC SQL
select FNAME, LNAME, ADDRESS, SALARY
into :fname, :lname, :address, :salary
from EMPLOYEE where SSN == :ssn;
if (SQLCODE == 0) printf(fname, …);
else printf(“SSN does not exist: “,
ssn);
prompt(“More SSN? (1=yes, 0=no): “,
loop);
3
Impedance Mismatch + Cursors
• SQL deals with tables with multiple rows
– No such data structure exist in traditional
programming languages such as C.
•
•
SQL supports a mechanism called a cursor to
handle this.
Can declare a cursor on a query statement
Can open a cursor
– Repeatedly fetch a tuple
• until all tuples have been retrieved.
– Can also modify/delete tuple pointed to by a cursor.
–
4
[EN] Cursor Eg Figure 13.3
• Get name of Dept, give each employee in that Dept a raise,
amount of raise input by user
5
Transaction Support in SQL
• A single SQL statement is always considered to
be atomic
– Either the statement completes execution without
error or it fails and leaves the database unchanged.
• [SKS] In number of situations, every SQL
statement also commits implicitly, if it executes
successfully
– Implicit commit can be turned off by a database
directive
• E.g. in JDBC,
connection.setAutoCommit(false);
6
Transactions in Embedded SQL
• When running embedded SQL statements,
different statements can be bundled together
into a transaction
• With embedded SQL, there is no explicit Begin
Transaction statement.
– Transaction initiation is done implicitly when
particular SQL statements (eg: SELECT,
CURSOR, CREATE TABLE etc.) encountered.
7
Transactions in Embedded SQL
• Every transaction must have one of these
explicit end statement :
– EXEC SQL COMMIT
– EXEC SQL ROLLBACK
• Typical pattern:
– SQL STMT, SQL STMT, …, COMMIT: T1
– SQL STMT, SQL STMT, …, COMMIT: T2
– SQL STMT, SQL STMT, …, COMMIT: T3
• If program crashes or ends without a COMMIT
or ROLLBACK, system dependent default
8
Transactions in Embedded SQL
• Locking not done by users/transactions
– i.e. transaction does not request locks or release
– done implicitly by DBMS
• Eg: strict 2PL
• But user can pick among some options in SQL
– User can go with SERIALIZABLE
– But can also go with a weaker consistency option
• What happen to locks at
COMMIT/ROLLBACK ?
• DBMS releases locks held by transaction
9
Sample SQL Transaction
EXEC SQL whenever sqlerror go to UNDO;
EXEC SQL SET TRANSACTION
READ WRITE
ISOLATION LEVEL SERIALIZABLE;
EXEC SQL INSERT
INTO EMPLOYEE (FNAME, LNAME, SSN, DNO, SALARY)
VALUES ('Robert','Smith','991004321',2,35000);
EXEC SQL UPDATE EMPLOYEE
SET SALARY = SALARY * 1.1
WHERE DNO = 2;
EXEC SQL COMMIT;
GOTO THE_END;
UNDO: EXEC SQL ROLLBACK;
THE_END: ...
10
Allowing non-serializability
• Some applications can live with non-serializable
schedules. Why ?
• Tradeoff accuracy for performance i.e. if “total”
accuracy (consistency) is not important, get better
performance through more concurrency
– Eg: a read-only transaction that wants to get an
approximate total balance of all accounts
– Eg: database statistics computed for query
optimization can be approximate (why?)
– Such transactions need not be serializable with respect
to other transactions
• SQL allows this
11
Transaction Characteristics SQL
• The more the concurrency, the less a
transaction is protected from other transactions.
• In SQL, user is allowed to make decide how
much concurrency to allow by a SET
TRANSACTION statement
• In this statement, user can specify
– Access mode
– Isolation level
12
Transaction Characteristics SQL
• SET TRANSACTION [access mode]
[isolation level]
• Eg:
SET TRANSACTION
READ WRITE
ISOLATION LEVEL READ COMMITTED
• Isolation level can be the following : these are
ordered from more to less concurrency
–
–
–
–
READ UNCOMMITTED
READ COMMITTED
REPEATABLE READ
SERIALIZABLE
13
Transaction Characteristics SQL
• Access mode:
– READ ONLY
– READ WRITE
• The default is READ WRITE
– unless the isolation level of READ UNCOMITTED
is specified
– in which case READ ONLY is required. Why?
• READ UNCOMITTED is so “dangerous” (so
much concurrency) that it is allowed only for T
that are in READ ONLY mode
– i.e. not allowed to WRITE. Why ?
14
Transaction Characteristics SQL
• Suppose T2 is READ UNCOMITTED and is
allowed to write
–
–
–
–
–
–
T1 writes x
T2 reads x
T2 writes y (y = y+x)
T2 commits
T3 reads y
T1 aborts. Now what ?
• T2 will have to be aborted – but is committed
transaction which has changed the database
– how can we undo effects of T2
15
Nonserializable behaviors: dirty read
• Each of the isolation level allows/does not allow
certain “undesirable” behaviors
– Allows for more concurrency
• What are these undesirable behaviors?
• Dirty Read: T reads a value that was written by
T’ which has not yet committed
– We saw this can lead to problems
• Even if T is READ ONLY still can have
inconsistencies:
– Eg Fig 21.3 (c) on next slide
16
FIGURE 21.3 (c) The incorrect summary problem.
If one transaction is calculating an aggregate summary function on a
number of records while other transactions are updating some of these
records, the aggregate function may calculate some values before they
are updated and others after they are updated.
17
Nonserializable behaviors:
nonrepeatable read
T1
T2
r(y)
w (y);
commit;
r(y)
• Values of y read by T1 are different
– even though T1 has not changed the value of y
• Could not happen in serial execution
• Nonrepeatable Read: Allowing another
transaction to write a new value between
multiple reads of one transaction.
18
Nonserializable behaviors: phantoms
• T1: SELECT SSN FROM EMP
WHERE DNO = 3;
– Suppose gets back ssn = 2345
• T2: INSERT INTO EMP(SSN, DNO)
VALUES (1234, 3);
• T1: SELECT SSN FROM EMP
WHERE DNO = 3;
– Will get back ssn = 2345, 1234
• Could not happen with serial execution
19
Nonserializable behaviors: phantoms
• Why would something like this happen ?
• Suppose T1 gets a lock on all rows of Emp
which match query, but not on all of Emp
• Between two SELECTs of T1, T2 does an
INSERT in Emp
– Can do because T1 does not have a lock on entire
Emp table
• Would not happen if T1 was locking entire
Emp table
– Or the access path. Eg: index
20
Nonserializable behaviors: phantoms
•
•
•
•
Phantoms:
T1 gets some tuples via SELECT statement
T2 makes changes
T1 uses the same SELECT statement
– But gets a different result
• Different from unrepeatable read
– Not the particular tuple which T1 read first time
which is different
– But could have additional tuples or fewer tuples
21
Possible violation of
serializabilty:
Type of Violation
Isolation
level
_____________________
READ UNCOMMITTED
READ COMMITTED
REPEATABLE READ
SERIALIZABLE
___________________________________
Dirty
nonrepeatable
read
read
phantom
_____ _________ ____________________
yes
yes
yes
no
yes
yes
no
no
yes
no
no
no
22
X-locks
• In all cases (including READ UNCOMMITTED)
– Have to get X locks before writing
– X locks are held till the end
• DBMS has to guaranteed that all T are following
this
23
Read uncommitted
• We know that all transactions
– Have to get X locks before writing
– X locks are held till the end
• How could a dirty read take place ?
–
–
–
–
T1 writing x
T1 not committed
T1 will be holding X lock till commits
So how could T2 read ?
• T2 does not get S lock before reading
24
Read committed
• How to ensure no dirty read ?
• DBMS forces transactions to get S locks before reading.
Why is this enough ?
• Look at Eg:T1 writing, T2 reading
– Want to ensure T2 can’t read from uncommitted T1 write
– T2 has to get S lock before reading. Why enough ?
• Combined with fact that T1 holds X lock till end
–
–
–
–
T2 will have to wait till get S lock
Can’t do till T1 gives up X lock
Won’t happen till T1 commits
So T2 can’t read till T1 has committed
25
Read committed
• But S lock can be released when read is done
– Don’t have to wait till commit
– What is the effect of giving up S lock ?
• Unrepeatable read possible. Why ?
• Conflict pairs (r1(y), w2(y)), (w2(y), r1(y))
possible. How ?
• S1(y), r1(y),U1(y), then ?
• X2(y), w2(y),U2(y),C2, r1(y).
• Two values of y that T1 reads can be different.
• How to prevent this ?
26
Repeatable read
• How can we guarantee repeatable read
• All S locks are held till end of T.
– Why does this guarantee repeatable read ?
•
•
•
•
•
•
Is (r1(y), w2(x)), (w2(y), r1(y)) possible ?
No. Why ?
Suppose T1 read y before T2 wrote y
T1 will keep S-lock till it commits
T2 can’t get X-lock till after T1 commits
(w2(y), r1(y)) not possible
27
Serializable
• Difference between serializable and repeatable
read ?
• With repeatable read could have phantoms
– How to stop
• Get locks are at a high enough levels
– Eg: entire table
– Eg: index
28
Weak Levels of Consistency [SKS]
• Degree-two consistency:
–
–
–
–
S-locks and X-locks may be acquired at any time
S-locks may be released any time
X-locks must be held till end of transaction
Is serializability guaranteed ?
• No: similar to Eg we saw earlier
– T1 gets S-lock on y, R1(y), T1 releases S-lock on y
– T2 gets X-lock on y, W2(y), T2 commits and releases
X-lock on y
– T1 gets S-lock on y, R1(y)
– Non-repeatable read, but read committed guaranteed
29
Weak Levels of Consistency [SKS]
• Cursor stability:
– For reads, each tuple is locked, read, and lock is
immediately released
– X-locks are held till end of transaction
– Special case of degree-two consistency
• In SQL standard, serializable is default
• In many DBMS, read committed is default
– explicitly change to serializable when required
– most DBMS implement read committed read
committed as cursor-stability
30
SQL Server Transaction Isolation Levels
•
Locking in Microsoft SQL Server, Alexander Chigrik
– http://www.mssqlcity.com/Articles/Adm/SQL70Locks.htm
• “ There are four isolation levels:
– READ UNCOMMITTED , READ COMMITTED , REPEATABLE
READ , SERIALIZABLE
• Microsoft SQL Server supports all of these Transaction Isolation
Levels
• READ UNCOMMITTED
– When it's used, SQL Server not issue shared locks while reading data. So,
you can read an uncommitted transaction that might get rolled back later.
This isolation level is also called dirty read. This is the lowest isolation
level.
• READ COMMITTED
– This is the default isolation level in SQL Server. When it's used, SQL
Server will use shared locks while reading data. It ensures that a physically
corrupt data will not be read and will never read data that another
application has changed and not yet committed, but it not ensures that the
data will not be changed before the end of the transaction.”
31
Oracle Transaction Isolation Levels
Oracle® Database Concepts, 10g Release 2 http://download.oracle.com/docs/cd/B19306_01/server.102/b14220/consist.htm
• “Oracle provides these transaction isolation levels.
•
•
Read-only: Read-only transactions see only those changes that were committed at the
time the transaction began and do not allow INSERT, UPDATE, and DELETE
statements.
Read committed: This is the default transaction isolation level. Each query executed by
a transaction sees only data that was committed before the query (not the transaction)
began. An Oracle query never reads dirty (uncommitted) data.
– Because Oracle does not prevent other transactions from modifying the data read by a query,
that data can be changed by other transactions between two executions of the query. Thus, a
transaction that runs a given query twice can experience both nonrepeatable read and
phantoms.
– The default isolation level for Oracle is read committed. This degree of isolation is
appropriate for environments where few transactions are likely to conflict. Oracle causes
each query to run with respect to its own materialized view time, thereby permitting
nonrepeatable reads and phantoms for multiple executions of a query, but providing higher
potential throughput. Read committed isolation is the appropriate level of isolation for
environments where few transactions are likely to conflict.
•
Serializable: Serializable transactions see only those changes that were committed at
the time the transaction began, plus those changes made by the transaction itself
through INSERT, UPDATE, and DELETE statements. Serializable transactions do not
experience nonrepeatable reads or phantoms.”
32
Oracle Transaction Isolation Levels
•
“Serializable isolation is suitable for environments:
–
–
–
•
Serializable isolation permits concurrent transactions to make only those database changes they
could have made if the transactions had been scheduled to run one after another.
–
•
Oracle gives the application developer a choice of two transaction isolation levels with different
characteristics. Both the read committed and serializable isolation levels provide a high degree of
consistency and concurrency. Both levels provide the contention-reducing benefits of Oracle's read
consistency multiversion concurrency control model and exclusive row-level locking implementation and
are designed for real-world application deployment.
Transaction Set Consistency
–
•
Specifically, Oracle permits a serializable transaction to modify a data row only if it can determine that
prior changes to the row were made by transactions that had committed when the serializable transaction
began
Comparison of Read Committed and Serializable Isolation
–
•
With large databases and short transactions that update only a few rows
Where the chance that two concurrent transactions will modify the same rows is relatively low
Where relatively long-running transactions are primarily read only
A useful way to view the read committed and serializable isolation levels in Oracle is to consider the
following scenario: Assume you have a collection of database tables (or any set of data), a particular
sequence of reads of rows in those tables, and the set of transactions committed at any particular time. An
operation (a query or a transaction) is transaction set consistent if all its reads return data written by the
same set of committed transactions. An operation is not transaction set consistent if some reads reflect the
changes of one set of transactions and other reads reflect changes made by other transactions. An
operation that is not transaction set consistent in effect sees the database in a state that reflects no single
set of committed transactions.
Oracle provides transactions executing in read committed mode with transaction set consistency
for each statement. Serializable mode provides transaction set consistency for each transaction.”
33
Transactions and Constraints
• DNO is a Foreign Key from Emp to Dept
• MGRSSN is a F. Key from Dept to Emp
• Which one of Dept or Emp to enter first ???
– Can’t enter either. Why ?
• One of the Foreign Keys violated
• Generally, DBMS checks constraint as soon as
SQL statement is executed
• But can tell DBMS to wait till end of T to
check constraint
34
Transactions and Constraints
FOREIGN KEY DNO
REFERENCES DEPT (DNUMBER) DEFFERED
• Insert into Dept, insert into Emp will be done in the same
transaction
• Foreign key constraint will be checked only at the end of the
transaction
35
Review: The ACID properties
•
A tomicity:
All actions in T happen, or none
happen.
•
C onsistency:
•
I solation:
If each T is consistent, and the DB
starts consistent, it ends up consistent.
Execution of one T is isolated from that
of other T.
•
D urability:
If a T commits, its effects persist.
• The Recovery Manager guarantees Atomicity &
Durability.
36
Recovery and Atomicity
• T transfers $50 from account A to account B
– goal is either to perform all database
modifications made by T or none at all.
• Several operations required for T
– failure may occur after one of these modifications
have been made but before all of them are made.
• To ensure atomicity despite failures, what do
we need ?
• We first output information describing the
modifications to stable storage
– before modifying the database itself.
37
Recovery and Durability
• Durability: If the system crashes, what
is the desired behaviour after the
system restarts ?
• T1, T2 & T3 should
be durable.
• T4 & T5 should be
rolled back(effects
not seen).
T1
T2
T3
T4
T5
crash!
38
Transactions: System Concepts
• Transaction: For recovery purposes, system
needs to keep track of when the transaction
starts, terminates, and commits or aborts.
• Transaction states:
– Active state
– Partially committed state: finished all ops, ready to
commit
– Committed state
– Failed state
– Terminated State
39
FIGURE 21.4
State transition diagram illustrating the states
for transaction execution.
40
Storage Structure
• Volatile storage:
– does not survive system crashes
– examples: main memory, cache memory
• Nonvolatile storage:
– survives system crashes
– examples: disk, tape, flash memory,
• Stable storage:
– a mythical form of storage that survives all
failures
– approximated by maintaining multiple copies on
distinct nonvolatile media
41
Model of Where Data is Stored [SKS]
• Block : a contiguous sequence of sectors from
a single track
– data transferred between disk, RAM in blocks
• Physical blocks are those blocks residing on
the disk.
• Buffer blocks are the blocks residing
temporarily in RAM.
• We assume, for simplicity, that each data item
fits in, and is stored inside, a single block.
42
Blocks
• DBMS tries to minimize # block transfers
between the disk and memory.
– Various kinds of optimizations possible
• Tradeoffs in having small/large blocks?
• Smaller blocks: more transfers from disk
• Larger blocks: more space wasted due to
partially filled blocks
43
Blocks
• Typical block sizes range from 4K to 16K
• Blocks size fixed by OS during disk
formatting and can’t be changed.
• Hardware Address: Physical block address
supplied to disk I/O hardware. Consists of:
– a cylinder number (imaginary collection of tracks
of same radius from all recoreded surfaces)
– track (surface) number (within the cylinder)
– block number (within track).
• Logical Block Address: a block number
which is mapped by disk controller to
hardware address
44
How is disk I/O done
• We can reduce # disk accesses by keeping as
many blocks as possible in main memory.
• Buffer – portion of main memory available to
store copies of disk blocks.
• Buffer manager – subsystem responsible for
allocating buffer space in main memory.
• OS provides logical block address and address
(where block is to go) in RAM to device driver
45
Read block: Buffer Manager (BM)
• Program calls on BM when needs a block from disk.
• If block already in buffer, BM returns the address of
the block in main memory. If not in buffer ?
• Find space in the buffer for the block. If free space,
that gets allocated. Otherwise ?
• Replace another block to make space
– Replacement policy
• Reads block from the disk to the buffer, and returns
the address of block in RAM to requester.
46
Model of Where Data is Stored
• Block movements between disk and main
memory are initiated through the following
two operations:
• input(B) transfers the physical block B
to main memory.
• output(B) transfers the buffer block B to
the disk, and replaces the appropriate
physical block there.
• Each transaction T has its private work-area
in which local copies of all data items
accessed and updated by it are kept.
47
Example of Where Data is Stored
buffer
Buffer Block A
X
Buffer Block B
Y
input(A)
A
output(B)
B
read(X)
write(Y)
x1
x2
y1
work area
of T1
work area
of T2
memory
disk
48
Model of Where Data is Stored
• Transaction transfers data items between
DBMS buffer blocks and its private workarea using the following operations :
• read(X) assigns the value of data item X to
the local variable x.
• write(X) assigns the value of local variable x
to data item X in the buffer block.
• Both read and write may lead to the issue of
an input(BX) instruction (bring to RAM)
– if BX in which X resides is not already in RAM
49
Model of Where Data is Stored
• Transactions perform read(X) while accessing
X for the first time.
• Subsequent accesses (getting value of X and
changing value X) are to the local copy x.
– read twice from buffer block if non-repeatable read
• When transaction executes a write(X) statement
– Copies local x to data item X in the buffer block.
• write(X) may not always be immediately
followed by output(BX).
– System can perform the output operation when it
deems fit. Why not always immediately ?
• Efficiency : don’t want too many disk writes
50
How Data is Stored [Elmasri]
• Who controls when a page from RAM is
written back (flushed) to DISK
• Typically done by OS in other situations
• We assume DBMS has partial or total control
– by calling low-level OS functions
• DBMS cache : collection of RAM buffers
kept by DBMS for pages of DBMS
• If need to bring in additional pages and
buffers all used up, DBMS decides which
pages to flush
– Could use FIFO, LRU strategy etc.
51
How Data is Stored [Elmasri]
• Cache Manager(CM): controls DBMS cache
– Data items to be modified are first stored into
database cache by CM
– At “some point” CM flushes modified items to disk.
• Dirty Bit: Indicates whether data item has been
modified or not. Why is this important ?
• If not dirty, don’t need to flush
• Pin-Unpin Bit: Instructs the operating system
not to flush the data item. Why useful?
• Eg: change made by uncommitted T
– May not want change made to disk block
52
When are changes made to disk
• We assume when T writes an item, change is
immediately made to block in cache (RAM)
• When will this change be reflected in the
block on disk ? Three possibilities:
1. As soon as a data item is modified in cache,
the disk copy could be updated.
2. When transaction commits
3. Could be at a later time:
• Eg: after fixed number of transactions have
committed.
53
Recovery Algorithms
• Recovery algorithms are techniques to
ensure database consistency and transaction
atomicity and durability despite failures
• Recovery algorithms have two parts
1.Actions taken during normal transaction
processing
• to ensure enough information exists to recover
from failures
2.Actions taken after a failure
• to recover the database contents to a consistent
state : ensures atomicity and durability.
54
Assumptions
• Concurrency control is in effect.
– When needed, will assume Strict 2PL
• In-place update: The disk version of the data
item is overwritten by the cache version.
– old data overwritten on (deleted from) the disk.
• Alternative is Shadow update: The modified
version of a data item does not overwrite the
old disk copy but is written at a separate disk
location.
• Don’t have to worry about undos because
original is still saved
• We will work with in-place update
55
Database Recovery
Transaction Log
For recovery from failure, old data value prior to
modification (BFIM - BeFore Image) and the new
value after modification (AFIM – AFter Image) are
needed. These values, and other information, is stored
in a sequential file called Transaction log. A sample log
is given below. Back P and Next P point to the
previous and next log records of the same transaction.
In first row, Next P = 1 should be a 2
T ID Back P Next P Operation Data item
Begin
T1
0
1
T1
1
4
Write
X
Begin
T2
0
8
T1
2
5
W
Y
T1
4
7
R
M
T3
0
9
R
N
T1
5
nil
End
BFIM
AFIM
X = 100 X = 200
Y = 50 Y = 100
M = 200 M = 200
N = 400 N = 400
56
UNDO/REDO
• Transaction Roll-back (Undo) and RollForward (Redo)
• Undo: how to do ?
• Undo: Restore all BFIMs on to disk (Remove
all AFIMs).
• Redo: How to do?
• Redo: Restore all AFIMs on to disk.
• Database recovery is achieved either by
performing only Undos or only Redos or by a
combination of the two. These operations are
recorded in the log as they happen.
57
Rollback example
•We show the process of roll-back with the help of the following
three transactions T1, and T2 and T3.
•Strict 2PL not being followed here
T1
read_item (A)
read_item (D)
write_item (D)
T2
read_item (B)
write_item (B)
read_item (D)
write_item (A)
T3
read_item (C)
write_item (B)
read_item (A)
write_item (A)
58
Rollback example
Roll-back: One execution of T1, T2 and T3 as recorded in the log.
T3
READ(C) WRITE(B)
BEGIN
READ(B)
T2
BEGIN
READ(A)
WRITE(B)
READ(D) WRITE(D)
READ(A) READ(D) WRITE(D)
T1
BEGIN
Time
system crash
Illustrating cascading roll-back
59
Rollback example
Roll-back: One execution of T1, T2 and T3 as recorded in the log.
A
30
B
15
[start_transaction, T3]
[read_item, T3, C]
* [write_item, T3, B, 15, 12]
12
[start_transaction,T2]
[read_item, T2, B]
** [write_item, T2, B, 12, 18]
18
[start_transaction,T1]
[read_item, T1, A]
[read_item, T1, D]
[write_item, T1, D, 20, 25]
[read_item, T2, D]
** [write_item, T2, D, 25, 26]
[read_item, T3, A]
Abort T3
What should happen with the different transaction ?
•
C
40
D
20
25
26
T3 rolled back and then T2 has to be rolled back because T2 read a value (B) written by
T3
60
Write-Ahead Logging
• When in-place update (old value overwritten by new
value) is used then log is necessary for recovery
• Done by Write-Ahead Logging (WAL) protocol.
• To make sure Undo can be done, what do we need?
– Before a data item’s AFIM is flushed to the disk (overwriting
the BFIM)
– Its BFIM must be written to the log
– Log must be saved on disk (force-write the log). Why?
• If T writes X, X on disk changed, system crashes
– how will we recover ?
61
Write-Ahead Logging
• Need BFIM of X to Undo, will be in log.
– But log in RAM vanishes with system crash
– So need to force write log
• What do we need to for Redo ?
• For Redo: Before a transaction executes its
commit operation
– its AFIMs must be written to log
– log must be force written on disk. Why ?
• Need AFIM of X to Redo, will be in log.
– But log in RAM vanishes with system crash
– So need to force write log
62
Checkpoints
•
Periodically flush buffers to disk
–
–
Simplifies recovery: “old” stuff guaranteed on disk
When to do ? Under some criteria:
• Eg: when certain #T commit
• Eg: at fixed time intervals
• checkpoint : carry out following steps :
1. Suspend execution of transactions temporarily.
2. Force write modified buffer data of committed
transactions to disk
3. Write checkpoint record to log
4. Save log to disk.
5. Resume normal transaction execution.
• During recovery Redo or Undo required only for
transactions active after last checkpoint.
–
Eg next slide
63
Immediate Update –Concurrent Users
T1
T3
T2
T4
T5
t1
Time
checkpoint
t2
system crash
Recovery in a concurrent users environment.
• What actions will have to be taken when the system crashes at
time t2 ?
– What to do about each of the transactions ? T1 ?
• T1 committed before checkpt: do nothing. T4 ?
• T4: not committed when crash : UNDO. T5 ?
• T5: not committed when crash : UNDO. T3?
• T3: committed after checkpt : REDO. T2 ?
• T2 committed after checkpt : REDO.
64
Steal/No-Steal
• Steal/No-Steal and Force/No-Force : Possible ways
for flushing database cache to disk.
• Issue: before T commits, can changes made by
T to buffer blocks be flushed to disk?
– Eg: if Cache Manager needs space to bring in new
pages.
– Not asking if always flushed, but IF POSSIBLE
• Steal: Cache can be flushed before T commits.
• No-Steal: Cache cannot be flushed before T
commits.
• Why called Steal ?
65
Steal/No-Steal
•
•
•
•
•
•
Advantage of No-Steal ?
Don’t have to Undo. Why?
Pages not written out on disk before commit
Advantage of Steal ?
If No Steal, then poor throughput. Why?
Cache Manager not allowed to write pages till
commit
• Has to keep all uncommitted writes in buffer
• Buffer space limited, will allow only small #
T, so poor throughput
66
Force/No-Force
• Issue: when T commits, do pages modified by T
HAVE TO be immediately flushed out to disk?
• Force: Have to flush immediately on commit.
• No-Force: Don’t have to immediately flush
– CM decides when to flush changes made by
committed transactions to disk
• Difference between Steal and Force issues:
– Steal/No-steal: is it possible that pages get flushed to
disk before commit
– Force/No-force: is it guaranteed that pages get
flushed to disk immediately on commit
67
Force/No-Force
•
•
•
•
•
•
Advantage of Force ?
No need to redo. Why ?
On COMMIT, immediately written to disk
Advantage of No-Force ?
Less writes to disk . Why ?
Eg: Consider Force and No-Force: initially x = 2.
– T1 writes x = 3, commit. T2 writes x = 4, commit.
• With Force will have to do 2 disk writes:
– 1 when T1 commits, 1 when T2 commits
• With No-Force maybe 2 or maybe only 1 disk write:
– When T1 commits, modified x (x = 3) in buffer
– May not be written to disk
68
Steal/No-Steal and Force/No-Force
• These give rise to four different ways for
handling recovery:
• No-Steal/Force. In terms of Undo/Redo ?
• (No-undo/No-redo).
• Steal/Force. In terms of Undo/Redo ?
• (Undo/No-redo)
• No-Steal/No-Force. In terms of Undo/Redo ?
• (No-undo/Redo)
• Steal/No-Force. In terms of Undo/Redo ?
• (Undo/Redo)
69
Steal/No-Steal and Force/No-Force
• Typical DBMS uses
Steal/No Force strategy
– Steal: Good throughput
– No Force: less disk I/O
• Immediate Update:
Steal/No Force strategy.
– Have to UNDO/REDO ?
No Steal
Force
No Force
Steal
Trivial
Desired
• Steal: have to UNDO
• No-Force: have to
REDO
70
When is log force-written ?
• What are the different times at which we have
to force-write log to the disk ?
• At checkpoint
• When commit is issued
• In case of STEAL ?
• When database block gets stolen
– Flushed out to disk
– Why ?
• In case we need to UNDO
71
Deferred Update
• Deferred Update: No-Steal/No-Force
• No-Steal: RAM blocks not flushed to disk till T
commits
– No Undo
• No-Force: May not be flushed even when T commits
– How will system recover from system crash ?
– Redo needed for those T for which commit was issued and
whose changes have not been recorded on disk blocks
– After reboot from a failure, log is used to redo all the
transactions affected by this failure.
• [EN] covers Deferred Update in detail
– we will not discuss
72