Transaction Manager
Concurrency Control
Recovery Management
Transactions
A transaction is a unit of program execution that accesses and
possibly updates various data items.
[ A transaction program is a collection of operations that form a
single unit of work.]
Clearly, it is essential that all these operations occur, or
that, in case of failure, none occur. A database system must
ensure proper execution of transactions despite failures – either
the entire transaction executes, or none of it does. Furthermore, it
must manage concurrent execution of transactions in a way that
avoids the introduction of inconsistency.
Transactions
A transaction is a processing program, written in a high-level data
manipulation language, that updates data in the database system. The DBMS
must guarantee that when a transaction finishes, the data is left in a
complete and correct state. That is, if the database was in a correct state
before the update, it must still be in a correct state after the
transaction has been processed.
Collections of operations that form a single logical unit of work are called
transactions. The DBMS must ensure proper execution of transactions despite
failures: either the entire transaction executes, or none of it does. Furthermore,
it must manage concurrent execution of transactions in a way that avoids the
introduction of inconsistency.
Architecture of a TPS Application
[Figure: architecture of a TPS application. Notice of event -> keyed
transaction -> TPS program, which works against TPS data and produces event
responses and report(s).]
The event is recorded by keying it into the computer system as a transaction,
which is a representation of the event. One or more TPS programs process
the transaction against TPS data. The TPS program generates two types of
output. It sends messages back to the user terminal, and it generates printed
documents.
Transaction State
A transaction may not always complete its execution successfully.
Such a transaction is termed aborted. If we are to ensure the
atomicity property, an aborted transaction must have no effect on
the state of the database. Thus, any changes that the aborted
transaction made to the database must be undone. Once the
changes caused by an aborted transaction have been undone, we
say that the transaction has been rolled back.
[Figure: transaction state diagram with the states active, partially
committed, committed, failed, and aborted.]
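The five states can be sketched as a small state machine. The state names come from the slide; the allowed transitions are an assumption (the usual textbook arrows), since the arrows of the figure did not survive in the transcript. The names `TxState`, `TRANSITIONS`, and `advance` are hypothetical.

```python
from enum import Enum


class TxState(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"


# Assumed transitions: a transaction can fail while active or after its
# last statement; a failed transaction is rolled back and becomes aborted.
TRANSITIONS = {
    TxState.ACTIVE: {TxState.PARTIALLY_COMMITTED, TxState.FAILED},
    TxState.PARTIALLY_COMMITTED: {TxState.COMMITTED, TxState.FAILED},
    TxState.FAILED: {TxState.ABORTED},
    TxState.COMMITTED: set(),
    TxState.ABORTED: set(),
}


def advance(current, target):
    """Return the new state, or raise if the transition is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

For example, `advance(TxState.ACTIVE, TxState.FAILED)` is accepted, while any move out of `TxState.COMMITTED` raises an error.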
Transactions access data using two operations:
• read(X), which transfers the data item X from the
database
to a local buffer belonging to the transaction that executed
the read operation.
• write(X), which transfers the data item X from the local
buffer of the transaction that executed the write back to the
database.
In a real database system, the write operation does not
necessarily result in the immediate update of the data on the
disk; the write operation may be temporarily stored in memory
and executed on the disk later. For now, however, it is assumed
that the write operation updates the database immediately.
Transaction Concepts
Usually, a transaction is initiated by a user program written
in high-level DML or programming language, where it is delimited
by statements (or function calls) of the form begin transaction and
end transaction. The transaction consists of all operations executed
between the begin transaction and end transaction.
To ensure integrity of the data, we require that the database
system maintain ACID properties of the transactions:
ACID properties of Transaction ensured by DBMS
Atomicity. Either all operations of the transaction are reflected
properly in the database, or none are.
Consistency. Execution of a transaction in isolation (that is, with
no other transaction executing concurrently) preserves the
consistency of the database.
Isolation.
Even though multiple transactions may execute
concurrently, the system guarantees that, for every pair of
transactions Ti and Tj, it appears to Ti that either Tj finished
execution before Ti started, or Tj started execution after Ti finished.
Thus each transaction is unaware of other transactions executing
concurrently in the system.
Durability. After a transaction completes successfully, the changes
it has made to the database persist, even if there are system failures.
Atomicity: A failure (power failure, hardware failure, or software
error) during a transaction can leave the system in a state that no
longer reflects a real state of the world that the database is
supposed to capture. We term such a state an inconsistent
state. We must ensure that such inconsistencies are not
visible in a database system. [The system must at some
point be in a temporarily inconsistent state; however, that state
is eventually replaced by a consistent state.]
The basic idea behind ensuring atomicity is this: The
database system keeps track (on disk) of the old values of any data
on which a transaction performs a write, and, if the transaction does
not complete its execution, the database system restores the old
values to make it appear as though the transaction never executed.
Ensuring atomicity is the responsibility of the database system
itself; specifically, it is handled by a component called the
transaction-management component.
Consistency: Ensuring consistency for an individual transaction is
the responsibility of the application programmer. This task may be
facilitated by automatic testing of integrity constraints.
Isolation: Even if the consistency and atomicity properties are
ensured for each transaction, if several transactions are executed
concurrently, their operations may interleave in some undesirable
way, resulting in an inconsistent state.
A way to avoid the problem of concurrently executing
transactions is to execute transactions serially, that is, one after
the other. However, concurrent execution of transactions provides
significant performance benefits.
The isolation property of a transaction ensures that
the concurrent execution of transactions results in a system
state that is equivalent to a state that could have been obtained
had these transactions executed one at a time in some order.
Ensuring the isolation property is the responsibility of a
component of the database system called the concurrency-control component.
Durability: We assume that a failure of the computer system may
result in loss of data in the main memory, but data written to
disk are never lost. DBMS can guarantee durability by
ensuring that either :
1. The updates carried out by the transaction have been
written to disk before the transaction completes.
2. Information about the updates carried out by the
transaction and written to disk is sufficient to enable the
database to reconstruct the updates when the database system
is restarted after the failure.
Ensuring durability is the responsibility of a component of
the database system called the recovery-management
component.
The DBMS must maintain the following properties of transactions:
Atomicity : Once a transaction starts, it must run every statement through
to completion. If it cannot complete, the database must look as though the
transaction had never done anything: all data values remain as they were
before the transaction was processed, and the transaction is rolled back to
its starting point. Commit and rollback are carried out by the
transaction-management component, which is one component of the DBMS.
The DBMS keeps the old value of every data item on which a transaction
performs a write. If the transaction cannot run to completion (system
failure, program runtime error, ...), the DBMS restores the old values, as
though no processing had ever been applied to those data items. This is the
job of the recovery manager.
Consistency : The DBMS must always guarantee the correctness of the data in
the database, both before and after a transaction is processed. Consistency
can be guaranteed by specifying integrity constraints.
Isolation : Several transactions may process the database at the same time
(concurrent execution), and may even process the same data item. The DBMS
must nevertheless order the accesses of those transactions so that the
result looks like a serial execution. Guaranteeing isolation is the
responsibility of the concurrency-control component, or scheduler.
Durability : Once a transaction has completed successfully, the data must
stay in that state even if system failures occur afterwards. (Here a system
failure means one that causes the data in main memory to be lost but does
not affect data already written to disk.)
[Figure: DBMS transaction subsystem. Components shown: access manager,
transaction manager, scheduler, buffer manager, recovery manager, file
manager, system manager, and the database and system catalog.]
A transaction manager is software that monitors the behavior
of transactions and decides whether each action can be allowed
to execute. The transaction manager coordinates transactions
on behalf of application programs. It communicates with the
scheduler (sometimes referred to as the lock manager). This
module is responsible for implementing a particular strategy
for concurrency control. If a failure occurs during the
transaction, then the database could be inconsistent. It is the
task of the recovery manager to ensure that the database is
restored to a consistent state. Finally, the buffer manager is responsible for
the transfer of data between disk storage and main memory.
Transaction Atomicity in a Single-Transaction System
In a single-transaction system, only one transaction executes at any
time. If a transaction is active, no other transaction can start. This situation is
the same as having one application connected to the database server at a time.
To support atomicity, a database server must support operations to
open a transaction, commit a transaction, and roll back a transaction that
groups one or more SQL commands together. If any command fails, the
transaction manager can roll back all commands, returning the data source to
its original state. If all commands are successful, the transaction manager
commits the changes and makes them permanent.
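As a minimal sketch of grouping commands into one transaction, the example below uses Python's standard `sqlite3` module with an in-memory database. The `account` table, its rows, and the `transfer` helper are assumptions made for illustration and do not appear in the slides.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single transaction."""
    try:
        # The connection used as a context manager commits if the block
        # succeeds and rolls back if an exception escapes from it.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src))
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst))
            row = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)).fetchone()
            if row[0] < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False  # both updates were rolled back


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()
```

A successful `transfer(conn, 1, 2, 30)` changes both rows; a failing `transfer(conn, 1, 2, 500)` leaves both rows exactly as they were, which is the all-or-nothing behaviour described above.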
Concurrent Transaction Processing
Concurrency arises when many applications are executing transactions
at the same time. A single database server processes all operations, so only
one database operation can be processed at a time. However, the operations
of the transactions overlap because independent applications are requesting
service by the database server in parallel.
A schedule is a sequence of the operations by a set of concurrent
transactions that preserves the order of the operations in each of the
individual transactions.
Clearly, a schedule for a set of transactions must consist of
all instructions of those transactions, and must preserve the
chronological order in which instructions appear in each
individual transaction.
A schedule can be serial or non-serial.
Each serial schedule consists of a sequence of instructions from
various transactions, where the operations of each transaction are
executed consecutively without any interleaved operations from
other transactions. For a set of n transactions, there exist n!
different valid serial schedules.
When the database system executes several transactions
concurrently, the corresponding schedule no longer needs to be
serial. The OS must perform context switches (CPU time is shared)
among all transactions that concurrently access the database.
Several execution sequences are possible, since the
various instructions from several transactions may now be
interleaved. The number of possible schedules for a set of n
transactions is much larger than n!.
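The two counts can be checked with a short sketch. The n! figure for serial schedules is from the text; the interleaving count uses the standard multinomial formula (total operations factorial divided by the factorial of each transaction's length), which is not stated in the slides. The function names are hypothetical.

```python
from itertools import permutations
from math import factorial


def serial_schedules(transactions):
    """All serial orders of the given transactions (n! of them)."""
    return [list(order) for order in permutations(transactions)]


def count_interleavings(lengths):
    """Number of schedules that keep each transaction's own order.

    lengths[i] is the number of operations in transaction i.
    """
    total = factorial(sum(lengths))
    for k in lengths:
        total //= factorial(k)
    return total
```

For two transactions with 6 and 7 operations there are only 2 serial schedules, but `count_interleavings([6, 7])` gives 1716 possible schedules.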
Schedule : A sequence of the operations by a set of concurrent transactions that preserves
the order of the operations in each of the individual transactions.
Serial schedule : A schedule where the operations of each transaction are executed
consecutively without any interleaved operations from other transactions.
Serial schedule, T1 followed by T2:
T1                  T2
read(A);
A := A - 50;
write(A);
read(B);
B := B + 50;
write(B);
                    read(A);
                    temp := A * 0.1;
                    A := A - temp;
                    write(A);
                    read(B);
                    B := B + temp;
                    write(B);
Serial schedule, T2 followed by T1:
T1                  T2
                    read(A);
                    temp := A * 0.1;
                    A := A - temp;
                    write(A);
                    read(B);
                    B := B + temp;
                    write(B);
read(A);
A := A - 50;
write(A);
read(B);
B := B + 50;
write(B);
Nonserial schedule : A schedule where the operations from a set of concurrent
transactions are interleaved.
T1                  T2
read(A);
A := A - 50;
                    read(A);
                    temp := A * 0.1;
                    A := A - temp;
                    write(A);
                    read(B);
write(A);
read(B);
B := B + 50;
write(B);
                    B := B + temp;
                    write(B);
If several transactions run concurrently, and control of concurrent
execution is left entirely to the OS, database consistency can be
destroyed despite the correctness of each individual transaction.
We can ensure consistency of the database under concurrent
execution by making sure that any schedule that is executed has the
same effect as a schedule that could have occurred without any
concurrent execution. That is, the schedule should, in some sense,
be equivalent to a serial schedule.
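The effect of the schedules on the sum A + B can be simulated with a small sketch. The operations of T1 and T2 are taken from the slides; the starting values A = 1000 and B = 2000 and the helper names are assumptions, and 10% is computed with integer division so the arithmetic stays exact.

```python
def read(x):
    # Copy item x from the database into the transaction's local buffer.
    return lambda buf, db: buf.__setitem__(x, db[x])


def write(x):
    # Copy item x from the local buffer back to the database.
    return lambda buf, db: db.__setitem__(x, buf[x])


def compute(fn):
    # Local computation that touches only the transaction's buffer.
    return lambda buf, db: fn(buf)


T1 = [read("A"), compute(lambda b: b.update(A=b["A"] - 50)), write("A"),
      read("B"), compute(lambda b: b.update(B=b["B"] + 50)), write("B")]
T2 = [read("A"), compute(lambda b: b.update(temp=b["A"] // 10)),
      compute(lambda b: b.update(A=b["A"] - b["temp"])), write("A"),
      read("B"), compute(lambda b: b.update(B=b["B"] + b["temp"])), write("B")]


def steps(name, ops):
    return [(name, op) for op in ops]


def run(schedule, db):
    """Execute (transaction, operation) pairs in the given order."""
    db = dict(db)
    buffers = {"T1": {}, "T2": {}}
    for txn, op in schedule:
        op(buffers[txn], db)
    return db


START = {"A": 1000, "B": 2000}
serial_t1_t2 = steps("T1", T1) + steps("T2", T2)
serial_t2_t1 = steps("T2", T2) + steps("T1", T1)
# The interleaving of the nonserial schedule shown earlier.
nonserial = (steps("T1", T1[:2]) + steps("T2", T2[:5])
             + steps("T1", T1[2:]) + steps("T2", T2[5:]))
```

Both serial schedules keep A + B at 3000, while the nonserial schedule ends with A = 950 and B = 2100, so 50 has appeared from nowhere: that schedule is not equivalent to any serial one.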
Potential problems caused by concurrency
1. Lost update problem : An apparently successfully completed update operation
by one user can be overwritten by another user.
Initial balance of account 101: 15
Time | Transaction | Operation | balance
1 | T3 | balance1 = (select balance from Customer where accountID = 101); balance1 += 5.00; | 15
2 | T4 | balance2 = (select balance from Customer where accountID = 101); balance2 += 10.00; | 15
3 | T3 | update Customer set balance = ?balance1 where accountID = 101; | 20
4 | T4 | update Customer set balance = ?balance2 where accountID = 101; | 25
5 | T3 | Commit | 25
6 | T4 | Commit | 25
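The lost update above can be replayed step by step. This is only a sketch with a dictionary standing in for the Customer table; the function names are hypothetical.

```python
def lost_update(initial=15):
    """Replay the interleaved schedule: T3 adds 5, T4 adds 10."""
    db = {"balance": initial}
    balance1 = db["balance"] + 5    # Time 1: T3 reads 15, computes 20
    balance2 = db["balance"] + 10   # Time 2: T4 reads the same 15, computes 25
    db["balance"] = balance1        # Time 3: T3 writes 20
    db["balance"] = balance2        # Time 4: T4 writes 25, erasing T3's update
    return db["balance"]


def serial_update(initial=15):
    """Run T3 to completion before T4 starts."""
    db = {"balance": initial}
    db["balance"] = db["balance"] + 5    # T3
    db["balance"] = db["balance"] + 10   # T4 now sees T3's result
    return db["balance"]
```

The interleaved run ends at 25, whereas either serial order would end at 30.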
2. The uncommitted dependency problem : This problem occurs when one transaction
is allowed to see the intermediate result of another transaction before it has committed.
Initial balance of account 101: 15
Time | Transaction | Operation | balance
1 | T3 | balance1 = (select balance from Customer where accountID = 101); balance1 += 5.00; | 15
2 | T3 | update Customer set balance = ?balance1 where accountID = 101; | 20
3 | T4 | balance2 = (select balance from Customer where accountID = 101); balance2 += 10.00; | 20
4 | T3 | Rollback | 15
5 | T4 | update Customer set balance = ?balance2 where accountID = 101; | 30
6 | T4 | Commit | 30
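3. Incorrect summary problem : This problem occurs when one transaction
computes an aggregate over several rows while another transaction is
updating some of those rows.
Time | Transaction | Operation | bal 101 | bal 102
1 | T3 | balance1 = (select balance from Customer where accountID = 101); balance1 += 10.00; | 15 | 15
2 | T3 | update Customer set balance = ?balance1 where accountID = 101; | 25 | 15
3 to 7 | T4 | Total = select sum(balance) from customer where accountID = 101 or accountID = 102 | 25 | 15
  | T4 | Commit | |
  | T3 | balance1 = (select balance from Customer where accountID = 102); balance1 -= 10.00; | |
  | T3 | update Customer set balance = ?balance1 where accountID = 102; | 25 | 5
  | T3 | Commit | 25 | 5
T4 sums the balances after T3 has credited account 101 but before it has
debited account 102, so it sees 25 + 15 = 40 instead of 30.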
A phantom read problem : It occurs when an aggregate operation is repeated by
a transaction and yields a different result because of the insertion of a row by another
transaction.
Time | Transaction | Operation | sum(balance)
1 | T1 | totalA = (select sum(balance) from Customer where zipcode = 31101); | 100
2 | T2 | insert into customer (accountID, balance, zipcode) values (105, 10.00, 31101) | 110
3 | T1 | totalB = (select sum(balance) from Customer where zipcode = 31101); | 110
4 | T2 | rollback | 100
5 | T1 | Commit | 100
A nonrepeatable read problem : It occurs when a transaction reads the same data item more
than once and, between the reads, another transaction modifies the data item.
Initial balance of account 101: 15
Time | Transaction | Operation | balance
1 | T1 | balance1 = (select balance from Customer where accountID = 101); | 15
2 | T2 | update customer set balance = 0.0 where accountID = 101; | 0.0
3 | T1 | balance2 = (select balance from Customer where accountID = 101); | 0.0
Recoverability :
If a transaction fails, the atomicity property requires that we undo the effects
of the transaction. In addition, the durability property states that once a
transaction commits, its changes cannot be undone.
Recoverable schedule :
A schedule where, for each pair of transactions
Ti and Tj, if Tj reads a data item previously written by Ti, then the commit
operation of Ti precedes the commit operation of Tj.
Non-recoverable schedule
Initial balance of account 101: 15
Time | Transaction | Operation | balance
1 | T3 | balance1 = (select balance from Customer where accountID = 101); balance1 += 5.00; | 15
2 | T3 | update Customer set balance = ?balance1 where accountID = 101; | 20
3 | T4 | balance2 = (select balance from Customer where accountID = 101); balance2 += 10.00; | 20
4 | T4 | update Customer set balance = ?balance2 where accountID = 101; | 30
5 | T4 | Commit | 30
6 | T3 | Rollback | 10
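The definition of a recoverable schedule can be turned into a small checker. This is a sketch under stated assumptions: a schedule is encoded as (transaction, action, item) tuples with the actions "r", "w", "c" (commit), and "a" (abort), an encoding that is not in the slides, and an abort is handled in a simplified way by forgetting the aborted transaction's writes.

```python
def is_recoverable(schedule):
    """True if every transaction commits only after all transactions
    whose writes it has read have committed."""
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = {}    # reader -> set of transactions it read from
    committed = set()
    for txn, action, item in schedule:
        if action == "w":
            last_writer[item] = txn
        elif action == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif action == "c":
            if any(w not in committed for w in reads_from.get(txn, ())):
                return False
            committed.add(txn)
        elif action == "a":
            # Simplification: drop the aborted transaction's writes.
            for key in [k for k, w in last_writer.items() if w == txn]:
                del last_writer[key]
    return True


# The non-recoverable schedule above: T4 reads T3's write and commits
# before T3 finishes; T3 then rolls back.
non_recoverable = [
    ("T3", "r", "balance"), ("T3", "w", "balance"),
    ("T4", "r", "balance"), ("T4", "w", "balance"),
    ("T4", "c", None), ("T3", "a", None),
]
recoverable = [
    ("T3", "r", "balance"), ("T3", "w", "balance"),
    ("T4", "r", "balance"), ("T4", "w", "balance"),
    ("T3", "c", None), ("T4", "c", None),
]
```

The second schedule differs from the first only in that T3 commits before T4 does, which is exactly what the definition requires.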
Locking : A procedure used to control concurrent access to data. When one
transaction is accessing the database, a lock may deny access to other
transactions to prevent incorrect results.
Locking methods are the most widely used approach to ensure
serializability of concurrent transactions. There are several variations, but all
share the same fundamental characteristic, namely that a transaction must claim
a read (shared) or write (exclusive) lock on a data item before the corresponding
database read or write operation.
The lock prevents another transaction from modifying the item or
even reading it, in the case of write lock.
Data items of various sizes, ranging from the entire database down
to a field, may be locked. The size of the item determines the fineness, or
granularity, of the lock.
Read lock : If a transaction has a read lock on a data item, it can read the item
but not update it.
Write lock : If a transaction has a write lock on a data item, it can both read and
update the item.
• Any transaction that needs to access a data item must first lock the item,
requesting a read lock for read-only access or a write lock for both read and
write access.
• If the item is not already locked by another transaction, the lock will be
granted.
• If the item is currently locked, the DBMS determines whether the request is
compatible with the existing lock. If a read lock is requested on an item that
already has a read lock on it, the request will be granted; otherwise, the
transaction must wait until the existing lock is released.
• A transaction continues to hold a lock until it explicitly releases it either during
execution or when it terminates (aborts or commits). It is only when the write
lock has been released that the effects of the write operation will be made
visible to other transactions.
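The locking rules above can be sketched as a tiny lock table. The class and method names are hypothetical, and the sketch leaves out lock upgrades, repeated requests by the same holder, and wait queues; a refused request simply returns False to mean "must wait".

```python
class LockManager:
    """Grants read (shared) and write (exclusive) locks on data items."""

    def __init__(self):
        self.locks = {}  # item -> (mode, set of holding transactions)

    def request(self, txn, item, mode):
        """mode is "read" or "write". Returns True if the lock is granted."""
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        # Only read + read is compatible; every other combination waits.
        if mode == "read" and held_mode == "read":
            holders.add(txn)
            return True
        return False

    def release(self, txn, item):
        held = self.locks.get(item)
        if held is None:
            return
        _mode, holders = held
        holders.discard(txn)
        if not holders:
            del self.locks[item]
```

With this table, two readers can share an item, but a writer has to wait until every reader has released it, and any request on a write-locked item has to wait.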
Locking can solve the lost update problem : (An apparently successfully completed
update operation by one user can be overwritten by another user.)
Initial balance of account 101: 15
Time | Transaction | Operation | balance
1 | T3 | Write_lock (balance) | 15
2 | T3 | balance1 = (select balance from Customer where accountID = 101); balance1 += 5.00; | 15
2 | T4 | Write_lock (balance) |
3 | T3 | update Customer set balance = ?balance1 where accountID = 101; | 20
3 | T4 | Wait |
4 | T3 | Commit / Unlock (balance) | 20
4 | T4 | Wait |
5 | T4 | balance2 = (select balance from Customer where accountID = 101); balance2 += 10.00; | 20
6 | T4 | update Customer set balance = ?balance2 where accountID = 101; | 30
7 | T4 | Commit / Unlock (balance) | 30
Locking can solve the uncommitted dependency problem : This problem occurs when one
transaction is allowed to see the intermediate result of another transaction before it has
committed.
Initial balance of account 101: 15
Time | Transaction | Operation | balance
1 | T3 | Write_lock (balance); balance1 = (select balance from Customer where accountID = 101); balance1 += 5.00; | 15
2 | T3 | update Customer set balance = ?balance1 where accountID = 101; | 20
3 | T4 | Write_lock (balance) | 20
4 | T4 | Wait | 20
5 | T3 | Rollback / Unlock (balance) | 15
5 | T4 | Wait |
6 | T4 | balance2 = (select balance from Customer where accountID = 101); balance2 += 10.00; | 15
7 | T4 | update Customer set balance = ?balance2 where accountID = 101; | 25
8 | T4 | Commit / Unlock (balance) | 25
Locking can solve the incorrect summary problem :
Time | Transaction | Operation | bal 101 | bal 102
1 | T3 | Write_lock (balance); balance1 = (select balance from Customer where accountID = 101); balance1 += 10.00; | 15 | 15
2 | T3 | update Customer set balance = ?balance1 where accountID = 101; | 25 | 15
3 | T4 | Write_Lock (balance) | 25 | 15
4 | T3 | balance1 = (select balance from Customer where accountID = 102); balance1 -= 10.00; | 25 | 15
4 | T4 | Wait | |
5 | T3 | update Customer set balance = ?balance1 where accountID = 102; | 25 | 5
5 | T4 | Wait | |
6 | T3 | Commit / Unlock (balance) | 25 | 5
7 | T4 | Total = select sum(balance) from customer where accountID = 101 or accountID = 102 | 25 | 5
8 | T4 | Commit / Unlock (balance) | 25 | 5
If locks are released too early, the database may become inconsistent:
First transaction           Second transaction
Write_Lock (balx);
Read (balx);
balx = balx + 100;
Write (balx);
Unlock (balx);
                            Write_Lock (balx);
                            Read (balx);
                            balx = balx * 1.1;
                            Write (balx);
                            Unlock (balx);
                            Write_Lock (baly);
                            Read (baly);
                            baly = baly * 1.1;
                            Write (baly);
                            Unlock (baly);
                            Commit
Write_Lock (baly);
Read (baly);
baly = baly - 100;
Write (baly);
Unlock (baly);
Commit
Cascading rollback : A situation in which the failure of a single transaction leads to a series of rollbacks.
Cascading rollbacks are undesirable, since they potentially lead to the undoing of a significant amount of
work. Clearly, it would be useful if we could design protocols that prevent cascading rollbacks. One way
to achieve this with two-phase locking is to leave the release of all locks until the end of the transaction.
T1
Write_Lock (balx);
Read (balx);
Read_Lock (baly);
Read (baly);
balx = baly + balx;
Write (balx);
Unlock (balx);
...
Rollback
T2
Write_Lock (balx);
Read (balx);
balx = balx + 100;
Write (balx);
Unlock (balx);
...
Rollback
T3
Read_Lock (balx);
...
Rollback
Two-phase locking (2PL) :
A transaction follows the two-phase locking protocol if all locking operations
precede the first unlock operation in the transaction.
According to the rules of this protocol, every transaction can be divided into
two phases; first a growing phase, in which it acquires all the locks needed but cannot
release any locks, and then a shrinking phase, in which it releases its locks but cannot
acquire any new locks.
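Whether a single transaction obeys the protocol can be checked mechanically: no lock request may appear after the first unlock. The sketch below assumes a transaction is given as a list of (operation, item) pairs; that encoding and the function name are hypothetical.

```python
def follows_2pl(operations):
    """True if every lock operation precedes the first unlock."""
    shrinking = False  # becomes True once the first lock is released
    for op, _item in operations:
        if op == "unlock":
            shrinking = True
        elif op in ("read_lock", "write_lock") and shrinking:
            return False  # tried to acquire a lock in the shrinking phase
    return True


# Releases balx before locking baly, so it is not two-phase.
early_release = [
    ("write_lock", "balx"), ("unlock", "balx"),
    ("write_lock", "baly"), ("unlock", "baly"),
]
# Acquires both locks first, then releases them.
two_phase = [
    ("write_lock", "balx"), ("write_lock", "baly"),
    ("unlock", "balx"), ("unlock", "baly"),
]
```

Here `follows_2pl(early_release)` is False and `follows_2pl(two_phase)` is True.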
Two-phase locking protocol may cause deadlock.
Deadlock : An impasse that may result when two or more transactions are each
waiting for locks held by the other to be released. Neither transaction can continue
because each is waiting for a lock it cannot obtain until the other completes.
Once deadlock occurs, the applications involved cannot resolve the problem.
Instead, the DBMS has to recognize that deadlock exists and break the deadlock
in some way.
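One common way for a DBMS to recognize a deadlock is to look for a cycle in a wait-for graph, in which an edge from Ti to Tj means that Ti is waiting for a lock held by Tj. The slides do not say which detection method is meant, so the wait-for graph and the function below are an illustrative assumption.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None.

    waits_for maps each transaction to the set of transactions it waits on.
    """
    visiting, done = set(), set()

    def visit(txn, path):
        if txn in done:
            return None
        if txn in visiting:
            # txn is already on the current path, so the path loops back.
            return path[path.index(txn):]
        visiting.add(txn)
        for other in sorted(waits_for.get(txn, ())):
            cycle = visit(other, path + [txn])
            if cycle:
                return cycle
        visiting.discard(txn)
        done.add(txn)
        return None

    for txn in sorted(waits_for):
        cycle = visit(txn, [])
        if cycle:
            return cycle
    return None
```

Once a cycle is found, the DBMS can break the deadlock by aborting one of the transactions on it; how the victim is chosen is outside the scope of this sketch.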
Deadlock example : Each of two transactions holds a write lock on one account and
then requests a write lock on the account held by the other, so both wait forever.
Time | Transaction | Operation
1 | first | Write_lock (balance); balance1 = (select balance from customer where accountID = 101); balance1 += 10.00;
2 | first | update Customer set balance = ?balance1 where accountID = 101;
3 | second | Write_lock (balance); balance1 = (select balance from customer where accountID = 102); balance1 -= 10.00;
4 | second | update Customer set balance = ?balance1 where accountID = 102;
5 | first | Write_lock (balance); balance2 = (select balance from customer where accountID = 102);
6 | first | Wait
7 | second | Write_lock (balance); balance2 = (select balance from customer where accountID = 101);
8 | first | Wait
8 | second | Wait
In addition to these rules, some systems permit a transaction to issue
a read lock on an item and then later to upgrade the lock to a write lock.
This effectively allows a transaction to examine the data first and then decide
whether it wishes to update it. If upgrading is not supported, a transaction
must hold write locks on all data items that it may update at some time during
the execution of the transaction, thereby potentially reducing the level of
concurrency in the system.
For the same reason, some systems also permit a transaction to
issue a write lock and then later to downgrade the lock to a read lock.
Granularity of Data Items
Granularity : The size of data items chosen as the unit of protection by a
concurrency control protocol.
A data item is chosen to be one of the following, ranging from
coarse to fine, where fine granularity refers to small item sizes and coarse
granularity refers to large item sizes:
• The entire database.
• A file.
• A page (sometimes called an area or database space: a section of
physical disk in which relations are stored).
• A record.
• A field value of a record.
The size, or granularity, of the data item that can be locked in a single operation
has a significant effect on the overall performance of the concurrency control
algorithm. For example, a lock on the entire database would prevent any other
transaction from executing until the lock is released. Thus, the coarser the
data item size, the lower the degree of concurrency permitted. On the other
hand, the finer the item size, the more locking information needs to be
stored. The best item size depends upon the nature of the transactions.
The solutions to this problem will involve providing a locking mechanism in
the database server. Any restrictions on the concurrency of transactions
will have a negative effect on the number of transactions that can be
executing at any time. This balancing act is a typical trade-off. The more
restrictive the concurrency strategy is, the more reliable it is, and the slower
it is. DBMS designers, database administrators, and application developers
must all carefully consider how much concurrency can be achieved without
sacrificing either speed or reliability.
Timestamp-Based Protocol
A set of rules that controls the execution of the transactions in a
schedule so that they run in a conflict-serializable way. The system
assigns timestamps, which are counters that increase by one clock tick
at a time; the system counts one clock tick every microsecond. When a
transaction starts, it receives a timestamp equal to the current clock
time, and when the transaction executes a READ or WRITE operation, a
timestamp is likewise recorded for that READ or WRITE.
The Timestamp-ordering Protocol
1. Suppose that transaction Ti issues read(Q)
(a) If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q that
was already overwritten. Hence, the read operation is rejected.
(b) If TS(Ti) ≥ W-timestamp(Q), then the read operation is executed,
and R-timestamp(Q) is set to the maximum of R-timestamp(Q) and TS(Ti).
2. Suppose that transaction Ti issues write(Q)
(a) If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is producing
was needed previously, and the system assumed that that value would
never be produced. Hence, the system rejects the write operation.
(b) If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete
value of Q. Hence, the system rejects this write operation.
(c) Otherwise, the system executes the write operation and sets
W-timestamp(Q) to TS(Ti).
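The two rules translate almost line for line into code. In this sketch the class `Item` and the functions `ts_read` and `ts_write` are hypothetical names, timestamps are plain integers, and a rejected operation is reported to the caller instead of triggering a rollback.

```python
class Item:
    """A data item Q with its read and write timestamps."""

    def __init__(self, value):
        self.value = value
        self.r_ts = 0  # R-timestamp(Q)
        self.w_ts = 0  # W-timestamp(Q)


def ts_read(ts, item):
    """Rule 1: returns (True, value) if executed, (False, None) if rejected."""
    if ts < item.w_ts:                   # 1(a): value already overwritten
        return False, None
    item.r_ts = max(item.r_ts, ts)       # 1(b)
    return True, item.value


def ts_write(ts, item, value):
    """Rule 2: returns True if executed, False if rejected."""
    if ts < item.r_ts:                   # 2(a): a later reader needed the old value
        return False
    if ts < item.w_ts:                   # 2(b): obsolete write
        return False
    item.value = value                   # 2(c)
    item.w_ts = ts
    return True
```

For instance, after a transaction with timestamp 5 writes Q, a read by a transaction with timestamp 3 is rejected, while a read with timestamp 7 succeeds and raises R-timestamp(Q) to 7.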
Failure Classification
• Transaction failure. There are 2 types of errors that may
cause a transaction to fail:
Logical error: The transaction can no longer continue with its normal
execution because of some internal condition, such as bad input, data not found,
overflow, or resource limit exceeded.
System error: The system has entered an undesirable state.
• System crash. There is a hardware malfunction, or a bug in the DBMS or OS,
that causes the loss of the content of volatile storage and brings transaction
processing to a halt. The content of nonvolatile storage remains intact.
• Disk failure. A disk block loses its content as a result of either a head crash or
failure during a data transfer operation.
The execution of an SQL statement begins with an implicit request to
open a transaction, followed by the processing of the statement,
followed automatically by a commit request. Rollback happens only
when the SQL statement fails.
An application must make explicit calls to the database
transaction manager to enter explicit-commit mode and allow multiple
SQL statements to execute as a single transaction.
An application executes an open transaction statement (begin
transaction) to ask the transaction manager to create a new transaction
before the next SQL statement executes.
The application executes a commit transaction statement to ask
the transaction manager to commit the transaction.
The application executes a rollback statement to ask the
transaction manager to cancel the transaction.
Storage Hierarchy
The database is stored in nonvolatile storage such as disk. Disk space is
divided into fixed-length storage units called blocks (the unit of data
transferred between disk and main memory).
A block that resides on disk is called a physical block.
A block that resides in main memory is called a buffer block.
The operations that move blocks between disk and main memory are:
Input (X) : moves the physical block containing data item X from disk
into main memory.
Output (X) : moves the buffer block containing data item X to disk.
[Figure: block transfer between disk and main memory. Input (A) brings
block A into main memory; Output (B) writes block B to disk.]
Transaction Ti transfers data between its working area in main memory
and the database on disk using 2 operations:
Read (X) : assigns the value of data item X to the local variable Xi, as
follows:
• If the block Bx on which X resides is not yet in main memory, the
DBMS issues input (X) to bring block Bx in from disk.
• The value of X is assigned to the variable Xi.
Write (X) : assigns the value of the local variable Xi to data item X in
the buffer, as follows:
• If the block Bx on which X resides is not yet in main memory, the
DBMS issues input (X) to bring block Bx in from disk.
• The value of the variable Xi is assigned to data item X.
Both operations may require the transfer of a block from disk to
main memory. However, they do not require the transfer of a block from
main memory to disk.
The output (Bx) operation for the buffer block Bx on which X
resides does not need to take effect immediately after write (X) is
executed, since the block Bx may contain other data items that are still
being accessed.
A buffer block is eventually written out to the disk either
because the buffer manager needs the memory space for other purposes
or because the database system wishes to reflect the change to B on the
disk. (The DBMS performs a force-output of buffer B if it issues an
output (B).)
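The interplay of read, write, input, and output can be sketched with two dictionaries standing in for the disk and the buffer pool. The class `BufferManager` and its layout (block id mapped to a dictionary of data items) are assumptions for illustration only.

```python
class BufferManager:
    """Moves whole blocks between a simulated disk and main memory."""

    def __init__(self, disk):
        self.disk = disk    # block id -> {item: value}, the physical blocks
        self.buffer = {}    # block id -> {item: value}, the buffer blocks

    def input(self, block_id):
        # Copy the physical block from disk into main memory.
        self.buffer[block_id] = dict(self.disk[block_id])

    def output(self, block_id):
        # Copy the buffer block back to disk.
        self.disk[block_id] = dict(self.buffer[block_id])

    def read(self, block_id, item):
        if block_id not in self.buffer:
            self.input(block_id)
        return self.buffer[block_id][item]

    def write(self, block_id, item, value):
        if block_id not in self.buffer:
            self.input(block_id)
        # Only the buffer block changes; the disk is updated by output().
        self.buffer[block_id][item] = value
```

After `write`, the disk still holds the old value; it changes only when `output` is called for that block, which matches the point that write(X) does not force an immediate output.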
Algorithms proposed to ensure database consistency and
transaction atomicity despite failures are known as recovery
algorithms, which have 2 parts:
1: Actions taken during normal transaction processing to ensure
that enough information exists to allow recovery from
failures.
2: Actions taken after a failure to recover the database contents
to a state that ensures database consistency, transaction
atomicity, and durability.
Log-Based Recovery
The most widely used structure for recording database
modifications is the log. The log is a sequence of log records,
recording all the update activities in the database. There are
several types of log records. An update log record describes a
single database write. It has these fields:
• Transaction identifier
• Data-item identifier
• Old value
• New value
Other special log records exist to record significant events during
transaction processing.
Whenever a transaction performs a write, it is essential that the log
record for that write be created before the database is modified. (the
transaction has its own memory that acts like a cache for the modified
data items.)
Once a log record exists, we can output the modification to the
database if that is desirable. Also, we have the ability to undo a
modification that has already been output to the database. We undo it
by using the old-value field in log records.
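The four fields of an update log record, and the use of the old-value field for undo, can be sketched as follows. The record type `UpdateRecord` and the function `undo` are hypothetical names, and the sample values are chosen only for illustration.

```python
from collections import namedtuple

# Transaction identifier, data-item identifier, old value, new value.
UpdateRecord = namedtuple("UpdateRecord", "txn item old new")


def undo(log, db, txn):
    """Restore the old values written by txn, newest record first."""
    for record in reversed(log):
        if record.txn == txn:
            db[record.item] = record.old
    return db


log = [
    UpdateRecord("T0", "A", 1000, 950),
    UpdateRecord("T0", "B", 2000, 2050),
]
db = {"A": 950, "B": 2050}
```

Calling `undo(log, db, "T0")` puts A back to 1000 and B back to 2000. Scanning the log backwards matters when a transaction has written the same item more than once, because the oldest record holds the value from before the transaction started.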
Deferred Database Modification
This technique ensures transaction atomicity by recording all
database modifications in the log, but deferring the execution of
all write operations of a transaction until the transaction partially
commits.
When a transaction partially commits, the information on the log
associated with the transaction is used in executing the deferred
writes. If the system crashes before the transaction completes its
execution, or if the transaction aborts, then the information on the log
is simply ignored.
The execution of transaction Ti proceeds as follows.
Before Ti starts its execution, a record <Ti start> is written to the
log. A write(X) operation by Ti results in the writing of a new record
to the log. Finally, when Ti partially commits, a record <Ti commit> is
written to the log.
T0:
Read(A);
A = A - 50;
Write(A);
Read(B);
B = B + 50;
Write(B);
T1:
Read(C);
C = C - 100;
Write(C);
Assume the current values are A = 1000, B = 2000, and C = 700.
<T0 start>
<T0, A, 950>
<T0, B, 2050>
<T0 commit>
<T1 start>
<T1, C, 600>
<T1 commit>
The log records only the new values.
When transaction Ti partially commits, the records associated with it in
the log are used in executing the deferred writes. Since a failure may
occur while this updating is taking place, we must ensure that, before
the start of these updates, all the log records are written out to stable
storage. Once they have been written, the actual updating takes place,
and the transaction enters the committed state.
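The order of steps in deferred modification can be sketched as below. This is only an illustration under simplifying assumptions: the function name run_deferred and the tuple layout of the log records are made up for the example, the log list stands in for stable storage, and each transaction is given its new values directly.

```python
def run_deferred(db, log, txn, writes):
    """Run one transaction under deferred modification.

    writes is a list of (item, new_value) pairs.
    """
    log.append((txn, "start"))
    for item, new_value in writes:
        # Only the new value is logged; the database is not touched yet.
        log.append((txn, item, new_value))
    log.append((txn, "commit"))
    # Assumption: at this point all log records are on stable storage.
    # Only now are the deferred writes executed from the log.
    for rec in log:
        if rec[0] == txn and len(rec) == 3:
            _, item, new_value = rec
            db[item] = new_value


db = {"A": 1000, "B": 2000, "C": 700}
log = []
run_deferred(db, log, "T0", [("A", 950), ("B", 2050)])
run_deferred(db, log, "T1", [("C", 600)])
```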
Records in log          Data in database
<T0 start>
<T0, A, 950>
<T0, B, 2050>
<T0 commit>
                        A = 950
                        B = 2050
<T1 start>
<T1, C, 600>
<T1 commit>
                        C = 600
Records in log          Data in database
<T0 start>
<T0, A, 950>
<T0, B, 2050>
System failure
                        A = 1000
                        B = 2000

<T1 start>
<T1, C, 600>
System failure
                        C = 700
The DBMS takes no action after recovery from the failure because the
database has not been touched.
Records in log          Data in database
<T0 start>
<T0, A, 950>
<T0, B, 2050>
<T0 commit>
System failure
                        A = 950
                        B = 2050

<T1 start>
<T1, C, 600>
<T1 commit>
System failure
                        C = 600
The DBMS has to perform a redo operation after recovery from the failure.
Records in log          Data in database
<T0 start>
<T0, A, 950>
<T0, B, 2050>
System failure
                        A = 1000
                        B = 2000

<T1 start>
<T1, C, 600>
<T1 commit>
System failure
                        C = 600
The DBMS takes no action for T0 because A and B are untouched, but it
must redo T1 after recovery from the failure.
Using the log, the system can handle any failure that results in the loss
of information on volatile storage. The recovery scheme uses the
following recovery procedure:
Redo(Ti) sets the value of all data items updated by transaction Ti to
the new values.
The redo operation must be idempotent; that is, executing it several
times must be equivalent to executing it once. This characteristic is
required if we are to guarantee correct behavior even if a failure occurs
during the recovery process.
After a failure, the recovery subsystem consults the log to determine
which transactions need to be redone. Transaction Ti needs to be
redone if and only if the log contains both the records
<Ti start> and <Ti commit>.
Thus, if the system crashes after the transaction completes its
execution, the recovery scheme uses the information in the log to
restore the system to the consistent state that held after the
transaction had completed.
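The redo rule for deferred modification can be sketched as follows. The function name recover_deferred and the tuple layout of the log records are assumptions made for the example. Because redo only assigns the logged new values, running it several times gives the same result as running it once.

```python
def recover_deferred(log, db):
    """Redo every transaction that has both a start and a commit record."""
    started = {r[0] for r in log if r[1:] == ("start",)}
    committed = {r[0] for r in log if r[1:] == ("commit",)}
    redo_set = started & committed
    for rec in log:                      # forward scan of the log
        if len(rec) == 3 and rec[0] in redo_set:
            _, item, new_value = rec
            db[item] = new_value         # assignment, so redo is idempotent
    return redo_set


# Hypothetical crash: T0 committed, T1 did not; no deferred write reached disk.
log = [
    ("T0", "start"), ("T0", "A", 950), ("T0", "B", 2050), ("T0", "commit"),
    ("T1", "start"), ("T1", "C", 600),
]
db = {"A": 1000, "B": 2000, "C": 700}
redone = recover_deferred(log, db)
recover_deferred(log, db)   # a second pass changes nothing
```

T1 has no commit record, so its log records are ignored and C keeps its old value.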
Immediate Database Modification
This technique allows database modifications to be output to the
database while the transaction is still in the active state. Data
modifications written by active transactions are called uncommitted
modifications.
In the event of a crash or a transaction failure, the system must use the
old-value field of the log records to restore the modified data items to
the value they had prior to the start of the transaction. The undo
operation accomplishes this restoration.
Before a transaction Ti starts its execution, the system writes the record
<Ti start> to the log. During its execution, any write(X) operation by
Ti is preceded by the writing of the appropriate new update record to
the log. When Ti partially commits, the system writes the record <Ti
commit> to the log.
Records in log          Data in database
<T0 start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
                        A = 950
                        B = 2050
<T0 commit>
<T1 start>
<T1, C, 700, 600>
                        C = 600
<T1 commit>
Records in log          Data in database
<T0 start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
                        A = 950
                        B = 2050
System failure

<T1 start>
<T1, C, 700, 600>
                        C = 600
System failure
The DBMS must undo T0 and T1, using the old values, after recovery from
the failure.
Records in log          Data in database
<T0 start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
                        A = 950
                        B = 2050
<T0 commit>
System failure

<T1 start>
<T1, C, 700, 600>
                        C = 600
<T1 commit>
System failure
The DBMS has to redo T0 and T1, using the new values, after recovery
from the failure.
Records in log          Data in database
<T0 start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
                        A = 950
                        B = 2050
System failure

<T1 start>
<T1, C, 700, 600>
                        C = 600
<T1 commit>
System failure
The DBMS has to undo T0 and redo T1 after recovery from the failure.
After a failure, the recovery subsystem consults the log to determine
which transactions need to be undone or redone. Transaction Ti
needs to be undone if the log contains the record <Ti start> but not
the record <Ti commit>, and it needs to be redone if the log contains
both <Ti start> and <Ti commit>.
Thus, if the system crashes after the transaction completes its
execution, the recovery scheme uses the information in the log to
restore the system to the consistent state that held after the
transaction had completed.
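The undo/redo decision for immediate modification can be sketched as follows. The function name recover_immediate and the tuple layout of the log records are assumptions made for the example. The sketch undoes first (scanning backward) and then redoes (scanning forward), which is one common ordering.

```python
def recover_immediate(log, db):
    """Undo uncommitted transactions, then redo committed ones."""
    started = {r[0] for r in log if len(r) == 2 and r[1] == "start"}
    committed = {r[0] for r in log if len(r) == 2 and r[1] == "commit"}
    undo_set = started - committed
    redo_set = started & committed
    for rec in reversed(log):            # undo: backward scan, old values
        if len(rec) == 4 and rec[0] in undo_set:
            _, item, old, _ = rec
            db[item] = old
    for rec in log:                      # redo: forward scan, new values
        if len(rec) == 4 and rec[0] in redo_set:
            _, item, _, new = rec
            db[item] = new
    return undo_set, redo_set


# Log from the last case above: T0 has no commit record, T1 has one.
log = [
    ("T0", "start"), ("T0", "A", 1000, 950), ("T0", "B", 2000, 2050),
    ("T1", "start"), ("T1", "C", 700, 600), ("T1", "commit"),
]
# Assumed disk state at the crash: T0's writes reached disk, T1's did not.
db = {"A": 950, "B": 2050, "C": 700}
undone, redone = recover_immediate(log, db)
```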
Since the information in the log is used in reconstructing the state of the
database, we require that, before execution of an output(B)
operation, the log records corresponding to B be written onto stable
storage.
Rollback segment (RBS)
An Oraclex database has a data area that contains a rollback segment (RBS)
entry for each open transaction. An RBS entry is a set of images of the rows
that have been modified by the transaction. The images represent the values
of the rows before the execution of the transaction. Each update operation
executed by a transaction is applied to a row of a database table only after
the previous value of the row has been added to the RBS entry.
[Figure: Oraclex database server. Transaction T issues T.A write r, T.B write s,
T.C read s, and T.D read u. The database tables (rows r, s, t, u) hold the
updated values of r and s; the rollback segment holds the before images of
r and s.]
The open transaction operation creates a new RBS entry and associates it
with the transaction. The execution of a transaction commit operation
deletes the RBS entry and makes the changes permanent. The execution
of a rollback operation restores all of the modified rows from the RBS
entry.
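The open, write, commit, and rollback behaviour described above can be sketched as follows. This is a toy model of the idea only, not Oracle's actual implementation; the class name RollbackSegmentDB and its methods are assumptions made for the example.

```python
class RollbackSegmentDB:
    """Toy table store that keeps one RBS entry per open transaction."""

    def __init__(self, rows):
        self.rows = dict(rows)   # database tables: row key -> value
        self.rbs = {}            # transaction -> {row key: before image}

    def open(self, txn):
        self.rbs[txn] = {}       # new RBS entry for the transaction

    def write(self, txn, key, value):
        # Save the before image first (only the value the row had before
        # the transaction), then apply the update to the table.
        self.rbs[txn].setdefault(key, self.rows[key])
        self.rows[key] = value

    def commit(self, txn):
        del self.rbs[txn]        # changes stay; before images are dropped

    def rollback(self, txn):
        for key, before in self.rbs.pop(txn).items():
            self.rows[key] = before


store = RollbackSegmentDB({"r": 1, "s": 2, "t": 3, "u": 4})
store.open("T")
store.write("T", "r", 10)
store.write("T", "s", 20)
store.write("T", "s", 30)
store.rollback("T")              # r and s return to 1 and 2
```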
In other DBMS systems, each transaction has its own memory that acts like a cache
for the modified rows. During execution, the database tables are not changed.
Instead, the new row images are written into the memory of the transaction. All
accesses to rows in database tables go first to the transaction cache. If a row is
not found there, the full database tables are used. The commit operation flushes the
cache by writing the new row values to the database tables and deleting the cache.
The rollback operation deletes the cache, leaving the database unchanged.
[Figure: Cached updates database server. Transaction T issues T.A write r,
T.B write s, T.C read s, and T.D read u. The update segment holds the updated
values of r and s; the database tables (rows r, s, t, u) still hold the before
images.]
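The cached-updates scheme can be sketched as follows. The class name CachedTransaction and its methods are assumptions made for the example; the database tables are modeled as a plain dictionary.

```python
class CachedTransaction:
    """Toy transaction that buffers its updates in a private cache."""

    def __init__(self, tables):
        self.tables = tables     # shared database tables: row key -> value
        self.cache = {}          # new row images written by this transaction

    def write(self, key, value):
        self.cache[key] = value  # the tables are not changed

    def read(self, key):
        # Look in the transaction cache first, then in the tables.
        if key in self.cache:
            return self.cache[key]
        return self.tables[key]

    def commit(self):
        self.tables.update(self.cache)   # flush new values to the tables
        self.cache.clear()

    def rollback(self):
        self.cache.clear()               # tables were never touched


tables = {"r": 1, "s": 2, "t": 3, "u": 4}
t = CachedTransaction(tables)
t.write("r", 10)
t.write("s", 20)
seen_s = t.read("s")             # served from the cache
seen_u = t.read("u")             # falls through to the tables
before_commit = dict(tables)     # tables still hold the before images
t.commit()
```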
Checkpoints
To reduce the number of transactions to be redone and undone, the
system periodically performs checkpoints, which require the
following sequence of actions to take place:
1. Output onto stable storage all log records currently residing
in main memory.
2. Output to the disk all modified buffer blocks.
3. Output onto stable storage a log record <checkpoint>.
Transactions are not allowed to perform any update actions, such as
writing to a buffer block or writing a log record, while a checkpoint
is in progress.
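The three checkpoint steps can be sketched as follows. The function name checkpoint, the "dirty" flag on buffer blocks, and the dictionary and list stand-ins for disk and stable storage are all assumptions made for the example. The sketch also assumes that no transaction performs updates while the function runs.

```python
def checkpoint(log_buffer, stable_log, buffer_pool, disk):
    """Perform one checkpoint (updates are assumed to be paused)."""
    # 1. Output all log records in main memory onto stable storage.
    stable_log.extend(log_buffer)
    log_buffer.clear()
    # 2. Output all modified buffer blocks to the disk.
    for block_id, block in buffer_pool.items():
        if block["dirty"]:
            disk[block_id] = dict(block["data"])
            block["dirty"] = False
    # 3. Output a <checkpoint> log record onto stable storage.
    stable_log.append(("checkpoint",))


log_buffer = [("T0", "start"), ("T0", "A", 1000, 950)]
stable_log = []
buffer_pool = {
    "B1": {"data": {"A": 950}, "dirty": True},
    "B2": {"data": {"C": 700}, "dirty": False},
}
disk = {"B1": {"A": 1000}, "B2": {"C": 700}}
checkpoint(log_buffer, stable_log, buffer_pool, disk)
```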
After a failure has occurred, the recovery scheme examines the log to
determine the most recent transaction Ti that started executing before the
most recent checkpoint took place. It can find such a transaction by
searching the log backward, from the end of the log, until it finds the first
<checkpoint> record; then it continues the search backward until it finds
the next <Ti start> record. This record identifies a transaction Ti.
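The backward search can be sketched as follows. The function name find_recovery_start and the tuple layout of the log records are assumptions made for the example; returning None when no checkpoint or no earlier start record exists is a choice made for this sketch.

```python
def find_recovery_start(log):
    """Return (index, transaction) of the most recent <Ti start> record
    that precedes the most recent <checkpoint>, or None if there is none."""
    i = len(log) - 1
    # Search backward from the end of the log for the first <checkpoint>.
    while i >= 0 and log[i] != ("checkpoint",):
        i -= 1
    if i < 0:
        return None
    # Continue backward until the next <Ti start> record.
    i -= 1
    while i >= 0:
        rec = log[i]
        if len(rec) == 2 and rec[1] == "start":
            return i, rec[0]
        i -= 1
    return None


log = [
    ("T0", "start"), ("T0", "A", 1000, 950), ("T0", "commit"),
    ("T1", "start"), ("T1", "C", 700, 600),
    ("checkpoint",),
    ("T2", "start"),
]
found = find_recovery_start(log)   # T1 started last before the checkpoint
```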