Transactions
CPSC 356 Database
Ellen Walker
Hiram College
(Includes figures from Database Systems by Connolly & Begg, © Addison Wesley 2002)
Transaction
• A logical unit of work for a database
– Example: “move $5 from savings to checking”
– { withdraw 5 from savings; add 5 to checking }
• Transactions must be prevented from interfering with each
other
• Transactions must be “all or nothing”
– If the transaction doesn’t complete, no changes
should be made at all!
States of a Transaction
• Successful transactions are committed
• Failed partial transactions are rolled back
• Committed transactions cannot be undone. We must
create a new (compensating) transaction to fix the
database.
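The "all or nothing" rule can be tried out with Python's built-in sqlite3 module. This is only a sketch: the table name account, the helper names transfer and balances, and the rule that a negative balance makes the transaction fail are assumptions made for the example, not part of the slides.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing unit of work."""
    try:
        # Step 1: withdraw. sqlite3 opens a transaction implicitly here.
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        (left,) = conn.execute("SELECT balance FROM account WHERE name = ?",
                               (src,)).fetchone()
        if left < 0:
            # Assumed business rule: overdrawing makes the transaction fail.
            raise ValueError("insufficient funds")
        # Step 2: deposit.
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                     (amount, dst))
        conn.commit()      # both changes become permanent together
        return True
    except ValueError:
        conn.rollback()    # the partial withdrawal is undone
        return False

def balances(conn):
    """Return the current balances as a dict (helper for the example)."""
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("savings", 5), ("checking", 0)])
conn.commit()

print(transfer(conn, "savings", "checking", 5), balances(conn))
print(transfer(conn, "savings", "checking", 5), balances(conn))
```

The second call fails after the withdrawal step has already run, and the rollback leaves both balances exactly as they were before it started.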
ACID Properties of a Transaction
(Härder and Reuter, 1983)
• Atomicity — a transaction is either performed in its
entirety or not at all
• Consistency — a transaction must take the database
from one consistent state to another
• Isolation (Serializable) — if two transactions run at
the same time, the result must look as if they ran
sequentially in some arbitrary order; a transaction’s
updates must not be visible to other transactions until
it commits
• Durability — once a transaction commits, its result is
permanent (must never be lost)
Concurrency Control
• Two or more transactions proceed
concurrently, while preserving serializability
(isolation)
• Transactions must not interfere with each other; without control we get:
– Lost update problem
– Dirty read problem
– Inconsistent analysis problem
Lost Update Problem
– Account A = $100, B = $200, C = $300
• Transaction T transfers $4 from A to B
• Transaction U transfers $3 from C to B
• Should end A = $96, B = $207, C = $297
– U’s update of B is lost:
Transaction T        Transaction U        Value
bal=read(A)                               $100
write(A,bal-4)                            $96
                     bal=read(C)          $300
                     write(C,bal-3)       $297
bal=read(B)                               $200
                     bal=read(B)          $200
                     write(B,bal+3)       $203
write(B,bal+4)                            $204
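The lost update can be replayed with a small simulation. This is a sketch only: the run function and the (transaction, operation, account, delta) step format are invented for the example, and the interleaving shown is one ordering that produces the values on the slide.

```python
def run(schedule, balances):
    """Run (txn, op, account, delta) steps in order; return the final balances.

    Each transaction keeps its own private `bal`, as in the slide's pseudocode.
    """
    db = dict(balances)
    bal = {}
    for txn, op, account, delta in schedule:
        if op == "read":
            bal[txn] = db[account]
        else:  # "write": store the value this transaction last read, plus delta
            db[account] = bal[txn] + delta
    return db

start = {"A": 100, "B": 200, "C": 300}
t_first = [("T", "read", "A", 0), ("T", "write", "A", -4)]
u_first = [("U", "read", "C", 0), ("U", "write", "C", -3)]

# Serial: all of T, then all of U.
serial = (t_first + [("T", "read", "B", 0), ("T", "write", "B", 4)]
          + u_first + [("U", "read", "B", 0), ("U", "write", "B", 3)])

# Interleaved: both read B = 200 before either writes it back.
interleaved = (t_first + u_first
               + [("T", "read", "B", 0), ("U", "read", "B", 0),
                  ("U", "write", "B", 3), ("T", "write", "B", 4)])

print(run(serial, start))       # {'A': 96, 'B': 207, 'C': 297}
print(run(interleaved, start))  # {'A': 96, 'B': 204, 'C': 297}
```

In the interleaved run T overwrites B using the stale value 200, so U's $3 disappears.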
Dirty Read Problem
• Account A = $200, B = $200
– Transaction T transfers $100 from A to B but fails!
– Transaction U deposits $25 to A
– Should end A = $225, B = $200
• Problem:
– Transaction U read “dirty value” of A after $100 was taken…
Transaction T        Transaction U        Value
bal=read(A)                               $200
write(A,bal-100)                          $100
                     bal=read(A)          $100
                     write(A,bal+25)      $125
bal=read(B)                               $200
…
ROLLBACK!
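The dirty read can be traced step by step in a few lines. This is a sketch under an assumption the slides do not state: T's rollback is modeled as a naive undo log that simply restores the old values, and the names db and undo_log are invented for the example.

```python
db = {"A": 200, "B": 200}
undo_log = []                     # (item, old value) pairs written by T

# T withdraws $100 from A but has not committed yet.
bal_t = db["A"]
undo_log.append(("A", db["A"]))
db["A"] = bal_t - 100             # A is now 100, uncommitted

# U reads T's uncommitted ("dirty") value and deposits $25.
u_saw = db["A"]                   # 100, not the committed 200
u_wrote = u_saw + 25              # 125
db["A"] = u_wrote

# T fails: ROLLBACK restores every value T overwrote (assumed undo log).
for item, old in reversed(undo_log):
    db[item] = old

print(u_saw, u_wrote, db["A"])    # 100 125 200
```

Under this assumption U's deposit is wiped out by the rollback. If the rollback left U's write in place, A would be 125. Neither outcome is the correct 225.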
Nonrepeatable Read Problem
• Similar to dirty read, but here one transaction reads the
same item twice and gets two different values
Transaction T        Transaction U        Value
                     read(A)              $1000
sal=read(A)                               $1000
write(A,sal*1.1)                          $1100
                     (unrelated actions)
                     sal=read(A)          $1100
Inconsistent Analysis Problem
– Situation:
• Transaction T gives everyone a 10% raise
• Transaction U computes the average salary
– Problem:
• Some salaries have been raised, some not when
average is computed (avg should be 1500 or 1650)
Transaction T        Transaction U        Value
sal=read(A)                               $1000
write(A,sal*1.1)                          $1100
                     bal=read(A)          $1100
                     bal+=read(B)         $3100
                     avg = bal/2          $1550
sal=read(B)                               $2000
write(B,sal*1.1)                          $2200
Interleaving Causes Problems
• We need a concurrency control mechanism
– Allow as much concurrency among transactions
as possible (throughput)
– Prevent other transactions from viewing
intermediate values (not yet committed)
Definitions for Scheduling
• Schedule
– A sequence of operations by a set of concurrent
transactions that preserves order of operations
within each transaction
• Serial Schedule
– A schedule without any interleaving
• Nonserial Schedule
– A schedule where operations from different
transactions are interleaved
Conflict Serializability
• A serializable schedule has the same result
as a serial schedule
• Recognize conflicts between transactions
– Both transactions access the same variable
– At least one of those accesses is a write
• When all conflicts happen in the same order
(T before U or U before T), then the schedule
is serializable; otherwise not.
Serializability Testing
• Draw a downward (forward in time) arrow for
each conflict (when one transaction is writing). If
all arrows point the same way, then the schedule
is serializable
Transaction T        Transaction U
bal=read(A)
write(A,bal-4)
                     bal=read(C)
                     write(C,bal-3)
bal=read(B)
write(B,bal+4)
                     bal=read(B)
                     write(B,bal+3)
Serializability Testing (cont.)
• If at least one arrow is pointing leftward and
another arrow is pointing rightward, the
schedule is not serializable
Transaction T        Transaction U
bal=read(A)
write(A,bal-4)
                     bal=read(C)
                     write(C,bal-3)
bal=read(B)
                     bal=read(B)
write(B,bal+4)
                     write(B,bal+3)
Generalizing Serializability
• With more than two transactions, build a
conflict serializability graph (also called a precedence graph)
– Each transaction is a node of the graph
– For each conflict, draw an arc from the earlier
transaction to the later transaction.
• If this graph has a cycle, then the schedule is
not serializable
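The graph test described above can be sketched in a few lines of Python. The function names precedence_graph and has_cycle and the (transaction, operation, item) tuple format are assumptions made for this example. The two schedules are the transfer examples used earlier, with one assumed ordering of the two reads of B in the second.

```python
from itertools import combinations

def precedence_graph(schedule):
    """schedule: list of (txn, op, item) in execution order; op is 'r' or 'w'."""
    edges = set()
    # combinations() keeps list order, so the first step is always the earlier one.
    for (t1, op1, x1), (t2, op2, x2) in combinations(schedule, 2):
        # Conflict: different transactions, same item, at least one write.
        if t1 != t2 and x1 == x2 and "w" in (op1, op2):
            edges.add((t1, t2))   # arc from the earlier to the later transaction
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph given as edge pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True           # reached a node still on the current path
        visiting.add(node)
        found = any(visit(n) for n in graph[node])
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(n) for n in graph)

prefix = [("T", "r", "A"), ("T", "w", "A"), ("U", "r", "C"), ("U", "w", "C")]
good = prefix + [("T", "r", "B"), ("T", "w", "B"), ("U", "r", "B"), ("U", "w", "B")]
bad = prefix + [("T", "r", "B"), ("U", "r", "B"), ("T", "w", "B"), ("U", "w", "B")]

print(precedence_graph(good), has_cycle(precedence_graph(good)))
print(has_cycle(precedence_graph(bad)))
```

In the first schedule every conflict on B runs from T to U, so the graph has a single arc and no cycle. In the second, U reads B before T writes it and T reads B before U writes it, which gives arcs in both directions and therefore a cycle.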
Serializability Testing vs. Enforcement
• To test serializability, you have to create the
graph and check for cycles
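– Finding a cycle is cheap, but the test only works on a
schedule that has already run, so it detects problems
instead of preventing them (and the more general test,
view serializability, is NP-complete)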
• Instead, let’s create extra constraints (locking)
to enforce serializability
Locking Algorithms
• Locking is a method of controlling
concurrency using a lock (variable) to deny
transactions access to certain objects
• Types of locking
– Static locking
– 2 Phase Locking
• Other algorithms (we won’t cover)
– Optimistic concurrency control
– Timestamp ordering
Using Locks
• Transaction must lock the data object before
accessing it
• Transaction should unlock the data object
when done
• If an item is locked, the transaction must wait
until it is unlocked
• Example transaction:
– Lock B; read B; … write B; unlock B; commit.
Types of Locks
• Shared lock
– Transaction can read item only (read lock)
• Exclusive lock
– Transaction can read and update item (write lock)
• Shared lock can be upgraded to exclusive
lock.
• Exclusive lock can be downgraded to shared
lock.
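The shared/exclusive rules can be written as a tiny lock table. This is a sketch, not how any particular DBMS does it: the class name LockTable, its methods, and the "S"/"X" mode codes are assumptions made for the example, and a request that cannot be granted simply returns False instead of queueing.

```python
class LockTable:
    """Tracks which transactions hold a shared (S) or exclusive (X) lock."""

    def __init__(self):
        self.holders = {}                      # item -> {txn: "S" or "X"}

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        held = self.holders.setdefault(item, {})
        others = [m for t, m in held.items() if t != txn]
        if mode == "S":
            granted = all(m == "S" for m in others)   # readers can share
        else:
            granted = not others                      # a writer must be alone
        if granted and held.get(txn) != "X":
            held[txn] = mode      # also covers upgrading this txn's S lock to X
        return granted

    def downgrade(self, txn, item):
        """Turn an exclusive lock held by txn into a shared lock."""
        if self.holders.get(item, {}).get(txn) == "X":
            self.holders[item][txn] = "S"

    def release(self, txn, item):
        self.holders.get(item, {}).pop(txn, None)

locks = LockTable()
print(locks.request("T", "B", "S"))   # True: first reader
print(locks.request("U", "B", "S"))   # True: shared locks are compatible
print(locks.request("T", "B", "X"))   # False: upgrade blocked while U reads
locks.release("U", "B")
print(locks.request("T", "B", "X"))   # True: upgrade succeeds
```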
Locking Protocols
• Locking alone doesn’t guarantee serializability
– If an object is unlocked and locked again within a
transaction, another transaction can “jump in” between
• Locking protocols prevent this
– Static locking
– 2 Phase locking
Static Locking
• Transaction locks all the data items before using any
of them.
– Usually the first operation in the transaction
• Transaction releases all locks at once when it’s done
with the data
– Usually at the end of the transaction
• This method limits concurrency but guarantees
serializability
• Transaction must know in advance which objects it
will use
2 Phase Locking
• Constraint: A transaction cannot request a
lock on any data item after it has unlocked
any data item.
• To maintain the constraint, use 2 phases:
– Growing phase — transaction requests locks, but
doesn’t release any locks (upgrades allowed)
• The stage of a transaction when it holds locks on all the
needed data objects is called the lock point
– Shrinking phase — transaction releases locks, but
doesn’t request any more locks (downgrades
allowed)
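The two-phase constraint is easy to check mechanically. The sketch below assumes a transaction's locking behaviour is given as a list of (action, item) pairs; the function name and the action names "lock", "upgrade", "unlock" and "downgrade" are invented for the example.

```python
def obeys_two_phase_locking(actions):
    """Return True if no lock is acquired after the first release."""
    shrinking = False
    for action, _item in actions:
        if action in ("unlock", "downgrade"):
            shrinking = True          # the shrinking phase has begun
        elif action in ("lock", "upgrade") and shrinking:
            return False              # acquiring after releasing breaks 2PL
    return True

# Growing phase, lock point, then shrinking phase.
two_phase = [("lock", "A"), ("lock", "B"), ("upgrade", "B"),
             ("unlock", "A"), ("unlock", "B")]

# B is unlocked and then locked again: another transaction could jump in.
not_two_phase = [("lock", "B"), ("unlock", "B"), ("lock", "B"), ("unlock", "B")]

print(obeys_two_phase_locking(two_phase))      # True
print(obeys_two_phase_locking(not_two_phase))  # False
```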
2-Phase Locking can cause Cascading Rollback
• With 2PL, after the transaction has released some of
its locks, yet before it has committed the transaction,
those intermediate results become visible
• When a transaction is rolled back, all modified data
objects are restored
• What if another transaction reads those intermediate
results, and this transaction later aborts?
– All transactions that have read these data objects must also
be rolled back (even if they’ve already completed!); this is
called cascading rollback
Rigorous & Strict 2 Phase Locking
• Rigorous 2PL
– A transaction holds all its locks until it completes,
when it commits (or aborts) and releases all of its
locks in a single atomic action
• Strict 2PL
– A transaction holds all its exclusive locks until it
completes, when it commits (or aborts) and
releases them in a single atomic action; only shared
locks may be released earlier
Deadlock
• Deadlock occurs when 2 or more transactions are each
waiting for locks on items held by other
waiting transactions (circular wait)
• Example: Dining Philosophers
– 5 philosophers, 5 forks
– To eat, you need both left and right forks
– If each philosopher picks up a left fork and waits
for a right fork to become available, deadlock!
2 Phase Locking can lead to Deadlock
• A transaction can request a lock on a data
object while holding locks on other data
objects, so a circular wait can result
• Resolved (after detecting deadlock) by:
– Aborting a deadlocked transaction, restoring all the data
objects it modified, releasing all its locks, and withdrawing
all its pending lock requests
Deadlock Detection
• Deadlock detection
– Wait-for Graph
• If transaction T is waiting for a lock that transaction U
holds, there is an arrow from T to U in WFG
– Lock manager is responsible for detection
• It looks for cycles in its Wait For Graph
• If it finds a cycle, it must select and abort a transaction
(the deadlock victim)
• Choose victim based on age, number of changes already
made, number of changes still to be made
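A lock manager's cycle search can be sketched as follows. The dictionary format for the wait-for graph, the function find_deadlock, and the policy of picking the youngest transaction as victim are assumptions for illustration; the search is a plain depth-first walk, not an optimized algorithm.

```python
def find_deadlock(wait_for):
    """wait_for maps each transaction to the set of transactions it waits for.

    Returns the list of transactions on one cycle, or None if there is none.
    """
    def walk(node, path):
        if node in path:
            return path[path.index(node):]        # the cycle closes here
        for holder in wait_for.get(node, ()):
            cycle = walk(holder, path + [node])
            if cycle:
                return cycle
        return None

    for start in wait_for:
        cycle = walk(start, [])
        if cycle:
            return cycle
    return None

# T waits for U, U waits for V, V waits for T; W waits for T but is not in the cycle.
wait_for = {"T": {"U"}, "U": {"V"}, "V": {"T"}, "W": {"T"}}
start_time = {"T": 1, "U": 2, "V": 3, "W": 4}     # hypothetical start timestamps

cycle = find_deadlock(wait_for)
victim = max(cycle, key=start_time.get)           # assumed policy: abort the youngest
print(sorted(cycle), victim)                      # ['T', 'U', 'V'] V

# Aborting the victim removes its node and every edge pointing at it.
remaining = {t: waits - {victim} for t, waits in wait_for.items() if t != victim}
print(find_deadlock(remaining))                   # None
```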
Deadlock Prevention (Lock methods)
• Lock all items when transaction starts (static locking)
• Request locks in predefined order
– May cause premature locking, which reduces concurrency
• Lock timeouts (enables preemption)
– Each lock is invulnerable for a limited period, and vulnerable
afterwards
– If a transaction wants to access a data object protected by a
vulnerable lock, the lock is broken and the transaction
holding it is aborted
Deadlock Prevention (Timestamp)
– Transaction timestamps
• Each transaction is assigned a unique timestamp when it
starts
• If a transaction needs to access a data object that is
locked by another transaction, the timestamps of the two
transactions are compared
– Older transactions (smaller timestamps) generally have
priority
– Wait-for edges are only allowed in one direction (older to
younger, or younger to older, depending on the scheme),
which prevents cycles
Eliminating Deadlock with Timestamps
• Wait-die (aborts the requester):
– If an older transaction wants something held by a
younger transaction, it waits
– If a younger transaction wants something held by an
older transaction, it dies (is aborted)
• Wound-wait (preempts the resource):
– If an older transaction wants something held by a
younger transaction, it wounds (aborts) the younger
transaction and takes the lock
– If a younger transaction wants something held by an
older transaction, it waits
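Both schemes reduce to a single timestamp comparison, sketched below. The function names and the returned strings are invented for the example; a smaller timestamp means an older transaction.

```python
def wait_die(requester_ts, holder_ts):
    """Older requester waits; younger requester dies (is aborted)."""
    return "wait" if requester_ts < holder_ts else "abort requester"

def wound_wait(requester_ts, holder_ts):
    """Older requester wounds (aborts) the holder; younger requester waits."""
    return "abort holder" if requester_ts < holder_ts else "wait"

OLD, YOUNG = 1, 2   # hypothetical timestamps

print(wait_die(OLD, YOUNG))     # wait
print(wait_die(YOUNG, OLD))     # abort requester
print(wound_wait(OLD, YOUNG))   # abort holder
print(wound_wait(YOUNG, OLD))   # wait
```

In both schemes it is always the younger transaction that gets aborted, and waiting is only ever allowed in one direction, so the wait-for graph cannot contain a cycle.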
Locking in a Real DBMS
• Granularity
– Lock by tuple -- possible “phantom”
– Lock by table -- limits concurrency
• Isolation levels: (increasing order)
– READ UNCOMMITTED (dirty reads)
– READ COMMITTED (no dirty reads)
– REPEATABLE READ (no nonrepeatable reads)
– SERIALIZABLE (no phantoms)