Chapter 16: Concurrency Control

- Lock-Based Protocols
- Timestamp-Based Protocols
- Validation-Based Protocols
- Multiple Granularity
- Multiversion Schemes
- Insert and Delete Operations
- Concurrency in Index Structures
Goal of Concurrency Control
- Transactions should be executed as if they had run in some serial order.
- This property is also called isolation or serializability.
Transactional Concurrency Control
- There are several ways to ensure a serial-equivalent order on conflicts:
- Option 1: execute transactions serially.
- Option 2: pessimistic concurrency control: block T until transactions with conflicting operations are done.
  - Done using locks.
- Option 3: optimistic concurrency control: proceed as if no conflicts will occur, and recover if constraints are violated.
  - Suited to "single shot" transactions.
  - Repair the damage by rolling back (aborting) one of the conflicting transactions.
- Option 4: hybrid timestamp ordering using versions.
Locking
- Locking is the most frequent technique used to control concurrent execution of database transactions.
- Operating systems provide a binary locking system that is too restrictive for database transactions; that is why a DBMS contains its own lock manager.
- A lock_value(X) is a variable associated with each database data item X.
- The lock_value(X) describes the status of the data item X by telling which operations can be applied to X.
Concurrency Control using Locking
- The locking technique operates by preventing a transaction from improperly accessing data that is being used by another transaction.
- Before a transaction can perform a read or write operation, it must claim a read (shared) or write (exclusive) lock on the relevant data item.
- Once a read lock has been granted on a particular data item, other transactions may read the data, but not update it.
- A write lock prevents all access to the data by other transactions.
Kinds of Locks
- Generally, the lock manager of a DBMS offers two kinds of locks:
  - a shared (read) lock, and
  - an exclusive (write) lock.
- If a transaction T issues a read_lock(X) command, it is added to the list of transactions sharing a lock on item X, unless some transaction already holds a write lock on X.
- If a transaction T issues a write_lock(X) command, it is granted an exclusive lock on X, unless another transaction already holds a lock on X.
- Accordingly, lock_value(X) ∈ {read_locked, write_locked, unlocked}.
Lock Semantics
- Since read operations cannot conflict, it is acceptable for more than one transaction to hold read locks simultaneously on the same item.
- On the other hand, a write lock gives a transaction exclusive access to the data item.
- Locks are used in the following way:
  - A transaction needing access to a data item must first lock the item, requesting a read lock for read-only access or a write lock for read-write access.
  - If the item is not already locked by another transaction, the lock request is granted.
  - If the item is currently locked, the DBMS determines whether the request is compatible with the current lock. If a read lock is requested on an item that is already read-locked, the request is granted; otherwise the transaction must wait until the existing write lock is released.
  - A transaction holds a lock until it explicitly releases it, commits, or aborts.
Basic Locking Rules
- The basic locking rules are:
  - T must issue a read_lock(X) or write_lock(X) command before any read_item(X) operation.
  - T must issue a write_lock(X) command before any write_item(X) operation.
  - T must issue an unlock(X) command when all read_item(X) and write_item(X) operations are completed.
- Some DBMS lock managers perform automatic locking by granting an appropriate lock to a transaction when it attempts to read or write a database item.
- So, an item lock request can be either explicit or implicit.
Locking Rules
- Lock manager: the part of the DBMS that keeps track of locks issued to transactions. It maintains a lock table.
- Data items can be locked in two modes:
  1. Exclusive (X) mode: the data item can be both read and written. An X-lock is requested using the lock-X instruction.
  2. Shared (S) mode: the data item can only be read. An S-lock is requested using the lock-S instruction.
- Lock requests are made to the concurrency-control manager. A transaction can proceed only after its request is granted.
Lock-Based Protocols
- Lock-compatibility matrix:

           S      X
    S      yes    no
    X      no     no

- A transaction may be granted a lock on an item if the requested lock is compatible with locks already held on the item by other transactions.
- Any number of transactions can hold shared locks on an item, but if any transaction holds an exclusive lock on the item, no other transaction may hold any lock on it.
- If a lock cannot be granted, the requesting transaction has to wait until all incompatible locks held by other transactions have been released. The lock is then granted.
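The compatibility rule and the grant decision it drives can be sketched in Python; the table and function names are illustrative, not from the slides:

```python
# Shared/exclusive lock compatibility: an entry tells whether a
# requested mode can coexist with a mode already held by another txn.
COMPATIBLE = {
    ("S", "S"): True,   # any number of readers may share an item
    ("S", "X"): False,  # a writer excludes readers
    ("X", "S"): False,
    ("X", "X"): False,  # only one writer at a time
}

def can_grant(requested: str, held_modes: list) -> bool:
    """Grant only if the requested mode is compatible with every
    lock currently held on the item by other transactions."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)
```

An unlocked item (empty `held_modes`) grants any request; a single shared holder blocks an X request but admits further S requests.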
Lost Update Problem and Locking

    T1                      T2
    read_lock(X)
    read_item(X)
    unlock(X)
                            write_lock(X)
                            read_item(X)
                            X = X + M
                            write_item(X)
                            unlock(X)
    write_lock(X)
    X = X - N
    write_item(X)
    unlock(X)

    (time runs downward)

- T2's update to X is lost because T1 wrote over X, and this happened despite the fact that both transactions issue lock and unlock commands.
- The problem is that T1 releases its lock on X too early, allowing T2 to start updating X.
- We need a protocol that will guarantee serializability.
- A locking protocol is a set of rules followed by all transactions while requesting and releasing locks. Locking protocols restrict the set of possible schedules.
The Basic Two-Phase Locking Protocol
- All lock operations must precede the first unlock operation.
- A transaction can then be viewed as having two phases:
  - growing (or expanding), during which all locks are acquired, and
  - shrinking, during which locks are downgraded and released, but none can be acquired or upgraded.
- A theorem: if all transactions in a schedule obey the locking rules and the two-phase locking protocol, the schedule is conflict serializable.
- Consequently, a schedule that obeys the two-phase locking protocol does not have to be tested for conflict serializability.
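The two-phase rule ("all lock operations precede the first unlock") is easy to check mechanically. A minimal sketch, with an invented operation format of ("lock"/"unlock", item) pairs for a single transaction:

```python
def obeys_two_phase_locking(ops) -> bool:
    """Check that no lock request follows the first unlock.

    `ops` is a list of ("lock", item) / ("unlock", item) pairs for one
    transaction, in execution order (an assumed encoding, not from the
    slides).
    """
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True            # transaction entered phase 2
        elif action == "lock" and shrinking:
            return False                # lock after an unlock: violates 2PL
    return True
```

The first transaction below is two-phase; the second acquires a lock after releasing one, so it is not.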
Lost Update and Two-Phase Locking

    T1                      T2
    read_lock(X)
    read_item(X)
    X = X - N
    write_lock(X)
                            write_lock(X)
                            // has to wait
    write_item(X)
    unlock(X)
                            write_lock(X)
                            read_item(X)
                            X = X + M
                            write_item(X)
                            unlock(X)

    (time runs downward)

- T2 cannot obtain a write_lock on X while T1 holds a (read) lock on X, so it has to wait.
- When T1 releases its lock on X, T2 acquires the lock on X and finishes successfully.
- Two-phase locking provides a safe, conflict-serializable schedule.
The Two-Phase Locking Protocol
- Many database systems employ a two-phase locking protocol to control the manner in which locks are acquired and released. This is a protocol which ensures conflict-serializable schedules.
- Phase 1 (growing phase):
  - the transaction may obtain locks;
  - the transaction may not release locks.
- Phase 2 (shrinking phase):
  - the transaction may release locks;
  - the transaction may not obtain locks.
- The rules of the protocol are as follows:
  - A transaction must acquire a lock on an item before operating on the item. The lock may be read or write, depending on the type of access needed.
  - Once the transaction releases a lock, it can never acquire any new locks.
- Once all locks have been acquired, the transaction is at its lock point.
- The protocol assures serializability. It can be proved that the transactions can be serialized in the order of their lock points (i.e., the point where a transaction acquires its final lock, at the end of its growing phase).
The Two-Phase Locking Protocol
- Two-phase locking is governed by the following rules:
  - Two transactions cannot have conflicting locks.
  - No unlock operation can precede a lock operation in the same transaction.
- The point in the schedule where the final lock is obtained is called the lock point.
- (Diagram: the number of locks held rises during phase 1, the growing phase, peaks at the lock point, and falls during phase 2, the shrinking phase.)
A Question for You
- This slide describes the dirty read problem.

    T1                      T2
    read_item(X)
    X = X - N
    write_item(X)
                            read_item(X)
                            X = X + M
                            write_item(X)
    read_item(Y)
    // T1 fails

- The question: does two-phase locking solve the dirty read problem?
- Answers:
  a) Yes.
  b) No, because the dirty read problem is not a consequence of conflicting operations.
- The strict two-phase locking protocol solves the "dirty read" problem.
Two-Phase Locking: Dirty Read

    T1                      T2
    write_lock(X)
    read_item(X)
    X = X - N
    write_item(X)
    write_lock(Y)
    unlock(X)
                            write_lock(X)
                            read_item(X)
                            X = X + M
                            write_item(X)
                            unlock(X)
    read_item(Y)
    Y = Y + Q
    write_item(Y)
    unlock(Y)
    // T1 fails before it commits

- If T1 gets the exclusive lock on X first, T2 has to wait until T1 unlocks X.
- Note that interleaving is still possible, only not within transactions that access the same data items.
- Two-phase locking alone does not solve the dirty read problem, because T2 is allowed to read the uncommitted database item X.
Strict Two-Phase Locking
- A variant of the two-phase locking protocol.
- Protocol:
  - A transaction T does not release any of its exclusive locks until after it commits or aborts.
  - Hence, no other transaction can read or write an item X that is written by T unless T has committed.
- The strict two-phase locking protocol is safe against dirty reads.
- Rigorous two-phase locking is even stricter: here all locks are held till commit/abort. In this protocol transactions can be serialized in the order in which they commit.
- Most DBMSs implement either strict or rigorous two-phase locking.
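Strict 2PL bookkeeping can be sketched as follows; the class and method names are illustrative, and a real system would track this inside the lock manager rather than in the transaction object:

```python
class StrictTwoPhaseTransaction:
    """Sketch of strict 2PL: exclusive locks are released only at
    commit (or abort), all together, never earlier."""

    def __init__(self):
        self.held = set()      # items this transaction holds X-locks on
        self.finished = False

    def write_lock(self, item):
        assert not self.finished, "no new locks after commit/abort"
        self.held.add(item)

    def commit(self):
        # All exclusive locks are released together at commit time,
        # so no other transaction ever reads this txn's dirty writes.
        released = set(self.held)
        self.held.clear()
        self.finished = True
        return released
```

Because nothing is released before `commit`, another transaction can never acquire a lock on an item this transaction has written but not yet committed.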
Schedule for Strict 2PL with Serial Execution

    T1          T2
    X(A)
    R(A)
    W(A)
    X(B)
    R(B)
    W(B)
    Commit
                X(A)
                R(A)
                W(A)
                X(B)
                R(B)
                W(B)
                Commit
Disadvantages of Locking
- Pessimistic concurrency control has a number of key disadvantages, particularly in distributed systems:
- Overhead: locks cost, and you pay even if no conflict occurs.
  - Even read-only actions must acquire locks.
  - High overhead forces careful choices about lock granularity.
- Low concurrency:
  - If locks are too coarse, they reduce concurrency unnecessarily.
  - The need for strict 2PL to avoid cascading aborts makes it even worse.
- Low availability: a client cannot make progress if the server or lock holder is temporarily unreachable.
- Deadlock.
- Two-phase locking can introduce some undesirable effects:
  - waits,
  - deadlocks,
  - starvation.
Problems with Locking
- The use of locks and the 2PL protocol prevents many of the problems arising from concurrent access to the database.
- However, it does not solve all problems, and it can even introduce new ones.
- Firstly, there is the issue of cascading rollbacks:
  - 2PL allows locks to be released before the final commit or rollback of a transaction.
  - During this time, another transaction may acquire the locks released by the first transaction, and operate on the results of the first transaction.
  - If the first transaction subsequently aborts, the second transaction must abort, since it has used data now being rolled back by the first transaction.
  - This problem can be avoided by preventing the release of locks until the final commit or abort action.
- Cascading rollback is thus possible under basic two-phase locking; cascadeless schedules are not guaranteed.
Deadlock
- Deadlock is also called deadly embrace.
- Deadlock occurs when two or more transactions reach an impasse because they are waiting to acquire locks held by each other.
- A typical sequence of operations is given in the following diagram:

    T1                      T2
    write_lock(X)
                            write_lock(Y)
    write_lock(Y)
    // has to wait
                            write_lock(X)
                            // has to wait

- T1 acquired an exclusive lock on X; T2 acquired an exclusive lock on Y.
- Neither can finish, because both are in the waiting state.
- To handle the deadlock, one of T1 or T2 must be rolled back and its locks released.
Deadlock (continued)
- Deadlock examples:
  a) T1 has locked X and waits to lock Y; T2 has locked Y and waits to lock Z; T3 has locked Z and waits to lock X.
  b) Both T1 and T2 have acquired shared locks on X and wait to lock X exclusively.
- All of this results in a cycle in the wait-for graph: T1 waits for T2, and T2 waits for T1.
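Deadlock detection over such a wait-for graph is a cycle search. A sketch, where the adjacency-map representation of the graph is an assumption, not from the slides:

```python
def has_deadlock(wait_for) -> bool:
    """Detect a cycle in a wait-for graph via depth-first search.

    `wait_for` maps a transaction id to the set of transaction ids it
    is waiting on. A cycle means a deadlock exists and some victim
    transaction must be rolled back.
    """
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited / on stack / done
    colour = {t: WHITE for t in wait_for}

    def visit(t):
        colour[t] = GREY
        for u in wait_for.get(t, ()):
            if colour.get(u, WHITE) == GREY:
                return True               # back edge: cycle found
            if colour.get(u, WHITE) == WHITE and visit(u):
                return True
        colour[t] = BLACK
        return False

    return any(colour[t] == WHITE and visit(t) for t in list(wait_for))
```

For example a) above, the edges T1→T2→T3→T1 form a cycle; in a graph where every wait chain ends at a transaction that waits on nobody, no deadlock is reported.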
Starvation
- Starvation is also possible if the concurrency control manager is badly designed. For example:
  - A transaction may be waiting for an X-lock on an item, while a sequence of other transactions request and are granted an S-lock on the same item.
  - Suppose T2 has an S-lock on a data item and T1 requests an X-lock on the same item; T1 has to wait for T2 to release its S-lock. Meanwhile T3 requests an S-lock on the same item, and since that request is compatible with the lock granted to T2, T3 may be granted the S-lock. Now even after T2 releases its lock, T1 is still not granted the X-lock until T3 finishes.
- The concurrency control manager can be designed to prevent starvation.

    T1            T2          T3          T4          T5
    lock-S(A)
                  lock-X(A)
                  wait
                              lock-S(A)
                                          lock-S(A)
                                                      lock-S(A)
Starvation
- Starvation is a problem that appears when using locks or deadlock detection protocols.
- Starvation occurs when a transaction cannot make any progress for an indefinite period of time, while other transactions proceed.
- Starvation can occur when:
  - the waiting protocol for locked items is unfair (e.g., stacks are used instead of queues);
  - the same transaction is selected as the deadlock `victim` repeatedly.
Lock Conversion
- A transaction T that already holds a lock on item X can convert it to another state.
- The lock conversion rules are:
  - T can upgrade a read_lock(X) to a write_lock(X) if it is the only transaction holding a lock on item X (otherwise, T has to wait).
  - T can always downgrade a write_lock(X) to a read_lock(X).
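The two conversion rules can be sketched as follows, with `item_holders` an assumed map from transaction id to the mode ("S" or "X") it holds on a single item:

```python
def try_upgrade(item_holders, txn) -> bool:
    """Upgrade txn's shared lock on an item to exclusive.

    Succeeds only when txn is the sole holder of any lock on the item;
    otherwise txn must wait (here: returns False)."""
    if item_holders.get(txn) != "S":
        raise ValueError("txn must hold a shared lock to upgrade")
    if set(item_holders) != {txn}:
        return False                  # other holders exist: must wait
    item_holders[txn] = "X"
    return True

def downgrade(item_holders, txn):
    """A write lock can always be downgraded to a read lock."""
    assert item_holders.get(txn) == "X"
    item_holders[txn] = "S"
```

The sole holder upgrades immediately; with a second shared holder present, the upgrade is refused and the requester waits.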
Lock Conversions
- Two-phase locking with lock conversions: to get more concurrency, 2PL is used with lock conversion.
- First phase:
  - can acquire a lock-S on an item;
  - can acquire a lock-X on an item;
  - can convert a lock-S to a lock-X (upgrade).
  - Upgrading is possible only in the growing phase.
- Second phase:
  - can release a lock-S;
  - can release a lock-X;
  - can convert a lock-X to a lock-S (downgrade).
  - Downgrading is possible only in the shrinking phase.
Lock Conversions

    T1                      T2
    read_item(a1)
    read_item(a2)
    read_item(a3)
                            read_item(a1)
                            read_item(a2)
                            P = a1 + a2
    write_item(a1)

1. Normal 2PL would make T2 wait until write_item(a1) is executed.
2. Lock conversion allows higher concurrency: the shared lock on a1 can still be acquired by T2.
3. T1 can upgrade its shared lock to exclusive just before the write instruction.
Lock Conversions
- Transactions attempting to upgrade may need to wait.
- Lock conversions generate serializable schedules, serialized by their lock points.
- If exclusive locks are held till the end of the transactions, then schedules are cascadeless.
Automatic Acquisition of Locks
- A transaction Ti issues the standard read/write instruction, without explicit locking calls.
- The operation read(D) is processed as:

    if Ti has a lock on D
      then read(D)
    else begin
      if necessary, wait until no other transaction has a lock-X on D;
      grant Ti a lock-S on D;
      read(D)
    end
Automatic Acquisition of Locks (Cont.)
- write(D) is processed as:

    if Ti has a lock-X on D
      then write(D)
    else begin
      if necessary, wait until no other transaction has any lock on D;
      if Ti has a lock-S on D
        then upgrade lock on D to lock-X
        else grant Ti a lock-X on D;
      write(D)
    end

- All locks are released after commit or abort.
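The read(D)/write(D) logic above can be modelled in Python. This single-threaded sketch raises an exception where a real lock manager would block the transaction, and all names are illustrative:

```python
class AutoLockManager:
    """Illustrative model of automatic lock acquisition: transactions
    issue plain read/write calls and locks are taken implicitly."""

    def __init__(self):
        self.locks = {}  # item -> {txn_id: "S" or "X"}

    def read(self, txn, item):
        holders = self.locks.setdefault(item, {})
        if txn in holders:
            return                    # Ti already has a lock on D: just read
        if any(m == "X" for t, m in holders.items() if t != txn):
            raise RuntimeError("would wait: another txn holds lock-X")
        holders[txn] = "S"            # grant Ti a lock-S on D, then read

    def write(self, txn, item):
        holders = self.locks.setdefault(item, {})
        if holders.get(txn) == "X":
            return                    # Ti already has a lock-X on D: just write
        if any(t != txn for t in holders):
            raise RuntimeError("would wait: other txns hold locks on D")
        holders[txn] = "X"            # upgrade Ti's lock-S, or grant lock-X
```

Two transactions can read the same item concurrently (both end up with lock-S); a lone reader's write silently upgrades its lock to X.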
Implementation of Locking
- A lock manager can be implemented as a separate process to which transactions send lock and unlock requests.
- The lock manager replies to a lock request by sending a lock grant message (or a message asking the transaction to roll back, in case of a deadlock).
- The requesting transaction waits until its request is answered.
- The lock manager maintains a data structure called a lock table to record granted locks and pending requests.
- The lock table is usually implemented as an in-memory hash table indexed on the name of the data item being locked.
Lock Management
- A lock table entry contains:
  - the number of transactions currently holding a lock;
  - the type of lock held (shared or exclusive);
  - a pointer to the queue of lock requests.
- Locking and unlocking have to be atomic operations.
- Lock upgrade: a transaction that holds a shared lock can be upgraded to hold an exclusive lock.
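A lock table entry along these lines might look like the following sketch; the field and method names are illustrative, and a real entry would also link into a per-transaction list of held locks:

```python
from collections import deque

class LockTableEntry:
    """One lock-table entry: the current mode, the set of holders,
    and a FIFO queue of pending (txn, mode) requests."""

    def __init__(self):
        self.mode = None          # "S" or "X" currently granted
        self.holders = set()      # transactions holding the lock
        self.queue = deque()      # waiting (txn, mode) requests, FIFO

    def request(self, txn, mode) -> bool:
        """Grant immediately if compatible, else append to the queue.
        Returns True when granted."""
        if not self.holders:              # item is unlocked
            self.mode = mode
            self.holders.add(txn)
            return True
        if mode == "S" and self.mode == "S" and not self.queue:
            self.holders.add(txn)         # compatible: share the lock
            return True
        self.queue.append((txn, mode))    # incompatible (or jumping the
        return False                      # queue would starve waiters)
```

Note the `not self.queue` guard: a new S request behind a waiting X request also waits, which is the FIFO discipline that prevents the starvation scenario described earlier.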
Lock Table
- (Figure: a lock table with granted and waiting requests; black rectangles indicate granted locks, white ones indicate waiting requests.)
- The lock table also records the type of lock granted or requested.
- A new request is added to the end of the queue of requests for the data item, and granted if it is compatible with all earlier locks.
- Unlock requests result in the request being deleted, and later requests are checked to see if they can now be granted.
- If a transaction aborts, all waiting or granted requests of the transaction are deleted.
  - The lock manager may keep a list of locks held by each transaction, to implement this efficiently.
Other Protocols than 2PL: Graph-Based
- Graph-based protocols are an alternative to two-phase locking.
- Assumption: we have prior knowledge about the order in which data items will be accessed.
- A (hierarchical) ordering is imposed on the data items, like, e.g., the pages of a B-tree.
- (Figure: a small tree of data items A, B, C.)
Graph-Based Protocols
- Impose a partial ordering → on the set D = {d1, d2, ..., dh} of all data items.
  - If di → dj, then any transaction accessing both di and dj must access di before accessing dj.
  - This implies that the set D may now be viewed as a directed acyclic graph, called a database graph.
- The tree protocol is a simple kind of graph protocol. Transactions access items starting from the root of this partial order.
Tree Protocol
1. Only exclusive locks are allowed.
2. The first lock by Ti may be on any data item. Subsequently, a data item Q can be locked by Ti only if the parent of Q is currently locked by Ti.
3. Data items may be unlocked at any time.
4. A data item that has been locked and unlocked by Ti cannot subsequently be relocked by Ti.
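A checker for the four rules above can be sketched as follows, assuming a `parent` map describing the database tree and a per-transaction list of ("lock"/"unlock", item) operations (both encodings are assumptions, not from the slides):

```python
def follows_tree_protocol(parent, ops) -> bool:
    """Check one transaction's exclusive-lock sequence against the
    tree-protocol rules. `parent` maps item -> its parent item."""
    held, ever_locked = set(), set()
    first = True
    for action, item in ops:
        if action == "lock":
            if item in ever_locked:
                return False          # rule 4: no relocking
            if not first and parent.get(item) not in held:
                return False          # rule 2: parent must be held
            held.add(item)
            ever_locked.add(item)
            first = False             # rule 2: only the first lock is free
        else:
            held.discard(item)        # rule 3: unlock at any time
    return True
```

The first sequence below walks down the tree A → B → D, releasing ancestors early, which 2PL would forbid but the tree protocol allows; the second relocks an item and is rejected.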
Tree Protocol: Example
- (Figure: a tree of data items A through I, with two transactions T1 and T2 issuing lock, L, and unlock, U, operations on items B, D, E, G, and H.)
- Questions about the schedule: Is it 2PL? Does it follow the tree protocol? Is it 'correct'?
Graph-Based Protocols (Cont.)
- The tree protocol ensures conflict serializability as well as freedom from deadlock.
- Advantages of the tree protocol over the two-phase locking protocol:
  - shorter waiting times: unlocking may occur earlier, which increases concurrency;
  - the protocol is deadlock-free, so no rollbacks are required.
- Drawbacks:
  - The protocol does not guarantee recoverability or cascade freedom.
    - Commit dependencies need to be introduced to ensure recoverability: if Ti reads an uncommitted data item, we record a commit dependency of Ti on the transaction that performed the last write on that item.
  - Transactions may have to lock data items that they do not access.
    - This means increased locking overhead and additional waiting time.
    - Without prior knowledge of which data items will be locked, transactions may lock the root of the tree, reducing concurrency.
- Some schedules not possible under two-phase locking are possible under the tree protocol, and vice versa.
- Main application: locking in B+-trees, to allow high-concurrency update access; otherwise the root page lock is a bottleneck.
Timestamp-Based Protocols
- Basic timestamp ordering:
  - A timestamp is a unique identifier created by the DBMS to identify a transaction.
  - Timestamp values are assigned in the order in which transactions are submitted to the system.
  - Each transaction is issued a timestamp when it enters the system. If an old transaction Ti has timestamp TS(Ti), a new transaction Tj is assigned timestamp TS(Tj) such that TS(Ti) < TS(Tj).
  - The timestamp could use the value of the system clock or a logical counter.
  - The protocol manages concurrent execution such that the timestamps determine the serializability order.
  - In order to assure such behavior, the protocol maintains two timestamp values for each data item Q:
    - W-timestamp(Q) is the largest timestamp of any transaction that executed write(Q) successfully;
    - R-timestamp(Q) is the largest timestamp of any transaction that executed read(Q) successfully.
Timestamp-Ordering Protocol
- The timestamp-ordering protocol ensures that any conflicting read and write operations are executed in timestamp order.
- Suppose a transaction Ti issues a read(Q):
  1. If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q that was already overwritten. Hence, the read operation is rejected, and Ti is rolled back.
  2. If TS(Ti) ≥ W-timestamp(Q), then the read operation is executed, and R-timestamp(Q) is set to max(R-timestamp(Q), TS(Ti)).
Timestamp-Ordering Protocols (Cont.)
- Suppose that transaction Ti issues write(Q):
  1. If TS(Ti) < R-timestamp(Q), then abort and roll back Ti and reject the operation. This must be done because some younger transaction, with a timestamp greater than TS(Ti), i.e., after Ti in the timestamp ordering, has already read the value of item Q before Ti had a chance to write Q; writing now would violate the timestamp ordering.
  2. If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q. Hence, this write operation is rejected, and Ti is rolled back.
  3. Otherwise (i.e., TS(Ti) ≥ R-timestamp(Q) and TS(Ti) ≥ W-timestamp(Q)), the write operation is executed, and W-timestamp(Q) is set to TS(Ti).
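The read and write rules can be sketched directly. Here `stamps` is an assumed per-item dict holding the R- and W-timestamps, and a False return means the issuing transaction must be rolled back:

```python
def to_read(ts, q, stamps) -> bool:
    """Basic timestamp-ordering read rule for transaction with
    timestamp `ts` reading item `q`."""
    if ts < stamps[q]["W"]:
        return False                          # value already overwritten: roll back
    stamps[q]["R"] = max(stamps[q]["R"], ts)  # record the latest reader
    return True

def to_write(ts, q, stamps) -> bool:
    """Basic timestamp-ordering write rule."""
    if ts < stamps[q]["R"]:
        return False                          # a younger txn already read Q: roll back
    if ts < stamps[q]["W"]:
        return False                          # obsolete write: roll back
    stamps[q]["W"] = ts
    return True
```

A write with timestamp 3 fails once a reader with timestamp 5 has been recorded, and a read with timestamp 6 fails after a write with timestamp 7, exactly as in rules 1 above.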
Example of the Timestamp-Ordering Protocol

    T14                 T15
    read(B)
                        read(B)
                        B := B - 50
                        write(B)
    read(A)
    display(A+B)
                        read(A)
                        A := A + 50
                        write(A)
                        display(A+B)
Timestamp-Ordering Protocols (Cont.)
- The timestamp-ordering protocol ensures conflict serializability as well as freedom from deadlock. However, there is a possibility of starvation if a sequence of conflicting transactions causes repeated restarts.
- Consider the two transactions T1 and T2 shown below:

    T1              T2
    write(p)
                    read(p)
                    read(q)
    write(q)

- The write(q) of T1 fails, which rolls back T1 and, in effect, T2 as well, since T2 is dependent on T1 (it read the value of p written by T1).
- This protocol can generate schedules that are not recoverable.
- Recoverability and cascadelessness can be ensured by performing all writes at the end of the transaction; recoverability alone can be ensured by tracking uncommitted dependencies.
Strict Timestamp Ordering
- A variation of basic TO, called strict TO, ensures that schedules are both strict (recoverable) and (conflict) serializable.
- If a transaction T issues read_item(X) or write_item(X) such that TS(T) > write_TS(X), then its read or write is delayed until the transaction T' with TS(T') = write_TS(X) has committed or aborted.
Thomas' Write Rule
- A modified version of the timestamp-ordering protocol in which obsolete or outdated write operations may be ignored under certain circumstances.
- When Ti attempts to write data item Q, if TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete or outdated value of Q.
  - Rather than rolling back Ti, as the timestamp-ordering protocol would have done, this write operation can simply be ignored.
- Otherwise this protocol is the same as the timestamp-ordering protocol.
- Thomas' write rule allows greater potential concurrency: it allows some view-serializable schedules that are not conflict-serializable.
    T1                      T2
    R(A)
                            W(A)
                            commit
    W(A)
    commit

- This is a view-serializable schedule, equivalent to the serial schedule <T1, T2>.
Validation-Based Protocol (Optimistic Method for Concurrency Control)
- Optimistic concurrency control techniques are also known as validation or certification methods.
- These techniques are called optimistic because they assume that conflicts of database operations are rare, and hence that there is no need to do checking during transaction execution.
- Checking represents overhead during transaction execution, with the effect of slowing the transaction down, so checking is done only before commit.
- Updates in a transaction are not applied directly to the database items until the transaction reaches its end; they are carried out in a temporary database file.
- Execution of a transaction is done in three phases. The phases of concurrently executing transactions can be interleaved, but each transaction must go through the three phases in order.
Validation-Based Protocol (Optimistic Method for Concurrency Control)
- Read and execution phase: transaction Ti reads values of committed items from the database. Updates are applied only to temporary local copies of the data items (a temporary update file).
- Validation phase (certification phase): transaction Ti performs a "validation test" to determine whether its local copies can be written without violating serializability. If the validation test is positive, the transaction moves to the write phase; otherwise it is discarded.
- Write phase: if Ti is validated successfully, the updates are applied to the database; otherwise, Ti is rolled back and then restarted.
  - This phase is performed only for read-write transactions, not for read-only transactions.
Optimistic Concurrency Control
- Each transaction T is given three timestamps:
  - Start(T): when the transaction starts;
  - Validation(T): when the transaction enters the validation phase;
  - Finish(T): when the transaction finishes.
- Goal: to ensure that the transactions follow a serial schedule based on Validation(T).
Optimistic Concurrency Control
- Given two transactions T1 and T2 with Validation(T1) < Validation(T2):
- Case 1: Finish(T1) < Start(T2).
- (Timeline: T1 goes through its read, validation, and write phases entirely before T2 starts its read phase.)
- Here there is no problem with serializability.
Optimistic Concurrency Control
- Case 2: Finish(T1) < Validation(T2).
- (Timeline: T1's write phase overlaps T2's read phase, so there is a potential conflict.)
- If T2 does not read anything that T1 writes, then there is no problem.
Optimistic Concurrency Control
- Case 3: Validation(T2) < Finish(T1).
- (Timeline: T1's write phase overlaps both T2's read and write phases, so there is a potential conflict.)
- If T2 does not read or write anything that T1 writes, then there is no problem.
Optimistic Concurrency Control
- For any transaction T, check, for every transaction T' such that Validation(T') < Validation(T):
  1. If Finish(T') > Start(T), then if T reads any element that T' writes, abort.
  2. If Finish(T') > Validation(T), then if T writes any element that T' writes, abort.
  3. Otherwise, commit.
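The three-rule validation test above can be sketched with assumed per-transaction dict fields (start, validation, finish timestamps, plus read_set/write_set of item names):

```python
def validate(t, earlier) -> bool:
    """Validate transaction `t` against every transaction in `earlier`,
    i.e. every t2 with Validation(t2) < Validation(t).

    Returns True if t may commit, False if it must abort."""
    for t2 in earlier:
        # Rule 1: t2 finished after t started, so t may have read
        # something t2 wrote mid-flight.
        if t2["finish"] > t["start"] and t["read_set"] & t2["write_set"]:
            return False
        # Rule 2: t2 finished after t entered validation, so their
        # writes may interleave.
        if t2["finish"] > t["validation"] and t["write_set"] & t2["write_set"]:
            return False
    return True                       # Rule 3: otherwise, commit
```

A fully-finished earlier transaction (Finish before Start, case 1) never causes an abort; an overlapping writer of an item that t read or wrote does.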
Schedule Produced by Validation
- Example of a schedule produced using validation:

    T14                 T15
    read(B)
                        read(B)
                        B := B - 50
                        read(A)
                        A := A + 50
    read(A)
    (validate)
    display(A+B)
                        (validate)
                        write(B)
                        write(A)
Optimistic Concurrency Control
- Advantages:
  - No blocking.
  - No overhead during execution (there is, however, overhead for validation).
  - It is very efficient when conflicts are rare; occasional conflicts result in transaction rollback.
  - Rollback involves only the local copy of the data; the database is not involved, so there is no cascading rollback.
- Disadvantages:
  - Potential starvation of long transactions.
  - A large number of aborts under high concurrency.
Optimistic Concurrency Control
- Applications of optimistic methods:
  - Suitable for environments where there are few conflicts and no long transactions.
  - Acceptable for mostly-read or query database systems that require very few update transactions.
Granularity of Items
- Until now, we used the term `data item` without specifying its exact meaning.
- In the context of concurrency control, a data item can be:
  - a field of a database record,
  - a database record,
  - a disk block,
  - a whole file, or
  - the whole database.
- Fine granularity refers to small item sizes; coarse granularity refers to larger item sizes.
- The coarser the data item granularity, the lower the degree of concurrency.
Granularity of Items (continued)
- Several trade-offs must be considered before choosing the size of the data item.
- The finer the data granularity, the higher the locking overhead of the DBMS lock manager (due to many locks and unlocks).
- If we lock large objects (e.g., relations):
  - we need few locks,
  - but we get low concurrency.
- If we lock small objects (e.g., tuples, fields):
  - we need more locks (so the overhead is higher),
  - but we get more concurrency.
- The best item size depends on the type of transaction:
  - if a transaction accesses a small number of records, then data item = record;
  - if a transaction accesses a large number of records in the same file, then data item = file.
- Some DBMSs automatically change the granularity level according to the number of records a transaction is accessing (attempting to lock).
Multiple Granularity
- A granule is a unit of data individually controlled by the concurrency control system. Granularity is the lockable unit in a lock-based concurrency control scheme.
- Example:
  - If Ti needs to access the entire database and a locking protocol is used, then Ti must lock each data item in the database. Done individually, this is time consuming.
  - It would be better if Ti could issue a single lock request to lock the entire database.
  - If Ti needs to access only a few data items, it should not be required to lock the entire database, but only those data items.
- Hence, a mechanism is required to allow multiple levels of granularity.
- Allow data items to be of various sizes and define a hierarchy of data granularities, where the small granularities are nested within larger ones.
- This can be represented graphically as a tree (but don't confuse it with the tree-locking protocol).
Example of Granularity Hierarchy
- If transaction Ti gets an explicit lock on file Fc in exclusive mode, then it has an implicit lock in exclusive mode on all records belonging to Fc. It does not need to lock the individual records of Fc explicitly.
Multiple Granularity
- Database level: the entire database.
- Table level: an entire table.
- Page level: an entire disk block is locked. A page has a fixed size, e.g. 4K, 8K, or 16K, and contains several rows of one or more tables. This level is well suited to multi-user DBMSs.
- Row level: a lock exists for each row in each table of the database. This improves the availability of data, but its management carries a high overhead cost.
- Attribute level: allows concurrent transactions to access the same row but different attributes. This is the most flexible level for multi-user data access.
- If a transaction wants to access the entire database, it has to lock the entire database, i.e. the root of the tree.
- The question is: how does the system determine whether the root node can be locked?
- One solution is to search the entire tree, but this is time consuming and defeats the purpose of a multiple granularity locking scheme.
Multiple Granularity Locking (MGL) Protocol
- To make multiple granularity level locking practical, additional locks called intention locks are needed.
- Three types of intention locks:
  - Intention Shared (IS): indicates that shared locks will be requested on descendant nodes.
  - Intention Exclusive (IX): indicates that exclusive locks will be requested on descendant nodes.
  - Shared Intention Exclusive (SIX): indicates that the current node is locked in shared mode, but exclusive locks will be requested on descendant nodes.
- Before locking an item, a transaction must set intention locks on all of its ancestors.
- Locking is done top-down; unlocking is done bottom-up.
MGL Compatibility Matrix (holder mode in rows, requestor mode in columns)

            IS     IX     S      SIX    X
    IS      yes    yes    yes    yes    no
    IX      yes    yes    no     no     no
    S       yes    no     yes    no     no
    SIX     yes    no     no     no     no
    X       no     no     no     no     no
Multiple Granularity Locking (MGL) Protocol
- The lock compatibility matrix must be adhered to.
- The root of the tree must be locked first, in any mode.
- A node N can be locked by a transaction T in S or IS mode only if the parent of N is already locked by T in either IS or IX mode.
- A node N can be locked by T in X, IX, or SIX mode only if the parent of N is already locked by T in either IX or SIX mode.
- A transaction T can lock a node only if it has not yet unlocked any node (this enforces 2PL).
- A transaction T can unlock a node N only if none of the children of N are currently locked by T.
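The IS/IX/S/SIX/X compatibility matrix and the parent-mode rules can be sketched together. This is an illustrative model; the root, which has no parent, may be locked first in any mode and is not handled here:

```python
# COMPAT[h] is the set of modes a new request may take while some
# other transaction holds mode h on the same node (standard matrix).
COMPAT = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}

# Which mode(s) the requesting transaction must hold on the PARENT
# before locking a node in the given mode (MGL rules above).
REQUIRED_PARENT = {
    "S":   {"IS", "IX"},
    "IS":  {"IS", "IX"},
    "X":   {"IX", "SIX"},
    "IX":  {"IX", "SIX"},
    "SIX": {"IX", "SIX"},
}

def mgl_can_lock(mode, parent_mode, held_on_node) -> bool:
    """A non-root node may be locked in `mode` only if the requester
    holds an adequate intention lock on the parent AND `mode` is
    compatible with every lock other transactions hold on the node."""
    if parent_mode not in REQUIRED_PARENT[mode]:
        return False
    return all(mode in COMPAT[h] for h in held_on_node)
```

For instance, an X lock on a file requires IX (or SIX) on its parent and fails if any other transaction holds even an IS lock on the file, since IS signals locks further down that X would conflict with.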
Examples
 T1 reads record ra2 in file Fa
 Needs to lock the database, area A1 and file Fa in IS mode
 Needs to lock ra2 in S mode
 T2 modifies record ra9 in file Fa
 Needs to lock the database, area A1 and file Fa in IX mode
 Needs to lock ra9 in X mode
 T3 reads all records in file Fa
 Needs to lock the database and area A1 in IS mode
 Needs to lock Fa in S mode
 T4 reads the entire database
 Needs to lock the database in S mode
 T1, T3 and T4 can access the database concurrently
 T1 and T2 can access concurrently
 T2 cannot access concurrently with T3 or T4
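The ancestor-lock rule behind these examples can be sketched in a few lines. This is an illustrative toy, not a real lock manager; the four-level hierarchy and node names are taken from the example above.

```python
# Minimal sketch of multiple-granularity locking (MGL): before locking a
# node, a transaction sets intention locks on all ancestors, top down.

# Ancestor mode required for each requested mode (per the MGL rules above):
# shared access needs IS on every ancestor; exclusive access needs IX.
ANCESTOR_MODE = {
    "S": "IS", "IS": "IS",
    "X": "IX", "IX": "IX", "SIX": "IX",
}

def locks_to_acquire(path, mode):
    """Return the (node, mode) pairs to request, root first.

    `path` lists the nodes from the root down to the target,
    e.g. ["database", "area_A1", "file_Fa", "record_ra2"].
    """
    ancestors = [(node, ANCESTOR_MODE[mode]) for node in path[:-1]]
    return ancestors + [(path[-1], mode)]

# T1 reads record ra2: IS on database, area and file; S on the record.
print(locks_to_acquire(["database", "area_A1", "file_Fa", "record_ra2"], "S"))
# T2 modifies record ra9: IX on the ancestors; X on the record.
print(locks_to_acquire(["database", "area_A1", "file_Fa", "record_ra9"], "X"))
```

Note that locking proceeds in the returned (top-down) order, while unlocking would release in the reverse (bottom-up) order.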
Multiversion Schemes
 Multiversion schemes keep old versions of each data item to increase concurrency:
 Multiversion Timestamp Ordering
 Multiversion Two-Phase Locking
 Each successful write results in the creation of a new version of the data item written.
 Timestamps are used to label versions.
 When a read(Q) operation is issued, an appropriate version of Q is selected based on the timestamp of the transaction, and the value of the selected version is returned.
 Reads never have to wait, as an appropriate version is returned immediately.
Multiversion Timestamp Ordering
 Each data item Q has a sequence of versions <Q1, Q2, ..., Qm>. Each version Qk contains three data fields:
 Content: the value of version Qk
 W-timestamp(Qk): timestamp of the transaction that created (wrote) version Qk
 R-timestamp(Qk): largest timestamp of any transaction that successfully read version Qk
 When a transaction Ti creates a new version Qk of Q, Qk's W-timestamp and R-timestamp are initialized to TS(Ti).
 The R-timestamp of Qk is updated whenever a transaction Tj reads Qk and TS(Tj) > R-timestamp(Qk).
Multiversion Timestamp Ordering (Cont.)
 Suppose that transaction Ti issues a read(Q) or write(Q) operation. Let Qk denote the version of Q whose write timestamp is the largest write timestamp less than or equal to TS(Ti).
 If transaction Ti issues a read(Q), then the value returned is the content of version Qk.
 If transaction Ti issues a write(Q):
 1. If TS(Ti) < R-timestamp(Qk), then transaction Ti is rolled back.
 2. If TS(Ti) = W-timestamp(Qk), the contents of Qk are overwritten.
 3. Otherwise, a new version of Q is created.
 Observe that:
 Reads always succeed.
 A transaction is rejected if it is too late in doing a write.
 Conflicts are resolved through rollbacks. Every read also involves a write (of the R-timestamp).
 The protocol guarantees serializability, but does not ensure recoverability and cascadelessness.
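The read and write rules above can be sketched for a single data item Q. This is a toy model under the assumption of a single-threaded caller; class and method names are illustrative, not from any real DBMS.

```python
# Toy sketch of multiversion timestamp ordering for one data item Q.

class Version:
    def __init__(self, content, w_ts):
        self.content = content
        self.w_ts = w_ts       # W-timestamp: TS of the creating transaction
        self.r_ts = w_ts       # R-timestamp: largest TS that read this version

class MVTOItem:
    def __init__(self, initial):
        # Initial committed version, created "at time 0".
        self.versions = [Version(initial, 0)]

    def _pick(self, ts):
        # Version Qk with the largest W-timestamp <= TS(Ti).
        return max((v for v in self.versions if v.w_ts <= ts),
                   key=lambda v: v.w_ts)

    def read(self, ts):
        v = self._pick(ts)
        v.r_ts = max(v.r_ts, ts)    # reads never wait, only bump R-timestamp
        return v.content

    def write(self, ts, value):
        v = self._pick(ts)
        if ts < v.r_ts:
            return "rollback"       # a younger transaction already read Qk
        if ts == v.w_ts:
            v.content = value       # Ti overwrites its own version
        else:
            self.versions.append(Version(value, ts))
        return "ok"
```

A short trace: after a transaction with TS 5 reads Q, a write by TS 3 is rolled back (it arrives "too late"), while a write by TS 7 creates a new version.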
Multiversion Two-Phase Locking
 In the standard locking scheme, once a transaction obtains a write lock on an item, no other transaction can access that item.
 The idea behind multiversion 2PL is to allow another transaction T' to read an item X while a single transaction T holds a write lock on X.
 There are three locking modes for an item: read, write and certify.
 This is accomplished by allowing two versions of each item X:
 One version must always have been written by some committed transaction.
 A second version X' is created when a transaction T acquires a write lock on the item.
 Other transactions can continue to read the committed version of X while T holds the write lock.
 When T is ready to commit, it must obtain certify locks on all items on which it currently holds write locks before it can commit.
 The certify lock is not compatible with read locks, so the transaction may have to wait until the read locks on those items are released by the reading transactions.
 Once the certify locks are acquired, the committed version X of each data item is set to the value of version X', version X' is discarded, and the certify locks are released.
 Ensures schedules are recoverable and cascadeless.
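The key to multiversion 2PL is its three-mode compatibility table, which can be written down directly. The table below follows the rules just stated; the function name is illustrative.

```python
# Lock compatibility for multiversion 2PL: a write lock coexists with read
# locks (readers see the committed version), but the certify lock taken at
# commit time conflicts with reads.

COMPATIBLE = {
    ("read", "read"):      True,
    ("read", "write"):     True,   # readers keep using the committed version
    ("write", "read"):     True,
    ("write", "write"):    False,  # only one uncommitted version X' at a time
    ("read", "certify"):   False,  # committer must wait for readers to finish
    ("certify", "read"):   False,
    ("write", "certify"):  False,
    ("certify", "write"):  False,
    ("certify", "certify"): False,
}

def can_grant(held, requested):
    """True if `requested` can be granted while the `held` locks are out."""
    return all(COMPATIBLE[(h, requested)] for h in held)

print(can_grant(["write"], "read"))    # MV2PL lets readers proceed
print(can_grant(["read"], "certify"))  # committer waits for readers
```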
Deadlock Handling
 Consider the following two transactions:
 T1: write(X)        T2: write(Y)
     write(Y)            write(X)
 Schedule with deadlock:
 T1                      T2
 lock-X on X
 write(X)
                         lock-X on Y
                         write(Y)
 wait for lock-X on Y
                         wait for lock-X on X
Deadlock Handling
 Deadlock prevention protocols ensure that the system will never enter a deadlock state. Some prevention strategies:
 Require that each transaction locks all of its data items before it begins execution (pre-declaration). If any of the items cannot be obtained, none of the items are locked. In other words, a transaction requesting a new lock is aborted if a deadlock could occur.
 Use preemption and transaction rollback. With preemption, when T2 requests a lock that T1 holds, the lock granted to T1 may be preempted by rolling back T1 and granting the lock to T2.
 Allow the system to enter a deadlock state, and then apply deadlock detection and deadlock recovery schemes.
 Both deadlock prevention and recovery involve rollbacks. Prevention protocols may be used when the probability of the system entering a deadlock is high; otherwise a detection and recovery scheme is more efficient.
Deadlock Prevention Techniques
 There are a number of deadlock prevention techniques. These are:
 Conservative two-phase locking protocol
 Timestamp techniques:
 Wait-Die protocol
 Wound-Wait protocol
 No-Wait protocol
Conservative Two-Phase Locking Protocol
 Conservative two-phase locking:
 A transaction has to lock all items it will access before it begins to execute (locking otherwise proceeds as in ordinary two-phase locking).
 If it cannot acquire any one of its locks, it releases all items, aborts, and tries again.
 Deadlock cannot occur because there is no hold-and-wait.
 Once it starts, a transaction is always in its shrinking phase.
Conservative Two-Phase Locking Protocol
 Problems:
 What if a transaction cannot predetermine all the items it is going to use? (e.g. a sequence of interactive SQL statements comprising one database transaction)
 What if a database item that is already locked by another transaction would have been released very soon? (i.e. the transaction aborts in vain)
 Low data-item utilization: a data item may be locked yet unused for a long duration.
 Another variant of this approach:
 Impose a total order on the data items. Once a particular data item is locked, the transaction cannot request locks on items that precede it in the order.
 This scheme is easy to implement as long as the set of data items accessed is known.
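The all-or-nothing acquisition at the heart of conservative 2PL can be sketched as follows. The lock table is a plain dictionary standing in for a real lock manager; names are illustrative.

```python
# Sketch of conservative 2PL's pre-declaration step: either every needed
# item is locked before execution starts, or nothing is held.

def try_lock_all(lock_table, txn, items):
    """Atomically lock all `items` for `txn`; on any conflict, hold nothing."""
    free = all(lock_table.get(i) in (None, txn) for i in items)
    if not free:
        return False          # caller aborts and retries later
    for i in items:
        lock_table[i] = txn
    return True

table = {}
print(try_lock_all(table, "T1", ["X", "Y"]))  # both granted
print(try_lock_all(table, "T2", ["Y", "Z"]))  # Y is held by T1: nothing granted
print(table)                                  # T2 holds no locks at all
```

Because a transaction either holds everything it will ever need or holds nothing, the hold-and-wait condition for deadlock never arises.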
Using Timestamps (Wait-Die)
 Timestamps are used together with two-phase locking.
 The DBMS assigns a timestamp TS to each transaction T entering the system.
 If Ti starts before Tj, then TS(Ti) < TS(Tj) (Ti is older than Tj).
 Wait-die scheme (non-preemptive):
 An older transaction may wait for a younger one to release a data item. Younger transactions never wait for older ones; they are rolled back instead.
 A transaction may die several times before acquiring the needed data item.
 If Ti has higher priority (is older), it is allowed to wait; otherwise it is aborted.
 E.g. T22, T23, T24 have timestamps 5, 10, 15.
 If T22 requests a data item held by T23, then T22 will wait. If T24 requests the data item, then T24 will be rolled back.
Using Timestamps (Wound-Wait)
 Wound-wait scheme (preemptive):
 If an older transaction tries to lock an item held by a younger one, the older transaction wounds the younger one (causes it to abort and restart with the same timestamp).
 A younger transaction requesting an item held by an older one is allowed to wait.
 So the oldest transaction will always be allowed to finish.
 There may be fewer rollbacks than in the wait-die scheme.
 If the requester Ti has higher priority (is older), abort the holder Tj; otherwise Ti waits.
 E.g. T22, T23, T24 have timestamps 5, 10, 15.
 If T22 requests a data item held by T23, then the data item will be preempted from T23 and T23 will be rolled back. If T24 requests a data item that T23 is holding, then T24 will wait.
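The two decision rules fit in a few lines; smaller timestamp means older transaction, i.e. higher priority. Function names and return strings are illustrative.

```python
# Decision rules for the two timestamp-based deadlock prevention schemes.

def wait_die(ts_requester, ts_holder):
    """Non-preemptive: only an older requester may wait; a younger one dies."""
    return "wait" if ts_requester < ts_holder else "rollback requester"

def wound_wait(ts_requester, ts_holder):
    """Preemptive: an older requester wounds the holder; a younger one waits."""
    return "rollback holder" if ts_requester < ts_holder else "wait"

# T22, T23, T24 have timestamps 5, 10, 15 as in the examples above.
print(wait_die(5, 10))     # T22 requests T23's item: wait
print(wait_die(15, 10))    # T24 requests T23's item: rollback requester
print(wound_wait(5, 10))   # T22 wounds T23: rollback holder
print(wound_wait(15, 10))  # T24 waits
```

In both rules the older transaction is never the one rolled back, which is exactly why starvation is avoided.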
Deadlock Prevention Using Timestamps
 Both schemes avoid starvation, as the transaction with the smallest timestamp is never asked to roll back.
 Transactions that roll back are not assigned new timestamps, so at some point a transaction will hold the smallest timestamp value.
 In wait-die the older transaction waits for the younger transaction to complete, so the older a transaction gets, the more it tends to wait.
 In wait-die, if Ti dies and is rolled back, there is a chance of it re-issuing the same requests and dying multiple times. Such rollbacks may be fewer in wound-wait.
Deadlock Prevention (No-Wait)
 Both in the wait-die and in the wound-wait scheme, a rolled-back transaction is restarted with its original timestamp. Older transactions thus have precedence over newer ones, and starvation is hence avoided.
 Timeout-based schemes:
 A transaction waits for a lock only for a specified amount of time. After that, the wait times out and the transaction is rolled back.
 Thus deadlocks are not possible.
 Simple to implement, but starvation is possible. It is also difficult to determine a good value for the timeout interval.
Deadlock Detection Schemes
 Deadlock prevention is justified if transactions are long and use many items, or if the transaction load is very heavy.
 In many practical situations it is advantageous to do no deadlock prevention, but to detect deadlocks and then abort at least one of the transactions involved.
 An algorithm that examines the state of the system is invoked periodically to detect deadlocks.
 If deadlocks are found, then the system must attempt to recover from the deadlock.
 To do this the system requires:
 Knowledge of the data items currently held by transactions and of the outstanding data-item requests
 An algorithm to determine the deadlock condition
 A recovery process to recover from the deadlock
Deadlock Detection
 Deadlock detection using the wait-for-graph protocol:
 Construct a wait-for graph in which each transaction has its own node.
 If Ti waits on Tj, construct a directed edge from Ti to Tj.
 If a cycle is detected, select a victim and abort it.
 The victim selection algorithm should select and abort the transaction that has made the least number of updates.
 TIMEOUT protocol:
 If a transaction waits longer than a specified amount of time, it gets aborted.
 Here, deadlock is only suspected, not proved.
Deadlock Detection (Wait-For Graph)
 Deadlocks can be described by a wait-for graph, which consists of a pair G = (V, E):
 V is a set of vertices (all the transactions in the system)
 E is a set of edges; each element is an ordered pair Ti → Tj
 If Ti → Tj is in E, then there is a directed edge from Ti to Tj, implying that Ti is waiting for Tj to release a data item.
 When Ti requests a data item currently being held by Tj, the edge Ti → Tj is inserted in the wait-for graph. This edge is removed only when Tj is no longer holding a data item needed by Ti.
 The system is in a deadlock state if and only if the wait-for graph has a cycle.
 A deadlock-detection algorithm must be invoked periodically to look for cycles.
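A periodic detector only needs cycle detection over the wait-for edges, which a standard depth-first search provides. This is a minimal sketch; edge representation and names are illustrative.

```python
# Minimal wait-for graph cycle detection via DFS with three-color marking.

def has_cycle(edges):
    """`edges` maps each transaction to the transactions it waits for."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited / on current path / finished
    color = {t: WHITE for t in edges}

    def visit(t):
        color[t] = GREY
        for u in edges.get(t, []):
            if color.get(u, WHITE) == GREY:   # back edge: a cycle exists
                return True
            if color.get(u, WHITE) == WHITE and visit(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and visit(t) for t in edges)

print(has_cycle({"T1": ["T2"], "T2": []}))      # no deadlock
print(has_cycle({"T1": ["T2"], "T2": ["T1"]}))  # T1 and T2 are deadlocked
```

Running this on every lock request that must wait corresponds to the worst-case invocation policy mentioned below; running it on a timer corresponds to periodic detection.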
Deadlock Detection (Cont.)
Wait-for graph without a cycle
Wait-for graph with a cycle
Deadlock Detection (Wait-For Graph)
 When should we invoke the deadlock detection algorithm? It depends on:
 How often deadlocks occur
 How many transactions will be affected by the deadlock
 If deadlocks occur frequently, then invoke the algorithm more frequently.
 Data items allocated to deadlocked transactions are unavailable to other transactions until the deadlock is broken.
 In the worst case, invoke the algorithm on every request that needs to wait.
Deadlock Recovery
 When a deadlock is detected:
 Select a victim: some transaction will have to be rolled back (made a victim) to break the deadlock. Select as victim the transaction that will incur minimum cost. Factors that determine the cost of rollback include:
 How long the transaction has computed, and how much longer it will compute before it completes its designated task
 How many data items have been used, and how many more are required
 How many transactions will be involved in the rollback
Deadlock Recovery
 When a deadlock is detected (contd.):
 Rollback: determine how far to roll back the transaction.
 Total rollback: abort the transaction and then restart it.
 Partial rollback: it is more effective to roll the transaction back only as far as necessary to break the deadlock. This requires maintaining additional information about the state of all running transactions: the sequence of lock requests granted and updates performed.
 Starvation happens if the same transaction is always chosen as the victim. Include the number of rollbacks in the cost factor to avoid starvation.
Other Concurrency Control Issues
 Until now we have restricted our attention to read and write operations. This limits transactions to data items that already exist. But a transaction can also create and delete data items, and this can affect the concurrency control of the transaction.
 We will now see concurrency issues related to:
 Insert
 Delete
 Phantom records
Other Concurrency Control Issues
 Delete operation
 Deletion of a data item conflicts with read and write operations: if we read or write a deleted item, it will result in an error.
 Hence in 2PL an exclusive lock is required on a data item before it can be deleted.
 In the timestamp ordering protocol, if TS(Ti) < R-timestamp(Q), then Ti is deleting a value that has already been read by a transaction Tj with TS(Ti) < TS(Tj). Hence the delete is rejected and Ti is rolled back.
 If TS(Ti) < W-timestamp(Q), then Ti is deleting a value that has already been written by a transaction Tj with TS(Ti) < TS(Tj). Hence the delete is rejected and Ti is rolled back.
Insert Operation
 Insertion of a data item conflicts with read and write operations: no read or write can be performed before the item is inserted.
 In 2PL, an insert operation may be performed only if the transaction inserting the tuple has an exclusive lock on the tuple to be inserted.
 In the timestamp ordering protocol, after the insertion the R-timestamp and W-timestamp values of the data item are set to TS(Ti).
Other Concurrency Control Issues
 Phantom record
 A transaction locks the database items that satisfy a certain selection condition and updates them.
 During that update, another transaction inserts a new item that satisfies the same selection condition.
 After the update, but inside the same transaction, we suddenly discover the existence of a database item that has not been updated although it should have been (since it satisfies the selection condition).
 This database item, called a "phantom record", appeared because it did not exist when the locking was done.
Other Concurrency Control Issues
 Phantom record
 E.g. T1: select sum(salary) from emp where d_no = 5
      T2: insert into emp (emp_id, d_no, emp_name, salary)
          values (25, 5, 'xyz', 4500)
 Let S be a schedule involving T1 and T2. Although the two transactions do not access any tuples in common, they conflict on a phantom tuple.
 If concurrency control is done at the tuple level, this conflict can go undetected and the system could fail to prevent a non-serializable schedule. This phenomenon is called the "phantom phenomenon".
 To prevent the phantom phenomenon, T1 should prevent other transactions from creating new tuples in the emp table where d_no = 5.
 Hence it is not enough to lock the tuples that are accessed; we also need to lock the information used to find the tuples that are accessed.
 To manage this, the index locking technique is used. Any transaction that inserts a tuple into a relation must insert information into each of the relation's indices, and locking is done on the indices as well. This eliminates the phantom phenomenon.
B+-Tree for account file with n = 3
Insertion of "Clearview" into the B+-tree of Figure 16.21
Index Locking Protocol
 Index locking protocol:
 Every relation must have at least one index.
 A transaction can access tuples only after finding them through one or more indices on the relation.
 A transaction Ti that performs a lookup must lock all the index leaf nodes that it accesses in S-mode, even if a leaf node does not contain any tuple satisfying the lookup (e.g. for a range query, no tuple in the leaf is in the range).
 A transaction Ti that inserts, updates or deletes a tuple ti in a relation r:
 must update all indices of r
 must obtain exclusive locks on all index leaf nodes affected by the insert/update/delete
 The rules of the two-phase locking protocol must be observed.
 This guarantees that the phantom phenomenon won't occur.
Concurrency Control in Indexes
 2PL locking can be applied, but this would mean holding the locks until the shrinking phase, which is expensive.
 Crabbing protocol:
 Takes advantage of the B-tree structure of the index.
 An index search always traverses from the root to a leaf.
 When searching, acquire a shared lock on each node. After acquiring a lock on the child node, release the lock on the parent, as the parent node will not be required any further.
 When an insertion or deletion is being done, follow the same mechanism as for a search, acquiring and releasing shared locks on the internal nodes.
 Acquire an exclusive lock on the leaf node to insert or delete key values. If the node is not full, no further change is required. If the node is full (a split is needed), upgrade the lock on the parent to exclusive.
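The hand-over-hand descent of the crabbing protocol can be sketched as follows. The Node class and the use of a plain mutex in place of a shared lock are simplifying assumptions for illustration.

```python
# Sketch of crabbing: lock the child, then release the parent, so at most
# two nodes on the root-to-leaf path are locked at any moment.

import threading

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []   # empty for leaf nodes
        self.lock = threading.Lock()     # stands in for a shared lock

def crab_to_leaf(root, key):
    """Descend from the root to the leaf responsible for `key`, crabbing locks."""
    node = root
    node.lock.acquire()
    while node.children:
        # Pick the child subtree that covers `key` (B-tree style routing).
        idx = sum(1 for k in node.keys if key >= k)
        child = node.children[idx]
        child.lock.acquire()             # lock child before releasing parent
        node.lock.release()
        node = child
    node.lock.release()                  # a real insert would keep/upgrade this
    return node

leaf_a = Node([5, 8])
leaf_b = Node([20, 30])
root = Node([10], [leaf_a, leaf_b])
print(crab_to_leaf(root, 25).keys)       # reaches the right-hand leaf
```

For an insert, the final leaf lock would be exclusive rather than shared, and would be retained (with the parent re-locked exclusively) if a split is required.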
B-Link Protocol
 Sibling nodes at each level are linked.
 For a lookup/search, shared locks are requested, and the lock on the parent is released before accessing the child node.
 For an insert or delete operation, proceed as in a lookup and upgrade the shared lock on the leaf node to exclusive.
 If a split occurs, relock the parent in exclusive mode. If a split occurs concurrently with a search, the search can still locate the key through the right-sibling link.
Transaction Support in SQL-92
 Higher levels of consistency allow programmers to ignore concurrency issues, whereas weaker levels of consistency place an additional burden on the programmers to maintain consistency.

 Isolation Level   | Dirty Read | Unrepeatable Read | Phantom Problem
 ------------------+------------+-------------------+----------------
 Read Uncommitted  | Maybe      | Maybe             | Maybe
 Read Committed    | No         | Maybe             | Maybe
 Repeatable Reads  | No         | No                | Maybe
 Serializable      | No         | No                | No
Weak Levels of Consistency in SQL
 SQL allows non-serializable executions:
 Serializable: the default in the standard
 Repeatable read: allows only committed records to be read, and repeating a read should return the same value (so read locks should be retained)
 However, the phantom phenomenon need not be prevented: T1 may see some records inserted by T2, but may not see others inserted by T2
 Read committed: same as degree-two consistency, but most systems implement it as cursor stability
 Read uncommitted: allows even uncommitted data to be read
 In many database systems, read committed is the default consistency level:
 It has to be explicitly changed to serializable when required, e.g.
 set transaction isolation level serializable