Download Transaction Processing

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Consistency model wikipedia , lookup

Relational model wikipedia , lookup

Database wikipedia , lookup

Global serializability wikipedia , lookup

Database model wikipedia , lookup

Clusterpoint wikipedia , lookup

Object-relational impedance mismatch wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

Commitment ordering wikipedia , lookup

Versant Object Database wikipedia , lookup

Concurrency control wikipedia , lookup

Serializability wikipedia , lookup

Transcript
Enterprise Database Systems
Transaction Processing
Technological Educational Institution of Larissa
in collaboration with Staffordshire University
Larissa 2008
Dr. Theodoros Mitakos [email protected]
Agenda
Transaction Processing (Concepts and
Theory)
 Concurrency Control Techniques

2
Introduction


A transaction is an executing program that forms a
logical unit of database processing. A transaction
includes one or more database access operations
(these can include insertion, deletion, modification or
retrieval operations). It must be completed in its
entirety to ensure correctness.
Transaction processing systems are systems with
large databases and hundreds of concurrent users
that are executing database transactions (e.g.
reservations, banking, credit card processing,
supermarkets)
3
Database Systems Classification





The data model (e.g. relational, object
oriented, network)
The number of users supported (multiuser,
single user)
The number of sites over which the database
is distributed (centralized, distributed,
federated)
The cost of the DBMS (low end, expensive)
General purpose, special purpose DBMSs (eg
traditional systems, online transaction
systems)
4
Database Systems Classification
according of the number of users
Single user: At most one user can use
the system at a time.
 Multi-user: Many users can access the
database concurrently.
 Concurrent execution of processes is
interleaved in multiprogrammed
operating systems

5
Example in SQL Server
BEGIN TRANSACTION /* the beginning of the transaction */
UPDATE employee
SET emp_no = 39831
WHERE emp_no = 10102
IF(@@error <> 0)
ROLLBACK TRANSACTON /* Rollback of the transaction */
UPDATE works_on
SET emp_no = 39831
WHERE emp_no = 10102
IF(@@error <> 0)
ROLLBACK TRANSACTON /* Rollback of the transaction */
COMMIT TRANSACTION /* The end of the transaction */
The consistent state of data can be obtained only if both UPDATE or neither
of them are executed
6
TRANSACT-SQL STATEMENTS
AND TRANSACTIONS
BEGIN TRANSACTION
 BEGIN DISTRIBUTED TRANSACTION
 COMMIT TRANSACTION
 ROLLBACK TRANSACTION
 SAVE TRANSACTION

7
TRANSACT-SQL STATEMENTS
AND TRANSACTIONS
BEGIN TRANSACTION [transaction name]
[WITH MARK [‘description’]]
transaction name is the name assigned to a transaction
WITH MARK option specifies that the transaction is to be marked in the log
 A distributed transaction is one that involves databases on more than
one server
 The COMMIT TRANSACTION statement successfully ends the
transaction started with the BEGIN TRANSACTION statement. This
means all modifications made by the transaction are stored on the disk.
 The SAVE TRANSACTION statement set a save point within a
transaction. A save point marks a specified point within the transaction
so that all updates that follow can be cancelled without canceling the
entire transaction.

8
EXAMPLE
BEGIN TRANSACTION
INSERT INTO department(dept_no, dept_name)
VALUES(‘d4’,’Sales’)
SAVE TRANSACTION a
INSERT INTO department(dept_no, dept_name)
VALUES(‘d5’,’Research’)
SAVE TRANSACTION b
INSERT INTO department(dept_no, dept_name)
VALUES(‘d6’,’Management’)
ROLLBACK TRANSACTION b
INSERT INTO department(dept_no, dept_name)
VALUES(‘d7’,’Support’)
ROLLBACK TRANSACTION a
COMMIT TRANSACTION
9

Each Transact-SQL statement always belongs
implicitly or explicitly to a transaction.
 When a session operates in implicit
transaction mode , selected statements
implicitly issue the BEGIN TRANSACTION
statement. This means that you do nothing to
start such a transaction. However, the end of
each implicit transaction must be explicitly
committed or rolled back using COMMIT
statement.
10
Transaction Logging

SQL server keeps record of each change it
makes to the database during a transaction.
This is necessary in case an error occurs
during the execution of the transaction. In this
case all previously executed statements within
the transaction have to rolled back.
 SQL server keeps the before and after values
in the transaction log. The transaction log is
used to roll back or restore a transaction.
11
Locking

Locks can be applied to the following
database objects:

Row
 Page (main unit of data storage in SQL server)
 Index
 Extend (physical unit of storage in which space is allocated to a
table)


Table
Database itself
The size of the object that is locked is called granularity
12
Kinds of locks







Row level – Page level
Shared (S)
Exclusive (X)
Update (U)
A shared lock reserves a database object for reading
only
An exclusive lock reserves a page or row for the
exclusive use of a single transaction
An update lock can be placed only if no other update
or exclusive lock exists. On the other hand, it can be
placed on objects that already have shared locks
13
Table level locks






Shared (S)
Exclusive (X)
Intended share (IS)
Intended exclusive (IX)
Shared with intent exclusive (SIX)
Generally, an indent lock shows an intention to lock
the next lower object in the hierarchy of the database
objects. Therefore, intent locks are placed at a level in
the hierarchy above that witch the process intends to
lock.
14
Basic database access operations

Read_item(X): reads a database item X into a program variable




Find the address of the disk block that contains item X
Copy that disk block into a buffer in main memory
Copy item X from the buffer to the program variable named X
Write_item(X) Writes the value of program variable X into the database
item X




Find the address of the disk block that contains item X
Copy that disk block into a buffer in main memory
Copy item X from the program variable named X into its correct location in the
buffer
Store the updated block from the buffer back to disk
15
Example transactions







T1
Read_item(X);
X:=X-N;
Write_item(X);
Read_item(Y);
Y:=Y+N;
Write_item(Y);

T2
 Read_item(X);
 X:=X+M;
 Write_item(X);
16
Why concurrency control is needed

The lost update problem. This problem
occurs when two transactions that
access the same database items have
their operations interleaved in a way that
makes the value of some database items
incorrect.
17
Example



T1
Read_item(X);
X:=X-N;

T2

Read_item(X);
 X:=X+M;


Write_item(X);
Read_item(Y);



Write_item(X);
Y:=Y+N;
Write_item(Y);
18

Temporary Update (or Dirty read)
problem: The problem occurs when one
transaction updates a database item and
then the transaction fails for some
reason. The updated item is accessed by
another transaction before it is changed
back to its original value.
19
Example




T1
Read_item(X);
X:=X-N;
Write_item(X);

T2


Read_item(X);
X:=X+M;
Write_item(X);

The value of X is called dirty data



Read_item(Y);
Transaction T1 fails and must change the
value of X back to its old value; meanwhile
T2 has read the “temporary” incorrect
value of X
20

The incorrect summary problem: If one
transaction is calculating an aggregate
summary function on a number of
records while other transactions are
updating some of these records, the
aggregate function may calculate some
values before they are updated and
others after they are updated.
21
Example

T1



Read_item(X);
X:=X-N;
Write_item(X);

T3
 Sum:=0;
 Read_iem(A);
 sum:=sum+A;




Read_item(X);
Sum:=sum+X;
Read_item(Y);
Sum:=sum+Y;


Read_item(Y);
Y:=Y+N;
 Write_item(Y);
22
Types of Failures






A system crash
A transaction or system error (e.g. division by
zero)
Local errors or exception conditions detected
by the transaction (e.g. data for the transaction
not found)
Concurrency control enforcement (abort the
transaction)
Disk failure (read/write malfunction)
Physical problems and catastrophes.
23
Transaction states
Begin transaction
active
End transacion
commit
Partially commited
commited
abort
abort
failed
terminated
24
Commit point

A transaction T reaches its commit point
when all its operations that access the
database have been executed
successfully and the effect of all
transaction operations on the database
have been recorded in the log. Beyond
the commit point the transaction is said
to be committed and its effect is
assumed to be permanently recorded in
the database
25
Desirable properties of transactions




Atomicity: A transaction is an atomic unit of
processing; it is either performed in its entirety or not
performed at all
Consistency preservation: A transaction is consistency
preserving if its complete execution takes the
database from one consistent state to another
Isolation: A transaction should appear as though it is
being executed in isolation from other transactions
Durability: The changes applied to the database by a
committed transaction must persist in the database.
26
Schedules Histories of transactions
A schedule S of n transactions T1, T2, …, Tn
is an ordering of the operations of the
transactions subject to the constraint that, for
each transaction Ti that participates in S, the
operations of Ti in S must appear in the same
order in which they occur in Ti.
 However operations from other transactions Tj
can be interleaved with the operations of Ti.
 We are interested in read_item, write_item,
commit and abort operations.

27
Conflict
Two operations in a schedule are said to
conflict if:
 Belong to deferent transactions
 Access the same item X
 At least one of the operations is
write_item(X)

28
Characterizing schedules based on
recoverability




Recoverable schedules: Once a transaction T is
committed it should never be necessary to rollback T.
The other are called nonrecoverable.
A schedule S is recoverable if no transaction T in S
commits until all transactions T’ that have written an
item that T reads have committed.
A schedule is said to be cascedeless or avoid
cascading rollback if every transaction in the schedule
reads only items that were written by committed
transactions.
A schedule is called strict if transactions can neither
read nor write an item X until the last transaction that
wrote X has committed.
29
Characterizing schedules based on
serializability

A schedule S is serial if for every transaction T
participating in the schedule, all operations of
T are executed consequently in the schedule;
otherwise the schedule is called nonserial.
(Only one transaction at a moment is active)
 A schedule S of n transactions is serializable if
it is equivalent to some serial schedule of the
same n transactions.
30
Defining equivalence


Two schedules are said to be conflict equivalent if the
order of any two conflicting operations is the same in
both schedules.
View equivalence:

The idea is as long as each read operation of a transaction
reads the result of the same write operation in both schedules
the write operations of each transaction must produce the
same results. The read operations are said to see the same
view in both schedules.
31

In most practical systems methods are
determined to ensure serializability
without having to test schedules
themselves (there exist algorithms
though that test if schedules are
serializable). According to this approach
protocols (set of rules) ensure
serializability of all schedules in which
transactions participate.
32
Conditions under which schedule Sh is considered correct even
though it is not serializable.

With the additional knowledge, or semantics,
that the operations between each ri(I) and wi(I)
are commutative, we know that the order of
executing the sequences consisting of (read,
update, write) is not important as long as each
(read, update, write) sequence by a particular
transaction Ti on a particular item I is not
interrupted by conflicting operations.
33
Types of locks

Binary locks: A binary lock can have two
states or values (locked and unlocked, 0 or 1
for simplicity). A distinct lock is associated with
each database item. If the value of the lock on
X is 1, item X cannot be accessed y a
database operation that requests the item. If
the value is 0 the item can be accessed when
requested. Lock_item(X), unlock_item(X)
34
Types of locks

Shared/Exclusive (read/write)
locks:Allows several transactions to
access the same item X if they all
access x for reading purposes only.
However, if a transaction is to write an
item X it must have exclusive access to
X. read_lock(X), write_lock(X), unlock(X)
35
Conversion of locks: A transaction that
already holds a lock on item X is allowed
under certain conditions to convert the
lock from one locked state to another.
 Using binary locks or read/write locks in
transactions does not guarantee
serializability of schedules on its own.

36
Two phase locking

A transaction is said to follow the two phase
locking protocol if all locking operations
read_lock, write lock precede the first unlock
operation in the transaction.
 Expanding (first) phase during which new
locks on items can be acquired but none can
be released.
 Shrinking (second) phase during which
existing locks can be released but no new
locks can be acquired.
37
Example T1 and T2 follow the two
phase locking protocol









T1
Read_lock(Y);
Read_item(Y);
Write_lock(X);
Unlock(Y);
Read_item(X);
X:=X+Y;
Write_iem(X);
Unlock(X);









T2
Read_lock(X);
Read_item(X);
Write_lock(Y);
Unlock(X);
Read_item(Y);
Y:=Y+X;
Write_iem(Y);
Unlock(Y);
38
Example T1 and T2 do not follow
the two phase locking protocol




T1
Read_lock(Y);
Read_item(Y);
Unlock(Y);

T2

Read_lock(X);
Read_item(X);
Unlock(X);
Write_lock(Y);
Read_item(Y);
Y:=Y+X;
Write_iem(Y);
Unlock(Y);






Write_lock(X);
Read_item(X);
X:=X+Y;
Write_iem(X);
Unlock(X);






Nonserializable transactions
39
Variations of two phase locking



Basic 2PL
Conservative 2PL: Requires a transaction to lock all
items it access before the transaction begins
execution by predeclaring its readset (all items a
transaction writes) and writeset (All items a transaction
writes). If any of the preceding items cannot be locked
the transaction waits until all items are available for
locking. It is deadlock free protocol but difficult to
implement.
Strict 2PL: The transaction does not release any of
the write locks until after it commits or aborts. It is not
deadlock free.
40
Deadlock

Deadlock occurs when each transaction
T in a set of two or more transactions is
waiting for some item that is locked by
some other transaction T’ in the set.
Hence, each transaction in the set is in a
waiting queue waiting for one of the
other transactions in the set to release
the lock on an item.
41
Deadlock prevention protocols
E.g. Conservative 2PL protocol.
 Order all items in the database and
make sure that a transaction that needs
several items will lock them according to
that order. (it is not practical in database
context)
 A transaction time stamp TS(T) is a
unique identifier assigned to each
transaction.

42
Time stamp deadlock solutions
Suppose that transaction Ti tries to lock an item X but is
not able to because X is locked by some other
transaction Tj with a conflicting lock.
 Wait-die: if TS(Ti)<TS(Tj) then (Ti is older than Tj) Ti is
allowed to wait; otherwise (Ti younger than Tj) abort Ti
(Ti dies) and restart it later with the same timestamp.
 Wound-wait: if TS(Ti)<TS(Tj) then (Ti is older than Tj)
abort Tj (Ti wounds Tj) and restart it later with the
same timestamp; otherwise (Ti younger than Tj) Ti is
allowed to wait.
 Both schemas end up aborting the younger of the two
transactions that may be involved in a deadlock.
43
Deadlock detection and timeouts




It is a deadlock detection method.
It is better applied when the transactions are short and
each transaction locks only a few items or the
transaction load is light.
Construct a wait for graph. Whenever a transaction Ti
is waiting to lock an item X that is currently locked by a
transaction Tj a directed edge is created in the wait
for graph.
Timeouts: if a transaction waits for a period longer
than a system defined timeout period the system
assumes that the transaction may be deadlocked and
aborts it.
44
Starvation
Starvation occurs when a transaction
cannot proceed for an indefinite period of
time while other transactions in the
system continue normally.
 Solution: fair waiting scheme like using a
first come first served queue. The wait
die and wound wait schemes avoid
starvation.

45
Granularity level considerations
The larger the data item size the lower
the degree of concurrency permitted.
 The smaller the data item size is the
more the items in the database (large
number of locks to be handled by the
system lock manager - more lock and
unlock operations will be permitted
causing a higher overhead.)

46
Multiple granularity level locking

The best granularity size depends on the given
transaction. A database system should support
multiple levels of granularity.
 Intention locks: A transaction indicates along
the path from the root to the desired node
what type of lock shared or exclusive it will
require from one of the node’s descendants.
47