CS 292 Special Topics on Big Data
Yuan Xue ([email protected])

Part I: Relational Database (Transaction)
Yuan Xue ([email protected])

Review and Look Forward

• What we know so far
  – Design: database schema
    Optimization objective: minimum redundancy with information preservation
  – Operation: database access and manipulation via SQL
  – Design: how data is stored in the database
  – Operation: how data is accessed and manipulated; how to configure indexes via SQL
  – Optimization: how data storage and access methods (index creation) can be designed so that application (SQL query) execution time is minimized
• Next step
  – More on DB operation: how constraints/integrity are ensured

Design workflow: Conceptual Design (Entity/Relationship model) → Logical Design (data model mapping) → Logical Schema → Normalization → Normalized Schema → Physical Design → Physical (Internal) Schema

Motivation Example For Transaction

• Scenario
  – Efficient support for counting Followers, Followees, Tweets
  – A new table: Number
  – Consider two operations:
    Bob2013 unfollows Alice00
    Dave11 follows Alice00

Number
User ID    NumFollower   NumFollowee   NumTweet
Alice00    6             7             8
Bob2013    1             2             3
Cathy123   6             9             12

Operation by Bob2013 on Number:
  X := SELECT NumFollower FROM Number WHERE UserID = Alice00
  X := X - 1
  UPDATE Number SET NumFollower = X WHERE UserID = Alice00

Operation by Dave11 on Number:
  X := SELECT NumFollower FROM Number WHERE UserID = Alice00
  X := X + 1
  UPDATE Number SET NumFollower = X WHERE UserID = Alice00

Motivation Example For Transaction

• Scenario
  – Efficient support for counting Followers, Followees, Tweets
  – A new table: Number
  – Consider two operations:
    Bob2013 unfollows Alice00
    Dave11 follows Alice00

Number
User ID    NumFollower   NumFollowee   NumTweet
Alice00    6  (= X)      7             8
Bob2013    1             2             3
Cathy123   6             9             12

Operation by Bob2013 on Number:
  X := SELECT NumFollower FROM Number WHERE UserID = Alice00    [Read(X)]
  X := X - 1
  UPDATE Number SET NumFollower = X WHERE UserID = Alice00      [Write(X)]

Operation by Dave11 on Number:
  X := SELECT NumFollower FROM Number WHERE UserID = Alice00    [Read(X)]
  X := X + 1
  UPDATE Number SET NumFollower = X WHERE UserID = Alice00      [Write(X)]

Possible Outcome (I)

Case A (Bob2013 first, then Dave11):
  Bob2013:  Read(X); X := X - 1; Write(X)
  Dave11:   Read(X); X := X + 1; Write(X)

Case B (Dave11 first, then Bob2013):
  Dave11:   Read(X); X := X + 1; Write(X)
  Bob2013:  Read(X); X := X - 1; Write(X)

No interleaving: serial operation; the order does not matter.

Possible Outcome (II)

Interleaved execution:
  Bob2013:  Read(X)
  Bob2013:  X := X - 1
  Dave11:   Read(X)
  Dave11:   X := X + 1
  Dave11:   Write(X)
  Bob2013:  Write(X)

Lost Update: Dave11's update is overwritten by Bob2013's write.
Final result: X = ?

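To make the lost update concrete, here is a minimal Python sketch (my own illustration, not from the slides) that replays this exact interleaving against an in-memory value; the variable names x_bob and x_dave are hypothetical.

    # Replay the interleaving above on Alice00's follower count (X = 6 initially).
    X = 6                 # shared value stored in the database

    x_bob = X             # Bob2013: Read(X)
    x_bob = x_bob - 1     # Bob2013: X := X - 1
    x_dave = X            # Dave11:  Read(X)  (still sees the old value, 6)
    x_dave = x_dave + 1   # Dave11:  X := X + 1
    X = x_dave            # Dave11:  Write(X) -> 7
    X = x_bob             # Bob2013: Write(X) -> 5; Dave11's update is lost

    print(X)              # prints 5; either serial order would give 6
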
Motivation Example For Transaction

• Scenario
  – Efficient support for counting Followers, Followees, Tweets
  – A new table: Number
  – Consider two operations:
    Bob2013 unfollows Alice00
    Dave11 follows Alice00

Getting more complicated: operate on two relations simultaneously.

Number
User ID    NumFollower   NumFollowee   NumTweet
Alice00    6  (= X)      7             8
Bob2013    1             2             3
Cathy123   6             9             12

Follow
Followee   Follower   Timestamp
Alice00    Bob2013    2011.1.1.3.6.6     (= Y)
Bob2013    Cathy123   2012.10.2.6.7.7
Alice00    Cathy123   2012.11.1.2.3.3
Cathy123   Alice00    2012.11.1.2.6.6
Bob2013    Alice00    2012.11.1.2.6.7
Alice00    Dave11     2012.11.1.2.6.7    (= Z, new tuple)

Operation by Bob2013 on both Number and Follow:
  Read(X); X := X - 1; Write(X); Delete(Y)

Operation by Dave11 on both Number and Follow:
  Read(X); X := X + 1; Write(X); Insert(Z)

Possible Outcome (II)

Interleaved execution:
  Bob2013:  Read(X)
  Bob2013:  X := X - 1
  Bob2013:  Write(X)
  Dave11:   Read(X)          [Dirty Read: reads Bob2013's uncommitted value]
  Dave11:   X := X + 1
  Dave11:   Write(X)
  Dave11:   Write(Z)
  Bob2013:  Failed! [Roll back value of X]; Delete(Y) never executes

• Dave11 gets the wrong number
• "Inconsistent" database state

Transactions

Solution: group the operations into one unit -- a transaction.

  Operation by Dave11 (as one transaction):
    Read(X)
    X := X + 1
    Write(X)
    Write(Z)

• Transaction
  – An executing program that includes some database operations (e.g., read, write).
  – These operations together form an atomic unit of work against the database.
  – At the end of a transaction, the database must be in a valid or consistent state that satisfies all the constraints specified on the database schema.
• Two main purposes:
  – Concurrent database access
    – Isolation between programs accessing a database concurrently.
  – Resilience to system failures
    – Reliable units of work that allow correct recovery from failures.
    – Keep a database consistent even in cases of system failure, when execution stops (completely or partially) and many operations upon a database remain uncompleted, with unclear status.

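As an illustration of grouping the two updates into one unit (not the lecture's code), the following Python sketch uses the standard library sqlite3 module with simplified stand-ins for the Number and Follow tables; either both changes become durable at commit, or the rollback discards both.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Number (UserID TEXT PRIMARY KEY, NumFollower INT)")
    conn.execute("CREATE TABLE Follow (Followee TEXT, Follower TEXT)")
    conn.execute("INSERT INTO Number VALUES ('Alice00', 6)")
    conn.execute("INSERT INTO Follow VALUES ('Alice00', 'Bob2013')")
    conn.commit()

    # Bob2013 unfollows Alice00: both changes succeed together or not at all.
    try:
        conn.execute("UPDATE Number SET NumFollower = NumFollower - 1 "
                     "WHERE UserID = 'Alice00'")                            # Write(X)
        conn.execute("DELETE FROM Follow "
                     "WHERE Followee = 'Alice00' AND Follower = 'Bob2013'") # Delete(Y)
        conn.commit()        # make both effects durable together
    except sqlite3.Error:
        conn.rollback()      # on failure, the follower count is restored as well
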
Transaction Properties

• ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that database transactions are processed reliably.
• Atomicity: "all-or-nothing" proposition
  – Each work unit performed in a database must either complete in its entirety or have no effect whatsoever.
• Consistency: database constraint conformance
  – The state of the whole database: a database state that obeys all the constraints.
  – Each client, each transaction:
    – can assume all constraints hold when the transaction begins;
    – must guarantee all constraints hold when the transaction ends.
• Isolation: serial equivalence
  – Operations may be interleaved, but execution must be equivalent to some sequential (serial) order of all transactions.
• Durability: durable storage
  – If the system crashes after a transaction commits, all effects of the transaction remain in the database.

Constraints

• Database constraints
  – Inherent model-based constraints
    – Inherent constraints of the relational model (e.g., no duplicate tuples)
  – Explicit schema-based constraints
    – Expressed in the schema of the relational model via DDL
  – Application-based constraints (semantic constraints / business rules)
    – Cannot be expressed in the schema; expressed and enforced by the application programs
• Read FDS Section 3.2 for details

Schema-based constraints

• Specified on a database schema; hold on every valid database state of this schema (time invariant)
• Domain constraint
  – Within each tuple, the value of each attribute must be an atomic value from the domain
• Key constraints
  – Uniqueness
• Constraints on NULL
• Entity integrity constraint
  – No primary key value can be NULL
• Referential integrity constraint
  – Specified between two relations, where one has a foreign key that refers to the other

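For concreteness, here is a hedged sketch (not from the slides) of how such schema-based constraints might be declared in DDL, using sqlite3 and hypothetical table definitions modeled on the Number and Follow tables above.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when enabled
    conn.executescript("""
    CREATE TABLE Number (
        UserID      TEXT PRIMARY KEY,                            -- key constraint + entity integrity (key cannot be NULL)
        NumFollower INTEGER NOT NULL CHECK (NumFollower >= 0),   -- NULL constraint + domain constraint
        NumFollowee INTEGER NOT NULL,
        NumTweet    INTEGER NOT NULL
    );
    CREATE TABLE Follow (
        Followee  TEXT NOT NULL REFERENCES Number(UserID),       -- referential integrity
        Follower  TEXT NOT NULL REFERENCES Number(UserID),
        Timestamp TEXT,
        UNIQUE (Followee, Follower)                              -- uniqueness
    );
    """)

    # Violating referential integrity is rejected: 'Nobody' is not in Number.
    try:
        conn.execute("INSERT INTO Follow VALUES ('Nobody', 'Alice00', '2012.11.1')")
    except sqlite3.IntegrityError as e:
        print("rejected:", e)
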
Transaction Support

• Concurrency control
  – Deals with interleaved transactions
  – Guarantees isolation and database consistency
• Recovery mechanisms
  – Deal with system failures
  – Guarantee all-or-nothing execution, database consistency, and durability

Concurrency Control

• Simple solution for isolation
  – If transactions are executed serially, i.e., sequentially with no overlap in time, no transaction concurrency exists.
• Then why is concurrency control needed?
  – Problem: performance suffers.
  – Most high-performance transactional systems need to run transactions concurrently to meet their performance requirements.

Concurrency Control Techniques

• Locking
  – Locking basics
  – Two-Phase Locking
  – Dealing with deadlocks
• Timestamp ordering
• Multiversion concurrency control (MVCC)

Locking for Concurrency Control

• Lock
  – A variable Lock(X) associated with a data element X [each element has a unique lock]
  – Describes the status of the element X
• Types of locks
  – Binary lock [two states]: locked, unlocked
  – Read-write lock [three states]: read_locked, write_locked, unlocked

Binary lock operation:
  – Each transaction (T) must first acquire Lock(X) before reading/writing X.
  – If Lock(X) is held by another transaction (T'), then wait.
  – The transaction (T) must release Lock(X) after all reading/writing operations on X are completed.
  – The transaction (T) does not need to acquire Lock(X) if it already holds it [reentrant lock].

Read-write lock operation:
  – Each transaction (T) must issue read_lock(X) or write_lock(X) before reading X.
  – Each transaction (T) must issue write_lock(X) before writing X.
  – Lock acquisition follows the read-write lock rule.
  – The transaction (T) must issue unlock(X) after all reading/writing operations on X are completed.
  – Lock conversion and upgrade

Implementation of Lock

Lock table implementation using a hash table:
• Transactions T1, T2, ..., Tn send lock requests to the scheduler (lock manager subsystem).
• The lock manager keeps a lock table: a hash function maps each data element X to the lock info for X.
• The lock info records the lock state, the number of reads, and the IDs of the locking transactions.
• If an element is not found in the hash table, it is unlocked.

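A minimal sketch of such a lock table (my own illustration, not the lecture's code): a Python dict plays the role of the hash table, entries record state, read count, and holders, and a missing entry means unlocked. Blocking/waiting and error handling are omitted, and all names are hypothetical.

    # Toy lock table for shared (read) / exclusive (write) locks on data elements.
    lock_table = {}   # element -> {"state": "read"|"write", "reads": int, "holders": set}

    def read_lock(element, txn_id):
        """Grant a shared lock if the element is unlocked or read-locked."""
        entry = lock_table.get(element)
        if entry is None:                          # not in the table => unlocked
            lock_table[element] = {"state": "read", "reads": 1, "holders": {txn_id}}
            return True
        if entry["state"] == "read":
            entry["reads"] += 1
            entry["holders"].add(txn_id)
            return True
        return False                               # write-locked by another transaction: must wait

    def write_lock(element, txn_id):
        """Grant an exclusive lock only if no other transaction holds the element."""
        entry = lock_table.get(element)
        if entry is None:
            lock_table[element] = {"state": "write", "reads": 0, "holders": {txn_id}}
            return True
        if entry["holders"] == {txn_id}:           # lock conversion/upgrade by the sole holder
            entry["state"] = "write"
            return True
        return False                               # conflicting holders: must wait

    def unlock(element, txn_id):
        """Release txn_id's lock; remove the entry once no holders remain (=> unlocked)."""
        entry = lock_table[element]
        entry["holders"].discard(txn_id)
        if entry["state"] == "read":
            entry["reads"] -= 1
        if not entry["holders"]:
            del lock_table[element]
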
Two Phase Locking

• Would locking alone solve the problem? Lock is a basic mechanism; it needs to work with a protocol for how to apply locks in order to support correct concurrency control.

T1:
  Read_lock(Y)
  Read(Y)
  unlock(Y)
  Write_lock(X)
  Read(X)
  X := X + Y
  Write(X)
  Unlock(X)

T2:
  Read_lock(X)
  Read(X)
  unlock(X)
  Write_lock(Y)
  Read(Y)
  Y := X + Y
  Write(Y)
  Unlock(Y)

Two Phase Locking

• Locks are applied and removed in two phases:
  – Expanding phase: locks are acquired and no locks are released.
  – Shrinking phase: locks are released and no locks are acquired.

T1 (following 2PL):
  Read_lock(Y)
  Read(Y)
  Write_lock(X)
  unlock(Y)
  Read(X)
  X := X + Y
  Write(X)
  Unlock(X)

T2 (following 2PL):
  Read_lock(X)
  Read(X)
  Write_lock(Y)
  unlock(X)
  Read(Y)
  Y := X + Y
  Write(Y)
  Unlock(Y)

• It can be proved that if every transaction in a schedule follows the two-phase locking protocol, the schedule is guaranteed to be serializable (serializability is defined in the Schedule slides below).

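As an informal illustration (not part of the slides), the sketch below enforces the two-phase rule for a single transaction: once any lock has been released, further acquisitions are rejected. The class and method names are hypothetical.

    class TwoPhaseTxn:
        """Tracks one transaction's lock calls and enforces the two-phase rule:
        every acquisition (expanding phase) must come before every release
        (shrinking phase)."""

        def __init__(self, txn_id):
            self.txn_id = txn_id
            self.held = set()        # elements currently locked by this transaction
            self.shrinking = False   # becomes True at the first unlock

        def lock(self, element, mode="read"):
            if self.shrinking:
                raise RuntimeError(f"{self.txn_id}: 2PL violation, lock after unlock")
            self.held.add((element, mode))

        def unlock(self, element):
            self.shrinking = True    # the transaction enters its shrinking phase
            self.held = {(e, m) for (e, m) in self.held if e != element}

    # T1 from the slide, rewritten to follow 2PL:
    t1 = TwoPhaseTxn("T1")
    t1.lock("Y", "read")     # expanding phase
    t1.lock("X", "write")
    t1.unlock("Y")           # shrinking phase begins
    t1.unlock("X")
    # t1.lock("Z")           # would raise: acquisition after the first unlock
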
Problem: Deadlock

• Deadlock prevention schemes based on transaction timestamps:
  – Wait-die: an older transaction may wait for a younger transaction; a younger transaction does not wait for an older one and dies (is aborted) instead.
  – Wound-wait: a younger transaction may wait for an older transaction; an older transaction preempts (wounds) a younger transaction rather than waiting.

Review – Transaction

• Problem
  – The application must perform multiple operations (read, write) on the database as a unit to keep the database in consistent states.
  – Concurrent execution of operations is needed for performance reasons.
• Solution: a transaction bundles multiple operations into one unit.
  – ACID properties
  – Application: defines transactions
  – DBMS: supports transactions
    – Concurrency control: locking (pessimistic); timestamp ordering, MVCC (optimistic)
    – Recovery

[Diagram: users run application programs containing transactions T1, T2, ..., Tn; the DBMS executes each transaction T so that it takes the database from one consistent state to another consistent state.]

Transaction States

• Writing a transaction in SQL
  – A multi-statement transaction:
      BEGIN TRANSACTION
      [SQL statements]
      COMMIT or ROLLBACK
  – A single SQL statement is also executed as a transaction.
• Transaction states

Schedule

• Transaction Ti
  – A sequence of Read_i(X) and Write_i(X) operations

  T1:              T2:
    Read_1(X)        Read_2(X)
    Write_1(X)       Write_2(X)
    Write_1(Y)       Write_2(Z)
    Read_1(Y)        Read_2(Y)
    Read_1(Z)        Read_2(Z)

• Schedule (history): a chronological order in which operations from various transactions are executed
  – Operations from the same transaction must follow the same order in which they occur in that transaction.
  – Operations from different transactions can be interleaved in the schedule.
  – Total ordering: for any two operations, one must occur before the other (vs. a partial order defined only on conflicting operations).

Property of Schedule from two perspectives

• Concurrency control
• Recovery (coming up later)

Serializable Schedule

• Serial schedule
  – No concurrent execution → correct
  – The basis of correct concurrent execution
• Serializable schedule
  – Allows concurrent execution
  – Considered correct because it is equivalent to some serial schedule
• Equivalence of schedules
  – Conflict equivalence
  – View equivalence

  T1:              T2:
    Read_1(X)        Read_2(X)
    Write_1(X)       Write_2(X)
    Write_1(Y)       Write_2(Z)
    Read_1(Y)        Read_2(Y)
    Read_1(Z)        Read_2(Z)

  Is a given interleaving of T1 and T2 equivalent to some serial schedule?

Conflict Equivalence

• Conflicting operations
  – They belong to different transactions.
  – They access the same element (X).
  – At least one of the operations is a Write(X).
  – Intuition: two operations conflict if changing their order can result in a different outcome.
• Two types
  – Read-write conflict: Read_1(X), Write_2(X)
  – Write-write conflict: Write_1(X), Write_2(X)
• Two schedules are conflict equivalent if the order of any two conflicting operations is the same in both schedules.
• A schedule is conflict serializable if it is conflict equivalent to a serial schedule.

  Example:  T1: Read_1(X), Write_1(X)    T2: Read_2(X), Write_2(X)

Conflict Serializability: testing and enforcement

• Question: conflict serializability is good, but how can we tell whether a schedule is conflict serializable?
• Testing based on the precedence graph (see the sketch after this slide):
  – Each transaction is a node.
  – Directed edges represent the order of conflicts (read-write, write-write).
  – The schedule is serializable if and only if the precedence graph has no cycle.
  – An equivalent serial schedule can be created by topological sort.
• Most concurrency control techniques do not actually test for serializability. They develop rules (protocols) that guarantee that any schedule following the rules is serializable.
  – Two-phase locking → used in the majority of commercial DBMSs
    – Basic (covered above) → may cause deadlock
    – Conservative: prevents deadlock by locking all desired data items before the transaction begins execution
    – Strict: unlocking is performed only after a transaction terminates (commits, or aborts and is rolled back). This is the most commonly used two-phase locking algorithm.
  – Timestamp ordering → no locking overhead; better suited for distributed implementation (used in H-store and Spanner)

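A small Python sketch of the precedence-graph test described above (my illustration; the schedule encoding as (transaction, 'R'|'W', element) triples is hypothetical): add an edge Ti → Tj whenever an operation of Ti conflicts with a later operation of Tj, then look for a cycle.

    from itertools import combinations

    def conflict(op1, op2):
        """Operations conflict if they come from different transactions,
        touch the same element, and at least one is a write."""
        (t1, kind1, elem1), (t2, kind2, elem2) = op1, op2
        return t1 != t2 and elem1 == elem2 and "W" in (kind1, kind2)

    def precedence_graph(schedule):
        """schedule: list of (txn, 'R'|'W', element) in chronological order."""
        edges = set()
        for i, j in combinations(range(len(schedule)), 2):   # i comes before j
            if conflict(schedule[i], schedule[j]):
                edges.add((schedule[i][0], schedule[j][0]))  # edge Ti -> Tj
        return edges

    def has_cycle(edges):
        """Depth-first search for a cycle; a cycle means not conflict serializable."""
        graph = {}
        for a, b in edges:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set())
        state = {n: "new" for n in graph}
        def dfs(n):
            state[n] = "active"
            for m in graph[n]:
                if state[m] == "active" or (state[m] == "new" and dfs(m)):
                    return True
            state[n] = "done"
            return False
        return any(state[n] == "new" and dfs(n) for n in graph)

    # The lost-update interleaving from the motivation example:
    schedule = [("T_Bob", "R", "X"), ("T_Dave", "R", "X"),
                ("T_Dave", "W", "X"), ("T_Bob", "W", "X")]
    edges = precedence_graph(schedule)   # {('T_Bob','T_Dave'), ('T_Dave','T_Bob')}
    print(has_cycle(edges))              # True -> not conflict serializable
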
Timestamp Ordering

• Timestamp
  – A monotonically increasing variable (integer) indicating the age of an operation or a transaction.
  – An ID assigned by the DBMS to a transaction: TS(T).
  – A larger timestamp value indicates a more recent event or operation.
  – Implementation: counter or system clock.

Timestamp Ordering Algorithm

• A timestamp-based algorithm uses timestamps to serialize the execution of concurrent transactions.
• The only equivalent serial schedule has the transactions in the order of their timestamps.

Basic Timestamp Ordering

• read_TS(X): the read timestamp of item X → the largest TS among all transactions that have successfully read X
  – read_TS(X) = TS(T), where T is the youngest transaction that has read X successfully
• write_TS(X): defined similarly for writes

1. Transaction T issues a write_item(X) operation:
   – If read_TS(X) > TS(T) or write_TS(X) > TS(T), then a younger transaction has already read or written the data item, so abort and roll back T and reject the operation.
   – Otherwise, execute write_item(X) of T and set write_TS(X) to TS(T).
2. Transaction T issues a read_item(X) operation:
   – If write_TS(X) > TS(T), then a younger transaction has already written the data item, so abort and roll back T and reject the operation.
   – If write_TS(X) ≤ TS(T), then execute read_item(X) of T and set read_TS(X) to the larger of TS(T) and the current read_TS(X).

• Good news: guaranteed to be conflict serializable.
• Bad news: rejects too many transactions.
  – Resubmitted transactions get a new timestamp.
  – Cascading rollback

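A compact Python sketch (illustrative only, not the lecture's code) of the basic timestamp-ordering rules above; the names read_ts, write_ts, and the Abort exception are hypothetical.

    read_ts = {}    # X -> largest TS of a transaction that has read X
    write_ts = {}   # X -> largest TS of a transaction that has written X

    class Abort(Exception):
        """Raised when a transaction must be rolled back and resubmitted."""

    def read_item(ts, x):
        # Rule 2: a younger transaction has already written X -> reject and abort.
        if write_ts.get(x, 0) > ts:
            raise Abort(f"T{ts}: read({x}) rejected")
        read_ts[x] = max(ts, read_ts.get(x, 0))

    def write_item(ts, x):
        # Rule 1: a younger transaction has already read or written X -> reject and abort.
        if read_ts.get(x, 0) > ts or write_ts.get(x, 0) > ts:
            raise Abort(f"T{ts}: write({x}) rejected")
        write_ts[x] = ts

    # The motivating interleaving: T1 = Bob2013 (older), T2 = Dave11 (younger).
    read_item(1, "X")       # T1 reads X
    read_item(2, "X")       # T2 reads X
    write_item(2, "X")      # T2 writes X; write_TS(X) = 2
    try:
        write_item(1, "X")  # T1's late write is rejected: read_TS(X) = 2 > 1
    except Abort as e:
        print(e)            # T1 is rolled back and restarted with a new timestamp
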
Strict Timestamp Ordering

• Strict timestamp ordering
  – A solution to the basic TO problems
  – Yields a recoverable (strict) and conflict-serializable schedule
1. Transaction T issues a write_item(X) operation:
   – If TS(T) > write_TS(X), then delay T until the transaction T' that wrote X has terminated (committed or aborted).
2. Transaction T issues a read_item(X) operation:
   – If TS(T) > write_TS(X), then delay T until the transaction T' that wrote X has terminated (committed or aborted).

• Thomas's Write Rule
  – A solution to the basic TO problem
  – Does not enforce conflict serializability
  – Transaction T issues a write_item(X) operation:
    1. If read_TS(X) > TS(T), then abort and roll back T and reject the operation.
    2. If write_TS(X) > TS(T), then just ignore the write operation and continue execution. This is because only the most recent write counts in the case of two consecutive writes.
    3. If neither condition in 1 nor 2 holds, then execute write_item(X) of T and set write_TS(X) to TS(T).

What is a Data Element/Object?

• A data element can be defined at different granularities: a whole relation (Relation A, Relation B), a tuple (Tuple A, Tuple B, Tuple C), or a disk block (disk block A, disk block B). Which granularity should be locked?

Multiple Granularity Locks

• Locking works in any case, but should we choose small or large objects?
• If we lock large objects (e.g., relations)
  – Need few locks
  – Low concurrency
• If we lock small objects (e.g., tuples, fields)
  – Need more locks
  – More concurrency
• Solution: multiple granularity level locking

Why Recovery is Needed

• Atomicity: whenever a transaction is submitted to a DBMS for execution, either
  – all operations in the transaction are completed successfully and their effect is recorded permanently in the database, or
  – if the transaction fails after executing some of its operations, the operations already executed must be undone and have no lasting effect (no effect on the database or on any other transactions).
• Recovery from transaction failures
  – Restore the database to the most recent consistent state (all or nothing).
• Types of failures
  – Hardware failures
  – Software errors
  – Concurrency control enforcement

Schedules classified on recoverability

• Recoverable schedule
  – One where no committed transaction ever needs to be rolled back.
  – A schedule S is recoverable if no transaction T in S commits until all transactions T' that have written an item that T reads have committed.
• Cascadeless schedule
  – One where every transaction reads only items that were written by committed transactions.

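To make the recoverability condition concrete, here is a small Python sketch (my illustration; the schedule encoding with 'R', 'W', and 'C' entries is hypothetical) that checks whether a schedule is recoverable.

    def is_recoverable(schedule):
        """schedule: list of (txn, op, item), op in {'R', 'W', 'C'} (item is None for 'C').
        Recoverable: no transaction commits before every transaction it read from commits."""
        last_writer = {}     # item -> transaction that wrote it most recently
        reads_from = {}      # txn  -> set of transactions whose writes it has read
        committed = set()
        for txn, op, item in schedule:
            if op == "W":
                last_writer[item] = txn
            elif op == "R":
                writer = last_writer.get(item)
                if writer is not None and writer != txn:
                    reads_from.setdefault(txn, set()).add(writer)
            elif op == "C":
                if not reads_from.get(txn, set()) <= committed:
                    return False      # txn commits before a transaction it read from
                committed.add(txn)
        return True

    # Dave11 reads Bob2013's uncommitted write of X, then commits first: not recoverable.
    s = [("T_Bob", "W", "X"), ("T_Dave", "R", "X"),
         ("T_Dave", "C", None), ("T_Bob", "C", None)]
    print(is_recoverable(s))   # False
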
Recovery Techniques

• Causes of failures
  – System crashes (partial transactions)
  – Transaction failures (e.g., being preempted by concurrency control techniques)
• Two main mechanisms
  – System log
  – Checkpoint
• Two main techniques to recover from non-catastrophic transaction failures
  – Deferred update (NO-UNDO/REDO) → see the Deferred Update slide below
  – Immediate update (UNDO/REDO) → not covered here; see book [FDS] Section 23.3
• UNDO and REDO operations are required to be idempotent.

Data Update

• Immediate update: as soon as a data item is modified in the cache, the disk copy is updated.
• Deferred update: all modified data items in the cache are written out either after a transaction ends its execution or after a fixed number of transactions have completed their execution.
• Shadow update: the modified version of a data item does not overwrite its disk copy but is written at a separate disk location.
• In-place update: the disk version of the data item is overwritten by the cache version → requires a transaction log.

Transaction Log

• For recovery from any type of failure, the data value prior to modification (BFIM, BeFore IMage) and the new value after modification (AFIM, AFter IMage) are required.
• These values and other information are stored in a sequential file called the transaction log.
• A sample log is given below. Back P and Next P point to the previous and next log records of the same transaction.

T ID   Back P   Next P   Operation   Data item   BFIM      AFIM
T1     0        1        Begin
T1     1        4        Write       X           X = 100   X = 200
T2     0        8        Begin
T1     2        5        Write       Y           Y = 50    Y = 100
T1     4        7        Read        M           M = 200   M = 200
T3     0        9        Read        N           N = 400   N = 400
T1     5        nil      End

Write-Ahead Logging

• When in-place updating (immediate or deferred) is used, the log is necessary for recovery and must be available to the recovery manager.
• Write-Ahead Logging (WAL) protocol:
  – For UNDO: before a data item's AFIM is flushed to the database disk (overwriting the BFIM), its BFIM must be written to the log, and the log must be saved on stable storage (log disk).
  – For REDO: before a transaction executes its commit operation, all its AFIMs must be written to the log, and the log must be saved on stable storage.

Checkpoint

• From time to time (randomly or under some criteria), the database flushes its buffers to the database disk to minimize the task of recovery. The following steps define a checkpoint operation:
  1. Suspend execution of transactions temporarily.
  2. Force-write modified buffer data to disk.
  3. Write a [checkpoint] record to the log; save the log to disk.
  4. Resume normal transaction execution.
• During recovery, REDO or UNDO is required only for transactions appearing after the [checkpoint] record.

Deferred update (NO-UNDO/REDO)

• Deferred update (NO-UNDO/REDO): do not physically update the database on disk until after a transaction reaches its commit point; only then are the updates recorded in the database.
• Before reaching its commit point, all of a transaction's updates are recorded in the local transaction workspace or in the main-memory buffers that the DBMS maintains.
• Before commit, the updates are recorded persistently in the log; after commit, the updates are written to the database on disk.
• If a transaction fails before reaching its commit point, it will not have changed the database in any way, so UNDO is not needed. It may be necessary to REDO the effects of the operations of a committed transaction from the log, because their effects may not yet have been recorded in the database.

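A minimal Python sketch of the NO-UNDO/REDO idea (my illustration, assuming a simple in-memory log of write records carrying AFIMs plus commit records): after a crash, only the writes of committed transactions are redone from the log; uncommitted transactions are simply ignored.

    # Log records, in order: ("write", txn, item, afim) or ("commit", txn).
    # Under deferred update, AFIMs are forced to the log before commit (WAL for REDO),
    # and no database page is overwritten before the commit point, so UNDO is never needed.
    log = [
        ("write",  "T1", "X", 200),
        ("write",  "T1", "Y", 100),
        ("commit", "T1"),
        ("write",  "T2", "Z", 999),   # T2 never committed before the crash
    ]

    def redo_recovery(log, database):
        """Replay the AFIMs of committed transactions; ignore uncommitted ones."""
        committed = {rec[1] for rec in log if rec[0] == "commit"}
        for rec in log:
            if rec[0] == "write" and rec[1] in committed:
                _, txn, item, afim = rec
                database[item] = afim   # REDO is idempotent: replaying twice is harmless
        return database

    db = {"X": 100, "Y": 50, "Z": 400}    # state on disk at the time of the crash
    print(redo_recovery(log, db))         # {'X': 200, 'Y': 100, 'Z': 400}; T2's write is ignored
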
Summary – Transaction

• ACID
  – A (Atomicity): recovery
  – C (Consistency): application correctness
  – I (Isolation): concurrency control
  – D (Durability): persistence on disk

Putting Things Together

• Users issue application programs and queries.
• The DBMS system performs query processing and data access (the database engine), using the meta-data and the data.

http://www.dbinfoblog.com/post/24/the-query-processor/