DBMS functions

System Catalogue

Stores data that describes each database
Meta-data:
– conceptual, logical, physical schema
– mapping between schemata
– info for query optimization, security, authorization, etc.
– integrity constraints
Meta-database
– used by DBAs, designers, users
– accessed frequently by DBMS modules
– cp. Data Dictionary
    – more general: documents the design process & administration info
Query Processing and Optimization

Query processing:
– scanner: identifies the language components
– parser: checks the query syntax (grammar)
– semantic validation
– query graph (tree): internal representation of the query
Execution strategy for retrieving data
Query optimization: choose a suitable execution strategy for processing a query
Query Optimization

Choosing a reasonably efficient strategy
– navigational language (the programmer decides) vs. high-level query (the DBMS decides)
Access algorithms
– for relational algebra operations & aggregation and grouping
– may be applied only to particular storage structures and access paths
Query Optimization Techniques

Heuristic rules:
– reorder operations in a query tree
– recursive query decomposition
Systematic query optimization:
– cost estimation of each strategy
    – access cost to secondary storage
    – storage cost of intermediate files
    – computation cost: searching, sorting, merging in memory
    – communication cost
– catalog info used in cost functions:
    – number of records, blocks, blocking factor, number of first-level index blocks, selectivity, etc.
Semantic Query Optimization
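The catalog values listed above (number of records, blocking factor, selectivity) feed simple cost functions. The following is a minimal sketch, assuming a cost model that counts only block accesses to secondary storage; the function names and the sample catalog numbers are made up for illustration and do not come from any particular DBMS.

```python
import math

def num_blocks(num_records: int, blocking_factor: int) -> int:
    """Blocks needed to store the file: b = ceil(r / bfr)."""
    return math.ceil(num_records / blocking_factor)

def linear_search_cost(b: int) -> int:
    """Worst case for a full scan: every block is read once."""
    return b

def binary_search_cost(b: int) -> int:
    """Approximate block accesses for binary search.
    Only applicable if the file is ordered on the search attribute."""
    return math.ceil(math.log2(b)) if b > 1 else 1

def expected_matches(num_records: int, selectivity: float) -> int:
    """Estimated number of records satisfying the condition."""
    return round(num_records * selectivity)

def cheapest_strategy(costs: dict) -> str:
    """Pick the strategy with the lowest estimated cost."""
    return min(costs, key=costs.get)

# Hypothetical catalog entries for one file.
catalog = {"num_records": 10_000, "blocking_factor": 5, "selectivity": 0.001}

b = num_blocks(catalog["num_records"], catalog["blocking_factor"])
costs = {
    "linear search": linear_search_cost(b),
    "binary search": binary_search_cost(b),
}
best = cheapest_strategy(costs)
```

A real optimizer would also account for storage, computation, and communication costs; this sketch compares access cost only.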
Transaction Processing

Database Transaction:
– a logical unit of database processing (work)
– an execution of a program that includes database access operations
– at data item & disk block access level
Multiprogramming OS for multiple users
– interleaved model of concurrent execution
ACID properties of transactions (desired)
– atomicity, consistency, isolation, and durability
Concurrency Control and Recovery Control

Need for Concurrency Control
– lost update problem
– temporary update problem
– incorrect summary (analysis) problem
Need for Recovery Control
– system crash
– transaction or system error
– local error or execution error
– concurrency control enforcement
– disk failure: read-write malfunction
– physical problems and catastrophes: power failure, fire, etc.
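The lost update problem named above can be reproduced with a tiny simulation. This is an illustrative sketch only: the schedule format (transaction, operation, item, delta) and the `run` helper are assumptions made for this example, with each transaction keeping a private copy of the item between its read and its write.

```python
def run(schedule, initial):
    """Execute operations in the given order and return the final database."""
    db = dict(initial)
    local = {}  # (transaction, item) -> private copy held by that transaction
    for txn, op, item, delta in schedule:
        if op == "read":
            local[(txn, item)] = db[item]
        elif op == "update":
            local[(txn, item)] += delta
        elif op == "write":
            db[item] = local[(txn, item)]
    return db

# T1 subtracts 10 from X, T2 adds 20 to X.
t1 = [("T1", "read", "X", 0), ("T1", "update", "X", -10), ("T1", "write", "X", 0)]
t2 = [("T2", "read", "X", 0), ("T2", "update", "X", 20), ("T2", "write", "X", 0)]

serial = t1 + t2
# T2 reads X before T1 has written it, so T1's update is overwritten.
interleaved = [t1[0], t1[1], t2[0], t2[1], t1[2], t2[2]]

serial_result = run(serial, {"X": 100})
interleaved_result = run(interleaved, {"X": 100})
```

The serial run ends with X = 110, while the interleaved run ends with X = 120 because T1's write is lost.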
ACID Properties

Atomicity
– a transaction is either performed in its entirety or not performed at all
Consistency
– a correct execution takes the database from one consistent state to another
Isolation
– a transaction should not make its updates visible to other transactions until it is committed (can solve the temporary update problem)
Durability
– once committed, the changes will not be lost because of a subsequent failure
Schedules of Transactions

Definition
– n transactions are executing concurrently in an interleaved fashion
– the order of execution of operations from the various transactions is called a schedule
– the operations of Ti in a schedule must appear in the same order in which they occur in Ti
Conflicts:
– 2 operations belong to 2 different transactions, access the same item, and at least one of the two operations is a WRITE op.
Committed projection of a schedule: C(S)
– includes only the operations in S that belong to committed transactions
Serializability Theory

Serial:
– for all T in S, all operations of T are executed consecutively
– each transaction is independent
Serializable:
– a non-serial schedule is (result) equivalent to some serial schedule of the same n transactions
Precedence graph (or serialization graph)
– used for testing conflict serializability of a schedule
– if there is no cycle in the graph, we can create an equivalent serial schedule
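The precedence-graph test can be sketched in a few lines. This is a minimal illustration, assuming a schedule is given as a list of (transaction, operation, item) tuples with operations 'r' and 'w'; the function names are chosen for this example. An edge Ti -> Tj is added whenever an operation of Ti precedes a conflicting operation of Tj, and a topological sort either yields an equivalent serial order or reveals a cycle.

```python
from collections import defaultdict

def precedence_graph(schedule):
    """Return edges Ti -> Tj for every pair of conflicting operations
    where Ti's operation comes first in the schedule."""
    edges = defaultdict(set)
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and item_i == item_j and "w" in (op_i, op_j):
                edges[ti].add(tj)
    return edges

def equivalent_serial_order(schedule):
    """Return a serial order of transactions if the graph is acyclic, else None."""
    txns = list(dict.fromkeys(t for t, _, _ in schedule))
    edges = precedence_graph(schedule)
    indegree = {t: 0 for t in txns}
    for src in txns:
        for dst in edges[src]:
            indegree[dst] += 1
    ready = [t for t in txns if indegree[t] == 0]
    order = []
    while ready:
        t = ready.pop(0)
        order.append(t)
        for dst in sorted(edges[t]):
            indegree[dst] -= 1
            if indegree[dst] == 0:
                ready.append(dst)
    # If some transaction was never released, the graph contains a cycle.
    return order if len(order) == len(txns) else None

not_serializable = [("T1", "r", "X"), ("T2", "r", "X"),
                    ("T1", "w", "X"), ("T2", "w", "X")]
serializable = [("T1", "r", "X"), ("T1", "w", "X"),
                ("T2", "r", "X"), ("T2", "w", "X")]
```

The first schedule produces edges T1 -> T2 and T2 -> T1 (a cycle), so no equivalent serial schedule exists; the second yields the order T1, T2.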
Concurrency Control & Serializability

Practically impossible
– to determine beforehand how the operations of a schedule will be interleaved in order to ensure serializability
Protocols
– followed by every individual transaction or enforced by DBMS concurrency control
    – two-phase locking
    – timestamp ordering
    – multiversion
    – optimistic: certification or validation
– granularity
Locking

A lock is a variable used for synchronizing the access by concurrent transactions to a database item
– binary locks: locked or unlocked
– multi-mode locks: shared (read-locked) vs. exclusive (write-locked) locks
Two-phase Locking Protocol:
– all locking operations precede the first unlock operation in the transaction
– expanding (growing) phase --> shrinking phase
Problems: deadlock, livelock, starvation
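The two-phase rule (all lock operations before the first unlock) is easy to check for a single transaction. A minimal sketch follows, assuming a transaction is represented as a list of (action, item) pairs; this representation and the function name are illustrative assumptions.

```python
def follows_two_phase_locking(operations):
    """True if no lock is acquired after the first unlock."""
    shrinking = False  # becomes True once the first unlock is seen
    for action, _item in operations:
        if action == "unlock":
            shrinking = True
        elif action == "lock" and shrinking:
            return False  # a lock in the shrinking phase violates 2PL
    return True

two_phase = [("lock", "X"), ("lock", "Y"), ("write", "X"),
             ("unlock", "X"), ("write", "Y"), ("unlock", "Y")]
not_two_phase = [("lock", "X"), ("unlock", "X"),
                 ("lock", "Y"), ("unlock", "Y")]
```

Note that the check covers only the phase rule; it says nothing about deadlock, which two-phase locking does not prevent.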
Timestamp Ordering (TO)

Timestamps:
– unique identifier created by the DBMS to identify a transaction (transaction start time): ts(T)
– read_ts(X) and write_ts(X) are kept with each database item X
Basic TO algorithm
1) T issues a write on X:
a. if read_ts(X) > ts(T) or write_ts(X) > ts(T), abort T
b. otherwise, execute the write and set write_ts(X) to ts(T)
2) T issues a read on X:
a. if write_ts(X) > ts(T), abort T
b. if write_ts(X) <= ts(T), execute the read and set read_ts(X) to the larger of ts(T) and read_ts(X)

No deadlock, but cascading rollback can occur
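The two rules of the basic TO algorithm translate almost directly into code. The sketch below is a simplified single-process illustration: the class and exception names are invented for this example, timestamps are assumed to be positive integers, and an item that has never been accessed is treated as having read and write timestamps of 0.

```python
class TransactionAborted(Exception):
    """Raised when an operation arrives too late in timestamp order."""

class BasicTimestampOrdering:
    def __init__(self):
        self.values = {}
        self.read_ts = {}   # item -> largest timestamp that has read it
        self.write_ts = {}  # item -> timestamp of the last writer

    def write(self, ts, item, value):
        # Rule 1a: a younger transaction has already read or written the item.
        if self.read_ts.get(item, 0) > ts or self.write_ts.get(item, 0) > ts:
            raise TransactionAborted(f"write({item}) by ts={ts} rejected")
        # Rule 1b: perform the write and record the writer's timestamp.
        self.values[item] = value
        self.write_ts[item] = ts

    def read(self, ts, item):
        # Rule 2a: a younger transaction has already written the item.
        if self.write_ts.get(item, 0) > ts:
            raise TransactionAborted(f"read({item}) by ts={ts} rejected")
        # Rule 2b: perform the read and keep the larger read timestamp.
        self.read_ts[item] = max(self.read_ts.get(item, 0), ts)
        return self.values.get(item)
```

For example, if a transaction with timestamp 2 reads X and a transaction with timestamp 1 then tries to write X, the write is rejected and the older transaction must be aborted.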
Optimistic Concurrency Control

Unlike locking or TO, no checking is done during transaction execution
– updates are applied to local copies; at the end of transaction execution, a validation phase checks whether any of the transaction's updates violates serializability
– read phase --> validation phase --> write phase
Assumes little interference
Timestamps on write_set and read_set
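The read, validation, and write phases can be illustrated with a small single-threaded sketch. This is a simplified assumption-laden example, not a full validation protocol: a commit counter stands in for timestamps, and a transaction fails validation if any transaction that committed after it started wrote an item in its read_set. All class and method names are invented for this example.

```python
class OptimisticStore:
    def __init__(self):
        self.data = {}
        self.commit_counter = 0
        self.committed = []  # list of (commit number, write_set)

    def begin(self):
        return OptimisticTransaction(self, self.commit_counter)

class OptimisticTransaction:
    def __init__(self, store, start):
        self.store = store
        self.start = start      # commit counter value when the transaction began
        self.read_set = set()
        self.local = {}         # local copies of written items

    def read(self, item):
        # Read phase: no checking, just remember what was read.
        if item in self.local:
            return self.local[item]
        self.read_set.add(item)
        return self.store.data.get(item)

    def write(self, item, value):
        # Updates go to the local copy only.
        self.local[item] = value

    def commit(self):
        # Validation phase: did a later-committed transaction write what we read?
        for number, write_set in self.store.committed:
            if number > self.start and write_set & self.read_set:
                return False  # validation failed; local updates are discarded
        # Write phase: apply local copies to the database.
        self.store.data.update(self.local)
        self.store.commit_counter += 1
        self.store.committed.append((self.store.commit_counter, set(self.local)))
        return True
```

If two transactions both read X and then write it, the first to commit succeeds and the second fails validation and would have to restart.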

Transaction States & Operations

For recovery purposes, transaction states need to be recorded in a system log
Transaction states
– BEGIN_TRANSACTION
– READ OR WRITE
– END_TRANSACTION
– COMMIT_TRANSACTION
– ROLLBACK (OR ABORT) vs. UNDO
    – one transaction vs. one operation
– REDO: redo certain operations to make sure
Recovery Techniques

Deferred update (database updated after commit)
– NO-UNDO/REDO algorithm
Immediate update (database may be updated before commit)
– UNDO/REDO or UNDO/NO-REDO algorithm
In-place updating vs. shadowing
– write-ahead logging (WAL) for in-place updating
– before image (BFIM) + after image (AFIM) for shadow paging
O.S.: buffering and caching -> DBMS cache
Two-phase commit protocol for multidatabase
– phase 1: prepare to commit, ready to commit
– phase 2: all O.K., “commit”
Database backup: periodical vs. incremental
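The two phases of the two-phase commit protocol listed above can be sketched as follows. This is a minimal in-memory illustration under simplifying assumptions: there is no logging, no timeout handling, and no failure recovery, and the `Participant` class and `two_phase_commit` function are hypothetical names for this example.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: vote "ready to commit" or refuse.
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def two_phase_commit(participants):
    # Phase 1: the coordinator asks every participant to prepare.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if all participants voted ready.
    if all(votes):
        for p in participants:
            p.commit()
        return "commit"
    for p in participants:
        p.abort()
    return "abort"
```

A single negative vote in phase 1 causes every participant to abort, which keeps the participating databases consistent with one another.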
Commit point vs. Checkpoint

Commit point:
– all operations in a transaction have been successfully executed and recorded in the system log
– force-write the log file (to disk)
Checkpoints
– a checkpoint record is written into the log periodically, at the point when the system writes out to the database on disk the effect of all WRITE operations of committed transactions
– the recovery manager decides at what intervals to take a checkpoint, measured in minutes or in number of committed transactions
– a checkpoint record can contain information such as the list of active transaction_ids, locations of active transactions, etc.
Recoverability

A schedule S is said to be recoverable if no
transaction T in S commits until all
transactions T’ that have written an item
that T reads have committed
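The recoverability condition can be checked mechanically. Below is a minimal sketch, assuming a schedule is a list of (transaction, operation, item) tuples with operations 'r', 'w', and 'c' (commit); aborts are not modeled, and the function name is an assumption for this example. A transaction "reads from" the most recent other transaction that wrote the item before the read.

```python
def is_recoverable(schedule):
    """True if every transaction commits only after all transactions
    it has read from have committed."""
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = {}    # transaction -> set of transactions it read from
    committed = set()
    for txn, op, item in schedule:
        if op == "w":
            last_writer[item] = txn
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif op == "c":
            # Every writer this transaction read from must already be committed.
            if not reads_from.get(txn, set()) <= committed:
                return False
            committed.add(txn)
    return True

unrecoverable = [("T1", "w", "X"), ("T2", "r", "X"),
                 ("T2", "c", None), ("T1", "c", None)]
recoverable = [("T1", "w", "X"), ("T2", "r", "X"),
               ("T1", "c", None), ("T2", "c", None)]
```

In the first schedule T2 commits before T1, the transaction whose write it read, so the schedule is not recoverable; swapping the two commits fixes it.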