System Catalogue
Stores data that describes each database (meta-data):
– conceptual, logical, physical schema
– mapping between schemata
– info for query optimization, security, authorization, etc.
– integrity constraints
Meta-database
– used by DBA, designers, users
– accessed frequently by DBMS modules
– cp. Data Dictionary: more general; documents design process & administration info

Query Processing and Optimization
Query processing:
– scanner: identifies the language components
– parser: checks the query syntax (grammar)
– semantic validation
– query graph (tree): internal representation of the query
Execution strategy: a plan for retrieving the data
Query optimization: choose a suitable execution strategy for processing a query

Query Optimization
Choosing a reasonably efficient strategy
– navigational language (programmer decides) vs. high-level query (DBMS decides)
Access algorithms
– for relational algebra & aggregation and grouping
– may be applied to particular storage structures and access paths

Query Optimization Techniques
Heuristic rules:
– reorder operations in a query tree
– recursive query decomposition
Systematic query optimization:
– cost estimation of each strategy:
    access cost to secondary storage
    storage cost of intermediate files
    computation cost: searching, sorting, merging in memory
    communication cost
– catalog info used in cost functions: number of records, blocks, blocking factor, number of first-level index blocks, selectivity, etc.
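The idea of systematic (cost-based) optimization can be illustrated with a small sketch. It is not the optimizer of any particular DBMS: the class `CatalogStats`, the two cost functions, and the assumption that selectivity is roughly 1 / (number of distinct values) are all hypothetical simplifications, and only block accesses to secondary storage are counted.

```python
from dataclasses import dataclass


@dataclass
class CatalogStats:
    """Hypothetical catalog entry for one file and one indexed attribute."""
    num_records: int       # number of records in the file
    blocking_factor: int   # records per block
    index_levels: int      # levels of a secondary index on the attribute
    distinct_values: int   # distinct values of the attribute

    @property
    def num_blocks(self) -> int:
        # Ceiling division: blocks needed to hold all records.
        return -(-self.num_records // self.blocking_factor)


def linear_scan_cost(stats: CatalogStats) -> int:
    # A full scan reads every block of the file.
    return stats.num_blocks


def secondary_index_cost(stats: CatalogStats) -> int:
    # Assumption: values are uniformly distributed, so an equality
    # condition matches about num_records / distinct_values records,
    # and each matching record may sit in a different block.
    matching = -(-stats.num_records // stats.distinct_values)
    return stats.index_levels + matching


def choose_strategy(stats: CatalogStats):
    """Return the cheapest strategy name and all estimated costs."""
    costs = {
        "linear scan": linear_scan_cost(stats),
        "secondary index": secondary_index_cost(stats),
    }
    best = min(costs, key=costs.get)
    return best, costs


if __name__ == "__main__":
    selective = CatalogStats(10000, 5, 2, 1000)
    print(choose_strategy(selective))
    unselective = CatalogStats(10000, 5, 2, 2)
    print(choose_strategy(unselective))
```

With a highly selective condition the index wins; with only two distinct values the estimated index cost exceeds the cost of scanning the whole file, so the scan is chosen.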
Semantic query optimization

Transaction Processing
Database transaction:
– a logical unit of database processing (work)
– an execution of a program that includes database access operations
– at data item & disk block access level
Multiprogramming OS for multiple users
– interleaved model of concurrent execution
ACID properties of a transaction (desired)
– atomicity, consistency, isolation, and durability

Concurrency Control and Recovery Control
Need for concurrency control
– lost update problem
– temporary update problem
– incorrect summary (analysis) problem
Need for recovery control
– system crash
– transaction or system error
– local error or execution error
– concurrency control enforcement
– disk failure: read-write malfunction
– physical problems and catastrophes: power failure, fire, etc.

ACID Properties
Atomicity
– a transaction is either performed in its entirety or not performed at all
Consistency
– a correct execution takes the database from one consistent state to another
Isolation
– a transaction should not make its updates visible to other transactions until it is committed (can solve the temporary update problem)
Durability
– once committed, the changes will not be lost because of a subsequent failure

Schedules of Transactions
Definition
– n transactions are executing concurrently in an interleaved fashion
– the order of execution of the operations from the various transactions is called a schedule
– the operations of Ti in a schedule must appear in the same order in which they occur in Ti
Conflicts:
– two operations conflict if they belong to two different transactions, access the same item, and at least one of the two operations is a WRITE op.
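The conflict definition above can be checked mechanically. The following is a minimal sketch, assuming a schedule is represented as a list of `(transaction, operation, item)` tuples with operations `"R"` and `"W"`; this representation and the function names are illustrative choices, not part of the original slides.

```python
from typing import List, Tuple

# (transaction id, "R" or "W", data item); a hypothetical encoding.
Op = Tuple[str, str, str]


def conflicts(a: Op, b: Op) -> bool:
    """Two operations conflict if they come from different transactions,
    touch the same item, and at least one of them is a WRITE."""
    different_txn = a[0] != b[0]
    same_item = a[2] == b[2]
    one_is_write = "W" in (a[1], b[1])
    return different_txn and same_item and one_is_write


def conflicting_pairs(schedule: List[Op]) -> List[Tuple[Op, Op]]:
    """Return every conflicting pair, in schedule order."""
    pairs = []
    for i in range(len(schedule)):
        for j in range(i + 1, len(schedule)):
            if conflicts(schedule[i], schedule[j]):
                pairs.append((schedule[i], schedule[j]))
    return pairs


if __name__ == "__main__":
    # The interleaving behind the lost update problem: both
    # transactions read X before either one writes it.
    lost_update = [
        ("T1", "R", "X"),
        ("T2", "R", "X"),
        ("T1", "W", "X"),
        ("T2", "W", "X"),
    ]
    for first, second in conflicting_pairs(lost_update):
        print(first, "conflicts with", second)
```

Two reads of the same item never conflict, which is why the first two operations of the example are not reported as a pair.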
Committed projection of a schedule: C(S)
– includes only the operations in S that belong to committed transactions

Serializability Theory
Serial:
– for all T in S, all operations of T are executed consecutively
– each transaction is independent
Serializable:
– a non-serial schedule is (result) equivalent to some serial schedule of the same n transactions
Precedence graph (or serialization graph)
– used to test the conflict serializability of a schedule
– if there is no cycle in the graph, we can create an equivalent serial schedule

Concurrency Control & Serializability
Practically impossible
– to determine beforehand how the operations of a schedule will be interleaved in order to ensure serializability
Protocols
– followed by every individual transaction, or
– enforced by DBMS concurrency control
Techniques
– two-phase locking
– timestamp ordering
– multiversion
– optimistic: certification or validation
– granularity

Locking
Lock: a variable used for synchronizing the access by concurrent transactions to a database item
– binary locks: locked or unlocked
– multi-mode locks: shared (read-locked) vs. exclusive (write-locked) locks
Two-phase locking protocol:
– all locking operations precede the first unlock operation in the transaction
– expanding (growing) phase --> shrinking phase
Problems: deadlock, livelock, starvation

Timestamp Ordering (TO)
Timestamps:
– unique identifier created by the DBMS to identify a transaction (transaction start time): ts(T)
– read_ts(X) and write_ts(X) are kept with each database item X
Basic TO algorithm
1) T issues write(X):
   a. if read_ts(X) > ts(T) or write_ts(X) > ts(T), abort T
   b. otherwise, do the write and set write_ts(X) to ts(T)
2) T issues read(X):
   a. if write_ts(X) > ts(T), abort T
   b. if write_ts(X) <= ts(T), do the read and set read_ts(X) to the larger of ts(T) and read_ts(X)
No deadlock, but cascading rollback can occur

Optimistic Concurrency Control
Unlike locking or TO, no checking is done during transaction execution
– updates are applied to local copies
– at the end of transaction execution, a validation phase checks whether any of the transaction's updates violates serializability
read phase --> validation phase --> write phase
Assumes little interference
Timestamps on write_set and read_set

Transaction States & Operations
For recovery purposes, transaction states need to be recorded in a system log
Transaction states
– BEGIN_TRANSACTION
– READ or WRITE
– END_TRANSACTION
– COMMIT_TRANSACTION
– ROLLBACK (or ABORT) vs. UNDO: one transaction vs. one operation
– REDO: redo certain operations to make sure that the effects of committed transactions have been applied to the database

Recovery Techniques
Deferred update (after)
– NO-UNDO/REDO algorithm
Immediate update (before)
– UNDO/REDO or UNDO/NO-REDO algorithm
In-place updating vs. shadowing
– write-ahead logging (WAL) for in-place updating
– before image (BFIM) + after image (AFIM) for shadow paging
O.S.: buffering and caching --> DBMS cache
Two-phase commit protocol for multidatabase
– phase 1: prepare to commit, ready to commit
– phase 2: all O.K., "commit"
Database backup: periodical vs. incremental

Commit Point vs. Checkpoint
Commit point:
– all operations in a transaction have been successfully executed and recorded in the system log
– force-write the log file (to disk)
Checkpoints
– a checkpoint record is written into the log periodically, at the point when the system writes out to the database on disk the effect of all WRITE operations of committed transactions
– the recovery manager decides at what intervals to take a checkpoint, in minutes or in number of committed transactions
– a checkpoint record can contain information such as the list of active transaction_ids, the locations of active transactions, etc.
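The two rules of the basic TO algorithm listed under Timestamp Ordering (TO) can be sketched in a few lines. This is an illustration only: the class `BasicTO`, the exception `Abort`, and the choice of 0 as the initial timestamp of every item are assumptions, and the sketch does not roll back an aborted transaction's earlier writes (which is where cascading rollback would come in).

```python
class Abort(Exception):
    """Raised when a transaction violates timestamp order."""


class BasicTO:
    """Toy single-threaded scheduler applying the basic TO rules."""

    def __init__(self):
        self.read_ts = {}   # item -> largest ts of any reader
        self.write_ts = {}  # item -> ts of the last writer
        self.data = {}      # item -> current value

    def write(self, ts, item, value):
        # Rule 1a: a younger transaction already read or wrote the item.
        if self.read_ts.get(item, 0) > ts or self.write_ts.get(item, 0) > ts:
            raise Abort(f"write({item}) by ts={ts} arrives too late")
        # Rule 1b: perform the write and record the writer's timestamp.
        self.data[item] = value
        self.write_ts[item] = ts

    def read(self, ts, item):
        # Rule 2a: a younger transaction already wrote the item.
        if self.write_ts.get(item, 0) > ts:
            raise Abort(f"read({item}) by ts={ts} arrives too late")
        # Rule 2b: perform the read and keep the larger read timestamp.
        self.read_ts[item] = max(ts, self.read_ts.get(item, 0))
        return self.data.get(item)


if __name__ == "__main__":
    to = BasicTO()
    to.write(1, "X", 10)        # T1 (ts=1) writes X
    print(to.read(2, "X"))      # T2 (ts=2) reads X
    try:
        to.write(1, "X", 20)    # T1 writes again after a younger read
    except Abort as err:
        print("aborted:", err)
```

Because a transaction is aborted rather than made to wait, no transaction ever blocks on another, which is why the scheme is free of deadlock.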
Recoverability
A schedule S is said to be recoverable if no transaction T in S commits until all transactions T' that have written an item that T reads have committed.
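The recoverability condition can be tested on a concrete schedule. The sketch below makes several assumptions that are not in the slides: a schedule is a list of `(transaction, operation, item)` tuples with operations `"R"`, `"W"` and `"C"` (commit, with item `None`); T is taken to read from T' when T' is the most recent writer of the item at the time of the read; and aborts are not modeled.

```python
from typing import List, Optional, Tuple

# (transaction id, "R" / "W" / "C", item or None for commit);
# a hypothetical encoding chosen for this example.
Op = Tuple[str, str, Optional[str]]


def is_recoverable(schedule: List[Op]) -> bool:
    """Return True if no transaction commits before every
    transaction it has read from has committed."""
    last_writer = {}   # item -> transaction that last wrote it
    reads_from = {}    # transaction -> set of transactions it read from
    committed = set()

    for txn, op, item in schedule:
        if op == "W":
            last_writer[item] = txn
        elif op == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif op == "C":
            # Every transaction that txn read from must already
            # be committed at this point.
            if not reads_from.get(txn, set()) <= committed:
                return False
            committed.add(txn)
    return True


if __name__ == "__main__":
    ok = [("T1", "W", "X"), ("T2", "R", "X"),
          ("T1", "C", None), ("T2", "C", None)]
    bad = [("T1", "W", "X"), ("T2", "R", "X"),
           ("T2", "C", None), ("T1", "C", None)]
    print(is_recoverable(ok))
    print(is_recoverable(bad))
```

In the second schedule T2 commits after reading a value written by the still-uncommitted T1; if T1 later failed, T2's commit could not be undone, which is exactly what the definition rules out.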