CS 766 Introduction: 1. The Age of Infinite Storage. 2. Concurrency Control. 3. Recovery.
Section 1 #1

The Age of Infinite Storage has begun. Many of us have enough money in our pockets right now to buy all the storage we will be able to fill for the next 5 years. Having the storage capacity is no longer a problem; managing it is a problem (especially when the volume gets large). How much data is there?
Section 1 #2

Terabytes (TBs) are here. 1 TB costs 1k$ to buy, but ~300k$/year to own: management and curation are the expensive part, and searching 1 TB takes hours. I'm terrified by terabytes (we are here). I'm petrified by petabytes. I'm completely "exafied" by exabytes. I'm too old to ever be "zettafied" by zettabytes, but you may be in your lifetime. You may be "yottafied" by yottabytes. You may not be "googified" by googibytes, but the next generation may be. (Prefixes: kilo 10^3, mega 10^6, giga 10^9, tera 10^12, peta 10^15, exa 10^18, zetta 10^21, yotta 10^24, googi 10^100.)
Section 1 #3

How much information is there? Soon everything can be recorded and indexed, and most of it will never be seen by humans, so data summarization, trend detection, anomaly detection, and data mining are key technologies. A rough scale: a book is megabytes; a photo is gigabytes; all books (as words), or a movie, is terabytes; all books as multimedia is petabytes; everything ever recorded is exabytes. (Small prefixes: milli 10^-3, micro 10^-6, nano 10^-9, pico 10^-12, femto 10^-15, atto 10^-18, zepto 10^-21, yocto 10^-24.)
Section 1 #4

The first disk, in 1956: the IBM 305 RAMAC. 4 MB on fifty 24-inch disks spinning at 1200 rpm (revolutions per minute), with 100 millisecond (ms) access time; it cost 35k$/year to rent and included the computer and accounting software (tubes, not transistors). It stood 1.6 meters tall.
Section 1 #5

Ten years later: 30 MB.
Section 1 #6

In 2003, the cost of storage was about 1k$/TB. It has gone steadily down since then. [Figure: "Price vs disk capacity" charts for 12/1/1999, 9/1/2000, 9/1/2001, 4/1/2002, and 11/4/2003, plotting price ($ and k$/TB) against raw disk unit size (GB) for IDE and SCSI drives, with fitted trend lines falling from roughly y = 17.9x in 1999 toward y = x by late 2003.]
Section 1 #7

[Figure: "Disk Evolution" chart spanning the kilo through yotta scales.]
Section 1 #8

Memex. "As We May Think," Vannevar Bush, 1945: "A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility" ... "yet if the user inserted 5000 pages of material a day it would take him hundreds of years to fill the repository, so that he can enter material freely."
Section 1 #9

Can you fill a terabyte in a year?
Item                               Items/TB   Items/day
a 300 KB JPEG image                3M         9,800
a 1 MB document                    1M         2,900
a 1 hour, 256 kb/s MP3 audio file  9K         26
a 1 hour MPEG video                290        0.8
Section 1 #10

On a personal terabyte, how will we find anything? We need queries, indexing, data mining, scalability, replication... If you don't use a DBMS, you will implement one of your own! The need for data mining and machine learning is more important than ever. Of the digital data in existence today, 80% is personal/individual and 20% is corporate/governmental (DBMS).
Section 1 #11

We're awash with data!
- Network data: 10 exabytes by 2010 (~10^19 bytes); 10 zettabytes by 2015 (~10^22 bytes); 10 yottabytes by 2020 (~10^25 bytes); 10 "gazillabytes" by 2030 (~10^28 bytes?); 10 "supragazillabytes" by 2040 (~10^31 bytes?). (I made up these names! Projected data sizes are overrunning our ability to name their orders of magnitude.)
- WWW (and other text collections): ~10^16 bytes.
- Sensor data (including micro- and nano-sensor networks): 15 petabytes by 2007.
- National Virtual Observatory (aggregated astronomical data): ~10^13 bytes.
- US EROS Data Center archives, Earth Observing System (near Sioux Falls, SD): remotely sensed satellite and aerial imagery data, 10 terabytes by 2004.
- Genomic/proteomic/metabolomic data (microarrays, gene chips, genome sequences).
- Stock market prediction data (prices + all of the above?).
Useful information must be teased out of these large volumes of raw data. And these are just some of the 1/5 of data collections that are corporate or governmental; the other 4/5 of data sets are personal!
Section 1 #12

Parkinson's Law (for data): data expands to fill available storage. The disk-storage version of Moore's Law: available storage doubles every 9 months! How do we get the information we need from the massive volumes of data we will have? Querying (for the information we know is there) and data mining (for the answers to questions we don't know how to ask precisely).
Section 3 #13

INTRODUCTION TO CONCURRENCY CONTROL AND RECOVERY IN DATABASE SYSTEMS
CONCURRENCY CONTROL is the activity of coordinating the actions of processes that operate in parallel (concurrently in time), access shared data (or any other shared resources), and potentially interfere with each other (e.g., if one process is reading a data item x while another is concurrently changing x, the reader could get the first half of the old value concatenated with the second half of the new value).
RECOVERY is the activity of ensuring that software and hardware failures do not corrupt (cause the loss of) persistent data (data that the DBMS has guaranteed will survive the ending of the program that created it).
A TRANSACTION is an atomic unit of database work during the execution of a program that accesses a shared database. Transactions must be atomic (all or nothing: either they complete successfully or they leave no effects on the database (or its users) whatsoever), and transactions must not interfere with each other; it is the job of the concurrency control/recovery system to ensure this. (No audio is embedded in the next 15 slides.) Termination means that either all of a transaction's effects are made permanent or the transaction has no effect on the database or user at all.
A DATABASE is a set of named data items (each of which has a value). The collection of all values of the database at any time constitutes the database STATE at that time. An ITEM (a GRANULE or data item) can be a word of main memory, a disk page, the entire database, an area, a file, a subset of a file, a record, or a field. We will assume the GRANULARITY is record level.
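To see why coordination is needed at all, here is a small, hypothetical Python sketch (not from the slides) in which two unsynchronized threads each read the shared item x and write back x + 1; with no concurrency control, updates are lost. This shows a lost update rather than the torn read described above, since a torn read is hard to reproduce portably, but it is the same kind of interference a concurrency control component must prevent.

```python
import threading
import time

# A toy "database" of named data items, with no concurrency control at all.
db = {"x": 0}

def increment_x(times):
    """Each iteration is an unsynchronized read-then-write on item x."""
    for _ in range(times):
        value = db["x"]          # Read(x)
        time.sleep(0.0001)       # simulate work between the read and the write
        db["x"] = value + 1      # Write(x, value + 1) -- may overwrite a concurrent update

t1 = threading.Thread(target=increment_x, args=(100,))
t2 = threading.Thread(target=increment_x, args=(100,))
t1.start(); t2.start()
t1.join(); t2.join()

# With proper isolation the result would be 200; without it, updates are lost.
print("final x =", db["x"], "(expected 200 with isolation)")
```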
INTRODUCTION TO CONCURRENCY CONTROL AND RECOVERY - 2
A DATABASE SYSTEM is a hardware and software system supporting data access commands (operations). The DATABASE OPERATIONS of interest to this discussion are two: Read(x), which returns the value stored in the data item named x, and Write(x, value), which changes the value of x to value. A database system executes these operations ATOMICALLY and as a von Neumann machine (as if sequentially). Operation execution is usually CONCURRENT, however: several operation executions overlap in time, but the final effect must be the same as the sequential ordering specified by the concurrency controller.
The TRANSACTION OPERATIONS of interest in this discussion are three: START (or BEGIN) means "a new transaction begins here"; COMMIT means "this transaction completed correctly (normally) and its effects should now be made permanent"; ABORT means "this transaction did not complete correctly and all its effects should be obliterated".

INTRODUCTION TO CONCURRENCY CONTROL AND RECOVERY - 3
We assume the database system responds to a START with the return of a unique Transaction Identifier (TID) from a linearly ordered set of TIDs (e.g., a start time or timestamp), and that the transaction program attaches the TID to each operation as it is submitted. Our database-system view of a transaction is that it consists only of:
- a START,
- a partially ordered set of database operations (reads and writes): a read(x) operation returns the value found in data item x to the requesting transaction; a write(x, new_value) operation replaces the current value found in data item x with new_value and returns an acknowledgement (ACK) to the requesting transaction,
- a COMMIT or an ABORT (but never both).
The user view of a transaction is that of one or more procedures including database and transaction operations and program logic.
COMMIT or ABORT: A transaction is ACTIVE if it has started but not yet committed or aborted, and it is UNCOMMITTED if it is aborted or active. Note: aborts can be transaction-generated or imposed by the database system. ABORT wipes out all of a transaction's effects (on the database or its users). Causes? A system failure in the middle of a transaction, or the database system discovering it has returned an incorrect value to the transaction. Execution of a transaction's COMMIT (and the return of an acknowledgement (ack) of that commit) constitutes a guarantee that the DBS will not abort the transaction and that all the transaction's effects will survive subsequent failures of the system (or even operating-system failures that corrupt main memory, but not media failures).
MESSAGES: We assume there is no direct communication between separate transactions, except through writes to and reads from the database.
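To make the interface just described concrete, here is a minimal, hypothetical Python sketch of a database system offering START, Read, Write, COMMIT, and ABORT. It obliterates an aborted transaction's effects by buffering writes until commit (a deferred-update strategy chosen for brevity, not the before-image scheme discussed later in these notes); all class and method names are my own, not something from the slides.

```python
from itertools import count

class ToyDBS:
    """A toy single-threaded database system: named items plus transaction operations."""

    def __init__(self, items):
        self._items = dict(items)     # committed database state
        self._next_tid = count(1)     # linearly ordered TIDs
        self._pending = {}            # TID -> buffered writes of active transactions

    def start(self):
        tid = next(self._next_tid)
        self._pending[tid] = {}
        return tid                    # the transaction attaches this TID to later operations

    def read(self, tid, x):
        # A transaction sees its own buffered writes, otherwise the committed value.
        return self._pending[tid].get(x, self._items[x])

    def write(self, tid, x, new_value):
        self._pending[tid][x] = new_value   # deferred: not visible in the database yet
        return "ACK"

    def commit(self, tid):
        self._items.update(self._pending.pop(tid))   # make all effects permanent

    def abort(self, tid):
        self._pending.pop(tid)        # discard buffered writes: no effect on the database


# Usage sketch: an aborted transaction leaves no trace.
dbs = ToyDBS({"x": 1, "y": 3})
t = dbs.start()
dbs.write(t, "x", 2)
dbs.abort(t)
assert dbs.read(dbs.start(), "x") == 1
```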
INTRODUCTION TO CONCURRENCY CONTROL AND RECOVERY - 4
Most database systems provide (guarantee) the following ACID properties for their transactions:
Atomicity: A transaction's changes to the state of the database are atomic; that is, they either all happen or none happen. (These changes include database changes and messages.)
Consistency: A transaction is a correct transformation of the database state. The actions of a transaction, taken as a group, do not violate any of the integrity constraints associated with the state. (This requires that the transaction be a correct program.)
Isolation: Even though transactions execute concurrently, it must appear to each transaction T that the others execute either before T or after T, but not both. (It must appear as though T is the only transaction in the system.)
Durability: Once a transaction completes successfully (commits), the changes (writes) it made survive failures.

INTRODUCTION TO CONCURRENCY CONTROL AND RECOVERY - 5
T reads x from S if:
1. T reads x after S has written x,
2. S does not abort before T reads x, and
3. every transaction that writes x between S's write and T's read aborts before T reads.
(So the value returned to T is the value written by S.) T reads_from S if T reads one or more data items from S.
RECOVERABILITY: An execution (of a set of concurrent transactions) is RECOVERABLE (RC) if, for all T that commit, T's commit follows the commit of every transaction that T reads_from. Said another way: if T reads_from S and T commits, then S commits before T commits. Note: recoverability can be achieved by delaying (dependent) commits.
Admittedly, terminal I/O and other external interactions between transactions and the users who issue them can result in indirect communication and therefore dependencies, but we consider any errors caused by such interactions to be user errors (we wash our hands of them ;-). I.e., we do not consider transactions to communicate with each other except through database writes and reads. (DBSs can prevent some user INDIRECT_READ problems by DEFERRING all output until commit.)

INTRODUCTION TO CONCURRENCY CONTROL AND RECOVERY - 6
CASCADELESSNESS (avoidance of cascading aborts, or ACA) of an execution is achieved if the DBMS ensures that every transaction reads only values written by committed transactions. The DBMS must delay Read(x) until all transactions that have already issued a Write(x, value) abort or commit. I.e., if T reads_from S, then S must commit or abort before T reads from S. The set of cascadeless executions is contained in the set of recoverable executions (ACA ⊆ RC); therefore, as a property of executions, cascadelessness implies recoverability (ACA ⇒ RC).

INTRODUCTION TO CONCURRENCY CONTROL AND RECOVERY - 7
STRICT EXECUTION: Many systems implement ABORT by restoring the "before-image" of all writes made by the aborting transaction.
[Timeline example: initially X = 1 and Y = 3. T1 runs and commits; concurrently, T2 writes X = 2 and Y = 1 and then aborts. The before-image of X in Write_by_T2(X, 2) is 1 and the before-image of Y in Write_by_T2(Y, 1) is 3, so restoring the before-images leaves the database exactly as if T2 had never happened.]
We see that "aborting by restoring before-images" requires that the DBMS use the WRITE-AHEAD-LOGGING (WAL) PROTOCOL: the DBMS can't overwrite a value without logging it first.

INTRODUCTION TO CONCURRENCY CONTROL AND RECOVERY - 8
WRITE-AHEAD-LOGGING (WAL) PROTOCOL: The DBS can't overwrite a value without logging it first. (The logged value then becomes the before-value (bfv), and "logging" means, basically, writing it to a secure place, e.g., a disk, though a different disk from the one containing the DB.)
[Timeline example: the same scenario as above, with a Log line showing the entries bfv(T2, X) = 1 and bfv(T2, Y) = 3 being appended before T2's corresponding writes.]
Abort by restoring before-images is not always correct:
[Timeline example: X is initially 1. T1 writes X = 2, then T2 writes X = 3 (so T2's logged before-value for X is 2), and then T1 aborts while T2 is still active.]
Restoring before-images would reset X to 1 (whereas it should be 3) unless the abort of T1 is "cascaded" to T2 (i.e., T2 is aborted also).
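As a concrete, hypothetical illustration of abort-by-restoring-before-images under the write-ahead rule, here is a minimal Python sketch: the before-value is appended to a log before an item is overwritten, and abort undoes the transaction's writes newest-first. As the next slides explain, this simple scheme is only correct when executions are strict, i.e., no other active transaction has written the item in between; the class and names below are my own, not from the slides.

```python
class WALStore:
    """Toy store that logs a before-value before every overwrite (write-ahead rule)."""

    def __init__(self, items):
        self.items = dict(items)
        self.log = []                      # sequence of (tid, item, before_value)

    def write(self, tid, x, new_value):
        # WAL: record the before-value (here: append to the log) before overwriting.
        self.log.append((tid, x, self.items[x]))
        self.items[x] = new_value

    def abort(self, tid):
        # Undo the aborting transaction's writes by restoring before-images, newest first.
        for logged_tid, x, before_value in reversed(self.log):
            if logged_tid == tid:
                self.items[x] = before_value
        self.log = [entry for entry in self.log if entry[0] != tid]


# Usage sketch, mirroring the slide's example: X=1, Y=3; T2 writes X=2 and Y=1, then aborts.
store = WALStore({"X": 1, "Y": 3})
store.write("T2", "X", 2)
store.write("T2", "Y", 1)
store.abort("T2")
assert store.items == {"X": 1, "Y": 3}     # as if T2 had never happened
```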
INTRODUCTION TO CONCURRENCY CONTROL AND RECOVERY - 9
Restoring before-images would reset X to 1 (whereas it should be 3) unless the abort of T1 is "cascaded" to T2 (i.e., T2 is aborted also). If aborts are cascaded to all transactions that read_from T1:
[Timeline example: X is initially 1. T1 writes X = 2, T2 writes X = 3, then T1 aborts (ABORT-T1) and the abort is cascaded to T2 (ABORT-T2).]
then restoring before-images would reset X to 1 (when T1 aborts) and then X to 2 (when T2 aborts). This is also incorrect (X should be restored to 1). The problem is that T2's before-image was written by an aborted transaction. Solution: Write(x, val) must be delayed until all transactions that have previously written x have terminated (committed or aborted).

INTRODUCTION TO CONCURRENCY CONTROL AND RECOVERY - 10
STRICT (ST) executions:
1. Delay Read(x) until all transactions that did a previous Write(x) either commit or abort (this avoids cascading aborts).
2. Delay Write(x) until all transactions that did a previous Write(x) either commit or abort (so that abort can be implemented by restoring before-images).
Strictness implies cascadelessness implies recoverability: as sets of executions, ST ⊆ ACA ⊆ RC, and as properties, ST ⇒ ACA ⇒ RC.

INTRODUCTION TO CONCURRENCY CONTROL AND RECOVERY - 11
RIGOROUS (RG) executions are strict executions that additionally satisfy:
3. Delay Write(x) until all transactions that previously did a Read(x) commit or abort.
Rigorousness implies strictness implies cascadelessness implies recoverability: RG ⊆ ST ⊆ ACA ⊆ RC.

INTRODUCTION TO CONCURRENCY CONTROL AND RECOVERY - 12
DATABASE CONSISTENCY: States of the database are consistent if they satisfy the consistency constraints (e.g., Bal = sum(accts)). Designers define the consistency predicates. Transaction correctness also requires that the transaction preserve consistency. Serializable executions preserve consistency.
SERIAL EXECUTION means there is no interleaving of the operations of concurrent transactions. Serial executions are correct if each transaction is correct (takes the database from one consistent state to another consistent state), so the DB goes from consistent state to consistent state ... forever (each consistent state is produced by one isolated transaction).

INTRODUCTION TO CONCURRENCY CONTROL AND RECOVERY - 13
A SERIALIZABLE (SR) execution is one that produces the same output and has the same effect on the database state as SOME serial execution. Serializability is a definition of complete correctness in our model. SERIAL ⊆ SR.
ORDERING transactions: If equivalence to a particular order is required, it is the user's responsibility to ensure it (issue the 2nd transaction only after the commit of the 1st transaction is ack'ed by the system). Serializability is sometimes taken to be unrealistic. (Editorial: I don't agree! Especially in this age of "two" systems, an OLTP system (OnLine Transaction Processing, for short updates) and a DW system (Data Warehouse, for ad hoc, long-running read-only queries), we should be able to achieve SR!) The examples usually given for relaxing it: averages over large amounts of data (some inconsistent retrieval is tolerable) and physical process control (the process is nonterminating, so "very long running").
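The slide above defines serializability by equivalence to some serial execution. A standard, easily checkable sufficient condition is conflict serializability, tested by building a precedence graph and looking for cycles. The sketch below (my own illustration with a toy history format and hypothetical names, not something from the slides) shows the idea.

```python
from itertools import combinations

def conflict_serializable(history):
    """history: ordered list of (tid, op, item) with op in {'r', 'w'}.
    Builds the precedence (conflict) graph and reports whether it is acyclic."""
    edges = set()
    for (t1, op1, x1), (t2, op2, x2) in combinations(history, 2):
        # Two operations conflict if they belong to different transactions,
        # touch the same item, and at least one of them is a write.
        if t1 != t2 and x1 == x2 and 'w' in (op1, op2):
            edges.add((t1, t2))   # the earlier operation's transaction must precede the later one's

    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)

    def has_cycle(node, visiting, done):
        # Depth-first search with gray (visiting) / black (done) marking.
        visiting.add(node)
        for nxt in graph.get(node, ()):
            if nxt in visiting or (nxt not in done and has_cycle(nxt, visiting, done)):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    done = set()
    return not any(has_cycle(n, set(), done) for n in graph if n not in done)


# r1(x) w2(x) w1(x): edges T1->T2 (read before write) and T2->T1 (write before write) form a cycle.
assert conflict_serializable([('T1', 'r', 'x'), ('T2', 'w', 'x'), ('T1', 'w', 'x')]) is False
# r1(x) w1(x) r2(x) w2(x) is equivalent to the serial order T1 then T2.
assert conflict_serializable([('T1', 'r', 'x'), ('T1', 'w', 'x'),
                              ('T2', 'r', 'x'), ('T2', 'w', 'x')]) is True
```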
INTRODUCTION TO CONCURRENCY CONTROL AND RECOVERY - 14
From Gray-Reuter: "Surprisingly, most systems do not provide true isolation (e.g., serializability). Originally, this was because implementors didn't understand the issues. Now the issues are understood, but implementors make a compromise between correctness and performance." The typical modern system's default is to guard against problems caused by read-froms (a WRITE followed by a READ) and write-overs (a WRITE followed by a WRITE) but to ignore write-after-read (a READ followed by a WRITE) dependencies. This goes under the name "cursor stability" in most SQL systems. The ISO and ANSI SQL standards mandate true isolation (i.e., serializability) as the default, but few vendors follow this aspect of the standard. Rather, they allow sophisticated users to request isolation as an option called "repeatable reads". They also allow a third option that disables write-after-read isolation, called "browse" access, which lets queries scan the database without acquiring locks and without delaying other transactions. These options are generally called degrees of isolation. Full isolation is used in this treatment. Nonetheless, we give the definitions of the isolation levels, but don't recommend them.
ISOLATION LEVELS: SQL defines at least three levels of isolation weaker than serializability (they do not guarantee consistency entirely, but they are easier to achieve). We look at REPEATABLE READ, READ COMMITTED, and READ UNCOMMITTED.

INTRODUCTION TO CONCURRENCY CONTROL AND RECOVERY - 15
REPEATABLE READ ensures that a transaction T reads only changes made by committed transactions, and that no value read or written by T can be changed by another transaction until T is complete (committed or aborted). Repeatable read still allows the PHANTOM PHENOMENON: if T is reading "all employee records in the sales department", another transaction might insert a new employee record with department = sales, which could be missed by T's read. Repeatable read isolation can be achieved by using the same locking protocol as 2PL except without index locking. (Index locking would guarantee that no other transaction can complete an insert of an employee record in the sales department until T has completed its read, since such a read locks the value department = sales in the department index and an insert involves a write to that same index record.)
READ COMMITTED ensures that T reads only changes made by committed transactions and that no value written by T is changed by any other transaction until T is complete. However, a value read by T may well be modified by another transaction while T is still in progress. T is exposed to the phantom problem as well. A read committed transaction obtains exclusive locks before writing items and holds these locks until END. It also obtains shared locks before reading, but these locks are released immediately (their only effect is to guarantee that the transaction that last modified the object has completed that task).
READ UNCOMMITTED ensures nothing: T can read changes made to an item by an ongoing transaction, and the item can be further changed while T is in progress. T is exposed to the phantom problem. A read uncommitted transaction does not obtain shared locks before reading items. This mode represents the greatest exposure to the uncommitted changes of other transactions; so much so that SQL prohibits such a transaction from making any changes itself: a read uncommitted transaction is required to have an access mode of READ ONLY.
There are other definitions of reduced levels of isolation (less than serializability) in other treatments of the subject (and in many actual commercial systems). These notes will stick to serializability, and only make occasional comments regarding the reduced levels. Thank you.
Section 3 #1