Locks
• What do we want:
– if one transaction is accessing a shared resource,
– another transaction cannot simultaneously access it.
• Locks: a mechanism for enforcing this
– When you want to access an item x, ask for a lock.
– If somebody else has the lock on x, then what? Wait.
– If nobody else has the lock on x? Go ahead and access x.
– When finished with x, release the lock. Why? So other transactions can access the item.

Locks basics
• First look at simpler binary locks (yes/no)
– i.e. either you have a lock or you don't
• Then look at more realistic shared/exclusive locks
– Aka read/write locks
– Multiple transactions can simultaneously have a shared lock. Why is this OK?
• If two transactions are simultaneously reading, they are not interfering with one another.

Binary Locks Eg
• Eg: T1 and T2 both trying to write y
– T1: wants to write y; asks for a lock on y; gets the lock on y
– T2: wants to write y
– T1: write(y)
– T2: asks for a lock on y; waits
– T1: unlock(y)
– T2: gets the lock on y; write(y)

Binary Locks basics
• When are locks released?
• Could be immediately after read/write.
• Will see that this can lead to problems
– So can release at the end of the transaction, when either committing or aborting: synchpoints
– But for now assume release immediately after read/write
• How locks work:
– LOCK = 0: available
– LOCK = 1: not available
– When you want to access item Y: Lock_item(Y)
– When done with item Y: Unlock_item(Y)
• How to implement Lock_item(Y), Unlock_item(Y)?

Locking/unlocking: proposed solution
Lock_item(Y):
  if LOCK(Y) = 0 (*item is unlocked*)
  then LOCK(Y) ← 1 (*lock the item*)
  else go to sleep;
Unlock_item(Y):
  LOCK(Y) ← 0 (*unlock the item*)
  if any transactions are waiting
  then wake up one of the waiting transactions;
• Will this work?

Problems with proposed solution
1. Suppose T1 and T2 are both trying to get a lock
• Could be with multiple processors or interleaved execution on a single processor
2. T1 checks the value of the lock, finds it = 0
3. Before T1 can change the lock value to 1, control passes to T2
4. T2 checks the value of the lock, finds it = 0!
• Testing lock = 0 and setting lock to 1 has to be an atomic, indivisible operation. How?
• Semaphores, TSL, etc.: done by the OS (CS644/320)

Shared/Exclusive Locks
• Two lock modes: shared (S) and exclusive (X)
– To read, need a shared (S) lock
– To write, need an exclusive (X) lock
• If T1 holds an S lock and T2 wants an S lock? Will get it.
• If T1 holds an S lock and T2 wants an X lock? Will have to wait.
• If T1 holds an X lock and T2 wants an X lock? Will have to wait.
• If T1 holds an X lock and T2 wants an S lock? Will have to wait.
• (A small sketch of these rules appears after these slides.)

Who deals with the locking?
• All of the locking is transparent to the user
– User programs don't have to request locks
– User programs just try to read and write values
– The DBMS inserts lock requests automatically into the transaction
• Why not let user programs deal with locks?
• Don't want to rely on user programs to behave
– Eg: suppose a user program is in charge of getting locks
– but it did a write without getting a lock. Problem?
• Inconsistent data

Eg: Shared/Exclusive Locks
• T1 is going to do r(A), r(B), display(A+B):
get S-lock(A); read(A); unlock(A);
get S-lock(B); read(B); unlock(B);
display(A+B)
• Does this guarantee serializability?

Lock-Based Protocols
• Locking as above is not sufficient to guarantee serializability. Why?
• If A and B get updated in between the reads of A and B, the displayed sum would be wrong.
– T2: transferring: adding 5 to A, subtracting 5 from B
• Locking protocol: a set of rules followed by all transactions while requesting or releasing locks.
• Locking protocols
– Restrict the set of possible schedules – don't allow some
– Control the concurrency
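The compatibility rules above (S is compatible with S; anything involving an X lock waits) can be made concrete in a small sketch. This is an illustration only, not how a DBMS lock manager is built: the class and method names are my own, and a Python mutex plus condition variable stands in for the OS-level atomic test-and-set mentioned on the "Problems with proposed solution" slide.

```python
import threading

class SXLock:
    """Minimal shared/exclusive lock for a single data item (illustration only)."""

    def __init__(self):
        self._mutex = threading.Lock()                # makes the check-and-set atomic
        self._cond = threading.Condition(self._mutex)
        self._readers = set()                         # transactions holding S locks
        self._writer = None                           # transaction holding the X lock

    def lock_S(self, t):
        with self._cond:
            while self._writer is not None:           # S is blocked only by an X lock
                self._cond.wait()
            self._readers.add(t)

    def lock_X(self, t):
        with self._cond:
            while self._writer is not None or self._readers:   # X waits for everyone
                self._cond.wait()
            self._writer = t

    def unlock(self, t):
        with self._cond:
            if self._writer == t:
                self._writer = None
            else:
                self._readers.discard(t)
            self._cond.notify_all()                   # wake transactions waiting on this item
```

Because the lock state is only examined and changed while `_mutex` is held, the race in the earlier `Lock_item` pseudocode (two transactions both seeing LOCK = 0) cannot occur here.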
Deadlock Eg
• If transactions don't release locks till commit:
– Executing lock-X on X causes T2 to wait for T1 to release its lock on X
– Executing lock-X on Y causes T1 to wait for T2 to release its lock on Y
• Deadlock: neither T1 nor T2 can make progress
– T1: lock-X on X; write(X); then requests lock-X on Y and waits
– T2: lock-X on Y; write(Y); then requests lock-X on X and waits

Deadlocks
• Deadlock: a cycle of transactions waiting for locks to be released by each other.
• Two main ways of dealing with deadlocks:
• Deadlock detection:
– allow deadlock to happen
– figure out when it has happened
– deal with it by rolling back one or more transactions
• Deadlock prevention: a protocol ensures that the system will never enter a deadlock

Deadlock detection/Rollback
• After a deadlock is detected, roll back one of the transactions. Problems?
• Wasted effort
• Another possible issue is starvation: one process doesn't get service
– Possible if the concurrency control manager is badly designed. In the above example?
• The same transaction is repeatedly rolled back due to deadlocks.
• The concurrency control manager can be designed to prevent starvation.

Detection vs Prevention
• There are 2 ways to deal with a deadlocked transaction:
– Blocking: forcing a T to wait
– Aborting: killing a T
• Both involve performance penalties:
– Blocking: if T has a lock on Y, and T is forced to wait, then all other transactions that need Y will also have to wait.
– Aborting: wastes work already done; overhead in restarting.
• Deadlock prevention: more blocking
• Deadlock detection: more aborting

Granularity: what gets locked
• Complex topic, might cover later
• What can be locked?
– A single record (a row in a table)
– A page or several pages of the file storing the table
– The whole table
– An index or part of an index
• The coarser (the bigger) the structure being locked:
– Less concurrency
– Less overhead
– Greater safety

Lock Conversion
• Lock upgrade: convert an existing read lock to a write lock. When should this be allowed?
– If Ti has a read-lock(y) and no other Tj has a read-lock(y)
– then convert read-lock(y) to write-lock(y)
– else force Ti to wait until Tj unlocks y
• Lock downgrade: convert an existing write lock to a read lock

Issues with Lock Conversion
• If T1 holds an S lock, T2 is waiting for an X lock, and T3 comes and asks for an S lock, what to do?
• Won't give T3 the S lock. Why?
• Otherwise T2 might be waiting forever – starvation
• T1 has an S lock, T2 is waiting for an X lock. T1 wants to upgrade to an X lock. Allow?
• Depends on the protocol. Suppose T1 is not allowed to upgrade. What now?
• T1 won't give up its S lock, so T2 has to wait for T1; to get the X lock, T1 has to wait for T2. Deadlock!
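The upgrade rule from the Lock Conversion slide can be written down directly. The sketch below only models the decision, not the waiting; the function name and the way the lock state is passed in (a set of readers plus an optional writer) are my own choices.

```python
def may_upgrade(t, readers, writer):
    """Upgrade rule from the Lock Conversion slide: Ti's read lock on y may be
    converted to a write lock only if no other transaction holds a read lock
    on y (and nobody holds a write lock); otherwise Ti must wait."""
    return writer is None and readers == {t}

# T1 is the sole reader of y: the upgrade is allowed.
assert may_upgrade("T1", {"T1"}, None)
# T1 and T3 both hold read locks on y: T1 must wait.  If T2 is meanwhile queued
# for an X lock on y, T1 and T2 end up waiting for each other -- the deadlock
# described on the "Issues with Lock Conversion" slide.
assert not may_upgrade("T1", {"T1", "T3"}, None)
```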
Implementation of Locking
• Lock manager: implemented as a separate process
– to which transactions send lock and unlock requests
• The lock manager replies to a lock request by
– either granting the lock,
– or making the requesting transaction wait till the lock is available,
– or asking the transaction to roll back, in case of a deadlock
• Lock table: data structure maintained by the lock manager to record granted locks and pending requests
– usually implemented as a hash table
– indexed on the name of the data item being locked

Lock Table
• (Figure: a lock table with a queue of granted and waiting requests per data item. Black rectangles indicate granted locks; white ones indicate waiting requests.)
• The lock table also records the type of lock granted or requested
• A new request is added to the end of the queue of requests for the data item, and granted if it is compatible with all earlier locks: FCFS
• An unlock request results in that request being deleted, and later requests are checked to see if they can now be granted
• If a transaction aborts, all waiting or granted requests of the transaction are deleted
– the lock manager may keep a list of locks held by each transaction, to implement this efficiently

Deadlock Detection
• Create a waits-for graph: nodes are transactions
• When Ti requests a data item currently held by Tj
– the edge Ti → Tj is inserted in the wait-for graph.
– This edge is removed only when Tj is no longer holding a data item needed by Ti.
• If there is deadlock, what can we say about the graph?
• Deadlock if and only if there is a cycle
• Periodically check for cycles in the waits-for graph. How? (A small sketch follows these slides.)
– Done in CSCI 6632/3326
– Saw something similar with serializability
– Not serializable if there is a cycle in the precedence graph

Two Wait-for Graph Egs
• (Figure: a wait-for graph with a cycle over T1, T2, T3, T4.)
• Wait-for graph for the schedule on the right?
– Is there deadlock?

Wait-for Graph Eg
• (Figure only.)

Another Wait-for Graph Eg
• (Schedule over four transactions T1–T4 involving the operations S(A), R(A), X(B), W(B), X(C), S(C), R(C), X(A).)
• What is the wait-for graph for this?
– We assume transactions do not give up any locks till they commit
– Is there deadlock?
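Checking the waits-for graph for a cycle is a standard depth-first search. The sketch below is a minimal version of that check; the graph representation (a dict from each transaction to the set of transactions it waits for) and the function name are assumptions of mine, not a lock manager's actual interface.

```python
def has_deadlock(waits_for):
    """Return True if the wait-for graph contains a cycle (i.e. a deadlock).

    waits_for maps a transaction to the set of transactions it is waiting on,
    e.g. {"T1": {"T2"}, "T2": {"T1"}}.
    """
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited / on current path / finished
    color = {}

    def dfs(t):
        color[t] = GREY
        for u in waits_for.get(t, ()):
            if color.get(u, WHITE) == GREY:        # back edge: cycle found
                return True
            if color.get(u, WHITE) == WHITE and dfs(u):
                return True
        color[t] = BLACK
        return False

    return any(color.get(t, WHITE) == WHITE and dfs(t) for t in waits_for)

# The earlier Deadlock Eg: T1 waits for T2's lock and T2 waits for T1's.
assert has_deadlock({"T1": {"T2"}, "T2": {"T1"}})
assert not has_deadlock({"T1": {"T2"}, "T2": set()})
```

The same traversal answers the serializability question mentioned on the slide: run it on the precedence graph instead of the wait-for graph.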
Deadlock Detection – when to do?
• Run it either periodically at fixed intervals, or when a lot of transactions are waiting.
• If periodically, and we run it too often?
• Expensive: a lot of overhead
• If we run it infrequently?
• Lots of transactions may be blocked
• Deadlock detection works well if there is little interference among transactions
– Less chance of deadlock

Deadlock Recovery
• When a deadlock is detected, some transaction will have to be rolled back (made a victim) to break the deadlock. Which one?
• One strategy: select as victim the transaction that will incur minimum cost.

Deadlock Recovery
• Rollback – determine how far to roll back the transaction
• Total rollback: abort the transaction and then restart it.
• More effective to roll back the transaction only as far as necessary to break the deadlock.
• Starvation happens if the same transaction is always chosen as victim
– Include the number of rollbacks in the cost factor to avoid starvation

Deadlock Prevention
• Use some protocol which will never allow deadlock. A couple of simple examples:
• No waiting: if T can't get a lock immediately, abort T. No deadlock – why? Problem?
• Leads to lots of transactions getting aborted. Better:
• Cautious waiting: Ti wants a lock that Tj holds; if Tj is blocked then abort Ti, else Ti gets blocked.
• Why no deadlock?

Transaction Timestamp
• Each transaction has a unique timestamp
– TS(T1) < TS(T2) means T1 is older than T2
• Assign priorities based on timestamps
– Older transactions get higher priority. Why?
• An older transaction has done more work, so it is more expensive to kill
• Timestamps can be used in different ways to prevent deadlock
– We look at 2 algorithms that use timestamps to determine when to roll back, and whom to roll back:
• Wait-Die
• Wound-Wait

Deadlock Prevention: Wait-Die
• Wait-Die scheme
• If an older T2 wants a resource held by a younger T1, T2 waits for T1.
• If a younger T1 wants a resource held by an older T2, T1 won't wait for T2
– T1 will be rolled back instead.
• Why no deadlock?
• Can't get a cycle. Why?
• If there were a cycle, somewhere there would have to be an edge from a younger to an older transaction.

Deadlock Prevention: Wait-Die
• A transaction may die several times before acquiring the needed data item.
• Each time, it is restarted with the same timestamp.
– Why the same timestamp?
– What would be the problem if it were given a new timestamp each time?
• It would always be young. Problem?
• It would be getting rolled back all the time: starvation

Deadlock Prevention: Wound-Wait
• Wound-Wait scheme:
• If a younger T1 wants a resource held by an older T2
– T1 can wait for T2.
• If an older T2 wants a resource held by a younger T1
– T2 wounds (forces the rollback of) T1
• instead of waiting for T1
– T1 is restarted with its original timestamp
• Why no deadlock?
• An older T never waits for a younger T
• Preemptive: a running T can be preempted
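The two timestamp schemes differ only in who waits and who is rolled back. The sketch below captures that decision; the function name, the string return values, and the convention that a smaller timestamp means an older transaction (as on the Transaction Timestamp slide) are my own framing.

```python
def resolve_conflict(scheme, requester_ts, holder_ts):
    """Requester asks for a lock that holder already owns.

    Smaller timestamp = older transaction.  Returns "WAIT" when the requester
    blocks, or ("ABORT", who) naming the transaction that is rolled back.
    """
    requester_is_older = requester_ts < holder_ts
    if scheme == "wait-die":
        # older waits for younger; a younger requester dies (is rolled back)
        return "WAIT" if requester_is_older else ("ABORT", "requester")
    if scheme == "wound-wait":
        # an older requester wounds (rolls back) the younger holder; a younger requester waits
        return ("ABORT", "holder") if requester_is_older else "WAIT"
    raise ValueError("unknown scheme")

# Older T2 (ts=1) requests an item held by younger T1 (ts=5):
assert resolve_conflict("wait-die", 1, 5) == "WAIT"
assert resolve_conflict("wound-wait", 1, 5) == ("ABORT", "holder")
# Younger T1 (ts=5) requests an item held by older T2 (ts=1):
assert resolve_conflict("wait-die", 5, 1) == ("ABORT", "requester")
assert resolve_conflict("wound-wait", 5, 1) == "WAIT"
```

In either scheme all waiting runs in one direction of the timestamp order, so the wait-for graph cannot contain a cycle.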
2-phase locking (2PL)
• What might we want from a locking protocol?
• Serializable: equivalent to some serial schedule
• Strict: easy to abort, no cascading rollbacks
• No deadlock

Figure 22.3
• T1: read_lock(Y); read_item(Y); unlock(Y); write_lock(X); read_item(X); X := X + Y; write_item(X); unlock(X);
• T2: read_lock(X); read_item(X); unlock(X); write_lock(Y); read_item(Y); Y := X + Y; write_item(Y); unlock(Y);
• Initial values: X = 20, Y = 30
• Result of serial execution T1 followed by T2: X = 50, Y = 80
• Result of serial execution T2 followed by T1: X = 70, Y = 50
• If T1 and T2 behave as above, can we be sure that an interleaved schedule will be serializable?
• No – an example that is not serializable? Next slide.

Figure 22.3
• Interleaved schedule: T1 does read_lock(Y); read_item(Y); unlock(Y); then T2 runs completely (read_lock(X); read_item(X); unlock(X); write_lock(Y); read_item(Y); Y := X + Y; write_item(Y); unlock(Y)); then T1 finishes (write_lock(X); read_item(X); X := X + Y; write_item(X); unlock(X)).
• With X = 20, Y = 30 the result is X = 50, Y = 50: nonserializable (it matches neither serial result).
• Why did this happen?
• The Y lock should not have been given up by T1 till the X lock had been obtained.

2-phase locking
• Two-phase locking: a transaction cannot request additional locks once it releases any lock.
• Phase 1: Growing phase
– transaction may obtain locks
– transaction may not release locks
• Phase 2: Shrinking phase
– transaction may release locks
– transaction may not obtain locks
• Doesn't mean you must get all your locks at the beginning
– That is enough to guarantee 2PL, but it is not the only way
• (A small checker for this rule appears after these slides.)

Figure 22.4: 2-phase locking Eg
• T1: read_lock(Y); read_item(Y); write_lock(X); unlock(Y); read_item(X); X := X + Y; write_item(X); unlock(X);
• T2: read_lock(X); read_item(X); write_lock(Y); unlock(X); read_item(Y); Y := X + Y; write_item(Y); unlock(Y);
• In this eg, T1 and T2 follow 2PL:
– After giving up a lock, they never ask for another lock
• Two questions:
– Can deadlock happen?
– If there is no deadlock, can a non-serializable schedule happen?

2-phase locking, deadlock
• Does 2-phase locking guarantee freedom from deadlock?
• Look at Figure 22.4 again:
– T1: read_lock(Y); read_item(Y);
– T2: read_lock(X); read_item(X);
– T1: write_lock(X); (waits for X)
– T2: write_lock(Y); (waits for Y)
• Did T1, T2 follow 2PL? Yes.
• Are T1, T2 in deadlock? Yes.

2-phase locking, serializability
• If every transaction follows 2PL, are we guaranteed serializability?
• Yes. What we will show:
– non-serializable → deadlock
– How does this help?
• Contrapositive of "non-serializable → deadlock"?
• No deadlock → serializable
• We will now prove: non-serializable → deadlock

2-phase locking, serializability
• Suppose the schedule is non-serializable.
– What can we say about the precedence graph?
• It has to have a cycle: say T1 → T2 → T1
• What does the T1 → T2 edge in the precedence graph tell us?
• T1 accessed some resource Y before T2
– So T1 had the lock on Y before T2 got the lock on Y
• T1 won't give up its lock on Y till it has acquired all its locks
• What does the T2 → T1 edge in the precedence graph tell us?
• T2 accessed some resource X before T1
– So T2 had the lock on X before T1 got the lock on X
• T2 won't give up its lock on X till it has acquired all its locks

2-phase locking, serializability
• T1 got the lock on Y first
• T2 has to wait for T1 to release the lock on Y
• T2 got the lock on X first
• T1 has to wait for T2 to release the lock on X
• Result? Deadlock!
• We have now proved: non-serializable → deadlock
• Contrapositive: no deadlock → serializable

Conservative 2-phase locking
• Conservative (aka static) 2PL = 2PL + an additional requirement:
– Predeclare all locks: T has to get all the locks it is ever going to need before it starts executing
– All locks are obtained together at the start
– i.e. if T doesn't get one lock, it won't get any locks
• Eg:
– Suppose T wants to first read y and then read z
– T' has a write lock on z; nobody has a lock on y
– T will predeclare read locks on y and z. Will T get the read lock on y?
• No: if it can't get all of them, it can't get any.
• Advantage of this compared to regular 2PL?

Conservative 2-phase locking
• No deadlock. Why? Suppose there are only T1 and T2.
• If there is an edge from T2 to T1 in the wait-for graph
– this means T1 got in before T2. So…
• T1 got all its locks before T2 got any locks. So…
• there is no edge from T1 to T2. So…
• no cycle, and hence no deadlock.
• Downside?
• Reduces concurrency. Why?
• If T can't get the lock on y, it can't proceed – even if y is not needed for a long time
• How do you know which locks you will need?

2-phase locking: bad Eg
– T1: write_lock(Y); read_item(Y); Y = Y + 2; write_item(Y); gives up the write lock on Y
– T2: read_lock(Y); read_item(Y)
– T1: abort
• Problem?
• T2 read from a write done by the uncommitted T1
– Cascading rollback: T2 has to be rolled back.
• Did we follow 2PL?
• Yes, but this still happened: why?

Strict 2-phase locking
• The lock was given up by T1 too early. To prevent the problem, when should it have been given up?
• Only when T1 committed or aborted
• Strict 2PL: T does not release any X locks till it either commits or aborts
• Leads to strict schedules. Why?
• No T' can read or write any item written by T till T has committed. Why?
• T' can't get the locks: T only releases them on commit

Strict, Rigorous, Conservative 2PL
• Rigorous (aka strong strict) 2PL: T does not release any X or S locks till it either commits or aborts
– Easier to implement than strict 2PL, since in this case no locks are released till commit
– Has some advantages in distributed databases
• Strict and rigorous 2PL are both commonly used.
• Does a conservative schedule have to be strict? No: HW problem
• Does a strict schedule have to be conservative? No: HW problem
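The two-phase rule itself ("no lock request after the first release") is easy to check mechanically for a single transaction's action sequence. The sketch below does only that; the list-of-pairs representation and the function name are mine.

```python
def follows_2pl(actions):
    """Check the two-phase rule for one transaction.

    `actions` is a list of ("lock", item) / ("unlock", item) pairs in execution
    order.  Once the transaction has released any lock (shrinking phase), it
    may not acquire another one.
    """
    shrinking = False
    for op, _item in actions:
        if op == "unlock":
            shrinking = True
        elif op == "lock" and shrinking:
            return False            # lock requested after a release: not 2PL
    return True

# T1 of Figure 22.3 unlocks Y before locking X, so it is not two-phase:
assert not follows_2pl([("lock", "Y"), ("unlock", "Y"),
                        ("lock", "X"), ("unlock", "X")])
# T1 of Figure 22.4 acquires both locks before releasing either:
assert follows_2pl([("lock", "Y"), ("lock", "X"),
                    ("unlock", "Y"), ("unlock", "X")])
```

Strict and rigorous 2PL would additionally require that the unlock actions appear only after the transaction commits or aborts.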
SQL Server Locking
• Locking in Microsoft SQL Server, Alexander Chigrik
– http://www.mssqlcity.com/Articles/Adm/SQL70Locks.htm
• "There are three main types of locks that SQL Server uses:
• Shared locks are used for operations that do not change or update data, such as a SELECT statement.
• Update locks are used when SQL Server intends to modify a page, and later promotes the update page lock to an exclusive page lock before actually making the changes.
• Exclusive locks are used for data modification operations, such as UPDATE, INSERT, or DELETE.
• Shared locks are compatible with other Shared locks or Update locks.
• Update locks are compatible with Shared locks only.
• Exclusive locks are not compatible with other lock types."

SQL Server Locking Eg
• "Let me describe it with a real example. There are four processes which attempt to lock the same page of the same table. These processes start one after another, so Process1 is the first process, Process2 is the second process, and so on. Process1: SELECT, Process2: SELECT, Process3: UPDATE, Process4: SELECT.
• Process1 sets a Shared lock on the page, because there are no other locks on this page. Process2 sets a Shared lock on the page, because Shared locks are compatible with other Shared locks. Process3 wants to modify data and wants to set an Exclusive lock, but it cannot do so until Process1 and Process2 have finished, because an Exclusive lock is not compatible with the other lock types. So, Process3 sets an Update lock. Process4 cannot set a Shared lock on the page until Process3 has finished.
• So, there is no lock starvation. Lock starvation occurs when read transactions can monopolize a table or page, forcing a write transaction to wait indefinitely. So, Process4 waits until Process3 has finished. After Process1 and Process2 have finished, Process3 converts its Update lock into an Exclusive lock to modify the data. After Process3 has finished, Process4 sets a Shared lock on the page to select data."
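The compatibility rules quoted above fit in a small table. The sketch below encodes them and is only an illustration; the dictionary layout and helper function are mine, and it does not model the queueing that makes Process4 wait behind Process3 (that is the FCFS behaviour from the lock-table slide rather than an incompatibility).

```python
# Is a requested lock compatible with one already granted on the same page?
# S = Shared, U = Update, X = Exclusive, per the quoted SQL Server rules.
COMPATIBLE = {
    ("S", "S"): True,  ("S", "U"): True,  ("S", "X"): False,
    ("U", "S"): True,  ("U", "U"): False, ("U", "X"): False,
    ("X", "S"): False, ("X", "U"): False, ("X", "X"): False,
}

def can_grant(requested, granted_modes):
    """Grant `requested` only if it is compatible with every lock already held."""
    return all(COMPATIBLE[(requested, g)] for g in granted_modes)

# Process1 and Process2 (SELECTs) share the page:
assert can_grant("S", []) and can_grant("S", ["S"])
# Process3 (UPDATE) cannot take an Exclusive lock while readers hold S locks,
# but it can take an Update lock and promote it later:
assert not can_grant("X", ["S", "S"])
assert can_grant("U", ["S", "S"])
```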
Oracle locks
• Oracle® Database Concepts, 10g Release 2
– http://download.oracle.com/docs/cd/B19306_01/server.102/b14220/consist.htm
• "Oracle uses two modes of locking in a multiuser database:
• Exclusive lock mode prevents the associated resource from being shared. This lock mode is obtained to modify data. The first transaction to lock a resource exclusively is the only transaction that can alter the resource until the exclusive lock is released.
• Share lock mode allows the associated resource to be shared, depending on the operations involved. Multiple users reading data can share the data, holding share locks to prevent concurrent access by a writer (who needs an exclusive lock). Several transactions can acquire share locks on the same resource."

More on Oracle locks
• "Types of Locks: Oracle automatically uses different types of locks to control concurrent access to data and to prevent destructive interaction between users. Oracle automatically locks a resource on behalf of a transaction to prevent other transactions from doing something also requiring exclusive access to the same resource. The lock is released automatically when some event occurs so that the transaction no longer requires the resource.
• Throughout its operation, Oracle automatically acquires different types of locks at different levels of restrictiveness depending on the resource being locked and the operation being performed.
• Oracle locks fall into one of three general categories:
• DML locks (data locks): DML locks protect data. For example, table locks lock entire tables, and row locks lock selected rows.
• DDL locks (dictionary locks): DDL locks protect the structure of schema objects, for example the definitions of tables and views.
• Internal locks and latches protect internal database structures such as data files."

More on Oracle locks
• "Lock Duration: All locks acquired by statements within a transaction are held for the duration of the transaction, preventing destructive interference including dirty reads, lost updates, and destructive DDL operations from concurrent transactions. The changes made by the SQL statements of one transaction become visible only to other transactions that start after the first transaction is committed.
• Oracle releases all locks acquired by the statements within a transaction when you either commit or undo the transaction. Oracle also releases locks acquired after a savepoint when rolling back to the savepoint. However, only transactions not waiting for the previously locked resources can acquire locks on the now available resources. Waiting transactions will continue to wait until after the original transaction commits or rolls back completely.
• Data Lock Conversion Versus Lock Escalation:
• A transaction holds exclusive row locks for all rows inserted, updated, or deleted within the transaction. Because row locks are acquired at the highest degree of restrictiveness, no lock conversion is required or performed.
• Oracle automatically converts a table lock of lower restrictiveness to one of higher restrictiveness as appropriate. For example, assume that a transaction uses a SELECT statement with the FOR UPDATE clause to lock rows of a table. As a result, it acquires the exclusive row locks and a row share table lock for the table. If the transaction later updates one or more of the locked rows, the row share table lock is automatically converted to a row exclusive table lock."

More on Oracle locks
• "Lock escalation occurs when numerous locks are held at one level of granularity (for example, rows) and a database raises the locks to a higher level of granularity (for example, table). For example, if a single user locks many rows in a table, some databases automatically escalate the user's row locks to a single table lock. The number of locks is reduced, but the restrictiveness of what is being locked is increased.
• Oracle never escalates locks. Lock escalation greatly increases the likelihood of deadlocks. Imagine the situation where the system is trying to escalate locks on behalf of transaction T1 but cannot because of the locks held by transaction T2. A deadlock is created if transaction T2 also requires lock escalation of the same data before it can proceed.
• Deadlock Detection: Oracle automatically detects deadlock situations and resolves them by rolling back one of the statements involved in the deadlock, thereby releasing one set of the conflicting row locks. A corresponding message also is returned to the transaction that undergoes statement-level rollback. The statement rolled back is the one belonging to the transaction that detects the deadlock. Usually, the signaled transaction should be rolled back explicitly, but it can retry the rolled-back statement after waiting.
• Deadlocks most often occur when transactions explicitly override the default locking of Oracle. Because Oracle itself does no lock escalation and does not use read locks for queries, but does use row-level locking (rather than page-level locking), deadlocks occur infrequently in Oracle."
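From an application, the SELECT ... FOR UPDATE pattern described in the lock-conversion excerpt looks roughly like the sketch below. This is a hedged illustration using the Python DB-API: the python-oracledb driver, the connection details, and the ACCOUNT table are all assumptions, and the row-share to row-exclusive table-lock conversion happens inside Oracle, not in this code.

```python
import oracledb  # assumption: the python-oracledb driver; any DB-API driver looks similar

# Placeholder credentials/DSN, for illustration only.
conn = oracledb.connect(user="scott", password="tiger", dsn="localhost/xepdb1")
cur = conn.cursor()

# Lock the row we intend to change: per the excerpt, this takes an exclusive
# row lock plus a row share (RS) table lock.
cur.execute("SELECT balance FROM account WHERE acct_id = :1 FOR UPDATE", [42])
(balance,) = cur.fetchone()

# Updating the locked row is the point at which Oracle converts the
# row share table lock to a row exclusive (RX) table lock.
cur.execute("UPDATE account SET balance = :1 WHERE acct_id = :2",
            [balance + 5, 42])

conn.commit()  # per the Lock Duration excerpt, the transaction's locks are released here
```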
Elmasri Company Database
• The company is organized into DEPARTMENTS.
• Each department has a name, a number, and an employee who manages the department.
– We keep track of the start date of the department manager.
• Each department controls a number of PROJECTS.
• Each project has a name and a number, and is located at a single location.

Elmasri Company Database
• For each EMPLOYEE, we store the social security number, address, salary, sex, and birthdate.
• Employees may have a supervisor.
• Each employee works for one department but may work on several projects.
• We keep track of the number of hours per week that an employee currently works on each project.
• Each employee may have a number of DEPENDENTS.
– For each dependent, we keep track of their name, sex, birthdate, and relationship to the employee.

The ER diagram for the COMPANY database
• (Figure only.)

Elmasri database with FK: Figure 3.7
• (Figure only.)

Elmasri Figure 3.6: Relational Instance
• (Figure only.)

Review of Database Programming
• Done in CS 622
• Accessing the database from an application program
– as opposed to interactive interfaces. Why?
• The interactive interface is convenient but not enough
– The majority of database operations are done through application programs
• written by application programmers
• We do a quick review of database programming
– to help understand SQL support for transactions

Database Programming Approaches
• Embedded commands:
– Database commands embedded in a general-purpose programming language
• Library of database functions:
– Available to the host language for database calls
– Application Program Interface, e.g. JDBC/ODBC (a small sketch of this approach appears at the end of this section)
• Design a brand new, special database language:
– Minimizes the impedance mismatch.
– Eg: Oracle's PL/SQL
• We will only review embedded SQL

Embedded SQL approach
• Approach: embed SQL in the host language.
– Eg: Pro*C, Pro*COBOL, Pro*FORTRAN
• A DBMS-specific preprocessor transforms the embedded SQL statements into function calls in the host language
• Then a regular compiler is used to compile the code.
• Details of the translation vary across DBMSs
• The source code may be the same for different DBMSs
– but the final executable will work only with one DBMS

Embedded SQL: PRO*C
• Embedded SQL programming language
• Used by Oracle, Sybase SQL Server
• Pro*C uses either C or C++ as its host language
• During compilation:
– embedded SQL statements are interpreted by the precompiler
– and replaced by C or C++ function calls
• The output from the Pro*C precompiler is standard C or C++ code
– This is compiled by a C or C++ compiler into an executable
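For contrast with the embedded-SQL/Pro*C approach, here is a hedged sketch of the "library of database functions" approach mentioned on the Database Programming Approaches slide, using Python's built-in sqlite3 module as a stand-in for a JDBC/ODBC-style API. The EMPLOYEE table and its columns loosely follow the Elmasri COMPANY schema and are illustrative assumptions.

```python
import sqlite3

# Library-of-functions (API) approach: the program calls database functions at
# run time instead of having SQL precompiled into host-language calls.
conn = sqlite3.connect(":memory:")            # throwaway in-memory database
cur = conn.cursor()

# A tiny stand-in for the Elmasri EMPLOYEE table (columns are assumptions).
cur.execute("CREATE TABLE employee (ssn TEXT PRIMARY KEY, lname TEXT, "
            "salary REAL, dno INTEGER)")
cur.execute("INSERT INTO employee VALUES ('123456789', 'Smith', 30000, 5)")

# Statements run inside a transaction; commit() ends it, which is also when
# the DBMS can release the transaction's locks.
cur.execute("UPDATE employee SET salary = salary * 1.1 WHERE dno = ?", (5,))
conn.commit()

for ssn, lname, salary, dno in cur.execute("SELECT ssn, lname, salary, dno FROM employee"):
    print(ssn, lname, salary, dno)
```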