Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Microsoft SQL Server wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Clusterpoint wikipedia , lookup
Relational model wikipedia , lookup
Concurrency control wikipedia , lookup
Database model wikipedia , lookup
EECS 647: Introduction to Database Systems Instructor: Luke Huan Spring 2007 Administrative The following book has been reserved in the engineering library: Data mining : concepts and techniques by Jiawei Han and Micheline Kamber 5/22/2017 Luke Huan Univ. of Kansas 2 Review: Data Integrity in SQL Constraints Trigger 5/22/2017 Luke Huan Univ. of Kansas 3 Trigger example CREATE TRIGGER EECS168AutoRecruit AFTER INSERT ON Student Event REFERENCING NEW ROW AS newStudent FOR EACH ROW WHEN (newStudent.GPA > 3.0) Condition INSERT INTO Enroll VALUES(newStudent.SID, ’EECS168’); Action 5/22/2017 Luke Huan Univ. of Kansas 4 Today’s Topic View Index Transaction Recursion 5/22/2017 Luke Huan Univ. of Kansas 5 Views A view is like a “virtual” table Defined by a query, which describes how to compute the view contents on the fly DBMS stores the view definition query instead of view contents Can be used in queries just like a regular table 5/22/2017 Luke Huan Univ. of Kansas 6 Creating and dropping views Example: EECS647roster CREATE VIEW EECS647Roster AS SELECT SID, name, age, GPA Called “base tables” FROM Student WHERE SID IN (SELECT SID FROM Enroll WHERE CID = ’EECS647’); To drop a view DROP VIEW view_name; 5/22/2017 Luke Huan Univ. of Kansas 7 Using views in queries Example: find the average GPA of EECS647 students SELECT AVG(GPA) FROM EECS647Roster; To process the query, replace the reference to the view by its definition SELECT AVG(GPA) FROM (SELECT SID, name, age, GPA FROM Student WHERE SID IN (SELECT SID FROM Enroll WHERE CID = ’EECS647’)); 5/22/2017 Luke Huan Univ. of Kansas 8 Why use views? To hide data from users To hide complexity from users Logical data independence If applications deal with views, we can change the underlying schema without affecting applications Recall physical data independence: change the physical organization of data without affecting applications To provide a uniform interface for different implementations or sources Real database applications use tons of views 5/22/2017 Luke Huan Univ. of Kansas 9 Modifying views Does not seem to make sense since views are virtual But does make sense if that is how users see the database Goal: modify the base tables such that the modification would appear to have been accomplished on the view Be careful! There may be one way to modify There may be many ways to modify There may be no way to modify 5/22/2017 Luke Huan Univ. of Kansas 10 A simple case CREATE VIEW StudentGPA AS SELECT SID, GPA FROM Student; DELETE FROM StudentGPA WHERE SID = 123; translates to: DELETE FROM Student WHERE SID = 123; 5/22/2017 Luke Huan Univ. of Kansas 11 An impossible case CREATE VIEW HighGPAStudent AS SELECT SID, GPA FROM Student WHERE GPA > 3.7; INSERT INTO HighGPAStudent VALUES(987, 2.5); No matter what you do on Student, the inserted row will not be in HighGPAStudent 5/22/2017 Luke Huan Univ. of Kansas 12 A case with too many possibilities CREATE VIEW AverageGPA(GPA) AS SELECT AVG(GPA) FROM Student; Note that you can rename columns in view definition UPDATE AverageGPA SET GPA = 2.5; Set everybody’s GPA to 2.5? Adjust everybody’s GPA by the same amount? Just lower Lisa’s GPA? 5/22/2017 Luke Huan Univ. of Kansas 13 SQL92 updateable views More or less just single-table selection queries No join No aggregation No subqueries Arguably somewhat restrictive Still might get it wrong in some cases See the slide titled “An impossible case” Adding WITH CHECK OPTION to the end of the view definition will make DBMS reject such modifications 5/22/2017 Luke Huan Univ. of Kansas 14 Indexes for Performance Tuning An index is an auxiliary persistent data structure Search tree (e.g. B-tree, R-tree) Lookup table (e.g., hash table) An index on R.A can speed up accesses of the form R.A = value (lookup) R.A > value (range query) An index on ( R.A1, …, R.An ) can speed up R.A1 = value1 Æ … Æ R.An = valuen (R.A1 , … R.An) > (value1, …, valuen) More on indexes in the second half of this course! 5/22/2017 Luke Huan Univ. of Kansas 15 Examples of Hash Index SELECT * FROM Student WHERE name = ‘Kevin’ Without an index on Student.name: must scan the entire table if we store Student as a flat file of unordered rows With index: go “directly” to rows with name = ’Kevin’ name sid name age gpa John 1234 John 21 3.5 Mary 1123 Mary 19 3.8 Bob 1011 Bob 22 2.6 Susan 1204 Susan 22 3.4 Kevin 1306 Kevin 18 2.9 A hash table 5/22/2017 Luke Huan Univ. of Kansas 16 Creating and dropping indexes CREATE [UNIQUE] INDEX index_name ON table_name(column_name1, …, column_namen); With UNIQUE, the DBMS will also enforce that {column_name1, …, column_namen} is a key of table_name e.g CREATE INDEX name_index ON STUDENT (SName); DROP INDEX index_name; Typically, the DBMS will automatically create indexes for PRIMARY KEY and UNIQUE constraint declarations 5/22/2017 Luke Huan Univ. of Kansas 17 Choosing indexes to create More indexes = better performance? Indexes take space Indexes need to be maintained when data is updated Indexes have one more level of indirection Optimal index selection depends on both query and update workload and the size of tables 5/22/2017 Luke Huan Univ. of Kansas 18 Transactions A program may carry out many operations on the data retrieved from the database However, the DBMS is only concerned about what data is read/written from/to the database. database - a fixed set of relations (A, B, C, …) transaction - a sequence of read and write operations (read(A), write(B), …) DBMS’s abstract view of a user program 5/22/2017 Luke Huan Univ. of Kansas 19 Correctness criteria: The ACID properties A tomicity: All actions in the Xact happen, or none happen. C onsistency: If each Xact is consistent, and the DB starts consistent, it ends up consistent. I solation: Execution of one Xact is isolated from that of other Xacts. D urability: If a Xact commits, its effects persist. An Example about SQL Transaction Consider two transactions (Xacts): T1: T2: • • • • BEGIN A=A+100, B=B-100 END BEGIN A=1.06*A, B=1.06*B END 1st xact transfers $100 from B’s account to A’s 2nd credits both accounts with 6% interest. Assume at first A and B each have $1000. What are the legal outcomes of running T1 and T2??? • $1100 *1.06 = $1166 There is no guarantee that T1 will execute before T2 or vice-versa, if both are submitted together. But, the net effect must be equivalent to these two transactions running serially in some order. 5/22/2017 Luke Huan Univ. of Kansas 21 Example (Contd.) Legal outcomes: A=1166,B=954 or A=1160,B=960 Consider a possible interleaved schedule: T1: T2: • A=1.06*A, B=B-100 B=1.06*B This is OK (same as T1;T2). But what about: T1: T2: • A=A+100, A=A+100, A=1.06*A, B=1.06*B B=B-100 Result: A=1166, B=960; A+B = 2126, bank loses $6 The DBMS’s view of the second schedule: T1: T2: 5/22/2017 R(A), W(A), R(A), W(A), R(B), W(B) Luke Huan Univ. of Kansas R(B), W(B) 22 SQL transactions Syntax in pgSQL: BEGIN <database operations> COMMIT [ROLLBACK] A transaction is automatically started when a user executes an SQL statement (begin is optional) Subsequent statements in the same session are executed as part of this transaction Statements see changes made by earlier ones in the same transaction Statements in other concurrently running transactions do not see these changes COMMIT command commits the transaction (flushing the update to disk) ROLLBACK command aborts the transaction (all effects are undone) 5/22/2017 Luke Huan Univ. of Kansas 23 Atomicity Partial effects of a transaction must be undone when User explicitly aborts the transaction using ROLLBACK The DBMS crashes before a transaction commits Partial effects of a modification statement must be undone when any constraint is violated E.g., application asks for user confirmation in the last step and issues COMMIT or ROLLBACK depending on the response However, only this statement is rolled back; the transaction continues How is atomicity achieved? Logging (to support undo) 5/22/2017 Luke Huan Univ. of Kansas 24 Isolation Transactions must appear to be executed in a serial schedule (with no interleaving operations) For performance, DBMS executes transactions using a serializable schedule In this schedule, only those operations that can be interleaved are executed concurrently Those that can not be interleaved are in a serialized way The schedule guarantees to produce the same effects as a serial schedule 5/22/2017 Luke Huan Univ. of Kansas 25 SQL isolation levels Strongest isolation level: SERIALIZABLE Complete isolation Usually se as default Weaker isolation levels: REPEATABLE READ, READ COMMITTED, READ UNCOMMITTED Increase performance by eliminating overhead and allowing higher degrees of concurrency Trade-off: sometimes you get the “wrong” answer pgSQL default is READ COMMITTED 5/22/2017 Luke Huan Univ. of Kansas 26 READ UNCOMMITTED Can read “dirty” data A data item is dirty if it is written by an uncommitted transaction Problem: What if the transaction that wrote the dirty data eventually aborts? Example: wrong average -- T1: UPDATE Student SET GPA = 3.0 WHERE SID = 142; -- T2: SELECT AVG(GPA) FROM Student; ROLLBACK; COMMIT; 5/22/2017 Luke Huan Univ. of Kansas 27 READ COMMITTED All reads see a snapshot of the database (including all committed transactions) right before the beginning of the query No dirty reads, but non-repeatable reads possible Reading the same data item twice can produce different results Example: different averages -- T1: -- T2: SELECT AVG(GPA) FROM Student; UPDATE Student SET GPA = 3.0 WHERE SID = 142; COMMIT; SELECT AVG(GPA) FROM Student; COMMIT; 5/22/2017 Luke Huan Univ. of Kansas 28 REPEATABLE READ Reads are repeatable, but may see phantoms Do not allow the modification of existing values New rows may be inserted in the mean time Example: different average (still!) -- T1: -- T2: SELECT AVG(GPA) FROM Student; INSERT INTO Student VALUES(789, ‘Nelson’, 10, 1.0); COMMIT; SELECT AVG(GPA) FROM Student; COMMIT; 5/22/2017 Luke Huan Univ. of Kansas 29 SERIALIZABLE READ Reads see the snapshot of the database right before the beginning of the transaction Example: the same average -- T1: -- T2: SELECT AVG(GPA) FROM Student; INSERT INTO Student VALUES(789, ‘Nelson’, 10, 1.0); COMMIT; SELECT AVG(GPA) FROM Student; COMMIT; 5/22/2017 Luke Huan Univ. of Kansas 30 Summary of SQL isolation levels Isolation level/anomaly READ UNCOMMITTED Dirty reads Non-repeatable reads Phantoms Possible Possible Possible READ COMMITTED Impossible Possible Possible REPEATABLE READ Impossible Impossible Possible SERIALIZABLE Impossible Impossible Impossible Syntax: At the beginning of a transaction, SET TRANSACTION ISOLATION LEVEL isolation_level [READ ONLY|READ WRITE]; READ UNCOMMITTED can only be READ ONLY 5/22/2017 Luke Huan Univ. of Kansas 31 Recursion in SQL SQL2 had no recursion You can find Bart’s parents, grandparents, great grandparents, etc. SELECT p1.parent AS grandparent FROM Parent p1, Parent p2 WHERE p1.child = p2.parent AND p2.child = ’Bart’; But you cannot find all his ancestors with a single query SQL3 introduces recursion WITH clause Implemented in DB2 (called common table expressions) Unfortunately, pgSQL does not support it. 5/22/2017 Luke Huan Univ. of Kansas 32 Ancestor query in SQL3 WITH Ancestor(anc, desc) AS base case ((SELECT parent, child FROM Parent) recursion step UNION Define a (SELECT a1.anc, a2.desc a relation FROM Ancestor a1, Ancestor a2 recursively WHERE a1.desc = a2.anc)) SELECT anc FROM Ancestor Query using the relation WHERE desc = ’Bart’; defined in WITH clause How do we compute such a recursive query? 5/22/2017 Luke Huan Univ. of Kansas 33 Summary of SQL features covered so far Query Modification Constraints Triggers Views Transaction Recursion Next: Database programming 5/22/2017 Luke Huan Univ. of Kansas 34