* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Here - NYU Computer Science Department
Encyclopedia of World Problems and Human Potential wikipedia , lookup
Global serializability wikipedia , lookup
Microsoft SQL Server wikipedia , lookup
Commitment ordering wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Microsoft Access wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Oracle Database wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Serializability wikipedia , lookup
Ingres (database) wikipedia , lookup
Functional Database Model wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Versant Object Database wikipedia , lookup
Relational model wikipedia , lookup
Concurrency control wikipedia , lookup
Database model wikipedia , lookup
Dr. Zvi Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Dennis Shasha has also made changes in blue. Goals of the Course To provide students a fundamental understanding of the principles underlying both currently available and state of the art databases. To acquaint you with basic tools used in the design and implementations of databases, so you are better prepared to use commercially available systems. It turns out that the more you understand about how these systems are built, the better you will be at using a database system. If you like it, then it will like you. Database System Concepts 1.2 ©Zvi M. Kedem Korth and Sudarshan ©Silberschatz, A Sample Relational Database NAME A A B C E# 121 132 101 106 E# DATE 85.7.13 85.8.26 E# DATE 84.6.20 85.5.16 BOOK 132 a 121 b ILLNS 101 cold 121 flu E# 101 121 106 132 ADDR aaa bbb ccc ddd Database System Concepts BIRTH 1948.2.12 1951.6.17 1963.2.28 1951.5.10 GRADE 2 3 4 2 SALARY 80 70 70 80 BIRTH 1963.2.28 1948.2.12 1951.5.10 1951.6.17 1.3 ©Zvi M. Kedem Korth and Sudarshan ©Silberschatz, Some Typical Queries Some typical queries • Give names of all employees born before 1960.1.1. • Give names and addresses of all employees who took out a book before 1985.8.1. • Send notices to all employees who had a flu to come for a checkup. Note that some queries involve a single table, and some involve several tables. We would like to have a convenient language, as close as possible to a natural language, to express these queries, and similar ones. Such languages exist; principally, SQL. Database System Concepts 1.4 ©Zvi M. Kedem Korth and Sudarshan ©Silberschatz, The Need for Good Design It is important also to think carefully about the correct (or just good!) choice of which tables to use and what should be their structure. Some sample problems with our design: Bad structures can lead to inconsistency or loss of data. • Here we have the birth date appearing twice. • There is a danger that this will be inconsistent. • Even more dangerous would be multiple storage of employee's address. • If the employee moves, the address must be changed everywhere. Note, however, that it might be useful to replicate information, to speed up the access. This introduces a field called database tuning, which is mostly the province of advanced databases, though we will touch on it here. Database System Concepts 1.5 ©Zvi M. Kedem Korth and Sudarshan ©Silberschatz, Different Users Need Different Data It may be our goal to create a design that best reflects the inherent properties of the data. • Various user groups may need to look at the data assuming different organization. For privacy/security reasons we may want to give different users different access privileges to the database. For example: • The payroll department can see salaries but cannot see diseases. • The health department can see diseases but cannot see salaries. Users may prefer to look at different aspects of the information. • The payroll department may prefer to see the salary in a different currency. • The health department may prefer to see AGE instead of BIRTH. Database System Concepts 1.6 ©Zvi M. Kedem Korth and Sudarshan ©Silberschatz, Views A possible solution: give each user (class of users) privileges to look at a view, that is a small derived database. For example, • the health department may think that there is a table: E# 101 121 NAME B A ADDRSS aaa bbb AGE 25 37 DISEASE COLD FLU The database should provide such a view, which is computed from the existing tables, without the user knowing other (prohibited for this user) information. We need to leave flexibility for unanticipated queries. • Some people may have the right and want to ask the query: How are salaries and diseases correlated? Database System Concepts 1.7 ©Zvi M. Kedem Korth and Sudarshan ©Silberschatz, Physical Design The database system must be organized so that it is able to process queries efficiently. To do that: • Files must be organized appropriately. • Indices may be employed For example, if we frequently want to find the GRADE for various E#, maybe the file should be hashed on this value, allowing direct access. But, if we want to print the salaries of all the employees born in 1963, maybe the file should be sorted by BIRTH. What to do in general? Physical design of databases deals with such issues, which are also closely related to the optimization of query processing. Database System Concepts 1.8 ©Zvi M. Kedem Korth and Sudarshan ©Silberschatz, Recovery The database must be resilient despite the fact that the system is prone to faults. Assume one more table, describing the balances of employees' accounts in the credit union. • E# 101 106 121 132 SAVNGS 40 40 0 10 CHECKNG 30 20 80 0 We want to give each employee a bonus of 10 in the savings account. • To do that, a transaction (a portion of a user program) will sequentially change the values of SAVINGS. Database System Concepts 1.9 ©Zvi M. Kedem Korth and Sudarshan ©Silberschatz, Example of a Problem The file describing the table is stored on a disk, values are read into random access memory, modified and written out: if X is a local variable then we have a trace of the desired execution (in shorthand): • X:=SAVINGS[101] X:=X+10 SAVINGS[101]:=X . . . Read Processing Write What if the system crashes in the middle (say, electricity stops)? We do not know which of the values have been changed, so what to do to recover? Various techniques exist for managing the execution, so that reliable execution is possible. Database System Concepts 1.10 ©Zvi M. Kedem Korth and Sudarshan ©Silberschatz, Concurrency There may also be problems because of the concurrent execution of several transactions in a time sharing system. Assume that Alice is running a transaction, T1, that should compute and print the sum of the total bank accounts: • Account BALANCE Checking 70 Savings 80 • Concurrently Bob wants to move 40 from CHECKING to SAVINGS, using transaction T2. In a concurrent system we could have an incorrect execution. We write CH for CHECKING and SA for SAVINGS. Database System Concepts 1.11 ©Zvi M. Kedem Korth and Sudarshan ©Silberschatz, Execution Trace of Two Transactions T1 (Bob) T2 (Alice) . . . Remove 40 from checking Add 40 to savings . Read and sum up checking plus savings . Alice wonders what is going on. Database System Concepts 1.12 ©Zvi M. Kedem Korth and Sudarshan ©Silberschatz, The Layers/Levels of the Ideal Database It is customary to think of the database as made of layers or levels. The standard considers the top 3 of the 4 below: • External (User) View[1], ..., View[n] • Conceptual (Community) Description of the enterprise. Table design. • Internal Files, access methods, indices, distribution • Database O.S. Recovery and concurrency, hardware and OS. Different levels have different tasks. Database System Concepts 1.13 ©Zvi M. Kedem Korth and Sudarshan ©Silberschatz, The External Level The external level is seen by various users. Each view (subschema) is like a small conceptual level. It can also change in time. • (A particular view may be modified, deleted or added). Database System Concepts 1.14 ©Zvi M. Kedem Korth and Sudarshan ©Silberschatz, The Internal Level The internal level deals with file organization/storage management. It changes in time too. • New storage devices are brought • files may have indices created because some queries are frequent • the data may be geographically distributed, • … Database System Concepts 1.15 ©Zvi M. Kedem Korth and Sudarshan ©Silberschatz, The Database Operating System Level The database operating system level deals with concurrency and recovery. The database operating system can change too. The manufacturer of the database management system may discover better methods to handle recovery/concurrency. Database System Concepts 1.16 ©Zvi M. Kedem Korth and Sudarshan ©Silberschatz, Independence Among Levels A very important goal is (Data-) independence between levels: e.g. the indexes that are chosen should not affect the semantics of the application. We must make sure that changes in one level disturb as little as possible the other levels. Sophisticated users soon realize that performance considerations may cause one to violate this ideal. But you should know the tradeoffs. Database System Concepts 1.17 ©Zvi M. Kedem Korth and Sudarshan ©Silberschatz, Who Does What? The vendor sends: • The database operating system • Tools to create and manipulate the three top levels: external, conceptual, and internal. The database administrator and his/her staff discuss with with the users what information should the database contain and its structure. • A common model (language for describing reality) is needed for them to communicate. • Entity-relationship model is frequently used. The administrator and the users design the conceptual and the external levels. Database System Concepts 1.18 ©Zvi M. Kedem Korth and Sudarshan ©Silberschatz,