Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
CMPS 277: officially Relational Databases but this quarter Database Implementation Notes 01: Introduction Arthur Keller CS 277 - Winter 2002 Notes 1 1 Isn’t Implementing a Database System Simple? Relations CS 277 - Winter 2002 Statements Notes 1 Results 2 Introducing the Database Management System • The latest from Megatron Labs • Incorporates latest relational technology • UNIX compatible CS 277 - Winter 2002 Notes 1 3 Megatron 3000 Implementation Details First sign non-disclosure agreement CS 277 - Winter 2002 Notes 1 4 Megatron 3000 Implementation Details • Relations stored in files (ASCII) e.g., relation R is in /usr/db/R Smith # 123 # CS Jones # 522 # EE .. . CS 277 - Winter 2002 Notes 1 5 Megatron 3000 Implementation Details • Directory file (ASCII) in /usr/db/directory R1 # A # INT # B # STR … R2 # C # STR # A # INT … .. . CS 277 - Winter 2002 Notes 1 6 Megatron 3000 Sample Sessions % MEGATRON3000 Welcome to MEGATRON 3000! & . .. & quit % CS 277 - Winter 2002 Notes 1 7 Megatron 3000 Sample Sessions & select * from R # Relation R A B SMITH 123 C CS & CS 277 - Winter 2002 Notes 1 8 Megatron 3000 Sample Sessions & select A,B from R,S where R.A = S.A and S.C > 100 # A 123 522 B CAR CAT & CS 277 - Winter 2002 Notes 1 9 Megatron 3000 Sample Sessions & select * from R | LPR # & Result sent to LPR (printer). CS 277 - Winter 2002 Notes 1 10 Megatron 3000 Sample Sessions & select * from R where R.A < 100 | T # & New relation T created. CS 277 - Winter 2002 Notes 1 11 Megatron 3000 • To execute “select * from R where condition”: (1) Read dictionary to get R attributes (2) Read R file, for each line: (a) Check condition (b) If OK, display CS 277 - Winter 2002 Notes 1 12 Megatron 3000 • To execute “select * from R where condition | T”: (1) Process select as before (2) Write results to new file T (3) Append new line to dictionary CS 277 - Winter 2002 Notes 1 13 Megatron 3000 • To execute “select A,B from R,S where condition”: (1) Read dictionary to get R,S attributes (2) Read R file, for each line: (a) Read S file, for each line: (i) Create join tuple (ii) Check condition (iii) Display if OK CS 277 - Winter 2002 Notes 1 14 What’s wrong with the Megatron 3000 DBMS? CS 277 - Winter 2002 Notes 1 15 What’s wrong with the Megatron 3000 DBMS? • Tuple layout on disk e.g., - Change string from ‘Cat’ to ‘Cats’ and we have to rewrite file - ASCII storage is expensive - Deletions are expensive CS 277 - Winter 2002 Notes 1 16 What’s wrong with the Megatron 3000 DBMS? • Search expensive; no indexes e.g., - Cannot find tuple with given key quickly - Always have to read full relation CS 277 - Winter 2002 Notes 1 17 What’s wrong with the Megatron 3000 DBMS? • Brute force query processing e.g., select * from R,S where R.A = S.A and S.B > 1000 - Do select first? - More efficient join? CS 277 - Winter 2002 Notes 1 18 What’s wrong with the Megatron 3000 DBMS? • No buffer manager e.g., Need caching CS 277 - Winter 2002 Notes 1 19 What’s wrong with the Megatron 3000 DBMS? • No concurrency control CS 277 - Winter 2002 Notes 1 20 What’s wrong with the Megatron 3000 DBMS? • No reliability e.g., - Can lose data - Can leave operations half done CS 277 - Winter 2002 Notes 1 21 What’s wrong with the Megatron 3000 DBMS? • No security e.g., - File system insecure - File system security is coarse CS 277 - Winter 2002 Notes 1 22 What’s wrong with the Megatron 3000 DBMS? • No application program interface (API) e.g., How can a payroll program get at the data? CS 277 - Winter 2002 Notes 1 23 What’s wrong with the Megatron 3000 DBMS? • Cannot interact with other DBMSs. CS 277 - Winter 2002 Notes 1 24 What’s wrong with the Megatron 3000 DBMS? • Poor dictionary facilities CS 277 - Winter 2002 Notes 1 25 What’s wrong with the Megatron 3000 DBMS? • No GUI CS 277 - Winter 2002 Notes 1 26 What’s wrong with the Megatron 3000 DBMS? • Lousy salesman!! CS 277 - Winter 2002 Notes 1 27 Course Overview • File & System Structure Records in blocks, dictionary, buffer management,… • Indexing & Hashing B-Trees, hashing,… • Query Processing Query costs, join strategies,… • Crash Recovery Failures, stable storage,… CS 277 - Winter 2002 Notes 1 28 Course Overview • Concurrency Control Correctness, locks,… • Transaction Processing Logs, deadlocks,… • Security & Integrity Authorization, encryption,… • Distributed Databases Interoperation, distributed recovery,… CS 277 - Winter 2002 Notes 1 29 System Structure Strategy Selector User Transaction Concurrency Control Lock Table Query Parser Transaction Manager Buffer Manager File Manager Statistical Data Recovery Manager M.M. Buffer Log Indexes User Data CS 277 - Winter 2002 User Notes 1 System Data 30 Some Terms • • • • Database system Transaction processing system File access system Information retrieval system CS 277 - Winter 2002 Notes 1 31 Mechanics http://www.soe.ucsc.edu/classes/cs277/ Coming soon CS 277 - Winter 2002 Notes 1 32 Prerequisite • An introductory database course equivalent to CMPS180 • Knowledge of SQL (theory and practice) • Algorithms and elementary analysis • If you do not have the prerequisite, you may want to audit the class instead. CS 277 - Winter 2002 Notes 1 33 Staff • • • • INSTRUCTOR: Arthur Keller Office: Baskin Engineering 153A Email: [email protected] – a good way to reach me. Office Hours: Most Tuesdays, Some Thursdays 4:30-5:30pm, often for a few minutes after class, and by appointment. • I’m coming from Palo Alto, so I may be late. • TEACHING ASSISTANT: none • GRADER: ? CS 277 - Winter 2002 Notes 1 34 Details • LECTURES: Tuesday, Thursday 6-7:45pm, SS II 179 • TEXTBOOK: Garcia-Molina, Ullman, Widom: “DATABASE SYSTEMS, THE COMPLETE BOOK” (second half of book, first half was used for CMPS180). • ASSIGNMENTS: Seven written homework assignments. No programming. Also readings in Textbook. • GRADING: Homeworks: 21% (3% each), Survey paper: 19%, Midterm: 20%, Final: 40%. • WEB SITE: All handouts and assignments will be posted on our Web site at http://www.soe.ucsc.edu/classes/cs277/ • Please check it periodically for last minute announcements. • NEWSGROUP: ucsc.class.cmps277 is being set up. CS 277 - Winter 2002 Notes 1 35 Tentative Syllabus CS 277 - Winter 2002 • • • • • • • • • • • • • • • • • • • • • • DATE Tue Mar 26 Thu Mar 28 Tue Apr 2 Thu Apr 4 Tue Apr 9 Thu Apr 11 Tue Apr 16 Thu Apr 18 Tue Apr 23 Thu Apr 25 Tue Apr 30 Thu May 2 Tue May 7 Thu May 9 Tue May 14 Thu May 16 Tue May 21 Thu May 23 Tue May 28 Thu May 30 TBA Wed Jun 5 CHAPTER TOPIC Introduction Class cancelled Ch. 11 Hardware Ch. 12 File and System Structure Ch. 12 File and System Structure Ch. 13 Indexing and Hashing Ch. 13 Indexing and Hashing Ch. 14 Indexing and Hashing Ch. 15 Query Processing Ch. 15 Query Processing Ch. 16 Query Processing Ch. 17 Crash Recovery Midterm Ch. 17 Crash Recovery Ch. 18 Concurrency Control Ch. 18 Concurrency Control Ch. 18 Concurrency Control Ch. 19 Transaction Processing Ch. 19 Transaction Processing Ch. 20 Information Integration Review Final Notes 1 36 Read: All Chapters • Except following optional material: – – – – – – Sections 11.7.4, 11.7.5 Sections 14.3.6, 14.3.7, 14.3.8 Sections 14.4.2, 14.4.3, 14.4.4 Sections 15.7, 15.8, 15.9 Sections 16.6, 16.7 In Chapters 15, 16: material on duplicate elimination operator, grouping, aggregation operators – Section 18.8 – Sections 19.4, 19.5, 19.6, 19.7 CS 277 - Winter 2002 Notes 1 37 Next time: • Hardware • Read chapter 11 CS 277 - Winter 2002 Notes 1 38