Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Open Database Connectivity wikipedia , lookup
Concurrency control wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Microsoft SQL Server wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Clusterpoint wikipedia , lookup
Versant Object Database wikipedia , lookup
Database model wikipedia , lookup
CS405G: Introduction to Database Systems Final Review Database Design 5/22/2017 Jinze Liu @ University of Kentucky 2 E-R model • E-R model • Entities • Attributes • Relationships 5/22/2017 Jinze Liu @ University of Kentucky 4 Database Design 5/22/2017 5 From E-R Diagram to Relations • • • • Relations Schemas Converting E-R diagram to relations Keys • Super keys • Candidate keys • Primary keys • Relational integrity constraints Key Constraints • Superkey: – (Uniqueness constraints) A set of attributes where no two distinct tuples can have the same values – Every relation has at least one superkey: • The set of all attributes. • Key: A minimal superkey – – Uniqueness constraint (superkey) Minimum Constraint • 5/22/2017 No attribute can be removed and still satisfy the uniqueness constraints. Jinze Liu @ University of Kentucky 7 Relational Integrity Constraints • Constraints are conditions that must hold on all valid relation instances. There are four main types of constraints: 1. Domain constraints 1. The value of a attribute must come from its domain 2. Key constraints 3. Entity integrity constraints 4. Referential integrity constraints 5/22/2017 8 Database Normalization • Functional Dependency • Functional Closure • Keys – Redefined – Based on functional dependency • DB Norm Form – 1st, 2nd, 3rd, BCNF Database Query 5/22/2017 Luke Huan Univ. of Kansas 10 Relational Algebra and SQL • Relational algebra • SQL query SFW Group by …, Having Subqueries Relationship between R.A. and SQL Relational algebra A language for querying relational databases based on operators: RelOp RelOp • Core set of operators: – Selection, projection, cross product, union, difference, and renaming • Additional, derived operators: – Join, natural join, intersection, etc. • Compose operators to make complex queries 5/22/2017 Jinze Liu @ University of Kentucky 12 • • • • • • Summary of core operators σp R πL R RXS R S R-S ρ S(A1, A2, …) R Selection: Projection: Cross product: Union: Difference: Renaming: – Does not really add “processing” power 5/22/2017 Jinze Liu @ University of Kentucky 13 Summary of derived operators • Join: • Natural join: • Intersection: 5/22/2017 R pS RS R S Jinze Liu @ University of Kentucky 14 Classification of relational operators • • • • • • • • Selection: σp R Projection: πL R Cross product: R X S Join: R p S Natural join: R S Union: R U S Difference: R - S Intersection: R ∩ S 5/22/2017 Monotone Monotone Monotone Monotone Monotone Monotone Monotone w.r.t. R; non-monotone w.r.t S Monotone Jinze Liu @ University of Kentucky 15 Update Operations on Relations • Update operations – INSERT a tuple. – DELETE a tuple. – MODIFY a tuple. • Constraints should not be violated in updates 5/22/2017 Jinze Liu @ University of Kentucky 16 Basic queries: SFW statement • SELECT A1, A2, …, An FROM R1, R2, …, Rm WHERE condition; • Also called an SPJ (select-project-join) query • (almost) Equivalent to relational algebra query π A1, A2, …, An (σ condition (R1 X R2 X … X Rm)) 5/22/2017 Luke Huan Univ. of Kansas 17 Semantics of SFW • SELECT E1, E2, …, En FROM R1, R2, …, Rm WHERE condition; • For each t1 in R1: For each t2 in R2: … … For each tm in Rm: If condition is true over t1, t2, …, tm: Compute and output E1, E2, …, En as a row • t1, t2, …, tm are often called tuple variables • Not 100% correct, we will see 5/22/2017 Luke Huan Univ. of Kansas 18 Operational semantics of GROUP BY SELECT … FROM … WHERE … GROUP BY …; • Compute FROM • Compute WHERE • Compute GROUP BY: group rows according to the values of GROUP BY columns • Compute SELECT for each group – For aggregation functions with DISTINCT inputs, first eliminate duplicates within the group Number of groups = number of rows in the final output 5/22/2017 Jinze Liu @ University of Kentucky 19 Database Design 5/22/2017 Jinze Liu @ University of Kentucky 20 physical data organization • Storage hierarchy (DC vs. Pluto) ! count I/O’s • Disk geometry: three components of access cost; random vs. sequential I/O • Data layout – Record layout (handling variable-length fields, NULL’s) – Block layout (NSM, PAX) ! inter-/intra-record locality • Access paths – Primary versus secondary indexes – Tree-based indexes: ISAM, B+-tree ! Again, reintroduce redundancy to improve performance ! Fundamental trade-off: query versus update cost 21 Performance Issues on Indexes • Indexes – ISAM – B+ Tree • Metrics – Storage – IO-costs • Operations – Single value query & range query – Insertion and deletion 22 Query Processing Implementation • Typical Query Processings – Selection – Join – Set operations. • Typical approaches – Sequential scans in unsorted database – Sorted database – What are the tradeoffs. 23