* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Insert Title Here
Microsoft Access wikipedia , lookup
Oracle Database wikipedia , lookup
Serializability wikipedia , lookup
Ingres (database) wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Microsoft SQL Server wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Concurrency control wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Relational algebra wikipedia , lookup
Clusterpoint wikipedia , lookup
Versant Object Database wikipedia , lookup
1 The Relational Data Model and SQL 10:30AM—12noon Monday, July 18th, 2005 CSIG05 Chaitan Baru 2 OUTLINE • Foundations of the relational data model – Data models and database systems – Relations, attributes, keys • Introduction to SQL • “Hands-on” with SQL – Vishwanath Nandigam 3 Historical note: IBM’s role in database systems • IBM’s early database systems were based on the hierarchical data model – IMS (Information Management System). – IMS serves more than 95 percent of Fortune 1000 companies – Manages 15 petabytes of production data – Supports more than two hundred million users per day S1 Student S2 Course C1 C2 C2 C3 C4 Instructor I1 I2 I2 I3 I1 4 IBM and Relational Database Systems • Relational model was introduced to provide separation of application logic from the data representations • DB2 IBM’s second database product! • Foundations of relational model were invented at IBM by E.F.Codd, IBM – A Relational Model of Data for Large Shared Data Banks, E. F. Codd, June 1970 • First prototype, System R, was developed at IBM Almaden in mid-70’s – Introduced SQL – Provided as SQL/DS on IBM mainframe systems – Oracle was based on early System R work 5 Why the Relational Model? • Deals with the application logic / data separation in business data processing, unlike the earlier “network” and “hierarchical” data model • Plus, an algebra for manipulating relations the key innovation Student N takes S# student info M Course C# course info S# C# N 1 teaches Instructor 6 What is a Database System? • Database (system) = – Database Instance (e.g. set of tables of rows) – Database Management System (DBMS) • Origins in the commercial world: – to organize, query, and manipulate data more effectively, efficiently, and independently • Scientific databases – often special features: • spatial, temporal, spatiotemporal, GIS, units, uncertainty, raw & derived data, … 7 Why not just use files as “databases”? • Works for some applications… • But: – scanning & ‘grep’ing large files can be very inefficient – no language support for selecting desired data, joining them, etc. • cannot express the kinds of questions/queries you’d like to ask • ‘grep’ is no substitute for a query language – – – – – – – redundant and/or inconsistent storage of data no transaction management and concurrency control among multiple users no security no recovery no data independence (application data) no data modeling support … 8 Features of a Database System • A data model (relational, object-oriented, XML) prescribes how data can be organized: – as relations (tables) of tuples (rows) – as classes of (linked) objects – as XML trees • A (database) schema (stored in the “data dictionary”) defines the structure of a specific database instance: – Relational schema – OO schema – XML Schema (or XML DTD) • A query language – Allows ad hoc, declarative (non-procedural) queries on schema 9 Features of a Database System • Data is treated uniformly and separately from the application • Efficient data access • Queries and views are expressed over the schema • Integrity constraints (checking and enforcement) • Transactions combine sets of operations into logical units (all-or-nothing) • Synchronization of concurrent user transactions • Recovery (after system crash) – not to be confused w/ backup – instead: guarantee consistency by “roll-back” of partially executed transactions (how? Hint: logging) • … 10 DB features: E.g. Concurrency Control • Concurrent execution of simultaneous requests – long before web servers where around... – transaction management guarantees consistency despite concurrent/interleaved execution • Transaction (= sequence of read/write operations) – Atomicity: a transaction is executed completely or not at all – Consistency: a transaction creates a new consistent DB state, i.e., in which all integrity constraints are maintained – Isolation: to the user, a transaction seems to run in isolation – Durability: the effect of a successful (“committed”) transaction remains even after system failure 11 Levels of Abstraction: Architecture Overview User Conceptual … View 1 View 2 logical data independence Level ER-Model (Entity-Relationship) OO Models (Classes…) part of DB design conceptual design … often lost in the process… View n Export schemas Logical (“conceptual”) level Tables physical data independence Physical level Index structures DB instances 12 Database Design: EntityRelationship (ER) Model Name • • • • Entities: Relationships: Attributes: ER Model: Salary Employee Name works-for Manager Department since – initial, high-level DB design (conceptual model) – easy to map to a relational schema (database tables) – comes with more constraints (cardinalities, aggregation) and extensions: EER (is-a => class hierarchies) – related: UML (Unified Modeling Language) class diagrams 13 The Relational Model Employee Emp tom tim sally carol carol …. Salary DNo 60k 1 57k 1 45k 3 30k 1 35k 2 FK: foreign key, pointing to another key Department DNo 1 2 3 Name Toys Comp. Shoes Mgr carol carol sam • Relation/Table Name: – employee, dept • Attributes = Column Names: – Emp, Salary, DeptNo, Name, Mgr • Relational Schema: – employee(Emp:string, Salary:integer, DeptNo:integer), ... • Tuple = Row of the table: – (“tom”, “60000”, “1”) • Relation = Set of tuples: – {(...), (...), ...} 14 Creating a Relational Database in SQL CREATE TABLE employee ( ssn CHAR(11), name VARCHAR(30), deptNo INTEGER, PRIMARY KEY (ssn), FOREIGN KEY (deptNo) REFERENCES department ) CREATE TABLE department ( deptNo INTEGER, name VARCHAR(20), manager CHAR(11), PRIMARY KEY (deptNo), FOREIGN KEY (manager) REFERENCES employee(ssn) ) 15 What is a Query? • Intuitively: – An “executable question” in terms of a database schema – Evaluating a query Q against a database instance D yields a set of answer objects: • Relational tuples or XML elements • Example: – Who are the employees in the ‘Toys’ dept.? – Who is (are) the manager(s) of ‘Tom’? – Show all pairs (Employee, Mgr) • Technically: – A mapping from an input schema (the given table schemas) to a result schema (the new columns you are interested in) defined in some query language 16 Why (Declarative) Query Languages? • Things we talk and think about in PLs and QLs – Assembly languages: • registers, memory locations, jumps, ... – C and the likes: • if-then-else, for, while, memory (de-)allocation, pointers, ... – Object-oriented languages: • • • • C++: C plus objects, methods, classes, ... Java: objects, methods, classes, references, ... Smalltalk: objects, objects, objects, ... OQL: object-query language 17 Why (Declarative) Query Languages? • Things we talk and think about in PLs and QLs – Functional languages (Haskell, ML): • (higher-order) functions, fold(l|r), recursion, patterns, ... => Relational languages (SQL, Datalog) • relations (tables), tuples (rows); conceptual level: ER • relational operations: , , , , ..., ,,,,,..., , , |X| => Semistructured/XML (Tree) & Graph Query Languages • trees, graphs, nodes, edges, children nodes, siblings, … • XPath, XQuery, … • Also: Focus on what, and not how! 18 Example: Querying a Relational Database input tables Employee Department Emp anne john DeptNoMgr 1 anne 2 anne Salary DeptNo 62k 2 60k 1 join SQL query (or view def.) SELECT e.Emp, d.Mgr we don’t say how to FROM Employee e, Department d evaluate this expression WHERE e.DeptNo = d.DeptNo result answer (or view) Emp john anne Mgr anne anne 19 Example Query: SQL vs DATALOG • “List all employees and their managers” • In SQL: SELECT e.name, d.manager FROM Employee e, Department d WHERE e.deptNo = d.deptNo a “join” operation • In DATALOG: q(E, M) :- employee(E, S, D), department(D, N, M). 20 Important Relational Operations • select(R, Condition) – filter rows of a table wrt. a condition • project(R, Attr) – remove unwanted columns; keep rest • join(R1, A2, R2, A2, Condition) – find “matches” in a “related” table – e.g. match R1.foreign key = R2.primary key • cartesian product(R1, R2) • union (“OR”), intersection (“AND”) • set-difference (“NOT IN”) 21 Queries, Views, Integrity Constraints • … can all be seen as “special queries” • Query q(…) :- … ad-hoc queries • View v(…) :- … exported views; • Integrity Constraints – ic (…) :- …. MgrSal < EmpSal … – say what shouldn’t happen – if it does: alert the user (or refuse an update, …) 22 Query Evaluation vs Reasoning • Query evaluation – Given a database instance D and a query Q, run Q(D) – What databases do all the time • Reasoning (aka “Semantic Query Optimization”) – Given a query Q and a constraint C, “optimize” Q&C (e.g., given C, Q might be unsatisfiable) – Given Q1 and Q2 decide whether Q1 Q2 – Given Q1,Q2, C decide whether Q1 Q2 | C – Note: we are NOT given a database instance D here; just the schema and the query/IC expressions 23 Summary QLs for Relational Databases Natural Join: same attribute name add condition that values must match 24 Relational Algebra 25 Relational Algebra 26 Relational Algebra 27 Relational Algebra 28 Relational Algebra