CS 579 Database Systems
Theory, Practice & Methodology of Relational Database Design and Programming
Copyright © Ellis Cohen 2002-2008

The Relational Model & Relational Mapping

These slides are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. For more information on how you may use them, please see http://www.openlineconsult.com/db

Overview of Lecture
• Representing Relationships as Bridge Tables
• Relational Models & Referential Integrity
• Foreign Keys
• Relational Mapping of 1:M Relationships
• Relational Mapping Exercises
• Mapping Reflexive Relationships
• Conceptual & Relational Models
• Defining and Changing Attributes
• Assigning Sequential & Unique Values to Attributes
• Metadata & System Data

© Ellis Cohen 2001-2008

Representing Relationships as Bridge Tables

Implementing Conceptual Models
A conceptual model abstractly describes the information we want to persistently store in a database.
But how do we actually represent the model using the tables provided by a relational database?

Entity Classes → Tables
Make each entity class a table
• Each attribute of the entity class becomes an attribute (i.e. a column) of the table
• Define a primary key for the entity class if necessary
• The primary key of the entity class becomes the primary key of the table

Emps
  empno  ename   addr
  7499   ALLEN   10 Lehigh Way, …
  7654   MARTIN  22 Gleason St, …
  7844   TURNER  …
  7212   LAVICH  …
  7698   BLAKE   …
  7986   STERN   …

Depts
  deptno  dname
  30      SALES
  10      ACCOUNTING
  50      SUPPORT

Linking Instances
[Diagram: Employee (child entity class, identified by empno) works for Dept (parent entity class, identified by deptno)]
Employee instances: 7499 ALLEN 15 Pogo Lane; 7654 MARTIN …; 7844 TURNER …; 7212 LAVICH …; 7698 BLAKE 12 Rara Road; 7986 STERN
Dept instances: 30 SALES; 10 ACCOUNTING; …
Relational databases don't actually have "relationships". How do we implement them?

Represent Links by a Bridge Table
[Diagram: Employee (child entity class, identified by empno) works for Dept (parent entity class, identified by deptno)]
Suppose an employee does not have a department?
Suppose a department does not have any employees?

Emps
  empno  ename   addr
  7499   ALLEN   …
  7654   MARTIN  …
  7844   TURNER  …
  7212   LAVICH  …
  7698   BLAKE   …
  7986   STERN   …

WorksFor
  empno  deptno
  7499   30
  7654   30
  7844   30
  7698   10
  7986   10

Depts
  deptno  dname
  30      SALES
  10      ACCOUNTING
  50      SUPPORT

What's the primary key of WorksFor?

Table-Based Mapping
1. Make each entity class a table
2. Make each relationship a table
3. Combine tables with the same primary keys to make queries of the related tables more efficient (usually)

Combine WorksFor with Emps
Before combining: Emps, WorksFor and Depts as on the bridge-table slide above.
Suppose an employee does not have a department?
After combining:

Emps
  empno  ename   addr  deptno
  7499   ALLEN   ...   30
  7654   MARTIN  …     30
  7844   TURNER  …     30
  7212   LAVICH  …
  7698   BLAKE   …     10
  7986   STERN   …     10

Depts
  deptno  dname
  30      SALES
  10      ACCOUNTING
  50      SUPPORT

Reference Columns Help Match Information for Querying
(Emps and Depts as combined above.)
Suppose we want to find out the name of the department that BLAKE works in.
A query can
1. Determine BLAKE's department number (10)
2. Determine the name of department #10
We find the tuple in Depts where Depts.deptno matches Emps.deptno.
We will see shortly how SQL joins can obtain this information in a single query!
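
The two-step lookup above, and its single-query equivalent, can be sketched as follows. This is a minimal illustration, not the slides' own code: it assumes Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for Oracle, and it leaves out the addr column because the slides elide most addresses.

```python
import sqlite3

# In-memory SQLite database; an assumption standing in for the Oracle setup.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Depts(deptno INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE Emps(empno INTEGER PRIMARY KEY, ename TEXT, deptno INTEGER);
    INSERT INTO Depts VALUES (30, 'SALES'), (10, 'ACCOUNTING'), (50, 'SUPPORT');
    INSERT INTO Emps VALUES
        (7499, 'ALLEN', 30), (7654, 'MARTIN', 30), (7844, 'TURNER', 30),
        (7212, 'LAVICH', NULL), (7698, 'BLAKE', 10), (7986, 'STERN', 10);
""")

# Step 1: determine BLAKE's department number.
blake_deptno = conn.execute(
    "SELECT deptno FROM Emps WHERE ename = ?", ("BLAKE",)
).fetchone()[0]

# Step 2: determine the name of that department.
two_step_name = conn.execute(
    "SELECT dname FROM Depts WHERE deptno = ?", (blake_deptno,)
).fetchone()[0]

# The same lookup as one join: match Depts.deptno against Emps.deptno.
join_name = conn.execute(
    """SELECT d.dname
       FROM Emps e JOIN Depts d ON d.deptno = e.deptno
       WHERE e.ename = ?""",
    ("BLAKE",),
).fetchone()[0]

print(blake_deptno, two_step_name, join_name)
```

Both approaches return the same department name; the join simply lets the database do the matching on deptno instead of the application.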

Relational Models & Referential Integrity

Associated Columns
Every employee has a single empno, ename, addr, and a single deptno (which may be NULL if the employee is unassigned)

Emps
  empno  ename   addr  deptno
  7499   ALLEN   ...   30
  7654   MARTIN  …     30
  7844   TURNER  …     30
  7212   LAVICH  …
  7698   BLAKE   …     10
  7986   STERN   …     10

The value of the deptno column identifies another tuple – the tuple which represents the employee's department

Referential Integrity
(Emps as above.)

Depts
  deptno  dname
  30      SALES
  10      ACCOUNTING
  50      SUPPORT

In fact, we want to check that every value of deptno in Emps identifies a legal department – that is, it must match a deptno value in Depts.
This is called "referential integrity"

SQL Representation
  CREATE TABLE Depts(
    deptno number(3) primary key,
    dname varchar(20) );

  CREATE TABLE Emps(
    empno number(4) primary key,
    ename varchar(30),
    addr varchar(80),
    deptno number(3) references Depts(deptno) );

The "references Depts(deptno)" clause is the referential integrity constraint.

Referential Integrity Example
(Emps and Depts data as above.)
Emps( empno, ename, addr, deptno )
  • deptno references Depts( deptno )
Every value in Emps.deptno is in Depts.deptno
NULLs in Emps.deptno are OK
Not every value in Depts.deptno needs to be in Emps.deptno (e.g. 50 isn't)

Referential Integrity Violation

Emps
  empno  ename   addr  deptno
  7499   ALLEN   ...   30
  7654   MARTIN  …     30
  7844   TURNER  …     30
  7212   LAVICH  …     70
  7698   BLAKE   …     10
  7986   STERN   …     10

Depts
  deptno  dname
  30      SALES
  10      ACCOUNTING
  50      SUPPORT

Emps( empno, ename, addr, deptno )
  • deptno references Depts( deptno )
is violated, since Emps.deptno contains a value (70) which is not in Depts.deptno.
The database will NOT allow this!

Relational Models
A Relational Model describes
– The characteristics of each relation (i.e. table)
– Including any referential integrity constraints
• SQL is a textual description of a relational model
• Relational Schema Diagrams are a visual description of a relational model

Relational Schema Diagrams
VISUAL Relational Model (Relational Schema)
  Depts: deptno, dname
  Emps: empno, ename, addr, deptno
  Emps.deptno → Depts.deptno (referential integrity constraint)
BRIEF TEXTUAL Relational Model (TRex)
  Depts( deptno, dname )
  Emps( empno, ename, addr, deptno )
    • deptno references Depts( deptno )

Relational Schemas as 1:M Relationships
[Diagram: Emps( empno, ename, addr, deptno ) with callouts 1 and 2; Depts( deptno, dname ) with callout 3]
1. The primary key of Emps is empno. The values of empno are all unique and non-null. Every employee has a single empno, a single ename, and a single addr, but moreover …
2. Every employee has a single deptno. However, multiple employees could have the same deptno. There's a 1:M relationship between employees and whatever deptno represents.
3. deptno represents a department (but multiple empnos could have the same value for deptno), so there's a 1:M relationship between employees and departments.
[Diagram: Employee * works for Dept]

What's Wrong?
WRONG! Don't do this!
  Emps( empno, ename, addr )
  Depts( deptno, dname, empno )
    • empno references Emps
What's wrong with this representation?

Only One Employee per Dept!
  Emps( empno, ename, addr )
  Depts( deptno, dname, empno )
    • empno references Emps
The Depts relation has one tuple per department.
So, each department has a single empno value associated with it!
That means that a department can have at most a single employee!

Foreign Keys

Possible Referential Integrity
  Emps( empno, ename, addr, hiredate )
  Projs( pno, pname, budget, startdate )
  Emps.hiredate → Projs.startdate ?
What might this mean, based on referential integrity?
Is it legal?

Illegal Referential Integrity
  Emps( empno, ename, addr, hiredate )
  Projs( pno, pname, budget, startdate )
  Emps.hiredate → Projs.startdate (not allowed)
What might this mean?
If a value for hiredate is in Emps, then the same value for startdate MUST be in Projs: employees could only be hired on the day some project started!
Is it legal?
No. We only allow foreign key constraints, where the attributes referenced (e.g. startdate) MUST BE unique (preferably a primary key). Projs.startdate is almost certainly not unique.
Why only foreign key constraints?
Commercial databases only support foreign key constraints; some only allow references to primary keys.

Foreign Key Constraint
A foreign key constraint is a referential integrity constraint where the referenced column(s) must have unique values.
The standard relational model only uses foreign key constraints.
  Emps( empno, ename, addr, deptno )
    • deptno references Depts( deptno )
Here Emps.deptno is the foreign key, "deptno references Depts( deptno )" is the foreign key constraint, and the referenced column Depts.deptno has UNIQUE values.
The database continually checks that a foreign key value matches a value of the referenced attribute
– on insert/update of the foreign key, or
– on update/delete of the referenced attribute(s)
Changes that violate the constraint are disallowed (there are other ways of handling this as well …)

Foreign Primary Key Constraint
A foreign primary key constraint is a foreign key constraint where the referenced column(s) is a primary key.
  Emps( empno, ename, addr, deptno )
    • deptno references Depts
Here Emps.deptno is the foreign key and "deptno references Depts" is the foreign primary key constraint.
The referenced column does not need to be explicitly listed because it is the primary key of Depts.

SQL Representation
  CREATE TABLE Depts(
    deptno number(3) primary key,
    dname varchar(20) );

  CREATE TABLE Emps(
    empno number(4) primary key,
    ename varchar(30),
    addr varchar(80),
    deptno number(3) references Depts );

Relational Mapping of 1:M Relationships

Conceptual & Relational Models
A conceptual model
• Abstractly represents the database design
• Is based on entity classes & relationships
A relational model
• Concretely represents the details of the database design using relations (tables)
Relationships are not built in to the relational model. They are implemented using referential integrity constraints.
A relation is the mathematical term for a table. So the relational model is about how the tables are modeled in the actual database. The relational model is NOT about relationships.

Database Design Process
Requirements
  → Conceptual Design & Conceptual Normalization
Conceptual Model
  → Relational Mapping & Relational Normalization
Relational Model
  → Physical Mapping
Physical Model using DDL & DCL

Linking Instances
[Diagram: Employee (child entity class, identified by empno) works for Dept (parent entity class, identified by deptno)]
Employee instances: 7499 ALLEN 15 Pogo Lane; 7654 MARTIN …; 7844 TURNER …; 7212 LAVICH …; 7698 BLAKE 12 Rara Road; 7986 STERN
Dept instances: 30 SALES; 10 ACCOUNTING; …
Map to a relational model to implement the links between instances.

Table-Based Mapping
1. Make each entity class a table
2. Make each relationship a table (with foreign keys)
3. Combine tables with the same primary keys to make queries of the related tables more efficient (usually)

Column-Based Mapping
1. Make each entity class a table
2. Represent a 1:M relationship by adding a foreign key (no need to consider bridge tables)

Representing Links by Foreign Keys
[Diagram: Employee (child entity class, identified by empno) works for Dept (parent entity class, identified by deptno)]
At most a single dept is associated with an employee. Identify that department with an explicit reference to a tuple in the table for the parent entity class.

Emps
  empno  ename   addr  deptno
  7499   ALLEN   ...   30
  7654   MARTIN  …     30
  7844   TURNER  …     30
  7212   LAVICH  …
  7698   BLAKE   …     10
  7986   STERN   …     10

Depts
  deptno  dname
  30      SALES
  10      ACCOUNTING
  50      SUPPORT

Crow Magnum & Relational Schemas
Visual CONCEPTUAL Model (Crow Magnum)
  Employee( empno, ename, addr ): child entity class; does NOT include deptno
  Employee * works for Dept
  Dept( deptno, dname ): parent entity class
Visual RELATIONAL Model (Relational Schema)
  Emps( empno, ename, addr, deptno )
  Depts( deptno, dname )
  Emps.deptno → Depts.deptno (foreign key constraint)

Chen & Relational Schemas
Visual CONCEPTUAL Model (Chen)
  Employee works for Dept
Note the arrows point in the same direction!
Visual RELATIONAL Model (Relational Schema)
  Emps( empno, ename, addr, deptno )
  Depts( deptno, dname )

The Relational Schema Arrow Hint
[Diagram: Employee( empno, ename, addr ), the child entity class, works for Dept( deptno, dname ), the parent entity class]
In a 1:M relationship, the Crow's Foot symbol points to the parent entity class.
The reference arrow points in the same direction.
  Emps( empno, ename, addr, deptno )
  Depts( deptno, dname )

Relational Mapping Exercises

Reverse Engineering Exercise
Design corresponding ER models for each of these
Exercise 1:
  Teams( teamid, tnam )
  Players( playid, pnam, teamid, planid )
  HealthPlans( planid, plannam, plantyp )
Exercise 2:
  Divs( divid, divnam )
  Depts( deptno, dname, divid )
  Emps( empno, ename, sal, deptno, hirediv )

Interpreting Exercise 1
  Teams( teamid, tnam )
  Players( playid, pnam, teamid, planid )
  HealthPlans( planid, plannam, plantyp )
Each player (identified by playid) has
• a single pnam
• a single teamid (but many players could reference the same team)
• a single planid (but many players could reference the same plan)

Answer to Exercise 1
Visual RELATIONAL Model (Relational Schema)
  Teams( teamid, tnam )
  Players( playid, pnam, teamid, planid )
  HealthPlans( planid, plannam, plantyp )
I use singular for entity class names, plural for table names.
Visual CONCEPTUAL Model (Easy Crow Magnum)
  Team( teamid, tnam ) has Player( playid, pnam )
  Player enrolled in Health Plan( planid, plannam, plantyp )
Note direction of Crow Magnum indicators & reference arrows.

Answer to Exercise 2
Visual RELATIONAL Model (Relational Schema)
  Divs( divid, divnam )
  Depts( deptno, dname, divid )
  Emps( empno, ename, sal, deptno, hirediv )
I use singular for entity class names, plural for table names.
Visual CONCEPTUAL Model (Easy Crow Magnum)
  Dept( deptno, dname ) part of Division( divid, divnam )
  Employee( empno, ename, sal ) works for Dept
  Employee initially hired: Division
Note direction of Crow Magnum indicators & reference arrows.

Merging Foreign Key Lines
  Divs( divid, divnam )
  Depts( deptno, dname, divid )
  Emps( empno, ename, sal, deptno, hirediv )
When two foreign keys reference the same attribute, you can merge the lines. (Don't do this in ER diagrams though!)
Could two different Emps attributes both reference divid?

Mapping Exercise
Design corresponding relational schemas for each of these
Exercise 1:
  Category( catid, catnam ) has Style( stylecode, stylenam, styledate )
  Style has Item( itemsku, size, color )
Exercise 2:
  Team( teamid, tnam ) has Player( playid, pnam )
  Player has Child( childid, cname )
  Team gives scholarship to Child

Interpreting Mapping Exercise 1
  Category( catid, catnam ) has Style( stylecode, stylenam, styledate )
  Style has Item( itemsku, size, color )
Each style is associated with a single category.
Each item is associated with a single style.
  Styles( stylecode, stylenam, styledate, catid )
  Items( itemsku, size, color, stylecode )

Answer to Mapping Exercise 1
Visual CONCEPTUAL Model (Easy Crow Magnum)
  Category( catid, catnam ) has Style( stylecode, stylenam, styledate )
  Style has Item( itemsku, size, color )
Visual RELATIONAL Model (Relational Schema)
  Categories( catid, catnam )
  Styles( stylecode, stylenam, styledate, catid )
  Items( itemsku, size, color, stylecode )
Note direction of Crow Magnum indicators & reference arrows.
Arrowheads are REQUIRED!
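
The Categories / Styles / Items schema above can be tried out with a short sketch. This is an illustration under stated assumptions, not the slides' own code: it uses Python's built-in sqlite3 module rather than Oracle, the column types are guesses, and every row value (SHIRTS, ST-1, SKU-1, and so on) is hypothetical sample data. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled
conn.executescript("""
    CREATE TABLE Categories(catid INTEGER PRIMARY KEY, catnam TEXT);
    CREATE TABLE Styles(
        stylecode TEXT PRIMARY KEY, stylenam TEXT, styledate TEXT,
        catid INTEGER REFERENCES Categories(catid));
    CREATE TABLE Items(
        itemsku TEXT PRIMARY KEY, size TEXT, color TEXT,
        stylecode TEXT REFERENCES Styles(stylecode));
""")

# Hypothetical sample rows: each child row references an existing parent row.
conn.execute("INSERT INTO Categories VALUES (1, 'SHIRTS')")
conn.execute("INSERT INTO Styles VALUES ('ST-1', 'OXFORD', '2008-01-01', 1)")
conn.execute("INSERT INTO Items VALUES ('SKU-1', 'M', 'BLUE', 'ST-1')")
conn.execute("INSERT INTO Items VALUES ('SKU-2', 'L', 'RED', NULL)")  # NULL FK is OK

# An item whose stylecode matches no row in Styles violates the constraint.
try:
    conn.execute("INSERT INTO Items VALUES ('SKU-3', 'S', 'GREEN', 'ST-9')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

item_count = conn.execute("SELECT COUNT(*) FROM Items").fetchone()[0]
print(rejected, item_count)
```

The foreign key sits in the child table at each level (Styles.catid, Items.stylecode), which is the same direction as the reference arrows in the relational schema.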

Answer to Mapping Exercise 2
Visual CONCEPTUAL Model (Easy Crow Magnum ER Diagram)
  Team( teamid, tnam ) has Player( playid, pnam )
  Player has Child( childid, cname )
  Team gives scholarship to Child
Visual RELATIONAL Model (Relational Schema)
  Teams( teamid, tnam )
  Players( playid, pnam, teamid )
  Children( childid, cname, playid, schteamid )

Keeping the Bridge Table
[Diagram: Employee (child entity class, identified by empno) works for Dept (parent entity class, identified by deptno)]

Emps
  empno  ename   addr
  7499   ALLEN   …
  7654   MARTIN  …
  7844   TURNER  …
  7212   LAVICH  …
  7698   BLAKE   …
  7986   STERN   …

WorksFor
  empno  deptno
  7499   30
  7654   30
  7844   30
  7698   10
  7986   10

Depts
  deptno  dname
  30      SALES
  10      ACCOUNTING
  50      SUPPORT

Occasionally, relational designs keep bridge tables for 1:M relationships.
If so, what would be the corresponding relational schema?

Relational Schema with Bridge Table
(Emps, WorksFor and Depts data as above.)
  Emps( empno, ename, addr )
  WorksFor( empno, deptno )
  Depts( deptno, dname )

Mapping Reflexive Relationships

Mapping Reflexive 1:M Relationships
[Diagram: Employee manages Employee]

Emps
  empno  ename   mgr
  7499   ALLEN
  7654   MARTIN  7698
  7698   BLAKE   7839
  7839   KING
  7844   TURNER  7698
  7986   STERN   7839
  …              7698

Add mgr to Emps, referencing empno

Schema for Reflexive Relationships
Visual CONCEPTUAL Model (Crow Magnum ER Diagram)
  Employee( empno, ename, addr ): does NOT include deptno or mgr
  Employee * works for Dept( deptno, dname )
  Employee manages Employee
Visual RELATIONAL Model (Relational Schema)
  Emps( empno, ename, addr, deptno, mgr )
  Depts( deptno, dname )

Visual & Brief Textual Relational Models
VISUAL Relational Model (Relational Schema)
  Emps( empno, ename, addr, deptno, mgr )
  Depts( deptno, dname )
BRIEF TEXTUAL Relational Model (TRex)
  Depts( deptno, dname )
  Emps( empno, ename, addr, deptno, mgr )
    • deptno references Depts
    • mgr references Emps

SQL Representation
  CREATE TABLE Depts(
    deptno number(3) primary key,
    dname varchar(20) );

  CREATE TABLE Emps(
    empno number(4) primary key,
    ename varchar(30),
    addr varchar(80),
    deptno number(3) references Depts,
    mgr number(4) references Emps );

Conceptual and Relational Models

Database Design Process
Requirements
  → Conceptual Design & Conceptual Normalization
Conceptual Model
  → Relational Mapping & Relational Normalization
Relational Model
  → Physical Mapping
Physical Model using DDL & DCL
Why not just design the relational model directly from the requirements?
Why design an intermediate conceptual model?

Why Do Conceptual Design?
• Faster to create & draw a conceptual model
• Clearer semantic intent
• Easier to understand and reason about
• Many ways to design a relational model for a given conceptual model
• Focus on essential design decisions
• Allows secondary design decisions to be deferred

Combining the Conceptual & Relational Models
[Diagram: Emps( empno, ename, addr, deptno, mgr ) and Depts( deptno, dname ), with the relationships manages and works for drawn on the tables]
Some modeling tools and technologies combine the conceptual and the relational model.
We think it is better to keep the two distinct.

UML Combination of Conceptual & Relational Models
[Diagram: Employees: empno (PK), ename, addr, deptno (FK), mgr (FK); Depts: deptno (PK), dname; associations manages and works for, with multiplicities 0..1 and *]
Using UML to combine the conceptual and the relational model.
We've placed the association link adjacent to the foreign key (FK) which implements it. However, this is actually rarely done when FKs are shown in UML, which can make it hard to tell which FK represents which relationship.

Why Different Models?
Visual CONCEPTUAL Model (Crow Magnum ER Diagram)
  Employee works for Dept
  Employee manages Employee
Used for communication among system & UI architects and users, interested more in functionality than implementation.
Visual RELATIONAL Model (Relational Schema)
  Emps( empno, ename, addr, deptno, mgr )
  Depts( deptno, dname )
Used for communication among database designers and administrators, focused on implementation.

Defining and Changing Attributes

Identifying Rows in Tables
Numeric Id
  Uniquely identifies a row without providing information about the row's real identity or contents - e.g. 304792
Structured Id (e.g. SKU)
  Often structured to reflect the structure or practice of the organization - e.g. XM-304-T
  Can cause problems if the structure or practice of the organization changes
Short Name
  Name that uniquely identifies the row to the DBA - e.g. ACCT
Long Name / Title
  Name that clearly and uniquely identifies the row to the user - e.g. ACCOUNTING

Naming Columns
Use a consistent naming approach
• Column name includes (whole or partial) name of table
  • [in Emps] empno and [in Depts] deptno (or) [in Emps] empid and [in Depts] deptid
  • [in Emps] ename and [in Depts] dname
• Column name is independent of table name
  • [in Emps] id and [in Depts] id to identify the primary key
  • [in Emps] name and [in Depts] name to identify the employee & department name
  – Simplest naming model
  – Tends to require qualified names [Emps.id] in queries joining multiple tables
  – Possibility of spurious join columns when using natural joins

Naming Foreign Keys
• Use the name of the referenced key, with the table name included if necessary: e.g. [in Emps] deptno (or deptid)
• Where that name is ambiguous, replace or combine it with a name that reflects the role in the local table: e.g. [in Emps] use mgr or (better) mgrno to refer to the empno of the employee's manager

Altering Columns & Tables
  ALTER TABLE Emps ADD ( numkids int default 0 not null )
– Adds a column (Oracle expects DEFAULT before NOT NULL)
  ALTER TABLE Emps MODIFY ( sal NUMBER(13,2) DEFAULT 250 )
– Modifies a column's datatype and/or default
  ALTER TABLE Emps DROP (numkids, sal)
– Drops columns
  ALTER TABLE Emps RENAME TO Employees
– Renames the entire table (this form does not rename columns; Oracle 9i Release 2 and later provide a separate RENAME COLUMN clause)

Assigning Sequential & Unique Values to Attributes

Choosing a Primary Key
User-provided Id or Name
  Provided by the user when the tuple is created
System-provided Id
  Provided by an id generator defined in the database system. Useful as a surrogate key even if a name or a user-provided key is available.

System-Provided Id Generator
Id Values
– Sequential (1,2,3,…)
– Globally unique id (GUID)
  Every id generated is globally unique
  Useful if merging separate databases
Id Generation
– Explicit (used by Oracle)
  Handled through a named database object, which is explicitly incremented, read, and stored in some column of a new row
  Via PL/SQL, can be made to look automatic
– Automatic (used by SQL Server)
  Associated with a table column (a specified column in each newly created row gets a new id, sort of like Oracle ROWIDs)

Sequences in Oracle
Defining and using sequences
  CREATE SEQUENCE empseq START WITH 8000 INCREMENT BY 10
  SELECT empseq.nextval FROM dual
This is UGLY!
You'd really like to hide it.
Imagine a single operation
  NewEmp( ename, deptno, job, sal, comm )
that inserts a tuple into emp with
• empno filled in from empseq.nextval
• hiredate filled in with today's date
• ename, deptno, job, sal, comm filled in from the given values
An operation like this can be defined as a stored PL/SQL procedure.

Generating Globally Unique Ids
In Oracle, the function SYS_GUID() will return a globally unique id as a 16-byte RAW value.
If you are not using a system that can provide GUIDs, you can make one by concatenating together
– Unique MAC or IP address or name of machine
– Name of database
– Name of schema (i.e. user)
– System-provided per-schema id (e.g. from an Oracle sequence) or timestamp
For example: cs.bu.edu-testdb-scott-80420

Metadata and System Data

Metadata & System Data
Metadata is information about the data that users store in the database. For example,
– the names of a user's tables
– the names and types of table columns.
More generally, system data is data stored by the database to support its proper functioning.
Databases generally store metadata and other system data in the database itself.
– In relational databases, this information is stored in tables.

Basic Column Metadata
Note that the column metadata for the Emps table can itself be represented as a table.

Oracle Examples
  SELECT * FROM Dictionary
    WHERE table_name LIKE 'USER_TAB%'
    ORDER BY table_name;
– Dictionary holds info about all tables in the DB
– this selects info about user table metadata
  SELECT table_name FROM User_Tables;
– this lists the names of the user's tables
  SELECT table_name, column_name FROM User_Tab_Columns
    ORDER BY table_name, column_name;
– this lists the names of the columns of each of the user's tables
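
The idea that metadata is itself stored in tables can be explored outside Oracle as well. The sketch below is an assumption-laden analogue, not the slides' code: it uses Python's built-in sqlite3 module, where the catalog table `sqlite_master` and `PRAGMA table_info` play roughly the roles of Oracle's User_Tables and User_Tab_Columns. The column types are SQLite substitutes for the Oracle types used in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Depts(deptno INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE Emps(
        empno INTEGER PRIMARY KEY, ename TEXT, addr TEXT,
        deptno INTEGER REFERENCES Depts(deptno));
""")

# Roughly analogous to: SELECT table_name FROM User_Tables
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# Roughly analogous to: SELECT table_name, column_name FROM User_Tab_Columns
# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = sorted(
    (table, info[1])
    for table in table_names
    for info in conn.execute(f"PRAGMA table_info({table})")
)

print(table_names)
for table, column in columns:
    print(table, column)
```

As in the Oracle examples, the catalog is queried with ordinary SELECT statements, which is the point of storing metadata in tables.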