Relational Data Model
Ch. 7.1 – 7.3
John Ortiz
Lecture 3 Relational Data Model

Why Study the Relational Model?
Most widely used model.
  Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc.
“Legacy systems” in older models.
  E.g., IBM’s IMS (hierarchical model)
Recent competitor: object-oriented model.
  ObjectStore, Versant, Ontos, O2
A synthesis emerging: object-relational model.
  Informix UDS, UniSQL, Oracle, DB2

Anatomy of a Relation
Each relation is a table with a name.
An attribute is a column heading.
The heading is the schema of the relation:
  Students(SID, Name, Age, GPA)

Students
SID   Name        Age  GPA
1002  John Smith  20   3.2
1005  Mary Day    18   2.9
1020  Bill Lee    19   2.7

Labels in the figure: relation name (Students), attribute name (a column heading), tuple (a row), column.

Domain of an Attribute
The domain of an attribute A, denoted by Dom(A), is the set of values that the attribute can take.
A domain is usually represented by a type. E.g.,
  SID   char(4)
  Name  varchar(30)  --- character string of variable length up to 30
  Age   number       --- a number

Tuples
A tuple of a relation is a row in its table.
If t is a tuple of a relation R and A is an attribute of R, we use t[A] to represent the value of t under A in R.
Example: If t is the second tuple in Students, then
  t[Name] = ‘Mary Day’
  t[Age] = 18
  t[Name, Age] = (‘Mary Day’, 18)

Schema and Instance
A relation schema, denoted by R(A1, A2, …, An), consists of the relation name R and a list of attributes A1, …, An.
  R.A denotes attribute A of R.
  # of attributes = degree
A relation instance (state) of a relation schema R(A1, …, An), denoted by r(R), is the set of tuples in the table of R at some instant of time.
  # of tuples = cardinality

Schema & Instance Update
The schema of a relation may change (e.g., adding, deleting, or renaming attributes, or deleting a table schema), but this is infrequent.
The state of a relation may also change (e.g., inserting or deleting a tuple, or changing an attribute value in a tuple), and this is frequent.
A schema may have different states at different times.

Relational Database
A relational database schema is a set of relation schemas S = {R1, …, Rm}.
A relational database is a set of relations DB(S) = {r(R1), …, r(Rm)}.
A database state is a set of relation instances at some instant of time.
In addition, a relational database must satisfy a number of constraints (more to come later).

A University Database (see p. 204, Fig. 7.5)

Students
SID   Name      Major  GPA
1002  J. Smith  CS     3.2
1005  M. Day    Math   2.9
1020  B. Lee    EE     2.7

Courses
Cno      Name       Hour  Dept
CS374    Database   3     CS
CS455    Network    3     CS
CS100    Prog.Lang  4     CS
Math210  Calculus   3     Math

Sections
Cno      Sno  Semester  Prof
CS374    001  F2000     Zhang
CS455    002  S2000     Smith
Math210  001  F1999     Brown

Departments
Name  Room   Chair
CS    SB220  Hansen
EE    EB318  Johnson
Math  AB119  Miller

Constraints of Relational DB
Relations must satisfy the following constraints:
  Domain (1NF) Constraint.
  Access-by-Content Constraint.
  Key (Unique Tuple) Constraint.
  Entity Integrity Constraint.
  Referential Integrity Constraint.
Integrity constraints are enforced by the RDBMS.

Domain Constraint
Also known as the First Normal Form (1NF):
  Attributes can only take atomic values (i.e., set values are not allowed).
How to handle multivalued attributes?
  Use multiple tuples, one per value
  Use multiple columns, one per value
  Use separate tables
What problems do these solutions have?
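The three ways of handling a multivalued attribute listed above can be sketched in Python, using the Employees and Dependents data from this lecture's multi-valued attribute example. This is only an illustration: the function names and the list-of-dicts input format are assumptions made for the sketch, not part of the lecture.

```python
# Non-1NF input: Dependents holds a set of values per employee.
employees_nested = [
    {"EID": 1234, "Name": "Bob", "Age": 34, "Dependents": ["Allen", "Ann"]},
    {"EID": 1357, "Name": "Mary", "Age": 23, "Dependents": ["Kathy"]},
    {"EID": 2468, "Name": "Peter", "Age": 54, "Dependents": ["Mike", "Sue", "David"]},
]


def to_multiple_tuples(rows):
    """One tuple per dependent; EID, Name and Age are repeated in each."""
    return [(r["EID"], r["Name"], r["Age"], d) for r in rows for d in r["Dependents"]]


def to_multiple_columns(rows, max_deps):
    """One column per dependent slot; unused slots are padded with None (null)."""
    out = []
    for r in rows:
        deps = r["Dependents"]
        if len(deps) > max_deps:
            raise ValueError(f"EID {r['EID']} has more than {max_deps} dependents")
        padding = [None] * (max_deps - len(deps))
        out.append((r["EID"], r["Name"], r["Age"], *deps, *padding))
    return out


def to_separate_tables(rows):
    """Employees keeps the atomic attributes; Dependents(EID, Name) has one row per value."""
    employees = [(r["EID"], r["Name"], r["Age"]) for r in rows]
    dependents = [(r["EID"], d) for r in rows for d in r["Dependents"]]
    return employees, dependents


flat = to_multiple_tuples(employees_nested)
wide = to_multiple_columns(employees_nested, max_deps=3)
employees, dependents = to_separate_tables(employees_nested)

print(len(flat), "tuples; Bob's name is stored", sum(t[1] == "Bob" for t in flat), "times")
print(wide[1])  # Mary's row: unused dependent columns are null
print(len(employees), "employee rows,", len(dependents), "dependent rows")
```

Running the sketch hints at the trade-offs: the first option repeats Name and Age, the second needs a fixed maximum and produces nulls, and the third stores each fact once but needs both tables to be combined to list an employee with their dependents.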

Handle Multi-Valued Attributes

Multiple values:
Employees
EID   Name   Age  Dependents
1234  Bob    34   {Allen, Ann}
1357  Mary   23   {Kathy}
2468  Peter  54   {Mike, Sue, David}

Use multiple tuples:
Employees
EID   Name   Age  Dependents
1234  Bob    34   Allen
1234  Bob    34   Ann
1357  Mary   23   Kathy
2468  Peter  54   Mike
2468  Peter  54   Sue
2468  Peter  54   David

Use multiple columns:
Employees
EID   Name   Age  Dep1   Dep2  Dep3
1234  Bob    34   Allen  Ann
1357  Mary   23   Kathy
2468  Peter  54   Mike   Sue   David

Use separate relations:
Employees
EID   Name   Age
1234  Bob    34
1357  Mary   23
2468  Peter  54

Dependents
EID   Name
1234  Allen
1234  Ann
1357  Kathy
2468  Mike
2468  Sue
2468  David

Access-by-Content Constraint
A tuple is retrieved only by the values of its attributes, i.e., the order of tuples is not important.
This is because a relation is a set of tuples.
Although the order of tuples is insignificant for query formulation, it is significant for query evaluation.

Superkey
A superkey of a relation is a set of attributes whose values uniquely identify the tuples of the relation.
Every relation has at least one superkey (default is all attributes together?).
Any superset of a superkey is a superkey.
From a state of a relation, we may determine that a set of attributes is not a superkey, but we cannot determine that a set of attributes is a superkey.

Superkey Example
Find all superkeys of the Students relation.

Students
SID   Name      Major  GPA
1002  J. Smith  CS     3.2
1005  M. Day    Math   2.9
1020  B. Lee    EE     2.7

Given only this state of R, is A a superkey? What about {A, B}?

R
A   B   C   D
A1  B2  C1  D2
A2  B2  C3  D2
A2  B1  C2  D1
A3  B3  C4  D1

Candidate Key
A candidate key of a relation is a set of attributes of the relation such that it is a superkey, and none of its proper subsets is a superkey.
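The point that a state can refute a superkey but never confirm one can be made concrete with a small Python sketch over the state of R from the superkey example. The function names are hypothetical, and the result of the second function is only the list of attribute sets that this one state fails to rule out; whether any of them really is a candidate key depends on the intended meaning of R, which the state alone does not reveal.

```python
from itertools import combinations

SCHEMA = ("A", "B", "C", "D")
# The single state of R shown in the superkey example.
R_STATE = [
    ("A1", "B2", "C1", "D2"),
    ("A2", "B2", "C3", "D2"),
    ("A2", "B1", "C2", "D1"),
    ("A3", "B3", "C4", "D1"),
]


def refuted_as_superkey(attrs, state, schema=SCHEMA):
    """True if two tuples of the state agree on attrs, so attrs cannot be a superkey."""
    positions = [schema.index(a) for a in attrs]
    projected = [tuple(t[i] for i in positions) for t in state]
    return len(set(projected)) < len(projected)


def minimal_unrefuted_sets(state, schema=SCHEMA):
    """Attribute sets this state does not refute and that have no unrefuted proper subset."""
    found = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(schema, size):
            if refuted_as_superkey(combo, state, schema):
                continue
            if any(set(smaller) < set(combo) for smaller in found):
                continue  # a proper subset already survives, so combo is not minimal
            found.append(combo)
    return found


print(refuted_as_superkey(("A",), R_STATE))      # True: A2 appears in two tuples
print(refuted_as_superkey(("A", "B"), R_STATE))  # False: not refuted by this state
print(minimal_unrefuted_sets(R_STATE))
```

Inserting one more tuple into R could refute any of the surviving sets, which is why keys are declared from application semantics rather than discovered from data.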
Find all candidate keys in the Students relation.
Is it true that every relation has at least one candidate key? Why?
Can candidate keys be found from a state?
If AB is a candidate key of a relation, can A also be a candidate key? What is ABC called?

Primary Key
A primary key of a relation is a candidate key designated (with an underline) by a database designer.
It is often chosen at the time of schema design, and once specified to the DBMS, it cannot be changed.
It should preferably be the smallest candidate key, to improve both storage and query processing efficiency.
What should be the primary key of Students?

Key Constraint
Every relation must have a primary key.
Why is the key constraint needed?
  Every tuple has a different primary key value.
  Only the primary key values need to be checked to identify duplicates when new tuples are inserted (an index is often used).
  Primary key values can be referenced from within other relations.

Entity Integrity Constraint
A null value is a special value that is unknown, yet to be assigned, or inapplicable.
Entity Integrity Constraint: No primary key value can be null. Why?

Foreign Key
A foreign key in relation R1 referencing relation R2 is a set of attributes FK of R1 such that
  FK is compatible with a candidate (or primary) key PK of R2 (same number of attributes and compatible domains); and
  for every tuple t1 in R1, either there exists a tuple t2 in R2 such that t1[FK] = t2[PK], or t1[FK] = null.
Foreign keys need to be explicitly defined.

Foreign Key Example

Employees
EID   Name   Age  DName
1234  Bob    34   Sales
1357  Mary   23   Service
2468  Peter  54   null

Departments
Name     City     Manager
Sales    Houston  Bill
Payroll  Dallas   Steve
Service  Chicago  Tom

DName of Employees is a foreign key referencing Name of Departments.
A foreign key may reference its own relation.
  Employee(EID, Name, Age, Dept, ManagerID)

Referential Integrity Constraint
Referential Integrity Constraint: No relation can contain unmatched foreign key values.
Using foreign keys in a relation to reference primary keys of other relations is the only way in the relational data model to establish relationships among different relations.

Update Operations
Insert
  Can violate any of the 4 previous constraints – what were they again?
  1 solution: reject the insert
Delete
  Can only violate referential integrity – why?
  3 solutions: reject deletion, propagate deletion, modify referencing attributes
Modify
  Can violate any of the 4 previous constraints

Relational Model: Summary
A tabular representation of data.
Simple and intuitive, currently the most widely used.
Integrity constraints can be specified by the DBA, based on application semantics. The DBMS checks for violations.
  Two important ICs: primary and foreign keys.
  In addition, we always have domain constraints.
ICs are based upon the semantics of the real-world enterprise that is being described in the database relations.
We can check a database instance to see if an IC is violated, but we can never infer that an IC is true by looking at an instance.
Powerful and natural query languages exist.
Guidelines to translate ER to the relational model (next class…)

Look Ahead
Next topic: Relational Algebra
Read Textbook: Chapter 7.4 – 7.6
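As a closing illustration of how a DBMS enforces the key, entity integrity, and referential integrity constraints from this lecture, the following sketch loads the Employees and Departments example into an in-memory SQLite database through Python's standard sqlite3 module. SQLite is chosen here only because it ships with Python; the lecture does not name a particular system. The helper functions and the extra rows (Ann, Sue, Research, Austin, Kim) are made up for the demonstration. The three ON DELETE settings correspond to the three responses to a violating delete listed under Update Operations: reject, propagate, and modify the referencing attributes.

```python
import sqlite3


def make_db(on_delete):
    """Build the Employees/Departments example with a chosen ON DELETE action."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.execute(
        "CREATE TABLE Departments ("
        " Name TEXT NOT NULL PRIMARY KEY, City TEXT, Manager TEXT)"
    )
    conn.execute(
        "CREATE TABLE Employees ("
        " EID INTEGER NOT NULL PRIMARY KEY, Name TEXT, Age INTEGER,"
        f" DName TEXT REFERENCES Departments(Name) ON DELETE {on_delete})"
    )
    conn.executemany(
        "INSERT INTO Departments VALUES (?, ?, ?)",
        [("Sales", "Houston", "Bill"), ("Payroll", "Dallas", "Steve"),
         ("Service", "Chicago", "Tom")],
    )
    conn.executemany(
        "INSERT INTO Employees VALUES (?, ?, ?, ?)",
        [(1234, "Bob", 34, "Sales"), (1357, "Mary", 23, "Service"),
         (2468, "Peter", 54, None)],
    )
    return conn


def attempt(conn, sql):
    """Run one statement and report whether the DBMS accepted or rejected it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


def bob_row(conn):
    """Return Bob's (EID, DName), or None if his row no longer exists."""
    return conn.execute("SELECT EID, DName FROM Employees WHERE EID = 1234").fetchone()


db = make_db("NO ACTION")  # solution 1: reject the violating delete
results = {
    "unmatched foreign key": attempt(db, "INSERT INTO Employees VALUES (3000, 'Ann', 30, 'Research')"),
    "null foreign key": attempt(db, "INSERT INTO Employees VALUES (3001, 'Sue', 41, NULL)"),
    "duplicate primary key": attempt(db, "INSERT INTO Departments VALUES ('Sales', 'Austin', 'Kim')"),
    "null primary key": attempt(db, "INSERT INTO Departments VALUES (NULL, 'Austin', 'Kim')"),
    "delete referenced department": attempt(db, "DELETE FROM Departments WHERE Name = 'Sales'"),
}

cascade = make_db("CASCADE")  # solution 2: propagate the deletion
cascade.execute("DELETE FROM Departments WHERE Name = 'Sales'")
set_null = make_db("SET NULL")  # solution 3: modify the referencing attribute
set_null.execute("DELETE FROM Departments WHERE Name = 'Sales'")

for case, outcome in results.items():
    print(f"{case}: {outcome}")
print("after cascading delete:", bob_row(cascade))
print("after set-null delete:", bob_row(set_null))
```

Only the insert with a null foreign key is accepted in the first database, which matches the foreign key definition: a referencing value must either match an existing key value or be null.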