IT 20303 • Relational Database Theory

• The Relational Theory
  – Ways of working with data
    • Types of "Models"
      – File database model
      – Hierarchical database model
      – Network database model
      – Relational database model
  – Meaning of database model
    • The way data is organized & stored
    • The way data is manipulated

• Relational Model of Data
  – Published in 1970 by Dr. Edgar (Ted) Codd of IBM
    • "A Relational Model of Data for Large Shared Data Banks"
  – Purpose
    • Achieve program/data structure independence
    • Treat data in a disciplined way
      – Apply the rigor of mathematics
      – Uses Set Theory: sets of related data
    • Improve programmer productivity

• The Relational Model
  – Relational uses familiar concepts
    • The data is perceived as organized in tables
  – Relational also incorporates the rigor of mathematics
    • Rows of the table are treated as elements in a set
    • Manipulation of rows is based on set operations
      – (Venn Diagrams)
  – User works with a set of rows at a time

• Relational also impacts Data Design
  – Files were often constructed to support an application
  – Tables are designed to describe one thing or Entity in the database

• Example of a Relation:
  – ANIMAL: Entity (Relation)

    ANAME     AFAMILY    WEIGHT
    Candice   Camel      1800
    Zona      Zebra      900
    Sam       Snake      5
    Elmer     Elephant   5000
    Leonard   Lion       1200

• Definition of a Relation
  – Data is organized & stored in structures called relations
  – A relation is a table that adheres to certain rules
    • A relation can be called a table
  – A relation is a table containing all the data about some entity
    • An entity is a thing or object that is important in this application area
    • Data items in the table are related

• Relational Data Structure
  – The ANIMAL relation with its parts labeled

    Domains:      Name      Species    Weight
    Attributes:   ANAME     AFAMILY    WEIGHT
    Tuples:       Candice   Camel      1800
                  Zona      Zebra      900
                  Sam       Snake      5
                  Elmer     Elephant   5000
                  Leonard   Lion       1200
    Primary Key:  ANAME
    Relation:     the whole table

• Relational Data Structure Definitions
  – Relation
    • The Table
  – Tuple
    • A Row
  – Attribute
    • A Column
  – Primary Key
    • A unique identifier for each tuple in the table
  – Domain
    • A pool of legal values from which an attribute value is selected
      – Related to meaning
      – Has a Data Type
  – Degree
    • The number of attributes
  – Cardinality
    • The number of tuples

• Relational Table Rules
  – A Relation is a table that adheres to the following rules:
    • There are No Duplicate Tuples in the table
      – The tuples in the table are treated as a mathematical set
      – By definition, a set is a collection of unique elements
    • There must be a primary key (unique identifier) for each tuple
    • There is no order to the tuples (top to bottom)
    • There is no order to the attributes (left to right)
      – By convention, the primary key attribute is usually the first one on the left side of the table

• Attributes
  – Each attribute has a datatype
    • Examples: Integer, character, date, user-defined
  – The data value of an attribute can be null
  – Each attribute value is atomic
    • There is One & Only One data value in each cell of the table
    • There are no Lists or Arrays
    • One fact per field, one field per fact
  – Can be called a Field (MS Access)

• Relational Data Structure: Design
  – Each relation contains data about only one entity
    • Each row corresponds to one unique occurrence of the entity
  – A relation does not contain arrays, lists or repeating groups
    • No multi-valued attributes
  – Tables are designed according to Rules of Normalization
    • Each data item in the table is determined
      – By the Primary Key
      – By the Whole Primary Key
      – Only by the Primary Key
  – Normalization avoids well-known update problems
    • Optimizes design to minimize redundancy & storage requirements

• Example: Table with repeating group
  – Animal

    ANAME     AFAMILY    WEIGHT   FOOD
    Candice   Camel      1800     Hay, Buns
    Zona      Zebra      900      Brush
    Sam       Snake      5        Mice, People
    Elmer     Elephant   5000     Leaves
    Leonard   Lion       1200     People, Meat

• Example: Table with no repeating group
  – Animal

    ANAME     AFAMILY    WEIGHT
    Candice   Camel      1800
    Zona      Zebra      900
    Sam       Snake      5
    Elmer     Elephant   5000
    Leonard   Lion       1200

  – Animal-Food

    ANAME     FOOD
    Candice   Hay
    Candice   Buns
    Zona      Brush
    Sam       Mice
    Sam       People
    Elmer     Leaves
    Leonard   People
    Leonard   Meat

• A Database Models the Real World
  – A Database represents Reality
  – The database is a collection of relations
    • A relation represents an entity type
    • Each tuple represents one occurrence of that entity type
    • Each occurrence of an entity is unique
  – A database contains information about
    • Entities
    • Relationships between entities
    • Rules about the entities' data & the relationships

• Relational Databases Support Relationships
  – Relational databases support relationships between entities
    • Relationship is established by a Foreign Key
    • Repeat the Primary Key of one table in the related table(s)

• Example: The Zoo has an "Adopt-an-Animal" program
  – A zoo member can adopt an animal
  – Zoo-Member (ANAME is the Foreign Key; blank means no animal adopted)

    MID   MNAME          MADDR            ***   ANAME
    171   N. Harrison    1400 Blush Rd          Zona
    144   J. Montagano   1108 5th Ave           Leonard
    194   J. Spence      1244 Lark Ln           Candice
    303   E. Wingate     5222 Gains Dr          Candice
    101   H. Yarchun     177 Beach Rd
    270   K. Steeg       140 Crystal Dr         Zona
    291   S. Ackerman    1172 Park Dr           Sam
    301   K. Snyder      196 279th Ave

  – Animal

    ANAME     AFAMILY    WEIGHT
    Candice   Camel      1800
    Zona      Zebra      900
    Sam       Snake      5
    Elmer     Elephant   5000
    Leonard   Lion       1200

• Example: Another Relationship
  – Animal

    ANAME     AFAMILY    WEIGHT
    Candice   Camel      1800
    Zona      Zebra      900
    Sam       Snake      5
    Elmer     Elephant   5000
    Leonard   Lion       1200

  – Animal-Food (Composite Primary Key: ANAME + FOOD; Foreign Key: ANAME)

    ANAME     FOOD
    Candice   Hay
    Candice   Buns
    Zona      Brush
    Sam       Mice
    Sam       People
    Elmer     Leaves
    Leonard   People
    Leonard   Meat

• Relational Integrity Rules
  – Entity Integrity
    • No part of the Primary Key (PK) may be Null
  – Referential Integrity
    • The value of a Foreign Key (FK) must either
      – Be Null or
      – Be one of the values of the PK in the related table

• Keys, Keys, and More Keys
  – Characteristics of a Primary Key (PK)
    • Unique
    • Mandatory
    • Unchanging
    • Under the control of the IT organization
  – Names or Types of Keys
    • Candidate Key
      – A minimal set of attributes that can be used as the unique identifier for a table
    • Primary Key
      – One of the candidate keys
    • Alternate Key
      – A candidate key that is not the primary key
    • Foreign Key
      – A primary key of a related table
      – Indicates relationships
    • Composite Key
      – A key composed of more than one attribute
    • Search Key
      – One or more attributes on which a retrieval is based
        » Indexes

• Characteristics of Relationships
  – Referential integrity applies to the relationship between entities
    • Also known as an existence constraint or an enterprise rule
    • For every relationship, referential integrity must be defined

• Relationships have Cardinality
  – One-To-One
  – One-To-Many
  – Many-To-Many

• Relationships have Optionality
  – Each entity's participation is either
    • Mandatory or
    • Optional

• Cardinality reflects Business Rules
  – One-To-One Relationship
    • One animal is cared for by one zoo worker
    • One zoo worker cares for one animal
  – One-To-Many Relationship
    • One animal is cared for by many zoo workers
    • One zoo worker cares for only one animal
  – Many-To-Many Relationship
    • One animal is cared for by many zoo workers
    • One zoo worker cares for many animals

• Mandatory Relationship
  – The Foreign Key Cannot be Null
  – Every purchase order must have a supplier
  – In the example below the FK, SNO, cannot be Null

• Example:
  – PORDER

    ONO    SNO    ODATE      ***
    7001   1234   03/09/02
    7002   2079   03/10/02
    7003   2079   03/12/02

  – SUPPLIER

    SNO    SNAME             SADDR            ***
    1234   Farm & Feed       7000 Booth Rd
    2079   The Grain House   2001 Larkin Dr

• Example: FK can be Null
  – ZOO-MEMBER (ANID is the Foreign Key)

    MID   MNAME          MADDR            ***   ANID
    171   N. Harrison    1400 Blush Rd          0002
    144   J. Montagano   1108 5th Ave           0005
    194   J. Spence      1244 Lark Ln           0001
    303   E. Wingate     5222 Gains Dr          0001
    101   H. Yarchun     177 Beach Rd           (null)
    270   K. Steeg       140 Crystal Dr         0002
    291   S. Ackerman    1172 Park Dr           0003
    301   K. Snyder      196 279th Ave          (null)

  – ANIMAL

    ANID   ANAME     AFAMILY    WEIGHT
    0001   Candice   Camel      1800
    0002   Zona      Zebra      900
    0003   Sam       Snake      5
    0004   Elmer     Elephant   5000
    0005   Leonard   Lion       1200

• What happens when a Tuple is deleted?
  – For every relationship, there are three possible delete options
    • Cascades
      – Delete the target tuple and
      – Delete the related tuples
    • Restricted
      – Delete restricted to cases for which there are no related tuples
    • Nullifies
      – Delete the target tuple and
      – Set the FK to null in the related tuples

• Relational Algebra Operations
  – Select
  – Project
  – Join
  – Union
  – Intersect
  – Difference

• Our Zoo Database Tables
  – ANIMAL

    ANID   ANAME     AFAMILY    WEIGHT
    0001   Candice   Camel      1800
    0002   Zona      Zebra      900
    0003   Sam       Snake      5
    0004   Elmer     Elephant   5000
    0005   Leonard   Lion       1200

  – ANIMAL-FOOD

    ANID   FOOD
    0001   Hay
    0001   Buns
    0002   Brush
    0003   Mice
    0003   People
    0004   Leaves
    0005   People
    0005   Meat

  – ZOO-MEMBER

    MID   MNAME          MADDR            ***   ANID
    171   N. Harrison    1400 Blush Rd          0002
    144   J. Montagano   1108 5th Ave           0005
    194   J. Spence      1244 Lark Ln           0001
    303   E. Wingate     5222 Gains Dr          0001
    101   H. Yarchun     177 Beach Rd           (null)
    270   K. Steeg       140 Crystal Dr         0002
    291   S. Ackerman    1172 Park Dr           0003
    301   K. Snyder      196 279th Ave          (null)

• Relational Algebra: SELECT
  – Extracts specified tuples from a relation (or gets rows from a table)

• Example: SELECT from the ANIMAL-FOOD table (display) the rows where FOOD=PEOPLE
  – RESULTS

    ANID   FOOD
    0003   People
    0005   People

• Relational Algebra: PROJECT
  – Extracts specified attributes (columns) from a relation (or gets columns from a table)

• Example: PROJECT from the ZOO-MEMBER table the columns (MID, MNAME)
  – RESULTS

    MID   MNAME
    171   N. Harrison
    144   J. Montagano
    194   J. Spence
    303   E. Wingate
    101   H. Yarchun
    270   K. Steeg
    291   S. Ackerman
    301   K. Snyder

• Relational Algebra: JOIN
  – Join the data in two tables
    • Concatenate one row from Table 1 with one row from Table 2
      – Usually based on a common column; the comparison on that column is called the join condition

• Example: JOIN T1 and T2 based on the AFAMILY column
  – T1

    ANID   AFAMILY
    0001   Camel
    0002   Zebra

  – T2

    AFAMILY   AREA
    Camel     01
    Zebra     03

  – RESULT

    ANID   AFAMILY   AFAMILY   AREA
    0001   Camel     Camel     01
    0002   Zebra     Zebra     03

• Different types of Joins
  – Equijoin: a row in T1 is joined with a row in T2 where the values in the common column(s) are equal
  – This is the most common type of join
  – Example, using the same T1 and T2: Join T1 and T2 where T1.AFAMILY=T2.AFAMILY
  – RESULT

    ANID   AFAMILY   AFAMILY   AREA
    0001   Camel     Camel     01
    0002   Zebra     Zebra     03

• Natural Join
  – The rows of T1 are joined with the rows of T2 where the PK value in one table equals the FK value in the other table
    • The join columns are those whose names are the same in both tables
    • Each common column appears only once in the result
    • Don't use this in a Production Database: renaming a column causes problems
  – Example, using the same T1 and T2: T1 NATURAL JOIN T2
  – RESULT

    ANID   AFAMILY   AREA
    0001   Camel     01
    0002   Zebra     03

• Inner Join
  – The rows of T1 are joined with the rows of T2 based on the join condition specified
    • Only rows from T1 with a matching row in T2 are in the result
    • Often an Inner Join is both a Natural Join & an Equijoin

• Example: Inner Join
  – Using the same T1 and T2: T1 INNER JOIN T2 on T1.AFAMILY=T2.AFAMILY
  – RESULT

    ANID   AFAMILY   AFAMILY   AREA
    0001   Camel     Camel     01
    0002   Zebra     Zebra     03

• Outer Join
  – The rows of T1 are joined with the rows of T2
    • All rows from one of the tables are included in the result even if there is no matching row in the other table

• Example: Outer Join
  – T1 RIGHT OUTER JOIN T2 on T1.AFAMILY=T2.AFAMILY
  – T1

    ANID   AFAMILY
    0001   Camel
    0002   Zebra

  – T2

    AFAMILY   AREA
    Camel     01
    Zebra     03
    Snake     05

  – RESULT

    ANID     AFAMILY   AFAMILY   AREA
    0001     Camel     Camel     01
    0002     Zebra     Zebra     03
    (null)   (null)    Snake     05

• Cross Join
  – Every row in T1 is joined with every row in T2
    • All possible combinations of rows in the two tables
    • Also called a Cartesian Product

• Example: Cross Join
  – T1 CROSS JOIN T2
  – T1

    ANID   AFAMILY
    0001   Camel
    0002   Zebra

  – T2

    AFAMILY   AREA
    Camel     01
    Zebra     03

  – RESULT

    ANID   AFAMILY   AFAMILY   AREA
    0001   Camel     Camel     01
    0001   Camel     Zebra     03
    0002   Zebra     Camel     01
    0002   Zebra     Zebra     03

• An RDBMS manipulates Data using Relational Algebra Operations
  – There are (usually) several sequences of operations that answer a query
    • One sequence may be more efficient than another
  – A relational DBMS internally has routines that do the relational algebra
  – A relational DBMS generates a sequence or plan of relational algebra operations to accomplish the request
  – A relational DBMS has a query optimizer to develop an efficient query plan
    • A least-cost optimizer generates several execution plans and chooses the least-cost one, i.e., the one with the least amount of I/O

• Union, Intersection, and Minus
  – Union: union together (append) the result tables from two queries
  – Intersect: take only the rows that are identical in the result tables from two queries
  – Difference: take only the rows in the first result table that have no identical rows in the second result table

• Relational Algebra: UNION
  – Union together the results of two queries
    • Result contains every element in either one or both sets
  – Query 1
    • Select the rows from ANIMAL where WEIGHT > 2000 into T1
    • Project from T1(ANID) into Result 1
  – Query 2
    • Select the rows from ANIMAL-FOOD where FOOD=PEOPLE into T2
    • Project from T2(ANID) into Result 2
  – Query 1 UNION Query 2

    RESULT 1         RESULT 2         RESULT
    ANID             ANID             ANID
    0004     UNION   0003             0003
                     0005             0004
                                      0005

• Relational Algebra: INTERSECTION
  – Take only the rows (tuples) that are identical in the result tables of two queries
  – Query 1
    • Select the rows from ANIMAL where WEIGHT > 1000 into T1
    • Project from T1(ANID) into Result 1
  – Query 2
    • Project from ZOO-MEMBER(ANID) into Result 2
  – Query 1 INTERSECT Query 2

    RESULT 1             RESULT 2         RESULT
    ANID                 ANID             ANID
    0001     INTERSECT   0002             0005
    0004                 0005             0001
    0005                 0001
                         0003

• Relational Algebra: Minus/Difference/Except
  – Subtract the results of the second query from the results of the first query
  – Query 1
    • Project from ANIMAL(ANID) into Result 1
  – Query 2
    • Project from ZOO-MEMBER(ANID) into Result 2
  – Query 1 EXCEPT Query 2

    RESULT 1          RESULT 2         RESULT
    ANID              ANID             ANID
    0001     EXCEPT   0002             0004
    0002              0005
    0003              0001
    0004              0003
    0005

• Strengths of the Relational Approach
  – Simple
    • People are familiar with tables
    • Few rules
    • Few operations
  – Easy to learn
    • Relational algebra is straightforward
    • Multiple high-level, non-procedural languages are available
      – SQL
  – Well founded
    • Basis is mathematics, set theory
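
• Sketch: the relational algebra examples in Python
  – The slides treat rows as elements of a set, so the operations can be sketched with Python's built-in set type. This is only an illustration, not how an RDBMS implements them. The (header, rows) representation, the helper names (select, project, equijoin, union, intersect, difference) and the use of None for a null foreign key are all assumptions of the sketch; MADDR is left out of ZOO-MEMBER to keep it short.

```python
# A relation is modeled as (header, rows): attribute names plus a set of row tuples.
ANIMAL = (
    ("ANID", "ANAME", "AFAMILY", "WEIGHT"),
    {
        ("0001", "Candice", "Camel", 1800),
        ("0002", "Zona", "Zebra", 900),
        ("0003", "Sam", "Snake", 5),
        ("0004", "Elmer", "Elephant", 5000),
        ("0005", "Leonard", "Lion", 1200),
    },
)
ANIMAL_FOOD = (
    ("ANID", "FOOD"),
    {
        ("0001", "Hay"), ("0001", "Buns"), ("0002", "Brush"), ("0003", "Mice"),
        ("0003", "People"), ("0004", "Leaves"), ("0005", "People"), ("0005", "Meat"),
    },
)
# MADDR is omitted for brevity; None stands in for a null foreign key.
ZOO_MEMBER = (
    ("MID", "MNAME", "ANID"),
    {
        ("171", "N. Harrison", "0002"), ("144", "J. Montagano", "0005"),
        ("194", "J. Spence", "0001"), ("303", "E. Wingate", "0001"),
        ("101", "H. Yarchun", None), ("270", "K. Steeg", "0002"),
        ("291", "S. Ackerman", "0003"), ("301", "K. Snyder", None),
    },
)


def select(rel, predicate):
    """SELECT: keep the rows for which predicate(row_as_dict) is true."""
    header, rows = rel
    return header, {r for r in rows if predicate(dict(zip(header, r)))}


def project(rel, attrs):
    """PROJECT: keep only the named columns; the set drops duplicate rows."""
    header, rows = rel
    idx = [header.index(a) for a in attrs]
    return tuple(attrs), {tuple(r[i] for i in idx) for r in rows}


def equijoin(left, right, attr):
    """Equijoin: pair rows that agree on attr; nulls (None) never match."""
    (lh, lrows), (rh, rrows) = left, right
    li, ri = lh.index(attr), rh.index(attr)
    rows = {l + r for l in lrows for r in rrows
            if l[li] is not None and l[li] == r[ri]}
    return lh + rh, rows


def _check_same_header(a, b):
    assert a[0] == b[0], "set operations need relations with the same attributes"


def union(a, b):
    _check_same_header(a, b)
    return a[0], a[1] | b[1]


def intersect(a, b):
    _check_same_header(a, b)
    return a[0], a[1] & b[1]


def difference(a, b):
    _check_same_header(a, b)
    return a[0], a[1] - b[1]


# The three set-operation queries from the slides.
heavy = project(select(ANIMAL, lambda r: r["WEIGHT"] > 2000), ["ANID"])
eats_people = project(select(ANIMAL_FOOD, lambda r: r["FOOD"] == "People"), ["ANID"])
over_1000 = project(select(ANIMAL, lambda r: r["WEIGHT"] > 1000), ["ANID"])
adopted = project(ZOO_MEMBER, ["ANID"])
all_animals = project(ANIMAL, ["ANID"])

union_result = union(heavy, eats_people)
intersect_result = intersect(over_1000, adopted)
except_result = difference(all_animals, adopted)
members_with_animals = equijoin(ZOO_MEMBER, ANIMAL, "ANID")

print(sorted(union_result[1]))      # [('0003',), ('0004',), ('0005',)]
print(sorted(intersect_result[1]))  # [('0001',), ('0005',)]
print(sorted(except_result[1]))     # [('0004',)]
```

  – Because rows live in a set, "no duplicate tuples" and "no order to the tuples" hold automatically, which is why the printed results are sorted only for display.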