IT 20303
• Relational Database Theory
Relational Database Theory
• The Relational Theory
– Ways of working with data
• Types of “Models”
–File database model
–Hierarchical database model
–Network database model
–Relational database model
Relational Database Theory
• The Relational Theory
– Meaning of database model
• The way data is organized &
stored
• The way data is manipulated
Relational Database Theory
• Relational Model of Data
– Published in 1970 by Dr. Edgar
(Ted) Codd – IBM
• “A Relational Model of Data for
Large Shared Data Banks”
Relational Database Theory
• Relational Model of Data
– Purpose
• Achieve program/data structure
independence
• Treat data in a disciplined way
–Apply rigor of mathematics
–Uses Set Theory – sets of related
data
• Improve programmer productivity
Relational Database Theory
• The Relational Model
– Relational uses familiar concepts
• The data is perceived as organized in tables
– Relational also incorporates the rigor of
mathematics
• Rows of the table are treated as elements in
a set
• Manipulation of rows is based on set
operations (Venn diagrams)
– User works with a set of rows at a time
Relational Database Theory
• Relational also impacts Data Design
– Files were often constructed to
support an application
– Tables are designed to describe one
thing or Entity in the database
Relational Database Theory
• Example of a Relation:
– ANIMAL – Entity (Relation)
ANAME     AFAMILY    WEIGHT
Candice   Camel      1800
Zona      Zebra      900
Sam       Snake      5
Elmer     Elephant   5000
Leonard   Lion       1200
Relational Database Theory
• Definition of a Relation
– Data is organized & stored in
structures called relations
– A relation is a table that adheres to
certain rules
• A relation can be called a table
Relational Database Theory
• Definition of a Relation
– A relation is a table containing all the
data about some entity
• An entity is a thing or object that is
important in this application area
• Data items in the table are related
Relational Database Theory
• Relational Data Structure
Relation: ANIMAL
Domains:      Name      Species    Weight
Attributes:   ANAME     AFAMILY    WEIGHT
              (Primary Key)
Tuples:       Candice   Camel      1800
              Zona      Zebra      900
              Sam       Snake      5
              Elmer     Elephant   5000
              Leonard   Lion       1200
Relational Database Theory
• Relational Data Structure Definitions
– Relation
• The Table
– Tuple
• A Row
– Attribute
• A Column
Relational Database Theory
• Relational Data Structure Definitions
– Primary Key
• A unique identifier for the table
– Domain
• A pool of legal values from which an
attribute value is selected
–Related to meaning
–Has a Data Type
Relational Database Theory
• Relational Data Structure Definitions
– Degree
• The number of attributes
– Cardinality
• The number of tuples
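As a rough illustration of these definitions, the ANIMAL relation from the earlier example can be sketched in Python as a set of tuples. The names `Animal`, `degree`, and `cardinality` are illustrative choices, not part of the slides.

```python
from typing import NamedTuple


class Animal(NamedTuple):
    """One tuple (row) of the ANIMAL relation; each field is an attribute."""
    aname: str      # primary key
    afamily: str
    weight: int


# A relation is modeled here as a set of tuples.
ANIMAL = {
    Animal("Candice", "Camel", 1800),
    Animal("Zona", "Zebra", 900),
    Animal("Sam", "Snake", 5),
    Animal("Elmer", "Elephant", 5000),
    Animal("Leonard", "Lion", 1200),
}

degree = len(Animal._fields)   # number of attributes (columns)
cardinality = len(ANIMAL)      # number of tuples (rows)

print(degree, cardinality)
```

For this relation the degree is 3 and the cardinality is 5.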
Relational Database Theory
• Relational Table Rules
– A Relation is a table that adheres to
the following rules:
• There are No Duplicate Tuples in
the table
–The tuples in the table are
treated as a mathematical set
–By definition, a set is a
collection of unique elements
• There must be a primary key
(unique identifier) for each tuple
• There is no order to the tuples
(top to bottom)
• There is no order to the attributes
(left to right)
–By convention, the primary key
attribute is usually the first one
on the left side of the table
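The table rules above follow from set semantics, which a short Python sketch can make concrete. It is only an illustration: the helper `key_is_unique` is a hypothetical name, and a real DBMS enforces these rules internally.

```python
rows = [
    ("Candice", "Camel", 1800),
    ("Zona", "Zebra", 900),
    ("Candice", "Camel", 1800),  # the same tuple entered twice
]

# Treating the rows as a mathematical set removes the duplicate tuple.
relation = set(rows)
print(len(relation))

# Tuple order is irrelevant: the same rows in reverse order form the same set.
print(relation == set(reversed(rows)))

# Attribute order is irrelevant when attributes are addressed by name.
left_to_right = {"ANAME": "Zona", "AFAMILY": "Zebra", "WEIGHT": 900}
right_to_left = {"WEIGHT": 900, "AFAMILY": "Zebra", "ANAME": "Zona"}
print(left_to_right == right_to_left)


def key_is_unique(relation, key_index=0):
    """Return True if no two tuples share the value at key_index."""
    keys = [row[key_index] for row in relation]
    return len(keys) == len(set(keys))


print(key_is_unique(relation))
```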
Relational Database Theory
• Attributes
– Each attribute has a datatype
• Examples: Integer, character,
date, user-defined
– The data value of an attribute can be
null
Relational Database Theory
• Attributes
– Each attribute value is atomic
• There is One & Only One data
value in each cell of the table
• There are no Lists or Arrays
• One fact per field, one field per
fact
– Can be called a Field (MS Access)
Relational Database Theory
• Relational Data Structure: Design
– Each relation contains data about
only one entity
• Each row corresponds to one
unique occurrence of the entity
– A relation does not contain arrays,
lists or repeating groups
• No multi-valued attributes
– Tables are designed according to
Rules of Normalization
• Each data item in the table is
determined
–By the Primary Key
–By the Whole Primary Key
–Only by the Primary Key
– Normalization avoids well-known
update problems
• Optimizes design to minimize
redundancy & storage
requirements
Relational Database Theory
• Example: Table with repeating group
– Animal
ANAME     AFAMILY    WEIGHT   FOOD
Candice   Camel      1800     Hay, Buns
Zona      Zebra      900      Brush
Sam       Snake      5        Mice, People
Elmer     Elephant   5000     Leaves
Leonard   Lion       1200     People, Meat
Relational Database Theory
• Example: Table with no repeating group
Animal
ANAME     AFAMILY    WEIGHT
Candice   Camel      1800
Zona      Zebra      900
Sam       Snake      5
Elmer     Elephant   5000
Leonard   Lion       1200
Animal-Food
ANAME     FOOD
Candice   Hay
Candice   Buns
Zona      Brush
Sam       Mice
Sam       People
Elmer     Leaves
Leonard   People
Leonard   Meat
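Removing the repeating group can be sketched in Python as splitting each unnormalized row into one Animal tuple and one Animal-Food tuple per food. This is a minimal sketch; the variable names are illustrative, and the list-valued FOOD field stands in for the repeating group.

```python
# Unnormalized rows: the last field holds a list (a repeating group).
unnormalized = [
    ("Candice", "Camel", 1800, ["Hay", "Buns"]),
    ("Zona", "Zebra", 900, ["Brush"]),
    ("Sam", "Snake", 5, ["Mice", "People"]),
    ("Elmer", "Elephant", 5000, ["Leaves"]),
    ("Leonard", "Lion", 1200, ["People", "Meat"]),
]

# Animal keeps the single-valued attributes, one tuple per animal.
animal = {(aname, afamily, weight) for aname, afamily, weight, _ in unnormalized}

# Animal-Food holds one (ANAME, FOOD) tuple per food item, so every cell is atomic.
animal_food = {
    (aname, food)
    for aname, _, _, foods in unnormalized
    for food in foods
}

print(len(animal), len(animal_food))
```

The result has 5 Animal tuples and 8 Animal-Food tuples, matching the two tables above.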
Relational Database Theory
• A Database Models the Real World
– A Database represents Reality
– The database is a collection of relations
• A relation represents an entity type
• Each tuple represents one occurrence
of that entity type
• Each occurrence of an entity is unique
Relational Database Theory
• A Database Models the Real World
– A database contains information
about
• Entities
• Relationships between entities
• Rules about the entities’ data &
the relationships
Relational Database Theory
• Relational Databases Support
Relationships
– Relational databases support
relationships between entities
• Relationship is established by a
Foreign Key
• Repeat the Primary Key of one
table in the related table(s)
Relational Database Theory
• Example: The Zoo has an “Adopt-an-Animal” program
– A zoo member can adopt an animal
Zoo-Member (Foreign Key: ANAME)
MID   MNAME          MADDR            ***   ANAME
171   N. Harrison    1400 Blush Rd          Zona
144   J. Montagano   1108 5th Ave           Leonard
194   J. Spence      1244 Lark Ln           Candice
303   E. Wingate     5222 Gains Dr          Candice
101   H. Yarchun     177 Beach Rd           (null)
270   K. Steeg       140 Crystal Dr         Zona
291   S. Ackerman    1172 Park Dr           Sam
301   K. Snyder      196 279th Ave          (null)
Animal
ANAME     AFAMILY    WEIGHT
Candice   Camel      1800
Zona      Zebra      900
Sam       Snake      5
Elmer     Elephant   5000
Leonard   Lion       1200
Relational Database Theory
• Example: Another Relationship
Animal
ANAME     AFAMILY    WEIGHT
Candice   Camel      1800
Zona      Zebra      900
Sam       Snake      5
Elmer     Elephant   5000
Leonard   Lion       1200
Animal-Food (Composite Primary Key: ANAME + FOOD; Foreign Key: ANAME)
ANAME     FOOD
Candice   Hay
Candice   Buns
Zona      Brush
Sam       Mice
Sam       People
Elmer     Leaves
Leonard   People
Leonard   Meat
Relational Database Theory
• Relational Integrity Rules
– Entity Integrity
• No part of the Primary Key (PK) may
be Null
– Referential Integrity
• The value of a Foreign Key (FK) must
either
–Be Null or
–Be one of the values of the PK in
the related table
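The two integrity rules can be expressed as simple checks. The sketch below is an illustration only: the function names are hypothetical, `None` stands in for Null, and the member "Gerald" row is invented purely to show a violation.

```python
ANIMAL = [
    {"ANAME": "Candice"}, {"ANAME": "Zona"}, {"ANAME": "Sam"},
    {"ANAME": "Elmer"}, {"ANAME": "Leonard"},
]
ZOO_MEMBER = [
    {"MID": 171, "ANAME": "Zona"},
    {"MID": 101, "ANAME": None},      # FK may be null: no animal adopted
    {"MID": 291, "ANAME": "Sam"},
]


def entity_integrity_ok(rows, pk_columns):
    """No part of the primary key may be null (None)."""
    return all(row[col] is not None for row in rows for col in pk_columns)


def referential_integrity_ok(child_rows, fk, parent_rows, pk):
    """Each FK value is either null or one of the PK values in the parent."""
    parent_keys = {row[pk] for row in parent_rows}
    return all(row[fk] is None or row[fk] in parent_keys for row in child_rows)


print(entity_integrity_ok(ANIMAL, ["ANAME"]))
print(referential_integrity_ok(ZOO_MEMBER, "ANAME", ANIMAL, "ANAME"))

# Hypothetical bad row: "Gerald" is not an ANAME in ANIMAL.
broken = ZOO_MEMBER + [{"MID": 999, "ANAME": "Gerald"}]
print(referential_integrity_ok(broken, "ANAME", ANIMAL, "ANAME"))
```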
Relational Database Theory
• Keys, Keys, and More Keys
– Characteristic of a Primary Key (PK)
• Unique
• Mandatory
• Unchanging
• Under the control of IT
organization
Relational Database Theory
• Keys, Keys, and More Keys
– Names or Types of Keys
• Candidate Key
–A minimal set of attributes that
can be used as the unique
identifier for a table
Relational Database Theory
• Keys, Keys, and More Keys
– Names or Types of Keys
• Primary Key
–One of the candidate keys
• Alternate Key
–A candidate key that is not the
primary key
Relational Database Theory
• Keys, Keys, and More Keys
– Names or Types of Keys
• Foreign Key
–An attribute (or set of attributes) that repeats the primary key of a related table
–Indicates relationships
Relational Database Theory
• Keys, Keys, and More Keys
– Names or Types of Keys
• Composite Key
–A key composed of more than one
attribute
• Search Key
–One or more attributes on which a
retrieval is based
»Indexes
Relational Database Theory
• Characteristics of Relationships
– Referential integrity applies to the
relationship between entities
• Also known as an existence
constraint or an enterprise rule
• For every relationship, referential
integrity must be defined
Relational Database Theory
• Relationships have Cardinality
– One-To-One
– One-To-Many
– Many-To-Many
• Relationships have Optionality
– Each entity’s participation is either
• Mandatory or
• Optional
Relational Database Theory
• Cardinality reflects Business Rules
– One-To-One Relationship
• One animal is cared for by one
zoo worker
• One zoo worker cares for one
animal
Relational Database Theory
• Cardinality reflects Business Rules
– One-To-Many Relationship
• One animal is cared for by many
zoo workers
• One zoo worker cares for only one
animal
Relational Database Theory
• Cardinality reflects Business Rules
– Many-To-Many Relationship
• One animal is cared for by many
zoo workers
• One zoo worker cares for many
animals
Relational Database Theory
• Mandatory Relationship
– The Foreign Key Cannot be Null
– Every purchase order must have a
supplier
– In the example below the FK, SNO,
cannot be Null
Relational Database Theory
• Example:
PORDER
ONO    SNO    ODATE      ***
7001   1234   03/09/02
7002   2079   03/10/02
7003   2079   03/12/02
SUPPLIER
SNO    SNAME             SADDR            ***
1234   Farm & Feed       7000 Booth Rd
2079   The Grain House   2001 Larkin Dr
Relational Database Theory
• Example: FK can be Null
ANIMAL
ANID   ANAME     AFAMILY    WEIGHT
0001   Candice   Camel      1800
0002   Zona      Zebra      900
0003   Sam       Snake      5
0004   Elmer     Elephant   5000
0005   Leonard   Lion       1200
ZOO-MEMBER (Foreign Key: ANID)
MID   MNAME          MADDR            ***   ANID
171   N. Harrison    1400 Blush Rd          0002
144   J. Montagano   1108 5th Ave           0005
194   J. Spence      1244 Lark Ln           0001
303   E. Wingate     5222 Gains Dr          0001
101   H. Yarchun     177 Beach Rd           (null)
270   K. Steeg       140 Crystal Dr         0002
291   S. Ackerman    1172 Park Dr           0003
301   K. Snyder      196 279th Ave          (null)
Relational Database Theory
• What happens when a Tuple is
deleted?
– For every relationship, there are
three possible delete options
• Cascades
–Delete the target tuple and
–Delete the related tuples
• Restricted
–Delete restricted to cases for
which there are no related
tuples
• Nullifies
–Delete the target tuple and
–Set the FK to null in the related
tuples
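The three delete options correspond to the SQL clauses `ON DELETE CASCADE`, `ON DELETE RESTRICT`, and `ON DELETE SET NULL`. The sketch below tries each one with Python's built-in `sqlite3` module on a cut-down version of the zoo tables. The function name `delete_animal` and the reduced column set are assumptions made for the example.

```python
import sqlite3


def delete_animal(on_delete):
    """Delete animal 0001 under the given rule and report what is left."""
    con = sqlite3.connect(":memory:")
    try:
        # SQLite only enforces foreign keys when this pragma is switched on.
        con.execute("PRAGMA foreign_keys = ON")
        con.execute("CREATE TABLE animal (anid TEXT PRIMARY KEY, aname TEXT)")
        # The rule is pasted into the DDL; acceptable for a demo with fixed input.
        con.execute(
            "CREATE TABLE zoo_member (mid INTEGER PRIMARY KEY, "
            f"anid TEXT REFERENCES animal(anid) ON DELETE {on_delete})"
        )
        con.execute("INSERT INTO animal VALUES ('0001', 'Candice')")
        con.executemany(
            "INSERT INTO zoo_member VALUES (?, ?)",
            [(194, "0001"), (303, "0001")],
        )
        try:
            con.execute("DELETE FROM animal WHERE anid = '0001'")
        except sqlite3.IntegrityError:
            return "delete rejected"
        return con.execute(
            "SELECT mid, anid FROM zoo_member ORDER BY mid"
        ).fetchall()
    finally:
        con.close()


for rule in ("CASCADE", "RESTRICT", "SET NULL"):
    print(rule, delete_animal(rule))
```

With CASCADE the two related member rows disappear, with RESTRICT the delete is rejected because related rows exist, and with SET NULL the member rows remain with a null ANID.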
Relational Database Theory
• Relational Algebra Operations
– Select
– Project
– Join
– Union
– Intersect
– Difference
Relational Database Theory
• Our Zoo Database Tables
ANIMAL
ANID   ANAME     AFAMILY    WEIGHT
0001   Candice   Camel      1800
0002   Zona      Zebra      900
0003   Sam       Snake      5
0004   Elmer     Elephant   5000
0005   Leonard   Lion       1200
ANIMAL-FOOD
ANID   FOOD
0001   Hay
0001   Buns
0002   Brush
0003   Mice
0003   People
0004   Leaves
0005   People
0005   Meat
ZOO-MEMBER
MID   MNAME          MADDR            ***   ANID
171   N. Harrison    1400 Blush Rd          0002
144   J. Montagano   1108 5th Ave           0005
194   J. Spence      1244 Lark Ln           0001
303   E. Wingate     5222 Gains Dr          0001
101   H. Yarchun     177 Beach Rd           (null)
270   K. Steeg       140 Crystal Dr         0002
291   S. Ackerman    1172 Park Dr           0003
301   K. Snyder      196 279th Ave          (null)
Relational Database Theory
• Relational Algebra: SELECT
– Extracts specified tuples from a
relation (or get rows from a table)
Relational Database Theory
• Example:
SELECT out from the ANIMAL-FOOD table (display) the
rows where FOOD=PEOPLE
ANIMAL-FOOD
ANID   FOOD
0001   Hay
0001   Buns
0002   Brush
0003   Mice
0003   People
0004   Leaves
0005   People
0005   Meat
RESULTS
ANID   FOOD
0003   People
0005   People
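A minimal Python sketch of the SELECT operation, assuming a relation is held as a set of tuples and the row condition is passed in as a function. The helper name `select` is illustrative.

```python
ANIMAL_FOOD = {
    ("0001", "Hay"), ("0001", "Buns"), ("0002", "Brush"),
    ("0003", "Mice"), ("0003", "People"), ("0004", "Leaves"),
    ("0005", "People"), ("0005", "Meat"),
}


def select(relation, predicate):
    """Return the tuples of the relation for which predicate(row) is true."""
    return {row for row in relation if predicate(row)}


# Rows where FOOD = People (FOOD is the second attribute, index 1).
results = select(ANIMAL_FOOD, lambda row: row[1] == "People")
print(sorted(results))
```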
Relational Database Theory
• Relational Algebra: PROJECT
– Extracts specified
attributes(columns) from a relation
(or get columns from a table)
Relational Database Theory
• Example: PROJECT from the ZOO-MEMBER table
columns (MID, MNAME)
ZOO-MEMBER
MID   MNAME          MADDR            ***   ANID
171   N. Harrison    1400 Blush Rd          0002
144   J. Montagano   1108 5th Ave           0005
194   J. Spence      1244 Lark Ln           0001
303   E. Wingate     5222 Gains Dr          0001
101   H. Yarchun     177 Beach Rd           (null)
270   K. Steeg       140 Crystal Dr         0002
291   S. Ackerman    1172 Park Dr           0003
301   K. Snyder      196 279th Ave          (null)
RESULTS
MID   MNAME
171   N. Harrison
144   J. Montagano
194   J. Spence
303   E. Wingate
101   H. Yarchun
270   K. Steeg
291   S. Ackerman
301   K. Snyder
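PROJECT can be sketched the same way: keep only the named columns of each row and collect the results in a set, so duplicate result rows collapse. The helper name `project` and the dictionary representation of rows are assumptions made for the example.

```python
ZOO_MEMBER = [
    {"MID": 171, "MNAME": "N. Harrison", "MADDR": "1400 Blush Rd", "ANID": "0002"},
    {"MID": 144, "MNAME": "J. Montagano", "MADDR": "1108 5th Ave", "ANID": "0005"},
    {"MID": 194, "MNAME": "J. Spence", "MADDR": "1244 Lark Ln", "ANID": "0001"},
    {"MID": 303, "MNAME": "E. Wingate", "MADDR": "5222 Gains Dr", "ANID": "0001"},
    {"MID": 101, "MNAME": "H. Yarchun", "MADDR": "177 Beach Rd", "ANID": None},
    {"MID": 270, "MNAME": "K. Steeg", "MADDR": "140 Crystal Dr", "ANID": "0002"},
    {"MID": 291, "MNAME": "S. Ackerman", "MADDR": "1172 Park Dr", "ANID": "0003"},
    {"MID": 301, "MNAME": "K. Snyder", "MADDR": "196 279th Ave", "ANID": None},
]


def project(rows, *columns):
    """Return the set of tuples formed from the named columns of each row."""
    return {tuple(row[col] for col in columns) for row in rows}


members = project(ZOO_MEMBER, "MID", "MNAME")
adopted = project(ZOO_MEMBER, "ANID")

print(len(members))   # one result row per member, since MID is unique
print(len(adopted))   # fewer rows: repeated ANID values collapse
```

Projecting on ANID alone gives 5 rows (four animal ids plus null) from 8 members, because the result is again a set.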
Relational Database Theory
• Relational Algebra: JOIN
– Join the data in two tables
• Concatenate one row from Table
1 with one row from Table 2
–Usually based on a common
column called the join condition
Relational Database Theory
• Example: JOIN T1 and T2 based on the
AFAMILY column
T1
ANID   AFAMILY
0001   Camel
0002   Zebra
T2
AFAMILY   AREA
Camel     01
Zebra     03
RESULT
ANID   AFAMILY   AFAMILY   AREA
0001   Camel     Camel     01
0002   Zebra     Zebra     03
Relational Database Theory
• Different types of Joins
– Equijoin – means a row in T1 is joined with a row in T2 where
the values in the common column(s) are equal
– This is the most common type of join
T1
ANID   AFAMILY
0001   Camel
0002   Zebra
T2
AFAMILY   AREA
Camel     01
Zebra     03
RESULT
Join T1 and T2 where
T1.AFAMILY=T2.AFAMILY
ANID   AFAMILY   AFAMILY   AREA
0001   Camel     Camel     01
0002   Zebra     Zebra     03
Relational Database Theory
• Natural Join
– The rows of T1 are joined with the rows of T2 where the PK
value in one table equals the FK value in the other table
• Where the column names are the same
• Don’t use this in a Production Database – renaming a column
changes the join condition
T1
ANID   AFAMILY
0001   Camel
0002   Zebra
T2
AFAMILY   AREA
Camel     01
Zebra     03
RESULT
T1 NATURAL JOIN T2 (the common column AFAMILY appears once)
ANID   AFAMILY   AREA
0001   Camel     01
0002   Zebra     03
Relational Database Theory
• Inner Join
– The rows of T1 are joined with the rows of
T2 based on the join condition specified
• Only rows from T1 with a matching row
in T2 are in the result
• Often an Inner Join is both a Natural Join & an
Equijoin
Relational Database Theory
• Example: Inner Join
– T1 INNER JOIN T2 on
T1.AFAMILY=T2.AFAMILY
T1
ANID   AFAMILY
0001   Camel
0002   Zebra
T2
AFAMILY   AREA
Camel     01
Zebra     03
RESULT
ANID   AFAMILY   AFAMILY   AREA
0001   Camel     Camel     01
0002   Zebra     Zebra     03
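An inner equijoin can be sketched as a nested loop that concatenates a row of T1 with a row of T2 whenever the join columns are equal. This is only an illustration of the definition, not how a DBMS would necessarily execute it; the function name `equijoin` is hypothetical.

```python
T1 = [("0001", "Camel"), ("0002", "Zebra")]   # (ANID, AFAMILY)
T2 = [("Camel", "01"), ("Zebra", "03")]       # (AFAMILY, AREA)


def equijoin(left, right, left_col, right_col):
    """Concatenate each left row with each right row whose join values match."""
    return [
        l + r
        for l in left
        for r in right
        if l[left_col] == r[right_col]
    ]


# T1 INNER JOIN T2 ON T1.AFAMILY = T2.AFAMILY
result = equijoin(T1, T2, left_col=1, right_col=0)
for row in result:
    print(row)
```

Rows of T1 without a matching row in T2 simply produce no output, which is what makes the join an inner join.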
Relational Database Theory
• Outer Join
– The rows of T1 are joined with the rows of
T2
• All rows from one of the tables are
included in the result even if there is no
matching row in the other table
Relational Database Theory
• Example: Outer Join
– T1 RIGHT OUTER JOIN T2 on T1.AFAMILY=T2.AFAMILY
T1
ANID   AFAMILY
0001   Camel
0002   Zebra
T2
AFAMILY   AREA
Camel     01
Zebra     03
Snake     05
RESULT
ANID     AFAMILY   AFAMILY   AREA
0001     Camel     Camel     01
0002     Zebra     Zebra     03
(null)   (null)    Snake     05
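A right outer join keeps every row of the right-hand table and pads the left-hand columns with nulls when there is no match. The sketch below is one simple way to express that; `right_outer_join` and its `left_width` parameter are names chosen for the example, with `None` standing in for null.

```python
T1 = [("0001", "Camel"), ("0002", "Zebra")]                   # (ANID, AFAMILY)
T2 = [("Camel", "01"), ("Zebra", "03"), ("Snake", "05")]      # (AFAMILY, AREA)


def right_outer_join(left, right, left_col, right_col, left_width):
    """Join left and right, keeping every right row even without a match."""
    result = []
    for r in right:
        matches = [l for l in left if l[left_col] == r[right_col]]
        if matches:
            result.extend(l + r for l in matches)
        else:
            # No matching left row: pad the left-hand columns with None.
            result.append((None,) * left_width + r)
    return result


result = right_outer_join(T1, T2, left_col=1, right_col=0, left_width=2)
for row in result:
    print(row)
```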
Relational Database Theory
• Cross Join
– Every row in T1 is joined with every row in
T2
• All possible combinations of rows in the
two tables
• Also called a Cartesian Product
Relational Database Theory
• Example: Cross Join
– T1 CROSS JOIN T2
T1
ANID   AFAMILY
0001   Camel
0002   Zebra
T2
AFAMILY   AREA
Camel     01
Zebra     03
RESULT
ANID   AFAMILY   AFAMILY   AREA
0001   Camel     Camel     01
0001   Camel     Zebra     03
0002   Zebra     Camel     01
0002   Zebra     Zebra     03
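The Cartesian product is available directly in Python's standard library as `itertools.product`, so a cross join of the two small tables can be sketched in a few lines.

```python
from itertools import product

T1 = [("0001", "Camel"), ("0002", "Zebra")]   # (ANID, AFAMILY)
T2 = [("Camel", "01"), ("Zebra", "03")]       # (AFAMILY, AREA)

# Every row of T1 paired with every row of T2, with no join condition.
cross = [l + r for l, r in product(T1, T2)]

for row in cross:
    print(row)
print(len(cross) == len(T1) * len(T2))
```

The number of result rows is always the product of the two cardinalities, which is why an unintended cross join on large tables is expensive.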
Relational Database Theory
• An RDBMS manipulates Data using
Relational Algebra Operations
– There are (usually) several sequences of
operations to answer a query
• One sequence may be more efficient
than another
– A relational DBMS internally has routines
that do the relational algebra
– A relational DBMS generates a sequence
or plan of relational algebra operations to
accomplish the request
– A relational DBMS has a query optimizer
to develop an efficient query plan
• A least-cost optimizer generates several
execution plans and chooses the least-cost one, i.e., the one with the least amount of I/O
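To see why one sequence of operations can be cheaper than another, the sketch below answers the same question ("which animals eat People?") with two hand-written plans over the zoo data and compares the size of the intermediate result. It is a toy illustration under simplifying assumptions: the helper names are hypothetical, and a real optimizer estimates cost from statistics rather than running both plans.

```python
ANIMAL = [("0001", "Candice"), ("0002", "Zona"), ("0003", "Sam"),
          ("0004", "Elmer"), ("0005", "Leonard")]          # (ANID, ANAME)
ANIMAL_FOOD = [("0001", "Hay"), ("0001", "Buns"), ("0002", "Brush"),
               ("0003", "Mice"), ("0003", "People"), ("0004", "Leaves"),
               ("0005", "People"), ("0005", "Meat")]       # (ANID, FOOD)


def join_on_anid(animals, foods):
    """Equijoin on ANID, the first column of both inputs."""
    return [a + f for a in animals for f in foods if a[0] == f[0]]


def eats_people(rows, food_col):
    """Select the rows whose FOOD column equals 'People'."""
    return [row for row in rows if row[food_col] == "People"]


# Plan A: join first, then select.
joined = join_on_anid(ANIMAL, ANIMAL_FOOD)
plan_a = eats_people(joined, food_col=3)

# Plan B: select first, then join the smaller table.
filtered = eats_people(ANIMAL_FOOD, food_col=1)
plan_b = join_on_anid(ANIMAL, filtered)

print(plan_a == plan_b)             # both plans give the same answer
print(len(joined), len(filtered))   # intermediate result sizes differ
```

Both plans return the rows for Sam and Leonard, but Plan A carries 8 intermediate rows while Plan B carries only 2.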
Relational Database Theory
• Union, Intersection, and Minus
– Union – combine (append) the
result tables from two queries, keeping each row once
– Intersect – take only the rows that
are identical in the result tables from
two queries
– Difference (Minus) – take only the rows in the
first result table that have no identical
rows in the second result table
Relational Database Theory
• Relational Algebra: UNION
– Union together the results of two queries
• Result contains every element in either
one or both sets
– Query 1
• Select the rows from ANIMAL where
WEIGHT > 2000 into T1
• Project from T1(ANID) into result 1
– Query 2
• Select the rows from ANIMAL-FOOD
where FOOD=PEOPLE into T2
• Project from T2(ANID) into Result 2
– Query 1 UNION Query 2
RESULT 1   UNION   RESULT 2       RESULT
ANID               ANID           ANID
0004               0003           0003
                   0005           0004
                                  0005
Relational Database Theory
• Relational Algebra: INTERSECTION
– Take only the rows (tuples) that are
identical in the result tables of two queries
• Query 1
– Select out the rows from ANIMAL where
WEIGHT > 1000 into T1
– Project from T1(ANID) into Result 1
• Query 2
– Project from ZOO-MEMBER(ANID) into
Result 2
• Query 1 INTERSECT Query 2
RESULT 1   INTERSECT   RESULT 2       RESULT
ANID                   ANID           ANID
0001                   0002           0001
0004                   0005           0005
0005                   0001
                       0003
Relational Database Theory
• Relational Algebra: Minus/Difference/Except
– Subtract the results of a second query
from the results of the first query
• Query 1
– Project from ANIMAL(ANID) into Result 1
• Query 2
– Project from ZOO-MEMBER(ANID) into
Result 2
• Query 1 EXCEPT Query 2
RESULT 1   EXCEPT   RESULT 2       RESULT
ANID                ANID           ANID
0001                0002           0004
0002                0005
0003                0001
0004                0003
0005
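The three set operations map directly onto Python's set operators, so the UNION, INTERSECT, and EXCEPT examples can be reproduced in a short sketch. The variable and function names are illustrative, and null ANID values are dropped from the member projection on the assumption that the result tables shown above omit them.

```python
ANIMAL = {("0001", 1800), ("0002", 900), ("0003", 5),
          ("0004", 5000), ("0005", 1200)}                  # (ANID, WEIGHT)
ANIMAL_FOOD = {("0001", "Hay"), ("0001", "Buns"), ("0002", "Brush"),
               ("0003", "Mice"), ("0003", "People"), ("0004", "Leaves"),
               ("0005", "People"), ("0005", "Meat")}       # (ANID, FOOD)
MEMBER_ANID = ["0002", "0005", "0001", "0001", None, "0002", "0003", None]


def heavier_than(limit):
    """Select ANIMAL rows with WEIGHT > limit, then project ANID."""
    return {anid for anid, weight in ANIMAL if weight > limit}


eats_people = {anid for anid, food in ANIMAL_FOOD if food == "People"}
adopted = {anid for anid in MEMBER_ANID if anid is not None}  # nulls dropped
all_animals = {anid for anid, _ in ANIMAL}

union = heavier_than(2000) | eats_people       # Query 1 UNION Query 2
intersection = heavier_than(1000) & adopted    # Query 1 INTERSECT Query 2
difference = all_animals - adopted             # Query 1 EXCEPT Query 2

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```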
Relational Database Theory
• Strengths of the Relational Approach
– Simple
• People are familiar with tables
• Few rules
• Few operations
– Easy to learn
• Relational algebra is straightforward
• Multiple high-level, non-procedural
languages are available, e.g., SQL
– Well founded
• Basis is mathematics, set theory