Relational Data Model
Ch. 7.1 – 7.3
John Ortiz
Why Study Relational Model?
• Most widely used model.
    - Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc.
• “Legacy systems” in older models
    - E.g., IBM’s IMS (hierarchical model)
• Recent competitor: object-oriented model
    - ObjectStore, Versant, Ontos, O2
• A synthesis emerging: object-relational model
    - Informix UDS, UniSQL, Oracle, DB2
Lecture 3
Relational Data Model
2
Anatomy of a Relation
• Each relation is a table with a name.
• An attribute is a column heading.
• The heading is the schema of the relation:
    Students(SID, Name, Age, GPA)

Students                           (relation name)
SID    Name         Age   GPA      (attribute names, one per column)
1002   John Smith   20    3.2      (each row is a tuple)
1005   Mary Day     18    2.9
1020   Bill Lee     19    2.7
Domain of an Attribute
• The domain of an attribute A, denoted by Dom(A), is the set of values that the attribute can take.
• A domain is usually represented by a type. E.g.,
    SID    char(4)
    Name   varchar(30)   --- character string of variable length up to 30
    Age    number        --- a number
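The idea of a domain as a set of admissible values can be sketched in Python. This is only an illustration: the membership rules below are assumptions chosen to mirror the types on the slide (for instance, Age is treated as a non-negative integer), and the names DOMAINS and violations are hypothetical.

```python
# Each domain is modeled as a membership test: Dom(A) = {v | test(v) is true}.
# The concrete rules are assumptions that mirror the slide's types.
DOMAINS = {
    "SID": lambda v: isinstance(v, str) and len(v) == 4,     # char(4)
    "Name": lambda v: isinstance(v, str) and len(v) <= 30,   # varchar(30)
    "Age": lambda v: type(v) is int and v >= 0,              # number
}


def violations(row):
    """Return the attributes of row whose value lies outside its domain."""
    return [attr for attr, value in row.items() if not DOMAINS[attr](value)]


ok = {"SID": "1005", "Name": "Mary Day", "Age": 18}
bad = {"SID": "10050", "Name": "Mary Day", "Age": "18"}

print(violations(ok))   # []
print(violations(bad))  # ['SID', 'Age']
```

An RDBMS performs this kind of check itself when a column type is declared; the sketch only makes the set-membership reading of Dom(A) explicit.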
Tuples
• A tuple of a relation is a row in its table.
• If t is a tuple of a relation R and A is an attribute of R, we use t[A] to represent the value of t under A in R.
• Example: If t is the second tuple in Students,
    t[Name] = ‘Mary Day’
    t[Age] = 18
    t[Name, Age] = (‘Mary Day’, 18)
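The t[A] notation can be mimicked with a small helper. The sketch below assumes the Students rows from the earlier slide and stores them in a list so that "the second tuple" is well defined; the function name project is hypothetical.

```python
ATTRS = ("SID", "Name", "Age", "GPA")
students = [
    ("1002", "John Smith", 20, 3.2),
    ("1005", "Mary Day", 18, 2.9),
    ("1020", "Bill Lee", 19, 2.7),
]


def project(t, *attrs):
    """Return t[A] for one attribute, or t[A1, ..., Ak] as a tuple for several."""
    values = tuple(t[ATTRS.index(a)] for a in attrs)
    return values[0] if len(values) == 1 else values


t = students[1]  # the second tuple as displayed on the slide
print(project(t, "Name"))         # Mary Day
print(project(t, "Age"))          # 18
print(project(t, "Name", "Age"))  # ('Mary Day', 18)
```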
Schema and Instance
• A relation schema, denoted by R(A1, A2, …, An), consists of the relation name R and a list of attributes A1, …, An.
    - R.A denotes attribute A of R.
    - # of attributes = degree
• A relation instance (state) of a relation schema R(A1, …, An), denoted by r(R), is the set of tuples in the table of R at some instant of time.
    - # of tuples = cardinality
Schema & Instance Update
• The schema of a relation may change (e.g., adding, deleting, or renaming attributes, or deleting a table schema), but such changes are infrequent.
• The state of a relation may also change (e.g., inserting or deleting a tuple, or changing an attribute value in a tuple), and such changes are frequent.
• A schema may have different states at different times.
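The distinction between a schema and its states, together with degree and cardinality, can be illustrated with plain Python values. This is a minimal sketch: the inserted student ("1033", "Ann Ray") is invented purely for the example.

```python
schema = ("SID", "Name", "Age", "GPA")   # R(A1, ..., An)
state_1 = {                              # r(R): a set of tuples at one instant
    ("1002", "John Smith", 20, 3.2),
    ("1005", "Mary Day", 18, 2.9),
    ("1020", "Bill Lee", 19, 2.7),
}

degree = len(schema)         # number of attributes
cardinality = len(state_1)   # number of tuples

# Inserting a tuple yields a new state of the same schema.
# The row below is hypothetical example data.
state_2 = state_1 | {("1033", "Ann Ray", 21, 3.5)}

print(degree, cardinality, len(state_2))  # 4 3 4
```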
Relational Database
• A relational database schema is a set of relation schemas S = {R1, …, Rm}.
• A relational database is a set of relations DB(S) = {r(R1), …, r(Rm)}.
• A database state is a set of relation instances at some instant of time.
• In addition, a relational database must satisfy a number of constraints (more to come later).
A University Database
see p. 204, Fig. 7.5

Students
SID    Name       Major   GPA
1002   J. Smith   CS      3.2
1005   M. Day     Math    2.9
1020   B. Lee     EE      2.7

Courses
Cno       Name        Hour   Dept
CS374     Database    3      CS
CS455     Network     3      CS
CS100     Prog.Lang   4      CS
Math210   Calculus    3      Math

Sections
Cno       Sno   Semester   Prof
CS374     001   F2000      Zhang
CS455     002   S2000      Smith
Math210   001   F1999      Brown

Departments
Name   Room    Chair
CS     SB220   Hansen
EE     EB318   Johnson
Math   AB119   Miller
Constraints of Relational DB
• Relations must satisfy the following constraints.
    - Domain (1NF) Constraint.
    - Access-by-Content Constraint.
    - Key (Unique Tuple) Constraint.
    - Entity Integrity Constraint.
    - Referential Integrity Constraint.
• Integrity constraints are enforced by the RDBMS.
Domain Constraint
• Also known as the First Normal Form (1NF): attributes can only take atomic values (i.e., set values are not allowed).
• How to handle multivalued attributes?
    - Use multiple tuples, one per value
    - Use multiple columns, one per value
    - Use separate tables
• What problems do these solutions have?
Handle Multi-Valued Attributes
Multiple Values:

Employees
EID    Name    Age   Dependents
1234   Bob     34    {Allen, Ann}
1357   Mary    23    {Kathy}
2468   Peter   54    {Mike, Sue, David}

Use Multiple Tuples:

Employees
EID    Name    Age   Dependents
1234   Bob     34    Allen
1234   Bob     34    Ann
1357   Mary    23    Kathy
2468   Peter   54    Mike
2468   Peter   54    Sue
2468   Peter   54    David
Handle Multi-Valued Attributes
Use Multiple Columns:

Employees
EID    Name    Age   Dep1    Dep2   Dep3
1234   Bob     34    Allen   Ann
1357   Mary    23    Kathy
2468   Peter   54    Mike    Sue    David

Use Separate Relations:

Employees
EID    Name    Age
1234   Bob     34
1357   Mary    23
2468   Peter   54

Dependents
EID    Name
1234   Allen
1234   Ann
1357   Kathy
2468   Mike
2468   Sue
2468   David
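The "separate relations" approach can be sketched as a small transformation from the set-valued Employees table to two 1NF relations. The function name split_relations is hypothetical; the data is taken from the slides.

```python
# Non-1NF input: the last field holds a set of dependents.
employees_nested = [
    (1234, "Bob", 34, {"Allen", "Ann"}),
    (1357, "Mary", 23, {"Kathy"}),
    (2468, "Peter", 54, {"Mike", "Sue", "David"}),
]


def split_relations(rows):
    """Split set-valued rows into Employees(EID, Name, Age) and Dependents(EID, Name)."""
    employees = {(eid, name, age) for eid, name, age, _ in rows}
    dependents = {(eid, dep) for eid, _, _, deps in rows for dep in deps}
    return employees, dependents


employees, dependents = split_relations(employees_nested)
print(len(employees), len(dependents))  # 3 6
print((1234, "Ann") in dependents)      # True
```

Each employee's name and age are stored once, and the EID column of Dependents links each dependent back to its employee.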
Access-by-Content Constraint
• A tuple is retrieved only by values of its attributes, i.e., the order of tuples is not important.
• This is because a relation is a set of tuples.
• Although the order of tuples is insignificant for query formulation, it is significant for query evaluation.
Superkey
• A superkey of a relation is a set of attributes whose values uniquely identify the tuples of the relation.
• Every relation has at least one superkey (default is all attributes together?).
• Any superset of a superkey is a superkey.
• From a state of a relation, we may determine that a set of attributes is not a superkey, but we cannot determine that a set of attributes is a superkey.
Superkey Example
• Find all superkeys of the Students relation.

Students
SID    Name       Major   GPA
1002   J. Smith   CS      3.2
1005   M. Day     Math    2.9
1020   B. Lee     EE      2.7

• Given only this state of R, is A a superkey? What about {A, B}?

R
A    B    C    D
A1   B2   C1   D2
A2   B2   C3   D2
A2   B1   C2   D1
A3   B3   C4   D1
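The question about R can be checked mechanically. The sketch below tests whether a state refutes a proposed superkey, that is, whether two tuples agree on all of its attributes. The function name refuted_as_superkey is hypothetical. A result of False only means this particular state provides no counterexample; it does not prove the attribute set is a superkey.

```python
ATTRS = ("A", "B", "C", "D")
R = [
    ("A1", "B2", "C1", "D2"),
    ("A2", "B2", "C3", "D2"),
    ("A2", "B1", "C2", "D1"),
    ("A3", "B3", "C4", "D1"),
]


def refuted_as_superkey(state, attrs):
    """Return True if two tuples of the state agree on every attribute in attrs."""
    positions = [ATTRS.index(a) for a in attrs]
    seen = set()
    for t in state:
        key = tuple(t[i] for i in positions)
        if key in seen:
            return True
        seen.add(key)
    return False


print(refuted_as_superkey(R, ["A"]))       # True: A2 appears in two tuples
print(refuted_as_superkey(R, ["A", "B"]))  # False: this state gives no counterexample
```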
Candidate Key
• A candidate key of a relation is a set of attributes of the relation such that
    - it is a superkey, and
    - none of its proper subsets is a superkey.
• Find all candidate keys in the Students relation.
• Is it true that every relation has at least one candidate key? Why?
• Can candidate keys be found from a state?
• If AB is a candidate key of a relation, can A also be a candidate key? What is ABC called?
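To see why a single state cannot settle which attribute sets are candidate keys, the sketch below lists every minimal attribute set whose values happen to be unique in the Students state shown earlier. The helper names are hypothetical. These sets are only those not ruled out by this state; which of them are real candidate keys depends on the meaning of the data.

```python
from itertools import combinations

ATTRS = ("SID", "Name", "Major", "GPA")
students = [
    ("1002", "J. Smith", "CS", 3.2),
    ("1005", "M. Day", "Math", 2.9),
    ("1020", "B. Lee", "EE", 2.7),
]


def unique_on(state, attrs):
    """True if no two tuples of the state agree on all of attrs."""
    positions = [ATTRS.index(a) for a in attrs]
    keys = [tuple(t[i] for i in positions) for t in state]
    return len(set(keys)) == len(keys)


def minimal_unique_sets(state):
    """Attribute sets that are unique in this state and have no unique proper subset."""
    found = []
    for size in range(1, len(ATTRS) + 1):
        for combo in combinations(ATTRS, size):
            if any(set(f) <= set(combo) for f in found):
                continue  # a smaller unique set is contained in combo, so combo is not minimal
            if unique_on(state, combo):
                found.append(combo)
    return found


print(minimal_unique_sets(students))
# [('SID',), ('Name',), ('Major',), ('GPA',)]
```

In this three-row state every single attribute looks like a key, including GPA, which shows how misleading one state can be.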
Primary Key
• A primary key of a relation is a candidate key designated (with an underline) by a database designer.
    - Often chosen at the time of schema design, and once specified to the DBMS, it cannot be changed.
    - Preferably the smallest candidate key, to improve both storage and query-processing efficiency.
• What should be the primary key of Students?
Key Constraint
• Every relation must have a primary key.
• Why is the key constraint needed?
    - Every tuple has a different primary key value.
    - Only the primary key values need to be checked to identify duplicates when new tuples are inserted (an index is often used).
    - Primary key values can be referenced from within other relations.
Entity Integrity Constraint
• A null value is a special value that is
    - unknown,
    - yet to be assigned, or
    - inapplicable.
• Entity Integrity Constraint: No primary key values can be null.
    - Why?
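The key and entity integrity constraints can be observed in a real engine. The sketch below uses SQLite through Python's standard sqlite3 module as one possible RDBMS; the lecture itself does not name a system, and the helper try_insert is hypothetical. NOT NULL is written explicitly because SQLite, for historical reasons, accepts NULL in a non-integer PRIMARY KEY column unless told otherwise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# PRIMARY KEY gives the key constraint; the explicit NOT NULL gives entity integrity.
conn.execute(
    "CREATE TABLE Students (SID TEXT PRIMARY KEY NOT NULL, Name TEXT, GPA REAL)"
)
conn.execute("INSERT INTO Students VALUES ('1002', 'J. Smith', 3.2)")


def try_insert(row):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO Students VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


print(try_insert(("1005", "M. Day", 2.9)))    # accepted
print(try_insert(("1002", "Someone", 2.0)))   # rejected: duplicate primary key value
print(try_insert((None, "Nobody", 3.0)))      # rejected: null primary key value
```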
Foreign Key
• A foreign key in relation R1 referencing relation R2 is a set of attributes FK of R1 such that
    - FK is compatible with a candidate (or primary) key PK of R2 (with the same number of attributes and compatible domains); and
    - for every tuple t1 in R1, either there exists a tuple t2 in R2 such that t1[FK] = t2[PK], or t1[FK] = null.
• Foreign keys need to be explicitly defined.
Foreign Key Example

Employees
EID    Name    Age   DName
1234   Bob     34    Sales
1357   Mary    23    Service
2468   Peter   54    null

Departments
Name      City      Manager
Sales     Houston   Bill
Payroll   Dallas    Steve
Service   Chicago   Tom

• DName of Employees is a foreign key referencing Name of Departments.
• A foreign key may reference its own relation.
    Employee(EID, Name, Age, Dept, ManagerID)
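The foreign key condition (every t1[FK] is either null or equal to some t2[PK]) can be written as a short check over the example tables. The function name unmatched_fk and the extra employee used to trigger a violation are hypothetical.

```python
employees = [
    (1234, "Bob", 34, "Sales"),
    (1357, "Mary", 23, "Service"),
    (2468, "Peter", 54, None),  # a null foreign key value is permitted
]
departments = [
    ("Sales", "Houston", "Bill"),
    ("Payroll", "Dallas", "Steve"),
    ("Service", "Chicago", "Tom"),
]


def unmatched_fk(r1, fk_pos, r2, pk_pos):
    """Return the tuples of r1 whose non-null FK value matches no PK value in r2."""
    pk_values = {t[pk_pos] for t in r2}
    return [t for t in r1 if t[fk_pos] is not None and t[fk_pos] not in pk_values]


print(unmatched_fk(employees, 3, departments, 0))  # []

# Hypothetical extra row referencing a department that does not exist.
with_bad_row = employees + [(9999, "Eve", 40, "Legal")]
print(unmatched_fk(with_bad_row, 3, departments, 0))  # [(9999, 'Eve', 40, 'Legal')]
```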
Referential Integrity Constraint
• Referential Integrity Constraint: No relation can contain unmatched foreign key values.
• Using foreign keys in a relation to reference primary keys of other relations is the only way in the relational data model to establish relationships among different relations.
Update Operations
• Insert
    - Can violate any of the 4 previous constraints (what were they again?)
    - 1 solution: reject the insert
• Delete
    - Can only violate referential integrity. Why?
    - 3 solutions: reject deletion, propagate deletion, modify referencing attributes
• Modify
    - Can violate any of the 4 previous constraints
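The three responses to a delete that would break referential integrity correspond to the SQL rules ON DELETE RESTRICT, CASCADE, and SET NULL. The sketch below tries each one in SQLite through Python's sqlite3 module. SQLite is an assumption here, since the lecture names no system, and delete_sales is a hypothetical helper; the tables are a cut-down version of the Employees and Departments example.

```python
import sqlite3


def delete_sales(on_delete):
    """Delete department 'Sales' under the given ON DELETE rule and report the outcome."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
    conn.execute("CREATE TABLE Departments (Name TEXT PRIMARY KEY NOT NULL, City TEXT)")
    conn.execute(
        "CREATE TABLE Employees (EID INTEGER PRIMARY KEY, Name TEXT, "
        f"DName TEXT REFERENCES Departments(Name) ON DELETE {on_delete})"
    )
    conn.execute("INSERT INTO Departments VALUES ('Sales', 'Houston')")
    conn.execute("INSERT INTO Employees VALUES (1234, 'Bob', 'Sales')")
    try:
        conn.execute("DELETE FROM Departments WHERE Name = 'Sales'")
    except sqlite3.IntegrityError:
        return "rejected"
    return conn.execute("SELECT EID, DName FROM Employees").fetchall()


print(delete_sales("RESTRICT"))  # rejected
print(delete_sales("CASCADE"))   # []  (the referencing employee was deleted too)
print(delete_sales("SET NULL"))  # [(1234, None)]  (the referencing attribute was modified)
```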
Relational Model: Summary
• A tabular representation of data.
• Simple and intuitive, currently the most widely used.
• Integrity constraints can be specified by the DBA, based on application semantics. The DBMS checks for violations.
    - Two important ICs for primary and foreign keys
    - In addition, we always have domain constraints.
Relational Model: Summary
• ICs are based upon the semantics of the real-world enterprise that is being described in the database relations.
• We can check a database instance to see if an IC is violated, but we can never infer that an IC is true by looking at an instance.
• Powerful and natural query languages exist.
• Guidelines to translate ER to relational model (next class…)
Look Ahead
• Next topic: Relational Algebra
• Read Textbook: Chapter 7.4 – 7.6