Chapter 3
The Relational Database Model
Logical View of Data
Relational Database
– Designer focuses on logical representation
rather than physical
– Use of tables is advantageous
• Structural and data independence
• Related records stored in independent tables
• Logical simplicity

A Logical View of Data

The relational database model’s structural and data
independence
– allows us to ignore the structure’s physical complexity.
– enables us to view data logically rather than physically.

The logical view allows
a simpler file concept of data storage.

The use of logically independent tables is easier
to understand.

Logical simplicity yields
simpler and more effective database design
methodologies.
A Logical View of Data

Entities and Attributes
– An entity is a person, place, event, or thing for which
we intend to collect data.

University -- Students, Faculty Members, Courses
– Each entity has certain characteristics known as
attributes.

Student -- Student Number, Name, GPA, Date of Enrollment,
Date of Birth, Home Address, Phone Number, Major
– A grouping of related entities becomes an entity set.

The STUDENT entity set contains all student entities.

The FACULTY entity set contains all faculty entities.
Table Characteristics
• Two-dimensional structure with rows and columns
• Each row (tuple) represents a single entity occurrence within the
entity set
• Columns represent attributes
• Each row/column intersection represents a single value
• Column values all have the same data format
• Each column has a range of values called the attribute domain
• Order of the rows and columns is
immaterial to the DBMS
• Each table must have an attribute (or combination of attributes) to
uniquely identify each row
A Logical View of Data

Tables and Their Characteristics
– A table contains a group of related entities --
i.e. an entity set.
– The terms entity set and table are often used
interchangeably.
– A table is also called a relation.
entity set ≒ table ≒ relation
Keys
– A key consists of one or more attributes that
determine other attributes
– Primary key (PK) is an attribute (or a combination
of attributes) that uniquely identifies any given
entity (row)
Keys
– The key’s role is based on a concept known as
determination, which is used in the definition of
functional dependence.

The attribute B is functionally dependent on A
if A determines B. ( A → B )

An attribute that is part of a key is known as a key
attribute.

A multi-attribute key is known as a composite key.

If the attribute (B) is functionally dependent on a
composite key (A) but not on any subset of that
composite key, the attribute (B) is fully functionally
dependent on (A).
(A is minimal !)
Example: STU_HRS → STU_CLASS
(STU_CLASS is one of Freshman, Sophomore, Junior, Senior)
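
The determination idea can be checked mechanically against sample rows. The following is a minimal sketch, not part of the original slides: the helper name `determines` and the sample student rows are hypothetical. Note that a check like this can only refute a dependency on the data at hand; it cannot prove that the dependency holds for every possible row.

```python
def determines(rows, a, b):
    """Return True if, in `rows`, equal values of attributes `a`
    always come with equal values of attributes `b` (A -> B)."""
    seen = {}
    for row in rows:
        key = tuple(row[name] for name in a)
        value = tuple(row[name] for name in b)
        # setdefault stores the first value seen for this key and
        # returns the stored one, so a mismatch means A does not determine B.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data for illustration only.
students = [
    {"STU_NUM": 1, "STU_HRS": 42, "STU_CLASS": "Sophomore"},
    {"STU_NUM": 2, "STU_HRS": 81, "STU_CLASS": "Junior"},
    {"STU_NUM": 3, "STU_HRS": 42, "STU_CLASS": "Sophomore"},
    {"STU_NUM": 4, "STU_HRS": 15, "STU_CLASS": "Freshman"},
    {"STU_NUM": 5, "STU_HRS": 36, "STU_CLASS": "Sophomore"},
]

hrs_to_class = determines(students, ["STU_HRS"], ["STU_CLASS"])
class_to_hrs = determines(students, ["STU_CLASS"], ["STU_HRS"])
```

Here STU_HRS determines STU_CLASS, but not the other way around, because two Sophomores have different hour totals.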
Keys

Superkey
– Uniquely identifies each entity

Candidate key
– Minimal superkey

Primary key
– Candidate key selected to uniquely identify all other
attributes in a given row

Secondary key
– Used only for data retrieval

Foreign key
– Values must match primary key in another table

Superkey
– STU_NUM
– (STU_NUM, STU_LNAME)

Candidate key
– STU_NUM
– (STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE)

Primary key
– STU_NUM

Secondary key
– Does not necessarily yield a unique outcome
– CUSTOMER table:
• Primary key: Customer number
• Secondary key: Customer last name, Customer phone number
– A secondary key’s effectiveness in narrowing down a search
depends on how restrictive it is.
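
The difference between a superkey and a candidate key (a minimal superkey) can be sketched in a few lines. This is an illustrative sketch only: the function names and the sample rows are hypothetical, and testing uniqueness on sample data is weaker than the design-level guarantee a real key provides.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows share the same values for `attrs`."""
    seen = set()
    for row in rows:
        key = tuple(row[name] for name in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def is_candidate_key(rows, attrs):
    """True if `attrs` is a superkey and no proper subset of it is one."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


# Hypothetical sample data for illustration only.
students = [
    {"STU_NUM": 321452, "STU_LNAME": "Bowser"},
    {"STU_NUM": 324257, "STU_LNAME": "Smithson"},
    {"STU_NUM": 324258, "STU_LNAME": "Smithson"},
]

num_is_superkey = is_superkey(students, ["STU_NUM"])
pair_is_superkey = is_superkey(students, ["STU_NUM", "STU_LNAME"])
pair_is_candidate = is_candidate_key(students, ["STU_NUM", "STU_LNAME"])
lname_is_superkey = is_superkey(students, ["STU_LNAME"])
```

(STU_NUM, STU_LNAME) is a superkey but not a candidate key, because STU_NUM alone already identifies each row.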
Null Values

No data entry

Not permitted in primary key

Should be avoided in other attributes

Can represent
– An unknown attribute value
– A known, but missing, attribute value
– A “not applicable” condition

Can create problems in logic and when using functions
such as COUNT, AVERAGE, and SUM
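
A short experiment shows how nulls affect aggregate functions and comparisons. This sketch uses Python's built-in `sqlite3` module; the table name, column names, and values are hypothetical, and SQL spells the AVERAGE function as `AVG`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (stu_num INTEGER PRIMARY KEY, stu_gpa REAL)")
# The third student has no GPA recorded (null).
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [(1, 4.0), (2, 2.0), (3, None)],
)

count_rows, count_gpa, avg_gpa, sum_gpa = conn.execute(
    "SELECT COUNT(*), COUNT(stu_gpa), AVG(stu_gpa), SUM(stu_gpa) FROM student"
).fetchone()

# Comparing anything with NULL using "=" is never true, so no row matches.
null_matches = conn.execute(
    "SELECT COUNT(*) FROM student WHERE stu_gpa = NULL"
).fetchone()[0]
```

COUNT(*) sees three rows, but COUNT(stu_gpa) and AVG(stu_gpa) silently skip the null, so the average is taken over two students rather than three.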
Keys

Controlled redundancy (shared common attributes)
makes the relational database work.

The primary key of one table appears again as the
link (foreign key) in another table.
– Example: VEND_CODE in the PRODUCT table (Figure 3.2)

Referential integrity:
if the foreign key contains a value, that value refers
to an existing valid tuple in another relation.
Integrity Rules

Entity integrity (unique and no null)
– All primary key entries are unique, and
no part of a primary key may be null.
– Primary keys in the example: CUS_CODE, AGENT_CODE

Referential integrity (null or matching)
– A foreign key may have either a null entry or
an entry that matches the primary key value in the
table to which it is related.
– Foreign key in the example: AGENT_CODE (null or matching)
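
Both rules can be seen in action with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the table layouts and code values are hypothetical, the codes are stored as text, and SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.execute("CREATE TABLE agent (agent_code TEXT PRIMARY KEY NOT NULL)")
conn.execute(
    """CREATE TABLE customer (
           cus_code   TEXT PRIMARY KEY NOT NULL,
           agent_code TEXT REFERENCES agent (agent_code)
       )"""
)
conn.execute("INSERT INTO agent VALUES ('501')")
conn.execute("INSERT INTO customer VALUES ('10010', '501')")  # matching FK
conn.execute("INSERT INTO customer VALUES ('10011', NULL)")   # null FK


def rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


duplicate_pk = rejected("INSERT INTO customer VALUES ('10010', '501')")
null_pk = rejected("INSERT INTO customer VALUES (NULL, '501')")
dangling_fk = rejected("INSERT INTO customer VALUES ('10012', '999')")
```

The first two rejections illustrate entity integrity (unique and no null); the third illustrates referential integrity (null or matching).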
Relational Database Operators

Relational algebra defines the theoretical way of
manipulating table contents using the eight
relational functions:
– SELECT
– PROJECT
– JOIN
– INTERSECT
– UNION
– DIFFERENCE
– PRODUCT
– DIVIDE

Use of relational algebra operators on existing tables
(relations) produces new relations.

Relational Database Operators

UNION combines all rows from two tables. The two
tables must be union compatible.
(share the same columns and domains)

INTERSECT produces a listing that contains only
the rows that appear in both tables. The two
tables must be union compatible.

DIFFERENCE yields all rows in one table that
are not found in the other table; i.e., it subtracts
one table from the other. The tables must be
union compatible.

PRODUCT produces a list of all possible pairs of
rows from two tables. (Cartesian product)

SELECT yields values for all rows found in a table
that satisfy a given condition.
It yields a horizontal subset of a table. (Figure 2.9)

PROJECT produces a list of all values for selected
attributes. It yields a vertical subset of a table.

JOIN allows us to combine information from two or
more tables. JOIN is the real power behind the
relational database, allowing the use of
independent tables linked by common attributes.
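
Most of these operators map directly onto set operations when a relation is modeled as a set of tuples. The sketch below is illustrative only; the relation names, column layout, and values are hypothetical.

```python
# Two union-compatible relations (one column: a name).
t1 = {("Alice",), ("Bob",), ("Carol",)}
t2 = {("Bob",), ("Dave",)}

union = t1 | t2          # all rows from both tables, duplicates removed
intersect = t1 & t2      # rows that appear in both tables
difference = t1 - t2     # rows in t1 that are not found in t2
product = {r + s for r in t1 for s in t2}  # every possible pair of rows

# A relation with columns (P_CODE, P_DESCRIPT, P_PRICE).
products = {(1, "hammer", 9.5), (2, "saw", 14.0), (3, "drill", 49.0)}

# SELECT: horizontal subset, rows that satisfy a condition (price below 20).
select = {row for row in products if row[2] < 20}

# PROJECT: vertical subset, keep only the P_DESCRIPT column.
project = {(row[1],) for row in products}
```

Every result is itself a set of tuples, which mirrors the point that relational operators applied to relations produce new relations.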
Relational Database Operators
• Natural JOIN links tables by selecting only the
rows with common values in their common
attribute(s). It is the result of a 3-stage process:
1. A PRODUCT of the tables is created. (Figure 3.12)
2. A SELECT is performed on the output of the first
step to yield only the rows for which the common
attribute values match. (Figure 3.13)
The common column(s) are called join column(s).
3. A PROJECT is performed to yield a single copy of
each attribute, thereby eliminating the duplicate
column. (Figure 3.14)
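
The three stages can be written out literally. This is a teaching sketch rather than how a DBMS actually executes a join: the function name, the sample rows, and the AGENT_PHONE values are hypothetical, and the function assumes both tables are non-empty lists of dictionaries.

```python
def natural_join(left, right):
    """Natural join of two non-empty tables given as lists of dicts."""
    common = [name for name in left[0] if name in right[0]]  # join column(s)

    # Stage 1: PRODUCT, every possible pair of rows.
    product = [(l, r) for l in left for r in right]

    # Stage 2: SELECT, keep pairs whose common attribute values match.
    matched = [
        (l, r) for l, r in product if all(l[c] == r[c] for c in common)
    ]

    # Stage 3: PROJECT, keep a single copy of each common attribute.
    return [
        {**l, **{k: v for k, v in r.items() if k not in common}}
        for l, r in matched
    ]


customers = [
    {"CUS_CODE": 10010, "AGENT_CODE": 501},
    {"CUS_CODE": 10011, "AGENT_CODE": 502},
    {"CUS_CODE": 10012, "AGENT_CODE": 999},  # no such agent
]
agents = [
    {"AGENT_CODE": 501, "AGENT_PHONE": "228-1249"},
    {"AGENT_CODE": 502, "AGENT_PHONE": "882-1244"},
    {"AGENT_CODE": 503, "AGENT_PHONE": "123-5589"},  # no customers
]

joined = natural_join(customers, agents)
```

Customer 10012 and agent 503 have no partner row, so neither appears in the result.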
Relational Database Operators
– EquiJOIN
links tables based on an equality condition that
compares specified columns of each table. The
outcome of the EquiJOIN does not eliminate duplicate
columns and the condition or criteria to join the
tables must be explicitly defined.
– Theta JOIN links tables in the same way as an equiJOIN,
but compares specified columns of each table using a
comparison operator other than the equality (=) operator.
– In an outer JOIN, the matched pairs are retained, and
unmatched rows are kept as well, with the attributes that
would come from the other table left null.
equiJOIN criteria (C = E)

Table 1          Table 2
A  B  C          D  E
1  x  10         m  10
2  x  20         n  20
3  y  10         p  40

Result
A  B  C   D  E
1  x  10  m  10
3  y  10  m  10
2  x  20  n  20

theta JOIN criteria (C < E)

Table 1          Table 2
A  B  C          D  E
1  x  10         m  10
2  x  20         n  20
3  y  10         p  40

Result
A  B  C   D  E
1  x  10  n  20
1  x  10  p  40
2  x  20  p  40
3  y  10  n  20
3  y  10  p  40
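
The equiJOIN and theta JOIN examples above can be reproduced with one small helper that takes the comparison operator as a parameter. This is a sketch, with a hypothetical function name; columns are addressed by position (C is index 2 of the first table, E is index 1 of the second).

```python
import operator

t1 = [(1, "x", 10), (2, "x", 20), (3, "y", 10)]  # columns A, B, C
t2 = [("m", 10), ("n", 20), ("p", 40)]           # columns D, E


def join_on(left, right, left_col, right_col, compare):
    """Pair rows whose chosen columns satisfy `compare`; keep all columns."""
    return [
        l + r
        for l in left
        for r in right
        if compare(l[left_col], r[right_col])
    ]


equi = join_on(t1, t2, 2, 1, operator.eq)   # criteria C = E
theta = join_on(t1, t2, 2, 1, operator.lt)  # criteria C < E
```

Both results keep columns C and E side by side, since neither join removes the compared columns the way a natural JOIN does.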
CUSTOMER Full Outer JOIN (CUSTOMER.AGENT_CODE = AGENT.AGENT_CODE) AGENT
CUSTOMER Left Outer JOIN (CUSTOMER.AGENT_CODE = AGENT.AGENT_CODE) AGENT
CUSTOMER Right Outer JOIN (CUSTOMER.AGENT_CODE = AGENT.AGENT_CODE) AGENT
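
A left and a right outer JOIN of CUSTOMER and AGENT can be tried with Python's built-in `sqlite3` module. The rows below are hypothetical. Because older SQLite versions lack RIGHT and FULL outer joins, this sketch expresses the right outer JOIN as a left outer JOIN with the tables swapped, which yields the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE agent (agent_code INTEGER PRIMARY KEY);
    CREATE TABLE customer (cus_code INTEGER PRIMARY KEY, agent_code INTEGER);
    INSERT INTO agent VALUES (501), (502);
    INSERT INTO customer VALUES (10010, 501), (10011, NULL), (10012, 999);
    """
)

# CUSTOMER left outer JOIN AGENT: every customer is kept.
left_rows = conn.execute(
    """SELECT customer.cus_code, customer.agent_code, agent.agent_code
       FROM customer
       LEFT OUTER JOIN agent ON customer.agent_code = agent.agent_code
       ORDER BY customer.cus_code"""
).fetchall()

# CUSTOMER right outer JOIN AGENT, written with the tables swapped:
# every agent is kept.
right_rows = conn.execute(
    """SELECT customer.cus_code, agent.agent_code
       FROM agent
       LEFT OUTER JOIN customer ON customer.agent_code = agent.agent_code
       ORDER BY agent.agent_code"""
).fetchall()
```

Unmatched customers show a null agent, and the agent with no customers shows a null customer code.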

DIVIDE requires the use of one single-column
table and one two-column table.

This is not an uncommon situation, but the
solution is not straightforward since the SQL
SELECT statement does not directly support
relational division.

There are numerous ways to achieve the same
results as a relational division, however. The
easiest method is to rephrase the request.
Instead of "list the suppliers who provide all
product categories," which is difficult to process,
try "list all suppliers where the count of their
product categories is equal to the count of all
product categories."
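
The rephrased request translates almost word for word into a GROUP BY query. The sketch below uses Python's built-in `sqlite3` module with hypothetical table names and data, and it assumes every category listed for a supplier also exists in the category table; without that assumption the two counts could match by accident.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE category (cat_name TEXT PRIMARY KEY);
    CREATE TABLE supplies (
        supplier TEXT,
        cat_name TEXT,
        PRIMARY KEY (supplier, cat_name)
    );
    INSERT INTO category VALUES ('tools'), ('paint'), ('lumber');
    INSERT INTO supplies VALUES
        ('Acme', 'tools'), ('Acme', 'paint'), ('Acme', 'lumber'),
        ('Bolt', 'tools'), ('Bolt', 'paint');
    """
)

# "List all suppliers where the count of their product categories
#  is equal to the count of all product categories."
all_category_suppliers = conn.execute(
    """SELECT supplier
       FROM supplies
       GROUP BY supplier
       HAVING COUNT(DISTINCT cat_name) = (SELECT COUNT(*) FROM category)"""
).fetchall()
```

Only the supplier that covers all three categories is returned.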
The Data Dictionary and
the System Catalog

Data dictionary
contains metadata to provide detailed accounting of all
tables within the database.

System catalog
is a very detailed system data dictionary that describes all
objects within the database.
– Authorized users, access privileges …
– System catalog is a system-created database whose tables
store the database characteristics and contents.

Terms “system catalog” and “data dictionary” are often used
interchangeably
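
As one concrete illustration, SQLite keeps its catalog in a system-created table named `sqlite_master`, which can be queried like any other table. This sketch uses Python's built-in `sqlite3` module; the PAINTER table definition is hypothetical, and other DBMSs expose their catalogs under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE painter (painter_num INTEGER PRIMARY KEY, painter_lname TEXT)"
)

# The catalog is itself a table describing the objects in the database.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# Column-level metadata for one table; field 1 of each row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(painter)")]
```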
Relationships within the
Relational Database

E-R Diagram (ERD)
– Rectangles are used to represent entities.
– Entity names are nouns and capitalized.
– Diamonds are used to represent the relationship(s)
between the entities.
– “1” side of relationship

Number 1 in Chen Model

Bar crossing line in Crow’s Feet Model
– “Many” relationships

Letter “M” and “N” in Chen Model

Three-pronged “Crow’s foot” in Crow’s Feet Model
The 1:1 Relationship

One entity can be related to only one other
entity, and vice versa

Often means that entity components were not
defined properly

Could indicate that two entities actually
belong in the same table

Sometimes 1:1 relationships are appropriate
The 1:1 Relationship Between
PROFESSOR and DEPARTMENT
The Implemented 1:1 Relationship Between
PROFESSOR and DEPARTMENT
The 1:M Relationship
Between PAINTER and PAINTING
Example 1:M Relationship
The Implemented 1:M Relationship
Between PAINTER and PAINTING
PAINTER (PAINTER_NUM)
PAINTING (PAINTING_NUM) (PAINTER_NUM)
The 1:M Relationship
Between COURSE and CLASS
The Implemented 1:M Relationship
Between COURSE and CLASS
COURSE (CRS_CODE)
CLASS (CLASS_CODE) (CRS_CODE)
The M:N Relationship

Can be implemented by breaking it up to
produce a set of 1:M relationships

Can avoid problems inherent to M:N
relationship by creating a composite entity or
bridge entity
The ERD’s M:N Relationship
Between STUDENT and CLASS
Sample Student Enrollment Data
The M:N Relationship
Between STUDENT and CLASS
Composite Entity

The tables (Figure 3.25) create many
redundancies.

Composite Entity (Bridge Entity)
– Links the tables that were originally related in an M:N relationship
– Includes at least the primary keys of the tables that are to
be linked.
– In the ERD, represented by a diamond within a rectangle
Converting the M:N Relationship
into Two 1:M Relationships
STUDENT (STU_NUM)
ENROLL (CLASS_CODE+STU_NUM) (CLASS_CODE,STU_NUM)
CLASS (CLASS_CODE) (CRS_CODE)
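
The three-table structure can be sketched as table definitions, reading each line of the schema as table name, primary key, then foreign key(s). That reading, the column types, the ENROLL_GRADE column, and the sample rows are assumptions made for illustration. The sketch uses Python's built-in `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE student (stu_num INTEGER PRIMARY KEY);
    CREATE TABLE class (class_code INTEGER PRIMARY KEY, crs_code TEXT);
    -- Bridge (composite) entity: its primary key combines the two
    -- foreign keys, so each student/class pairing can appear only once.
    CREATE TABLE enroll (
        class_code   INTEGER REFERENCES class (class_code),
        stu_num      INTEGER REFERENCES student (stu_num),
        enroll_grade TEXT,
        PRIMARY KEY (class_code, stu_num)
    );
    INSERT INTO student VALUES (321452), (324257);
    INSERT INTO class VALUES (10014, 'ACCT-211'), (10018, 'CIS-220');
    INSERT INTO enroll VALUES
        (10014, 321452, 'C'), (10018, 321452, 'A'), (10014, 324257, 'B');
    """
)

classes_for_student = conn.execute(
    "SELECT COUNT(*) FROM enroll WHERE stu_num = 321452"
).fetchone()[0]
students_in_class = conn.execute(
    "SELECT COUNT(*) FROM enroll WHERE class_code = 10014"
).fetchone()[0]

try:
    conn.execute("INSERT INTO enroll VALUES (10014, 321452, 'A')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

One student can appear in many ENROLL rows and one class can appear in many ENROLL rows, which is how the two 1:M relationships together carry the original M:N relationship.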
Changing the M:N Relationship
to Two 1:M Relationships
The Expanded Entity Relationship Model
The Relational Schema for the
Ch03_TinyCollege Database
Data Redundancy Revisited

Data redundancy leads to data anomalies
– Such anomalies can destroy database
effectiveness

Foreign keys
– Control data redundancies by using common
attributes shared by tables
– Foreign keys can reduce redundancy

Sometimes, data redundancy is necessary
Changing the M:N Relationship
to Two 1:M Relationships (INVOICE, LINE, PRODUCT)
CUSTOMER (CUS_CODE)
INVOICE (INV_NUM) (CUS_CODE)
LINE (INV_NUM+LINE_NUM) (INV_NUM, PROD_CODE)
PRODUCT (PROD_CODE)
A Small Invoicing System
The Relational Schema
for the Invoicing System
Why Data Redundancies?

Sometimes, data redundancy is necessary.
– Product price (LINE_PRICE in the LINE table) is a
data redundancy.
– The redundancy is crucial to the system’s success.
– Because the product price may change, copying the price
from the PRODUCT table to the LINE table makes it
possible to maintain the historical accuracy of the
transactions.
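
A few lines of code show why the copied price matters. This is a minimal sketch with hypothetical names and values: prices are kept in whole cents, and LINE_UNITS is an assumed quantity column that does not appear in the slides.

```python
# Current prices in the PRODUCT table, in cents.
product_price = {"P1": 995}

# Rows of the LINE table.
lines = []


def add_line(inv_num, line_num, prod_code, units):
    """Record an invoice line, copying the price in effect at sale time."""
    lines.append(
        {
            "INV_NUM": inv_num,
            "LINE_NUM": line_num,
            "PROD_CODE": prod_code,
            "LINE_UNITS": units,
            "LINE_PRICE": product_price[prod_code],  # the deliberate redundancy
        }
    )


add_line(1001, 1, "P1", 2)
product_price["P1"] = 1250  # the product price changes after the sale

# Total computed from the stored LINE_PRICE: what the customer was charged.
historical_total = sum(
    l["LINE_UNITS"] * l["LINE_PRICE"] for l in lines if l["INV_NUM"] == 1001
)

# Total recomputed from the current PRODUCT price: no longer matches the sale.
recomputed_total = sum(
    l["LINE_UNITS"] * product_price[l["PROD_CODE"]]
    for l in lines
    if l["INV_NUM"] == 1001
)
```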
Indexes

An index is an orderly arrangement of keys and pointers
used to logically access rows in a table.

An index is composed of
– an index key (the index’s reference point) (PAINTER_NUM)
– a set of pointers.

Each pointer points to the location of the data
identified by the key.

Makes retrieval of data faster

Each index is associated with only one table
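
The key-and-pointer structure can be mimicked with a dictionary. In this sketch the index key is PAINTER_NUM and each pointer is simply a row position in a list standing in for the PAINTING table; the rows are hypothetical, and a real DBMS stores indexes in more elaborate structures.

```python
from collections import defaultdict

# PAINTING table rows: (PAINTING_NUM, PAINTING_TITLE, PAINTER_NUM).
paintings = [
    (1338, "Dawn Thunder", 123),
    (1339, "Vanilla Roses To Nowhere", 123),
    (1340, "Tired Flounders", 126),
    (1341, "Hasty Exit", 123),
    (1342, "Plastic Paradise", 126),
]

# Build the index: each key maps to a set of pointers (row positions).
painter_index = defaultdict(list)
for position, row in enumerate(paintings):
    painter_index[row[2]].append(position)

# Retrieval follows the pointers instead of scanning every row.
titles_by_123 = [paintings[p][1] for p in painter_index[123]]
```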
Indexes