The Entity-Relationship (ER) Model
Chapter 2
Jianping Fan
www.cs.uncc.edu/~jfan/itcs6160.html
Databases today are essential to every business!
Database Management Systems
Raghu Ramakrishnan
1
Database Management System (DBMS)
a. What is a database?
--- a database consists of many tables and
their inter-table relationships
--- each table holds “similar” tuples
(objects)
b. What else should a DBMS address?
--- storage and indexing of tables of tuples
--- formulation & optimization of queries
4 key issues: Description, Storage, Indexing & Search
Overview of Database Design
[Design flow: Requirement Analysis → ER Design → Relational Schema → DBMS]
• Requirement Analysis: analysis of data & users’ requirements
entity sets & relationship sets & attributes for indexing
• Conceptual Database Design:
ER model for description
• Logical Database Design:
ER model to relational database
Beyond ER Model
• Schema Refinement:
analyze the collections of relations
• Physical Database Design:
database indexing
• Security Design:
access control & data privacy
Overview of Database Design
[Design flow: Requirement Analysis → ER Design → Relational Schema → DBMS]
Requirement Analysis: analysis of data & users’ requirements
What do users search for in the database? Attributes for indexing.
What kind of data do we have? Entity sets & relationship sets.
a. What kind of attributes should be included for tuple (object) description?
b. Entity sets & relationship sets
c. Which attributes should be indexed? (frequent operations & access)
d. Long schema or short schema (balance of storage & access efficiency)?
e. Table integration or separation? (updating frequency)
f. Query optimization framework?
Overview of Database Design

Conceptual Database Design: (ER Model is used.)
[ER diagram: relationship set Works_In (since) relates entity sets Employees (ssn, name, lot) and Departments (did, dname, budget)]
Overview of Database Design

Conceptual Database Design: (ER Model is used.)
a. Should a concept be modeled as an entity or an attribute?
b. Should a concept be modeled as an entity or a relationship?
c. What are relationship sets and their participating entity sets?
d. Should we use binary or ternary relationships?
e. Should we use aggregation?
Overview of Database Design
[Design flow: Requirement Analysis → ER Design → Relational Schema → DBMS]
Logical Database Design: ER model to relational database
a. How to create physical database tables?
b. How to transform E-R models (conceptual database design)
into physical database tables?
c. How to generate physical database indexing for fast query?
Overview of Database Design
[Design flow: Requirement Analysis → ER Design → Relational Schema → DBMS]
Schema Refinement: analyze the collections of relations
a. Which tables should be separated into multiple smaller tables?
b. Which tables can be integrated as one single larger table?
c. Which attributes should be inserted into existing schema?
d. Which attributes can be deleted from existing schema?
Overview of Database Design
[Design flow: Requirement Analysis → ER Design → Relational Schema → DBMS]
Physical Database Design: database indexing
a. Which attributes should be selected for indexing?
Most frequently used attributes for query formulation
b. What kind of indexing structures should be selected?
Range search or equality search?
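As a rough illustration of question b, the sketch below contrasts a hash-style index (suited to equality search) with a sorted index (suited to range search). The sample rows reuse student data from later in this deck; the names `hash_index`, `sorted_index`, `equality_search` and `range_search` are made up for illustration, and a real DBMS would use on-disk structures such as hash files or B+ trees rather than in-memory Python objects.

```python
import bisect

# Sample rows (ssn, name, age); illustrative data only.
rows = [
    ("999-38-4431", "John Smith", 21),
    ("999-28-3341", "Miki Jordan", 28),
    ("331-43-4567", "David Kim", 25),
    ("535-34-5678", "Paul Lee", 26),
]

# Hash-style index on ssn: one dictionary probe answers an equality search.
hash_index = {row[0]: row for row in rows}

def equality_search(ssn):
    """Return the row with this ssn, or None if there is none."""
    return hash_index.get(ssn)

# Sorted index on age: binary search locates both ends of a range.
sorted_index = sorted(rows, key=lambda row: row[2])
sorted_ages = [row[2] for row in sorted_index]

def range_search(low, high):
    """Return rows with low <= age <= high, in age order."""
    start = bisect.bisect_left(sorted_ages, low)
    end = bisect.bisect_right(sorted_ages, high)
    return sorted_index[start:end]

print(equality_search("331-43-4567"))
print([row[1] for row in range_search(25, 27)])
```

The hash index cannot answer the range query without scanning every entry, which is why the kind of search drives the choice of structure.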
Overview of Database Design
[Design flow: Requirement Analysis → ER Design → Relational Schema → DBMS]
Security Design: access control & data privacy
Who can access what in database under which conditions?
What can be shown to a given user?
Conceptual Design of Database

Conceptual Database Design: (ER Model is used.)
– What are the entities and relationships in the enterprise?
– What information about these entities and relationships
should we store in the database (i.e., attributes)?
– What are the integrity constraints or business rules that
hold?
– A database “schema” in the ER Model can be represented
pictorially (ER diagrams).
– An ER diagram can be mapped into a relational schema.
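A minimal sketch of that last point, using Python's built-in `sqlite3` module: each entity set becomes a table, and the relationship set Works_In becomes a table holding the keys of both participants plus its descriptive attribute. The column types and the sample rows are assumptions for illustration, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript("""
CREATE TABLE Employees (
    ssn  TEXT PRIMARY KEY,
    name TEXT,
    lot  INTEGER
);
CREATE TABLE Departments (
    did    INTEGER PRIMARY KEY,
    dname  TEXT,
    budget REAL
);
-- The relationship set becomes its own table: keys of both
-- participating entity sets plus the descriptive attribute.
CREATE TABLE Works_In (
    ssn   TEXT REFERENCES Employees(ssn),
    did   INTEGER REFERENCES Departments(did),
    since TEXT,
    PRIMARY KEY (ssn, did)
);
""")

# Made-up sample rows.
conn.execute("INSERT INTO Employees VALUES ('123-22-3666', 'Fan', 48)")
conn.execute("INSERT INTO Departments VALUES (51, 'Computer Science', 100000)")
conn.execute("INSERT INTO Works_In VALUES ('123-22-3666', 51, '2001-01-01')")

# A relationship that names a non-existent employee is rejected.
try:
    conn.execute("INSERT INTO Works_In VALUES ('000-00-0000', 51, '2002-01-01')")
    dangling_allowed = True
except sqlite3.IntegrityError:
    dangling_allowed = False

print(dangling_allowed)
```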
University Database

• University database contains employees and
departments which are described by certain
attributes
• University database also contains relationships
between employees and departments which are also
described by certain attributes
Example: University Database
Tables:
Students: SID, sname, year, GPA
Departments: DID, dname, office
Faculties: ssn, fname, f-office, phone, salary
Courses: CID, cname, time, room, credit-hour
Schema & Attributes, Domains
Inter-Table Relationships:
Students enroll in Courses, Faculties teach Courses
Faculties work for Departments
Query over multiple tables is allowed!
ER Model Basics
[ER diagram: entity set Employees with attributes ssn, name, lot]
• Entity: Real-world object distinguishable from other
objects. An entity is described (in DB) using a
set of attributes.
ID           name        year  GPA
999-80-3267  John Smith  2003  3.5
999-32-0847  James Gary  2006  3.0
ER Model Basics
[ER diagram: entity set Employees with attributes ssn, name, lot]
• Entity Set: A collection of similar entities. E.g., all
employees.
– All entities in an entity set have the same set of attributes.
– Each entity set has a key.
– Each attribute has a domain.
What’s the key? How many keys can one object have?
Entity, Entity Set, Attribute, Schema & Domain
ID or SSN    Name         Year  Age  GPA
999-38-4431  John Smith   1999  21   3.68
999-28-3341  Miki Jordan  2000  28   3.45
331-43-4567  David Kim    2000  25   4.00
535-34-5678  Paul Lee     1998  26   3.89
ER Model Basics (Contd.)
[ER diagram: relationship set Works_In (since) relates entity sets Employees (ssn, name, lot) and Departments (did, dname, budget)]
Relationship: Association among two or more entities.
• Examples:
Fan works in the Computer Science Department.
Smith works in the Electronic Engineering Department.
ER Model Basics (Contd.)
[ER diagram: relationship set Works_In (since) relates entity sets Employees (ssn, name, lot) and Departments (did, dname, budget)]

Relationship Set: Collection of similar relationships.
– An n-ary relationship set R relates n entity sets E1 ... En;
each relationship in R involves entities e1 ∈ E1, ..., en ∈ En
• Same entity set could participate in different
relationship sets, or in different “roles” in same set.
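One way to read the definition: an n-ary relationship set is a subset of the Cartesian product of its entity sets. The sketch below checks that property for a binary Works_In; the entity names come from the examples on the previous slide, and the helper `is_relationship_set` is a hypothetical name used only here.

```python
from itertools import product

employees = {"Fan", "Smith"}
departments = {"Computer Science", "Electronic Engineering"}

# Each relationship pairs one employee with one department.
works_in = {
    ("Fan", "Computer Science"),
    ("Smith", "Electronic Engineering"),
}

def is_relationship_set(relationships, *entity_sets):
    """True if every tuple picks one entity from each entity set, in order."""
    return set(relationships) <= set(product(*entity_sets))

print(is_relationship_set(works_in, employees, departments))
```

A tuple that names an entity outside its entity set, such as an unknown employee, makes the check fail.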
Entity vs. Entity Set
Student --- Students
John Smith
(999-21-3415, jsmith@, John Smith, 18, 3.5)
Students in ITCS3160
999-21-3415, jsmith@, John Smith, 18, 3.5
999-31-2356, jzhang@, Jie Zhang, 20, 3.0
999-32-1234, ajain@, Anil Jain, 21, 3.8
Entity Keys
Primary key
Candidate key
999-21-3415, jsmith@, John Smith, 18, 3.5
999-31-2356, jzhang@, Jie Zhang, 20, 3.0
999-32-1234, ajain@, Anil Jain, 21, 3.8
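A small sketch of how candidate keys can be explored on sample data: it lists the minimal attribute sets whose values are unique across the three rows above. The attribute names (`id`, `login`, `name`, `age`, `gpa`) are assumed from Example 1 later in the deck, and the function names are illustrative.

```python
from itertools import combinations

attributes = ("id", "login", "name", "age", "gpa")
students = [
    ("999-21-3415", "jsmith@", "John Smith", 18, 3.5),
    ("999-31-2356", "jzhang@", "Jie Zhang", 20, 3.0),
    ("999-32-1234", "ajain@", "Anil Jain", 21, 3.8),
]

def is_unique(columns):
    """True if no two rows agree on all of the given column positions."""
    seen = {tuple(row[c] for c in columns) for row in students}
    return len(seen) == len(students)

def minimal_unique_sets():
    """Attribute sets that are unique in this instance and have no unique subset."""
    found = []
    for size in range(1, len(attributes) + 1):
        for columns in combinations(range(len(attributes)), size):
            if any(set(key) <= set(columns) for key in found):
                continue  # a smaller unique set is contained in this one
            if is_unique(columns):
                found.append(columns)
    return [tuple(attributes[c] for c in key) for key in found]

print(minimal_unique_sets())
```

On this tiny instance every single attribute happens to be unique, so the data alone cannot tell us which ones are keys. A key is a statement about all legal instances, so the designer still has to decide which candidates hold in general and pick one as the primary key.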
Relationship vs. Relationship Set
John Smith
(999-21-3415, jsmith@, John Smith, 18, 3.5)
Relationship
ITCS3160
(3160, ITCS, DBMS, J. Fan, 3, Kenn. 236)
Relationship vs. Relationship Set
999-21-3415, jsmith@, John Smith, 18, 3.5
999-31-2356, jzhang@, Jie Zhang, 20, 3.0
999-32-1234, ajain@, Anil Jain, 21, 3.8
Relationship set
3160, ITCS, DBMS, J. Fan, 3, Kenn. 236
6157, ITCS, Visual DB, J. Fan, 3, Kenn. 236
Potential Relationship Types
1-to-1
1-to-Many
Many-to-1
Many-to-Many
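The four types can be told apart by counting how many partners each entity has. The sketch below classifies one instance of a binary relationship, read from left to right; the function name and the sample pairs are made up. An instance only shows the pattern that happens to occur, whereas the relationship type in a design is a constraint on every legal instance.

```python
from collections import Counter

def relationship_type(pairs):
    """Classify a binary relationship instance given as (left, right) pairs."""
    pairs = set(pairs)
    left_counts = Counter(left for left, _ in pairs)
    right_counts = Counter(right for _, right in pairs)
    # Some left entity is related to several right entities.
    left_repeats = any(n > 1 for n in left_counts.values())
    # Some right entity is related to several left entities.
    right_repeats = any(n > 1 for n in right_counts.values())
    if left_repeats and right_repeats:
        return "Many-to-Many"
    if left_repeats:
        return "1-to-Many"
    if right_repeats:
        return "Many-to-1"
    return "1-to-1"

enrolled = {("s1", "c1"), ("s1", "c2"), ("s2", "c1")}
works_in_one_dept = {("e1", "d1"), ("e2", "d1")}
print(relationship_type(enrolled))
print(relationship_type(works_in_one_dept))
```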
Example 1

Build an ER Diagram for the following information:
– Students

Have an Id, Name, Login, Age, Gpa
– Courses

Have an Id, Name, Credit Hours
– Students enroll in courses

Receive a grade
Example 1 Answer
[ER diagram: relationship set Enrolled_In (Grade) relates entity sets Students (Id, Name, Login, Age, GPA) and Courses (Id, Name, Credit)]
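For readers who want to see where this answer leads, here is a sketch that stores the same design in SQLite and runs a query across all three tables. Column types, the lower-case column names, the course row and the grades are assumptions for illustration; the student rows are borrowed from the earlier Entity vs. Entity Set slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (id TEXT PRIMARY KEY, name TEXT, login TEXT, age INTEGER, gpa REAL);
CREATE TABLE Courses  (id TEXT PRIMARY KEY, name TEXT, credit INTEGER);
-- Enrolled_In carries the keys of both entity sets plus its own attribute.
CREATE TABLE Enrolled_In (
    student_id TEXT REFERENCES Students(id),
    course_id  TEXT REFERENCES Courses(id),
    grade      TEXT,
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO Students VALUES ('999-21-3415', 'John Smith', 'jsmith@', 18, 3.5);
INSERT INTO Students VALUES ('999-31-2356', 'Jie Zhang', 'jzhang@', 20, 3.0);
INSERT INTO Courses VALUES ('ITCS3160', 'DBMS', 3);
INSERT INTO Enrolled_In VALUES ('999-21-3415', 'ITCS3160', 'A');
INSERT INTO Enrolled_In VALUES ('999-31-2356', 'ITCS3160', 'B');
""")

# Join the two entity tables through the relationship table.
rows = conn.execute("""
    SELECT s.name, c.name, e.grade
    FROM Students s
    JOIN Enrolled_In e ON e.student_id = s.id
    JOIN Courses c     ON c.id = e.course_id
    ORDER BY s.name
""").fetchall()

print(rows)
```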
Example 2

Build an ER Diagram for the following information:
– Patients

Name, Address, Phone #, Age
– Drugs

Name, Manufacturer , Expiration Date
– Patients are prescribed drugs

Dosage, # Days
Example 2 Answer
[ER diagram: relationship set Prescribed (Dosage, #days) relates entity sets Patients (Name, Addr, Phone, Age) and Drug (Name, Manuf, Exp)]
Example 3

Build an ER Diagram for the following information:
– Students

Have an Id, Name, Login, Age, Gpa
– Courses

Have an Id, Name, Credit Hours
– Students enroll in courses

Receive a grade
– Faculties

Name, Address, Phone #, Age
– Faculties teach courses

semester
Work on your paper first!
Example 4

Build an ER Diagram for the following information:
– Customers

Home address, Name, phone number, income
– Products

Have an Id, Name, price, company
– Customers buy Products

day
– Employees

Name, Address, Phone #, Age, title
– Employees manage Products

season
Work on your paper first!
Example 5: University Database
Tables:
Students: SID, sname, year, GPA
Departments: DID, dname, office
Faculties: ssn, fname, f-office, phone, salary
Courses: CID, cname, time, room, credit-hour
Schema & Attributes, Domains
Inter-Table Relationships:
Students enroll in Courses, Faculties teach Courses
Faculties work for Departments
Query over multiple tables is allowed!
Example 5

Build an ER Diagram for the following information:
– Students
• Have an Id, Name, Login, Age, Gpa
– Courses
• Have an Id, Name, Credit Hours
– Students enroll in courses
• Receive a grade
– Faculties
• Name, Address, Phone #, Age
– Faculties teach courses
• semester
– Departments
• Dname, Office, Phone #, budget
– Faculties work in departments
• duration

Work on your paper first!
Example 6

Build an ER Diagram for the following information:
– Walmart Stores

Store Id, Address, Phone #
– Products

Product Id, Description, Price
– Manufacturers

Name, Address, Phone #
– Walmart Stores carry products

Amount in store
– Manufacturers make products

Amount in factory/warehouses
Work on your paper first!
Entity vs. Attribute: Ternary Relationship
Should address be an attribute of Employees or an
entity (connected to Employees by a relationship)?
• Depends upon the use we want to make of address
information, and the semantics of the data:
– If we have several addresses per employee, address must
be an entity (since attributes cannot be set-valued).
– If the structure (city, street, etc.) is important, e.g., we
want to retrieve employees in a given city, address must
be modeled as an entity (since attribute values are
atomic).
• Works_In2 does not allow an employee to work in a department for
two or more periods.
• Similar to the problem of wanting to record several addresses for an
employee: we want to record several values of the descriptive
attributes for each instance of this relationship.
[ER diagram: Works_In2 relates Employees (ssn, name, lot) and Departments (did, dname, budget), with descriptive attributes from and to]
[ER diagram: Works_In3 relates Employees (ssn, name, lot), Departments (did, dname, budget) and Duration (from, to)]
Same employee works in the same department in different periods
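The difference shows up in the keys once the two designs are stored as tables. In the sketch below (SQLite through Python's `sqlite3`; the column names `from_date` and `to_date`, the types, and the sample periods are assumptions), a Works_In2 relationship is identified by the employee and department alone, so a second period for the same pair is rejected, while Works_In3 also includes the Duration entity in its key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Works_In2: from/to are descriptive attributes, so (ssn, did) is the key.
CREATE TABLE Works_In2 (
    ssn TEXT, did INTEGER, from_date TEXT, to_date TEXT,
    PRIMARY KEY (ssn, did)
);
-- Works_In3: Duration is a participating entity, so its key joins the key.
CREATE TABLE Works_In3 (
    ssn TEXT, did INTEGER, from_date TEXT, to_date TEXT,
    PRIMARY KEY (ssn, did, from_date, to_date)
);
""")

# The same employee works in the same department in two different periods.
periods = [
    ("123-22-3666", 51, "2001-01-01", "2002-12-31"),
    ("123-22-3666", 51, "2005-01-01", "2006-12-31"),
]

def count_accepted(table):
    """Insert both periods into the table and count how many are accepted."""
    accepted = 0
    for period in periods:
        try:
            conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?, ?)", period)
            accepted += 1
        except sqlite3.IntegrityError:
            pass  # duplicate key: this design cannot hold the row
    return accepted

accepted_in2 = count_accepted("Works_In2")
accepted_in3 = count_accepted("Works_In3")
print(accepted_in2, accepted_in3)
```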

Ternary Relationship
[ER diagram: Works_In3 relates Employees (ssn, name, lot), Departments (did, dname, budget) and Duration (from, to)]
Why?
[ER diagram: Works_In (since) relates Employees (ssn, name, lot) and Departments (did, dname, budget)]
Key Constraints

• In contrast, each dept has at most one manager, according to the
key constraint on Manages.
At most one!!!
[ER diagram: Manages (since) relates Employees (ssn, name, lot) and Departments (did, dname, budget); annotated “Key Constraint (time constraint)”]
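A minimal sketch of checking this key constraint on an instance of Manages, represented as (ssn, did, since) tuples. The function name and the sample data are hypothetical.

```python
from collections import defaultdict

def departments_with_many_managers(manages):
    """Return the dids that appear with more than one manager."""
    managers_of = defaultdict(set)
    for ssn, did, _since in manages:
        managers_of[did].add(ssn)
    return sorted(did for did, ssns in managers_of.items() if len(ssns) > 1)

# Employee 111 manages two departments, which the constraint allows.
valid = [("111", 51, "2001"), ("111", 56, "2003"), ("222", 60, "2002")]
# Department 51 gets a second manager, which it does not allow.
invalid = valid + [("333", 51, "2004")]

print(departments_with_many_managers(valid))
print(departments_with_many_managers(invalid))
```

The constraint restricts Departments only: one employee may still manage several departments.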
Key Constraints
• Consider Works_In: An employee can work in many departments;
a dept can have many employees.
[ER diagram: Works_In (since) relates Employees (ssn, name, lot) and Departments (did, dname, budget)]
Key Constraints
• Consider Works_In: An employee can work in at most one department;
a dept can have many employees.
[ER diagram: Works_In (since) relates Employees (ssn, name, lot) and Departments (did, dname, budget)]
Key Constraints
[ER diagram: Manages2 (since, dbudget) relates Employees (ssn, name, lot) and Departments (did, dname, budget)]
Key Constraints & Ternary Relationships


• Previous example illustrated a case when two binary
relationships were better than one ternary
relationship.
• An example in the other direction: a ternary relation
Contracts relates entity sets Parts, Departments and
Suppliers, and has descriptive attribute qty. No
combination of binary relationships is an adequate
substitute:
– S “can-supply” P, D “needs” P, and D “deals-with” S does not
imply that D has agreed to buy P from S.
– How do we record qty?
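The first sub-point can be checked on a small made-up instance. The sketch below projects a ternary Contracts set onto the three binary relationships and then recombines them; the recombination contains a contract that was never agreed. The identifiers `s1`, `p1`, `d1` and so on are placeholders, and qty is left out because none of the binary relationships has a place for it.

```python
# Contracts as (supplier, part, department) triples; made-up data.
contracts = {
    ("s1", "p1", "d1"),
    ("s2", "p1", "d2"),
    ("s1", "p2", "d2"),
}

# The three binary relationships obtained by projecting Contracts.
can_supply = {(s, p) for s, p, d in contracts}
needs = {(d, p) for s, p, d in contracts}
deals_with = {(d, s) for s, p, d in contracts}

# Every triple that is consistent with all three binary relationships.
implied = {
    (s, p, d)
    for (s, p) in can_supply
    for (d, p2) in needs
    if p2 == p and (d, s) in deals_with
}

# Triples that the binary design suggests but that are not real contracts.
spurious = implied - contracts
print(spurious)
```

Here s1 can supply p1, d2 needs p1, and d2 deals with s1, yet d2 never agreed to buy p1 from s1.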
Key Constraints & Ternary Relationship
• First ER diagram OK if a manager gets a separate discretionary
budget for each dept.
• What if a manager gets a discretionary budget that covers all managed
depts?
– Redundancy of dbudget, which is stored for each dept managed by the
manager.
– Misleading: suggests dbudget tied to managed dept.
[ER diagram: Manages2 (since, dbudget) relates Employees (ssn, name, lot) and Departments (did, dname, budget)]
[ER diagram: Manages3 (since) relates Employees (ssn, name, lot), Departments (did, dname, budget) and Mgr_Appts (apptnum, dbudget)]
Key Constraints & Ternary Relationships
[ER diagram: Works_In3 relates Employees (ssn, name, lot), Departments (did, dname, budget) and Duration (from, to)]
[ER diagram: Works_In3 relates Employees (ssn, name, lot) and Departments (did, dname, budget), with descriptive attributes from and to]
Participation Constraints

Does every department have a manager?
– If so, this is a participation constraint: the participation of
Departments in Manages is said to be total (vs. partial).
• Every did value in Departments table must appear in a
row of the Manages table (with a non-null ssn value!)
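That condition can be written as a simple check. In this sketch Manages is a list of (ssn, did) pairs and the function reports departments that break total participation; the function name and sample values are hypothetical.

```python
def departments_without_manager(departments, manages):
    """Return dids that have no Manages row with a non-null ssn."""
    managed = {did for ssn, did in manages if ssn is not None}
    return sorted(set(departments) - managed)

departments = [51, 56, 60]
manages = [("111", 51), ("222", 56), (None, 60)]

# Department 60 has a row, but its ssn is null, so it still violates
# total participation.
print(departments_without_manager(departments, manages))
```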
[ER diagram: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by Manages (since) and Works_In (since); participation labels: Partial, Total, Total w/key constraint, Total]
What are the policies behind this ER model?
[ER diagram: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by Manages (since) and Works_In (since); participation labels: Total, Total, Total w/key constraint, Total]
[ER diagram: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by Manages (since) and Works_In (since)]
Any Difference?
[ER diagram: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by Manages (since) and Works_In (since); participation labels: Partial, Total, Total w/key constraint, Total]
[ER diagram: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by Manages (since) and Works_In (since)]
Any Difference?
[ER diagram: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by Manages (since) and Works_In (since)]
Weak Entities vs. Owner Entities

A weak entity can be identified uniquely only by
considering the primary key of another (owner) entity.
– Owner entity set and weak entity set must participate in a
one-to-many relationship set (1 owner, many weak entities).
– Weak entity set must have total participation in this
identifying relationship set.
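One common way to carry these rules into tables, sketched with Python's `sqlite3`: the weak entity set and its identifying relationship share one table whose key combines the weak entity's partial key with the owner's key, and deleting the owner removes its weak entities. The table name `Dep_Policy`, the column types and the sample rows are assumptions; the attribute names pname, age and cost follow the Dependents and Policy example used in this deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for ON DELETE CASCADE in SQLite
conn.executescript("""
CREATE TABLE Employees (ssn TEXT PRIMARY KEY, name TEXT, lot INTEGER);
-- Weak entity set and identifying relationship in one table.
CREATE TABLE Dep_Policy (
    pname TEXT,
    age   INTEGER,
    cost  REAL,
    ssn   TEXT NOT NULL REFERENCES Employees(ssn) ON DELETE CASCADE,
    PRIMARY KEY (pname, ssn)   -- partial key plus the owner's key
);
INSERT INTO Employees VALUES ('111', 'Ann', 10);
INSERT INTO Employees VALUES ('222', 'Bob', 11);
-- Two dependents named Joe are distinguishable only through their owners.
INSERT INTO Dep_Policy VALUES ('Joe', 5, 100.0, '111');
INSERT INTO Dep_Policy VALUES ('Joe', 7, 120.0, '222');
""")

before = conn.execute("SELECT COUNT(*) FROM Dep_Policy").fetchone()[0]
conn.execute("DELETE FROM Employees WHERE ssn = '111'")
after = conn.execute("SELECT COUNT(*) FROM Dep_Policy").fetchone()[0]
print(before, after)
```

`NOT NULL` on ssn reflects total participation: a dependent cannot exist without an owner.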
[ER diagram: owner entity set Employees (ssn, name, lot), identifying relationship Policy (cost), and weak entity set Dependents (pname, age); pname is marked as the primary key for the weak entity]
Weak Entity vs. Owner Entity
• If each policy is owned by just 1 employee:
– Key constraint on Policies would mean policy can only cover 1
dependent!
Every policy must be owned by some employee
Dependents is a weak entity
[ER diagram, bad design: Covers relates Employees (ssn, name, lot), Policies (policyid, cost) and Dependents (pname, age)]
[ER diagram, better design: Purchaser relates Employees (ssn, name, lot) and Policies (policyid, cost); Beneficiary relates Dependents (pname, age) and Policies]
ISA (“is a”) Hierarchies
• As in C++, or other PLs, attributes are inherited.
• If we declare A ISA B, every A entity is also considered to be a B
entity.
[ER diagram: Employees (ssn, name, lot) with ISA subclasses Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid)]
• Overlap constraints: Can Joe be an Hourly_Emps as well as a
Contract_Emps entity? (Allowed/disallowed)
• Covering constraints: Does every Employees entity also have to
be an Hourly_Emps or a Contract_Emps entity? (Yes/no)
• Reasons for using ISA:
– To add descriptive attributes specific to a subclass.
– To identify entities that participate in a relationship.
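The programming-language analogy can be made concrete with Python dataclasses. This is only a sketch: the class names mirror the entity sets, and the field types and sample values are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Employee:
    ssn: str
    name: str
    lot: int

@dataclass
class HourlyEmp(Employee):       # Hourly_Emps ISA Employees
    hourly_wages: float
    hours_worked: float

@dataclass
class ContractEmp(Employee):     # Contract_Emps ISA Employees
    contractid: str

joe = HourlyEmp("111", "Joe", 12, hourly_wages=20.0, hours_worked=35.0)

# Every Hourly_Emps entity is also an Employees entity and inherits its attributes.
print(isinstance(joe, Employee))
print(joe.name)
```

The analogy has a limit: a Python object belongs to exactly one of these classes, so this sketch behaves like the "overlap disallowed" case, whereas in an ER design overlap is a choice the designer makes.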
Aggregation
• Used when we have to model a relationship involving (entity
sets and) a relationship set.
– Aggregation allows us to treat a relationship set as an entity set
for purposes of participation in (other) relationships.
– Monitors mapped to table like any other relationship set.
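A sketch of the last point in SQLite (via Python's `sqlite3`): Monitors is stored like any other relationship set, except that one of its participants is the aggregated Sponsors relationship, so it refers to the Sponsors key (did, pid). Column types and sample rows are assumptions, and the Projects, Departments and Employees tables are omitted to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Sponsors: a department sponsors a project.
CREATE TABLE Sponsors (
    did INTEGER, pid INTEGER,
    PRIMARY KEY (did, pid)
);
-- Monitors relates an employee to a sponsorship, not to a project
-- or a department on its own.
CREATE TABLE Monitors (
    ssn TEXT, did INTEGER, pid INTEGER, until TEXT,
    PRIMARY KEY (ssn, did, pid),
    FOREIGN KEY (did, pid) REFERENCES Sponsors(did, pid)
);
INSERT INTO Sponsors VALUES (51, 7);
INSERT INTO Monitors VALUES ('111', 51, 7, '2025-12-31');
""")

# Monitoring a sponsorship that does not exist is rejected.
try:
    conn.execute("INSERT INTO Monitors VALUES ('111', 51, 8, '2025-12-31')")
    unsponsored_allowed = True
except sqlite3.IntegrityError:
    unsponsored_allowed = False

print(unsponsored_allowed)
```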
[ER diagram: Sponsors relates Projects (pid, started_on, pbudget) and Departments (did, dname, budget); the Sponsors relationship is boxed as an aggregation, and Monitors (until) relates Employees (ssn, name, lot) to that aggregation]
Conceptual Design Using the ER Model

Design choices:
– Should a concept be modeled as an entity or an attribute?
– Should a concept be modeled as an entity or a relationship?
– Identifying relationships: Binary or Ternary? Aggregation?
Entity vs. Attribute
Should address be an attribute of Employees or an
entity (connected to Employees by a relationship)?
• Depends upon the use we want to make of address
information, and the semantics of the data:
– If we have several addresses per employee, address must
be an entity (since attributes cannot be set-valued).
– If the structure (city, street, etc.) is important, e.g., we
want to retrieve employees in a given city, address must
be modeled as an entity (since attribute values are
atomic).
Entity vs. Attribute


• Works_In2 does not allow an employee to work in a department for
two or more periods.
• Similar to the problem of wanting to record several addresses for an
employee: we want to record several values of the descriptive
attributes for each instance of this relationship.
[ER diagram: Works_In2 relates Employees (ssn, name, lot) and Departments (did, dname, budget), with descriptive attributes from and to]
[ER diagram: Works_In3 relates Employees (ssn, name, lot), Departments (did, dname, budget) and Duration (from, to)]
Entity vs. Relationship

• First ER diagram OK if a manager gets a separate discretionary
budget for each dept.
• What if a manager gets a discretionary budget that covers all managed
depts?
– Redundancy of dbudget, which is stored for each dept managed by the
manager.
– Misleading: suggests dbudget tied to managed dept.
[ER diagram: Manages2 (since, dbudget) relates Employees (ssn, name, lot) and Departments (did, dname, budget)]
[ER diagram: Manages3 (since) relates Employees (ssn, name, lot), Departments (did, dname, budget) and Mgr_Appts (apptnum, dbudget)]
Binary vs. Ternary Relationships
[ER diagram: Works_In3 relates Employees (ssn, name, lot), Departments (did, dname, budget) and Duration (from, to)]
[ER diagram: Works_In3 relates Employees (ssn, name, lot) and Departments (did, dname, budget), with descriptive attributes from and to]
Binary vs. Ternary Relationships


• Previous example illustrated a case when two binary
relationships were better than one ternary
relationship.
• An example in the other direction: a ternary relation
Contracts relates entity sets Parts, Departments and
Suppliers, and has descriptive attribute qty. No
combination of binary relationships is an adequate
substitute:
– S “can-supply” P, D “needs” P, and D “deals-with” S does not
imply that D has agreed to buy P from S.
– How do we record qty?
Summary of Conceptual Design

• Conceptual design follows requirements analysis
– Yields a high-level description of data to be stored
• ER model popular for conceptual design
– Constructs are expressive, close to the way people think
about their applications.
• Basic constructs: entities, relationships, and attributes
(of entities and relationships).
• Some additional constructs: weak entities, ISA
hierarchies, and aggregation.
• Note: There are many variations on ER model.

Summary of ER (Contd.)

Several kinds of integrity constraints can be
expressed in the ER model: key constraints,
participation constraints, and overlap/covering constraints
for ISA hierarchies. Some foreign key constraints are
also implicit in the definition of a relationship set.
– Some constraints (notably, functional dependencies) cannot
be expressed in the ER model.
– Constraints play an important role in determining the best
database design for an enterprise.
Summary of ER (Contd.)

ER design is subjective. There are often many ways to
model a given scenario! Analyzing alternatives can be
tricky, especially for a large enterprise. Common
choices include:
– Entity vs. attribute, entity vs. relationship, binary or n-ary
relationship, whether or not to use ISA hierarchies, and
whether or not to use aggregation.

Ensuring good database design: resulting relational
schema should be analyzed and refined further. FD
information and normalization techniques are
especially useful.
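As a pointer to what "FD information" means in practice, the sketch below tests whether a functional dependency holds in a given set of rows: whenever two rows agree on the left-hand attributes, they must agree on the right-hand ones. The function name and the sample rows are made up, and passing the check on sample data does not prove the dependency holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """True if rows that agree on the lhs attributes also agree on the rhs attributes."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for each lhs value and compare later rows.
        if seen.setdefault(left, right) != right:
            return False
    return True

# A wide table that repeats department data for every employee.
works = [
    {"ssn": "111", "did": 51, "dname": "CS", "budget": 100},
    {"ssn": "222", "did": 51, "dname": "CS", "budget": 100},
    {"ssn": "333", "did": 56, "dname": "EE", "budget": 80},
]

print(fd_holds(works, ["did"], ["dname", "budget"]))
print(fd_holds(works, ["dname"], ["ssn"]))
```

The first dependency (did determines dname and budget) is the kind of redundancy that normalization removes by moving department data into its own table.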
Erwin ER Modeling Tool


http://www.cai.com/products/alm/erwin.htm
Demo ERwin and its capabilities
– Open sample movies model

Erwin_3.5.2/models
– Build Example 2 using ERwin
[ER diagram: Works_In2 relates Employees (ssn, name, lot) and Departments (did, dname, budget), with descriptive attributes from and to]
[ER diagram: Works_In3 relates Employees (ssn, name, lot), Departments (did, dname, budget) and Duration (from, to)]
In-Class Working Question: University Database












• Professors have an SSN, a name, an age, a rank, and a research area.
• Projects have a project number, a sponsor name, a starting date, an ending
date, and a budget.
• Graduate students have an SSN, a name, an age, and a degree program.
• Each project is managed by one professor as PI.
• Each project is worked on by one or more professors.
• Professors can manage and work on multiple projects.
• Each project is worked on by one or more graduate students.
• When graduate students work on a project, a professor must supervise
their work on the project. Graduate students can work on multiple
projects, in which case they will have a supervisor for each one.
• Departments have a department number, name, and main office.
• Departments have a professor as the chairman.
• Professors work in one or more departments, each with a time percentage.
• Graduate students have one major department for pursuing their degree.
Work on your paper first!
Homework Assignment

Problem 2.4 at the end of Chapter 2
– Page 53
Due Next Thursday: Hard copy to instructor
• Format for homework: name, ID.

Homework Assignment
[ER diagram: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by Manages (since) and Works_In (since); Dependents (pname, age) connected to Employees through Policy (cost); participation labels: Partial, Total, Total w/key constraint, partial, Total, Key/total]