CAS CS 460
Entity-Relationship Model
Review
• Course Agenda
  - Application/User-Oriented
    - Database Modeling: ER Model, Relational Model
    - Data Retrieval: Relational Algebra, Relational Calculus, SQL, ...
    - Design Considerations: Normal Forms, ...
  - Database System Internals
    - Storage and Indexing
    - Query Processing, Optimization
    - Transaction Processing
    - Recovery Management
  - Advanced Topics
    - Data Mining, XML DB in Oracle
Databases Model the Real World
• A "data model" translates real-world things into structures computers can store
• Many models:
  - Relational, E-R, O-O, Network, Hierarchical, etc.
• Relational (more next time)
  - Rows & Columns
  - Keys & Foreign Keys to link Relations

Enrolled
sid    cid          grade
53666  Carnatic101  C
53666  Reggae203    B
53650  Topology112  A
53666  History105   B

Students
sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2
53650  Smith  smith@math  19   3.8
Problems with Relational Model
-- one row per student
CREATE TABLE Students
  (sid   CHAR(20),
   name  CHAR(20),
   login CHAR(10),
   age   INTEGER,
   gpa   FLOAT);
-- one row per (student, course) enrollment
CREATE TABLE Enrolled
  (sid   CHAR(20),
   cid   CHAR(20),
   grade CHAR(2));
With complicated schemas, it may be hard for a person to understand the structure from the data definition.

Enrolled
cid          grade  sid
Carnatic101  C      53666
Reggae203    B      53666
Topology112  A      53650
History105   B      53666

Students
sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2
53650  Smith  smith@math  19   3.8
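The Students and Enrolled tables can be linked by keys and foreign keys. The following is a minimal sketch using Python's built-in sqlite3 module. The choice of sid as primary key of Students and of (sid, cid) as primary key of Enrolled is an assumption made for illustration; the slide's CREATE TABLE statements declare no keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Students (
    sid   CHAR(20) PRIMARY KEY,               -- assumed key
    name  CHAR(20),
    login CHAR(10),
    age   INTEGER,
    gpa   FLOAT
);
CREATE TABLE Enrolled (
    sid   CHAR(20) REFERENCES Students(sid),  -- foreign key linking the relations
    cid   CHAR(20),
    grade CHAR(2),
    PRIMARY KEY (sid, cid)                    -- assumed key
);
""")
conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?, ?)", [
    ("53666", "Jones", "jones@cs", 18, 3.4),
    ("53688", "Smith", "smith@eecs", 18, 3.2),
    ("53650", "Smith", "smith@math", 19, 3.8),
])
conn.executemany("INSERT INTO Enrolled VALUES (?, ?, ?)", [
    ("53666", "Carnatic101", "C"),
    ("53666", "Reggae203", "B"),
    ("53650", "Topology112", "A"),
    ("53666", "History105", "B"),
])

# Follow the foreign key from Enrolled back to Students.
rows = conn.execute(
    "SELECT s.name, e.cid, e.grade "
    "FROM Students s JOIN Enrolled e ON s.sid = e.sid "
    "ORDER BY e.cid"
).fetchall()
print(rows)
```

With the foreign key enabled, inserting an Enrolled row whose sid does not appear in Students is rejected.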
One Solution: The E-R Model
• Instead of relations, it has:
  - Entities and Relationships
• These are described with diagrams
  - both the structure and the notation are more obvious to humans
[Figure: E-R diagram. Entity set Employee (ssn, name, lot) and entity set Department (did, dname, budget), connected by relationship set Works_In (since)]
Steps in Database Design
• Requirements Analysis
  - user needs; what must the database do?
• Conceptual Design
  - high-level description (often done with the ER model)
• Logical Design
  - translate ER into DBMS data model
• Schema Refinement
  - consistency, normalization
• Physical Design
  - indexes, disk layout
• Security Design
  - who accesses what, and how
Example: DBA for Bank of America
• Requirements Specification
  - Determine the requirements of clients (database to store information about customers, accounts, loans, branches, transactions, ...)
• Conceptual Design
  - Express client requirements in terms of E/R model.
  - Confirm with clients that requirements are correct.
  - Specify required data operations
• Logical Design
  - Convert E/R model to relational, object-based, XML-based, ...
• Physical Design
  - Specify file organizations, build indexes
Modeling
• A database can be modeled as:
  - a collection of entities,
  - relationships among entities.
• An entity is an object that exists and is distinguishable from other objects.
  - Example: specific employee, company, customer, loan
• Entities have attributes
  - Example: employees have names and addresses
• An entity set is a set of entities of the same type that share the same properties.
  - Example: set of all employees, companies, customers, loans
Entity Sets customer and loan
[Figure: entity set customer with columns Cst_id, Cst_name, Cst_street, Cst_city; entity set loan with columns Loan_id, Amount]
Relationship Sets
• A relationship is an association among several entities
  Example: Hayes (customer entity), depositor (relationship set), A-102 (account entity)
• A relationship set is a mathematical relation among n ≥ 2 entities, each taken from entity sets
  {(e1, e2, …, en) | e1 ∈ E1, e2 ∈ E2, …, en ∈ En}
  where (e1, e2, …, en) is a relationship
  - Example: (Hayes, A-102) ∈ depositor
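The set definition above can be mirrored directly with Python sets of tuples. This is only a sketch: the entity "Jones" and the account "A-215" are made-up values added for illustration, and is_relationship_set is a hypothetical helper name.

```python
# Entity sets (Hayes and A-102 are from the slide; the other values are invented).
customers = {"Hayes", "Jones"}
accounts = {"A-102", "A-215"}

# A relationship set: a set of tuples, one component per participating entity set.
depositor = {("Hayes", "A-102"), ("Jones", "A-215")}


def is_relationship_set(r, *entity_sets):
    """True if every tuple in r takes its i-th component from the i-th entity set."""
    return all(
        len(t) == len(entity_sets)
        and all(e in es for e, es in zip(t, entity_sets))
        for t in r
    )


print(("Hayes", "A-102") in depositor)
print(is_relationship_set(depositor, customers, accounts))
```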
Relationship Set borrower
[Figure: relationship set borrower]
Degree of a Relationship Set
• Refers to the number of entity sets that participate in a relationship set.
• Relationship sets that involve two entity sets are binary (or degree two). Generally, most relationship sets in a database system are binary.
• Relationship sets may involve more than two entity sets.
  Example: Suppose employees of a bank may have jobs (responsibilities) at multiple branches, with different jobs at different branches. Then there is a ternary relationship set between entity sets employee, job, and branch.
• Relationships between more than two entity sets are rare. Most relationships are binary. (More on this later.)
Attributes
• An entity is represented by a set of attributes, that is, descriptive properties possessed by all members of an entity set.
  Example:
  customer = (customer_id, customer_name, customer_street, customer_city)
  loan = (loan_number, amount)
• Domain: the set of permitted values for each attribute
• Attribute types:
  - Simple and composite attributes
  - Single-valued and multi-valued attributes
    Example of a multivalued attribute: phone_numbers
  - Derived attributes
    Can be computed from other attributes
    Example: age, given date_of_birth
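The attribute types above can be sketched with a small Python class. The class layout is an assumption made for illustration: phone_numbers is stored as a list (multivalued), while age is computed from date_of_birth on demand (derived) instead of being stored.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Customer:
    customer_id: str                  # simple, single-valued
    customer_name: str                # simple, single-valued
    date_of_birth: date
    phone_numbers: list = field(default_factory=list)  # multivalued

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        birthday_passed = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if birthday_passed else 1)


c = Customer("c-1", "Hayes", date(2000, 6, 15), ["555-0100", "555-0101"])
print(c.age(date(2024, 6, 14)), c.age(date(2024, 6, 15)))
```

Storing age directly would require updating it every year; deriving it keeps the stored data free of that redundancy.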
Composite Attributes
[Figure: composite attributes]
Relationship Sets (Cont.)
• An attribute can also be a property of a relationship set.
• For instance, the depositor relationship set between entity sets customer and account may have the attribute access-date
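E/R Data Model
Another Example
[Figure: E-R diagram with entity sets Employee and Branch, relationship set Works_At, and attributes essn, phone, ename, since, seniority, bname, bcity, children]
Lots of notation to come.
[Legend: Employee is an entity set; Works_For is a relationship set; phone is an attribute]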
E/R Data Model
Types of Attributes
[Figure: E-R diagram with entity sets Employee and Branch, relationship set Works_At, and attributes essn, phone, ename, bname, seniority, since, bcity, children]
• ename: Default
• children: Multivalued
• seniority: Derived
E/R Data Model
Types of relationships
[Figure: E-R diagram with entity sets Employee and Branch, relationship set Works_At, and attributes essn, phone, ename, since, seniority, bname, bcity, children]
• Works_At: Many-to-Many (n:m)
Roles
• Entity sets of a relationship need not be distinct
• The labels "manager" and "worker" are called roles; they specify how employee entities interact via the works_for relationship set.
• Roles are indicated in E-R diagrams by labeling the lines that connect diamonds to rectangles.
• Role labels are optional, and are used to clarify the semantics of the relationship
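E/R Data Model
Recursive relationships
[Figure: E-R diagram with entity sets Employee and Branch, relationship set Works_At, attributes essn, phone, ename, since, seniority, bname, bcity, children, and role labels manager and worker]
• Recursive relationships: must be declared with roles
[Figure: Employee connected twice to Works_For, once in role manager and once in role worker; Works_For is a many-to-1 relationship]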
E/R Data Model
Design Issue #1: Entity Sets vs. Attributes
• An Example: Employees can have multiple phones
  (a) Employee with attributes phone_no and phone_loc
  vs
  (b) Employee connected by relationship set Uses to entity set Phone (no, loc)
• To resolve, determine how phones are used
  1. Can many employees share a phone? (If yes, then (b))
  2. Can employees have multiple phones? (If yes, then (b), or (a) with multivalued attributes)
  3. Else (a), perhaps with composite attributes
[Figure: Employee with composite attribute phone (no, loc)]
E/R Data Model
Design Issue #2: Entity Sets vs. Relationship Sets
• An Example: How to model bank loans
  (a) Customer (cssn, cname) connected by relationship set Borrows to entity set Loan (lno, amt)
  vs
  (b) Customer (cssn, cname) connected by relationship set Loans (lno, amt) to entity set Branch (bname, bcity)
• To resolve, determine how loans are issued
  1. Can there be more than one customer per loan?
     - If yes, then (a). With (b), loan information would have to be replicated for each customer (wasteful storage, potential update anomalies)
  2. Is loan a noun or a verb?
     - Both, but more of a noun to a bank (hence (a) is probably more appropriate)
E/R Data Model
Design Issue #3: Relationship Cardinalities
• An Example:
[Figure: Customer ? Borrows ? Loan]
• Variations on Borrows:
  1. Can a customer hold multiple loans?
  2. Can a loan be jointly held by more than 1 customer?
E/R Data Model
Design Issue #3: Relationship Cardinalities
[Figure: Customer ? Borrows ? Loan]
• Cardinalities of Borrows (each type is illustrated on the slide by a Borrows diagram):

Type                 Multiple Loans?  Joint Loans?
One-to-One (1:1)     No               No
Many-to-one (n:1)    No               Yes
One-to-many (1:n)    Yes              No
Many-to-many (n:m)   Yes              Yes
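The two questions in the table (multiple loans? joint loans?) can be answered mechanically for a given set of (customer, loan) pairs. The sketch below classifies one instance of Borrows; a cardinality constraint is really a statement about all legal instances, so this only reports what a particular data set is consistent with. The function name and sample values are hypothetical.

```python
from collections import Counter


def cardinality(borrows):
    """Classify a set of (customer, loan) pairs as 1:1, n:1, 1:n, or n:m."""
    loans_per_customer = Counter(customer for customer, _ in borrows)
    customers_per_loan = Counter(loan for _, loan in borrows)
    multiple_loans = any(n > 1 for n in loans_per_customer.values())
    joint_loans = any(n > 1 for n in customers_per_loan.values())
    # Same (Multiple Loans?, Joint Loans?) lookup as the table on the slide.
    return {
        (False, False): "1:1",
        (False, True): "n:1",
        (True, False): "1:n",
        (True, True): "n:m",
    }[(multiple_loans, joint_loans)]


print(cardinality({("Hayes", "L-1"), ("Jones", "L-1")}))
```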
E/R Data Model
Design Issue #3: Relationship Cardinalities (cont)
• In general...
[Figure: diagrams of 1:1, n:1, 1:n, and n:m relationship sets]
E/R Data Model
Design Issue #4: N-ary vs Binary Relationship Sets
• An Example: Works_At
  Ternary: Employee, Branch, and Dept connected by a single relationship set Works_At
    (Joe, Moody, Acct) ∈ Works_At
  vs
  Binary: a new entity set WA connected to Employee by WAE, to Branch by WAB, and to Dept by WAD
    (Joe, w3) ∈ WAE
    (Moody, w3) ∈ WAB
    (Acct, w3) ∈ WAD
• Choose n-ary when possible! (Avoids redundancy, update anomalies)
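The ternary and binary designs can be compared with a small sketch. It rewrites each ternary fact as three binary facts that share an artificial entity (w1, w2, ...), mirroring WAE, WAB and WAD on the slide. The function names and the second sample tuple are hypothetical, invented for illustration.

```python
def to_binary(ternary):
    """Split (employee, branch, dept) facts into three binary relationship sets."""
    wae, wab, wad = set(), set(), set()
    for i, (emp, branch, dept) in enumerate(sorted(ternary), start=1):
        w = f"w{i}"  # artificial WA entity standing for one ternary fact
        wae.add((emp, w))
        wab.add((branch, w))
        wad.add((dept, w))
    return wae, wab, wad


def to_ternary(wae, wab, wad):
    """Rebuild the ternary relationship set by joining on the WA entity."""
    emp = {w: e for e, w in wae}
    branch = {w: b for b, w in wab}
    dept = {w: d for d, w in wad}
    return {(emp[w], branch[w], dept[w]) for w in emp}


works_at = {("Joe", "Moody", "Acct"), ("Joe", "Downtown", "Audit")}
wae, wab, wad = to_binary(works_at)
print(len(works_at), len(wae) + len(wab) + len(wad))
```

Each ternary fact becomes three binary facts that must be kept consistent with one another, which is one way to see the redundancy the slide warns about.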
E/R Data Model
Keys
• Key = set of attributes identifying individual entities or relationships
[Figure: entity set Employee with attributes essn, eaddress, ename, ephone]
• A. Superkey:
  - any attribute set that distinguishes entities, i.e., uniquely identifies a member of the entity set
  - e.g., {essn}, {essn, ename, eaddress}
• B. Candidate Key:
  - "minimal superkey" (can't remove attributes and preserve "keyness")
  - e.g., {essn}, {ename, eaddress}
• C. Primary Key:
  - candidate key chosen as the key by a DBA
  - e.g., {essn} (denoted by underline)
Example

ESSN       EName         EAddress    E-Id
123456789  John Neuman   Worcester   OR-0034
234567567  John Neuman   Boston      OR-0027
234789890  Richard Wang  Framingham  OR-0025
976563456  Sheila Dixie  Worcester   OR-6788

• Superkeys: uniquely identify a member of the entity set
  - {ESSN}, {EName, EAddress}, {E-Id}, {ESSN, E-Id}
• Candidate keys: {ESSN}, {EName, EAddress}, {E-Id}
  - Why? No proper subset is a superkey
  - E.g., neither {EName} nor {EAddress} is a superkey, so {EName, EAddress}, which is a superkey, is also a candidate key.
• {ESSN, E-Id}: superkey but not a candidate key
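The superkey and candidate-key claims for the example table can be checked by brute force. This is a sketch under one important assumption: it tests uniqueness on these four rows only, whereas a key is a constraint on every legal instance, so the code shows which attribute sets are consistent with the sample data. The function names are hypothetical.

```python
from itertools import combinations

ATTRS = ("ESSN", "EName", "EAddress", "E-Id")
ROWS = [
    ("123456789", "John Neuman", "Worcester", "OR-0034"),
    ("234567567", "John Neuman", "Boston", "OR-0027"),
    ("234789890", "Richard Wang", "Framingham", "OR-0025"),
    ("976563456", "Sheila Dixie", "Worcester", "OR-6788"),
]


def is_superkey(attrs):
    """True if no two rows agree on all of the given attributes."""
    idx = [ATTRS.index(a) for a in attrs]
    projected = {tuple(row[i] for i in idx) for row in ROWS}
    return len(projected) == len(ROWS)


def candidate_keys():
    """Minimal superkeys, found by trying attribute sets from smallest to largest."""
    keys = []
    for size in range(1, len(ATTRS) + 1):
        for combo in combinations(ATTRS, size):
            contains_smaller_key = any(set(k) <= set(combo) for k in keys)
            if is_superkey(combo) and not contains_smaller_key:
                keys.append(combo)
    return keys


print(candidate_keys())
```

Because attribute sets are tried in order of increasing size, any superkey that contains an already-found key is skipped, which is exactly the minimality condition.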
E/R Data Model
Relationship Set Keys
[Figure: Employee (essn, ename, ...) connected by Works_At (since) to Branch (bname, bcity, ...), with sample entities e1, e2, e3 and b1, b2]
• Q: What attributes are needed to represent relationships in Works_At?
  A: {essn, bname, since}
E/R Data Model
Relationship Set Keys (cont.)
[Figure: Employee (essn, ename, ...) connected by Works_At (since) to Branch (bname, bcity, ...), with sample entities e1, e2, e3 and b1, b2]
• Q: What are the candidate keys of Works_At?
  A: {essn}
E/R Data Model
Relationship Set Keys (cont.)
[Figure: Employee (essn, ename, ...) connected by Works_At (since) to Branch (bname, bcity, ...)]
• Q: What are the candidate keys if Works_At is...?
  a. 1:n
     A: {bname}
  b. n:m
     A: {essn, bname}
     Assumption: employees have <= 1 record per branch
  c. 1:1
     A: {essn}, {bname}
E/R Data Model
Relationship Set Keys (cont.)
• General Rules for Relationship Set Keys
[Figure: E1 with key P(E1) connected by R to E2 with key P(E2)]
• If R is:

R     Candidate Keys
1:1   P(E1) or P(E2)
1:n   P(E2)
n:1   P(E1)
n:m   P(E1) ∪ P(E2)
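The general rules can be written as a lookup table. In this sketch P(E1) and P(E2) are passed in as sets of attribute names, on the assumption that P(E) denotes the primary key of E; the function name is hypothetical.

```python
def relationship_candidate_keys(cardinality, p_e1, p_e2):
    """Candidate keys of relationship set R between E1 and E2, per the rules table."""
    p_e1, p_e2 = frozenset(p_e1), frozenset(p_e2)
    rules = {
        "1:1": [p_e1, p_e2],   # either side's key alone identifies a relationship
        "1:n": [p_e2],
        "n:1": [p_e1],
        "n:m": [p_e1 | p_e2],  # union of both keys
    }
    return rules[cardinality]


# Works_At between Employee (key essn) and Branch (key bname).
for kind in ("1:1", "1:n", "n:1", "n:m"):
    print(kind, [sorted(k) for k in relationship_candidate_keys(kind, {"essn"}, {"bname"})])
```

For Works_At this reproduces the answers given earlier: {bname} for 1:n, {essn, bname} for n:m, and either {essn} or {bname} for 1:1.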
E/R Data Model
Existence Dependencies and Weak Entity Sets
• Idea:
  - Existence of one entity depends on another
• Example: Loans and Loan Payments
[Figure: Loan (lno, lamt) connected by Loan_Pmt to Payment (pno, pdate, pamt); the diagram marks Payment as a weak entity set, Loan_Pmt as an identifying relationship, and Payment's total participation in Loan_Pmt]
E/R Data Model
Existence Dependencies and Weak Entity Sets
[Figure: Loan (lno, lamt) connected by Loan_Pmt to Payment (pno, pdate, pamt)]
Weak Entity Sets
• existence of payments depends upon loans
• have no superkeys: different payment records (for different loans) can be identical
• instead of keys, discriminators: discriminate between payments for a given loan (e.g., pno)
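One common way to realize a weak entity set in a relational schema is a composite primary key (owner's key plus discriminator) with a cascading foreign key. The sketch below uses Python's sqlite3 module; the column types, loan numbers, dates and amounts are all made up for illustration, and this mapping is an assumption, not something the slides prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for ON DELETE CASCADE in SQLite
conn.executescript("""
CREATE TABLE Loan (
    lno  TEXT PRIMARY KEY,
    lamt REAL
);
CREATE TABLE Payment (
    lno   TEXT NOT NULL REFERENCES Loan(lno) ON DELETE CASCADE,  -- existence dependence
    pno   INTEGER NOT NULL,                                      -- discriminator
    pdate TEXT,
    pamt  REAL,
    PRIMARY KEY (lno, pno)  -- owner's key + discriminator
);
""")
conn.executemany("INSERT INTO Loan VALUES (?, ?)", [("L-17", 1000.0), ("L-23", 2000.0)])
conn.executemany("INSERT INTO Payment VALUES (?, ?, ?, ?)", [
    ("L-17", 1, "2024-01-05", 50.0),
    ("L-23", 1, "2024-01-05", 50.0),  # same pno, pdate, pamt: only lno tells them apart
])

# Deleting a loan removes its payments, since a payment cannot exist without its loan.
conn.execute("DELETE FROM Loan WHERE lno = 'L-17'")
remaining = conn.execute("SELECT lno, pno FROM Payment").fetchall()
print(remaining)
```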
E/R Data Model
Existence Dependencies and Weak Entity Sets
[Figure: Loan (lno, lamt) connected by Loan_Pmt to Payment (pno, pdate, pamt)]
Identifying Relationships
• We say:
  - Loan is dominant in Loan_Pmt
  - Payment is subordinate in Loan_Pmt
  - Payment is existence dependent on Loan
E/R Data Model
Existence Dependencies and Weak Entity Sets
[Figure: Loan (lno, lamt) connected by Loan_Pmt to Payment (pno, pdate, pamt)]
• Total Participation
  - All elements of Payment appear in Loan_Pmt
E/R Data Model
Existence Dependencies and Weak Entity Sets
[Figure: E1 (atta1, ..., attam) connected by R to weak entity set E2 (attb1, ..., attbn)]
• Q. Is {attb1, …, attbn} a superkey of E2?
  A: No
• Q. Name a candidate key of E2
  A: {atta1, attb1}
E/R Data Model
Extensions to the Model: Specialization and Generalization
• An Example:
  - Customers can have checking and savings accts
  - Checking ~ Savings (many of the same attributes)
• Old Way:
[Figure: Customer connected by Has1 to Savings Acct (acct_no, balance, interest) and by Has2 to Checking Acct (acct_no, balance, overdraft)]
E/R Data Model
Extensions to the Model: Specialization and Generalization
• An Example:
  - Customers can have checking and savings accts
  - Checking ~ Savings (many of the same attributes)
• New Way:
[Figure: Customer connected by Has to superclass Account (acct_no, balance); Account connected by Isa to subclasses Savings Acct (interest) and Checking Acct (overdraft)]
E/R Data Model
Extensions to the Model: Specialization and Generalization
• Subclass Distinctions:
  1. User-Defined vs. Condition-Defined
     - User: Membership in subclasses explicitly determined (e.g., Employee, Manager < Person)
     - Condition: Membership predicate associated with subclasses, e.g.:
[Figure: Person connected by Isa to subclasses Child (age < 18), Adult (age ≥ 18), Senior (age ≥ 65)]
E/R Data Model
Extensions to the Model: Specialization and Generalization
• Subclass Distinctions:
  2. Overlapping vs. Disjoint
     - Overlapping: Entities can belong to >1 entity set (e.g., Adult, Senior)
     - Disjoint: Entities belong to exactly 1 entity set (e.g., Child)
[Figure: Person connected by Isa to subclasses Child (age < 18), Adult (age ≥ 18), Senior (age ≥ 65)]
E/R Data Model
Extensions to the Model: Specialization and Generalization
• Subclass Distinctions:
  3. Total vs. Partial Membership
     - Total: Every entity of the superclass belongs to a subclass, e.g.:
[Figure: Person connected by Isa to subclasses Child (age < 18), Adult (age ≥ 18), Senior (age ≥ 65)]
     - Partial: Some entities of the superclass do not belong to any subclass (e.g., if the Adult condition is age ≥ 21)
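Condition-defined subclasses lend themselves to a short sketch: each subclass is a predicate on age, and overlap or partial membership falls out of which predicates hold. The dictionary layout and the function name memberships are assumptions made for illustration; the age thresholds are the ones on the slides.

```python
# Condition-defined subclasses of Person, each given by a membership predicate on age.
SUBCLASSES = {
    "Child": lambda age: age < 18,
    "Adult": lambda age: age >= 18,
    "Senior": lambda age: age >= 65,
}


def memberships(age, subclasses=SUBCLASSES):
    """Return the names of all subclasses whose predicate accepts this age."""
    return {name for name, accepts in subclasses.items() if accepts(age)}


print(memberships(10))  # a child only: Child is disjoint from the others
print(memberships(70))  # Adult and Senior overlap

# Partial membership: if Adult required age >= 21, ages 18 to 20 would fit no subclass.
PARTIAL = dict(SUBCLASSES, Adult=lambda age: age >= 21)
print(memberships(19, PARTIAL))
```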
E/R Data Model
Extensions to the Model: Aggregation
• E/R: No relationships between relationships
  - E.g.: Associate loan officers with the Borrows relationship set
[Figure: Customer connected by Borrows to Loan; Employee connected by Loan_Officer to ?]
• Associate Loan Officer with Loan?
  - What if we want a loan officer for every (customer, loan) pair?
E/R Data Model
Extensions to the Model: Aggregation
• E/R: No relationships between relationships
  - E.g.: Associate loan officers with the Borrows relationship set
[Figure: Customer connected by Borrows to Loan, with Borrows aggregated; Employee connected by Loan_Officer to the aggregate]
• Associate Loan Officer with Borrows?
  - Must First Aggregate
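Aggregation can be pictured as treating each (customer, loan) pair in Borrows as an entity in its own right, so that Loan_Officer can relate an employee to it. The sketch below uses nested tuples for this; all names and values are hypothetical.

```python
# Borrows: a relationship set of (customer, loan) pairs.
borrows = {("Hayes", "L-17"), ("Jones", "L-17"), ("Jones", "L-23")}

# Loan_Officer relates an employee to an aggregated (customer, loan) pair,
# so the same loan can have a different officer for each customer.
loan_officer = {
    ("Smith", ("Hayes", "L-17")),
    ("Lee", ("Jones", "L-17")),
}


def is_valid_aggregation(officer_rel, borrows_rel):
    """Every pair referenced by Loan_Officer must be an existing Borrows relationship."""
    return all(pair in borrows_rel for _, pair in officer_rel)


print(is_valid_aggregation(loan_officer, borrows))
```

Relating Loan_Officer to Loan alone could not express that loan L-17 has one officer for Hayes and another for Jones.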
Other Similar Models: UML
• UML: Unified Modeling Language
• UML has many components to graphically model different aspects of an entire software system
• UML Class Diagrams correspond to E-R diagrams, but with several differences.
Summary of UML Class Diagram Notation
[Figure: summary of UML class diagram notation]
UML Class Diagrams (Cont.)
• Entity sets are shown as boxes, and attributes are shown within the box, rather than as separate ellipses as in E-R diagrams.
• Binary relationship sets are represented in UML by just drawing a line connecting the entity sets. The relationship set name is written adjacent to the line.
• The role played by an entity set in a relationship set may also be specified by writing the role name on the line, adjacent to the entity set.
• The relationship set name may alternatively be written in a box, along with attributes of the relationship set, and the box is connected, using a dotted line, to the line depicting the relationship set.
• Non-binary relationships are drawn using diamonds, just as in ER diagrams
UML Class Diagram Notation (Cont.)
[Figure: UML notation for cardinality constraints and for overlapping and disjoint generalization]
* Note reversal of position in cardinality constraint depiction
* Generalization can use merged or separate arrows independent of disjoint/overlapping
UML Class Diagrams (Contd.)
• Cardinality constraints are specified in the form l..h, where l denotes the minimum and h the maximum number of relationships an entity can participate in.
• Beware: the positioning of the constraints is exactly the reverse of the positioning of constraints in E-R diagrams.
• The constraint 0..* on the E2 side and 0..1 on the E1 side means that each E2 entity can participate in at most one relationship, whereas each E1 entity can participate in many relationships; in other words, the relationship is many to one from E2 to E1.
• Single values, such as 1 or *, may be written on edges; the single value 1 on an edge is treated as equivalent to 1..1, while * is equivalent to 0..*.
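The l..h notation and its single-value shorthands can be captured in a few lines. This is a sketch with a hypothetical function name; it represents an unbounded maximum (*) as None.

```python
def parse_multiplicity(text):
    """Return (low, high) for 'l..h', '1', or '*'; high is None when unbounded."""
    text = text.strip()
    if text == "*":
        return (0, None)       # * is equivalent to 0..*
    if ".." not in text:
        n = int(text)
        return (n, n)          # a single value n is equivalent to n..n
    low, high = text.split("..")
    return (int(low), None if high == "*" else int(high))


for label in ("1", "*", "0..1", "0..*"):
    print(label, parse_multiplicity(label))
```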
E/R Data Model
Summary
• Entities, Relationships (sets)
• Both can have attributes (simple, multivalued, derived, composite)
• Cardinality of relationship sets (1:1, n:1, n:m)
• Keys: superkeys, candidate keys, primary key
  - DBA chooses primary key for entity sets
  - Automatically determined for relationship sets
• Weak Entity Sets, Existence Dependence, Total/Partial Participation
• Specialization and Generalization (E/R + inheritance)
• Aggregation (E/R + higher-order relationships)
These things get pretty hairy!
• Many E-R diagrams cover entire walls!
• A modest example:
A Cadastral E-R Diagram
[Figure: cadastral E-R diagram]
cadastral: showing or recording property boundaries, subdivision lines, buildings, and related details
Source: US Dept. Interior Bureau of Land Management, Federal Geographic Data Committee Cadastral Subcommittee
http://www.fairview-industries.com/standardmodule/cad-erd.htm