Survey
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
Database Systems
DataBase System
Major Content & Grade
Introduction
*
The
***
Relational Model
SQL
****
Transaction
Management
***
Database
Design (E-R)
***
Database
Design (Normalization)
***
Haichang Gao , Software School , Xidian University
2
DataBase System
Unit 2 The Relational Model
Relational
Model
Relational
Algebra(关系代数)
Relational
Calculus(关系演算)
Haichang Gao , Software School , Xidian University
3
DataBase System
Relational Model
Laid
down in 1969-1970 by E.F.CODD
“ A Relational Model of Data for Large Shared
Data Bank” C ACM 1970
Mathematical Bases ( Relational Theory)
Developed in 1980s
Most commercial DBMS are Relational
Haichang Gao , Software School , Xidian University
4
DataBase System
Mathematical Relations
given sets D1, D2, …. Dn, a relation r is a subset of
Cartesian product(笛卡尔积) D1 × D2 × … × Dn
Thus a relation(关系) is a set of n-tuples (a1, a2, …, an)
where ai Di
Example: if
customer-name = {Jones, Smith, Curry, Lindsay}
customer-street = {Main, North, Park}
customer-city = {Harrison, Rye, Pittsfield}
Then r = {(Jones, Main, Harrison),
(Smith, North, Rye),
(Curry, North, Rye),
(Lindsay, Park, Pittsfield)}
is a relation over customer-name × customer-street ×
customer-city.
Formally,
Haichang Gao , Software School , Xidian University
5
DataBase System
Mathematical Relations
Each attribute of a relation has a name
The set of allowed values for each attribute is called the
domain(域) of the attribute
Attribute values are (normally) required to be atomic; that is,
indivisible
Note:
multivalued attribute(多值属性) values are not atomic
Note:
composite attribute (组合属性) values are not atomic
The special value null is a member of every domain
The
null value causes complications in the definition of many
operations
Haichang Gao , Software School , Xidian University
6
DataBase System
Mathematical Relations
A1,
R
A2, …, An are attributes
= (A1, A2, …, An ) is a relation schema(关系模式)
E.g. Customer-schema =
(customer-name, customer-street, customer-city)
r(R)
is a relation on the relation schema R
E.g.
customer (Customer-schema)
Haichang Gao , Software School , Xidian University
7
DataBase System
Mathematical Relations
current values (relation instance, 关系实例) of a relation
are specified by a table
An element t of r is a tuple(元组), represented by a row in a
table
Order of tuples is irrelevant (元组是无序的 tuples may be
stored in an arbitrary order)
The
attributes
(or columns)
customer_namecustomer_street customer_city
Jones
Smith
Curry
Lindsay
Main
North
North
Park
Harrison
Rye
Rye
Pittsfield
tuples
(or rows)
customer
Haichang Gao , Software School , Xidian University
8
DataBase System
Relational Database
A
Relational database consists of multiple relations
Information about an enterprise is broken up into parts, with
each relation storing one part of the information
Example:
The customer table stores information about customers
Haichang Gao , Software School , Xidian University
9
DataBase System
Relational Database
(b) The account table
(c) The depositor table
The account table stores information about accounts.
The depositor table containing information about which
customer owns which account
Storing all information as a single relation is NOT a good
idea!
Haichang Gao , Software School , Xidian University
10
DataBase System
Key
Let K R
K is a superkey(超码) of R if values for K are sufficient to
identify a unique tuple of each possible relation r(R).
Example:
{customer_id, customer_name} and
{customer_id} are both superkeys of Customer.
by “possible r” we mean a relation r that could exist in the
enterprise we are modeling.
Haichang Gao , Software School , Xidian University
11
DataBase System
Key
Let K R
K is a superkey(超码) of R if values for K are sufficient to
identify a unique tuple of each possible relation r(R).
K is a candidate key(候选码) if K is minimal.
Example:
{customer_id} is a candidate key for Customer,
since it is a superkey, and no subset of it is a superkey.
{customer_name} is a candidate key also, assuming no two
customers can possibly have the same name.
Haichang Gao , Software School , Xidian University
12
DataBase System
Key
Primary key(主码): The candidate key that is selected to
identify tuples uniquely within the relation. And the others
candidate key is called Alternate Keys(替换码).
Example: for relation
customer (customer_id, customer_name, customer_street,
customer_city)
Candidate key : {customer_id}, {customer_name}
Primary key: {customer_id}
Haichang Gao , Software School , Xidian University
13
DataBase System
Key
For the banking database, we have several relations to store data:
Foreign Key(外码): An attribute, or set of attributes,within one
relation R1 that matches the candidate key of some(possibly the
same) relation R2.
Example: Attribute customer_name
in table depositor is a
Foreign Key references to customer_name in customer.
Haichang Gao , Software School , Xidian University
14
DataBase System
Key
Usually, Primary Key and Foreign Key was indicated using
undirected line and dashed line.
Example:
Find out PK and FK of the following schema:
Student(sno, sname, ssex, sbirth, sdept)
Course(cno, cname, cpno, ccredit)
SC(sno, cno, grade)
Haichang Gao , Software School , Xidian University
15
DataBase System
Integrity of Relational Model
Entity integrity(实体完整性约束): In a base relation, NO
attribute of a primary key can be null.
Referential integrity(参照完整性约束): If a foreign key exists
in a relation, either the foreign key value must match a candidate
key value of some tuple in its home relation(被参照关系) or the
foreign key value must be wholly null.
Example:
customer (customer_name, customer_street, customer_city)
depositor (customer_name, account_number)
account (account_number, branch_name, balance)
Haichang Gao , Software School , Xidian University
16
DataBase System
Integrity of Relational Model
Null(空值): Represents a value for an attribute that is
currently unknown or is not applicable for this tuple.
It
is possible for tuples to have a null value, denoted by null,
for some of their attributes
The
result of any arithmetic expression involving null is null.
Aggregate functions
simply ignore null values (as in SQL)
For
duplicate elimination and grouping, null is treated like
any other value, and two nulls are assumed to be the same
(as in SQL)
Null
value is NOT 0 of integer, NOT empty string , it is
JUST NULL.
Haichang Gao , Software School , Xidian University
17
DataBase System
Integrity of Relational Model
Comparisons with null values return the special truth value:
unknown
If false was used instead of unknown, then not (A < 5)
would not be equivalent to
A >= 5
Three-valued logic using the truth value unknown:
OR: (unknown or true)
= true,
(unknown or false)
= unknown
(unknown or unknown) = unknown
AND: (true and unknown)
= unknown,
(false and unknown)
= false,
(unknown and unknown) = unknown
NOT: (not unknown) = unknown
Result of select predicate is treated as false if it evaluates to
unknown
Haichang Gao , Software School , Xidian University
18
DataBase System
Unit 2 The Relational Model
Relational
Model
Relational
Algebra(关系代数)
Relational
Calculus(关系演算)
Haichang Gao , Software School , Xidian University
19
DataBase System
Manipulation of Relational Model
Query Language is a Language in which user requests
information from the database.
Categories of languages
Procedural(过程化的)
Non-procedural, or declarative(非过程化的,声明性的)
Mathematical languages:
Relational algebra(关系代数)
Tuple relational calculus(元组演算)
Domain relational calculus (域演算)
Characteristics of Relational languages
all set-at-a-time (一次一集合)
operands & results are whole tables
Closure property(封闭性)
the output from one operation can become input to another.
Haichang Gao , Software School , Xidian University
20
DataBase System
Relational Algebra
Relational algebra is a Procedural language
Basic operators
select(选择):
project(投影): π
union(并):
set
∪
difference(集合差): –
Cartesian product(笛卡尔积): ×
rename:
The operators take two or more relations as inputs and give a
new relation as a result.
Haichang Gao , Software School , Xidian University
21
DataBase System
Basic operators
Select Operation(选择操作)
Notation:
p(r)
p
is called the selection predicate(条件谓词)
Defined as:
p(r) = {t | t r p(t)}
Where p is a formula in propositional calculus (命题演算)
consisting of terms connected by : (and), (or), (not)
Each term is one of:
<attribute> op <attribute> (or <constant>)
where op is one of: =, , >, , <, (comparison operator)
Example of selection:
branch_name=“Perryridge”(account)
Haichang Gao , Software School , Xidian University
22
DataBase System
Basic operators
Select Operation(选择操作)
• Relation r
• A=B D > ‘5’ (r)
A
B
C
D
1
7
5
7
12
3
23 10
A
B
C
D
1
7
23 10
Haichang Gao , Software School , Xidian University
23
DataBase System
Basic operators
Project Operation(投影操作)
Notation:
π A1, A2, …, Ak (r)
where A1, A2 are attribute names and r is a relation name.
The
result is defined as the relation of k columns obtained by
erasing the columns that are not listed
Duplicate rows
Example:
removed from result, since relations are sets
For relation
account (account_number, branch_name, balance)
To eliminate the branch_name attribute of account
π account_number, balance (account)
Haichang Gao , Software School , Xidian University
24
DataBase System
Basic operators
Project Operation(投影操作)
Relation r:
π A,C (r)
A
B
C
10
1
20
1
30
1
40
2
A
C
A
C
1
1
1
1
1
2
2
=
Haichang Gao , Software School , Xidian University
25
DataBase System
Basic operators
Union Operation(并操作)
Notation:
Defined
rs
r union s
as:
r s = {t | t r t s}
For
r s to be valid:
1. r, s must have the same degree(列数相同)
2. The attribute domains must be compatible (域相容)
Example: find all customers with either an account or a loan
π customer_name (depositor) π customer_name (borrower)
Haichang Gao , Software School , Xidian University
26
DataBase System
Basic operators
Union Operation(并操作)
Relations r, s:
A
B
A
B
1
2
2
3
1
s
r
r s:
A
B
1
2
1
3
The 1st column of r deals with
the same type of values(domain)
as does the 1st column of s ,and
2nd column also, and also for all
attributes.
Haichang Gao , Software School , Xidian University
27
DataBase System
Basic operators
Difference Operation(差操作)
Notation:
Defined
r–s
r minus s
as:
r – s = {t | t r t s}
Set
differences must be taken between compatible relations.
r and s must have the same degree
attribute domains of r and s must be compatible
Haichang Gao , Software School , Xidian University
28
DataBase System
Basic operators
Difference Operation (差操作)
Relations r, s:
A
B
A
B
1
2
2
3
1
s
r
r–s:
A
B
1
1
Haichang Gao , Software School , Xidian University
29
DataBase System
Basic operators
Cartesian-Product Operation(笛卡尔积操作)
Notation:
Defined
r×s
r times s
as:
r × s = {tq | t r q s}
Assume
that attributes of r(R) and s(S) are disjoint. (That is,
R S = , NO same attribute in R and S).
If
attributes of r(R) and s(S) are not disjoint, then renaming
must be used.
Haichang Gao , Software School , Xidian University
30
DataBase System
Basic operators
Cartesian-Product Operation(笛卡尔积操作)
Relations r, s:
A
B
C
D
E
1
2
10
10
20
10
a
a
b
b
r
s
r × s:
A
B
C
D
E
1
1
1
1
2
2
2
2
10
19
20
10
10
10
20
10
a
a
b
b
a
a
b
b
Haichang Gao , Software School , Xidian University
31
DataBase System
Basic operators
Rename Operation(重命名操作)
Allows
us to name, and therefore to refer to, the results of
relational-algebra expressions.
Allows us to refer to a relation by more than one name.
Example:
x (E)
returns the expression E under the name x
If a relational-algebra expression E has degree n, then
x (A1, A2, …, An) (E)
A1, A2, …., An returns the result of expression E under
the name x, and with the attributes renamed to A1, A2, ….,
An.
Haichang Gao , Software School , Xidian University
32
Example Queries
Find the largest account balance.
DataBase System
account_
number
Branch_
name
A-101
A-215
A-102
…
Downtown
Mianus
Perryridge
……
balance
500
700
400
…
account
A1.balance >A2.balance (A1(account) × A2(account) )
?
最小值不被选中!
πbalance(account) –
πA1.balance (A1.balance < A2.balance (A1(account) × A2(account) ) )
33
Haichang Gao , Software School , Xidian University
Database Systems
DataBase System
Unit 2 The Relational Model
Relational
Model
Relational
Algebra(关系代数)
Relational
Calculus(关系演算)
Haichang Gao , Software School , Xidian University
35
DataBase System
Relational Algebra expression
A basic expression in the relational algebra consists of either
one of the following:
A relation in the database
A constant relation
Let E1 and E2 be relational-algebra expressions; the
following are all relational-algebra expressions:
E1 E2
E1 – E2
E1 × E2
p(E1), P is a predicate on attributes in E1
πs(E1), S is a list consisting of some of the attributes in E1
x(E1), x is the new name for the result of E1
Haichang Gao , Software School , Xidian University
36
DataBase System
Example Queries
Banking DataBase:
branch (branch_name, branch_city, assets)
customer (customer_name, customer_street, customer_city)
depositor (customer_name, account_number)
account (account_number, branch_name, balance)
borrower (customer_name, loan_number)
loan (loan_number, branch_name, amount)
Haichang Gao , Software School , Xidian University
37
DataBase System
Example Queries
Find the loan number for each loan of an amount greater than
$1200.
loan_number branch_name amount
L-11
L-14
L-15
L-16
L-17
L-23
L-93
Round Hill
Downtown
Perryridge
Perryridge
Downtown
Redwood
Mianus
loan
900
1500
1500
1300
1000
2000
500
πloan_number ( amount > ‘1200’ (loan) )
loan_number
L-14
L-15
L-16
L-23
注:除特别要求外,写查询表
达式时不需要写出结果集!
Haichang Gao , Software School , Xidian University
38
DataBase System
Example Queries
account_
number
A-101
A-215
A-102
A-305
A-201
A-222
A-217
branch_
name
Downtown
Mianus
Perryridge
Round Hill
Brighton
Redwood
Brighton
account
balance
500
700
400
350
900
700
750
customer_
account_
name
number
Hayes
A-102
Johnson
A-101
Johnson
A-201
Jones
A-217
Lindsay
A-222
Smith
A-215
Turner
A-305
depositor
Find the customers name who have at least one deposit of a
balance greater than $700.
πCustomer_name(account.account_number = depositor.account_number (
π account_number(balance>’700’(account))×depositor ) )
πCustomer_name(account.account_number = depositor.account_number balance>’700’ (
A1:
A2:
account ×depositor ) )
Haichang Gao , Software School , Xidian University
39
DataBase System
Example Queries
customer_
account_
name
number
Hayes
A-102
Johnson
A-101
Johnson
A-201
Jones
A-217
Lindsay
A-222
Smith
A-215
Turner
A-305
depositor
D1.cname D1.a# D2.cname D2.a#
Hayes
Hayes
Hayes
……
Johnson
……
Johnson
……
A-102
A-102
A-102
……
A-101
……
A-201
……
Hayes
Johnson
Johnson
……
Johnson
……
Johnson
……
A-102
A-101
A-201
……
A-201
……
A-101
……
Find all customers who have at least two deposits.
πD1.cname(D1.cname=D2.cname ∧ D1.a#<>D2.a# (
D1(cname,a#)(depositor) × D2(cname,a#)( depositor) ) )
Haichang Gao , Software School , Xidian University
40
Example Queries
Branch_
name
A-101
A-215
A-102
…
Downtown
Mianus
Perryridge
……
Find the largest account balance.
A.ano A.bn
A-101
A-101
A-101
A-215
A-215
A-215
A-102
A-102
A-102
…
Downtown
Downtown
Downtown
Mianus
Mianus
Mianus
Perryridge
Perryridge
Perryridge
…….
A.b B.ano B.bn
500
500
500
700
700
700
400
400
400
…
A-101
A-215
A-102
A-101
A-215
A-102
A-101
A-215
A-102
…
B.b
Downtown
Mianus
Perryridge
Downtown
Mianus
Perryridge
Downtown
Mianus
Perryridge
……
DataBase System
account_
number
500
700
400
500
700
400
500
700
400
…
balance
500
700
400
…
account
A.b > B.b (A × B)?
A.b < B.b (A × B)?
πbalance(account) –
πA1.balance (A1.balance < A2.balance (A1(account) × A2(account) ) )
Haichang Gao , Software School , Xidian University
41
DataBase System
Additional Operations
We define additional operations that do not add any
power to the relational algebra, but that simplify
common queries.
Set
intersection(集合交)
Natural
join(自然连接)
Division(除法)
Haichang Gao , Software School , Xidian University
42
DataBase System
Additional Operations
Intersection Operation(交操作)
Notation: r
s
Assume:
r, s have the same degree
attributes of r and s are compatible
Defined
as:
r s ={ t | t r t s }
Note:
r s = r - (r - s)
Find the names of all customers who have a loan and an account
at bank.
πcustomer_name (borrower ) πcustomer_name (depositor )
Haichang Gao , Software School , Xidian University
43
DataBase System
Additional Operations
-Join Operation(-连接)
-Join
is a general join which don’t need there are same
attribute for both operators, and the join condition can be any
compare operations. That is:
R times S WHERE F
R
AθB
S = { tr ts | tr R ts S tr[A] ts[B] }
Equi-join is another join. Its join condition is equal operator.
R
A=B
S = { tr ts | tr R ts S tr[A]= ts[B] }
Haichang Gao , Software School , Xidian University
44
DataBase System
Additional Operations
-Join Operation(-连接) R
Relations r, s:
R
A
a1
a1
a2
a2
B
b1
b2
b3
b4
S
B
b1
b2
b3
b3
b5
E
3
7
10
2
2
S
C<E
A
a1
R.B
b1
C
5
S.B
b2
E
7
a1
b1
5
b3
10
a1
b2
6
b2
7
a1
b2
6
b3
10
a2
b3
8
b3
10
R
S
C
5
6
8
12
B=B
A
R.B
C
S.B
E
a1
b1
5
b1
3
a1
b2
6
b2
7
a2
b3
8
b3
10
a2
b3
8
b3
2
Haichang Gao , Software School , Xidian University
45
DataBase System
Additional Operations
Natural join Operation(自然连接)
Notation:
r s
Let r and s be relations on schemas R and S respectively.The
result is a relation on schema R S which is obtained by
considering each pair of tuples tr from r and ts from s.
If tr and ts have the same value on each of the attributes in
RS, a tuple t is added to the result, where
t has the same value as tr on r
t has the same value as ts on s
Example:
R = (A, B, C, D)
S = (E, B, D)
Result schema = (A, B, C, D, E)
r
s is defined as:
π r.A, r.B, r.C, r.D, s.E (r.B = s.B r.D = s.D (r × s))
Haichang Gao , Software School , Xidian University
46
DataBase System
Additional Operations
Natural join Operation(自然连接)
Relations r, s:
A
B
C
D
B
D
E
1
2
4
1
2
a
a
b
a
b
1
3
1
2
3
a
a
a
b
b
r
r
s
s
A
B
C
D
E
1
1
1
1
2
a
a
a
a
b
Haichang Gao , Software School , Xidian University
47
DataBase System
Example Queries
Additional Operations simplify the relational algebra
expressions !
Find the customers name who have at least one deposit of a
balance greater than $700.
A1:
πCustomer_name(account.account_number = depositor.account_number (
π account_number(balance>’700’(account)) ×depositor ) )
A2:
πCustomer_name(account.account_number = depositor.account_number balance>’700’ (
account ×depositor ) )
A3:
πCustomer_name(balance>’700’ ( account
depositor ) )
Haichang Gao , Software School , Xidian University
48
DataBase System
Example Queries
customer_
account_
name
number
Hayes
A-102
Johnson
A-101
Johnson
A-201
Jones
A-217
Lindsay
A-222
Smith
A-215
Turner
A-305
depositor
account_
number
A-101
A-215
A-102
A-305
A-201
A-222
A-217
branch_
name
Downtown
Mianus
Perryridge
Round Hill
Brighton
Redwood
Brighton
account
balance
500
700
400
350
900
700
750
(exercise)Find all customers who have at least two deposits in
different branch.
πD1.Customer_name( D1.Customer_name =D2. Customer_name
D1. Account_number<>D2. Account_number D1.branch_name<>D2. branch_name(
D1(depositor
account) × D2(depositor
account) )
Haichang Gao , Software School , Xidian University
49
DataBase System
Example Queries
customer_
account_
name
number
Hayes
A-102
Johnson
A-101
Johnson
A-201
Jones
A-217
Lindsay
A-222
Smith
A-215
Turner
A-305
depositor
account_
number
A-101
A-215
A-102
A-305
A-201
A-222
A-217
branch_
name
Downtown
Mianus
Perryridge
Round Hill
Brighton
Redwood
Brighton
account
balance
500
700
400
350
900
700
750
(exercise)Find all customers who have at least two deposits in
different branch.
πD1.cname(D1.cname=D2.cname D1.a#<>D2.a# D1.bname<>D2.bname(
D2(cname,a#,bname)(πdepositor.customer_name,account_number,branch_name
depositor.account_number=account.account_number(depositor × account)) ×
D1(cname,a#,bname)(πdepositor.customer_name,account_number,branch_name
depositor.account_number=account.account_number(depositor × account)) )
Haichang Gao , Software School , Xidian University
50
DataBase System
Additional Operations
Division Operation(除)
Suited
to queries that include the phrase “for all”.
Let
r and s be relations on schemas R and S respectively
where
R = (A1, …, Am, B1, …, Bn)
S = (B1, …, Bn)
The result of r s is a relation on schema
R S = (A1, …, Am)
Defined
as:
r s = { t | t π R-S(r) u s → tu r ) }
Haichang Gao , Software School , Xidian University
51
DataBase System
Additional Operations
Division Operation(除)
Relations r, s:
A
B
B
1
2
3
1
1
1
3
4
6
1
2
1
r s:
2
s
A
r
Haichang Gao , Software School , Xidian University
52
DataBase System
Additional Operations
Division Operation(除)
Another
Defined as:
R S {tr [ X ] | tr R Yx Y (S )}
Yx={y | tr∈R y=tr[Y] x=tr[X]}
Relations r, s:
r
A B
a
a
a
a
a
a
a
a
C
D
E
a
a
b
a
b
a
b
b
1
1
1
1
3
1
1
1
s
D
E
a
b
1
1
r s:
A
B
C
a
a
Haichang Gao , Software School , Xidian University
53
DataBase System
Additional Operations
Division Operation(除)
Property
Let q is the result of r s
Then q is the largest relation satisfying
q×sr
Definition in terms of the basic algebra operation
Let r(R) and s(S) be relations, and let S R
r s = π R-S (r) –πR-S ( (π R-S (r) × s) – πR-S,S(r))
Haichang Gao , Software School , Xidian University
54
DataBase System
Example Queries
Find all customers who have an account at all branches
located in Brooklyn city.
customer_
name
Johnson
Smith
Hayes
Turner
Johnson
Lindsay
Jones
account_
number
A-101
A-215
A-102
A-305
A-201
A-222
A-217
branch_
balance
name
Downtown
500
Mianus
700
Perryridge
400
Round Hill
350
Brighton
900
Redwood
700
Brighton
750
account
branch_name
Brighton
Downtown
Mianus
North Town
Perryridge
Pownal
Redwood
Round Hill
branch_city
Brooklyn
Brooklyn
Horseneck
Rye
Horseneck
Bennington
PaloAlto
Horseneck
branch
assets
7100000
9000000
400000
3700000
1700000
300000
2100000
8000000
π customer_name, branch_name (depositor
account)
πbranch_name (branch_city = “Brooklyn” (branch))
Haichang Gao , Software School , Xidian University
55
DataBase System
Extended Operations
Outer join Operation(外连接)
The
(left) Outer join is a join on which tuples from R that do
not have matching values in the common attributes of S are
alse included in the result relation. Missing values in the
second relation are set to null.
Notation:
r
s (left outer join )
A
C
D
A
B
10
10
20
10
a
a
b
b
1
2
r
s
A
C
D
B
10
10
20
10
a
a
b
b
1
2
2
r
s
Haichang Gao , Software School , Xidian University
56
DataBase System
Extended Operations
loan_number
L-170
L-230
L-260
customer_name
loan_number
branch_name amount
L-170
Downtown
3000 Jones
L-230
Redwood
4000 Smith
L-155
Perryridge
1700 Hayes
loan
borrower
Inner Join: loan
borrower
loan_number branch_Name amount customer_name
L-170
Downtown
3000 Jones
L-230
Redwood
4000 Smith
Left Outer Join
loan_number
: loan
borrower
branch_Name
amount customer_name
L-170
Downtown
3000 Jones
L-230
Redwood
4000 Smith
L-260
Perryridge
1700 NULL
Haichang Gao , Software School , Xidian University
57
DataBase System
Extended Operations
loan_number
L-170
L-230
L-260
customer_name
Loan_number
branch_name amount
L-170
Downtown
3000 Jones
L-230
Redwood
4000 Smith
L-155
Perryridge
1700 Hayes
loan
borrower
Right Outer Join
: loan
borrower
loan_number nranch_name
L-170
Downtown
L-230
Redwood
L-155
Full Outer Join
: loan
loan_number branch_name
L-170
Downtown
L-230
Redwood
L-260
Perryridge
L-155
amount customer_name
3000 Jones
4000 Smith
Hayes
borrower
amount customer_name
3000 Jones
4000 Smith
1700
Hayes
Haichang Gao , Software School , Xidian University
58
DataBase System
Homework 1
2.1
2.5
Haichang Gao , Software School , Xidian University
59
Database Systems
DataBase System
Unit 2 The Relational Model
Overview
Relational
Model
Relational
Algebra(关系代数)
Relational
Calculus(关系演算)
(教材5.1节)
Haichang Gao , Software School , Xidian University
61
DataBase System
Relational calculus
Relational Calculus Based on a branch of mathematical
logical called the predicate calculus(谓词演算).
A nonprocedural query language, where each query is of the
form :
It
{t | P(t) }
is the set of all tuples t such that predicate P is true for t
t
is a tuple variable, t[A] denotes the value of tuple t on
attribute A, called component of t on attribute A
R(t)
P
denotes that tuple t is in relation R
is a formula of the predicate calculus
Haichang Gao , Software School , Xidian University
62
DataBase System
Relational calculus
Tuple-relational-Calculus Formula
R(t)
is an atom formula(原子公式)
<t[A] θ t[B]> (or <t[A] θ c> or <c θ t[A] > ) is an atom
formula
θ is one of the comparison operators: (, , , , , )
if P1, P2 are formulae, then
P1, P1 P2, P1 P2, P1 P2 are formulae
x(P(x)), x (P(x)) are formulae
Sets of equivalent expressions:
P1 P2 (P1 P2 )
P1 P2 P1 P2
t ( P(t) ) t ( P(t) )
Haichang Gao , Software School , Xidian University
63
DataBase System
Relational calculus
Example
R
S
R1={ t | S(t) t[1]> ‘2’}
A
B
C
A
B
C
1
2
3
1
2
3
A
B
C
4
5
6
3
4
6
3
4
6
7
8
9
5
6
9
5
6
9
R2={ t | R(t) S(t)}
A
B
C
4
5
6
7
8
9
R3={ t | S(t) u(R(u) t[3]<u[2]) }
A
B
C
1
2
3
3
4
6
Haichang Gao , Software School , Xidian University
64
DataBase System
Relational calculus
Example
R
S
R4={ t | R(t) u(S(u)t[3]>u[1]) }
A
B
C
A
B
C
1
2
3
1
2
3
4
5
6
3
4
6
7
8
9
5
6
9
A
B
C
4
5
6
7
8
9
Haichang Gao , Software School , Xidian University
65
DataBase System
Relational calculus
Example
R
S
R4={ t | R(t) u(S(u)t[3]>u[1]) }
A
B
C
A
B
C
1
2
3
1
2
3
4
5
6
3
4
6
7
8
9
5
6
9
A
B
C
4
5
6
7
8
9
R5={ t | u v( R(u) S(v) u[1]>v[2] t[1]=u[2]
t[2]=v[3] t[3]=u[1] )}
Process:
1. Ensure the schema of result;
2. Obtain tuples according the
predicate calculus .
R.B
S.C R.A
5
8
3
3
4
7
8
6
7
8
9
7
Haichang Gao , Software School , Xidian University
66
DataBase System
Example Queries
Banking DataBase:
branch (branch_name, branch_city, assets)
customer (customer_name, customer_street, customer_city)
depositor (customer_name, account_number)
account (account_number, branch_name, balance)
borrower (customer_name, loan_number)
loan (loan_number, branch_name, amount)
Haichang Gao , Software School , Xidian University
67
DataBase System
Example Queries
Find the loan_number, branch_name, and amount for loans of
over $1200.
loan_number branch_name amount
L-11
L-14
L-15
L-16
L-17
L-23
L-93
Round Hill
Downtown
Perryridge
Perryridge
Downtown
Redwood
Mianus
loan
900
1500
1500
1300
1000
2000
500
{t | loan(t) t[amount] 1200}
Find the loan number for each loan of an amount greater than
$1200.
{t | s( loan(s) s[amount ] 1200
t [loan_number] = s [loan_number ] )}
Haichang Gao , Software School , Xidian University
68
DataBase System
Example Queries
customer_
account_
name
number
Hayes
A-102
Johnson
A-101
Johnson
A-201
Jones
A-217
Lindsay
A-222
Smith
A-215
Turner
A-305
depositor
account_
number
A-101
A-215
A-102
A-305
A-201
A-222
A-217
branch_
name
Downtown
Mianus
Perryridge
Round Hill
Brighton
Redwood
Brighton
account
balance
500
700
400
350
900
700
750
Find the customers name who have at least one deposit of a
balance greater than $700.
{t | u( depositor(u) v( account(v) u[account_number ]
= v[account_number] v[balance ] > ‘700’ )
t[customer_name] = u[customer_name ] ) }
Haichang Gao , Software School , Xidian University
69
DataBase System
Example Queries
u
customer_
account_
name
number
Hayes
A-102
Johnson
A-101
Johnson
A-201
Jones
A-217
Lindsay
A-222
Smith
A-215
Turner
A-305
depositor
v
customer_
account_
name
number
Hayes
A-102
Johnson
A-101
Johnson
A-201
Jones
A-217
Lindsay
A-222
Smith
A-215
Turner
A-305
depositor
Find all customers who have at least two deposits.
{t | u v( depositor(u) depositor(v) u[customer_name ] =
v[customer_name] u[account_number] v [account_number]
t[customer_name] = u [customer_name ] ) }
Haichang Gao , Software School , Xidian University
70
DataBase System
x( P(x) Q(x) )
Example Queries
x(P(x) Q(x) )
Find the largest account balance.
u
account_
number
A-101
A-215
A-102
…
branch_
name
Downtown
Mianus
Perryridge
……
account
A1:
balance
500
700
400
…
v
account_
number
branch_
name
A-101
A-215
A-102
…
Downtown
Mianus
Perryridge
……
balance
500
700
400
…
account
{t | u ( account(u) v( account(v) u[balance] ≥
v[balance]) t[balance] = u [balance] ) }
A2:
{t | u ( account(u) v( account(v) u[balance]
v[balance]) t[balance] = u [balance] ) }
Haichang Gao , Software School , Xidian University
71
customer_
account_
name
number
Hayes
A-102
Johnson
A-101
Johnson
A-201
Jones
A-217
Lindsay
A-222
Smith
A-215
Turner
A-305
depositor
account_
number
A-101
A-215
A-102
A-305
A-201
A-222
A-217
Example Queries
u
v
x
y
branch_
name
Downtown
Mianus
Perryridge
Round Hill
Brighton
Redwood
Brighton
account
DataBase System
balance
500
700
400
350
900
700
750
(exercise)Find all customers who have at least two deposits in
different branch.
{t | u x ( depositor(u) account(x) u[account_number] =
x[account_number] v y ( depositor(v) account(y)
v[account_number] = y[account_number] u[customer_name] =
v[customer_name] x[branch_name] y[branch_name]
u[account_number] v[account_number] ) t[customer_name] =
u [customer_name] ) }
Haichang Gao , Software School , Xidian University
72
DataBase System
Example Queries
Find all customers who have an account at all branches
located in Brooklyn city.
customer_ account_
name
number
Hayes
A-102
Johnson
A-101
Johnson
A-201
Jones
A-217
Lindsay
A-222
Smith
A-215
Turner
A-305
depositor
account_ branch_
number
name
A-101 Downtown
A-215 Mianus
A-102 Perryridge
A-305 Round Hill
A-201 Brighton
A-222 Redwood
A-217 Brighton
account
…
…
…
…
…
…
…
…
branch_name
Brighton
Downtown
Mianus
North Town
Perryridge
Pownal
Redwood
Round Hill
branch_city …
Brooklyn
Brooklyn
Horseneck
Rye
Horseneck
Bennington
PaloAlto
Horseneck
branch
…
…
…
…
…
…
…
…
{t | u ( depositor(u) v (branch(v) v[branch_city] = ‘Brooklyn’
w( account(w) w[account_number] = u[account_number]
w[branch_name] = v[branch_name]) ) t[customer_name] =
u[customer_name] ) }
Haichang Gao , Software School , Xidian University
73
DataBase System
x( P(x) Q(x) )
Example Queries
x(P(x) Q(x) )
Find all customers who have an account at all branches
located in Brooklyn city.
{t | u ( depositor(u) v (branch(v) v[branch_city] = ‘Brooklyn’
w( account(w) w[account_number] = u[account_number]
w[branch_name] = v[branch_name]) ) t[customer_name] = u
[customer_name] ) }
等价转换
{t | u ( depositor(u) v (branch(v) v[branch_city] = ‘Brooklyn’
w( account(w) w[account_number] = u[account_number]
w[branch_name] = v[branch_name]) ) t[customer_name] =
u[customer_name] ) }
Haichang Gao , Software School , Xidian University
74
DataBase System
Example Queries
Find the names of all customers having a loan from the
Perryridge branch, and the cities in which they live.
Banking DataBase:
branch (branch_name, branch_city, assets)
customer (customer_name, customer_street, customer_city)
depositor (customer_name, account_number)
account (account_number, branch_name, balance)
borrower (customer_name, loan_number)
loan (loan_number, branch_name, amount)
{t | u v w (customer(u) borrower(v) loan(w) u[customer_name]
= v[customer_name] v[loan_number] = w[loan_number]
w[branch_name] = ‘Perryridge’ t[customer_name] =
u[customer_name] t[customer_city] = u[customer_city] ) }
Haichang Gao , Software School , Xidian University
75
DataBase System
Safety of Expressions
It
is possible to write tuple calculus expressions
that generate infinite(无穷的) relations.
For example, {t | R(t) } results in an infinite relation if
the domain of any attribute of relation r is infinite
To guard against the problem, we restrict the set of
allowable expressions to safe expressions.
An expression {t | P(t)} in the tuple relational calculus is
safe if every component of t appears in one of the
relations, tuples, or constants that appear in P
IN
Relational Algebra Operations we defined is
safety.
Haichang Gao , Software School , Xidian University
76
DataBase System
Homework 2
5.1
5.6
Haichang Gao , Software School , Xidian University
77