Download relational algebra - Department of Computer Science

Document related concepts

History of algebra wikipedia , lookup

Structure (mathematical logic) wikipedia , lookup

Transcript
RELATIONAL ALGEBRA
Prof. Sin-Min LEE
Department of Computer
Science
Terminology:
table (relation)
row (tuple)
formally, a relation is a set of ordered n-
tuples.
Domain : set of data values from which an
attribute value can be drawn.
Eg.
set of department names
set of legal salaries
set of possible employee birth dates
If D1, D2, D3……. Dn are the domains of a
relation R.
Then, R  D1 x D2x D3 x ………..x Dn.
It is an abstract language. We use it to
express the set of operations that any
relational query language must perform.
Two types of operations:
 1.set-theoretic operations: tables are
essentially sets of rows
 2.native relational operations: focus on the
structure of the rows Query languages are
specialized languages for asking
questions,or queries,that involve the
data in database.
Database Scheme
A relational database scheme, or
schema, corresponds to a set of table
definitions.
Eg: product(p_id, name, category, description)
supply(p_id, s_id, qnty_per_month)
supplier(s_id, name, address, ph#)
* remember the difference between a DB instance
and a DB scheme.
Keys
The super key, candidate key and primary
key notations presented for ER model
apply for relational model also.
Query languages
procedural vs. non-procedural
commercial languages have some of
both
we will study:
relational algebra (which is procedural, i.e.
tells you how to process a query)
relational calculus (which is non-procedural
i.e. tells what you want)
Relational Algebra
Fundamental operators
 select
 project
 cartesian product
 union
 set difference
s
p


-
Other operators
 natural join
 set intersection
 division
JOIN (butterfly symbol)


A Simple DB
account
had
transaction
account ac# owner ss# balance
1
bob 123
1000
2
sue
456
2000
3
jane 789
3000
transaction t# ac# type amount outcome
1
1
W
1500
bounced
2
2
D
1000
ok
3
1
W
100
ok
4
3
D
500
ok
5
2
W
200
ok
date
5/1/98
5/2/98
5/4/98
5/7/98
5/9/98
Select
s balance>=1500 account
eg:
result : ac# owner ss# balance
2
sue
456 2000
3
jane
789 3000
Project
eg:
π
result:
owner, ss#
account
owner
bob
sue
jane
ss#
123
456
789
Cartesian product
eg: account
 transaction
this will have 15 rows like the ones shown below:
ac# owner ss# balance t# type amount outcome date
1
bob
123 1000
1 W 1500
bounced 5/1/98
2
sue
456 2000
2 D
1000
ok
5/2/98
……………
Composing operations
eg: “show all transactions done by account owner
Bob”.
σ account.ano= transaction.ano((s owner=“Bob” account) 
transaction)
Natural Join
- combines σ, π, 
- very commonly used
Natural Join forms the cross product of its two
arguments, does a selection to enforce
equality of columns with the same name and
removes duplicate columns.
Eg: “show all transactions done by account
owner Bob”
σ
owner=“Bob”
(account JOIN transaction)
Rename operation
What if you need to access the same relation
twice in a query?
eg. person(ss#, name, mother_ss#, father_ss#)
“Find the name of Bob’s mother” needs the
“person” table to be accessed twice.
The operation ρ x (r) evaluates to a second logical
copy of relation r renamed to x.
Rename operation (contd)
eg:
π
mother.name
(ρ
(
mother
(person))
JOIN
mother.ss# = person.mother_ss#
(s name=“Bob” (person)))
Additional Operations
Additional Operations are those that can be
expressed in terms of other operations
.
r
 Set Intersection
eg.:
r a
1
2
3
b
a
b
d
rs=
s a
1
2
3
r-(r-s)
b
a
c
d
s
rs= a b
1 a
3 d
Additional Operations(cntd.)
 Division 
useful for “for all” queries
Definition: Let r(R) and s(S), where R & S are sets of
attributes, be relations, where S is a subset of R. The
relation r  s has scheme R-S. The tuples in r  s
consist of the R-S part of the tuples of r such that
some tuple tr in r with the those R-S attribute values
matches every tuple in s.
 Can also be defined in terms of relational
algebra.
Additional Operations(cntd.)
 Assignment operation
Sometimes it is convenient to write a relational
algebra expression as a sequence of steps rather
than one large expression. To do this, you can
use assignment:
relname
expression
eg.:
temp
π pno (part)
bigsuppliers
supply  temp