Download presentation source - Courses

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Relational Algebra and Calculus
University of California, Berkeley
School of Information Management and
Systems
SIMS 257: Database Management
Spring 1998
SIMS 257: Database Management -- Ray Larson
Review
• Physical Database Design
• RAID
• Data Integrity
Spring 1998
SIMS 257: Database Management -- Ray Larson
Physical Design Decisions
• There are several critical decisions that will
affect the integrity and performance of the
system.
–
–
–
–
–
Spring 1998
Storage Format
Physical record composition
Data arrangement
Indexes
Query optimization and performance tuning
SIMS 257: Database Management -- Ray Larson
When to Index
• Tradeoff between time and space:
– Indexes permit faster processing for searching
– But they take up space for the index
– They also slow processing for insertions, deletions, and
updates, because both the table and the index must be
modified
• Thus they SHOULD be used for databases where
search is the main mode of interaction
• The might be skipped if high rates of updating and
insertions are expected
Spring 1998
SIMS 257: Database Management -- Ray Larson
When to Use Indexes
• Rules of thumb
– Indexes are most useful on larger tables
– Specify a unique index for the primary key of each
table
– Indexes are most useful for attributes used as search
criteria or for joining tables
– Indexes are useful if sorting is often done on the
attribute
– Most useful when there are many different values for an
attribute
– Some DBMS limit the number of indexes and the size
of the index key values
– Some indexed will not retrieve NULL values
Spring 1998
SIMS 257: Database Management -- Ray Larson
RAID Technology
One logical disk drive
Parallel
Writes
Disk 1
Disk 2
Disk 3
Disk 4
1
5
2
6
3
7
4
8
9
*
*
*
10
*
*
*
11
*
*
*
12
*
*
*
Stripe
Stripe
Stripe
Parallel
Reads
Spring 1998
SIMS 257: Database Management -- Ray Larson
Raid 0
One logical disk drive
Parallel
Writes
Disk 1
Disk 2
Disk 3
Disk 4
1
5
2
6
3
7
4
8
9
*
*
*
10
*
*
*
11
*
*
*
12
*
*
*
Stripe
Stripe
Stripe
Parallel
Reads
Spring 1998
SIMS 257: Database Management -- Ray Larson
RAID-1
Parallel
Writes
Disk 1
Disk 2
Disk 3
Disk 4
1
3
1
3
2
4
2
4
5
*
*
*
5
*
*
*
6
*
*
*
6
*
*
*
Stripe
Stripe
Stripe
Parallel
Reads
Spring 1998
SIMS 257: Database Management -- Ray Larson
RAID-2
Writes span all drives
Disk 1
Disk 2
Disk 3
Disk 4
1a
1b
ecc
ecc
Stripe
2a
3a
*
*
*
2b
3b
*
*
*
ecc ecc
ecc ecc
*
*
*
*
*
*
Stripe
Stripe
Reads span all drives
Spring 1998
SIMS 257: Database Management -- Ray Larson
RAID-3
Writes span all drives
Disk 1
Disk 2
Disk 3
Disk 4
1a
1b
1c
ecc
Stripe
2a
3a
*
*
*
2b
3b
*
*
*
2c
3c
*
*
*
ecc
ecc
*
*
*
Stripe
Stripe
Reads span all drives
Spring 1998
SIMS 257: Database Management -- Ray Larson
Raid-4
Parallel
Writes
Disk 1
Disk 2
Disk 3
Disk 4
1
2
3
ecc
Stripe
4
7
*
*
*
5
8
*
*
*
6
9
*
*
*
ecc
ecc
*
*
*
Stripe
Stripe
Parallel
Reads
Spring 1998
SIMS 257: Database Management -- Ray Larson
RAID-5
Parallel
Writes
Disk 1
Disk 2
Disk 3
Disk 4
1
5
2
6
3
7
4
8
9
*
*
ecc
10
*
*
ecc
11
*
*
ecc
12
*
*
ecc
Stripe
Stripe
Stripe
Parallel
Reads
Spring 1998
SIMS 257: Database Management -- Ray Larson
Integrity Constraints
• The constraints we wish to impose in order to
protect the database from becoming inconsistent.
• Five types
–
–
–
–
–
Spring 1998
Required data
attribute domain constraints
entity integrity
referential integrity
enterprise constraints
SIMS 257: Database Management -- Ray Larson
Today
•
•
•
•
Relational Operations
Relational Algebra
Relational Calculus
Introduction to SQL via Access
Spring 1998
SIMS 257: Database Management -- Ray Larson
Relational Algebra Operations
•
•
•
•
•
•
•
•
Select
Project
Product
Union
Intersect
Difference
Join
Divide
Spring 1998
SIMS 257: Database Management -- Ray Larson
Select
• Extracts specified tuples (rows) from a
specified relation (table).
Spring 1998
SIMS 257: Database Management -- Ray Larson
Project
• Extracts specified attributes(columns) from
a specified relation.
Spring 1998
SIMS 257: Database Management -- Ray Larson
Product
• Builds a relation from two specified
relations consisting of all possible
concatenated pairs of tuples, one from each
of the two relations.
Product
a
b
c
Spring 1998
x
y
a
a
b
b
c
c
x
y
x
y
x
y
SIMS 257: Database Management -- Ray Larson
Union
• Builds a relation consisting of all tuples
appearing in either or both of two specified
relations.
Spring 1998
SIMS 257: Database Management -- Ray Larson
Intersect
• Builds a relation consisting of all tuples
appearing in both of two specified relations
Spring 1998
SIMS 257: Database Management -- Ray Larson
Difference
• Builds a relation consisting of all tuples
appearing in first relation but not the
second.
Spring 1998
SIMS 257: Database Management -- Ray Larson
Join
• Builds a relation from two specified
relations consisting of all possible
concatenated pairs of, one from each of the
two relations, such that in each pair the two
tuples satisfy some condition.
A1 B1
A2 B1
A3 B2
Spring 1998
B1 C1
B2 C2
B3 C3
(Natural
or Inner)
Join
A1 B1 C1
A2 B1 C1
A3 B2 C2
SIMS 257: Database Management -- Ray Larson
Outer Join
• Outer Joins are similar to PRODUCT -- but
will leave NULLs for any row in the first
table with no corresponding rows in the
second.
Outer
Join
A1
A2
A3
A4
Spring 1998
B1
B1
B2
B7
B1 C1
B2 C2
B3 C3
SIMS 257: Database Management -- Ray Larson
A1 B1 C1
A2 B1 C1
A3 B2 C2
A4 * *
Divide
• Takes two relations, one binary and one
unary, and builds a relation consisting of all
values of one attribute of the binary relation
that match (in the other attribute) all values
in the unary relation.
Divide
a
a
a
b
c
Spring 1998
x
y
z
x
y
a
x
y
SIMS 257: Database Management -- Ray Larson
ER Diagram: Acme Widget Co.
Emp#
Wage
ISA
Hourly
Sales
Cust#
Customer
Employee
Sales-Rep
Writes
Orders
Invoice
Part#
Invoice# Quantity
Contains
Line-Item
Contains
Invoice# Rep#
Part#
Count
Price
Cust#
Spring 1998
Part
SIMS 257: Database Management -- Ray Larson
Employee
Firstname
Janet
Thomas
Charles
Roberto
Spring 1998
Middlename Birthdate
Mary
6/25/63
Frederick
8/4/70
Robert
9/23/61
Garcia
7/8/58
Address
City
Zip
234 State Berkeley 92654
12 Lambert Oakland 93587
44 Central Berkeley 94720
76 Highland Berkeley 94654
SIMS 257: Database Management -- Ray Larson
Part
Part #
1
2
3
4
5
6
7
8
9
Spring 1998
Name
Price
Count
Big blue widget
3.76
2
Small blue Widget
7.35
4
Tiny red widget
5.25
7
large red widget
157.23
23
double widget rack
10.44
12
Small green Widget
30.45
58
Big yellow widget
7.96
1
Tiny orange widget
81.75
42
Big purple widget
55.99
9
SIMS 257: Database Management -- Ray Larson
Sales-Rep
SSN
Rep # Sales
123-76-3423
1 $12,345.45
843-36-7659
2 $231,456.75
Hourly
SSN
Wage
342-88-7865
$12.75
486-87-6543
$20.50
Spring 1998
SIMS 257: Database Management -- Ray Larson
Customer
Cust #
COMPANY
STREET1
Integrated Standards
1 Ltd.
35 Broadway
STREET2
STATE ZIPCODE
New York
NY
02111
34 Bureaucracy Plaza Floors 1-172
3 Control Elevation
Cyber Assicates
Place
Center
Phildelphia
PA
03756
Cyberoid
NY
08645
35 Libra Plaza
Nashua
NH
09242
1 Broadway
Middletown
IN
32467
88 Oligopoly Place
3 Independence
Parkway
Sagrado
TX
78798
Rivendell
CA
93456
8 Little Mighty Micro
34 Last One Drive
Orinda
CA
94563
9 SportLine Ltd.
38 Champion Place
Compton
CA
95328
2 MegaInt Inc.
3 Cyber Associates
General
4 Consolidated
Consolidated
5 MultiCorp
Internet Behometh
6 Ltd.
Consolidated
7 Brands, Inc.
Spring 1998
Floor 12
CITY
Suite 882
SIMS 257: Database Management -- Ray Larson
Invoice
Invoice # Cust #
Rep #
93774
3
84747
4
88367
5
88647
9
776879
2
65689
6
Spring 1998
SIMS 257: Database Management -- Ray Larson
1
1
2
1
2
2
Line-Item
Invoice # Part #
93774
84747
88367
88647
776879
65689
93774
88367
Spring 1998
Quantity
3
10
23
1
75
2
4
3
22
5
76
12
23
10
34
2
SIMS 257: Database Management -- Ray Larson
Join Items
Part #
Invoice # Part #
Quantity
93774
3
10
84747
23
1
88367
75
2
88647
4
3
776879
22
5
65689
76
12
93774
23
10
88367
34
2
1
2
3
4
5
6
7
8
9
Cust #
COMPANY
STREET1
Integrated Standards
1 Ltd.
35 Broadway
Spring 1998
Rep #
3
4
5
9
2
6
1
1
2
1
2
2
STREET2
STATE
ZIPCODE
NY
02111
34 Bureaucracy Plaza Floors 1-172
3 Control Elevation
Cyber Assicates
Place
Center
Phildelphia
PA
03756
Cyberoid
NY
08645
35 Libra Plaza
Nashua
NH
09242
1 Broadway
Middletown
IN
32467
88 Oligopoly Place
3 Independence
Parkway
Sagrado
TX
78798
Rivendell
CA
93456
8 Little Mighty Micro
34 Last One Drive
Orinda
CA
94563
9 SportLine Ltd.
38 Champion Place
Compton
CA
95328
3 Cyber Associates
General
4 Consolidated
Consolidated
5 MultiCorp
Internet Behometh
6 Ltd.
Consolidated
7 Brands, Inc.
SIMS 257: Database Management -- Ray Larson
Floor 12
CITY
New York
2 MegaInt Inc.
Invoice # Cust #
93774
84747
88367
88647
776879
65689
Name
Price
Count
Big blue widget
3.76
2
Small blue Widget
7.35
4
Tiny red widget
5.25
7
large red widget
157.23
23
double widget rack
10.44
12
Small green Widget
30.45
58
Big yellow widget
7.96
1
Tiny orange widget
81.75
42
Big purple widget
55.99
9
Suite 882
Relational Algebra
• What is the name of the customer who
ordered Large Red Widgets?
–
–
–
–
–
Spring 1998
Select “large Red Widgets” from Part as temp1
Join temp1 with Line-item on Part # as temp2
Join temp2 with Invoice on Invoice # as temp3
Join temp3 with customer on cust # as temp4
Project Name from temp4
SIMS 257: Database Management -- Ray Larson
Relational Calculus
• Relational Algebra provides a set of explicit
operations (select, project, join, etc) that can
be used to build some desired relation from
the database.
• Relational Calculus provides a notation for
formulating the definition of that desired
relation in terms of the relations in the
database without explicitly stating the
operations to be performed
• SQL is based on the relational calculus.
Spring 1998
SIMS 257: Database Management -- Ray Larson
SQL
• Structured Query Language
• Database Definition and Querying
• Basic language is standardized across
relational DBMSs. Each system may have
proprietary extensions to standard.
• Relational Calculus combines Select,
Project and Join operations in a single
command. SELECT.
Spring 1998
SIMS 257: Database Management -- Ray Larson
SELECT
• Syntax:
– SELECT attr1, attr2,…, attr3 FROM rel1 r1,
rel2 r2,… rel3 r3 WHERE condition1 {AND |
OR} condition2 ORDER BY attr1, attr3
• Examples in Access...
Spring 1998
SIMS 257: Database Management -- Ray Larson
Related documents