Chapter 5 Integrity Constraints
5.1 Domain Constraints
5.2 Referential Integrity
5.3 Assertions
5.4 Triggers
5.5 Functional Dependencies
5.6 Exercises
The term integrity refers to the accuracy or
correctness of data in the database. Integrity
constraints provide a means of ensuring that changes
made to the database by authorized users do not result
in a loss of data consistency. Thus, integrity
constraints guard against accidental damage to the
database. A given database might be subject to any
number of integrity constraints, of arbitrary complexity.
We classify integrity constraints into four broad categories:
type (domain) constraints, attribute constraints, relvar
constraints, and database constraints.
5.1 Domain Constraints
We have seen that a domain of possible values must
be associated with every attribute. Domain constraints
are the most elementary form of integrity constraint.
They are tested early by the system whenever a new
data item is entered into the database.
Defined domain:
create domain Dollars numeric(12,2)
create domain Pounds numeric(12,2)
cast r.A as Pounds
drop domain
alter domain
Essentially, a domain constraint is (or is logically equivalent to)
just an enumeration of the legal values of the domain.
Examples: (check clause)
① A check clause can ensure that an hourly wage domain allows
only values greater than a specified value.
create domain hourly-wage numeric(5,2)
constraint wage-value-test
check (value>=4.00)
② The check clause can also be used to restrict a domain so that it
does not contain any null values.
create domain account-number char(10)
constraint account-number-null-test
check (value not null)
③ The domain can be restricted to contain only a
specified set of values by using the in clause.
create domain account-type char(10)
constraint account-type-test
check (value in ('checking', 'saving'))
④ A check clause can also contain a subquery:
check (branch-name in (select branch-name from branch))
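The effect of a check clause can be tried out with a minimal sketch in Python's built-in sqlite3 module. This is an illustration under stated assumptions, not the slides' own dialect: SQLite has no create domain statement, so the checks are attached directly to columns, the table name employee_pay is hypothetical, and underscores replace the hyphens used in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite has no "create domain", so each domain's check clause is
# attached to a column instead. Table and column names are hypothetical.
conn.execute("""
    create table employee_pay (
        account_type text not null
            check (account_type in ('checking', 'saving')),
        hourly_wage  numeric
            check (hourly_wage >= 4.00)
    )
""")

def try_insert(account_type, hourly_wage):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("insert into employee_pay values (?, ?)",
                     (account_type, hourly_wage))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert("checking", 5.50)
rejected_wage = try_insert("saving", 3.25)     # violates hourly_wage >= 4.00
rejected_type = try_insert("brokerage", 9.00)  # not in the enumerated set
```

The constraint is tested at the moment the value is entered, which is the behaviour described for domain constraints at the start of this section.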
5.2 Referential Integrity
1. Basic Concepts
Foreign keys: loosely speaking, a foreign key is a set of attributes
of one relvar R2 whose values are required to match values of some
candidate key of some relvar R1.
Let R2 be a relvar. Then a foreign key in R2 is a set of attributes of
R2, say FK, such that:
a. there exists a relvar R1 (R1 and R2 not necessarily distinct) with
a candidate key CK, and
b. for all time, each value of FK in the current value of R2 is
identical to the value of CK in some tuple in the current value of
R1.
Schema diagram for the banking database:
branch (branch-name, branch-city, assets)
account (account-number, branch-name, balance)
depositor (customer-name, account-number)
loan (loan-number, branch-name, amount)
borrower (customer-name, loan-number)
customer (customer-name, customer-street, customer-city)
Points arising:
① The definition requires every value of a given foreign key to
appear as a value of the matching candidate key. Note, however,
that the converse is not a requirement.
② A foreign key is simple or composite according as the candidate
key it matches is simple or composite.
③ Each attribute of a given foreign key must have the same name
and type as the corresponding component of the matching
candidate key.
④ A foreign key value represents a reference to the tuple containing
the matching candidate key value(the referenced tuple)
ⅰ The problem of ensuring that the database does not
include any invalid foreign key values is therefore
known as the referential integrity problem.
ⅱ The constraint that values of a given foreign key must
match values of the corresponding candidate key is
known as a referential constraint.
ⅲ We refer to the relvar that contains the foreign key as
the referencing relvar and the relvar that contains the
corresponding candidate key as the referenced relvar.
⑤ Referential diagrams: consider depositor and account.
We can represent the referential constraints that exist in
that database by means of the following referential
diagram:
depositor --account-number--> account
⑥ A given relvar can of course be both referenced and
referencing, as in the case of R2 here:
R3 → R2 → R1
Referential path: let relvars Rn, R(n-1), ……, R2, R1 be
such that there is a referential constraint from Rn to R(n-1),
a referential constraint from R(n-1) to R(n-2), ……, and a
referential constraint from R2 to R1:
Rn → R(n-1) → R(n-2) → …… → R2 → R1
Then the chain of arrows from Rn to R1 represents a
referential path from Rn to R1.
Referential cycle: a referential path from a relvar Rn back to itself:
Rn → R(n-1) → R(n-2) → …… → R2 → R1 → Rn
2. Referential integrity
The database must not contain any unmatched foreign key values.
The term “unmatched foreign key value” here simply means a
foreign key value in some referencing relvar for which there does
not exist a matching value of the relevant candidate key in the
relevant referenced relvar.
Here then is the syntax for defining a foreign key:
FOREIGN KEY {<item commalist>} REFERENCES <relvar name>
ⅰ Each <item> is either an <attribute name> of the referencing relvar
or an expression of the form
RENAME <attribute name> AS <attribute name>
ⅱ The <relvar name> identifies the referenced relvar.
3. Database Modification
Database modification can cause violations of referential integrity.
The constraint to be preserved is
Π_FK(R2) ⊆ Π_CK(R1)
① Insert: if a tuple t2 is inserted into R2, the system must ensure
that there is a tuple t1 in R1 such that t1[CK] = t2[FK]. That is,
t2[FK] ∈ Π_CK(R1)
② Delete: if a tuple t1 is deleted from R1, the system must compute
the set of tuples in R2 that reference t1: σ_FK=t1[CK](R2). If this
set is not empty, either the delete is rejected as an error or the
tuples that reference t1 must themselves be deleted.
③ Update: we must consider two cases: updates to the
referencing relation (R2), and updates to the referenced relation (R1).
If a tuple t2 is updated in relation R2, and the update modifies
values for the foreign key FK, then a test similar to the insert case is
made. Let t2' denote the new value of tuple t2. The system must
ensure that
t2'[FK] ∈ Π_CK(R1)

If a tuple t1 is updated in relation R1, and the update modifies
values for the candidate key CK, then a test similar to the delete case
is made. The system must compute, using the old value of t1,
σ_FK=t1[CK](R2)

If this set is not empty, the update is rejected as an error, or the
update is cascaded in a manner similar to delete.
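The insert and delete tests can be sketched in a few lines of Python. This is a minimal illustration that assumes relations are held as lists of dictionaries; the helper names and the sample tuples are hypothetical and are not part of the slides.

```python
def project(relation, attrs):
    """Projection of a relation on attrs, as a set of value tuples."""
    return {tuple(t[a] for a in attrs) for t in relation}

def insert_allowed(t2, fk, r1, ck):
    """Insert test: t2[FK] must appear among the CK values of R1."""
    return tuple(t2[a] for a in fk) in project(r1, ck)

def referencing_tuples(t1, ck, r2, fk):
    """Delete/update test: the tuples of R2 whose FK value equals t1[CK]."""
    key = tuple(t1[a] for a in ck)
    return [t for t in r2 if tuple(t[a] for a in fk) == key]

# Hypothetical sample data: branch is R1, account is R2.
branch = [{"branch-name": "Perryridge"}, {"branch-name": "Downtown"}]
account = [{"account-number": "A-102", "branch-name": "Perryridge"}]

new_account = {"account-number": "A-305", "branch-name": "Mianus"}
can_insert = insert_allowed(new_account, ["branch-name"],
                            branch, ["branch-name"])
blocking = referencing_tuples(branch[0], ["branch-name"],
                              account, ["branch-name"])
```

Here can_insert is False because no branch tuple carries the value Mianus, and blocking is non-empty, so deleting the Perryridge branch would have to be rejected or cascaded.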
4. Referential Integrity in SQL
Primary keys, candidate keys, and foreign keys can be specified as
part of the SQL create table statement (primary key clause, unique
clause, foreign key clause).
Example:
create table account
(account-number char(10) not null,
branch-name char(10),
balance integer,
primary key (account-number),
foreign key (branch-name) references branch,
check (balance >= 0))
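A hedged way to watch this table definition reject an unmatched foreign key value is to run an equivalent schema in SQLite from Python. The sketch assumes SQLite's dialect: foreign keys are only enforced after pragma foreign_keys = on, underscores replace hyphens in names, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("pragma foreign_keys = on")
conn.execute("create table branch (branch_name text primary key)")
conn.execute("""
    create table account (
        account_number text not null,
        branch_name    text,
        balance        integer,
        primary key (account_number),
        foreign key (branch_name) references branch,
        check (balance >= 0)
    )
""")
conn.execute("insert into branch values ('Perryridge')")
conn.execute("insert into account values ('A-102', 'Perryridge', 400)")

try:
    # 'Nowhere' matches no branch tuple, so this is an unmatched FK value.
    conn.execute("insert into account values ('A-999', 'Nowhere', 100)")
    unmatched_insert_rejected = False
except sqlite3.IntegrityError:
    unmatched_insert_rejected = True

account_count = conn.execute("select count(*) from account").fetchone()[0]
```

Only the account that references an existing branch is stored; the second insert fails the referential constraint.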
Referential actions:
Problem:
delete from branch
where branch-name = 'Perryridge'
Assume this delete does exactly what it says, i.e., it deletes the
branch tuple for branch-name Perryridge, no more and no less.
Assume too that the database does include some accounts for
branch-name Perryridge, and the application does not go on to
delete those accounts. When the system checks the referential
constraint from account to branch, then, it will find a violation, and
an error will occur.
Solution:
The obvious compensating action would be for the system to
delete the accounts for branch-name Perryridge “automatically”. We
can achieve this effect by extending the foreign key definition as
follows:
create table account
(……
foreign key (branch-name) references branch
on delete cascade
on update cascade
……)
cascade:
The specification on delete cascade defines a delete
rule for this particular foreign key, and the specification
cascade is the referential action for that delete rule.
restrict:
In the case at hand, restrict would mean that delete
operations are “restricted” to the case where there are
no matching accounts.
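The difference between the two referential actions can be observed with a small sqlite3 sketch. It assumes SQLite's implementation of on delete cascade and on delete restrict; the helper functions build and delete_branch and the sample rows are hypothetical.

```python
import sqlite3

def build(delete_rule):
    """Create branch/account with the given delete rule on the foreign key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("pragma foreign_keys = on")
    conn.execute("create table branch (branch_name text primary key)")
    conn.execute(f"""
        create table account (
            account_number text primary key,
            branch_name    text references branch on delete {delete_rule}
        )
    """)
    conn.execute("insert into branch values ('Perryridge')")
    conn.execute("insert into account values ('A-102', 'Perryridge')")
    return conn

def delete_branch(conn):
    """Try the delete; report (delete succeeded, accounts remaining)."""
    try:
        conn.execute("delete from branch where branch_name = 'Perryridge'")
        succeeded = True
    except sqlite3.IntegrityError:
        succeeded = False
    remaining = conn.execute("select count(*) from account").fetchone()[0]
    return succeeded, remaining

cascade_result = delete_branch(build("cascade"))    # delete propagates to account
restrict_result = delete_branch(build("restrict"))  # delete is refused
```

With cascade the matching account disappears along with the branch; with restrict the delete fails and the account survives.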
Omitting a referential action for a particular foreign
key is equivalent to specifying the “action” no
action, which means what it says: the delete is
performed exactly as requested, no more and no less.
Dealing with null values:
① All attributes of the primary key are implicitly declared to be not
null.
② Attributes of a unique declaration are allowed to be null, provided
that they have not otherwise been declared to be nonnull.
③ Attributes of foreign keys are allowed to be null, provided that
they have not otherwise been declared to be nonnull.
5.3 Assertions
Assertions are general constraints. An assertion is a
predicate expressing a condition that we wish the
database always to satisfy. Domain constraints and
referential-integrity constraints are special forms of
assertions.
Assertions are defined by means of the create assertion
syntax:
create assertion <assertion-name> check <predicate>
Examples:
① The sum of all loan amounts for each branch must be less than
the sum of all account balances at the branch.
create assertion sum-constraint check
(not exists (select * from branch
where (select sum(amount) from loan
where loan.branch-name = branch.branch-name)
>= (select sum(balance) from account
where account.branch-name = branch.branch-name)))
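Few systems implement create assertion, so one way to experiment with the predicate is to run its inner query and look for violating branches. The sketch below does this in SQLite from Python; it is an assumption-laden illustration in which the sample rows are invented and underscores replace hyphens in names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table branch  (branch_name text primary key);
    create table loan    (loan_number text, branch_name text, amount integer);
    create table account (account_number text, branch_name text, balance integer);
    insert into branch  values ('Downtown'), ('Perryridge');
    insert into loan    values ('L-17', 'Downtown', 1000),
                               ('L-15', 'Perryridge', 1500);
    insert into account values ('A-101', 'Downtown', 5000),
                               ('A-102', 'Perryridge', 400);
""")

# Branches for which total loans >= total balances, i.e. the tuples
# whose existence the assertion forbids.
VIOLATIONS = """
    select branch_name from branch
    where (select sum(amount)  from loan
           where loan.branch_name = branch.branch_name)
       >= (select sum(balance) from account
           where account.branch_name = branch.branch_name)
"""
violating = [row[0] for row in conn.execute(VIOLATIONS)]
assertion_holds = not violating
```

The assertion holds exactly when the query returns no rows, which mirrors the not exists wrapper in the assertion text.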
② Every loan has at least one customer who maintains an account
with a minimum balance of $1000.00
create assertion balance-constraint check
( not exists ( select * from loan
where not exists (select *
from borrower, depositor, account
where loan.loan-number=borrower.loan-number
and borrower.customer-name=depositor.customer-name
and depositor.account-number=account.account-number
and account.balance>=1000)))
5.4 Triggers
A trigger is a statement that is executed automatically
by the system as a side effect of a modification to the
database.
To design a trigger mechanism, we must meet two
requirements:
① Specify the conditions under which the trigger is to be executed
② Specify the actions to be taken when the trigger executes.
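Example (overdrafts): suppose that, instead of allowing a negative
balance, the bank handles an overdraft on an account tuple t by
setting the balance to zero and creating a loan for the overdrawn
amount, using the account number as the loan number.
Steps:
① Insert a new tuple s in the loan relation with
s[branch-name] = t[branch-name]
s[loan-number] = t[account-number]
s[amount] = -t[balance]
② Insert a new tuple u in the borrower relation with
u[customer-name] = “Jones”
u[loan-number] = t[account-number]
③ Set t[balance] to 0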
Using SQL to write the account-overdraft trigger:
create trigger overdraft-trigger after update on account
referencing new row as nrow
for each row
when nrow.balance < 0
begin atomic
insert into borrower
(select customer-name, account-number
from depositor
where nrow.account-number = depositor.account-number);
insert into loan values
(nrow.account-number, nrow.branch-name, -nrow.balance);
update account set balance = 0
where account.account-number = nrow.account-number
end
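The overdraft trigger can be exercised end to end with a rough SQLite equivalent run from Python. This is a sketch under assumptions: SQLite writes new.column where the slide uses nrow, has no begin atomic, and needs underscores in place of hyphens; the sample account and depositor rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table account   (account_number text primary key,
                            branch_name text, balance integer);
    create table depositor (customer_name text, account_number text);
    create table loan      (loan_number text, branch_name text, amount integer);
    create table borrower  (customer_name text, loan_number text);

    create trigger overdraft_trigger after update on account
    for each row when new.balance < 0
    begin
        insert into borrower
            select customer_name, account_number from depositor
            where depositor.account_number = new.account_number;
        insert into loan
            values (new.account_number, new.branch_name, -new.balance);
        update account set balance = 0
            where account_number = new.account_number;
    end;

    insert into account   values ('A-102', 'Perryridge', 400);
    insert into depositor values ('Jones', 'A-102');
    -- Withdraw 500 from an account holding 400: an overdraft of 100.
    update account set balance = balance - 500 where account_number = 'A-102';
""")

balance = conn.execute("select balance from account").fetchone()[0]
loans = conn.execute("select * from loan").fetchall()
borrowers = conn.execute("select * from borrower").fetchall()
```

After the update the balance is reset to zero and the overdrawn amount reappears as a loan numbered after the account, with the depositor recorded as borrower.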
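The same trigger written in the style of Transact-SQL, where the
pseudo-table inserted holds the new values of the updated rows:
create trigger overdraft-trigger on account
for update
as
if exists (select * from inserted where balance < 0)
begin
insert into borrower
(select customer-name, depositor.account-number
from depositor, inserted
where inserted.account-number = depositor.account-number
and inserted.balance < 0);
insert into loan
(select account-number, branch-name, -balance
from inserted
where balance < 0);
update account set balance = 0
from account, inserted
where account.account-number = inserted.account-number
and inserted.balance < 0
end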
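A trigger can also run before the event and adjust the new values,
for example replacing a blank phone number with null:
create trigger setnull-trigger before update on r
referencing new row as nrow
for each row
when nrow.phone-number = ''
set nrow.phone-number = null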
5.5 Functional Dependencies
5.5.1 Introduction
Basically, a functional dependency is a many-to-one
relationship from one set of attributes to another
within a given relvar.
Example:
There is a functional dependency from the set of attributes
{branch-name} to the set of attributes {assets}.
a. For any given value of attribute branch-name, there
is just one corresponding value of attribute assets, but
b. Many distinct values of attribute branch-name can
have the same corresponding value of attribute assets.
5.5.2 Basic Concepts
It is very important in this area, as in so many
others, to distinguish clearly between (a) the value of
a given relvar at a given point in time and (b) the set of
all possible values that the given relvar might assume
at different times.
Here then is the definition for case (a):
Let r be a relation, and let X and Y be arbitrary subsets
of the set of attributes of r. Then we say that Y is
functionally dependent on X, in symbols X → Y, if and
only if each X value in r has associated with it precisely
one Y value in r.
Consider the branch relvar. A possible value for relvar branch is
shown in Fig A.
branch-name   branch-city   assets
Downtown      Brooklyn       9000000
Redwood       Palo Alto      4000000
Perryridge    Horseneck      3000000
Mianus        Horseneck     11000000
Fig A
Example:
The relation shown in Fig A satisfies the FD
{branch-name} → {branch-city}
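Whether a given relation value satisfies an FD can be checked mechanically. The following Python sketch assumes a relation is a list of dictionaries; the function name satisfies_fd is hypothetical, and the data is the relation value of Fig A.

```python
def satisfies_fd(relation, x, y):
    """True if every X value in the relation is paired with exactly one Y value."""
    seen = {}
    for t in relation:
        x_val = tuple(t[a] for a in x)
        y_val = tuple(t[a] for a in y)
        # setdefault records y_val the first time x_val is seen and
        # returns the recorded value on every later occurrence.
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True

# The relation value of Fig A.
fig_a = [
    {"branch-name": "Downtown",   "branch-city": "Brooklyn",  "assets": 9000000},
    {"branch-name": "Redwood",    "branch-city": "Palo Alto", "assets": 4000000},
    {"branch-name": "Perryridge", "branch-city": "Horseneck", "assets": 3000000},
    {"branch-name": "Mianus",     "branch-city": "Horseneck", "assets": 11000000},
]

name_determines_city = satisfies_fd(fig_a, ["branch-name"], ["branch-city"])
city_determines_name = satisfies_fd(fig_a, ["branch-city"], ["branch-name"])
```

The first check succeeds, while the second fails because Horseneck is paired with two different branch names. A check like this only speaks about one relation value, which is case (a) of the definition.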
Here then is the definition for case (b):
Let r be a relation variable, and let X and Y be arbitrary
subsets of the set of attributes of r. Then we say that Y is
functionally dependent on X, in symbols X → Y, if and
only if, in every possible legal value of r, each X value
has associated with it precisely one Y value. In other
words, whenever two tuples agree on their X value, they also
agree on their Y value.
Examples:
branch-name → branch-city holds for all time.
assets → branch-name is satisfied by the value in Fig A, where all
assets values are distinct, but it need not hold in every legal value
of branch.
We now observe that if X is a candidate key of relvar r,
then all attributes Y of relvar r must be functionally
dependent on X.
Example: for the relvar customer,
{customer-name} → {customer-name, customer-street,
customer-city}
In fact, if relvar r satisfies the FD A → B and A is not a
candidate key, then r will involve some redundancy.
Problem:
Now, even if we restrict our attention to FDs that hold for all time,
the complete set of FDs for a given relvar can still be very large.
Solution:
Given a particular set S of FDs, therefore, it is desirable to find
some other set T that is much smaller than S and has the property
that every FD in S is implied by the FDs in T. If such a set T can be
found, it is sufficient that the DBMS enforce just the FDs in T, and
the FDs in S will then be enforced automatically.
5.5.3 Trivial and Nontrivial Dependencies
One obvious way to reduce the size of the set of FDs
we need to deal with is to eliminate the trivial
dependencies. A dependency is “trivial” if it cannot
possibly fail to be satisfied.
Example: branch-name → branch-name
In fact, an FD is trivial if and only if the right-hand side
is a subset (not necessarily a proper subset) of the
left-hand side.
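The subset test for triviality is a one-liner. In this minimal sketch the function name is_trivial is hypothetical and attribute sets are passed as Python collections of attribute names.

```python
def is_trivial(x, y):
    """An FD X -> Y is trivial exactly when Y is a subset of X."""
    return set(y) <= set(x)

same_attribute = is_trivial({"branch-name"}, {"branch-name"})
part_of_left_side = is_trivial({"branch-name", "assets"}, {"assets"})
nontrivial = is_trivial({"branch-name"}, {"branch-city"})
```

The first two calls return True because the right-hand side is contained in the left-hand side; the third returns False.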
5.5.4 Closure of a Set of Functional Dependencies
We shall see that, given a set F of functional
dependencies, we can prove that certain other functional
dependencies hold. We say that such functional
dependencies are logically implied by F.
Example:
Given a relation schema R={A,B,C,G,H,I} and the set of functional
dependencies
A → B
A → C
CG → H
CG → I
B → H
The functional dependency A → H is logically implied.
F+:
Let F be a set of functional dependencies. The closure
of F is the set of all functional dependencies logically
implied by F. We denote the closure of F by F+.
Armstrong’s axioms:
We adopt the convention of using Greek letters (α, β, γ, …)
for sets of attributes, and uppercase Roman letters from
the beginning of the alphabet for individual attributes.
We use αβ to denote α ∪ β.
① Reflexivity rule: if α is a set of attributes and β ⊆ α,
then α → β holds.
② Augmentation rule: if α → β holds and γ is a set of
attributes, then γα → γβ holds.
③ Transitivity rule: if α → β holds and β → γ holds, then
α → γ holds.
These rules are sound, because they do not generate any
incorrect functional dependencies. The rules are complete, because,
for a given set F of functional dependencies, they allow us to
generate all of F+.
The following additional rules can be derived from Armstrong’s axioms:
④ Union rule: if α → β holds and α → γ holds, then α → βγ
holds.
⑤ Decomposition rule: if α → βγ holds, then α → β holds
and α → γ holds.
⑥ Pseudotransitivity rule: if α → β holds and γβ → δ
holds, then αγ → δ holds.
Example:
Given a relation schema R={A,B,C,G,H,I} and the set of functional
dependencies
A → B
A → C
CG → H
CG → I
B → H
① Use these rules to show that A → H, CG → HI, and AG → I:
A → H: A → B and B → H (transitivity rule)
CG → HI: CG → H and CG → I (union rule)
AG → I: A → C and CG → I (pseudotransitivity rule)
② Suppose we are given a relation schema R=(A,B,C,D,E,F) and the
set of functional dependencies
A → BC
B → E
CD → EF
Prove that AD → F.
A → BC, so A → C (decomposition rule)
A → C, so AD → CD (augmentation rule)
AD → CD and CD → EF, so AD → EF (transitivity rule)
AD → EF, so AD → F (decomposition rule)
5.5.5 Closure of attribute Sets
Compute a certain subset of the closure:
Given a relvar R, a set Z of attributes of R, and a set S
of FDs that hold for R, we can determine the set of all
attributes of R that are functionally dependent on Z —the
so-called closure Z+ of Z under S.
Algorithm:
closure[Z,S] := Z;
do “forever”;
   for each FD X → Y in S
   do;
      if X ⊆ closure[Z,S]
      then closure[Z,S] := closure[Z,S] ∪ Y;
   end;
   if closure[Z,S] did not change on this iteration
   then leave loop;
end;
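The pseudocode translates almost line for line into Python. This sketch assumes single-letter attribute names, so that a string such as "BC" stands for the set {B, C}; the function name attribute_closure is hypothetical, and the FDs in the usage lines are A → BC, E → CF, B → E, and CD → EF.

```python
def attribute_closure(z, fds):
    """Closure Z+ of attribute set z under fds, a list of (lhs, rhs) pairs."""
    closure = set(z)
    changed = True
    while changed:                      # the "do forever" loop
        changed = False
        for lhs, rhs in fds:            # one pass over every FD X -> Y
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)     # X is covered, so add Y
                changed = True
    return closure                      # a full pass changed nothing

# Each string is read as a set of single-letter attributes.
fds = [("A", "BC"), ("E", "CF"), ("B", "E"), ("CD", "EF")]
result = attribute_closure("AB", fds)
```

For these FDs the closure of {A,B} comes out as {A,B,C,E,F}; D is never added because no FD with a covered left-hand side mentions it on the right.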
Example:
Suppose we are given relvar R with attributes A,B,C,D,E,F and FDs
A → BC
E → CF
B → E
CD → EF
We now compute the closure {A,B}+ of the set of attributes {A,B}
under this set of FDs.
1. We initialize the result closure[Z,S] to {A,B}.
2. Go round the inner loop four times, once for each of the given
FDs. On the first iteration (for the FD A → BC) we add C to
closure[Z,S], which now has the value {A,B,C}.
3. On the second iteration (for the FD E → CF) closure[Z,S]
remains unchanged.
4. On the third iteration (for the FD B → E) we add E to
closure[Z,S], which now has the value {A,B,C,E}.
5. On the fourth iteration (for the FD CD → EF) closure[Z,S]
remains unchanged.
6. Go round the inner loop four times again. On the first iteration,
the result does not change; on the second, it expands to {A,B,C,E,F};
on the third and fourth, it does not change.
7. Go round the inner loop four times again. closure[Z,S] does not
change, and so the whole process terminates with {A,B}+
= {A,B,C,E,F}.
An important corollary of the foregoing is as follows:
given a set S of FDs, we can easily tell whether a specific
FD X → Y follows from S, because that FD will follow if
and only if Y is a subset of the closure X+ of X under S.
Another important corollary is the following. The
superkeys for a given relvar R are precisely those
subsets K of the attributes of R such that the FD K → A
holds true for every attribute A of R.
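Both corollaries reduce to a closure computation, as the following Python sketch suggests. It assumes single-letter attribute names written as strings; the function names are hypothetical, and the FDs used are A → B, A → C, CG → H, CG → I, and B → H over the schema {A,B,C,G,H,I}.

```python
def attribute_closure(z, fds):
    """Closure of attribute set z under fds, a list of (lhs, rhs) pairs."""
    closure = set(z)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure

def implies(fds, x, y):
    """X -> Y follows from fds if and only if Y is a subset of X+."""
    return set(y) <= attribute_closure(x, fds)

def is_superkey(fds, k, all_attrs):
    """K is a superkey if and only if K+ contains every attribute."""
    return attribute_closure(k, fds) >= set(all_attrs)

schema = "ABCGHI"
fds = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

a_implies_h = implies(fds, "A", "H")
cg_implies_hi = implies(fds, "CG", "HI")
ag_is_superkey = is_superkey(fds, "AG", schema)
a_is_superkey = is_superkey(fds, "A", schema)
```

The set {A,G} determines every attribute and is therefore a superkey, whereas {A} alone is not, since nothing determines G.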
5.5.6 Irreducible Sets of Dependencies
Let S1 and S2 be two sets of FDs. If every FD implied
by S1 is implied by S2—i.e. if S1+ is a subset of S2+ —we
say that S2 is a cover for S1. What this means is that if
the DBMS enforces the FDs in S2, then it will
automatically be enforcing the FDs in S1.
Now we define a set S of FDs to be irreducible if and only if it
satisfies the following three properties:
1. The right-hand side of every FD in S involves just one
attribute.
2. The left-hand side of every FD in S is irreducible in
turn, meaning that no attribute can be discarded from
the determinant without changing the closure S+. We will
say that such an FD is left-irreducible.
3. No FD in S can be discarded from S without changing
the closure S+.
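Examples of sets of FDs that are not irreducible:
① S1 = { A → BC, A → D, A → E }
(the right-hand side of the first FD is not a single attribute)
② S2 = { AB → D, A → B, A → D, A → C }
(the first FD is not left-irreducible: B can be discarded from its
left-hand side without changing the closure)
③ S3 = { A → A, A → B, A → C, A → D }
(the first FD can be discarded without changing the closure)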
Algorithm, illustrated by an example:
Relvar R{A,B,C,D,E,G} satisfies the following FDs:
R =
AB → C
C → A
BC → D
ACD → B
D → EG
BE → C
CG → BD
CE → AG
Find an irreducible equivalent for this set of FDs.
① Use the decomposition rule to rewrite the FDs so that each
has a singleton right-hand side:
R =
AB → C
C → A
BC → D
ACD → B
D → E
D → G
BE → C
CG → B
CG → D
CE → A
CE → G
② For each FD f in R, compute the closure of the left-hand side of f
under the other FDs in R. If that closure contains the right-hand
side of f, deleting f from R has no effect on the closure R+, and we
delete f from R.
AB → C: {AB}+ = {AB}; C ∉ {AB}; keep
C → A: {C}+ = {C}; A ∉ {C}; keep
BC → D: {BC}+ = {BCA}; D ∉ {BCA}; keep
ACD → B: {ACD}+ = {ABCDEG}; B ∈ {ABCDEG}; delete ACD → B
D → E: {D}+ = {DG}; E ∉ {DG}; keep
D → G: {D}+ = {DE}; G ∉ {DE}; keep
BE → C: {BE}+ = {BE}; C ∉ {BE}; keep
CG → B: {CG}+ = {CGDAE}; B ∉ {CGDAE}; keep
CG → D: {CG}+ = {CGBADE}; D ∈ {CGBADE}; delete CG → D
CE → A: {CE}+ = {CEGABD}; A ∈ {CEGABD}; delete CE → A
CE → G: {CE}+ = {CEA}; G ∉ {CEA}; keep
R1 =
AB → C
C → A
BC → D
D → E
D → G
BE → C
CG → B
CE → G
③ For each FD f in R1, we examine each attribute A in the left-hand
side of f; if deleting A from the left-hand side of f has no effect on
the closure R1+, we delete A from the left-hand side of f.
AB → C: {A}+ = {A} and {B}+ = {B}; neither contains C, so neither A
nor B can be deleted. ……
R1 is an irreducible set of dependencies.
If we carry out step ③ before step ②, then:
ACD → B: {CD}+ = {CDAEGB}; B ∈ {CDAEGB}, so A can be deleted from
the left-hand side and ACD → B becomes CD → B
R2 =
AB → C
C → A
BC → D
CD → B
D → E
D → G
BE → C
CG → B
CG → D
CE → A
CE → G
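Now carry out step ②:
CG → B: {CG}+ = {CGDBAE}; B ∈ {CGDBAE}; delete CG → B
CE → A: {CE}+ = {CEGADB}; A ∈ {CEGADB}; delete CE → A
R =
AB → C
C → A
BC → D
CD → B
D → E
D → G
BE → C
CG → D
CE → G
This irreducible equivalent differs from R1: a set of FDs can have
more than one irreducible equivalent.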
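The whole procedure can be sketched in Python. This version is an illustration under assumptions rather than the slides' exact procedure: it left-reduces every FD before discarding redundant ones, it examines FDs in the order they are listed, and it assumes single-letter attribute names; the function names are hypothetical. Since the outcome depends on examination order, a different listing may produce a different, equally valid irreducible equivalent.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (frozenset, frozenset) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def irreducible_cover(fds):
    """Return one irreducible equivalent of fds, given as (lhs, rhs) strings."""
    # Decomposition: give every FD a singleton right-hand side.
    work = [(frozenset(lhs), frozenset(a)) for lhs, rhs in fds for a in rhs]
    # Left-reduction: drop an attribute from a left-hand side whenever the
    # remaining attributes still determine the right-hand side.
    for i, (lhs, rhs) in enumerate(work):
        for attr in sorted(lhs):
            reduced = lhs - {attr}
            if reduced and rhs <= closure(reduced, work):
                lhs = reduced
                work[i] = (lhs, rhs)
    work = list(dict.fromkeys(work))  # drop duplicates, keep first occurrence
    # Redundancy removal: drop an FD if the remaining FDs still imply it.
    for fd in list(work):
        rest = [g for g in work if g != fd]
        if fd[1] <= closure(fd[0], rest):
            work = rest
    return set(work)

given = [("AB", "C"), ("C", "A"), ("BC", "D"), ("ACD", "B"),
         ("D", "EG"), ("BE", "C"), ("CG", "BD"), ("CE", "AG")]
cover = irreducible_cover(given)
readable = sorted("".join(sorted(l)) + "->" + "".join(r) for l, r in cover)
```

With the FDs listed in this order, the sketch arrives at the same eight FDs as R1.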
5.6 Exercises
1. Relvar R{A,B,C,D,E,F,G} satisfies the following FDs:
A → B
BC → DE
AEF → G
Compute the closure {A,C}+ under this set of FDs. Is the
FD ACF → DG implied by this set?
2. Relvar R{A,B,C,D,E,F} satisfies the following FDs:
AB → C
C → A
BC → D
ACD → B
BE → C
CE → FA
CF → BD
D → EF
Find an irreducible equivalent for this set of FDs.
3. Relvar R{A,B,C,D,E,G,H,P} satisfies the following FDs:
ABC → E
CDE → P
A → C
HB → P
GP → B
D → HG
EP → A
ABC → PG
Find an irreducible equivalent for this set of FDs.