
Chapter 5 Integrity Constraints

5.1 Domain Constraints
5.2 Referential Integrity
5.3 Assertions
5.4 Triggers
5.5 Functional Dependencies
Exercises

The term integrity refers to the accuracy or correctness of data in the database. Integrity constraints provide a means of ensuring that changes made to the database by authorized users do not result in a loss of data consistency. Thus, integrity constraints guard against accidental damage to the database. A given database might be subject to any number of integrity constraints, of arbitrary complexity. We classify integrity constraints into four broad categories: type (domain) constraints, attribute constraints, relvar constraints, and database constraints.

5.1 Domain Constraints

We have seen that a domain of possible values must be associated with every attribute. Domain constraints are the most elementary form of integrity constraint. They are tested easily by the system whenever a new data item is entered into the database.

Defining domains:

    create domain Dollars numeric(12,2)
    create domain Pounds numeric(12,2)

A value of one domain can be converted to another with cast, for example cast r.A as Pounds. Existing domains are removed or changed with drop domain and alter domain.

Essentially, a domain constraint is (or is logically equivalent to) just an enumeration of the legal values of the domain.

Examples (check clause):

① A check clause can ensure that an hourly wage domain allows only values greater than or equal to a specified value.

    create domain hourly-wage numeric(5,2)
        constraint wage-value-test check (value >= 4.00)

② The check clause can also be used to restrict a domain so that it does not contain any null values.

    create domain account-number char(10)
        constraint account-number-null-test check (value not null)

③ The domain can be restricted to contain only a specified set of values by using the in clause.

    create domain account-type char(10)
        constraint account-type-test
        check (value in ('checking', 'saving'))

④ A check clause can also contain a subquery that refers to another relation:

    check (branch-name in (select branch-name from branch))

5.2 Referential Integrity

1. Basic Concepts

Foreign keys: loosely speaking, a foreign key is a set of attributes of one relvar R2 whose values are required to match values of some candidate key of some relvar R1.

Let R2 be a relvar. Then a foreign key in R2 is a set of attributes of R2, say FK, such that:
a. there exists a relvar R1 (R1 and R2 not necessarily distinct) with a candidate key CK, and
b. for all time, each value of FK in the current value of R2 is identical to the value of CK in some tuple in the current value of R1.

The examples in this chapter use the following bank database:

    branch    (branch-name, branch-city, assets)
    account   (account-number, branch-name, balance)
    depositor (customer-name, account-number)
    loan      (loan-number, branch-name, amount)
    borrower  (customer-name, loan-number)
    customer  (customer-name, customer-street, customer-city)

Points arising:

① The definition requires every value of a given foreign key to appear as a value of the matching candidate key. Note, however, that the converse is not a requirement.

② A foreign key is simple or composite according to whether the candidate key it matches is simple or composite.

③ Each attribute of a given foreign key must have the same name and type as the corresponding component of the matching candidate key.

④ A foreign key value represents a reference to the tuple containing the matching candidate key value (the referenced tuple).

ⅰ The problem of ensuring that the database does not include any invalid foreign key values is therefore known as the referential integrity problem.

ⅱ The constraint that values of a given foreign key must match values of the corresponding candidate key is known as a referential constraint.

ⅲ We refer to the relvar that contains the foreign key as the referencing relvar and the relvar that contains the corresponding candidate key as the referenced relvar.

⑤ Referential diagrams: consider depositor and account. We can represent the referential constraint that exists between them by means of the following referential diagram, in which the arrow points from the referencing relvar to the referenced relvar and is labeled with the foreign key:

    depositor --account-number--> account

⑥ A given relvar can of course be both referenced and referencing, as in the case of R2 here:

    R3 --> R2 --> R1

Referential path: let relvars Rn, R(n-1), ……, R2, R1 be such that there is a referential constraint from Rn to R(n-1), a referential constraint from R(n-1) to R(n-2), ……, and a referential constraint from R2 to R1:

    Rn --> R(n-1) --> R(n-2) --> …… --> R2 --> R1

Then the chain of arrows from Rn to R1 represents a referential path from Rn to R1.

Referential cycle: a referential path from a relvar back to itself:

    Rn --> R(n-1) --> R(n-2) --> …… --> R2 --> R1 --> Rn

2. Referential Integrity

The database must not contain any unmatched foreign key values. The term "unmatched foreign key value" here simply means a foreign key value in some referencing relvar for which there does not exist a matching value of the relevant candidate key in the relevant referenced relvar.

Here then is the syntax for defining a foreign key:

    FOREIGN KEY { <item commalist> } REFERENCES <relvar name>

ⅰ Each <item> is either an <attribute name> of the referencing relvar or an expression of the form RENAME <attribute name> AS <attribute name>.

ⅱ The <relvar name> identifies the referenced relvar.

3. Database Modification

Database modification can cause violations of referential integrity. Consider a foreign key FK of R2 that references the candidate key CK of R1, so that the referential constraint is

    Π_FK(R2) ⊆ Π_CK(R1)

① Insert: if a tuple t2 is inserted into R2, the system must ensure that there is a tuple t1 in R1 such that t1[CK] = t2[FK].
That is, t2[FK] ∈ Π_CK(R1).

② Delete: if a tuple t1 is deleted from R1, the system must compute the set of tuples in R2 that reference t1:

    σ_{FK = t1[CK]}(R2)

If this set is not empty, either the delete is rejected as an error, or the tuples that reference t1 must themselves be deleted (a cascading delete).

③ Update: we must consider two cases, updates to the referencing relation (R2) and updates to the referenced relation (R1).

- If a tuple t2 is updated in relation R2, and the update modifies values for the foreign key FK, then a test similar to the insert case is made. Let t2' denote the new value of tuple t2. The system must ensure that t2'[FK] ∈ Π_CK(R1).

- If a tuple t1 is updated in relation R1, and the update modifies values for the candidate key CK, then a test similar to the delete case is made. Using the old value of t1, the system must compute σ_{FK = t1[CK]}(R2).

- If this set is not empty, the update is rejected as an error, or the update is cascaded in a manner similar to delete.

4. Referential Integrity in SQL

Primary keys, candidate keys, and foreign keys can be specified as part of the SQL create table statement (primary key clause, unique clause, foreign key clause).

Example:

    create table account
        (account-number char(10) not null,
         branch-name    char(10),
         balance        integer,
         primary key (account-number),
         foreign key (branch-name) references branch,
         check (balance >= 0))

Referential actions:

Problem:

    delete from branch where branch-name = 'Perryridge'

Assume this delete does exactly what it says, i.e., it deletes the branch tuple for branch-name Perryridge, no more and no less. Assume too that the database does include some accounts for branch-name Perryridge, and the application does not go on to delete those accounts. When the system checks the referential constraint from account to branch, it will find a violation, and an error will occur.
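
The failing delete can be reproduced with a small sketch that uses Python's built-in sqlite3 module. It is an illustration under stated assumptions rather than the system the chapter has in mind: SQLite enforces foreign keys only after a pragma is switched on, the identifiers use underscores because unquoted SQL names cannot contain hyphens, and both the sample account row and the helper name delete_branch_with_accounts are made up for the example.

```python
import sqlite3


def delete_branch_with_accounts():
    """Try to delete a branch that an account still references.

    Returns (error, branches_left): the IntegrityError raised by the
    delete (or None if it succeeded) and the number of branch rows left.
    """
    conn = sqlite3.connect(":memory:")
    # SQLite checks foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE branch ("
        " branch_name TEXT PRIMARY KEY,"
        " branch_city TEXT,"
        " assets INTEGER)"
    )
    conn.execute(
        "CREATE TABLE account ("
        " account_number TEXT NOT NULL PRIMARY KEY,"
        " branch_name TEXT REFERENCES branch (branch_name),"
        " balance INTEGER CHECK (balance >= 0))"
    )
    conn.execute("INSERT INTO branch VALUES ('Perryridge', 'Horseneck', 3000000)")
    # Made-up sample account that references the Perryridge branch.
    conn.execute("INSERT INTO account VALUES ('A-1', 'Perryridge', 400)")

    error = None
    try:
        # No referential action was declared, so the delete is checked as is.
        conn.execute("DELETE FROM branch WHERE branch_name = 'Perryridge'")
    except sqlite3.IntegrityError as exc:
        error = exc
    branches_left = conn.execute("SELECT COUNT(*) FROM branch").fetchone()[0]
    conn.close()
    return error, branches_left


error, branches_left = delete_branch_with_accounts()
print("delete rejected:", error is not None, "| branch rows left:", branches_left)
```

The delete is rejected and the branch tuple stays in place, because the account tuple would otherwise hold an unmatched foreign key value.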

Solution: the obvious compensating action would be for the system to delete the accounts for branch-name Perryridge "automatically". We can achieve this effect by extending the foreign key definition as follows:

    create table account
        (……
         foreign key (branch-name) references branch
             on delete cascade
             on update cascade
         ……)

cascade: the specification on delete cascade defines a delete rule for this particular foreign key, and the specification cascade is the referential action for that delete rule.

restrict: in the case at hand, restrict would mean that delete operations are "restricted" to the case where there are no matching accounts (they are rejected otherwise).

no action: omitting a referential action for a particular foreign key is equivalent to specifying the "action" no action, which means what it says: the delete is performed exactly as requested, no more and no less.

Dealing with null values:

① All attributes of the primary key are implicitly declared to be not null.

② Attributes of a unique declaration are allowed to be null, provided that they have not otherwise been declared to be not null.

③ Attributes of foreign keys are allowed to be null, provided that they have not otherwise been declared to be not null.

5.3 Assertions

Assertions are general constraints. An assertion is a predicate expressing a condition that we wish the database always to satisfy. Domain constraints and referential-integrity constraints are special forms of assertions.

Assertions are defined by means of the create assertion syntax:

    create assertion <assertion-name> check <predicate>

Examples:

① The sum of all loan amounts for each branch must be less than the sum of all account balances at the branch.

    create assertion sum-constraint check
        (not exists
            (select * from branch
             where (select sum(amount) from loan
                    where loan.branch-name = branch.branch-name)
                >= (select sum(balance) from account
                    where account.branch-name = branch.branch-name)))

② Every loan has at least one customer who maintains an account with a minimum balance of $1000.00.

    create assertion balance-constraint check
        (not exists
            (select * from loan
             where not exists
                (select * from borrower, depositor, account
                 where loan.loan-number = borrower.loan-number
                   and borrower.customer-name = depositor.customer-name
                   and depositor.account-number = account.account-number
                   and account.balance >= 1000)))

5.4 Triggers

A trigger is a statement that is executed automatically by the system as a side effect of a modification to the database. To design a trigger mechanism, we must meet two requirements:

① Specify the conditions under which the trigger is to be executed.

② Specify the actions to be taken when the trigger executes.

Example (overdrafts): instead of allowing a negative account balance, the bank handles an overdraft by setting the balance to zero and creating a loan for the overdrawn amount. Let t be the account tuple whose balance has become negative. Steps:

① Insert a new tuple s in the loan relation with
    s[branch-name] = t[branch-name]
    s[loan-number] = t[account-number]
    s[amount] = - t[balance]

② Insert a new tuple u in the borrower relation with
    u[customer-name] = "Jones"
    u[loan-number] = t[account-number]

③ Set t[balance] to 0.

Using SQL to write the account-overdraft trigger:

    create trigger overdraft-trigger after update on account
    referencing new row as nrow
    for each row
    when nrow.balance < 0
    begin atomic
        insert into borrower
            (select customer-name, account-number
             from depositor
             where nrow.account-number = depositor.account-number);
        insert into loan values
            (nrow.account-number, nrow.branch-name, - nrow.balance);
        update account set balance = 0
            where account.account-number = nrow.account-number
    end

The same trigger in the syntax used by Microsoft SQL Server, where the table inserted holds the new rows:

    create trigger overdraft-trigger on account
    for update
    as
    if inserted.balance < 0
    begin
        insert into borrower
            (select customer-name, account-number
             from depositor
             where inserted.account-number = depositor.account-number);
        insert into loan values
            (inserted.account-number, inserted.branch-name, - inserted.balance);
        update account set balance = 0
            from account, inserted
            where account.account-number = inserted.account-number
    end

A trigger can also run before the modification, for example to replace an empty phone number with null:

    create trigger setnull-trigger before update on r
    referencing new row as nrow
    for each row
    when nrow.phone-number = ''
    set nrow.phone-number = null

5.5 Functional Dependencies

5.5.1 Introduction

Basically, a functional dependency (FD) is a many-to-one relationship from one set of attributes to another within a given relvar.

Example: there is a functional dependency from the set of attributes {branch-name} to the set of attributes {assets}.
a. For any given value of the attribute branch-name, there is just one corresponding value of the attribute assets, but
b. many distinct values of the attribute branch-name can have the same corresponding value of the attribute assets.

5.5.2 Basic Concepts

Now, it is very important in this area, as in so many others, to distinguish clearly between (a) the value of a given relvar at a given point in time and (b) the set of all possible values that the given relvar might assume at different times.

Here then is the definition for case (a): let r be a relation, and let X and Y be arbitrary subsets of the set of attributes of r. Then we say that Y is functionally dependent on X, in symbols X → Y, if and only if each X value in r has associated with it precisely one Y value in r.

Consider the branch relvar. A possible value for relvar branch is shown in Fig A.

    Fig A
    branch-name   branch-city   assets
    Downtown      Brooklyn       9000000
    Redwood       Palo Alto      4000000
    Perryridge    Horseneck      3000000
    Mianus        Horseneck     11000000

Example: the relation shown in Fig A satisfies the FD {branch-name} → {branch-city}.

Here then is the definition for case (b): let r be a relation variable, and let X and Y be arbitrary subsets of the set of attributes of r.
Then we say that Y is functionally dependent on X, in symbols X → Y, if and only if, in every possible legal value of r, each X value has associated with it precisely one Y value. In other words, whenever two tuples agree on their X value, they also agree on their Y value.

(Figure: dependency diagram over the attributes branch-name, branch-city, and assets.)

We now observe that if X is a candidate key of relvar r, then all attributes Y of relvar r must be functionally dependent on X.

Example: for the relvar customer,

    {customer-name} → {customer-name, customer-street, customer-city}

In fact, if relvar r satisfies the FD A → B and A is not a candidate key, then r will involve some redundancy.

Problem: even if we restrict our attention to FDs that hold for all time, the complete set of FDs for a given relvar can still be very large.

Solution: given a particular set S of FDs, it is therefore desirable to find some other set T that is much smaller than S and has the property that every FD in S is implied by the FDs in T. If such a set T can be found, it is sufficient that the DBMS enforce just the FDs in T, and the FDs in S will then be enforced automatically.

5.5.3 Trivial and Nontrivial Dependencies

One obvious way to reduce the size of the set of FDs we need to deal with is to eliminate the trivial dependencies. A dependency is "trivial" if it cannot possibly fail to be satisfied.

Example: {branch-name} → {branch-name}

In fact, an FD is trivial if and only if the right-hand side is a subset (not necessarily a proper subset) of the left-hand side.

5.5.4 Closure of a Set of Functional Dependencies

We shall see that, given a set F of functional dependencies, we can prove that certain other functional dependencies hold. We say that such functional dependencies are logically implied by F.

Example: given a relation schema R = {A,B,C,G,H,I} and the set of functional dependencies

    A → B
    A → C
    CG → H
    CG → I
    B → H

the functional dependency A → H is logically implied.
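
Logical implication is a statement about every relation value that satisfies F, so it helps to be able to test a single relation value against an FD, as in case (a) of 5.5.2. The following minimal sketch does that for the branch relation of Fig A. The function name satisfies_fd and the list-of-dictionaries representation are assumptions made for illustration.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if rows that agree on the lhs attributes also agree on rhs.

    rows is a list of dicts (one per tuple); lhs and rhs are lists of
    attribute names.
    """
    seen = {}  # maps each lhs value to the rhs value first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # same X value, different Y value
    return True


# The value of relvar branch shown in Fig A.
FIG_A = [
    {"branch-name": "Downtown", "branch-city": "Brooklyn", "assets": 9000000},
    {"branch-name": "Redwood", "branch-city": "Palo Alto", "assets": 4000000},
    {"branch-name": "Perryridge", "branch-city": "Horseneck", "assets": 3000000},
    {"branch-name": "Mianus", "branch-city": "Horseneck", "assets": 11000000},
]

print(satisfies_fd(FIG_A, ["branch-name"], ["branch-city"]))  # True
print(satisfies_fd(FIG_A, ["branch-city"], ["branch-name"]))  # False
```

The second check fails because Horseneck appears with two different branch names. A check like this says nothing about case (b): an FD that happens to hold in one relation value need not hold for all time.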

F+: let F be a set of functional dependencies. The closure of F is the set of all functional dependencies logically implied by F. We denote the closure of F by F+.

Armstrong's axioms: we adopt the convention of using Greek letters (α, β, γ, …) for sets of attributes, and uppercase Roman letters from the beginning of the alphabet for individual attributes. We use αβ to denote α ∪ β.

① Reflexivity rule: if α is a set of attributes and β ⊆ α, then α → β holds.

② Augmentation rule: if α → β holds and γ is a set of attributes, then γα → γβ holds.

③ Transitivity rule: if α → β holds and β → γ holds, then α → γ holds.

These rules are sound, because they do not generate any incorrect functional dependencies. The rules are complete, because, for a given set F of functional dependencies, they allow us to generate all of F+.

Further rules, which can be derived from the three axioms:

④ Union rule: if α → β holds and α → γ holds, then α → βγ holds.

⑤ Decomposition rule: if α → βγ holds, then α → β holds and α → γ holds.

⑥ Pseudotransitivity rule: if α → β holds and γβ → δ holds, then αγ → δ holds.

Examples:

① Given the relation schema R = {A,B,C,G,H,I} and the set of functional dependencies

    A → B
    A → C
    CG → H
    CG → I
    B → H

use Armstrong's axioms to show that A → H, CG → HI, and AG → I.

    A → H:    A → B and B → H     (transitivity rule)
    CG → HI:  CG → H and CG → I   (union rule)
    AG → I:   A → C and CG → I    (pseudotransitivity rule)

② Suppose we are given a relation schema R = (A,B,C,D,E,F) and the set of functional dependencies

    A → BC
    B → E
    CD → EF

Prove AD → F.

    A → BC, so A → C        (decomposition rule)
    AD → CD                 (augmentation rule)
    CD → EF, so AD → EF     (transitivity rule)
    AD → F                  (decomposition rule)

5.5.5 Closure of Attribute Sets

We can compute a certain subset of the closure: given a relvar R, a set Z of attributes of R, and a set S of FDs that hold for R, we can determine the set of all attributes of R that are functionally dependent on Z, the so-called closure Z+ of Z under S.
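
A minimal Python sketch of computing Z+ as a fixpoint: keep adding the right-hand side of every FD whose left-hand side is already contained in the result, until nothing changes. The function name attribute_closure and the representation of an FD as a pair of strings of single-letter attribute names are assumptions made for illustration. The two FD sets are the ones used in the Armstrong's axioms examples.

```python
def attribute_closure(attrs, fds):
    """Return the closure of attrs under fds as a set of attribute names.

    attrs is an iterable of single-letter attribute names, and fds is a
    list of (lhs, rhs) pairs such as ("CG", "H") for CG -> H.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply the FD if its left-hand side is covered and it adds something.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# FD sets from the two Armstrong's axioms examples.
F1 = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
F2 = [("A", "BC"), ("B", "E"), ("CD", "EF")]

print(sorted(attribute_closure("A", F1)))   # ['A', 'B', 'C', 'H']
print("I" in attribute_closure("AG", F1))   # True, so AG -> I follows from F1
print("F" in attribute_closure("AD", F2))   # True, so AD -> F follows from F2
```

The membership tests reach the same conclusions as the proofs by inference rules, without having to find the derivation by hand.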

Algorithm:

    closure[Z,S] := Z;
    do "forever";
        for each FD X → Y in S
        do;
            if X ⊆ closure[Z,S]
            then closure[Z,S] := closure[Z,S] ∪ Y;
        end;
        if closure[Z,S] did not change on this iteration
        then leave loop;
    end;

Example: suppose we are given relvar R with attributes A, B, C, D, E, F and the FDs

    A → BC
    E → CF
    B → E
    CD → EF

We now compute the closure {A,B}+ of the set of attributes {A,B} under this set of FDs.

1. We initialize the result closure[Z,S] to {A,B}.

2. We go round the inner loop four times, once for each of the given FDs. On the first iteration (for the FD A → BC) we add C to closure[Z,S], which now has the value {A,B,C}.

3. On the second iteration (for the FD E → CF) closure[Z,S] remains unchanged.

4. On the third iteration (for the FD B → E) we add E to closure[Z,S], which now has the value {A,B,C,E}.

5. On the fourth iteration (for the FD CD → EF) closure[Z,S] remains unchanged.

6. We go round the inner loop four times again. On the first iteration, the result does not change; on the second, it expands to {A,B,C,E,F}; on the third and fourth, it does not change.

7. We go round the inner loop four times again. closure[Z,S] does not change, and so the whole process terminates with {A,B}+ = {A,B,C,E,F}.

An important corollary of the foregoing is the following: given a set S of FDs, we can easily tell whether a specific FD X → Y follows from S, because that FD will follow if and only if Y is a subset of the closure X+ of X under S.

Another important corollary is the following: the superkeys for a given relvar R are precisely those subsets K of the attributes of R such that the FD K → A holds true for every attribute A of R.

5.5.6 Irreducible Sets of Dependencies

Let S1 and S2 be two sets of FDs. If every FD implied by S1 is implied by S2, i.e., if S1+ is a subset of S2+, we say that S2 is a cover for S1.
What this means is that if the DBMS enforces the FDs in S2, then it will automatically be enforcing the FDs in S1.

Now we define a set S of FDs to be irreducible if and only if it satisfies the following three properties:

1. The right-hand side of every FD in S involves just one attribute.

2. The left-hand side of every FD in S is irreducible in turn, meaning that no attribute can be discarded from the determinant without changing the closure S+. We will say that such an FD is left-irreducible.

3. No FD in S can be discarded from S without changing the closure S+.

Example: each of the following sets fails one of the three properties.

① S1 = { A → BC, A → D, A → E }          (a right-hand side has more than one attribute)

② S2 = { AB → D, A → B, A → D, A → C }   (the left-hand side of AB → D can be reduced)

③ S3 = { A → A, A → B, A → C, A → D }    (A → A can be discarded)

Algorithm:

Example: relvar R{A,B,C,D,E,G} satisfies the following FDs:

    AB → C      D → EG
    C → A       BE → C
    BC → D      CG → BD
    ACD → B     CE → AG

Find an irreducible equivalent for this set of FDs.

① Use the decomposition rule to rewrite the FDs so that each has a singleton right-hand side:

    AB → C      D → E       CG → D
    C → A       D → G       CE → A
    BC → D      BE → C      CE → G
    ACD → B     CG → B

② For each FD f in the set, if deleting f has no effect on the closure of the set, we delete f. To test f, we compute the closure of its left-hand side under the FDs that currently remain, excluding f itself; f can be deleted if its right-hand side is in that closure.

    FD tested    closure without the FD     result
    AB → C       {AB}+ = {A,B}              C ∉ {A,B}, keep
    C → A        {C}+ = {C}                 A ∉ {C}, keep
    BC → D       {BC}+ = {A,B,C}            D ∉ {A,B,C}, keep
    ACD → B      {ACD}+ = {A,B,C,D,E,G}     B ∈ {A,B,C,D,E,G}, delete ACD → B
    D → E        {D}+ = {D,G}               E ∉ {D,G}, keep
    D → G        {D}+ = {D,E}               G ∉ {D,E}, keep
    BE → C       {BE}+ = {B,E}              C ∉ {B,E}, keep
    CG → B       {CG}+ = {A,C,D,E,G}        B ∉ {A,C,D,E,G}, keep
    CG → D       {CG}+ = {A,B,C,D,E,G}      D ∈ {A,B,C,D,E,G}, delete CG → D
    CE → A       {CE}+ = {A,B,C,D,E,G}      A ∈ {A,B,C,D,E,G}, delete CE → A
    CE → G       {CE}+ = {A,C,E}            G ∉ {A,C,E}, keep

The FDs that remain are:

    R1 = { AB → C, C → A, BC → D, D → E, D → G, BE → C, CG → B, CE → G }

③ For each FD f that remains, we examine each attribute A in the left-hand side of f; if deleting A from the left-hand side of f has no effect on the closure of the set, we delete A from the left-hand side of f.
ABC {A}+={A} {B}+={B} …… R1 is a irreducible sets of dependencies. 5.5.6 Irreducible Sets of dependencies If we compute step three before the step two,then: {CD}+={CDAEGB} ACDB ACDB R2= CDB ABC D E CG D C A D G CE A BCD BE C CE G CD B CG B B {CDAEGB} 5.5.6 Irreducible Sets of dependencies Compute step two: CGB {CG}+={CGDBAE} delete CGB CEA {CE}+={CEGADB} delete CEA R= ABC D E C A D G BCD BE C CD B CG D CE G B{CGDBAE} A {CEGADB} Exercises 1.Relvar R{A,B,C,D,E,F,G} satisfies the following FDs: AB BC DE AEFG Compute the closure {AC}+ under this set of FDs. Is the FD ACFDG implied by this set? Exercises 2.Relvar R{A,B,C,D,E,F} satisfies the following FDs: ABC BE C C A CE FA BCD CF BD ACD B D EF Find an irreducible equivalent for this set of FDs. Exercises 2.Relvar R{A,B,C,D,E,G,H,P} satisfies the following FDs: ABCE CDE P A C HB P GPB D HG EP A ABC PG Find an irreducible equivalent for this set of FDs.
