Review for Final Exam

ER diagrams

Summary of E-R diagram components
  rectangles: entity sets
  double rectangles: weak entity sets
  ellipses: attributes
  diamonds: relationship sets
  double diamonds: identifying relationship sets
  lines: links
  double ellipses: multivalued attributes
  dashed ellipses: derived attributes
  underlined: primary keys
  discriminator: underlined with a dashed line
  total participation of an entity set in a relationship: double line
  x to 1: arrow at the unique ("one") entity side, or label the lines N and 1

Exercise: Draw an ER Diagram to represent:

ERD to Tables

Super keys, candidate keys, primary key. Definitions
  The primary key of a strong entity set is the primary key of the corresponding table.
  The primary key of a relation corresponding to a weak entity set is the union of the discriminator of the weak entity set and the primary key of the strong entity set that it is related to via its identifying relationship.
  A superkey of a relation (table) corresponding to a relationship set is the union of the primary keys of the related entity sets.
  If the relationship is many-to-many, that superkey is also the primary key.
  If the relationship is one-to-many, the primary key is the primary key of the "many" set.

Relational Algebra

There are 6 basic operators in the R.A. Each operator takes one or two relations as input and gives a new relation as a result. Relations are sets of tuples. Remember that in sets, duplicates are eliminated and order doesn't matter.
The operators are:
  select
  project
  cartesian (cross) product
  rename
  set difference
  union
Also included (although not basic):
  join, including outer join variations

Exercise: write a query in the relational algebra.

SQL DDL and DML

Database tables are not sets, so duplicates are not automatically eliminated from the result of a query.
DDL: defining domains and tables, specifying keys and constraints, changing the schema
DML:
  select ... from ... where ...
  Renaming: use of "as"
  Set operations: union, intersect, except
  Aggregate functions: avg, min, max, sum, count

Find the average account balance at the Oakland branch.
  select avg(balance)
  from Account
  where branch-name = "Oakland"

Ordering and grouping:
If both a where clause and a having clause are present, the where clause is applied first, to individual rows. The remaining rows are then grouped, and the having condition is applied to each group.
Example: account# > 5000 indicates a special type of account we are interested in.
  select branch-name, avg(balance)
  from Account
  where account# > 5000
  group by branch-name
  having avg(balance) > 1200

Handling NULL values:
  1. The result of any arithmetic expression involving null is null.
  2. The proper way to test for null is with "is" instead of "=". That is,
       select loan#
       from Loan
       where amount is null
  3. Any comparison involving null evaluates to a third truth value, "unknown".
  4. The truth tables for OR and AND with "unknown":
       true or unknown = true; false or unknown = unknown; unknown or unknown = unknown
       true and unknown = unknown; false and unknown = false; unknown and unknown = unknown
  5. The result of a where clause predicate is treated as "false" if it evaluates to "unknown".
  6. Except that "P is unknown" in a where clause evaluates to true if P evaluates to "unknown".
  7. All aggregate operations except count(*) ignore rows with null values on the aggregated attributes.
       select sum(amount)   -- ignores null amounts
       from Loan            -- result is null only if there are no non-null amounts

Set Comparisons for where and having:
A parenthesized list or a table can be considered a set. SQL provides the set comparison operators: in, not in, > all (likewise <, >=, <=, <>), and > some (likewise <, >=, <=, <>).
  select distinct customer-name
  from Borrower
  where customer-name not in ("Smith", "Jones")

Nested Queries:
Subqueries can appear in the from, where and having clauses.
Find all customers who have both an account and a loan at the bank.
  select distinct customer-name
  from Borrower
  where customer-name in (select customer-name from Depositor)

Example: Find the names of all branches that have greater assets than all (every) branches in Oakland.
  select branch-name
  from Branch
  where assets > all (select assets from Branch where branch-city = "Oakland")

The exists construct returns true if the set or subquery table is non-empty. The not exists construct returns true if it is empty.
The unique construct returns true if the result of a subquery contains no duplicate tuples. not unique returns true if it does contain duplicates.

Creating views
The view definition becomes a part of the database schema, and the view is re-calculated every time it is used.
  create view v as <query exp>

Modifying the database: delete, insert, update
  update Account
  set balance = balance * 1.06
  where balance > 10000

Summary of SQL QUERIES
  select <attribute and function list>
  from <table list>
  [where <condition>]
  [group by <group attribute(s)>]
  [having <group condition>]    -- depends on group by
  [order by <attribute list>] ;

<attribute and function list> includes aggregates: avg, min, max, sum, count
<table list> can have a derived relation: a subquery assigned a name with as
  (select <> from <> where <>) as r1
where <condition>
  attribute comparison: <, >, <=, <>, =, etc.
  attribute properties: like, is null, is not null
  attribute compared to a set: in, not in, > some, > all, >= all, etc.
  tuple existence: exists, not exists, unique, not unique

Integrity Constraints

Domain constraints.
Referential integrity ensures that a value that appears in one table for a given attribute also appears for the same or a compatible attribute in another table.
For example: If "Oakland" is a branch name in one of the rows in the Account table, we would like to ensure that there is a row in the Branch table for "Oakland".
A foreign key is a column (or set of columns) in one table whose values must be a subset of the values of the primary key of another table.
How does this affect modification?
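One way to explore how a foreign key affects modification is to try a few inserts, deletes and updates against a small in-memory database. The sketch below is only an illustration, not part of the course material: it uses Python's sqlite3 module, writes branch_name and account_no with underscores because hyphens and # are not valid in SQLite identifiers, and uses made-up rows. It assumes SQLite's default behavior (no cascade action declared), so a change that would leave an Account row without a matching Branch row is rejected.

```python
import sqlite3


def demo_referential_integrity():
    """Try several modifications and report which ones the foreign key rejects."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE Branch (branch_name TEXT PRIMARY KEY, assets REAL)")
    conn.execute(
        "CREATE TABLE Account ("
        " account_no INTEGER PRIMARY KEY,"
        " branch_name TEXT REFERENCES Branch(branch_name),"
        " balance REAL)"
    )
    # Illustrative data (assumed, not from the review sheet).
    conn.execute("INSERT INTO Branch VALUES ('Oakland', 500000)")
    conn.execute("INSERT INTO Account VALUES (5001, 'Oakland', 1500)")

    attempts = {
        # Insert into the referencing table with no matching Branch row.
        "insert_account_unknown_branch":
            "INSERT INTO Account VALUES (5002, 'Berkeley', 900)",
        # Delete a Branch row that an Account row still refers to.
        "delete_referenced_branch":
            "DELETE FROM Branch WHERE branch_name = 'Oakland'",
        # Change a primary key value that an Account row still refers to.
        "update_referenced_key":
            "UPDATE Branch SET branch_name = 'Oakland2' WHERE branch_name = 'Oakland'",
        # Deleting from the referencing table never breaks the constraint.
        "delete_account":
            "DELETE FROM Account WHERE account_no = 5001",
    }
    outcomes = {}
    for label, statement in attempts.items():
        try:
            conn.execute(statement)
            outcomes[label] = "accepted"
        except sqlite3.IntegrityError:
            outcomes[label] = "rejected"
    conn.close()
    return outcomes


if __name__ == "__main__":
    for label, outcome in demo_referential_integrity().items():
        print(label, "->", outcome)
```

The first three attempts touch either the referencing value or the referenced key and are rejected. The last one only removes a referencing row and goes through. Declaring an action such as on delete cascade would change the outcome of the second attempt.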
Assertions and Triggers

Functional Dependencies

Let A and B be sets of attributes of a relation schema R. Then the functional dependency A → B (A determines B) holds on R iff for any legal table r(R) (that is, R is the schema for r), whenever any two rows t1 and t2 of r agree on the attributes A, they must also agree on the attributes B.
  t1[A] = t2[A] implies t1[B] = t2[B]
If K is a set of attributes in R and K → R, then what can we say about K? (K is a superkey for R)
The set of all functional dependencies that can be inferred or deduced from F is called the closure of F, denoted F+.
A canonical cover of a set of dependencies F is a set of dependencies with the same closure as F and the following properties:
  No functional dependency contains an extraneous attribute, that is, an attribute that can be removed from the dependency without changing the closure of F.
  Each left side of a functional dependency in the cover is unique.
The closure of A, denoted A+, is the set of attributes that are functionally determined by A under a set of FD's, F.

Update Anomalies, spurious tuples

Lossless join decomposition
  r = Π_R1(r) join Π_R2(r)
  R1 ∩ R2 → R1 or R1 ∩ R2 → R2
  (when decomposed into three or more relations, you may have R1 ∩ R2 = ∅)

Normal Forms

1NF: The domain of an attribute must contain only atomic values, and the value of any attribute in a tuple must be a single value from the domain of that attribute.
Def. If XY → Z and X → Z, then XY → Z is a partial dependency and X → Z is a full functional dependency (provided no proper subset of X determines Z). So, Z is fully functionally dependent on X.
Def. An attribute of R is called prime if it is a member of some candidate key of R.
2NF: Every non-prime attribute in R must be fully functionally dependent on the primary key of R.
A functional dependency X → Z is a transitive dependency if there is a set of attributes Y that is not a candidate key and (X → Y and Y → Z).
3NF: No non-prime attribute is transitively dependent on the primary key.
BCNF: Whenever a non-trivial functional dependency X → A holds in R, then X is a superkey of R.
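The definitions of A+, superkey, the lossless join condition and BCNF can be checked mechanically on small examples. The following Python sketch is one possible way to do so and is not an algorithm taken from the course notes. The Lending schema and its two dependencies are assumed purely for illustration. The BCNF check looks only at the dependencies listed in F, which is enough to test the schema R itself but not the schemas produced by a decomposition.

```python
def attribute_closure(attrs, fds):
    """Return A+: every attribute functionally determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the closure already contains the left side, add the right side.
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def is_superkey(attrs, schema, fds):
    """K is a superkey of R when K+ contains every attribute of R (K -> R)."""
    return attribute_closure(attrs, fds) >= set(schema)


def is_lossless_binary(r1, r2, fds):
    """Test R1 ∩ R2 -> R1 or R1 ∩ R2 -> R2 for a two-way decomposition."""
    common_closure = attribute_closure(r1 & r2, fds)
    return common_closure >= r1 or common_closure >= r2


def bcnf_violations(schema, fds):
    """Return the listed non-trivial FDs whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not is_superkey(lhs, schema, fds)
    ]


# Hypothetical schema and dependencies, assumed for illustration only.
LENDING = {"branch_name", "branch_city", "assets", "loan_no", "amount"}
FDS = [
    ({"branch_name"}, {"branch_city", "assets"}),
    ({"loan_no"}, {"amount", "branch_name"}),
]

if __name__ == "__main__":
    print(sorted(attribute_closure({"branch_name"}, FDS)))
    print(is_superkey({"loan_no"}, LENDING, FDS))
    print(bcnf_violations(LENDING, FDS))
    print(is_lossless_binary(
        {"branch_name", "branch_city", "assets"},
        {"loan_no", "amount", "branch_name"},
        FDS,
    ))
```

In this assumed example, loan_no determines every attribute and is therefore a superkey, while branch_name is not, so the first dependency violates BCNF. Splitting on that dependency gives two relations whose common attribute, branch_name, determines all of the first relation, so the decomposition is lossless.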
DEF: Dependency preservation: Let Fi be the set of dependencies in F+ that includes only attributes in Ri. The decomposition is dependency preserving if
  (∪ Fi)+ = F+
that is, for a 2 relation decomposition,
  (F1 ∪ F2)+ = F+

Algorithms for decomposing into BCNF or 3NF
It is not always possible to get a BCNF decomposition that is dependency preserving.

Indexes (Chapter 12)
  Purpose of index?
  Measure of a good index.
  Ordered indexes
    primary and secondary indexes
    primary index can be dense or sparse
    multilevel indexes
    B trees or B+ trees are most common
  Hash tables
    Static hashing
    Dynamic hashing

Query Processing (Chapter 13) ????