Features of a DBMS
CSE3180 Summer 2005 Lect 06 / 1

Lecture 6

This lecture will cover many of the functions required of a Database Management System. It is a good opportunity to review much of the material covered to date, and will also open the way to other topics such as Data Dictionary, Integrity, Recovery, Concurrency considerations and Business Rules.

We will be moving from the Logical Design, through the Implementation Planning stage, to the Physical Design stage of database design.

A Slight Interlude

• Before we do, here is an answer to the puzzle which kept you awake last night.

  Move                   Source    Target     Time   Progressive
  C and D move across    A,B       C,D          2        2
  D returns              A,B,D     C            1        3
  A and B move across    D         A,B,C       10       13
  C returns              C,D       A,B          2       15
  C and D move across    -         A,B,C,D      2       17

DBMS Outline

[Diagram: three Multi Server DBMS processes, the Multi-Server Logging and Locking System, and the Data Base]

DBMS Block Diagram

  General Communications     GCF
  Sequencer / Dispatcher     SCF
  Parser                     PSF
  Optimiser                  OPF
  Relation Description       RDF
  Query Storage              QSF
  Query Execution            QEF
  Abstract Data Type         ADF
  Data Manipulation          DMF
  Compatibility Library

DBMS Components

3. Parser Facility
   - Parses query text and builds query tree
   - Stores query tree in QSF (Query Storage)
   - Notifies SCF (Sequence/Dispatch) that control must now pass to OPF (Optimiser)

4. Optimiser Facility
   - Reads query tree in QSF
   - Builds optimal query plan and stores plan in QSF

7. Query Storage - Support Facility
   - Common Storage Pool for passing objects between PSF, OPF, and QEF (Size is user option)
   - Stores Query Plans
   - A number of stored query plans can be controlled by user

9. Data Manipulation Facility
   - The Access Methods: btree, hash, isam, heap, bitmap
   - Manages internal page cache
   - Handles Transactions: locking, deadlocks, logging
   - Handles Modify and Sorting

10. Compatibility Library
   - Insulates DBMS from the operating system
   - Handles all I/O, string comparisons
   - Associated with 'porting'
   - Is the component which changes and accommodates F.E's

DBMS Functions

1. Data Storage, Retrieval and Update
2. A User-Accessible Catalogue (Dictionary)
3. Support for Shared Update
4. Backup and Recovery Services
5. Security
6. Integrity
7. Data Independence
8. Utility Services

The Primary Objectives of a DBMS are to provide facilities for:

1. Definition of Database Logical Structures
2. Definition of Physical Structures
3. Access to the Database
4. Definition of Storage Structures to store user data

These components are known as the 'database architecture'.

Data Dictionary - Also known as the Catalog(ue)

A DATA DICTIONARY contains the fundamental definitions, characteristics and uses of data. It describes:
   - What the data is
   - Characteristics
   - Uses of Data
   - User Permits / Restrictions

A DATA DIRECTORY contains information relating to Physical Data Storage.

A Data Dictionary SYSTEM
   - stores
   - maintains
   - provides access to
the Data Dictionary.
It is a set of software, also known as the Catalog Function.

The Dictionary contains information on
   - Data
   - Processes
   - Environment

A representation of a 'database'

[Diagram: the System Catalogue / Dictionary; User Tables, Views, Sequences, Procedures, Indexes; User Space; Instance(s)]

A Data Dictionary is a DATABASE about the data held in the USER DATABASE.
Term Used: META DATA

A Data Dictionary can provide data about

1. Relationships between dictionary entity types:
   - item uses item, module
   - table uses item, group, module
   - module uses item, group, file, module
   - program uses file, module
   - system uses program, system

2. Listing of all entities
   - Relationship reports (Which programs use record zzz)
   - Versioning support
   - Password support
   - User access and exits

[Diagram: the DATA DICTIONARY and the data base span every stage of development: system planning, requirements definition, analysis, design, implementation, testing, operations and maintenance]

Human Interfaces
   - Database Administration
   - Application Programmers
   - End Users

------------ Data Dictionary ------------

Software and DBMS Interfaces
   - Compilers
   - PreCompilers
   - Application Programs / Report Generators
   - Integrity Constraints

Some Benefits from Data Dictionary Use:

1. Better data management - Redundancies, Standards, Documentation
2. Reduction in system development time - Cross reference listings, Auto copy libraries
3. Reduction in maintenance costs
4. Quicker and More Accurate changes possible
5. Documentation standards
6. Data Audit - cross references, 'where used' listings

Some of the 770 DBA Tables

  DBA_VIEWS             ALL_ERRORS            ALL_TABLES
  ALL_OBJECTS           USER_COLL_TYPES       USER_COL_COMMENTS
  USER_COL_PRIVS        USER_COL_PRIVS_MADE   USER_ASSOCIATIONS
  USER_AUDIT_OBJECT     USER_AUDIT_SESSION    USER_VIEWS
  USER_CLU_COLUMNS      USER_AUDIT_STATEMENT  USER_AUDIT_TRAIL
  USER_CATALOG          USER_TAB_PRIVS        USER_ARGUMENTS
  USER_ALL_TABLES       V$SQL                 V$SQLAREA
  V$SHARED_MEMORY       GV$DISPATCHER

Integrity

Integrity is a collection of processes, procedures and techniques which are used to ensure that data held in a database is
   - COMPLETE
   - ACCURATE
   - CLEAR
thus ensuring that Information derived from the database also has these characteristics.

Integrity: C.R.U.D.E.

  C   Column Integrity - Linked to Domain Integrity
  R   Referential Integrity
  U   User Defined Integrity
  D   Domain Integrity - A user defined datatype
  E   Entity Integrity

Database Integrity

Some terms you will encounter:
   - Entity Integrity
   - Referential Integrity
   - Functional Dependency (constraints between determinants and attributes. For each value of the determinant there is only one value for each of the attributes it determines)
   - Multivalued Dependency
   - Join Dependency
   - Domain Constraints
   - Cardinality Constraint
   - User Defined Constraints

Data Integrity

General Principle: Data compliance with a set of rules.

Rules Location: Best embodied in the DBMS. If they are contained in an application, there is the danger of saturating a network and causing degraded performance. This is particularly so in client / server computing.

CONSTRAINTS: Declarative approach where integrity constraints are 'declared' as part of a table specification.
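The declarative approach can be tried out in a few lines. The following is only a sketch: it assumes SQLite driven from Python's standard sqlite3 module rather than the DBMS used in this unit, and the dept and emp tables, their rows and the age range are hypothetical. The constraints are declared once in the table specification, and the DBMS itself then rejects offending rows without any checking code in the application.

```python
import sqlite3

def try_insert(conn, sql, params):
    """Attempt one INSERT; return 'accepted' or the DBMS's rejection message."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    -- Hypothetical tables: every rule below is declared, not programmed.
    CREATE TABLE dept (
        deptid TEXT NOT NULL PRIMARY KEY
    );
    CREATE TABLE emp (
        empid  TEXT NOT NULL PRIMARY KEY,
        name   TEXT NOT NULL,
        age    INTEGER CHECK (age BETWEEN 15 AND 100),
        deptid TEXT REFERENCES dept (deptid)
    );
    INSERT INTO dept VALUES ('D1');
""")

emp_insert = "INSERT INTO emp (empid, name, age, deptid) VALUES (?, ?, ?, ?)"
outcomes = {
    "valid row":           try_insert(conn, emp_insert, ("E1", "GOLD", 30, "D1")),
    "null primary key":    try_insert(conn, emp_insert, (None, "BLUE", 30, "D1")),
    "unknown foreign key": try_insert(conn, emp_insert, ("E2", "BLUE", 30, "D9")),
    "null foreign key":    try_insert(conn, emp_insert, ("E3", "WHITE", 30, None)),
    "age outside range":   try_insert(conn, emp_insert, ("E4", "RED", 7, "D1")),
}
for case, outcome in outcomes.items():
    print(f"{case:20} {outcome}")
```

The null primary key, the unknown foreign key and the out-of-range age are refused by the DBMS, while the null foreign key is accepted. Because the rules live in the database, every application that connects is subject to them.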
ANSI SQL-99 standards include specifications for integrity constraint syntax and behaviour.

INTEGRITY CONSTRAINTS

DATABASE INTEGRITY
   - Refers to correctness and consistency of data
   - Quality Assurance
   - Usually expressed in terms of CONSTRAINTS - consistency rules which must not be subverted

Forms of Constraints

1. ENTITY INTEGRITY - Primary Key Value
   NO attribute of a primary key value may be NULL.

2. REFERENTIAL INTEGRITY - Foreign Key Values
   If a FOREIGN KEY exists in a relation, then either
   (1) the foreign key value MUST match the Primary Key value of some row in its home (or Primary) relation, OR
   (2) the FOREIGN KEY must be NULL.

3. FUNCTIONAL DEPENDENCY - Determinant
   For each value of the DETERMINANT, there must be only ONE value for each of the attributes which it determines.

4. MULTIVALUED DEPENDENCIES
   If A, B and C are three sets of attributes, then A multidetermines B if and only if the set of B values associated with each A value is independent of the C values.

5. JOIN DEPENDENCY - Relation Reconstruction
   A relation can be reconstructed by taking the join of its projections.

6. DOMAIN CONSTRAINT - Value restrictions
   Possible values of a data item are restricted to a specific set called the DOMAIN.

7. CARDINALITY CONSTRAINT
   The number of entities which can be related is subject to a constraint.

8. SET RETENTION CONSTRAINT
   The deletion of records is subject to limitations.

9. EXISTENCE DEPENDENCY
   Hierarchical model (also OODB). Dependency of a child on the parents limits insertion and deletion of segments.

10. GENERAL CONSTRAINTS
   Those restrictions which can be expressed as arbitrary predicates about the data,
   e.g. no class may be scheduled for Room B.215 after 2.00pm on Fridays.

General Comments: DBMSs have deficiencies in their ability to express and enforce constraints. Oracle uses 'Triggers and Constraints' and later versions of SQL use a mechanism called ASSERTIONS.

Referential Integrity

Foreign Key Concept - An attribute (or set of attributes) in one table (the referencing table) occurs as the Primary Key of another table (the Primary, Lookup or Referenced table).

Referential Integrity Constraint:
   The Value of a Foreign Key Must Be a Key Value in the Referenced Table
   OR
   The Value of the Foreign Key Must Be Undefined (Null).
   A Null Foreign Key cannot occur if the Foreign Key is part of the Primary Key of the Referencing Table.

Possible Referential Integrity Processes

1. Limited Insert: If an incoming Foreign Key DOES NOT EXIST as a referenced table Primary Key: ABORT TRANSACTION - REPORT

2. Limited Update: If an incoming Foreign Key DOES NOT EXIST as a referenced table Primary Key: TERMINATE PROCESS

3. Restricted Delete: If there are referencing FOREIGN KEYS in a referencing table: TERMINATE DELETE PROCESS ON REFERENCED TABLE

4. Restricted Update: If there are referencing Foreign Keys in a referencing table: INHIBIT UPDATE OPERATION ON THE REFERENCED KEY

5. Cascade Delete: If there are referencing Foreign Keys, carry out the DELETE on the REFERENCED TABLE and also DELETE ALL REFERENCING ROWS

6. Cascade Update: Commence an UPDATE on the REFERENCED TABLE by UPDATING the Foreign Keys on all Referencing Rows in the Referencing Table(s)

7. Nullify Delete: Commence a DELETE operation on the REFERENCED table by setting ALL the FOREIGN KEYS on the Referencing Table(s) to NULL (watch Data Types)

8. Nullify Update: Set all of the Foreign Keys of the Referencing Table to NULL. This will invalidate any referencing of the Referenced Key (which must not be NULL)

9. Default Update: Invalidate references to Updated Referenced Keys by setting all Referencing Table Foreign Keys to a DEFAULT value

10. Default Delete: Invalidate references to the deleted Referenced Key Value(s) by setting all Referencing Foreign Key values to a DEFAULT value

11. Warning Delete: Permit the deletion BUT Warn the user of the Unattached Foreign Keys which are now present in the Referencing Table(s)

12. Warning Update: Permit the Update BUT Warn the User of Unattached Foreign Keys which are now present in the Referencing Table(s)

A Deeper Look into a DBMS

CLOSURE (Relational Algebra)

Inference Rules; Armstrong's Axioms (Rules for Inference for Functional Dependencies)

Premise: If F is a set of functional dependencies of relation R, the set of ALL FUNCTIONAL DEPENDENCIES which can be derived from F, called F+, is called the closure of F.

1. REFLEXIVITY: If B is a subset of A, then A ---> B
2. AUGMENTATION: If A ---> B, then AC ---> BC
3. TRANSITIVITY: If A ---> B, and B ---> C, then A ---> C
4. ADDITIVITY or UNION: If A ---> B, and A ---> C, then A ---> BC
5. PROJECTIVITY or DECOMPOSITION: If A ---> BC, then A ---> C and A ---> B
6. PSEUDOTRANSITIVITY: If A ---> B, and CB ---> D, then AC ---> D

The RESULT of a query is another table, and therefore the output from one operation can become the input to another operation.

It is possible to take:
   (a) a projection of a union
   (b) a join of 2 (or more) restrictions
   (c) the difference of a join and a restriction

And it is possible to express nested relational expressions - the operands are represented by expressions.

Relational Algebra

8 Basic Operators

  Traditional Set Operators    Special Relational Operators
  Union                        Select
  Intersect                    Project
  Difference                   Join
  Cartesian product            Divide

High level operators act on ONE or MORE relations producing a NEW relation as a result ------> CLOSURE

Most relational DBMS will support SELECT, PROJECT and JOIN.

UNION

The UNION of 2 union compatible relations A and B is the set of all rows belonging to either A or B or both.

  employee
  empid    name     born
  10314    Smith    10-03-1961
  10862    Black    23-05-1946

  salesperson
  empid    name     born
  10911    Jones    16-08-1972
  10314    Smith    10-03-1961

  employee union salesperson
  empid    name     born
  10314    Smith    10-03-1961
  10862    Black    23-05-1946
  10911    Jones    16-08-1972

Notice the elimination of duplicate records.

Difference Operator

The DIFFERENCE between 2 UNION COMPATIBLE relations, A minus B, is the set of all rows belonging to A and NOT to B. See the previous slide for the relations A (employee) and B (salesperson).

  RESULT: EMPLOYEE DIFFERENCE SALESPERSON
  empid    name     born
  10862    Black    23-05-1946

Intersection Operator

The intersection of 2 UNION COMPATIBLE relations is the set of all rows which belong to both A and B.

  EMPLOYEE INTERSECTION SALESPERSON
  empid    name     born
  10314    Smith    10-03-1961

CARTESIAN PRODUCT

The Cartesian Product of 2 relations, A times B, is every possible combination of rows from each relation.

  PART                     SUPPLIER
  partid   partname        supplierid   suppliername
  P1       NUT             S1           SMITH
  P2       BOLT            S2           JONES
  P3       WASHER

  PART times SUPPLIER
  partid   partname   supplierid   suppliername
  P1       NUT        S1           SMITH
  P2       BOLT       S1           SMITH
  P3       WASHER     S1           SMITH
  P1       NUT        S2           JONES
  P2       BOLT       S2           JONES
  P3       WASHER     S2           JONES

SELECT Special Operator

Creates a 'horizontal subset' of a relation: the rows which satisfy a condition.

  EMPLOYEE
  empid   name    deptid   projectid
  E1      GOLD    D1       P1
  E2      BLUE    D6       P1
  E3      WHITE   D1       P2
  E4      RED     D1       P3
  E5      BROWN   D6       P3

  select employee where projectid = p1 or projectid = p2

  RESULT
  empid   name    deptid   projectid
  E1      GOLD    D1       P1
  E2      BLUE    D6       P1
  E3      WHITE   D1       P2

PROJECT Special Operator

Creates a 'vertical subset' of a relation by projecting only certain attributes of a relation. Duplicate rows are removed. See the EMPLOYEE relation on the previous slide.

  Project Employee over projectid giving Result2

  RESULT2
  projectid
  P1
  P2
  P3

JOIN Special Operator

Combines 2 or more relations (tables) based on specified conditions between attributes in each table. (The attributes must have the same domain to be meaningful.)

  SKILL                    SKILL_EMP
  skillid   name           empid   skillid
  S1        database       E1      S1
  S2        C++            E1      S4
  S3        Ingres         E3      S3
  S4        Analysis       E5      S2
                           E5      S4

Equi-Join: The Join condition is =
Natural Join: One of the two identical attributes in an equijoin is removed.

  Join skill_emp and skill where skill.skillid = skill_emp.skillid giving result3

  Result3
  empid   skillid   skillid(skill)   name
  E1      S1        S1               database

Any others?
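One way to answer "Any others?" is to load the two relations and run the join. This is a sketch only, assuming SQLite through Python's standard sqlite3 module in place of the DBMS used in this unit. The table and column names follow the slide, and the row data is copied from the SKILL and SKILL_EMP tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE skill     (skillid TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE skill_emp (empid TEXT, skillid TEXT);
    INSERT INTO skill VALUES
        ('S1', 'database'), ('S2', 'C++'), ('S3', 'Ingres'), ('S4', 'Analysis');
    INSERT INTO skill_emp VALUES
        ('E1', 'S1'), ('E1', 'S4'), ('E3', 'S3'), ('E5', 'S2'), ('E5', 'S4');
""")

# Equi-join: the join condition is "=", and both skillid columns are kept.
equijoin = conn.execute("""
    SELECT se.empid, se.skillid, s.skillid, s.name
    FROM skill_emp se, skill s
    WHERE s.skillid = se.skillid
    ORDER BY se.empid, se.skillid
""").fetchall()

# Natural join: the same rows, with one of the two identical columns removed.
natural = conn.execute("""
    SELECT empid, skillid, name
    FROM skill_emp NATURAL JOIN skill
    ORDER BY empid, skillid
""").fetchall()

for row in equijoin:
    print(row)
```

Every skill_emp row matches exactly one skill row, so the join returns five rows, of which the slide shows only the first.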
Joining a Table to Itself

Typical Query: For each employee, list the employee number, name, Manager and Manager's name.

  Select X.EMPID, X.NAME, X.MGR, Y.NAME
  from EMP X, EMP Y          (same table contents - 'mirrored')
  where X.MGR = Y.EMPID

  Result:
  EMPID   NAME    MGR   NAME
  10      SMITH   40    BROWN
  20      JONES   40    BROWN
  30      BLACK   40    BROWN
  40      BROWN   50    WHITE

The Primary Key and the Foreign Key are both in the same table.
Two virtual tables are created for joining ('alias' feature).

Outer Join

  EMP
  EmpId   Name    Age   DepId   Mgr
  10      smith   25    15      40
  20      jones   28    15      40
  30      black         20      40
  40      brown   46    11      50
  50      white   42    11

  DEP
  DepId   Name       Loc
  11      MIS        Caulfield
  20      Finance    Malvern
  15      Market     City
  17      Accounts   Clayton

  Select d.depid, e.name, e.age
  From dep d, emp e
  where d.depid = e.depid (+)

The + appends a null row to the EMP table for this query, and it is used to join to the DEP rows with no matching employee details.

  DepId   name    age
  11      brown   46
  11      white   42
  15      smith   25
  15      jones   28
  20      black
  17

Joins of Tables

The joining of attributes depends on certain types of relationships. Consider two attributes C1 and C2 which are join attributes. There are 4 types of relationships possible:

• (a) the values of C1 and C2 are equal
• (b) the values of C1 are a subset of those of C2 (or vice versa)
• (c) the values of C1 and C2 are conjoint - they have some values in common
• (d) the values of C1 and C2 are disjoint - they have no values in common

In set theory, these take the forms

   (a) $C1 = C2$
   (b) $C1 \subset C2$ or $C2 \subset C1$
   (c) $C1 \cap C2 \neq \emptyset$, with $C1 - C2 \neq \emptyset$ or $C2 - C1 \neq \emptyset$
   (d) $C1 - C2 = C1$ and $C2 - C1 = C2$

There are a number of possible 'join' types allowable in the relational model. They are:

• 1. Theta join         2. Equijoin
• 3. Natural join       4. Inner join
• 5. Outer join         6. Left Outer join
• 7. Right Outer join   8. Full Outer join

DIVISION

• Divides a BINARY relation by a UNARY relation and produces a UNARY relation as a result.

  emp-skill               skill-reqd      result
  empid   skillcode       skillcode       empid
  E1      S1              S2              E2
  E2      S2              S4
  E3      S3
  E2      S4
  E5      S5
  E6      S6

  Divide emp-skill by skill-reqd to give result

Special note: JOIN, INTERSECTION and DIVISION can be defined in terms of the other 5 operators (which are known as the 'primitive' operators).

A DIVISION example

In the Air Transport Industry, pilots' records contain details of the aircraft they are qualified to fly. There are also records of the number and types of aircraft in the hangars, and of which Company owns what.

In this case, the table of pilots' names and the planes they can fly is the dividend. The table of the planes in the hangars is the divisor. The query is to obtain the names of the pilots who can fly every type of plane in the hangars.

Suggested Solution

• create table pilotskill (pilot vchar(150) not null, plane vchar(15) not null);
• create table hangar (plane vchar(15));
• select pilot
  from pilotskill ps1, hangar h1
  where ps1.plane = h1.plane
  group by ps1.pilot
  having count(ps1.plane) = (select count(*) from hangar);

[notice the absence of any 'division' operator - this is effectively performed by the execution plan]

Division Examples

  Dividend        Divisor     Result
  A    B          C           A
  1    J          J           1
  1    K          K           3
  1    L          L
  2    J
  2    K
  3    K
  3    L
  3    J

  Name     Degree
  Jones    B Sc
  Jensen   B Sc
  Jensen   M Sc
  Jensen   PhD
  Smith    B Sc
  Smith    M Sc
  Rogers   B Sc
  Rogers   PhD

  Divisor D1: M Sc, B Sc, PhD     Result R1: Jensen
  Divisor D2: B Sc, M Sc          Result R2: Jensen, Smith
  Divisor D3: B Sc                Result R3: Jones, Jensen, Smith, Rogers

Relational Algebra Operators

[Diagram: Select, Project, Union, Intersection and Difference shown pictorially]

  Cartesian Product
  (a, b, c) times (x, y) gives (a,x), (a,y), (b,x), (b,y), (c,x), (c,y)

  Natural Join
  (a1,b1), (a2,b1), (a3,b2) joined with (b1,c1), (b2,c2), (b3,c3)
  gives (a1,b1,c1), (a2,b1,c1), (a3,b2,c2)

  Divide
  (a,x), (a,y), (a,z), (b,x), (c,y) divided by (x, y) gives (a)

Data Base Design

4th Generation Environment - User Perception

[Diagram: user, terminal, teleprocessing monitor, report writer, query language, electronic mail, application programs, e-mail files, data dictionary, DBMS, and the data base of structured and non-structured data (images, graphics, video, voice)]

DBMS Command Levels

   - DataBase Administrators: Privileged set of commands. Sometimes called 'superuser'
   - Data Administration
   - Database Developers
   - Application Developers
   - Users with Query rights only
   - Users with Table modification rights