Features of a DBMS
CSE3180 Summer 2005 Lect 06 / 1
Lecture 6
This lecture will cover many of the functions required of a
Database Management System.
It is a good opportunity to review much of the material
covered to date, and will also open the way to other topics
such as Data Dictionary, Integrity, Recovery, Concurrency
considerations and Business Rules.
We will be moving from the Logical Design, through the
Implementation Planning stage to the Physical Design stage
of database design.
A Slight Interlude
• Before we do, here is an answer to the puzzle which kept
you awake last night.
Source                        Target     Time   Progressive
C and D move across (A,B)     C,D          2         2
D returns (A,B,D)             C            1         3
A and B move across (D)       A,B,C       10        13
C returns (C,D)               A,B          2        15
C and D move across           A,B,C,D      2        17
DBMS Outline
[Diagram: three Multi Server DBMS processes, all sharing one Multi-Server Logging and Locking System, which sits above the Data Base]
DBMS Block Diagram
Facilities shown in the diagram:
General Communications     GCF
Sequencer / Dispatcher     SCF
Parser                     PSF
Optimiser                  OPF
Relation Description       RDF
Query Storage              QSF
Query Execution            QEF
Abstract Data Type         ADF
Data Manipulation          DMF
Compatibility Library      (drawn alongside the other facilities)
DBMS Components
3. Parser Facility
Parses query text and builds query tree
Stores query tree in QSF (Query Storage)
Notifies SCF (Sequence/Dispatch) that control must now
pass to OPF (Optimiser)
4. Optimiser Facility
Reads query tree in QSF
Builds optimal query plan and stores plan in QSF
DBMS Components
7. Query Storage - Support Facility
Common Storage Pool for passing objects between PSF, OPF, and QEF
(Size is user option)
Stores Query Plans
The number of stored query plans can be controlled by the user
DBMS Components
9. Data Manipulation Facility
The Access Methods - btree, hash, isam, heap, bitmap
- manages internal page cache
Handles Transactions - locking - deadlocks - logging
Handles Modify and Sorting
DBMS Components
10. Compatibility Library
- Insulates DBMS from the operating system
- Handles all I/O, string comparisons
- Associated with ' porting ' - is the component which
changes and accommodates F.E's
DBMS Functions
1. Data Storage, Retrieval and Update
2. A User-Accessible Catalogue (Dictionary)
3. Support for Shared Update
4. Backup and Recovery Services
5. Security
6. Integrity
7. Data Independence
8. Utility Services
DBMS Functions
The Primary Objectives of a DBMS are to provide
facilities for :
1. Definition of Database Logical Structures
2. Definition of Physical Structures
3. Access to the Database
4. Definition of Storage Structures to store
user data
These components are known as the
‘database architecture’
Data Dictionary - Also known as the Catalog(ue)
Data Dictionary
A DATA DICTIONARY contains the fundamental definitions,
characteristics and uses of data
It describes:
What the data is
Characteristics
Uses of Data
User Permits / Restrictions
A DATA DIRECTORY contains information relating to Physical
Data Storage
Data Dictionary
A Data Dictionary SYSTEM
stores
maintains
provides access
to the Data Dictionary.
It is a set of software
Also known as the Catalog Function
The Dictionary contains information on
Data
Processes
Environment
A representation of a ‘database’
The System Catalogue / Dictionary
User Tables, Views, Sequences
Procedures, Indexes, User Space
Instance(s)
Data Dictionary
A Data Dictionary is a DATABASE about the data held in the
USER DATABASE
Term Used : META DATA
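To make the idea of meta data concrete, here is a minimal sketch using SQLite from Python. SQLite is only a convenient stand-in (the lecture itself refers to Ingres and Oracle), and the table, view and index names are invented for the illustration. SQLite keeps its catalogue in a table called sqlite_master, which can be queried like any user table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (empid INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE VIEW employee_names AS SELECT name FROM employee;
    CREATE INDEX employee_name_idx ON employee (name);
""")

# The catalogue is itself a table: data about the user's data.
catalogue = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name"
).fetchall()

# Column-level meta data for one table: (column name, declared type).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]

for entry in catalogue:
    print(entry)
print(columns)
```

Nothing was inserted into employee, yet the catalogue already has three entries: the dictionary describes the structure, not the contents.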
Data Dictionary
A Data Dictionary can provide data about
1. Relationships between dictionary entity types :
   item      uses  item, module
   table     uses  item, group, module
   module    uses  item, group, file, module
   program   uses  file, module
   system    uses  program, system
2. Listing of all entities
Relationship reports (Which programs use record zzz)
Versioning support
Password support
User access and exits
Data Dictionary
[Diagram: the DATA DICTIONARY, with the database behind it, runs alongside every stage of the system life cycle]
System planning
Requirements definition
Analysis
Design
Implementation
Testing
Operations and maintenance
Data Dictionary
[Diagram: the Data Dictionary sits between its Human Interfaces and its Software and DBMS Interfaces]
Human Interfaces:
   Database Administration
   Application Programmers
   End Users
Software and DBMS Interfaces:
   Compilers
   PreCompilers
   Application Programs / Report Generators
   Integrity Constraints
Data Dictionary
Some Benefits from Data Dictionary Use:
1. Better data management - Redundancies, Standards, Documentation
2. Reduction in system development time - Cross reference listings, Auto copy libraries
3. Reduction in maintenance costs
4. Quicker and More Accurate changes possible
5. Documentation standards
6. Data Audit - cross references, 'where used' listings
Some of the 770 DBA Tables
DBA_VIEWS
ALL_ERRORS
ALL_TABLES
ALL_OBJECTS
USER_COLL_TYPES
USER_COL_COMMENTS
USER_COL_PRIVS
USER_COL_PRIVS_MADE
USER_ASSOCIATIONS
USER_AUDIT_OBJECT
USER_AUDIT_SESSION
USER_VIEWS
USER_CLU_COLUMNS
USER_AUDIT_STATEMENT
USER_AUDIT_TRAIL
USER_CATALOG
USER_TAB_PRIVS
USER_ARGUMENTS
USER_ALL_TABLES
USER_TAB_PRIVS
V$SQL
V$SQLAREA
V$SHARED_MEMORY
GV$DISPATCHER
Integrity
Integrity
Integrity is a collection of processes, procedures and
techniques which are used to ensure that data held in a
database is
COMPLETE
ACCURATE
CLEAR
thus ensuring that Information derived from the database
also has these characteristics
Integrity
C.R.U.D.E.
C   Column Integrity - Linked to Domain Integrity
R   Referential Integrity
U   User Defined Integrity
D   Domain Integrity - A user defined datatype
E   Entity Integrity
Database Integrity
Some terms you will encounter:
Entity Integrity
Referential Integrity
Functional Dependency (constraints between determinants
and attributes. For each value of the determinant there is only one value
for each of the attributes it determines)
Multivalued Dependency
Join Dependency
Domain Constraints
Cardinality Constraint
User Defined Constraints
Data Integrity
General Principle: Data compliance with a set of rules
Rules Location: Best embodied in the DBMS
If they are contained in an application, there is the danger of
saturating a network and causing degraded performance.
This is particularly so in client / server computing
CONSTRAINTS: Declarative approach where integrity
constraints are ‘declared’ as part of a table specification.
ANSI SQL-99 standards include specifications for integrity
constraint syntax and behaviour
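The declarative approach can be sketched as follows. This is an illustration only, run against SQLite from Python rather than the systems used in the unit; the employee table, its columns and the salary rule are assumptions made for the example. The rules are declared once in the table specification and the DBMS rejects any row that breaks them, whichever application sends it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        empid  INTEGER PRIMARY KEY,            -- no two rows may share a key
        name   TEXT    NOT NULL,               -- value must be present
        salary INTEGER CHECK (salary >= 0)     -- restricts the allowed values
    )
""")

def try_insert(row):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO employee (empid, name, salary) VALUES (?, ?, ?)", row
        )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert((1, "Smith", 50000)),   # satisfies every rule
    try_insert((2, None, 40000)),      # violates NOT NULL
    try_insert((3, "Black", -1)),      # violates CHECK
    try_insert((1, "Jones", 30000)),   # duplicate primary key
]
print(results)
```

Because the checks run inside the DBMS, no extra round trips between client and server are needed to validate a row.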
INTEGRITY CONSTRAINTS
DATABASE INTEGRITY
Refers to correctness and consistency of data
Quality Assurance
Usually expressed in terms of CONSTRAINTS
- consistency rules which must not be subverted
Forms of Constraints
1. ENTITY INTEGRITY - Primary Key Value
NO attribute of a primary key value may be NULL
2. REFERENTIAL INTEGRITY - Foreign Key Values
If a FOREIGN KEY exists in a relation, then either
(1) the foreign key value MUST match the Primary Key
value of some row in its home (or Primary) relation OR
(2) the FOREIGN KEY must be NULL
3. FUNCTIONAL DEPENDENCY - Determinant
For each value of the DETERMINANT, there must be
only ONE value for each of the attributes which it
determines
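The functional dependency rule can be tested mechanically on a set of rows. The sketch below is one possible check, not part of any DBMS; the function name fd_holds and the enrolment data (students, units, lecturers) are invented for the illustration.

```python
def fd_holds(rows, determinant, dependents):
    """Return True if determinant -> dependents holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependents)
        # setdefault stores the first value met for this determinant value
        # and returns the stored one on every later occurrence.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical data: each unit has one lecturer, but a student takes many units.
enrolment = [
    {"student": "S1", "unit": "U1", "lecturer": "Lee"},
    {"student": "S2", "unit": "U1", "lecturer": "Lee"},
    {"student": "S1", "unit": "U2", "lecturer": "Kim"},
]

unit_determines_lecturer = fd_holds(enrolment, ["unit"], ["lecturer"])
student_determines_unit = fd_holds(enrolment, ["student"], ["unit"])
print(unit_determines_lecturer, student_determines_unit)
```

A check like this can only show that a dependency is violated by the current rows; whether it should hold is a statement about the business rules, not about one snapshot of the data.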
Forms of Constraints
4. MULTIVALUED DEPENDENCIES
If A,B and C are three sets of attributes, then A
multidetermines B if and only if the set of B values
associated with each A value is independent of the C
values
5. JOIN DEPENDENCY - Relation Reconstruction
A relation can be reconstructed by taking the join of its
projections
6. DOMAIN CONSTRAINT - Value restrictions
Possible values of a data item are restricted to a
specific set called the DOMAIN
Forms of Constraints
7. CARDINALITY CONSTRAINT
The number of entities which can be related is subject to
a constraint
8. SET RETENTION CONSTRAINT
The deletion of records is subject to limitations
9. EXISTENCE DEPENDENCY
Hierarchical model (also OODB). Dependency of a child
on the parents limits insertion and deletion of segments
Forms of Constraints
10. GENERAL CONSTRAINTS
Those restrictions which can be expressed as arbitrary
predicates about the data.
e.g. no class may be scheduled for Room B.215 after
2.00pm on Fridays
General Comments: DBMSs have deficiencies in their ability to
express and enforce constraints.
Oracle uses ‘Triggers and Constraints’, and later versions of
SQL use a mechanism called ASSERTIONS.
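The Room B.215 rule above can be sketched as a trigger. This is a hedged illustration in SQLite (driven from Python), not Oracle syntax; the class_schedule table, its columns, the 'Fri' day code and the 24-hour text time format are all assumptions made for the example. "After 2.00pm" is read here as a start time later than 14:00.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE class_schedule (
        unit       TEXT NOT NULL,
        room       TEXT NOT NULL,
        day        TEXT NOT NULL,      -- assumed codes such as 'Mon', 'Fri'
        start_time TEXT NOT NULL       -- assumed 24-hour 'HH:MM' text
    );

    CREATE TRIGGER no_late_friday_b215
    BEFORE INSERT ON class_schedule
    WHEN NEW.room = 'B.215' AND NEW.day = 'Fri' AND NEW.start_time > '14:00'
    BEGIN
        SELECT RAISE(ABORT, 'Room B.215 is not available after 2.00pm on Fridays');
    END;
""")

def schedule(unit, room, day, start_time):
    """Return True if the class was stored, False if the trigger refused it."""
    try:
        conn.execute(
            "INSERT INTO class_schedule VALUES (?, ?, ?, ?)",
            (unit, room, day, start_time),
        )
        return True
    except sqlite3.DatabaseError:
        return False

outcomes = [
    schedule("CSE3180", "B.215", "Fri", "10:00"),   # Friday morning
    schedule("CSE3180", "B.215", "Fri", "15:00"),   # Friday after 2.00pm
    schedule("CSE3180", "B.215", "Mon", "15:00"),   # another day
]
print(outcomes)
```

A complete version would also need a matching BEFORE UPDATE trigger, otherwise an existing row could be changed into one that breaks the rule.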
Referential Integrity
Foreign Key Concept - An attribute (or set of attributes) in one
table (the referencing table) occurs as the Primary Key of
another table (the Primary, Lookup or Referenced table)
Referential Integrity Constraint:
The Value of a Foreign Key Must Be a Key Value
in the Referenced Table
OR
The Value of the Foreign Key Must Be Undefined (Null)
This cannot occur if the Foreign Key is part of the Primary Key
of the Referencing Table
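A minimal sketch of the referential integrity constraint, again using SQLite from Python as a stand-in. The dep and emp tables are cut-down versions of the ones used later in these notes, and the employee 'green' with department 99 is invented. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE dep (depid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emp (
        empid INTEGER PRIMARY KEY,
        name  TEXT,
        depid INTEGER REFERENCES dep (depid)   -- foreign key, NULL permitted
    );
    INSERT INTO dep VALUES (11, 'MIS');
""")

def accepts(row):
    """Return True if the emp row is accepted, False if it is rejected."""
    try:
        conn.execute("INSERT INTO emp VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

matching_key = accepts((40, "brown", 11))     # 11 exists in dep
null_key     = accepts((50, "white", None))   # undefined foreign key
dangling_key = accepts((60, "green", 99))     # no department 99
print(matching_key, null_key, dangling_key)
```

The NULL case is accepted only because depid is not part of emp's primary key, which is the point made at the end of the slide.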
Possible Referential Integrity Processes
1. Limited Insert : If an incoming Foreign Key DOES NOT
EXIST as a referenced table Primary Key:
ABORT TRANSACTION - REPORT
2. Limited Update : If an incoming Foreign Key DOES NOT
EXIST as a referenced table Primary Key
TERMINATE PROCESS
3. Restricted Delete : If there are referencing FOREIGN KEYS
in a referencing table
TERMINATE DELETE PROCESS ON REFERENCED
TABLE
Possible Referential Integrity Processes
4. Restricted Update : If there are referencing Foreign Keys in
a referencing table
INHIBIT UPDATE OPERATION ON THE REFERENCED
KEY
5. Cascade Delete : If there are Referenced Keys
INITIATE DELETION OPERATION ON REFERENCED
TABLE BY DELETING ALL REFERENCING ROWS
6. Cascade Update : Commence an UPDATE on the
REFERENCED TABLE by UPDATING the Foreign Keys
on all Referencing Rows in the Referencing Table(s)
Possible Referential Integrity Processes
7. Nullify Delete : Commence a DELETE operation on the
REFERENCED table by setting ALL the FOREIGN
KEYS on the Referencing Table(s) to NULL (watch Data
Types)
8. Nullify Update : Set all of the Foreign Keys of the
Referencing Table to NULL. This will invalidate any
referencing of the Referenced Key (which must not be
NULL)
9. Default Update : Invalidate references to Updated
Referenced Keys by setting all Referencing Table
Foreign Keys to a DEFAULT value
Possible Referential Integrity Processes
10. Default Delete : Invalidate references to the deleted
Referenced Key Value(s) by setting all Referencing Foreign
Key values to a DEFAULT value
11. Warning Delete : Permit the deletion BUT Warn the user of
the Unattached Foreign Keys which are now present in the
Referencing Table(s)
12. Warning Update : Permit the Update BUT Warn the User of
Unattached Foreign Keys which are now present in the
Referencing Table(s)
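Four of the delete processes listed above correspond to standard SQL referential actions: Restricted Delete to ON DELETE RESTRICT, Cascade Delete to CASCADE, Nullify Delete to SET NULL and Default Delete to SET DEFAULT. The sketch below tries each one in SQLite from Python. It is an illustration only: the helper name delete_department, the cut-down tables and the default department 11 are assumptions, and the Warning variants have no direct SQL equivalent so they are not shown.

```python
import sqlite3

def delete_department(action):
    """Delete department 15 under the given ON DELETE action.

    Returns the emp rows left afterwards as (empid, depid) pairs,
    or None if the DBMS refused the delete.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(f"""
        CREATE TABLE dep (depid INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE emp (
            empid INTEGER PRIMARY KEY,
            name  TEXT,
            depid INTEGER DEFAULT 11
                  REFERENCES dep (depid) ON DELETE {action}
        );
        INSERT INTO dep VALUES (11, 'MIS'), (15, 'Market');
        INSERT INTO emp VALUES (10, 'smith', 15), (40, 'brown', 11);
    """)
    try:
        conn.execute("DELETE FROM dep WHERE depid = 15")
    except sqlite3.IntegrityError:
        return None
    return conn.execute("SELECT empid, depid FROM emp ORDER BY empid").fetchall()

restricted = delete_department("RESTRICT")     # delete refused
cascaded   = delete_department("CASCADE")      # referencing row deleted too
nullified  = delete_department("SET NULL")     # foreign key set to NULL
defaulted  = delete_department("SET DEFAULT")  # foreign key set to 11
print(restricted, cascaded, nullified, defaulted)
```

The same four actions can be declared with ON UPDATE to cover the update processes in the list.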
A Deeper Look into a DBMS
CLOSURE (Relational Algebra)
Inference Rules; Armstrong’s Axioms
(Rules for Inference for Functional Dependencies)
Premise:
If F is a set of functional dependencies of relation R, the set of
ALL FUNCTIONAL DEPENDENCIES which can be derived
from F, called F+, is called the closure of F
CLOSURE (Relational Algebra)
1. REFLEXIVITY : If B is a subset of A, then A ---> B
2. AUGMENTATION : If A ---> B, then AC ---> BC
3. TRANSITIVITY : If A ---> B, and B ---> C, then A ---> C
4. ADDITIVITY or UNION
If A ---> B and A ---> C, then A ---> BC
CLOSURE (Relational Algebra)
5. PROJECTIVITY or DECOMPOSITION
If A ---> BC, then A--->C and A ---> B
6. PSEUDOTRANSITIVITY
If A --->B, and CB --->D, then AC ---> D
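Computing the whole of F+ is rarely practical, so the usual way to test whether one dependency follows from F is the attribute closure algorithm, which is a different (but equivalent) technique from applying the axioms one at a time. The sketch below is a minimal version; the function names are chosen for this illustration, and the test case is the pseudotransitivity rule stated above.

```python
def attribute_closure(attributes, fds):
    """Return every attribute determined by `attributes` under `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is a set of attribute names.
    """
    closure = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Once the whole left side is known, the right side follows.
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

def implies(fds, lhs, rhs):
    """True if lhs -> rhs can be derived from fds, i.e. it belongs to F+."""
    return set(rhs) <= attribute_closure(lhs, fds)

# F = { A -> B, CB -> D }
fds = [({"A"}, {"B"}), ({"C", "B"}, {"D"})]

ac_closure = attribute_closure({"A", "C"}, fds)
print(sorted(ac_closure))                  # attributes determined by AC
print(implies(fds, {"A", "C"}, {"D"}))     # AC -> D (pseudotransitivity)
print(implies(fds, {"A"}, {"D"}))          # A alone does not determine D
```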
CLOSURE (Relational Algebra)
The RESULT of a query is another table,
and therefore the output from one operation can become the
input to another operation
It is possible to take:
(a) a projection of a union
(b) a join of 2 (or more) restrictions
(c) the difference of a join and a restriction
And it is possible to express nested relational expressions
- the operands are represented by expressions
Relational Algebra
8 Basic Operators
Traditional Set Operators     Special Relational Operators
Union                         Select
Intersect                     Project
Difference                    Join
Cartesian product             Divide
High level operators act on ONE or MORE relations producing a NEW
relation as a result ------> CLOSURE
Most relational DBMS will support SELECT, PROJECT and JOIN
UNION
The UNION of 2 union compatible relations A and B is the set
of all rows belonging to either A or B or both
employee
empid   name    born
10314   Smith   10-03-1961
10862   Black   23-05-1946

salesperson
empid   name    born
10911   Jones   16-08-1972
10314   Smith   10-03-1961

employee union salesperson
empid   name    born
10314   Smith   10-03-1961
10862   Black   23-05-1946
10911   Jones   16-08-1972

Notice the elimination of duplicate records
DIFFERENCE Operator
The DIFFERENCE between 2 UNION COMPATIBLE
relations, A minus B, is the set of all rows belonging to A
and NOT to B.
See previous for the relations A and B
RESULT: EMPLOYEE DIFFERENCE SALESPERSON
empid   name    born
10862   Black   23-05-1946
Intersection Operator
The intersection of 2 UNION COMPATIBLE relations, A and B, is the
set of all rows which belong to both A and B.
EMPLOYEE INTERSECTION SALESPERSON
empid   name    born
10314   Smith   10-03-1961
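The three set operators on the employee and salesperson relations can be reproduced with Python sets, which is a reasonable model because a relation is a set of rows. This is only a sketch: each row is represented as a tuple (empid, name, born), a representation chosen for the illustration.

```python
employee = {
    (10314, "Smith", "10-03-1961"),
    (10862, "Black", "23-05-1946"),
}
salesperson = {
    (10911, "Jones", "16-08-1972"),
    (10314, "Smith", "10-03-1961"),
}

# Both relations have the same attributes, so they are union compatible.
union = employee | salesperson          # rows in either relation, duplicates removed
difference = employee - salesperson     # rows in employee and not in salesperson
intersection = employee & salesperson   # rows in both relations

print(sorted(union))
print(sorted(difference))
print(sorted(intersection))
```

Smith appears in both inputs but only once in the union, which is the duplicate elimination noted on the UNION slide.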
CARTESIAN PRODUCT
The Cartesian Product of 2 relations, A times B, is every
possible combination of rows from each relation
PART                     SUPPLIER
partid   partname        supplierid   suppliername
P1       NUT             S1           SMITH
P2       BOLT            S2           JONES
P3       WASHER

PART times SUPPLIER
partid   partname   supplierid   suppliername
P1       NUT        S1           SMITH
P2       BOLT       S1           SMITH
P3       WASHER     S1           SMITH
P1       NUT        S2           JONES
P2       BOLT       S2           JONES
P3       WASHER     S2           JONES
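The Cartesian product of PART and SUPPLIER can be sketched with itertools.product from the Python standard library. The row layout (tuples) is an assumption of the illustration; the supplier is made the outer loop only so that the rows come out in the same order as on the slide.

```python
from itertools import product

part = [("P1", "NUT"), ("P2", "BOLT"), ("P3", "WASHER")]
supplier = [("S1", "SMITH"), ("S2", "JONES")]

# Every part row is paired with every supplier row: 3 x 2 = 6 rows.
cartesian = [p + s for s, p in product(supplier, part)]

for row in cartesian:
    print(row)
```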
Special SELECT Operator
Creates a 'Horizontal subset' of a relation by satisfying a condition
EMPLOYEE
empid   name    deptid   projectid
E1      GOLD    D1       P1
E2      BLUE    D6       P1
E3      WHITE   D1       P2
E4      RED     D1       P3
E5      BROWN   D6       P3

select employee where projectid = p1 or projectid = p2

RESULT
empid   name    deptid   projectid
E1      GOLD    D1       P1
E2      BLUE    D6       P1
E3      WHITE   D1       P2
PROJECT Special Operator
Creates a 'vertical subset' of a relation by projecting only
certain attributes of a relation. Duplicate rows are removed.
See previous.
Project Employee over projectid giving Result2
RESULT2
projectid
P1
P2
P3
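SELECT and PROJECT on the EMPLOYEE relation can be sketched in a few lines of Python. The functions select and project below are written for this illustration (they are not library calls), and rows are held as dictionaries keyed by attribute name.

```python
employee = [
    {"empid": "E1", "name": "GOLD",  "deptid": "D1", "projectid": "P1"},
    {"empid": "E2", "name": "BLUE",  "deptid": "D6", "projectid": "P1"},
    {"empid": "E3", "name": "WHITE", "deptid": "D1", "projectid": "P2"},
    {"empid": "E4", "name": "RED",   "deptid": "D1", "projectid": "P3"},
    {"empid": "E5", "name": "BROWN", "deptid": "D6", "projectid": "P3"},
]

def select(relation, predicate):
    """Horizontal subset: keep the rows that satisfy the condition."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """Vertical subset: keep the named attributes and drop duplicate rows."""
    result = []
    for row in relation:
        projected = tuple(row[a] for a in attributes)
        if projected not in result:
            result.append(projected)
    return result

result = select(employee, lambda r: r["projectid"] in ("P1", "P2"))
result2 = project(employee, ["projectid"])

print([r["empid"] for r in result])
print(result2)
```

Five employee rows project down to three project ids because the duplicates P1 and P3 are removed.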
JOIN Special Operator
Combines 2 or more relations (tables) based on specified conditions
between attributes in each table. (The attributes must have the same
domain to be meaningful)
SKILL                    SKILL_EMP
skillid   name           empid   skillid
S1        database       E1      S1
S2        C++            E1      S4
S3        Ingres         E3      S3
S4        Analysis       E5      S2
                         E5      S4

Equi-Join: The Join condition is =
Natural Join: One of the two identical attributes in an equijoin is removed

Join skill_emp and skill where skill.skillid = skill_emp.skillid giving result3
Result3
empid   skillid   skillid(skill)   name
E1      S1        S1               database
Any others ?
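The difference between the equijoin and the natural join can be seen by computing both on the SKILL and SKILL_EMP data. This is a plain nested-loop sketch in Python, with rows held as tuples; it is not how a DBMS would necessarily execute the join.

```python
skill = [("S1", "database"), ("S2", "C++"), ("S3", "Ingres"), ("S4", "Analysis")]
skill_emp = [("E1", "S1"), ("E1", "S4"), ("E3", "S3"), ("E5", "S2"), ("E5", "S4")]

# Equijoin: the join condition is equality, and both skillid columns are kept.
equijoin = [
    (empid, emp_skillid, skillid, name)
    for empid, emp_skillid in skill_emp
    for skillid, name in skill
    if emp_skillid == skillid
]

# Natural join: the same rows with one of the two identical columns removed.
natural_join = [(empid, emp_skillid, name) for empid, emp_skillid, _, name in equijoin]

for row in equijoin:
    print(row)
```

Running it lists the remaining rows of result3, one for each row of SKILL_EMP.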
Joining a Table to Itself
Typical Query: For each employee, list the employee number, name,
Manager and Manager’s name
Select X.EMPID, X.NAME, X.MGR, Y.NAME
from EMP X, EMP Y (same table contents - ‘mirrored’)
where X.MGR = Y.EMPID
Result:
EMPID   NAME    MGR   NAME
10      SMITH   40    BROWN
20      JONES   40    BROWN
30      BLACK   40    BROWN
40      BROWN   50    WHITE
The Primary Key and the Foreign Key are both in the same table
Two virtual tables are created for joining (‘alias’ feature)
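The self-join can be run as written against a small EMP table. The sketch below uses SQLite from Python; the row for WHITE (employee 50, with no manager) is an assumption taken from the EMP table on the Outer Join slide, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (empid INTEGER PRIMARY KEY, name TEXT, mgr INTEGER);
    INSERT INTO emp VALUES
        (10, 'SMITH', 40), (20, 'JONES', 40), (30, 'BLACK', 40),
        (40, 'BROWN', 50), (50, 'WHITE', NULL);
""")

# X and Y are two aliases for the same table: X is read as the employee,
# Y as that employee's manager.
rows = conn.execute("""
    SELECT x.empid, x.name, x.mgr, y.name
    FROM emp x, emp y
    WHERE x.mgr = y.empid
    ORDER BY x.empid
""").fetchall()

for row in rows:
    print(row)
```

WHITE does not appear as an employee in the result because a NULL manager never satisfies x.mgr = y.empid; an outer join would be needed to keep that row.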
Outer Join
EMP
EmpId   Name    Age   DepId   Mgr
10      smith   25    15      40
20      jones   28    15      40
30      black         20      40
40      brown   46    11      50
50      white   42    11

DEP
DepId   Name       Loc
11      MIS        Caulfield
20      Finance    Malvern
15      Market     City
17      Accounts   Clayton

Select d.depid, e.name, e.age
From dep d , emp e
where d.depid = e.depid (+)

The + appends a null row to the EMP table for this query, and it is used to join to the
DEP rows with no matching employee details

DepId   name    age
11      brown   46
11      white   42
15      smith   25
15      jones   28
20      black
17
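The (+) notation is Oracle-specific. The same result can be obtained with the standard LEFT OUTER JOIN syntax, sketched here in SQLite from Python. The tables are cut down to the columns the query uses, black's age is stored as NULL because none is shown on the slide, and the ORDER BY is an addition to fix the output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dep (depid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emp (empid INTEGER PRIMARY KEY, name TEXT, age INTEGER, depid INTEGER);
    INSERT INTO dep VALUES (11, 'MIS'), (15, 'Market'), (17, 'Accounts'), (20, 'Finance');
    INSERT INTO emp VALUES
        (10, 'smith', 25, 15), (20, 'jones', 28, 15), (30, 'black', NULL, 20),
        (40, 'brown', 46, 11), (50, 'white', 42, 11);
""")

# Every dep row is kept; where no employee matches, the emp columns are NULL.
rows = conn.execute("""
    SELECT d.depid, e.name, e.age
    FROM dep d LEFT OUTER JOIN emp e ON d.depid = e.depid
    ORDER BY d.depid, e.name
""").fetchall()

for row in rows:
    print(row)
```

Department 17 has no employees, so it appears once with NULL name and age, which an inner join would have dropped.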
Joins of Tables
The joining of attributes depends on certain types of
relationships;
Consider two attributes C1 and C2 which are join attributes
There are 4 types of relationships possible
• (a) the values of C1 and C2 are equal
• (b) the values of C1 are a subset of those of C2 (or vice
versa)
• (c) the values of C1 and C2 are conjoint - they have some
values in common
• (d) the values of C1 and C2 are disjoint - they have no
values in common
Joins of Tables
In set theory, these take the forms
(a) C1 = C2
(b) C1 ⊂ C2 or C2 ⊂ C1
(c) C1 ∩ C2 ≠ ∅ (equivalently, C1 - C2 ≠ C1 and C2 - C1 ≠ C2)
(d) C1 - C2 = C1 and C2 - C1 = C2
Joins of Tables
There are a number of possible ‘join’ types allowable in the
relational model
They are:
1. Thetajoin
2. Equijoin
3. Natural join
4. Inner join
5. Outer join
6. Left Outer join
7. Right Outer join
8. Full Outer join
DIVISION
• Divides a BINARY relation by a UNARY relation and
produces a UNARY relation as a result.
emp-skill              skill-reqd     result
empid   skillcode      skillcode      empid
E1      S1             S2             E2
E2      S2             S4
E3      S3
E2      S4
E5      S5
E6      S6
Divide emp-skill by skill-reqd to give result
Special note: JOIN, INTERSECTION and DIVISION can be defined
in terms of the other 5 operators (which are known as
the ‘primitive’ operators).
A DIVISION example
In the Air Transport Industry, pilots' records contain details of
the aircraft they are qualified to fly. And there are also
records of the number and types of aircraft in the hangars
and which Company owns what.
In this case, the table of pilots' names and the planes they can
fly is the dividend
The details of the planes in the hangars is the divisor
The query is to obtain the names of the pilots who can fly
every type of plane in the hangars (the quotient)
Suggested Solution
• create table pilotskill (pilot varchar(150) not null,
plane varchar(15) not null);
• create table hangar (plane varchar(15));
• select ps1.pilot from pilotskill ps1, hangar h1
where ps1.plane = h1.plane
group by ps1.pilot
having count(ps1.plane) = (select count(*) from hangar);
[notice the absence of any ‘division’ operator - this is effectively
performed by the execution plan]
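The counting approach can be tried out on a few rows. The sketch below runs the query in SQLite from Python; the pilot names and aircraft types are invented for the illustration, and the ORDER BY is added only to fix the output order. The method assumes that each (pilot, plane) pair and each hangar plane is stored once; with duplicates, count(distinct ...) would be needed on both sides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pilotskill (pilot VARCHAR(150) NOT NULL, plane VARCHAR(15) NOT NULL);
    CREATE TABLE hangar (plane VARCHAR(15));
    INSERT INTO hangar VALUES ('B737'), ('A320');
    INSERT INTO pilotskill VALUES
        ('Smith', 'B737'), ('Smith', 'A320'), ('Smith', 'DC3'),
        ('Jones', 'B737'),
        ('Brown', 'A320'), ('Brown', 'B737');
""")

# A pilot qualifies when the number of hangar planes they can fly
# equals the total number of planes in the hangar.
pilots = [row[0] for row in conn.execute("""
    SELECT ps1.pilot
    FROM pilotskill ps1, hangar h1
    WHERE ps1.plane = h1.plane
    GROUP BY ps1.pilot
    HAVING COUNT(ps1.plane) = (SELECT COUNT(*) FROM hangar)
    ORDER BY ps1.pilot
""")]

print(pilots)
```

Smith's extra DC3 rating does not matter because the join discards planes that are not in the hangar before the count is taken.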
Division Examples
Dividend      Divisor     Result
A   B         C
1   J         J           1
1   K         K           3
1   L         L
2   J
2   K
3   K
3   L
3   J
Division Examples
Name     Degree
Jones    B Sc
Jensen   B Sc
Jensen   M Sc
Jensen   PhD
Smith    B Sc
Smith    M Sc
Rogers   B Sc
Rogers   PhD

Divisor D1 = M Sc, B Sc, PhD     Result R1 = Jensen
Divisor D2 = B Sc, M Sc          Result R2 = Jensen, Smith
Divisor D3 = B Sc                Result R3 = Jones, Jensen, Smith, Rogers
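Division itself can be written directly from its definition: keep each left-hand value that is paired with every value in the divisor. The sketch below is a minimal Python version applied to the Name / Degree example; the function name divide and the variable names are chosen for this illustration.

```python
def divide(dividend, divisor):
    """Divide a binary relation by a unary relation.

    dividend: set of (x, y) pairs; divisor: set of y values.
    Returns the set of x values that are paired with every y in the divisor.
    """
    candidates = {x for x, _ in dividend}
    return {
        x for x in candidates
        if all((x, y) in dividend for y in divisor)
    }

qualifications = {
    ("Jones", "B Sc"),
    ("Jensen", "B Sc"), ("Jensen", "M Sc"), ("Jensen", "PhD"),
    ("Smith", "B Sc"), ("Smith", "M Sc"),
    ("Rogers", "B Sc"), ("Rogers", "PhD"),
}

r1 = divide(qualifications, {"M Sc", "B Sc", "PhD"})
r2 = divide(qualifications, {"B Sc", "M Sc"})
r3 = divide(qualifications, {"B Sc"})

print(sorted(r1), sorted(r2), sorted(r3))
```

The smaller the divisor, the larger the result: everyone holds a B Sc, but only Jensen holds all three degrees.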
Relational Algebra Operators
[Diagram: pictorial summary of Select, Project, Union, Intersection, Difference and Cartesian Product]
Cartesian Product example: {a, b, c} times {x, y} gives (a,x), (a,y), (b,x), (b,y), (c,x), (c,y)
Relational Algebra Operators
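[Diagram: pictorial summary of Natural Join and Divide]
Natural Join example: {(a1,b1), (a2,b1), (a3,b2)} joined with {(b1,c1), (b2,c2), (b3,c3)} gives (a1,b1,c1), (a2,b1,c1), (a3,b2,c2)
Divide example: {(a,x), (a,y), (a,z), (b,x), (c,y)} divided by {x, y} gives a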
Data Base Design
4th Generation Environment - User Perception
[Diagram components: user terminal; teleprocessing monitor; report writer; query language; electronic mail; application programs; e-mail files; data dictionary; DBMS; database]
structured and non-structured data
images, graphics, video, voice
DBMS Command Levels
DataBase Administrators
Privileged set of commands.
Sometimes called 'superuser'
Data Administration
Database Developers
Application Developers
Users with Query rights only
Users with Table modification rights