CIS 507
Database Programming
Database Concepts
DBMS Models
Database
A DATABASE is an organized collection of related data
McFadden-Hoffer
– a shared collection of logically related data
– designed to meet the information needs of multiple users in an organization
Kroenke
– a self-describing collection of integrated records
    – self-describing: in addition to the user’s source data, contains a description of its own structure
    – collection of integrated records: user data, metadata (data about data), indexes to represent relationships among data and improve performance, data about applications that use the database
Rob-Coronel
– a shared integrated computer structure that houses a collection of
    – end-user data--i.e. raw facts of interest to the end-user
    – metadata through which the data are integrated

File Management Systems
The predecessor to the modern database management system
Problems
– Maintenance: Data created, managed, and accessed primarily through 3GL (COBOL, PL/I)
– Data dependent: all components sensitive to changes made to data structure or storage and retrieval methods
– Data Redundancy: uncontrolled duplication of data
    – Data Inconsistency
    – Data Anomalies
– Sharing: inconsistent standards made it impossible to share data

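The redundancy problem can be made concrete with a minimal Python sketch. The two records reuse values from the sample data shown later in these slides; the dictionary layout and the helper name `descriptions_for` are assumptions made for illustration only. Because the course description is repeated in every enrollment record, an update that misses one copy leaves the file inconsistent.

```python
# Hypothetical flat-file records: the course description is repeated
# in every enrollment record (uncontrolled redundancy).
flat_file = [
    {"student_id": 1000, "class": "cis121", "description": "prog2"},
    {"student_id": 1005, "class": "cis121", "description": "prog2"},
]


def descriptions_for(records, class_code):
    """Return the set of distinct descriptions stored for one class."""
    return {r["description"] for r in records if r["class"] == class_code}


# An application updates only the first matching record (an update anomaly).
flat_file[0]["description"] = "programming 2"

# The same fact is now stored with two different values.
inconsistent = len(descriptions_for(flat_file, "cis121")) > 1
print(inconsistent)
```

Storing the description once, in a separate course record, would remove the duplicate copy that went stale here.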
DBMS
A Database Management System (DBMS) is a general-purpose software and hardware facility to:
– Create, delete, reorganize, and manipulate data in a database
– Store, retrieve, share, and maintain data in a database
– Maintain relationships between the database components
– Provide security and procedures relating to privilege and access
– Ensure the integrity of all the updates and transactions that are carried out
– Provide an interface for the access, deletion, and addition of data and for redefining the relationships within the database
A DBMS is a collection of programs that manages the database structure and controls access to the data stored in the database.

DBMS Disadvantages
DBMSs are complex;
Need for explicit backup and control;
Costs associated with development and operation can be substantial;
Consolidation of an entire business’ information resources can create a high level of vulnerability.

The Database System Environment
Hardware: computer, storage, networks, devices
Software: OS, DBMS, Applications, Utilities
People:
– System Administrator
– Database Administrator
– Database Designers
– Systems Analysts and Programmers
– End-Users
Procedures
Data

Database Systems Types
Number of Users:
– Single-user: usually desktop
– Multi-user: workgroup (small); enterprise (large)
Location:
– Centralized: all data stored in a database at a single site
– Distributed: database is distributed across several sites
Type and Use:
– Production (transactional): designed to support day-to-day use
– Decision Support: designed to make tactical and strategic decisions at middle- and high-management levels
– Decision Support Systems (data warehouse): use of historical data from many sources to make decisions such as pricing, sales forecasts, marketing positioning (e.g. structural estimates for insurance by underwriters)

DBMS Functions
Data Dictionary Management
Data Storage Management--Data Storage Definition Language (DSDL)
Data Transformation and Presentation
Database Control Language (DBCL)
– Security Management
– Multi-User Access Control
Backup and Recovery Management
Data Integrity Management
Data Access Languages
– Data Definition Language (DDL)
– Data Manipulation Language (DML)
Application Program Interfaces
– COBOL, C, PASCAL, Visual Basic
– Administrative Utilities
Data Communication Interfaces
– queries, reports, email through web browsers

Features of a good DBMS






Open ended--can be extended
Flexible--can be changed
Efficient
Easy to use
Security should be built-in.
Data independence
Models
A database is a model of a user’s model of reality (Kroenke)
Many different types of models involved in databases
[Diagram: Reality (Objects, Properties, Relationships, Unique Identifier) alongside the ANSI/SPARC models (Conceptual Model, External Model, Internal Model, Physical Model)]

ANSI/SPARC Architecture
American National Standards Institute/Standards Planning and Requirements Committee
Conceptual Model (Database Administrator view)
External Model (end-user views)
Internal Model (DBA view)
Physical Model (storage view)

Conceptual Model
Global view of data
– Enterprise-wide view as seen by DBA
– Conceptual schema
    – basic blueprint for the database design
    – frequently represented with E-R diagrams
Hardware and software independent

External Model
Accessed by
– Application programmer
– End-user
External Schema
– User’s authorized view of the data
– A subset of the Conceptual Schema or a logical view of the Conceptual Schema
Hardware independent; software dependent

Internal Model
Implementation of Conceptual Schema
– Hierarchical Model DBMS
– Network (CODASYL) Model DBMS
– Relational Model DBMS
– Object-Oriented Model DBMS
– Semantic Model DBMS
Hardware independent; software dependent

Physical Model
Description of how data is to be stored
– Definition of physical storage devices
– Definition of physical access methods
Hardware and software dependent

Modeling Reality
Common Conceptual Modeling Terms:
– Entity: a person, place, event, or thing for which data is to be collected
– Attributes: properties or characteristics of an entity which describe the entity in the context of interest
– Identifier: a means of distinguishing one entity from another
– Entity Class: a collection of all entities of the same type, i.e. entities that have exactly the same properties
– Relationship: an association among entities in the same or different classes
Internal Modeling Terms: unique to Internal (implementation) model
– Hierarchical
– Network (CODASYL)
– Relational
– OODBMS
– Semantic
Physical Model: strategy for storage and access is unique to the internal model

Relationships
One-to-Many (ADVISOR, ADVISEE)
– Advisor may have many Advisees
– Advisee has but one Advisor (our choice)
One-to-One (FACULTY, OFFICE)
– Faculty is assigned to one office
– An office is assigned to one faculty
Many-to-Many (INSTRUCTOR, STUDENT)
– Instructor may have many Students
– Student may have many Instructors

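As a rough illustration of the three relationship types, the sketch below expresses them as relational tables using Python's standard `sqlite3` module. The slides do not prescribe any schema: every table name, column name, and sample row here is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
-- One-to-many: each advisee row references exactly one advisor.
CREATE TABLE advisor (advisor_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE advisee (
    advisee_id INTEGER PRIMARY KEY,
    name       TEXT,
    advisor_id INTEGER NOT NULL REFERENCES advisor(advisor_id)
);
-- One-to-one: UNIQUE on the foreign key allows one office per faculty member.
CREATE TABLE faculty (faculty_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE office (
    office_id  INTEGER PRIMARY KEY,
    faculty_id INTEGER NOT NULL UNIQUE REFERENCES faculty(faculty_id)
);
-- Many-to-many: a linking table pairs instructors with students.
CREATE TABLE instructor (instructor_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE instructor_student (
    instructor_id INTEGER NOT NULL REFERENCES instructor(instructor_id),
    student_id    INTEGER NOT NULL REFERENCES student(student_id),
    PRIMARY KEY (instructor_id, student_id)
);
""")

# One-to-many: two advisees share one advisor.
conn.execute("INSERT INTO advisor VALUES (1, 'Advisor A')")
conn.executemany("INSERT INTO advisee VALUES (?, ?, ?)",
                 [(1, "Advisee X", 1), (2, "Advisee Y", 1)])
advisees_of_1 = conn.execute(
    "SELECT COUNT(*) FROM advisee WHERE advisor_id = 1").fetchone()[0]

# One-to-one: a second office for the same faculty member is rejected.
conn.execute("INSERT INTO faculty VALUES (1, 'Faculty A')")
conn.execute("INSERT INTO office VALUES (100, 1)")
try:
    conn.execute("INSERT INTO office VALUES (101, 1)")
    second_office_allowed = True
except sqlite3.IntegrityError:
    second_office_allowed = False

# Many-to-many: an instructor has many students and a student many instructors.
conn.executemany("INSERT INTO instructor VALUES (?, ?)",
                 [(1, "Instructor A"), (2, "Instructor B")])
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(10, "Student X"), (11, "Student Y")])
conn.executemany("INSERT INTO instructor_student VALUES (?, ?)",
                 [(1, 10), (1, 11), (2, 10)])
students_of_1 = conn.execute(
    "SELECT COUNT(*) FROM instructor_student WHERE instructor_id = 1").fetchone()[0]
instructors_of_10 = conn.execute(
    "SELECT COUNT(*) FROM instructor_student WHERE student_id = 10").fetchone()[0]
print(advisees_of_1, second_office_allowed, students_of_1, instructors_of_10)
```

The linking table is one common way to represent many-to-many in a relational DBMS; other data models handle it differently.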
Data Integrity Constraints
Measures taken to ensure data is accurate
Business Constraints: rules that must be satisfied for the business
– example: a manager’s vacation days shall not exceed 20
Entity Integrity Constraint: there is an attribute of the entity that is used to uniquely identify that entity
– example: student id or ss#
Static Domain Constraint: a value for a property can only be one of the items in a predefined list
– example: faculty may only be instructor, assistant professor, associate professor, professor
Referential Integrity Constraint: in a one-to-many association, an entity on the many side must be associated with (reference) an entity on the one side
– example: in an advising relationship, a student’s advisor must be a faculty member

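One way these four constraint types might be declared is sketched below with Python's standard `sqlite3` module. The table and column names and the helper `violates` are hypothetical; only the rules themselves (20 vacation days, the four faculty ranks, advisor must be faculty) come from the examples above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
-- Entity integrity (PRIMARY KEY) and static domain (CHECK ... IN).
CREATE TABLE faculty (
    faculty_id   INTEGER PRIMARY KEY,
    faculty_rank TEXT NOT NULL CHECK (faculty_rank IN
        ('instructor', 'assistant professor', 'associate professor', 'professor'))
);
-- Business constraint: vacation days shall not exceed 20.
CREATE TABLE manager (
    manager_id    INTEGER PRIMARY KEY,
    vacation_days INTEGER NOT NULL CHECK (vacation_days <= 20)
);
-- Referential integrity: a student's advisor must be a faculty member.
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    advisor_id INTEGER NOT NULL REFERENCES faculty(faculty_id)
);
""")


def violates(sql, params=()):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


results = {
    "valid faculty": violates("INSERT INTO faculty VALUES (?, ?)", (1, "professor")),
    "duplicate id": violates("INSERT INTO faculty VALUES (?, ?)", (1, "instructor")),
    "unknown rank": violates("INSERT INTO faculty VALUES (?, ?)", (2, "dean")),
    "25 vacation days": violates("INSERT INTO manager VALUES (?, ?)", (1, 25)),
    "advisor not faculty": violates("INSERT INTO student VALUES (?, ?)", (1000, 99)),
    "valid student": violates("INSERT INTO student VALUES (?, ?)", (1000, 1)),
}
for label, rejected in results.items():
    print(label, "rejected" if rejected else "accepted")
```

Declaring the rules in the schema means the DBMS, not each application program, is responsible for enforcing them.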
Data Independence
Physical Data Independence:
– Application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representation or access methods.
Logical Data Independence:
– Application programs and terminal activities remain logically unimpaired when information-preserving changes of any kind are made to the conceptual design

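Logical data independence can be sketched with a relational view, using Python's standard `sqlite3` module. A view is only one possible mechanism and is not named in the slides; the table, view, and column names are assumptions. The application query stays the same while the underlying design is split into two tables without losing information.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE enrollment (student_id INTEGER, name TEXT, class TEXT, grade TEXT);
INSERT INTO enrollment VALUES (1000, 'doe', 'cis120', 'c'),
                              (1005, 'deer', 'cis121', 'a');
CREATE VIEW student_grades AS
    SELECT student_id, name, class, grade FROM enrollment;
""")

# The application only ever issues this query against the view.
APP_QUERY = "SELECT name, class, grade FROM student_grades ORDER BY student_id"
before = conn.execute(APP_QUERY).fetchall()

# Information-preserving change to the design: split one table into two,
# then redefine the view so it presents the same columns as before.
conn.executescript("""
DROP VIEW student_grades;
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE student_class (student_id INTEGER, class TEXT, grade TEXT);
INSERT INTO student SELECT DISTINCT student_id, name FROM enrollment;
INSERT INTO student_class SELECT student_id, class, grade FROM enrollment;
DROP TABLE enrollment;
CREATE VIEW student_grades AS
    SELECT s.student_id, s.name, sc.class, sc.grade
    FROM student s JOIN student_class sc ON sc.student_id = s.student_id;
""")

after = conn.execute(APP_QUERY).fetchall()
print(before == after)
```

The unchanged `APP_QUERY` is the point of the example: the application remains logically unimpaired by the redesign.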
Hierarchical Model
Entity: Segment
Entity Class: Segment type
Attributes: Fields
Identifier: Value-bearing field or disk address reference
Relationships
– Internal Model--1-1 and 1-many in parent-child relationship; some support for two parents for same child
– Physical Model--uses child/twin pointer strategy for 1-many
Data Access: 3GL products--COBOL, PL/I, C, Pascal
Commercial Products: IMS (DL/I) and Focus

DBMS Models - Hierarchical
Conceptual Model: STUDENT, COURSE, STUDENT-CLASS
Initial Data Collected

student id  name  class   description  grade
1000        doe   cis120  prog1        c
1005        deer  cis121  prog2        a
1000        doe   cis121  prog2        d
1010        ray   cis501  aprog1       a
1005        deer  ma120   cal0         b
1010        ray   ma120   cal0         c
1030        jay   his101  his1         b

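DBMS Models - Hierarchical
[Diagram: the Conceptual Model (STUDENT, COURSE, STUDENT-CLASS) is redrawn as a Modified Conceptual Model of Parent-Child Associations. STUDENT is the parent in the physical association, with child STUDENTCLASS (Physical); COURSE is the parent in the logical association, with child STUDENTCLASS (Logical).]
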
DBMS Models - Hierarchical
Modified Conceptual Model: Parent-Child Associations
[Diagram: record layouts for the physical association. STUDENT record: Student Data, Twin, 1st Child. STUDENTCLASS (Physical) record: Student-Class Data, Twin, Course.]

DBMS Models - Hierarchical
Modified Conceptual Model: Parent-Child Associations
[Diagram: record layouts for the logical association. COURSE record: Course Data, Twin, 1st Child. STUDENTCLASS (Logical) record: Ref to Physical, Twin.]

DBMS Models - Hierarchical
Modified Conceptual Model: Parent-Child Associations
[Diagram: instance view of the physical association. STUDENT records (Student Data, Twin, 1st Child) are shown with their STUDENTCLASS (Physical) child records (Student-Class Data, Twin, Course).]

DBMS Models - Hierarchical
Modified Conceptual Model: Parent-Child Associations
[Diagram: instance view of the logical association. COURSE records (Course Data, Twin, 1st Child) are shown with their STUDENTCLASS (Logical) child records (Ref to Physical, Twin).]

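The child/twin pointer strategy in the record layouts above (data, Twin, 1st Child) can be sketched in Python as follows. This is a simplified in-memory illustration, not how IMS or any product stores segments: the class and function names are hypothetical, object references stand in for disk addresses, and the Course reference is reduced to a course code. The rows come from the Initial Data Collected table.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class StudentClass:
    """Child segment: student-class data plus a twin pointer."""
    course: str
    grade: str
    twin: Optional["StudentClass"] = None  # next child of the same parent


@dataclass
class Student:
    """Parent segment: student data, a twin pointer, and a first-child pointer."""
    student_id: int
    name: str
    twin: Optional["Student"] = None  # next STUDENT segment
    first_child: Optional[StudentClass] = None


def add_child(parent: Student, child: StudentClass) -> None:
    """Append a child to the end of the parent's twin chain."""
    if parent.first_child is None:
        parent.first_child = child
        return
    node = parent.first_child
    while node.twin is not None:
        node = node.twin
    node.twin = child


def children(parent: Student) -> Iterator[StudentClass]:
    """Follow the first-child pointer, then the twin pointers."""
    node = parent.first_child
    while node is not None:
        yield node
        node = node.twin


doe = Student(1000, "doe")
deer = Student(1005, "deer")
doe.twin = deer  # parent segments are chained by twin pointers too
add_child(doe, StudentClass("cis120", "c"))
add_child(doe, StudentClass("cis121", "d"))
add_child(deer, StudentClass("cis121", "a"))
add_child(deer, StudentClass("ma120", "b"))

doe_classes = [(c.course, c.grade) for c in children(doe)]
print(doe_classes)
```

Reaching a child always starts at its parent, which is why the hierarchical model suits one-to-many data and requires the programmer to know the physical structure.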
Hierarchical Model
Advantages
– Common database makes sharing practical
– Security is provided and enforced
– Supports some data independence
– Referential integrity maintained through parent-child relationship
– Very efficient for data models that are hierarchical (one-to-many)
– Many hierarchical type applications are on mainframes

Hierarchical Model
Disadvantages
– Knowledge of physical level required
– Does not support logical data independence and does not support all physical data independence operations
– Not all problems are one-to-many types
– Problems with multiple parent implementation
– Problems with anomalies for parent deletion
– Application development in 3GL time-consuming
– Support programs are not part of the DBMS
– “System created by programmers for programmers!”

Network (CODASYL) Model
Entity: Record
Entity Class: Record type
Attributes: Data items
Identifier: Value-bearing field or disk address reference
Relationships
– Internal Model--1-1 and 1-many in owner-member set relationship; some provide elementary many-to-many relationships
– Physical Model--same type records: doubly-linked, ringed structure
    – owners: additional references to first & last associated member in each set
    – members: additional references to associated owner in each set
Data Access: 3GL products--COBOL, PL/I, C, Pascal
Commercial Products: DBMS-10, IDMS (Cullinet), IDS (Honeywell), TOTAL, IMAGE, MDBS-III

DBMS Models – Network (CODASYL)
Conceptual Model: STUDENT, COURSE, STUDENT-CLASS
Initial Data Collected

student id  name  class   description  grade
1000        doe   cis120  prog1        c
1005        deer  cis121  prog2        a
1000        doe   cis121  prog2        d
1010        ray   cis501  aprog1       a
1005        deer  ma120   cal0         b
1010        ray   ma120   cal0         c
1030        jay   his101  his1         b

DBMS Models – Network (CODASYL)
[Diagram: the Conceptual Model (STUDENT, COURSE, STUDENT-CLASS) is redrawn as Set Associations: Owner-Member, relating STUDENT and COURSE to STUDENT-CLASS.]

DBMS Models – Network (CODASYL)
Conceptual Model
Set1: Owner (Student), Member (Student-Class)
[Diagram: record layouts. STUDENT record: Student Data, Next, Prev, 1st Mem, Last Mem. STUDENT-CLASS record: Student-Class Data, Next, Prev, Assoc Owner.]

DBMS Models – Network (CODASYL)
Conceptual Model
Set2: Owner (Course), Member (Student-Class) Associations
[Diagram: record layouts. COURSE record: Course Data, Next, Prev, 1st Mem, Last Mem. STUDENT-CLASS record: Student-Class Data, Next, Prev, Assoc Owner.]

DBMS Models – Network (CODASYL)
[Diagram: instance view. STUDENT records (Student Data, Next, Prev, First Mem, Last Mem), COURSE records (First Mem, Last Mem), and STUDENT-CLASS records (Student-Class Data, Next, Prev, Assoc Owner) are shown linked to one another.]

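Owner-member navigation can be sketched in Python as below. This is a deliberately simplified illustration under stated assumptions: the class, field, and set names are hypothetical, object references stand in for disk addresses, and each set keeps a singly linked member chain instead of the doubly-linked ring described earlier. The rows come from the Initial Data Collected table.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, Optional


@dataclass(eq=False)
class Owner:
    """Owner record: data plus references to its first and last member."""
    data: str
    first_member: Optional["Member"] = None
    last_member: Optional["Member"] = None


@dataclass(eq=False)
class Member:
    """Member record: data plus, per set, its owner and the next member."""
    data: str
    owner: Dict[str, Owner] = field(default_factory=dict)
    next_in_set: Dict[str, Optional["Member"]] = field(default_factory=dict)


def connect(set_name: str, owner: Owner, member: Member) -> None:
    """Add a member at the end of the owner's chain for the given set."""
    member.owner[set_name] = owner
    member.next_in_set[set_name] = None
    if owner.last_member is None:
        owner.first_member = member
    else:
        owner.last_member.next_in_set[set_name] = member
    owner.last_member = member


def members(set_name: str, owner: Owner) -> Iterator[Member]:
    """Navigate from an owner through its members, one record at a time."""
    node = owner.first_member
    while node is not None:
        yield node
        node = node.next_in_set[set_name]


# Owners: students own the "student_set", courses own the "course_set".
doe, deer = Owner("1000 doe"), Owner("1005 deer")
cis120, cis121 = Owner("cis120 prog1"), Owner("cis121 prog2")

# Each STUDENT-CLASS member belongs to one occurrence of each set.
for text, student, course in [
    ("1000/cis120 grade c", doe, cis120),
    ("1005/cis121 grade a", deer, cis121),
    ("1000/cis121 grade d", doe, cis121),
]:
    record = Member(text)
    connect("student_set", student, record)
    connect("course_set", course, record)

cis121_roster = [m.owner["student_set"].data for m in members("course_set", cis121)]
print(cis121_roster)
```

Because a member can be reached from either owner, the same STUDENT-CLASS record answers both "which classes does this student take" and "which students take this class", at the cost of record-level navigation.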
Network Model
Advantages
– Can be used to directly implement one-to-one, one-to-many, and (some DBMS models) many-to-many relationships
– Access, navigation is superior to hierarchical model
– Enforces referential integrity through owner-member relationship
– Achieves some physical data independence

Network Model
Disadvantages
– Difficult to design and use
– Does not support logical data independence
– Very complex--not for the novice
– Navigation is achieved at the record level

Relational Model
Entity: Row (Tuple)
Entity Class: Table (Relation)
Attributes: Column (?dimension?)
Identifier: Value-bearing field or generated value
Relationships
– Internal Model--1-1 and 1-many relationships
– Physical Model--uses foreign key to link parent to child
Data Access
– 4GL--SQL
– 3GL products--COBOL, PL/I, C, Pascal
Commercial Products: ACCESS, ORACLE, DB2, SQL/DS, RBASE 500, INGRES, SYBASE

Relational Database Structure

Unit Table
MIS1100  Information Systems  ...
MIS1150  Business Statistics  ...

Student Table
0970000  Joe Bloe     ...
0970010  Julie King   ...
0970012  John Smith   ...
0970015  Anne Oether  ...
0970035  John Smith   ...

Unit_Student Relationship Table
MIS1100  0970000  ...
MIS1100  0970010  ...
MIS1100  0970015  ...
MIS1150  0970000  ...
MIS1150  0970012  ...
MIS1150  0970035  ...

Data relating to the relationship is stored in the relational table. Recording the semester of enrolment, marks, and the grade for each student along with the relationship places logically related data in one location.

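The three tables above can be loaded and joined with Python's standard `sqlite3` module, as in the sketch below. The column names (`unit_code`, `title`, `student_id`, `name`) are assumptions, since the slide elides the remaining columns with "..."; the row values are taken from the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE unit (unit_code TEXT PRIMARY KEY, title TEXT);
-- Student ids are stored as text so the leading zero is kept.
CREATE TABLE student (student_id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE unit_student (
    unit_code  TEXT REFERENCES unit(unit_code),
    student_id TEXT REFERENCES student(student_id),
    PRIMARY KEY (unit_code, student_id)
);
INSERT INTO unit VALUES ('MIS1100', 'Information Systems'),
                        ('MIS1150', 'Business Statistics');
INSERT INTO student VALUES ('0970000', 'Joe Bloe'), ('0970010', 'Julie King'),
                           ('0970012', 'John Smith'), ('0970015', 'Anne Oether'),
                           ('0970035', 'John Smith');
INSERT INTO unit_student VALUES ('MIS1100', '0970000'), ('MIS1100', '0970010'),
                                ('MIS1100', '0970015'), ('MIS1150', '0970000'),
                                ('MIS1150', '0970012'), ('MIS1150', '0970035');
""")

# Follow the foreign keys in the relationship table to list one unit's students.
rows = conn.execute("""
    SELECT s.name
    FROM unit_student us
    JOIN student s ON s.student_id = us.student_id
    WHERE us.unit_code = ?
    ORDER BY s.student_id
""", ("MIS1100",)).fetchall()
names = [r[0] for r in rows]
print(names)
```

The two students named John Smith remain distinct rows because the identifier, not the name, distinguishes one entity from another.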
Relational Model
Advantages
– User can focus on only the logical view
– Powerful query capabilities from 4GL--SQL
– Ad hoc query capability
– Aggregate processing as opposed to record at a time
– Standardization of language
– Creation, management, and data manipulation language
– Easier to make changes to the logical design without affecting applications (Logical Data Independence)

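The contrast between aggregate processing and record-at-a-time processing can be sketched as follows, using Python's standard `sqlite3` module and the student-class rows from the Initial Data Collected table. The table and column names are assumptions made for the example.

```python
import sqlite3
from collections import Counter

rows = [
    (1000, "cis120", "c"), (1005, "cis121", "a"), (1000, "cis121", "d"),
    (1010, "cis501", "a"), (1005, "ma120", "b"), (1010, "ma120", "c"),
    (1030, "his101", "b"),
]

# Record at a time: the program visits and counts each record itself.
per_record = Counter()
for _student_id, class_code, _grade in rows:
    per_record[class_code] += 1

# Aggregate processing: one declarative SQL statement describes the result.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student_class (student_id INTEGER, class TEXT, grade TEXT)")
conn.executemany("INSERT INTO student_class VALUES (?, ?, ?)", rows)
aggregate = dict(conn.execute(
    "SELECT class, COUNT(*) FROM student_class GROUP BY class"))

print(aggregate == dict(per_record))
```

Both approaches give the same enrollment counts; the SQL version states what is wanted and leaves the navigation to the DBMS, which is also what makes ad hoc queries practical.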
Relational Model
Disadvantages
– More powerful computers are needed because so much is done for the user
– Ease of use creates a false sense of security in the area of design