DATABASE SYSTEMS -- Course Outline







Preliminaries
Introduction - Architectures
Data Modeling and Database Design
The Entity-Relationship (E-R) Model
The Relational Model and Systems - SQL
Physical Database Organization
Other Classical Data Models (Hierarchical,
DBTG-network) and Systems (IMS, Total, etc.)
Database Administration and Maintenance
YV - DBMS Introduction
1
Outline - continued

Database Internals and Operational Issues
– Integrity and Security
– Query Processing
– Concurrency Control, Recovery

Emerging Technologies and Systems
– Object-Oriented, Multidatabases, Parallel Systems
– Logic-Based, Active, Intelligent Systems
– The emergence of PC-based DBMSs
Introduction -- Definitions





DATA
Known facts that can be recorded and have an implied meaning
DATABASE
An integrated collection of related data, stored on secondary storage
MINI WORLD
Some part of the real world about which data is stored in a database
DATABASE MANAGEMENT SYSTEM (DBMS)
A collection of software modules (a generalized software package) for
creating, manipulating, and maintaining the database
DATABASE SYSTEM
The DBMS Software together with the data itself
Introduction -- Historical Perspective (1)

1950s (First Generation or File Systems on Tape)
– batch processing, cards and tapes (sequential processing)

1960s (Second generation or File Systems on Disk)
– expanded use of random-access disk technology
» database field
– early file systems
– generalized sorting packages
– beginnings of generalized software systems
– data definition incorporated into programming language
» COBOL
– development of in-house database systems
FILE PROCESSING SCENARIO
[Figure: file processing scenario. Programs P1 (Acquisition of Books), P2 (Recording of Readers), P3 (Borrowing Books), and P4 (Reminders for Late Books) each work directly on the files BOOKS, READERS, and BORROWED BOOKS.]
Introduction -- Historical Perspective (2)

1970s (Third generation or Pre-Relational)
- movement towards standardization with CODASYL
» DBTG (Data Base Task Group)
- reports in 1969, 1971, 73, 78, 81, 85, ...
- STORED data definitions AND data
- embed general access routines in a HOST language (COBOL)
- NETWORK and HIERARCHICAL SYSTEMS defined
- RELATIONAL model proposed by Codd (in theory)
- Computer Science Interest
- Clear separation between “logical” and “physical” organization
- Operational Issues examined in a more general and theoretical way
- First Relational prototype systems created (SYSTEM-R, INGRES)
- Data Models become prevalent -- 3-level architectures
DATABASE PROCESSING SCENARIO
[Figure: database processing scenario. A single integrated DATABASE is managed by the DBMS (software), which presents logical files (views) F1 to F4 to the user/group application programs P1 to P4.]
Example Database

Mini-World for the Example: Part of the University Environment INFORMATION
SOME MINI WORLD ENTITIES
Students
Courses
Instructors
Departments
SOME MINI WORLD RELATIONSHIPS
Students Take Courses
Courses have Prerequisite Courses
Instructors Teach Courses
Courses are Offered by Departments
Example Database -- Instance (1)
STUDENT
NAME    STUDNO  CLASS  MAJOR
Smith   17      1      CSC
Brown   18      2      CSC

COURSE
CNAME       CNUMB    CREDITHRS  DEPT
Intr to CS  CSC1310  4          CSC
Database    CSC3380  5          CSC
Discr Mat   MATH210  3          MATH
Data Str    CSC3320  4          CSC

PREREQUIS
CNUMB    PRERNUMB
CSC3380  CSC3320
CSC3380  MATH210
CSC3320  CSC1310

GRADES
STUDNO  CNUMB    GRADE
17      CSC3380  B
8       CSC3380  A
17      CSC1310  B+
Example Database -- Instance (2)
INSTRUCTR
NAME      DEPT  ADDRESS  OFF.NO
Sellis    CSC   X        115
Euler     MATH  Z        234
Chalkias  CSC   Y        189

DEPARTM
DNAME  BUILDING     BUDGET
CSC    Informatics  400
MATH   Edres        500
Introduction -- Historical Perspective (3)

1980s (Fourth generation or Relational)
» Relational Database Systems
- Powerful Languages and Interfaces
- Established Theory in Databases
- Set-oriented vs. Record-oriented management and processing of
data
- Database Systems integrated into large Transactional Systems
(networks, etc.)
- Appearance of object-oriented, “intelligent”, and other
models/systems
Introduction -- Historical Perspective (4)

1990s (Fifth generation or Post-Relational)
- Emergence of COMPLEX OBJECTS in databases (engineering objects,
multimedia, software objects) -- not only structured data!
»Object-Relational Database Systems
- Multidatabases, Active and Extensible Systems,
Massively Parallel
- Multimedia Database Systems
- A strong showing of PC-based DBMSs. Threat to Capture the Market?
- Web Database Systems - Servers
Database System Features and Characteristics











Controlling Redundancy by reduced duplication + Data Consistency
SHARING of data among multiple users
Enforcing Integrity Constraints
Uniform Access and Control of Data
Restricting unauthorized or malicious access to data (Security)
Centralized Control for better operation (Database Administration)
Providing Multiple Interfaces to different Classes of Users
Concurrency Control and Recovery Facilities
Potential for Enforcing Standards
Reduced Application Development Time
Economies of Scale
Simplified Picture of Database System
[Figure: the DBMS sits between the DATABASE and the VIEWS of the database (V1 to V4), which serve the USERS or APPL PROGRAMS (U1, P2, P3, U4).]
The ANSI/SPARC
3-level DBMS Architecture
[Figure: from top to bottom: USER INTERFACE; EXTERNAL SCHEMA 1 ... EXTERNAL SCHEMA n; CONCEPTUAL SCHEMA; INTERNAL SCHEMA; Database. Interfaces: External to Conceptual Schema, Conceptual to Internal Schema, Internal to Database. The DBMS is responsible for all interfaces.]
What Is a DBMS?


A database is a very large, integrated collection of data.
Models real-world enterprise.
– Entities (e.g., students, courses)
– Relationships (e.g., Madonna is taking CS564)

A Database Management System (DBMS) is a software
package designed to store and manage databases.
Why Use a DBMS?





Data independence and efficient access.
Reduced application development time.
Data integrity and security.
Uniform data administration.
Concurrent access, recovery from crashes.
Database System Features and Characteristics

Self-Contained Nature of a Database System
- A DBMS CATALOG stores the DESCRIPTION of the database (called
META-DATA). This allows the same DBMS software to work with different databases

Insulation between Programs and Data
- Called PROGRAM-DATA independence. This feature allows the data storage
structures to be changed without having to change the DBMS access
programs

Data Abstraction
- A Data Model is used to hide storage details and present the user with a
conceptual view of the database.

Support of Multiple Views of the Data
- Each view describes only the data of interest to that user
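
The catalog idea can be illustrated with a minimal sketch, assuming SQLite (through Python's standard `sqlite3` module) as the DBMS. The `student` table and the helper `describe_database` are hypothetical names chosen for illustration; `sqlite_master` and `PRAGMA table_info` are SQLite's own catalog facilities.

```python
import sqlite3

# Hypothetical in-memory database; the table exists only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (studno INTEGER, name TEXT, major TEXT)")


def describe_database(connection):
    """Return {table_name: [column names]} by reading SQLite's catalog."""
    catalog = {}
    # sqlite_master is SQLite's catalog table: one row per schema object.
    tables = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table_name,) in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        columns = connection.execute(f"PRAGMA table_info({table_name})").fetchall()
        catalog[table_name] = [column[1] for column in columns]
    return catalog


description = describe_database(conn)
print(description)
```

The helper never hard-codes a table name, so the same code would describe any SQLite database it is pointed at, which is the sense in which the meta-data makes the system self-contained.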
Levels of Abstraction

Many views, single conceptual
(logical) schema and physical
schema.
– Views describe how users see
the data.
– Conceptual schema defines
logical structure
– Physical schema describes the
files and indexes used.
[Figure: View 1, View 2, and View 3 sit on top of the Conceptual Schema, which sits on top of the Physical Schema.]
Schemas are defined using DDL; data is modified/queried using DML.
Example: University Database

Conceptual schema:
– Students(sid: string, name: string, login: string,
age: integer, gpa:real)
– Courses(cid: string, cname:string, credits:integer)
– Enrolled(sid:string, cid:string, grade:string)

Physical schema:
– Relations stored as unordered files.
– Index on first column of Students.

External Schema (View):
– Course_info(cid:string,enrollment:integer)
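
A minimal sketch of the three levels, assuming SQLite via Python's `sqlite3` module. The index name `students_sid_idx`, the sample rows, and the definition of `enrollment` as a count of `Enrolled` rows per course are assumptions made for illustration; the slide gives only the schemas.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual schema: the logical structure of the data.
    CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
    CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

    -- Physical schema choice: an index on the first column of Students.
    CREATE INDEX students_sid_idx ON Students (sid);

    -- External schema: a view (assumed here to count enrollments per course).
    CREATE VIEW Course_info AS
        SELECT cid, COUNT(*) AS enrollment FROM Enrolled GROUP BY cid;
""")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [("s1", "CS564", "A"), ("s2", "CS564", "B"), ("s1", "CS101", "B")],
)

course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid"
).fetchall()
print(course_info)
```

Dropping or adding the index changes only the physical level; queries against `Course_info` stay the same, which is the data independence the levels are meant to provide.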
The Architecture of a DBMS
[Figure: architecture of a DBMS. Labels: USER, Query, Applic., I/O Processor, Output Generator, Parser, Precompiler, Optimizer, Authorization Control, Integrity Checker, Update Processor, Query Processor, Generation of Executable Code, Dictionary Manager, Data Dictionary (Schemas), Transaction Manager, Recovery Manager, LOG, DATA MANAGER, DATABASE.]
DBMS Components versus
Database Interfaces
INTERFACES                     DBMS COMPONENTS
To the User                    I/O Processor (monitor)
External to Conceptual Level   Parser, Pre-compiler, update and Query Processor
Conceptual to Internal Level   Code Generator, Optimizer
Internal Level to Database     Transaction manager, device and storage manager
Layered Organization of DBMS Components
LAYER                       DBMS COMPONENTS
User Interface Layer        I/O Processor, I/O generator
Language Processing Layer   Parser, Pre-compiler, update and Query Processor, authorization control, optimizer
Access Method Layer         Code Generator, Optimizer
Concurrency Control Layer   Transaction manager, recovery manager
Storage Management Layer    Data Manager
Concurrency Control

Concurrent execution of user programs is essential for
good DBMS performance.
– Because disk accesses are frequent, and relatively slow, it is
important to keep the CPU humming by working on several
user programs concurrently.


Interleaving actions of different user programs can lead to
inconsistency: e.g., check is cleared while account balance
is being computed.
DBMS ensures such problems don’t arise: users can
pretend they are using a single-user system
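
The kind of inconsistency described above can be sketched with a small deterministic simulation. The two accounts, the amounts, and the step names are hypothetical: one program moves 50 from checking to savings while another computes the total balance.

```python
def run_schedule(schedule):
    """Run the named steps in order on fresh accounts; return the total that was read."""
    accounts = {"checking": 100, "savings": 100}
    seen = {}  # values observed by the balance-computing program

    def debit():
        accounts["checking"] -= 50

    def credit():
        accounts["savings"] += 50

    def read_checking():
        seen["checking"] = accounts["checking"]

    def read_savings():
        seen["savings"] = accounts["savings"]

    steps = {
        "debit": debit,
        "credit": credit,
        "read_checking": read_checking,
        "read_savings": read_savings,
    }
    for name in schedule:
        steps[name]()
    return seen["checking"] + seen["savings"]


# Serial execution: the transfer finishes before the balance is computed.
serial_total = run_schedule(["debit", "credit", "read_checking", "read_savings"])

# Interleaved execution: the balance is computed in the middle of the transfer.
interleaved_total = run_schedule(["debit", "read_checking", "read_savings", "credit"])

print(serial_total, interleaved_total)  # 200 150
```

The interleaved schedule reports 150 even though no money ever left the bank, which is exactly the anomaly a DBMS is expected to rule out.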
Transaction: An Execution of a DB Program


Key concept is transaction, which is an atomic sequence of
database actions (reads/writes).
Each transaction, executed completely, must leave the DB
in a consistent state if DB is consistent when the
transaction begins.
– Users can specify some simple integrity constraints on the
data, and the DBMS will enforce these constraints.
– Beyond this, the DBMS does not really understand the
semantics of the data. (e.g., it does not understand how the
interest on a bank account is computed).
– Thus, ensuring that a transaction (run alone) preserves
consistency is ultimately the user’s responsibility!
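
A minimal sketch of a transaction with a simple integrity constraint, assuming SQLite via Python's `sqlite3` module. The `account` table, the rule that balances may not go negative, and the `transfer` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK clause is a simple integrity constraint that the DBMS enforces.
conn.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 20)])
conn.commit()


def transfer(connection, source, target, amount):
    """Move amount from source to target as one all-or-nothing transaction."""
    try:
        # The connection context manager commits on success and
        # rolls back if an exception escapes the block.
        with connection:
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, target)
            )
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, source)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(connection):
    return dict(connection.execute("SELECT id, balance FROM account ORDER BY id"))


ok = transfer(conn, "A", "B", 30)         # succeeds
rejected = transfer(conn, "A", "B", 500)  # would overdraw A, so it is rolled back
print(ok, rejected, balances(conn))
```

The DBMS catches the overdraft only because the rule was declared; it would not notice a transfer that credits the wrong account, which is why preserving consistency remains the programmer's job.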
Scheduling Concurrent Transactions

DBMS ensures that execution of {T1, ... , Tn} is
equivalent to some serial execution T1’ ... Tn’.
– Before reading/writing an object, a transaction requests a
lock on the object, and waits till the DBMS gives it the lock.
All locks are released at the end of the transaction.
(Strict 2PL locking protocol.)
– Idea: If an action of Ti (say, writing X) affects Tj (which
perhaps reads X), one of them, say Ti, will obtain the lock on
X first and Tj is forced to wait until Ti completes; this
effectively orders the transactions.
– What if Tj already has a lock on Y and Ti later requests a
lock on Y? (Deadlock!) Ti or Tj is aborted and restarted!
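
The locking idea can be sketched as a toy, single-threaded lock manager. This is a simplification under stated assumptions: only exclusive locks are modelled (real systems also use shared locks), deadlock is detected by following a waits-for chain, and the class and method names are hypothetical.

```python
class LockManager:
    """Toy Strict 2PL lock table with exclusive locks only."""

    def __init__(self):
        self.owner = {}      # object -> transaction holding its lock
        self.waits_for = {}  # waiting transaction -> transaction it waits on

    def request(self, txn, obj):
        """Return 'granted', 'wait', or 'deadlock' for a lock request."""
        holder = self.owner.get(obj)
        if holder is None or holder == txn:
            self.owner[obj] = txn
            return "granted"
        # Follow the waits-for chain from the holder; reaching txn means a cycle.
        current = holder
        while current in self.waits_for:
            current = self.waits_for[current]
            if current == txn:
                return "deadlock"
        self.waits_for[txn] = holder
        return "wait"

    def finish(self, txn):
        """Strict 2PL: all locks are released only when the transaction ends."""
        for obj in [o for o, t in self.owner.items() if t == txn]:
            del self.owner[obj]
        self.waits_for.pop(txn, None)
        # Transactions that were waiting on txn are free to retry.
        for waiter in [w for w, t in self.waits_for.items() if t == txn]:
            del self.waits_for[waiter]


manager = LockManager()
events = [
    manager.request("Ti", "X"),  # Ti locks X
    manager.request("Tj", "Y"),  # Tj locks Y
    manager.request("Tj", "X"),  # Tj must wait for Ti
    manager.request("Ti", "Y"),  # Ti would wait for Tj: a cycle
]
manager.finish("Ti")                 # abort Ti and release its locks
retry = manager.request("Tj", "X")   # Tj can now proceed
print(events, retry)
```

Aborting the requester that closes the cycle is one possible policy; the slide only says that one of the two transactions is aborted and restarted.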
Ensuring Atomicity


DBMS ensures atomicity (all-or-nothing property) even if
system crashes in the middle of a Xact.
Idea: Keep a log (history) of all actions carried out by the
DBMS while executing a set of Xacts:
– Before a change is made to the database, the corresponding
log entry is forced to a safe location. (WAL protocol; OS
support for this is often inadequate.)
– After a crash, the effects of partially executed transactions
are undone using the log. (Thanks to WAL, if log entry wasn’t
saved before the crash, corresponding change was not
applied to database!)
The Log -- RECOVERY

The following actions are recorded in the log:
– Ti writes an object: the old value and the new value.
» Log record must go to disk before the changed page!
– Ti commits/aborts: a log record indicating this action.



Log records chained together by Xact id, so it’s easy to
undo a specific Xact (e.g., to resolve a deadlock).
Log is often duplexed and archived on “stable” storage.
All log related activities (and in fact, all CC related
activities such as lock/unlock, dealing with deadlocks etc.)
are handled transparently by the DBMS.
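
A toy sketch of logging and undo, assuming an in-memory key-value store in place of disk pages. The class `TinyDB`, its method names, and the use of `None` to mean "key did not exist" are assumptions for illustration; real recovery managers are far more involved.

```python
class TinyDB:
    """Toy store that logs old and new values before applying a write."""

    def __init__(self):
        self.log = []   # stands in for the log on stable storage
        self.data = {}  # stands in for the database pages

    def write(self, txn, key, new_value):
        old_value = self.data.get(key)  # None means the key did not exist
        # WAL: the log record is appended before the data is changed.
        self.log.append(("write", txn, key, old_value, new_value))
        self.data[key] = new_value

    def commit(self, txn):
        self.log.append(("commit", txn))

    def recover(self):
        """Undo, newest first, every write of a transaction with no commit record."""
        committed = {record[1] for record in self.log if record[0] == "commit"}
        for record in reversed(self.log):
            if record[0] == "write" and record[1] not in committed:
                _, _, key, old_value, _ = record
                if old_value is None:
                    self.data.pop(key, None)
                else:
                    self.data[key] = old_value


db = TinyDB()
db.write("T1", "A", 1)
db.commit("T1")
db.write("T2", "A", 2)  # T2 never commits: assume a crash happens here
db.write("T2", "B", 5)
db.recover()
print(db.data)  # {'A': 1}
```

Because every write record carries the transaction id and the old value, undoing one specific transaction needs nothing beyond the log itself.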
Simplified Structure of a DBMS
A typical DBMS has a layered architecture.
The figure does not show the concurrency control and recovery components.
This is one of several possible architectures; each system has its own variations.

[Figure: layers from top to bottom: Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB. Annotation: these layers must consider concurrency control and recovery.]
CLASSES OF DATABASE USERS

Database Administrators
- Responsible for managing the database (helping users to define views,
choosing alternative storage structures and access strategies, authorizing users,
validating data, backup and recovery functions, monitoring performance, etc.)

Database Designers
- Responsible for designing the database (could be the administrators)

Application Programmers / Systems Analysts
- design and implement “canned” transactions (programs) for parametric users

End-Users
- use the database for querying, updating, generating reports, etc.
CASUAL USERS (occasional), PARAMETRIC (use pre-programmed
transactions to interact -- e.g., a bank teller), and SOPHISTICATED
(use full DBMS capabilities to implement complex applications.)
OTHER IMPORTANT DATABASE USERS

DBMS Designers and Implementors
- The Systems programmers that develop the SOFTWARE PACKAGE itself

Tool Developers
- The people that design and implement tools that facilitate the use of the
DBMS software (design tools, performance tools, special interfaces, etc.)

Operators and Maintenance Personnel
- The people that work on running and maintaining the hardware and
systems software environment for the database system.
LANGUAGES ASSOCIATED WITH A DBMS (1)

Data Definition Language (DDL)
- Used to express the conceptual schema of the database -- This schema is stored
in the Data Dictionary (CATALOG). Often, a DDL is used to express also the
internal and external schemas. In some DBMSs, two separate languages are
used:
SDL - Storage Definition Language (for internal schemas)
VDL - View Definition Language (for external schemas)

Data Manipulation Language (DML)
- Used to retrieve information and modify the database (insert, delete, update)
- There are two major types of DML: Procedural DML, Declarative DML
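
The difference between the two DML styles can be sketched as follows, assuming SQLite via Python's `sqlite3` module and the COURSE rows from the example database. SQLite offers no record-at-a-time DML, so the procedural style is imitated here with a Python loop over a plain scan, which is an approximation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (cnumb TEXT, cname TEXT, credithrs INTEGER, dept TEXT)")
conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?, ?)",
    [
        ("CSC1310", "Intr to CS", 4, "CSC"),
        ("CSC3380", "Database", 5, "CSC"),
        ("MATH210", "Discr Mat", 3, "MATH"),
        ("CSC3320", "Data Str", 4, "CSC"),
    ],
)

# Procedural style: the program says HOW, visiting one record at a time.
procedural = []
for cnumb, cname, credithrs, dept in conn.execute("SELECT * FROM course"):
    if dept == "CSC" and credithrs >= 4:
        procedural.append(cnumb)
procedural.sort()

# Declarative style: the program says WHAT; the DBMS decides how to find it.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT cnumb FROM course WHERE dept = 'CSC' AND credithrs >= 4 ORDER BY cnumb"
    )
]

print(procedural == declarative, declarative)
```

Both produce the same answer, but only the declarative form leaves the DBMS free to choose an access path such as an index.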
LANGUAGES ASSOCIATED WITH A DBMS (2)

Query Language
- The subset of the DML which is used for RETRIEVAL

Data Sublanguage
- The UNION of DML and DDL

Host Language
- A programming language (COBOL, C, etc.) in which data
sublanguage statements are embedded
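
As a hedged illustration, Python can play the role of the host language and SQL the data sublanguage, assuming SQLite via the `sqlite3` module. The helper names are hypothetical and the rows are taken from the INSTRUCTR example table.

```python
import sqlite3


def open_example_db():
    """Build a small in-memory database using DDL and DML statements."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE instructor (name TEXT, dept TEXT)")  # DDL
    connection.executemany(                                               # DML
        "INSERT INTO instructor VALUES (?, ?)",
        [("Sellis", "CSC"), ("Euler", "MATH"), ("Chalkias", "CSC")],
    )
    return connection


def instructors_in(connection, dept):
    """Query-language statement embedded in the host language."""
    # The host-language variable dept is bound to the ? placeholder.
    rows = connection.execute(
        "SELECT name FROM instructor WHERE dept = ? ORDER BY name", (dept,)
    )
    return [name for (name,) in rows]


conn = open_example_db()
csc_instructors = instructors_in(conn, "CSC")
print(csc_instructors)  # ['Chalkias', 'Sellis']
```

Binding the value through a placeholder, rather than pasting it into the statement text, keeps the host-language data and the sublanguage statement cleanly separated.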
DBMS USER INTERFACES


Stand-alone Query language Interface
Programming Interfaces for embedded DML
– Pre-Compiler Approach
– Procedure Call Approach (subroutines)

Non-technical User Interfaces
– Menu-based, graphics-based, forms-based, natural language, combinations



Parametric Interfaces using Function Keys
Report Generation Language Interfaces
Interfaces for the DBA
– Creating accounts, granting authorizations
– setting system parameters
– changing schemas and access paths
DBMS UTILITIES

Functions
– Loading Data from files into the database
– Backing up the database periodically on tape
– Reorganizing database file structures
– Report Generation Utilities
– Performance Monitoring Utilities
– Other functions (sorting, user monitoring, data compression, etc.)

Data Dictionary Utilities
– For storing schema descriptions, design decisions, user information, usage
standards, application program descriptions, etc.
– Active data dictionary is accessed by the DBMS software and users
– Passive data dictionary is accessed by users only
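
Two of the utility functions listed above, loading and backup, can be sketched with Python's standard `csv` and `sqlite3` modules. The CSV text, the `load_csv` helper, and the use of a second in-memory database in place of tape are assumptions for illustration; the table name is interpolated into the SQL, so the helper is only suitable for trusted input.

```python
import csv
import io
import sqlite3

# Hypothetical input; a real utility would read a file from disk.
csv_text = "name,studno,class,major\nSmith,17,1,CSC\nBrown,18,2,CSC\n"


def load_csv(connection, table, text):
    """Create table from the CSV header and bulk-load the remaining rows."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    connection.execute(f"CREATE TABLE {table} ({', '.join(header)})")
    placeholders = ", ".join("?" for _ in header)
    with connection:  # one transaction for the whole load
        connection.executemany(f"INSERT INTO {table} VALUES ({placeholders})", reader)
    return connection.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


live = sqlite3.connect(":memory:")
loaded = load_csv(live, "student", csv_text)

# Backup: copy the live database into another database.
archive = sqlite3.connect(":memory:")
live.backup(archive)  # Connection.backup is available from Python 3.7
archived_names = [
    row[0] for row in archive.execute("SELECT name FROM student ORDER BY name")
]
print(loaded, archived_names)
```

Running the load inside one transaction means a failure part-way through leaves the table empty rather than half-filled.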
Want to Use Database Management?

Purchase a DBMS

Train the staff to use it

Define the Schemas for the database

Load the database

Write Application Programs

Continuously Evolve the Database
When NOT to Use Database Management

Main Costs of Using a DBMS
- High initial investment and likely need for additional hardware
- Overhead for providing generality, security, recovery, integrity, and
concurrency control

When a DBMS is unnecessary for your application
- The database and application are simple and very stable
- There are pressing time requirements which may not be met
because of the database system overhead
- Access to data by multiple users is not required
Database Management Systems: The OLD Actors


These systems dominated the market before the mid-eighties
Many installations STILL exist, but no new sales happen
– IMS (IBM) -- Hierarchical Model (with the language DL/1)
– I-D-S (Honeywell) -- Network DBTG (Integrated Data Store)
– IDMS (Cullinane) -- Network (Integrated Data Mgmnt System)
– TOTAL (Cincom) -- Network
– IMAGE (Hewlett-Packard) -- Network
– SYSTEM 2000 (Intel-MRI) -- Inverted (ad-hoc model)
– Other Inverted: ADABAS (Software AG), Model 204 (CCA)
– ...
Database Management Systems: The Main Actors






DB2, running on all IBM or IBM-compatible platforms
ORACLE 8
SYBASE
INFORMIX
INGRES, currently called Computer Associates-Ask Group
The Other Players:
– Rdb, Gupta Quadbase, Raima, Watcom, XDB, ...

The MPP players (massively parallel):
– Teradata (biggest), Tandem (NonStop SQL), Oracle Parallel
Server, Informix, Sybase (Navigator), DB2, DEC, ...

The Modern Players: ILLUSTRA, O2, etc.
Database Management Systems: The Main Actors (b)

The PC Giants (coming along BIGGER)
– Microsoft SQL Server
– Powersoft
– Gupta

These systems:
(a) Bring in SQL access (gateways)
(b) Suitable for Client-Server (DBMS)
(c) Look exactly like “bigger” DBMS
Database Management Systems: The Main Actors (c)

The PC-based (coming along BIGGER but still on PC)
– Paradox (Borland)
– Microsoft Access 2
– Q&A (Symantec)
– FileMaker Pro (Claris Corp.)
– DataEase Express
– Approach (Lotus)
– Alpha Four
– OLDER: xBASE, dBASE, FoxPro, MicroRIM...

Those systems are either “access packages” or they have
minimal (rudimentary) facilities of a traditional DBMS
Database Management Systems: The Main Actors (d)

Another Way to Look at them (PRICE)
– 1 - xBASE, dBASE, ...
– 2 - MS Access, Alpha, Approach, Paradox, Q&A, ...
– 3 - MS SQL Server, Gupta, Powersoft...
– 4 - Oracle, Informix, Sybase, DB2/6000, ...
– 5 - DB2, Rdb, Tandem, Teradata...
The number (1, 2, ...) implies the NUMBER of ZEROS in
the list price of the database management system
(one license)
SUMMARY






DBMS used to maintain, query large datasets.
Benefits include recovery from system crashes, concurrent
access, quick application development, data integrity and
security.
Levels of abstraction give data independence.
A DBMS typically has a layered architecture.
DBAs hold responsible jobs and are well-paid!
DBMS R&D is one of the broadest, most exciting areas in CS.