A Lecture Note on
DATABASE SYSTEMS
Fall, 2001
School of Computing, Soongsil University
Prof. Sang Ho Lee
[email protected]
Table of Contents
– Introduction
– ER data model, OO data model
– Relational data model
– Normalization: functional dependency, multivalued dependency
– Relational algebra
– Datalog
– Relational Query Language: SQL
– Constraints and Triggers in SQL
– SQL programming, Transaction, Authorization
– Object-oriented query language: OQL, SQL3
Introduction
DB, DBMS, DBS
• Databases
– informally a collection of related data
– In general, stored on a computer, too large to hold in main memory, and subject
to constant change
• DBMS (Database Management System)
– a collection of software that not only allows us to define, construct,
manipulate databases but also provides a number of desirable
functionalities (data independence, data sharing, recovery, security, etc.)
– a fundamental piece of system software
• DBS (Database System)
– In general, Databases + DBMS
– Database, DBMS, DBS are used interchangeably (at least in practice)
Simplified Database System
[Figure: Users/Programs submit application programs/queries to the database system. Inside the database system, the DBMS software (query processor, storage manager) accesses the databases and metadata.]
Typical Applications of DB Technology
• Airline reservation systems
• Banking systems
• Corporate data
• Stock market
• World Wide Web (in short, the web)
• Many, many more
Databases in General
• Unquestionably a fundamental course in computer science
• Constitute fundamental system software
• Play a core role in information technology
• Have millions of applications in the real world
• Strong technology demand from industry (compared with other areas
in computer science)
Recommended Textbooks
– J.D. Ullman and J. Widom, A First Course in Database Systems, Prentice Hall,
1997.
– H. Garcia-Molina, J.D. Ullman, and J. Widom, Database System Implementation,
Prentice-Hall, 2000.
– H. Korth and A. Silberschatz, Database System Concepts (third edition),
McGraw-Hill, 1997.
– R. Elmasri and S.B. Navathe, Fundamentals of Database Systems (second edition),
The Benjamin/Cummings Publishing Company, 1994.
– C.J. Date, An Introduction to Database Systems (7th edition), Addison Wesley,
2000.
Database Technology is Constantly Evolving !!!
• Technical Journals
– ACM Transactions on Database Systems (TODS) -- quarterly
– IEEE Transactions on Knowledge and Data Engineering -- quarterly
– The VLDB Journal -- quarterly
– ACM SIGMOD Record, IEEE Data Engineering Bulletin
• Major conferences
– ACM SIGMOD International Conference on Management of Data
– IEEE International Conference on Data Engineering
– International Conference on Very Large Data Bases
– ACM SIGACT-SIGMOD Symposium on Principles of Database Systems
– DASFAA, DEXA, etc.
• Trade Journals: Data Base Newsletter, Database Review, InfoDB, Database
Programming & Design, etc.
• Trade shows: Database World, DB/Expo
• A number of vendor reference manuals, technical reports, etc.
• Roughly 100,000 pages of new material published every year!
Traditional File Systems (1)
– It is possible to use a file system, which is part of an operating system (say,
Unix, Windows, …), to manage (create, update, retrieve, …) data
– A file system may support a number of primitive operations, with which
users may construct various kinds of access methods (such as sequential,
indexed sequential, hashing, …) for files stored on hard disks.
– File-based systems tend to consist of a set of different data files and different
application programs, which work independently.
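The idea of building access methods on top of primitive file operations can be sketched as follows. This is only an illustration, not taken from the lecture: the record layout (a 6-byte key plus an integer), the function names, and the use of an in-memory buffer in place of a disk file are all assumptions.

```python
import io
import struct

# Hypothetical fixed-length record: 6-byte key followed by a 4-byte integer.
RECORD = struct.Struct("=6si")

def build_file(rows):
    """Write records back to back, as a flat data file would store them."""
    buf = io.BytesIO()
    for key, value in rows:
        buf.write(RECORD.pack(key.encode("ascii"), value))
    return buf

def sequential_lookup(buf, key):
    """Sequential access: read record after record until the key matches."""
    buf.seek(0)
    while True:
        chunk = buf.read(RECORD.size)
        if len(chunk) < RECORD.size:
            return None
        raw_key, value = RECORD.unpack(chunk)
        if raw_key.decode("ascii") == key:
            return value

def build_index(buf):
    """Hashed access: map each key to the byte offset of its record."""
    index = {}
    buf.seek(0)
    offset = 0
    while True:
        chunk = buf.read(RECORD.size)
        if len(chunk) < RECORD.size:
            return index
        raw_key, _ = RECORD.unpack(chunk)
        index[raw_key.decode("ascii")] = offset
        offset += RECORD.size

def indexed_lookup(buf, index, key):
    """Seek directly to the record using the offset kept in the index."""
    if key not in index:
        return None
    buf.seek(index[key])
    _, value = RECORD.unpack(buf.read(RECORD.size))
    return value

data_file = build_file([("E00001", 100), ("E00002", 250), ("E00003", 75)])
offsets = build_index(data_file)
```

Both lookups use only `seek` and `read`. The access method itself lives in the application code, which is exactly what each application has to write for itself in a file-based system.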
Traditional File Systems (2)
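[Figure: User 1, User 2, ..., User n each work with their own application program (Application program 1, 2, ..., n). The application programs access File 1, File 2, ..., File n through the file system of the operating system.]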
Problems in Traditional File Systems
– Each user defines and implements the file needed for a specific
application
• data redundancy and inconsistency
• space is wasted
– Structure of data files is embedded in the application programs
• any change in data files requires changing all programs
– Users need to know the details of the physical data organization
• no data abstraction
• not self-describing (a file system usually does not contain a description
or definition of the data)
– No support of multiple views of data
– Data security problem
• difficult to protect data from malicious access
– Data integrity problem
• difficult to protect data from accidental loss of consistency
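A small sketch of why embedding the file structure in application programs is a problem. The record layouts and field names are hypothetical. The point is only that a program compiled against the old layout silently misreads a file written with a new one.

```python
import struct

# Layout hard-coded inside an existing application program (hypothetical).
OLD_LAYOUT = struct.Struct("=6si")    # emp_no, salary
# The file is later changed: a 4-byte department code is added before salary.
NEW_LAYOUT = struct.Struct("=6s4si")  # emp_no, dept_no, salary

def old_program_read(record_bytes):
    """Reads a record assuming the old layout is still in force."""
    emp_no, salary = OLD_LAYOUT.unpack(record_bytes[:OLD_LAYOUT.size])
    return emp_no.decode("ascii"), salary

# A record written by a program that already uses the new layout.
record = NEW_LAYOUT.pack(b"E00001", b"D001", 5000)

# The old program still runs, but interprets the department bytes as salary.
emp_no, salary = old_program_read(record)
```

The old program raises no error, so the inconsistency goes unnoticed until every program that touches the file has been changed by hand.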
Objectives of Database Systems (1)
• Controlled data redundancy
– sometimes, better to permit redundant data particularly in distributed
DBMS
• Data sharing
– allows multiple users to access the databases at the same time
– provides multiple interfaces
– supports concurrency control
• Data independence
– immunity of application programs to the underlying physical organization
and to changes in it
Objectives of Database Systems (2)
• High-level query language support
– Easy to use, hides internal data structures
– Example: SQL
– Tends to be declarative (cf. procedural languages)
– Query optimization techniques
• Security and authorization facility
– who can perform what operations on what data in what circumstances
• Enforcement of integrity constraints
– ensure that data in the databases is accurate
• Data modeling and abstraction
• Recovery facility
– databases should survive all kinds of failures that can occur
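Two of these objectives, a declarative query language and the enforcement of integrity constraints, can be illustrated with a minimal sketch. It uses SQLite through Python's standard `sqlite3` module. The `employee` table, its columns, and the rule that a salary may not be negative are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " emp_no TEXT PRIMARY KEY,"
    " salary INTEGER NOT NULL CHECK (salary >= 0))"
)
conn.execute("INSERT INTO employee VALUES ('E00001', 50000)")

# Declarative query: it states what is wanted, not how to find it.
well_paid = conn.execute(
    "SELECT emp_no FROM employee WHERE salary > 40000"
).fetchall()

# The DBMS itself rejects an update that would violate the constraint.
try:
    conn.execute("UPDATE employee SET salary = -1 WHERE emp_no = 'E00001'")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

salary_after = conn.execute(
    "SELECT salary FROM employee WHERE emp_no = 'E00001'"
).fetchone()[0]
```

The constraint is declared once in the schema, so every application that updates the table is checked in the same way.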
Actors on the Scene
– DBA (Database administrator): a chief administrator
• system monitoring and maintenance
• authorizing database accesses (database security)
• storage structure and access method definition
• schema definition and maintenance, etc.
– Database designer
• responsible for database design
• requirement analysis for particular applications
• schema definition (creation) and authorization
– Application programmer
• develop application programs for particular applications
– End users, casual users, naive users
• use QBF (Query by Form), report generators, and various canned transactions
Classifications of DBMS (1)
– The following classification is based on the data model
– Hierarchical databases
• first developed in the mid-1960s
• IBM IMS (Information Management System)
– Network databases
• CODASYL Database Task Group Report (1971)
• IDMS (Cullinet), TOTAL (Cincom)
– Relational databases
• relational data model (by E.F. Codd, 1970)
• most widely used currently
• numerous commercial systems: DB2, Oracle, Informix, SQL/DS, Sybase, Access, MS
SQL Server, etc.
– Object-oriented databases
• first appeared in the mid-1980s
• mainly intended for special applications: CAD/CAM, engineering databases, etc.
– Object-relational databases
• relational DBMS + OO DBMS
Relational Database Systems
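• Database is viewed as a set of relations (tables) and constraints
Table 구좌 (Account):
구좌번호 (account number) | 잔고 (balance) | 유형 (type)
12345 | 1,000,000 | 보통예금 (ordinary deposit)
23456 | 120,000 | 정기적금 (installment savings)
… | … | …
• Query: Retrieve the balance of the account 12345
– Select 잔고
From 구좌
Where 구좌번호 = 12345;
• Query: Retrieve the account numbers of installment savings accounts whose balance is 0 or less
– Select 구좌번호
From 구좌
Where 잔고 <= 0 AND 유형 = '정기적금';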
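The two queries can be tried out with a minimal sketch like the following, which uses SQLite via Python's `sqlite3` module. The English names `account`, `account_no`, `balance`, and `type` are stand-ins for the table and column names on the slide, and the third row is invented so that the second query returns something.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " account_no INTEGER PRIMARY KEY, balance INTEGER, type TEXT)"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [
        (12345, 1000000, "ordinary deposit"),
        (23456, 120000, "installment savings"),
        (34567, -500, "installment savings"),  # made-up row for the demo
    ],
)

# Query 1: the balance of account 12345.
balance = conn.execute(
    "SELECT balance FROM account WHERE account_no = 12345"
).fetchone()[0]

# Query 2: installment savings accounts whose balance is 0 or less.
overdrawn = conn.execute(
    "SELECT account_no FROM account"
    " WHERE balance <= 0 AND type = 'installment savings'"
).fetchall()
```

Neither query says how the rows are located. That decision is left to the query processor.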
Classifications of DBMS (2)
• Where are the databases located?
– centralized databases vs. distributed databases
– distributed databases
• homogeneous databases
• heterogeneous databases
– federated databases
– multidatabases
• What functionality is emphasized?
– Deductive databases
– Active databases
– Real-time databases
– Temporal databases, etc.
Data Model
• DBMS should support some level of data abstraction by hiding details
of internal data organization (particularly, physical data storage
structure)
• Data model
– the main tool for providing data abstraction
– conceptual tools for describing data and data relationships
– i.e. used to describe the structure of a database (data types, relationships,
constraints, etc.)
Categories of Data Models
• High-level (Conceptual) data model
– a human-oriented data model
– concepts such as entities, relationships, attributes
– represented by the entity-relationship model
• Implementation data model
– high level description of the implementation
– used most frequently in current commercial DBMSs
– Relational model, Network model, Hierarchical model, object-oriented
model
• Low-level (physical) data model
– describes how data is stored in the computer (i.e. record format, record ordering,
access path, etc.)
Entity-Relationship diagram example
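[Figure: ER diagram. The entities student and course are connected by the M:N relationship "taken". Attribute labels appearing in the figure: id, name, level, class, age, name, dept, year, hours, credit, semester.]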
Schema and Instance
– Schema
• loosely speaking, a description (definition) of data such as item name, data
type, constraints etc.
• does not change frequently
• also called "intension"
– Instance (occurrence)
• actual data (contents) that is stored in some schema
• expected to change frequently
• also called "extension of the schema"
– Note the difference between a database schema and a database instance
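The distinction can be made concrete with a short sketch, again using SQLite through Python's `sqlite3` module. The `student` table and its columns are assumptions chosen for illustration. The schema is read from the catalog, the instance from the table itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)

def schema(conn):
    """The schema (intension): column names and their declared types."""
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

def instance(conn):
    """The instance (extension): the rows currently stored."""
    return conn.execute("SELECT id, name, age FROM student ORDER BY id").fetchall()

schema_before = schema(conn)
instance_before = instance(conn)

conn.execute("INSERT INTO student VALUES (1, 'Kim', 20)")
conn.execute("INSERT INTO student VALUES (2, 'Lee', 22)")

schema_after = schema(conn)
instance_after = instance(conn)
```

After the inserts the instance has changed while the schema is exactly what it was before.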
Three-schema Architecture
(ANSI/SPARC Architecture)
• Goal
– Program-data independence (insulation of programs and data)
– Support of multiple user views
• Schema is defined at the following three levels
– External schema
• A description of a part of database in which a particular user is interested
– Conceptual schema
• Describes the structure of the whole database
• A global description of the database that hides the details of physical storage
structures
– Internal schema
• Describes the physical storage structure of databases (record type, physical
sequence of stored records, what indexes exist, access path, etc.)
Three-schema Architecture (2)
[Figure: Three levels, top to bottom. External (user) view: User 1 ... User n, each with an external schema (External schema 1 ... External schema n). External/conceptual mapping. Conceptual view: Conceptual schema. Conceptual/internal mapping. Internal view: Internal schema.]
Example of the Three Schema (3)
EXTERNAL (PL/1)
  DCL 1 EMPP,
        2 EMP# CHAR(6),
        2 SAL FIXED BIN(31);
EXTERNAL (COBOL)
  01 EMPC.
     02 EMPNO PIC X(6).
     02 DEPTNO PIC X(4).
CONCEPTUAL
  EMPLOYEE
    EMPLOYEE_NUMBER    CHAR(6)
    DEPARTMENT_NUMBER  CHAR(4)
    SALARY             NUMERIC(5)
INTERNAL
  STORED_EMP  LENGTH = 20
    PREFIX  TYPE=BYTE(6), OFFSET=0
    EMP#    TYPE=BYTE(6), OFFSET=6, INDEX=EMPX
    DEPT#   TYPE=BYTE(4), OFFSET=12
    PAY     TYPE=FULLWORD, OFFSET=16
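The mappings between the three levels of this example can be sketched in a few lines. This is a rough illustration only: the field names and offsets come from the slide, while the function names, the big-endian byte order, and the use of Python's `struct` module to play the role of the storage layer are assumptions.

```python
import struct

# Internal level: a 20-byte stored record, following the offsets on the slide.
# PREFIX 6 bytes @0, EMP# 6 bytes @6, DEPT# 4 bytes @12, PAY fullword @16.
STORED_EMP = struct.Struct(">6s6s4si")

def store(prefix, emp_no, dept_no, pay):
    """Encode one employee as the internal stored record."""
    return STORED_EMP.pack(
        prefix, emp_no.encode("ascii"), dept_no.encode("ascii"), pay
    )

def conceptual(record):
    """Conceptual/internal mapping: drop the prefix, use conceptual names."""
    _, emp, dept, pay = STORED_EMP.unpack(record)
    return {
        "EMPLOYEE_NUMBER": emp.decode("ascii"),
        "DEPARTMENT_NUMBER": dept.decode("ascii"),
        "SALARY": pay,
    }

def external_pl1(employee):
    """External/conceptual mapping for the PL/1 user (EMP#, SAL)."""
    return {"EMP#": employee["EMPLOYEE_NUMBER"], "SAL": employee["SALARY"]}

def external_cobol(employee):
    """External/conceptual mapping for the COBOL user (EMPNO, DEPTNO)."""
    return {
        "EMPNO": employee["EMPLOYEE_NUMBER"],
        "DEPTNO": employee["DEPARTMENT_NUMBER"],
    }

record = store(b"\x00" * 6, "E00001", "D001", 50000)
emp = conceptual(record)
pl1_view = external_pl1(emp)
cobol_view = external_cobol(emp)
```

Each user sees only the fields named in their own external schema, and neither sees the prefix or the byte offsets of the stored record.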
Three-schema Architecture (4)
• Serves as a reference DBMS architecture even though most DBMSs
do not support all three levels completely
• Data actually exists at the physical level only
– Needs to provide "mapping" between levels
• Most relational systems permit the definition of one external view to
be expressed in terms of other external views (i.e. an external/external
mapping), too
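An external/external mapping can be sketched in SQL as a view defined on another view. The example below runs on SQLite through Python's `sqlite3` module. The table, the view names, and the salary threshold are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_no TEXT, dept_no TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("E00001", "D001", 50000), ("E00002", "D002", 30000)],
)

# External view defined on the conceptual-level table.
conn.execute("CREATE VIEW emp_pay AS SELECT emp_no, salary FROM employee")

# External view defined on another external view (external/external mapping).
conn.execute(
    "CREATE VIEW high_pay AS SELECT emp_no FROM emp_pay WHERE salary > 40000"
)

high_paid = conn.execute("SELECT emp_no FROM high_pay").fetchall()
```

Only the `employee` table holds data. Both views are stored as definitions and are resolved through the mappings when queried.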
Data Independence
• Capacity to change the schema at one level without having to change
the schema at the next higher level
• Logical data independence
– Capacity to change the conceptual schema without having to change
external schemas
– Only the mapping between the conceptual schema and the external schemas needs to be
changed
• Physical data independence
– Capacity to change the internal schema without having to change the
conceptual (or external) schema
– Only the mapping between the conceptual schema and the internal schema needs to be
changed
• The three-schema architecture makes data independence possible
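Both kinds of data independence can be demonstrated with a small sketch, assuming SQLite via Python's `sqlite3` module and hypothetical table and view names. The application's query text is fixed in `APP_QUERY` and is never edited. Only the storage structure or the view definition changes underneath it.

```python
import sqlite3

# The application program's query against its external view; never edited.
APP_QUERY = "SELECT emp_no, salary FROM emp_view ORDER BY emp_no"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_no TEXT, dept_no TEXT, salary INTEGER)")
conn.execute("INSERT INTO employee VALUES ('E00001', 'D001', 50000)")
conn.execute("CREATE VIEW emp_view AS SELECT emp_no, salary FROM employee")
before = conn.execute(APP_QUERY).fetchall()

# Physical data independence: add an index (an internal-level change).
conn.execute("CREATE INDEX empx ON employee (emp_no)")
after_index = conn.execute(APP_QUERY).fetchall()

# Logical data independence: split the conceptual table in two, then repair
# only the external/conceptual mapping, i.e. the view definition.
conn.execute("CREATE TABLE emp_core AS SELECT emp_no, dept_no FROM employee")
conn.execute("CREATE TABLE emp_pay AS SELECT emp_no, salary FROM employee")
conn.execute("DROP VIEW emp_view")
conn.execute("DROP TABLE employee")
conn.execute(
    "CREATE VIEW emp_view AS"
    " SELECT c.emp_no, p.salary"
    " FROM emp_core c JOIN emp_pay p ON p.emp_no = c.emp_no"
)
after_split = conn.execute(APP_QUERY).fetchall()
```

The application gets the same answer in all three states, although the index and the table split changed the levels below it.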
Database Languages
• Data Definition Language (DDL)
– To define, alter, drop schemas
– e.g. create table, create schema, drop table, etc.
• Data Manipulation Language (DML)
– To retrieve, insert, delete and modify database instances
– Tends to be declarative (i.e. non-procedural)
– e.g. insert record, retrieve record, delete record, etc.
• Data Control Language (DCL)
– To control and monitor various database operations (such as authorization,
server connection, transaction processing, etc.)
– e.g. grant, revoke, commit, rollback, prepare, etc.
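A minimal sketch of the three language categories, using SQLite through Python's `sqlite3` module. The `account` table is hypothetical. SQLite has no GRANT or REVOKE, so the control statements are represented here only by commit and rollback, issued through the connection object.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define a schema object.
conn.execute(
    "CREATE TABLE account (account_no INTEGER PRIMARY KEY, balance INTEGER)"
)

# DML: insert an instance, then make it permanent (control: commit).
conn.execute("INSERT INTO account VALUES (12345, 1000)")
conn.commit()

# DML inside a transaction that is then undone (control: rollback).
conn.execute("UPDATE account SET balance = balance - 400 WHERE account_no = 12345")
in_transaction = conn.execute("SELECT balance FROM account").fetchone()[0]
conn.rollback()
after_rollback = conn.execute("SELECT balance FROM account").fetchone()[0]

# DDL again: drop the schema object.
conn.execute("DROP TABLE account")
remaining_tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
```

The update is visible inside the transaction but disappears after the rollback, while the committed insert survives.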
Trend of Modern Database Systems (1)
• Types, classes, objects
– Object-oriented paradigm
– Rich set of types, OID, encapsulation, abstract data type, inheritance
– Example
Class Account = { account#: integer;
                  balance: real;
                  owner: REF Customer; }
Deposit(a: Account, m: real) // method example
• Constraints and triggers
– An active DBMS is a DBMS that allows users to specify actions to be
taken automatically, without user intervention, when certain conditions
arise
– “ON” conditions of CODASYL
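The behavior of an active DBMS can be sketched with an SQL trigger, here in SQLite through Python's `sqlite3` module. The tables, the trigger name, and the rule (log any update that leaves a negative balance) are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_no INTEGER PRIMARY KEY, balance REAL)")
conn.execute(
    "CREATE TABLE audit_log (account_no INTEGER, old_balance REAL, new_balance REAL)"
)

# Event: update of balance. Condition: the new balance is negative.
# Action: record the change, with no user intervention.
conn.execute(
    """
    CREATE TRIGGER log_overdraft
    AFTER UPDATE OF balance ON account
    WHEN NEW.balance < 0
    BEGIN
        INSERT INTO audit_log VALUES (OLD.account_no, OLD.balance, NEW.balance);
    END
    """
)

conn.execute("INSERT INTO account VALUES (12345, 100.0)")
# Condition is false here, so nothing is logged.
conn.execute("UPDATE account SET balance = 50.0 WHERE account_no = 12345")
# Condition is true here, so the trigger fires.
conn.execute("UPDATE account SET balance = -20.0 WHERE account_no = 12345")

log_rows = conn.execute("SELECT * FROM audit_log").fetchall()
```

The application only issues the updates. The DBMS decides when the logging action runs.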
Trend of Modern Database Systems (2)
• Multimedia data
– Inclusion of multimedia data such as video, audio, image, text, etc.
– Very large size (up to terabytes, petabytes), unstructured in nature
– Poses new technical issues such as data analysis, presentation
synchronization, query optimization, data buffering, tertiary storage, …
• Data integration
– How to interoperate with legacy databases
– Multidatabases
– Data warehouses, OLAP (On-line analytical processing)
– Data mining: search for interesting and unusual patterns in data