Introduction to Database Systems
Chapter 1
Instructor: Weichao Wang
Database Management Systems
Ramakrishnan & Gehrke
1
http://www.sigmod.org/record/issues/0606/index.html
History
• 60s: C. Bachman, GE, network data model
• Late 60s: IBM IMS, hierarchical data model
• 70: E. Codd, relational model
• 80s: SQL, IBM System R; transactions (J. Gray)
• Late 80s-90s: DB2, Oracle, Informix, Sybase
• 90s: DW, internet, distributed databases
• Now: Big Data
Turing award and Turing test?
What Is a DBMS?
• A database is a very large, integrated collection of data.
• It models a real-world enterprise:
– Entities (e.g., students, courses)
– Relationships (e.g., Madonna is taking ITCS6160)
• A Database Management System (DBMS) is a software package designed to maintain and utilize databases.
Why not just OS file systems?
• Size: the data may be much larger than your memory or hard disk.
• Query processing: remember your file read/write C programs? Now think about several terabytes of data. You need a separate program for every query.
• Consistency: multiple users access the same data at the same time.
• Recovery: after a crash, is the data actually on the hard disk?
• All of these can be implemented directly on top of the OS, but then you are just designing your own DB and DBMS.
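To make the "separate program for every query" point concrete, here is a minimal sketch in Python using the standard-library `sqlite3` module. The sample records, the `names_with_gpa_above` helper, and the in-memory `Students` table are assumptions made up for illustration; they are not part of the lecture.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a student file on disk.
RAW = "sid,name,gpa\n1,Ann,3.9\n2,Bob,2.8\n3,Cy,3.5\n"

def names_with_gpa_above(csv_text, threshold):
    """File-system style: a hand-written scan that answers exactly one query."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["name"] for row in reader if float(row["gpa"]) > threshold]

# DBMS style: load the same records once, then ask questions declaratively.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT, name TEXT, gpa REAL)")
records = [(r["sid"], r["name"], float(r["gpa"]))
           for r in csv.DictReader(io.StringIO(RAW))]
conn.executemany("INSERT INTO Students VALUES (?, ?, ?)", records)

file_answer = names_with_gpa_above(RAW, 3.0)
db_answer = [name for (name,) in conn.execute(
    "SELECT name FROM Students WHERE gpa > ? ORDER BY sid", (3.0,))]

# A new question needs only a new query string, not a new scanning program.
avg_gpa = conn.execute("SELECT AVG(gpa) FROM Students").fetchone()[0]
```

Both approaches return the same names here. The difference is that the file-based version needs new code for each new question, while the DBMS version only needs a different query.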
Why Use a DBMS?
• Data independence and efficient access.
• Reduced application development time.
• Data integrity and security.
• Uniform data administration. (Not sure about this now.)
• Concurrent access, recovery from crashes.
Why Study Databases??
• Shift from computation to information (application oriented vs. data oriented)
– at the “low end”: scramble to webspace
– at the “high end”: scientific applications
• Datasets increasing in diversity and volume.
– Digital libraries, interactive video, Human Genome project, EOS project
– ... need for DBMS exploding
• DBMS encompasses most of CS
– OS, languages, theory, AI, multimedia, logic
Data Models
• A data model is a collection of concepts for describing data.
• A schema is a description of a particular collection of data, using the given data model.
• The relational model of data is the most widely used model today.
– Main concept: relation, basically a table with rows and columns.
– Every relation has a schema, which describes the columns, or fields.
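As a small illustration of "relation" versus "schema", the sketch below creates one relation in an in-memory SQLite database and reads back its schema. It assumes Python's standard `sqlite3` module, and it borrows the `Students` columns from the university example in this deck. The inserted row is made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema: column names and types for the Students relation.
conn.execute(
    "CREATE TABLE Students "
    "(sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL)"
)

# PRAGMA table_info returns one row per column:
# (position, name, declared type, notnull, default, primary-key flag).
schema = [(row[1], row[2])
          for row in conn.execute("PRAGMA table_info(Students)")]

# The relation instance: rows that conform to the schema (hypothetical data).
conn.execute("INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
             ("53666", "Jones", "jones@cs", 18, 3.4))
rows = conn.execute("SELECT * FROM Students").fetchall()
```

Here `schema` describes the columns, while `rows` holds the actual table contents.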
Levels of Abstraction
• Many views, single conceptual (logical) schema and physical schema.
– Views describe how users see the data.
– Conceptual schema defines logical structure.
– Physical schema describes the files and indexes used.
• Schemas are defined using DDL; data is modified/queried using DML.
[Figure: View 1, View 2 and View 3 on top of the Conceptual Schema, which is on top of the Physical Schema]
Example: University Database
• Conceptual schema:
– Students(sid: string, name: string, login: string, age: integer, gpa: real)
– Courses(cid: string, cname: string, credits: integer)
– Enrolled(sid: string, cid: string, grade: string)
• Physical schema:
– Relations stored as unordered files.
– Index on first column of Students.
• External schema (view):
– Course_info(cid: string, enrollment: integer)
– Each data entry is stored only once. Views are created (computed) from the stored relations.
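The three levels of this example can be sketched with SQLite through Python's standard `sqlite3` module. This is only one possible rendering: the type mapping (string to `TEXT`, and so on), the index name `idx_students_sid`, the definition of `Course_info` as a count over `Enrolled`, and the sample enrollments are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual schema: the three relations.
    CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
    CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

    -- Physical schema: an index on the first column of Students.
    CREATE INDEX idx_students_sid ON Students(sid);

    -- External schema: a view computed from Enrolled, not stored separately.
    CREATE VIEW Course_info AS
        SELECT cid, COUNT(*) AS enrollment FROM Enrolled GROUP BY cid;
""")

# Hypothetical enrollments.
conn.executemany("INSERT INTO Enrolled VALUES (?, ?, ?)", [
    ("s1", "ITCS6160", "A"),
    ("s2", "ITCS6160", "B"),
    ("s1", "ITCS3160", "A"),
])
VIEW_QUERY = "SELECT cid, enrollment FROM Course_info ORDER BY cid"
course_info = conn.execute(VIEW_QUERY).fetchall()

# The view reflects new data immediately because it is derived on demand.
conn.execute("INSERT INTO Enrolled VALUES (?, ?, ?)", ("s3", "ITCS3160", "C"))
course_info_after = conn.execute(VIEW_QUERY).fetchall()
```

Each enrollment is stored once, in `Enrolled`. `Course_info` holds no data of its own.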
Data Independence
• Applications insulated from how data is structured and stored.
• Logical data independence: protection from changes in logical structure of data.
• Physical data independence: protection from changes in physical structure of data.
• Key is to reduce workload and overhead of end users.
• One of the most important benefits of using a DBMS!
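Data independence can be demonstrated with a small experiment, sketched here with Python's standard `sqlite3` module. The application function `honor_roll`, the table and index names, and the sample rows are hypothetical. The point is only that the application code stays the same while the physical and logical structure change underneath it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT, name TEXT, gpa REAL)")
conn.executemany("INSERT INTO Students VALUES (?, ?, ?)",
                 [("s1", "Ann", 3.9), ("s2", "Bob", 2.8)])

def honor_roll(conn):
    """Application code: knows only the name Students and its columns."""
    return conn.execute(
        "SELECT name FROM Students WHERE gpa >= 3.5 ORDER BY name"
    ).fetchall()

before = honor_roll(conn)

# Physical change: add an index. The query text does not change.
conn.execute("CREATE INDEX idx_students_gpa ON Students(gpa)")
after_index = honor_roll(conn)

# Logical change: split the data into two tables and keep "Students"
# available as a view, so the application still sees the same relation.
conn.executescript("""
    CREATE TABLE StudentNames  (sid TEXT, name TEXT);
    CREATE TABLE StudentGrades (sid TEXT, gpa REAL);
    INSERT INTO StudentNames  SELECT sid, name FROM Students;
    INSERT INTO StudentGrades SELECT sid, gpa  FROM Students;
    DROP TABLE Students;
    CREATE VIEW Students AS
        SELECT n.sid, n.name, g.gpa
        FROM StudentNames n JOIN StudentGrades g ON n.sid = g.sid;
""")
after_split = honor_roll(conn)
```

All three calls return the same answer, even though `honor_roll` was never edited.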
Structure of a DBMS
• A typical DBMS has a layered architecture.
• The figure does not show the concurrency control and recovery components.
• This is one of several possible architectures; each system has its own variations.
[Figure: layers from top to bottom: Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB. Annotation: these layers must consider concurrency control and recovery.]
Transaction Management: ACID properties
• Atomicity: all actions in the Xact happen, or none happen.
• Consistency: if each Xact is consistent, and the DB starts consistent, it ends up consistent.
• Isolation: execution of one Xact is isolated from that of other Xacts.
• Durability: if a Xact commits, its effects persist.
• The Recovery Manager guarantees Atomicity & Durability.
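Atomicity is easy to observe in practice. The sketch below uses Python's standard `sqlite3` module; the `Accounts` table, the `transfer` helper, and the balances are hypothetical. It relies on the connection's context manager, which commits when the block succeeds and rolls back when an exception escapes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts "
    "(owner TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO Accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money as one transaction: both updates happen, or neither does."""
    try:
        with conn:  # commit on success, rollback on exception
            conn.execute(
                "UPDATE Accounts SET balance = balance + ? WHERE owner = ?",
                (amount, dst))
            conn.execute(
                "UPDATE Accounts SET balance = balance - ? WHERE owner = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit is undone too.
        return False

def balances(conn):
    return dict(conn.execute("SELECT owner, balance FROM Accounts"))

failed = transfer(conn, "bob", "alice", 80)     # bob only has 50
after_failed = balances(conn)
succeeded = transfer(conn, "alice", "bob", 30)
after_success = balances(conn)
```

In the failed transfer, the credit to `alice` runs first and is then rolled back. No partial effect remains visible.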
Motivation of concurrency control
• Consistency
• Isolation
• Example
– Two parallel transactions T1 and T2
– Serial execution
– Execution with interleaving actions
– Similar situations arise in OS and any other competition for resources
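The example outline above can be played through with a toy simulation in plain Python. Everything here is an assumption for illustration: the single data item `A`, the amounts (T1 deposits 50, T2 withdraws 30), and the two schedules. It shows the classic "lost update" that an uncontrolled interleaving can produce.

```python
def run_schedule(schedule, initial=100):
    """Execute read/write steps in the given order; return the final value of A."""
    db = {"A": initial}
    local = {}                          # each transaction's private copy of A
    deltas = {"T1": +50, "T2": -30}     # hypothetical: deposit 50, withdraw 30
    for txn, action in schedule:
        if action == "read":
            local[txn] = db["A"]
        elif action == "write":
            db["A"] = local[txn] + deltas[txn]
    return db["A"]

# Serial execution: T1 runs to completion, then T2.
serial = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]

# Interleaved execution: both read before either writes.
interleaved = [("T1", "read"), ("T2", "read"), ("T1", "write"), ("T2", "write")]

serial_result = run_schedule(serial)
interleaved_result = run_schedule(interleaved)
```

The serial schedule ends with 120. The interleaved one ends with 70 because T2 overwrites T1's deposit with a value computed from a stale read. Concurrency control exists to rule out schedules like this one.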
Motivation of recovery management
• Atomicity:
– Transactions may abort (“Rollback”).
• Durability:
– What if DBMS stops running? (Causes?)
• Desired behavior after system restarts:
– T1, T2 & T3 should be durable.
– T4 & T5 should be aborted (effects not seen).
[Figure: timeline of transactions T1 through T5, ending at "crash!"]
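The desired restart behavior can be sketched with a toy, redo-only log in plain Python. This is a deliberately simplified illustration and not how a real recovery manager works: the log format, the keys and values, and the assumption that recovery starts from an empty database are all made up for the example.

```python
def recover(log):
    """Rebuild the database state from the log after a crash.

    Log entries are ("write", txn, key, value) or ("commit", txn).
    Only writes of transactions that committed before the crash are replayed.
    """
    committed = {entry[1] for entry in log if entry[0] == "commit"}
    db = {}
    for entry in log:
        if entry[0] == "write" and entry[1] in committed:
            _, _txn, key, value = entry
            db[key] = value
    return db

# Hypothetical log: T1, T2 and T3 commit; T4 and T5 are cut off by the crash.
log = [
    ("write", "T1", "a", 1), ("commit", "T1"),
    ("write", "T2", "b", 2),
    ("write", "T4", "a", 40),
    ("commit", "T2"),
    ("write", "T3", "c", 3), ("commit", "T3"),
    ("write", "T5", "d", 5),
    # crash!
]

state = recover(log)
```

After recovery, the effects of T1, T2 and T3 are present (durability). The writes of T4 and T5 leave no trace (atomicity).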
Databases make these folks happy ...
• End users and DBMS vendors
• DB application programmers
– E.g. smart webmasters
• Database administrator (DBA)
– Designs logical/physical schemas
– Handles security and authorization
– Data availability, crash recovery
– Database tuning as needs evolve
Must understand how a DBMS works!
New challenges
• Application oriented to data oriented
• Unstructured data
• Conflict between data and user privacy
– Data taint/trace
• Challenges caused by cloud:
– Storage places
– Index of encrypted data files
– Proof of retrievability
– Mobile: compute it locally or transmit it
Summary
• DBMS used to maintain, query large datasets.
• Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security.
• Levels of abstraction give data independence.
• A DBMS typically has a layered architecture.
• DBAs hold responsible jobs and are well-paid!
• DBMS R&D is one of the broadest, most exciting areas in CS.