INF 280 Database Systems
BASIC CONCEPTS
D. Christozov / G.Tuparov
Typical software application
[Diagram: Interface; Business Logic (transforming interface into data request, data processing, transforming datasets into reports/forms); Query (SQL) sent to the Database; Datasets returned.]
Basic Concepts - Topics
1. A database as a collection of related data
2. Database and Database Management System
3. Characteristics and advantages of DB
approach
4. DB users
5. DB Architecture
6. DBMS Architecture
DB as a collection of related data (1)
• Data: facts that can be recorded and that have implicit
meaning.
• Database implicit properties:
– A database represents some aspect of the real world,
sometimes called the miniworld or the Universe of
Discourse (UoD).
– A database is a logically coherent collection of data
with some inherent meaning.
– A database is designed, built, and populated with data
for a specific purpose. It has an intended group of
users and some applications in which these users are
interested.
DB as a collection of related data (2)
Basic characteristics
1. Self-Describing Nature of a Database System
2. Insulation between Programs and Data, Data
Abstraction
3. Support of Multiple Views of the Data
4. Sharing of Data and Multiuser Transaction
Processing
Basic characteristics (1)
Self-Describing Nature of a Database System
• System catalogue contains information about the
structure of each file, the type and storage format of
each data item, and various constraints on the data.
The information stored in the catalogue is called
meta-data, and it describes the structure of the
primary database.
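A minimal sketch of the self-describing property, assuming SQLite through Python's built-in sqlite3 module. SQLite keeps its catalogue in the sqlite_master table; the student table and its columns are hypothetical and serve only as an illustration.

```python
import sqlite3

# In-memory database; the "student" table is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id TEXT PRIMARY KEY, name TEXT NOT NULL, address TEXT)")

# The catalogue (meta-data) is itself queryable: sqlite_master stores
# the name and defining SQL of every table, index, and view.
catalogue = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE name = 'student'"
).fetchall()

# PRAGMA table_info lists each column's name, declared type, and constraints.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

print(catalogue)
print(columns)
```

The application never hard-codes the file structure; it can ask the DBMS for it.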
Basic characteristics (2)
Insulation between Programs and Data, and Data
Abstraction
• The characteristic that allows program-data
independence and program-operation independence is
called data abstraction.
• A DBMS provides users with a conceptual
representation of data that does not include many of
the details of how the data is stored or how the
operations are implemented. A data model (or logical
data model) is a type of data abstraction that is used to
provide this conceptual representation. The data model
hides storage and implementation details.
Basic characteristics (3)
Support of Multiple Views of the Data
• A view may be a subset of the database or it may
contain virtual data that is derived from the database
files but is not explicitly stored.
• Different categories of users need different views on
the database.
• One user may need to solve different problems with
the database, and each problem may need a different
view on the data.
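The two kinds of view mentioned above can be sketched with SQLite via Python's sqlite3 module. The grades table, the view names, and the sample rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical base table with a few illustrative rows.
conn.execute("CREATE TABLE grades (student_id TEXT, course TEXT, credits INTEGER, grade TEXT)")
conn.executemany(
    "INSERT INTO grades VALUES (?, ?, ?, ?)",
    [("000101", "COS 480", 3, "B-"), ("000101", "COS 221", 3, "B+"), ("000102", "COS 480", 3, "B+")],
)

# View 1: a subset of the stored data (one student's transcript).
conn.execute(
    "CREATE VIEW transcript_000101 AS "
    "SELECT course, grade FROM grades WHERE student_id = '000101'"
)

# View 2: virtual data derived from the stored rows but not stored itself.
conn.execute(
    "CREATE VIEW credit_totals AS "
    "SELECT student_id, SUM(credits) AS total FROM grades GROUP BY student_id"
)

transcript = conn.execute("SELECT * FROM transcript_000101 ORDER BY course").fetchall()
totals = conn.execute("SELECT * FROM credit_totals ORDER BY student_id").fetchall()
print(transcript)
print(totals)
```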
Basic characteristics (4)
Sharing of Data and Multiuser Transaction
Processing
• Multiple users may need to access the database
simultaneously.
• The DBMS must include concurrency control
software to ensure that several users trying to
update the same data do so in a controlled manner
so that the result of the updates is correct.
• On-line transaction processing (OLTP) applications.
Transaction Processing and Concurrency
Control (1)
Transaction: execution of a program that accesses and/or
changes the content of a file.
Concurrency: concurrent execution of two or more transactions.
Concurrency Control: mechanisms to avoid failures, losses, etc. in
concurrent execution of transactions.
Single- vs. Multiuser/Multitasking Systems: Time Sharing
System Log: journal (file), which holds the history of changes
to the state of a database.
D. Christozov
INF 280: Database Systems
Concurrency Control
Transaction Processing and Concurrency
Control (2)
ACID Properties of Transaction:
A  Atomicity: A transaction is either performed in its
entirety or not at all.
C  Consistency Preservation: A correct execution of a transaction takes
the database from one consistent state to
another consistent state.
I  Isolation: A transaction should not make its updates
visible to other transactions until it is
committed.
D  Durability: Once a transaction changes the DB and the
changes are committed, these changes must
never be lost because of subsequent failure.
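Atomicity can be sketched with SQLite through Python's sqlite3 module. The account table, the CHECK constraint, and the transfer helper are hypothetical; the connection's context manager is used to commit on success and roll back on error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("X", 100), ("Y", 50)])
conn.commit()

def transfer(conn, amount):
    """Move `amount` from X to Y as one transaction (all or nothing)."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = 'Y'", (amount,))
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = 'X'", (amount,))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, 30)       # both updates are applied
failed = transfer(conn, 500)  # second update violates the CHECK, so the first is undone too
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

After the failed transfer the credit to Y has been rolled back, so the database is left in the consistent state it had before the transaction started.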
Transaction Processing and Concurrency
Control (3)
Schedule: A schedule S of n transactions T1, T2, …, Tn is an
ordering of the execution of operations of the
transactions. Operations of two transactions Ti and
Tj can be interleaved.
Recoverability: Ability to recover from transaction failure.
A schedule S is recoverable if no transaction T in S
commits until all transactions T’ that have written
an item that T reads have committed.
Serializability: The concurrent execution of transactions is
equivalent to a serial execution: Serial, Non-Serial,
and Conflict Schedules.
Protocols: sets of rules to guarantee “serializability”.
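The recoverability condition can be checked mechanically. The sketch below assumes a schedule is given as a list of (transaction, operation, item) tuples with operations 'r', 'w', and 'c' (commit); this encoding and the function name are illustrative, and aborts are ignored for brevity.

```python
def is_recoverable(schedule):
    """schedule: list of (transaction, op, item); op is 'r', 'w', or 'c' (commit)."""
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = {}    # transaction -> set of transactions it has read from
    committed = set()
    for txn, op, item in schedule:
        if op == "w":
            last_writer[item] = txn
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif op == "c":
            # T may commit only after every transaction it read from has committed.
            if not reads_from.get(txn, set()) <= committed:
                return False
            committed.add(txn)
    return True

# T2 reads X written by T1 and commits first: not recoverable.
bad = [("T1", "w", "X"), ("T2", "r", "X"), ("T2", "c", None), ("T1", "c", None)]
# Same operations, but T1 commits before T2: recoverable.
good = [("T1", "w", "X"), ("T2", "r", "X"), ("T1", "c", None), ("T2", "c", None)]
print(is_recoverable(bad), is_recoverable(good))
```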
Transaction Processing and Concurrency
Control (4)
Granularity: What portion of the DB the data item represents:
field of a record, record, block, file, DB space, whole database.
Locking: prevents multiple transactions from accessing the
same item concurrently
Timestamps: uses a unique identifier for each transaction
Multiversion: the system uses multiple versions of the same
data item
Optimistic: validation and certification of transactions
Transaction Processing and Concurrency
Control (5)
Locks:
• Binary lock: two states (locked/unlocked) for each item;
• Shared/exclusive lock: three states: read-locked, write-locked, unlocked;
• Two-phase locking: all locking operations precede the first
unlock operation. First phase – expanding;
second phase – shrinking.
Basic, Conservative, Strict Two-phase locking.
• Deadlock: each of two transactions is waiting for the other to
unlock a given data item.
• Livelock: a transaction keeps waiting while other transactions continue.
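The two-phase rule is easy to test for a single transaction. In this sketch the action names read_lock, write_lock, and unlock, and the list-of-pairs encoding, are assumptions chosen for illustration.

```python
def follows_two_phase_locking(operations):
    """operations: (action, item) pairs of ONE transaction, in execution order."""
    shrinking = False
    for action, _item in operations:
        if action == "unlock":
            shrinking = True   # the shrinking phase has begun
        elif action in ("read_lock", "write_lock") and shrinking:
            return False       # acquiring a lock after an unlock breaks the rule
    return True

# All locks are acquired before the first unlock: two-phase.
two_phase = [("read_lock", "Y"), ("write_lock", "X"), ("unlock", "Y"), ("unlock", "X")]
# Y is released before X is locked: not two-phase.
not_two_phase = [("read_lock", "Y"), ("unlock", "Y"), ("write_lock", "X"), ("unlock", "X")]
print(follows_two_phase_locking(two_phase), follows_two_phase_locking(not_two_phase))
```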
Transaction Processing and Concurrency
Control (6)
Timestamps: order transactions according to their timestamps
Multiversion: keeps the old values when the item is updated
Optimistic: no checking during execution of the transaction; all
updates applied to a local copy of the data item. After
execution a validation phase is performed to check
serializability.
Transaction Processing and Concurrency
Control (7)
Testing schedules for serializability:
1. Only read_item and write_item operations are interesting
2. The algorithm is based on constructing a precedence (serialization)
graph for the schedule: a directed graph G = (N, E), where
N = {T1, T2, …, Tn} are the nodes and E = {e1, e2, …, em} are the edges.
There is one node for each transaction Ti, and
an edge ei has the form (Tj → Tk), where Tj is the starting node
and Tk the ending node: some operation in Tj appears in the schedule
BEFORE some conflicting operation in Tk.
Transaction Processing and Concurrency
Control (8)
Algorithm for testing “serializability” of a schedule S:
1. For each transaction Ti create a node in a precedence graph G.
2. If in S Tj:read_item(X) is after Ti:write_item(X), create an edge
(Ti → Tj)
3. If in S Tj:write_item(X) is after Ti:read_item(X), create an edge
(Ti → Tj)
4. If in S Tj:write_item(X) is after Ti:write_item(X), create an edge
(Ti → Tj)
The schedule S is serializable if and only if G has no cycles.
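The steps above can be sketched as follows. The schedule encoding (a list of (transaction, operation, item) tuples with 'r' for read_item and 'w' for write_item) and all function names are assumptions for illustration; cycle detection uses a plain depth-first search.

```python
def precedence_graph(schedule):
    """schedule: list of (transaction, op, item) with op 'r' or 'w'."""
    edges = set()
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            # Two operations conflict if they belong to different transactions,
            # touch the same item, and at least one of them is a write.
            if ti != tj and x == y and "w" in (op_i, op_j):
                edges.add((ti, tj))
    return edges

def has_cycle(nodes, edges):
    """Depth-first search for a cycle in the directed graph."""
    successors = {n: [b for a, b in edges if a == n] for n in nodes}
    state = {n: "new" for n in nodes}  # new -> active -> done

    def visit(n):
        state[n] = "active"
        for m in successors[n]:
            if state[m] == "active" or (state[m] == "new" and visit(m)):
                return True
        state[n] = "done"
        return False

    return any(state[n] == "new" and visit(n) for n in nodes)

def is_conflict_serializable(schedule):
    nodes = {t for t, _, _ in schedule}
    return not has_cycle(nodes, precedence_graph(schedule))

serial = [("T1", "r", "X"), ("T1", "w", "X"), ("T1", "r", "Y"), ("T1", "w", "Y"),
          ("T2", "r", "X"), ("T2", "w", "X")]
interleaved = [("T1", "r", "X"), ("T2", "r", "X"), ("T1", "w", "X"),
               ("T1", "r", "Y"), ("T2", "w", "X"), ("T1", "w", "Y")]
print(is_conflict_serializable(serial), is_conflict_serializable(interleaved))
```

The serial schedule yields the single edge T1 → T2, while the interleaved one yields edges in both directions and therefore a cycle.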
Transaction Processing and Concurrency
Control (9)
Examples: Serial Schedules
T1                 T2
Read_item(X)
X := X - N
Write_item(X)
Read_item(Y)
Y := Y + N
Write_item(Y)
                   Read_item(X)
                   X := X + M
                   Write_item(X)
Transaction Processing and Concurrency
Control (10)
Examples: Non Serial Schedules
T1                 T2
Read_item(X)
X := X - N
                   Read_item(X)
                   X := X + M
Write_item(X)
Read_item(Y)
                   Write_item(X)
Y := Y + N
Write_item(Y)
Precedence graph: T1 → T2 (on X) and T2 → T1 (on X).
Cycle: {X}
Transaction Processing and Concurrency
Control (11)
Lost update problem:
Transactions
T1                 T2
Read_item(X);      Read_item(X);
X:=X-N;            X:=X+M;
Write_item(X);     Write_item(X);
Read_item(Y);
Y:=Y+N;
Write_item(Y);
Schedule
T1                 T2
Read_item(X);
X:=X-N;
                   Read_item(X);
                   X:=X+M;
Write_item(X);
Read_item(Y);
                   Write_item(X);
Y:=Y+N;
Write_item(Y);
The two transactions access and update the same DB item simultaneously.
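To see the effect with numbers, the schedule can be replayed step by step on plain Python variables. This is only a simulation sketch; the function name and the starting values (X = 80, Y = 20, N = 5, M = 4) are assumptions.

```python
def run_lost_update(x, y, n, m):
    """Replay the interleaved schedule step by step on plain variables."""
    db = {"X": x, "Y": y}
    t1_x = db["X"]       # T1: Read_item(X)
    t1_x = t1_x - n      # T1: X := X - N
    t2_x = db["X"]       # T2: Read_item(X), still the old value
    t2_x = t2_x + m      # T2: X := X + M
    db["X"] = t1_x       # T1: Write_item(X)
    t1_y = db["Y"]       # T1: Read_item(Y)
    db["X"] = t2_x       # T2: Write_item(X) overwrites T1's update
    t1_y = t1_y + n      # T1: Y := Y + N
    db["Y"] = t1_y       # T1: Write_item(Y)
    return db

result = run_lost_update(x=80, y=20, n=5, m=4)
print(result)
```

Any serial execution would leave X = 80 - 5 + 4 = 79; the interleaving leaves X = 84, so T1's subtraction is lost.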
Transaction Processing and Concurrency
Control (11)
Dirty Read (temporary update problem):
Transactions
T1                 T2
Read_item(X);      Read_item(X);
X:=X-N;            X:=X+M;
Write_item(X);     Write_item(X);
Read_item(Y);
Y:=Y+N;
Write_item(Y);
Schedule
T1                 T2
Read_item(X);
X:=X-N;
Write_item(X);
                   Read_item(X);
                   X:=X+M;
                   Write_item(X);
Read_item(Y);
failure
One transaction updates an item and then fails before correctly updating item Y;
meanwhile another transaction uses the already updated item.
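A small simulation sketch of this schedule, with hypothetical starting values, shows why the value T2 writes is wrong once T1 has to be undone.

```python
def run_dirty_read(x, y, n, m):
    db = {"X": x, "Y": y}
    before_t1 = dict(db)   # values a log would restore if T1 fails
    t1_x = db["X"] - n     # T1: Read_item(X); X := X - N
    db["X"] = t1_x         # T1: Write_item(X), not yet committed
    t2_x = db["X"] + m     # T2: Read_item(X) sees the uncommitted value; X := X + M
    db["X"] = t2_x         # T2: Write_item(X)
    _t1_y = db["Y"]        # T1: Read_item(Y)
    # T1 fails here: Y is never updated, so T1's change to X must be undone.
    correct_x = before_t1["X"] + m  # what X should be if T1 had never run
    return db["X"], correct_x

written_x, correct_x = run_dirty_read(x=80, y=20, n=5, m=4)
print(written_x, correct_x)
```

T2's result is based on a value that, after the failure, never officially existed.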
Transaction Processing and Concurrency
Control (12)
Incorrect summary problem:
Transactions
T1                 T2
Read_item(X);      Read_item(A);
X:=X-N;            Sum := Sum+A;
Write_item(X);     Read_item(X);
Read_item(Y);      Sum := Sum+X;
Y:=Y+N;            Read_item(Y);
Write_item(Y);     Sum := Sum+Y;
Schedule
T1                 T2
                   Read_item(A);
                   Sum := Sum+A;
Read_item(X);
X:=X-N;
Write_item(X);
                   Read_item(X);
                   Sum := Sum+X;
                   Read_item(Y);
                   Sum := Sum+Y;
Read_item(Y);
Y:=Y+N;
Write_item(Y);
One transaction calculates an aggregate function
while another updates the same records.
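The same kind of replay, again with assumed starting values and an assumed initial Sum of 0, shows the summary missing exactly N.

```python
def run_incorrect_summary(a, x, y, n):
    db = {"A": a, "X": x, "Y": y}
    total = 0                 # assumed initial value of Sum
    total += db["A"]          # T2: Read_item(A); Sum := Sum + A
    db["X"] = db["X"] - n     # T1: Read_item(X); X := X - N; Write_item(X)
    total += db["X"]          # T2: Read_item(X), after T1's update
    total += db["Y"]          # T2: Read_item(Y), before T1's update
    db["Y"] = db["Y"] + n     # T1: Read_item(Y); Y := Y + N; Write_item(Y)
    return total, sum(db.values())

reported, actual = run_incorrect_summary(a=10, x=80, y=20, n=5)
print(reported, actual)
```

T1 only moves N from X to Y, so the true total never changes, yet T2 reports a sum that is N too small.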
Advantages of Using DBMS
• Controlling Redundancy (reducing)
• Preserving Data Integrity
• Restricting Unauthorized Access
• Providing Persistent Storage for Program Objects
and Data Structures (Object-Oriented DB)
• Permitting Inferencing and Actions Using Rules
• Providing Multiple User Interfaces
• Representing Complex Relationships Among Data
• Providing Backup and Recovery
Redundant Data
Id#     Name             Address   Code     Title      Cr.  Instructor  Section  Grade
000101  Ivan Ivanov      Scapto 1  COS 480  DB System  3    Christozov  A        B-
000101  Ivan Ivanov      Scapto 1  COS 221  FDS        3    Christozov  B        B+
000101  Ivan Ivanov      Scapto 1  AUB 102  Writing    3    Colman      C        D+
000102  Georgi Georgiev  Scapto 2  COS 480  DB System  3    Christozov  A        B+
000102  Georgi Georgiev  Scapto 2  AUB 102  Writing    3    Colman      C        C+
Column groups: student’s information, course information, grade information.
Integrity
Grades
Id#     Name             Address   Code     Title      Cr.  Instructor  Section  Grade
000101  Ivan Ivanov      Scapto 1  COS 480  DB System  3    Christozov  A        B-
000101  Ivan Ivanov      Scapto 1  COS 221  FDS        3    Christozov  B        B+
000101  Ivan Ivanov      Scapto 1  AUB 102  Writing    3    Colman      C        D+
000102  Georgi Georgiev  Scapto 2  COS 480  DB System  3    Christozov  A        B+
000102  Georgi Georgiev  Scapto 2  AUB 102  Writing    3    Colman      C        C+
Faculty
Family Name  Given Name  Title             Office
Bonev        Stoyan      Assoc. Professor  221
Colman       Mark        Professor         231
missing: instructor Christozov appears in Grades but has no row in Faculty.
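One way a DBMS can preserve this kind of integrity is a foreign-key constraint. The sketch below uses SQLite through Python's sqlite3 module; the table layout, column names, and the add_course helper are assumptions, and storing each course once is also one way to reduce the redundancy shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute(
    "CREATE TABLE faculty (family_name TEXT PRIMARY KEY, given_name TEXT, title TEXT, office TEXT)"
)
conn.execute(
    "CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT, credits INTEGER, "
    "instructor TEXT NOT NULL REFERENCES faculty(family_name))"
)
conn.executemany(
    "INSERT INTO faculty VALUES (?, ?, ?, ?)",
    [("Bonev", "Stoyan", "Assoc. Professor", "221"), ("Colman", "Mark", "Professor", "231")],
)
conn.commit()

def add_course(code, title, credits, instructor):
    """Return True if the row is accepted, False if it violates integrity."""
    try:
        with conn:
            conn.execute("INSERT INTO course VALUES (?, ?, ?, ?)", (code, title, credits, instructor))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = add_course("AUB 102", "Writing", 3, "Colman")
rejected = add_course("COS 480", "DB System", 3, "Christozov")  # no such faculty row
stored = conn.execute("SELECT code FROM course").fetchall()
print(accepted, rejected, stored)
```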
Actors on the Scene
• DB Administrators
• DB Designers
• End Users:
– Casual
– Naive (parametric)
– Sophisticated
– Stand-alone
DB Administrators
• The DBA is responsible for authorizing access to the
database, coordinating and monitoring its use, and
acquiring software and hardware resources as
needed.
• The DBA is accountable for problems such as security
breaches and poor system response time. In large
organizations, the DBA is assisted by a staff that
carries out these functions.
Database designers
• Database designers are responsible for identifying
the data to be stored in the database and for
choosing appropriate structures to represent and
store this data. These tasks are mostly undertaken
before the database is actually implemented and
populated with data.
• It is the responsibility of database designers to communicate
with all prospective database users in order to
understand their requirements and to create a
design that meets these requirements.
End Users
• Casual end users occasionally access the database,
but they may need different information each time.
They use a sophisticated database query language to
specify their requests and are typically middle- or
high- level managers or other occasional browsers.
• The main job function of naive or parametric end users
revolves around constantly querying and updating
the database, using standard types of queries and
updates, called canned transactions, that have
been carefully programmed and tested.
Examples: bank tellers, reservation agents, etc.
End Users (cont.)
• Sophisticated end users include engineers, scientists,
business analysts, and others who thoroughly
familiarize themselves with the facilities of the DBMS
in order to implement their own applications to meet
their complex requirements.
• Standalone users maintain personal databases by
using ready-made program packages that provide
easy-to-use menu-based or graphics-based
interfaces. An example is the user of a tax package
that stores a variety of personal financial data for tax
purposes.
Actors Behind the Scene
• DBMS system designers and implementers
design and implement the DBMS modules and
interfaces as a software package.
• Tools developers design and implement tools
- the software packages that facilitate
database modeling and design, database
system design, and improved performance.
• Operators and Maintenance personnel are
responsible for the actual running and
maintenance of the hardware and software
environment for the database system.
DB History
Database Systems:
the success story of Computer Science
• Early applications: use of File Systems
• 1960s: Hierarchical and Network DB models
• Late 1970s: Codd’s Relational Model
• Late 1980s: OODB -> R-OO DB
• 1990s: SQL standards, WWW, E-Commerce
• Spatial DB, Data Warehouses, Data Mining
DB Model
Data Model: collection of concepts that can be used to
describe the structure of a database
Structure: data types; relationships; constraints
Operation: retrieve, insert, delete, modify, user-defined operations
Behavior: dynamic
Object-Oriented Models incorporate both structure
and behavior
In “classical” models (hierarchical, network or
relational) behavior is limited to generic operations.
DB Model: Categories of Data Models
High-level – conceptual: How users perceive data.
Low-level – physical: How data is actually stored on
computer.
Representational – logical: Close to the way users
understand data, but allow direct
interpretation by given DBMS.
Database schema: Description of database model.
Most data models have certain
conventions for displaying
schemas as diagrams.
Schema, Instance, State
• In any data model, it is important to distinguish
between the description of the database and the
database itself. The description of a database is
called the database schema, which is specified
during database design and is not expected to
change frequently.
• The data in the database at a particular moment in
time is called a database state or snapshot. It is also
called the current set of occurrences or instances in
the database.
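The distinction can be sketched with SQLite via Python's sqlite3 module: the schema is read from the catalogue, the state from the table itself. The student table and the two helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id TEXT PRIMARY KEY, name TEXT)")

def schema(conn):
    """The description of the database: kept in the catalogue, rarely changes."""
    return conn.execute("SELECT sql FROM sqlite_master WHERE name = 'student'").fetchone()[0]

def state(conn):
    """The data at this moment: changes with every insert, update, or delete."""
    return conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()

schema_before, state_before = schema(conn), state(conn)
conn.execute("INSERT INTO student VALUES ('000101', 'Ivan Ivanov')")
schema_after, state_after = schema(conn), state(conn)

print(schema_before == schema_after)
print(state_before, state_after)
```

The insert produces a new database state while the schema stays the same.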
DB and DBMS
• A database management system (DBMS) is a
collection of programs that enables users to
create and maintain a database.
• The DBMS is a general-purpose software
system that facilitates the processes of
– defining,
– constructing, and
– manipulating
databases for various applications.
DBMS
• DBMS supports the following categories of
languages:
– Data definition language (DDL).
– Storage definition language (SDL)
– View definition language (VDL)
– Data manipulation language (DML), including
querying language
• Note: In current DBMSs, these types of
languages are not considered distinct
languages.
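As a rough illustration of the note above, a single language (SQL, here run through Python's sqlite3 module) covers all four roles. The table, index, and view names are hypothetical, and treating CREATE INDEX as the storage-definition part is an assumed mapping, not a standard classification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
statements = [
    ("DDL", "CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT, credits INTEGER)"),
    ("SDL", "CREATE INDEX course_title_idx ON course (title)"),  # storage-level choice (assumed mapping)
    ("VDL", "CREATE VIEW course_titles AS SELECT title FROM course"),
    ("DML", "INSERT INTO course VALUES ('COS 480', 'DB System', 3)"),
    ("DML", "UPDATE course SET credits = 4 WHERE code = 'COS 480'"),
]
for _category, sql in statements:
    conn.execute(sql)

# Querying is part of the DML as well.
rows = conn.execute("SELECT code, title, credits FROM course").fetchall()
titles = conn.execute("SELECT title FROM course_titles").fetchall()
print(rows, titles)
```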
DBMS Components
[Diagram: DB designers, DB administrators, sophisticated users, and naive users work with the DBMS. Inside the boundaries of the DBMS: query compiler, DML compiler, DDL interpreter, system catalogue, run-time processor, data manager, and the concurrency control, recovery, and backup subsystems. Outside the DBMS boundaries: the recorded DB.]
DBMS Architecture
• The Three-Schema Architecture
1. The internal level (internal schema), describes
the physical storage structure of the database.
2. The conceptual level (conceptual schema),
describes the structure of the whole database for
a community of users.
3. The external or view level includes a number of
external schemas or user views.
• Data Independence
1. Logical data independence.
2. Physical data independence
The Three-Schema Architecture
[Diagram: Categories of Users; External Schemas; Logical Data Independence; Logical Schema; Physical Data Independence; Physical Schema; Data Files (Master Files, Indexes); Meta Data (System Catalog).]
Database System Utilities
1. Loading
2. Backup
3. File reorganization
4. Performance monitoring