Download What is a Distributed Database System?

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Extensible Storage Engine wikipedia , lookup

Oracle Database wikipedia , lookup

Global serializability wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Ingres (database) wikipedia , lookup

Functional Database Model wikipedia , lookup

Commitment ordering wikipedia , lookup

Relational model wikipedia , lookup

Serializability wikipedia , lookup

Database model wikipedia , lookup

Open Database Connectivity wikipedia , lookup

Clusterpoint wikipedia , lookup

Database wikipedia , lookup

Concurrency control wikipedia , lookup

Transcript
28/09/16
What is a Distributed Database
System?
A distributed database (DDB) is a collection of multiple, logically
interrelated databases distributed over a computer network.
What is a distributed DBMS?
Distributed DBMS Architecture
A distributed database management system (D–DBMS) is the software
that manages the DDB and provides an access mechanism that makes this
distribution transparent to the users.
Distributed database system (DDBS) = DDB + D–DBMS
Principles of Distributed Database Systems
Özsu, M. Tamer, Valduriez, Patrick, 3rd ed. 2011
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/1
Ch.
x/1
What is not a DDBS?
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/2
Centralized DBMS on a Network
• A timesharing computer system
• A loosely or tightly coupled multiprocessor system
• A database system which resides at one of the nodes of a network of
Site 1
Site 2
computers - this is a centralized database on a network node
Site 5
Communication
Network
Site 3
Site 4
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/3
Distributed DBMS Environment
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/4
Implicit Assumptions
• Data stored at a number of sites ! each site logically consists of a single
processor.
Site 1
• Processors at different sites are interconnected by a computer network !
not a multiprocessor system
Site 2
➡  Parallel database systems
Site 5
• Distributed database is a database, not a collection of files ! data logically
Communication
Network
related as exhibited in the users’ access patterns
➡  Relational data model
• D-DBMS is a full-fledged DBMS
➡  Not remote file system, not a TP system
Site 4
Distributed DBMS
Site 3
© M. T. Özsu & P. Valduriez
Ch.1/5
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/6
1
28/09/16
Data Delivery Alternatives
Distributed DBMS Promises
• Delivery modes
" Transparent management of distributed, fragmented, and replicated data
➡  Pull-only
# Improved reliability/availability through distributed transactions
➡  Push-only
➡  Hybrid
$ Improved performance
• Frequency
% Easier and more economical system expansion
➡  Periodic
➡  Conditional
➡  Ad-hoc or irregular
• Communication Methods
➡  Unicast
➡  One-to-many
• Note: not all combinations make sense
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/7
Transparency
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/8
Ch.
x/8
Example
• Transparency is the separation of the higher level semantics of a system
from the lower level implementation issues.
• Fundamental issue is to provide
data independence
in the distributed environment
➡  Network (distribution) transparency
➡  Replication transparency
➡  Fragmentation transparency
✦ 
✦ 
✦ 
horizontal fragmentation: selection
vertical fragmentation: projection
hybrid
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/9
Ch.
x/9
© M. T. Özsu & P. Valduriez
Ch.1/10
Distributed Database - User
View
Transparent Access
SELECT ENAME,SAL
FROM
Distributed DBMS
Tokyo
EMP,ASG,PAY
WHERE DUR > 12
Paris
Boston
AND
EMP.ENO = ASG.ENO
AND
PAY.TITLE = EMP.TITLE
Communication
Network
Paris projects
Paris employees
Paris assignments
Boston employees
Distributed Database
Boston projects
Boston employees
Boston assignments
Montreal
New
York
Boston projects
New York employees
New York projects
New York assignments
Distributed DBMS
© M. T. Özsu & P. Valduriez
Montreal projects
Paris projects
New York projects
with budget > 200000
Montreal employees
Montreal assignments
Ch.1/11
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/12
2
28/09/16
Distributed DBMS - Reality
Types of Transparency
• Data independence
• Network transparency (or distribution transparency)
User
Query
DBMS
Software
DBMS
Software
DBMS
Software
User
Application
➡  Location transparency
DBMS
Software
➡  Fragmentation transparency
• Replication transparency
• Fragmentation transparency
Communication
Subsystem
User
Query
User
Application
DBMS
Software
User
Query
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/13
Distributed DBMS
© M. T. Özsu & P. Valduriez
Reliability Through Transactions
Potentially Improved
Performance
• Replicated components and data should make distributed DBMS more
• Proximity of data to its points of use
reliable.
Ch.1/14
➡  Requires some support for fragmentation and replication
• Distributed transactions provide
• Parallelism in execution
➡  Concurrency transparency
➡  Failure atomicity
➡  Inter-query parallelism
• Distributed transaction support requires implementation of
➡  Intra-query parallelism
➡  Distributed concurrency control protocols
➡  Commit protocols
• Data replication
➡  Great for read-intensive workloads, problematic for updates
➡  Replication protocols
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/15
Distributed DBMS
© M. T. Özsu & P. Valduriez
Parallelism Requirements
Distributed DBMS Issues
• Have as much of the data required by each application at the site where the
• Distributed Database Design
application executes
Ch.1/16
➡  How to distribute the database
➡  Replicated & non-replicated database distribution
➡  Full replication
➡  A related problem in directory management
• How about updates?
• Query Processing
➡  Mutual consistency
➡  Convert user transactions to data manipulation instructions
➡  Optimization problem
➡  Freshness of copies
✦ 
min{cost = data transmission + local processing}
➡  General formulation is NP-hard
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/17
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/18
3
28/09/16
Relationship Between Issues
Distributed DBMS Issues
Directory
Management
• Concurrency Control
➡  Synchronization of concurrent accesses
➡  Consistency and isolation of transactions' effects
➡  Deadlock management
Query
Processing
•  Reliability
Distribution
Design
Reliability
➡  How to make the system resilient to failures
➡  Atomicity and durability
Concurrency
Control
Deadlock
Management
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/19
Architecture
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/20
ANSI/SPARC Architecture
• Defines the structure of the system
Users
➡  components identified
➡  functions of each component defined
External
Schema
➡  interrelationships and interactions between components defined
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/22
DBMS Implementation
Alternatives
External
view
External
view
Conceptual
Schema
Conceptual
view
Internal
Schema
Internal view
Distributed DBMS
External
view
© M. T. Özsu & P. Valduriez
Ch.1/23
Dimensions of the Problem
• Distribution
➡  Whether the components of the system are located on the same machine or not
• Heterogeneity
➡  Various levels (hardware, communications, operating system)
➡  DBMS important one
✦ 
data model, query language,transaction management algorithms
• Autonomy
➡  Not well understood and most troublesome
➡  Various versions
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/24
✦ 
Design autonomy: Ability of a component DBMS to decide on issues related to its own
design.
✦ 
Communication autonomy: Ability of a component DBMS to decide whether and how to
communicate with other DBMSs.
✦ 
Execution autonomy: Ability of a component DBMS to execute local operations in any
manner it wants to.
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/25
4
28/09/16
Datalogical Distributed DBMS
Architecture
Distributed Database Servers
ES1
External schemas
...
ES2
ESn
Global Conceptual Schema
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/26
Datalogical Multi-DBMS
Architecture
LES11
…
LCS1
GES1
GES2
LES1n
GCS
...
Local conceptual schemas
LCS1
LCS2
...
LCSn
Local internal schemas
LIS1
LIS2
...
LISn
Distributed DBMS
© M. T. Özsu & P. Valduriez
MDBS Components & Execution
Global
User
Request
GESn
LESn1
LCS2
…
LIS2
…
…
Local
User
Request
LESnm
Distributed DBMS
Local
User
Request
Multi-DBMS
Layer
Global
Subrequest
Global
Subrequest
Global
Subrequest
LCSn
DBMS1
LIS1
Ch.1/27
DBMS2
DBMS3
LISn
© M. T. Özsu & P. Valduriez
Ch.1/28
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/29
Mediator/Wrapper Architecture
Distributed DBMS
© M. T. Özsu & P. Valduriez
Ch.1/30
5