Distributed Databases
What is a Distributed Database?
A database whose data is stored at multiple sites connected by a network, in order to reduce the cost or difficulty of accessing data and to improve the efficiency of the system as a whole.
The software that manages it is called a Distributed Database Management System (DDBMS).
http://www.mahipalreddy.com/dbdesign/dbarticle1.htm
Centralized Database
All queries are served by a single datacenter (at MSU).
Black lines indicate query paths.
Distributed Database
Queries are served by the nearest datacenter (black lines).
The datacenters have communication channels between them to transfer data (red lines).
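As a rough illustration of "queries are served by the nearest datacenter", here is a minimal sketch. The datacenter names and coordinates are made up for the example, and straight-line distance is only a crude stand-in for real network cost.

```python
import math

# Hypothetical datacenters as (latitude, longitude); names and values are illustrative only.
DATACENTERS = {
    "east": (40.7, -74.0),
    "central": (42.7, -84.5),
    "west": (37.8, -122.4),
}

def distance(a, b):
    """Straight-line distance between two (lat, lon) pairs (an assumed proxy for network cost)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def nearest_datacenter(client_location, datacenters=DATACENTERS):
    """Return the name of the datacenter closest to the client."""
    return min(datacenters, key=lambda name: distance(client_location, datacenters[name]))
```

A client would call nearest_datacenter with its own location and send the query to the site that is returned.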
Advantages of Distributed Databases
1. More accessible to users (a closer datacenter means a faster connection)
2. Redundancy (more uptime, because other datacenters can handle the load if a datacenter fails)
3. Ability to grow (new datacenters can be added without interrupting service)
4. Ability to change (datacenters can change technologies without impacting service)
5. Parallelism (more queries can be handled than if all were funneled to a single datacenter)
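The redundancy point can be sketched as a simple failover rule: try datacenters in order of preference and use the first one that is up. The function name and the health map are assumptions made for this example, not part of any particular DDBMS.

```python
def route_query(preferred_order, healthy):
    """Return the first datacenter in preferred_order that is marked healthy.

    preferred_order: list of datacenter names, nearest first (hypothetical).
    healthy: dict mapping datacenter name -> True if the site is reachable.
    """
    for name in preferred_order:
        if healthy.get(name, False):
            return name
    raise RuntimeError("no datacenter available")
```

If the nearest site fails, queries go to the next one in the list, so service continues at the price of a slower connection.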
Homogeneous Distributed Database
In a homogeneous distributed database, identical software (and sometimes identical hardware) is used at all sites (datacenters / servers).
Software here means the OS, the DBMS, and the schema of the data.
This is convenient because every site is well understood by everyone involved, and optimizations that help one site can be applied to the others. It is also easier to treat such a network as a single central database system.
Heterogeneous Distributed Database
In a heterogeneous distributed database, different sites use different software (i.e. OS, DBMS, schema).
Because of this, cooperation between sites, both among the people who run them and among the systems themselves, can be more difficult.
But diversity can allow site-specific optimization and independence from any particular technology.
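The distinction between the two kinds of network can be written as a small check: a set of sites is homogeneous when they all share the same OS, DBMS, and schema. The Site record and its field names are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Site:
    # Hypothetical description of one site's software stack.
    name: str
    os: str
    dbms: str
    schema_version: str

def is_homogeneous(sites):
    """True when every site runs the same OS, DBMS, and schema."""
    stacks = {(s.os, s.dbms, s.schema_version) for s in sites}
    return len(stacks) <= 1
```

Any difference in one of the three fields makes the network heterogeneous under this definition.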
Components
User Processor:
User interface handler – interprets user commands as they come in and formats the result sets when the request is answered.
Semantic data controller – checks the integrity constraints and authorizations defined on database elements.
Global query optimizer and decomposer – devises the best strategy for executing a user request at minimal cost (in terms of time, processor, and memory).
Global execution monitor – this is the transaction manager. The transaction managers at the sites participating in a query execution communicate with each other as part of execution monitoring.
Data Processor:
Local query optimizer – optimizes data access by choosing the best access path. For example, it decides which index to use to execute a given query optimally.
Local recovery manager – keeps the local database consistent. In case of failure, it is responsible for restoring a consistent database.
Run-time support processor – physically accesses the database according to the strategy chosen by the local query optimizer.
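To make the local query optimizer's job concrete, here is a minimal sketch of choosing between a full table scan and an index lookup by estimated cost. The cost formulas (rows read for a scan, tree descent plus matching rows for an index) are simplifying assumptions for illustration, not the model of any real DBMS.

```python
import math

def choose_access_path(table_rows, index_selectivity, predicate_column):
    """Pick the cheapest access path for a single-column predicate.

    index_selectivity maps an indexed column name to the estimated
    fraction of rows the predicate matches (hypothetical statistics).
    Returns (path_name, estimated_cost).
    """
    # A full scan reads every row.
    candidates = {"full_scan": float(table_rows)}
    if predicate_column in index_selectivity:
        # Assumed index cost: descend the tree, then read the matching rows.
        descent = math.log2(max(table_rows, 2))
        matched = table_rows * index_selectivity[predicate_column]
        candidates["index:" + predicate_column] = descent + matched
    best = min(candidates, key=candidates.get)
    return best, candidates[best]
```

A highly selective index wins easily on a large table, while an index that matches nearly every row loses to the plain scan.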
Components
http://www.springer.com/us/book/9781441988331
Global Execution Monitor
The global execution monitor is the primary component of a distributed database that communicates with remote databases to coordinate the processing of a query.
http://photobucket.com/images/hall%20monitor%20badge
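One way to picture this coordination is a scatter-gather loop: send the same sub-query to every site, then merge the partial results. The sketch below uses in-memory lists in place of remote databases. The site names, rows, and function names are all invented for the example, and it leaves out the transaction management a real monitor would handle.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sites, each holding a fragment of one logical table of (name, amount) rows.
SITES = {
    "site_a": [("alice", 30), ("bob", 10)],
    "site_b": [("carol", 25)],
    "site_c": [("dave", 5), ("erin", 40)],
}

def run_local(site, min_amount):
    """Stand-in for one site's data processor executing its part of the query."""
    return [row for row in SITES[site] if row[1] >= min_amount]

def run_global(min_amount):
    """Send the sub-query to every site in parallel and merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(SITES)) as pool:
        partials = list(pool.map(lambda site: run_local(site, min_amount), SITES))
    return sorted(row for part in partials for row in part)
```

Calling run_global(25) collects the rows with an amount of at least 25 from all three sites and returns them as a single sorted list.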