Replication
Replication is keeping copies of the same data at different sites.
Pro - increases availability (and safety) of the data
Con - increases difficulty of data modifications (consistency)
Replication can either be complete (entire database at each site)
or partial (each site only contains some of the database).
Partial replication requires partitioning to determine what data
goes where.
Partitioning
Horizontal partitioning - partitioning a table by records.
Certain rows are kept at each site. For instance, the CSE
server holds the records of students in the major, and the
MSU server holds the records of undeclared students. All sites
share the same schema, but not the same rows.
Vertical partitioning - partitioning a table by columns
(decomposition). For instance, the CSE server holds the
capstone data, and the MSU server holds the tuition data.
Hybrid partitioning - a combination of horizontal and vertical
partitioning.
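The two partitioning styles can be sketched in a few lines of Python. This is only an illustration: the student table, its column names, and the helper functions `horizontal_partition` and `vertical_partition` are assumptions made up for the example, not part of the slides.

```python
# Hypothetical student table; names and values are illustrative only.
students = [
    {"id": 1, "name": "Ana", "major": "CSE", "capstone": "Robotics", "tuition": 12000},
    {"id": 2, "name": "Ben", "major": None, "capstone": None, "tuition": 11000},
    {"id": 3, "name": "Cal", "major": "CSE", "capstone": "Compilers", "tuition": 12500},
]

def horizontal_partition(rows, predicate):
    """Split rows into (matching, non-matching); both parts keep the full schema."""
    matching = [r for r in rows if predicate(r)]
    rest = [r for r in rows if not predicate(r)]
    return matching, rest

def vertical_partition(rows, key, columns):
    """Project every row onto the key column plus the chosen columns."""
    return [{c: r[c] for c in (key, *columns)} for r in rows]

# Horizontal: the CSE site keeps majors, the MSU site keeps undeclared students.
cse_rows, msu_rows = horizontal_partition(students, lambda r: r["major"] == "CSE")

# Vertical: the CSE site keeps capstone data, the MSU site keeps tuition data.
cse_cols = vertical_partition(students, "id", ["capstone"])
msu_cols = vertical_partition(students, "id", ["tuition"])

# Joining the vertical fragments on the key recovers the combined columns.
rejoined = {r["id"]: dict(r) for r in cse_cols}
for r in msu_cols:
    rejoined[r["id"]].update(r)
```

Each vertical fragment repeats the key column (`id` here) so that the fragments can be joined back together; hybrid partitioning would apply both helpers in sequence.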
Types of Data Distributions
Complete Replication - each site has the complete
database
Partitioned - each site has a fragment of the
database, but each fragment only exists at one site.
Partial Replication - each site has a fragment of the
database, and a fragment may exist in multiple copies
across the sites.
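One way to make the three definitions concrete is to classify a placement map, that is, a record of which fragments each site stores. The function name `classify_distribution`, the site names, and the fragment names below are hypothetical, and the sketch assumes fragments are identified by simple labels.

```python
def classify_distribution(placement, all_fragments):
    """Classify a placement map (site -> set of fragments) per the definitions above."""
    all_fragments = set(all_fragments)
    # Count how many sites hold a copy of each fragment.
    copies = {
        f: sum(f in frags for frags in placement.values())
        for f in all_fragments
    }
    missing = sorted(f for f, n in copies.items() if n == 0)
    if missing:
        raise ValueError(f"fragments stored nowhere: {missing}")
    # Every site holds the whole database.
    if all(set(frags) == all_fragments for frags in placement.values()):
        return "complete replication"
    # Every fragment exists at exactly one site.
    if all(n == 1 for n in copies.values()):
        return "partitioned"
    # Otherwise at least one fragment is stored at more than one site.
    return "partial replication"

fragments = {"majors", "undeclared"}
complete = {"CSE": {"majors", "undeclared"}, "MSU": {"majors", "undeclared"}}
partitioned = {"CSE": {"majors"}, "MSU": {"undeclared"}}
partial = {"CSE": {"majors", "undeclared"}, "MSU": {"undeclared"}}
```

Calling `classify_distribution(partial, fragments)` falls through to the last branch because `undeclared` has two copies while `majors` has only one.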
Difficulty of Query Processing
Complete Replication - each site has the complete database
so query processing is easy
Partitioned - information is needed about what data is at
each site to handle queries on data not present at the local
site, so more difficult
Partial Replication - same as partitioned
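The extra work in the partitioned and partially replicated cases comes from needing a catalog that records where each fragment lives. The sketch below assumes a simple dictionary catalog and a hypothetical `plan_query` helper; a real distributed query processor would also weigh cost, load, and network distance.

```python
def plan_query(fragment, local_site, catalog):
    """Decide where to run a query on one fragment.

    catalog maps fragment name -> list of sites holding a copy (an assumed layout).
    """
    sites = catalog[fragment]
    if local_site in sites:
        # The data is present locally, so no other site is involved.
        return ("local", local_site)
    # Otherwise the query must be shipped to a site that has the data.
    return ("remote", sites[0])

# Partitioned: each fragment lives at exactly one site.
partitioned_catalog = {"majors": ["CSE"], "undeclared": ["MSU"]}

# Complete replication: every fragment is at every site, so every lookup is local.
replicated_catalog = {"majors": ["CSE", "MSU"], "undeclared": ["CSE", "MSU"]}
```

With `partitioned_catalog`, a query issued at the CSE site for `undeclared` has to go to MSU, whereas with `replicated_catalog` the same query is answered locally.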
Difficulty of Concurrency Control
Complete Replication - simultaneous reads can be allowed
at every replica, but simultaneous writes can be permitted
at only one replica. Every write must be propagated to all
locations. Moderately difficult.
Partitioned - Because the data isn't duplicated, this is as
easy as a centralized database.
Partial Replication - same as complete replication, but with
added difficulty of network communication.
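The rule "read anywhere, write at one replica, propagate everywhere" resembles what is often called a primary-copy scheme. The slides do not name a specific protocol, so the in-memory sketch below is an assumption: the class `ReplicatedValue` and its methods are hypothetical, and the propagation loop stands in for network messages to the other sites.

```python
import threading

class ReplicatedValue:
    """One data item copied to several sites, with writes funneled through a primary."""

    def __init__(self, sites, initial=None):
        self.replicas = {site: initial for site in sites}
        self.primary = sites[0]              # assumed: the first site accepts writes
        self._write_lock = threading.Lock()  # serializes writers at the primary

    def read(self, site):
        # Reads are served by whichever replica the caller is at; no lock is taken.
        return self.replicas[site]

    def write(self, site, value):
        if site != self.primary:
            raise PermissionError(f"writes must go through {self.primary}")
        with self._write_lock:
            # Propagate the new value to every replica before releasing the lock.
            for s in self.replicas:
                self.replicas[s] = value

grade = ReplicatedValue(["CSE", "MSU"], initial="B")
grade.write("CSE", "A")
```

After the write, `grade.read("MSU")` sees the value written at the CSE primary. In a real system the propagation step can fail or be delayed, which is where the added difficulty of network communication comes in.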
Difficulty of Maintaining Reliability
Complete Replication - multiple copies increase availability
because failure of any site won't interrupt service. Very low
difficulty.
Partitioned - Each datum is stored at only one site, so every
site is a potential point of failure. Extremely difficult to
maintain reliability.
Partial Replication - same as complete replication, but with
added difficulty of network communication.
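The reliability comparison can be checked by asking which fragments are still reachable after some sites fail. The helper names `available_fragments` and `lost_fragments` and the placement maps are hypothetical, chosen only to illustrate the point.

```python
def available_fragments(placement, failed_sites):
    """Return the fragments still stored on at least one surviving site."""
    surviving = [
        set(frags) for site, frags in placement.items()
        if site not in failed_sites
    ]
    return set().union(*surviving)

def lost_fragments(placement, failed_sites):
    """Return the fragments that no surviving site holds."""
    everything = set().union(*(set(f) for f in placement.values()))
    return everything - available_fragments(placement, failed_sites)

complete = {"CSE": {"capstone", "tuition"}, "MSU": {"capstone", "tuition"}}
partitioned = {"CSE": {"capstone"}, "MSU": {"tuition"}}
partial = {"CSE": {"capstone", "tuition"}, "MSU": {"tuition"}}
```

If the CSE site fails, `lost_fragments` returns an empty set for `complete`, but `{"capstone"}` for both `partitioned` and `partial`, since in the partial layout only `tuition` has a second copy.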
Which Data Distribution is best for large
data?
1. Complete Replication
2. Partitioned
3. Partial Replication
4. Depends