FEDERATION SERVICES
We start our discussion of federation services by considering what federation means in the
context of current database technology. In a federated database, many databases contribute data and
resources to a multidatabase federation, but each participant has full local autonomy. In a loosely
coupled federated database, the schemas of the participating databases remain distinguishable,
whereas in a tightly coupled federated database, a global schema hides (to a greater or lesser extent)
schematic and semantic differences between resources (466) by mapping a single logical schema
to multiple physical schemas. In the Grid setting, federation is more general than integrating
databases and attempts to provide a uniform framework in which diverse data sources (be they
relational, file-structured, XML, etc.) can be integrated.
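As a rough illustration of the tightly coupled case, the sketch below maps one global (logical) schema onto two participant schemas. The source names, attribute names, and the `federated_select` helper are all hypothetical, chosen only for this example; a real federation layer would also handle query planning, access control, and type differences.

```python
from typing import Dict, List

# Hypothetical participant databases, each keeping its own local schema.
HOSPITAL_A = [{"pid": 1, "surname": "Ng"}, {"pid": 2, "surname": "Ortiz"}]
HOSPITAL_B = [{"patient_no": 7, "last_name": "Silva"}]

SOURCES: Dict[str, List[dict]] = {
    "hospital_a": HOSPITAL_A,
    "hospital_b": HOSPITAL_B,
}

# Global (logical) attribute -> local (physical) attribute, per source.
SCHEMA_MAP: Dict[str, Dict[str, str]] = {
    "hospital_a": {"patient_id": "pid", "family_name": "surname"},
    "hospital_b": {"patient_id": "patient_no", "family_name": "last_name"},
}


def federated_select(columns: List[str]) -> List[dict]:
    """Return rows expressed in the global schema, drawn from every source."""
    rows = []
    for source, table in SOURCES.items():
        mapping = SCHEMA_MAP[source]
        for record in table:
            # Translate each requested global column to the local column name.
            rows.append({col: record[mapping[col]] for col in columns})
    return rows


if __name__ == "__main__":
    for row in federated_select(["patient_id", "family_name"]):
        print(row)
```

The caller names only global attributes; the per-source mapping is what hides the schematic differences between participants.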
Loose federations can provide several types of transparency:
✦Location transparency- Mechanisms for accessing data are independent of the data’s location.
✦Name transparency- An application can access data without knowing its name or location; that is, it
is possible to discover it using registry queries that describe requirements in terms of data content
and operations.
✦Distribution transparency- An application can query and update data without being aware that it
comes from a set of distributed sources.
✦Replication transparency- An application can access data without being aware of replica and
caching mechanisms.
✦Ownership and cost transparency- Applications are spared from separately negotiating for access to
individual sources.
✦Heterogeneity transparency- The access mechanism is independent of the actual implementation of
the data source.
✦Schema change transparency- Data resources are allowed to rearrange their data, for example,
across different tables to meet performance requirements or to accommodate new information,
without affecting applications.
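Name transparency can be pictured as discovery through a registry: the application describes the content and operations it needs and receives something it can read from, without ever naming or locating the source. The sketch below is one possible shape for such a registry; the `Registry` class, its tag-based matching, and the sample entries are assumptions made for illustration.

```python
from typing import Callable, List, Set, Tuple

Reader = Callable[[], List[str]]


class Registry:
    """Hypothetical registry matching requirements to registered sources."""

    def __init__(self) -> None:
        # Each entry: (content tags, supported operations, reader function).
        self._entries: List[Tuple[Set[str], Set[str], Reader]] = []

    def register(self, tags: Set[str], operations: Set[str], reader: Reader) -> None:
        self._entries.append((set(tags), set(operations), reader))

    def discover(self, tags: Set[str], operations: Set[str]) -> List[Reader]:
        """Return readers whose source covers all requested tags and operations."""
        return [
            reader
            for entry_tags, entry_ops, reader in self._entries
            if tags <= entry_tags and operations <= entry_ops
        ]


if __name__ == "__main__":
    registry = Registry()
    registry.register({"climate", "2003"}, {"read"}, lambda: ["row1", "row2"])
    registry.register({"genome"}, {"read", "update"}, lambda: ["g1"])

    # The application states requirements only; no name or location appears.
    for read in registry.discover({"climate"}, {"read"}):
        print(read())
```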
Data Mediation
Data mediation services provide transparency with respect to data models. Mediation can be
as simple as renaming attributes, or as complex as providing sophisticated semantic-based mappings
from elements in one data model to one or more elements in a second data model. Combining data
mediation with the mechanisms that provide uniform access to data relieves applications from the
specifics of how to retrieve data elements of interest. A simple type of mediation involves services
that map data names, and potentially data values, from one data model to another. This approach can
be used to provide access to legacy data stored in formats that are not widely supported.
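The simple form of mediation described above, mapping data names and potentially data values from one model to another, might look like the following sketch. The legacy field names (`TEMP_F`, `STN`), the target names, and the `mediate` helper are invented for illustration and do not come from any particular system.

```python
from typing import Any, Callable, Dict, Tuple

# Rule: target attribute -> (source attribute, value converter).
Rule = Tuple[str, Callable[[Any], Any]]

# Hypothetical mapping from a legacy record layout to a current data model.
LEGACY_TO_CURRENT: Dict[str, Rule] = {
    # Rename TEMP_F and convert Fahrenheit to Celsius.
    "temperature_c": ("TEMP_F", lambda f: round((f - 32) * 5 / 9, 1)),
    # Rename STN and strip the padding used by the legacy format.
    "station": ("STN", lambda s: s.strip()),
}


def mediate(record: Dict[str, Any], rules: Dict[str, Rule]) -> Dict[str, Any]:
    """Translate one record from the source model into the target model."""
    return {
        target: convert(record[source])
        for target, (source, convert) in rules.items()
    }


if __name__ == "__main__":
    legacy_record = {"TEMP_F": 212, "STN": " KSFO "}
    print(mediate(legacy_record, LEGACY_TO_CURRENT))
```

Keeping the rules in a table rather than in code paths means a new legacy format needs only a new mapping, not a new access routine.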
Replication Services for Location Transparency
Replication services provide for replication transparency. We can decompose replication into
three different types of service:
✦Replica management services- create copies and update location services so that the location of a
replica can be identified.
✦Replica location services- serve as registries to locate where replicas exist by defining a mapping
between a data object name (logical name) and the storage services that can provide access to the
data object (physical names). With replica location services, replicas are not constrained to be
“bit-wise” copies.
✦Consistency services- manage relationships among the various replicas.
Multiple service definitions may exist for each type of service, each providing different
federation semantics. A replica management service is responsible for creating replicas and
potentially supports selection among replicas.
This framework is based on several mechanisms:
✦Consistent local state maintained in local replica catalogs (LRCs). Local catalogs maintain
mappings between arbitrary logical names for data and target names (either logical or physical)
associated with replicas of the data.
✦Collective state with relaxed consistency maintained in replica location indices (RLIs). Each RLI
contains a set of mappings from logical names to target names. A variety of index structures can be
defined with different performance characteristics, simply by varying the number of RLIs and the
amount of redundancy and partitioning among the RLIs.
✦Soft-state maintenance of RLI state. LRCs send information about their state to RLIs using
soft-state protocols, which, as discussed in Chapter 17, have many advantages in distributed systems.
State information in RLIs times out and must be periodically refreshed by soft-state updates.
✦Compression of soft-state updates. To reduce the amount of soft-state information that must be sent
and the storage requirements of RLIs, soft-state updates may be compressed.
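The interplay between LRCs and RLIs can be sketched in a few lines. This is a minimal model under stated assumptions: the class and method names are hypothetical, the index stores the name of the catalog site as its target name, time is passed in explicitly so the expiry behaviour is easy to follow, and compression of updates is left out.

```python
from typing import Dict, Iterable, Set


class LocalReplicaCatalog:
    """Consistent local state: logical name -> target names at one site."""

    def __init__(self, site: str) -> None:
        self.site = site
        self._mappings: Dict[str, Set[str]] = {}

    def add(self, logical: str, target: str) -> None:
        self._mappings.setdefault(logical, set()).add(target)

    def lookup(self, logical: str) -> Set[str]:
        return set(self._mappings.get(logical, ()))

    def soft_state_update(self) -> Set[str]:
        """Summary sent to an index: the logical names this catalog holds."""
        return set(self._mappings)


class ReplicaLocationIndex:
    """Relaxed-consistency collective state that expires unless refreshed."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        # logical name -> {catalog site: expiry time}
        self._entries: Dict[str, Dict[str, float]] = {}

    def receive(self, site: str, logical_names: Iterable[str], now: float) -> None:
        for name in logical_names:
            self._entries.setdefault(name, {})[site] = now + self.ttl

    def sites_for(self, logical: str, now: float) -> Set[str]:
        """Return catalog sites whose entry for this name has not timed out."""
        entries = self._entries.get(logical, {})
        return {site for site, expiry in entries.items() if expiry > now}


if __name__ == "__main__":
    lrc = LocalReplicaCatalog("site-a")
    lrc.add("lfn://run42", "physical://site-a/data/run42.dat")  # made-up names
    rli = ReplicaLocationIndex(ttl=30.0)

    rli.receive(lrc.site, lrc.soft_state_update(), now=0.0)
    print(rli.sites_for("lfn://run42", now=10.0))  # entry still live
    print(rli.sites_for("lfn://run42", now=31.0))  # entry timed out
```

A client would first ask an index which catalogs know a logical name, then ask those catalogs for the actual targets. Because index entries time out, a catalog that disappears simply stops being advertised, with no explicit deregistration step.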
Consistency Services
Grid consistency services allow replication of data items, possibly with some level of
consistency guaranteed among replicas. Consistency requirements for replicated data in Grid systems
vary widely. At one extreme, a Grid that provides read-only access to published data may not require
services to maintain consistency among replicas. At the other extreme, a Grid may provide strict
consistency with synchronous, transactional semantics for updating replicas, thereby supporting
traditional distributed file system or database functionality.
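The strict end of this spectrum can be illustrated with an all-or-nothing update: a write becomes visible only if every replica accepts it. The sketch below is a deliberately simplified prepare/commit scheme with hypothetical `Replica` and `strict_update` names; it ignores failures between the two phases, concurrency, and durability, all of which a real transactional service would have to address.

```python
from typing import Dict, List


class Replica:
    """Hypothetical replica holding committed data plus staged writes."""

    def __init__(self, name: str, available: bool = True) -> None:
        self.name = name
        self.available = available
        self.data: Dict[str, str] = {}
        self._staged: Dict[str, str] = {}

    def prepare(self, key: str, value: str) -> bool:
        """Stage a write; refuse if this replica cannot take part."""
        if not self.available:
            return False
        self._staged[key] = value
        return True

    def commit(self, key: str) -> None:
        self.data[key] = self._staged.pop(key)

    def abort(self, key: str) -> None:
        self._staged.pop(key, None)


def strict_update(replicas: List[Replica], key: str, value: str) -> bool:
    """Apply the write at every replica, or at none of them."""
    prepared: List[Replica] = []
    for replica in replicas:
        if replica.prepare(key, value):
            prepared.append(replica)
        else:
            # One refusal cancels the whole update; discard staged writes.
            for staged in prepared:
                staged.abort(key)
            return False
    for replica in replicas:
        replica.commit(key)
    return True


if __name__ == "__main__":
    a, b = Replica("a"), Replica("b")
    print(strict_update([a, b], "k", "v1"), a.data, b.data)
    b.available = False
    print(strict_update([a, b], "k", "v2"), a.data, b.data)
```

At the read-only extreme none of this machinery is needed: replicas of published data can be copied freely because no update ever has to be coordinated.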