Database Design - Lecture 16
Distributed Databases

Lecture Objectives
- Distributed Processing and Distributed Databases
- Distributed Database Management System (DDBMS)
- Distributed Database Design

Distributed Processing
- Shares the database’s logical processing among two or more physically independent sites that are connected through a network.
- Note: the data reside at only one site and are shared by the other sites (“centralized”).

Distributed Databases
- Stores a logically related database over two or more physically independent sites.
- The sites are connected by a computer network.
- Note: the database is composed of several parts known as database fragments. These fragments are located at several different sites.

Distributed Processing and Distributed Databases
- In a distributed database environment, users do not need to know the name or location of each database fragment in order to access the database; fragment location is transparent to the user.
- Distributed processing does not require a distributed database, but a distributed database requires distributed processing.
- Both distributed processing and distributed databases require a network to connect all components.

DDBMS Advantages
- Data are located near or at the “greatest demand” site, which improves performance
- Improved reliability through data replication
- Growth facilitation
- Reduced operating costs

DDBMS Disadvantages
- Complexity
- Cost
- Database design is more complex

Distributed Database Management System (DDBMS)
- Governs the storage and processing of a single logically related database over interconnected computer systems in which both data and processing functions are distributed among several sites.
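The location transparency described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the `Site` and `DistributedDatabase` classes, the `customer` table, and the fragment names are all hypothetical, and the "sites" are plain in-memory objects rather than networked machines.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One physically independent site (hypothetical, held in memory)."""
    name: str
    fragments: dict = field(default_factory=dict)  # fragment name -> list of rows


class DistributedDatabase:
    """Presents fragments stored at several sites as one logical database."""

    def __init__(self):
        # Catalog: logical table name -> list of (site, fragment name) pairs.
        self.catalog = {}

    def register(self, table, site, fragment):
        """Record that `fragment` at `site` holds part of `table`."""
        self.catalog.setdefault(table, []).append((site, fragment))

    def select(self, table, predicate=lambda row: True):
        """Return matching rows; the caller never names a site or a fragment."""
        rows = []
        for site, fragment in self.catalog[table]:
            rows.extend(r for r in site.fragments[fragment] if predicate(r))
        return rows


east = Site("east", {"customer_east": [{"id": 1, "name": "Ana", "state": "NY"}]})
west = Site("west", {"customer_west": [{"id": 2, "name": "Bo", "state": "CA"}]})

db = DistributedDatabase()
db.register("customer", east, "customer_east")
db.register("customer", west, "customer_west")

# The user asks for "customer" only; fragment names and locations stay hidden.
all_customers = db.select("customer")
ny_customers = db.select("customer", lambda r: r["state"] == "NY")
```

The caller works with the logical table name alone, which is the sense in which fragment names and locations are transparent to the user.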
A DDBMS must have at least the following functions to be classified as distributed:
- Application interface: allows interaction with the end user or application programs and with other DBMSs within the distributed database
- Validation: analyzes data requests
- Transformation: determines which data request components are distributed and which ones are local
- Query optimization: finds the best access strategy
- Mapping: determines the data location of local and remote fragments
- I/O interface: reads or writes data from or to permanent local storage
- Formatting: prepares the data for presentation to the end user or an application program
- Security: provides data privacy at both local and remote databases
- Backup and recovery: ensures the availability and recoverability of the database in case of a failure
- DB administration: allows the database administrator to maintain the databases
- Concurrency control: manages simultaneous data access and ensures data consistency across database fragments in the DDBMS
- Transaction management: ensures that the data move from one consistent state to another; this includes synchronizing transactions

A DDBMS must have at least the following components:
- Computer workstations (sites or nodes): form the network system
- Network hardware and software: components that reside in each workstation and allow all sites to interact and exchange data
- Communications media: carry data from one workstation to another
- Transaction processor (TP): software component found in each computer that requests data; it receives and processes the application’s data requests (remote and local)
- Data processor (DP): software component residing on each computer that stores and retrieves data located at the site

Distributed Database Environment
[Figure: distributed database environment]

Distributed Database Design
- Designing the relational database structure does not change: start with a top-down approach.
- However, the following must be considered as well:
  - How to partition the database into fragments
  - Which fragments to replicate
  - Where to locate those fragments and replicas
- More frequently used fragments should be stored locally.
- Fragments used by all users should be stored centrally.

Data Fragmentation
- Allows a single object to be broken into two or more segments or fragments
- Each fragment can be stored at any site on the network
- Data fragmentation information is stored in the distributed data catalog (DDC), from which it is accessed by the TP to process user requests

Types of Data Fragmentation
- Horizontal
- Vertical
- Mixed

Horizontal Fragmentation
- The division of a relation into subsets of tuples (rows)
- Each fragment is stored at a different node, and each fragment has unique rows
- Every tuple has the same attributes (columns); only the rows are fragmented
[Figure: example of horizontal fragmentation, showing the original structure and the fragmented structure, split by state]

Vertical Fragmentation
- The division of a relation into subsets of attributes (columns)
- Each subset is stored at a different node, and each fragment has unique columns, with the exception of the key column, which is common to all fragments
- Transaction issues arise here because the same record may need to be inserted into two tables (part of the record into one table and the other part into another table). If only one insert succeeds, the data end up inconsistent.
[Figure: example of vertical fragmentation, showing the original structure and the fragmented structure, split by location]

Mixed Fragmentation
- A combination of horizontal and vertical strategies
[Figure: example of mixed fragmentation]

Data Replication
- Storage of data copies at multiple sites served by a computer network
- Fragment copies can be stored at several sites to serve specific information requirements
- Can enhance data availability and response time
- Can help to reduce communication and total query costs

Replication Scenarios
- Fully replicated database: stores multiple copies of each database fragment at multiple sites; can be impractical due to the amount of overhead
- Partially replicated database: stores multiple copies of some database fragments at multiple sites; most DDBMSs are able to handle the partially replicated database well
- Unreplicated database: stores each database fragment at a single site; there are no duplicate database fragments

Data Allocation
- Deciding where to locate data
- Allocation strategies:
  - Centralized data allocation: the entire database is stored at one site
  - Partitioned data allocation: the database is divided into several disjoint parts (fragments) that are stored at several sites
  - Replicated data allocation: copies of one or more database fragments are stored at several sites
- Data distribution over a computer network is achieved through data partition, data replication, or a combination of both

How is a distributed database managed?
Distributed Data Catalog (DDC)
- Contains the description of the entire database as seen by the DBA
- Used to translate user requests into sub-queries (remote requests) that will be processed by different DPs
- The DDC is distributed and replicated at network nodes (the locations of the database fragments)

Examples of Distributed Databases
Banking
- Account data distributed at each local branch
- Loan data distributed at each local branch
- Corporate data at head office (summarized branch information)
Insurance
- Policy data with each branch
- Corporate data at head office
Retail
- Inventory data distributed at each local store
- Employee scheduling data at each store
- Corporate data at head office (summarized store information)
- Payroll data at head office
Utilities
- Utility monitoring data at each location (e.g., nuclear station monitoring of air, water, etc. at each location)
- Corporate data at head office

Distributed Database vs Client Server
- Client/Server is really an architecture that models a computerized solution based on the distribution of functions between servers and clients.
- A client requests specific services from a server, and a server provides the requested services to clients.
- Distributed processing could be one aspect of a client/server architecture; the data remain “centralized”.
- The DDBMS distributes data to different locations and could be used within a client/server architecture.

Distributed Database Design Steps:
1. Always start with a centralized view design
2. Consider horizontal fragmentation of a centralized database
3. Consider vertical fragmentation of a horizontally fragmented database
4. Re-consider the PK for all fragments of the database
5. Define data replication rules (scenarios)
6. Complete the design
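The design steps above can be walked through with a small Python sketch. Everything in it is an assumption made for illustration: the `CUSTOMER` rows, the column groups `service` and `collection`, the site names, and the round-robin placement rule are hypothetical and are not taken from the lecture.

```python
# Step 1: a centralized view of a hypothetical CUSTOMER relation.
CUSTOMER = [
    {"cus_num": 10, "name": "Acme", "state": "TN", "limit": 3500, "balance": 2700},
    {"cus_num": 11, "name": "Birch", "state": "FL", "limit": 6000, "balance": 1200},
    {"cus_num": 12, "name": "Cedar", "state": "TN", "limit": 4000, "balance": 3500},
]


def horizontal_fragments(rows, attribute):
    """Step 2: split rows by one attribute; every fragment keeps all columns."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[attribute], []).append(row)
    return fragments


def vertical_fragments(rows, key, column_groups):
    """Steps 3 and 4: split columns into groups, repeating the key in each one."""
    return {
        name: [{key: row[key], **{c: row[c] for c in cols}} for row in rows]
        for name, cols in column_groups.items()
    }


def rebuild(fragment_a, fragment_b, key):
    """Rejoin two vertical fragments on the shared key column."""
    by_key = {row[key]: row for row in fragment_b}
    return [{**row, **by_key[row[key]]} for row in fragment_a]


def allocate(fragment_names, sites, copies=1):
    """Step 5: place each fragment at `copies` sites, chosen round-robin."""
    return {
        name: [sites[(i + k) % len(sites)] for k in range(copies)]
        for i, name in enumerate(fragment_names)
    }


by_state = horizontal_fragments(CUSTOMER, "state")

# Mixed fragmentation: each horizontal fragment is then split vertically.
groups = {"service": ["name", "state"], "collection": ["limit", "balance"]}
mixed = {
    state: vertical_fragments(rows, "cus_num", groups)
    for state, rows in by_state.items()
}

# A partially replicated allocation: two copies of every fragment.
names = [f"{state}_{part}" for state in sorted(mixed) for part in groups]
plan = allocate(names, ["site_a", "site_b", "site_c"], copies=2)

# The key column makes it possible to reassemble the original rows.
tn_rows = rebuild(mixed["TN"]["service"], mixed["TN"]["collection"], "cus_num")
```

Repeating the key column in every vertical fragment is what allows `rebuild` to reassemble complete rows. It also shows why an insert that reaches only one of the two fragments leaves the data inconsistent: `rebuild` would fail on the missing key.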