COIS20026 Database Development & Management
Week 12 – Distributed Databases
Prepared by: Pramila Gupta
Updated by: Angelika Schlotzer & Satish Balmuri

Reading
- Readings for this week:
  - Study guide module 12
  - Text book readings as directed by study guide

Objectives
- Describe what is meant by a distributed database
- Describe how this differs from a decentralised database
- List the reasons for and against a distributed database
- Describe the difference between homogeneous and heterogeneous distributed databases
- Describe location transparency and local autonomy
- Explain horizontal partitioning and vertical partitioning
- Define local transaction and global transaction
- List & describe the 4 key objectives of a distributed database:
  - Location transparency
  - Replication transparency
  - Failure transparency
  - Concurrency transparency

Distributed vs decentralised
- Distributed database:
  - Appears as one database to the user
  - Users should not normally be aware of the location of any given data
- Decentralised database:
  - Does not appear as one database to the user
  - User will have to manually navigate to data at another site, and so will have to know where it is
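
The contrast between the two can be sketched in code. This is only an illustration under stated assumptions, not how any particular DBMS works: two in-memory SQLite databases stand in for two sites, and `SITES`, `CATALOG`, `decentralised_query` and `distributed_query`, along with the `Customer` and `Supplier` tables, are hypothetical names introduced here. In the decentralised case the caller must name the site; in the distributed case a catalog resolves the location on the caller's behalf.

```python
import sqlite3

# Two in-memory databases stand in for two sites (an assumption for
# illustration; real sites would be separate servers on a network).
SITES = {
    "melbourne": sqlite3.connect(":memory:"),
    "paris": sqlite3.connect(":memory:"),
}
SITES["melbourne"].execute("CREATE TABLE Customer (ID INTEGER, Name TEXT)")
SITES["melbourne"].execute("INSERT INTO Customer VALUES (1, 'Ann')")
SITES["paris"].execute("CREATE TABLE Supplier (ID INTEGER, Name TEXT)")
SITES["paris"].execute("INSERT INTO Supplier VALUES (7, 'Luc')")

# Hypothetical catalog recording which site holds which table.
CATALOG = {"Customer": "melbourne", "Supplier": "paris"}


def decentralised_query(site, table):
    """Decentralised: the user must know, and state, where the table lives."""
    # The table name is interpolated directly; acceptable only in a sketch.
    return SITES[site].execute(f"SELECT ID, Name FROM {table}").fetchall()


def distributed_query(table):
    """Distributed: the catalog hides the location from the user."""
    return decentralised_query(CATALOG[table], table)


if __name__ == "__main__":
    print(decentralised_query("paris", "Supplier"))  # caller names the site
    print(distributed_query("Supplier"))             # caller names only the table
```

Both calls return the same rows; the difference is only in what the caller has to know. If the `Supplier` table moved to another site, only `CATALOG` would change in the distributed case, whereas every decentralised caller would need updating.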

Architecture
- DBMS runs on multiple sites on a network
- Normally organisations will use one of the big six DBMS (Oracle, DB2, Informix, Sybase, Ingres, Microsoft) and only use 1 ‘database engine’
- DBMS specialist knowledge (personnel) is required to manage/program them
- There will be problems/limitations getting 2 different DBMS to work together (standards are emerging to make this easier)
- When all DBMS in a distributed database are the same, we call it a homogeneous system, as distinct from a heterogeneous system (refer to figures 13-2 and 13-3)
- Each DBMS manages a collection of tables (as part of databases)
- These tables are exposed to (can be used by) users (end-users & programs) on other sites
- The goal is that users are unaware of the physical location of tables
- To a user, a distributed database looks like a local database
- Distributed database systems are typically only used by large organisations

Why use a Distributed Database System
- Large organisations are geographically dispersed entities
- It may make sense to keep data where it is generated & most often used, in order to:
  - reduce data transfer costs/network bandwidth
  - improve access speeds
- Politics typically plays a part
- Increased local autonomy is a factor

Why not use a Distributed Database System
- Expensive to buy
- Even more expensive to manage/maintain
- Specialised knowledge (personnel) is needed to set up, manage & maintain
- More database personnel are required (to manage the different sites)

Principles & Objectives
- Fundamental principle of a distributed database: to the user, the distributed database should look like a local database
- 12 objectives for a distributed database system:
- Local autonomy
  - local DBMS is autonomous
  - local DBMS can perform its functions independently of other sites
  - if some other site is down, local DBMS can still function
  - in practice, local DBMS must cooperate with other DBMS; hence, it will be partly dependent on other sites for some services, eg access to a table where the ‘primary copy’ is held on another site
  - so: local autonomy to the maximum extent possible
- No reliance on central site
  - no site in the network should assume a special role as ‘central site’
  - otherwise, the system is vulnerable to failure of this site
  - actually, this is just one aspect of the local autonomy issue
- Continuous operation
  - minimise unplanned shutdowns
  - there should be no need for planned shutdowns (eg to add a new site)
- Location independence
  - otherwise known as ‘transparency’
  - it should be transparent to a user/programmer that some tables are held at a remote site
  - someone needs to know where they are: the database administrator(s)
  - by hiding these details from the user/programmer:
    - life is simpler for the user/programmer
    - applications do not become dependent on the location of the tables (ie no data dependence)
- Fragmentation independence
  - horizontal fragments: table rows are held in different locations (eg Australasian account records held in Melbourne (A_Account) and European account records held in Paris (E_Account)); users see a single, unified Account table
  - note: relational systems are well suited to handle this fragmentation; eg the Account virtual table can be defined in terms of the physical tables as:
      SELECT * FROM A_Account UNION SELECT * FROM E_Account
  - eg the specification of the fragment where a row is stored is a restriction; for Melbourne rows: WHERE Continent = 'Australasia'
  - vertical fragments: not as many applications; may wish to hold columns containing sensitive data or special data (eg picture, map) on a dedicated server
  - again, relational systems are well suited to handle this fragmentation: the virtual table can be defined as a join of the physical vertical fragments, and the specification of the sensitive columns to hold on the dedicated server is a projection
  - fragmentation should be hidden from users so that applications do not become dependent on a given fragmentation
  - views will be used to hide sensitive columns from unauthorised users
  - the query processor will fragment queries against a fragmented table, eg
      SELECT ID FROM Account WHERE CreditRating = 'AAA'
    becomes
      SELECT ID FROM A_Account WHERE CreditRating = 'AAA' UNION SELECT ID FROM E_Account WHERE CreditRating = 'AAA'
- Replication independence
  - replicas: it may make sense to replicate commonly used data on multiple sites
  - replication should be hidden from users
  - complications on update: do all copies of an object need to be locked? do all copies of an object need to be updated?
- Distributed query processing
  - distributed queries are potentially very costly, so there is a need for optimisation
  - distributed query optimisation is just an extension of local query optimisation for RDBMS
  - so, once again, relational systems are well suited to distributed systems
  - Date makes the point that the set-oriented relational approach is well suited to distributed databases, as a single request (query) can be sent to a site from which data is sought; in a record-oriented system, a request must be sent for each record
- Distributed transaction management
  - this is more of a requirement than an objective
  - most applications will use transactions to protect data integrity
  - in a distributed database, there will be a need for distributed transactions: transactions that involve changes to records on multiple sites
- Hardware independence
  - the idea is that you should be free to choose the hardware on which you implement your distributed database
  - more an issue of the operating system supported
  - important for organisations with a mix of hardware/operating systems
  - products like Oracle are strong here: they run on a range of Unix flavours, NT, MVS (mainframe OS), etc
  - Microsoft SQL Server is at the other end of the spectrum: it runs only on NT
- Operating system independence
  - see above
- Network independence
  - similar sort of thing
  - increasingly, the operating system hides the NOS from the DBMS
- DBMS independence
  - should be able to mix & match RDBMS
  - in fact, support for advanced features like cooperative distributed transaction processing is limited

The Difficult Bits
- In a distributed database, it becomes much more difficult to manage:
  - The catalog
  - Query processing
  - Concurrent access
  - Recovery
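
The horizontal fragmentation example from the fragmentation independence slides (the `Account` table defined as a union of `A_Account` and `E_Account`, and the `CreditRating = 'AAA'` query that the query processor splits per fragment) can be tried out in miniature. The sketch below rests on several assumptions that are not from the lecture: two attached in-memory SQLite databases named `melbourne` and `paris` stand in for the remote sites, the column list and sample rows are invented, and `build_sites` and `run` are hypothetical helpers. SQLite does no distributed query processing; the sketch only shows that the user's query against the view and the hand-fragmented query return the same rows.

```python
import sqlite3

# The query the user writes, against the unified Account table.
USER_QUERY = "SELECT ID FROM Account WHERE CreditRating = 'AAA'"

# The same query after being fragmented per site, as in the slides.
REWRITTEN_QUERY = (
    "SELECT ID FROM melbourne.A_Account WHERE CreditRating = 'AAA' "
    "UNION "
    "SELECT ID FROM paris.E_Account WHERE CreditRating = 'AAA'"
)


def build_sites():
    """Create two stand-in sites and the unified Account view."""
    conn = sqlite3.connect(":memory:")
    # Each ATTACH of ':memory:' creates a separate in-memory database.
    conn.execute("ATTACH DATABASE ':memory:' AS melbourne")
    conn.execute("ATTACH DATABASE ':memory:' AS paris")
    columns = "(ID INTEGER, Continent TEXT, CreditRating TEXT)"
    conn.execute(f"CREATE TABLE melbourne.A_Account {columns}")
    conn.execute(f"CREATE TABLE paris.E_Account {columns}")
    # Invented sample rows; each fragment holds one continent's accounts.
    conn.executemany(
        "INSERT INTO melbourne.A_Account VALUES (?, ?, ?)",
        [(1, "Australasia", "AAA"), (2, "Australasia", "B")],
    )
    conn.executemany(
        "INSERT INTO paris.E_Account VALUES (?, ?, ?)",
        [(3, "Europe", "AAA"), (4, "Europe", "AA")],
    )
    # A TEMP view is used because SQLite only lets temporary views
    # reference tables in other attached databases.
    conn.execute(
        "CREATE TEMP VIEW Account AS "
        "SELECT * FROM melbourne.A_Account "
        "UNION "
        "SELECT * FROM paris.E_Account"
    )
    return conn


def run(conn, sql):
    """Return the first column of the result, sorted for stable comparison."""
    return sorted(row[0] for row in conn.execute(sql))


if __name__ == "__main__":
    conn = build_sites()
    print(run(conn, USER_QUERY))       # [1, 3]
    print(run(conn, REWRITTEN_QUERY))  # [1, 3]
```

The user's query never mentions `A_Account` or `E_Account`, so the fragments could be redistributed by redefining the view alone, which is the point of fragmentation independence.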