G063 - Distributed Databases

Learning Objectives:
By the end of this topic you should be able to:
• explain how databases may be stored in more than one physical location
• explain the methods by which this distribution may be carried out
• explain reasons why distribution would be carried out
• explain the security issues of distributed databases

Database storage:
A Distributed Database is:
• a single logical database
  – consisting of many entities
  – possibly used by many users for different purposes
• a database that is not stored in its entirety at a single physical location
• a database spread physically across a number of computers
  – computers could be in multiple locations, buildings or sites
  – computers are connected by a data communications link (LAN and/or WAN)

Why distribute a database:
• allows faster local queries
  – faster searching
• speeds up other network operations
  – some data queries are handled locally, which reduces network traffic
• improved reliability
  – data may be replicated at multiple sites
• allows for modular growth of the database
  – can easily add new sites and/or users
• user does not need to know where data is stored physically
  – looks like a single-location, centralised system to the user

Types of Distributed Database
• Replicated
• Centralised
• Partitioned

Replicated Database
• complete database is duplicated at each centre
• exact copy of the database stored & accessed locally
• duplicated versions are usually read only
  – transaction files are created of changes at each centre
• updates are made on a master database
  – a ‘new’, updated copy of the database is sent to each centre at regular intervals

Replicated Database Advantages:
• reliability
  – data is always available locally
  – not reliant on the network or central server
  – work carries on even if some nodes are down
• fast response to searches
  – local access will be faster than WAN access
  – data does not have to be transmitted over the network
• reduced network traffic at prime time
  – faster access to network if required

Replicated Database Disadvantages:
• additional local requirements for storage space
• additional time for update operations
• complexity and cost of updating
• data integrity issues
  – if replicated data is not updated simultaneously, local copies of data may be different

Centralised Database
• single database held centrally (possibly at Head Office)
• each node accesses the database through a network (WAN)
  – access available to all branches or offices
• an index to the central database is held locally at each node
  – speeds up queries/transactions
• booking systems need distributed access to a central database if they are to work effectively
  – sharing of up-to-date information is important
  – avoids double bookings

Centralised Database Advantages:
• better security of data
  – one copy rather than several (replicated copies)
  – security handled centrally
• good data integrity
  – one copy rather than several, so all sites always share the same data
• data can be updated in real time
  – data always up-to-date
• centralised backup
  – can be automated

Centralised Database Advantages (from June 2011 Q13 mark scheme):
• storage is only required at the central location for the centralised database (1); the local indexes stored at each site take up far less memory (1)
• queries are processed locally (1); this speeds up searches as only the required data is retrieved from the central location (1)
• less data traffic than complete centralisation (1), as only data is sent and not the additional information/forms/reports structure (1)
• increased security (1); only the central database needs increased security as that is where the data is stored (1)
• integrity of data not compromised (1), as it is stored in only one location and there is only one database to update (1)
• centralised back-up of data (1); management of backup is easier as it is just one person’s responsibility (1)

Centralised Database Drawbacks:
• a virus in the central system could spread throughout all sites
• possibility of update clashes
  – two sites trying to modify the same record at the same time

Partitioned Database
• database is split into sections
• each node or site on the network stores local data
  – i.e. the section of the database that relates to that site
  – e.g. the section of the database that relates to a single supermarket’s stock is stored at that site
• other (global) data is held centrally
  – changes to central data can be dealt with overnight by a batch update from the sites

Horizontal partitioning
• involves putting different rows into different tables
• splitting the table into a number of smaller tables
  – on the basis of rows (records), i.e. specific field contents
Example:
• branch offices in an organisation deal mostly with a set of local customers
  – Euston Road branch stores the fragment where contents of the Branch field = 'Euston Road'
• the example table (not reproduced in this transcript) represents the database for an estate agency with 3 branches
• the database is horizontally partitioned
  – so that the data for each branch is stored on the server in that branch
  – this will speed up local queries, e.g. Boldmere staff searching for properties in Boldmere
• this means that the data is stored as one table per branch (tables not reproduced in this transcript)

Vertical partitioning
• dividing the table based on the different columns
• involves creating tables with fewer columns
  – using additional tables to store the remaining columns
• different columns of a table located at different sites
  – e.g. stock descriptions (country of origin, supplier name) at one site and prices at another site

Vertical partitioning (from June 2011 Q13 mark scheme):
• only certain people see certain fields
  – e.g. financial matters not revealed to all (1)
• to conform to the law/DPA (1)
  – keeping personal information private (1)
• reduces amount of data being sent between locations (1)
  – in order to speed up data transfer (1)
  – allowing faster reaction time (1)
  – meaning rescue reaches emergency quicker (1)

Partitioned Database Advantages:
• speed
  – faster access to local data
  – less network access required
• local control over local data
• scalability
  – can add new sites as required
• not reliant on network or server for day-to-day tasks
• each partition can have its own transaction log
  – local reporting (access/sales)

Partitioned Database Drawbacks:
• data inconsistency
  – possibility of data held centrally differing from that held on a partition
  – regular batch update required to maintain consistency
• unsuitable for certain applications
  – where data changes at one node must be instantly seen by all nodes, e.g. holiday bookings
• high network usage during update process
  – will slow down other network processes
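
Worked example: horizontal and vertical partitioning
The two kinds of partitioning described above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the table contents, the field names (PropertyID, Branch, Address, Price) and the helper functions horizontal_partition, vertical_partition and rejoin are all hypothetical and do not come from the original slides. Only the branch names Boldmere and Euston Road appear in the source.

```python
# Illustrative sketch only: table contents, field names and helper
# functions are assumptions, not part of the original slides.
from collections import defaultdict

# A hypothetical estate-agency table; each row is one property record.
PROPERTIES = [
    {"PropertyID": 1, "Branch": "Boldmere", "Address": "1 High St", "Price": 150000},
    {"PropertyID": 2, "Branch": "Euston Road", "Address": "2 Low Rd", "Price": 320000},
    {"PropertyID": 3, "Branch": "Boldmere", "Address": "3 Mill Ln", "Price": 185000},
]


def horizontal_partition(rows, field):
    """Split rows into fragments keyed by the contents of one field."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[field]].append(row)
    return dict(fragments)


def vertical_partition(rows, key, column_groups):
    """Split columns into fragments; every fragment keeps the key field."""
    return [
        [{key: row[key], **{c: row[c] for c in cols}} for row in rows]
        for cols in column_groups
    ]


def rejoin(fragments, key):
    """Rebuild full rows from vertical fragments by matching the key field."""
    merged = defaultdict(dict)
    for fragment in fragments:
        for row in fragment:
            merged[row[key]].update(row)
    return [merged[k] for k in sorted(merged)]


# Horizontal: each branch's server would hold only its own rows.
by_branch = horizontal_partition(PROPERTIES, "Branch")

# Vertical: descriptive columns at one site, prices at another.
details, prices = vertical_partition(
    PROPERTIES, "PropertyID", [["Branch", "Address"], ["Price"]]
)
```

In this sketch each vertical fragment repeats the key field so that rejoin can match rows from different sites back together; a horizontal fragment needs no such step because every row is already complete.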