Download Database Management System

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Microsoft Access wikipedia , lookup

SQL wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Serializability wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Functional Database Model wikipedia , lookup

Database wikipedia , lookup

Open Database Connectivity wikipedia , lookup

Relational model wikipedia , lookup

Microsoft SQL Server wikipedia , lookup

Versant Object Database wikipedia , lookup

Clusterpoint wikipedia , lookup

Concurrency control wikipedia , lookup

Database model wikipedia , lookup

Transcript
D
A
Database
Management
Systems
T
A
B
Chapter 11
A
Distributed Databases
S
Jerry Post
E
Copyright © 2013
1
Objectives







Why do you need a distributed database?
What are distributed databases?
How is data distributed with client/server systems?
Can a Web approach solve the data distribution issues?
How much data can you send to a client form?
How do you transfer data across diverse systems?
How will Sally’s employees access the database?
2
Distributed Databases





SELECT Sales
FROM Britain.Sales
UNION
SELECT Sales
FROM France.Sales
UNION
SELECT Sales
FROM Italy.Sales
Definition
Advantages / Uses
Problems / Complications
Client-Server / SQL Server
Microsoft Access
Germany
Britain
France
Italy
3
Distributed Database Definition
 Multiple independent databases
 Each DBMS is a complete DBMS (engine,
queries, locking, transactions, etc.)
 Usually on different machines.
 Usually in different locations.
 Connected by a network.
 Might be different environments
 Hardware
 Operating System
 DBMS Software
Database
Apollo
Database
Zeus
England
France
Database
Athena
United States
4
Distributed Database Rules
 C.J. Date
 Rule 0: Transparency: the
user should not know or
care that the database is
distributed.
 Local autonomy.
 No reliance on a central site.
 Continuous operation.
 Location independence.
 Fragmentation independence
(physical storage).
 Replication independence.
 Distributed query processing.
 Distributed transaction
management.
 Hardware independence.
 Operating system
independence.
 Network independence.
 DBMS independence.
5
Distributed Features
 Each database can continue to run even if portion fails.
 Data and hardware can be moved without affecting operations or users.
 Expanding operations.
 Performance issues.
 System expansion and upgrades.
 Add new section without affecting others.
 Upgrade hardware, network and DBMS.
6
Advantages and Applications
local
transactions
 Business operations are often
distributed
 Work and data are segmented by
department.
 Work and data are segmented by
geographical location.
 Improved performance
 Most updates and queries are performed
locally.
 Maintain local control and responsibility
over data.
future
expansion
 Can still combine data across the
system.
 Scalability and expansion
 Add on, not replacement.
7
Creating a Distributed Database








Design administration plan.
Choose hardware and DBMS vendor, and network.
Set up network and DBMS connections.
Choose locations for data.
Choose replication strategy.
Create backup plan and strategy.
Create local views and synonyms.
Perform stress test: loads and failures.
8
Network Transfer Rates





Drives: 60 - 400 MB per sec. (with SSD or RAID)
LANs: 10-100 MB per sec (100-1000 mbps).
WANs: 0.2 - 300 MB per sec.
Faster is possible but expensive!
Goal is to minimize transmissions.
 Each system must be capable of evaluating
queries--preferably SQL.
 Results depend heavily on how the system joins
tables.
WAN
0.2 - 300 MB
60 – 400 MB
Disk drive
10-100 MB
LAN
9
Distributed Query Processing
 Example




NY: Customers: 1 M rows
LA: Production: 10 M rows
Chicago: Sales: 20 M rows
Query: List customers who bought blue
products on March 1
 Bad idea #1
 Transfer all rows to Chicago
 Then JOIN and select.
 Better idea #2 (probably)
NY
Customers(C#, …)
1,000,000
C# list from
desired P#
Chicago
Matching
Sales(S#, C#, Sdate) Customer
data
20,000,000
SaleItem(S#, P#,…)
50,000,000
P# sold on
March 1
 Transfer blue products from LA to Chicago
 Better idea #3
 Get sale items on March 1
 Get blue products from LA
 Send C# to NY
LA
Blue P#
sold on
March 1
Products(P#, Color…)
10,000,000
10
Data Replication: Publish/Subscribe
subscribe
Replica
Main
Published
changes
Changed
data
11
Data Replication
Britain
 Goals
 Minimize transmissions
 Improve performance
 Support heavy multiuser access.
 Problems
 Updating copies
Britain: Customers
& Sales
Market research &
data corrections.
France: Customers
& Sales
Spain: Customers
& Sales
Periodic
updates
 Bulk transmissions
 Site unavailable
 Concurrency
 Easier for two people to change the
same data at the same time.
 Decision support systems.
 Data warehouse.
Spain
Britain: Customers
& Sales
France: Customers
& Sales
Spain: Customers
& Sales
Update data.
12
Concurrency and Locks
 Each DBMS must maintain lock facility.
 To update, each DBMS must utilize and
recognize other lock mechanisms and
return codes.
 Each DBMS must have a deadlock
resolution protocol that recognizes the
distributed databases.
 Random wait.
 Optimistic updates.
 Two-phase commit.
DBMS #1
Accounts
Jones
8898
Transaction A
Locked
Waiting
DBMS #2
Accounts
Jones
3561
Transaction B
Waiting
Locked
13
Transactions & Two-Phase Commit
 Two (or more) separate lock
managers.
 DBMS initiating update serves as the
coordinator.
 Two phases
 Coordinator sends message and data
to all machines to “get ready.”
 Local machines save data in logs,
verify update status and return
message.
 If all locals report OK, then coordinator
writes log and instructs others to
proceed. If any fail, it sends Rollback
message.
Database 1
Initiate Transaction
1. Prepare to commit.
All agree?
2. Commit
Database 2
Lock tables.
Save log.
Update all tables.
Database 3
14
Distributed Transaction Managers
Transaction Manager
Resource
Manager
DBMS
Transaction Manager
Resource
Manager
DBMS
Transaction Manager
Transaction
Processing
Monitor
Resource
Manager
DBMS
The distributed transaction
coordinator/transaction processing monitor
handles the transaction decisions and
coordinates across the participating systems.
15
Distributed Design Questions
Question
What level of data consistency is needed?
How expensive is storage?
What are the shared access requirements?
How often are the tables updated?
Required speed of updates (transactions)?
How important are predictable transaction times?
DBMS support for concurrency and locking?
Can shared access be avoided?
Concurrent
High
Medium – High
Global
Often
Fast
High
Good – Excellent
No
Replication
Low – Medium
Low
Local
Seldom
Slow
Low
Poor
Yes
16
Client-Server
Server
Server
Shared
Database
Front-end
User Interface
Clients
Clients
Older, but common for retail stores/checkout computers.
18
LAN File Server: Slow
File Server
MyFile.mdb
CustID Name …
115
Jenkins …
Forms 125
Juarez ...
Order ...
DBMS
software
transferred.
Application
and query
transferred.
SELECT *
FROM Customer
WHERE City = “Sandy”
One row at a time
transferred, until
all rows are examined.
20
Client-Server Databases
 One machine machine is dominant
(server) and handles data for many
clients.
 Client machines handle front-end tasks
and small data tables that are not
shared.
File Server
DBMS
Shared
SQL Server Data
Send SQL
statement.
Return
matching
data.
application
21
Three-Tier Client-Server
 Server Databases
 Client front-end
 Middle
 Locate databases
 Business rules
 Program code
Databases.
Transactions.
Legacy applications.
Database links.
Business rules.
Program code.
Application.
Front-end.
User Interface.
Database
Servers
Middleware
Client
23
Database Independence on the Client
Original DBMS
ADO
New DBMS
ADO
Application
24
Database Independence with Queries
Independent Application Query: works with any DBMS
SELECT SaleID, SaleDate, CustomerID, CustomerName
FROM SaleCustomer
Saved Oracle Query
SELECT SaleID, SaleDate, CustomerID,
LastName || ‘, ‘ || FirstName AS CustomerName
FROM Sale, Customer
WHERE Sale.CustomerID=Customer.CustomerID
Saved SQL Server Query
SELECT SaleID, SaleDate, CustomerID,
LastName + ‘, ‘ + FirstName AS CustomerName
FROM Sale INNER JOIN Customer
ON Sale.CustomerID = Customer.CustomerID
25
The Internet as Client-Server
Internet
Router
Router
Server
Client
Browser
request
http://server.location/page
result page
Web Server
HTML
forms
SQL
Data
Database Server
26
HTML Limited Clients
<HTML>
<HEAD>
<TITLE>My main page</TITLE></HEAD>
<BODY BACKGROUND=“graphics/back0.jpg”>
<P>My text goes in paragraphs.</P>
<P>Additional tags set <B>boldface</B> and <I>Italic</I>.
<P>Tables are more complicated and use a set of tags for rows and
columns.</P>
<TABLE BORDER=1>
<TR><TD>First cell</TD><TD>Second cell</TD></TR>
<TR><TD>Next row</TD><TD>Second column</TD></TR>
</TABLE>
<P>There are form tags to create input forms for collecting data.
But you need CGI program code to convert and use the input data.</P>
</BODY>
</HTML>
27
HTML Output
My text goes in paragraphs.
Additional tags set boldface and Italic.
Tables are more complicated and use a set of tags
for rows and columns.
First cell
Second cell
Next row
Second column
There are form tags to create input forms for
collecting data. But you need CGI program code
to convert and use the input data.
28
Web Server Database Fundamentals
DBMS
Web Server
SQL
<body>
<form id="form1" runat="server">
<asp:Label ID="PageTitleLabel" runat="server" …
Data
<asp:SqlDataSource ID="CustomerSqlDataSource“
DeleteCommand="DELETE FROM [Customer] …
SelectCommand="SELECT [CustomerID], …
UpdateCommand="UPDATE [Customer] SET…
Data Object <DeleteParameters>
<asp:Parameter Name="CustomerID" Type="Int32" />
</DeleteParameters>
<asp:FormView ID="CustomersFormView" runat="server“…
DataSourceID="CustomerSqlDataSource">
…
</asp:FormView>
</form>
</body>
Web Browser
Customers
CustomerID
LastName
FirstName
…
1653
Jones
Mary
Save
CSS Style Sheet
.PageTitle
{
font-weight: bold;
font-size: larger;
text-align: center;
}
30
Database Example: Client Side
0
Request Server/Form.html
Server
1 Initial form
3 Results
2
31
Client-Server Data Transfer
Order Form
Order ID
1015
Customer
Jones, Martha
Order Date
12-Aug
What if there are 10,000 customers?
How much time to load the combo box?
How do you refresh/reload the combo box?
Alternatives?
32
Latency
Server
Generate form
Receive form data
Transmission
delay
Transmission
delay
time
Form received
Client
User delay
33
Cloud Computing
Data
Data is replicated to multiple, connected servers in the Internet cloud.
Client requests are filled from the nearest available server, spreading the
bandwidth and processor demands across the network.
34
Cloud Databases: Amazon S3
Web server
HTML,
Code,
Bucket
+Key
Amazon S3
Bucket Name
Key, Object
HTML
Page
User
Developer
Developer opens Amazon S3 account and creates/names a bucket.
Developer uploads content (objects) to bucket and assigns a key to each object.
Developer writes Web server code with HTML page and background code that calls
the S3 service with the bucket name and object key.
Web page is constructed on request and delivered with the object.
35
Web Databases (and Cloud)
 Many cloud systems are proprietary and not relational.
 Designed to handle data that rarely changes—typically through bulk
uploads.
 So little fear of concurrency or transaction issues.
 Microsoft Azure SQL is an exception.
 Need to handle concurrency (and transactions/logs) in your own code.
 Always use optimistic concurrency (not pessimistic locking).
 See Chapter 7, but keep the original values when you read data and use
it in the WHERE clause when you alter data to see if it has changed.
36