Download How to Scale a Database System - CMU-CS 15

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Entity–attribute–value model wikipedia , lookup

IMDb wikipedia , lookup

Tandem Computers wikipedia , lookup

Serializability wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

Microsoft Access wikipedia , lookup

Oracle Database wikipedia , lookup

Team Foundation Server wikipedia , lookup

Btrieve wikipedia , lookup

Ingres (database) wikipedia , lookup

Concurrency control wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Database wikipedia , lookup

Database model wikipedia , lookup

SQL wikipedia , lookup

Relational model wikipedia , lookup

PL/SQL wikipedia , lookup

Open Database Connectivity wikipedia , lookup

Clusterpoint wikipedia , lookup

Microsoft SQL Server wikipedia , lookup

Transcript
CMU SCS
Carnegie Mellon Univ.
Dept. of Computer Science
15-415/615 - DB Applications
C. Faloutsos – A. Pavlo
How to Scale a Database System
CMU SCS
hag·i·og·r
(noun
a·phy
)
CMU SCS
ChristosTheGreekGodofDatabases.com
• Pinterest meets Causal Encounters
meets Kickstarter meets Twitter
– With Christos!
CMU SCS
ChristosTheGreekGodofDatabases.com
• More reads than writes.
• All media stored outside of DBMS.
• How do we choose the right
database architecture?
Faloutsos/Pavlo
CMU SCS 15-415/615
4
CMU SCS
Outline
• Single-Node Databases
• NoSQL Systems
• NewSQL Systems
Faloutsos/Pavlo
CMU SCS 15-415/615
5
CMU SCS
Late-1990s / Early-2000s
• All the big players were heavyweight and
expensive.
– Oracle, DB2, Sybase, SQL Server, Informix.
• Open-source databases were missing
important features.
– Postgres, mSQL, MySQL.
Faloutsos/Pavlo
CMU SCS 15-415/615
6
CMU SCS
Mid-2000s
• MySQL + InnoDB is widely adopted by
new web companies:
– Supported transactions, replication, recovery.
– Memcache for caching queries.
Faloutsos/Pavlo
CMU SCS 15-415/615
7
CMU SCS
ChristosTheGreekGodofDatabases.com
• Let’s go with MySQL.
• We’re getting a lot of traffic.
• Our database server is saturated!
How do we increase the
capacity of our database server?
Faloutsos/Pavlo
CMU SCS 15-415/615
8
CMU SCS
Idea #1:
Buy a
faster
machine.
CMU SCS
Scaling Up
 More disks.
 More RAM.
 Faster CPUs.
Use SSDs.
Application Server
(+) Requires no change to application.
(+) Improvements are immediate.
Faloutsos/Pavlo
Database Server
(-) Expensive! Diminishing Returns.
(-) Single Point of Failure.
CMU SCS 15-415/615
10
CMU SCS
Idea #2:
Replicate
database
on multiple
CMU SCS
Replication
Read Request
Application Server
(+) Requires no change to application.
(+) Parallelize read operations.
(+) Improved fault tolerance.
Faloutsos/Pavlo
Database Server
Replicas
(-) Expensive! Diminishing Returns.
(-) Writes limited to slowest node.
CMU SCS 15-415/615
12
CMU SCS
Idea #3:
Cache
query
results.
CMU SCS
Query Cache
Update
Check Cache
Cache memcache
Query Request
Application Server
Database Server
(+) Reduce load on DBMS.
(+) Fast API.
(-) Extra roundtrip per query.
(-) Requires application changes.
(-) Doesn’t help write-heavy apps.
Faloutsos/Pavlo
CMU SCS 15-415/615
Replicas
14
CMU SCS
Idea #4:
Push SQL
into stored
procedures.
CMU SCS
Stored Procedures
Stored Procedure
def getPage(request):
# Process request
EXEC SQL
def getPage(request):
EXEC SQL
# Process request
# Process results
EXEC PROCEDURE
if x == True:
# Render HTML page
EXEC SQL
return (html)
else:
EXEC SQL
# Render HTML page
return (html)
BEGIN:
EXEC SQL
EXEC SQL
if x == True:
EXEC SQL
else:
EXEC SQL
return (results)
END;
Database Server
Application Code
(+) Reduces network roundtrips.
(+) Less lock contention.
(+) Modularization.
Faloutsos/Pavlo
Replicas
(-) Application logic in two places.
(-) PL/SQL is not standardized.
CMU SCS 15-415/615
16
CMU SCS
Idea #5:
Shard
database
across
CMU SCS
Sharding / Partitioning
Logical
Partitions
Application Server
Database Cluster
(+) Parallelize all operations.
(-) Most DBMSs don’t support this.
(+) Much easier to add more hardware. (-) Joins are expensive.
(-) Non-trivial to split database.
Faloutsos/Pavlo
CMU SCS 15-415/615
18
CMU SCS
ChristosTheGreekGodofDatabases.com
• We want to scale out but writing a sharding
layer is hard.
• Some parts of our
application don’t need
a full-featured DBMS.
Faloutsos/Pavlo
CMU SCS 15-415/615
CMU SCS
Idea #6:
Give up
ACID
guarantees
CMU SCS
Faloutsos/Pavlo
Update Profile
?
Master
Get Profile
CMU SCS 15-415/615
DBMS Servers
Application Servers
Eventual Consistency
?
Replicas
21
CMU SCS
Late-2000s (NoSQL)
• NoSQL systems are able to scale
horizontally right out of the box by giving
traditional database features.
Faloutsos/Pavlo
CMU SCS 15-415/615
22
CMU SCS
ChristosTheGreekGodofDatabases.com
• We need to process payments.
• We don’t want to lose orders.
• We need joins and ACID transactions.
CMU SCS
Strong Consistency
Use TwoPhase Commit
Nice Christos Pictures!
-$100
Send Money
+$100
Thanks!
Faloutsos/Pavlo
CMU SCS 15-415/615
24
CMU SCS
Idea #7:
Keep
guarantees
, optimize
CMU SCS
Early-2010s (NewSQL)
• New DBMSs that can scale across multiple
machines natively and provide ACID
guarantees.
CMU SCS
Conclusion
• RDBMS (Single-Node):
– MySQL, Postgres
• NoSQL (Multi-Node):
– Key-Value, Documents, Graphs
• NewSQL (Multi-Node):
– Transaction Processing, MySQL Sharding
Faloutsos/Pavlo
CMU SCS 15-415/615
27
CMU SCS
What DBMS
should my
start-up use?
CMU SCS
Beyond the 15-415/615
• Christos is teaching 15-826 this fall:
– Multimedia Databases and Data Mining
• Send me an email if you’re interested in
working on a database research project.