CSCI5570 Large Scale Data
Processing Systems
NewSQL
James Cheng
CSE, CUHK
Slide Ack.: modified based on the slides from Hefu Chai
Skew-Aware Automatic Database
Partitioning in Shared-Nothing, Parallel
OLTP Systems
Andrew Pavlo, Carlo Curino, Stanley Zdonik
SIGMOD 2012
Main Memory • Parallel • Shared-Nothing
Transaction Processing
H-Store: A High-Performance, Distributed
Main Memory Transaction Processing System
Proc. VLDB Endow., vol. 1, iss. 2, pp. 1496-1499, 2008.
[Figure: a client application sends a procedure name and input parameters to the database cluster, which executes the transaction.]
[Figure: the database cluster returns the transaction result to the client application.]
OLTP Transactions
• Fast: short-lived (i.e., no user stalls)
• Repetitive: typically executed as pre-defined txn templates or stored procedures
• Small: touch a small subset of data using an index (i.e., no full table scans or large distributed joins)
We need an approach that supports…
• Stored procedures
• Load balancing in the presence of time-varying skew
• Complex schemas
• Deployments with a larger number of partitions
Optimal Database Design
• Scalability of NewSQL depends on the
existence of an optimal database design,
which defines
– how an application’s data and workload are
partitioned or replicated across nodes
– how queries and transactions are routed to nodes
– the above determines two crucial factors:
• the number of transactions accessing multiple nodes
• the skewness of the load across the cluster
A growing fraction of distributed transactions and load skew
=> 10x worse performance (see following slides)
Automatic Database Design Tool
for Parallel Systems
Skew-Aware Automatic Database Partitioning
in Shared-Nothing, Parallel OLTP Systems
SIGMOD 2012
What are the key issues?
• Two key issues when generating a good
database design for enterprise OLTP
applications
– Distributed transactions
• network overhead of employing two-phase commit or a
similar distributed consensus protocol to ensure
atomicity and serializability
– Temporal workload skew
• node with skewed load becomes saturated, while other
nodes are idle and clients are blocked waiting for
results
What are the key issues?
• Distributed transactions
• Temporal workload skew
Impact of distributed transactions on throughput
Temporal workload skew
• Think about the example of Wikipedia
– Even though the average load across the cluster over the
entire day is uniform, the load across the cluster at any
given point in time is unbalanced (due to differences in
the languages of the wiki content and in time zones)
– Static skew vs. temporal skew
Impact of temporal workload skew on throughput
What are the key issues?
• A complex tradeoff: distributed transactions
vs. temporal workload skew
– put database on a single node and execute all
transactions there
• no distributed transactions
• extreme load skew
– execute all transactions as distributed transactions
that access data at every partition
• total distributed transactions
• no load skew
Horticulture’s Goal
• Analyze
– a database schema
– the structure of the application’s stored procedures
– a sample transaction workload
• Generate partitioning that
– minimizes distribution overhead
– balances access skew
Three Unique Features
• Maintain the tradeoff between distributed transactions and temporal skew
• Extend design space to include replicated secondary indexes
• Organically handling stored procedure routing
Two Main Technical Contributions
• Large-Neighborhood Search: automatic database partitioning
• Skew-Aware Cost Model: coordination cost and load distribution estimation
What are the design options?
• For each table:
– Horizontal partition
– Replicate on all partitions
– Replicate a secondary index for a subset of its
columns
– Effectively route incoming transaction requests
Horizontal Partitioning
Table Replication
• For read-only or read-mostly tables
Secondary Index
• For read-only or read-mostly columns
Stored Procedure Routing
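Routing can be illustrated with a minimal sketch, assuming hash partitioning and one routing parameter per stored procedure. The names `partition_for`, `route`, `ROUTING_PARAM`, and the procedure names are hypothetical and are not H-Store's or Horticulture's API.

```python
import zlib

NUM_PARTITIONS = 4  # assumed cluster size for this sketch


def partition_for(value, num_partitions=NUM_PARTITIONS):
    """Map a partitioning-attribute value to a partition with a stable hash."""
    return zlib.crc32(str(value).encode("utf-8")) % num_partitions


# Hypothetical design: for each stored procedure, the position of the input
# parameter that serves as its routing parameter.
ROUTING_PARAM = {"NewOrder": 0, "GetCustomer": 1}


def route(procedure, params):
    """Return the partition that should execute this transaction request."""
    return partition_for(params[ROUTING_PARAM[procedure]])


home = route("NewOrder", [42, "item-7"])
```

If the routing parameter matches the partitioning attribute of the tables the procedure touches, the request lands on the partition that already holds its data.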
What are the key technical contributions?
• Large-Neighborhood Search
• Skew-Aware Cost Model
Large-Neighborhood Search
Inputs:
1. Database schema
2. Stored procedures
3. Sample workload
Steps:
1. Analyze a sample workload to pre-compute info used to guide the search process.
2. Generate an initial “best” design Dbest based on the most frequently accessed columns.
3. Create a new incomplete design Drelax by relaxing (i.e., resetting) a subset of Dbest.
4. Perform a local search for a new design using Drelax as a starting point. Replace Dbest w/ a new design with a lower cost.
5. After running for a limited time, stop and return Dbest.
Restart Step 3 if k searches do not improve Dbest or there is no design in Drelax’s neighborhood.
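The relax-then-search loop can be sketched as follows. This is a toy version under stated assumptions: a design is just a `{table: partitioning column}` dict, the cost function is supplied by the caller, a fixed number of rounds stands in for the time limit, and the local search is brute force in place of branch-and-bound. The restart rule after k non-improving searches is not modeled. All names are hypothetical.

```python
import itertools
import random


def lns(candidates, cost, initial, rounds=50, relax_size=2, seed=0):
    """Toy Large-Neighborhood Search over {table: partitioning column}."""
    rng = random.Random(seed)
    best = dict(initial)
    best_cost = cost(best)
    tables = sorted(candidates)
    for _ in range(rounds):  # stand-in for the time limit
        # Relaxation: pick random tables whose attributes will be reset.
        relaxed = rng.sample(tables, min(relax_size, len(tables)))
        # Local search: try every option for the relaxed tables
        # (brute force here; the slides describe branch-and-bound).
        for combo in itertools.product(*(candidates[t] for t in relaxed)):
            trial = dict(best)
            trial.update(zip(relaxed, combo))
            trial_cost = cost(trial)
            if trial_cost < best_cost:
                best, best_cost = trial, trial_cost
    return best, best_cost


# Toy problem: the cost is the number of tables that differ from a known target.
candidates = {"A": ["a1", "a2"], "B": ["b1", "b2"], "C": ["c1", "c2"]}
target = {"A": "a2", "B": "b1", "C": "c2"}
toy_cost = lambda d: sum(d[t] != target[t] for t in d)
initial = {t: opts[0] for t, opts in candidates.items()}
best, best_cost = lns(candidates, toy_cost, initial)
```

Because a trial design replaces the incumbent only when its cost is strictly lower, the returned cost can never exceed that of the initial design.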
Large-Neighborhood Search
Initial Design
1. Select the most frequently accessed column in each
table as the horizontal partitioning attribute
2. Greedily replicate read-only tables until no space is left
3. Select next most frequently accessed, read-only
column as secondary index attribute for each table
4. Select the routing parameter for stored procedures
based on how often the parameters are referenced in
Q (Q: queries that access columns selected in Step 1)
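Steps 1 and 2 above can be sketched as below. The input shapes (a list of `(table, column)` accesses and a dict of table sizes) and the smallest-first replication order are assumptions of this sketch; the slides do not specify them.

```python
from collections import Counter, defaultdict


def initial_partitioning(accesses):
    """Step 1: pick the most frequently accessed column of each table."""
    counts = defaultdict(Counter)
    for table, column in accesses:
        counts[table][column] += 1
    return {t: c.most_common(1)[0][0] for t, c in counts.items()}


def greedy_replicate(read_only_sizes, budget):
    """Step 2: replicate read-only tables while they fit in the space budget.

    Trying the smallest tables first is an assumption of this sketch.
    """
    chosen, used = [], 0
    for table, size in sorted(read_only_sizes.items(), key=lambda kv: (kv[1], kv[0])):
        if used + size <= budget:
            chosen.append(table)
            used += size
    return chosen


accesses = [("orders", "o_id"), ("orders", "cust_id"),
            ("orders", "cust_id"), ("item", "i_id")]
design = initial_partitioning(accesses)
replicated = greedy_replicate({"item": 10, "country": 1, "stock": 100}, budget=20)
```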
Large-Neighborhood Search
Relaxation:
• The process of selecting random tables in the database and resetting their chosen
partitioning attributes in Dbest
• Allow LNS to escape a local minimum and jump to a new neighborhood of potential solutions
• Horticulture:
• decides the number of tables to relax
• randomly chooses which tables to relax
(routing parameters of stored procedures referencing a relaxed table will also be reset)
• generates the candidate attributes for the relaxed tables and procedures
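A minimal sketch of relaxation, assuming the design is held in two dicts (table to partitioning attribute, procedure to routing parameter) and that `None` marks a reset entry. The function name and data layout are hypothetical.

```python
import random


def relax(table_attr, proc_param, proc_tables, k, rng):
    """Reset k randomly chosen tables and the procedures that reference them.

    table_attr:  {table: partitioning attribute}
    proc_param:  {procedure: routing parameter}
    proc_tables: {procedure: set of tables the procedure references}
    """
    relaxed = set(rng.sample(sorted(table_attr), k))
    new_tables = {t: (None if t in relaxed else a) for t, a in table_attr.items()}
    # A procedure's routing parameter is reset if it touches any relaxed table.
    new_procs = {p: (None if proc_tables[p] & relaxed else r)
                 for p, r in proc_param.items()}
    return relaxed, new_tables, new_procs


table_attr = {"orders": "cust_id", "item": "i_id", "stock": "w_id"}
proc_param = {"NewOrder": "w_id", "GetItem": "i_id"}
proc_tables = {"NewOrder": {"orders", "stock"}, "GetItem": {"item"}}
relaxed, d_tables, d_procs = relax(table_attr, proc_param, proc_tables,
                                   k=1, rng=random.Random(7))
```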
Large-Neighborhood Search
• Local Search
Phase 1: Explore the tree using branch-and-bound search; replace the table’s design option in Drelax with that of the tree node. Estimate the cost; if it is lower than that of Dbest, go down the tree.
Phase 2: For each procedure, choose the routing parameter w/ the lowest cost before moving down the tree.
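The table phase of the local search can be sketched as a depth-first branch-and-bound. This sketch assumes the cost function accepts a partial design (unset tables are `None`) and never decreases as more tables are set, which is what makes pruning safe. Routing-parameter selection is omitted, and all names are hypothetical.

```python
def branch_and_bound(tables, options, cost, design, best_cost):
    """Search design options for the relaxed tables, pruning costly branches.

    Returns (best design found or None, its cost or the unchanged bound).
    """
    best = None

    def visit(i, current):
        nonlocal best, best_cost
        if cost(current) >= best_cost:  # bound: cannot beat the incumbent
            return
        if i == len(tables):            # leaf: a complete, cheaper design
            best, best_cost = dict(current), cost(current)
            return
        for opt in options[tables[i]]:
            current[tables[i]] = opt
            visit(i + 1, current)
        current[tables[i]] = None       # undo before backtracking

    visit(0, dict(design))
    return best, best_cost


# Toy monotone cost: the sum of the chosen option values (unset counts as 0).
toy_cost = lambda d: sum(v for v in d.values() if v is not None)
found, found_cost = branch_and_bound(
    ["A", "B"], {"A": [1, 2], "B": [1, 2]}, toy_cost,
    {"A": None, "B": None}, best_cost=4)
```

In this toy run the branch with `A = 2` is pruned immediately, because its partial cost already equals the best complete cost found so far.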
Skew-Aware Cost Model
• LNS relies on a cost model to estimate the cost of
executing the sample workload using a given
design
• The cost model must be able to
– accentuate the properties that are important in a DB
– be computed quickly
– estimate the cost of an incomplete design
– return a monotonically increasing cost as more
variables are set when searching down the tree
Skew-Aware Cost Model
Distributed Transactions + Workload Skew Factor
Skew-Aware Cost Model
• Measure
– how much workload executes as distributed
transactions
– how uniformly load is distributed across the
cluster
$$\mathit{cost}(D, W) = \frac{\alpha \times \mathit{CoordinationCost}(D, W) + \beta \times \mathit{SkewFactor}(D, W)}{\alpha + \beta}$$
Tradeoff!
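A small sketch of how the weights trade the two terms off. The function name `design_cost` and the input values are made up for illustration; they are not measurements from the paper.

```python
def design_cost(coordination_cost, skew_factor, alpha=1.0, beta=1.0):
    """Weighted mean of the two components, as in the formula above."""
    return (alpha * coordination_cost + beta * skew_factor) / (alpha + beta)


# Hypothetical extremes: one design with no coordination cost but full skew,
# and one with full coordination cost but no skew.
all_on_one_node = design_cost(0.0, 1.0)
all_distributed = design_cost(1.0, 0.0, alpha=3.0, beta=1.0)
```

With equal weights the first design scores 0.5. Weighting coordination three times as heavily as skew gives the second design a score of 0.75.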
Skew-Aware Cost Model
• Coordination Cost
$$\mathit{CoordinationCost} = \frac{\mathit{partitionCount}}{\mathit{txnCount} \times \mathit{numPartitions}} \times \left(1.0 + \frac{\mathit{dtxnCount}}{\mathit{txnCount}}\right)$$
The total number of partitions accessed divided by the total number of
partitions that could have been accessed, scaled based on the
ratio of distributed transactions to single-partition transactions
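The formula can be checked on a tiny workload. This sketch assumes each transaction is represented by the set of partitions it accesses; that representation and the function name are assumptions of the example.

```python
def coordination_cost(txn_partitions, num_partitions):
    """Coordination cost for a list of per-transaction partition sets."""
    txn_count = len(txn_partitions)
    partition_count = sum(len(p) for p in txn_partitions)
    dtxn_count = sum(1 for p in txn_partitions if len(p) > 1)
    touched_ratio = partition_count / (txn_count * num_partitions)
    return touched_ratio * (1.0 + dtxn_count / txn_count)


# Two single-partition transactions and one that touches all four partitions.
mixed = coordination_cost([{0}, {1}, {0, 1, 2, 3}], num_partitions=4)
# Only single-partition transactions: the scaling factor stays at 1.0.
local_only = coordination_cost([{0}, {1}], num_partitions=4)
```

For the mixed workload, 6 of 12 possible partition accesses occur and one of three transactions is distributed, giving 0.5 × (1 + 1/3) = 2/3.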
Skew-Aware Cost Model
• Skew Factor
$$\mathit{SkewFactor} = \frac{\sum_{i=0}^{\mathit{numIntervals}} \mathit{skew}[i] \times \mathit{txnCounts}[i]}{\sum \mathit{txnCounts}}$$
• To account for time-varying skew, divide W into finite intervals
• Estimate skew factor, skew[i], of each interval i
• Final skew factor is the mean of the skew factors weighted
by the number of transactions executed in each interval
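The weighted mean can be written directly. In this sketch the per-interval values `skew[i]` are taken as given inputs; how each one is estimated is not shown on the slide and is not modeled here.

```python
def skew_factor(skew, txn_counts):
    """Mean of per-interval skew, weighted by transactions per interval."""
    total = sum(txn_counts)
    return sum(s * n for s, n in zip(skew, txn_counts)) / total


# Hypothetical workload: a balanced interval with 300 transactions and a
# fully skewed interval with 100 transactions.
overall = skew_factor([0.0, 1.0], [300, 100])
```

The busy, balanced interval dominates, so the overall factor is 0.25 instead of the unweighted mean of 0.5.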
Incomplete Designs
• A query that references a table with an unset attribute in a
design is labeled as unknown
• For each unknown query
– Coordination cost: assume that any unknown query is single-partitioned
– Skew factor: assume that unknown queries execute on all partitions in
the cluster
• ‘Unknown’ can change to ‘known’
• ‘Known’ cannot change to ‘unknown’
The estimated cost is monotonically increasing!
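The two assumptions for unknown queries can be sketched as a helper that returns what each estimate should see. The function names and the representation of a design as `{table: attribute or None}` are hypothetical.

```python
def is_unknown(query_tables, design):
    """A query is unknown if any table it references has no attribute set."""
    return any(design.get(t) is None for t in query_tables)


def assumed_access(query_tables, design, actual_partitions, num_partitions):
    """Return (partition count for coordination cost, partitions for skew).

    actual_partitions is only meaningful when the query is known.
    """
    if is_unknown(query_tables, design):
        # Coordination cost assumes single-partitioned;
        # the skew factor assumes the query executes on all partitions.
        return 1, set(range(num_partitions))
    return len(actual_partitions), set(actual_partitions)


partial = {"orders": "cust_id", "item": None}
unknown_case = assumed_access({"orders", "item"}, partial, set(), 4)
known_case = assumed_access({"orders"}, partial, {2, 3}, 4)
```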
Optimizations
• Access Graphs
• Workload Compression
Access Graph
Model and store input sample workload as an access graph:
• Vertex: table
• Edge: tables are co-accessed in a query
• Edge weight: the number of times the queries forming the relationship are executed
LNS uses access graph to quickly identify important relationships between
tables w/o repeatedly reprocessing input sample workload
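A minimal sketch of building such a graph, assuming each query in the sample workload is given as the collection of table names it accesses. The edge representation (a sorted pair of table names) is a choice made for this example.

```python
from collections import Counter
from itertools import combinations


def build_access_graph(queries):
    """Return {(table_a, table_b): number of queries co-accessing both}."""
    graph = Counter()
    for tables in queries:
        # Each unordered pair of tables in one query adds weight to an edge.
        for a, b in combinations(sorted(set(tables)), 2):
            graph[(a, b)] += 1
    return graph


workload = [["orders", "customer"], ["customer", "orders"],
            ["orders", "item"], ["item"]]
graph = build_access_graph(workload)
```

The graph is built once, after which the search can look up edge weights instead of rescanning the workload.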
Workload Compression
• Given a larger input sample workload, LNS finds a
better database design, but less efficiently
• Solution – workload compression:
– Combine sets of similar queries in individual
transactions into fewer weighted records
– Combine similar transactions into a smaller number of
weighted records in the same manner
• The cost model scales its estimates using these
weights w/o having to process each of the
records separately in the original workload
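The two compression levels can be sketched with counters. This example treats "similar" as "identical signature", which is a simplifying assumption; the transaction representation and all names are hypothetical.

```python
from collections import Counter


def compress(transactions):
    """Collapse a workload into weighted records.

    transactions: list of (procedure name, tuple of query signatures).
    Returns {(procedure, ((query, weight), ...)): transaction weight}.
    """
    records = Counter()
    for proc, queries in transactions:
        # Level 1: merge repeated queries inside one transaction.
        weighted_queries = tuple(sorted(Counter(queries).items()))
        # Level 2: merge transactions with the same weighted queries.
        records[(proc, weighted_queries)] += 1
    return records


def weighted_total(records, record_cost):
    """Scale a per-record estimate by each record's weight."""
    return sum(weight * record_cost(rec) for rec, weight in records.items())


workload = [
    ("NewOrder", ("getItem", "getItem", "insertOrder")),
    ("NewOrder", ("getItem", "insertOrder", "getItem")),
    ("Payment", ("updateBalance",)),
]
records = compress(workload)
total = weighted_total(records, lambda rec: 1.0)
```

Three transactions collapse into two records, while the weighted total still accounts for all three.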
Algorithm Comparison
Throughput
Search Times
The best solution found by Horticulture over time (red line: known optimal design, if available)