CS 491/591: Cloud Computing
Databases in Cloud Environments
Based on:
Md. Ashfakul Islam
Department of Computer Science
The University of Alabama
Data Today
• Data sizes are increasing exponentially every day.
• Key difficulties in processing large scale data
– acquire required amount of on-demand resources
– auto scale up and down based on dynamic workloads
– distribute and coordinate a large scale job on several
servers
– Replication – update consistency maintenance
• Cloud platform can solve most of the above
Large Scale Data Management
• Large scale data management is attracting
attention.
• Many organizations produce data in PB level.
• Managing such an amount of data requires
huge resources.
• Ubiquity of huge data sets inspires researchers
to think in new ways.
• Particularly challenging for transactional DBs.
Issues to Consider
• Distributed or Centralized application?
• How can ACID guarantees be maintained?
• Atomicity, Consistency, Isolation, Durability
– Atomic – either all or nothing
– Consistent - database must remain consistent
after each execution of write operation
– Isolation – no interference from others
– Durability – changes made are permanent
ACID challenges
• Data is replicated over a wide area to increase
availability and reliability
• Consistency maintenance in replicated
database is very costly in terms of
performance
• Consistency becomes bottleneck of data
management deployment in cloud
– Costly to maintain
CAP
• CAP theorem
– Consistency, Availability, Partition
• Three desirable, and expected properties of
real-world services
• Brewer states that it is impossible to
guarantee all three
CAP: Consistency - atomic
• Data should maintain atomic consistency
• There must exist a total order on all
operations such that each operation looks as if
it were completed at a single instant
• This is not the same as the Atomic
requirement in ACID
CAP: Available Data Objects
• Every request received by a non-failing node
in the system must result in a response
• No time requirement
• Difficult because even in severe network
failures, every request must terminate
• Brewer originally required only that almost all requests get a response; this has since been simplified to all requests
CAP: Partition Tolerance
• When the network is partitioned all messages
sent from nodes in one partition to nodes in
another partition are lost
• This causes the difficulty because
– Every response must be atomic even though arbitrary
messages might not be delivered
– Every node must respond even though arbitrary
messages may be lost
• No failure other than total network failure is allowed to cause incorrect responses
CAP: Consistent & Partition Tolerant
• Ignore all requests
• Alternate solution: each data object is hosted
on a single node and all actions involving that
object are forwarded to the node hosting the
object
CAP: Consistent & Available
• If no partitions occur, it is clearly possible to provide atomic (consistent), available data
• Systems that run on intranets and LANs are an
example of these algorithms
CAP: Available & Partition Tolerant
• The service can return the initial value for all
requests
• The system can provide weakened consistency; this is similar to web caches
CAP: Weaker Consistency Conditions
• By allowing stale data to be returned when
messages are lost it is possible to maintain a
weaker consistency
• Delayed-t consistency- there is an atomic
order for operations only if there was an
interval between the operations in which all
messages were delivered
CAP
– Can only achieve 2 out of 3 of these
– In most databases on the cloud, data availability
and reliability (even if network partition) are
achieved by compromising consistency
– Traditional consistency techniques become
obsolete
Evaluation Criteria for Data
Management
• Evaluation criteria:
– Elasticity
• scalable, distribute new resources, offload unused
resources, parallelizable, low coupling
– Security
• untrusted host, moving off premises, new
rules/regulations
– Replication
• available, durable, fault tolerant, replication across
globe
Evaluation of Analytical DB
• Analytical DB handles historical data with little or
no updates - no ACID properties.
• Elasticity
– Since no ACID – easier
• E.g. no updates, so locking not needed
– A number of commercial products support elasticity.
• Security
– requirement of sensitive and detailed data
– third-party vendors store the data
• Replication
– Recent snapshot of DB serves purpose.
– Strong consistency isn’t required.
Analytical DBs - Data Warehousing
• Data Warehousing DW - Popular application of Hadoop
• Typically DW is relational (OLAP)
– but also semi-structured, unstructured data
• Can also be parallel DBs (Teradata)
– column oriented
– Expensive, $10K per TB of data
• Hadoop for DW
– Facebook abandoned Oracle for Hadoop (Hive)
– Also Pig – for semi-structured
Evaluation of Transactional DM
• Elasticity
– data partitioned over sites
– locking and commit protocol become complex
and time consuming
– huge distributed data processing overhead
• Security
– same as for analytical
Evaluation of Transactional DM
• Replication
– data replicated in cloud
– CAP theorem: Consistency, Availability, Partition tolerance – only two can be achieved
– between consistency and availability, one must be chosen
– availability is the main goal of the cloud
– consistency is sacrificed
– database ACID violation – what to do?
Transactional Data
Management
Transactional Data Management
Needed because:
• Transactional Data Management
– heart of the database industry
– almost all financial transactions are conducted through it
– relies on ACID guarantees
• ACID properties are main challenge in
transactional DM deployment in Cloud.
Transactional DM
• Transaction is sequence of read & write
operations.
• Guarantee ACID properties of transactions:
– Atomicity - either all operations execute or none.
– Consistency - DB remains consistent after each
transaction execution.
– Isolation - impact of a transaction can’t be altered by
another one.
– Durability - guarantee impact of committed
transaction.
Existing Transactions for Web
Applications in the Cloud
• Two important properties of Web applications
– all transactions are short-lived
– data request can be responded to with a small set
of well-identified data items
• Scalable database services like Amazon
SimpleDB and Google BigTable allow data to
be queried only by primary key.
• Eventual data consistency is maintained in
these database services.
Related Research
• Different types of consistency
• Strong consistency – subsequent accesses by
transactions will return updated value
• Weak consistency – no guarantee subsequent
accesses return updated value
– Inconsistency window – period between update and
when guaranteed will see update
• Eventual consistency – form of weak
– If no new updates, eventually all accesses return last
updated value
• Size of inconsistency window determined by communication
delays, system load, number of replicas
• Implemented by domain name system (DNS)
Commercial Cloud Databases
• Amazon Dynamo
– 100% available
– Read sensitive
• Amazon Relational Database Services
– MySQL built-in replica management is used
– All replicas in the same location
• Microsoft Azure SQL
– Primary with two redundancy servers
– Quorum approach
• Xeround MySQL [2012]
– Selected coordinator processes read & write requests
– Quorum approach
• Google introduced Spanner
– Extremely scalable, distributed, multiversion DB
– Internal use only
Tree Based Consistency (TBC)
• Our proposed approach:
– Minimize interdependency
– Maximize throughput
– All updates are propagated through a tree
– Different performance factors are considered
– Number of children is also limited
– Tree is dynamic
– New type of consistency ‘apparent consistency’ is
introduced
System Description of TBC
• Two components: Controller and Replica server
• Controller
– Tree creation
– Failure recovery
– Keeping logs
• Replica server
– Database operation
– Communication with other servers
Performance Factors
• Identified performance factors
– Time required for disk update
– Workload of the server
– Reliability of the server
– Time to relay a message
– Reliability of network
– Network bandwidth
– Network load
PEM
• Causes of enormous performance degradation
– Disk update time, workload or reliability of the server
– Reliability, bandwidth, traffic load of the network
• Performance Evaluation Metric (see the sketch below):
– PEM = Σ (pf_i × wf_i), summed over i = 1 to n
• pf_i = ith performance factor
• wf_i = ith weight factor
• wf_i can be positive or negative
• A bigger PEM means better performance
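A minimal sketch of the PEM computation as a weighted sum. The weight values wf1 = 1 and wf2 = -.02 come from the example slide; the factor values and the function name are illustrative assumptions.

```python
# Sketch of the Performance Evaluation Metric: PEM = sum(pf_i * wf_i).
# A negative weight penalizes a "bad" factor such as delay.

def pem(performance_factors, weight_factors):
    """Weighted sum of performance factors for one server or path."""
    return sum(pf * wf for pf, wf in zip(performance_factors, weight_factors))

weights = [1.0, -0.02]        # wf1 = 1 (reliability), wf2 = -.02 (delay)
server_a = [0.98, 50]         # reliability 0.98, delay 50
server_b = [0.91, 20]         # reliability 0.91, delay 20

print(pem(server_a, weights))   # -0.02
print(pem(server_b, weights))   #  0.51  -> bigger PEM, so server_b is preferred
```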
Building Consistency Tree
• Prepare the connection graph G(V,E)
• Calculate PEM for all nodes
• Select the root of the tree
• Run Dijkstra's algorithm with some modification
• The predefined fan-out of the tree is maintained by the algorithm
• The consistency tree is returned by the algorithm (see the sketch below)
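The slides use a modified Dijkstra's algorithm; as an illustration only, the sketch below builds the tree greedily from PEM scores while enforcing the fan-out limit. The graph, the per-node and per-edge scores, and the function names are assumptions, not the authors' exact algorithm.

```python
# Greedy sketch of building a consistency tree from a connection graph.
# Each step attaches the not-yet-connected node with the best combined PEM
# to a tree node that still has room under the fan-out limit.

def build_consistency_tree(nodes, edge_pem, node_pem, fan_out):
    """Return (root, parent) where parent maps each child node to its parent."""
    root = max(nodes, key=lambda n: node_pem[n])      # best node becomes the root
    in_tree = {root}
    children = {n: [] for n in nodes}
    parent = {}
    while len(in_tree) < len(nodes):
        best = None
        for u in in_tree:
            if len(children[u]) >= fan_out:           # respect the fan-out limit
                continue
            for v in nodes:
                if v in in_tree or (u, v) not in edge_pem:
                    continue
                score = edge_pem[(u, v)] + node_pem[v]
                if best is None or score > best[0]:
                    best = (score, u, v)
        if best is None:                              # disconnected graph or fan-out exhausted
            break
        _, u, v = best
        parent[v] = u
        children[u].append(v)
        in_tree.add(v)
    return root, parent

# Tiny example with hypothetical PEM scores.
nodes = ["s1", "s2", "s3"]
edges = {("s1", "s2"): 0.9, ("s2", "s1"): 0.9,
         ("s1", "s3"): 0.7, ("s3", "s1"): 0.7}
print(build_consistency_tree(nodes, edges, {"s1": 1.0, "s2": 0.5, "s3": 0.4}, fan_out=2))
```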
Example
[Figure: an example connection graph of servers with per-path reliability (pf1) and path delay (pf2), plus per-server reliability and delay; using weight factors wf1 = 1 and wf2 = -.02, PEM values are computed and the resulting consistency tree is shown.]
Update Operation
• An update operation is done in four steps
1. An update request will be sent to all children of
the root
2. The root will continue to process the update
request on its replica
3. The root will wait to receive confirmation of
successful updates from all of its immediate
children
4. A notification for a successful update will be sent
from root to the client
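A minimal sketch of the four update steps at the root, assuming placeholder functions for messaging the children and applying the update locally; a real deployment would use network calls, timeouts, and logging.

```python
# Sketch of the root's update handling: forward to children, apply locally,
# wait for all immediate children to confirm, then acknowledge the client.
# send_to_child returns a handle that blocks until the child's ack arrives (stubbed here).

def handle_update(update, children, send_to_child, apply_locally):
    pending = [send_to_child(c, update) for c in children]  # step 1: forward to all children
    apply_locally(update)                                   # step 2: apply on the root's replica
    for ack in pending:                                     # step 3: wait for children's confirmations
        if not ack():
            raise RuntimeError("child failed to confirm update")
    return "OK"                                             # step 4: notify the client

# Toy usage: two children that always succeed.
result = handle_update(
    update={"key": "x", "value": 1},
    children=["c1", "c2"],
    send_to_child=lambda c, u: (lambda: True),
    apply_locally=lambda u: None,
)
print(result)   # OK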
Consistency Flag
• Two types of consistency flag used:
– Partial consistent flag, Fully consistent flag
• Partial consistent flag
– Set in a top-down approach
– The last updated operation sequence number is stored as the flag
– Informs immediate children
• Fully consistent flag
– Set in a bottom-up approach
– A leaf (empty descendants list) sets its fully consistent flag to the operation sequence number
– Informs its immediate ancestor
– An ancestor sets its fully consistent flag after getting confirmation from all descendants (see the sketch below)
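A simplified, synchronous sketch of how the two flags could propagate, assuming every node stores the last applied operation sequence number; the Node class and method names are illustrative, not the TBC implementation (in the real system the children's confirmations arrive asynchronously).

```python
# Sketch of partial (top-down) and fully consistent (bottom-up) flag propagation.

class Node:
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []
        self.partial_flag = 0        # last update seen by this node
        self.full_flag = 0           # last update confirmed by the whole subtree

    def apply_update(self, seq):
        self.partial_flag = seq                  # top-down: record and pass to children
        for child in self.children:
            child.apply_update(seq)
        # bottom-up: a leaf (no descendants) is immediately fully consistent,
        # an inner node waits until every child reports the same sequence number
        if all(c.full_flag >= seq for c in self.children):
            self.full_flag = seq

root = Node("root", [Node("a"), Node("b", [Node("b1")])])
root.apply_update(7)
print(root.partial_flag, root.full_flag)   # 7 7
```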
Consistency Assurance
• All update requests from user are sent to root
• Root waits for its immediate descendants
during update requests
• Read requests are handled by immediate
descendants of root
Maximum Number of Allowable
Children
• Larger number of children
– higher interdependency
– possible performance degradation
• Smaller number of children
– Less reliability
– Higher chance of data loss
• Three categories of trees in experiment
– sparse, medium and dense
• t = op + wl
– where t is response time, op is disk operation time, and wl is OS load
Maximum Number of Allowable
Children
Maximum number should be set by trading off between
reliability and performance
Inconsistency Window
• Amount of time a distributed system remains inconsistent
• Reason behind it: time-consuming update operations
• To accelerate update operations, the system starts processing the next operation in the queue
– after getting confirmation from a certain number of nodes
– not waiting for all to reply
MTBC
• Modified TBC
– Root sends the update request to all replicas
– Root waits for only its children to reply
– Intermediate nodes make sure their own children are updated
• MTBC
– Reduces the inconsistency window
– Increases complexity at the children's end
Effect of Inconsistency Window
• Inconsistency window has no effect on
performance
• Possible data loss only if root and its children
all go down at the same time
Failure Recovery
• Primary server failure
– controller finds most updated servers with help of
consistency flag
– finds max reliable server from them
– rebuild consistency tree
– initiate synchronization
Failure Recovery
• Primary server failure
– Controller identifies max reliable server from them
– Rebuilds consistency tree
– Initiates synchronization
• Other server or communication path down
– Checks whether the server is down or the communication path is down
– Rebuilds the tree without the down server
– Finds an alternate path
– Reconfigures the tree
Apparent Consistency
• All write requests are handled by root
• All read requests are handled by root’s
children
• Root and its children are always consistent
• Other nodes don’t interact with the user
• The user finds the system consistent at any time
• We call it “Apparent Consistency” – it ensures strong consistency
Different Strategies
• Three possible strategies for consistency
– TBC
– Classic
– Quorum
• Classic
– Each update requires participation of all nodes
– Better for databases that are updated rarely
• Quorum
– Based on quorum majority voting
– Better for databases that are updated frequently
• TBC
• Differences in read and write operations
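For reference, below is a small sketch of the quorum idea used by the quorum strategy, assuming N replicas with a read quorum R and a write quorum W; the overlap condition R + W > N is what majority voting relies on. The replica storage and the acknowledgement pattern are toy assumptions, not the actual implementation.

```python
# Sketch of quorum replication: a write needs acks from W replicas, a read
# contacts R replicas, and R + W > N guarantees the two sets overlap, so the
# read always sees the newest committed version.

N, R, W = 5, 3, 3
assert R + W > N                      # overlap condition for majority voting

replicas = [(0, None)] * N            # each replica holds (version, value)

def write(value, version):
    global replicas
    acked = set(range(W))             # pretend the first W replicas acknowledge
    replicas = [(version, value) if i in acked else replicas[i] for i in range(N)]

def read():
    # contact the last R replicas: even this "worst case" set overlaps the write set
    responses = [replicas[i] for i in range(N - R, N)]
    return max(responses, key=lambda r: r[0])[1]   # newest version wins

write("x=1", version=1)
print(read())   # x=1
```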
Classic Read and Write Technique
Quorum Read and Write Technique
TBC Read and Write Technique
Experiments
• Compare the 3 strategies
Experiments Design
• All requests pass through a gateway to the
system
• Classic, Quorum & TBC implemented on each
server
• A stream of read & write requests are sent to
the system
• Transport layer is implemented in the system
• Thousands of ping responses are observed to
determine transmission delay and packet loss
pattern
• Acknowledgement based packet transmission
Experimental Environment
• Experiments are performed on a cluster called Sage
– Green prototype cluster at UA
• Intel D201GLY2 mainboard
• 1.2 GHz Celeron CPU
• 1 GB 533 MHz RAM
• 80 GB SATA 3 hard drive
• D201GLY2 built-in 10/100 Mbps LAN
• Ubuntu Linux 11.0 server edition OS
• Java 1.6 platform
• MySQL Server (version 5.1.54)
Workload Parameters
Effect of Database Request Rate
• Experiment:
– Compare the response time of the 3
approaches of Classic, Quorum and TBC as the
request rate increases from λ of .01 to .80
– Examine response time for read, write and
combined
Effect of Database Request Rate
• Quorum has a large read response time
• Classic read performs slightly better than TBC read
• Quorum write is better
• Classic write has the highest response time
Effect of Database Request Rate
• Classic’s performance is as expected
• TBC performs best at any arrival rate
• Classic begins to outperform quorum at a certain arrival rate
• Quorum has a higher response time at higher arrival rates
Effect of Read-Write Ratio
• Experiment:
– Compare the response time of the 3
approaches of Classic, Quorum and TBC as the
read/write decreases from 90/10% to 50/50%
– Examine response time for read, write and
combined
Effect of Read-Write Ratio
• The ratio has no effect on read response time
• Read sets and write sets aren’t separate
• Quorum response time is less affected by a higher write ratio
• Classic & TBC are affected by a higher write ratio
Effect of Read-Write Ratio
• A higher write ratio has less effect on quorum
• Classic is affected more than TBC by a higher write ratio
Effect of Network Quality
• Experiment:
– Compare the response time of the 3
approaches of Classic, Quorum and TBC for 3
different types of network quality
• Dedicated, regular, unreliable
– Examine response time for read, write and
combined
Effect of Network Quality
• Classic read always performs better
• Classic write is affected by network quality
Effect of Network Quality
• TBC always performs better
• Network quality affects classic the most
Effect of Heterogeneity
• Experiment:
– Compare the response time of Classic and TBC for 3 different heterogeneous infrastructures
• Low, medium, high
Effect of Heterogeneity
• Infrastructure heterogeneity affects classic
Findings
• TBC performs better for different arrival rates –
higher frequency of requests
• TBC performs better for some read-write ratios
• TBC performs better for different network quality
(packet loss, network congestion)
• TBC performs better in a heterogeneous
environment
• Next step is to include transaction management in
TBC
Next Challenge - Transactions
• We are going to present how TBC can be
used in transaction management,
serializability maintenance, and concurrency
control in cloud databases
• We also analyze the performance of the
transaction manager
• Auto-scaling and partitioning will be
addressed too
Transactions
• A Transaction is a logical unit of database processing
• A Transaction consists of one or more read and/or write operations
• We can divide transactions into
– Read-write transactions
– Read-only transactions
• Read-write transactions
– Execute update, insert or delete operations on data items
– Can have read operations too
• Read-only transactions
– Have only read operations
Serializability
• A number of transactions must be serializable
– Conflict - two transactions perform operations on
same data item and at least one is a write
– Order of conflicting operations in interleaved schedule
same as order in some serial schedule
– Final states are same
• Potential techniques to be implemented
– Timestamp ordering
– Commit ordering
Serializability
• A schedule of transactions is serializable if
it is equivalent to some serial execution
order
• Conflict serializability
– All Conflicting operations in the same order in
both schedules
Concurrency Control
• Mechanism to maintain ACID properties in concurrent
access to DBs
• Four major techniques: locking, timestamp, multiversion
and optimistic
• We use:
– Locking-lock the data items before getting access to them
– Timestamp-execute conflicting operations in order of their
timestamps
Read-Write Transaction in TBC
[Diagram: User Program, Controller, Root, Lock Manager, Intermediate Children, Other Children, Version Manager]
1. Transaction initiation request
2. Transaction sequence number
3. Transaction request to Root of the TBC-generated tree
4. Lock request to the lock manager
5. Accepts or rejects the lock request
6. Transaction request to immediate children of Root
7. Children reply to Root
8. Transaction commit process sends a successful reply to the user
9. Synchronize updates with parent periodically
Read-Only Transaction in TBC
[Diagram: User Program, Controller, Root, Lock Manager, Intermediate Children, Other Children, Version Manager]
1. Transaction initiation request
2. Transaction sequence number
3. Transaction request to intermediate children of Root
4. Executes transaction, requests version information
5. Ensure latest version of data
6. Results sent back to user
Lock Manager
 Two types of lock fields: Primary Lock and Secondary Lock
[Diagram: the lock fields apply hierarchically to the Database, its Tables, Rows, Columns, and Key]
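A sketch of how primary and secondary lock fields could interact across the hierarchy (database, tables, rows/columns, keys), assuming a primary lock is exclusive on its target and secondary marks are placed on its ancestors; this illustrates the idea, not the actual TBC lock manager, and the path tuples and function names are made up for the example.

```python
# Sketch of hierarchical locking: a primary lock is exclusive on its target,
# while secondary lock fields on every ancestor make conflicts at any level
# cheap to detect.

primary = set()      # paths holding a primary (exclusive) lock
secondary = {}       # ancestor path -> number of locked descendants

def can_lock(path):
    ancestors = [path[:i] for i in range(1, len(path) + 1)]
    if any(p in primary for p in ancestors):   # an ancestor or the item itself is locked
        return False
    return secondary.get(path, 0) == 0         # no descendant is locked

def acquire(path):
    if not can_lock(path):
        return False
    primary.add(path)
    for i in range(1, len(path)):              # mark the ancestors
        secondary[path[:i]] = secondary.get(path[:i], 0) + 1
    return True

print(acquire(("db", "accounts", "row7")))   # True
print(acquire(("db", "accounts")))           # False: one of its rows is already locked
print(acquire(("db", "orders", "row3")))     # True: different table, no conflict
```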
Concurrency Control in TBC
• Distributed concurrency control
– Controller, Root
• Combination of locking & timestamp
mechanisms
• Lock manager is implemented at the root
• Controller manages incremental sequence
number
• Deadlock scenario is resolved by wait-die
strategy
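A sketch of the wait-die rule, assuming the controller's incremental sequence number serves as the transaction timestamp (a smaller number means an older transaction); the function and parameter names are illustrative.

```python
# Sketch of wait-die deadlock prevention using sequence numbers as timestamps.

def on_lock_conflict(requester_ts, holder_ts):
    """Action for a transaction requesting a lock that another transaction holds."""
    if requester_ts < holder_ts:
        return "wait"    # older transaction is allowed to wait for the younger holder
    return "die"         # younger transaction aborts and restarts later with its old number

print(on_lock_conflict(requester_ts=3, holder_ts=7))   # wait (older may wait)
print(on_lock_conflict(requester_ts=9, holder_ts=7))   # die  (younger aborts)
```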
Serializability in TBC
• Dedicated thread launched to execute operations in
read-write transactions one by one at root node
• Thread executes both read and write operations
within a transaction serially
• Multiple threads execute multiple transactions
concurrently.
• Threads in read-set servers associated with read-only transactions execute read operations within a
transaction serially
• Multiple read-only transactions read concurrently
• Read-only transactions read values written by
committed transactions, no conflict with currently
executing transactions.
Serializability in TBC
• Proved:
– Read-write transaction is serializable
– Read-only transaction is serializable
– Together read-write and read-only transaction
execution is serializable
– Therefore - TBC ensures serializability
Common Isolation Problems
• Common isolation problems
– Dirty read, unrepeatable read, lost update
• TBC solves these:
– Dirty read
• Only read from committed transactions
– Unrepeatable read
• Data items are locked until transaction commits
– Lost update
• Two transactions never have a lock on the same data
item at the same time
ACID Maintenance in TBC
• Atomicity
– Committed transactions are written to DB
– Failed transactions are discarded from the list
• Consistency
– TBC approach maintains apparent consistency –
guarantees strong consistency
• Isolation
– Lock manager ensures two transactions never hold a lock on the same data item at the same time
• Durability
– Multiple copies of committed transactions
Different Strategies for Experiments
• Three possible strategies for consistency
– Classic
– Quorum
– TBC
• Classic
– Each update requires participation of all nodes
– Better for databases that are updated rarely
• Quorum – used by MS Azure SQL and Xeround cloud
DBs
– Based on quorum majority voting
• TBC
Classic Approach
Read-Write Transaction
Read-Only Transaction
Quorum Approach
Read-Write Transaction
Read-Only Transaction
TBC Approach
Read-Write Transaction
Read-Only Transaction
Experiment Design
• Classic, Quorum & TBC implemented on each
server
• A stream of read-write & read-only transaction
requests are sent to the system
• Transport layer is implemented in the system
• Acknowledgement based packet transmission
• Thousands of ping responses are observed at
different times to determine transmission delay
and packet loss pattern
• Same seed is used in random variable for every
approach
Default Parameters
Effect of Transaction Request Rate
• Experiment:
– Compare the response time of the 3
approaches of Classic, Quorum and TBC as the
request rate increases from λ of .05 to .25
– Examine response time for read-only, read-write and combined
– Examine transaction restart percentage
Effect of Transaction Request Rate
• Interdependency is the main reason
behind a higher response time for
quorum
• Classic and TBC have lower response
time for no dependency
• TBC has slightly better performance
than classic
• Classic has higher interdependency
and higher response time
• Quorum has lower response time
than classic
• TBC has better & slow growing
response time
Effect of Transaction Request Rate
• Quorum has a worse performance
time for higher read-only response
time
• Classic is also affected by a higher
arrival rate
• TBC has a very slow growing
response time with a higher arrival
rate
• Classic has an exponential growth in
restart percentage
• Quorum also has an exponential
growth restart percentage but
better than classic
• TBC has a linear growth in restart
percentage
Effect of Read-Write and Read-Only Ratio
• Experiment:
– Compare the response time of the 3
approaches of Classic, Quorum and TBC as the
read-write/read-only decreases from 90/10%
to 50/50%
– Examine response time for read-only, read-write and combined
– Examine transaction restart percentage
Effect of Read-Write and Read-Only Ratio
• Quorum response time decreases
with lower read-only ratio
• Classic and TBC response time
increases with lower read-only ratio
- more writes so higher loads on
R/W set servers
• TBC still performs slightly better
• Quorum response time has an
exponential growth at a higher read-write ratio
• Classic response time also has
similar exponential growth
• TBC is less affected by a higher read-write ratio, almost linear
Effect of Read-Write and Read-Only Ratio
• Both classic and quorum response times have exponential growth
• Classic has better performance than
quorum
• TBC response time grows slowly,
almost linear
• Every approach has exponential growth in restart percentage
• TBC has fewer restarted transactions
than others at higher read-write
ratio
Effect of Number of Tables and
Columns
• Experiment:
– Compare the response time of the 3
approaches of Classic, Quorum and TBC as the
number of tables, columns increases from 5,5
to 45,25
– Examine response time for read-only, read-write and combined
– Examine transaction restart percentage
Effect of Number of Tables and Columns
• Response time of read-only
transactions remains the same with
respect to a higher number of tables
and columns
• As expected, quorum has the highest
and TBC has the lowest response
time and classic is slightly higher
than TBC
• Slightly decreasing response time is
identified for read-write
transactions with respect to a higher
number of tables and columns
• TBC as expected performed better
than the others
Effect of Number of Tables and Columns
• As expected, slightly decreasing
response time with respect to higher
number of tables & columns found
for everyone
• TBC performs noticeably better than
the others
• Clear decreasing pattern found for
restart percentage for every
approach
• TBC has higher restart percentage
than the others
• Higher restart percentage does not
affect response time for TBC
Auto Scaling
• Auto scaling means automatically adding or removing resources when necessary
• One of the key features promised by a cloud
platform
• Scale up means add resources when workload
exceeds defined threshold
• Scale down means remove resources when
workload decreases beyond defined threshold
• Thresholds are pre-defined according to SLAs
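A sketch of the threshold test an auto-scaler might apply, assuming utilization is sampled per server and the thresholds are fixed numbers derived from the SLA; the values and function name shown here are illustrative assumptions.

```python
# Sketch of a threshold-based auto-scaling decision.

SCALE_UP_THRESHOLD = 0.80     # e.g. derived from the SLA
SCALE_DOWN_THRESHOLD = 0.30

def scaling_decision(utilizations):
    """utilizations: list of per-server load fractions (0.0 - 1.0)."""
    avg = sum(utilizations) / len(utilizations)
    if avg > SCALE_UP_THRESHOLD:
        return "scale up"      # add a server (horizontal) or a bigger one (vertical)
    if avg < SCALE_DOWN_THRESHOLD and len(utilizations) > 1:
        return "scale down"    # release an underutilized server
    return "no change"

print(scaling_decision([0.92, 0.85, 0.88]))   # scale up
print(scaling_decision([0.10, 0.20]))         # scale down
print(scaling_decision([0.50, 0.60]))         # no change
```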
Types of Scaling
• Two types
– Vertical scaling
– Horizontal scaling
• Vertical scaling
– Scaling is done by moving to more powerful
resources
– Or moving to less powerful resource
• Horizontal scaling
– Scaling is done by adding resources
– Or removing resources
Partitioning
• Way to share workload with additional servers
to support elasticity
• Process of splitting a logical database into
smaller ones
• An increased workload can be served by
several sites
• If relation r is partitioned
– Divided into fragments r1, r2, …, rn
– Fragments contain enough info to reconstruct r (see the sketch below)
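A sketch of horizontal partitioning and reconstruction, assuming a simple hash on the row key decides which fragment a row belongs to; the relation, hash choice, and helper names are illustrative.

```python
# Sketch of horizontal partitioning: rows of relation r are split into
# fragments r1..rn by hashing the key, and r is rebuilt as their union.

def partition(rows, n, key=lambda row: row[0]):
    fragments = [[] for _ in range(n)]
    for row in rows:
        fragments[hash(key(row)) % n].append(row)
    return fragments

def reconstruct(fragments):
    return [row for fragment in fragments for row in fragment]

r = [(1, "alice"), (2, "bob"), (3, "carol"), (4, "dave")]
r1, r2 = partition(r, 2)
assert sorted(reconstruct([r1, r2])) == sorted(r)   # fragments contain enough info to rebuild r
print(r1, r2)
```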
Auto Scaling for Read-Only Transaction
• Scaling up
–
–
–
–
Servers overloaded are identified
Additional server is requested
Connects to the root
Database synchronization
• Scaling down
– Server underutilization is identified
– Remove the connection to root
– Disengage the server from the system
• No database partitioning is required
Auto Scaling for Read-Only Transaction
[Diagram: the User's Read-Only Transaction Requests pass through the Controller to the Root, its Children, and any Additional Servers]
Auto Scaling for Read-Write Transaction
• Scaling up
– Server overloading is identified
– Additional servers are requested to form write-set
– New consistency tree is prepared
– Database partitioning is initiated
– Controller updates partition table
• Scaling down
– Server underutilization is identified
– Initiate database reconstruction from the partitions
– Controller updates partition table
– Disengage servers from the system
• Each write set is associated with a partition
Auto Scaling for Read-Write Transaction
[Diagram: the User's Read-Write Transaction Requests pass through the Controller to the Root, the Write-Set Servers and their Children, and any Additional Servers]
Database Partitioning in TBC
[Diagram: the Database is split into Partition 1 and Partition 2; Partition 2 is further split into Partition 2.1 and Partition 2.2, and Partition 2.1 into Partition 2.1.1 and Partition 2.1.2]
Future Direction
• Auto scaling is very important in terms of
performance
– Identify stable workload change
– Predict future workload pattern to prepare in advance
• Apply machine learning
• An efficient database partition
– Less conflict
– Avoid distributed query processing
• Find smart algorithm for partitioning
Conclusions
• For databases in a cloud:
– Consistency remains a bottleneck
– TBC is proposed to meet these challenges
– Experimental results show that TBC:
• Reduces interdependency
• Is able to take advantage of network parameters
Conclusions
• TBC Transaction management system
– Guarantees ACID properties
– Maintains serializability
– Prevents common isolation level mistakes
• Hierarchical lock manager allows more
concurrency
• Experimental results show it is possible to
maintain concurrency without sacrificing
consistency and response time in clouds
• Quorum may not always perform better in a
cloud
Conclusions
• Tree-Based Consistency
– maintains strong consistency in database
transaction execution
– provides customers with better performance than
existing approaches
– is a viable solution for ACID transactional database
management in a cloud
End of TBC
Relational Joins
• Hadoop is not a DB
• Debate between parallel DBs and MR for OLAP
– DeWitt/Stonebraker call MR a “step backwards”
– Parallel is faster because it can create indexes
Relational Joins - Example
• Given 2 data sets S and T:
– (k1, (s1,S1)) k1 is join attribute, s1 is tuple ID, S1 is
rest of attributes
– (k2, (s2,S2))
– (k1, (t1,T1)) info for T
– (k2, (t2,T2))
• S could be user profiles – k is PK, tuple info about
age, gender, etc.
• T could be logs of online activity, tuple is
particular URL, k is FK
Reduce-side Join 1:1
• Map over both datasets, emit (join key, tuple)
• All tuples grouped by join key – what is needed
for join
• Which is what type of join?
– Parallel sort-merge join
• If one-to-one join – at most 1 tuple from each of S and T matches
• If 2 values arrive for a key, one must be from S and the other from T (we don’t know which, since there is no order), so join them (see the sketch below)
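A sketch of the reduce-side 1:1 join in MapReduce style, using plain Python lists in place of the map, shuffle, and reduce phases; here each record is tagged with its source dataset so the reducer can tell the S tuple from the T tuple, which is a common variant of the untagged scheme described above. The toy data stands in for the user-profile (S) and activity-log (T) datasets.

```python
# Sketch of a reduce-side 1:1 join: map emits (join_key, tagged tuple),
# the framework groups by key, and the reducer pairs the S and T tuples.

from collections import defaultdict

S = [("k1", ("s1", {"age": 30})), ("k2", ("s2", {"age": 25}))]
T = [("k1", ("t1", "http://example.com/a")), ("k2", ("t2", "http://example.com/b"))]

def map_phase(dataset, tag):
    for key, tup in dataset:
        yield key, (tag, tup)

# shuffle/group by join key (done by the framework in real MapReduce)
groups = defaultdict(list)
for key, value in list(map_phase(S, "S")) + list(map_phase(T, "T")):
    groups[key].append(value)

def reduce_phase(key, values):
    # 1:1 join: at most one tuple from each side; the tags say which is which
    s_tuple = next(v for tag, v in values if tag == "S")
    t_tuple = next(v for tag, v in values if tag == "T")
    return key, (s_tuple, t_tuple)

for key in groups:
    print(reduce_phase(key, groups[key]))
```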
Consistency in Clouds
Current DB Market Status
• MS SQL doesn’t support auto scaling and load
• MySQL is recommended for “lower traffic”
• New products advertise: “replace MySQL with us”
• Oracle recently released on-demand resource allocation
• IBM DB2 can auto scale with dynamic workload
• Azure Relational DB – great performance