IBM Research
Analytics in the cloud
Do we really need to reinvent the
storage stack?
R. Ananthanarayanan, Karan Gupta, Prashant Pandey,
Himabindu Pucha, Prasenjit Sarkar, Mansi Shah, Renu
Tewari
Image courtesy NASA / ESA
Storage Systems Research
© 2008 IBM Corporation
Data-Intensive Internet-Scale Applications
Typical Applications
• Web-scale search, indexing, mining
• Genomic sequencing
• Brain-scale network simulations
Data-Intensive Internet-Scale Applications
Key Requirements
• Scale to very large data sets
• Platform needs to scale to thousands of nodes
• Built of commodity hardware for cost efficiency
• Tolerate failures during every job execution
• Support data shipping to reduce network requirements
MapReduce for analytics
MapReduce is emerging as a model for large-scale analytics applications
Important design goals are extreme scalability and fault tolerance
The storage layer is separated out and has well-defined requirements
Image source: http://developer.yahoo.com/hadoop/tutorial/module1.html
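To make the model concrete, here is a minimal word-count job written against the Hadoop 0.18-era "mapred" API (the Hadoop version used in the test bed later in this deck). This is an illustrative sketch, not code from the talk: the map phase emits (word, 1) pairs and the reduce phase sums them per word.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

public class WordCount {
  // map: emit (word, 1) for every token in the input line
  public static class Map extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> out, Reporter r)
        throws IOException {
      StringTokenizer it = new StringTokenizer(value.toString());
      while (it.hasMoreTokens()) {
        word.set(it.nextToken());
        out.collect(word, ONE);
      }
    }
  }

  // reduce: sum the counts collected for each word
  public static class Reduce extends MapReduceBase
      implements Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> values,
                       OutputCollector<Text, IntWritable> out, Reporter r)
        throws IOException {
      int sum = 0;
      while (values.hasNext()) sum += values.next().get();
      out.collect(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(WordCount.class);
    conf.setJobName("wordcount");
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);
    conf.setMapperClass(Map.class);
    conf.setReducerClass(Reduce.class);
    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf);
  }
}
```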
MapReduce Data-store requirements
Provide a hierarchical namespace with directories and files
Allow applications to read/write data to files
Protect data availability and reliability in the face of node and disk failures
Provide high-bandwidth access to reasonably sized chunks of data to all compute nodes (not necessarily all-to-all)
Provide chunk access-affinity information to allow proper scheduling of tasks (see the sketch below)
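The affinity requirement is what lets the scheduler place map tasks near their data. Here is a minimal sketch of how a client can query per-chunk location hints through Hadoop's generic FileSystem API; the getFileBlockLocations signature shown is the modern one, and taking the path from the command line is an assumption for the example:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockAffinity {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path p = new Path(args[0]);
    FileSystem fs = p.getFileSystem(conf);
    FileStatus stat = fs.getFileStatus(p);
    // One BlockLocation per chunk: its offset, length, and the hosts
    // holding a replica -- exactly the affinity hint MapReduce needs.
    for (BlockLocation loc :
         fs.getFileBlockLocations(stat, 0, stat.getLen())) {
      System.out.printf("offset=%d len=%d hosts=%s%n",
          loc.getOffset(), loc.getLength(),
          String.join(",", loc.getHosts()));
    }
  }
}
```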
Data store options: Cluster FS vs. Specialized FS

                                   Specialized FS   Cluster FS
Scaling                            Yes              Yes
Commodity hardware compliant       Yes              Yes
Traditional application support    No               Yes
Mature management tools            No               Yes
Tuned for Hadoop                   Yes              No
Modifying a Cluster Filesystem for MapReduce
GPFS
• Mature filesystem with many large production installations
• High performance, highly scalable
• Reliability features focused on SAN environments; supports rack-aware 2-way replication
• POSIX interface
• Supports shared-disk (SAN) and shared-nothing setups
• Not optimized for MapReduce workloads: does not expose data location information, and the largest block size is 16 MB
Changes for Hadoop:
• Make blocks bigger (see the sketch after this list)
• Let the platform know where the big blocks are
• Optimize replication and placement to reduce network usage
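A minimal sketch of the "make blocks bigger" change as seen from the generic Hadoop FileSystem API, which lets a caller request an explicit per-file block size at create() time. The 64 MB figure echoes the application block size on the next slide; the buffer size, replication factor, and path handling are assumptions for the example:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BigBlocks {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path p = new Path(args[0]);
    FileSystem fs = p.getFileSystem(conf);
    long blockSize = 64L * 1024 * 1024; // 64 MB application-sized block
    // create(path, overwrite, bufferSize, replication, blockSize)
    try (FSDataOutputStream out =
             fs.create(p, true, 4096, (short) 2, blockSize)) {
      out.writeBytes("file laid out in large blocks\n");
    }
  }
}
```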
Key change: Metablocks
Works for many workloads:
• Small FS blocks (e.g., 512K)
• Large application blocks (e.g., 64M)
New allocation scheme:
• Metablock-size granularity for wide striping
• Block map operates on the large metablock size
• All FS operations operate on the small regular block size
• New allocation policy
[Diagram: many small FS blocks grouped into one application “meta-block”]
Additional changes provide block location information and “write affinity”; a conceptual sketch follows.
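A conceptual sketch of the metablock placement rule, not GPFS source code: every small FS block inside one application metablock is placed on the same node (giving write affinity and a single-host location hint), while successive metablocks are wide-striped across nodes. Sizes and node names are illustrative.

```java
public class MetablockAllocator {
  static final long FS_BLOCK = 512 * 1024;        // small FS block, e.g. 512K
  static final long METABLOCK = 64 * 1024 * 1024; // large app block, e.g. 64M
  private final String[] nodes;

  MetablockAllocator(String[] nodes) { this.nodes = nodes; }

  // All FS blocks within one metablock map to the same node, so a map
  // task reading that metablock sees a single local chunk.
  String nodeFor(long fileOffset) {
    long metablockIndex = fileOffset / METABLOCK;
    return nodes[(int) (metablockIndex % nodes.length)];
  }

  public static void main(String[] args) {
    MetablockAllocator a =
        new MetablockAllocator(new String[] {"node0", "node1", "node2"});
    // Two offsets inside the same 64M metablock land on one node...
    System.out.println(a.nodeFor(0));             // node0
    System.out.println(a.nodeFor(10 * FS_BLOCK)); // node0
    // ...while the next metablock is striped to a different node.
    System.out.println(a.nodeFor(METABLOCK));     // node1
  }
}
```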
MapReduce performance
[Chart: execution time (sec) for Grep, TeraGen, and Terasort with GPFS and HDFS at rack=1 and rack=2]
Test bed:
• Hadoop: version 0.18.1, 16 nodes, 160 GB data
• GPFS: version pre-3.3 (replication factor = 2)
• iDataPlex: 42 nodes, 8 cores, 8 GB RAM, 4+1 disks per node
Impact on traditional workloads
[Chart: normalized random and sequential read performance for 512K blocks, 512K blocks with metablocks, and 16M blocks]
Test bed:
• iDataPlex: 42 nodes, 8 cores, 8 GB RAM, 4+1 disks per node
• GPFS: version pre-3.3
• Bonnie filesystem benchmark
Things that didn’t work
• Large filesystem block size
• Turning off prefetching
• Aligning records to block boundaries
[Chart: execution time by file system for GPFS 16M block-size variants vs. HDFS]
[Chart: normalized random and sequential read performance for 512K, 16M, and 16M no-prefetch configurations]
Advantages of traditional filesystems
Traditional filesystems have solved many hard problems like access control, quotas, snapshots, …
They allow traditional and MapReduce applications to share the same input data (see the sketch below).
They let users exploit filesystem tools and scripts built for “regular” filesystems.
They allow re-use of backup/archive solutions built around particular filesystems.
They simplify mixed analytics pipelines:

Using a MapReduce-specific filesystem (e.g., HDFS):
Crawl → Load → Analyze → Output → Serve
(the crawler writes to a traditional filesystem; Load copies the data into the MapReduce filesystem; Output copies results back to the traditional filesystem)

Using a general-purpose filesystem (e.g., GPFS):
Crawl → Analyze → Serve
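A sketch of the data sharing the GPFS pipeline relies on: the same file is read once through ordinary POSIX I/O (as a traditional tool would) and once through the Hadoop FileSystem API, with no load/unload step in between. The path is hypothetical, and the local filesystem stands in for a GPFS mount:

```java
import java.io.BufferedReader;
import java.io.FileReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SharedData {
  public static void main(String[] args) throws Exception {
    // Hypothetical path on a POSIX-mounted cluster filesystem.
    String file = "/gpfs/analytics/crawl/part-00000";

    // Traditional application: plain POSIX read through java.io.
    try (BufferedReader r = new BufferedReader(new FileReader(file))) {
      System.out.println("POSIX read:  " + r.readLine());
    }

    // MapReduce side: the same bytes through the Hadoop FileSystem API.
    // The local filesystem stands in for a GPFS connector here.
    FileSystem fs = FileSystem.getLocal(new Configuration());
    try (FSDataInputStream in = fs.open(new Path(file))) {
      byte[] buf = new byte[80];
      int n = in.read(buf); // assumes the file is non-empty
      System.out.println("Hadoop read: " + new String(buf, 0, n));
    }
  }
}
```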
Conclusion
MapReduce platforms can use traditional filesystems without loss of performance.
There are important reasons why traditional filesystems are attractive to users of MapReduce.