Introduction to Data Analysis with R on HPC
Texas Advanced Computing Center
Feb. 2015
Agenda
• 8:30–9:00 Welcome and introduction to TACC resources
• 9:00–9:30 Getting started with running R at TACC
• 9:30–10:00 Practice and coffee break
• 10:00–11:00 R basics
• 11:00–11:30 Data analysis support in R
• 11:30–1:00 Lunch break
• 1:00–1:30 Scaling up R computations
• 1:30–2:00 A walkthrough of the parallel package in R
• 2:00–3:00 Hands-on lab session
• 3:00–4:00 Understanding the performance of R programs
Introduction to TACC Resources
About TACC
• TACC is a research division of the University of Texas at Austin
– Origins go back to 1960s Cray CDC 6600 support
– TACC started in 2001 to support research beyond UT's needs
• TACC is a service provider for XSEDE on several key systems
– Currently provides 80–90% of the HPC cycles in XSEDE
– Not limited to supporting NSF research
• TACC is also supported through partnerships with UT Austin, the UT System, industrial partners, multi-institutional research grants, and donations
• TACC is 110+ people (40+ PhDs) bringing enabling technologies and techniques to drive digital research
– Many collaborative research projects and mission-specific proposals to support open research
– Consulting to bring TACC expertise to other communities
TACC systems overview:
• Stampede – HPC jobs; 6,400+ nodes; 10 PFlops; 14+ PB storage
• Maverick – Vis & analysis; interactive access; 132 K40 GPUs
• Lonestar – HTC jobs; 1,800+ nodes; 22,000+ cores; 146 GB/node
• Wrangler – Data-intensive computations; 10 PB storage; high IOPS
• Stockyard – Shared workspace; 20 PB storage; 1 TB per user; project workspace
• Corral – Data collections; 6 PB storage; databases; iRODS
• Rodeo/Chameleon – Cloud services; user VMs
• Vis Lab – Immersive vis; collaborative touch screens; 3D
• Ranch – Tape archive; 160 PB tape; 1+ PB access cache
Stampede
Stampede
• Base cluster (Dell/Intel/Mellanox):
– 6,400 nodes
– Intel Sandy Bridge processors
– Dell dual-socket nodes w/ 32 GB RAM (2 GB/core)
– 56 Gb/s Mellanox FDR InfiniBand interconnect
– More than 100,000 cores, 2.2 PF peak performance
• Max total concurrency:
– Exceeds 500,000 cores
– 1.8M threads
– #7 on the Top500 list
• 90% allocated through XSEDE
Additional Features of Stampede
• 6,800 Intel Xeon Phi "MIC" (Many Integrated Core) coprocessors
– Special release of "Knights Corner" (61 cores)
– 10+ PF peak performance
• Stampede includes 16 1 TB Sandy Bridge shared-memory nodes with dual GPUs
• 128 of the compute nodes are also equipped with NVIDIA Kepler K20 GPUs
• Storage subsystem driven by Dell storage nodes:
– Aggregate bandwidth greater than 150 GB/s
– More than 14 PB of capacity
– Disk space partitioned into multiple Lustre filesystems ($HOME, $WORK, and $SCRATCH), as on previous TACC systems
What does this mean?
• Faster processors
• More memory per node
• The ability to start hundreds of analysis jobs in batch
• Access to the latest "massively parallel" hardware
– Intel Xeon Phi
– GPGPU
Automatic offloading with the latest hardware
• R was originally designed for single-threaded execution.
– Slow performance
– Does not scale to large data
• R can be built and linked against libraries that exploit modern multi-core hardware, giving automatic parallel execution of some operations, most commonly linear-algebra computations.
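When R is linked against a multithreaded BLAS, ordinary linear-algebra code parallelizes with no changes. A minimal sketch (the matrix size `n` is an arbitrary choice for illustration; the actual speedup depends on which BLAS/LAPACK your R build uses):

```r
# Dense linear algebra in R calls into whatever BLAS/LAPACK the
# interpreter was linked against. With a multithreaded BLAS such as
# MKL or OpenBLAS, this %*% runs on several cores automatically.
n <- 1000                          # arbitrary problem size for the sketch
a <- matrix(rnorm(n * n), nrow = n)
b <- matrix(rnorm(n * n), nrow = n)

elapsed <- system.time(ab <- a %*% b)["elapsed"]
cat("n =", n, ": matrix multiply took", elapsed, "seconds\n")
```

Running the same script under a reference (single-threaded) BLAS and under MKL is an easy way to see the difference on a Stampede compute node.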
Getting more from R
• Optimizing R performance on Stampede
– The Intel compiler vs. gcc was a factor-of-2 improvement
– MKL significantly improved performance
– Some Xeon Phi performance enhancement as well
– Supporting common parallel packages
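One of the most common parallel packages is parallel, bundled with base R since 2.14. A small sketch of coarse-grained parallelism with `mclapply`, which forks worker processes (effective on the Linux compute nodes; on Windows `mc.cores` must be 1). The `slow_square` function is a hypothetical stand-in for a per-task analysis:

```r
library(parallel)

# A deliberately slow function standing in for a real per-task analysis.
slow_square <- function(x) {
  Sys.sleep(0.01)
  x^2
}

# Use one worker per available core, leaving one core free.
n_cores <- max(1L, detectCores() - 1L)

# mclapply forks worker processes and distributes the inputs among them.
res <- mclapply(1:8, slow_square, mc.cores = n_cores)
unlist(res)  # 1 4 9 16 25 36 49 64
```

The same code runs unchanged on a laptop and on a large shared-memory node; only `mc.cores` scales.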
Maverick Hardware
• 132-node dual-socket Ivy Bridge based cluster
– Each node has an NVIDIA Kepler K40 GPU
– 128 GB of memory per node
– FDR InfiniBand interconnect
– Shares the Work file system with Stampede (26 PB unformatted)
– Users get 1 TB of Work to start
• Intended for real-time analysis
• A TACC system: 50% provided in kind to XSEDE, 50% discretionary
Visualization and Analysis Portal
R and Python
• Can launch RStudio Server and IPython Notebook
– Introduces capabilities, best practices, and forms of parallelism to users
– Simplifies the UI with a web interface
– Improves visualization capabilities with the Shiny package and googleVis
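As an illustration of the kind of interactive web UI Shiny enables, here is a minimal sketch (it assumes the shiny package is installed; `fluidPage`, `sliderInput`, `plotOutput`, and `renderPlot` are standard shiny functions, and the app itself is a toy example):

```r
library(shiny)

# A minimal Shiny app: a slider controls the sample size of a histogram.
ui <- fluidPage(
  sliderInput("n", "Sample size", min = 10, max = 1000, value = 100),
  plotOutput("hist")
)

server <- function(input, output) {
  output$hist <- renderPlot(
    hist(rnorm(input$n), main = "Random normal sample")
  )
}

# runApp(shinyApp(ui, server))  # serves the app in a browser
```

Through the portal, such an app can be served from an analysis node and used from the researcher's own browser.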
Hadoop Cluster: Rustler
• A Hadoop cluster with 64 Hadoop data nodes
– 2 × 10-core Ivy Bridge processors
– 128 GB memory
– 16 × 1 TB disks (1 PB usable disk, 333 TB replicated)
• Login node, 2 name nodes, 1 web proxy node
• 10 Gb/s Ethernet network with 40 Gb/s connectivity to the TACC backbone
• Currently in its early-user period
• A pure TACC resource (all discretionary allocations)
Wrangler
• Three primary subsystems:
– A 10 PB disk storage system, Lustre based (2 R720 dual E5-2680 metadata servers, 45 C8000 object storage servers with 6 TB drives)
– An embedded analytics capability of several thousand cores: 96 Dell R620 Haswell E5-2680v3 nodes with dual IB FDR / 40 Gb/s Ethernet
– A high-speed global object store: 500 TB usable flash via PCI to all 96 analytics nodes; 1 TB/s I/O rate and 250M+ IOPS
• System layout (slide diagram): a 10 PB replicated mass storage subsystem; a 120-lane (56 Gb/s) non-blocking IB interconnect; a 96-node access and analysis system (128 GB+ memory, Haswell CPUs) on a 40 Gb/s Ethernet fabric; and a PCI Gen 3 all-to-all fabric (1 TB/s) to the high-speed storage system (500+ TB, 1 TB/s, 250M+ IOPS)
Data Intensive Computing Support at TACC
• Data Management and Collections group
– Provides data storage services: files, databases, iRODS
– Collection management and curation
• Data Mining and Statistics group
– Collaborates with users to develop and implement scalable algorithmic solutions
– In addition to general data mining and analysis methods, expertise in R, Hadoop, and visual analytics
• We are here to help:
– [email protected]