Introduction to Data Analysis with R on HPC
Texas Advanced Computing Center
Feb. 2015

Agenda
• 8:30-9:00 Welcome and introduction to TACC resources
• 9:00-9:30 Getting started with running R at TACC
• 9:30-10:00 Practice and coffee break
• 10:00-11:00 R basics
• 11:00-11:30 Data analysis support in R
• 11:30-1:00 Lunch break
• 1:00-1:30 Scaling up R computations
• 1:30-2:00 A walkthrough of the parallel package in R
• 2:00-3:00 Hands-on lab session
• 3:00-4:00 Understanding the performance of R programs

Introduction to TACC Resources

About TACC
• TACC is a research division at The University of Texas at Austin
– Origins go back to 1960s support of the CDC 6600 (a Seymour Cray design)
– TACC started in 2001 to support research beyond UT's needs
• TACC is a service provider for XSEDE on several key systems
– Currently provides 80-90% of the HPC cycles in XSEDE
– Not limited to supporting NSF research
• TACC is also supported through partnerships with UT Austin, the UT System, industrial partners, multi-institutional research grants, and donations
• TACC is 110+ people (40+ PhDs) bringing enabling technologies and techniques to drive digital research
– Many collaborative research projects and mission-specific proposals to support open research
– Consulting brings TACC expertise to other communities

TACC systems at a glance
• Stampede (HPC jobs): 6,400+ nodes, 10 PFlops, 14+ PB storage
• Maverick (vis & analysis, interactive access): 132 K40 GPUs
• Lonestar (HTC jobs): 1,800+ nodes, 22,000+ cores, 146 GB/node
• Wrangler (data-intensive computations): 10 PB storage, high IOPS
• Corral (data collections): 6 PB storage, databases, iRODS
• Stockyard (shared workspace): 20 PB storage, 1 TB per-user project workspace
• Rodeo/Chameleon (cloud services): user VMs
• Vis Lab: immersive vis, collaborative touch screen, 3D
• Ranch (tape archive): 160 PB tape, 1+ PB access cache

Stampede
• Base cluster (Dell/Intel/Mellanox):
– 6,400 nodes
– Intel Sandy Bridge processors
– Dell dual-socket nodes with 32 GB RAM (2 GB/core)
– 56 Gb/s Mellanox FDR InfiniBand interconnect
– More than 100,000 cores, 2.2 PF peak performance
• Max total concurrency:
– Exceeds 500,000 cores
– 1.8M threads
– #7 in the HPC Top500
• 90% allocated through XSEDE

Additional Features of Stampede
• 6,800 Intel Xeon Phi "MIC" Many Integrated Core coprocessors
– Special release of "Knights Corner" (61 cores)
– 10+ PF peak performance
• 16 1 TB Sandy Bridge shared-memory nodes with dual GPUs
• 128 of the compute nodes are also equipped with NVIDIA Kepler K20 GPUs
• Storage subsystem driven by Dell storage nodes:
– Aggregate bandwidth greater than 150 GB/s
– More than 14 PB of capacity
– Disk space partitioned into multiple Lustre filesystems ($HOME, $WORK and $SCRATCH), as on previous TACC systems

What does this mean?
• Faster processors
• More memory per node
• Starting hundreds of analysis jobs in batch
• Access to the latest "massively parallel" hardware
– Intel Xeon Phi
– GPGPU

Automatic offloading with latest hardware
• R was originally designed for single-threaded execution
– Slow performance
– Not scalable to large data
• R can be built and linked against libraries that exploit multi-core hardware, giving automatic parallel execution for some operations, most commonly linear-algebra computations (see the sketch below)
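To make the automatic-offloading point concrete, here is a minimal sketch of implicit BLAS parallelism; the 4000 x 4000 size is illustrative, and the timings depend on which BLAS R was linked against (e.g., MKL on Stampede versus the single-threaded reference BLAS):

    ## With R linked against a threaded BLAS such as MKL, a call like
    ## crossprod(a) (i.e., t(a) %*% a) is parallelized automatically;
    ## the R code itself does not change between builds.
    n <- 4000                            # illustrative matrix size
    a <- matrix(rnorm(n * n), nrow = n)
    system.time(crossprod(a))            # compare elapsed time across builds

On MKL builds, the thread count can be capped with the MKL_NUM_THREADS environment variable, set in the shell before launching R.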
Getting more from R
• Optimizing R performance on Stampede
– Intel compiler vs. gcc: a factor of 2 improvement
– MKL significantly improved performance
– Some Xeon Phi performance enhancement too
– Supporting common parallel packages (a sketch follows at the end of this deck)

Maverick Hardware
• 132-node dual-socket Ivy Bridge cluster
– Each node has an NVIDIA Kepler K40 GPU
– 128 GB of memory
– FDR InfiniBand interconnect
– Shares the $WORK file system with Stampede (26 PB unformatted)
– Users get 1 TB of $WORK to start
• Intended for real-time analysis
• A TACC system: 50% provided to XSEDE in kind, 50% discretionary

Visualization and Analysis Portal: R and Python
• Can launch RStudio Server and IPython Notebook
– Introducing capabilities, best practices, and forms of parallelism to users
– Simplifying the UI with a web interface
– Improving visualization capabilities with the Shiny package and googleVis

Hadoop Cluster: Rustler
• A Hadoop cluster with 64 Hadoop data nodes
– 2 x 10-core Ivy Bridge processors
– 128 GB memory
– 16 x 1 TB disks (1 PB usable disk, 333 TB replicated)
• Login node, 2 name nodes, 1 web proxy node
• 10 Gb/s Ethernet network with 40 Gb/s connectivity to the TACC backbone
• Currently in its early-user period
• A pure TACC resource (all discretionary allocations)

Wrangler
• Three primary subsystems:
– A 10 PB Lustre-based disk storage system (2 R720 dual E5-2680 metadata servers, 45 C8000 object storage servers with 6 TB drives)
– An embedded analytics capability of several thousand cores: 96 Dell R620 Haswell E5-2680 v3 nodes with dual IB FDR / 40 Gb/s Ethernet
– A high-speed global object store:
• 500 TB of usable flash attached via PCI to all 96 analytics nodes
• 1 TB/s IO rate and 250M+ IOPS
• Architecture (from the system diagram):
– Mass storage subsystem: 10 PB (replicated)
– IB interconnect: 120 lanes (56 Gb/s), non-blocking
– Access & analysis system: 96 nodes, Haswell CPUs, 128 GB+ memory
– 40 Gb/s Ethernet fabric
– PCI Gen 3 fabric: all-to-all connection, 1 TB/s
– High-speed storage system: 500+ TB, 1 TB/s, 250M+ IOPS

Data Intensive Computing Support at TACC
• Data Management and Collections group
– Provides data storage services: files, databases, iRODS
– Collection management and curation
• Data Mining and Statistics group
– Collaborates with users to develop and implement scalable algorithmic solutions
– Beyond general data mining and analysis methods, offers expertise in R, Hadoop, and visual analytics
• We are here to help: [email protected]
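Closing out the "common parallel packages" pointer above, here is a minimal sketch of explicit parallelism with the parallel package that ships with R; the slow_task function and the 1:16 task list are illustrative stand-ins for real analysis work:

    library(parallel)

    ## A deliberately slow stand-in for one unit of real analysis work.
    slow_task <- function(i) {
      Sys.sleep(1)
      mean(rnorm(1e6, mean = i))
    }

    ## mclapply() forks worker processes (Unix/Linux, as on TACC systems)
    ## and is a drop-in replacement for lapply().
    results <- mclapply(1:16, slow_task, mc.cores = detectCores())

    ## parLapply() is the socket-cluster alternative; it has more startup
    ## cost but does not rely on fork().
    cl <- makeCluster(4)
    results2 <- parLapply(cl, 1:16, slow_task)
    stopCluster(cl)

With 16 one-second tasks, the forked version finishes in roughly 16 / min(16, cores) seconds instead of 16, which is the basic payoff the "Scaling up R computations" agenda item is after.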