Michael L. Norman, Principal Investigator and Interim Director, SDSC
Allan Snavely, Co-Principal Investigator and Project Scientist
SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

What is Gordon?
• A "data-intensive" supercomputer based on SSD flash memory and virtual shared memory
• Emphasizes memory and I/O over FLOPS
• A system designed to accelerate access to the massive databases being generated in all fields of science, engineering, medicine, and social science
• The NSF's most recent Track 2 award to the San Diego Supercomputer Center (SDSC)
• Coming summer 2011

Why Gordon?
• Growth of digital data is exponential: a "data tsunami"
• Driven by advances in digital detectors, networking, and storage technologies
• Making sense of it all is the new imperative:
  • data analysis workflows
  • data mining
  • visual analytics
  • multiple-database queries
  • on-demand data-driven applications

The Memory Hierarchy
• Flash SSD sits between DRAM and disk: O(TB) capacity at roughly 1,000-cycle access latency
• Potential 10x speedup for random I/O to large files and databases

Gordon Architecture: "Supernode"
• 32 Appro Extreme-X compute nodes
  • Dual-processor Intel Sandy Bridge
  • 240 GFLOPS
  • 64 GB RAM
• 2 Appro Extreme-X I/O nodes
  • Intel SSD drives, 4 TB each
  • 560,000 IOPS
• ScaleMP vSMP virtual shared memory
  • 2 TB RAM aggregate
  • 8 TB SSD aggregate
[Diagram: a supernode, with 240 GF / 64 GB RAM compute nodes and a 4 TB SSD I/O node joined by vSMP memory virtualization]

Gordon Architecture: Full Machine
• 32 supernodes = 1024 compute nodes
• Dual-rail QDR InfiniBand network in a 3D torus (4x4x4)
• 4 PB rotating-disk parallel file system at >100 GB/s
[Diagram: the supernode (SN) torus with attached disk (D) nodes]

Gordon Peak Capabilities

  Speed                 245 TFLOPS
  Memory (RAM)          64 TB
  Memory (SSD)          256 TB
  Memory (RAM + SSD)    320 TB
  Ratio (MEM/SPEED)     1.31 bytes/FLOP
  I/O rate to SSDs      35 million IOPS
  Network bandwidth     16 GB/s bidirectional
  Network latency       1 msec
  Disk storage          4 PB
  Disk I/O bandwidth    >100 GB/s

Gordon is designed specifically for data-intensive HPC applications
• Such applications involve "very large data-sets or very large input-output requirements"
• Two data-intensive application classes are important and growing:
  • Data mining: "the process of extracting hidden patterns from data… with the amount of data doubling every three years, data mining is becoming an increasingly important tool to transform this data into information" (Wikipedia)
  • Data-intensive predictive science: the solution of scientific problems via simulations that generate large amounts of data

High Performance Computing (HPC) vs. High Performance Data (HPD)

  Attribute                  HPC                                 HPD
  Key HW metric              Peak FLOPS                          Peak IOPS
  Architectural features     Many small-memory multicore nodes   Fewer large-memory SMP nodes
  Typical application        Numerical simulation                Database query; data mining
  Concurrency                High concurrency                    Low concurrency or serial
  Data structures            Easily partitioned, e.g. grid       Not easily partitioned, e.g. graph
  Typical disk I/O patterns  Large-block sequential              Small-block random
  Typical usage mode         Batch process                       Interactive

Data mining applications will benefit from Gordon
• De novo genome assembly from sequencer reads, and analysis of galaxies from cosmological simulations and observations, will benefit from large shared memory
• Federations of databases, and interaction-network analysis for drug discovery, social science, biology, epidemiology, etc., will benefit from low-latency I/O to flash

Data-intensive predictive science will benefit from Gordon
• Solution of inverse problems in oceanography, atmospheric science, and seismology will benefit from a balanced system, especially large RAM per core and fast I/O
• Modestly scalable codes in quantum chemistry and structural engineering will benefit from large shared memory

Dash: towards a supercomputer for data-intensive computing

Project Timeline
• Phase 1: Dash development (9/09–7/11)
• Phase 2: Gordon build and acceptance (3/11–7/11)
• Phase 3: Gordon operations (7/11–6/14)

Comparison of the Dash and Gordon systems
(Note: doubling capacity halves accessibility to any random data on a given medium.)

  Component                                      Dash                        Gordon
  Node characteristics (sockets, cores, DRAM)    2 sockets, 8 cores, 48 GB   2 sockets, TBD cores, 64 GB
  Compute nodes (#)                              64                          1024
  Processor type                                 Nehalem                     Sandy Bridge
  Clock speed (GHz)                              2.4                         TBD
  Peak speed (TFLOPS)                            4.9                         245
  DRAM (TB)                                      3                           64
  I/O nodes (#)                                  2                           64
  I/O controllers per node                       2 with 8 ports              1 with 16 ports
  Flash (TB)                                     2                           256
  Total memory: DRAM + flash (TB)                5                           320
  vSMP                                           Yes                         Yes
  32-node supernodes                             2                           32
  Interconnect                                   InfiniBand                  InfiniBand
  Disk                                           0.5 PB                      4.5 PB

Gordon project wins the Storage Challenge at SC09 with Dash

We won the SC09 Storage Challenge with Dash, with these numbers (IOR, 4 KB transfers):
• RAMFS: 4 million+ IOPS on up to 0.75 TB of DRAM (one supernode's worth)
• Flash: 88K+ IOPS on up to 1 TB of flash (one supernode's worth)
• Palomar Transients database searches sped up 10x to 100x
• Best IOPS per dollar
• Since then we have boosted flash IOPS to 540K, hitting our 2011 performance targets early (it is now 2009)

Dash Update: early vSMP test results
[Charts of early vSMP test results; not reproduced here]

Next Steps
• Continue vSMP and flash SSD assessment and development on Dash
• Prototype Gordon application profiles using Dash:
  • new application domains
  • new usage modes and operational support mechanisms
  • new user support requirements
• Work with TRAC to identify candidate applications
• Assemble the Gordon User Advisory Committee
• International data-intensive conference, fall 2010
SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO
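The headline figures quoted for Gordon (64 TB RAM, 256 TB flash, 1.31 bytes/FLOP, 35 million IOPS) are derived quantities, and it is worth checking that they are mutually consistent. The short Python sketch below does that arithmetic from the per-node numbers; reading 560,000 IOPS as a per-I/O-node figure is our assumption, but it is the only reading that reproduces the quoted 35 million aggregate.

```python
# Sanity-check the headline Gordon figures against the per-node numbers
# given in the slides. All inputs come from the deck; the interpretation
# of 560,000 IOPS as "per I/O node" is an assumption (see lead-in).

compute_nodes = 1024            # 32 supernodes x 32 compute nodes
ram_per_node_gb = 64
io_nodes = 64                   # 32 supernodes x 2 I/O nodes
ssd_per_io_node_tb = 4
iops_per_io_node = 560_000      # assumed per-I/O-node figure

ram_tb = compute_nodes * ram_per_node_gb / 1024   # 64 TB, as quoted
ssd_tb = io_nodes * ssd_per_io_node_tb            # 256 TB, as quoted
total_mem_tb = ram_tb + ssd_tb                    # 320 TB, as quoted

peak_tflops = 245
bytes_per_flop = total_mem_tb / peak_tflops       # ~1.31, matching the table

aggregate_iops = io_nodes * iops_per_io_node      # 35.84 M, quoted as "35 million"

print(ram_tb, ssd_tb, total_mem_tb, round(bytes_per_flop, 2), aggregate_iops)
```

The same bookkeeping reproduces the Dash column of the comparison table (64 nodes x 48 GB = 3 TB DRAM; 3 TB DRAM + 2 TB flash = 5 TB total memory).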
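The HPC-vs-HPD comparison characterizes data-intensive workloads by small-block random I/O, which is exactly the access pattern the flash layer targets (the SC09 Storage Challenge numbers used IOR with 4 KB transfers). As an illustrative sketch only, not the IOR benchmark itself, the following Python issues random 4 KB reads against an ordinary file; the file size and read count are arbitrary choices, and on a workstation the result reflects the page cache rather than raw-device IOPS.

```python
import os
import random
import tempfile
import time

BLOCK = 4096                     # 4 KB, matching the IOR transfer size in the deck
FILE_SIZE = 16 * 1024 * 1024     # 16 MB scratch file (arbitrary, for illustration)
N_READS = 5_000                  # number of random reads (arbitrary)

# Create a sparse scratch file of the desired size.
fd, path = tempfile.mkstemp()
os.ftruncate(fd, FILE_SIZE)

n_blocks = FILE_SIZE // BLOCK
start = time.perf_counter()
for _ in range(N_READS):
    offset = random.randrange(n_blocks) * BLOCK   # random 4 KB-aligned offset
    data = os.pread(fd, BLOCK, offset)            # one small random read = one I/O op
elapsed = time.perf_counter() - start

print(f"{N_READS / elapsed:,.0f} reads/sec (page-cache figure, not device IOPS)")

os.close(fd)
os.remove(path)
```

On rotating disk this pattern is seek-bound, which is why the table pairs it with flash: an SSD serves each 4 KB request without a mechanical seek.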