Joseph Antony, Andrew Howard, Jason Andrade, Ben Evans, Claire Trenham, Jingbo Wang

Production Petascale Climate Data Replication at NCI – Lustre and our engagement with the Earth Systems Grid Federation (ESGF)

nci.org.au
@NCInews

MOTIVATION

International Climate Change Research – The CMIP projects
• The UN’s Intergovernmental Panel on Climate Change (IPCC) prepares an intergovernmental assessment report every 6 years
• This effort requires significant scientific and HPC/HPD resources to back it
• The most recent of these activities was the Coupled Model Intercomparison Project 5 (CMIP5)
• The NCI is a major data node within the ESGF federation
• In this talk I will share with you a ‘view from the coalface’, replicating ~2PB of data

CMIP DATA VOLUMES

CMIP1 thru CMIP5 Data Volumes
Taken from Dean Williams’ ESGF Internet2 presentation, 2014

ESGF NODE ARCHITECTURE

The ESGF Data Archival and Retrieval System
• The ESGF is a federated peer-to-peer international data archival and retrieval system
• Incorporates single sign-on for end-users
• It has publication and version management tools
• Supports data aggregations and can notify users if datasets have been modified
http://esgf.org
LLNL-PRES-648666

THE END-USER PERSPECTIVE

The Last-Mile Problem …
• Data is too large to move onto desktop for analysis
  – CMIP3 to CMIP5
• Users want versioned, curated data to be able to jump right into scientific analysis
• At NCI
  – An integrated eco-system exists for data-intensive science
    • Data Repositories
    • Virtual Laboratories
  – The ICNWG effort to solve the ‘Last Mile Problem’ for networking

ICNWG Activities

Okay … so where’s Lustre in all of this you ask?
We use Lustre as our distributed filesystem for a set of dedicated WAN data transfer nodes (DTNs)

But first a detour …

A small amount of packet loss makes a huge difference in TCP performance
• 1Gbps == 125 MB/sec
• With loss, high performance beyond metro distances is essentially impossible
[Figure: distance categories Local (LAN), Metro Area, Regional, Continental, International; series Measured (TCP Reno), Measured (HTCP), Theoretical (TCP Reno), Measured (no loss)]
Courtesy Eli Dart, ESnet, 5/5/14
Lawrence Berkeley National Laboratory, U.S. Department of Energy | Office of Science

Science DMZ Design Pattern (Abstract)
[Diagram labels: WAN; Border Router; Enterprise Border Router/Firewall; Site / Campus LAN; Science DMZ Switch/Router; perfSONAR; 10G and 10GE links; Clean, High-bandwidth WAN path; Site / Campus access to Science DMZ resources; Per-service security policy control points; High performance Data Transfer Node with high-speed storage]
Courtesy Eli Dart, ESnet

Local And Wide Area Data Flows
[Diagram labels as for the Science DMZ Design Pattern, plus: High Latency WAN Path; Low Latency LAN Path]
Courtesy Eli Dart, ESnet

Abstract HPC Center With Data Path
[Diagram labels: WAN; Border Router; Firewall; Routed Offices; Virtual Circuit; Core Switch/Router; Front end switch; perfSONAR; Data Transfer Nodes; Supercomputer; Parallel Filesystem; High Latency WAN Path; Low Latency LAN Path; High Latency VC Path]
Courtesy Eli Dart, ESnet

AARNet International Links

NCI’s DTN Nodes

CBR-SYD and onto the CONUS via SXtransport

SXtransport – Physical Layout
[Diagram labels: Cable Station; Network Segment]

SXtransport – Logical Network Layout

What are some of the world’s longest submarine cables you ask?
• 39,000 km of submarine fibre
• 28,900 km of submarine fibre
• 1,600 km of terrestrial fibre

Networking Topology for Data Replication
Courtesy Mary Hester, ESnet

Initial Transfer Rates from NCI
• Graph shows the data rate vs. the volume of data transferred
• Different lines in the graph represent how many data streams were required to obtain the given performance.
The results of the graph indicate that it is possible to get a line-rate of 1GB/s (8Gbps) between Australia and the United States. However, this requires configuring transfers to run more than 100 parallel streams.

Data replication and Science DMZs
• Currently we’ve replicated ~1.5PB
• Working on improving these rates by employing a Science DMZ model and dedicated data transfer nodes

Globus Online
• Globus Online is a hosted data-transfer-as-a-service offering, run by the University of Chicago
• It makes the job of large data transfers easy for both instrument owners and end-users

Globus Online Architecture

Using Dedicated DTNs – January 2015

Using Dedicated DTNs – March 2015

State of the Union Numbers from the ICNWG Consortium

Conclusion
• Non-trivial to get various ducks lined-up
  – 10GigE WAN networking
  – Mellanox tuning work for 10GigE Ethernet and 56Gbps FDR
  – Being NUMA aware is critical for the GridFTP daemon!

THE END

VERIFIED, CURATED SCIENTIFIC DATASETS

Centralized Quality Control for Data Processing
• Multi-layered QC
  – Initial Level 1 QC done at data nodes
  – DKRZ performs L2 QC
  – Further metadata and variable checking is done to get to L3 QC
• At every step, end-users can see the QC Level for their data
• Replicated data has passed QC Level 3 and receives a DOI
3-Layer Quality Assurance Concept
LLNL-PRES-648666
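
Back-of-envelope note on the transfer figures quoted in the talk (1Gbps == 125 MB/sec, a 1GB/s line rate, ~2PB to replicate, more than 100 parallel streams). The sketch below is only a rough illustration. The per-stream bound uses the widely cited Mathis et al. approximation for TCP Reno under loss, which the slides do not name. The MSS, round-trip time and loss rate are illustrative assumptions, not measurements from NCI, and 1 PB is taken as 10^15 bytes.

```python
import math


def gbps_to_mbytes_per_s(gbps):
    """Convert a link rate in Gbit/s to MB/s (decimal units, 8 bits per byte)."""
    return gbps * 1000 / 8


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Approximate upper bound for one TCP Reno stream (Mathis et al.):
    rate <= (MSS / RTT) * (C / sqrt(p)), with C = sqrt(3/2)."""
    c = math.sqrt(1.5)
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


def streams_needed(target_bps, per_stream_bps):
    """Smallest whole number of parallel streams that reaches the target rate."""
    return math.ceil(target_bps / per_stream_bps)


def transfer_days(volume_bytes, rate_bytes_per_s):
    """Days needed to move a volume at a sustained rate."""
    return volume_bytes / rate_bytes_per_s / 86400


if __name__ == "__main__":
    # Assumed values for illustration only (not from the slides):
    MSS_BYTES = 1460   # standard Ethernet MSS, no jumbo frames
    RTT_S = 0.200      # assumed Australia <-> US round-trip time
    LOSS = 1e-5        # assumed packet loss rate

    per_stream = mathis_throughput_bps(MSS_BYTES, RTT_S, LOSS)
    print(f"1 Gbps = {gbps_to_mbytes_per_s(1):.0f} MB/s")
    print(f"Per-stream bound: {per_stream / 1e6:.1f} Mbit/s")
    print(f"Streams for 8 Gbps: {streams_needed(8e9, per_stream)}")
    print(f"2 PB at 1 GB/s: {transfer_days(2e15, 1e9):.1f} days")
```

Under these assumed inputs a single lossy long-haul stream falls far short of line rate, which is consistent with the talk's observation that many parallel streams were needed. Changing the assumed MSS (for example to jumbo frames) or the loss rate shifts the stream count considerably.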