Mellanox Connectivity Solutions for Scalable HPC
Highest Performing, Most Efficient End-to-End Connectivity for Servers and Storage
September 2010
Brandon Hathaway
© 2010 Mellanox Technologies

Connectivity Solutions for Efficient Computing
• Markets served: enterprise HPC, high-end HPC, HPC clouds
• Mellanox interconnect networking solutions: ICs, adapter cards, host/fabric software, switches/gateways, cables
• Leading connectivity solution provider for servers and storage

Complete End-to-End Connectivity
• Mellanox host/fabric software
  - Management: full software stack on all OSs and platforms; in-box in the Linux kernel and Windows
  - Application accelerations: CORE-Direct, RDMA, GPUDirect
  - Networking efficiency/scalability: QoS, adaptive routing, congestion management
• Server and storage high-speed connectivity: ICs, adapters, switches/gateways, cables

Mellanox InfiniBand in the TOP500
[Chart: Top500 InfiniBand trends, number of clusters: Jun 08: 122, Jun 09: 152, Jun 10: 208]
• Mellanox InfiniBand builds the most powerful clusters: 5 of the Top10 (#2, #3, #6, #7, #10) and 64 of the Top100
• InfiniBand represents 42% of the Top500; all InfiniBand clusters use Mellanox solutions
• Mellanox InfiniBand enables the highest utilization on the Top500: up to 96% system utilization, 50% higher than the best Ethernet-based system

Highest Performance
• Highest throughput
  - 40 Gb/s node to node
  - Nearly 90M MPI messages per second
  - Send/receive and RDMA operations with zero-copy
• Lowest latency
  - 1-1.3 µs MPI end-to-end
  - 0.9-1 µs InfiniBand latency for RDMA operations
  - 100 ns switch latency at 100% load
  - Lowest-latency 648-port switch: 25% to 45% faster than other solutions
• Lowest CPU overhead
  - Full transport offload maximizes CPU availability for user applications

Advanced HPC Capabilities
• CORE-Direct (Collectives Offload Resource Engine)
  - Collectives are communications used for system synchronization, data broadcast, or data gathering
  - Collectives affect every node
  - CORE-Direct offloads these collectives from the CPU to the HCA
  - Eliminates system noise and jitter issues
  - Increases the CPU cycles available for applications
• GPUDirect
  - Enables the fastest GPU-to-GPU communications
  - Eliminates CPU overhead and eliminates memory copies in system memory
  - Reduces GPU-to-GPU communication time by 30%
• Congestion control
  - Eliminates network congestion (hot-spots) caused by many senders targeting a single receiver
• Adaptive routing
  - Eliminates network congestion caused by point-to-point communications sharing the same network path

InfiniBand Link Speed Roadmap
Per-lane and rounded per-link bandwidth (Gb/s), per direction:

Lanes per direction | 5G-IB DDR | 10G-IB QDR | 14G-IB FDR (14.0625) | 26G-IB EDR (25.78125)
12                  | 60+60     | 120+120    | 168+168              | 300+300
8                   | 40+40     | 80+80      | 112+112              | 200+200
4                   | 20+20     | 40+40      | 56+56                | 100+100
1                   | 5+5       | 10+10      | 14+14                | 25+25

[Chart: projected per-direction bandwidth for 1x, 4x, 8x, and 12x links from DDR through QDR, FDR, EDR, HDR, and NDR, plotted against market demand, 2005-2014]

Thank You
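The figures in the roadmap table can be reproduced by rounding each generation's per-lane signaling rate down to a whole number of Gb/s and then multiplying by the lane count. A minimal sketch of that arithmetic (the per-lane rates are the standard InfiniBand signaling rates; note the rounded figures ignore line-code overhead, e.g. DDR/QDR use 8b/10b encoding, so a 40 Gb/s QDR 4x link carries 32 Gb/s of data):

```python
import math

# Per-lane signaling rates in Gb/s for each InfiniBand generation.
LANE_RATES = {"DDR": 5.0, "QDR": 10.0, "FDR": 14.0625, "EDR": 25.78125}

def link_bandwidth(generation: str, lanes: int) -> int:
    """Rounded per-direction link bandwidth in Gb/s, as quoted in the
    roadmap table: the per-lane rate is rounded down to a whole number
    first, then multiplied by the number of lanes."""
    return math.floor(LANE_RATES[generation]) * lanes

if __name__ == "__main__":
    for gen in ("DDR", "QDR", "FDR", "EDR"):
        print(gen, [link_bandwidth(gen, n) for n in (1, 4, 8, 12)])
```

For DDR and QDR the rounding changes nothing; for FDR and EDR it explains why the table quotes, for example, 168 Gb/s for a 12x FDR link rather than the unrounded 12 × 14.0625 = 168.75 Gb/s.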