TeraGrid and I-WIRE: Models for the Future?
Rick Stevens and Charlie Catlett
Argonne National Laboratory / The University of Chicago

TeraGrid Interconnect Objectives
• Traditional: interconnect sites/clusters using a WAN
  • WAN bandwidth balances cost and utilization; the objective is to keep utilization high to justify the high cost of WAN bandwidth
• TeraGrid: build a wide-area "machine room" network
  • The TeraGrid WAN objective is to handle peak machine-to-machine (M2M) traffic
  • Partnering with Qwest to begin at 40 Gb/s and grow to ≥80 Gb/s within 2 years
• Long-term TeraGrid objective
  • Build a Petaflops-capable distributed system, requiring Petabytes of storage and a Terabit/second network
  • The current objective is a step toward this goal
  • A Terabit/second network will require many lambdas operating at a minimum of OC-768, and its architecture is not yet clear

Outline and Major Issues
• Trends in national cyberinfrastructure development
• TeraGrid as a model for advanced grid infrastructure
• I-WIRE as a model for advanced regional fiber infrastructure
• What is needed for these models to succeed
• Recommendations

Trends in Cyberinfrastructure
• Advent of regional dark-fiber infrastructure
  • Community owned and managed (via 20-year IRUs)
  • Typically supported by state or local resources
• Lambda services (IRUs) are viable replacements for bandwidth service contracts
  • They need to be structured with built-in capability escalation (bit-rate independence, BRI)
  • Strong operating capability is needed to exploit this
• Regional (NGO) groups are moving faster (much faster!) than national network providers and agencies
  • A viable path to putting bandwidth on a Moore's-law curve
  • A source of new ideas for national infrastructure architecture

[Figure: 13.6 TF Linux TeraGrid, a four-site diagram. Caltech: 32 nodes, 0.5 TF, 0.4 TB memory, 86 TB disk (256p HP X-Class, 128p HP V2500, 92p IA-32, HPSS). Argonne: 64 nodes, 1 TF, 0.25 TB memory, 25 TB disk (574p IA-32 Chiba City, 128p Origin, HR display and VR facilities, HPSS). SDSC: 256 nodes, 4.1 TF, 2 TB memory, 225 TB disk (1176p IBM SP Blue Horizon, Sun Starcat, HPSS). NCSA: 500 nodes, 8 TF, 4 TB memory, 240 TB disk (1024p IA-32, 320p IA-64, 1500p Origin, UniTree). Clusters are built from 16-32 quad-processor McKinley servers per group (4 GF processors, 8-12 GB memory per server) on Myrinet Clos spines with Fibre Channel storage, aggregated by Juniper M160/M40 routers and 10 GbE, with OC-3/OC-12/OC-48 links to Abilene, vBNS, ESnet, Calren, NTON, HSCC, MREN, and Starlight.]
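As a quick check on the headline number in the figure above, here is a minimal sketch that sums the per-site figures from the diagram (the values are from the slide; the data structure and names are mine):

```python
# Aggregate the per-site figures from the "13.6 TF Linux TeraGrid" diagram.
sites = {
    "Caltech": {"tflops": 0.5, "mem_tb": 0.40, "disk_tb": 86},
    "Argonne": {"tflops": 1.0, "mem_tb": 0.25, "disk_tb": 25},
    "SDSC":    {"tflops": 4.1, "mem_tb": 2.00, "disk_tb": 225},
    "NCSA":    {"tflops": 8.0, "mem_tb": 4.00, "disk_tb": 240},
}

# Sum each metric across the four sites.
total = {k: sum(s[k] for s in sites.values()) for k in ("tflops", "mem_tb", "disk_tb")}
print(f"{total['tflops']:.1f} TF, {total['mem_tb']:.2f} TB memory, {total['disk_tb']} TB disk")
# -> 13.6 TF, 6.65 TB memory, 576 TB disk
```

The per-site peaks do add up to the 13.6 TF in the slide title, with roughly 6.65 TB of aggregate memory and 576 TB of disk.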
TeraGrid Network Architecture
• Cluster interconnect uses a multi-stage switch/router tree with multiple 10 GbE external links
• Separating the cluster-aggregation and site border routers is necessary for operational reasons
• Phase 1: four routers or switch/routers
  • Each with three OC-192 or 10 GbE WAN PHY interfaces
  • MPLS to allow >10 Gb/s between any two sites
• Phase 2: add core routers or switch/routers
  • Each with ten OC-192 or 10 GbE WAN PHY interfaces
  • Ideally expandable with additional 10 Gb/s interfaces

Option 1: Full Mesh with MPLS
[Figure: full-mesh topology. Caltech and SDSC clusters reach DWDM at One Wilshire (carrier fiber collocation facility, Los Angeles) over 140 mi and 25 mi spans via the Qwest San Diego POP; ANL and NCSA reach 455 N. Cityfront Plaza (Qwest fiber collocation facility) and 710 N. Lakeshore (Starlight) in Chicago over 1-115 mi spans; Los Angeles and Chicago are ~2200 mi apart. At each site, the cluster aggregation switch/router connects by 10 GbE to a site border router or switch/router, which connects to the wide area over OC-192 on Ciena CoreStream DWDM (site-side DWDM TBD); an IP router serves other site resources.]

Expansion Capability: "Starlights"
[Figure: the same topology with regional fiber aggregation points at One Wilshire and 710 N. Lakeshore (Starlight), where an IP router (packets) or a lambda router (circuits) lets additional sites and networks join the interconnect.]

Partnership: Toward Terabit/s Networks
• Aggressive current-generation TeraGrid backplane
  • 3 x 10 GbE per site today, with 40 Gb/s in the core
  • Grow to an 80 Gb/s or higher core within 18-24 months
  • Requires hundreds of Gb/s in core/hub devices
• Architecture evaluation for the next-generation backplane
  • Higher lambda counts, alternative topologies
  • OC-768 lambdas (a back-of-the-envelope lambda count follows the I-WIRE slide below)
• Parallel persistent testbed
  • Use one or more Qwest 10 Gb/s lambdas to keep next-generation technology and architecture testbeds running at all times
  • Partner with Qwest and local fiber/transport infrastructure to test OC-768 and additional lambdas
  • I-WIRE can provide multiple additional dedicated regional 10 Gb/s lambdas and dark fiber for OC-768 testing beginning 2Q 2002

I-WIRE Logical and Transport Topology
[Figure: I-WIRE fiber map linking Starlight (NU-Chicago), Argonne, the Qwest facility at 455 N. Cityfront, UC Gleacher Center at 450 N. Cityfront, UIC, the Illinois Century Network / James R. Thompson Center / City Hall / State of IL Building at 111 N. Canal, UIUC/NCSA, Doral Plaza at 151/155 N. Michigan, IIT, and UChicago, over McLeodUSA and Level(3) fiber; per-segment fiber counts range from 2 to 18.]
• Next steps
  • Fiber to FermiLab and other sites
  • Additional fiber to ANL and UIC
  • DWDM terminals at Level(3) and McLeodUSA locations
  • Experiments with OC-768 and optical switching/routing
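Both the Terabit/second objective stated earlier and the OC-768 experiments planned above come down to wavelength counts. A minimal sketch of the arithmetic, assuming standard SONET line rates (the helper function is mine, for illustration):

```python
import math

# Standard SONET line rates in Gb/s.
LINE_RATE_GBPS = {"OC-192": 9.953, "OC-768": 39.813}

def lambdas_needed(target_gbps: float, rate: str) -> int:
    """Wavelengths of the given line rate needed to reach target_gbps."""
    return math.ceil(target_gbps / LINE_RATE_GBPS[rate])

for rate in LINE_RATE_GBPS:
    print(f"1 Tb/s over {rate}: {lambdas_needed(1000, rate)} lambdas")
# 1 Tb/s over OC-192: 101 lambdas
# 1 Tb/s over OC-768: 26 lambdas
```

Roughly a hundred OC-192 wavelengths versus two dozen-plus OC-768 wavelengths, which is why the slides argue a Terabit/second network needs lambdas "at minimum OC-768".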
[Figure: national map of gigapops and terapops (OIX, Pacific Lightrail, TeraGrid Interconnect); gigapop data from Internet2.]

Leverage Regional/Community Fiber: Experimental Interconnects

Recommendations
• The ANIR program should support:
  • Interconnection of fiber islands via bit-rate-independent or advanced lambdas (BRI λs)
  • Hardware to light up community fibers and build out advanced testbeds
  • People resources to run these research-community-driven infrastructures
• A next-generation connection program will not help advance the state of the art
• Lambda services need to be BRI
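To see why bit-rate independence matters over the life of a 20-year IRU, here is an illustrative sketch comparing a fixed-rate bandwidth contract with a BRI lambda that can be re-terminated as transceivers improve; the 18-month doubling period is an assumed Moore's-law-style rate, not a figure from the slides:

```python
# Illustrative only: fixed-rate contract vs. a bit-rate-independent (BRI)
# lambda IRU re-terminated with faster DWDM gear over time.
FIXED_GBPS = 10        # capacity locked in by a bandwidth service contract
START_GBPS = 10        # rate initially lit on the BRI lambda
DOUBLING_MONTHS = 18   # assumed Moore's-law-style doubling period

for year in (0, 5, 10, 20):
    bri_gbps = START_GBPS * 2 ** (12 * year / DOUBLING_MONTHS)
    print(f"year {year:2d}: contract {FIXED_GBPS} Gb/s, BRI lambda ~{bri_gbps:,.0f} Gb/s")
# year  0: contract 10 Gb/s, BRI lambda ~10 Gb/s
# year  5: contract 10 Gb/s, BRI lambda ~101 Gb/s
# year 10: contract 10 Gb/s, BRI lambda ~1,016 Gb/s
# year 20: contract 10 Gb/s, BRI lambda ~103,219 Gb/s
```

Under any such escalation assumption, a re-terminable lambda pulls orders of magnitude ahead of a fixed contract over a 20-year IRU, which is the case the slides make for BRI lambda services.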