ESnet Trends and Pressures and Long Term Strategy
ESCC, July 21, 2004

William E. Johnston, ESnet Dept. Head and Senior Scientist
R. P. Singh, Project Manager
Michael S. Collins, Stan Kluz, Joseph Burrescia, and James V. Gagliardi, ESnet Leads
and the ESnet Team
Lawrence Berkeley National Laboratory

DOE Science Bandwidth Requirements
• Bandwidth requirements are established by the scientific community by looking at
  o the increase in the rates at which supercomputers generate data
  o the geographic scope of the community that must analyze that data
  o the types of distributed applications that must run on geographically diverse systems
    - e.g. whole system climate models
  o the data rates, and the analysis and collaboration style, of the next generation science instruments
    - e.g. SNS, Fusion, LHC/Atlas/CMS

Evolving Quantitative Science Requirements for Networks

Science Area | Today, End2End Throughput | 5 Years, End2End Throughput | 5-10 Years, End2End Throughput | Remarks
High Energy Physics | 0.5 Gb/s | 100 Gb/s | 1000 Gb/s | high bulk throughput
Climate (Data & Computation) | 0.5 Gb/s | 160-200 Gb/s | N x 1000 Gb/s | high bulk throughput
SNS NanoScience | Not yet started | 1 Gb/s | 1000 Gb/s + QoS for control channel | remote control and time critical throughput
Fusion Energy | 0.066 Gb/s (500 MB/s burst) | 0.198 Gb/s (500 MB / 20 sec. burst) | N x 1000 Gb/s | time critical throughput
Astrophysics | 0.013 Gb/s (1 TBy/week) | N*N multicast | 1000 Gb/s | computational steering and collaborations
Genomics Data & Computation | 0.091 Gb/s (1 TBy/day) | 100s of users | 1000 Gb/s + QoS for control channel | high throughput and steering

Evolving Qualitative Requirements for Network Infrastructure
• 1-3 yrs: in the near term, applications need high bandwidth
• 2-4 yrs: requirement is for high bandwidth and QoS
• 3-5 yrs: requirement is for high bandwidth and QoS and network resident cache and compute elements
• 4-7 yrs: requirement is for high bandwidth and QoS and network resident cache and compute elements, and robust bandwidth (multiple paths)
[Diagram: networks of storage (S), compute (C), instrument (I), and cache & compute elements. Labels: 1-40 Gb/s, end-to-end; guaranteed bandwidth paths; 100-200 Gb/s, end-to-end.]

Point to Point Connections
• 10 Gb/s connections between major data sites provide the ability to move about 100 TBy/day, i.e. a petabyte every 10 days
• A few 10 Gb/s connections between ½ dozen Labs will probably be feasible in the next few years

ESnet's Evolution over the Next 10-20 Years
• Upgrading ESnet to accommodate the anticipated increase from the current 100%/yr traffic growth to 300%/yr over the next 5-10 years is priority number 7 out of 20 in DOE's "Facilities for the Future of Science – A Twenty Year Outlook"

ESnet's Evolution over the Next 10-20 Years
• Based on the requirements of the OSC High Impact Science Workshop and Network 2008 Roadmap, ESnet must address
  I. Capable, scalable, and reliable production IP networking
    - University and international collaborator connectivity
    - Scalable, reliable, and high bandwidth site connectivity
  II. Network support of high-impact science
    - provisioned circuits with guaranteed quality of service (e.g. dedicated bandwidth)
  III. Evolution to optical switched networks
    - Partnership with UltraScienceNet
    - Close collaboration with the network R&D community
  IV. Science Services to support Grids, collaboratories, etc.

I. Production IP: University and International Connectivity
• Connectivity between any DOE Lab and any Major University should be as good as ESnet connectivity between DOE Labs and Abilene connectivity between Universities
  o Partnership with Internet2/Abilene
  o Multiple high-speed peering points
  o Routing tailored to take advantage of this
  o Continuous monitoring infrastructure to verify correct routing
  o Status: In progress
    - 4 cross-connects are in place and carrying traffic
    - first phase monitoring infrastructure is in place

Monitoring DOE Lab - University Connectivity
• Normal, continuous monitoring (full mesh; need auto detection of bandwidth anomalies)
• All network hubs will have monitors
• Monitors = network test servers (e.g. OWAMP) + stratum 1 time source
• Diagnostic monitoring (e.g. follow path from SLAC to IU)
• Initial set of site monitors
[Map, repeated on each of the three monitoring slides. Labels: SEA, SNV, LA, SDG, DEN, ALB, ELP, KC, HOU, CHI, IND, ATL, DC, NYC, ORNL, AsiaPac, Japan, Europe, CERN/Europe. Legend: DOE Labs w/ monitors; Universities w/ monitors; network hubs; high-speed cross connects with Internet2/Abilene; ESnet/Qwest; Abilene; prototype site monitors.]

Initial Monitor Results (http://measurement.es.net)

Initial Monitoring Prototype: LBNL/ESnet -> NCSU/Abilene
Thanks to Chintan Desai, NCSU; Jin Guojun, LBNL; Joe Metzger, ESnet; Eric Boyd, Internet2
[Plot: 48 hour sample; axis labels 41 ms and 42 ms]

   1  128.109.41.1 (128.109.41.1)  0.188 ms  0.124 ms  0.116 ms  [NCSU sentinel host]
   2  rlgh1-gw-to-nc-itec.ncren.net (128.109.66.1)  1.665 ms  1.579 ms  1.572 ms
   3  abilene-wash.ncni.net (198.86.17.66)  9.829 ms  8.849 ms  13.470 ms  [Abilene-regional peering]
   4  nycmng-washng.abilene.ucaid.edu (198.32.8.84)  13.096 ms  27.682 ms  13.084 ms  [Abilene DC]
   5  aoa-abilene.es.net (198.124.216.117)  13.151 ms  13.154 ms  13.173 ms  [Abilene NYC]
   6  aoacr1-ge0-aoapr1.es.net (134.55.209.109)  13.269 ms  13.157 ms  13.166 ms  [Abilene -> ESnet 1 GE]
   7  chicr1-oc192-aoacr1.es.net (134.55.209.57)  33.516 ms  33.589 ms  33.579 ms  [ESnet CHI]
   8  snvcr1-oc192-chicr1.es.net (134.55.209.53)  81.528 ms  81.514 ms  81.499 ms  [ESnet SNV]
   9  lbl-snv-oc48.es.net (134.55.209.6)  82.867 ms  82.853 ms  82.959 ms  [ESnet-LBL peering]
  10  lbnl-ge-lbl2.es.net (198.129.224.1)  85.412 ms  83.736 ms  84.405 ms  [LBNL]
  11  ir1000gw.lbl.gov (131.243.128.210)  83.243 ms  82.960 ms  82.906 ms
  12  beetle.lbl.gov (131.243.2.45)  83.138 ms  83.075 ms  83.045 ms  [LBNL sentinel host]

I. Production IP: University and International Connectivity
[Map: ESnet core, Abilene core, and international links, with link speeds labeled 10 Gb/s or 2.5 Gb/s (AsiaPac, CERN, Starlight/NW, Japan, Europe). Legend: DOE Labs; network hubs; high-speed cross connects with Internet2/Abilene; ESnet/Qwest; Abilene.]

I. Production IP: University and International Connectivity
• 10 Gb/s ring in NYC to MANLAN for
  o 10 Gb/s ESnet - Abilene x-connect
  o international links
• 10 Gb/s ring to StarLight for CERN link, etc.
  o 10 GE switch for ESnet aggregation at Starlight in the procurement process
  o 10 GE interface in ESnet Chi router in the procurement process
  o will try to get use of a second set of fibers from ESnet Chi router to Starlight so that we ...
• Status: Both of these are in progress

I. Production IP: A New ESnet Architecture
• Local rings, architected like the core, will provide multiple paths for high reliability and scalable bandwidth from the ESnet core to the sites
  o No single points of failure
  o Fiber / lambda ring based Metropolitan Area Networks can be built in several important areas
    - SF Bay Area
    - Chicago
    - Long Island
    - maybe VA
    - maybe NM

MAN Rings
• The ESnet Metropolitan Area Network (MAN) rings are a critical first step in addressing both increased bandwidth and reliability
• The MAN architecture approach is to replace the current hub and tail circuit arrangement with local fiber rings that provide
  o diverse paths to the Labs
  o multiple high-speed configurable circuits

ESnet MAN Architecture
[Diagram: a local fiber ring connecting ANL and FNAL to the ESnet core network at the Chicago hub. Labels: DOE funded CERN link; StarLight; other international peerings; core ring - MAN intersection; Qwest hubs; vendor neutral facility; spare capacity; ESnet managed circuit services; ESnet management and monitoring; ESnet production IP service; production IP circuits to site equip.; monitor; site gateway router; Site LAN.]

New ESnet Architecture - Chicago MAN as Example
[Diagram: ANL and FNAL, each with a monitor, site gateway router, Site LAN, and site equip., connected to the ESnet core through the Qwest hub, a vendor neutral telecom facility, and StarLight, with the CERN (DOE funded) link and other high-speed international peerings. Annotations: all interconnects from the sites back to the core ring are high bandwidth and have full module redundancy; current approach of point-to-point tail circuits from hub to site; no single point failure can disrupt ...]
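
The reliability argument for replacing hub-and-tail circuits with a ring can be checked with a small graph model. The sketch below is illustrative only: the three-node topologies (one hub plus ANL and FNAL) and the function names are assumptions made for this example, not ESnet's engineering configuration. It removes each link in turn and tests whether every node can still reach every other node.

```python
from collections import defaultdict


def is_connected(nodes, links):
    """Return True if every node is reachable from the first node."""
    adjacency = defaultdict(set)
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    start = nodes[0]
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for neighbor in adjacency[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen == set(nodes)


def survives_any_single_link_failure(nodes, links):
    """Remove each link in turn and check the network stays connected."""
    return all(
        is_connected(nodes, links[:i] + links[i + 1:])
        for i in range(len(links))
    )


# Hypothetical, simplified topologies for illustration.
NODES = ["hub", "ANL", "FNAL"]
HUB_AND_TAIL = [("hub", "ANL"), ("hub", "FNAL")]
RING = [("hub", "ANL"), ("ANL", "FNAL"), ("FNAL", "hub")]

print("hub-and-tail survives:", survives_any_single_link_failure(NODES, HUB_AND_TAIL))
print("ring survives:", survives_any_single_link_failure(NODES, RING))
```

In this toy model the hub-and-tail layout fails the check, because cutting either tail circuit isolates a site, while the ring passes. The model covers link failures only; a hub failure is a separate case that this sketch does not examine.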
The Case for ESnet MANs - Addressing the Requirements
• All paths are based on 10 Gb/s Ethernet interfaces that are about ½ the cost of the 10 Gb/s OC192 interfaces of the core network
  o This addresses the next increment in site access bandwidth (from 622 Mb/s and 2.5 Gb/s today to 10 Gb/s in the MANs)
• Logically the MAN ring intersects the core ring twice (though at one physical location)
  o This means that no single component or fiber failure can disrupt communication between any two points in the network
    - Today we have many single points of failure

SF BA MAN - Engineering Study Configuration
[Diagram: the logical ring over existing CENIC fiber paths. Labels: OAK Level3 POP (Emeryville); LBNL; Berkeley; JGI; National Lambda Rail; NERSC; Phase 2 adds LLNL and SNLL; Walnut Creek; Oakland; optoelectronics; 10G Ethernet switch; PAIX (Palo Alto peering point); Stanford; Sunnyvale; SLAC; 1380 Kifer (Level3 Comm. hub); 1400 Kifer (Qwest Comm., ESnet hub); 10GE; ESnet core network; ESnet T320 core router.]

Chicago MAN - Engineering Study Configuration
[Diagram: one optical fiber pair carrying DWDM between ANL, FNAL, Starlight, and the ESnet Qwest hub. Labels: shared w/ FNAL; CERN; shared w/ IWire; ESnet core; ESnet Starlight optoelectronics; Ethernet switch; site gateway router; site equip.]

I. Production IP: A New ESnet Architecture
• Status: In progress
  o Migrate site local loops to ring structured Metropolitan Area Networks and regional nets in some areas
  o Preliminary engineering study completed for San Francisco Bay Area and Chicago area
  o Have received funding to build the San Francisco Bay Area ring

I. Production IP: Long-Term ESnet Connectivity Goal
• The case for dual core rings
  o For high reliability ESnet should not depend on a single core/backbone because of the possibility of hub failure
  o ESnet needs high-speed connectivity to places where the current core does not provide access
  o A second core/backbone would provide both redundancy for highly reliable production service and extra bandwidth for high-impact science applications
    - The IP production traffic would normally use the primary core/backbone (e.g. the current Qwest ring)

I. Production IP: Long-Term ESnet Connectivity Goal
• Connecting MANs with two cores to ensure against hub failure (for example, NLR is shown as the second core, in blue, on the map)
[Map. Labels: SEA, SNV, SDG, DEN, ALB, ELP, CHI, ATL, DC, NYC, ORNL, AsiaPac, Japan, Europe, CERN/Europe. Legend: MANs; high-speed cross connects with Internet2/Abilene; major DOE Office of Science sites; ESnet/Qwest; NLR.]

The Need for Multiple Backbones
• The backbones connect to the MANs via "hubs", the router locations on the backbone ring
• These hubs present several possibilities for failure that would isolate the MAN rings from the backbone, thus breaking connectivity with the rest of ESnet for significant lengths of time
• The two most likely failures are that
  o the ESnet hub router could suffer a failure that takes it completely out of service (e.g. a backplane failure); this could result in several days of isolation of all of the sites connected to that hub
  o the hub site could be disabled by fire, physical attack, physical damage from an earthquake or tornado, etc.; this could result in several weeks or more of isolation of all of the sites connected to that hub
• A second backbone would connect to the MAN ring at a different location from the first backbone, thus mitigating the impact of a backbone hub failure

ESnet MAN Architecture with Single Core Ring
[Diagram: a Metropolitan Area Network ring of sites attached to the core ring through a hub router at a hub site. Labels: one optical fiber pair, DWDM; site Layer 2 management equipment (e.g. 10 GigEthernet switch); site Layer 3 (IP) management equipment (router); one POS flow between ESnet routers; optical channel (λ) management equipment; production IP; provisioned circuits carried over lambdas; provisioned circuits carried as tunnels through the ESnet IP backbone.]

ESnet MAN Architecture with Optimally Connected Dual Core Rings
[Diagram: a Metropolitan Area Network ring of sites intersected by core ring #1 at hub site #1 and by core ring #2 through a second hub router. Labels: production IP; provisioned circuits carried over lambdas; provisioned circuits carried as tunnels through the ESnet IP backbone.]

I. Production IP: Long-Term ESnet Connectivity Goal
• What we want
• What we will probably get
[Diagrams: two arrangements of the NLR core, ESnet core, Level3 hub, Qwest hub, and SF BA MAN; the second is labeled "SF BA MAN A or B".]

I. Production IP: Long-Term ESnet Connectivity Goal
• Using NLR as a second backbone improves reliability for sites connected to the two proposed MANs, but it is not a complete solution: instead of each core ring independently connecting to the MAN ring, the two core hubs are connected together, so the MAN is really intersected by only one ring (see diagram). This is true for both the SF Bay and Chicago MANs
• For full redundancy, we need to keep some current circuits in place to connect both cores to the MAN ring, as in the diagram
[Diagram, SF Bay Area example: ESnet core; NLR core; North Bay site (NERSC, JGI, LBNL); Level3 hub; Qwest hub; SF BA MAN; existing Qwest circuit.]

Tactics
• The planned Southern core route upgrade from OC48 (2.5 Gb/s) to OC192 (10 Gb/s) will cost nearly $3M
  o This is the equipment cost for ESnet
  o This has nothing to do with the So. core route per se, which remains absolutely essential to ESnet
  o The Qwest optical ring ("Qwave service"), what I refer to as the No. core route and the So. core route, is the basis of ESnet's high-speed production IP service.
    And this ring, or something like it, will continue to be at the heart of ESnet's production service.

Tactics
• What benefit will this upgrade have for ESnet science users?
• The answer, now that ORNL will be peering with ESnet at 10 Gb/s in Chicago, is that this upgrade will have zero positive impact on OSC science users.
  o With ORNL connecting at Atlanta, there was a strong case for OC192 on the So. core route. However, their strategy has changed, and they are now connecting to the No. core route.
  o Therefore, while originally the upgrade made sense, it no longer does. 2.5 Gb/s on the So. route is adequate for the foreseeable future.
  o All that is happening here is that the networking situation with the OSC Labs has changed fairly significantly over the last several years, and we are just adapting our planning to accommodate those changes.

[Map: Northern core route; Southern core route]

Tactics
• ESnet will postpone the southern route upgrade
  1) Pursue getting a lambda on NLR from Chicago to Seattle to Sunnyvale to San Diego
    o This will have considerable positive impact on our science users. It will give us
      - a) a high-speed link from SNV to Seattle and San Diego (we currently have a ridiculous OC3)
      - b) the potential to provide alternate backbone service to the MANs
      - c) the ability to get PNNL on the ESnet core at high speed
      - d) another resource on which we can provision end-to-end circuits for high impact science
  2) Collaborate with NYSERNet to build a MAN around Long Island, which will give us the means to get BNL on the ESnet core network at high speed.

Tactics
• If it turns out that the NNSA labs in the SW need more bandwidth to the ESnet core in the future, we can always upgrade the So. core route piecemeal, starting with the El Paso to Sunnyvale link.
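
For reference, the OC-n designations used in the Tactics slides follow the SONET rule that an OC-n line runs at n x 51.84 Mb/s. The sketch below (function names are hypothetical, chosen for this example) computes the line rates for OC3, OC48, and OC192 and the volume each could move per day at full line rate. It ignores SONET and protocol overhead, which is an assumption that somewhat overstates usable throughput.

```python
OC1_MBPS = 51.84  # SONET OC-1 line rate in Mb/s
SECONDS_PER_DAY = 24 * 60 * 60


def oc_line_rate_mbps(n):
    """Line rate of an OC-n circuit in Mb/s (n times the OC-1 rate)."""
    return n * OC1_MBPS


def tbytes_per_day(rate_mbps):
    """Terabytes (10**12 bytes) moved in one day at a sustained rate.

    Assumes the full line rate is usable, i.e. no framing or protocol overhead.
    """
    bits_per_day = rate_mbps * 1e6 * SECONDS_PER_DAY
    return bits_per_day / 8 / 1e12


for n in (3, 48, 192):
    rate = oc_line_rate_mbps(n)
    print(f"OC{n}: {rate:.2f} Mb/s, about {tbytes_per_day(rate):.1f} TBy/day")
```

At OC192 this works out to a little over 100 TBy/day, which is consistent with the "about 100 TBy/day" figure quoted for 10 Gb/s connections in the Point to Point Connections slide.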
Tactics: Leverage and Amplify Non-ESnet Network Connectivity to Labs
• When ESnet has not been able to afford to increase the site bandwidth, the Labs have sometimes gotten their own high-speed connections
• ESnet can take advantage of this to provide reliable, production high-speed access to the Labs
• When possible, incorporate the existing non-ESnet connections into the new ESnet architecture to provide a better and more capable service than the Labs can provide on their own
  o ANL, SLAC, LANL, PNNL, FNAL, ORNL
  o BNL, JLab

Tactics: ORNL Connection to ESnet
• The ORNL contributed circuit + the existing ESnet circuit effectively incorporate ORNL into a secondary ESnet core ring
[Map. Labels: SEA, SNV, SDG, DEN, ALB, ELP, CHI, ATL, DC, NYC, ORNL, AsiaPac, Japan, Europe, CERN/Europe. Legend: MANs; high-speed cross connects with Internet2/Abilene; major DOE Office of Science sites; ESnet/Qwest; NLR.]

Outline
• Trends, Opportunities, and Pressures
• ESnet is Driven by the Needs of DOE Science
• New Strategic Directions for ESnet
  o I. Capable, scalable, and reliable production IP networking
  o II. Network support of high-impact science
  o III. Evolution to optical switched networks
  o IV. Science Services to support Grids, collaboratories, etc.
• Draft Outline Strategy, 2005-2010

II. Network Support of High-Impact Science
• Dynamic provisioning of private "circuits" in the MAN and through the core can provide "high impact science" connections with Quality of Service guarantees
  o A few high and guaranteed bandwidth circuits and many lower bandwidth circuits (e.g. for video, remote instrument operation, etc.)
  o The circuits are secure and end-to-end, so if
    - the sites trust each other, and
    - they have compatible security policies
    then they should be able to establish direct connections by going around site firewalls to connect specific systems, e.g. HPSS <-> HPSS

II. Hi-Impact Science Bandwidth
[Diagram: a private "circuit" from one system to another, under a common security policy, between two sites. Each site is shown with a MAN optical fiber ring, circuit cross connect, ESnet border, DMZ, site gateway router, Site LAN, and a specific host, instrument, etc. Between them: the production IP network and the ESnet core, with New York (AOA), Washington, Atlanta (ATL), and El Paso (ELP).]

II. Network Support of High-Impact Science
• Status: Initial progress
  o Proposal funded by MICS Network R&D program for initial development of basic circuit provisioning infrastructure in ESnet core network (site to site)
  o Will work with UltraScience Net to import advanced services technology

ESnet On-Demand Secure Circuits and Advance Reservation System (OSCARS)
• A typical path setup will proceed as follows
• A user submits a request to the ESnet Reservation Manager (RM) (using an optional web front-end) to schedule an end-to-end path (e.g. between an experiment and a computing cluster), specifying start and end times, bandwidth requirements, and the specific source IP address and port that will be used to provide application access to the path.
• At the requested start time, the RM will configure the ESnet router (at the start end of the path) to create a Label Switched Path (LSP) with the specified bandwidth.
• Each router along the route receives the path setup request (via RSVP) and commits bandwidth (if available), creating an end-to-end LSP. The RM will be notified by RSVP if the end-to-end path cannot be established. The RM will then pass on this information to the user.
• Packets from the source (e.g. experiment) will be routed through the LAN's production path to ESnet's edge router. On entering the edge router, these packets are identified and filtered using flow specification parameters (e.g. source/destination IP address/port numbers) and policed at the specified bandwidth.
  The packets are then injected into the LSP and switched (using MPLS) through the network to their destination (e.g. computing cluster).

ESnet On-Demand Secure Circuits and Advance Reservation System
[Figure]

ESnet On-Demand Secure Circuits and Advance Reservation System
• Issues
  o Scalability in numbers of paths may require shapers as part of the architecture
  o Allocation management (!)
  o In a single lambda MAN, we may have to put a router at the site (previous thinking was no router at the sites as a cost savings, just Ethernet switches); otherwise you cannot carry the circuit all the way to the site

III. Evolution to Optical Switched Networks
• Optical transparency
  o On-demand, rapid setup of "transparent" optical paths
  o G.709 standard optical interfaces: evolution of SONET for optical networks

III. Evolution to Optical Switched Networks
• Partnership with DOE's network R&D program
  o ESnet will cross-connect with UltraNet / National Lambda Rail in Chicago and Sunnyvale, CA
  o ESnet can experiment with UltraScience Net virtual circuits tunneled through the ESnet core (up to 5 Gb/s between UltraNet and appropriately connected Labs)
  o One important element of importing DOE R&D into ESnet
  o Status: In progress
    - Chicago ESnet - NLR/UltraNet x-connect based on the IWire ring is engineered
    - Qwest - ESnet Sunnyvale hub x-connect is dependent on Qwest permission, which is being negotiated (almost complete)

III. Evolution to Optical Switched Networks
• ESnet is building partnerships with the Federal and academic R&D networks in addition to DOE network R&D programs and UltraScienceNet
  o Internet2 Hybrid Optical Packet Internet (HOPI) and National Lambda Rail for R&D on the next generation hybrid IP packet / circuit switched networks
    - ESnet will participate in the Internet2 HOPI design team (where UltraScience Net also participates)
  o ESnet co-organized a Federal networking workshop on the future issues for interoperability of Optical Switched Networks
  o Lots of good material at the JET Web site
• These partnerships will provide ESnet with direct access to, and participation in, next generation technology for evaluation and early deployment in ESnet

III. Evolution to Optical Switched Networks: UltraNet - ESnet Interconnects
[Map. Labels: SEA, SNV, SDG, DEN, ALB, ELP, CHI, ATL, DC, NYC, ORNL, AsiaPac, Japan, Europe, CERN/Europe. Legend: MANs; ESnet - UltraScienceNet cross connects; high-speed cross connects with Internet2/Abilene; major DOE Office of Science sites; ESnet/Qwest; NLR; UltraNet.]

Conclusions
• ESnet is working hard to meet the current and future networking needs of DOE mission science in several ways:
  o Evolving a new high speed, high reliability, leveraged architecture
  o Championing several new initiatives that will keep ESnet's contributions relevant to the needs of our community