ESnet: DOE's Science Network
GNEW, March 2004
William E. Johnston, ESnet Manager and Senior Scientist
Michael S. Collins, Stan Kluz, Joseph Burrescia, and James V. Gagliardi, ESnet Leads
and the ESnet Team
Lawrence Berkeley National Laboratory

ESnet Provides
• High bandwidth backbone and connections for Office of Science Labs and programs
• High bandwidth peering with the US, European, and Japanese Research and Education networks
• SecureNet (DOE classified R&D) as an overlay network
• Science services: Grid and collaboration services
• User support: ESnet "owns" all network trouble tickets (even from end users) until they are resolved
  o one stop shopping for user network problems
  o 7x24 coverage
  o Both network and science services problems

ESnet Connects DOE Facilities and Collaborators
[Figure: map of the ESnet backbone (optical ring and hubs), end user sites, peering points, and international connections.]
International and R&E connections shown: CA*net4, KDDI (Japan), France, Switzerland, Taiwan (TANet2), Australia, Singaren, GEANT (Germany, France, Italy, UK, etc.), Sinet (Japan), Japan - Russia (BINP), CERN, MREN, Netherlands, Russia, StarTap, Taiwan (ASCC), Abilene.
Sites, hubs, and peering points shown: LIGO, PNNL, MIT, JGI, LBNL, NERSC, SLAC, FNAL, ANL, ANL-DC, INEEL-DC, ORAU-DC, LLNL/LANL-DC, SNLL, LLNL, AMES, BNL, PPPL, 4xLAB-DC, GTN&NNSA, KCP, YUCCA MT, JLAB, ORNL, LANL, SDSC, GA, OSTI, ARM, SNLA, ORAU, NOAA, SRS, Allied Signal, QWEST ATM, NY-NAP, MAE-E, MAE-W, PAIX-E, ALB HUB.
42 end user sites:
  o Office of Science sponsored (22)
  o NNSA sponsored (12)
  o Joint sponsored (3)
  o Other sponsored (NSF LIGO, NOAA)
  o Laboratory sponsored (6)
Link legend: International (high speed); OC192 (10 Gb/s optical); OC48 (2.5 Gb/s optical); Gigabit Ethernet (1 Gb/s); OC12 ATM (622 Mb/s); OC12; OC3 (155 Mb/s); T3 (45 Mb/s); T1-T3; T1 (1.5 Mb/s).

ESnet is Driven by the Needs of DOE Science
August 13-15, 2002
Organized by Office of Science: Mary Anne Scott (Chair), Dave Bader, Steve Eckstrand, Marvin Frazier, Dale Koelling, Vicky White
Workshop Panel Chairs: Ray Bair and Deb Agarwal; Bill Johnston and Mike Wilde; Rick Stevens; Ian Foster and Dennis Gannon; Linda Winkler and
Brian Tierney; Sandy Merola and Charlie Catlett
Available at www.es.net/#research
Focused on science requirements that drive
• Advanced Network Infrastructure
• Middleware Research
• Network Research
• Network Governance Model

Eight Major DOE Science Areas Analyzed at the August '02 Workshop
Table columns: Feature / Discipline; Vision for the Future Process of Science; Characteristics that Motivate High Speed Nets; Requirements (Networking, Middleware)

Climate (near term)
  Vision: Analysis of model data by selected communities that have high speed networking (e.g. NCAR and NERSC)
  Characteristics:
    • A few data repositories, many distributed computing sites
    • NCAR - 20 TBy
    • NERSC - 40 TBy
    • ORNL - 40 TBy
  Networking:
    • Authenticated data streams for easier site access through firewalls
  Middleware:
    • Server side data processing (computing and cache embedded in the net)
    • Information servers for global data catalogues

Climate (5 yr)
  Vision: Enable the analysis of model data by all of the collaborating community
  Characteristics:
    • Add many simulation elements/components as understanding increases
    • 100 TBy / 100 yr generated simulation data, 1-5 PBy / yr (just at NCAR)
      o Distribute large chunks of data to major users for post-simulation analysis
  Networking:
    • Robust access to large quantities of data
  Middleware:
    • Reliable data/file transfer (across system / network failures)

Climate (5+ yr)
  Vision: Integrated climate simulation that includes all high-impact factors
  Characteristics:
    • 5-10 PBy/yr (at NCAR)
    • Add many diverse simulation elements/components, including from other disciplines - this must be done with distributed, multidisciplinary simulation
    • Virtualized data to reduce storage load
  Networking:
    • Robust networks supporting distributed simulation - adequate bandwidth and latency for remote analysis and visualization of massive datasets
    • Quality of service guarantees for distributed simulations
  Middleware:
    • Virtual data catalogues and work planners for reconstituting the data on demand

Evolving Qualitative Requirements for Network Infrastructure
Figure labels: 1-40 Gb/s, end-to-end; 1-3 yrs; 2-4 yrs; guaranteed bandwidth
paths. Figure legend: S = storage, C = compute, I = instrument, plus cache & compute elements; 100-200 Gb/s, end-to-end.
• In the near term applications need high bandwidth
• 2-4 yrs requirement is for high bandwidth and QoS.
• 3-5 yrs requirement is for high bandwidth and QoS and network resident cache and compute elements.
• 4-7 yrs requirement is for high bandwidth and QoS and network resident cache and compute elements, and robust bandwidth (multiple paths)

Evolving Quantitative Science Requirements for Networks
Science Area | Today End2End Throughput | 5 Years End2End Throughput | 5-10 Years End2End Throughput | Remarks
High Energy Physics | 0.5 Gb/s | 100 Gb/s | 1000 Gb/s | high bulk throughput
Climate (Data & Computation) | 0.5 Gb/s | 160-200 Gb/s | N x 1000 Gb/s | high bulk throughput
SNS NanoScience | Not yet started | 1 Gb/s | 1000 Gb/s + QoS for control channel | remote control and time critical throughput
Fusion Energy | 0.066 Gb/s (500 MB/s burst) | 0.198 Gb/s (500 MB / 20 sec. burst) | N x 1000 Gb/s | time critical throughput
Astrophysics | 0.013 Gb/s (1 TBy/week) | N*N multicast | 1000 Gb/s | computational steering and collaborations
Genomics Data & Computation | 0.091 Gb/s (1 TBy/day) | 100s of users | 1000 Gb/s + QoS for control channel | high throughput and steering

New Strategic Directions to Address Needs of DOE Science
June 3-5, 2003
Organized by the ESSC
Workshop Chair: Roy Whitney, JLAB
Report Editors: Roy Whitney, JLAB; Larry Price, ANL
Workshop Panel Chairs: Wu-chun Feng, LANL; William Johnston, LBNL; Nagi Rao, ORNL; David Schissel, GA; Vicky White, FNAL; Dean Williams, LLNL
Focused on what was needed to achieve the science driven network requirements of the previous workshop
• Both Workshop reports are available at www.es.net/#research

ESnet Strategic Directions
• Developing a 5 yr.
strategic plan for how to provide the required capabilities identified by the workshops
  o Between DOE Labs and their major collaborators in the University community we must address
    - Scalable bandwidth
    - Reliability
    - Quality of Service
  o Must address an appropriate set of Grid and human collaboration supporting middleware services

ESnet Connects DOE Facilities and Collaborators
[Figure: the ESnet map of sites, hubs, peering points, and international connections, with the same sponsor counts and link legend as in its first appearance.]

While ESnet Has One Backbone Provider, there are Many Local Loop Providers to Get to the Sites
[Figure: map of the hubs (SNV HUB, CHI HUB, DC HUB, ALB HUB) and the sites they serve, with each local loop marked by its provider.]
Provider legend: Qwest Owned; Qwest Contracted; Touch America (bankrupt); MCI Contracted/Owned; Site Contracted/Owned; SBC (PacBell) Contracted/Owned; FTS2000 Contracted/Owned; SPRINT Contracted/Owned; Level3.

ESnet Logical Infrastructure Connects the DOE Community With its Collaborators
ESnet provides complete access to the Internet by managing the full
complement of Global Internet routes (about 150,000) at 10 general/commercial peering points, plus high-speed peerings with Abilene and the international networks.

ESnet Traffic
[Figure: ESnet Monthly Accepted Traffic Through Dec. 2003. Vertical axis: TByte/Month, 0 to 300. Horizontal axis: Jan 1990 to Jan 2004.]
Annual growth in the past five years has increased from 1.7x annually to just over 2.0x annually.

Who Generates Traffic, and Where Does it Go?
[Figure: ESnet Inter-Sector Traffic Summary, Jan 2003. Flows between DOE sites and the Commercial, R&E, and International sectors through ESnet and its peering points. Percentage labels in the figure: 72%, 21%, ~25%, 14%, 17%, 10%, 53%, 9%, 4%.]
• DOE is a net supplier of data because DOE facilities are used by University and commercial researchers, as well as by DOE researchers
• DOE collaborator traffic, including data

ESnet Appropriate Use Policy (AUP)
All ESnet traffic must originate and/or terminate on an ESnet site (no transit traffic is allowed). E.g.
a commercial site cannot exchange traffic with an international site across ESnet. This is effected via routing restrictions.
Figure legend: ESnet Ingress Traffic = Green; ESnet Egress Traffic = Blue; Traffic between sites; % = of total ingress or egress traffic.

ESnet Site Architecture
[Figure: the backbone (optical fiber ring) linking the hubs New York (AOA), Chicago (CHI), Washington, DC (DC), Atlanta (ATL), El Paso (ELP), and Sunnyvale (SNV), with the path from a hub to a site.]
• The Hubs have lots of connections (42 in all)
• Hubs: backbone routers and local loop connection points
• Path from hub to site: Local loop (Hub to local site), ESnet border router, DMZ, Site gateway router, Site LAN
• The figure marks which elements are an ESnet responsibility and which are a site responsibility

SecureNet
• SecureNet connects 10 NNSA (Defense Programs) Labs
• Essentially a VPN with special encrypters
  o The NNSA sites exchange encrypted ATM traffic
  o The data is unclassified when ESnet gets it because it is encrypted before it leaves the NNSA sites with an NSA certified encrypter
• Runs over the ESnet core backbone as a layer 2 overlay: the SecureNet encrypted ATM is transported over ESnet's Packet-Over-SONET infrastructure by encapsulating the ATM in MPLS using Juniper CCC

SecureNet - Mid 2003
[Figure: Primary SecureNet Path and Backup SecureNet Path over AOA-HUB, CHI-HUB, SNV-HUB, DC-HUB, ATL-HUB, and ELP-HUB, serving GTN, LLNL, SNLL, ORNL, KCP, DOE-AL, Pantex, LANL, SNLA, and SRS.]
SecureNet encapsulates payload encrypted ATM in MPLS using the Juniper Router Circuit Cross Connect (CCC) feature.
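The layer 2 overlay described above can be pictured as prepending an MPLS label to each already-encrypted ATM cell, so that backbone routers forward on the label and never inspect the payload. The sketch below is illustrative only: it uses the generic 32-bit MPLS label stack entry (20-bit label, 3-bit traffic class, bottom-of-stack bit, 8-bit TTL) and treats the 53-byte cell as opaque bytes. The slides do not describe the actual Juniper CCC framing, so the function names, the label value, and the one-cell-per-frame layout are assumptions made for the example.

```python
from __future__ import annotations

import struct

ATM_CELL_BYTES = 53  # standard ATM cell: 5-byte header + 48-byte payload


def mpls_label_entry(label: int, tc: int = 0, bottom: bool = True, ttl: int = 64) -> bytes:
    """Pack one 32-bit MPLS label stack entry: label | tc | bottom-of-stack | ttl."""
    if not 0 <= label < 2**20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8:
        raise ValueError("traffic class must fit in 3 bits")
    if not 0 <= ttl < 256:
        raise ValueError("ttl must fit in 8 bits")
    word = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)  # network byte order


def encapsulate(cell: bytes, label: int) -> bytes:
    """Prepend a label to an opaque (already encrypted) ATM cell.

    Hypothetical framing for illustration; not the CCC wire format.
    """
    if len(cell) != ATM_CELL_BYTES:
        raise ValueError("expected a 53-byte ATM cell")
    return mpls_label_entry(label) + cell


def decapsulate(frame: bytes) -> tuple[int, bytes]:
    """Return (label, cell) from a frame produced by encapsulate()."""
    (word,) = struct.unpack("!I", frame[:4])
    return word >> 12, frame[4:]


if __name__ == "__main__":
    cell = bytes(range(ATM_CELL_BYTES))  # stand-in for an encrypted cell
    frame = encapsulate(cell, label=100)
    label, recovered = decapsulate(frame)
    print(label, len(frame), recovered == cell)
```

The point of the design is that the core only needs the four label bytes to make a forwarding decision; the cell contents pass through unchanged, which is consistent with the slides' statement that the data is unclassified by the time ESnet carries it.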
IPv6-ESnet Backbone
[Figure: IPv6 topology over the hubs Sunnyvale, Chicago, New York, DC, Albuquerque, El Paso, and Atlanta, with 7206 routers, sites LBNL, LBL, SLAC, ANL, FNAL, BNL, and TWC, and peerings with 6BONE, StarLight, Distributed 6TAP, PAIX, and Abilene (peer counts shown: 9, 18, and 7). Link legend: IPv6 only; IPv4/IPv6; IPv4 only.]
• IPv6 is the next generation Internet protocol, and ESnet is working on addressing deployment issues
  - one big improvement is that while IPv4 has 32 bit (about 4x10^9) addresses, which we are running short of, IPv6 has 128 bit (about 3.4x10^38) addresses, which we are not ever likely to run short of
  - another big improvement is native support for encryption of data

Operating Science Mission Critical Infrastructure
• ESnet is a visible and critical piece of DOE science infrastructure
  o if ESnet fails, 10s of thousands of DOE and University users know it within minutes if not seconds
• Requires high reliability and high operational security in the ESnet operational services, the systems that are integral to the operation and management of the network
  o Secure and redundant mail and Web systems are central to the operation and security of ESnet
    - trouble tickets are by email
    - engineering communication by email
    - engineering database interface is via Web
  o Secure network access to Hub equipment
  o Backup secure telephony access to Hub equipment
  o 24x7 help desk (joint with NERSC)
  o 24x7 on-call network engineer

Disaster Recovery and Stability
• The network operational services must be kept available even if, e.g., the West coast is disabled by a massive earthquake, etc.
  o ESnet engineers in four locations across the country
  o Full and partial engineering databases and network operational service replicas in three locations
  o Telephone modem backup access to all hub equipment
• All core network hubs are located in commercial telecommunication facilities with high physical security and backup power

Disaster Recovery and Stability
[Figure: locations of engineers and replicated services. Locations labeled: LBNL, SNV HUB, SDSC, ALB HUB, AMES, BNL, CHI HUB, NYC HUBS, PPPL, DC HUB. Annotations: Engineers, 24x7 NOC, generator backed power; Engineers with Eng, Load, and Config servers; Remote Engineer with partial duplicate infrastructure; Remote Engineer; DNS; Duplicate Infrastructure (currently deploying full replication of the NOC databases and servers and Science Services databases).]
Operational services listed in the figure:
• Spectrum (net mgmt system)
• DNS (name to IP address translation)
• Eng database
• Load database
• Config database
• Public and private Web
• E-mail (server and archive)
• PKI cert. repository and revocation lists
• collaboratory authorization service
• ESnet backbone operated without interruption through
  o the N. Calif. power blackout of 2000
  o the 9/11 attacks
  o the Aug. 2003 NE States power blackout

Maintaining Science Mission Critical Infrastructure in the Face of Cyberattack
• A Phased Security Architecture is being implemented to protect the network and the sites
• The phased response ranges from blocking certain site traffic to a complete isolation of the network, which allows the sites to continue communicating among themselves in the face of the most virulent attacks
  o Separates ESnet core routing functionality from external Internet connections by means of a "peering" router that can have a policy different from the core routers
  o Provides a rate limited path to the external Internet that will ensure site-to-site communication during an external denial of service attack
  o Provides "lifeline" connectivity for downloading of patches, exchange of e-mail and viewing web pages (i.e., e-mail, dns, http, https, ssh, etc.)
with the external Internet prior to full isolation of the network

Phased Response to Cyberattack
[Figure: attack traffic arriving through the ESnet peering router and crossing ESnet routers and a border router to a Lab gateway router (LBNL shown as the example), with an X marking each point where traffic can be blocked.]
• Lab first response: filter incoming traffic at their ESnet gateway router
• ESnet first response: filters to assist a site
• ESnet second response: filter traffic from outside of ESnet
• ESnet third response: shut down the main peering path and provide only a limited bandwidth path for specific "lifeline" services
The Sapphire/Slammer worm infection created almost a Gb/s traffic spike on the ESnet backbone until filters were put in place (both into and out of sites) to damp it out.

Future Directions - the 5 yr Network Strategy
• Elements
  o University connectivity
  o Scalable and reliable site connectivity
  o Provisioned circuits for hi-impact science bandwidth
  o Close collaboration with the network R&D community

5 yr Strategy - Near Term Goal 1
• Connectivity between any DOE Lab and any Major University should be as good as ESnet connectivity between DOE Labs and Abilene connectivity between Universities
  o Partnership with I2/Abilene
  o Multiple high-speed peering points
  o Routing tailored to take advantage of this
  o Latency and bandwidth from DOE Lab to University should be comparable to intra ESnet or intra Abilene
  o Continuous monitoring infrastructure to verify

5 yr Strategy - Near Term Goal 2
• Connectivity between ESnet and R&D nets - a critical issue from Roadmap
  o UltraScienceNet and NLR for starters
  o Reliable, high bandwidth cross-connects
    1) IWire ring between the Qwest - ESnet Chicago hub and Starlight
       - This is also critical for DOE lab connectivity to the DOE funded LHCNet 10 Gb/s link to CERN
       - Both LHC tier 1 sites in the US (Atlas and CMS) are at DOE Labs
    2) ESnet ring between the Qwest - ESnet Sunnyvale hub and the Level 3 Sunnyvale hub that houses the West Coast POP for NLR and UltraScienceNet

5 yr Strategy
- Near-Medium Term Goal
• Scalable and reliable site connectivity
  o Fiber / lambda ring based Metropolitan Area Networks
  o Preliminary engineering study completed for San Francisco Bay Area and Chicago area
    - Proposal submitted
    - At least one of these is very likely to be funded this year
• Hi-impact science bandwidth - provisioned circuits

ESnet Future Architecture
• Migrate site local loops to ring structured Metropolitan Area Network and regional nets in some areas
  o Goal is local rings, like the backbone, that provide multiple paths
• Dynamic provisioning of private "circuits" in the MAN and through the backbone to provide "high impact science" connections
  o This should allow high bandwidth circuits to go around site firewalls to connect specific systems. The circuits are secure and end-to-end, so sites that trust each other and have compatible security policies should allow direct connections. E.g. HPSS <-> HPSS
• Partnership with DOE UltraNet, Internet 2 HOPI, and National Lambda Rail

ESnet Future Architecture
Figure labels: one optical fiber pair; DWDM providing point-to-point, unprotected circuits; site; provisioned circuits initially via MPLS paths, eventually via lambda paths; Layer 2 management equipment (e.g.
10 GigEthernet switch); Metropolitan Area Networks; site; Layer 3 (IP) management equipment (router); core ring; production IP; provisioned circuits carried over lambdas; optical channel (λ) management equipment; site; provisioned circuits carried as tunnels through the ESnet IP backbone.

ESnet MAN Architecture - Example
[Figure: a MAN ring connecting ANL and FNAL to the Qwest hub and ESnet core, and to StarLight (a vendor neutral facility) with CERN (DOE funded link) and other international peerings. Each site has monitoring, site equipment, a site gateway router, and a site LAN.]
• The ring carries the ESnet production IP service and ESnet managed λ / circuit services
• Current DMZs are back-hauled to the core router
  o Implemented via 2 VLANs, one in each direction around the ring
• Ethernet switch
  o DMZ VLANs
  o Management of provisioned circuits
• ESnet management and monitoring, partly to compensate for no site router
• ESnet managed λ / circuit services are tunneled through the IP backbone via MPLS

Future ESnet Architecture
[Figure: two sites, each reached over a MAN optical fiber ring through a circuit cross connect, ESnet border, DMZ, site gateway router, and site LAN to a specific host, instrument, etc. The ESnet backbone between them passes New York (AOA), Washington, Atlanta (ATL), and El Paso (ELP). A private "circuit" from one Lab to another runs end to end under a common security policy.]
Long-Term ESnet Connectivity Goal
• MANs for scalable bandwidth and redundant site access to backbone
• Connecting MANs with two backbones to guard against hub failure (for example NLR is shown as the second backbone below)
[Figure: major DOE Office of Science sites on MANs and local loops, attached to two backbones (Qwest and NLR), with links to Japan, Europe, and CERN/Europe and high-speed cross connects with Internet2/Abilene.]

Long-Term ESnet Bandwidth Goal
• Harvey Newman: "And what about increasing the bandwidth in the backbone?"
• Answer: technology progress
  o By 2008 (the next generation ESnet backbone) DWDM technology will be 40 Gb/s per lambda
  o And the backbone will be multiple lambdas
• Issues
  o End-to-End, end-to-end, and end-to-end

Science Services Strategy
• ESnet is in a natural position to be the provider of choice for a number of middleware services that support collaboration, collaboratories, Grids, etc.
• The characteristics of ESnet that make it a natural middleware provider are that ESnet
  o is the only computing related organization that serves all of the Office of Science
  o is trusted and well respected in the OSC community
  o has the 7x24 infrastructure required to support critical services, and
  o is a long-term stable organization.
• The characteristics of the services for which ESnet is the natural provider are those that
  o require long-term persistence of the service or the data associated with the service
  o require high availability,
  o require a high degree of integrity on the part of the provider
  o are situated at the root of a hierarchy so that the service scales in the number of people that it serves by adding nodes that are managed by local organizations (so that ESnet does not have a large and constantly growing direct user base).
Science Services Strategy
• The DOE Grids CA, which provides X.509 identity certificates to support Grid authentication, is an example of this model
  o the service requires a highly trusted provider and a high degree of availability
  o provides a centralized agent for negotiating trust relationships with, e.g., European CAs
  o it scales by adding site based or Virtual Organization based Registration Agents that interact directly with the users

Science Services: Public Key Infrastructure
• Public Key Infrastructure supports cross-site, cross-organization, and international trust relationships that permit sharing computing and data resources and other Grid services
• Digital identity certificates for people, hosts and services are an essential core service for Grid middleware
  o provides formal and verified trust management, an essential service for widely distributed heterogeneous collaboration, e.g. in the International High Energy Physics community
  o DOE Grids CA
• Have recently added a second CA with a policy that permits bulk issuing of certificates with central private key management
  o Important for secondary issuers
    - NERSC will auto issue certs when accounts are set up; this constitutes an acceptable identity verification
    - May also be needed for security domain gateways such as Kerberos to X.509, e.g. KX509

Science Services: Public Key Infrastructure
• Policy Management Authority: negotiates and manages the formal trust instrument (Certificate Policy - CP)
  o Sets and interprets procedures that are carried out by ESnet
  o Currently facing an important oversight situation involving potential compromise of user X.509 cert private keys
    - Boys-from-Brazil style exploit => keyboard sniffer on several systems that housed Grid certs
    - Is there sufficient forensic information to say that the private keys were not compromised?
      – Is any amount of forensic information sufficient to guarantee this, or should the certs be revoked?
      – Policy refinement by experience
• Registration Agents (RAs) validate users against the CP and authorize the CA to issue digital identity certs
• This service was the basis of the first routine sharing of HEP computing resources between US and Europe

Science Services: Public Key Infrastructure
• The rapidly expanding customer base of this service will soon make it ESnet's largest collaboration service by customer count

Voice, Video, and Data Collaboration Service
• The other highly successful ESnet Science Service is the audio, video, and data teleconferencing service to support human collaboration
• Seamless voice, video, and data teleconferencing is important for geographically dispersed collaborators
  o ESnet currently provides voice conferencing, videoconferencing (H.320/ISDN scheduled, H.323/IP ad-hoc), and data collaboration services to more than a thousand DOE researchers worldwide

Voice, Video, and Data Collaboration Service
  o Heavily used services, averaging around
    - 4600 port hours per month for H.320 videoconferences
    - 2000 port hours per month for audio conferences
    - 1100 port hours per month for H.323
    - approximately 200 port hours per month for data conferencing
• Web-based registration and scheduling for all of these services
  o authorizes users efficiently
  o lets them schedule meetings
Such an automated approach is essential for a scalable service: ESnet staff could never handle all of the reservations manually.

Science Services Strategy
• The Roadmap Workshop identified twelve high priority middleware services, and several of these fit the criteria for ESnet support. These include, for example
  o long-term PKI key and proxy credential management (e.g.
an adaptation of the NSF's MyProxy service)
  o directory services that virtual organizations (VOs) can use to manage organization membership, member attributes and privileges
  o perhaps some form of authorization service
  o in the future, some knowledge management services that have the characteristics of an ESnet service are also likely to be important
• ESnet is seeking the additional funding necessary to develop, deploy, and support these types of middleware services.

Conclusions
• ESnet is an infrastructure that is critical to DOE's science mission and that serves all of DOE
• Focused on the Office of Science Labs
• ESnet is evolving its architecture and services strategy to meet the stated requirements for bandwidth, reliability, QoS, and Grid and collaboration supporting services
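The bandwidth figures quoted in the talk can be cross-checked with a little arithmetic. The sketch below is a rough illustration, not part of the original slides: it converts a data volume per period into a sustained rate (assuming 1 TBy = 10^12 bytes and no protocol overhead) and counts how many 40 Gb/s lambdas would cover a stated requirement (assuming each lambda is fully usable). The function names are made up for the example.

```python
import math

SECONDS_PER_DAY = 86_400


def sustained_gbps(terabytes: float, days: float) -> float:
    """Average rate in Gb/s needed to move `terabytes` (10**12 bytes each) in `days`."""
    bits = terabytes * 1e12 * 8
    return bits / (days * SECONDS_PER_DAY) / 1e9


def lambdas_needed(required_gbps: float, gbps_per_lambda: float = 40.0) -> int:
    """Smallest number of lambdas whose combined raw capacity covers the requirement."""
    return math.ceil(required_gbps / gbps_per_lambda)


if __name__ == "__main__":
    # Volumes from the quantitative requirements table.
    print(f"1 TBy/day  = {sustained_gbps(1, 1):.3f} Gb/s")
    print(f"1 TBy/week = {sustained_gbps(1, 7):.3f} Gb/s")
    # End-to-end requirements from the same table, against 40 Gb/s lambdas.
    for need in (100, 200, 1000):
        print(f"{need} Gb/s -> {lambdas_needed(need)} lambdas at 40 Gb/s")
```

Under these assumptions 1 TBy/week works out to about 0.013 Gb/s, matching the Astrophysics entry, and 1 TBy/day to about 0.093 Gb/s, close to the 0.091 Gb/s quoted for Genomics. A 1000 Gb/s end-to-end requirement would need 25 lambdas at 40 Gb/s each, which gives a sense of what "multiple lambdas" means for the 5-10 year targets.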