Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Richard Cavanaugh University of Florida HENP SIG Spring Internet2 Meeting 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 1 Outline • • • • • The Project Science Drivers UltraLight Network Grid-enabled Analysis Summary 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 2 The Project • UltraLight is – A four year $2M NSF ITR funded by MPS – Application driven Network R&D • Two Primary, Synergistic Activities – Network “Backbone”: Perform network R&D / engineering – Applications “Driver”: System Services R&D / engineering • Ultimate goal : Enable physics analysis and discoveries which could not otherwise be achieved 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 3 NSF/PHY-EPP: Caltech, UF, UM, FIU, FNAL, SLAC et al. Partnership • A New Class of Integrated Information Systems – Includes the Network as an Actively Managed Resource – Based on a “Hybrid” packet-switched and circuit-switched optical network infrastructure • Ultrascale Protocols (e.g. FAST) and Dynamic Optical Paths – Monitor, Manage and Optimize the Use of the Network and Grid Systems in real-time • Using a set of Agent-Based Intelligent Global Services – Built on Top of an already-existing, developing software infrastructure in round-the-clock operation: • MonALISA, GEMS, GAE/Clarens • Exceptional Support from Cisco Advanced Research and Technology Infrastructure (ARTI) Group & Calient • Exceptional NLR, CENIC, Internet2/Abilene, ESnet Support 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 4 UltraLight Activities TEAMS: Physicists, Computer Scientists, Network Engineers • High Energy Physics Application Services – Integrate and Develop physics applications into the UltraLight Fabric: Production Codes, Grid-enabled analysis, User Interfaces to Fabric • E-VLBI Application Services – Integrate and Develop eVLBI applications into the UltraLight Fabric: Specific Protocols and Bandwidth Management Techniques • Global System Services – Critical “Upperware” software components in the UltraLight Fabric: Monitoring, Scheduling, Agent-based Services, etc. • Network Engineering – Routing, Switching, Dynamic Path Construction Ops., Management • Testbed Deployment and Operations – Including Optical Network, Compute Cluster, Storage, Kernel and UltraLight System Software Configs. • Education and Outreach 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 5 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 6 Evolving Quantitative Science Requirements for Networks (DOE High Perf. Network Workshop) Today End2End Throughput 5 years End2End Throughput 5-10 Years End2End Throughput Remarks High Energy Physics 0.5 Gb/s 100 Gb/s 1000 Gb/s High bulk throughput Climate (Data & Computation) 0.5 Gb/s 160-200 Gb/s N x 1000 Gb/s High bulk throughput SNS NanoScience Not yet started 1 Gb/s 1000 Gb/s + QoS for Control Channel Remote control and time critical throughput Fusion Energy 0.066 Gb/s (500 MB/s burst) 0.198 Gb/s (500MB/ 20 sec. burst) N x 1000 Gb/s Time critical throughput Astrophysics 0.013 Gb/s (1 TByte/week) N*N multicast 1000 Gb/s Computat’l steering and collaborations Genomics Data & Computation 0.091 Gb/s (1 TBy/day) 100s of users 1000 Gb/s + QoS for Control Channel High throughput and steering Science Areas 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 7 History of Bandwidth Usage – One Large Network; One large Research Site ESnet Accepted Traffic 1990 – 2004 Exponential Growth Since ’92; Annual Rate Increased from 1.7 to 2.0X Per Year In the Last 5 Years L. Cottrell ESnet Monthly Accepted Traffic Through W. Johnston Nov, 2004 10 Gbps W. Johnston 400 300 250 Progress in Steps 200 150 100 SLAC 50 Nov, 04 Jan, 04 Jun, 04 Mar, 03 Aug, 03 May,02 Oct, 02 Jul, 01 Dec, 01 Feb, 01 Apr, 00 Sep, 00 Nov, 99 Jan, 99 Jun, 99 Aug, 98 Oct, 97 Mar, 98 May, 97 Jul, 96 03.05.2005 Dec, 96 Feb, 96 Apr, 95 Sep, 95 Nov, 94 Jan, 94 0 Jun, 94 TByte/Month 350 Traffic ~400 Mbps; Growth in Steps (ESNet Limit): ~ 10X/4 Years. July 2005:2x10 Gbps links: one for production and one for research Projected: ~2 Terabits/s by ~2014 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 8 101 Gigabit Per Second Mark! 2.95 101 Gbps Unstable end-sytems Gbps to+from Rio de Janeiro 1.0 Gbps to Seoul END of demo 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting Source: SC04 Bandwidth Challenge committee 9 MonALISA: SC04 Monitoring Monitoring SCINet, NLR, Abilene, LHCNet, CHEPREO, Int’l NRNs and 9000 Grid Nodes Simultaneously 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 10 The UltraLight Network • UltraLight has a non-standard core network with dynamic links and varying bandwidth inter-connecting our nodes. Optical Hybrid Global Network • Core of UltraLight will dynamically evolve with available resources on other backbones – such as NLR, HOPI, Abilene or ESnet. • The main resources for UltraLight: – LHCnet (IP, L2VPN, CCC) – Abilene (IP, L2VPN) – ESnet (IP, L2VPN) – Cisco NLR wave (Ethernet) – HOPI NLR waves (Ethernet; provisioned on demand) – UltraLight nodes: Caltech, SLAC, FNAL, UF, UM, StarLight, CENIC PoP at LA, CERN 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 11 UltraLight Network Engineering • GOAL: Determine an effective mix of bandwidth-management techniques for the LHC application-space, particularly: – Best-effort and “scavenger” using “effective” protocols – MPLS with QOS-enabled packet switching – Dedicated paths arranged with TL1 commands, GMPLS • PLAN: Develop, test most cost-effective combination of network technologies: – Exercise UltraLight applications on NLR, Abilene and campus networks, as well as LHCNet, and our international partners • Progressively enhance Abilene with QOS support to protect production traffic • Incorporate emerging NLR and RON-based lightpath and lambda facilities – Deploy and study ultrascale protocol stacks (such as FAST) addressing issues of performance & fairness – Use MPLS/QoS and other forms of BW management, and adjustments of optical paths, to optimize end-to-end performance among a set of virtualized disk servers 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 12 UltraLight Network Infrastructure Elements • Trans-US 10G λs Riding on NLR, Plus CENIC, FLR, MiLR – LA – CHI (2 Waves): HOPI and Cisco Research Waves – CHI – JAX (Florida Lambda Rail) • Dark Fiber Caltech – L.A.: 2 X 10G Waves (One to WAN In Lab); 10G Wave L.A. to Sunnyvale for UltraScience Net Connection • Dark Fiber with 10G Wave: StarLight – Fermilab • Dedicated Wave StarLight – Michigan Light Rail • SLAC: ESnet MAN to Provide 2 X 10G Links (from July): One for Production, and One for Research • Partner with Advanced Research & Production Networks – LHCNet (Starlight- CERN), Abilene/HOPI, ESnet, NetherLight, GLIF, UKLight, CA*net4 – Intercont’l extensions: Brazil (CHEPREO/WHREN), GLORIAD, Tokyo, AARNet, Taiwan, China 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 13 UltraLight Testbed Now to the Driving Application Work… 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 14 Grid Analysis Environment 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 15 UltraLight Analysis Architecture Analysis Client - Discovery, - Acl management, - Certificate based access HTTP, SOAP, XML-RPC Grid Services Web Server Scheduler Catalogs FullyAbstract Planner Metadata PartiallyAbstract Planner Data Mngmt FullyConcrete Planner Analysis Client Virtual Data Monitoring Replica Execution Priority Manager 03.05.2005 Grid Wide Execution Service Applications • Clients talk standard protocols to “Grid Services Web Server”, • Simple Web service API allows simple or complex analysis clients • Typical clients: ROOT, Web Browser, …. • Clarens portal hides complexity • Key features: Global Scheduler, Catalogs, Monitoring, Grid-wide Execution service. R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 16 Peer 2 Peer System • Allow a “Peer-toPeer” configuration to be built, with associated robustness and scalability features. • Discovery of Services • No Single point of failure • Robust file download 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 17 UltraLight Application Services UltraLight Application-layer Global Services Services Make UltraLight Available to Physics Applications & their Environments ROOT IGUANA COBRA ATHENA Other apps. Request Planning Services Network Management Network Access Workflow Management Storage Access Execution Services Intelligent Agents End-to-end Monitoring Application Interface UltraLight Infrastructure End-to-end Monitoring • Networking Resources Storage Resources Application Frameworks Augmented to Interact Effectively with the Global Services (GS) – GS Interact in Turn with the Storage Access & Local Execution Service Layers 03.05.2005 • Computation Resources Apps. Provide Hints to High-Level Services About Requirements – – Interfaced also to managed Net and Storage services Allows effective caching, pre-fetching; opportunities for global and local optimization of thruput R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 18 GAE and Ultralight Make the Network an Integrated Managed Resource Application Interfaces Monitor • Unpredictable multi user analysis • Overall demand typically fills the capacity of the resources • Real time monitor systems for networks, storage, computing resources,… : E2E monitoring Request Planning Network Planning Network Resources Support data transfers ranging from the (predictable) movement of large scale (simulated and real) data, to the highly dynamic analysis tasks initiated by rapidly changing teams of scientists 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 19 MonALISA: Monitoring Agents using a Large Integrated Services Architecture • MonALISA able to dynamically – register & discover • Based on multi-threaded engine – Very scalable • Services are self describing • Code updates – Automatic & secure • Dynamic config for services – Secure Admin Interface • Active filter agents – Process data – Application specific monitoring 03.05.2005 • Mobile agents – decision support – global optimisations Fully distributed, noHENP single point of failure! R. Cavanaugh, SIG, Spring Internet2 Meeting 20 MonALISA Example: Monitoring Network Topology, Latencies, Routers MonALISA 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 21 MonALISA Services Integrated with Network Services • Dedicated modules to monitor and control optical switches • Used to control – CALIENT switch @ CIT – GLIMMERGLASS switch @ CERN • ML agent system – Used to create global path – Algorithm can be extended to include prioritisation and preallocation 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 22 (Physics) Analysis on the Grid Move from Existing Components to a Coherent System • Catalogs to select datasets, • Resource & Application Discovery • Schedulers guide jobs to resources • Policies enable “fair” access to resources • Robust (large size) data (set) transfer • • • Client Application 8 2 1 3 Discovery Dataset service 7 Catalogs 4 9 Planner/ Scheduler Job Submission 6 Execution Storage Management 5 5 Policy Feedback to users (e.g. status of their jobs) Crash recovery of components (identify and restart) Provide secure authorized access to resources and services. 03.05.2005 Steering R. Cavanaugh, HENP SIG, Spring Monitor Information Data Transfer Storage Management Ultralight core : data transfer, planning scheduling, (sophisticated) policy management on VO level, integration More sophisticated componentsInternet2 & services in years 3-4 Meeting 23 Clarens Web-Service Backbone The glue which holds everything together • • • • • • • X509 Cert based access Good Performance Access Control Management Remote File Access Dyanamic Discovery of Services on a Global Scale Available in Python and Java Easy to install and part of VDT distribution: – – • • wget -q -O http://hepgrid1.caltech.edu/clarens/setup_clump.sh |sh export opkg_root=/opt/openpkg Interoperability with other web service environments such as Globus, through SOAP Interoperability with MonALISA Monitoring Clarens parameters Service publication 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 24 Many GAE Services Integrated with Network Services • • • • • • • • Core Clarens services (Including a shell service and remote file access) File catalog service (contains dataset information) Sphinx Scheduler (UFL) Service based scheduler Job Submission BOSS (Collaboration with INFN) Root Clarens client Caves (UFL) Analysis code and command sharing environment Steering service. First prototype of steering service Discovery Service UltraLight focuses on integration and sophisticated automated decisions based on monitor information 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 25 Example : Remote Data File Access via Clarens 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 26 Example: Sphinx Scheduling Service • Functions as like a Nerve Centre • Data Warehouse – Policies, Account Information, Grid Weather, Resource Properties and Status, Request Tracking, Workflows, etc Clarens WS Backbone “?” Grid Client • Applies Data Mining methods Recommendation Engine • Flexible Framework: – Client (request/job submission) Grid Resource • Clarens Web Service • Grid Clients – Scheduling Service • Clarens Web Service • MonALISA Monitoring Repository Grid Resource Grid Resource – Grid Resource • MonALISA Monitoring Service • Grid Services 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting MonALISA Monitoring Backbone 27 UltraLight Plans • UltraLight envisions a 4 year program to deliver a new, high-performance, network-integrated infrastructure: • Phase I will last 12 months and focus on deploying the initial network infrastructure and bringing up first services • Phase II will last 18 months and concentrate on implementing all the needed services and extending the infrastructure to additional sites (We are entering this phase starting approximately this summer) • Phase III will complete UltraLight and last 18 months. The focus will be on a transition to production in support of LHC Physics; + eVLBI Astronomy 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 28 Summary • For many years the Wide Area Network has been the bottleneck; this is no longer the case in many countries – Deployment of a data intensive Grid infrastructure is now possible! – Recent I2LSR records show for the first time ever that the network can be truly transparent; throughputs are limited by end-hosts – Challenge shifted from getting adequate bandwidth to deploying adequate infrastructure to make effective use of it! • UltraLight promises to deliver the critical missing component for future eScience: the integrated, managed network – Next generation Network and Grid system – Extend and augment existing grid computing infrastructures (currently focused on CPU/storage) to include the network as an integral component 03.05.2005 R. Cavanaugh, HENP SIG, Spring Internet2 Meeting 29