Network Monitoring and GridPP
Richard Hughes-Jones, University of Manchester
6 November 2001
DataGrid WP7, MB-NG, DataTAG, GridPP
GridPP Collaboration Meeting Nov 2001 R. Hughes-Jones Manchester

DataGrid WP7 Networking
• Active and proceeding well. Meetings:
  - Oxford DataGrid meeting, Jul 01
  - CERN, 18 Sep 01
  - Frascati DataGrid meeting, Oct 01
• Provisioning
  - Reports on DataGrid network requirements and current infrastructure
  - Use of IP ports for TestBed1
• Monitoring (Robin Tasker)
• High Performance High Throughput (Richard Hughes-Jones)
• QoS and Bandwidth Reservation (Tiziana Ferrari)
• Security (Dave Kelsey)

Network Monitoring
• Several tools in test, plugged into a coherent structure: PingER, RIPE one-way times, iperf, UDPmonRE, rTPL, GridFTP, and the NWS prediction engine
• Continuous tests for the last month to selected sites: DL, Man, RL, UCL, CERN, Lyon, Bologna, SARA, NBI, SLAC …
• Discussions this week at WP7 in Amsterdam
• The aims of monitoring for the Grid:
  - to inform Grid applications, via the middleware, of the current status of the network (input for resource broker and scheduling)
  - to identify fault conditions in the operation of the Grid
  - to understand the instantaneous, day-to-day, and month-by-month behaviour of the network (provide advice on configuration etc.)
• Report written on the LDAP schema for publishing the network information

Network Monitoring Architecture
[Diagram. Labels:]
• Tools: GridFTP, PingER, (RIPE TTB), iperf, rTPL, NWS, etc.
• Local Network Monitoring: Store & Analysis of Data (Access)
• Backend LDAP script to fetch metrics
• Monitor process to push metrics
• Local LDAP Server
• Grid Apps: Grid Application access via LDAP Schema to
  - monitoring metrics;
  - location of monitoring data.
• Access to current and historic data and metrics via the Web, i.e. WP7 NM Pages; access to metric forecasts
(Robin Tasker)
How do the Grid apps access the metrics of network monitoring?
• What is the RTT viewed from UCL to RAL and QMW?
• Query an LDAP server that makes use of an LDAP Schema containing the ObjectClass RTT to find out.
• How? Proposed tree here:
[Diagram: proposed LDAP tree. Node labels: o=grid, ou=DataGrid, ou=uk, dc=ucl, dc=rl, dc=qmw, cn=netmon, rou=rl, rou=qmw, hn=host1, hn=host2]

... and find the details from the LDAP Schema for publication of the RTT metric
service=netmon, dc=ucl, ou=UK, ou=DataGrid, o=Grid
• rou=ral
  - objectclass=networkmonitorHost
  - objectclass=networkmonitorRTT
  - objectclass=networkmonitorThroughput
  - objectclass=networkmonitorLoss
• rou=qmw
  - objectclass=networkmonitorHost
  - objectclass=networkmonitorRTT
  - objectclass=networkmonitorThroughput
  - objectclass=networkmonitorLoss
[Diagram labels: PingER RTT metric, rTPL RTT metric]

... to allow a request to return a valid metric
[Diagram. Components: slapd frontend; Ftree backend; backend scripts (PingER, IperfER, rTPL), each run periodically and generating a datafile. Numbered steps:]
1) Query server
2) Sends query to backend
3) Check datafile for metric update
4) Read new metric from logfile
5) Reply to query
6) Reply to query

Data: RIPE 1-way & TCP throughput
[Plots: RIPE 1-way time (ms), Sara - RAL, 20 Oct 01; RIPE 1-way time (ms), RAL - Sara; TCP iperf + prediction (Mbit/s), UCL - Sara]

Data: Ping & UDP throughput
[Plots: PingER rtt (ms), dl - cern, 1000 byte packet, with forecast, from 20 Oct 01; UDPmon throughput (Mbit/s), man - cern, 300 * 1400 byte frames]
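The LDAP lookup and the numbered backend steps on the slides above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and class names are hypothetical, the datafile format (one record per line, metric value in the last field) is invented for the example, and no real LDAP server is contacted. Only the DN components and object class names are taken from the slides.

```python
import os

# Object classes named on the schema slide, keyed by a short metric name.
# The short keys are illustrative, not part of the proposed schema.
OBJECT_CLASSES = {
    "host": "networkmonitorHost",
    "rtt": "networkmonitorRTT",
    "throughput": "networkmonitorThroughput",
    "loss": "networkmonitorLoss",
}


def netmon_base_dn(remote, local="ucl"):
    """Base DN for metrics measured from `local` towards `remote`."""
    return f"rou={remote}, service=netmon, dc={local}, ou=UK, ou=DataGrid, o=Grid"


def metric_filter(metric):
    """LDAP search filter selecting entries of one metric object class."""
    return f"(objectclass={OBJECT_CLASSES[metric]})"


class DatafileBackend:
    """Hypothetical stand-in for the backend behind the LDAP frontend.

    Assumes a monitoring script periodically appends lines of the form
    '<timestamp> <value>' to a datafile; this format is an assumption.
    """

    def __init__(self, path):
        self.path = path
        self._mtime_ns = None   # modification time seen at the last read
        self._value = None      # most recent metric value read

    def query(self):
        # Step 3: check the datafile for a metric update.
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns != self._mtime_ns:
            # Step 4: read the new metric (last field of the last record).
            with open(self.path) as handle:
                records = [line.split() for line in handle if line.strip()]
            self._value = float(records[-1][-1])
            self._mtime_ns = mtime_ns
        # Step 5: reply to the query with the current value.
        return self._value


# The UCL view of RAL and QMW asked about on the slide.
for remote in ("ral", "qmw"):
    print(netmon_base_dn(remote), metric_filter("rtt"))
```

The base DN and filter strings are what a client would hand to any LDAP search call; caching on the file modification time keeps the backend from re-reading a datafile that the monitoring script has not touched since the last query.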
High Performance High Throughput
• Document produced outlining the tests to be made:
  - Understanding the end system HW, best way to monitor traffic and protocol packets
  - Type and effect of background traffic
  - Throughput vs rtt, throughput vs window size
  - Throughput using multiple TCP streams, and effect on the network
  - GridFTP at Gigabit
  - Effect of different TCP stacks
  - Use of non-TCP protocols, and effect on the network
• Show and tell demos
• Discussions this week at WP7 in Amsterdam
• Links to MB-NG, DataTAG, GGF/IETF

QoS and BW Reservation
• Proposed workplan presented at Oxford WP7 meeting
• Identification of traffic classes and middleware components requiring QoS
  - Discussion with other WorkPackages: definition of application requirements
  - GridFTP: packet loss
  - Interactive applications: packet loss, delay, jitter
  - Piloting of IP Premium (GEANT and NRNs)
• Study of traffic differentiation and traffic engineering techniques:
  - Layer 2 MPLS VPNs (CISCO and Juniper)
  - LAN & WAN QoS
  - Traffic classification, marking, policing, congestion control, queue scheduling, traffic aggregation
• Demonstration of QoS & Bandwidth Reservation
• Document in production outlining the tests to be made

MB-NG
• E-science core project
• Project to investigate and pilot:
  - End-to-end traffic engineering and management over multiple administrative domains: MPLS in the core, diffserv at the edges
  - Managed bandwidth and Quality-of-Service provision (Robin T)
  - High performance high bandwidth data transfers (Richard HJ)
  - Demonstrate end-to-end network services to CERN using Dante EU-DataGrid and the US DataTAG
• Status: approved with the requested funds; start on 1 Dec 2001
• Partners: CISCO, CLRC, Manchester, UCL, UKERNA, plus Lancaster and Southampton (IPv6)
• A technical meeting held: draft spec. of the HW required
• Would like to use real Grid traffic, maybe:
  - CDF: UCL-RAL
  - BaBar: Man-RAL

MB-NG
[Diagram: MB-NG SuperJANET Testbed. Labels: Manc, MCC, Leeds, SJ4 Dev C-PoP Warrington, SJ4 Dev C-PoP Reading, SJ4 Dev C-PoP London, RAL / UKERNA, ULCC, UCL, SuperJANET4 Production Network. Link types: Gigabit Ethernet, 2.5 Gbit POS, 10 Gbit POS]

DataTAG
• The EU DataTAG project: EU Transatlantic Gigabit project
• Status: approved for 4 M ECU; start on 1 Dec 2001
• Partners: CERN/PPARC/INFN/UvA. IN2P3 sub-contractor
• The main foci are:
  - Grid Network Research, including:
    Provisioning (CERN)
    Investigations of high performance data transport (PPARC)
    End-to-end inter-domain QoS + BW / network resource reservation
    Bulk data transfer and monitoring (UvA)
  - Interoperability between Grids in Europe and the US: PPDG, GriPhyN, DTF, iVDGL (USA)

DataTAG project
[Diagram. Labels: NL SURFnet, UK SuperJANET4, IT GARR-B, GEANT, CERN, New York, Abilene, ESNET, MREN, STAR-LIGHT, STAR-TAP]
• 2.5 Gbit lambda between CERN and Starlight: POS 2nd half 2002, WDM later

DataTAG: Single stream vs Multiple streams
(Olivier Martin)
• Effect of a single packet loss (e.g. link error, buffer overflow)
[Plot: throughput (Gbps) against time, in units of T, for a single stream and for multiple streams, with a Streams/Throughput inset. Annotations: Avg. 7.5 Gbps; Avg. 4.375 Gbps; Avg. 3.75 Gbps; 9.375 for 2 streams]
• T = 2.37 hours! (RTT = 200 msec, MSS = 1500 B)

GGF
• Grid High-Performance Networking RG
  - Talk on Lambda Networking Research, Cees de Laat, UvA
  - BW reservation & Gigabit Tests, ATLAS at Michigan, E. Myers
  - Charter fully discussed
• Network monitoring WG
  - Much in common with WP7; two-way exchange of techniques
• GridFTP
  - UK participation in protocols and α-testing with the Globus developers
• LDAP Schema for network status
  - Discussions with Globus

Focus on the UK and GridPP
• GridPP + DataGRID + DataTAG + MB-NG + GGF will collaborate closely.
• GridPP has UK-specific issues and includes experiments at:
  - BaBar, with links to SLAC
  - CDF and D0, working at Fermi
  - UKQCD
  - UKDMC (dark matter)
  - MINOS
• PPNCG is a natural forum for UK GridPP network matters
  - Recent throughput problems, perceived as transatlantic, were traced to on-campus bottlenecks.
  - The PPNCG strongly encourages good liaison between the HEP teams and the campus networking groups.
  - Compiled a picture of the access BW to the HEP sites:

Connectivity of UK Grid Sites
[Map. Values per site, in the order given on the slide: BW to campus, BW to site, limit]
• Glasgow: 1G, 100M, 30M?
• Edinburgh: 1G, 100M
• Lancaster: 155M, 100M; move to c&nlman at 155 Mbit
• Durham: 155M, ?? 100M
• Manchester: 1G, 100M; 1G soon
• Sheffield: 155M, ?? 100M
• Liverpool: 155M, 100M; 4*155M soon. To HEP?
• Cambridge: 1G, 16M?
• DL: 155M, 100M
• UCL: 155M, 1G, 30M?
• Birmingham: 622M, ?? 100M
• Oxford: 622M, 100M
• IC: 155M, 34M, then 1G to HEP
• RAL: 622M, 100M; Gig on site soon
• QMW: 155M, ??
• Swansea: 155M, 100M
• Brunel: 155M, ??
• Bristol: 622M, 100M
• Portsmouth: 155M, 100M
• RHBNC: 34M (155M soon), ?? 100M
• Southampton: 155, 100M
• Sussex: 155, 100M

Ptolemy simulation of the Grid (1)
• Ptolemy: a discrete event simulation tool from Berkeley
• Simulation based on flows from “DataGrid-7-D7.1019-1-0-NetworkRequirements.doc”
• Predicts the flow between institutes and across SuperJANET
(Paul Mealor)

Ptolemy simulation of the Grid (2)
[Diagram. Labels: CERN, RAL, Bristol, Liverpool, SuperJANET, Lancaster, Birmingham, Tier 3, London (Imperial), Man, Glasgow & Edinburgh]
(Paul Mealor)
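The figures quoted on the DataTAG "Single stream vs Multiple streams" slide can be approximated with a simple additive-increase, multiplicative-decrease model. The sketch below is not taken from the talk. It assumes standard TCP congestion avoidance (the window is halved on a loss and then grows by one segment per RTT), equal sharing of the link between streams, and a loss that hits exactly one stream. All function names are hypothetical.

```python
def window_segments(link_bps, rtt_s, mss_bytes):
    """Window, in segments, needed to fill a path (bandwidth-delay product)."""
    return link_bps * rtt_s / (8 * mss_bytes)


def recovery_time_s(link_bps, rtt_s, mss_bytes, streams=1):
    """Seconds for the stream that lost a packet to regain its full share.

    Assumed model: the window halves on loss, then grows by one segment per
    RTT, so recovery takes (window / 2) round trips.
    """
    per_stream_window = window_segments(link_bps / streams, rtt_s, mss_bytes)
    return (per_stream_window / 2) * rtt_s


def mean_aggregate_bps(link_bps, streams=1):
    """Mean aggregate throughput over the single-stream recovery period.

    The affected stream carries 1/streams of the traffic and recovers in
    1/streams of that period, ramping linearly from half to full share, so
    the average shortfall is link_bps / (4 * streams**2).
    """
    return link_bps * (1 - 1 / (4 * streams ** 2))


if __name__ == "__main__":
    link, rtt, mss = 10e9, 0.2, 1500  # values quoted on the slide
    for n in (1, 2, 5, 10):
        hours = recovery_time_s(link, rtt, mss, n) / 3600
        gbps = mean_aggregate_bps(link, n) / 1e9
        print(f"{n:2d} stream(s): recovery {hours:.2f} h, mean {gbps:.3f} Gbps")
```

Under these assumptions a single 10 Gbps stream needs about 4.6 hours to recover, which is close to the 2T that the slide's plot appears to assign to the single-stream ramp (T = 2.37 hours). The model also gives the slide's averages of 7.5 Gbps for one stream, 9.375 Gbps for two streams, and 3.75 Gbps for a 5 Gbps stream over its own recovery. The remaining difference in T may come from how the slide counts payload per segment, which is not stated.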
Don’t Forget
• Involvement with:
  - UKQCD
  - UKDMC (dark matter)
  - MINOS
  - AstroGRID
  - AccessGRID
  - Grids for High Performance Computer Centres: Edinburgh, Manchester
  - Lambda Switching Projects
  - …

European Access
• PPNCG monitoring shows packet loss and increased rtt from the UK to Europe.
• Due to packet loss at the TEN155gw.ja.net
• Start of problems: 12 Oct
• 10-20 % loss seen by 31 Oct
• The 155 Mbit line to Dante is running at ~130 Mbit
• A reconfiguration on Fri 2 Nov helped.
• TEN155 contract ends 1 Dec; replaced by Dante 2.5 Gbit access link.