DataTAG overview

Summary
Why DataTAG?
DataTAG project
Test-bed extensions
General information
Open DataTAG
Network map
(some) Research topics
(a lot of) Issues
Conclusion and acknowledgements
3 February 2003 APM meeting - Barcelona Paolo Moroni (Slide 2)

Background for DataTAG
High-energy physicists are building the LHC (start-up in 2007?) at CERN: an unprecedented amount of data will have to be analyzed and CERN alone will not have enough computing resources
The planned computing model is distributed geographically (GRID)
The EU-DataGrid project is addressing the middleware problem
Reliable, advanced networking is needed underneath
At least part of the GRID traffic will not look like any of today's commodity IP traffic
A lot of bandwidth will be necessary, but it won’t be enough

Addressing the problem
Buy (a lot of) bandwidth
Buy (expensive) network equipment
Address security issues without compromising performance (firewalls)
Tune some more or less obvious TCP parameters
Design the network for reliability and performance
Plan for interconnection with key research networks

Why network research?
Need to enhance data transport protocols (TCP)
Need to measure the perceived application performance (end-to-end)
Need to try the GRID software in a WAN environment and at very high speed (Gigabit or more)
Need to test the compatibility between EU and US GRIDs (middleware integration)
Need to test network technology with enough speed and enough features to support the planned GRID workload (end-to-end inter-domain QoS)
Substantial risk of breaking something: production networks can help with some (but not all) of the above needs

DataTAG project
Full project title: “Research and technological development for a transatlantic GRID”
IST project (EU funded), supported by the NSF and the DoE (Caltech)
Partners: PPARC (UK), INRIA (FR), University of Amsterdam (NL), INFN (IT) and CERN (CH)
Researchers also from Caltech, SLAC and Canada
Test-bed kernel: transatlantic STM-16 (Tsystems) between Geneva (CERN) and Chicago (StarLight), with interconnected workstations at each side

Test-bed extensions
Amsterdam-Geneva (SARA-CERN): STM-64 from SURFnet (Global Crossing)
Lyon-Geneva: STM-16 from VTHD (France Telecom)
CH backup access STM-16 to GEANT (COLT)
Chicago-Sunnyvale: STM-64 from TeraGrid (Level3)
Back-to-back GbE to Canarie in Chicago
Back-to-back 10GbE to TeraGrid and Abilene in Chicago

General information
2-year project: 2002 and 2003
Dedicated staff (recruited on project budget)
Part-time staff, shared with other activities
Open to cooperate with other projects: EU-DataGrid, GEANT, Abilene, TeraGrid, NetherLight, etc.
Typical EU work package structure: quarterly reports, deliverables at fixed deadlines, periodic reviews by external inspectors.
Very formal, heavy and structured framework, but effective at avoiding project drift, delays and waste

Open DataTAG
DataTAG is open to cooperate with other research projects
Proposals for additional activity on the DataTAG test-bed are welcome
One requirement: ongoing work must not be affected (no overbooking)
The current schedule is already relatively busy (both EU and US activities)

DataTAG Network map
[Figure: network diagram of the Chicago and Geneva sites, last update 20021204.
Routers: R04chi-Cisco7609, R04gva-Cisco7606, R05chi-JuniperM10, R05gva-JuniperM10, R06chi-Alcatel7770, R06gva-Alcatel7770.
Other equipment: Alcatel 1670 and ONS15454 multiplexers, Extreme Summit5i and Summit1i switches, Cisco5505-management, Cisco2950-management, ar3-chicago-Cisco7606, Cernh4-Cisco7609, Cernh7-Cisco7609, Teragrid JuniperT640.
Workstations: w01chi-w06chi and v10chi-v13chi in Chicago, w01gva-w06gva in Geneva, attached over Gigabit Ethernet links (1GE, 2x1GE, 4x1GE, 8x1GE, 10x1GE) with 10GE between switches.
Circuits: Stm64(L3), Stm16 (FranceTelecom), Stm16(DTag), Stm16(GC), Stm4(DTag), Stm16(Swisscom), Stm16(Colt) backup+projects.
External networks: SURFNET, ABILENE, VTHD/INRIA, CANARIE, SUNNYVALE, Teragrid, SWITCH, GEANT, GARR/CNAF, CERN External Network (Vlan4, Vlan5, Vlan7).]

DataTAG Routing map
[Figure: routing diagram over the same topology, last update 20021204.
Legend: management addresses and path to reach them from the internet; DataTAG testbed addresses; path of the CCC tunnel from CNAF; path between the PC farms in Chicago and Geneva; path between Sunnyvale and the PC farm in Geneva.
Address blocks shown: 192.91.236.0/23, 192.91.238.0/23, 192.91.246.192/26, 192.91.244.0/27.]

(some) Research topics (I)
Linux kernel tuning for high performance: for example, 8 Terabytes in 24 hours, memory-to-memory (achieved by S. Ravot, Caltech)
Bulk file transfer (Terabyte disk-to-disk, at 2 Gbps, achieved by Canadian researchers between TRIUMF and CERN)
TCP stack improvements (things get worse with longer RTT)
Application-level performance measurement
GRID middleware interoperability between EU and US

(some) Research topics (II)
10 Gb tests (10GbE and transatlantic link upgrade, tentatively in September 2003) at layer 2 and layer 3
Multi-vendor equipment (Alcatel 1670 and 7770, Cisco 760x, Juniper M10, Extreme Summit switches)
QoS tests, advanced reservation
Optical networking: validation of equipment
Direct access (hardware and software) to the equipment is essential for most activities

Issues (I)
Electrical power in Chicago (now fixed)
Broken network hardware (mainly 10 GbE cards, but not only)
Broken or hung PCs
Router interfaces disabled
Never enough workstations available for testing
Reservation software: never sophisticated enough
Network topology: each research group has different requirements
Alcatel 1670: needs STM-16 reconfiguration

Issues (II)
Cisco 760x components (un)availability
Demo workshops (iGRID2002, SC2002, …): nice to see, but organizational nightmares
KPNQwest collapse (re-procurement via Tsystems)
Routing: interconnections with production networks, with external test networks, even with commodity Internet, plus management access everywhere
Furthermore, routing is open for experiments (research groups require enabled access
to the routers)

Issues (III)
A lot of enthusiasm and interest, unfortunately not always supported by adequate planning
When adequate planning is there, it is ignored (because of the enthusiasm, of course)
Network planning is not straightforward: mix between test and production
Partner coordination is even less straightforward: mix of shared and private resources

Issues (IV)
Some JunOS features were discovered the hard way
Same for some IOS “features”
DNS, security, access control, management servers, VLANs, WWW site, IP addressing (>100 addresses assigned), etc.: all need planning, work and maintenance
PoP management (ran out of rack space) + installation issues (mainly, but not only, Alcatel)
OOB access in Chicago, to recover from configuration mistakes (this works only sometimes with PCs)

Conclusion and acknowledgements
DataTAG is for testing what would often be useful to do but may not get done because it is too risky, or too expensive, or too complicated, or simply because you cannot afford a test laboratory
DataTAG is open to external collaborations, as far as the current test-bed workload allows
Many thanks to DANTE/GEANT and DoE/Caltech, who are actively supporting the project

http://www.datatag.org
Thank you
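
As a rough cross-check of the throughput figures quoted under Research topics (I), and of the remark that TCP gets worse with longer RTT, the arithmetic can be sketched as below. This is only an illustration under stated assumptions: a Terabyte is taken as 10^12 bytes, the STM-16 rate is the nominal line rate (not the usable payload), and the 120 ms round-trip time is an assumed value that does not come from the slides. The function names are hypothetical helpers, not part of any DataTAG tool.

```python
# Back-of-the-envelope figures for the transfers quoted in the slides.
# Assumptions (not from the slides): decimal Terabytes, nominal STM-16
# line rate, and an illustrative transatlantic RTT of 120 ms.

TERABYTE = 10**12              # bytes, decimal definition (assumption)
STM16_BPS = 2_488_320_000      # nominal STM-16 line rate in bit/s
ASSUMED_RTT_S = 0.120          # assumed round-trip time in seconds


def average_rate_bps(num_bytes: float, seconds: float) -> float:
    """Average bit rate needed to move num_bytes in the given time."""
    return num_bytes * 8 / seconds


def transfer_time_s(num_bytes: float, rate_bps: float) -> float:
    """Time in seconds to move num_bytes at a sustained bit rate."""
    return num_bytes * 8 / rate_bps


def bdp_bytes(rate_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return rate_bps * rtt_s / 8


if __name__ == "__main__":
    rate = average_rate_bps(8 * TERABYTE, 24 * 3600)
    print(f"8 TB in 24 h averages {rate / 1e6:.1f} Mbit/s")

    duration = transfer_time_s(TERABYTE, 2e9)
    print(f"1 TB at 2 Gbit/s takes {duration:.0f} s")

    window = bdp_bytes(STM16_BPS, ASSUMED_RTT_S)
    print(f"Window to fill an STM-16 at the assumed RTT: {window / 1e6:.1f} MB")
```

Under these assumptions, 8 Terabytes in 24 hours corresponds to a sustained average of roughly 740 Mbit/s, and a single TCP connection would need about 37 MB in flight to fill the STM-16. The window grows linearly with the RTT, which is one way to read the remark that things get worse on longer paths.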