Network Monitoring
Robin Tasker, Daresbury Laboratory, 9 May 2001

At the February meeting we agreed:
- the requirements for Network Monitoring, found at http://icfamon.dl.ac.uk/papers/WP7/netmon-requirements.htm
- 7 sites to act as "pre-testbed" for network monitoring: Bologna, CERN, IN2P3, NIKHEF, Rutherford and Daresbury
- Goal, by end of March, to install and configure:
  - PingER software and Web-based access to the data
  - RIPE NCC TTM boxes (to be purchased and installed), with access to the related data via the RIPE Web site

Where are we now?

PingER
- installed and operational at Bologna, Rutherford and Daresbury
- IN2P3 planning a dedicated machine, awaiting effort -> mid May
- CERN already have PingER running (on suncs02.cern.ch) but will install a machine dedicated to our purpose soon
- NIKHEF ?

RIPE NCC TTM Box
- installed and operational at Daresbury and CERN (already had one!)
- ordered/purchased at Bologna, IN2P3, Rutherford, awaiting installation
- NIKHEF ?

So what does http://icfamon.dl.ac.uk/ppncg/datagrid.html look like right now?

And in more detail at the testbed sites?

Last 14 days packet loss and RTT between Daresbury and Bologna

And there's more...

So what's next?
- Ability to extract data from the various repositories (both PingER and RIPE NCC), collate it and provide a report for the time period / parameter of interest. How? Probably not by storing "raw output" in a database, but by providing the tools needed to access and view the data.
- Need a straightforward Web-based tool to assess how the network is between my site and wherever, right *now*. Meets one of our requirements, but need to ask: why does it make the Grid a better place to be?

And then there's the Network Weather Service
http://nws.npaci.edu/NWS/
Basically:
- a distributed system that periodically monitors and dynamically forecasts the performance various network and computational resources can deliver over a given time interval.
- The service operates a distributed set of performance sensors (network monitors, CPU monitors, etc.) from which it gathers readings of the instantaneous conditions.
- It uses numerical models to generate forecasts of what the conditions will be for a given time frame.

And there's more...
- the NWS has been developed for use by dynamic schedulers and to provide Quality-of-Service readings in a networked computational environment
- Each prototype forecasts process-to-process network performance (latency and bandwidth) and available CPU percentage for each machine that it monitors.
- The AppLeS scheduling methodology makes extensive use of its facilities, and prototype implementations for Legion and Globus/Nexus have been developed.

But why do it?

No doubt:
- it's the fashionable thing to do right now;
- it attracts money, possibly because
- it's an easy concept to sell to the layman

but
- need to understand the benefits to the Grid
- need to be convinced as to the value of this activity

and even if we want to do it
- do we need yet another set of monitoring stuff deployed?
- can't we use the PingER or RIPE data with a "predictive engine" to do essentially the same thing?
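
To make the "predictive engine" question concrete, here is a minimal sketch of what such an engine over existing round-trip-time measurements might look like. It is an illustration only, not the NWS implementation and not part of PingER or the RIPE tools: the predictor set, the function names (`forecast_next`, `last_value`, `running_mean`, `sliding_median`) and the sample RTT values are all assumptions made for this example. The idea is to run a few simple predictors over the measurement history, score each by its accumulated absolute error, and use whichever has been most accurate so far to forecast the next value.

```python
from statistics import mean, median


def last_value(history):
    """Predict that the next sample equals the most recent one."""
    return history[-1]


def running_mean(history):
    """Predict the mean of everything seen so far."""
    return mean(history)


def sliding_median(history, window=5):
    """Predict the median of the last `window` samples (robust to spikes)."""
    return median(history[-window:])


# Hypothetical predictor set, chosen only for illustration.
PREDICTORS = {
    "last_value": last_value,
    "running_mean": running_mean,
    "sliding_median": sliding_median,
}


def forecast_next(samples):
    """Return (name of best predictor so far, its forecast of the next sample).

    Each predictor is replayed over the history; the one with the lowest
    accumulated absolute error is used for the forecast.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to score the predictors")
    errors = {name: 0.0 for name in PREDICTORS}
    for i in range(1, len(samples)):
        history = samples[:i]
        for name, predict in PREDICTORS.items():
            errors[name] += abs(predict(history) - samples[i])
    best = min(errors, key=errors.get)
    return best, PREDICTORS[best](samples)


if __name__ == "__main__":
    # Made-up RTT samples in milliseconds, standing in for measured data.
    rtt_ms = [42.0, 43.0, 41.0, 90.0, 42.0, 44.0, 43.0]
    name, predicted = forecast_next(rtt_ms)
    print(f"best predictor so far: {name}, forecast RTT: {predicted:.1f} ms")
```

A sketch like this needs no new sensors, only access to the time series already being collected, which is the trade-off the last question raises. Whether such simple predictors would be good enough for Grid scheduling is an open question.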