Review of National Collaboratory Middleware and Network Research Projects. Aug 2003.

New generation of tools for E2E measurement and diagnostics on UltraNet

Les Cottrell and Jiri Navratil, Stanford Linear Accelerator Center (SLAC)

1. Introduction

The scientific community is increasingly dependent on networking as the need for international cooperation grows in this field. The High Energy and Nuclear Physics (HENP) community is one of the strongest examples of collaboration using the Internet. HENP physicists need to transfer huge amounts of data from the major sites hosting particle accelerators, such as SLAC, FNAL and ORNL (US) or CERN (Switzerland), to home institutes all over the world. Today, accelerator labs no longer have sufficient computing resources to meet the analysis needs of experiments, so typically multiple computer centers (tier 1 sites; the accelerator lab is a tier 0 site) in multiple countries share the load. To support this, experimental data must be copied between the accelerator site and its tier 1 sites. Further, to improve access to analyzed data, it is replicated and saved at multiple data centers. These databases are huge and growing rapidly, exceeding a petabyte and including the largest known databases in the world today; for example, the SLAC-based BaBar experiment had 895 TBytes as of Friday Dec 12, 2003. The networking requirements to copy the data are commensurately large, requiring the copying of multi-terabyte files that today take over a day to transfer. Besides the data copy requirements, rapidly changing teams of tens to hundreds of physicists at dozens of sites around the world need access to the analyzed data via the network in order to analyze it further and extract the physics results. Meeting these demands requires constant upgrades and the ability to use high-speed networks effectively. In the next few years, when the next generation of particle accelerators starts taking data (the LHC is scheduled to turn on in 2007), the demands for data transfer will increase by an order of magnitude or more. The Grid community is also growing rapidly, so Grid-clustered computers with immense computation and data capability will soon become one of the main network consumers, enabling the processing of scientific data distributed worldwide.

The Internet is not one compact, centrally managed network. It is composed of many large network backbones operated by different Internet Service Providers (ISPs), each serving hundreds or thousands of smaller networks or sites, each with their own network infrastructure. The UltraNet, even though it will be a private network, will be an indirect partner of this community: first, because some data arriving via the Internet will have to enter the UltraNet and vice versa, and second, because it will use the same technology as other networks. The capacity of the UltraNet will be very high, but this also means that its dynamic range (the range over which data can flow through the network) can be very high, so conditions on the network can differ greatly depending on utilization. Knowing the current situation in the network, and how stable that situation is likely to be, is therefore very important for everybody.

Each big ISP or service organization creates its own monitoring system that provides basic information for its users. Such systems are fundamental for managing these networks and for providing basic information to end users.
However, each such system passively measures aggregated information in the core or at the exchange points and does not cover all end-user levels and questions. There will almost always be a Local Area Network (LAN) at both ends of the UltraNet, so user packets will almost always pass from one LAN across the UltraNet to another LAN. End-to-end (E2E) measurements will therefore continue to be necessary, since only such measurements can correctly answer questions about user requirements, whether those of individual users, Grid brokers or database-redirecting schedulers. Access to high-speed networks and testbeds such as the UltraNet is critical for developers of new E2E tools.

Current measurement tools work well for OC-3 links and reasonably well for networks with OC-12 links. As more modern technologies carrying more highly aggregated traffic (OC-48, OC-192/STM-64, parallel optical channels, etc.) come into play, the current tools will fail, give incorrect results, or work very inefficiently. The new technology handles data transfer in more sophisticated ways than the environment for which the tools were originally developed, and we are starting to feel this problem already, because the first devices of this new generation have appeared in network backbones. The capacities of most networks have increased dramatically during the last five years, and even this brings new challenges for current E2E methods.

At high speeds, using iperf is becoming increasingly problematic. For example, on a 1 Gbit/s link it takes about 6 seconds for iperf to reach a stable TCP mode of operation (i.e. to have emerged from the initial slow start and be in congestion avoidance), so if one needs the measurement to spend 90% of its time in the stable congestion avoidance state, a measurement takes 60 seconds, which means transferring over 6 GBytes of data. Even today, Abilene traffic statistics indicate that iperf measurements create more than 5% of all traffic in the whole network.
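To make the cost of such a measurement concrete, the 60-second and multi-gigabyte figures above follow from simple arithmetic. The short Python sketch below reproduces them; the 6-second slow-start time and the 90% target are the assumptions quoted above, and full line rate is assumed throughout, so the byte count is an upper bound consistent with the "over 6 GBytes" figure.

    # Rough cost of a single iperf-style throughput measurement on a fast link.
    # Assumptions (from the example above): ~6 s of TCP slow start on a
    # 1 Gbit/s path, and at least 90% of the measurement spent in stable
    # congestion avoidance. Full line rate is assumed throughout, so the
    # byte count is an upper bound.

    def measurement_cost(link_bps=1e9, slow_start_s=6.0, stable_fraction=0.90):
        """Return (total seconds, GBytes sent) for one measurement."""
        total_s = slow_start_s / (1.0 - stable_fraction)   # slow start is the 10% "waste"
        gbytes = link_bps * total_s / 8 / 1e9               # bytes moved at full line rate
        return total_s, gbytes

    secs, gb = measurement_cost()
    print("~%.0f s per measurement, up to ~%.1f GBytes of test traffic" % (secs, gb))
    # prints: ~60 s per measurement, up to ~7.5 GBytes of test traffic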
Developing more effective tools that estimate the available bandwidth or the achievable throughput has become a hot topic of networking research in several projects, but all current tools can only give reliable results below about 1 Gbit/s. A new generation of tools for E2E monitoring and prediction must be developed.

SONET/SDH will continue to play a key role in the next generation of networks for many carriers. In the core network, carriers offer services such as telephony, dedicated leased lines and Internet Protocol (IP) data, which are transmitted continuously. The individual data streams are not transmitted on separate lines; instead they are multiplexed to higher speeds and transmitted over SONET/SDH networks at up to 10 gigabits per second (Gbps). The synchronous optical network (SONET) and synchronous digital hierarchy (SDH) specifications define the frame format, the multiplexing method and the synchronization method between equipment, as well as specifying the optical interface. SONET, based on the ANSI standard, can directly extract low-speed signals from multiplexed high-speed traffic. Its adoption and acceptance allowed operators to choose equipment from different vendors instead of relying on a single vendor with a proprietary optical format. The base rate for SONET is 51.84 Mbps.

Synchronous transport signal (STS-n) refers to the SONET signal in the electrical domain, and optical carrier (OC-n) refers to the SONET signal in the optical domain. The base rate for SDH is 155 Mbps. Synchronous transport module (STM-n) refers to the SDH signal level in both the electrical and optical domains, as follows (a short sketch computing these rates appears at the end of this section):

    STS-1     OC-1     N/A          51.84 Mbps
    STS-3     OC-3     STM-1       155.52 Mbps
    STS-12    OC-12    STM-4       622.08 Mbps
    STS-48    OC-48    STM-16    2,488.32 Mbps
    STS-192   OC-192   STM-64    9,953.28 Mbps
    STS-768   OC-768   STM-256  39,813.12 Mbps

Multi-service capabilities with a high number of channels support the mapping of different types of traffic, such as Packet over SONET (POS), Ethernet over SONET, ATM and Fibre Channel; FICON, ESCON and digital video are available on most devices using this technology. Various SONET/SDH interfaces support optical line rates from 155 Mbps to 10 Gbps. This opens new possibilities for combining these features, but on the other hand it complicates measurement from the outside. The signals are aggregated, scrambled and converted several times between serial and parallel forms, so all the timing effects on which time-dispersion measurement methods rely are ultimately lost. "Brute-force" methods are also becoming totally inefficient, because filling the available capacity from a single testing machine is practically impossible.

These points are only some of the reasons why new tools for E2E measurement should be developed, why new methods for tracking user packets in such a heavily aggregated environment are needed, and more generally why the behavior of typical Internet applications passing over the UltraNet should be studied. We gained our first experience with this technology during tests performed in collaboration with CERN and Caltech within the DataTAG project, and while setting up experimental 10 Gbps links for the "Bandwidth Challenge" at SC2003, where we observed the first differences compared with existing technology; these should be studied and tested in more detail in a new experimental network.
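The rate-calculation sketch referred to above: every rate in the SONET/SDH hierarchy follows from the STS-1 base rate of 51.84 Mbit/s (STS-n = n * 51.84, with STM-m corresponding to STS-3m/OC-3m). This is an illustrative calculation only, not part of any measurement tool discussed here.

    # SONET/SDH line rates derived from the STS-1 base rate of 51.84 Mbit/s.
    # Illustrative only; reproduces the rate table given above.

    STS1_MBPS = 51.84

    def sts_rate(n):
        """Line rate of STS-n / OC-n in Mbit/s (STM-m corresponds to n = 3*m)."""
        return n * STS1_MBPS

    for n, stm in [(1, None), (3, 1), (12, 4), (48, 16), (192, 64), (768, 256)]:
        label = "STS-%d/OC-%d" % (n, n) + ("/STM-%d" % stm if stm else "")
        print("%-22s %12.2f Mbit/s" % (label, sts_rate(n)))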