An Integrated Instrumentation Architecture for NGI Applications

Ian Foster, Darcy Quesnel, Steven Tuecke
Argonne National Laboratory
The University of Chicago

DOE NGI Instrumentation Project
“A Uniform Instrumentation, Event, and Adaptation Framework for Network-Aware Middleware and Advanced Network Applications”
– With UIUC (Dan Reed, Ruth Aydt)
– “Produce uniform notification and adaptation mechanisms, with the goal of catalyzing the development of both network-aware middleware and sophisticated network-aware applications”

Motivation
Environment incorporates multiple sensors
– Sources of events relating to behavior of resources, middleware, and applications
Significant advantages to having uniform mechanisms for publishing/discovering sensors and for accessing sensor data
– E.g., find all sensors for path A->B
– Including historical data
Enables end-to-end, top-to-bottom, past-to-present analysis

Examples of Sensors
Network devices
– E.g., routers
End system devices
– E.g., computers, storage systems
Grid services
– E.g., Globus HBM, Network Weather Service
Libraries
– E.g., CAVERNsoft, MPI
Applications

For Example ...
[Figure: sensors (S) at the App, Libs, Sys, and H/W layers, attached to MPICH, globus-io, GRAM, CAVERNsoft, HBM, NWS, DPSS, and to H and R nodes via netstat and SNMP]

Three Project Components
1. Mechanisms for creating, publishing, discovering, and accessing sensors
2. Synthesis and analysis techniques for identifying qualitative behavior and trends in sensor data
3. Adaptation techniques that exploit sensor data to adjust middleware and application configurations to improve performance
Argonne focus: (1) and (3); UIUC: (2), (3)

Current Approach
Use a directory service (LDAP) to register and publish event sources
– Publish: source, type, contact [online, archive]
– Discover: “find all event sources of type X”
Use NetLogger format for data
Develop sensor manager to handle publish, subscribe, archiving
Use SQL database as archive
Initial sensor set based on Globus libraries, applications, NetLogger-accessible devices

Initial Instrumentation Architecture
[Figure: Sensor, Sensor, Application; Events in NetLogger format; Sensor Manager; Archive: File (Netarchive), SQL (MySQL); LDAP; arrows labeled Discover, Subscribe (“what event sources for route A to B?”), and Publish (“netstat, host A, time T, contact X”)]

Sensor Manager
We are building a program which:
– Archives sensor event streams
– Redirects sensor event streams to clients using a publish/subscribe interface
– Generates sensor event streams from archive, based on query language
– Publishes interfaces and index to LDAP
Relation to other work
– Superset of Netlogd (simple archiver)
– Might exploit Netarchiver (MySQL indexing)

Archiving Events
How to archive sensor event streams?
– SQL: Save each event as a record in an SQL database
  > Advantage: Rich query support
– Netarchive: Save each event into file. Use SQL database to build index of file contents
  > Advantage: Performance and scale?
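
The SQL option above (one record per event) can be sketched roughly as follows. This is an illustration only, not the project's sensor manager: it uses Python's built-in sqlite3 as a stand-in for MySQL, assumes events arrive as whitespace-separated KEY=value fields in the style of NetLogger, and the table layout, helper names, and sample events are all hypothetical.

```python
import sqlite3

def parse_event(line):
    """Split a 'KEY=value KEY=value ...' line into a dict.

    Assumption: values contain no spaces or quoting.
    """
    return dict(field.split("=", 1) for field in line.split())

def open_archive(path=":memory:"):
    """Open (or create) the archive; the schema here is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " id INTEGER PRIMARY KEY, date TEXT, host TEXT, prog TEXT,"
        " event TEXT, raw TEXT)"
    )
    return db

def archive_event(db, line):
    """Save one event as one record, keeping the raw line for replay."""
    ev = parse_event(line)
    db.execute(
        "INSERT INTO events (date, host, prog, event, raw)"
        " VALUES (?, ?, ?, ?, ?)",
        (ev.get("DATE"), ev.get("HOST"), ev.get("PROG"),
         ev.get("NL.EVNT"), line),
    )

def query_events(db, host, event):
    """Regenerate an event stream from the archive for one host and type."""
    rows = db.execute(
        "SELECT raw FROM events WHERE host = ? AND event = ? ORDER BY date",
        (host, event),
    )
    return [row[0] for row in rows]

# Made-up sample events; field names and values are illustrative only.
SAMPLE = [
    "DATE=20000101120000.000000 HOST=hostA PROG=netstat NL.EVNT=TCP_RETRANS VAL=3",
    "DATE=20000101120005.000000 HOST=hostB PROG=netstat NL.EVNT=TCP_RETRANS VAL=0",
    "DATE=20000101120010.000000 HOST=hostA PROG=netstat NL.EVNT=TCP_RETRANS VAL=5",
]

db = open_archive()
for line in SAMPLE:
    archive_event(db, line)
hits = query_events(db, "hostA", "TCP_RETRANS")
print(len(hits))
```

Keeping the raw line next to a few indexed columns is one way to get the rich query support named above while still being able to replay the original event text to subscribers.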
We will explore the use of SQL databases
– Premise: Most sensors will not produce high volume event streams; hence optimize for simplicity and rich query support

Applying Info Infrastructure to Instrumentation
[Figure: panels labeled NCSA Origin Nodes; Bandwidth/Latency ANL-NASA Ames; ANL CPU Load; Bandwidth/Latency ANL-Indiana]

Publishing & Discovering Sensors
Globus LDAP-based Metacomputing Directory Service (MDS) provides scalable, global infrastructure for publishing and discovering sensor managers
– Sensors stream events to a sensor manager
– Sensor manager publishes availability of streams into LDAP
– Clients discover sensor managers from LDAP, and can subscribe to either current or archived sensor event streams directly from sensor managers

Initial Applications
Replica creation in “Data Grid” applications
– Online and historical instrumentation for large data transfers (app, lib, network)
– Involves DPSS, globus-io
– Also application-level selection of replicas, based on sensor information
MPI-based video streaming (Karonis, Papka)

Security
Grid Security Infrastructure (GSI) will be used throughout, hence possible to say e.g.
– “Manager M accepts only streams from sensors of user U”
– “Manager N only publishes streams to clients of users A, B, C”
As a first step, we have augmented the NetLogger C client with GSI

Instrumentation Architecture Showing Actuators
[Figure: Monitor; Sensor; Actuator; Sensor Manager; Archive: File (Netarchive), SQL (MySQL); LDAP; arrows labeled Sensor Events, Events, Discover, Publish, Subscribe]

Future Directions
XML
– NetLogger is an ASCII based format
– If you are using ASCII, why not use XML?
– XML database could be used for archive
Events
– Performance related events should be just one part of a larger, integrated event system
Typing
– NetLogger is weakly typed
– Various advantages to strongly typed events

Future Directions (2): Publish/Subscribe for Sensors
In first version:
– NetLogger based sensors stream events to manager
– Manager publishes sensor availability to LDAP
– Clients subscribe to sensor manager for events
In later version:
– Sensor can publish existence to LDAP
– Client can subscribe directly to sensor for events

Network Weather Service (R. Wolski et al., U. Tenn)
Scalable, fault tolerant system for
– Real-time performance measurements
– Predictions of future state
When installed on N hosts, delivers:
– Network performance (<= N^2 via netperf)
– Host CPU-load measurements (N)
We (USC/ISI crew) are working to integrate this into MDS; hopefully will eventually be consistent with approach described here (to be discussed)

Structure of NWS data in MDS (old)
c=US
o=Globus
o=ISI
nn=the Internet
hs=source.isi.edu to destination.anl.gov
    source: hn=source.isi.edu, o=ISI, c=US
    destination: hn=destination.anl.edu, o=ANL, c=US
    serviceProvider: NWS
    throughput: 1.903
    throughput_prediction: 1.709
    throughput_MSE: 0.95
    latency: 5.3
    latency_throughput: 6.1
    latency_MSE: 0.04
(N^2 network performance entries for N hosts)
...
hn=source.isi.edu
    current_cpu: 0.802
    current_cpu_prediction: 0.802
    current_cpu_MSE: 0.000
    weighted_cpu: 0.414
    weighted_cpu_prediction: 0.414
    weighted_cpu_MSE: 0.000
(N sets of CPU info for N hosts)
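
To make the entry counts concrete, here is a small sketch that enumerates the directory entries implied by the layout above for a given host list. It assumes one network entry per ordered pair of distinct hosts, which gives N(N-1) entries and so stays within the N^2 bound stated above. The function name and the third host name are hypothetical, and the naming strings only imitate the "hs=" and "hn=" forms shown here.

```python
from itertools import permutations

def nws_entries(hosts):
    """Return (network_entries, cpu_entries) names for a list of hosts.

    Assumption: one network entry per ordered (source, destination)
    pair of distinct hosts, and one CPU entry per host.
    """
    network = [f"hs={src} to {dst}" for src, dst in permutations(hosts, 2)]
    cpu = [f"hn={host}" for host in hosts]
    return network, cpu

# The third host name is made up for illustration.
hosts = ["source.isi.edu", "destination.anl.gov", "third.example.org"]
network, cpu = nws_entries(hosts)

print(len(network), len(cpu))  # 6 3
print(network[0])              # hs=source.isi.edu to destination.anl.gov
```

The quadratic growth in network entries is what a flat per-pair layout costs as the number of measured hosts increases.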