Download Monitoring Grid Services - Informatics Homepages Server

Monitoring Grid Services
Yin Chen
[email protected]
June 2003

Contents
- Issues of Monitoring
- Project Proposal

Issues of Monitoring
- What are the goals of Grid monitoring?
- What are the characteristics of a Grid system?
- What may need to be monitored?
- What are the characteristics of monitoring data?
- Related work

What are the goals of Grid monitoring?
- Propagate errors to users and management
- Performance monitoring, in order to
  - tune the application
  - use the Grid more efficiently
- The question is
  - not how to measure resources
  - but how to deliver information to end-users and the system/Grid

What are the characteristics of a Grid system?
- Complex distributed system => unexpectedly low performance is often observed
- Where is the bottleneck?
  - application
  - operating system
  - disks
  - network adapters on either the sending or the receiving host
  - network switches, routers
- Experience of the NetLogger group
  - 40% network, 40% application, 20% host problems
  - application: 50% client, 50% server process problems

What are the characteristics of a Grid system? (cont.)
- Dynamic environment
- World-wide distributed environment with
  - high latency
  - frequent faults
  - very heterogeneous resources

What may need to be monitored?
- Disk space, processor speed, network bandwidth, CPU load, memory load, network load, network communication time, number of parallel streams, stripes, TCP/IP buffer size, and disk access time, which includes the time to copy data to or from the local hard disk on the server. [2][3]
- Some of this information is relatively static, while the rest is dynamic run-time information.

What are the characteristics of monitoring data?
- Run-time monitoring data goes "old" quickly
  - Producers should be near the monitored entities.
  - Data should be transported rapidly and efficiently from producer to consumer.
  - The age of the information should be explicit, e.g. given by timestamps.
- Updates are frequent
- Performance information is often stochastic

Related Work
- Monitoring and Discovery Service (MDS)
- Grid Monitoring Architecture (GMA)
- Relational Grid Monitoring Architecture (R-GMA)
- Hawkeye
- Globus Heartbeat Monitor (HBM)
- Network Weather Service (NWS)
- GridRM

MDS Architecture

GMA Architecture

R-GMA Architecture

Hawkeye Architecture

HBM Architecture

NWS Architecture

The Global Layer of GridRM

The Local GridRM Layer

Summary and Conclusion
- A variety of monitoring systems exists
- Each system has its own strengths and weaknesses
- They tend to use standard and open components
- The GGF advocates the GMA architecture

Summary and Conclusion (cont.)
- Similarities in architecture
  - At the lowest level, a sensor or other program generates a piece of data.
  - Some systems allow data to be aggregated from a set of resources.
  - At the resource level, the data from several information collectors is gathered into one component.
  - A directory component
  - A decentralised hierarchical structure, which gives better fault tolerance
- Differences in the use of a push or a pull mechanism

Project Proposal
- Goal
- Requirement
- Architecture: Pull Model
- Specification
- Implementation
- Testing
- Schedule

Goal
- Realisation
- Lightweight & simple design
- Reliability & robustness

Architecture
- What is the pull model?
  - The monitor sends requests to the service for information. This implies repeated queries of resource attributes over some time period at a specific frequency.
  - In a push model, on the other hand, the service sends out notifications to a subscribed sink.

Benefits of Pull
- Less network traffic: collections are initiated only from the top.
- No time synchronisation problem: data is collected from all resources at the same time.
- The server can determine the size of the file, select the appropriate alternate server, and passively control the bandwidth and storage space.
- According to Globus, the "push" model "generates a large amount of data and results in constant updates to the MDS."
- Standard LDAP databases are not designed to handle frequent updates.

Benefits of Pull (cont.)
- The pull model is based on distributing intelligence to the asset site, which then becomes automated.
- Using machine-to-machine communication with connected sensors and autonomic computing, the asset performs self-diagnostics, maintains and repairs itself, re-routes energy flows, schedules non-routine maintenance, and reports any out-of-the-ordinary activity that poses a security threat.
- IBM calls this autonomic computing: machine-to-machine communication takes place to optimise the performance of computing and network resources.

Problems of Pull
- Current measurements must be gathered from all resources.
- If the volume of real-time data is large, collection may become a bottleneck.
- It may not be useful for fault detection: heartbeat events are valid only for a short time interval and must be delivered within that interval.
- It may not be useful for dynamic sensor management.
- The push model is the most efficient in terms of bandwidth, as no requests are sent, only responses from the service.

Monitoring Grid Services
Thanks
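
The pull and push models described on the Architecture slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the design proposed in these slides: the names Measurement, Service, PullMonitor, poll_once and is_stale are hypothetical, the "services" are in-process objects rather than Grid services, and time is passed in as a plain number so the behaviour is easy to follow. Each reading carries a timestamp, so that a consumer can tell how old it is.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Measurement:
    """One timestamped reading, so a consumer can judge its age."""
    resource: str
    metric: str
    value: float
    timestamp: float


@dataclass
class Service:
    """Hypothetical monitored service exposing a single metric (CPU load)."""
    name: str
    cpu_load: float
    sinks: List[Callable[[Measurement], None]] = field(default_factory=list)

    def read(self, now: float) -> Measurement:
        # Pull entry point: the service answers only when it is asked.
        return Measurement(self.name, "cpu_load", self.cpu_load, now)

    def subscribe(self, sink: Callable[[Measurement], None]) -> None:
        # Push entry point: register a sink that wants notifications.
        self.sinks.append(sink)

    def set_load(self, value: float, now: float) -> None:
        # Push model: every change is sent to all subscribed sinks.
        self.cpu_load = value
        for sink in self.sinks:
            sink(self.read(now))


class PullMonitor:
    """Monitor that queries every service once per polling round."""

    def __init__(self, services: List[Service]) -> None:
        self.services = services
        self.latest: Dict[str, Measurement] = {}

    def poll_once(self, now: float) -> None:
        # All resources are sampled in one round with the same timestamp.
        for service in self.services:
            self.latest[service.name] = service.read(now)

    def is_stale(self, name: str, now: float, max_age: float) -> bool:
        # A reading older than max_age should no longer be trusted.
        return now - self.latest[name].timestamp > max_age


if __name__ == "__main__":
    node_a = Service("node-a", 0.2)
    node_b = Service("node-b", 0.7)

    # Pull: the monitor only learns about a change at its next poll.
    monitor = PullMonitor([node_a, node_b])
    monitor.poll_once(now=0.0)
    node_a.set_load(0.9, now=1.0)
    print("pull view before next poll:", monitor.latest["node-a"].value)
    monitor.poll_once(now=5.0)
    print("pull view after next poll:", monitor.latest["node-a"].value)

    # Push: a subscribed sink is notified as soon as the value changes.
    received: List[Measurement] = []
    node_b.subscribe(received.append)
    node_b.set_load(0.1, now=6.0)
    print("pushed notifications:", len(received))
```

The sketch makes the trade-off from the Problems of Pull slide visible: between two polling rounds the pull monitor holds an old value, which is why short-lived events such as heartbeats are a poor fit for it, whereas the push sink sees each change at once but receives one message per update.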
