PARMON
A Comprehensive Cluster Monitoring System
A Single System Image Case Study
Developer: PARMON Team
Centre for Development of Advanced Computing,
Bangalore, India
http://www.cdacindia.com
Project Leader: Rajkumar Buyya

Topics of Discussion
- PARMON System Model & Architecture
  - PARMON Server
  - PARMON Client
- PARMON Features and Services
- PARMON Installation and its Usage
- Monitoring with PARMON
- PARMON Integration with other products
- Conclusions and Future Directions

Motivations
- Workstation clusters have of late become a cost-effective solution for high-performance computing (HPC).
- C-DAC's PARAM 10000 is a large cluster of more than 40 Ultra-4 workstations interconnected through low-latency, high-bandwidth communication networks.
- Monitoring such large systems is tedious and challenging, since a typical workstation is designed to work as a standalone system rather than as part of a cluster.
- System administrators need tools to monitor such systems effectively. PARMON provides a solution to this problem.

C-DAC HPCC Software Architecture
[Figure: layered software stack with the following components: Applications; System Management Tools; Parallel File System (C-PFS); Development Tools (F90 IDE, DIVIA); Languages (C, F77, F90); Message Passing Interfaces (C-MPI, PVM); Light Weight Protocols; Solaris; Cluster Hardware.]

PARMON Capabilities
- PARMON allows the user to monitor system activities and resource utilization of the various components of a workstation cluster.
- It monitors the machine at the component, node, and entire-system levels, presenting a single system image.
- It allows the system administrator to monitor the following:
  - Aggregated utilization of system resources
  - Process activities
  - System log activities
  - Kernel activities
  - Multiple instances of the same resource

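
The component, node, and system levels can be illustrated with a minimal sketch. It assumes each node reports one utilization percentage per CPU instance; the node names, the figures, and the `node_level` and `system_level` helpers are all hypothetical, and plain averaging stands in for whatever aggregation PARMON actually performs. Python is used only for brevity and is not PARMON's implementation language.

```python
from statistics import mean

# Hypothetical samples: utilization (%) of each CPU instance on each node.
# Names and numbers are invented for illustration only.
samples = {
    "node01": [20.0, 40.0],
    "node02": [80.0, 60.0],
    "node03": [10.0, 30.0],
}

def node_level(cpus):
    """Collapse multiple instances of one resource into a node-level figure."""
    return mean(cpus)

def system_level(per_node):
    """Average the node-level figures so the cluster reads as a single system."""
    return mean(node_level(cpus) for cpus in per_node.values())

node_view = {name: node_level(cpus) for name, cpus in samples.items()}
cluster_view = system_level(samples)

print(node_view)      # one figure per node
print(cluster_view)   # one figure for the whole cluster
```

The same two-step reduction would apply to memory, disk, or network figures; only the per-component samples change.
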
PARMON - Salient Features
- Online creation of the node and group database
- Monitoring of system activities at the component, node, group, or entire-cluster level
- Designed using state-of-the-art Java technology
- Monitoring of system components:
  - CPU, memory, disk, and network
  - Multiple instances of the same component
- Facility for defining events, with automatic notification
- Miscellaneous facilities: message broadcast, invocation of system management commands (halt, reboot, etc.), system information and configuration
- A GUI for initiating activities and requests, with results presented graphically

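
The event facility could look roughly like the sketch below. The slides do not describe how PARMON defines events, so the `EventRule` class, the threshold semantics, and the callback-based notification are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class EventRule:
    """Hypothetical rule: fire when a metric rises above a threshold."""
    metric: str                      # illustrative metric name, e.g. "cpu"
    threshold: float                 # percentage above which the event fires
    notify: Callable[[str], None]    # callback that delivers the notification

def check_events(node: str, reading: Dict[str, float], rules: List[EventRule]) -> int:
    """Evaluate every rule against one node's reading; return how many fired."""
    fired = 0
    for rule in rules:
        value = reading.get(rule.metric)
        if value is not None and value > rule.threshold:
            rule.notify(
                f"{node}: {rule.metric} at {value:.0f}% exceeds {rule.threshold:.0f}%"
            )
            fired += 1
    return fired

# Notifications are collected in a list here; a real tool might mail or page instead.
inbox: List[str] = []
rules = [EventRule("cpu", 90.0, inbox.append), EventRule("memory", 80.0, inbox.append)]
fired = check_events("node07", {"cpu": 95.0, "memory": 42.0}, rules)
print(fired, inbox)
```

Passing the notifier as a callback keeps the rule definition separate from the delivery channel.
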
PARMON System Model
[Figure: the PARMON client (parmon), running on a JVM, communicates through a high-speed switch with the PARMON server (parmond) running on a Solaris node.]

PARMON Implementation
- Server
  - Multithreaded, using POSIX and Solaris threads
  - Developed in C, as it needs to access system internals
  - Stateless
- Client
  - Developed in Java
  - Makes extensive use of Java features
  - Creates a new window for each client request; the window interacts with the server
  - Uses threads extensively when creating online resource utilization meters
  - Reconfigures dynamically when the node database changes

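
What "stateless" and "multithreaded" mean for the server can be shown with a small sketch. The request format (`GET <metric>`), the replies, and the canned readings are hypothetical; the slides do not document PARMON's wire protocol, and the real server is written in C rather than Python.

```python
from concurrent.futures import ThreadPoolExecutor

# Invented readings standing in for values a real daemon would read from the kernel.
FAKE_METRICS = {"cpu": 37, "memory": 62}

def handle_request(request: str) -> str:
    """Stateless handler: the reply depends only on the request text,
    never on what this or any other client asked earlier."""
    parts = request.split()
    if len(parts) != 2 or parts[0] != "GET":
        return "ERROR bad request"
    value = FAKE_METRICS.get(parts[1])
    if value is None:
        return "ERROR unknown metric"
    return f"OK {value}"

# Because no per-client state is kept, requests can be served by any worker thread.
requests = ["GET cpu", "GET memory", "GET bogus", "PING"]
with ThreadPoolExecutor(max_workers=4) as pool:
    replies = list(pool.map(handle_request, requests))

print(replies)
```

A stateless design also means a client can simply repeat a request after a server restart, with no session to re-establish.
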
Setting up of PARMON
- Server installation & invocation
  - Binding to port
  - Rights (requires root permission for full functionality)
  - parmond or parmond <port-no> (either at boot time or on-line)
  - Needs to be loaded on all nodes to be monitored
- Client installation & invocation
  - Java-based client (the client machine can be any PC or workstation supporting a JVM)
  - CLASSPATH (pointing to classes.zip, parmon.jar)
  - jar file (parmon.jar)
  - java parmon or java parmon <port-no>

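
Since the server has to be running on every node to be monitored, an administrator might first confirm that something is listening on the chosen port on each node. The sketch below is one hypothetical way to do that; it is not part of PARMON, the helper names are invented, and it checks only that a TCP connection is accepted, not that the listener is actually parmond.

```python
import socket
from typing import Dict, Tuple

def daemon_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_nodes(nodes: Dict[str, Tuple[str, int]]) -> Dict[str, bool]:
    """Probe every node in a name -> (host, port) mapping."""
    return {name: daemon_reachable(host, port) for name, (host, port) in nodes.items()}

# Local demo: a listening socket stands in for a node with a running daemon,
# and a port that was bound and then released stands in for a node without one.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))      # port 0 lets the OS choose a free port
listener.listen()
up_port = listener.getsockname()[1]

released = socket.socket()
released.bind(("127.0.0.1", 0))
down_port = released.getsockname()[1]
released.close()

status = check_nodes({
    "node01": ("127.0.0.1", up_port),
    "node02": ("127.0.0.1", down_port),
})
listener.close()
print(status)
```

On a real cluster the mapping would hold the node host names and whatever port was passed as `<port-no>`.
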
Monitoring System Activities and Resource Utilization
- PARMON Launcher
- Creation of Node Database
- Node Deletion
- Group Creation
- Group Modification/Deletion
- Resource Utilization at a Glance
- Selection of Nodes/Group
- CPU Usage Monitoring
- Memory Usage Monitoring
- Disk/Network Usage Monitoring
- Message Viewer (System logs)
- Process Activities
- Kernel Data Catalog - CPU
- Kernel Data Catalog - Memory
- Kernel Data Catalog - Disk
- Kernel Data Catalog - Network
- Catalog of CPU Parameters
- Component View - Physical
- Component View - Logical
- Message Broadcast
- System Configuration
- System Information
- Issuing Commands: halt, shutdown, etc.
- Node Diagnostics - Online (SunVTS)
- Online Help

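
The node and group operations listed above (node creation and deletion, group creation, modification, and deletion) can be modelled with a small in-memory structure. This is a minimal sketch under assumptions: the `NodeDatabase` class and its method names are hypothetical, and the slides do not say how PARMON actually stores its node database.

```python
from typing import Dict, Iterable, Set

class NodeDatabase:
    """Hypothetical in-memory registry of nodes and named groups of nodes."""

    def __init__(self) -> None:
        self.nodes: Set[str] = set()
        self.groups: Dict[str, Set[str]] = {}

    def add_node(self, name: str) -> None:
        self.nodes.add(name)

    def delete_node(self, name: str) -> None:
        """Remove a node and drop it from every group that contained it."""
        self.nodes.discard(name)
        for members in self.groups.values():
            members.discard(name)

    def set_group(self, group: str, members: Iterable[str]) -> None:
        """Create a group, or replace its membership if it already exists."""
        members = set(members)
        unknown = members - self.nodes
        if unknown:
            raise ValueError(f"unknown nodes: {sorted(unknown)}")
        self.groups[group] = members

    def delete_group(self, group: str) -> None:
        self.groups.pop(group, None)

db = NodeDatabase()
for name in ("node01", "node02", "node03"):
    db.add_node(name)
db.set_group("compute", ["node01", "node02"])
db.delete_node("node02")          # the group shrinks along with the database
print(sorted(db.nodes), {g: sorted(m) for g, m in db.groups.items()})
```

Rejecting unknown members at group creation keeps every group a subset of the node database, which is what lets a monitoring client resolve a group into concrete nodes.
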
PARMON Integration with other Products
- PARMON can send resource utilization information to any other product, provided the protocols are made available.
[Figure: Node 1 through Node N, each running parmond, and the PARAM online bulletin board.]

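
Handing utilization data to another product comes down to agreeing on a message format. As a purely illustrative sketch, a reading could be serialized as one JSON object per line; the field names and the `export_reading` helper are assumptions, not a format PARMON is known to use.

```python
import json
from typing import Dict

def export_reading(node: str, metrics: Dict[str, float]) -> str:
    """Serialize one node's reading as a single JSON line (hypothetical format)."""
    return json.dumps({"node": node, "metrics": metrics}, sort_keys=True)

def import_reading(line: str) -> Dict[str, object]:
    """What a consuming product would do on its side: parse the line back."""
    return json.loads(line)

line = export_reading("node01", {"cpu": 37.5, "memory": 62.0})
decoded = import_reading(line)
print(line)
```
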
Summary and Recent Works
- PARMON has been used successfully to monitor the PARAM OpenFrame supercomputer, a cluster of 48 Ultra-4 workstations running the Sun Solaris operating system.
- Portable across platforms supporting Java
- Comprehensive monitoring support and GUI
- PARMON supports Solaris and Linux clusters, and support for NT clusters is planned (one such implementation was carried out at UPC, Barcelona).
- It has been extended to support web-based monitoring of clusters by adding an interface server (running on a web server) between the client and the PARMON servers running on the cluster nodes.

References
- Project Team:
  - Rajkumar Buyya
  - Krishna Mohan
  - Bindu Gopal
- R. Buyya, PARMON: A Portable and Scalable Monitoring System for Clusters, International Journal on Software: Practice & Experience (SPE), John Wiley & Sons, Inc, USA, June 2000.
- Further Info: http://www.buyya.com/parmon
- C-DAC: http://www.cdacindia.com

Thank You