PARMON
A Comprehensive Cluster Monitoring System
PARMON Team
Centre for Development of Advanced Computing,
Bangalore, India
Contact: Rajkumar Buyya ([email protected])
Topics of Discussion
 PARMON System Model & Architecture
 PARMON Server
 PARMON Client
 PARMON Features and Services
 PARMON Installation and Usage
 Monitoring with PARMON
 PARMON Integration with other products
 Conclusions and Future Directions
Motivations
 Workstation clusters have of late become a cost-effective solution for high-performance computing (HPC).
 C-DAC’s PARAM OpenFrame is a large cluster of more than 40 Ultra-4 workstations interconnected through low-latency, high-bandwidth communication networks.
 Monitoring such large systems is tedious and challenging because typical workstations are designed to work as standalone systems rather than as part of a cluster.
 System administrators need tools to monitor such systems effectively. PARMON addresses this problem.
C-DAC HPCC Software Architecture
[Figure: layered software stack. Components shown: Applications; System Management Tools; Parallel File System (C-PFS); Development Tools (F90 IDE, DIVIA); Languages (C, F77, F90); Message Passing Interfaces (C-MPI, PVM); Light Weight Protocols; Solaris; Cluster Hardware.]
PARMON - Salient Features
 Online creation of the node and group database
 Monitoring of system activities at the component, node, group, or entire-cluster level
 Designed using state-of-the-art Java technology
 Monitoring of system components:
  CPU, memory, disk, and network
 Multiple instances of the same component can be monitored.
 Facility for defining events, with automatic notification
 Miscellaneous facilities: message broadcast, invocation of system management commands (halt, reboot, etc.), system information and configuration
 PARMON provides a GUI for initiating activities/requests and presents results graphically.
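The slides do not say how events are defined or delivered. Purely as an illustration of the idea of "event definition and automatic notification", a threshold-style rule might be sketched in Java as follows. The names `Sample`, `EventRule`, and `EventRuleDemo`, the percent-based threshold, and the callback are all assumptions, not PARMON's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration only: none of these types come from PARMON.
final class Sample {
    final String node;       // node name, e.g. "node01"
    final String component;  // "cpu", "memory", "disk" or "network"
    final double value;      // utilization in percent

    Sample(String node, String component, double value) {
        this.node = node;
        this.component = component;
        this.value = value;
    }
}

final class EventRule {
    private final String component;
    private final double threshold;
    private final Consumer<String> notifier;

    EventRule(String component, double threshold, Consumer<String> notifier) {
        this.component = component;
        this.threshold = threshold;
        this.notifier = notifier;
    }

    // Calls the notifier and returns true when a sample for the watched
    // component exceeds the threshold; otherwise returns false.
    boolean check(Sample s) {
        if (s.component.equals(component) && s.value > threshold) {
            notifier.accept(s.node + ": " + component + " at " + s.value
                    + "% exceeds " + threshold + "%");
            return true;
        }
        return false;
    }
}

final class EventRuleDemo {
    public static void main(String[] args) {
        List<String> inbox = new ArrayList<>();              // stands in for a mail or pager hook
        EventRule rule = new EventRule("cpu", 90.0, inbox::add);
        rule.check(new Sample("node01", "cpu", 97.5));       // above threshold: notifies
        rule.check(new Sample("node02", "cpu", 12.0));       // below threshold: silent
        inbox.forEach(System.out::println);
    }
}
```

Passing the notifier as a callback keeps the rule independent of how the administrator is actually alerted.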
PARMON System Model
[Figure: system model diagram. Labels: PARMON client on JVM (parmon); PARMON server on Solaris node (parmond); high-speed switch.]
PARMON Implementation
 Server
 Multithreaded, using POSIX and Solaris threads
 Developed in C because it needs to access system internals
 Stateless
 Client
 Developed in Java
 Java features are used extensively.
 A new window is created for each client request and interacts with the server.
 Threads are used extensively when creating online resource utilization meters.
 Dynamically reconfigures itself when the node database changes.
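To make the "stateless server, threaded client" split concrete, here is a minimal Java sketch under stated assumptions. The one-line text protocol (`cpu` answered by `cpu 42`), the class `StatelessQuery`, and the in-process fake server are invented for illustration; the slides do not describe PARMON's wire format, and the real server is written in C.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: the text protocol here is NOT PARMON's actual wire format.
final class StatelessQuery {

    // One self-contained request: connect, send a request line, read one reply line.
    static String query(String host, int port, String request) throws IOException {
        try (Socket s = new Socket(host, port);
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(s.getOutputStream(), StandardCharsets.US_ASCII), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.US_ASCII))) {
            out.println(request);
            return in.readLine();
        }
    }

    // Stand-in for a node daemon: every connection is answered on its own,
    // and nothing is remembered between requests.
    static Thread startFakeServer(ServerSocket server) {
        Thread t = new Thread(() -> {
            while (!server.isClosed()) {
                try (Socket c = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(c.getInputStream(), StandardCharsets.US_ASCII));
                     PrintWriter out = new PrintWriter(
                             new OutputStreamWriter(c.getOutputStream(), StandardCharsets.US_ASCII), true)) {
                    String request = in.readLine();
                    out.println("cpu".equals(request) ? "cpu 42" : "error unknown-request");
                } catch (IOException e) {
                    // accept() fails once the socket is closed; the loop condition then ends the thread.
                }
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            startFakeServer(server);
            String host = server.getInetAddress().getHostAddress();
            int port = server.getLocalPort();
            // One thread per meter, each issuing its own independent request.
            Thread[] meters = new Thread[2];
            for (int i = 0; i < meters.length; i++) {
                meters[i] = new Thread(() -> {
                    try {
                        System.out.println(query(host, port, "cpu"));
                    } catch (IOException e) {
                        System.err.println("query failed: " + e.getMessage());
                    }
                });
                meters[i].start();
            }
            for (Thread m : meters) {
                m.join();
            }
        }
    }
}
```

Because each request carries everything the server needs, a client thread can be started or dropped at any time without the server having to clean up session state.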
Setting Up PARMON
 Server installation & invocation
 Binding to a port
 Rights (root permission is required for full functionality)
 parmond or parmond <port-no>
(either at boot time or online)
 Must be loaded on every node to be monitored
 Client installation & invocation
 Java-based client (the client machine can be any PC or workstation supporting a JVM)
 CLASSPATH (pointing to classes.zip and parmon.jar)
 jar file (parmon.jar)
 java parmon or java parmon <port-no>
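Both the server and the client accept an optional port number. A minimal sketch of how a launcher might resolve that argument is shown below. `PortArgument` is a hypothetical helper, and `DEFAULT_PORT` is a placeholder value chosen for illustration, since the slides do not state PARMON's default port.

```java
// Hypothetical helper: not part of PARMON.
final class PortArgument {
    // Placeholder only; the slides do not give the real default port.
    static final int DEFAULT_PORT = 5000;

    // Returns the port given as the first argument, or the default when no
    // argument is supplied. Rejects non-numeric and out-of-range values.
    static int resolve(String[] args) {
        if (args.length == 0) {
            return DEFAULT_PORT;
        }
        int port;
        try {
            port = Integer.parseInt(args[0].trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("port must be a number: " + args[0], e);
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range (1-65535): " + port);
        }
        return port;
    }

    public static void main(String[] args) {
        System.out.println("Using port " + resolve(args));
    }
}
```

Whatever port is chosen, the client has to be started with the same value as the servers it is meant to reach.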
Monitoring System Activities and Resource Utilization
PARMON Launcher
Creation of Node Database
Node Deletion
Group Creation
Group Modification/Deletion
Resource Utilization at a Glance
Selection of Nodes/Group
CPU Usage Monitoring
Memory Usage Monitoring
Disk/Network Usage Monitoring
Message Viewer (System logs)
Process Activities
Kernel Data Catalog - CPU
Kernel Data Catalog - Memory
Kernel Data Catalog - Disk
Kernel Data Catalog - Network
Catalog of CPU Parameters
Component View - Physical
Component View - Logical
Message Broadcast
System Configuration
System Information
Issuing Commands: halt, shutdown, etc.
Node Diagnostics - Online (SunVTS)
Online Help
PARMON Integration with Other Products
 PARMON can send resource utilization information to any other product, provided that product's protocol is made available.
[Figure: parmond on Node 1 through Node N; PARAM online bulletin board]
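One way to picture this kind of integration is a small adapter layer that forwards each utilization reading to whichever external products have registered. The Java sketch below is an assumption-laden illustration: `UtilizationSink`, `BulletinBoardSink`, and `Forwarder` are hypothetical names, and the text format written to the "bulletin board" is made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical adapter layer; the slides do not describe PARMON's integration API.
interface UtilizationSink {
    void publish(String node, String component, double percent);
}

// Stand-in for an external product: it simply collects formatted lines.
final class BulletinBoardSink implements UtilizationSink {
    final List<String> lines = new ArrayList<>();

    @Override
    public void publish(String node, String component, double percent) {
        lines.add(String.format(Locale.ROOT, "%s %s %.1f%%", node, component, percent));
    }
}

// Fans each reading out to every registered sink.
final class Forwarder {
    private final List<UtilizationSink> sinks = new ArrayList<>();

    void register(UtilizationSink sink) {
        sinks.add(sink);
    }

    void forward(String node, String component, double percent) {
        for (UtilizationSink sink : sinks) {
            sink.publish(node, component, percent);
        }
    }
}

final class IntegrationDemo {
    public static void main(String[] args) {
        BulletinBoardSink board = new BulletinBoardSink();
        Forwarder forwarder = new Forwarder();
        forwarder.register(board);
        forwarder.forward("node01", "cpu", 73.5);   // made-up reading
        board.lines.forEach(System.out::println);
    }
}
```

Supporting a further product would then mean writing one more `UtilizationSink` that speaks that product's protocol.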
Conclusions and Future Directions
 PARMON has been used successfully to monitor the PARAM OpenFrame supercomputer, a cluster of 48 Ultra-4 workstations running the Sun Solaris operating system.
 Portable across platforms supporting Java
 Comprehensive monitoring support and GUI
 PARMON supports Solaris and Linux clusters; support for NT clusters is planned.
 Can easily be extended to support web-based monitoring of clusters by creating an interface server (running on a web server) between the client and the PARMON servers running on the cluster nodes.
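The proposed interface server can be pictured as a thin HTTP front end that translates web requests into queries against the node daemons. The sketch below uses the JDK's built-in `com.sun.net.httpserver.HttpServer`. The class `MonitoringGateway`, the `/cpu/<node>` URL layout, and the `backend` function (a stand-in for whatever code would talk to a node) are assumptions for illustration, not part of PARMON.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

// Hypothetical gateway sketch; the slides only propose the idea.
final class MonitoringGateway {

    // Starts an HTTP server on an ephemeral loopback port. `backend` maps a
    // node name to a text reply and stands in for a real query to that node.
    static HttpServer start(Function<String, String> backend) throws IOException {
        HttpServer http = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        http.createContext("/cpu/", exchange -> {
            String node = exchange.getRequestURI().getPath().substring("/cpu/".length());
            byte[] body = backend.apply(node).getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "text/plain; charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        http.start();
        return http;
    }

    public static void main(String[] args) throws IOException {
        HttpServer http = start(node -> node + " cpu 42");   // stub backend with made-up data
        try {
            URL url = URI.create("http://127.0.0.1:" + http.getAddress().getPort()
                    + "/cpu/node01").toURL();
            try (InputStream in = url.openStream()) {
                System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
            }
        } finally {
            http.stop(0);
        }
    }
}
```

With such a gateway in place, a browser would need only HTTP access to the web server and no direct connection to the cluster nodes.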
Thank YOU
?