Rocks Clusters
SUN HPC Consortium
November 2004
Federico D. Sacerdoti
Advanced CyberInfrastructure Group
San Diego Supercomputer Center
• Rocks Identity
• Rocks Mission
• Why Rocks
• Rocks Design
• Rocks Technologies, Services, Capabilities
• Rockstar
Copyright © 2004 F. Sacerdoti, M. Katz, G. Bruno, P. Papadopoulos, UC Regents
Rocks Identity
• System to build and manage Linux Clusters
 General Linux maintenance system for N nodes
 Desktops too
 Happens to be good for clusters
• Free
• Mature
• High Performance
Designed for scientific workloads

Rocks Mission
• Make Clusters Easy (Papadopoulos, 00)
• Most cluster projects assume a sysadmin will help build the cluster.
• Build a cluster without assuming CS knowledge
 Simple idea, complex ramifications
 Automatic configuration of all components and services
 ~30 services on frontend, ~10 services on compute nodes
 Clusters for Scientists
• Results in a very robust system that is insulated from human mistakes

Why Rocks
• Easiest way to build a Rockstar-class machine with SGE ready out of the box
• More supported architectures
Pentium, Athlon, Opteron, Nocona, Itanium
• More happy users
 280 registered clusters, 700 member support list
 HPCwire Readers Choice Awards 2004
• More configured HPC software: 15 optional extensions (rolls) and counting.
• Unmatched Release Quality.

Why Rocks
• Big projects use Rocks
 BIRN (20 clusters)
 GEON (20 clusters)
 NBCR (6 clusters)
• Supports different clustering toolkits
Rocks Standard (RedHat HPC)
SCore (Single Process Space)
OpenMosix (Single Process Space: on the way)

Rocks Design
• Uses RedHat’s intelligent installer
Leverages RedHat’s ability to discover & configure hardware
Everyone tries System Imaging at first
Who has homogeneous hardware?
If so, whose cluster stays that way?
• Description Based install: Kickstart
Like Jumpstart
Contains a viable Operating System
No need to “pre-configure” an OS

Rocks Design
• No special “Rocksified” package structure. Can install any RPM.
• Where Linux core packages come from:
RedHat Advanced Workstation (from SRPMS)
Enterprise Linux 3

Rocks Leap of Faith
• Install is the primitive operation for upgrade and patch
Seems wrong at first
Why must you reinstall the whole thing?
Actually right: debugging a Linux system is fruitless at this scale.
Reinstall enforces stability.
Primary user has no sysadmin to help troubleshoot
• Rocks install is scalable and fast: 15min for entire cluster
Post script work done in parallel by compute nodes.
Power admins may use up2date or yum for patches; patches reach compute nodes by reinstall.

Rocks Technology
Cluster Integration with Rocks
• Build a frontend node
 Insert CDs: Base, HPC, Kernel, optional Rolls
 Answer install screens: network, timezone, password
• Build compute nodes
 Run insert-ethers on frontend (dhcpd listener)
 PXE boot compute nodes in name order
• Start computing
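
Why "in name order" matters can be sketched in a few lines. The Python below is an illustration only, not insert-ethers' own code: it assumes nodes are named compute-<rack>-<rank> in the order their first DHCP request is seen, and the function name, MAC addresses, and parameters are hypothetical.

```python
from itertools import count


def assign_names(macs_in_boot_order, rack=0, prefix="compute"):
    """Give each newly seen MAC address the next free node name.

    Hypothetical helper: names are handed out strictly in arrival order,
    so the order in which nodes are powered on decides which box gets
    which name.
    """
    names = {}
    rank = count()
    for mac in macs_in_boot_order:
        mac = mac.lower()
        if mac in names:
            # A repeated DHCP request from a node that is already known.
            continue
        names[mac] = f"{prefix}-{rack}-{next(rank)}"
    return names


# Made-up MAC addresses; the third entry is a repeat of the first.
seen = [
    "00:0E:0C:00:00:0A",
    "00:0E:0C:00:00:0B",
    "00:0e:0c:00:00:0a",
    "00:0E:0C:00:00:0C",
]
names = assign_names(seen)
for mac, name in names.items():
    print(mac, name)
```

Booting nodes one at a time, in physical rack order, keeps the name on the label matching the position in the rack.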

Rocks Tech: Dynamic Kickstart File
On node install

Rocks Roll Architecture
• Rolls are Rocks Modules
Think Apache
• Software for cluster
3rd party tarballs
Automatically configured
RPMS plus Kickstart graph in ISO form.
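
A rough way to picture "RPMS plus Kickstart graph" is a roll adding nodes and edges to a base graph, which is then walked from an appliance to collect its package list. The sketch below is a simplified model under that assumption, not the Rocks implementation; every graph node, package name, and function name in it is hypothetical.

```python
from collections import deque

# Hypothetical base graph: an edge points from an appliance or module
# to the modules it includes.
BASE_EDGES = {
    "frontend": ["base", "dhcp-server"],
    "compute": ["base"],
    "base": ["ssh"],
}
BASE_PACKAGES = {
    "base": ["kernel", "glibc"],
    "ssh": ["openssh"],
    "dhcp-server": ["dhcp"],
}

# Hypothetical roll: extra edges plus the packages behind the new module.
HPC_ROLL = {
    "edges": {"compute": ["mpi"], "frontend": ["mpi"]},
    "packages": {"mpi": ["mpich"]},
}


def merge_roll(edges, packages, roll):
    """Return copies of the graph and package map with the roll added."""
    merged_edges = {node: list(targets) for node, targets in edges.items()}
    merged_packages = {node: list(pkgs) for node, pkgs in packages.items()}
    for node, targets in roll["edges"].items():
        merged_edges.setdefault(node, []).extend(targets)
    for node, pkgs in roll["packages"].items():
        merged_packages.setdefault(node, []).extend(pkgs)
    return merged_edges, merged_packages


def packages_for(appliance, edges, packages):
    """Breadth-first walk from an appliance, collecting packages on the way."""
    seen = {appliance}
    queue = deque([appliance])
    collected = []
    while queue:
        node = queue.popleft()
        collected.extend(packages.get(node, []))
        for target in edges.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return collected


plain = packages_for("compute", BASE_EDGES, BASE_PACKAGES)
edges, pkgs = merge_roll(BASE_EDGES, BASE_PACKAGES, HPC_ROLL)
with_roll = packages_for("compute", edges, pkgs)
print(plain)
print(with_roll)
```

In this model the roll never edits the base description; it only contributes additional edges and packages, which is why a roll can be added or left out at install time.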

Rocks Tech: Dynamic Kickstart File
With Roll (HPC)

Rocks Tech: Wide Area Net Install
Install a frontend without CDs
• Can install from minimal boot image
• Rolls downloaded
• Community can build specific extensions

Rocks Tech: Security & Encryption
To protect the kickstart file

Rocks Tech: 411 Information Service
• 411 does NIS
Distribute passwords
• File based, simple
HTTP transport
• Scalable
• Secure

Rocks Services
Rocks Cluster Homepage

Rocks Services: Ganglia Monitoring

Rocks Services: Job Monitoring
SGE Batch System

Rocks Services: Job Monitoring
How a job affects resources on this node

Rocks Services: Configured, Ready
• Grid (Globus, from NMI)
• Condor (NMI)
Globus GRAM
• MPD parallel job launcher (Argonne)
MPICH 1, 2
• Intel Compiler set

Rocks Capabilities
High Performance Interconnect
• Myrinet
 All major versions, GM2
 Automatic configuration and support in Rocks since the first release
• Infiniband
 Via Collaboration with AMD & Infinicon

Rocks Visualization “Viz” Wall
Enables LCD Clusters
One PC / tile
 Gigabit Ethernet
 Tile Frame
Large remote sensing
Volume Rendering
Seismic Interpretation
Electronic Visualization Lab
 Bio-Imaging (NCMIR BioWall)

Rockstar Cluster
• Collaboration between SDSC and SUN
• 129 Nodes: Sun V60x (Dual P4 Xeon)
Gigabit Ethernet Networking (copper)
Top500 list positions: 201, 433
• Built on showroom floor of Supercomputing Conference 2003
Racked, Wired, Installed: 2 hrs total
Running apps through SGE

Building of Rockstar

Rockstar Topology
24-port switches
Not a symmetric network
Best case - 4:1 bisection bandwidth
Worst case - 8:1
Average - 5.3:1
Linpack achieved 49% of peak
Very close to the percentage of peak of the 1st generation DataStar at SDSC
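
Ratios such as 4:1 and 8:1 can be read as node-facing bandwidth divided by uplink bandwidth on an edge switch. The sketch below shows that arithmetic only; the port splits are hypothetical examples chosen to fit a 24-port switch, not the actual Rockstar wiring, and it assumes every port runs at the same speed.

```python
from fractions import Fraction


def oversubscription(node_ports, uplink_ports, switch_ports=24):
    """Ratio of node-facing ports to uplink ports on one edge switch.

    Assumes all ports have equal speed, so the port ratio equals the
    bandwidth ratio. The port counts passed in are illustrative only.
    """
    if uplink_ports <= 0:
        raise ValueError("at least one uplink is required")
    if node_ports + uplink_ports > switch_ports:
        raise ValueError("more ports requested than the switch has")
    return Fraction(node_ports, uplink_ports)


# Hypothetical port splits on a 24-port switch.
four_to_one = oversubscription(node_ports=16, uplink_ports=4)
eight_to_one = oversubscription(node_ports=16, uplink_ports=2)
print(f"{four_to_one}:1", f"{eight_to_one}:1")
```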

Rocks Future Work
• High Availability: N Frontend nodes.
 Not that far off (supplemental install server design)
 Limited by Batch System
Frontends are long lived in practice:
Keck 2 Cluster (UCSD) uptime: 249 days, 2:56
• Extreme install scaling
• More Rolls!
• Refinements

• Rocks mailing List
• Rocks Cluster Register
• Core: {fds,bruno,mjk,phil}