Introduction to
High Performance Cluster Computing
Courseware Module H.1.a
August 2008
What is HPC
• HPC = High Performance Computing
  Includes Supercomputing
• HPCC = High Performance Cluster Computing
  Note: these are NOT High Availability clusters
• HPTC = High Performance Technical Computing
• The ultimate aim of HPC users is to max out the CPUs!
Copyright © 2006, Intel Corporation. All rights reserved.
Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.
Agenda
• Parallel Computing Concepts
• Clusters
• Cluster Usage
Concurrency and Parallel Computing
A central concept in computer science is concurrency:
• Concurrency: Computing in which multiple tasks are active at the
same time.
There are many ways to use Concurrency:
• Concurrency is key to all modern Operating Systems as a way to
hide latencies.
• Concurrency can be used together with redundancy to provide
high availability.
• Parallel Computing uses concurrency to decrease program
runtimes.
HPC systems are based on Parallel Computing
Hardware for Parallel Computing
Parallel computers are classified in terms of streams of data and
streams of instructions:
• MIMD Computers: Multiple streams of instructions acting on
multiple streams of data.
• SIMD Computers: A single stream of instructions acting on
multiple streams of data.
Parallel Hardware comes in many forms:
• On chip: Instruction level parallelism (e.g. IPF)
• Multi-core: Multiple execution cores inside a single CPU
• Multiprocessor: Multiple processors inside a single computer
• Multi-computer: Networks of computers working together
Hardware for Parallel Computing
Parallel Computers
• Single Instruction Multiple Data (SIMD)*
• Multiple Instruction Multiple Data (MIMD)
  • Shared Address Space
    • Symmetric Multiprocessor (SMP)
    • Non-uniform Memory Architecture (NUMA)
  • Disjoint Address Space
    • Massively Parallel Processor (MPP)
    • Cluster
    • Distributed Computing
HPC Platform Generations
• In the 1980s, it was a vector SMP: custom components throughout.
• In the 1990s, it was a massively parallel computer: Commodity Off The Shelf (COTS) CPUs, everything else custom.
• … but today, it is a cluster: COTS components everywhere.
What is an HPC Cluster
A cluster is a type of parallel or distributed processing system,
which consists of a collection of interconnected stand-alone
computers cooperatively working together as a single,
integrated computing resource.
A typical cluster uses:
• Commodity off the shelf parts
• Low latency communication protocols
What is HPCC?
[Diagram labels: Master Node, LAN/WAN, File Server / Gateway, Interconnect, Compute Nodes, Cluster Management Tools]
A Sample Cluster Design
[Diagram: sample cluster design. Labelled components: PowerConnect 2016, Cluster Switch, Rack switches, Master Node, Control Node, Rack-mount LCD Panel/keyboard, Storage Node, Connection to storage, EMC Disk Store, Compute Nodes (…, 31, 32).]
[Diagram networks: External Network (Gigabit Ethernet, fibre); Data Network (Gigabit Ethernet, copper); Control and Out-of-Band Network (100BaseT copper).]
Cluster Architecture View
• Application: Parallel Benchmarks (Perf, Ring, HINT, NAS, …); Real Applications
• Middleware: shmem, MPI, PVM
• OS: Linux, Other OSes
• Protocol: TCP/IP, VIA, Proprietary
• Interconnect: Ethernet, Quadrics, Infiniband, Myrinet
• Hardware: Desktop, Workstation, Server 1P/2P, Server 4U+
Cluster Hardware
The Node
• A single element within the cluster
• Compute Node
  • Just computes – little else
  • Private IP address – no user access
• Master/Head/Front End Node
  • User login
  • Job scheduler
  • Public IP address – connects to external network
• Management/Administrator Node
  • Systems/cluster management functions
  • Secure administrator address
• I/O Node
  • Access to data
  • Generally internal to cluster or to data centre
Interconnect
Interconnect          Typical latency (µs)   Typical bandwidth (MB/s)
100 Mbps Ethernet     75                     8
1 Gbit/s Ethernet     60-90                  90
10 Gb/s Ethernet      12-20                  800
SCI*                  1.5-4                  200-600
Myricom Myrinet*      2.2-3                  250-1200
InfiniBand*           2-4                    900-1400
Quadrics QsNet*       3-5                    600-900
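The latency and bandwidth figures can be combined into a rough estimate of how long one message takes to cross the interconnect. The sketch below assumes the common first-order model "time = latency + size / bandwidth", which the slides do not state, and uses mid-range values from the interconnect figures; the names `INTERCONNECTS` and `transfer_time_us` are illustrative only.

```python
# First-order message cost model (an assumption, not taken from the slides):
#   transfer_time = latency + message_size / bandwidth

# (latency in microseconds, bandwidth in MB/s); mid-range values from the figures above.
INTERCONNECTS = {
    "1 Gbit/s Ethernet": (75.0, 90.0),
    "InfiniBand": (3.0, 1150.0),
}


def transfer_time_us(size_bytes, latency_us, bandwidth_mb_s):
    """Estimated time in microseconds to send one message of size_bytes."""
    # 1 MB/s equals 1 byte per microsecond (taking 1 MB = 10**6 bytes).
    return latency_us + size_bytes / bandwidth_mb_s


for name, (latency, bandwidth) in INTERCONNECTS.items():
    small = transfer_time_us(8, latency, bandwidth)          # one double
    large = transfer_time_us(1_000_000, latency, bandwidth)  # 1 MB block
    print(f"{name}: 8 B in {small:.1f} us, 1 MB in {large:.1f} us")
```

Under this model small messages are dominated by latency and large ones by bandwidth, which is one way to read the two columns side by side.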
Agenda
• Parallel Computing Concepts
• Clusters
• Cluster Usage
Cluster Usage
Performance Measurements
Usage Model
Application Classification
Application Behaviour
The Mysterious FLOPS
1 GFlops = 1 billion floating point operations per second
Theoretical vs. Real GFlops
Xeon Processor
• 1 core theoretical peak = 4 x clock speed (double precision)
• Xeons have 128-bit SSE registers, which allow the processor to carry out 2 double precision floating point add and 2 multiply operations per clock cycle
• 2 computational cores per processor
• 2 processors per node (4 cores per node)
Sustained (Rmax) = ~35-80% of theoretical peak (interconnect dependent)
You’ll NEVER hit peak!
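The peak arithmetic on this slide can be worked through for one node. This is a minimal sketch that takes the slide's figures (4 double-precision operations per cycle per core, 2 cores per processor, 2 processors per node, 35-80% sustained); the 3.0 GHz clock and the function names are assumptions chosen for illustration.

```python
def peak_gflops(clock_ghz, flops_per_cycle=4, cores_per_cpu=2, cpus_per_node=2):
    """Theoretical double-precision peak of one node in GFLOPS."""
    return clock_ghz * flops_per_cycle * cores_per_cpu * cpus_per_node


def sustained_range(peak, low=0.35, high=0.80):
    """Expected sustained (Rmax) range, using the 35-80% rule of thumb."""
    return peak * low, peak * high


node_peak = peak_gflops(3.0)  # 3.0 GHz is an assumed example clock speed
low, high = sustained_range(node_peak)
print(f"peak {node_peak:.1f} GFLOPS, sustained roughly {low:.1f} to {high:.1f} GFLOPS")
```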
Other Measures of CPU Performance
• SPEC (www.spec.org)
  – Spec CPU2000/2006 Speed – single core performance indicator
  – Spec CPU2000/2006 Rate – node performance indicator
  – SpecFP – Floating Point performance
  – SpecINT – Integer performance
• Many other performance metrics may be required
  – STREAM – memory bandwidth
  – HPL – High Performance Linpack
  – NPB – NASA suite of performance tests
  – Pallas Parallel Benchmark – another suite
  – IOZone – file system throughput
Technology Advancements in 5 Years
Codename    Release date     GHz   Number of cores   Peak FLOP per CPU cycle   Peak GFLOPS per CPU   Linpack on 256 Processors
Foster      September 2001   1.7   1                 2                         3.4                   288.9*
Woodcrest   June 2006        3.0   2                 4                         24                    4781**

Example:
* From November 2001 top500 supercomputer list (cluster of Dell Precision 530)
** Intel internal cluster built in 2006
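The peak column can be reproduced from the other columns, and the Linpack results can be compared against peak. The sketch below rests on three assumptions that the slide does not spell out: the Linpack column is in GFLOPS, "256 processors" means 256 CPUs, and the FLOP-per-cycle figure is per core (which is what reproduces the 3.4 and 24 peak values). The function names are illustrative.

```python
# Rows copied from the slide: (GHz, cores, FLOP per cycle, Linpack result).
# Assumptions: Linpack is in GFLOPS, measured on 256 CPUs, and the
# FLOP-per-cycle figure applies to each core.
CPUS = {
    "Foster": (1.7, 1, 2, 288.9),
    "Woodcrest": (3.0, 2, 4, 4781.0),
}
N_CPUS = 256


def peak_per_cpu(ghz, cores, flop_per_cycle):
    """Theoretical peak of one CPU in GFLOPS."""
    return ghz * cores * flop_per_cycle


def linpack_efficiency(ghz, cores, flop_per_cycle, linpack_gflops, n_cpus=N_CPUS):
    """Fraction of the aggregate theoretical peak achieved by Linpack."""
    return linpack_gflops / (peak_per_cpu(ghz, cores, flop_per_cycle) * n_cpus)


for name, row in CPUS.items():
    print(f"{name}: peak {peak_per_cpu(*row[:3]):.1f} GFLOPS per CPU, "
          f"Linpack efficiency {linpack_efficiency(*row):.0%}")
```

Under these assumptions the Foster cluster reaches about a third of its peak and the Woodcrest cluster a little under four fifths.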
Usage Model
• Many Serial Jobs (Capacity)
  • Electronic Design, Monte Carlo, Design Optimisation, Parallel Search
  • Batch Usage
  • Load Balancing More Important
• Normal Mixed Usage
  • Many Users
  • Mixed size Parallel/Serial jobs
  • Ability to Partition and Allocate Jobs to Nodes for Best Performance
  • Job Scheduling very important
• One Big Parallel Job (Capability)
  • Meteorology, Seismic Analysis, Fluid Dynamics, Molecular Chemistry
  • Appliance Usage
  • Interconnect More Important
Application and Usage Model
HPC clusters run parallel applications, and applications in parallel!
One single application that takes advantage of multiple computing
platforms
• Fine-Grained Application
  • Uses many systems to run one application
  • Shares data heavily across systems
  • PDVR3D (Eigenvalues and Eigenstates of a matrix)
• Coarse-Grained Application
  • Uses many systems to run one application
  • Infrequent data sharing among systems
  • Casino (Monte-Carlo stochastic methods)
• Embarrassingly Parallel Application
  • An instance of the entire application runs on each node
  • Little or no data sharing among compute nodes
  • BLAST (pattern matching)
A shared memory machine will run all sorts of applications.
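As an illustration of the embarrassingly parallel case, the sketch below runs independent Monte Carlo tasks that share no data and are combined only by a final sum. The pi estimate, the function names, and the seeds are assumptions chosen for the example. A thread pool stands in for the compute nodes only to keep the sketch self-contained; on a cluster each task would normally be a separate process or job.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_hits(seed, samples):
    """One independent task: count random points that land inside the unit circle."""
    rng = random.Random(seed)  # private generator, so tasks share no state
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(n_tasks=8, samples_per_task=20_000):
    """Run the tasks independently, then combine the results with a single sum."""
    seeds = range(n_tasks)
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        hits = list(pool.map(count_hits, seeds, [samples_per_task] * n_tasks))
    return 4.0 * sum(hits) / (n_tasks * samples_per_task)


print(f"pi is approximately {estimate_pi():.3f}")
```

Because the only communication is the final sum, the interconnect matters far less here than it does for a fine-grained application.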
Types of Applications
Forward Modelling
Inversion
Signal Processing
Searching/Comparing
Forward Modelling
Solving linear equations
Grid Based
Parallelization by domain decomposition (split and distribute the data)
Finite element/finite difference
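Domain decomposition can be sketched on a one-dimensional grid. The example below is an assumption-laden toy rather than a production scheme: it uses a 3-point averaging stencil with fixed end values, splits the grid into contiguous subdomains, and gives each subdomain one "halo" cell from each neighbour. All function names are hypothetical. The final check confirms that the decomposed update matches the serial one.

```python
def stencil_step(u):
    """One serial finite-difference style update: 3-point average, fixed ends."""
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return new


def split_domain(n, parts):
    """Return (start, stop) index ranges that tile 0..n as evenly as possible."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for p in range(parts):
        stop = start + base + (1 if p < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def decomposed_step(u, parts):
    """Same update, but each subdomain works only on its slice plus halo cells."""
    n = len(u)
    result = []
    for start, stop in split_domain(n, parts):
        # Halo exchange: copy one neighbouring cell from each side, if it exists.
        lo, hi = max(start - 1, 0), min(stop + 1, n)
        local = u[lo:hi]
        for i in range(start, stop):
            if i == 0 or i == n - 1:
                result.append(u[i])  # global boundary stays fixed
            else:
                j = i - lo           # index inside the local slice
                result.append((local[j - 1] + local[j] + local[j + 1]) / 3.0)
    return result


grid = [0.0] * 5 + [9.0] + [0.0] * 6
assert decomposed_step(grid, 4) == stencil_step(grid)
```

In a real cluster code the halo copy would be a message between neighbouring nodes, which is where interconnect latency enters.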
Inversion
From measurements (F) compute models (M) representing properties (d)
of the measured object(s).
Deterministic
• Matrix inversions
• Conjugate gradient
Stochastic
• Monte Carlo, Markov chain
• Genetic algorithms
Generally large amounts of shared memory
Parallelism through multiple runs with different models
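The conjugate gradient method listed under the deterministic approaches can be sketched in a few lines. This is a textbook form for a small symmetric positive-definite system, written in plain Python for readability; the helper names and the 2 x 2 test system are assumptions, and a real inversion code would use an optimised linear algebra library.

```python
def matvec(a, x):
    """Dense matrix-vector product."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def dot(x, y):
    return sum(x_i * y_i for x_i, y_i in zip(x, y))


def conjugate_gradient(a, b, tol=1e-10, max_iter=100):
    """Solve a x = b for a symmetric positive-definite matrix a."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - a x, with x = 0
    p = list(r)  # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        ap = matvec(a, p)
        alpha = rs_old / dot(p, ap)
        x = [x_i + alpha * p_i for x_i, p_i in zip(x, p)]
        r = [r_i - alpha * ap_i for r_i, ap_i in zip(r, ap)]
        rs_new = dot(r, r)
        p = [r_i + (rs_new / rs_old) * p_i for r_i, p_i in zip(r, p)]
        rs_old = rs_new
    return x


a = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(a, b)
print(x)  # close to [1/11, 7/11]
```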
Signal Processing/Quantum Mechanics
Convolution model (stencil)
Matrix computations (eigenvalues…)
Conjugate gradient methods
Normally not very demanding on latency and bandwidth
Some algorithms are embarrassingly parallel
Examples: seismic migration/processing, medical imaging,
SETI@Home
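The convolution model can be illustrated with a direct discrete convolution. The sketch below is a minimal, unoptimised version with made-up input values; the names `trace` and `wavelet` are illustrative. Each output sample depends only on a small window of the input, which is one reason such workloads can often be split across nodes with little communication.

```python
def convolve(signal, kernel):
    """Full discrete convolution: out[n] = sum over k of signal[k] * kernel[n - k]."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for k, s in enumerate(signal):
        for m, h in enumerate(kernel):
            out[k + m] += s * h
    return out


trace = [1.0, 2.0, 3.0]    # made-up input samples
wavelet = [0.0, 1.0, 0.5]  # made-up filter coefficients
print(convolve(trace, wavelet))  # → [0.0, 1.0, 2.5, 4.0, 1.5]
```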
Signal Processing Example
Searching/Comparing
Integer operations are more dominant than floating point
IO intensive
Pattern matching
Embarrassingly parallel – very suitable for grid computing
Application areas: encryption/decryption, message interception, bioinformatics, data mining
Example programs: BLAST, HMMER
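A toy version of the search workload shows why it parallelises so easily: every database record is scanned on its own, using string and integer operations only. The records, the pattern, and the function names below are hypothetical, and exact substring matching is a deliberate simplification; tools such as BLAST use far more elaborate scoring.

```python
def count_matches(sequence, pattern):
    """Count (possibly overlapping) occurrences of pattern in one sequence."""
    count, start = 0, sequence.find(pattern)
    while start != -1:
        count += 1
        start = sequence.find(pattern, start + 1)
    return count


def search_database(database, pattern):
    """Each record is searched independently, so records could go to different nodes."""
    return {name: count_matches(seq, pattern) for name, seq in database.items()}


# Hypothetical toy records, for illustration only.
database = {"seq1": "GATTACAGATTACA", "seq2": "CCCGGG", "seq3": "TTGATTAC"}
print(search_database(database, "GATTA"))
```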
Application Classes
Applications
• FEA – Finite Element Analysis
  • The simulation of hard physical materials, e.g. metal, plastic
  • Crash test, product design, suitability for purpose
  • Examples: MSC Nastran, Ansys, LS-Dyna, Abaqus, ESI PAMCrash, Radioss
• CFD – Computational Fluid Dynamics
  • The simulation of soft physical materials, gases and fluids
  • Engine design, airflow, oil reservoir modelling
  • Examples: Fluent, Star-CD, CFX
• Geophysical Sciences
  • Seismic Imaging – taking echo traces and building a picture of the sub-earth geology
  • Reservoir Simulation – CFD specific to oil asset management
  • Examples: Omega, Landmark VIP and Pro/Max, Geoquest Eclipse
Application Classes
Applications
• Life Sciences
  • Understanding the living world – genome matching, protein folding, drug design, bio-informatics, organic chemistry
  • Examples: BLAST, Gaussian, others
• High Energy Physics
  • Understanding the atomic and sub-atomic world
  • Software from Fermilab or CERN, or home-grown
• Financial Modelling
  • Meeting internal and external financial targets, particularly regarding investment positions
  • VaR – Value at Risk – assessing the impact of economic and political factors on the bank’s investment portfolio
  • Trader Risk Analysis – what is the risk on a trader’s position, or on that of a group of traders