PLATFORMS FOR HPJAVA: RUNTIME SUPPORT
FOR SCALABLE COMPUTATION WITH JAVA
Sang B. Lim
Florida State University
[email protected]
5/24/2017
1
Contents
- Overview of HPJava
- Runtime Support of HPJava
  - Adlib
  - mpiJava
- Contributions and Future Work
- Conclusions
- Publications
Motivation
- SPMD (Single Program, Multiple Data) programming has been very successful for parallel computing.
- Many higher-level programming environments and libraries assume the SPMD style as their basic model—ScaLAPACK, DAGH, Kelp, Global Array Toolkit.
- The library-based SPMD approach to data-parallel programming lacks the uniformity and elegance of HPF. Compared with HPF, creating distributed arrays and accessing their local and remote elements is clumsy and error-prone.
- Because the arrays are managed entirely in libraries, the compiler offers little support and no safety net of compile-time or compiler-generated run-time checking.
- These observations motivate our introduction of the HPspmd model—direct SPMD programming supported by additional syntax for HPF-like distributed arrays.
HPspmd
- Proposed by Fox, Carpenter, and Xiaoming Li around 1998.
- Independent processes executing the same program, sharing elements of distributed arrays described by special syntax.
- Processes operate directly on locally owned elements. Explicit communication is needed in the program to permit access to elements owned by other processes.
- Envisaged bindings for base languages like Fortran, C, Java, etc.
HPJava—Overview
- Environment for parallel programming.
- Extends Java by adding some predefined classes and some extra syntax for dealing with distributed arrays.
- So far the only implementation of the HPspmd model.
- An HPJava program is translated to a standard Java program which calls communication libraries and a parallel runtime system.
HPJava Example

Procs p = new Procs2(2, 2);
on(p) {
  Range x = new ExtBlockRange(M, p.dim(0), 1),
        y = new ExtBlockRange(N, p.dim(1), 1);
  float [[-,-]] a = new float [[x, y]];
  . . . Initialize edge values in 'a' (boundary conditions)
  float [[-,-]] b = new float [[x, y]], r = new float [[x, y]];  // r = residuals
  do {
    Adlib.writeHalo(a);
    overall (i = x for 1 : N - 2)
      overall (j = y for 1 : N - 2) {
        float newA = 0.25 * (a[i - 1, j] + a[i + 1, j] +
                             a[i, j - 1] + a[i, j + 1]);
        r[i, j] = Math.abs(newA - a[i, j]);
        b[i, j] = newA;
      }
    HPspmd.copy(a, b);  // Jacobi relaxation
  } while (Adlib.maxval(r) > EPS);
}
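The numerical content of the example above can be mimicked in plain, sequential Java. The sketch below is my own single-process illustration, not part of HPJava: `sweep` plays the roles of the two `overall` loops (computing `newA`, the residual array `r`, and the updated grid `b`), and the copy-back loop stands in for `HPspmd.copy(a, b)`.

```java
public class JacobiSketch {
    // One Jacobi sweep over the interior of grid a, writing results into b.
    // Returns the maximum residual, playing the role of Adlib.maxval(r).
    static float sweep(float[][] a, float[][] b) {
        int n = a.length;
        float maxR = 0.0f;
        for (int i = 1; i < n - 1; i++)
            for (int j = 1; j < n - 1; j++) {
                float newA = 0.25f * (a[i - 1][j] + a[i + 1][j]
                                    + a[i][j - 1] + a[i][j + 1]);
                maxR = Math.max(maxR, Math.abs(newA - a[i][j]));
                b[i][j] = newA;
            }
        return maxR;
    }

    public static void main(String[] args) {
        int n = 8;
        float[][] a = new float[n][n], b = new float[n][n];
        for (int j = 0; j < n; j++) a[0][j] = 1.0f;   // boundary condition
        float r;
        do {
            r = sweep(a, b);
            // Copy the interior of b back into a (the role of HPspmd.copy).
            for (int i = 1; i < n - 1; i++)
                for (int j = 1; j < n - 1; j++) a[i][j] = b[i][j];
        } while (r > 1e-4f);
        System.out.println("converged, max residual = " + r);
    }
}
```

What HPJava adds over this sequential version is the distribution of the grid over processes and the `writeHalo` call that keeps the overlapping edges consistent.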
Processes and Process Grids
- An HPJava program is started concurrently in some set of processes.
- Processes are named through "grid" objects:

Procs p = new Procs2(2, 3);

  - Assumes the program is currently executing on 6 or more processes.
- Specify execution in a particular process grid by the on construct:

on(p) {
  ...
}
2-dimensional array block-distributed over p
[Figure: an 8 x 8 array a (M = N = 8) block-distributed over the grid p; p.dim(0) ranges over the two grid rows and p.dim(1) over the three grid columns, so each process holds one contiguous block of a.]
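The block decomposition pictured above can be made concrete with a small sketch. This is my own illustration, assuming the common ceiling-division convention for block size; the exact formulas inside HPJava's BlockRange may differ.

```java
public class BlockDistSketch {
    // Block size when n elements are divided over p process coordinates
    // (ceiling division, so early coordinates get the full block).
    static int blockSize(int n, int p) { return (n + p - 1) / p; }

    // Coordinate of the process that owns global index i.
    static int owner(int i, int n, int p) { return i / blockSize(n, p); }

    // First global index held by process coordinate c.
    static int lo(int c, int n, int p) { return c * blockSize(n, p); }

    public static void main(String[] args) {
        int n = 8, p = 2;   // e.g. the row dimension of the 8 x 8 example
        for (int i = 0; i < n; i++)
            System.out.println("row " + i + " lives on coordinate " + owner(i, n, p));
    }
}
```

With n = 8 and p = 2 each coordinate holds a block of 4 rows, matching the figure.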
Distributed Arrays in HPJava
- Many differences between distributed arrays and the ordinary arrays of Java. A new kind of container class with special syntax.
- Type signatures and constructors use double brackets to emphasize the distinction:

Procs2 p = new Procs2(2, 3);
on(p) {
  Range x = new BlockRange(M, p.dim(0));
  Range y = new BlockRange(N, p.dim(1));
  float [[-,-]] a = new float [[x, y]];
  ...
}
The Range hierarchy of HPJava
- Range is the abstract base class; its subclasses include BlockRange, ExtBlockRange, CyclicRange, BlockCyclicRange, IrregRange, CollapsedRange, and Dimension.
Distributed Array Descriptor
- Describes how the elements of a particular array are distributed across the available processors.
- The abstract DAD for a rank-r array includes:
  - a distribution group,
  - an integer base,
  - r range objects, and
  - r integer strides.
- We have developed lightweight, pure-Java versions of support classes such as Group and Range—the components of the DAD.
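The role of the base and strides in the descriptor can be pictured with a toy class. This is a hypothetical sketch of my own (the class and method names are not HPJava's, and the group and range components are elided): the local offset of an element is the base plus the stride-weighted sum of its subscripts.

```java
public class DadSketch {
    // A toy rank-r descriptor: base offset plus r integer strides.
    final int base;
    final int[] strides;   // one stride per array dimension

    DadSketch(int base, int[] strides) {
        this.base = base;
        this.strides = strides;
    }

    // Offset of the element with the given local subscripts
    // in the one-dimensional backing storage.
    int offset(int[] sub) {
        int off = base;
        for (int d = 0; d < strides.length; d++)
            off += strides[d] * sub[d];
        return off;
    }

    public static void main(String[] args) {
        // A rank-2 local block with 5 columns, stored row-major: strides {5, 1}.
        DadSketch dad = new DadSketch(0, new int[]{5, 1});
        System.out.println(dad.offset(new int[]{2, 3}));   // row 2, column 3 -> 13
    }
}
```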
Communication Library
- Communication should go through calls to library functions in the source code.
- The HPJava binding of the Adlib library is currently being used as a basis.
  - The Adlib library was completed in the Parallel Compiler Runtime Consortium (PCRC) project.
  - Initial emphasis was on High Performance Fortran (HPF).
- There are three main families of collective operation in Adlib:
  - regular collective communications
  - reduction operations
  - irregular communications
- Need to serialize the operations for Java types.
Regular Collective Communications
- remap
  - Copies the values of the elements in the source array to the corresponding elements in the destination array.

void remap (T [[-]] dst, T [[-]] src);

  - T stands as shorthand for any primitive type of Java.
  - Destination and source must have the same size and shape, but they can have any, unrelated, distribution formats.
  - Can implement a multicast if the destination has a replicated distribution format.
- shift

void shift (T [[-]] dst, T [[-]] src, int amount, int dimension);

  - Implements a simpler pattern of communication than general remap.
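The data movement performed by shift can be pictured sequentially. The sketch below is my own 1-D, single-process illustration of the pattern only, assuming dst[i] receives src[i - amount]; the sign convention and edge modes of the real Adlib shift may differ.

```java
public class ShiftSketch {
    // dst[i] = src[i - amount] where that index exists;
    // elements with no source are left at their default value 0.
    static int[] shift(int[] src, int amount) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            int j = i - amount;
            if (j >= 0 && j < src.length) dst[i] = src[j];
        }
        return dst;
    }

    public static void main(String[] args) {
        int[] dst = shift(new int[]{1, 2, 3, 4}, 1);
        System.out.println(java.util.Arrays.toString(dst));   // [0, 1, 2, 3]
    }
}
```

In the distributed setting the same pattern moves whole edge sections between neighboring processes along one dimension of the grid.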
writeHalo

void writeHalo (T [[-]] a);

- Applied to distributed arrays that have ghost regions; it updates those regions.
- A more general form of writeHalo allows the caller to specify that only a subset of the available ghost area is to be updated:

void writeHalo (T [[-]] a, int wlo, int whi, int mode);

  - wlo, whi: the widths at the lower and upper ends of the bands to be updated.
Solution of Laplace equation using ghost regions

Range x = new ExtBlockRange(M, p.dim(0), 1);
Range y = new ExtBlockRange(N, p.dim(1), 1);
float [[-,-]] a = new float [[x, y]];
. . . Initialize values in 'a'
float [[-,-]] b = new float [[x, y]], r = new float [[x, y]];
do {
  Adlib.writeHalo(a);
  overall (i = x for 1 : N - 2)
    overall (j = y for 1 : N - 2) {
      float newA = 0.25 * (a[i - 1, j] + a[i + 1, j] +
                           a[i, j - 1] + a[i, j + 1]);
      r[i, j] = Math.abs(newA - a[i, j]);
      b[i, j] = newA;
    }
  HPspmd.copy(a, b);
} while (Adlib.maxval(r) > EPS);

[Figure: the local segments of 'a' on neighboring processes, each extended by a one-element-wide ghost region that replicates the edge elements of the adjacent segment.]
Illustration of the effect of executing the writeHalo function
[Figure: the physical segment of the array held locally, the "declared" ghost region surrounding that segment, and the ghost area actually written by writeHalo.]
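The effect of writeHalo can be simulated in ordinary Java by modelling two neighboring "processes" as arrays that each carry one ghost cell at either end. This is my own sequential sketch of the idea, not HPJava code: each block's edge interior element is copied into the neighbor's ghost cell.

```java
public class HaloSketch {
    // Each "process" holds [ghost, interior..., ghost].
    // writeHalo copies each block's edge interior element
    // into the adjacent block's ghost cell.
    static void writeHalo(float[] left, float[] right) {
        int n = left.length;
        left[n - 1] = right[1];        // right block's first interior element
        right[0]    = left[n - 2];     // left block's last interior element
    }

    public static void main(String[] args) {
        float[] left  = {0, 10, 11, 12, 0};   // ghost, 3 interior cells, ghost
        float[] right = {0, 20, 21, 22, 0};
        writeHalo(left, right);
        System.out.println(left[4] + " " + right[0]);   // 20.0 12.0
    }
}
```

After the update each block can apply the Jacobi stencil to all of its interior cells using only local data, which is exactly why the Laplace solver calls Adlib.writeHalo at the top of each iteration.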
Irregular distributed data structures
- Can be described as a distributed array of Java arrays:

float [[-]][] a = new float [[x]][];
overall (i = x : )
  a[i] = new float [ f(i`) ];

[Figure: a distributed array of four Java arrays, with local sizes 4, 2, 5, and 3.]
Issues about Java implementation of Adlib
- A new set of low-level APIs, called mpjdev.
- Must handle Java objects and Java arrays, plus distributed array elements.
MPJ
- Java Grande Message-Passing Working Group
  - formed as a subset of the existing Concurrency and Applications working group of the Java Grande Forum.
  - Discussion of a common API for MPI-like Java libraries.
  - To avoid confusion with standards published by the original MPI Forum, the nascent API is called MPJ.
- The java-mpi mailing list hosted at CSIT at FSU has about 125 subscribers.
mpiJava
- Implements the Java API for MPI suggested in late '97.
- mpiJava is currently implemented as a Java interface to an underlying MPI implementation—such as MPICH or some other native MPI implementation.
- The interface between mpiJava and the underlying MPI implementation is via the Java Native Interface (JNI).
- This software is available from
  http://aspen.csit.fsu.edu/pss/HPJava/mpiJava.html
  - Around 600 people have downloaded this software.
Future Work
Goals
- The proposed research is concerned with enabling parallel, high-performance computation in modern Internet-oriented environments.
- Why not Fortran?
  - It is becoming a marginalized language, with limited economic incentive for vendors to produce modern development environments, optimizing compilers for new hardware, or other kinds of associated software expected by today's programmers.
- Issues concerned with the implementation of the runtime environment underlying HPJava:
  - low-level API for underlying communications
  - support of Java object types
  - high-level APIs (e.g. MPJ and a Java version of Adlib)
Low-level API
- One area of research is how to transfer data between the Java program and the network while reducing the overheads of the Java Native Interface.
- The goal is an API that runs portably on network platforms and efficiently on parallel hardware.
- Recently, we proposed a low-level Java API for HPC message passing, called mpjdev.
mpjdev I
- Meant for library developers.
- Application-level communication libraries like the Java version of Adlib or MPJ might be implemented on top of mpjdev.
- mpjdev may be implemented on top of Java sockets in a portable network implementation, or—on HPC platforms—through a JNI interface to a subset of MPI.
  - The initial version of mpjdev has been targeted to HPC platforms—the latter case.
  - A Java sockets version, which will provide a more portable network implementation, will be added in the future.
mpjdev II
- The API of mpjdev is small compared to MPI (it only includes point-to-point communications):
  - blocking mode (like MPI_SEND, MPI_RECV)
  - non-blocking mode (like MPI_ISEND, MPI_IRECV)
- The sophisticated data types of MPI are omitted.
- Instead, it provides a reasonably flexible suite of operations for copying data to and from the buffer (gather- and scatter-style operations).
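The gather- and scatter-style buffer operations mentioned above can be illustrated in plain Java. This is a hypothetical sketch of my own (the class and method names are not mpjdev's): gather packs an indexed section of an array into a contiguous buffer suitable for sending, and scatter unpacks a received buffer back into place.

```java
public class BufferOps {
    // Gather: copy src[indices[k]] into a contiguous buffer for sending.
    static float[] gather(float[] src, int[] indices) {
        float[] buf = new float[indices.length];
        for (int k = 0; k < indices.length; k++) buf[k] = src[indices[k]];
        return buf;
    }

    // Scatter: copy received buffer elements back to dst at the given indices.
    static void scatter(float[] buf, float[] dst, int[] indices) {
        for (int k = 0; k < indices.length; k++) dst[indices[k]] = buf[k];
    }

    public static void main(String[] args) {
        float[] src = {0f, 1f, 2f, 3f, 4f, 5f};
        int[] evens = {0, 2, 4};
        float[] buf = gather(src, evens);   // pack a strided section
        float[] dst = new float[6];
        scatter(buf, dst, evens);           // unpack on the receiving side
        System.out.println(java.util.Arrays.toString(dst));
        // [0.0, 0.0, 2.0, 0.0, 4.0, 0.0]
    }
}
```

Operations of this shape let a small point-to-point layer handle the strided sections of distributed arrays without reproducing MPI's derived datatypes.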
HPJava communication layers
[Figure: the layering of the HPJava runtime—application-level libraries such as the Java version of Adlib and MPJ sit on top of mpjdev, which in turn runs over native MPI or Java sockets.]
Java Object types I
- Need to fully support communication of intrinsic Java types:
  - primitive types
  - objects
  - all kinds of Java array types
  - plus the multidimensional arrays of HPJava.
- Java object types should be included among the basic communication data types and properly implemented.
- We are especially interested in supporting efficient communication of scientific array objects like those supported by the Java Grande Numerics Working Group.
Java Object types II
- The new I/O package of JDK 1.4 includes some features discussed by the Java Grande Numerics Working Group:
  - creation of contiguous memory blocks for arrays
  - non-blocking communication
- I will be investigating these new features of the JDK to see how they can be exploited in our project.
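The "contiguous memory blocks for arrays" feature refers to the direct buffers of the java.nio package introduced in JDK 1.4. As a minimal sketch, the snippet below allocates a direct buffer and bulk-copies a float array in and out; whether this actually reduces JNI copying for our communication layer is one of the questions to investigate.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

public class NioSketch {
    public static void main(String[] args) {
        float[] data = {1.0f, 2.0f, 3.0f, 4.0f};
        // A direct buffer occupies a contiguous block outside the Java heap,
        // so native communication code can access it without an extra JNI copy.
        ByteBuffer bytes = ByteBuffer.allocateDirect(data.length * 4)
                                     .order(ByteOrder.nativeOrder());
        FloatBuffer floats = bytes.asFloatBuffer();
        floats.put(data);              // bulk copy: array -> buffer
        float[] back = new float[data.length];
        floats.flip();
        floats.get(back);              // bulk copy: buffer -> array
        System.out.println(back[2]);   // 3.0
    }
}
```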
High-level APIs—Java version of Adlib
- This API is intended for an application-level communication library suitable for HPJava programming.
- So far a Java version of one of the underlying communication schedules, Remap, has been implemented on top of mpjdev—others are being added rapidly.
High-level APIs—MPJ I
- There is no complete implementation of the draft MPJ specification.
  - Our mpiJava wrappers rely on the availability of a platform-specific native MPI implementation for the target computer.
- Ideally MPJ should be highly portable—assuming only a Java development environment.
- Design goals are that the system should be as easy to install on distributed systems as we can reasonably make it, and that it be sufficiently robust to be usable in an Internet environment.
High-level APIs—MPJ II
- Resource discovery
  - Technically, Jini discovery and lookup seems an obvious choice: daemons register with lookup services. Sun support for Jini is declining, but JXTA, for example, offers similar services.
- Handling "partial failures"
  - A usable MPI implementation must deal with unexpected process termination or network failure without leaving orphan processes or leaking other resources.
  - The Jini-like paradigms of leasing and distributed events can be used to detect failures and reclaim resources in the event of failure.
- We will also investigate newer ideas coming from projects like JXTA.
Conclusions
- We reviewed the HPJava language and its communication libraries—Adlib and mpiJava.
- The proposed research will particularly address issues concerned with the implementation of the run-time environment underlying HPJava:
  - low-level API for underlying communications
  - support of Java types
  - high-level APIs (e.g. MPJ and a Java version of Adlib)
  - implementation on network platforms
- The dissertation research will begin to address these requirements and deliver at least a complete Java-centric implementation of an application-level library for HPJava programming.
Publications
1. Han-Ku Lee, Bryan Carpenter, Geoffrey Fox, Sang Boem Lim. Benchmarking HPJava: Prospects for Performance. Sixth Workshop on Languages, Compilers, and Run-time Systems for Scalable Computers (LCR2002), March 2002.
2. Bryan Carpenter, Geoffrey Fox, Han-Ku Lee and Sang Lim. Translation of the HPJava Language for Parallel Programming. The 14th Annual Workshop on Languages and Compilers for Parallel Computing (LCPC2001), May 2001.
3. Bryan Carpenter, Geoffrey Fox, Sung-Hoon Ko and Sang Lim. Object Serialization for Marshalling Data in a Java Interface to MPI. ACM 1999 Java Grande Conference, June 1999.
4. Mark Baker, Bryan Carpenter, Geoffrey Fox, Sung-Hoon Ko and Sang Lim. mpiJava: An Object-Oriented Java Interface to MPI. International Workshop on Java for Parallel and Distributed Computing, IPPS/SPDP 1999, San Juan, Puerto Rico, April 1999.
Publications
5. Bryan Carpenter, Geoffrey Fox, Sung-Hoon Ko and Sang Lim. Automatic Object Serialization in the mpiJava Interface to MPI. Third MPI Developer's and User's Conference, MPIDC '99, March 1999.
6. Bryan Carpenter and Sang Boem Lim. A Low-level Java API for HPC Message Passing. February 27, 2002.
7. Bryan Carpenter, Guansong Zhang, Han-Ku Lee and Sang Lim. Parallel Programming in HPJava. Draft of May 2001. http://aspen.csit.fsu.edu/pss/HPJava/
8. Bryan Carpenter, Geoffrey Fox, Sung-Hoon Ko and Sang Lim. mpiJava 1.2: API Specification. October 1999. http://aspen.csit.fsu.edu/pss/HPJava/mpiJava.html