Systems and Environments for High Performance Java Computing
Vladimir Getov
4 January 2006

Part 1: Introduction

Relevant publications:
- V. Getov. Java in High-Performance Computing – Guest Editorial. FGCS, vol. 18(2), v-vi, Oct. 2001.
- M. Philippsen, R. Boisvert, V. Getov, R. Pozo, J. Moreira, D. Gannon, G. Fox. JavaGrande – High Performance Computing with Java. Proceedings of PARA 2000 Conference, LNCS, Springer, vol. 1947, 20-36, 2001.
- High Performance Computing Using Java Technology – Tutorial lecture. ACM JavaGrande and JavaOne 2001 Conferences, San Francisco, June 2001.

Some Important Facts About Current Computer Systems
- The gap between processing speed and memory access speed continues to grow.
- So does the gap between high-level programming models and underlying hardware architectures.
- The wide variety of hardware architectures makes it particularly difficult to achieve portable high performance.
- The pace of innovation is such that investment in tuning for one machine may not pay off before that machine is obsolete.

Pros/Cons of using Java
- Pros: Java offers tremendous potential for portability and heterogeneous execution (bytecode representation, RMI, object serialization).
- Cons: Java still suffers a significant performance penalty and, as with any new language, the thought of rewriting existing codes brings reluctance and lack of enthusiasm.

Background Observations
- Java thread model is insufficient
  - Message Passing model is important to support
- Performance is critical
  - Many applications need “high” performance
- Proper numerical computing
  - Complex, arrays, performance, reproducibility

Part 2: High Performance Java

Relevant publications:
- V. Getov. A Mixed-Language Programming Methodology for High Performance Java Computing. In: R. Boisvert and P. Tang (Eds.) The Architecture of Scientific Software. Kluwer Academic Publishers, 333-347, 2001.
- Q. Lu, V. Getov. Mixed-Language High-Performance Computing for Plasma Simulations. Scientific Programming, vol. 11(1), 57-66, 2003.
Mixed-Language Programming with Java
- Java is a highly-portable language
- Java adheres to the “Write once, run anywhere” philosophy
- Java has a well-established collection of scientific library bindings
- Java’s execution speed is suitable for HPC

- C/Fortran are highly-portable languages
- C/Fortran adhere to the “Write once, run anywhere” philosophy
- C/Fortran have well-established scientific libraries
- C/Fortran execution speeds are suitable for HPC

So, What Language to Use?
- Java is a highly-portable language
- Java adheres to the “Write once, run anywhere” philosophy
- C/Fortran have well-established scientific library bindings
- C/Fortran execution speeds are suitable for HPC

Utilize Java for its portability and standardization, but focus on using Java as a wrapper for porting of native code in the form of shared libraries. This involves the least amount of work and guarantees maximum performance on different platforms.

Difficulties in binding a native library to Java
- Data formats in Java and C differ:
  - sizes of primitive types;
  - C pointers;
  - multidimensional arrays;
  - C structures.
- Several different Java native method interfaces still exist.
- A native interface is inadequate for calling existing library functions.

JCI Block Diagram
[Figure: JCI block diagram]

Legacy libraries bound to Java so far

Library     Lang.   Func.   C       Java
MPI         C       128     4434    439
BLACS       C       76      5702    489
BLAS        F77     21      2095    169
PBLAS       C       22      2567    127
PB-BLAS     F77     30      4973    241
LAPACK      F77     14      765     65
ScaLAPACK   F77     38      5373    293

Mixed-language programming based on JVM
[Figure: NPB EP kernel on IBM SP2]

Mixed-language programming based on HPCJ
[Figure: NPB IS kernel on IBM SP2]

Part 3: Message Passing in Java

Relevant publications:
- B. Carpenter, V. Getov, G. Judd, A. Skjellum, G. Fox. MPJ: MPI-like Message Passing for Java. Concurrency: Practice and Experience, vol. 12(11), 1019-1038, 2000.
- V. Getov, P. Gray, V. Sunderam. Aspects of Portability and Distributed Execution for JNI-Wrapped Message Passing Libraries. Concurrency: Practice and Experience, vol. 12(11), 1039-1050, 2000.
- V. Getov, M. Philippsen. Java Communications for Large-Scale Parallel Computing. Proceedings of SciComp'01 Conference, LNCS, Springer, vol. 2179, 33-45, 2001.

Message Passing - Motivation
- The existing communication packages in Java (RMI, API to BSD sockets) are optimized for client/server programming.
- The symmetric model of communication is captured in the MPI standard (MPI-1 and MPI-2).
- An MPI-like message-passing API specification is needed to enable the development of portable JavaGrande applications.

Early MPI-like Efforts - 1
- mpiJava - Modeled after the C++ binding for MPI. Implementation through JNI wrappers to native MPI software.
- JavaMPI - Automatic generation of wrappers to legacy MPI libraries. C-like implementation based on the JCI code generator.
- MPIJ - Pure Java implementation of MPI closely based on the C++ binding. A large subset of MPI is implemented using native marshaling of primitive Java types.

Early MPI-like Efforts - 2
- JMPI - MPI Soft Tech Inc. has announced a commercial effort under way to develop a message passing environment for Java.
- Others
- Existing ports - Linux, Solaris (both WS clusters and SMPs), AIX (both WS clusters and SP2), Windows NT clusters, Origin-2000, Fujitsu AP3000, and Hitachi SR2201.
- Java + MPI codes - a growing variety, including full applications.

MPJ API Specification
- Builds on the MPI-1 Specification and the Java Specification.
- Immediate standardization for common message passing programs in Java.
- Basis for conversion between C, C++, Fortran and Java.
- Eventually, support for aspects of MPI-2 as well as possible improvements to the Java language.

Multidimensional arrays
- In Java an “n-dimensional array” is equivalent to a one-dimensional array of (n - 1)-dimensional arrays.
- In MPJ, message buffers are always one-dimensional arrays, but the element type may be an object, which may itself have array type. Hence multidimensional arrays can appear as message buffers.
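
The array-of-arrays layout described above can be illustrated with a minimal Java sketch. The class and method names (ArrayOfArraysDemo, makeSquare, shareRow, flatten) are hypothetical and are not part of MPJ or any MPI binding. Copying the rows into a primitive one-dimensional buffer is only one assumed way of preparing data for a message buffer, shown here to make the memory layout concrete.

```java
public class ArrayOfArraysDemo {

    /** Builds an n x n array; each row is a separate one-dimensional array object. */
    public static double[][] makeSquare(int n) {
        double[][] a = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                a[i][j] = i * n + j;
            }
        }
        return a;
    }

    /**
     * Returns a new outer array whose row `row` is the very same object as a[row];
     * all other rows are independent copies.
     */
    public static double[][] shareRow(double[][] a, int row) {
        double[][] b = new double[a.length][];
        for (int i = 0; i < a.length; i++) {
            b[i] = (i == row) ? a[i] : a[i].clone();
        }
        return b;
    }

    /** Copies all rows, whatever their lengths, into one contiguous 1-D buffer. */
    public static double[] flatten(double[][] a) {
        int total = 0;
        for (double[] r : a) {
            total += r.length;
        }
        double[] buf = new double[total];
        int pos = 0;
        for (double[] r : a) {
            System.arraycopy(r, 0, buf, pos, r.length);
            pos += r.length;
        }
        return buf;
    }

    public static void main(String[] args) {
        double[][] a = makeSquare(4);
        double[][] b = shareRow(a, 1);

        b[1][2] = -1.0;               // row 1 is shared, so a[1][2] changes too
        b[0][0] = 99.0;               // row 0 was cloned, so a[0][0] is unchanged
        System.out.println(a[1][2]);  // prints -1.0
        System.out.println(a[0][0]);  // prints 0.0

        a[3] = new double[2];         // rows need not have the same length
        System.out.println(flatten(a).length);  // prints 14
    }
}
```

Because a row is an ordinary object, two outer arrays can reference the same row, and rows of one array can differ in length. A native library that expects a contiguous block therefore cannot take a Java double[][] as it stands.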
Java multidimensional arrays
[Figure: a 4 x 4 array A stored as an array of arrays; each of A[0] to A[3] references its own row A[i][0] to A[i][3]]

Java multidimensional arrays
[Figure: two arrays of arrays, A and B, drawn over the same set of rows]
- Java multidimensional arrays are not indivisible objects: they could have intra-array aliasing and "partial overlaps" with other arrays.

Naming Conventions
- All MPI classes belong to the package mpi.
- Conventions for capitalization, etc., in class and member names generally follow the recommendations of Sun's Java code conventions, consistent with the MPI C++ binding.

Error codes
- Unlike the C and Fortran interfaces, the Java interfaces to MPI calls will not return explicit error codes. Instead, the Java exception mechanism will be used to report errors.

Ping-Pong Timings
[Figure: execution time (sec), 1e-4 to 1e-2, versus message length (bytes), 1e+1 to 1e+6, for Java/LAM-MPI, C/LAM-MPI and C/IBM-MPI]

Part 4: Grid Systems and Environments

Relevant publications:
- V. Getov, G. von Laszewski, M. Philippsen, I. Foster. Multi-Paradigm Communications in Java for Grid Computing. Communications of the ACM, vol. 44(10), 118-125, Oct. 2001.
- V. Getov, M. Gerndt, A. Hoisie, A. Malony, B. Miller (Eds.) Performance Analysis and Grid Computing. Kluwer Academic Publishers, 2003.
- V. Getov, S. Newhouse, O. Rana, E. Sharakan. Developing Grid Services with Jini and JXTA. Proc. of ICCC 2004, 1402-1408, ICCC Press, 2004 (best paper award).
- V. Getov, A. Puliafito, O. Rana. Computational Grid and Web Services: Concepts, Functionalities, and Comparisons. Proc. of ICCC 2004, 10-15, ICCC Press, 2004.
Roadmap of Communication Frameworks
[Figure: roadmap of communication frameworks]

What is the Grid?
“A Grid provides an abstraction for resource sharing and collaboration across multiple administrative domains…”
(Source: NGG Expert Group, 16 June 2003, “European Grid Research 2005-2010”)

Benefits
- Increased productivity by reducing the total cost of ownership
- Any-type, anywhere, anytime services by/for all
- Infrastructure for dynamic virtual organisations
- Backbone for the next generation Internet services
- Industry & Business Grids
- e-Science

Java in Grid Computing
- Main motivation: the need to solve bigger problems with resource requirements beyond the current limits.
- Recent advances in computer communications make it possible to couple geographically distributed resources (Grid computing).
- In contrast with low-level approaches, Java can support a single object-oriented communication framework for Grande applications.

Example Application: An Advanced Scientific Instrument
[Figure labels: Virtual Reality Cave, Advanced Photon Source, Scientist Avatar, Supercomputer, Electronic Library and Databases, Computing Portal, Clients]

Lightweight Grid Platform
- New generic approach to designing the next generation Grid systems with dynamic properties: component-based design.
- To develop a lightweight Grid platform suitable for resource-limited devices, to support our design.
- To provide a design that will allow for the efficient integration of mobile devices into the Grid.
- To provide enhanced security, centralized management and monitoring, roaming, fault tolerance and a high level of autonomy in this mobile wireless environment.

Challenges in Mobile Grids
- Limited available resources.
- Increased power consumption sensitivity.
- Increased heterogeneity and software non-interoperability.
- Unpredictable long periods of complete disconnectivity.
- Unreliable, low-bandwidth and high-latency communication links.
- Very frequent, dynamic and unpredictable changes to the network layout.
Hybrid Environment: Virtual “Cluster” Approach
[Figure: hybrid environment with a virtual “cluster”]

Clustered Approach Benefits
- Single point of entry to the wireless cluster.
- Centralized cluster management and monitoring.
- Encapsulation of heterogeneity and dynamicity.
- Masking of internal failures and silent local recovery without affecting the regular Grid operation.

Ten Reasons to Use Java in High-Performance Computing
1. Language
2. Maintenance
3. Class Libraries
4. Performance
5. Components
6. Gadgets
7. Deployment
8. Industry
9. Portability
10. Education

Acknowledgements
- Bryan Carpenter (Uni Syracuse)
- Susan Flynn-Hummel (IBM - T.J. Watson)
- Gregor von Laszewski (Argonne NL)
- Sava Mintchev (Uni Westminster)
- Jose Moreira (IBM - T.J. Watson)
- Michael Philippsen (Uni Karlsruhe)
- Antonio Puliafito (Uni Messina)
- Omer Rana (Uni Cardiff)
- Eric Sharakan (Sun Microsystems)
- Mary Thomas (San Diego Supercomputer Center)
- Experiments - CTC, IBM - T.J. Watson, SDSC, Southampton and Westminster Universities