G-JavaMPI: A Grid Middleware for Distributed Java Computing with MPI Binding and Process Migration Supports

Lin Chen, Cho-Li Wang, Francis C. M. Lau and Ricky K. K. Ma
Department of Computer Science and Information Systems
The University of Hong Kong
{lchen2+clwang+fcmlau+kk1ma}@csis.hku.hk

GCC2002 Presentation, Lin Chen, CSIS, HKU (Dec. 26, 2002)

Outline
* Motivation
* Overall system architecture
* Detailed issues
* Related work
* Conclusion and future work

Motivation
* Grid computing: large-scale resource sharing and high performance.
* Globus Project: the basic services required for building and using a Grid (authentication, security, resource allocation, remote data access, information services, etc.).
* However:
  * Long-running applications require continuous computation.
  * Better utilization of resources requires scheduling and load balancing.
* Java process migration: architecture-independent bytecode makes migration easier.

Motivation (continued)
* Let the programmer write a grid application easily, without having to care about the difference between inter-site and intra-site communication (which must be handled explicitly when using the Globus communication libraries directly).
* SPMD: one program can be executed at multiple places or sites.
* MPI paradigm:
  * A group of distributed processes that can perform peer-to-peer or collective communication.
  * Communication source and destination addresses are independent of the real physical network address (adaptable).

System Overview
[Figure: grid sites with Gatekeepers and local schedulers (LS), showing the Java-MPI communication routes (1), (2), (3) and the migration step (*).]
* (*) Migrating: a new process is restarted through a Globus remote job request with delegated user credentials and Java-MPI job credentials.
* Some legacy messages are redirected during migration.
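The two ideas above, rank-style addressing that is independent of the physical network address and redirection of legacy messages during migration, can be illustrated with a small sketch. This is not the G-JavaMPI API: the class `RankRouter`, its method names, and the "host:port" endpoint strings are hypothetical. Holding messages in a queue until the new endpoint is known is an assumption made for illustration; the slides only state that legacy messages are redirected.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the G-JavaMPI API.
class RankRouter {
    // Logical MPI-style rank -> current physical endpoint (illustrative "host:port" strings).
    private final Map<Integer, String> endpointOf = new HashMap<>();
    // Ranks currently migrating -> messages held until the new endpoint is known (assumption).
    private final Map<Integer, List<String>> held = new HashMap<>();
    // Log of "endpoint <- payload" deliveries, standing in for real network sends.
    final List<String> deliveries = new ArrayList<>();

    void register(int rank, String endpoint) {
        endpointOf.put(rank, endpoint);
    }

    // Senders only ever name a rank; the router resolves the physical endpoint.
    void send(int destRank, String payload) {
        List<String> queue = held.get(destRank);
        if (queue != null) {
            queue.add(payload); // destination is migrating: hold the message
            return;
        }
        String endpoint = endpointOf.get(destRank);
        if (endpoint == null) {
            throw new IllegalArgumentException("unknown rank " + destRank);
        }
        deliveries.add(endpoint + " <- " + payload);
    }

    void beginMigration(int rank) {
        held.putIfAbsent(rank, new ArrayList<>());
    }

    // Record the new endpoint, then redirect every held message to it.
    void completeMigration(int rank, String newEndpoint) {
        endpointOf.put(rank, newEndpoint);
        List<String> queue = held.remove(rank);
        if (queue != null) {
            for (String payload : queue) {
                deliveries.add(newEndpoint + " <- " + payload);
            }
        }
    }

    public static void main(String[] args) {
        RankRouter router = new RankRouter();
        router.register(1, "siteA-node3:5000");
        router.send(1, "m1");
        router.beginMigration(1);
        router.send(1, "m2"); // held while rank 1 migrates
        router.completeMigration(1, "siteB-node7:5000");
        router.send(1, "m3");
        router.deliveries.forEach(System.out::println);
    }
}
```

The sender's code is the same before and after the migration: it names rank 1 in every call, and only the router's table changes.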

System overview (legend)
* Globus Toolkit libraries
* LS: local schedulers
* Java MPI communication daemons
* Java-MPI processes, including the migrating process before and after migration
* M: migration modules; a migration module resides in each JVM
* (1) - (2) - (3): MPI communication route before migration
* (1*) - (2*) - (3*): MPI communication route after migration
* (*): Java MPI communication daemons redirect legacy messages that should go to the migrated process
The figure also labels the WAN, the Gatekeeper and LS at a site, and the JVM.

Layered design
* Java-MPI applications
* Java-MPI API Layer: Java-MPI API and Java API
* Migration Layer: JVM, execution state probe and migration plug-in, JVMDI
* Restorable MPI Comm Layer: restorable communication services (migration instructions, message queues, authentication info update)
* Load Balancing Module: control block, DLB policy
* MPICH-G2
* Globus services
* OS
* Hardware

Java-MPI binding
* Restorable communication layer.
* Daemon: a running MPICH-G2 process that provides MPI communication services.
* The daemon communicates with the Java-MPI process through IPC.
* Post-migration message redirection.
[Figure labels: process space, restorable communication.]

Java Process Migration
* State capturing: a probe attached to each JVM saves the process context through JVMDI (JVM Debug Interface).
  * All runtime data: PC register, stack frames, objects, method area (local variables), etc.
  * Event notification: method_entry, frame_pop, etc.
* Object serialization is used to package all reachable objects in the heap.
* The new JDK 1.4.0 and 1.4.1 (released in Aug. 2002) support "full-speed debugging".
[Figure: the probe is attached to the JVM through JVMDI and receives execution state data and event notifications.]
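The object-packaging step can be illustrated with standard Java object serialization, which writes an object together with everything reachable from it. The sketch below uses only `java.io` classes that exist in the JDK. The class `HeapSnapshot`, the `Node` type, and the method names are hypothetical, and the sketch does not show how G-JavaMPI finds the root objects or captures stack frames through JVMDI.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical sketch of packaging reachable objects; not the G-JavaMPI probe.
class HeapSnapshot {
    // Illustrative application object; the "next" link makes further objects reachable.
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final int value;
        Node next;

        Node(int value) {
            this.value = value;
        }
    }

    // Serialize the root and everything reachable from it into a byte array.
    static byte[] capture(Serializable root) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        }
        return bytes.toByteArray();
    }

    // Rebuild the object graph from the bytes, as a newly started JVM would.
    static Object restore(byte[] snapshot) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(snapshot))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Node head = new Node(1);
        head.next = new Node(2);
        head.next.next = head; // a cycle: shared references survive the round trip
        Node copy = (Node) restore(capture(head));
        System.out.println(copy.value + " -> " + copy.next.value);
        System.out.println(copy.next.next == copy);
    }
}
```

Serializing from a single root is enough here because `writeObject` follows references transitively and records each object once, so cycles and shared references are rebuilt rather than duplicated.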

Java process migration (continued)
* State restoration:
  * An exception handler is inserted into the bytecode (pre-processing before execution) to restore local variables and "jump" to the original execution point.
  * Objects are re-allocated when the JVM is restarted.
  * Dynamic class loading.

Information update
Participants in the diagram: migration source site, other sites, migration destination site.
* Migration begins.
* Other sites (including the destination site) are notified.
* The process arrives at the safe migration point (all legacy messages consumed).
* The local site's record of the process's new location is updated.
* Process state capture begins.

Process Restart
Participants in the diagram: original process, newly started process.
* JVM initialization; the probe is started at the same time.
* The process is suspended at the beginning, and the probe reads the context from the dump file.
* The execution context is restored.
* A new user certificate proxy (proxy_init_cred) is created and delegated to the remote site.
* The resource allocation is obtained.
* The process is resumed and continues from the last point.
* The new process can be started (similar to a normal Globus job submission).

Experiment Results
Hardware:
* 32-node cluster "ostrich", configured as two grid points of 16 nodes
* 733 MHz Pentium III processor
* 392 MB of memory
* Connected by a 24-port Fast Ethernet switch
Software:
* Linux 2.2.14
* Globus 2.0
* Sun JDK 1.4.0_02 (supporting JVMDI with full-speed debugging mode)
* MPICH 1.2.4 (MPICH-G2)

Experiment results: Bandwidth
[Figure: bandwidth (Kbyte/s, axis 0 to 5000) against message size (8 to 2048 bytes) for intra-site and inter-site communication.]
Bandwidth comparison between inter-site and intra-site communication with the MPI communication layer installed.

Experiment results: Latency
[Figure: latency (s, axis 0 to 0.6) against message size (1 to 2048 bytes) for inter-site and intra-site communication.]
Latency comparison for small messages between intra-site and inter-site communication with the MPI communication layer installed.

Experiment results: Time for capturing and restoring objects
[Figure: time (microseconds, axis 0 to 3000) against object size (1 byte to 10M bytes) for capturing objects and restoring objects.]
Time spent in capturing and restoring objects.

Experiment results: Time for capturing and restoring Java frames
[Figure: time (seconds, axis 0 to 6) against number of frames (1 to 300) for capturing frames and restoring frames.]
Time spent in capturing and restoring frames.

Related Works
* Java bindings for MPI: "mpiJava", "JavaMPI", "MPIJ", etc.
* Java process or thread migration:
  * Add additional backup code in programs [Aglets [IBM96]].
  * Insert backup statements in the source or bytecode; a backup object is used to store state [Wasp project [Funfrocken98]].
  * Extend the JVM, make state accessible from Java programs, and support type recognition of the Java stack [Sara Bouchenak 2000].
  * Use JVMDI to capture state, and insert bytecode instructions in the program body to help restoring [Torsten2001].
  * JESSICA (supports thread migration in the JVM).

Conclusion
* A new middleware for the Grid with Java-MPI communication and transparent process migration features.
* Programmers can write MPI-style programs in the Java language.
* The Java process migration mechanism supports the development of any dynamic load balancing policy or fault tolerance mechanism.

Future Plan
* Develop some scientific and engineering applications on top of this middleware.
* Support of the transfer of other I/O (including file stage-in/out).
* Load balancing algorithm for the grid environment (both CPU and network load).

The End
Thanks!