Java as a cycle-stealing computational environment
Al Globus, MRJ Technology Solutions, Inc. at NASA Ames Research Center

Acknowledgments
• Ames: Po Chung, Creon Levit, Subash Saini
• UCSC: John Lawton, Rich McClellan, Todd Wipke

http://science.nas.nasa.gov/~globus/code/geneticGraphs/

What got done
Java implementation of genetic molecular design application
• No input (Parameters.java)
• No graphics or GUI
• Computationally intensive -- O(n^3)
• Many output files
Used cycle scavenging batch system (Condor) on NAS workstations
Approximately 150 runs so far

Genetic molecular design
Randomly generate a set of molecules
Many times:
• Select parent molecules at random with bias towards better performance
• Randomly rip copies of each parent in two
• Mate opposite halves
• Replace random molecules with bias towards worse performance
Repeat until satisfied

Algorithm properties
Stochastic, embarrassingly parallel
Robust to failure
No guaranteed outcome
Fitness function is crucial and non-trivial
Performs well as cycle-scavenger using Condor, University of Wisconsin, http://www.cs.wisc.edu/condor

Crossover

Time to find small molecules

Finding larger molecules

Resources
300-400 NAS workstations are idle nights and weekends
Provides an ideal resource if owner access is not degraded
University of Wisconsin experience suggests each workstation is idle an average of 17 hours/day. Thus NAS should have approximately 6,800 workstation hours/day or 2,448,000 workstation hours/year available.
This extra processing costs $0 for hardware

Condor
Cycle-scavenging batch system
Developed by University of Wisconsin
In production since 1986
Unix workstations (NT port in progress)
Free from http://www.cs.wisc.edu/condor

Some Classes
Graph, vertex, edge
Molecule, atom, bond
Breeder, Population
FitnessFunction
Parameters, Reporter
Sample, DataTable
Predicate, Procedure, ExtendedVector
IntegerInterval, DoubleInterval

Java advantages
Cleaner code than C++
Dynamic loading eliminated input file parsing. CLASSPATH:
• Parameters.class directory
• classes.jar
• Standard library
Garbage collection priceless
• massive data structure manipulation
• cyclic data structures make reference counting ineffective

Java advantages continued
Serialization eases checkpointing
Virtual machine eases cross-platform development
• Development on WinTel
• Execution on SGI workstations
Standard library (especially Vectors)
Automatic HTML documentation integrated with code
Reflection enables automation of Parameters.java toString()
Automatic bounds checking

Java disadvantages
Java with JIT 50% slower than C on simple numerical code
Symantec Visual Café: unacceptably buggy debugger; crashed the system very hard
Supersede OK, with a few bugs
Lack of multiple inheritance sometimes irritating
Condor cannot perform automatic checkpointing
• Condor requires relink to automate checkpointing

Checkpointing
Condor jobs may be stopped at any arbitrary time
Virtual machine checkpointing would allow automatic heterogeneous mobility
• stack format not defined
• heap format undefined and universal serialization potentially problematic
• JIT and optimization cause problems
• Must hack Java Virtual Machine
Java threads cannot be truly interrupted
• suspend(), resume(), stop() deprecated

Checkpointer.java
class application implements Checkpointable
• start(String[] arguments);
• restart();
Application calls
• Checkpointer.ok()
• Checkpointer.checkpoint()
Condor calls
• Checkpointer.prepareToDie()
• Checkpointer.areYouReadyToDie()
• Checkpointer.cancelDeath()
• Checkpointer.checkpointWhenPossible()

Summary
It was fun
It was productive
• 75 classes, 6389 lines of code
• one unique algorithm
• two University collaborations
• two conference presentations
• one conference poster
• one journal submission
• 400-500 CPU hours per day added to NAS batch capability (so far)
Java is my favorite programming language
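
Sketch: the genetic loop
The loop on the "Genetic molecular design" slide can be illustrated with a minimal sketch. It is not the project's code. As assumptions for illustration, "molecules" are fixed-length bit strings rather than molecular graphs, fitness is the number of set bits, the selection bias is a two-way tournament, and both parents are cut at the same point. The class and method names (GeneticSketch, fitness, pick, run) are hypothetical.

```java
import java.util.Random;

// Hypothetical stand-in for the real application: bit strings, not molecular graphs.
class GeneticSketch {

    // Assumed fitness function: count of true bits (higher is better).
    static int fitness(boolean[] m) {
        int ones = 0;
        for (boolean b : m) if (b) ones++;
        return ones;
    }

    // Random choice with bias: draw two indices and return the better one
    // (preferBetter = true) or the worse one (preferBetter = false).
    static int pick(boolean[][] pop, Random rng, boolean preferBetter) {
        int a = rng.nextInt(pop.length);
        int b = rng.nextInt(pop.length);
        boolean aBetter = fitness(pop[a]) >= fitness(pop[b]);
        return (aBetter == preferBetter) ? a : b;
    }

    // Runs the loop for a fixed number of steps and returns the best fitness found
    // in the final population.
    static int run(int popSize, int length, int steps, long seed) {
        Random rng = new Random(seed);

        // Randomly generate a set of "molecules".
        boolean[][] pop = new boolean[popSize][length];
        for (boolean[] m : pop)
            for (int i = 0; i < length; i++) m[i] = rng.nextBoolean();

        for (int s = 0; s < steps; s++) {
            // Select parents at random with bias towards better performance.
            boolean[] p1 = pop[pick(pop, rng, true)];
            boolean[] p2 = pop[pick(pop, rng, true)];

            // Rip copies of each parent in two and mate opposite halves.
            int cut = rng.nextInt(length + 1);
            boolean[] c1 = p1.clone();
            boolean[] c2 = p2.clone();
            System.arraycopy(p2, cut, c1, cut, length - cut); // head of p1 + tail of p2
            System.arraycopy(p1, cut, c2, cut, length - cut); // head of p2 + tail of p1

            // Replace random members with bias towards worse performance.
            pop[pick(pop, rng, false)] = c1;
            pop[pick(pop, rng, false)] = c2;
        }

        int best = 0;
        for (boolean[] m : pop) best = Math.max(best, fitness(m));
        return best;
    }

    public static void main(String[] args) {
        System.out.println("best fitness: " + run(20, 16, 200, 42L));
    }
}
```

Each run depends only on its seed and parameters, which is what makes independent runs easy to farm out to idle workstations.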
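
Sketch: reflection-driven toString()
The "Java advantages continued" slide notes that reflection automates toString() for Parameters.java. A minimal sketch of that idea follows. The fields shown (populationSize, generations, randomSeed) are invented for illustration, since the slides do not list the real parameters, and the class name ParametersSketch is hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical parameters class; the real Parameters.java fields are not shown in the slides.
class ParametersSketch {
    public int populationSize = 100;
    public int generations = 1000;
    public long randomSeed = 42L;

    // Lists every instance field by name and value, so adding a new parameter
    // needs no change here. Field order is whatever the JVM reports.
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(getClass().getSimpleName()).append(" {");
        for (Field f : getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
            try {
                sb.append(' ').append(f.getName()).append('=').append(f.get(this)).append(';');
            } catch (IllegalAccessException e) {
                sb.append(' ').append(f.getName()).append("=<inaccessible>;");
            }
        }
        return sb.append(" }").toString();
    }

    public static void main(String[] args) {
        System.out.println(new ParametersSketch());
    }
}
```

Because the parameters live in a compiled class found on the CLASSPATH, a printout like this can record the exact settings of each run alongside its output files.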
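
Sketch: cooperative checkpointing with serialization
The "Checkpointer.java" slide lists method names but not their behavior. The sketch below is one plausible reading, not the author's implementation: the batch system raises a flag, the application polls it between units of work, and the application saves its own state with standard Java serialization. The class name CheckpointerSketch, the restore() helper, and the arguments to checkpoint() are assumptions, and checkpointWhenPossible() is omitted.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical reading of the Checkpointer protocol; semantics are assumed.
class CheckpointerSketch {
    private static final AtomicBoolean dying = new AtomicBoolean(false);
    private static final AtomicBoolean saved = new AtomicBoolean(false);

    // Batch-system side (assumed): warn the job, ask if it is safe to kill, or call it off.
    static void prepareToDie() { saved.set(false); dying.set(true); }
    static boolean areYouReadyToDie() { return dying.get() && saved.get(); }
    static void cancelDeath() { dying.set(false); saved.set(false); }

    // Application side: poll between units of work; false means "save and stop".
    static boolean ok() { return !dying.get(); }

    // Application side: write the state with standard Java serialization.
    static void checkpoint(Serializable state, File file) {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(state);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        saved.set(true);
    }

    // Read a previously saved state back, for use on restart.
    static Object restore(File file) {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        File file = new File(System.getProperty("java.io.tmpdir"), "checkpoint-sketch.ser");
        int[] state = {0};                        // the whole application state: one counter
        while (ok() && state[0] < 1000) {
            state[0]++;                           // one unit of work
            if (state[0] == 500) prepareToDie();  // simulate the batch system's warning
        }
        if (!ok()) checkpoint(state, file);
        int[] resumed = (int[]) restore(file);
        System.out.println("ready to die: " + areYouReadyToDie() + ", resumed at " + resumed[0]);
    }
}
```

Polling avoids the deprecated suspend(), resume(), and stop() calls: the application only stops at points where its own state is consistent.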