JavaGenes and Condor: Cycle-Scavenging Genetic Algorithms

0 downloads 0 Views 589KB Size Report
Sterling Software, Inc. at NASA Ames. Research ... JavaGenes jobs have been run. Most jobs ... jobs with the same input parameters, each job differing only in.
JavaGenes and Condor: Cycle-Scavenging Genetic Algorithms AI Globus

Eric Langhirt

MRJ Technology Solutions, Inc. at Sterling Software, Inc. at NASA Ames NASA Ames Research Center Research Center MS T27A MS 258-6 Moffett Field, CA 94035 Moffett Field, CA 94035 (650) 604-4404 (650) 604-1843

globus @ nas.nasa.gov

langhirt@ nas.nasa.gov

Miron Livny Computer Sciences Department University of Wisconsin 1210 West Dayton St. Madison, Wl 53706 (608) 262-0856

[email protected]

Ravishankar Ramamurthy

Marvin Solomon

Steve Traugott

Computer Sciences Department University of Wisconsin

Computer Sciences Department University of Wisconsin

1210 West Dayton St. Madison, Wl 53706 (608) 262-2252

1210 West Dayton St. Madison, Wl 53706 (608) 262-1204

Sterling Software Inc., at NASA Ames Research Center MS 258-6 Moffett Field, CA 94035 (650) 604-4324

ravi @cs.wisc.edu

solomon @ cs.wisc.edu

stevegt @ nas.nasa.gov

words, JavaGenes must do its own checkpointing and I/O redirection. A prototype Java-aware version of Condor was developed using standard Java serialization for checkpointing. For the prototype to be useful, standard Java serialization must be significantly optimized. JavaGenes is approximately 8700' lines of code and a few thousand JavaGenes jobs have been run. Most jobs ran for a few days. Results include proof that genetic algorithms can evolve directed and undirected graphs, development of a novel crossover operator for graphs, a paper in the journal Nanotechnology [3], and another paper in preparation.

ABSTRACT A genetic algorithm code, JavaGenes, was written in Java and used to evolve pharmaceutical drug molecules and digital circuits. JavaGenes was run under the Condor cyclescavenging batch system managing 100-170 desktop, deskside, and rack-mounted SGI workstations. Genetic algorithms mimic biological evolution by evolving solutions to problems using crossover and mutation. While most genetic algorithms evolve strings or trees, JavaGenes evolves graphs representing (currently) molecules and circuits. Java was chosen as the implementation language because the genetic algorithm requires random splitting and recombining of graphs, a complex data structure manipulation with ample opportunities for memory leaks, loose pointers, out-of-bound indices, and other hard to find bugs. Java garbage-collection memory management, lack of pointer arithmetic, and arraybounds index checking reduces the frequency of these bugs, substantially reducing development time. While a run-time performance penalty must be paid, the only unacceptable performance we encountered was using standard Java serialization to checkpoint and restart the code. This was fixed by a two-day implementation of custom checkpointing. JavaGenes is minimally integrated with Condor; in other

Keywords Java, Condor, genetic algorithms.

1. INTRODUCTION This paper is a case study of running genetic algorithms written in Java under Condor. To understand the experience and results, it is necessary to have some understanding of b o t h genetic algorithms and Condor. Since this paper is written for the Java Grande 2000 conference, we expect readers to understand the Java programming language and run-time environment.

1.1 Genetic Algorithms Genetic algorithms seek to mimic natural evolution's ability to produce highly functional objects. Natural evolution produces organisms. Genetic algorithms produce sets o f parameters, programs, molecular designs, and many other structures. Genetic algorithms usually solve problems by:

Permission to make digital or hard copies of all or part of this work Ibr personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the lull citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Java 2000 San Francisco CA USA Copyright ACM 2000 1-58113-288-3/00/6...$5.00

134



Randomly generating potential solutions.

a

population

of

individual



For each new generation, repeatedly selecting parent individuals at random with a bias towards better i n d i v i d u a l s

and applying transmission operators to produce children. Transmission operators include: 1.

2. 3. •

we need a good similarity measure that can score any molecule. [2] defined such a similarity measure, all-atom-pairs-shortestpath, and searched a large database for molecules similar to diazepam. We use a closely related similarity technique to evolve a population of molecules towards a target drug molecule.

Crossover: each of two parents is divided into two parts and one part from each parent is combined into a child. Mutation: a single "parent" modified to generate a child.

is

randomly

For an excellent review of genetic algorithms and related techniques as of Spring 1997, see [1].

Reproduction: a single "parent" is copied into the new generation.

Continuing until an acceptable solution exhaustion sets in.

1.2 Condor Genetic algorithms have the fortunate property of b e i n g embarrassingly parallel, because fitness function evaluation is usually the most time-consuming step in the genetic algorithm, and there is no dependency between fitness function evaluations. Not only can many fitness function evaluations be conducted in parallel, but since genetic algorithms are statistical, it is usually necessary to make many runs to support a hypothesis. In our work, we usually run 31 jobs with the same input parameters, each job differing only in the random number seed. This provides completely trivial 3 1 way parallelism. Furthermore, we are usually running several experiments at the same time. Thus, it is not uncommon for our project to run 100-200 jobs simultaneously. It's also quite easy to implement parallel fitness function evaluation within a single job, although we haven't found that to be necessary yet.

is found or

A key issue is what constitutes "better individuals." This is determined by a "fitness function." The fitness function takes individuals (strings, trees, or graphs) as input and returns a number representing the fitness of that individual. Genetic algorithms differ in their representation of solutions. Bit string representations were used in the first genetic algorithms [5], but arrays of floating point numbers, special symbols that generate circuits [9], robot commands [13], and many other symbols may be found in the literature. Strings may be of fixed or variable length. Trees can also be evolved [6]. This is usually called genetic programming, because trees are particularly useful for representing computer programs. Many molecules contain cycles, which chemists call rings, and strings and trees don't contain cycles. Therefore, we took the unusual approach of evolving graphs. Graphs are a set o f vertices (for example, atoms) and a set of edges (for example, bonds), each of which connects two vertices. In this paper, the term graph does not refer to a two dimensional image used for data presentation.

Embarrassingly parallel programs are a natural match for Condor [7]. Condor is a software system that creates a High Throughput Computing [8] environment by effectively harnessing the power of a cluster of UNIX workstations on a network. Although Condor can manage a dedicated cluster of workstations, a key appeal of Condor is its ability to effectively harness non-dedicated, preexisting resources in a distributed ownership setting such as machines sitting o n people's desks in offices and labs. We ran JavaGenes on the NAS Condor PO01. NAS is the primary NASA supercomputer center [10]. Approximately 200 workstations, purchased and used for software development, visualization, email, document preparation, etc., are available for batch processing during idle times. The Condor daemons watch these 200 workstations. When a workstation has been idle for 2 hours, a job from the batch queue is assigned to the workstation and will run u n t i l the workstation detects a keystroke, mouse motion, or relatively high non-Condor CPU usage. At that point, the j o b will be removed from the workstation and placed back on the batch queue. As mentioned before, it's not uncommon to have a few hundred JavaGenes jobs in the queue.

Evolving graphs involves randomly splitting arbitrary graphs into two fragments and then combining fragments. Splitting a complex graph, such as morphine involves data structure manipulations that can easily result in various bugs. Java was chosen as the implementation language to minimize these problems. Morphine

0

Because a JavaGenes job running under Condor may be vacated (removed from the workstation) at any time, the j o b must save state (checkpoint) periodically. Condor provides a generic checkpoint/restart facility, but for reasons discussed below, we could not use this facility for JavaGenes. Checkpointing was initially implemented using standard Java serialization. Conceptually, this is relatively simple since the state of a genetic algorithm is simply the current population. However, standard Java serialization turned out to be a serious, but fixable, performance problem, as others have [12]. See the section below on serialization oerformance.

O

O JavaGenes has been used for pharmaceutical drug and digital circuit design. One approach to drug design is to find molecules similar to good drugs. Ideally, a candidate replacement drug is sufficiently similar to have the same beneficial effect, but is different enough to avoid negative side effects. To use JavaGenes for similarity-based drug discovery

2. APPROACH JavaGenes was written in 100% pure Java, version 1.1. There were approximately 8670 lines of source, not including the graph layout code (Jiggle) provided by Daniel [11]. The graph layout code was used to arrange graph vertices in three

135

dimensions for viewing. We will now discuss some of the objects implemented in JavaGenes.. 2.1 Objects The nouns used to describe genetic algorithms all became classes, including the following: • • •

Population -- an array of Individuals. Individual -- an Evolvable and its fitness. Evolvable (nonstandard terminology) -- a data structure capable of being evolved by ChildMaker objects. Currently, only Graph plus subclasses Molecule and DigitalLogicGraph are implemented, but plans for arrays and trees exist.



FitnessFunction - these objects have a 'double evaluate(Evolvable)' method that implements the desired fitness function. General purpose FitnessFunction subclasses included weighted sum (of other FitnessFunctions) and MultiplyBy.



Breeder (nonstandard terminology) -- a class with a 'Population breed(Population)' method that evolves one population into another.



ChildMaker (nonstandard terminology) - - t h e s e objects tell a Breeder how many parents they want (two for crossover, one for mutation), then take an array of parents and produce an array of children.

Another set of classes are responsible manipulating, and managing graphs: subclasses

creating,



Graph, along DigitalLogicGraph.



Vertex, along with subclasses Atom, Digitallnput, DigitalOutput, and DigitalDevice (or, and, xor, etc.).



with

for

Molecule

and

Edge, along with subclasses Bond and DigitalWire.



BrokenGraph -- responsible for a graph fragment after crossover splits a graph.



BrokenEdge -- responsible for an edge broken d u r i n g splitting.



VertexProvider and EdgeProvider -- these classes are used during graph generation and mutation to provide random vertices and edges of various types.

In addition, there are several convenience classes: •



Parameters, along with subclasses MoleculeParameters and DigitalLogicParameters. These objects hold all the values that are typically varied from job to job; for example: population size, maximum number of generations, fitness function, etc. Most of the JavaGenes class files are kept in a jar file, but the parameter files are compiled for each set o f jobs and placed earlier in the CLASSPATH. This provides a flexible mechanism (taking advantage of Java dynamic loading) to set input parameters without writing an i n p u t file parser. InputTokenizer and OutputTokenizer -- these are used to save and restore state. They can read and write integers, doubles, etc.

2.2 Free Code A certain amount of Java code is available for free on the Web. We took advantage of this in two cases. First, the Student TTest code used in the statistics class was supplied by NWP

136

Associates, Inc. This was a minor, but helpful, convenience. Second, and more important, one must examine the evolved graphs to understand the results. To examine a graph, it must be laid out in two or three dimensions for viewing. In other words, xyz locations for each vertex must be chosen. The graph layout problem is non-trivial. In fact, it is very difficult. Fortunately, Daniel Tunkelang made his Jiggle Java code [11] available, and Jiggle has done an excellent job of laying o u t graphs evolved by JavaGenes. Integration of the two packages was quick and easy. Only one bug was found in Jiggle (1615 lines of source). That bug was an infinite loop, which was found and fixed in a little over an hour. 2.3 Development Environment JavaGenes was developed on a Compaq laptop r u n n i n g Windows 95. Windows was used because the main developer cannot type for significant lengths of time and uses a voice recognition system. The first development environment was Visual Cafe. This was abandoned because the debugger was quite buggy. The second development environment was Supercede. Various problems required periodic re-installation, which in turn caused serious problems with Windows 95. Finally, Borland JBuilder was tried and there have been relatively few problems. CodeWarrior was used occasionally for specific debugging problems. Although some compilers are a bit pickier than others, no significant problems were encountered moving the code from one development environment to another; or moving the source or class files to the SGI version of the JDK to run experiments. 2.4 Condor Support for Java Condor runs each job in an environment called a "universe." The two most important universes are called "Standard" and "Vanilla." Standard jobs are programs that have been linked with a special Condor version of the C runtime library that mimics the effects of most UNIX system calls and adds two kinds of enhanced functionality: remote system calls and checkpointing. Remote system calls provide a uniform environment to a j o b running on any workstation on a network. The Condor runtime library replaces system calls with remote procedure calls to a shadow process running on the workstation that submitted the job. The shadow makes the system call on behalf of the remote process and returns the results. For example, the o p e n system call sends the name of the desired file to the shadow, which searches for it on the submitter's home workstation. Subsequent r e a d and w r i t e calls access that file over the network. The result is that the job sees the same file-system environment regardless of where it runs, and all file output is captured in files on the submitting workstation. Checkpointing is also implemented by the Condor runtime library. Each Condor job is run under the control of a starter process. When a workstation needs to be appropriated for another purpose, the starter sends a terminate signal to the application process. The Condor library catches this signal and sends a complete dump of the state of the process back to the submitting machine, where it awaits its turn to be restarted on another worker machine. This checkpoint file includes a binary dump of the entire virtual memory image of the process. It also includes a record (collected by the remote system calls) of the current state of the process' interaction with the operating system kernel. For example, for each file opened b y the job, the checkpoint file records the name of the file and the

current offset within the file (the "seek pointer"). Condor can also be instructed to send a "checkpoint" signal to the starter at periodic intervals. The starter responds to this signal b y suspending the application, checkpointing it, and then allowing it to continue. Under some circumstances, Condor may send other signals to the starter, asking it to suspend the application, resume it, or kill it without giving it a chance to checkpoint. Neither remote system calls nor checkpointing require any source-level modifications to the application program, b u t they do require it to be re-linked with a special version of the system libraries. They also impose some restrictions on the set of system services available to the job. In particular, Condor does not currently support Standard jobs that use kernel-level threads. Programs that cannot be re-linked or that do not meet these requirements must be run in the Vanilla universe. An arbitrary executable program can be run as a Vanilla job, b u t any I/O operations will access the file system of whatever machine the job happens to be running on, and if the workstation is pre-empted (for example, by an interactive user), the job is simply killed and restarted from the b e g i n n i n g on another workstation. At NAS, in general, Condor jobs do not have permission to use the local disk on the worker workstation. Most of the JavaGenes runs described in this paper used the Vanilla universe. The Java Virtual Machine (JVM) from SGI and Sun was available to us only in binary (pre-linked) form. We tried the Kaffe open-source JVM, but found that it had bugs that prevented us from running JavaGenes correctly for more than a few generations. Moreover, both JVM's use certain facilities - - n o t a b l y kernel-level threads -- that are not supported in Condor Standard jobs. Fortunately, the remote system call facility was not necessary in our environment, since we use the Network File System (NFS), which provides a uniform interface to files from all workstations. In this environment, the "job" submitted to Condor is a tel script that calls the JVM, with the name of the application class supplied as a command-line argument. From the point or" view of the JVM, the class files that constitute the application are simply data files that appear to be on the local disk of the worker machine through the magic of NFS. Similarly, NFS is used to create output files on the submitting workstation. The lack of automatic checkpointing was a more serious problem. As mentioned earlier, we tried two different application-specific checkpointing strategies. Both i n v o l v e d periodically invoking a checkpoint method that saves the state of the computation into a file. JavaGenes invokes this method after each new child is created. If a workstation is preempted before the program finishes, it is killed and restarted "from the beginning." However, at startup, JavaGenes jobs look for a checkpoint file (via NFS) so they can initialize state and continue from the last checkpoint. Thus, a job that is killed and restarted loses only the work it did between its most recent checkpoint and the time it was killed. To provide better support for JavaGenes and other Java Grande applications, the Condor project has been developing a Java universe. To run under this universe, a Java program must implement the Checkpointable interface: public interface Checkpointable extends Serializable { void start (String[] arguments) ; void restart () ;

137

void beforeCheckpoint () ; void afterCheckpoint () ; void setCheckpointer(Checkpointer c); } Each universe has its own kind of starter. The starter for the Java universe . is written in Java and extends java. lang. ClassLoader. A "job" in the Java universe is a class file. We assume that each worker machine has a JVM installed locally, but otherwise do not require any network file system or uniform file environment. The Java starter loads all required classes over the network by communicating with a Java version of the shadow using the Java RMI (Remote Method Invocation) interface. The Java starter creates an instance p r o g of the application program class and calls either p r o g . s t a r t ( ) o r p r o g . r e s t a r t ( ) , d e p e n d i n g on whether there is an existing checkpoint file from an earlier run of this job. The starter also creates an instance c p of a Checkpointer object, passing p r o g to its constructor, and calling p r o g . s e t C h e c k p o i n t e r ( c p ) so that p r o g and cp can refer to each other. P r o g can also access cp through static methods in Checkpointer as long as there is only one instance of Checkpointer. When the starter receives a terminate or checkpoint signal, it calls cp. c h e c k p o i n t W h e n P o s s i b l e (), which sets a flag indicating that a checkpoint has been requested. It does not force an immediate checkpoint because the application object may not be in a "quiescent" state in which checkpointing is convenient. The application itself is expected to call c p . o k () periodically, If no checkpoint has been requested, this method simply returns without doing anything. However, if a checkpoint request is pending, the checkpointer uses Java serialization to save the state of the object by calling w r i t e O b j e c t (prog). It also calls prog. beforeCheckpoint () before the checkpoint and p r o g . a f t e r C h e c k p o i n t ( ) after. The w r i t e O b j e c t method saves all of the non-transient fields of the object, as well as all objects pointed to by those fields, all fields of those objects, etc. It does not, however, save anything from the runtime stack -- that is, the values of local variables. It is the responsibility of the application to update the non-transient fields of the application object to reflect the complete state of the application either before calling c p . o k or in the method b e f o r e C h e c k p o i n t . If these updates are costly, they should be in b e f o r e C h e c k p o i n t because this method is only called if the checkpoint is actually performed.The b e f o r e C h e c k p o i n t and afterCheckpoint methods also provide hooks for application-specific performance monitoring, such as determining the amount of time spent checkpointing The Java universe provides several advantages Vanilla universe for Java applications:

over the



It supports remote system calls, so that it does not depend on the availability of a network file system (and uniform conventions for mount points).



It automates more of checkpointing and recovery.



It allows Condor to decide when to checkpoint a job, only requiring the application code to indicate when a checkpoint is safe.

the

tasks

necessary

for



code where analysis can prove that saving and restoring will not cause problems.

It allows a job to migrate among hardware platforms during its lifetime. A job in the Java universe is comprised of class files and a checkpoint file, all of which are platformindependent.

One other fairly serious problem arose. In the initial implementation, JavaGenes started a new random n u m b e r generator after restart. Because of this, evolution did n o t follow the same path taken after the last chzckpoint. In other words, the genetic algorithm was controlled by different random numbers and therefore searched a different part of the search space. That made the algorithm appear more efficient than it was, because the number of generations to find a specific target was used as the efficiency measure. Consider the case where a job ran for 100 generations up to a checkpoint and 20 more generations before being killed. When the job was restarted, 20 different generations would be constructed and one of the new generations, say the 10th, might find the target. Therefore, it would appear as if the target was found in 110 generations, but actually 130 were tried. Thus, jobs m i g h t repeatedly search the region around the last checkpoint and stop if a solution were found, reporting an inaccurately small number of generations to completion. It was therefore necessary, at job restart time, to restart the random n u m b e r generator with the same seed and then execute the random number generator the number of times it was executed up to the last checkpoint. This should have insured that a job would follow the same evolutionary path regardless of checkpoint history. Unfortunately, although this fix worked properly when a job was restarted on the same machine, different results were observed in normal execution on the Condor pool, where jobs frequently move between machines. These c h a n g i n g evolutionary paths were presumably caused by differences in the Java libraries on different versions of the SGI operating system. This may change the order of certain lists in JavaGenes and cause the same random number to pick different items from these lists. Fortunately, only about 6% of the generations were repeated once checkpointing performance was improved, suggesting that the effect was fairly minor. Because checkpointing was fast, computations would usually proceed to the next checkpoint before being restarted.

A prototype version of the Java universe successfully ran JavaGenes but it was not stable enough for production use at the time the experiments described in this paper were run. Also, the Java universe uses Java serialization to checkpoint jobs. Our experience indicates that standard Java serialization is too slow for JavaGenes to use.

3. RESULTS The bottom line for JavaGenes on Condor was to run many jobs to conduct the experiments necessary to start understanding the application of genetic algorithms to graphs. In this, we were successful. Thousands of jobs were successfully run, one genetic algorithm paper was p u b l i s h e d [3], and another is in preparation [4]. JavaGenes does a fairly good job of evolving pharmaceutical drug molecules, but can only evolve trivial circuits so far. We now examine the pros and cons of using Java for our application.

3.1 Java Con 3.1.1 Java Serialization Performance It was discovered that creating serialization files and reading them to implement checkpoint/restart could take as much as three hours wall clock time. Although Java serialization is extremely general-purpose, any object can be serialized and the format is CPU independent, it is difficult to understand why three hours is needed to serialize a few htindred graphs, some ancillary objects, and a few thousand real numbers (the data). In any ease, jobs under these conditions made n o progress. In addition, the serialization files were around 25 MB and there were often two files per job. Because jobs could be interrupted in the middle of serialization, JavaGenes wrote state information into a temporary file and, when finished, moved it to the permanent location. With long serialization times, most jobs were interrupted in the middle of writing the serialization file. Thus, the full checkpoint file and a partial checkpoint file were on disk most of the time. With hundreds of jobs, disk usage became substantial. To solve the performance problem, Java serialization was abandoned and new code written to save the state of the computation and read it back from disk. Development took approximately two days and one bug was found and fixed a few days after development was "complete." Most checkpoint files are now less than 1 MB. Checkpoint write and read usually take around 10 seconds, b u t can require up to about five minutes (including network delays). This was the only performance problem that required changes to the code.

3.1.3 JIT Essential Some very simple tests indicated that jobs run under the SGI just-in-time compiler were approximately 20 times faster than when the JIT was disabled.

3.1.4 Double Read/Write Java Libraries Bug It was discovered that the standard Java libraries will write o u t certain double numbers in an ASCII form that is not tolerated as input by the same libraries. In particular, System.out.write (.... + a D o u b l e ) will write out values such as NaN and Infinity, but Double (String) will throw an exception if these strings are passed as the argument.

3.1.2 Checkpointing with Serialization Besides having performance problems, checkpointing with serialization must be handled with care. In particular, it is sometimes error prone. It is essential to make sure that all o f the necessary program state is saved. In the case of genetic algorithms, nearly all the state that must be saved is contained in the population. However, if checkpointing is allowed in the middle of constructing a new generation, then the loop index indicating how much of the next generation has been constructed must also be saved. Also, it is difficult to write code that can be interrupted at any time and saved to disk. Therefore, checkpointing is restricted to those points in the

3.2 Java Pro 3.2.1 Porting We never had any problems porting the source or class files between Java environments.

3.2.2 Bugs Compared to developing in C or C++, there were few bugs and they were easy to find and fix. Basically, almost all the b u g s were logic problems, most of which were found and fixed while -single stepping through the code with the graphical debuggers provided by the IDEs. The uncontrolled pointers, index out o f

138

range, memory management, and similar bugs extremely common in the primary programmer's 20 year C and C++ development experience almost never occurred. These bugs are also, usually, quite difficult to find. Reducing the number of bugs dramatically lowered development time, at least subjectively.

6. REFERENCES [!] Thomas Baeck, Ulrich Hammel, and Hans-Paul Schwefel, "Evolutionary Computation: Comments on the History and on Evolutionary Current State," IEEE Transactions Computation, volume 1, number 1, pages 3-17, April 1997. [2] Raymond Carhart, Dennis I-I. Smith, and 1L Venkataraghavan, "Atom Pairs as Molecular Features in Structure-Activity Studies: Definition and Application,"

3.2.3 Memory Management Memory management was trivial. Only one bug was encountered when we neglected to create a new xyz array in c l a s s V e r t e x during cloning. Before the bug was fixed, after graph layout, all the atoms were displayed at a single point.

Journal of Chemical Information and Computer Science, volume 23, pages 64-73. [3] AI Globus, John Lawton, and Todd Wipke, "Automatic Molecular Design Using Evolutionary Techniques," Sixth

Foresight

3.2.4 Performance

Conference

on

Molecular

Nanotechnolo~,v.

Sunnyvale, California, November 1998 and Nanotechnology, volume 10, number 3, September 1999, pages 290-299.

Other than the serialization problem, performance was not a major issue. This was, in part, because Condor supplies us with lots of nearly free CPU cycles, but also because Java performance has been reasonable (although certainly not exceptional).

[4] AI Globus, Sean Atsatt, John Lawton, and Todd Wipke, "JavaGenes: Evolving Graphs with Crossover," in preparation. [5] John H. Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press. [6] John R. Koza, Genetic Programming: on the Programming of Computers by Means of Natural Selection, MIT Press,

In short, we were happy with Java for this application. While there is some run-time penalty, more rapid code development and more reliable end-product software is well worth the extra CPU cycles. Since cycles are always getting cheaper, and programmers and support staff seem to be getting more expensive, we expect Java to do well in the coming years.

Massachusetts. [7] M. Litzkow, M. Livny, and M. W. Mutka, "Condor - a Hunter of Idle Workstations," Proceedings of the 8th International Conference of Distributed Computing Systems, pages 104111, June1988. See http//www.cs.wisc.cdu/condor/.

4. FUTURE W O R K The main task before JavaGenes is to incorporate more chemical knowledge into the fitness functions, because JavaGenes usually evolves molecules that are not physiologically stable and that would be difficult, or perhaps impossible, to synthesize. To support more Java applications under Condor and more diverse environments, the prototype Condor Java universe described in the section on Condor Support for Java needs to be brought up to production status. The performance problem with Java serialization needs to be solved more generally, or applications must reimplement serialization. An alternative approach would be to rebuild a Java Virtual Machine as a Standard universe job. This approach has the advantage of requiring less handmodification of Java applications, but requires resolution of certain technical and licensing issues.

[8] M. Livny and 1L Raman, "High Throughput Resource Management," Computational Grids." The Future of HighPerformance Distributed Computing, edited by Ian Foster and Carl Kesselman, published by Morgan Kaufmann, 1999. [9] Jason D. Lohn and Silvano P. Colombano, "Automated Analog Circuit Synthesis Using a Linear Representation,"

Second International Conference on Evolvable Systems: From Biology to Hardware, Springer-Verlag, 23-25 September 1998. [10] Numerical Aerospace Systems Division, NASA Ames Research Center, http://www.nas,nasa,gov/home.html. [11] Daniel Tunkelang, A Ncmerical Ontimization Annroach to Granh Drawing, Dissertation, Carnegie Mellon University, School of Computer Science, December 1998.

5. A C K N O W L E D G E M E N T S

[12] Brian Wims and Cheng-Zhong Xu, "Traveler: A Mobile Agent Infrastructure for Wide Area Parallel Computing," ACM 1999 Java Grande Conference. San Francisco, California, 1214 June 1999.

Many thanks to Daniel Tunkelang, formerly of Carnegie Mellon, for providing his graph layout code [11]. Thanks to NWP Associates, Inc. for providing their Student T-Test code. Thanks to Rich McClellan, University of California at Santa Cruz, for providing the mol file reading and atomic element code. Thanks to Gail Felchle for much of the graphics art work. Thanks to the Condor team at the University of Wisconsin for their support. Thanks to Chris Henze and Bonnie Klein for reviewing this paper. This work was funded by NASA Ames contract NAS 2-14303 and NASA-Ames Cooperative Agreement No. NCC 2-5323.

[13] Jiang Xiao, Zbigniew Michalewicz, Lixin Zhang, and Krzysztof Trojanowski, "Adaptive Evolutionary Planner/Navigator for Mobile Robots," IEEE Transactions on Evolutionary Computation, volume 1, number 1, pages 18-28, April 1997.

139