Learning Recursive Functions with Object Oriented Genetic Programming

Alexandros Agapitos and Simon M. Lucas
Department of Computer Science, University of Essex, Colchester CO4 3SQ, UK
{aagapi, sml}@essex.ac.uk
Abstract. This paper describes the evolution of recursive functions within an Object-Oriented Genetic Programming (OOGP) system. We evolved general solutions to factorial, Fibonacci, exponentiation, even-n-parity, and nth-3. We report the computational effort required to evolve these methods and provide a comparison between crossover and mutation variation operators, and also undirected random search. We found that the evolutionary algorithms performed much better than undirected random search, and that mutation outperformed crossover on most problems.
1 Introduction
One of the most challenging areas of research in GP is to investigate ways of scaling its ability to evolve computer programs to larger and more complex problem domains. Modularity is arguably the main mechanism that conventional programming uses to address complex problems, and enables solutions to such problems to be specified as relatively simple compositions of sub-components. Past research has attempted to integrate modularity into the GP paradigm. Several approaches have been followed, including Automatically Defined Functions (ADFs) [1], Module Acquisition [2], Adaptive Representation through Learning [3] and Automatically Defined Macros [4].

Much of modern software development, however, is based on object-oriented (OO) programming. Object-oriented software design couples the design of data structures [5] (Object classes or types) with methods that operate on those structures, thereby providing better modularity and reuse than non-OO techniques. We believe that object-oriented programs should also be amenable to evolutionary search, and may enable GP to scale up to tackle complex problems that would otherwise be infeasible. This is a very significant challenge, however, and in this paper we focus our attention on the evolution of simple recursive methods within OOGP. The reason for doing this is to demonstrate that OOGP is able to provide competitive performance on this class of problem.

Recursion is a powerful concept, and when appropriate can be used to specify very elegant programs. Where applicable, recursive programs tend to be more compact than non-recursive (or non-iterative) expression trees. From a machine
learning perspective, one of the main goals of GP is to create a program given a set of training data such that the program will have a low error rate on unseen test data. Previous research [6] has shown that more parsimonious evolved solutions are less prone to over-fitting. Finally, it has been argued [7] that programs of shorter effective length have better chances of surviving the destructive effects of crossover than programs with larger effective length.

The main problem GP faces when evaluating recursive programs is the handling of infinite loops, which result from recursive function calls that never satisfy a termination criterion. Here we use a function call limit within our interpreter. Programs that exceed their limit are terminated and assigned minimum fitness (maximum error). An interesting alternative is the competing coroutines method of Maxwell [8], where a population of programs is run concurrently, with best fitness being assigned to the first programs to provide correct output. For now, however, we use the function call limit, as it is more straightforward to implement and to configure.

Previous research [9–20] has addressed the issue of evolving recursive programs, using either implicit or explicit recursion mechanisms. It is clearly not possible in a paper of this length to review all previous attempts on the subject and provide specific comparisons to other approaches. It is worth mentioning that past research has been devoted to evolving recursion using tree, linear (binary machine code) and stack-based hypothesis representations. However, to our knowledge, no solutions to factorial or exponentiation using a tree-type representation have been attempted at the time of writing.
Furthermore, the Fibonacci sequence, with a tree-type genome, was induced in [11] using Automatically Defined Recursion and Architecture Altering Operations, but the evolved program did not generalize beyond the first twelve elements of the sequence used as fitness cases during training. Even-N-Parity with explicit recursion and tree-type representation was studied in [14]. That work used trees derived from logic grammars, an approach quite different from the one taken in this paper.

This paper focuses on evolving Object Oriented (OO) recursive programs. Of direct relevance to this paper is the work of Bruce [21] and Langdon [5] on evolving abstract data types. The most similar prior work on evolution of OO methods is Abbott’s [22] and Lucas’s [23] initial explorations of reflection-based OOGP systems, as well as Suarez et al.’s [24] investigation of evolving OO agent programs. Abbott used reflection to make method invocations, and mentioned the ability to use existing class libraries as an advantage of the approach, though the parity problem he used as an example did not demonstrate that, and instead used specially defined classes and methods to help solve the problem. Lucas investigated the use of Java Reflection to enable evolutionary algorithms to directly exploit existing class libraries and demonstrated the feasibility of his approach with the aid of an evolutionary art example.

The most extensive set of evolved recursive programs presented so far was due to Spector et al. [9] with their PushGP system. They evolved many ingenious recursive solutions to a number of problems. Here, we show that we can obtain similar results in terms of evolved functionality within our OOGP system, with
all the potential benefits that could ensue from using an OO model, together with the ability to use the power of the Java class libraries to potentially evolve solutions to interesting real-world problems, with very little human input.
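The function call limit described in this introduction can be illustrated with a minimal sketch. It is not the OOGP interpreter: candidate programs are stood in for by plain Java lambdas, and the names CallLimitSketch, Candidate, Budgeted, MAX_ERROR, FACTORIAL and RUNAWAY are hypothetical. The sketch also assumes a fresh call budget for each fitness case and a sum-of-absolute-errors measure, neither of which is specified above.

```java
final class CallLimitSketch {
    /** Signals that a candidate program used up its call budget. */
    static final class CallLimitExceeded extends RuntimeException {}

    /** A candidate program; recursive calls go through `self` so that they are counted. */
    interface Candidate { long apply(long n, Budgeted self); }

    /** Wraps a candidate and counts every call made to it. */
    static final class Budgeted {
        private final Candidate body;
        private final int limit;
        private int calls = 0;

        Budgeted(Candidate body, int limit) { this.body = body; this.limit = limit; }

        long call(long n) {
            if (++calls > limit) throw new CallLimitExceeded();  // terminate the program
            return body.apply(n, this);
        }
    }

    /** Error assigned to a program that exceeds its limit (i.e. minimum fitness). */
    static final double MAX_ERROR = Double.MAX_VALUE;

    /** A well-behaved recursive candidate. */
    static final Candidate FACTORIAL = (n, self) -> n <= 1 ? 1 : n * self.call(n - 1);

    /** A candidate whose recursion never reaches a base case. */
    static final Candidate RUNAWAY = (n, self) -> self.call(n + 1);

    /** Sum of absolute errors over the fitness cases, or MAX_ERROR if the limit is hit. */
    static double error(Candidate c, long[] inputs, long[] targets, int limit) {
        double total = 0.0;
        for (int i = 0; i < inputs.length; i++) {
            Budgeted run = new Budgeted(c, limit);  // assumption: one budget per fitness case
            try {
                total += Math.abs(run.call(inputs[i]) - targets[i]);
            } catch (CallLimitExceeded e) {
                return MAX_ERROR;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long[] inputs = {0, 1, 2, 3, 4, 5};
        long[] targets = {1, 1, 2, 6, 24, 120};
        System.out.println(error(FACTORIAL, inputs, targets, 100));
        System.out.println(error(RUNAWAY, inputs, targets, 100));
    }
}
```

With a limit of 100 calls, the factorial candidate scores an error of 0.0 on these six cases, while the runaway candidate is cut off and receives MAX_ERROR.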
2 OOGP: Evolutionary Computation Research System
OOGP is a Java-based Object Oriented Genetic Programming system capable of evolving OO method implementations that match a specified interface. This section describes the main features of the OOGP experimentation framework.

OOGP uses a panmictic, generation-based breeding policy to evolve object-oriented programs. For this paper, we are fixing the set of available classes, objects, variables and method signatures, and evolving only the method implementations, which are specified as program-trees. Each run begins with a population of randomly initialized program-trees. The generational model is combined with elitism, in that a fixed percentage of the best individuals are preserved from generation to generation. The genetic algorithm for OOGP is the same as that for standard GP. The system provides implementations of the three tree generation methods most widely used in the literature, namely Full, Grow, and Ramped Half-and-Half [20]. We used tournament selection and the standard variation operators of crossover (XO), macro-mutation (MM: substituting a node in the tree with an entire randomly generated subtree with the same return type) and point-mutation¹ (PM: substituting a non-terminal node with another non-terminal node with the same return and parameter types, or substituting a terminal node with another terminal node of the same return type). We used a standard crossover operator (not homologous).

Recursion in conventional programming can be achieved by making the name of a procedural abstraction appear in its own body, thus enabling it to call itself. Similarly, recursion in GP can be most naturally expressed by assigning a name to the evolved method and allowing this name to be called from within the evolved method’s body [12, 14, 19]. An important issue is what we consider to be a non-terminal element.
Using the above approach we make no distinction between built-in methods and the evolved method, thus making the evolved method available in the method set that serves as the alphabet for constructing hypotheses. Importantly, this representation of recursion is generic and in line with conventional programming’s implementation of recursive calls. It does not require high-level recursive operators to be supplied in the method set. Thus, the ability to synthesize arbitrary recursive behaviour from non-recursive primitives makes recursion an emergent property of the GP run.

Given a method signature (its return type and list of parameter types), the system evolves the code that implements the method. We consider an evolved method’s representation to be a tree-type structure of objects of interface type Expr. This interface is presented in Figure 1. Note that each expression has a set of children, and an evaluation environment. The evaluation environment provides bindings for any formal method parameters, and this model directly enables recursive calls [25].

¹ Analogue of bit-flip mutation in a standard Genetic Algorithm.

    public interface Expr {
        public Object eval(Expr[] env);
        public Expr[] getChildren();
        public Class getType();
    }

Fig. 1. The interface for an Expr, the building block of the tree-type representation

Wrapper classes have been used to define the building blocks of the tree representation. These classes include Function², IfThenElse, Cond³, Constant, and Parameter⁴, all implementing the Expr interface in order to achieve polymorphism during recursive tree evaluation and tree manipulation operations. Following the discussion above, a Function object must be able to represent both an evolved method and a primitive non-terminal element. The solution we preferred is to declare a Function class instance variable of interface type Callable⁵ and define two classes, namely Funcall and MethodCall, implementing this interface. The former holds a reference to the root node of an evolved tree-structure and triggers its evaluation, while the latter represents a primitive OO method invocation.

The system exploits reflection to automatically discover features of the environment (the existing classes and objects) it is to operate on. Therefore, it is possible to discover the set of all methods that can be called on an object of a particular class and hence invoke these during tree interpretation. To this end, we use a set of active Class types to populate our set of primitive OO non-terminals (i.e. methods). If desired, manual intervention is still possible by specifying the set of existing methods to be used. The complete OOGP method set is large and cannot be fully documented here, but the set of the OO primitive methods used in the experiments is presented in Table 1.

In addition, every evolved method should be able to use its own objects to invoke methods on. The solution we preferred is to define an ObjectHolder wrapper class, which has two fields: Class objectClass, and Object objectValue. This structure provides local fine-grain control of the object classes and values.
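The Expr interface of Figure 1 and the split of Callable into Funcall and MethodCall can be made concrete with a small sketch. It is an illustration under stated assumptions rather than the OOGP source: the Expr methods and the call(Object[] args) signature follow the text, but the constructor shapes, the Prims primitive class, the buildFactorial helper and the use of Class<?> in place of the raw Class type are hypothetical. Arguments are bound by wrapping them in Constant nodes, which is one possible reading of an environment of type Expr[]. The conditional node evaluates only the branch it selects, which a recursive tree needs in order to terminate.

```java
import java.lang.reflect.Method;

final class ExprSketch {
    interface Expr { Object eval(Expr[] env); Expr[] getChildren(); Class<?> getType(); }
    interface Callable { Object call(Object[] args); }

    /** Terminal holding a fixed value. */
    static final class Constant implements Expr {
        final Object value;
        Constant(Object value) { this.value = value; }
        public Object eval(Expr[] env) { return value; }
        public Expr[] getChildren() { return new Expr[0]; }
        public Class<?> getType() { return value.getClass(); }
    }

    /** Terminal that looks up a formal parameter in the evaluation environment. */
    static final class Parameter implements Expr {
        final int index; final Class<?> type;
        Parameter(int index, Class<?> type) { this.index = index; this.type = type; }
        public Object eval(Expr[] env) { return env[index].eval(env); }
        public Expr[] getChildren() { return new Expr[0]; }
        public Class<?> getType() { return type; }
    }

    /** Conditional node: the test is evaluated first, then only the chosen branch. */
    static final class IfThenElse implements Expr {
        final Expr test, thenBranch, elseBranch;
        IfThenElse(Expr test, Expr thenBranch, Expr elseBranch) {
            this.test = test; this.thenBranch = thenBranch; this.elseBranch = elseBranch;
        }
        public Object eval(Expr[] env) {
            return (Boolean) test.eval(env) ? thenBranch.eval(env) : elseBranch.eval(env);
        }
        public Expr[] getChildren() { return new Expr[] {test, thenBranch, elseBranch}; }
        public Class<?> getType() { return thenBranch.getType(); }
    }

    /** Non-terminal: evaluates its arguments strictly, then delegates to a Callable. */
    static final class Function implements Expr {
        final Callable target; final Class<?> type; final Expr[] args;
        Function(Callable target, Class<?> type, Expr... args) {
            this.target = target; this.type = type; this.args = args;
        }
        public Object eval(Expr[] env) {
            Object[] values = new Object[args.length];
            for (int i = 0; i < args.length; i++) values[i] = args[i].eval(env);
            return target.call(values);
        }
        public Expr[] getChildren() { return args; }
        public Class<?> getType() { return type; }
    }

    /** Primitive call: a public static method found and invoked through reflection. */
    static final class MethodCall implements Callable {
        final Method method;
        MethodCall(Class<?> owner, String name, Class<?>... paramTypes) {
            try { method = owner.getMethod(name, paramTypes); }
            catch (NoSuchMethodException e) { throw new IllegalArgumentException(e); }
        }
        public Object call(Object[] args) {
            try { return method.invoke(null, args); }
            catch (ReflectiveOperationException e) { throw new IllegalStateException(e); }
        }
    }

    /** Call to the evolved method: binds the arguments and evaluates the tree root. */
    static final class Funcall implements Callable {
        Expr root;  // assigned after construction, because the tree refers back to this call
        public Object call(Object[] args) {
            Expr[] env = new Expr[args.length];
            for (int i = 0; i < args.length; i++) env[i] = new Constant(args[i]);
            return root.eval(env);
        }
    }

    /** Hypothetical primitive methods, used only for this example. */
    public static final class Prims {
        public static Double mul(Double a, Double b) { return a * b; }
        public static Double sub(Double a, Double b) { return a - b; }
        public static Boolean lessEq(Double a, Double b) { return a <= b; }
    }

    /** Hand-builds fact(n) = if (n <= 1) 1 else n * fact(n - 1) as an Expr tree. */
    static Funcall buildFactorial() {
        Funcall self = new Funcall();
        Class<?> d = Double.class;
        Expr n = new Parameter(0, d), one = new Constant(1.0);
        Callable mul = new MethodCall(Prims.class, "mul", d, d);
        Callable sub = new MethodCall(Prims.class, "sub", d, d);
        Callable lessEq = new MethodCall(Prims.class, "lessEq", d, d);
        Expr recurse = new Function(self, d, new Function(sub, d, n, one));
        self.root = new IfThenElse(new Function(lessEq, Boolean.class, n, one),
                                   one, new Function(mul, d, n, recurse));
        return self;
    }

    public static void main(String[] args) {
        System.out.println(buildFactorial().call(new Object[] {5.0}));  // prints 120.0
    }
}
```

Because the recursive call is just another Function node whose Callable happens to be a Funcall, the tree needs no dedicated recursion operator. In an evolutionary run the tree would be generated and varied by the system rather than assembled by hand as in buildFactorial.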
The ObjectManager class dynamically instantiates objects of active Class types⁶ and makes them available for use to an evolved method during its creation. The interface of Figure 1 shows that each tree-node, represented as a class that implements this interface, is a self-evaluated entity (i.e. eval(Expr[] env)). Our reflection-based interpreter starts the evolved method evaluation from the root node of its tree-type structure and recursively calls the eval method on each node. Conditional nodes (i.e. instances of the IfThenElse and Cond classes) are evaluated using the delayed evaluation model (otherwise recursive programs would never halt). All other nodes are strictly evaluated.

The heavy reliance on reflection does have an unfortunate performance cost, and for real-world applications a better option might be to compile the code. This can be done by using Java toolkits that allow run-time manipulation of classes, or by generating Java source code and then compiling it. For the simple problems under investigation here, however, the compilation or class construction cost could outweigh the evaluation cost.

² To avoid confusion with the existing java.lang.reflect.Method we call our tree-node, representing an OO method, a Function.
³ Analogous to Lisp’s Cond function.
⁴ Constant and Parameter classes represent primitive terminal elements.
⁵ Its only method is: public Object call(Object[] args).
⁶ Via reflection-based constructor invocation.

Table 1. Sample OOGP method set

Description      Methods                  Argument(s) type
Arithmetic       add, sub, mul, div, pow  Double, Double
                 exp, log, sqrt           Double
Boolean Logic    and, or, nand, nor       Boolean, Boolean
List Processing  cdr                      List
                 car                      List
                 isEmpty                  List
                 length                   List
Predicate        =, >, >=,