Evolving Modular Recursive Sorting Algorithms
Alexandros Agapitos and Simon M. Lucas
Department of Computer Science, University of Essex, Colchester CO4 3SQ, UK
[email protected],
[email protected]
Abstract. A fundamental issue in evolutionary learning is the definition of the solution representation language. We present the application of Object Oriented Genetic Programming to the task of coevolving general recursive sorting algorithms along with their primitive representation alphabet. We report the computational effort required to evolve target solutions and provide a comparison between crossover and mutation variation operators, and also undirected random search. We found that the induction of evolved method signatures (typed parameters and return type) can be realized through an evolutionary fitness-driven process. We also found that the evolutionary algorithm outperformed undirected random search, and that mutation performed better than crossover in this problem domain. The main result is that modular sorting algorithms can be evolved.
1 Introduction
A fundamental issue in evolutionary learning is the identification of the solution representation space. More specifically, given that the Genetic Programming (GP) paradigm relies on the evaluation of executable structures, the appropriate design of a primitive language is crucial. This language needs to embody a sufficient level of expressiveness for the desired phenotype to evolve. In traditional GP systems the representation system is composed of a static alphabet containing primitive terminal and non-terminal elements. Traditional GP ignores much of what we know about how to design and implement well-structured software, which to a significant practical degree means object-oriented software. Indeed, much of the difficulty of high-level software design lies in the identification of useful abstractions. Building abstractions with procedures is arguably the main mechanism that conventional programming uses to address complex problems, and enables solutions to such problems to be specified as relatively simple compositions of sub-components. Past research has attempted to integrate modularity into the GP paradigm. Several approaches have been followed, including Automatically Defined Functions [1], Module Acquisition [2], Adaptive Representation through Learning [3], Automatically Defined Macros [4] and Structure Abstraction [5]. This paper presents work on coevolving general modular recursive sorting algorithms along with their representational language within an Object Oriented Genetic Programming System (OOGP).
M. Ebner et al. (Eds.): EuroGP 2007, LNCS 4445, pp. 301–310, 2007. c Springer-Verlag Berlin Heidelberg 2007
Sorting is a challenging problem for GP, and in general is not solvable with the usual GP-style constant-time expression trees, since the evolved algorithm has to rearrange the comparable elements of sequences of arbitrary length into order. A literature review [6,7,8,9,10,11] on sorting algorithm evolution revealed a limited repertoire of attempts in this problem domain. While previous research on the evolution of iterative sorting has shown some promise, the evolution of recursive sorting algorithms has received very little attention from the evolutionary computation community, limited to the authors' previous work [7]. That study concentrated on evolving general recursive sorting algorithms. The time complexity of the successfully evolved algorithms was measured experimentally in terms of the number of method calls made, and for the best evolved individuals this was best approximated as O(n × log(n)). Additionally, we investigated the effects of language design on evolving implementations of efficient sorting algorithms as well as the proficiency of five different fitness functions based on measures of sequence disorder.
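To make concrete how time complexity can be measured experimentally in terms of method calls, the following is a minimal, hand-written sketch (not an evolved program): a recursive insertion sort over a cons list, instrumented with a call counter. All class and method names here are our own illustrative choices, not those of the OOGP system.

```java
// Hand-written sketch: recursive insertion sort over a cons list, instrumented
// with a call counter to illustrate measuring time complexity experimentally.
public class CallCountingSort {
    static long calls = 0;

    // Minimal cons cell: a head element plus a tail list (null = empty list).
    static final class CList {
        final int head; final CList tail;
        CList(int head, CList tail) { this.head = head; this.tail = tail; }
    }

    // Insert x into an already-sorted list.
    static CList insert(int x, CList l) {
        calls++;
        if (l == null || x <= l.head) return new CList(x, l);
        return new CList(l.head, insert(x, l.tail));
    }

    // Recursive insertion sort: sort the tail, then insert the head.
    static CList sort(CList l) {
        calls++;
        if (l == null) return null;
        return insert(l.head, sort(l.tail));
    }

    // Helper: sort an array via the cons list and render the result.
    static String sortedString(int[] a) {
        CList l = null;
        for (int i = a.length - 1; i >= 0; i--) l = new CList(a[i], l);
        StringBuilder sb = new StringBuilder();
        for (CList p = sort(l); p != null; p = p.tail) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(p.head);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(sortedString(new int[] {5, 1, 4, 2, 3}));  // 1 2 3 4 5
        System.out.println("method calls: " + calls);
    }
}
```

Counting invocations of sort and insert over lists of increasing length lets one fit the growth of the count against candidate complexity classes such as O(n × log(n)) or O(n^2).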
2 Programming Space Under Exploration
On an intuitive level, the higher the complexity encapsulated in the primitive alphabet used to construct candidate solutions, the broader the class of problems that can be addressed. This prompts us to investigate a mechanism for adapting the primitive representational vocabulary by extending it with explicitly evolvable building blocks, tailored to the specific environment. This mechanism was first introduced in [1], under the name of "evolutionary selection of program's architecture", in an attempt to extend the ADF methodology by overcoming the need to pre-specify the number of automatically defined functions (and their arguments), and the hierarchical references among them. Performance details of the application of this technique to a wide range of application areas are given in [1]. Here, we extend the previous work of [1], under the notion of "evolution of method signatures", by simply adding type information to the return value and formal parameters of the evolvable member methods. The evolutionary algorithm (EA) will be exploring the programming space of sequences of comparable items. Its language primitives are the methods and objects of a simple, general-purpose list-processing package, presented in Table 1. Note that the list-based classes and methods have been defined using standard Java programming techniques. CList is a list of Comparable items, and GreaterThan and NotGreaterThan are predicates that implement the MyComp comparator interface. This interface declares a Compare method that compares Comparable items.
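A minimal Java sketch of such a list-processing package is given below. It follows the names in Table 1 (CList, MyComp, GreaterThan, NotGreaterThan, Head, Tail, Append, Cons, EqualTo), but the implementation details are our assumptions; the package actually used in the experiments may differ.

```java
// Minimal sketch (assumed details) of the list-processing primitives of Table 1.
// CList is a list of Comparable items; GreaterThan and NotGreaterThan implement
// the MyComp comparator interface, which declares a Compare method.
public class ListPrimitives {
    public interface MyComp { boolean Compare(Comparable a, Comparable b); }

    public static class GreaterThan implements MyComp {
        public boolean Compare(Comparable a, Comparable b) { return a.compareTo(b) > 0; }
    }
    public static class NotGreaterThan implements MyComp {
        public boolean Compare(Comparable a, Comparable b) { return a.compareTo(b) <= 0; }
    }

    // Immutable cons list of Comparable items; null stands for the empty list.
    public static class CList {
        public final Comparable head; public final CList tail;
        public CList(Comparable head, CList tail) { this.head = head; this.tail = tail; }
    }

    // The method set of Table 1, written as static helpers.
    public static Comparable Head(CList l) { return l.head; }
    public static CList Tail(CList l) { return l.tail; }
    public static CList Cons(Object x, Object l) { return new CList((Comparable) x, (CList) l); }
    public static CList Append(CList a, CList b) {
        return a == null ? b : Cons(a.head, Append(a.tail, b));
    }
    public static boolean EqualTo(Object a, Object b) { return a == b || (a != null && a.equals(b)); }

    public static void main(String[] args) {
        CList l = Cons(1, Cons(2, null));
        System.out.println(Head(l));                           // 1
        System.out.println(new GreaterThan().Compare(3, 2));   // true
    }
}
```

Note the design choice of an immutable cons list: Cons and Append never modify their arguments, which makes them safe to call anywhere inside an evolving expression tree.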
3 Evolvable Recursive Functions
It was shown in [7] that it is possible to reliably evolve a range of general recursive functions within an OOGP system. The recursion mechanism used is general and in line with conventional programming's implementation of recursive calls.
Table 1. Primitive elements for evolving sorting algorithms

Method set
Method        Argument(s) type         Return type
Head          CList                    Comparable
Tail          CList                    CList
Append        CList, CList             CList
Cons          Object, Object           CList
Compare       Comparable, Comparable   Boolean
EqualTo       Object, Object           Boolean

Conditional
Control flow  Argument(s) type         Return type
IF-Then-Else  Boolean, CList, CList    CList

Terminal set
Terminal               Value                  Type
Parameter[0]           -                      CList
Parameter[1]           -                      MyComp
Parameter[2]           -                      Comparable
Const: GreaterThan     new GreaterThan()      MyComp
Const: NotGreaterThan  new NotGreaterThan()   MyComp
Const: null            null                   Object
It makes no distinction between built-in methods and the evolved method, thus making the evolved method's reference available to the method set that serves as the alphabet for constructing the adaptive tree structures. Each evolved method in the OOGP system looks much like a Java method, with a declaration (signature: return type and parameter types) and an implementation, which is an expression tree evaluated with the arguments bound to the parameters. The expressions are strongly typed and may also invoke any specified methods in the Java API (as specified by the configuration of each experiment). In order to avoid the problem caused by non-terminating recursive structures, we limited the number of recursive calls to between 25 and 10,500. The upper bound of 10,500 was chosen to be slightly larger than the largest number of recursive calls required by our hand-coded implementation of the most recursively expensive configuration, as discussed in [7]. In order to allow for the emergence of environment-specific modules, the hypothesis representation has been enhanced. The structure being evolved is now a set of evolved methods, reminiscent of a Class definition. For the sake of our discussion here we shall call this structure an Evolvable Class. The evaluation of such an individual begins from a pre-specified member method.

3.1 Evolutionary Selection of Hypothesis Structure
The syntactic structure of an Evolvable Class is dependent upon its constituent elements. However, in this work, the primitive set of elements is not static but includes a variable number of coevolving member methods. These methods in turn have variable signatures. When the initial random population is created,
it contains Evolvable Classes with different structures. That is, the number of evolvable member methods, and the number and type of arguments that each of them possesses, differ from one individual to another. The different member method signatures range over various useful instances. Each Evolvable Class is evaluated for fitness (starting from a pre-determined member method) and selected to participate in genetic operations using tournament selection.

3.2 Evolutionary Run Initialization
Each Evolvable Class has a pre-specified evolvable method that serves as the initial point of fitness evaluation. We call this the Main Member Method. The signature of this member method is set a priori according to the signature of the target solution. The creation of an initial random Evolvable Class begins with the uniform random selection (from within a pre-specified range) of the number of evolvable member methods (other than the Main Member Method) that will belong to it. Then a series of independent random choices is made for the number and type of arguments possessed by each member method. All of these random choices are made within a wide but limited range that includes every number and type that might be sensible for the problem at hand. We need to make clear that once the signatures of the evolvable member methods of an Evolvable Class are specified, they cannot be altered by applying a variation operator. The signature diversity enforced by the creation of the initial population plays a significant role in the success of the evolutionary run. Each evolvable member method (including the main one) allows recursive calls to itself. Additionally, each member method is allowed to invoke the other methods of the Evolvable Class hierarchically. A simple naming scheme has been employed to guard against circular calling dependencies.

3.3 Variation Operators
OOGP uses three main variation operators, namely, macro-mutation (MM — substituting a node in the tree with an entire randomly generated subtree with the same return type and a maximum random depth of 4, subject to depth constraints), creation (CR — a special case of mutation where an entirely new individual is created in the same way as in the initial random generation) and crossover (XO). The motivation for the creation operator lies in the fact that method signatures are not modified after the creation of the initial population; CR guards against the premature loss of certain signatures. The diversity of signatures of member methods among different Evolvable Classes has a concomitant impact on the mechanism of crossover. In order to guarantee that this variation operator produces syntactically correct offspring, Point Typing has been used as in [1]. Our single-offspring crossover begins with the uniform selection of a member method from a contributing Evolvable Class. Subsequently, a point in the selected member method is chosen. The distribution of crossover points is set to a 90% probability of selecting interior nodes (uniformly) and a 10% probability of selecting a leaf
node. The point in the receiving parental Evolvable Class is selected under the constraints of Point Typing. Crossover is then performed in the standard way. The resulting Evolvable Class inherits the member methods' signatures from the receiving Evolvable Class. During this process, as in [1], member methods coevolve with the Main Member Method, resulting in the emergence of environment-specific building blocks, advantageous to the composition of the final solution.
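The Point Typing constraint and the 90%/10% crossover-point distribution described above can be sketched as follows. The Node representation and all names are illustrative assumptions, not the OOGP implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of Point Typing: a donated subtree may only replace a
// node in the receiver whose return type matches, so the offspring is
// guaranteed to be type-correct. Node/type names are assumptions.
public class PointTyping {
    static final Random rng = new Random(42);

    static class Node {
        final String returnType; final List<Node> children = new ArrayList<>();
        Node(String returnType) { this.returnType = returnType; }
        boolean isLeaf() { return children.isEmpty(); }
    }

    static void collect(Node n, List<Node> interior, List<Node> leaves) {
        (n.isLeaf() ? leaves : interior).add(n);
        for (Node c : n.children) collect(c, interior, leaves);
    }

    // Pick a crossover point: 90% interior nodes, 10% leaves (uniform within each).
    static Node pickPoint(Node root) {
        List<Node> interior = new ArrayList<>(), leaves = new ArrayList<>();
        collect(root, interior, leaves);
        List<Node> pool = (!interior.isEmpty() && rng.nextDouble() < 0.9) ? interior : leaves;
        return pool.get(rng.nextInt(pool.size()));
    }

    // Under Point Typing, a receiving point is compatible only if its return
    // type matches the donated subtree's return type.
    static boolean compatible(Node donated, Node receivingPoint) {
        return donated.returnType.equals(receivingPoint.returnType);
    }

    public static void main(String[] args) {
        Node root = new Node("CList");
        root.children.add(new Node("Boolean"));
        root.children.add(new Node("CList"));
        System.out.println(pickPoint(root).returnType);
        System.out.println(compatible(new Node("CList"), root));  // true
    }
}
```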
4 Experimental Context
Control parameters were specified as follows. Population size was set to 25,000 individuals and the number of generations was fixed at 100. The maximum depth of a tree in the initial generation was set to 4, whereas the maximum depth resulting from the application of a variation operator was set to 10. We used three different search regimes to search the space of candidate solutions. The first regime, XO-Regime, used 95% XO, 4% MM and 1% CR. The second, MM-Regime, used 99% MM and 1% CR. Tournament selection (tournament sizes of 3 and 7 for XO-Regime and MM-Regime respectively) along with elitism (1%) was used as the selection scheme. Previous work [12] on the evolution of recursive and iterative algorithms has raised scepticism as to whether the performance of an evolutionary algorithm is anything more than the result of a random exploration of the fitness landscape. It has been argued [13] that the space of algorithms is highly discontinuous relative to the space of functions, resulting in landscapes that are difficult to search and that can degenerate evolutionary learning into a needle-in-a-haystack problem. In order to ensure that the ability to sort within our setup is not essentially a result of random search, we set up an additional comparison between the EA and random search. This third regime, RS-Regime, used random search (RS) (i.e. no selection pressure), but arranged in generations of purely random individuals (with a random maximum tree depth of 10) in order to plot the fitness on the same graphs as for the other search regimes. The range of potentially useful numbers of member methods within a Class definition cannot be predicted with certainty for an arbitrary problem. The same holds for the range of their numbers of arguments. Here, we arbitrarily allow up to 3 additional member methods, so that a maximum of 4 evolvable member methods (including the main method) can be defined in an Evolvable Class.
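The random sampling of Evolvable Class structures under these settings (up to 3 additional member methods beyond the main one, with argument counts and types drawn from the ranges discussed in the text) can be sketched as follows; all names and the exact sampling choices are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of initial signature selection: each Evolvable Class gets
// the fixed-signature Main Member Method plus up to 3 extra member methods,
// each with 1-3 arguments drawn from Sargs = {CList, Comparable, MyComp}.
public class SignatureInit {
    static final String[] SARGS = { "CList", "Comparable", "MyComp" };

    static class Signature {
        final List<String> argTypes = new ArrayList<>();
        final String returnType;
        Signature(String returnType) { this.returnType = returnType; }
    }

    static List<Signature> randomClassSignatures(Random rng) {
        List<Signature> methods = new ArrayList<>();
        // The Main Member Method's signature is fixed by the target problem:
        // sort takes a CList and returns a CList.
        Signature main = new Signature("CList");
        main.argTypes.add("CList");
        methods.add(main);
        // Up to 3 additional member methods with random signatures.
        int extra = rng.nextInt(4);
        for (int m = 0; m < extra; m++) {
            Signature s = new Signature(SARGS[rng.nextInt(SARGS.length)]);
            int nArgs = 1 + rng.nextInt(3);  // 1..3 arguments
            for (int a = 0; a < nArgs; a++) s.argTypes.add(SARGS[rng.nextInt(SARGS.length)]);
            methods.add(s);
        }
        return methods;
    }

    public static void main(String[] args) {
        List<Signature> sigs = randomClassSignatures(new Random(1));
        System.out.println("member methods: " + sigs.size());  // between 1 and 4
    }
}
```

Once drawn, these signatures stay fixed for the individual's lifetime, which is why the creation operator is needed to reintroduce lost signatures into the population.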
We set a sensible maximum number of arguments for an evolvable member method by inspecting the average number of arguments of methods in the Java API, and also by inspecting the modular hand-coded recursive sorting algorithm implementation presented in [7]. Thus, we allow a maximum of three arguments per method. For argument types, it is reasonable to draw possibly-useful instances from the programming space under consideration. Here we define the set Sargs of possible argument types to be Sargs = {CList, Comparable, MyComp}. Five fitness functions based on different measures of sequence disorder were used as in [7]. These are: (a) Mean Sorted Position Distance (MSPD), (b) Mean Inversion Distance (MID), (c) Minimum Number of Exchanges (MNE), (d)
Number of Step Downs (NSD), (e) Number of Elements to Remove (REM). The training cases consisted of 10 random lists of 2, 3, 4, 5, 6, 7, 8, 9, 10 and 11 unique elements respectively. Elements were randomly chosen from the range {0, . . . , 250}. Test sets measured the ability of an evolved solution to generalize to unseen data and determined the success of a run. Test cases for generality consisted of 200 random lists (with no element-uniqueness requirement) with a maximum random length of 100. It is noteworthy that successfully evolved individuals made no reference to the length of the input sequence and were subsequently tested correct on lists of up to 1000 elements in order to be evaluated for time complexity.

[Figure: a simplified evolved sorting algorithm, expressed as a nested s-expression over the primitives of Table 1 (sort, EvolvableMethod, IF-Then-Else, Cons, Compare, EqualTo, Head, Tail).]
Fig. 1. Sample simplified evolved sorting algorithm
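As an illustration of the disorder measures above, here is one plausible reading of MSPD (the exact definition is given in [7] and may differ): for each element, take the absolute distance between its current position and its position in the fully sorted sequence, averaged over the sequence length. It assumes unique elements, as in the training lists.

```java
import java.util.Arrays;

// Sketch of one possible reading of Mean Sorted Position Distance (MSPD):
// average, over all elements, of |current position - position when sorted|.
// Assumes unique elements; the definition actually used in [7] may differ.
public class Mspd {
    static double mspd(int[] seq) {
        int[] sorted = seq.clone();
        Arrays.sort(sorted);
        double total = 0;
        for (int i = 0; i < seq.length; i++) {
            int target = Arrays.binarySearch(sorted, seq[i]);  // position when sorted
            total += Math.abs(i - target);
        }
        return total / seq.length;
    }

    public static void main(String[] args) {
        System.out.println(mspd(new int[] {1, 2, 3}));  // 0.0 (already sorted)
        System.out.println(mspd(new int[] {3, 2, 1}));  // (2 + 0 + 2) / 3
    }
}
```

A measure of this kind gives a smooth gradient toward sortedness: partially ordered outputs score better than fully scrambled ones, which is what makes it usable as a fitness function.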
5 Evaluating the Generality of the Experimental Setup
On a practical level we want to ensure that the experimental setup is general and not biased toward sorting algorithms. For this purpose we ran two supplementary experiments to evaluate the generality of the setup. The target functions were chosen to be those of (a) reversing a list (i.e. (reverse '(1 2 3)) = '(3 2 1)) and (b) duplicating each element in a list (i.e. (duplicate '(1 2 3)) = '(1 1 2 2 3 3)). These recursive functions have the same signature as the sorting algorithm (they accept a CList as an argument and return a CList). The idea is that by using the same primitive terminal and non-terminal sets but varying the fitness function and the training data, we can lead the system to learn different target functions. Experiments used a population of 1000 individuals and 50 generations. MM (99%) and CR (1%) constituted the search regime employed. The fitness function was based on the sum of the positional distances between the same elements of the induced list and the target list, averaged over the length of the target list. Training and test set sizes, numbers of member methods, arguments, and argument types were set as above. We found that the probability of evolving target solutions for the list reversal problem was 94% (standard error: 2.4), resulting in a computational effort curve I(M,i,z) that reaches a minimum
value of 110,000 individuals at generation 5. Given that 10 fitness cases were used during training, the number of fitness evaluations required is 1,100,000. Analogously, for the list element duplication problem, we got a probability of success of 81% (standard error: 3.9), resulting in an effort curve that reaches a minimum value of 400,000 (4,000,000 fitness evaluations) by generation 9.

Table 2. Summary of results for each search regime on each fitness function (bold face indicates best performance on a given fitness function, standard errors in parentheses for prob. of success)
Fitness  Regime  Prob. of Success (%)  Minimum I(M,i,z)  Fitness Evaluations
MSPD     XO      6 (2.4)               138,750,000       1,387,500,000
         MM      7 (2.5)               78,400,000        784,000,000
         RS      0 (-)                 -                 -
MID      XO      1 (0.1)               378,675,000       3,786,750,000
         MM      4 (1.9)               67,800,000        678,000,000
         RS      0 (-)                 -                 -
MNE      XO      1 (0.1)               309,825,000       3,098,250,000
         MM      4 (1.9)               98,800,000        988,000,000
         RS      0 (-)                 -                 -
NSD      XO      1 (0.1)               321,300,000       3,213,000,000
         MM      1 (0.1)               229,500,000       2,295,000,000
         RS      0 (-)                 -                 -
REM      XO      1 (0.1)               413,100,000       4,131,000,000
         MM      1 (0.1)               252,450,000       2,524,500,000
         RS      0 (-)                 -                 -

6 Results and Discussion
We performed 100 independent runs with each search regime in order to obtain statistically meaningful results. The computational effort I(M,i,z) was computed in the standard way, as described in [1]. Figures 2(a), 2(b), and 2(c) show the best-of-generation individuals of 100 independent runs using XO-Regime, MM-Regime, and RS-Regime respectively. Figures 2(d) and 2(e) provide a comparison of the cumulative probabilities of success between the different fitness functions under XO-Regime and MM-Regime respectively. Figure 2(f) presents a comparison of the average depth and size (in terms of number of nodes) of the successfully evolved individuals for the different fitness functions under the XO and MM search regimes. Random search could not find any target solutions under any of the five fitness functions considered. Thus, in each run, 2,500,000 individuals (given 10 fitness cases, 25,000,000 fitness evaluations) were processed without producing a general sorting algorithm. Table 2 shows that the fitness function based on the sequence disorder measure MSPD performed consistently better under both variation operator regimes. This is
Fig. 2. (a) Best-of-generation individuals using MSPD and XO-Regime; (b) Best-of-generation individuals using MSPD and MM-Regime; (c) Best-of-generation individuals using MSPD and RS-Regime; Comparison of cumulative probability of success between different fitness functions using (d) XO-Regime and (e) MM-Regime; (f) Comparison of the average depth and size (in terms of number of nodes) of the successfully evolved sorting algorithms for the different fitness functions under XO and MM search regimes.
in line with the previous results in [7], where MSPD and MID performed significantly better under that experimental setup. We also note that for MSPD, macro-mutation performed slightly better than crossover; however, the difference in their probability of success is rather insignificant. The important difference lies in the computational effort required to yield a successful outcome. Looking at the minimum error histograms in Figures 2(a) and 2(b), we observe that the population under macro-mutation converges more rapidly, and this has a direct implication for the required fitness evaluations. The results presented in Table 2 show that the superiority of macro-mutation in terms of parsimony in fitness evaluations is a general phenomenon, as it remains essentially constant over all the fitness functions considered, and it becomes particularly significant for MID and MNE. Figures 2(d) and 2(e) present the performance curves under the different variation operators. Looking at those graphs, we note that for XO-Regime most runs tend to stagnate after approximately generation 40, with the consistently better-performing MSPD stagnating after about generation 75. For MM-Regime we see a wider distribution of generation values for run stagnation, with MNE continuing to improve up to approximately generation 72. Observing the depth and size comparison, we note that on average macro-mutation resulted in smaller solutions, mainly due to the additional depth constraint imposed during its application (the implanted subtree is not allowed to grow past a depth of 4). Figure 1 presents a simplified sample evolved solution. We evaluated its efficiency in terms of the method invocations required to sort sequences of up to 1000 elements. This is best approximated as O(n^2), with a close fit to F(n) = 1.255 × n^2. The coefficient (1.255) was chosen to minimize the mean squared error between F(n) and the measured method invocations, for n being the length of the input sequence.
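The least-squares fit of the single coefficient in F(n) = c × n^2 has a closed form: minimising the mean squared error gives c = Σ(y_i × n_i^2) / Σ(n_i^4), where y_i is the measured invocation count for length n_i. The sketch below uses synthetic data points, not the paper's measurements.

```java
// Sketch of fitting the coefficient of F(n) = c * n^2 by least squares.
// Minimising mean squared error has the closed-form solution
// c = sum(y_i * n_i^2) / sum(n_i^4). Data below is synthetic, for illustration.
public class QuadraticFit {
    static double fitCoefficient(double[] n, double[] y) {
        double num = 0, den = 0;
        for (int i = 0; i < n.length; i++) {
            double n2 = n[i] * n[i];
            num += y[i] * n2;   // sum of y_i * n_i^2
            den += n2 * n2;     // sum of n_i^4
        }
        return num / den;
    }

    public static void main(String[] args) {
        double[] n = { 10, 100, 500, 1000 };
        double[] y = new double[n.length];
        // Synthetic "measurements" generated from the model itself.
        for (int i = 0; i < n.length; i++) y[i] = 1.255 * n[i] * n[i];
        System.out.println(fitCoefficient(n, y));  // recovers approximately 1.255
    }
}
```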
We found that the algorithmic complexity has increased from O(n × log(n)) in [7] to O(n^2). Although we make no attempt to fully explain the results on a theoretical level, an intuitive understanding of the differing time efficiency of the evolved recursive sorting algorithms under the different experimental setups (as presented in [7] and the present paper) can be gained by considering two very important issues that were initially raised in [7]. First, this drop in time efficiency could well hint at the inherent difficulty of inducing multi-tree structures. A second, more important issue is related to the programming space explored by GP. This space is defined over all programs that can be constructed with the human-supplied primitive alphabet. The careful design of special language constructs, as done in [7], greatly enhanced the process of evolution and allowed GP to induce sorting algorithms of O(n × log(n)) complexity. We empirically confirmed that in the absence of these special language constructs the time complexity of the successfully evolved algorithms increased.
7 Conclusions
OOGP was successfully applied to the task of evolving modular recursive sorting algorithms. The evolved individuals were trained on small samples of data and
generalized perfectly. Evolution significantly outperformed random search. The feasibility of automatically inducing the signatures of the representational building blocks was empirically demonstrated. For that task, a fitness function based on the positional distance between actual and sorted positions performed best. Beyond that, we believe that OOGP is an area with immense possibilities, including the evolution of complete classes and of cooperating sets of classes.
References
1. J. R. Koza, Genetic Programming II: Automatic Discovery of Reusable Programs, MIT Press, Cambridge, MA, 1994.
2. Peter J. Angeline and Jordan Pollack, "Evolutionary module acquisition", in Proceedings of the Second Annual Conference on Evolutionary Programming, 1993.
3. Justinian P. Rosca and Dana H. Ballard, "Discovery of subroutines in genetic programming", in Advances in Genetic Programming 2, MIT Press, 1996.
4. Lee Spector, "Simultaneous evolution of programs and their control structures", in Advances in Genetic Programming 2, MIT Press, 1996.
5. Tina Yu and Chris Clack, "Recursion, lambda abstractions and genetic programming", in Genetic Programming 1998: Proceedings of the Third Annual Conference.
6. Lee Spector, Jon Klein, and Maarten Keijzer, "The push3 execution stack and the evolution of control", in GECCO '05: Proceedings of the 2005 Conference on Genetic and Evolutionary Computation, New York, NY, USA, 2005, pp. 1689–1696.
7. Alexandros Agapitos and Simon M. Lucas, "Evolving efficient recursive sorting algorithms", in Proceedings of the 2006 IEEE Congress on Evolutionary Computation, Vancouver, 6–21 July 2006, pp. 9227–9234, IEEE Press.
8. Kenneth E. Kinnear, Jr., "Generality and difficulty in genetic programming: Evolving a sort", in Proceedings of the 5th International Conference on Genetic Algorithms, ICGA-93, Stephanie Forrest, Ed., University of Illinois at Urbana-Champaign, 17–21 July 1993, pp. 287–294, Morgan Kaufmann.
9. Kenneth E. Kinnear, Jr., "Evolving a sort: Lessons in genetic programming", in Proceedings of the 1993 International Conference on Neural Networks, San Francisco, USA, 28 March–1 April 1993, vol. 2, pp. 881–888, IEEE Press.
10. Una-May O'Reilly and Franz Oppacher, "An experimental perspective on genetic programming", in Parallel Problem Solving from Nature 2, 1992.
11. Russ Abbott, Jiang Guo, and Behzad Parviz, "Guided genetic programming", in The 2003 International Conference on Machine Learning; Models, Technologies and Applications (MLMTA'03), Las Vegas, 23–26 June 2003, CSREA Press.
12. Scott Brave, "Evolving recursive programs for tree search", in Advances in Genetic Programming 2, MIT Press, 1996.
13. Astro Teller, "Genetic programming, indexed memory, the halting problem, and other curiosities", in Proceedings of the 7th Annual Florida Artificial Intelligence Research Symposium, Pensacola, Florida, USA, May 1994, pp. 270–274, IEEE Press.