Phenotype Diversity Objectives for Graph Grammar Evolution

November 20, 2005


Chapter 1

Phenotype Diversity Objectives for Graph Grammar Evolution

Martin H. Luerssen
School of Informatics & Engineering
Flinders University of South Australia
GPO Box 2100, Adelaide 5001, Australia
Email: [email protected]

Evolutionary algorithms are a practical means of optimising the topology of graphs. This paper explores the use of phenotype diversity measures as objectives in a graph grammar-based model of multi-objective graph evolution. Since the initial population in this model consists exclusively of empty productions, an active promotion of diversity is needed to establish the necessary building blocks from which optimal graphs can be constructed. Six diversity measures are evaluated on problems of symbolic regression, the 6-multiplexer, and neural control of double pole balancing. The highest success rates are obtained by defining diversity as the number of solutions that differ in at least one fitness case and do not Pareto-dominate each other.

1.1 Introduction

A directed graph is a system (V, E, s, t) where V is a finite set of vertices, E is a finite set of edges, and s, t : E → V assign a source s(e) and target t(e) to each e ∈ E. Natural and artificial structures that are representable as graphs are ubiquitous, and many problems of practical interest may be formulated as questions about graphs. Less commonly explored is the optimisation of graph topologies. Existing studies focus mainly on the optimisation of neural networks, but genetic programming (GP) (Koza 1992), a widely applied method for optimising trees, is also of relevance in


The Second Australian Conference on Artificial Life 2005

this context, since considerable research has gone into exploring the reuse of functions and modules, and other means of adding cycles back into the trees (Woodward 2003).

A graph grammar-based system for evolving general graphs has been presented previously (Luerssen 2005). Deriving graphs from a grammar mirrors the developmental process that gives complexity to biological organisms and allows for an efficient representation of graphs. Genetic algorithms, including GP, commonly start with a population of random initial solutions from which all subsequent offspring are drawn. Randomly initialising a graph grammar, however, easily leads to disconnected and oversized graphs. A viable alternative is to establish complexity by evolving graphs towards greater diversity. This paper evaluates six phenotype diversity measures to be used as objectives in the multi-objective optimisation of graph grammars.
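The formal definition of a directed graph given at the start of this section maps directly onto a simple data structure. The following is a minimal sketch in Python, not part of the system described in this chapter; the class name `DirectedGraph` and its methods are hypothetical and chosen only for illustration. Storing s and t as separate maps keyed by edge allows parallel edges and cycles, both of which the definition permits.

```python
from dataclasses import dataclass, field


@dataclass
class DirectedGraph:
    """Directed graph as a system (V, E, s, t); names here are illustrative."""
    vertices: set = field(default_factory=set)   # V
    edges: set = field(default_factory=set)      # E
    source: dict = field(default_factory=dict)   # s : E -> V
    target: dict = field(default_factory=dict)   # t : E -> V

    def add_edge(self, e, u, v):
        """Add edge e with s(e) = u and t(e) = v, creating vertices as needed."""
        self.vertices.update((u, v))
        self.edges.add(e)
        self.source[e] = u
        self.target[e] = v

    def successors(self, v):
        """Vertices reachable from v along a single edge."""
        return {self.target[e] for e in self.edges if self.source[e] == v}


g = DirectedGraph()
g.add_edge("e1", "a", "b")
g.add_edge("e2", "a", "b")   # parallel edge: distinct e, same source and target
g.add_edge("e3", "b", "a")   # closes a cycle between a and b
```
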

1.2 Background

1.2.1 Evolution and Development

Evolutionary algorithms (EAs) are a well-established class of methods for searching discontinuous spaces where little domain-specific knowledge is available (Bäck 1996). EAs operate on a population of diverse solutions from which an offspring population is generated by applying mutations and/or recombinations; the fittest solutions are then selected to form a new population. The basic means of representing a graph for this purpose is an adjacency matrix folded into a binary string (Miller, Todd & Hegde 1989). This representation is simple, but grows with the size of the graph rather than its complexity, inevitably leading to scalability issues.

Biological representations are more sophisticated. The genes of an organism are part of the genome, or genotype, which is encoded into several chromosomes of DNA. Natural selection does not apply to the genotype directly, but to the phenotype, which is an expression of the genotype within a given environment. Genes composed of DNA are transcribed into RNA, translated into polypeptides, and then processed into proteins which self-organize into phenotypic traits (Futuyma 1998). The ontogenetic process that defines the mapping of genotype to phenotype is called an embryogeny and involves complex feedback loops that control the expression of genes. These feedback loops can produce modular, iterative and recursive programs of development (Slack 1991) and are characterized by polygeny (multiple genes define a single phenotypic variable) and pleiotropy (changes to a single gene affect multiple phenotypic variables). Consequently, when


exploring the phenotype, large changes are possible through small variations to the genotype, while large neutral variations to the genotype will have little or no effect on the phenotype, although they may change the inductive bias on the search (Toussaint 2003).

1.2.2 Graph Ontogenesis

There has been previous research into applying various aspects of biological ontogenesis, such as morphology (Cangelosi, Parisi & Nolfi 1994) and chemistry (Astor & Adami 2000), to the evolution of graphs. The drawback of realistic ontogenetic models is their necessary complexity, which increases computational cost and makes the systematic analysis of such systems difficult. A simpler, more transparent approach is to describe ontogenesis explicitly in terms of hierarchical modularity, iteration, and recursion. A popular instance of this is Cellular Encoding (CE) (Gruau 1994), which was developed for the optimisation of neural network topologies. CE explicitly represents each developmental step as a node in a tree of graph-transforming operators; the tree is evolved by GP. Another variant of GP, Cartesian Genetic Programming (CGP), has been used to directly construct graphs from nodes with labelled edges (Miller & Thomson 2000). This approach is not developmental in nature, however, and requires extensions to incorporate concepts such as reuse and modularity (Walker & Miller 2004).

An alternative model of ontogenesis is to derive a phenotype from a grammar of production rules. L-systems are commonly used to accomplish this (Lindenmayer 1968). L-systems rewrite a starting string into a new string by applying the set of production rules to all symbols of the string in parallel, reflecting the parallel division of cells in biology. The set of production rules can be optimised via evolutionary methods, and the string then translated into a graph. Early studies on this were also targeted at optimising neural network topologies (Kitano 1990, Boers & Kuiper 1992); a more recent application is robot design (Hornby 2003).

1.2.3 Evolving a Graph Grammar

Instead of rewriting strings, it is possible to rewrite graphs directly by a process of hyperedge replacement (Habel 1992). Each hyperedge has multiple sources and targets, s, t : E → V*; in contrast, binary edges have only one of each. A graph with hyperedges is called a hypergraph, and a hypergraph with specially labelled begin and end nodes is called a multi-pointed hypergraph. A hyperedge can be replaced by a multi-pointed hypergraph by matching these nodes with hyperedge mappings s and t, respectively. Let N ⊆ C be the set of nonterminals over a label set C, T ⊆ C be a set


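[Figure 1.1 image: example grammar with nonterminals NB, ND, NF, NG and NH, the networks derived from NF and NH, and the right-hand side (RHS) of NG, which contains terminal T, nonterminals NB and NH, and begin (b) and end (e) nodes with source (s) and target (t) labels.]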

Fig. 1.1 Diagrammatic representation of an example grammar of cellular productions. Nonterminal nodes (N ) identify productions of the grammar, which may also call each other (depicted as arrows). Nonterminals are always unique, with several nonterminals (depicted as grey N nodes) labelled as starting productions that produce known graphs. During graph derivation, nonterminal NG would be replaced by the cellular graph on the right, where T is a terminal function, NB and NH are nonterminals (replaced subsequently), and b and e are terminal begin and end nodes, each with a source label s and a target label t.

of terminals and H be the set of all multi-pointed hypergraphs. Then a hypergraph production is an ordered pair p = (lhs, rhs) with lhs ∈ N and rhs ∈ H, and a hypergraph grammar is a system hgg = (N, T, P, z) where P is a finite set of hypergraph productions over N and z ∈ H is the axiom. Hypergraph productions can be partitioned into several simpler cellular productions (Luerssen 2005), which may be used as modular components for hyperedge replacement (see Figure 1.1). The right-hand side rhs of a cellular production does not correspond to a complete hypergraph but to a row in an adjacency list, which is extended by additional node labels to reduce the global side-effects of local changes to begin and end nodes (Luerssen & Powers 2005). Deriving a desirable graph from a cellular production grammar requires that the correct set of productions is determined, which can be accomplished by an EA. A system has previously been developed for evolving a graph grammar with multiple starting nonterminals that match an intended population of graphs (Luerssen & Powers 2003). For every graph derived from its associated starting nonterminal, a single expressed production may be spontaneously replaced by a mutated copy. The mutation operators comprise the simple addition, deletion and replacement of all possible nonterminals, terminals, and their labels. After testing all the mutated graphs, the least


[Figure 1.2 image: six generations, (1) to (6), of a grammar starting from the initially empty starting production NA. Steps are labelled "Graph selected for mutation", "Copy/Change", "Mutation (terminals added)", "Mutation (NB added)", "Mutation (NF added)" and "Deleted". Networks shown: derived from NA, f = 0.0; NB, f = 0.1; NC, f = 0.2 (NC calls NB); NE, f = 0.3; NF, f = 0.4; NH, f = 0.5.]

Fig. 1.2 Depiction of graph grammar evolution with a maximum population of two graphs. Starting with an empty production/graph NA in generation (1), terminals are added to a copy NB of this production in (2), then NB is added to itself, producing NC in (3), while the graph of NA has least fitness f and is thus removed. NB in the graph of NC is then mutated in (4), producing ND and a copy of NC , NE , with a reference to ND . The graph of NB is now uncompetitive, but remains as a production used by NC . Further changes are applied in (5) and (6), leading to the graph grammar previously shown in Figure 1.1.

fit solutions, both from the mutated and unmutated set, are eliminated, as are all productions not involved in any fitter solutions. Conversely, if a mutated graph survives, the grammar is modified so that the mutated graph becomes one of the graphs derivable from the grammar. The mutated production is inserted into the grammar; then modified copies are made of all the productions that need to refer to the mutated production rather than the original. This is recursively repeated for all the productions referring to the now modified productions, up to the starting production from which the new network can be derived. Evolution may thus be viewed as a repeated growing and pruning of the grammar, as shown in Figure 1.2.

1.2.4 Diversity Objectives

The GP algorithm requires an initial population of syntax trees, which can be generated by a variety of methods (Luke & Panait 2001). This population provides a reservoir of diverse building blocks from which further trees can be constructed. Building blocks are also essential for graph grammar evolution, as new graphs must be defined from productions that already exist. However, starting with random productions is not viable, as recursive relationships between these productions would make it difficult to control the size of the resulting graphs. Additionally, unlike with trees, vertices are not required to be adjacent to other vertices, so a random initialisation will likely produce disconnected graphs. The alternative to obtaining diverse building blocks from an initialisation method is to generate them during evolution.

Diversity refers to the differences between members of a population. Genotype diversity is the diversity among genomes in the population, whereas phenotype diversity is the diversity among fitness values in the population. Genetic lineages often reduce to one lineage early in the evolutionary process (McPhee & Hopper 1999), so to maintain diversity a method of selecting for it must be devised. Fitness sharing involves penalising the fitness of a solution if it is similar to other population members (Goldberg & Richardson 1987). Rosca used fitness values to define an entropy and free energy measure for phenotype diversity (Rosca 1995). High entropy reveals the presence of many unique fitness values, with the population evenly distributed over these. Bersano-Begey tracked the number of solutions that solved specific fitness cases, which was used to discover and promote more distinctive solutions (Bersano-Begey 1997). Fitness sharing among different fitness cases has also been applied to GP, reducing the occurrence of premature convergence (McKay & Abbass 2001). Another means of facilitating diversity is to add diversity as an objective to a multi-objective evolutionary algorithm (MOEA).
MOEAs select for solutions that represent Pareto-optimal trade-offs between multiple objectives, with the fitness of a solution based on its Pareto-domination by others (Deb 2001). If multiple solutions have the same degree of domination, those residing in the most sparsely populated region of the search-space are preferred. Niching strategies of this kind can evenly spread solutions across the Pareto-boundary (Deb, Mohan & Mishra 2003), but this is not guaranteed to lead to diverse building blocks and can indeed be detrimental to the scalability of the algorithm (Sastry, Pelikan & Goldberg 2005). It is additionally possible to control solution size using MOEAs by applying a size objective (Bleuer, Braek, Thiele & Zitzler 2001); however, without an active means of promoting diversity, selecting against size is known to lead to premature convergence on small solutions (De Jong & Pollack 2003). De Jong et al. achieved both smaller and more diverse trees by using tree distance as a genotype diversity objective in the multi-objective


optimisation of n-parity problems (De Jong, Watson & Pollack 2001). Bui et al. explored several diversity objectives, including mean and minimum genotype distances; the latter was also implemented by Toffolo and Benini, and competitive results were achieved in all instances (Toffolo & Benini 2003, Bui, Branke & Abbass 2005). The principal drawback of genotype distance measures is that their applicability to graph grammars is quite limited, as the extensive neutrality intrinsic to graph grammars would allow graphs to improve their distance while remaining isomorphic. Since genotype and phenotype diversity are closely intertwined (a decrease in genotype diversity will often cause a decrease in phenotype diversity), a possible solution is to employ a phenotype diversity objective instead.
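The Pareto-domination on which MOEA selection rests can be illustrated with a short sketch. It assumes that every objective is minimised, so a diversity score to be maximised is stored negated; the objective vectors below are invented for illustration and are not results from this chapter, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimised)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(population):
    """Return the members of population that no other member dominates."""
    return [p for p in population
            if not any(dominates(q, p) for q in population if q is not p)]


# Hypothetical objective vectors: (error, size, -diversity).
population = [
    (0.10, 12, -0.5),
    (0.10, 12, -0.2),   # dominated by the first: same error and size, less diverse
    (0.30, 4, -0.1),    # small graph: survives on the size objective
    (0.05, 20, -0.4),   # most accurate: survives on the error objective
]
front = non_dominated(population)
```

Here the second vector is removed because the first is at least as good on every objective and strictly better on diversity, whereas the small but inaccurate third vector remains a valid trade-off. This is the mechanism by which a size objective alone can favour small solutions, and by which a diversity objective can keep otherwise uncompetitive solutions in the population.
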

1.3 Experiments

1.3.1 Measures of Phenotype Diversity

The error returned by the objective function is the most readily available phenotypic trait of a solution and hence a solid basis for measuring phenotype diversity. To reduce any bias attributable to the nature of the specific objective function used, the solutions are ranked against each other on this function; distances are then computed as differences of ranks. Six different rank-based distance measures are suggested. The mean distance of solution i is the mean absolute difference between its rank $R_i$ and the ranks $R_j$ of the other solutions,

$$D_i = \frac{\sum_j |R_i - R_j|}{N}, \tag{1.1}$$

where N is the number of other solutions. Since it is often easier to attain worst rank than best rank, using this measure encourages poor performance. A measure less biased towards poor performance is to compare whether two solutions i and j differ in performance,

$$S_{ij} = \begin{cases} 1 & \text{if } R_i \neq R_j \\ 0 & \text{otherwise.} \end{cases}$$

The diversity of solution i can be defined as the number of solutions that are not identical in performance,

$$D_i = \frac{\sum_j S_{ij}}{N}. \tag{1.2}$$

This ‘difference measure’ encourages solutions to be different but no worse than necessary to achieve this difference. For numeric optimisation, this would obviously lead to a population of very similar solutions; however, in


the case of graph optimisation similar performance can be attained by very different graphs, so this is arguably less of a concern. Solutions with equal mean performance can still be different, and the above approaches do not recognize this. Distinguishing these solutions without comparing their genotypes is only feasible if there are multiple fitness cases that can be compared separately. Then the mean rank distance can be averaged across each case c,

$$D_i = \frac{\sum_c \sum_j |R_{ci} - R_{cj}|}{C \times N}, \tag{1.3}$$

where C is the number of fitness cases and $R_{ci}$ is the rank of solution i on case c. Two solutions perform identically if $S_{ij} = 0$, where

$$S_{ij} = \begin{cases} 1 & \text{if } \sum_c |R_{ci} - R_{cj}| \neq 0 \\ 0 & \text{otherwise,} \end{cases}$$

so that diversity may again be defined as the number of non-identical solutions,

$$D_i = \frac{\sum_j S_{ij}}{C \times N}. \tag{1.4}$$

Pareto-dominance across all fitness cases can also be established, so that dominated solutions can be excluded from the above measures. Thus, satisfying fewer fitness cases is only regarded as diversity if these fitness cases are different. The mean rank distance of a solution i is its distance to other solutions that do not dominate it,

$$S_{ij} = \begin{cases} \sum_c |R_{ci} - R_{cj}| & \text{if } j \text{ does not } \epsilon\text{-dominate } i \\ 0 & \text{otherwise,} \end{cases}$$

and the proposed diversity measure is the mean of these,

$$D_i = \frac{\sum_j S_{ij}}{C \times P_i}, \tag{1.5}$$

where $P_i$ is the number of solutions that do not dominate i. Within this dominance framework, two solutions can also be defined to differ if

$$S_{ij} = \begin{cases} 1 & \text{if } \sum_c |R_{ci} - R_{cj}| \neq 0 \text{ and } j \text{ does not } \epsilon\text{-dominate } i \\ 0 & \text{otherwise,} \end{cases}$$

and diversity can be the proportion of non-identical solutions that do not dominate,

$$D_i = \frac{\sum_j S_{ij}}{C \times P_i}. \tag{1.6}$$
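The six measures of Equations 1.1 to 1.6 can be sketched with NumPy as follows. This is an illustrative reading of the equations rather than the implementation used in the experiments, and it rests on several assumptions that the text does not settle: tied errors share a rank (dense ranking), $S_{ij}$ is read as 1 when two solutions differ so that the sum counts non-identical solutions, plain Pareto dominance on per-case ranks stands in for ε-dominance, a solution is excluded from its own $P_i$, and diversity is set to 0 when $P_i = 0$. All function names are hypothetical.

```python
import numpy as np


def dense_ranks(errors):
    """Rank errors along axis 0 (0 = best); tied errors share a rank (assumption)."""
    errors = np.asarray(errors, dtype=float)
    if errors.ndim == 1:
        return np.unique(errors, return_inverse=True)[1].reshape(-1)
    cols = [np.unique(col, return_inverse=True)[1].reshape(-1) for col in errors.T]
    return np.stack(cols, axis=1)


def mean_distance(R):                      # Eq. 1.1; R has shape (n,)
    return np.abs(R[:, None] - R[None, :]).sum(axis=1) / (len(R) - 1)


def difference(R):                         # Eq. 1.2
    return (R[:, None] != R[None, :]).sum(axis=1) / (len(R) - 1)


def case_gap(Rc):
    """G[i, j] = sum over cases c of |R_ci - R_cj|; Rc has shape (n, C)."""
    return np.abs(Rc[:, None, :] - Rc[None, :, :]).sum(axis=2)


def case_distance(Rc):                     # Eq. 1.3
    n, C = Rc.shape
    return case_gap(Rc).sum(axis=1) / (C * (n - 1))


def case_difference(Rc):                   # Eq. 1.4
    n, C = Rc.shape
    return (case_gap(Rc) != 0).sum(axis=1) / (C * (n - 1))


def not_dominated_by(Rc):
    """M[i, j] is True when j != i and j does not dominate i across cases."""
    no_worse = (Rc[None, :, :] <= Rc[:, None, :]).all(axis=2)
    better = (Rc[None, :, :] < Rc[:, None, :]).any(axis=2)
    M = ~(no_worse & better)
    np.fill_diagonal(M, False)
    return M


def _mean_over_nondominating(S, M, C):
    P = M.sum(axis=1)                      # P_i; result is 0 where P_i = 0
    return np.divide(S.sum(axis=1), C * P, out=np.zeros(len(P)), where=P > 0)


def nondominated_distance(Rc):             # Eq. 1.5
    M = not_dominated_by(Rc)
    return _mean_over_nondominating(case_gap(Rc) * M, M, Rc.shape[1])


def nondominated_difference(Rc):           # Eq. 1.6
    M = not_dominated_by(Rc)
    return _mean_over_nondominating((case_gap(Rc) != 0) & M, M, Rc.shape[1])


# Invented example: 4 solutions, 3 fitness cases, error 0 = case solved.
errors = np.array([[0, 1, 1],
                   [0, 1, 1],    # identical in performance to solution 0
                   [1, 0, 1],    # solves a different case
                   [1, 1, 1]])   # solves nothing; dominated by all others
R = dense_ranks(errors.mean(axis=1))
Rc = dense_ranks(errors)
```

In this example the last solution receives the highest score under the difference measure of Equation 1.2 and a score equal to the best under Equation 1.4, simply because it is worse than everything else, whereas Equation 1.6 assigns it zero diversity since every other solution dominates it.
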


For comparison, using each of the fitness cases as a separate objective in the MOEA will also be evaluated; solutions thereby remain non-dominated as long as they are superior to all other solutions in at least one fitness case. Ensuring diversity hence becomes the principal responsibility of the niching mechanism.

1.3.2 Evaluation

Diversity measures are evaluated on three tasks commonly used in GP and neuro-evolutionary research; this allows for easy comparison and provides a context in which to view the results. The first task is a symbolic regression of the sixth-order polynomial

$$f(x) = x^6 - 2x^4 - x^2. \tag{1.7}$$

Fitness cases are 21 equidistant points generated by this function over the interval x ∈ [−1, 1]. The second task is the 6-bit Boolean multiplexer problem, which involves decoding a 2-bit binary address (00, 01, 10, 11) and returning the value of the corresponding data register (d0, d1, d2, d3). The final task is to evolve a neural network for double pole balancing. The pole balancing experiment is set up as described by Stanley & Miikkulainen (2002) with position and velocity inputs. The fourth-order Runge-Kutta method is used to implement the dynamics of the system, with a step size of 0.01 s. All state variables are scaled to [−1, 1] before being fed to the network, which outputs a variable force to the cart. The initial position of the long pole is 1° and the short pole is upright; the track is 4.8 m long, and poles are only regarded as balanced if between −36° and 36° from vertical. Fitness is the number of time steps that both poles remain balanced.

On all tasks, a (µ + λ) evolution strategy is used, with all parents producing a single offspring each (µ = λ). For the symbolic regression, the population is 10, and the permitted terminals of the graph are the binary functions {+, −, ×, div}, where div returns 1 if the divisor is zero and the normal result of the division otherwise. For the 6-multiplexer, the population is 30, and the terminals are AND, OR, NOT, IF. For the pole balancing, the population is 50, and the terminals are log-sigmoid neurons. Real-valued weight vectors are initialized randomly with a standard Gaussian distribution. New weight vectors are generated by adding the weighted difference vector between two weight vectors (of different neurons) to a third vector, adapted from Differential Evolution (Price 1999) with F = 0.2 and a crossover probability of 0.9. On all tasks, recurrent relationships between terminals are disallowed; all graphs are feed-forward.
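The weight update described above can be sketched as the standard rand/1/bin scheme of Differential Evolution. How the adapted version selects its three vectors from different neurons is not specified here, so the sketch simply takes them as arguments; the function name `de_offspring`, the argument names, and the rule that at least one component always comes from the mutant are assumptions taken from the standard scheme, not from the system itself.

```python
import numpy as np


def de_offspring(target, a, b, c, F=0.2, CR=0.9, rng=None):
    """Standard DE/rand/1/bin step (assumed form of the adapted update).

    The mutant is a + F * (b - c); each component of the result is taken from
    the mutant with probability CR, otherwise from the target vector.
    """
    rng = np.random.default_rng() if rng is None else rng
    mutant = a + F * (b - c)
    take = rng.random(target.shape) < CR
    take[rng.integers(target.size)] = True   # at least one mutant component
    return np.where(take, mutant, target)


rng = np.random.default_rng(0)
# Weight vectors drawn from a standard Gaussian, as in the initialisation above.
target, a, b, c = rng.standard_normal((4, 5))
child = de_offspring(target, a, b, c, rng=rng)
```

With F = 0.2 the difference vector contributes only a small perturbation, so offspring weights stay close to the third vector `a`, while the high crossover probability of 0.9 means that most components of the target are replaced.
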
Graphs are composed via a soft matching approach, and all cellular productions are modular (Luerssen & Powers 2005).
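The fitness cases of the first two tasks described above are simple to generate, and the following sketch shows one way of doing so. The polynomial is taken exactly as printed in Equation 1.7; the ordering of the multiplexer inputs, with `a1` as the most significant address bit, is an assumption, as are all function and variable names.

```python
import itertools

import numpy as np


def target(x):
    """Sixth-order polynomial of Equation 1.7, as printed."""
    return x**6 - 2 * x**4 - x**2


# 21 equidistant fitness cases over [-1, 1] (spacing 0.1).
xs = np.linspace(-1.0, 1.0, 21)
regression_cases = [(x, target(x)) for x in xs]


def div(a, b):
    """Protected division: returns 1 if the divisor is zero."""
    return 1.0 if b == 0 else a / b


def mean_squared_error(predict, cases):
    """Function error of a candidate over all fitness cases."""
    return float(np.mean([(predict(x) - y) ** 2 for x, y in cases]))


def multiplexer(a1, a0, d0, d1, d2, d3):
    """6-multiplexer: the 2-bit address (a1 a0) selects one data register."""
    return (d0, d1, d2, d3)[2 * a1 + a0]


# All 2**6 = 64 input combinations and their expected outputs.
multiplexer_cases = [(bits, multiplexer(*bits))
                     for bits in itertools.product((0, 1), repeat=6)]
```

Because each task yields a list of separate cases, the per-case ranks required by the case-based diversity measures of Section 1.3.1 can be computed directly from the per-case errors of each candidate.
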


One production is mutated for each graph at a time, with productions chosen randomly from those expressed by the graph. Graphs are evolved over 5000 generations. Selection occurs using a multi-objective NSGA-II (Deb, Agrawal, Pratab & Meyarivan 2000) applied to three objectives: the function error, which is the mean squared error over all training samples; the size of the graph, which is a simple count of the expressed terminal and nonterminal nodes; and a diversity measure. For the symbolic regression and 6-multiplexer, 6(+1) diversity measures (as proposed in Section 1.3.1) are evaluated: the mean rank distance (Equation 1.1), the mean rank difference (Equation 1.2), the mean rank distance across fitness cases (Equation 1.3), the mean rank difference across fitness cases (Equation 1.4), the mean rank distance across fitness cases for non-dominated solutions only (Equation 1.5), the mean rank difference across fitness cases for non-dominated solutions only (Equation 1.6), and, as the additional scheme, using each fitness case as a separate performance objective. For pole balancing, due to the absence of multiple fitness cases, only the first two measures are evaluated.

1.3.3 Results and Discussion

Results are averaged over 100 runs and shown in Table 1.1 and Figure 1.3. A few sample solutions are also shown in Figure 1.4 for illustration. Without applying a diversity objective, no convergence occurs on the symbolic regression. This contrasts with the 100% success rate that is attained with the non-dominated difference measure across fitness cases (1.6); a perfect solution is found after 6058 evaluations on average. Selecting for distance is overall significantly less effective (p < .001) than selecting for difference, but the distance measure improves significantly (p < .001) if applied to fitness cases, especially if only non-dominated solutions are considered. For comparison, the success rate of CGP is 61% for the same population size and 8000 generations (Miller & Thomson 2000), and up to 64% for GP with a population of 4000 and 50 generations (Langdon 2000).

The impact of diversity measures is less pronounced with the 6-multiplexer problem. In the absence of a diversity measure, 81% of runs produce an optimal solution, which increases to a maximum of 91% when using the non-dominated case difference measure (1.6) (p = .01). Smaller improvements and in some instances reductions in performance are produced by the other diversity measures. This lack of strong results is surprising given that performance benefits of diversity are quite large with GP on this problem (McKay & Abbass 2001). However, the graph grammar system appears not to be very competitive on this problem in general. GP requires about 42000 evaluations to find a solution (Langdon 2000), as compared to 225100 evaluations with the graph grammar system.


Table 1.1 Success rates for 100 runs on each of the three problem tasks for each of the applied diversity objectives.

Diversity Objective        Regression  6-Multiplexer  Pole Balancing
None                       0.00        0.81           0.95
Distance                   0.21        0.63           0.97
Difference                 0.96        0.74           1.00
Case Distance              0.38        0.86           n/a
Case Difference            0.99        0.90           n/a
Non-Dominated Distance     0.64        0.85           n/a
Non-Dominated Difference   1.00        0.91           n/a
Case Objectives            0.85        0.54           n/a

Only a single fitness case is used for pole balancing, so none of the fitness case-based diversity measures can be applied. The evaluated diversity measures do not substantially improve the success rate on this problem, but the mean number of generations needed in a successful run is only 254 when using the difference measure (1.2), which is significantly better (p < .001) than 527 for the distance measure (1.1) and 467 for no diversity measure. The mean number of evaluations thus needed to find a perfect pole balancing solution is 12700, which is much less than the 80000 evaluations reported by Wieland (1991), and also compares well to the 34000 evaluations of CE (Gruau, Whitley & Pyeatt 1996) and 12600 evaluations of Symbiotic Adaptive Neuro-Evolution (Moriarty & Miikkulainen 1996), although it does not converge nearly as fast as the 3600 evaluations required by NeuroEvolution of Augmenting Topologies (Stanley & Miikkulainen 2002).
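The evaluation count quoted above follows from the generation count, assuming one evaluation per offspring and the pole balancing population of µ = λ = 50 stated in Section 1.3.2; the symbols g and n below are introduced here only for this calculation:

```latex
n_{\text{eval}} = g \times \lambda = 254 \times 50 = 12700
```

Under the same assumption, the generation counts for the distance measure and for no diversity measure correspond to 527 × 50 = 26350 and 467 × 50 = 23350 evaluations respectively.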

[Figure 1.3 image: three panels (6th-order Polynomial Regression, 6-Multiplexer, Double Pole Balancing), each plotting Min Error against Epoch from 0 to 5000, with one curve per diversity objective: None, Distance, Difference, Case Distance, Case Difference, Non-Dominated Distance, Non-Dominated Difference, Case Objectives.]

Fig. 1.3 Mean error over all runs of the minimum error solution at each generation across all problem tasks and diversity objectives.



[Figure 1.4 image: three panels titled 6th-order Polynomial, 6-Multiplexer, and Pole Balancing Neural Network.]

Fig. 1.4 Example graphs obtained through graph grammar evolution on the three problem tasks. (Bracketed values are input weights.)

1.4 Conclusions

This paper presents several phenotype diversity objectives and applies these to the evolution of graph grammars. The starting population in this model consists solely of an empty grammar; since there are no initial building blocks, selecting for diversity helps establish the productions from which good graphs can be derived. Applying a diversity objective produces predominantly higher success rates on the evaluated problems. The sixth-order polynomial in particular cannot be regressed unless a diversity objective is provided, a phenomenon that is likely to be exhibited by any problem where the empty graph is a better solution than many non-trivial graphs. The highest success rates are achieved by defining diversity as the number of solutions that differ in at least one fitness case and do not Pareto-dominate each other across fitness cases. Certain general trends are also observed: simply counting the number of different solutions is more effective than using a mean distance measure for diversity; estimating diversity based on individual fitness cases is more effective than estimating it from mean performance; and a single diversity objective is more effective than multiple performance objectives. The results show that merely adding a simple phenotype diversity objective to a multi-objective optimisation framework can notably improve the process of solution finding. This conclusion likely extends beyond graph grammars. For instance, the network required for double pole balancing has a trivial topology (Gruau et al. 1996); success is thus mostly dependent on the weight optimisation, and benefits from diversity are observed here as well.



Bibliography

Astor, J. C. & Adami, C. (2000), ‘A developmental model for the evolution of artificial neural networks’, Artificial Life 6, 189–218.
Bäck, T. (1996), Evolutionary algorithms in theory and practice, Oxford University Press, New York.
Bersano-Begey, T. (1997), Controlling exploration, diversity and escaping local optima in GP, in ‘Late Breaking Papers at the Genetic Programming Conference’, MIT Press, pp. 7–10.
Bleuer, S., Braek, M., Thiele, L. & Zitzler, E. (2001), Multiobjective genetic programming: reducing bloat using SPEA2, in ‘Proceedings of the IEEE Congress on Evolutionary Computation’, Vol. 1, IEEE Press, pp. 536–543.
Boers, E. J. W. & Kuiper, H. (1992), Biological metaphors and the design of modular artificial neural networks, Master’s thesis, Leiden University, The Netherlands.
Bui, L. T., Branke, J. & Abbass, H. A. (2005), Multiobjective optimization for dynamic environments, in ‘Proceedings of the IEEE Congress on Evolutionary Computation’, in press.
Cangelosi, A., Parisi, D. & Nolfi, S. (1994), ‘Cell division and migration in a ‘genotype’ for neural networks (cell division and migration in neural networks)’, Network: Computation in Neural Systems 5, 497–515.
De Jong, E. D. & Pollack, J. B. (2003), ‘Multi-objective methods for tree size control’, Genetic Programming and Evolvable Machines 4(3), 211–233.
De Jong, E. D., Watson, R. A. & Pollack, J. B. (2001), Reducing bloat and promoting diversity using multiobjective methods, in ‘Proceedings of the Genetic and Evolutionary Computation Conference’, Morgan Kaufmann, pp. 11–18.
Deb, K. (2001), Multi-objective optimization using evolutionary algorithms, Wiley, Chichester.
Deb, K., Agrawal, S., Pratab, A. & Meyarivan, T. (2000), A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II, in ‘Proceedings of the Parallel Problem Solving from Nature VI Conference’, Springer-Verlag, pp. 849–858.
Deb, K., Mohan, M. & Mishra, S. (2003), Towards a quick computation of well-spread Pareto-optimal solutions, in ‘Evolutionary Multi-Criterion Optimization: Second International Conference’, Springer-Verlag, pp. 222–236.
Futuyma, D. J. (1998), Evolutionary biology, Sinauer Associates, Inc., Sunderland.
Goldberg, D. E. & Richardson, J. (1987), Genetic algorithms with sharing for multimodal function optimization, in ‘Proceedings of the 2nd International Conference on Genetic Algorithms and their Applications’, Lawrence Erlbaum Associates, pp. 41–49.
Gruau, F. (1994), Neural network synthesis using cellular encoding and the genetic algorithm, Ph.D. thesis, l’Ecole Normale Supérieure de Lyon.
Gruau, F., Whitley, D. & Pyeatt, L. (1996), A comparison between cellular encoding and direct encoding for genetic neural networks, in ‘Genetic Programming 1996: Proceedings of the First Annual Conference’, MIT Press, pp. 81–89.
Habel, A. (1992), Hyperedge replacement: grammars and languages, Lecture Notes in Computer Science 643, Springer-Verlag, Berlin; New York.
Hornby, G. S. (2003), Generative representations for evolutionary design automation, Ph.D. thesis, Brandeis University Dept. of Computer Science.
Kitano, H. (1990), ‘Designing neural networks using genetic algorithms with graph generation systems’, Complex Systems 4(4), 461–476.
Koza, J. R. (1992), Genetic programming: on the programming of computers by means of natural selection, MIT Press, Cambridge.
Langdon, W. B. (2000), ‘Size fair and homologous tree crossovers for tree genetic programming’, Genetic Programming and Evolvable Machines 1(1/2), 95–119.
Lindenmayer, A. (1968), ‘Mathematical models for cellular interaction in development, parts I and II’, Journal of Theoretical Biology 18, 280–315.
Luerssen, M. H. (2005), Graph grammar encoding and evolution of automata networks, in ‘Proceedings of the 28th Australasian Computer Science Conference, Newcastle, Australia’, Australian Computer Society, Inc., pp. 229–238.
Luerssen, M. H. & Powers, D. M. W. (2003), On the artificial evolution of neural graph grammars, in ‘Proceedings of the 4th International Conference on Cognitive Science’, University of New South Wales, pp. 369–377.
Luerssen, M. H. & Powers, D. M. W. (2005), Graph composition in a graph grammar-based method for automata network evolution, in ‘Proceedings of the IEEE Congress on Evolutionary Computation’, in press.
Luke, S. & Panait, L. (2001), A survey and comparison of tree generation algorithms, in ‘Proceedings of the Genetic and Evolutionary Computation Conference’, Morgan Kaufmann, pp. 81–88.
McKay, R. I. & Abbass, H. A. (2001), Anticorrelation measures in genetic programming, in ‘Australasia-Japan Workshop on Intelligent and Evolutionary Systems’, pp. 45–51.
McPhee, N. & Hopper, N. (1999), Analysis of genetic diversity through population history, in ‘Proceedings of the Genetic and Evolutionary Computation Conference’, Morgan Kaufmann, pp. 1112–1120.
Miller, G. F., Todd, P. M. & Hegde, S. U. (1989), Designing neural networks using genetic algorithms, in ‘Proceedings of the Third International Conference on Genetic Algorithms and Their Applications’, Morgan Kaufmann, pp. 379–384.
Miller, J. F. & Thomson, P. (2000), Cartesian genetic programming, in ‘Proceedings of the Third European Conference on Genetic Programming (EuroGP2000)’, Springer-Verlag, pp. 121–132.
Moriarty, D. E. & Miikkulainen, R. (1996), ‘Efficient reinforcement learning through symbiotic evolution’, Machine Learning 22(1-3), 11–32.
Price, K. (1999), An introduction to differential evolution, in D. Corne, M. Dorigo & F. Glover, eds, ‘New ideas in optimization’, McGraw-Hill, London, pp. 79–108.
Rosca, J. P. (1995), Entropy-driven adaptive representation, in ‘Proceedings of the Workshop on Genetic Programming: From Theory to Real-World Applications’, Tahoe City, pp. 23–32.
Sastry, K., Pelikan, M. & Goldberg, D. E. (2005), ‘Decomposable problems, niching, and scalability of multiobjective estimation of distribution algorithms’, IlliGAL Report No. 2005004, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, IL.
Slack, J. M. W. (1991), From egg to embryo, 2nd edn, Cambridge University Press, Cambridge.
Stanley, K. O. & Miikkulainen, R. (2002), ‘Evolving neural networks through augmenting topologies’, Evolutionary Computation 10(2), 99–127.
Toffolo, A. & Benini, E. (2003), ‘Genetic diversity as an objective in multiobjective evolutionary algorithms’, Evolutionary Computation 11(2), 151–168.
Toussaint, M. (2003), The evolution of genetic representations and modular neural adaptation, Ph.D. thesis, Institut für Neuroinformatik, Ruhr-Universität Bochum.
Walker, J. A. & Miller, J. F. (2004), Evolution and acquisition of modules in cartesian genetic programming, in ‘Proceedings of the 7th European Conference on Genetic Programming’, Springer-Verlag, pp. 187–197.
Wieland, A. (1991), Evolving neural network controllers for unstable systems, in ‘Proceedings of the International Joint Conference on Neural Networks’, IEEE Press, pp. 667–673.
Woodward, J. R. (2003), Modularity in genetic programming, in ‘Genetic Programming, Proceedings of EuroGP’2003’, Springer-Verlag, pp. 254–263.