A General Framework for Cooperative Co-evolutionary Algorithms: A Society Model

Qiangfu Zhao
The University of Aizu, Japan 965-80
E-mail: [email protected]
Abstract: Compared with conventional algorithms, evolutionary algorithms (EAs) are usually more efficient for system design because they provide a higher chance of obtaining the globally optimal solution. However, EAs cannot be used directly to design large-scale systems because a large amount of computation is required. To solve this problem, many approaches have been proposed in the literature. The cooperative co-evolutionary algorithm (CCEA) is possibly one of the most efficient. The basic idea of most CCEAs is divide-and-conquer: divide the system into many modules, define an individual as a candidate of a module, assign a population to each module, find good individuals within each population, and put them together again to form the whole system. In this paper, we generalize our earlier studies and introduce a society model for the study of CCEAs. Based on the society model, we formulate existing CCEAs in a general framework. We also provide several case studies, all of which are interesting topics for future research.

I. Introduction

Compared with conventional algorithms, evolutionary algorithms (EAs) are usually considered more efficient for system design because they provide a higher chance of obtaining the globally optimal solution (see [1]). However, in most existing EAs, an individual corresponds directly to a candidate of the desired system. This results in the following problems for large-scale systems: (1) a large amount of computation is required to evaluate the fitness of the individuals and to reconstruct the individuals from their genotypes; (2) a large population is necessary to increase the probability of obtaining good solutions; and (3) it is very time-consuming to determine good evolution parameters. There are solutions for each problem. For example, parallelized evolution may solve the first problem ([2], Chapter 22), the distributed genetic algorithm may solve the second problem [3], and EAs with self-adaptable parameters may solve the third problem [4]. However, there is still no general method for solving all the problems together.

Fig. 1. A simple illustration of CCEA (figure: populations P1, P2, ..., Pn of small modules; the used (best) module of each population is combined, according to fitness, into the large whole system; unused modules remain in the populations)
To design a large-scale system, one principle is divide and conquer. This is a result of both natural selection and theoretical analysis. Specifically, we can decompose the system into many modules, define an individual as a candidate of a module, assign a population to each type of module, find the best individual(s) in each population, and put them together again to form the whole system. This is the basic idea of most cooperative co-evolutionary algorithms (CCEAs) (see Fig. 1). Using CCEAs, it is possible to solve all the above problems together. There are three steps in a CCEA. The first step is to decompose a large system into many sub-systems or modules, which are smaller and easier to design. The second step is to evolve the modules in some populations. The third step is to reconstruct the whole system from the modules. The second and third steps can be repeated for many generations. In most existing CCEAs, it is supposed that the system structure is known, and the system can be decomposed and
reconstructed using the given information. In this case, the only problem is how to evolve the modules. For this purpose, we should define the fitness of the individuals (modules) properly. A direct way is to define the fitness of an individual according to the performance of the system consisting of this individual and each individual from the other populations. If the number of modules for constructing the system is large, we can estimate the fitness of an individual by testing the performance of the system consisting of this individual and the best individuals from the other populations [5]. To get a better estimate, we can construct many systems, and determine the fitness of an individual by evaluating the performance of all systems in which it participates [6]. In our study, we have also proposed two CCEAs for learning of nearest neighbor based multilayer perceptrons (NN-MLPs). The first one is the individual evolutionary algorithm (IEA). The IEA can construct an NN-MLP efficiently if good candidates of the modules are given [8]. Compared with other CCEAs, the IEA can determine the number of modules automatically through evolution. Further, since the fitness of an individual is determined only by its performance in the current network, the individuals can be evaluated in a more stable way. To produce good candidates of the modules, we proposed the co-evolutionary algorithm (CEA) [9]. The CEA can be used to obtain good candidates of the modules for the IEA, and the IEA can be used to construct the whole system [10]. In this paper, we generalize our study by introducing a society model for the study of CCEAs [12]. Based on the society model, we formulate the existing CCEAs in a common framework. We also provide several case studies, all of which are interesting topics for future research.
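To make these steps concrete, the following Python sketch shows a CCEA of the kind described above, under the assumption that the decomposition into n modules is given and that a system-level evaluation function is available. The fitness of an individual is estimated, as in [5], by combining it with the current best individuals of the other populations; all function names and the simple truncation-plus-mutation scheme are only illustrative, not the operators used in the cited papers.

import random

def evolve_ccea(n_modules, init_module, mutate, evaluate_system,
                pop_size=20, generations=50):
    """Minimal cooperative co-evolution loop (illustrative sketch only).

    evaluate_system(modules) -> float scores a complete system built from
    one module per population (larger is better)."""
    # Step 1: one population per module type (the decomposition is given).
    pops = [[init_module(k) for _ in range(pop_size)] for k in range(n_modules)]
    best = [pop[0] for pop in pops]        # current representative of each population

    for _ in range(generations):
        # Step 2: evolve each population in turn.
        for k in range(n_modules):
            scored = []
            for ind in pops[k]:
                # Estimate fitness by inserting the individual into a system
                # completed with the best members of the other populations [5].
                trial = best[:k] + [ind] + best[k + 1:]
                scored.append((evaluate_system(trial), ind))
            scored.sort(key=lambda pair: pair[0], reverse=True)
            best[k] = scored[0][1]
            # Keep the better half and refill by mutating the survivors.
            survivors = [ind for _, ind in scored[:pop_size // 2]]
            pops[k] = survivors + [mutate(random.choice(survivors))
                                   for _ in range(pop_size - len(survivors))]
    # Step 3: reconstruct the whole system from the best modules.
    return best, evaluate_system(best)

With modules encoded as real vectors, for instance, init_module can return a random vector and mutate can add a small Gaussian perturbation.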
Fig. 2. A society model for CCEA (figure: a society S containing organizations O1, ..., ON, each with a task set Ti, a representative set Ri, and an organizer φi, all evolved by Φ; and populations P1, ..., Pn, each with an individual set Ii evolved by ψi, the population set P itself being evolved by Ψ; information and materials flow between the organizations, the populations, and the environment)
II. A Society Model for CCEA

The society model is shown in Fig. 2. A society S is defined as an object consisting of two members (O and P) and two member functions (Φ and Ψ), which are described as follows.

A. O is a set of organizations

Each organization Oi (i = 1, 2, ..., N) is again an object containing a set of tasks Ti to be fulfilled, a set of representatives Ri to fulfill the tasks, and a function φi for construction of Oi (actually, φi may not be a function, see below). For function approximation problems, Ti is the training set (containing the training samples and their correct answers). For system design, Ti is the set of design specifications. The representative set Ri is a set of subsystems or modules. The function φi is the organizer of Oi. In general, φi is also an object containing (1) the over-all structure (skeleton) of Oi, (2) the pointers to the individuals in P, and (3) a function λi for constructing Oi. If the over-all structure of Oi is given, and the individuals are selected according to their fitness (see [10]), φi contains only λi. This is the reason we call φi a function in our study. Usually, λi is a learning algorithm which can (hopefully) accelerate the evolution. In general, the system structure and the pointers are unknown. They are created first at random for each organization, and then evolved by the member function Φ. Compared with existing CCEAs, automatic determination of the system structure is one of the important features of the society based CCEA. Note that decomposition is not necessary here because the society based CCEA is a bottom-up approach: yet-to-be-evolved systems are built up using yet-to-be-evolved modules.

B. P is a set of populations

Each population Pk (k = 1, 2, ..., n) is an object containing a set of individuals Ik and a function ψk for evolution of the individuals in Ik. An individual is a candidate of a representative. One population is assigned to each type of individuals. We say two individuals are the same type if they share some common features. If there are n types of representatives for constructing all organizations, there will be n populations.

C. Φ is an algorithm for evolution of O

Here, an individual corresponds to an organization. If the task sets of all organizations are the same, we can use Φ to obtain better organizations in the global sense.
In this case, any existing EA can be used for this purpose. If the task sets are different, Φ should have some speciation ability (say, fitness sharing), so that many specialized systems can be created for fulfilling different tasks. The genotype of Oi contains (only) the information for building up the organizer φi. Once φi is built up, the body (the representative set) of the organization can be initialized directly using the pointers, and updated using the learning algorithm λi. It is not necessary to contain all the information of an organization in its genotype.

D. Ψ is an algorithm for evolution of P

Ψ is used for creating new populations with new types of individuals. Usually, the current individuals may not be sufficient to construct the desired organizations. Therefore, removing some useless populations and creating new and possibly better populations are very important. The genotype of a population can be expressed by the common features of its individuals. The fitness of a population can be defined by the number of useful individuals in it. Note that we may introduce some populations which are invariant to Ψ. These populations may contain the re-usable modules that have been obtained earlier by some other methods.
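The member structure defined above can be summarized by the following Python sketch. It only fixes the containers and the slots where Φ, Ψ, φi (with its internal learning function written as lam for λi) and ψk act; the concrete EAs to be plugged into these slots are left open, and all class and field names are our own illustration, not part of the model's definition.

from dataclasses import dataclass
from typing import Any, Callable, List

@dataclass
class Organizer:                  # phi_i: the organizer of O_i
    structure: Any                # over-all structure (skeleton) of O_i
    pointers: List[int]           # pointers to individuals in P
    lam: Callable                 # lambda_i: learning algorithm that builds/refines O_i

@dataclass
class Organization:               # O_i
    tasks: List[Any]              # T_i: training samples or design specifications
    representatives: List[Any]    # R_i: the modules actually used
    organizer: Organizer          # phi_i

@dataclass
class Population:                 # P_k
    individuals: List[Any]        # I_k: candidates of one type of representative
    psi: Callable                 # psi_k: EA evolving the individuals in I_k

@dataclass
class Society:                    # S
    O: List[Organization]
    P: List[Population]
    Phi: Callable                 # evolves the organizations (their organizers)
    Psi: Callable                 # removes useless populations, creates new ones

    def step(self):
        # One generation: evolve every population, then the organizations,
        # and finally (possibly) the set of populations itself.
        for pop in self.P:
            pop.individuals = pop.psi(pop.individuals, self.O)
        self.O = self.Phi(self.O, self.P)
        self.P = self.Psi(self.P, self.O)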
III. Some Special Cases

To make the discussion more concrete, let us consider some case studies. In this section, we will see that existing CCEAs can be formulated in a common framework by using the society model. In the following sections, we will propose some topics for future research.

A. The Cooperative Co-evolutionary Genetic Algorithm

In the cooperative co-evolutionary genetic algorithm (CCGA) [5], there is only one organization O1 in O. The task is to find the minimum (maximum) value of a given function. Therefore, the task set T1 may contain nothing for unconstrained optimization problems, the representative set R1 consists of the variables to be found, and the function φ1 contains all the information for constructing the function (say, the number of variables and the relation between the variables). In the population set P, one population is assigned to each variable. The member functions Φ and Ψ are not used. In the CCGA, the organization is constructed in every learning cycle by using the best individuals of all populations (based on the information in φ1). There are three problems in the CCGA. First, the over-all structure of the function to be optimized must be pre-specified. Second, evaluation of the individuals is not stable because the organization (the basis for evaluation) changes in every learning cycle. Third, all populations cannot evolve
in parallel because evaluation of the individuals in one population requires all other populations to be frozen.

B. The Symbiotic, Adaptive Neuro-Evolution

In the symbiotic, adaptive neuro-evolution (SANE) algorithm [6], many organizations are constructed in each learning cycle. Therefore, if we consider only one learning cycle, SANE is similar to the society model. The task of SANE is to find neural networks suitable for function approximation. In each organization Oi, the task set Ti is nothing but the training set, the representative set Ri consists of all the neurons, and the function φi contains the information for constructing the desired neural network. There is only one population in the population set P. The member function Ψ is not used, and Φ is not necessary. Although there are many organizations in SANE, the organizations are generated randomly in each learning cycle. Therefore, the first two problems of the CCGA can also be found in SANE. Of course, SANE could estimate the fitness of the modules more precisely than the CCGA.

C. The Hierarchical SANE

In [7], the authors of SANE proposed the hierarchical SANE. In the hierarchical SANE, the organizations are created not randomly, but according to some blueprints. Clearly, the blueprint of Oi is nothing but φi. These blueprints can be evolved by another EA, which corresponds to Φ. Therefore, the basic idea of the hierarchical SANE is very close to the society model. Of course, if λi (some learning algorithm) is introduced in each φi, the hierarchical SANE could be more efficient. In addition, since there is only one population for creating all the individuals in the hierarchical SANE, it may not be efficient for designing systems with different types of modules. Therefore, using the society model, we can improve the hierarchical SANE in many ways.

D. The Individual Evolutionary Algorithm

The purpose of the individual evolutionary algorithm (IEA: [8]-[10]) is to design a nearest neighbor based multilayer perceptron (NN-MLP). The NN-MLP is a special neural network for pattern recognition [8]. The structure of an NN-MLP is shown in Fig. 3.

Fig. 3. A nearest neighbor based multilayer perceptron (figure: input layer, hidden layer of modules grouped into sub-systems, output layer)

In an NN-MLP, each hidden node corresponds to a module (sub-subsystem), and each output corresponds to a class (subsystem). For a given input x, if it belongs to the i-th class, only the i-th output is one, and all others are zero. For any x, it belongs to the i-th class if
\exists j: \quad f_{ij}(x) = \max_{k,l} f_{kl}(x), \qquad k = 1, 2, \ldots, n; \; l = 1, 2, \ldots, n_k \qquad (1)
where n is the number of classes, ni is the number of modules for the i-th class, and fij(x) is the decision function corresponding to the j-th module of the i-th (i = 1, 2, ..., n) class.

In designing the NN-MLP using the IEA, there is only one organization O1 in O. The task set T1 is the training set, and the representative set R1 consists of all the modules for constructing the current network. The function λ1 is an editor consisting of four basic operations: (1) evaluate the current organization, (2) delete some representatives with low fitness, (3) insert some individuals with high fitness, and (4) train the organization to make it better. The parameters used by λ1 are contained in φ1. In the population set P, a population is assigned to each subsystem, and an individual in Pk is a candidate of a representative of the k-th type (class). To produce good individuals, any existing EA can be used within each population, provided that the fitness of the individuals is defined properly [10]. The functions Φ and Ψ are not used in the original version of the IEA.

E. Generalization of the IEA

Similar to the SANE algorithm, the IEA can also be improved based on the society model. First of all, we can introduce multiple organizations in O. Each organization can be defined in the same way as before. Then, we can use an evolutionary algorithm to evolve the organizations in O. For off-line learning, the fitness of each organization can be defined simply as the recognition rate of the corresponding NN-MLP. Some criteria related to network size, convergence speed and so on might be useful, too. For on-line learning, the fitness should be related to the ages of the organizations (see [11]). The genotype of an organization consists of all the parameters used by λi, for example, the convergence ratio for training, the threshold for deletion, and the threshold for insertion. In addition, for off-line learning, since the number of classes is given, there is no need to use the function Ψ. We can simply assign a population to each class (as before). For on-line learning, however, the number of classes should be determined during evolution. In this case, Ψ could be used to produce modules for newly observed classes.
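As a concrete illustration of the decision rule (1), the sketch below assumes, purely for the example, that each module j of class i is a prototype vector wij and that its decision function is fij(x) = -||x - wij||^2, so that the rule reduces to nearest-prototype classification; the actual form of fij used in the NN-MLP is the one given in [8].

def nn_mlp_classify(x, prototypes):
    """Decision rule (1): x is assigned to the class that owns the module
    with the largest decision function value.

    prototypes[i] is the list of prototype vectors (modules) of class i;
    f_ij(x) is taken here to be the negative squared distance (assumed form)."""
    def f(x, w):
        return -sum((xv - wv) ** 2 for xv, wv in zip(x, w))

    best_class, best_value = None, float("-inf")
    for i, modules in enumerate(prototypes):
        for w in modules:
            value = f(x, w)
            if value > best_value:
                best_class, best_value = i, value
    return best_class

# Example: two classes, each represented by two modules (prototypes).
prototypes = [[(0.0, 0.0), (1.0, 0.0)],     # class 0
              [(5.0, 5.0), (6.0, 5.0)]]     # class 1
print(nn_mlp_classify((0.8, 0.2), prototypes))    # -> 0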
IV. Evolutionary Learning of RBF-MLP
From the above discussions, we can see that the existing CCEAs can be formulated in a common framework based on the society model. From this section on, we will provide some examples of applying the society model to solve different problems. Each example poses an interesting topic for future research.

A radial-basis function multilayer perceptron (RBF-MLP) is a single-hidden-layer neural network suitable for function approximation [13]. The RBF-MLP is very similar to the NN-MLP in the sense that both of them are neural networks based on local representation, and therefore both of them are easily decomposable. To design an RBF-MLP, we have the following definitions.

A. The organization set O

For each organization Oi, the task set Ti is the training set, the representative set Ri consists of the basis functions for constructing this organization, and the organizer φi is an object containing the number of modules for constructing the network, the pointers to the individuals in P, and a member function λi for constructing the network from the specified individuals. The member function λi can be some kind of gradient descent algorithm. The linear weights (the relation between the modules) for constructing the network can be found by λi.

B. The population set P

Each population Pk consists of the candidates of the basis functions of the k-th type. We say two basis functions are the same type if they have some common features. For example, if we use Gaussian functions as the basis functions, all basis functions with the same center (or variance) can be considered to be the same type. To evolve the individuals in a population Pk using ψk, we should define the genotype and the fitness of each individual. The genotype can be the variance (or the center, if we use the variance to define the individual type) of each individual. The genotypes can be coded into binary strings, and the well known genetic algorithms (GAs) can be used for ψk.
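To illustrate the role of λi here, the sketch below builds an RBF network from a set of selected Gaussian basis functions and fits the linear output weights on the training set. Ordinary least squares is used instead of an explicit gradient descent, which is only one possible choice for λi and is not prescribed by the model; the function names and the NumPy-based implementation are ours.

import numpy as np

def gaussian_basis(centers, widths):
    """Return the list of Gaussian basis functions selected from the populations."""
    return [lambda x, c=c, s=s: np.exp(-np.sum((x - c) ** 2) / (2.0 * s ** 2))
            for c, s in zip(centers, widths)]

def build_rbf_mlp(basis, X, y):
    """lambda_i: given the chosen basis functions (the representatives),
    fit the linear output weights on the training set T_i = (X, y).
    Least squares is used here as a stand-in for gradient descent."""
    H = np.array([[phi(x) for phi in basis] for x in X])   # design matrix
    w, *_ = np.linalg.lstsq(H, y, rcond=None)
    return lambda x: float(np.array([phi(x) for phi in basis]) @ w)

# Toy usage: approximate y = sin(x) with three Gaussian modules.
X = np.linspace(0.0, np.pi, 20).reshape(-1, 1)
y = np.sin(X).ravel()
basis = gaussian_basis(centers=[np.array([0.5]), np.array([1.5]), np.array([2.5])],
                       widths=[0.6, 0.6, 0.6])
net = build_rbf_mlp(basis, X, y)
print(round(net(np.array([1.0])), 2))   # approximately sin(1.0)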
The fitness of an individual p can be given as follows:

\mathrm{fitness}(p) = \sum_{i=1}^{N} \mathrm{fitness}(O_i) \, S(p, r_i) \qquad (2)

where fitness(Oi) is the fitness of the i-th organization, ri is the representative in Ri which is most similar to p, and S(p, ri) is the similarity between p and ri. The range of the similarity function S(·) is [0, 1].

C. The member function Φ

The function Φ can be used to find a good organization. An organization Oi is good if φi is able to construct a good network based on the individuals provided by the current P. That is, only the organizers are evolved by Φ. The genotype of φi may contain Ni + 2 segments. The first segment is the number of modules for constructing this organization, the second one is the convergence ratio of λi, and the other segments are the pointers. The genotype can be coded into a binary string with variable length, and GAs can also be used for Φ.

D. The member function Ψ

To produce new and possibly better types of basis functions, we can use the function Ψ. Again, if we use the Gaussian function, and the center is used to define the type of the basis functions, the genotype of each population can be very simple: just the common center of its individuals. Therefore, the genotype of a population can also be coded into a binary string, and again, GAs can be used for Ψ.

V. Cooperative Co-evolutionary Genetic Programming

When the system to be designed is a computer program, we can define the members and the member functions of a society as follows:
A. The organization set O

Each organization Oi is a program. The main program is φi, and each representative in Ri is a subroutine. The task set Ti contains the input data and the desired output (for identification of the desired program). The over-all structure of the organization is determined by the main program φi. For example, φi may contain the following information: the number of subroutines to be used, the pointers to the individuals in P, and the relation between the subroutines.

B. The population set P

One population is assigned to each type of subroutines. We say two subroutines are the same type if they share some common features. For example, we can define the type of a subroutine according to its arguments. If two subroutines have exactly the same arguments, they will be the same type. Based on this definition, functions like f1(x) = sin(x), f2(x) = cos(x) and f3(x) = tan(x) are the same type, and functions like f4(x, y, z) = x/y/z, f5(x, y, z) = x + y - z and f6(x, y, z) = xy/z should be in the same population (all these functions are to be generated by evolution). Since the individuals are computer programs, genetic programming (GP) can be used for ψk. The fitness of each individual can be defined in a similar way as discussed in the previous section.

C. The member function Φ

The aim of Φ is to find a good organization. Φ is a GP in nature, because the genotype of an organization Oi is the main program φi. The fitness of the organization can be defined as the performance of the corresponding program on the given data.

D. The member function Ψ

Ψ is used to create new and possibly better types of subroutines. The genotype of a population Pk can be defined as the common features of its individuals. The common features, say, the arguments, can be easily coded into a binary string. Therefore, GAs can be used for Ψ. Thus, in the society model, different kinds of EAs can be adopted in different parts.

The above algorithm can be named the cooperative co-evolutionary genetic programming (CCGP). There is an important difference between the CCGP and GP with ADF (automatically defined functions) [14]. In the CCGP, the evolution of the main program and that of the subroutines are separated. This separation may allow us to obtain a large-scale program in a hierarchical way. More study is necessary to verify this point.
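A minimal sketch of the CCGP representation could look as follows: subroutines are grouped into populations by their argument list (their type, as defined above), and a main program φi holds pointers into these populations together with a rule for combining the returned values. The combination rule and all names are invented for this example; a full implementation would evolve the subroutines by GP (ψk) and the main programs by Φ.

import math, operator

# Populations of subroutines, grouped by argument list (the "type").
populations = {
    ("x",):          [math.sin, math.cos, math.tan],            # same type as f1, f2, f3
    ("x", "y", "z"): [lambda x, y, z: x / y / z,                 # f4
                      lambda x, y, z: x + y - z,                 # f5
                      lambda x, y, z: x * y / z],                # f6
}

class MainProgram:
    """phi_i: pointers to subroutine individuals plus a combination rule."""
    def __init__(self, pointers, combine=operator.add):
        self.pointers = pointers       # list of (type key, index into population)
        self.combine = combine

    def run(self, **inputs):
        result = None
        for type_key, idx in self.pointers:
            sub = populations[type_key][idx]
            value = sub(*(inputs[name] for name in type_key))
            result = value if result is None else self.combine(result, value)
        return result

# A main program calling sin(x) and (x + y - z), combined by addition.
prog = MainProgram([(("x",), 0), (("x", "y", "z"), 1)])
print(prog.run(x=1.0, y=2.0, z=0.5))    # sin(1.0) + (1.0 + 2.0 - 0.5)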
VI. The Hierarchical Society
Up to now, we have studied the society model as if it were a two-layered system: the organizations are in the top layer, and the individuals are in the bottom layer. It is worthwhile to notice that the society model is inherently hierarchical. Actually, each population in P can again be a society. The over-all structure of a hierarchical society is shown in Fig. 4, where P^i is the population set of the i-th layer, and Ψ^i is an EA for evolution of the populations in this layer. Each population is a generator of a certain type of individuals.

Fig. 4. A hierarchical society model (figure: three layers i-1, i, i+1; each layer has populations P1, P2, ..., Pn with individual sets I1, I2, ..., In evolved by ψ1, ψ2, ..., ψn, and a layer-level member function Ψ)

Any individual in any population of any layer is an object with a task set T, a representative set R, and an organizer φ. Some individuals (say, the top layer individuals) have direct interaction with the environment, and their fitness can be evaluated directly based on the task sets. Only individuals of this kind have learning ability. For individuals with empty task sets, the fitness can be evaluated from the fitness of the higher layer individuals according to Eq. (2) (a kind of back-propagation). Generally, an individual can be constructed from individuals of any lower layers. A hidden individual may also have a non-empty task set, if the higher layer individuals have task-decomposition ability. This is true when the higher layer individuals are intelligent agents. The representative set R is the real body of an individual. In a multilayer society, it is necessary for each individual to have a real body, so that higher layer individuals can be constructed easily. Further, since the representatives are initially taken from lower layer populations and updated later by λ (the learning algorithm in φ), R is useful for memorizing the evolution or learning history of an individual. The organizer φ of any individual is the same as that defined for an organization. For bottom layer individuals, φ is not necessary. For any non-bottom layer individual, φ should provide the over-all structure and the mechanism for constructing (statically or dynamically) the individual. We believe that the hierarchical society model provides a possible way for evolutionary construction of large-scale and complex systems. Of course, this should be verified in our future study.
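The recursion described above can be expressed directly: a population slot of a layer-i society may hold either a plain population or a whole lower-layer society. The sketch below (with invented names) shows only this containment; the layer-wise Ψ^i and the fitness back-propagation of Eq. (2) would operate on these containers.

from typing import List, Union

class Individual:
    """An individual anywhere in the hierarchy: task set T, real body R,
    and organizer phi (None for bottom-layer individuals)."""
    def __init__(self, tasks=None, representatives=None, organizer=None):
        self.T = tasks or []
        self.R = representatives or []
        self.phi = organizer

class HPopulation:
    """A plain population: a generator of one type of individuals."""
    def __init__(self, individuals: List[Individual]):
        self.individuals = individuals

class HSociety:
    """A society whose population slots may hold either plain populations
    or whole lower-layer societies (this is the hierarchy)."""
    def __init__(self, slots: List[Union[HPopulation, "HSociety"]], layer: int):
        self.P = slots
        self.layer = layer            # P^i and Psi^i act per layer

    def depth(self) -> int:
        nested = [s.depth() for s in self.P if isinstance(s, HSociety)]
        return 1 + (max(nested) if nested else 0)

# Two layers: one of the top society's population slots is itself a society.
bottom = HSociety([HPopulation([Individual() for _ in range(3)])], layer=1)
top = HSociety([HPopulation([Individual()]), bottom], layer=2)
print(top.depth())   # -> 2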
VII. Summary
In this paper, we have introduced the society model for cooperative co-evolutionary algorithms (CCEAs), and provided some case studies. As can be seen, the society model seems to be very efficient for formulating the CCEAs. Note that the society model itself is the result of several years of evolution (see [8]-[12]), and it is still evolving. Therefore, we encourage the readers to verify the results and algorithms given here, and, if possible, to propose their own, possibly better models.

Acknowledgment

The author thanks Prof. Tatsuo Higuchi of Tohoku University for his encouragement and support in this research.
References

[1] D. B. Fogel, Evolutionary Computation, IEEE Press, 1995.
[2] J. R. Koza, Genetic Programming, Fourth Printing, The MIT Press, 1994.
[3] R. Tanese, Distributed Genetic Algorithm for Function Optimization, Ph.D. dissertation, Department of Electrical Engineering and Computer Science, University of Michigan.
[4] N. Hansen and A. Ostermeier, "Adapting arbitrary normal mutation distributions in evolution strategies: the covariance matrix adaptation," Proc. ICEC'96, pp. 312-317, Nagoya, May 1996.
[5] M. A. Potter and K. A. De Jong, "A cooperative coevolutionary approach to function optimization," Proc. Third Conference on Parallel Problem Solving from Nature, pp. 249-257, Springer-Verlag, 1994.
[6] D. E. Moriarty and R. Miikkulainen, "Efficient reinforcement learning through symbiotic evolution," Machine Learning, Vol. 22, pp. 11-33, 1996.
[7] D. E. Moriarty and R. Miikkulainen, "Hierarchical evolution of neural networks," Technical Report AI96-242, The University of Texas at Austin, Jan. 1996.
[8] Q. F. Zhao and T. Higuchi, "Evolutionary learning of nearest neighbor MLP," IEEE Trans. on Neural Networks, Vol. 7, No. 3, pp. 762-767, 1996.
[9] Q. F. Zhao, "Co-evolutionary learning of neural networks," to be published in International Journal of Intelligent and Fuzzy Systems.
[10] Q. F. Zhao, "EditEr: a combination of IEA and CEA," Proc. ICEC'97, pp. 641-645, Indianapolis, April 1997.
[11] Q. F. Zhao, "On-line evolutionary learning of NN-MLP," to be published in IEEE Trans. on Neural Networks.
[12] Q. F. Zhao, "A society model for cooperative co-evolutionary algorithms," Proc. ICONIP'97, Dunedin, New Zealand, Nov. 1997.
[13] S. Haykin, Neural Networks: A Comprehensive Foundation, Chap. 7, Macmillan Publishing Company, 1994.
[14] J. R. Koza, Genetic Programming II, The MIT Press, 1994.