Proceedings of the 2001 IEEE Congress on Evolutionary Computation Seoul, Korea • May 27-30, 2001
Evolutionary Algorithms with Adaptive Lévy Mutations

Chang-Yong Lee
Department of Industrial Information, Kongju National University, Chungnam, 340-800, South Korea
e-mail: [email protected]

Xin Yao
School of Computer Science, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom
e-mail: [email protected]
Abstract - An evolutionary programming algorithm with adaptive mutation operators based on the Lévy probability distribution is studied. The Lévy stable distribution has an infinite second moment. Because of this, Lévy mutation is more likely than Gaussian mutation, which is often used in evolutionary algorithms, to generate an offspring farther away from its parent. This likelihood depends on a parameter α in the distribution. Based on this, we propose an adaptive Lévy mutation in which four different candidate offspring are generated by each parent, according to α = 1.0, 1.3, 1.7, and 2.0, and the best one is chosen as the offspring for the next generation. The proposed algorithm was applied to several multivariate function optimization problems. We show empirically that the performance of the proposed algorithm is better than that of classical evolutionary algorithms using Gaussian mutation.
1 Introduction

Evolutionary programming (EP) was initially studied as an evolutionary approach to artificial intelligence (Fogel 1966). It was later developed for numerical and combinatorial optimization (Fogel 1990, 1994, Sebald 1994). Mutation is the primary search operator in EP. In classical evolutionary programming (CEP), each parent generates an offspring via Gaussian mutation, and the better individuals among the current parents and their offspring are selected as the parents of the next generation (Fogel 1990). One of the drawbacks of EP in the optimization of multimodal functions is its slow convergence to optimal or nearly optimal solutions. Since new solutions are generated via the mutation operation, different mutation operators have been investigated. In particular, Yao and Liu (Yao 1996) proposed fast evolutionary programming (FEP) with mutation operators based on the Cauchy probability distribution. Unlike the Gaussian distribution, the Cauchy distribution has an infinite second moment and, as a result, a much longer tail than the Gaussian distribution. Due to the long tail of the distribution, one can obtain mutated offspring quite different from their parents. It was shown that FEP has some advantages over CEP.

Another proposal was made (Lee 1999) by means of Lévy mutation, which is based on the Lévy probability distribution. Using Lévy mutation, the performance of EP can be improved. Lévy mutation is, in a sense, a generalization of Cauchy mutation, since the Cauchy distribution is a special case of the Lévy distribution. In particular, the Lévy distribution exhibits a power law in the tail region, which is a characteristic of the fractal structure of nature (Mandelbrot 1982). More importantly, by adjusting the parameter α in the Lévy distribution, one can tune the shape of the probability density function, which in turn yields adjustable variation in mutation step sizes.

In this paper, we extend the idea of Lévy mutation to an adaptive scheme and propose an adaptive Lévy mutation operator. In the previous Lévy mutation, although various α were tested, the parameter α was fixed during each independent experiment. In the spirit of evolutionary computation, it is highly desirable to make the parameter adaptive so that α can also evolve during the procedure. This idea is closely related to the improved FEP (IFEP) proposed by X. Yao et al. (Yao 1999). They also analyzed the performances of FEP and CEP and found that Cauchy mutation performed better when the current search point was relatively far away from the global optimum, whereas Gaussian mutation was better at finding a solution in the vicinity of the global optimum (Yao 1997). IFEP is based on mixing different mutation operators. That is, IFEP generates two offspring from each parent, one by Cauchy mutation and the other by Gaussian mutation, and selects the better one as the offspring. However, IFEP provides no opportunity for mutating a parent using a distribution which is neither Cauchy nor Gaussian. The Lévy distribution provides an ideal probability distribution for designing adaptive mutations because of its adjustable parameter α. Different α values define probability distributions of different shapes. For example, one can implement both the Gaussian and the Cauchy distribution by simply changing the parameter. In fact, one can generate any distribution between the Gaussian and the Cauchy probability distribution. In this paper, a new EP algorithm is proposed, which uses adaptive Lévy mutations. Each Lévy mutation generates four candidate offspring and the best one is selected.

Many engineering design problems are formulated as parameter optimization problems (Gen and Cheng 1997). The techniques proposed in this paper can be applied directly in optimal engineering design. Even when the design space is discrete, the idea in this paper can still be used by discretizing the continuous Lévy distribution. Examples of such discretization can be found in (Yao 1991).

We organize the rest of the paper as follows. Section 2 describes the algorithm for multi-dimensional function optimization problems. In Section 3, we discuss some characteristics of the Lévy probability distribution and the adaptive Lévy mutation scheme. This is followed by results and discussion of the experiments on some benchmark functions. The last section is devoted to the conclusion and future studies.

0-7803-6657-3/01/$10.00 © 2001 IEEE
2 Function Optimization Using Evolutionary Programming

A multi-variable function optimization problem can be defined as follows. For a function f(x),

    f : S → R,  x ∈ S ⊆ R^m,    (1)

the algorithm aims to find

    x_min such that ∀ x ∈ S : f(x_min) ≤ f(x).    (2)
Since it is known that CEP with self-adaptive mutation usually performs better than without it (Back 1993), we adopt self-adaptive mutation in our work. We will describe CEP first, and then discuss the difference between CEP and our proposed algorithm in the next section. CEP is implemented as follows:

Step 1: Generate the initial population consisting of μ individuals, each of which is represented as a pair of real vectors (x_i, σ_i), i = 1, 2, ..., μ. Each x_i and σ_i has m independent components:

    x_i = {x_i(1), x_i(2), ..., x_i(m)},    (3)
    σ_i = {σ_i(1), σ_i(2), ..., σ_i(m)}.    (4)

Step 2: For each parent (x_i, σ_i), create an offspring (x_i′, σ_i′) in the following manner. For j = 1, 2, ..., m,

    σ_i′(j) = σ_i(j) exp{τ′ N(0,1) + τ N_j(0,1)},    (5)
    x_i′(j) = x_i(j) + σ_i′(j) N_j(0,1),    (6)

where N(0,1) stands for a standard Gaussian random variable fixed for each individual in the population, and N_j(0,1) for a newly generated Gaussian random variable for each component j. The parameters τ and τ′ are defined as (Back 1993)

    τ = 1/√(2√m)  and  τ′ = 1/√(2m).    (7)

Step 3: For the μ parents (x_i, σ_i) and their μ offspring (x_i′, σ_i′), calculate the fitness values f_1, f_2, ..., f_{2μ}.

Step 4: Use tournament selection of size q to select μ individuals to be the parents for the next generation.

Step 5: Repeat steps 2 through 4 until the required criteria are satisfied.
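The self-adaptive mutation of Step 2 is straightforward to sketch in code. The following is a minimal NumPy illustration of Eqs. (5)-(7), not the authors' implementation; the array shapes and the RNG interface are our own choices.

```python
import numpy as np

def cep_mutate(x, sigma, rng):
    """One CEP offspring (Eqs. 5-6): lognormal self-adaptation of the
    step sizes sigma, then a Gaussian perturbation of x. Here x and
    sigma are length-m arrays belonging to a single individual."""
    m = len(x)
    tau_prime = 1.0 / np.sqrt(2.0 * m)        # Eq. (7)
    tau = 1.0 / np.sqrt(2.0 * np.sqrt(m))     # Eq. (7)
    n_global = rng.standard_normal()          # N(0,1), one draw per individual
    n_j = rng.standard_normal(m)              # fresh N_j(0,1) per component
    new_sigma = sigma * np.exp(tau_prime * n_global + tau * n_j)   # Eq. (5)
    new_x = x + new_sigma * rng.standard_normal(m)                 # Eq. (6)
    return new_x, new_sigma
```

Note that the exponential keeps every new σ_i(j) strictly positive, which is why the multiplicative lognormal update is preferred over an additive one.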
3 Lévy Distribution and Adaptive Lévy Mutation

Since a finite second moment leads to a characteristic scale and to Gaussian behavior of the probability density through the central limit theorem, P. Lévy, in the 1930's, searched for an exception to it. He discovered a class of probability distributions having an infinite second moment and governing the sum of such random variables. An infinite second moment implies the absence of a definite scale in the probability distribution. Consider a set {Y_i} of identically distributed random variables, each of which has the same probability density. If the sum of these random variables also has the same probability distribution as each random variable, the process is called stable. A typical example of a stable process is the Gaussian process: the sum of Gaussian random variables also has a Gaussian distribution. While the Gaussian process requires a finite second moment, there is a class of probability distributions with an infinite second moment which also yields a stable process. This is called the Lévy probability distribution and has the following form (Levy 1937):

    L_{α,γ}(y) = (1/π) ∫₀^∞ e^{−γ q^α} cos(qy) dq.    (8)

As can easily be seen from Eq. (8), the distribution is symmetric with respect to y = 0 and has two parameters, α and γ. γ is the scaling factor satisfying γ > 0, and α satisfies 0 < α < 2. The analytic form of the integral is not known for general α except for a few cases. In particular, for α = 1 the integral can be carried out analytically and is known as the Cauchy probability distribution. In the limit α → 2, the distribution approaches the Gaussian distribution. The parameter α controls the shape of the probability distribution, especially in the tail region: the smaller the parameter α, the longer the tail. Figure 1 shows the Lévy distribution for α = 1.0 and α = 1.5 together with the standard Gaussian distribution.

One of the most important properties of the Lévy distribution is its scaling behavior. To see this, let us re-scale y by some constant b: y → by. Then from Eq. (8) we get the following relation:

    L_{α,γ}(by) = (1/b) L_{α,γ′}(y),    (9)

where γ′ = γ b^{−α}. In particular, choosing b = γ^{1/α} gives γ′ = 1, and the above equation becomes

    L_{α,γ}(y) = γ^{−1/α} L_{α,1}(γ^{−1/α} y).    (10)

Thus the scaling factor can be set to γ = 1 without loss of generality: from the distribution with γ = 1, we can get the distribution for any other γ. In this study, we fix γ = 1 and denote L_{α,1} = L_α.
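Since Eq. (8) has a closed form for α = 1 (the Cauchy density 1/[π(1 + y²)] when γ = 1), a quick numerical quadrature is a useful sanity check. The sketch below is our own; the quadrature grid parameters are arbitrary choices.

```python
import numpy as np

def levy_pdf(y, alpha, gamma=1.0, q_max=50.0, n=500_001):
    """Evaluate Eq. (8), L(y) = (1/pi) * int_0^inf exp(-gamma*q^alpha)*cos(q*y) dq,
    with a plain trapezoidal rule on [0, q_max] (the integrand decays fast)."""
    q = np.linspace(0.0, q_max, n)
    g = np.exp(-gamma * q**alpha) * np.cos(q * y)
    dq = q_max / (n - 1)
    return (dq * (g.sum() - 0.5 * (g[0] + g[-1]))) / np.pi

# alpha = 1, gamma = 1 should reproduce the Cauchy density.
print(levy_pdf(2.0, alpha=1.0), 1.0 / (np.pi * (1.0 + 2.0**2)))  # both ≈ 0.0637
```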
Figure 1: Comparison of the Lévy probability distribution for α = 1.0 and α = 1.5 with the standard Gaussian distribution. The ordinate is on a log scale.

With the Lévy distribution discussed above, we now describe the adaptive Lévy mutation. To see its effect, we keep the difference between the procedures of CEP and the adaptive Lévy mutation as small as possible. Thus the procedure of EP with the adaptive Lévy mutation adopted in this work is the same as that of CEP except for the following. Firstly, Eq. (6) of Step 2 in the CEP procedure is replaced by

    x_i′(j) = x_i(j) + σ_i′(j) L_j(α),    (11)

where L_j(α) is a random number, newly generated for each different j, drawn from the Lévy probability distribution with parameter α. To generate a Lévy random number, we used an effective algorithm proposed by R. Mantegna (Mantegna 1994). Secondly, we generate four offspring with α = 1.0, 1.3, 1.7, and 2.0 from each parent as the candidates for the "true" offspring. The best of the four is then chosen as the offspring. This scheme is adaptive in the sense that different offspring can be chosen depending on the environment. One should note here that α = 2.0 corresponds to the Gaussian mutation.
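Mantegna's method starts from a ratio of two Gaussian draws; the sketch below implements only that first stage (the published algorithm adds a nonlinear correction and averaging to sharpen convergence to the Lévy distribution, which we omit here). The function name and interface are our own.

```python
import math
import numpy as np

def levy_random(alpha, size, rng):
    """First stage of Mantegna's (1994) scheme: v = u / |w|^(1/alpha),
    with u ~ N(0, sigma_u^2) and w ~ N(0, 1), which approximates draws
    from the Levy stable distribution with index 0 < alpha < 2."""
    num = math.gamma(1.0 + alpha) * math.sin(math.pi * alpha / 2.0)
    den = math.gamma((1.0 + alpha) / 2.0) * alpha * 2.0 ** ((alpha - 1.0) / 2.0)
    sigma_u = (num / den) ** (1.0 / alpha)
    u = rng.normal(0.0, sigma_u, size)
    w = rng.normal(0.0, 1.0, size)
    return u / np.abs(w) ** (1.0 / alpha)
```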
It is known that Gaussian mutation (α = 2.0) works better for searching a small local neighborhood, whereas Cauchy mutation (α = 1.0) is good at searching a large area of the search space. By adding two additional candidate offspring (α = 1.3 and 1.7), one is not fixed to the two extremes. An advantage of choosing the best offspring out of four is that one can select an offspring more effectively depending on how wide the search area should be at a given generation. This way of selection mimics an adaptation of α in the course of the procedure. A similar idea has been used in the study of complex adaptive systems, such as the minority game (Challet 1997). In this sense, we call the proposed scheme an adaptive Lévy mutation.
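The generate-four-and-keep-the-best rule can be sketched as follows. Here `levy_draw` stands for any Lévy random number generator (e.g., Mantegna's), and the self-adaptive σ update of Eq. (5), which the paper keeps unchanged, is assumed to have been applied already. A minimal sketch, not the authors' code:

```python
def adaptive_levy_offspring(x, sigma, f, levy_draw, rng,
                            alphas=(1.0, 1.3, 1.7, 2.0)):
    """Generate one candidate per alpha via Eq. (11),
    x'(j) = x(j) + sigma(j) * L_j(alpha), and keep the fittest
    (minimization). alpha = 2.0 reduces to Gaussian mutation."""
    best, best_fit = None, None
    for alpha in alphas:
        cand = x + sigma * levy_draw(alpha, len(x), rng)   # Eq. (11)
        fit = f(cand)
        if best is None or fit < best_fit:
            best, best_fit = cand, fit
    return best
```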
4 Experimental Results and Discussion

In this section, we apply the algorithm discussed in Section 3 to several function optimization problems to see the effect of the adaptive Lévy mutation. Table 1 lists the test functions. We divide them into two classes: functions with many local minima and those with only a few local minima. The following parameters were used in the experiments:

- population size: μ = 100 for the adaptive Lévy mutation and μ = 400 for each fixed-α Lévy mutation
- tournament size: q = 10
- initial standard deviation: σ = 3.0 (which determines the variance of the Gaussian distribution)
- number of generations: 1500 for f1, ..., f6 and 100 for f7, f8, f9

It is worth noting that σ = 3.0 is only the initial value; it is changed (i.e., evolved) adaptively by the EP algorithm during a run. Since the adaptive Lévy mutation is a mixture of four different α's, for comparison we performed an independent set of experiments for each α, as well as for the adaptive Lévy mutation. Thus, for each function, five different sets of experiments were carried out. To make the comparison fair in terms of computing time, the population size for the adaptive Lévy mutation is reduced to one quarter of that for each fixed α, since each individual in the adaptive Lévy mutation generates four offspring¹. In doing so, we have exactly the same number of function evaluations for the different algorithms. However, the adaptive Lévy mutation takes less computing time, as the other operations, such as selection, use less time in a smaller population. Figure 2 shows the results of the experiments for the functions with many local minima listed in Table 1. As can

¹ When the population sizes are the same, we obtained much better results in favor of the adaptive Lévy mutation, as one can easily expect.
Table 1: Functions that have been tested in this paper. N stands for the dimension of the function and S for the variable ranges.

    f1(x) = −Σ_{i=1}^{N} x_i sin(√|x_i|)                                  N = 30,  S = (−500, 500)^N
    f2(x) = Σ_{i=1}^{N} [x_i² − 10 cos(2πx_i) + 10]                       N = 30,  S = (−5.12, 5.12)^N
    f3(x) = −20 exp(−0.2 √((1/N) Σ_{i=1}^{N} x_i²))
            − exp((1/N) Σ_{i=1}^{N} cos(2πx_i)) + 20 + e                  N = 30,  S = (−32, 32)^N
    f4(x) = (1/4000) Σ_{i=1}^{N} x_i² − Π_{i=1}^{N} cos(x_i/√i) + 1       N = 30,  S = (−600, 600)^N
    f5(x) = (π/N) {10 sin²(πy_1) + Σ_{i=1}^{N−1} (y_i − 1)² [1 + 10 sin²(πy_{i+1})] + (y_N − 1)²}
            + Σ_{i=1}^{N} u(x_i, 10, 100, 4),  y_i = 1 + (x_i + 1)/4      N = 30,  S = (−50, 50)^N
    f6(x) = 0.1 {sin²(3πx_1) + Σ_{i=1}^{N−1} (x_i − 1)² [1 + sin²(3πx_{i+1})] + (x_N − 1)² [1 + sin²(2πx_N)]}
            + Σ_{i=1}^{N} u(x_i, 5, 100, 4)                               N = 30,  S = (−50, 50)^N
    f7(x) = −Σ_{i=1}^{5} [Σ_{j=1}^{4} (x_j − a_ij)² + c_i]^{−1}           N = 4,   S = (0, 10)^N
    f8(x) = −Σ_{i=1}^{7} [Σ_{j=1}^{4} (x_j − a_ij)² + c_i]^{−1}           N = 4,   S = (0, 10)^N
    f9(x) = −Σ_{i=1}^{10} [Σ_{j=1}^{4} (x_j − a_ij)² + c_i]^{−1}          N = 4,   S = (0, 10)^N
be seen from Fig. 2, none of the schemes is best for all test functions. However, the adaptive Lévy mutation yields better results in terms of both the convergence rate and the minimum values obtained. Except for f1 and f4, the adaptive Lévy mutation gives better results than all the others. It should be noticed that the convergence rate of the adaptive Lévy mutation is always faster than the others, which indicates the effectiveness of the adaptive scheme. The minimum values for the adaptive mutation and the best values among the different fixed-α Lévy mutations, together with t-test statistics, are shown in Table 2. The t-test is used to test the statistical hypothesis of equality of means. From the results of the t-test in Table 2, one can find that, in the case of f4, it is hard to tell which mutation scheme is best, since the results fall within the confidence interval at the 5% significance level. If one excludes the adaptive mutation, there is no general consensus on which α is the best for most test functions, as the column "Best α" shows. The performance of the Gaussian mutation is relatively poor for all test functions. The change of the variables after a Gaussian mutation is so small that it is hard to get out of a local minimum once the algorithm is trapped, while the adaptive Lévy mutation gives a higher probability of escaping a local minimum as a result of the adaptive variation of the parent. In order to see how the adaptive Lévy mutation works, we take f3 as an example to analyse. Similar behaviors can be observed for the other functions. In Fig. 3, we plot the number of successful mutations in a population for the four different α's as a function of generation. When one examines Fig. 3 in conjunction with the adaptive Lévy mutation result in Fig. 2(c), one can find that until the algorithm finds the vicinity of the solution (up to around the 400th generation in Fig. 2(c)), non-Gaussian mutations (α = 1.0, 1.3, 1.7) occupy about half of the population. However, once the algorithm finds the vicinity of the so-
[Figures 2(a) and 2(b) appear here: minimum fitness versus generation for f1 and f2; see the Figure 2 caption.]
Table 2: The minimum values for each function, together with the results of the t-test, the best value of α, and other settings. "Mean Best" indicates the average of the minimum values obtained and "Std Dev" the standard deviation over 50 independent runs. The column "Best α" gives the α of the fixed-α Lévy mutation that yielded the best result.

    Function | Population Adaptive (Lévy) | Generations | Adaptive: Mean Best (Std Dev) | Best Lévy: Mean Best (Std Dev) | Best α | t-test
    f1       | 100 (400)                  | 1500        | −11469.2 (58.2)               | −11898.9 (52.2)                | 1.0    | 7.69†
    f2       | 100 (400)                  | 1500        | 5.85 (2.07)                   | 12.50 (2.29)                   | 1.3    | 15.07†
    f3       | 100 (400)                  | 1500        | 1.9×10⁻² (1.0×10⁻³)           | 3.1×10⁻² (2.0×10⁻³)            | 1.7    | 37.57†
    f4       | 100 (400)                  | 1500        | 2.4×10⁻² (2.8×10⁻²)           | 1.8×10⁻² (1.7×10⁻²)            | 1.0    | 1.28
    f5       | 100 (400)                  | 1500        | 6.0×10⁻⁶ (1.0×10⁻⁶)           | 3.0×10⁻⁵ (4.0×10⁻⁶)            | 1.3    | 40.75†
    f6       | 100 (400)                  | 1500        | 9.8×10⁻⁵ (1.2×10⁻⁵)           | 2.6×10⁻⁴ (3.0×10⁻⁵)            | 1.7    | 35.10†

    † The t value with 49 degrees of freedom is significant at the 5% level by a two-tailed test.
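With 49 degrees of freedom on 50 runs, the t statistics reported in the tables are consistent with a paired t-test over matched runs (the paper does not spell out the pairing). A minimal sketch of that computation:

```python
import numpy as np

def paired_t(a, b):
    """Paired t statistic for two sets of matched run results;
    degrees of freedom = len(a) - 1 (49 for 50 runs)."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(d.mean() / (d.std(ddof=1) / np.sqrt(d.size)))
```

The null hypothesis of equal means is rejected at the 5% level when |t| exceeds roughly 2.01 for 49 degrees of freedom, which matches the significance markers in the tables.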
[Figures 2(c)-(f) appear here: minimum fitness versus generation for f3-f6.]
Figure 2: Minimum values of the test functions versus generation, averaged over 50 independent runs. Figures (a), (b), ..., (f) correspond to the test functions f1, f2, ..., f6, respectively. "Adaptive" in the legend indicates the result of the adaptive Lévy mutation; the others indicate the Lévy mutation with the corresponding fixed α. The population size for the adaptive Lévy mutation is 100, and for the others 400.
lution (after around the 400th generation in Fig. 2(c)), the Gaussian mutation takes over and plays the dominant role. That is, the non-Gaussian mutations play a significant role in the population in the early stage of evolution, while the Gaussian mutation fills the major role in the later stage. This is quite reasonable: in the early stage, the distance between the current search point and the global minimum is expected to be large, so non-Gaussian mutations are more effective. As the evolution progresses, the current search point tends to approach the global minimum, and as a result the Gaussian mutation starts to produce better offspring. A similar trend was also observed in the IFEP case (Yao 1999).
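The per-generation bookkeeping behind Fig. 3 amounts to counting, for each parent, which α produced the winning candidate. A small helper for that tally (the interface is our own):

```python
from collections import Counter

def tally_winning_alphas(winners):
    """winners: one alpha per parent, namely the alpha whose candidate
    offspring was selected this generation. Returns the percentage of
    successful mutations per alpha, summing to 100% as in Fig. 3."""
    counts = Counter(winners)
    total = sum(counts.values())
    return {a: 100.0 * c / total for a, c in sorted(counts.items())}
```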
[Figure 4(a), Figure 3, and Figure 4(b) appear here; see the captions nearby.]
Figure 3: Number of successful mutations for α = 1.0, 1.3, 1.7, and 2.0 in a population when the adaptive Lévy mutation is applied to the test function f3. The total numbers of successful mutations add up to 100%.
Figure 4 shows the results for the optimization of the functions with a few local minima, f7, f8, and f9, known as Shekel functions. In this case, it is hard to tell which mutation operator performs best. The adaptive Lévy mutation again has a fast convergence rate. However, its performance is not better than that of the Lévy mutations with a fixed single α, especially for f7. According to "Best α", the Gaussian mutation seems to work better. As the upper half of Table 3 shows, however, all of them fall within the confidence interval at the 5% significance level when the t-test is carried out. As in the case of the multimodal functions, if one excludes the adaptive mutation, there is no general consensus on which α is the best for most test functions. We have performed another set of experiments in which the population size is fixed, regardless of the mutations. The result is shown in Fig. 5 and the lower half of Table 3. When the population size is the same, the adaptive Lévy mutation always performs better on all three functions, and for two of them the difference is significant at the 5% level using a two-tailed test.
Figure 4: Minimum values of the test functions f7, f8, and f9 versus generation. All results were averaged over 50 independent runs. "Adaptive" in the legend indicates the result of the adaptive Lévy mutation; the others indicate the Lévy mutation with the corresponding fixed α. The population size for the adaptive Lévy mutation is 100, and for the others 400. Figures (a), (b), and (c) correspond to the test functions f7, f8, and f9, respectively.
Figure 5: Minimum values of the test functions versus generation, averaged over 50 independent trials. The population size is fixed at 100 for all cases. Figures (a), (b), and (c) correspond to the test functions f7, f8, and f9, respectively.
From Figures 2 to 5 and Tables 2 and 3, we can conclude that the adaptive Lévy mutation is able to perform as well as or better than the Lévy mutation with any single α. This result is promising because the adaptive Lévy mutation uses neither complicated operators nor additional parameters. It also sheds some light on the importance of using variable step sizes obtained through different α's.
5 Conclusion

In this paper, we proposed an adaptive Lévy mutation operator based on Lévy stable probability distributions with different α parameters and investigated its behavior by minimizing a number of test functions. For functions with many local minima, the adaptive Lévy mutation generally has an advantage over CEP and the fixed-α Lévy mutations in both the convergence rate and the minimum values obtained. In particular, since the adaptive Lévy mutation possesses an adaptive versatility in adjusting the variation of the offspring, the scheme may be applicable to practical engineering design problems with many local optima. Considering that many practical optimization problems stem from complex systems having many local optima, the adaptive Lévy mutation deserves further investigation.
The adaptive Lévy mutation has many aspects for future study, among which the following three are worth mentioning. Firstly, we have not considered the region 0 < α < 1.0. This is mainly because a fast algorithm for generating Lévy random numbers in the region 0 < α < 0.7 has not been found. However, in order to see whether a further reduction in α would lead to even better results, we need to include this region of the parameter. Secondly, we applied the Lévy mutation only to the variation of the variables (Eq. (6)), not to the self-adaptive deviation (Eq. (5)). It would, however, be more natural to apply the adaptive Lévy mutation to the self-adaptive deviation as well. Finally, although we restricted the parameter α to four discrete values in each experiment, it is highly desirable to make the parameter self-adaptive so that the value of α can also change continuously during evolution.
Acknowledgements

One of the authors (C.-Y. Lee) wishes to thank the School of Computer Science at the University of Birmingham, where part of this work was done, for their kind hospitality. The authors are also grateful to the anonymous reviewers for their careful reading of the manuscript and many valuable comments. This work was supported by the Korea Science and Engineering Foundation (KOSEF) and the Royal Society of the UK under their exchange program for visiting scientists in 2000.
Table 3: The minimum values for each function, together with the results of the t-test, the best value of α, and other settings. "Mean Best" indicates the average of the minimum values obtained and "Std Dev" the standard deviation over 50 independent runs. The column "Best α" gives the α of the fixed-α Lévy mutation that yielded the best result.

    Function | Population Adaptive (Lévy) | Generations | Adaptive: Mean Best (Std Dev) | Best Lévy: Mean Best (Std Dev) | Best α | t-test
    f7       | 100 (400)                  | 100         | −9.54 (1.69)                  | −9.95 (0.99)                   | 2.0    | 1.47
    f8       | 100 (400)                  | 100         | −10.30 (0.74)                 | −10.40 (1.0×10⁻⁴)              | 2.0    | 0.95
    f9       | 100 (400)                  | 100         | −10.54 (4.9×10⁻⁵)             | −10.54 (3.1×10⁻³)              | 1.7    | 0.0
    f7       | 100 (100)                  | 100         | −9.54 (1.69)                  | −9.05 (2.23)                   | 1.7    | 1.23
    f8       | 100 (100)                  | 100         | −10.30 (0.74)                 | −9.64 (2.13)                   | 1.0    | 2.05†
    f9       | 100 (100)                  | 100         | −10.54 (4.9×10⁻⁵)             | −9.75 (2.19)                   | 2.0    | 2.53†

    † The t value with 49 degrees of freedom is significant at the 5% level by a two-tailed test.

References

T. Bäck and H.-P. Schwefel (1993). "An overview of evolutionary algorithms for parameter optimization", Evolutionary Computation, Vol. 1, No. 1, pp. 1-23.
D. Challet and Y.-C. Zhang (1997). "Emergence of cooperation and organization in an evolutionary game", Physica A, Vol. 246, pp. 407-418.
D. Fogel and J. Atmar (1990). "Comparing Genetic Operators with Gaussian Mutation in Simulated Evolutionary Processes Using Linear Systems", Biological Cybernetics, Vol. 63, pp. 111-114.
L. Fogel, A. Owens, and M. Walsh (1966). Artificial Intelligence Through Simulated Evolution, John Wiley & Sons, New York, NY.
D. Fogel and L. Stayton (1994). "On the Effectiveness of Crossover in Simulated Evolutionary Optimization", BioSystems, Vol. 32, No. 3, pp. 171-182.
M. Gen and R. Cheng (1997). Genetic Algorithms and Engineering Design, Wiley-Interscience, John Wiley and Sons, Inc., New York.
C.-Y. Lee and Y. Song (1999). "Evolutionary Programming using the Lévy Probability Distribution", Proceedings of GECCO'99, pp. 886-893.
P. Lévy (1937). Théorie de l'Addition des Variables Aléatoires, Gauthier-Villars, Paris.
B. Gnedenko and A. Kolmogorov (1954). Limit Distributions for Sums of Independent Random Variables, Addison-Wesley, Cambridge, MA.
B. Mandelbrot (1982). The Fractal Geometry of Nature, Freeman, San Francisco.
R. Mantegna (1994). "Fast, accurate algorithm for numerical simulation of Lévy stable stochastic processes", Physical Review E, Vol. 49, No. 5, pp. 4677-4683.
A. Sebald and J. Schlenzig (1994). "Minimax Design of Neural Net Controllers for Highly Uncertain Plants", IEEE Trans. Neural Networks, Vol. 5, No. 1, pp. 73-82.
X. Yao (1991). "Simulated annealing with extended neighbourhood", International Journal of Computer Mathematics, Vol. 40, pp. 169-189.
X. Yao and Y. Liu (1996). "Fast evolutionary programming", in Evolutionary Programming V: Proceedings of the 5th Annual Conference on Evolutionary Programming, MIT Press, Cambridge, MA.
X. Yao, G. Lin and Y. Liu (1997). "An analysis of evolutionary algorithms based on neighborhood and step size", in Proceedings of the 6th International Conference on Evolutionary Programming, pp. 297-307.
X. Yao, Y. Liu and G. Lin (1999). "Evolutionary programming made faster", IEEE Transactions on Evolutionary Computation, Vol. 3, No. 2, pp. 82-102.