A Multi-Population Approach to Dynamic Optimization Problems

Jürgen Branke, Thomas Kaußler, Christian Schmidt, and Hartmut Schmeck
Institute AIFB, University of Karlsruhe, D-76128 Karlsruhe, Germany
Email: [email protected]
Abstract. Time-dependent optimization problems pose a new challenge to evolutionary algorithms, since they require not only a search for the optimum, but also continuous tracking of the optimum over time. In this paper, we use concepts from the "forking GA" (a multi-population evolutionary algorithm proposed to find multiple peaks in a multi-modal landscape) to enhance search in a dynamic landscape. The algorithm uses a number of smaller populations to track the most promising peaks over time, while a larger parent population continuously searches for new peaks. We show that this approach is indeed suitable for dynamic optimization problems by testing it on the recently proposed Moving Peaks Benchmark.

Keywords: evolutionary algorithm, forking, dynamic optimization problem, time-dependent optimization
1 Introduction
Most research in evolutionary computation focuses on the optimization of static, non-changing problems. However, many real-world optimization problems are actually dynamic, and optimization methods capable of continuously adapting the solution to a changing environment are needed. Applications include, for example, scheduling, where new jobs have to be added all the time, or manufacturing, where the quality of the raw material changes over time. Although evolutionary algorithms seem to be a natural candidate for solving dynamic optimization problems, this area has only recently attracted significant research interest. A comprehensive survey can be found in [1]. As has been argued by Branke [2], continuous adaptation only makes sense when the landscapes before and after the change are sufficiently correlated; otherwise it would be at least as efficient to restart the search from scratch. Therefore it is valid to assume only small to moderate changes. But although local hill-climbing might often be sufficient after a small change, even a slight change might move the optimum to a totally different location, for example when the heights of the peaks change such that a different peak becomes the maximum peak. In these cases the EA basically has to "jump", or cross a valley, to reach the new maximum peak.
The main problem with standard evolutionary algorithms applied to dynamic optimization problems appears to be that EAs eventually converge to an optimum and thereby lose the diversity necessary for efficiently exploring the search space, and consequently also the ability to adapt to a change in the environment when such a change occurs.

In this paper, we adapt the concept of Forking Genetic Algorithms (FGAs), as introduced by Tsutsui, Fujimoto and Ghosh [3], to time-varying multi-modal optimization problems. The FGA is based on the idea of dividing the search space into several parts, each exclusively explored by one of several subpopulations. A parent population continuously searches for new peaks, while a number of child populations try to exploit previously detected promising areas. Here, we use this general idea to maintain individuals on several peaks simultaneously, which should be helpful in the context of dynamic optimization problems. A number of modifications were necessary to allow the FGA to react to changes of the fitness function.

The only other multi-population approach to dynamic optimization problems the authors are aware of is the recently published Shifting Balance GA proposed by Oppacher and Wineberg [4]. That algorithm also tries to maintain the EA's exploratory power by dividing the population, there into one core population and a number of smaller colony populations. But in that approach, the task of the core population is to exploit the best solution found, while the colony populations are forced to search in different areas of the fitness landscape, i.e. they are responsible for exploration. In the approach presented here, a number of smaller populations try to exploit several peaks simultaneously, while a large parent population searches for new peaks. Also, the mechanisms used to maintain separated populations are completely different. At the current state, we are unable to tell which of these two independently developed paradigms is more promising.
The paper’s outline is as follows: First, Section 2 will present our approach in more detail and point out the major differences compared to the forking GA. Section 3 will report on some preliminary experiments. The paper concludes with a summary and some remarks on possible future work.
2 Concept of Forking and Adaptation to Dynamic Problems

The original Forking Genetic Algorithm (FGA), as introduced by Tsutsui, Fujimoto and Ghosh, is rather complex; therefore only a brief summary can be given here. The interested reader is referred to [3] for more details. As pointed out before, the FGA was originally designed to find multiple peaks of a multi-modal landscape and is based on the idea of dividing up the search space. It uses a parent population, continuously searching for new peaks, and a number of child populations, which are restricted to search in some promising areas identified previously. Whenever the parent population has converged "sufficiently" to one region, the FGA separates this region from the parent population's search space and assigns a child population to it for further exploitation, i.e. the parent population and the child populations operate on disjoint parts of the search space (cf. Figure 1). If the maximally allowed number of forking populations is reached, the oldest forking population is discarded. To make the FGA suitable for dynamic problems, at least two issues have to be addressed:
- the subpopulations have to be enabled to follow "their" peak through time;
- we need to efficiently distribute our individuals (and thereby search efforts) between the different child populations and the parent population. This is done by adjusting the number of individuals in each population according to its assumed optimization potential. The least promising child populations may be erased completely when that seems appropriate.
Overall, the child populations should exploit the knowledge gained previously and follow the most promising peaks in the search space, while the parent population should constantly explore and search for new peaks. The new approach introduced in this paper will be called "Self-Organizing Scouts" (SOS), since the individuals act as scouts that may divide the search space into different regions and distribute their search efforts onto the most promising of these regions.

The algorithm starts just like a simple EA, with a single population searching the entire search space. At regular intervals, this population (called the parent population) is analyzed, and it is checked whether the conditions for forking are fulfilled. If that is the case, a child population is split off from the parent population and henceforth independently explores the corresponding subspace. The parent population continues to search in the remaining search space. Note that the total number of individuals is never affected; the individuals are just assigned to different (sub)populations. Each child population is able to adapt its search space to the changing fitness landscape by moving it through the phenotypic feature space. The number of individuals in each child population is adjusted regularly to reflect its estimated optimization potential and quality. The overall algorithm is depicted below.
Algorithm: Self-Organizing Scouts

REPEAT
  compute the next generation of the child populations and adjust their search spaces
  compute the next generation of the parent population
  IF (forking generation)
    create a new child population when possible
    adjust the sizes of the parent and child populations
UNTIL termination criterion met
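To make the control flow above concrete, here is a deliberately tiny, runnable sketch on a 1-D toy problem. Every component (the static one-peak fitness, the mutation-only "EA step", the naive forking condition with a fixed range) is our own illustrative stand-in, not the algorithm's actual machinery:

```python
import random

# Minimal SOS-style control-flow sketch on a 1-D toy problem; all parameter
# values and helper functions below are assumptions for illustration only.

random.seed(1)
LOW, HIGH = 0.0, 100.0

def fitness(x):
    return -abs(x - 30.0)                      # toy landscape, peak at x = 30

def ea_step(pop, step, n):
    """One toy "generation": mutate every individual, keep the n best (elitist)."""
    cand = pop + [min(HIGH, max(LOW, x + random.gauss(0.0, step))) for x in pop]
    return sorted(cand, key=fitness, reverse=True)[:n]

parent = [random.uniform(LOW, HIGH) for _ in range(20)]
children = []                                  # each child: {"members", "range"}

for gen in range(200):
    for c in children:                         # children use small mutation steps
        c["members"] = ea_step(c["members"], step=0.5, n=len(c["members"]))
    parent = ea_step(parent, step=5.0, n=len(parent))

    if gen % 10 == 0 and not children:         # a "forking generation"
        cluster = [x for x in parent if abs(x - parent[0]) <= 3.0]
        if 5 <= len(cluster) < len(parent):    # naive forking condition
            children.append({"members": cluster, "range": 3.0})
            # individuals move to the child; the total number is unaffected
            parent = [x for x in parent if x not in cluster]

    # parent individuals inside a child's space are replaced by random ones
    for c in children:
        center = max(c["members"], key=fitness)
        parent = [x if abs(x - center) > c["range"]
                  else random.uniform(LOW, HIGH) for x in parent]

best = max(parent + sum((c["members"] for c in children), []), key=fitness)
print(round(best, 1))                          # should be close to the peak at 30
```

Even in this toy form, the division of labor is visible: the child tracks the peak with small steps, while the parent keeps sampling the rest of the space.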
Fig. 1: Division of the search space by forking (following [3])
In the following, the different steps are explained in more detail.

Computing the next generation: Generally, computing the next generation of the parent or a child population is equivalent to a single generation of an ordinary EA. The mutation step size is adjusted to the diameter of the corresponding search space, i.e. child populations generally have much smaller mutation step sizes than the parent population. Since the fitness function is dynamic, all individuals have to be re-evaluated in every generation. It is always ensured that no individual from the parent population lies within any of the child populations' search spaces.

Creating a new child population: A child population is an independent subpopulation working on a part of the phenotypic feature space. It is defined by a center (the fittest individual in the subpopulation) and a distance (range), and consists of all individuals whose phenotypic Manhattan distance from the center is smaller than or equal to the given range. The population fitness is defined as the fitness of its best individual. Child populations are subject to a number of restrictions:
- minimum and maximum number of individuals relative to the overall number;
- minimum and maximum diameter of the subspaces relative to the size of the total search space;
- minimum fitness of new forking populations relative to the current overall best individual;
- minimum fitness of existing forking populations relative to the current overall best individual.
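As a small illustration of the definition above, a membership test based on the phenotypic Manhattan distance could look as follows (the function name and sample points are made up):

```python
# Membership test for a child population's search space: an individual belongs
# to the child population iff its phenotypic Manhattan distance from the center
# (the population's best individual) is at most the given range.
# The function name and the 5-D sample points are illustrative assumptions.

def in_child_space(individual, center, search_range):
    distance = sum(abs(a - b) for a, b in zip(individual, center))
    return distance <= search_range

center = [50.0, 50.0, 50.0, 50.0, 50.0]                # the best individual
print(in_child_space([52.0, 49.0, 50.0, 51.0, 50.0], center, 10.0))  # True  (distance 4)
print(in_child_space([70.0, 50.0, 50.0, 50.0, 50.0], center, 10.0))  # False (distance 20)
```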
The original FGA considers forking only when certain convergence criteria are fulfilled and the best individual has not changed for some number of generations. Since we are dealing with a dynamic fitness function here, convergence may never occur, and attempting to fork at regular intervals seemed more appropriate. Therefore, at specific generations, called forking generations, the parent population is analyzed for the existence of a group of individuals that fulfills the above constraints for child populations. If more than one group is found, the one with the maximum ratio of number of individuals to diameter is selected. All individuals in that group are split off from the parent population and assigned, as their search space, the smallest subspace encompassing all of them (note that this allows a variable size of the subspace, while the original FGA always uses a fixed-size subspace).

Moving the child population's search space: As the peaks of the fitness landscape may move, the search space of each child population is allowed to move as well. This is achieved by defining the best individual in a child population as the center of its search space. The diameter of each search space, however, is kept constant in the current implementation. When a child population's search space moves, some of the parent population's individuals may become invalid because they now lie within a child population's search space. These individuals are replaced by randomly generated valid individuals. Individuals that drop out of a moving child population's search space are kept for the next generation. In addition, the search spaces of child populations may overlap. Usually this is tolerated; only when a center individual falls into the search space of another child population is that individual's whole child population removed.

Adjusting the population sizes: Since the overall computing power is limited, it should be distributed efficiently over the different populations. Generally, more effort should be devoted to areas with high quality and high dynamics. On the other hand, when a child population has converged to a peak and that peak has not changed for several generations, it may be sufficient to maintain a very small "outpost" on that peak in order to detect when the peak becomes interesting again, i.e. when it changes height and/or position. For that purpose, in each forking generation, any child population with a fitness smaller than the minimum required fitness is first discarded. Then, for all remaining child populations as well as the parent population, a quality measure Qi is calculated.
The quality Qi of population i is simply a linear combination of its fitness Fi and a dynamism measure Di, which depends on the difference between a population's current and previous fitness:

  Fi(t) = fitness of the best individual in population i at time t

  Di(t) = max{ 0, (Fi(t) - Fi(t-1)) / Fi(t-1) }  for t > 0, and Di(t) = 0 otherwise

The desired population size Si is then chosen proportionally to each population's relative quality:

  Si = Qi / (∑j Qj) · (total number of individuals)
Of course, the restrictions on the minimum and maximum population size of each child population and the parent population always have to be respected. When the size of a population is increased, new random individuals are generated within the corresponding search space. If individuals have to be removed because the new population size is smaller than the old one, the worst individuals are removed. We consider the above attempt to measure quality a rather straightforward and preliminary approach; in future studies, other indicators, e.g. convergence, may be included.
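The resizing step could be sketched as follows; the function names, the quality values, and the size bounds are illustrative assumptions, and rounding plus clamping mean the resulting sizes need not sum exactly to the total:

```python
# Sketch of the resizing step: D_i(t) rewards populations whose fitness has
# recently improved; the desired size S_i is proportional to relative quality
# and then clamped to the allowed range. All sample values are made up.

def dynamism(f_now, f_prev):
    """D_i(t) = max(0, (F_i(t) - F_i(t-1)) / F_i(t-1)) for t > 0."""
    return max(0.0, (f_now - f_prev) / f_prev)

def adjust_sizes(qualities, total, min_size, max_size):
    """S_i = Q_i / sum(Q_j) * total, clamped to [min_size, max_size].
    Rounding and clamping mean the sizes may not sum exactly to `total`."""
    q_sum = sum(qualities)
    return [min(max(round(q / q_sum * total), min_size), max_size)
            for q in qualities]

print(dynamism(55.0, 50.0))                            # 0.1 (10% improvement)
print(adjust_sizes([4.0, 1.0, 1.0, 2.0], total=80, min_size=5, max_size=30))
# raw shares are 40, 10, 10, 20; the first is clamped to the maximum of 30
```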
3 Empirical Evaluation

For a preliminary empirical evaluation, we compare the SOS algorithm as presented above to a simple evolutionary algorithm (EA), using the Moving Peaks Problem [2]. This benchmark consists of a number of peaks that randomly change their height, location and width from time to time. The step size by which a peak is moved can be set explicitly and thus allows us to define the severity of a change. As standard settings for SOS as well as for the simple EA, we use a total number of 100 individuals, real-valued encoding, rank-based selection, generational replacement with elitism of one individual, a mutation probability of 0.2, and a crossover probability of 0.6. An α-parameter sets the relative importance of dynamism and quality in SOS. For the Moving Peaks benchmark, each of 5 dimensions was restricted to
values between 0 and 100, the step size for a peak has been set to 1, and the landscape changes every 5000 evaluations (50 generations). All reported values are averages over 20 runs with different random seeds but identical fitness function. Since for dynamic fitness functions it is not useful to report only the best solution found, we use the offline performance as quality measure. The offline performance is defined as the average of the best solutions at each time step, i.e. x*(T) = (1/T) ∑_{t=1}^{T} e_t, with e_t being the best solution at time t (cf. [5]). Note that the number of values used for the average grows with time; thus the curves tend to become smoother and smoother. Figure 2 depicts three different settings of SOS with differing restrictions on cluster size and compares them to the standard EA. As can be seen, SOS always significantly outperforms the standard EA. With increasing cluster size, the performance of SOS improves slightly. Nevertheless, the other results reported in this paper have been computed with the smallest cluster size restrictions, which leaves some potential for improvement.
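The offline performance measure can be sketched in a few lines (the sample values are made up):

```python
# Offline performance x*(T) = (1/T) * sum_{t=1}^{T} e_t, where e_t is the best
# solution value at time step t. The input list below is an invented example.

def offline_performance(best_per_step):
    """Average of the best solution values e_t over all time steps so far."""
    return sum(best_per_step) / len(best_per_step)

print(offline_performance([40.0, 45.0, 50.0]))   # 45.0
```

Because each new time step adds one more term to the average, later values change the measure less and less, which is why the reported curves become smoother over time.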
Fig. 2: Offline performance of the standard EA compared to SOS with different restrictions on cluster size (SOS [0.05; 0.1], SOS [0.1; 0.2], and SOS [0.2; 0.4] vs. the simple GA, over 10000 generations).

The effect of changing the total number of individuals can be seen in Figure 3, again for the standard EA (lower 3 curves) and SOS (upper 3 curves). Overall, the effect of varying the population size seems to be small.

Fig. 3: Offline performance of the standard EA compared to SOS with different total population sizes (50, 100, and 200 individuals each).
Figure 4 shows the results obtained with standard settings but a higher change frequency, namely a change of the fitness landscape occurring every 20 generations. The change frequency is one aspect of dynamism, and it seems that with growing dynamism SOS gains additional relative advantage over the standard EA. Finally, we examined the effect of changing the step size, or severity, of the changes. Figure 5 reports the offline performance after 5000 generations for different values of severity. As can be seen, SOS is much less affected by an increased change severity than the standard EA, which means that again the difference between the standard EA and SOS grows when dynamism is increased. This is particularly noteworthy in comparison to the results reported in [2], as the memory-based approaches presented there have proven to be very sensitive to changes of the peak locations, i.e. the severity.

Fig. 4: Offline performance of the standard EA compared to SOS with higher change frequency (every 20 generations).

Fig. 5: Offline performance of the standard EA and SOS after 5000 generations, varying the shift vector length s (from 0 to 2).
4 Conclusion and Future Work

We have presented a new way to tackle dynamic optimization problems by means of evolutionary algorithms. The newly proposed Self-Organizing Scouts approach is based on the forking GA presented in [3]. The basic idea behind this work was to design an evolutionary algorithm capable of tracking several different promising parts of the search space simultaneously. Although some of the design decisions were made rather ad hoc in order to quickly prove feasibility, in the preliminary tests performed so far our approach significantly outperformed the simple GA on the Moving Peaks problem. Furthermore, the approach is largely independent of the total number of individuals and the severity of the changes (as opposed to memory-based approaches, which are very sensitive to changes of the peak locations).

For future work, it would certainly be valuable to examine more closely the effect of the different modifications made to the original FGA. In particular, the resizing of the populations should be refined, not only by refining the quality rating, but also by allowing the size of the search spaces to adapt. Testing the approach on problems with more peaks than the maximally allowed number of child populations will reveal whether the resizing procedure is still capable of identifying the most promising search regions. A comparison with other approaches, especially the shifting balance GA, should also be done.
References

1. J. Branke. Evolutionary algorithms for dynamic optimization problems - a survey. Technical Report 387, Institute AIFB, University of Karlsruhe, February 1999.
2. J. Branke. Memory enhanced evolutionary algorithms for changing optimization problems. In Congress on Evolutionary Computation CEC99, volume 3, pages 1875–1882. IEEE, 1999.
3. S. Tsutsui, Y. Fujimoto, and A. Ghosh. Forking genetic algorithms: GAs with search space division schemes. Evolutionary Computation, 5(1):61–80, 1997.
4. F. Oppacher and M. Wineberg. The shifting balance genetic algorithm: Improving the GA in a dynamic environment. In W. Banzhaf et al., editors, Genetic and Evolutionary Computation Conference, volume 1, pages 504–510. Morgan Kaufmann, 1999.
5. K. De Jong. An analysis of the behavior of a class of genetic adaptive systems. PhD thesis, University of Michigan, Ann Arbor, MI, 1975.