Niche Radius Adaptation in the CMA-ES Niching Algorithm
Ofer M. Shir and Thomas Bäck∗
Natural Computing Group, Leiden University, Niels Bohrweg 1, 2333 CA Leiden, The Netherlands
Motivation
Niching methods extend Evolutionary Algorithms (EAs) to multi-modal optimization: they allow parallel convergence to multiple good solutions by maintaining the diversity of certain properties within the population. Most EA niching methods assume that the peaks of the fitness landscape are separated by at least some threshold distance, called the niche radius, which is estimated for the given problem and remains fixed during the course of evolution. On landscapes where this assumption does not hold, those niching methods are likely to fail. This is the so-called niche radius problem.
By applying the calculation of the dynamic niche count $m_i^{dyn}$ (Eq. 4), based on the appropriate radii, we define the niche fitness of individual $i$ by:
$$f_i^{niche} = \frac{f_i}{g\left(m_i^{dyn}, \lambda\right)} \qquad (9)$$
The selection of the next parent in each niche is based on this niche fitness. A single generation of the method is summarized as Algorithm 1.
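The quantities that drive one generation, namely the radius update (Eqs. 6 and 7), the function g (Eq. 8) and the niche fitness (Eq. 9), can be sketched as below. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the default α = −10 is the value quoted in the numerical setup, and the division in Eq. 9 is assumed to act on a positive raw fitness that is being maximized.

```python
import math


def learning_coefficient(sigma_new, sigma_old, alpha=-10.0):
    """Eq. 7: c = (1/5) * (1 - exp(alpha * |sigma_new - sigma_old|)).

    With a negative alpha the coefficient lies in [0, 0.2).
    """
    delta_sigma = abs(sigma_new - sigma_old)
    return 0.2 * (1.0 - math.exp(alpha * delta_sigma))


def update_niche_radius(rho_parent, sigma_new, sigma_old, alpha=-10.0):
    """Eq. 6: blend the parent's radius with the parent's new step size."""
    c = learning_coefficient(sigma_new, sigma_old, alpha)
    return (1.0 - c) * rho_parent + c * sigma_new


def g(x, lam):
    """Eq. 8: parabola with unequal branches and minimum g = 1 at x = lam."""
    if x < lam:
        # Left branch, active where Theta(lam - x) = 1.
        return 1.0 + (lam - x) ** 2 / lam
    # Right branch (both quadratic terms vanish at x = lam).
    return 1.0 + (lam - x) ** 2


def niche_fitness(raw_fitness, dynamic_niche_count, lam):
    """Eq. 9: raw fitness divided by g(m_dyn, lam).

    Assumption (not stated in the source): raw_fitness is positive and
    maximized, so a larger g means a stronger penalty.
    """
    return raw_fitness / g(dynamic_niche_count, lam)
```

A niche holding exactly λ individuals keeps its raw fitness (g = 1); niches that are under- or over-populated are penalized, with overcrowding penalized more steeply than undercrowding.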
[Algorithm 1: a single generation of the method; the listing is not recoverable from this extraction.]
Niching Background: Fundamental Concepts
The fitness sharing approach considers the fitness as a shared resource and by that aims to decrease redundancy in the population. Given $d_{ij}$, the distance between individuals $i$ and $j$, $\rho$ (traditionally noted as $\sigma_{sh}$), the radius of every niche, and $\alpha_{sh}$, a control parameter usually set to 1, the sharing function is:
$$sh(d_{ij}) = \begin{cases} 1 - \left(\dfrac{d_{ij}}{\rho}\right)^{\alpha_{sh}} & \text{if } d_{ij} < \rho \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$
Based on this sharing function, the niche count is then defined as follows:
$$m_i = \sum_{j=1}^{N} sh(d_{ij}) \qquad (2)$$
Given an individual's raw fitness $f_i$, the shared fitness is then defined by:
$$f_i^{sh} = \frac{f_i}{m_i} \qquad (3)$$
The dynamic niche sharing method recognizes the $q$ peaks of the forming niches and classifies the individuals accordingly. Introduce the dynamic niche count:
$$m_i^{dyn} = \begin{cases} n_j & \text{if individual } i \text{ is within dynamic niche } j \\ m_i & \text{otherwise (non-peak individual)} \end{cases} \qquad (4)$$
where $n_j$ is the size of the $j$th dynamic niche, and $m_i$ is the standard niche count, as defined in Eq. 2. The shared fitness is then defined respectively:
$$f_i^{dyn} = \frac{f_i}{m_i^{dyn}} \qquad (5)$$
The identification of the niches can be done with the Dynamic Peak Identification (DPI) algorithm [2].
Numerical Results
Table 1 summarizes the unconstrained multimodal test functions. The algorithm is tested on the specified functions for various dimensions. Each test case includes 100 runs. All runs are performed with a core mechanism of a (1, 10)-strategy per niche and initial points are sampled uniformly within the initialization intervals. Initial step sizes, as well as initial niche radii, are set to 1/6 of the intervals. The parameter $q$ is set based on a-priori knowledge when available, or arbitrarily otherwise; $p$ is set to 1. The default value of $\alpha$ is −10, but it becomes problem dependent for some cases, and has to be tuned. Each run is stopped after $10^5$ generations ($(q + 1) \cdot 10^6$ evaluations).
We consider three measures as the performance criteria: the saturation M.P.R. (maximum peak ratio; see, e.g., [2]), the global optimum location percentage, and the number of optima found (with respect to the desired value, $q$). The results of the simulations are summarized in Table 2.
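Returning to the sharing quantities of Eqs. 1 to 3, a minimal sketch of how they could be computed is given here. The function names are hypothetical, and the Euclidean distance between decision vectors is an assumption; the text only speaks of "the distance between individuals".

```python
import math


def sharing(d, rho, alpha_sh=1.0):
    """Eq. 1: sh(d) = 1 - (d / rho)^alpha_sh inside the niche, 0 outside."""
    return 1.0 - (d / rho) ** alpha_sh if d < rho else 0.0


def niche_counts(points, rho, alpha_sh=1.0):
    """Eq. 2: m_i = sum over all j of sh(d_ij).

    Each individual counts itself (sh(0) = 1), so m_i >= 1.
    Distance is Euclidean here, which is an assumption of this sketch.
    """
    return [
        sum(sharing(math.dist(p, other), rho, alpha_sh) for other in points)
        for p in points
    ]


def shared_fitness(fitness, counts):
    """Eq. 3: raw fitness divided by the niche count."""
    return [f / m for f, m in zip(fitness, counts)]


# Two crowded individuals share their fitness; the isolated one keeps its own.
population = [(0.0,), (0.5,), (3.0,)]
counts = niche_counts(population, rho=1.0)
print(shared_fitness([3.0, 3.0, 2.0], counts))
```

Because every individual contributes sh(0) = 1 to its own count, the division in Eq. 3 is always well defined.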
Dynamic Niching with Covariance Matrix Adaptation Evolution Strategy
The dynamic niching with CMA-ES algorithm [2] is a niching method which uses the Covariance Matrix Adaptation Evolution Strategy [1] as its core evolutionary mechanism. The aim of this approach is to find multiple local optima simultaneously, within one run of the ES. Given $q$, the estimated/expected number of peaks, $q + p$ "CMA-sets" are initialized, where a CMA-set is defined as the collection of all the dynamic variables of the CMA algorithm which uniquely define the search at a given point of time. Such dynamic variables are the current search point, the covariance matrix, the step size, as well as other auxiliary parameters. At every point in time the algorithm stores exactly $q + p$ CMA-sets, which are associated with $q + p$ search points: $q$ for the peaks and $p$ for the "non-peaks domain". The $(q + 1)$th ... $(q + p)$th CMA-sets are individuals which are randomly re-generated in every generation as potential candidates for niche formation.
Until stopping criteria are met, the following procedure takes place. Each search point samples $\lambda$ offspring, based on its evolving CMA-set. After the fitness evaluation of the new $\lambda \cdot (q + p)$ individuals, the classification into niches of the entire population is done using the DPI algorithm, and the peaks become the new search points. Their CMA-sets are inherited from their parents and updated according to the CMA method.
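The classification step can be illustrated with a simplified sketch of greedy peak identification followed by the dynamic niche count of Eq. 4. The text only cites [2] for DPI, so the greedy formulation used here (scan individuals from fittest to least fit, and accept one as a peak if it lies outside the radius of all peaks found so far) is an assumption, as are the function names, the single shared radius `rho`, and the Euclidean distance.

```python
import math


def dynamic_peak_identification(points, fitness, q, rho):
    """Greedy DPI sketch (assumed formulation, maximization).

    Returns (peaks, niche_of): peaks holds indices of the peak individuals,
    niche_of[i] is the position in peaks of the niche containing
    individual i, or None for a non-peak individual.
    """
    order = sorted(range(len(points)), key=lambda i: fitness[i], reverse=True)
    peaks = []
    for i in order:
        if len(peaks) == q:
            break
        # Accept i as a new peak only if it is outside every existing niche.
        if all(math.dist(points[i], points[p]) >= rho for p in peaks):
            peaks.append(i)
    niche_of = []
    for x in points:
        # Assign to the fittest peak whose niche contains x, if any.
        owner = next(
            (k for k, p in enumerate(peaks) if math.dist(x, points[p]) < rho),
            None,
        )
        niche_of.append(owner)
    return peaks, niche_of


def dynamic_niche_counts(niche_of, standard_counts):
    """Eq. 4: niche size n_j for niche members, standard count m_i otherwise."""
    sizes = {}
    for k in niche_of:
        if k is not None:
            sizes[k] = sizes.get(k, 0) + 1
    return [
        sizes[k] if k is not None else m
        for k, m in zip(niche_of, standard_counts)
    ]
```

In the radius-adaptation variant the single `rho` would be replaced by each individual's own radius; that refinement is left out to keep the sketch short.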
Test Functions
[Table 1: the unconstrained multimodal test functions; contents not recoverable from this extraction.]
Numerical Results
[Table 2: saturation M.P.R., global optimum location percentage, and number of optima found relative to q, per test function and dimension; values not recoverable from this extraction.]
As reflected by those results, our method performs satisfactorily. On the multimodal test functions reported in [2], introducing the niche radius adaptation mechanism does not harm performance, except for the Ackley function in high dimensions. The results on the functions V and S, whose optima are unevenly spread, are also satisfactory (plots are given below).
The Niche Radius Problem
The traditional formula for the niche radius for phenotypic sharing in GAs and for ES niching is given by $\rho = \dfrac{r}{\sqrt[n]{q}}$, where, given lower and upper boundary values $x_{k,min}$, $x_{k,max}$ of each coordinate in the decision parameters space, $r$ is defined as $r = \dfrac{1}{2}\sqrt{\sum_{k=1}^{n} \left(x_{k,max} - x_{k,min}\right)^2}$.
Hence, applying this niche radius approach relies on two assumptions:
1. The expected/desired number of peaks, $q$, is given or can be estimated.
2. All peaks are at least a distance $2\rho$ from each other, where $\rho$ is the fixed radius of every niche.
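As a worked illustration of the traditional fixed radius, the formula above can be evaluated directly from the box constraints. The function name and argument layout are hypothetical.

```python
import math


def niche_radius(lower, upper, q):
    """Fixed niche radius rho = r / q^(1/n).

    r is half the diagonal of the search box:
    r = 0.5 * sqrt(sum_k (x_k_max - x_k_min)^2).
    """
    n = len(lower)
    r = 0.5 * math.sqrt(sum((hi - lo) ** 2 for lo, hi in zip(lower, upper)))
    return r / q ** (1.0 / n)


# One dimension, interval [0, 10], five expected peaks: r = 5, rho = 1.
print(niche_radius([0.0], [10.0], q=5))
```

The radius depends only on the box size, the dimension n and the guessed number of peaks q, which is why unevenly spread optima can violate the second assumption.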
Our new algorithm tackles the niche radius problem, in particular the assumption regarding the fitness landscape: it introduces the concept of an individual niche radius which adapts during the course of evolution. The idea is to couple the niche radius to the global step size $\sigma$, whereas the indirect selection of the niche radius is applied through the demand for $\lambda$ individuals per niche. This is implemented through a quasi dynamic fitness sharing mechanism.
The CMA-ES Niching method is used as outlined earlier, with the following modifications. $q$ is given as an input to the algorithm, but it is now merely a prediction or a demand for the number of solutions, with no effect on the nature of the search. A niche radius is initialized for each individual in the population, noted as $\rho_i^0$. The update step of the niche radius of individual $i$ in generation $g + 1$ is based on the parent's radius and on its step-size:
$$\rho_i^{g+1} = \left(1 - c_i^{g+1}\right) \cdot \rho_{parent}^{g} + c_i^{g+1} \cdot \sigma_{parent}^{g+1} \qquad (6)$$
$c_i^{g+1}$ is the individual learning coefficient, which is updated according to the delta of the step size $\sigma$, $\Delta\sigma_i^{g+1} = \left|\sigma_{parent}^{g+1} - \sigma_{parent}^{g}\right|$:
$$c_i^{g+1} = \frac{1}{5} \cdot \left(1 - \exp\left\{\alpha \cdot \Delta\sigma_i^{g+1}\right\}\right) \qquad (7)$$
[Figure: the learning coefficient $c_i^{g+1} = 0.2 \cdot (1 - e^{\alpha \cdot \Delta\sigma_i^{g+1}})$ plotted against $\Delta\sigma_i^{g+1}$.]
The DPI algorithm is run using the individual niche radii, for the identification of the peaks and the classification of the population. Furthermore, introduce:
$$g(x, \lambda) = 1 + \Theta(\lambda - x) \cdot \frac{(\lambda - x)^2}{\lambda} + \Theta(x - \lambda) \cdot (\lambda - x)^2 \qquad (8)$$
where $\Theta(y)$ is the Heaviside step function. Given a fixed $\lambda$, $g(x, \lambda)$ is a parabola with unequal branches, centered at $(x = \lambda, g = 1)$; see plot.
[Figure: $g(x, \lambda = 10)$ plotted against $x$.]
[Figures: S (n = 1): Final Population; V (n = 1): Final Population.]
Summary and Conclusion
The proposed method is shown to perform in a satisfying manner in the location of the desired optima of functions which were tested in the past on the predecessor of this method, using a fixed niche radius. More importantly, the niche radius problem is tackled successfully, as demonstrated on functions with unevenly spread optima. The function of the learning coefficients has to be tuned (through the parameter $\alpha$) in some cases. Although this is an undesired situation, i.e., the adaptation mechanism is problem dependent, this method makes it possible to locate all desired optima on landscapes which could not be handled by the old methods of fixed niche radii, or would require the tuning of $q$ parameters.
Acknowledgments
This work is part of the research programme of the 'Stichting voor Fundamenteel Onderzoek der Materie (FOM)', financially supported by the 'Nederlandse Organisatie voor Wetenschappelijk Onderzoek (NWO)'.
References
[1] Nikolaus Hansen and Andreas Ostermeier. Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation, 9(2), 2001.
[2] Ofer M. Shir and Thomas Bäck. Dynamic niching in evolution strategies with covariance matrix adaptation. In Proceedings of the 2005 Congress on Evolutionary Computation CEC-2005, Piscataway, NJ, USA, 2005. IEEE Press.
∗ NuTech Solutions, Martin-Schmeisser-Weg 15, 44227 Dortmund, Germany.