WCCI 2010 IEEE World Congress on Computational Intelligence, July 18-23, 2010 – CCIB, Barcelona, Spain


Improving the Performance of Particle Swarms through Dimension Reductions – A Case Study with Locust Swarms

Stephen Chen, Member, IEEE, and Yenny Noa Vargas

Abstract—A key challenge for many heuristic search techniques is scalability – techniques that work well on low-dimension problems may perform poorly on high-dimension problems. To the extent that some problems/problem domains are separable, this can lead to a benefit for search techniques that can exploit separability. The standard algorithm for particle swarm optimization does not provide opportunities to exploit separable problems. However, the design of locust swarms involves two phases (scouts and swarms), and “dimension reductions” can be easily implemented during the scouts phase. This ability to exploit separability in locust swarms leads to large performance improvements on separable problems. More interestingly, dimension reductions can also lead to significant performance improvements on non-separable problems. Results on the Black-Box Optimization Benchmarking (BBOB) problems show how dimension reductions can help locust swarms perform better than standard particle swarms – especially on high-dimension problems.

I. INTRODUCTION

The complexity of an optimization problem (e.g. the number of local optima) can increase exponentially with dimensionality, so search techniques that are very effective on low-dimension problems may not be efficient enough to handle problems in higher dimensions. To create high-dimension benchmark optimization problems, one-dimension functions are often repeated (e.g. summed or multiplied) multiple times. High-dimension problems of this nature are separable – the optimal value for a search in any one dimension is for that term the same value it will have at the global optimum. Therefore, these types of high-dimension problems are easily solved as a series of independent, one-dimension problems. Search techniques which can focus on one term/dimension at a time can have substantial advantages on separable problems.

For example, the CEC2008 Large Scale Global Optimization (LSGO) competition used separable problems [1]. In this competition, the best method was Multiple Trajectory Search [2], which relies heavily on a series of one-dimensional searches. Other search methods which performed well and which had the opportunity to exploit separability include a differential evolution-based technique with an evolvable crossover factor [3] (i.e. selecting fewer terms for crossover leads to a greater exploitation of separability) and a Multilevel Cooperative Coevolution approach [4] which used the solutions to smaller sub-problems (i.e. problem separation) as a starting point for solving the overall problem. Compared to these techniques, the particle swarm-based techniques [5][6] had no discernible ability to exploit problem separation and widely inferior LSGO results.

In particle swarm optimization (PSO) [7], a solution is viewed as a location (which may happen to be in n-dimensional space) as opposed to an entity like an organism which is distinctly composed of n components. The search trajectory of particles in PSO is “as the crow flies”, and there is no concept of “Manhattan distance” in a standard PSO.

Locust swarms [8] are a recently developed multi-optima particle swarm. With their “devour and move on” search strategy, they were explicitly developed for multi-modal functions without global structure. When a swarm converges, it “devours” the area around a (local) optimum. To help find a promising new area to devour, locust swarms use “scouts” to sample the search space [9]. Scouts can change all n dimensions of the previous (local) optimum, or they can change a reduced set. These “dimension reductions” in the search space of the scouts allow locust swarms to exploit problem separation.

Although locust swarms can perform very well on the LSGO problems [10], the separability of the LSGO problems limits their usefulness as benchmarks. The use of one-dimensional searches on non-separable problems can lead to very poor results. To more fully examine the benefits of dimension reductions, locust swarms have been applied to the Black-Box Optimization Benchmarking (BBOB) problem set [11].

This work has received funding support from the Natural Sciences and Engineering Research Council of Canada.
S. Chen is with the School of Information Technology, York University, Toronto, ON M3J 1P3 Canada (phone: 416-736-2100 x30526; fax: 416-736-5287; e-mail: [email protected]).
Y. Noa Vargas is with the Department of Mathematics and Computer Science, University of Havana, Havana, Cuba (e-mail: [email protected]).

978-1-4244-8126-2/10/$26.00 ©2010 IEEE
The “separable functions” (BBOB problems 1-5) and the “multi-modal functions with adequate global structure” (BBOB problems 15-19) are particularly useful for analyzing the effects of dimension reductions. On the separable functions, the performance of locust swarms generally improves with increasing dimension reductions (i.e. lower-dimensional searches). On the multi-modal functions with adequate global structure, the performance of locust swarms generally improves with some dimension reductions, but it also degrades greatly when scouts use only one-dimensional searches. However, in both cases, the performance of locust swarms can be improved through the use of dimension reductions.


Comparing how locust swarms perform with a benchmark implementation of particle swarm optimization [12], both techniques have advantages and disadvantages on different functions in the BBOB problem set at D = 20 dimensions. However, problems with D = 20 still have relatively few dimensions, and a key hypothesis is that dimension reductions can help counteract the “curse of dimensionality”. To test this hypothesis, additional comparisons between locust swarms and the benchmark PSO were conducted for D = 50, 100, and 200 dimensions. On the BBOB problems, the effects of increasing dimensionality can be dramatically less on locust swarms, and this can lead to large advantages as D increases from 20 to 200.

This demonstration of how dimension reductions can help improve the performance of particle swarm-based search techniques begins with an introduction of both particle swarm optimization and locust swarms in section II. In section III, the BBOB problem set is introduced. Experiments with locust swarms and standard PSO are performed on these problems in sections IV and V. The results of these experiments are discussed in section VI before a summary is provided in section VII.

II. BACKGROUND

The development of particle swarm optimization includes inspirations from “bird flocking, fish schooling, and swarming theory in particular” [7]. Each particle (e.g. a simulated bird) knows its personal best position (pbest) and the globally best position (gbest) that any member of the swarm has encountered. The position of the particle at time step i+1 is calculated by (1), and its velocity update (2) is affected by attractions to the global best (3) and its personal best (4) [7].

position_{i+1} = position_i + step * velocity_i                 (1)
velocity_{i+1} = M * velocity_i + G * global_i + C * local_i    (2)
global_i = gbest - position_i                                   (3)
local_i = pbest - position_i                                    (4)

As particles “fly” through the search space, they examine many points in the vicinity of the global best location. With small momentum M, this would lead to a very greedy search similar to hill climbing. However, particles can also “fly past” the global best, and these overshoots are an important feature that allows particle swarms to perform effective search space explorations. Nonetheless, there is no guarantee that the particles will converge to the global optimum in a multi-modal search space, so some form of restart can be useful.

The development of locust swarms [8] builds upon particle swarm optimization and a multi-optima variation called WoSP (for Waves of Swarm Particles) [13]. The restart mechanism in WoSP is similar to a physics-based short-range force that repulses particles that have converged upon one another due to the attractive effects of the primary long-range forces (i.e. G). The use of discrete time step evaluations can further exaggerate the effects of the short-range force, causing particles not simply to repel each other enough to avoid complete convergence, but in fact to expel each other into entirely different areas of the search space.

The restart mechanism for locust swarms is based on time. As time is spent “devouring” one area of the search space (i.e. trying to find the exact local optimum), it becomes less and less likely that additional efforts in exploitation will be productive. Thus, there comes a time to “move on” to another area of the search space. This “devour and move on” process is akin to the swarming phase of grasshoppers that become migratory after an intense exploitation of an area’s resources.

The two phases of locust swarms – migrating and devouring, or exploring and exploiting – have been designed and coordinated using a coarse search-greedy search framework. The coarse search phase of locust swarms has been implemented using “smart” start points [9]. In continuous search spaces, the fitness of a random point can be expected to have a correlation with the fitness of its nearby local optima, so random search can help identify promising areas for further exploration and exploitation. The role of this coarse search can be viewed as “scouts” which identify promising new areas for the swarm to exploit.

The resulting system has three sets of design parameters: scout parameters, swarm parameters, and general parameters. The primary role of the general parameters is to effectively allocate the allowed function evaluations (FE). In (5), N is the number of swarms – each of which explores a (local) optimum, S is the number of scouts, L is the number of locusts, and n is the number of iterations each swarm is allowed to use to find a local optimum.

FE = N * (S + L * n)    (5)
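The particle update in (1)-(4) can be sketched in Python. The helper name `pso_step` is hypothetical, and the fixed weights follow the paper’s equations as written (many PSO variants instead use randomized coefficients); a minimal sketch, not the benchmark implementation used in the experiments.

```python
def pso_step(position, velocity, pbest, gbest, step=1.0, M=0.5, G=0.5, C=0.5):
    """One particle update following (1)-(4): momentum plus attractions to
    the global best (gbest) and the personal best (pbest)."""
    new_velocity = [M * v + G * (gb - x) + C * (pb - x)          # (2)-(4)
                    for v, x, gb, pb in zip(velocity, position, gbest, pbest)]
    new_position = [x + step * v for x, v in zip(position, new_velocity)]  # (1)
    return new_position, new_velocity

# Illustrative use on a 3-dimensional problem.
position = [1.0, -2.0, 0.5]
velocity = [0.0, 0.0, 0.0]
pbest = [0.5, -1.0, 0.2]
gbest = [0.0, 0.0, 0.0]
position, velocity = pso_step(position, velocity, pbest, gbest)
```

Setting C = 0 in this sketch recovers the locust velocity update (11) described below.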

The scouts perform a coarse search around the previous optimum by adding a delta calculated in (6) to a small number of terms in a solution vector. By reducing the number of dimensions that are changed, locust swarms are able to exploit some information from the previous optimum. The effect of dimension reductions is a problem separation – exploration occurs only in the selected DIMr number of dimensions. The amount of exploration is also influenced by requiring scouts to be a minimum gap from the previous optimum (with some additional spacing as appropriate). Note: range is the range for (the given dimension of) the search space and randn() is a normally distributed random number.

Specifically, the first swarm starts with S scouts with a uniform random distribution throughout the search space. Subsequent swarms have scouts that are centred around the previous optimum. For a given dimension j, the scout’s value for that term will be calculated by (7) for D-DIMr dimensions and by (8) for DIMr dimensions. Each of the scouts will randomly select which DIMr dimensions it will change, and delta_j will be calculated DIMr times.

delta_j = ± range_j * (gap + abs(randn() * spacing))    (6)
scout_j = previousoptimum_j                             (7)
scout_j = previousoptimum_j + delta_j                   (8)

Keeping the L best scouts for the swarm, the initial velocities (9) launch the locusts away from the previous optimum (position_0 is the position of the scout) with a speed affected by the launch parameter. Each locust behaves like a particle in a standard PSO ((10) = (1)), but the velocity update (11) does not include an attraction to pbest (i.e. C = 0 in (2)) – multiple swarms are instead used to balance exploration with exploitation.

velocity_0 = launch * (position_0 - previousoptimum)    (9)
position_{i+1} = position_i + step * velocity_i         (10)
velocity_{i+1} = M * velocity_i + G * global_i          (11)
global_i = gbest - position_i                           (12)

The overall behaviour of locust swarms can be viewed from many perspectives. They can be viewed as an evolution strategy [14] (i.e. the scouts) with local optimization (i.e. the swarms). They can be viewed as a swarm-like tabu search [15] in which the gap parameter acts as a prevention mechanism to avoid repeated exploration of the same optimum. They can also be viewed as a chained/one-point restart method (as opposed to a population-based restart method like a memetic algorithm [16]) for particle swarm optimization.

III. BLACK-BOX OPTIMIZATION BENCHMARKING FUNCTIONS

The BBOB problems [11] come in five sets: 1–separable functions, 2–functions with low or moderate conditioning, 3–unimodal functions with high conditioning, 4–multi-modal functions with adequate global structure, and 5–multi-modal functions with weak global structure. In Table I, some key attributes of the functions are indicated – i.e. whether or not they are separable (s), unimodal (u), or have (adequate) global structure (gs), e.g. global convexity. Separable base functions (e.g. Rastrigin, BBOB 3) are converted into non-separable functions (e.g. BBOB 15) by applying a rotation matrix. Due to the computational costs of these rotations, the primary standard for reporting BBOB results is D = 20 dimensions. Since there is no explicit limit to the number of function evaluations that should be used on the BBOB problems, the value of FE = 5000 * D is taken from the LSGO competition [1].

TABLE I
BBOB FUNCTIONS

Set 1 – separable functions:
  1 Sphere
  2 Ellipsoidal, original
  3 Rastrigin
  4 Büche-Rastrigin
  5 Linear Slope
Set 2 – functions with low or moderate conditioning:
  6 Attractive Sector
  7 Step Ellipsoidal
  8 Rosenbrock, original
  9 Rosenbrock, rotated
Set 3 – unimodal functions with high conditioning:
  10 Ellipsoidal, rotated
  11 Discus
  12 Bent Cigar
  13 Sharp Ridge
  14 Different Powers
Set 4 – multi-modal functions with adequate global structure:
  15 Rastrigin, rotated
  16 Weierstrass
  17 Schaffers F7
  18 Schaffers F7, moderately ill-conditioned
  19 Composite Griewank-Rosenbrock F8F2
Set 5 – multi-modal functions with weak global structure:
  20 Schwefel
  21 Gallagher’s Gaussian 101-me Peaks
  22 Gallagher’s Gaussian 21-hi Peaks
  23 Katsuura
  24 Lunacek bi-Rastrigin

Names of the 24 functions in the BBOB problem set, grouped into the five BBOB sets – separable (s), unimodal (u), global structure (gs).

IV. ANALYZING THE ROLE OF DIMENSION REDUCTIONS IN LOCUST SWARMS

On the separable CEC2008 LSGO problems [1], it has been determined that the single largest influence on the

performance of locust swarms is the value of DIMr [10]. To understand this influence, consider the operation of locust swarms with DIMr = 1 and L = 1. With DIMr = 1, each scout will only change one term from the previous local optimum. The initial velocity (9) is along the line that connects the scout point with the previous local optimum, so the resulting particle trajectory during the swarm phase would be restricted to a one-dimensional line that keeps the remaining D-1 dimensions constant. In the limit, locust swarms have the ability to solve fully separable problems one term/dimension at a time. Locust swarms have been applied to the BBOB problem set [11] for D = 20 dimensions with DIMr = 1, 2, 3, 5, 10, and 20. Function evaluations were set to FE = 5000 * D = 100,000. Using locust and swarm parameters from [10], N = 5 * D = 100, S = 500, L = 5, and n = 100 in (5), step = 0.6 in (10), and M = G = 0.5 in (11). To accommodate the large variety of fitness landscapes in the BBOB problem set, the parameters in (6) and (9) were chosen to balance varying needs for exploration and exploitation – gap = 0.05, spacing = 0.1, and launch = 1.5. Using the first five instances of each BBOB problem, five trials for each were run for a total of 25 trials per problem. The means, standard deviations, and t-tests for these experiments are reported in Tables II-IV. The t-tests determine if the relative effects for different values of DIMr are significant. Focusing on the “separable functions” (set 1, BBOB 1-5) and the “multi-modal functions with adequate global structure” (set 4, BBOB 15-19), the effects of dimension reductions are highlighted in Fig. 1 and Fig. 2.
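The scout phase described by (6)-(8) can be sketched as follows; `make_scout` is a hypothetical helper name, and the defaults use the gap and spacing values from this section. Each scout copies the previous optimum and perturbs only DIMr randomly chosen dimensions, which is the dimension-reduction mechanism being analyzed.

```python
import random

def make_scout(previous_optimum, ranges, dim_r, gap=0.05, spacing=0.1):
    """Generate one scout per (6)-(8): keep D - DIMr terms from the previous
    optimum (7), and add a signed, gapped delta to DIMr randomly chosen terms (8)."""
    scout = list(previous_optimum)                      # (7) for unchanged terms
    changed = random.sample(range(len(scout)), dim_r)   # scout picks its DIMr terms
    for j in changed:
        # (6): magnitude is at least gap * range, spread by a normal deviate
        delta = ranges[j] * (gap + abs(random.gauss(0, 1) * spacing))
        scout[j] += random.choice((-1, 1)) * delta      # (8) with a random sign
    return scout

# Sanity check of the FE budget in (5) with this section's D = 20 settings:
N, S, L, n = 100, 500, 5, 100
assert N * (S + L * n) == 5000 * 20   # FE = 100,000
```

With dim_r = 1, each scout (and the swarm launched from it) explores along a single axis, which is how locust swarms can solve fully separable problems one term at a time.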


TABLE II
BBOB RESULTS FOR LOCUST SWARMS WITH D = 20

                                 DIMr
Set   f       1         2         3         5        10        20
 1    1    7.51e-7   8.51e-4   3.24e-3   2.02e-2   2.31e-1   1.29e+0
      2    1.29e+4   1.43e+4   1.38e+4   1.83e+4   7.60e+3   3.31e+4
      3    5.12e-1   5.24e+0   1.29e+1   3.17e+1   7.43e+1   1.06e+2
      4    3.26e+0   7.60e+0   1.61e+1   3.83e+1   9.01e+1   1.43e+2
      5    0.00e+0   0.00e+0   0.00e+0   0.00e+0   0.00e+0   0.00e+0
 2    6    2.70e+2   3.58e+0   6.11e+0   1.34e+1   2.89e+1   5.55e+1
      7    1.58e+1   3.98e+0   3.31e+0   4.52e+0   8.28e+0   1.70e+1
      8    1.07e+2   2.48e+1   3.68e+1   3.00e+1   3.95e+1   1.97e+2
      9    4.87e+1   2.70e+1   2.42e+1   2.30e+1   2.98e+1   4.62e+1
 3   10    8.66e+4   1.15e+4   6.15e+3   5.42e+3   1.02e+4   1.78e+4
     11    2.33e+2   6.37e+0   6.63e+0   1.44e+1   3.07e+1   5.86e+1
     12    7.69e+4   5.19e+2   1.91e+3   1.34e+4   1.46e+5   1.07e+6
     13    4.09e+1   1.55e+1   3.26e+1   4.24e+1   1.27e+2   2.46e+2
     14    4.83e-2   4.53e-3   6.47e-3   2.30e-2   1.47e-1   7.09e-1
 4   15    1.73e+2   6.88e+1   8.35e+1   9.13e+1   9.72e+1   1.05e+2
     16    1.32e+1   4.35e+0   2.83e+0   3.02e+0   6.93e+0   8.89e+0
     17    4.15e+0   1.35e+0   2.76e-1   4.33e-1   8.91e-1   1.41e+0
     18    1.16e+1   3.64e+0   2.16e+0   1.99e+0   3.30e+0   5.19e+0
     19    3.00e+0   2.85e+0   2.89e+0   2.92e+0   3.02e+0   3.11e+0
 5   20    8.97e-1   5.03e-1   9.37e-1   1.59e+0   2.07e+0   1.84e+0
     21    8.61e+0   5.43e+0   8.53e+0   7.08e+0   4.47e+0   9.22e+0
     22    5.48e+0   8.67e+0   7.64e+0   8.87e+0   8.96e+0   7.60e+0
     23    1.21e+0   9.27e-1   8.48e-1   8.57e-1   8.12e-1   8.69e-1
     24    1.14e+2   1.08e+2   1.09e+2   1.04e+2   1.13e+2   1.18e+2

Mean error from known optimum for locust swarms on the BBOB problem set with D = 20. For each value of DIMr = 1, 2, 3, 5, 10, and 20, 25 total trials were run – five independent trials on each of the first five instances of each BBOB function. On 17 of the 24 problems, locust swarms perform significantly better with dimension reductions (e.g. DIMr = 10) – see Table IV.

TABLE III
BBOB RESULTS FOR LOCUST SWARMS WITH D = 20

                                 DIMr
Set   f       1         2         3         5        10        20
 1    1    1.87e-6   1.11e-3   4.56e-3   1.64e-2   1.63e-1   5.80e-1
      2    1.09e+4   1.16e+4   1.32e+4   1.77e+4   8.75e+3   1.24e+4
      3    6.26e-1   1.06e+0   2.33e+0   3.05e+0   6.38e+0   1.02e+1
      4    1.64e+0   1.66e+0   2.16e+0   4.95e+0   1.01e+1   1.34e+1
      5    0.00e+0   0.00e+0   0.00e+0   0.00e+0   0.00e+0   0.00e+0
 2    6    6.14e+2   1.27e+0   2.04e+0   3.82e+0   6.23e+0   1.23e+1
      7    8.92e+0   2.11e+0   1.74e+0   1.47e+0   2.06e+0   4.44e+0
      8    5.74e+1   2.15e+1   2.89e+1   3.15e+1   2.03e+1   8.75e+1
      9    5.66e+1   3.11e+1   2.44e+1   1.84e+1   4.30e+1   5.67e+1
 3   10    5.11e+4   6.72e+3   4.95e+3   2.48e+3   3.45e+3   5.58e+3
     11    7.35e+1   6.35e+0   2.86e+0   3.94e+0   1.09e+1   1.29e+1
     12    2.26e+5   6.91e+2   4.24e+3   1.90e+4   9.08e+4   3.99e+5
     13    2.39e+1   1.30e+1   2.08e+1   1.94e+1   2.23e+1   3.90e+1
     14    1.42e-2   1.93e-3   2.81e-3   1.08e-2   6.93e-2   2.35e-1
 4   15    4.97e+1   1.51e+1   1.14e+1   9.97e+0   9.68e+0   8.12e+0
     16    5.11e+0   2.38e+0   1.23e+0   1.00e+0   1.01e+0   8.64e-1
     17    1.43e+0   1.10e+0   2.07e-1   1.12e-1   1.48e-1   2.23e-1
     18    6.65e+0   3.26e+0   1.54e+0   6.04e-1   5.61e-1   8.03e-1
     19    6.48e-1   3.85e-1   3.60e-1   3.55e-1   4.43e-1   3.11e-1
 5   20    6.64e-1   1.34e-1   1.30e-1   1.44e-1   2.24e-1   2.03e-1
     21    8.83e+0   7.50e+0   8.52e+0   7.96e+0   5.42e+0   1.20e+1
     22    6.62e+0   8.84e+0   7.66e+0   8.68e+0   8.71e+0   7.40e+0
     23    5.19e-1   1.99e-1   1.66e-1   1.41e-1   1.73e-1   1.56e-1
     24    1.25e+1   1.14e+1   1.12e+1   1.20e+1   1.35e+1   1.18e+1

Standard deviations in errors from known optimum for the data presented in Table II. Standard deviations on the Gallagher functions (BBOB 21 and 22) are often larger than the mean error – this randomness in the performance of locust swarms reflects the random fitness landscapes on these problems.


TABLE IV
BBOB RESULTS FOR LOCUST SWARMS WITH D = 20

                          DIMr pairs
Set   f     1-2     2-3     3-5     5-10    10-20
 1    1     0.1%    1.6%    0.0%    0.0%    0.0%
      2    64.8%   89.3%   30.5%    2.3%    0.0%
      3     0.0%    0.0%    0.0%    0.0%    0.0%
      4     0.0%    0.0%    0.0%    0.0%    0.0%
      5      –       –       –       –       –
 2    6     4.0%    0.0%    0.0%    0.0%    0.0%
      7    17.9%    0.0%    1.5%    0.0%    0.0%
      8    11.5%   44.1%   23.9%    0.0%    0.0%
      9    13.3%   73.1%   86.4%   40.6%   22.5%
 3   10    49.0%    0.0%    0.1%    0.0%    0.0%
     11    85.0%    0.0%    0.0%    0.0%    0.0%
     12    10.4%   10.2%    0.3%    0.0%    0.0%
     13     0.0%    0.1%    1.9%    0.0%    0.0%
     14     0.0%    0.4%    0.0%    0.0%    0.0%
 4   15     7.4%    0.0%    0.1%    1.2%    0.5%
     16    58.6%    0.0%    1.0%    0.0%    0.0%
     17     0.0%    0.0%    0.6%    0.0%    0.0%
     18    62.5%    0.0%    4.8%    0.0%    0.0%
     19    37.2%   63.1%   83.6%   35.5%   34.8%
 5   20     0.9%    0.0%    0.0%    0.0%    0.3%
     21    15.1%   16.8%   59.5%   23.4%    5.6%
     22    11.7%   62.0%   62.1%   97.5%   54.7%
     23    15.9%   85.3%   37.6%   25.1%    1.4%
     24     5.8%   73.3%   20.9%   14.7%    3.1%

Pair-wise comparisons on the effects of DIMr. Values represent t-test results to determine if the means in Table II could be drawn from the same distribution. Values of less than 5% pass the test of significance. No comparison is possible on function 5, where every value of DIMr leads to zero error.

Fig. 1. Relative performance of locust swarms on BBOB set 1 – separable functions. 1.00 represents the mean performance of locust swarms with DIMr = D = 20. Error bars represent one standard deviation. Lines in bold have significant differences in at least four out of five pairs (see Table IV).

Fig. 2. Relative performance of locust swarms on BBOB set 4 – multi-modal functions with adequate global structure. See Fig. 1 for more information on the presented data.

On the separable functions, there are clear trend lines on BBOB 1, 3, and 4 which show improving performance with decreasing values of DIMr (see Fig. 1). Smaller values of DIMr allow greater exploitation of function separability, and this can lead to large performance improvements. Conversely, on non-separable functions, there is no benefit to limiting the search process to one dimension (i.e. DIMr = 1), and the results of locust swarms on BBOB set 4 tend to be much worse with DIMr = 1 than with DIMr = D = 20. However, for BBOB 15-18, there are significant advantages obtained from dimension reductions (e.g. DIMr = 5) and clear trend lines on BBOB 16-18 that show improving performance as DIMr moves from 20 to 5 and worsening performance as DIMr moves from 3 to 1. (See Fig. 2.)

A particularly interesting pair of problems is BBOB 3 and BBOB 15 (Rastrigin and rotated Rastrigin). For DIMr = D = 20, the performance of locust swarms on both problems is essentially the same (see Table II) – the line (e.g. a search trajectory) connecting two points is the same line regardless of the co-ordinate system in which it is represented. On a problem with global structure, changing a term will either move the solution towards or away from the global optimum. When fewer terms are changed (with smaller DIMr), there is a greater probability that all changed terms will move in the right direction – e.g. 1 in 8 for DIMr = 3 vs. 1 in 1024 for DIMr = 10. The benefits of these changes are much higher when there is an axial bias to the problem (e.g. it is separable), and these benefits disappear altogether if all single-axis motions are tangential to the global structure.

V. LOCUST SWARMS VS. STANDARD PSO

The results of the previous section show that DIMr has a large influence on the performance of locust swarms. In this section, the performance of locust swarms is compared with a version of the “Constricted GBest” standard for particle swarm optimization (PSO) [17].
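The combinatorial argument above for BBOB 3 and 15 can be checked directly. This simple model – each changed term is assumed, for illustration only, to move in the helpful direction independently with probability 1/2 – is not part of the paper’s experiments:

```python
def p_all_helpful(dim_r):
    """Probability that all DIMr independently changed terms move toward the
    optimum, assuming each does so with probability 1/2."""
    return 0.5 ** dim_r

print(p_all_helpful(3))   # 1 in 8 for DIMr = 3
print(p_all_helpful(10))  # 1 in 1024 for DIMr = 10
```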
The specific benchmark PSO is the BBOB entry of El-Abd and Kamel [12] (which provides exact source code). For benchmarking purposes, it is common to use a single set of parameters for all problems, with the caveat that all techniques can benefit from additional problem-specific crafting. Therefore, the comparisons in this section will have DIMr = 5 even though locust swarms clearly perform better on the separable functions with DIMr = 1. The number of function evaluations for both locust swarms and the benchmark PSO is FE = 5000 * D.

For D = 20, locust swarms perform better on 8 problems, the benchmark PSO performs better on 6 problems, and the two methods are essentially tied on 10 problems (see Table V).

Footnote 1: A “Constricted LBest” version has been made by switching the benchmark PSO to a ring topology. This leads to similar improvements in PSO performance as reported in [17], but no changes to the fundamental results presented in this paper.

On the linear slope function (BBOB 5), the global optimum is on the edge of the search space, and the current implementation of


TABLE V
BBOB RESULTS FOR LOCUST SWARMS AND PSO WITH D = 20

            Locust Swarms           PSO
Set   f     mean      std dev    mean      std dev    %-diff     t-test
 1    1    2.02e-2   1.64e-2    4.85e-2   2.43e-1      58.5%     55.3%
      2    1.83e+4   1.77e+4    3.67e+2   1.04e+3   -4893.4%      0.0%
      3    3.17e+1   3.05e+0    3.06e+1   1.65e+1      -3.6%     73.7%
      4    3.83e+1   4.95e+0    2.60e+1   1.08e+1     -47.3%      0.0%
      5    0.00e+0   0.00e+0    1.42e+1   1.95e+1     100.0%      0.1%
 2    6    1.34e+1   3.82e+0    2.31e+1   4.53e+1      41.9%     30.1%
      7    4.52e+0   1.47e+0    1.12e+1   7.68e+0      59.7%      0.0%
      8    3.00e+1   3.15e+1    1.74e+1   4.15e+1     -72.3%     24.7%
      9    2.30e+1   1.84e+1    2.65e+1   2.51e+1      13.1%     60.7%
 3   10    5.42e+3   2.48e+3    5.36e+3   5.46e+3      -1.1%     96.0%
     11    1.44e+1   3.94e+0    1.90e+1   8.17e+0      24.0%      1.7%
     12    1.34e+4   1.90e+4    7.78e+4   3.89e+5      82.7%     41.9%
     13    4.24e+1   1.94e+1    1.09e+1   1.11e+1    -290.7%      0.0%
     14    2.30e-2   1.08e-2    5.20e-4   1.31e-4   -4312.8%      0.0%
 4   15    9.13e+1   9.97e+0    5.89e+1   2.15e+1     -55.1%      0.0%
     16    3.02e+0   1.00e+0    5.25e+0   1.83e+0      42.5%      0.0%
     17    4.33e-1   1.12e-1    1.15e+0   5.76e-1      62.5%      0.0%
     18    1.99e+0   6.04e-1    3.65e+0   2.05e+0      45.4%      0.1%
     19    2.92e+0   3.55e-1    3.98e+0   5.65e-1      26.6%      0.0%
 5   20    1.59e+0   1.44e-1    1.13e+0   2.38e-1     -39.8%      0.0%
     21    7.08e+0   7.96e+0    9.56e+0   1.27e+1      25.9%     34.9%
     22    8.87e+0   8.68e+0    9.26e+0   1.11e+1       4.2%     88.0%
     23    8.57e-1   1.41e-1    1.64e+0   3.53e-1      47.9%      0.0%
     24    1.04e+2   1.20e+1    1.03e+2   2.78e+1      -1.0%     86.5%

Means and standard deviations (std dev) for 25 trials of locust swarms with DIMr = 5 and the benchmark PSO on the BBOB problem set with D = 20. Percent difference (%-diff) shows relative performance of locust swarms (a) with respect to PSO performance (b) – (b-a)/b. Performance differences with t-test values below 5% are considered significant.
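The %-diff column in Tables V-VIII is computed as (b - a)/b, where a is the locust swarm mean and b is the PSO mean. A small helper (`pct_diff` is a hypothetical name) illustrates the calculation; the tabulated values were presumably computed before rounding, so recomputing from the rounded means may differ in the last digit:

```python
def pct_diff(locust_mean, pso_mean):
    """Relative performance of locust swarms (a) with respect to PSO (b):
    100 * (b - a) / b, so positive values favour locust swarms."""
    return 100.0 * (pso_mean - locust_mean) / pso_mean

# BBOB 1 at D = 20 (Table V): a = 2.02e-2, b = 4.85e-2
print(round(pct_diff(2.02e-2, 4.85e-2), 1))  # close to the 58.5% in Table V
```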

TABLE VI
BBOB RESULTS FOR LOCUST SWARMS AND PSO WITH D = 50

            Locust Swarms           PSO
Set   f     mean      std dev    mean      std dev    %-diff     t-test
 1    1    8.63e-2   5.72e-2    0.00e+0   0.00e+0       -∞        0.0%
      2    4.76e+4   2.10e+4    2.36e+4   6.15e+4    -101.6%      8.7%
      3    9.16e+1   6.84e+0    1.67e+2   6.40e+1      45.2%      0.0%
      4    1.13e+2   9.91e+0    1.72e+2   5.89e+1      34.2%      0.0%
      5    0.00e+0   0.00e+0    8.37e+1   4.09e+1     100.0%      0.0%
 2    6    4.84e+1   8.03e+0    2.44e+2   2.01e+2      80.2%      0.0%
      7    2.50e+1   5.73e+0    9.94e+1   5.43e+1      74.9%      0.0%
      8    7.18e+1   2.94e+1    3.44e+2   1.22e+3      79.1%     27.6%
      9    7.68e+1   3.88e+1    5.75e+1   3.47e+1     -33.6%     10.9%
 3   10    4.14e+4   1.75e+4    4.21e+4   2.15e+4       1.8%     88.4%
     11    2.82e+1   5.49e+0    7.58e+1   2.91e+1      62.8%      0.0%
     12    7.97e+4   7.17e+4    3.72e+6   1.65e+7      97.9%     28.2%
     13    9.68e+1   2.81e+1    9.03e+1   1.58e+2      -7.2%     82.9%
     14    7.12e-2   1.70e-2    4.97e-1   1.20e+0      85.7%      8.7%
 4   15    3.84e+2   1.74e+1    2.61e+2   7.86e+1     -47.2%      0.0%
     16    6.35e+0   1.46e+0    1.41e+1   3.40e+0      54.9%      0.0%
     17    1.37e+0   7.32e-1    4.30e+0   1.02e+0      68.0%      0.0%
     18    5.41e+0   1.69e+0    1.40e+1   4.03e+0      61.4%      0.0%
     19    6.09e+0   2.92e-1    7.06e+0   5.58e-1      13.7%      0.0%
 5   20    1.83e+0   7.54e-2    2.86e+2   1.08e+3      99.4%     20.1%
     21    5.22e+0   4.96e+0    6.86e+0   8.19e+0      23.9%     30.2%
     22    8.50e+0   7.67e+0    1.12e+1   1.46e+1      24.0%     42.0%
     23    1.95e+0   1.78e-1    2.99e+0   4.76e-1      35.0%      0.0%
     24    4.59e+2   2.36e+1    4.86e+2   5.67e+1       5.7%      1.5%

Means and standard deviations (std dev) for 25 trials of locust swarms with DIMr = 5 and the benchmark PSO on the BBOB problem set with D = 50. Percent difference (%-diff) shows relative performance of locust swarms with respect to PSO performance. Performance differences with t-test values below 5% are considered significant.

TABLE VII
BBOB RESULTS FOR LOCUST SWARMS AND PSO WITH D = 100

            Locust Swarms           PSO
Set   f     mean      std dev    mean      std dev    %-diff     t-test
 1    1    3.17e-1   1.46e-1    4.18e+0   6.82e+0      92.4%      0.9%
      2    1.06e+5   3.21e+4    1.06e+5   1.58e+5      -0.3%     99.2%
      3    1.97e+2   1.18e+1    4.21e+2   1.41e+2      53.1%      0.0%
      4    2.50e+2   1.32e+1    5.27e+2   1.49e+2      52.6%      0.0%
      5    0.00e+0   0.00e+0    3.75e+2   1.24e+2     100.0%      0.0%
 2    6    1.08e+2   1.80e+1    7.91e+2   2.48e+2      86.3%      0.0%
      7    7.38e+1   1.14e+1    3.58e+2   7.53e+1      79.4%      0.0%
      8    1.47e+2   4.66e+1    1.51e+4   3.82e+4      99.0%      6.2%
      9    1.48e+2   5.27e+1    8.28e+2   2.41e+3      82.1%     17.2%
 3   10    8.82e+4   1.76e+4    1.52e+5   4.17e+4      41.8%      0.0%
     11    5.21e+1   7.09e+0    1.86e+2   6.72e+1      72.0%      0.0%
     12    2.42e+5   1.56e+5    7.58e+6   9.82e+6      96.8%      0.1%
     13    1.45e+2   2.83e+1    3.49e+2   2.55e+2      58.5%      0.0%
     14    1.65e-1   2.93e-2    3.81e+0   3.80e+0      95.7%      0.0%
 4   15    9.85e+2   4.48e+1    8.80e+2   2.23e+2     -12.0%      2.6%
     16    9.69e+0   1.17e+0    2.19e+1   3.63e+0      55.8%      0.0%
     17    4.55e+0   1.34e+0    5.66e+0   1.02e+0      19.5%      0.1%
     18    1.06e+1   2.66e+0    2.08e+1   3.72e+0      49.3%      0.0%
     19    8.30e+0   1.61e-1    1.03e+1   1.08e+0      19.2%      0.0%
 5   20    1.88e+0   6.68e-2    5.37e+2   1.17e+3      99.6%      3.1%
     21    7.52e+0   1.12e+1    9.80e+0   1.00e+1      23.3%     37.1%
     22    1.04e+1   1.42e+1    1.16e+1   1.68e+1      11.1%     56.5%
     23    2.72e+0   1.88e-1    3.69e+0   4.10e-1      26.5%      0.0%
     24    1.18e+3   5.98e+1    1.32e+3   1.08e+2      10.6%      0.0%

Means and standard deviations (std dev) for 25 trials of locust swarms with DIMr = 5 and the benchmark PSO on the BBOB problem set with D = 100. Percent difference (%-diff) shows relative performance of locust swarms with respect to PSO performance. Performance differences with t-test values below 5% are considered significant.

TABLE VIII
BBOB RESULTS FOR LOCUST SWARMS AND PSO WITH D = 200

            Locust Swarms           PSO
Set   f     mean      std dev    mean      std dev    %-diff     t-test
 1    1    7.94e-1   1.51e-1    2.72e+1   1.98e+1      97.1%      0.0%
      2    2.13e+5   5.12e+4    7.35e+5   6.55e+5      71.0%      0.1%
      3    4.21e+2   1.50e+1    1.07e+3   1.53e+2      60.8%      0.0%
      4    5.33e+2   1.68e+1    1.43e+3   3.65e+2      62.6%      0.0%
      5    0.00e+0   0.00e+0    1.14e+3   2.35e+2     100.0%      0.0%
 2    6    3.54e+2   5.50e+1    2.41e+3   4.61e+2      85.3%      0.0%
      7    2.36e+2   2.81e+1    1.52e+3   2.90e+2      84.5%      0.0%
      8    3.37e+2   9.86e+1    1.63e+5   2.64e+5      99.8%      0.5%
      9    3.23e+2   9.77e+1    2.08e+4   2.21e+4      98.4%      0.0%
 3   10    2.34e+5   4.21e+4    6.27e+5   1.87e+5      62.7%      0.0%
     11    8.78e+1   8.51e+0    4.95e+2   7.45e+1      82.3%      0.0%
     12    6.22e+5   2.57e+5    6.83e+7   4.43e+7      99.1%      0.0%
     13    2.15e+2   2.29e+1    9.86e+2   3.58e+2      78.2%      0.0%
     14    3.26e-1   6.40e-2    1.40e+1   6.29e+0      97.7%      0.0%
 4   15    2.25e+3   2.42e+2    2.64e+3   4.93e+2      14.7%      0.0%
     16    1.41e+1   8.98e-1    3.12e+1   3.38e+0      54.9%      0.0%
     17    9.13e+0   1.15e+0    7.92e+0   6.53e-1     -15.3%      0.0%
     18    2.58e+1   3.50e+0    3.08e+1   4.03e+0      16.4%      0.0%
     19    1.04e+1   2.13e-1    1.61e+1   2.96e+0      35.6%      0.0%
 5   20    2.00e+0   5.20e-2    6.28e+3   8.32e+3     100.0%      0.1%
     21    3.26e+0   3.35e+0    5.51e+0   5.12e+0      40.9%      3.4%
     22    5.69e+0   7.20e+0    8.36e+0   6.06e+0      31.9%     15.9%
     23    2.56e+0   1.15e-1    3.19e+0   1.89e-1      19.5%      0.0%
     24    3.05e+3   1.00e+2    3.58e+3   2.77e+2      15.0%      0.0%

Means and standard deviations (std dev) for 25 trials of locust swarms with DIMr = 5 and the benchmark PSO on the BBOB problem set with D = 200. Percent difference (%-diff) shows relative performance of locust swarms with respect to PSO performance. Performance differences with t-test values below 5% are considered significant.

locust swarms moves scouts that fall outside of the search space back to its edge. On the Gallagher functions (BBOB 21 and 22), the search space has random peaks which should benefit a multi-optima technique, but the gap and spacing may not be large enough to ensure that each swarm has a reasonable chance of escaping from the most recent peak.

The results for D = 20 are hardly a conclusive demonstration of the benefits of dimension reductions. However, 20 dimensions is still a relatively low-dimension search space. To the extent that dimension reductions can exploit separability to combat the “curse of dimensionality” (even on non-separable problems), experiments on higher-dimension problems are required. Tables VI-VIII show the results for locust swarms with DIMr = 5 and the benchmark PSO on the BBOB problem set with D = 50, 100, and 200. For D = 50, locust swarms perform better on 12 problems, PSO performs better on 2 problems, and the two methods are essentially tied on 10 problems. For D = 100, locust swarms perform better on 18 problems, PSO performs better on 1 problem, and the two methods are essentially tied on 5 problems. Finally, for D = 200, locust swarms perform better on 22 problems, PSO performs better on 1 problem, and the two methods are essentially tied on 1 problem.

VI. DISCUSSION

The effects of dimensionality are clearly visible in the performance of the benchmark PSO (see Table IX). As the dimension D increases from 20 to 200, the mean errors from the optimal solutions can increase dramatically – on ten of the problems, the mean error increases by over 100 times. Although dimensionality also affects the performance of locust swarms (see Table X), the increases in mean errors are substantially less on average.

The "devour and move on" strategy for locust swarms was developed with multi-modal fitness landscapes in mind – find a (local) optimum, use scouts to search the surrounding area, try to find a better optimum. The performance of locust swarms on multi-modal functions (e.g. BBOB set 4) is generally better than standard PSO, and the relative performance tends to improve with increasing dimensionality. However, the changes in relative performance as D increases are much greater on the unimodal functions in BBOB set 3. (See Tables V-X.)

One reason why dimensionality may affect the performance of standard PSOs on these unimodal functions is that different dimensions can converge at different rates (e.g. [18]). On non-separable problems, the best value for a term at a given stage of the search process may not be the same value it will have at the global optimum. If such a dimension of the PSO becomes fully converged at an early stage of the search process (at a locally optimal value), the eventual effectiveness of the PSO can be limited by this premature convergence. One potential advantage of locust swarms is that the restarts can help keep all dimensions active throughout the


TABLE IX
EFFECTS OF DIMENSIONALITY ON PSO PERFORMANCE

Set   Fn    D = 20   D = 50   D = 100   D = 200
 1     1      1.0      0.0      86.1     559.4
       2      1.0     64.3     288.6    2003.3
       3      1.0      5.5      13.7      35.1
       4      1.0      6.6      20.3      54.9
       5      1.0      5.9      26.4      80.5
 2     6      1.0     10.6      34.2     104.2
       7      1.0      8.9      31.8     135.5
       8      1.0     19.8     867.9    9335.4
       9      1.0      2.2      31.2     785.7
 3    10      1.0      7.9      28.3     116.9
      11      1.0      4.0       9.8      26.0
      12      1.0     47.8      97.4     878.1
      13      1.0      8.3      32.1      90.8
      14      1.0    955.3    7314.4   26833.9
 4    15      1.0      4.4      15.0      44.8
      16      1.0      2.7       4.2       5.9
      17      1.0      3.7       4.9       6.9
      18      1.0      3.8       5.7       8.4
      19      1.0      1.8       2.6       4.1
 5    20      1.0    252.0     473.8    5534.5
      21      1.0      0.7       1.0       0.6
      22      1.0      1.2       1.3       0.9
      23      1.0      1.8       2.2       1.9
      24      1.0      4.7      12.8      34.6
mean          1.0     59.3     391.9    1945.1

Values represent ratio of mean error to PSO performance with D = 20 dimensions.

TABLE X
EFFECTS OF DIMENSIONALITY ON LOCUST SWARM PERFORMANCE

Set   Fn    D = 20   D = 50   D = 100   D = 200
 1     1      1.0      4.3      15.7      39.4
       2      1.0      2.6       5.8      11.6
       3      1.0      2.9       6.2      13.3
       4      1.0      2.9       6.5      13.9
       5      1.0      1.0       1.0       1.0
 2     6      1.0      3.6       8.1      26.4
       7      1.0      5.5      16.3      52.2
       8      1.0      2.4       4.9      11.2
       9      1.0      3.3       6.4      14.0
 3    10      1.0      7.6      16.3      43.1
      11      1.0      2.0       3.6       6.1
      12      1.0      5.9      18.0      46.3
      13      1.0      2.3       3.4       5.1
      14      1.0      3.1       7.2      14.2
 4    15      1.0      4.2      10.8      24.7
      16      1.0      2.1       3.2       4.7
      17      1.0      3.2      10.5      21.1
      18      1.0      2.7       5.3      12.9
      19      1.0      2.1       2.8       3.6
 5    20      1.0      1.2       1.2       1.3
      21      1.0      0.7       1.1       0.5
      22      1.0      1.0       1.2       0.6
      23      1.0      2.3       3.2       3.0
      24      1.0      4.4      11.3      29.2
mean          1.0      3.1       7.1      16.6

Values represent ratio of mean error to locust swarm performance with D = 20 dimensions.
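The ratios reported in Tables IX and X are simple to reproduce from raw mean errors: each dimension's result is divided by the D = 20 result for the same function. A short sketch with hypothetical error values (not the paper's data):

```python
def dimensionality_ratios(mean_errors, base_dim=20):
    """Express each dimension's mean error as a ratio of the D = 20 result."""
    base = mean_errors[base_dim]
    return {d: err / base for d, err in mean_errors.items()}

# Hypothetical mean errors for one BBOB function at each tested dimension
errors = {20: 0.5, 50: 2.95, 100: 6.85, 200: 17.55}
print(dimensionality_ratios(errors))  # {20: 1.0, 50: 5.9, 100: 13.7, 200: 35.1}
```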

search process. Further, since different dimensions can converge at different rates, all dimensions may not require equal attention at all stages of the search process. Specifically, if one term has converged to a good value (e.g. on a separable problem), then it may be beneficial to leave this term unchanged while the search process focuses on other terms. The use of dimension reductions allows locust swarms to exploit this potential benefit.

The effects of dimension reductions also have similarities to the effects of a crossover factor less than 1 in differential evolution [19] – i.e. some terms will be held constant from one solution to the next. In general, crossover factors of less than 1 are recommended for differential evolution, with 0.9 being a typical value (e.g. [20]). Compared to DIMr = 5 on problems with D = 20, 50, 100, and 200, a crossover factor of 0.9 could change a much larger number of terms between each pair of solutions.

Some implementations of differential evolution have evolvable crossover factors (e.g. [2]), and a similar ability for adaptations in DIMr may also help the performance of locust swarms. In particular, locust swarms tend to perform better on the BBOB problem set with D = 20 using DIMr = 3 instead of DIMr = 5 (see Table II). However, DIMr = 5 was used for the experiments with D = 50, 100, and 200 because it was expected (but not confirmed due to computational costs) that DIMr = 3 would be too small (like DIMr = 2 with D = 20 – see Fig. 2). A further analysis of adaptive DIMr and the relationship between DIMr and the crossover factor in differential evolution [21] are promising areas for further research.
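The dimension reductions described above can be illustrated with a short sketch: when a new scout is generated, only DIMr randomly chosen dimensions are perturbed, and every other term is copied unchanged from the previous optimum. The function name and the uniform perturbation below are illustrative assumptions, not the paper's exact scout-generation mechanism:

```python
import random

def reduced_dimension_scout(base, dim_r, step=1.0):
    """Perturb only dim_r randomly chosen dimensions; hold all others constant.

    This is the hypothetical core of a "dimension reduction": the remaining
    len(base) - dim_r terms keep their (possibly converged) values.
    """
    scout = list(base)
    for i in random.sample(range(len(base)), dim_r):
        scout[i] += random.uniform(-step, step)
    return scout

random.seed(0)  # for a repeatable demonstration
base = [0.0] * 20                                  # previous optimum, D = 20
scout = reduced_dimension_scout(base, dim_r=5)     # DIMr = 5
changed = sum(1 for a, b in zip(base, scout) if a != b)
print(changed)  # 5 -- only DIMr of the 20 terms are modified
```

This also makes the comparison to differential evolution concrete: a crossover factor of 0.9 would change roughly 0.9 * D terms per new solution, which for D = 200 is far more than the 5 terms changed here.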

VII. SUMMARY

The performance of particle swarm optimization can degrade dramatically with increasing dimensionality. Using dimension reductions can help improve the performance of locust swarms on both separable and non-separable problems. The resulting locust swarms are less susceptible to the "curse of dimensionality", and their performance advantage over standard PSO tends to grow with increasing dimensionality. Other search techniques such as differential evolution also have the ability to reduce the number of dimensions that they actively search. Understanding and exploiting the effects of low-dimension searches is an area of on-going research.

REFERENCES

[1] K. Tang, X. Yao, P. N. Suganthan, C. MacNish, Y. P. Chen, C. M. Chen, and Z. Yang, "Benchmark Functions for the CEC'2008 Special Session and Competition on Large Scale Global Optimization," Technical Report, http://www.ntu.edu.sg/home/EPNSugan, 2007.
[2] L.-Y. Tseng and C. Chen, "Multiple Trajectory Search for Large Scale Global Optimization," in Proc. 2008 IEEE Congress on Evolutionary Computation, pp. 3052–3059.
[3] J. Brest, A. Zamuda, B. Boskovic, M. S. Maucec, and V. Zumer, "High-Dimensional Real-Parameter Optimization using Self-Adaptive Differential Evolution Algorithm with Population Size Reduction," in Proc. 2008 IEEE Congress on Evolutionary Computation, pp. 2032–2039.
[4] Z. Yang, K. Tang, and X. Yao, "Multilevel Cooperative Coevolution for Large Scale Optimization," in Proc. 2008 IEEE Congress on Evolutionary Computation, pp. 1663–1670.
[5] S.-T. Hsieh, T.-Y. Sun, C.-C. Liu, and S.-J. Tsai, "Solving Large Scale Global Optimization Using Improved Particle Swarm Optimizer," in Proc. 2008 IEEE Congress on Evolutionary Computation, pp. 1777–1784.
[6] S. Z. Zhao, J. J. Liang, P. N. Suganthan, and M. F. Tasgetiren, "Dynamic Multi-Swarm Particle Swarm Optimizer with Local Search for Large Scale Global Optimization," in Proc. 2008 IEEE Congress on Evolutionary Computation, pp. 3845–3852.
[7] J. Kennedy and R. C. Eberhart, "Particle Swarm Optimization," in Proc. IEEE International Conference on Neural Networks, 1995, pp. 1942–1948.
[8] S. Chen, "Locust Swarms – A New Multi-Optima Search Technique," in Proc. 2009 IEEE Congress on Evolutionary Computation, pp. 1745–1752.
[9] S. Chen, K. Miura, and S. Razzaqi, "Analyzing the Role of "Smart" Start Points in Coarse Search-Greedy Search," in Lecture Notes in Computer Science, Vol. 4828: Proceedings of Third Australian Conference on Artificial Life, M. Randall, H. Abbass, and J. Wiles, Eds. Springer-Verlag, 2007, pp. 13–24.
[10] S. Chen, "An Analysis of Locust Swarms on Large Scale Global Optimization Problems," in Lecture Notes in Computer Science, Vol. 5865: Proceedings of Fourth Australian Conference on Artificial Life, K. Korb, M. Randall, and T. Hendtlass, Eds. Springer-Verlag, 2009, pp. 211–220.
[11] N. Hansen, S. Finck, R. Ros, and A. Auger, "Real-Parameter Black-Box Optimization Benchmarking 2009: Noiseless Functions Definitions," INRIA Technical Report RR-6829, 2009.
[12] M. El-Abd and M. S. Kamel, "Black-Box Optimization Benchmarking for Noiseless Function Testbed using Particle Swarm Optimization," in Proc. GECCO'09, pp. 2269–2273.
[13] T. Hendtlass, "WoSP: A Multi-Optima Particle Swarm Algorithm," in Proc. 2005 IEEE Congress on Evolutionary Computation, pp. 727–734.
[14] H.-G. Beyer and H.-P. Schwefel, "Evolution Strategies: A comprehensive introduction," Natural Computing, vol. 1, pp. 3–52, 2002.
[15] F. Glover, "Tabu Search – Part I," ORSA Journal on Computing, vol. 1(3), pp. 190–206, 1989.
[16] M. G. Norman and P. Moscato, "A Competitive and Cooperative Approach to Complex Combinatorial Search," in Proc. 20th Informatics and Operations Research Meeting, 1991, pp. 3.15–3.29.
[17] D. Bratton and J. Kennedy, "Defining a Standard for Particle Swarm Optimization," in Proc. 2007 IEEE Swarm Intelligence Symposium, pp. 120–127.
[18] T. Hendtlass, "Particle Swarm Optimisation and High Dimensional Problem Spaces," in Proc. 2009 IEEE Congress on Evolutionary Computation, pp. 1988–1994.
[19] R. Storn and K. Price, "Differential Evolution – A Simple and Efficient Heuristic for Global Optimization over Continuous Spaces," Journal of Global Optimization, vol. 11, pp. 341–359, 1997.
[20] J. Montgomery, "Differential Evolution: Difference Vectors and Movement in Solution Space," in Proc. 2009 IEEE Congress on Evolutionary Computation, pp. 2833–2840.
[21] J. Montgomery and S. Chen, "An Analysis of the Operation of Differential Evolution at High and Low Crossover Rates," in Proc. 2010 IEEE Congress on Evolutionary Computation.

