
Memetic Algorithm using Multi-Surrogates for Computationally Expensive Optimization Problems

Zongzhao Zhou¹, Yew Soon Ong¹, Meng Hiot Lim², Bu Sung Lee¹

¹ School of Computer Engineering, Nanyang Technological University, Singapore 639798 (phone: +65-6790-6448; fax: +65-6792-6559; email: {zhou0018, asysong, ebslee}@ntu.edu.sg)

² School of Electrical and Electronic Engineering, Block S1, Nanyang Technological University, Singapore 639798 (email: [email protected])

Received: date / Revised version: date

Abstract In this paper, we present a Multi-Surrogates Assisted Memetic Algorithm (MSAMA) for solving optimization problems with computationally expensive fitness functions. The essential backbone of our framework is an evolutionary algorithm coupled with a local search solver that employs multi-surrogates in the spirit of Lamarckian learning. Inspired by the notion of ‘blessing and curse of uncertainty’ in approximation models, we combine regression and exact interpolating surrogate models in the evolutionary search. Empirical results are presented for a series of commonly used benchmark problems to demonstrate that the proposed framework converges to good solution quality more efficiently than the standard Genetic Algorithm (GA), Memetic Algorithm (MA) and Surrogate-Assisted Memetic Algorithms (SAMAs).


Key words Evolutionary optimization, Memetic Algorithm, Surrogate model, Radial Basis Function, Polynomial Regression

1 Introduction

A continuing trend in science and engineering is the use of increasingly accurate simulation codes in the design and analysis process so as to produce ever more reliable and high-quality products. Modern Computational Structural Mechanics (CSM), Computational Electro-Magnetics (CEM) and Computational Fluid Dynamics (CFD) analyses represent some of the recent technologies that now play a central role in helping scientists validate crucial designs and study the effect of altering key design parameters on product performance with astonishing accuracy. Nonetheless, the use of accurate simulation methods is often very time consuming, leading to possibly unrealistic design cycle times. For example, in a variety of contexts such as drug design, aerospace design, multidisciplinary structural systems, synthesis of proteins in vitro and rainfall prediction, establishing the quality of a potential solution can take from many minutes to hours or months of supercomputer time. Hence, the overwhelming part of the total run time in such optimization problems is taken up by runs of the computationally expensive simulation codes. This poses a serious impediment to the practical application of existing optimization algorithms for automatically establishing the critical design parameters of real-world problems in science and engineering. In particular, modern stochastic optimization methods such as Evolutionary Algorithms (EAs) typically require


many thousands of function calls to the simulation codes in order to locate a near-optimal solution. Since the EA search cycle time is directly proportional to the number of calls to the expensive fitness function, it is now common practice for computationally cheap approximation models to be used in lieu of the exact models to reduce computational cost, as reported for aerodynamic airfoil and aircraft designs in [1], [2], [3], [4], [5]. Specifically, the approximation models are used to replace calls to the computationally expensive codes as often as possible in the evolutionary search process. Using approximation models, the computational burden can be greatly reduced since the effort involved in building the surrogates and optimizing using them is much lower than in the standard approach of directly coupling the simulation codes with the optimizer. This class of evolutionary algorithms that employ approximation models to enhance search efficiency is often referred to as Surrogate-Assisted Evolutionary Algorithms, or SAEAs for short [1], [2], [3], [6]. To date, extensive research on SAEAs has emphasized reducing the effect of the 'curse of uncertainty', a term used here to refer to the impairment of evolutionary search performance as a result of the approximation errors in the surrogate. In particular, most SAEAs have sought approximation techniques that are capable of modelling complex landscapes accurately so as to reduce the effect of the 'curse of uncertainty'. For instance, most of the recently proposed SAEA frameworks [1], [2], [6], [7], [8] have opted for local over global surrogates since the former are believed to provide better prediction accuracy. Besides the 'curse of uncertainty', we also introduce the notion of the 'blessing of uncertainty', which refers


to the benefits conferred by approximation inaccuracies on evolutionary search performance [9]. In this paper, we begin with a detailed study of the impact of uncertainty, i.e., the 'curse of uncertainty' and the 'blessing of uncertainty', in the approximated function on Surrogate-Assisted Memetic Algorithm (SAMA) search performance. Inspired by the effects of the 'curse and blessing of uncertainty' on SAMA search, we propose a Multi-Surrogates Assisted Memetic Algorithm (MSAMA) for solving optimization problems with computationally expensive fitness functions. In contrast to earlier works, the essential backbone of our proposed MSAMA framework is an evolutionary algorithm coupled with local solvers that employ multiple surrogates. In particular, both regression and interpolating local surrogates are used in the Lamarckian learning process to provide a diversity of approximation models in MSAMA. Lamarckian learning forces the genotype to reflect the result of improvement by placing the locally improved individual back into the population to compete for reproductive opportunities [10]. With the use of multiple surrogates, we show that MSAMA converges to good solution quality more efficiently than the standard GA, MA and other SAMA variants. The remainder of this paper is organized as follows. In the next two sections, we present a brief survey of recent state-of-the-art SAEAs and an overview of the two surrogate modelling techniques used in the present study. The standard MA and the SAMA used for investigating the impact of the uncertainty posed by different surrogates on evolutionary search are outlined in Section 4. In Section 5, we describe the proposed MSAMA that employs multiple surrogates for enhancing evolutionary search performance, and demonstrate the numerical performance of the GA, MA, SAMA and MSAMA on a series of commonly used benchmark functions. Section 6 concludes this paper.

2 Related Work

A variety of techniques for the construction of surrogates, often also referred to as metamodels or approximation models, have been used in engineering design optimization. Among these, Polynomial Regression (PR, also known as the response surface method) [11][12], Artificial Neural Networks (ANN) [13], Radial Basis Functions (RBF) [1][2][3] and Gaussian Processes (GP) (also referred to as Kriging or Design and Analysis of Computer Experiments (DACE) models) [2][6][14][15][16] are the most prominent and commonly used techniques. Over the years, a series of SAEA frameworks using different approximation techniques have been introduced. Ratle [6] and El-Beltagy et al. [15] examined strategies for integrating evolutionary search with global surrogates based on Kriging. Various strategies using GP global surrogates have also been considered by Büche et al. [14] and Ulmer et al. [16]. However, since the idea of constructing accurate global surrogates might be fundamentally flawed due to the 'curse of dimensionality', online local surrogates using RBF were considered in Ong et al. [1], Giannakoglou et al. [3], Emmerich et al. [7] and Regis et al. [8], [17], in place of global models. Possible synergy between global and local surrogates for accelerating SAEA search was first investigated in [2], while [18] considered using gradient information to enhance the prediction accuracy of the constructed surrogates. In [19], neural network ensemble models were also considered for improving the quality of the fitness evaluations used in the SAEA. The use of approximation models in the context of multi-objective evolutionary optimization of computationally expensive problems can also be found in Emmerich et al. [7], Nain et al. [20], Xuan et al. [21] and Knowles [22]. For greater detail on some of the evolutionary computation frameworks that employ approximation models, the reader is referred to the survey papers [23] and [24].

3 Surrogate Modeling

In this section, we give a brief overview of the two surrogate modelling techniques used in the present study. An important question that arises when choosing approximation methods in practice is whether one should use interpolation or regression techniques to construct surrogates when the observational data is generated using computer models. Here, since we are primarily concerned with approximating deterministic computer models that are assumed not to suffer from numerically induced convergence or discretization noise, perfectly interpolating models are most germane to our concerns. The effect of the 'curse of uncertainty' on evolutionary search is thus eliminated at the training samples when using exact interpolating models. In contrast, regression techniques do not model the training data exactly but generalize from the training samples, for example in a least-squares sense. To investigate and utilize the effects of the 'curse and blessing of uncertainty', we use least-squares regression based on low-order polynomials (PR), otherwise known as response surface methods, such that the approximation error can be significant even at the training samples. Let D = {x_i, t_i}, i = 1, . . . , n, denote the training dataset, where x_i = (x_{i1}, x_{i2}, . . . , x_{id}) ∈ R^d is the input design vector, d denotes the dimension of the problem, and t_i ∈ R is the corresponding target value, i.e., t_i = f(x_i) = f(x_{i1}, x_{i2}, . . . , x_{id}).

3.1 Polynomial Regression (PR)

The Polynomial Regression (PR) technique can easily be fitted to data with a least-squares approach. The theory of least-squares polynomial curve fitting is well known, and many computer programs have been written to fit polynomials of arbitrary order to multi-dimensional input data [11]. Define an exponent vector ε as a vector of positive integers (π_1, π_2, . . . , π_d) and x^ε as (x_1^{π_1}, x_2^{π_2}, . . . , x_d^{π_d}). Given a set of exponent vectors ε_1, ε_2, . . . , ε_m and the set of data (x_i, t_i), i = 1, 2, . . . , n, the PR model t̂ of (m − 1)th order is of the form

\hat{t} = \hat{f}(x) = \sum_{j=1}^{m} C_j x^{\varepsilon_j}, \quad (1)

where C_1, C_2, . . . , C_m are the coefficients to be estimated by minimizing the least-squares error

E = \sum_{i=1}^{n} [t_i - \hat{t}_i]^2. \quad (2)


Here, we consider 2nd-order polynomial models, which provide good generalization capabilities at low computational cost.
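To make the construction concrete, the following sketch fits a 2nd-order PR surrogate by ordinary least squares, i.e., it minimizes E in Eq. (2) over the (d + 1)(d + 2)/2 quadratic monomial terms. This is a minimal illustration in Python, not the authors' implementation; the function names (quadratic_features, fit_pr, predict_pr) are ours.

```python
import numpy as np
from itertools import combinations_with_replacement

def quadratic_features(X):
    """Design matrix of 2nd-order monomials: 1, x_k, x_k * x_l (k <= l).

    For d input dimensions this yields (d + 1)(d + 2) / 2 columns, which is
    why m = (d + 1)(d + 2) / 2 training points are needed at minimum."""
    X = np.atleast_2d(X)
    n, d = X.shape
    cols = [np.ones(n)]
    cols += [X[:, k] for k in range(d)]
    cols += [X[:, k] * X[:, l]
             for k, l in combinations_with_replacement(range(d), 2)]
    return np.column_stack(cols)

def fit_pr(X, t):
    """Estimate the coefficients C by minimizing the least-squares error E (Eq. 2)."""
    C, *_ = np.linalg.lstsq(quadratic_features(X), t, rcond=None)
    return C

def predict_pr(C, X):
    """Evaluate the PR surrogate t_hat = f_hat(x) of Eq. (1)."""
    return quadratic_features(X) @ C
```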

3.2 Radial Basis Function Interpolation (RBF)

The theory of Radial Basis Function (RBF) networks can be traced back to interpolation problems [25]. The method uses linear combinations of a radially symmetric function, based on Euclidean distance or another such metric, to approximate response functions. The output of an RBF model is a weighted sum of radial basis functions, each characterized by its center μ ∈ R^d in the design space and a function value which declines with increasing distance from the center. The local surrogate models are exact interpolating radial basis function networks of the form

\hat{t} = \hat{f}(x) = \sum_{i=1}^{n} \alpha_i K(||x - x_i||), \quad (3)

where K(||x − x_i||): R^d → R is an RBF and α = (α_1, α_2, . . . , α_n) ∈ R^n denotes the vector of weights. Typical choices for the kernel include linear splines, cubic splines, multiquadrics, thin-plate splines and Gaussian functions [26]. We consider the use of linear splines for constructing surrogates since earlier studies [1], [2] suggest that this kernel is capable of generating exact interpolating models that enhance SAMA search performance at low computational cost.
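The sketch below illustrates the exact interpolating linear-spline RBF of Eq. (3), where K(r) = r: fitting reduces to solving the linear system Φα = t with Φ_ij = ||x_i − x_j||, so the model reproduces the training targets exactly, i.e., zero rmse at the training samples. Again, this is our own minimal illustration under those assumptions, not the code used in the paper.

```python
import numpy as np

def fit_rbf(X, t):
    """Exact interpolating RBF with the linear-spline kernel K(r) = r.

    Solves Phi @ alpha = t for the weights alpha of Eq. (3), where
    Phi[i, j] = ||x_i - x_j||; the fit reproduces t exactly at the
    training samples, assuming the points are distinct."""
    Phi = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    alpha = np.linalg.solve(Phi, t)
    return X.copy(), alpha

def predict_rbf(model, Xq):
    """Evaluate t_hat = sum_i alpha_i * K(||x - x_i||) at the query points Xq."""
    centers, alpha = model
    K = np.linalg.norm(Xq[:, None, :] - centers[None, :, :], axis=-1)
    return K @ alpha
```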


4 Impacts of Uncertainty on Surrogate Assisted Memetic Algorithm

In this section, we investigate the impact of the uncertainty introduced by inaccurate surrogates on the Surrogate Assisted Memetic Algorithm (SAMA). In particular, we consider the general bound-constrained nonlinear programming problem of the form

Minimize: f(x)
Subject to: x_l ≤ x ≤ x_u, \quad (4)

where f(x) is a scalar-valued fitness function, x ∈ R^d is the vector of continuous design variables, and x_l and x_u are the vectors of lower and upper bounds, respectively. In the present context, we are interested in cases where the evaluation of f(x) is computationally expensive, and it is desired to obtain a near-optimal solution on a limited computational budget.

4.1 Outline of Surrogate Assisted Memetic Algorithm

A Memetic Algorithm (MA) is a population-based meta-heuristic search method inspired by Darwinian principles of natural evolution and Dawkins' notion of a meme, defined as a unit of cultural evolution that is capable of local refinement. A unique property of MAs is the heavy use of a local search strategy throughout the entire evolutionary search. Recent studies on MAs have revealed their success on a wide variety of real-world problems. If designed correctly, they converge to high-quality solutions more efficiently than conventional evolutionary algorithms [10].

BEGIN
  Initialize: Generate a database containing a population of designs.
  While (computational budget is not exhausted)
    Evaluate all individuals in the population using the exact fitness function.
    For each non-duplicated individual in the EA population
      MA: Apply the local search strategy using the original exact fitness function.
      SAMA: Apply the local search strategy using online surrogates.
    End For
    Replace the individuals in the population with the locally improved solutions in the spirit of Lamarckian learning.
    Apply standard EA operators to create a new population.
  End While
END

Fig. 1 Pseudo-code of the standard MA and SAMA

The standard MA considered here is a hybridization of the GA and a local search solver. All chromosomes in the GA population undergo local learning in the MA. Note that no form of approximation model is employed in the standard MA. The core idea of the SAMA is to reduce the number of calls to the computationally expensive function by replacing the exact fitness function used in the local search phase of the standard MA with computationally cheap approximation models. The core difference between the standard MA and SAMA is highlighted in Fig. 1.
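As a minimal sketch of the loop in Fig. 1, one possible organization of the SAMA generation cycle is shown below; all helper callables (exact_fitness, build_surrogate, local_search, ga_step) are hypothetical placeholders standing in for the components described in the text, and duplicate handling is omitted for brevity.

```python
def sama(exact_fitness, init_pop, budget, build_surrogate, local_search, ga_step):
    """Sketch of one possible SAMA realization of Fig. 1.

    exact_fitness, build_surrogate, local_search and ga_step are placeholder
    callables for the expensive simulation code, the online surrogate builder
    (e.g. PR or RBF over nearby archived points), the cheap solver run on the
    surrogate, and the standard EA operators, respectively."""
    population = list(init_pop)
    database = []                        # archive of previously searched points
    evaluations = 0
    while evaluations < budget:
        for x in population:             # exact evaluation of the population
            database.append((x, exact_fitness(x)))
            evaluations += 1
        improved = []
        for x in population:             # surrogate-assisted local search phase
            surrogate = build_surrogate(x, database)
            improved.append(local_search(surrogate, x))   # Lamarckian update
        population = ga_step(improved)   # crossover / mutation / selection
    return min(database, key=lambda record: record[1])
```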


4.2 Impact of Uncertainty on SAMA

Here, we illustrate the effects of the uncertainty introduced by inaccurate surrogates on SAMA search. If f(x) denotes the original fitness function and f̂(x) the approximated function, then the approximation error at any design point x_i, i.e., the uncertainty introduced by the surrogate at x_i, may be defined as

e(x_i) = |f(x_i) - \hat{f}(x_i)|. \quad (5)

For each non-duplicated individual, if n fitness calls to f̂(x) are made in the SAMA local search strategy, the root mean square error, rmse, can be derived as

rmse = \sqrt{ \frac{ \sum_{i=1}^{n} e^2(x_i) }{ n } }. \quad (6)

4.2.1 Curse of Uncertainty  The negative impact of the 'curse of uncertainty' on SAMA search can be illustrated using Fig. 2. The solid and dotted lines depicted in Fig. 2 denote the original one-dimensional function f(x) and the approximated function f̂(x), respectively, assuming that only 3 sparse points are available as the training dataset. Note that this is equivalent to hundreds of training data points in a multivariate problem, since the number of hypercubes required to fill out a compact region of a d-dimensional space grows exponentially with d. If the start point of a gradient-based local solver in SAMA is x_c, where f(x_c) = f̂(x_c), the local search is likely to converge to the local optimum of the approximated function situated at x_lo, which has a predicted fitness value of f̂(x_lo).

It is worth noting that when x_lo is mapped onto the exact fitness function, there is no fitness improvement over the start point x_c, since f(x_lo) > f̂(x_lo) and f(x_lo) > f(x_c); see Fig. 2.

Fig. 2 'Curse of uncertainty' in the Surrogate-Assisted Memetic Algorithm: the original function f(x) (solid line) and the surrogate f̂(x) (dotted line); starting from x_c, the local search converges to the surrogate optimum x_lo, which yields no improvement on the exact function

4.2.2 Blessing of Uncertainty  Next, we consider the 'blessing of uncertainty', which refers to the benefits conferred by inaccurate surrogates on SAMA search. The solid and dotted lines in Fig. 3 denote f(x) and f̂(x), respectively, assuming 3 alternative training sample points.

If the starting point is at x_c, the local search strategy now converges to a local optimum that provides significant fitness improvement over the initial point x_c, with f(x_lo) < f̂(x_lo) and f(x_lo) < f(x_c), thus accelerating the evolutionary search. Blessed by the presence of uncertainty, or approximation errors, in the surrogate, SAMA converges to the global optimum of the original fitness function in a fast-track mode, i.e., x_lo corresponds to the global optimum of the original fitness function.

Fig. 3 'Blessing of uncertainty' in the Surrogate-Assisted Memetic Algorithm: starting from x_c, the local search on the surrogate f̂(x) (dotted line) lands on x_lo, the global optimum of the original function f(x) (solid line)


4.3 Empirical Study on Impact of Uncertainty

To demonstrate the impact of uncertainty on SAMA search, we present in this subsection an empirical study on some commonly used multimodal benchmark problems: the Ackley, Griewank and Rastrigin functions. The global optimum (i.e., minimum) of all the 20-dimensional benchmark problems considered is located at zero. Greater detail on the benchmark problems used is provided in the appendix. In our study, we consider PR and RBF models in the SAMA, referred to here as SAMA-PR and SAMA-RBF, respectively. Besides the standard GA, the standard MA, SAMA-PR and SAMA-RBF, the results of SAMA-Perfect are also reported for comparison. SAMA-Perfect refers to a SAMA that employs a conceptual approximation technique assumed to generate error-free surrogates, i.e., rmse = 0. Hence, the notion of a 'curse or blessing of uncertainty' does not exist in a SAMA-Perfect search. As such, any SAMA that performs worse or better than SAMA-Perfect clearly does so due to the effects of the 'curse' and 'blessing of uncertainty', respectively. Note the difference between the MA and SAMA-Perfect: in SAMA-Perfect, the exact fitness evaluations generated by the application of local search are not counted as part of the total exact evaluation count in the stopping criterion. For the sake of brevity, the abbreviations of the SAMA variants considered here are summarized in Table 2.

4.3.1 Impact of Uncertainty on SAMA Variants

In our empirical study, a standard binary-coded GA with a population size of 50 and 1-point crossover and mutation operators at probabilities of 0.6 and 0.01, respectively, is employed. A linear ranking algorithm is used for selection. Without loss of generality, other variants of GAs or EAs may also be considered for studying the effect of the 'curse or blessing of uncertainty'; real-coded GAs [27] and steady-state GAs represent some of those that may be used.

Apart from the standard GA settings, one user-specified parameter of the SAMA is the number of training data points used to construct the surrogate, m. In our numerical studies, we set m = (d + 1)(d + 2)/2, which is the minimum number of data points required for fitting a 2nd-order PR model. In SAMA-Perfect, error-free surrogates are simulated using the exact fitness function. Further, we consider online surrogates constructed using localized training data, i.e., the m nearest neighbors of an individual are selected from the search engine database containing previously searched points; a sketch of this selection step is given below. Note that the database is continuously updated as the search progresses. The criterion used to determine the similarity between searched points is the simple Euclidean distance metric. Feasible Sequential Quadratic Programming (FSQP) [28] is then used as the local search solver in both the MA and the SAMA variants.
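The localized training data selection mentioned above might look as follows; this is a sketch under the stated settings (Euclidean metric, m = (d + 1)(d + 2)/2), with hypothetical names.

```python
import numpy as np

def nearest_training_points(x, database, d):
    """Pick the m = (d + 1)(d + 2) / 2 nearest archived points (plain
    Euclidean metric) as the local training data for an online surrogate."""
    m = (d + 1) * (d + 2) // 2
    X = np.array([point for point, _ in database])
    t = np.array([fitness for _, fitness in database])
    order = np.argsort(np.linalg.norm(X - np.asarray(x), axis=1))
    return X[order[:m]], t[order[:m]]
```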

The convergence trends obtained for the standard GA, standard MA, SAMA-PR, SAMA-RBF and SAMA-Perfect on the multimodal benchmark problems considered are depicted in Figs. 4(a)-4(c). The results presented are averaged over 20 independent runs, each conducted with a limited computational budget of 3 × 10³ exact fitness function evaluations.

Fig. 4 Convergence trends of the GA, MA, SAMA-PR, SAMA-RBF and SAMA-Perfect on the 20D (a) Ackley, (b) Griewank and (c) Rastrigin functions

From these results, it is notable that all the SAMA variants considered here, i.e., SAMA-PR, SAMA-RBF and SAMA-Perfect, are capable of converging to good solution quality more efficiently than both the standard GA and MA on the benchmark problems. This makes good sense since memetic algorithms, i.e., EAs that employ local search heavily, such as SAMA-RBF, generally search more efficiently. Most importantly, since the surrogates are used in place of the exact fitness function when conducting local searches, SAMA-PR, SAMA-RBF and SAMA-Perfect incur significantly lower computational effort than the standard MA for the same number of search generations.


Further, it is also worth highlighting that both SAMA-RBF and SAMA-PR outperform SAMA-Perfect significantly on the Ackley function (see Fig. 4(a)). On the Rastrigin function in Fig. 4(c), SAMA-RBF converges to good quality solutions significantly faster than SAMA-Perfect. Clearly, these cases demonstrate the effect of the 'blessing of uncertainty' in accelerating evolutionary search on multimodal functions: the uncertainty in the surrogate model helps smooth the multimodal landscape of the original fitness function, leading to significant speedup in search convergence. On the other hand, SAMA-Perfect performs better than both SAMA-PR and SAMA-RBF on the Griewank function, as shown in Fig. 4(b). It also outperforms SAMA-PR on the Rastrigin function in Fig. 4(c). These cases demonstrate the effect of the 'curse of uncertainty', i.e., the negative impact of approximation errors in the surrogate model, which can result in the evolutionary search converging to false global optima.

4.3.2 Surrogate Quality/Uncertainty and SAMA Search Performance Relations  Next, we seek to identify any relation between the quality of the surrogate model, or its uncertainty, and SAMA search performance. The quality of a surrogate can be assessed a posteriori by comparing an independent set of exact fitness values with the corresponding fitness values estimated by the surrogate. The root mean square error (denoted by rmse) and the correlation coefficient (denoted by r) [29] are two commonly used measures for evaluating the accuracy of a surrogate.


Benchmark      r correlation          rmse
Problems       PR        RBF          PR         RBF
Ackley         0.9484    0.9979       2.2303     0.3276
Griewank       1.0000    0.9982       0.0154     1.8852
Rastrigin      0.5120    0.5988       98.8377    61.4169

Table 1 The rmse and r correlation measures for the PR and RBF surrogates

The rmse can be determined using Equation (6), while the correlation coefficient r is defined by

r = \frac{ N \sum t \hat{t} - \sum t \sum \hat{t} }{ \sqrt{ [N \sum t^2 - (\sum t)^2][N \sum \hat{t}^2 - (\sum \hat{t})^2] } }, \quad (7)

where N is the sample size used, t denotes the exact fitness values from the independent test set, and t̂ denotes the fitness values estimated by the surrogate. A correlation coefficient of 1 implies that the surrogate models the test set accurately. Here, the locally optimized solutions obtained using the PR or RBF models in the SAMA runs are archived to form the sample set for evaluating the quality of the different surrogates. The rmse and r measures obtained when using the PR or RBF models on the benchmark problems are reported in Table 1. The t − t̂ correlation plots of the PR and RBF models on the 20D Ackley function are depicted in Figs. 5(a) and 5(b). Similar plots for the 20D Rastrigin function are provided in Figs. 6(a) and 6(b).
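A small sketch of both quality measures, following Eqs. (6) and (7) directly (the function name is ours):

```python
import numpy as np

def surrogate_quality(t, t_hat):
    """Return (rmse, r) per Eqs. (6) and (7) for an independent test set."""
    t, t_hat = np.asarray(t, float), np.asarray(t_hat, float)
    N = t.size
    rmse = np.sqrt(np.sum((t - t_hat) ** 2) / N)
    numerator = N * np.sum(t * t_hat) - np.sum(t) * np.sum(t_hat)
    denominator = np.sqrt((N * np.sum(t ** 2) - np.sum(t) ** 2)
                          * (N * np.sum(t_hat ** 2) - np.sum(t_hat) ** 2))
    return rmse, numerator / denominator
```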

Fig. 5 t − t̂ plots (exact versus predicted fitness) of the (a) PR and (b) RBF models for the 20D Ackley function

Fig. 6 t − t̂ plots (exact versus predicted fitness) of the (a) PR and (b) RBF models for the 20D Rastrigin function

The results obtained indicate that the linear-spline RBF models more accurately than the 2nd-order PR on 2 out of the 3 benchmark problems. Although this leads to better search performance by SAMA-RBF than SAMA-PR on the Rastrigin function (see Fig. 4(c)), the same conclusion cannot be drawn on Ackley, see Fig. 4(a). This implies that there remains no clear relation between the accuracy of the surrogate models, or the effect of the 'curse and blessing of uncertainty', and SAMA search performance that one could leverage for appropriate model selection.

5 Multi-Surrogates Assisted Memetic Algorithm

From the numerical results obtained in subsection 4.3, it is worth keeping in mind that uncertainty introduced by approximation errors in the surrogate model can


benefit as well as obstruct effective optimization search. Hence, in contrast to mitigating only the effects of the 'curse of uncertainty' by seeking high-accuracy approximation models (as in most existing works), it would be wise to also consider leveraging the possible benefits of the 'blessing of uncertainty' when designing new SAEAs. Since there is often no prior knowledge of which surrogate would be most appropriate for the problem of interest, as shown in subsection 4.3.2, we propose the use of multiple surrogates in the SAMA search, i.e., a Multi-Surrogates Assisted Memetic Algorithm (MSAMA) framework, for solving computationally expensive optimization problems efficiently.

The outline of our proposed MSAMA is depicted in Fig. 7. The key difference between MSAMA and SAMA lies in the local search phase. In MSAMA, multiple local searches based on dissimilar surrogates, i.e., {t̂_1, t̂_2, . . . , t̂_j}, are conducted in parallel [30], where j represents the number of surrogates considered; a sketch of this step is given below. Note that a surrogate ensemble [19] may also be considered as one of the independent surrogate models used in MSAMA. Subsequently, the best improved solution obtained in the local learning phase proceeds with the Lamarckian learning process. Since the local searches and the construction of the local surrogates are conducted in parallel, the computational effort incurred remains significantly lower than when using the exact simulation codes.
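A minimal sketch of this multi-surrogate local search step is given below; it runs one local search per surrogate type and keeps the best candidate for Lamarckian replacement. The helper callables are hypothetical, the j searches are shown sequentially for clarity (the paper conducts them in parallel [30]), and ranking the candidates by one exact evaluation each is our assumption.

```python
def multi_surrogate_local_search(x, database, surrogate_builders,
                                 local_search, exact_fitness):
    """One MSAMA local learning step: run a local search per surrogate type
    (e.g. PR and RBF) from x and keep the best candidate for Lamarckian
    replacement. Ranking via one exact call per candidate is an assumption."""
    candidates = []
    for build in surrogate_builders:          # e.g. [build_pr, build_rbf]
        surrogate = build(x, database)
        x_improved = local_search(surrogate, x)
        candidates.append((x_improved, exact_fitness(x_improved)))
    return min(candidates, key=lambda c: c[1])
```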

Further, by using multiple surrogate models that exhibit different degrees of the 'curse and blessing of uncertainty', the proposed MSAMA can accelerate evolutionary search in the following ways:


Fig. 7 Outline of the Multi-Surrogates Assisted Memetic Algorithm (MSAMA): after initialization, each non-duplicated individual in the population undergoes a local search strategy that embeds a local search solver with multiple surrogate models (1, 2, . . . , j) built from training data points in the database; each individual is then replaced with the best locally improved solution among all j surrogate models in the spirit of Lamarckian learning, sample points are added to the database, and standard EA operators (crossover/mutation/selection) are applied until the computational budget is exhausted

– (1) If all the surrogate models used display effects of the 'blessing of uncertainty', benefit is achieved by leveraging the blessing of any of the models.
– (2) If none of the surrogate models displays any effect of the 'blessing of uncertainty', benefit can still be attained by leveraging the model that provides the minimum prediction error.
– (3) If some surrogate models display effects of the 'blessing of uncertainty' while others display the 'curse of uncertainty', benefit can still be attained by leveraging the models that provide a blessing on the search.


5.1 Empirical Study on MSAMA

To investigate the efficacy of the proposed MSAMA, we conducted an empirical study based on five benchmark problems having diverse fitness landscapes, i.e., the unimodal Sphere and Step functions and the three previously considered multimodal benchmark problems. The global optimum (i.e., minimum) of all the 20-dimensional benchmark problems considered is located at zero. The configurations of all parameters in the algorithms are kept consistent with our previous study of SAMAs in subsection 4.3. When it comes to model selection in the MSAMA, it is natural to favor surrogates that provide maximum search improvement. A clear choice of surrogate in the MSAMA is one with perfect prediction accuracy, i.e., a perfect model, capable of resolving the 'curse of uncertainty'. On the other hand, to benefit from the effects of the 'blessing of uncertainty', any surrogate that is capable of generalizing the multi-modality of the problem landscape or providing good search directions should be considered. To facilitate a diverse pool of surrogate models in the MSAMA, we use both exact interpolating and regression approximation techniques, since they produce fundamentally very different surrogates. These include the linear-spline exact interpolating RBF, the 2nd-order regression PR, and a conceptual approximation method that generates error-free surrogates, i.e., a perfect model. The abbreviations of the MSAMA variants are tabulated in Table 2.

5.1.1 MSAMA using Perfect Model

We first investigate the performance of the MSAMA using the perfect model together with either the PR regression or the RBF interpolation model.

Abbreviation    SAMA or MSAMA Variant
SAMA-PR         SAMA with PR model
SAMA-RBF        SAMA with RBF model
SAMA-Perfect    SAMA with Perfect model
MSAMA-EP        MSAMA with Perfect and PR models
MSAMA-EF        MSAMA with Perfect and RBF models
MSAMA-PF        MSAMA with PR and RBF models

Table 2 Abbreviations of the SAMA and MSAMA variants

Figures 8(a)-8(e) depict the search trends of MSAMA-EP and MSAMA-EF in comparison with the SAMA variants (i.e., SAMA-PR, SAMA-RBF and SAMA-Perfect) on all the 20-dimensional benchmark problems. Note that the search trends of SAMA-Perfect are omitted from Figs. 8(b) and 8(c) due to its significantly poorer results on the Step and Ackley functions. The black square in the figures denotes convergence to the true global optimum within the limited computational budget imposed. We first discuss the results obtained on the unimodal benchmark problems. Sphere is a nonlinear, continuous, convex, smooth function. Since the landscape of the Sphere function is quadratic in nature, a 2nd-order PR model is clearly more appropriate than the linear-spline RBF. This explains the improved search performance of SAMA-PR over SAMA-RBF on the Sphere function. Nevertheless, neither SAMA-PR nor SAMA-RBF could outperform SAMA-Perfect, which demonstrates the presence of the 'curse of uncertainty'. Further, both MSAMA-EP and MSAMA-EF achieve better solutions than all the SAMA counterparts.

Fig. 8 Convergence trends of the MSAMA variants (MSAMA-EP, MSAMA-EF) and the SAMA variants on all benchmark problems: (a) Sphere, (b) Step, (c) Ackley, (d) Griewank, (e) Rastrigin

For instance, MSAMA-EF converges to the global optimum in less than 1.5 × 10³ exact fitness function evaluations. The perfect model used in MSAMA-EF and MSAMA-EP helps resolve any form of 'curse of uncertainty' that exists. At the same time, the MSAMA benefits from the effects of the 'blessing of uncertainty' introduced by either the PR or the RBF approximation, leading to improved search performance.

The Step function consists of flat plateaus with zero slope superimposed on an underlying continuous function. In contrast to Sphere, it is hard for any gradient-based optimization algorithm to locate the global optimum of the Step function, since minor changes in the problem variables do not affect the fitness value. However, the surrogate modelling techniques in SAMA help smooth the discontinuities of the Step landscape, i.e., an effect of the 'blessing of uncertainty' on SAMA search, thus leading to the superior search performance displayed by both SAMA-PR and SAMA-RBF (i.e., the global optimum is found in less than 1.5 × 10³ exact fitness function calls). Once again, both MSAMA-EP and MSAMA-EF converge to the global optimum more efficiently than all the SAMA counterparts.

Consider next the multimodal benchmark problems. The results in Figs. 8(c)-8(e) also indicate that both MSAMAs, i.e., MSAMA-EP and MSAMA-EF, converge to the global optimum of the benchmark problems, or to improved quality solutions, significantly faster than the SAMA counterparts, once again demonstrating the positive benefits of the 'blessing of uncertainty' introduced by either the PR or the RBF approximation on MSAMA search.


5.1.2 MSAMA using both Regression and Interpolation Models

Since an approximation technique that produces a perfect model of zero rmse does not exist in reality, we consider the exact interpolating RBF model as a possible substitute in practice, since its rmse is zero on the training samples. The 2nd-order PR, on the other hand, fits in naturally as an appropriate least-squares regression model that possesses the ability to generalize the multi-modality of the landscape. Since the 2nd-order PR regression model produces fundamentally very different models from the linear-spline RBF exact interpolation, the diversity of surrogates used in MSAMA is ensured. The MSAMA that employs both PR and RBF surrogates in the local search phase is referred to here as MSAMA-PF for short. The search traces obtained by MSAMA-PF, SAMA-PR and SAMA-RBF for solving the benchmark problems are plotted in Figs. 9(a)-9(e). These results again highlight the superior performance of the MSAMA over the SAMA counterparts, where only a single approximation method, i.e., either the PR or the RBF model, is considered. In particular, MSAMA-PF converges to the global optimum in less than 4 × 10² exact fitness evaluations on the Step function. It also converges to improved solution quality within the imposed computational budget on all the other problems considered. Further evidence of statistical significance in the search performances of MSAMA-PF, SAMA-PR and SAMA-RBF can be drawn from the box plots provided in Figs. 10(a)-10(d), which show the median and variance of the fitness value of the best found individual across 20 independent runs. No box plot for the Step function is provided since both SAMAs and MSAMA-PF converge to the global minimum within the imposed computational budget (see Fig. 9(b)).

Fig. 9 Convergence trends of SAMA-PR, SAMA-RBF and MSAMA-PF on all benchmark problems: (a) Sphere, (b) Step, (c) Ackley, (d) Griewank, (e) Rastrigin

Fig. 10 Box plots of the best fitness in the final generation for SAMA-PR, SAMA-RBF and MSAMA-PF on the (a) Sphere, (b) Ackley, (c) Griewank and (d) Rastrigin functions

In a notched box plot, the notch represents a robust estimate of the uncertainty about the median. The lower and upper bounds of the box are the 25% and 75% quartiles. The whiskers are lines extending from each end of the box to show the extent of the rest of the data. Outliers are data with values beyond the ends of the whiskers and are denoted by the '+' sign. Boxes whose notches do not overlap indicate that the medians of the two groups differ at the 5% significance level.


Algorithm    Best Fitness   Worst Fitness   Mean Fitness   Median Fitness   Deviation
SAMA-PR      0              0.2180          0.0478         0.0210           0.0575
SAMA-RBF     0              0.0021          2.7e-4         0                5.4299e-4
MSAMA-PF     0              3.0e-6          2.5e-7         0                7.1635e-7

Table 3 Statistics of SAMA-PR, SAMA-RBF and MSAMA-PF for the Sphere function

Algorithm    Best Fitness   Worst Fitness   Mean Fitness   Median Fitness   Deviation
SAMA-PR      2.1086         3.0550          2.6352         2.6601           0.2785
SAMA-RBF     1.0831         6.5819          3.3487         3.4032           1.4313
MSAMA-PF     0.0261         1.1101          0.298          0.234            0.2467

Table 4 Statistics of SAMA-PR, SAMA-RBF and MSAMA-PF for the Ackley function

On the unimodal Sphere function, SAMA-PR, SAMA-RBF and MSAMA-PF all arrive at very good solutions with overlapping notches. Nevertheless, only MSAMA-PF arrives at the global optimum in all 20 independent runs on the Sphere function, see Fig. 10(a). The results in Figs. 10(b)-10(c) also indicate that MSAMA-PF outperforms the two SAMA counterparts in a statistically significant manner on the Ackley and Griewank functions. There is, however, no statistically significant difference between SAMA-RBF and MSAMA-PF on the Rastrigin function. For detailed comparisons between SAMA-PR, SAMA-RBF and MSAMA-PF, the reader is referred to the additional statistics provided in Tables 3-6.

Algorithm    Best Fitness   Worst Fitness   Mean Fitness   Median Fitness   Deviation
SAMA-PR      0              0.4475          0.1308         0.0949           0.1354
SAMA-RBF     0.4883         1.2427          0.9017         0.9854           0.1869
MSAMA-PF     0              0.3249          0.0164         0                0.0726

Table 5 Statistics of SAMA-PR, SAMA-RBF and MSAMA-PF for the Griewank function

Algorithm    Best Fitness   Worst Fitness   Mean Fitness   Median Fitness   Deviation
SAMA-PR      87.9489        136.4428        116.7137       116.5195         11.4290
SAMA-RBF     11.2311        41.2498         24.1048        22.2128          8.5925
MSAMA-PF     6.9430         41.4958         22.0591        22.3556          8.2804

Table 6 Statistics of SAMA-PR, SAMA-RBF and MSAMA-PF for the Rastrigin function

6 Conclusion

For computationally expensive optimization problems, the use of surrogates helps to greatly reduce the number of evaluations of the exact fitness function by exploiting the information contained in the search history. In this paper, we have presented the negative and positive impacts of the uncertainty introduced by approximation errors in the surrogate on evolutionary optimization, referred to as the 'curse' and 'blessing of uncertainty', respectively. Empirical studies are presented for a number of unimodal and multimodal benchmark test functions to illustrate the impact of uncertainty using the SAMA proposed in [1]. The experimental results are compared with those obtained using a standard GA, a standard MA and SAMAs with a PR, RBF or perfect surrogate. The results obtained suggest that the SAMA is capable of solving computationally expensive optimization problems more efficiently than the standard GA and MA on a limited computational budget. They also demonstrate the effects of the 'curse of uncertainty' and the 'blessing of uncertainty' attributable to approximation errors in

surrogate-assisted evolutionary algorithms, leading to a new paradigm in SAEA design. Taking this cue, the MSAMA framework is proposed for enhancing evolutionary search performance by leveraging the benefits of the 'curse and blessing of uncertainty' posed by different surrogate models. Empirical results obtained on both unimodal and multimodal problems further demonstrate that the MSAMA outperforms the standard GA, the standard MA and the other SAMAs significantly, thus making it a promising approach for solving computationally expensive optimization problems.

Appendix

The benchmark problems considered are summarized here:

– Ackley Test Function

f(x) = 20 + e - 20 e^{-0.2 \sqrt{\frac{1}{n} \sum_{i=1}^{n} x_i^2}} - e^{\frac{1}{n} \sum_{i=1}^{n} \cos(2 \pi x_i)} \quad (8)

-32.768 ≤ x_i ≤ 32.768, i = 1, 2, . . . , n.

– Griewank Test Function

f(x) = 1 + \sum_{i=1}^{n} x_i^2 / 4000 - \prod_{i=1}^{n} \cos(x_i / \sqrt{i}) \quad (9)

-600 ≤ x_i ≤ 600, i = 1, 2, . . . , n.

– Rastrigin Test Function

f(x) = 10n + \sum_{i=1}^{n} (x_i^2 - 10 \cos(2 \pi x_i)) \quad (10)

-5.12 ≤ x_i ≤ 5.12, i = 1, 2, . . . , n.

– Sphere Test Function

f(x) = \sum_{i=1}^{n} x_i^2 \quad (11)

-5.12 ≤ x_i ≤ 5.12, i = 1, 2, . . . , n.

– Step Test Function

f(x) = \sum_{i=1}^{n} \lfloor x_i \rfloor^2 \quad (12)

-5.12 ≤ x_i ≤ 5.12, i = 1, 2, . . . , n.
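For reference, the five benchmarks of Eqs. (8)-(12) can be implemented directly as follows (a straightforward transcription, assuming numpy):

```python
import numpy as np

def ackley(x):      # Eq. (8), -32.768 <= x_i <= 32.768, minimum 0 at x = 0
    x = np.asarray(x, float)
    n = x.size
    return (20.0 + np.e
            - 20.0 * np.exp(-0.2 * np.sqrt(np.sum(x ** 2) / n))
            - np.exp(np.sum(np.cos(2 * np.pi * x)) / n))

def griewank(x):    # Eq. (9), -600 <= x_i <= 600, minimum 0 at x = 0
    x = np.asarray(x, float)
    i = np.arange(1, x.size + 1)
    return 1.0 + np.sum(x ** 2) / 4000.0 - np.prod(np.cos(x / np.sqrt(i)))

def rastrigin(x):   # Eq. (10), -5.12 <= x_i <= 5.12, minimum 0 at x = 0
    x = np.asarray(x, float)
    return 10.0 * x.size + np.sum(x ** 2 - 10.0 * np.cos(2 * np.pi * x))

def sphere(x):      # Eq. (11), -5.12 <= x_i <= 5.12, minimum 0 at x = 0
    return np.sum(np.asarray(x, float) ** 2)

def step(x):        # Eq. (12), -5.12 <= x_i <= 5.12, minimum 0
    return np.sum(np.floor(np.asarray(x, float)) ** 2)
```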

References

1. Y. S. Ong, P. B. Nair, and A. J. Keane, "Evolutionary optimization of computationally expensive problems via surrogate modeling," American Institute of Aeronautics and Astronautics Journal, vol. 41, no. 4, pp. 687–696, 2003.
2. Z. Zhou, Y. S. Ong, P. B. Nair, A. J. Keane, and K. Y. Lum, "Combining global and local surrogate models to accelerate evolutionary optimization," IEEE Transactions on Systems, Man and Cybernetics, Part C: Applications and Reviews, vol. 37, no. 1, Jan. 2007.
3. K. C. Giannakoglou, "Design of optimal aerodynamic shapes using stochastic optimization methods and computational intelligence," Progress in Aerospace Sciences, vol. 38, no. 5, pp. 43–76, 2002.
4. D. D. Daberkow and D. N. Marvis, "New approaches to conceptual and preliminary aircraft design: A comparative assessment of a neural network formulation and a response surface methodology," in Proc. 1998 World Aviation Conference, AIAA, Anaheim, CA, USA, Sept. 1998, AIAA-98-5509.
5. Y. S. Ong and A. J. Keane, "A domain knowledge based search advisor for design problem solving environments," Engineering Applications of Artificial Intelligence, vol. 15, no. 1, pp. 105–116, 2002.
6. A. Ratle, "Kriging as a surrogate fitness landscape in evolutionary optimization," Artificial Intelligence for Engineering Design, Analysis and Manufacturing, vol. 15, no. 1, pp. 37–49, 2001.
7. M. Emmerich, A. Giotis, M. Ozdemir, T. Bäck, and K. Giannakoglou, "Metamodel-assisted evolution strategies," in Proc. 7th International Conference on Parallel Problem Solving from Nature (PPSN 2002), Granada, Spain, Sept. 2002, pp. 361–370.
8. R. G. Regis and C. A. Shoemaker, "Local function approximation in evolutionary algorithms for the optimization of costly functions," IEEE Transactions on Evolutionary Computation, vol. 8, no. 5, pp. 490–505, 2004.
9. Y. S. Ong, Z. Zhou, and D. Lim, "Curse and blessing of uncertainty in evolutionary algorithm using approximation," in Proc. 2006 IEEE World Congress on Computational Intelligence (WCCI 2006), British Columbia, Canada, July 2006, pp. 2928–2935.
10. Y. S. Ong and A. J. Keane, "Meta-Lamarckian learning in memetic algorithms," IEEE Transactions on Evolutionary Computation, vol. 8, no. 2, pp. 99–110, 2004.
11. F. H. Lesh, "Multi-dimensional least-squares polynomial curve fitting," Communications of the ACM, vol. 2, no. 9, pp. 29–30, 1959.
12. A. A. Giunta and L. Watson, "A comparison of approximation modelling techniques: polynomial versus interpolating models," in Proc. 7th AIAA/USAF/NASA/ISSMO Symposium on Multidisciplinary Analysis and Optimization, vol. 1, St. Louis, MO, USA, Sept. 1998, pp. 392–404, AIAA-98-4758.
13. Y. Jin, M. Olhofer, and B. Sendhoff, "A framework for evolutionary optimization with approximate fitness functions," IEEE Transactions on Evolutionary Computation, vol. 6, no. 5, pp. 481–494, 2002.
14. D. Büche, N. N. Schraudolph, and P. Koumoutsakos, "Accelerating evolutionary algorithms with Gaussian process fitness function models," IEEE Transactions on Systems, Man, and Cybernetics, Part C, Special Issue on Knowledge Extraction and Incorporation in Evolutionary Computation, vol. 35, pp. 183–194, 2005.
15. M. A. El-Beltagy, P. B. Nair, and A. J. Keane, "Metamodelling techniques for evolutionary optimization of computationally expensive problems: Promise and limitations," in Proc. Genetic and Evolutionary Computation Conference (GECCO'99), Florida, USA, July 1999, pp. 196–203.
16. H. Ulmer, F. Streichert, and A. Zell, "Evolution strategies assisted by Gaussian processes with improved pre-selection criterion," in Proc. IEEE Congress on Evolutionary Computation (CEC'03), Canberra, Australia, Dec. 2003, pp. 1741–1751.
17. Y. S. Ong, P. B. Nair, and K. Y. Lum, "Max-min surrogate-assisted evolutionary algorithm for robust aerodynamic design," IEEE Transactions on Evolutionary Computation, vol. 10, no. 4, pp. 392–404, 2006.
18. Y. S. Ong, P. B. Nair, and K. Y. Lum, "Evolutionary algorithm with Hermite radial basis function interpolants for computationally expensive adjoint solvers," Computational Optimization and Applications, in press, 2006.
19. Y. Jin and B. Sendhoff, "Reducing fitness evaluations using clustering techniques and neural network ensembles," in Proc. Genetic and Evolutionary Computation Conference, ser. LNCS, vol. 3102, Springer, 2004, pp. 688–699.
20. P. Nain and K. Deb, "Computationally effective search and optimization procedure using coarse to fine approximation," in Proc. Congress on Evolutionary Computation (CEC'03), Canberra, Australia, Dec. 2003, pp. 2081–2088.
21. J. Xuan, D. Chafekar, and K. Rasheed, "Constrained multi-objective GA optimization using reduced models," in Proc. Genetic and Evolutionary Computation Conference (GECCO'03), Chicago, Illinois, USA, July 2003, pp. 174–177.
22. J. Knowles, "ParEGO: A hybrid algorithm with on-line landscape approximation for expensive multiobjective optimization problems," IEEE Transactions on Evolutionary Computation, vol. 10, no. 1, pp. 50–66, 2006.
23. Y. Jin, "A comprehensive survey of fitness approximation in evolutionary computation," Soft Computing, vol. 9, no. 1, pp. 3–12, 2005.
24. Y. S. Ong, P. B. Nair, A. J. Keane, and K. W. Wong, "Surrogate-assisted evolutionary optimization frameworks for high-fidelity engineering design problems," in Knowledge Incorporation in Evolutionary Computation, Studies in Fuzziness and Soft Computing, Y. Jin, Ed., Springer Verlag, 2004, pp. 307–332.
25. M. Powell, "Radial basis functions for multi-variable interpolation: A review," in Algorithms for Approximation, C. Mason and M. G. Cox, Eds., Oxford, U.K.: Oxford University Press, 1987, pp. 143–167.
26. C. Bishop, Neural Networks for Pattern Recognition. Oxford University Press, 1995.
27. F. Herrera, M. Lozano, and A. Sánchez, "Hybrid crossover operators for real-coded genetic algorithms: an experimental study," Soft Computing, vol. 9, no. 4, pp. 280–298, 2005.
28. J. Lee and P. Hajela, "A computationally efficient feasible sequential quadratic programming algorithm," Society for Industrial and Applied Mathematics, vol. 11, no. 4, pp. 1092–1118, 2001.
29. A. L. Edwards, An Introduction to Linear Regression and Correlation. W. H. Freeman, 1976.
30. H. Ng, D. Lim, Y. S. Ong, B. S. Lee, L. Freund, S. Parvez, and B. Sendhoff, "A multi-cluster grid enabled evolution framework for aerodynamic airfoil design optimization," in International Conference on Natural Computing (ICNC), ser. Lecture Notes in Computer Science, vol. 3611, Springer Berlin/Heidelberg, 2005, pp. 1112–1121.