Multi-Objective Evolutionary Optimization of Subsonic Airfoils by Kriging Approximation and Evolution Control Salvatore D’Angelo Politecnico di Torino C.so Duca degli Abruzzi, 24 10129 - Turin, Italy
[email protected]
Edmondo A. Minisci Politecnico di Torino C.so Duca degli Abruzzi, 24 10129 - Turin, Italy
[email protected]
Abstract- This work focuses on multi-objective evolutionary optimization by approximation functions. It uses the general concept of evolution control to enrich on-line the database of exactly evaluated solutions, which is the basis of the learning procedure for the kriging approximators. Essentially, starting from a very poor initial model approximation (a small database), the database, and consequently the models, is enriched by evaluating part of the individuals of the optimization process with the true model. The technique proved efficient for the considered aerodynamic problem, requiring only a few hundred true computations for a problem of dimensionality 5.
1 Introduction
Evolutionary Multi-Objective Optimization (EMOO) is now a mature field of Computational Intelligence (CI), both in terms of methodologies and of algorithm development, but a wide variety of industrial problems cannot be solved by evolutionary methods because of the too high computational cost of the objective function evaluations. To face this challenge, some alternative approaches can be considered [farina02]. When the number of objective function calls affordable at an industrially practical computational cost is smaller than what available Multi-Objective Evolutionary Algorithms (MOEAs) require for convergence, we can choose among a) building specific MOEAs for tiny populations, b) using hybrid algorithms combining evolutionary and deterministic search, or c) using function approximation methods to approximate objective functions and constraints. In this work we apply the latter approach, whose main idea is to consider, throughout the optimization process, two different levels of objective functions: the true functions, to be evaluated as few times as possible because of their high computational cost, and the interpolation of the true objective functions by means of some approximation technique, which can be evaluated as many times as needed because it is considered inexpensive (this is the hypothesis). The problem is how to schedule the use of the approximate model and of the real one. The simplest idea is a) running the optimization on the approximation until convergence is achieved, b) upgrading the approximate model by computing the individuals of the last generation with the true model, and c) repeating points a) and b) until no new individuals are found. If the first set of data "contains" the global optima, this approach can efficiently lead to the solution(s). This is guaranteed, however, only if the dataset spans the whole input space of interest, ensuring that any predicted value (i.e. output of the model) results from an interpolation and not from a risky extrapolation; here we face the curse of dimensionality. Because of the need to limit the number of training samples, if the number of dimensions of the search space grows beyond a threshold, whose value mainly depends on the cost of the true model, it is very difficult to construct an approximate model that is globally correct. The approximation is then likely to bring the optimization algorithm to false optima, i.e. optima of the approximate model that are not optima of the true functions. In order to avoid finding false optima, or losing some of the true ones, in this work we use the concept of model management, or evolution control, borrowed from [jin02], where it is applied to single-objective optimization. The remainder of the paper is structured as follows. The next section describes the adopted evolutionary algorithm: after a short introduction of the approximation and evolution control techniques, the algorithm used, MOPED (Multi-Objective Parzen based Estimation of Distribution) [costa03], is described, with details on the modifications made to the original algorithm to account for the evolution control approach. In section 3 the simple aerodynamic model and the optimization structure are detailed, and the current most meaningful results are shown and discussed. The work is summarized in section 4.
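The naive schedule of points a)-c) above can be sketched as follows; all function names (true_model, fit_surrogate, optimize) are hypothetical placeholders for the expensive model, the learning procedure and the evolutionary run, respectively:

```python
def naive_surrogate_loop(true_model, fit_surrogate, optimize, dataset):
    """Naive surrogate schedule: dataset is a list of (x, y) pairs
    already evaluated with the true (expensive) model."""
    while True:
        surrogate = fit_surrogate(dataset)       # (a) learn from verified points
        final_population = optimize(surrogate)   # run the EA to convergence on the surrogate
        # (b) verify with the true model the individuals not yet in the dataset
        known = {tuple(x) for x, _ in dataset}
        new = [x for x in final_population if tuple(x) not in known]
        if not new:                              # (c) stop when no new individuals appear
            return dataset
        dataset.extend((x, true_model(x)) for x in new)
```

As the text notes, this loop is reliable only when the initial dataset already "contains" the global optima; otherwise it converges to false optima of the surrogate.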
2 MOPED with approximation
The code used here is based on a modification (development) of the MOPED algorithm, a multi-objective optimization algorithm for continuous problems that uses the Parzen method to build a probabilistic representation of the Pareto solutions, with multivariate dependencies among the variables. Similarly to what was done in [khan02] for the multi-objective Bayesian Optimization Algorithm (BOA), the techniques of NSGA-II [deb02] (and not only) are used to classify promising solutions in the objective space, while new individuals are obtained by sampling from the Parzen model.
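The sampling step can be illustrated by a minimal sketch of Parzen-style density estimation with identical Gaussian kernels of fixed bandwidth h, as in the classic method described below; in MOPED proper the kernels are fitness-weighted and have per-kernel variances, which this toy version omits:

```python
import random

def parzen_sample(points, h, n_new, rng=random):
    """One Gaussian kernel is centred on every sample point; new
    individuals are drawn by picking a kernel uniformly and perturbing
    its centre. points: list of vectors (lists of floats)."""
    new_individuals = []
    for _ in range(n_new):
        centre = rng.choice(points)                  # pick one kernel at random
        new_individuals.append([rng.gauss(c, h) for c in centre])
    return new_individuals
```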
The Parzen method [fuku72] pursues a non-parametric approach to kernel density estimation and gives rise to an estimator that converges everywhere to the true Probability Density Function (PDF) in the mean square sense. Should the true PDF be uniformly continuous, the Parzen estimator can also be made uniformly consistent. In short, the method allocates exactly n identical kernels, each one "centered" on a different element of the sample. The original MOPED proved effective and efficient when applied to general test cases and to real-world problems. Here, in order to improve its efficiency, we explore a possible way to hybridize it with approximation techniques. Before going into the algorithm, a few words introduce the approximation and evolution control techniques used.

2.1 Kriging Approximation
For this first attempt at a hybridized version of MOPED, we chose a kriging approach to approximate the real model. As commonly accepted [jones98, simp01], this method can build accurate approximations of design spaces for both linear and non-linear functions. The approach is based on the hypothesis that real models are deterministic, i.e. that responses from the models are error/noise free. Given a set of p design sites X = [x_1, ..., x_p]^T with x_i ∈ R^n and responses Y = [y_1, ..., y_p]^T with y_i ∈ R^q, we look for an approximate model that expresses the deterministic response y for the input x. Among the many possible choices, we impose that the approximation model be the sum of a regression model and a random function (under the hypothesis that, although the response of the true model is deterministic, the deviation from the regression model can be treated as a sample path of a stochastic process):

    ŷ_j(x) = F(β_j, x) + z_j(x),   j = 1, ..., q      (1)

where the regression model is a linear combination of r functions

    F(β_j, x) = Σ_{i=1}^{r} β_{i,j} f_i(x)      (2)

and the random process z is assumed to have zero mean and covariance between z_j(x_i) and z_j(x_k) (i, k = 1, ..., p) given by

    E[z_j(x_i) z_j(x_k)] = σ_j^2 R(θ, x_i, x_k),   j = 1, ..., q      (3)

with σ_j^2 being the process variance for the j-th component of the response and R the correlation model with parameters θ. Once the regression functions and the correlation models are chosen, since the coefficients β and σ^2 can be estimated as functions of X, Y and of the θ's, the latter are found so as to fit the data (X, Y). We have used the kriging tool proposed in [loph02], where the reader can find a more detailed description.
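A minimal sketch of the prediction step in eqs. (1)-(3) follows, assuming a constant regression term and a Gaussian correlation model R(θ, x_i, x_k) = exp(-θ ||x_i - x_k||^2). This is a toy stand-in for the DACE toolbox of [loph02], not the toolbox itself: θ is fixed by hand instead of being fitted to the data.

```python
import numpy as np

def kriging_fit(X, Y, theta):
    """Fit an ordinary-kriging-style interpolator with constant regression
    and Gaussian correlation; returns a predictor function."""
    X, Y = np.asarray(X, float), np.asarray(Y, float)
    p = len(X)
    # correlation matrix between design sites (tiny nugget for stability)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    R = np.exp(-theta * d2) + 1e-10 * np.eye(p)
    ones = np.ones(p)
    Rinv_Y = np.linalg.solve(R, Y)
    Rinv_1 = np.linalg.solve(R, ones)
    beta = ones @ Rinv_Y / (ones @ Rinv_1)        # generalised least-squares mean
    gamma = np.linalg.solve(R, Y - beta * ones)   # weights of the stochastic part z

    def predict(x):
        r = np.exp(-theta * ((X - np.asarray(x, float)) ** 2).sum(-1))
        return beta + r @ gamma                   # regression term + z(x), as in eq. (1)
    return predict
```

Since the model interpolates, the prediction at a design site reproduces the stored response (up to the nugget).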
2.2 Evolution Control
If the problem at hand is simple, i.e. it has low dimensionality and low multi-modality, both random sampling and regularized learning can help prevent convergence to false minima. However, if this is not the case (and in general it is not), it is very difficult to achieve correct convergence using these methods alone. To overcome this obstacle, we use the concept of evolution control as suggested in [jin02], where evolution control means employing the original fitness function to prevent the evolutionary algorithm from converging to false minima. In [jin02] two methods are proposed: a) individual-based control and b) generation-based control. In the first approach, part of the individuals (nv) in the current population are chosen and evaluated with the true model; if the controlled individuals are chosen randomly we speak of a random strategy, while if the best nv individuals are controlled we speak of a best strategy. In the second approach, the whole population of nvgen generations is evaluated with the real model every ncyc generations, with nvgen < ncyc. Of course, the computed individuals are introduced into the dataset in order to improve the approximated model (kriging in this case) in the promising regions. In both evolution control schemes it is very important to determine the minimal control frequency needed to guarantee correct convergence. Our approach, founded on individual-based control, is described below.

2.3 Hybridization
As can be seen from the diagram in figure 1, MOPED has a main structure similar to that of any other evolutionary algorithm; in particular, it is like NSGA-II except for the constraint handling technique and for the adopted search approach, which is the original part of the algorithm. Without going into the details of the "genuine" MOPED, which can be found in [costa03] and [avan03], the hybridized version is described below.
With figure 1 in mind:
1. The random initial population, nind individuals sampled uniformly within the search space, is evaluated by the approximated model, which has been initialized by some sort of design of experiments (DOE).
2. The individuals of the population are classified in a way that favors the most isolated individuals in the objective function space, in the first sub-class (highest dominance) of the first class (best suited with respect to the problem constraints). The first step in the evaluation of the fitness parameter is the determination of the degree of compatibility of each individual with the problem constraints, whose overall number is m. This compatibility is measured by a constraint parameter, cp, a weighted sum of the unsatisfied constraints. Once cp is evaluated for all the individuals, the population is divided into a predetermined number of classes, 1 + Ncl. The nbest individuals that satisfy all the constraints (cp = 0) form the first class. The remainder of the population is divided into the other groups, each containing approximately the same number of individuals, (nind − nbest)/Ncl: the second class is formed by the individuals with the lowest values of the constraint parameter and the last one by those with the highest values. Within each class, individuals are ranked in terms of the dominance criterion and of the crowding distance in the objective function space, using the NSGA-II techniques. After all the individuals have been ranked, from the best to the worst, according to their class, sub-class and crowding parameter, a fitness value varying linearly from 2 − α (best individual of the entire population) to α (worst individual), with α ∈ [0, 1), is assigned to each individual. This fitness value determines the weighting of the kernel for the sampling of the next-generation individuals.
3. On the basis of the information given by the nind individuals of the current population, a probabilistic model of the promising portion of the search space is built by means of the Parzen method (see previous section). Then τ·nind new individuals, with τ ≥ 1, are sampled from the probabilistic model just determined. The variance associated with each kernel depends on (i) the distribution of the individuals in the search space and (ii) the fitness value of the pertinent individual, so as to favor sampling in the neighborhood of the most promising solutions. For generic processes it can be useful to alternate different kernels from one generation to the other, in order to obtain an effective exploration of the search space.
4. (grey part of the diagram) The entire population, (1 + τ)·nind individuals, is now evaluated by the current approximated model and classified as described at point 2. nv individuals are chosen by uniform sampling from the best half of the entire population; if their distance from the other elements of the dataset, measured in a normalized search space, is greater than distmin, they are added to the dataset and the approximated model is updated (when fewer than nv individuals are added, the worst half of the population is explored as well). Under the hypothesis that every operation except the use of the true model is inexpensive, the current population is then re-computed with the updated approximation model and re-classified (in order to exploit the new information as soon as possible).
5. The best nind individuals are selected as the next generation.
6. If the convergence criteria are met, the algorithm stops.
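The verification step at point 4 can be sketched as follows (function and argument names are hypothetical). For brevity the sketch omits the fallback to the worst half of the population when fewer than nv individuals pass the distance filter:

```python
import random

def control_step(population, dataset, true_model, nv, distmin, rng=random):
    """Individual-based evolution control: population is a ranked list of
    vectors (best first); dataset is a list of (x, y) pairs evaluated with
    the true model. Returns the number of individuals added."""
    def far_enough(x):
        # accept x only if it is farther than distmin from every stored point
        return all(sum((a - b) ** 2 for a, b in zip(x, xs)) ** 0.5 > distmin
                   for xs, _ in dataset)
    half = len(population) // 2
    candidates = rng.sample(population[:half], min(nv, half))
    added = 0
    for x in candidates:
        if far_enough(x):
            dataset.append((x, true_model(x)))   # verified individual enriches the dataset
            added += 1
    return added
```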
Figure 1: MOPED algorithm
3 Optimization Process
3.1 System Model
The aim is to optimize the performance of a subsonic airfoil at assigned flight conditions. The geometry of the airfoil is given by the superposition of a mean line and a thickness distribution, each parameterized by Bezier curves as in figure 2. The performance of the airfoil during the optimization process is computed by a well-known 2D panel code, XFOIL [drela00]. The use of XFOIL allows verifying the validity of the optimization approach without requiring extensive computational capabilities.

Figure 2: Airfoil parameterization (panels: mean line, thickness distribution, resulting airfoil)
3.2 Optimization Statement
Since we fix the coordinates of the extreme control points and the abscissa of the second control point of the thickness distribution (x_ml,1 = 0, y_ml,1 = 0, x_ml,3 = 1, y_ml,3 = 0, x_t,1 = 0, y_t,1 = 0, x_t,4 = 1, y_t,4 = 0, x_t,2 = 0), the optimization process aims at finding the five free coordinates of the Bezier curves that minimize the functions

    F1 = cd
    F2 = 4 − cl      (4)

subject to

    cl ≥ 0.9 (F2 ≤ 3.1)
    cd = F1 ≤ 0.02      (5)

where cd and cl are the drag and lift coefficients, respectively, computed for a Reynolds number of 3·10^6 without compressibility effects (low subsonic), at a fixed incidence of 8 deg. That is, given the parameterization, we look for the airfoils that minimize the drag and maximize the lift. The search space has been delimited as in table 1, where the subscripts ml and t refer to the mean line and the thickness, and the MOPED parameters have been set as in table 2, where "generations" is the maximum number of generations allowed; the algorithm can stop earlier if every individual of the current population satisfies the constraints and is non-dominated. The database has been initialized by means of a central composite design on base 2 (43 points in principle, but for 5 of them XFOIL was not able to converge).

Table 1: Search space
    0.1 ≤ x_ml,2 ≤ 0.9
    0 ≤ y_ml,2 ≤ 0.2
    0.05 ≤ y_t,2 ≤ 0.2
    0.1 ≤ x_t,3 ≤ 0.9
    0.05 ≤ y_t,3 ≤ 0.2

Table 2: MOPED parameters
    Population          100
    Generations         150
    Fitness param. α    0.9
    Generator index τ   1
    Constraint classes  10
    nv                  15
    distmin             0.15

Although apparently trivial, this test case presents difficulties typical of more complexly modeled problems: the true Pareto front, even if mathematically unknown, appears discontinuous, and the convergence of the aerodynamic code is not guaranteed in 100% of cases (which introduces irregularities in the objective functions).

3.3 Results
With the algorithm set up as described above, we are able to find a good approximation of the Pareto front obtained with the original MOPED, which uses the true model. In figure 3(b) the front resulting from one of 10 runs is compared with the true front obtained after 2300 computations (one of three runs carried out by means of the true model). Figure 3(a) shows how the database information varies during the optimization process for the same run. The dashed curve indicates how many individuals are verified (and added to the database) per generation, whereas the solid one is the cumulative sum of the added individuals. As can be seen from this figure, approximating the complete front requires little more than 10% of the computations needed by the process using only the true model (291 is the average over 10 runs).

Figure 3: Outputs of a run that uses the approach suggested in section 2.3: (a) enriching the dataset; (b) Pareto front

An a posteriori analysis shows that the error of the approximated results, which mainly depends on the approximation technique and on the parameter distmin, is < 1%. We can therefore say that distmin has been correctly chosen and that the obtained individuals are useful ones (figure 4 shows the best of them), even if the precision of the approximation depends above all on the modeling technique (when the dimensionality increases, setting this parameter could be a big problem).
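The optimization statement of eqs. (4)-(5) can be encoded as a small wrapper; run_xfoil is a hypothetical function standing in for the XFOIL analysis, returning (cl, cd) for the five free Bezier coordinates:

```python
def objectives(x, run_xfoil):
    """Evaluate the two objectives and the constraint violations of
    eqs. (4)-(5) for a candidate airfoil x (five free coordinates)."""
    cl, cd = run_xfoil(x)
    f1 = cd                          # minimise drag
    f2 = 4.0 - cl                    # maximise lift, recast as a minimisation
    # constraint violations (cp in section 2 is a weighted sum of these)
    violations = [max(0.0, 0.9 - cl),    # cl >= 0.9
                  max(0.0, cd - 0.02)]   # cd <= 0.02
    return (f1, f2), violations
```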
Figure 4: Extreme individuals: the individual maximizing the lift (solid) and the individual minimizing the drag (dashed)

In order to better understand the capabilities of the hybrid MOPED, in terms of efficacy and efficiency, some exploratory tests on different ways to enrich the database during the optimization have been carried out.

The same problem has been tackled without the evolution control approach, by repeatedly running the optimization code with the approximated model and upgrading the dataset only by means of the best individuals obtained. This approach has a much faster convergence (in terms of individuals computed by the true model: 98 computations on average over 10 runs, figure 5(a), where the abscissa now represents evolutions), but it is an illusory one. As can be seen from figure 5(b), while one part of the front appears correct (compared with the results of the optimization without approximation), another part is completely missing. This mainly means that the right part of the front is "contained" neither in the initial database nor, therefore, in the initial approximated model.

Figure 5: Outputs obtained without evolution control: (a) enriching the dataset; (b) Pareto front

Another test concerns the choice of the individuals to be verified. The question in this case is: what happens if we use a higher selective pressure? Figure 6(b) shows one of the fronts obtained when the upgrading individuals are picked only among the best individuals (first sub-class within the first class). Again, the convergence appears faster, with 102 computations on average over 10 runs (figure 6(a)), but the approximated front is not complete. In these latter cases some technique allowing a better exploration of the search space could be helpful ([farina02]), but we do not have results in this sense at the moment.

Figure 6: Outputs of one run with non-dominated individuals verification: (a) enriching the dataset; (b) Pareto front
4 Conclusions
This paper presents the latest "evolution" of an existing multi-objective optimization algorithm, upgraded by hybridizing the algorithm itself with a form of objective function approximation (kriging in this case). Even if the approximation technique is fundamental, the work is devoted only to understanding the best way to alternate the true model and the approximate one in order to converge as close as possible to the true Pareto front. When applied to a simplified aerodynamic problem, the hybrid algorithm shows good performance both in terms of Pareto front approximation and in terms of savings in true model calls (compared with the requirements of optimization processes that always use the true model); even if the problem is too simple to allow a wide generalization of the results, the approach appears promising. In order to confirm the validity of the suggested method, we will have to analyze further a) the sensitivity of the code to the approximation technique, by using different methods such as neural networks, and b) its performance on more difficult problems, in terms of dimensionality and multi-modality.
Bibliography
[avan03] Avanzini, G., Biamonti, D. and Minisci, E.A. (2003) "Minimum-Fuel/Minimum-Time Maneuvers of Formation Flying Satellites," In Proceedings of the Astrodynamics Specialist Conference, pp. 1-20, Big Sky, Montana. AAS 03-654.
[costa03] Costa, M. and Minisci, E. (2003) "MOPED: a Multi-Objective Parzen-based Estimation of Distribution algorithm," In Fonseca, C., Fleming, P., Zitzler, E., Deb, K. and Thiele, L. (editors), Proceedings of the Second International Conference on Evolutionary Multi-Criterion Optimization (EMO 2003), pp. 282-294, Faro, Portugal. Springer LNCS 2632.
[deb02] Deb, K., Pratap, A., Agarwal, S. and Meyarivan, T. (2002) "A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II," IEEE Transactions on Evolutionary Computation, 6(2), pp. 182-197.
[drela00] Drela, M. (2000) "XFOIL - Subsonic Airfoil Development System," ACDL research group webpage, MIT, URL: http://raphael.mit.edu/xfoil/
[farina02] Farina, M. (2002) "A Neural Network Based Generalized Response Surface Multiobjective Evolutionary Algorithm," In Fogel, D.B., El-Sharkawi, M.A., Yao, X., Greenwood, G., Iba, H., Marrow, P. and Shackleton, M. (editors), Proceedings of the 2002 Congress on Evolutionary Computation (CEC2002), Vol. 1, pp. 956-961, Piscataway, New Jersey.
[fuku72] Fukunaga, K. (1972) "Introduction to Statistical Pattern Recognition," Academic Press.
[jin02] Jin, Y., Olhofer, M. and Sendhoff, B. (2002) "A Framework for Evolutionary Optimization with Approximate Fitness Functions," IEEE Transactions on Evolutionary Computation, 6(5), pp. 481-494.
[jones98] Jones, D.R., Schonlau, M. and Welch, W.J. (1998) "Efficient Global Optimization of Expensive Black-Box Functions," Journal of Global Optimization, 13(4), pp. 455-492.
[khan02] Khan, N., Goldberg, D.E. and Pelikan, M. (2002) "Multi-objective Bayesian Optimization Algorithm," Technical Report IlliGAL 2002009, University of Illinois at Urbana-Champaign.
[loph02] Lophaven, S.N., Nielsen, H.B. and Sondergaard, J. (2002) "DACE - A Matlab Kriging Toolbox," Technical University of Denmark.
[simp01] Simpson, T.W., Mauery, T.M., Korte, J.J. and Mistree, F. (2001) "Kriging Models for Global Approximation in Simulation-Based Multidisciplinary Design Optimization," AIAA Journal, 39(12), pp. 2233-2241.