Optimal Convex Combinations of Metamodels by Minimizing the PRESS Vector

Niclas Strömberg
Department of Mechanical Engineering, School of Science and Technology, Örebro University, SE-701 82 Örebro, Sweden
E-mail: [email protected]

June 28, 2018

Abstract

In this work we establish optimal ensembles of metamodels by minimizing the taxicab, Euclidean or infinity norm of the PRESS vector. The PRESS vector is defined by the leave-one-out cross-validation errors of a linear combination of ten metamodels. The metamodels of the ensemble are given by a quadratic regression model, Kriging with linear and quadratic regressions, two radial basis function networks (RBFN) with a priori bias, two RBFN with a posteriori bias, polynomial chaos expansion, support vector regression and least square support vector regression. The minimization of the norms is constrained such that the sum of weights of the linear combination equals one and only non-negative weights are allowed. We have found that the latter constraints are extremely important when the ensembles of metamodels represent design constraints in inequality form. Thus, our ensemble is in fact a convex combination of ten metamodels. The minimization problem with the Euclidean norm is of course a QP-problem, and the other two problems are rewritten as equivalent LP-problems. Eight benchmark functions are studied as "black-boxes" using Halton and Hammersley sampling. It is found that all three norms produce highly accurate but different optimal convex combinations of metamodels, and that the optimal solution depends on the character of the response and the choice of sampling. Finally, we also demonstrate the excellent performance of the approach by performing reliability based design optimization using optimal convex combinations of our ten metamodels.
Keywords: Ensemble, Metamodel, Convex combination
1 Introduction
The use of metamodels, such as Kriging [1], RBFN [2], polynomial chaos expansion (PCE) [3], support vector machines [4] and support vector regression [5], for approximating computationally expensive models, such as non-linear finite element models, in order to perform advanced studies, such as different disciplines of design optimization [6, 7], is today a well-established approach in engineering [8]. The best choice of metamodel depends strongly on the character of the response, the choice of design of experiments and the definition of best. Therefore, searching for a single metamodel that is best for all situations is like searching for the holy grail. A more fruitful search is to find the best ensemble of metamodels for a particular problem. This is the topic of the current paper. To be more precise, in this paper we establish the best convex combination of ten metamodels, where best is defined by the taxicab norm, Euclidean norm or the infinity norm of the PRESS vector of leave-one-out cross-validation errors. Typically, one establishes an ensemble of metamodels for the set of sampling points of interest and then picks the best metamodel from the ensemble according to some quality criterion such as e.g. the sum of squared residuals. This can also be done automatically by finding optimal weights wi for a linear combination of the metamodels, i.e.

yen = Σ_{i=1}^{M} wi yi,   (1)
where yi represents the metamodels in the ensemble and yen becomes the optimal ensemble of metamodels. Usually, the sum of weights w^T 1 is taken to be equal to one, i.e. the linear combination in (1) is treated as an affine combination. Assuming all metamodels yi predict a point correctly, if this constraint is satisfied, then yen will also approximate this point correctly. In this paper, we will also add the constraints wi ≥ 0. We argue that these constraints are crucial when the optimal ensemble of metamodels
represents a design constraint of the form yen ≤ 0, which e.g. could be a constraint on the von Mises stress level in a component. Now, assuming that all metamodels in the ensemble fulfill yi ≤ 0 at a design point, if these constraints on the weights are not satisfied, then yen > 0 might happen. For instance, with two metamodels satisfying y1 = −0.1 and y2 = −2 at a point, the affine weights w1 = 1.5 and w2 = −0.5 give yen = 0.85 > 0. This is of course most unsatisfactory and our experience is that there is an obvious risk that the corresponding design optimization study might fail. However, including the constraints wi ≥ 0 also ensures yen ≤ 0. Typically, these constraints are not included in previous works on optimal ensembles of metamodels. By including wi ≥ 0, the linear combination in (1) becomes a conical combination. Thus, in this paper, we establish the best ensemble of metamodels when both w^T 1 = 1 and wi ≥ 0 are considered, i.e. when (1) is a convex combination. Best is defined by the taxicab norm, the Euclidean norm or the infinity norm of the PRESS vector of leave-one-out cross-validation errors e of the convex combination of metamodels in (1). Thus, we establish the optimal ensemble of metamodels by solving

min_w ‖e‖_x
s.t. w^T 1 = 1, wi ≥ 0, i = 1, ..., M,   (2)

where ‖·‖_x represents one of the three norms discussed above. A similar approach using the Euclidean norm was recently utilized by Ferreira and Serpa [9] for efficient global optimization. Different variants of least square approaches were discussed by the same authors in a previous work [10], where a formulation similar to ours including the constraints wi ≥ 0 is mentioned briefly. We argue that this variant is the proper least square approach for finding the optimal ensemble of metamodels. An early paper on minimizing an error metric measuring the accuracy of the ensemble of metamodels was presented by Acar and Rais-Rohani [11]. However, they did not include wi ≥ 0, and the solution is then simply given by a closed-form solution to the optimality conditions. In this work we of course need to solve the LP and QP problems explicitly due to the constraints wi ≥ 0. Viana et al. [12] minimized an approximation of the mean squared error using vectors of cross-validation errors. We suggest including wi ≥ 0 in their formulation as well. However, the optimal solution is then no longer given by a closed-form solution to the optimality conditions, but instead becomes a QP problem to be solved explicitly by any quadratic programming approach. We have tested this and it seems to work fine. An early work on ensembles of metamodels can be found in [13]. Examples of other similar works on ensembles of metamodels are [14, 15, 16, 17, 18]. Reliability based
design optimization (RBDO) by using an ensemble of metamodels was performed by Gu et al. in [19]. In this work, a first step towards RBDO by using optimal convex combinations of metamodels is taken by solving a benchmark using (2) for all three norms. Thus, the benchmark is represented by optimal convex combinations of metamodels, which in turn are solved using the SQP-based RBDO approach recently proposed by Strömberg [20]. The outline of the paper is as follows: in the next section we present the QP-problem as well as the two LP-problems which are solved in order to establish our optimal convex combinations of ten metamodels, in Section 3 we present the governing equations of the ten metamodels used in the ensemble, in Section 4 the proposed approach is tested on eight benchmark functions for two different sets of sampling points and one RBDO benchmark is solved using optimal convex combinations of metamodels, and, finally, some concluding remarks are given.
2 Optimal convex combination
Let us define a new metamodel yen = yen(x) as a convex combination of an ensemble of metamodels, i.e.

yen = yen(x) = Σ_{i=1}^{M} wi yi(x),   (3)
where M is the total number of metamodels in the ensemble, wi ≥ 0 are weights satisfying w1 + w2 + ... + wM = 1 and yi = yi(x) represents a particular metamodel. In the next section, we present the basic equations of the ten metamodels yi as implemented in this paper.

The leave-one-out cross-validation error of yen at a point x̂k is given by

ek = e(x̂k) = f̂k − yen^(−k)(x̂k) = f̂k − Σ_{i=1}^{M} wi yi^(−k)(x̂k),   (4)

where yi^(−k)(x) represents the metamodel with the k-th data point excluded from the sampling set {x̂i, f̂i}, see also the next section for further details on how to establish these metamodels. If we now perform the leave-one-out cross-validation in (4) for every data point, then we establish the vector of PRESS residuals:

e = {ei} = f̂ − Y w,   (5)

where f̂ contains f̂i, w is a vector of weights wi and

[Y]ij = yj^(−i)(x̂i).   (6)
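As an illustration, a minimal sketch of how the matrix in (6) and the PRESS vector in (5) can be assembled is given below (Python; the fit interface and all names are hypothetical, and any implementation of the ten metamodels can be plugged in):

```python
import numpy as np

def press_matrix(X, f, fitters):
    """Assemble [Y]_ij = y_j^(-i)(x_i) of leave-one-out predictions, cf. (6).

    X : (N, d) sampling points, f : (N,) responses, fitters : list of M
    callables, each mapping a training set (X, f) to a predictor function."""
    N, M = len(f), len(fitters)
    Y = np.zeros((N, M))
    for i in range(N):                        # leave out the i-th point
        keep = np.arange(N) != i
        for j, fit in enumerate(fitters):
            model = fit(X[keep], f[keep])     # refit without point i
            Y[i, j] = model(X[i:i + 1])[0]    # predict at the left-out point
    return Y

# PRESS vector of (5) for a given weight vector w: e = f - Y @ w
```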
[Figure 1: Analytical benchmark functions defined in (32): (a) f1 - sincos, (b) f2 - Rosenbrock's banana, (c) f3 - new, (d) f4 - peaks, (e) f5 - Giunta, (f) f6 - Adjiman, (g) f7 - modified Brent, (h) f8 - Hosaki.]
Let us now minimize the norm of the PRESS vector e subject to w^T 1 = 1 and wi ≥ 0, i.e.

min_w ‖e‖_x
s.t. w^T 1 = 1, wi ≥ 0, i = 1, ..., M,   (7)

where ‖e‖_x represents the taxicab norm ‖e‖_1, the Euclidean norm ‖e‖_2 or the infinity norm ‖e‖_∞ according to

‖e‖_1 = Σ_{i=1}^{N} |ei|,
‖e‖_2 = √(e^T e),   (8)
‖e‖_∞ = max(|e1|, ..., |eN|),

where N is the number of sampling points. The problem in (7) with the taxicab norm corresponds to the following LP-problem:

min_{(w,p,q)} Σ_{i=1}^{N} (pi + qi)
s.t. Y w − f̂ = p − q,
     w^T 1 = 1,
     wi, pj, qj ≥ 0, i = 1, ..., M, j = 1, ..., N.   (9)

By taking the square of the Euclidean norm, (7) of course becomes a QP-problem. Finally, using the infinity norm, (7) can be rewritten as the following LP-problem:

min_{(w,t)} t
s.t. Y w − f̂ ≤ t1,
     −Y w + f̂ ≤ t1,
     w^T 1 = 1,
     wi, t ≥ 0, i = 1, ..., M.   (10)

Here, above and in the following, 1 represents a column vector of ones of proper size.
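For illustration, the three problems can be handed to standard solvers; below is a sketch using scipy in Python (the paper uses linprog.m and quadprog.m in Matlab, see Section 4; here the QP is solved with SLSQP for simplicity):

```python
import numpy as np
from scipy.optimize import linprog, minimize

def taxicab_weights(Y, f):
    """LP-problem (9): min sum(p+q) s.t. Yw - f = p - q, 1'w = 1, w,p,q >= 0."""
    N, M = Y.shape
    c = np.r_[np.zeros(M), np.ones(2 * N)]          # variables [w, p, q]
    A_eq = np.block([[Y, -np.eye(N), np.eye(N)],
                     [np.ones((1, M)), np.zeros((1, 2 * N))]])
    b_eq = np.r_[f, 1.0]
    res = linprog(c, A_eq=A_eq, b_eq=b_eq)          # default bounds are (0, inf)
    return res.x[:M]

def infinity_weights(Y, f):
    """LP-problem (10): min t s.t. |Yw - f| <= t1, 1'w = 1, w, t >= 0."""
    N, M = Y.shape
    c = np.r_[np.zeros(M), 1.0]                     # variables [w, t]
    A_ub = np.block([[Y, -np.ones((N, 1))],
                     [-Y, -np.ones((N, 1))]])
    b_ub = np.r_[f, -f]
    A_eq = np.r_[np.ones(M), 0.0].reshape(1, -1)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0])
    return res.x[:M]

def euclidean_weights(Y, f):
    """QP: min ||f - Yw||^2 s.t. 1'w = 1, w >= 0 (squared form of (7))."""
    M = Y.shape[1]
    res = minimize(lambda w: np.sum((f - Y @ w) ** 2), np.full(M, 1.0 / M),
                   method="SLSQP", bounds=[(0.0, None)] * M,
                   constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}])
    return res.x
```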
3 Metamodels
Let us assume that we have a set of sampling data {x̂i, f̂i} obtained from design of experiments as mentioned in the previous section.
[Figure 2: Left: Halton sampling, right: Hammersley sampling. Details about these two samplings are presented in appendix A.]

We would like to represent this set of data with a function, which we call a response surface, a surrogate model or a metamodel. One choice of such a function is the regression model given by

f = f(x) = ξ(x)^T β,   (11)

where ξ = ξ(x) is a vector of polynomials of x and β contains regression coefficients. By minimizing the sum of squared errors, i.e.

min_β Σ_{i=1}^{N} (Σ_j Xij βj − f̂i)²,   (12)

where Xij = ξj(x̂i) and N is the number of sampling points, we obtain the optimal regression coefficients from the normal equation reading

β* = (X^T X)^{−1} X^T f̂.   (13)
We use a quadratic regression model, denoted model Q in the next section, as one of the ten metamodels in the ensemble. Examples of other useful metamodels, which are included in our ensemble, are Kriging, radial basis functions, polynomial chaos expansion, support vector regression and least square support vector regression. The basic equations of these models as implemented in our in-house toolbox (MetaBox, www.fema.se) are presented below.
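As a sketch, the quadratic regression model Q in two design variables can be fitted as follows (a least-squares solve is used instead of forming (X^T X)^{−1} explicitly, which is numerically preferable but equivalent to (13); names are illustrative):

```python
import numpy as np

def quad_basis(X):
    """xi(x) = [1, x1, x2, x1^2, x1*x2, x2^2] for two design variables."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1**2, x1*x2, x2**2])

def fit_model_Q(X, f):
    """Solve (12)-(13) and return the predictor (11)."""
    beta, *_ = np.linalg.lstsq(quad_basis(X), f, rcond=None)
    return lambda Xnew: quad_basis(Xnew) @ beta
```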
3.1 Kriging
The Kriging models are given by

f(x) = ξ(x)^T β* + r(x)^T R^{−1}(θ*) (f̂ − Xβ*),   (14)

where the first term represents the global behavior by a linear (model Kr-L) or quadratic regression model (model Kr-Q) and the second term, with r(x) containing the correlations between x and the sampling points, ensures that the sample data is fitted exactly. R = R(θ) = [Rij], where

Rij = Rij(θ, x̂i, x̂j) = exp(− Σ_k θk (x̂ik − x̂jk)²).   (15)

Furthermore, θ* is obtained by maximizing the likelihood function

(1 / (σ^N √(det(R)(2π)^N))) exp(−(Xβ − f̂)^T R^{−1} (Xβ − f̂) / (2σ²))   (16)

using a genetic algorithm, and β* is established from the optimality conditions as

β* = (X^T R^{−1}(θ*) X)^{−1} X^T R^{−1}(θ*) f̂.   (17)
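A sketch of (14), (15) and (17) for a fixed θ is given below (the maximization of the likelihood (16) by a genetic algorithm is omitted, and the small diagonal nugget is a numerical safeguard, not part of the formulation):

```python
import numpy as np

def corr_matrix(Xa, Xb, theta):
    """R_ij = exp(-sum_k theta_k (x_ik - x_jk)^2), cf. (15)."""
    d2 = (Xa[:, None, :] - Xb[None, :, :]) ** 2
    return np.exp(-(d2 * theta).sum(axis=2))

def kriging_fit(X, f, xi, theta):
    """Return the predictor (14) for fixed theta; xi maps points to
    regression rows, e.g. the linear or quadratic basis of Kr-L/Kr-Q."""
    Xr = xi(X)                                             # regression matrix
    R = corr_matrix(X, X, theta) + 1e-10 * np.eye(len(f))  # small nugget
    Ri_f = np.linalg.solve(R, f)
    Ri_X = np.linalg.solve(R, Xr)
    beta = np.linalg.solve(Xr.T @ Ri_X, Xr.T @ Ri_f)       # eq. (17)
    gamma = np.linalg.solve(R, f - Xr @ beta)
    return lambda Xn: xi(Xn) @ beta + corr_matrix(Xn, X, theta) @ gamma
```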
[Table 1: Optimal weights for the Halton sampling and the taxicab norm, for the metamodels Q, Kr-L, Kr-Q, Rpri-L, Rpri-Q, Rpost-L, Rpost-Q, PCE, SVR and L-SVR over the benchmark functions f1-f8.]
3.2 Radial basis function networks
For a particular input x̂k the outcome of the radial basis function network can be written as

f^k = f(x̂k) = Σ_{i=1}^{NΦ} Aki αi + Σ_{i=1}^{Nβ} Bki βi,   (18)
where αi and βi are constants defined by (21) and (22), or (23) presented below,

Aki = Φi(x̂k) and Bki = ξi(x̂k).   (19)

Here, Φi = Φi(x̂k) represents the radial basis function and the second term in (18) is a linear or quadratic bias. Furthermore, for a set of signals, the corresponding outgoing responses f̂ = {f̂i} of the network can be formulated compactly as

f̂ = Aα + Bβ,   (20)

where α = {αi}, β = {βi}, A = [Aij] and B = [Bij]. If we let β be given a priori by the normal equation in (13) as

β = (B^T B)^{−1} B^T f̂,   (21)

then

α = A^{−1} (f̂ − Bβ).   (22)
We utilize two settings of this model in the ensemble depending on the choice of bias, called model Rpri-L and model Rpri-Q, respectively. Otherwise, when the bias is unknown, α and β are established by solving

[ A    B ] [ α ]   [ f̂ ]
[ B^T  0 ] [ β ] = [ 0  ].   (23)

We have also two settings of this model in our ensemble, which we call model Rpost-L and model Rpost-Q.

[Table 2: Optimal weights for the Hammersley sampling and the taxicab norm, for the metamodels Q, Kr-L, Kr-Q, Rpri-L, Rpri-Q, Rpost-L, Rpost-Q, PCE, SVR and L-SVR over the benchmark functions f1-f8.]
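Both bias variants reduce to plain linear algebra; a sketch assuming Gaussian radial basis functions centered at the sampling points (the actual basis and width used in [2] may differ):

```python
import numpy as np

def rbfn_fit(X, f, xi, a_priori=True, width=1.0):
    """Fit alpha, beta of (18) with Gaussian RBFs centered at the samples."""
    r2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    A = np.exp(-r2 / (2.0 * width**2))           # A_ki = Phi_i(x_k)
    B = xi(X)                                     # B_ki = xi_i(x_k), the bias
    if a_priori:                                  # models Rpri: (21) then (22)
        beta = np.linalg.lstsq(B, f, rcond=None)[0]
        alpha = np.linalg.solve(A, f - B @ beta)
    else:                                         # models Rpost: system (23)
        nb = B.shape[1]
        K = np.block([[A, B], [B.T, np.zeros((nb, nb))]])
        sol = np.linalg.solve(K, np.r_[f, np.zeros(nb)])
        alpha, beta = sol[:len(f)], sol[len(f):]
    return alpha, beta
```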
3.3 Polynomial chaos expansion
Polynomial chaos expansion (model PCE) by using Hermite polynomials φn = φn(x) can be written as

f(x) = Σ_{i=0}^{Mt} ci Π_{j=1}^{NVAR} φi(xj),   (24)

where Mt + 1 is the number of terms and constant coefficients ci, and NVAR is the number of variables xi. The Hermite polynomials are defined by

φn = φn(x) = (−1)^n exp(x²/2) dⁿ/dxⁿ [exp(−x²/2)].   (25)

For instance, one has

φ0 = 1, φ1 = x, φ2 = x² − 1, φ3 = x³ − 3x,
φ4 = x⁴ − 6x² + 3, φ5 = x⁵ − 10x³ + 15x,
φ6 = x⁶ − 15x⁴ + 45x² − 15, φ7 = x⁷ − 21x⁵ + 105x³ − 105x.   (26)
The unknown constants ci are then established by using the normal equation in (13). A nice feature of the polynomial chaos expansion is that the mean of f(X) in (24) for uncorrelated standard normal distributed variables Xi is simply given by

E[f(X)] = c0.   (27)
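The polynomials in (26) satisfy the three-term recurrence φ_{n+1}(x) = x φn(x) − n φ_{n−1}(x), which gives a compact way of evaluating them; a sketch:

```python
import numpy as np

def hermite_eval(n, x):
    """Evaluate the probabilists' Hermite polynomial phi_n of (25) at x
    (float array) using phi_{n+1}(x) = x*phi_n(x) - n*phi_{n-1}(x)."""
    p_prev, p = np.zeros_like(x), np.ones_like(x)   # phi_{-1} = 0, phi_0 = 1
    for k in range(n):
        p_prev, p = p, x * p - k * p_prev
    return p
```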
3.4 Support vector regression
The soft non-linear support vector regression model (model SVR) reads

f(x) = Σ_{i=1}^{N} λ^i k(x^i, x) − Σ_{i=1}^{N} λ̂^i k(x^i, x) + b*,   (28)
[Figure 3: Optimal convex combinations of ten metamodels for the eight benchmark functions in (32): (a) yen - sincos, (b) yen - Rosenbrock's banana, (c) yen - new, (d) yen - peaks, (e) yen - Giunta, (f) yen - Adjiman, (g) yen - modified Brent, (h) yen - Hosaki.]
where k(x^i, x) is the kernel function, and λ^i, λ̂^i and b* are established by solving

min_{(λ,λ̂)} (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} (λ^i − λ̂^i)(λ^j − λ̂^j) k(x^i, x^j) − Σ_{i=1}^{N} (λ^i − λ̂^i) f̂i + δ Σ_{i=1}^{N} (λ^i + λ̂^i)
s.t. Σ_{i=1}^{N} (λ^i − λ̂^i) = 0,
     0 ≤ λ^i, λ̂^i ≤ C, i = 1, ..., N.   (29)

Finally, the corresponding least square support vector regression model (model L-SVR) is established by solving

[ 0    1^T     −1^T    ] [ b ]   [ 0        ]
[ 1    B + γI  −B      ] [ λ ] = [ f̂ − δ1   ]
[ −1   −B      B + γI  ] [ λ̂ ]   [ −f̂ − δ1  ],   (30)

where γ = 1/C and

B = [Bij],  Bij = k(x^i, x^j).   (31)
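Since (30) is a single dense linear system, the L-SVR model is cheap to establish; a sketch assuming a Gaussian kernel (the kernel choice and all names are assumptions):

```python
import numpy as np

def lsvr_fit(X, f, delta, C, width=1.0):
    """Solve the linear system (30) for b, lambda and lambda-hat."""
    N = len(f)
    r2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    B = np.exp(-r2 / (2.0 * width**2))            # B_ij = k(x_i, x_j), cf. (31)
    gI = (1.0 / C) * np.eye(N)                    # gamma = 1/C
    one = np.ones((N, 1))
    K = np.block([[np.zeros((1, 1)), one.T, -one.T],
                  [one,  B + gI, -B],
                  [-one, -B, B + gI]])
    sol = np.linalg.solve(K, np.r_[0.0, f - delta, -f - delta])
    b, lam, lam_hat = sol[0], sol[1:N + 1], sol[N + 1:]
    return b, lam, lam_hat                        # predictor follows (28)
```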
Summarized, the ensemble consists of the following metamodels: Q, Kr-L, Kr-Q, Rpri-L, Rpri-Q, Rpost-L, Rpost-Q, PCE, SVR and L-SVR. In the next section, we establish optimal convex combinations of this ensemble for eight benchmark functions and one RBDO problem.

[Table 3: Optimal weights for the Halton sampling and the Euclidean norm, for the metamodels Q, Kr-L, Kr-Q, Rpri-L, Rpri-Q, Rpost-L, Rpost-Q, PCE, SVR and L-SVR over the benchmark functions f1-f8.]
4 Examples
Optimal convex combinations of the ensemble of ten metamodels presented in the previous section are established for the eight benchmark functions presented below in (32) by solving (7), with the norms defined in (8), using linprog.m and quadprog.m in Matlab:

f1 = sin(4(x1 − 1) − 2) cos(4(x2 − 1) − 2), 1 ≤ xi ≤ 2,
f2 = 100(x2 − x1²)² + (x1 − 1)², −2 ≤ xi ≤ 2,
f3 = √(1000(4/x1 − 2)⁴ + 1000(4/x2 − 2)⁴), 1 ≤ xi ≤ 4,
f4 = peaks(x1, x2), −2 ≤ xi ≤ 2,
f5 = 0.6 + sin((16/15)x1 − 1) + sin²((16/15)x1 − 1) + sin((16/15)x2 − 1) + sin²((16/15)x2 − 1), −1 ≤ xi ≤ 1,
f6 = cos(x1) sin(x2) − x1/(x2² + 1), −5 ≤ xi ≤ 5,
f7 = (x1 + 10)² + (x2 + 10)² − 190 exp(−0.1x1² − 0.1x2²), −10 ≤ xi ≤ 10,
f8 = (1 − 8x1 + 7x1² − (7/3)x1³ + (1/4)x1⁴) x2² exp(−x2), 0 ≤ xi ≤ 5.   (32)
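As an illustration of the black-box setup, two of the functions in (32) in Python (the remaining functions follow the same pattern; X_halton denotes the scaled sampling set of Table 6 and is an assumed variable):

```python
import numpy as np

def f1(x1, x2):                     # sincos, 1 <= xi <= 2
    return np.sin(4*(x1 - 1) - 2) * np.cos(4*(x2 - 1) - 2)

def f2(x1, x2):                     # Rosenbrock's banana, -2 <= xi <= 2
    return 100*(x2 - x1**2)**2 + (x1 - 1)**2

# e.g. map the Halton points of Table 6 from [-1, 1]^2 to the box of f1:
# X = 1.5 + 0.5 * X_halton          # [-1, 1] -> [1, 2]
# f = f1(X[:, 0], X[:, 1])          # responses used to fit the ensemble
```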
Seven of the eight functions in (32) are well-known test examples, while f3 is a test function of our own, developed to evaluate algorithms for RBDO [21]. The corresponding RBDO problem is solved at the end of this section using optimal convex combinations of metamodels. The fourth function f4 is the well-known peaks function implemented in Matlab, where the complete analytical expression can easily be found in the documentation of Matlab. All analytical test functions f1-f8 in (32) are plotted in Figure 1.

[Table 4: Optimal weights for the Halton sampling and the infinity norm, for the metamodels Q, Kr-L, Kr-Q, Rpri-L, Rpri-Q, Rpost-L, Rpost-Q, PCE, SVR and L-SVR over the benchmark functions f1-f8.]
Now these eight test functions are considered to be "black-boxes" which we treat by setting up two different designs of experiments with 30 sampling points; Halton and Hammersley sampling according to Figure 2 are adopted, see also appendix A. We begin by solving (7) using the taxicab norm for both sets of sampling points. The optimal weights wi are presented in Tables 1 and 2 (an empty cell in the weight tables implies wi = 0). It is clear by comparing the two tables that the optimal solutions depend on the choice of design of experiments. Furthermore, it is also obvious that the character of the optimal solutions depends strongly on the type of functional response. For instance, Rosenbrock's banana is best represented by polynomial chaos expansion only, but the Giunta function with Halton sampling needs almost all metamodels for the optimal convex combination. The representation of the eight functions with optimal convex combinations of metamodels is outstanding. The corresponding plots of Table 1 are given in Figure 3, which should be compared to Figure 1. The resemblance between the plots is remarkable. By looking carefully one can find some minor differences.

Table 5: Error norms Ly for different norms Lx used in (7).
Lx-Ly    f1      f2      f3      f4      f5      f6      f7      f8
L1-L1    0.5938  0.0112  142.20  18.392  0.0138  15.452  297.34  5.6236
L2-L1    0.7057  0.012   150.23  19.287  0.0157  15.481  323.58  6.0596
L∞-L1    0.7626  0.0159  162.02  33.980  0.0233  15.488  490.29  6.4
L1-L2    0.2034  0.0034  41.36   5.288   0.0055  3.4709  130.75  2.0955
L2-L2    0.1467  0.0033  38.20   5.254   0.0047  3.4702  123.42  1.6785
L∞-L2    0.159   0.004   39.43   8.042   0.0058  3.4747  170.42  1.815
L1-L∞    0.1749  0.0025  29.53   3.5901  0.0042  1.5073  120.07  1.6828
L2-L∞    0.0602  0.0024  20.41   3.550   0.0031  1.5179  109.79  1.059
L∞-L∞    0.0459  0.0021  16.50   3.121   0.0021  1.4903  109.41  0.957
In addition, we also solve (7) for the Euclidean and infinity norms. The obtained optimal weights for the Halton sampling are presented in Tables 3 and 4. Thus, we can conclude that the optimal solutions also depend on the choice of norm, i.e. the meaning of best. Still, the different optimal solutions generate very similar values of the norms. This is shown in Table 5, which presents the particular norm value Ly using norm Lx when solving (7), i.e. L1-L2 means that the taxicab norm is used in (7) and the Euclidean norm value is tabulated. One can see that the lowest norm value is obtained when the same norm is used in (7), as expected. Finally, we consider a benchmark for RBDO [21] reading

min_μ √(1000(4/μ1 − 2)⁴ + 1000(4/μ2 − 2)⁴)
s.t. Pr[(X1 − 0.5)⁴ + (X2 − 0.5)⁴ ≤ 2] ≥ Ps,
     1 ≤ μi ≤ 4,   (33)

where Ps = 0.999 and VAR[Xi] = 0.1². The deterministic solution is (1.5,1.5) and the minimum of the unconstrained objective function is found at (2,2). The solution to (33) obtained by our SQP-based RBDO approach is (1.2705,1.2705). The corresponding reliability is 99.9%. Now, we consider (33) to be a "black-box" which we treat by the Hammersley sampling in Figure 2 and our proposed optimal convex combinations of metamodels. The deterministic solutions are (1.4857,1.5140) for the taxicab norm, (1.4866,1.5131) for the Euclidean norm and (1.5017,1.4993) for the infinity norm. Furthermore, the FORM-based RBDO solutions using Breitung-based SORM corrections are (1.2757,1.2719), (1.2779,1.2749) and (1.2745,1.2722), and the corresponding estimations of reliability are 99.88%, 99.87% and 99.89% for these three solutions, respectively.
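As a sanity check of such solutions, the reliability in (33) at a candidate design can be estimated by plain Monte Carlo sampling; a sketch (sample size and seed are illustrative):

```python
import numpy as np

def reliability(mu1, mu2, n=10**6, seed=1):
    """Estimate Pr[(X1-0.5)^4 + (X2-0.5)^4 <= 2] for Xi ~ N(mu_i, 0.1^2)."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(mu1, 0.1, n)
    x2 = rng.normal(mu2, 0.1, n)
    return np.mean((x1 - 0.5)**4 + (x2 - 0.5)**4 <= 2.0)

# reliability(1.2705, 1.2705) should come out close to the reported 99.9%
```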
[Figure 4: Comparison of histograms for the analytical solution and the solution obtained using optimal convex combinations of metamodels. Panels: (a) Objective, analytical solution; (b) Objective, ∞-optimal ensemble; (c) Constraint, analytical solution; (d) Constraint, ∞-optimal ensemble.]

Thus, all three optimal convex combinations of metamodels produce very accurate solutions. For instance, a choice of quadratic regression models as metamodels instead of the optimal ensembles generates the following corresponding solutions: (2.0274,2.0335) and (1.7998,1.8074). By using Kriging models, we obtain instead (1.5266,1.5459) and (1.4259,1.4588). It is clear that the optimal convex combinations of
metamodels are superior to these two choices of metamodels. In Figure 4, the histograms of the objective and the constraint for both the analytical solution and the solution obtained using optimal convex combinations of metamodels are plotted. Again the resemblance is stunning.

Table 6: The two sets of 30 sampling points plotted in Figure 2.

Halton                Hammersley
x1         y1         x1         y1
-1         -1         -1         -1
0.066667   -0.30769   -0.93103   -0.30769
-0.46667   0.384615   -0.86207   0.384615
0.6        -0.76923   -0.7931    -0.76923
-0.73333   -0.07692   -0.72414   -0.07692
0.333333   0.615385   -0.65517   0.615385
-0.2       -0.53846   -0.58621   -0.53846
0.866667   0.153846   -0.51724   0.153846
-0.86667   0.846154   -0.44828   0.846154
0.2        -0.92308   -0.37931   -0.92308
-0.33333   -0.23077   -0.31034   -0.23077
0.733333   0.461538   -0.24138   0.461538
-0.6       -0.69231   -0.17241   -0.69231
0.466667   0          -0.10345   0
-0.06667   0.692308   -0.03448   0.692308
1          -0.46154   0.034483   -0.46154
-0.93333   0.230769   0.103448   0.230769
0.133333   0.923077   0.172414   0.923077
-0.4       -0.84615   0.241379   -0.84615
0.666667   -0.15385   0.310345   -0.15385
-0.66667   0.538462   0.37931    0.538462
0.4        -0.61538   0.448276   -0.61538
-0.13333   0.076923   0.517241   0.076923
0.933333   0.769231   0.586207   0.769231
-0.8       -0.38462   0.655172   -0.38462
0.266667   0.307692   0.724138   0.307692
-0.26667   1          0.793103   1
0.8        -0.97436   0.862069   -0.97436
-0.53333   -0.28205   0.931034   -0.28205
0.533333   0.410256   1          0.410256

5 Concluding remarks
In this work we suggest minimizing the PRESS vector for establishing optimal ensembles of a linear combination of ten metamodels. This is done for non-negative weights. We have found it crucial to add these constraints on the weights of the linear combination of metamodels in order to represent design inequality constraints properly. In addition, we also include the established constraint that the sum of weights should equal one. Consequently, the linear combination is a convex combination of metamodels. Three different norms of the PRESS vector are used as objective, leading to one QP-problem and two LP-problems. All three mathematical programming problems produce exceptionally good ensembles of metamodels. This is demonstrated for eight benchmark functions and one RBDO example.
A Halton and Hammersley sampling
The Halton sequence and the Hammersley sequence are two examples of sparse uniform samplings generated by quasi-random sequences. Let p1, p2, ..., pD represent a sequence of prime numbers, where D is the dimension of a Halton point defined by

xHal = xHal(k) = {Φ(k, p1), Φ(k, p2), ..., Φ(k, pD)}   (34)

for a non-negative integer k. Furthermore, for any prime number p,

Φ(k, p) = a0/p + a1/p² + ... + aM/p^{M+1},   (35)

where the integers a0, a1, ..., aM are obtained from the fact that k can be represented as

k = a0 + a1 p + a2 p² + ... + aM p^M.   (36)

A quasi-random set of N Halton points is now simply obtained by taking a sequence of Halton points in (34) for k = 0, 1, 2, ..., N − 1. By defining the Hammersley point as

xHam = xHam(k) = {k/N, Φ(k, p1), Φ(k, p2), ..., Φ(k, pD−1)},   (37)

we can easily generate a set of Hammersley sampling points in a similar way as for the Halton set. The two sets of Halton and Hammersley points used in this paper are given in Table 6 and plotted in Figure 2.
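A sketch of (34)-(37) in Python (the affine scaling of the points to [−1, 1]², used for Table 6, is a separate step):

```python
def radical_inverse(k, p):
    """Phi(k, p) of (35): reflect the base-p digits of k about the point."""
    phi, denom = 0.0, p
    while k > 0:
        k, a = divmod(k, p)       # digits a_m of the expansion (36)
        phi += a / denom
        denom *= p
    return phi

def halton_point(k, primes=(2, 3)):
    """x_Hal(k) of (34) in [0, 1]^D for the chosen primes."""
    return [radical_inverse(k, p) for p in primes]

def hammersley_point(k, N, primes=(2,)):
    """x_Ham(k) of (37): first coordinate k/N, then D-1 radical inverses."""
    return [k / N] + [radical_inverse(k, p) for p in primes]
```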
References

[1] J.P.C. Kleijnen, Kriging metamodeling in simulation: a review, European Journal of Operational Research, 192, 707-716, 2009.
[2] K. Amouzgar & N. Strömberg, Radial basis functions as surrogate models with a priori bias in comparison with a posteriori bias, Structural and Multidisciplinary Optimization, 55, 1453-1469, 2017.
[3] B. Sudret, Global sensitivity analysis using polynomial chaos expansions, Reliability Engineering & System Safety, 93, 964-979, 2008.
[4] N. Strömberg, Reliability-based design optimization by using support vector machines, in the proceedings of ESREL - European Safety and Reliability Conference, Trondheim, Norway, 17-21 June, 2018.
[5] Y. Yun, M. Yoon & H. Nakayama, Multi-objective optimization based on meta-modeling by using support vector regression, Optimization and Engineering, 10, 167-181, 2009.
[6] N. Strömberg & M. Tapankov, Sampling- and SORM-based RBDO of a knuckle component by using optimal regression models, in the proceedings of the 14th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Indianapolis, Indiana, 17-19 September, 2012.
[7] K. Amouzgar, A. Rashid & N. Strömberg, Multi-objective optimization of a disc brake system by using SPEA2 and RBFN, in the proceedings of the 39th Design Automation Conference, ASME, Portland, Oregon, USA, August 4-7, 2013.
[8] G.G. Wang & S. Shan, Review of metamodeling techniques in support of engineering design optimization, Journal of Mechanical Design, 129, 370-380, 2006.
[9] W.G. Ferreira & A.L. Serpa, Ensemble of metamodels, Structural and Multidisciplinary Optimization, 57, 131-159, 2018.
[10] W.G. Ferreira & A.L. Serpa, Ensemble of metamodels: the augmented least squares approach, Structural and Multidisciplinary Optimization, 53, 1019-1046, 2016.
[11] E. Acar & M. Rais-Rohani, Ensemble of metamodels with optimized weight factors, Structural and Multidisciplinary Optimization, 37, 279-294, 2009.
[12] F.A.C. Viana, R.T. Haftka & V. Steffen Jr., Multiple surrogates: how cross-validation errors can help us to obtain the best predictor, Structural and Multidisciplinary Optimization, 39, 439-457, 2009.
[13] T. Goel, R.T. Haftka, W. Shyy & N.V. Queipo, Ensemble of surrogates, Structural and Multidisciplinary Optimization, 33, 199-216, 2007.
[14] E. Acar, Various approaches for constructing an ensemble of metamodels using local measures, Structural and Multidisciplinary Optimization, 42, 879-896, 2010.
[15] X.J. Zhou, Y. Zhong Ma & X. Fang Li, Ensembles of surrogates with recursive arithmetic average, Structural and Multidisciplinary Optimization, 44, 651-671, 2011.
[16] Y. Lee & D.-H. Choi, Pointwise ensemble of meta-models using ν nearest points cross-validation, Structural and Multidisciplinary Optimization, 50, 383-394, 2014.
[17] R. Shi, L. Liu, T. Long & J. Liu, An efficient ensemble of radial basis functions method based on quadratic programming, Engineering Optimization, 48, 1202-1225, 2016.
[18] X. Song, L. Lv, J. Li, W. Sun & J. Zhang, An advanced and robust ensemble surrogate model: extended adaptive hybrid functions, Journal of Mechanical Design, 140, 2018.
[19] X. Gu, J. Lu & H. Wang, Reliability based design optimization for vehicle occupant protection system based on ensemble of metamodels, Structural and Multidisciplinary Optimization, 51, 533-546, 2015.
[20] N. Strömberg, Reliability-based design optimization using SORM and SQP, Structural and Multidisciplinary Optimization, 56, 631-645, 2017.
[21] N. Strömberg, Reliability based design optimization by using a SLP approach and radial basis function networks, in the proceedings of the ASME 2016 International Design Engineering Technical Conferences & Computers and Information in Engineering Conference IDETC/CIE, Charlotte, North Carolina, USA, August 21-24, 2016.