Proceedings of the 2nd International Symposium on Uncertainty Quantification and Stochastic Modeling, June 23rd to June 27th, 2014, Rouen, France
PC-Kriging: A new metamodelling method combining Polynomial Chaos Expansions and Kriging

R. Schöbi¹, B. Sudret²

¹ ETH Zurich, Stefano-Franscini-Platz 5, 8093 Zürich, Switzerland, [email protected]
² ETH Zurich, Stefano-Franscini-Platz 5, 8093 Zürich, Switzerland, [email protected]
Abstract. Metamodelling decreases the computational effort of time-consuming computer simulations by approximating the underlying computer model with a simple and easy-to-evaluate function. Two popular non-intrusive metamodelling techniques are Polynomial Chaos Expansions (PCE) and Kriging. PCE approximate the model by a series of orthogonal polynomials in the input variables. The orthogonal polynomials are defined in coherency with the probability distribution functions of these variables. The coefficients of a PC expansion may be computed by least-squares minimization techniques. In contrast, Kriging assumes that the model is a realization of a Gaussian random field whose properties are computed from the experimental design. A new non-intrusive metamodelling method called PC-Kriging is derived as a combination of the two distinct approaches. PC-Kriging approximates the global behaviour of the computational model with a sparse set of orthonormal polynomials used as a regression term. An adaptive algorithm determines the optimal sparse set of polynomials similarly to the least angle regression algorithm. The sparse set of polynomials forms the trend part of the universal Kriging. Kriging models the local variabilities by interpolation between the neighbouring experimental design points. The performance and validation of PC-Kriging is illustrated on several analytical functions. The results show that PC-Kriging behaves better than or at least as well as the two separate techniques.

Keywords. Experimental design, Kriging, metamodel, PC-Kriging, Polynomial Chaos Expansions

1 INTRODUCTION
Physical experiments are time-consuming and expensive in a context where resources are constrained. With increasing computer power, computational models are created in order to replace physical experiments and to simulate system responses artificially. The more accurately a computational model resembles reality, the more complex it becomes. Evaluating advanced computational models of physical systems requires large computational resources, which are not available in many cases. Computational models are used widely to solve problems such as optimization (Rackwitz, 2001; Rasmussen and Williams, 2006) and reliability analysis (Bect et al., 2012; Kaymaz, 2005). Solving these problems requires a large number of model evaluations (computer experiments) to assess the performance criteria with acceptable accuracy. Thus the idea of metamodelling the computational model has emerged in the past decades.

Metamodels (also called surrogate models) decrease the cost of the analysis by approximating the underlying computational model with a simple and easy-to-evaluate function. A carefully chosen design of experiments (DOE) contains the support points of the metamodel: a response surface is built on the support points where the input and output values of the computational model are known. Such approaches are called non-intrusive because no information about the equations behind the computational model is necessary to create the metamodel (only input/output values are used). A common requirement for applying the metamodels described in this paper is that the underlying process behaves smoothly. The metamodel then allows the prediction of the response at new input samples at an affordable cost.

Two popular non-intrusive metamodelling techniques are Polynomial Chaos Expansions (PCE) (Ghanem and Spanos, 2003) and Kriging, also known as Gaussian process modelling (Santner et al., 2003). Polynomial chaos expansions approximate the underlying model by a set of orthonormal polynomials defined in coherency with the distributions of the input variables. Two polynomials are orthonormal with respect to the input distribution when their scalar product weighted by the variable's probability density function equals unity if the polynomials are identical, and zero otherwise. A sparse set of polynomials may be determined by a selection algorithm such as least-angle regression (Efron et al., 2004), and the coefficients are obtained by least-squares minimization (Blatman and Sudret, 2011). Kriging originated in mining and was introduced by Krige (1951); today it is more widely known as Gaussian process modelling. The main assumption behind this metamodelling technique is that the system response (model output) is a realization of an (unknown) Gaussian process. The Gaussian process is described by an autocorrelation function whose parameters are fitted from the experimental design.

The two approaches have been developed by different research communities and, surprisingly, there has been little interaction between them so far. In this paper we propose to combine the two traditionally separate techniques into a unified Polynomial-Chaos-Kriging (PC-Kriging) metamodelling approach. PC-Kriging combines the advantages of both distinct metamodelling techniques: the ability of PCE to model the global behaviour and the ability of Kriging to model local variabilities. This paper shows that the combined approach leads to a metamodel whose accuracy is better
than or at least as high as that of the two distinct approaches.

This paper is organized as follows. The metamodelling approaches polynomial chaos expansions and Kriging are presented in Sec. 2 and Sec. 3, respectively. The new metamodelling technique is introduced in Sec. 4. The different metamodelling techniques are compared on analytical benchmark functions in Sec. 5. The paper is summarized in Sec. 6.

2 POLYNOMIAL CHAOS EXPANSION
Polynomial Chaos Expansion (PCE) is considered here as a non-intrusive metamodelling approach, i.e. the model is assumed to be a black box with unknown inner structure (such as non-linearities or dependencies between variables). Consider a system represented by a computational model $\mathcal{M}: \boldsymbol{x} \in \mathcal{D}_X \subset \mathbb{R}^M \mapsto y \in \mathbb{R}$ which maps the $M$-dimensional input space onto a one-dimensional output space. The input space is represented by the random vector $\boldsymbol{X}$, and the model response by the random variable $Y$:

$$ Y = \mathcal{M}(\boldsymbol{X}) \qquad (1) $$

Polynomial chaos expansions approximate the computational model by a finite sum of orthonormal polynomials in the input variables:

$$ Y \approx \mathcal{M}^{(PCE)}(\boldsymbol{X}) = \sum_{\alpha \in \mathcal{A}} a_\alpha\, \psi_\alpha(\boldsymbol{X}) \qquad (2) $$
where $\{a_\alpha,\, \alpha \in \mathcal{A}\}$ are the coefficients to be determined for all multi-indices $\alpha = \{\alpha_1, \ldots, \alpha_M\}$ in the finite set $\mathcal{A}$, $M$ is the number of input variables $\boldsymbol{X} = \{X_i,\, i = 1, \ldots, M\}$ (assumed independent here for the sake of clarity) and $\psi_\alpha(\boldsymbol{X})$ are the multivariate orthonormal polynomials. The multivariate polynomials are constructed as products of univariate orthonormal polynomials:

$$ \psi_\alpha(\boldsymbol{X}) = \prod_{i=1}^{M} \psi^{(i)}_{\alpha_i}(X_i) \qquad (3) $$

where $\psi^{(i)}_{\alpha_i}$ is the polynomial in the $i$-th variable of degree $\alpha_i$. The input distributions may differ from one input parameter to another. For every type of input probability density function there exists an orthonormal polynomial basis, i.e. a set of polynomials that are pairwise orthonormal: two polynomials are orthonormal with respect to their input distribution if their scalar product weighted by the probability density function of the input variable is equal to unity for identical polynomials and zero otherwise. Some of the classical polynomial bases can be found in Sudret (2012).

Given a set of multivariate orthonormal polynomials, the next step is to determine the expansion coefficients $\{a_\alpha,\, \alpha \in \mathcal{A}\}$. Berveiller et al. (2006), Blatman (2009) and Blatman and Sudret (2010, 2011) proposed to cast this as a least-squares regression problem:

$$ \boldsymbol{a} = \arg\min_{\boldsymbol{a} \in \mathbb{R}^{|\mathcal{A}|}} \mathbb{E}\left[ \left( Y - \sum_{\alpha \in \mathcal{A}} a_\alpha\, \psi_\alpha(\boldsymbol{X}) \right)^{2} \right] \qquad (4) $$
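For illustration, a minimal Python sketch of the tensor-product construction of Eq. (3) is given below, assuming uniform input variables on $[-1, 1]$ so that the univariate orthonormal basis is the normalized Legendre family; the function names are illustrative and not part of the original method.

```python
import numpy as np
from numpy.polynomial.legendre import legval

def legendre_orthonormal(x, degree):
    # Legendre polynomial of the given degree, normalized to unit norm
    # with respect to the uniform density on [-1, 1].
    coeffs = np.zeros(degree + 1)
    coeffs[degree] = 1.0
    return np.sqrt(2 * degree + 1) * legval(x, coeffs)

def psi(x, alpha):
    # Multivariate basis polynomial psi_alpha(x) as the tensor product of
    # univariate orthonormal polynomials (Eq. 3); x is one sample of size M.
    return np.prod([legendre_orthonormal(x[i], a) for i, a in enumerate(alpha)])

# Example: evaluate psi_{(2,0,1)} at a three-dimensional point
print(psi(np.array([0.3, -0.5, 0.9]), (2, 0, 1)))
```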
Given an experimental design $\mathcal{X} = \{\chi^{(1)}, \ldots, \chi^{(N)}\}$ of $N$ samples $\chi^{(i)} \in \mathbb{R}^M$ and the associated model outputs $\mathcal{Y} = \{\mathcal{M}(\chi^{(1)}), \ldots, \mathcal{M}(\chi^{(N)})\}$, the numerical formulation of the least-squares minimization problem reads:

$$ \hat{\boldsymbol{a}} = \arg\min_{\boldsymbol{a} \in \mathbb{R}^{|\mathcal{A}|}} \frac{1}{N} \sum_{i=1}^{N} \left( Y^{(i)} - \sum_{\alpha \in \mathcal{A}} a_\alpha\, \psi_\alpha(\chi^{(i)}) \right)^{2}, \qquad Y^{(i)} = \mathcal{M}(\chi^{(i)}),\; i = 1, \ldots, N \qquad (5) $$

Following the generalized least-squares solution for independent $Y$, the optimal coefficients are obtained as:

$$ \hat{\boldsymbol{a}} = \left( \mathbf{F}^{\mathsf{T}} \mathbf{F} \right)^{-1} \mathbf{F}^{\mathsf{T}} \mathcal{Y} \qquad (6) $$

where $\mathbf{F}$ is the information matrix of size $N \times |\mathcal{A}|$ whose generic term reads:

$$ F_{ij} = \psi_j(\chi^{(i)}) \qquad (7) $$
In practice, only a small number of polynomials may be needed to approximate the behaviour of the computational model well. It is thus crucial to select an appropriate set of multi-indices before determining the coefficients in order to increase the efficiency of the algorithm. Hyperbolic index sets (Blatman and Sudret, 2011) limit the number of multi-indices by a maximal total polynomial degree $p$ and a factor $q$ limiting the number of interaction polynomials. The $q$-norm defines the set of candidate polynomials $\mathcal{A}^{M,p}_q$:

$$ \mathcal{A}^{M,p}_q \equiv \left\{ \alpha \in \mathbb{N}^M : \|\alpha\|_q \le p \right\}, \qquad \|\alpha\|_q \equiv \left( \sum_{i=1}^{M} \alpha_i^{\,q} \right)^{1/q} \qquad (8) $$
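As a minimal sketch (brute-force enumeration, for illustration only), the hyperbolic truncation set of Eq. (8) can be generated as follows:

```python
from itertools import product

def hyperbolic_index_set(M, p, q):
    # All multi-indices alpha in N^M with ||alpha||_q <= p  (Eq. 8).
    candidates = product(range(p + 1), repeat=M)
    return [a for a in candidates
            if sum(ai ** q for ai in a) ** (1.0 / q) <= p]

# q = 1 recovers the full total-degree set; q < 1 yields a much smaller set
print(len(hyperbolic_index_set(3, 10, 0.75)), len(hyperbolic_index_set(3, 10, 1.0)))
```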
A further reduction of the number of predictors (multivariate polynomials) may be obtained by the least-angle regression (LARS) selection algorithm (Efron et al., 2004; Blatman and Sudret, 2011). The result is a sparse set of predictors describing the behaviour of the computational model which is suitable to optimally represent the model output; a possible implementation of this selection step is sketched below. Given a PCE model, the prediction of new samples is straightforward: inserting a new sample point $\boldsymbol{x}$ into Eq. (2) yields the output value $y$.
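As an illustration of such a selection step, a generic least-angle regression implementation such as scikit-learn's can serve as a stand-in for the LARS-based selection of Blatman and Sudret (2011); this is a sketch under that assumption, not the authors' implementation:

```python
from sklearn.linear_model import Lars

def select_sparse_basis(F, Y, n_terms):
    # Return the column indices of the n_terms predictors (polynomials)
    # that enter the LARS active set first.
    lars = Lars(n_nonzero_coefs=n_terms, fit_intercept=False)
    lars.fit(F, Y)
    return lars.active_
```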
3 KRIGING
Kriging (a.k.a. Gaussian process modelling) is a stochastic interpolation algorithm which assumes that the model output $\mathcal{M}(\boldsymbol{x})$ is a realization of a Gaussian process indexed by $\boldsymbol{x} \in \mathcal{D}_X \subset \mathbb{R}^M$:

$$ \mathcal{M}(\boldsymbol{x}) \approx \mathcal{M}^{(K)}(\boldsymbol{x}) = \boldsymbol{\beta}^{\mathsf{T}} \cdot \boldsymbol{f}(\boldsymbol{x}) + \sigma^2 Z(\boldsymbol{x}, \omega) \qquad (9) $$

where $\boldsymbol{\beta}^{\mathsf{T}} \cdot \boldsymbol{f}(\boldsymbol{x}) = \sum_{j=1}^{P} \beta_j f_j(\boldsymbol{x})$ is the mean value of the Gaussian process (also called the trend), $Z(\boldsymbol{x}, \omega)$ is the zero-mean, unit-variance Gaussian process and $\sigma^2$ is the Kriging variance; $\omega$ denotes outcomes of the underlying probability space, equipped with a correlation family $R$ and its parameters $\theta$. The autocorrelation between any two points $\boldsymbol{x}$ and $\boldsymbol{x}'$ is represented by a parametric autocorrelation function $R(\boldsymbol{x} - \boldsymbol{x}'; \theta)$, where $\theta$ are referred to as hyper-parameters. Examples of popular autocorrelation functions are summarized in Echard (2012) and Dubourg (2011). The trend of the Kriging model is composed of a predefined set of functions $\{f_j(\boldsymbol{x}),\, j = 1, \ldots, P\}$ and a set of parameters $\boldsymbol{\beta}$ to be determined. This general formulation of the Kriging metamodel is known as universal Kriging.

Given an experimental design $\mathcal{X} = \{\chi^{(1)}, \ldots, \chi^{(N)}\}$ and the corresponding output values $\mathcal{Y} = \{Y^{(1)}, \ldots, Y^{(N)}\}$, the calibration of the metamodel parameters $\{\boldsymbol{\beta}, \sigma^2\}$ may be obtained analytically conditional on the correlation parameters $\theta$ (generalized least-squares solution):

$$ \boldsymbol{\beta}(\theta) = \left( \mathbf{F}^{\mathsf{T}} \mathbf{R}^{-1} \mathbf{F} \right)^{-1} \mathbf{F}^{\mathsf{T}} \mathbf{R}^{-1} \mathcal{Y} \qquad (10) $$

$$ \sigma_y^2(\theta) = \frac{1}{N} \left( \mathcal{Y} - \mathbf{F} \boldsymbol{\beta} \right)^{\mathsf{T}} \mathbf{R}^{-1} \left( \mathcal{Y} - \mathbf{F} \boldsymbol{\beta} \right) \qquad (11) $$
where $R_{ij} = R(\chi^{(i)} - \chi^{(j)}; \theta)$ is the correlation matrix of the experimental design and $F_{ij} = f_j(\chi^{(i)})$ is the Vandermonde matrix; $\sigma_y^2$ is the optimal Kriging variance and $\boldsymbol{\beta}$ are the optimal trend coefficients.

The prediction of new samples is not as straightforward as for PCE. The prediction at a new sample $\boldsymbol{x}$, denoted by $\hat{Y}(\boldsymbol{x})$, is a Gaussian variable defined by a mean value $\mu_{\hat{Y}}$ and a variance $\sigma^2_{\hat{Y}}$:

$$ \mu_{\hat{Y}}(\boldsymbol{x}) = \boldsymbol{f}(\boldsymbol{x})^{\mathsf{T}} \boldsymbol{\beta} + \boldsymbol{r}(\boldsymbol{x})^{\mathsf{T}} \mathbf{R}^{-1} \left( \mathcal{Y} - \mathbf{F} \boldsymbol{\beta} \right) \qquad (12) $$

$$ \sigma^2_{\hat{Y}}(\boldsymbol{x}) = \sigma_y^2 \left( 1 - \left\langle \boldsymbol{f}(\boldsymbol{x})^{\mathsf{T}}\; \boldsymbol{r}(\boldsymbol{x})^{\mathsf{T}} \right\rangle \begin{pmatrix} \mathbf{0} & \mathbf{F}^{\mathsf{T}} \\ \mathbf{F} & \mathbf{R} \end{pmatrix}^{-1} \begin{pmatrix} \boldsymbol{f}(\boldsymbol{x}) \\ \boldsymbol{r}(\boldsymbol{x}) \end{pmatrix} \right) \qquad (13) $$
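The calibration and mean prediction steps of Eqs. (10)-(12) can be sketched in a few lines of Python, assuming a fixed correlation matrix R (the prediction variance of Eq. (13) is omitted for brevity; all names are illustrative):

```python
import numpy as np

def krige_fit(F, R, Y):
    # Generalized least-squares estimates of beta (Eq. 10) and of the
    # Kriging variance sigma_y^2 (Eq. 11).
    Rinv_F = np.linalg.solve(R, F)
    Rinv_Y = np.linalg.solve(R, Y)
    beta = np.linalg.solve(F.T @ Rinv_F, F.T @ Rinv_Y)
    resid = Y - F @ beta
    sigma2 = resid @ np.linalg.solve(R, resid) / len(Y)
    return beta, sigma2

def krige_mean(f_x, r_x, F, R, Y, beta):
    # Prediction mean at a new point x (Eq. 12): f_x collects the trend
    # functions at x, r_x the correlations between x and the design points.
    return f_x @ beta + r_x @ np.linalg.solve(R, Y - F @ beta)
```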
where $r_i(\boldsymbol{x}) = R(\boldsymbol{x} - \chi^{(i)}; \theta)$ is the correlation between the new sample $\boldsymbol{x}$ and the experimental design points $\mathcal{X}$. In the general case of a priori unknown correlation parameters $\theta$, the optimal values can be obtained either by a maximum likelihood estimate (Marrel et al., 2008; Dubourg et al., 2011) or by a leave-one-out cross-validation (CV) estimate (Bachoc, 2013). Bachoc (2013) compared these two approaches and concluded that CV leads to more stable results in general. The objective function to be minimized reads in this case:

$$ f_{CV}(\theta) = \mathcal{Y}^{\mathsf{T}} \mathbf{R}(\theta)^{-1} \left[ \operatorname{diag}\left( \mathbf{R}(\theta)^{-1} \right) \right]^{-2} \mathbf{R}(\theta)^{-1} \mathcal{Y} \qquad (14) $$
Note that the Kriging algorithm is efficient for small sets of points. Kriging interpolates between samples, i.e. samples close to a new candidate point have a large influence on its prediction, whereas samples far away have a negligible impact.
4 PC-KRIGING

4.1 Idea
Polynomial-Chaos-Kriging (PC-Kriging) is the combination of the two separate metamodelling techniques PCE and Kriging. PCE handles the global behaviour of a computational model well, whereas Kriging interpolates local variations as a function of the neighbouring points. PC-Kriging combines the two characteristics: a set of polynomials approximates the global behaviour while the local variations are captured by the Kriging part. The new algorithm is based on the universal Kriging metamodel, which combines a regression part and a correlation part (Eq. (9)). The PC-Kriging metamodel $\mathcal{M}^{(PCK)}$ reads:

$$ \mathcal{M}(\boldsymbol{x}) \approx \mathcal{M}^{(PCK)}(\boldsymbol{x}) = \sum_{\alpha \in \mathcal{A}} a_\alpha\, \psi_\alpha(\boldsymbol{x}) + \sigma^2 Z(\boldsymbol{x}, \omega) \qquad (15) $$
where $\boldsymbol{a}^{\mathsf{T}} \cdot \boldsymbol{\psi}_\alpha(\boldsymbol{x}) \equiv \sum_{\alpha \in \mathcal{A}} a_\alpha\, \psi_\alpha(\boldsymbol{x})$ is the mean value of the Gaussian process, $P = |\mathcal{A}|$ is the number of polynomials and $\{\psi_\alpha(\boldsymbol{x}),\, \alpha \in \mathcal{A}\}$ is the set of multivariate orthonormal polynomials defined in Eq. (3). $Z(\boldsymbol{x}, \omega)$ is a zero-mean, unit-variance Gaussian process with the set of hyper-parameters $\{\sigma^2, \theta\}$.

The new metamodelling algorithm consists of two separate tasks: i) the determination of a set of polynomials for the trend, and ii) the determination of the optimal correlation parameters $\{\sigma^2, \theta\}$ and trend coefficients $\{a_\alpha,\, \alpha \in \mathcal{A}\}$. The set of polynomials is computed with the LARS algorithm; the universal Kriging algorithm then provides the values of the correlation parameters and the trend coefficients. The two tasks are executed in series, i.e. sequentially.
4.2 Algorithm
The link between the two distinct metamodelling techniques may be achieved in many ways. Two approaches are proposed here and described in more detail:

• FPC-Kriging: Full-PC-Kriging describes an approach where the two tasks are carried out sequentially. The first step is to find the optimal set of polynomials using the PCE framework, i.e. applying LARS to the experimental design. The resulting set of polynomials is then embedded into the universal Kriging algorithm and the combined metamodel is calibrated. The heuristic assumption behind this algorithm is that the optimal set of polynomials arising from the PCE algorithm is also the optimal set for the combined algorithm. Given an experimental design (input and output values), the FPC-Kriging algorithm consists of the following steps:

1. PCE-related: determine the input distributions and the corresponding orthonormal bases; use hyperbolic index sets to decrease the number of candidate polynomials for the LARS algorithm.
2. Kriging-related: choose an autocorrelation function family to describe the Gaussian process.
3. Select the sparse set of polynomials by LARS and use it as the trend of the FPC-Kriging model.
4. Determine the hyper-parameters $\theta$ of the autocorrelation function.
5. Determine the metamodel parameters $\{\boldsymbol{\beta}, \sigma_y^2\}$ from Eqs. (10) and (11).

After these steps the metamodel is set up and the response at new input samples $\boldsymbol{x}$ may be predicted using Eqs. (12) and (13).

• OPC-Kriging: The Optimal-PC-Kriging approach computes the optimal metamodel iteratively. The optimal set of polynomials is determined within the PCE framework, specifically by the LARS algorithm (as in FPC-Kriging). The LARS algorithm results in a list of ranked polynomials, chosen in decreasing order of their correlation with the output residuals; the polynomials most correlated with the residuals are ranked highest. The computation of the universal Kriging model starts with the highest-ranked polynomial as trend. Several Kriging models are then computed with an increasing number of polynomials according to the LARS ranking, and the iteration terminates when the target error measure converges to a minimum. The evolution of the metamodel's accuracy is assessed by the leave-one-out error described in Eq. (16). The metamodel with the smallest error is called the OPC-Kriging model; a sketch of this loop follows.
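The OPC-Kriging iteration can be summarized by the following high-level sketch; the callables fit_kriging and loo_error are assumptions standing in for the machinery of Secs. 3 and 4.3, not code from the method itself:

```python
def opc_kriging(X_doe, Y, ranked_polynomials, fit_kriging, loo_error):
    # ranked_polynomials: multi-indices ordered by their entry into the
    # LARS active set; fit_kriging(X, Y, trend) returns a calibrated
    # universal Kriging model; loo_error(model, X, Y) evaluates Eq. (16).
    best_model, best_err = None, float("inf")
    for k in range(1, len(ranked_polynomials) + 1):
        model = fit_kriging(X_doe, Y, trend=ranked_polynomials[:k])
        err = loo_error(model, X_doe, Y)
        if err < best_err:
            best_model, best_err = model, err
    return best_model
```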
4.3 Error estimation
PC-Kriging is a universal Kriging metamodel with a particular trend. Thus Kriging error measures may be used to assess the performance of a specific metamodel. In particular, the leave-one-out error is used to compare different PC-Kriging models on a global scale. The leave-one-out error compares the prediction at a sample $\chi^{(i)}$ of the experimental design by a metamodel based on the remaining samples $\mathcal{X}^{(-i)} = \mathcal{X} \setminus \chi^{(i)}$:

$$ Err_{LOO} = \frac{1}{N} \sum_{i=1}^{N} \left( Y^{(i)} - \mu_{\hat{Y}^{(-i)}}(\chi^{(i)}) \right)^{2} \qquad (16) $$
where $\mu_{\hat{Y}^{(-i)}}(\chi^{(i)})$ is the mean prediction at sample point $\chi^{(i)}$ based on the modified experimental design $\mathcal{X}^{(-i)} = \mathcal{X} \setminus \chi^{(i)} = \{\chi^{(j)},\, j = 1, \ldots, i-1, i+1, \ldots, N\}$. Dubrule (1983) derived an analytical formulation of the leave-one-out error for universal Kriging in order to avoid computing $N$ distinct metamodels. The prediction mean and variance of the leave-one-out metamodels are given by:

$$ \mu_{\hat{Y}^{(-i)}} = -\sum_{j=1,\, j \neq i}^{N} \frac{B_{ij}}{B_{ii}}\, Y^{(j)} = Y^{(i)} - \sum_{j=1}^{N} \frac{B_{ij}}{B_{ii}}\, Y^{(j)} \qquad (17) $$

$$ \sigma^2_{\hat{Y}^{(-i)}} = \frac{1}{B_{ii}} \qquad (18) $$

where $\sigma^2_{\hat{Y}^{(-i)}}$ is the prediction variance at sample point $\chi^{(i)}$ based on the modified experimental design $\mathcal{X}^{(-i)}$, and $\mathbf{B}$ is a square matrix of size $(N + P)$, where $N$ is the number of samples in the experimental design $\mathcal{X}$ and $P$ is the number of polynomials in
the trend:

$$ \mathbf{B} = \begin{pmatrix} \sigma_y^2 \mathbf{R} & \mathbf{F} \\ \mathbf{F}^{\mathsf{T}} & \mathbf{0} \end{pmatrix}^{-1} \qquad (19) $$
where $\sigma_y^2$ is the variance of the Kriging model calibrated on the entire experimental design $\mathcal{X}$. A sketch of this leave-one-out computation is given below.

Kriging, and thus also PC-Kriging, has the ability to provide a local error measure. The Kriging prediction is stochastic, i.e. the mean and variance of the prediction are available at any prediction point $\boldsymbol{x}$ in the input space. The variance can be interpreted as a local error measure. It is useful when searching for regions of the input space with low prediction accuracy: by placing new samples in these regions, the overall accuracy of the metamodel can be increased iteratively. Bichon et al. (2011) and Echard et al. (2011) refer to this technique as adaptive metamodelling; it may be applied in reliability and optimization algorithms.
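For illustration, Eqs. (16)-(19) translate into the following sketch for a calibrated universal Kriging model (sigma2, R, F and Y as defined above; a direct matrix inverse is used for clarity):

```python
import numpy as np

def loo_error_dubrule(sigma2, R, F, Y):
    N, P = F.shape
    K = np.block([[sigma2 * R, F],
                  [F.T, np.zeros((P, P))]])
    B = np.linalg.inv(K)                              # Eq. (19)
    err = 0.0
    for i in range(N):
        mu_i = -sum(B[i, j] / B[i, i] * Y[j]          # Eq. (17)
                    for j in range(N) if j != i)
        err += (Y[i] - mu_i) ** 2
    return err / N                                    # Eq. (16)
```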
5 BENCHMARK APPLICATION

5.1 Setup
For validation purposes, analytical benchmark functions are metamodelled in this section. The functions are easy to evaluate, so the metamodels can be compared to the exact functions. In this paper the focus is on the three functions known as the Ishigami, Sobol' and Rosenbrock functions, defined as follows (Python definitions of the three functions are sketched after this list):

• Ishigami function: The Ishigami function is a low-dimensional function composed of polynomial and trigonometric parts. It is a smooth function which can be approximated by a set of polynomials, e.g. a Taylor series. The function reads:

$$ f_1(\boldsymbol{x}) = \sin x_1 + 7 \sin^2 x_2 + 0.1\, x_3^4 \sin x_1 \qquad (20) $$

This function is used for sensitivity analysis assuming that the input variables are $x_i \sim \mathcal{U}(-\pi, \pi)$, where $i = 1, 2, 3$.

• Sobol' function: The Sobol' function is a common benchmark in sensitivity analysis because sensitivity measures such as the Sobol' indices are easy to derive analytically. Due to an absolute value operator in the analytical formulation, the behaviour of the function is non-smooth at the points $x_i = 0.5$. The function reads:

$$ f_2(\boldsymbol{x}) = \prod_{i=1}^{M} \frac{|4 x_i - 2| + c_i}{1 + c_i} \qquad (21) $$

This function is used for sensitivity analysis assuming that the input variables are $x_i \sim \mathcal{U}(0, 1)$, where $i = 1, \ldots, 8$ and $\boldsymbol{c} = (1, 2, 5, 10, 20, 50, 100, 500)^{\mathsf{T}}$.

• Rosenbrock function: The Rosenbrock function is a smooth polynomial function. In this paper the two-dimensional formulation is used for the sake of clarity in the visualisations. The analytical formulation reads:

$$ f_3(\boldsymbol{x}) = 100 \left( x_2 - x_1^2 \right)^2 + (1 - x_1)^2 \qquad (22) $$

For sensitivity analysis the input variables are defined as $x_i \sim \mathcal{U}(-2, 2)$, where $i = 1, 2$.
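Vectorized Python definitions of Eqs. (20)-(22), each taking an (n, M) array of samples:

```python
import numpy as np

def ishigami(x):
    # Eq. (20), M = 3
    return (np.sin(x[:, 0]) + 7 * np.sin(x[:, 1]) ** 2
            + 0.1 * x[:, 2] ** 4 * np.sin(x[:, 0]))

def sobol(x, c=np.array([1., 2., 5., 10., 20., 50., 100., 500.])):
    # Eq. (21), M = 8
    return np.prod((np.abs(4 * x - 2) + c) / (1 + c), axis=1)

def rosenbrock(x):
    # Eq. (22), M = 2
    return 100 * (x[:, 1] - x[:, 0] ** 2) ** 2 + (1 - x[:, 0]) ** 2
```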
5.2 Analysis

Four previously discussed surrogate models are compared in this section, namely:

• Ordinary Kriging is a special case of the universal Kriging discussed in Sec. 3 where the trend is simplified (Eq. (9)): the vector product $\boldsymbol{\beta}^{\mathsf{T}} \cdot \boldsymbol{f}(\boldsymbol{x})$ reduces to $\beta_1 \cdot f_1(\boldsymbol{x})$ with $f_1(\boldsymbol{x}) = 1$, i.e. the only unknown coefficient is $\beta_1$. For the autocorrelation function we choose the Matérn autocorrelation function with parameter $\nu = 5/2$, which reads (Roustant et al., 2012; a sketch of this function follows the list):

$$ R(\boldsymbol{x} - \boldsymbol{x}'; \boldsymbol{l}, \nu = 5/2) = \prod_{i=1}^{M} \left( 1 + \frac{\sqrt{5}\, |x_i - x_i'|}{l_i} + \frac{5\, (x_i - x_i')^2}{3\, l_i^2} \right) \exp\left( -\frac{\sqrt{5}\, |x_i - x_i'|}{l_i} \right) \qquad (23) $$

where $\boldsymbol{l} = \{l_i > 0,\, i = 1, \ldots, M\}$ are the correlation lengths and $M$ is the number of input variables.

• The PCE method as described in Sec. 2. The hyperbolic index set parameter is set to $q = 0.75$. The maximal polynomial degrees used for the Ishigami, Sobol' and Rosenbrock functions are $p = 20$, $p = 8$ and $p = 6$, respectively.

• FPC-Kriging and OPC-Kriging as described in Sec. 4, with the Matérn autocorrelation function ($\nu = 5/2$) as in the case of ordinary Kriging.
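A minimal sketch of the Matérn 5/2 autocorrelation of Eq. (23) between two points:

```python
import numpy as np

def matern_52(x1, x2, l):
    # Matern autocorrelation of Eq. (23), with one correlation length l_i
    # per input dimension (nu = 5/2).
    h = np.abs(np.asarray(x1) - np.asarray(x2)) / np.asarray(l)
    return np.prod((1 + np.sqrt(5) * h + 5 * h ** 2 / 3)
                   * np.exp(-np.sqrt(5) * h))
```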
The response values of the analytical functions are easy to obtain, which is why the performance of the metamodelling approaches can be compared on a large validation set $\mathcal{X} = \{\boldsymbol{x}_1, \ldots, \boldsymbol{x}_n\}$ of size $n = 10^5$. The performance is measured by the relative generalized error, which is defined as:

$$ \varepsilon_{gen} \equiv \frac{\sum_{i=1}^{n} \left( \mathcal{M}(\boldsymbol{x}_i) - \widehat{\mathcal{M}}(\boldsymbol{x}_i) \right)^{2}}{\sum_{i=1}^{n} \left( \mathcal{M}(\boldsymbol{x}_i) - \hat{\mu}_y \right)^{2}}, \qquad \hat{\mu}_y = \frac{1}{n} \sum_{i=1}^{n} \mathcal{M}(\boldsymbol{x}_i) \qquad (24) $$
where $\mathcal{M}(\boldsymbol{x}_i)$ is the response of the computational model to $\boldsymbol{x}_i$ and $\widehat{\mathcal{M}}(\boldsymbol{x}_i)$ is the response of the metamodel.

The comparison of the four approaches is carried out under different settings, such as varying the number of samples in the experimental design. The experimental design is generated with the Latin-hypercube sampling algorithm (McKay et al., 1979). To ensure statistical stability of the results, 50 independent runs are carried out and their results are presented as box plots. Recall that the central mark of a box plot corresponds to the median value and the box's edges mark the 25th and 75th percentiles. The whiskers describe the boundary to the outliers, which are defined as values smaller than $q_{25} - 1.5(q_{75} - q_{25})$ or larger than $q_{75} + 1.5(q_{75} - q_{25})$.
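The error measure of Eq. (24) is, for instance, a two-liner on a validation set:

```python
import numpy as np

def rel_gen_error(y_model, y_meta):
    # Relative generalized error of Eq. (24): y_model are the computational
    # model responses on the validation set, y_meta the metamodel responses.
    mu = np.mean(y_model)
    return np.sum((y_model - y_meta) ** 2) / np.sum((y_model - mu) ** 2)
```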
5.3 Results
5.3.1 Comparison of different metamodelling techniques

Figures 1 and 2 compare the relative generalized error as a function of the DOE size for the four metamodelling techniques. In each figure, (a) shows the ordinary Kriging model, (b) the PCE model, (c) the FPC-Kriging model and (d) the OPC-Kriging model. The range of experimental design sizes is chosen so as to cover a large range of error estimates, i.e. from large to small errors. The box plots summarize 50 independent runs for each sample size.

The results for the Ishigami function are plotted in Fig. 1. For very small experimental designs ($N = 20$), ordinary Kriging performs better than the other metamodelling techniques. Already at $N = 32$ samples, OPC-Kriging has lower error estimates than the other surrogate models. When increasing the number of samples, PC-Kriging performs better than the separate techniques on their own, with OPC-Kriging showing slightly smaller error estimates than FPC-Kriging. For larger experimental designs, PCE clearly outperforms Kriging, and the PC-Kriging approaches follow the trend of PCE rather than that of Kriging in this case.

The second analytical benchmark application is the Sobol' function ($M = 8$) in Fig. 2. For very small sample sizes of $N = 16, 32$, PC-Kriging performs better than the two distinct approaches, and it has the lowest median values over the entire range. The leading metamodel is again OPC-Kriging, with the smallest relative generalized errors. The difference between PCE and Kriging is not as large as for the Ishigami function. An explanation for this behaviour lies in the shape of the Sobol' function: it is non-smooth at $x_i = 0.5$ due to the absolute value in the numerator of the analytical formulation, which is hard to approximate with a set of polynomials. That may be the reason why the performance of Kriging is in the same range as that of PCE. Here it is visible that PC-Kriging combines the two characteristics and provides a better metamodel.

Overall, PC-Kriging performs better than, or at least as well as, the two distinct metamodelling techniques. Considering a realistic case where the underlying process is totally unknown, i.e. where one does not know beforehand which metamodel leads to the better approximation of the computational model, PC-Kriging provides the best metamodel among those considered in this paper. PC-Kriging can tend towards two different regimes: i) the metamodel consists of a small trend part (with only a few terms) and a large Kriging variance $\sigma_y^2$, or ii) the metamodel consists of a very accurate trend part and a small $\sigma_y^2$.
[Figure 1 (plots omitted): Relative generalized error $\varepsilon_{gen}$ versus the DOE size ($N = 20, 32, 50, 64$) for the Ishigami function; panels (a) ordinary Kriging, (b) PCE, (c) FPC-Kriging, (d) OPC-Kriging.]
5.3.2 Error variations

The results of the OPC-Kriging models of the Rosenbrock function are shown in Fig. 3. The error estimates are low even for small experimental designs: a sample size of $N = 20$ is sufficient to metamodel this function with high accuracy, owing to the polynomial formulation of the function. Fig. 3(a) shows the box plot of the relative generalized error of 50 independent runs with a sample size of $N = 12$. The error is drawn on a logarithmic scale (base 10) in order to make the variation in the values visible: the order of magnitude of the error varies between $10^{-4}$ and $10^{-1}$.
[Figure 2 (plots omitted): Relative generalized error $\varepsilon_{gen}$ versus the DOE size ($N = 16, 32, 64, 128$) for the Sobol' function; panels (a) ordinary Kriging, (b) PCE, (c) FPC-Kriging, (d) OPC-Kriging.]
Fig. 3(b) shows the logarithmic (base 10) contour plot of the output of the computational model $Y = \mathcal{M}(\boldsymbol{X})$. The red colours represent high values whereas the blue regions represent low values; the variation of output values is large. The function formulation is polynomial, so a smooth behaviour is ensured, as seen in the figure. Fig. 3(c) and Fig. 3(d) present the designs of experiments (DOE) which lead to the smallest and the largest error measure, respectively. The best experimental design (lowest relative generalized error) generated by the Latin-hypercube sampling algorithm is shown in Fig. 3(c); the worst DOE is presented in Fig. 3(d). The two DOE thus lead to error estimates that differ by three orders of magnitude, which is surprising since the two DOE look alike at first sight. On closer inspection the difference becomes apparent: the best experimental design has more sample points in the region of high contrast (upper half of Fig. 3(c)), closer to the sickle-shaped area of low output values, whereas the worst DOE (Fig. 3(d)) includes more points in the low-contrast regions, which do not add information about the behaviour of the function. The DOE in Fig. 3(d) also performs poorly for traditional PCE and Kriging. The best DOE in Fig. 3(c), however, differs from the best ones for PCE and Kriging, which is to be expected since PCE and Kriging are two different metamodelling frameworks.
[Figure 3 (plots omitted): Analysis of the experimental designs (DOE) on the Rosenbrock function ($N = 12$ samples); panels (a) box plot of $\log_{10}(\varepsilon_{gen})$ for OPC-Kriging, (b) logarithmic contour plot of the model output over $(X_1, X_2)$, (c) best DOE, (d) worst DOE.]
Note that the experimental design is very small in this case. The error variations decrease rapidly with increasing experimental design size. Nonetheless, this example can be transposed to a high-dimensional computational model where only "few" model evaluations are affordable. The metamodel then depends strongly on each sample of the experimental design, and even advanced sampling techniques may lead to poor metamodel performance.
6 CONCLUSION
This paper proposes a new non-intrusive metamodelling approach combining the two approaches Kriging and polynomial chaos expansions. PC-Kriging is based on the universal Kriging algorithm, where the trend is represented by a sum of orthonormal polynomials. The least-angle regression algorithm (used in PCE algorithms) determines the optimal sparse set of orthonormal polynomials in coherency with the input variables' distributions.

PC-Kriging is compared to the traditional approaches on analytical benchmark functions which are easy to evaluate. The metamodels are compared via their relative generalized error, which relates the computational model output on a large validation set to the metamodel output. The conclusion of the comparison is that the new metamodelling technique performs better than, or at least as well as, the distinct approaches; thus PC-Kriging is advantageous over the distinct techniques.

In the last section the influence of the experimental design is discussed. For small experimental designs the influence of each sample on the quality of the metamodel is large (even with advanced input sampling techniques). This situation appears frequently in practice, as there are often severe resource limitations; one therefore has to pay attention to the definition of the input sample points.
7 REFERENCES
Bachoc, F., 2013, "Cross validation and maximum likelihood estimations of hyper-parameters of Gaussian processes with model misspecification", Computational Statistics & Data Analysis, 66:55–69.

Bect, J., Ginsbourger, D., Li, L., Picheny, V. and Vazquez, E., 2012, "Sequential design of computer experiments for the estimation of a probability of failure", Statistics and Computing, 22(3):773–793.

Berveiller, M., Sudret, B. and Lemaire, M., 2006, "Stochastic finite elements: a non intrusive approach by regression", European Journal of Computational Mechanics, 15(1-3):81–92.

Bichon, B., McFarland, J. and Mahadevan, S., 2011, "Efficient surrogate models for reliability analysis of systems with multiple failure modes", Reliability Engineering & System Safety, 96(10):1385–1395.

Blatman, G., 2009, "Adaptive sparse polynomial chaos expansions for uncertainty propagation and sensitivity analysis", Ph.D. thesis, Université Blaise Pascal - Clermont II, France.

Blatman, G. and Sudret, B., 2010, "An adaptive algorithm to build up sparse polynomial chaos expansions for stochastic finite element analysis", Probabilistic Engineering Mechanics, 25(2):183–197.

Blatman, G. and Sudret, B., 2011, "Adaptive sparse polynomial chaos expansion based on least angle regression", Journal of Computational Physics, 230:2345–2367.

Dubourg, V., Sudret, B. and Bourinet, J.-M., 2011, "Reliability-based design optimization using Kriging and subset simulation", Structural and Multidisciplinary Optimization, 44(5):673–690.

Dubourg, V., 2011, "Adaptive surrogate models for reliability analysis and reliability-based design optimization", Ph.D. thesis, Université Blaise Pascal, Clermont-Ferrand, France.

Dubrule, O., 1983, "Cross validation of Kriging in a unique neighbourhood", Mathematical Geology, 15(6):687–698.

Echard, B., Gayton, N. and Lemaire, M., 2011, "AK-MCS: an active learning reliability method combining Kriging and Monte Carlo simulation", Structural Safety, 33(2):145–154.

Echard, B., 2012, "Évaluation par krigeage de la fiabilité des structures sollicitées en fatigue", Ph.D. thesis, Université Blaise Pascal - Clermont II, France.

Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R., 2004, "Least angle regression", Annals of Statistics, 32(2):407–499.

Ghanem, R. and Spanos, P., 2003, "Stochastic Finite Elements: A Spectral Approach", Courier Dover Publications.

Kaymaz, I., 2005, "Application of Kriging method to structural reliability problems", Structural Safety, 27(2):133–151.

Krige, D., 1951, "A statistical approach to some basic mine valuation problems on the Witwatersrand", Journal of the Chemical, Metallurgical and Mining Society of South Africa, 52(6):119–139.

Marrel, A., Iooss, B., Van Dorpe, F. and Volkova, E., 2008, "An efficient methodology for modeling complex computer codes with Gaussian processes", Computational Statistics & Data Analysis, 52(10):4731–4744.

McKay, M.D., Beckman, R.J. and Conover, W.J., 1979, "A comparison of three methods for selecting values of input variables in the analysis of output from a computer code", Technometrics, 21(2):239–245.

Rackwitz, R., 2001, "Reliability analysis - a review and some perspectives", Structural Safety, 23(4):365–395.

Rasmussen, C. and Williams, C., 2006, "Gaussian Processes for Machine Learning", Adaptive Computation and Machine Learning, MIT Press, Cambridge, Massachusetts.

Santner, T., Williams, B. and Notz, W., 2003, "The Design and Analysis of Computer Experiments", Springer, New York.

Sudret, B., 2012, "Metamodels for structural reliability and uncertainty quantification", In: Proceedings of the 5th Asian-Pacific Symposium on Structural Reliability and its Applications (APSSRA'2012), Singapore, 53–76.

RESPONSIBILITY NOTICE

The authors are solely responsible for the printed material included in this paper.