AIAA 2015-1817 AIAA SciTech 5-9 January 2015, Kissimmee, Florida 17th AIAA Non-Deterministic Approaches Conference
Multi-Source Surrogate Modeling with Bayesian Hierarchical Regression Sayan Ghosh∗, Ryan B. Jacobs∗, and Dimitri N. Mavris†
Downloaded by GEORGIA INST OF TECHNOLOGY on December 3, 2015 | http://arc.aiaa.org | DOI: 10.2514/6.2015-1817
Aerospace Systems Design Laboratory, Georgia Institute of Technology, Atlanta, GA 30332
The early design phases of unconventional aerospace systems often lack historical data and accurate low-fidelity analyses. Accurate data from sources such as flight tests, wind tunnel tests, and computational experiments are required to enable accurate predictions of system performance throughout the design space. Because these physical and computational experiments require significant resources, high-fidelity data are sparsely distributed throughout the design space. This paper proposes a multi-source surrogate modeling method that uses Bayesian hierarchical regression to build an accurate regression model by combining sparse information from different sources. The method is illustrated using one-dimensional analytic functions for two different scenarios: 1) when all of the sources are at the same level of fidelity and 2) when the sources have different levels of fidelity. Finally, the method is demonstrated on an airfoil drag analysis problem in which a regression model on sparse wind tunnel data is improved using data from two moderate-fidelity computer programs.
Nomenclature
$\beta_j$: Vector of regression parameters of the $j$th source
$\gamma^2$: Variance vector of regression parameters $\beta_j$
$\theta$: Mean vector of regression parameters $\beta_j$
$I$: Identity matrix
$x_{i,j}$: Vector of regressors of the $i$th observation from the $j$th source
$X_j$: $n_j \times p$ matrix formed by arranging the regressor vectors $x_{1,j}, \ldots, x_{n_j,j}$ of the $j$th source
$Y_j$: Vector of $n_j$ observations in the $j$th source
$\hat{y}_j$: Response surface based surrogate model of $y_j$
$\hat{y}_{c1}, \hat{y}_{c2}$: Response surface based surrogate models of $y_{c1}, y_{c2}$
$\hat{y}_{hf}$: Response surface based surrogate model of $y_{hf}$
$\sigma^2$: Variance of noise in each observation
$b_i$: Coefficients/parameters of the true function $y_{true}$
$c_{ji}$: Coefficients/parameters of the $y_j$ model
$m$: Number of sources of observations
$n_j$: Number of observations in the $j$th source
$p$: Length of the regressor vector
$y_j$: $j$th computational model to predict the true function
$y_{c1}, y_{c2}$: Low-fidelity computational models
$y_{hf}$: High-fidelity computational model
$Y_{i,j}$: $i$th observation from the $j$th source
$y_{true}$: True function

∗ Graduate Research Assistant, School of Aerospace Engineering, 270 Ferst Drive, Mail Stop 0150, AIAA Student Member.
† Boeing Professor of Advanced Aerospace Systems Analysis, School of Aerospace Engineering, 270 Ferst Drive, Mail Stop 0150, AIAA Fellow.
Copyright © 2015 by Sayan Ghosh. Published by the American Institute of Aeronautics and Astronautics, Inc., with permission.
I. Introduction
During the early design phases for unconventional aerospace systems, it is desirable to increase knowledge by exploring the design space and to delay the reduction of design freedom so that expensive redesigns can be avoided in later phases. One element that is necessary to enable this is the accurate prediction of system performance throughout the design space. Analyses based on historical data and low-fidelity analyses are often not sufficient for this role. Thus, it is ideal to employ high-fidelity, physics-based analyses early in the design process as well. Also, limited amounts of data obtained from physical experiments may be available to the designer. However, due to the significant resources required for physical experimentation and many high-fidelity analysis codes, accurate predictions are typically sparsely distributed throughout the design space. Surrogate models can be generated for interpolation and extrapolation throughout the design space with minimal computational expense, but uncertainty in the predictions will be large due to scarcity of the high-fidelity data. If a larger quantity of data generated by lower fidelity analyses is obtained, it can be used to improve the accuracy and reduce uncertainty of the surrogate model predictions. Many of the existing methods for generating multi-fidelity surrogate models are based on the idea that the high-fidelity function can be approximated as a tuned or corrected function of the low-fidelity model (e.g., [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]). This approach was generalized by Toropov [11] with three types of tuning: linear and multiplicative functions, correction factors, and the use of low-fidelity model inputs as tuning parameters. For the first two types, the analyst must specify a functional relationship between the low- and high-fidelity functions. The third type requires that the low-fidelity code(s) possess inputs that can serve as tuning parameters. 
Kriging and Gaussian process regression have also been proposed for the multi-fidelity analysis problem (e.g., [2,10]). These methods use an auto-regressive model to correct low-fidelity predictions, and the response covariance matrix is modified to account for the multiple data sources. Although Kriging models are useful for many nonlinear problems due to their flexibility, some problems do not require their sophistication and complexity. For example, an analysis performed by Viana et al. [12] indicated that the polynomial response surface regression-based surrogate modeling technique is most used by authors in the structural and multidisciplinary optimization and probabilistic analysis literature of 2012. A novel approach to multi-source surrogate modeling is desired that does not require the analyst to describe the high-fidelity model(s)/data as a function of the low-fidelity model(s) or use tuning parameters in the low-fidelity model(s), leverages the simplicity of linear regression, and can accommodate any number of data sources. The authors prefer the term multi-source in this research because there are situations in which a hierarchy of fidelity levels cannot be specified. For example, one may possess multiple aerodynamic data sets generated by different wind tunnels, which are regarded as having the same “fidelity”. A technique from Bayesian statistics, called hierarchical regression [13, 14, 15], offers all of the desired features. Bayesian hierarchical regression falls under the category of Bayesian hierarchical models, which are commonly used to carry out meta-analysis in the field of biostatistics [16]. 
Meta-analysis is defined as “a quantitative method of combining the results of independent studies (usually drawn from the published literature) and synthesizing summaries and conclusions which may be used to evaluate therapeutic effectiveness, plan new studies, etc., with application chiefly in the areas of research and medicine.” [17] Although the Bayesian hierarchical technique can be applied to advanced surrogate models, such as Artificial Neural Networks [18], this paper demonstrates the method with polynomial response surface regression. The multi-source approach based on this technique is briefly described in the following section. The method is then illustrated with a one-dimensional example and demonstrated with a multi-fidelity airfoil drag analysis problem. Finally, some concluding remarks are presented.
II. Bayesian Hierarchical Regression
In multi-source surrogate modeling using Bayesian hierarchical regression, an $i$th observation from the $j$th source, $Y_{i,j}$, can be modeled by source-specific regression parameters, $\beta_j$, as

$$Y_{i,j} = \beta_j^T x_{i,j} + \epsilon_{i,j}, \qquad \epsilon_{i,j} \sim \text{i.i.d. } N(0, \sigma^2) \tag{1}$$

where $x_{i,j}$ is a vector of regressors and $\sigma^2$ is the homoscedastic noise. If $Y_j$ represents a vector of $n_j$ observations from the $j$th source and $x_{1,j}, \ldots, x_{n_j,j}$ are collected in an $n_j \times p$ matrix $X_j$, then within the $j$th source the sampling model is given as

$$Y_j \sim N(X_j \beta_j, \sigma^2 I) \tag{2}$$

If $\sigma^2$ and the regression parameters from $m$ sources $\beta_1, \ldots, \beta_m$ are given, then $Y_1, \ldots, Y_m$ are conditionally independent:

$$P(Y_1, \ldots, Y_m \mid \beta_1, \ldots, \beta_m, \sigma^2) = \prod_{i=1}^{m} P(Y_i \mid \beta_1, \ldots, \beta_m, \sigma^2) \tag{3}$$
The heterogeneity among the regression parameters β 1 , . . . , β m is described as the between-source sampling model. If the fidelity of the sources of data cannot be distinguished, then regression parameters can be considered as independent and identically distributed (i.i.d.) from any given distribution representing the sampling variability between the sources. A normal hierarchical regression model can be used to define between-source heterogeneity with the multivariate normal distribution given as
$$\beta_1, \ldots, \beta_m \sim \text{i.i.d. } N(\theta, \gamma^2 I) \tag{4}$$
The multivariate distribution on regression parameters is not a prior distribution but rather a sampling distribution representing heterogeneity among the regression parameters from different sources. Figure 1 shows a graphical representation of the hierarchical regression model.
...
Figure 1. Graphical representation of the hierarchical regression model
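The two-level structure of Eqs. (1)-(4) can be read as a generative recipe: draw one coefficient vector per source around the shared mean, then draw noisy observations around each source's regression line. The following is a minimal sketch of that reading, not the authors' code. The function name `simulate_sources`, the polynomial basis, the uniform regressor locations, and all numerical values are assumptions made for illustration, and $\gamma^2$ is treated as a single scalar.

```python
import numpy as np

def simulate_sources(theta, gamma, sigma, n_per_source, rng):
    """Draw source-level coefficients and noisy observations per Eqs. (1)-(4)."""
    p = len(theta)
    data = []
    for n_j in n_per_source:
        # Between-source model, Eq. (4): beta_j ~ N(theta, gamma^2 I)
        beta_j = theta + gamma * rng.standard_normal(p)
        # Assumed regressors: polynomial basis 1, x, ..., x^(p-1) on [0, 1]
        x = rng.uniform(0.0, 1.0, size=n_j)
        X_j = np.vander(x, N=p, increasing=True)
        # Within-source model, Eq. (2): Y_j ~ N(X_j beta_j, sigma^2 I)
        Y_j = X_j @ beta_j + sigma * rng.standard_normal(n_j)
        data.append((X_j, Y_j, beta_j))
    return data

rng = np.random.default_rng(0)
theta = np.array([0.3, 0.1, -0.3, 0.05])  # hypothetical shared mean
sources = simulate_sources(theta, gamma=0.05, sigma=0.01,
                           n_per_source=[4, 20], rng=rng)
for X_j, Y_j, beta_j in sources:
    print(X_j.shape, Y_j.shape, np.round(beta_j, 3))
```

Inference runs this recipe in reverse: given only the `(X_j, Y_j)` pairs, the posterior over each `beta_j` and over `theta` is sought.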
To carry out multi-source modeling using Bayesian hierarchical regression, data from different experiments such as flight tests, wind tunnel tests, and computational models are treated as different sources. The hyperparameters $\theta$ and $\gamma^2$ accumulate information from the regression parameters obtained from each source rather than from the samples directly. If all sources are at the same level of fidelity, then the regression based on the hyperparameters (also referred to as meta regression in this paper) forms the ensemble model of all the sources. In a different scenario, when the sources of data are not at the same level of fidelity, the Bayesian hierarchical model strengthens the high-fidelity surrogate model(s) by passing information about the trend, or regression coefficients, of the lower-fidelity model(s) to the higher-fidelity model(s) and vice versa. In the following section, examples are presented to demonstrate the application of Bayesian hierarchical regression to both of these scenarios.
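How information passes between sources can be made explicit through the conditional posterior distributions implied by Eqs. (2) and (4). The expressions below are the standard conjugate results for a normal hierarchical regression, written here as a sketch under two assumptions that are not taken from the paper: $\gamma^2$ is a scalar, and $\theta$ has a normal prior $N(\alpha, \Sigma_0)$, where $\Sigma_0$ is an assumed prior covariance.

```latex
% Source-level coefficients, given the hyperparameters
\beta_j \mid Y_j, \theta, \gamma^2, \sigma^2 \sim N(m_j, V_j), \qquad
V_j = \left( \frac{X_j^T X_j}{\sigma^2} + \frac{I}{\gamma^2} \right)^{-1}, \qquad
m_j = V_j \left( \frac{X_j^T Y_j}{\sigma^2} + \frac{\theta}{\gamma^2} \right)

% Shared mean, given the source-level coefficients
\theta \mid \beta_1, \ldots, \beta_m, \gamma^2 \sim N(\mu_m, \Lambda_m), \qquad
\Lambda_m = \left( \Sigma_0^{-1} + \frac{m}{\gamma^2} I \right)^{-1}, \qquad
\mu_m = \Lambda_m \left( \Sigma_0^{-1} \alpha + \frac{1}{\gamma^2} \sum_{j=1}^{m} \beta_j \right)
```

The mean $m_j$ is a precision-weighted compromise between the source's own least-squares information, $X_j^T Y_j / \sigma^2$, and the shared mean $\theta$. For a source with few observations the term $I/\gamma^2$ carries relatively more weight, so its coefficients are pulled more strongly toward $\theta$, which in turn depends on every source only through its coefficients $\beta_j$ and not through its sample count.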
III. Illustrative Examples
Two cases of multi-source problems are demonstrated in this section with one-dimensional analytical functions. The first case demonstrates the scenario when the sources (in this case, analytical functions) are at the same level of fidelity. It is assumed that the sources share a similar functional form and trend with the “true” function. Discrepancies in the analytical functions are assumed to be due to errors in calibration parameters. This is similar to the situation when data are available from different wind tunnel tests. The number of samples from different sources may or may not be the same. All the sources are modeled with linear regression. The regression coefficients are then linked with hyperparameters in a Bayesian hierarchical regression framework. The ensemble model, also called the meta-regression in this work, is given by the regression model built using the posterior distribution of the hyperparameters.

The second case demonstrates the scenario when the sources are at different levels of fidelity. It is assumed that the “true” model is available but only a few samples are available due to its computational expense. The goal is to build a better predictor of the high-fidelity model with the assistance of samples from lower-fidelity models. The lower-fidelity models are assumed to be cheaper, and the number of samples available from these models is relatively large. The lower-fidelity models are not accurate, but they share some characteristics with the higher-fidelity models in different regions of the design space. To build an accurate predictor of the high-fidelity model, a regression model is built for each of the high- and low-fidelity models. As in the first case, the regression coefficients are linked through hyperparameters in a Bayesian hierarchical regression framework. The accurate predictor is then represented by the regression model formed by the posterior distribution of the regression coefficients corresponding to the high-fidelity model.

A. Case 1: Models at the Same Level of Fidelity
Consider a true function of the form

$$y_{true} = b_1 + b_2 x + b_3 x \sin(b_4 x) \tag{5}$$

where $b^T = [0.3, 0.1, -0.3, 3]$. Assume that multiple computational models, $y_j$, are available which predict the true function given in Eq. (5). Also, assume that the computational models are of the same fidelity and cannot be differentiated. The functional form of these computational models is similar to the true function but with different coefficients.

$$y_j = c_{j1} + c_{j2} x + c_{j3} x \sin(c_{j4} x) \tag{6}$$

The coefficient vector $c_j$ is deterministic and can be thought of as a vector of calibration parameters for the $j$th model. Due to calibration error, modeling limitations, etc., it has been assumed that the calibration parameters of the $j$th computational model are a random draw from a normal distribution with mean $b^T$ and standard deviation of $0.1 b^T$. Two computational models $y_1$ and $y_2$ are randomly selected with $c_1 = [0.3022, 0.1052, -0.3117, 2.7926]$ and $c_2 = [0.3211, 0.1015, -0.2915, 3.0367]$. Four samples are generated using $y_1$ and twenty samples are generated using $y_2$ in the range $x \in [0, 1]$, as shown in figure 2. A linear regression is carried out to fit a cubic polynomial by combining data from both of the computational codes, as shown in figure 3. Because $y_2$ contributes many more samples, the cubic fit is biased towards the $y_2$ samples. This happens because linear regression on the combined data cannot differentiate between within-source and between-source heterogeneity.
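The bias of the pooled fit can be reproduced with a few lines of NumPy. This is a minimal sketch rather than the paper's procedure: the sample locations are not listed in the paper, so evenly spaced points are assumed, and the helper name `model` and the distance measure are introduced here for illustration.

```python
import numpy as np

def model(c, x):
    """Analytic form of Eq. (6): c1 + c2*x + c3*x*sin(c4*x)."""
    return c[0] + c[1] * x + c[2] * x * np.sin(c[3] * x)

c1 = np.array([0.3022, 0.1052, -0.3117, 2.7926])
c2 = np.array([0.3211, 0.1015, -0.2915, 3.0367])

# Assumption: evenly spaced sample locations on [0, 1]
x1 = np.linspace(0.0, 1.0, 4)    # four samples from y1
x2 = np.linspace(0.0, 1.0, 20)   # twenty samples from y2
y1, y2 = model(c1, x1), model(c2, x2)

# Ordinary least-squares cubic fit on the pooled data
x_all = np.concatenate([x1, x2])
y_all = np.concatenate([y1, y2])
pooled = np.polynomial.Polynomial.fit(x_all, y_all, deg=3)

# Mean absolute distance of the pooled fit from each source model
grid = np.linspace(0.0, 1.0, 101)
dist_to_y1 = np.mean(np.abs(pooled(grid) - model(c1, grid)))
dist_to_y2 = np.mean(np.abs(pooled(grid) - model(c2, grid)))
print(dist_to_y1, dist_to_y2)
```

Least squares weights every observation equally, so the source with twenty points dominates the objective and the pooled curve sits closer to $y_2$ than to $y_1$.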
[Plot of y versus x on [0, 1] showing ytrue, y1, and y2 together with the y1 samples and y2 samples.]
Figure 2. Samples from two models at the same level of fidelity used for Bayesian hierarchical modeling
To overcome this issue, a Bayesian hierarchical regression is carried out by modeling each computational model as a cubic polynomial.

$$\hat{y}_j = \beta_{0,j} + \beta_{1,j} x + \beta_{2,j} x^2 + \beta_{3,j} x^3 \tag{7}$$

The regression coefficients $\beta_j$ are i.i.d. from the multivariate distribution $N(\theta, \gamma^2 I)$. The prior distribution on $\theta$ is assumed to be a normal distribution with mean $\alpha^T = [0.0, 0.0, 0.0, 0.0]$ and standard deviation $\Lambda_0^T = 10^3 [1.0, 1.0, 1.0, 1.0]$. For $\sigma^2$, a prior distribution of inverse-gamma(0.001, 0.001) is assumed. A Markov Chain Monte Carlo (MCMC) based Gibbs sampling is carried out using OpenBUGS [19] to calculate the posterior distribution on the parameters. The polynomial fit using the mean of the posterior distribution of $\theta$, labeled as meta regression, is also shown in figure 3. It is observed that the meta regression is not biased towards $y_2$. The meta regression is able to gather strength from the regression coefficients of each model rather than the samples, which makes the meta regression a better predictor of the true model when compared to the simple regression of the combined data.
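The paper performs the sampling in OpenBUGS. As an illustration of what such a Gibbs sampler does for this model, the following hand-written NumPy sketch cycles through the conjugate conditional distributions of $\beta_j$, $\theta$, $\sigma^2$, and $\gamma^2$. It is not the authors' implementation. Assumptions: $\gamma^2$ is a scalar with an inverse-gamma(0.001, 0.001) prior (the paper does not state its prior), the sample locations are evenly spaced, and the names `gibbs_hierarchical`, `inv_gamma`, and `f` are hypothetical.

```python
import numpy as np

def inv_gamma(rng, shape, rate):
    """Draw from inverse-gamma(shape, rate) as the reciprocal of a gamma draw."""
    return 1.0 / rng.gamma(shape, 1.0 / rate)

def gibbs_hierarchical(Xs, Ys, n_iter=3000, burn=1000, prior_sd=1e3,
                       a=1e-3, b=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    m, p = len(Xs), Xs[0].shape[1]
    n_total = sum(len(Y) for Y in Ys)
    theta, sigma2, gamma2 = np.zeros(p), 1.0, 1.0
    betas = np.zeros((m, p))
    theta_draws, beta_draws = [], []
    for it in range(n_iter):
        # beta_j | rest: normal with precision X'X/sigma2 + I/gamma2
        for j, (X, Y) in enumerate(zip(Xs, Ys)):
            prec = X.T @ X / sigma2 + np.eye(p) / gamma2
            mean = np.linalg.solve(prec, X.T @ Y / sigma2 + theta / gamma2)
            L = np.linalg.cholesky(prec)  # prec = L L^T
            betas[j] = mean + np.linalg.solve(L.T, rng.standard_normal(p))
        # theta | rest: prior N(0, prior_sd^2 I), i.e. alpha = 0
        prec_t = 1.0 / prior_sd**2 + m / gamma2
        theta = (betas.sum(axis=0) / gamma2) / prec_t \
            + rng.standard_normal(p) / np.sqrt(prec_t)
        # sigma2 | rest: inverse-gamma update with the residual sum of squares
        ssr = sum(np.sum((Y - X @ bj) ** 2) for X, Y, bj in zip(Xs, Ys, betas))
        sigma2 = inv_gamma(rng, a + 0.5 * n_total, b + 0.5 * ssr)
        # gamma2 | rest: inverse-gamma update with the between-source spread
        ssb = np.sum((betas - theta) ** 2)
        gamma2 = inv_gamma(rng, a + 0.5 * m * p, b + 0.5 * ssb)
        if it >= burn:
            theta_draws.append(theta.copy())
            beta_draws.append(betas.copy())
    return np.array(theta_draws), np.array(beta_draws)

def f(c, x):
    """Analytic source model of Eq. (6)."""
    return c[0] + c[1] * x + c[2] * x * np.sin(c[3] * x)

c1 = [0.3022, 0.1052, -0.3117, 2.7926]
c2 = [0.3211, 0.1015, -0.2915, 3.0367]
x1, x2 = np.linspace(0, 1, 4), np.linspace(0, 1, 20)  # assumed locations
Xs = [np.vander(x, 4, increasing=True) for x in (x1, x2)]  # cubic basis, Eq. (7)
Ys = [f(c1, x1), f(c2, x2)]
theta_draws, beta_draws = gibbs_hierarchical(Xs, Ys)
theta_mean = theta_draws.mean(axis=0)  # coefficients of the meta regression
print(theta_mean)
```

The update for `theta` uses only the current `betas`, one vector per source, which is why the number of samples in each source does not weight the meta regression directly. Convergence diagnostics and the chain length are left out of this sketch and would need attention in practice.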
[Plot of y versus x on [0, 1] showing the y1 samples, y2 samples, cubic fit of combined data, ytrue, and meta regression.]
Figure 3. Comparison of true function ytrue, cubic fit of combined data, and meta regression using samples from two models at same level of fidelity
[Plot of y versus x on [0, 1] showing ytrue, y1, y2, and y3 together with the y1, y2, and y3 samples.]
Figure 4. Samples from three models at the same level of fidelity used for Bayesian hierarchical modeling
Figure 4 shows a third model $y_3$ with calibration vector $c_3 = [0.2741, 0.0889, -0.3231, 3.3352]$, which is added to the Bayesian hierarchical regression framework. Figure 5 shows a comparison of the cubic fit of combined data, meta regression, and the true function.

[Plot of y versus x on [0, 1] showing the y1, y2, and y3 samples, cubic fit of combined data, ytrue, and meta regression.]
Figure 5. Comparison of true function ytrue, cubic fit of combined data, and meta regression using samples from three models at same level of fidelity
B. Case 2: Models at Different Levels of Fidelity
In this example scenario, it is assumed that a high-fidelity and expensive computational model of the true function given by Eq. (5) is available. Since the computational model is expensive, only three samples are available, which are insufficient to accurately fit the regression. A quadratic fit using these three samples is shown in figure 6 and is compared with ytrue . As expected, the quadratic fit is a poor representation of the true function.
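A short numerical check shows why three samples are insufficient: a quadratic has three coefficients, so it passes through three points exactly and leaves no residual information about how well it tracks the function between them. The sketch below assumes sample locations of 0.1, 0.5, and 0.9, which are not given in the paper, and the name `y_true` is introduced here for Eq. (5).

```python
import numpy as np

def y_true(x):
    """True function of Eq. (5) with b = [0.3, 0.1, -0.3, 3]."""
    return 0.3 + 0.1 * x - 0.3 * x * np.sin(3.0 * x)

x_hf = np.array([0.1, 0.5, 0.9])   # assumed high-fidelity sample locations
y_hf = y_true(x_hf)

# Quadratic least-squares fit; with three points it interpolates exactly
coeffs = np.polyfit(x_hf, y_hf, deg=2)
fit_at_samples = np.polyval(coeffs, x_hf)

# Error of the fit away from the samples
grid = np.linspace(0.0, 1.0, 101)
max_err = np.max(np.abs(np.polyval(coeffs, grid) - y_true(grid)))
print(np.abs(fit_at_samples - y_hf).max(), max_err)
```

The residual at the samples is zero to rounding error, while the error elsewhere on $[0, 1]$ is not, so the fit gives no internal indication of its own inaccuracy.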
[Plot of y versus x on [0, 1] showing yhf, the yhf samples, and the quadratic fit with 3 samples.]
Figure 6. Samples from the high-fidelity model yhf and the quadratic fit of the samples
To assist the modeling of the high-fidelity function using three samples, two low-fidelity and cheaper models, $y_{c1}$ and $y_{c2}$, are assumed to be

$$y_{c1} = 0.28 + 0.002x - 0.15x \sin(3(x - 0.2))$$
$$y_{c2} = 0.4 - 0.3x \sin(2.8(x + 0.05))$$

Due to their lower fidelity, these models are inaccurate and cannot be considered at the same level as the true model, but they may share some of the characteristics of the high-fidelity model in different regions of the design space. An analogy can be drawn with computational models for the aerodynamic analysis of an aircraft based on a low-fidelity panel method and on the high-fidelity full Navier-Stokes equations. For low Mach numbers, panel method-based models and full Navier-Stokes equations-based models are comparable, but differences arise at higher Mach numbers. The lower-fidelity models are assumed to be computationally cheap; therefore, ten samples are selected from each one of them, as shown in figure 7.

[Plot of y versus x on [0, 1] showing yhf, yc1, and yc2 together with the yhf, yc1, and yc2 samples.]
Figure 7. The high fidelity (yhf ) and the low fidelity (yc1 and yc2 ) models and the samples used in Bayesian hierarchical modeling
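The three analytic models of this case are simple enough to evaluate directly. The sketch below samples the two low-fidelity models at ten points each and records which one lies closer to the high-fidelity function at each location. Evenly spaced sample locations are an assumption, since the paper does not list them, and the function names are introduced here for illustration.

```python
import math

def y_hf(x):
    """High-fidelity model, taken as the true function of Eq. (5)."""
    return 0.3 + 0.1 * x - 0.3 * x * math.sin(3.0 * x)

def y_c1(x):
    """First low-fidelity model."""
    return 0.28 + 0.002 * x - 0.15 * x * math.sin(3.0 * (x - 0.2))

def y_c2(x):
    """Second low-fidelity model."""
    return 0.4 - 0.3 * x * math.sin(2.8 * (x + 0.05))

n = 10
xs = [i / (n - 1) for i in range(n)]   # assumed evenly spaced on [0, 1]
samples_c1 = [(x, y_c1(x)) for x in xs]
samples_c2 = [(x, y_c2(x)) for x in xs]

# Which low-fidelity model is closer to the high-fidelity one at each x?
closer = ["c1" if abs(y_c1(x) - y_hf(x)) < abs(y_c2(x) - y_hf(x)) else "c2"
          for x in xs]
for x, tag in zip(xs, closer):
    print(f"x = {x:.2f}: closer low-fidelity model is y_{tag}")
```

Neither low-fidelity model is used as a correction target here. Each one simply becomes a separate source in the hierarchical model, with its own cubic coefficients.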
To build an accurate predictor of the true model, a Bayesian hierarchical model has been built as in Case 1 by selecting cubic polynomials to fit each of the computational models. Since the models are not at the same level of fidelity, the objective is to find an accurate posterior distribution on the regression coefficients $\beta_{true}$ of the high-fidelity model. In this case, the hyperparameters $\theta$ and $\gamma^2$ gain information from the trends of the lower-fidelity models and pass it on to the higher-fidelity model. Posterior means of the polynomial fits are shown in figure 8. It is observed that the Bayesian hierarchical model helps the higher-fidelity model to gain information about the trend from the lower-fidelity models, resulting in a better posterior fit. A comparison of surrogate models based on the posterior mean of the high-fidelity regression coefficients is shown in figure 9. The figure also shows the uncertainty associated with the posterior fit of the high-fidelity model.

[Plot of y versus x on [0, 1] showing the yc1 and yc2 samples and the posterior means of ŷhf, ŷc1, and ŷc2.]
Figure 8. Posterior mean of yˆhf , yˆc1 , and yˆc2 using Bayesian hierarchical modeling
[Plot of y versus x on [0, 1] showing the 95% CI of ŷhf, the posterior mean of ŷhf, yhf, and the yhf samples.]
Figure 9. Comparison of posterior mean of yˆhf with yhf and the 95% Confidence Interval (CI) of yˆhf
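A posterior mean curve and an interval band of the kind plotted in figure 9 can be computed from MCMC draws of the cubic coefficients by evaluating every draw on a grid and summarizing pointwise. The sketch below assumes an equal-tailed percentile interval, which may differ from how the paper's interval was obtained. The coefficient draws are synthetic stand-ins, not results from the paper, and `posterior_band` is a hypothetical helper name.

```python
import numpy as np

def posterior_band(beta_draws, x_grid, level=0.95):
    """Pointwise posterior mean and equal-tailed interval of a polynomial fit.

    beta_draws has shape (n_draws, p), ordered as [b0, b1, ..., b_{p-1}].
    """
    X = np.vander(x_grid, N=beta_draws.shape[1], increasing=True)
    curves = beta_draws @ X.T               # one curve per draw
    tail = 100.0 * (1.0 - level) / 2.0
    lower, upper = np.percentile(curves, [tail, 100.0 - tail], axis=0)
    return curves.mean(axis=0), lower, upper

rng = np.random.default_rng(1)
# Hypothetical stand-in for MCMC output: 2000 draws of four cubic coefficients
center = np.array([0.30, 0.05, -0.60, 0.45])
beta_draws = center + 0.02 * rng.standard_normal((2000, 4))

x_grid = np.linspace(0.0, 1.0, 51)
mean, lower, upper = posterior_band(beta_draws, x_grid)
print(mean[:3], lower[:3], upper[:3])
```

Because the band is computed pointwise from whole curves, it reflects the joint posterior correlation between coefficients rather than treating each coefficient's uncertainty separately.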
IV. Case Study
In this section, Bayesian hierarchical regression is demonstrated for a multi-fidelity airfoil drag analysis problem. Three different sources of drag coefficient data are used for the analysis. The first source of data is available through wind tunnel tests (WTT) [20], and the other two are generated using the moderate-fidelity computer programs MSES [21] and XFOIL [22]. MSES is a coupled viscous/inviscid Euler method for airfoil design and analysis. It employs a streamline-based Euler discretization and a two-equation boundary layer formulation coupled through the displacement thickness, and the coupled system is solved simultaneously with a full Newton method. Similar to MSES, XFOIL is a viscous/inviscid solver for subcritical airfoils. It uses an inviscid linear-vorticity panel method with a Karman-Tsien compressibility correction for direct and mixed-inverse modes. The viscous layer is represented by a two-equation lagged dissipation integral method. In the analysis, maximum camber ($c_{max}$), maximum camber location ($c_{pos}$), maximum thickness ($t_{max}$), and lift coefficient ($C_l$) are used as design variables. The airfoils are parameterized using NACA 4-digit airfoil shape functions [23]. The ranges of design variables used for this analysis are:

$$x_{min} = (0, 0, 12.0, -0.3) \tag{8}$$
$$x_{max} = (4.0, 40.0, 18.0, 1.5) \tag{9}$$

where $x$ is defined as $x = (c_{max}, c_{pos}, t_{max}, C_l)$.

[Scatter-plot matrix of the design variables Max. Cam. %, Max. Cam. Pos %, Max. Th. %, and Cl; legend: WTT, MSES, XFoil.]
Figure 10. Samples from two models at same level of fidelity used for Bayesian hierarchical modeling
For the analysis, 10 samples were randomly selected from wind tunnel drag polar data of the NACA 0012, NACA 2412, and NACA 2415 airfoils at a Reynolds number of 6 million. For the moderate-fidelity computer programs, 100 samples of design variables were generated for MSES and XFOIL independently, using Latin hypercube sampling. The samples used for the analysis are shown in figure 10.
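Latin hypercube sampling divides each variable's range into as many equal strata as there are samples and places exactly one sample in each stratum, with the strata paired randomly across variables. The following is a minimal standard-library sketch of that idea using the bounds of Eqs. (8) and (9). It is not the sampler used in the paper, and the function names `latin_hypercube_unit` and `scale` and the seed are assumptions.

```python
import random

def latin_hypercube_unit(n, d, rng):
    """Return n points in [0, 1)^d with one point per stratum in every dimension."""
    columns = []
    for _ in range(d):
        strata = list(range(n))
        rng.shuffle(strata)                      # random pairing across dimensions
        columns.append([(s + rng.random()) / n for s in strata])
    return [[columns[k][i] for k in range(d)] for i in range(n)]

def scale(points, lower, upper):
    """Map unit-cube points to the box [lower, upper]."""
    return [[lo + u * (hi - lo) for u, lo, hi in zip(pt, lower, upper)]
            for pt in points]

# Design variables x = (c_max, c_pos, t_max, C_l), Eqs. (8) and (9)
x_min = (0.0, 0.0, 12.0, -0.3)
x_max = (4.0, 40.0, 18.0, 1.5)

rng = random.Random(0)
unit = latin_hypercube_unit(100, 4, rng)
design = scale(unit, x_min, x_max)
print(design[0])
```

Calling the sampler twice with different seeds would give independent designs for MSES and XFOIL, in line with the statement that the two sets were generated independently.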
[Plot of Cd (0 to 0.03) versus Cl (−0.5 to 2.5); legend: WTT, MSES, XFoil.]
Figure 11. Samples of design variables for WTT, MSES, and XFOIL data
Simulations were carried out using MSES and XFOIL in viscous mode for the selected airfoils at a Reynolds number of 6 million and at a Mach number of 0.3 to estimate the drag coefficients. Marginalized data of drag coefficient ($C_d$) as a function of $C_l$ is shown in figure 11. As evident from the figure, the computer programs were not accurate but were able to predict the trend of $C_d$ as a function of $C_l$. To select the polynomial model for fitting the data, simple Bayesian regression for each source of data was carried out using a quadratic polynomial model with an additional cubic term for lift coefficient. Based on the posterior distribution of regression coefficients, only 9 terms were found to be important and drag coefficient was modeled as

$$C_d^{i} = \beta_{i,0} + \beta_{i,1} c_{pos} + \beta_{i,2} t_{max} + \beta_{i,3} C_l + \beta_{i,4} c_{max} C_l + \beta_{i,5} c_{pos} C_l + \beta_{i,6} c_{max}^2 + \beta_{i,7} C_l^2 + \beta_{i,8} C_l^3 \tag{10}$$

where index $i$ can be 1, 2, or 3 and represents WTT, MSES, and XFOIL, respectively. In the Bayesian hierarchical model setting, drag coefficient for each source was modeled as a Gaussian distribution:

$$C_d^{i} \sim N(\mu_i, 1/\tau_i) \tag{11}$$

where $\mu_i$ is the mean and was modeled using Eq. (10) and $\tau_i$ is the precision. The coefficient of each term in $\mu_i$ from all the sources was modeled using a Gaussian distribution. A particular coefficient $\beta_{i,j}$ of a term in $\mu_i$ from all the sources was linked by common hyperparameters $\alpha_{\mu_j}$ and $\alpha_{\tau_j}$ as

$$\beta_{i,j} \sim N(\alpha_{\mu_j}, \alpha_{\tau_j}) \tag{12}$$

The prior distribution of the parameter $\tau_i$ was modeled using a non-informative gamma ($\Gamma$) distribution as

$$\tau_i \sim \Gamma(a, b) \tag{13}$$

where $a$ and $b$ are shape and rate parameters for gamma distribution. In the current study $a = 0.001$ and $b = 0.001$ has been used. The hyper-prior distribution on hyperparameter $\alpha_{\mu_j}$ was assumed to be a Gaussian distribution and was modeled as

$$\alpha_{\mu_j} \sim N(0.0, 1/10^{-6}) \tag{14}$$

The hyper-prior distribution on hyperparameter $\alpha_{\tau_j}$ was assumed to be a non-informative gamma distribution and was modeled as

$$\alpha_{\tau_j} \sim \Gamma(0.001, 0.001) \tag{15}$$
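The mean $\mu_i$ of Eq. (11) is linear in the coefficients, so evaluating it amounts to building the nine-term regressor vector of Eq. (10) and taking a dot product. The sketch below shows this for one design point. The function names are introduced here, and the coefficient values are hypothetical placeholders, not fitted values from the paper.

```python
def drag_regressors(c_max, c_pos, t_max, cl):
    """Nine regressors of Eq. (10), in the order of beta_{i,0} ... beta_{i,8}."""
    return [1.0, c_pos, t_max, cl, c_max * cl, c_pos * cl,
            c_max ** 2, cl ** 2, cl ** 3]

def mean_drag(beta, c_max, c_pos, t_max, cl):
    """Mean drag coefficient mu_i for one source with coefficient vector beta."""
    return sum(b * r for b, r in zip(beta, drag_regressors(c_max, c_pos, t_max, cl)))

# NACA 2412: 2% maximum camber at 40% chord, 12% maximum thickness
row = drag_regressors(c_max=2.0, c_pos=40.0, t_max=12.0, cl=0.5)
print(row)

# Hypothetical coefficients, for illustration only
beta_example = [5e-3, 1e-5, 1e-4, -2e-3, 1e-4, 1e-5, 1e-4, 4e-3, 1e-3]
print(mean_drag(beta_example, 2.0, 40.0, 12.0, 0.5))
```

Stacking such rows for every sample of a source gives the matrix $X_j$ of Eq. (2). All three sources share the same nine columns, which is what allows each coefficient to be linked across sources through a common pair of hyperparameters as in Eq. (12).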
An MCMC simulation was carried out with the OpenBUGS software using the data available from multiple sources to estimate the posterior distribution on the parameters. The posterior fit of the WTT model as a function of Cl is shown in figure 12 for the NACA 2418 airfoil, which was not used for posterior estimates of parameters. The plot also shows the estimate of simple Bayesian regression, which was built using only 10 samples from WTT data.
[Plot of Cd versus Cl showing the WTT data, posterior Cd^WTT (BHM), 50% CI, and regression.]
Figure 12. Comparison of posterior fit of WTT regression model using Bayesian hierarchical regression and simple Bayesian regression for NACA 2418 airfoil
Variation of drag coefficient as a function of maximum camber, maximum camber position, and maximum thickness in the neighborhood of the NACA 2418 airfoil is shown in the profiler of figure 13 using the posterior fit of $C_d^{WTT}$.

[Profiler panels of Cd versus Max Cam %, Max Cam Pos %, Max Th %, and Cl; legend: posterior Cd^WTT (BHM), WTT.]
Figure 13. Profiler showing variation of drag coefficient as a function of maximum camber, maximum camber position, maximum thickness about NACA 2418
V. Conclusion
In the current work, a multi-source surrogate modeling method using Bayesian hierarchical regression has been presented to build an accurate regression model by combining sparse information from multiple sources. The method was illustrated using simple analytical functions for two different scenarios: 1) when all the data sources are at the same level of fidelity and 2) when data sources are at different levels of fidelity. The method was demonstrated on a multi-fidelity airfoil drag analysis problem where an improved linear regression model was built using sparse data available from wind tunnel tests and relatively abundant computer model data. This method helps in building a meta-regression by combining information from sources of data with the same fidelity without bias towards the density of data available from each source. Also, the method has been found to improve the accuracy of a regression model built on high-fidelity but sparse data using the information from a relatively large amount of low-fidelity data. In the current work, only the case study of a multi-fidelity scenario has been demonstrated. In the future, work will be carried out on an equivalent case study of a multi-source problem with data at the same fidelity.
References
[1] Keane, A. and Nair, P., Computational Approaches for Aerospace Design: The Pursuit of Excellence, John Wiley & Sons, 2005.
[2] Kennedy, M., “Predicting the output from a complex computer code when fast approximations are available,” Biometrika, Vol. 87, No. 1, 2000, pp. 1–13.
[3] Vitali, R., Haftka, R., and Sankar, B., “Correction response surface approximations for stress intensity factors of a composite stiffened plate,” 39th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference and Exhibit, Structures, Structural Dynamics, and Materials and Co-located Conferences, American Institute of Aeronautics and Astronautics, April 1998.
[4] Alexandrov, N. M. and Lewis, R. M., “An Overview of First-Order Model Management for Engineering Optimization,” Optimization and Engineering, Vol. 2, No. 4, Dec. 2001, pp. 413–430.
[5] Eldred, M., Giunta, A., and Collis, S., “Second-Order Corrections for Surrogate-Based Optimization with Model Hierarchies,” 10th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Multidisciplinary Analysis Optimization Conferences, American Institute of Aeronautics and Astronautics, Aug. 2004.
[6] Haftka, R. T., “Combining global and local approximations,” AIAA Journal, Vol. 29, No. 9, Sept. 1991, pp. 1523–1525.
[7] Umakant, J., Sudhakar, K., Mujumdar, P., and Rao, C., “Customized Regression Model for Improving Low Fidelity Analysis Tool,” 11th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Multidisciplinary Analysis Optimization Conferences, American Institute of Aeronautics and Astronautics, Sept. 2006.
[8] Berci, M., Toropov, V., Hewson, R., and Gaskell, P., “Metamodelling Based on High and Low Fidelity Model Interaction for UAV Gust Performance Optimization,” 50th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference, Structures, Structural Dynamics, and Materials and Co-located Conferences, American Institute of Aeronautics and Astronautics, May 2009.
[9] Qian, P. Z. G. and Wu, C. F. J., “Bayesian Hierarchical Modeling for Integrating Low-Accuracy and High-Accuracy Experiments,” Technometrics, Vol. 50, No. 2, May 2008, pp. 192–204.
[10] Forrester, A. I. J., Sóbester, A., and Keane, A. J., “Multi-fidelity optimization via surrogate modelling,” Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, Vol. 463, No. 2088, 2007, pp. 3251–3269.
[11] Toropov, V. V., “Modelling and Approximation Strategies in Optimization: Global and Mid-Range Approximations, Response Surface Methods, Genetic Programming, Low/High Fidelity Models,” Emerging Methods for Multidisciplinary Optimization, edited by J. Blachut and H. Eschenauer, chap. 5, Vol. 425, CISM Courses and Lectures, Springer-Verlag, New York, 2001, pp. 205–256.
[12] Viana, F. A. C., Simpson, T. W., Balabanov, V., and Toropov, V., “Metamodeling in Multidisciplinary Design Optimization: How Far Have We Really Come?” AIAA Journal, Vol. 52, No. 4, 2014, pp. 670–690.
[13] Hoff, P. D., A First Course in Bayesian Statistical Methods, Springer, New York, 2009.
[14] Gelman, A., Carlin, J. B., Stern, H. S., and Rubin, D. B., Bayesian Data Analysis, CRC Press, 2003.
[15] Gelman, A. and Hill, J., Data Analysis Using Regression and Multilevel/Hierarchical Models, Cambridge University Press, 2006.
[16] Spiegelhalter, D. J., Abrams, K. R., and Myles, J. P., Bayesian Approaches to Clinical Trials and Health-Care Evaluation, Wiley, 1st ed., 2004.
[17] “Meta-analysis,” National Library of Medicine (http://www.nlm.nih.gov/), Retrieved on Nov, 2014.
[18] Bishop, C. M., Pattern Recognition and Machine Learning, Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2006.
[19] Spiegelhalter, D., Thomas, A., Best, N., and Lunn, D., “OpenBUGS user manual,” MRC Biostatistics Unit, Cambridge, 2007.
[20] Abbott, I. H. and von Doenhoff, A. E., Theory of Wing Sections: Including a Summary of Airfoil Data, Dover Publications, 1st ed., 1959.
[21] Drela, M., A User’s Guide to MSES 3.05, MIT Department of Aeronautics and Astronautics, 2007.
[22] Drela, M., XFOIL 6.9 User Primer, MIT Department of Aeronautics and Astronautics, 2001.
[23] Moran, J., An Introduction to Theoretical and Computational Aerodynamics, Dover Publications, 2010.