A Comparison of Two Sampling Methods for Global Sensitivity Analysis

Stefano Tarantola, William Becker, and Dirk Zeitz

Joint Research Centre of the European Commission, Institute for the Protection and Security of the Citizen, Via E. Fermi 2749, 21027 Ispra, Italy
[email protected]
Abstract. We compare the convergence properties of two quasi-random sampling designs, Sobol' quasi-Monte Carlo and Latin supercube sampling, in variance-based global sensitivity analysis. We use the non-monotonic V-function of Sobol' as the base case study and compare the performance of both sampling strategies against analytical values at increasing sample size and dimensionality. The results indicate that the Sobol' design generally performs better; however, the Latin supercube design appears to offer advantages in specific cases, such as smaller sample sizes and medium dimensionality.

Keywords: Latin supercube, Sobol' sequence, Monte Carlo, quasi-random sequence, effective dimension
1 Introduction
When implementing a global sensitivity test, practitioners are required to sample the model a large number of times to feed Monte Carlo estimators of sensitivity indices [1]. This requires a choice of sampling strategy, which could be either simple random sampling or quasi-random (low-discrepancy) sampling, including stratification approaches. The former can be implemented with various available random number generators, while the latter can be realized with Sobol', Halton, Faure or Niederreiter sequences, to name but a few; see [2] for a good summary.

In this paper we test two alternative sampling designs and investigate their performance in terms of convergence rate for global sensitivity analysis. The two designs under consideration are the Sobol' quasi-Monte Carlo (QMC) sampling routine [3, 4] and Latin supercube sampling (LSS) [5]. LSS was developed because the efficacy of QMC has been observed to deteriorate for high-dimensional problems [6]. The LSS design has proven to be more efficient than standard Monte Carlo [5], and the Sobol' QMC design has proven to be more efficient than Latin hypercube sampling [7]. However, to date, no comparison between the LSS and Sobol' QMC designs has been made in the context of sensitivity analysis.

The test model used in this performance assessment is the V-function of Sobol' (also sometimes known as the G-function), a base case study for which analytical values of the first-order and total-order sensitivity indices are available.
Different settings of the V-function are tested, varying the number of model inputs and their relative importance.
2 Sensitivity Analysis and Sampling
Global sensitivity analysis is the study of how the uncertainty in the output of a model can be decomposed according to the input sources of uncertainty. We shall here concentrate on the class of variance-based methods, which involve decomposing the output variance into portions attributable to sets of inputs. We consider the model to be a square-integrable function $Y = f(X_1, X_2, \ldots, X_k)$ defined over $\Omega$, the $k$-dimensional unit hypercube:

$$\Omega = \{ X \mid 0 \le x_i \le 1; \; i = 1, \ldots, k \}.$$

The two most widely used sensitivity indices are the $k$ first-order indices $S_i$ [8] and the $k$ total-effect indices $S_{Ti}$ [9], which are defined as

$$S_i = \frac{V_{X_i}\left(E_{X_{\sim i}}(Y \mid X_i)\right)}{V(Y)} \tag{1}$$

$$S_{Ti} = \frac{E_{X_{\sim i}}\left(V_{X_i}(Y \mid X_{\sim i})\right)}{V(Y)} = 1 - \frac{V_{X_{\sim i}}\left(E_{X_i}(Y \mid X_{\sim i})\right)}{V(Y)} \tag{2}$$

Both the first- and total-order effects can be estimated using the Monte Carlo method. This involves running the model at $N$ different points $\{X_j\}_{j=1}^{N}$ in the input space and generating a corresponding set of function evaluations $\{f(X_j)\}_{j=1}^{N}$. These data are then fed into purpose-built estimators for the first-order effects [1] and the total-order effects [10, 11]. The details of these estimators are omitted here for the sake of brevity (a minimal sketch is given below); suffice it to say that both are built on an $N \times 2k$ base sample and require $N(k+2)$ model runs in total. It is therefore of great interest to keep $N$ as low as possible to limit computational expense. It is well known that the error of a purely random Monte Carlo estimate is $O(N^{-1/2})$. Quasi-Monte Carlo (QMC) approaches, which involve the use of quasi-random sequences, can instead approach the much more favourable $O(N^{-1})$. For this reason, many quasi-random sampling procedures have been proposed; here we provide a comparison of two designs.
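As a concrete illustration, the following is a minimal sketch of such estimators in Python, using the first-order estimator of [1] and the total-order estimator of Jansen [11]; the function and variable names are ours, and the paper's own implementation may differ in detail.

```python
import numpy as np

def sobol_indices(f, A, B):
    """Estimate the first-order (S_i) and total-order (ST_i) indices.

    f    : vectorised model mapping an (N, k) array to N outputs.
    A, B : two independent N x k sample matrices (together, the
           N x 2k base sample).
    """
    N, k = A.shape
    fA, fB = f(A), f(B)
    V = np.var(np.concatenate([fA, fB]), ddof=1)      # total variance V(Y)
    S, ST = np.empty(k), np.empty(k)
    for i in range(k):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                           # A with column i from B
        fABi = f(ABi)
        S[i] = np.mean(fB * (fABi - fA)) / V          # estimator of eq. (1)
        ST[i] = 0.5 * np.mean((fA - fABi) ** 2) / V   # estimator of eq. (2)
    return S, ST
```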
2.1 Sobol' Sequence
One sampling procedure that is very popular for variance-based sensitivity analysis is the Sobol' LP-τ sequence [3, 4], which is a sequential space-filling design with good uniformity properties (a low-discrepancy sequence). The details of this sequence are not given here, but an implementation of the algorithm and practical details can be found in [12]. This design can be considered a "benchmark", since it is the standard choice for many practitioners of sensitivity analysis, given its low discrepancy and the fact that additional points can be added without redesigning the whole set.
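For illustration, such a base sample can be generated with SciPy's quasi-Monte Carlo module (our choice of tooling; the authors' own implementation presumably follows [12]):

```python
from scipy.stats import qmc

# Draw an N x 2k Sobol' base sample (with N a power of two, as recommended
# for Sobol' sequences) and split it into the A and B matrices used by the
# estimators sketched above.
k, m = 10, 13                            # input dimension and log2(N)
sampler = qmc.Sobol(d=2 * k, scramble=False)
base = sampler.random_base2(m)           # (2**m, 2k) points in [0, 1)^(2k)
A, B = base[:, :k], base[:, k:]
```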
2.2 Latin Supercube Sampling
The concept of the "effective dimension" of a function [13] is, roughly speaking, a measure of either the number of variables (truncation sense) or the orders of interaction (superposition sense) responsible for the majority of the output variance. Quite often a small subset of variables accounts for most of the output variance, so the effective dimension of a function is often considerably smaller than $k$. LSS was introduced by Owen as a means of exploiting this kind of structure in high-dimensional integrals [5], by grouping influential variables into subsets and creating QMC designs within each subset.

LSS can be viewed as a combination of Latin hypercube sampling (LHS) and QMC. The $k$ input dimensions are divided into subsets $\{s_1, s_2, \ldots, s_d\}$ such that $\sum_{r=1}^{d} |s_r| = k$, where $|s_r|$ denotes the cardinality of $s_r$. For each subset $s_r$ a QMC design of $N$ points is constructed, giving $d$ matrices of size $N \times |s_r|$. Much like the LHS design, the full LSS design is constructed by randomly permuting the rows (samples) within each $s_r$ and concatenating the subsets to make the full $N \times k$ matrix. In short, while LHS randomises the order of samples for every dimension, LSS randomises over subsets of dimensions. A sketch of this construction is given below.

Owen explains that in order to exploit the full potential of LSS, variables that interact strongly should be grouped into the same subset [5]. In practice this can be difficult, since variable interactions are generally unknown until a sensitivity analysis has been performed. In our investigation, however, we examine the case where the ideal division of the input variables is known. We therefore consider a simple case of LSS with only two subsets such that $|s_1| + |s_2| = k$, using the Sobol' sequence to generate the samples within each subset.
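The following is a minimal sketch of this two-subset construction, under our own assumptions: we use scrambled Sobol' points within each subset so that the subsets (and replicates) are randomised independently, and the helper name latin_supercube is ours.

```python
import numpy as np
from scipy.stats import qmc

def latin_supercube(subset_dims, n, seed=None):
    """Build an n x k LSS design: one QMC block per subset, rows shuffled.

    subset_dims : the subset sizes |s_1|, ..., |s_d|, summing to k.
    """
    rng = np.random.default_rng(seed)
    blocks = []
    for d in subset_dims:
        block = qmc.Sobol(d=d, scramble=True, seed=rng).random(n)
        rng.shuffle(block)        # randomly permute the rows (samples)
        blocks.append(block)
    return np.hstack(blocks)      # concatenate subsets into the n x k matrix

X = latin_supercube([5, 25], n=1024)   # e.g. 5 influential inputs, k = 30
```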
3 Setting the Experiment
The test function employed in this work has been widely used as a benchmark for sensitivity analysis (see [8]) and is defined as

$$Y = V(X_1, X_2, \ldots, X_k; a_1, a_2, \ldots, a_k) = \prod_{i=1}^{k} \frac{|4X_i - 2| + a_i}{1 + a_i} \tag{3}$$
The nature of the function $V$ is driven by the dimensionality $k$ as well as by the values of the coefficients $a_i$. The latter determine the relative importance of the model inputs (the lower $a_i$, the stronger the dependence of $Y$ on $X_i$) and can therefore be used to generate a wide spectrum of test cases of different levels of complexity. A sketch of the function and its analytic indices is given below.
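As a reference point, the following sketch implements the V-function together with its standard analytic variance decomposition, in which each input contributes a partial variance $V_i = \frac{1}{3(1+a_i)^2}$ and the total variance is $V(Y) = \prod_i (1 + V_i) - 1$. The coefficient values in the usage line are purely illustrative, not the paper's settings.

```python
import numpy as np

def v_function(X, a):
    """The V-function of eq. (3); X is (N, k), a is a length-k vector."""
    return np.prod((np.abs(4.0 * X - 2.0) + a) / (1.0 + a), axis=1)

def analytic_indices(a):
    """Analytic first-order (S_i) and total-order (ST_i) indices."""
    a = np.asarray(a, dtype=float)
    Vi = 1.0 / (3.0 * (1.0 + a) ** 2)      # partial variance of each input
    V = np.prod(1.0 + Vi) - 1.0            # total output variance
    Si = Vi / V
    STi = Vi * (np.prod(1.0 + Vi) / (1.0 + Vi)) / V
    return Si, STi

# Illustrative only: two dominant inputs among k = 30.
Si, STi = analytic_indices(np.array([0.0, 0.0] + [49.0] * 28))
```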
3.1 Function Types
The test cases used in this work can be grouped according to the classification proposed in [7], which defines three types of functions (here labelled A, B and C) with different characteristics and different effective dimensionality:
Type A includes functions with a few dominant model inputs, which may also be involved in interactions (low effective dimension in both the superposition and truncation senses). These are the most common functions in practice.

Type B includes functions in which all model inputs have a homogeneous effect and are involved in some interactions (the effective dimension is roughly equal to $k$ in the truncation sense).

Type C is like type B, but the model inputs are involved in strong interactions (the effective dimension is close to $k$ in both senses).

We alter the V-function to obtain functions corresponding to each of these definitions, varying $k$ to examine the effect of dimensionality. For each model we run the experiments at increasing $N$ (up to 8192) to check the convergence of the sensitivity estimates to the true values; an outline of this experimental loop is sketched below. Finally, each experiment at a given $N$ and $k$ is replicated $R = 50$ times in order to quantify the statistical error of the sensitivity estimates, in terms of the deviation between the estimates and the true values.
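A compact sketch of such an experimental loop, reusing the helper functions above, might look as follows. The scrambling of the Sobol' points is our assumption for obtaining $R$ distinct replicates, and the coefficients are again illustrative.

```python
import numpy as np
from scipy.stats import qmc

k, R = 10, 50
a = np.array([0.0, 0.0] + [9.0] * (k - 2))   # illustrative coefficients
model = lambda X: v_function(X, a)

estimates = {}                                # N -> (R, k) first-order estimates
for m in range(4, 14):                        # N = 16, 32, ..., 8192
    N = 2 ** m
    S_hat = np.empty((R, k))
    for r in range(R):
        base = qmc.Sobol(d=2 * k, scramble=True).random_base2(m)
        A, B = base[:, :k], base[:, k:]
        S_hat[r], _ = sobol_indices(model, A, B)
    estimates[N] = S_hat                      # fed to the indicators of Sec. 3.2
```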
3.2 Measuring Convergence
We use four indicators to quantify the statistical error of the sensitivity estimates at a given $N$ and $k$: AES, MAES, AEST and MAEST. Let us denote by $\bar{S}_i$ the analytic value of the first-order index for the $i$-th input and by $S_i^{(r)}$ the corresponding estimated value in the $r$-th repetition. We now define AES as the mean absolute error of each $S_i$:

$$\mathrm{AES}_i = \frac{1}{R} \sum_{r=1}^{R} \left| S_i^{(r)} - \bar{S}_i \right| \tag{4}$$
The per-input AES values help to identify which model inputs cause most of the deviation from the analytical values. MAES is the mean of the AES values over all $k$ inputs. The definitions of AEST and MAEST are exactly analogous, but measure the error of the estimates of $S_{Ti}$. For any test function and any $k$ we regress both MAES and MAEST against $N$ with a trend line $cN^{-a}$ (power regression) in a log-log plot; a sketch of this fit is given below. Values of $a \in [0, 1]$ are given in brackets in the legends of the figures and show the rate of convergence of the sensitivity estimates for each sampling design at increasing sample size (larger $a$ implies better convergence properties). For reference, standard Monte Carlo has $a = 0.5$. The coefficient $c$ is related to the approximation error at small $N$; small values of $c$ therefore mean that the design performs well with a small sample size.
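A minimal sketch of these indicators and the power-law fit, assuming an ordinary least-squares fit in log-log space:

```python
import numpy as np

def aes(S_hat, S_true):
    """AES_i of eq. (4); S_hat is (R, k) estimates, S_true the analytic values."""
    return np.mean(np.abs(S_hat - S_true), axis=0)

def maes(S_hat, S_true):
    """MAES: the mean of the AES values over all k inputs."""
    return aes(S_hat, S_true).mean()

def power_fit(N_values, maes_values):
    """Fit MAES ~ c * N**(-a) by linear regression in log-log space."""
    slope, intercept = np.polyfit(np.log(N_values), np.log(maes_values), 1)
    return -slope, np.exp(intercept)   # (convergence rate a, coefficient c)
```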
4 Results and Analysis

4.1 Type A Functions
We here construct a type A function in which $x_1$ and $x_2$ account for the majority of the output variance and $\sum_{i=1}^{k} S_i = 0.8268$ (we will call this A1). Figure 1 shows
the convergence of the estimates, as well as the AES values per input, for the case $k = 30$. It is clear that in this case the Sobol' design outperforms the LSS design, both in convergence rate and in absolute error at the sample sizes shown. This is unsurprising, since the strength of LSS lies in exploiting strongly interacting subsets of variables, whereas in this configuration the interactions are very weak.
Fig. 1. Convergence (top) and i-wise errors at N = 8192 (bottom) for type A1 function, k = 30
A second type A configuration was tested (A2), in which the set $\{x_1, x_2, \ldots, x_5\}$ contains the influential input variables and $\sum_{i=1}^{k} S_i = 0.5079$. In this case, therefore, the interactions between variables are much more significant. Figure 2 shows the results of the analysis. Here the LSS design has greater scope to perform to its full potential, which is reflected in the fact that it generally provides more accurate estimates for a given sample size, as well as roughly half the error for the influential inputs.

To examine the effect of dimensionality in this case, the A2 function was run at increasing $k$ and varying sample size. Figure 3 shows the difference between the values of log(MAES) from the Sobol' and LSS designs for increasing values of $k$ and $N$ (essentially the pointwise difference between the convergence curves). The indication is that, on average, LSS performs better at small sample sizes but loses its advantage as $N$ increases. In low-dimensional models ($k = 10$) Sobol' has a significant advantage for large samples, whereas for the higher-dimensional cases the difference is less marked. In particular, the LSS design appears to perform well for $k = 30$. Note that for the higher values of $N$ there are no results for large $k$, owing to the difficulty of constructing the samples.
Fig. 2. Convergence (top) and i-wise errors at N = 8192 (bottom) for type A2 function, k = 30
4.2 Type B and C Functions
The results for a type B function with $k = 10$ are shown in Figure 4. In this case the inputs are equally important and there are only weak interactions. As such, the LSS design cannot exploit any structure in the integrals and does not perform as well as the Sobol' design. The convergence rate $a$ for the Sobol' design is very good ($a \approx 1$), whereas the rate for the LSS design is around 0.5, typical of a standard Monte Carlo sampling strategy. The superiority of the Sobol' design is also seen when $k$ is increased (results omitted here due to space limitations).

Type C functions are essentially indistinguishable from noise (none of the inputs has much effect on the output). They are therefore not suitable for sensitivity analysis, since the analysis would not reveal any meaningful information (other than the fact that the function is of type C). The results for this function are not shown here, but it was found that the Sobol' design tended to perform better, although it notably produced some "spikes" of error for certain input dimensions (perhaps due to the deterioration of the Sobol' design at higher dimensions).
5 Conclusion
As expected, LSS sampling appears to perform best when a subset of strongly interacting variables can be identified. However, at least for the function tested here, this advantage appears to diminish as the sample size increases, and our results do not show LSS offering advantages for the other function types. It is therefore difficult to recommend LSS unless it is known a priori that a strongly interacting subset exists.
[Figure 3 plots log(MAES, Sobol') − log(MAES, LSS) against the number of model runs ($N$ = 16 to 8192) for $k$ = 10, 20, 30, 50, 75 and 100; positive values indicate an LSS advantage, negative values a Sobol' advantage.]
Fig. 3. Comparison of absolute MAES values for A2 function at increasing N and k
Fig. 4. Convergence (top) and i-wise errors at N = 8192 (bottom) for type B function, k = 10
References
[1] A. Saltelli, P. Annoni, I. Azzini, F. Campolongo, M. Ratto, and S. Tarantola. Variance based sensitivity analysis of model output. Design and estimator for the total sensitivity index. Computer Physics Communications, 181(2):259–270, 2010.

[2] H. Niederreiter. Random Number Generation and Quasi-Monte Carlo Methods. Society for Industrial and Applied Mathematics, 1992.

[3] I. M. Sobol'. On the distribution of points in a cube and the approximate evaluation of integrals. USSR Computational Mathematics and Mathematical Physics, 7(4):86–112, 1967.

[4] I. M. Sobol'. Uniformly distributed sequences with an additional uniform property. USSR Computational Mathematics and Mathematical Physics, 16(5):236–242, 1976.

[5] A. B. Owen. Latin supercube sampling for very high-dimensional simulations. ACM Transactions on Modeling and Computer Simulation, 8(1):71–102, 1998.

[6] W. J. Morokoff and R. E. Caflisch. Quasi-random sequences and their discrepancies. SIAM Journal on Scientific Computing, 15(6):1251–1279, 1994.

[7] S. Kucherenko, B. Feil, N. Shah, and W. Mauntz. The identification of model effective dimensions using global sensitivity analysis. Reliability Engineering and System Safety, 96(4):440–449, 2011.

[8] I. M. Sobol'. Sensitivity estimates for nonlinear mathematical models. Mathematical Modeling and Computational Experiment, 1(4):407–414, 1993.

[9] T. Homma and A. Saltelli. Importance measures in global sensitivity analysis of nonlinear models. Reliability Engineering and System Safety, 52(1):1–17, 1996.

[10] M. J. W. Jansen, W. A. H. Rossing, and R. A. Daamen. Monte Carlo estimation of uncertainty contributions from several independent multivariate sources. In J. Grasman and G. van Straaten, editors, Predictability and Nonlinear Modelling in Natural Sciences and Economics, pages 335–343. Kluwer Academic Publishers, Dordrecht, 1994.

[11] M. J. W. Jansen. Analysis of variance designs for model output. Computer Physics Communications, 117(1–2):35–43, 1999.

[12] P. Bratley and B. L. Fox. Algorithm 659: Implementing Sobol's quasirandom sequence generator. ACM Transactions on Mathematical Software, 14(1):88–100, 1988.

[13] R. E. Caflisch, W. Morokoff, and A. B. Owen. Valuation of mortgage-backed securities using Brownian bridges to reduce effective dimension. Journal of Computational Finance, 1(1):27–46, 1997.