Statistics and Computing manuscript No. (will be inserted by the editor)

Variance decompositions of nonlinear time series using stochastic simulation and sensitivity analysis

T.J. Harris · W. Yu

the date of receipt and acceptance should be inserted later

Abstract In this paper, a variance decomposition approach to quantify the effects of endogenous and exogenous variables for nonlinear time series models is developed. The decomposition is taken temporally with respect to the source of variation. The methodology uses Monte-Carlo methods to effect the variance decomposition using the ANOVA-like procedures proposed in [1, 43]. The results of this paper can be used in investment problems, biomathematics and control theory, where nonlinear time series with multiple inputs are encountered.

Keywords ANOVA · nonlinear time series · sensitivity analysis · stochastic simulation · Monte-Carlo methods · variance decomposition

Mathematics Subject Classification (2000) 37N99 · 65C05 · 68U20 · 93E03

1 Introduction

The task of calculating an analysis of variance (ANOVA) in nonlinear static systems is often accomplished using Monte-Carlo methods. Much of the recent literature can be traced to the contributions of Sobol' [43] and Cukier and co-authors [7, 8]. The literature on this topic is extensive, and the reader is referred to the monographs [23, 36] and the expository and review articles [1, 17, 19, 37, 40, 43]. Many applications in the physical sciences and engineering are reported in these references. In this article, we are interested in using global sensitivity techniques to analyze nonlinear time series models that can be written in the form:

$$z_t = f(z_{t-1}^{*}, \xi_{1,t}^{*}, \cdots, \xi_{n_\xi,t}^{*}, a_{1,t}^{*}, \cdots, a_{n_a,t}^{*}) \tag{1}$$

T.J. Harris
Department of Chemical Engineering, Queen's University, Kingston, Ont., Canada, K7L 3N6
E-mail: [email protected]

W. Yu
Industrial Information & Control Centre, The University of Auckland, Auckland, New Zealand
E-mail: [email protected]


The * notation denotes that the variable includes its current value and a number of previous values, i.e. z*_{t-1} = (z_{t-1}, z_{t-2}, · · · , z_{t-n_z}). {ξ_{j,t}}, j = 1..n_ξ, are serially and contemporaneously independent random variables. {a_{j,t}}, j = 1..n_a, are serially independent random variables; they may be contemporaneously correlated, but they are independent of {ξ_{j,t}}, j = 1..n_ξ. The {a_{j,t}}, j = 1..n_a, often appear in nonlinear time series to modify coefficients, i.e. random-coefficient effects. Depending on the application, one might think of the {a_{j,t}}, j = 1..n_a, as endogenous variables and {ξ_{j,t}}, j = 1..n_ξ, as exogenous variables affecting the system. The description given in Eqn (1) is quite general, and is flexible enough to encompass most parametric time series models, including neural networks, uniformly sampled stochastic differential equations, and models with exogenous inputs [12, 50]. A specific example of a system that is amenable to the approach considered in this paper is the time-invariant discrete state equation with multiplicative disturbance, described by [13, 22]:

$$X_{t+1} = \Big(A + \sum_{i=1}^{m_1} A_{s,i}\,\xi_t^i\Big)X_t + \Big(B + \sum_{i=1}^{m_2} B_{s,i}\,\varepsilon_t^i\Big)U_t + F\,\omega_t$$
$$Y_t = C\,X_t + a_t \tag{2}$$

where X_t ∈ R^{n_x} is the state vector, Y_t ∈ R^{n_y} and U_t ∈ R^{n_u} are the output and input vectors of interest, ω_t ∈ R^{n_ω} and a_t ∈ R^{n_a} are the process and measurement noise vectors, ξ_t ∈ R^{n_ξ} and ε_t ∈ R^{n_ε} are multiplicative noises, and A, A_{s,i}, B, B_{s,i}, C and F are constant matrices of suitable dimensions. The independent random variables ω_t, a_t, ξ_t and ε_t have zero-mean Gaussian distributions with variance-covariance descriptions Ω_ω, Ω_a and Ω_ξ = Ω_ε = I respectively. When there is no control action, U_t = 0, Eqn (2) describes a random coefficient autoregressive model of order k, RCA(k) [50].

Our initial motivation for undertaking an ANOVA on nonlinear time series models arises from applications in automation/control [18, 30] and finance/economics [14, 50]. In these applications, the stochastic components in Eqn (1) are often attributed to disturbances affecting a variable of interest. It is often desired to keep a process variable, such as temperature, pressure or composition, at a desired value to ensure compliance with quality, environmental, safety and economic constraints. The decomposition of the variance into sources of variation has proven valuable, as it makes it possible to quantify the importance of exogenous variables in dynamic systems, and to determine which of these should be controlled or eliminated so as to reduce overall process variability. For systems that can be adequately described by linear time series models, there is a rich literature on variance decomposition [15, 27]. Applications in the chemical process industries are described in [10, 31, 41, 42, 47]. In the application of ANOVA methods to univariate and multivariate linear systems, the variance decomposition can be realized through a study of the impulse response function (IRF), as there is a direct relationship between the impulse response and the variance [15, 27]. The IRF for nonlinear systems is not uniquely defined, and it cannot in general be used in the variance decomposition for nonlinear systems. In this paper, the ANOVA-like variance decomposition commonly used for static systems will be modified to enable its application to nonlinear dynamic systems.
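To make this model class concrete, the following is a minimal simulation sketch of a small instance of Eqn (2) with U_t = 0, the RCA special case noted above. All dimensions and matrix values are illustrative choices for this sketch only, not taken from the references:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions and matrices (chosen for this sketch only)
nx, m1 = 2, 1
A  = np.array([[0.7, 0.1],
               [0.0, 0.5]])
As = [np.array([[0.2, 0.0],
                [0.0, 0.1]])]        # multiplicative-noise loadings A_{s,i}
F  = np.array([[1.0],
               [0.5]])
C  = np.array([[1.0, 0.0]])

def simulate(T, x0=None):
    """Simulate X_{t+1} = (A + sum_i A_{s,i} xi_t^i) X_t + F w_t, Y_t = C X_t + a_t,
    with U_t = 0 (the RCA special case noted in the text)."""
    x = np.zeros(nx) if x0 is None else x0.copy()
    ys = []
    for _ in range(T):
        xi = rng.standard_normal(m1)        # Omega_xi = I
        w  = rng.standard_normal(1)         # process noise (unit variance here)
        a  = 0.1 * rng.standard_normal()    # measurement noise
        ys.append((C @ x).item() + a)
        x = (A + sum(xi[i] * As[i] for i in range(m1))) @ x + F @ w
    return np.array(ys)

y = simulate(500)
print(y.mean(), y.var())
```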


To introduce these concepts, denote the scalar variable of interest by Y. It is assumed that Y can be calculated by a model of the form:

$$Y = f(X_1, X_2, \cdots, X_p) = f(x) \tag{3}$$

where x is a vector of variables that influence Y through the function f. The description in Eqn (3) encompasses empirical models, and fundamental models generated from the application of constitutive equations such as those derived from material, energy and momentum balances. The output may be generated from a combination of algebraic equations, and ordinary or partial differential equations. The intent of sensitivity analysis (SA) is to decompose the variance of Y into contributions arising from the variables x and to assess the magnitude and significance of each component of x. The model can depend on other variables for which no analysis is required; these variables are not explicitly noted, for notational convenience. Most decompositions employ the law of iterated expectations for variance. Writing x as x^T = (u^T, v^T), the law of iterated expectations for variance is [29]:

$$\operatorname{var}\{Y\} = \operatorname{var}_u\{E_v\{Y|u\}\} + E_u\{\operatorname{var}_v\{Y|u\}\} \tag{4}$$

i.e. the variance of Y decomposes into the variance of a conditional mean plus the expectation of a conditional variance. Eqn (4) holds regardless of the existence of inter-block correlation. When there is no inter-block correlation between u and v, the following ANOVA decomposition exists [43]:

$$\operatorname{var}\{Y\} = V_u + V_v + V_{u,v} \tag{5}$$

where

$$V_u = \operatorname{var}_u\{E_v\{Y|u\}\}, \qquad V_v = \operatorname{var}_v\{E_u\{Y|v\}\}$$
$$V_{u,v} = \operatorname{var}_{u,v}\{E\{Y|u,v\}\} - V_u - V_v \tag{6}$$

V_u and V_v are interpreted as the direct impact of each of these variables on the variance of Y, when averaged over all possible values of the complementary variable. V_{u,v} measures the interacting effects of u and v on the variance of Y. The various terms in Eqn (5) can be readily estimated using Monte-Carlo techniques, which employ sampling methods such as ordinary random sampling, Latin hypercube sampling and quasi-random sampling [16, 17, 45]. The ANOVA and the numerical evaluation of the requisite quantities can be extended to more than two blocks of variables. A number of sensitivity indices can be constructed from the estimated variance contributions, and an analysis can be undertaken to ascertain the most important variables. There are many published applications of global sensitivity measures in the physical sciences and engineering. Methods using Monte-Carlo techniques are sometimes referred to as global sensitivity analysis, to distinguish them from methods that inherently exploit a local linearization, i.e. a Taylor series expansion, of the model about some fixed point.

The objective of this paper is to apply sensitivity analysis (SA) to nonlinear time series models. The outline of the paper is as follows. In the next section, computational methods for the calculation of sensitivity indices are reviewed. This is then followed by a section describing how these methods are utilized to undertake a sensitivity analysis for the class of models described by Eqn (1). Variance decompositions can be undertaken with respect to individual variables, such as ({ξ*_{1,t}}, t = 1 · · · ) or ({a_{j,t}}, t = 1 · · · ), contemporaneous groups of variables, ({ξ*_{i,t}}, i = 1 · · · n_ξ, {a_{i,t}}, i = 1 · · · n_a), or combinations of these two approaches. Several examples are then used to illustrate the methodology.

2 Numerical Calculation of Sensitivity Indices

2.1 Definition of Sensitivity Indices

For a function of p independent variables (X_i, i = 1 · · · p), the variance of Y can be decomposed as [1, 6, 43]:

$$\operatorname{var}\{Y\} = \sum_i V_i + \sum_i \sum_{j>i} V_{ij} + \cdots + V_{12\cdots p} \tag{7}$$

where

$$V_i = V(E(Y|X_i = x_i^*))$$
$$V_{ij} = V(E(Y|X_i = x_i^*, X_j = x_j^*)) - V(E(Y|X_i = x_i^*)) - V(E(Y|X_j = x_j^*)) \tag{8}$$

and so on. E(Y|X_i = x_i^*) denotes the expectation of Y conditional on X_i having a fixed value x_i^*, and V stands for the variance over all possible values x_i^*. The nomenclature used in Eqns (7 & 8) is commonly employed. V_i denotes the main effect of X_i on var{Y}. V_{ij} is the interaction of order two between X_i and X_j, the contribution to var{Y} that is not accounted for by the main effects V_i and V_j. The random variables X_1, ..., X_p in Eqn (7) are mutually uncorrelated [1]. Archer [1] notes that the decomposition in Eqn (7) has been discovered and rediscovered several times. The decomposition described in Eqn (7) also applies when any or all of the variables X_i are in fact a grouping of variables, say X_i = (w_1, w_2, · · · , w_q), where (w_1, w_2, · · · , w_q) may be a correlated set of variables [1, 20, 44]. These groups must be statistically independent of the other groups of variables: (X_i, i = 1 · · · p) may exhibit intra-group correlation, but inter-group dependence is not permitted.

The sensitivity index is defined as S_i = V_i/V, i = 1, 2, ..., p. There may be as many as 2^p − 1 variance components in Eqn (7), and consequently there are 2^p − 1 sensitivity indices. To circumvent the calculation of all of these quantities, the total sensitivity index (TSI) has been introduced [19]. The TSI, S_{Ti}, of one factor X_i is defined as the sum of all the sensitivity indices involving that factor. For example, in a three-factor case, the TSIs can be written as:

$$S_{T1} = S_1 + S_{12} + S_{13} + S_{123}$$
$$S_{T2} = S_2 + S_{12} + S_{23} + S_{123}$$
$$S_{T3} = S_3 + S_{23} + S_{13} + S_{123} \tag{9}$$

Another sensitivity measure is the complementary sensitivity factor S_{T∼i}, the total sensitivity index of the group of all variables other than X_i. By definition

$$S_i + S_{T\sim i} = 1 \tag{10}$$
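As an illustration (an example added here for exposition, not from the original): let Y = X_1 + X_2 + X_1X_2, with X_1 and X_2 independent standard normal variables. Then E(Y|X_1) = X_1 and E(Y|X_2) = X_2, so V_1 = V_2 = 1, while var{Y} = 3 because the three terms are uncorrelated; hence V_{12} = 1 and S_1 = S_2 = 1/3. The total sensitivity index of X_1 is S_{T1} = S_1 + S_{12} = 2/3, and Eqn (10) is verified: S_1 + S_{T∼1} = 1/3 + (V_2 + V_{12})/V = 1.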


2.2 Numerical Estimation of Sensitivity Measures

Monte-Carlo methods can be used to estimate the variance components and sensitivity measures [1, 16, 19, 21, 36, 43, 44]. An insightful taxonomy of the sampling schemes is provided in [26]. These authors also provide a summary of the various Monte-Carlo algorithms that are used for the calculation of the sensitivity measures. Studies in a number of these papers indicate that efficiencies may accrue to certain methods when the computing time for evaluating the function is high. Often, the more efficient methods require more intricate programming techniques. The objective in this paper is to demonstrate the application of sensitivity methods to nonlinear time series models. The mean and variance of Y can be estimated by [1, 43, 44]:

$$\hat f_0 = \frac{1}{2N}\left(\sum_{k=1}^{N} f(x_1^k, \cdots, x_p^k) + \sum_{k=1}^{N} f(\vec x_1^{\,k}, \cdots, \vec x_p^{\,k})\right) \tag{11}$$

$$\hat V = \frac{1}{2N}\left(\sum_{k=1}^{N} f^2(x_1^k, \cdots, x_p^k) + \sum_{k=1}^{N} f^2(\vec x_1^{\,k}, \cdots, \vec x_p^{\,k})\right) - \hat f_0^2 \tag{12}$$

where (x_1^k, · · · , x_p^k) and (\vec x_1^k, · · · , \vec x_p^k) are independent sets of N simulations of the multidimensional inputs, each having the requisite probability density functions. First order indices are estimated as Ŝ_j = V̂_j/V̂, where the variance components may be estimated as [33, 34]:

$$\hat V_j = \frac{1}{N}\sum_{k=1}^{N} f(x_1^k, \cdots, x_{j-1}^k, x_j^k, x_{j+1}^k, \cdots, x_p^k)\left[f(\vec x_1^{\,k}, \cdots, \vec x_{j-1}^{\,k}, x_j^k, \vec x_{j+1}^{\,k}, \cdots, \vec x_p^{\,k}) - f(\vec x_1^{\,k}, \cdots, \vec x_p^{\,k})\right] \tag{13}$$

An estimate of S_{Tj} is Ŝ_{Tj} = 1 − V̂_{∼j}/V̂, where the complementary variance may be calculated as [33]:

$$\hat V_{\sim j} = \hat V - \frac{1}{N}\sum_{k=1}^{N} f(x_1^k, \cdots, x_p^k)\left[f(x_1^k, \cdots, x_p^k) - f(x_1^k, \cdots, x_{j-1}^k, \vec x_j^{\,k}, x_{j+1}^k, \cdots, x_p^k)\right] \tag{14}$$
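The estimators in Eqns (11)–(14) are straightforward to implement. The following is a minimal sketch (in Python/NumPy rather than the MATLAB environment used in this paper); the toy model and sample size are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def sensitivity_estimates(f, sample, N):
    """Sketch of the Monte-Carlo estimators of Eqns (11)-(14).
    `sample(N)` returns an (N, p) matrix of independent input draws."""
    X, Xr = sample(N), sample(N)          # two independent input samples
    fX, fXr = f(X), f(Xr)
    f0 = (fX.sum() + fXr.sum()) / (2 * N)                    # Eqn (11)
    V = ((fX**2).sum() + (fXr**2).sum()) / (2 * N) - f0**2   # Eqn (12)
    p = X.shape[1]
    Vj, Vnotj = np.empty(p), np.empty(p)
    for j in range(p):
        Xrj = Xr.copy(); Xrj[:, j] = X[:, j]   # resampled inputs, column j shared
        Xj = X.copy();   Xj[:, j] = Xr[:, j]   # original inputs, column j resampled
        Vj[j] = np.mean(fX * (f(Xrj) - fXr))                 # Eqn (13)
        Vnotj[j] = V - np.mean(fX * (fX - f(Xj)))            # Eqn (14)
    return f0, V, Vj / V, 1.0 - Vnotj / V      # f0, V, S_j, S_Tj

# Toy model (illustrative): Y = X1 + X2 + X1*X2, so S1 = S2 = 1/3, ST1 = ST2 = 2/3
f = lambda X: X[:, 0] + X[:, 1] + X[:, 0] * X[:, 1]
print(sensitivity_estimates(f, lambda N: rng.standard_normal((N, 2)), 100_000))
```

Run on the toy model above, the estimated S_j and S_Tj should reproduce the analytic values 1/3 and 2/3 to within Monte-Carlo error.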

Interaction terms are not used in this paper; the reader can refer to a number of the previously cited references for suitable Monte-Carlo estimators. There are a number of other implementations of Monte-Carlo estimators for the variance components, some of which may be more efficient than Eqns (13 & 14). The reader is referred to the comprehensive bibliography and comparative analysis in [33]. The random data can be generated using ordinary random sampling, the winding stairs method [4, 21], Latin hypercube sampling [16, 28, 48] and quasi-random sampling [9, 49]; a minimal Latin hypercube sampler is sketched below. Improvements in the efficiency of the Monte-Carlo estimators are connected to the manner in which the random data are generated [33]. The Fourier Amplitude Sensitivity Test (FAST) [8, 40] can also be used to estimate the sensitivity measures. While this method may afford some computational efficiencies, the implementation is more complex and requires the user to specify a tuning parameter, the maximum number of harmonics. The selection of this parameter can significantly affect the reliability of the results. This method is not pursued further in this paper.
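For reference, a basic Latin hypercube sampler takes only a few lines. The sketch below returns uniforms on [0,1)^p, one point per equal-probability stratum in each dimension, to be mapped through the inverse CDF of the desired input distribution:

```python
import numpy as np

def latin_hypercube(N, p, rng):
    """One point per equal-probability stratum in each dimension,
    with the strata assigned in random order per column."""
    strata = rng.permuted(np.tile(np.arange(N), (p, 1)), axis=1).T  # (N, p)
    return (strata + rng.random((N, p))) / N

rng = np.random.default_rng(2)
U = latin_hypercube(1000, 2, rng)
# e.g. normal inputs: Z = scipy.stats.norm.ppf(U)
```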


2.3 Correlated Factors

The decomposition in Eqn (7) is only valid if the factors X_i and X_j are independent of each other. We note that Eqn (4) is always valid and does not depend on factor independence. The presence of correlated factors does not invalidate the use of sensitivity analysis; rather, the interpretation of the sensitivity measures is more complex. Saltelli and Tarantola [39] have identified two settings for the use of sensitivity analysis. In the first setting, it is desired to identify the most important factor contributing to the variance of Y. The factor that has the largest first order sensitivity index S_i is the most important factor, regardless of the correlation pattern among the variables [39]. In the second setting, it is desired to reduce the variance of Y by simultaneously fixing a number of factors. Further developments are explored in [38], where variables with the highest total effect are examined for variance reduction potential. When the factors are correlated, the choice of the factors is quite subtle; the reader is referred to [35, 39] for details. As will be demonstrated in the next section, many time series models can be described by independent factors, to which sensitivity analysis methods can be directly applied.

3 Application to Nonlinear Time Series Models

The modelling and analysis of nonlinear processes that can be described by discrete time series is complicated, not only by the richness of the models available, but also by the behavior that these processes can exhibit [12, 50]. In this paper, we are interested in the general class of models that admit a description given by Eqn (1). This class includes linear time series models, nonlinear autoregressive moving average (NARMA) models, Hammerstein and Wiener systems, random coefficient models, switching models such as SETAR, and neural networks [12, 14, 25, 30, 50]. Also included are linear systems with multiplicative noise, which have seen widespread application in areas such as chemistry, biology, ecology and economics [2, 46]. Although the identification of nonlinear stochastic systems described by Eqn (1) has been studied in some depth in the control community, the statistical analysis of these models is still in its infancy.

3.1 Variance Decomposition

When the time series admits a linear ARMA description, the variance of a strictly stationary series is entirely determined by the impulse response coefficients and the variance-covariance matrix of the stochastic components [15, 27]. The variance of the series does not depend on initial conditions. When the stochastic components are independent, it is straightforward to undertake an ANOVA with respect to the stochastic driving forces. When the stochastic components are correlated, the variance decomposition is less insightful; it is often then undertaken with respect to orthogonalized components, i.e. linear combinations of the stochastic components [15, 27]. When the series is marginally nonstationary, i.e. has periodic components or nonstationary elements that are removed by differencing, one can readily construct a multi-step predictor and undertake a variance decomposition of the prediction-error variance.


To undertake a variance analysis of time series described by Eqn (1), we make the following assumptions:

1. Eqn (1) can be solved numerically, subject to the initial condition I_0 = (z*_0, ξ*_{1,0}, · · · , ξ*_{n_ξ,0}, a*_{1,0}, · · · , a*_{n_a,0}), t ≤ 0, to give z_t, t = 1 · · · , for all (ξ*_{1,t}, · · · , ξ*_{n_ξ,t}, a*_{1,t}, · · · , a*_{n_a,t}), t > 0.
2. {ξ_{j,t}}, j = 1..n_ξ, are serially and contemporaneously independent random variables. {a_{j,t}}, j = 1..n_a, are serially independent. They may be contemporaneously correlated, but they are independent of {ξ_{j,t}}, j = 1..n_ξ.
3. The stochastic elements for t > 0 and the initial condition I_0 are independent.
4. The initial condition I_0 may be a random vector. If so, we denote its probability density function by P(I_0).

Consider the following algorithm:

1. Evaluate (z_t, t = 1 · · · L) N times with N independent sequences (ξ*_{1,t}, · · · , ξ*_{n_ξ,t}, a*_{1,t}, · · · , a*_{n_a,t}) of random variables having the desired distribution. Record the results of the simulation in the N × L matrix Z:

$$Z = \begin{pmatrix} z_1^{(1)} & z_2^{(1)} & \cdots & z_L^{(1)} \\ z_1^{(2)} & z_2^{(2)} & \cdots & z_L^{(2)} \\ \vdots & \vdots & \ddots & \vdots \\ z_1^{(N)} & z_2^{(N)} & \cdots & z_L^{(N)} \end{pmatrix} \tag{15}$$

2. Calculate the quantities:

$$\bar z_m = \frac{1}{N}\sum_{k=1}^{N} z_m^{(k)} \tag{16}$$

$$\operatorname{var}\{z_m\} = \frac{1}{N}\sum_{k=1}^{N} \big(z_m^{(k)} - \bar z_m\big)^2 \tag{17}$$

Eqn (16) is recognized as the Monte-Carlo estimate of the conditional mean of z_{t+m} given only information up to and including time t, and Eqn (17) is the Monte-Carlo estimate of the m-step prediction error variance [3, 14, 50], that is:

$$\bar z_m \simeq E\{z_{t+m|t}\} \tag{18}$$
$$\operatorname{var}\{z_m\} \simeq E\{\epsilon_{t+m|t}^2\} \tag{19}$$

If the process is stationary, then [14]:

$$\lim_{m\to\infty} E\{\epsilon_{t+m|t}^2\} = \operatorname{Var}\{z\} \tag{20}$$

For linear time series models that are stationary, Var{z} is independent of initial conditions. For a stationary nonlinear model, Var{z} may depend on these values. If the stationary distribution of lim_{m→∞} z_t does not depend on the initial conditions I_0, then the process is ergodic [50]. A sufficient condition for a nonlinear time series to be ergodic is given in [50] (p. 127). This condition is quite general, and it can be very difficult to establish ergodicity in general; results exist for specific nonlinear models.


If the time series is stationary, which often must be established via simulation, then the variance decomposition can be undertaken and one is assured that E{ε²_{t+m|t}} will converge to a limiting value as t → ∞. The results may depend on the initial conditions, unless the process is ergodic. This dependency can be checked by repeating the analysis for several different starting values. Alternatively, the initial distribution may be treated as another block of variables and included in the Monte-Carlo simulation.

The application of SA proceeds in a straightforward fashion. To illustrate the methodology, consider the case where there are two stochastic sources, ξ_t and (a_{1,t}, a_{2,t}). Using the nomenclature of the previous section, define X_{1,m} = [ξ_1, ξ_2, · · · , ξ_m] and X_{2,m} = [(a_{1,1} a_{2,1}), (a_{1,2} a_{2,2}), · · · , (a_{1,m} a_{2,m})], where m is the time length. In this formulation X_1 and X_2 are groups of variables. To undertake the computations, two arrays of random vectors, each of size N × 3L, are generated:

$$X = (X_1, X_2), \qquad \vec X = (\vec X_1, \vec X_2) \tag{21}$$

Define the following N × L matrices:

$$Z_A = f(X_1, X_2), \qquad Z_B = f(X_1, \vec X_2), \qquad Z_C = f(\vec X_1, X_2) \tag{22}$$

The ij-th element of Z_A is z_j from the i-th simulation using the random variable draw (X_1, X_2). The ANOVA proceeds directly. Let y_A(m), y_B(m), y_C(m) be the m-th column of Z_A, Z_B, Z_C respectively. Then

$$\hat V(m) = \frac{1}{N}(y_A(m))^T y_A(m) - (\bar y_A(m))^2$$
$$\hat V_1(m) = \frac{1}{N}(y_A(m))^T y_B(m) - \bar y_A(m)\,\bar y_B(m)$$
$$\hat V_2(m) = \frac{1}{N}(y_A(m))^T y_C(m) - \bar y_A(m)\,\bar y_C(m) \tag{23}$$

The overline denotes the average of the elements of the corresponding vector. Because the response is computed for every horizon, the methodology produces the variance decomposition at any particular horizon m.
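A compact sketch of this procedure follows, using a hypothetical RCA-type model with two input blocks to stand in for Eqn (1); the model form and parameter values are illustrative only. Z_B shares the block X_1 and so yields V̂_1; Z_C shares X_2 and yields V̂_2, matching Eqn (22):

```python
import numpy as np

rng = np.random.default_rng(3)
N, L = 20_000, 15

def f(X1, X2):
    """Toy nonlinear series z_t = (0.5 + 0.4 a_t) z_{t-1} + xi_t (hypothetical).
    X1 holds the xi draws, X2 the a draws, one row per simulation;
    returns the N x L matrix Z of Eqn (15)."""
    Z, z = np.zeros_like(X1), np.zeros(X1.shape[0])
    for m in range(X1.shape[1]):
        z = (0.5 + 0.4 * X2[:, m]) * z + X1[:, m]
        Z[:, m] = z
    return Z

# Two independent draws of each block, as in Eqn (21)
X1, X1r = rng.standard_normal((N, L)), rng.standard_normal((N, L))
X2, X2r = rng.standard_normal((N, L)), rng.standard_normal((N, L))

ZA, ZB, ZC = f(X1, X2), f(X1, X2r), f(X1r, X2)   # Eqn (22)

# Eqn (23), evaluated for every horizon m at once
V  = (ZA * ZA).mean(axis=0) - ZA.mean(axis=0)**2
V1 = (ZA * ZB).mean(axis=0) - ZA.mean(axis=0) * ZB.mean(axis=0)
V2 = (ZA * ZC).mean(axis=0) - ZA.mean(axis=0) * ZC.mean(axis=0)
V12 = V - V1 - V2                                # interaction, by difference
```

Each column m of Z_A, Z_B, Z_C corresponds to y_A(m), y_B(m), y_C(m), so the variance decomposition is obtained at all horizons from a single set of simulations.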

3.2 Number of Simulations

Some guidance on the number of simulations can be obtained by considering the case where the time series is described by a linear moving average model of order L that is driven by iid normal random variables {a_t} with mean 0 and variance σ_a². The output of this time series can be calculated as z_L = θ^T a, where the parameters of this model are denoted by the vector θ^T = [1 θ_1 · · · θ_L] and a^T = [a_L · · · a_1]. z_L² is recognized as a weighted sum of chi-squared variables. Denote the average value of z_L² across all simulations by \bar{z_L^2}. Using standard results for a weighted sum of quadratic forms:

$$E\{\bar{z_L^2}\} = \theta^T\theta\,\sigma_a^2, \qquad \operatorname{Var}\{\bar{z_L^2}\} = \frac{2}{N}\big(E\{\bar{z_L^2}\}\big)^2 \tag{24}$$

$$\frac{\sqrt{\operatorname{Var}\{\bar{z_L^2}\}}}{E\{\bar{z_L^2}\}} = \sqrt{\frac{2}{N}} \tag{25}$$

Thus one would anticipate the variability of \bar{z_L^2} to increase with L and then plateau once the |θ_i| are sufficiently small that \bar{z_L^2} no longer increases with L. Eqn (25) indicates that the relative error is independent of L. The choice of L depends upon the purpose of the analysis. If the objective is to investigate the sources of variability that affect the variance of the forecast error, then L is chosen to reflect the horizon of interest. If the objective is to investigate the variance of z, then longer horizons are required. An examination of the autocorrelation function of z, or of the autocorrelation function of z², provides guidelines for the selection of L. In any case, by undertaking the analysis with respect to the horizons m = 1 · · · L, it will be very apparent whether the choice of L is appropriate. Instead of producing one estimate of the ANOVA measures, several estimates can be obtained by dividing the N simulations into R blocks and computing R independent estimates, each based on N/R simulations. As noted by one of the reviewers, this approach can help assess the robustness of the analysis, recognizing however that each individual analysis is then based on a smaller number of simulations. Alternatively, one can check the convergence of the estimates by increasing N.
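The √(2/N) relative error in Eqn (25) is easily verified by simulation. A sketch with hypothetical MA weights:

```python
import numpy as np

rng = np.random.default_rng(4)
N, L, R = 1_000, 10, 200
theta = np.concatenate(([1.0], 0.8 ** np.arange(1, L + 1)))  # hypothetical MA weights

# R replications of the N-simulation average of z_L^2, with z_L = theta' a
zbar2 = np.array([((rng.standard_normal((N, L + 1)) @ theta)**2).mean()
                  for _ in range(R)])
print(zbar2.std() / zbar2.mean(), np.sqrt(2 / N))  # both approximately 0.045
```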

4 Examples

4.1 Random Coefficient Autoregressive Model

Consider the first order random coefficient autoregressive, RCA(1), model:

$$z_t = (\alpha + a_{1,t})z_{t-1} + a_{2,t} \tag{26}$$

where {a_{i,t}}, i = 1, 2, are iid random variables with mean zero and variance σ_i², independent of the initial condition z_0. The initial value z_0 is not treated as a random variable in this analysis. {a_{1,t}} and {a_{2,t}} are independent of each other, and α is a real constant. It is readily established that the variance of the m-step ahead prediction error, σ_ε²(m) = E{ε²_{t+m|t}}, follows the recursion:

$$\sigma_\epsilon^2(m) = (\alpha^2 + \sigma_1^2)\,\sigma_\epsilon^2(m-1) + \sigma_2^2, \qquad m > 1$$
$$\sigma_\epsilon^2(1) = (\alpha^2 + \sigma_1^2)\,z_0^2 + \sigma_2^2 \tag{27}$$

A necessary and sufficient condition for the variance to be finite for increasing m is that τ = α² + σ_1² < 1. This requirement also ensures that the variance of z is independent of the initial condition z_0. When z_0 = 0:

$$\sigma_\epsilon^2(m) = \frac{1-\tau^m}{1-\tau}\,\sigma_2^2 \tag{28}$$

The condition α² + σ_1² < 1 is also a necessary and sufficient condition for model (26) to have a unique stationary and ergodic solution [50]. The analytical solution for the variance decomposition is:

$$V_1(\infty) = 0, \qquad V_2(\infty) = \frac{\sigma_2^2}{1-\alpha^2}, \qquad V_{12}(\infty) = \frac{\sigma_1^2\,\sigma_2^2}{(1-\alpha^2)(1-\tau)} \tag{29}$$


To illustrate the numerical computations, we set α = 0.6, σ_1² = 0.3 and σ_2² = 0.4. An examination of the autocorrelation and bi-squared correlation plots suggests that L = 15. Since this example is independent of the initial condition, z_0 is set to zero. Figure 1 shows the estimated ANOVA results using N = 20,000; the random numbers were generated using ordinary random sampling of normal variates. Also plotted are the 'steady state' values V_2(∞) and V(∞). Since V_1(m) = 0 ∀m, V_12(m) = V(m) − V_2(m). The numerical values provide very reliable estimates of the true values. The direct impact of {a_{2,t}} on the prediction error variance levels out very quickly. The interaction between {a_{1,t}} and {a_{2,t}} makes a large contribution to σ_ε²(m) for m > 1. From this analysis, one can see that if it were possible to eliminate or reduce the effect of {a_{2,t}}, then the variance of the long-horizon prediction error would vanish. A reduction in {a_{1,t}} by itself would also significantly reduce the variance of the long-term prediction error, as the interaction effect would vanish. Figure 2 shows a sequence of estimated ANOVA results obtained by dividing the N simulations into 10 equal-sized blocks. As can be seen from these results, estimates of the variance components for small values of m are much less variable than those obtained for larger values of m. To obtain reliable estimates at the longer horizons, the number of simulations N had to be in the range N ≈ 5,000–10,000. Similar results were obtained for the other examples studied in this paper. The calculations were performed in a MATLAB environment using a typical desktop computing configuration; the computing times required are summarized in Table 1.
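A sketch reproducing this computation with the grouped estimator of Eqn (23) follows (Python/NumPy is used here in place of the MATLAB code employed in the paper); the limiting values from Eqn (29) serve as a check:

```python
import numpy as np

rng = np.random.default_rng(5)
alpha, s1, s2 = 0.6, np.sqrt(0.3), np.sqrt(0.4)  # parameters of Eqn (26)
N, L = 20_000, 15

def rca1(A1, A2):
    """Simulate z_t = (alpha + a_{1,t}) z_{t-1} + a_{2,t} from z_0 = 0."""
    Z, z = np.zeros_like(A1), np.zeros(A1.shape[0])
    for m in range(A1.shape[1]):
        z = (alpha + A1[:, m]) * z + A2[:, m]
        Z[:, m] = z
    return Z

A1, A1r = s1 * rng.standard_normal((N, L)), s1 * rng.standard_normal((N, L))
A2, A2r = s2 * rng.standard_normal((N, L)), s2 * rng.standard_normal((N, L))
ZA, ZB, ZC = rca1(A1, A2), rca1(A1, A2r), rca1(A1r, A2)

V  = ZA.var(axis=0)
V1 = (ZA * ZB).mean(axis=0) - ZA.mean(axis=0) * ZB.mean(axis=0)  # approx. 0
V2 = (ZA * ZC).mean(axis=0) - ZA.mean(axis=0) * ZC.mean(axis=0)

tau = alpha**2 + s1**2
print(V[-1],  s2**2 / (1 - tau))       # V(infinity)  = 0.4/0.34, about 1.18
print(V2[-1], s2**2 / (1 - alpha**2))  # V2(infinity) = 0.625, Eqn (29)
```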

4.2 Newer Exponential Autoregressive (NEAR) Model

Consider the NEAR(2) model introduced in [24] and discussed more fully in [11]:

$$z_t = a_{1,t}\beta_1 z_{t-1} + a_{2,t}\beta_2 z_{t-2} + f\,\xi_t \tag{30}$$

with |β_i| ≤ 1, i = 1, 2. f is a scalar, {ξ_t} is an iid sequence of Laplace variates with mean μ_ξ and variance σ_ξ², and

$$\operatorname{Prob}\{(a_{1,t}, a_{2,t}) = (1,0)\} = \alpha_1$$
$$\operatorname{Prob}\{(a_{1,t}, a_{2,t}) = (0,1)\} = \alpha_2$$
$$\operatorname{Prob}\{(a_{1,t}, a_{2,t}) = (0,0)\} = 1 - \alpha_1 - \alpha_2 \tag{31}$$

with α_1, α_2 ≥ 0 and α_1 + α_2 ≤ 1. It can be verified that:

$$E\left\{\begin{pmatrix} a_{1,t} \\ a_{2,t} \end{pmatrix}\right\} = \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix}, \qquad \operatorname{cov}\left\{\begin{pmatrix} a_{1,t} \\ a_{2,t} \end{pmatrix}\right\} = \begin{pmatrix} (1-\alpha_1)\alpha_1 & -\alpha_1\alpha_2 \\ -\alpha_1\alpha_2 & (1-\alpha_2)\alpha_2 \end{pmatrix} \tag{32}$$

A necessary and sufficient condition for Eqn (30) to have a unique ergodic solution is that α_1β_1² + α_2β_2² < 1 [11]. These models are a special class of random coefficient models. The choice of the scalar f can be used to ensure that the marginal distribution of z is exponential. We consider the specific case β_1 = β_2 = 1, α_1 = 0.5, α_2 = 0.3, α_3 = 0.2 and f = √α_3, in which case {z_t} has an exponential distribution [11]. Analysis of the autocorrelation function and bi-squared correlation function suggested that L = 20. The Laplace variable, with a location parameter of zero and unit scale parameter, was generated by a transformation of a uniform variate, and {a_{1,t}, a_{2,t}} were generated from a multinomial distribution; a sketch of this generation step follows below. The generation of Laplace variables is more computationally intensive than that of normal or uniform variates, and contributes significantly to the computation time in this example (Table 1). The variance decomposition for N = 10,000 is shown in Fig. 3. The direct variance contribution from {a_{1,t}, a_{2,t}} is essentially zero, as expected. D(∞) ≃ 9.5, D_ξ(∞) ≃ 4.2 and D_a(∞) = 0, indicating that the interaction between {a_{1,t}, a_{2,t}} and {ξ_t} is important. The analysis is similar to that of the previous example. Reducing the variability of {ξ_t} will reduce the variability of the output through a direct effect and through an interaction effect with {a_{1,t}, a_{2,t}}. A total elimination of the variability in {a_{1,t}, a_{2,t}} would only reduce the long-term variability D(∞) to 4.2.
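The input generation described above can be sketched as follows (the inverse-CDF transform for the Laplace variate is one standard choice; NumPy's built-in rng.laplace would serve equally well):

```python
import numpy as np

rng = np.random.default_rng(6)
N, alpha1, alpha2 = 10_000, 0.5, 0.3

# Laplace(0, 1) variates from uniforms via the inverse CDF
u = rng.random(N) - 0.5
xi = -np.sign(u) * np.log1p(-2.0 * np.abs(u))

# (a_{1,t}, a_{2,t}) from the three-outcome multinomial of Eqn (31)
state = rng.choice(3, size=N, p=[alpha1, alpha2, 1 - alpha1 - alpha2])
a1, a2 = (state == 0).astype(float), (state == 1).astype(float)
```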

4.3 Nonlinear Autoregressive Moving Average with Exogenous Input (NARMAX) Model

Consider the NARMAX model developed in [32]:

$$\begin{aligned} z_t = {} & 0.9722\,z_{t-1} + 0.357\,u_{t-1} - 0.1295\,u_{t-2} \\ & - 0.3103\,z_{t-1}u_{t-1} - 0.04228\,z_{t-2}^2 + 0.1663\,z_{t-2}u_{t-2} \\ & + 0.2573\,z_{t-2}\xi_{t-1} - 0.03259\,z_{t-1}^2 z_{t-2} - 0.3513\,z_{t-1}^2 u_{t-2} \\ & + 0.3084\,z_{t-1}z_{t-2}u_{t-2} + 0.2939\,z_{t-2}^2\xi_{t-1} \\ & + 0.1087\,z_{t-2}u_{t-1}u_{t-2} + 0.477\,z_{t-2}u_{t-1}\xi_{t-1} \\ & + 0.6389\,u_{t-2}^2\xi_{t-1} + \xi_t \end{aligned} \tag{33}$$

z_t describes the liquid level in a conical vessel and u_t is an exogenous variable. The model was identified from experimental data using a designed experiment. We are interested in the contributions of ξ_t and u_t to the variance of z for the situation where ξ_t ∼ iid N(0, 0.05). The exogenous variable is an autoregressive process, u_t = 0.9u_{t-1} + a_t, with a_t ∼ N(0, 0.025(1 − 0.9²)). L = 25 was chosen for the analysis. The variance contributions are shown in Fig. 4 using N = 20,000; a simulation sketch of Eqn (33) is given at the end of this section.

1. Case A: Initial values were z_{t-j} = ξ_{t-j} = u_{t-j} = 0, j = 1, 2. The temporal variance propagation of ξ_t is not monotonic; this characteristic has been observed in other time series models [50]. We also notice that the variance component arising from the interaction between {ξ_t} and {a_t} increases with the prediction horizon m.
2. Case B: Initial values were z_{t-j} = ξ_{t-j} = u_{t-j} = 0.25, j = 1, 2. The variance decomposition is shown in Fig. 5. The ANOVA for case A is only slightly different from that depicted in Fig. 5, and the steady state or limiting behavior is very similar. This is not surprising: based on material balance considerations, one might anticipate that the system is ergodic, although this is difficult to prove.

In both cases, the impact of the exogenous variable on the variance of the prediction error increases with the horizon m, while the direct impact of the endogenous variable ξ_t first increases, and then decreases, with horizon. For prediction horizons m = 1 · · · 5, the variance of the prediction error is most strongly influenced by the endogenous variable.


The temporal analysis indicates that the greatest reduction in the variance of z can be achieved by moderating the influence of the endogenous variable.
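A minimal simulation sketch for this example (Case A, zero initial values) is given below, with the coefficients as transcribed in Eqn (33); the variance decomposition then proceeds exactly as in the earlier sketches, with X_1 = {ξ_t} and X_2 = {a_t} as the two input blocks:

```python
import numpy as np

rng = np.random.default_rng(7)
N, L = 20_000, 25

def narmax(xi, a):
    """One path per row of Eqn (33), with exogenous input u_t = 0.9 u_{t-1} + a_t.
    Case A: all initial values zero. Returns the N x L matrix of z."""
    n = xi.shape[0]
    z1, z2 = np.zeros(n), np.zeros(n)      # z_{t-1}, z_{t-2}
    u1, u2 = np.zeros(n), np.zeros(n)      # u_{t-1}, u_{t-2}
    e1 = np.zeros(n)                       # xi_{t-1}
    Z = np.zeros_like(xi)
    for m in range(xi.shape[1]):
        u = 0.9 * u1 + a[:, m]
        z = (0.9722 * z1 + 0.357 * u1 - 0.1295 * u2
             - 0.3103 * z1 * u1 - 0.04228 * z2**2 + 0.1663 * z2 * u2
             + 0.2573 * z2 * e1 - 0.03259 * z1**2 * z2 - 0.3513 * z1**2 * u2
             + 0.3084 * z1 * z2 * u2 + 0.2939 * z2**2 * e1
             + 0.1087 * z2 * u1 * u2 + 0.477 * z2 * u1 * e1
             + 0.6389 * u2**2 * e1 + xi[:, m])
        z2, z1, u2, u1, e1 = z1, z, u1, u, xi[:, m]
        Z[:, m] = z
    return Z

xi = np.sqrt(0.05) * rng.standard_normal((N, L))
a  = np.sqrt(0.025 * (1 - 0.9**2)) * rng.standard_normal((N, L))
Z  = narmax(xi, a)
```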

5 Conclusion

The temporal and component-factor ANOVA decomposition of nonlinear time series and nonlinear stochastic systems can be cast in a form that is amenable to existing Monte-Carlo methods. The methodology essentially involves treating the stochastic variables as model inputs. The inputs at different times are grouped together, and sensitivity measures are obtained at each time point on the groups themselves. Applications of this approach to a series of examples of increasing complexity indicate that it gives very credible estimates of the variance decomposition. This decomposition can be used to ascertain the importance of variables to the variance of the prediction error at various prediction horizons, or to rank the variables that might be manipulated to reduce the variance of the variable of interest. Variations on the method can be used to ascertain the impact that a particular random component, or shock, at time t + m has on the future forecast error variance, E{ε²_{t+m+j|t}}, j > 0.

Acknowledgements The authors acknowledge the support of the Natural Sciences and Engineering Research Council of Canada (NSERC) and the School of Graduate Studies and Research at Queen's University. The comments of the anonymous reviewers were very helpful.

References

1. Archer, G.E.B., Saltelli, A., Sobol', I.M.: Sensitivity measures, ANOVA-like techniques and the use of bootstrap. Journal of Statistical Computation and Simulation 58, 99–120 (1997)
2. Bailey, B.A., Doney, S.C., Lima, I.D.: Quantifying the effects of dynamical noise on the predictability of a simple ecosystem model. Environmetrics 15, 337–355 (2004)
3. Brown, B.W., Mariano, R.S.: Predictors in dynamic nonlinear models: large-sample behavior. Econometric Theory 5(3), 430–452 (1989). URL http://www.jstor.org/stable/3532377
4. Chan, K., Saltelli, A., Tarantola, S.: Winding stairs: a sampling tool to compute sensitivity indices. Statistics and Computing 10, 187–196 (2000)
5. Chan, K., Tarantola, S., Saltelli, A., Sobol', I.M.: Variance-based methods. In: Sensitivity Analysis, pp. 167–197. Wiley, West Sussex (2000)
6. Cox, D.: An analytic method for uncertainty analysis of nonlinear output functions, with applications to fault-tree analysis. IEEE Transactions on Reliability 31, 465–468 (1982)
7. Cukier, R., Fortuin, C., Shuler, K., Petschek, A., Schaibly, J.: Study of the sensitivity of coupled reaction systems to uncertainties in rate coefficients. I. Theory. Journal of Chemical Physics 59, 3873–3878 (1973)
8. Cukier, R., Levine, H., Shuler, K.: Nonlinear sensitivity analysis of multi-parameter model systems. Journal of Physical Chemistry 81, 2365–2366 (1977)
9. Davis, P., Rabinowitz, P.: Methods of Numerical Integration, 2nd edn. Academic Press, New York (2007)
10. Desborough, L.D., Harris, T.J.: Performance assessment measures for univariate feedforward/feedback control. Canadian Journal of Chemical Engineering 71, 605–616 (1993)
11. Dewald, L., Lewis, P.: A new Laplace second-order autoregressive time-series model: NLAR(2). IEEE Transactions on Information Theory 31, 645–651 (1985)
12. Fan, J., Yao, Q.: Nonlinear Time Series: Nonparametric and Parametric Methods. Springer-Verlag, New York (2005)
13. El Ghaoui, L.: State-feedback control of systems with multiplicative noise via linear matrix inequalities. Systems & Control Letters 24, 223–228 (1995)
14. Granger, C., Teräsvirta, T.: Modelling Nonlinear Economic Relationships. Oxford University Press, Oxford (1993)
15. Hamilton, J.D.: Time Series Analysis. Princeton University Press, Princeton (1994)
16. Helton, J., Davis, F.: Latin hypercube sampling and the propagation of uncertainty in analyses of complex systems. Reliability Engineering and System Safety 81, 23–69 (2003)
17. Helton, J., Johnson, J., Sallaberry, C., Storlie, C.: Survey of sampling-based methods for uncertainty and sensitivity analysis. Reliability Engineering and System Safety 91, 1175–1209 (2006)
18. Henson, M.A., Seborg, D.E.: Nonlinear Process Control. Prentice Hall, Upper Saddle River, N.J. (1997)
19. Homma, T., Saltelli, A.: Importance measures in global sensitivity analysis of nonlinear models. Reliability Engineering and System Safety 52, 1–17 (1996)
20. Jacques, J., Lavergne, C., Devictor, N.: Sensitivity analysis in the presence of model uncertainty and correlated inputs. Reliability Engineering and System Safety 91, 1126–1134 (2006)
21. Jansen, M.J.W., Rossing, W.A.H., Daamen, R.A.: Monte Carlo estimation of uncertainty contributions from several independent multivariate sources. In: Predictability and Nonlinear Modelling in Natural Sciences and Economics. Kluwer Academic Publishers, Dordrecht (1994)
22. Jimenez, J.C., Ozaki, T.: Linear estimation of continuous-discrete linear state space models with multiplicative noise. Systems & Control Letters 47, 91–101 (2002)
23. Kurowicka, D., Cooke, R.: Uncertainty Analysis with High Dimensional Dependence Modelling. Wiley Series in Probability and Statistics. John Wiley & Sons, Chichester, England (2006)
24. Lawrance, A., Lewis, P.: Modelling and residual analysis of nonlinear autoregressive time series in exponential variables. Journal of the Royal Statistical Society, Series B 47, 165–202 (1985)
25. Leontaritis, I.J., Billings, S.: Input-output parametric models for non-linear systems. Part II: stochastic non-linear systems. International Journal of Control 41, 329–344 (1985)
26. Lilburne, L., Tarantola, S.: Sensitivity analysis of spatial models. International Journal of Geographical Information Science 23, 151–168 (2009)
27. Lütkepohl, H.: Introduction to Multiple Time Series Analysis. Springer-Verlag, Berlin (1991)
28. McKay, M., Beckman, R., Conover, W.: A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 42, 55–61 (2000)
29. Panjer, H.: Decomposition of moments by conditional moments. The American Statistician 27, 170–171 (1973)
30. Pearson, R.: Discrete-Time Dynamical Models. Oxford University Press, Oxford (1999)
31. Petersson, M., Årzén, K.E., Hägglund, T.: A comparison of two feedforward control structure assessment methods. International Journal of Adaptive Control and Signal Processing 17, 609–624 (2003)
32. Sales, K.R., Billings, S.A.: Self-tuning control of non-linear ARMAX models. International Journal of Control 51(4), 753–769 (1990)
33. Saltelli, A., Annoni, P., Azzini, I., Campolongo, F., Ratto, M., Tarantola, S.: Variance based sensitivity analysis of model output. Design and estimator for the total sensitivity index. Computer Physics Communications 181, 259–270 (2010)
34. Saltelli, A.: Making best use of model evaluations to compute sensitivity indices. Computer Physics Communications 145, 280–297 (2002)
35. Saltelli, A.: Sensitivity analysis for importance assessment. Risk Analysis 22, 579–590 (2002)
36. Saltelli, A., Chan, K., Scott, E. (eds.): Sensitivity Analysis. Wiley, New York (2000)
37. Saltelli, A., Ratto, M., Tarantola, S., Campolongo, F.: Sensitivity analysis practices: strategies for model-based inference. Reliability Engineering and System Safety 91, 1109–1125 (2006)
38. Saltelli, A., Tarantola, S., Campolongo, F., Ratto, M.: Sensitivity Analysis in Practice: A Guide to Assessing Scientific Models. Wiley, New York (2004)
39. Saltelli, A., Tarantola, S.: On the relative importance of input factors in mathematical models: safety assessment for nuclear waste disposal. Journal of the American Statistical Association 97, 702–709 (2002)
40. Saltelli, A., Tarantola, S., Chan, K.: A quantitative model-independent method for global sensitivity analysis of model output. Technometrics 41, 39–56 (1999)
41. Seppala, C., Harris, T., Bacon, D.: Time series methods for dynamic analysis of multiple controlled variables. Journal of Process Control 12, 257–276 (2002)
42. Shah, S., Patwardhan, R., Huang, B.: Multivariate controller performance analysis: methods, applications and challenges. In: Chemical Process Control – CPC VI, AIChE Symposium Series, vol. 98, pp. 190–207 (2002)
43. Sobol', I.M.: Sensitivity estimates for nonlinear mathematical models. Matematicheskoe Modelirovanie 2, 112–118 (1990) (in Russian); English translation in Mathematical Modelling and Computational Experiments 1, 407–414 (1993)
44. Sobol', I.M.: Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Mathematics and Computers in Simulation 55, 271–280 (2001)
45. Sobol', I.M.: Distribution of points in a cube and approximate evaluation of integrals. U.S.S.R. Computational Mathematics and Mathematical Physics 7, 86–112 (1967)
46. Stachurski, J.: Economic dynamical systems with multiplicative noise. Journal of Mathematical Economics 39, 135–152 (2003)
47. Stanfelj, N., Marlin, T.E., MacGregor, J.F.: Monitoring and diagnosing process control performance: the single-loop case. Industrial & Engineering Chemistry Research 32, 301–314 (1993)
48. Stein, M.: Large sample properties of simulations using Latin hypercube sampling. Technometrics 29, 143–151 (1987)
49. Stroud, A.: Approximate Calculation of Multiple Integrals. Prentice-Hall, Englewood Cliffs (1971)
50. Tong, H.: Non-linear Time Series: A Dynamical System Approach. Oxford University Press, New York (1990)

Table 1 Computation time (sec)

Example    N         Random number generation    Simulation    ANOVA    Total
1          20,000    1.2                         151.7         0.5      153.4
2          10,000    213.7                       74.7          0.6      289.0
3          20,000    0.1                         379.8         1.8      381.7

Fig. 1 ANOVA estimates: Example 1. (Plot of the variance components V̂(m) and V̂2(m) against horizon m, together with the limiting values V(∞) and V2(∞); figure not reproduced.)

Fig. 2 Envelope of ANOVA estimates: Example 1. (Blocked estimates of V̂(m) and V̂2(m) against horizon m; figure not reproduced.)

Fig. 3 ANOVA estimates: Example 2. (Variance components V̂(m), V̂ξ(m) and V̂a(m) against horizon m; figure not reproduced.)

Fig. 4 ANOVA estimates: Example 3A. (Variance components V̂(m), V̂ξ(m), V̂a(m) and V̂ξ(m) + V̂a(m) against horizon m; figure not reproduced.)

Fig. 5 ANOVA estimates: Example 3B. (Variance components V̂(m), V̂ξ(m), V̂a(m) and V̂ξ(m) + V̂a(m) against horizon m; figure not reproduced.)