Working Paper 11-34 Statistics and Econometrics Series 026 October 2011
Departamento de Estadística Universidad Carlos III de Madrid Calle Madrid, 126 28903 Getafe (Spain) Fax (34) 91 624-98-49
BOOTSTRAP FORECAST OF MULTIVARIATE VAR MODELS WITHOUT USING THE BACKWARD REPRESENTATION Lorenzo Pascual♦, Esther Ruiz♠ and Diego Fresoli♣
Abstract In this paper, we show how to simplify the construction of bootstrap prediction densities in multivariate VAR models by avoiding the backward representation. Bootstrap prediction densities are attractive because they incorporate the parameter uncertainty and do not rely on any particular assumption about the error distribution. What is more, the construction of densities for more than one step ahead is possible even in situations in which these densities are unknown asymptotically. The main advantage of the new procedure is that it is computationally simple without losing the good performance of bootstrap procedures. Furthermore, by avoiding a backward representation, its asymptotic validity can be proved without relying on the assumption of Gaussian errors, as needed by alternative procedures. Moreover, the new procedure can be implemented to obtain prediction densities in models without a backward representation, for example, models with MA components or GARCH disturbances. By comparing the finite sample performance of the proposed procedure with that of the alternatives, we show that nothing is lost by using it. Finally, we implement the procedure to obtain prediction regions for US quarterly future inflation, unemployment and GDP growth.
Keywords: Non-Gaussian VAR models, Prediction cubes, Prediction density, Prediction regions, Prediction ellipsoids, Resampling methods.
♦ EDP-Energías de Portugal, S.A., Unidade de Negócio de Gestão da Energía, Director Adjunto.
♠ Corresponding author: Dpt. Estadística and Instituto Flores de Lemus, Universidad Carlos III de Madrid, C/ Madrid 126, 28903 Getafe, Spain. Tel: 34 916249851, Fax: 34 916249849, e-mail: [email protected].
♣ Dpt. de Estadística, Universidad Carlos III de Madrid, C/ Madrid 126, 28903 Getafe (Madrid), Spain, e-mail: [email protected].
Acknowledgements. The last two authors are grateful for financial support from project ECO2009-08100 of the Spanish Government. We are also grateful to Gloria González-Rivera for her useful comments. The usual disclaimers apply.
1 Introduction
Bootstrap procedures are known to be useful when forecasting time series because they allow the construction of prediction densities without imposing particular assumptions on the error distribution, while simultaneously incorporating the parameter uncertainty. Note that when the errors are non-Gaussian, the prediction densities are usually unknown when the prediction horizon is larger than one step ahead. However, the bootstrap can be implemented in these cases to obtain the corresponding prediction densities. Bootstrap procedures are also attractive because of their computational simplicity and wide applicability. However, these advantages are limited by the use of the backward representation that many authors advocate after the seminal paper of Thombs and Schucany (1990). In particular, Kim (1999) extends the procedure of Thombs and Schucany (1990) to stationary VAR(p) models. Later, Kim (2001, 2004) considered bias-corrected prediction regions by employing a bootstrap-after-bootstrap approach. On the other hand, Grigoletto (2005) proposes two further alternative procedures based on Kim (1999) that take into account not only the uncertainty due to parameter estimation but also the uncertainty attributable to model specification. In any case, the bootstrap procedures conceived by Kim (1999, 2001, 2004) and Grigoletto (2005) use the backward representation to generate the bootstrap samples used to obtain replicates of the estimated parameters.

Using the backward representation has three main drawbacks. First, the resulting procedure is computationally complicated and time consuming. Second, given that the backward residuals are not independent, it is necessary to use the relationship between the backward and forward representations of the model in order to resample from independent residuals; see Kim (1997, 1998) for this relationship, which can be rather complicated for high-order models. Consequently, Kim (1999) resamples directly from the dependent backward residuals.
However, the asymptotic validity of the bootstrap resampling can only be proved by imposing i.i.d. backward errors and, as a result, it requires assuming Gaussian errors. Finally, these bootstrap alternatives can only be applied to models with a backward representation, which excludes their implementation in, for example, multivariate models with Moving Average (MA) components or with GARCH disturbances. In a univariate framework, Pascual et al. (2004a) show that the backward representation can be avoided without losing the good properties of the bootstrap prediction densities. When dealing with multivariate systems, it is even more important to avoid the backward representation due to its greater complexity; see Kim (1997, 1998). In this paper, we propose an extension of the bootstrap procedure proposed by Pascual et al. (2004a) for univariate ARIMA models to obtain
joint prediction densities for multivariate VAR(p) models avoiding the backward representation and, consequently, overcoming its limitations. We prove the asymptotic validity of the proposed procedure without relying on particular assumptions about the prediction error distribution. We focus on the construction of multivariate prediction densities from which it is possible to obtain marginal prediction intervals for each of the variables in the system and joint prediction regions for two or more variables within the system.1 Monte Carlo experiments are carried out to study the finite sample performance of the marginal prediction intervals obtained by the new bootstrap procedure and compare it with those of alternative procedures available in the literature. We also compare their corresponding elliptical and Bonferroni regions. We show that, although the bootstrap procedure proposed in this paper is computationally simpler, its finite sample properties are similar to those of previous, more complicated bootstrap approaches and clearly better than those of the standard and asymptotic prediction densities. We also show that when the errors are non-Gaussian, the bootstrap elliptical regions are inappropriate, with the Bonferroni regions having better properties. The procedures are illustrated with an empirical application which consists of predicting future inflation, unemployment and growth rates in the US. The rest of the paper is organized as follows. Section 2 describes the asymptotic and bootstrap prediction intervals and regions previously available in the literature. In Section 3, we propose a new bootstrap procedure, derive its asymptotic distribution and analyze its performance in finite samples. We compare the new bootstrap densities and corresponding prediction intervals and regions with the standard, asymptotic and alternative bootstrap procedures. Section 4 illustrates the results with an empirical application.
Finally, Section 5 concludes the paper with suggestions for further research.
2 Asymptotic and bootstrap prediction intervals and regions for VAR models
In this section, we describe the construction of prediction regions in stationary VAR models based on assuming known parameters and Gaussian errors. We also describe how the parameter uncertainty can be incorporated by using asymptotic and bootstrap approximations of the finite sample distribution of the parameter estimator. The bootstrap procedures can also be implemented to deal with non-Gaussian errors.

1 Previous papers focus on the Bonferroni regions and not on the bootstrap densities themselves.
2.1 Asymptotic prediction intervals and regions
Consider the following multivariate VAR(p) model

\Phi(L) Y_t = \mu + a_t,    (1)

where Y_t is the N×1 vector of observations at time t, \mu is an N×1 vector of constants, and \Phi(L) = I_N - \Phi_1 L - ... - \Phi_p L^p, with L being the lag operator and I_N the N×N identity matrix. The N×N parameter matrices \Phi_i, i = 1, ..., p, satisfy the stationarity restriction. Finally, a_t is a sequence of N×1 independent white noise vectors with nonsingular contemporaneous covariance matrix \Sigma_a. It is well known that if a_t is an independent vector white noise sequence, then the point predictor of Y_{T+h} that minimizes the Mean Square Error (MSE) is its conditional mean, which depends on the model parameters. In practice, these parameters are unknown and the predictor of Y_{T+h} is obtained with the parameters substituted by consistent estimates as follows
\hat{Y}_{T+h|T} = \hat{\mu} + \hat{\Phi}_1 \hat{Y}_{T+h-1|T} + ... + \hat{\Phi}_p \hat{Y}_{T+h-p|T},    (2)

where \hat{Y}_{T+j|T} = Y_{T+j}, j = 0, -1, .... Furthermore, the MSE of \hat{Y}_{T+h|T} is usually estimated as follows

\hat{\Sigma}_{\hat{Y}}(h) = \sum_{j=0}^{h-1} \hat{\Psi}_j \hat{\Sigma}_a \hat{\Psi}_j',    (3)

where \hat{\Psi}_j are the estimated matrices of the MA representation of Y_t and \hat{\Sigma}_a = \frac{\hat{a}\hat{a}'}{T - Np - 1}, where \hat{a} = (\hat{a}_1, ..., \hat{a}_T) with

\hat{a}_t = Y_t - \hat{\mu} - \hat{\Phi}_1 Y_{t-1} - ... - \hat{\Phi}_p Y_{t-p}.    (4)
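As a concrete illustration of how the MSE in (3) is computed, recall that the MA matrices satisfy the recursion \Psi_0 = I_N and \Psi_j = \sum_{i=1}^{\min(j,p)} \Phi_i \Psi_{j-i}. A minimal NumPy sketch of this computation (the function name and layout are ours, not from the paper):

```python
import numpy as np

def mse_matrix(phis, sigma_a, h):
    """h-step-ahead MSE matrix of eq. (3): sum_{j<h} Psi_j Sigma_a Psi_j',
    with the MA matrices Psi_j obtained recursively from the VAR matrices."""
    N = sigma_a.shape[0]
    p = len(phis)
    psis = [np.eye(N)]                       # Psi_0 = I_N
    for j in range(1, h):
        psi = sum(phis[i - 1] @ psis[j - i] for i in range(1, min(j, p) + 1))
        psis.append(psi)
    return sum(psi @ sigma_a @ psi.T for psi in psis)
```

For h = 1 this reduces to \hat{\Sigma}_a itself, so any difference between prediction regions at horizon one comes only from the parameter uncertainty correction discussed below.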
If a_t is further assumed to be Gaussian, then the marginal prediction density of the nth variable in the system is also Gaussian, and the standard practice is to construct the (1-α)100% prediction interval for the nth variable in the system as follows

GI_{T+h} = \{ y_{n,T+h} \mid y_{n,T+h} \in \hat{y}_{n,T+h|T} \pm z_{\alpha/2} \hat{\sigma}_{n,h} \},    (5)

where \hat{y}_{n,T+h|T} is the nth component of \hat{Y}_{T+h|T}, \hat{\sigma}_{n,h} is the square root of the nth diagonal element of \hat{\Sigma}_{\hat{Y}}(h) and z_\alpha is the upper α critical value of the standard Gaussian distribution. The Gaussianity of the forecast errors can also be used to obtain the following (1-α)100% joint ellipsoid for all the variables in the system

GE_{T+h} = \{ Y_{T+h} \mid [Y_{T+h} - \hat{Y}_{T+h|T}]' \hat{\Sigma}_{\hat{Y}}(h)^{-1} [Y_{T+h} - \hat{Y}_{T+h|T}] < \chi^2_\alpha(N) \},    (6)

where \chi^2_\alpha(N) is the α-quantile of the \chi^2 distribution with N degrees of freedom.2 Constructing the ellipsoids in (6) can be quite demanding when N is larger than two or three. Consequently, Lütkepohl (1991) proposes using the Bonferroni method to construct the following prediction cubes with coverage of at least (1-α)100%

GC_{T+h} = \{ Y_{T+h} \mid Y_{T+h} \in \cup_{n=1}^{N} \hat{y}_{n,T+h|T} \pm z_\tau \hat{\sigma}_{n,h} \},    (7)
where τ = 0.5(α/N). However, note that the prediction intervals and regions above have two main drawbacks. First, they are constructed using the MSE in (3), which does not incorporate the parameter uncertainty. As a consequence, if the sample size is small, the uncertainty associated with \hat{Y}_{T+h|T} is underestimated and the corresponding intervals and regions will have lower coverages than the nominal one. The second problem is related to the Gaussianity assumption. When this assumption does not hold, the quadratic form in (6) is not adequate, nor are the widths of the intervals in (5) and (7). Even more, when the prediction errors are not Gaussian, the shape of their densities for h > 2 is in general unknown. As an illustration, consider the following VAR(2) bivariate model

\begin{pmatrix} y_{1,t} \\ y_{2,t} \end{pmatrix} = \begin{pmatrix} -0.9 & 0 \\ 0.4 & 0 \end{pmatrix} \begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix} + \begin{pmatrix} -0.5 & -0.7 \\ 0.8 & -0.1 \end{pmatrix} \begin{pmatrix} y_{1,t-2} \\ y_{2,t-2} \end{pmatrix} + \begin{pmatrix} a_{1,t} \\ a_{2,t} \end{pmatrix},    (8)

where a_t = (a_{1,t}, a_{2,t})' is an independent white noise vector with contemporaneous covariance matrix given by vec \Sigma_a = (1, 0.8, 0.8, 1)', where vec denotes the column stacking operator. The distribution of a_t is a \chi^2(4). Panel (a) of Figures 1 and 2 displays the joint one-step-ahead and eight-steps-ahead densities of y_{1,t} and y_{2,t}, respectively, which show clear asymmetries, although more pronounced in the former. After generating a time series of size T = 100, the VAR(2) parameters are estimated by Least Squares (LS). Panel (b) of Figures 1 and 2 plots kernel estimates of the joint density obtained as usual after assuming that the prediction errors are jointly Gaussian

2 The same argument can be applied when the interest lies in only a subset of the components of Y_t. For example, if the focus is on the first J components of Y_t, the prediction ellipsoid is given by \{ Y_{T+h} \mid [Y_{T+h} - \hat{Y}_{T+h|T}]' C' [C \hat{\Sigma}_{\hat{Y}}(h) C']^{-1} C [Y_{T+h} - \hat{Y}_{T+h|T}] < \chi^2_\alpha(J) \}, where C = [I_J 0].
with zero mean and covariance matrix given by (3).3 Comparing panels (a) and (b), it is obvious that the Gaussian approach fails to capture the asymmetry of the error distribution. It is usual in practice to construct prediction regions for y_{1,t} and y_{2,t}. From the joint Gaussian density plotted in panel (b) of Figure 1, it is possible to obtain the corresponding 95% one-step-ahead ellipsoids and Bonferroni regions. They are shown in Figure 3 together with a realization of Y_{T+1}. We can observe that the shape of both regions is not appropriate to construct a satisfactory prediction region for Y_{T+1}. Finally, as we may also be interested in forecasting only one variable in the system, Figure 4 displays the true marginal one-step-ahead density of y_{1,t} together with the Gaussian approximation. Once more, it is clear that the Gaussian approach fails to capture the skewness of the prediction density.

Consider first the problem of incorporating the parameter uncertainty. As pointed out above, the MSE in (3) underestimates the true prediction uncertainty and, consequently, can be inappropriate in small sample sizes. Granted that a good estimator is used, the importance of taking into account the parameter uncertainty could be small in systems consisting of few variables; see Riise and Tjostheim (1984). However, in empirical applications we often find VAR(p) models fitted to large systems; see, for example, Simkins (1995) for a VAR(6) model for a system of 5 macroeconomic variables, Waggoner and Zha (1999), who fit a VAR(13) model to a system of 6 macroeconomic variables, Chow and Choy (2006), who fit a VAR(5) model to a system of 5 variables related to the global electronics industry, Gómez and Guerrero (2006) for a VAR(3) model fitted to a system of 6 macroeconomic variables, and Chevillon (2009) for a VAR(2) model for a system of 4 macroeconomic variables, just to mention a few empirical applications. Additionally, as these examples show, when dealing with real systems of time series, their adequate representation often requires a rather large order p. If the number of variables in the system and/or the number of lags of the model is relatively large, the estimation precision in finite samples could be rather low and predictions based on VAR(p) models with estimated parameters may suffer severely from the uncertainty in the parameter estimation. In these cases, it is important to construct prediction intervals and regions that take this uncertainty into account; see, for instance, Schmidt (1977), Lütkepohl (1991), West (1996), West and McCracken (1998) and Sims and Zha (1998, 1999) for existing evidence on the importance of taking into account parameter uncertainty in unconditional forecasts, and Waggoner and Zha (1999) for the same result in conditional forecasts.

3 The smoothed density is obtained by applying a Gaussian kernel density estimator with a diagonal bandwidth matrix with elements given by the Gaussian "rule of thumb".
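For readers who want to reproduce an illustration like the one above, model (8) with \chi^2(4) errors can be simulated along the following lines. This is a sketch under our own assumptions: the paper does not specify the burn-in length, the seed, or how the correlated \chi^2(4) innovations are generated, so the Cholesky mixing of standardized chi-square draws below is one possible choice.

```python
import numpy as np

rng = np.random.default_rng(0)
T, burn = 100, 50
phi1 = np.array([[-0.9, 0.0], [0.4, 0.0]])
phi2 = np.array([[-0.5, -0.7], [0.8, -0.1]])

# Standardized chi-square(4) innovations (mean 0, variance 1), mixed by the
# Cholesky factor of Sigma_a so that vec(Sigma_a) = (1, 0.8, 0.8, 1)'.
L = np.linalg.cholesky(np.array([[1.0, 0.8], [0.8, 1.0]]))
eps = (rng.chisquare(4, size=(T + burn, 2)) - 4.0) / np.sqrt(8.0)
a = eps @ L.T

y = np.zeros((T + burn, 2))
for t in range(2, T + burn):
    y[t] = phi1 @ y[t - 1] + phi2 @ y[t - 2] + a[t]
y = y[burn:]          # the simulated series of size T = 100
```

The VAR(2) parameters can then be estimated by LS on `y` and the densities in Figures 1, 2 and 4 approximated by kernel estimation over replicates.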
To incorporate the parameter uncertainty, Lütkepohl (1991) suggests approximating the sample distribution of the estimator by its asymptotic distribution.4 In this case, the MSE of \hat{Y}_{T+h|T} that incorporates the parameter uncertainty can be approximated by

\hat{\Sigma}^l_{\hat{Y}}(h) = \hat{\Sigma}_{\hat{Y}}(h) + \frac{1}{T} \hat{\Omega}(h),    (9)

where

\hat{\Omega}(h) = \sum_{i=0}^{h-1} \sum_{j=0}^{h-1} \mathrm{tr}\left[ (\hat{B}')^{h-1-i} \hat{\Upsilon}^{-1} \hat{B}^{h-1-j} \hat{\Upsilon} \right] \hat{\Psi}_i \hat{\Sigma}_a \hat{\Psi}_j',    (10)
with \hat{\Upsilon} = \frac{ZZ'}{T}, and \hat{B} is the following (Np+1)×(Np+1) matrix

\hat{B} = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 0 \\
\hat{\mu} & \hat{\Phi}_1 & \hat{\Phi}_2 & \cdots & \hat{\Phi}_{p-1} & \hat{\Phi}_p \\
0 & I_N & 0 & \cdots & 0 & 0 \\
\vdots & & \ddots & & & \vdots \\
0 & 0 & 0 & \cdots & I_N & 0
\end{pmatrix}.
In order to assess the effect of the parameter uncertainty on the MSE given by (9), consider the case of one-step-ahead predictions, i.e. when h = 1. In this situation, \hat{\Omega}(1) = (Np+1)\hat{\Sigma}_a, and \hat{\Sigma}^l_{\hat{Y}}(1) can be approximated by

\hat{\Sigma}^l_{\hat{Y}}(1) = \frac{T + Np + 1}{T} \hat{\Sigma}_a.

This expression shows that the contribution of the parameter uncertainty to the one-step-ahead MSE matrix depends on the dimension of the system, N, the VAR order, p, and the sample size, T. If N, p, or both are large relative to the sample size T, the effect of parameter uncertainty can be substantial. Obviously, as the sample size gets larger, \lim_{T\to\infty} \frac{T+Np+1}{T} = 1 and the parameter uncertainty contribution to the MSE in (9) vanishes.
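To put the magnitude of this correction in perspective, the factor (T + Np + 1)/T can be evaluated for systems of the dimensions cited above; the sample sizes below are hypothetical, chosen only for illustration.

```python
def inflation_factor(N, p, T):
    """One-step-ahead MSE inflation factor (T + N*p + 1) / T from eq. (9) with h = 1."""
    return (T + N * p + 1) / T

# Hypothetical sample sizes for systems of the dimensions mentioned in the text:
print(inflation_factor(2, 2, 100))    # small bivariate VAR(2): 1.05
print(inflation_factor(5, 6, 120))    # a VAR(6) with 5 variables: about 1.26
print(inflation_factor(6, 13, 160))   # a VAR(13) with 6 variables: about 1.49
```

Even with a moderate sample, a large system can inflate the one-step-ahead MSE by almost 50%, which is why ignoring parameter uncertainty leads to undercoverage.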
Once the MSE is computed as in (9), the corresponding prediction intervals, ellipsoids and cubes are constructed using the Gaussianity assumption as follows

AI_{T+h} = \{ y_{n,T+h} \mid y_{n,T+h} \in \hat{y}_{n,T+h|T} \pm z_{\alpha/2} \hat{\sigma}^l_{n,h} \},    (11)

AE_{T+h} = \{ Y_{T+h} \mid [Y_{T+h} - \hat{Y}_{T+h|T}]' \hat{\Sigma}^l_{\hat{Y}}(h)^{-1} [Y_{T+h} - \hat{Y}_{T+h|T}] < \chi^2_\alpha(N) \},    (12)

AC_{T+h} = \{ Y_{T+h} \mid Y_{T+h} \in \cup_{n=1}^{N} \hat{y}_{n,T+h|T} \pm z_\tau \hat{\sigma}^l_{n,h} \},    (13)

where \hat{\sigma}^l_{n,h} is the square root of the nth diagonal element of \hat{\Sigma}^l_{\hat{Y}}(h).

4 Alternatively, some authors propose using Bayesian methods, which can be rather complicated from a computational point of view; see, for example, Simkins (1995) and Waggoner and Zha (1999), who need the Gaussianity assumption to derive the likelihood and posterior distribution.
Panel (c) of Figure 1, which plots the density of y_{1,T+1} and y_{2,T+1} for the same example as above, constructed assuming that the forecast errors are Gaussian with zero mean and covariance matrix given by (9), shows that this density is not very different from that plotted in panel (b) and, obviously, is not able to capture the asymmetries of the error distribution. Similarly, the joint density of y_{1,T+8} and y_{2,T+8} in panel (c) of Figure 2 does not look different from that in panel (b). The similarity between the standard and the asymptotic densities is even clearer in Figure 3, which plots the elliptical and Bonferroni regions constructed using (12) and (13), respectively. As we can observe, they are slightly larger than the standard regions but still located very close to them. They cannot cope with the lack of symmetry of the prediction error distribution. This similarity could be expected, as we are estimating 8 parameters with T = 100. Similar comments apply to Figure 4, where we can observe that the asymptotic marginal density for the first component of the system, y_{1,T+1}, only differs from the standard density in its variability, which is slightly larger in the former. Note that the asymptotic approximation of the distribution of the LS estimator can be inadequate in small samples depending on the number of parameters to be estimated and the true distribution of the innovations.
2.2 Bootstrap procedures for prediction intervals and regions
To overcome the limitations of the Gaussian densities described before, Kim (1999, 2001, 2004) and Grigoletto (2005) propose using bootstrap procedures, which incorporate the parameter uncertainty even when the sample size is small and do not rely on the Gaussianity assumption. In order to take into account the conditionality of VAR forecasts on past observations, Kim (1999) proposes obtaining bootstrap replicates of the series based on the following backward recursion

Y^*_t = \hat{\omega} + \hat{\Lambda}_1 Y^*_{t+1} + ... + \hat{\Lambda}_p Y^*_{t+p} + \hat{\upsilon}^*_t,    (14)

where Y^*_{T-i} = Y_{T-i} for i = 0, 1, ..., p-1 are p starting values which coincide with the last values of the original series, \hat{\omega}, \hat{\Lambda}_1, ..., \hat{\Lambda}_p are LS estimates of the parameters of the backward representation, and \hat{\upsilon}^*_t are obtained by resampling from the empirical distribution function of the
centered and rescaled backward residuals. Then, bootstrap LS estimates of the parameters of the forward representation are obtained by estimating the VAR(p) model in (1) using \{Y^*_1, ..., Y^*_T\}. Denote these estimates by \hat{B}^* = (\hat{\mu}^*, \hat{\Phi}^*_1, ..., \hat{\Phi}^*_p). The bootstrap forecast for period T + h is then given by

\hat{Y}^*_{T+h|T} = \hat{\mu}^* + \hat{\Phi}^*_1 \hat{Y}^*_{T+h-1|T} + ... + \hat{\Phi}^*_p \hat{Y}^*_{T+h-p|T} + \hat{a}^*_{T+h},    (15)

where \hat{a}^*_{T+h} are random draws from the empirical distribution function of the centered and rescaled forward residuals. Having obtained R bootstrap replicates of \hat{Y}^*_{T+h|T}, Kim (2001) defines the bootstrap (1-α)100% prediction interval for the nth variable in the system as follows

KI_{T+h} = \{ y_{n,T+h} \mid y_{n,T+h} \in [ q^*_K(\alpha/2), q^*_K(1 - \alpha/2) ] \},    (16)
where q^*_K(\gamma) is the empirical γ-quantile of the bootstrap distribution of the nth component of \hat{Y}^*_{T+h|T}, approximated by G^*_{n,K}(x) = \#(\hat{y}^*_{n,T+h|T} < x)/R. Similarly, Kim (1999) proposes constructing bootstrap prediction ellipsoids with probability content (1-α)100%, given by

KE_{T+h} = \{ Y_{T+h} \mid [Y_{T+h} - \bar{Y}^*_{T+h|T}]' S^K_{\hat{Y}^*}(h)^{-1} [Y_{T+h} - \bar{Y}^*_{T+h|T}] < Q^*_K \},    (17)

where \bar{Y}^*_{T+h|T} is the sample mean of the R bootstrap replicates \hat{Y}^{*(r)}_{T+h|T} and S^K_{\hat{Y}^*}(h) is the corresponding sample covariance.5 The quantity Q^*_K in (17) is the (1-α)100% percentile of the bootstrap distribution of the following quadratic form

[\hat{Y}^*_{T+h|T} - \bar{Y}^*_{T+h|T}]' S^K_{\hat{Y}^*}(h)^{-1} [\hat{Y}^*_{T+h|T} - \bar{Y}^*_{T+h|T}].    (18)

Furthermore, Kim (1999) proposes using the Bonferroni approximation to obtain prediction cubes with nominal coverage of at least (1-α)100%, which are given by

KC_{T+h} = \{ y_{n,T+h} \mid y_{n,T+h} \in \cup_{n=1}^{N} [ q^*_K(\tau/2), q^*_K(1 - \tau/2) ] \},    (19)

where τ = α/N.6

5 Kim (1999) does not explicitly show how S^K_{\hat{Y}^*}(h) should be defined. Alternatively, one can obtain S_{\hat{Y}^*}(h) by substituting the parameters in (9) by their bootstrap estimates and computing the average through all bootstrap replicates. Calculating it with the sample covariance or by substituting the bootstrap parameters in the corresponding expressions yields similar results.

6 Actually, what Kim (1999) defines as KC is slightly different from (19), as he uses the percentile and percentile-t methods of Hall (1992). Here we prefer to use the Bonferroni prediction regions in (19) because they are better suited to deal with potential asymmetries of the error distribution; see Hall (1992).
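Given the R bootstrap replicates, the interval (16), the cube (19) and the ellipsoid ingredients in (17)-(18) reduce to simple quantile and moment computations. A schematic NumPy implementation (the function name is ours, and using the replicate sample covariance for S^K is one of the two options mentioned in footnote 5):

```python
import numpy as np

def bootstrap_regions(reps, alpha=0.05):
    """Given an (R, N) array of bootstrap replicates of Y_{T+h}, return the
    Bonferroni cube bounds of eq. (19) and the ellipsoid pieces of eqs. (17)-(18):
    the replicate mean, the sample covariance and the (1-alpha) quantile Q* of
    the Mahalanobis-type quadratic form of the replicates."""
    R, N = reps.shape
    tau = alpha / N
    cube = np.quantile(reps, [tau / 2, 1 - tau / 2], axis=0)   # 2 x N bounds
    mean = reps.mean(axis=0)
    cov = np.cov(reps, rowvar=False)
    dev = reps - mean
    q = np.einsum('ri,ij,rj->r', dev, np.linalg.inv(cov), dev)
    Q = np.quantile(q, 1 - alpha)
    return cube, mean, cov, Q
```

The cube adapts to asymmetric replicate distributions through the marginal quantiles, while the ellipsoid only uses the first two moments, which is precisely why it fails for non-Gaussian errors.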
The bootstrap procedure just described is illustrated by considering again the same time series of size T = 100 simulated from model (8). Panel (d) of Figures 1 and 2 plots the joint bootstrap density of y_{1,T+1} and y_{2,T+1} and of y_{1,T+8} and y_{2,T+8}, respectively, based on R = 2999 bootstrap replicates. When comparing these densities with their Gaussian counterparts, it is clear that the bootstrap can reproduce the asymmetry and is much closer to the true density plotted in panel (a) of the same figures. Figure 3 plots the corresponding ellipsoids and cubes defined in (17) and (19). First of all, note that the bootstrap ellipsoid is only slightly larger than the two Gaussian ellipsoids and is not adequate to represent the shape of the realization of y_{1,T+1} and y_{2,T+1} plotted in Figure 3. This is due to the fact that the ellipsoid in (17) is still based on a Gaussian assumption and only differs from the ellipsoids described in the previous subsection in the way the MSE is computed. However, Figure 3 clearly illustrates that when dealing with non-Gaussian prediction errors, the prediction regions constructed from the bootstrap joint densities cannot be based on ellipsoids. An alternative is to use the High Density Regions (HDR) proposed by Hyndman (1996), which have also been plotted in Figure 3. These regions are based on kernel estimates of the joint densities such as those plotted in Figure 1. Although the shape of these regions seems more adequate, they are infeasible when the dimension of the system is large as, in this case, there are no satisfactory kernel estimators of the bootstrap densities. On the other hand, when the prediction regions are constructed using the Bonferroni approximation, Figure 3 shows that the bootstrap cube is located towards the northeast, so it is more adequate than the ellipsoid. This in fact reflects that the quantiles of the marginal densities used in (19) can cope with the asymmetries, while the ellipsoids use a quadratic form based on the wrong Gaussianity assumption. Finally, Figure 4 displays the marginal one-step-ahead kernel density of y_{1,T+1}. We observe that it is clearly closer to the true density than its Gaussian counterparts.

Kim (1999) justifies the use of his bootstrap procedure in small samples by suggesting that the asymptotic results of Thombs and Schucany (1990) can be extended to a multivariate framework. However, the asymptotic validity of the bootstrap procedures based on the backward representation relies on the assumption of Gaussian innovations; see Kim (2001). This is due to the fact that the serial independence of the backward errors is needed, and this can only be guaranteed under the assumption of Gaussian disturbances. Note that, alternatively, one could use the relationship between the forward and backward residuals and resample from the former to obtain the latter. However, obtaining the backward representation can be very complicated in VAR(p) models with
large order; see Kim (1997, 1998) for the expression of the backward representation.7
3 A new bootstrap procedure
In this section, we propose an extension of the bootstrap procedure proposed by Pascual et al. (2004a) for univariate ARIMA models to obtain the joint prediction density of Y_{T+h} in VAR(p) models. The new bootstrap procedure avoids the backward representation by resampling without fixing the observations of the available sample when incorporating the parameter uncertainty. It is important to note that although the predictions are conditional on the available time series, the sample distribution of the parameter estimator is defined as the distribution across different replicates; see Harvey (1989) and Lütkepohl (1991), who argue that the distribution of the predictions based on estimated parameters is obtained as if the sample used for prediction were different from the sample used for estimation. Therefore, using the backward representation is not theoretically necessary and only adds complexity to the bootstrap procedure without any advantage; see Pascual et al. (2004a) for the same arguments in univariate ARIMA models and Rodríguez and Ruiz (2009) for univariate state space models. By avoiding the backward representation, the new procedure is simpler from a computational point of view, can be implemented in models without such a representation, and its asymptotic validity can be established without assuming Gaussian errors. In this section, we describe the proposed procedure and prove its asymptotic validity for estimating the joint density of future values Y_{T+h}. We also carry out Monte Carlo experiments to analyse the performance of prediction intervals constructed from the corresponding marginal bootstrap densities, and of ellipsoids and cubes constructed from the joint bootstrap density.
3.1 Description of the bootstrap procedure
The new procedure proposed in this paper to obtain the bootstrap prediction density of Y_{T+h} is similar to that proposed by Kim (1999), but avoids the backward representation in (14). The algorithm to obtain the bootstrap replicates of Y_{T+h} is the following.

Step 1. Estimate by LS the parameters of model (1) and obtain the corresponding vector of residuals defined as in (4). Center and scale the residuals using the factor [(T-p)/(T-2p)]^{0.5} recommended by Stine (1987). Denote by \hat{F}_a the empirical distribution function of the centered and rescaled residuals, \hat{F}_a(x) = \frac{1}{T}\sum_{t=1}^{T} 1(\hat{a}_t < x), where 1(·) is an indicator variable which takes value 1 if the argument is true.

Step 2. From a set of p initial values, say Y^*_0 = \{Y^*_{-p+1}, ..., Y^*_0\}, construct a bootstrap series \{Y^*_1, ..., Y^*_T\} as follows

Y^*_t = \hat{\mu} + \hat{\Phi}_1 Y^*_{t-1} + ... + \hat{\Phi}_p Y^*_{t-p} + \hat{a}^*_t, \quad t = 1, ..., T,    (20)

where \hat{a}^*_t are independent draws from \hat{F}_a. Obtain \hat{B}^* = (\hat{\mu}^*, \hat{\Phi}^*_1, ..., \hat{\Phi}^*_p), a bootstrap replicate of the LS estimates, by fitting model (1) to the bootstrap replicate \{Y^*_1, ..., Y^*_T\}.

Step 3. Forecast using model (1) with the parameters substituted by their bootstrap estimates and fixing the last p observations of the original series, as follows

\hat{Y}^*_{T+h|T} = \hat{\mu}^* + \hat{\Phi}^*_1 \hat{Y}^*_{T+h-1|T} + ... + \hat{\Phi}^*_p \hat{Y}^*_{T+h-p|T} + \hat{a}^*_{T+h},    (21)

with \hat{a}^*_{T+h} being a random draw from \hat{F}_a and \hat{Y}^*_{T+h} = Y_{T+h}, h ≤ 0.

Step 4. Repeat steps 1 to 3 R times.

7 For the simpler expression of the backward representation, in which the lagged values of the variables in (1) are substituted by forward values, Tong and Zhang (2005) and Chan et al. (2006) show that a necessary condition for the VAR(p) model to have this backward representation is that the covariance matrices Υ(h) = E[(Y_t - E(Y_t))(Y_{t-h} - E(Y_t))'] are symmetric for all h. This is a very strong restriction, not likely to be satisfied in real data systems.
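Steps 1 to 4 can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: in particular, using the first p observations of the original series as the initial values in Step 2 is our simplification, since the paper leaves the choice of Y^*_0 open.

```python
import numpy as np

def var_ls(y, p):
    """LS estimation of a VAR(p) with intercept: returns (mu, [Phi_1..Phi_p], residuals)."""
    T, N = y.shape
    Z = np.hstack([np.ones((T - p, 1))] +
                  [y[p - i:T - i] for i in range(1, p + 1)])
    Y = y[p:]
    beta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    mu = beta[0]
    phis = [beta[1 + (i - 1) * N: 1 + i * N].T for i in range(1, p + 1)]
    return mu, phis, Y - Z @ beta

def forward_bootstrap(y, p, h, R, rng):
    """Steps 1-4 of the proposed procedure: resample forward only."""
    T, N = y.shape
    mu, phis, resid = var_ls(y, p)
    scale = np.sqrt((T - p) / (T - 2 * p))           # Stine (1987) rescaling
    e = scale * (resid - resid.mean(axis=0))         # centered, rescaled residuals
    reps = np.empty((R, N))
    for r in range(R):
        # Step 2: bootstrap series from the forward recursion (20)
        ylist = list(y[:p])                          # our choice of initial values
        draws = e[rng.integers(0, len(e), size=T - p)]
        for t in range(T - p):
            ylist.append(mu + sum(phis[i] @ ylist[-1 - i] for i in range(p)) + draws[t])
        mu_s, phis_s, _ = var_ls(np.array(ylist), p)
        # Step 3: forecast with bootstrap estimates, conditioning on the last
        # p observations of the ORIGINAL series
        hist = list(y[-p:])
        for _ in range(h):
            hist.append(mu_s + sum(phis_s[i] @ hist[-1 - i] for i in range(p))
                        + e[rng.integers(0, len(e))])
        reps[r] = hist[-1]
    return reps                                      # R bootstrap replicates of Y_{T+h}
```

Note that no backward recursion appears anywhere: parameter uncertainty enters only through re-estimation on each forward-generated bootstrap series.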
Using this procedure, we obtain R bootstrap replicates of Y_{T+h}, denoted by \{\hat{Y}^{*(1)}_{T+h|T}, ..., \hat{Y}^{*(R)}_{T+h|T}\}, and their corresponding bootstrap distribution, which can be used to delimit prediction intervals, ellipsoids and cubes with appropriate probability content just as before. For instance, if we are interested in the nth component of Y_{T+h}, we can approximate the bootstrap density of the future value by using \{\hat{y}^{*(1)}_{n,T+h}, \hat{y}^{*(2)}_{n,T+h}, ..., \hat{y}^{*(R)}_{n,T+h}\}, so that a (1-α)100% bootstrap prediction interval for the nth variable is given by

BI_{T+h} = \{ y_{n,T+h} \mid y_{n,T+h} \in [ q^*_B(\tau), q^*_B(1 - \tau) ] \},    (22)

where q^*_B(\tau) = G^{*-1}_{n,B}(\tau) is the τth percentile of G^*_{n,B}(x) = \#(\hat{y}^{*(b)}_{n,T+h} \le x)/R. Using similar arguments, we can obtain the following prediction ellipsoids and cubes
BET +h =
h i0 h i −1 ∗ ∗ b YT +h | YT +h − YbT∗+h SYB (h) Y − Y < Q T +h B b∗ T +h
n h τ τ io ∗ ∗ BCT +h = yn,T +h |yn,T +h ∈ ∪N q , q 1 − n=1 B B 2 2
12
(23)
(24)
∗(1)
∗(2)
∗(R)
respectively, where τ = α/N, Ȳ*_{T+h|T} is the mean of {Ŷ*(1)_{T+h|T}, Ŷ*(2)_{T+h|T}, ..., Ŷ*(R)_{T+h|T}} and Q*_B is obtained as in (18). To illustrate the implementation of the new bootstrap procedure proposed in this paper, we consider again the simulated bivariate time series previously described. Panel (e) of Figure 1 displays the kernel estimate of the bootstrap joint density of y_{1,T+1} and y_{2,T+1}. It is clearly a better approximation of the true density than the Gaussian procedures and similar to the bootstrap procedure proposed by Kim (1999). This is also evident from panel (e) of Figure 2, which resembles panel (a) more closely than the densities plotted in the other panels. Nothing seems to be lost by not using the backward representation. Figure 3 plots the bootstrap cube and elliptical regions obtained from the new bootstrap density. Although the bootstrap densities are very different from the densities obtained by the standard and asymptotic approximations, we cannot observe a big difference in location between the new bootstrap ellipsoid and the ellipsoids obtained with the alternative procedures. This is due to the fact that the estimates of the first two moments involved in the definition of the ellipsoids do not differ significantly across procedures; i.e., all of them estimate similar centers and dispersions of the future value. For instance, the center of the ellipsoid in Figure 3 is (−2.21, −0.27) for the Gaussian alternatives and (−2.21, −0.26) and (−2.19, −0.26) for Kim's and the new bootstrap approaches, respectively, while the one-step-ahead MSE estimates are given by

Σ̂_Y(1) = [0.98 0.79; 0.79 0.98],   Σ̃_Y(1) = [1.03 0.83; 0.83 1.00],

S^K_{Ŷ*}(h) = [1.04 0.84; 0.84 1.09]   and   S^B_{Ŷ*}(h) = [1.03 0.90; 0.90 1.08].
However, we do observe different locations for the bootstrap cubes relative to the Gaussian cubes, reflecting the ability of the former to adapt to the asymmetry. Finally, Figure 4 displays the one-step-ahead kernel estimate of the density of y_{1,T+1} obtained by the new bootstrap which, like Kim's, is much closer to the true density than the Gaussian alternatives. All in all, this example suggests that both bootstrap densities resemble the true density better than the traditional procedures. Nevertheless, our bootstrap procedure is much simpler than that of Kim (1999) and its asymptotic validity can be proven without assuming Gaussianity, as shown in the next subsection.
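As an illustration of how the regions in (23) and (24) can be computed from a set of bootstrap replicates, the following Python sketch constructs the Bonferroni cube and the ellipsoid. The function and variable names are ours, not the paper's notation, and the sketch assumes the R bootstrap replicates are already available as an R × N array.

```python
import numpy as np

def bootstrap_regions(Y_star, alpha=0.05):
    """Prediction cube and ellipsoid from R bootstrap replicates (R x N array).

    Illustrative sketch: the cube uses per-component percentile intervals with
    the Bonferroni correction tau = alpha/N; the ellipsoid is centered at the
    bootstrap mean, scaled by the bootstrap MSE matrix, with threshold Q*_B
    taken as the (1 - alpha) quantile of the bootstrap Mahalanobis distances.
    """
    R, N = Y_star.shape
    tau = alpha / N
    # Cube: marginal intervals [q*(tau/2), q*(1 - tau/2)] for each component
    lower = np.percentile(Y_star, 100 * tau / 2, axis=0)
    upper = np.percentile(Y_star, 100 * (1 - tau / 2), axis=0)
    # Ellipsoid: Mahalanobis distance relative to the bootstrap moments
    center = Y_star.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(Y_star, rowvar=False))
    d2 = np.einsum('ij,jk,ik->i', Y_star - center, S_inv, Y_star - center)
    Q = np.quantile(d2, 1 - alpha)
    return (lower, upper), (center, S_inv, Q)

def in_cube(y, cube):
    lower, upper = cube
    return bool(np.all((y >= lower) & (y <= upper)))

def in_ellipsoid(y, ellipsoid):
    center, S_inv, Q = ellipsoid
    return float((y - center) @ S_inv @ (y - center)) < Q
```

A future value y is then covered by the cube if every component lies in its marginal interval, and by the ellipsoid if its Mahalanobis distance falls below Q*_B.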
3.2
Asymptotic validity
Consider again the stationary VAR(p) model in (1) and assume that there exist p presample values Y_{−p+1}, Y_{−p+2}, ..., Y_0. The errors are given by
a_t(B) = Y_t − Φ_1 Y_{t−1} − ... − Φ_p Y_{t−p},   t = 1, ..., T   (25)
where we make explicit that the errors depend on the unknown parameters contained in B. Denote by F the distribution of a_t. Assume that we estimate B by a √T-consistent estimator denoted by B̂. The estimated residuals are then given by
a_t(B̂) = Y_t − Φ̂_1 Y_{t−1} − ... − Φ̂_p Y_{t−p},   t = 1, ..., T,   (26)
which, after being centered, have F(B̂) as distribution function.8 The asymptotic validity of the bootstrap depends to a large extent on how well F(B̂) approximates F as the sample size T increases. In particular, Theorem 2.4 of Paparoditis (1996) states that d_2(F(B̂), F(B)) → 0 in probability as T → ∞, where d_2 is a Mallows metric.9

In this paper we consider the LS estimator b̂ = vec(B̂), where vec is the column stacking operator. The second part of the asymptotic validity relies on the approximation of ||b̂ − b|| given by ||b̂* − b̂||. Define a sequence {I(p)} of pN²×1 vectors of constants such that 0 < K_1 ≤ ||I(p)||_2 ≤ K_2 < ∞ and let s_T = √T I(p)′(b̂ − b). Theorem 3 of Lewis and Reinsel (1985, p.399) states that s_T = √T I(p)′(b̂ − b) →d N(0, I(p)′(Υ⁻¹ ⊗ Σ_a)I(p)), where Υ = plim Z′Z/T. The bootstrap counterpart of s_T is given by s*_T = √T I(p)′(b̂* − b̂). Denote the laws of s_T and s*_T by ℓ and ℓ*, respectively. Theorem 3.2 of Paparoditis (1996, p.284) establishes that d_2(ℓ, ℓ*) → 0 in probability as T → ∞, where ℓ* is conditional on a given sample Y. As convergence in the Mallows metric implies convergence of the corresponding random variables, then s*_T →d N(0, I(p)′(Υ⁻¹ ⊗ Σ_a)I(p)). Since s_T and s*_T converge weakly to the same Gaussian distribution in probability, the bootstrap validity is established. Specifically, Theorem 3.3 of Paparoditis (1996, p.285) sets the asymptotic validity of the bootstrap for the Yule-Walker estimator of B, but it remains valid for any estimator which satisfies

8 The asymptotic validity is established for centered residuals but, according to Stine (1987), it remains valid if they are also rescaled.
9 Assume that X and Y are random variables in R^N with distributions G_X and G_Y. The Mallows distance between G_X and G_Y is defined as d_p(G_X, G_Y) = inf{E||X − Y||^p}^{1/p}, where the infimum is taken over all joint probability distributions for (X, Y) such that the marginal distributions of X and Y are G_X and G_Y, respectively. In what follows we set p = 2.
||B̂ − B|| = O_p(p^{1/2}/T^{1/2}). Consequently, it is valid for the LS estimator considered in this paper.10 Finally, the asymptotic validity of the bootstrap approximation of the distribution of a future value focuses on Ŷ*_{T+h|T}, which is given by

Ŷ*_{T+h|T} = Φ̂*_0 + Φ̂*_1 Ŷ*_{T+h−1|T} + ... + Φ̂*_p Ŷ*_{T+h−p|T} + â*_{T+h}   (27)
Theorem. Let {Y_t, t = −p+1, ..., 1, 2, ..., T} be a realization of a stationary VAR(p) process {Y_t} with E(a_t) = 0 and E|a_it a_jt a_lt a_rt| < ∞ for 1 ≤ i, j, l, r ≤ N, B̂ the LS estimator of B and Ŷ*_{T+h|T} obtained by following steps 1 to 4 in the previous subsection. Then, Ŷ*_{T+h|T} conditional on {Y_t, t = −p+1, ..., 1, 2, ..., T} converges weakly in probability to Y_{T+h} as T → ∞.

Proof. Following the arguments in Pascual et al. (2004a), let us first consider the one-step-ahead bootstrap future value given by
Ŷ*_{T+1|T} = Φ̂*_0 + Φ̂*_1 Y_T + ... + Φ̂*_p Y_{T−p+1} + â*_{T+1}   (28)
For h = 2 we have

Ŷ*_{T+2|T} = Φ̂*_0 + Φ̂*_1 Ŷ*_{T+1|T} + ... + Φ̂*_p Y_{T−p+2} + â*_{T+2}   (29)

and, replacing Ŷ*_{T+1|T} by (28), it follows that

Ŷ*_{T+2|T} = (I + Φ̂*_1)Φ̂*_0 + N_1(B̂*)Y_T + ... + N_p(B̂*)Y_{T−p+1} + M_1(B̂*)â*_{T+1} + â*_{T+2}   (30)
which expresses the bootstrap future values as a function of the given realization {Y_{T−p+1}, ..., Y_T}, the independent random draws â*_{T+h} and continuous functions of the bootstrap parameter estimates B̂*. Proceeding in this way we obtain the following expression

Y*_{T+h|T} = N_0(B̂*) + N_1(B̂*)Y_T + ... + N_p(B̂*)Y_{T−p+1} + M_1(B̂*)â*_{T+1} + M_2(B̂*)â*_{T+2} + ... + â*_{T+h}   (31)

where N_0 maps B̂* to N×1 vectors and N_j, j = 1, ..., p, and M_i, i = 1, ..., h − 1, map B̂* to N×N matrices. The N_j's and M_i's are continuous functions, different for each forecast horizon. Note that N_0(B̂*) and N_j(B̂*)Y_{T−j+1} are just continuous functions of B̂*, and Y_{T−j+1} is a

10 The VAR process may have an infinite order. In particular, the requirement for the validity of the bootstrap estimator is that p/T → 0 as T → ∞, which is fulfilled in the case of a finite order VAR process.
given value, so that both quantities converge weakly in probability to N_0(B) and N_j(B)Y_{T−j+1}, respectively. Furthermore, M_i(B̂*) converges weakly in probability to M_i(B) and â*_{T+j} converges weakly to a_{T+j} in probability; therefore, by applying the bootstrap version of Slutsky's Theorem, we obtain that M_i(B̂*)â*_{T+j} →d M_i(B)a_{T+j}. Finally, observe that the â*_{T+j} are independent and thus all terms in (31) converge weakly in probability. Consequently, Y*_{T+h|T} →d Y_{T+h} as T goes to infinity, in probability.
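The forward recursion in (27)-(28), on which the proof is based, can be sketched in Python as follows. This is an illustrative sketch with our own function and variable names; for brevity it keeps the parameter estimates fixed across replicates, whereas the full procedure also redraws Φ̂* from bootstrap re-estimations.

```python
import numpy as np

def bootstrap_forecast_paths(Y, Phi0, Phis, resid, h, R=999, seed=0):
    """Forward-only bootstrap of a fitted VAR(p): iterate
    Y*_{T+k} = Phi0 + sum_j Phi_j Y*_{T+k-j} + a*_{T+k},
    conditioning on the last p observed values, with a*_{T+k} drawn with
    replacement from the centered residuals (no backward representation).
    """
    rng = np.random.default_rng(seed)
    p, N = len(Phis), Y.shape[1]
    centered = resid - resid.mean(axis=0)   # centered residuals to resample
    paths = np.empty((R, h, N))
    for b in range(R):
        hist = list(Y[-p:])                 # last p observed vectors
        for k in range(h):
            a_star = centered[rng.integers(len(centered))]
            # hist[-1 - j] is the lag-(j+1) value, observed or bootstrapped
            y_next = Phi0 + sum(Phis[j] @ hist[-1 - j] for j in range(p)) + a_star
            hist.append(y_next)
            paths[b, k] = y_next
    return paths                            # R x h x N bootstrap future values
```

The R paths returned for each horizon are exactly the replicates from which the marginal intervals, ellipsoids and cubes of the previous subsection are computed.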
3.3
Small sample properties
In this subsection, we carry out Monte Carlo experiments to analyse the finite sample properties of the marginal intervals, ellipsoids and cubes constructed by using the new bootstrap procedure proposed in this paper to obtain bootstrap densities. They are compared with those of the Gaussian approximations and the bootstrap procedure based on the backward representation. The DGP considered throughout the experiments is the following bivariate VAR(1) model

(y_{1,t}; y_{2,t}) = [−0.5 0; 0.5 0.5] (y_{1,t−1}; y_{2,t−1}) + (a_{1,t}; a_{2,t})   (32)

where a_t = (a_t^{(1)}, a_t^{(2)})′ is an independent white noise vector with contemporaneous covariance
matrix given by vec Σ_a = (1, 0.8, 0.8, 1)′. This model has been previously considered by Kim (2001).11 We examine three alternative distributions for a_t, namely, Gaussian, Student-5 and χ²(4). The last two distributions are chosen to capture fat-tailed and asymmetric distributions. In order to have the covariance matrix described above, the Student-5 and χ²(4) errors have been centered and rescaled. The number of Monte Carlo replicates is 1000 and the sample sizes are T = 25 and 100. We generate F = 3000 future values of the process Y_{T+h} using the assumed distribution with mean and MSE given by (2) and (3), respectively, which approximate the density of the future value conditional on {Y_{−p+1}, ..., Y_0, ..., Y_T}. The parameters are estimated by LS. Then, the 90%, 95% and 99% marginal prediction intervals for forecast horizons, h, from 1 to 8 are computed for each of the variables in the system by using i) the Gaussian prediction interval without incorporating the parameter uncertainty (GI) in equation (5), ii) the asymptotic interval (AI) in equation (11), iii) the KI in (16) and, iv) the new bootstrap procedure proposed in

11 The model is stationary with roots given by (−2, 2). Results for other alternative VAR(1) and VAR(2) models are similar to those reported in this paper. They are available from the authors upon request.
this paper as in equation (22). The same is done for the prediction regions, whether they are based on the elliptical assumption, such as GE in (6), AE in (12), KE in (17) and BE in (23), or the Bonferroni approximations GC in (7), AC in (13), KC in (19) and BC in (24). The number of bootstrap replicates is R = 4999. The number of future values inside the intervals and regions is then counted to obtain the empirical coverage of each procedure. We also compute the volume, which is given by i) the length of the individual prediction intervals, ii) V = [π^{0.5N}/Γ(1 + 0.5N)][χ²_{1−α}(N)]^{0.5N} {det[Σ_y(h)⁻¹]}^{−0.5}, where Γ(·) is the gamma function, for elliptical regions, and iii) the product of the lengths of the intervals jointly making up the Bonferroni approximation. Finally, for individual prediction intervals we calculate the coverage on the left and right sides of the empirical density of the future value. To compare the alternative prediction densities considered, we rely on an approximation of the Mallows distance based on the following factorization of a joint density
g(y_{1,T+h}, y_{2,T+h} | Y_T) = g_{1|2}(y_{1,T+h} | y_{2,T+h}, Y_T) g_2(y_{2,T+h} | Y_T).   (33)
In our setting, we consider the conditional densities of y_{1,T+h} given three different values of y_{2,T+h}, namely the 25th, 50th and 75th quantiles of the real density of y_{2,T+h}; these values are denoted by q_1, q_2 and q_3, respectively. To approximate a distribution characterized by a set of realizations {Ŷ^{(1)}_{T+h}, ..., Ŷ^{(R)}_{T+h}}, we proceed by considering the realizations in a neighborhood of these quantiles of ŷ_{2,T+h}. For instance, g_{1|2}(y_{1,T+h} | q_j, Y_T) is approximated by the set of realizations satisfying C_j = {Ŷ_{T+h} | ŷ_{2,T+h} ∈ [q_j − e, q_j + e]}.12 Call ŷ^C_{1,T+h} the R_{1|2} observations of Ŷ_{T+h} that fulfill that condition. Then the conditional distribution given q_j is G^C_{1|2}(x) = #(ŷ^C_{1,T+h} ≤ x)/R_{1|2}. The marginal distribution is obtained by considering all the realizations in the dimension y_{2,T+h}, just as before, G_2(x) = #(ŷ_{2,T+h} ≤ x)/R. Once we have the conditional and marginal distributions, we compute the Mallows distance from the differences among 100 quantiles of the distributions involved. Giving a mass 1/100 to each quantile, the Mallows distance between, for instance, the real distribution G_2 (or G_{1|2} in the case of the conditional distribution) and its bootstrap counterpart G*_2 (or G*_{1|2}) can be

12 Different values of e were tried. Although here we consider e = 0.1, values within the limits (0.025, 0.125) yield practically the same results. Obviously, for the choice of e scale considerations matter.
approximated as follows

d_2(G_2, G*_2) ≈ [ (1/100) Σ_{q=1}^{100} || y^{(q)}_{2,T+h} − y^{*(q)}_{2,T+h} ||² ]^{1/2}
where y^{(q)}_{2,T+h} and y^{*(q)}_{2,T+h} are the qth quantiles of the real and bootstrap marginal densities of the second component. We incorporate this distance calculation within the Monte Carlo simulation framework described above. We first describe the results obtained when the objective is the prediction of the marginal distribution of the first variable in the system, y_{1,T+h}. Table 1 reports the Monte Carlo means and standard deviations of the coverages and lengths of its one-step-ahead and eight-steps-ahead prediction intervals when the nominal coverage is 95%. Table 1 also reports the average of the coverages left out on the right and left of the intervals. When looking at the coverages, Table 1 shows that, regardless of the error distribution, when the sample size is small, T = 25, all intervals have coverages smaller than the nominal; Lütkepohl (1991) and Kim (1999) also observe undercoverage. In general, the coverage of the GI is the smallest, with the AI being slightly closer to the nominal than the bootstrap when the prediction horizon is 1. However, when predicting eight steps ahead into the future, the bootstrap intervals are superior to the GI and AI. On the other hand, when T = 100, the coverages of all intervals are similar irrespective of the distribution and horizon. When looking at the results reported on average lengths, we observe that they are similar for all procedures and distributions and slightly larger when the horizon is longer. Finally, Table 1 also reports the average coverage on the left and right of the prediction intervals, which shows that GI and AI are not able to cope with the asymmetry of the χ²(4) error distribution. These intervals leave most observations out on just one of their sides. Note that the gains of bootstrap intervals are clearer for longer horizons and when the errors are asymmetric.
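The quantile-based approximation of the Mallows distance just described can be sketched as follows. The code is an illustrative sketch with our own names; it matches 100 equally spaced quantiles of two univariate samples, as in the comparison above.

```python
import numpy as np

def mallows_d2(sample_true, sample_boot, n_q=100):
    """Approximate the Mallows (Wasserstein-2) distance between two
    univariate samples by matching n_q equally spaced quantiles, giving
    each quantile mass 1/n_q:  [ (1/n_q) sum_q |x_q - y_q|^2 ]^{1/2}.
    """
    qs = (np.arange(1, n_q + 1) - 0.5) / n_q   # n_q interior quantile levels
    x = np.quantile(sample_true, qs)
    y = np.quantile(sample_boot, qs)
    return np.sqrt(np.mean((x - y) ** 2))
```

For two samples differing only by a constant shift c, the quantiles shift by c and the approximation returns |c|, as expected of a Wasserstein-type distance.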
It is important to remark that although the bootstrap procedure proposed in this paper is much simpler than that proposed by Kim (1999), since it avoids the use of the backward representation, the performance of both prediction intervals is comparable. Therefore, these experiments illustrate that there is no price to pay for not using the backward representation. Consider now that the objective is to obtain joint prediction regions that contain the two variables in the system with a given nominal coverage. First, we focus on the performance of the prediction cubes constructed by using the Bonferroni approximation. Table 2 reports the corresponding empirical coverages and volumes when the nominal coverage is 90%. The results show that, regardless of the error distribution, when T = 25, the AC has empirical coverage
closer to the nominal than any other procedure. However, when h = 8 the bootstrap regions have better properties. On the other hand, when T = 100 all procedures tend to overestimate the nominal coverage, with the bootstrap having coverages closer to the nominal. With respect to volumes, both bootstrap procedures provide regions that are generally larger than the regions obtained by the standard methodology, as they incorporate the parameter uncertainty due to the estimation process. Furthermore, Table 2 shows that the volumes associated with non-normal errors are, in general, larger than those corresponding to Gaussian innovations, a feature that is also highlighted by Kim (1999). Note that, as before, there are no differences between the sample performance of the bootstrap procedure proposed by Kim (1999) and the simpler one proposed in this paper. Regarding the performance of the ellipsoidal prediction regions, Table 3 displays similar information to that in Table 2 for the VAR(1) model in (8) for 90% nominal coverage. The main fact to note is that the empirical coverages in Table 2 are below the corresponding coverages in Table 3. This is not surprising since the Bonferroni regions provide a cubical approximation with larger volume than the ellipsoids, so that at least the same coverage is expected for the former; see Figure 3. The other features of Table 3 are similar to those commented on before for the Bonferroni regions. Finally, Table 4 contains the mean distances obtained for conditional and marginal densities. Regarding the marginal density of y_{2,T+h}, we observe that when the errors are non-Gaussian and the sample size is medium (T = 100), bootstrap densities perform similarly to each other and better than the alternatives based on Gaussian assumptions, regardless of the forecast horizon.
On the other hand, when we consider the conditional densities, note that, as expected, in the presence of Gaussian errors and a sample size of T = 25, the standard and asymptotic prediction densities are closer to the real one than the bootstrap densities, no matter the forecast horizon. But when the sample size increases to T = 100 and the forecast horizon gets longer, h = 8, the differences between the bootstrap densities and those based on Gaussian assumptions tend to vanish. In the case of error distributions with heavy tails, such as the Student-5, the Gaussian and asymptotic procedures do provide good approximations of the real conditional prediction densities, as the bootstrap alternatives do, when the sample is either small or medium size and the forecast horizon is h = 1. Furthermore, as the forecast horizon gets longer, bootstrap prediction densities are closer to the real ones than their alternatives. Finally, when the error is χ²(4) and the sample size is T = 25, bootstrap procedures perform better for q_1 and q_2, regardless of the forecast horizon. When T = 25 and h = 8, the Gaussian approximation overcomes the bootstrap densities, probably because of the need to raise
the number of replications when conditioning on q_3 in order to capture the large asymmetry of that part of the error distribution; see Figure 1 and Figure 3. On the other hand, when T = 100 bootstrap densities perform clearly better than any other alternative. All in all, the simulations carried out show that the new bootstrap procedure performs better than traditional methods and no worse than the procedure based on the backward representation, with the advantage that it is easier to implement.
4
Empirical application
In this section, we implement the proposed bootstrap procedure to construct prediction intervals, ellipsoids and cubes for US inflation (π_t), unemployment rate (u_t) and GDP growth (g_t) observed quarterly from 1954Q3 to 2010Q4.13 The inflation rates are computed by π_t = log(IPI_t/IPI_{t−1}) ∗ 100, where IPI is the US Implicit Price Deflator. The unemployment is measured by the civilian unemployment rate14 and, finally, the GDP growth is given by g_t = log(GDP_t/GDP_{t−1}) ∗ 100, where GDP is the US Real Gross Domestic Product. The whole sample period has been split into an estimation sample from 1954Q3 to 2008Q4 (T = 218) and an out-of-sample period from 2009Q1 to 2010Q4. Table 5, which reports some descriptive statistics for the estimation period, shows that the null of a unit root in inflation cannot be rejected when using the Dickey-Fuller test; see, for instance, Stock and Watson (2007) for a discussion of the existence of a unit root in US quarterly IPI inflation. On the other hand, unemployment is stationary at 5% and GDP growth at 1%. Hence, a reduced VAR model in which the first difference of inflation (Δπ_t) and the current values of the unemployment and GDP growth rates depend on their lagged values seems to be appropriate for the sample period 1954Q3-2008Q4. On the other hand, the first four columns of Table 5 report the mean, standard deviation, skewness and kurtosis. When considering all the series together, the measures of skewness and kurtosis are based on Mardia (1970). It can be observed that all the series are characterized by significant skewness and kurtosis, suggesting non-Gaussian distributions. This is corroborated by the Gaussianity test of Doornik and Hansen (2008), which rejects the null hypothesis of Gaussianity both individually and jointly. To choose the lag order p, we use several order selection measures, such as the Akaike, Schwarz, Hannan-Quinn and final prediction error criteria.
All of them suggest a VAR(3), which is the model finally fitted.

13 The data were obtained from the Federal Reserve Bank of St. Louis webpage: www.stlouisfed.org.
14 As the unemployment data are monthly, we chose the value of the last month of each quarter as the quarterly measure.
The descriptive statistics for the centered and scaled residuals, â_t, are displayed in Table 6. All the estimated residuals have significant excess kurtosis and Gaussianity is rejected. Overall, this suggests that the standard approach to forecasting in the context of VAR models may be misleading when working with these US quarterly series, and therefore the implementation of bootstrap prediction intervals, ellipsoids and cubes is advisable. The estimated VAR(3) model is used to construct the multi-step out-of-sample prediction densities up to horizon 8. Figure 5 plots one-step-ahead and eight-steps-ahead bootstrap marginal prediction densities for each of the variables in the system, together with the corresponding Gaussian densities. A positive skewness can be observed in the case of unemployment, in line with the descriptive statistics of the prediction errors. A similar analysis can be done for the multivariate prediction. Figures 6 and 7 plot bivariate one-step-ahead and eight-steps-ahead Gaussian and bootstrap densities for the variables considered two by two, respectively. The skewness and kurtosis of the individual series are also clearly manifested in the shape of the kernel estimates of the bivariate densities. When h = 1, this is more evident for the first difference of inflation-unemployment and unemployment-GDP growth densities, which are affected by the skewness of unemployment and the kurtosis of the inflation rate and unemployment. Finally, we construct the prediction intervals and regions. Figure 8 plots the multi-step point predictions together with the observed series and the 95% new bootstrap and Gaussian prediction intervals for inflation, unemployment and growth up to horizon 8. Note that, as expected, the bootstrap prediction intervals are usually larger than those obtained by the standard approach, a fact that is related to the shape of the out-of-sample prediction densities.
On the other hand, the one-step-ahead density for unemployment has high kurtosis and positive skewness, which is manifested in wider intervals stemming mainly from the upper bound. For the inflation rate (panel a) and GDP growth (panel c), the observed series lie within the 95% prediction bands for both GI and BI. However, for the unemployment series we see that the standard method fails to capture the first two out-of-sample values, while the BI captures all the unemployment rates from 2009Q1 to 2010Q4. Similarly, we construct one-step-ahead and eight-steps-ahead prediction ellipsoids and cubes, which are plotted in Figure 9 for the first difference of inflation-unemployment, unemployment-GDP growth and first difference of inflation-GDP growth pairs for the Gaussian and the new bootstrap procedures. The point predictors and the observed out-of-sample values for the series are also plotted in Figure 9. Observe that the latter values lie outside the GE and GC for the first difference of inflation-unemployment regions, but still within the border of the BE and BC. In panel
(b) of Figure 9 we see that something similar occurs for unemployment-GDP growth, in which case the observed value lies outside the GC but inside the rest of the regions; i.e., GE, BC and BE. Finally, the out-of-sample value lies inside the cubes and ellipsoids of all procedures for inflation-GDP growth. Overall, the shape of the regions constructed by each procedure depends on the estimated prediction densities. For instance, in panel (c) of Figure 9 we see that the prediction cube for unemployment-GDP growth is enlarged mainly on the unemployment side, which is in part caused by the positive skewness of the unemployment density. Finally, using the out-of-sample observations, we calculate the coverages and volumes of the prediction ellipsoids and cubes obtained with each of the procedures described in this paper. Our strategy is to run a rolling-window estimation of the VAR(3) model starting with data from 1954Q3 to 1966Q4, so that the sample size is T = 50. Then we construct one-step-ahead elliptical and cubical regions for all the procedures. We count how many realized observations belong to the prediction regions constructed and compute their volumes. Table 7 displays the results. With respect to coverage, there are only slight differences between Kim's and the new bootstrap procedures. However, our method provides smaller regions than the former. Note also that our bootstrap attains better coverage than the approaches based on Gaussian assumptions. Overall, the results reported in Table 7 are in line with those obtained with simulated data, in the sense that there are no large differences between using the bootstrap procedure proposed in this paper and that proposed by Kim (1999), with the advantage that our algorithm is simpler from a computational point of view.
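The rolling-window coverage exercise just described can be sketched generically as follows. This is an illustrative sketch: the `fit_and_region` interface is hypothetical, standing in for any of the region constructions discussed in the paper (Gaussian, Kim's or the new bootstrap).

```python
import numpy as np

def rolling_coverage(data, fit_and_region, window=50):
    """Empirical coverage of one-step-ahead prediction regions over a
    rolling estimation window. `fit_and_region` takes the estimation
    sample and returns a membership test (callable y -> bool) for the
    next observation; coverage is the fraction of realized values covered.
    """
    hits = []
    T = len(data)
    for start in range(T - window):
        train = data[start:start + window]     # rolling estimation sample
        contains = fit_and_region(train)       # build one-step region
        hits.append(contains(data[start + window]))  # check realized value
    return np.mean(hits)
```

For instance, a cube built from per-component percentile intervals of the estimation sample can be plugged in as `fit_and_region`, and the resulting fraction compared with the nominal coverage.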
5
Conclusions
In this paper we extend the bootstrap procedure proposed by Pascual et al. (2004a) to construct prediction regions in multivariate VAR(p) models. The main attraction of the new bootstrap procedure, when compared with alternative bootstrap procedures previously proposed in the literature, is that it does not require the backward representation. As a result, it is possible to prove its asymptotic validity without assuming Gaussian errors. Furthermore, the new procedure can be implemented in multivariate models without a backward representation, while its computational burden is reduced. We show that the procedure works properly in incorporating the parameter uncertainty and is robust in the presence of non-Gaussian errors. Given its simplicity, the bootstrap procedure proposed in this paper can be easily extended to models with MA components and to cointegrated and non-cointegrated non-stationary systems.
Also, as proposed by Pascual et al. (2004b) for univariate systems, our procedure can be implemented to obtain prediction intervals for the original observations when a VAR model is fitted to transformed observations; see Ariño and Franses (2000) and Bardsen and Lütkepohl (2011). Finally, it is often of interest to predict future values of one of the variables in the system conditional on particular values of other variables of the system; see, for example, Waggoner and Zha (1999) for a macroeconomic example. The conditional densities can be easily obtained in the context of the bootstrap algorithm proposed in this paper by keeping the bootstrap replicates Ŷ*(b)_{T+h|T}
that satisfy the conditions. Another interesting application of the proposed procedure is the construction of confidence intervals for impulse-response functions; see, for example, Kilian (1998) for a bootstrap procedure based on the backward representation and Fachin and Bravetti (1996), who proposed a bootstrap alternative which is not. Related to impulse-response functions is the construction of prediction paths; see Staszewska-Bystova (2010) for bootstrap prediction bands based on the backward representation. Further effort should be directed to the construction of prediction regions. In this sense, it is worth noting that prediction ellipsoids are only appropriate when the distribution of the future values of the variables in the system is approximately multivariate Gaussian. When the distribution of Y_{T+h} departs from Gaussianity, the quality of such an approximation deteriorates. The Bonferroni approximation to the ellipses provides a better solution, capturing the asymmetry of the distribution. However, the shape of the cube may not be appropriate in some cases. Consequently, it would be interesting to derive regions that depart from both the elliptical and rectangular shapes. On the other hand, using highest density regions (HDR), as proposed by Hyndman (1996) and applied to GARCH models by Eklund (2005) and more recently by Teräsvirta and Zhao (2011), may not be adequate when the number of components in the system is large.
References

[1] Ariño, M.A. and P.H. Franses (2000), Forecasting the levels of vector autoregressive log-transformed time series, International Journal of Forecasting, 16, 111-116.
[2] Bardsen, G. and H. Lütkepohl (2011), Forecasting levels of log variables in vector autoregressions, International Journal of Forecasting, 27(4), 1108-1115.
[3] Chan, K.-S., L.-H. Ho and H. Tong (2006), A note on time-reversibility of multivariate linear processes, Biometrika, 93(1), 221-227.
[4] Chevillon, G. (2009), Multi-step forecasting in emerging economies: An investigation of the South African GDP, International Journal of Forecasting, 25, 602-628.
[5] Chow, H.K. and K.M. Choy (2006), Forecasting the global electronic cycle with leading indicators: A Bayesian VAR approach, International Journal of Forecasting, 22, 301-315.
[6] Clements, M.P. and J. Smith (2002), Evaluating multivariate forecast densities: a comparison of two approaches, International Journal of Forecasting, 18, 397-407.
[7] Diebold, F.X., J. Hahn and A.S. Tay (1999), Multivariate density forecast evaluation and calibration in financial risk management: High-frequency returns on foreign exchange, The Review of Economics and Statistics, 81(4), 661-673.
[8] Doornik, J.A. and H. Hansen (2008), An omnibus test for univariate and multivariate normality, Oxford Bulletin of Economics and Statistics, 70, 927-939.
[9] Eklund, B. (2005), Estimating confidence regions over bounded domains, Computational Statistics & Data Analysis, 49(2), 349-360.
[10] Fachin, S. and L. Bravetti (1996), Asymptotic normal and bootstrap inference in structural VAR analysis, Journal of Forecasting, 15, 329-341.
[11] Gómez, N. and V. Guerrero (2006), Restricted forecasting with VAR models: An analysis of a test for joint compatibility between restrictions and forecasts, International Journal of Forecasting, 22, 751-770.
[12] Grigoletto, M. (2005), Bootstrap prediction regions for multivariate autoregressive processes, Statistical Methods and Applications, 14, 179-207.
[13] Hall, P. (1992), The Bootstrap and Edgeworth Expansion, Springer-Verlag, New York.
[14] Harvey, A.C. (1989), Forecasting, Structural Time Series Models and the Kalman Filter, Cambridge University Press.
[15] Hyndman, R.J. (1996), Computing and graphing highest density regions, The American Statistician, 50(2), 120-126.
[16] Kilian, L. (1998), Confidence intervals for impulse responses under departures from normality, Econometric Reviews, 17, 1-29.
[17] Kim, J.H. (1997), Relationship between the forward and backward representation of the stationary VAR model, Problem 97.5.2, Econometric Theory, 13, 899-990.
[18] Kim, J.H. (1998), Relationship between the forward and backward representation of the stationary VAR model, Solution 97.5.2, Econometric Theory, 14, 691-693.
[19] Kim, J.H. (1999), Asymptotic and bootstrap prediction regions for vector autoregression, International Journal of Forecasting, 15, 393-403.
[20] Kim, J.H. (2001), Bootstrap-after-bootstrap prediction intervals for autoregressive models, Journal of Business & Economic Statistics, 19(1), 117-128.
[21] Kim, J.H. (2004), Bias-corrected bootstrap prediction regions for vector autoregression, Journal of Forecasting, 23, 141-154.
[22] Lewis, R. and G.C. Reinsel (1985), Prediction of multivariate time series by autoregressive model fitting, Journal of Multivariate Analysis, 16, 393-411.
[23] Lütkepohl, H. (1991), Introduction to Multiple Time Series Analysis, 2nd ed., Springer-Verlag, Berlin.
[24] Lütkepohl, H. (2006), Forecasting with VARMA models, in Elliott, G., C.W.J. Granger and A. Timmermann (eds.), Handbook of Economic Forecasting, Vol. 1, 287-325.
[25] Mardia, K.V. (1970), Measures of multivariate skewness and kurtosis with applications, Biometrika, 57(3), 519-530.
[26] Paparoditis, E. (1996), Bootstrapping autoregressive and moving average parameter estimates of infinite order vector autoregressive processes, Journal of Multivariate Analysis, 57, 277-296.
[27] Pascual, L., J. Romo and E. Ruiz (2004a), Bootstrap predictive inference for ARIMA processes, Journal of Time Series Analysis, 25, 449-465.
[28] Pascual, L., J. Romo and E. Ruiz (2004b), Bootstrap prediction intervals for power-transformed time series, International Journal of Forecasting, 21(2), 219-235.
[29] Riise, T. and D. Tjostheim (1984), Theory and practice of multivariate ARMA forecasting, Journal of Forecasting, 3, 309-317.
[30] Rodríguez, A. and E. Ruiz (2009), Bootstrap prediction intervals in state-space models, Journal of Time Series Analysis, 30(2), 167-178.
[31] Runkle, D.E. (1987), Vector autoregressions and reality, Journal of Business & Economic Statistics, 5(4), 437-442.
[32] Schmidt, P. (1977), Some small sample evidence on the distribution of dynamic simulation forecasts, Econometrica, 45(4), 997-1005.
[33] Simkins, S. (1995), Forecasting with vector autoregressive (VAR) models subject to business cycle restrictions, International Journal of Forecasting, 11, 569-583.
[34] Sims, C.A. and T. Zha (1998), Error bands for impulse responses, Econometrica, 67(5), 1113-1155.
[35] Sims, C.A. and T. Zha (1999), Bayesian methods for dynamic multivariate models, International Economis Review, 39, 949-968. [36] Staszewska-Bystova, A. (2010), Bootstrap prediction bands for forecast paths from vector autoregression models, Journal of Forecasting, doi:10.1002 /for.1205. [37] Stine, J.H. (1987), Estimating properties of autoregressive forecasts, Journal of Economic Perspectives, 15(4), 101-115. 25
[38] Stock J.H. and M.W. Watson (2001), Vector autoregressions, Journal of Economic Perspective, 15(4), 101-115. [39] Stock J.H. and M.W. Watson (2007), Why Has U.S. Inflation Become Harder to Forecast?, Journal of Money, Credit and Banking, 39, 13-33. [40] Ter¨asvirta, T. and Zhao, Z. (2011). Stylized facts of return series, robust estimates, and three popular models of volatility, Applied Financial Economics, 21(1),67-94. [41] Thombs L. A. and W.R. Schucany (1990), Bootstrap prediction intervals for autoregression, Journal of the American Statistical Association, 85, 486-92. [42] Tong, H. and Z. Zhang (2005), On time-reversibility of multivariate linear processes, Statistica Sinica, 15, 495-504. [43] Tay, A.S. and K.F. Wallis (2000). Density forecasting: a survey, Journal of Forecasting, 19, 235-254. [44] Waggoner, D.F. and T. Zha (1999), Conditional forecasts in dynamic multivariate models, The Review of Economics and Statistics, 81(4), 639-651. [45] West, K.D. (1996), Asymptotic inference about predictive ability, Econometrica, 64(5), 10671084. [46] West, K.D. and M.W. McCracken (1998), Regression based test of predictive ability, International Economic Review, 39, 817-840.
Figure 1: Kernel estimates of joint densities of one-step-ahead predictions for a bivariate series generated by a VAR(2) with T = 100 and χ2(4) errors. Panels: (a) Empirical, (b) Gaussian, (c) Asymptotic Gaussian, (d) Kim's bootstrap, (e) New bootstrap.
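The joint densities in Figures 1 and 2 are bivariate kernel estimates computed from simulated one-step-ahead predictions. A minimal sketch of how such an estimate can be obtained, assuming the simulated predictions are stacked in a two-column array (the skewed stand-in data below is illustrative, not the paper's simulation output):

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
# Hypothetical stand-in for B simulated bivariate predictions with skewed
# errors, loosely mimicking the chi^2(4) design of Figures 1 and 2.
preds = rng.chisquare(4, size=(1000, 2)) - 4.0

# gaussian_kde expects one variable per row.
kde = gaussian_kde(preds.T)

# Evaluate the fitted joint density on a grid (for contour/surface plots).
x, y = np.mgrid[-6:10:50j, -6:10:50j]
density = kde(np.vstack([x.ravel(), y.ravel()])).reshape(x.shape)
```

The grid values can then be passed to any contouring routine to reproduce plots of the kind shown in the figures.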
Figure 2: Kernel estimates of joint densities of eight-steps-ahead predictions for a bivariate series generated by a VAR(2) with T = 100 and χ2(4) errors. Panels: (a) Empirical, (b) Gaussian, (c) Asymptotic Gaussian, (d) Kim's bootstrap, (e) New bootstrap.
Figure 3: 95% Bonferroni cubes and elliptical regions for one-step-ahead prediction for a bivariate VAR(2) model with T = 100 and χ2(4) errors.
Figure 4: Estimated kernel densities for one-step-ahead predictions of the first variable generated by Model 1 with T = 100 and χ2(4) errors. Lines: Empirical, Gaussian, Asymptotic Gaussian, Kim's bootstrap and New bootstrap.
Table 1: Monte Carlo means and standard deviations (in parentheses) of coverages of one-step-ahead and eight-steps-ahead prediction intervals for the first variable of a bivariate VAR(1) model with Gaussian, Student-5 and χ2(4) errors, constructed using the Gaussian (GI), asymptotic (AI) and bootstrap (KI and BI) procedures. Nominal coverage 95%. Sample sizes T = 25 and T = 100.

                            One-step-ahead                          Eight-steps-ahead
Errors      T               Coverage       Length       Left/Right  Coverage       Length       Left/Right
Gaussian    T=25    GI      91.89 (5.91)   3.83 (0.51)  4.04/4.07   94.08 (4.15)   4.59 (0.72)  2.95/2.97
                    AI      93.40 (5.37)   4.05 (0.54)  3.28/3.32   94.33 (4.02)   4.64 (0.73)  2.83/2.85
                    KI      92.41 (5.55)   4.03 (0.62)  3.75/3.84   95.11 (3.92)   4.89 (0.87)  2.43/2.46
                    BI      92.40 (5.64)   4.03 (0.62)  3.75/3.85   95.23 (3.91)   4.93 (0.85)  2.38/2.39
            T=100   GI      94.48 (1.57)   3.90 (0.18)  2.73/2.80   94.71 (1.66)   4.51 (0.30)  2.65/2.64
                    AI      94.83 (1.52)   3.96 (0.18)  2.56/2.62   94.77 (1.65)   4.52 (0.30)  2.62/2.61
                    KI      94.41 (2.01)   3.95 (0.29)  2.80/2.79   95.05 (1.83)   4.61 (0.36)  2.48/2.47
                    BI      94.37 (1.99)   3.95 (0.28)  2.80/2.84   95.08 (1.82)   4.61 (0.36)  2.44/2.48
Student-5   T=25    GI      91.28 (7.40)   3.80 (0.68)  4.47/4.25   93.56 (4.50)   4.58 (0.92)  3.25/3.19
                    AI      92.54 (6.84)   4.02 (0.72)  3.82/3.63   93.78 (4.38)   4.62 (0.93)  3.13/3.09
                    KI      92.41 (6.24)   4.25 (1.00)  3.92/3.67   94.74 (4.10)   5.08 (1.29)  2.66/2.60
                    BI      92.28 (6.30)   4.23 (0.98)  3.99/3.73   94.83 (4.15)   5.10 (1.22)  2.60/2.57
            T=100   GI      94.30 (1.66)   3.89 (0.26)  2.84/2.86   94.51 (1.75)   4.51 (0.38)  2.73/2.76
                    AI      94.57 (1.60)   3.95 (0.27)  2.70/2.73   94.56 (1.74)   4.52 (0.38)  2.71/2.74
                    KI      94.54 (2.04)   4.06 (0.44)  2.80/2.65   95.11 (1.82)   4.72 (0.47)  2.48/2.41
                    BI      94.48 (2.13)   4.05 (0.44)  2.80/2.68   95.08 (1.86)   4.71 (0.47)  2.48/2.44
χ2(4)       T=25    GI      92.47 (4.95)   3.79 (0.70)  1.38/6.15   93.42 (4.56)   4.55 (0.90)  1.94/4.64
                    AI      93.55 (4.42)   4.01 (0.74)  0.99/5.45   93.62 (4.46)   4.60 (0.91)  1.86/4.52
                    KI      93.10 (5.27)   4.13 (0.96)  2.99/3.91   94.78 (4.34)   5.04 (1.18)  2.06/3.16
                    BI      93.07 (5.28)   4.11 (0.94)  2.95/3.99   94.87 (4.36)   5.08 (1.20)  1.97/3.17
            T=100   GI      94.69 (1.59)   3.91 (0.26)  0.52/4.79   94.52 (1.69)   4.53 (0.37)  1.46/4.02
                    AI      94.90 (1.54)   3.97 (0.26)  0.47/4.63   94.57 (1.68)   4.55 (0.38)  1.44/3.99
                    KI      94.68 (2.13)   3.98 (0.39)  2.55/2.77   95.09 (1.97)   4.74 (0.48)  2.35/2.56
                    BI      94.69 (2.05)   3.98 (0.38)  2.54/2.77   95.07 (2.00)   4.74 (0.49)  2.33/2.59
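The KI and BI intervals summarised above are percentile intervals of simulated future paths. The following sketch shows the forward-only resampling idea for a bivariate VAR(1); it is illustrative rather than the authors' exact algorithm (in particular, it conditions on a single OLS fit instead of re-estimating the parameters on each bootstrap replicate, which is how parameter uncertainty would be incorporated):

```python
import numpy as np

rng = np.random.default_rng(1)

# --- simulate a stable bivariate VAR(1): y_t = A y_{t-1} + e_t ---
A = np.array([[0.5, 0.1], [0.2, 0.4]])
T = 100
y = np.zeros((T, 2))
for t in range(1, T):
    # standardised chi^2(4) errors, as in the Monte Carlo design
    y[t] = A @ y[t - 1] + (rng.chisquare(4, 2) - 4.0) / np.sqrt(8.0)

# --- OLS estimation and centred residuals ---
X, Y = y[:-1], y[1:]
A_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T
resid = Y - X @ A_hat.T
resid -= resid.mean(axis=0)

# --- forward bootstrap of future paths (no backward representation) ---
B, h = 999, 8
paths = np.empty((B, h, 2))
for b in range(B):
    yb = y[-1]
    for s in range(h):
        # iterate forward from the last observation, resampling residuals
        yb = A_hat @ yb + resid[rng.integers(len(resid))]
        paths[b, s] = yb

# 95% percentile interval for the first variable, one step ahead
lo, hi = np.percentile(paths[:, 0, 0], [2.5, 97.5])
```

Intervals at longer horizons, or for the second variable, are read off the same `paths` array.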
Table 2: Monte Carlo means and standard deviations (in parentheses) of coverages of one-step-ahead and eight-steps-ahead Bonferroni prediction regions for components of a bivariate VAR(1) model with Gaussian, Student-5 and χ2(4) errors, constructed using the Gaussian (GC), asymptotic (AC) and bootstrap (KC and BC) procedures. Nominal coverage 90%. Sample sizes T = 25 and T = 100.

                            One-step-ahead                 Eight-steps-ahead
Errors      T               Coverage       Volume          Coverage       Volume
Gaussian    T=25    GC      87.67 (7.23)   14.77 (3.51)    85.98 (7.47)   24.59 (6.29)
                    AC      89.85 (6.63)   16.55 (3.93)    87.26 (7.16)   26.00 (6.88)
                    KC      88.27 (6.80)   16.45 (4.61)    88.27 (6.77)   28.19 (19.55)
                    BC      88.26 (6.90)   16.46 (4.57)    88.29 (6.90)   28.04 (7.58)
            T=100   GC      91.41 (1.96)   15.21 (1.24)    89.27 (2.38)   24.66 (2.38)
                    AC      91.93 (1.90)   15.66 (1.28)    89.60 (2.35)   25.03 (2.44)
                    KC      91.22 (2.42)   15.62 (1.87)    90.04 (2.58)   25.86 (2.90)
                    BC      91.17 (2.37)   15.57 (1.83)    90.06 (2.58)   25.88 (2.94)
Student-5   T=25    GC      87.60 (8.93)   14.76 (5.03)    86.12 (7.99)   24.48 (8.67)
                    AC      89.39 (8.24)   16.53 (5.63)    87.19 (7.66)   25.88 (9.31)
                    KC      89.08 (7.57)   18.86 (8.95)    88.24 (7.02)   30.27 (27.18)
                    BC      88.84 (7.72)   18.69 (8.68)    88.06 (7.41)   29.62 (12.43)
            T=100   GC      92.12 (2.14)   15.24 (1.94)    89.92 (2.71)   24.64 (3.35)
                    AC      92.50 (2.06)   15.69 (2.00)    90.17 (2.68)   25.00 (3.42)
                    KC      92.35 (2.61)   16.71 (3.32)    90.95 (2.85)   27.10 (4.52)
                    BC      92.27 (2.67)   16.62 (3.20)    90.92 (2.88)   27.12 (4.53)
χ2(4)       T=25    GC      89.45 (7.26)   14.71 (5.15)    87.28 (7.83)   24.37 (8.89)
                    AC      91.01 (6.46)   16.48 (5.77)    88.18 (7.51)   25.74 (9.52)
                    KC      89.26 (7.63)   17.61 (8.16)    89.09 (7.11)   28.90 (12.22)
                    BC      89.25 (7.62)   17.43 (7.83)    89.15 (7.22)   29.07 (12.09)
            T=100   GC      93.18 (1.93)   15.32 (1.86)    90.96 (2.46)   24.78 (3.25)
                    AC      93.44 (1.87)   15.78 (1.92)    91.17 (2.42)   25.15 (3.32)
                    KC      91.51 (3.49)   15.60 (2.84)    90.78 (2.72)   26.49 (4.08)
                    BC      91.49 (3.45)   15.62 (2.76)    90.78 (2.72)   26.45 (4.12)
Table 3: Monte Carlo means and standard deviations (in parentheses) of coverages of one-step-ahead and eight-steps-ahead elliptical prediction regions for components of a bivariate VAR(1) model with Gaussian, Student-5 and χ2(4) errors, constructed using the Gaussian (GE), asymptotic (AE) and bootstrap (KE and BE) procedures. Nominal coverage 90%. Sample sizes T = 25 and T = 100.

                            One-step-ahead                 Eight-steps-ahead
Errors      T               Coverage       Volume          Coverage       Volume
Gaussian    T=25    GE      82.86 (8.10)   8.02 (1.55)     85.17 (7.92)   22.34 (5.70)
                    AE      85.90 (7.48)   8.98 (1.74)     86.38 (7.59)   23.46 (6.07)
                    KE      85.30 (7.27)   8.92 (1.95)     88.15 (7.12)   25.39 (8.77)
                    BE      85.11 (7.46)   8.86 (1.86)     88.15 (7.30)   25.55 (6.82)
            T=100   GE      88.76 (2.16)   8.53 (0.57)     88.64 (2.53)   22.45 (2.16)
                    AE      89.47 (2.09)   8.78 (0.58)     88.96 (2.51)   22.75 (2.21)
                    KE      89.07 (2.57)   8.70 (0.63)     89.75 (2.79)   23.61 (2.54)
                    BE      89.07 (2.58)   8.71 (0.65)     89.73 (2.84)   23.61 (2.59)
Student-5   T=25    GE      82.28 (10.31)  7.83 (2.17)     85.05 (8.30)   22.20 (7.79)
                    AE      84.71 (9.62)   8.77 (2.43)     86.03 (7.98)   23.30 (8.23)
                    KE      85.69 (7.99)   8.87 (2.86)     88.27 (7.21)   25.67 (12.89)
                    BE      85.28 (8.41)   8.71 (2.54)     88.10 (7.63)   25.56 (9.86)
            T=100   GE      89.13 (2.34)   8.52 (0.86)     88.97 (2.83)   22.46 (3.02)
                    AE      89.62 (2.27)   8.78 (0.89)     89.20 (2.79)   22.76 (3.06)
                    KE      89.66 (2.68)   8.72 (0.97)     89.98 (3.15)   23.58 (3.35)
                    BE      89.53 (2.74)   8.69 (0.95)     89.94 (3.18)   23.62 (3.45)
χ2(4)       T=25    GE      84.78 (8.33)   7.84 (2.13)     86.65 (8.05)   22.13 (7.99)
                    AE      86.77 (7.59)   8.78 (2.38)     87.44 (7.73)   23.22 (8.45)
                    KE      86.77 (7.12)   8.81 (2.82)     88.85 (7.30)   25.30 (9.96)
                    BE      86.55 (7.22)   8.66 (2.52)     88.84 (7.48)   25.33 (9.49)
            T=100   GE      90.21 (1.98)   8.57 (0.82)     90.28 (2.55)   22.57 (2.93)
                    AE      90.57 (1.91)   8.83 (0.85)     90.44 (2.52)   22.87 (2.97)
                    KE      89.87 (2.35)   8.77 (0.91)     90.24 (2.82)   23.69 (3.28)
                    BE      89.80 (2.43)   8.74 (0.90)     90.26 (2.83)   23.69 (3.33)
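Both region types compared in Tables 2 and 3 can be read off the same sample of simulated predictions: Bonferroni cubes combine marginal percentile intervals at level 1 − α/m, while elliptical regions threshold a Mahalanobis distance at an empirical quantile. A sketch under hedged assumptions (the Gaussian stand-in draws are illustrative, not output of the paper's bootstrap):

```python
import numpy as np

rng = np.random.default_rng(2)
draws = rng.standard_normal((2000, 2))   # stand-in for B bootstrap predictions

# Bonferroni cube: marginal (1 - alpha/m) intervals give joint level >= 1 - alpha
alpha, m = 0.10, draws.shape[1]
q = 100 * alpha / (2 * m)
cube = np.percentile(draws, [q, 100 - q], axis=0)   # row 0: lower, row 1: upper

# Elliptical region: {y : (y - mean)' S^{-1} (y - mean) <= c}, with c the
# empirical (1 - alpha) quantile of the draws' squared Mahalanobis distances
mean = draws.mean(axis=0)
S_inv = np.linalg.inv(np.cov(draws.T))
d2 = np.einsum('ij,jk,ik->i', draws - mean, S_inv, draws - mean)
c = np.quantile(d2, 1 - alpha)

def in_cube(y):
    return bool(np.all((y >= cube[0]) & (y <= cube[1])))

def in_ellipse(y):
    return bool((y - mean) @ S_inv @ (y - mean) <= c)
```

The cube's volume is the product of the marginal interval lengths; the ellipse's volume follows from c and det(S), which is one way the "Volume" columns of Tables 2 and 3 can be computed.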
Table 4: Monte Carlo means of Mallows distances between one-step-ahead (h = 1) and eight-steps-ahead (h = 8) marginal and conditional prediction densities and the corresponding true densities, obtained for a bivariate VAR(1) model with Gaussian, Student-5 and χ2(4) errors, constructed using the Gaussian (GD), asymptotic (AD) and bootstrap (KD and BD) procedures. Sample sizes T = 25 and T = 100.

                            Marginal        Condition 1 (q1)  Condition 2 (q2)  Condition 3 (q3)
Errors      T               h=1    h=8      h=1    h=8        h=1    h=8        h=1    h=8
Gaussian    T=25    GD      0.329  0.468    0.226  0.271      0.210  0.251      0.224  0.274
                    AD      0.334  0.473    0.224  0.267      0.210  0.252      0.222  0.277
                    KD      0.378  0.466    0.308  0.352      0.282  0.327      0.304  0.349
                    BD      0.385  0.495    0.308  0.353      0.281  0.332      0.300  0.352
            T=100   GD      0.145  0.188    0.113  0.193      0.109  0.173      0.115  0.194
                    AD      0.145  0.188    0.115  0.191      0.107  0.171      0.116  0.193
                    KD      0.195  0.197    0.208  0.217      0.196  0.198      0.217  0.223
                    BD      0.196  0.200    0.208  0.220      0.195  0.200      0.216  0.220
Student-5   T=25    GD      1.126  1.253    0.243  0.335      0.237  0.364      0.235  0.334
                    AD      1.118  1.245    0.251  0.338      0.250  0.368      0.245  0.336
                    KD      1.190  1.408    0.277  0.279      0.249  0.256      0.273  0.280
                    BD      1.187  1.382    0.277  0.279      0.247  0.258      0.272  0.282
            T=100   GD      0.994  1.143    0.212  0.320      0.210  0.354      0.202  0.317
                    AD      0.992  1.139    0.216  0.320      0.215  0.354      0.207  0.319
                    KD      0.984  1.041    0.237  0.236      0.198  0.211      0.221  0.231
                    BD      0.983  1.040    0.238  0.235      0.197  0.214      0.220  0.234
χ2(4)       T=25    GD      0.583  0.687    0.299  0.466      0.284  0.316      0.363  0.351
                    AD      0.579  0.686    0.320  0.471      0.291  0.320      0.358  0.349
                    KD      0.561  0.843    0.263  0.308      0.287  0.307      0.402  0.415
                    BD      0.558  0.783    0.256  0.316      0.282  0.316      0.397  0.417
            T=100   GD      0.468  0.482    0.222  0.391      0.196  0.220      0.248  0.261
                    AD      0.467  0.480    0.230  0.391      0.199  0.218      0.246  0.265
                    KD      0.365  0.289    0.146  0.153      0.181  0.173      0.274  0.252
                    BD      0.364  0.296    0.146  0.152      0.182  0.174      0.275  0.252
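For equal-size univariate samples, the Mallows (Wasserstein) distance used in Table 4 reduces to an Lp distance between order statistics. The table does not state the order p; the sketch below uses p = 2 as one common choice:

```python
import numpy as np

def mallows(x, y, p=2):
    """Empirical Mallows distance between two equal-size univariate samples:
    match the sorted observations and average |x - y|^p."""
    x, y = np.sort(np.asarray(x, dtype=float)), np.sort(np.asarray(y, dtype=float))
    if len(x) != len(y):
        raise ValueError("samples must have equal size")
    return float(np.mean(np.abs(x - y) ** p) ** (1.0 / p))

rng = np.random.default_rng(3)
a = rng.standard_normal(1000)
# a pure location shift moves every order statistic by the same amount,
# so the distance equals the size of the shift
d = mallows(a, a + 1.0)
```

In the Monte Carlo exercise the "true" sample would be drawn from the known predictive distribution and compared against draws from each estimated density.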
Table 5: Descriptive statistics of quarterly US inflation, unemployment and GDP growth observed from 1954Q3 to 2008Q4.

Series                               Mean    Sd     Skewness  Kurtosis  Normality test  A. D-F¹
Inflation                             0.87   0.59    1.21*      4.27*     93.65*         -2.35
Unemployment                          5.77   1.42    0.80*      3.77**    38.59*         -3.16**
GDP growth                            0.80   0.93   -0.33**     4.29*     13.81*        -10.25*
First difference of inflation (Δπ)   -0.00   0.34   -0.24       5.01*     27.07*        -12.53*
Joint (with π)¹                                      2.69*     19.00*    125.89*
Joint (with Δπ)¹                                     1.63*      8.12*     77.20*

(*) and (**) significant at 1% and 5%, respectively. ¹ The Augmented Dickey-Fuller test critical value is -2.87 at the 5% level.
Table 6: Diagnostics of the VAR(3) residuals corresponding to the US first difference of inflation, unemployment and GDP growth series.

Series   Mean    Sd     Skewness  Kurtosis  Normality test
â_Δπ     -0.00   0.28   -0.04      3.71**     5.89**
â_u      -0.00   0.30    0.99*     5.00*     29.24*
â_g       0.00   0.82    0.01      4.13*     12.12*
Joint                    1.70*    20.22*     35.31*

(*) and (**) significant at 1% and 5%, respectively.
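The "Joint" skewness and kurtosis entries in Tables 5 and 6 are multivariate measures; Mardia (1970), cited in the references, defines the standard ones. The sketch below computes Mardia's b1,p and b2,p as one plausible implementation (the tables do not spell out the exact estimator used):

```python
import numpy as np

def mardia(X):
    """Mardia's multivariate skewness b1p and kurtosis b2p for an (n, p) sample."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    Z = X - X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X.T, bias=True))   # ML covariance, as in Mardia (1970)
    G = Z @ S_inv @ Z.T                             # matrix of standardised cross products
    b1p = float((G ** 3).sum()) / n**2              # skewness: average cubed cross products
    b2p = float((np.diag(G) ** 2).mean())           # kurtosis: mean squared Mahalanobis distance
    return b1p, b2p

rng = np.random.default_rng(4)
b1, b2 = mardia(rng.standard_normal((2000, 3)))
# under normality b1p -> 0 and b2p -> p(p + 2) = 15 for p = 3
```

Significance tests then compare n·b1p/6 with a chi-squared distribution and standardise b2p against its normal limit.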
(a) First difference of inflation, h = 1
(b) First difference of inflation, h = 8
(c) Unemployment, h = 1
(d) Unemployment, h = 8
(e) GDP growth, h = 1
(f) GDP growth, h = 8
Figure 5: Kernel estimates of the densities of the one-step-ahead and eight-steps-ahead predictions for (a-b) the first difference of inflation, (c-d) unemployment and (e-f) GDP growth.
(a) Gaussian ΔInflation-Unemployment
(b) New bootstrap ΔInflation-Unemployment
(c) Gaussian Unemployment-GDP growth
(d) New bootstrap Unemployment-GDP growth
(e) Gaussian ΔInflation-GDP growth
(f) New bootstrap ΔInflation-GDP growth
Figure 6: Kernel estimates of the joint densities of the one-step-ahead predictions for pairs of the US series.
(a) Gaussian ΔInflation-Unemployment
(b) New bootstrap ΔInflation-Unemployment
(c) Gaussian Unemployment-GDP growth
(d) New bootstrap Unemployment-GDP growth
(e) Gaussian ΔInflation-GDP growth
(f) New bootstrap ΔInflation-GDP growth
Figure 7: Kernel estimates of the joint densities of the eight-steps-ahead predictions for pairs of the US series.
(a) First difference of inflation
(b) Unemployment
(c) GDP growth
Figure 8: 95% prediction intervals for US quarterly (a) first difference of inflation, (b) unemployment and (c) GDP growth from 2009Q1 to 2010Q4.
(a) ΔInflation-Unemployment, h = 1
(b) ΔInflation-Unemployment, h = 8
(c) Unemployment-GDP growth, h = 1
(d) Unemployment-GDP growth, h = 8
(e) ΔInflation-GDP growth, h = 1
(f) ΔInflation-GDP growth, h = 8
Figure 9: 95% Bonferroni cubes and elliptical regions for the one-step-ahead and eight-steps-ahead predictions for (a-b) ΔInflation-Unemployment, (c-d) Unemployment-GDP growth and (e-f) ΔInflation-GDP growth.
Table 7: Mean coverages and volumes of one-step-ahead Bonferroni and elliptical prediction regions constructed using the Gaussian, asymptotic and bootstrap (Kim's and new) procedures for a VAR(3) model for US inflation, unemployment and GDP growth. Nominal coverages 90%, 95% and 99%.

                              Bonferroni            Elliptical
Nominal                       Coverage  Volume      Coverage  Volume
90%     Gaussian              71.26      3.75       67.66      2.43
        Asymptotic            75.45      4.93       71.26      3.19
        Kim's bootstrap       82.63      7.07       77.84      4.45
        New bootstrap         83.23      6.78       77.84      4.19
95%     Gaussian              79.04      5.33       71.86      3.39
        Asymptotic            81.44      7.01       79.04      4.46
        Kim's bootstrap       86.83      9.70       86.23      6.22
        New bootstrap         86.23      9.27       86.83      5.86
99%     Gaussian              88.02      9.83       85.03      5.94
        Asymptotic            89.22     12.92       89.22      7.80
        Kim's bootstrap       92.22     15.85       94.01     10.88
        New bootstrap         92.22     15.22       92.81     10.25