Dynamic covariance estimation using sparse Bayesian factor stochastic volatility models

Gregor Kastner¹, Sylvia Frühwirth-Schnatter¹, Hedibert Freitas Lopes²

¹ WU Vienna University of Economics and Business, Austria
² Insper, São Paulo, Brazil

E-mail for correspondence: [email protected]

Abstract: We address the “curse of dimensionality” arising in time-varying covariance estimation by modeling the underlying volatility dynamics of a time series vector through a lower dimensional collection of latent dynamic factors. Furthermore, we apply a Normal-Gamma shrinkage prior to the elements of the factor loadings matrix, thereby increasing parsimony even more. Estimation is carried out via MCMC in order to obtain draws from the high-dimensional posterior and predictive distributions. To guarantee efficiency of the samplers, we utilize several ancillarity-sufficiency interweaving strategies (ASIS) for sampling the factor loadings. Estimation and forecasting performance is evaluated for simulated and real-world data.

Keywords: Shrinkage; Normal-Gamma prior; Curse of dimensionality; Predictive distribution; Ancillarity-sufficiency interweaving strategy (ASIS).
1 Introduction
This paper was published as a part of the proceedings of the 30th International Workshop on Statistical Modelling, Johannes Kepler Universität Linz, 6–10 July 2015. The copyright remains with the author(s). Permission to reproduce or extract any parts of this abstract should be requested from the author(s).

We extend the standard factor stochastic volatility (SV) model (see, e.g., Chib et al., 2006, and the references therein) to allow for shrinkage of the loadings matrix towards zero. Within a Bayesian framework, this can be achieved by employing the Normal-Gamma prior (Griffin and Brown, 2010) for the factor loadings. This hierarchical prior has the attractive property that factors which are irrelevant for certain components (i.e., contribute little or nothing to the likelihood) are effectively shrunk towards zero a posteriori. At the same time, the Normal-Gamma prior is flexible enough to cater for informative factors without “overshrinking”. The model reads

$$y_t = \Lambda f_t + \Sigma_t^{1/2} \epsilon_t, \qquad f_t = V_t^{1/2} u_t,$$
where $y_t$ denotes the vector of (potentially demeaned) log-returns of $m$ observed time series at time $t$ for $t = 1, \dots, T$, $\Lambda$ is an unknown $m \times r$ factor loadings matrix with elements $\Lambda_{ij}$, and $f_t = (f_{1t}, \dots, f_{rt})'$ represents the common latent factors at time $t$. $\Sigma_t^{1/2} = \mathrm{Diag}(\exp(h_{1t}/2), \dots, \exp(h_{mt}/2))$ denotes the latent idiosyncratic (i.e., component-specific) volatilities, and $V_t^{1/2} = \mathrm{Diag}(\exp(h_{m+1,t}/2), \dots, \exp(h_{m+r,t}/2))$ denotes the latent factor (i.e., common) volatilities. The errors $\epsilon_t \sim N_m(0, I_m)$ and $u_t \sim N_r(0, I_r)$ are i.i.d. $m$-variate and $r$-variate normal innovations, respectively, with zero means and identity covariance matrices, where $\epsilon_t$ and $u_s$ are assumed to be pairwise independent for all $t, s \in \{1, \dots, T\}$. Both the latent factors and the idiosyncratic shocks are allowed to follow independent SV processes, i.e.,

$$h_{it} = \mu_i + \phi_i (h_{i,t-1} - \mu_i) + \sigma_i \eta_{it}, \qquad \eta_{it} \sim N(0, 1), \qquad i = 1, \dots, m + r.$$
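To make the data-generating process concrete, the following minimal Python sketch simulates from the model above and computes the implied conditional covariance matrix $\Lambda V_t \Lambda' + \Sigma_t$, which follows from the two model equations and the independence of the two innovation vectors. The function names, the parameter values, and the choice to start each log-variance process from its stationary distribution are illustrative assumptions and are not taken from the paper. Only the standard library is used so that the sketch runs without extra dependencies.

```python
import math
import random


def simulate_factor_sv(Lambda, mu, phi, sigma, T, seed=1):
    """Simulate y_1, ..., y_T from the factor SV model.

    Lambda is an m x r list of rows. mu, phi and sigma have length m + r:
    the m idiosyncratic processes come first, the r factor processes last.
    """
    rng = random.Random(seed)
    m, r = len(Lambda), len(Lambda[0])
    # Assumption: h_0 is drawn from the stationary AR(1) distribution.
    h = [rng.gauss(mu[i], sigma[i] / math.sqrt(1.0 - phi[i] ** 2))
         for i in range(m + r)]
    ys, hs = [], []
    for _ in range(T):
        # h_it = mu_i + phi_i (h_{i,t-1} - mu_i) + sigma_i eta_it
        h = [mu[i] + phi[i] * (h[i] - mu[i]) + sigma[i] * rng.gauss(0.0, 1.0)
             for i in range(m + r)]
        # f_t = V_t^{1/2} u_t
        f = [math.exp(h[m + j] / 2.0) * rng.gauss(0.0, 1.0) for j in range(r)]
        # y_t = Lambda f_t + Sigma_t^{1/2} eps_t
        y = [sum(Lambda[i][j] * f[j] for j in range(r))
             + math.exp(h[i] / 2.0) * rng.gauss(0.0, 1.0)
             for i in range(m)]
        ys.append(y)
        hs.append(h)
    return ys, hs


def implied_covariance(Lambda, h):
    """Conditional covariance of y_t given h_t: Lambda V_t Lambda' + Sigma_t."""
    m, r = len(Lambda), len(Lambda[0])
    cov = [[sum(Lambda[i][k] * math.exp(h[m + k]) * Lambda[j][k]
                for k in range(r))
            for j in range(m)] for i in range(m)]
    for i in range(m):
        cov[i][i] += math.exp(h[i])
    return cov


# Illustrative values only: m = 3 series, r = 2 factors.
Lambda = [[1.0, 0.0], [0.8, 0.5], [0.0, 0.9]]
mu = [-1.0, -1.0, -1.0, 0.0, 0.0]  # factor levels fixed at zero
phi = [0.95] * 5
sigma = [0.2] * 5
ys, hs = simulate_factor_sv(Lambda, mu, phi, sigma, T=500)
cov_T = implied_covariance(Lambda, hs[-1])
```

Because the first and third rows of this illustrative `Lambda` load on different factors, the corresponding off-diagonal entry of `cov_T` is zero, which shows how sparsity in the loadings translates into sparsity in the covariance matrix.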
Following Griffin and Brown (2010), we substitute the usual factor loadings prior, $\Lambda_{ij} \sim N(0, \tau^2)$ with $\tau^2$ fixed, by a hierarchical Normal-Gamma prior,

$$\Lambda_{ij} \mid \tau_{ij}^2 \sim N(0, \tau_{ij}^2), \qquad \tau_{ij}^2 \sim G(a_i, a_i \lambda_i^2 / 2), \qquad \lambda_i^2 \sim G(c_i, d_i),$$

with $a_i$, $c_i$, and $d_i$ fixed. Choosing $a_i$ small enforces strong shrinkage towards zero, while choosing $a_i$ large imposes little shrinkage. Note that the Bayesian Lasso prior arises as a special case when $a_i = 1$. Univariate SV process priors are the same as in Kastner and Frühwirth-Schnatter (2014).
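The effect of $a_i$ on the amount of shrinkage can be checked by simulating from the prior. The sketch below is a rough illustration under two assumptions that the text does not state explicitly: $G(\cdot, \cdot)$ is read as a shape/rate parameterization (which is consistent with the Bayesian Lasso arising at $a_i = 1$, where $\tau_{ij}^2$ becomes exponential), and $\lambda_i^2$ is held fixed at an arbitrary value instead of being drawn from $G(c_i, d_i)$. The function names and the threshold `eps` are hypothetical.

```python
import random


def draw_loading(a, lam2, rng):
    """One prior draw of Lambda_ij given a_i = a and lambda_i^2 = lam2."""
    # tau_ij^2 ~ G(a, a * lam2 / 2), read as shape/rate (an assumption).
    # random.gammavariate expects shape and scale, so scale = 1 / rate.
    tau2 = rng.gammavariate(a, 2.0 / (a * lam2))
    # Lambda_ij | tau_ij^2 ~ N(0, tau_ij^2)
    return rng.gauss(0.0, tau2 ** 0.5)


def share_near_zero(a, lam2=2.0, n=50_000, eps=0.05, seed=42):
    """Fraction of prior draws with |Lambda_ij| < eps."""
    rng = random.Random(seed)
    return sum(abs(draw_loading(a, lam2, rng)) < eps for _ in range(n)) / n


# Under this reading the prior variance 2 / lam2 does not depend on a,
# so the three settings differ in shape, not in scale.
for a in (0.1, 1.0, 10.0):
    print(a, share_near_zero(a))
```

Smaller values of `a` place more prior mass very close to zero while keeping heavy tails, which is the behaviour described above: irrelevant loadings are pulled towards zero, yet large loadings are not overshrunk.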
2 Identifiability and sampling efficiency
Without identifying the scaling of either the $j$th column of $\Lambda$ or the variance of $f_{jt}$, the model is not identified. Aguilar and West (2000) assume that $\Lambda_{jj} = 1$, while the level $\mu_{m+j}$ of $h_{m+j,t}$ (which corresponds to the scaling of $f_{jt}$) is treated as unknown. Alternatively, one can fix the level $\mu_{m+j}$ at zero and leave the diagonal elements $\Lambda_{jj}$ unrestricted. This is the baseline approach adopted in this paper.

Note that by letting $\Lambda^* := \Lambda \times \mathrm{Diag}(\Lambda_{11}^{-1}, \dots, \Lambda_{rr}^{-1})$ denote the restricted factor loadings matrix and $f^*_{i\cdot} := \Lambda_{ii} f_{i\cdot}$ denote the correspondingly transformed factor for $i = 1, \dots, r$, one can easily move from one identification scheme to the other. This transformation can be exploited to substantially improve the usual Gibbs sampler by utilizing ASIS (Yu and Meng, 2011) to redraw the factor loadings.

To illustrate the effectiveness of these simple reparameterizations, we consider simulated data. The top panel of Figure 1 shows the output of the sampler for the first series' loading on the first latent factor without any form of interweaving on the factor loadings. Even after thinning by a factor of 100, the posterior draws show extremely high autocorrelation. In fact, the autocorrelation in these draws is so high that the dependence on the starting values is non-negligible even after the long burn-in of 50 000 draws. There seems to be little reason to believe that the sampler has converged at all. The bottom panel displays draws obtained from the sampler using ASIS, which exhibit practically no autocorrelation.
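The move between the two identification schemes can be sketched in a few lines. The code below applies the transformation $\Lambda^* = \Lambda \times \mathrm{Diag}(\Lambda_{11}^{-1}, \dots, \Lambda_{rr}^{-1})$, $f^*_{i\cdot} = \Lambda_{ii} f_{i\cdot}$ to arbitrary simulated values and checks that the common component $\Lambda f_t$ is unchanged. It shows only the reparameterization, not the ASIS redraw of the loadings itself; the function names and test values are hypothetical.

```python
import random


def to_restricted(Lambda, f):
    """Map (Lambda, f) to (Lambda*, f*) so that Lambda*_jj = 1.

    Lambda is an m x r list of rows, f an r x T list of factor paths.
    The diagonal loadings Lambda_jj are assumed to be nonzero.
    """
    r = len(f)
    d = [Lambda[j][j] for j in range(r)]
    # Divide column j of Lambda by Lambda_jj ...
    Lambda_star = [[row[j] / d[j] for j in range(r)] for row in Lambda]
    # ... and multiply factor j by Lambda_jj.
    f_star = [[d[j] * x for x in f[j]] for j in range(r)]
    return Lambda_star, f_star


def common_part(Lambda, f):
    """Lambda f as an m x T list of rows."""
    r, T = len(f), len(f[0])
    return [[sum(row[j] * f[j][t] for j in range(r)) for t in range(T)]
            for row in Lambda]


rng = random.Random(7)
m, r, T = 4, 2, 5
Lambda = [[rng.uniform(0.5, 1.5) for _ in range(r)] for _ in range(m)]
f = [[rng.gauss(0.0, 1.0) for _ in range(T)] for _ in range(r)]

Lambda_star, f_star = to_restricted(Lambda, f)
before = common_part(Lambda, f)
after = common_part(Lambda_star, f_star)
max_diff = max(abs(a - b)
               for row_a, row_b in zip(before, after)
               for a, b in zip(row_a, row_b))
print(max_diff)  # numerically negligible: both schemes give the same Lambda f_t
```

Since $f^*_{jt} = \Lambda_{jj} f_{jt}$, the log-variance of the transformed factor is $h_{m+j,t} + \log \Lambda_{jj}^2$, so the scale removed from the loadings reappears as the level of the factor log-variance process. This is what links the unrestricted-diagonal scheme with $\mu_{m+j} = 0$ to the scheme with $\Lambda_{jj} = 1$ and free $\mu_{m+j}$.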
3 Prediction

To illustrate the feasibility of our approach for obtaining draws from the $m$-dimensional predictive distribution, we consider 20 exchange rates previously analyzed in Kastner et al. (2014). Figure 2 illustrates the bivariate marginals of the one-day-ahead predictive distribution on 2008-05-16 (arbitrarily chosen). Evaluating the predictive density at the observed value immediately yields the predictive likelihood. Analogously to Kastner (forthcoming), this measure of forecasting accuracy can straightforwardly be used for model comparison.
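One common way to approximate a one-day-ahead predictive likelihood from MCMC output is sketched below. For each posterior draw, the log-variances are propagated one step with the AR(1) law, the conditional covariance $\Lambda V_{T+1} \Lambda' + \Sigma_{T+1}$ is formed, and the zero-mean Gaussian density is evaluated at the observed vector. The average over draws is the Monte Carlo estimate. This is a sketch under assumptions and not necessarily the authors' implementation: the dictionary layout of a draw, all function names, and the stand-in "posterior draws" generated at the end are hypothetical.

```python
import math
import random


def step_log_variances(draw, rng):
    """Propagate h_T to h_{T+1} using the AR(1) law of each SV process."""
    return [mu + phi * (h - mu) + sigma * rng.gauss(0.0, 1.0)
            for mu, phi, sigma, h in zip(draw["mu"], draw["phi"],
                                         draw["sigma"], draw["h_T"])]


def predictive_covariance(Lambda, h_next):
    """Lambda V_{T+1} Lambda' + Sigma_{T+1} for one draw."""
    m, r = len(Lambda), len(Lambda[0])
    cov = [[sum(Lambda[i][k] * math.exp(h_next[m + k]) * Lambda[j][k]
                for k in range(r))
            for j in range(m)] for i in range(m)]
    for i in range(m):
        cov[i][i] += math.exp(h_next[i])
    return cov


def gaussian_logpdf(y, cov):
    """log N_m(y; 0, cov), computed via the Cholesky factor cov = L L'."""
    m = len(y)
    L = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):
            s = cov[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    # Forward substitution solves L z = y, so y' cov^{-1} y = z'z.
    z = []
    for i in range(m):
        z.append((y[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(m))
    return -0.5 * (m * math.log(2.0 * math.pi) + log_det
                   + sum(v * v for v in z))


def predictive_likelihood(draws, y_obs, seed=0):
    """Average the conditional Gaussian density over posterior draws."""
    rng = random.Random(seed)
    dens = []
    for d in draws:
        h_next = step_log_variances(d, rng)
        cov = predictive_covariance(d["Lambda"], h_next)
        dens.append(math.exp(gaussian_logpdf(y_obs, cov)))
    return sum(dens) / len(dens)


# Stand-in for MCMC output (NOT real posterior draws): fixed parameters
# with randomly perturbed final log-variances, m = 3 and r = 2.
rng = random.Random(1)
draws = [{"Lambda": [[1.0, 0.0], [0.8, 0.5], [0.0, 0.9]],
          "mu": [-1.0, -1.0, -1.0, 0.0, 0.0],
          "phi": [0.95] * 5,
          "sigma": [0.2] * 5,
          "h_T": [rng.gauss(-1.0, 0.3) for _ in range(3)]
                 + [rng.gauss(0.0, 0.3) for _ in range(2)]}
         for _ in range(200)]
pl = predictive_likelihood(draws, y_obs=[0.1, -0.2, 0.3])
print(pl)
```

Draws from the predictive distribution itself, as displayed in Figure 2, could be obtained from the same ingredients by sampling $y_{T+1}$ from the zero-mean Gaussian with the covariance computed for each draw.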
FIGURE 1. Trace plots for 1000 draws, exemplified for the first loading on factor one. Top panel: No interweaving. Bottom panel: ASIS. The draws are thinned; every 100th draw is displayed. Horizontal lines indicate data generating values.

FIGURE 2. Pairwise scatterplots and empirical correlation coefficients of draws from the one-day-ahead predictive distribution for 2008-05-16, obtained from a six factor model.

References

Aguilar, O. and West, M. (2000). Bayesian dynamic factor models and portfolio allocation. Journal of Business & Economic Statistics, 18, 338-357.

Chib, S., Nardari, F., and Shephard, N. (2006). Analysis of high dimensional multivariate stochastic volatility models. Journal of Econometrics, 134, 341-371.

Griffin, J.E. and Brown, P.J. (2010). Inference with Normal-Gamma prior distributions in regression problems. Bayesian Analysis, 5, 171-188.

Kastner, G. (forthcoming). Dealing with stochastic volatility in time series using the R package stochvol. Journal of Statistical Software.

Kastner, G. and Frühwirth-Schnatter, S. (2014). Ancillarity-sufficiency interweaving strategy (ASIS) for boosting MCMC estimation of stochastic volatility models. Computational Statistics and Data Analysis, 76, 408-423.

Kastner, G., Frühwirth-Schnatter, S., and Lopes, H.F. (2014). Analysis of exchange rates via multivariate Bayesian factor stochastic volatility models. In: The Contribution of Young Researchers to Bayesian Statistics – Proceedings of BAYSM2013, Springer Proceedings in Mathematics & Statistics, 63, Lanzarone, E. and Ieva, F. (Eds.), 181-186.

Yu, Y. and Meng, X.-L. (2011). To center or not to center: that is not the question - An ancillarity-sufficiency interweaving strategy (ASIS) for boosting MCMC efficiency. Journal of Computational and Graphical Statistics, 20, 531-570.