Modeling Volatility of S&P 500 Index Daily Returns: A comparison between model based forecasts and implied volatility
Huang Kun
Department of Finance and Statistics Hanken School of Economics Vasa
2011
HANKEN SCHOOL OF ECONOMICS Department of: Finance and Statistics
Type of work: Thesis
Author: Huang Kun
Date: April, 2011
Title of thesis: Modeling Volatility of S&P 500 Index Daily Returns: A comparison between model based forecasts and implied volatility
Abstract: The objective of this study is to investigate the predictive ability of model based forecasts and the VIX index in forecasting the future volatility of S&P 500 index daily returns. The study period is from January 1990 to December 2010 and includes 5291 observations. A variety of time series models were estimated, including the random walk model and the GARCH (1,1), GJR (1,1) and EGARCH (1,1) models. The results indicate that GJR (1,1) outperforms the other time series models in out-of-sample forecasting. The forecast performance of VIX, GJR (1,1) and RiskMetrics was compared using various approaches. The empirical evidence does not support the view that implied volatility subsumes all information content, and the results provide strong evidence that GJR (1,1) outperforms VIX and RiskMetrics for modeling the future volatility of S&P 500 index daily returns. Additionally, the results of the encompassing regression for future realized volatility at 5-, 10-, 15-, 30- and 60-day horizons, and the results of the encompassing regression for squared return shocks, suggest that the joint use of GJR (1,1) and RiskMetrics produces the best forecasts. By and large, our findings indicate that implied volatility is inferior for forecasting future volatility, and that the model based forecasts have more explanatory power for future volatility.
Keywords: volatility, S&P 500, GARCH, GJR, RiskMetrics, implied volatility
CONTENTS
1 Introduction
2 Literature Review
3 The CBOE Volatility Index – VIX
3.1 Implied Volatility
3.2 The VIX Index
4 Time Series Models for Volatility Forecasting
4.1 Random Walk Model
4.2 The ARCH (q) Model
4.3 The GARCH (p,q) Model
4.3.2 The Stylized Facts of Volatility
4.4 The GJR (p,q) Model
4.5 The EGARCH (p,q) Model
4.6 RiskMetrics Approach
5 Practical Issues for Model-building
5.1 Test ARCH Effect
5.2 Information Criterion
5.3 Evaluating the Volatility Forecasts
5.3.1 Out-of-sample Forecast
5.3.2 Traditional Evaluation Statistics
6 Data
6.1 S&P 500 Index Daily Returns
6.1.1 Autocorrelation of S&P 500 Index Daily Returns
6.1.2 Testing ARCH Effect of S&P 500 Index Daily Returns
6.2 Properties of the VIX Index
6.3 Study on S&P 500 Index and the VIX Index
6.3.1 Cross-correlation between S&P 500 Index and the VIX Index
6.3.2 S&P 500 Index Daily Returns and the VIX Index
7 Estimation and Discussion
7.1 Model Selection
7.2 Test Numerical Accuracy of GARCH Estimates
7.3 Estimates of Models
7.4 BDS Test
7.5 Graphical Diagnostic
8 Forecast Performance of Model Based Forecasts and VIX
8.1 Out-of-sample Forecast Performance of GARCH Models
8.2 In-sample Forecast Performance of VIX
8.3 Comparing Predictability of Time Series Models and VIX
8.3.1 Correlation between Realized Volatility and Volatility Forecasts
8.3.2 Regression for In-sample Realized Volatility
8.3.3 Residual Tests for Regression of In-sample Realized Volatility
8.3.4 Regression for Out-of-sample Realized Volatility
8.3.5 Residual Tests for Regression of Out-of-sample Realized Volatility
8.3.6 Encompassing Regression for Realized Volatility
8.3.7 Average Squared Deviation
8.3.8 Regression for Squared Return Shocks
8.3.9 Encompassing Regression for Squared Daily Return Shocks
9 Conclusion
References
Appendix A. VIX and Future Realized Volatility
Appendix B. Out-of-sample Forecast Performance on Realized Volatility
Appendix C. Residuals from Regression for Out-of-sample Realized Volatility
TABLES
Table 1. Summary statistics for S&P 500 index daily returns
Table 2. Test for ARCH effect in S&P 500 daily index returns
Table 3. Summary statistics of the VIX index
Table 4. Cross-correlation between S&P 500 index daily returns and implied volatility index
Table 5. Regression results for VIX changes and S&P 500 index daily returns
Table 6. Information criteria for estimated GARCH (p,q) models
Table 7. The summary statistics of estimated volatility models
Table 8. BDS test for serial independence in residuals
Table 9. Forecast performance of GARCH models
Table 10. In-sample forecast performance of VIX and GARCH specifications
Table 11. Correlation between realized volatility and alternative forecasters
Table 12. Performance of regression for in-sample realized volatility
Table 13. Forecast performance on out-of-sample realized volatility
Table 14. Residual tests for regression for in-sample realized volatility
Table 15. Performance of regression for out-of-sample realized volatility
Table 16. Residual tests for regression for out-of-sample realized volatility
Table 17. Encompassing regression for realized volatility
Table 18. The average squared deviation from alternative approaches
Table 19. Regression results for squared return shocks
Table 20. Encompassing regression results for squared return shocks
FIGURES
Figure 1. Daily returns, squared daily returns and absolute daily returns for the S&P 500 index
Figure 2. Autocorrelation of r_t, r_t^2 and |r_t| for S&P 500 index
Figure 3. S&P 500 Index (logarithm) and the VIX Index
Figure 4. S&P 500 index daily returns and the VIX index
Figure 5. S&P 500 index absolute daily returns and the VIX index
Figure 6. Estimates from various GARCH (p,q) models
Figure 7. Graphical residual diagnostics from GARCH (1,1) to S&P 500 returns
1 Introduction
Volatility is computed as the standard deviation of equity returns. Modeling volatility in financial markets is important because volatility is a significant input for the valuation of assets and securities, the measurement of risk, investment decision making and monetary policy making. Stock market volatility is time-varying. The empirical evidence dates back to the pioneering studies of Mandelbrot (1963) and Fama (1965), which demonstrated that large (small) price changes tend to be followed by large (small) price changes, implying that some periods display pronounced volatility clustering. It is widely accepted that volatility changes in financial markets are predictable. Extensive empirical studies have applied various models to forecast future volatility and to measure the predictive ability of volatility forecasts. However, there is little consensus on which model or family of models best describes asset returns. To date, the two most popular approaches for forecasting future volatility are considered to be the Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model, which grew out of the ARCH model introduced by Robert Engle (1982), and the RiskMetrics approach introduced by J. P. Morgan (1992). The forecasts of these two approaches are derived from historical data. Additionally, the volatility implied by observed option prices is thought to be an efficient volatility forecast and is becoming more and more popular for volatility forecasting, particularly in the U.S. market. A large body of empirical evidence documents that, in an efficient option market, implied volatility subsumes the forward-looking information contained in all other variables in the market’s information set that help measure volatility over the option’s lifetime.
By and large, the conventional approaches for volatility forecasting fall into two categories: time series models based on historical data, and volatility implied by observed option prices. The GARCH model is the natural extension of the autoregressive conditional heteroscedasticity (ARCH) model, which was thought to be a good description of stock returns and an efficient technique for estimating and analyzing time-varying volatility in stock returns. The seminal ARCH (q) model was pioneered by Engle (1982). It represents the conditional variance of returns as a function of the squared returns of the past q periods and estimates it via a maximum likelihood procedure rather than using the sample standard deviation. However, the ARCH (q) model has some limitations. For example, it is unclear how to decide the appropriate number of lags of the squared residual in the model; a large value of q may induce a nonparsimonious conditional variance model; and the non-negativity constraints might be violated. Some problems of the ARCH (q) model can be overcome by the GARCH (p,q) model, which incorporates additional dependence on p lags of past volatility, so that the variance of the residuals is modeled by an autoregressive moving average ARMA (p,q) process replacing the AR (q) process of the ARCH (q) model. The GARCH (p,q) model is widely used in practice. Extensive empirical evidence suggests that the GARCH (p,q) model is more parsimonious than the ARCH (q) model and provides a framework for deeper time-varying volatility estimation. One outstanding feature of the GARCH (p,q) model is that it can effectively remove the excess kurtosis in returns. In particular, the GARCH (1,1) model is widely recognized as the most popular framework for modeling the volatilities of many financial time series. However, the standard symmetric GARCH (p,q) model also has some limitations. First, the requirement that the conditional variance be positive may be violated by the estimated model. The only way to avoid this problem is to constrain the coefficients to be positive. The second limitation is that it cannot explain the leverage effect, although it performs well in explaining volatility clustering and leptokurtosis in a time series.
Thirdly, the standard GARCH (p,q) model does not allow direct feedback between the conditional mean and the conditional variance. In order to overcome the limitations of the standard symmetric GARCH (p,q) model, a number of extensions have been introduced, such as the asymmetric GJR (p,q) and EGARCH (p,q) models, which can better capture the dynamics of time series and make the modeling more flexible.
As another conventional approach for volatility forecasting, implied volatility is the volatility implied by observed option prices and computed with option pricing formulas, such as the Black-Scholes formula, which is widely used in practice. The parameters required for computing an option price with the Black-Scholes model are the stock price, strike price, risk-free interest rate, time to maturity, volatility and dividend. Being the only unobserved parameter, implied volatility is thought to represent the market consensus on future volatility, because an option is priced on the basis of future payoffs. Today, implied volatility indices are constructed and published by stock exchanges in many countries, and it is widely held that implied volatility indices have superior predictive ability for future stock market volatility. A common question regarding implied volatility is whether the option price subsumes all relevant information about future volatility. A large body of empirical evidence from previous studies (e.g., Fleming, Ostdiek and Whaley 1995, Christensen and Prabhala 1998, Giot 2005a, Giot 2005b, Corrado and Miller, Jr. 2005, Giot and Laurent 2006, Frijns, Tallau and Tourani-Rad 2008, Becker, Clements and McClelland 2009, Becker, Clements and Coleman-Fenn 2009, Frijns, Tallau and Tourani-Rad 2010) demonstrates that implied volatility is a forward-looking measure of market volatility. However, the poor predictive power of implied volatility was also indicated by some studies, such as Day and Lewis (1992), Canina and Figlewski (1993), Becker, Clements and White (2006), Becker, Clements and White (2007) and Becker and Clements (2008).
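As a rough illustration of how a volatility is backed out of an observed option price, the sketch below inverts the Black-Scholes call formula by bisection. The function names, the continuous dividend yield and the bracketing interval are assumptions made for this example and are not taken from the study; the VIX itself is computed by the CBOE with its own methodology, which this sketch does not reproduce.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_price(spot, strike, rate, maturity, vol, div_yield=0.0):
    """Black-Scholes price of a European call with a continuous dividend yield."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike)
          + (rate - div_yield + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return (spot * math.exp(-div_yield * maturity) * norm_cdf(d1)
            - strike * math.exp(-rate * maturity) * norm_cdf(d2))


def implied_vol(price, spot, strike, rate, maturity, div_yield=0.0,
                lo=1e-6, hi=5.0, tol=1e-10):
    """Find the volatility that reproduces `price`, by bisection.

    The call price is increasing in volatility, so the root is unique
    whenever the price lies between the values at `lo` and `hi`.
    """
    low_price = bs_call_price(spot, strike, rate, maturity, lo, div_yield)
    high_price = bs_call_price(spot, strike, rate, maturity, hi, div_yield)
    if not low_price <= price <= high_price:
        raise ValueError("price is outside the bracketed volatility range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call_price(spot, strike, rate, maturity, mid, div_yield) > price:
            hi = mid  # model price too high: volatility must be lower
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Round trip with illustrative inputs: price an option at 20% volatility,
# then recover that volatility from the price alone.
example_price = bs_call_price(100.0, 100.0, 0.02, 0.25, 0.20)
example_vol = implied_vol(example_price, 100.0, 100.0, 0.02, 0.25)
```

All other inputs are observable, which is why the volatility is the only quantity left to solve for.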
The objective of our study is to investigate whether the model based forecasts or the CBOE volatility index (the VIX index published by the Chicago Board Options Exchange) is superior in forecasting the future volatility of S&P 500 index daily returns. The data used for our study ranges from January 1990 to December 2010. There are several reasons why we use the VIX index. First, it is based on the S&P 500 index, which is considered to be the core index for the U.S. equity market. Second, VIX is widely regarded as the market’s expectation of the volatility of the S&P 500 index. Third, VIX has a considerable set of historical data spanning over 20 years. Finally, the information content and performance of VIX have been examined by a large number of empirical studies using various approaches, but the results conflict. Therefore, it is interesting to examine the performance of VIX in our own study. The time series models studied in this paper include the random walk model, the ARCH (q) model, the GARCH (p,q) model, the GJR (p,q) model, the EGARCH (p,q) model and the RiskMetrics approach. We first estimated the parameters of the respective time series models, and then examined their out-of-sample forecast performance. Our empirical evidence suggests that the GJR (1,1) model performs best for modeling S&P 500 index future returns. Next, the predictive power of GJR (1,1), the RiskMetrics approach and VIX was compared using different approaches. We performed regressions of future realized volatility at different forecasting horizons for both in-sample and out-of-sample periods, and studied the forecasting performance on the average daily return shocks. To guard against spurious inferences, diagnostic tests of the residuals were conducted. Our results are in line with Becker, Clements and White (2006), Becker, Clements and White (2007) and Becker and Clements (2008). The empirical evidence of our study does not support the view that implied volatility subsumes all information content, and the results provide strong evidence that GJR (1,1) is superior for modeling the future volatility of S&P 500 index daily returns. Additionally, the results of the encompassing regression for future realized volatility at 5-, 10-, 15-, 30- and 60-day horizons, and the results of the encompassing regression for squared return shocks, suggest that the joint use of GJR (1,1) and RiskMetrics produces the best forecasts. The rest of this paper is structured as follows. Section 2 reviews the literature.
In section 3, the implied volatility and the VIX index are introduced. The time series models and practical issues for modeling are detailed in section 4 and section 5, respectively. Section 6 outlines the data used for our study. The estimates of time series models are discussed in section 7. Section 8 presents the empirical results of comparison between VIX, RiskMetrics and GJR(1,1). Finally, section 9 concludes.
2 Literature Review
The ability of the ARCH (q) model to predict the volatility of equity returns has been studied extensively. However, empirical evidence of good forecast performance of the ARCH (q) model is sporadic. The previous studies by Franses and van Dijk (1996), Brailsford and Faff (1996) and Figlewski (1997) examined the out-of-sample forecast performance of ARCH (q) models, and their results conflict. However, the common ground of their studies is that the regression of realized volatility produces a quite low R². Since the average R² is smaller than 0.1, they suggested that the ARCH (q) model has weak predictive power for future volatility. A variety of restrictions influence the forecasting performance of ARCH models. The frequency of the data is one such restriction, and it is an issue widely discussed in preceding papers. Nelson (1992) studied the ARCH model and documented that the ARCH model using high frequency data performs well for volatility forecasting, even when the model is severely misspecified. However, its out-of-sample forecasting ability for medium- and long-term volatility is poor. The existing literature on GARCH type models can be classified into two categories: investigations of the basic symmetric GARCH models, and investigations of GARCH models with various volatility specifications. Wilhelmsson (2006) investigated the forecast performance of the basic GARCH (1,1) model by estimating S&P 500 index future returns with nine different error distributions, and found that allowing for a leptokurtic error distribution leads to significant improvements in variance forecasts compared to using the normal distribution. Additionally, the study found that allowing for skewness and time variation in the higher moments of the distribution does not further improve forecasts.
Chuang, Lu and Lee (2007) studied the volatility forecasting performance of the standard GARCH models based on a group of distributional assumptions in the context of stock market indices and exchange rate returns. They found that the GARCH model combined with the logistic distribution, the scaled Student’s t distribution and the RiskMetrics model are preferable for both stock markets and foreign exchange markets. However, a complex distribution does not always outperform a simpler one. Franses and van Dijk (1996) examined the predictive ability of the standard symmetric GARCH model as well as the asymmetric Quadratic GARCH (QGARCH) and GJR models for weekly stock market volatility forecasting, and the results indicated that the QGARCH model has the best forecasting ability for stock returns within the sample period. Brailsford and Faff (1996) investigated the predictive power of various models for the volatility of the Australian stock market. They tested the random walk model, the historical mean model, the moving average model, the exponential smoothing model, the exponentially weighted moving average model, the simple regression model, the symmetric GARCH models and two asymmetric GJR models. The empirical evidence suggested that the GJR model is the best for forecasting the volatility of Australian stock market returns. Chong, Ahmad and Abdullah (1999) compared the stationary GARCH, unconstrained GARCH, non-negative GARCH, GARCH-M, exponential GARCH and integrated GARCH models, and they found that exponential GARCH (EGARCH) performs best in describing the often-observed skewness in stock market indices and in out-of-sample (one-step-ahead) forecasting. Awartani and Corradi (2005) studied the predictive ability of different GARCH models, focusing particularly on the predictive content of the asymmetric component. The results show that GARCH models allowing for asymmetries in volatility produce more accurate volatility predictions. Evans and McMillan (2007) studied the forecasting performance of nine competing models for the daily volatility of stock market returns in 33 economies. The empirical results show that GARCH models allowing for asymmetries and long-memory dynamics provide the best forecast performance.
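To make the role of the asymmetric component concrete, the sketch below runs the conditional variance recursion of a GJR (1,1) model in its common textbook form, h[t+1] = omega + (alpha + gamma * I[eps_t < 0]) * eps_t^2 + beta * h[t], which reduces to the symmetric GARCH (1,1) recursion when gamma is zero. The parameter values are illustrative assumptions only, not estimates from any of the studies cited, and the function name is hypothetical.

```python
def gjr_variance_path(shocks, omega, alpha, gamma, beta, initial_variance):
    """Conditional variances implied by a GJR (1,1) recursion.

    Returns a list with len(shocks) + 1 entries: the starting variance
    followed by the variance implied after each shock. The last entry is
    the one-step-ahead variance forecast.
    """
    variances = [initial_variance]
    for eps in shocks:
        # Negative shocks get the extra weight gamma (asymmetric response).
        arch_weight = alpha + (gamma if eps < 0 else 0.0)
        next_variance = omega + arch_weight * eps ** 2 + beta * variances[-1]
        variances.append(next_variance)
    return variances


# Illustrative parameters (assumed, not estimated from data).
shocks = [0.01, -0.01]
symmetric = gjr_variance_path(shocks, 1e-6, 0.05, 0.0, 0.90, 1e-4)
asymmetric = gjr_variance_path(shocks, 1e-6, 0.05, 0.10, 0.90, 1e-4)
```

With these inputs the two paths agree after the positive shock and diverge after the negative one, where the asymmetric model produces the larger variance.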
By and large, extensive empirical studies demonstrate that GARCH models allowing for asymmetries perform very well for modeling future volatility. The EWMA model is also a widely used technique for modeling and forecasting the volatility of equity returns in financial markets, and the well-known RiskMetrics approach is essentially a variant of EWMA. Existing studies applying the EWMA model to various markets show that its performance varies. Akgiray (1989) first examined the forecast performance of the EWMA technique for volatility forecasting of stocks on the NYSE. The study also examined the predictive ability of ARCH and GARCH models. The findings indicated that the EWMA model is useful for forecasting time series; however, the GARCH model performs best for forecasting volatility. Tse (1991) studied the volatility of stock returns in the Japanese market during the period from 1986 to 1989 using ARCH, GARCH and EWMA models. The results revealed that the EWMA model outperforms the ARCH and GARCH models for volatility forecasting of stock returns on the Tokyo Stock Exchange during the sample period. Tse and Tung (1992) investigated monthly volatility movements in the Singapore stock market using three different volatility forecasting models: the naive method based on the historical sample variance, EWMA and GARCH. The results suggested that the EWMA model is the best for predicting the volatility of monthly returns in the Singapore market. Walsh and Tsou (1998) investigated the volatility of an Australian index from January 1, 1993 to December 31, 1995 using a variety of forecasting techniques: historical volatility, an improved extreme-value method, the ARCH/GARCH class of models, and the EWMA model. Hourly, daily and weekly data were used, respectively. The findings indicated that the EWMA model outperforms the other volatility forecasting techniques within the sample period.
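The EWMA recursion discussed above can be sketched in a few lines: each new variance is a weighted average of the previous variance and the latest squared return. The decay factor of 0.94 is the value commonly quoted for daily data in the RiskMetrics documentation and is used here as an assumption; the function name and the sample returns are hypothetical.

```python
def ewma_variance_path(returns, decay=0.94, initial_variance=None):
    """Exponentially weighted moving average of squared returns.

    variance[t+1] = decay * variance[t] + (1 - decay) * returns[t] ** 2

    Returns len(returns) + 1 values; the last one is the forecast for the
    next period. If no starting variance is given, the first squared
    return is used as a simple (assumed) initialisation.
    """
    if initial_variance is None:
        initial_variance = returns[0] ** 2
    variances = [initial_variance]
    for r in returns:
        variances.append(decay * variances[-1] + (1.0 - decay) * r ** 2)
    return variances


# Hypothetical daily returns of 1% and -2%.
ewma_path = ewma_variance_path([0.01, -0.02], decay=0.94, initial_variance=1e-4)
```

Unlike the GARCH recursion, there is no constant term and the two weights sum to one, so the forecast has no long-run level to revert to.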
Galdi and Pereira (2007) examined and compared the efficiency of the EWMA model, the GARCH model and stochastic volatility (SV) for Value at Risk (VaR). The empirical results demonstrated that VaR calculated by the EWMA model was violated less often than VaR calculated by the GARCH models and SV for a sample of 1500 observations. Patev, Kanaryan and Lyroudi (2009) studied volatility forecasting on thin emerging stock markets, focusing primarily on the Bulgarian stock market. Three different models were employed: RiskMetrics, EWMA with t-distribution and EWMA with GED distribution. The results suggested that both EWMA with t-distribution and EWMA with GED distribution perform well for modeling and forecasting the volatility of stock returns in the Bulgarian market. They also concluded that the EWMA model can be used effectively for volatility forecasting on emerging markets.
Implied volatility is another popular issue which has attracted a great deal of attention in empirical research. In particular, the information content of implied volatility is the subject of many studies, and it has been well documented that implied volatility is an efficient volatility forecast and that it subsumes all information contained in other variables. The predictive ability of model based forecasts and implied volatility has been compared in a number of studies, with the objective of determining whether implied volatility or model based forecasts are superior for forecasting future volatility. The implied volatility from index options has been widely studied, but the results conflict. The studies by Day and Lewis (1992), Canina and Figlewski (1993), Becker et al. (2006), Becker et al. (2007) and Becker and Clements (2008) demonstrated that historical data subsumes important information that is not incorporated into option prices, suggesting that implied volatility performs poorly in volatility forecasting.
However, the empirical evidence from the studies by Poterba and Summers (1986), Sheikh (1989), Harvey and Whaley (1992), Fleming, Ostdiek and Whaley (1995), Christensen and Prabhala (1998), Blair, Poon and Taylor (2001), Poon and Granger (2001), Mayhew and Stivers (2003), Giot (2005a), Giot (2005b), Corrado and Miller, Jr. (2005), Giot and Laurent (2006), Frijns et al. (2008), Becker, Clements and McClelland (2009), Becker, Clements and Coleman-Fenn (2009) and Frijns et al. (2010) documented that the implied volatilities from index options can capture most of the relevant information in the historical data.
The implied volatility index (VIX) from the CBOE is widely used in empirical research on implied volatility. The VIX index was originally the volatility implied by the option prices of the S&P 100 index, and the calculation method was changed in 2003. Today, the VIX index is computed from the option prices of the S&P 500 index. Therefore, the empirical literature on VIX can be classified into two categories: VIX based on the S&P 100 index and VIX based on the S&P 500 index. Most studies found the volatility implied by S&P 100 index option prices to be a biased and inefficient forecast of future volatility and to contain little or no incremental information beyond that in past realized volatility. Day and Lewis (1992) examined the volatility implied by the call option prices of the S&P 100 index for the period from 1985 to 1989 using cross-sectional regression. The information content of implied volatility was compared to the conditional volatility of GARCH and EGARCH models for both in-sample and out-of-sample periods. The in-sample information content of implied volatility was examined by the likelihood ratio of the nested conditional volatility GARCH and EGARCH models augmented with implied volatility as an exogenous variable. The out-of-sample forecast performance of implied volatility and the GARCH and EGARCH models was studied by regressing the ex post volatility on implied volatility and on the volatility forecasts from the GARCH and EGARCH models. The results show that implied volatility is biased and inefficient. A drawback of their study may be the use of overlapping samples to predict one-week-ahead volatility from options which have a remaining life of up to 36 days.
Canina and Figlewski (1993) showed that implied volatility has virtually no correlation with future return volatility and does not incorporate information contained in recently observed volatility. According to the analysis by Canina and Figlewski (1993), one reason for their results could be that the market for S&P 100 index options (OEX) processes volatility information inefficiently. The second reason is that the Black-Scholes option pricing model may not be suitable for pricing index options because of the prohibitive transaction costs associated with hedging options in the cash index market. However, the Black-Scholes model does not require continuous trading in cash markets. Christensen and Prabhala (1998) mentioned that Constantinides (1994) argued that transaction costs have no first-order effect on option prices. Therefore, transaction costs cannot explain the apparent failure of the Black-Scholes model for the OEX options market. It seems that the results of Canina and Figlewski (1993) refute the basic principle of option pricing theory (Christensen and Prabhala 1998).
The study by Christensen and Prabhala (1998) built on the study by Canina and Figlewski (1993). They reinvestigated the relation between implied volatility and realized volatility in the OEX options market, and they reached different results. Their findings indicate that implied volatility outperforms past volatility in forecasting future volatility and subsumes the information content of past volatility in some of their specifications. Christensen and Prabhala (1998) argued that one reason their results differ from those of Canina and Figlewski (1993) is that they used a longer volatility series, and ‘this increases statistical power and allows for evolution in the efficiency of the market for OEX index options since their introduction in 1983’. Their sample data ranges from November 1983 to May 1995, which equals 11.5 years. However, the data used by Canina and Figlewski (1993) was from March 15, 1983 to March 28, 1987, and this period preceded the October 1987 crash. Christensen and Prabhala (1998) documented that there was a regime shift around the crash period, and that implied volatility is more biased before the crash.
The second reason is that they used monthly data to sample the implied and realized volatility series, while daily data was used by Canina and Figlewski (1993). The lower frequency of data enables them to ‘construct volatility series with nonoverlapping data with exactly one implied and one realized volatility covering each time period’, and their ‘nonoverlapping sample yields more reliable regression estimates relative to less precise and potentially inconsistent estimates obtained from overlapping samples used in previous work’.
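The nonoverlapping sampling and the forecast regressions described above can be sketched as follows. Realized volatility is taken here to be the square root of the sum of squared daily returns within each window, which is one common definition and an assumption on our part; the helper names are hypothetical, and the small least squares routine reports coefficients only, without the standard errors a real encompassing test would need.

```python
import math


def realized_volatility(returns, horizon):
    """Realized volatility over consecutive NON-overlapping windows.

    Each window of `horizon` daily returns is used exactly once, so
    neighbouring observations share no return. A trailing partial
    window is dropped.
    """
    last_start = len(returns) - horizon
    return [math.sqrt(sum(r * r for r in returns[i:i + horizon]))
            for i in range(0, last_start + 1, horizon)]


def ols_coefficients(y, regressors):
    """OLS estimates [intercept, b1, b2, ...] from the normal equations.

    `regressors` is a list of equally long columns, e.g. one column of
    implied volatilities and one of model based forecasts for an
    encompassing regression of realized volatility on both.
    """
    n = len(y)
    rows = [[1.0] + [col[i] for col in regressors] for i in range(n)]
    k = len(rows[0])
    # Augmented system [X'X | X'y].
    system = [[sum(row[i] * row[j] for row in rows) for j in range(k)]
              + [sum(row[i] * y[t] for t, row in enumerate(rows))]
              for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(system[r][col]))
        system[col], system[pivot] = system[pivot], system[col]
        for r in range(k):
            if r != col:
                factor = system[r][col] / system[col][col]
                system[r] = [a - factor * b
                             for a, b in zip(system[r], system[col])]
    return [system[i][k] / system[i][i] for i in range(k)]
```

In an encompassing regression of this kind, a forecast that subsumes the other would leave the other forecast's coefficient indistinguishable from zero; judging that requires standard errors, which this sketch leaves out.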
Blair et al. (2001) compared ARCH models and VIX based on the S&P 100 index using both daily index returns and intraday returns. The data ranges from November 1983 to May 1995, spanning a period of 139 months, which is approximately 11.5 years. The results indicate that VIX performs very well in volatility forecasting and that the volatility forecasts are unbiased. The technique for computing VIX was improved in 2003. Since the new computation is based on the option prices of the S&P 500 index rather than the S&P 100 index, the evaluation of the performance of VIX in forecasting the future volatility of the S&P 500 index became the subject of most empirical research. However, the results of the various studies also conflict. Corrado and Miller, Jr. (2005) studied the implied volatility indices VIX, VXO and VXN, which are based on the S&P 500, S&P 100 and Nasdaq 100 indices, respectively. The study period spans 16 years from January 1988 to December 2003. They compared the results of OLS regression to the estimates derived from instrumental variable regression, and the results documented that the implied volatility indices VIX, VXO and VXN dominate historical realized volatility. In particular, VXN is nearly unbiased and can produce more efficient forecasts than realized volatility. Giot and Laurent (2006) investigated the information content of both the VIX and VXO implied volatility indices. The data used for their study ranges from January 1990 to May 2003. The information content was evaluated by running an encompassing regression on the jump/continuous components of historical volatility, with implied volatility added as an additional variable. The results show that implied volatility subsumes most relevant volatility information. They also indicated that the addition of the jump/continuous components hardly affects the explanatory power of the encompassing regression.
Becker, Clements and McClelland (2009) examined the information content of VIX by seeking answers to two questions: first, whether the VIX index subsumes information regarding how historical jump activity contributed to price volatility; second, whether the VIX reflects any incremental information pertaining to future jump activity relative to model based forecasts. The empirical results of their study provide affirmative answers to these two questions. Becker, Clements and Coleman-Fenn (2009) compared model based forecasts and VIX. They argued that the unadjusted implied volatility is inferior. However, the transformed VIX augmented with the volatility risk premium can perform as well as model based forecasts. The results of Becker et al. (2006), Becker et al. (2007) and Becker and Clements (2008) refute the hypothesis that VIX is an efficient volatility forecast. The same data set was used for these three studies, ranging from January 1990 to October 2003. The results indicate that there is a significant and positive relationship between VIX and future volatility, but that the VIX is an inefficient volatility forecast.
Several variables determine the computed implied volatility, such as the index level, the risk-free interest rate, dividends, the contractual provisions of the option and the observed option price. Measurement errors in these variables may lead to biased estimates of implied volatility. Since the implied volatilities used by early studies contain measurement errors whose magnitudes are unknown, this may be the primary reason for the conflicting results of the various studies. In addition, bias in the implied volatility estimate can also be induced by other factors, for example: the relatively infrequent trading of the stocks in the index; the use of closing prices when the stock and options markets have different closing times; and bid or ask price effects, which may cause the first order autocorrelation of the implied volatility series to be negative. Compared to index options, studies based on individual stock options are sporadic.
The study by Latané and Rendleman (1976) was conducted with the expectation of favoring implied volatility; however, its results are less than overwhelming because the study predates the development of conditional heteroskedasticity models and relied on naive models of historical volatility.
Lamoureux and Lastrapes (1993) examined implied volatility based on the option prices of 10 stocks over a short 2-year period from April 1982 to March 1984. They demonstrated that implied volatility is biased and inefficient, and that the GARCH model performs better in modeling the conditional variance. They also found that when implied volatility was included as a state variable in the GARCH conditional variance equation, historical return shocks still provided important additional information beyond that reflected in option prices. Their results are difficult to interpret because they used overlapping samples to examine the one day ahead forecasting ability of implied volatility computed from options with a much longer remaining life, of up to 129 trading days.

Building on the theory and methodology of Lamoureux and Lastrapes (1993), Mayhew and Stivers (2003) examined the 50 firms with the highest option volume traded on the CBOE between 1988 and 1995, and they used the daily time series of the volatility index (VIX) from the CBOE. During this period, the VIX represented the implied volatility of an at-the-money option on the S&P 100 Index with 22 trading days to expiration. Their results show that implied volatility outperforms the GARCH specification. In addition, when implied volatility is added to the conditional variance equation, it captures most or all of the relevant information in past return shocks, at least for stocks with actively traded options. Furthermore, they documented that return shocks from period t-2 and older provide reliable incremental volatility information for only a few firms in the sample. Finally, they found that the implied volatility from equity index options provides incremental information about firm-level conditional volatility. For most of the firms, index implied volatility contains information beyond that in past return shocks, suggesting an alternative method for modeling volatility for stocks without traded options. For a small number of firms with less actively traded individual options, the index implied volatility provides incremental information beyond the firm's own implied volatility. Therefore, equity index options appear to impound systematic volatility information that is not available from less liquid stock options.
Frijns et al. (2008) and Frijns et al. (2010) studied the return volatility of the Australian stock market over different periods. Because no implied volatility index is published by the Australian Stock Exchange, Frijns et al. (2010) computed an implied volatility index, the AVX, from the European style index options traded on the Australian Securities Exchange. The construction of the AVX is similar to the way the CBOE computes the VIX. Its distinctive feature is that the implied volatilities of eight near-the-money options are combined into a single at-the-money implied volatility index with a constant time to maturity of three months (Frijns et al. 2010: 31). The AVX is therefore regarded as the forecast of the return volatility of the S&P/ASX 200 over the subsequent three months. The results demonstrate that implied volatility outperforms RiskMetrics and GARCH and provides important information for forecasting the future return volatility of the Australian stock market. Furthermore, it is proposed that the AVX could be valuable information for investors, corporations and financial institutions. To summarize, the empirical results of the more recent studies favor the conclusion that implied volatility is more efficient and informative for forecasting the future volatility of asset returns.
3 The CBOE Volatility Index-VIX
3.1 Implied Volatility
Implied volatility is a forecast of process volatility rather than an estimate of it, and its horizon is given by the maturity of the option. In a constant volatility framework, implied volatility is the volatility of the underlying asset price process that is implicit in the market price of an option according to a particular model. If the process volatility is stochastic, implied volatility is regarded as the average volatility of the underlying asset price process that is implicit in the market price of an option (Alexander, 2001:22). Option prices can be computed using various models; the Black-Scholes model is widely used in practice for pricing European options. The theoretical price and the observed market price of an option may differ, and the implied volatility is the volatility value that makes the two prices equal (Alexander, 2001). It is well recognized that different options on the same underlying asset can generate different implied volatilities. Furthermore, using different data can lead to irreconcilably different inferences about parameter values. Since implied volatility is regarded as the market's forecast of the volatility of the asset underlying an option, its calculation is closely tied to the option valuation model. Blair et al. (2001) argued that the use of an inappropriate option valuation model can lead to mis-measurement of implied volatilities. For example, if the implied volatilities of S&P 100 index options are calculated with a European model, an error is caused by omitting the early exercise feature, because these are American style options. In addition, Harvey and Whaley (1992) showed that if the option pricing model includes the early exercise option but the timing and level of dividends are assumed to be constant, the option will be mispriced and implied volatilities will be mis-measured.
3.2 The VIX Index
The VIX index was introduced by the Chicago Board Options Exchange (CBOE) in 1993. Using the implied volatilities of various near-the-money options on the S&P 100 index, Whaley (1993) introduced the VIX index on the basis of a synthetic at-the-money option with a constant time to maturity of one month, and demonstrated that the VIX index is not only an efficient index of market volatility but could also be employed for hedging purposes by introducing options and futures on the VIX. The calculation approach was changed on September 22, 2003, and VIX is now calculated from the bid and ask quotes of options on the S&P 500 index rather than the S&P 100 index. The S&P 500 index is the most popular underlying asset as well as the most widely used benchmark in the U.S. market.

Before the change in the calculation approach, the VIX index based on the S&P 100 index was a weighted index of American implied volatilities derived from eight near-the-money, near-to-expiry S&P 100 call and put options, and it was considered able to eliminate smile effects and most mis-measurement problems. It used binomial valuation methods with trees adjusted to reflect the actual amount and timing of anticipated cash dividends. The midpoint of the most recent bid/ask quotes was used as the option price, which was considered to avoid problems induced by bid/ask bounce. Both call and put options were used in order to increase the amount of information and to eliminate problems caused by mis-measurement of the underlying index and by put/call option clientele effects. The VIX based on the S&P 100 index represents a hypothetical option that is at-the-money and has a constant 22 trading days (30 calendar days) to expiry. It employed pairs of near-the-money exercise prices just above and just below the current index price. In addition, a pair of times to expiry was used: one contract with at least eight calendar days to expiration and the contract of the following month. Blair et al. (2001) showed that although VIX is robust to mis-measurement, it is still a biased predictor of subsequent volatility because of a trading time adjustment that typically multiplies conventional implied volatilities by approximately 1.2.
The new calculation approach brings the VIX index much closer to actual financial practice and has made it the practical standard for trading and hedging volatility. It is widely accepted as the market's expected volatility of the S&P 500 index. Since the computation incorporates a wide range of exercise prices, the VIX index based on the S&P 500 index has become more robust. In addition, VIX is computed directly from option prices rather than backed out of the Black-Scholes option pricing model (Ahoniemi 2006). The popularity of VIX is growing rapidly and it has become the main index of U.S. stock market volatility. So far, VIX has been tradable through both options and futures, with a history of about six years. According to CBOE proprietary information (2009), VIX is computed from at-the-money and out-of-the-money call and put option prices using the formula
$$\sigma^2 = \frac{2}{T}\sum_i \frac{\Delta K_i}{K_i^2}\, e^{rT} Q(K_i) - \frac{1}{T}\left[\frac{F}{K_0} - 1\right]^2 \tag{1}$$

where $\sigma$ denotes VIX divided by 100, $T$ is the time to maturity, $r$ is the risk-free interest rate, $F$ is the forward index level computed from the index option prices, $K_0$ denotes the first strike below the forward index level $F$, $K_i$ is the strike price of the $i$th out-of-the-money option (a call if $K_i > K_0$; a put if $K_i < K_0$; both call and put if $K_i = K_0$), $Q(K_i)$ stands for the midpoint of the bid-ask spread for each option with strike $K_i$, and $\Delta K_i$ is the interval between strike prices, calculated as the difference between the strikes on either side of $K_i$ divided by two, $\Delta K_i = (K_{i+1} - K_{i-1})/2$.
Since VIX forecasts the 30-day volatility of the S&P 500 index, near-term and next-term put and call options from the first two contract months are used to compute VIX. Near-term options must have at least one week to expiration in order to minimize pricing anomalies that may occur close to expiration. If the near-term options have less than one week to expiration, the calculation rolls to the next two contract months (CBOE proprietary information 2009).
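As a rough illustration of how equation (1) combines option quotes, the following Python sketch evaluates the variance for a single expiry. The function name, the argument layout and the treatment of the lowest and highest strikes (using the distance to the single adjacent strike) are assumptions made for illustration. The sketch also leaves out the selection of out-of-the-money quotes and the interpolation between near-term and next-term expiries, and the quotes in the usage lines are hypothetical.

```python
import math

def vix_variance(strikes, quotes, forward, r, T):
    """Variance sigma^2 of equation (1) for one expiry.

    strikes: strike prices K_i sorted in ascending order
    quotes:  bid-ask midpoints Q(K_i), one per strike
    forward: forward index level F
    r, T:    risk-free rate and time to maturity in years
    """
    # K_0: highest listed strike not above the forward level (assumed convention)
    k0 = max(k for k in strikes if k <= forward)
    last = len(strikes) - 1
    total = 0.0
    for i, (k, q) in enumerate(zip(strikes, quotes)):
        # Delta K_i: half the distance between the neighbouring strikes;
        # at either end only one neighbour exists (assumption)
        if i == 0:
            dk = strikes[1] - strikes[0]
        elif i == last:
            dk = strikes[last] - strikes[last - 1]
        else:
            dk = (strikes[i + 1] - strikes[i - 1]) / 2.0
        total += dk / k ** 2 * math.exp(r * T) * q
    return (2.0 / T) * total - (1.0 / T) * (forward / k0 - 1.0) ** 2

# Hypothetical quotes, for illustration only
sigma2 = vix_variance([90.0, 100.0, 110.0], [1.0, 2.0, 1.0], 100.0, 0.0, 30 / 365)
vix_level = 100.0 * math.sqrt(sigma2)  # VIX is quoted as 100 * sigma
```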
4 Time Series Models for Volatility Forecasting
4.1 Random Walk Model
The random walk model is perhaps the simplest model for the volatility of a time series. Under the efficient market hypothesis, stock price indices are virtually random. The standard model for estimating the volatility of stock returns by ordinary least squares is the random walk model based on historical prices:

$$r_t = \mu + \varepsilon_t \tag{2}$$

where $r_t$ denotes the stock index return at time $t$; $\mu$ is the average return, which is expected to equal zero under the efficient market hypothesis; and $\varepsilon_t$ is the error term at time $t$, whose autocovariances should equal zero over time.
4.2 The ARCH (q) Model
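Engle (1982) introduced the autoregressive conditional heteroskedasticity ARCH (q) model and documented that serially correlated squared returns (conditional heteroskedasticity) can be modeled using an ARCH (q) model. The framework of the ARCH (q) model is:

$$r_t = \mu_t + \varepsilon_t \tag{3}$$

$$\varepsilon_t = z_t \sigma_t \tag{4}$$

$$\sigma_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \dots + \alpha_q \varepsilon_{t-q}^2 \tag{5}$$

where $\mu_t$ denotes the conditional mean given the information set available at time $t-1$, and $z_t$ represents a sequence of iid random variables with zero mean and unit variance. The parameter constraints $\alpha_0 > 0$ and $\alpha_i \ge 0$ $(i = 1, \dots, q)$ ensure that the conditional variance $\sigma_t^2$ is non-negative. Equation (5) for $\sigma_t^2$ can be expressed as an AR (q) process for the squared residuals:

$$\varepsilon_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \dots + \alpha_q \varepsilon_{t-q}^2 + u_t \tag{6}$$

where $u_t = \varepsilon_t^2 - \sigma_t^2$ is a martingale difference sequence (MDS) since $E_{t-1}[u_t] = 0$, and it is assumed that $E[\varepsilon_t^2] < \infty$ (Zivot 2008:4). The condition for $\varepsilon_t$ to be covariance stationary is that the sum of the parameters on the past squared residuals, $\alpha_1, \dots, \alpha_q$, is smaller than unity. The persistence of $\varepsilon_t^2$ and $\sigma_t^2$ is measured by $\sum_{i=1}^{q}\alpha_i$, and the unconditional variance of $\varepsilon_t$ is $\alpha_0 / (1 - \sum_{i=1}^{q}\alpha_i)$.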
4.3 The GARCH (p,q) Model
The generalized ARCH (GARCH) model, proposed by Bollerslev (1986), is an extension of the ARCH model. It assumes that the conditional variance also depends on its own previous lags, and it replaces the AR (q) representation in equation (5) with an ARMA (p,q) process:

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 \tag{7}$$

where the parameter constraints $\alpha_0 > 0$, $\alpha_i \ge 0$ $(i = 1, \dots, q)$ and $\beta_j \ge 0$ $(j = 1, \dots, p)$ assure that $\sigma_t^2 > 0$. Equation (7) together with equations (3) and (4) is known as the basic GARCH (p,q) model. If $p = 0$, the GARCH (p,q) model reduces to an ARCH (q) model. For the coefficients of the GARCH terms to be identified, at least one of the ARCH parameters $\alpha_i$ $(i = 1, \dots, q)$ must be significantly different from zero. For the basic GARCH (p,q) model, the squared residuals $\varepsilon_t^2$ behave like an ARMA process. Covariance stationarity requires $\sum_{i=1}^{q}\alpha_i + \sum_{j=1}^{p}\beta_j < 1$. The unconditional variance of $\varepsilon_t$ is computed as:

$$\operatorname{var}(\varepsilon_t) = \frac{\alpha_0}{1 - \left(\sum_{i=1}^{q}\alpha_i + \sum_{j=1}^{p}\beta_j\right)} \tag{8}$$
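In practice, the GARCH (1,1) model, which has only three parameters in the conditional variance equation, is usually sufficient to capture the volatility clustering in the data. The conditional variance equation of the GARCH (1,1) model is

$$\sigma_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 \tag{9}$$

Since $u_t = \varepsilon_t^2 - \sigma_t^2$, equation (9) can be rewritten as

$$\varepsilon_t^2 = \alpha_0 + (\alpha_1 + \beta_1)\varepsilon_{t-1}^2 + u_t - \beta_1 u_{t-1} \tag{10}$$

Equation (10) is an ARMA (1,1) process for $\varepsilon_t^2$, from which many properties of the GARCH (1,1) model follow. For instance, the persistence of the conditional volatility is captured by $\alpha_1 + \beta_1$, and the constraint $\alpha_1 + \beta_1 < 1$ assures covariance stationarity. The covariance stationary GARCH (1,1) model has an ARCH ($\infty$) representation with coefficients $\alpha_1 \beta_1^{i-1}$ on $\varepsilon_{t-i}^2$, and the unconditional variance of $\varepsilon_t$ is $\bar{\sigma}^2 = \alpha_0 / (1 - \alpha_1 - \beta_1)$ (Zivot, 2008:6).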
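The GARCH (1,1) recursion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is hypothetical, the parameters are taken as given rather than estimated, and the recursion is started at the unconditional variance $\alpha_0/(1-\alpha_1-\beta_1)$, which is one common but not the only choice of starting value.

```python
def garch11_variance(residuals, alpha0, alpha1, beta1):
    """Conditional variances sigma_t^2 from the GARCH(1,1) recursion.

    sigma2[t] is the variance for period t given residuals up to t-1.
    """
    if alpha1 + beta1 >= 1.0:
        raise ValueError("alpha1 + beta1 must be below 1 for covariance stationarity")
    # Start at the unconditional variance (assumed initialisation)
    unconditional = alpha0 / (1.0 - alpha1 - beta1)
    sigma2 = [unconditional]
    for eps in residuals[:-1]:
        # sigma_t^2 = alpha0 + alpha1 * eps_{t-1}^2 + beta1 * sigma_{t-1}^2
        sigma2.append(alpha0 + alpha1 * eps ** 2 + beta1 * sigma2[-1])
    return sigma2

# Hypothetical residuals and parameters, for illustration only
path = garch11_variance([1.0, -2.0, 0.5], alpha0=0.1, alpha1=0.1, beta1=0.8)
```

With these illustrative values the large shock of -2.0 in the second period raises the conditional variance for the third period above its long run level, which is the volatility clustering mechanism described above.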
4.3.1 The Stylized Facts of Volatility
The stylized facts about the volatility of economic and financial time series have been studied extensively. The most important are volatility clustering, leptokurtosis, volatility mean reversion and the leverage effect.

Volatility clustering can be interpreted through the GARCH (1,1) model in equation (9). For many daily or weekly financial time series, a distinctive feature is that the coefficient estimate of the GARCH term is close to 0.9. This implies that large (small) values of the conditional variance tend to be followed by large (small) values. The same conclusion follows from the ARMA representation of GARCH models in equation (10): large changes in $\varepsilon_{t-1}^2$ will be followed by large changes in $\varepsilon_t^2$, and small changes in $\varepsilon_{t-1}^2$ will be followed by small changes in $\varepsilon_t^2$ (Zivot, 2008).

Compared with the normal distribution, the distributions of high frequency data usually have fatter tails and excess peakedness at the mean. This fact is known as leptokurtosis, and it suggests the frequent presence of extreme values. Kurtosis measures the peakedness of the distribution of a time series relative to that of a normally distributed random variable with constant mean and variance, and it is calculated as a function of the residuals $\varepsilon_t$ and their variance $\sigma^2$:

$$\text{kurtosis} = \frac{E[\varepsilon_t^4]}{(\sigma^2)^2} \tag{11}$$
The kurtosis of a normal distribution is three, so its excess kurtosis, defined as kurtosis minus three, is zero. A distribution with zero excess kurtosis is known as mesokurtic. A distribution with kurtosis larger than three (positive excess kurtosis) is referred to as leptokurtic, and a distribution with kurtosis smaller than three (negative excess kurtosis) is said to be platykurtic.

Financial markets sometimes experience excessive volatility, but volatility appears ultimately to return to its mean level. The unconditional variance of the residuals of the standard GARCH (1,1) model is $\bar{\sigma}^2 = \alpha_0 / (1 - \alpha_1 - \beta_1)$. To show that volatility is eventually driven back to this long run level, we rewrite the ARMA representation in equation (10) as:

$$\varepsilon_t^2 - \bar{\sigma}^2 = (\alpha_1 + \beta_1)(\varepsilon_{t-1}^2 - \bar{\sigma}^2) + u_t - \beta_1 u_{t-1} \tag{12}$$

and, by successively iterating k times,

$$\varepsilon_{t+k}^2 - \bar{\sigma}^2 = (\alpha_1 + \beta_1)^k (\varepsilon_t^2 - \bar{\sigma}^2) + \eta_{t+k} \tag{13}$$

where $\eta_{t+k}$ is a moving average process. Since $\alpha_1 + \beta_1 < 1$ is required for a covariance stationary GARCH (1,1) model, $(\alpha_1 + \beta_1)^k$ approaches zero as k increases without bound. Although $\varepsilon_t^2$ may deviate from the long run level at time t, the expected deviation $\varepsilon_{t+k}^2 - \bar{\sigma}^2$ approaches zero as k becomes larger, which implies that volatility will eventually return to its long run level $\bar{\sigma}^2$. The half-life of a volatility shock is the average time for $|\varepsilon_t^2 - \bar{\sigma}^2|$ to decrease by one half, and it is measured by $\ln(0.5) / \ln(\alpha_1 + \beta_1)$. Therefore, the speed of mean reversion is governed by $\alpha_1 + \beta_1$: if $\alpha_1 + \beta_1$ is close to one, the half-life of a volatility shock will be very long; if $\alpha_1 + \beta_1 > 1$, the GARCH model is non-stationary and the volatility will ultimately explode to infinity as k increases (Zivot 2008:8).

The standard GARCH (p,q) model enforces a symmetric response of volatility to positive and negative shocks, because its conditional variance equation is a function of the lagged residuals but not of their signs, i.e. the sign is lost when the lagged residuals are squared (Brooks, 2008). Therefore, the standard GARCH (p,q) model cannot capture the asymmetric effect in the distribution of returns, which is also known as the leverage effect. One alternative is to augment the conditional variance equation with an asymmetry term. Another approach is to allow the residuals to have an asymmetric distribution (Zivot 2008). To overcome this limitation of the standard GARCH (p,q) model, a number of extensions have been proposed, such as the GJR and the exponential GARCH (EGARCH) models.
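To make the half-life expression $\ln(0.5)/\ln(\alpha_1+\beta_1)$ concrete, the short Python sketch below evaluates it for two persistence levels. The function name and the parameter values are hypothetical and chosen only to illustrate how quickly the half-life grows as $\alpha_1 + \beta_1$ approaches one.

```python
import math

def garch11_half_life(alpha1, beta1):
    """Half-life (in periods) of a volatility shock in a stationary GARCH(1,1)."""
    persistence = alpha1 + beta1
    if not 0.0 < persistence < 1.0:
        raise ValueError("half-life is defined only for 0 < alpha1 + beta1 < 1")
    return math.log(0.5) / math.log(persistence)

# Illustrative parameter values, not estimates from this study
moderate = garch11_half_life(0.05, 0.90)  # persistence 0.95, roughly 13.5 periods
high = garch11_half_life(0.09, 0.90)      # persistence 0.99, roughly 69 periods
```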
4.4 The GJR (p,q) Model
The GJR (p,q) model is built on the assumption that unexpected changes in market returns have different effects on the conditional variance of returns depending on their sign. Compared with the basic GARCH (p,q) model, the GJR (p,q) model adds a term that accounts for possible asymmetries. For the (1,1) case, the conditional variance is given by:

$$\sigma_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 + \gamma\, \varepsilon_{t-1}^2 I(\varepsilon_{t-1} < 0) \tag{14}$$

where $I(\cdot)$ represents the dummy variable that takes the value one if $\varepsilon_{t-1} < 0$ and zero otherwise. If $\gamma > 0$, a leverage effect is present, meaning that negative shocks have a larger impact on the conditional variance than positive shocks; if $\gamma \ne 0$, the news impact is asymmetric. Since the conditional variance must be positive, the parameter constraints are $\alpha_0 > 0$, $\alpha_1 > 0$, $\beta_1 \ge 0$ and $\alpha_1 + \gamma \ge 0$. The model is therefore still admissible even if $\gamma < 0$, provided that $\alpha_1 + \gamma \ge 0$. The model is stationary if $\alpha_1 + \beta_1 + \gamma/2 < 1$.
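The asymmetric response of the GJR model can be illustrated with a one-step update of the conditional variance. The sketch below assumes the GJR (1,1) form, in which the indicator equals one for a negative lagged residual; the function name and the parameter values are hypothetical.

```python
def gjr11_next_variance(eps_prev, sigma2_prev, alpha0, alpha1, beta1, gamma):
    """One-step GJR(1,1) update of the conditional variance."""
    # Indicator I(eps_{t-1} < 0): switches on the extra term for bad news
    indicator = 1.0 if eps_prev < 0 else 0.0
    return (alpha0
            + alpha1 * eps_prev ** 2
            + beta1 * sigma2_prev
            + gamma * eps_prev ** 2 * indicator)

# Shocks of equal size but opposite sign (illustrative parameters)
params = dict(alpha0=0.1, alpha1=0.05, beta1=0.8, gamma=0.1)
after_good_news = gjr11_next_variance(2.0, 1.0, **params)
after_bad_news = gjr11_next_variance(-2.0, 1.0, **params)
```

With $\gamma > 0$ the negative shock produces the larger next-period variance, while setting $\gamma = 0$ reduces the update to the symmetric GARCH (1,1) case.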
4.5 The EGARCH (p,q) Model
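The exponential GARCH (EGARCH) model introduced by Nelson (1991) incorporates the leverage effect and specifies the conditional variance in logarithmic form. The conditional variance equation of the EGARCH model is expressed as:

$$\ln \sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \frac{|\varepsilon_{t-i}| + \gamma_i \varepsilon_{t-i}}{\sigma_{t-i}} + \sum_{j=1}^{p} \beta_j \ln \sigma_{t-j}^2 \tag{15}$$

If $\varepsilon_{t-i} > 0$, i.e. good news arrives, the total effect of $\varepsilon_{t-i}$ is $(1 + \gamma_i)|\varepsilon_{t-i}|$; if $\varepsilon_{t-i} < 0$, i.e. bad news arrives, the total effect of $\varepsilon_{t-i}$ is $(1 - \gamma_i)|\varepsilon_{t-i}|$.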
The EGARCH model has three advantages over the basic GARCH model. First, since the conditional variance is modeled in logarithmic form, the variance is always positive even if the parameters are negative. Under appropriate conditions on the parameters, the specification captures the fact that a negative shock leads to a higher conditional variance in the next period than a positive shock. Second, asymmetries are allowed in the EGARCH model: if the relationship between volatility and returns is negative, the parameter of the asymmetry term, $\gamma$, will be negative. Third, the EGARCH model is stationary and has finite kurtosis if $|\beta_1| < 1$. Thus, the positivity, stationarity and finite fourth order moment restrictions impose no restriction on the leverage effect that the model can represent.
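The effect of the asymmetry term can be checked numerically with a one-step EGARCH (1,1) update. The sketch assumes the specification in which the log variance depends on $(|\varepsilon_{t-1}| + \gamma \varepsilon_{t-1})/\sigma_{t-1}$, so that good news has total effect $(1+\gamma)|\varepsilon_{t-1}|$ and bad news $(1-\gamma)|\varepsilon_{t-1}|$; the function name and parameter values are hypothetical.

```python
import math

def egarch11_next_variance(eps_prev, sigma2_prev, alpha0, alpha1, gamma, beta1):
    """One-step EGARCH(1,1) update; returns sigma_t^2 (always positive)."""
    sigma_prev = math.sqrt(sigma2_prev)
    log_variance = (alpha0
                    + alpha1 * (abs(eps_prev) + gamma * eps_prev) / sigma_prev
                    + beta1 * math.log(sigma2_prev))
    # Exponentiating guarantees a positive variance for any parameter signs
    return math.exp(log_variance)

# Equal-sized shocks of opposite sign with a negative asymmetry parameter
after_good_news = egarch11_next_variance(1.0, 1.0, 0.0, 0.1, -0.5, 0.9)
after_bad_news = egarch11_next_variance(-1.0, 1.0, 0.0, 0.1, -0.5, 0.9)
```

With $\gamma < 0$ the negative shock yields the higher next-period variance, in line with the negative relationship between returns and volatility described above.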
4.6 RiskMetrics Approach
The RiskMetrics approach was introduced by J.P. Morgan (1992). It is a variation of the exponentially weighted moving average (EWMA) model, which can be expressed as

$$\sigma_t^2 = (1 - \lambda) \sum_{j=1}^{\infty} \lambda^{j-1} (r_{t-j} - \bar{r})^2 \tag{16}$$

where $\bar{r}$ denotes the average return estimated from the observations, which is assumed to be zero in the RiskMetrics approach as well as in many empirical studies, and $\lambda$ is the decay factor determining the weights given to recent and older observations. The choice of the value of $\lambda$ is important. Although $\lambda$ can be estimated, it is conventionally set to 0.94 for daily data and 0.97 for monthly data, as recommended by the RiskMetrics approach. Explicitly, the RiskMetrics model is specified as

$$\sigma_t^2 = \lambda \sigma_{t-1}^2 + (1 - \lambda) r_{t-1}^2 \tag{17}$$
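The recursive form of the RiskMetrics model is straightforward to compute. The following Python sketch assumes a zero mean return and, as a labeled assumption, starts the recursion from the first squared return unless another starting variance is supplied; the function name is hypothetical.

```python
def riskmetrics_variance(returns, lam=0.94, initial_variance=None):
    """EWMA variance forecasts: sigma_t^2 = lam * sigma_{t-1}^2 + (1 - lam) * r_{t-1}^2.

    sigma2[t] is the forecast for period t using returns up to t-1.
    """
    if initial_variance is None:
        # Assumed starting value; other initialisations are possible
        initial_variance = returns[0] ** 2
    sigma2 = [initial_variance]
    for r in returns[:-1]:
        sigma2.append(lam * sigma2[-1] + (1.0 - lam) * r ** 2)
    return sigma2

# Hypothetical daily returns in percent, for illustration only
forecasts = riskmetrics_variance([1.0, 2.0, 0.0], lam=0.94, initial_variance=1.0)
```

Because the decay factor is fixed rather than estimated, the model has no parameters to fit, which is part of its practical appeal as a benchmark.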
5 Practical Issues for Model-building
5.1 Test ARCH Effect
Volatility clustering is reflected in the autocorrelation of squared and absolute returns, or of the squared residuals from the estimated conditional mean equation (Zivot, 2008). There are different approaches for testing for ARCH effects; two conventional methods are the Ljung-Box (1978) statistic and the Lagrange multiplier (LM) test suggested by Engle (1982). Denoting the i-lag autocorrelation of the squared or absolute returns by $\hat{\rho}_i$, the Ljung-Box statistic is computed as:

$$Q(p) = T(T+2) \sum_{i=1}^{p} \frac{\hat{\rho}_i^2}{T - i} \sim \chi^2(p) \tag{18}$$

The statistic of the LM test is given by

$$LM = T \cdot R^2 \sim \chi^2(q) \tag{19}$$

where q represents the number of restrictions placed on the model, T denotes the total number of observations, and $R^2$ is obtained from the regression in equation (6). The hypotheses of the LM test are:

$H_0$: $\alpha_1 = \dots = \alpha_q = 0$ (suggesting there is no ARCH effect)
$H_1$: at least one $\alpha_i \ne 0$ (suggesting there is an ARCH effect)

Lee and King (1993) documented that the LM test can also be used to test for GARCH effects. Lumsdaine and Ng (1999) argued that the LM test can fail if the conditional mean equation is misspecified, since this can lead to serial correlation in the estimated residuals as well as in the squared estimated residuals.
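A minimal Python sketch of the Ljung-Box statistic is given below. It is written in plain Python for transparency rather than speed, the function names are hypothetical, and the input series is artificial; in an ARCH test the function would be applied to squared (or absolute) returns and the result compared with a chi-square critical value with p degrees of freedom.

```python
def sample_autocorrelation(x, lag):
    """Sample autocorrelation of the series x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denominator = sum((v - mean) ** 2 for v in x)
    numerator = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return numerator / denominator

def ljung_box(x, max_lag):
    """Ljung-Box Q statistic: T (T + 2) * sum_i rho_i^2 / (T - i)."""
    n = len(x)
    return n * (n + 2) * sum(
        sample_autocorrelation(x, i) ** 2 / (n - i) for i in range(1, max_lag + 1)
    )

# Artificial, strongly autocorrelated series used only to exercise the code
series = [1.0, -1.0] * 10
q_stat = ljung_box(series, 1)
# The 5% critical value of a chi-square with 1 degree of freedom is 3.841
reject_no_autocorrelation = q_stat > 3.841
```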
5.2 Information Criterion
An important issue in model building is the determination of the orders of the ARCH and GARCH terms in the conditional variance equation. Because a GARCH model can be regarded as an ARMA process for the squared residuals, the conventional information criteria can be used for model selection. Three widely used information criteria are the Akaike information criterion (AIC), Schwarz's Bayesian information criterion (SBIC) and the Hannan-Quinn criterion (HQIC), and their respective algebraic expressions are:

$$AIC = \ln(\hat{\sigma}^2) + \frac{2k}{T} \tag{20}$$

$$SBIC = \ln(\hat{\sigma}^2) + \frac{k}{T} \ln T \tag{21}$$

$$HQIC = \ln(\hat{\sigma}^2) + \frac{2k}{T} \ln(\ln T) \tag{22}$$

where $\hat{\sigma}^2$ denotes the variance of the residuals, T represents the sample size, and k is the total number of estimated parameters, i.e. $k = p + q + 1$ for a GARCH (p,q) model. The model with the smallest value of AIC, SBIC or HQIC is considered to be the best one. However, a common empirical finding is that the GARCH (1,1) model is difficult to beat.
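The three criteria can be computed side by side as in the sketch below, which assumes the residual-variance forms AIC = ln(σ̂²) + 2k/T, SBIC = ln(σ̂²) + (k/T) ln T and HQIC = ln(σ̂²) + (2k/T) ln(ln T). The function name and the input values are hypothetical.

```python
import math

def information_criteria(residual_variance, n_obs, n_params):
    """Return (AIC, SBIC, HQIC) based on the residual variance."""
    log_var = math.log(residual_variance)
    aic = log_var + 2.0 * n_params / n_obs
    sbic = log_var + n_params / n_obs * math.log(n_obs)
    hqic = log_var + 2.0 * n_params / n_obs * math.log(math.log(n_obs))
    return aic, sbic, hqic

# Illustrative inputs: k = p + q + 1 = 3 for a GARCH(1,1)
aic, sbic, hqic = information_criteria(residual_variance=1.0, n_obs=100, n_params=3)
```

For samples of this size the penalty per parameter is largest for SBIC and smallest for AIC, so SBIC tends to select the more parsimonious specification.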
5.3 Evaluating the Volatility Forecasts
5.3.1 Out-of-sample Forecast
The predictive ability of the estimated models is often evaluated by their out-of-sample forecast performance. Two common approaches to out-of-sample forecasting are the recursive forecast and the rolling forecast. The recursive forecast has a fixed initial estimation date; the sample is extended by one observation and the model is re-estimated at each step. For L step ahead forecasts, this process continues until no more L step ahead forecasts can be computed. The rolling forecast uses an in-sample period of fixed length, i.e. both the start and the end dates of the estimation sample move forward by one observation and the model is re-estimated at each step. For L step ahead forecasts, this process likewise continues until no more L step ahead forecasts can be computed (Brooks, 2008).
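The difference between the two schemes is easiest to see in terms of the estimation windows they generate. The sketch below, restricted to one step ahead forecasts for brevity, yields the index range of each estimation sample; the function name and the zero-based indexing are assumptions made for illustration.

```python
def estimation_windows(n_obs, first_forecast, scheme="recursive"):
    """Yield (start, end) index pairs; observations start..end-1 are used to
    estimate the model and observation `end` is the one-step-ahead target."""
    window_length = first_forecast
    for target in range(first_forecast, n_obs):
        if scheme == "recursive":
            start = 0                       # fixed start, growing sample
        elif scheme == "rolling":
            start = target - window_length  # fixed length, moving sample
        else:
            raise ValueError("scheme must be 'recursive' or 'rolling'")
        yield start, target

recursive = list(estimation_windows(5, 3, "recursive"))
rolling = list(estimation_windows(5, 3, "rolling"))
```

In the recursive scheme the second window covers observations 0 to 3, while in the rolling scheme it covers observations 1 to 3, so the window length stays at three.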
5.3.2 Traditional Evaluation Statistics
In most empirical studies, four error measurements are widely used to evaluate the forecast performance of the estimated models: the root mean square error (RMSE), the mean absolute error (MAE), the mean absolute percent error (MAPE), and Theil's U-statistic. These measurements are expressed as:

$$RMSE = \sqrt{\frac{1}{T - (T_1 - 1)} \sum_{t=T_1}^{T} \left(\hat{\sigma}_t^2 - \sigma_t^2\right)^2} \tag{23}$$

$$MAE = \frac{1}{T - (T_1 - 1)} \sum_{t=T_1}^{T} \left|\hat{\sigma}_t^2 - \sigma_t^2\right| \tag{24}$$

$$MAPE = \frac{100}{T - (T_1 - 1)} \sum_{t=T_1}^{T} \left|\frac{\hat{\sigma}_t^2 - \sigma_t^2}{\sigma_t^2}\right| \tag{25}$$

$$U = \frac{\sqrt{\sum_{t=T_1}^{T} \left(\dfrac{\hat{\sigma}_t^2 - \sigma_t^2}{\sigma_t^2}\right)^2}}{\sqrt{\sum_{t=T_1}^{T} \left(\dfrac{\hat{\sigma}_{b,t}^2 - \sigma_t^2}{\sigma_t^2}\right)^2}} \tag{26}$$

where T represents the total number of observations and $T_1$ is the first out-of-sample forecast observation. The model is therefore estimated on observations 1 to $(T_1 - 1)$, and observations $T_1$ to T are used for out-of-sample forecasting. $\sigma_t^2$ and $\hat{\sigma}_t^2$ denote the actual and the estimated conditional variance at time t, respectively, and $\hat{\sigma}_{b,t}^2$ is obtained from a benchmark model, which is often a simple model such as the random walk model.

RMSE provides a quadratic loss function. It is particularly useful when large forecast errors are disproportionately more serious than small ones; when large errors do not cause serious problems, this heavy weighting becomes a disadvantage of RMSE (Brooks, 2008). MAE measures the average absolute forecast error. Although the functional forms of RMSE and MAE are simple, they are not invariant to scale transformations, and their symmetry, which penalizes over- and under-prediction equally, is not realistic in some cases (Yu, 2002).

MAPE measures the percentage error and is bounded from below by zero. It is useful for comparing the performance of the estimated models with that of the random walk model: for a random walk in the log levels, MAPE takes the value one (100 when expressed in percent as in equation (25)). An estimated model with a smaller MAPE than this is therefore considered better than the random walk model. However, MAPE is not reliable if the series takes absolute values smaller than one (Brooks, 2008).

Since the forecast errors of the benchmark model enter Theil's U-statistic, the forecast errors are standardized. The U-statistic can be used to compare the estimated model with the benchmark model. If the U-statistic equals one, the estimated model has the same accuracy as the benchmark model. If the U-statistic is smaller than one, the estimated model is considered better than the benchmark model (Brooks, 2008). Compared with MAE, Theil's U-statistic is invariant to scalar transformations, but it is also symmetric (Yu, 2002).
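The four evaluation statistics can be computed together over the out-of-sample period, as in the following sketch. The function name, the dictionary output and the numerical inputs are hypothetical; the three lists are assumed to contain only out-of-sample observations, with the actual values standing for a volatility proxy such as realized variance.

```python
import math

def forecast_error_measures(actual, forecast, benchmark):
    """RMSE, MAE, MAPE (in percent) and Theil's U over the out-of-sample period."""
    n = len(actual)
    errors = [f - a for a, f in zip(actual, forecast)]
    rmse = math.sqrt(sum(e ** 2 for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 / n * sum(abs(e / a) for e, a in zip(errors, actual))
    # Theil's U: relative squared errors of the model versus the benchmark
    model_term = sum((e / a) ** 2 for e, a in zip(errors, actual))
    bench_term = sum(((b - a) / a) ** 2 for a, b in zip(actual, benchmark))
    theil_u = math.sqrt(model_term / bench_term)
    return {"rmse": rmse, "mae": mae, "mape": mape, "theil_u": theil_u}

# Hypothetical variances, for illustration only
measures = forecast_error_measures(
    actual=[1.0, 2.0], forecast=[1.5, 2.0], benchmark=[2.0, 1.0]
)
```

In this toy example Theil's U is below one, so the hypothetical model would be judged more accurate than the hypothetical benchmark.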
6 Data
The data used for our empirical study are daily returns and daily implied volatilities of the S&P 500 index over 5291 trading days in a 21-year period. The in-sample period ranges from 3 January 1990 to 31 December 2009, providing 5039 daily observations, followed by the out-of-sample period from 2 January 2010 to 31 December 2010, comprising 252 daily observations.
6.1 S&P 500 Index Daily Returns
Daily returns from the S&P 500 index are defined in the standard way as the natural logarithm of the ratio of consecutive daily closing levels. Index returns are adjusted for dividends. Denoting the price at the end of trading day t by $P_t$, the log return or continuously compounded return is computed as:

$$r_t = 100 \times \log\left(P_t / P_{t-1}\right) \tag{27}$$
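A minimal Python sketch of the return calculation is shown below. The function name and the price values are hypothetical; the prices are assumed to be consecutive daily closing levels.

```python
import math

def log_returns(prices):
    """Percentage log returns r_t = 100 * log(P_t / P_{t-1})."""
    return [100.0 * math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

# Hypothetical closing levels, for illustration only
returns = log_returns([100.0, 110.0, 99.0])
```

A convenient property of log returns is that they add up over time: the sum of the two daily returns above equals the log return over the whole two-day period.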
Table 1 shows standard summary statistics for both the full sample and the yearly sub-periods, along with the Jarque-Bera test for normality. The latter is defined as:

$$JB = T \cdot \left[\frac{S^2}{6} + \frac{(K - 3)^2}{24}\right] \tag{28}$$
where S and K represent the sample skewness and kurtosis, respectively. Our null hypothesis is that the observations are iid (independently and identically distributed) normal. JB is asymptotically distributed as chi-square with two degrees of freedom. As can be seen, the average daily return over the full sample period is 0.024% and the daily (annual) standard deviation is 1.17% (18.57%). As is expected for a time series of returns, the average daily returns of both the full sample period and all sub-periods are close to zero, and most of them are slightly positive. There is a large difference between the maximum and minimum returns, which is a common feature of index returns. The time-varying standard deviation indicates that there is considerable fluctuation in S&P 500 daily returns. The distribution of daily index returns over the full sample period is clearly non-normal, with negative skewness and pronounced excess kurtosis. Skewness is negative in 13 sub-periods and slightly positive in the other 8; kurtosis exceeds 3 in all sub-periods except 1999, 2004 and 2005. The information in Table 1 indicates that the distribution of the observations does not match our assumption of normality.

Table 1. Summary statistics for S&P 500 index daily returns

Period    Obs.      Mean      Max.      Min.    Median Std. Dev.  Skewness  Kurtosis        JB
All       5291   0.02366   10.9572  -9.46951   0.05222   1.17112  -0.19939  11.86668  17367.04
1990       252  -0.03392   3.13795  -3.07110   0.10574   1.00134  -0.16909   3.62153  5.257010
1991       252   0.09268   3.66421  -3.72717  -0.00908   0.89962   0.17191   4.95451  41.35232
1992       254   0.01720   1.54441  -1.87401   0.00475   0.60972   0.05634   3.23772  0.732460
1993       253   0.02695   1.90943  -2.42929   0.00867   0.54192  -0.17885   5.41942  63.05525
1994       252  -0.00616   2.11232  -2.29358   0.01293   0.62069  -0.29147   4.27654  20.67846
1995       252   0.11647   1.85818  -1.55830   0.09443   0.49127  -0.07153   4.08430  12.56164
1996       254   0.07264   1.92519  -3.13120   0.05538   0.74320  -0.61248   4.75474  48.46755
1997       251   0.10761   4.98869  -7.11275   0.18832   1.14970  -0.67569   9.42657  451.0362
1998       252   0.09381   4.96460  -7.04376   0.14023   1.28147  -0.61991   7.72505  250.5634
1999       252   0.07078   3.46586  -2.84590   0.03313   1.13707   0.06162   2.86455  0.352110
2000       252  -0.04242   4.65458  -6.00451  -0.03791   1.40018   0.00075   4.38816  20.23325
2001       248  -0.05635   4.88840  -5.04679  -0.06114   1.35822   0.02048   4.44777  21.67631
2002       252  -0.10561   5.57443  -4.24234  -0.17836   1.63537   0.42507   3.66104  12.17688
2003       252   0.09291   3.48136  -3.58671   0.12758   1.07374   0.05323   3.75894  6.166869
2004       252   0.03417   1.62329  -1.64550   0.06359   0.69883  -0.11016   2.86226  0.708838
2005       252   0.01173   1.95440  -1.68168   0.05587   0.64773  -0.01553   2.84928  0.248659
2006       251   0.05087   2.13358  -1.84963   0.09829   0.63098   0.10281   4.15534  14.40212
2007       251   0.01382   2.87896  -3.53427   0.08083   1.00926  -0.49408   4.44814  32.14436
2008       253  -0.19206   10.9572  -9.46951   0.00000   2.58401  -0.03373   6.67544  142.4539
2009       252   0.08361   6.83664  -5.42620   0.18690   1.71760  -0.06047   4.85098  36.12797
2010       252   0.04774   4.30347  -3.97557   0.07988   1.13778  -0.21103   4.95993  42.20451

Figure 1 plots the daily log returns, squared returns and absolute returns of the S&P 500 index over the whole study period from January 03, 1990 to December 31, 2010. There is no clearly discernible pattern in the log returns, but there is some persistence in the plots of the squared and absolute returns, which represent the volatility of returns. In particular, the plots show evidence of volatility clustering: low values of volatility tend to be followed by low values, and high values of volatility by high values.
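The Jarque-Bera statistics reported in Table 1 can be checked directly from the tabulated skewness and kurtosis using equation (28). The Python sketch below does this for the full sample and for 1999; the function name is hypothetical, and 5.991 is the 5% critical value of a chi-square distribution with two degrees of freedom.

```python
def jarque_bera(n_obs, skewness, kurtosis):
    """JB = T * (S^2 / 6 + (K - 3)^2 / 24)."""
    return n_obs * (skewness ** 2 / 6.0 + (kurtosis - 3.0) ** 2 / 24.0)

CHI2_2DF_5PCT = 5.991  # 5% critical value, chi-square with 2 degrees of freedom

# Inputs taken from the "All" and "1999" rows of Table 1
jb_full = jarque_bera(5291, -0.19939, 11.86668)
jb_1999 = jarque_bera(252, 0.06162, 2.86455)

reject_full = jb_full > CHI2_2DF_5PCT   # normality rejected for the full sample
reject_1999 = jb_1999 > CHI2_2DF_5PCT   # not rejected for 1999
```

Both values agree with the table up to rounding of the inputs, and the comparison with the critical value shows that normality is strongly rejected for the full sample but not for a calm year such as 1999.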
Figure 1. Daily returns, squared daily returns and absolute daily returns for the S&P 500 index (panels: S&P 500 Daily Returns, S&P 500 Squared Daily Returns, S&P 500 Absolute Daily Returns)
6.1.1 Autocorrelations of S&P 500 Index Daily Returns
The sample autocorrelations of the daily log returns, squared returns and absolute returns of the S&P 500 index are presented in Figure 2. An autocorrelation is deemed significant at the 5% level if |autocorrelation| > 1.96⁄√5226. As can be seen, the log returns show no evidence of serial correlation, while the autocorrelations of the squared and absolute returns alternate between positive and negative. Further, the sample autocorrelations of the squared and absolute returns decay slowly, which is evidence of long memory behavior.
Figure 2. Autocorrelation of $r_t$, $r_t^2$ and $|r_t|$ for S&P 500 index (panels: S&P 500 Daily Returns, S&P 500 Squared Daily Returns, S&P 500 Absolute Daily Returns)
6.1.2 Testing ARCH Effect of S&P 500 Index Daily Returns
The presence of ARCH effects is tested with the Ljung-Box test computed from daily squared returns, and with the LM test for different lags of the residuals from the estimation of S&P 500 index daily returns. The summary statistics are presented in Table 2. The results of both the Ljung-Box and the LM tests are statistically significant and indicate the presence of ARCH effects in S&P 500 daily index returns, providing evidence of volatility clustering.
Table 2. Test for ARCH effect in S&P 500 daily index returns

Lag                 1          5         10         15
Ljung-Box      225.51     2089.4     4097.0     5762.2
             (0.0000)   (0.0000)   (0.0000)   (0.0000)
LM             220.59    1208.01    1379.53    1529.60
             (0.0000)   (0.0000)   (0.0000)   (0.0000)

Notes: p-values are in parentheses
6.2 Properties of the VIX Index
Although VIX has potential flaws, it eliminates most of the mis-measurement problems that affect other implied volatility indices. Therefore, we use it as our measure of S&P 500 index implied volatility. Adjusted daily values of VIX at the close of option trading are used. Table 3 presents the summary statistics of the VIX index for the sample period from January 03, 1990 to December 31, 2010. The average level of the implied volatility index is 20.3949% over the sample period. The autocorrelation statistics indicate that the series is highly persistent. The distribution of VIX is non-normal, with positive skewness and excess kurtosis. Since the Augmented Dickey-Fuller test statistic is -4.49 with a p-value of 0.0002, the null hypothesis of a unit root can be rejected at the 1% level.
Table 3. Summary statistics of the VIX index
Mean       Std.Dev    Skewness   Kurtosis   ρ(1)     ρ(2)     ρ(3)     ρ(4)     ρ(5)     ADF
0.203949   0.082424   2.020700   10.26646   0.983*   0.969*   0.959*   0.950*   0.942*   -4.49 (0.0002)
Note: ρ(i) denotes the autocorrelation of the series at lag i; * is significant at 1% level; the p-value is in parentheses.
6.3 Study on S&P 500 Index and the VIX Index
6.3.1 Cross-correlations between S&P 500 Index and the VIX Index
Table 4 presents the cross-correlations between S&P 500 index daily returns and the VIX index for both the full sample and the yearly sub-periods. The contemporaneous cross-correlations for the full sample period and all yearly periods are negative, and those of 15 yearly sub-periods are highly significant. We also observed some
Table 4. Cross-correlation between S&P 500 index daily returns and implied volatility index
Period   Obs.    -2         -1         0           +1          +2
All      5291    0.0135     0.0217     -0.1214*    -0.1085*    -0.00926*
1990     252     0.0463     0.0341     -0.1805*    -0.2036*    -0.1802*
1991     252     0.1840     0.1438     -0.0570     -0.0537     -0.0451
1992     254     0.0352     0.0156     -0.1583*    -0.1789*    -0.1256*
1993     253     0.1210     0.0939     -0.1795*    -0.2546**   -0.1913*
1994     252     0.0403     0.0403     -0.2850*    -0.2743*    -0.2723*
1995     252     -0.0108    -0.0345    -0.2921*    -0.2613*    -0.1789*
1996     254     0.1019     0.0378     -0.3134*    -0.2398*    -0.1970*
1997     251     0.0838     0.0863     -0.1273     -0.1246     -0.0935
1998     252     0.0599     0.0583     -0.1748*    -0.1573*    -0.1258*
1999     252     0.1301     0.1126     -0.2784*    -0.2330*    -0.2706*
2000     252     0.1012     0.0907     -0.2252*    -0.2068*    -0.1223*
2001     248     0.1438     0.1252     -0.1401     -0.1300     -0.1052
2002     252     0.0956     0.0951     -0.1150     -0.1148     -0.0999
2003     252     -0.0223    -0.0084    -0.1088     -0.0974     -0.0801
2004     252     0.0888     0.0788     -0.2422*    -0.2219*    -0.1907*
2005     252     0.1192     0.1524     -0.2500*    -0.1649*    -0.1795*
2006     251     0.0926     0.0598     -0.2606*    -0.2476*    -0.1505*
2007     251     0.0232     0.0788     -0.1839*    -0.1243     -0.1122
2008     253     0.0348     0.0627     -0.1271     -0.0999     -0.0804
2009     252     -0.0313    0.0025     -0.1670*    -0.1394*    -0.1167
2010     252     0.0030     0.0359     -0.2826*    -0.2690*    -0.2314*
significant cross-correlations at other leads for various yearly periods but not for any lags. Figure 3 further confirms the negative relationship between the S&P 500 index and the VIX index, i.e. when the S&P 500 index level peaks, the VIX is at a trough, and vice versa. Two common explanations for the pattern in Figure 3 are the leverage effect and the time-varying risk-premium effect. The leverage effect holds that a decline in the stock price reduces the value of equity and thereby raises leverage, so that the risk of the stock, measured by its volatility, increases. Time-varying
[Figure 3 axes: left axis (log) S&P 500 Index, 4.8 to 8.4; right axis VIX, 0% to 90%; horizontal axis 1990 to 2010]
Figure 3. S&P 500 Index (logarithm) and the VIX Index
risk-premium effect, also known as the volatility feedback effect, suggests that an increase in expected volatility raises the asset's risk premium, which leads to a higher expected return and a decrease in the current stock price.
6.3.2 S&P 500 Index Daily Returns and the VIX Index
The relationship between stock market returns and an implied volatility index was first investigated by Fleming et al. (1995) for the US stock market, who demonstrated a significant negative and asymmetric relationship. The VIX index is widely recognized as an effective proxy for expected volatility. Since the VIX was calculated from S&P 100 index option prices before 2003, it is interesting to study the contemporaneous relationship between S&P 500 index daily returns and the VIX index using 21 years of historical data, and to confirm whether the relationship between the S&P 500 index and the VIX based on it is still negative and asymmetric. Following Fleming et al. (1995), we regressed contemporaneous daily VIX changes on leads and lags of S&P 500 index daily returns. In order to evaluate whether there is an asymmetric contemporaneous relationship between S&P 500 index returns and the VIX index, the absolute daily return at lag zero is included. Additionally, the VIX change at a lag of one is included to control for first-order autocorrelation. The regression has the form:
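$$\Delta VIX_t = \alpha + \sum_{i=-2}^{2} \beta_i R_{t+i} + \beta_{|0|} |R_t| + \gamma \, \Delta VIX_{t-1} + \varepsilon_t \qquad (29)$$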
In line with previous empirical studies by Fleming et al. (1995), Frijns et al. (2008) and Frijns et al. (2010), the coefficient on the contemporaneous return, $\beta_0$, is expected to be negative. If the coefficient on the absolute contemporaneous return, $\beta_{|0|}$, is positive and significant, the relationship between S&P 500 index returns and changes in VIX is asymmetric. Table 5 presents the regression results for VIX changes and intertemporal S&P 500 index daily returns for the full sample and yearly sub-periods. For the full sample period, the sign of $\beta_0$ is as expected. The highly significant $\beta_0$, with a t-statistic of -90.43, confirms the negative contemporaneous relationship between VIX changes and S&P 500 index daily returns. The positive and significant $\beta_{|0|}$, with a t-statistic of 9.62, shows evidence of an asymmetric relationship
Table 5. Regression results for VIX changes and S&P 500 index daily returns
Period  α           β(-1)       β(-2)       β(0)        β(|0|)      β(+1)       β(+2)       γ           R2
All     -0.0926*    0.0738*     -0.0377**   -0.9931*    0.1442*     0.0060      0.0429*     -0.0899*    0.6238
        (-5.30)     (6.70)      (-2.15)     (-90.43)    (9.62)      (-0.55)     (3.92)      (-6.63)
1990    -0.5052*    0.1357      -0.0896     -0.9483*    0.6644*     0.0107      0.0354      -0.1161***  0.4335
        (-3.86)     (1.61)      (-0.86)     (-11.24)    (5.00)      (0.13)      (0.43)      (-1.90)
1991    -0.2125**   0.1071      -0.0408     -0.8401*    0.3743*     -0.0915     0.1276***   -0.1064***  0.3830
        (-2.28)     (1.53)      (-0.49)     (-11.99)    (3.56)      (-1.32)     (1.84)      (-1.70)
1992    -0.1979*    0.1579**    -0.1587**   -0.6374*    0.3952*     -0.1131***  -0.0998     -0.1219**   0.3290
        (-3.33)     (2.54)      (-2.18)     (-10.02)    (4.01)      (-1.79)     (-1.55)     (-2.04)
1993    -0.2033*    0.0970      -0.2013**   -0.6820*    0.5470*     -0.0254     0.0544      -0.1841*    0.3840
        (-3.74)     (1.44)      (-2.56)     (-10.19)    (5.42)      (-0.38)     (0.81)      (-3.12)
1994    -0.0835     0.1325**    -0.2083**   -1.0879*    0.1832***   0.0042      0.0247      -0.2372*    0.5464
        (-1.34)     (1.98)      (-2.21)     (-16.18)    (1.80)      (0.06)      (0.37)      (-3.85)
1995    -0.0549     0.1517**    -0.0173     -0.5777*    0.3325*     -0.0508     -0.1233***  -0.1454**   0.2362
        (-0.93)     (2.08)      (-0.22)     (-7.76)     (3.02)      (-0.70)     (-1.70)     (-2.34)
1996    -0.0971     0.1498*     0.1709**    -0.9191*    0.3105*     -0.0593     0.0488      -0.0997     0.5313
        (-1.44)     (2.41)      (2.06)      (-15.45)    (3.49)      (-0.99)     (0.83)      (-1.62)
1997    -0.2008**   0.0828***   0.1154***   -0.7647*    0.3359*     -0.0428     -0.0817***  0.1188***   0.5582
        (-2.40)     (1.76)      (1.71)      (-16.35)    (4.65)      (-0.90)     (-1.75)     (1.94)
Table 5 (continued)
Period  α           β(-1)       β(-2)       β(0)        β(|0|)      β(+1)       β(+2)       γ           R2
1998    0.0778      0.1566*     0.0944      -1.2421*    0.0134      -0.0496     0.0772***   0.0263      0.7521
        (0.85)      (3.26)      (1.02)      (-26.57)    (0.19)      (-1.06)     (1.66)      (0.42)
1999    0.1539***   -0.0058     0.0182      -1.0018*    -0.1007     -0.0409     0.1346*     -0.1211***  0.7002
        (1.92)      (-0.13)     (0.24)      (-23.15)    (-1.42)     (-0.93)     (3.17)      (-1.92)
2000    -0.1155     0.1335*     0.0234      -0.7496*    0.0969      -0.0123     0.0033      -0.0266     0.6469
        (-1.41)     (3.44)      (0.38)      (-20.02)    (1.61)      (-0.33)     (0.09)      (-0.43)
2001    -0.3038*    0.0622      -0.0326     -0.9238*    0.2281*     -0.0569     -0.0429     -0.1048***  0.7019
        (-3.66)     (1.53)      (-0.46)     (-22.95)    (3.67)      (-1.40)     (-1.03)     (-1.67)
2002    -0.2814*    0.0655***   -0.0315     -0.8930*    0.1655*     -0.0282     0.0079      -0.0252     0.7214
        (-3.02)     (1.83)      (-0.47)     (-25.03)    (2.88)      (-0.79)     (0.22)      (-0.40)
2003    -0.1754*    0.1053*     0.0374      -0.5826*    0.2182*     -0.0149     -0.0280     0.0503      0.4901
        (-2.63)     (2.70)      (0.69)      (-14.80)    (3.50)      (-0.37)     (-0.70)     (0.80)
2004    -0.1414*    0.0601      -0.0546     -0.8820*    0.2789*     -0.0461     -0.0717     -0.1517**   0.6220
        (-2.75)     (1.30)      (-0.77)     (-19.40)    (3.79)      (-1.01)     (-1.57)     (-2.48)
2005    -0.0933**   -0.0002     -0.0207     -0.8988*    0.1887*     0.0041      0.0458      -0.1561**   0.7153
        (-2.28)     (-0.00)     (-0.31)     (-23.59)    (2.98)      (0.11)      (1.21)      (-2.53)
2006    -0.0414     0.1711*     0.0110      -1.1016*    0.1827**    -0.1048***  0.0965***   -0.0296     0.6664
        (-0.80)     (3.18)      (0.12)      (-20.09)    (2.25)      (-1.92)     (1.78)      (-0.47)
Table 5 (continued)
Period  α           β(-1)       β(-2)       β(0)        β(|0|)      β(+1)       β(+2)       γ           R2
2007    -0.0403     0.1296**    -0.0924     -1.3498*    0.1485**    0.0463*     0.0249      -0.1336**   0.7692
        (-0.55)     (2.51)      (-0.92)     (-25.79)    (2.00)      (0.91)      (0.49)      (-2.12)
2008    -0.2082     -0.1042**   -0.1385     -1.2136*    0.0132      0.0223      0.0978**    -0.0868     0.7806
        (-1.42)     (-2.38)     (-1.60)     (-26.88)    (0.23)      (0.51)      (2.26)      (-1.37)
2009    -0.3577*    0.1740*     -0.1078     -0.8979*    0.2638*     0.0871**    0.0126      -0.2142*    0.6910
        (-3.39)     (4.10)      (-1.56)     (-20.94)    (4.29)      (2.03)      (0.29)      (-3.64)
2010    -0.2000**   0.1898*     0.0473      -1.4543*    0.3010*     0.0829      -0.0696     0.0005      0.7185
        (-2.01)     (3.12)      (0.43)      (-23.89)    (3.43)      (1.36)      (-1.15)     (0.01)
Note: *,**,*** indicate significant at 1%, 5% and 10% level, respectively; t-statistics are in parentheses.
between VIX changes and S&P 500 index daily returns. Since the coefficient on the lagged VIX change, $\gamma$, is negative and significant with a t-statistic of -6.63, there is first-order autocorrelation in the VIX changes series. As for lags and leads, the coefficients for lags one and two are significant at the 1% and 5% levels, respectively, and the coefficient for lead two is highly significant at the 1% level, but that for lead one is insignificant. The value of R2 is moderate at 0.6238. The results for all yearly sub-periods in Table 5 are similar to those for the full sample period. The coefficients $\beta_0$ are negative and highly significant at the 1% level for all sub-periods, which demonstrates the negative contemporaneous relationship between VIX changes and S&P 500 index daily returns throughout the full sample. The sign and significance of $\beta_{|0|}$ in each sub-period are as expected except for four yearly sub-periods: 1998, 1999, 2000 and 2008. In general, the positive and highly significant $\beta_{|0|}$ confirms the asymmetric relationship between VIX changes and S&P 500 index daily returns. Although the negative and significant $\gamma$ for the full sample period shows first-order autocorrelation in the VIX changes series, this cannot be observed in most sub-periods, particularly after 1996. We also found that significant lead and lag coefficients are sporadic across sub-periods, particularly from 2000 to 2010. Finally, the minimum R2 across sub-periods is 0.2362 in 1995, and the maximum is 0.7806 in 2008. By and large, the values of R2 are satisfactory. We conclude that the contemporaneous relationship between VIX changes and S&P 500 index daily returns is significantly negative and asymmetric, and that the series of VIX changes has autocorrelation of order one. Figure 4 shows the relationship between S&P 500 index daily returns and the VIX index. As can be seen, in some periods, the positive returns associated with decreased implied volatility are smaller than the negative returns associated with increased implied volatility. The plot in Figure 4 also exhibits the negative and asymmetric relationship between these two series.
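The asymmetry implied by the full-sample estimates in Table 5 can be illustrated with a small worked example. The sketch below uses only the coefficients on the contemporaneous return (-0.9931) and on its absolute value (0.1442), and holds every other term of the regression at zero, which is a simplifying assumption made here for illustration; the function name is hypothetical.

```python
# Full-sample estimates reported in Table 5
BETA_0 = -0.9931      # coefficient on the contemporaneous return
BETA_ABS_0 = 0.1442   # coefficient on the absolute contemporaneous return


def contemporaneous_vix_change(ret):
    """Part of today's VIX change explained by today's return alone.

    All other regressors (intercept, leads, lags, lagged VIX change)
    are held at zero for illustration.
    """
    return BETA_0 * ret + BETA_ABS_0 * abs(ret)


up = contemporaneous_vix_change(1.0)     # return of +1 unit
down = contemporaneous_vix_change(-1.0)  # return of -1 unit
```

A one-unit positive return lowers the VIX by about 0.85, while a one-unit negative return raises it by about 1.14, so negative returns move the VIX more than positive returns of the same size.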
[Figure 4 axes: left axis S&P 500 Daily Returns, -12% to 12%; right axis Implied Volatility Index (VIX), 0% to 80%; horizontal axis 1990 to 2010]
Figure 4. S&P 500 index daily returns and the VIX index
The relationship between S&P 500 index absolute daily returns and the VIX index is also examined, and the time series plot is presented in Figure 5. There is a close relationship between S&P 500 index absolute daily returns and the VIX, because these two time series broadly move in unison during the sample period. Therefore, the VIX performs well in capturing the market volatility of S&P 500
[Figure 5 axes: left axis S&P 500 Absolute Daily Returns, 0% to 18%; right axis Implied Volatility Index (VIX), 0% to 90%; horizontal axis 1990 to 2010]
Figure 5. S&P 500 index absolute daily returns and the VIX index
index returns.
7 Estimation and Discussion
This section starts with the information criteria used to select the orders of the ARCH and GARCH terms of various GARCH (p,q) models. Then, the numerical accuracy of the estimates of the GARCH (p,q) models is examined by comparing them with the estimates from ARCH (p) models. Next, we detail the coefficient estimates of the respective GARCH (p,q) models as well as the benchmark model (random walk model). In addition, the results of the standard diagnostics for the residuals of the estimated models are analyzed. Finally, we report the results of the BDS test and graphical diagnostics for the standardized residuals of the GARCH (p,q) models.
7.1 Model Selection
The information criteria AIC, SBIC and HQIC of various GARCH (p,q) models fitted to the daily returns of the S&P 500 index are presented in Table 6. When q=0, the GARCH (p,q) model reduces to the pure ARCH (p) model. Panel A indicates that GARCH (4,0), which is equivalent to ARCH (4), is selected by all information criteria. Panel B shows that GARCH (1,1) is picked by SBIC and GARCH (2,2) is selected by both AIC and HQIC. For the GJR (p,q) models in Panel C, GJR (2,1) is selected by all information criteria. Finally, EGARCH (2,2) is picked by AIC and HQIC, but EGARCH (2,1) is preferred by SBIC. The last column is the log likelihood of each model; GARCH (4,0), GARCH (2,2), GJR (2,2) and EGARCH (2,2) have the maximum log likelihood in their respective panels. AIC and HQIC select the same model in every panel. An information criterion is only one tool for model selection, and the model with the smallest AIC, SBIC or HQIC is not necessarily the best one, since the performance of an estimated model is affected by many factors. Since the GARCH (1,1) model is usually assumed to be the best one for modeling and
forecasting financial time series, and the parsimonious GJR (1,1) and EGARCH (1,1) models are also widely used in empirical studies, we decided to model S&P 500 index daily returns using GARCH (1,1), GJR (1,1) and EGARCH (1,1) models.
Table 6. Information criteria for estimated GARCH (p,q) models
(p,q)    AIC       SBIC      HQIC      LL
Panel A: GARCH (p,0)
(1,0)    3.0489    3.0328    3.0503    -7678.777
(2,0)    2.9144    2.9196    2.9163    -7338.925
(3,0)    2.8610    2.8674    2.8632    -7203.218
(4,0)    2.8043    2.8121    2.8070    -7059.384
Panel B: GARCH (p,q)
(1,1)    2.7115    2.7167    2.7134    -6827.712
(1,2)    2.7112    2.7177    2.7135    -6825.992
(2,1)    2.7105    2.7170    2.7127    -6824.053
(2,2)    2.7098    2.7175    2.7125    -6821.283
Panel C: GJR (p,q)
(1,1)    2.6846    2.6911    2.6869    -6758.813
(1,2)    2.6850    2.6928    2.6877    -6758.810
(2,1)    2.6807    2.6885    2.6835    -6748.130
(2,2)    2.6811    2.6902    2.6843    -6748.089
Panel D: EGARCH (p,q)
(1,1)    2.6823    2.6888    2.6846    -6753.040
(1,2)    2.6827    2.6905    2.6854    -6753.026
(2,1)    2.6783    2.6860    2.6810    -6741.874
(2,2)    2.6777    2.6868    2.6809    -6739.517
Note: LL denotes log likelihood.
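To show how the three criteria relate to the log likelihood, here is a minimal sketch that computes per-observation AIC, SBIC and HQIC. The per-observation normalization, the parameter count k = 4 (mean constant plus three variance parameters) and the sample size T = 5039 are assumptions made here, not values stated in the text; they were chosen because they approximately reproduce the GARCH (1,1) row of Table 6.

```python
import math


def info_criteria(log_lik, n_params, n_obs):
    """Per-observation AIC, SBIC and HQIC (smaller is better)."""
    aic = (-2.0 * log_lik + 2.0 * n_params) / n_obs
    sbic = (-2.0 * log_lik + n_params * math.log(n_obs)) / n_obs
    hqic = (-2.0 * log_lik + 2.0 * n_params * math.log(math.log(n_obs))) / n_obs
    return aic, sbic, hqic


# Log likelihood of GARCH(1,1) from Table 6.
# n_params=4 and n_obs=5039 are ASSUMED for illustration, not taken from the text.
aic, sbic, hqic = info_criteria(-6827.712, n_params=4, n_obs=5039)
```

Because SBIC penalizes each additional parameter by ln(T) rather than 2, it tends to favor the more parsimonious specification, which is in line with SBIC picking GARCH (1,1) while AIC and HQIC pick GARCH (2,2).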
7.2 Test Numerical Accuracy of GARCH Estimates
After a model has been estimated, it is necessary to test the numerical accuracy of the estimates to ensure that the estimated model is suitable for volatility estimation. Otherwise, inappropriate coefficient estimates can lead to spurious inference.
[Figure 6 panels: S&P 500 index daily returns; estimates from GARCH (1,1), GJR(1,1), EGARCH (1,1), GARCH (1,0), GARCH (2,0), GARCH (3,0) and GARCH (4,0); horizontal axis 1995 to 2010]
Figure 6. Estimates from various GARCH (p,q) models
Zivot (2008) suggests that the numerical accuracy of model estimates can be examined by comparing the volatility estimates of the GARCH (1,1) model with the volatility estimates from ARCH (p) models. If the volatility estimates from these different models have similar dynamics, then the coefficient estimates of the models are appropriate. Following Zivot (2008), we compared the plotted volatility estimates of GARCH (1,1), GJR (1,1) and EGARCH (1,1) with those of GARCH (1,0), GARCH (2,0), GARCH (3,0) and GARCH (4,0) models. As can be seen from Figure 6, all models perform well in capturing the observed volatility clustering in S&P 500 index daily returns. In particular, they clearly describe the large fluctuations of volatility from 2007 to 2009, a turbulent period that spanned both economic prosperity and economic crisis. Compared with the GARCH (p,0) models (p = 1, 2, 3, 4), which are equivalent to ARCH (p) models, the volatilities of GARCH (1,1), GJR (1,1) and EGARCH (1,1) are much smoother and display more persistence. Since the estimated volatilities from these models exhibit similar dynamics, the estimates of the GARCH (1,1), GJR (1,1) and EGARCH (1,1) models are appropriate.
7.3 Estimates of Models
Table 7 presents the estimates of the random walk model and the GARCH (1,1), GJR (1,1) and EGARCH (1,1) models. The second column presents the estimated parameters and diagnostic results of the random walk model. The coefficient estimate of the constant term is close to zero and statistically insignificant. The Durbin-Watson (DW) statistic is very close to 2, suggesting that there is no first-order autocorrelation in the residuals. However, the Q-statistic and the LM statistic from the Ljung-Box and ARCH-LM tests at lag 10 indicate the presence of autocorrelation. The coefficient estimates of the conditional mean and conditional variance equations of the GARCH (1,1) model are shown in the third column of Table 7. We assumed that the residuals of the S&P 500 daily returns are normally distributed. The coefficient
Table 7. The summary statistics of estimated volatility models
Models             Random Walk   GARCH(1,1)    GJR(1,1)      EGARCH(1,1)
Panel A: Estimates of Mean Equation
c                  0.02245       0.0478*       0.0230**      0.0238**
                   (1.36)        (4.09)        (1.96)        (2.14)
Panel B: Estimates of Conditional Variance Equation
Intercept                        0.0073*       0.0181*       -0.0883*
                                 (7.14)        (9.85)        (-13.74)
ARCH term                        0.0635*       0.0017        0.1123*
                                 (14.48)       (0.38)        (13.63)
GARCH term                       0.9309*       0.9322*       0.9855*
                                 (196.06)      (202.05)      (689.48)
Asymmetric term                                0.1132*       -0.0925*
                                               (15.38)       (-16.13)
Panel C: Diagnostic Results of Residuals
DW                 2.11          2.11          2.11          2.11
Q-statistic(10)    53.636*       15.056        13.014        13.739
LM(10)             1296.372*     8.3173        7.6431        12.0981
LL                 -7952.988     -6827.712     -6758.813     -6750.04
Skewness           -0.1987       -0.4026       -0.4077       -0.3819
Kurtosis           12.1684       4.9230        4.8199        4.6840
Note: * (**) denotes significant at 1% (5%) level; z-statistics are in parentheses; DW statistic is from Durbin-Watson test; Q-statistic and LM are the results of Ljung-Box and ARCH-LM tests, respectively.
estimates of the conditional variance equation are consistent with our expectation. The intercept term is very small (0.0073), the parameter of the ARCH term equals 0.0635 and the coefficient on the lagged conditional variance is 0.9309. The coefficients on both the lagged squared residual and the lagged conditional variance are highly significant, implying the presence of ARCH and GARCH effects. The sum of the coefficients on the ARCH and GARCH terms is very close to unity (0.9944), suggesting that the model is covariance stationary with a high degree of persistence and long memory in the conditional variance, i.e., a large positive or negative return will lead future forecasts of the variance to be high for a protracted period. The half-life of shocks to the volatility of the S&P 500 index is 123 days. Additionally, the sum of the coefficients on the ARCH and GARCH terms is also an estimate of the rate at which the response function decays on a daily basis. The response to shocks declines slowly because this rate is very high (0.9944), which means that a new shock affects the conditional variance for a long period. In other words, old information remains important relative to recent information, and information decays very slowly. Furthermore, the highly statistically significant coefficient estimates of the ARCH and GARCH terms ($\alpha_1$ and $\beta_1$) suggest that the constant variance model can be rejected, at least within the sample period. Finally, the unconditional standard deviation of returns is 1.14, computed as $\sqrt{\alpha_0 / (1 - \alpha_1 - \beta_1)}$ with $\alpha_0$ the intercept of the variance equation, and it is very close to the sample standard deviation presented in Table 1, which equals 1.17. The DW statistic suggests there is no first-order autocorrelation, and the null hypothesis that the residuals are not serially correlated at lag 10 is not rejected by either the Q-statistic or the LM statistic. The skewness and kurtosis statistics show that the residuals are non-normal.
As can be seen from the penultimate column of Table 7, the estimated parameters on the asymmetric term and the lagged conditional variance of the GJR (1,1) model are statistically significant, but the ARCH parameter is insignificant. The positive and significant coefficient of the asymmetric term implies the presence of a leverage effect: since all coefficient estimates are positive, negative shocks imply a higher next-period conditional variance than positive shocks of the same magnitude. This is consistent with our expectations for the application of a GARCH model to index returns. The sum of the coefficients on the lagged squared error and the lagged conditional variance is close to unity (0.9339); thus, shocks to the conditional variance will be highly persistent. Since the coefficient estimate of the asymmetric term (0.1132) is smaller than 0.1322, which is computed as 2*(1 – α1 – β1), the model is stationary. Since the DW statistic is not significantly different from 2 and both the Q-statistic and the LM statistic indicate no correlation at lag 10, the residuals are not serially correlated.
The statistical properties of the EGARCH (1,1) model are presented in the last column of Table 7. The sum of the coefficient estimates of the ARCH and GARCH terms approximates 1, implying that shocks to the conditional variance will be highly persistent. The negative coefficient of the asymmetric term suggests that positive shocks imply a higher next-period conditional variance than negative shocks of the same magnitude, which is inconsistent with our expectation. It also suggests the absence of a leverage effect, which conflicts with the inference from GJR (1,1). Since the absolute value of the coefficient estimate of the logarithmic GARCH term is less than 1, the model is stationary and has finite kurtosis. It is interesting to find that the DW statistics of all models in Table 7 have the same value. With respect to the serial correlation of the residuals of the estimated EGARCH (1,1) model, the DW statistic indicates that there is no first-order autocorrelation, and the Ljung-Box and ARCH-LM tests indicate that the residuals are uncorrelated up to lag 10. As with the GARCH (1,1) and GJR (1,1) models, the null hypothesis that the residuals of EGARCH (1,1) are normally distributed is rejected because of negative skewness and excess kurtosis.
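The persistence figures quoted above can be checked with a few lines of arithmetic. The sketch below uses the Table 7 estimates and assumes the usual half-life definition ln(0.5)/ln(α1 + β1), which the text does not spell out; the variable names are chosen here for illustration.

```python
import math

# GARCH(1,1) estimates from Table 7
intercept, alpha1, beta1 = 0.0073, 0.0635, 0.9309

persistence = alpha1 + beta1                        # sum of ARCH and GARCH terms
half_life = math.log(0.5) / math.log(persistence)   # trading days (assumed definition)
uncond_sd = math.sqrt(intercept / (1.0 - persistence))

# GJR(1,1) estimates from Table 7
gjr_alpha1, gjr_beta1, gjr_gamma = 0.0017, 0.9322, 0.1132

gjr_bound = 2.0 * (1.0 - gjr_alpha1 - gjr_beta1)    # upper bound for the asymmetric term
gjr_is_stationary = gjr_gamma < gjr_bound
```

With these inputs the persistence is 0.9944, the half-life is a little over 123 days, the unconditional standard deviation is about 1.14, and the GJR bound is 0.1322, which the estimated asymmetric coefficient of 0.1132 does not exceed.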
7.4 BDS Test
The nonparametric BDS test examines the nonlinearity of residuals. The null hypothesis of the BDS diagnostic test is that the data are pure white noise. If the linear or non-linear structure is removed from the data, the remaining structure should be due to an unknown nonlinear data generating process (Magnus and Fosu, 2006:2046). The ordinary residuals from the estimated random walk model and the standardized residuals from the estimated GARCH models were examined, and the results of the BDS diagnostic test are reported in Table 8. For the random walk model, the null hypothesis that the data are purely random is strongly rejected at the 1% level, implying that S&P 500 daily returns do not follow a random walk, so the random walk model cannot capture the features of the data.
Table 8. BDS test for serial independence in residuals
             Random Walk         GARCH               GJR                 EGARCH
Dimension    ε=0.95    ε=0.99    ε=0.95    ε=0.99    ε=0.95    ε=0.99    ε=0.95    ε=0.99
BDS Asymptotic (p values)
2            0.0000    0.0000    0.3581    0.8692    0.0235    0.6440    0.0501    0.8258
3            0.0000    0.0000    0.8231    0.7870    0.1117    0.7583    0.2537    0.8994
4            0.0000    0.0000    0.6805    0.7754    0.1323    0.7423    0.3161    0.8948
5            0.0000    0.0000    0.5209    0.8319    0.1962    0.7137    0.5483    0.8750
BDS Bootstrap (p values)
2            0.0000    0.0220    0.3692    0.8232    0.0096    0.2680    0.0344    0.9612
3            0.0000    0.0000    0.7792    0.3520    0.0876    0.8656    0.2500    0.7424
4            0.0000    0.0000    0.6480    0.3756    0.1172    0.8628    0.3212    0.7520
5            0.0000    0.0000    0.5088    0.4380    0.1824    0.8144    0.5948    0.7884
Note: The standardized residuals of GARCH models and the ordinary residuals of Random Walk Model were used for BDS test. Bootstrap with 5000 repetitions. ε denotes fraction of pairs epsilon value.
The results of the BDS tests for the standardized residuals of the GARCH models are satisfactory. Both the asymptotic and bootstrap p-values of each model indicate that the null hypothesis of white noise cannot be rejected at the 0.99 epsilon bound, suggesting that all GARCH models are correctly specified and capture the relevant features of S&P 500 index daily returns well. Additionally, the insignificant Ljung-Box and LM statistics for the standardized residuals of the GARCH models reported in Table 7 show that these models are successful at modeling the serial correlation structure in the conditional mean and conditional variance equations.
7.5 Graphical Diagnostics
We also examined the standardized residuals of the estimated GARCH models with the graphical diagnostics in Figure 7. As can be seen, the autocorrelation function (ACF) of each estimated model does not show significant autocorrelation. The normal QQ-plot of the standardized residuals of each estimated model indicates strong departures from normality. In addition, the Ljung-Box and Engle's LM tests of the estimated GARCH models in Table 7 give results consistent with the ACF in Figure 7, suggesting that there is no remaining ARCH effect, and the skewness and kurtosis statistics of each estimated model also confirm that the residuals are non-normal.
[Figure 7 panels: ACF of standardized residuals from GARCH(1,1), GJR(1,1) and EGARCH(1,1), lags 5 to 35; normal QQ-plots with Quantiles of Normal against Quantiles of standardized residuals from GARCH(1,1), GJR(1,1) and EGARCH(1,1)]
Figure 7. Graphical residual diagnostics from GARCH (1,1) to S&P 500 returns
8 Forecast Performance of Estimated Models and VIX
The forecast performance of both the time series models and the implied volatility index (VIX) is discussed in this section. First, the out-of-sample forecast performance of the time series models is examined with conventional error measures. Next, the in-sample forecast performance of implied volatility is studied by estimating a GARCH model augmented with a dummy variable and an exogenous variable. Finally, the forecast performance of VIX, GJR (1,1) and RiskMetrics is compared using a variety of approaches.
8.1 Out-of-sample Forecast Performance of GARCH Models
The out-of-sample forecast performance of the estimated GARCH models is evaluated by four conventional error measures: root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE) and Theil's U-statistic. The model with the smallest statistic is considered to be the best for modeling the conditional volatility of S&P 500 index daily returns. As can be seen from Table 9, the estimated GARCH (1,1) is the best model for out-of-sample forecasting according to RMSE, MAE and Theil's U-statistic, since it has the smallest value of each. If the models are evaluated by MAPE, then GJR (1,1) is preferred.
Table 9. Forecast Performance of GARCH models
Models    GARCH(1,1)    GJR(1,1)      EGARCH(1,1)
RMSE      1.1355        1.1357        1.1358
MAE       0.7946        0.7964        0.7964
MAPE      122.0995      107.4572      107.8961
Theil     0.9588        0.9795        0.9789
Theil's U-statistic of each model is smaller than one, which indicates that the estimated models perform better than the benchmark model. However, the MAPE of each model is more than 100%, suggesting that the benchmark model outperforms the estimated models. Therefore, the result of the comparison between the estimated models and the random walk model is mixed, and MAPE and Theil's U-statistic conflict in our test. Because the standard GARCH (1,1) cannot capture the asymmetry of volatility, and GJR (1,1) is indicated to be the best model by MAPE, GJR (1,1) will be studied further and its forecast performance will be compared with VIX as well as the RiskMetrics approach.
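The four error measures can be sketched as follows. The exact formulas used in the study are not given in this section, so the definitions below are the common textbook forms; in particular, the benchmark-relative form of Theil's U (below one means the forecast beats the benchmark) is an assumption, and all numbers are hypothetical.

```python
import math


def rmse(actual, forecast):
    """Root mean square error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def theil_u(actual, forecast, benchmark):
    """Benchmark-relative Theil's U (assumed form): below 1 beats the benchmark."""
    num = sum(((a - f) / a) ** 2 for a, f in zip(actual, forecast))
    den = sum(((a - b) / a) ** 2 for a, b in zip(actual, benchmark))
    return math.sqrt(num / den)


# Hypothetical numbers, for illustration only
actual = [1.0, 2.0, 4.0]
model = [1.5, 2.0, 3.0]
naive = [2.0, 1.0, 2.0]

model_mae = mae(actual, model)
model_mape = mape(actual, model)
model_u = theil_u(actual, model, naive)
```

Because MAPE divides each error by the realized value, small realized values can inflate it far more than they inflate RMSE or MAE, which is one way the measures can rank models differently.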
8.2 In-sample Forecast Performance of VIX
We examined the in-sample forecast performance of VIX, following Blair et al. (2001) and Frijns et al. (2008), by estimating a GARCH model augmented with a dummy variable and an exogenous variable:
$$r_t = \mu + \varepsilon_t, \qquad \varepsilon_t \sim N(0, h_t),$$
$$h_t = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \gamma \, s_{t-1} \varepsilon_{t-1}^2 + \beta_1 h_{t-1} + \delta \, VIX_{t-1}^2 \qquad (30)$$
where $r_t$ denotes the S&P 500 index daily return, $\mu$ is the average daily return, and $\varepsilon_t$ is the random error component of the mean equation. We assume that $\varepsilon_t$ is normally distributed with a mean of zero and a conditional variance equal to $h_t$. $s_{t-1}$ is a dummy variable which equals one if the innovation $\varepsilon_{t-1}$ is negative and zero otherwise, and it is used to capture the asymmetric impact of shocks on volatility. Six different GARCH specifications are nested in equation (30) when restrictions are placed on it:
(1) if $\gamma = \delta = 0$, equation (30) becomes a standard GARCH (1,1) model;
(2) if $\delta = 0$, equation (30) becomes a GJR (1,1) model, which can capture the asymmetric impact of shocks;
(3) if $\alpha_1 = \beta_1 = \gamma = 0$, $\delta$ is the only parameter for explaining the volatility process;
(4) if $\beta_1 = \gamma = 0$, then equation (30) becomes a model consisting of VIX and market shocks;
(5) if $\beta_1 = 0$, then equation (30) includes asymmetry in market shocks;
(6) equation (30) without any restrictions is a GJR model with implied volatility as an exogenous variable.
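The nesting of the six specifications can be illustrated with a one-step variance recursion. This is a sketch under stated assumptions: the parameter names (`a0`, `a1`, `g`, `b1`, `d`) are chosen here, the VIX is assumed to enter as its lagged square, and the shock, lagged variance and VIX inputs are hypothetical. The coefficient values are the GARCH and GJR-GARCH rows of Table 10.

```python
def conditional_variance(eps_prev, h_prev, vix_prev, a0, a1, b1, g, d):
    """One step of the augmented variance equation (assumed form).

    h_t = a0 + a1*eps_{t-1}^2 + g*s_{t-1}*eps_{t-1}^2 + b1*h_{t-1} + d*VIX_{t-1}^2,
    where s_{t-1} = 1 if eps_{t-1} < 0 and 0 otherwise.
    """
    s_prev = 1.0 if eps_prev < 0 else 0.0
    return (a0 + a1 * eps_prev ** 2 + g * s_prev * eps_prev ** 2
            + b1 * h_prev + d * vix_prev ** 2)


# Restriction (1): g = d = 0 gives the standard GARCH(1,1); shock sign is irrelevant
garch_neg = conditional_variance(-2.0, 1.0, 20.0, 0.0073, 0.0635, 0.9309, 0.0, 0.0)
garch_pos = conditional_variance(+2.0, 1.0, 20.0, 0.0073, 0.0635, 0.9309, 0.0, 0.0)

# Restriction (2): d = 0 gives GJR(1,1); a negative shock raises variance more
gjr_neg = conditional_variance(-2.0, 1.0, 20.0, 0.0108, 0.0017, 0.9322, 0.1103, 0.0)
gjr_pos = conditional_variance(+2.0, 1.0, 20.0, 0.0108, 0.0017, 0.9322, 0.1103, 0.0)
```

Setting other combinations of coefficients to zero in the same function gives the remaining restricted models, so a single likelihood routine can estimate all six.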
Table 10. In-sample forecast performance of VIX and GARCH specifications
Model            Intercept   GARCH term   ARCH term    Asymmetric term   VIX term   LL           Excess LL
GARCH            0.0073*     0.9309*      0.0635*                                   -6827.712
GJR-GARCH        0.0108*     0.9322*      0.0017       0.1103*                      -6758.813    68.899
VIX              -0.5420*                                                0.0040*    -6485.995    341.717
ARCH-VIX         -0.0183*                 -0.0134**                      0.0028*    -6746.639    81.073
GJR-ARCH-VIX     -0.1058*                 -0.0444*     0.0574*           0.0028*    -6740.518    87.194
GJR-GARCH-VIX    0.0008      0.8520*      -0.0344*     0.1684*           0.0002*    -6703.526    124.186
Note: *(**) denote significant at 1% (5%) level. LL is the statistic of log-likelihood.
The in-sample forecast performance of the different models nested in equation (30) is presented in Table 10, including parameter estimates, log-likelihood statistics, and the excess log-likelihood relative to the standard GARCH (1,1) in the second row. The highly significant GARCH coefficient of 0.9309 in the GARCH (1,1) model confirms the strong persistence in volatility. The GJR-GARCH model with the asymmetric term performs better than GARCH (1,1), because the log-likelihood increases by approximately 69. Since the estimated coefficient on the asymmetric term of the GJR-GARCH model is positive, negative shocks imply a higher next-period conditional variance.
The parameter estimates in the fourth row are for the restricted model in which the coefficients on the shock, asymmetric and GARCH terms are set to zero, leaving only the coefficient on VIX. Thus, this model describes the volatility process with the VIX series only, and the large excess log-likelihood (approximately 342) implies that this model performs better than GARCH (1,1). The nested ARCH-VIX and GJR-ARCH-VIX models incorporate the shock terms into the specification, and it is interesting to find that they have the same VIX coefficient, which equals 0.0028 and is highly significant at the 1% level. However, the excess log-likelihoods of these two nested models indicate that GJR-ARCH-VIX performs better, because it has the larger excess value relative to the log-likelihood of the standard GARCH (1,1). In addition, the coefficient of the shock term of GJR-ARCH-VIX is highly significant at the 1% level, but that of ARCH-VIX is significant only at the 5% level. The last row shows the estimated parameters of the unrestricted model. GJR-GARCH-VIX significantly improves on the standard GARCH (1,1) and outperforms GJR-GARCH, ARCH-VIX and GJR-ARCH-VIX, since it has a larger excess log-likelihood. Its VIX coefficient is highly significant at the 1% level. We also find that the incorporation of this exogenous variable reduces the coefficient of the GARCH term relative to the standard GARCH (1,1) by approximately 0.08, suggesting that the VIX series captures part of the persistence in the volatility process. Comparing all these nested specifications, we find that the addition of VIX significantly improves the standard GARCH (1,1). Since the nested VIX model in the fourth row shows the largest excess log-likelihood, we conclude that the volatility process can be reasonably described by the VIX series.
8.3 Comparing Predictability of Time Series Models and VIX
In this section, the forecast performance of VIX is investigated and compared with that of the RiskMetrics approach and the GJR (1,1) model by running a regression of realized volatility. We consider forecasting horizons of 5, 10, 15, 30 and 60 trading days. The objective is to investigate whether the VIX series incorporates all the information contained in the time series models.
The time series plots of VIX and annualized future realized volatility over different horizons are presented in the figures of Appendix A. VIX appears to track the realized volatility closely at each horizon. The levels of VIX and realized volatility differ, however, and VIX overestimates the realized volatility at all horizons. Although the figures of Appendix A suggest that VIX has good forecast performance for realized volatility, we need to confirm this by formal tests based on a regression of realized volatility. Additionally, we need to investigate whether VIX is superior to the other approaches. We define the realized volatility as the square root of the sum of squared daily returns of the S&P 500 index, which is computed as

$$RV_{t,k} = \sqrt{\sum_{i=1}^{k} r_{t+i}^{2}} \qquad (31)$$

where $k$ denotes the number of trading days and $r_{t}^{2}$ is the squared daily return on day $t$. The regression of the realized volatility is given by

$$RV_{t,k} = \alpha + \beta\,\hat{\sigma}_{t,k} + \varepsilon_{t,k} \qquad (32)$$

where $\hat{\sigma}_{t,k}$ denotes forecasts obtained from alternative approaches, and $k = 5, 10, 15, 30, 60$. In order to run equation (32), we first construct the $\hat{\sigma}_{t,k}$ series over the different horizons by the alternative approaches. Following Giot (2005b) and Frijns et al. (2008), the k-day forward-looking volatility forecast on day t by the VIX series is computed as

$$\hat{\sigma}^{VIX}_{t,k} = VIX_{t}\,\sqrt{\frac{k}{360}} \qquad (33)$$
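As an illustration of the two quantities just defined, the following minimal Python sketch computes a k-day realized volatility from daily returns and rescales an annualized VIX quote to a k-day forecast. The function names, the sample returns and the VIX level of 20 are hypothetical and serve only as an example; the 360-day convention follows the rescaling described in the text.

```python
import math


def realized_volatility(returns, t, k):
    """k-day realized volatility after day t: square root of the sum of
    the next k squared daily returns."""
    window = returns[t + 1 : t + 1 + k]
    if len(window) < k:
        raise ValueError("not enough future returns for this horizon")
    return math.sqrt(sum(r * r for r in window))


def vix_k_day_forecast(vix_level, k, days_per_year=360):
    """Rescale an annualized VIX quote to a k-day volatility forecast."""
    return vix_level * math.sqrt(k / days_per_year)


# Hypothetical daily percentage returns, for illustration only.
daily_returns = [0.5, -1.2, 0.3, 0.8, -0.4, 1.1, -0.9, 0.2]

rv_5 = realized_volatility(daily_returns, t=0, k=5)
vix_5 = vix_k_day_forecast(vix_level=20.0, k=5)
print(round(rv_5, 4), round(vix_5, 4))
```

Both quantities are expressed over the same k-day window, which is what allows them to be compared in a regression of realized volatility on the forecast.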
The RiskMetrics approach can be regarded as a simplified and restricted GARCH (1,1) model. We assume that

$$\sigma_{t+1}^{2} = \lambda\,\sigma_{t}^{2} + (1-\lambda)\,r_{t}^{2} \qquad (34)$$

where $\sigma_{t}^{2}$ denotes the variance according to the RiskMetrics approach, $r_{t}$ is the return of the S&P 500 index on day $t$, and $\lambda$ equals 0.94 and captures the persistence of volatility. Because the parameters $\lambda$ and $1-\lambda$ sum to one, model (34) contains a unit root, implying that it is a specific parameterization of the Integrated GARCH (1,1) model. The forecast obtained from model (34) is the forecast for the next day. To obtain forecasts for longer horizons, the forecast is re-scaled; following Frijns et al. (2008), the k-day forward-looking forecast can be derived from

$$\hat{\sigma}^{RM}_{t,k} = \sqrt{k}\cdot\hat{\sigma}_{t+1} \qquad (35)$$

where $\hat{\sigma}_{t+1}$ is the daily forecast by the RiskMetrics approach.
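The RiskMetrics recursion and the square-root-of-time rescaling can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the starting variance, the sample returns and the function names are hypothetical, and only the decay factor of 0.94 is taken from the text.

```python
import math

LAMBDA = 0.94  # decay factor stated in the text


def riskmetrics_variance(returns, initial_variance, lam=LAMBDA):
    """Next-day variance forecasts from the exponentially weighted recursion
    var[t+1] = lam * var[t] + (1 - lam) * r[t]**2."""
    variance = initial_variance
    forecasts = []
    for r in returns:
        variance = lam * variance + (1.0 - lam) * r * r
        forecasts.append(variance)
    return forecasts


def k_day_volatility(daily_variance, k):
    """Rescale a one-day variance forecast to a k-day volatility forecast."""
    return math.sqrt(k) * math.sqrt(daily_variance)


# Hypothetical daily percentage returns and starting variance.
sample_returns = [1.0, -2.0, 0.5]
variance_path = riskmetrics_variance(sample_returns, initial_variance=1.0)
vol_10 = k_day_volatility(variance_path[-1], k=10)
print(variance_path, vol_10)
```

Because the two weights sum to one, the recursion has no mean-reverting constant, so the k-day forecast is simply the one-day forecast scaled by the square root of k.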
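The forecasts based on the GJR (1,1) model are given by equation (14), and the one-day-ahead forward-looking forecast can be obtained by

$$\hat{h}_{t+1} = \omega + \alpha_{1}\,\varepsilon_{t}^{2} + \gamma\,\varepsilon_{t}^{2}\,I_{t} + \beta_{1}\,h_{t} \qquad (36)$$

For the $k$-day horizon ($k > 1$), the forward-looking forecast can be computed as

$$\hat{h}_{t+k} = \omega + (\alpha_{1} + 0.5\,\gamma + \beta_{1})\,\hat{h}_{t+k-1} \qquad (37)$$

The total volatility $k$ days ahead can be derived from

$$\hat{\sigma}^{GJR}_{t,k} = \sqrt{\sum_{i=1}^{k}\hat{h}_{t+i}} \qquad (38)$$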
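The multi-step GJR forecast can be illustrated with the short Python sketch below. It assumes the standard GJR (1,1) recursion, in which a negative shock adds an extra term to the variance and, for horizons beyond one day, the asymmetry term enters with weight 0.5 on the assumption that shocks are negative half of the time. The parameter names and values are hypothetical and are not the estimates reported in this study.

```python
import math


def gjr_one_step(omega, alpha, gamma, beta, shock, variance):
    """One-day-ahead conditional variance; the gamma term applies only
    when the last shock is negative."""
    indicator = 1.0 if shock < 0 else 0.0
    return omega + (alpha + gamma * indicator) * shock ** 2 + beta * variance


def gjr_k_day_volatility(omega, alpha, gamma, beta, shock, variance, k):
    """Square root of the sum of the daily variance forecasts for days
    t+1, ..., t+k."""
    persistence = alpha + 0.5 * gamma + beta
    h = gjr_one_step(omega, alpha, gamma, beta, shock, variance)
    total = h
    for _ in range(k - 1):
        h = omega + persistence * h  # recursion for horizons beyond one day
        total += h
    return math.sqrt(total)


# Hypothetical parameter values, chosen only for illustration.
params = dict(omega=0.02, alpha=0.01, gamma=0.12, beta=0.90)

h_after_negative = gjr_one_step(shock=-1.0, variance=1.0, **params)
h_after_positive = gjr_one_step(shock=1.0, variance=1.0, **params)
vol_5 = gjr_k_day_volatility(shock=-1.0, variance=1.0, k=5, **params)
print(h_after_negative, h_after_positive, vol_5)
```

With these illustrative values a negative shock produces a higher next-day variance than a positive shock of the same size, which is the asymmetry the model is designed to capture.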
After deriving the forecasts by these three approaches, we first examine the correlation between these forecasts and realized volatility at different horizons, in order to check whether VIX is a better forecaster than the other approaches; then, the forecast performance is evaluated by running the regression of realized volatility in equation (32). Furthermore, because our sample spans a considerably long period, the realized volatility at each horizon is regressed for both the in-sample and the out-of-sample period.
8.3.1 Correlation between Realized Volatility and Volatility Forecasts
The correlations between future realized volatility and the volatility forecasts from each forecaster at each forecasting horizon are reported in Table 11. GJR (1,1) has the highest correlation with realized volatility at each horizon. VIX has a higher correlation than the RiskMetrics approach at every horizon except the 60-day horizon. It is also interesting to find that the correlation between realized volatility and GJR (1,1) increases with the horizon, from 0.8482 at 5 days to 0.9796 at 60 days. Table 11 therefore suggests that GJR (1,1) may perform best among the three approaches in forecasting realized volatility.
Table 11. Correlation between Realized Volatility and Alternative Forecasters

              5-day    10-day   15-day   30-day   60-day
VIX           0.7906   0.8141   0.8086   0.7762   0.7039
RiskMetrics   0.7486   0.7808   0.7813   0.7638   0.7135
GJR (1,1)     0.8482   0.9085   0.9311   0.9615   0.9796
8.3.2 Regression for In-sample Realized Volatility
Table 12 presents the results of the regression for in-sample realized volatility on the forecasts from the VIX series, the RiskMetrics approach and the GJR (1,1) model at various horizons. The R² statistic is also used to evaluate the predictive power of the forecast series in equation (32). Because the requirements for a forecast to be an unbiased estimate of realized volatility are α = 0 and β = 1, this joint hypothesis is tested and the F-statistic reports the test results.

Panel A of Table 12 shows the parameter estimates of the regression on the VIX series. The coefficients α are negative and significantly different from zero at the 1% level at every horizon except the 60-day horizon. The coefficients β are close to one and highly significant at the 1% level at all horizons. In addition, the F-statistic at each horizon rejects the null hypothesis that α = 0 and β = 1. Therefore, the VIX forecasts are biased in all cases. However, unbiasedness is not decisive for a good predictor, because an observable and systematic bias can be corrected. A high R² is a required property of a good forecaster. Comparing the R² across horizons, we find that VIX performs best at the 10-day horizon and worst at the 60-day horizon, with R² equal to 0.6692 and 0.5103, respectively.

The regression results for the RiskMetrics approach are shown in Panel B of Table 12. The forecasts are also biased in all cases, since both α and β are statistically significant at the 1% level at all horizons, and the F-statistic rejects the joint hypothesis that α = 0 and β = 1 at the 1% level at each horizon as well. In terms of R², the RiskMetrics approach performs best at the 15-day horizon, with R² equal to 0.6214, and worst at the 60-day horizon, with R² equal to 0.5230. Compared with the regression on the VIX series, the RiskMetrics approach outperforms VIX only at the 60-day horizon, implying that VIX is the better forecaster in all other cases.

The regression results for the GJR (1,1) model are reported in Panel C of Table 12. Evaluating unbiasedness through the coefficients α and β, we find results comparable with Panel A and Panel B. The estimates of both α and β are highly significant at the 1% level in all cases, and the joint hypothesis α = 0 and β = 1 is also rejected by the F-statistic at each horizon. Therefore, the forecasts of the GJR (1,1) model are also biased forecasts of future volatility at all horizons. However, compared with the other two approaches, the coefficients α are closer to zero and the coefficients β are closer to one at most horizons. With respect to the predictive power of the GJR (1,1) model, the largest R², 0.9605, appears at the 60-day horizon, and the smallest, 0.7248, at the 5-day horizon. The R² of the GJR (1,1) model increases with the horizon and is high at every horizon. It appears that GJR (1,1) has particularly good forecast performance at long forecasting horizons.

Table 12. Performance of regression for in-sample realized volatility

Horizon   α                            β                            R²       F-statistic

Panel A: Forecasting regression by VIX
5-day     -0.7643* (0.0340) [0.0000]    1.2172* (0.0131) [0.0000]   0.6302   315.91 [0.0000]
10-day    -0.8980* (0.0430) [0.0000]    1.1863* (0.0118) [0.0000]   0.6692   259.14 [0.0000]
15-day    -0.9188* (0.0520) [0.0000]    1.1519* (0.0116) [0.0000]   0.6624   192.7473 [0.0000]
30-day    -0.7253* (0.0755) [0.0000]    1.0682* (0.0119) [0.0000]   0.6160   80.5313 [0.0000]
60-day     0.1900  (0.1162) [0.1022]    0.9353* (0.0130) [0.0000]   0.5103   42.4262 [0.0000]

Panel B: Forecasting regression by RiskMetrics
5-day      0.1863* (0.0278) [0.0000]    0.8638* (0.0106) [0.0000]   0.5697   121.6117 [0.0000]
10-day     0.3800* (0.0350) [0.0000]    0.8518* (0.0094) [0.0000]   0.6193   138.5070 [0.0000]
15-day     0.5784* (0.0417) [0.0000]    0.8327* (0.0092) [0.0000]   0.6214   173.5402 [0.0000]
30-day     1.1631* (0.0585) [0.0000]    0.7855* (0.0091) [0.0000]   0.5981   278.4340 [0.0000]
60-day     2.3794* (0.0870) [0.0000]    0.7062* (0.0096) [0.0000]   0.5230   473.5829 [0.0000]

Panel C: Forecasting regression by GJR (1,1)
5-day     -0.2043* (0.0233) [0.0000]    1.0250* (0.0089) [0.0000]   0.7248   90.7276 [0.0000]
10-day    -0.2779* (0.0247) [0.0000]    1.0436* (0.0067) [0.0000]   0.8292   88.1022 [0.0000]
15-day    -0.3350* (0.0259) [0.0000]    1.0500* (0.0057) [0.0000]   0.8705   99.4965 [0.0000]
30-day    -0.4797* (0.0269) [0.0000]    1.0604* (0.0042) [0.0000]   0.9274   163.9572 [0.0000]
60-day    -0.7058* (0.0278) [0.0000]    1.0689* (0.0031) [0.0000]   0.9605   321.7394 [0.0000]

Note: * denotes significance at the 1% level; standard errors are in parentheses; P-values are in square brackets; the F-statistic tests the joint hypothesis α = 0 and β = 1.
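The unbiasedness regression of equation (32) and the joint test of α = 0 and β = 1 can be sketched as follows. This is a minimal pure-Python illustration, not the estimation routine used in this study: the F-statistic is computed from the restricted and unrestricted residual sums of squares, which is one common way to test two linear restrictions, and the data are hypothetical.

```python
def ols_simple(x, y):
    """OLS of y on a constant and x; returns alpha, beta, R-squared, RSS."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return alpha, beta, 1.0 - rss / tss, rss


def joint_f_statistic(x, y):
    """F-statistic for H0: alpha = 0 and beta = 1 (two restrictions)."""
    _, _, _, rss_unrestricted = ols_simple(x, y)
    # Under H0 the fitted value is the forecast itself.
    rss_restricted = sum((yi - xi) ** 2 for xi, yi in zip(x, y))
    n = len(x)
    return ((rss_restricted - rss_unrestricted) / 2) / (rss_unrestricted / (n - 2))


# Hypothetical forecasts and realized volatilities, for illustration only.
forecasts = [1.0, 2.0, 3.0, 4.0, 5.0]
realized = [1.1, 1.9, 3.2, 3.9, 5.1]

alpha_hat, beta_hat, r_squared, _ = ols_simple(forecasts, realized)
f_stat = joint_f_statistic(forecasts, realized)
print(alpha_hat, beta_hat, r_squared, f_stat)
```

A large F-statistic relative to the F(2, n - 2) distribution leads to rejection of unbiasedness, while R² measures how much of the variation in realized volatility the forecast explains regardless of bias.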
To summarize, compared with the VIX series and the RiskMetrics approach, GJR (1,1) has the highest R² at each horizon, and its coefficients β are closer to unity at most horizons. Consequently, the GJR (1,1) model outperforms the other approaches in the regression for realized volatility in the sample period, and this finding is consistent with Table 11. We further investigate the out-of-sample forecast performance of each model through volatility plots and conventional error measures. The figures of Appendix B plot the out-of-sample volatility forecasts of each model at different horizons, and each model tracks the dynamics of future volatility well. The error measures are provided in Table 13, and every panel indicates that GJR (1,1) has the best forecast performance for future realized volatility.
Table 13. Forecast performance on out-of-sample realized volatility

Horizon   VIX       RiskMetrics   GJR(1,1)

Panel A: RMSE
5-day     0.8809    1.0256        0.8012
10-day    1.1243    1.2924        0.8287
15-day    1.4051    1.5418        0.8842
30-day    2.2093    2.2978        0.9979
60-day    3.1862    3.1815        0.8467

Panel B: MAE
5-day     0.71956   0.8186        0.6705
10-day    0.8821    1.0238        0.6413
15-day    1.0964    1.1812        0.6837
30-day    1.7227    1.6903        0.7764
60-day    2.6637    2.5519        0.7310

Panel C: MAPE
5-day     53.8270   56.2948       43.2352
10-day    34.5104   37.5946       23.5408
15-day    32.7246   33.7394       19.3809
30-day    30.6203   29.1853       13.6856
60-day    29.2769   27.0751       8.0997

Panel D: Theil's Statistic
5-day     0.1710    0.2068        0.1605
10-day    0.1520    0.1812        0.1166
15-day    0.1560    0.1777        0.1011
30-day    0.1697    0.1824        0.0782
60-day    0.1698    0.1739        0.0456
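The four error measures reported in Table 13 can be computed as in the following Python sketch. The exact definitions used in this study are not restated in this section, so the sketch assumes the conventional formulas; in particular, Theil's statistic is taken here to be the inequality coefficient that scales the RMSE by the root mean squares of the forecast and actual series, which is an assumption. The input values are hypothetical.

```python
import math


def forecast_errors(actual, forecast):
    """Return RMSE, MAE, MAPE (in percent) and Theil's inequality coefficient."""
    n = len(actual)
    errors = [f - a for a, f in zip(actual, forecast)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    # Assumed definition: RMSE divided by the sum of the root mean squares
    # of the forecast and actual series, so the value lies between 0 and 1.
    scale = math.sqrt(sum(f * f for f in forecast) / n) + math.sqrt(
        sum(a * a for a in actual) / n
    )
    return {"rmse": rmse, "mae": mae, "mape": mape, "theil": rmse / scale}


# Hypothetical realized volatilities and forecasts, for illustration only.
actual_vol = [2.0, 4.0]
forecast_vol = [1.0, 5.0]
metrics = forecast_errors(actual_vol, forecast_vol)
print(metrics)
```

Lower values indicate better forecasts for all four measures, which is how the panels of Table 13 are read.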
8.3.3 Residual Tests for Regression of In-sample Realized Volatility
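The future realized volatilities of the in-sample period are estimated by the classical linear regression model, which has five underlying assumptions:

1) $E(\varepsilon_{t}) = 0$
2) $\mathrm{Var}(\varepsilon_{t}) = \sigma^{2} < \infty$
3) $\mathrm{Cov}(\varepsilon_{i}, \varepsilon_{j}) = 0$ for $i \neq j$
4) $\mathrm{Cov}(\varepsilon_{t}, x_{t}) = 0$
5) $\varepsilon_{t} \sim N(0, \sigma^{2})$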
These assumptions give the ordinary least squares (OLS) estimator a number of desirable properties and allow hypothesis tests concerning the parameter estimates to be conducted validly. Violations of the assumptions can lead to problems, such as wrong parameter estimates and associated standard errors, and inappropriate distributions assumed for the test statistics. In order to confirm that the volatility forecasts are efficient and that our inferences based on the coefficient estimates of the regression for in-sample realized volatility are correct, residual diagnostic tests were conducted. Because a constant α is included in every regression, the first assumption, that the mean of the residuals equals zero, is never violated. The regressor in the fourth assumption is the independent variable of the regression equation. If the independent variable is stochastic and uncorrelated with the residual, the OLS estimates are consistent and unbiased. Table 14 reports the diagnostic results for the residuals of the regression for in-sample realized volatility. The second column shows the results of the White test for heteroskedasticity. The autocorrelation of the residuals is examined by the Durbin-Watson test, and the results are presented in the third column. The correlation coefficients between residuals and regressors are listed in the fourth column. The skewness and kurtosis statistics in the next two columns describe the departures from normality. Unfortunately, the diagnostic results are undesirable. The null hypothesis that the residuals are homoscedastic is rejected for all regressions at each horizon, and the DW-statistics, all far below 2, indicate the presence of positive first-order autocorrelation in the residuals. The correlation coefficients between residuals and regressors are zero, as expected. The skewness and kurtosis statistics imply that the residuals are non-normally distributed in all cases. For such a long in-sample period with 5287 observations, however, the violation of the normality assumption is inconsequential.
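Three of the residual diagnostics described above (the Durbin-Watson statistic, the residual-regressor correlation, and skewness and kurtosis) are simple enough to compute by hand, as the following Python sketch shows. The White test is omitted because it requires an auxiliary regression. The function names and the sample residuals are hypothetical.

```python
import math


def durbin_watson(residuals):
    """Sum of squared first differences divided by the sum of squares.
    Values near 2 suggest no first-order autocorrelation; values near 0
    suggest strong positive autocorrelation."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    return numerator / sum(e * e for e in residuals)


def correlation(a, b):
    """Pearson correlation coefficient between two equally long series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def skewness_and_kurtosis(values):
    """Moment-based skewness and (non-excess) kurtosis; a normal
    distribution has skewness 0 and kurtosis 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical residuals that alternate in sign (negative autocorrelation).
example_residuals = [1.0, -1.0, 1.0, -1.0]
dw = durbin_watson(example_residuals)
skew, kurt = skewness_and_kurtosis(example_residuals)
corr = correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(dw, skew, kurt, corr)
```

In an OLS regression with a constant, the sample correlation between residuals and the regressor is zero by construction, which is why that column of a diagnostics table shows 0.0000 throughout.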
Table 14. Residual tests for regression for in-sample realized volatility

Horizon   W-statistic          DW-statistic   Corr(ε, x)   Skewness   Kurtosis

Panel A: Regression by VIX
5-day     689.86 [0.0000]      0.3678         0.0000        1.7703    13.7935
10-day    513.23 [0.0000]      0.1664         0.0000        2.2320    16.4138
15-day    377.04 [0.0000]      0.1294         0.0000        2.7128    21.2689
30-day    178.57 [0.0000]      0.0507         0.0000        3.5227    27.7184
60-day    48.87 [0.0000]       0.0167         0.0000        3.9749    28.3814

Panel B: Regression by RiskMetrics
5-day     559.77 [0.0000]      0.3698         0.0000        1.8001    14.6771
10-day    687.41 [0.0000]      0.1638         0.0000        1.8377    13.9682
15-day    589.31 [0.0000]      0.1397         0.0000        2.0736    16.2758
30-day    284.85 [0.0000]      0.0518         0.0000        2.8501    21.8930
60-day    91.20 [0.0000]       0.0154         0.0000        3.4874    24.0941

Panel C: Regression by GJR (1,1)
5-day     1003.478 [0.0000]    0.4566         0.0000        0.7187    10.9908
10-day    1178.237 [0.0000]    0.2247         0.0000        0.1098    11.3857
15-day    1132.649 [0.0000]    0.2163         0.0000       -0.2153    13.8315
30-day    1150.130 [0.0000]    0.1528         0.0000       -0.2749    15.0022
60-day    1568.914 [0.0000]    0.0464         0.0000       -0.2798    12.9577

Note: W-statistic is the result of the White test for heteroskedasticity, with P-values in square brackets; DW-statistic is the result of the Durbin-Watson test for first-order autocorrelation; Corr(ε, x) reports the correlation coefficients between residuals and regressors.
8.3.4 Regression for Out-of-sample Realized Volatility
The information observed from Table 14 suggests that the coefficient estimates of the regression for in-sample realized volatility may be unreliable, so that inferences based on them could be misleading. In order to further investigate the predictive ability of VIX, RiskMetrics and GJR (1,1), we run the OLS regression for realized volatility over the out-of-sample period. Table 15 presents the coefficient estimates of the regression for out-of-sample realized volatility together with R² and the F-statistic of the joint hypothesis test.

Panel A shows the regression on VIX. The coefficients α are negative at the 5-, 10- and 15-day horizons, and they are statistically significant at the 1% level at every horizon except the 15-day horizon. The estimates of β are positive and highly significant at the 1% level in every case except the 60-day horizon. We also find that α increases and β decreases with the horizon, implying that VIX has weaker explanatory power at longer horizons. Because α is insignificantly different from zero and β is significant and close to one at the 15-day horizon, VIX appears to perform best for 15-day-ahead forecasts. However, the F-statistic rejects the joint null hypothesis that α = 0 and β = 1 at this horizon, and at all other horizons as well. Therefore, the VIX forecasts are biased. The R² decreases with the horizon, suggesting that VIX has better predictive power at the shortest horizon. This finding regarding the performance of VIX differs from the previous finding based on the regression for in-sample realized volatility.

Panel B of Table 15 shows the results of the regression on the RiskMetrics approach for the out-of-sample period. The coefficients α are positive at all horizons and statistically significant at every horizon except the 5-day horizon. The estimates of β are significant in every case except the 60-day horizon. These findings are comparable with the results of Panel A. In particular, α increases and β decreases with the horizon.
Table 15. Performance of regression for out-of-sample realized volatility

Horizon   α                            β                            R²       F-statistic

Panel A: Forecasting regression by VIX
5-day     -1.2619* (0.2397) [0.0000]    1.3222* (0.0875) [0.0000]   0.4825   34.1455 [0.0000]
10-day    -0.8393* (0.3133) [0.0079]    1.1023* (0.0805) [0.0000]   0.4387   21.3210 [0.0000]
15-day    -0.1003  (0.3969) [0.8008]    0.9136* (0.0829) [0.0000]   0.3398   16.7091 [0.0000]
30-day     3.1144* (0.5894) [0.0000]    0.4424* (0.0861) [0.0000]   0.1067   31.4245 [0.0000]
60-day     9.5196* (0.7259) [0.0000]   -0.0502  (0.0734) [0.4943]   0.0024   109.0566 [0.0000]

Panel B: Forecasting regression by RiskMetrics
5-day      0.3177  (0.2241) [0.1575]    0.7914* (0.0871) [0.0000]   0.2522   7.3609 [0.0008]
10-day     1.0219* (0.2848) [0.0004]    0.6623* (0.0778) [0.0000]   0.2320   11.3619 [0.0000]
15-day     1.8635* (0.3435) [0.0000]    0.5338* (0.0763) [0.0000]   0.1718   19.7635 [0.0000]
30-day     4.7194* (0.4703) [0.0000]    0.2181* (0.0729) [0.0031]   0.0389   57.7503 [0.0000]
60-day    10.0989* (0.5732) [0.0000]   -0.1175  (0.0607) [0.0544]   0.0191   169.3804 [0.0000]

Panel C: Forecasting regression by GJR (1,1)
5-day     -0.2366  (0.1557) [0.1300]    1.0337* (0.0607) [0.0000]   0.5416   4.7423 [0.0095]
10-day    -0.1803  (0.1706) [0.2916]    1.0189* (0.0467) [0.0000]   0.6645   2.3019 [0.1023]
15-day    -0.1720  (0.1886) [0.3629]    1.0153* (0.0420) [0.0000]   0.7124   1.7324 [0.1791]
30-day    -0.0614  (0.7929) [0.0000]    0.9945* (0.0363) [0.0000]   0.7724   1.0366 [0.3564]
60-day    -0.0873  (0.2649) [0.7422]    1.0069* (0.0284) [0.0000]   0.8669   0.2236 [0.8942]

Note: * denotes significance at the 1% level; standard errors are in parentheses; P-values are in square brackets.
Additionally, the R² decreases with the horizon as well. Like VIX, the RiskMetrics approach has weaker explanatory power for future volatility at longer horizons. Compared with the in-sample regression, Panel B shows a different performance of the RiskMetrics approach in forecasting out-of-sample realized volatility.

The performance of GJR (1,1) in the regression for out-of-sample realized volatility is presented in Panel C of Table 15. The coefficient estimates α are not significantly different from zero at any forecasting horizon. The estimates β are close to one and highly significant at the 1% level in all cases. The joint null hypothesis that α = 0 and β = 1 is not rejected by the F-statistic at the 10-, 15-, 30- and 60-day horizons; at the 5-day horizon it is rejected at the 1% level (F = 4.7423, P-value 0.0095). Therefore, the GJR (1,1) forecasts are unbiased at all but the shortest horizon, and they have strong explanatory power for realized volatility in the out-of-sample period. The R² at each horizon is much higher than the R² of the regressions on VIX and RiskMetrics, and it increases with the horizon. Thus, GJR (1,1) has outstanding forecast performance for out-of-sample realized volatility and outperforms VIX and RiskMetrics. In particular, GJR (1,1) performs better at longer forecasting horizons. To summarize, Table 15 demonstrates that the forecast performance of each approach differs between in-sample and out-of-sample realized volatility at each forecasting horizon. However, the results again confirm that GJR (1,1) outperforms the other approaches.
8.3.5 Residual Tests for Regression of Out-of-sample Realized Volatility
In order to confirm that the inferences drawn from Table 15 are reliable, diagnostic tests are conducted on the residuals of the out-of-sample regressions, and the results are shown in Table 16. Panel A presents the diagnostic results for the residuals of the regression on VIX. The W-statistics in the second column are the results of the heteroscedasticity test. The null hypothesis that the residuals are homoscedastic is not rejected at the 5- and 10-day horizons. The DW-statistic from the Durbin-Watson test at each horizon indicates the presence of first-order autocorrelation in the residuals, which is undesirable. The correlation coefficients between residuals and regressors equal zero at all horizons, suggesting that the OLS estimates are consistent and unbiased. The skewness statistics indicate that the residuals are positively skewed at all horizons. The kurtosis at each horizon reveals that the residuals are leptokurtic in most cases but platykurtic at the 60-day horizon. Thus, the normality assumption is violated.

The diagnostic results in Panel B of Table 16 are similar to those of Panel A. According to the W-statistics in the second column, the residuals of the regression on RiskMetrics are homoscedastic at the 5- and 10-day horizons. The presence of first-order autocorrelation at each horizon is confirmed by the DW-statistics in the third column. The residuals are uncorrelated with the regressors, since the correlation coefficients equal zero at each horizon. The normality assumption is violated because the residuals are positively skewed and leptokurtic in most cases but platykurtic at the 60-day horizon.
Table 16. Residual tests for regression for out-of-sample realized volatility

Horizon   W-statistic         DW-statistic   Corr(ε, x)   Skewness   Kurtosis

Panel A: Regression by VIX
5-day     0.2196 [0.6394]     0.4846         0.0000       0.9849     4.5510
10-day    1.6266 [0.2022]     0.2285         0.0000       1.2872     5.2104
15-day    7.9010 [0.0049]     0.1534         0.0000       1.3408     4.9310
30-day    15.7494 [0.0001]    0.0598         0.0000       0.9751     3.5156
60-day    7.0272 [0.0080]     0.0678         0.0000       0.3166     1.7013

Panel B: Regression by RiskMetrics
5-day     0.0230 [0.8795]     0.4741         0.0000       0.8253     3.9860
10-day    2.3318 [0.1268]     0.2126         0.0000       0.9892     4.0637
15-day    8.2965 [0.0040]     0.1286         0.0000       0.9826     3.7893
30-day    18.9292 [0.0000]    0.0509         0.0000       0.8177     3.1486
60-day    37.7532 [0.0000]    0.0461         0.0000       0.1766     1.5969

Panel C: Regression by GJR (1,1)
5-day     6.4295 [0.0112]     0.7242         0.0000       0.4750     3.1025
10-day    3.5898 [0.0581]     0.3815         0.0000       0.5469     3.7150
15-day    1.9599 [0.1615]     0.1457         0.0000       0.6815     3.7572
30-day    1.9080 [0.1672]     0.1668         0.0000       0.7217     3.4385
60-day    9.3204 [0.0023]     0.2828         0.0000       0.3603     2.0008

Note: W-statistic is the result of the White test for heteroskedasticity, with P-values in square brackets; DW-statistic is the result of the Durbin-Watson test for first-order autocorrelation; Corr(ε, x) reports the correlation coefficients between residuals and regressors.
Panel C of Table 16 reports the residual test results for the regression on GJR (1,1) for out-of-sample realized volatility. According to the W-statistics in the second column, the null hypothesis that the residuals are homoscedastic is not rejected at the 1% level at any horizon except the 60-day horizon. The DW-statistic confirms first-order autocorrelation of the residuals at each horizon. The correlation coefficients between residuals and regressors show that they are uncorrelated in all cases. The normality assumption is violated, since the residuals are positively skewed and leptokurtic in most cases. The residuals from the regression on each approach at each horizon are plotted in the figures of Appendix C. These figures suggest that the violation of the normality assumption appears in each case to be induced by a small number of very large positive or negative outliers. Although the DW-statistics of all regressions indicate the presence of first-order autocorrelation at each horizon, the figures of Appendix C show that the residuals have no autocorrelation over time at the 5-, 10- and 15-day horizons, but are positively correlated at the 30- and 60-day horizons for all regressions. By and large, the statistics of Table 16 demonstrate that GJR (1,1) performs better than the other two approaches, since the null hypothesis of the heteroscedasticity test is not rejected in most cases and the residuals of GJR (1,1) have a distribution that is much closer to normality.
8.3.6 Encompassing Regression for Realized Volatility
In order to investigate whether the forecast performance of one approach is superior to another or whether two approaches complement each other, we run the encompassing regression following Frijns et al. (2008). The form of the regression is:

$$RV_{t,k} = \alpha + \beta_{1}\,\hat{\sigma}_{1,t,k} + \beta_{2}\,\hat{\sigma}_{2,t,k} + \varepsilon_{t,k} \qquad (39)$$

where $RV_{t,k}$ denotes realized volatility at different horizons, and $\hat{\sigma}_{1,t,k}$ and $\hat{\sigma}_{2,t,k}$ are forecasts from alternative approaches. The significance of $\beta_{1}$ and $\beta_{2}$ will reveal whether one approach dominates the other. For instance, if $\beta_{1}$ is significant but $\beta_{2}$ is not, it suggests that $\hat{\sigma}_{1,t,k}$ performs better than $\hat{\sigma}_{2,t,k}$ in forecasting future volatility. If both $\beta_{1}$ and $\beta_{2}$ are significant, it implies that the two approaches complement each other and that each of them has information not included in the other.

Table 17 reports the estimates of the encompassing regression at different horizons, as well as the R². Panel A presents the encompassing regression results for VIX against the RiskMetrics approach. Both estimates of β1 and β2 are highly significant at the 1% level, with P-values equal to 0.0000. Therefore, VIX and the RiskMetrics approach complement each other, indicating that using both approaches may achieve the best forecast performance. This can be confirmed by comparing with the R² of the regressions for realized volatility on VIX and RiskMetrics, i.e. the R² of the encompassing regression is larger than the R² of the regression for realized volatility on VIX or the RiskMetrics approach at all horizons.

The forecasting performance of VIX against the GJR (1,1) model evaluated by the encompassing regression is presented in Panel B of Table 17. We find the same result as in Panel A. The estimated coefficients of both VIX and the forecast derived from the GJR (1,1) model are highly significant at each horizon; therefore, VIX and the GJR (1,1) model complement each other. Since the R² of the encompassing regression for VIX against GJR (1,1) is higher than the R² in Table 15 of the regression for realized volatility on VIX or the GJR (1,1) model at each horizon, the joint use of VIX and the forecast from the GJR (1,1) model may perform better in forecasting future volatility.

Panel C of Table 17 shows the regression results for the RiskMetrics approach against the GJR (1,1) model. Again, the coefficients of the two forecasts are highly significant at the 1% level, with P-values equal to 0.0000 in all cases, and α is negative and significantly different from zero at all horizons. Therefore, Panel C indicates that the RiskMetrics approach and the GJR (1,1) model complement each other over all forecasting horizons. Compared with the R² of the regression on each of them in Table 15, the R² of the encompassing regression is much higher. Consequently, the joint use of the RiskMetrics approach and the GJR (1,1) model can achieve better forecast performance than using only one of them.

Table 17. Encompassing regression for realized volatility

Horizon   α                            β1                           β2                           R²

Panel A: VIX - RiskMetrics
5-day     -0.6588* (0.0372) [0.0000]    1.0075* (0.0324) [0.0000]    0.1705* (0.0243) [0.0000]   0.6285
10-day    -0.6770* (0.0468) [0.0000]    0.8958* (0.0288) [0.0000]    0.2338* (0.0216) [0.0000]   0.6700
15-day    -0.6009* (0.0564) [0.0000]    0.8213* (0.0284) [0.0000]    0.2643* (0.0213) [0.0000]   0.6637
30-day    -0.1218  (0.0816) [0.1357]    0.6474* (0.0290) [0.0000]    0.3323* (0.0217) [0.0000]   0.6194
60-day     1.2879* (0.1240) [0.0000]    0.4059* (0.0312) [0.0000]    0.4165* (0.0234) [0.0000]   0.5244

Panel B: VIX - GJR (1,1)
5-day     -0.2956* (0.0311) [0.0000]    0.1215* (0.0281) [0.0000]    0.9375* (0.0221) [0.0000]   0.7204
10-day    -0.1960* (0.0322) [0.0000]   -0.0753* (0.0197) [0.0001]    1.0972* (0.0156) [0.0000]   0.8258
15-day    -0.1198* (0.0327) [0.0003]   -0.1604* (0.0157) [0.0000]    1.1634* (0.0125) [0.0000]   0.8696
30-day    -0.1272* (0.0319) [0.0001]   -0.1741* (0.0093) [0.0000]    1.1788* (0.0076) [0.0000]   0.9292
60-day    -0.3572* (0.0315) [0.0000]   -0.1085* (0.0053) [0.0000]    1.1371* (0.0045) [0.0000]   0.9625

Panel C: RiskMetrics - GJR (1,1)
5-day     -0.2123* (0.0216) [0.0000]   -0.6575* (0.0251) [0.0000]    1.6798* (0.0263) [0.0000]   0.7517
10-day    -0.2680* (0.0220) [0.0000]   -0.5443* (0.0155) [0.0000]    1.5789* (0.0164) [0.0000]   0.8582
15-day    -0.3050* (0.0225) [0.0000]   -0.4560* (0.0116) [0.0000]    1.4927* (0.0123) [0.0000]   0.8972
30-day    -0.4047* (0.0230) [0.0000]   -0.2872* (0.0066) [0.0000]    1.3293* (0.0072) [0.0000]   0.9443
60-day    -0.5910* (0.0243) [0.0000]   -0.1521* (0.0039) [0.0000]    1.2027* (0.0043) [0.0000]   0.9687

Note: * denotes significance at the 1% level; standard errors are in parentheses; P-values are in square brackets.

To summarize, the encompassing regressions do not show any approach dominating another. The R² values in each panel of Table 17 indicate that forecasting future volatility by jointly using the RiskMetrics approach and the GJR (1,1) model performs best at all horizons.
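An encompassing regression of this kind, with a constant and two competing forecasts as regressors, can be sketched in pure Python as follows. This is an illustrative implementation under stated assumptions, not the routine used in this study: it solves the normal equations with a small Gauss-Jordan inversion and reports conventional OLS t-statistics, and all data and names are hypothetical.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def encompassing_regression(y, f1, f2):
    """OLS of y on a constant and two forecasts.
    Returns coefficients [alpha, beta1, beta2], t-statistics, R-squared."""
    rows = [[1.0, a, b] for a, b in zip(f1, f2)]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    inv = invert(xtx)
    coef = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(c * v for c, v in zip(coef, r)) for r, yi in zip(rows, y)]
    rss = sum(e * e for e in resid)
    mean_y = sum(y) / len(y)
    tss = sum((yi - mean_y) ** 2 for yi in y)
    sigma2 = rss / (len(y) - k)  # residual variance
    t_stats = [c / math.sqrt(sigma2 * inv[i][i]) for i, c in enumerate(coef)]
    return coef, t_stats, 1.0 - rss / tss


# Hypothetical realized volatilities and two competing forecasts.
forecast_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
forecast_b = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
realized = [2.6, 1.9, 4.5, 3.8, 7.0, 6.3]

coefficients, t_statistics, r_squared = encompassing_regression(
    realized, forecast_a, forecast_b
)
print(coefficients, t_statistics, r_squared)
```

If only one slope has a large t-statistic, that forecast encompasses the other; if both do, each forecast carries information the other lacks.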
8.3.7 Average Squared Deviation
Mayhew and Stivers (2003) examined the forecast performance of implied volatility and GARCH-type models for individual stocks using the average squared deviation (ASD). The basic idea of the ASD approach is to identify a good forecaster by comparing the average deviation between squared return shocks and estimated volatility; the forecaster with the lowest average deviation is considered the best predictor. Following Mayhew and Stivers (2003), we use the ASD approach to investigate the forecast performance of VIX, the RiskMetrics approach and the GJR (1,1) model. The ASD for the volatility forecast of each model is expressed as:

$$ASD = \frac{1}{T}\sum_{t=1}^{T}\left[(r_{t}-\bar{r})^{2} - \hat{\sigma}_{t}^{2}\right]^{2} \qquad (40)$$

where $r_{t}$ denotes the daily returns, $\bar{r}$ is the average return, $T$ is the number of observations, and $\hat{\sigma}_{t}^{2}$ is the out-of-sample conditional variance derived from the alternative model.
Table 18. The average squared deviation from alternative approaches

        VIX      RiskMetrics   GJR (1,1)
ASD     6.1253   6.2622        5.9929

As can be seen from Table 18, GJR (1,1) has the lowest average squared deviation compared with VIX and the RiskMetrics approach. Table 18 again documents that GJR (1,1) beats VIX and the RiskMetrics approach in volatility forecasting.
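The ASD criterion can be illustrated with a short Python sketch. It assumes that the measure is the mean, over the out-of-sample period, of the squared difference between the squared return shock and the variance forecast, as described in the text; the function name and the sample values are hypothetical.

```python
def average_squared_deviation(returns, variance_forecasts):
    """Mean squared difference between squared return shocks
    (r - mean)**2 and the corresponding variance forecasts."""
    mean_return = sum(returns) / len(returns)
    deviations = [
        ((r - mean_return) ** 2 - h) ** 2
        for r, h in zip(returns, variance_forecasts)
    ]
    return sum(deviations) / len(deviations)


# Hypothetical daily returns and two competing variance forecasts.
sample_returns = [1.0, -1.0]
asd_exact = average_squared_deviation(sample_returns, [1.0, 1.0])
asd_poor = average_squared_deviation(sample_returns, [2.0, 0.0])
print(asd_exact, asd_poor)
```

The forecaster with the smaller value is preferred, so in this toy example the first forecast series, which matches the squared shocks exactly, ranks ahead of the second.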
8.3.8 Regression for Squared Return Shocks
We run the regression for squared return shocks on the out-of-sample conditional volatility forecasts from VIX, the RiskMetrics approach and the GJR (1,1) model, respectively. The objective is to examine which approach best tracks the dynamics of daily volatility. The regression has the form

$$(r_{t}-\bar{r})^{2} = \alpha + \beta\,\hat{\sigma}_{t}^{2} + \varepsilon_{t} \qquad (41)$$

where $r_{t}$ is the daily return on day $t$, $\bar{r}$ denotes the average daily return, and $\hat{\sigma}_{t}^{2}$ is the conditional volatility forecast by the respective approach. The conditions for $\hat{\sigma}_{t}^{2}$ to be an unbiased forecaster are α = 0 and β = 1.
Table 19. Regression results for squared return shocks

              α                              β                              R²
VIX           -2.9370* (-4.4445) [0.0000]    3.5538* (6.5669) [0.0000]     0.1471
RiskMetrics    0.3182  (1.0663)  [0.2873]    0.7427* (3.8342) [0.0002]     0.0555
GJR (1,1)      0.1954  (0.7355)  [0.4627]    0.8427* (5.0624) [0.0000]     0.0930

Note: * denotes significance at the 1% level; t-statistics are in parentheses; P-values are in square brackets.
Table 19 reports the parameter estimates for regression of squared return shocks by out-of-sample volatility forecasts. As can be seen, the coefficient α is negative and highly significant for regression by VIX, and positive and insignificant in other two cases. The coefficients β are positive and highly significant at 1% level for all models. Compared to the regression by VIX, the coefficients β of other two approaches are more close to unity and the coefficients α are insignificant from zero. However, the value of R2 of respective approach indicates that VIX outperforms other approaches outstandingly.
8.3.9 Encompassing Regression for Squared Daily Return Shocks
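To further investigate whether VIX dominates the other approaches in tracking the dynamics of daily volatility, we run the encompassing regression for squared daily return shocks. Following Mayhew and Stivers (2003), the encompassing regression takes the form

$$(r_t - \bar{r})^2 = \alpha + \beta_1\,\sigma_{VIX,t}^2 + \beta_2\,\sigma_{j,t}^2 + \varepsilon_t,$$ (42)

where $(r_t - \bar{r})^2$ is the squared return shock on day t, $\sigma_{VIX,t}^2$ is the volatility forecasted by VIX, and $\sigma_{j,t}^2$ is the volatility forecast from the other approaches.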
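The encompassing regression can be sketched as an OLS fit with a constant and two forecast series. The minimal example below solves the normal equations with plain Gaussian elimination. The function names, the data and the choice of solver are assumptions made for illustration only, and no significance tests are computed.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def encompassing_ols(y, x1, x2):
    """OLS of y on a constant, x1 and x2; returns [alpha, beta1, beta2]."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
           for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve_linear(xtx, xty)


# Hypothetical variance forecasts from an implied-volatility index and a model.
vix_var = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
model_var = [2.0, 1.0, 4.0, 3.0, 6.0, 7.0]

# Hypothetical squared shocks built so that both forecasts carry information.
shocks = [0.5 + 2.0 * v - 1.0 * m for v, m in zip(vix_var, model_var)]

alpha, beta1, beta2 = encompassing_ols(shocks, vix_var, model_var)
print(f"alpha={alpha:.4f}, beta1={beta1:.4f}, beta2={beta2:.4f}")
```

When both slope coefficients differ from zero, as in this constructed data, each forecast contributes information the other lacks. A slope near zero for one forecast would suggest that it is encompassed by the other.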
Table 20. Encompassing regression results for squared return shocks
                  α            β1           β2           R2
VIX-RiskMetrics   -4.6452*     6.1189*      -1.0265*     0.1766
                  (-5.3597)    (6.0492)     (-2.9833)
                  [0.0000]     [0.0000]     [0.0031]
VIX-GJR(1,1)      -5.3426*     6.6711*      -1.0026**    0.1655
                  (-4.3891)    (4.6532)     (-2.3446)
                  [0.0000]     [0.0000]     [0.0198]
Note: *, ** denotes significant at 1%, 5% level, respectively; t-statistics are in parentheses; P-values are in square brackets.
Table 20 presents the parameter estimates of the encompassing regressions for squared return shocks, with VIX paired against each of the other approaches. For the regression on the volatility forecasts from VIX and the RiskMetrics approach, the coefficient α is negative and deviates significantly from zero, and both parameter estimates β1 and β2 are significant at the 1% level. Therefore, VIX and the RiskMetrics approach complement each other, suggesting that each contains information not included in the other. For the regression on VIX and GJR (1,1), the coefficient α is also negative and significant at the 1% level, the coefficient β1 is positive and significant at the 1% level, and the coefficient β2 is negative and significant at the 5% level. Consequently, at the 1% level VIX dominates GJR (1,1), while at the 5% level the two approaches complement each other. In addition, the β1 estimates of both encompassing regressions (6.1189 and 6.6711) indicate that VIX has the stronger positive explanatory power. Judged by R2, the regression that jointly uses VIX and RiskMetrics (0.1766) outperforms the one that uses VIX and GJR (1,1) (0.1655).
9 Conclusion
The objective of this study is to examine the predictive power of model based forecasts and the VIX index in forecasting future volatility of S&P 500 index daily returns. The study period is from January 1990 to December 2010, including 5291 observations. First, a variety of time series models were estimated, including the random walk model and the GARCH (1,1), GJR (1,1) and EGARCH (1,1) models. The analysis of the estimated models indicates that GJR (1,1) performs best in out-of-sample forecasting over the sample period. Then, the forecast performance of VIX, GJR (1,1) and RiskMetrics was compared using various approaches, following Frijns et al. (2008), Giot (2005b) and Mayhew and Stivers (2003). The empirical results are detailed in section 8. The results of our study are in line with Becker, Clements and White (2006), Becker, Clements and White (2007) and Becker and Clements (2008). The empirical evidence does not support the view that implied volatility subsumes all information content, and the study results provide strong evidence indicating that GJR (1,1) outperforms VIX and RiskMetrics for modeling future volatility of S&P 500 index daily returns. In addition, the results of the encompassing regression for future realized volatility at 5-, 10-, 15-, 30- and 60-day horizons, and the results of the encompassing regression for squared return shocks suggest that the joint use of GJR (1,1) and RiskMetrics can produce the best forecasts. By and large, our findings indicate that implied volatility is inferior for future volatility forecasting, and that the model based forecasts have more explanatory power for future volatility.
References
Ahoniemi, K. (2006). Modeling and forecasting implied volatility – an econometric analysis of the VIX index. Helsinki Center of Economic Research, discussion paper, No.129.
Akgiray, V. (1989). Conditional heteroscedasticity in time series of stock returns: Evidence and forecasts. Journal of Business, 62, 55-79.
Alexander, C. (2001). Market Models: A guide to financial data analysis. John Wiley & Sons, Ltd.
Awartani, B. M. A. and Corradi, V. (2005). Predicting the volatility of the S&P-500 stock index via GARCH models: The role of asymmetries. International Journal of Forecasting, 21, 167-183.
Becker, R. and Clements, A.E. (2008). Are combination forecasts of S&P 500 volatility statistically superior? International Journal of Forecasting, 24, 122-133.
Becker, R., Clements, A.E. and Coleman-Fenn, C.A. (2009). Forecast performance of implied volatility and the impact of the volatility risk premium. NCER Working Paper Series. http://www.ncer.edu.au/papers/documents/WPNo45.pdf
Becker, R., Clements, A.E. and McClelland, A. (2009). The jump component of S&P 500 volatility and the VIX index. Journal of Banking & Finance, 33, 1033-1038.
Becker, R., Clements, A.E. and White, S.I. (2006). On the informational efficiency of S&P 500 implied volatility. North American Journal of Economics and Finance, 17, 139-153.
Becker, R., Clements, A.E. and White, S.I. (2007). Does implied volatility provide any information beyond that captured in model-based volatility forecasts? Journal of Banking & Finance, 31, 2535-2549.
Blair, B. J., Poon, Ser-H. and Taylor, S. J. (2001). Forecasting S&P 100 volatility: the incremental information content of implied volatilities and high-frequency index returns. Journal of Econometrics, 105, 5-26.
Bollerslev, T., Chou, R. Y. and Kroner, K. F. (1992). ARCH modeling in Finance: A review of the theory and empirical evidence. Journal of Econometrics, 52, 1-2, 5-59.
Brailsford, T. J. and Faff, R. W. (1996). An evaluation of volatility forecasting techniques. Journal of Banking and Finance, 20, 419-438.
Brooks, C. (2008). Introductory Econometrics for Finance. Cambridge University Press.
Canina, L. and Figlewski, S. (1993). The information content of implied volatility. The Review of Financial Studies, 6, 3, 659-681.
Chong, C. W., Ahmad, M. I. and Abdullah, M. Y. (1999). Performance of GARCH models in forecasting stock market volatility. Journal of Forecasting, 18, 333-343.
Christensen, B. J. and Prabhala, N. R. (1998). The relation between implied and realized volatility. Journal of Financial Economics, 50, 125-150.
Chuang, I. Y., Lu, J. R. and Lee, P. H. (2007). Forecasting volatility in the financial markets: A comparison of alternative distributional assumptions. Applied Financial Economics, 17, 1051-1060.
Corrado, C. and Miller, T. W., Jr. (2005). The forecast quality of CBOE implied volatility indexes. The Journal of Futures Markets, 25, 4, 339-373.
Day, T. E. and Lewis, C. M. (1992). Stock market volatility and the information content of stock index options. Journal of Econometrics, 52, 267-287.
Evans, T. and McMillan, D. G. (2007). Volatility forecasts: The role of asymmetric and long-memory dynamics and regional evidence. Applied Financial Economics, 17, 1421-1430.
Figlewski, S. (1997). Forecasting volatility. Financial Markets, Institutions and Instruments, 6, 1, 1-88.
Fleming, J., Ostdiek, B. and Whaley, R. E. (1995). Predicting stock market volatility: A new measure. The Journal of Futures Markets, 15, 3, 265-302.
Franses, P. H. and van Dijk, R. (1996). Forecasting stock market volatility using (non-linear) GARCH models. Journal of Forecasting, 15, 229-235.
Frijns, B., Tallau, C. and Tourani-Rad, A. (2008). The information content of implied volatility: Evidence from Australia. 21st Australasian Finance and Banking Conference 2008 Paper. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1246142
Frijns, B., Tallau, C. and Tourani-Rad, A. (2010). Australian Implied Volatility Index. The Finsia Journal of Applied Finance, 1, 31-35.
Galdi, F. C. and Pereira, L. M. (2007). Value at Risk (VaR) using volatility forecasting models: EWMA, GARCH and Stochastic volatility. Brazilian Business Review, 4, 1, 74-94.
Giot, P. (2005a). Relationships between implied volatility indexes and stock index returns. Journal of Portfolio Management Spring 2005, 31, 3, 92-100.
Giot, P. (2005b). Implied volatility indexes and daily value at risk models. The Journal of Derivatives, 12, 54-64.
Giot, P. and Laurent, S. (2006). The information content of implied volatility in light of the jump/continuous decomposition of realized volatility. Working Paper. http://www.core.ucl.ac.be/econometrics/Giot/Papers/implied4_8.pdf
Harvey, C. R. and Whaley, R. E. (1992). Dividends and S&P 100 Index Option Valuation. The Journal of Futures Markets, 12, 123-137.
Latané, H. A. and Rendleman, R. J. (1976). Standard deviations of stock price ratios implied in option prices. The Journal of Finance, 31, 2, 369-381.
Lamoureux, C. G. and Lastrapes, W. D. (1993). Forecasting stock-return variance: Toward an understanding of stochastic implied volatilities. The Review of Financial Studies, 6, 2, 293-326.
Lee, J.H.H. and King, M.L. (1993). A locally most powerful based score test for ARCH and GARCH regression disturbances. Journal of Business and Economic Statistics, 7, 259-279.
Lumsdaine, R.L. and Ng, S. (1999). Testing for ARCH in the presence of a possibly misspecified conditional mean. Journal of Econometrics, 93, 257-279.
Magnus, F.J. and Fosu, O.A.E. (2006). Modelling and forecasting volatility of returns on the Ghana stock exchange using GARCH models. American Journal of Applied Sciences, 3 (10), 2042-2048.
Mayhew, S. and Stivers, C. (2003). Stock return dynamics, option volume, and the information content of implied volatility. The Journal of Futures Markets, 23, 7, 615-646.
Nelson, D. B. (1992). Filtering and forecasting with misspecified ARCH models I: getting the right variance with the wrong model. Journal of Econometrics, 52, 61-90.
Patev, P., Kanaryan, N. and Lyroudi, K. (2009). Modelling and forecasting the volatility of thin emerging stock markets: the case of Bulgaria. Comparative Economic Research, 12, 4, 47-60.
Poon, S. and Granger, C. (2001). Forecasting financial market volatility: A view. Working paper, University of Strathclyde and University of California, San Diego.
Poterba, J. M. and Summers, L. H. (1986). The persistence of volatility and stock market fluctuations. American Economic Review, 76, 1142-1151.
Sheikh, A. M. (1989). Stock splits, volatility increases, and implied volatilities. The Journal of Finance, 44, 1361-1372.
The CBOE Volatility Index – VIX. (2009). CBOE Proprietary Information. Chicago Board Options Exchange, Incorporated. http://www.cboe.com/micro/VIX/vixwhite.pdf
Tse, Y. K. (1991). Stock returns volatility in the Tokyo Stock Exchange. Japan and the World Economy, 3, 285-298.
Tse, Y. K. and Tung, S. H. (1992). Forecasting volatility in the Singapore stock market. Asia Pacific Journal of Management, 9(1), 1-13.
Walsh, D. M. and Tsou, G. Y. (1998). Forecasting index volatility: sampling interval and non-trading effects. Applied Financial Economics, 8, 477-485.
Whaley, R. E. (1993). Derivatives on market volatility: hedging tools long overdue. Journal of Derivatives, 1, 1, 71-84.
Wilhelmsson, A. (2006). GARCH forecasting performance under different distribution assumptions. Journal of Forecasting, 25, 561-578.
Yu, J. (2002). Forecasting volatility in the New Zealand stock market. Applied Financial Economics, 12, 193-202.
Zivot, E. (2008). Practical issue in the analysis of univariate GARCH models. Handbook of Financial Time Series.
Appendix A VIX and Future Realized Volatility
[Time-series plot, 1/1995 to 1/2010: annualized realized volatility (5 trading days) and VIX, in percent]
Figure A.1 VIX and annualized future realized volatility (5 trading days)
[Time-series plot, 1/1995 to 1/2010: annualized realized volatility (10 trading days) and VIX, in percent]
Figure A.2 VIX and annualized future realized volatility (10 trading days)
[Time-series plot, 1/1995 to 1/2010: annualized realized volatility (15 trading days) and VIX, in percent]
Figure A.3 VIX and annualized future realized volatility (15 trading days)
[Time-series plot, 1/1995 to 1/2010: annualized realized volatility (30 trading days) and VIX, in percent]
Figure A.4 VIX and annualized future realized volatility (30 trading days)
[Time-series plot, 1/1995 to 1/2010: annualized realized volatility (60 trading days) and VIX, in percent]
Figure A.5 VIX and annualized future realized volatility (60 trading days)
Appendix B Out-of-sample Forecast Performance on Realized Volatility
[Three panels, Apr 10 to Oct 10: 5-day ahead forecast by VIX; 5-day ahead forecast by RiskMetrics; 5-day ahead forecast by GJR(1,1)]
Figure B1. Out-of-sample 5-day ahead realized volatility forecasts
[Three panels, Apr 10 to Oct 10: 10-day ahead forecast by VIX; 10-day ahead forecast by RiskMetrics; 10-day ahead forecast by GJR(1,1)]
Figure B2. Out-of-sample 10-day ahead realized volatility forecasts
[Three panels, Apr 10 to Oct 10: 15-day ahead forecast by VIX; 15-day ahead forecast by RiskMetrics; 15-day ahead forecast by GJR(1,1)]
Figure B3. Out-of-sample 15-day ahead realized volatility forecasts
[Three panels, Apr 10 to Oct 10: 30-day ahead forecast by VIX; 30-day ahead forecast by RiskMetrics; 30-day ahead forecast by GJR(1,1)]
Figure B4. Out-of-sample 30-day ahead realized volatility forecasts
[Three panels, Apr 10 to Oct 10: 60-day ahead forecast by VIX; 60-day ahead forecast by RiskMetrics; 60-day ahead forecast by GJR(1,1)]
Figure B5. Out-of-sample 60-day ahead realized volatility forecasts
Appendix C Residuals from Regressions for Out-of-sample Realized Volatility
[Residual plots, 4/2010 to 10/2010, one panel per approach and horizon:]
Residuals from Regression by VIX (5-day, 10-day, 15-day, 30-day, 60-day)
Residuals from Regression by RiskMetrics (5-day, 10-day, 15-day, 30-day, 60-day)
Residuals from Regression by GJR(1,1) (5-day, 10-day, 15-day, 30-day, 60-day)