Econometric Modeling of Value-at-Risk

Timotheos Angelidis∗ and Stavros Degiannakis†

Abstract. Recently, risk management has become a standard prerequisite for all financial institutions. Value-at-Risk is the main tool for reporting to bank regulators the risk that financial institutions face. As it is essential to estimate it accurately, numerous methods have been proposed to minimize the forecast error. This chapter provides a selective survey of the risk management techniques that have been applied and discusses potential improvements in estimating, evaluating and adjusting Value-at-Risk and Expected Shortfall.

JEL Nos.: C22; C52; C53; G15.

Keywords: Backtesting, Expected Shortfall, Value-at-Risk, Volatility Forecasting.

∗ Dept. of Economics, University of Crete, Gallos Campus, 74100 Rethymno, Greece, and Athens Laboratory of Business Administration, Athinas Ave. & 2a Areos Street, 166 71 Vouliagmeni, Greece. Email: [email protected]. Tel.: +30-210-8964-736.
† Dept. of Statistics, Athens University of Economics and Business, 76 Patision Street, Athens GR-104 34. Email: [email protected]. Tel.: +30-210-8203-120. The usual disclaimer applies.

Contents

1 Introduction
2 Value at Risk
  2.1 Value at Risk Criticisms
3 Expected Shortfall
4 VaR and ES Modeling
  4.1 Parametric Volatility Forecasting
    4.1.1 Modeling the Underlying Distribution
    4.1.2 ARCH Volatility Specifications
    4.1.3 One-step-ahead VaR and ES Calculation under Parametric Volatility Forecasting
  4.2 Non-Parametric Risk Management Techniques
    4.2.1 Historical Simulation
  4.3 Semi-Parametric Volatility Forecasting
    4.3.1 Filtered historical simulation
    4.3.2 Extreme value theory
  4.4 Multi-period VaR and ES forecasts
  4.5 Realized Volatility Models
5 Liquidity Adjusted Value-at-Risk
  5.1 VaR Adjustments Based on the Bid-Ask Spread
  5.2 Trading Strategies that Minimize the Expected Cost and its Variance
6 Backtesting Value-at-Risk
  6.1 Unconditional Coverage
  6.2 Conditional Coverage
  6.3 Generalization of the Conditional Coverage Test
  6.4 Loss Functions
7 Application
8 Summary
A Bibliography
B Tables and Figures

List of Tables

1 Values of the supervisory-determined multiplicative factor (k) in order to determine the Market Required Capital.
2 Unconditional coverage “no rejection” regions for a 95% significance level.
3 The predictive mean squared loss function, $\Psi^{(i)}_{(RV)}$, for $i = 1, \ldots, 5$ models. $\Psi^{(i)}_{(RV)}$ is the average squared distance between the annualized predicted standard deviation of model $i$, $\sqrt{252}\,\sigma^{(i)}_{t+1|t}$, and the annualized realized volatility, $\sqrt{252}\,\sigma^{(RV)}_{t+1}$.
4 Percentage of violations ($N/\tilde{T}$), p-values of Kupiec’s (unconditional coverage) and Christoffersen’s (conditional coverage) tests, and the predictive mean squared loss functions $\Psi^{(i)}_{(VaR)}$ and $\Psi^{(i)}_{(ES)}$.

List of Figures

1 For $p = \Pr\left(y_t \le VaR_t^{(p)}\right) = 5\%$, the $VaR_t^{(p)} = -1.645$, under the assumption that $y_t \sim N(0,1)$.
2 For $y_t \sim N(0,1)$ and $p = \Pr\left(y_t \le VaR_t^{(p)}\right) = 5\%$, the Value-at-Risk, $VaR_t^{(p)}$, and the Expected Shortfall, $ES_t^{(p)}$.
3 Liquidity Adjusted Value-at-Risk.
4 S&P500 daily prices from January 1990 to December 2003.
5 S&P500 daily returns from January 1990 to December 2003.
6 S&P500 annualized realized volatility from January 1997 to December 2003.
7 The one-day-ahead annualized standard deviation forecasts of the AR(1)GARCH(1,1)-Normal model and the realized volatility from January 2002 to December 2003.
8 The one-day-ahead annualized standard deviation forecasts of the AR(1)GARCH(1,1)-skewed Student-t model and the realized volatility from January 2002 to December 2003.
9 The one-day-ahead annualized standard deviation forecasts of the AR(1)APARCH(1,1)-Normal model and the realized volatility from January 2002 to December 2003.
10 The one-day-ahead annualized standard deviation forecasts of the AR(1)APARCH(1,1)-skewed Student-t model and the realized volatility from January 2002 to December 2003.
11 The one-day-ahead annualized standard deviation forecasts of the ARFIMAX model and the realized volatility from January 2002 to December 2003.
12 The one-day-ahead VaR and ES forecasts of the AR(1)GARCH(1,1)-Normal model and the S&P500 log-returns from January 2002 to December 2003.
13 The one-day-ahead VaR and ES forecasts of the AR(1)GARCH(1,1)-skewed Student-t model and the S&P500 log-returns from January 2002 to December 2003.
14 The one-day-ahead VaR and ES forecasts of the AR(1)APARCH(1,1)-Normal model and the S&P500 log-returns from January 2002 to December 2003.
15 The one-day-ahead VaR and ES forecasts of the AR(1)APARCH(1,1)-skewed Student-t model and the S&P500 log-returns from January 2002 to December 2003.
16 The one-day-ahead VaR and ES forecasts of the ARFIMAX model and the S&P500 log-returns from January 2002 to December 2003.

1 Introduction

Nowadays, the importance of efficient risk management has increased dramatically. This is a result of the globalization of financial markets, the technological revolution in trading systems and, perhaps most importantly, the development of the derivative markets. Risk management relates portfolio returns to risk, with risk summarized by a single number. The widespread adoption of Value-at-Risk (VaR) as a risk management tool is part of this approach. VaR refers to a portfolio’s worst outcome that is likely to occur at a given confidence level. According to the Basle Committee, VaR is used by financial institutions to calculate capital charges in respect of their financial risk. In addition, the Securities and Exchange Commission allows financial institutions to use VaR in order to report their market risk exposure. Recently, there has been increasing use of VaR as a tool for managing and regulating credit risk and as a methodology for controlling the risk exposure of a portfolio.

There are three methods of VaR calculation. The major representatives of the parametric family are the Autoregressive Conditional Heteroskedasticity (ARCH) models. The second category, non-parametric modeling, relies on actual prices without assuming any specific distribution. The third framework is the semi-parametric family, which combines parametric and non-parametric models in order to exploit the advantages of both. Filtered Historical Simulation (FHS) and Extreme Value Theory (EVT) are the representative methods of the third family.

There are numerous papers dealing with the evaluation of various VaR models. However, practitioners and academics have not reached a common conclusion about the best performing model. In general, the performance of the models is data sensitive; in most cases, there is no specific model that outperforms its competitors for all datasets. In the financial literature there are mainly two methods of model evaluation: the evaluation of the statistical properties of the VaR forecasts, and the construction of loss functions that measure the distance between the predicted VaR and the actual portfolio outcome. The main drawback of the first method is that it does not allow a risk manager to select a unique model, since more than one model may be deemed adequate; most importantly, such methods are not able to rank the models according to their ability to predict VaR. The main shortcoming of the second class of evaluation is that, if the risk management techniques are not first filtered by the aforementioned backtesting procedures, an inadequate model may be selected.

The present study reviews the most important methods of VaR forecasting

as well as the methods of model evaluation. For the latter aspect of the study, a two-stage evaluation framework is presented. In the first stage, the statistical accuracy of the models is examined. In the second stage, a loss function is applied to investigate whether the differences between the models that pass the first stage are statistically significant. Under the two-stage method, researchers can find, for each dataset, a specific model that accurately forecasts next period’s VaR.

The rest of the chapter is organized as follows. Section 2 provides an overview of Value-at-Risk, while in section 3 the Expected Shortfall measure is introduced. Section 4 explains the parametric, non-parametric and semi-parametric methods of estimating risk. Section 5 presents methodologies for adjusting VaR for liquidity risk. Section 6 describes the evaluation framework and section 7 offers an empirical application of some of the presented techniques. Section 8 concludes the chapter.

2 Value at Risk

Financial risk is the fundamental element that influences financial behaviour, as it describes the unforeseen changes in underlying risk factors. Risk is categorized in five areas: market, liquidity, business, credit and operational risk. Specifically, market risk is defined as the risk that arises from unforeseen movements in market prices. Liquidity risk arises from the fact that an investor cannot liquidate the assets that he holds without causing significant price changes. Business risk describes the risk that arises from the specific industry and market in which the firm operates. Credit risk arises when the counterparty is unable to fulfill its obligations. Finally, operational risk is related to failures of internal systems, physical catastrophes or human errors. The focus of this chapter is on market risk, as VaR is widely adopted for measuring it.

The first indirect reference to VaR was made by the New York Stock Exchange in 1922, when it required its member firms to hold capital equal to 10% of their assets. Leavens (1945) presented the first quantitative example of VaR, while Markowitz (1952) and Roy (1952), in independent works, suggested VaR measures that were based on the covariances of risk factors and reflected hedging and diversification effects. Baumol (1963) proposed a measure based on the standard deviation, adjusted by a confidence level parameter that reflects the user’s attitude to risk; this measure is not different from the widely known VaR. However, given that the computational demands of estimating VaR are quite high and processing power was then limited, VaR was not applied in real life examples. The widespread adoption of VaR came after 1994,

when JP Morgan made its RiskMetrics system available on the Internet. VaR has been adopted by bank regulators to determine bank capital requirements against the risk that financial institutions face. Specifically, according to the Basle Committee proposals (1995a, 1995b), banks can determine their daily capital charge by following these rules:

• The 99% confidence level must be used;
• The holding period must be set to 10 trading days and
• Banks can calculate VaR by implementing internal models.

Specifically, the Market Required Capital (MRC) is calculated as:

$$MRC_t = \max\left( k\,\frac{1}{60}\sum_{i=1}^{60} VaR_{t-i},\; VaR_{t-1} \right), \qquad (1)$$

where $VaR_t$ is the VaR on day $t$, which is compared to the average VaR over the preceding 60 days, scaled by the multiplier $k$. The supervisory-determined multiplicative factor $k$ ranges from 3 to 4, depending on the results of a backtesting procedure, and is used to cover operational and specific risks that cannot be captured by the banks’ internal risk models. In particular, for a one-year period (250 trading days) and a 99% level of confidence, if the number of violations¹ is at most four, the multiplier equals 3, while if it is ten or more, $k = 4$. The specific values of $k$ are presented in Table 1. Within the green zone (four or fewer exceptions), a VaR model is deemed accurate; within the yellow zone (five through nine exceptions), $k$ increases incrementally with the number of violations. Finally, within the red zone (ten or more exceptions), the VaR model is deemed inaccurate, $k$ increases to 4, and the institution must improve its risk measurement and management system.

But what is VaR? Let $P_t$ be the observed value of a portfolio at time $t$, and let the profit or loss (P/L) for the period from $t-1$ to $t$ equal $y_t = \ln(P_t) - \ln(P_{t-1})$. In mathematical terms, for a long trading position and under the assumption of standard normally distributed P/L, VaR is extracted from the following equation:

$$p = \Pr\left( y_t \le VaR_t^{(p)} \right) = \int_{-\infty}^{VaR_t^{(p)}} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{1}{2} y_t^2 \right) dy_t. \qquad (2)$$

¹ A violation occurs if the predicted VaR is not able to cover the realized loss.

Figure 1 depicts this relation. Given that $y_t \sim N(0,1)$, the probability of a return below $VaR_t^{(p)} = -1.645$ is equal to the mass to the left of the vertical line, i.e. $p = 5\%$. Put differently, this value (−1.645) is the VaR at the 95% level of confidence. For example, for a capital of €100,000 (with returns measured in percentage points), VaR equals €1,645.
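Both calculations above are easy to reproduce. The following sketch (Python, with illustrative numbers; the function name and the `var_history` input are ours, not part of any regulatory toolkit) computes the 95% normal VaR of equation (2) and the Market Required Capital of equation (1):

```python
import numpy as np
from scipy.stats import norm

# One-day 95% VaR under standard normal P/L, equation (2):
# the 5% quantile of N(0,1) is -1.645 (in percentage points here).
p = 0.05
var_pct = norm.ppf(p)                       # -1.6449
capital = 100_000                           # EUR 100,000 position
print(capital * abs(var_pct) / 100)         # ~ EUR 1,645

# Market Required Capital, equation (1): the supervisory multiplier k
# scales the 60-day average VaR (k is read off Table 1 from the number
# of violations over the last 250 trading days).
def mrc(var_history, k=3.0):
    """var_history[-1] is yesterday's VaR; uses the last 60 forecasts."""
    return max(k * np.mean(var_history[-60:]), var_history[-1])
```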

2.1 Value at Risk Criticisms

Taleb (1997a, 1997b) and Hoppe (1998, 1999) argued that the underlying statistical assumptions of VaR modeling are violated, while Beder (1995) reached the conclusion that different risk management techniques produce different VaR forecasts and therefore the risk estimates might be imprecise. Marshall and Siegel (1997) compared the results of eleven software vendors who used a common covariance matrix and implemented the RiskMetrics VaR measure. They showed that there are significant differences for identical portfolios and the same market parameters, and hence the banks’ internal models are exposed to “implementation” risk.

Moreover, if many market participants use VaR to allocate capital or maintain market risk limits, they will have a tendency to liquidate positions simultaneously during periods of market turmoil. To make this argument clearer, consider the situation where all risk managers make their decisions based on the same VaR number. During market crises VaR will be high, and therefore they will not be able to reduce their exposures by selling the most risky assets, as there will not be any counterparty to buy them; hence they will be forced to unwind their positions at much lower prices, which implies that the losses will be significantly higher than the expected ones.

A further criticism arose from the work of Artzner et al. (1997, 1999), who showed that VaR is not necessarily sub-additive, i.e. the VaR of a portfolio with two instruments may be greater than the sum of the individual VaRs of these two instruments. This matters for two reasons:

• If risks are not sub-additive, their sum will underestimate the total risk;
• The capital requirements of the individual units will be less than the capital requirements of the whole firm.

Lastly, VaR does not give any indication of the size of the potential loss, given that the loss exceeds VaR. It indicates the potential loss at a specific confidence level, but it tells nothing about the expected loss beyond it. If a VaR violation occurs, a risk manager expects to lose more than what VaR predicted.

3 Expected Shortfall

In order to respond to some of these shortcomings, Artzner et al. (1997) and Delbaen (2002) introduced the Expected Shortfall (ES) risk measure, which equals the expected value of the loss given that a VaR violation has occurred. ES is a coherent measure, which implies that if $P_{1,t}$ and $P_{2,t}$ are the future values of two risky positions, the risk measure $r(.)$ satisfies the following four properties:

$$\begin{aligned}
r(P_{1,t} + P_{2,t}) &\le r(P_{1,t}) + r(P_{2,t}) && \text{(sub-additivity)} \\
r(\lambda P_{1,t}) &= \lambda\, r(P_{1,t}) && \text{(homogeneity)} \\
r(P_{1,t}) \ge r(P_{2,t}), &\ \text{if } P_{1,t} \le P_{2,t} && \text{(monotonicity)} \\
r(P_{1,t} + \alpha) &= r(P_{1,t}) - \alpha && \text{(risk-free condition)},
\end{aligned} \qquad (3)$$

for any number $\alpha$ and positive number $\lambda$. Expected Shortfall is defined mathematically as:

$$ES_t = E\left( |Loss_t| \,\middle|\, |Loss_t| > |VaR_t| \right), \qquad (4)$$

i.e. the expected value of the loss if a $VaR_t$ violation occurs. Equivalently, ES for a long trading position can be defined as:

$$ES_t^{(p)} = E\left( VaR_t^{(\tilde{p})} \right), \quad \forall\, 0 < \tilde{p} < p, \qquad (5)$$

where $p$ is given by (2).

ES informs the risk manager about the expected loss if an extreme event occurs. Figure 2 presents this relation. Under the assumption of the standard normal distribution, $VaR_t^{(p)} = -1.645$ and $ES_t^{(p)} = -2.061$ for $p = 5\%$. Continuing the VaR example, the average loss given a VaR violation equals €2,061.

In summary, ES shares the same features as VaR, as it measures the risk across positions and summarizes the risk in just one number. However, it is a better risk measure, since:

• ES informs the risk manager what to expect when a VaR violation occurs;
• ES is a more reliable risk measure during market turmoil; Yamai and Yoshiba (2005) argued that VaR can mislead investors in such cases;
• ES does not discourage diversification, while VaR sometimes does;
• ES estimates might be more accurate than VaR estimates (Mausser and Rosen, 2000) and
• ES is a coherent measure.
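For the standard normal case, the two definitions can be checked numerically. A minimal sketch (the closed-form expression $-\phi(\Phi^{-1}(p))/p$ for the normal ES is a standard result, not given in the text above):

```python
import numpy as np
from scipy.stats import norm

p = 0.05
var_p = norm.ppf(p)                      # -1.645

# Closed form for the standard normal: ES = -phi(Phi^{-1}(p)) / p.
es_exact = -norm.pdf(var_p) / p          # ~ -2.06, the -2.061 quoted above

# Equation (5): ES as the average of the VaRs deeper in the tail,
# i.e. the quantiles at 0 < p~ < p.
p_tilde = (np.arange(1000) + 0.5) * p / 1000
es_slices = norm.ppf(p_tilde).mean()     # ~ -2.06 as well
print(es_exact, es_slices)
```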

4 VaR and ES Modeling

4.1 Parametric Volatility Forecasting

Let $y_t = \ln(P_t/P_{t-1})$ denote the continuously compounded rate of return from time $t-1$ to $t$, which is decomposed into two parts, the predictable, $\mu_t$, and unpredictable, $\varepsilon_t$, component:

$$\begin{aligned}
y_t &= \mu_t + \varepsilon_t \\
\mu_t &= \mu\left(\theta | I_{t-1}\right) \\
\varepsilon_t &= \sigma_t z_t \\
\sigma_t &= g\left(\theta | I_{t-1}\right) \\
z_t &\overset{i.i.d.}{\sim} f\left(w; 0, 1\right),
\end{aligned} \qquad (6)$$

where $\theta$ is a vector of unknown parameters, $I_t$ denotes the information set available at time $t$, $f(.)$ is the density function of $z_t$, $E(z_t) = 0$, $V(z_t) = 1$ and $w$ is the vector of the parameters of $f(.)$. Both the conditional mean, $\mu_t$, and the conditional standard deviation, $\sigma_t$, are functions of the information set available at time $t-1$. The conditional mean is usually modeled as an ARMA($\kappa$, $l$) process:

$$\mu_t = c_0 + \sum_{i=1}^{\kappa} c_i y_{t-i} + \sum_{i=1}^{l} d_i \varepsilon_{t-i}. \qquad (7)$$

Non-synchronous trading² in the stocks making up an index induces autocorrelation in the return series, primarily when high frequency data are used. To control for this, Scholes and Williams (1977) suggested a first order moving average specification, while Lo and MacKinlay (1988) suggested a first order autoregressive one. In a VaR framework, Angelidis et al. (2004) showed that a complex specification of the conditional mean adds nothing significant to the predictive power of the models, only complexity to the estimation procedure.

The unpredictable component, $\varepsilon_t$, is an ARCH process with constant unconditional variance $V(\varepsilon_t(\theta)) = E(\varepsilon_t^2(\theta)) = \sigma^2(\theta)$, zero unconditional mean and $E(\varepsilon_t(\theta)\varepsilon_{t'}(\theta)) = 0$ for all $t \ne t'$. The conditional variance of $\varepsilon_t$ is a time-varying, positive and measurable function of $I_{t-1}$, defined as $\sigma_t^2(\theta) = V(\varepsilon_t(\theta) | I_{t-1}) = E(\varepsilon_t^2(\theta) | I_{t-1})$.

² According to Campbell et al. (1997), “The non-synchronous trading or non-trading effect arises when time series, usually asset prices, are taken to be recorded at time intervals of one length when in fact they are recorded at time intervals of other, possibly irregular lengths.”

4.1.1 Modeling the Underlying Distribution

As concerns the distribution of $z_t$, Engle (1982) assumed that it is normally distributed:

$$f(z_t) = (2\pi)^{-1/2} e^{-z_t^2/2}. \qquad (8)$$

Bollerslev (1987), in order to accommodate the excess kurtosis, proposed the standardized Student-t distribution with density given by:

$$f(z_t; v) = \frac{\Gamma\left((v+1)/2\right)}{\Gamma(v/2)\sqrt{\pi(v-2)}} \left( 1 + \frac{z_t^2}{v-2} \right)^{-\frac{v+1}{2}}, \qquad (9)$$

where $v > 2$ is the degrees of freedom and $\Gamma(v) = \int_0^\infty e^{-x} x^{v-1} dx$ is the gamma function. The Student-t distribution is symmetric around zero and, for $v > 4$, its conditional kurtosis equals $3(v-2)(v-4)^{-1}$, which exceeds the normal value of three. For $v \to \infty$, however, the density function converges to the standard normal one. Wilson (1993) suggested that risk managers replace the usual normal distribution with the Student-t distribution in order to accommodate the fat-tailed underlying process.

Nelson (1991) suggested another “fat-tailed” distribution, the generalized error distribution (GED), with density equal to:

$$f(z_t; v) = \frac{v \exp\left( -0.5 \left| z_t/\lambda \right|^v \right)}{2^{(1+1/v)} \Gamma\left(v^{-1}\right) \lambda}, \quad v > 0, \qquad (10)$$

where $v$ is the tail-thickness parameter and $\lambda = \left( 2^{-2/v}\, \Gamma\left(\tfrac{1}{v}\right) / \Gamma\left(\tfrac{3}{v}\right) \right)^{1/2}$. When $v = 2$, $z_t$ is standard normally distributed. For $v < 2$, its distribution has thicker tails than the normal distribution (for example, for $v = 1$, $z_t$ follows the double exponential distribution), while for $v > 2$ it has thinner tails (for $v \to \infty$, $z_t$ has a uniform distribution on the interval $(-\sqrt{3}, \sqrt{3})$).

However, many authors have proposed asymmetric distributions, since the Student-t distribution and the GED cannot accommodate the observed skewness of financial time series. Lambert and Laurent (2001) introduced the use of the standardized skewed Student-t distribution:

$$f(z_t; \xi, v) = \begin{cases} \frac{2}{\xi + \frac{1}{\xi}}\, s f\left( \xi (s z_t + m); v \right) & \text{if } z_t < -\frac{m}{s} \\ \frac{2}{\xi + \frac{1}{\xi}}\, s f\left( \frac{s z_t + m}{\xi}; v \right) & \text{if } z_t \ge -\frac{m}{s}, \end{cases} \qquad (11)$$

where $f(.; v)$ is defined in (9), $\xi$ is the asymmetry coefficient, and

$$m = \frac{\Gamma\left(\frac{v-1}{2}\right)\sqrt{v-2}}{\sqrt{\pi}\,\Gamma\left(\frac{v}{2}\right)} \left( \xi - \frac{1}{\xi} \right) \quad \text{and} \quad s^2 = \left( \xi^2 + 1/\xi^2 - 1 \right) - m^2 \qquad (12)$$

are the mean and the variance, respectively, of the non-standardized skewed Student-t distribution. As Lambert and Laurent (2000) noted, the density is skewed to the right (left) if $\log(\xi) > 0$ ($< 0$). They also derived the $\alpha^{th}$-quantile function, $st_\alpha^*(z_t; \xi, v)$, of the non-standardized skewed Student-t distribution:

$$st_\alpha^*(z_t; \xi, v) = \begin{cases} \frac{1}{\xi}\, t_\alpha\left( \frac{\alpha}{2}\left(1+\xi^2\right); v \right) & \text{if } \alpha < \frac{1}{1+\xi^2} \\ -\xi\, t_\alpha\left( \frac{1-\alpha}{2}\left(1+\xi^{-2}\right); v \right) & \text{if } \alpha \ge \frac{1}{1+\xi^2}, \end{cases} \qquad (13)$$

where $t_\alpha(z_t; v)$ is the $\alpha^{th}$-quantile function of the Student-t distribution.

The presented density functions of $z_t$ are widely applied in risk management. To name but a few, Guermat and Harris (2002), So and Yu (2006) and Ané (2006) applied the Student-t distribution, Giot and Laurent (2003a, 2003b) and Angelidis and Benos (2007) used the skewed Student-t, Angelidis and Degiannakis (2005a) implemented the GED, and Bali and Theodossiou (2006) combined the generalized skewed Student-t distribution with 10 GARCH specifications, while the normal distribution was applied in all studies as a benchmark distribution.
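The left-tail quantiles implied by these densities are what ultimately enter the VaR formula. A minimal comparison sketch, under two assumptions we note explicitly: scipy’s `gennorm` has the same functional form as the GED in equation (10) up to scale, so rescaling by its standard deviation gives the unit-variance quantile; and the unit-variance Student-t quantile follows from multiplying the ordinary t quantile by $\sqrt{(v-2)/v}$:

```python
import numpy as np
from scipy.stats import norm, t, gennorm

p = 0.01  # left-tail probability for a long position

q_norm = norm.ppf(p)                               # standard normal quantile

v = 8                                              # degrees of freedom
q_t = t.ppf(p, v) * np.sqrt((v - 2) / v)           # standardized Student-t

beta = 1.3                                         # GED tail-thickness v < 2
q_ged = gennorm.ppf(p, beta) / gennorm.std(beta)   # unit-variance GED

print(q_norm, q_t, q_ged)  # fatter tails => more negative 1% quantiles
```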

4.1.2 ARCH Volatility Specifications

Engle (1982) introduced the ARCH(q) model and expressed the conditional variance as a linear function of the past $q$ squared innovations:

$$\sigma_t^2 = a_0 + \sum_{i=1}^{q} a_i \varepsilon_{t-i}^2. \qquad (14)$$

For the conditional variance to be positive, the parameters must satisfy the conditions $a_0 > 0$ and $a_i \ge 0$ for $i = 1, \ldots, q$. However, the ARCH model has the following drawbacks:

1. The value of $q$ cannot be determined in advance;
2. The value of $q$ may turn out to be quite large in order to capture the dependence in the conditional variance and
3. The non-negativity restrictions might be violated more easily than in the case of more flexible models.

In order to overcome the shortcomings of the ARCH(q) specification, Bollerslev (1986) proposed a generalization, the GARCH(p,q) model:

$$\sigma_t^2 = a_0 + \sum_{i=1}^{q} a_i \varepsilon_{t-i}^2 + \sum_{i=1}^{p} b_i \sigma_{t-i}^2, \qquad (15)$$

where $a_0 > 0$, $a_i \ge 0$ for $i = 1, \ldots, q$, and $b_i \ge 0$ for $i = 1, \ldots, p$. If $\sum_{i=1}^{q} a_i + \sum_{i=1}^{p} b_i < 1$, the process $\varepsilon_t$ is covariance stationary and its unconditional variance is equal to:

$$\sigma^2 = a_0 \Big/ \left( 1 - \sum_{i=1}^{q} a_i - \sum_{i=1}^{p} b_i \right). \qquad (16)$$

The GARCH(p,q) model is more parsimonious than the ARCH(q) model, its non-negativity conditions are less likely to be breached, and it captures several characteristics of financial time series, such as thick-tailed returns and volatility clustering. Mandelbrot (1963) was the first to reveal these characteristics of financial time series and noted that “. . . large changes tend to be followed by large changes of either sign, and small changes tend to be followed by small changes. . . ”. The GARCH(p,q) model has been applied in many studies in order to forecast the risk that investors face, and it has been shown to generate accurate forecasts. Hansen and Lunde (2005b), in an extensive study of volatility models, concluded that the best performing models, out of 330 alternatives, did not provide significantly better volatility forecasts than the GARCH(1,1) model. In a risk management environment, the use of the GARCH(1,1) is not always suggested: on the one hand, Ané (2006) concluded that it remains a safe choice when computing VaR, while on the other hand, Billio and Pelizzon (2000) demonstrated that the number of exceptions generated by a GARCH(1,1) model deviates significantly from the theoretical values.

Taylor (1986) and Schwert (1989) assumed that the conditional standard deviation is a linear function of its past values and the past absolute innovations, and introduced the absolute GARCH, or AGARCH(p,q), model:

$$\sigma_t = a_0 + \sum_{i=1}^{q} a_i |\varepsilon_{t-i}| + \sum_{i=1}^{p} b_i \sigma_{t-i}. \qquad (17)$$

By adopting such a specification, a researcher expects large shocks to have a smaller effect on the conditional variance than in the GARCH model. Bali and Theodossiou (2006) combined a fat-tailed and leptokurtic distribution (the skewed generalized distribution) with 10 GARCH specifications in order to calculate both VaR and ES, and argued that the model proposed by Taylor (1986) and Schwert (1989) had the best overall performance. Contrary to this finding, Ané (2006) argued that there are no statistically significant differences between models (15) and (17).

A special case of the GARCH(p,q) family arises because, in empirical applications of the GARCH(p,q) model to daily data, it is likely to be found that $\sum_{i=1}^{q} a_i + \sum_{i=1}^{p} b_i \approx 1$; imposing this restriction gives what is commonly referred to as the Integrated GARCH, or IGARCH(p,q), model:

$$\sigma_t^2 = a_0 + \sum_{i=1}^{q} a_i \varepsilon_{t-i}^2 + \sum_{i=1}^{p} b_i \sigma_{t-i}^2, \qquad (18)$$

where $\sum_{i=1}^{q} a_i + \sum_{i=1}^{p} b_i = 1$. In the IGARCH(p,q) model the unconditional variance is infinite and a shock to the conditional variance is persistent, i.e. it remains important for all conditional volatility forecasts. A special case of equation (18) is the Exponentially Weighted Moving Average (EWMA), used by RiskMetrics™. The volatility forecast is computed as $\sigma_t^2 = \lambda \sigma_{t-1}^2 + (1-\lambda)\varepsilon_{t-1}^2$. RiskMetrics™ uses $\lambda = 0.94$ for daily data and 75 historical prices to make the estimation. The RiskMetrics™ methodology has been applied in many studies (see, for example, Billio and Pelizzon, 2000, and Angelidis and Degiannakis, 2004) and there is evidence that it underestimates total risk. Therefore, the financial institution that employs this method is penalized, as according to equation (1) the capital multiplier will not equal 3 but most likely 4. So and Yu (2006) demonstrated that the RiskMetrics™ model is outperformed by both stationary and fractionally integrated GARCH models in estimating 99% VaR. Guermat and Harris (2002) showed that the EWMA variance estimator can be obtained as a special case of an Exponentially Weighted Maximum Likelihood (EWML) procedure that allows for time-variation not only in the variance of the returns distribution, but also in its higher moments. For three equity portfolios (U.S., U.K. and Japan), the EWML model, compared to the GARCH(1,1) specification under both the normal and the Student-t distribution, improved the estimated daily VaR number at the 99% confidence level.

Ding et al. (1993) noted that there has been significant evidence of long memory in the volatility of asset returns. In order to model the long memory property in volatility, Baillie et al. (1996) extended the IGARCH(p,q) model by developing the FIGARCH(p,d,q) specification:

$$\sigma_t^2 = a_0 + \left[ 1 - B(L) - \left( 1 - \Phi(L) \right) (1-L)^d \right] \varepsilon_t^2 + B(L)\,\sigma_t^2, \qquad (19)$$

where $0 \le d \le 1$ is a condition for the process to be strictly stationary and ergodic, $L$ denotes the lag operator, $B(L) = \sum_{i=1}^{p} b_i L^i$, $\Phi(L) = \sum_{i=1}^{q} \varphi_i L^i$,

$$(1-L)^d = \sum_{i=0}^{\infty} \pi_i L^i = 1 - d\sum_{i=1}^{\infty} \frac{\Gamma(i-d)}{\Gamma(1-d)\Gamma(i+1)} L^i = 1 - \frac{1}{1!} dL - \frac{1}{2!} d(1-d) L^2 - \ldots,$$

and $\Gamma(.)$ is the gamma function.

The FIGARCH(p,d,q) model has been applied in many studies, as the importance of fractionally integrated variance models stems from the added flexibility when modeling long-run volatility characteristics. Bollerslev and Mikkelsen (1996) provided evidence that illustrates the importance of using fractionally integrated conditional variance models in the context of pricing options with a maturity of one year or longer. Vilasuso (2002) fitted conditional volatility models to daily spot exchange rates and found that the FIGARCH(1,d,1) model generates superior volatility forecasts compared to those generated by a GARCH(1,1) or IGARCH(1,1) model. In the VaR arena, the results are conflicting: on the one hand, So and Yu (2006) argued that it is more important to consider a model with fat-tailed errors than to model the long memory property, while, on the other hand, Angelidis and Degiannakis (2006) suggested that in most cases a fractionally integrated parameterization of the volatility process is necessary.

In the GARCH structure, the variance depends only on the magnitude of $\varepsilon_t$ and not on its sign. Nevertheless, it has been shown that volatility tends to rise in response to bad news ($\varepsilon_t < 0$) and to fall in response to good news ($\varepsilon_t > 0$). This feature of risk is known as the leverage effect, a term introduced by Black (1976). The importance of asymmetry in the VaR framework has been noted by several authors. For example, Brooks and Persand (2003a) pointed out that risk models which do not account for asymmetries in the volatility specification are most likely to generate inaccurate forecasts. Other authors, such as Giot and Laurent (2003a, 2003b), favored models that accommodate at least the asymmetry of the volatility, while Angelidis and Degiannakis (2005a) argued that models that parameterize the leverage effect in the conditional variance forecast the VaR at the 99% confidence level accurately. An asymmetric model should allow for the possibility that a drop in price has a larger impact on future volatility than an increase of equivalent magnitude. The three most commonly used asymmetric ARCH specifications are Nelson’s (1991) exponential GARCH, or EGARCH(p,q), Glosten et al.’s (1993) and Zakoïan’s (1994) Threshold GARCH, or TARCH(p,q), and the asymmetric GARCH, or AGARCH(p,q), model of Engle (1990):

EGARCH(p,q) specification:

$$\ln(\sigma_t^2) = a_0 + \sum_{i=1}^{q} \left( a_i \left| \frac{\varepsilon_{t-i}}{\sigma_{t-i}} \right| + \gamma_i \frac{\varepsilon_{t-i}}{\sigma_{t-i}} \right) + \sum_{i=1}^{p} b_i \ln(\sigma_{t-i}^2). \qquad (20)$$

TARCH(p,q) specification:

$$\sigma_t^2 = a_0 + \sum_{i=1}^{q} a_i \varepsilon_{t-i}^2 + \gamma_1 \varepsilon_{t-1}^2 d_{t-1} + \sum_{i=1}^{p} b_i \sigma_{t-i}^2. \qquad (21)$$

AGARCH(p,q) specification:

$$\sigma_t^2 = a_0 + \sum_{i=1}^{q} \left( a_i \varepsilon_{t-i}^2 + \gamma_i \varepsilon_{t-i} \right) + \sum_{i=1}^{p} b_i \sigma_{t-i}^2. \qquad (22)$$

For the EGARCH model, the logarithmic transformation ensures that the variance is always positive. Asymmetry in the variance exists only if $\gamma_i \ne 0$. Specifically, if $\gamma_i = 0$ then a positive surprise, $\varepsilon_t > 0$, has the same effect on volatility as a negative surprise, $\varepsilon_t < 0$, while when $\gamma_i < 0$ positive shocks generate less volatility than negative ones. In the case of the TARCH(p,q), $d_t = 1$ if $\varepsilon_t < 0$ and $d_t = 0$ otherwise, which allows volatility to respond to news with different coefficients for good and bad news. Finally, for the AGARCH, a negative value of $\gamma_i$ means that positive returns increase volatility less than negative returns.

There are numerous papers that estimated these models; Awartani and Corradi (2005) is a representative example. They compared GARCH, IGARCH and RiskMetrics models with various asymmetric ARCH models in forecasting S&P500 volatility. The GARCH model is beaten only when compared against asymmetric ARCH models.

Ding et al. (1993) introduced the asymmetric power ARCH, or APARCH(p,q), model, without assuming that the conditional variance must be a linear function of the lagged squared returns, since the common use of a squared term is likely to be a reflection of the normality assumption traditionally invoked regarding financial data, as Brooks et al. (2000) stated:

$$\sigma_t^\delta = a_0 + \sum_{i=1}^{q} a_i \left( |\varepsilon_{t-i}| - \gamma_i \varepsilon_{t-i} \right)^\delta + \sum_{i=1}^{p} b_i \sigma_{t-i}^\delta, \qquad (23)$$

where $a_0 > 0$, $a_i \ge 0$, $|\gamma_i| < 1$, $b_i \ge 0$ and $\delta > 0$. The APARCH(p,q) comprises most of the presented models. For example, if $\delta = 2$ and $\gamma_i = 0$, equation (23) is equivalent to (15). Giot and Laurent (2003a, 2003b) calculated the VaR number for long and short equity trading positions and proposed the APARCH model with skewed Student-t conditionally distributed innovations, as it had the best overall performance. Huang and Lin (2004) argued that, for the Taiwan Stock Index Futures, the APARCH model under the normal (Student-t) distribution must be used by risk managers to calculate the VaR number at the 95% (99%) confidence level. In contrast to these studies, Ané (2006) and Angelidis and Degiannakis (2005a) reached the opposite conclusion, as the APARCH model was not deemed the most adequate specification in forecasting VaR.

Bollerslev and Mikkelsen (1996) extended the idea of fractional integration to the EGARCH model, building the FIEGARCH(p,d,q) model:

$$\ln(\sigma_t^2) = a_0 (1 - B(L)) + (1-L)^{-d} (1 + \Phi(L)) \left( \gamma_1 \left( |\varepsilon_{t-1}/\sigma_{t-1}| - E|\varepsilon_{t-1}/\sigma_{t-1}| \right) + \gamma_2 \left( \varepsilon_{t-1}/\sigma_{t-1} \right) \right) + B(L)\ln(\sigma_t^2), \qquad (24)$$

whereas Tse (1998) built the fractionally integrated form of the APARCH model, named fractionally integrated APARCH, or FIAPARCH(p,d,q):

$$\sigma_t^\delta = a_0 + \left[ 1 - B(L) - \left( 1 - \Phi(L) \right) (1-L)^d \right] \left( |\varepsilon_t| - \gamma \varepsilon_t \right)^\delta + B(L)\,\sigma_t^\delta. \qquad (25)$$

Degiannakis (2004) proposed the FIAPARCH model with skewed Student-t standardized residuals for forecasting the next day’s VaR for the FTSE100, CAC40 and DAX30 indices, whereas Angelidis and Degiannakis (2006) showed that the FIEGARCH model with conditionally normally distributed innovations provides accurate one-day-ahead VaR and ES forecasts for the S&P500 index.

Another class of models has emerged recently. Since Hamilton (1989) modeled the U.S. business cycle as the outcome of a discrete-state Markov process in order to include structural changes in the parameters, regime switching models have become increasingly popular. These models can accommodate non-constant volatility, skewness and excess kurtosis as a function of the estimated parameters, and consequently they are likely to forecast the underlying distribution better than the corresponding non-switching methods. We limit our analysis to two-state regime switching models, as many researchers identify mainly two states of the economy (high or low risk environments). Ang and Bekaert (2002) argued that there are benefits from international diversification, as they found a regime with high volatility and highly correlated returns for three countries (U.S., U.K. and Germany). Finally, Assoe (1998) examined nine emerging markets and found evidence of two regimes that differ only in terms of market volatility.

In the two-state regime switching model, the parameters of interest are allowed to take different values in each state. It is expressed as:

$$y_t = \mu_{j,t} + \varepsilon_{j,t}, \qquad \varepsilon_{j,t} = z_{j,t}\,\sigma_{j,t}, \qquad (26)$$

where $\mu_{j,t}$ and $\sigma_{j,t}$ are the conditional mean and the conditional standard deviation of state $j = 1, 2$, respectively. The process $y_t$ is first-order Markov and is described by a binary variable $S_t = 1, 2$. Its transition matrix, $\Pi$, is characterized by constant probabilities ($p_{11}$ and $q_{11}$):

$$\Pi = \begin{pmatrix} \Pr(S_t=1|S_{t-1}=1) & \Pr(S_t=2|S_{t-1}=1) \\ \Pr(S_t=1|S_{t-1}=2) & \Pr(S_t=2|S_{t-1}=2) \end{pmatrix} = \begin{pmatrix} p_{11} & 1-p_{11} \\ 1-q_{11} & q_{11} \end{pmatrix}. \qquad (27)$$

The filtered probability, $p_{1,t}$, which describes the probability that the market is in regime $j = 1$ at time $t$ given the information at $t-1$, has a first-order recursive formula and is calculated as:

$$p_{1,t} = (1-q_{11})\frac{f_{2,t-1}(1-p_{1,t-1})}{f_{1,t-1}\,p_{1,t-1} + f_{2,t-1}(1-p_{1,t-1})} + p_{11}\frac{f_{1,t-1}\,p_{1,t-1}}{f_{1,t-1}\,p_{1,t-1} + f_{2,t-1}(1-p_{1,t-1})}, \qquad (28)$$

where $f_{j,t}$ is the density function of regime $j$ at time $t$.

As far as the various approaches to modeling the conditional variance ($\sigma_{j,t}^2$) are concerned, we present two specifications that include ARCH and GARCH effects. In Hamilton’s (1989) work, the variance was assumed to be different across the states but constant within them. Hamilton and Susmel (1994) extended Hamilton’s (1989) specification by estimating a Switching ARCH (SWARCH) model on weekly stock returns. A generalization of their model³ which allows all the parameters to switch across regimes is described as:

$$y_t = \mu_{j,t} + \varepsilon_{j,t}, \qquad \sigma_{j,t}^2 = a_{j,0} + \sum_{i=1}^{q} a_{j,i}\,\varepsilon_{t-i}^2, \qquad (29)$$

where $j = 1, 2$ and $a_{j,0} > 0$ and $a_{j,i} \ge 0$ are the regime switching parameters of the conditional variance.

³ In the Hamilton and Susmel (1994) framework, the ARCH(q) process was multiplied by a constant parameter $g_j$.

The authors of the previous papers did not include a GARCH term in the conditional variance equation, as they argued that it is impossible to estimate the regime switching model due to the path dependence of $\sigma_{j,t}^2$ on the entire history of the data. Gray (1996) proposed a method to overcome the problem of path dependence, as he averaged the conditional variance at time $t$ by using the regime filtered probabilities ($p_{j,t}$). The regime switching GARCH(1,1) specification, under the assumption of the normal distribution, is summarized in the following equations:

$$\begin{aligned}
y_t &= p_{1,t}\mu_{1,t} + p_{2,t}\mu_{2,t} + \varepsilon_t \\
\sigma_{j,t}^2 &= a_{j,0} + a_{j,1}\varepsilon_{t-1}^2 + b_{j,1}\sigma_{t-1}^2 \\
\sigma_{t-1}^2 &= p_{1,t-1}\left( \mu_{1,t-1}^2 + \sigma_{1,t-1}^2 \right) + p_{2,t-1}\left( \mu_{2,t-1}^2 + \sigma_{2,t-1}^2 \right) - \left( p_{1,t-1}\mu_{1,t-1} + p_{2,t-1}\mu_{2,t-1} \right)^2,
\end{aligned} \qquad (30)$$

for $j = 1, 2$. Li and Lin (2004) estimated the VaR number via regime switching ARCH models for the Dow Jones, Nikkei, Frankfurt Commerzbank index and FTSE. They provided evidence in favor of these models, as they showed that they outperformed both their ARCH and GARCH counterparts. Billio and Pelizzon (2000) extended the mixture of normal distributions method that Venkataraman (1996) and Zangari (1996) proposed, as they developed a multivariate regime switching model in order to calculate the daily VaR number for 10 Italian stocks. They demonstrated the superiority of the model over both the RiskMetrics™ and the GARCH methods in terms of two backtesting measures (proportion of failures and time until first failure). Guidolin and Timmermann (2003) suggested that market participants use a regime switching specification, since it can forecast the VaR and ES measures more accurately than GARCH models do. Finally, Haas et al. (2004) proposed a Markov Switching-GARCH model with a skewed conditional mixture distribution that had a better overall performance.

Although there is a vast number of ways to parameterize the conditional volatility in an ARCH framework⁴, the models presented in this section are the most widely known and have been applied in many risk management studies. Most of them can be applied by a risk manager effortlessly, as most econometric packages include routines for their estimation. See Brooks et al. (2001) for a review of software packages for ARCH models.

The most commonly used method for estimating the parameters of an ARCH process is the Maximum Likelihood (ML) method⁵. Let $\psi = (\theta', w')'$ denote the whole set of parameters that have to be estimated for the conditional mean, variance and density function. The ML estimator $\hat{\psi}$ is found by maximizing the log-likelihood function of $y_t(\psi)$ for a sample of $T$ observations:

$$L_T\left( \{y_t\}; \psi \right) = \sum_{t=1}^{T} \left[ \ln\left( f\left( \varepsilon_t(\theta)/\sigma_t(\theta); \xi, \upsilon \right) \right) - \frac{1}{2}\ln\left( \sigma_t^2(\theta) \right) \right]. \qquad (31)$$

⁴ The interested reader is referred to the comprehensive reviews of Bollerslev et al. (1992), Bera and Higgins (1993), Bollerslev et al. (1994), Hamilton (1994), Palm (1996), Gouriéroux (1997), Andersen and Bollerslev (1998c), Li et al. (2001), Poon and Granger (2001) and Degiannakis and Xekalaki (2004).
⁵ Alternative methods of estimating the parameters of ARCH processes have been proposed in the literature. Mark (1988), Ferson (1989) and Rich et al. (1991) proposed estimation of the parameters by the generalized method of moments. Geweke (1988, 1989) recommended a Bayesian approach. Pagan and Schwert (1990) used a collection of non-parametric estimation methods, including kernels, Fourier series and two-stage least squares regressions, and Engle and González-Rivera (1991), Engle and Ng (1993), Gallant and Tauchen (1989) and Gallant et al. (1993) introduced semi-parametric methods of estimation. Harvey et al. (1992) proposed an estimation method based on the Kalman filter, Giraitis and Robinson (2000) estimated the parameters of the GARCH process using the Whittle estimation technique, while Hall and Yao (2003) suggested bootstrap approximations.
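As an illustration of maximizing (31), the following sketch estimates a zero-mean GARCH(1,1) with normal innovations on simulated data by numerical maximum likelihood. It is a simplified stand-in for the routines of the econometric packages mentioned above; the function names, starting values and bounds are our own choices:

```python
import numpy as np
from scipy.optimize import minimize

def garch11_neg_loglik(params, y):
    """Negative Gaussian log-likelihood of equation (31) for a zero-mean
    GARCH(1,1): sigma2_t = a0 + a1*eps2_{t-1} + b1*sigma2_{t-1}, eq. (15)."""
    a0, a1, b1 = params
    sigma2 = np.empty(y.size)
    sigma2[0] = y.var()                      # initialize at the sample variance
    for t in range(1, y.size):
        sigma2[t] = a0 + a1 * y[t - 1] ** 2 + b1 * sigma2[t - 1]
    return 0.5 * np.sum(np.log(2 * np.pi) + np.log(sigma2) + y ** 2 / sigma2)

def fit_garch11(y):
    x0 = np.array([0.05 * y.var(), 0.05, 0.90])
    bounds = [(1e-8, None), (0.0, 1.0), (0.0, 1.0)]
    res = minimize(garch11_neg_loglik, x0, args=(y,),
                   bounds=bounds, method="L-BFGS-B")
    return res.x

# Simulated example: generate a GARCH(1,1) path, then recover the parameters.
rng = np.random.default_rng(0)
T, a0, a1, b1 = 3000, 0.02, 0.08, 0.90
eps, s2 = np.empty(T), np.empty(T)
s2[0] = a0 / (1 - a1 - b1)                   # unconditional variance, eq. (16)
eps[0] = rng.standard_normal() * np.sqrt(s2[0])
for t in range(1, T):
    s2[t] = a0 + a1 * eps[t - 1] ** 2 + b1 * s2[t - 1]
    eps[t] = rng.standard_normal() * np.sqrt(s2[t])
print(fit_garch11(eps))                      # roughly recovers (a0, a1, b1)
```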

4.1.3 One-step-ahead VaR and ES Calculation under Parametric Volatility Forecasting

Having presented the major ARCH models of volatility forecasting, it is straightforward to compute the one-step-ahead VaR forecast under any distributional assumption as:

$$VaR_{t+1|t}^{(p)} = \mu_{t+1|t} + F_a\left( z_t; \xi^{(t)}, \upsilon^{(t)} \right) \sigma_{t+1|t}, \qquad (32)$$

where $F_a\left( z_t; \xi^{(t)}, \upsilon^{(t)} \right)$ is the $a^{th}$ quantile⁶ of the assumed distribution, computed on the information set available at time $t$, and $\mu_{t+1|t}$ and $\sigma_{t+1|t}$ are the conditional mean and conditional standard deviation forecasts for time $t+1$ given the information at time $t$, respectively.

As we have already mentioned, ES is defined as the conditional expected loss given a VaR violation. In particular, ES is a probability-weighted average of tail losses, and the one-step-ahead ES forecast, for a long trading position, is calculated as:

$$ES_{t+1|t}^{(p)} = E\left( y_{t+1} \mid y_{t+1} \le VaR_{t+1|t}^{(p)} \right). \qquad (33)$$

Given that analytical solutions are not available for all distributional assumptions, Dowd (2002) suggested, in order to calculate ES, to “slice the tail into a large number k̃ of slices, each of which has the same probability mass, estimate the VaR associated with each slice and take the ES as the average of these VaRs”.

⁶ For long (short) trading positions $a = p$ ($a = 1 - p$).
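Equations (32) and (33), together with Dowd’s tail-slicing suggestion, take only a few lines; in the sketch below, the forecasts $\mu_{t+1|t}$, $\sigma_{t+1|t}$ and the degrees of freedom are assumed inputs from a fitted model:

```python
import numpy as np
from scipy.stats import t

# Illustrative one-step-ahead forecasts from some fitted model:
mu, sigma, v, p = 0.0005, 0.012, 8.0, 0.01

# Equation (32) under a standardized (unit-variance) Student-t:
q = t.ppf(p, v) * np.sqrt((v - 2) / v)
var_next = mu + q * sigma

# Equation (33) via Dowd's (2002) slicing: k~ equal-probability tail
# slices, one VaR per slice, averaged.
k = 5000
p_slices = (np.arange(k) + 0.5) * p / k       # midpoints of the k slices
q_slices = t.ppf(p_slices, v) * np.sqrt((v - 2) / v)
es_next = mu + q_slices.mean() * sigma
print(var_next, es_next)                      # ES lies deeper in the tail
```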

4.2 Non-Parametric Risk Management Techniques

4.2.1 Historical Simulation

Historical simulation (HS) has received much attention because of its simplicity and its relative lack of theoretical burden, as no distributional assumption about the underlying process of the returns needs to be made.

Finding a distribution that can be applied across all assets is not a straightforward task, as shown in section 4.1.1, and therefore an alternative approach has been proposed. The main assumption of HS is that the properties of the assets are well described by the chosen sample period. Specifically, the historical returns (e.g. one year of data) are used and the P/L distribution of the portfolio is constructed by applying the current weights of each asset:

$$y_t = \sum_{i=1}^{n} \omega_{i,t}\, y_{i,t}, \qquad (34)$$

where the sum is taken over the $n$ securities of the portfolio and $\omega_{i,t}$ denotes the weight of security $i$ at the end of day $t$. Under this framework, the VaR number for a specific confidence level is derived as the corresponding $a^{th}$ quantile of the empirical historical distribution:

$$VaR_{t+1|t}^{(p)} = F_a\left( \{y_t\}_{t=1}^{T} \right), \qquad (35)$$

while ES is calculated as the average $VaR_{t+1|t}^{(\tilde{p})}$, for $0 < \tilde{p} < p$ ($p < \tilde{p} < 1$) in the case of a long (short) trading position.

The main advantage of historical simulation is that it can be implemented even in a spreadsheet, as the distribution of the portfolio returns is assumed to be constant and therefore there is no need to model:

• the time-varying variance of the assets and
• the underlying distribution.

The method accommodates nonlinearities and non-normal distributions, which is quite important in the area of risk management, as it has been revealed that asset returns are not normally distributed and extreme events are observed more often than the normal distribution assumes. However, the simplicity of the method does not come without a cost, as:

• the sensitivity of VaR cannot be tested, since the variance of the asset returns is assumed to be constant⁷;
• by construction, the same weight is put on all observations, which implies that past returns are treated as being as important as recent ones;
• if $T$ is too large, then the most recent observations, which probably describe the future distribution better, carry the same weight as the earlier ones.

In order to lessen the effect of the sample size choice on the VaR forecasts, Boudoukh et al. (1998) applied an exponential weighting scheme equivalent to that of the RiskMetrics™ methodology, assigning to the daily returns the weighted probabilities $\eta_t = \left\{ \eta^{t-1}(1-\eta)/(1-\eta^T) \right\}_{t=1}^{T}$. Christoffersen (2003) set $\eta = 0.95$, whereas Boudoukh et al. (1998) set it equal to either 0.97 or 0.99, as there is no statistical method of estimating the unknown parameter $\eta$. The daily returns are sorted in ascending order and the weights of the ascending returns are accumulated until the $100p\%$ quantile is reached. Under this framework, for a long trading position, VaR equals the return corresponding to the accumulated $100p\%$ weight, while ES is equal to the average of the ascending returns up to the accumulated $100p\%$ weight.

⁷ The limitations of this assumption can be alleviated by implementing stress test scenarios, which give risk managers the opportunity to calculate the losses of a worst case event. The interested reader is referred to the work of Laubsch (1999).
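Both variants reduce to sorting and accumulating. A minimal sketch (using `np.quantile` for the plain empirical quantile is one of several possible conventions, a simplification on our part):

```python
import numpy as np

def hs_var_es(returns, p=0.01):
    """Plain historical simulation: VaR is the empirical p-quantile,
    equation (35); ES is the mean of the returns at or below it."""
    var = np.quantile(returns, p)
    return var, returns[returns <= var].mean()

def weighted_hs_var_es(returns, p=0.01, eta=0.97):
    """Age-weighted HS of Boudoukh et al. (1998). `returns` is ordered
    oldest to newest; the newest observation gets the largest weight."""
    T = returns.size
    age = np.arange(T - 1, -1, -1)                 # 0 for the newest return
    w = eta ** age * (1 - eta) / (1 - eta ** T)    # weights sum to one
    order = np.argsort(returns)                    # ascending returns
    r, cw = returns[order], np.cumsum(w[order])
    idx = np.searchsorted(cw, p)                   # first return reaching 100p%
    var = r[idx]
    es = np.average(r[: idx + 1], weights=w[order][: idx + 1])
    return var, es
```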

4.3 Semi-Parametric Volatility Forecasting

4.3.1 Filtered historical simulation

The methods presented so far have two main shortcomings:

• in the parametric case, the distributional choice is crucial;
• in the non-parametric case, the volatility dynamics of the return series cannot be modeled.

In order to remedy these effects, Hull and White (1998) and Barone-Adesi et al. (1999) proposed a semi-parametric methodology, named Filtered Historical Simulation (FHS), that combines the two aforementioned methods in an attempt to get the best of both. Specifically, a parametric model is used to accommodate the volatility dynamics of the series, while the quantile needed for the VaR is derived from the standardized residuals of the model, similarly to the HS procedure. Under this framework, VaR is calculated as:

$$VaR_{t+1|t}^{(p)} = \mu_{t+1|t} + F_a\left( \{z_t\}_{t=1}^{T} \right) \sigma_{t+1|t}, \qquad (36)$$

while ES is calculated as the average $VaR_{t+1|t}^{(\tilde{p})}$, for $0 < \tilde{p} < p$ ($p < \tilde{p} < 1$) in the case of a long (short) trading position. By implementing this procedure, a risk manager achieves two important goals:

1. a volatility model can be used without worrying about the underlying distribution and
2. the generated risk forecasts accommodate the non-zero skewness, while the tails of the empirical distribution are replicated quite accurately.

Even though the method is quite promising, it has not been applied in many studies. Barone-Adesi and Giannopoulos (2001) argued that FHS produces risk forecasts that accommodate the current state of the market and therefore is better than HS. Angelidis and Benos (2007) reached the same conclusion, claiming that at the higher confidence level FHS performed better than the parametric and the semi-parametric methods, while Angelidis et al. (2007) claimed that the method produces robust VaR estimates irrespective of the weighting scheme and the sample period.
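Equation (36) amounts to historical simulation on standardized residuals, rescaled by the model’s forecasts. A minimal sketch, assuming the fitted conditional means and standard deviations come from any of the models of section 4.1.2 (e.g. the GARCH(1,1) sketch above):

```python
import numpy as np

def fhs_var_es(returns, mu_fitted, sigma_fitted, mu_next, sigma_next, p=0.01):
    """Filtered historical simulation, equation (36): standardize the
    in-sample residuals by the fitted volatilities, then rescale the
    empirical quantile by the one-step-ahead forecasts."""
    z = (returns - mu_fitted) / sigma_fitted       # standardized residuals
    q = np.quantile(z, p)
    var = mu_next + q * sigma_next
    es = mu_next + z[z <= q].mean() * sigma_next   # average of the tail z's
    return var, es
```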

4.3.2 Extreme value theory

The study of extremes in finance has gradually grown over the last few years. Any statistical procedure attempting to model extremes should benefit from an appropriate choice of the underlying distribution, and therefore risk managers should also consider Extreme Value Theory (EVT) as an alternative risk method. Among others, McNeil and Frey (2000), Jondeau and Rockinger (2003), Ho et al. (2000), Seymour and Polakov (2003), Gençay and Selçuk (2004) and Byström (2004) applied the EVT method to estimate VaR for various financial assets and markets.

The generalized Pareto distribution (GPD) may well describe the behavior of extremes, as summarized by the so-called “tail index” $\tau_1$. The EVT method must be applied to standardized portfolio returns, because otherwise the estimated parameters of the GPD density will be biased (see McNeil and Frey, 2000). In the following paragraphs the peaks over threshold method is presented, since no complex estimation technique is required⁸. The probability that the standardized returns, $y_t^*$, are greater than a threshold $u$ is given by:

$$\Pr\left( y_t^* - u \le x \mid y_t^* > u \right) = \frac{F(x+u) - F(u)}{1 - F(u)}, \qquad (37)$$

where $x > u$ and $F(.)$ is the distribution of the standardized returns. The distribution function of the GPD, $F(x; \tau_1, \tau_2)$, is described by:

$$F(x; \tau_1, \tau_2) = \begin{cases} 1 - \left( 1 + \frac{\tau_1 x}{\tau_2} \right)^{-1/\tau_1} & \text{if } \tau_1 \ne 0 \\ 1 - \exp\left( \frac{-x}{\tau_2} \right) & \text{if } \tau_1 = 0, \end{cases} \qquad (38)$$

where $\tau_2 > 0$ is a scale factor and

$$\begin{cases} x \ge u & \text{if } \tau_1 \ge 0 \\ u \le x \le u - \frac{\tau_2}{\tau_1} & \text{if } \tau_1 < 0. \end{cases} \qquad (39)$$

Under the assumption that $\tau_1 > 0$, the Hill estimator of the tail index equals:

$$\hat{\tau}_1 = \frac{1}{T_u} \sum_{t=1}^{T_u} \ln\left( \frac{y_t}{u} \right), \qquad (40)$$

where $T_u$ is the number of observations above the threshold $u$. Hence, under this framework, the VaR is calculated as:

$$VaR_{t+1|t}^{(p)} = \sigma_{t+1|t}\, u \left( \frac{p}{T_u/T} \right)^{-\tau_1}. \qquad (41)$$

⁸ The interested reader is referred to the works of Harmantzis et al. (2006), Embrechts (2000) and McNeil (1997, 1998, 1999) for a detailed discussion of applying different EVT techniques to financial datasets.
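A minimal sketch of equations (40)-(41), under two assumptions of ours: the inputs are standardized losses (sign-flipped standardized returns for a long position), and the threshold is set at a fixed tail fraction of the sample, a common but arbitrary choice:

```python
import numpy as np

def evt_var(z_loss, sigma_next, p=0.01, tail_fraction=0.05):
    """Peaks-over-threshold VaR, equation (41), with the Hill estimator
    of equation (40). z_loss holds positive standardized losses."""
    T = z_loss.size
    Tu = int(tail_fraction * T)                    # observations above u
    z_sorted = np.sort(z_loss)[::-1]               # descending
    u = z_sorted[Tu]                               # threshold
    tau1 = np.mean(np.log(z_sorted[:Tu] / u))      # Hill estimator, eq. (40)
    var = sigma_next * u * (p / (Tu / T)) ** (-tau1)
    return -var                                    # as a (negative) return
```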

4.4 Multi-period VaR and ES forecasts

The Basle Committee proposals (1995a, 1995b) require financial institutions, among other things, to set the holding period equal to 10 trading days, whereas the methods of calculating VaR and ES presented so far apply only to one-period-ahead forecasts. In the simple case where the portfolio returns are i.i.d. normally distributed, the multi-period VaR equals:

$$VaR_{t+\tilde{N}|t}^{(p)} = \sqrt{\tilde{N}}\, VaR_{t+1|t}^{(p)}, \qquad (42)$$

where $\tilde{N}$ is the required holding period. Diebold et al. (1996) showed that converting daily VaR to 10-day VaR by using the $\sqrt{10}$ rule generates inadequate forecasts, and Danielsson and Zigrand (2004) demonstrated that the square-root rule leads to an underestimation of risk, so that the objective of the Basle Committee proposals (1995a, 1995b) is not addressed satisfactorily. But how are multi-period VaR and ES forecasts calculated under other distributional assumptions? Given that analytical expressions for the multi-period density are not available, numerical techniques may be used to estimate VaR and ES, as has been pointed out by Christoffersen (2003) and Andersen et al. (2005).

In the next paragraphs, the method of creating VaR and ES forecasts is briefly described for the GARCH(1,1) model under the normal distribution, but it can be generalized to other models and distributional assumptions. Let $y_{t+1} = \sigma_{t+1} z_{t+1}$, where $z_{t+1} \sim N(0,1)$ and $\sigma_{t+1}^2 = \alpha_0 + \alpha_1 y_t^2 + b_1 \sigma_t^2$. The multi-period forecasts of VaR are calculated according to the following steps (see also the sketch after the diagram below):

1. Generate random numbers $\breve{z}_{i,1}$, for $i = 1, \ldots, MC$⁹, from the standard normal distribution.
2. Create the hypothetical returns of day $t+1$ as $\breve{y}_{i,t+1} = \sigma_{t+1}\breve{z}_{i,1}$, for $i = 1, \ldots, MC$.
3. Create the forecast variance at $t+2$ as $\breve{\sigma}_{i,t+2}^2 = \alpha_0 + \alpha_1 \breve{y}_{i,t+1}^2 + b_1 \sigma_{t+1}^2$.
4. Generate new random numbers, $\breve{z}_{i,2}$, for $i = 1, \ldots, MC$.
5. Calculate the hypothetical returns at time $t+2$, $\breve{y}_{i,t+2} = \breve{\sigma}_{i,t+2}\breve{z}_{i,2}$, for $i = 1, \ldots, MC$.
6. ...
7. Calculate the hypothetical returns at time $t+\tilde{N}$, $\breve{y}_{i,t+\tilde{N}} = \breve{\sigma}_{i,t+\tilde{N}}\breve{z}_{i,\tilde{N}}$.
8. Calculate the $\tilde{N}$-day VaR as $VaR_{t+\tilde{N}|t}^{(p)} = F_a\left( \{\breve{y}_{i,t+\tilde{N}}\}_{i=1}^{MC} \right)$.
9. Calculate the $\tilde{N}$-day ES as $ES_{t+\tilde{N}|t}^{(p)} = E\left( |Loss_{t+\tilde{N}}| > |VaR_{t+\tilde{N}|t}^{(p)}| \right)$.

The simulation can also be presented graphically:

$$\sigma_{t+1}^2 \to \begin{array}{ccc}
\breve{z}_{1,1} \to \breve{y}_{1,t+1} \to \breve{\sigma}_{1,t+2}^2 & \cdots & \breve{z}_{1,\tilde{N}} \to \breve{y}_{1,t+\tilde{N}} \\
\breve{z}_{2,1} \to \breve{y}_{2,t+1} \to \breve{\sigma}_{2,t+2}^2 & \cdots & \breve{z}_{2,\tilde{N}} \to \breve{y}_{2,t+\tilde{N}} \\
\cdots & \cdots & \cdots \\
\breve{z}_{MC,1} \to \breve{y}_{MC,t+1} \to \breve{\sigma}_{MC,t+2}^2 & \cdots & \breve{z}_{MC,\tilde{N}} \to \breve{y}_{MC,t+\tilde{N}}
\end{array}$$

$$VaR_{t+1|t}^{(p)} = F_a\left( \{\breve{y}_{i,t+1}\}_{i=1}^{MC} \right) \quad \cdots \quad VaR_{t+\tilde{N}|t}^{(p)} = F_a\left( \{\breve{y}_{i,t+\tilde{N}}\}_{i=1}^{MC} \right)$$
$$ES_{t+1|t}^{(p)} = E\left( |Loss_{t+1}| > |VaR_{t+1|t}^{(p)}| \right) \quad \cdots \quad ES_{t+\tilde{N}|t}^{(p)} = E\left( |Loss_{t+\tilde{N}}| > |VaR_{t+\tilde{N}|t}^{(p)}| \right)$$

⁹ MC denotes the number of draws.
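A minimal sketch of the steps above for a zero-mean normal GARCH(1,1); the parameter values in the commented call are purely illustrative:

```python
import numpy as np

def mc_var_es(a0, a1, b1, sigma2_next, N=10, p=0.01, MC=100_000, seed=0):
    """Monte Carlo forecasts following the numbered steps above:
    propagate variances and hypothetical returns out to day t+N, then
    take the empirical quantile (VaR) and the mean beyond it (ES) of
    the day-(t+N) returns. (Summing y across the loop would instead
    give the cumulative N-day return, if that is the target quantity.)"""
    rng = np.random.default_rng(seed)
    s2 = np.full(MC, sigma2_next)                 # sigma^2_{t+1}, known at t
    for _ in range(N):
        y = np.sqrt(s2) * rng.standard_normal(MC)  # steps 1-2, 4-5, 7
        s2 = a0 + a1 * y ** 2 + b1 * s2            # step 3: next variance
    var = np.quantile(y, p)                        # step 8
    es = y[y <= var].mean()                        # step 9
    return var, es

# print(mc_var_es(a0=2e-6, a1=0.08, b1=0.90, sigma2_next=1.2e-4))
```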

Multi-period VaR forecasting in the area of risk management is considered far from resolved, due to the difficulties of estimating risk at longer horizons. In a different approach, Dowd et al. (2004) suggested a method that avoids the problems associated with the square-root rule and argued that long term estimates of VaR are subject to model and parameter risk. Brooks and Persand (2003b) also focused on multi-period forecasts, as they evaluated VaR forecasts over the 1-, 5-, 10- and 20-day horizons, applying, however, only the standard normal distributional assumption.

4.5 Realized Volatility Models

Andersen and Bollerslev (1998a) introduced an alternative volatility measure, the realized volatility. The modeling of realized volatility is based on the idea of using the sum of squared intraday returns to generate more accurate daily volatility measures; the idea of using high frequency data to compute measures of volatility at a lower frequency was introduced by Merton (1980). Let us consider that the instantaneous logarithmic price, $\ln(P(t))$, of a financial asset follows a simple diffusion process:

$$d\ln(P(t)) = \sigma(t)\, dW(t), \qquad (43)$$

where $\sigma(t)$ is the volatility of the instantaneous returns process and $W(t)$ is the standard Wiener process ($\sigma(t)$ and $W(t)$ are independent). The integrated volatility, $\sigma_t^{2(IV)}$, aggregated over the time interval $(t-1, t)$, is equal to:

$$\sigma_t^{2(IV)} = \int_{t-1}^{t} \sigma^2(x)\, dx. \qquad (44)$$

The integrated volatility is a latent variable which is not directly observable. However, according to the theory of quadratic variation of semimartingales, the integrated volatility can be consistently estimated by the realized volatility¹⁰, $\sigma_t^{2(RV)}$, which is defined as the sum of squared returns observed over very small time intervals:

$$\sigma_t^{2(RV)} = \sum_{j=1}^{m-1} \left( \ln P_{(j+1/m),t} - \ln P_{(j/m),t} \right)^2, \qquad (45)$$

where $P_{(j/m),t}$ are the financial asset prices during period $t$ with sampling frequency $m$. For daily time intervals and $m$ observations per day, $\sigma_t^{2(RV)}$ denotes the realized volatility of trading day $t$. The realized volatility converges in probability to the integrated volatility as the sampling frequency increases, $m \to \infty$:

$$\operatorname*{plim}_{m\to\infty} \sum_{j=1}^{m-1} \left( \ln P_{(j+1/m),t} - \ln P_{(j/m),t} \right)^2 = \sigma_t^{2(IV)}. \qquad (46)$$

The asymptotic distribution of $\sigma_t^{2(RV)}$ is given by¹¹:

$$\frac{\sqrt{m}\left( \sigma_t^{2(RV)} - \int_{t-1}^{t} \sigma^2(x)\, dx \right)}{\sqrt{2 \int_{t-1}^{t} \sigma^4(x)\, dx}} \overset{d}{\to} N(0,1). \qquad (47)$$

¹⁰ For technical details the reader is referred to Andersen et al. (2001a) and Barndorff-Nielsen and Shephard (2001).
¹¹ For more details about the asymptotic distribution of $\sigma_t^{2(RV)}$ and the integrated quarticity, $\sigma_t^{2(IQ)} = 2\int_{t-1}^{t} \sigma^4(x)\, dx$, the reader is referred to Barndorff-Nielsen and Shephard’s (2002a, 2002b, 2003, 2004a, 2004b, 2005, 2006) manuscripts.
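Computing equation (45) from a day’s price record is a one-liner. A minimal sketch (the 252-day annualization in the comment mirrors the convention used in the tables of this chapter):

```python
import numpy as np

def realized_variance(intraday_prices):
    """Equation (45): sum of squared intraday log-returns for one day.
    `intraday_prices` holds the m prices observed during the day,
    e.g. at a 5- or 30-minute sampling frequency."""
    r = np.diff(np.log(intraday_prices))
    return np.sum(r ** 2)

# Annualized realized volatility for a sequence of days:
# rv_daily = np.array([realized_variance(day) for day in days])
# ann_vol = np.sqrt(252 * rv_daily)
```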

The sampling frequency, $m$, should be as high as possible, but not so high that market microstructure features induce bias into the volatility estimator. In the majority of studies a sampling frequency of 5 minutes or 30 minutes is used, in order to avoid market microstructure frictions without lessening the accuracy of the continuous record asymptotics. Areal and Taylor (2002), Martens (2002), Aït-Sahalia et al. (2005), Bandi and Russell (2005), Corsi et al. (2001), Engle and Sun (2005), Hansen and Lunde (2005a) and Zhang et al. (2005) deal with the effects of market microstructure and data adjustments on the construction of the realized volatility measure. For example, in order to account for changes in asset prices during the hours that the stock market is closed, Martens (2002) suggested accounting for overnight returns without inserting the noisy effect of daily returns:

$$\sigma_t^{2(RV)} = \frac{\sigma_{oc}^2 + \sigma_{co}^2}{\sigma_{oc}^2} \sum_{j=1}^{m-1} \left( 100 \left( \ln P_{(j+1/m),t} - \ln P_{(j/m),t} \right) \right)^2, \qquad (48)$$

where $\sigma_{oc}^2 = T^{-1}\sum_{t=1}^{T} \left( \ln P_{(1),t} - \ln P_{(1/m),t} \right)^2$ is the open-to-close sample variance and $\sigma_{co}^2 = T^{-1}\sum_{t=1}^{T} \left( \ln P_{(1/m),t} - \ln P_{(1),t-1} \right)^2$ is the close-to-open sample variance. Engle and Sun (2005) proposed an econometric model for the joint distribution of tick-by-tick returns and durations, with microstructure noise explicitly filtered out, and used the model to obtain a model-based estimate of daily realized volatility. Zhang et al. (2005) found that the usual realized volatility mainly estimates the magnitude of the noise term rather than anything to do with volatility. They showed that, instead of sampling the tick-by-tick data sparsely, one should separate the observations into multiple grids and combine the usual single grid realized volatility with the multiple grid based device.

The Fractionally Integrated ARMAX, or ARFIMAX(k, l), specification has been used to model the long memory property of the realized volatility:

sample variance. Engle and Sun (2005) proposed an econometric model for the joint distribution of tick by tick return and duration, with microstructure noise explicitly filtered out and used the model to obtain a model-based estimate of daily realized volatility. Zhang et al. (2005) found that the usual realized volatility mainly estimates the magnitude of the noise term rather than anything to do with volatility. They showed that instead of sampling sparsely the tick by tick data, one should separate the observations into multiple grids and combine the usual single grid realized volatility with the multiple grid based device. The Fractionally Integrated ARMAX, or ARFIMAX(k, l), specification has been used to model the long memory property of the realized volatility: 

(1 - c(L))(1 - L)^d \left( \ln\sigma_t^{2(RV)} - w_0 - w_1 y_{t-1} - w_2 d_{t-1} y_{t-1} \right) = (1 + d(L))\varepsilon_t, \quad \varepsilon_t \overset{i.i.d.}{\sim} N(0, \sigma_\varepsilon^2),  (49)

where c(L) = \sum_{i=1}^{k} c_i L^i, d(L) = \sum_{i=1}^{l} d_i L^i, and d_t = 1 when y_t > 0 and d_t = 0 otherwise. For -0.5 < d < 0.5, the dependent variable is weakly stationary. Parameter w_2 models the asymmetric relationship between volatility and past returns. For more information about applications of ARFIMAX models to the realized volatility and its properties, see Taylor and Xu (1997), Andersen and Bollerslev (1997, 1998b), Ebens (1999), Andersen (2000), Bollerslev and Wright (2001), Oomen (2001), Andersen et al. (2003), Thomakos and Wang (2003), Giot and Laurent (2004), Angelidis and Degiannakis (2005b), Koopman et al. (2005), Andersen et al. (2006) and references therein.

However, the volatility of volatility also exhibits time variation and volatility clustering; a representative example is the realized volatility of the S&P500 index futures analyzed by Corsi et al. (2005). Therefore, an ARFIMAX(k, l)-GARCH(p, q) model for the realized volatility is applied:

(1 - c(L))(1 - L)^d \left( \ln\sigma_t^{2(RV)} - w_0 - w_1 y_{t-1} - w_2 d_{t-1} y_{t-1} \right) = (1 + d(L))\varepsilon_t,
\varepsilon_t = h_t z_t,
h_t^2 = a_0 + \sum_{i=1}^{q} a_i \varepsilon_{t-i}^2 + \sum_{i=1}^{p} b_i h_{t-i}^2,  (50)

where z_t \sim N(0, 1). Naturally, the model can be straightforwardly extended to any ARCH volatility specification and distributional assumption that has been presented in sections 4.1.1 and 4.1.2. Beltratti and Morana (2005) showed that, for the Deutschmark/U.S. dollar and Yen/U.S. dollar exchange rates, even a simple GARCH model may work well in one-day-ahead forecasting, but an ARFIMAX-GARCH model improves multi-step VaR forecasting.

Corsi (2004) suggested the Heterogeneous Autoregressive model for the realized volatility, or HAR-RV:

\sigma_t^{(RV)} = w_0 + w_1 \sigma_{t-1}^{(RV)} + w_2 \left( \sigma^{(RV)} \right)_{t-5:t-1} + w_3 \left( \sigma^{(RV)} \right)_{t-22:t-1} + \varepsilon_t, \quad \varepsilon_t \overset{i.i.d.}{\sim} N(0, \sigma_\varepsilon^2),  (51)

where \left( \sigma^{(RV)} \right)_{t-\tau:t-1} = \tau^{-1} \sum_{j=1}^{\tau} \sigma_{t-j}^{(RV)} defines the \tau-period realized volatility. The last day's realized volatility is explained by the daily, weekly and monthly realized volatilities. The HAR-RV model is an autoregressive structure in volatilities realized over different interval sizes. Its economic interpretation stems from the Heterogeneous Market Hypothesis presented by Müller et al. (1993). The basic idea is that market participants have different perspectives on their investment horizon. Thus, the \sigma_{t-1}^{(RV)} component accounts for the volatility of interday or intraday trading strategies, \left( \sigma^{(RV)} \right)_{t-5:t-1} accounts for medium-term trading, and \left( \sigma^{(RV)} \right)_{t-22:t-1} captures investment strategies with time horizons of one month or longer. The heterogeneity, which originates from the difference in time horizons, creates volatility. Corsi et al. (2005) extended the HAR model by including a GARCH component to account for the volatility clustering of the realized volatility. They named the new specification the HAR-GARCH(p, q) model:

\sigma_t^{(RV)} = w_0 + w_1 \sigma_{t-1}^{(RV)} + w_2 \left( \sigma^{(RV)} \right)_{t-5:t-1} + w_3 \left( \sigma^{(RV)} \right)_{t-22:t-1} + \varepsilon_t,
\varepsilon_t = h_t z_t,
h_t^2 = a_0 + \sum_{i=1}^{q} a_i \varepsilon_{t-i}^2 + \sum_{i=1}^{p} b_i h_{t-i}^2,  (52)

where z_t \sim N(0, 1). Of course, the extension of the model to account for any ARCH volatility specification and distributional assumption that has been presented in the ARCH literature also holds for the HAR model.
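To make the structure of equation (51) concrete, the following is a minimal Python sketch of the HAR-RV regressor construction and its OLS estimation; the realized volatility series is a simulated stand-in for actual data:

    import numpy as np

    def har_rv_design(rv, weekly=5, monthly=22):
        # Build the eq. (51) regressors: a constant and the daily, weekly
        # and monthly averages of past realized volatility.
        rows, target = [], []
        for t in range(monthly, len(rv)):
            rows.append([1.0, rv[t - 1],
                         rv[t - weekly:t].mean(),
                         rv[t - monthly:t].mean()])
            target.append(rv[t])
        return np.array(rows), np.array(target)

    rng = np.random.default_rng(1)
    rv = np.abs(rng.normal(1.0, 0.2, size=1000))   # placeholder series
    X, y = har_rv_design(rv)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)      # (w0, w1, w2, w3)
    x_next = np.array([1.0, rv[-1], rv[-5:].mean(), rv[-22:].mean()])
    one_step_forecast = x_next @ w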

5 Liquidity Adjusted Value-at-Risk

Traditional VaR approaches (equation 2) are based on the assumption that markets are perfect and hence investors can buy or sell any amount of stock without causing a significant price change. However, this hypothesis does not always hold, and therefore the estimated VaR underestimates the true one. (In order to overcome this shortcoming of the traditional risk models, it has been suggested, according to the Basle Committee proposal (1995a, 1995b), that banks calculate VaR over 10-day intervals, so as to be able to liquidate their positions.) Two lines of research have been followed in order to accommodate liquidity risk in the VaR framework:

1. VaR adjustments based on the bid-ask spread of the stocks, and

2. trading strategies that minimize the expected cost and its variance.

5.1 VaR Adjustments Based on the Bid-Ask Spread

The first attempt to adjust VaR by using the spread of the stocks was made by Bangia et al. (1999). Even though it is a very simple methodology, it has received much attention (see Dowd, 2002). They classified illiquidity risk into two categories:

• exogenous illiquidity, depending on market conditions, and

• endogenous illiquidity, related to the position of a trader with respect to the bid-ask spread.

Specifically, they argued that the classical VaR estimate should be adjusted by adding a liquidity term, L_1:

LVaR = VaR + L_1,  (53)

a relation that is presented in Figure 3. Bangia et al. (1999) estimated the LVaR as:

LVaR_{t+1|t}^{(p)} = VaR_{t+1|t}^{(p)} + \frac{1}{2}\left( Spread + F_a\left( \{spread_t\}_{t=1}^{T} \right) \sigma_{t,spread} \right),  (54)

where F_a\left( \{spread_t\}_{t=1}^{T} \right) is the a-th quantile of the spread's distribution, \sigma_{t,spread} is the standard deviation of the bid-ask spread, and spread_t is the difference between the bid and the ask price at time t. Under this framework, the computation of the LVaR number is simple and intuitive, as it splits the total risk into a "market" component (VaR) and an "illiquidity" one, \frac{1}{2}\left( Spread + F_a\left( \{spread_t\}_{t=1}^{T} \right) \sigma_{t,spread} \right). However, their approach faces several drawbacks:

• The spread is not normally distributed (see Bangia et al., 1999, and Ervan, 2000, among others) and therefore the VaR coefficient cannot be calculated in advance.

• It does not account for endogenous risk, which relates the liquidation price to the investor's specific position, and therefore mis-estimates liquidity risk.

Having in mind the above limitations, Angelidis and Benos (2006) estimated the bid-ask spread components in order to calculate accurately both the endogenous and the exogenous liquidity risk, by incorporating the traded volume into the model of Madhavan et al. (1997):

P_t - P_{t-1} = k_1 \sqrt{TV_t}\,(X_t - \rho X_{t-1}) + k_2 (X_t - X_{t-1}) + k_3 \left( X_t \sqrt{TV_t} - X_{t-1} \sqrt{TV_{t-1}} \right) + \varepsilon_t,  (55)

where P_t denotes the transaction price of a security at time t, X_t is the "trade indicator" variable, which equals +1 if the trade is buy oriented and -1 if it is sell oriented, and TV_t is the number of traded shares at time t. The coefficient k_2 \geq 0 represents the cost per share of the market maker for supplying liquidity on demand, k_1 captures the degree of information asymmetry, and k_3 reveals whether the order handling or the inventory costs are more important. Based on this set-up, Angelidis and Benos (2006) proposed an LVaR measure that is calculated as follows:

LVaR_{t+1|t}^{(p)} = VaR_{t+1|t}^{(p)} + \left( (k_1 + k_3) \sqrt{ F_a\left( \{TV_t\}_{t=1}^{T} \right) } + k_2 \right),  (56)

where F_a\left( \{TV_t\}_{t=1}^{T} \right) is the a-th quantile of the traded volume. Equation (56) not only encompasses execution costs through a spread adjustment, but also accommodates the endogenous liquidity risk, adjusting the spread based on trading volume and hence relating the trading position of an individual dealer to the market depth of the given stock. The exogenous risk is described by the implied spread, while the endogenous one is calculated according to the following equation:

Endogenous Liquidity = \begin{cases} \left[ (k_1 + k_3)\sqrt{TV_t} + k_2 \right] - \left[ (k_1 + k_3)\sqrt{\overline{TV}} + k_2 \right], & \text{if } TV_t \geq \overline{TV}, \\ 0, & \text{otherwise}, \end{cases}  (57)

where \overline{TV} is the average traded volume.

They applied their model to Greek stocks and showed that, for the high-priced, high-capitalization stocks, liquidity risk represents 3.40% of total market risk, while for the low-capitalization ones it is close to 11%. In all cases, the VaR estimates based on the proposed methodology are more accurate than those of the traditional methods, while the required capital multiplier is also smaller. Bangia et al. (1999) reached the same conclusion by applying their model to an exchange rate dataset.
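Taking equation (54) literally, a minimal Python sketch of the Bangia et al. (1999) adjustment follows; the VaR input and the spread series are hypothetical, and the quantile F_a is approximated by the empirical quantile of the spread sample:

    import numpy as np

    def bangia_lvar(var, spreads, a=0.99):
        # Eq. (54): add half of (average spread + extreme spread term)
        # to the market VaR.
        f_a = np.quantile(spreads, a)       # a-th quantile of the spreads
        liquidity = 0.5 * (spreads.mean() + f_a * spreads.std(ddof=1))
        return var + liquidity

    spreads = np.abs(np.random.default_rng(2).normal(0.05, 0.01, size=500))
    lvar = bangia_lvar(var=2.5, spreads=spreads)   # illustrative inputs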

5.2 Trading Strategies that Minimize the Expected Cost and its Variance

The liquidity measures described above are based on the bid-ask spread and its volatility, and hence can accommodate the transaction costs that an investor faces. However, they are not completely satisfactory, since they may be appropriate only for small portfolios, as they ignore the risk that arises when the liquidation of a trading position affects prices. Therefore, the calculation of the LVaR must also take into account the market impact factor and draw on trading strategies.

Bertsimas and Lo (1998) considered the strategy of buying X shares during a time period T. They showed that, under the assumption of no private stock-specific information, when the price follows an arithmetic random walk and the price impact of the liquidation strategy is linear, the optimal execution strategy coincides with the naive one, which breaks the total size X into T identical packages of size X/T.

Almgren and Chriss (2000) defined two price impacts: temporary and permanent. The temporary price effect is defined as the continuously compounded return earned on the difference between the block transaction price and the equilibrium price prior to the block transaction; the permanent price effect is defined as the return on the difference between the equilibrium prices after and before the block transaction. In their set-up, the stock price, P, follows a random walk with no drift. The liquidation time, T', is divided into N equally spaced intervals (t_{k-1}, t_k), k = 1, \ldots, N, of length \tau = T'/N. A trader wants to sell X shares over the total time T' through a sequence of sales in each of the N intervals. His holdings at the end points of the intervals are x_0 = X, x_1, x_2, \ldots, x_{N-1}, x_N = 0, and his sales during the intervals are n_k = x_{k-1} - x_k, for k = 1, \ldots, N, so that x_k = X - \sum_{j=1}^{k} n_j = \sum_{j=k+1}^{N} n_j. The speed with which he sells in each interval is denoted by \nu_k = n_k / \tau. The stock price at each interval equals:

P_k = P_{k-1} + \sigma \sqrt{\tau}\, z_k - \tau h_1(\nu_k),  (58)

where \sigma is the stock price volatility, z_k is a standard normal variable and h_1(\nu_k) is the permanent impact function. The effect of the temporary market impact on the stock price is:

\tilde{P}_k = P_{k-1} - h_2(\nu_k),  (59)

where h_2(\nu_k) is the temporary impact function. The transaction cost, Cost, equals:

Cost = X P_0 - \sum_{k=1}^{N} n_k \tilde{P}_k = -\sigma\sqrt{\tau} \sum_{k=1}^{N} z_k x_k + \tau \sum_{k=1}^{N} h_1(\nu_k)\, x_k + \sum_{k=1}^{N} n_k h_2(\nu_k),  (60)

where X P_0 is the market value of the position at t = 0. According to Almgren and Chriss (2000), Cost is normally distributed with:

E(Cost) = \tau \sum_{k=1}^{N} h_1(\nu_k)\, x_k + \sum_{k=1}^{N} n_k h_2(\nu_k),
V(Cost) = \sigma^2 \tau \sum_{k=1}^{N} x_k^2.  (61)

Under the framework of Almgren and Chriss (2000), the optimal liquidation strategy is determined by minimizing the total cost, and LVaR is defined as:

LVaR = E(Cost) + N_a \sqrt{V(Cost)},  (62)

where N_a is the a-th quantile of the normal distribution.
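A minimal Python sketch of equations (60)-(62) for a naive, equal-slice liquidation schedule, assuming linear impact functions h_1(\nu) = \gamma\nu and h_2(\nu) = \eta\nu; all parameter values are purely illustrative:

    import numpy as np
    from scipy.stats import norm

    def liquidation_lvar(X, N, tau, sigma, gamma, eta, a=0.99):
        # Equal slices: n_k = X/N shares per interval.
        n = np.full(N, X / N)
        x = X - np.cumsum(n)          # holdings x_1, ..., x_N (x_N = 0)
        v = n / tau                   # trading speed nu_k
        exp_cost = tau * np.sum(gamma * v * x) + np.sum(n * eta * v)
        var_cost = sigma**2 * tau * np.sum(x**2)
        return exp_cost + norm.ppf(a) * np.sqrt(var_cost)   # eq. (62)

    lvar = liquidation_lvar(X=1e5, N=10, tau=1.0, sigma=0.3,
                            gamma=2.5e-7, eta=2.5e-6)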

6 Backtesting Value-at-Risk

VaR forecasts must neither overestimate nor underestimate the "true" VaR number as, in both cases, the financial institution allocates the wrong amount of capital. In the former case, regulators charge a higher than actually needed amount of capital, worsening the institution's performance; in the latter, the regulatory capital set aside may not be enough to cover market risk. The simplest method to evaluate the accuracy of the risk models is to record the number of violations in order to determine the multiplicative factor (k); the smaller k is, the better the model predicts VaR. However, this procedure can only be applied at the 99% confidence level and when the holding period equals 10 days. Another way is to count how many times the losses are greater than the VaR: if this number does not differ substantially from the expected one, the VaR estimates are adequate. Nevertheless, statistical inference and, most importantly, a statistical comparison between two or more adequate models are necessary. Statistical techniques for evaluating VaR models have been proposed, among others, by Kupiec (1995) and Christoffersen (1998, 2003). The purpose of the two backtesting measures is

• to examine whether the failure rate of a model is statistically equal to the expected one (unconditional coverage), and

• to investigate whether the VaR violations are independently distributed (conditional coverage).

If these prerequisites are met, the financial institution does not mis-allocate its capital: if a model overestimates the "true" VaR, regulators charge a higher than actually needed amount of capital, as Table 1 shows, whereas, if a risk model generates dependent violations, there are indications that it is misspecified. Since multiple risk management techniques may meet the statistical criteria of VaR evaluation, a utility function of the risk manager must be brought into the picture to judge statistically the differences among the adequate VaR models. Under this framework, the risk manager can reduce the set of competing models to a smaller one.

6.1 Unconditional Coverage

Let N = \sum_{t=1}^{\tilde{T}} \tilde{I}_t be the number of days, over a \tilde{T} period, on which the portfolio loss was larger than the VaR estimate, where

\tilde{I}_{t+1} = \begin{cases} 1, & \text{if } y_{t+1} < VaR_{t+1|t}^{(p)}, \\ 0, & \text{if } y_{t+1} \geq VaR_{t+1|t}^{(p)}. \end{cases}  (63)

(We assume a long trading position; for short trading positions, \tilde{I}_{t+1} = 1 if y_{t+1} exceeds the VaR forecast and 0 otherwise.) According to Kupiec (1995), the number of violations follows a binomial distribution, N \sim B(\tilde{T}, p). In order to investigate whether the failure rate of the model equals the expected one, the null and the alternative hypotheses are:

H_0: N/\tilde{T} = p \quad \text{versus} \quad H_1: N/\tilde{T} \neq p,  (64)

where p is the expected ratio of violations, and the appropriate likelihood ratio statistic equals:

LR_{un} = 2\ln\left( \left( 1 - N/\tilde{T} \right)^{\tilde{T}-N} \left( N/\tilde{T} \right)^{N} \right) - 2\ln\left( (1-p)^{\tilde{T}-N} p^{N} \right).  (65)

Asymptotically, this test is \chi^2-distributed with one degree of freedom. Table 2 presents the no-rejection regions of N for various sample sizes and confidence levels. The test can reject a model for both high and low failure rates but, as stated by Kupiec (1995), its power is generally poor. (Berkowitz (2001) suggested not focusing on a specific risk measure in order to judge the adequacy of the model, but backtesting the entire distribution that is derived from the risk model, hence increasing the power of the test. However, if only the risk measure is reported, this procedure cannot be applied.)
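A minimal Python sketch of the likelihood ratio in equation (65); the 0/1 violation series is assumed given and 0 < N < \tilde{T}:

    import numpy as np
    from scipy.stats import chi2

    def kupiec_test(violations, p):
        # Eq. (65): unconditional coverage LR test of Kupiec (1995).
        T, N = len(violations), int(np.sum(violations))
        loglik = lambda q: (T - N) * np.log(1 - q) + N * np.log(q)
        lr = 2.0 * (loglik(N / T) - loglik(p))
        return lr, chi2.sf(lr, df=1)   # statistic and asymptotic p-value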

6.2 Conditional Coverage

Christoffersen (1998) developed a conditional coverage test, which jointly examines the conjectures that the total number of failures is statistically equal to the expected one and that the VaR violations are independent. The hypotheses are:

H_0: N/\tilde{T} = p \text{ and } \pi_{01} = \pi_{11} = p
H_1: N/\tilde{T} \neq p \text{ or } \pi_{01} \neq \pi_{11}.  (66)

Under the null hypothesis that the failure process is independent and the expected proportion of violations is equal to p, the appropriate likelihood ratio is:

LR_{cc} = -2\ln\left( (1-p)^{\tilde{T}-N} p^{N} \right) + 2\ln\left( (1-\pi_{01})^{n_{00}} \pi_{01}^{n_{01}} (1-\pi_{11})^{n_{10}} \pi_{11}^{n_{11}} \right) \sim \chi_2^2,  (67)

where n_{ij} is the number of observations with value i followed by value j, for i, j = 0, 1, and \pi_{ij} = n_{ij} / \sum_j n_{ij} are the corresponding probabilities; i, j = 1 denotes that a violation occurred, while i, j = 0 indicates the opposite. If the sequence \tilde{I}_t is independent, the probabilities of observing a VaR violation in the next period must be equal, i.e. \pi_{01} = \pi_{11} = p. Contrary to Kupiec's (1995) test, the conditional coverage procedure can reject a VaR model that generates either too many or too few clustered violations.
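A corresponding sketch for the conditional coverage statistic of equation (67); it assumes that both states and all four transitions occur in the sample:

    import numpy as np
    from scipy.stats import chi2

    def christoffersen_test(violations, p):
        # Eq. (67): conditional coverage LR test of Christoffersen (1998).
        v = np.asarray(violations, dtype=int)
        n = {(i, j): int(np.sum((v[:-1] == i) & (v[1:] == j)))
             for i in (0, 1) for j in (0, 1)}
        pi01 = n[0, 1] / (n[0, 0] + n[0, 1])
        pi11 = n[1, 1] / (n[1, 0] + n[1, 1])
        T, N = len(v), int(v.sum())
        ll_null = (T - N) * np.log(1 - p) + N * np.log(p)
        ll_alt = (n[0, 0] * np.log(1 - pi01) + n[0, 1] * np.log(pi01)
                  + n[1, 0] * np.log(1 - pi11) + n[1, 1] * np.log(pi11))
        lr = 2.0 * (ll_alt - ll_null)
        return lr, chi2.sf(lr, df=2)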

6.3 Generalization of the Conditional Coverage Test

In the spirit of the preceding backtesting measures, Engle and Manganelli (2004) proposed to test the forecasting power of a model by examining whether the sequence \tilde{I}_t is independently distributed, which could in principle be done with the tests of Cowles and Jones (1937), Mood (1940) and Ljung and Box (1978). However, as they argued, these tests are not sufficient for assessing the performance of VaR models. They therefore suggested examining whether the variable Hit_t = \tilde{I}_t - p is uncorrelated with anything that belongs to the information set I_t; for example, Hit_t must be uncorrelated with its own lagged values, with the forecasted VaR and with its lagged values. If all these prerequisites are satisfied, a risk model is deemed adequate. In order to examine these conditions statistically, a risk manager must run the following regression:

Hit_t = \delta_0 + \delta_1 Hit_{t-1} + \ldots + \zeta VaR_{t|t-1}^{(p)} + \ldots + \varepsilon_t,  (68)

where \varepsilon_t \sim N(0, 1), and test whether \delta_0 = \delta_1 = \ldots = \zeta = \ldots = 0.
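A minimal sketch of the regression in equation (68); the single lag of Hit_t and the current VaR forecast shown here are an illustrative choice of regressors:

    import numpy as np

    def hit_regression(violations, var_forecasts, p):
        # Regress Hit_t = I_t - p on a constant, Hit_{t-1} and VaR_{t|t-1};
        # var_forecasts[t] is assumed aligned with violations[t].
        hit = np.asarray(violations, dtype=float) - p
        y = hit[1:]
        X = np.column_stack([np.ones(len(y)), hit[:-1],
                             np.asarray(var_forecasts)[1:]])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return beta   # adequacy requires all coefficients jointly zero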

6.4 Loss Functions

Lopez (1999) suggested the development of a loss function that accommodates the specific concerns of risk managers and proposed measuring the accuracy of the VaR forecasts on the basis of the distance between the observed returns and the forecasted VaR values when a violation occurs. He defined a penalty variable as (the loss functions are presented for long trading positions):

\Psi_{t+1} = \begin{cases} 1 + \left( y_{t+1} - VaR_{t+1|t}^{(p)} \right)^2, & \text{if } y_{t+1} < VaR_{t+1|t}^{(p)}, \\ 0, & \text{if } y_{t+1} \geq VaR_{t+1|t}^{(p)}. \end{cases}  (69)

A VaR model is penalized when an exception takes place, so one model is preferred to another if it yields a lower total loss value, defined as the sum of these penalty scores: \Psi = \sum_{t=1}^{\tilde{T}} \Psi_t. This function incorporates both the cumulative number and the magnitude of exceptions. Nevertheless, his approach has two drawbacks. First, if the risk management techniques are not filtered by the aforementioned backtesting measures, a model that does not generate any violation is deemed the most adequate, as \Psi = 0. Second, the return, y_{t+1}, would be better compared with the ES measure rather than with the VaR, as VaR gives no indication of the size of the expected loss when a violation occurs.

Sarma et al. (2003) suggested a two-stage backtesting procedure in order to overcome the first shortcoming. In the first stage, they tested the statistical accuracy of the models in the VaR context, by examining whether the average number of violations is statistically equal to the expected one and whether these violations are independently distributed. In the second stage, they proposed the Firm's Loss Function (FLF), "by penalising failures but also imposing a penalty reflecting the cost of capital suffered on other days":

\Psi_{t+1} = \begin{cases} \left( y_{t+1} - VaR_{t+1|t}^{(p)} \right)^2, & \text{if } y_{t+1} < VaR_{t+1|t}^{(p)}, \\ -\alpha_c VaR_{t+1|t}^{(p)}, & \text{if } y_{t+1} \geq VaR_{t+1|t}^{(p)}, \end{cases}  (70)

where \alpha_c is a measure of the opportunity cost of capital. Following this procedure, the risk manager is ensured that the models that have not been rejected in the first stage forecast VaR accurately. Note that, contrary to Lopez (1999), in Sarma et al.'s (2003) loss function a score of one is not added when a violation occurs, since, given the two-stage backtesting procedure, the observed exception rates of all retained models are statistically equal to the expected ones.

Angelidis and Degiannakis (2006) proposed a method to overcome the second shortcoming of Lopez's (1999) loss function. They suggested measuring the distance of the loss from the ES, rather than from the VaR, as VaR gives no indication of the size of the expected loss:

\Psi_{t+1} = \begin{cases} \left( y_{t+1} - ES_{t+1|t}^{(p)} \right)^2, & \text{if } y_{t+1} < VaR_{t+1|t}^{(p)}, \\ 0, & \text{if } y_{t+1} \geq VaR_{t+1|t}^{(p)}. \end{cases}  (71)

According to Angelidis and Degiannakis’ (2006) backtesting procedure, the best performing model i will calculate the VaR number accurately, as the prerequisite of correct unconditional and conditional coverage will be satisfied, and ii the expected loss, given a VaR violation, will be computed accurately, as

˜ the total loss value Tt=1 Ψt is minimized. But how the adequate models in the second-stage can be evaluated? Sarma et al. (2003) suggested to implement the Diebold and Mariano (1995) test. Let (A,B) (A) (B) (A) (B) = Ψt − Ψt , where Ψt and Ψt are the loss functions of models Xt (A,B) indicates that model A is A and B, respectively. A negative value of Xt superior to model B. The Diebold-Mariano (1995) statistic is the ”t-statistic” (A,B) for a regression of Xt on a constant with heteroskedastic and autocorrelated consistent (HAC) standard errors.16 However, based on this procedure multiple comparisons can not be performed. Angelidis and Degiannakis (2006) suggested to implement Hansen’s (2005) Superior Predictive Ability (SPA) criterion in order to evaluate the benchmark model (the best performing one) with all the competing models, simultaneously. The hypothesis of the test is:  ∗  (i ,1) (i∗ ,M ) H0 : E Xt . . . Xt ≤0  ∗  (72) ∗ (i ,1) (i ,M ) H1 : E Xt . . . Xt > 0, (i∗ ,i)

(i∗ )

(i)

= Ψt − Ψt , i∗ denotes the benchmark model, i = 1, . . . , M are where Xt the competitive models. The null hypothesis that the benchmark model i∗ is not outperformed by competing models i, is test with the statistic: T SP A = max

M 1/2 X i V ar (M 1/2 X i )

,

(73)

 

˜ (i∗ ,i) for i = 1, . . . , M, where X i = T1˜ Tt=1 Xt . The V ar M 1/2 X i is calculated according to Politis and Romano (1994) methodology. 16

For more details about HAC standard errors, see White (1980) and Newey and West (1987).

34

Under the two-stage backtesting environment, the risk manager achieves three goals:

• VaR is forecasted accurately, so the prerequisites of the Basel Committee on Banking Supervision are satisfied.

• One model, or a family of models, is selected among various candidates following a statistical inference procedure.

• The amount that may be needed if a VaR violation occurs is known in advance, as the ES measure is forecasted accurately, and therefore the risk manager is better prepared to face future losses.
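A minimal Python sketch of the penalty variables in equations (69) and (71); the return, VaR and ES arrays are hypothetical one-step-ahead series for a long position:

    import numpy as np

    def lopez_loss(y, var):
        # Eq. (69): Lopez (1999) penalty scores.
        return np.where(y < var, 1.0 + (y - var) ** 2, 0.0)

    def es_loss(y, var, es):
        # Eq. (71): ES-based scores of Angelidis and Degiannakis (2006).
        return np.where(y < var, (y - es) ** 2, 0.0)

    # Total losses (lower is better among models passing the coverage
    # tests): lopez_loss(y, var).sum() and es_loss(y, var, es).sum()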

7 Application

In the following paragraphs, VaR and ES numbers will be computed from a set of volatility models for the S&P500 index. Two ARCH conditional volatility specifications, the GARCH(p, q) and APARCH(p, q) models, with normally and skewed Student-t distributed standardized innovations, are estimated, giving four ARCH models in total. The conditional mean is assumed to follow a first order autoregressive process in order to capture the non-synchronous trading effect. We also set p = q = 1, as in the majority of volatility forecasting applications. (There are, however, procedures to choose the orders p and q, based either on in-sample evaluation methods, e.g. Akaike's (1973) and Schwarz's (1978) information criteria, or on out-of-sample evaluation criteria, e.g. predictive loss functions. The HASE and HAAE functions of Andersen et al. (1999) and the LE function of Pagan and Schwert (1990) are representative examples of loss functions for the evaluation of volatility forecasting models. Xekalaki and Degiannakis (2005) and Degiannakis and Xekalaki (2006) examine the performance of various in-sample and out-of-sample methods of ARCH model selection. In the former study, the evaluation of the various models is performed by comparing different volatility forecasts in option pricing through the simulation of an options market, whereas in the latter a number of statistical measures are used to examine the ability of the models to predict future volatility, for forecasting horizons ranging from one day to one hundred days ahead. In both studies the best performing methods of model selection are those that base their evaluation on the predictability of the models; in particular, the Standardized Prediction Error Criterion (SPEC) model selection algorithm, which suggests as the adequate ARCH model for volatility forecasting the one with the lowest sum of squared standardized one-step-ahead prediction errors, picks the models with the highest forecasting performance.)

For y_t = 100\ln(SP500_t / SP500_{t-1}) denoting the continuously compounded daily returns of the S&P500 index, the estimated models take the following forms:

AR(1)GARCH(1,1)-Normal:

y_t = c_0 + c_1 y_{t-1} + \varepsilon_t, \quad \varepsilon_t = \sigma_t z_t,
\sigma_t^2 = a_0 + a_1 \varepsilon_{t-1}^2 + b_1 \sigma_{t-1}^2, \quad z_t \sim N(0, 1).  (74)

AR(1)GARCH(1,1)-skewed Student-t:

y_t = c_0 + c_1 y_{t-1} + \varepsilon_t, \quad \varepsilon_t = \sigma_t z_t,
\sigma_t^2 = a_0 + a_1 \varepsilon_{t-1}^2 + b_1 \sigma_{t-1}^2, \quad z_t \sim st(0, 1; \xi, \upsilon).  (75)

AR(1)APARCH(1,1)-Normal:

y_t = c_0 + c_1 y_{t-1} + \varepsilon_t, \quad \varepsilon_t = \sigma_t z_t,
\sigma_t^\delta = a_0 + a_1 \left( |\varepsilon_{t-1}| - \gamma_1 \varepsilon_{t-1} \right)^\delta + b_1 \sigma_{t-1}^\delta, \quad z_t \sim N(0, 1).  (76)

AR(1)APARCH(1,1)-skewed Student-t:

y_t = c_0 + c_1 y_{t-1} + \varepsilon_t, \quad \varepsilon_t = \sigma_t z_t,
\sigma_t^\delta = a_0 + a_1 \left( |\varepsilon_{t-1}| - \gamma_1 \varepsilon_{t-1} \right)^\delta + b_1 \sigma_{t-1}^\delta, \quad z_t \sim st(0, 1; \xi, \upsilon).  (77)

st(0, 1; \xi, \upsilon) denotes the standardized skewed Student-t distribution. Its quantile function is st_a(z_t; \xi, \upsilon) = (st_a^*(z_t; \xi, \upsilon) - m)/s, where st_a^*(z_t; \xi, \upsilon) is the quantile function of the non-standardized skewed Student-t defined in (13).

Moreover, we compute the realized volatility, based on five-minute linearly interpolated S&P500 index prices, and estimate the ARFIMAX(1,1) specification with the logarithm of the realized variance as dependent variable:

ARFIMAX(1,1) specification:

(1 - c_1 L)(1 - L)^d \left( \ln\sigma_t^{2(RV)} - w_0 - w_1 y_{t-1} - w_2 d_{t-1} y_{t-1} \right) = (1 + d_1 L)\varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma_\varepsilon^2),  (78)

where \sigma_t^{2(RV)} = \frac{\sigma_{oc}^2 + \sigma_{co}^2}{\sigma_{oc}^2} \sum_{j=1}^{m-1} \left( 100\left( \ln SP500_{(j+1)/m,t} - \ln SP500_{j/m,t} \right) \right)^2 is the realized variance measure as presented in Section 4.5, and d_t = 1 when y_t > 0 and d_t = 0 otherwise.
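The fractional differencing operator (1 - L)^d in equation (78) is applied in practice through a truncated expansion of its lag polynomial; a minimal sketch of the weight computation, under that truncation assumption, is:

    import numpy as np

    def frac_diff_weights(d, n_lags):
        # Coefficients of (1 - L)^d = sum_k w_k L^k via the recursion
        # w_0 = 1, w_k = w_{k-1} * (k - 1 - d) / k.
        w = np.empty(n_lags + 1)
        w[0] = 1.0
        for k in range(1, n_lags + 1):
            w[k] = w[k - 1] * (k - 1 - d) / k
        return w

    # Applying the truncated filter to a log realized-variance series x:
    # x_fd = np.convolve(x, frac_diff_weights(0.4, 200))[:len(x)]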

The S&P500 daily log-returns from January 1990 to December 2003 constitute the dataset for the ARCH models, whereas the intraday data consist of five-minute linearly interpolated S&P500 prices over the period from January 1997 to December 2003. The S&P500 daily prices and daily log-returns are presented in Figures 4 and 5. The alternation between periods of high and low volatility, named volatility clustering, characterizes the daily log-returns. Figure 6 presents the annualized realized standard deviation, \sqrt{252}\,\sigma_t^{(RV)}; on the right-hand axis, the S&P500 closing prices are plotted to make visible the tendency of stock returns to be negatively correlated with changes in return volatility. The next day's conditional standard deviation forecasts are computed as:

GARCH(1,1) model:

\sigma_{t+1|t} = \sqrt{ a_0^{(t)} + a_1^{(t)} \varepsilon_{t|t}^2 + b_1^{(t)} \sigma_{t|t}^2 },  (79)

APARCH(1,1) model:

\sigma_{t+1|t} = \left( a_0^{(t)} + a_1^{(t)} \left( |\varepsilon_{t|t}| - \gamma_1^{(t)} \varepsilon_{t|t} \right)^{\delta^{(t)}} + b_1^{(t)} \sigma_{t|t}^{\delta^{(t)}} \right)^{1/\delta^{(t)}},  (80)

ARFIMAX model:

\sigma_{t+1|t} = \sqrt{ \exp\left( \ln\sigma_{t+1|t}^{2(RV)} + 0.5\,\sigma_\varepsilon^{2(t)} \right) },  (81)

where

\ln\sigma_{t+1|t}^{2(RV)} = d_1^{(t)} (1 - L)^{-d^{(t)}} \varepsilon_{t|t} + w_0^{(t)} + w_1^{(t)} y_t + w_2^{(t)} d_t y_t + c_1^{(t)} \left( \ln\sigma_{t|t}^{2(RV)} - w_0^{(t)} - w_1^{(t)} y_{t-1} - w_2^{(t)} d_{t-1} y_{t-1} \right),  (82)

(1 - L)^{-d^{(t)}} = 1 + \sum_{i=1}^{\infty} \frac{\Gamma\left( i + d^{(t)} \right)}{\Gamma\left( d^{(t)} \right) \Gamma(i + 1)} L^i = 1 + \frac{1}{1!} d^{(t)} L + \frac{1}{2!} d^{(t)} \left( 1 + d^{(t)} \right) L^2 + \ldots,  (83)

for d^{(t)} > 0, and \Gamma(\cdot) is the Gamma function. For each model, we compute \tilde{T} = 499 one-day-ahead volatility forecasts. Each trading day, based on a rolling sample of T = 3000 observations, the parameter vector \psi^{(t)} = \left( \theta^{(t)}, \xi^{(t)}, \upsilon^{(t)} \right) is re-estimated; for the AR(1)GARCH(1,1)-skT model, for instance, \psi^{(t)} = \left( c_0^{(t)}, c_1^{(t)}, a_0^{(t)}, a_1^{(t)}, b_1^{(t)}, \xi^{(t)}, \upsilon^{(t)} \right). Thus, we use all the information that is available at the time the estimate has to be made. (Many studies estimate the in-sample parameters once and derive all the out-of-sample forecasts from them, e.g. Klaassen, 2002, Vilasuso, 2002, and Hansen and Lunde, 2005b, while Billio and Pelizzon (2000) re-estimated the model parameters every 50 trading days.) The models are estimated using the G@RCH (Laurent and Peters, 2002) and ARFIMA (Doornik and Ooms, 2001) packages of Ox (Doornik, 2001).
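To illustrate how the forecasts of equation (79) translate into one-day-ahead VaR and ES numbers under normality (as in Section 4.1.3), here is a minimal Python sketch; the parameter values stand in for the rolling-window estimates and are purely hypothetical:

    import numpy as np
    from scipy.stats import norm

    def garch_var_es(a0, a1, b1, eps_t, sigma_t, mu_next=0.0, p=0.95):
        # Eq. (79) one-step-ahead standard deviation, then 95% VaR and
        # ES for a long position under normally distributed innovations.
        sigma_next = np.sqrt(a0 + a1 * eps_t**2 + b1 * sigma_t**2)
        z = norm.ppf(1.0 - p)                              # left tail
        var = mu_next + z * sigma_next
        es = mu_next - sigma_next * norm.pdf(z) / (1.0 - p)
        return sigma_next, var, es

    print(garch_var_es(a0=0.01, a1=0.05, b1=0.93, eps_t=-1.2, sigma_t=1.0))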

The average squared distance between the predicted standard deviation and the realized volatility is computed as:

\Psi_{(RV)}^{(i)} = \tilde{T}^{-1} \sum_{t=1}^{\tilde{T}} \Psi_t^{(i)},  (84)

where \Psi_t^{(i)} = \left( \sqrt{252}\,\sigma_{t+1}^{(RV)} - \sqrt{252}\,\sigma_{t+1|t}^{(i)} \right)^2, for i = 1, \ldots, 5 models. Figures 7 to 11 plot the one-day-ahead annualized volatility forecasts, \sqrt{252}\,\sigma_{t+1|t}^{(i)}, of the five models. Table 3 presents the predictive mean squared errors, \Psi_{(RV)}^{(i)}.

From Figures 7 to 11, as well as from the \Psi_{(RV)}^{(i)} loss functions, the ARFIMAX model outperforms the other models in forecasting the next trading day's realized volatility. Applying the SPA criterion for testing the null hypothesis that the ARFIMAX model is the best forecasting model, the p-value of 0.52 leaves no room to reject the null hypothesis.

Table 4 presents the percentage of violations (N/\tilde{T}) and the p-values of Kupiec's (unconditional coverage) and Christoffersen's (conditional coverage) tests. Based on the coverage tests, two models are considered adequate. (The conditional realized volatility forecasts, \sigma_{t|t-1}^{(RV)}, can be scaled with the conditional volatility of daily returns, \sigma_t. As Giot and Laurent (2004) introduced and Angelidis and Degiannakis (2006) also applied, an ARCH framework for the daily returns, but with a conditional volatility proportional to the realized one, i.e. y_t = \mu_t + \sigma_t z_t with \sigma_t = a_0 \sigma_{t|t-1}^{(RV)}, would provide volatility forecasts closer to the daily measures of risk.) At the 95% confidence level, the exception rates of the AR(1)GARCH(1,1) and AR(1)APARCH(1,1) models with normally distributed standardized innovations equal 5.01% (25 violations out of 499 forecasts) and 4.61% (23 violations out of 499 forecasts), respectively. From the p-values, we cannot infer which model provides the most accurate next day's VaR forecasts. Therefore, we proceed to a second stage evaluation and measure the average squared distance between the actual returns and the VaR forecasts when a violation occurred:

\Psi_{(VaR)}^{(i)} = 10^3\, \tilde{T}^{-1} \sum_{t=1}^{\tilde{T}} \Psi_t^{(i)},  (85)

where \Psi_{t+1}^{(i)} = \left( y_{t+1} - VaR_{t+1|t}^{(p)(i)} \right)^2 if y_{t+1} < VaR_{t+1|t}^{(p)(i)}, and \Psi_{t+1}^{(i)} = 0 if y_{t+1} \geq VaR_{t+1|t}^{(p)(i)}. In the second stage, we include only the models with a p-value of at least 10% in both backtesting measures.

The lowest value of \Psi_{(VaR)}^{(i)} is achieved by the AR(1)APARCH(1,1)-Normal model (the p-value of the SPA test is 0.52). Note that, had we omitted the first stage evaluation, the AR(1)GARCH(1,1)-skewed Student-t model would have been selected, a model that overestimates VaR (it produces fewer violations than expected). In the second stage evaluation, a loss function that is based on the ES measure would be preferable. Therefore, we measure the average squared distance between the actual returns and the ES forecasts when a violation occurred:

\Psi_{(ES)}^{(i)} = 10^3\, \tilde{T}^{-1} \sum_{t=1}^{\tilde{T}} \Psi_t^{(i)},  (86)

where \Psi_{t+1}^{(i)} = \left( y_{t+1} - ES_{t+1|t}^{(p)(i)} \right)^2 if y_{t+1} < VaR_{t+1|t}^{(p)(i)}, and \Psi_{t+1}^{(i)} = 0 if y_{t+1} \geq VaR_{t+1|t}^{(p)(i)}. Under this evaluation framework, the AR(1)GARCH(1,1)-Normal specification has the lowest value of \Psi_{(ES)}^{(i)}. The p-value of the hypothesis that the AR(1)GARCH(1,1)-Normal model has better predictive ability is 0.71. Figures 12 to 16 plot the next trading day's VaR and ES forecasts.

8 Summary

This review of the literature focused on various parametric, non-parametric and semi-parametric methods of estimating and forecasting VaR and ES, and presented, in some detail, the evaluation framework for VaR and the adjustment of the VaR number for liquidity risk. Beyond the univariate setting, multivariate techniques can be applied in order to take into account the correlation between the assets of a portfolio; multivariate ARCH models, for example, capture not only time-varying variances but also time-varying covariances, an important issue for financial institutions' portfolios.

A Bibliography

[1] Aït-Sahalia, Y., Mykland, P.A. and Zhang, L. (2005). How Often to Sample a Continuous-Time Process in the Presence of Market Microstructure Noise. Review of Financial Studies, 18, 351-416.
[2] Akaike, H. (1973). Information Theory and an Extension of the Maximum Likelihood Principle. In B.N. Petrov and F. Csaki (eds.), Proceedings of the Second International Symposium on Information Theory, Budapest, 267-281.
[3] Almgren, R. and Chriss, N. (2000). Optimal Execution of Portfolio Transactions. Journal of Risk, 3, 5-39.
[4] Ané, T. (2006). An analysis of the flexibility of Asymmetric Power GARCH models. Computational Statistics & Data Analysis, forthcoming.
[5] Andersen, T. (2000). Some Reflections on Analysis of High-Frequency Data. Journal of Business and Economic Statistics, 18(2), 146-153.
[6] Andersen, T. and Bollerslev, T. (1997). Intraday Periodicity and Volatility Persistence in Financial Markets. Journal of Empirical Finance, 4, 115-158.
[7] Andersen, T. and Bollerslev, T. (1998a). Answering the Skeptics: Yes, Standard Volatility Models Do Provide Accurate Forecasts. International Economic Review, 39, 885-905.
[8] Andersen, T. and Bollerslev, T. (1998b). DM-Dollar Volatility: Intraday Activity Patterns, Macroeconomic Announcements and Longer-Run Dependencies. Journal of Finance, 53, 219-265.
[9] Andersen, T. and Bollerslev, T. (1998c). ARCH and GARCH Models. In S. Kotz, C.B. Read and D.L. Banks (eds.), Encyclopedia of Statistical Sciences, Vol. II, John Wiley and Sons, New York.
[10] Andersen, T., Bollerslev, T. and Diebold, F.X. (2006). Parametric and Nonparametric Volatility Measurement. In Y. Aït-Sahalia and L.P. Hansen (eds.), Handbook of Financial Econometrics, North Holland, Amsterdam, forthcoming.
[11] Andersen, T., Bollerslev, T. and Lange, S. (1999). Forecasting Financial Market Volatility: Sample Frequency vis-à-vis Forecast Horizon. Journal of Empirical Finance, 6, 457-477.

[12] Andersen, T., Bollerslev, T., Christoffersen, P. and Diebold, F.X. (2005). Volatility and Correlation Forecasting. In G. Elliott, C.W.J. Granger and A. Timmermann (eds.), Handbook of Economic Forecasting, North Holland, Amsterdam.
[13] Andersen, T., Bollerslev, T., Diebold, F.X. and Labys, P. (2001). The Distribution of Exchange Rate Volatility. Journal of the American Statistical Association, 96, 42-55.
[14] Andersen, T., Bollerslev, T., Diebold, F.X. and Labys, P. (2003). Modeling and Forecasting Realized Volatility. Econometrica, 71, 529-626.
[15] Ang, A. and Bekaert, G. (2002). International asset allocation with regime shifts. Review of Financial Studies, 15(4), 1137-1187.
[16] Angelidis, T. and Benos, A. (2006). Liquidity adjusted value-at-risk based on the components of the bid-ask spread. Applied Financial Economics, 16, 835-851.
[17] Angelidis, T. and Benos, A. (2007). Value-at-Risk for Greek Stocks. Multinational Finance Journal, forthcoming.
[18] Angelidis, T. and Degiannakis, S. (2005a). Modeling risk for long and short trading positions. Journal of Risk Finance, 6(3), 226-238.
[19] Angelidis, T. and Degiannakis, S. (2005b). Volatility Forecasting: The Illusion of Choosing One Model in All Cases. Athens University of Economics and Business, Department of Statistics, Technical Report, 218.
[20] Angelidis, T. and Degiannakis, S. (2006). Backtesting VaR Models: An Expected Shortfall Approach. Athens University of Economics and Business, Department of Statistics, Technical Report, 223.
[21] Angelidis, T., Benos, A. and Degiannakis, S. (2004). The Use of GARCH Models in VaR Estimation. Statistical Methodology, 1(2), 105-128.
[22] Angelidis, T., Benos, A. and Degiannakis, S. (2007). A Robust VaR Model under Different Time Periods and Weighting Schemes. Review of Quantitative Finance and Accounting, forthcoming.
[23] Areal, N.M.P.C. and Taylor, S.J. (2002). The Realised Volatility of FTSE-100 Futures Prices. Journal of Futures Markets, 22, 627-648.

[24] Artzner, P., Delbaen, F., Eber, J.-M. and Heath, D. (1997). Thinking Coherently. Risk, 10, 68-71.
[25] Artzner, P., Delbaen, F., Eber, J.-M. and Heath, D. (1999). Coherent Measures of Risk. Mathematical Finance, 9, 203-228.
[26] Assoe, K.G. (1998). Regime-switching in emerging stock market returns. Multinational Finance Journal, 2, 101-132.
[27] Awartani, B.M.A. and Corradi, V. (2005). Predicting the volatility of the S&P-500 stock index via GARCH models: The role of asymmetries. International Journal of Forecasting, 21(1), 167-183.
[28] Baillie, R.T., Bollerslev, T. and Mikkelsen, H.O. (1996). Fractionally integrated generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 74, 3-30.
[29] Bali, T.G. and Theodossiou, P. (2006). A Conditional-SGT-VaR Approach with Alternative GARCH Models. Annals of Operations Research, forthcoming.
[30] Bandi, F.M. and Russell, J.R. (2005). Microstructure Noise, Realized Volatility, and Optimal Sampling. University of Chicago, Technical Report.
[31] Bangia, A., Diebold, F.X., Schuermann, T. and Stroughair, J. (1999). Modeling liquidity risk, with implications for traditional market risk measurement and management. The Wharton Financial Institutions Center, Working Paper 99-06.
[32] Barndorff-Nielsen, O.E. and Shephard, N. (2001). Non-Gaussian Ornstein-Uhlenbeck based Models and Some of their Uses in Financial Economics. Journal of the Royal Statistical Society, Series B, 63, 197-241.
[33] Barndorff-Nielsen, O.E. and Shephard, N. (2002a). Econometric Analysis of Realised Volatility and its Use in Estimating Stochastic Volatility Models. Journal of the Royal Statistical Society, Series B, 64, 253-280.
[34] Barndorff-Nielsen, O.E. and Shephard, N. (2002b). Estimating Quadratic Variation Using Realized Variance. Journal of Applied Econometrics, 17, 457-477.
[35] Barndorff-Nielsen, O.E. and Shephard, N. (2003). Realised Power Variation and Stochastic Volatility Models. Bernoulli, 9, 243-265.

[36] Barndorff-Nielsen, O.E. and Shephard, N. (2004a). Econometric Analysis of Realized Covariation: High Frequency Based Covariance, Regression, and Correlation in Financial Economics. Econometrica, 72, 885-925.
[37] Barndorff-Nielsen, O.E. and Shephard, N. (2004b). Power and Bipower Variation with Stochastic Volatility and Jumps. Journal of Financial Econometrics, 2, 1-37.
[38] Barndorff-Nielsen, O.E. and Shephard, N. (2005). How Accurate is the Asymptotic Approximation to the Distribution of Realised Volatility? In D. Andrews, J. Powell, P. Ruud and J. Stock (eds.), Identification and Inference for Econometric Models, Cambridge University Press.
[39] Barndorff-Nielsen, O.E. and Shephard, N. (2006). Econometrics of Testing for Jumps in Financial Economics using Bipower Variation. Journal of Financial Econometrics, 4(1), 1-30.
[40] Barone-Adesi, G. and Giannopoulos, K. (2001). Non-parametric VaR techniques. Myths and realities. Economic Notes by Banca Monte dei Paschi di Siena SpA, 30, 167-181.
[41] Barone-Adesi, G., Giannopoulos, K. and Vosper, L. (1999). VaR without correlations for nonlinear portfolios. Journal of Futures Markets, 19, 583-602.
[42] Basle Committee on Banking Supervision (1995a). An Internal Model-Based Approach to Market Risk Capital Requirements. Basle Committee on Banking Supervision, Basle, Switzerland.
[43] Basle Committee on Banking Supervision (1995b). Planned Supplement to the Capital Accord to Incorporate Market Risks. Basle Committee on Banking Supervision, Basle, Switzerland.
[44] Baumol, W.J. (1963). An Expected Gain Confidence Limit Criterion for Portfolio Selection. Management Science, 10, 174-182.
[45] Beder, T. (1995). VaR: Seductive but Dangerous. Financial Analysts Journal, 51, 12-24.
[46] Beltratti, A. and Morana, C. (2005). Journal of Risk, 7(4), 21-45.
[47] Bera, A.K. and Higgins, M.L. (1993). ARCH models: Properties, estimation and testing. Journal of Economic Surveys, 7, 305-366.

[48] Berkowitz, J. (2001). Testing density forecasts, with applications to risk management. Journal of Business and Economic Statistics, 19, 465-474.
[49] Bertsimas, D. and Lo, A.W. (1998). Optimal control of execution costs. Journal of Financial Markets, 1, 1-50.
[50] Billio, M. and Pelizzon, L. (2000). Value-at-Risk: A multivariate switching regime approach. Journal of Empirical Finance, 7, 531-554.
[51] Black, F. (1976). Studies of stock market volatility changes. Proceedings of the American Statistical Association, Business and Economic Statistics Section, 177-181.
[52] Bollerslev, T. (1986). Generalized autoregressive conditional heteroscedasticity. Journal of Econometrics, 31, 307-327.
[53] Bollerslev, T. (1987). A conditional heteroscedastic time series model for speculative prices and rates of return. Review of Economics and Statistics, 69, 542-547.
[54] Bollerslev, T. and Mikkelsen, H.O. (1996). Modeling and pricing long-memory in stock market volatility. Journal of Econometrics, 73, 151-184.
[55] Bollerslev, T. and Wright, J.H. (2001). Volatility Forecasting, High-Frequency Data and Frequency Domain Inference. Review of Economics and Statistics, 83, 596-602.
[56] Bollerslev, T., Chou, R. and Kroner, K.F. (1992). ARCH modeling in finance: A review of the theory and empirical evidence. Journal of Econometrics, 52, 5-59.
[57] Bollerslev, T., Engle, R.F. and Nelson, D. (1994). ARCH models. In R.F. Engle and D. McFadden (eds.), Handbook of Econometrics, 4, Elsevier, Amsterdam, 2959-3038.
[58] Boudoukh, J., Richardson, M. and Whitelaw, R. (1998). The Best of Both Worlds. Risk, 11, 64-67.
[59] Brooks, C. and Persand, G. (2003a). The effect of asymmetries on stock index return Value-at-Risk estimates. Journal of Risk Finance, Winter, 29-42.
[60] Brooks, C. and Persand, G. (2003b). Volatility forecasting for risk management. Journal of Forecasting, 22, 1-22.

[61] Brooks, C., Burke, S.P. and Persand, G. (2001). Benchmarks and the Accuracy of GARCH Model Estimation. International Journal of Forecasting, 17, 45-56.
[62] Brooks, R.D., Faff, R.W. and McKenzie, M.D. (2000). A multi-country study of power ARCH models and national stock market returns. Journal of International Money and Finance, 19, 377-397.
[63] Byström, H.N.E. (2004). Managing extreme risks in tranquil and volatile markets using conditional extreme value theory. International Review of Financial Analysis, 13(2), 133-152.
[64] Campbell, J., Lo, A. and MacKinlay, A.C. (1997). The Econometrics of Financial Markets. Princeton University Press, New Jersey.
[65] Christoffersen, P. (1998). Evaluating Interval Forecasts. International Economic Review, 39, 841-862.
[66] Christoffersen, P. (2003). Elements of Financial Risk Management. Academic Press, New York.
[67] Corsi, F. (2004). A Simple Long Memory Model of Realized Volatility. University of Southern Switzerland, Technical Report.
[68] Corsi, F., Kretschmer, U., Mittnik, S. and Pigorsch, C. (2005). The Volatility of Realised Volatility. Center for Financial Studies, Working Paper, 33.
[69] Corsi, F., Zumbach, G., Müller, U.A. and Dacorogna, M. (2001). Consistent High-Precision Volatility from High-Frequency Data. Economic Notes, 30, 183-204.
[70] Cowles, A. and Jones, H. (1937). Some a posteriori probabilities in stock market action. Econometrica, 5, 280-294.
[71] Danielsson, J. and Zigrand, J.-P. (2004). On time-scaling of risk and the square-root-of-time rule. Department of Accounting and Finance and FMG, LSE, Working Paper.
[72] Degiannakis, S. (2004). Volatility forecasting: Evidence from a fractional integrated asymmetric power ARCH skewed-t model. Applied Financial Economics, 14, 1333-1342.

[73] Degiannakis, S. and Xekalaki, E. (2004). Autoregressive conditional heteroskedasticity models: A review. Quality Technology and Quantitative Management, 1(2), 271-324.
[74] Degiannakis, S. and Xekalaki, E. (2006). Assessing the Performance of a Prediction Error Criterion Model Selection Algorithm in the Context of ARCH Models. Applied Financial Economics, forthcoming.
[75] Delbaen, F. (2002). Coherent Risk Measures on General Probability Spaces. In K. Sandmann and P.J. Schönbucher (eds.), Advances in Finance and Stochastics: Essays in Honour of Dieter Sondermann, Springer, 1-38.
[76] Diebold, F.X. and Mariano, R. (1995). Comparing Predictive Accuracy. Journal of Business and Economic Statistics, 13(3), 253-263.
[77] Diebold, F.X., Hickman, A., Inoue, A. and Schuermann, T. (1996). Converting 1-Day Volatility to h-Day Volatility: Scaling by √h is Worse than You Think. University of Pennsylvania, Department of Economics, Working Paper.
[78] Ding, Z., Granger, C.W.J. and Engle, R.F. (1993). A long memory property of stock market returns and a new model. Journal of Empirical Finance, 1, 83-106.
[79] Doornik, J.A. (2001). Ox: Object Oriented Matrix Programming, 3.0. Timberlake Consultants Press, London.
[80] Doornik, J.A. and Ooms, M. (2001). A Package for Estimating, Forecasting and Simulating Arfima Models: Arfima Package 1.01 for Ox. Nuffield College, Oxford, Working Paper.
[81] Dowd, K. (2002). Measuring Market Risk. John Wiley & Sons Ltd., New York.
[82] Dowd, K., Blake, D. and Cairns, A. (2004). Long-term value at risk. Journal of Risk Finance, 5(2), 52-57.
[83] Ebens, H. (1999). Realized Stock Volatility. Johns Hopkins University, Department of Economics, Working Paper, 420.

[85] Engle, R.F. (1982). Autoregressive Conditional heteroscedasticity with estimates of the variance of U.K. inflation. Econometrica, 50, 987-1008. [86] Engle, R. F. (1990). Discussion: Stock Market Volatility and the Crash of ’ 87. Review of Financial Studies, 3, 103-106. [87] Engle, R.F. and Gonz´alez-Rivera, G. (1991). Semiparametric ARCH Models. Journal of Business and Economic Statistics, 9, 345-359. [88] Engle, R. F. and Manganelli, S. (2004). CAViaR: Conditional Autoregressive Value at Risk by Regression Quantiles. Journal of Business & Economic Statistics, 22(4), 367-381. [89] Engle, R.F. and Ng, V.K. (1993). Measuring and Testing the Impact of News on Volatility. Journal of Finance, 48, 1749-1778. [90] Engle, R.F. and Sun, Z. (2005). Forecasting Volatility Using Tick by Tick Data. European Finance Association, 32th Annual Meeting, Moscow. [91] Ervan, L.S., (2000). Incorporating Liquidity Risk in VaR Models. University of Rene, Working Paper. Estimation with Historical Simulation. Review of Derivatives Research, 1, 371-390. [92] Ferson, W.E. (1989). Changes in Expected Security Returns, Risk and the Level of Interest Rates. Journal of Finance, 44, 1191-1218. [93] Gallant, A.R. and Tauchen, G. (1989). Semi Non-Parametric Estimation of Conditional Constrained Heterogeneous Processes: Asset Pricing Applications. Econometrica, 57, 1091-1120. [94] Gallant, A.R., Rossi, P.E. and Tauchen, G. (1993). Nonlinear Dynamic Structures. Econometrica, 61, 871-907. [95] Gen¸cay, R., and Sel¸cuk, F. (2004). Extreme value theory and Value-atRisk: Relative performance in emerging markets. International Journal of Forecasting, 20(2), 287-303. [96] Geweke, J. (1988). Exact Inference in Models with Autoregressive Conditional Heteroskedasticity. In: W.A. Barnett, E.R. Berndt and H. White. (eds.) Dynamic Econometric Modeling, Cambridge University Press, Cambridge. [97] Geweke, J. (1989). Exact Predictive Densities in Linear Models with Arch Distrubances. Journal of Econometrics, 44, 307-325. 47

[98] Giot, P. and Laurent, S. (2003a). Value-at-Risk for long and short trading positions. Journal of Applied Econometrics, 18, 641-664.
[99] Giot, P. and Laurent, S. (2003b). Market risk in commodity markets: a VaR approach. Energy Economics, 25, 435-457.
[100] Giot, P. and Laurent, S. (2004). Modelling Daily Value-at-Risk Using Realized Volatility and ARCH Type Models. Journal of Empirical Finance, 11, 379-398.
[101] Giraitis, L. and Robinson, P.M. (2000). Whittle Estimation of ARCH Models. Econometric Theory, 17, 608-631.
[102] Glosten, L., Jagannathan, R. and Runkle, D. (1993). On the relation between the expected value and the volatility of the nominal excess return on stocks. Journal of Finance, 48, 1779-1801.
[103] Gourieroux, C. (1997). ARCH Models and Financial Applications. Springer-Verlag, New York.
[104] Gray, S.F. (1996). Modeling the conditional distribution of interest rates as a regime switching process. Journal of Financial Economics, 42, 27-62.
[105] Guermat, C. and Harris, R.D.F. (2002). Forecasting value at risk allowing for time variation in the variance and kurtosis of portfolio returns. International Journal of Forecasting, 18, 409-419.
[106] Guidolin, M. and Timmermann, A. (2003). Value at Risk and Expected Shortfall under Regime Switching. University of Virginia and University of California at San Diego, Working Paper.
[107] Haas, M., Mittnik, S. and Paolella, M.S. (2004). A New Approach to Markov Switching GARCH Models. Journal of Financial Econometrics, 2(4), 493-530.
[108] Hall, P. and Yao, Q. (2003). Inference in ARCH and GARCH Models with Heavy-Tailed Errors. Econometrica, 71, 285-317.
[109] Hamilton, J.D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica, 57, 357-384.
[110] Hamilton, J.D. (1994). Time Series Analysis. Princeton University Press, New Jersey.

[111] Hamilton, J.D. and Susmel, R. (1994). Autoregressive conditional heteroskedasticity and changes in regime. Journal of Econometrics, 64, 307-333.
[112] Hansen, P.R. (2005). A Test for Superior Predictive Ability. Journal of Business and Economic Statistics, 23, 365-380.
[113] Hansen, P.R. and Lunde, A. (2005a). A Realized Variance for the Whole Day Based on Intermittent High-Frequency Data. Journal of Financial Econometrics, 3(4), 525-554.
[114] Hansen, P.R. and Lunde, A. (2005b). A Forecast Comparison of Volatility Models: Does Anything Beat a GARCH(1,1)? Journal of Applied Econometrics, 20(7), 873-889.
[115] Harmantzis, C.F., Miao, L. and Chien, Y. (2006). Empirical study of value-at-risk and expected shortfall models with heavy tails. Journal of Risk Finance, 7(2), 117-135.
[116] Harvey, A.C., Ruiz, E. and Sentana, E. (1992). Unobserved Component Time Series Models with ARCH Disturbances. Journal of Econometrics, 52, 129-157.
[117] Ho, L.-C., Burridge, P., Cadle, J. and Theobald, M. (2000). Value-at-Risk: Applying the extreme value approach to Asian markets in the recent financial turmoil. Pacific-Basin Finance Journal, 8, 249-275.
[118] Hoppe, R. (1998). VaR and the Unreal World. Risk, 11, 45-50.
[119] Hoppe, R. (1999). Finance is not Physics. Risk Professional, 1(7).
[120] Huang, Y.C. and Lin, B.-J. (2004). Value-at-Risk Analysis for Taiwan Stock Index Futures: Fat Tails and Conditional Asymmetries in Return Innovations. Review of Quantitative Finance and Accounting, 22, 79-95.
[121] Hull, J. and White, A. (1998). Incorporating volatility updating into the historical simulation method for VaR. Journal of Risk, 1, 5-19.
[122] Jondeau, E. and Rockinger, M. (2003). Testing for differences in the tails of stock-market returns. Journal of Empirical Finance, 10, 559-581.

[124] Koopman, S.J., Jungbacker, B. and Hol, E. (2005). Forecasting Daily Variability of the S&P100 Stock Index Using Historical, Realised and Implied Volatility Measurements. Journal of Empirical Finance, 12, 445-475. [125] Kupiec, P.H. (1995). Techniques for verifying the accuracy of risk measurement models. Journal of Derivatives, 3, 73-84. [126] Lambert, P. and Laurent, S. (2000). Modeling skewness dynamics in series of financial data. Institut de Statistique, Louvain-la-Neuve, Discussion Paper. [127] Lambert, P. and Laurent, S. (2001). Modeling financial time series using garch-type models and a skewed student Density. Unversit´ en de Li´ enge, Mimeo. [128] Laubsch, A. J. (1999). Risk Management: A Practical Guide, RiskMetrics Group. [129] Laurent, S. and Peters, J.-P. (2002). G@RCH 2.2: An Ox Package for Estimating and Forecasting Various ARCH Models. Journal of Economic Surveys, 16, 447-485. [130] Leavens, D. H. (1945). Diversification of investments. Trusts and Estates, 80(5), 469-473. [131] Li, M.-Y. L. and Lin, H.-W. W. (2004). Estimating value-at-risk via Markov switching ARCH models an empirical study on stock index returns. Applied Economics Letters, 11, 679691. [132] Li, W.K., Ling, S. and McAleer, M. (2001). A Survey of Recent Theoretical Results for Time Series Models with GARCH Errors. The Institute of Social and Economic Research, Osaka University, Japan, Discussion Paper, 545. [133] Ljung, G. and Box, G. (1979). On a measure of lack of fit in time series models. Biometrica, 66, 265-270. [134] Lo, A. and MacKinlay, A.C. (1988). Stock market prices do not follow random walks: Evidence from a simple specification test. Review of Financial Studies, 1, 41-66. [135] Lopez, J.A. (1999). Methods for Evaluating Value-at-Risk Estimates. Economic Policy Review, Federal Reserve Bank of New York, 2, 3-17. 50

[136] Müller, U., Dacorogna, M., Davé, R., Pictet, O., Olsen, R. and Ward, J. (1993). Fractals and Intrinsic Time - A Challenge to Econometricians. XXXIXth International AEA Conference on Real Time Econometrics, Luxembourg.
[137] Madhavan, A., Richardson, M. and Roomans, M. (1997). Why do security prices change? A transaction-level analysis of NYSE stocks. Review of Financial Studies, 10, 1035-1064.
[138] Mandelbrot, B. (1963). The variation of certain speculative prices. Journal of Business, 36, 394-419.
[139] Mark, N. (1988). Time Varying Betas and Risk Premia in the Pricing of Forward Foreign Exchange Contracts. Journal of Financial Economics, 22, 335-354.
[140] Markowitz, H.M. (1952). Portfolio Selection. Journal of Finance, 7(1), 77-91.
[141] Marshall, C. and Siegel, M. (1997). Value at Risk: Implementing a Risk Measurement Standard. Journal of Derivatives, 4(3), 91-110.
[142] Martens, M. (2002). Measuring and forecasting S&P 500 index-futures volatility using high-frequency data. Journal of Futures Markets, 22, 497-518.
[143] Mausser, H. and Rosen, D. (2000). Managing risk with expected shortfall. In S. Uryasev (ed.), Probabilistic Constrained Optimization: Methodology and Applications, Kluwer Academic Publishers, Dordrecht, 198-219.
[144] McNeil, A.J. (1997). Estimating the tails of loss severity distributions using extreme value theory. ASTIN Bulletin, 27(1), 117-137.
[145] McNeil, A.J. (1998). Calculating quantile risk measures for financial return series using extreme value theory. Department of Mathematics, ETH E-Collection, Swiss Federal Technical University, Zurich.
[146] McNeil, A.J. (1999). Extreme value theory for risk managers. Internal Modeling CAD II, Risk Books, London, 93-113.
[147] McNeil, A.J. and Frey, R. (2000). Estimation of tail-related risk measures for heteroscedastic financial time series: An extreme value approach. Journal of Empirical Finance, 7, 271-300.

[148] Merton, R.C. (1980). On Estimating the Expected Return on the Market: An Exploratory Investigation. Journal of Financial Economics, 8, 323-361.
[149] Mood, A. (1940). The distribution theory of runs. Annals of Mathematical Statistics, 11, 367-392.
[150] Nelson, D. (1991). Conditional heteroskedasticity in asset returns: A new approach. Econometrica, 59, 347-370.
[151] Newey, W. and West, K. (1987). A simple positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica, 55, 703-708.
[152] Oomen, R. (2001). Using High Frequency Stock Market Index Data to Calculate, Model and Forecast Realized Volatility. European University Institute, Department of Economics, Manuscript.
[153] Pagan, A.R. and Schwert, G.W. (1990). Alternative Models for Conditional Stock Volatility. Journal of Econometrics, 45, 267-290.
[154] Palm, F. (1996). GARCH Models of Volatility. In G. Maddala and C. Rao (eds.), Handbook of Statistics, Elsevier, Amsterdam, 209-240.
[155] Politis, D.N. and Romano, J.P. (1994). The Stationary Bootstrap. Journal of the American Statistical Association, 89, 1303-1313.
[156] Poon, S.-H. and Granger, C. (2001). Forecasting financial market volatility: a review. Department of Economics, University of California, San Diego, Manuscript.
[157] Rich, R.W., Raymond, J. and Butler, J.S. (1991). Generalized Instrumental Variables Estimation of Autoregressive Conditional Heteroskedastic Models. Economics Letters, 35, 179-185.
[158] Roy, A.D. (1952). Safety first and the holding of assets. Econometrica, 20(3), 431-449.
[159] Sarma, M., Thomas, S. and Shah, A. (2003). Selection of VaR models. Journal of Forecasting, 22(4), 337-358.
[160] Scholes, M. and Williams, J. (1977). Estimating betas from nonsynchronous data. Journal of Financial Economics, 5, 309-328.

[161] Schwarz, G. (1978). Estimating the Dimension of a Model. Annals of Statistics, 6, 461-464.
[162] Schwert, G.W. (1989). Why Does Stock Market Volatility Change Over Time? Journal of Finance, 44, 1115-1153.
[163] Seymour, A.J. and Polakow, D.A. (2003). A coupling of extreme-value theory and volatility updating with Value-at-Risk estimation in emerging markets: A South African test. Multinational Finance Journal, 7, 3-23.
[164] So, M.K.P. and Yu, P.L.H. (2006). Empirical analysis of GARCH models in Value at Risk estimation. Journal of International Financial Markets, Institutions and Money, 16(2), 180-197.
[165] Taleb, N. (1997a). The World According to Nassim Taleb. Derivatives Strategy, December/January.
[166] Taleb, N. (1997b). Against VaR. Derivatives Strategy, April.
[167] Taylor, S. (1986). Modeling Financial Time Series. John Wiley & Sons, New York.
[168] Taylor, S. and Xu, X. (1997). The Incremental Volatility Information in One Million Foreign Exchange Quotations. Journal of Empirical Finance, 4, 317-340.
[169] Thomakos, D.D. and Wang, T. (2003). Realized Volatility in the Futures Markets. Journal of Empirical Finance, 10, 321-353.
[170] Tse, Y.K. (1998). The Conditional Heteroskedasticity of the Yen-Dollar Exchange Rate. Journal of Applied Econometrics, 13(1), 49-55.
[171] Venkataraman, S. (1996). Value at Risk for a mixture of normal distributions: The use of quasi-Bayesian estimation techniques. Economic Perspectives, Federal Reserve Bank of Chicago, 2-13.
[172] Vilasuso, J. (2002). Forecasting exchange rate volatility. Economics Letters, 76, 59-64.
[173] White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica, 48, 817-838.
[174] Wilson, T. (1993). Infinite wisdom. Risk, 6(6), 37-45.

[175] Xekalaki, E. and Degiannakis, S. (2005). Evaluating Volatility Forecasts in Option Pricing in the Context of a Simulated Options Market. Computational Statistics and Data Analysis, 49, 611-629.
[176] Yamai, Y. and Yoshiba, T. (2005). Value-at-Risk versus Expected Shortfall: A Practical Perspective. Journal of Banking and Finance, 29(4), 997-1015.
[177] Zakoïan, J.-M. (1994). Threshold Heteroskedastic Models. Journal of Economic Dynamics and Control, 18, 931-955.
[178] Zangari, P. (1996). An improved methodology for measuring VAR. RiskMetrics Monitor, Reuters/JP Morgan.
[179] Zhang, L., Mykland, P.A. and Aït-Sahalia, Y. (2005). A Tale of Two Time Scales: Determining Integrated Volatility With Noisy High-Frequency Data. Journal of the American Statistical Association, 100, 1394-1411.


B    Tables and Figures

Number of Exceptions    Multiplier
4 or fewer              3.00
5                       3.40
6                       3.50
7                       3.65
8                       3.75
9                       3.85
10 or more              4.00

Table 1: Values of the supervisory-determined multiplicative factor (k) in order to determine the Market Required Capital.
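To see how Table 1 feeds into the capital calculation, the following Python sketch (not part of the original chapter) maps a backtesting exception count to the multiplier k and applies the standard Basel rule, under which the Market Required Capital is the larger of the previous day's VaR and k times the trailing 60-day average VaR; function and variable names are illustrative only.

```python
def basel_multiplier(n_exceptions: int) -> float:
    """Map the number of VaR exceptions over 250 days to the multiplier k (Table 1)."""
    if n_exceptions <= 4:
        return 3.00
    if n_exceptions >= 10:
        return 4.00
    return {5: 3.40, 6: 3.50, 7: 3.65, 8: 3.75, 9: 3.85}[n_exceptions]

def market_required_capital(var_yesterday, var_last_60, n_exceptions):
    """MRC_t = max(VaR_{t-1}, k * average of the last 60 daily VaR figures),
    with VaR expressed as a positive amount."""
    k = basel_multiplier(n_exceptions)
    return max(var_yesterday, k * sum(var_last_60) / len(var_last_60))
```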

                        Evaluation sample size
Confidence level    250            500            750            1000
5%                  7 ≤ N ≤ 19     17 ≤ N ≤ 35    27 ≤ N ≤ 49    38 ≤ N ≤ 64
1%                  1 ≤ N ≤ 6      2 ≤ N ≤ 9      3 ≤ N ≤ 13     5 ≤ N ≤ 16
0.5%                0 ≤ N ≤ 4      1 ≤ N ≤ 6      1 ≤ N ≤ 8      2 ≤ N ≤ 9
0.1%                0 ≤ N ≤ 1      0 ≤ N ≤ 2      0 ≤ N ≤ 3      0 ≤ N ≤ 3
0.01%               0 ≤ N ≤ 0      0 ≤ N ≤ 0      0 ≤ N ≤ 1      0 ≤ N ≤ 1

Table 2: Unconditional coverage "no rejection" regions for a 95% significance level.
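The regions in Table 2 can be recovered numerically: an exception count N is not rejected when Kupiec's LR_uc statistic stays below the 95% chi-square(1) critical value of 3.841. The sketch below is an illustration of this logic, assuming numpy and scipy are available.

```python
import numpy as np
from scipy.stats import chi2

def lr_uc(N, T, p):
    """Kupiec's unconditional coverage likelihood-ratio statistic."""
    if N == 0:
        return -2 * T * np.log(1 - p)
    if N == T:
        return -2 * T * np.log(p)
    pi = N / T
    log_h0 = (T - N) * np.log(1 - p) + N * np.log(p)
    log_h1 = (T - N) * np.log(1 - pi) + N * np.log(pi)
    return -2 * (log_h0 - log_h1)

def no_rejection_region(T, p, alpha=0.05):
    """Smallest and largest N for which LR_uc is below the chi-square(1) cutoff."""
    crit = chi2.ppf(1 - alpha, df=1)  # 3.841 for alpha = 5%
    region = [N for N in range(T + 1) if lr_uc(N, T, p) < crit]
    return min(region), max(region)

print(no_rejection_region(250, 0.05))  # recovers (7, 19), the first cell of Table 2
```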


Model                                   Ψ^(i)_(RV)
AR(1)GARCH(1,1)-Normal                  38.92
AR(1)GARCH(1,1)-skewed Student-t        41.88
AR(1)APARCH(1,1)-Normal                 28.56
AR(1)APARCH(1,1)-skewed Student-t       31.76
ARFIMAX                                 15.68

Table 3: The predictive mean squared loss function, Ψ^(i)_(RV), for i = 1, ..., 5 models. Ψ^(i)_(RV) is the average squared distance between the annualized predicted standard deviation of model i, √252 σ^(i)_(t+1|t), and the annualized realized volatility, √252 σ^(RV)_(t+1).
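A minimal sketch of the loss function defined in the caption, with illustrative array names and assuming numpy:

```python
import numpy as np

def psi_rv(sigma_forecast, sigma_realized):
    """Average squared distance between annualized one-step-ahead standard
    deviation forecasts and annualized realized volatility."""
    ann = np.sqrt(252.0)
    return np.mean((ann * np.asarray(sigma_forecast)
                    - ann * np.asarray(sigma_realized)) ** 2)
```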

Model                                   N/T̃      LRuc    LRcc    Ψ^(i)_(VaR)   Ψ^(i)_(ES)
AR(1)GARCH(1,1)-Normal                  5.01%    0.99    0.51    22.13         10.36
AR(1)GARCH(1,1)-skewed Student-t        3.01%    0.03    0.33    7.90          11.34
AR(1)APARCH(1,1)-Normal                 4.61%    0.68    0.13    21.48         12.35
AR(1)APARCH(1,1)-skewed Student-t       2.20%    0.001   0.48    10.46         5.75
ARFIMAX                                 7.21%    0.03    0.02    35.25         19.08

Table 4: Percentage of violations (N/T̃), p-values of Kupiec's (unconditional coverage) and Christoffersen's (conditional coverage) tests, and the predictive mean squared loss functions Ψ^(i)_(VaR) and Ψ^(i)_(ES).
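The backtest columns of Table 4 can be reproduced from a 0/1 violation sequence (1 when the day's return falls below the VaR forecast). The sketch below, with illustrative names and assuming numpy and scipy, computes N/T̃ and the Kupiec and Christoffersen p-values; LR_cc is the sum of LR_uc and the first-order independence statistic LR_ind, compared with a chi-square(2) distribution. It assumes the hit sequence contains at least one violation and one non-violation.

```python
import numpy as np
from scipy.stats import chi2
from scipy.special import xlogy  # x*log(y), defined as 0 when x == 0

def backtest(hits, p=0.05):
    """Kupiec and Christoffersen p-values from a 0/1 violation sequence."""
    hits = np.asarray(hits, dtype=int)
    T, N = hits.size, int(hits.sum())
    pi = N / T
    # Kupiec's unconditional coverage statistic, chi-square(1) under H0
    lr_uc = -2 * (xlogy(T - N, (1 - p) / (1 - pi)) + xlogy(N, p / pi))
    # transition counts for Christoffersen's first-order independence test
    n00 = int(np.sum((hits[:-1] == 0) & (hits[1:] == 0)))
    n01 = int(np.sum((hits[:-1] == 0) & (hits[1:] == 1)))
    n10 = int(np.sum((hits[:-1] == 1) & (hits[1:] == 0)))
    n11 = int(np.sum((hits[:-1] == 1) & (hits[1:] == 1)))
    pi01, pi11 = n01 / (n00 + n01), n11 / max(n10 + n11, 1)
    pi1 = (n01 + n11) / (n00 + n01 + n10 + n11)
    lr_ind = -2 * (xlogy(n00 + n10, 1 - pi1) + xlogy(n01 + n11, pi1)
                   - xlogy(n00, 1 - pi01) - xlogy(n01, pi01)
                   - xlogy(n10, 1 - pi11) - xlogy(n11, pi11))
    lr_cc = lr_uc + lr_ind  # conditional coverage, chi-square(2) under H0
    return {"N/T": pi, "p_uc": 1 - chi2.cdf(lr_uc, 1),
            "p_cc": 1 - chi2.cdf(lr_cc, 2)}
```

For example, backtest((returns < var_forecasts).astype(int), p=0.05) returns the violation rate and both p-values; the Ψ^(i)_(VaR) and Ψ^(i)_(ES) columns follow the same pattern as the Ψ^(i)_(RV) sketch above.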


[Figure: density of the standard normal distribution, with the 5% left-tail quantile marked at VaR = -1.645 and the tail probability p = Pr(y ≤ VaR) = 5% shaded.]

Figure 1: For p = Pr(y_t ≤ VaR_t^(p)) = 5%, VaR_t^(p) = -1.645, under the assumption that y_t ∼ N(0,1).

[Figure: density of the standard normal distribution, with VaR = -1.645 and ES = -2.061 marked in the left tail.]

Figure 2: For y_t ∼ N(0,1) and p = Pr(y_t ≤ VaR_t^(p)) = 5%, the Value-at-Risk, VaR_t^(p), and the Expected Shortfall, ES_t^(p).
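The numbers in Figures 1 and 2 follow directly from the standard normal distribution: the 5% quantile gives the VaR, and the tail expectation -φ(Φ^(-1)(p))/p gives the ES. A short check, assuming scipy:

```python
from scipy.stats import norm

p = 0.05
var_p = norm.ppf(p)                # -1.645, the VaR in Figures 1 and 2
es_p = -norm.pdf(norm.ppf(p)) / p  # about -2.06, the ES in Figure 2
print(var_p, es_p)
```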

[Figure: return density with the VaR threshold decomposed into a Market Risk component and a Liquidity Risk component.]

Figure 3: Liquidity Adjusted Value-at-Risk.


[Figure: line chart of S&P500 closing prices.]

Figure 4: S&P500 daily prices from January 1990 to December 2003.

[Figure: line chart of S&P500 daily log-returns.]

Figure 5: S&P500 daily returns from January 1990 to December 2003.


[Figure: S&P500 annualized realized standard deviation (left axis) plotted against S&P500 closing prices (right axis).]

Figure 6: S&P500 annualized realized volatility from January 1997 to December 2003.

[Figure: annualized conditional standard deviation forecasts plotted against annualized realized volatility.]

Figure 7: The one-day-ahead annualized standard deviation forecasts of the AR(1)GARCH(1,1)-Normal model and the realized volatility from January 2002 to December 2003.

[Figure: annualized conditional standard deviation forecasts plotted against annualized realized volatility.]

Figure 8: The one-day-ahead annualized standard deviation forecasts of the AR(1)GARCH(1,1)-skewed Student-t model and the realized volatility from January 2002 to December 2003.


[Figure: annualized conditional standard deviation forecasts plotted against annualized realized volatility.]

Figure 9: The one-day-ahead annualized standard deviation forecasts of the AR(1)APARCH(1,1)-Normal model and the realized volatility from January 2002 to December 2003.


[Figure: annualized conditional standard deviation forecasts plotted against annualized realized volatility.]

Figure 10: The one-day-ahead annualized standard deviation forecasts of the AR(1)APARCH(1,1)-skewed Student-t model and the realized volatility from January 2002 to December 2003.


[Figure: annualized standard deviation forecasts plotted against annualized realized volatility.]

Figure 11: The one-day-ahead annualized standard deviation forecasts of the ARFIMAX model and the realized volatility from January 2002 to December 2003.


[Figure: one-day-ahead VaR and ES forecasts plotted with the S&P500 log-returns.]

Figure 12: The one-day-ahead VaR and ES forecasts of the AR(1)GARCH(1,1)-Normal model and the S&P500 log-returns from January 2002 to December 2003.


[Figure: one-day-ahead VaR and ES forecasts plotted with the S&P500 log-returns.]

Figure 13: The one-day-ahead VaR and ES forecasts of the AR(1)GARCH(1,1)-skewed Student-t model and the S&P500 log-returns from January 2002 to December 2003.


[Figure: one-day-ahead VaR and ES forecasts plotted with the S&P500 log-returns.]

Figure 14: The one-day-ahead VaR and ES forecasts of the AR(1)APARCH(1,1)-Normal model and the S&P500 log-returns from January 2002 to December 2003.


[Figure: one-day-ahead VaR and ES forecasts plotted with the S&P500 log-returns.]

Figure 15: The one-day-ahead VaR and ES forecasts of the AR(1)APARCH(1,1)-skewed Student-t model and the S&P500 log-returns from January 2002 to December 2003.

[Figure: one-day-ahead VaR and ES forecasts plotted with the S&P500 log-returns.]

Figure 16: The one-day-ahead VaR and ES forecasts of the ARFIMAX model and the S&P500 log-returns from January 2002 to December 2003.
