Estimating GARCH models using Recursive Methods

Jon Lee Kierkegaard, Jan Nygaard Nielsen, Lars Jensen, Henrik Madsen∗

Department of Mathematical Modelling, Richard Petersens Plads, Building 321, Technical University of Denmark, 2800 Lyngby

Abstract: Neglecting the non-constant variance, if any, of the residuals of a linear regression model may lead to an arbitrarily large loss of asymptotic efficiency and, subsequently, low power of statistical tests, i.e. model diagnostic tests. A number of models of conditional variance of the Generalized AutoRegressive Conditional Heteroscedasticity (GARCH) type have been proposed in the literature, where likelihood methodology is used for parameter estimation. It is, however, well known that the associated likelihoods are difficult to maximize numerically. In this paper, the Recursive Prediction Error Method (RPEM) and the Recursive Pseudo-Linear Regression (RPLR) method are used for parameter estimation in GARCH models. Monte Carlo studies are provided.

Keywords: Non-Gaussian processes, likelihood function, parameter estimation, recursive estimation methods.

1 Introduction

Modelling heteroskedasticity, i.e. time- or state-dependent conditional variance, in linear regression models is an important problem to which a number of solutions have been proposed in the time series literature, ranging from ad-hoc Box-Cox transformations over generalized least squares methods to the Generalized AutoRegressive Conditional Heteroscedastic (GARCH) model.

Taking heteroskedasticity explicitly into account has a number of advantages. First, the loss in asymptotic efficiency from neglected heteroskedasticity may be arbitrarily large (Bollerslev, Engle and Nelson, 1994), which may lead to lower power of statistical tests. Second, a correct specification of the heteroskedasticity provides unbiased estimates of the variance. Furthermore, the variance of k-step ahead predictions will be more dynamic: it will depend on the level of the process, earlier values of the conditional variance, input variables, if any, etc., depending upon the model specification. Consequently, confidence and prediction intervals will take on more complicated shapes (as functions of the prediction horizon k) than those known for linear time series models.

It follows that heteroskedasticity is an important issue that needs to be addressed, either by identifying its structure or by using statistical methods that are robust in the presence of unknown forms of heteroskedasticity, see e.g. (Newey and West, 1987; Andrews, 1991). A correct specification of the (conditional) variance structure has clear implications for applications in statistical process control, environmental protection, and risk management in the energy and finance sectors, in particular in terms of predictive accuracy.

∗Corresponding author: Tel. 4525 3408, fax 4588 1397, e-mail [email protected].


Mostly in the financial econometrics literature, a plethora of Generalized Autoregressive Conditional Heteroskedasticity (GARCH) models has been proposed following the seminal paper by (Engle, 1982), but these models are gaining ground in other areas of applied science; see (Bollerslev, Chou and Kroner, 1992; Shephard, 1996) for more recent reviews and the selected readings in (Engle, 1995). It is an open question whether environmental, ecological and technical data exhibit the particular kind of conditional heteroskedasticity implied by GARCH models, i.e. that the distribution of the residuals has heavier tails than the Gaussian distribution and that the squared residuals may exhibit autocorrelation. See (Baillie, 1996) for some evidence of long memory behaviour of the first moment of geophysical and macroeconomic time series. The most widely used GARCH models have been incorporated in a random walk model with measurement noise in a stochastic state space framework by (Harvey, Ruiz and Sentana, 1992), such that the Kalman filter and the Prediction Error Decomposition due to (Schweppe, 1965) may be applied.

It is easy to derive the likelihood function for GARCH models, but Jacquier, Polson and Rossi (1994) state that: "It is well known that many ARCH/GARCH likelihoods are difficult to maximize. Even with the standard sets used in this article, there is evidence of multiple local maxima." In practical terms, this implies that the maximization procedure should be initiated very close to the global maximum in the parameter space in order to obtain reasonably stable estimates. The approach taken in this paper is the application of recursive methods to the squared residuals in order to obtain a more robust estimation method.

In some applications it is convenient to update the parameter estimates as new observations become available by making adjustments to the previous parameter estimate, as opposed to re-estimating the parameters by optimizing the entire (log)likelihood function; this is particularly so in online monitoring and control applications. This adaptivity allows the parameter estimates to vary through time while maintaining the overall model structure. For GARCH models we will show that these recursive methods lead to more robust parameter estimates than those obtained by maximum likelihood estimation: as new observations become available, the (log)likelihood function may move from one local optimum to another, while the recursive methods provide more stable parameter estimates.

The remainder of the paper is organized as follows: Section 2 introduces GARCH effects in linear regression models. Section 3 considers the recursive methods in the context of the previous section. Section 4 presents empirical studies of the proposed methodology. Finally, Section 5 provides conclusions and suggestions for further research.

2 GARCH models

Consider the linear regression model

$$\tilde{Y}_t = \tilde{X}_t^\top \tilde{\theta} + \varepsilon_t, \qquad (1)$$

where $\tilde{Y}_t$, $t = 1, \ldots, T$, denotes the observations, $\tilde{X}_t$ is a p-dimensional column vector, $\tilde{\theta}$ is a p-dimensional parameter vector and $\varepsilon_t$ is a zero-mean measurement noise process. Initially, consider a GARCH(1,1) model for the measurement noise $\varepsilon_t = \sigma_t e_t$ in (1), i.e.

$$\sigma_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2, \qquad (2)$$

where $e_t$ is a zero-mean Gaussian white noise sequence with variance 1, and the constant parameters should satisfy the constraints $\alpha_0 > 0$, $\alpha_1 \geq 0$, $\beta_1 \geq 0$ and $\alpha_1 + \beta_1 < 1$ in order to ensure wide-sense stationarity, see (Bollerslev, 1986). The simpler ARCH(1) model due to (Engle, 1982) is readily obtained for $\beta_1 \equiv 0$. Imposing the parametric constraint $\alpha_1 + \beta_1 = 1$ yields the Integrated GARCH(1,1) (IGARCH) model proposed by (Engle and Bollerslev, 1986). Eq. (2) may also be written as

$$\varepsilon_t^2 = \alpha_0 + (\alpha_1 + \beta_1)\varepsilon_{t-1}^2 - \beta_1 v_{t-1} + v_t, \qquad (3)$$

where $v_t = \varepsilon_t^2 - \sigma_t^2 = (e_t^2 - 1)\sigma_t^2$, which shows that a GARCH(1,1) model is essentially an ARMA(1,1) model for the squared residuals. Introducing the unconditional variance $E[\varepsilon_t^2] = \sigma^2 = \alpha_0(1 - \alpha_1 - \beta_1)^{-1}$, the k-step ahead predictions are given by

$$\hat{\sigma}_{t+k|t}^2 = \sigma^2 + (\alpha_1 + \beta_1)^k (\sigma_t^2 - \sigma^2). \qquad (4)$$

Thus the forecast function (as a function of k) exhibits a geometric decay towards the unconditional variance $\sigma^2$ with a half-life of $-\ln(2)/\ln(\alpha_1 + \beta_1)$. For $\alpha_0 \equiv 0$, Eq. (2) is an Exponentially Weighted Moving Average (EWMA) model of the conditional variance. If $3\alpha_1^2 + 2\alpha_1\beta_1 + \beta_1^2 < 1$, the fourth order moment exists and is given by

$$E[\varepsilon_t^4] = \frac{3\alpha_0^2(1 + \alpha_1 + \beta_1)}{(1 - \alpha_1 - \beta_1)(1 - \beta_1^2 - 2\alpha_1\beta_1 - 3\alpha_1^2)}, \qquad (5)$$

such that the excess kurtosis is

$$\kappa = \frac{E[\varepsilon_t^4] - 3E[\varepsilon_t^2]^2}{E[\varepsilon_t^2]^2} = \frac{6\alpha_1^2}{1 - \beta_1^2 - 2\alpha_1\beta_1 - 3\alpha_1^2}, \qquad (6)$$

which is greater than zero by assumption. Hence a GARCH(1,1) process has heavy tails (even for $\beta_1 = 0$). The ACF of $\varepsilon_t^2$ is given by, see e.g. (Bollerslev, 1988),

$$\rho_k = \mathrm{corr}(\varepsilon_t^2, \varepsilon_{t-k}^2) = \frac{\alpha_1(1 - \alpha_1\beta_1 - \beta_1^2)}{1 - 2\alpha_1\beta_1 - \beta_1^2}\,(\alpha_1 + \beta_1)^{k-1}, \quad k > 0. \qquad (7)$$

Thus the ACF decreases geometrically as a function of k (this also holds for various GARCH(1,1) models, see (Ding and Granger, 1996)). The asymptotic properties of the empirical ACF for GARCH models are more complex than for ARMA models, so the ACF cannot readily be used for model identification purposes, see e.g. (Davis and Mikosch, 1998). The general GARCH(p,q) model is given by

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2, \qquad (8)$$

where $p \geq 0$, $q > 0$, $\alpha_0 > 0$, $\alpha_i \geq 0$, $i = 1, \ldots, q$, and $\beta_j \geq 0$, $j = 1, \ldots, p$. The parameters should satisfy $\sum_{i=1}^{q} \alpha_i + \sum_{j=1}^{p} \beta_j < 1$ to ensure wide-sense stationarity. The unconditional variance is given by $\alpha_0 \big[1 - \sum_{i=1}^{q} \alpha_i - \sum_{j=1}^{p} \beta_j\big]^{-1}$. The general ARCH(q) model is just a GARCH(0,q) model.
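As a concrete illustration of Eqs. (2) and (4), the following minimal Python sketch simulates a GARCH(1,1) path and computes the k-step variance forecast. The parameter values are illustrative only and are chosen to satisfy the stationarity constraint $\alpha_1 + \beta_1 < 1$.

```python
import numpy as np

def simulate_garch11(alpha0, alpha1, beta1, T, seed=0):
    """Simulate eps_t = sigma_t * e_t with sigma_t^2 given by Eq. (2)."""
    rng = np.random.default_rng(seed)
    sigma2 = np.empty(T)
    eps = np.empty(T)
    sigma2[0] = alpha0 / (1.0 - alpha1 - beta1)   # start at the unconditional variance
    eps[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
    for t in range(1, T):
        sigma2[t] = alpha0 + alpha1 * eps[t - 1] ** 2 + beta1 * sigma2[t - 1]
        eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
    return eps, sigma2

def variance_forecast(alpha0, alpha1, beta1, sigma2_t, k):
    """k-step ahead conditional variance, Eq. (4)."""
    uncond = alpha0 / (1.0 - alpha1 - beta1)
    return uncond + (alpha1 + beta1) ** k * (sigma2_t - uncond)

eps, sigma2 = simulate_garch11(1e-5, 0.1, 0.85, T=1000)
print(variance_forecast(1e-5, 0.1, 0.85, sigma2[-1], k=10))
```

As k grows, the forecast decays geometrically towards the unconditional variance, in accordance with the half-life stated below Eq. (4).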

3 Recursive Estimation Methods

The recursive estimation methods used in this article are derived from the families of non-recursive estimation methods known as Prediction Error Methods (PEM) and Pseudolinear Regressions (PLR). The corresponding recursive versions are denoted RPEM and RPLR, respectively. Among well-known members of the two families are Recursive Maximum Likelihood for ARMAX models (a member of RPEM), Extended Least Squares for ARMAX models (a member of RPLR) and Recursive Least Squares for ARX models (a member of both families). See e.g. (Ljung, 1987) or (Ljung and Söderström, 1983).

3.1 Nonrecursive versions

Consider the pseudolinear model

$$\hat{Y}_{t|t-1} = X_t^\top(\theta)\,\theta, \qquad (9)$$

where the regressor $X_t$ is allowed to depend on the parameter vector $\theta$, and the prediction error $\epsilon_t$ is

$$\epsilon_t = Y_t - \hat{Y}_{t|t-1}. \qquad (10)$$

The model (9) may be static or dynamic, i.e. the regressor $X_t$ may contain lagged values of $Y$ or $\epsilon$.

3.1.1 Prediction errors

Intuitively, we are interested in estimation methods that generate prediction errors that are as small as possible in some sense. We would also like our methods to extract as much information from the observations as possible; that is, ideally the prediction errors should be uncorrelated with past observations. As these two criteria do not generally coincide, one has to prefer one over the other. This preference leads to either the PEM methods or the PLR methods.

A PEM seeks to minimize the sum of the squared prediction errors, i.e.

$$V_T(\theta) = \frac{1}{T}\sum_{t=1}^{T} \frac{1}{2}\,\epsilon_t^2(\theta) \qquad (11)$$

is minimized with respect to $\theta$; see (Ljung, 1987, Chapter 7).

A PLR method seeks to compute its estimates such that the prediction errors are uncorrelated with past data. As this is impossible to assure in practical applications, one may confine oneself to demanding that the prediction errors be uncorrelated with the regressors $X_t$. This makes sense here, as $X_t(\theta)$ is a function of relevant past observations and $\theta$. Formally, a PLR estimate is obtained as the solution to

$$\frac{1}{T}\sum_{t=1}^{T} X_t(\theta)\,\epsilon_t(\theta) = 0. \qquad (12)$$

3.2 Prediction Error Methods (PEM)

As the criterion (11) cannot be minimized analytically, a numerical procedure is called for. $V_T$ is minimized by solving

$$\frac{\partial V_T}{\partial \theta} = 0, \qquad (13)$$

a task suitable for the Newton-Raphson procedure. The first derivative with respect to $\theta$ is

$$\frac{\partial V_T}{\partial \theta} = \frac{1}{T}\sum_{t=1}^{T} \epsilon_t(\theta)\,\frac{\partial \epsilon_t(\theta)}{\partial \theta}. \qquad (14)$$

By defining $\psi$ by

$$\psi_t(\theta) = -\frac{\partial \epsilon_t(\theta)}{\partial \theta} = \frac{\partial \hat{Y}_{t|t-1}}{\partial \theta}, \qquad (15)$$

Eq. (14) can be rewritten as

$$\frac{\partial V_T}{\partial \theta} = -\frac{1}{T}\sum_{t=1}^{T} \epsilon_t(\theta)\,\psi_t(\theta). \qquad (16)$$

The second derivative (the Hessian) is

$$\frac{\partial^2 V_T}{\partial \theta^2} = \frac{1}{T}\sum_{t=1}^{T} \psi_t(\theta)\,\psi_t^\top(\theta) - \frac{1}{T}\sum_{t=1}^{T} \epsilon_t(\theta)\,\frac{\partial \psi_t(\theta)}{\partial \theta}. \qquad (17)$$

It may be quite difficult to compute the second derivative of $\epsilon$ with respect to $\theta$, as needed in the last term of the Hessian. This is, however, not necessary, as the value of the last term is close to zero when $\theta$ is close to the true parameters, and an accurate value of the Hessian is only needed in the proximity of the minimum, see (Ljung, 1987, Chapter 10). Thus the last term of (17) may be discarded, and the parameter estimate can be computed iteratively using the quasi-Newton-Raphson algorithm

$$\hat\theta_i = \hat\theta_{i-1} - \left(\frac{\partial^2 V_T}{\partial \theta^2}\bigg|_{\hat\theta_{i-1}}\right)^{-1} \frac{\partial V_T}{\partial \theta}\bigg|_{\hat\theta_{i-1}} \qquad (18)$$

$$\approx \hat\theta_{i-1} + \left(\sum_{t=1}^{T} \psi_t(\hat\theta_{i-1})\,\psi_t^\top(\hat\theta_{i-1})\right)^{-1} \sum_{t=1}^{T} \epsilon_t(\hat\theta_{i-1})\,\psi_t(\hat\theta_{i-1}), \qquad (19)$$

where i denotes the iteration number.

3.2.1 Recursive Prediction Error Methods (RPEM)

In the following, a recursive version of PEM is derived, i.e. an estimation method that can be used to compute a new parameter estimate each time a new observation arrives, without a laborious recalculation of the summations in (19). To obtain such a version, consider again Eq. (19). Computing $\hat\theta_T$ from $\hat\theta_{T-1}$ using a single iteration yields

$$\hat\theta_T = \hat\theta_{T-1} + \left(\sum_{t=1}^{T} \psi_t(\hat\theta_{T-1})\,\psi_t^\top(\hat\theta_{T-1})\right)^{-1} \sum_{t=1}^{T} \epsilon_t(\hat\theta_{T-1})\,\psi_t(\hat\theta_{T-1}). \qquad (20)$$

In doing so, it is assumed that only one iteration is needed for $\hat\theta_T$ to solve (13). Nothing precludes us from using more than one iteration per observation, but the improvement in the accuracy of the estimate has been shown to be insignificant in practical applications.


The last summation in (20) may be written as

$$\sum_{t=1}^{T} \epsilon_t(\hat\theta_{T-1})\,\psi_t(\hat\theta_{T-1}) = \epsilon_T(\hat\theta_{T-1})\,\psi_T(\hat\theta_{T-1}) + \sum_{t=1}^{T-1} \epsilon_t(\hat\theta_{T-1})\,\psi_t(\hat\theta_{T-1}) \qquad (21)$$

$$= \epsilon_T(\hat\theta_{T-1})\,\psi_T(\hat\theta_{T-1}) - (T-1)\,\frac{\partial V_{T-1}}{\partial \theta}\bigg|_{\hat\theta_{T-1}}, \qquad (22)$$

where (16) is used. Assuming that $\hat\theta_{T-1}$ minimizes $V_{T-1}$, the last term is zero. The first summation in (20) can be written as

$$\sum_{t=1}^{T} \psi_t(\hat\theta_{T-1})\,\psi_t^\top(\hat\theta_{T-1}) = \psi_T(\hat\theta_{T-1})\,\psi_T^\top(\hat\theta_{T-1}) + \sum_{t=1}^{T-1} \psi_t(\hat\theta_{T-1})\,\psi_t^\top(\hat\theta_{T-1}), \qquad (23)$$

but this does not reduce the computations, as both terms have to be recalculated each time $\hat\theta$ is updated. However, if we decline to update the last summation in (23) and introduce

$$R_T(\hat\theta_{T-1}) = \sum_{t=1}^{T} \psi_t(\hat\theta_{T-1})\,\psi_t^\top(\hat\theta_{T-1}), \qquad (24)$$

Eq. (20) can be rewritten as the RPEM updating scheme:

$$\epsilon_T(\hat\theta_{T-1}) = Y_T - X_T^\top(\hat\theta_{T-1})\,\hat\theta_{T-1}, \qquad (25a)$$
$$\hat\theta_T = \hat\theta_{T-1} + R_T^{-1}(\hat\theta_{T-1})\,\epsilon_T(\hat\theta_{T-1})\,\psi_T(\hat\theta_{T-1}), \qquad (25b)$$
$$R_T(\hat\theta_{T-1}) = R_{T-1}(\hat\theta_{T-1}) + \psi_T(\hat\theta_{T-1})\,\psi_T^\top(\hat\theta_{T-1}). \qquad (25c)$$

For a description of the convergence properties, see (Ljung and Söderström, 1983). Note that the expression $R_T^{-1}$ in (25b) may be computed without the use of explicit matrix inversion; for a description of the Matrix Inversion Lemma, see (Ljung, 1987, Chapter 11).
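To make this concrete, the rank-one structure of (25c) admits a direct update of $P_T = R_T^{-1}$ via the Sherman-Morrison identity, one form of the Matrix Inversion Lemma. The following is a minimal Python sketch, not the authors' implementation:

```python
import numpy as np

def inv_rank_one_update(P, psi):
    """Given P = R^{-1}, return (R + psi psi^T)^{-1} using the
    Sherman-Morrison identity, so no explicit inversion is needed."""
    P_psi = P @ psi
    return P - np.outer(P_psi, P_psi) / (1.0 + psi @ P_psi)
```

With this update, the correction term in (25b) is simply `P @ psi * eps`, and the per-observation cost is O(p²) rather than the O(p³) of a full inversion.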

3.3 Pseudolinear Regressions (PLR)

In order to develop an iterative procedure for the family of pseudolinear regression methods, the left hand side of Eq. (12) is rewritten as

$$\frac{1}{T}\sum_{t=1}^{T} X_t(\theta)\,\epsilon_t(\theta) = \frac{1}{T}\sum_{t=1}^{T} X_t(\theta)\left(Y_t - X_t^\top(\theta)\,\theta\right) \qquad (26)$$

$$= \frac{1}{T}\sum_{t=1}^{T} X_t(\theta)\,Y_t - \left(\frac{1}{T}\sum_{t=1}^{T} X_t(\theta)\,X_t^\top(\theta)\right)\theta, \qquad (27)$$

and hence

$$\hat\theta = \left(\sum_{t=1}^{T} X_t(\theta)\,X_t^\top(\theta)\right)^{-1} \sum_{t=1}^{T} X_t(\theta)\,Y_t. \qquad (28)$$

Note that if X is independent of $\theta$, this estimator is identical to the ordinary least squares estimator.

Eq. (28) may be used to find an estimate for $\theta$ by iteration:

$$\hat\theta_i = \left(\sum_{t=1}^{T} X_t(\hat\theta_{i-1})\,X_t^\top(\hat\theta_{i-1})\right)^{-1} \sum_{t=1}^{T} X_t(\hat\theta_{i-1})\,Y_t. \qquad (29)$$

Using $Y_t = X_t^\top(\hat\theta_{i-1})\,\hat\theta_{i-1} + \epsilon_t(\hat\theta_{i-1})$, the last factor of (28) turns into

$$\sum_{t=1}^{T} X_t(\hat\theta_{i-1})\,Y_t = \sum_{t=1}^{T} X_t(\hat\theta_{i-1})\,\epsilon_t(\hat\theta_{i-1}) + \sum_{t=1}^{T} X_t(\hat\theta_{i-1})\,X_t^\top(\hat\theta_{i-1})\,\hat\theta_{i-1}. \qquad (30)$$

Inserting this into (29) yields

$$\hat\theta_i = \hat\theta_{i-1} + \left(\sum_{t=1}^{T} X_t(\hat\theta_{i-1})\,X_t^\top(\hat\theta_{i-1})\right)^{-1} \sum_{t=1}^{T} X_t(\hat\theta_{i-1})\,\epsilon_t(\hat\theta_{i-1}). \qquad (31)$$

3.3.1 Recursive Pseudolinear Regressions (RPLR)

To obtain a recursive version of PLR, a new observation is introduced each time an iteration is performed in (31), that is,

$$\hat\theta_T = \hat\theta_{T-1} + \left(\sum_{t=1}^{T} X_t(\hat\theta_{T-1})\,X_t^\top(\hat\theta_{T-1})\right)^{-1} \sum_{t=1}^{T} X_t(\hat\theta_{T-1})\,\epsilon_t(\hat\theta_{T-1}). \qquad (32)$$

In line with the derivation of the RPEM method, we split the last of the two summations into

$$\sum_{t=1}^{T} X_t(\hat\theta_{T-1})\,\epsilon_t(\hat\theta_{T-1}) = X_T(\hat\theta_{T-1})\,\epsilon_T(\hat\theta_{T-1}) + \sum_{t=1}^{T-1} X_t(\hat\theta_{T-1})\,\epsilon_t(\hat\theta_{T-1}), \qquad (33)$$

and note that the last term is zero, assuming that $\hat\theta_{T-1}$ solves (12) when T is substituted by T − 1. Similarly, for Eq. (32),

$$\sum_{t=1}^{T} X_t(\hat\theta_{T-1})\,X_t^\top(\hat\theta_{T-1}) = X_T(\hat\theta_{T-1})\,X_T^\top(\hat\theta_{T-1}) + \sum_{t=1}^{T-1} X_t(\hat\theta_{T-1})\,X_t^\top(\hat\theta_{T-1}). \qquad (34)$$

Introducing

$$R_T(\hat\theta_{T-1}) = \sum_{t=1}^{T} X_t(\hat\theta_{T-1})\,X_t^\top(\hat\theta_{T-1}) \qquad (35)$$

yields the RPLR updating scheme:

$$\epsilon_T(\hat\theta_{T-1}) = Y_T - X_T^\top(\hat\theta_{T-1})\,\hat\theta_{T-1}, \qquad (36a)$$
$$\hat\theta_T = \hat\theta_{T-1} + R_T^{-1}(\hat\theta_{T-1})\,\epsilon_T(\hat\theta_{T-1})\,X_T(\hat\theta_{T-1}), \qquad (36b)$$
$$R_T(\hat\theta_{T-1}) = R_{T-1}(\hat\theta_{T-1}) + X_T(\hat\theta_{T-1})\,X_T^\top(\hat\theta_{T-1}). \qquad (36c)$$

Remark 3.1 Notice the close resemblance to the RPEM updating scheme: to change RPLR into RPEM, simply replace X with $\psi$. Also notice that RPLR is identical to RPEM if X is independent of $\theta$, since

$$\psi = -\frac{\partial \epsilon}{\partial \theta} = -\frac{\partial (Y - X^\top\theta)}{\partial \theta} = \frac{\partial (X^\top\theta)}{\partial \theta} = X \qquad (37)$$

when $\partial X/\partial \theta = 0$.
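The scheme (36a)-(36c) can be written as a short generic routine. The following Python sketch (with a hypothetical, user-supplied helper `build_x`, since $X_t$ may depend on past data and on $\theta$) propagates $P = R^{-1}$ with the Sherman-Morrison identity; replacing $X_T$ by $\psi_T$ turns it into RPEM, as noted in Remark 3.1.

```python
import numpy as np

def rplr(y, build_x, theta0, r0_scale=1e-4):
    """Run the RPLR updating scheme (36a)-(36c) over the observations y.

    build_x(t, theta, past_y) is a user-supplied function returning the
    regressor X_t(theta); it may use past observations and past predictions.
    """
    theta = np.array(theta0, dtype=float)
    P = np.eye(len(theta)) / r0_scale       # P = R^{-1}, with R_0 = r0_scale * I
    for t in range(len(y)):
        x = build_x(t, theta, y[:t])        # X_T(theta_{T-1})
        eps = y[t] - x @ theta              # (36a)
        P_x = P @ x
        P -= np.outer(P_x, P_x) / (1.0 + x @ P_x)   # (36c) via Sherman-Morrison
        theta += P @ x * eps                # (36b)
    return theta
```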

4 Empirical work

To measure the performance of the estimation methods derived in this article, a few experiments with the purpose of estimating parameters in GARCH and ARCH models have been carried out. GARCH models are estimated using an RPLR method, while ARCH models are estimated using an RPEM method; the reason for this will be explained later. The parameters in a GARCH(1,1) model have been estimated from four different series of stock market data, in particular series where GARCH behaviour is supported by other investigations, see e.g. (Bisgaard, 1998). The parameters in an ARCH model have been estimated using simulated data.

4.1 GARCH

To measure the performance of the estimation methods derived in this paper, four series of stock market data have been utilized. The four series are daily closing prices in US$ of Hewlett-Packard, Sony, Mobil and Pepsi stocks from October 20, 1992 to October 20, 1997, a total of 1265 observations per share. The four series are shown in Figure 1. As the focus of interest is on the relative changes in the stock market prices rather than on the prices themselves, the geometric rate of return is considered:

$$R_t = \ln S_t - \ln S_{t-1} = \ln \frac{S_t}{S_{t-1}}, \qquad (38)$$

where $S_t$ and $R_t$ are the stock price and the geometric return, respectively, at day t.

The series of geometric returns of Hewlett-Packard is shown in Figure 2. Notice how the heteroscedasticity is visible from the plot: the conditional variance is evidently of low magnitude for observations roughly between numbers 500 and 550, and of high magnitude in the range between numbers 950 and 1100. However, visual inspection cannot stand alone as evidence of GARCH.

Formal investigation of the return series for indications of GARCH behaviour can be done in several ways. We have chosen to concentrate on the Jarque-Bera test for normality and the Ljung-Box test for autocorrelation in the squared returns, also known as the Portmanteau Q-test. The Jarque-Bera test is used to compare the skewness and the kurtosis of the return series with those of a normal distribution. The Portmanteau Q-test, see e.g. (Madsen, 1995, p. 158), tests for the presence of heteroscedasticity. It should be noted that the two tests mentioned are only a subset of the investigations that constitute a convincing test for GARCH; they are considered here for the purpose of illustration.
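As a sketch of how these two tests could be computed, here is a short Python example using SciPy and statsmodels; the paper itself reports test statistics produced by other software, and the simulated `prices` series below is a placeholder:

```python
import numpy as np
from scipy import stats
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(1)
prices = 100 * np.exp(np.cumsum(0.01 * rng.standard_normal(1265)))  # placeholder series
returns = np.diff(np.log(prices))          # geometric returns, Eq. (38)

# Jarque-Bera: compares sample skewness and kurtosis with those of a normal.
jb_stat, jb_pvalue = stats.jarque_bera(returns)

# Portmanteau (Ljung-Box) Q-test applied to the *squared* returns:
# autocorrelation in returns**2 indicates conditional heteroscedasticity.
lb = acorr_ljungbox(returns**2, lags=[1, 2, 3, 12], return_df=True)
print(jb_stat, jb_pvalue)
print(lb)
```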

Figure 1: Closing prices as functions of the observation number (panels: hp, sony, mobil, pepsi).

4.1.1 Test results

The test results are shown in Tables 1 and 2: the Jarque-Bera test in the first table, the Portmanteau Q-test in the second.

Series       HP       Sony     Mobil    Pepsi
Skewness     0.048    0.664    0.032    0.218
Kurtosis     6.089    7.725    4.040    7.601
Statistic    502.26   1268.8   57.147   1102.7
Probability  0.0000   0.0000   0.0000   0.0000

Table 1: Jarque-Bera test results.

As seen from the tables, all four series exhibit non-normal behaviour and heteroscedasticity. From the test results it is concluded that GARCH models constitute a suitable description of the return series.

4.1.2 Estimation

Whether an RPEM method or an RPLR method is considered, a regression vector X has to be chosen. The most obvious step would be to write down the one-step predictor and establish its dependence on the parameters $\theta$.

Figure 2: Geometric returns of Hewlett-Packard.

This approach, however, leads nowhere, since the expected value of a GARCH residual is always zero. Instead it is convenient to transform the residuals via

$$Y_t = \varepsilon_t^2. \qquad (39)$$

For given parameters, the one-step ahead prediction of $Y_t$ is

$$\hat{Y}_{t|t-1} = \sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i Y_{t-i} + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 \qquad (40)$$

$$= X_t^\top \theta, \qquad (41)$$

where $X_t^\top = \big(1, Y_{t-1}, \ldots, Y_{t-q}, \sigma_{t-1}^2, \ldots, \sigma_{t-p}^2\big)$ and $\theta^\top = \big(\alpha_0, \alpha_1, \ldots, \alpha_q, \beta_1, \ldots, \beta_p\big)$. Note that the prediction error $\epsilon$ is different from the GARCH residual $\varepsilon$: $\epsilon$ is the prediction error referred to by the RPLR and RPEM methods, whereas $\varepsilon$ is the residual of the linear regression model (1).

The choice of X suggests the use of an RPLR method for estimation, since an RPEM method would require an expression for $\partial \hat{Y}_{t|t-1}/\partial \theta$. This expression would be too cumbersome to compute due to the recursive parts of (40).
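For the GARCH(1,1) case, the following Python sketch puts the pieces together: the regressor of Eq. (41) is $X_t = (1, Y_{t-1}, \sigma_{t-1}^2)$, built recursively from the current parameter estimate, and the estimate is updated by the RPLR scheme (36). The restart strategy, initial values and clamping of $\sigma_t^2$ are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def rplr_garch11(eps, theta0, sigma2_0, r0=1e-4, n_passes=12):
    """Estimate theta = (alpha0, alpha1, beta1) by RPLR on Y_t = eps_t^2.

    The algorithm is restarted n_passes times over the data, each time
    carrying the latest theta, P and sigma^2 forward (cf. Section 4.1.3).
    """
    y = eps ** 2
    theta = np.array(theta0, dtype=float)
    P = np.eye(3) / r0                                   # P = R^{-1}, R_0 = r0 * I
    sigma2 = sigma2_0
    for _ in range(n_passes):
        y_prev = y[0]
        for y_t in y[1:]:
            x = np.array([1.0, y_prev, sigma2])          # X_t of Eq. (41)
            e = y_t - x @ theta                          # prediction error epsilon_t
            P_x = P @ x
            P -= np.outer(P_x, P_x) / (1.0 + x @ P_x)    # R update via Sherman-Morrison
            theta += P @ x * e
            sigma2 = max(x @ theta, 1e-12)               # next sigma_t^2, kept positive
            y_prev = y_t
    return theta

# Example on simulated data (parameter values are illustrative only):
rng = np.random.default_rng(0)
T, a0, a1, b1 = 5000, 1e-5, 0.1, 0.85
eps, s2 = np.empty(T), a0 / (1 - a1 - b1)
for t in range(T):
    eps[t] = np.sqrt(s2) * rng.standard_normal()
    s2 = a0 + a1 * eps[t] ** 2 + b1 * s2
theta_hat = rplr_garch11(eps, theta0=[1e-5, 0.1, 0.9], sigma2_0=np.mean(eps ** 2))
print(theta_hat)      # expected to lie in the vicinity of (1e-5, 0.1, 0.85)
```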

Note that the use of the transformation $Y_t = \varepsilon_t^2$ is not restricted to ordinary GARCH models. The transformation can be used in any situation where the conditional variance is independent of the sign of $\varepsilon$.

4.1.3 Estimation results

The results of estimating the parameters in a GARCH(1,1) model from the four return series are displayed in Table 3. For comparison, non-recursive maximum likelihood estimates are also shown in the table. The maximum likelihood estimates have been obtained using the GARCH module of the S-PLUS software package.

Series  Lag  Statistic  Probability
HP      1    2.1242     0.1450
        2    2.1246     0.3457
        3    17.493     0.0006
        12   35.967     0.0003
Sony    1    3.8354     0.0502
        2    10.163     0.0062
        3    10.772     0.0130
        12   49.767     0.0001
Mobil   1    0.7399     0.3897
        2    12.947     0.0015
        3    12.983     0.0047
        12   43.066     0.0001
Pepsi   1    4.5436     0.0330
        2    5.2947     0.0708
        3    6.5861     0.0863
        12   13.756     0.3166

Table 2: Portmanteau Q-test. All test statistics of lags between 3 and 12 are insignificant at the 5% level.

Before initiating the RPLR regression, a value for $\sigma_0^2$ as well as initial values of the matrix R and the vector $\theta$ had to be chosen. The value $R_0 = 10^{-4} \cdot I$, with I being the identity matrix, has been used as the initial value of R. This choice corresponds to no prior knowledge of the parameters. The sample mean $T^{-1}\sum_{i=1}^{T} Y_i$ has been used for $\sigma_0^2$. For $\theta$, an initial value of $\theta_0^\top = (1 \cdot 10^{-5}, 0.1, 0.9)$ has been used. This choice corresponds to some prior knowledge of the GARCH structure of the return series. But, as we shall see, the choice of initial parameter values is not critical to the estimation result, only to the speed of convergence.

Since the convergence speed of the estimation algorithm does not allow for a stable estimate within the 1264 available observations, the RPLR algorithm has been restarted a number of times for each series, each time with the most recent estimates of R, $\sigma^2$ and $\hat\theta$.

Series  Method  α0             α1             β1
HP      RPLR    2.276 · 10^-5  3.209 · 10^-2  9.170 · 10^-1
        ML      6.566 · 10^-5  9.612 · 10^-2  7.610 · 10^-1
Sony    RPLR    6.902 · 10^-6  3.357 · 10^-2  9.397 · 10^-1
        ML      2.706 · 10^-6  6.034 · 10^-2  9.338 · 10^-1
Mobil   RPLR    6.595 · 10^-7  9.827 · 10^-3  9.852 · 10^-1
        ML      4.963 · 10^-6  4.804 · 10^-2  9.141 · 10^-1
Pepsi   RPLR    7.925 · 10^-5  5.277 · 10^-2  6.085 · 10^-1
        ML      2.917 · 10^-5  8.473 · 10^-2  7.937 · 10^-1

Table 3: Estimation of the parameters in a GARCH(1,1) model.

As seen from Table 3, the parameter values estimated using RPLR differ from those obtained using maximum likelihood. However, as mentioned earlier, maximum likelihood is certainly not a bulletproof method for estimating GARCH parameters. Thus, to further measure the performance of RPLR, the squared errors computed using the RPLR estimates have been compared to the squared errors computed using the maximum likelihood estimates. The results of this comparison are seen in Table 4. The squared errors have been computed by using the estimated parameters to calculate a prediction error for each observation and then adding up the squared prediction errors.

         RPLR           ML
HP       1.271 · 10^-3  1.281 · 10^-3
Sony     5.640 · 10^-4  5.690 · 10^-4
Mobil    6.590 · 10^-5  6.475 · 10^-5
Pepsi    4.558 · 10^-4  4.587 · 10^-4

Table 4: Squared errors. Numbers are computed using the sample mean $T^{-1}\sum_{t=1}^{T} Y_t$ as $\sigma_0^2$.

From Table 4 it seems that the RPLR method is doing a better job than the maximum likelihood method, since it is superior to maximum likelihood in three out of the four series. It should be noted, however, that the figures in Table 4 can hardly be used to support the conjecture that the RPLR method is superior to maximum likelihood. First, only four experiments have been conducted. Second, the computation of the figures in Table 4 exhibits some sensitivity to the choice of $\sigma_0^2$. But the numbers in Table 4 do indicate that the RPLR method is fully comparable to the maximum likelihood method.
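The comparison criterion just described can be sketched as follows: for a given parameter vector, run the one-step predictions (40) through the series of squared residuals and accumulate the squared prediction errors. This is a sketch only; as noted above, the result is somewhat sensitive to the choice of $\sigma_0^2$.

```python
import numpy as np

def sum_squared_pred_errors(eps, theta, sigma2_0):
    """Sum of squared one-step prediction errors of Y_t = eps_t^2, Eq. (40),
    for a GARCH(1,1) parameter vector theta = (alpha0, alpha1, beta1)."""
    alpha0, alpha1, beta1 = theta
    y = eps ** 2
    sigma2, total = sigma2_0, 0.0
    for t in range(1, len(y)):
        pred = alpha0 + alpha1 * y[t - 1] + beta1 * sigma2   # Y_hat_{t|t-1}
        total += (y[t] - pred) ** 2
        sigma2 = pred
    return total
```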

In Figure 3, the estimation trajectory of the parameter $\beta_1$ of the Hewlett-Packard series, as a function of the number of RPLR iterations, is illustrated. The trajectories of the two other parameters (not shown) are structurally similar to that of $\beta_1$.

Figure 3: Estimation trajectories of $\beta_1$ for the Hewlett-Packard series for two different starting values. The estimation has been restarted after each 1264 iterations. The leftmost graph shows the trajectory of $\beta_1$ corresponding to the initial parameter values $\alpha_0 = 10^{-5}$, $\alpha_1 = 0.1$ and $\beta_1 = 0.9$. The rightmost graph shows the trajectory corresponding to the (arbitrary) initial parameter values $\alpha_0 = 0.01$, $\alpha_1 = 0.4$ and $\beta_1 = 0.5$. Notice the different scales of the x-axes.

As seen from the leftmost plot in Figure 3, reliable parameter estimates are not obtained with much less than 15,000 iterations. The speed of convergence may be increased by choosing the initial values of R, $\sigma^2$ and $\theta$ more carefully. However, this may not be worth the effort, as it does not take much time to carry out 15,000 iterations. The rightmost plot in Figure 3 illustrates that the estimation method manages to converge even when the initial parameter values are far from the stationary ones.

The small-scale periodicity in both plots (most visible in the leftmost plot) is a result of the recursive algorithm being restarted.

4.2 ARCH

4.2.1 Estimation

When an ARCH model is considered, the regression vector is $X_t^\top = \big(1, Y_{t-1}, \ldots, Y_{t-q}\big)$ and $\theta^\top = \big(\alpha_0, \alpha_1, \ldots, \alpha_q\big)$. As X is now independent of $\theta$, the RPLR and RPEM methods coincide. To ease the notation, the method used will be denoted RPEM.

4.2.2 Estimation results

The estimation of ARCH parameters has been carried out using simulated series. Ten trajectories from an ARCH(1) process have been simulated, each with 1200 observations. The estimation results are displayed in Table 5. Again, the results are compared with the non-recursive maximum likelihood estimates obtained using the GARCH module of the S-PLUS software package. The table shows the empirical mean and variance, respectively, for the RPEM method and the maximum likelihood method.

The value $R_0 = kI$, with $k = 10^{-5}$ and I the identity matrix, has been chosen as the initial value for R. For $\theta$, the initial value $\theta_0^\top = (10^{-3}, 0.1)$ has been used. As in the GARCH situation, the RPEM algorithm has been restarted a number of times in order to obtain stable estimates.

Parameter        α0             α1
True             1.00 · 10^-4   3.00 · 10^-1
Ê[·]  (RPEM)     1.04 · 10^-4   2.84 · 10^-1
Ê[·]  (ML)       1.00 · 10^-4   3.06 · 10^-1
V̂[·]  (RPEM)     4.82 · 10^-11  5.09 · 10^-3
V̂[·]  (ML)       4.39 · 10^-11  3.64 · 10^-3

Table 5: Estimation of the parameters in an ARCH(1) model.

As seen from Table 5, the RPEM regression does indeed do a good job of estimating the parameter values when compared to maximum likelihood, as was the case for GARCH.
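A sketch of a corresponding Monte Carlo experiment in Python is given below; only the recursive estimator is reproduced, since the study's ML estimates came from S-PLUS. Because X is independent of $\theta$ here, the update is just recursive least squares on $Y_t = \varepsilon_t^2$; the restart count is an illustrative assumption.

```python
import numpy as np

def rpem_arch1(eps, theta0=(1e-3, 0.1), r0=1e-5, n_passes=10):
    """RPEM (= RPLR, as X is independent of theta) for an ARCH(1) model."""
    y = eps ** 2
    theta = np.array(theta0, dtype=float)
    P = np.eye(2) / r0                       # P = R^{-1}, R_0 = r0 * I
    for _ in range(n_passes):                # restarts, carrying theta and P forward
        for t in range(1, len(y)):
            x = np.array([1.0, y[t - 1]])    # X_t = (1, Y_{t-1})
            e = y[t] - x @ theta
            P_x = P @ x
            P -= np.outer(P_x, P_x) / (1.0 + x @ P_x)
            theta += P @ x * e
    return theta

rng = np.random.default_rng(42)
a0_true, a1_true, estimates = 1e-4, 0.3, []
for _ in range(10):                          # 10 simulated trajectories
    eps, s2 = np.empty(1200), a0_true / (1 - a1_true)
    for t in range(1200):
        eps[t] = np.sqrt(s2) * rng.standard_normal()
        s2 = a0_true + a1_true * eps[t] ** 2
    estimates.append(rpem_arch1(eps))
est = np.array(estimates)
print(est.mean(axis=0))   # empirical mean, compare with Table 5
print(est.var(axis=0))    # empirical variance
```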

5 Discussion and Conclusion

In this paper, it has been proposed to use recursive methods for estimating GARCH-type models as an alternative to maximum likelihood. Some preliminary results have been presented, indicating that the proposed recursive estimation methods are fully comparable with the maximum likelihood method.

It is a topic of ongoing research to obtain more reasonable initial values of the parameters and the Hessian matrix such that faster convergence is obtained. Two subjects for future research are: 1) to introduce forgetting factors in the recursive methods, and 2) to derive expressions for the variance of the parameter estimates such that the uncertainty may be assessed.

References

Andrews, D. W. K. (1991), 'Heteroskedasticity and autocorrelation consistent covariance matrix estimation', Econometrica 59, 817–858.

Baillie, R. (1996), 'Long memory processes and fractional integration in econometrics', Journal of Econometrics 73, 5–59.

Bisgaard, P. (1998), Stochastic Modelling of Market Risk, Master's thesis, IMM, Technical University of Denmark, Lyngby.

Bollerslev, T. (1986), 'Generalized autoregressive conditional heteroskedasticity', Journal of Econometrics 31, 307–327.

Bollerslev, T. (1988), 'On the correlation structure of the generalized autoregressive conditional heteroscedastic process', Journal of Time Series Analysis 9(2), 121–131.

Bollerslev, T., Chou, R. Y. and Kroner, K. F. (1992), 'ARCH models in finance: A review of the theory and evidence', Journal of Econometrics 52, 5–59.

Bollerslev, T., Engle, R. F. and Nelson, D. B. (1994), ARCH models, in R. F. Engle and D. L. McFadden, eds, 'Handbook of Econometrics', Vol. IV, Elsevier Science, chapter 49, pp. 2959–3038.

Davis, R. A. and Mikosch, T. (1998), The Sample Autocorrelations of Heavy-Tailed Processes with Applications to ARCH. Manuscript, Colorado State University.

Ding, Z. and Granger, C. W. J. (1996), 'Modeling volatility persistence of speculative returns: A new approach', Journal of Econometrics 73, 185–215.

Engle, R. F. (1982), 'Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation', Econometrica 50(4), 987–1008.

Engle, R. F. (1995), ARCH: Selected Readings, Cambridge University Press, Cambridge.

Engle, R. F. and Bollerslev, T. (1986), 'Modeling the persistence of conditional variances', Econometric Reviews 5, 1–50.

Harvey, A. C., Ruiz, E. and Sentana, E. (1992), 'Unobserved component time series models with ARCH disturbances', Journal of Econometrics 52, 129–157.

Jacquier, E., Polson, N. G. and Rossi, P. E. (1994), 'Bayesian analysis of stochastic volatility models', Journal of Business and Economic Statistics 12, 371–417.

Ljung, L. (1987), System Identification: Theory for the User, Prentice-Hall, New York.

Ljung, L. and Söderström, T. (1983), Theory and Practice of Recursive Identification, MIT Press, Cambridge, Massachusetts.

Madsen, H. (1995), Tidsrækkeanalyse (in Danish), 2nd edn, IMM, DTU, DK-2800 Lyngby.

Newey, W. K. and West, K. D. (1987), 'A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix', Econometrica 55(3), 703–708.

Schweppe, F. (1965), 'Evaluation of likelihood function for Gaussian signals', IEEE Transactions on Information Theory 11, 61–70.

Shephard, N. (1996), Statistical aspects of ARCH and stochastic volatility, in D. Cox, D. Hinkley and O. Barndorff-Nielsen, eds, 'Time Series Models: In Econometrics, Finance and Other Fields', Chapman & Hall, pp. 1–67.
