A Generalized Gamma Autoregressive Conditional Duration Model

Asger Lunde
Department of Economics, Politics and Public Administration, Aalborg University, Fibigerstraede 1, DK-9220 Aalborg Ø, Denmark
Email: [email protected]

This version: February 1999, First version: November 1996

I extend the ACD model of Engle and Russell (1998) to generalized gamma durations with a conditional mean that depends on the exponential of the explanatory variables. This allows for a non-monotonic hazard function taking U-shaped or inverted U-shaped forms. The extension implies that the trading intensity persistence is reduced considerably, and that the overall fit of the model is enhanced compared to the ACD model. As a further extension of the model it is shown how to include time-varying covariates in a fully parametric framework. We analyze how transaction rates are affected by the posting of price quotes and their changes. In addition, a model of the time between price changes is estimated. As shown by Engle and Russell (1998), this model is closely linked to the volatility of the stock price, which shows why price durations are important for intra-day prediction of volatility. The transaction volume and functions of it are used as regressors in this model and are found to be important. The datasets used in the paper consist of a random sample from the fifty stocks at the NYSE with the highest capitalization value on December 13, 1996.

Journal of Economic Literature Classification Numbers: C1, C2, D41, G4.

 This paper was written while the author was at the University of Aarhus, Department of Economics. I am grateful for financial support from the Center for Nonlinear Modelling in Economics, University of Aarhus. I am in particular grateful for comments from Torben G. Andersen and Niels Haldrup. All errors are mine.

Lunde, A.: The GG-ACD Model

1. INTRODUCTION

Following Engle and Russell (1995), researchers have considered the statistical properties of series of waiting times between successive transactions at financial exchanges. These studies$^{1}$ find significant evidence of duration clustering and over-dispersion (with respect to exponential durations) in the investigated datasets. Most important is duration clustering, indicating a highly persistent correlation pattern in which periods of long durations between transactions tend to be followed by long durations, and vice versa. This feature indicates that the point process of trade timing contains information about the behavior of the agents. By now a considerable number of papers have investigated the modelling of these aspects and the importance of accounting for this phenomenon in market microstructure models.

In Engle and Russell (1997), the authors apply their autoregressive conditional duration (ACD) model to foreign exchange quotes arriving on Reuters screens. Using this model they develop a measure of the time between price changes. This measure is related to volatility and incorporates the information in the irregular sampling intervals. This relationship is utilized in Engle (1996) and Ghysels and Jasiak (1998), who build bivariate models of volatility and transaction intensity, both combining GARCH models with the ACD model. Engle (1996) finds that longer durations are associated with lower volatility, while Ghysels and Jasiak (1998) find that the GARCH persistence drops dramatically when accounting for intra-trade durations.

An alternative to the ACD model, relying on a state space representation of the duration dynamics, was suggested by Lunde (1997). In this model the time deformation parameter is modelled as a random variable with a p.d.f. drawn from a conjugate family of distributions. This yields explicit prediction and filtering formulas as well as an analytical prediction decomposition.
The model is somewhat restrictive in the representation of the state dynamics because of the desire to retain conjugacy.

In a recent paper Ghysels, Gourieroux, and Jasiak (1998) relax some of the limitations of the aforementioned models. They propose a class of two-factor dynamic models with the first factor driving the conditional mean dynamics and the second driving the conditional variance. Their model is in line with the stochastic volatility extension of GARCH models studied extensively in recent years, and it does not allow an analytical expression for the likelihood, a problem the authors resolve using simulation-based estimation.

The ACD model is built by specifying a parametric representation of the conditional expected duration, and then relying on a quasi maximum likelihood argument for deriving the asymptotics of the estimated parameters. This means that the model can potentially be misspecified, implying that the parameter estimates might be biased and inefficient. This is recognized by Engle and Russell (1998), who limited their study to exponential and Weibull durations. The degree of misspecification is relatively easily assessed by comparing the hazard function implied by the parametric specification of the estimated model with a nonparametric estimate of the same. Certain specifications will restrict the hazard function to shapes that may misrepresent the behaviour of the agents in the market.

A serious drawback of the transaction duration studies implemented so far is that they all restrict the model to monotone hazard functions. Hence these models imply that the hazard function must either increase or decrease during a time-spell, or stay constant as in the exponential case. Engle and Russell (1998) find evidence of a monotonically decreasing hazard function for IBM transactions using a Weibull specification. In contrast, Lunde (1997) finds evidence of both monotonically decreasing and monotonically increasing hazard rates in a larger selection of stocks using both Weibull and gamma durations. The study by Lunde (1997) indicates that the shape selected by the data depends on the specified duration distribution. It seems as if the gamma and Weibull durations are at odds in selecting the hazard shape, and hence a more general distribution is called for.

$^{1}$ Some references are given in the text below.

Many physical phenomena exhibit hazard functions that are non-monotonic. A common description of non-monotonic hazards, which would apply to the modelling of human lifetimes, has three phases. First comes an initial phase where the hazard rate$^{2}$ decreases; for humans this phase reflects deaths due to hereditary defects, that is, infant mortality, whose impact decreases with time. During the middle phase the hazard rate is essentially constant, as deaths are typically due to accidents. In the final phase the hazard increases, because deaths result from the natural accumulation of negative effects. Such hazard rates are usually termed bathtub shaped or U-shaped. This has a logical counterpart in which the hazard rate initially increases, then becomes close to constant and ultimately decreases. This form is called inverted U-shaped.

In the following the ACD model of Engle and Russell (1998) is extended to generalized gamma durations with a conditional mean that depends upon the exponential of the explanatory variables. This allows for a non-monotonic hazard function taking bathtub shaped or inverted U-shaped forms. The inverted U-shaped form is strongly supported by the data.
I find that for most of the considered stocks the ACD persistence is reduced significantly when allowing for the more general duration distribution. Moreover, the suggested generalization outperforms the model employed by Engle and Russell (1998) with respect to all diagnostic criteria. As a further extension of the model it is shown how to include time-varying covariates in a fully parametric framework. It is found that permitting the covariates to vary within durations dramatically changes their significance and effects. Altogether, this provides a better model for testing economic hypotheses about the market microstructure, as suggested in Engle (1996), Engle and Russell (1997) and Engle and Russell (1998). The proposed model is applied to a random sample consisting of seven stocks from the fifty stocks with the highest capitalization value on the NYSE on December 13, 1996.

The rest of the paper is organized as follows. In the next section the ACD model is briefly reviewed. Section 3 presents the model introduced in the paper, and discusses estimation and diagnostics. In section 4 the model is applied to transactions data from the NYSE. In section 5 the model is extended further to allow for time-varying covariates, and this extension is also illustrated on the NYSE data. The final section concludes.

2. MODELS FOR INTER-TEMPORALLY CORRELATED EVENTS

The transactions data considered in this study have several features which are closely related to the data sets modelled by the volatility papers referred to in the introduction. In the log-return series considered in these studies the correlogram of the squared series displays significant correlations at long lags, indicating volatility clustering. The term clustering signifies that large changes (in the rate of return) tend to be followed by large changes, of either sign, and small changes tend to be followed by small changes. The transaction data display a similar pattern by having extremely persistent autocorrelations in the series of successive durations.

$^{2}$ We use the notions hazard function and hazard rate interchangeably throughout the paper.

In Engle and Russell (1998) a short review of some of the relevant models for intertemporally correlated events is given. The research in this area was initiated by Wold (1948), who suggested a model for correlated durations drawing on an autoregressive structure. One of the most important models was presented by Cox (1955), namely the proportional hazard model. The ACD model of Engle and Russell (1998)$^{3}$ and the generalization given in the present paper are both formulated as accelerated failure time models.

Before turning to the model of the paper a brief description of the ACD model is given. In Engle and Russell (1998) it is assumed that the time dependence can be summarized by a function $\psi$, which is the conditional expected duration given past information. This function has the property that $x_i/\psi_i$ is independently and identically distributed, where $x_i = t_i - t_{i-1}$. The assumptions imply that the standardized durations satisfy:

$$\psi_i = E[x_i \mid x_{i-1}, \ldots, x_1, \vartheta] = \psi(x_{i-1}, \ldots, x_{i-p}, \psi_{i-1}, \ldots, \psi_{i-q}, \vartheta) \tag{1}$$

$$f(x_i/\psi_i \mid x_{i-1}, \ldots, x_1, \vartheta) = f(x_i/\psi_i, \vartheta) \quad \text{and iid } \forall i \tag{2}$$

The authors call this model the ACD($p, q$) model. As their simplest version Engle and Russell considered the EACD(1, 1) model, i.e. exponential durations with intensity $\psi_i^{-1}$, where

$$\psi_i = \delta + \alpha x_{i-1} + \beta \psi_{i-1}, \qquad \alpha, \beta \ge 0, \quad \delta > 0.$$

The most general model considered is the WACD(2, 2), which has Weibull durations with $\psi_i$ following an ARMA(2, 2) structure. Property (2) is used as a diagnostic criterion for the goodness of fit of the models.
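The EACD(1, 1) recursion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `eacd11_conditional_means`, the argument names and the start-up value `psi0` are hypothetical and not part of the original paper.

```python
from typing import List, Sequence


def eacd11_conditional_means(
    durations: Sequence[float],
    delta: float,
    alpha: float,
    beta: float,
    psi0: float,
) -> List[float]:
    """Iterate psi_next = delta + alpha * x_prev + beta * psi_prev.

    Element k of the result is the conditional expected duration of the
    spell that follows durations[k]. psi0 is an assumed start-up value
    for the conditional mean attached to durations[0].
    """
    if delta <= 0 or alpha < 0 or beta < 0:
        raise ValueError("need delta > 0 and alpha, beta >= 0")
    psi: List[float] = []
    psi_prev = psi0
    for x_prev in durations:
        psi_prev = delta + alpha * x_prev + beta * psi_prev
        psi.append(psi_prev)
    return psi
```

For example, `eacd11_conditional_means([1.0, 2.0], 0.1, 0.2, 0.7, psi0=1.0)` walks the recursion twice, first from `psi0` and then from the value just computed.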

3. THE GG-ACD MODEL

Let the observations be the arrival times of transactions, given in the form of a simple point process in continuous time, that is $\{t_0, t_1, \ldots, t_n, \ldots\}$ with $t_0 < t_1 < \ldots < t_n < \ldots$. To work in event time, we compute $x_i = t_i - t_{i-1}$ to get the corresponding durations. The model to be presented shortly builds on the standard textbook model, where $z_1, z_2, \ldots, z_n, \ldots$ are the waiting times between successive events in a Poisson process on $[0, \infty)$ with parameter $\lambda^{-1}$. In that case $z_n$, $n \ge 1$, are mutually independent exponentially distributed random variables with common mean$^{4}$ $\lambda$. In this simple case the hazard function is constant, $\theta(t) = \lambda^{-1}$ for $t \ge 0$.

$^{3}$ Still, the actual duration distributions used here collapse to proportional hazard models.
$^{4}$ We adopt the convention that the p.d.f. of an exponential random variable is $\lambda^{-1} \exp(-x/\lambda)$.


3.1. Definitions and properties

The GG-ACD model diverges from the standard setting by a specification allowing for a time-varying conditional mean duration. The hazard function, which generally is a function of event time (the length of the duration), will also depend on calendar time and the history of the flow of durations. We will see that the model has the scale parameter for the current observation depending on lagged durations and the scale of lagged durations. Basically the model extends the ACD model to the setting of a generalized gamma density. The model has the following structure. The exponential of the error term, $U$, in

$$\ln\left((X/\lambda)^{\delta}\right) = U = \ln(Z) \tag{3}$$

follows the standard gamma distribution given by

$$f_{Z_i}(z_i \mid \gamma) = \frac{z_i^{\gamma-1} \exp(-z_i)}{\Gamma(\gamma)}, \qquad \gamma > 0.$$

It follows that $(X_i/\lambda)^{\delta} = Z_i$, with $\delta > 0$, and thus by transformation of $Z$ the p.d.f. of $X$ reads$^{5}$

$$f_{X_i}(x_i \mid \gamma, \lambda, \delta) = \frac{\delta\, x_i^{\delta\gamma-1}}{\lambda^{\delta\gamma}\, \Gamma(\gamma)} \exp\left\{-\left(\frac{x_i}{\lambda}\right)^{\delta}\right\}, \qquad x_i \ge 0.$$

This model was defined by Stacy (1962) as the family of generalized gamma distributions. It includes the exponential distribution ($\gamma = 1$, $\delta = 1$), the Weibull distribution ($\gamma = 1$), the half-normal ($\gamma = \tfrac{1}{2}$, $\delta = 2$), and of course the ordinary gamma distribution ($\delta = 1$). In addition, the lognormal occurs as a limiting special case when $\gamma \to \infty$. Some useful properties of this distribution are given in Lancaster (1990), p. 38-40. McDonald and Butler (1990) treat regression models for positive random variables in a general setting encompassing the generalized gamma distribution$^{6}$. The shape properties of the generalized gamma hazard are derived in Glaser (1980), where it is shown how the parameters $(\gamma, \lambda, \delta)$ divide the shape into three cases:

1. $\delta\gamma - 1 < 0$
   (a) $\delta \le 1 \Rightarrow \theta(t)$ decreasing.
   (b) $\delta > 1 \Rightarrow \theta(t)$ U-shaped.

2. $\delta\gamma - 1 > 0$
   (a) $\delta \ge 1 \Rightarrow \theta(t)$ increasing.
   (b) $\delta < 1 \Rightarrow \theta(t)$ inverted U-shaped.

3. $\delta\gamma - 1 = 0$
   (a) $\delta = 1 \Rightarrow \gamma = 1 \Rightarrow$ exponential density, $\theta(t)$ constant.
   (b) $\delta < 1 \Rightarrow \theta(t)$ decreasing.
   (c) $\delta > 1 \Rightarrow \theta(t)$ increasing.

$^{5}$ A location parameter could also be included, but there is no reason for this extension in the present setting.
$^{6}$ The generalized gamma distribution has been successfully applied in several studies due to its great flexibility: Jaggia (1991) shows, using Kennan's model of strikes, how partial tests of heterogeneity (functional form) are quite misleading in the presence of functional misspecification (neglected heterogeneity). To perform specification tests Yamaguchi (1992) uses the distribution in the analysis of permanent employment in Japan. For other applications see Bergström and Edin (1992) and Tunali and Assaad (1992).
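The three-case taxonomy of hazard shapes can be written as a small lookup. The Python sketch below is illustrative only: the function name `gg_hazard_shape`, the argument names `gamma_` (the gamma shape parameter) and `delta` (the power parameter), and the tolerance used to detect the boundary case are assumptions, not part of the original paper.

```python
def gg_hazard_shape(gamma_: float, delta: float, tol: float = 1e-12) -> str:
    """Classify the generalized gamma hazard shape from its two shape parameters.

    The three branches follow the sign of delta * gamma_ - 1; tol is an
    assumed numerical tolerance for the boundary case.
    """
    if gamma_ <= 0 or delta <= 0:
        raise ValueError("both shape parameters must be positive")
    k = delta * gamma_ - 1.0
    if abs(k) <= tol:  # case 3: delta * gamma_ - 1 = 0
        if abs(delta - 1.0) <= tol:
            return "constant"
        return "decreasing" if delta < 1.0 else "increasing"
    if k < 0:  # case 1: delta * gamma_ - 1 < 0
        return "decreasing" if delta <= 1.0 else "U-shaped"
    # case 2: delta * gamma_ - 1 > 0
    return "increasing" if delta >= 1.0 else "inverted U-shaped"
```

With `gamma_ = 1` the function reproduces the familiar Weibull result: a decreasing hazard for `delta < 1` and an increasing hazard for `delta > 1`.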

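The shapes of the included models are of course special cases. We note that the hazard and survival functions involve the incomplete gamma integral,

$$I\left(\gamma, (x_i/\lambda)^{\delta}\right) = \frac{1}{\Gamma(\gamma)} \int_0^{(x_i/\lambda)^{\delta}} u^{\gamma-1} e^{-u}\, du,$$

and hence none of these functions can be written in closed form. This is not a problem, since most modern computer software has very fast routines for evaluating the expression. With the integral normalized by $\Gamma(\gamma)$ as above, it is readily seen that the survivor function is one minus this incomplete integral. For reference the mean and the variance are given by:

$$E(X_i) = \lambda\, \frac{\Gamma(\gamma + \frac{1}{\delta})}{\Gamma(\gamma)} \tag{4}$$

$$\mathrm{var}(X_i) = \lambda^2 \left\{ \frac{\Gamma(\gamma + \frac{2}{\delta})}{\Gamma(\gamma)} - \left( \frac{\Gamma(\gamma + \frac{1}{\delta})}{\Gamma(\gamma)} \right)^2 \right\} \tag{5}$$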

The idea is, as in Engle and Russell (1998), to assume that the time dependence can be summarized by a function $\psi$, which is the conditional expected duration given past information. This is a natural assumption as the data indicate that the mean duration is time-varying. We have that

$$\psi_i = E[x_i \mid Y_{i-1}, x_{i-1}, \ldots, x_1, \omega] = \psi(Y_{i-1}, x_{i-1}, \ldots, x_{i-p}, \psi_{i-1}, \ldots, \psi_{i-q}, \omega) \tag{6}$$

where $Y_i$ contains the explanatory variables of interest and a constant term. In the generalized gamma case the expectation is determined by the three parameters of the distribution $(\gamma, \lambda, \delta)$. We introduce the time-deformation through the scale parameter, $\lambda$, such that

$$\psi_i = E[x_i \mid x_{i-1}, \ldots, x_1, \omega] = \lambda_i\, \frac{\Gamma(\gamma + \frac{1}{\delta})}{\Gamma(\gamma)} = \lambda_i\, \phi(\gamma, \delta) \tag{7}$$

and using this equality the dynamics of the conditional expectation can be rewritten as a dynamic equation for the scale parameter. Hence $\lambda_i$ evolves according to

$$\lambda_i = \lambda(Y_{i-1}, x_{i-1}, \ldots, x_{i-p}, \lambda_{i-1}, \ldots, \lambda_{i-q}, \omega), \tag{8}$$

which is convenient for estimation. The next step is to consider possible specifications of the $\psi$-function. It is natural to begin with the form estimated by Engle and Russell (1998). We denote this the Engle-Russell-form, and hence put $(Y_i = c)$ in (6).


Engle-Russell-form

$$\psi_i = c + \sum_{j=1}^{p} a_j \psi_{i-j} + \sum_{j=1}^{q} b_j x_{i-j}, \qquad a_j, b_j \ge 0, \quad c > 0.$$

For simplicity it is convenient to set $p = q = 1$ to get $\psi_i = c + a\psi_{i-1} + b x_{i-1}$. It is not hard to rewrite this specification in the form of (6). Using (7) it follows that

$$\lambda_i = \frac{c}{\phi(\gamma, \delta)} + a\lambda_{i-1} + \frac{b}{\phi(\gamma, \delta)}\, x_{i-1}. \tag{9}$$

Now, by selecting the exponential distribution ($\gamma = 1$, $\delta = 1$) or the Weibull distribution ($\gamma = 1$) the model specializes to what has been worked out by Engle and Russell (1998). It is important that (9) is estimated as stated and not just as an equation with $c/\phi(\gamma, \delta) = \tilde{c}$ and $b/\phi(\gamma, \delta) = \tilde{b}$. Otherwise it is not possible to compare the persistence of the conditional expected duration across specifications. It should come as no surprise that with the explanatory variables included, (9) changes slightly to

$$\lambda_i = \frac{\beta'}{\phi(\gamma, \delta)}\, Y_{i-1} + a\lambda_{i-1} + \frac{b}{\phi(\gamma, \delta)}\, x_{i-1}. \tag{10}$$

After presenting their specification Engle and Russell note that the model could equally have been stated as yet another model in the GARCH family, e.g. the EGARCH-form proposed by Nelson (1991).

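Nelson-form

$$\psi_i = \exp(\beta' Y_{i-1}) \exp\left( \sum_{j=1}^{p} a_j \ln(\psi_{i-j}) + \sum_{j=1}^{q} b_j \frac{x_{i-j}}{E(x_{i-j})} \right)$$

Again let us simplify and look at $\psi_i = \exp(\beta' Y_{i-1}) \exp\left( a \ln(\psi_{i-1}) + b\, x_{i-1}/\psi_{i-1} \right)$. This model can also be rewritten in the form (6). From (7) we have

$$\psi_i = \exp\left( a \ln(\psi_{i-1}) + b\, \frac{x_{i-1}}{\psi_{i-1}} + \beta' Y_{i-1} \right)$$

$$\Updownarrow$$

$$\lambda_i\, \phi(\gamma, \delta) = \exp\left( a \ln(\lambda_{i-1}\, \phi(\gamma, \delta)) + b\, \frac{x_{i-1}}{\lambda_{i-1}\, \phi(\gamma, \delta)} + \beta' Y_{i-1} \right)$$

$$\Updownarrow$$

$$\lambda_i = \phi(\gamma, \delta)^{a-1} \exp\left( a \ln(\lambda_{i-1}) + \frac{b}{\phi(\gamma, \delta)} \frac{x_{i-1}}{\lambda_{i-1}} + \beta' Y_{i-1} \right) \tag{11}$$

and after taking logs we obtain

$$\ln(\lambda_i) = (a - 1) \ln(\phi(\gamma, \delta)) + a \ln(\lambda_{i-1}) + \frac{b}{\phi(\gamma, \delta)} \frac{x_{i-1}}{\lambda_{i-1}} + \beta' Y_{i-1}.$$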
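The equivalence between the Nelson-form recursion for the conditional mean and the implied recursion for the log scale can be checked numerically. The Python sketch below is a minimal illustration under stated assumptions: the function names, the scalar `beta_y` standing in for the covariate term, and the symbol names `gamma_`, `delta` for the two shape parameters are hypothetical.

```python
import math


def gamma_ratio(gamma_: float, delta: float) -> float:
    """Ratio Gamma(gamma_ + 1/delta) / Gamma(gamma_) linking mean and scale."""
    return math.exp(math.lgamma(gamma_ + 1.0 / delta) - math.lgamma(gamma_))


def nelson_mean_next(psi_prev: float, x_prev: float,
                     a: float, b: float, beta_y: float) -> float:
    """One step of the Nelson-form recursion written for the conditional mean."""
    return math.exp(a * math.log(psi_prev) + b * x_prev / psi_prev + beta_y)


def nelson_log_scale_next(lam_prev: float, x_prev: float, a: float, b: float,
                          beta_y: float, gamma_: float, delta: float) -> float:
    """One step of the same recursion written for the log of the scale parameter."""
    ratio = gamma_ratio(gamma_, delta)
    return ((a - 1.0) * math.log(ratio) + a * math.log(lam_prev)
            + (b / ratio) * x_prev / lam_prev + beta_y)
```

Starting both recursions from matching values (mean equal to scale times the gamma ratio), one step of each should again differ exactly by that ratio, which is what the derivation asserts.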

If $Y_{i-1}$ is observable then it is possible, in the same way as in Engle and Russell (1998), to set up the log likelihood in closed form, and the parameters may be estimated by some numerical maximization algorithm. The model also allows for unobservables in $Y_{i-1}$, but then no closed form of the likelihood can be found. It can be calculated by numerical integration methods such as Markov Chain Monte Carlo. This is a rather elaborate process that is beyond the scope of the present exposition. The hazard function implied by the general model may now be written as

$$\theta_i(t) = \frac{\delta\, t^{\delta\gamma-1}}{\lambda_i^{\delta\gamma}\, \Gamma(\gamma)} \exp\left\{ -\left( \frac{t}{\lambda_i} \right)^{\delta} \right\} \frac{1}{1 - I\left(\gamma, (t/\lambda_i)^{\delta}\right)}$$

where $\lambda_i$ is given by (10) for the Engle-Russell-form and by (11) for the Nelson-form. This rather complicated expression simplifies considerably in the special cases of the Weibull and the exponential duration model. That is,

Weibull:

$$\theta_i(t) = (\lambda_i)^{-\delta}\, \delta\, t^{\delta-1}$$

Exponential:

$$\theta_i(t) = \lambda_i^{-1}$$
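The generalized gamma hazard and its two special cases can be compared numerically. The sketch below uses only the Python standard library and evaluates the regularized incomplete gamma function by a power series, which is an implementation choice made here for self-containment and is adequate only for moderate arguments. All function and argument names (`reg_lower_gamma`, `gg_hazard`, `gamma_`, `lam`, `delta`) are hypothetical.

```python
import math


def reg_lower_gamma(a: float, x: float, tol: float = 1e-15,
                    max_terms: int = 100000) -> float:
    """Regularized lower incomplete gamma P(a, x), by its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < tol * total:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def gg_hazard(t: float, gamma_: float, lam: float, delta: float) -> float:
    """Generalized gamma hazard: density divided by the survivor function."""
    z = (t / lam) ** delta
    log_density = (math.log(delta) + (delta * gamma_ - 1.0) * math.log(t)
                   - delta * gamma_ * math.log(lam) - math.lgamma(gamma_) - z)
    survivor = 1.0 - reg_lower_gamma(gamma_, z)
    return math.exp(log_density) / survivor
```

Setting `gamma_ = 1` should reproduce the closed-form Weibull hazard, and setting both shape parameters to one should give the constant exponential hazard; far in the right tail the subtraction `1 - P` loses precision, so a production implementation would use a dedicated upper-tail routine.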

3.2. Estimation

Using the prediction decomposition the log likelihood function is given by

$$\log L(\gamma, \delta, \omega, x \mid X_0) = \sum_{i=1}^{n} l_i(\gamma, \delta, \omega)$$

where

$$l_i(\gamma, \delta, \omega) = \ln(\delta) + (\delta\gamma - 1) \ln(x_i) - \delta\gamma \ln(\lambda_i) - \ln(\Gamma(\gamma)) - \left( \frac{x_i}{\lambda_i} \right)^{\delta}$$

which is to be understood such that the scale is a function of the deformation parameters, $\lambda_i = \lambda_i(\omega)$ where $\omega = (\beta, a, b)$.

3.3. Diagnostics

As in a regression model it is very useful to be able to generate some kind of residuals for assessing the goodness of fit. Generally, residuals with a unit exponential distribution are defined as follows. If $T_i$ has survivor function $S(t)$ then $S(T_i)$ is uniformly distributed and $-\ln(S(T_i))$ has a unit exponential distribution. Thus, for a duration $x_i$ its residual is defined as

$$\hat{\varepsilon}_i = -\ln(\hat{S}(x_i)) = \int_0^{x_i} \hat{\theta}_i(t)\, dt \tag{12}$$

These residuals are often called Cox-Snell residuals as they are derived from the general definition of residuals given by Cox and Snell (1968). Goodness of fit tests are then designed to examine whether

$$\{\hat{\varepsilon}_i,\ i = 1, \ldots, n\} \text{ is iid } \mathrm{EXP}(1).$$

The integral in (12) has a closed-form solution in the Weibull and exponential cases. It is easy to show that these are equal to $(x_i/\lambda_i)^{\delta}$ and $x_i/\lambda_i$ respectively. In the general case the integral must be solved numerically.
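To make the estimation and diagnostic steps concrete, the following Python sketch evaluates one generalized gamma log-likelihood contribution and the corresponding Cox-Snell residual. It is an illustration under assumptions rather than the estimation code used for the paper: the names are hypothetical, the scale `lam` is taken as already computed from the dynamic equation, and the incomplete gamma function is evaluated by a simple power series suited to moderate arguments only.

```python
import math


def reg_lower_gamma(a: float, x: float, tol: float = 1e-15,
                    max_terms: int = 100000) -> float:
    """Regularized lower incomplete gamma P(a, x), by its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < tol * total:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def gg_loglik_term(x: float, lam: float, gamma_: float, delta: float) -> float:
    """Log density of one duration x given its scale lam and the shape parameters."""
    return (math.log(delta) + (delta * gamma_ - 1.0) * math.log(x)
            - delta * gamma_ * math.log(lam) - math.lgamma(gamma_)
            - (x / lam) ** delta)


def cox_snell_residual(x: float, lam: float, gamma_: float, delta: float) -> float:
    """Cox-Snell residual: minus the log of the fitted survivor function at x."""
    survivor = 1.0 - reg_lower_gamma(gamma_, (x / lam) ** delta)
    return -math.log(survivor)
```

Summing `gg_loglik_term` over all durations, with `lam` updated by the chosen dynamic equation, gives the objective to maximize; the residuals from the fitted model can then be compared with a unit exponential sample.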

4. EMPIRICAL APPLICATION

The data are extracted from the Trade and Quote (TAQ) database. The TAQ database is a collection of all trades and quotes in New York Stock Exchange (NYSE), American Stock Exchange (AMEX), and National Association of Securities Dealers Automated Quotation (Nasdaq) securities. We only consider trades and quotes on the NYSE. Schwartz (1993) and Hasbrouck, Sofianos, and Sosebee (1993) document NYSE trading and quoting procedures. Among the fifty stocks with the highest capitalization value on December 13, 1996, seven stocks were randomly selected. The names and some summary statistics are given in Tables 1 and 2. The numbers of trades and price quotes are given in Table 1. Trades reported within the same second were treated as one trade. The sample period is the two months from August 4, 1997 to September 30, 1997, which gives a total of 42 trading days. The first duration every day is the duration from the second to the third trade that day. Thus, the NYSE opening duration is excluded from the analysis.

As discussed in several of the papers referred to earlier, the market exhibits high activity in the morning and before closure. Around lunchtime the activity is always lower. To handle this time-of-the-day effect, $E[x_i \mid t_{i-1}]$ was computed for every second of the day. This daily pattern was estimated by using cubic spline smoothing as discussed in chapters 1, 2 and 3 of Green and Silverman (1994). Some of the estimated splines can be seen in Lunde (1997). Using these estimated time-conditional means, the deterministic components were extracted, and we continue to model the transformed samples

$$\tilde{x}_i = \frac{x_i}{\mu(t_{i-1})}$$

where $\mu(t_{i-1})$ is the spline estimate of $E[x_i \mid t_{i-1}]$. Figure 1 shows nonparametric density estimates of the filtered transaction durations for the sampled stocks. Note that the shapes are inconsistent with the exponential distribution, as the density of the exponential distribution is monotonically decreasing.

In Tables 3 through 6 the results of estimating the models presented in the previous sections are reported. The Engle-Russell-form (10) is estimated for the exponential, the Weibull and the generalized gamma specification and the parameter estimates are given in Table 3. Three points are to be made. First, it is clear that the data do not support a reduction from the generalized gamma specification to a simpler distribution, as both $\delta$ and $\gamma$ are significantly different from one. This is also supported by the maximum likelihood values of Table 4. Hence there is strong evidence that the models applied by

Engle and Russell are misspecified. This brings us to the second point: what does the misspecification imply for the estimates of the dynamic parameters? Engle and Russell (1998) rely on a quasi-maximum likelihood (QML) argument to derive the asymptotic properties of the exponential ACD model. Comparing the estimates of $\beta$, $a$ and $b$ from the exponential and the Weibull specifications appears to support the QML approach strongly: as these are very close, the bias from using the simpler quasi likelihood function seems to be virtually zero. This is misleading, because when comparing with the generalized gamma estimates it becomes evident that the lack of flexibility (the misspecification) of the simpler distributions causes an upward bias of $a + b$ toward more persistence. This shows up in that five of the stocks have non-overlapping confidence bands for $a + b$, whereas the last two stocks have only marginally overlapping bands. One is tempted to talk about spurious persistence of the conditional mean, induced as the dynamics in the exponential or Weibull specification try to account for the lack of flexibility of these distributions.

The third point regards the values of the shape parameters$^{7}$ $\delta$ and $\gamma$. If these are placed into the taxonomy of hazard shapes, it may be concluded that an inverted U-shaped hazard is implied by the generalized gamma specification. This contrasts with the hazard shape resulting from parameters estimated under the simpler specifications. Figure 2 compares the hazard shapes implied for the SLB stock. The dynamic parameter $\lambda_i$ is kept constant at its mean value. It is striking to observe how large a difference the three specifications impose on the shape of the hazard. We return to this subject in conjunction with the Nelson-form.

The diagnostics of the Engle-Russell-form in Table 4 clearly show that the generalized gamma specification outperforms the specifications used by Engle and Russell (1998).
On all criteria the generalized gamma specification wins. It obtains the largest reduction of the Ljung-Box statistics, the smallest difference between the mean and the standard deviation, and the lowest value of the overdispersion statistic suggested by Engle and Russell (1998). Most importantly, the exponential and the Weibull specifications are grossly rejected in terms of likelihood ratio tests against the generalized gamma specification.

The results for the Nelson-form reported in Table 5 and Table 6 tell the same story. Consulting these tables reveals that all the points made for the Engle-Russell-form apply equally well to the Nelson-form. Hence the persistence parameter is upward biased in relation to the estimate obtained using the more flexible generalized gamma specification. Comparing the diagnostics of the two forms, the Nelson-form clearly performs best in terms of the maximum value of the likelihood function. Still, the comparison of the diagnostics is ambiguous. The conclusion is that the greatest advantage delivered by the Nelson-form is the computational convenience of not having to impose restrictions on the space of the dynamic parameters. In particular, this makes the estimation a lot easier when explanatory variables are included in the dynamic equation.

Above, it was demonstrated that the hazard functions implied by the three specifications are quite different. These could be compared with a nonparametric estimate of the hazard function. However, this is somewhat problematic because such estimates are very erratic in these datasets and have to be smoothed heavily before any conclusion can be drawn. Instead, a feasible approach is to compare the cumulative hazard function of the parametric specifications with a nonparametric estimate, which is done in Figure 3 for the Nelson-form using the SLB sample. The nonparametric estimator chosen here is the Nelson-Aalen estimator, see e.g. Klein and Moeschberger (1997). It is evident that the generalized gamma specification is the one most closely matching the nonparametric estimate.

$^{7}$ The standard error of $\delta$ was calculated using the delta method.

Graphically we may also compare the nonparametric density estimates of the filtered durations in Figure 1 with the densities selected by the parametric models. In Figure 4 this exercise is presented for the Nelson-form, with $\lambda_i$ fixed at its mean value. Again the generalized gamma specification comes closest to the nonparametric estimate. Note that the exponential and the Weibull specifications are fundamentally different from the data at the short durations.

Finally, it should be noted that it does not do the estimated models justice to compare them, with $\lambda_i$ fixed, to the unconditional estimates of the density or the cumulative hazard of the filtered durations. The reason is that when the ACD model has time-variation in the scale parameter, the unconditional distribution is an exponential, Weibull or generalized gamma scale mixture. Hence the presented figures are a crude approximation.

5. TIME-VARYING COVARIATE EFFECTS

One potential drawback of the models suggested in the literature so far is that they do not include explanatory variables which change within a duration. Obvious examples of such variables are prices, depth and the spreads quoted by the market maker. The model suggested above only allows the researcher to include values of such variables determined before the beginning of a particular duration. This is clearly of separate interest, but it could be argued that it is equally interesting to know the effect of a change in the spread on the expected waiting time to the next trade. This effect is summarized by the change in the mean residual lifetime at the time when the covariate changes. The mean residual lifetime at $x$ measures the expected remaining lifetime of the most recent duration, and it is defined as

$$\mathrm{mrl}(x) = E(X - x \mid X > x) = \frac{\int_x^{\infty} (t - x) f(t)\, dt}{S(x)} = \frac{\int_x^{\infty} S(t)\, dt}{S(x)}$$
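The second expression for the mean residual lifetime lends itself to a direct numerical check. The Python sketch below approximates the integral of the survivor function with the trapezoid rule; the function name `mean_residual_life`, the truncation point `upper` and the number of steps are assumptions made for this illustration only.

```python
import math
from typing import Callable


def mean_residual_life(survivor: Callable[[float], float], x: float,
                       upper: float, steps: int = 20000) -> float:
    """Approximate the integral of S(t) over [x, upper] divided by S(x).

    upper is an assumed truncation point that should lie far enough in the
    tail for the neglected part of the integral to be negligible.
    """
    if upper <= x:
        raise ValueError("upper must exceed x")
    h = (upper - x) / steps
    total = 0.5 * (survivor(x) + survivor(upper))
    for k in range(1, steps):
        total += survivor(x + k * h)
    return h * total / survivor(x)
```

For an exponential survivor function with rate `r`, the result should be close to `1 / r` whatever the value of `x`, reflecting the lack of memory of the exponential distribution.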

One way of allowing for time-varying covariate effects is to utilize the Cox regression approach suggested by Cox (1972) and detailed in Cox (1975). The Cox regression is a semiparametric method which in a first step estimates the effect of the covariates parametrically, and then in a second step estimates a nonparametric baseline hazard. This method assumes that the observations are independent, and it is not clear how one would extend it to the case of sequences of dependent durations. My objective is to extend the model developed above within its fully parametric framework. To accomplish this task the paper by Petersen (1986), entitled Fitting Parametric Survival Models with Time-Dependent Covariates, was very helpful, and we essentially merge this paper with the GG-ACD model. Because the formulation is considerably easier in terms of the hazard rate, we switch the focus from the conditional expectation to the conditional hazard rate.

To keep the notation as simple as possible the likelihood is first given for one nonnegative continuous random variable, $T$, which denotes some inter-event arrival time. The time passed waiting for the next event depends on exogenous covariates, $Y(t)$, which follow step-functions or deterministic continuous functions of time. We will only need $Y(t)$ to be of the step-function type. Hence the observed duration, $t$, may be partitioned into intervals $0 = a_0 < \ldots < a_k = t$, where the covariate stays constant at $Y(a_{j-1})$ in $[a_{j-1}, a_j)$ and jumps to $Y(a_j)$ at $a_j$, for $j = 1, \ldots, k$. The hazard rate is defined as

$$\theta(t \mid Y(t)) = \lim_{h \downarrow 0} \frac{P(t \le T < t + h \mid T \ge t, Y(t))}{h}$$

which gives the instantaneous probability of terminating the duration at time $t$, given that it was not terminated before $t$ and given the path of the covariates up to $t$. Using the constructed partitioning sequence $\{a_j\}_0^k$, it becomes straightforward to write the cumulative hazard as

$$\Theta(t) = \int_0^t \theta(u \mid Y(u))\, du = \sum_{j=1}^{k} \int_{a_{j-1}}^{a_j} \theta(u \mid Y(a_{j-1}))\, du$$

It is clear that the assumption of step-function covariates makes the integration easy, because the total duration can always be partitioned into sub-periods within which all the covariates stay constant. As usual the survivor function is given by

$$S(t) = \exp(-\Theta(t))$$
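The partitioning argument is easiest to see in the special case where the hazard is constant within each sub-period, as it is for exponential durations. The Python sketch below covers only that special case; the function names and the representation of the partition as a list of break points with one hazard level per interval are assumptions made for illustration.

```python
import math
from typing import Sequence


def cumulative_hazard_step(breaks: Sequence[float],
                           hazards: Sequence[float]) -> float:
    """Cumulative hazard when the hazard is constant on each sub-period.

    breaks holds the partition a_0 < ... < a_k of the duration, and
    hazards[j] is the hazard level on the interval from breaks[j] to
    breaks[j + 1].
    """
    if len(breaks) != len(hazards) + 1:
        raise ValueError("need one more break point than hazard levels")
    total = 0.0
    for j, level in enumerate(hazards):
        width = breaks[j + 1] - breaks[j]
        if width <= 0:
            raise ValueError("break points must be strictly increasing")
        total += level * width
    return total


def survivor_step(breaks: Sequence[float], hazards: Sequence[float]) -> float:
    """Survivor function at the last break point: exp of minus the cumulative hazard."""
    return math.exp(-cumulative_hazard_step(breaks, hazards))
```

With a hazard that varies inside each sub-period, the product `level * width` would be replaced by the integral of the hazard over that sub-period, but the summation over sub-periods stays the same.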

If we had right-censored observations then these would contribute to the likelihood with the survivor function, while non-censored observations enter with the density function given by the product of the hazard function and the survivor function. With $d_i = 1$ for a censored observation and $d_i = 0$ otherwise, that is

$$L_i = \left[ \theta(t_i \mid Y(t_i))\, S(t_i) \right]^{1 - d_i} S(t_i)^{d_i}$$

and the log likelihood reduces to

$$\ln L_i = (1 - d_i) \ln \theta(t_i \mid Y(t_i)) + \ln S(t_i) = (1 - d_i) \ln \theta(t_i \mid Y(t_i)) - \sum_{j=1}^{k} \int_{a_{j-1}}^{a_j} \theta(u \mid Y(a_{j-1}))\, du$$

Assume that we have a sample of events, like trades, at times $t_0, \ldots, t_i, t_{i+1}, \ldots, t_N$. Consider the duration from the $t_{i-1}$'th to the $t_i$'th trade, denoted $x_i = t_i - t_{i-1}$. Suppose covariates affecting $x_i$ change at times $a_{i0} < \ldots < a_{ik} \in [t_{i-1}, t_i]$ with $a_{i0} = t_{i-1}$ and $a_{ik} = t_i$, such that $i$ refers to the duration number and $k$ refers to the partition. Now the log likelihood for the full sample (assuming no censored observations) is given by

$$\ln L = \sum_{i=1}^{N} \left( \ln \theta(x_i \mid Y(a_{i,k-1})) - \sum_{j=1}^{k} \int_{a_{i,j-1}}^{a_{ij}} \theta(u \mid Y(a_{i,j-1}))\, du \right)$$

and the model is now closed by specifying the functional form of the hazard function. The specification of the hazard function will be stated starting with the generalized gamma form, and then simplified to the distributions embedded in this. Put the scale parameter $\phi_i(t) = \phi(t - t_{i-1} \mid Y(t))$ and assume that it evolves according to the following difference equation in logs

$$\ln\phi_i(t) = (a-1)\ln\rho(\alpha,\delta) + a\ln\phi_{i-1}(x_{i-1}) + \rho(\alpha,\delta)\,b\,x_{i-1}\,\phi_{i-1}(x_{i-1}) + \gamma' Y(t), \qquad (13)$$

such that the scale only changes with $t$ through the changes in the covariates. Remember that $\rho(\alpha,\delta)$ is the factor which keeps the parameters compatible across the different duration densities within the generalized gamma family. The hazard function of the generalized gamma form can now be written as

$$\lambda(t - t_{i-1} \mid Y(t)) = \frac{\dfrac{\phi_i(t)\,\delta}{\Gamma(\alpha)}\,\bigl(t\,\phi_i(t)\bigr)^{\delta\alpha-1}\exp\!\bigl(-(t\,\phi_i(t))^{\delta}\bigr)}{1 - \mathrm{IGG}(\alpha,\delta,(t - t_{i-1}))}$$

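where $\mathrm{IGG}[\cdot,\cdot,\cdot]$ is the generalized gamma analog of the incomplete gamma function. Hence, utilizing the step-function assumption results in the following form of the log likelihood

$$\ln L = \sum_{i=1}^{N}\left\{\ln\left[\frac{\dfrac{\phi_i(a_{i,k-1})\,\delta}{\Gamma(\alpha)}\,\bigl(x_i\,\phi_i(a_{i,k-1})\bigr)^{\delta\alpha-1}\exp\!\bigl(-(x_i\,\phi_i(a_{i,k-1}))^{\delta}\bigr)}{1 - \mathrm{IGG}(\alpha, x_i)}\right] - \sum_{j=1}^{k}\int_{a_{i,j-1}}^{a_{ij}}\frac{\dfrac{\phi_i(a_{i,j-1})\,\delta}{\Gamma(\alpha)}\,\bigl(u\,\phi_i(a_{i,j-1})\bigr)^{\delta\alpha-1}\exp\!\bigl(-(u\,\phi_i(a_{i,j-1}))^{\delta}\bigr)}{1 - \mathrm{IGG}(\alpha, u)}\,du\right\}$$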

which has to be computed numerically. Simplification to the Weibull case and further to the exponential case delivers closed-form expressions for the log likelihood. In the Weibull case we have that

$$\ln L = \sum_{i=1}^{N}\left(\ln\Bigl(\phi_i(a_{i,k-1})\,\delta\,\bigl[\phi_i(a_{i,k-1})\,x_i\bigr]^{\delta-1}\Bigr) - \sum_{j=1}^{k}\bigl(a_{ij}^{\delta} - a_{i,j-1}^{\delta}\bigr)\,\phi_i(a_{i,j-1})^{\delta}\right)$$

and the exponential case easily follows with $\delta = 1$. Returning to the mean residual lifetime, it is clear that this changes with the covariates through the hazard function. In the very simple exponential case the mean residual lifetime is given by

$$\mathrm{mrl}_i(x) = \frac{\int_x^{\infty}\exp(-\phi_i(x)\,u)\,du}{e^{-\phi_i(x)\,x}} = \phi_i(x)^{-1}.$$
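The mean residual lifetime can also be approximated numerically, which is convenient where no simple closed form is available. The Python sketch below integrates a survivor function with a composite Simpson rule. The helper name, the truncation point `upper` and the rate value 2.0 are assumptions made for illustration only.

```python
import math


def mean_residual_life(survivor, x, upper, n=200_000):
    """Approximate int_x^upper S(u) du / S(x) with composite Simpson.

    `upper` truncates the infinite integral, so it must be chosen large
    enough for S(upper) to be negligible.
    """
    if n % 2:
        n += 1                      # Simpson needs an even number of panels
    h = (upper - x) / n
    total = survivor(x) + survivor(upper)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * survivor(x + k * h)
    return total * h / 3.0 / survivor(x)


# Exponential case with a hypothetical scale of 2.0: the result should be 1/2
# regardless of the elapsed time x.
mrl_exp = mean_residual_life(lambda u: math.exp(-2.0 * u), x=0.7, upper=30.7)
```

Passing a Weibull survivor function such as `lambda u: math.exp(-(scale * u) ** delta)` gives the analogous quantity for the Weibull specification.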

The expressions for the Weibull and generalized gamma specifications are more complex but quite analogous.

5.1. Application

In this section the effect of introducing time-varying covariates is illustrated using the datasets of section 4. In Engle and Lunde (1998) a model for durations between transactions is estimated as part of a bivariate model for trade-trade and trade-quote durations. In this model several explanatory variables are included as regressors, but kept constant within durations even though they actually change before the particular duration terminates. Two such variables are the change in the spread and an equally weighted moving average of ten spread levels. Denote these variables $\Delta\mathrm{spr}(t)$ and $\mathrm{lev.spr}(t)$, respectively. In the model estimated by Engle and Lunde (1998) these two variables do not have much effect. This could be because the information given by new quotes is delayed until the beginning of the next duration, or the variables might be less correlated when incorporating the time-variation feature.

The model defined by the hazard function (13) was estimated both with and without time-varying covariates. The results for the exponential and the Weibull specifications were almost identical, hence only the exponential specification is reported and discussed. In this case we have $\alpha = \delta = 1$, which implies that $\rho(\alpha,\delta) = 1$. Hence the scale parameter is equal to the hazard function and reads

$$\ln\phi_i(t) = a\ln\phi_{i-1}(x_{i-1}) + b\,x_{i-1}\,\phi_{i-1}(x_{i-1}) + \gamma_0 + \gamma_1\,\Delta\mathrm{spr}(t) + \gamma_2\,\mathrm{lev.spr}(t).$$

When time-varying covariates are not allowed this expression is just

$$\ln\phi_i(t) = a\ln\phi_{i-1}(x_{i-1}) + b\,x_{i-1}\,\phi_{i-1}(x_{i-1}) + \gamma_0 + \gamma_1\,\Delta\mathrm{spr}(t_{i-1}) + \gamma_2\,\mathrm{lev.spr}(t_{i-1}),$$

and hence $\Delta\mathrm{spr}$ and $\mathrm{lev.spr}$ are held constant at their values at time $t_{i-1}$ over the entire duration.

The results presented in Table 7 are rather disappointing. The inclusion of the time-varying feature gives even less reason to believe that these spread variables have much ability to explain the variation in the trading intensity.

5.2. A Price Duration Model

In financial econometrics a major research agenda is concerned with modelling and predicting volatility. In Engle and Russell (1998) it is demonstrated how the volatility is linked to the durations between price changes. They show that the instantaneous volatility can be written as

$$\sigma^2(t \mid H_t) = \left(\frac{c}{P(t)}\right)^{2}\lambda(t \mid H_t)$$

where $P(t)$ is the price of the asset at time $t$. This motivates the interest in finding good models for the hazard rate. The durations used for this model are waiting times between adjacent mid-quote revisions, that is, the time between changes of $(\mathrm{Bid} + \mathrm{Ask})/2$. As explanatory variables for this model we use the square root of the volume of the most recent transaction, and an equally weighted average of the 20 most recent volumes. For a survey of the volume-price relationship see Karpoff (1987) or Goodhart and O'Hara (1997).
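The link between the hazard of price durations and volatility is a one-line computation. The Python sketch below simply evaluates the formula above. The section does not define $c$, so treating it as the price-change threshold that defines a price duration is an assumption of this sketch, and the numbers are purely hypothetical.

```python
def instantaneous_variance(hazard, price, c):
    """sigma^2(t | H_t) = (c / P(t))^2 * hazard(t | H_t).

    hazard -- hazard rate of the price duration at time t
    price  -- asset price P(t), must be positive
    c      -- assumed here to be the price-change threshold
    """
    if price <= 0:
        raise ValueError("price must be positive")
    return (c / price) ** 2 * hazard


# Hypothetical inputs: threshold of 1/8, price of 50, hazard of 0.01 per second.
sigma2 = instantaneous_variance(hazard=0.01, price=50.0, c=0.125)
```

The result is expressed per unit of the time scale on which the hazard is measured, so a hazard per second yields a variance per second.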

The models to be estimated are given as follows

$$\ln\phi_i(t) = a\ln\phi_{i-1}(x_{i-1}) + b\,x_{i-1}\,\phi_{i-1}(x_{i-1}) + \gamma_0 + \gamma_1\sqrt{\mathrm{vol}(t)},$$

and as above when time-varying covariates are not allowed this expression is just

$$\ln\phi_i(t) = a\ln\phi_{i-1}(x_{i-1}) + b\,x_{i-1}\,\phi_{i-1}(x_{i-1}) + \gamma_0 + \gamma_1\sqrt{\mathrm{vol}(t_{i-1})}.$$
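To make the recursion concrete, the following Python sketch evaluates the log likelihood of the exponential specification with one step-function covariate. It is a minimal illustration under stated assumptions, not the estimation code used in the paper: the lagged scale is taken to be the level on the last sub-interval of the previous duration, the start-up values `phi0` and `x0` are arbitrary, and all names are hypothetical. For the price duration model the covariate values passed in would be square roots of volumes.

```python
import math


def exp_tv_loglik(durations, paths, a, b, g0, g1, phi0=1.0, x0=1.0):
    """Exponential-case log likelihood with a step-function covariate.

    durations -- x_1, ..., x_N
    paths     -- paths[i] is a list of (offset, value) pairs: the covariate
                 equals `value` from `offset` (time since the duration began)
                 until the next offset; the first offset must be 0
    phi0, x0  -- assumed start-up values for the lagged scale and duration
    """
    loglik = 0.0
    phi_prev, x_prev = phi0, x0
    for x, path in zip(durations, paths):
        # Part of ln(phi_i) that stays fixed within the duration.
        base = a * math.log(phi_prev) + b * x_prev * phi_prev + g0
        ends = [offset for offset, _ in path[1:]] + [x]
        cum_hazard = 0.0
        for (start, value), end in zip(path, ends):
            phi = math.exp(base + g1 * value)   # scale = hazard on [start, end)
            cum_hazard += phi * (end - start)
        # phi now holds the level on the last sub-interval of this duration.
        loglik += math.log(phi) - cum_hazard
        phi_prev, x_prev = phi, x
    return loglik


# One duration of length 1 whose covariate jumps from 0 to 1 at offset 0.4.
example = exp_tv_loglik([1.0], [[(0.0, 0.0), (0.4, 1.0)]],
                        a=0.0, b=0.0, g0=0.0, g1=math.log(3.0))
```

Restricting every path to a single pair `(0.0, value)` reproduces the case in which covariates are held constant within durations.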

In Table 9 the estimated parameters are reported. First, it is interesting to note that the effect of volume would be overlooked if the covariates were held constant within durations. When permitting time-varying covariates we see that the effect of a transaction with a large volume is to make the next price change come sooner. Large transactions are often related to information-based trading in the market microstructure literature. Hence our result suggests that information-based trading in the presence of large volume tends to make price revisions more frequent.

6. CONCLUSION

In this paper I have extended the ACD model of Engle and Russell (1998) to generalized gamma durations with a conditional mean that depends upon the exponential function of the explanatory variables. This allows a non-monotonic hazard function taking, for instance, bathtub-shaped or inverted U-shaped forms. The inverted U-shaped form is strongly supported by the data. It was found that for most of the considered stocks the ACD persistence is reduced significantly when allowing the more general duration distribution. The suggested generalization appears to outperform the models employed by Engle and Russell (1998) with respect to the design criteria.

As a further extension of the model, it was shown how to include time-varying covariates in a fully parametric framework. It was found that permitting the covariates to vary over durations might dramatically change the significance and effect of these. By using a model for durations between price changes it was demonstrated that the effect of a transaction with a large volume is to make the subsequent price change appear earlier than otherwise. This was interpreted as a prediction that information-based trading in the presence of large volume tends to make price revisions more frequent.

Altogether we have provided an improved model that may be used for testing economic hypotheses about the market microstructure, as suggested in Engle (1996), Engle and Russell (1997) and Engle and Russell (1998).
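The inverted U-shaped hazard can be checked numerically. The Python sketch below assumes the generalized gamma density $f(t) = \delta\phi(\phi t)^{\delta\alpha-1}\exp(-(\phi t)^{\delta})/\Gamma(\alpha)$, a common parameterization that may differ from the one used in the paper by a rescaling, and evaluates the hazard with a hand-written power series for the regularized incomplete gamma function. The values 4.0571 and 0.4483 are the SLB GENGAM estimates as read from Table 3, setting the scale to one is an arbitrary choice, and the function names are hypothetical.

```python
import math


def reg_lower_gamma(a, x, tol=1e-15, max_terms=10_000):
    """Regularized lower incomplete gamma P(a, x) via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > tol * abs(total) and n < max_terms:
        n += 1
        term *= x / (a + n)
        total += term
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def gg_hazard(t, scale, alpha, delta):
    """Hazard = density / survivor under the assumed parameterization."""
    y = (scale * t) ** delta
    log_density = (math.log(delta) + math.log(scale)
                   + (delta * alpha - 1.0) * math.log(scale * t)
                   - y - math.lgamma(alpha))
    return math.exp(log_density) / (1.0 - reg_lower_gamma(alpha, y))


# Hazard on a grid, using the SLB estimates as read from Table 3.
grid = [0.5 * k for k in range(1, 121)]
hazards = [gg_hazard(t, 1.0, 4.0571, 0.4483) for t in grid]
peak = max(range(len(hazards)), key=hazards.__getitem__)
```

With these values the largest hazard on the grid occurs strictly inside it, so the hazard first rises and then falls. Setting `alpha = 1` recovers the Weibull hazard, and `alpha = delta = 1` the constant exponential hazard.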

REFERENCES

Bergström, R., and P.-A. Edin, 1992, Time aggregation and the distributional shape of unemployment duration, Journal of Applied Econometrics, 7, 5–30.

Cox, D. R., and E. J. Snell, 1968, A General Definition of Residuals, Journal of the Royal Statistical Society B, 30, 248–75.

Cox, D. R., 1955, Some statistical methods connected with series of events (with discussion), Journal of the Royal Statistical Society B, 17(2), 129–164.

Cox, D. R., 1972, Regression models and life-tables (with discussion), Journal of the Royal Statistical Society B, 34, 187–220.

Cox, D. R., 1975, Partial Likelihood, Biometrika, 62(2), 269–276.

Engle, R. F., 1996, The econometrics of ultra-high frequency data, Department of Economics, University of California, San Diego Working paper 96-15, committed to appear in Econometrica.

Engle, R. F., and A. Lunde, 1998, Trades and quotes: A bivariate point process, Department of Economics, University of California, San Diego.

Engle, R. F., and J. Russell, 1995, Forecasting transaction rates: The autoregressive conditional duration model, UCSD working paper.

Engle, R. F., and J. Russell, 1997, Forecasting the frequency of changes in quoted foreign exchange prices with the autoregressive conditional duration model, Journal of Empirical Finance, 4, 187–212.

Engle, R. F., and J. Russell, 1998, Autoregressive conditional duration: a new model for irregularly spaced transaction data, Econometrica, 66(5), 1127–1163.

Ghysels, E., C. Gourieroux, and J. Jasiak, 1998, Stochastic volatility duration models, CIRANO.

Ghysels, E., and J. Jasiak, 1998, GARCH for irregularly spaced financial data: The ACD-GARCH model, Studies in Nonlinear Dynamics and Econometrics, 2, 133–149.

Glaser, R. E., 1980, Bathtub and related failure rate characterizations, Journal of the American Statistical Association, 75(371), 667–672.

Goodhart, C. A. E., and M. O'Hara, 1997, High frequency data in financial markets: Issues and applications, Journal of Empirical Finance, 4, 73–114.

Green, P. J., and B. W. Silverman, 1994, Nonparametric Regression and Generalized Linear Models, Chapman & Hall.

Hasbrouck, J., G. Sofianos, and D. Sosebee, 1993, Orders, Trades, Reports and Quotes at the New York Stock Exchange, Discussion paper, NYSE, Research and Planning Section.

Jaggia, S., 1991, Specification tests based on the heterogeneous generalized gamma model of duration: With an application to Kennan's strike data, Journal of Applied Econometrics, 6, 169–180.

Karpoff, J. M., 1987, The relation between price changes and trading volume: A survey, Journal of Financial and Quantitative Analysis, 22(1), 109–126.

Klein, J. P., and M. L. Moeschberger, 1997, Survival Analysis: Techniques for Censored and Truncated Data, Springer-Verlag New York.

Lancaster, T., 1990, The Econometric Analysis of Transition Data, Cambridge University Press.

Lunde, A., 1997, A conjugate gamma model for durations in transactions, Working paper, Dept. of Economics, University of Aarhus.

McDonald, J. B., and R. J. Butler, 1990, Regression models for positive random variables, Journal of Econometrics, 43, 227–251.

Nelson, D. B., 1991, Conditional heteroskedasticity in asset returns: A new approach, Econometrica, 59(2), 347–370.

Petersen, T., 1986, Fitting parametric survival models with time-dependent covariates, Applied Statistics, 35(3), 281–288.

Schwartz, R. A., 1993, Reshaping the Equity Markets, Business One Irwin.

Stacy, E. W., 1962, A generalization of the gamma distribution, Annals of Mathematical Statistics, 33, 1187–1192.

Tunali, I., and R. Assaad, 1992, Market structure and spell of employment and unemployment: Evidence from the construction sector in Egypt, Journal of Applied Econometrics, 7, 339–367.

Wold, H., 1948, On stationary point processes and Markov chains, Skand. Aktuar., 17, 229–240.

Yamaguchi, K., 1992, Accelerated failure-time regression models with a regression model of surviving fraction: An application to the analysis of "Permanent employment" in Japan, Journal of the American Statistical Association, 87(418), 284–292.

APPENDIX A: TABLES AND FIGURES

Table 1. Selected NYSE stocks

Company                          Symbol  #Shares   Value   #Trades  #Quotes
Disney (Walt) Company (The)      DIS         682   47,496    28390    15344
Federal National Mortgage Ass.   FNM        1129   42,053    24910    10806
General Motors Corporation       GM          757   42,182    32618    14067
BankAmerica Corporation          BAC         387   38,632    34764    19184
McDonald's Corporation           MCD         830   37,572    24720     9513
Monsanto Company                 MTC         822   31,954    25324    11587
Schlumberger Limited             SLB         309   30,848    27787    18193

Table 1 shows the eight randomly selected stocks from the fifty leading NYSE stocks in market value, as of December 31, 1996. Shares and value are in millions.

Table 2. Summary Statistics

Firm  Av. tr. dura.  Av. price  Av. size  Av. qu. dura.  Av. midqu.  Av. spread(%)
DIS           34.27      78.72   1473.11          24.76       78.73           0.15
FNM           38.93      45.53   2915.59          28.09       45.51           0.20
GM            29.88      64.92   2459.70          24.97       64.91           0.15
BAC           27.99      71.30   1840.85          20.47       71.25           0.17
MCD           39.29      48.61   2623.67          35.03       48.81           0.19
MTC           38.00      42.81   2950.54          32.01       43.23           0.27
SLB           34.96      77.71   1658.58          24.06       77.55           0.18

Table 2 gives summary statistics for the datasets. Av. tr. dura. is the average length of the durations between successive trades measured in seconds. The next column gives the average transaction price, and the fourth column lists the average number of shares traded. Av. qu. dura. is the average length of the durations between successive quotes posted by the market maker measured in seconds. Av. midqu. reports the average of the mid point of the bid/ask quotes. The av. spread is the average of the percentage spread calculated as the log of the ask price divided by the log bid price times 100.

Table 3. Estimates for the Engle-Russell-form (9)

Null hypotheses: $\gamma_0 = 0$, $a = 1$, $b = 0$, $\delta = 1$, $\alpha = 1$, $\delta\alpha = 1$. A dash marks a parameter that is not part of the model.

Firm  Model    gamma0        a             b              delta            alpha          delta*alpha    C.B. 95% (a+b)
DIS   EXPON    0.0106(3.61)  0.9594(5.98)  0.0301(7.26)   -                -              -              [0.9837:0.9953]
DIS   WEIBULL  0.0107(3.71)  0.9585(6.08)  0.0303(7.39)   0.9227(-21.51)   -              -              [0.9829:0.9948]
DIS   GENGAM   0.0008(4.99)  0.9099(8.64)  0.0499(12.83)  0.4121(-56.85)   4.4329(16.47)  1.8270(20.17)  [0.9455:0.9741]
FNM   EXPON    0.0119(3.52)  0.9640(5.42)  0.0241(6.72)   -                -              -              [0.9814:0.9948]
FNM   WEIBULL  0.0118(3.53)  0.9640(5.41)  0.0240(6.72)   0.9708(-6.88)    -              -              [0.9813:0.9947]
FNM   GENGAM   0.0019(5.65)  0.8969(6.77)  0.0447(11.52)  0.4491(-39.17)   4.1086(13.06)  1.8453(16.87)  [0.9174:0.9658]
GM    EXPON    0.0060(4.03)  0.9680(7.40)  0.0260(8.44)   -                -              -              [0.9911:0.9970]
GM    WEIBULL  0.0060(4.05)  0.9682(7.39)  0.0258(8.44)   0.9653(-8.89)    -              -              [0.9910:0.9969]
GM    GENGAM   0.0008(6.30)  0.9530(8.87)  0.0333(11.22)  0.4856(-43.98)   3.5067(16.33)  1.7029(20.37)  [0.9810:0.9915]
BAC   EXPON    0.0206(4.11)  0.9386(6.20)  0.0409(7.82)   -                -              -              [0.9697:0.9894]
BAC   WEIBULL  0.0205(4.18)  0.9381(6.26)  0.0409(7.91)   0.9506(-14.02)   -              -              [0.9691:0.9889]
BAC   GENGAM   0.0005(5.17)  0.8713(8.34)  0.0625(12.93)  0.3805(-79.50)   5.5723(21.19)  2.1202(28.14)  [0.9115:0.9560]
MCD   EXPON    0.0066(3.34)  0.9686(5.96)  0.0249(7.00)   -                -              -              [0.9896:0.9974]
MCD   WEIBULL  0.0066(3.34)  0.9685(5.99)  0.0250(7.01)   1.0204(4.53)     -              -              [0.9896:0.9974]
MCD   GENGAM   0.0017(5.55)  0.9431(6.08)  0.0374(8.38)   0.5308(-26.56)   3.2565(11.48)  1.7286(15.28)  [0.9702:0.9909]
MTC   EXPON    0.0022(3.48)  0.9703(8.20)  0.0276(8.63)   -                -              -              [0.9966:0.9993]
MTC   WEIBULL  0.0023(3.55)  0.9701(8.21)  0.0278(8.66)   0.9775(-5.18)    -              -              [0.9964:0.9992]
MTC   GENGAM   0.0005(5.69)  0.9537(9.90)  0.0392(10.91)  0.5028(-36.42)   3.3653(14.12)  1.6921(17.57)  [0.9901:0.9956]
SLB   EXPON    0.0039(3.79)  0.9727(7.59)  0.0234(8.43)   -                -              -              [0.9940:0.9982]
SLB   WEIBULL  0.0039(3.84)  0.9727(7.58)  0.0233(8.41)   0.9556(-10.99)   -              -              [0.9939:0.9981]
SLB   GENGAM   0.0005(6.30)  0.9435(7.79)  0.0394(10.05)  0.4483(-52.44)   4.0571(17.49)  1.8188(22.24)  [0.9757:0.9901]

Table 3 gives the estimates of the Engle-Russell-form (9). T-statistics are given in parentheses. Numbers in italic boldface are significant at the 99% level, numbers in normal font are significant at the 95% level. The numbers typed with very small types are insignificant. For $a$, $\delta$, $\alpha$ and $\delta\alpha$ the null is equality to one.

Table 4. Diagnostics for the Engle-Russell-form (9)

Firm  Model    LB(X)    LB(eps)  E(eps)  St(eps)  |St(eps)-E(eps)|  E-R EDT  Max Like
DIS   EXPON     898.5     14.4   0.9998   1.1764            0.1765    22.86  -27798.08
DIS   WEIBULL   898.5     15.3   0.9998   1.0795            0.0797     9.85  -27629.70
DIS   GENGAM    898.5     10.0   1.0036   0.9963            0.0073    -0.44  -26923.65
FNM   EXPON     582.3     14.9   0.9999   1.1160            0.1162    13.70  -24570.71
FNM   WEIBULL   582.3     15.1   0.9999   1.0807            0.0809     9.37  -24551.29
FNM   GENGAM    582.3     12.7   1.0026   0.9899            0.0127    -1.12  -24032.75
GM    EXPON    1427.2     34.8   1.0002   1.1342            0.1339    18.28  -31795.03
GM    WEIBULL  1427.2     36.6   1.0002   1.0899            0.0897    12.00  -31758.70
GM    GENGAM   1427.2     35.9   1.0018   0.9946            0.0072    -0.69  -31133.38
BAC   EXPON    1438.6     34.0   1.0003   1.1701            0.1699    24.34  -34022.42
BAC   WEIBULL  1438.6     34.2   1.0002   1.1052            0.1050    14.59  -33940.47
BAC   GENGAM   1438.6     22.0   1.0040   1.0036            0.0004     0.48  -32849.35
MCD   EXPON    1009.6     40.7   1.0000   1.0453            0.0453     5.15  -24223.57
MCD   WEIBULL  1009.6     40.1   1.0000   1.0679            0.0679     7.81  -24214.93
MCD   GENGAM   1009.6     29.5   1.0017   0.9854            0.0163    -1.61  -23821.06
MTC   EXPON    2584.7     32.6   1.0004   1.1155            0.1151    13.74  -23855.02
MTC   WEIBULL  2584.7     33.1   1.0003   1.0877            0.0874    10.30  -23843.33
MTC   GENGAM   2584.7     26.4   1.0016   1.0010            0.0006     0.12  -23358.05
SLB   EXPON    1419.7     61.5   1.0001   1.1556            0.1555    19.76  -26928.95
SLB   WEIBULL  1419.7     63.9   1.0001   1.0982            0.0981    12.13  -26877.03
SLB   GENGAM   1419.7     47.0   1.0030   1.0067            0.0037     0.79  -26188.08

Table 4 gives diagnostics for the Engle-Russell-form (9). LB is the Ljung-Box statistic with 15 lags. E-R EDT is the Engle and Russell test for excess dispersion.
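The excess dispersion entries can be approximately reproduced from the reported residual standard deviations. The Python sketch below assumes that the statistic takes the form $\sqrt{N}(\hat\sigma^2_{\varepsilon} - 1)/\sqrt{8}$, which is how the Engle and Russell test is usually stated, and that $N$ can be approximated by the number of trades in Table 1. Both are assumptions of this sketch: the exact number of durations after filtering is not given here, so small discrepancies are to be expected.

```python
import math


def excess_dispersion_stat(n_obs, resid_std):
    """sqrt(N) * (variance of residuals - 1) / sqrt(8).

    n_obs     -- number of durations (approximated by #Trades, an assumption)
    resid_std -- standard deviation of the residuals, St(eps) in the table
    """
    return math.sqrt(n_obs) * (resid_std ** 2 - 1.0) / math.sqrt(8.0)


# DIS with the exponential specification: N = 28390, St(eps) = 1.1764.
dis_expon = excess_dispersion_stat(28390, 1.1764)
# DIS with the generalized gamma specification: St(eps) = 0.9963.
dis_gengam = excess_dispersion_stat(28390, 0.9963)
```

Under these assumptions the two values land close to the tabulated 22.86 and -0.44, which suggests that overdispersion largely disappears under the generalized gamma specification.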

Table 5. Estimates for the Nelson-form (11)

Null hypotheses: $\gamma_0 = 0$, $a = 1$, $b = 0$, $\delta = 1$, $\alpha = 1$, $\delta\alpha = 1$. A dash marks a parameter that is not part of the model.

Firm  Model    gamma0           a             b              delta            alpha          delta*alpha    C.B. 95% (a)
DIS   EXPON    -0.0296(-7.91)   0.9877(4.02)  0.0293(7.97)   -                -              -              [0.9817:0.9937]
DIS   WEIBULL  -0.0309(-7.88)   0.9872(4.11)  0.0295(8.11)   0.9228(-21.47)   -              -              [0.9811:0.9933]
DIS   GENGAM   -0.6919(-5.13)   0.9442(5.92)  0.0482(13.53)  0.3430(-58.77)   6.3023(13.57)  2.1619(18.03)  [0.9257:0.9627]
FNM   EXPON    -0.0242(-6.92)   0.9868(3.56)  0.0240(6.95)   -                -              -              [0.9796:0.9941]
FNM   WEIBULL  -0.0245(-6.86)   0.9867(3.57)  0.0240(6.95)   0.9709(-6.88)    -              -              [0.9794:0.9940]
FNM   GENGAM   -0.5693(-1.76)   0.9411(2.78)  0.0397(7.76)   0.3966(-11.95)   5.2061(3.34)   2.0646(4.49)   [0.8996:0.9826]
GM    EXPON    -0.0252(-8.79)   0.9928(4.34)  0.0250(8.81)   -                -              -              [0.9896:0.9961]
GM    WEIBULL  -0.0253(-8.74)   0.9928(4.35)  0.0248(8.81)   0.9654(-8.87)    -              -              [0.9896:0.9960]
GM    GENGAM   -0.1443(-5.85)   0.9832(6.31)  0.0312(12.53)  0.4530(-43.77)   3.9952(14.83)  1.8100(19.09)  [0.9780:0.9884]
BAC   EXPON    -0.0397(-8.14)   0.9773(4.21)  0.0392(8.22)   -                -              -              [0.9667:0.9878]
BAC   WEIBULL  -0.0409(-8.07)   0.9768(4.19)  0.0392(8.31)   0.9507(-14.00)   -              -              [0.9662:0.9874]
BAC   GENGAM   -1.1496(-2.67)   0.9251(4.80)  0.0564(12.31)  0.3083(-20.72)   8.3804(4.18)   2.5837(5.98)   [0.8945:0.9557]
MCD   EXPON    -0.0248(-7.41)   0.9922(3.64)  0.0247(7.44)   -                -              -              [0.9880:0.9964]
MCD   WEIBULL  -0.0247(-7.45)   0.9923(3.60)  0.0247(7.45)   1.0205(4.56)     -              -              [0.9881:0.9964]
MCD   GENGAM   -0.1666(-4.25)   0.9773(4.80)  0.0356(10.13)  0.4967(-26.55)   3.6859(10.43)  1.8310(14.12)  [0.9680:0.9865]
MTC   EXPON    -0.0272(-8.92)   0.9967(3.74)  0.0270(9.00)   -                -              -              [0.9950:0.9984]
MTC   WEIBULL  -0.0274(-8.95)   0.9966(3.81)  0.0271(9.03)   0.9775(-5.18)    -              -              [0.9948:0.9983]
MTC   GENGAM   -0.1038(-6.42)   0.9891(6.03)  0.0380(11.88)  0.4792(-40.21)   3.6826(14.72)  1.7648(18.76)  [0.9856:0.9927]
SLB   EXPON    -0.0234(-8.77)   0.9950(4.00)  0.0232(8.79)   -                -              -              [0.9925:0.9974]
SLB   WEIBULL  -0.0235(-8.71)   0.9949(4.06)  0.0230(8.78)   0.9558(-10.95)   -              -              [0.9924:0.9973]
SLB   GENGAM   -0.2491(-1.90)   0.9764(2.72)  0.0380(6.40)   0.3967(-13.32)   5.1217(3.70)   2.0319(4.90)   [0.9594:0.9934]

Table 5 gives the estimates for the Nelson-form (11). T-statistics are given in parentheses. Numbers in italic boldface are significant at the 99% level, numbers in normal font are significant at the 95% level. The numbers typed with very small types are insignificant. For $a$, $\delta$, $\alpha$ and $\delta\alpha$ the null is equality to one.

Table 6. Diagnostics for the Nelson-form (11)

Firm  Model    LB(X)    LB(eps)  E(eps)  St(eps)  |St(eps)-E(eps)|  E-R EDT  Max Like
DIS   EXPON     898.5     13.0   0.9998   1.1759            0.1760    22.79  -27793.62
DIS   WEIBULL   898.5     14.1   0.9998   1.0793            0.0795     9.82  -27626.02
DIS   GENGAM    898.5     17.7   1.0040   0.9828            0.0212    -2.04  -26870.55
FNM   EXPON     582.3     14.1   0.9999   1.1160            0.1161    13.69  -24570.12
FNM   WEIBULL   582.3     14.2   0.9999   1.0807            0.0808     9.37  -24550.74
FNM   GENGAM    582.3     12.7   1.0028   0.9789            0.0239    -2.33  -24002.39
GM    EXPON    1427.2     34.4   1.0002   1.1341            0.1339    18.27  -31792.18
GM    WEIBULL  1427.2     36.1   1.0002   1.0900            0.0897    12.00  -31756.00
GM    GENGAM   1427.2     38.7   1.0020   0.9875            0.0145    -1.58  -31109.23
BAC   EXPON    1438.6     31.7   1.0003   1.1697            0.1694    24.27  -34019.62
BAC   WEIBULL  1438.6     32.0   1.0002   1.1049            0.1047    14.56  -33938.04
BAC   GENGAM   1438.6     30.7   1.0040   0.9882            0.0157    -1.54  -32774.88
MCD   EXPON    1009.6     39.3   1.0000   1.0450            0.0450     5.12  -24221.11
MCD   WEIBULL  1009.6     38.8   1.0000   1.0678            0.0678     7.79  -24212.37
MCD   GENGAM   1009.6     29.6   1.0020   0.9789            0.0231    -2.32  -23804.04
MTC   EXPON    2584.7     31.6   1.0004   1.1156            0.1152    13.76  -23853.49
MTC   WEIBULL  2584.7     32.0   1.0003   1.0878            0.0875    10.32  -23841.81
MTC   GENGAM   2584.7     24.6   1.0019   0.9966            0.0052    -0.38  -23342.43
SLB   EXPON    1419.7     59.0   1.0001   1.1549            0.1549    19.67  -26925.30
SLB   WEIBULL  1419.7     61.4   1.0001   1.0978            0.0977    12.09  -26873.81
SLB   GENGAM   1419.7     43.1   1.0035   0.9959            0.0076    -0.48  -26150.09

Table 6 gives diagnostics for the Nelson-form (11). LB is the Ljung-Box statistic with 15 lags. E-R EDT is the Engle and Russell test for excess dispersion.

Table 7. Estimates for the log-linked exponential model with covariates

$$\ln\phi_i(t) = a\ln\phi_{i-1}(x_{i-1}) + b\,x_{i-1}\,\phi_{i-1}(x_{i-1}) + \gamma_0 + \gamma_1\,\Delta\mathrm{spr}(t) + \gamma_2\,\mathrm{lev.spr}(t)$$

Null hypotheses: $\gamma_0 = 0$, $a = 1$, $b = 0$, $\gamma_1 = 0$, $\gamma_2 = 0$.

Firm  Model  gamma0        a             b                gamma1           gamma2
DIS   - tv   0.0248(6.72)  0.9864(4.20)  -0.0293(-8.07)   0.0040(1.74)     0.0049(2.22)
DIS   + tv   0.0255(6.55)  0.9868(4.04)  -0.0292(-7.82)   0.0005(0.22)     0.0041(1.92)
FNM   - tv   0.0209(5.21)  0.9859(3.48)  -0.0244(-6.84)   0.0027(1.37)     0.0037(1.27)
FNM   + tv   0.0226(5.34)  0.9863(3.49)  -0.0241(-6.84)   -0.0014(-0.75)   0.0018(0.66)
GM    - tv   0.0229(6.63)  0.9930(4.10)  -0.0245(-8.25)   0.0020(1.27)     0.0018(1.09)
GM    + tv   0.0245(6.56)  0.9925(4.03)  -0.0254(-8.16)   -0.0014(-0.79)   0.0011(0.66)
BAC   - tv   0.0410(7.70)  0.9776(4.04)  -0.0392(-7.98)   -0.0006(-0.25)   -0.0014(-0.72)
BAC   + tv   0.0424(7.66)  0.9770(4.06)  -0.0399(-8.02)   -0.0032(-1.24)   -0.0022(-1.12)
MTC   - tv   0.0187(5.90)  0.9911(3.88)  -0.0246(-7.58)   0.0025(1.75)     0.0061(2.79)
MTC   + tv   0.0195(5.84)  0.9912(3.83)  -0.0246(-7.50)   0.0005(0.34)     0.0053(2.54)
MCD   - tv   0.0250(8.45)  0.9964(3.79)  -0.0269(-8.93)   0.0002(0.14)     0.0021(1.82)
MCD   + tv   0.0253(8.30)  0.9965(3.69)  -0.0270(-8.81)   -0.0017(-1.06)   0.0019(1.67)
SLB   - tv   0.0215(8.09)  0.9947(4.02)  -0.0231(-8.61)   -0.0048(-2.46)   0.0016(1.28)
SLB   + tv   0.0220(7.95)  0.9946(3.97)  -0.0234(-8.51)   -0.0075(-3.73)   0.0012(0.95)

Table 7 gives the estimates of the log-linked exponential with covariates as given above. Estimations where the covariates are allowed to vary within durations are marked with + tv. T-statistics are given in parentheses. Numbers in italic boldface are significant at the 99% level, numbers in normal font are significant at the 95% level. The numbers typed with very small types are insignificant. For $a$ the null is equality to one.

Table 8. Diagnostics for the log-linked exponential model with covariates

Firm  Model  LB(X)    LB(eps)  E(eps)  St(eps)  |St(eps)-E(eps)|  E-R EDT  Max Like
DIS   - tv    898.5     12.6   0.9998   1.1760            0.1762    22.81  -27788.35
DIS   + tv    898.5     12.9   0.9998   1.1759            0.1761    22.80  -27790.80
FNM   - tv    582.3     13.6   0.9999   1.1158            0.1160    13.67  -24568.26
FNM   + tv    582.3     13.9   0.9998   1.1158            0.1160    13.67  -24569.28
GM    - tv   1427.2     35.7   1.0001   1.1341            0.1340    18.27  -31790.70
GM    + tv   1427.2     33.6   1.0001   1.1338            0.1337    18.23  -31791.19
BAC   - tv   1438.6     31.9   1.0003   1.1697            0.1695    24.27  -34019.27
BAC   + tv   1438.6     31.0   1.0003   1.1697            0.1695    24.27  -34017.94
MCD   - tv   1009.6     39.1   1.0000   1.0440            0.0440     4.99  -24215.06
MCD   + tv   1009.6     39.5   1.0000   1.0443            0.0443     5.03  -24216.97
MTC   - tv   2584.7     31.5   1.0001   1.1153            0.1152    13.72  -23851.03
MTC   + tv   2584.7     31.3   1.0001   1.1153            0.1152    13.72  -23849.39
SLB   - tv   1419.7     57.8   1.0000   1.1542            0.1542    19.57  -26918.99
SLB   + tv   1419.7     57.0   1.0000   1.1542            0.1542    19.58  -26913.33

Table 8 gives the diagnostics of the log-linked exponential with covariates of Table 7. Estimations where the covariates are allowed to vary within durations are marked with + tv. LB is the Ljung-Box statistic with 15 lags. E-R EDT is the Engle and Russell test for excess dispersion.

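Table 9. Estimates for the log-linked exponential price duration model with covariates

$$\ln\phi_i(t) = a\ln\phi_{i-1}(x_{i-1}) + b\,x_{i-1}\,\phi_{i-1}(x_{i-1}) + \gamma_0 + \gamma_1\sqrt{\mathrm{vol}(t)}$$

Columns: FIRM, Model, $\hat\gamma_0$ ($H_0: \gamma_0 = 0$), $\hat a$ ($H_0: a = 1$), $\hat b$ ($H_0: b = 0$), $\hat\gamma_1$ ($H_0: \gamma_1 = 0$).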

+

tv tv

0.0285(4.61) 0.0221(4.68)

0.9729(2.93) 0.9601(1.96)

-0.0354(-5.79) -0.0418(-4.33)

0.8595(2.04) 2.4023(2.35)

FNM FNM

tv tv

0.0184(4.41) 0.0142(4.06)

0.9912(2.71)

+

-0.0186(-5.17) -0.0209(-3.77)

0.8084(1.55)

GM GM

tv tv

0.0306(3.64) 0.0258(4.17)

0.9718(1.97)

+

BAC BAC

tv tv

0.0249(4.11) 0.0223(3.97)

0.9838(2.28)

+

MCD MCD

tv tv

0.0204(3.73)

0.9710(2.86)

+

0.0050(0.49)

MTC MTC

+

tv tv

SLB SLB

+

tv tv

0.9898(1.86)

0.9637(1.52)

-0.0295(-3.90) -0.0342(-3.17)

0.0424(0.14)

-0.0934(-0.25) 1.0444(1.29)

-0.0279(-4.70) -0.0305(-3.85)

1.0186(2.20)

0.9261(1.69)

-0.0351(-5.07) -0.0526(-4.35)

1.7828(3.29) 5.7780(2.35)

0.0360(5.25) 0.0264(4.44)

0.9592(4.07) 0.9449(3.91)

-0.0467(-7.33) -0.0528(-7.85)

1.4765(2.38) 3.5048(3.79)

0.0287(5.68) 0.0262(5.98)

0.9846(3.40) 0.9842(3.32)

-0.0304(-6.65) -0.0310(-6.60)

0.6016(2.33)

0.9813(1.83)

0.4007(1.41)

0.2405(1.00)

Table 9 gives the estimates of the log-linked exponential price duration model with covariates as given above. Estimations where the covariates are allowed to vary within durations are marked with + tv. T-statistics are given in parentheses. Numbers in italic boldface are significant at the 99% level, numbers in normal font are significant at the 95% level. The numbers typed with very small types are insignificant. For a the null is equality to one.


DIS DIS

Table 10. Diagnostics for the log-linked exponential price duration model with covariates

Firm  Model  LB(X)   LB(eps)  E(eps)  St(eps)  |St(eps)-E(eps)|  E-R EDT  Max Like
DIS   - tv   452.7     27.1   1.0000   1.2410            0.2410    23.64  -15086.70
DIS   + tv   452.7     25.1   1.0000   1.2392            0.2392    23.45  -15070.50
FNM   - tv   238.0     58.6   0.9995   1.3592            0.3598    31.13  -10636.48
FNM   + tv   238.0     58.8   0.9996   1.3575            0.3580    30.96  -10631.21
GM    - tv   347.6    128.3   1.0001   1.2977            0.2976    28.67  -13894.18
GM    + tv   347.6    123.8   1.0001   1.2961            0.2960    28.49  -13889.37
BAC   - tv   550.8     44.8   1.0005   1.2562            0.2557    28.30  -18845.26
BAC   + tv   550.8     41.8   1.0005   1.2566            0.2561    28.34  -18837.46
MCD   - tv   321.2     58.0   1.0005   1.4031            0.4026    33.38   -9345.59
MCD   + tv   321.2     55.6   1.0005   1.3921            0.3916    32.32   -9304.68
MTC   - tv   463.8     19.3   1.0002   1.3156            0.3154    27.80  -11317.05
MTC   + tv   463.8     18.8   1.0001   1.3176            0.3175    28.00  -11294.57
SLB   - tv   733.1     49.4   1.0003   1.2241            0.2238    23.76  -17804.32
SLB   + tv   733.1     48.1   1.0003   1.2239            0.2236    23.74  -17800.03

Table 10 gives the diagnostics of the log-linked exponential price duration model with covariates of Table 9. Estimations where the covariates are allowed to vary within durations are marked with + tv. LB is the Ljung-Box statistic with 15 lags. E-R EDT is the Engle and Russell test for excess dispersion.

[Figure 1: one density curve for each of SLB, MTC, MCD, BAC, GM, FNM and DIS; horizontal axis: Duration; vertical axis: P.d.f.]

Fig. 1. Density estimates of filtered durations between transactions for seven NYSE stocks.

[Figure 2: curves for Expon, Weibull and Gengam; horizontal axis: duration; vertical axis labelled p.d.f.]

Fig. 2. Hazard functions for SLB implied by the three parametric specifications.

[Figure 3: curves for Expon, Weibull, Gengam and Nonparametric; horizontal axis: Duration; vertical axis: Cumulative hazard.]

Fig. 3. Cumulative hazards of filtered durations between transactions for SLB, with the cumulative hazards implied by estimation.

[Figure 4: curves for Expon, Weibull, Gengam and Nonparametric; horizontal axis: Duration; vertical axis: Cumulative hazard.]

Fig. 4. Cumulative hazards of filtered durations between transactions for SLB, with the cumulative hazards implied by estimation. Zoomed for the interval 0 to 3.5.

[Figure 5: curves for Expon, Weibull, Gengam and Raw data; horizontal axis: duration; vertical axis: p.d.f.]

Fig. 5. Density of filtered durations between transactions for SLB, with densities implied by estimation.