Effect of Neglected Deterministic Seasonality on Unit Root Tests

Matei Demetrescu, Uwe Hassler
Goethe-University Frankfurt*

Working paper version: May 31st, 2005

Abstract

Sometimes, integration tests are applied to seasonal data without allowing for seasonal deterministics. This paper studies the effect of neglecting seasonal dummy variables. For the Dickey-Fuller test, it is observed that the distribution is shifted to the left and at the same time has lower dispersion whenever deterministic seasonality is not accounted for. When accounting for serial correlation, the distortions become less predictable. A Monte Carlo study confirms that, in the presence of seasonally varying means, the (augmented) Dickey-Fuller test without seasonal dummies is oversized and at the same time has little power (due to the need for lag augmentation). The KPSS test for stationarity is also examined. It turns out that the effect of neglected seasonal deterministics on the behavior of the test under the null hypothesis depends on how the long-run variance is estimated.

Key words: Neglected seasonal means, Dickey-Fuller test, KPSS test
JEL classification: C22 (Time series models)

* Statistics and Econometric Methods, Goethe-University Frankfurt, Gräfstr. 78 / PF 76, D-60054 Frankfurt, Germany, Tel: +49.69.798.23660, Fax: +49.69.798.23662. E-mail: [email protected], [email protected]. We wish to thank Peter Jung for computational research assistance.

1 Introduction

Looking at recent volumes of empirical economic journals, it becomes obvious that integration and cointegration testing is a major concern of applied economists working with time series. Many of those papers employ quarterly or monthly data (without seasonal adjustment, or at least without indicating that the data had been subject to some seasonal adjustment procedure), typically in order to increase the number of observations. However, much of this work is carried out without allowing for seasonally varying means (i.e. without seasonal dummy variables). Some recent examples are Martín-Álvarez et al. (1999), Metin and Muslu (1999) or Patterson (2000, pp. 295-299). Lim and McAleer (2000), while including dummies in a seasonal unit root test, fail to do so for a unit root test with seasonal data. Other work exhibiting such shortcomings includes Ng and Perron (2001) and Hassler and Neugart (2003).

In this paper, we argue that neglecting seasonally varying means may inflate the size and at the same time reduce the power of the Dickey-Fuller test [DF] (Dickey and Fuller, 1979). Similarly, the stationarity test by Kwiatkowski et al. (1992) [KPSS] also leads to distorted decisions in the presence of neglected seasonal deterministics under the null hypothesis.

Two aspects of seasonal dummies have been studied in the presence of integrated time series so far. First, Dickey, Bell, and Miller (1986, p. 25) showed for the DF test that "removal of seasonal means from autoregressive series [...] has no effect on limit distributions of unit root test statistics". The same has been shown by Phillips and Jin (2002) and Taylor (2003) for the KPSS test. Hence, the inclusion of seasonal dummy variables when performing an integration test has no asymptotic effect under the null hypothesis.
Second, Abeysinghe (1991, 1994), Franses, Hylleberg, and Lee (1995) and Lopes (1999) demonstrated that the use of seasonal dummies in the presence of seasonal unit roots may result in spurious regressions or spurious deterministic seasonality. However, it would be wrong to conclude from their work that seasonal dummies should not be used with integrated time series. Rather, the opposite seems to be true, as quantified by the present study.¹

The rest of this paper is organized as follows. Section 2 contains the model and assumptions. The asymptotic distortion induced by neglecting seasonal structure when performing DF and KPSS tests is then studied in Section 3. Further, a systematic Monte Carlo study with simulated quarterly data is presented in Section 4. The final section concludes, and the proofs have been gathered in the Appendix.

¹ Related work was carried out under restrictive assumptions by Lopes (2002) for the DF test involving quarterly data.

2 Model and assumptions

We consider seasonal data $y_t$ with $S$ observations per year. The $T$ observations need not span complete years, i.e. incomplete years at the beginning and at the end of the sample are allowed for.

Assumption 1. The series are assumed to be generated by the following data generating process in levels,

$$y_t = \delta_s + x_t, \qquad t = 1, 2, \ldots, T, \tag{1}$$

with $s = t \bmod S$, and the autoregressive process $x_t$,

$$x_t = \rho\, x_{t-1} + u_t, \tag{2}$$

is driven by stationary innovations $u_t$ with mean zero.

To ease the proofs, let $u_t$ be a stationary AR($p^*$) process:

Assumption 2. Let $u_t$ be defined by

$$u_t = \alpha_1 u_{t-1} + \alpha_2 u_{t-2} + \ldots + \alpha_{p^*} u_{t-p^*} + \varepsilon_t, \qquad t \in \mathbb{Z}, \tag{3}$$

where the roots of the associated characteristic equation all lie inside the unit circle and $\varepsilon_t$ is a stationary, covariance-ergodic martingale difference sequence with finite unconditional fourth moments. Let the moments of $u_t$ be $E(u_t) = 0$, $E(u_t u_{t-h}) = \gamma_h$, and assume that the long-run variance of $u_t$,

$$\omega^2 = \sum_{h=-\infty}^{\infty} \gamma_h,$$

is positive, $\omega^2 > 0$.

Under the null hypothesis, the DF test assumes that $x_t$ is integrated of order 1 ($\rho = 1$). In terms of differences, $\Delta y_t = y_t - y_{t-1}$, the unit root hypothesis implies

$$\Delta y_t = \zeta_s + u_t, \qquad t = 1, 2, \ldots, T, \tag{4}$$

with $\zeta_s = \Delta\delta_s$, i.e.

$$\zeta_1 = \delta_1 - \delta_S\,, \qquad \zeta_s = \delta_s - \delta_{s-1}\,, \; s = 2, \ldots, S\,, \qquad \sum_{s=1}^{S} \zeta_s = 0\,. \tag{5}$$

For simplicity, let the starting value $y_0$ be fixed.

Under the null hypothesis of the KPSS test, we assume $\rho = 0$, so that $x_t = u_t$ is a stationary AR($p^*$) process with covariances $\gamma_h$ and positive long-run variance $\omega^2$. This formulation is encompassed by the usual component specification of the KPSS test, and is preferred here in order to maintain a common setup with the DF test.
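To make the setup concrete, the DGP (1)-(2) is easy to simulate. The sketch below is our illustration; the seasonal pattern, the sample size, and the standard normal innovations are arbitrary choices, not taken from the paper:

```python
import numpy as np

def simulate_seasonal_ar(delta, n_years, rho=1.0, seed=0):
    """Simulate y_t = delta_s + x_t, x_t = rho*x_{t-1} + u_t with u_t iid N(0,1).

    delta : sequence of S seasonal level shifts (the delta_s of eq. (1)),
            indexed here by s = t mod S with 0-based positions.
    rho=1.0 reproduces the unit root null of the DF test; x_0 = 0 is fixed.
    """
    delta = np.asarray(delta, dtype=float)
    S, T = len(delta), len(delta) * n_years
    u = np.random.default_rng(seed).standard_normal(T)
    x = np.zeros(T)
    for t in range(1, T):
        x[t] = rho * x[t - 1] + u[t]       # eq. (2)
    s = np.arange(1, T + 1) % S            # season index s = t mod S
    return delta[s] + x                    # eq. (1)

y = simulate_seasonal_ar([3.0, 0.0, 0.0, 0.0], n_years=50)   # quarterly, one shifted quarter
```

With rho = 1 the differences inherit the pattern $\zeta_s = \Delta\delta_s$ from (4)-(5); with |rho| < 1 the series is stationary around seasonally varying means.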

3 Asymptotic results

3.1 DF tests

It is known (Dickey, Bell, and Miller, 1986) that the inclusion of seasonal dummies does not affect the limiting distribution of the DF test under the null hypothesis of a unit root. Here, we examine the problem the other way around: if $y_t$ contains deterministic seasonality, how does this affect the DF test when performed as in (6) below, without seasonal dummies? Hence, we investigate the usual DF test relying on a regression with constant,

$$\Delta y_t = \hat{c} + \hat{\phi}\, y_{t-1} + \hat{u}_t, \qquad t = 1, \ldots, T, \tag{6}$$

with $t$ statistic $t_\phi = DF_c$ testing for $\phi = 0$.

For the DF regression (6), estimated by ordinary least squares (OLS), the following result is proven in the Appendix, where $\stackrel{d}{\rightarrow}$ stands for convergence in distribution.²

² Lopes (2002) addresses the special case of a regression without constant, where $S = 4$, $u_t \sim$ iid in (4), and $\zeta_s = c \cdot (-1)^s$ in (5), with $c$ a positive constant.

Proposition 1. Let $y_t$ be given by Assumptions 1 and 2 with $\rho = 1$. Then it holds for the regression (6) as $T \to \infty$:

$$T\hat{\phi} \stackrel{d}{\rightarrow} \frac{\overline{\delta_{-1}\zeta}/\omega^2 + \int_0^1 W(r)\,dW(r) + (\omega^2 - \gamma_0)/2\omega^2 - W(1)\int_0^1 W(r)\,dr}{\int_0^1 W^2(r)\,dr - \left(\int_0^1 W(r)\,dr\right)^2}\,,$$

$$DF_c = t_\phi \stackrel{d}{\rightarrow} \frac{\overline{\delta_{-1}\zeta}/\omega^2 + \int_0^1 W(r)\,dW(r) + (\omega^2 - \gamma_0)/2\omega^2 - W(1)\int_0^1 W(r)\,dr}{\sqrt{\frac{\overline{\zeta^2} + \gamma_0}{\omega^2}\left(\int_0^1 W^2(r)\,dr - \left(\int_0^1 W(r)\,dr\right)^2\right)}}\,,$$

where $W$ is a standard Brownian motion, $\overline{\zeta^2} := \frac{1}{S}\sum_{s=1}^S \zeta_s^2$, and

$$\overline{\delta_{-1}\zeta} := \frac{1}{S}\sum_{s=1}^S \delta_{s-1}\zeta_s = \frac{1}{S}\left(\zeta_1\delta_S + \zeta_2\delta_1 + \cdots + \zeta_S\delta_{S-1}\right),$$

with $\zeta_s = \Delta\delta_s$ from (5).

Remark 1. Neglecting seasonal deterministics has two effects relative to the DF distributions. On the one hand, the distributions are shifted due to the term $\overline{\delta_{-1}\zeta}$. On the other hand, as soon as one $\zeta_s \neq 0$, we observe $\overline{\zeta^2} > 0$, which reduces the variation of the limiting distribution of $DF_c = t_\phi$. Further, observe that $\overline{\delta_{-1}\zeta} \leq 0$. This can be shown by standard methods. Define

$$f(\delta_1, \ldots, \delta_S) = S\,\overline{\delta_{-1}\zeta} = (\delta_1 - \delta_S)\,\delta_S + \cdots + (\delta_S - \delta_{S-1})\,\delta_{S-1}\,.$$

It is straightforward to establish that $f$ has infinitely many stationary points, $\delta_1 = \cdots = \delta_S$, where $f$ equals zero. The corresponding Hessian matrix can be seen to be negative semi-definite, so $f$ is not positive. Hence, as soon as $\delta_i \neq \delta_j$ for some $i \neq j$, neglecting seasonal dummies shifts the distribution to the left. Since $DF_c$ is more concentrated than the usual DF distribution, the $t$ statistic with the usual DF critical values is expected to result in overrejection at standard nominal levels.

Remark 2. In case of a detrended regression, $\Delta y_t = \hat{c} + \hat{b}\,t + \hat{\phi}\,y_{t-1} + \hat{u}_t$, the analogous results hold true for the model $y_t = \mu\,t + \delta_s + x_t$. The demeaned Brownian motion in Proposition 1 simply has to be replaced by a detrended Brownian motion.
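The shift described in Remark 1 is easy to see numerically. The following sketch is our illustration, not the paper's Monte Carlo design; the sample size, number of replications, and the alternating pattern are arbitrary. It computes the $DF_c$ statistic from regression (6) for random walks with and without a neglected seasonal mean:

```python
import numpy as np

def df_c_stat(y):
    """t statistic of phi in the OLS regression Delta y_t = c + phi*y_{t-1} + u_t, eq. (6)."""
    dy, ylag = np.diff(y), y[:-1]
    X = np.column_stack([np.ones_like(ylag), ylag])
    b, rss, *_ = np.linalg.lstsq(X, dy, rcond=None)
    s2 = rss[0] / (len(dy) - 2)                       # residual variance
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])   # standard error of phi-hat
    return b[1] / se

rng = np.random.default_rng(42)
T, delta = 200, 3.0
seasonal = delta * np.tile([1.0, -1.0], T // 2)       # alternating means, zero on average
t_plain, t_neglect = [], []
for _ in range(300):
    x = np.cumsum(rng.standard_normal(T))             # random walk, no deterministics
    t_plain.append(df_c_stat(x))
    t_neglect.append(df_c_stat(x + seasonal))         # seasonality present but not modelled
# the neglected-seasonality statistics are centred far to the left of the usual DF_c ones
```

Comparing the two collections of statistics shows the leftward translation that Proposition 1 predicts through the negative term $\overline{\delta_{-1}\zeta}$.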

Remark 3. Even if the driving process $u_t$ is iid, the distributions still depend on nuisance parameters, unlike the case where the DGP exhibits no seasonal deterministics. In case of constant levels over all seasons, $\delta_1 = \cdots = \delta_S$ in (1), we obtain $\overline{\delta_{-1}\zeta} = \overline{\zeta^2} = 0$, and the usual DF distributions are recovered.

In practice, applied workers would employ regression (6) only if there were no evidence of serial correlation. But deterministic seasonality is likely to be mistaken for serial correlation (see Remark 6). One way to deal with serial dependence is to apply the correction suggested by Phillips (1987) and Phillips and Perron (1988). But the distribution of the test statistic, be it the coefficient of the lagged level or its $t$ statistic, is already distorted, as seen in Proposition 1. The suggested correction neither accounts for the shift to the left, nor does it consider the overestimation of the variance due to neglected seasonal deterministics. Further, the correction involves an estimation of the long-run variance, where additional problems may appear; see Proposition 3 in the following subsection.

Another way is to compute the augmented Dickey-Fuller test [ADF] (Said and Dickey, 1984) from

$$\Delta y_t = \hat{c} + \hat{\phi}\, y_{t-1} + \sum_{q=1}^{p} \hat{\alpha}_q \Delta y_{t-q} + \hat{\varepsilon}_t\,, \qquad t = p+2, \ldots, T, \tag{7}$$

with $t$ statistic $t_\phi = ADF_c$. For convenience, allow for $p+1$ starting values instead of just one as for (6), but note that, in practical applications, only $T - p - 1$ observations are effectively available. The inclusion of lagged differences is designed to ensure the invariance of the asymptotic null distribution of the $t$ statistic w.r.t. the nuisance parameters $\omega^2$ and $\gamma_0$. However, including lagged differences of $y_t$ as regressors in the presence of neglected seasonality further distorts the asymptotic distribution of the ADF test statistic, as can be seen in the following proposition (the proof of which is given in the Appendix).

Proposition 2. Let $y_t$ be given by Assumptions 1 and 2 with $\rho = 1$. Suppose for simplicity that enough lags have been included, so $p \geq p^*$. Let $\alpha_q = 0$ for $p^* < q \leq p$, and denote $\pi_s = \zeta_s - \zeta_{s-1}\alpha_1 - \ldots - \zeta_{s-p}\alpha_p$. Then, it holds for the regression (7) as $T \to \infty$:

$$T\hat{\phi} \stackrel{d}{\rightarrow} \frac{K_1 + K_2\left(\int_0^1 W(r)\,dW(r) - W(1)\int_0^1 W(r)\,dr\right)}{\int_0^1 W^2(r)\,dr - \left(\int_0^1 W(r)\,dr\right)^2}\,,$$

$$t_\phi \stackrel{d}{\rightarrow} \frac{\frac{K_1}{K_3} + \frac{K_2}{K_3}\left(\int_0^1 W(r)\,dW(r) - W(1)\int_0^1 W(r)\,dr\right)}{\sqrt{\int_0^1 W^2(r)\,dr - \left(\int_0^1 W(r)\,dr\right)^2}}\,,$$

where

$$K_1 = \frac{\overline{\delta_{-1}\pi}}{\omega^2} + \sum_{q=1}^{p} (-1)^q \frac{A_q}{A} \left( \frac{2\,\overline{\delta_{-1}\zeta} + \omega^2 - \gamma_0}{2\omega^2} + \sum_{r=0}^{q-1} \frac{\gamma_r + \overline{\zeta\zeta_{-r}}}{\omega^2} \right),$$

$$K_2 = \sum_{q=1}^{p} (-1)^q \frac{A_q}{A} + \frac{\sigma_\varepsilon}{\omega}\,,$$

$$K_3 = \sqrt{ \frac{\sigma_\varepsilon^2}{\omega^2} + \frac{1}{\omega^2} \sum_{q=1}^{p}\sum_{r=1}^{p} \frac{A_q A_r}{A^2}\, \gamma_{|q-r|} + \frac{1}{\omega^2}\, \overline{\left(\pi - \sum_{q=1}^{p} \frac{A_q}{A}\, \zeta_{-q}\right)^2} }\,;$$

further, $\overline{\delta_{-1}\pi} = \frac{1}{S}\sum_{s=1}^S \delta_{s-1}\pi_s$, $\overline{\zeta_{-1}\pi} = \frac{1}{S}\sum_{s=1}^S \zeta_{s-1}\pi_s$, $\overline{\zeta\zeta_{-h}} = \overline{\zeta\zeta_{h}} = \frac{1}{S}\sum_{s=1}^S \zeta_s\zeta_{s-h}$ for $h \in \mathbb{Z}$,

$$A = \begin{vmatrix} \gamma_0 + \overline{\zeta^2} & \cdots & \gamma_{p-1} + \overline{\zeta\zeta_{p-1}} \\ \vdots & \ddots & \vdots \\ \gamma_{p-1} + \overline{\zeta\zeta_{p-1}} & \cdots & \gamma_0 + \overline{\zeta^2} \end{vmatrix},$$

$A_q$ is $A$ with the $q$th column replaced by the column vector $\left(\overline{\zeta_{-1}\pi}, \ldots, \overline{\zeta_{-p}\pi}\right)'$, and finally

$$\overline{\left(\pi - \sum_{q=1}^{p} \frac{A_q}{A}\,\zeta_{-q}\right)^2} = \frac{1}{S}\sum_{s=1}^S \left(\pi_s - \sum_{q=1}^{p} \frac{A_q}{A}\,\zeta_{s-q}\right)^2.$$

Remark 4. Although the distortions depend on the ignored seasonality and on the parameters of the process $u_t$ in a rather complicated way, we still observe a translation and a rescaling of the DF distribution. Confer the Monte Carlo study for experimental evidence.

Remark 5. Note that these distortions appear even if the true model has no short-run dynamics; the effect is due to the inclusion of lags when ignoring the seasonal structure. The distortions also worsen with growing $p$, due to the persistence of the seasonal structure.

Remark 6. While the unit root is super-consistently estimated, the estimators $\hat{\alpha}_q$ can be shown to be asymptotically biased. Hence, significance tests on included lagged differences reject with probability 1, which induces the need to further augment the ADF test regression. In small samples, including a large number of lags arguably produces residuals that are approximately white noise, thus reducing the size distortion.

Let us also briefly examine the alternative hypothesis, under which the process $y_t$ is covariance-stationary with seasonally varying means. In this case, all estimators are asymptotically biased, in addition to having distorted distributions. A heuristic argument for this statement is as follows. Consider the following autoregression in levels,

$$y_t = \hat{m} + \sum_{q=1}^{p+1} \hat{\beta}_q\, y_{t-q} + \hat{\nu}_t\,.$$

This is equivalent to test regression (7), but easier to handle when $x_t$ is stationary. Using Lemma A in the Appendix, it is straightforward to show that the OLS estimators $\hat{\beta} = \left(\hat{\beta}_1, \ldots, \hat{\beta}_{p+1}\right)'$ are not consistent:

$$\hat{\beta} \stackrel{p}{\rightarrow} \left(\Sigma_x + \Pi_S\right)^{-1} \Sigma_x\, \beta + \left(\Sigma_x + \Pi_S\right)^{-1} \Psi_S\,,$$

where $\beta$ are the true autoregressive coefficients of $x_t$, $\Sigma_x$ is the $(p+1)$th order autocovariance matrix of $x_t$, and

$$\Pi_S = \begin{pmatrix} \overline{\left(\delta - \bar{\delta}\right)^2} & \cdots & \overline{\left(\delta - \bar{\delta}\right)\left(\delta_{-p} - \bar{\delta}\right)} \\ \vdots & \ddots & \vdots \\ \overline{\left(\delta - \bar{\delta}\right)\left(\delta_{-p} - \bar{\delta}\right)} & \cdots & \overline{\left(\delta - \bar{\delta}\right)^2} \end{pmatrix}, \qquad \Psi_S = \begin{pmatrix} \overline{\left(\delta - \bar{\delta}\right)\left(\delta_{-1} - \bar{\delta}\right)} \\ \vdots \\ \overline{\left(\delta - \bar{\delta}\right)\left(\delta_{-p-1} - \bar{\delta}\right)} \end{pmatrix},$$

with

$$\overline{\left(\delta - \bar{\delta}\right)\left(\delta_{-h} - \bar{\delta}\right)} = \frac{1}{S}\sum_{s=1}^{S} \left(\delta_s - \frac{1}{S}\sum_{r=1}^{S}\delta_r\right)\left(\delta_{(s-h) \bmod S} - \frac{1}{S}\sum_{r=1}^{S}\delta_r\right).$$

Standard OLS algebra shows that the following relationships hold:

$$\hat{\phi} = \sum_{i=1}^{p+1} \hat{\beta}_i - 1\,, \qquad \hat{\alpha}_q = -\sum_{i=q+1}^{p+1} \hat{\beta}_i\,, \; q = 1, \ldots, p\,, \qquad \hat{c} = \hat{m} \quad \text{and} \quad \hat{\varepsilon}_t = \hat{\nu}_t\,.$$

Hence, the estimators $\hat{\phi}$ and $\hat{\alpha}_q$ in (7) are themselves inconsistent. On the one hand, the asymptotic bias of $\hat{\phi}$ may, for some seasonal patterns, be negative and thus increase power. On the other hand, the asymptotic bias of the estimators $\hat{\alpha}_q$ leads to rejection with probability 1 of tests with null hypothesis $\alpha_q = 0$, even if it holds true. Including lagged differences in the ADF test regression therefore reduces power under the alternative. Again, confer the Monte Carlo study.
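The first claim of Remark 6, that neglected seasonal means masquerade as (negative) serial correlation in the differences, can be checked directly. In this sketch (our example; the pattern, δ, and the sample size are arbitrary) the differences satisfy $\Delta y_t = \zeta_s + u_t$ with an alternating pattern, and the OLS slope of $\Delta y_t$ on $\Delta y_{t-1}$ approaches $\overline{\zeta\zeta_{-1}}/(\gamma_0 + \overline{\zeta^2})$ instead of zero, although $u_t$ is iid:

```python
import numpy as np

rng = np.random.default_rng(7)
T, delta = 4000, 3.0
# random walk plus alternating seasonal means (S = 2 for simplicity): zeta_s = +/- 2*delta
y = np.cumsum(rng.standard_normal(T)) + delta * np.tile([1.0, -1.0], T // 2)
dy = np.diff(y)
x, z = dy[:-1] - dy[:-1].mean(), dy[1:] - dy[1:].mean()
alpha_hat = (x @ z) / (x @ x)   # OLS slope of dy_t on dy_{t-1}
# probability limit here: -4*delta**2 / (1 + 4*delta**2), i.e. close to -0.97, far from zero
```

A lag-significance test would therefore retain this spurious lag with probability approaching 1, which is exactly the over-augmentation mechanism described above.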

3.2 KPSS test

Under the assumptions in Section 2, the LM test for the null hypothesis of stationarity by Kwiatkowski et al. (1992) builds on the partial sums of the demeaned series,

$$S_t^c = \sum_{j=1}^{t} \hat{x}_j^c \quad \text{with} \quad \hat{x}_t^c = y_t - \bar{y}\,,$$

and the test statistic is

$$\eta_c = \frac{T^{-2}}{\hat{\omega}_c^2} \sum_{t=1}^{T} \left(S_t^c\right)^2. \tag{8}$$

Should $x_t$ exhibit no serial correlation, it holds that $\omega^2 = \gamma_0$ and the OLS variance estimator is used³:

$$\hat{\omega}_c^2 = \frac{1}{T}\sum_{t=1}^{T} \left(\hat{x}_t^c\right)^2, \tag{9}$$

but, in practice, it is rarely guaranteed that $x_t$ is an uncorrelated process. In the presence of serial correlation, Kwiatkowski et al. (1992) suggest replacing the variance estimators by consistent estimators of the spectral density at frequency zero:

$$\hat{\omega}_{c,B}^2 = \frac{1}{T}\sum_{t=1}^{T} \left(\hat{x}_t^c\right)^2 + 2\sum_{h=1}^{T-1} w(h/B)\, \frac{1}{T}\sum_{t=1}^{T-h} \hat{x}_t^c\, \hat{x}_{t+h}^c\,, \tag{10}$$

where the lag window $w(h/B)$ meets certain requirements, and the bandwidth $B$ increases with $T$. For most lag windows, $B$ is a truncation parameter (i.e. $w(h/B) = 0$ for $|h| > B$) and the typical requirement is $B = o(\sqrt{T})$ as $B \to \infty$.

The limiting distribution⁴ of the test statistic (8) was given in terms of a Brownian bridge by Kwiatkowski et al. (1992), and it is known (Phillips and Jin, 2002, and Taylor, 2003) that projecting on seasonal dummies instead of a constant does not change this distribution. But, if the DGP has seasonality and dummies are not included, the distribution crucially depends on how the long-run variance $\omega^2$ is estimated⁵.

Ever since the work of Newey and West (1987), the Bartlett window, defined by

$$w(h/B) = \begin{cases} 1 - \frac{|h|}{B}\,, & |h| < B\,, \\ 0\,, & |h| \geq B\,, \end{cases}$$

has been widely used in econometrics. However, the quadratic spectral [QS] kernel, given by

$$w(h/B) = \frac{25 B^2}{12 \pi^2 h^2} \left( \frac{\sin(6\pi h / 5B)}{6\pi h / 5B} - \cos(6\pi h / 5B) \right),$$

was shown in Andrews (1991) to have superior theoretical properties. Therefore, we study both kernels in the proposition below, together with the OLS variance estimator.

³ Indeed, this simplest case was already treated in Nyblom and Mäkeläinen (1983).
⁴ It is sometimes called the Cramér-von Mises distribution and was first tabulated by Anderson and Darling (1952).
⁵ A similar behavior of the test statistic was observed in the context of neglected structural breaks by Busetti and Taylor (2003, Prop. 3.1).

Proposition 3. Let $y_t$ be given by (1) with (2) and $\rho = 0$. Then, as $T \to \infty$ and $B \to \infty$ jointly, with $B/\sqrt{T} \to 0$, it holds

a) for the OLS variance estimator from (9):

$$\eta_c \stackrel{d}{\rightarrow} \frac{\omega^2 \int_0^1 \left(W(r) - r\,W(1)\right)^2 dr}{\gamma_0 + \overline{\delta^2} - \bar{\delta}^2}\,,$$

b) for the Bartlett window:

$$\eta_c \stackrel{d}{\rightarrow} \frac{\omega^2 \int_0^1 \left(W(r) - r\,W(1)\right)^2 dr}{\omega^2 + \overline{\delta^2} - \bar{\delta}^2 + \frac{2}{S}\sum_{s=1}^{S} s\, \overline{\left(\delta - \bar{\delta}\right)\left(\delta_{+s} - \bar{\delta}\right)}}\,,$$

c) for the QS window:

$$\eta_c \stackrel{d}{\rightarrow} \int_0^1 \left(W(r) - r\,W(1)\right)^2 dr\,,$$

where $W$ is a standard Brownian motion, $\overline{\delta^2} := \frac{1}{S}\sum_{s=1}^S \delta_s^2$, $\bar{\delta} := \frac{1}{S}\sum_{s=1}^S \delta_s$, and $\overline{\left(\delta - \bar{\delta}\right)\left(\delta_{+s} - \bar{\delta}\right)} = \frac{1}{S}\sum_{i=1}^S \left(\delta_i - \bar{\delta}\right)\left(\delta_{i+s} - \bar{\delta}\right)$.

Remark 7. Whenever $\delta_i \neq \delta_j$ for some $i \neq j$, $\overline{\delta^2} - \bar{\delta}^2$ in a) will be positive. Therefore, neglecting seasonal dummies results in a conservative test. The distortion in b) depends on the seasonal pattern and can be a contraction as well as a dilution.

Remark 8. The reason why the results in b) and c) differ lies in the behavior of the respective lag windows in the frequency domain. While the Bartlett kernel is a truncation lag window in the time domain (and QS is not), this is reversed in the frequency domain, where the spectral window associated with QS is truncated and the one associated with the Bartlett lag window is not. Thus, the long-run variance estimator with the Bartlett kernel is contaminated, even asymptotically, by the behavior of the periodogram at the seasonal frequencies, while the estimate based on QS is not. Such distortions are likely to appear with other truncation lag windows as well.

Remark 9. Although the asymptotic null distribution of the KPSS test statistic in the presence of neglected deterministics does not change when using the QS lag window, one may expect small-sample distortions to appear. This is confirmed by the Monte Carlo experiment.
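The quantities appearing in Proposition 3 can be computed directly. Below is a sketch of the two lag windows, the long-run variance estimator (10), and the statistic (8); this is our illustrative implementation, not the paper's GAUSS code:

```python
import numpy as np

def w_bartlett(h, B):
    """Bartlett lag window: 1 - |h|/B for |h| < B, zero beyond (truncated in the time domain)."""
    return max(0.0, 1.0 - abs(h) / B)

def w_qs(h, B):
    """Quadratic spectral lag window of Andrews (1991); never truncates in the time domain."""
    a = 6.0 * np.pi * h / (5.0 * B)
    return 25.0 * B**2 / (12.0 * np.pi**2 * h**2) * (np.sin(a) / a - np.cos(a))

def long_run_variance(xc, B, window):
    """Estimator (10): weighted sample autocovariances of the (already demeaned) residuals."""
    xc = np.asarray(xc, dtype=float)
    T = len(xc)
    omega2 = np.mean(xc**2)
    for h in range(1, T):
        w = window(h, B)
        if w != 0.0:
            omega2 += 2.0 * w * (xc[:T - h] @ xc[h:]) / T
    return omega2

def kpss_stat(y, omega2):
    """Statistic (8)/(12) given a variance estimate: T^(-2) * sum of squared partial sums."""
    partial = np.cumsum(y - np.mean(y))
    return np.sum(partial**2) / (len(y)**2 * omega2)
```

For a demeaned white-noise series both kernels give estimates close to the innovation variance; with neglected seasonal means they generally do not agree, which is the content of parts b) and c) of the proposition.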

4 Monte Carlo evidence

For experimental evidence we simulated quarterly data with $S = 4$. The data generating process is (1) and (2) with

$$u_t = \varepsilon_t\,, \quad \varepsilon_t \sim N(0, 1)\,, \quad \rho \in \{1, 0.95, 0.9, 0.8, 0.5\}\,, \quad t = 0, \ldots, T = 4N\,.$$

We simulate with iid innovations in order to observe the effect of neglected seasonality in isolation from short-run dynamics. The number of years was chosen as $N = 20$ and $N = 50$, where 20 additional starting values were generated but discarded for the estimation. All experiments rely on 25000 replications and were performed by means of GAUSS.

The series to be tested are built from (1), where the deterministic seasonality is governed by one parameter $\delta$:

(i) $\delta_1 = \delta$, $\delta_2 = \delta_3 = \delta_4 = 0$,
(ii) $\delta_1 = -\delta_2 = \delta$, $\delta_3 = \delta_4 = 0$,
(iii) $\delta_1 = \delta_2 = \delta$, $\delta_3 = \delta_4 = 0$,
(iv) $\delta_1 = -\delta_2 = \delta_3 = -\delta_4 = \delta$.

In case (i), only one quarter differs from the others. In cases (ii) and (iii), only two neighbouring quarters are affected, with opposite or equal sign, respectively; in case (ii), the effect on the overall mean is zero. Finally, in case (iv), all neighbouring quarters have opposite signs, so that there is no effect on the overall mean. We report here results for $\delta = 3$ (three times the standard deviation of the innovations). Other values ranging from $\delta = 1$ to $\delta = 10$ were also considered. The results are not essentially different, and are available upon request.
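In code, the four designs amount to four δ-vectors; the following trivial sketch (in our notation) builds them for a given δ:

```python
import numpy as np

def seasonal_patterns(delta):
    """delta-vectors (delta_1, ..., delta_4) for the four Monte Carlo cases (i)-(iv)."""
    d = float(delta)
    return {
        "i":   np.array([d, 0.0, 0.0, 0.0]),  # one quarter differs from the others
        "ii":  np.array([d, -d, 0.0, 0.0]),   # two neighbours, opposite sign, zero overall mean
        "iii": np.array([d, d, 0.0, 0.0]),    # two neighbours, equal sign
        "iv":  np.array([d, -d, d, -d]),      # alternating signs, zero overall mean
    }

patterns = seasonal_patterns(3.0)
```

Tiling such a vector over N years and adding it to a simulated $x_t$ from (2) yields the series to be tested.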

4.1 ADF test

The test statistics compared are $ADF_c$, the $t$ statistic from (7), and $ADF_d$, the $t$ statistic from the regression with the usual seasonal dummy variables (i.e. $D_{s,t} = 1$ iff $s = t \bmod S$) instead of a constant:

$$\Delta y_t = \sum_{s=1}^{4} \tilde{\zeta}_s D_{s,t} + \tilde{\phi}\, y_{t-1} + \sum_{q=1}^{p} \tilde{\alpha}_q \Delta y_{t-q} + \tilde{u}_t\,, \tag{11}$$

with $t = p+2, \ldots, T$. The choice of $p$ in (7) and (11) is data-driven: given a maximum lag length $M$ ($M = 8$ for $T = 80$ and $M = 12$ for $T = 200$), it is tested sequentially, $i = 0, 1, \ldots$, at the 5% level, whether the lags $\Delta y_{t-M+i}$ have a significant contribution; $\Delta y_{t-p}$ is the first significant lag. We compute the rejection frequencies at the 1%, 5%, and 10% levels. With the number of observations not being small, asymptotic critical values due to MacKinnon (1991) are employed: -3.4335, -2.8621, and -2.5671. We also report the mean number of lagged differences included in the $ADF_c$ and $ADF_d$ test regressions.

For the first experiment, case (i), we observe from Table 1 that the ADF test with constant only is heavily oversized; the distortions decrease for $T = 200$, but remain significant. The ADF test with seasonal dummies, although oversized too, is much closer to the nominal level. Observe that the mean number of lagged differences included in the $ADF_c$ test regression is very close to 8 for $T = 80$, and to 12 for $T = 200$, under the null as well as under the alternative hypothesis. In spite of this augmentation, the size problem is not removed. The $ADF_d$ test needs a significantly lower number of lagged differences: around 2.5 for $T = 80$, and about 4 for $T = 200$. Further, for $T = 80$ with $\rho = 0.9$, the test with dummies is as powerful as $ADF_c$; for $\rho = 0.8$ and $\rho = 0.5$, $ADF_d$ becomes superior in terms of power. For $T = 200$, $ADF_c$ exhibits lower power in almost every case (with the single exception of $\rho = 0.95$, where the rejection frequencies are roughly equal).

Although the overall mean in case (ii) is zero, we learn from Table 2 that the test without dummies is again oversized. In fact, the numbers in Table 2 are very similar to the previous ones in Table 1. The same comment applies to cases (iii) and (iv) in Tables 3 and 4, respectively, with cases (iii) and (iv) exhibiting a somewhat lower number of included lagged differences.

To sum up: the $ADF_d$ test (with seasonal dummies) has good size properties, considering the use of asymptotic critical values, while the $ADF_c$ test (with constant only) may suffer from severe size distortions. But even when $ADF_c$ is oversized, the regression with dummies may result in a more powerful test. Hence, the inclusion of seasonal dummies must be strongly recommended when applying a DF test to seasonal time series, as long as seasonally varying means cannot be excluded a priori.
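The sequential lag search described above is a general-to-specific $t$-test rule. A compact sketch of regression (11) with this rule follows; it is our implementation, using plain OLS, a fixed 1.96 threshold as a stand-in for the exact 5% critical value, and one possible season-indexing convention:

```python
import numpy as np

def adf_dummies(y, S=4, max_lag=8, t_crit=1.96):
    """t statistic of phi in eq. (11) with general-to-specific lag selection.

    Starting from max_lag, the longest lagged difference is dropped as long as
    its t statistic is insignificant; returns (t statistic of phi, chosen p).
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    for p in range(max_lag, -1, -1):
        rows = np.arange(p, len(dy))                   # usable observations for p lags
        D = np.eye(S)[rows % S]                        # seasonal dummies D_{s,t}
        X = np.column_stack([D, y[rows]] +             # dummies and the lagged level y_{t-1}
                            [dy[rows - q] for q in range(1, p + 1)])
        b, *_ = np.linalg.lstsq(X, dy[rows], rcond=None)
        resid = dy[rows] - X @ b
        s2 = (resid @ resid) / (len(rows) - X.shape[1])
        se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
        if p == 0 or abs(b[-1] / se[-1]) > t_crit:     # longest lag significant (or none left)
            return b[S] / se[S], p                     # position S holds phi-tilde
```

The returned $t$ statistic would then be compared against DF critical values such as MacKinnon's; the critical-value lookup is omitted here.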


Table 1: Rejection frequencies for ADF tests, case (i)

                        T = 80                          T = 200
                 1%     5%     10%    Lags       1%     5%     10%    Lags
ρ = 1.00  ADFc   2.92   9.48  16.01   7.58       1.97   7.64  13.79  11.77
          ADFd   1.60   6.14  11.02   2.04       1.38   5.76  10.77   3.75
ρ = 0.95  ADFc   5.56  16.24  26.30   7.60      10.59  30.90  46.85  11.78
          ADFd   3.62  12.01  20.58   2.15      10.20  30.97  47.52   3.99
ρ = 0.90  ADFc   9.18  24.70  37.25   7.60      25.66  56.50  73.10  11.77
          ADFd   7.67  23.35  36.95   2.31      42.28  73.74  85.18   4.11
ρ = 0.80  ADFc  16.79  39.32  54.16   7.53      50.58  80.51  90.77  11.73
          ADFd  29.73  61.14  76.28   2.46      83.78  92.68  96.14   4.05
ρ = 0.50  ADFc  28.37  51.75  65.04   7.24      69.13  91.02  96.58  11.53
          ADFd  86.45  91.31  93.43   2.48      93.25  97.87  99.14   4.07

Note: ADFc and ADFd stand for the ADF statistics from (7) and (11), respectively. "Lags" denotes the mean number of included lagged differences in the studied test regressions. Further details are given in the text.

Table 2: Rejection frequencies for ADF tests, case (ii)

                        T = 80                          T = 200
                 1%     5%     10%    Lags       1%     5%     10%    Lags
ρ = 1.00  ADFc   2.74   8.95  15.51   7.70       2.00   7.51  13.80  11.82
          ADFd   1.61   5.93  10.58   2.07       1.20   5.66  10.82   3.70
ρ = 0.95  ADFc   5.52  16.48  26.32   7.74      10.50  30.76  46.34  11.83
          ADFd   3.29  11.62  19.92   2.13      10.01  30.90  47.44   4.00
ρ = 0.90  ADFc   9.14  24.76  37.87   7.71      25.82  56.42  73.10  11.83
          ADFd   7.43  23.71  37.64   2.30      42.52  74.14  85.23   4.09
ρ = 0.80  ADFc  16.19  38.42  53.59   7.68      50.69  80.69  91.02  11.81
          ADFd  30.20  61.70  76.46   2.46      84.01  92.87  96.19   4.07
ρ = 0.50  ADFc  25.94  50.31  64.25   7.43      69.68  91.34  96.71  11.64
          ADFd  86.17  91.14  93.38   2.50      93.10  97.73  99.08   4.08

Note: See Table 1.

Table 3: Rejection frequencies for ADF tests, case (iii)

                        T = 80                          T = 200
                 1%     5%     10%    Lags       1%     5%     10%    Lags
ρ = 1.00  ADFc   2.42   7.61  13.25   6.69       1.38   6.37  11.62  10.96
          ADFd   1.58   5.99  10.71   2.03       1.34   5.57  10.41   3.70
ρ = 0.95  ADFc   4.60  13.63  21.92   6.73       8.40  25.71  40.24  11.00
          ADFd   3.16  11.27  20.03   2.13      10.22  31.14  47.95   4.00
ρ = 0.90  ADFc   7.84  20.74  31.46   6.70      21.19  49.35  66.14  10.93
          ADFd   7.42  23.16  37.00   2.29      42.28  74.32  85.62   4.06
ρ = 0.80  ADFc  15.90  34.92  48.11   6.58      45.15  76.23  87.88  10.84
          ADFd  29.41  60.93  75.99   2.47      83.66  92.79  96.09   4.08
ρ = 0.50  ADFc  39.26  59.53  71.25   6.13      71.70  91.90  97.03  10.41
          ADFd  86.29  91.23  93.38   2.49      93.54  97.84  99.10   4.09

Note: See Table 1.

Table 4: Rejection frequencies for ADF tests, case (iv)

                        T = 80                          T = 200
                 1%     5%     10%    Lags       1%     5%     10%    Lags
ρ = 1.00  ADFc   2.41   8.12  13.86   5.61       1.46   6.30  11.40   9.54
          ADFd   1.63   6.06  11.07   2.06       1.30   5.64  10.70   3.67
ρ = 0.95  ADFc   4.68  14.63  23.31   5.64       8.70  25.33  39.43   9.58
          ADFd   3.40  11.44  20.26   2.12       9.93  30.67  47.31   4.02
ρ = 0.90  ADFc   8.79  23.24  34.73   5.63      24.34  51.92  68.03   9.56
          ADFd   7.71  23.38  37.04   2.32      42.33  73.50  85.02   4.08
ρ = 0.80  ADFc  20.11  42.06  54.79   5.59      51.40  78.60  89.14   9.53
          ADFd  29.54  61.21  76.00   2.45      83.82  92.64  95.90   4.07
ρ = 0.50  ADFc  46.33  68.45  78.69   5.48      80.06  95.00  98.33   9.45
          ADFd  86.64  91.39  93.70   2.44      93.16  97.74  99.05   4.10

Note: See Table 1.

4.2 KPSS test

The test statistics compared are $\eta_c$, without seasonal demeaning, from (8) and (9), and $\eta_d$, with seasonal demeaning:

$$\eta_d = \frac{T^{-2}}{\hat{\omega}_d^2} \sum_{t=1}^{T} \left(S_t^d\right)^2, \quad \text{where} \quad S_t^d = \sum_{j=1}^{t} \hat{x}_j^d\,, \quad \hat{\omega}_d^2 = \frac{1}{T}\sum_{t=1}^{T} \left(\hat{x}_t^d\right)^2, \tag{12}$$

with $\hat{x}_t^d$ the OLS residuals from

$$y_t = \sum_{s=1}^{S} \hat{\delta}_s D_{s,t} + \hat{x}_t^d\,.$$

Further, we consider $\eta_{d,B}$ and $\eta_{c,B}$ with the variance estimators replaced by the spectral density estimators $\hat{\omega}_{c,B}^2$ from (10) and

$$\hat{\omega}_{d,B}^2 = \frac{1}{T}\sum_{t=1}^{T} \left(\hat{x}_t^d\right)^2 + 2\sum_{h=1}^{T-1} w(h/B)\, \frac{1}{T}\sum_{t=1}^{T-h} \hat{x}_t^d\, \hat{x}_{t+h}^d\,,$$

respectively. We employ in our simulations the QS kernel with two choices of the bandwidth as in Kwiatkowski et al. (1992):

$$B_4 = \left[4\,(T/100)^{0.25}\right], \qquad B_{12} = \left[12\,(T/100)^{0.25}\right],$$

where $[\cdot]$ denotes the integer part. Given the sample sizes, these should not differ dramatically from the optimal rate given by Andrews (1991), where $B = O(T^{0.2})$ for the QS window.

Examining Tables 5 through 8, we observe for the usual variance estimator that the test based on $\eta_c$ is heavily undersized. This was to be expected in light of Remark 7. Worse yet, no improvement appears with increasing sample size. On the other hand, $\eta_d$ holds the required level, and has more power than $\eta_c$. The power loss is especially obvious in case (iv) and $T = 80$, with case (iv) appearing to be the most pronounced seasonal structure. Of

Table 5: Rejection frequencies for KPSS tests, case (i)

                        T = 80                   T = 200
                  1%     5%     10%        1%     5%     10%
ρ = 0   ηc       0.00   0.24   0.65       0.00   0.15   0.51
        ηd       1.19   6.56  11.00       1.15   6.00  10.35
        ηc,B4    0.87   7.12  12.86       0.93   6.24  10.82
        ηd,B4    0.59   5.95  10.74       0.82   5.73  10.10
        ηc,B12   0.02   7.54  15.54       0.38   6.19  11.71
        ηd,B12   0.00   4.66  11.55       0.30   5.39  10.78
ρ = 1   ηc      91.66  96.96  98.23      99.45  99.94  99.98
        ηd      95.95  99.04  99.55      99.78  99.99 100.00
        ηc,B4   69.22  85.17  90.48      86.36  95.39  97.54
        ηd,B4   69.18  85.14  90.39      86.31  95.38  97.53
        ηc,B12   0.00  48.10  60.27      47.25  69.85  76.98
        ηd,B12   0.00  48.00  60.29      47.17  69.85  76.94

Note: ηc and ηd are defined in (8) and (12), respectively. For ηc,B and ηd,B the variance estimators are replaced by spectral density estimators with bandwidth B. See the text for further details.

course, the performance of the KPSS test with dummies does not depend on the seasonal structure.

For other (long-run) variance estimators the situation is not that clear-cut. When using QS with $B_4$, $\eta_{d,B}$ behaves as it should under the null, except perhaps at the 1% quantile, where the KPSS test with dummies is undersized; however, the same happens with the KPSS test without dummies when the true DGP does not exhibit deterministic seasonality, and is due to the use of asymptotic critical values. The performance improves for $T = 200$, but the test with $\eta_{d,B}$ is still somewhat oversized at 5%. On the other hand, the behavior of $\eta_{c,B}$ depends strongly on the respective seasonal pattern. Models (i) and (ii) are basically similar: oversized at 5% and at 10%, somewhat undersized at 1%. The tests in models (iii) and (iv) are heavily oversized, especially at 5% and 10%. The distortions diminish for $T = 200$, as expected in light of Proposition 3, item c). The rejection frequency under the considered alternative is essentially the same for both $\eta_{d,B}$ and $\eta_{c,B}$, but lower than when using the simple variance estimator.

With $B_{12}$, the bandwidth is definitely too large for the white noise used in the simulations; this explains the misbehavior (no rejection) at the 1% quantile,

Table 6: Rejection frequencies for KPSS tests, case (ii)

                        T = 80                   T = 200
                  1%     5%     10%        1%     5%     10%
ρ = 0   ηc       0.00   0.00   0.00       0.00   0.00   0.00
        ηd       1.11   6.32  11.04       1.12   6.03  10.33
        ηc,B4    0.87   6.91  12.61       0.93   5.97  10.82
        ηd,B4    0.68   5.72  10.17       0.82   5.58   9.92
        ηc,B12   0.10   8.32  17.38       0.30   5.82  11.32
        ηd,B12   0.00   4.58  11.51       0.24   5.07  10.20
ρ = 1   ηc      85.60  93.21  95.36      98.72  99.72  99.88
        ηd      95.91  99.06  99.58      99.79  99.99  99.99
        ηc,B4   68.46  84.78  90.14      86.60  95.23  97.20
        ηd,B4   68.44  84.76  90.09      86.58  95.22  97.24
        ηc,B12   0.00  48.19  60.36      47.13  69.55  76.47
        ηd,B12   0.00  48.10  60.17      47.08  69.52  76.52

Note: See Table 5.

under the null as well as under the alternative. Fortunately, the situation improves at the 5% and 10% quantiles. While $\eta_{d,B}$ performs satisfactorily, $\eta_{c,B}$ leads to a severely distorted test for $T = 80$, again with some improvement for $T = 200$; the worst distortions appear in case (iv), too. As with $B_4$, the rejection frequency under the alternative is essentially the same when removing seasonality as when simply demeaning, which we attribute to the fact that the purely stochastic part of the long-run variance estimator, now integrated of order 1, dominates (is of higher stochastic order) the terms containing the neglected seasonality.

In summary, the KPSS test without seasonal dummies tends to be oversized (the usual variance estimator leads to an undersized test, but it is unlikely to be employed in practical applications). In contrast, the KPSS test with seasonal dummies behaves as it should. As was to be expected, special care is to be taken with the choice of the bandwidth, especially when testing at small levels. Regarding the rejection frequencies of the tests under the alternative, KPSS with simple demeaning never beats KPSS with seasonal dummies, and is often poorer. Therefore, just as with the ADF test, the inclusion of seasonal dummies is recommended.

Table 7: Rejection frequencies for KPSS tests, case (iii)

                        T = 80                   T = 200
                  1%     5%     10%        1%     5%     10%
ρ = 0   ηc       0.00   0.10   0.37       0.00   0.02   0.18
        ηd       1.08   6.59  11.08       1.13   6.09  10.37
        ηc,B4    1.13   8.00  13.70       1.09   6.68  11.62
        ηd,B4    0.54   5.49  10.22       0.82   5.72  10.04
        ηc,B12   0.05   9.55  18.80       0.46   6.65  12.34
        ηd,B12   0.00   4.75  11.76       0.24   5.30  10.31
ρ = 1   ηc      90.24  96.38  97.76      99.31  99.90  99.96
        ηd      96.13  99.11  99.55      99.79  99.98  99.99
        ηc,B4   69.08  84.76  90.19      86.10  95.16  97.36
        ηd,B4   69.17  84.79  90.12      86.14  95.19  97.43
        ηc,B12   0.00  47.82  60.23      46.81  69.28  76.37
        ηd,B12   0.00  47.70  60.18      46.84  69.28  76.28

Note: See Table 5.

5 Concluding remarks

In this paper we examined the effect of neglecting seasonal dummies when performing integration tests with seasonal time series that display seasonally varying means. It is shown that the limit distribution of the DF test is shifted to the left and at the same time more concentrated. In practice, empirical workers will account for serial residual correlation. This induces further distortions, and is expected to reduce power. The asymptotic arguments are supplemented by Monte Carlo evidence. Indeed, we observe that the (augmented) DF test without seasonal dummies is oversized and at the same time has little power in the presence of seasonally varying means, as theoretically argued. The inclusion of seasonal dummies, however, avoids those defects.

For the KPSS test, it turns out that it is distorted as well. For the spectral density estimator using the Bartlett window, the direction of the distortion depends on the concrete pattern of the ignored seasonality. In contrast, the test becomes conservative when using the OLS variance estimator. Experimental evidence further suggests for the Quadratic Spectral window that its use rather dilutes the small-sample distribution under the null hypothesis, resulting in overrejection.

Table 8: Rejection frequencies for KPSS tests, case (iv)

                         ρ = 0                                           ρ = 1
          ηc      ηd    ηc,B4  ηd,B4  ηc,B12 ηd,B12     ηc      ηd    ηc,B4  ηd,B4  ηc,B12 ηd,B12
T = 80
   1%    0.00    1.21    1.30   0.58   0.17   0.00     78.06   96.10  68.71  68.69   0.00   0.00
   5%    0.00    6.43    8.97   5.26  11.67   4.87     88.27   99.07  84.74  84.80  47.96  47.63
  10%    0.00   11.05   15.53   9.96  22.14  11.70     91.33   99.65  90.29  90.23  60.30  60.20
T = 200
   1%    0.00    1.04    1.02   0.77   0.52   0.26     97.42   99.75  86.23  86.18  46.69  46.70
   5%    0.00    6.02    7.00   5.49   7.23   5.37     99.21   99.96  95.15  95.11  69.44  69.50
  10%    0.00   10.02   12.33  10.07  13.12  10.77     99.54   99.99  97.22  97.22  76.80  76.78

Note: See Table 5.

Analogous results to our findings are to be expected for residual-based cointegration testing as proposed by Engle and Granger (1987), Phillips and Ouliaris (1990) or Shin (1994), and for the ADF test in the context of seasonal unit roots, as in Hylleberg et al. (1990), Ghysels, Lee and Noh (1994) or Rodrigues (1999).

Appendix: Proofs

First, a lemma is presented and proven. Its proof draws heavily on by now standard results due to Phillips and Durlauf (1986), Phillips (1987), and Phillips and Perron (1988) for x_t = x_{t-1} + u_t. Next, the proofs of Propositions 1, 2, and 3 are provided. All results build on a Functional Central Limit Theorem (FCLT) for u_t:

T^{-0.5} \sum_{t=1}^{[rT]} u_t \Rightarrow \omega W(r), \quad r \in [0,1],

with W a standard Brownian motion and \omega^2 the long-run variance of u_t. For simplicity, let T = SN, unless specified otherwise; it is straightforward to check that up to 2S - 2 observations have no influence on the results.
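The FCLT can be illustrated by simulation. In the sketch below (our own example, assuming AR(1) dynamics for u_t with unit innovation variance, so that \omega^2 = 1/(1-a)^2), the endpoint of the scaled partial-sum process T^{-0.5} \sum u_t should have variance close to \omega^2 across replications.

```python
import numpy as np

rng = np.random.default_rng(42)
T, reps, a = 2000, 500, 0.5
omega2 = 1.0 / (1.0 - a) ** 2          # long-run variance of AR(1), unit innovations

e = rng.standard_normal((reps, T))
u = np.zeros((reps, T))
u[:, 0] = e[:, 0]
for t in range(1, T):                  # u_t = a * u_{t-1} + e_t
    u[:, t] = a * u[:, t - 1] + e[:, t]

# endpoint of the scaled partial-sum process: T^{-0.5} * sum_t u_t ~ omega * W(1)
ends = u.sum(axis=1) / np.sqrt(T)
print(round(ends.var(), 1))            # close to omega2 = 4.0
```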

Lemma A  Under the assumptions of Propositions 1 and 2, and for any constants \psi_s, s \in \{1, 2, \ldots, S\}, with \sum_{s=1}^{S} \psi_s = 0, it holds as T \to \infty:

a) \frac{1}{T} \sum_{t=1}^{T} \Delta y_t \to_p 0;

b) \frac{1}{T} \sum_{t=1}^{T} \Delta y_{t-1} \psi_s \to_p \overline{\zeta_{-1}\psi};

c) \frac{1}{T} \sum_{t=1}^{T} \Delta y_t \Delta y_{t-h} \to_p \gamma_h + \overline{\zeta\zeta_{-h}} for h \in \mathbb{Z};

d) \frac{1}{T} \sum_{t=1}^{T} y_{t-1} \psi_s \to_p \overline{\delta_{-1}\psi};

e) \frac{1}{T^{0.5}} \sum_{t=1}^{T} \Delta y_t \to_d \omega W(1);

f) \frac{1}{T^{1.5}} \sum_{t=1}^{T} y_{t-1} \to_d \omega \int_0^1 W(r) dr;

g) \frac{1}{T^2} \sum_{t=1}^{T} y_{t-1}^2 \to_d \omega^2 \int_0^1 W^2(r) dr;

h) \frac{1}{T} \sum_{t=1}^{T} y_{t-1} \varepsilon_t \to_d \omega\sigma_\varepsilon \int_0^1 W(r) dW(r);

i) \frac{1}{T} \sum_{t=1}^{T} y_{t-1} \Delta y_t \to_d \overline{\zeta\delta_{-1}} + \omega^2 \left( \int_0^1 W(r) dW(r) + \frac{\omega^2 - \gamma_0}{2\omega^2} \right);

where \overline{\zeta_{-1}\psi} = \frac{1}{S} \sum_{s=1}^{S} \zeta_{s-1}\psi_s and \overline{\delta_{-1}\psi} = \frac{1}{S} \sum_{s=1}^{S} \delta_{s-1}\psi_s.
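Limit results like item c) are straightforward to check numerically. The sketch below (our own illustration, with iid u_t so that \gamma_0 = 1 and \gamma_h = 0 for h \geq 1, and an arbitrary \zeta pattern summing to zero) compares the sample cross-moment of \Delta y_t with the limit \gamma_h + \overline{\zeta\zeta_{-h}}.

```python
import numpy as np

rng = np.random.default_rng(1)
S, N = 4, 50_000
T = S * N
zeta = np.array([1.0, -2.0, 0.5, 0.5])          # seasonal drifts, summing to zero
u = rng.standard_normal(T)                       # iid: gamma_0 = 1, gamma_h = 0 for h >= 1
dy = u + np.tile(zeta, N)                        # Delta y_t = u_t + zeta_s

h = 1
sample_moment = (dy[h:] * dy[:-h]).mean()        # (1/T) sum Delta y_t Delta y_{t-h}
limit = 0.0 + (zeta * np.roll(zeta, h)).mean()   # gamma_h + (1/S) sum zeta_s zeta_{s-h}
print(round(sample_moment, 3), round(limit, 3))
```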

Proof of Lemma A

a) By assumption,

\frac{1}{T} \sum_{t=1}^{T} \Delta y_t = \frac{1}{T} \sum_{t=1}^{T} u_t + \frac{1}{T} \sum_{t=1}^{T} \zeta_s \to_p E(u_t) = 0.

b) First,

\frac{1}{T} \sum_{t=1}^{T} \Delta y_{t-1} \psi_s = \frac{1}{SN} \sum_{s=1}^{S} \psi_s \sum_{i=1}^{N} u_{(i-1)S+s-1} + \frac{1}{SN} \sum_{i=1}^{N} \sum_{s=1}^{S} \zeta_{s-1}\psi_s.

Since \frac{1}{N} \sum_{i=1}^{N} u_{(i-1)S+s-1} \to_p 0, s \in \{1, 2, \ldots, S\}, the result follows.

c) Because \sum_{s=1}^{S} \zeta_s = \sum_{s=1}^{S} \zeta_{s-h} = 0, \frac{1}{T} \sum_{t=1}^{T} \Delta y_t \Delta y_{t-h} equals

\frac{1}{T} \sum_{t=1}^{T} u_t u_{t-h} + \frac{1}{T} \sum_{t=1}^{T} u_t \zeta_{s-h} + \frac{1}{T} \sum_{t=1}^{T} \zeta_s u_{t-h} + \frac{1}{SN} \sum_{i=1}^{N} \sum_{s=1}^{S} \zeta_s \zeta_{s-h},

which converges in probability to \gamma_h + \overline{\zeta\zeta_{-h}}.

d) Obviously, \frac{1}{T} \sum_{t=1}^{T} y_{t-1} \psi_s equals

\frac{1}{T} \sum_{t=1}^{T} x_{t-1} \psi_s + \frac{1}{T} \sum_{t=1}^{T} \delta_{s-1}\psi_s
= \overline{\delta_{-1}\psi} + \frac{1}{T} \sum_{i=1}^{N} \left( x_{(i-1)S} \sum_{s=1}^{S} \psi_{(i-1)S+s} + \sum_{s=1}^{S-1} (S-s)\, u_{(i-1)S+s}\, \psi_{(i-1)S+s+1} \right)
= \overline{\delta_{-1}\psi} + \frac{1}{S} \sum_{s=1}^{S-1} (S-s) \left( \frac{1}{N} \sum_{i=1}^{N} \psi_{(i-1)S+s+1}\, u_{(i-1)S+s} \right) \to_p \overline{\delta_{-1}\psi}.

e) We have

\frac{1}{T^{0.5}} \sum_{t=1}^{T} \Delta y_t = \frac{1}{T^{0.5}} \sum_{t=1}^{T} u_t + \frac{1}{T^{0.5}} \sum_{t=1}^{T} \zeta_s \to_d \omega W(1).

f) After observing that

\frac{1}{T^{1.5}} \sum_{t=0}^{T-1} y_t = \frac{1}{T^{1.5}} \sum_{t=0}^{T-1} (\delta_s + x_t) = \frac{\overline{\delta}}{\sqrt{T}} + \frac{1}{T^{1.5}} \sum_{t=0}^{T-1} x_t,

the proof is straightforward to complete.

g) Having

\frac{1}{T^2} \sum_{t=1}^{T} y_{t-1}^2 = \frac{1}{T^2} \sum_{t=1}^{T} (\delta_{s-1} + x_{t-1})^2 = \frac{1}{T} \cdot \frac{1}{S} \sum_{s=1}^{S} \delta_{s-1}^2 + \frac{1}{T^2} \sum_{t=1}^{T} x_{t-1}^2 + \frac{2}{T^2} \sum_{t=1}^{T} \delta_{s-1} x_{t-1},

the result follows because \sum_{t=1}^{T} x_{t-1} = O_p(T^{1.5}) and \delta_{s-1} = O(1).

h) By assumption,

\frac{1}{T} \sum_{t=1}^{T} y_{t-1} \varepsilon_t = \frac{1}{T} \sum_{t=1}^{T} x_{t-1} \varepsilon_t + \frac{1}{SN} \sum_{s=1}^{S} \delta_{s-1} \sum_{i=1}^{N} \varepsilon_{(i-1)S+s} \to_d \omega\sigma_\varepsilon \int_0^1 W(r)\, dW(r).

i) Note that

\sum_{t=1}^{T} y_{t-1} \Delta y_t = \sum_{t=1}^{T} \delta_{s-1}\zeta_s + \sum_{t=1}^{T} \delta_{s-1} u_t + \sum_{t=1}^{T} x_{t-1}\zeta_s + \sum_{t=1}^{T} x_{t-1} u_t.

Because

\frac{1}{T} \sum_{t=1}^{T} \delta_{s-1} u_t \to_p 0,

and having shown in d) that \frac{1}{T} \sum_{t=1}^{T} x_{t-1}\psi_s \to_p 0, the desired result follows, due to \sum_{s=1}^{S} \zeta_s = 0. ∎

Proof of Proposition 1

With Lemma A at hand, the proof of Proposition 1 is straightforward. First, for

T\hat{\phi} = \frac{T^{-1} \sum y_{t-1}\Delta y_t - T^{-2} \sum y_{t-1} \sum \Delta y_t}{T^{-2} \sum y_{t-1}^2 - T^{-3} \left(\sum y_{t-1}\right)^2}

the asymptotic distribution is obvious. Second, consider the residual variance estimator,

s^2 = \frac{1}{T} \sum_{t=1}^{T} \hat{u}_t^2, \quad \hat{u}_t = \Delta y_t - \hat{c} - \hat{\phi}\, y_{t-1},

with

\hat{c} = \overline{\Delta y} - \hat{\phi}\, \overline{y_{-1}} = O_p(T^{-0.5})

according to Lemma A. Therefore, and due to \hat{\phi} = O_p(T^{-1}):

s^2 = \frac{1}{T} \sum_{t=1}^{T} (\Delta y_t)^2 + o_p(1)
= \frac{1}{T} \sum_{i=1}^{N} \sum_{s=1}^{S} \left(\zeta_s + u_{(i-1)S+s}\right)^2 + o_p(1)
= \frac{1}{S} \sum_{s=1}^{S} \zeta_s^2 + \frac{1}{SN} \sum_{i=1}^{N} \sum_{s=1}^{S} u_{(i-1)S+s}^2 + o_p(1)
\to_p \overline{\zeta^2} + \gamma_0.

Hence, we obtain the limiting distribution of the t statistic t_\phi as required. ∎
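The limit of the residual variance estimator can be reproduced by simulation. The sketch below (our own illustration, with iid u_t so that \gamma_0 = 1) runs the Dickey-Fuller regression without seasonal dummies on a random walk with seasonally shifted levels and compares s^2 with \overline{\zeta^2} + \gamma_0.

```python
import numpy as np

rng = np.random.default_rng(7)
S, N = 4, 25_000
T = S * N
delta = np.array([0.0, 2.0, -1.0, 1.0])       # seasonal means
zeta = delta - np.roll(delta, 1)              # zeta_s = delta_s - delta_{s-1}
u = rng.standard_normal(T)                    # gamma_0 = 1
y = np.tile(delta, N) + np.cumsum(u)          # unit root plus seasonal means

# Dickey-Fuller regression of Delta y on a constant and y_{t-1}, no dummies
dy, ylag = np.diff(y), y[:-1]
X = np.column_stack([np.ones(T - 1), ylag])
c_hat, phi_hat = np.linalg.lstsq(X, dy, rcond=None)[0]
s2 = ((dy - c_hat - phi_hat * ylag) ** 2).mean()

limit = (zeta ** 2).mean() + 1.0              # zeta^2-bar + gamma_0 = 5.5 here
print(round(s2, 2), round(limit, 2))
```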

Proof of Proposition 2

Denote by \beta the true values of the model coefficients from (7),

\beta = (0,\ 0,\ \alpha_1,\ \cdots,\ \alpha_p)',

and by \hat{\beta} its OLS estimate. Let the t-th row, t = 1, \ldots, T, of the design matrix X be

X_t = (1,\ y_{t-1},\ \Delta y_{t-1},\ \cdots,\ \Delta y_{t-p}).

Let similarly \Delta y = (\Delta y_t)_{t=1,\ldots,T} be the column vector containing the observed dependent variable. Then \hat{\beta} can be calculated by solving the linear equation system

(X'X)\,\hat{\beta} = X'\Delta y,

and \hat{\phi} by taking the second element of \hat{\beta}. Denote by X^* the matrix with rows X^*_t, t = 1, \ldots, T,

X^*_t = (1,\ x_{t-1},\ \Delta x_{t-1},\ \cdots,\ \Delta x_{t-p}),

and by V the difference between X^* and X, with rows V_t, t = 1, \ldots, T,

V_t = (0,\ -\delta_{s-1},\ -\zeta_{s-1},\ \cdots,\ -\zeta_{s-p}).

Note that the following relationships hold:

V_t\beta + \zeta_s = \zeta_s - \zeta_{s-1}\alpha_1 - \ldots - \zeta_{s-p}\alpha_p = \pi_s \quad \text{and} \quad \sum_{s=1}^{S} \pi_s = 0.

Define the column vectors \varepsilon = (\varepsilon_t)_{t=1,\ldots,T}, \zeta = (\zeta_s)_{s = t \bmod S,\ t=1,\ldots,T}, and \pi = (\pi_s)_{s = t \bmod S,\ t=1,\ldots,T}. Since

\Delta y_t = \Delta x_t + \zeta_s = X^*_t\beta + \varepsilon_t + \zeta_s = (X_t + V_t)\beta + \varepsilon_t + \zeta_s,

we have for \hat{\beta}

\hat{\beta} - \beta = (X'X)^{-1}\left(X'(V\beta + \varepsilon + \zeta)\right) = (X'X)^{-1}\left(X'(\varepsilon + \pi)\right);

for \hat{\phi} - 0, we take the second element of \hat{\beta} - \beta. This can be calculated without inverting X'X by Cramer's rule:

\hat{\phi} = \frac{|M|}{|X'X|},

where the matrix M is obtained by replacing the second column of X'X with X'(\varepsilon + \pi). Then, it obviously holds that

T\hat{\phi} = \frac{T^{-(p+2)}\,|M|}{T^{-(p+3)}\,|X'X|}.
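The Cramer's rule representation of \hat{\phi} is a purely algebraic identity and can be verified directly on simulated data. This is a small numerical check with our own variable names (p = 2 here); any data would do.

```python
import numpy as np

rng = np.random.default_rng(3)
T, p = 200, 2
y = np.cumsum(rng.standard_normal(T + p + 1))   # any series works for the identity
dy = np.diff(y)                                  # dy[k] = y[k+1] - y[k]

k = np.arange(p, T + p)                          # estimation sample
X = np.column_stack([np.ones(T), y[k]] + [dy[k - q] for q in range(1, p + 1)])
target = dy[k]                                   # Delta y_t

beta = np.linalg.lstsq(X, target, rcond=None)[0]

# Cramer's rule: replace the 2nd column of X'X by X'(Delta y)
XtX = X.T @ X
M = XtX.copy()
M[:, 1] = X.T @ target
phi_cramer = np.linalg.det(M) / np.linalg.det(XtX)

print(phi_cramer, beta[1])   # the two numbers coincide
```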

Consequently, T^{-(p+3)}|X'X| equals

\frac{1}{T^{p+3}}
\begin{vmatrix}
T & \sum y_{t-1} & \sum \Delta y_{t-1} & \cdots & \sum \Delta y_{t-p} \\
\sum y_{t-1} & \sum y_{t-1}^2 & \sum y_{t-1}\Delta y_{t-1} & \cdots & \sum y_{t-1}\Delta y_{t-p} \\
\sum \Delta y_{t-1} & \sum y_{t-1}\Delta y_{t-1} & \sum \Delta y_{t-1}^2 & \cdots & \sum \Delta y_{t-1}\Delta y_{t-p} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\sum \Delta y_{t-p} & \sum y_{t-1}\Delta y_{t-p} & \sum \Delta y_{t-1}\Delta y_{t-p} & \cdots & \sum \Delta y_{t-p}^2
\end{vmatrix}
=
\begin{vmatrix}
1 & \frac{\sum y_{t-1}}{T^{1.5}} & \frac{\sum \Delta y_{t-1}}{T} & \cdots & \frac{\sum \Delta y_{t-p}}{T} \\
\frac{\sum y_{t-1}}{T^{1.5}} & \frac{\sum y_{t-1}^2}{T^2} & \frac{\sum y_{t-1}\Delta y_{t-1}}{T^{1.5}} & \cdots & \frac{\sum y_{t-1}\Delta y_{t-p}}{T^{1.5}} \\
\frac{\sum \Delta y_{t-1}}{T} & \frac{\sum y_{t-1}\Delta y_{t-1}}{T^{1.5}} & \frac{\sum \Delta y_{t-1}^2}{T} & \cdots & \frac{\sum \Delta y_{t-1}\Delta y_{t-p}}{T} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\frac{\sum \Delta y_{t-p}}{T} & \frac{\sum y_{t-1}\Delta y_{t-p}}{T^{1.5}} & \frac{\sum \Delta y_{t-1}\Delta y_{t-p}}{T} & \cdots & \frac{\sum \Delta y_{t-p}^2}{T}
\end{vmatrix}.

It is easy to check that all elements are now bounded in probability, whereas some are of stochastic order O_p(T^{-0.5}) or O_p(T^{-1}), which allows us to multiply them with each other without fear of indeterminacy and to obtain for T \to \infty, with the help of Lemma A, the expression

\begin{vmatrix}
1 & \omega \int_0^1 W(r)\,dr & 0 & \cdots & 0 \\
\omega \int_0^1 W(r)\,dr & \omega^2 \int_0^1 W^2(r)\,dr & 0 & \cdots & 0 \\
0 & 0 & \gamma_0 + \overline{\zeta^2} & \cdots & \gamma_{p-1} + \overline{\zeta\zeta_{+p-1}} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \gamma_{p-1} + \overline{\zeta\zeta_{+p-1}} & \cdots & \gamma_0 + \overline{\zeta^2}
\end{vmatrix}.

We thus have

\frac{1}{T^{p+3}}\,|X'X| \to_d \omega^2 A \left( \int_0^1 W^2(r)\,dr - \left( \int_0^1 W(r)\,dr \right)^2 \right),

with A from Proposition 2. Then, T^{-(p+2)}|M| equals

\frac{1}{T^{p+2}}
\begin{vmatrix}
T & \sum (\varepsilon_t + \pi_s) & \sum \Delta y_{t-1} & \cdots & \sum \Delta y_{t-p} \\
\sum y_{t-1} & \sum y_{t-1}(\varepsilon_t + \pi_s) & \sum y_{t-1}\Delta y_{t-1} & \cdots & \sum y_{t-1}\Delta y_{t-p} \\
\sum \Delta y_{t-1} & \sum \Delta y_{t-1}(\varepsilon_t + \pi_s) & \sum \Delta y_{t-1}^2 & \cdots & \sum \Delta y_{t-1}\Delta y_{t-p} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\sum \Delta y_{t-p} & \sum \Delta y_{t-p}(\varepsilon_t + \pi_s) & \sum \Delta y_{t-1}\Delta y_{t-p} & \cdots & \sum \Delta y_{t-p}^2
\end{vmatrix}.

Split T^{-(p+2)}|M| into two determinants, T^{-(p+2)}|M| = D_1 + D_2, where

D_1 =
\begin{vmatrix}
1 & \frac{\sum \varepsilon_t}{\sqrt{T}} & \frac{\sum \Delta y_{t-1}}{\sqrt{T}} & \cdots & \frac{\sum \Delta y_{t-p}}{\sqrt{T}} \\
\frac{\sum y_{t-1}}{T^{1.5}} & \frac{\sum y_{t-1}\varepsilon_t}{T} & \frac{\sum y_{t-1}\Delta y_{t-1}}{T} & \cdots & \frac{\sum y_{t-1}\Delta y_{t-p}}{T} \\
\frac{\sum \Delta y_{t-1}}{T^{1.5}} & \frac{\sum \Delta y_{t-1}\varepsilon_t}{T} & \frac{\sum \Delta y_{t-1}^2}{T} & \cdots & \frac{\sum \Delta y_{t-1}\Delta y_{t-p}}{T} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\frac{\sum \Delta y_{t-p}}{T^{1.5}} & \frac{\sum \Delta y_{t-p}\varepsilon_t}{T} & \frac{\sum \Delta y_{t-1}\Delta y_{t-p}}{T} & \cdots & \frac{\sum \Delta y_{t-p}^2}{T}
\end{vmatrix}

and D_2 is the same determinant with the \varepsilon column replaced by the corresponding \pi column, \left( \frac{\sum \pi_s}{\sqrt{T}},\ \frac{\sum y_{t-1}\pi_s}{T},\ \frac{\sum \Delta y_{t-1}\pi_s}{T},\ \cdots,\ \frac{\sum \Delta y_{t-p}\pi_s}{T} \right)'.

D_1, because of Lemma A, converges in distribution to

\begin{vmatrix}
1 & \sigma_\varepsilon W(1) & ? & \cdots & ? \\
\omega \int_0^1 W(r)\,dr & \omega\sigma_\varepsilon \int_0^1 W(r)\,dW(r) & ? & \cdots & ? \\
0 & 0 & \gamma_0 + \overline{\zeta^2} & \cdots & \gamma_{p-1} + \overline{\zeta\zeta_{+p-1}} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \gamma_{p-1} + \overline{\zeta\zeta_{+p-1}} & \cdots & \gamma_0 + \overline{\zeta^2}
\end{vmatrix}
= \omega\sigma_\varepsilon A \left( \int_0^1 W(r)\,dW(r) - W(1) \int_0^1 W(r)\,dr \right)

(the elements marked with ? are bounded in probability and, being multiplied with zero, there is no need to calculate them).

With D_2, multiply the first column by \frac{\sum \Delta y_{t-q}}{\sqrt{T}} and subtract it from column q+2, q = 1, \ldots, p (that is, from columns 3 to p+2), which does not change its value. Expand D_2 along its second row and obtain, with the help of Lemma A, the following expression for its limit:

\overline{\delta_{-1}\pi}\,A - \sum_{q=1}^{p} (-1)^{q} A_q \left( \overline{\delta_{-1}\zeta} + \omega^2 \left( \int_0^1 W(r)\,dW(r) + \frac{\omega^2 - \gamma_0}{2\omega^2} \right) - \omega^2\, W(1) \int_0^1 W(r)\,dr + \sum_{r=0}^{q-1} \left( \gamma_r + \overline{\zeta\zeta_{-r}} \right) \right).

Adding this to the limit of D_1 and rearranging the terms leads to the desired result.

For the t statistic t_\phi, one needs the second diagonal element of (X'X)^{-1} and an estimate of the residual variance:

t_\phi = \frac{\hat{\phi}}{\sqrt{\hat{\sigma}_\varepsilon^2\, [X'X]^{-1}_{22}}},

with

\hat{\sigma}_\varepsilon^2 = \frac{1}{T} \sum_{t=1}^{T} \left( \Delta y_t - \hat{c} - \hat{\phi}\,y_{t-1} - \hat{\alpha}_1 \Delta y_{t-1} - \ldots - \hat{\alpha}_p \Delta y_{t-p} \right)^2.

It further holds that

[X'X]^{-1}_{22} = (-1)^{2+2}\, \frac{|M^*|}{|X'X|},

where the cofactor is obtained by deleting the second row and column of the matrix X'X:

|M^*| =
\begin{vmatrix}
T & \sum \Delta y_{t-1} & \cdots & \sum \Delta y_{t-p} \\
\sum \Delta y_{t-1} & \sum \Delta y_{t-1}^2 & \cdots & \sum \Delta y_{t-1}\Delta y_{t-p} \\
\vdots & \vdots & \ddots & \vdots \\
\sum \Delta y_{t-p} & \sum \Delta y_{t-1}\Delta y_{t-p} & \cdots & \sum \Delta y_{t-p}^2
\end{vmatrix},

which leads to

t_\phi = \frac{T\hat{\phi}}{\sqrt{\hat{\sigma}_\varepsilon^2\, \dfrac{T^{p+3}}{T^{p+1}}\, \dfrac{|M^*|}{|X'X|}}}.

Then, we have

\frac{1}{T^{p+1}}\,|M^*| \to_p A,

and, for \frac{1}{T^{p+3}}\,|X'X|, the known result.

Before examining the behavior of \hat{\sigma}_\varepsilon^2, recall that \hat{\phi} is a superconsistent estimator. Unfortunately, the other estimators in (7) do not behave that nicely. We have for the autoregressive parameters (w.l.o.g. for \alpha_1)

\hat{\alpha}_1 - \alpha_1 = \frac{|M^{**}|}{|X'X|},

where the matrix M^{**} is obtained by replacing the third column of X'X with X'(\varepsilon + \pi), so T^{-(p+3)}|M^{**}| equals

\begin{vmatrix}
1 & \frac{\sum y_{t-1}}{T^{1.5}} & \frac{\sum (\varepsilon_t+\pi_s)}{T} & \frac{\sum \Delta y_{t-2}}{T} & \cdots & \frac{\sum \Delta y_{t-p}}{T} \\
\frac{\sum y_{t-1}}{T^{1.5}} & \frac{\sum y_{t-1}^2}{T^2} & \frac{\sum y_{t-1}(\varepsilon_t+\pi_s)}{T^{1.5}} & \frac{\sum y_{t-1}\Delta y_{t-2}}{T^{1.5}} & \cdots & \frac{\sum y_{t-1}\Delta y_{t-p}}{T^{1.5}} \\
\frac{\sum \Delta y_{t-1}}{T} & \frac{\sum y_{t-1}\Delta y_{t-1}}{T^{1.5}} & \frac{\sum \Delta y_{t-1}(\varepsilon_t+\pi_s)}{T} & \frac{\sum \Delta y_{t-1}\Delta y_{t-2}}{T} & \cdots & \frac{\sum \Delta y_{t-1}\Delta y_{t-p}}{T} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
\frac{\sum \Delta y_{t-p}}{T} & \frac{\sum y_{t-1}\Delta y_{t-p}}{T^{1.5}} & \frac{\sum \Delta y_{t-p}(\varepsilon_t+\pi_s)}{T} & \frac{\sum \Delta y_{t-2}\Delta y_{t-p}}{T} & \cdots & \frac{\sum \Delta y_{t-p}^2}{T}
\end{vmatrix},

which converges to

\begin{vmatrix}
1 & \omega \int_0^1 W(r)\,dr & 0 & 0 & \cdots & 0 \\
\omega \int_0^1 W(r)\,dr & \omega^2 \int_0^1 W^2(r)\,dr & 0 & 0 & \cdots & 0 \\
0 & 0 & \overline{\zeta_{-1}\pi} & \gamma_1 + \overline{\zeta\zeta_{-1}} & \cdots & \gamma_{p-1} + \overline{\zeta\zeta_{+p-1}} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \overline{\zeta_{-p}\pi} & \gamma_{p-2} + \overline{\zeta\zeta_{+p-2}} & \cdots & \gamma_0 + \overline{\zeta^2}
\end{vmatrix}.

Hence, it follows

\hat{\alpha}_i - \alpha_i \to_d \frac{\omega^2 A_i \left( \int_0^1 W^2(r)\,dr - \left( \int_0^1 W(r)\,dr \right)^2 \right)}{\omega^2 A \left( \int_0^1 W^2(r)\,dr - \left( \int_0^1 W(r)\,dr \right)^2 \right)} = \frac{A_i}{A},

since the same functional depending on the same sampled Wiener process appears in both numerator and denominator. The estimators \hat{\alpha}_i are thus not consistent. Similar considerations lead to \hat{c} = o_p(1).
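The inconsistency is easy to visualize. For p = 1 and iid u_t (so that the true \alpha_1 is zero and \pi_s = \zeta_s), the pseudo-true value A_1/A works out to the first-order autocorrelation of \Delta y_t, i.e. \overline{\zeta\zeta_{-1}}/(\gamma_0 + \overline{\zeta^2}). The sketch below (our own illustration) shows \hat{\alpha}_1 settling at this wrong value rather than at \alpha_1 = 0.

```python
import numpy as np

rng = np.random.default_rng(11)
S, N = 4, 50_000
T = S * N
delta = np.array([0.0, 2.0, -1.0, 1.0])
zeta = delta - np.roll(delta, 1)                  # [-1, 2, -3, 2]
y = np.tile(delta, N) + np.cumsum(rng.standard_normal(T))

dy = np.diff(y)
k = np.arange(1, T - 1)
X = np.column_stack([np.ones(T - 2), y[k], dy[k - 1]])   # ADF with p = 1, no dummies
c_hat, phi_hat, a1_hat = np.linalg.lstsq(X, dy[k], rcond=None)[0]

# true alpha_1 = 0, but the estimate settles at the pseudo-true value A_1/A:
a1_limit = (zeta * np.roll(zeta, 1)).mean() / (1.0 + (zeta ** 2).mean())
print(round(a1_hat, 2), round(a1_limit, 2))
```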

Then, the usual residual variance estimator becomes

\hat{\sigma}_\varepsilon^2 = \frac{1}{T} \sum_{t=1}^{T} \left( \Delta y_t - \hat{c} - \hat{\phi}\,y_{t-1} - \hat{\alpha}_1 \Delta y_{t-1} - \ldots - \hat{\alpha}_p \Delta y_{t-p} \right)^2
= \frac{1}{T} \sum_{t=1}^{T} \left( \Delta y_t - \left( \alpha_1 + \frac{A_1}{A} \right) \Delta y_{t-1} - \ldots - \left( \alpha_p + \frac{A_p}{A} \right) \Delta y_{t-p} \right)^2 + o_p(1)
= \frac{1}{T} \sum_{t=1}^{T} \left( u_t - \sum_{q=1}^{p} \alpha_q u_{t-q} - \sum_{q=1}^{p} \frac{A_q}{A} u_{t-q} + \pi_s - \sum_{q=1}^{p} \frac{A_q}{A} \zeta_{s-q} \right)^2 + o_p(1)
= \frac{1}{T} \sum_{t=1}^{T} \left( \varepsilon_t - \sum_{q=1}^{p} \frac{A_q}{A} u_{t-q} \right)^2 + \frac{1}{T} \sum_{t=1}^{T} \left( \pi_s - \sum_{q=1}^{p} \frac{A_q}{A} \zeta_{s-q} \right)^2
+ \frac{2}{T} \sum_{t=1}^{T} \left( \varepsilon_t - \sum_{q=1}^{p} \frac{A_q}{A} u_{t-q} \right) \left( \pi_s - \sum_{q=1}^{p} \frac{A_q}{A} \zeta_{s-q} \right) + o_p(1).

Since \sum_{s=1}^{S} \left( \pi_s - \sum_{q=1}^{p} \frac{A_q}{A} \zeta_{s-q} \right) = \sum_{s=1}^{S} \pi_s - \sum_{q=1}^{p} \frac{A_q}{A} \left( \sum_{s=1}^{S} \zeta_{s-q} \right) = 0, we have

\frac{1}{T} \sum_{t=1}^{T} \left( \varepsilon_t - \sum_{q=1}^{p} \frac{A_q}{A} u_{t-q} \right) \left( \pi_s - \sum_{q=1}^{p} \frac{A_q}{A} \zeta_{s-q} \right) \to_p 0,

so

\hat{\sigma}_\varepsilon^2 = \frac{1}{T} \sum_{t=1}^{T} \left( \varepsilon_t - \sum_{q=1}^{p} \frac{A_q}{A} u_{t-q} \right)^2 + \frac{1}{T} \sum_{t=1}^{T} \left( \pi_s - \sum_{q=1}^{p} \frac{A_q}{A} \zeta_{s-q} \right)^2 + o_p(1)
\to_p \sigma_\varepsilon^2 + \sum_{q=1}^{p} \sum_{r=1}^{p} \frac{A_q A_r}{A^2} \gamma_{|q-r|} + \overline{\left( \pi - \sum_{q=1}^{p} \frac{A_q}{A} \zeta_{-q} \right)^2},

as required for the result. ∎

Proof of Proposition 3

The proof relies on an FCLT in terms of Brownian bridges:

T^{-0.5} \sum_{t=1}^{[rT]} (u_t - \overline{u}) \Rightarrow \omega \left( W(r) - r\,W(1) \right), \quad r \in [0,1].

The regression on a constant provides under (1) with (2) and \rho = 0:

\overline{y} = \overline{\delta} + \overline{u}, \quad \hat{x}^c_t = u_t - \overline{u} + \delta_s - \overline{\delta}.

After some computations, we obtain

\sum_{t=1}^{[rT]} \hat{x}^c_t = \sum_{t=1}^{[rT]} u_t - [rT]\,\overline{u} + \delta_1 + \cdots + \delta_{[rT] \bmod S} - \left( [rT] \bmod S \right) \overline{\delta},

and, therefore,

T^{-0.5} S^c_t \to_d \omega \left( W(r) - r\,W(1) \right).

a) The variance estimator, however, is not consistent, even if u_t is white noise:

\hat{\sigma}_c^2 = \frac{1}{T} \sum_{i=1}^{N} \sum_{s=1}^{S} \left( u_t - \overline{u} + \delta_s - \overline{\delta} \right)^2
= \frac{1}{T} \sum_{t=1}^{T} (u_t - \overline{u})^2 + \frac{1}{S} \sum_{s=1}^{S} \left( \delta_s - \overline{\delta} \right)^2 + O_p(1/\sqrt{T}),
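Part a) can be reproduced directly. The sketch below (our own illustration, with white-noise u_t of unit variance) shows the OLS variance estimator converging to 1 + (1/S)\sum(\delta_s - \overline{\delta})^2 instead of to \gamma_0 = 1.

```python
import numpy as np

rng = np.random.default_rng(5)
S, N = 4, 50_000
T = S * N
delta = np.array([0.0, 5.0, -3.0, 1.0])
u = rng.standard_normal(T)                     # white noise, variance 1
y = np.tile(delta, N) + u                      # rho = 0: stationary around seasonal means

e = y - y.mean()                               # residuals from a constant only
sigma2_c = (e ** 2).mean()

inflation = ((delta - delta.mean()) ** 2).mean()   # (1/S) sum (delta_s - delta_bar)^2
print(round(sigma2_c, 2), round(1.0 + inflation, 2))
```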

as required for the result.

b) Analogously to a), the long-run variance estimator \hat{\omega}^2_{c,B} can be split:

\hat{\omega}^2_{c,B} = \frac{1}{T} \sum_{t=1}^{T} (u_t - \overline{u})^2 + 2 \sum_{h=1}^{T-1} w(h/B)\, \frac{1}{T} \sum_{t=1}^{T-h} (u_t - \overline{u})(u_{t+h} - \overline{u})
+ \frac{1}{T} \sum_{t=1}^{T} \left( \delta_s - \overline{\delta} \right)^2 + 2 \sum_{h=1}^{T-1} w(h/B)\, \frac{1}{T} \sum_{t=1}^{T-h} \left( \delta_s - \overline{\delta} \right) \left( \delta_{s+h} - \overline{\delta} \right) + O_p(B/\sqrt{T}),

where the fact that w(h/B) = 0 for h \geq B has been used. By the same argument, it is easy to show that

\frac{1}{T} \sum_{t=1}^{T-h} \left( \delta_s - \overline{\delta} \right) \left( \delta_{s+h} - \overline{\delta} \right) = \overline{\left( \delta - \overline{\delta} \right) \left( \delta_{+h} - \overline{\delta} \right)} + O\!\left( \frac{B}{T} \right).

Then,

\sum_{h=1}^{B} w(h/B)\, \frac{1}{T} \sum_{t=1}^{T-h} \left( \delta_s - \overline{\delta} \right) \left( \delta_{s+h} - \overline{\delta} \right)
= \sum_{h=1}^{B} \left( 1 - \frac{h}{B} \right) \left( \overline{\left( \delta - \overline{\delta} \right) \left( \delta_{+h} - \overline{\delta} \right)} + O\!\left( \frac{B}{T} \right) \right)
= \sum_{h=1}^{B} \left( 1 - \frac{h}{B} \right) \overline{\left( \delta - \overline{\delta} \right) \left( \delta_{+h} - \overline{\delta} \right)} + O\!\left( \frac{B^2}{T} \right).

Let B = KS + b. Since \overline{(\delta - \overline{\delta})(\delta_{+h} - \overline{\delta})} is cyclic in h with period S, split \sum_{h=1}^{B} \left( 1 - \frac{h}{B} \right) \overline{(\delta - \overline{\delta})(\delta_{+h} - \overline{\delta})} into S sub-sums plus a left-over:

\sum_{s=1}^{S} \left( \sum_{i=1}^{K} \left( 1 - \frac{s + (i-1)S}{B} \right) \overline{(\delta - \overline{\delta})(\delta_{+s} - \overline{\delta})} \right) + \sum_{h=KS+1}^{B} \left( 1 - \frac{h}{B} \right) \overline{(\delta - \overline{\delta})(\delta_{+h} - \overline{\delta})}.

While the sum

\sum_{s=1}^{S} \left( \sum_{i=1}^{K} \left( 1 - \frac{s + (i-1)S}{B} \right) \right) \overline{(\delta - \overline{\delta})(\delta_{+s} - \overline{\delta})}
= \sum_{s=1}^{S} \overline{(\delta - \overline{\delta})(\delta_{+s} - \overline{\delta})} \left( K - \frac{SK(K-1)}{2B} - \frac{Ks}{B} \right)
= \left( K - \frac{SK(K-1)}{2B} \right) \sum_{s=1}^{S} \overline{(\delta - \overline{\delta})(\delta_{+s} - \overline{\delta})} - \frac{K}{B} \sum_{s=1}^{S} s\, \overline{(\delta - \overline{\delta})(\delta_{+s} - \overline{\delta})}

has a finite limit, due to \sum_{s=1}^{S} \overline{(\delta - \overline{\delta})(\delta_{+s} - \overline{\delta})} = 0 and \frac{Ks}{B} \to_{B \to \infty} \frac{s}{S}, the part

\sum_{h=KS+1}^{B} \left( 1 - \frac{h}{B} \right) \overline{(\delta - \overline{\delta})(\delta_{+h} - \overline{\delta})}

disappears for B \to \infty; because \sum_{h=1}^{B} w(h/B)\, \frac{B^2}{T} = o(1), it holds

\sum_{h=1}^{B} w(h/B)\, \frac{1}{T} \sum_{t=1}^{T-h} \left( \delta_s - \overline{\delta} \right) \left( \delta_{s+h} - \overline{\delta} \right) \to -\frac{1}{S} \sum_{s=1}^{S} s\, \overline{(\delta - \overline{\delta})(\delta_{+s} - \overline{\delta})},
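For B a multiple of S (so that the left-over block is empty), the algebra above collapses to an exact identity, which the following check illustrates with an arbitrary quarterly pattern (our own example; the zero-sum of the cyclic autocovariances makes the K and K(K-1) terms drop out exactly).

```python
import numpy as np

delta = np.array([0.0, 2.0, -1.0, 1.0])   # arbitrary quarterly pattern
S = len(delta)
d = delta - delta.mean()
# cyclic autocovariance of the pattern: c(h) = (1/S) sum_s d_s d_{s+h}
c = np.array([(d * np.roll(d, -h)).mean() for h in range(S)])

B = 400                                    # Bartlett bandwidth, a multiple of S
h = np.arange(1, B)
bartlett_sum = ((1.0 - h / B) * c[h % S]).sum()

# -(1/S) * sum_{s=1}^{S} s * c(s), with c(S) = c(0) by cyclicity
limit = -sum(s * c[s % S] for s in range(1, S + 1)) / S
print(round(bartlett_sum, 6), round(limit, 6))
```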

as needed to complete the proof.

c) Since now 1 \leq h \leq T-1, some of the arguments in b) have to be modified. Expanding \hat{\omega}^2_{c,B}, we have for the sum containing cross-products

\sum_{h=1}^{T-1} w(h/B)\, \frac{1}{T} \sum_{t=1}^{T-h} (u_t - \overline{u}) \left( \delta_{s+h} - \overline{\delta} \right)
= \sum_{h=1}^{B-1} w(h/B)\, \frac{1}{T} \sum_{t=1}^{T-h} (u_t - \overline{u}) \left( \delta_{s+h} - \overline{\delta} \right) + \sum_{h=B}^{T-1} w(h/B)\, \frac{1}{T} \sum_{t=1}^{T-h} (u_t - \overline{u}) \left( \delta_{s+h} - \overline{\delta} \right).

The first term is easily shown to disappear as T \to \infty. For the second, note that, as in the proof of Lemma A, item b), it holds

\frac{1}{T} \sum_{t=1}^{T-h} (u_t - \overline{u}) \left( \delta_{s+h} - \overline{\delta} \right) = O_p\!\left( \frac{1}{\sqrt{T}} \right).

Then, since w(h/B) = O\!\left( (h/B)^{-2} \right),

\sum_{h=B}^{T-1} w(h/B)\, \frac{1}{T} \sum_{t=1}^{T-h} (u_t - \overline{u}) \left( \delta_{s+h} - \overline{\delta} \right)
= O_p\!\left( \frac{1}{\sqrt{T}}\, B^2 \sum_{h=B}^{T-1} \frac{1}{h^2} \right) = O_p\!\left( \frac{B^2}{\sqrt{T}} \left( \frac{1}{B} - \frac{1}{T-1} \right) \right) = O_p\!\left( \frac{B}{\sqrt{T}} \right) = o_p(1).

The same holds for \sum_{h=1}^{T-1} w(h/B)\, \frac{1}{T} \sum_{t=1}^{T-h} (u_{t+h} - \overline{u}) \left( \delta_s - \overline{\delta} \right), so it follows

\hat{\omega}^2_{c,B} = \frac{1}{T} \sum_{t=1}^{T} (u_t - \overline{u})^2 + 2 \sum_{h=1}^{T-1} w(h/B)\, \frac{1}{T} \sum_{t=1}^{T-h} (u_t - \overline{u})(u_{t+h} - \overline{u})
+ \frac{1}{T} \sum_{t=1}^{T} \left( \delta_s - \overline{\delta} \right)^2 + 2 \sum_{h=1}^{T-1} w(h/B)\, \frac{1}{T} \sum_{t=1}^{T-h} \left( \delta_s - \overline{\delta} \right) \left( \delta_{s+h} - \overline{\delta} \right) + o_p(1).

In order to show that the deterministic part converges to zero, we now resort to spectral analysis. Denote by I_\delta(\lambda), \lambda \in (-\pi, \pi), the periodogram of the deterministic seasonal pattern computed from T observations:

I_\delta(\lambda) = \frac{1}{T} \sum_{t=1}^{T} \left( \delta_s - \overline{\delta} \right)^2 + 2 \sum_{h=1}^{T-1} \frac{1}{T} \sum_{t=1}^{T-h} \left( \delta_s - \overline{\delta} \right) \left( \delta_{s+h} - \overline{\delta} \right) \cos(\lambda h)
= \frac{1}{T} \left| \sum_{h=1}^{T} \left( \delta_h - \overline{\delta} \right) e^{i\lambda h} \right|^2.

Let T = KS + b, 0 \leq b < S, and consider |\lambda| < 2\pi/S. Due to the periodicity of \delta_t - \overline{\delta}, we can write

I_\delta(\lambda) = \frac{1}{T} \left| \sum_{s=1}^{S} \left( \delta_s - \overline{\delta} \right) \sum_{k=0}^{K-1} e^{i\lambda(kS+s)} + \sum_{s=1}^{b} \left( \delta_s - \overline{\delta} \right) e^{i\lambda(KS+s)} \right|^2.

Then,

\sum_{k=0}^{K-1} e^{i\lambda(kS+s)} = e^{i\lambda s} \sum_{k=0}^{K-1} \left( e^{i\lambda S} \right)^k = e^{i\lambda s}\, \frac{1 - \left( e^{i\lambda S} \right)^K}{1 - e^{i\lambda S}},

which, as can easily be checked, has finite modulus. The number of seasons S being itself finite, it follows that

I_\delta(\lambda) = O\!\left( T^{-1} \right), \quad |\lambda| < 2\pi/S.

Note that, at the seasonal frequencies \lambda_s = 2\pi s/S, the periodogram diverges. The spectral density of a series can be estimated by smoothing the periodogram. At frequency zero, it holds that

\frac{1}{T} \sum_{t=1}^{T} \left( \delta_s - \overline{\delta} \right)^2 + 2 \sum_{h=1}^{T-1} w(h/B)\, \frac{1}{T} \sum_{t=1}^{T-h} \left( \delta_s - \overline{\delta} \right) \left( \delta_{s+h} - \overline{\delta} \right) = \int_{-\pi}^{\pi} I_\delta(\lambda)\, \theta(\lambda)\, d\lambda,

where \theta(\lambda) is the spectral window associated with the lag window w(\cdot). For the QS lag window, it holds that

\theta_{QS}(\lambda) = \begin{cases} \frac{3}{4a} \left( 1 - \left( \frac{\lambda}{a} \right)^2 \right), & |\lambda| \leq a \\ 0, & |\lambda| > a \end{cases}, \quad \text{with } a = \frac{6\pi}{5B}.

Hence, for large enough B, it holds that

\int_{-\pi}^{\pi} I_\delta(\lambda)\, \theta_{QS}(\lambda)\, d\lambda = O\!\left( T^{-1} \right),

since \int_{-a}^{a} \theta_{QS}(\lambda)\, d\lambda = 1, which completes the proof. ∎
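The two periodogram orders used in the proof — boundedness of T \cdot I_\delta(\lambda) at ordinary frequencies |\lambda| < 2\pi/S and divergence at the seasonal frequencies — are easy to reproduce (our own illustration for a quarterly pattern; the helper function name is our choice).

```python
import numpy as np

delta = np.array([0.0, 2.0, -1.0, 1.0])
S = len(delta)

def periodogram(T, lam):
    # I_delta(lam) = (1/T) | sum_{h=1}^T (delta_h - delta_bar) e^{i lam h} |^2
    d = np.tile(delta - delta.mean(), T // S)
    h = np.arange(1, T + 1)
    return np.abs((d * np.exp(1j * lam * h)).sum()) ** 2 / T

# lam = 1.0 < 2*pi/S: T * I stays bounded as T grows
ordinary = [T * periodogram(T, 1.0) for T in (400, 4000, 40000)]
# seasonal frequency 2*pi/S: the periodogram diverges
seasonal = periodogram(4000, 2 * np.pi / S)
print([round(v, 2) for v in ordinary], round(seasonal, 1))
```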

References

Abeysinghe, T. (1991) Inappropriate Use of Seasonal Dummies in Regression. Economics Letters 36, 175-179.

Abeysinghe, T. (1994) Deterministic Seasonal Models and Spurious Regressions. Journal of Econometrics 61, 259-272.

Anderson, T.W., and D.A. Darling (1952) Asymptotic Theory of Certain "Goodness of Fit" Criteria Based on Stochastic Processes. Annals of Mathematical Statistics 23, 193-212.

Andrews, D.W.K. (1991) Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation. Econometrica 59, 817-858.

Busetti, F., and A.M.R. Taylor (2003) Testing Against Stochastic Trend and Seasonality in the Presence of Unattended Breaks and Unit Roots. Journal of Econometrics 117, 21-53.

Dickey, D.A., and W.A. Fuller (1979) Distribution of the Estimators for Autoregressive Time Series with a Unit Root. Journal of the American Statistical Association 74, 427-431.

Dickey, D.A., W.R. Bell, and R.B. Miller (1986) Unit Roots in Time Series Models: Tests and Implications. The American Statistician 40, 12-26.

Engle, R.F., and C.W.J. Granger (1987) Co-Integration and Error Correction: Representation, Estimation, and Testing. Econometrica 55, 251-276.

Franses, P.H., S. Hylleberg, and H.S. Lee (1995) Spurious Deterministic Seasonality. Economics Letters 48, 249-256.

Ghysels, E., H.S. Lee, and J. Noh (1994) Testing for Unit Roots in Seasonal Time Series: Some Theoretical Extensions and a Monte Carlo Investigation. Journal of Econometrics 62, 415-442.

Hassler, U., and M. Neugart (2003) Inflation-Unemployment Tradeoff and Regional Labor Market Data. Empirical Economics 28, 321-334.

Hylleberg, S., R.F. Engle, C.W.J. Granger, and B.S. Yoo (1990) Seasonal Integration and Cointegration. Journal of Econometrics 44, 215-238.

Kwiatkowski, D., P.C.B. Phillips, P. Schmidt, and Y. Shin (1992) Testing the Null Hypothesis of Stationarity Against the Alternative of a Unit Root. Journal of Econometrics 54, 159-178.

Lim, C., and M. McAleer (2000) A Seasonal Analysis of Asian Tourist Arrivals to Australia. Applied Economics 32, 499-509.

Lopes, A.S. (1999) Spurious Deterministic Seasonality and Autocorrelation Corrections with Quarterly Data: Further Monte Carlo Results. Empirical Economics 24, 341-359.

Lopes, A.S. (2002) Deterministic Seasonality in Dickey-Fuller Tests: Should We Care? CEMAPRE Working Paper 2002-1.

MacKinnon, J.G. (1991) Critical Values for Cointegration Testing. In: R.F. Engle and C.W.J. Granger (eds.) Long-Run Economic Relationships. Oxford: Oxford University Press, pp. 267-276.

Martín-Álvarez, F.J., V.J. Cano-Fernández, and J.J. Cáceres-Hernández (1999) The Introduction of Seasonal Unit Roots and Cointegration to Test Index Aggregation Optimality: An Application to a Spanish Farm Price Index. Empirical Economics 24, 403-414.

Metin, K., and I. Muslu (1999) Money Demand, the Cagan Model, Testing Rational Expectations vs Adaptive Expectations: The Case of Turkey. Empirical Economics 24, 415-426.

Newey, W.K., and K.D. West (1987) A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix. Econometrica 55, 703-708.

Ng, S., and P. Perron (2001) Lag Length Selection and the Construction of Unit Root Tests with Good Size and Power. Econometrica 69, 1519-1554.

Nyblom, J., and T. Mäkeläinen (1983) Comparison of Tests for the Presence of Random Walk Coefficients in a Simple Linear Model. Journal of the American Statistical Association 78, 856-864.

Patterson, K. (2000) An Introduction to Applied Econometrics: A Time Series Approach. St. Martin's Press.

Phillips, P.C.B. (1987) Time Series Regression with a Unit Root. Econometrica 55, 277-301.

Phillips, P.C.B., and S.N. Durlauf (1986) Multiple Time Series Regressions with Integrated Processes. Review of Economic Studies 53, 473-495.

Phillips, P.C.B., and S. Jin (2002) The KPSS Test with Seasonal Dummies. Economics Letters 77, 239-243.

Phillips, P.C.B., and S. Ouliaris (1990) Asymptotic Properties of Residual Based Tests for Cointegration. Econometrica 58, 165-193.

Phillips, P.C.B., and P. Perron (1988) Testing for a Unit Root in Time Series Regression. Biometrika 75, 335-346.

Rodrigues, P.M.M. (1999) A Note on the Application of the DF Test to Seasonal Data. Statistics and Probability Letters 47, 171-175.

Said, S.E., and D.A. Dickey (1984) Testing for Unit Roots in Autoregressive-Moving Average Models of Unknown Order. Biometrika 71, 599-607.

Shin, Y. (1994) A Residual-Based Test of the Null of Cointegration Against the Alternative of No Cointegration. Econometric Theory 10, 91-115.

Taylor, A.M.R. (2003) Locally Optimal Tests Against Unit Roots in Seasonal Time Series Processes. Journal of Time Series Analysis 24, 591-612.