A Simple Modification to Improve the Finite Sample Properties of Ng and Perron's Unit Root Tests

Pierre Perron∗
Boston University

Zhongjun Qu†
University of Illinois at Urbana-Champaign

This version: February 4, 2006
Abstract

The tests introduced by Ng and Perron (2001, Econometrica) have the drawback that for non-local alternatives the power can be very small. The aim of this note is to point out an easy solution to this power reversal problem, which in addition leads to tests having an exact size even closer to nominal size. It involves using OLS instead of GLS detrended data when constructing the modified information criterion.
∗ Department of Economics, Boston University, 270 Bay State Rd., Boston, MA, 02215.
† Department of Economics, University of Illinois at Urbana-Champaign, 1206 S. Sixth Street, Champaign, Illinois 61820.

Ng and Perron (2001), henceforth NP, introduced a class of unit root tests that have a local asymptotic power function close to the Gaussian local power envelope (see Elliott, Rothenberg and Stock, 1996, henceforth ERS). Also, with the use of a modified information criterion, the tests have exact size close to nominal size even in the presence of a large negative moving-average component. This is achieved without sacrificing power for local alternatives. Hence, they have become widely used and are now available as options in several popular statistical software packages (for a review that highlights the importance of these tests, see Haldrup and Jansson, 2005).

Consider a univariate time series $\{y_t;\ t = 1, \ldots, T\}$ generated by

$$y_t = z_t'\gamma + v_t, \qquad v_t = \alpha v_{t-1} + u_t, \tag{1}$$

where $u_t$ is a linear process of the form $u_t = \sum_{j=0}^{\infty} c_j e_{t-j}$ with $\sum_{j=0}^{\infty} j|c_j| < \infty$ and $e_t \sim$ i.i.d. $(0, \sigma_e^2)$. Also, $\sigma^2 = \lim_{T\to\infty} T^{-1} E\big[(\sum_{t=1}^{T} u_t)^2\big]$ is the "long run variance", or $(2\pi)$ times the spectral density function at frequency zero of $u_t$. The initial condition is such that $y_1 = O_p(1)$. The vector $z_t$ is a set of deterministic components, usually $z_t = (1, t, \ldots, t^p)'$, and in most applications $p = 0$ (non-trending data) or $p = 1$ (trending data). The null hypothesis is that $\alpha = 1$ and the alternative hypothesis is $|\alpha| < 1$.

The tests are easy to apply and can be described succinctly as follows. We first have tests collectively referred to as the $M^{GLS}$ tests. The three members of this class, which have very similar properties, are the following:

$$MZ_\alpha^{GLS} = \frac{T^{-1}\tilde y_T^2 - s_{AR}^2}{2T^{-2}\sum_{t=1}^{T}\tilde y_{t-1}^2}, \qquad MSB^{GLS} = \left(\frac{T^{-2}\sum_{t=1}^{T}\tilde y_{t-1}^2}{s_{AR}^2}\right)^{1/2},$$
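and $MZ_t^{GLS} = MSB^{GLS} \times MZ_\alpha^{GLS}$, where $\tilde y_t = y_t - z_t'\hat\gamma_{GLS}$, with $\hat\gamma_{GLS}$ the so-called GLS estimate of $\gamma$ obtained from the least-squares regression of $y_t^{\bar\alpha}$ on $z_t^{\bar\alpha}$, where the quasi-differenced series are $y_t^{\bar\alpha} = y_t - \bar\alpha y_{t-1}$ $(t = 2, \ldots, T)$ and $y_1^{\bar\alpha} = y_1$ ($z_t^{\bar\alpha}$ is defined similarly). Following ERS, $\bar\alpha = 1 + \bar c/T$, with $\bar c = -7$ for $p = 0$ and $\bar c = -13.5$ for $p = 1$. Also of interest is the feasible point optimal test of ERS, $P_T = [S(\bar\alpha) - \bar\alpha S(1)]/s_{AR}^2$, where $S(\alpha) = \inf_\gamma \sum_{t=1}^{T}(y_t^{\alpha} - \gamma' z_t^{\alpha})^2$, and a modified version suggested by NP:

$$MP_T^{GLS} = \Big[\bar c^2 T^{-2}\sum_{t=1}^{T}\tilde y_{t-1}^2 - \bar c\,T^{-1}\tilde y_T^2\Big]\Big/ s_{AR}^2 \quad \text{for } p = 0,$$

$$MP_T^{GLS} = \Big[\bar c^2 T^{-2}\sum_{t=1}^{T}\tilde y_{t-1}^2 + (1 - \bar c)\,T^{-1}\tilde y_T^2\Big]\Big/ s_{AR}^2 \quad \text{for } p = 1.$$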
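The GLS detrending step and the $M^{GLS}$ and $MP_T^{GLS}$ statistics above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' code: the function names `gls_detrend` and `m_tests` are hypothetical, the spectral density estimate `s2_ar` is taken as a given input, and the sums over $\tilde y_{t-1}^2$ are assumed to run over the available observations $\tilde y_1, \ldots, \tilde y_{T-1}$.

```python
import numpy as np

# c-bar values stated in the text for p = 0 and p = 1
C_BAR = {0: -7.0, 1: -13.5}

def gls_detrend(y, p=0):
    """Return y_t - z_t' gamma_GLS using quasi-differenced data (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    T = y.shape[0]
    a_bar = 1.0 + C_BAR[p] / T
    t = np.arange(1, T + 1, dtype=float)
    z = np.vander(t, p + 1, increasing=True)          # columns 1, t, ..., t^p
    # Quasi-differences; the first observation is kept in levels.
    yq = np.concatenate(([y[0]], y[1:] - a_bar * y[:-1]))
    zq = np.vstack((z[:1], z[1:] - a_bar * z[:-1]))
    gamma, *_ = np.linalg.lstsq(zq, yq, rcond=None)
    return y - z @ gamma

def m_tests(y_tilde, s2_ar, p=0):
    """Compute MZ_alpha, MSB, MZ_t and MP_T from detrended data and a given s2_ar."""
    y_tilde = np.asarray(y_tilde, dtype=float)
    T = y_tilde.shape[0]
    c_bar = C_BAR[p]
    ssq = np.sum(y_tilde[:-1] ** 2) / T ** 2          # T^{-2} * sum of lagged squares
    last = y_tilde[-1] ** 2 / T                       # T^{-1} * y_tilde_T^2
    mza = (last - s2_ar) / (2.0 * ssq)
    msb = np.sqrt(ssq / s2_ar)
    if p == 0:
        mpt = (c_bar ** 2 * ssq - c_bar * last) / s2_ar
    else:
        mpt = (c_bar ** 2 * ssq + (1.0 - c_bar) * last) / s2_ar
    return {"MZa": mza, "MSB": msb, "MZt": mza * msb, "MPT": mpt}
```

A call such as `m_tests(gls_detrend(y, p=1), s2_ar, p=1)` would return the four statistics for trending data, with `s2_ar` supplied by the autoregressive spectral density estimate described next.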
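In all these tests, $s_{AR}^2$ is an autoregressive spectral density estimate of $\sigma^2$ defined by $s_{AR}^2 = \hat\sigma_{ek}^2/(1 - \sum_{i=1}^{k}\hat b_i)^2$, where $\hat\sigma_{ek}^2 = T^{-1}\sum_{t=k+1}^{T}\hat e_{tk}^2$, with $\hat b_i$ and $\hat e_{tk}$ obtained from the following OLS regression:

$$\Delta\tilde y_t = \hat b_0\,\tilde y_{t-1} + \sum_{i=1}^{k}\hat b_i\,\Delta\tilde y_{t-i} + \hat e_{tk}. \tag{2}$$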
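As a rough illustration of how $s_{AR}^2$ could be computed from regression (2), consider the following Python sketch. The function name `ar_spectral_density` is hypothetical, and the indexing is an assumption: with data $\tilde y_1, \ldots, \tilde y_T$ only, the regression uses the $T - k - 1$ observations for which all lagged differences exist, while the residual variance is still divided by $T$ as in the text.

```python
import numpy as np

def ar_spectral_density(x, k):
    """Estimate s2_AR from the regression of dx_t on x_{t-1} and k lags of dx_t.

    x is the detrended series; returns (s2_ar, coefficients, residuals).
    """
    x = np.asarray(x, dtype=float)
    T = x.shape[0]
    dx = np.diff(x)                        # dx[j] = x[j+1] - x[j]
    s = np.arange(k + 1, T)                # 0-based time index of each dependent observation
    dep = dx[s - 1]                        # first difference at time s
    cols = [x[s - 1]] + [dx[s - 1 - i] for i in range(1, k + 1)]
    X = np.column_stack(cols)              # lagged level, then lagged differences
    b, *_ = np.linalg.lstsq(X, dep, rcond=None)
    resid = dep - X @ b
    sigma2_ek = resid @ resid / T          # divisor T, following the definition in the text
    s2_ar = sigma2_ek / (1.0 - b[1:].sum()) ** 2
    return s2_ar, b, resid
```

The first element of the returned coefficient vector corresponds to $\hat b_0$, and the remaining $k$ elements to $\hat b_1, \ldots, \hat b_k$.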
Note that the $ADF^{GLS}$ test of ERS is the t-statistic associated with $\hat b_0$ in this last regression. A crucial component to obtain tests with good size in finite samples is a new method to select the autoregressive order $k$ based on a Modified Akaike Information Criterion (MAIC), which chooses a value $k_{MAIC} = \arg\min_{k\in[0,k_{max}]} MAIC(k)$, where
$$MAIC(k) = \ln(\hat\sigma_k^2) + \frac{2(\tau_T(k) + k)}{T - k_{max}}, \tag{3}$$

with $\tau_T(k) = (\hat\sigma_k^2)^{-1}\hat b_0^2\sum_{t=k_{max}+1}^{T}\tilde y_{t-1}^2$ and $\hat\sigma_k^2 = (T - k_{max})^{-1}\sum_{t=k_{max}+1}^{T}\hat e_{tk}^2$. The upper bound is usually set to $k_{max} = \mathrm{int}(12(T/100)^{1/4})$ but other values are possible. NP showed that
when all the tests discussed above are constructed using the MAIC, the exact size is close to nominal size even in the presence of a large negative moving-average component, though the tests can be conservative in some cases. Also, for local alternatives the power of the tests is close to the Gaussian local asymptotic power envelope.

A drawback of these tests is that for non-local alternatives the power can be very small. In fact, for a given fixed sample size $T$, the power of the tests can decrease as $\alpha$ gets further away from 1. Such a power reversal was pointed out by Seo (2005), though it has been known to the authors for quite some time. An alternative specification considered by NP, which uses OLS detrended data instead of GLS detrended data in the autoregression (2) to construct the spectral density estimate, does not suffer much from this problem but it leads to tests that are too conservative in common sample sizes and, hence, was not recommended. The power problem is quite important. Indeed, if the process is i.i.d. the test has very low power (see below). Hence, if one is faced with a random walk and performs the test both in levels and first-differences, one is likely to conclude that the series is integrated of order 2 (the first-differences of a random walk are i.i.d., so the test applied to them rarely rejects).

The aim of this note is to point out that there is an easy solution to this power reversal problem, which in addition leads to tests having an exact size even closer to nominal size for all specifications considered in NP. The idea is to use a hybrid of the two specifications considered by NP. The autoregressive spectral density estimate is still constructed using GLS detrended data as specified above, but the selection of the autoregressive order $k$ uses the MAIC constructed with OLS detrended data, i.e., replacing $\tilde y_t$ by $\hat y_t$, the residuals from a regression of $y_t$ on $z_t$, and using the autoregression¹

$$\Delta\hat y_t = \check b_0\,\hat y_{t-1} + \sum_{i=1}^{k}\check b_i\,\Delta\hat y_{t-i} + \check e_{tk}. \tag{4}$$
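The order selection based on regression (4) and the criterion (3) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `ols_detrend` and `maic_lag` are hypothetical, every candidate $k$ is estimated on the same sample, and the number of usable observations in that sample plays the role of $T - k_{max}$.

```python
import numpy as np

def ols_detrend(y, p=0):
    """Residuals from an OLS regression of y_t on z_t = (1, t, ..., t^p)'."""
    y = np.asarray(y, dtype=float)
    t = np.arange(1, y.shape[0] + 1, dtype=float)
    z = np.vander(t, p + 1, increasing=True)
    gamma, *_ = np.linalg.lstsq(z, y, rcond=None)
    return y - z @ gamma

def maic_lag(x, kmax=None):
    """Return the lag order in [0, kmax] that minimizes MAIC(k) for the series x."""
    x = np.asarray(x, dtype=float)
    T = x.shape[0]
    if kmax is None:
        kmax = int(12 * (T / 100.0) ** 0.25)   # default upper bound from the text
    dx = np.diff(x)
    s = np.arange(kmax + 1, T)                 # common estimation sample for all k
    dep = dx[s - 1]
    lag_level = x[s - 1]
    n = dep.shape[0]                           # usable observations (assumed divisor)
    ssq = lag_level @ lag_level
    best_k, best_val = 0, np.inf
    for k in range(kmax + 1):
        X = np.column_stack([lag_level] + [dx[s - 1 - i] for i in range(1, k + 1)])
        b, *_ = np.linalg.lstsq(X, dep, rcond=None)
        resid = dep - X @ b
        sigma2 = resid @ resid / n
        tau = b[0] ** 2 * ssq / sigma2         # penalty term that depends on b_0
        val = np.log(sigma2) + 2.0 * (tau + k) / n
        if val < best_val:
            best_k, best_val = k, val
    return best_k
```

Under the hybrid approach, `maic_lag(ols_detrend(y, p))` would supply the lag order, which is then used in regression (2) estimated on the GLS detrended series.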
So our modification involves a two-step procedure. First, use (4) to construct $MAIC(k) = \ln(\check\sigma_k^2) + 2(\tau_T(k) + k)/(T - k_{max})$ with $\tau_T(k) = (\check\sigma_k^2)^{-1}\check b_0^2\sum_{t=k_{max}+1}^{T}\hat y_{t-1}^2$ and $\check\sigma_k^2 = (T - k_{max})^{-1}\sum_{t=k_{max}+1}^{T}\check e_{tk}^2$. Second, once the order is selected as the minimizer of $MAIC(k)$, say $k_{MAIC}$, estimate (2) with $k_{MAIC}$ lags to construct $s_{AR}^2$ or the $ADF^{GLS}$ test.

¹ One can also use a regression in levels with the deterministic components added as regressors, i.e., $\Delta y_t = c + \delta t + \hat b_0 y_{t-1} + \sum_{i=1}^{k}\hat b_i\,\Delta y_{t-i} + \hat e_{tk}$. The results are virtually the same.

Table 1 presents the size of the tests $MZ_\alpha^{GLS}$ (the properties of $MSB^{GLS}$ and $MZ_t^{GLS}$ are similar), $ADF^{GLS}$, $P_T$ and $MP_T^{GLS}$. The data are generated by (1) with $\alpha = 1$, $v_1 = 0$, and $u_t$ is either an MA(1) process of the form $u_t = e_t + \theta e_{t-1}$ or an AR(1) of the form $u_t = \rho u_{t-1} + e_t$, with $e_t \sim$ i.i.d. $N(0, 1)$, and in both cases $u_1 = e_1$. The sample size is $T = 100$
and 50,000 replications are used. We consider the values θ, ρ = −0.8, −0.5, 0.0, 0.5, 0.8 (for the MA(1) case with θ = −0.8 we also report the results with T = 150). The nominal size of the tests is 5%. Comparing the results with those in NP, this version of the tests has size closer to 5% in almost all cases. When p = 0, the size of the MZα^GLS, PT and MPT^GLS tests is close to 5% for all cases, except with a large negative autoregressive coefficient, in which case they are conservative as in NP. The ADF^GLS test has, as expected, an exact size close to 5% with autoregressive errors but is substantially liberal in the case of MA(1) errors with θ = −0.8. These size distortions remain even with T = 150 and larger sample sizes not reported (consistent with the theoretical results in NP). In the case of trending data with p = 1, the same picture emerges except that the MZα^GLS, PT and MPT^GLS tests have some liberal size distortions with T = 100, which, however, disappear with T = 150 or larger.

Consider now the power of the tests, which we simulated for two cases: first, with ut ∼ i.i.d. N(0, 1) and second, with MA(1) errors with θ = −0.8 as specified above. The value of α is varied between 1 and 0 (and the graphs report results in terms of the value of 1 − α). Three versions of the tests are considered: a) the original procedure recommended by NP, labelled GLS-GLS; b) our modification, labelled OLS-GLS; and c) tests constructed using OLS detrended data throughout (to assess the extent of the power gain from using GLS detrended data). The sample size is T = 100 and again 50,000 replications are used. We present results for the ADF^GLS test (Figure 1 for p = 0 and Figure 2 for p = 1) and the MZα^GLS test (Figure 3 for p = 0 and Figure 4 for p = 1). In each figure, the left panel reports the power while the right panel reports the autoregressive order selected.
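The data generating process used in these experiments can be sketched in Python as follows. This is an illustrative sketch only: the function name `simulate_series` and its arguments are hypothetical, and the deterministic component is set to zero (γ = 0), which the text does not specify.

```python
import numpy as np

def simulate_series(T, alpha, kind="ma", coef=0.0, seed=None):
    """Simulate y_t = v_t with v_t = alpha * v_{t-1} + u_t and v_1 = 0.

    kind="ma": u_t = e_t + coef * e_{t-1};  kind="ar": u_t = coef * u_{t-1} + e_t.
    In both cases u_1 = e_1 and e_t is i.i.d. N(0, 1).
    """
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(T)
    u = np.empty(T)
    u[0] = e[0]
    if kind == "ma":
        u[1:] = e[1:] + coef * e[:-1]
    elif kind == "ar":
        for t in range(1, T):
            u[t] = coef * u[t - 1] + e[t]
    else:
        raise ValueError("kind must be 'ma' or 'ar'")
    v = np.zeros(T)                       # v_1 = 0
    for t in range(1, T):
        v[t] = alpha * v[t - 1] + u[t]
    return v                              # gamma = 0 assumed, so y_t = v_t
```

Setting `alpha=1.0` gives draws under the null hypothesis (as used for the size experiments), while values of `alpha` below 1 give draws under the alternative (as used for the power experiments).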
The results show that the original procedure recommended by NP and our modification indeed have higher power than tests based on OLS detrended data when α is “local” to 1. But when α decreases, the original NP version has power that decreases rapidly, especially with a negative moving-average component, in which case the power is very low. This reversal does not occur with our modification (though in the i.i.d. case there is some slight decrease) and the power for values of α away from 1 is close to what can be achieved using OLS detrended data throughout (which also can exhibit slight power decreases). The reason for the difference in power when using OLS or GLS detrended data to construct the MAIC can be seen by looking at the right panels, which report the autoregressive orders selected. When the MAIC is constructed with GLS detrended data, the value of k selected increases as α moves away from the null value 1. Such is not the case when the MAIC is constructed with OLS detrended data. The smaller lag orders translate into higher power. Hence, using OLS detrended data to construct the MAIC allows the selection of the appropriate
order under the null hypothesis to achieve an exact size close to nominal size and is also appropriate in selecting a small lag under the alternative to achieve higher power.

Using OLS detrended data is preferable to using GLS detrended data not only when using the MAIC but also when using any information criterion such as the AIC. The reason is the following. When α is not local to one, the estimates of the coefficients of the trend function are biased when using GLS detrended data. This implies that the GLS detrended series does not have mean zero or may be trending. The autoregression (2) is then mis-specified, and the estimate of b0 is biased, with a bias that varies with the lag order. This bias induces serial correlation in the errors and, as a result, additional lags are needed to compensate for that. With the MAIC the effect is more pronounced since the additional penalty term involves the estimate of b0, which is sensitive to the lag order irrespective of the underlying process. With OLS detrended data, the detrended series has basically mean zero so that this problem does not occur.

In summary, our modification is easy to implement and leads to tests with finite sample size even closer to nominal size and a power function that is not severely reduced for non-local alternatives. The MZα^GLS, PT and MPT^GLS tests have similar properties: their exact size is close to nominal size in the case of MA processes but can be quite conservative with autoregressive errors with a negative coefficient. On the other hand, the ADF^GLS test has liberal size distortions in the case of negative moving-average errors but has size close to nominal size and slightly higher power with autoregressive errors. Since errors with a negative moving-average component are more likely than errors with a negative autoregressive coefficient, the MZα^GLS, PT and MPT^GLS tests should be preferable to the ADF^GLS test.

References

[1] Elliott, G., T.J. Rothenberg, and J.H. Stock (1996): “Efficient Tests for an Autoregressive Unit Root,” Econometrica 64, 813-836.
[2] Haldrup, N., and M. Jansson (2005): “Improving Size and Power in Unit Root Testing,” forthcoming in the Palgrave Handbook of Econometrics, Vol. 1, Econometric Theory, K. Patterson and T. C. Mills (eds.), New York: Palgrave Macmillan.
[3] Ng, S., and P. Perron (2001): “Lag Length Selection and the Construction of Unit Root Tests with Good Size and Power,” Econometrica 69, 1519-1554.
[4] Seo, M.H. (2005): “Improving Unit Root Tests by a Generalization of the Autoregressive Spectral Density Estimator at Frequency Zero,” Unpublished Manuscript, Department of Economics, London School of Economics.
Table 1: Exact Size of the Tests; T = 100

MA Case: ut = et + θet−1; et ∼ i.i.d. N(0, 1)

                          p = 0                                    p = 1
θ            MZα^GLS  ADF^GLS  PT       MPT^GLS      MZα^GLS  ADF^GLS  PT       MPT^GLS
-0.8         0.067    0.143    0.061    0.065        0.122    0.211    0.122    0.124
(T = 150)    (0.034)  (0.103)  (0.031)  (0.033)      (0.055)  (0.130)  (0.055)  (0.056)
-0.5         0.063    0.082    0.052    0.059        0.046    0.083    0.046    0.047
0.0          0.048    0.055    0.038    0.044        0.021    0.046    0.020    0.021
0.5          0.058    0.048    0.046    0.052        0.037    0.032    0.035    0.036
0.8          0.064    0.038    0.052    0.058        0.046    0.018    0.043    0.045

AR Case: ut = ρut−1 + et; et ∼ i.i.d. N(0, 1)

                          p = 0                                    p = 1
ρ            MZα^GLS  ADF^GLS  PT^GLS   MPT^GLS      MZα^GLS  ADF^GLS  PT^GLS   MPT^GLS
-0.8         0.015    0.051    0.013    0.015        0.001    0.039    0.001    0.001
-0.5         0.037    0.051    0.031    0.034        0.014    0.042    0.014    0.015
0.5          0.066    0.056    0.052    0.060        0.049    0.043    0.044    0.047
0.8          0.088    0.055    0.067    0.078        0.090    0.047    0.076    0.081

Note: entries in parentheses in the MA case are the sizes for θ = −0.8 with T = 150.