Spurious Regression and Residual-Based Tests for Cointegration in Panel Data When the Cross-Section and Time-Series Dimensions are Comparable

Chihwa Kao
Center for Policy Research and Department of Economics
Syracuse University, Syracuse, NY 13244
(315) 443-3233, Fax: (315) 443-1081, Email: [email protected]

First Draft: January 10, 1995. This Draft: September 2, 1996

Abstract

In the first half of the paper we study spurious regressions in panel data when the cross-section and time-series dimensions are comparable. Asymptotic properties of the least-squares dummy variable (LSDV) estimator and other conventional statistics are examined. We show that the LSDV estimator, $\hat\beta$, is consistent for its true value, but the t-statistic, $t_\beta$, diverges, so that inferences about the regression coefficient, $\beta$, are wrong with probability approaching one as $N \to \infty$ and $T \to \infty$. The asymptotics of $\hat\beta$ are also different from those of the spurious regression in the pure time series. This has an important consequence for residual-based cointegration tests in panel data, because the null distribution of residual-based cointegration tests depends on the asymptotics of $\hat\beta$. In the second half of the paper we study residual-based tests for cointegration in panel data. We study Dickey-Fuller (DF) tests and an augmented Dickey-Fuller (ADF) test of the null of no cointegration. Asymptotic distributions of the tests are derived and Monte

* I thank two referees, an associate editor and an editor, Richard Blundell, for helpful comments. I also thank participants of workshop seminars at Hong Kong University of Science and Technology, Academia Sinica, and SUNY-Albany for helpful comments and suggestions. I thank Bangtian Chen for his research assistance on an earlier draft of this paper. Thanks also go to Jodi Woodson and Martha Bonney for correcting my English and carefully checking the manuscript to enhance its readability. An electronic version of the paper in postscript format can be retrieved from http://web.syr.edu/~cdkao.
Carlo experiments are conducted to evaluate the finite sample properties of the proposed tests. The simulation results suggest that the $DF_\rho$ and $DF_t$ tests have better size and power properties than the $DF_\rho^*$, $DF_t^*$, and $ADF$ tests when $\sigma$ is small. However, when $\sigma$ is large, $ADF$ clearly dominates the rest.

Key Words: Panel Data; Spurious Regression; LSDV; Residual-Based Tests.
1 Introduction

There is immense interest in testing for unit roots and cointegration in time-series data, but not much attention has been paid to testing for unit roots and cointegration in panel data at either the empirical or theoretical level. Apparently, the only theoretical studies in this area are Breitung and Meyer (1994), Quah (1994), Levin and Lin (1992, 1993), Im, Pesaran and Shin (1995), Kao and Chen (1995a, b), and Pedroni (1995). Breitung and Meyer (1994) derived the asymptotic normality of the Dickey-Fuller test statistic for panel data with a large cross-section dimension and a small time-series dimension. Quah (1994) studied unit root tests for panel data that have simultaneous extensive cross-section and time-series variation. Levin and Lin (1992, 1993) recently derived the limiting distributions for unit root tests on panel data and showed that the power of these tests increases dramatically as the cross-section dimension increases. Im et al. (1995) critiqued the Levin and Lin panel unit root statistics and proposed alternatives. Kao and Chen (1995a) studied the asymptotic results for a least-squares dummy variable (LSDV) estimator in a cointegrated regression in panel data. They showed that the LSDV estimator and t-ratio are asymptotically normally distributed. They also confirmed the existence of second-order bias (e.g., due to endogeneity or errors-in-variables). Kao and Chen (1995b) proposed residual-based tests for cointegration under a set of restricted assumptions. On the other hand, Park and Ogaki (1991) derived asymptotic distributions for cointegration coefficient estimators and related t-statistics for panel data using canonical cointegrating regression (CCR) transformations. Although they take a seemingly unrelated regression (SUR) approach rather than using $N$-dimensional asymptotics, many of the issues are similar.
Pesaran and Smith (1995) are not directly concerned with cointegration but do touch on a number of related issues, including the potential problems of homogeneity misspecification for cointegrated panels.

The first half of this paper examines a spurious regression in panel data when the cross-section and time-series dimensions are comparable. Asymptotic properties of the LSDV estimator and other conventional statistics are examined. I show that the LSDV estimator, $\hat\beta$, is consistent for its true value, but the t-statistic, $t_\beta$, diverges, so that inferences about the regression coefficient, $\beta$, are wrong with probability approaching one as $N \to \infty$ and $T \to \infty$. The asymptotics of $\hat\beta$ are also different from those of the spurious regression in the pure time series. This has an important consequence for residual-based cointegration tests in panel data, because the null distribution of residual-based cointegration tests depends on the asymptotics of $\hat\beta$. The second half of the paper examines the asymptotic null distribution of residual-based cointegration tests in panel data. This paper is related to recent work by Pedroni (1995) that came to my attention when this work was completed. Pedroni (1995) derived asymptotic distributions for residual-based tests of cointegration for both homogeneous and heterogeneous panels. This paper discusses the fixed effects model, which Pedroni (1995) does not.

Section 2 introduces the model and derives the asymptotic distributions of the LSDV estimator and various conventional statistics from the spurious regression in panel data. Section 3 derives the asymptotic distributions of the residual-based tests for cointegration. Section 4 describes the estimation of long-run variances. Section 5 reports the simulation results. Concluding remarks are given in Section 6. The proofs in the text are collected in the Appendix.

A word on notation. We define $\Omega^{1/2}$ to be any matrix such that $\Omega = \Omega^{1/2}(\Omega^{1/2})'$. We use "$\Rightarrow$" to denote weak convergence, "$\to_p$" to denote convergence in probability, $[x]$ to denote the largest integer $\le x$, and $I(1)$ to signify a time series that is integrated of order one.
2 Large Sample Asymptotics

In developing our theory, we first assume that the cross-section dimension is a monotonic function of the time-series dimension, so that the law of large numbers (e.g., Theorem 6.2, Billingsley, 1986) and the central limit theorem (e.g., Theorem 27.2, Billingsley, 1986) for triangular arrays can be applied.
2.1 Independent Random Walks

Let $y_{it}$ and $x_{it}$ be independent random walks for all $i$:
$$y_{it} = y_{it-1} + u_{it} \quad \text{and} \quad x_{it} = x_{it-1} + \varepsilon_{it}, \tag{1}$$
in which $y_{i0} = 0$ and $x_{i0} = 0$.
Suppose these two unrelated $I(1)$ series, $y_{it}$ and $x_{it}$, are incorrectly estimated by least squares for all $i$ using panel data; the spurious LSDV regression model is
$$y_{it} = \alpha_i + \beta x_{it} + e_{it}, \quad i = 1, \ldots, N, \quad t = 1, \ldots, T. \tag{2}$$
The LSDV estimators of $\beta$ and $\alpha_i$ are
$$\hat\beta = \frac{\sum_{i=1}^{N}\sum_{t=1}^{T} y_{it}\left(x_{it} - \bar{x}_i\right)}{\sum_{i=1}^{N}\sum_{t=1}^{T}\left(x_{it} - \bar{x}_i\right)^2} \tag{3}$$
and $\hat\alpha_i = \bar{y}_i - \hat\beta\bar{x}_i$. Thus,
$$\hat\beta = \frac{\frac{1}{N}\sum_{i=1}^{N}\frac{1}{T^2}\sum_{t=1}^{T}\left(x_{it} - \bar{x}_i\right)y_{it}}{\frac{1}{N}\sum_{i=1}^{N}\frac{1}{T^2}\sum_{t=1}^{T}\left(x_{it} - \bar{x}_i\right)^2} = \frac{\frac{1}{N}\sum_{i=1}^{N}\zeta_{1iT}}{\frac{1}{N}\sum_{i=1}^{N}\zeta_{2iT}} = \frac{\zeta_{1NT}}{\zeta_{2NT}}, \tag{4}$$
where $\bar{x}_i = \frac{1}{T}\sum_{t=1}^{T}x_{it}$, $\bar{y}_i = \frac{1}{T}\sum_{t=1}^{T}y_{it}$, $\zeta_{1iT} = \frac{1}{T^2}\sum_{t=1}^{T}\left(x_{it} - \bar{x}_i\right)y_{it}$, $\zeta_{2iT} = \frac{1}{T^2}\sum_{t=1}^{T}\left(x_{it} - \bar{x}_i\right)^2$, $\zeta_{1NT} = \frac{1}{N}\sum_{i=1}^{N}\zeta_{1iT}$, and $\zeta_{2NT} = \frac{1}{N}\sum_{i=1}^{N}\zeta_{2iT}$.

For each $0 \le r \le 1$, we define $[rT]$ as the integer part of $rT$. $W_i(r)$ and $V_i(r)$ are independent standard Wiener processes obtained from $\varepsilon_{it}$ and $u_{it}$, respectively, i.e.,
$$\frac{1}{\sigma_\varepsilon\sqrt{T}}\sum_{t=1}^{[rT]}\varepsilon_{it} \Rightarrow W_i(r) \tag{5}$$
and
$$\frac{1}{\sigma_u\sqrt{T}}\sum_{t=1}^{[rT]}u_{it} \Rightarrow V_i(r) \tag{6}$$
for $0 \le r \le 1$.
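As an illustration, the pooled within (LSDV) estimator of eq. (3) can be sketched in a few lines. This is a hypothetical helper for exposition, not part of the paper's own GAUSS code; Python is used for concreteness:

```python
import numpy as np

def lsdv_beta(y, x):
    """Pooled within (LSDV) slope estimate, eq. (3): demean x within
    each cross-section unit i, then pool numerator and denominator."""
    # y, x: arrays of shape (N, T)
    xd = x - x.mean(axis=1, keepdims=True)   # x_it - xbar_i
    return (xd * y).sum() / (xd ** 2).sum()

def lsdv_alphas(y, x, beta):
    """Fixed-effect intercepts alpha_i = ybar_i - beta * xbar_i."""
    return y.mean(axis=1) - beta * x.mean(axis=1)
```

The demeaning makes the unit dummies implicit, which is why only the within variation of $x_{it}$ enters the denominator.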
Assumption 1 The $(\varepsilon_{it}, u_{it})'$ are assumed to be independent across $i$.

Remark 1 The assumption of independence across $i$ is admittedly strong. However, it is needed in order to satisfy the requirements of the central limit theorem for triangular arrays. Moreover, as pointed out by Quah (1994), modeling cross-sectional dependence is involved because individual observations in a cross-section have no natural ordering. It is possible to extend the current model to allow a degree of dependence across $i$ using the approaches adopted by Im et al. (1995) and Pedroni (1995). However, this goes beyond the scope of this paper.
Assumption 2 $u_{it} \sim iid(0, \sigma_u^2)$, $\varepsilon_{it} \sim iid(0, \sigma_\varepsilon^2)$, $E u_{it}^4 < \infty$, and $E \varepsilon_{it}^4 < \infty$.
Before going into the first theorem of the paper, we need some preliminary results. All limits in Lemmas 1 and 2 are taken as $T \to \infty$, except where otherwise noted.
Lemma 1 If Assumptions 1 and 2 hold, then

(a) $\zeta_{1iT} \Rightarrow \zeta_{1i} = \sigma_\varepsilon\sigma_u\left\{\int_0^1 W_i(r)V_i(r)dr - \left[\int_0^1 W_i(r)dr\right]\left[\int_0^1 V_i(r)dr\right]\right\}$;

(b) $\zeta_{2iT} \Rightarrow \zeta_{2i} = \sigma_\varepsilon^2\left\{\int_0^1 W_i^2(r)dr - \left[\int_0^1 W_i(r)dr\right]^2\right\}$;

(c) $E[\zeta_{1iT}] \equiv \theta_{1T} = E[\zeta_{1i}] \equiv \theta_1 = 0$;

(d) $E[\zeta_{2i}] \equiv \theta_2 = \frac{\sigma_\varepsilon^2}{6}$;

(e) $E[\zeta_{2iT}] \equiv \theta_{2T} = \theta_2 + O(T^{-1})$;

(f) $Var[\zeta_{1iT}] \equiv \sigma_{1T}^2 = \frac{\sigma_\varepsilon^2\sigma_u^2}{90} + O(T^{-1})$;

(g) $Var[\zeta_{2iT}] \equiv \sigma_{2T}^2 = \frac{\sigma_\varepsilon^4}{45} + O(T^{-1})$;

(h) $Var[\zeta_{1i}] \equiv \sigma_1^2 = \frac{\sigma_\varepsilon^2\sigma_u^2}{90}$;

(i) $Var[\zeta_{2i}] \equiv \sigma_2^2 = \frac{\sigma_\varepsilon^4}{45}$;

(j) $\zeta_{1NT} \to_p \theta_1 = 0$;

(k) $\zeta_{2NT} \to_p \theta_2 = \frac{\sigma_\varepsilon^2}{6}$.
All limits in Theorems 1 and 2 are taken as $N \to \infty$ and $T \to \infty$, except where otherwise noted. We are now in a position to state the following theorem:
Theorem 1 Suppose the conditions of Lemma 1 are satisfied; then

(a) $\hat\beta \to_p 0$;

(b) $\sqrt{N}\hat\beta \Rightarrow N\left(0, \frac{2}{5}\frac{\sigma_u^2}{\sigma_\varepsilon^2}\right)$, i.e., $\hat\beta = O_p(N^{-1/2})$;

(c) $T^{-1/2}t_\beta \Rightarrow N\left(0, \frac{5}{2}\right)$, i.e., $t_\beta = O_p(T^{1/2})$;

(d) $R^2 \to_p 0$;

(e) $DW = O_p(T^{-1})$;

(f) $T \cdot DW \to_p 6$.
Theorem 1 shows that the LSDV estimator, $\hat\beta$, is consistent for its true value, 0, but the t-statistic, $t_\beta$, diverges, so that inferences about $\beta$ are wrong with probability approaching one as $N \to \infty$ and $T \to \infty$. Moreover, $R^2$ and $DW$ converge to zero in probability. Theorem 1 will be extended in the next section to allow for much weaker assumptions, e.g., serial correlation and weak exogeneity, though independence across $i = 1, \ldots, N$ is maintained.
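A small simulation in the spirit of Section 5 illustrates the claims of Theorem 1: $\hat\beta$ concentrates near zero, $|t_\beta|$ is large relative to conventional critical values, $R^2$ is small, and $T \cdot DW$ is near 6. The function below is an illustrative sketch under standard-normal innovations; the names and settings are my own, not the paper's GAUSS experiment:

```python
import numpy as np

def spurious_panel_sim(N=40, T=50, reps=200, seed=0):
    """Monte Carlo for the spurious LSDV regression of two independent
    random-walk panels; returns means of beta-hat, |t|, R^2, T*DW."""
    rng = np.random.default_rng(seed)
    beta, abst, r2, tdw = [], [], [], []
    for _ in range(reps):
        y = rng.standard_normal((N, T)).cumsum(axis=1)  # random walks
        x = rng.standard_normal((N, T)).cumsum(axis=1)
        yd = y - y.mean(axis=1, keepdims=True)          # within transform
        xd = x - x.mean(axis=1, keepdims=True)
        b = (xd * yd).sum() / (xd ** 2).sum()
        e = yd - b * xd                                 # within residuals
        s2 = (e ** 2).sum() / (N * T - N - 1)
        t = b / np.sqrt(s2 / (xd ** 2).sum())
        beta.append(b)
        abst.append(abs(t))
        r2.append(1 - (e ** 2).sum() / (yd ** 2).sum())
        tdw.append(T * (np.diff(e, axis=1) ** 2).sum() / (e ** 2).sum())
    return (float(np.mean(beta)), float(np.mean(abst)),
            float(np.mean(r2)), float(np.mean(tdw)))
```

With $N = 40$ and $T = 50$ the average $|t_\beta|$ is already far above 1.96, while the average $\hat\beta$ is close to zero.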
2.2 Relaxing Some Assumptions

The error process assumptions made in Section 2.1 are quite restrictive. Indeed, assuming constant variances across $i$ is also restrictive. The real job is to verify the Lindeberg condition. Lemma 4.2 (a version of the central limit theorem (CLT) for triangular arrays) presented by Levin and Lin (1993) is not ready to deliver the asymptotics when the variances are not constant across $i$. The difficulty is that uniform integrability is not guaranteed by the moment conditions in their lemma. Moreover, the conditions in their lemma do not yield the Lindeberg condition. Hence, until these technical issues can be resolved, I think the assumption of constant variances across $i$ is an acceptable compromise for the purposes of this paper. On the other hand, imposing strong exogeneity on the explanatory variable and serial independence on the error process is also quite unrealistic in practice. In this section, we relax these assumptions and derive the asymptotics. I will show that the extension consists of correcting for both serial correlation and weak exogeneity, which can be done by replacing $\sigma_v^2$ with the long-run conditional variance $\sigma_{0v}^2$.

Define $w_{it} = (u_{it}, \varepsilon_{it})'$. The long-run covariance matrix of $w_{it}$ is given by
$$\Omega = \lim_{T\to\infty}\frac{1}{T}E\left[\left(\sum_{t=1}^{T}w_{it}\right)\left(\sum_{t=1}^{T}w_{it}\right)'\right] = \Sigma + \Gamma + \Gamma' \equiv \begin{bmatrix}\sigma_{0u}^2 & \sigma_{0u\varepsilon} \\ \sigma_{0u\varepsilon} & \sigma_{0\varepsilon}^2\end{bmatrix}, \tag{7}$$
where
$$\Gamma = \lim_{T\to\infty}\frac{1}{T}\sum_{k=1}^{T-1}\sum_{t=k+1}^{T}E\left[w_{it}w_{it-k}'\right] \equiv \begin{bmatrix}\Gamma_u & \Gamma_{u\varepsilon} \\ \Gamma_{\varepsilon u} & \Gamma_\varepsilon\end{bmatrix} \tag{8}$$
and
$$\Sigma = \lim_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}E\left[w_{it}w_{it}'\right] \equiv \begin{bmatrix}\sigma_u^2 & \sigma_{u\varepsilon} \\ \sigma_{u\varepsilon} & \sigma_\varepsilon^2\end{bmatrix}. \tag{9}$$
We assume that $w_{it}$ satisfies the invariance principle of Theorem 2.1 of Phillips and Durlauf (1986), so that
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]}w_{it} \Rightarrow \Omega^{1/2}\mathbf{W}_i(r) \quad \text{as } T \to \infty, \tag{10}$$
where $\mathbf{W}_i(r)$ is a $2 \times 1$ dimensional standard Wiener process with
$$\mathbf{W}_i(r) = \begin{bmatrix}V_i(r) \\ W_i(r)\end{bmatrix}.$$
Assumption 3 (e.g., Phillips, 1986)

1. $E[w_{it}] = 0$ for all $t$;

2. $\sup_t E|w_{it}|^p < \infty$ for some $2 < p < \infty$;

3. $\{w_{it}\}_{1}^{\infty}$ is strong mixing with mixing numbers $\alpha_m$ satisfying $\sum_{m=1}^{\infty}\alpha_m^{1-2/p} < \infty$.

Let
$$y_{it}^* = y_{it} - \sigma_{0u\varepsilon}\sigma_{0\varepsilon}^{-2}x_{it}, \quad x_{it}^* = \sigma_{0\varepsilon}^{-1}x_{it}. \tag{11}$$
The residuals from LSDV estimation of (2), $\hat{e}_{it} = y_{it} - \hat\alpha_i - \hat\beta x_{it}$, are identical to those from LSDV estimation of (11) in that
$$y_{it} - \hat\alpha_i - \hat\beta x_{it} = y_{it}^* - \hat\alpha_i^* - \hat\beta^* x_{it}^* = y_{it} - \sigma_{0u\varepsilon}\sigma_{0\varepsilon}^{-2}x_{it} - \hat\alpha_i^* - \hat\beta^*\sigma_{0\varepsilon}^{-1}x_{it} = y_{it} - \hat\alpha_i^* - \left(\hat\beta^*\sigma_{0\varepsilon}^{-1} + \sigma_{0u\varepsilon}\sigma_{0\varepsilon}^{-2}\right)x_{it} \tag{12}$$
with $\hat\alpha_i^* = \hat\alpha_i$ and $\hat\beta = \hat\beta^*\sigma_{0\varepsilon}^{-1} + \sigma_{0\varepsilon}^{-2}\sigma_{0u\varepsilon}$. This implies that $\hat\beta^* = \sigma_{0\varepsilon}\hat\beta - \sigma_{0\varepsilon}^{-1}\sigma_{0u\varepsilon} = \sigma_{0\varepsilon}\left(\hat\beta - \sigma_{0\varepsilon}^{-2}\sigma_{0u\varepsilon}\right)$. Define
$$\sigma_{0v}^2 = \sigma_{0u}^2 - \sigma_{0u\varepsilon}^2\sigma_{0\varepsilon}^{-2} \tag{13}$$
and
$$\sigma_v^2 = \sigma_u^2 - \sigma_{u\varepsilon}^2\sigma_\varepsilon^{-2}.$$
Notice that
$$\begin{bmatrix}\dfrac{y_{it}^*}{\sigma_{0v}} \\ x_{it}^*\end{bmatrix} = L'\begin{bmatrix}y_{it} \\ x_{it}\end{bmatrix}, \quad \text{where} \quad L' = \begin{bmatrix}\sigma_{0v}^{-1} & -\sigma_{0u\varepsilon}\sigma_{0\varepsilon}^{-2}\sigma_{0v}^{-1} \\ 0 & \sigma_{0\varepsilon}^{-1}\end{bmatrix}.$$
It follows that (e.g., Hamilton, 1994)
$$\widetilde{\mathbf{W}}_i(r) = L'\Omega^{1/2}\mathbf{W}_i(r)$$
is a Wiener process with identity covariance matrix, $I$, with
$$\widetilde{\mathbf{W}}_i(r) = \begin{bmatrix}\widetilde{V}_i(r) \\ \widetilde{W}_i(r)\end{bmatrix}.$$
It follows that
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]}L'w_{it} = \frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]}\begin{bmatrix}\sigma_{0v}^{-1} & -\sigma_{0u\varepsilon}\sigma_{0\varepsilon}^{-2}\sigma_{0v}^{-1} \\ 0 & \sigma_{0\varepsilon}^{-1}\end{bmatrix}\begin{bmatrix}u_{it} \\ \varepsilon_{it}\end{bmatrix} = \frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]}\begin{bmatrix}u_{it}^* \\ \varepsilon_{it}^*\end{bmatrix} \Rightarrow \widetilde{\mathbf{W}}_i(r) \quad \text{as } T \to \infty,$$
where
$$u_{it}^* = \frac{u_{it} - \varepsilon_{it}\sigma_{0u\varepsilon}\sigma_{0\varepsilon}^{-2}}{\sigma_{0v}} \quad \text{and} \quad \varepsilon_{it}^* = \frac{\varepsilon_{it}}{\sigma_{0\varepsilon}}.$$
Define
$$\frac{\sqrt{N}\hat\beta^*}{\sigma_{0v}} = \frac{\sqrt{N}\zeta_{3NT}}{\zeta_{4NT}},$$
where $\zeta_{3iT} = \frac{1}{T^2}\sum_{t=1}^{T}\left(x_{it}^* - \bar{x}_i^*\right)\frac{y_{it}^*}{\sigma_{0v}}$, $\zeta_{4iT} = \frac{1}{T^2}\sum_{t=1}^{T}\left(x_{it}^* - \bar{x}_i^*\right)^2$, $\zeta_{3NT} = \frac{1}{N}\sum_{i=1}^{N}\zeta_{3iT}$, and $\zeta_{4NT} = \frac{1}{N}\sum_{i=1}^{N}\zeta_{4iT}$. Before we derive the asymptotics we summarize some limit theory for the LSDV estimator.
Lemma 2 If Assumptions 1 and 3 hold, then

(a) $\zeta_{3iT} \Rightarrow \zeta_{3i} = \int_0^1\widetilde{W}_i(r)\widetilde{V}_i(r)dr - \int_0^1\widetilde{W}_i(r)dr\int_0^1\widetilde{V}_i(r)dr$;

(b) $\zeta_{4iT} \Rightarrow \zeta_{4i} = \int_0^1\widetilde{W}_i^2(r)dr - \left[\int_0^1\widetilde{W}_i(r)dr\right]^2$;

(c) $E[\zeta_{3iT}] \equiv \theta_{3T} = E[\zeta_{3i}] \equiv \theta_3 = 0$;

(d) $E[\zeta_{4i}] \equiv \theta_4 = \frac{1}{6}$;

(e) $E[\zeta_{4iT}] \equiv \theta_{4T} = \theta_4 + O(T^{-1})$;

(f) $Var[\zeta_{3iT}] \equiv \sigma_{3T}^2 = \frac{1}{90} + O(T^{-1})$;

(g) $Var[\zeta_{4iT}] \equiv \sigma_{4T}^2 = \frac{1}{45} + O(T^{-1})$;

(h) $Var[\zeta_{3i}] \equiv \sigma_3^2 = \frac{1}{90}$;

(i) $Var[\zeta_{4i}] \equiv \sigma_4^2 = \frac{1}{45}$;

(j) $\zeta_{3NT} \to_p \theta_3 = 0$;

(k) $\zeta_{4NT} \to_p \theta_4 = \frac{1}{6}$.
Using techniques similar to those in Theorem 1, we can establish the following theorem.
Theorem 2 Suppose the conditions of Lemma 2 are satisfied; then

(a) $\hat\beta \to_p \frac{\sigma_{0u\varepsilon}}{\sigma_{0\varepsilon}^2}$;

(b) $\sqrt{N}\left(\hat\beta - \frac{\sigma_{0u\varepsilon}}{\sigma_{0\varepsilon}^2}\right) \Rightarrow N\left(0, \frac{2}{5}\frac{\sigma_{0v}^2}{\sigma_{0\varepsilon}^2}\right)$;

(c) $T^{-1/2}t_\beta - \frac{\sigma_{0u\varepsilon}}{\sigma_{0\varepsilon}^2 s}T^{-1/2}\sqrt{\sum_{i=1}^{N}\sum_{t=1}^{T}\left(x_{it} - \bar{x}_i\right)^2} \Rightarrow N\left(0, \frac{2}{5}\right)$, where $s$ denotes the standard error of the regression;

(d) $R^2 \to_p \frac{\sigma_{0u\varepsilon}^2}{\sigma_{0v}^2\sigma_{0\varepsilon}^2 + \sigma_{0u\varepsilon}^2}$;

(e) $DW = O_p(T^{-1})$;

(f) $T \cdot DW \to_p 6$.
The results of Theorems 1 and 2 can also be extended to multiple regression, provided that the $\{x_{it}\}$ are not cointegrated. The asymptotics of $\hat\beta$ are different from those of a spurious regression in the pure time series, and this difference has an important consequence for residual-based cointegration tests using panel data, because the null distribution of residual-based cointegration tests depends on the asymptotics of $\hat\beta$. This point is developed in the next section.
3 Residual-Based Tests for Cointegration

In this section we derive the null distributions of residual-based cointegration tests using Dickey-Fuller (DF) tests and an augmented Dickey-Fuller (ADF) test when applied to the model in Section 2.2. The model in Section 2.1 is a special case of the general class in Section 2.2.
3.1 Dickey-Fuller Test

The DF test can be applied to the residuals using
$$\hat{e}_{it} = \rho\hat{e}_{it-1} + v_{it}, \tag{14}$$
where $\hat{e}_{it}$ is the estimate of $e_{it}$ from (2). The OLS estimator of $\rho$, $\hat\rho$, is
$$\hat\rho = \frac{\sum_{i=1}^{N}\sum_{t=2}^{T}\hat{e}_{it}\hat{e}_{it-1}}{\sum_{i=1}^{N}\sum_{t=2}^{T}\hat{e}_{it-1}^2}. \tag{15}$$
The null hypothesis that $\rho = 1$ in regression (14) is tested by
$$\sqrt{N}T\left(\hat\rho - 1\right) = \sqrt{N}\,\frac{\frac{1}{N}\sum_{i=1}^{N}\frac{1}{T}\sum_{t=2}^{T}\hat{e}_{it-1}\Delta\hat{e}_{it}}{\frac{1}{N}\sum_{i=1}^{N}\frac{1}{T^2}\sum_{t=2}^{T}\hat{e}_{it-1}^2} = \frac{\sqrt{N}\frac{1}{N}\sum_{i=1}^{N}\zeta_{5iT}}{\frac{1}{N}\sum_{i=1}^{N}\zeta_{6iT}} = \frac{\sqrt{N}\zeta_{5NT}}{\zeta_{6NT}},$$
where $\zeta_{5iT} = \frac{1}{T}\sum_{t=2}^{T}\hat{e}_{it-1}\Delta\hat{e}_{it}$, $\zeta_{6iT} = \frac{1}{T^2}\sum_{t=2}^{T}\hat{e}_{it-1}^2$, $\zeta_{5NT} = \frac{1}{N}\sum_{i=1}^{N}\zeta_{5iT}$, and $\zeta_{6NT} = \frac{1}{N}\sum_{i=1}^{N}\zeta_{6iT}$.
Theorem 3 Suppose Assumptions 1 and 3 hold. Then,
$$\sqrt{N}T\left(\hat\rho - 1\right) - \frac{\sqrt{N}\theta_{5T}}{\theta_{6T}} \Rightarrow N\left(0,\, 3 + \frac{7.2\,\sigma_v^4}{\sigma_{0v}^4}\right)$$
and
$$t_\rho - \frac{\sqrt{N}\theta_{5T}}{s_v\sqrt{\theta_{6T}}} \Rightarrow N\left(0,\, \frac{\sigma_{0v}^2}{2\sigma_v^2} + \frac{3\sigma_v^2}{10\sigma_{0v}^2}\right),$$
where $\theta_{5T} = E[\zeta_{5iT}]$ and $\theta_{6T} = E[\zeta_{6iT}]$.
Corollary 3 Assume Assumptions 1 and 3 are satisfied and $\frac{\sqrt{N}}{T} \to 0$ as $N \to \infty$ and $T \to \infty$. Then, we have
$$\sqrt{N}T\left(\hat\rho - 1\right) + \frac{3\sqrt{N}\sigma_v^2}{\sigma_{0v}^2} \Rightarrow N\left(0,\, 3 + \frac{7.2\,\sigma_v^4}{\sigma_{0v}^4}\right)$$
and
$$t_\rho + \frac{\sqrt{6N}\,\sigma_v}{2\sigma_{0v}} \Rightarrow N\left(0,\, \frac{\sigma_{0v}^2}{2\sigma_v^2} + \frac{3\sigma_v^2}{10\sigma_{0v}^2}\right).$$
Note that with strong exogeneity and in the absence of serial correlation we have $\sigma_u^2 = \sigma_{0u}^2 = \sigma_v^2 = \sigma_{0v}^2$. This case was previously studied by Kao and Chen (1995b). We then have the following corollary:
Corollary 4 Suppose Assumptions 1 and 3 hold. Then,
$$\sqrt{N}T\left(\hat\rho - 1\right) - \frac{\sqrt{N}\theta_{5T}}{\theta_{6T}} \Rightarrow N(0,\, 10.2)$$
and
$$\sqrt{1.25}\left(t_\rho - \frac{\sqrt{N}\theta_{5T}}{s_v\sqrt{\theta_{6T}}}\right) \Rightarrow N(0,\, 1).$$
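The pooled DF regression of eqs. (14)-(15) can be sketched as follows. This is a hypothetical helper for illustration only; the paper's own computations were done in GAUSS:

```python
import numpy as np

def pooled_df(e):
    """Pooled Dickey-Fuller regression on panel residuals, eqs. (14)-(15):
    regress e_it on e_{i,t-1} with no intercept, pooling over i and t."""
    # e: (N, T) array of estimated residuals e_hat_it
    lag, cur = e[:, :-1], e[:, 1:]
    rho = (cur * lag).sum() / (lag ** 2).sum()
    v = cur - rho * lag                        # DF regression residuals
    s2 = (v ** 2).sum() / (v.size - 1)         # one estimated parameter
    t_rho = (rho - 1) * np.sqrt((lag ** 2).sum() / s2)
    return rho, t_rho
```

The t-statistic returned here is for the null $\rho = 1$, as in Section 3.1.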
3.2 Augmented Dickey-Fuller (ADF) Test

The Dickey-Fuller tests in Section 3.1 were based on a simple OLS regression of $\hat{e}_{it}$ on its own lagged value, with corrections for serial correlation applied to the OLS estimator and t-statistic. Alternatively, lagged changes in the residuals can be added to regression (14):
$$\hat{e}_{it} = \rho\hat{e}_{it-1} + \sum_{j=1}^{p}\varphi_j\Delta\hat{e}_{it-j} + v_{itp}, \tag{16}$$
where $p$ is chosen so that the residuals $v_{itp}$ are serially uncorrelated. The ADF test statistic discussed in this section is the usual t-statistic for $\rho = 1$ in regression (16). Let $X_{ip}$ and $X_{ip}^*$ be the matrices of observations on the $p$ regressors $\left(\Delta\hat{e}_{it-1}, \Delta\hat{e}_{it-2}, \ldots, \Delta\hat{e}_{it-p}\right)$ and $\left(\Delta\hat{e}_{it-1}^*, \Delta\hat{e}_{it-2}^*, \ldots, \Delta\hat{e}_{it-p}^*\right)$, respectively; let $e_i$ and $e_i^*$ be the vectors of observations on $\hat{e}_{it-1}$ and $\hat{e}_{it-1}^*$, respectively; let $v_i$ be the vector of observations on $v_{itp}$; let $Q_i = I - X_{ip}\left(X_{ip}'X_{ip}\right)^{-1}X_{ip}'$ and $Q_i^* = I - X_{ip}^*\left(X_{ip}^{*\prime}X_{ip}^*\right)^{-1}X_{ip}^{*\prime}$; and let $s_v^2 = \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}\hat{v}_{itp}^2$, where $\hat{v}_{itp}$ is an estimate of $v_{itp}$. Under the null hypothesis of no cointegration, the ADF test statistic takes the form
$$t_{ADF} = \frac{\left(\hat\rho - 1\right)\left[\sum_{i=1}^{N}e_i'Q_ie_i\right]^{1/2}}{s_v},$$
where $\hat\rho$ is the OLS estimator of $\rho$. Note that
$$\hat\rho - 1 = \left[\sum_{i=1}^{N}e_i'Q_ie_i\right]^{-1}\left[\sum_{i=1}^{N}e_i'Q_iv_i\right].$$
Now,
$$t_{ADF} = \frac{\sum_{i=1}^{N}\left(e_i'Q_iv_i\right)}{s_v\left[\sum_{i=1}^{N}\left(e_i'Q_ie_i\right)\right]^{1/2}} = \frac{\sqrt{N}\,\frac{1}{N}\sum_{i=1}^{N}\frac{1}{T}\left(e_i'Q_iv_i\right)}{s_v\left[\frac{1}{N}\sum_{i=1}^{N}\frac{1}{T^2}\left(e_i'Q_ie_i\right)\right]^{1/2}} = \frac{\sqrt{N}\zeta_{7NT}}{s_v\sqrt{\zeta_{8NT}}},$$
where $\zeta_{7NT} = \frac{1}{N}\sum_{i=1}^{N}\zeta_{7iT}$, $\zeta_{7iT} = \frac{1}{T}\left(e_i'Q_iv_i\right)$, $\zeta_{8NT} = \frac{1}{N}\sum_{i=1}^{N}\zeta_{8iT}$, and $\zeta_{8iT} = \frac{1}{T^2}\left(e_i'Q_ie_i\right)$.

Theorem 4 Suppose that the assumptions of Theorem 3 are satisfied and $p \to \infty$. Then, we have
$$t_{ADF} - \frac{\sqrt{N}\theta_{7T}}{s_v\sqrt{\theta_{8T}}} \Rightarrow N\left(0,\, \frac{\sigma_{0v}^2}{2\sigma_v^2} + \frac{3\sigma_v^2}{10\sigma_{0v}^2}\right)$$
as $N \to \infty$ and $T \to \infty$, where $\theta_{7T} = E[\zeta_{7iT}]$ and $\theta_{8T} = E[\zeta_{8iT}]$.
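The $t_{ADF}$ statistic can be computed exactly as written, by partialling the lagged differences out of $\Delta\hat{e}_{it}$ and $\hat{e}_{it-1}$ with the projection $Q_i$ for each unit and then pooling. The sketch below is my own illustrative code (with unit-specific coefficients on the lagged differences, as the per-unit $Q_i$ algebra implies), and it reproduces the usual OLS t-statistic for $\rho = 1$:

```python
import numpy as np

def adf_t_stat(e, p):
    """t-statistic for rho = 1 in eq. (16): for each unit i, partial the
    p lagged differences out of Delta e_it and e_{i,t-1} with
    Q_i = I - X_ip (X_ip' X_ip)^{-1} X_ip', then pool over i."""
    num = den = 0.0
    parts = []
    for ei in e:                                  # loop over units i
        de = np.diff(ei)
        y = de[p:]                                # Delta e_it
        lag = ei[p:-1]                            # e_{i,t-1}
        X = np.column_stack([de[p - 1 - j: len(de) - 1 - j]
                             for j in range(p)])  # Delta e_{i,t-j}
        Q = np.eye(len(y)) - X @ np.linalg.solve(X.T @ X, X.T)
        qy, qlag = Q @ y, Q @ lag
        num += qlag @ qy
        den += qlag @ qlag
        parts.append((qy, qlag))
    coef = num / den                              # rho_hat - 1
    rss = sum(float((qy - coef * ql) @ (qy - coef * ql))
              for qy, ql in parts)
    dof = sum(len(ei) - 1 - 2 * p for ei in e) - 1  # obs minus parameters
    return coef / np.sqrt(rss / dof / den)
```

By the Frisch-Waugh theorem, the partialled two-step computation gives the same coefficient, residuals, and standard error as running the full augmented regression directly.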
4 Estimation of Long-Run Parameters

The asymptotic distributions given in Theorem 2 depend on the unknown parameters $(\sigma_u^2, \sigma_\varepsilon^2, \sigma_{u\varepsilon}, \sigma_{0\varepsilon}^2, \sigma_{0u}^2$, and $\sigma_{0u\varepsilon})$. Recall that $w_{it} = (u_{it}, \varepsilon_{it})'$, with long-run covariance matrix
$$\Omega = \lim_{T\to\infty}\frac{1}{T}E\left[\left(\sum_{t=1}^{T}w_{it}\right)\left(\sum_{t=1}^{T}w_{it}\right)'\right] = \Sigma + \Gamma + \Gamma' \equiv \begin{bmatrix}\sigma_{0u}^2 & \sigma_{0u\varepsilon} \\ \sigma_{0u\varepsilon} & \sigma_{0\varepsilon}^2\end{bmatrix},$$
where
$$\Gamma = \lim_{T\to\infty}\frac{1}{T}\sum_{k=1}^{T-1}\sum_{t=k+1}^{T}E\left[w_{it}w_{it-k}'\right] \equiv \begin{bmatrix}\Gamma_u & \Gamma_{u\varepsilon} \\ \Gamma_{\varepsilon u} & \Gamma_\varepsilon\end{bmatrix}$$
and
$$\Sigma = \lim_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}E\left[w_{it}w_{it}'\right] \equiv \begin{bmatrix}\sigma_u^2 & \sigma_{u\varepsilon} \\ \sigma_{u\varepsilon} & \sigma_\varepsilon^2\end{bmatrix},$$
the usual covariance matrix. Once estimates of $w_{it}$, $\hat{w}_{it}$, are obtained, we can use
$$\hat\Sigma = \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}\hat{w}_{it}\hat{w}_{it}' \tag{17}$$
to estimate $\Sigma$. $\Omega$ can be estimated by
$$\hat\Omega = \frac{1}{N}\sum_{i=1}^{N}\left\{\frac{1}{T}\sum_{t=1}^{T}\hat{w}_{it}\hat{w}_{it}' + \frac{1}{T}\sum_{\tau=1}^{l}\varpi_{\tau l}\sum_{t=\tau+1}^{T}\left(\hat{w}_{it}\hat{w}_{it-\tau}' + \hat{w}_{it-\tau}\hat{w}_{it}'\right)\right\}, \tag{18}$$
where $\varpi_{\tau l}$ is a weight function, or kernel. Usually kernels are truncated at the bandwidth parameter $l$, so that $\varpi_{\tau l} = 0$ for $\tau > l$. Using Phillips and Durlauf (1986) and the law of large numbers for triangular arrays, $\hat\Sigma$ and $\hat\Omega$ can be shown to be consistent for $\Sigma$ and $\Omega$.
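Equation (18) with a Bartlett kernel, $\varpi_{\tau l} = 1 - \tau/(l+1)$, can be sketched as follows. This is an illustrative implementation, not the COINT procedure used later in the paper:

```python
import numpy as np

def long_run_cov(w, l):
    """Kernel estimate of the long-run covariance Omega, eq. (18),
    with Bartlett weights 1 - tau/(l+1), averaged over the N units."""
    # w: (N, T, k) array of estimated innovations w_hat_it
    N, T, k = w.shape
    omega = np.zeros((k, k))
    for wi in w:
        s = wi.T @ wi / T                       # contemporaneous term
        for tau in range(1, l + 1):
            weight = 1.0 - tau / (l + 1.0)      # Bartlett kernel
            g = wi[tau:].T @ wi[:-tau] / T      # sum_t w_t w_{t-tau}'
            s += weight * (g + g.T)
        omega += s / N
    return omega
```

The symmetrization $g + g'$ corresponds to the paired terms $\hat{w}_{it}\hat{w}_{it-\tau}' + \hat{w}_{it-\tau}\hat{w}_{it}'$ in (18), so the estimate is symmetric by construction.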
5 Some Monte Carlo Results

5.1 Spurious Regression

In this section, we report some results from Monte Carlo experiments that examined the finite-sample properties of the spurious LSDV estimator, $\hat\beta$, and of $t_\beta$, $R^2$, and $DW$. The simulations were performed on a Sun SparcServer 1000 using the software GAUSS 3.27. The results we report are based on 10,000 replications. The data were generated by creating $T + 1{,}000$ observations and discarding the first 1,000 observations to remove the effect of the initial conditions. The data generating process (DGP) was
$$y_{it} = y_{it-1} + u_{it} \quad \text{and} \quad x_{it} = x_{it-1} + \varepsilon_{it}$$
for $i = 1, \ldots, N$ and $t = 1, \ldots, T$, where the innovation sequences $(u_{it}, \varepsilon_{it})'$ were generated from a bivariate normal distribution with independence across both individuals and time periods, i.e.,
$$\begin{bmatrix}u_{it} \\ \varepsilon_{it}\end{bmatrix} \sim iid\ N\left(\begin{bmatrix}0 \\ 0\end{bmatrix}, \begin{bmatrix}1 & 0 \\ 0 & 2\end{bmatrix}\right).$$
Random numbers for $(u_{it}, \varepsilon_{it})$ were generated by the GAUSS procedure RNDNS. For each experiment, we calculated the mean values of $\hat\beta$, $t_\beta$, $R^2$, and $DW$, the mean estimated standard error ($ESE$), $P(|t_\beta| > 1.96)$ (the rejection frequency of the t-test of $H_0: \beta = 0$), and the sample standard deviation ($SSD$) of $\hat\beta$, where
$$SSD(\hat\beta) = \left[\frac{1}{M-1}\sum_{i=1}^{M}\left(\hat\beta_{[i]} - \bar{\hat\beta}\right)^2\right]^{1/2},$$
$\bar{\hat\beta} = \frac{1}{M}\sum_{i=1}^{M}\hat\beta_{[i]}$, and $M$ is the number of replications. For sample size, we considered different settings for $T$ and $N$. The Monte Carlo results are reported in Tables 1, 2, and 3. In the case where $N = T = 30$, the probability of rejecting $H_0$, $P(|t_\beta| > 1.96)$, is approximately .5723, which is far from the conventional value of .05. Similar discrepancies exist for all our other values of $N$ and $T$. This means that, even when the
null hypothesis is true, we will wrongly reject it most of the time. We know from Table 1 that the mean values of $\hat\beta$ almost always stay constant around zero, while the $\pm 2 SSD$ band collapses toward the mean value as the panel size increases. Unlike in the pure time-series case, $\hat\beta$ converges to zero in probability instead of being a random variable. Table 1 also reveals some discrepancy between the $SSD$ and the $ESE$. However, the discrepancy between these two measures is not as severe as in the pure time series, because both the $SSD$ and the $ESE$ diminish as $T$ increases in panel data. The mean value of the t-statistic changes little as $T$ ($N$) grows from 10 to 70, but its $SSD$ increases rapidly. The likelihood of $|t_\beta| > 1.96$ becomes higher and higher, so the critical value at the 5% significance level increases as $T$ increases. Therefore, as in a pure time series, the spurious regression problem becomes worse as the panel size grows. This is confirmed by Table 1, which records the rejection frequency of the t-test for sample sizes from 10 to 80. The rejection rate of the null hypothesis of no relationship between $x_{it}$ and $y_{it}$ increases steadily with $T$. At $N = T = 50$, the rejection rate is as high as 66%.

We also examined how the finite sample properties of the spurious LSDV regression change when $N$ and $T$ differ. We first fixed $N$ at 30 and changed $T$. Then, we fixed $T$ at 30 and changed $N$. Table 2 summarizes the results for different time-series dimensions with $N = 30$. Table 3 summarizes the results for different cross-section dimensions with $T = 30$. We found that an increase in the cross-section dimension decreases the $SSD$ of $\hat\beta$; varying the cross-section size has little effect on the t-statistic; and the t-statistic is closely related to the time-series dimension. The reason the t-statistic does not converge to the standard normal distribution is that it does not have a zero mean even when $T$ is fixed (at 30). Hence, increasing $N$ will not help the t-statistic converge to a meaningful distribution (here a standard normal distribution) without some sort of normalization. When $T$ grows, the $SSD$ of the t-statistic and the rejection rate of the null hypothesis based on the conventional critical value increase significantly.

The results obtained from the pure time series tell us that $R^2$ may be quite high in a spurious regression. However, this is not the case with panel data. Tables 1 and 2 reveal that $R^2$ remains very low and decreasing, which is consistent with the result in Theorem 1. The results on the Durbin-Watson statistic are also what we expected. Clearly, as shown in Table 1, $DW$ converges to zero in probability. Furthermore, $T \cdot DW$ stays roughly constant at about 6 on average.
5.2 Residual-Based Tests

To examine the finite sample properties of the proposed tests, we conducted Monte Carlo experiments similar in design to Engle and Granger (1987), as also used in Gonzalo (1994) and Haug (1996). The data generating process (DGP) was
$$y_{it} - \alpha_i - \beta x_{it} = z_{it}, \quad a_1 y_{it} + a_2 x_{it} = w_{it},$$
$$z_{it} = \rho z_{it-1} + e_{zit}, \quad w_{it} = w_{it-1} + e_{wit}, \quad e_{wit} = \eta_{it} + \theta\eta_{it-1},$$
and
$$\begin{bmatrix}e_{zit} \\ \eta_{it}\end{bmatrix} \sim iid\ N\left(\begin{bmatrix}0 \\ 0\end{bmatrix}, \begin{bmatrix}1 & \delta \\ \delta & \sigma^2\end{bmatrix}\right).$$
The RNDN procedure in GAUSS was used to generate the random numbers. The data were generated by creating $T + 1{,}000$ observations and discarding the first 1,000 observations to remove the effect of the initial conditions. The results we report are based on 10,000 replications. We generate $\alpha_i$ from a uniform distribution, $U[0, 10]$, and set $\beta = 2$. For the purposes of this paper, we consider the following values for the parameters in the DGP: $a_1 = 1$, $a_2 = -1$, $\rho = (0.95, 1)$, $\sigma = (0.25, 1, 4)$, $\theta = (-0.8, 0, 0.8)$, and $\delta = (-0.5, 0, 0.5)$. When $\rho = 1$, $y_{it}$ and $x_{it}$ are not cointegrated, and when $|\rho| < 1$, they are cointegrated. The estimate of the long-run covariance matrix,
$$\hat\Omega = \begin{bmatrix}\hat\sigma_{0u}^2 & \hat\sigma_{0u\varepsilon} \\ \hat\sigma_{0u\varepsilon} & \hat\sigma_{0\varepsilon}^2\end{bmatrix},$$
and the long-run conditional variance,
$$\hat\sigma_{0v}^2 = \hat\sigma_{0u}^2 - \hat\sigma_{0u\varepsilon}^2\hat\sigma_{0\varepsilon}^{-2},$$
were obtained by using the procedure KERNEL in COINT 2.0 with a Bartlett window of lag length 5. Define
$$DF_\rho = \frac{\sqrt{N}T\left(\hat\rho - 1\right) + 3\sqrt{N}}{\sqrt{10.2}},$$
$$DF_t = \sqrt{1.25}\,t_\rho + \sqrt{1.875N},$$
$$DF_\rho^* = \frac{\sqrt{N}T\left(\hat\rho - 1\right) + \frac{3\sqrt{N}\hat\sigma_v^2}{\hat\sigma_{0v}^2}}{\sqrt{3 + \frac{36\hat\sigma_v^4}{5\hat\sigma_{0v}^4}}},$$
$$DF_t^* = \frac{t_\rho + \frac{\sqrt{6N}\,\hat\sigma_v}{2\hat\sigma_{0v}}}{\sqrt{\frac{\hat\sigma_{0v}^2}{2\hat\sigma_v^2} + \frac{3\hat\sigma_v^2}{10\hat\sigma_{0v}^2}}},$$
and
$$ADF = \frac{t_{ADF} + \frac{\sqrt{6N}\,\hat\sigma_v}{2\hat\sigma_{0v}}}{\sqrt{\frac{\hat\sigma_{0v}^2}{2\hat\sigma_v^2} + \frac{3\hat\sigma_v^2}{10\hat\sigma_{0v}^2}}},$$
where
$$\hat\sigma_v^2 = \hat\sigma_u^2 - \hat\sigma_{u\varepsilon}^2\hat\sigma_\varepsilon^{-2}.$$
We expect $DF_\rho^*$, $DF_t^*$, and $ADF$ to converge to $N(0,1)$ asymptotically. We also include $DF_\rho$ and $DF_t$ for comparison, and two lags are selected for the $ADF$ test. Table 4 contains estimates of the size of the tests at the 5% level when the asymptotic critical value, $-1.645$, from the $N(0,1)$ distribution was used. Five different tests are considered, $DF_\rho$, $DF_t$, $DF_\rho^*$, $DF_t^*$, and $ADF$, for different sample sizes with $\delta = 0$, no moving average component ($\theta = 0.0$), $\sigma = 1$, and endogenous $x_{it}$ ($a_1 = 1.0$). We note that all tests have a large size distortion when $T$ is small (e.g., $T = 10$), even at $N = 300$, but the size distortion begins to disappear quickly when $T$ is increased to 25, for all $N$. The empirical sizes of $DF_\rho$ and $DF_t$ are close to 0.05 when $T$ and $N$ are both large. But the empirical size of the $ADF$ test is greater than 0.09 at each sample size for $T$ and $N$. Overall, it is found that $DF_\rho$ and $DF_t$ outperform the rest in terms of size distortion.

Table 5 reports the unadjusted powers of the five tests when the asymptotic critical value is used and $\rho = 0.90$, for different sample sizes with no contemporaneous correlation ($\delta = 0$), no moving average component ($\theta = 0.0$), $\sigma = 1$, and endogenous $x_{it}$ ($a_1 = 1.0$). All tests have little power when $T = 10$ and when $N$ is small, as one would expect. However, $DF_t^*$ displayed little power even when $N$ is large. When $T$ is increased to 25 and above, $DF_\rho^*$ dominates $DF_t^*$ and the $ADF$ test for all $N$. Interestingly, $DF_\rho$ and $DF_t$, though they are misspecified, perform pretty well in terms of size distortion and power.
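Given $\hat\rho$, $t_\rho$, and the variance estimates, the standardized statistics above are simple arithmetic. The sketch below (illustrative code, not the paper's) computes $DF_\rho$, $DF_t$, and their starred versions, which coincide when $\hat\sigma_v^2 = \hat\sigma_{0v}^2$:

```python
import numpy as np

def df_statistics(rho, t_rho, N, T, s2v=None, s2_0v=None):
    """Standardized panel DF statistics from Section 5.2. DF_rho and
    DF_t assume sigma_v = sigma_0v; the starred versions correct with
    the estimated variance ratio."""
    df_rho = (np.sqrt(N) * T * (rho - 1) + 3 * np.sqrt(N)) / np.sqrt(10.2)
    df_t = np.sqrt(1.25) * t_rho + np.sqrt(1.875 * N)
    if s2v is None:
        return df_rho, df_t
    r = s2v / s2_0v                          # sigma_v^2 / sigma_0v^2
    df_rho_s = (np.sqrt(N) * T * (rho - 1) + 3 * np.sqrt(N) * r) \
        / np.sqrt(3 + 36 * r ** 2 / 5)
    df_t_s = (t_rho + np.sqrt(6 * N * s2v) / (2 * np.sqrt(s2_0v))) \
        / np.sqrt(s2_0v / (2 * s2v) + 3 * s2v / (10 * s2_0v))
    return df_rho, df_t, df_rho_s, df_t_s
```

Each statistic is compared to the lower tail of the standard normal distribution, so the null of no cointegration is rejected for large negative values.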
To provide further insight into other specifications, we consider the size and power of these five tests for three different values each of $\theta$, $\sigma$, and $\delta$ in Tables 6 - 8. With negative $\theta$, all five tests are severely distorted, as one would expect. In fact, with $\theta = -0.8$, all tests except $ADF$ reject no cointegration 100% of the time at a nominal 5% level for sample sizes $T = 50$ and $N = 50$. The empirical sizes of the $DF_\rho$ and $DF_t$ tests are smaller than the nominal size of the tests (5%) when $\theta = 0.8$ at $N = T = 50$. For $\theta = -0.8$, the size distortion of all five tests begins to improve as $\sigma$ is increased from 0.25 to 1 and 4. For $\theta = 0.0$, all five tests show little variation in size across the various values of $\sigma$ and $\delta$. For $\sigma \ge 1$, we note that the unadjusted power of the tests is extremely high for all values of $\theta$ and $\delta$. For $\sigma = 0.25$ and $\theta = 0$, the power of all tests is very low. When $\sigma$ is increased to 1 and 4, the power increases dramatically. Table 8 reports the size-adjusted powers with $\rho = 0.85$, such that each test has the same rejection frequency of 5% when the null hypothesis is true. For $\sigma = 0.25$ and $\theta = -0.8$, the power of all tests is low and the null hypothesis of no cointegration is often not rejected. With $\delta = 0$, $ADF$ has the lowest power for all $\theta$ and, on the other hand, $DF_\rho$ has the highest power. With $\theta = 0$, the power of all tests except $ADF$ is high. When $\sigma$ increases to 1, the power of all tests increases to virtually 1, except when $\theta = 0.8$. In fact, the power of all tests remains very low, zero in most cases, when $\theta = 0.8$. Interestingly, and to my surprise, $ADF$ became very powerful when $\sigma = 4$ and $\theta \ge 0$. The results in Tables 4-8 suggest that the $DF_\rho$ and $DF_t$ tests have better size and power properties than the $DF_\rho^*$, $DF_t^*$, and $ADF$ tests when $\sigma$ is small. However, when $\sigma$ is large, $ADF$ clearly dominates the rest.
6 Conclusion

The first half of this paper develops a framework for understanding the behavior of spurious panel regressions. I have provided an asymptotic theory for the behavior of the LSDV estimator in a model which attempts to estimate a panel regression when the dependent variable and the independent variable are actually independent $I(1)$ processes. In particular, the t-statistic diverges in spite of the fact that the LSDV estimator converges to zero in probability. Moreover, $DW$ converges to zero in probability. The asymptotic results and simulation results in this paper are useful beyond explaining the spurious effects in panel regression. Asymptotic distributions of residual-based tests depend on the LSDV estimator from the spurious regression, because residual-based tests for cointegration in panel data take the spurious regression as the null hypothesis. In the second half of this paper, we proposed tests for the null hypothesis of no cointegration in panel data and derived the asymptotic distribution of each test. The simulations show that the distributions of $DF_\rho^*$, $DF_t^*$, and $ADF$ can be far different from the $N(0,1)$ suggested by the theory when the underlying process contains an MA component. From the Monte Carlo results, it is apparent that the $DF_\rho$ and $DF_t$ tests are substantially robust despite the model misspecification. The results in Tables 4-8 suggest that the $DF_\rho$ and $DF_t$ tests have better size and power properties than the $DF_\rho^*$, $DF_t^*$, and $ADF$ tests.
Appendix

A Proof of Lemma 1

Proof. (a) and (b) are immediate from Phillips (1986) and Banerjee et al. (1993). For example, using Lemma 1 in Phillips (1986) and (3.31) in Banerjee et al. (1993), we have for a fixed $i$
$$\zeta_{2iT} = \frac{1}{T^2}\sum_{t=1}^{T}\left(x_{it} - \bar{x}_i\right)^2 = \frac{1}{T^2}\sum_{t=1}^{T}x_{it}^2 - \left(T^{-1/2}\bar{x}_i\right)^2 \Rightarrow \sigma_\varepsilon^2\int_0^1 W_i^2(r)dr - \left[\sigma_\varepsilon\int_0^1 W_i(r)dr\right]^2 = \sigma_\varepsilon^2\left\{\int_0^1 W_i^2(r)dr - \left[\int_0^1 W_i(r)dr\right]^2\right\} \equiv \zeta_{2i}. \tag{19}$$
Now
$$\zeta_{1iT} = \frac{1}{T^2}\sum_{t=1}^{T}\left(x_{it} - \bar{x}_i\right)y_{it} = \frac{1}{T^2}\sum_{t=1}^{T}x_{it}y_{it} - \left(T^{-1/2}\bar{x}_i\right)\left(T^{-1/2}\bar{y}_i\right). \tag{20}$$
Then, we have
$$T^{-1/2}\bar{x}_i \Rightarrow \sigma_\varepsilon\int_0^1 W_i(r)dr, \tag{21}$$
$$T^{-1/2}\bar{y}_i \Rightarrow \sigma_u\int_0^1 V_i(r)dr, \tag{22}$$
and
$$\frac{1}{T^2}\sum_{t=1}^{T}x_{it}y_{it} \Rightarrow \sigma_\varepsilon\sigma_u\int_0^1 W_i(r)V_i(r)dr. \tag{23}$$
The limiting distribution of $\zeta_{1iT}$ can be deduced from (21) - (23):
$$\zeta_{1iT} \Rightarrow \zeta_{1i},$$
where
$$\zeta_{1i} \equiv \sigma_\varepsilon\sigma_u\left\{\int_0^1 W_i(r)V_i(r)dr - \left[\int_0^1 W_i(r)dr\right]\left[\int_0^1 V_i(r)dr\right]\right\}.$$
To prove (c), we use the fact that
$$E\left[\int_0^1 W_i(r)V_i(r)dr\right] = 0$$
and
$$E\left[\int_0^1 W_i(r)dr\right] = E\left[\int_0^1 V_i(r)dr\right] = 0,$$
and, therefore,
$$E[\zeta_{1i}] \equiv \theta_1 = 0. \tag{24}$$
Also,
$$E[\zeta_{1iT}] \equiv \theta_{1T} = 0 \tag{25}$$
because of the independence of $x_{it}$ and $y_{it}$. Parts (d) and (i) can be shown easily as follows:
$$E[\zeta_{2i}] = \frac{\sigma_\varepsilon^2}{6} \tag{26}$$
and
$$Var[\zeta_{2i}] = \frac{\sigma_\varepsilon^4}{45} \equiv \sigma_2^2. \tag{27}$$
To prove (e), we note that
$$\theta_{2T} = E[\zeta_{2iT}] = \frac{1}{T^2}\left\{\sum_{t=1}^{T}E\left[x_{it}^2\right] - TE\left[\bar{x}_i^2\right]\right\} = \frac{\sigma_\varepsilon^2}{T^2}\left\{\frac{T(T+1)}{2} - \frac{(T+1)(2T+1)}{6}\right\} = \frac{\sigma_\varepsilon^2}{6} + O(T^{-1}) = \theta_2 + O(T^{-1}).$$
To prove (f), we first rewrite
$$Var(\zeta_{1iT}) = Var\left[\frac{1}{T^2}\sum_{t=1}^{T}\left(x_{it} - \bar{x}_i\right)y_{it}\right] = \frac{1}{T^4}E\left[\left(\sum_{t=1}^{T}x_{it}y_{it}\right)^2\right] - \frac{2}{T^5}E\left[\left(\sum_{t=1}^{T}x_{it}y_{it}\right)\left(\sum_{t=1}^{T}x_{it}\right)\left(\sum_{t=1}^{T}y_{it}\right)\right] + \frac{1}{T^6}E\left[\left(\sum_{t=1}^{T}x_{it}\right)^2\left(\sum_{t=1}^{T}y_{it}\right)^2\right].$$
After some tedious algebra, we can show that
$$\frac{1}{T^4}E\left[\left(\sum_{t=1}^{T}x_{it}y_{it}\right)^2\right] = \frac{\sigma_\varepsilon^2\sigma_u^2}{6} + O(T^{-1}),$$
$$\frac{2}{T^5}E\left[\left(\sum_{t=1}^{T}x_{it}y_{it}\right)\left(\sum_{t=1}^{T}x_{it}\right)\left(\sum_{t=1}^{T}y_{it}\right)\right] = \frac{4}{15}\sigma_\varepsilon^2\sigma_u^2 + O(T^{-1}),$$
and
$$\frac{1}{T^6}E\left[\left(\sum_{t=1}^{T}x_{it}\right)^2\left(\sum_{t=1}^{T}y_{it}\right)^2\right] = \frac{\sigma_\varepsilon^2\sigma_u^2}{9} + O(T^{-1}).$$
It follows that
$$Var(\zeta_{1iT}) = \left(\frac{1}{6} - \frac{4}{15} + \frac{1}{9}\right)\sigma_\varepsilon^2\sigma_u^2 + O(T^{-1}) = \frac{\sigma_\varepsilon^2\sigma_u^2}{90} + O(T^{-1}),$$
proving (f). Next, we prove (h). First,
V ar ( 1i ) = 2" 2u E =
2
1 0
1 0
2
1 0
2
2E
1 0
Wi (r)Vi (r)dr
1 0
1 0
Wi (r)dr
Again, we can show after tedious algebra,
E E and
E
Z
Hence, establishing (h). Part (i) is easy. Finally, and
1 0
"Z
1 0
"Z
1 0
Wi (r)Vi (r)dr
Wi (r)dr
Wi (r)Vi (r)dr
Z
Z
1 0
1 0
2
1 0
# 2
Vi (r)dr
Wi (r)dr
2
1 0
2
Vi (r)dr g:
= 61 ;
#
Z
1 0
1 0
= 19 ;
Vi (r)dr
2: = 15
2 2 " u ; V ar ( 1i ) = 90
$$\zeta_{1NT} = \frac{1}{N}\sum_{i=1}^{N}\zeta_{1iT} \to_p \theta_1 = 0$$
and
$$\zeta_{2NT} = \frac{1}{N}\sum_{i=1}^{N}\zeta_{2iT} \to_p \theta_2 = \frac{\sigma_\varepsilon^2}{6}$$
as $N \to \infty$ and $T \to \infty$, as required for (j) and (k) by the law of large numbers for triangular arrays.
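The fraction arithmetic behind parts (f) and (h), $\frac{1}{6} - \frac{4}{15} + \frac{1}{9} = \frac{1}{90}$, is easy to confirm mechanically. The snippet below is an illustrative check, not part of the proof:

```python
from fractions import Fraction

# coefficients of the three moment terms in Var(zeta_1iT) and Var(zeta_1i)
var_coeff = Fraction(1, 6) - Fraction(4, 15) + Fraction(1, 9)
assert var_coeff == Fraction(1, 90)
```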
B Proof of Theorem 1

Proof. By Lemma 3 we know that
$$\hat\beta = \frac{\zeta_{1NT}}{\zeta_{2NT}} \to_p \frac{\theta_1}{\theta_2} = 0, \tag{28}$$
proving (a). We note that, unlike the pure time-series case, $\hat\beta$ does converge in probability to zero here. Note that the sequence
$$\sqrt{N}\zeta_{1NT} = \frac{1}{\sqrt{N}}\sum_{i=1}^{N}\zeta_{1iT}$$
is a triangular array sequence, and therefore a central limit theorem for triangular arrays is needed. The central limit theorem for a triangular array (e.g., see Theorem 27.2, Billingsley, 1986, p. 369) can be applied if the Lindeberg condition is satisfied; that is, for any $\delta > 0$,
$$\lim_{T\to\infty}\frac{1}{s_T^2}\sum_{i=1}^{N(T)}E\left[\zeta_{1iT}^2\, I\left(\left|\zeta_{1iT}\right| > \delta s_T\right)\right] = 0, \tag{29}$$
where $s_T^2 = \sum_{i=1}^{N(T)}Var(\zeta_{1iT})$, and $I(\cdot)$ represents an indicator function taking the value one if the condition in the brackets is satisfied and zero otherwise. It is easy to see that (29) is satisfied, since $s_T^2 = \sum_{i=1}^{N(T)}Var(\zeta_{1iT}) = N(T)\sigma_{1T}^2 \to \infty$ and $\sigma_{1T}^2 = \sigma_1^2 + O(T^{-1}) < \infty$. Since the Lindeberg condition holds, we can readily see that $\sqrt{N}\zeta_{1NT}$ will converge to a normal variable with an appropriate normalization and that $\zeta_{2NT}$ will converge to $\theta_2$ in probability by the law of large numbers for triangular arrays. Using the Slutsky theorem and Lemma 3, we obtain
$$\sqrt{N}\hat\beta = \frac{\sqrt{N}\zeta_{1NT}}{\zeta_{2NT}} \Rightarrow N\left(0, \frac{\sigma_1^2}{\theta_2^2}\right) = N\left(0, \frac{2}{5}\frac{\sigma_u^2}{\sigma_\varepsilon^2}\right),$$
as required for (b). Thus, after rescaling by $\sqrt{N}$, $\hat\beta$ converges in distribution to a normal random variable.
We observe that
T From (28),
T
= b i ) u
1 2
Z
1
0
= bi = T 1=2yi
1 2
Vi (r)dr 0 "
for all i as T ! 1: We dene 1 s2 = NT
Z
1 0
= xi :
1 2
Wi (r)dr = u
N X T n X i=1 t=1
bT
Z
1 0
2 Vi (r)dr N 0; 3u
(yit yi ) b (xit xi )
21
o
2
(30)
(31)
and then (e.g., Phillips, 1986, p. 331)
T 1s2 = = = where 1
T
2
PN PT n(y y ) b (x x )o it it i i t N i T PN h PT (y y ) b PT (x x ) i it it i i N i nT PN PtT (y y ) o bT Pt N n PT (x x ) o ; it it i i t t N i T N i T 1
=1
1
=1
1
=1
2
1
2
=1
1
2
1
2
2
2
=1
2
2
2
=1
1
2
=1
1
=1
1
2
=1
(32)
2
PT (y y ) ) R V (r) dr R V (r)dr it i i i i u t 2
=1
2
1 0
2
1 0
2
as T ! 1: It follows that (e.g., Levin and Lin, 1992, Appendix 1)
E [i ] = 6u
(33)
4 var [i ] = 45u :
(34)
2
and Therefore, Similarly, Obviously, Hence,
) 2 N (1 X T 1X p u 2 N T 2 (yit yi ) ! 6 : t=1
i=1
p 2 N 1 1X 2 " N T 2 (xit xi ) ! 6 : i=1
b
2
p N 1 1X 2 N T 2 (xit xi ) ! 0: i=1
T 1 s2 !p 6u : 2
22
(35)
Thus, s2 diverges instead of converging. Now,
qP P b x x T = t = s P x pNb q P = r P h P y y b q P pNb q = P b P q N ; q ) = N 0; = N 0; = N 0; 1 2
T
1=2
N i=1
T ( t=1 it
1 N
N 1 i=1 T 2
N 1 i=1 T 2
1 N
2 u 6 2 1 2 2
xi )2
2 1 ( T 2 it
2
i)
x
xi )2
i
N i=1 2iT
2 1 N
N i=1 iT 2 " 6
2 1 2 2
0
T ( t=1 it
T ( t=1 it
1 N
1 N
2
i)
N i=1 2iT 2 1 2 2
" u
2 " 2 u
2 5
as required for part (c). That is, T 1=2 t has a non-degenerate distribution, t = Op (T 1=2); which is the same as in a pure time-series case. This implies that the t-statistic for b has a divergent distribution. The coecient of determination is
P P y y b PP PP b PP
R2 = PNi=1 PTt=1 ((byyitit N
2 i=1 1 N 1 N 2 1 N 1 N 2 " 2 u
= =
T
2
i)
2 i) t=1 N T 2 1 ( i) i=1 T 2 t=1 it N T 1 2 ( i) i=1 T 2 t=1 it N 2iT i=1 N i=1 iT
x
y
y
x
!p 0 = 0; where
iT ) u 2
"Z
1 0
Vi (r) dr 2
Z
1 0
Vi (r)dr
# 2
:
This proves part (d). Next, we consider the behavior of the Durbin-Watson (DW) statistic when H0 : = 0 is true. The DW statistic is PN PT 2 DW = i=1PN t=2P(beTit bebe2it 1 ) i=1P t=1 it (36) N 1 PT 1 uit b "it 2 1 = T 1 PNN 1i=1PTT t=2 : y y b (x x ) 2 N
i=1 T 2
23
t=1
it
i
it
i
Now,
PN PT u b" it it N i T t 1
=1
1
2
=2
P n P
P
P
= N1 Ni=1 T1 Tt=2 u2it b T1 Tt=2 uit "it + T1 b Tt=2 "2it o o P n P P n P = N1 Ni=1 T1 Tt=2 u2it b N1 Ni=1 T1 Tt=2 uit "it +
b2 N1 !p 2u :
because
PN
i=1
2
PT T t "it 1
2
=2
o (37)
b = op (1);
) N (1 X T 1X N T uit "it = op (1); and From (35),
t=2
i=1
N 1X T 1X 2 N T "it = Op (1): t=2
i=1
N 1 X T 2 p 2u 1X b y ( x x ) ! 6: y it i it i N T2 i=1
Thus,
t=1
(38)
DW = Op (T 1)
or It also follows that
DW !p 0:
(39)
TDW !p 6:
(40)
Therefore, we establish parts (e) and (f ).
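Theorem 1's conclusions are easy to illustrate by simulation. The following sketch (illustrative only; $N$, $T$, the replication count, and the seed are arbitrary choices, with $\sigma_u = \sigma_\varepsilon = 1$) computes the LSDV slope and the DW statistic in the spurious panel regression and checks that $Var(\sqrt N\,\hat\beta) \approx 2/5$ and $T\cdot DW \approx 6$.

```python
import numpy as np

# Spurious panel regression: y and x are independent random walks (true beta = 0).
# Theorem 1 predicts sqrt(N)*betahat ~ N(0, 2/5) and T*DW ->p 6.
rng = np.random.default_rng(1)
reps, N, T = 1000, 50, 100

results = []
for _ in range(reps):
    x = rng.standard_normal((N, T)).cumsum(axis=1)
    y = rng.standard_normal((N, T)).cumsum(axis=1)
    xd = x - x.mean(axis=1, keepdims=True)      # within (LSDV) transformation
    yd = y - y.mean(axis=1, keepdims=True)
    b = (xd * yd).sum() / (xd ** 2).sum()       # LSDV estimator betahat
    e = yd - b * xd                             # LSDV residuals
    dw = ((e[:, 1:] - e[:, :-1]) ** 2).sum() / (e ** 2).sum()
    results.append((np.sqrt(N) * b, T * dw))

sb, tdw = np.array(results).T
print(sb.var(), tdw.mean())   # approx 2/5 and 6
```

The simulated dispersion of $\hat\beta$ shrinks at rate $\sqrt N$ even though the regression is spurious, which is exactly the panel feature that distinguishes this setting from the pure time-series spurious regression of Phillips (1986).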
C Proof of Lemma 2

Proof. First we note that, writing $\Omega_0 = L_0'L_0$ for the factorization of the long-run covariance matrix and $\underline W_i(r) = (V_i(r), W_i(r))'$ for the associated bivariate standard Brownian motion,
$$T^{-3/2}\sum_{t=1}^T\begin{bmatrix} y_{it}\\ x_{it}\end{bmatrix} = L_0'\left(T^{-3/2}\sum_{t=1}^T\begin{bmatrix}\tilde y_{it}\\ \tilde x_{it}\end{bmatrix}\right) \Rightarrow L_0'\int_0^1 \underline W_i(r)\,dr \quad (41)$$
and
$$T^{-2}\sum_{t=1}^T\begin{bmatrix} y_{it}^2 & y_{it}x_{it}\\ x_{it}y_{it} & x_{it}^2\end{bmatrix} = L_0'\left(T^{-2}\sum_{t=1}^T\begin{bmatrix}\tilde y_{it}\\ \tilde x_{it}\end{bmatrix}\begin{bmatrix}\tilde y_{it} & \tilde x_{it}\end{bmatrix}\right)L_0 \Rightarrow L_0'\begin{bmatrix}\int_0^1 V_i^2(r)\,dr & \int_0^1 V_i(r)W_i(r)\,dr\\[2pt] \int_0^1 W_i(r)V_i(r)\,dr & \int_0^1 W_i^2(r)\,dr\end{bmatrix}L_0. \quad (42)$$
It follows that
$$\frac{1}{\sigma_{0v}\sigma_{0\varepsilon}}\,\frac{1}{T^2}\sum_{t=1}^T x_{it}y_{it} \Rightarrow \int_0^1 W_i(r)V_i(r)\,dr.$$
Recall also that
$$\frac{1}{\sigma_{0\varepsilon}}\,T^{-1/2}\bar x_i \Rightarrow \int_0^1 W_i(r)\,dr \quad\text{and}\quad \frac{1}{\sigma_{0v}}\,T^{-1/2}\bar y_i \Rightarrow \int_0^1 V_i(r)\,dr.$$
Using (41) and (42), we have
$$\zeta_{3iT} = \frac{1}{\sigma_{0v}\sigma_{0\varepsilon}}\,\frac{1}{T^2}\sum_{t=1}^T (x_{it}-\bar x_i)\,y_{it} \Rightarrow \int_0^1 W_i(r)V_i(r)\,dr - \int_0^1 W_i(r)\,dr\int_0^1 V_i(r)\,dr.$$
This proves (a). Hence,
$$\zeta_{4iT} = \frac{1}{\sigma_{0\varepsilon}^2}\,\frac{1}{T^2}\sum_{t=1}^T (x_{it}-\bar x_i)^2 \Rightarrow \int_0^1 W_i^2(r)\,dr - \left\{\int_0^1 W_i(r)\,dr\right\}^2,$$
proving (b). Parts (c)-(k) are immediate from Lemma 1.
D Proof of Theorem 2

Proof. Using (a) in Theorem 1 we can show that $\tilde\beta \to_p 0$. Recall that
$$\hat\beta = \tilde\beta\,\sigma_{0\varepsilon}^{-1} + \sigma_{0u\varepsilon}\,\sigma_{0\varepsilon}^{-2}. \quad (43)$$
Then we obtain
$$\hat\beta \to_p \frac{\sigma_{0u\varepsilon}}{\sigma_{0\varepsilon}^2}, \quad (44)$$
proving (a). Using (b) in Theorem 1 we get
$$\sqrt N\,\tilde\beta \Rightarrow N\!\left(0, \frac{2\,\sigma_{0v}^2}{5}\right).$$
Then
$$\sqrt N\left(\hat\beta - \frac{\sigma_{0u\varepsilon}}{\sigma_{0\varepsilon}^2}\right) \Rightarrow N\!\left(0, \frac{2\,\sigma_{0v}^2}{5\,\sigma_{0\varepsilon}^2}\right) \quad (45)$$
establishes (b). Now part (c) follows from Theorem 1 and
$$\frac{1}{\sigma_{0v}^2}\,T^{-1}s^2 = \frac1N\sum_{i=1}^N\frac1{T^2}\sum_{t=1}^T\frac{\hat e_{it}^2}{\sigma_{0v}^2} = \frac1N\sum_{i=1}^N\frac1{T^2}\sum_{t=1}^T\frac{\left\{(y_{it}-\bar y_i)-\hat\beta(x_{it}-\bar x_i)\right\}^2}{\sigma_{0v}^2} \to_p \frac16, \quad (46)$$
where
$$\hat e_{it} = y_{it} - \hat\alpha_i - \hat\beta x_{it}, \quad \frac1N\sum_{i=1}^N\frac1{T^2}\sum_{t=1}^T(x_{it}-\bar x_i)^2 = O_p(1), \quad\text{and}\quad \tilde\beta = o_p(1).$$
Observe that
$$T^{-1/2}t_\beta = \frac{T^{-1/2}\hat\beta\sqrt{\sum_{i=1}^N\sum_{t=1}^T(x_{it}-\bar x_i)^2}}{s} = \frac{\sqrt N\,\hat\beta\,\sqrt{\frac1N\sum_{i=1}^N\frac1{T^2}\sum_{t=1}^T(x_{it}-\bar x_i)^2}}{\sqrt{T^{-1}s^2}}.$$
It follows that
$$T^{-1/2}\left[t_\beta - \frac{\sigma_{0u\varepsilon}}{\sigma_{0\varepsilon}^2}\,\frac{\sqrt{\sum_{i=1}^N\sum_{t=1}^T(x_{it}-\bar x_i)^2}}{s}\right] = \frac{\sqrt N\left(\hat\beta - \frac{\sigma_{0u\varepsilon}}{\sigma_{0\varepsilon}^2}\right)\sqrt{\frac1N\sum_{i=1}^N\frac1{T^2}\sum_{t=1}^T(x_{it}-\bar x_i)^2}}{\sqrt{T^{-1}s^2}} \Rightarrow N\!\left(0, \frac{2\,\sigma_{0v}^2}{5\,\sigma_{0\varepsilon}^2}\cdot\frac{\sigma_{0\varepsilon}^2/6}{\sigma_{0v}^2/6}\right) = N\!\left(0, \frac{2}{5}\right),$$
proving (c). It is easy to show that
$$\frac1N\sum_{i=1}^N\frac1{T^2}\sum_{t=1}^T(x_{it}-\bar x_i)^2 = \sigma_{0\varepsilon}^2\,\frac1N\sum_{i=1}^N \zeta_{4iT} \to_p \frac{\sigma_{0\varepsilon}^2}{6} \quad (47)$$
and
$$\frac1N\sum_{i=1}^N\frac1{T^2}\sum_{t=1}^T(y_{it}-\bar y_i)^2 \to_p \frac16\left(\sigma_{0v}^2 + \frac{\sigma_{0u\varepsilon}^2}{\sigma_{0\varepsilon}^2}\right). \quad (48)$$
To prove (d), we use (47) and (48) to obtain
$$R^2 = \frac{\hat\beta^2\,\frac1N\sum_{i=1}^N\frac1{T^2}\sum_{t=1}^T(x_{it}-\bar x_i)^2}{\frac1N\sum_{i=1}^N\frac1{T^2}\sum_{t=1}^T(y_{it}-\bar y_i)^2} \to_p \frac{\frac{\sigma_{0u\varepsilon}^2}{\sigma_{0\varepsilon}^4}\cdot\frac{\sigma_{0\varepsilon}^2}{6}}{\frac16\left(\sigma_{0v}^2+\frac{\sigma_{0u\varepsilon}^2}{\sigma_{0\varepsilon}^2}\right)} = \frac{\sigma_{0u\varepsilon}^2}{\sigma_{0v}^2\sigma_{0\varepsilon}^2 + \sigma_{0u\varepsilon}^2}.$$
Again using (12) gives
$$DW = \frac{\sum_{i=1}^N\sum_{t=2}^T(\hat e_{it}-\hat e_{it-1})^2}{\sum_{i=1}^N\sum_{t=1}^T \hat e_{it}^2} = \frac1T\cdot\frac{\frac1N\sum_{i=1}^N\frac1T\sum_{t=2}^T(u_{it}-\hat\beta\varepsilon_{it})^2}{\frac1N\sum_{i=1}^N\frac1{T^2}\sum_{t=1}^T\hat e_{it}^2}.$$
Now, with $\tilde u_{it} = u_{it} - (\sigma_{0u\varepsilon}/\sigma_{0\varepsilon}^2)\varepsilon_{it}$ and $\tilde\varepsilon_{it} = \varepsilon_{it}/\sigma_{0\varepsilon}$, so that $u_{it}-\hat\beta\varepsilon_{it} = \tilde u_{it} - \tilde\beta\,\tilde\varepsilon_{it}$ by (43),
$$\frac1N\sum_{i=1}^N\frac1T\sum_{t=2}^T(u_{it}-\hat\beta\varepsilon_{it})^2 = \frac1N\sum_{i=1}^N\frac1T\sum_{t=2}^T\tilde u_{it}^2 - 2\tilde\beta\,\frac1N\sum_{i=1}^N\frac1T\sum_{t=2}^T\tilde u_{it}\tilde\varepsilon_{it} + \tilde\beta^2\,\frac1N\sum_{i=1}^N\frac1T\sum_{t=2}^T\tilde\varepsilon_{it}^2 \to_p \sigma_{0v}^2,$$
because
$$\frac1N\sum_{i=1}^N\frac1T\sum_{t=2}^T\tilde u_{it}^2 \to_p \sigma_{0v}^2, \quad \tilde\beta = o_p(1), \quad \frac1N\sum_{i=1}^N\frac1T\sum_{t=2}^T\tilde u_{it}\tilde\varepsilon_{it} = o_p(1), \quad\text{and}\quad \frac1N\sum_{i=1}^N\frac1T\sum_{t=2}^T\tilde\varepsilon_{it}^2 = O_p(1).$$
From (46),
$$\frac1N\sum_{i=1}^N\frac1{T^2}\sum_{t=1}^T \hat e_{it}^2 \to_p \frac{\sigma_{0v}^2}{6}. \quad (49)$$
Thus,
$$DW = O_p(T^{-1}), \quad\text{or}\quad DW \to_p 0. \quad (50)$$
It also follows that
$$T\cdot DW \to_p 6. \quad (51)$$
Therefore, we establish parts (e) and (f).
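The limits in Theorem 2 can be illustrated with a small simulation. The sketch below (illustrative only; the innovation design is my own choice, with iid errors so that long-run and contemporaneous moments coincide) uses $u_{it} = 0.5\,\varepsilon_{it} + v_{it}$ with $\sigma_{0\varepsilon}^2 = 1$ and $\sigma_{0v}^2 = 0.75$, so $\sigma_{0u\varepsilon} = 0.5$, and checks $\hat\beta \to_p \sigma_{0u\varepsilon}/\sigma_{0\varepsilon}^2 = 0.5$ and $R^2 \to_p \sigma_{0u\varepsilon}^2/(\sigma_{0v}^2\sigma_{0\varepsilon}^2+\sigma_{0u\varepsilon}^2) = 0.25$.

```python
import numpy as np

# Endogenous-regressor case of Theorem 2 (iid innovations, illustrative design):
# u_it = s_ue*eps_it + v_it, so the LSDV slope picks up the bias s_ue/sigma_eps^2.
rng = np.random.default_rng(2)
N, T, reps = 100, 100, 20
s_ue, s_v2 = 0.5, 0.75

bs, r2s = [], []
for _ in range(reps):
    eps = rng.standard_normal((N, T))
    u = s_ue * eps + np.sqrt(s_v2) * rng.standard_normal((N, T))
    x, y = eps.cumsum(axis=1), u.cumsum(axis=1)      # correlated random walks
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    b = (xd * yd).sum() / (xd ** 2).sum()            # LSDV slope
    bs.append(b)
    r2s.append(b ** 2 * (xd ** 2).sum() / (yd ** 2).sum())

b_mean, r2_mean = np.mean(bs), np.mean(r2s)
print(b_mean, r2_mean)   # approx 0.5 and 0.25
```

Unlike the independent case of Theorem 1, $\hat\beta$ and $R^2$ now converge to nonzero constants, so the spurious regression "looks" informative even though there is no cointegration.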
E Proof of Theorem 3

Proof. First, we note that
$$\hat e_{it-1} = y_{it-1} - \hat\alpha_i - \hat\beta x_{it-1} = (y_{it-1}-\bar y_i) - \hat\beta(x_{it-1}-\bar x_i)$$
and
$$\Delta\hat e_{it} = \tilde u_{it} - \tilde\beta\,\tilde\varepsilon_{it},$$
where
$$\tilde u_{it} = u_{it} - \frac{\sigma_{0u\varepsilon}}{\sigma_{0\varepsilon}^2}\,\varepsilon_{it} \quad\text{and}\quad \tilde\varepsilon_{it} = \frac{\varepsilon_{it}}{\sigma_{0\varepsilon}}.$$
Write $y^*_{it} = y_{it} - (\sigma_{0u\varepsilon}/\sigma_{0\varepsilon}^2)x_{it}$ and $\tilde x_{it} = x_{it}/\sigma_{0\varepsilon}$, so that $\Delta y^*_{it} = \tilde u_{it}$ and $\hat e_{it} = (y^*_{it}-\bar y^*_i) - \tilde\beta(\tilde x_{it}-\bar{\tilde x}_i)$. It follows that
$$\zeta_{6NT} = \frac1N\sum_{i=1}^N\frac1{T^2}\sum_{t=2}^T \hat e_{it-1}^2 = \frac1N\sum_{i=1}^N\frac1{T^2}\sum_{t=2}^T(y^*_{it-1}-\bar y^*_i)^2 + o_p(1), \quad (52)$$
since
$$\frac1N\sum_{i=1}^N\frac1{T^2}\sum_{t=2}^T(\tilde x_{it-1}-\bar{\tilde x}_i)^2 = O_p(1) \quad\text{and}\quad \tilde\beta^2 = o_p(1).$$
We also note that
$$\frac1{T^2}\sum_{t=2}^T(y^*_{it-1}-\bar y^*_i)^2 \Rightarrow \sigma_{0v}^2\left[\int_0^1 V_i^2(r)\,dr - \left(\int_0^1 V_i(r)\,dr\right)^2\right] \equiv \zeta_{6i}, \quad (53)$$
$$E[\zeta_{6i}] = \frac{\sigma_{0v}^2}{6} \equiv \theta_6 \quad\text{and}\quad Var[\zeta_{6i}] = \frac{\sigma_{0v}^4}{45} \equiv \sigma_6^2. \quad (54)$$
Thus,
$$\frac1N\sum_{i=1}^N\frac1{T^2}\sum_{t=2}^T(y^*_{it-1}-\bar y^*_i)^2 \to_p \frac{\sigma_{0v}^2}{6}$$
and
$$\zeta_{6NT} \to_p \theta_6 = \frac{\sigma_{0v}^2}{6}$$
by the law of large numbers for triangular arrays and equation (52). Moreover,
$$\sqrt N\,\frac1N\sum_{i=1}^N\frac1T\sum_{t=2}^T \hat e_{it-1}\,\Delta\hat e_{it} = \sqrt N\,\frac1N\sum_{i=1}^N\frac1T\sum_{t=2}^T\left[(y^*_{it-1}-\bar y^*_i) - \tilde\beta(\tilde x_{it-1}-\bar{\tilde x}_i)\right]\left[\tilde u_{it} - \tilde\beta\,\tilde\varepsilon_{it}\right] = \sqrt N\,\frac1N\sum_{i=1}^N\frac1T\sum_{t=2}^T(y^*_{it-1}-\bar y^*_i)\,\tilde u_{it} + o_p(1),$$
since
$$\frac1N\sum_{i=1}^N\frac1T\sum_{t=2}^T(\tilde x_{it-1}-\bar{\tilde x}_i)\tilde\varepsilon_{it} \to_p -\frac12, \quad \frac1N\sum_{i=1}^N\frac1T\sum_{t=2}^T(y^*_{it-1}-\bar y^*_i)\tilde\varepsilon_{it} = o_p(1), \quad\text{and}\quad \frac1N\sum_{i=1}^N\frac1T\sum_{t=2}^T(\tilde x_{it-1}-\bar{\tilde x}_i)\tilde u_{it} = o_p(1)$$
by the law of large numbers for triangular arrays. Hence,
$$\sqrt N\,T(\hat\rho - 1) = \frac{\sqrt N\,\zeta'_{5NT}}{\zeta_{6NT}}, \quad (55)$$
where
$$\zeta'_{5iT} = \frac1T\sum_{t=2}^T(y^*_{it-1}-\bar y^*_i)\,\tilde u_{it} \quad\text{and}\quad \zeta'_{5NT} = \frac1N\sum_{i=1}^N \zeta'_{5iT}.$$
But
$$\zeta'_{5iT} = \frac1T\sum_{t=2}^T y^*_{it-1}\tilde u_{it} - \left(T^{-3/2}\sum_{t=2}^T y^*_{it-1}\right)\left(T^{-1/2}\sum_{t=2}^T \tilde u_{it}\right),$$
and
$$\frac1T\sum_{t=2}^T y^*_{it-1}\tilde u_{it} \Rightarrow \frac{\sigma_{0v}^2}{2}\left[V_i(1)^2 - \frac{\sigma_v^2}{\sigma_{0v}^2}\right], \quad T^{-3/2}\sum_{t=2}^T y^*_{it-1} \Rightarrow \sigma_{0v}\int_0^1 V_i(r)\,dr, \quad T^{-1/2}\sum_{t=2}^T \tilde u_{it} \Rightarrow \sigma_{0v}\,V_i(1).$$
Clearly,
$$\zeta'_{5iT} \Rightarrow \frac{\sigma_{0v}^2}{2}\left[V_i(1)^2 - \frac{\sigma_v^2}{\sigma_{0v}^2}\right] - \sigma_{0v}^2\,V_i(1)\int_0^1 V_i(r)\,dr \equiv \zeta'_{5i}, \quad (56)$$
$$E[\zeta'_{5i}] = -\frac{\sigma_v^2}{2} \equiv \theta_5 \quad\text{and}\quad Var[\zeta'_{5i}] = \frac{\sigma_{0v}^4}{12} \equiv \sigma_5^2,$$
since
$$E\left[V_i(1)\int_0^1 V_i(r)\,dr\right] = \frac12 \quad\text{and}\quad E\left[V_i(1)^2\right] = 1.$$
Let $\theta_{5T} \equiv E[\zeta'_{5iT}] = \theta_5 + O(T^{-1})$ and $\theta_{6T} \equiv E[\zeta_{6iT}] = \theta_6 + O(T^{-1})$. The next step is to find an appropriate normalization of $\frac{\zeta'_{5NT}}{\zeta_{6NT}}$ to make sure it converges to a proper random variable. For this purpose, we note the following standardization:
$$\sqrt N\left[\frac{\zeta'_{5NT}}{\zeta_{6NT}} - \frac{\theta_{5T}}{\theta_{6T}}\right] = \frac1{\zeta_{6NT}}\,\sqrt N\left(\zeta'_{5NT}-\theta_{5T}\right) + \theta_{5T}\,\sqrt N\left(\frac1{\zeta_{6NT}} - \frac1{\theta_{6T}}\right). \quad (57)$$
Since $\{\zeta'_{5iT}\}$ satisfies the conditions of the central limit theorem for triangular arrays, we can readily see that $\sqrt N(\zeta'_{5NT}-\theta_{5T})$ will converge to a normal variable with an appropriate normalization and that $\zeta_{6NT}$ will converge to $\theta_6$ in probability. Using the Slutsky theorem, we obtain
$$\frac1{\zeta_{6NT}}\,\sqrt N\left(\zeta'_{5NT}-\theta_{5T}\right) \Rightarrow N\!\left(0, \frac{\sigma_5^2}{\theta_6^2}\right)$$
as $N\to\infty$ and $T\to\infty$. Similarly,
$$\theta_{5T}\,\sqrt N\left(\frac1{\zeta_{6NT}} - \frac1{\theta_{6T}}\right) \Rightarrow N\!\left(0, \frac{\theta_5^2\,\sigma_6^2}{\theta_6^4}\right), \quad (58)$$
since it can be shown that $cov(\zeta'_{5iT}, \zeta_{6iT}) = O(T^{-1})$ and $\sqrt N[\zeta'_{5NT}-\theta_{5T}]$ and $\sqrt N[\zeta_{6NT}-\theta_{6T}]$ are asymptotically uncorrelated. We therefore conclude
$$\sqrt N\left[\frac{\zeta'_{5NT}}{\zeta_{6NT}} - \frac{\theta_{5T}}{\theta_{6T}}\right] \Rightarrow N\!\left(0, \frac{\sigma_5^2}{\theta_6^2} + \frac{\theta_5^2\,\sigma_6^2}{\theta_6^4}\right), \quad (59)$$
where
$$\frac{\sigma_5^2}{\theta_6^2} = \frac{\sigma_{0v}^4/12}{\sigma_{0v}^4/36} = 3 \quad\text{and}\quad \frac{\theta_5^2\,\sigma_6^2}{\theta_6^4} = \frac{(\sigma_v^4/4)(\sigma_{0v}^4/45)}{\sigma_{0v}^8/1296} = \frac{36\,\sigma_v^4}{5\,\sigma_{0v}^4}.$$
Thus,
$$\sqrt N\,T(\hat\rho-1) - \sqrt N\,\frac{\theta_{5T}}{\theta_{6T}} \Rightarrow N\!\left(0,\; 3 + \frac{36\,\sigma_v^4}{5\,\sigma_{0v}^4}\right).$$
The t-statistic is
$$t_\rho = \frac{(\hat\rho-1)\sqrt{\sum_{i=1}^N\sum_{t=2}^T \hat e_{it-1}^2}}{s_e} = \frac{\sqrt N\,\zeta'_{5NT}}{s_e\sqrt{\zeta_{6NT}}} + o(1),$$
where
$$s_e^2 = \frac1{NT}\sum_{i=1}^N\sum_{t=2}^T\left(\hat e_{it} - \hat\rho\,\hat e_{it-1}\right)^2.$$
It follows that
$$s_e^2 = \frac1{NT}\sum_{i=1}^N\sum_{t=2}^T\left(\tilde u_{it} - \tilde\beta\,\tilde\varepsilon_{it}\right)^2 + o_p(1) = \frac1{NT}\sum_{i=1}^N\sum_{t=2}^T\left(\tilde u_{it}^2 - 2\tilde\beta\,\tilde u_{it}\tilde\varepsilon_{it} + \tilde\beta^2\tilde\varepsilon_{it}^2\right) + o_p(1) \to_p \sigma_v^2. \quad (60)$$
That is, $s_e^2$ is a consistent estimator of $\sigma_v^2$. Next, we derive the asymptotic distribution of
$$\frac{\sqrt N\,\zeta'_{5NT}}{s_e\sqrt{\zeta_{6NT}}}. \quad (61)$$
We use the same approach for the asymptotic distribution of $\frac{\sqrt N\,\zeta'_{5NT}}{s_e\sqrt{\zeta_{6NT}}}$ as we used for $\sqrt N\,T(\hat\rho-1)$. We make the following normalization:
$$\frac{\sqrt N\,\zeta'_{5NT}}{s_e\sqrt{\zeta_{6NT}}} - \frac{\sqrt N\,\theta_{5T}}{s_e\sqrt{\theta_{6T}}} = \frac{\sqrt N\left(\zeta'_{5NT}-\theta_{5T}\right)}{s_e\sqrt{\zeta_{6NT}}} + \frac{\theta_{5T}}{s_e}\,\sqrt N\left(\frac1{\sqrt{\zeta_{6NT}}} - \frac1{\sqrt{\theta_{6T}}}\right).$$
Since
$$\frac{\sqrt N\left(\zeta'_{5NT}-\theta_{5T}\right)}{s_e\sqrt{\zeta_{6NT}}} \Rightarrow N\!\left(0, \frac{\sigma_5^2}{\sigma_v^2\,\theta_6}\right) \quad\text{and}\quad \frac{\theta_{5T}}{s_e}\,\sqrt N\left(\frac1{\sqrt{\zeta_{6NT}}} - \frac1{\sqrt{\theta_{6T}}}\right) \Rightarrow N\!\left(0, \frac{\theta_5^2\,\sigma_6^2}{4\,\sigma_v^2\,\theta_6^3}\right),$$
and $\sqrt N[\zeta'_{5NT}-\theta_{5T}]$ and $\sqrt N[\zeta_{6NT}-\theta_{6T}]$ are asymptotically uncorrelated, we note that
$$\frac{\sqrt N\,\zeta'_{5NT}}{s_e\sqrt{\zeta_{6NT}}} - \frac{\sqrt N\,\theta_{5T}}{s_e\sqrt{\theta_{6T}}} \Rightarrow N\!\left(0, \frac{\sigma_5^2}{\sigma_v^2\,\theta_6} + \frac{\theta_5^2\,\sigma_6^2}{4\,\sigma_v^2\,\theta_6^3}\right),$$
where
$$\frac{\sigma_5^2}{\sigma_v^2\,\theta_6} = \frac{\sigma_{0v}^4/12}{\sigma_v^2\,\sigma_{0v}^2/6} = \frac{\sigma_{0v}^2}{2\,\sigma_v^2} \quad\text{and}\quad \frac{\theta_5^2\,\sigma_6^2}{4\,\sigma_v^2\,\theta_6^3} = \frac{3\,\sigma_v^2}{10\,\sigma_{0v}^2}.$$
Thus,
$$t_\rho - \frac{\sqrt N\,\theta_{5T}}{s_e\sqrt{\theta_{6T}}} \Rightarrow N\!\left(0, \frac{\sigma_{0v}^2}{2\,\sigma_v^2} + \frac{3\,\sigma_v^2}{10\,\sigma_{0v}^2}\right). \quad (62)$$
Therefore, we established Theorem 3.
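The coefficient test implied by Theorem 3 can be sketched in code. In the simplified design below (my own illustrative reading, with independent $u$ and $\varepsilon$ so that $\sigma_v = \sigma_{0v}$ and the corrections reduce to constants), the standardized statistic
$$DF_\rho = \frac{\sqrt N\,T(\hat\rho-1) + 3\sqrt N}{\sqrt{3 + 36/5}}$$
should be approximately standard normal under the null of no cointegration, so a one-sided 5% test rejects near its nominal level.

```python
import numpy as np

# Standardized pooled coefficient test under the null of no cointegration
# (illustrative sketch with independent innovations, so sigma_v = sigma_0v).
rng = np.random.default_rng(3)
reps, N, T = 400, 50, 50

stats = []
for _ in range(reps):
    x = rng.standard_normal((N, T)).cumsum(axis=1)
    y = rng.standard_normal((N, T)).cumsum(axis=1)
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    b = (xd * yd).sum() / (xd ** 2).sum()
    e = yd - b * xd                                  # LSDV residuals e_it
    rho = (e[:, :-1] * e[:, 1:]).sum() / (e[:, :-1] ** 2).sum()  # pooled AR(1)
    stats.append((np.sqrt(N) * T * (rho - 1) + 3 * np.sqrt(N)) / np.sqrt(3 + 36 / 5))

stats = np.array(stats)
size = (stats < -1.645).mean()                       # one-sided 5% rejection rate
print(stats.mean(), stats.var(), size)
```

The finite-sample rejection frequency will not be exactly 5% at $N = T = 50$ (compare the size distortions documented in the tables below), but the statistic is centered and scaled roughly as the theorem predicts.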
F Proof of Theorem 4

Proof. Observe that
$$\zeta_{8NT} = \frac1N\sum_{i=1}^N \zeta_{8iT} = \frac1N\sum_{i=1}^N\frac1{T^2}\left(e_i'Q_ie_i\right) = \frac1N\sum_{i=1}^N\frac1{T^2}\left(e_i'e_i\right) + o_p(1) = \frac1N\sum_{i=1}^N\frac1{T^2}\sum_{t=2}^T \hat e_{it-1}^2 + o_p(1) = \frac1N\sum_{i=1}^N\left\{\frac1{T^2}\sum_{t=2}^T\left(y^*_{it}-\bar y^*_i\right)^2\right\} + o_p(1).$$
We also note that
$$\frac1{T^2}\sum_{t=2}^T\left(y^*_{it}-\bar y^*_i\right)^2 \Rightarrow \sigma_{0v}^2\left[\int_0^1 V_i^2(r)\,dr - \left(\int_0^1 V_i(r)\,dr\right)^2\right] \equiv \zeta_{8i}, \quad (63)$$
$$E[\zeta_{8i}] = \frac{\sigma_{0v}^2}{6} \equiv \theta_8 \quad\text{and}\quad Var[\zeta_{8i}] = \frac{\sigma_{0v}^4}{45} \equiv \sigma_8^2.$$
Under the null hypothesis that $\rho = 1$, expression (16) can be written as
$$d(L)\,\Delta\hat e_{it} = v_{itp},$$
where $d(L) = 1 - \varphi_1 L - \varphi_2 L^2 - \cdots - \varphi_p L^p$ and $L$ is the backshift operator. Moreover,
$$\sqrt N\,\zeta'_{7NT} = \sqrt N\,\frac1N\sum_{i=1}^N\frac1T\left(e_i'Q_iv_i\right) = \sqrt N\,\frac1N\sum_{i=1}^N\frac1T\left(e_i'v_i\right) + o_p(1) = \sqrt N\,\frac1N\sum_{i=1}^N\frac1T\sum_{t=2}^T \hat e_{it-1}v_{itp} + o_p(1) = \sqrt N\,\frac1N\sum_{i=1}^N\frac1T\sum_{t=2}^T\left(y^*_{it-1}-\bar y^*_i\right)d(L)\tilde u_{it} + o_p(1). \quad (64)$$
Since
$$\sqrt N\,\frac1N\sum_{i=1}^N\frac1T\sum_{t=2}^T \hat e_{it-1}v_{itp} = \sqrt N\,\frac1N\sum_{i=1}^N\frac1T\sum_{t=2}^T \hat e_{it-1}\left(\Delta\hat e_{it} - \varphi_1\Delta\hat e_{it-1} - \cdots - \varphi_p\Delta\hat e_{it-p}\right) = \frac1{\sqrt N\,T}\sum_{i=1}^N\sum_{t=2}^T\left(y^*_{it-1}-\bar y^*_i\right)\left(\tilde u_{it} - \varphi_1\tilde u_{it-1} - \cdots - \varphi_p\tilde u_{it-p}\right) + o_p(1),$$
it follows that (64) holds. Hence,
$$t_{ADF} = \frac{\sqrt N\,\zeta'_{7NT}}{s_v\sqrt{\zeta_{8NT}}} + o_p(1),$$
where
$$\zeta'_{7iT} = \frac1T\sum_{t=2}^T\left(y^*_{it-1}-\bar y^*_i\right)d(L)\tilde u_{it} \quad\text{and}\quad \zeta'_{7NT} = \frac1N\sum_{i=1}^N \zeta'_{7iT}.$$
Assume that the sequence $\{\tilde u_{it}\}$ satisfies condition (C2) in Phillips and Ouliaris (1990) so that
$$\frac1{\sqrt T}\sum_{t=2}^{[rT]} d(L)\tilde u_{it} \Rightarrow d(1)\,\sigma_{0v}\,V_i(r).$$
Consequently,
$$\frac1T\sum_{t=2}^T y^*_{it-1}\,d(L)\tilde u_{it} \Rightarrow d(1)\,\sigma_{0v}^2\int_0^1 V_i(r)\,dV_i(r)$$
and
$$\frac1T\sum_{t=2}^T \bar y^*_i\,d(L)\tilde u_{it} \Rightarrow d(1)\,\sigma_{0v}^2\,V_i(1)\int_0^1 V_i(r)\,dr.$$
Clearly,
$$\zeta'_{7iT} \Rightarrow d(1)\,\sigma_{0v}^2\left[\int_0^1 V_i(r)\,dV_i(r) - V_i(1)\int_0^1 V_i(r)\,dr\right] \equiv \zeta_{7i},$$
with
$$E[\zeta_{7i}] = -\frac{d(1)\,\sigma_{0v}^2}{2} \quad\text{and}\quad Var[\zeta_{7i}] = \frac{d^2(1)\,\sigma_{0v}^4}{12}.$$
Let
$$s_v^2 = \frac1{NT}\sum_{i=1}^N\sum_{t=2}^T\left[\hat e_{it} - \hat\rho\,\hat e_{it-1} - \left(\hat\varphi_1 L + \hat\varphi_2 L^2 + \cdots + \hat\varphi_p L^p\right)\Delta\hat e_{it}\right]^2.$$
Observe that $\{\hat e_{it}\}$ are linear combinations of $u_{it}$ and $\varepsilon_{it}$. Those linear combinations are in general ARMA processes. Therefore, we need $p\to\infty$ to capture the true structure of $v_{itp}$. As a result of $p\to\infty$, $\hat\rho\to 1$ and the estimated coefficients $\{\hat\varphi_i\}$ may well represent $\{\varphi_i\}$:
$$s_v^2 = \frac1{NT}\sum_{i=1}^N\sum_{t=2}^T\left\{\left(1 - \varphi_1 L - \cdots - \varphi_p L^p\right)\left(\tilde u_{it} - \tilde\beta\,\tilde\varepsilon_{it}\right)\right\}^2 + o_p(1) = \frac1{NT}\sum_{i=1}^N\sum_{t=2}^T\left\{\left(d(L)\tilde u_{it}\right)^2 - 2\tilde\beta\,d(L)\tilde u_{it}\,d(L)\tilde\varepsilon_{it} + \tilde\beta^2\left(d(L)\tilde\varepsilon_{it}\right)^2\right\} + o_p(1) \to_p d^2(1)\,\sigma_v^2.$$
We are now in a position to derive the asymptotic distribution of
$$\frac{\sqrt N\,\zeta'_{7NT}}{s_v\sqrt{\zeta_{8NT}}}.$$
Define $\theta_{7T} \equiv E[\zeta'_{7iT}] = E[\zeta_{7i}] + o(1)$ and $\theta_{8T} \equiv E[\zeta_{8iT}] = E[\zeta_{8i}] + o(1)$. Then, we make the following normalization:
$$t_{ADF} - \frac{\sqrt N\,\theta_{7T}}{s_v\sqrt{\theta_{8T}}} = \frac{\sqrt N\left(\zeta'_{7NT}-\theta_{7T}\right)}{s_v\sqrt{\zeta_{8NT}}} + \frac{\theta_{7T}}{s_v}\,\sqrt N\left(\frac1{\sqrt{\zeta_{8NT}}} - \frac1{\sqrt{\theta_{8T}}}\right).$$
Since
$$\frac{\sqrt N\left(\zeta'_{7NT}-\theta_{7T}\right)}{s_v\sqrt{\zeta_{8NT}}} \Rightarrow N\!\left(0, \frac{d^2(1)\,\sigma_{0v}^4/12}{d^2(1)\,\sigma_v^2\cdot\sigma_{0v}^2/6}\right) = N\!\left(0, \frac{\sigma_{0v}^2}{2\,\sigma_v^2}\right),$$
$$\frac{\theta_{7T}}{s_v}\,\sqrt N\left(\frac1{\sqrt{\zeta_{8NT}}} - \frac1{\sqrt{\theta_{8T}}}\right) \Rightarrow N\!\left(0, \frac{3\,\sigma_v^2}{10\,\sigma_{0v}^2}\right),$$
and $\sqrt N(\zeta'_{7NT}-\theta_{7T})$ and $\sqrt N(\zeta_{8NT}-\theta_{8T})$ are asymptotically uncorrelated, we conclude that
$$t_{ADF} - \frac{\sqrt N\,\theta_{7T}}{s_v\sqrt{\theta_{8T}}} \Rightarrow N\!\left(0, \frac{\sigma_{0v}^2}{2\,\sigma_v^2} + \frac{3\,\sigma_v^2}{10\,\sigma_{0v}^2}\right)$$
as $N\to\infty$ and $T\to\infty$.
References

[1] Banerjee, A., Dolado, J., Galbraith, J. W., and Hendry, D. F. (1993), Co-integration, Error Correction, and the Econometric Analysis of Non-stationary Data, Oxford University Press, New York.
[2] Billingsley, P. (1986), Probability and Measure, John Wiley, New York.
[3] Breitung, J., and Meyer, W. (1994), Testing for Unit Roots in Panel Data: Are Wages on Different Bargaining Levels Cointegrated? Applied Economics, 26, 353-361.
[4] Granger, C. W. J., and Newbold, P. (1974), Spurious Regressions in Econometrics, Journal of Econometrics, 2, 111-120.
[5] Haug, A. A. (1996), Tests for Cointegration: A Monte Carlo Comparison, Journal of Econometrics, 71, 89-115.
[6] Im, K., Pesaran, H., and Shin, Y. (1995), Testing for Unit Roots in Heterogeneous Panels, Manuscript, University of Cambridge.
[7] Kao, C., and Chen, B. (1995a), On the Estimation of a Cointegrated Regression in Panel Data when the Cross-Section and Time-Series Dimensions are Comparable, Manuscript, Department of Economics, Syracuse University.
[8] Kao, C., and Chen, B. (1995b), Residual-Based Tests for Cointegration in Panel Data when the Cross-Section and Time-Series Dimensions are Comparable, Manuscript, Department of Economics, Syracuse University.
[9] Levin, A., and Lin, C.-F. (1992), Unit Root Tests in Panel Data: Asymptotic and Finite-Sample Properties, Discussion Paper 92-93, Department of Economics, UC-San Diego.
[10] Levin, A., and Lin, C.-F. (1993), Unit Root Tests in Panel Data: New Results, Discussion Paper, Department of Economics, UC-San Diego.
[11] Park, J., and Ogaki, M. (1991), Seemingly Unrelated Canonical Cointegrating Regressions, The Rochester Center for Economic Research, Working Paper No. 280.
[12] Pedroni, P. (1995), Panel Cointegration: Asymptotic and Finite Sample Properties of Pooled Time Series Tests with an Application to the PPP Hypothesis, Indiana University Working Papers in Economics, No. 95-013.
[13] Pesaran, H., and Smith, R. (1995), Estimating Long-Run Relationships from Dynamic Heterogeneous Panels, forthcoming in Journal of Econometrics.
[14] Phillips, P. C. B. (1986), Understanding Spurious Regressions in Econometrics, Journal of Econometrics, 33, 311-340.
[15] Phillips, P. C. B. (1987), Time Series Regression with a Unit Root, Econometrica, 55, 277-302.
[16] Phillips, P. C. B., and Ouliaris, S. (1990), Asymptotic Properties of Residual Based Tests for Cointegration, Econometrica, 58, 165-193.
[17] Quah, D. (1994), Exploiting Cross Section Variation for Unit Root Inference in Dynamic Data, Economics Letters, 44, 9-19.
Table 1: LSDV Regression with N = T

 N(=T)    β̄        SSD(β̂)   ESE(β̂)    t̄        SSD(t̄)   P(|t̄|>1.96)   R̄²      D̄W̄
  10    -.0028     .2865     .1495    -.0185    1.9603      .315       .0384   .6184
  20     .0024     .2039     .0727     .0339    2.8375      .4889      .02     .3057
  30     .0012     .1652     .0479     .0248    3.4693      .5723      .0133   .2033
  40     .0006     .1423     .0358     .016     4.0065      .6166      .01     .1519
  50     .0001     .1265     .0286     .006     4.4392      .6598      .0079   .1212
  60     .0004     .1160     .0237     .0207    4.8987      .6894      .0066   .1009
  70    -.0006     .1072     .0203    -.0365    5.3014      .7096      .0057   .0864
  80    -.0001     .1002     .0178    -.0057    5.6369      .7264      .0049   .0755

Note: (a) β̄ = (1/M)Σ β̂ᵢ; (b) t̄ = (1/M)Σ t̄ᵢ; (c) R̄² = (1/M)Σ R²ᵢ; (d) D̄W̄ = (1/M)Σ DWᵢ; (e) the number of replications M = 10,000.

Table 2: LSDV Regression with N = 30

  T       β̄        SSD(β̂)   ESE(β̂)    t̄        SSD(t̄)   P(|t̄|>1.96)   R̄²      D̄W̄
  10     .0006     .1669     .0861     .0104    1.9546      .3130      .0136   .5686
  20     .0022     .1660     .0593     .0397    2.8192      .4858      .0134   .2991
  30     .0012     .1652     .0479     .0248    3.4693      .5723      .0133   .2033
  40     .0004     .1639     .0413     .0131    3.9855      .6218      .0131   .1538
  50    -.0001     .1651     .0368    -.0074    4.5078      .6651      .0133   .1238
  60     .0013     .1643     .0336     .0381    4.9167      .6889      .0131   .1033
  70    -.0012     .1647     .0310    -.0419    5.3318      .7103      .0132   .0890
 100     .0026     .1634     .0259     .1072    6.3427      .7548      .0130   .0626
 150     .0002     .1647     .0211     .0802    7.8426      .8022      .0132   .0418

Note: (a) β̄ = (1/M)Σ β̂ᵢ; (b) t̄ = (1/M)Σ t̄ᵢ; (c) R̄² = (1/M)Σ R²ᵢ; (d) D̄W̄ = (1/M)Σ DWᵢ; (e) the number of replications M = 10,000.
Table 3: LSDV Regression with T = 30

  N       β̄        SSD(β̂)   ESE(β̂)    t̄        SSD(t̄)   P(|t̄|>1.96)   R̄²      D̄W̄
  10     .0029     .2848     .0832     .0261    3.5000      .5732      .0378   .2232
  20     .0037     .2005     .0588     .0567    3.4577      .5788      .0195   .2080
  30     .0012     .1652     .0479     .0248    3.4693      .5723      .0133   .2033
  40     .0009     .1425     .0415     .0195    3.4532      .5692      .0099   .2007
  50     .0001     .1265     .0371     .0259    3.4306      .5699      .0109   .1993
  60     .0017     .1163     .0339     .0444    3.4478      .5635      .0067   .1983
  70     .0020     .1080     .0313     .0588    3.4613      .5745      .0058   .1976

Note: (a) β̄ = (1/M)Σ β̂ᵢ; (b) t̄ = (1/M)Σ t̄ᵢ; (c) R̄² = (1/M)Σ R²ᵢ; (d) D̄W̄ = (1/M)Σ DWᵢ; (e) the number of replications M = 10,000.
Table 4: The Empirical Size at 5 Percent for a1 = 1, = 1, = 1, = 0:0, and = 0
T = 10 N
1 2 5 10 15 20 25 50 75 100 150 200 250 300
DF .350 .257 .182 .149 .142 .144 .149 .178 .208 .242 .318 .378 .446 .513
DFt
.282 .233 .191 .179 .179 .193 .212 .270 .320 .372 .472 .551 .623 .682
DF
.034 .028 .021 .018 .021 .023 .024 .041 .068 .098 .173 .269 .372 .473
T = 25 DFt .075 .079 .084 .096 .099 .115 .121 .192 .255 .314 .433 .535 .630 .708
ADF .139 .138 .167 .210 .272 .315 .367 .575 .722 .829 .939 .979 .993 .998
T = 50
DF .435 .307 .195 .142 .135 .121 .115 .119 .117 .121 .132 .142 .153 .171
DFt
.239 .180 .139 .114 .118 .109 .110 .122 .126 .139 .154 .170 .187 .212
DF .157 .130 .088 .068 .059 .054 .052 .053 .054 .061 .070 .079 .091 .105
DFt
ADF
.098 .092 .089 .079 .080 .078 .078 .089 .095 .106 .123 .138 .156 .177
.137 .125 .115 .113 .123 .123 .129 .161 .185 .214 .265 .308 .356 .407
.179 .133 .103 .082 .078 .077 .071 .071 .067 .062 .067 .068 .071 .069
.184 .139 .113 .093 .085 .087 .084 .081 .079 .081 .085 .091 .096 .095
T = 100
1 .461 .218 .354 .145 .167 .467 .213 2 .322 .159 .249 .115 .135 .319 .152 5 .202 .118 .153 .093 .107 .209 .115 10 .149 .101 .106 .079 .092 .150 .092 15 .127 .091 .087 .075 .090 .127 .083 20 .121 .093 .084 .076 .093 .116 .084 25 .113 .091 .077 .074 .094 .108 .078 50 .105 .090 .072 .075 .098 .095 .076 75 .093 .085 .065 .075 .107 .087 .073 100 .092 .093 .064 .077 .114 .085 .070 150 .094 .094 .069 .084 .127 .084 .075 200 .098 .098 .069 .086 .141 .085 .076 250 .098 .104 .072 .091 .149 .079 .078 300 .104 .113 .075 .096 .166 .079 .076 Note: (a) The number of replications =10,000. (b) All rejection frequencies are based on the asymptotic critical value 1.645.
.429 .291 .187 .129 .105 .098 .090 .079 .073 .071 .069 .068 .069 .065
Table 5: The Power at 5 Percent for a1 = 1, = 1, = :90, = 0, and = 0
T = 10 N
1 2 5 10 15 20 25 50 75 100 150 200 250 300
DF .359 .296 .270 .286 .329 .375 .419 .626 .773 .870 .958 .988 .998 .999
DFt
.283 .252 .249 .277 .317 .358 .405 .581 .714 .809 .919 .964 .988 .995
DF
.028 .022 .017 .014 .017 .018 .019 .027 .040 .054 .093 .143 .195 .266
T = 25 DFt .068 .073 .077 .090 .100 .112 .121 .195 .264 .325 .464 .583 .679 .758
ADF .137 .136 .184 .248 .323 .373 .435 .665 .811 .899 .974 .993 .998 .000
T = 50
DF .496 .451 .465 .569 .674 .756 .823 .966 .994 1.00 1.00 1.00 1.00 1.00
DFt
.255 .234 .261 .341 .421 .488 .559 .799 .914 .967 .996 1.00 1.00 1.00
DF .135 .154 .185 .255 .334 .406 .484 .774 .913 .972 .998 1.00 1.00 1.00
DFt
ADF
.084 .091 .115 .163 .205 .248 .291 .507 .674 .793 .922 .977 .995 .998
.139 .145 .176 .225 .276 .321 .368 .566 .715 .822 .934 .977 .992 .997
.391 .601 .922 .998 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
.355 .518 .843 .985 .998 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
T = 100
1 .622 .293 .458 .179 .197 .845 .471 2 .569 .341 .255 .163 .225 .939 .667 5 .798 .471 .657 .340 .324 .999 .949 10 .936 .698 .853 .561 .496 1.00 1.00 15 .983 .841 .944 .724 .630 1.00 1.00 20 .997 .922 .982 .835 .735 1.00 1.00 25 .999 .995 .995 .905 .818 1.00 1.00 50 1.00 1.00 1.00 .996 .977 1.00 1.00 75 1.00 1.00 1.00 1.00 .997 1.00 1.00 100 1.00 1.00 1.00 1.00 1.00 1.00 1.00 150 1.00 1.00 1.00 1.00 1.00 1.00 1.00 200 1.00 1.00 1.00 1.00 1.00 1.00 1.00 250 1.00 1.00 1.00 1.00 1.00 1.00 1.00 300 1.00 1.00 1.00 1.00 1.00 1.00 1.00 Note: (a) The number of replications =10,000. (b) All rejection frequencies are based on the asymptotic critical value 1.645.
.781 .898 .994 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
Table 6: The Empirical Size of 5 Percent of Tests with the Null Hypothesis of No Cointegration
= :25 = 0:8 DF DFt DF DFt ADF =0 DF DFt DF DFt ADF = 0:8 DF DFt DF DFt ADF
=1
=4
-0.5
0
0.5
-0.5
0
0.5
-0.5
0
0.5
1.00 1.00 1.00 1.00 .814
1.00 1.00 1.00 1.00 .038
1.00 1.00 1.00 1.00 .062
1.00 1.00 1.00 1.00 1.00
1.00 1.00 1.00 1.00 .985
1.00 1.00 1.00 1.00 .998
1.00 1.00 1.00 1.00 .993
1.00 1.00 1.00 1.00 .672
1.00 1.00 1.00 1.00 .466
.100 .089 .069 .072 .097
.102 .089 .069 .072 .096
.102 .089 .069 .072 .096
.104 .087 .073 .077 .099
.105 .090 .072 .090 .098
.104 .088 .073 .088 .100
.104 .089 .073 .074 .099
.104 .088 .070 .076 .099
.101 .088 .075 .073 .100
.005 .014 .287 .161 .519
.000 .000 .095 .070 .615
.000 .000 .041 .048 .416
.145 .113 .208 .149 .157
.020 .028 .163 .116 .317
.067 .064 .273 .174 .269
.287 .206 .101 .098 .049
.086 .078 .076 .078 .113
.306 .215 .089 .088 .044
Note: (a) The number of replications = 10,000. (b) N = 50 and T = 50. (c) All rejection frequencies are based on the asymptotic critical value 1.645. (d) a1 = 1.
Table 7: The Power of 5 Percent of Tests with the Null Hypothesis of No Cointegration and the Alternative Hypothesis of Cointegration ( = 0:90)
= :25 = 0:8 DF DFt DF DFt ADF =0 DF DFt DF DFt ADF = 0:8 DF DFt DF DFt ADF
=1
=4
-0.5
0
0.5
-0.5
0
0.5
-0.5
0
0.5
1.00 1.00 1.00 1.00 .198
1.00 1.00 1.00 1.00 .021
1.00 1.00 1.00 1.00 .012
1.00 1.00 1.00 1.00 .882
1.00 1.00 1.00 1.00 .781
1.00 1.00 1.00 1.00 .877
1.00 1.00 1.00 1.00 .986
1.00 1.00 1.00 1.00 .991
1.00 1.00 1.00 1.00 .999
.100 .044 .127 .054 .095
.487 .298 .424 .251 .259
.215 .165 .293 .206 .224
.998 .907 .997 .812 .813
1.00 1.00 1.00 .996 .977
1.00 1.00 1.00 1.00 .997
1.00 1.00 1.00 .999 .988
1.00 1.00 1.00 1.00 .998
1.00 1.00 1.00 1.00 .998
.665 .324 .998 .888 .844
.369 .186 .995 .855 .993
.111 .076 .973 .738 .996
1.00 1.00 1.00 1.00 .987
1.00 1.00 1.00 1.00 1.00
1.00 1.00 1.00 1.00 1.00
1.00 1.00 1.00 1.00 .982
1.00 1.00 1.00 1.00 .999
1.00 1.00 1.00 1.00 .988
Note: (a) The number of replications = 10,000. (b) N = 50 and T = 50. (c) All rejection frequencies are based on asymptotic critical value: 1.645. (d) a1 = 1.
44
Table 8: The Size-adjusted Power of 5 Percent of Tests with the Null Hypothesis of No Cointegration and the Alternative Hypothesis of Cointegration ( = 0:85)
= :25 = 0:8 DF DFt DF DFt ADF =0 DF DFt DF DFt ADF = 0:8 DF DFt DF DFt ADF
=1
=4
-0.5
0
0.5
-0.5
0
0.5
-0.5
0
0.5
.001 .002 .000 .000 .000
.027 .026 .012 .014 .026
.012 .011 .004 .106 .004
.000 .000 .000 .000 .000
.000 .000 .000 .000 .000
.000 .000 .000 .000 .000
.000 .000 .000 .000 .073
.001 .000 .000 .000 .925
.010 .003 .000 .000 1.00
.297 .128 .388 .128 .128
.732 .446 .751 .458 .299
.508 .334 .728 .464 .336
1.00 1.00 1.00 1.00 .991
1.00 1.00 1.00 1.00 .999
1.00 1.00 1.00 1.00 1.00
1.00 1.00 1.00 1.00 1.00
1.00 1.00 1.00 1.00 1.00
1.00 1.00 1.00 1.00 1.00
1.00 .998 1.00 .990 .582
1.00 .997 1.00 .995 .974
1.00 .993 1.00 .992 .996
1.00 1.00 1.00 1.00 1.00
1.00 1.00 1.00 1.00 1.00
1.00 1.00 1.00 1.00 1.00
1.00 1.00 1.00 1.00 1.00
1.00 1.00 1.00 1.00 1.00
1.00 1.00 1.00 1.00 .988
Note: (a) The number of replications = 10,000. (b) N = 50 and T = 50. (c) All rejection frequencies are based on the 5 percent percentile of the empirical distribution. (d) a1 = 1.