Estimating and Testing Multiple Structural Changes in Models with Endogenous Regressors∗ Pierre Perron†
Yohei Yamamoto‡
Boston University
University of Alberta
June 16, 2008; Revised: December 3, 2009
Abstract We consider the problem of estimating and testing for multiple breaks in a single equation framework with regressors that are endogenous, i.e., correlated with the errors. First, we show based on standard assumptions about the regressors, instruments and errors that the second stage regression of the instrumental variable (IV) procedure involves regressors and errors that satisfy all the assumptions in Perron and Qu (2006) so that the results about consistency, rate of convergence and limit distributions of the estimates of the break dates, as well as the limit distributions of the tests, are obtained as simple consequences. More importantly from a practical perspective, we show that even in the presence of endogenous regressors, it is still preferable to simply estimate the break dates and test for structural change using the usual ordinary leastsquares (OLS) framework. In most cases, it delivers estimates of the break dates with higher precision and tests with higher power compared to those obtained using an IV method. To illustrate the relevance of our theoretical results, we consider the stability of the New Keynesian hybrid Phillips curve. IV-based methods do not indicate any instability. On the other hand, OLS-based ones strongly indicate a change in 1991:1 and that after this date the model looses all explanatory power. JEL Classification Number: C22. Keywords: structural change, instrument variables, parameter variations, Phillips Curve.
∗
Perron acknowledges financial support for this work from the National Science Foundation under Grant SES-0649350. We are grateful to Zhongjun Qu and two referees for useful comments. † Department of Economics, Boston University, 270 Bay State Rd., Boston, MA, 02215 (
[email protected]). ‡ University of Alberta School of Business, Business Building, Edmonton, AB, T6G 2R6 Canada (
[email protected]).
1
Introduction
Both the statistics and econometrics literature contain a vast amount of work on issues related to structural changes with unknown break dates, most of it specifically designed for the case of a single change (see, Perron, 2006, for a detailed review). However, the problem of multiple structural changes has received more attention recently. Bai and Perron (1998, 2003) provided a comprehensive treatment of various issues in the context of multiple structural change models: consistency of estimates of the break dates, tests for structural changes, confidence intervals for the break dates, methods to select the number of breaks and efficient algorithms to compute the estimates. Related contributions include Hawkins (1976) who presents a comprehensive treatment of estimation based on a dynamic programming algorithm. Perron and Qu (2006) extended the analysis to the case where arbitrary linear restrictions are imposed on the coefficients of the model. In doing so, they also considerably relaxed the assumptions used in Bai and Perron (1998). Bai, Lumsdaine and Stock (1998) considered asymptotically valid inference for the estimate of a single break date in multivariate time series allowing stationary or integrated regressors as well as trends with estimation carried using a quasi maximum likelihood (QML) procedure. Also, Bai (2000) considered the consistency, rate of convergence and limiting distribution of estimated break dates in a segmented stationary VAR model estimated again by QML when the break can occur in the parameters of the conditional mean, the variance of the error term or both. Qu and Perron (2007) considered a multivariate system estimated by quasi maximum likelihood which provides methods to estimate models with structural changes in both the regression coefficients and the covariance matrix of the errors. Kejriwal and Perron (2008, 2009) provide a comprehensive treatment of issue related to testing and inference with multiple structural changes in a single equation cointegrated model. Zhou and Perron (2007) considered the problems of testing jointly for changes in regression coefficients and variance of the errors. More recently, Hall et al. (2006, 2007) considered a single equation framework with regressors that are endogenous, i.e., correlated with the errors. They considered using an instrumental variable (IV) procedure. Hall et al. (2006) provided a very detailed proof for the consistency and rate of convergence of the estimates of the break fractions and the limit distributions of the tests for multiple structural changes. Hall et al. (2007) provided a detailed proof of the limit distribution of the estimate of the break date in a single break model assuming a stable reduced form equation. In all cases, the results are similar to those in Bai and Perron (1998). This paper follows up on the work of Hall et al. (2006, 2007) in
1
two directions. First, we provide a very simple proof of the results they derived by showing that using generated regressors, the projection of the regressors of the space spanned by the instruments, to account for potential endogeneity implies that all the assumptions of Perron and Qu (2006) (or those of Bai and Perron, 1998) obtained with original regressors contemporaneously uncorrelated with the errors, are satisfied. Hence, all results derived in those papers continue to hold. There is no need for a separate detailed analysis as done in Hall et al. (2006, 2007). This also allows us to have even more general results, in particular about the limit distributions of the estimates of the break dates for which we do not require a stable reduced form equation linking the regressors and the instruments. The second contribution is more important from a practical perspective. We show that even in the presence of endogenous regressors, it is still preferable to simply estimate the break dates and test for structural change using the usual ordinary least-squares (OLS) framework. The idea is simple yet compelling. First, changes in the true parameters of the model imply a corresponding change in the probability limits of the OLS parameter estimates, which is equivalent in the leading case of regressors and errors that have a homogenous distribution across segments. Second, one can reformulate the model with those probability limits as the basic parameters in a way that the regressors and errors are contemporaneously uncorrelated. We are then simply back to the framework of Bai and Perron (1998) or Perron and Qu (2006) and we can use their results directly to obtain the relevant limit distributions. More importantly, the OLS framework involves the original regressors while the IV framework involves as regressors the projection of these original regressors on the space spanned by the instruments. As is well known, this implies that the generated regressors in the IV procedure have less quadratic variation than the original regressors. Hence, a given change in the true parameters will cause a larger change in the conditional mean of the dependent variable in the OLS framework compared to the corresponding change in an IV framework. Accordingly, using OLS not only delivers consistent estimates of the break fractions and tests with the usual limit distributions, it also improves on the efficiency of the estimates and the power of the tests in most cases. This is shown theoretically and also via simulations. To illustrate the relevance of our theoretical results, we consider the stability of the New Keynesian hybrid Phillips curve as put forward by Gali and Gertler (1999). The results show that IV-based methods do not provide any evidence of structural instability. On the other hand, an OLS-based sup-Wald test for one break is highly significant, with 1991:1 as the estimate of the break date (with a very tight confidence interval (1990:4 to 1991:2)). The higher discriminatory power of OLS-based methods over IV-based ones occurs despite the 2
fact that the instruments are highly correlated with future inflation. The estimates for the period 1960:1-1991:1 are close to the full sample estimates reported earlier and support Gali and Gertler’s (1999) conclusion. On the other hand, the estimates for the period 1991:21997:4 are all very small and insignificantly different from zero. Hence, the hybrid New Keynesian Phillips curve specification has lost any explanatory power for inflation. This forecast breakdown of the New Keynesian Phillips curve for inflation is interesting and can be traced back to the change in the behavior of inflation (e.g., Stock and Watson, 2007). The structure of the paper is as follows. Section 2 presents the model and the assumptions on the regressors, instruments and errors. We adopt those in Perron and Qu (2006) since they are the most general available and also allows for restrictions on the parameters, including partial structural change models. Section 3 shows that these assumptions imply that the second stage regression of the instrumental variable procedure involves regressors and errors that satisfy all the assumptions in Perron and Qu (2006) so that the results about consistency, rate of convergence and limit distribution of the estimates of the break dates, as well as the limit distribution of the tests, are obtained as simple consequences. Section 4 shows how using the OLS framework is not only valid but also preferable in that it delivers estimates of the break dates with higher precision and tests with higher power. Section 5 substantiates our theoretical results via simulations and shows their practical importance. Section 6 presents our empirical illustration related to the New Keynesian hybrid Phillips curve. Section 7 provides brief concluding remarks and an appendix contains some technical derivations. 2
The model and assumptions
Consider a multiple linear regression model with m breaks or m + 1 regimes. There are T observations and m is assumed known. The break dates occur at {T1 , ..., Tm }. Let y = ¯ = (y1 , ..., yT )0 be the dependent variable and X a T by p matrix of regressors. Define X diag(X1 , ..., Xm+1 ), a T by (m + 1)p matrix with Xi = (xTi−1 +1 , ..., xTi )0 for i = 1, ..., m + 1, with the convention that T0 = 1 and Tm+1 = T (each matrix Xi is a subset of the regressor ¯ is a diagonal partition of X, the partition matrix X corresponding to regime i). The matrix X being taken with respect to the set of break points {T1 , ..., Tm }. The vector u = (u1 , ..., uT )0 is the set of disturbances and δ = (δ 01 , ..., δ 0m+1 )0 is the (m + 1)p vector of coefficients. Following Perron and Qu (2006), we consider the general pure structural change model with restrictions on the coefficients, i.e., ¯ + u, y = Xδ 3
(1)
where Rδ = r with R a k by (m + 1)p matrix with rank k and r a k dimensional vector of constants. Note that this framework includes the case of a partial structural change model by an appropriate choice of the restrictions on the parameters. Since some regressors may be correlated with the errors, we assume that there exists a set of q variables zt that can serve as instruments, and we define the T by q matrix Z = (z1 , ..., zT )0 . The goal here is to estimate the unknown break dates whose true values are denoted with a 0 superscript, i.e., (T10 , ..., Tm0 ), using the observables (y, X, Z) and the restrictions on the coefficients. The relevant IV regression is then ¯ ∗δ + u e, y=X
(2)
(Tˆ1 , ..., Tˆm ) = arg min SSRTR (T1 , ..., Tm ),
(3)
¯ ∗ = diag(X ˆ 1 , ..., X ˆ m+1 ), a T by (m + 1)p matrix subject to the restrictions Rδ = r, where X ˆ = (ˆ ˆ i = (ˆ xTi−1 +1 , ..., xˆTi )0 for i = 1, ..., m + 1, and X x1 , ..., xˆT )0 = PZ X where with X PZ = Z(Z 0 Z)−1 Z 0 is the orthogonal projection matrix onto the space spanned by the columns eT )0 with u et = ut + η t where η t = (x0t − xˆ0t )δ j for Tj−1 + 1 ≤ t ≤ Tj . of Z. Also, u e = (e u1 , ..., u The estimates of the break dates are then T1 ,...,Tm
where SSRTR (T1 , ..., Tm ) is the sum of square residuals from the restricted OLS regression (2) evaluated at the partition {T1 , ..., Tm }. It will be useful also to define the break fractions ˆ 1 , ..., λ ˆ m ) = (Tˆ1 /T, ..., Tˆm /T ). (λ01 , ..., λ0m ) = (T10 /T, ..., Tm0 /T ) with corresponding estimates (λ We impose the following assumptions on the data, the errors and the break dates, which are trivial extensions of those in Perron and Qu (2006). 0 − Ti0 ), • Assumption A1: Let wt = (x0t , zt0 )0 . For each i = 1, ..., m + 1, let ∆Ti0 = (Ti+1 then ⎤ ⎡ 0 +[∆T 0 v] Ti−1 i i i X ¢ ¡ QXX (s) QXZ (s) ⎦, wt wt0 →p Qi (s) ≡ ⎣ 1/∆Ti0 i i QZX (s) QZZ (s) 0 +1 t=Ti−1
a non-random positive definite matrix uniformly in s ∈ [0, 1].
• Assumption A2: There exists an l0 > 0 such that for all l > l0 , the minimum eigenvalues PTi0 PTi0 +l 0 z z 0 are bounded away from zero (i = 1, ..., m). of (1/l) t=T 0 +1 zt zt and of (1/l) t=Ti0 −l t t i P • Assumption A3: The matrix lt=k zt zt0 is invertible for l − k ≥ T for some > 0. P P • Assumption A4: Let the Lr -norm of a random matrix A be defined by kAkr = ( i j E |Aij |r )1/r for r ≥ 1. (Note that kAk is the usual matrix norm or the Euclidean norm of a vector.) 4
With {Fi : i = 1, 2, ..} a sequence of increasing σ-fields, we assume that {zi ui , Fi } forms a Lr -mixingale sequence with r = 2 + ε for some ε > 0. That is, there exist nonnegative constants {ci : i ≥ 1} and {ψj : j ≥ 0} such that ψj ↓ 0 as j → ∞ and for all i ≥ 1 and j ≥ 0, we have: (a) kE(zi ui |Fi−j )kr ≤ ci ψj , (b) kzi ui − E(zi ui |Fi+j )kr ≤ ci ψj+1 . Also assume (c) P 1+k ψj < ∞, (e) kzi k2r < M < ∞ and kui k2r < N < ∞ for maxi ci ≤ K < ∞, (d) ∞ j=0 j some K, M, N > 0. • Assumption A5: The sequence {zi η i , Fi } also satisfies the conditions in Assumption A4. • Assumption A6: Ti0 = [T λ0i ] , where 0 < λ01 < ... < λ0m < 1. • Assumption A7: The minimization problem defined by (3) is taken over all possible partitions such that Ti − Ti−1 ≥ T for some > 0. Assumption A1 basically rules out unit root regressors; otherwise the particular scaling used is not important and could be relaxed at the expense of substantial technical complications. Assumption A2 imposes restrictions on the instruments in a local neighborhood of the break points. They ensure that there is no local collinearity problem so the break points can be identified. Assumption A3 is a standard invertibility requirement to have well defined estimates. Assumption A4 imposes mild restrictions on the vector zt ut . They permit a wide class of potential correlation and heterogeneity (including conditional heteroskedasticity) and also allow lagged dependent variables. Assumption A6 is a standard requirement to have asymptotically distinct break dates and A7 requires that the search for breaks precludes candidates which are too close. This, however, is not constraining in practice since can be chosen arbitrarily small. For future references, note that A1 implies that T
−1
T P
t=1
zt zt0
=
m+1 P i=1
0
Ti−1 m+1 P P ∆Ti0 [(∆Ti0 )−1 zt zt0 ] →p (λi − λi−1 )QiZZ (1) ≡ QZZ , T 0 +1 i=1 t=Ti−1
with QZZ a non-random positive definite matrix. Similarly, T −1
T P
zt x0t →p
T P
xt x0t →p
t=1
m+1 P
(λi − λi−1 )QiZX (1) ≡ QZX ,
i=1
(4)
with QZX is a non-random matrix with rank equal to p, and T −1
t=1
m+1 P i=1
(λi − λi−1 )QiXX (1) ≡ QXX ,
a non-random matrix with rank p. Assumption A5 is a high-level assumption which needs to be verified on a case by case basis. If there is a stable reduced form equation of the form X = Zθ0 + v, 5
(5)
with
⎤ ⎡ 2 0 σ γ ⎦, E [(ut , vt0 )0 (ut , vt0 )] = ⎣ γ Σ
then, QZX = QZZ θ0 and QXX = θ00 QZZ θ0 + Σ. Also, η t = vt0 δ j − zt0 (Z 0 Z)−1 (Z 0 V )δ j = vt0 δj + op (1) uniformly in t. Hence, if {zt vt0 , Ft } is mixing and satisfy A4 then A5 is also satisfied. But our framework applies much more generally. Consider, for example, the case for which the parameters of the reduced form exhibits time variation such that x0t = zt θt + vt0 where θt = θ∗ + ξ t with E(ξ t ) = 0. We then have η t = vt0 δ j − zt0 (Z 0 Z)−1 (Z 0 V )δj + zt0 [θt − (
T P
t=1
zt zt0 )−1 (
T P
t=1
zt zt0 θt )]δj = vt0 δj + zt0 ξ t + op (1)
uniformly in t. Hence, A5 is satisfied if {zt ξ t , Ft } is mixing and satisfies A4. What is ei , Fi } satisfies A4. important is that Assumptions A4 and A5 imply that {zi u The framework can also accommodate a reduced form that exhibits structural changes which are estimated in the usual way. As in Hall et al. (2006), suppose that X = Z¯ 0 θ0 + v,
(6)
with Z¯ 0 = diag(Z10 , ..., Zn0 ), the diagonal partition of Z at the (possibly different) break dates (T1z , ..., Tnz ) and θ0 re-defined accordingly. Suppose that we have estimates (Tˆ1z , ..., Tˆnz ) and we use as instruments Zˆ = diag(Zˆ1 , ..., Zˆn ), a T by (n + 1)q matrix with Zˆi = (zTˆz +1 , ..., zTˆz )0 i−1 i for i = 1, ..., n + 1. This fits in the framework discussed above by augmenting the number of instruments so that there are n such set of instruments. Then, provided the estimates of ˆ z ) = (Tˆz /T, ..., Tˆz /T ) are consistent at rate T , the assumptions ˆ z , ..., λ the break fractions (λ 1 n 1 n still hold with the limit values properly re-defined. 3
Limit results for estimates and tests obtained using the IV method
We shall prove results about the consistency, rate of convergence and limit distribution of the estimates of the break fractions by simply showing that the assumptions imposed here, which are less restrictive than those of Hall et al. (2006, 2007), imply that the conditions stated in Perron and Qu (2006), for regressors and errors that are contemporaneously uncorrelated, continue to hold for the IV regression with xˆt when the original regressors are contemporaneously correlated with the errors. We first consider the consistency of the estimates of the break fractions. We have the following lemma proved in Appendix A. 6
Lemma 1 Assumptions A1-A4 hold if we replace zt with xˆt . Lemma 1 in conjunction with A5-A7 directly imply that the estimates of the break fractions are consistent. To obtain a rate of convergence, we adopt as is common in this literature a framework in which the magnitude of the shifts may decrease as the sample size increases. This is stated in the following assumption: Assumption A8. Let ∆T,i = δ 0i+1 − δ 0i . Assume ∆T,i = vT ∆i , for some ∆i independent of T where vT > 0 is a scalar satisfying either a) vT is fixed, or b) vT → 0 and T 1/2−η vT → ∞ for some η ∈ (0, 1/2). Lemma 1 in conjunction with A5-A8 directly implies the following result. Proposition 1 Under Assumptions A1-A8: for every > 0, there exists a C < ∞, such ˆ k − λ0 )| > C) < for every k = 1, ..., m. that for all large T , P (|T vT2 (λ k To analyze the limit distributions of the estimates, we need, as in Bai and Perron (1998) and Perron and Qu (2006), some additional assumptions. These are stated as follows. 0 , for i = 1, ..., m, and wt = (x0t , zt0 )0 . Then, as Assumption A9. Let ∆Ti0 = Ti0 − Ti−1 ∆Ti0 → ∞, uniformly in s ∈ [0, 1]: . ⎤ ⎡
1.(∆Ti0 )−1
0 +[s∆T 0 ] PTi−1 i
0 +[s∆T 0 ] PTi−1 i
QiXX QiXZ QiZX
QiZZ
⎦,
u e2t →p se σ 2i , 0 +1 t=Ti−1 0 +[s∆T 0 ] PT 0 +[s∆T 0 ] PTi−1 i i−1 i 3. (∆Ti0 )−1 t=T (zr zt0 u er u et )0 →p sΩiZ Uh , 0 0 +1 r=Ti−1 i−1 +1 0 +[s∆T 0 ] PTi−1 i 4. (∆Ti0 )−1/2 t=T zt u et ⇒ BZi Uh (s) with BZi Uh (s) a multivariate Gaussian 0 i−1 +1 [0, 1] with mean zero and covariance E[BZi Uh (s)BZi Uh (r)0 ] = min{s, r}ΩiZ Uh . 2. (∆Ti0 )−1
on
0 +1 t=Ti−1
wt wt0 →p sQi ≡ s ⎣
process
These assumptions are the same as those used in Perron and Qu (2006) for regressors that are uncorrelated with the errors. Part (1) compared to A1 implies, in particular, that the regressors are non-trending. Similarly, part (2) implies that the variance of the errors is fixed within each segment. Parts (3) and (4) are standard. Parts (2-4) are high level assumptions involving u et but are satisfied for all cases discussed above. Consider, for example, the case of a stable reduced form. For part (2) η t = vt0 δ i − zt0 (Z 0 Z)−1 (Z 0 V )δ i = vt0 δi +op (1) so that E(ut η t ) = γ 0 δ i +op (1) since E(ut vt ) = γ 0 and E(η 2t ) = δ 0i Σδ i +op (1). Hence, σ e2i = σ 2 + δ 0i Σδ i + 2γ 0 δi . For parts (3-4), suppose that at regime i, V ar [(ut , vt0 )0 | zt ] = Ωi , i,1/2 i,1/2 i,1/2 i,1/2 Ωi,1/2 = (Ωu , Ωv )0 where Ωi is (p+1)×(p+1), Ωu is 1×(p+1), and Ωv is p×(p+1). 7
0 +[s∆T 0 ] PTi−1 i Also, (∆Ti0 )−1/2 t=T [zt ⊗ (ut , vt0 )0 ] ⇒ Bi (s) with E[Bi (1)Bi (1)0 ] = QiZZ ⊗ Ωi . These 0 i−1 +1 assumptions correspond to assumptions 11-12 in Hall et al. (2007). We then have 0 +[s∆T 0 ] PTi−1 i,1/2 i zt ut ⇒ [QZZ ⊗ Ωi,1/2 ]W (s) (a) (∆Ti0 )−1/2 t=T 0 u i−1 +1 0 +[s∆T 0 ] PTi−1 i,1/2 i zt vt0 δ i ⇒ [QZZ ⊗ δ0i Ωi,1/2 ]W (s) (b) (∆Ti0 )−1/2 t=T 0 v i−1 +1 0 +[s∆T 0 ] PTi−1 −1/2 i (c) (∆Ti0 )−1/2 t=T zt zt0 (Z 0 Z)−1 (Z 0 V )δ i ⇒ ξ[QiZZ QZZ ⊗ δ 0i Ωi,1/2 ]W (1) 0 +1 v i−1
where ξ =
s(λ0i
−
λ0i−1 )1/2 ,
(∆Ti0 )−1/2
so that
0 +[s∆T 0 ] PTi−1 i 0 +1 t=Ti−1
i,1/2
zt u et ⇒ [QZZ ⊗ (Ωi,1/2 + δ0i Ωi,1/2 )]W (s) u v
(7)
−1/2
]W (1) −ξ[QiZZ QZZ ⊗ δ 0i Ωi,1/2 v
The same result holds if the reduced form exhibits changes which are consistently estimated. The details are in Appendix B, thereby generalizing the results of Hall et al. (2007). A similar result can be obtained in the case of random stationary variation in the parameters under the conditions stated earlier. These assumptions imply the following counterparts for the case with xˆt as the regressors used in the instrumental variable regression (see the Appendix A for a short proof). 0 , for i = 1, ..., m. Then, as ∆Ti0 → ∞, uniformly in Lemma 2 Let ∆Ti0 = Ti0 − Ti−1 s ∈ [0, 1]: 0 +[s∆T 0 ] PTi−1 i −1 i i xˆt xˆ0t →p sQ0ZX Q−1 1. (∆Ti0 )−1 t=T 0 ZZ QZZ QZZ QZX ≡ sQHH , i−1 +1 0 +[s∆T 0 ] PT 0 +[s∆T 0 ] PTi−1 i i−1 i −1 i i 2. (∆Ti0 )−1 t=T (ˆ xr xˆ0t u er u et )0 →p sQ0ZX Q−1 0 0 +1 ZZ ΩZ U h QZZ QZX ≡ sΩH U h, r=Ti−1 i−1 +1 0 0 PTi−1 +[s∆Ti ] i i i 3. (∆Ti0 )−1/2 t=T xˆt u et ⇒ Q0ZX Q−1 0 ZZ BZ U h (s) ≡ BH U h (s) with BH U h (s) a Gaussian i−1 +1 i i 0 process on [0, 1] with mean zero and covariance E[BH Uh (s)BH Uh (u) ] = min{s, u}ΩiH Uh . The following result then follows immediately from Perron and Qu (2006).
Proposition 2 Under Assumptions A1-A9, we have, for i = 1, ..., m:
(i)
(i)
(∆0i QiHH ∆i )2 (i) vT (Tˆi − Ti0 ) →d arg max VH (s), 0 i s ∆i ΩH Uh ∆i
where VH (s) = W1 (−s) − |s| /2 if s ≤ 0 and q (i) (i) i,1 i VH (s) = ξ iH (φi,2 H /φH )W2 (s) − ξ H |s| /2 if s > 0, i,2 2 2 0 i 0 i 0 i+1 0 i+1 with (φi,1 H ) = ∆i ΩH U h ∆i /∆i QHH ∆i , (φH ) = ∆i ΩH U h ∆i /∆i QHH ∆i and 0 i ξ iH = ∆0i Qi+1 HH ∆i /∆i QHH ∆i .
8
In the case of a stable reduced form as specified by (6), we have
(i)
(i)
(∆0i θ00 QiZZ θ0 ∆i )2 (i) vT (Tˆi − Ti0 ) →d arg max VH (s), 0 0 00 i s ∆i θ ΩZ Uh θ ∆i
where VH (s) = W1 (−s) − |s| /2 if s ≤ 0 and q (i) (i) i,1 i VH (s) = ξ iH (φi,2 H /φH )W2 (s) − ξ H |s| /2 if s > 0,
i,2 2 0 0 2 0 00 i 0 00 i 0 00 i+1 0 0 00 i+1 0 with (φi,1 H ) = ∆i θ ΩZ U h θ ∆i /∆i θ QZZ θ ∆i , (φH ) = ∆i θ ΩZ U h θ ∆i /∆i θ QZZ θ ∆i and 0 0 0 00 i ξ iH = ∆0i θ00 Qi+1 ZZ θ ∆i /∆i θ QZZ θ ∆i . The result for this special case is the same as that in Theorem 3 of Hall et al. (2007). Given the result in Appendix B, the same limit distribution applies for an unstable reduced form. Despite the fact that our result, as stated in Proposition 2, is simpler and involves less parameters to estimate, it is indeed more general since it does not presume the existence of a stable reduced form equation. All that is required are consistent estimates of the various quantities involved. Consider now the problem of testing the null hypothesis of no structural change. In Appendix A, we show that the following results hold given Assumptions A1-A6 when no structural change is present.
Lemma 3 Assume that ut is serially uncorrelated and that with η t = (x0t − xˆ0t )δ, E(η 2t ) = σ 2η + op (1) and E(ut η t ) = Ωuη + op (1) for all t, then, uniformly in s, a) T −1
P[T s] t=1
xˆt xˆ0t →p sQ0ZX QZZ −1 QZX ≡ sQHH ;
and b) E(e u2t ) = σ 2 + op (1) for all t and T −1/2
P[T s] t=1
1/2
xˆr u et ⇒ σQHH Wp (s).
Hence, assumption A10 in Perron and Qu (2006) is satisfied so that all results pertaining to hypothesis testing remain valid. Relaxing the assumption of serially uncorrelated errors is done by modifying the various tests using a heteroskedasticity and autocorrelation robust covariance matrix for the parameter estimates. Note that the assumption that E(η 2t ) = σ 2η + op (1) for all t is satisfied in all cases discussed above. For example, in the case of a stable reduced form η t = vt0 δ − zt0 (Z 0 Z)−1 (Z 0 V )δ = vt0 δ + op (1) so that E(ut η t ) = γ 0 δ + op (1) since E(ut vt ) = γ 0 and E(η 2t ) = δΣδ 0 + op (1). To summarize, we have shown that using the generated regressors xˆt , the projection of the regressors on the space spanned by the instruments, to account for potential endogeneity implies that all the assumptions of Perron and Qu (2006) (or those of Bai and Perron, 9
1998) obtained with original regressors contemporaneously uncorrelated with the errors, are satisfied. Hence, all results derived in those papers continue to hold. But the question of interest is then whether using an instrumental variable approach is necessary or even useful. The next section addresses this issue and shows that simply using OLS is preferable. 4
Estimating and testing using OLS estimates
In this section, we show that it is preferable to estimate the break dates using the standard OLS method rather than an IV procedure even in the presence of endogenous regressors. The reasons for this are very simple. Assume for simplicity that the break dates are known and let p limT →∞ E(Xi0 u) = φi for i = 1, ..., m + 1, then, the probability limit of the OLS estimate ˆδ from (1) is given by, under A9, ¯ 00 X ¯ 0 )−1 X ¯ 00 u = δ 0 + diag(Q1XX , ..., Qm+1 )−1 (φ1 , ..., φm+1 )0 δ ∗ = δ 0 + p lim (X XX 0
= δ +
T →∞ −1 0 [(Q1XX )−1 φ1 , ..., (Qm+1 XX ) φm+1 )] .
Any change in the parameter δ 0 will imply a change in the limit value of the OLS estimate δ ∗ . Hence, one can still identify parameter changes using OLS estimates. The second feature is the well known inequality ||PZ X|| ≤ ||X|| so that using an IV procedure leads to regressors that have less quadratic variation than when using OLS. This and the fact that the limit value of the OLS estimate changes in an equal way imply that the estimates of the break dates will, in general, be less precisely estimated using an IV procedure. The main cause for this is the fact that a change in the parameter δ 0 will, in general, cause a larger change in the conditional mean of the dependent variable in the OLS framework compared to the corresponding change in an IV regression. To make the above arguments more precise, consider writing the DGP (1) as ¯ 0 δ 0 + PX¯ u + (I − PX¯ )u y = X 0 0 0 0 −1 0 ¯0) X ¯ 0 u] + (I − PX¯ )u = X ¯ 0 [δ + (X ¯0X ¯ 0 δ ∗T + u∗ , = X 0 ¯ 00 X ¯ 0 )−1 X ¯ 00 u] for which δ∗T →p δ ∗ . So we can consider where u∗ = (I − PX¯0 )u and δ ∗T = [δ 0 + (X ¯ 0 δ ∗ + u∗ . It is a regression in terms of the population value of the parameters, viz., y = X ¯ 0 is uncorrelated with u∗ so that the OLS estimate, say ˆδ ∗ , clear that in this framework X will be consistent for δ ∗ . This suggests estimating the break dates by minimizing the sum of squared residuals from the following regression ¯ ∗ + u∗ . y = Xδ 10
Then, the estimates of the break dates are given by (Tˆ1∗ , ..., Tˆm∗ ) = arg minT1 ,...,Tm SSRT∗ (T1 , ..., Tm ) ¯ ∗ )0 (y − Xδ ¯ ∗ ). We then obtain directly from Perron and where SSRT∗ (T1 , ..., Tm ) = (y − Xδ Qu (2006) that, under the same conditions as used above, the estimates of the break fractions are consistent and have the same convergence rate as in the usual OLS framework with regressors contemporaneously uncorrelated with the errors. The limit distribution of the estimates of the break dates is given in the following proposition. Proposition 3 Under Assumptions A1—A4 and A6-A9 with A2-A4 and A9 stated with xt instead of zt , A4 and A9 stated with u∗t instead of ut and u et (using the notations σ ∗i , ΩiXU ∗ ∗ ∗ i i i ∗ and BXU ∗ instead of σ i , Ω h , and B h ), with A8 stated in term of ∆i = δ i+1 − δ i instead ZU ZU of ∆i = δ 0i+1 − δ0i , we have, for i = 1, ..., m: i ∗ 2 (∆∗0 i QXX ∆i ) vT (Tˆi∗ − Ti0 ) →d arg max V (i) (s) i ∗ s ∆∗0 Ω ∆ i XU ∗ i (i)
where V (i) (s) = W1 (−s) − |s| /2 if s ≤ 0 and p (i) V (i) (s) = ξ i (φi,2 /φi,1 )W2 (s) − ξ i |s| /2
if s > 0,
2 i ∗ ∗0 i ∗ ∗0 i+1 ∗ ∗0 i+1 ∗ with φ2i,1 = ∆∗0 i ΩXU ∗ ∆i /∆i QXX ∆i , φi,2 = ∆i ΩXU ∗ ∆i /∆i QXX ∆i and i+1 ∗ ∗0 i ∗ ξ i = ∆∗0 i QXX ∆i /∆i QXX ∆i .
Note that the limit distribution depends on ∆∗i = δ ∗i+1 − δ ∗i . But since the OLS estimates are consistent for δ ∗ this quantity can still be consistently estimated. One can then compare the limit distributions of the estimates of the break dates using OLS or the IV procedure discussed in the previous section. For simplicity, consider the case where the instruments, regressors and errors have a constant distribution throughout the sample so the moment matrices do not change across regimes. Also, suppose that the errors are uncorrelated. Then, the limit distribution of the break fractions based on the IV ˆ IV , is given by procedure, denoted λ ∆0T,i QHH ∆T,i (i)
(i)
σ e2i
(i)
ˆ i,IV − λ0 ) →d arg max V (s), T (λ i H s
(i)
(i)
i,1 where VH (s) = W1 (−s) − |s| /2 if s ≤ 0, VH (s) = (φi,2 H /φH )W2 (s) − |s| /2 if s > 0 i,2 2 2 0 i 0 0 i+1 0 with (φi,1 H ) = ∆i ΩH U h ∆i /∆i QHH ∆i , (φH ) = ∆i ΩH U h ∆i /∆i QHH ∆i , and that of the break ˆ OLS , is fractions based on the OLS procedure, denoted λ
∆0T,i QXX ∆T,i ˆ i,OLS − λ0 ) →d arg max[W (s) − |s|/2], T (λ i s σ ∗2 11
¯ ˆδ∗ )0 (y − X ¯ ˆδ ∗ ). Note that, even in where σ ∗2 = p limT →∞ T −1 uˆ∗0 uˆ∗ = p limT →∞ T −1 (y − X i,1 this case, the limit distribution of the IV estimate will not be symmetric since φi,2 H 6= φH even with homogenous distributions for the errors and regressors. This makes difficult a precise comparison of the relative efficiency of the two estimates. Note, however, that the component that contributes most to the variablity of each estimate is the scaling factor on the left-hand side. Hence, to gain some insights about the relative efficiency of the OLS and σ 2i is positive in IV estimates, we shall analyze the conditions for which QXX /σ ∗2 − QHH /e the scalar case with one regressor. We then have yt = xt δ i + ut ,
xt = zt θ + vt ,
e2i = σ 2 + 2φδi + δ 2i σ 2v , σ ∗2 = σ 2 − φ2 (θ2 σ 2z + σ 2v )−1 and with QHH = θ2 σ 2z , QXX = θ2 σ 2z + σ 2v , σ φ = E(ut vt ) such that φ2 ≤ σ 2 σ 2v . Finally, e2i − QHH σ ∗2 , D ≡ QXX σ
= (θ2 σ 2z + σ 2v )(σ 2 + 2φδ i + δ2i σ 2v ) − θ2 σ 2z (σ 2 − φ2 (θ2 σ 2z + σ 2v )−1 )
= 2θ2 σ 2z φδ i + θ2 σ 2z δ 2i σ 2v + σ 2 σ 2v + 2φδ i σ 2v + δ 2i σ 4v + θ2 σ 2z φ2 (θ2 σ 2z + σ 2v )−1 .
We seek sufficient conditions such that D ≥ 0. Consider first the case where σ 2v = 0. Then e2i = σ ∗2 = σ 2 hold, hence OLS and IV are equivalent φ = 0 so that QHH = QXX = θ2 σ 2z and σ (D = 0). When σ 2v > 0, we can, without loss of generality, normalize σ 2v = 1 so that D = 2θ2 σ 2z φδi + θ2 σ 2z δ 2i + σ 2 + 2φδ i + δ 2i + θ2 σ 2z φ2 (θ2 σ 2z + 1)−1 , e2i = σ 2 ≥ σ ∗2 = σ 2 − φ2 (θ2 σ 2z + σ 2v )−1 so with φ2 ≤ σ 2 . First, when δ i = 0, QHH ≤ QXX and σ that D ≥ 0 and OLS dominates IV. Also, when φ = 0, QHH = θ2 σ 2z ≤ QXX = θ2 σ 2z + σ 2v and σ e2i = σ 2 + δ 2i σ 2v ≥ σ ∗2 = σ 2 so that OLS again dominates IV. When δ i 6= 0, define a = θ2 σ 2z and xi = φ/δ i so that D/δ 2i = 2axi + a + (σ 2 /δ2i ) + 2xi + 1 + (a/(a + 1))x2i . Using the fact that φ2 ≤ σ 2 a sufficient condition for D ≥ 0 is that D∗ ≥ 0 where D∗ = 2axi + a + x2i + 2xi + 1 + (a/(a + 1))x2i . After some algebra we get ∙ ¸ (a + 1)2 (a + 1)2 a+1 a+1 ∗ 2 D = xi + 2 xi + = xi + [xi + a + 1] 2a + 1 2a + 1 2a + 1 2a + 1 12
Hence, D∗ ≥ 0 when xi ≤ −(a + 1) or xi ≥ −(a + 1/(2a + 1)). Using the fact that R2 = θ2 σ 2z /(θ2 σ 2z + σ 2v ) these conditions can be expressed as, without the normalization σ 2v = 1, ½ ¾ δ i σ 2v δ i σ 2v 2 , −1 − . (8) R ≤ max 1 + φ φ
Accordingly, OLS dominates IV except for a narrow cone centered around δi σ 2v /φ = −1. Though this case can possibly occur, it is highly unlikely in practice. For example, if σ 2v = 1, it requires that the correlation between the errors and regressors (φ) be exactly the negative of the value of the parameter in the first regime (δ i ). We nevertheless consider the finite sample properties of the estimates in that case in Section 5.4. In summary, it is, in most cases, preferable to estimate the break dates using the simple OLS-based method. As we shall see below, the loss in efficiency can be especially pronounced when the instruments are weak as is often the case in applications. Of course, if the goal is not to get estimates of the break dates per se but of the parameters within each regime, one should use an IV regression but conditioning on the estimates of the break dates obtained using the OLS-based procedure. Their limit distributions will, as usual, be the same as if the break dates were known since the estimates of the break fractions converge at a fast enough rate. Using the same logic, we can expect the power of tests for structural change to be more powerful when based on the OLS regression rather than the IV regression. This will be investigated through simulations in the next section. 5
Simulation evidence
In this section, we provide simulation evidence to assess the advantages of estimating the break dates and constructing tests using an OLS regression compared to using an IV regression. Three scenarios are possible for a change between regimes i and i − 1, say: 1) δ ∗i 6= δ ∗i−1 and δ 0i 6= δ 0i−1 ; 2) δ ∗i 6= δ ∗i−1 and δ 0i = δ 0i−1 ; 3) δ∗i = δ ∗i−1 and δ 0i 6= δ 0i−1 . The third possibility is a knife-edge case that would require a change in the bias term to exactly offset the change in the structural parameters, hence we shall not analyze it further. The first is the most interesting since one can test and form confidence intervals using either the OLS or IV method. We discuss this scenario in Section 5.1 with the leading case of regressors that have a homogenous distribution across segments and with the correlation between regressors and errors also invariant. In Section 5.2, we consider the case for which either can be changing, a case not covered by our theory. In Section 5.3 we consider the second scenario.
13
5.1
Scenario 1: Homogenous distributions across segments
The data are generated by yt = xt δ t + ut where yt , xt and ut are scalars. We consider, for simplicity, the case of a single break in the parameter δ t occurring at mid-sample, so that ½ c for t ∈ [1, T /2] , δt = (9) −c for t ∈ [T /2 + 1, T ] .
First, define the following random variables: ut and vt are i.i.d. N (0, 1) with E(ut vt ) = 1/2; also ξ t ∼ i.i.d. N (1, 1) and εt ∼ i.i.d. N (0, 1) both uncorrelated with ut and vt . The regressor xt is kept the same throughout the specifications and is generated by xt = ξ t + vt and, again √ √ for simplicity, we consider the case of single instrument zt generated by zt = γξ t + 2 − γεt with 0 ≤ γ ≤ 2. Note that ut and vt are correlated so the regressors xt are correlated with the errors ut . The variable ξ t is a component common to xt and zt and the parameter γ controls the extent of the correlation between the regressors and the instruments. The component εt is added in the generation of zt to keep the variability of the instrument constant when varying γ. The R2 in the reduced form linking xt and zt is a function of γ √ √ given by R2 = γσ ξ /σ x σ z = γ/2 where σ ξ , σ x and σ z are the standard error of ξ t , xt and zt , respectively. We report results for the following values R2 = .7, .5, .2 and .001, and we consider three values for the size of the change in the parameter, c = .25, .5 and 1.0. The sample size is T = 100 and the number of the replication is 1, 000. The results are presented in Figure 1 for the cumulative distribution functions. The OLS-based estimates are always more precise than the IV-based estimates, even when the R2 is as high as 0.7. The extent to which the IV-based estimates perform badly increases as R2 decreases, and even for a value as high as 0.5, which is larger than what can be obtained in most applications, the IV-based estimates show much higher variability than the OLS-based estimates. In the case of very weak instruments, the IV-based estimates are totally uninformative. Also, when the magnitude of change increases, the precision of the OLS-based estimate increases noticeably. The increase in the precision of the IV-based estimate is, however, marginal as c increases. Hence, even with large breaks, the OLSbased estimates are far superior. The simulations show that it is indeed highly preferable to estimate the break dates using a standard OLS framework even when the regressors are contemporaneously correlated with the errors. We also simulated the power of the sup-Wald test for a single break of Quandt (1958, 1960) as developed by Andrews (1993). Given our setup, it is given by sup W =
SSRTr − SSR(Tb ) , [ T ] 0, τ 1 ω1 is bounded away from zero, which implies that the minimum eigenvalues of MT are bounded away from zero. To verify assumption A3, note that l P
xˆt xˆ0t = (T −1 X 0 Z)(T −1 Z 0 Z)−1 (
t=k
Since (T −1 X 0 Z), (T −1 Z 0 Z)−1
l P
zt zt0 )(T −1 Z 0 Z)−1 (T −1 Z 0 X)
t=k
P and ( lt=k zt zt0 ) have full column rank by A3,
rank[(T −1 X 0 Z)(T −1 Z 0 Z)−1 (
l P
t=k
£ ¤ zt zt0 )(T −1 Z 0 Z)−1 (T −1 Z 0 X)] = rank (T −1 X 0 Z) = p.
We now verify assumption A4. For any finite T , given a sample of {xt } and {zt }, let P P et = γ 0T zt u et = γ 0 zt u et + op (1), γ T = ( Tt=1 zt zt0 )−1 ( Tt=1 zt x0t ), a q × p matrix. Then xˆt u −1 r et is a linear combination of the L -mixingale uniformly in t, where γ = QZZ QZX . Now γzt u sequence {zt u et } (A4 and A5). Therefore all we need to show is that linear combinations of 23
any finite numbers of Lr -mixingale processes are also Lr -mixingale processes. To prove this it is sufficient to consider two sequences {η 1t } and {η 2t } that satisfy A4 and show that linear combinations of these, say aη 1i + bη 2i , satisfy properties (a), (b) and (e) in A4. For property (a), we have, ° ° ° ° °E(aη 1i + bη 2i |Fi−j )° = °E(aη 1i |Fi−j ) + E(bη 2i |Fi−j )° r r ° ° ° ° ≤ °E(aη 1i |Fi−j )°r + °E(aη 2i |Fi−j )°r ≤ |a|r c1i ψ1i + |b|r c2i ψ2i , while for property (b), ° 1 ° °aη i + bη 2i − E(aη 1i + bη 2i |Fi−j )°
° ° = °aη 1i + bη 2i − E(aη 1i |Fi−j ) − E(bη 2i |Fi−j )°r ° ° ° ° ≤ °aη 1i − E(aη 1i |Fi−j )° + °bη 2i − E(bη 2i |Fi−j )°
r
r
|a|c1i ψ1i
≤
+
r
|a|c2i ψ2i ,
and, for property (e), ° 1 ° ° ° ° ° °aη i + bη 2i ° ≤ |a| °η 1i ° + |b| °η 2i ° < (|a| + |b|)M < ∞. 2r 2r 2r
et differs from γzt u et by an op (1) term uniformly in t, it has the same properties. Since γ 0T zt u Proof of Lemma 2. For part 1, we have (∆Ti0 )−1 = (T
−1
0 +[s∆T 0 ] Ti−1 i
P
0 +1 t=Ti−1
0
X Z)(T
−1
xˆt xˆ0t
0
−1
Z Z)
[(∆Ti0 )−1
0 −1 i −1 p sQZX QZZ QZZ QZZ QZX
→
For part 2,
(∆Ti0 )−1 = (T →
−1
0
P
0 +1 t=Ti−1
≡ sQiHH .
zt zt0 ](T −1 Z 0 Z)−1 (T −1 Z 0 X)
0 +[s∆T 0 ] T 0 +[s∆T 0 ] Ti−1 i i−1 i
P
P
0 +1 t=Ti−1
X Z)(T
−1
0
0 +1 r=Ti−1
−1
Z Z)
For part 3,
(ˆ xr xˆ0t u er u et )
[(∆Ti0 )−1
0 −1 i −1 p sQZX QZZ ΩZ U h QZZ QZX
(∆Ti0 )−1/2
0 +[s∆T 0 ] Ti−1 i
0 +1 t=Ti−1
xˆt u et = (T
P
0 +1 t=Ti−1
≡ sΩiH Uh .
0 +[s∆T 0 ] Ti−1 i
P
0 +[s∆T 0 ] T 0 +[s∆T 0 ] Ti−1 i i−1 i
−1
P
0 +1 r=Ti−1
(zr zt0 u er u et )] (T −1 Z 0 Z)−1 (T −1 Z 0 X)
XT0 ZT )(T −1 ZT0 ZT )−1 [(∆Ti0 )−1/2
i i ⇒ Q0ZX Q−1 h (s) ≡ BH U h (s) ZZ BZ U
0 +[s∆T 0 ] Ti−1 i
P
0 +1 t=Ti−1
zt u et ]
Proof of Lemma 3: Part (a) follows from the proof of Lemma 1(a) with trivial modifications using the fact that there is a single segment involved. Part (b) follows from the fact that with serially uncorrelated errors and a single regime, ΩiH Uh in lemma 2 (part 3) reduces to σ2 QHH . 24
Appendix B In this appendix, we show that (7) holds in the case of a consistently estimated unstable reduced form. Let the reduced form be given by x0t = zt θt + vt0 and suppose, for simplicity, that a single break occurs at k0 T , that is, ½ θ1 for t < k0 T θt = θ2 otherwise We further assume that k0 can be estimated consistently at rate T so that T (kˆ − k0 ) = O(1). P ] 0 p P[sT ] 0 1/2 −1/2 Also, T −1 [sT t=1 zt zt → sQZZ and T t=1 zt vt ⇒ Ωv B(s). Without loss of generality, 0 ˆ consider the case kˆ < k0 . Denote, for ease of notation, the following sets: a = [Ti−1 + 1, T k], ˆ and C = [T k0 + 1, T ]. Now b = [T kˆ + 1, T k0 ], c = [T k0 + 1, Ti0 ], A = [1, T k] ⎧ ⎪ ⎪ u + vt0 δ − zt0 (ˆθ1 − θ1 )δ j for t ∈ a ⎪ ⎨ t u et = ut + vt0 δ − zt0 (ˆθ1 − θ2 )δ j for t ∈ b ⎪ ⎪ ⎪ ⎩ u + v 0 δ − z 0 (ˆθ − θ )δ for t ∈ c t 2 j t t 2 ⎧ P P ⎪ ⎪ zt ut + zt vt0 δj − zt zt0 [ A zt zt0 ]−1 [ A zt vt0 ] δ j for t ∈ a ⎪ ⎪ hP i−1 P ⎪ ⎪ ⎪ 0 ⎪ [ B (zt zt0 θ1 + zt vt0 )] δ j zt ut + zt vt0 δ j − zt zt0 ⎪ B,C zt zt ⎪ ⎨ hP i−1 P 0 zt u et = −zt zt0 z z [ C (zt zt0 θ2 + zt vt0 )] δ j + zt zt0 θ1 δ j t t B,C ⎪ hP i−1 P ⎪ ⎪ ⎪ 0 0 0 ⎪ u + z v δ − z z z z [ B (zt zt0 θ1 + zt vt0 )] δ j z t t t j t t ⎪ t t t B,C ⎪ ⎪ h i ⎪ −1 P P ⎪ 0 ⎩ −zt zt0 z z [ C (zt zt0 θ2 + zt vt0 )] δj + zt zt0 θ2 δ j t t B,C
for t ∈ b for t ∈ c
0 + [s∆Ti0 ]], it is easy to show that Then consider without loss of generality c ∈ [k0 T, Ti−1
(∆Ti0 )−1/2
P
et abc zt u
P P = (∆Ti0 )−1/2 abc zt ut + (∆Ti0 )−1/2 abc zt vt0 δ i £ ¤ P P −1 P − (∆Ti0 )−1/2 a zt zt0 [ A zt zt0 ] [ A zt vt0 δ i ] £ ¤ P P −1 P − (∆Ti0 )−1/2 c zt zt0 [ C zt zt0 ] [ C zt vt0 δ i ] + op (1) i,1/2
−1/2
+ δ 0i Ωi,1/2 )]W (s) − ξ[QiZZ QZZ ⊗ δ 0i Ω1/2 ⇒ [QZZ ⊗ (Ωi,1/2 u v v ]W (1) where ξ = s(λ0i − λ0i−1 )1/2 .
25
References Andrews, D.W.K. (1993): “Tests for parameter instability and structural change with unknown change point,” Econometrica, 61, 821-856. Bai, J. (1997): “Estimation of a change point in multiple regression models,” Review of Economic and Statistics, 79, 551-563. Bai, J. (2000): “Vector autoregressive models with structural changes in regression coefficients and in variance-covariance matrices,” Annals of Economics and Finance, 1, 303-339. Bai, J., Lumsdaine, R.L., and Stock, J.H. (1998): “Testing for and dating breaks in multivariate time series,” Review of Economic Studies, 65, 395-432. Bai, J., and Perron, P. (1998): “Estimating and testing linear models with multiple structural changes,” Econometrica, 66, 47-78. Bai, J., and Perron, P. (2003): “Computation and analysis of multiple structural change models,” Journal of Applied Econometrics, 18, 1-22. Gali, J., and Gertler, M. (1999): “Inflation dynamics: a structural econometric analysis,” Journal of Monetary Economics, 44, 195-222. Hall, A., Han, S., and Boldea, O. (2006): “Inference regarding multiple structural changes in linear model estimated via two stage least squares,” Unpublished manuscript, Department of Economics, North Carolina State University. Hall, A., Han, S., and Boldea, O. (2007): “Asymptotic distribution theory for break point estimators in models estimated via 2SLS,” Unpublished manuscript, Department of Economics, North Carolina State University. Hansen, B.E. (2000), “Testing for Structural Change in Conditional Models,” Journal of Econometrics, 97, 93-115. Hawkins, D.M. (1976): “Point estimation of the parameters of piecewise regression models,” Applied Statistics, 25, 51-57. Kejriwal, M, and Perron, P. (2008): “Testing for multiple structural changes in cointegrated regression models,” forthcoming in Journal of Business and Economic Statistics. Kejriwal, M, and Perron, P. (2009): “The limit distribution of the estimates in cointegrated regression models with multiple structural changes,” Journal of Econometrics 146, 59-73.
26
Kurmann, A. (2007): “VAR-based estimation of Euler equations with an application to New Keynesian pricing,” Journal of Economic Dynamics and Control, 31, 767-796. Perron, P. (2006): “ Dealing with structural breaks,” in Palgrave Handbook of Econometrics, Vol. 1: Econometric Theory, K. Patterson and T.C. Mills (eds.), Palgrave Macmillan, 278352. Perron, P., and Qu, Z. (2006): “Estimating restricted structural change models,” Journal of Econometrics, 134, 373-399. Qu, Z., and Perron, P. (2007): “Estimating and testing multiple structural changes in multivariate regressions,” Econometrica, 75, 459-502. Quandt, R.E. (1958): “The estimation of the parameters of a linear regression system obeying two separate regimes,” Journal of the American Statistical Association, 53, 873-880. Quandt, R.E. (1960): “Tests of the hypothesis that a linear regression system obeys two separate regimes,” Journal of the American Statistical Association, 55, 324-330. Rossi, B., and Sekhposyan T. (2008): “Has models’ forecasting performance for US output growth and inflation changed over time, and when?” Economic Research Initiatives at Duke Working Paper No. 12, Duke University. Stock, J.H., and M.W. Watson (2007): “Why has U.S. inflation become harder to forecast?” Journal of Money, Credit and Banking, 39, 3-33. Zhou, J., and Perron P. (2007): “Testing jointly for structural changes in the error variance and coefficients of a linear regression model,” Unpublished Manuscript, Department of Economics, Boston University.
27
Figure 1 : Cumulative distribution functions of the estimates of the break date. c = 0.25 1.0 0.9 0.8 0.7 0.6 0.5 OLS 0.4
R2=0.7 R2=0.5
0.3
R2=0.2 0.2
R2=0.001
0.1 0.0 -0.30
-0.20
-0.10
0.00
0.10
0.20
0.30
0.10
0.20
0.30
0.10
0.20
0.30
c = 0.5 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 -0.30
-0.20
-0.10
0.00
c=1 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 -0.30
-0.20
-0.10
0.00
Figure 2 : Power functions of the sup-Wald structural change test.
1.0 OLS
0.9
R2=0.7 R2=0.5
0.8
R2=0.2 R2=0.001
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0.0 0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0 (c)
Figure 3 : Change in the variance of the regressor; (Left : 2v = 1 and 1v = a, Right : 1v = 1 and 2v = b) a = 0.8
b = 0.8
1.0
1.0
0.9
0.9
0.8
0.8
0.7
0.7
0.6
0.6
0.5
0.5 OLS
0.4
0.4
R2=0.7 0.3
0.3
R2=0.5 R2=0.2
0.2
0.2
R2=0.001 0.1
0.1
0.0
0.0 -0.3
-0.2
-0.1
0.0
0.1
0.2
0.3
-0.3
-0.2
-0.1
a = 0.5
0.0
0.1
0.2
0.3
0.1
0.2
0.3
0.1
0.2
0.3
b = 0.5
1.0
1.0
0.9
0.9
0.8
0.8
0.7
0.7
0.6
0.6
0.5
0.5
0.4
0.4
0.3
0.3
0.2
0.2
0.1
0.1
0.0
0.0 -0.3
-0.2
-0.1
0.0
0.1
0.2
0.3
-0.3
-0.2
-0.1
a = 0.25
0.0
b = 0.25
1.0
1.0
0.9
0.9
0.8
0.8
0.7
0.7
0.6
0.6
0.5
0.5
0.4
0.4
0.3
0.3
0.2
0.2
0.1
0.1
0.0
0.0 -0.3
-0.2
-0.1
0.0
0.1
0.2
0.3
-0.3
-0.2
-0.1
0.0
Figure 4 : Change in the correlation between the regressor and errors; (Left : 2 = 1 and 1 = a, Right : 1 = 1 and 2 = b) a = 0.25
b = 0.25
1.0
1.0
0.9
0.9
0.8
0.8
0.7
0.7
0.6
0.6
0.5
0.5 OLS
0.4
0.4
R2=0.7 0.3
0.3
R2=0.5 R2=0.2
0.2
0.2
R2=0.001 0.1
0.1
0.0
0.0 -0.3
-0.2
-0.1
0.0
0.1
0.2
0.3
a = 0.1
1.0
-0.3
-0.2
-0.1
0.9
0.8
0.8
0.7
0.7
0.6
0.6
0.5
0.5
0.4
0.4
0.3
0.3
0.2
0.2
0.1
0.1
0.0
0.1
0.2
0.3
0.1
0.2
0.3
0.1
0.2
0.3
b = 0.1
1.0
0.9
0.0
0.0 -0.3
-0.2
-0.1
0.0
0.1
0.2
0.3
-0.3
-0.2
-0.1
0.0
b = 0.01
a = 0.01 1.0
1.0
0.9
0.9
0.8
0.8
0.7
0.7
0.6
0.6
0.5
0.5
0.4
0.4
0.3
0.3
0.2
0.2
0.1
0.1 0.0
0.0 -0.3
-0.2
-0.1
0.0
0.1
0.2
0.3
-0.3
-0.2
-0.1
0.0
Figure 5 : Change in the mean of the regressor; (Left : 2 = 1 and 1 = a, Right : 1 = 1 and 2 = b) a = 1.5
b = 1.5
1.0
1.0
0.9
0.9
0.8
0.8
0.7
0.7
0.6
0.6
0.5
0.5 OLS
0.4
0.4
R2=0.7 0.3
0.3
R2=0.5 R2=0.2
0.2
0.2
R2=0.001 0.1
0.1
0.0
0.0 -0.3
-0.2
-0.1
0.0
0.1
0.2
0.3
-0.3
-0.2
-0.1
0.0
0.1
0.2
0.3
0.1
0.2
0.3
0.1
0.2
0.3
b = 2.0
a = 2.0 1.0
1.0
0.9
0.9
0.8
0.8
0.7
0.7
0.6
0.6
0.5
0.5
0.4
0.4
0.3
0.3
0.2
0.2
0.1
0.1 0.0
0.0 -0.3
-0.2
-0.1
0.0
0.1
0.2
-0.3
0.3
-0.2
-0.1
a = 5.0
0.0
b = 5.0
1.0
1.0
0.9
0.9
0.8
0.8
0.7
0.7
0.6
0.6
0.5
0.5
0.4
0.4
0.3
0.3
0.2
0.2
0.1
0.1
0.0
0.0 -0.3
-0.2
-0.1
0.0
0.1
0.2
0.3
-0.3
-0.2
-0.1
0.0
Figure 6 : Set of parameter values for which IV can be more efficient than OLS. 1.0
0.8
0.6
0.4
0.2
0.0 -5
-4
-3
-2
-1
0
1
2
3
4
5
Figure 7 : Finite sample distributions of break date estimates under the condition (δ/φ)σ 2v = −1; R 2 = 0.5
R 2 = 0 .7
1.0
1.0
0.8
0.8
0.6
0.6 0.4
0.4
OLS 0.2
0.2
IV 0.0
0.0 -0.35
0
0.35
-0.35
0
0.35
R 2 = 0 .001
R 2 = 0 .2 1.0
1.0
0.8
0.8
0.6
0.6
0.4
0.4
0.2
0.2
0.0
0.0 -0.35
0
0.35
-0.35
0
0.35
Table 1. Rejection frequencies of the sup-Wald tests Change in the variance of the regressor σ 21v bias change OLS IV(R2 = 0.7) IV(R2 = 0.5) IV(R2 = 0.2) IV(R2 = 0.001)
2 -0.15 .057 .051 .045 .058 .059
0.8 0.06 .046 .053 .048 .035 .042
0.5 0.15 .064 .048 .046 .053 .057
0.25 0.22 .079 .052 .057 .054 .052
Table 2. Rejection frequencies of the sup-Wald tests Change in the correlation between the regressor and errors φ1 bias change OLS IV(R2 = 0.7) IV(R2 = 0.5) IV(R2 = 0.2) IV(R2 = 0.001)
0.5 0 .048 .048 .052 .051 .050
0.25 -0.125 .071 .042 .037 .049 .057
0.1 -0.20 .128 .041 .046 .051 .048
0.01 -0.245 .199 .058 .053 .050 .052
Table 3. Rejection frequencies of the sup-Wald tests Change in the mean of the regressor κ1 bias change OLS IV(R2 = 0.7) IV(R2 = 0.5) IV(R2 = 0.2) IV(R2 = 0.001)
0 0.25 .062 .049 .091 .317 .496
0.5 0.15 .043 .052 .060 .119 .159
2 -0.15 .108 .044 .128 .456 .562
5 -0.23 .511 .092 .614 1.000 1.000
Table 4. Full sample estimates and tests for structural change for the hyblid Philips curve (a) xt is labor income share SupF
parameter estimates
OLS
28.2
break date 1991:1
2SLS
11.6
1974:1
∗∗∗
95% C.I.
const
0.001 (0.001) 1963:3 1984:3 0.000 (0.001) (b) xt is the GDP gap 1990:4
1991:2
SupF OLS
21.1∗∗∗
2SLS
8.6
π t−1 0.480 (0.060) 0.312 (0.081)
xt −0.002 (0.007) 0.004 (0.008)
E(πt ) 0.479 (0.063) 0.684 (0.080)
parameter estimates
95% C.I.
break date 1991:1
1990:4
1991:2
1974:1
1962:3
1985:3
const
0.000 (0.000) 0.000 (0.000)
π t−1 0.482 (0.062) 0.293 (0.088)
xt 0.001 (0.007) −0.006 (0.008)
E(πt ) 0.478 (0.066) 0.706 (0.092)
Table 5. IV sub-sample estimates for the hybrid Phillips curve (a) xt is labor income share γ α xt 1960:1-1991:1 0.000 0.303 −0.001 (0.002) (0.086) (0.009) 1991:2-1997:4 −0.002 −0.015 0.057 (0.006) (0.150) (0.041) (b) xt is the GDP gap γ α xt 1960:1-1991:1 0.000 0.286 −0.005 (0.000) (0.093) (0.008) 1991:2-1997:4 0.006 −0.083 −0.046 (0.001) (0.147) (0.021)
E(πt ) 0.679 (0.084) −0.062 (0.233) E(πt ) 0.701 (0.096) −0.129 (0.215)
Notes:
1. The Sup-F test uses a 15% trimming. In parentheses are the heteroskedasticity robust estimate of the standard errors. *, ** and *** indicates significance at the 10%, 5% and 1% level, respectively.
2. The confidence intervals of the estimates of the break date are based on the two-sided 95% nominal level symmetric method described in Bai (1997).