Lecture 16 Unit Root Tests

Autoregressive Unit Root

• A shock is an unexpected change in a variable, or in the value of the error term, at a particular time period.

• When we have a stationary system, the effect of a shock dies out gradually. But when we have a non-stationary system, the effect of a shock is permanent (see the simulation sketch below).

• We have two types of non-stationarity. In an AR(1) model:
  - Unit root: |Φ1| = 1: homogeneous non-stationarity
  - Explosive root: |Φ1| > 1: explosive non-stationarity

• In the explosive case, a shock to the system becomes more influential as time goes on. This is essentially never observed in real data, so we will not consider it.
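A quick way to see the difference is to simulate one stationary and one unit-root series and compare their paths. This is only an illustrative sketch in R: the sample size, seed, and AR coefficient 0.5 are arbitrary choices, not values from the lecture.

## Illustrative sketch: transient vs. permanent effects of shocks (assumed parameters)
set.seed(123)
T   <- 200
eps <- rnorm(T)
y_stat <- stats::filter(eps, filter = 0.5, method = "recursive")    # stationary AR(1), Φ1 = 0.5
y_rw   <- cumsum(eps)                                                # unit root: random walk
par(mfrow = c(1, 2))
plot(y_stat, type = "l", main = "AR(1), phi = 0.5", ylab = "y")      # shocks die out
plot(y_rw,   type = "l", main = "Random walk, phi = 1", ylab = "y")  # shocks accumulate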

Autoregressive Unit Root

• Consider the AR(p) process:

  Φ(L) yt = μ + εt,  where Φ(L) = 1 − Φ1 L − Φ2 L^2 − ... − Φp L^p

As we discussed before, if one of the roots, rj, equals 1, then Φ(1) = 0, or

  Φ1 + Φ2 + ... + Φp = 1

• We say yt has a unit root. In this case, yt is non-stationary.

Example: AR(1): yt = μ + Φ1 yt-1 + εt  =>  Unit root: Φ1 = 1.
  H0 (yt non-stationary): Φ1 = 1 (or, Φ1 − 1 = 0)
  H1 (yt stationary): Φ1 < 1 (or, Φ1 − 1 < 0)

• A t-test seems natural to test H0. But the ergodic theorem and the MDS CLT do not apply: the t-statistic does not have the usual distribution.

Autoregressive Unit Root

• Now, let's reparameterize the AR(1) process. Subtract yt-1 from yt:

  Δyt = yt − yt-1 = μ + (Φ1 − 1) yt-1 + εt = μ + α0 yt-1 + εt

• Unit root test: H0: α0 = Φ1 − 1 = 0 against H1: α0 < 0.

• Natural test for H0: a t-test. We call this test the Dickey-Fuller (DF) test. But, what is its distribution?

• Back to the general AR(p) process: Φ(L) yt = μ + εt. We rewrite the process using the Dickey-Fuller reparameterization:

  Δyt = μ + α0 yt-1 + α1 Δyt-1 + α2 Δyt-2 + ... + αp-1 Δyt-(p-1) + εt

• Both AR(p) formulations are equivalent.

Autoregressive Unit Root – Testing

• AR(p) lag polynomial Φ(L):

  Φ(L) = 1 − Φ1 L − Φ2 L^2 − ... − Φp L^p

• DF reparameterization:

  Φ(L) = (1 − L) − α0 L − α1 (L − L^2) − α2 (L^2 − L^3) − ... − αp-1 (L^(p-1) − L^p)

• Both parameterizations should be equal. Then, Φ(1) = −α0 => the unit root hypothesis can be stated as H0: α0 = 0.

Note: The model is stationary if α0 < 0 => natural H1: α0 < 0.

• Under H0: α0 = 0, the model is a stationary AR(p−1) in Δyt. Then, if yt has a (single) unit root, Δyt is a stationary AR process.

• We have a linear regression framework. A t-test for H0 is the Augmented Dickey-Fuller (ADF) test (a sketch of the test regression in R follows below).
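As a concrete illustration, here is a minimal sketch of the ADF regression estimated by OLS in base R, with one augmentation lag. The simulated series, sample size, seed, and lag length are assumptions made for this example only.

## Minimal ADF regression sketch: Δyt = μ + α0*y_{t-1} + α1*Δy_{t-1} + εt
set.seed(123)
y  <- cumsum(rnorm(200))                 # simulated I(1) series, so H0 is true by construction
dy <- diff(y)                            # Δyt
adf_fit <- lm(dy[-1] ~ y[2:(length(y) - 1)] + dy[-length(dy)])
summary(adf_fit)$coefficients            # the t-statistic on the y_{t-1} term is the ADF statistic
## Compare that t-statistic with DF critical values, not with the N(0,1) table.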

Autoregressive Unit Root – Testing: DF

• The Dickey-Fuller (DF) test is a special case of the ADF: no lags are included in the regression. It is easier to derive, and we gain intuition from its derivation.

• From our previous example, we have:

  Δyt = μ + (Φ1 − 1) yt-1 + εt = μ + α0 yt-1 + εt

• If α0 = 0, the system has a unit root:

  H0: α0 = 0
  H1: α0 < 0  (yt stationary)

• We can test H0 with a t-test:

  t_{Φ=1} = (Φ̂ − 1) / SE(Φ̂)

• There is another test associated with H0, the ρ-test (normalized bias): (T − 1)(Φ̂ − 1). Both statistics are computed in the sketch below.
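A minimal sketch of both DF statistics in base R, computed from a simulated random walk so that H0 holds; the sample size and seed are arbitrary choices.

## DF test sketch (no augmentation lags): Δyt = μ + α0*y_{t-1} + εt
set.seed(456)
T   <- 500
y   <- cumsum(rnorm(T))                       # random walk: H0 is true
fit <- lm(diff(y) ~ y[-T])                    # regress Δyt on a constant and y_{t-1}
a0_hat   <- coef(fit)[2]
t_stat   <- a0_hat / sqrt(vcov(fit)[2, 2])    # DF t-test (use DF critical values)
rho_stat <- (T - 1) * a0_hat                  # normalized-bias ("rho") test
c(t_stat = unname(t_stat), rho_stat = unname(rho_stat))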

Autoregressive Unit Root – Testing: DF

• Note that OLS estimation of Φ̂ under a unit root involves taking averages of the integrated series yt. Recall that, by backward substitution,

  yt = yt-1 + εt = y0 + Σj=1..t εj

⇒ each yt is a constant (y0) plus a (partial) sum of errors.

• The mean of yt (assuming y0 = 0) involves a sum of partial sums:

  ȳ = (1/T) Σt=1..T yt = (1/T) Σt=1..T (Σj=1..t εj)

Then, the mean shows a different probabilistic behavior than the sample mean of stationary and ergodic errors. It turns out that the limit behavior of ȳ when Φ = 1 is described by simple functionals of Brownian motion (illustrated numerically below).
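To see this different behavior numerically, a rough Monte Carlo sketch (with arbitrary sample sizes and seed) shows that the variance of ȳ across simulated random walks grows with T, instead of shrinking at rate 1/T as it would with stationary errors.

## Sketch: Var(ȳ) across simulated random walks grows roughly linearly in T
set.seed(789)
mean_rw <- function(T) mean(cumsum(rnorm(T)))     # ȳ for one simulated random walk
sapply(c(100, 400, 1600), function(T) var(replicate(2000, mean_rw(T))))
## the three variances increase roughly in proportion to T (about T/3 with N(0,1) errors)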

Review: Stochastic Calculus

• Brownian motion (Wiener process)

Preliminaries:
  - Let the variable W(t) be almost surely continuous, with W(t=0) = 0.
  - The change over a small interval of time Δt is ΔW.
  - N(μ, v): usual normal distribution, with mean μ and variance v.

Definition: The variable W(t) follows a Wiener process if
  - W(0) = 0
  - W(t) has continuous paths in t.
  - ΔW = ε √Δt, where ε ∼ N(0, 1)
  - The values of ΔW for any 2 different (non-overlapping) intervals of time are independent.

Review: Stochastic Calculus – Wiener process

• A Brownian motion is a continuous stochastic process, but not differentiable ("jagged path") and with unbounded variation ("a continuous random walk"). Notation: W(t), W(t, ω), B(t).

Example: A partial sum (a simulated path is sketched below):

  WT(r) = (1/√T) (ε1 + ε2 + ε3 + ... + ε[Tr]);  r ∈ [0, 1]

• Summary: Properties of the Wiener process {W(t)} for t ≥ 0.
  - Let N = T/Δt. Then W(T) − W(0) = Σi=1..N εi √Δt.
    Thus, W(T) − W(0) = W(T) ∼ N(0, T) ⇒ E[W(T)] = 0 & Var[W(T)] = T
  - E[ΔW] = 0
  - Var[ΔW] = Δt ⇒ the standard deviation of ΔW is √Δt
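A discretized Wiener path can be simulated directly from the definition ΔW = ε √Δt. This is only a sketch; the number of steps and the seed are arbitrary.

## Sketch: simulate W(t) on [0,1] from increments ε*sqrt(dt)
set.seed(1)
n  <- 1000
dt <- 1 / n
W  <- c(0, cumsum(rnorm(n) * sqrt(dt)))        # W(0) = 0; independent normal increments
plot(seq(0, 1, by = dt), W, type = "l",
     xlab = "t", ylab = "W(t)", main = "Simulated Wiener process")   # jagged path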


Review: Stochastic Process – Partial Sums

• Partial Sum Process. Let εt ∼ iid N(0, σ²). For r ∈ [0, 1], define the partial sum process:

  XT(r) = (1/T) S[Tr] = (1/T) Σt=1..[Tr] εt;  r ∈ [0, 1]

Example: Let T = 20, r = {0.01, ..., 0.1, ...}

  r = 0.01;  [Tr] = [20*0.01] = 0  ⇒  X20(0.01) = (1/20) Σt=1..[20*0.01] εt = 0.

  r = 0.1;   [Tr] = [20*0.1] = 2   ⇒  X20(0.1) = (1/20) Σt=1..[20*0.1] εt = (1/20)(ε1 + ε2).

=> XT(r) is a random step function. As T grows, the steps get smaller and XT(r) looks more like a Wiener process (see the sketch below).
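The T = 20 example can be reproduced as a step function. A small sketch in R; the seed is arbitrary, so the picture changes from draw to draw.

## Sketch: partial-sum process XT(r) = (1/T) * S_[Tr] as a random step function
set.seed(7)
T   <- 20
eps <- rnorm(T)
X_T <- function(r) sum(eps[seq_len(floor(T * r))]) / T   # returns 0 when floor(T*r) = 0
r_grid <- seq(0, 1, by = 0.005)
plot(r_grid, sapply(r_grid, X_T), type = "s",
     xlab = "r", ylab = "X_T(r)", main = "Partial-sum step function, T = 20")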

Review: Stochastic Calculus – FCLT

• Functional CLT (Donsker's FCLT)
If εt satisfies some assumptions (stationarity, E[|εt|^q] < ∞ for some q > 2), then

  WT(r) →d W(r),

where W(r) is a standard Brownian motion for r ∈ [0, 1].

Note: That is, sample statistics like WT(r) do not converge to constants, but to functions of Brownian motions. Check Billingsley (1986).

• A CLT is a limit for one term of a sequence of partial sums {Sk}; Donsker's FCLT is a limit for the entire sequence {Sk}, instead of one term.

• Donsker's FCLT is an important extension of the CLT because it is a tool to obtain additional limits, such as those for statistics involving I(1) variables.

Review: Stochastic Calculus – CMT

• These extensions primarily follow from the continuous mapping theorem (CMT) and generalizations of the CMT.

• Continuous Mapping Theorem (CMT)
If WT(r) →d W(r) and h(.) is a continuous functional on D[0,1], the space of all real-valued functions on [0,1] that are right continuous at each point of [0,1] and have finite left limits, then

  h(WT(r)) →d h(W(r))  as T → ∞

Example: Let h(YT(.)) = ∫0^1 YT(r) dr and √T XT(r) →d W(r). Then,

  h(√T XT(r)) = ∫0^1 WT(r) dr →d ∫0^1 W(r) dr

(A Monte Carlo check of this limit follows below.)
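As a sanity check on this functional limit, one can verify by simulation that ∫0^1 WT(r) dr settles onto the distribution of ∫0^1 W(r) dr, which is N(0, 1/3). A rough sketch with arbitrary settings (σ = 1):

## Monte Carlo sketch: the integral of WT over [0,1] is approximately N(0, 1/3)
set.seed(11)
int_WT <- function(T = 1000) {
  W <- cumsum(rnorm(T)) / sqrt(T)      # WT(t/T) for t = 1, ..., T (σ = 1)
  mean(W)                              # Riemann approximation of the integral over [0,1]
}
draws <- replicate(5000, int_WT())
c(mean = mean(draws), variance = var(draws))    # the variance should be close to 1/3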

Review: Stochastic Calculus – FCLT (2)

• FCLT: If T is large, WT(.) is a good approximation to W(r), r ∈ [0, 1]:

  W(r) = limT→∞ {WT(r) = √T XT(r)}

Sketch of proof: For a fixed r ∈ [0, 1]:

  WT(r) = √T XT(r) = (1/√T) Σt=1..[Tr] εt = √([Tr]/T) * (1/√[Tr]) Σt=1..[Tr] εt

Since, as T → ∞,

  √([Tr]/T) → √r   and   (1/√[Tr]) Σt=1..[Tr] εt →d N(0, σ²),

then, from Slutsky's theorem, we get (again, for a fixed r):

  WT(r)/σ →d N(0, r) ≡ W(r)    (for r = 1, WT(1)/σ →d N(0, 1) ≡ W(1).)

This result holds for any r ∈ [0, 1]. We suspect it holds uniformly for r ∈ [0, 1]. In fact, the sequence of random step functions WT(.)/σ, defined on [0, 1], converges in distribution to W(.). (This is the FCLT.)

Review: Stochastic Calculus – Example

• Example: yt = yt-1 + εt. Get the limit distribution of X'X/T² = T^(-2) Σt (yt-1)² for yt. Write WT(r) = S[Tr]/(σ√T) for the standardized partial-sum process (sums below run over t = 1, ..., T):

  T^(-2) Σt (yt-1)² = T^(-2) Σt [Σi=1..t-1 εi + y0]² = T^(-2) Σt [St-1 + y0]²

    = T^(-2) Σt [(St-1)² + 2 y0 St-1 + y0²]

    = σ² Σt (St-1/(σ√T))² T^(-1) + 2 y0 σ T^(-1/2) Σt (St-1/(σ√T)) T^(-1) + T^(-1) y0²

    = σ² Σt ∫[(t-1)/T, t/T] WT(r)² dr + 2 y0 σ T^(-1/2) Σt ∫[(t-1)/T, t/T] WT(r) dr + T^(-1) y0²

    = σ² ∫0^1 WT(r)² dr + 2 y0 σ T^(-1/2) ∫0^1 WT(r) dr + T^(-1) y0²

    →d σ² ∫0^1 W(r)² dr,  as T → ∞.

Review: Stochastic Calculus – Example

• We can also derive additional results for yt = yt-1 + εt:

  T^(-3/2) Σt yt-1 = T^(-3/2) Σt [St-1 + y0]

    = σ Σt (St-1/(σ√T)) T^(-1) + T^(-1/2) y0

    = σ Σt ∫[(t-1)/T, t/T] WT(r) dr + T^(-1/2) y0 = σ ∫0^1 WT(r) dr + T^(-1/2) y0

    →d σ ∫0^1 W(r) dr   (as T → ∞).

Similarly,

  T^(-1) Σt yt-1 εt →d σ² ∫0^1 W(r) dW(r)

(Intuition: think of εt as dW(t).) These limits are checked by simulation below.
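A rough Monte Carlo sketch of these functionals (including T^(-2) Σt (yt-1)² from the previous slide), using simulated random walks with y0 = 0 and σ = 1; sample size, seed, and replication count are arbitrary. Useful reference points: E[∫0^1 W(r)² dr] = 1/2, E[∫0^1 W(r) dr] = 0 with variance 1/3, and E[∫0^1 W(r) dW(r)] = 0.

## Sketch: sample moments of a random walk and their limiting functionals
set.seed(22)
one_rep <- function(T = 1000) {
  eps  <- rnorm(T)
  y    <- cumsum(eps)                 # y_t with y_0 = 0
  ylag <- c(0, y[-T])                 # y_{t-1}
  c(m2 = sum(ylag^2) / T^2,           # -> σ² ∫ W(r)² dr   (mean ≈ 1/2)
    m3 = sum(ylag) / T^1.5,           # -> σ  ∫ W(r) dr    (mean 0, variance ≈ 1/3)
    m1 = sum(ylag * eps) / T)         # -> σ² ∫ W(r) dW(r) (mean 0)
}
sims <- replicate(5000, one_rep())
rowMeans(sims)                        # approximately 1/2, 0, 0
var(sims["m3", ])                     # approximately 1/3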

Review: Stochastic Calculus – Ito’s Theorem • The integral w.r.t a Brownian motion, given by Ito’s theorem (integral): where tk є [tk,tk + 1) as tk + 1 – tk → 0. ∫ f(t,ω) dB = ∑ f(tk,ω) ∆Bk As we increase the partitions of [0,T], the sum • Ito’s Theorem result: 1

Example:

to the integral.

∫B(t,ω) dB(t,ω) = B2(t,ω)/2 – t/2. 1

 W (t )dW (t )  2 [W (1) 0

p   

2

 W ( 0 ) 2  1] 

1 [W (1) 2  1] 2

Intuition: We think of εt as dW(t). Then, Σk=0 to t εk, which corresponds to ∫0to(t/T) dW(s)=W( s/T) (for W(0)=0).
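The Ito result can be checked numerically by discretizing the integral with left-endpoint evaluation. A minimal sketch; the step count, seed, and replication count are arbitrary.

## Sketch: left-endpoint (Ito) sum vs. the closed form (W(1)² − 1)/2
set.seed(33)
ito_once <- function(n = 2000) {
  dW <- rnorm(n, sd = sqrt(1 / n))         # Brownian increments over [0, 1]
  W  <- c(0, cumsum(dW))                   # W(t_k), k = 0, ..., n
  c(ito_sum     = sum(W[1:n] * dW),        # Σk W(t_k) * ΔW_k
    closed_form = (W[n + 1]^2 - 1) / 2)    # Ito's formula for ∫ W dW over [0, 1]
}
ito_once()                                      # the two entries should be close
mean(abs(replicate(2000, diff(ito_once()))))    # average gap; it shrinks as n grows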

Autoregressive Unit Root – DF Test

• Case 1. We continue with yt = yt-1 + εt. Using OLS, we estimate Φ:

  Φ̂ = Σt yt yt-1 / Σt (yt-1)² = Σt (yt-1 + εt) yt-1 / Σt (yt-1)² = 1 + Σt yt-1 εt / Σt (yt-1)²

• This implies, for the normalized bias:

  T (Φ̂ − 1) = T Σt yt-1 εt / Σt (yt-1)² = [Σt (yt-1/√T)(εt/√T)] / [(1/T) Σt (yt-1/√T)²]

• From the way we defined WT(.), we can see that yt/√T converges to a (scaled) Brownian motion: under H0, yt is just a sum of white-noise errors.

Autoregressive Unit Root – DF: Distribution

• Using previous results, we get the asymptotic distribution of the DF statistic under H0 (see Billingsley (1986) and Phillips (1987) for details):

  T (Φ̂ − 1) →d ∫0^1 W(t) dW(t) / ∫0^1 W(t)² dt

• Using Ito's integral, we have

  T (Φ̂ − 1) →d (1/2)[W(1)² − 1] / ∫0^1 W(t)² dt

Note: W(1) is a N(0,1). Then, W(1)² is just a χ²(1) RV. (This limit is simulated below.)
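A quick Monte Carlo sketch of this limit: simulate the normalized bias T(Φ̂ − 1) under H0 with no constant in the regression, and look at its (left-skewed) quantiles. Sample size, seed, and replication count are arbitrary choices.

## Sketch: simulate the normalized bias T*(Φ̂ − 1) under H0 (Case 1, no constant)
set.seed(44)
nb_once <- function(T = 500) {
  y    <- cumsum(rnorm(T))
  ylag <- c(0, y[-T])
  T * (sum(ylag * y) / sum(ylag^2) - 1)     # OLS Φ̂ without intercept, then normalized bias
}
nb <- replicate(5000, nb_once())
quantile(nb, c(0.01, 0.05, 0.10, 0.50))     # clearly non-normal, long left tail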

Autoregressive Unit Root – DF: Intuition

  T (Φ̂ − 1) →d (1/2)[W(1)² − 1] / ∫0^1 W(t)² dt

• Contrary to the stationary model, the denominator of the expression for the OLS estimator, i.e., (1/T) Σt xt², does not converge to a constant a.s., but to a RV strongly correlated with the numerator.

• Then, the asymptotic distribution is not normal. It turns out that the limiting distribution of the OLS estimator is highly skewed, with a long tail to the left.

• The normalized bias has a well-defined limiting distribution. It can be used as a test of H0.

Autoregressive Unit Root – Consistency

• In the AR(1) model, yt = Φ yt-1 + εt, we show the consistency of the OLS estimator under H0: Φ = 1:

  Φ̂ − 1 = Σt yt-1 εt / Σt (yt-1)² = [T^(-2) Σt yt-1 εt] / [T^(-2) Σt (yt-1)²]

The numerator is (1/T)·[T^(-1) Σt yt-1 εt], which converges to 0·σ² ∫0^1 W(r) dW(r) = 0, while the denominator converges to σ² ∫0^1 W(r)² dr > 0. Then

  Φ̂ − 1 →p 0.   Thus, Φ̂ →p 1.

Autoregressive Unit Root – Testing: DF

• Back to the AR(1) model. The t-test statistic for H0: α0 = 0 is given by

  t_{Φ=1} = (Φ̂ − 1) / SE(Φ̂) = (Φ̂ − 1) / [s² (Σt (yt-1)²)^(-1)]^(1/2)

• The Φ = 1 test is a one-sided, left-tail test. Q: What is its distribution?

Note: If {yt} is stationary (i.e., |Φ| < 1), then it can be shown that

  √T (Φ̂ − Φ) →d N(0, 1 − Φ²).

• Then, naively applying this result under H0, the asymptotic distribution of t_{Φ=1} would be N(0, 1). That is, under H0:

  Φ̂ ≈ N(1, 0),

which we know is not correct, since yt is not stationary and ergodic.

Autoregressive Unit Root – Testing: DF

• Recall that under H0 we got:

  T (Φ̂ − 1) →d ∫0^1 W(r) dW(r) / ∫0^1 W(r)² dr

• A little bit of algebra delivers the distribution for the t-test, tΦ:

  t_{Φ=1} = (Φ̂ − 1) / SE(Φ̂) = T (Φ̂ − 1) {T^(-2) Σt (yt-1)² / s²}^(1/2)

          →d ∫0^1 W(r) dW(r) / [∫0^1 W(r)² dr]^(1/2)

where W(r) denotes a standard Brownian motion (Wiener process) defined on [0, 1].

Autoregressive Unit Root – DF: Intuition

• The limiting distribution of t_{Φ=1} is the DF distribution. Below, we graph the DF distribution relative to a Normal: it is skewed, with a long tail to the left.

[Figure: DF distribution vs. standard Normal density — not reproduced here.]

Autoregressive Unit Root – Testing: DF • ˆ is not asymptotically normally distributed and tФ=1 is not asymptotically standard normal. • The DF distribution, which does not have a closed form representation, is non-standard. The quantiles of the DF distribution can be numerically approximated or simulated. • The DF distribution has been tabulated under different scenarios. 1) no constant: yt = Φ yt-1 + εt. 2) with a constant: yt = μ + Φ yt-1 + εt. 3) with a constant and a trend: yt = μ + δ t +Φ yt-1 + εt. • The tests with no constant are not used in practice.
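In practice the three specifications are usually run with a packaged routine. A sketch assuming the urca package is available (its ur.df() function reports the test statistic with tabulated critical values for each case); the simulated series and the lag choice are arbitrary assumptions for this example.

## Sketch of the three DF/ADF specifications, assuming the 'urca' package is installed
# install.packages("urca")   # if needed
library(urca)
set.seed(55)
y <- cumsum(rnorm(300))                         # simulated I(1) series
summary(ur.df(y, type = "none",  lags = 1))     # case 1: no constant (rarely used in practice)
summary(ur.df(y, type = "drift", lags = 1))     # case 2: with a constant
summary(ur.df(y, type = "trend", lags = 1))     # case 3: constant and trend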

Autoregressive Unit Root – Testing: DF

• Critical values of the DF t-test under different scenarios. [Table of critical values not reproduced here.]

Autoregressive Unit Root – DF: Case 1

Example: We want to simulate the DF distribution in R:

  t_{Φ=1} →d (1/2)[W(1)² − 1] / {∫0^1 W(t)² dt}^(1/2)
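A minimal sketch of that simulation for Case 1 (regression with no constant); the sample size, seed, and number of replications are arbitrary choices.

## Sketch: simulate the Case 1 DF t-statistic under H0 and tabulate its quantiles
set.seed(66)
df_t_once <- function(T = 500) {
  y    <- cumsum(rnorm(T))                      # random walk: H0 is true
  ylag <- c(0, y[-T])
  fit  <- lm(y ~ ylag - 1)                      # y_t = Φ*y_{t-1} + e_t (no constant)
  (coef(fit)[1] - 1) / sqrt(vcov(fit)[1, 1])    # t-statistic for Φ = 1
}
t_draws <- replicate(10000, df_t_once())
quantile(t_draws, c(0.01, 0.05, 0.10))          # compare with the tabulated DF critical values
hist(t_draws, breaks = 60, main = "Simulated DF distribution (Case 1)", xlab = "t-statistic")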