RS – EC2 - Lecture 16: Unit Root Tests
Autoregressive Unit Root
• A shock is an unexpected change in a variable, or in the value of the error term, at a particular time period.
• In a stationary system, the effect of a shock dies out gradually. In a non-stationary system, the effect of a shock is permanent.
• We have two types of non-stationarity. In an AR(1) model:
- Unit root: |Φ1| = 1: homogeneous non-stationarity
- Explosive root: |Φ1| > 1: explosive non-stationarity
• In the explosive case, a shock to the system becomes more and more influential as time goes on, something not observed in real life. We will not consider this case.
Autoregressive Unit Root
• Consider the AR(p) process:
Φ(L) yt = εt,  where Φ(L) = 1 − Φ1 L − Φ2 L² − ... − Φp L^p
As we discussed before, if one of the roots rj equals 1, then Φ(1) = 0, or
Φ1 + Φ2 + ... + Φp = 1
• We say yt has a unit root. In this case, yt is non-stationary.
Example: AR(1): yt = Φ1 yt-1 + εt  => Unit root: Φ1 = 1.
H0 (yt non-stationary): Φ1 = 1 (or, Φ1 − 1 = 0)
H1 (yt stationary): Φ1 < 1 (or, Φ1 − 1 < 0)
• A t-test seems natural to test H0. But the ergodic theorem and the MDS CLT do not apply: the t-statistic does not have the usual distribution.
Autoregressive Unit Root
• Now, let's reparameterize the AR(1) process. Subtract yt-1 from yt:
Δyt = yt − yt-1 = (Φ1 − 1) yt-1 + εt = α0 yt-1 + εt
• Unit root test: H0: α0 = Φ1 − 1 = 0 against H1: α0 < 0.
• Natural test for H0: a t-test. We call this test the Dickey-Fuller (DF) test. But what is its distribution?
• Back to the general AR(p) process: Φ(L) yt = εt. We rewrite the process using the Dickey-Fuller reparameterization:
Δyt = α0 yt-1 + α1 Δyt-1 + α2 Δyt-2 + ... + αp-1 Δyt-(p-1) + εt
• Both AR(p) formulations are equivalent.
Autoregressive Unit Root – Testing
• AR(p) lag polynomial Φ(L):
Φ(L) = 1 − Φ1 L − Φ2 L² − ... − Φp L^p
• DF reparameterization:
Φ(L) = (1 − L) − α0 L − α1 (L − L²) − α2 (L² − L³) − ... − αp-1 (L^(p-1) − L^p)
• Both parameterizations must be equal. Then, Φ(1) = −α0 => the unit root hypothesis can be stated as H0: α0 = 0.
Note: The model is stationary if α0 < 0 => natural H1: α0 < 0.
• Under H0: α0 = 0, the model is a stationary AR(p−1) in Δyt. Then, if yt has a (single) unit root, Δyt is a stationary AR process.
• We have a linear regression framework. A t-test for H0 is the Augmented Dickey-Fuller (ADF) test; a sketch of the regression is shown below.
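A minimal R sketch (illustrative, not from the lecture): the ADF regression run by OLS on a simulated series. The lag length p = 3 (two lagged differences) and the series y are arbitrary choices; the printed "t value" on ylag is the ADF statistic and must be compared with DF critical values, not the usual t-table.

set.seed(123)
y  <- cumsum(rnorm(200))          # simulated I(1) series (random walk)
dy <- diff(y)                     # Delta y_t
n  <- length(dy)
reg <- data.frame(dy   = dy[3:n],        # Delta y_t
                  ylag = y[3:n],         # y_{t-1}
                  d1   = dy[2:(n-1)],    # Delta y_{t-1}
                  d2   = dy[1:(n-2)])    # Delta y_{t-2}
fit <- lm(dy ~ 0 + ylag + d1 + d2, data = reg)   # Case 1: no constant
summary(fit)$coefficients["ylag", ]   # alpha0_hat, its SE, and the ADF "t value"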
Autoregressive Unit Root – Testing: DF
• The Dickey-Fuller (DF) test is a special case of the ADF: no lags are included in the regression. It is easier to derive, and we gain intuition from its derivation.
• From our previous example, we have:
Δyt = (Φ1 − 1) yt-1 + εt = α0 yt-1 + εt
• If α0 = 0, the system has a unit root:
H0: α0 = 0
H1: α0 < 0
• We can test H0 with a t-test:
tΦ1=1 = (Φ̂1 − 1) / SE(Φ̂1)
• There is another test associated with H0, the ρ-test (normalized bias): (T − 1)(Φ̂1 − 1).
Autoregressive Unit Root – Testing: DF
• Note that the OLS estimation of Φ̂1 under a unit root involves taking averages of the integrated series yt. Recall that, by backward substitution:
yt = yt-1 + εt = y0 + Σj=1 to t εj
⇒ each yt is the result of a constant (y0) plus a (partial) sum of errors.
• The sample mean of yt (assuming y0 = 0) therefore involves a sum of partial sums:
ȳ = (1/T) Σt=1 to T yt = (1/T) Σt=1 to T (Σj=1 to t εj)
Then, the mean shows a different probabilistic behavior than the mean of stationary and ergodic errors. It turns out that the limit behavior of ȳ when Φ = 1 is described by simple functionals of Brownian motion.
Review: Stochastic Calculus
• Brownian motion (Wiener process)
Preliminaries: Let the variable W(t) be almost surely continuous, with W(t=0) = 0. The change in W over a small interval of time Δt is ΔW. N(μ, v): the usual normal distribution, with mean μ and variance v.
Definition: The variable W(t) follows a Wiener process if
- W(0) = 0
- W(t) has continuous paths in t
- ΔW = ε√Δt, where ε ∼ N(0,1)
- The values of ΔW for any 2 different (non-overlapping) intervals of time are independent.
Review: Stochastic Calculus – Wiener process
• A Brownian motion is a continuous stochastic process, but it is not differentiable ("jagged path") and has unbounded variation ("a continuous random walk"). Notation: W(t), W(t, ω), B(t).
Example: A (scaled) partial sum:
WT(r) = (1/√T)(ε1 + ε2 + ε3 + ... + ε[Tr]);  r ∈ [0,1]
• Summary: Properties of the Wiener process {W(t)} for t ≥ 0.
- Let N = T/Δt. Then W(T) − W(0) = Σi=1 to N εi √Δt.
Thus, W(T) − W(0) = W(T) ∼ N(0, T) ⇒ E[W(T)] = 0 & Var[W(T)] = T
- E[ΔW] = 0
- Var[ΔW] = Δt ⇒ the standard deviation of ΔW is √Δt
A small simulation check of these properties is sketched below.
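A minimal R sketch (illustrative, not from the lecture): approximate Brownian paths as cumulative sums of ε√Δt increments and check that E[W(T)] ≈ 0 and Var[W(T)] ≈ T; the choices T = 5 and Δt = 0.01 are arbitrary.

set.seed(1)
TT <- 5; dt <- 0.01; N <- TT / dt
W_T <- replicate(10000, sum(rnorm(N) * sqrt(dt)))   # W(T) - W(0) for each path
c(mean = mean(W_T), var = var(W_T))                 # approx. 0 and T = 5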
Review: Stochastic Process – Partial Sums
• Partial Sum Process
Let εt ∼ iid N(0, σ²). For r ∈ [0, 1], define the partial sum process:
XT(r) = (1/T) S[Tr] = (1/T) Σt=1 to [Tr] εt;  r ∈ [0,1]
Example: Let T = 20, r = {0.01, ..., 0.1, ...}
r = 0.01: [Tr] = [20*0.01] = 0  ⇒  X20(0.01) = 0.
r = 0.1:  [Tr] = [20*0.1] = 2   ⇒  X20(0.1) = (1/20)(ε1 + ε2).
=> XT(r) is a random step function. As T grows, the steps get smaller, and the rescaled path WT(r) = √T XT(r) looks more and more like a Wiener process (see the sketch below).
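A minimal R sketch (illustrative, not from the lecture): plot the random step function WT(r) = √T·XT(r) on a grid of r values, for a small and a large T.

set.seed(2)
plot_WT <- function(TT) {
  eps <- rnorm(TT)
  r   <- seq(0, 1, by = 0.001)
  W   <- sapply(floor(TT * r), function(k) if (k == 0) 0 else sum(eps[1:k]) / sqrt(TT))
  plot(r, W, type = "s", main = paste("W_T(r), T =", TT))
}
par(mfrow = c(1, 2))
plot_WT(20)      # coarse step function
plot_WT(2000)    # steps shrink; the path resembles a Brownian motion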
Review: Stochastic Calculus – FCLT
• Functional CLT (Donsker's FCLT)
If εt satisfies some assumptions (stationarity, a moment condition E|εt|^q < ∞ for some q > 2), then WT(r) →D W(r), where W(r) is a standard Brownian motion for r ∈ [0, 1].
Note: That is, sample statistics like WT(r) do not converge to constants, but to functions of Brownian motions. Check Billingsley (1986).
• A CLT is a limit for one term of a sequence of partial sums {Sk}; Donsker's FCLT is a limit for the entire sequence {Sk}, instead of one term.
• Donsker's FCLT is an important extension of the CLT because it is a tool to obtain additional limits, such as those of statistics involving I(1) variables.
Review: Stochastic Calculus – CMT
• These extensions primarily follow from the continuous mapping theorem (CMT) and generalizations of the CMT.
• Continuous Mapping Theorem (CMT)
If WT(r) →D W(r) and h(.) is a continuous functional on D[0,1], the space of all real-valued functions on [0,1] that are right continuous at each point of [0,1] and have finite left limits, then
h(WT(r)) →D h(W(r)) as T → ∞.
Example: Let h(YT(.)) = ∫0 to 1 YT(r) dr and √T XT(r) →D W(r).
Then, h(√T XT(r)) = ∫0 to 1 √T XT(r) dr →D ∫0 to 1 W(r) dr.
Review: Stochastic Calculus – FCLT (2)
• FCLT: If T is large, WT(.) is a good approximation to W(r), r ∈ [0,1]:
W(r) = limT→∞ {WT(r) = √T XT(r)}
Sketch of proof: For a fixed r ∈ [0,1]:
WT(r) = √T XT(r) = (1/√T) Σt=1 to [Tr] εt = (√[Tr]/√T) × (1/√[Tr]) Σt=1 to [Tr] εt
Since, as T → ∞,
[Tr]/T → r   &   (1/√[Tr]) Σt=1 to [Tr] εt →d N(0, σ²),
then, from Slutsky's theorem, we get (again, for a fixed r):
WT(r)/σ →d N(0, r) ≡ W(r).   (For r = 1, WT(1)/σ →d N(0,1) ≡ W(1).)
This result holds for any r ∈ [0, 1]. We suspect it holds uniformly for r ∈ [0, 1]. In fact, the sequence of random step functions WT(.)/σ, defined on [0,1], converges in distribution to W(.). (This is the FCLT.)
Review: Stochastic Calculus – Example
• Example: yt = yt-1 + εt. Get the limiting distribution of (X'X/T²)⁻¹ for yt; we first derive the limit of T^(-2) Σt yt-1² = X'X/T². With St-1 = Σi=1 to t-1 εi and WT(r) = S[Tr]/(σ√T):
T^(-2) Σt=1 to T (yt-1)² = T^(-2) Σt=1 to T [Σi=1 to t-1 εi + y0]² = T^(-2) Σt=1 to T [St-1 + y0]²
= T^(-2) Σt=1 to T [(St-1)² + 2 y0 St-1 + y0²]
= σ² Σt=1 to T T^(-1) [St-1/(σ√T)]² + 2 σ y0 T^(-1/2) Σt=1 to T T^(-1) [St-1/(σ√T)] + T^(-1) y0²
= σ² Σt=1 to T ∫(t-1)/T to t/T WT(r)² dr + 2 σ y0 T^(-1/2) Σt=1 to T ∫(t-1)/T to t/T WT(r) dr + T^(-1) y0²
= σ² ∫0 to 1 WT(r)² dr + 2 σ y0 T^(-1/2) ∫0 to 1 WT(r) dr + T^(-1) y0²
→d σ² ∫0 to 1 W(r)² dr,  as T → ∞.
(A simulation check of this limit is sketched below.)
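A minimal R sketch (illustrative, not from the lecture): a Monte Carlo check that T^(-2) Σ yt-1² for a random walk behaves like σ² ∫0 to 1 W(r)² dr; both objects are simulated, and with σ = 1 the common mean is about 1/2.

set.seed(3)
TT <- 1000; nrep <- 2000
lhs <- replicate(nrep, {
  y <- cumsum(rnorm(TT))                  # random walk, y0 = 0, sigma = 1
  sum(y[1:(TT-1)]^2) / TT^2               # T^(-2) * sum of y_{t-1}^2
})
rhs <- replicate(nrep, {
  W <- cumsum(rnorm(TT, sd = sqrt(1/TT))) # discretized standard Brownian motion
  mean(W^2)                               # approx. integral of W(r)^2 over [0,1]
})
c(mean_lhs = mean(lhs), mean_rhs = mean(rhs))   # both approx. 0.5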
Review: Stochastic Calculus – Example
• We can also derive additional results for yt = yt-1 + εt:
T^(-3/2) Σt=1 to T yt-1 = T^(-3/2) Σt=1 to T [St-1 + y0]
= σ Σt=1 to T ∫(t-1)/T to t/T WT(r) dr + T^(-1/2) y0
= σ ∫0 to 1 WT(r) dr + T^(-1/2) y0
→d σ ∫0 to 1 W(r) dr   (as T → ∞).
• Similarly,
T^(-1) Σt=1 to T yt-1 εt →d σ² ∫0 to 1 W(r) dW(r).
(Intuition: think of εt as dW(t).)
Review: Stochastic Calculus – Ito's Theorem
• The integral w.r.t. a Brownian motion is given by Ito's integral:
∫ f(t,ω) dB(t,ω) = lim Σk f(tk,ω) ΔBk,  where tk ∈ [tk, tk+1) and tk+1 − tk → 0.
As we increase the number of partitions of [0,T], the sum converges (in probability) to the integral.
• Ito's result:
∫0 to t B(s,ω) dB(s,ω) = B(t,ω)²/2 − t/2.
Example:
∫0 to 1 W(t) dW(t) = ½ [W(1)² − W(0)² − 1] = ½ [W(1)² − 1].
Intuition: We think of εt as dW(t). Then, Σk=0 to t εk corresponds to ∫0 to (t/T) dW(s) = W(t/T) (for W(0) = 0). (A numerical check of the Ito result is sketched below.)
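A minimal R sketch (illustrative, not from the lecture): discretize ∫0 to 1 W dW as Σ W(tk-1)·ΔWk on a fine grid and compare with the closed form ½[W(1)² − 1], path by path.

set.seed(4)
n  <- 10000                       # grid points on [0,1]
dW <- rnorm(n, sd = sqrt(1/n))    # Brownian increments
W  <- cumsum(dW)                  # W(t_k), with W(0) = 0
ito_sum <- sum(c(0, W[-n]) * dW)  # sum of W(t_{k-1}) * (W(t_k) - W(t_{k-1}))
closed  <- (W[n]^2 - 1) / 2       # (W(1)^2 - 1) / 2
c(ito_sum = ito_sum, closed_form = closed)   # close for large n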
Autoregressive Unit Root – DF Test
• Case 1. We continue with yt = yt-1 + εt. Using OLS, we estimate Φ:
Φ̂ = [Σt=1 to T yt-1 yt] / [Σt=1 to T yt-1²] = [Σt=1 to T (yt-1 + εt) yt-1] / [Σt=1 to T yt-1²]
= 1 + [Σt=1 to T yt-1 εt] / [Σt=1 to T yt-1²]
• This implies, for the normalized bias:
T (Φ̂ − 1) = [Σt=1 to T (yt-1/√T)(εt/√T)] / [(1/T) Σt=1 to T (yt-1/√T)²]
• From the way we defined WT(.), we can see that yt/√T converges to a (scaled) Brownian motion. Under H0, yt is a sum of white noise errors.
Autoregressive Unit Root – DF: Distribution
• Using previous results, we get the DF's asymptotic distribution under H0 (see Billingsley (1986) and Phillips (1987) for details):
T (Φ̂ − 1) →d [∫0 to 1 W(t) dW(t)] / [∫0 to 1 W(t)² dt]
• Using Ito's integral, we have
T (Φ̂ − 1) →d [½ (W(1)² − 1)] / [∫0 to 1 W(t)² dt]
Note: W(1) is N(0,1), so W(1)² is just a χ²(1) RV. (A simulation sketch of this limit appears below.)
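A minimal R sketch (illustrative, not from the lecture): Monte Carlo draws of the normalized bias T(Φ̂ − 1) under H0 (random walk, no constant); its distribution is skewed to the left.

set.seed(5)
TT <- 500; nrep <- 10000
nb <- replicate(nrep, {
  y    <- cumsum(rnorm(TT))
  ylag <- y[1:(TT-1)]; ycur <- y[2:TT]
  phi_hat <- sum(ylag * ycur) / sum(ylag^2)   # OLS of y_t on y_{t-1}
  TT * (phi_hat - 1)
})
quantile(nb, c(0.01, 0.05, 0.10))   # approximate left-tail critical values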
Autoregressive Unit Root – DF: Intuition
T (Φ̂ − 1) →d [½ (W(1)² − 1)] / [∫0 to 1 W(t)² dt]
• Contrary to the stationary model, the denominator of the expression for the OLS estimator (i.e., (1/T) Σt xt²) does not converge to a constant a.s., but to a RV strongly correlated with the numerator.
• Then, the asymptotic distribution is not normal. It turns out that the limiting distribution of the OLS estimator is highly skewed, with a long tail to the left.
• The normalized bias has a well-defined limiting distribution. It can be used as a test of H0.
Autoregressive Unit Root – Consistency
• In the AR(1) model, yt = Φ yt-1 + εt, we show the consistency of the OLS estimator under H0: Φ = 1:
Φ̂ − 1 = [Σt=1 to T yt-1 εt] / [Σt=1 to T yt-1²] = [T^(-2) Σt=1 to T yt-1 εt] / [T^(-2) Σt=1 to T yt-1²]
The numerator equals T^(-1) × (T^(-1) Σt yt-1 εt); since T^(-1) Σt yt-1 εt →d σ² ∫0 to 1 W(r) dW(r) and T^(-1) → 0, the numerator →p 0. The denominator →d σ² ∫0 to 1 W(r)² dr > 0.
Thus, Φ̂ − 1 →p 0, i.e., Φ̂ →p 1: the OLS estimator is consistent under H0 (a simulation sketch is shown below).
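A minimal R sketch (illustrative, not from the lecture): Φ̂ from the unit-root AR(1) gets closer to 1 as T grows; the estimation error shrinks at rate 1/T (super-consistency).

set.seed(6)
phi_hat <- function(TT) {
  y <- cumsum(rnorm(TT))
  sum(y[1:(TT-1)] * y[2:TT]) / sum(y[1:(TT-1)]^2)
}
sapply(c(50, 500, 5000, 50000), function(TT) mean(replicate(200, phi_hat(TT))))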
Autoregressive Unit Root – Testing: DF
• Back to the AR(1) model. The t-test statistic for H0: α0 = 0 (i.e., Φ = 1) is given by
tΦ=1 = (Φ̂ − 1) / SE(Φ̂) = (Φ̂ − 1) / [s² (Σt yt-1²)^(-1)]^(1/2)
• The Φ = 1 test is a one-sided, left-tail test. Q: What is its distribution?
Note: If {yt} is stationary (i.e., |Φ| < 1), then it can be shown that
√T (Φ̂ − Φ) →d N(0, 1 − Φ²).
• Then, under H0: Φ = 1, this would suggest that the asymptotic distribution of tΦ=1 is N(0,1) and that
Φ̂ ≈ N(1, 0),
which we know is not correct, since yt is not stationary and ergodic.
Autoregressive Unit Root – Testing: DF
• Recall that under H0 we got
T (Φ̂ − 1) →D [∫0 to 1 W(r) dW(r)] / [∫0 to 1 W(r)² dr]
• A little bit of algebra delivers the distribution for the t-test, tΦ:
tΦ=1 = (Φ̂ − 1) / SE(Φ̂) = T (Φ̂ − 1) × {T^(-2) Σt=1 to T yt-1² / s²}^(1/2)
→D [∫0 to 1 W(r) dW(r)] / [∫0 to 1 W(r)² dr]^(1/2)
where W(r) denotes a standard Brownian motion (Wiener process) defined on [0,1].
Autoregressive Unit Root – DF: Intuition
• The limiting distribution of tΦ=1 is the DF distribution. Below, we graph the DF distribution relative to a Normal: it is skewed, with a long tail to the left.
Autoregressive Unit Root – Testing: DF
• Φ̂ is not asymptotically normally distributed and tΦ=1 is not asymptotically standard normal.
• The DF distribution, which does not have a closed-form representation, is non-standard. The quantiles of the DF distribution can be numerically approximated or simulated.
• The DF distribution has been tabulated under different scenarios:
1) no constant: yt = Φ yt-1 + εt
2) with a constant: yt = μ + Φ yt-1 + εt
3) with a constant and a trend: yt = μ + δ t + Φ yt-1 + εt
• The tests with no constant are not used in practice. (A sketch of how these three cases can be run in R is shown below.)
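A minimal R sketch (illustrative, not from the lecture; it assumes the urca package is installed): the ADF test run on a simulated random walk under the three scenarios.

library(urca)
set.seed(7)
y <- cumsum(rnorm(300))                       # simulated I(1) series
summary(ur.df(y, type = "none",  lags = 2))   # 1) no constant
summary(ur.df(y, type = "drift", lags = 2))   # 2) with a constant
summary(ur.df(y, type = "trend", lags = 2))   # 3) with a constant and a trend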
Autoregressive Unit Root – Testing: DF
• Critical values of the DF t-test under different scenarios (tabulated values not shown).
Autoregressive Unit Root – DF: Case 1
Example: We want to simulate the DF distribution in R:
tΦ=1 →d [½ (W(1)² − 1)] / {∫0 to 1 W(t)² dt}^(1/2)
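A minimal sketch of one way such a simulation could be set up (illustrative only, not necessarily the code on the original slides): draw the DF t-statistic under H0 for Case 1 (no constant) and compare its distribution with the standard normal.

set.seed(8)
TT <- 500; nrep <- 10000
df_t <- replicate(nrep, {
  y    <- cumsum(rnorm(TT))
  ylag <- y[1:(TT-1)]; ycur <- y[2:TT]
  phi  <- sum(ylag * ycur) / sum(ylag^2)       # OLS estimate of phi
  e    <- ycur - phi * ylag                    # residuals
  s2   <- sum(e^2) / (length(e) - 1)           # residual variance
  (phi - 1) / sqrt(s2 / sum(ylag^2))           # DF t-statistic
})
quantile(df_t, c(0.01, 0.05, 0.10))            # approx. -2.58, -1.95, -1.62
hist(df_t, breaks = 100, freq = FALSE)
curve(dnorm(x), add = TRUE, lwd = 2)           # skewed left relative to N(0,1)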