Econometrics 2013, 1, 207-216; doi:10.3390/econometrics1030207

OPEN ACCESS

econometrics
ISSN 2225-1146
www.mdpi.com/journal/econometrics

Article

The Geometric Meaning of the Notion of Joint Unpredictability of a Bivariate VAR(1) Stochastic Process

Umberto Triacca

Department of Computer Engineering, Computer Science and Mathematics, University of L'Aquila, Via Vetoio I-67010 Coppito, L'Aquila, I-67100, Italy; E-Mail: [email protected]; Tel.: +39-086-243-4849; Fax: +39-086-243-4407

Received: 21 August 2013; in revised form: 5 November 2013 / Accepted: 5 November 2013 / Published: 14 November 2013

Abstract: This paper investigates, in a particular parametric framework, the geometric meaning of joint unpredictability for a bivariate discrete-time process. In particular, the paper provides a characterization of joint unpredictability in terms of the distance between information sets in a Hilbert space.

Keywords: Hilbert spaces; predictability; stochastic process

JEL Classification: C18; C32

1. Introduction

Let (Ω, F, P) be a probability space and {yt = [y1,t, y2,t]′; t = 0, 1, …} a bivariate stochastic process defined on (Ω, F, P). We consider the differenced process {Δyt = [Δy1,t, Δy2,t]′; t = 1, …}, where Δ is the first-difference operator. Following Caporale and Pittis [1] and Hassapis et al. [2], we say that the process {Δyt; t = 1, …} is jointly unpredictable if

    E(Δyt+1 | σ(yt, …, y0)) = 0  for all t                                (1)

where σ(yt, …, y0) is the σ-field generated by the past vectors yi, i = 0, …, t.

The goal of this paper is to show that the notion of joint unpredictability, in a particular parametric framework, can be characterized by a geometric condition. This characterization is given in terms of the distance between information sets in a Hilbert space. In particular, we will show that the process {Δyt; t = 1, …} is jointly unpredictable if and only if the information contained in its past is 'much

distant’ from the information contained in its future. Even if our result is not as general as might seem desirable, we think that the intuition gained from this characterization makes the notion of joint unpredictability more clear. The rest of the paper is organized as follows. Section 2 presents the utilized mathematical framework. Sections 3 presents the geometric characterization. Section 4 concludes. 2. Preliminaries Definitions, notation, and preliminary results from Hilbert space theory will be presented prior to establish the main result. An excellent overviews of the applications of Hilbert space methods to time series analysis can be found in Brockwell and Davis [3]. We use the following notations and symbols. Let (⌦, F, P ) be a probability space. We consider the Hilbert space L2 (⌦, F, P ) of all real square integrable random variables on (⌦, F, P ). The inner product in L2 (⌦, F, P ) is defined by hz, wi = E(zw) for any z, w 2 L2 (⌦, F, P ). The space L2 (⌦, F, P ) is a 1/2 normed space and the norm is given by kwk = [E(w2 )] . The distance between z,w 2 L2 (⌦, F, P ) is d(z,w) = kz wk. A sequence {zn }⇢ L2 (⌦, F, P ) is said to converge to a limit point z 2 L2 (⌦, F, P ) if d(zn , z) ! 0 as n ! 1. A point z 2 L2 (⌦, F, P ) is a limit point of a set M (subset of L2 (⌦, F, P )) if it is a limit point of a sequence from M . In particular, M is said to be closed if it contains all its limit points. If S is a arbitrary subset of L2 (⌦, F, P ), then the set of all ↵1 z1 + ... + ↵h zh (h = 1, 2, ....; ↵1 , ..., ↵h arbitrary real numbers; z1 , ..., zh arbitrary elements of S) is called a linear manifold spanned by S and is symbolized by sp(S). If we add to sp(S) all its limit points we obtain a closed set that we call the closed linear manifold or subspace spanned by S, symbolized by sp(S). Two elements z, w 2 L2 are called orthogonal, and we write z ? w, if hz, wi = 0. If S is any subset of L2 (⌦, F, P ), then we write x ? S if x ? 
s for all s 2 S; similarly, the notation S ? T , for two subsets S and T of L2 (⌦, F, P ), indicates that all elements of S are orthogonal to all elements of T . For a given z 2 L2 (⌦, F, P ) and a closed subspace M of L2 (⌦, F, P ), we define the orthogonal projection of z on M , denoted by P (z|M ), as the unique element of M such that kz P (z|M )k  kz wk for any w 2 M. We remember that if z ? M , then P (z|M ) = 0. If M and N are two arbitrary subsets of L2 (⌦, F, P ), then the quantity d (M, N ) = inf {km

nk ; m 2 M, n 2 N }

is called distance between M and N . We close this section introducing some further definitions, concerning discrete stochastic processes in L2 (⌦, F, P ). Let {xt } be a univariate stochastic process. We say that {xt } is integrated of order one (denoted xt ⇠ I(1)) if the process { xt = xt xt 1 } is stationary whereas {xt } is not stationary. We say that the bivariate stochastic process yt = [y1,t , y2,t ]0 is integrated of order one if y1,t ⇠ I(1) and y2,t ⇠ I(1). A stochastic process {yt } Granger causes another stochastic process {xt }, with respect to a given information set It that contains at least xt j , yt j , j > 0, if xt can be better predicted by using past values of y than by not doing so, all other information in It (including the past of x) being

Econometrics 2013, 1

209

used in either case. More formally, we say that {yt} is Granger causal for {xt} with respect to Hxy(t) = s̄p{xt, yt, xt−1, yt−1, …} if

    ‖xt+1 − P(xt+1 | Hxy(t))‖² < ‖xt+1 − P(xt+1 | Hx(t))‖²

where Hx(t) = s̄p{xt, xt−1, …}.

Two stochastic processes {xt} and {yt}, both of which are individually I(1), are said to be cointegrated if there exists a non-zero constant δ such that {zt = xt − δyt} is a stationary (I(0)) process. It is important to note that cointegration between two variables implies the existence of causality (in the Granger sense) between them in at least one direction (see Granger [4]).

3. A Geometric Characterization

In this section we assume that yt = [y1,t, y2,t]′; t = 0, 1, … is a bivariate stochastic process defined on (Ω, F, P), integrated of order one, with y1,0 = y2,0 = 0, that has a VAR(1) representation

    yt = Ayt−1 + ut                                                       (2)

where

    A = [a11  a12]
        [a21  a22]

is a fixed (2 × 2) coefficient matrix and ut = [u1,t, u2,t]′ is i.i.d. with E(ut) = 0 and

    E(ut ut′) = Σ = [σ1²   0 ]
                    [0    σ2²]

for all t, and E(ut us′) = 0 for s ≠ t.

In this framework, {y2,t} does not Granger cause {y1,t} if and only if a12 = 0. Similarly, {y1,t} does not Granger cause {y2,t} if and only if a21 = 0. We observe that VAR residuals are usually correlated, and hence the covariance matrix Σ is seldom diagonal. However, because the main aim of this study is pedagogical, we assume that Σ is diagonal for analytical convenience.

We consider the following information sets: IΔy1(t+) = {Δy1,t+1, Δy1,t+2, …}, IΔy2(t+) = {Δy2,t+1, Δy2,t+2, …}, HΔy1(t) = s̄p{Δy1,t, Δy1,t−1, …} and HΔy2(t) = s̄p{Δy2,t, Δy2,t−1, …}.
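As a purely illustrative check of this parametric framework (our addition, not part of the paper), one can simulate Equation (2) with a diagonal Σ and recover A by least squares. All coefficient values and the sample size below are arbitrary assumptions; the chosen A has a12 = 0, so {y2,t} does not Granger cause {y1,t}.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical coefficients: a12 = 0, and the eigenvalues of A are 1 and
# 0.7, so the simulated y_t is integrated of order one.
A = np.array([[1.0, 0.0],
              [0.3, 0.7]])
sigma1, sigma2 = 1.0, 0.5

y = np.zeros((n, 2))
for t in range(1, n):
    u = np.array([sigma1, sigma2]) * rng.standard_normal(2)  # diagonal Sigma
    y[t] = A @ y[t - 1] + u                                  # Equation (2)

# Multivariate least squares of y_t on y_{t-1} recovers A.
A_hat = np.linalg.lstsq(y[:-1], y[1:], rcond=None)[0].T
print(np.round(A_hat, 2))   # close to A
```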

Theorem 3.1. Let yt be a VAR(1) process defined as in (2). The differenced process {Δyt; t = 1, …} is jointly unpredictable if and only if

    d(IΔy1(t+), HΔy2(t)) = σΔy1

and

    d(IΔy2(t+), HΔy1(t)) = σΔy2

where σΔy1 = ‖Δy1,t+k‖ and σΔy2 = ‖Δy2,t+k‖; these norms do not depend on t or k, since the differenced process is stationary.

Theorem 3.1 provides a geometric characterization of the notion of joint unpredictability of a bivariate process in terms of the distance between information sets. It is important to note that

    d(IΔy1(t+), HΔy2(t)) ≤ σΔy1

and

    d(IΔy2(t+), HΔy1(t)) ≤ σΔy2

Thus the process {Δyt = [Δy1,t, Δy2,t]′; t = 1, …} is jointly unpredictable if and only if the distances d(IΔy1(t+), HΔy2(t)) and d(IΔy2(t+), HΔy1(t)) achieve their maximum values.
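To make this remark concrete, here is a small Monte Carlo sketch (ours, with arbitrary parameter values). When A = I the differences are i.i.d. shocks, and the sample analogue of d(IΔy1(t+), HΔy2(t)) — the residual norm from projecting Δy1,t+1 on a finite number of lags of Δy2, used here as a stand-in for the closed span of the whole past — is approximately σΔy1, its maximum.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
sigma1, sigma2 = 1.0, 2.0

# Jointly unpredictable case A = I: the differences are just the shocks.
dy1 = sigma1 * rng.standard_normal(n)
dy2 = sigma2 * rng.standard_normal(n)

# Project dy1_{t+1} on p lags of dy2 and take the residual norm.
p = 5
X = np.column_stack([dy2[p - k - 1 : n - k - 1] for k in range(p)])
g = dy1[p:]
beta = np.linalg.lstsq(X, g, rcond=None)[0]
dist = np.sqrt(np.mean((g - X @ beta) ** 2))

print(round(dist, 2))   # close to sigma1 = 1.0: the distance is maximal
```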


It is intuitive that if these distances achieve their maximum values, then σ(yt, …, y0) does not contain any valuable information about the future of the differenced series Δyt = [Δy1,t, Δy2,t]′, and hence these are jointly unpredictable with respect to the information set σ(yt, …, y0), that is, E(Δyt+1 | σ(yt, …, y0)) = 0. We recall that Theorem 3.1 holds only in a bivariate setting.

3.1. Lemmas

In order to prove Theorem 3.1, we need the following lemmas.

Lemma 3.2. Let V be a closed subspace of L²(Ω, F, P) and G ≠ ∅ a subset of L²(Ω, F, P) such that ‖g‖ = η ∈ R for all g ∈ G. Then G ⊥ V if and only if d(G, V) = η.

Proof. Focker and Triacca ([5], p. 767).

Lemma 3.2 establishes a relationship between the orthogonality of sets/subspaces of the Hilbert space L²(Ω, F, P) and their distance: G and V are orthogonal if and only if the distance d(G, V) achieves its maximum value. In fact, d(G, V) cannot be greater than η, since 0 ∈ V.

Lemma 3.3. The processes {y1,t} and {y2,t} are not cointegrated if and only if A = I.

Proof. By (2) we have

    Δy1,t = (a11 − 1)y1,t−1 + a12 y2,t−1 + u1,t
    Δy2,t = a21 y1,t−1 + (a22 − 1)y2,t−1 + u2,t

These equations must be balanced; that is, the order of integration of (a11 − 1)y1,t−1 + a12 y2,t−1 and of a21 y1,t−1 + (a22 − 1)y2,t−1 must be zero.

(⇒) If A ≠ I, then, since (a11 − 1)y1,t−1 + a12 y2,t−1 ~ I(0) and a21 y1,t−1 + (a22 − 1)y2,t−1 ~ I(0), we can have three cases.

Case (1): A = [aij], with aij ≠ 0 for i, j = 1, 2, i ≠ j, and aii ≠ 1 for i = 1, 2.

Case (2):

    A = [a11  a12]
        [0     1 ]

with a11 ≠ 1 and a12 ≠ 0.

Case (3):

    A = [1     0 ]
        [a21  a22]

with a21 ≠ 0 and a22 ≠ 1.

In all three cases there exists at least one non-trivial linear combination of the processes {y1,t} and {y2,t} that is stationary. Thus we can conclude that {y1,t} and {y2,t} are cointegrated.

(⇐) If A = I, then a12 = a21 = 0, so {y1,t} does not Granger cause {y2,t} and {y2,t} does not Granger cause {y1,t}. It follows that {y1,t} and {y2,t} are not cointegrated.
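The lemma can be illustrated by a quick simulation (ours, with an arbitrary rank-one choice of A − I falling under Case (1)): the levels wander like I(1) processes while the linear combination y1,t − y2,t stays bounded.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000

# A != I with rank(A - I) = 1; here the cointegrating vector is [1, -1].
A = np.array([[0.8, 0.2],
              [0.3, 0.7]])

y = np.zeros((n, 2))
for t in range(1, n):
    y[t] = A @ y[t - 1] + rng.standard_normal(2)

z = y[:, 0] - y[:, 1]      # stationary combination: cointegration
ratio = np.var(y[:, 0]) / np.var(z)
print(ratio > 20)          # the I(1) level is far more dispersed than z
```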


Lemma 3.4. If {y1,t} and {y2,t} are cointegrated, then |a11 + a22 − 1| < 1.

Proof. Subtracting [y1,t−1, y2,t−1]′ from both sides of Equation (2), we obtain

    [Δy1,t]   [a11 − 1   a12   ] [y1,t−1]   [u1,t]
    [Δy2,t] = [a21     a22 − 1 ] [y2,t−1] + [u2,t]                        (3)

If {y1,t} and {y2,t} are cointegrated, the matrix A − I has rank one, so we can write A − I = ϑβ′ with ϑ = [ϑ1, ϑ2]′ and β = [1, −δ]′, and hence

    Δy1,t = ϑ1 (y1,t−1 − δ y2,t−1) + u1,t
    Δy2,t = ϑ2 (y1,t−1 − δ y2,t−1) + u2,t

where δ is the cointegration coefficient and ϑ1, ϑ2 are the speed-of-adjustment coefficients. By rearranging Equation (3) we obtain an AR(1) model for y1,t − δ y2,t:

    y1,t − δ y2,t = φ (y1,t−1 − δ y2,t−1) + u1,t − δ u2,t

where φ = 1 + ϑ1 − δϑ2 = a11 + a22 − 1. Since {y1,t} and {y2,t} are cointegrated, {y1,t − δ y2,t} is a stationary process, and so

    |a11 + a22 − 1| < 1

Lemma 3.5. The process {Δyt = [Δy1,t, Δy2,t]′; t = 1, …} is jointly unpredictable if and only if A = I.

Proof. (⇒) If the process {Δyt; t = 1, …} is jointly unpredictable, then E(Δyt | σ(yt−1, …, y0)) = 0, that is,

    E(yt | σ(yt−1, …, y0)) = yt−1

On the other hand, since yt = Ayt−1 + ut with ut i.i.d., E(ut) = 0, E(ut ut′) = Σ for all t and E(ut us′) = 0 for s ≠ t, we have

    E(yt | σ(yt−1, …, y0)) = Ayt−1


Hence we have

    Ayt−1 = yt−1

for all t, and so A = I.

(⇐) If A = I, then yt = yt−1 + ut with E(ut) = 0, E(ut ut′) = Σ for all t and E(ut us′) = 0 for s ≠ t, and hence

    E(yt | σ(yt−1, …, y0)) = yt−1

that is, E(Δyt | σ(yt−1, …, y0)) = 0. Thus we can conclude that the process {Δyt = [Δy1,t, Δy2,t]′; t = 1, …} is jointly unpredictable.

Before concluding this subsection, we observe that Equation (2) can be written in lag operator notation. The lag operator L is defined by Lyt = yt−1. We have

    (I − AL)yt = ut

or

    [1 − a11 L   −a12 L  ] [y1,t]   [u1,t]
    [−a21 L      1 − a22 L] [y2,t] = [u2,t]

3.2. Proof of Theorem 3.1

Sufficiency. If

    d(IΔy1(t+), HΔy2(t)) = σΔy1

and

    d(IΔy2(t+), HΔy1(t)) = σΔy2

then, by Lemma 3.2, we have

    IΔy1(t+) ⊥ HΔy2(t)

and

    IΔy2(t+) ⊥ HΔy1(t)

Now we assume that a12 and a21 are not both equal to zero. We can have three cases.

Case (1): a12 ≠ 0 and a21 = 0. Balance then requires a22 = 1 (otherwise (a22 − 1)y2,t−1 ~ I(0) would contradict y2,t ~ I(1)), and this implies that

    Δy1,t+1 = (a11 − 1)y1,t + a12 y2,t + u1,t+1

and

    Δy2,t = u2,t

Thus

    ⟨Δy1,t+1, Δy2,t⟩ = E(Δy1,t+1 Δy2,t)
                     = (a11 − 1)E(y1,t u2,t) + a12 E(y2,t u2,t) + E(u1,t+1 u2,t)
                     = (a11 − 1)E(y1,t u2,t) + a12 Σ_{s=1}^{t} E(u2,t u2,s)
                     = (a11 − 1)E(y1,t u2,t) + a12 σ2²

Now, we note that, iterating on (2) and using y1,0 = 0, the independence of u2,t from variables dated t − 1 or earlier, and the diagonality of Σ,

    E(y1,t u2,t) = a11^t E(y1,0 u2,t) = 0

Thus

    ⟨Δy1,t+1, Δy2,t⟩ = a12 σ2² ≠ 0

but this is absurd, since IΔy1(t+) ⊥ HΔy2(t).
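The key covariance a12σ2² can be corroborated by simulation (our sketch, not in the paper; parameter values are arbitrary, with a21 = 0 and a22 = 1 as in this case):

```python
import numpy as np

rng = np.random.default_rng(7)
n, a11, a12, sigma1, sigma2 = 200_000, 0.5, 0.6, 1.0, 1.5

u1 = sigma1 * rng.standard_normal(n)
u2 = sigma2 * rng.standard_normal(n)

y1 = np.zeros(n)
y2 = np.cumsum(u2)                 # a21 = 0, a22 = 1: y2 is a random walk
for t in range(1, n):
    y1[t] = a11 * y1[t - 1] + a12 * y2[t - 1] + u1[t]

dy1 = np.diff(y1)
dy2 = np.diff(y2)                  # equals u2[1:]

# Sample analogue of <dy1_{t+1}, dy2_t>, which should be a12 * sigma2^2.
cov = np.mean(dy1[1:] * dy2[:-1])
print(abs(cov - a12 * sigma2**2) < 0.1)
```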

Case (2): a12 = 0 and a21 ≠ 0. By the same argument, with the roles of the two components exchanged (so that a11 = 1 and Δy1,t = u1,t), we have

    ⟨Δy2,t+1, Δy1,t⟩ = a21 σ1² ≠ 0

Again this is absurd, since IΔy2(t+) ⊥ HΔy1(t).

Case (3): a12 ≠ 0 and a21 ≠ 0. We note that

    [Δy1,t]   [(1 − a22 L) δ(L)    a12 L δ(L)      ] [u1,t]
    [Δy2,t] = [a21 L δ(L)          (1 − a11 L) δ(L)] [u2,t]

where

    δ(L) = (1 − L) / [(1 − a11 L)(1 − a22 L) − a12 a21 L²]

By Lemma 3.3, {y1,t} and {y2,t} are cointegrated, and hence the matrix

    A − I = [a11 − 1   a12   ]
            [a21     a22 − 1 ]

has rank 1. It follows that

    a12 a21 = (1 − a11)(1 − a22)

Thus

    δ(L) = (1 − L) / [(1 − a11 L)(1 − a22 L) − (1 − a11)(1 − a22)L²]
         = (1 − L) / [1 − (a11 + a22)L + (a11 + a22 − 1)L²]
         = (1 − L) / [(1 − L)(1 − (a11 + a22 − 1)L)]
         = 1 / (1 − γL)

where γ = a11 + a22 − 1. Since {y1,t} and {y2,t} are cointegrated, by Lemma 3.4 we have |γ| < 1, and hence

    δ(L) = 1 + γL + γ²L² + ⋯

Now, we can have two cases.
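Before treating the two cases, note that the factorization of the denominator of δ(L) can be checked mechanically. The snippet below (ours) verifies, for an arbitrary pair (a11, a22) and any a12, a21 satisfying the rank condition, that (1 − a11 L)(1 − a22 L) − a12 a21 L² = (1 − L)(1 − γL).

```python
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(5)
a11, a22 = rng.uniform(-0.5, 0.9, size=2)
a12 = 0.4
a21 = (1 - a11) * (1 - a22) / a12   # rank condition a12*a21 = (1-a11)(1-a22)
gamma = a11 + a22 - 1

# Polynomial coefficients in increasing powers of L.
lhs = P.polymul([1, -a11], [1, -a22]) - np.array([0, 0, a12 * a21])
rhs = P.polymul([1, -1], [1, -gamma])
print(np.allclose(lhs, rhs))        # True: the two polynomials coincide
```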


Case (a): γ = 0. In this case δ(L) = 1, so

    Δy1,t = u1,t − a22 u1,t−1 + a12 u2,t−1

and

    Δy2,t = u2,t − a11 u2,t−1 + a21 u1,t−1

Thus

    ⟨Δy1,t+1, Δy2,t⟩ = a12 σ2² ≠ 0

and

    ⟨Δy2,t+1, Δy1,t⟩ = a21 σ1² ≠ 0

but this is absurd, since IΔy1(t+) ⊥ HΔy2(t) and IΔy2(t+) ⊥ HΔy1(t).
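Before turning to the last case, these two inner products can again be corroborated by simulation (our sketch; the coefficients below satisfy γ = 0 and the rank condition, but are otherwise arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200_000
sigma1, sigma2 = 1.0, 1.0

# gamma = a11 + a22 - 1 = 0 and a12*a21 = (1 - a11)(1 - a22) = 0.24.
a11, a22, a12, a21 = 0.6, 0.4, 0.4, 0.6

u1 = sigma1 * rng.standard_normal(n)
u2 = sigma2 * rng.standard_normal(n)
dy1 = u1[1:] - a22 * u1[:-1] + a12 * u2[:-1]
dy2 = u2[1:] - a11 * u2[:-1] + a21 * u1[:-1]

cov12 = np.mean(dy1[1:] * dy2[:-1])   # <dy1_{t+1}, dy2_t> ~ a12 * sigma2^2
cov21 = np.mean(dy2[1:] * dy1[:-1])   # <dy2_{t+1}, dy1_t> ~ a21 * sigma1^2
print(abs(cov12 - a12 * sigma2**2) < 0.05,
      abs(cov21 - a21 * sigma1**2) < 0.05)
```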

and Case (b) 6= 0. In this case we have y1,t = u1,t + (a11 = u1,t + (a11

1)u1,t 1 + (a11 1)u1,t 2 + ...a12 u2,t 1 1 X X i i 1) u1,t 1 i + a12 u2,t 1 i i=0

1

+ a12 u2,t

2

+ ...

1

+ a21 u1,t

2

+ ...

i=0

and y2,t = u2,t + (a22 = u2,t + (a22

1)u2,t 1 + (a22 1)u2,t 2 + ...a21 u1,t 1 1 X X i i 1) u2,t 1 i + a21 u1,t 1 i i=0

i=0

Thus
= (a11

1)

2 1

1)

2 2

a 2 21

1

y2,t+1 , y1,t >= (a22

Now, we consider the system obtained by setting these two inner products equal to zero:

    [1 + (a22 − 1) γ/(1 − γ²)    (a11 − 1) γ/(1 − γ²)    ] [a12 σ2²]   [0]
    [(a22 − 1) γ/(1 − γ²)        1 + (a11 − 1) γ/(1 − γ²)] [a21 σ1²] = [0]

Since |γ| < 1, the determinant of the coefficient matrix is 1 + (a11 + a22 − 2) γ/(1 − γ²) = 1/(1 + γ) ≠ 0, so the system admits only the trivial solution a12 = a21 = 0, contradicting the assumption of Case (3).
