An application of hidden Markov models to asset allocation problems

14 downloads 307 Views 128KB Size Report
Models are applied to a discrete time asset allocation problem. For the commonly ... Key words: Hidden Markov models, asset allocation, portfolio selection.
Finance Stochast. 1, 229–238 (1997)

c Springer-Verlag 1997

An application of hidden Markov models to asset allocation problems? Robert J. Elliott1 , John van der Hoek2 1 Department 2 Department

of Mathematical Sciences, University of Alberta, Edmonton, Alberta, Canada T6G 2G1 of Applied Mathematics, University of Adelaide, Adelaide, South Australia 5005

Abstract. Filtering and parameter estimation techniques from hidden Markov Models are applied to a discrete time asset allocation problem. For the commonly used mean-variance utility explicit optimal strategies are obtained. Key words: Hidden Markov models, asset allocation, portfolio selection JEL classification: C13, E44, G2 Mathematics Subject Classification (1991): 90A09, 62P20

1. Introduction A typical problem faced by fund managers is to take an amount of capital and invest this in various assets, or asset classes, in an optimal way. To do this the fund manager must first develop a model for the evolution, or prediction, of the rates of returns on investments in each of these asset classes. A common procedure is to assert that these rates of return are driven by rates of return on some observable factors. Which factors are used to explain the evolution of rates of return of an asset class is often proprietary information for a particular fund manager. One of the criticisms that could be made of such a framework is the assumption that these rates of return can be adequately explained in terms of observable factors. In this paper we propose a model for the rates of return in terms of observable and non-observable factors; we present algorithms for the identification for this model, (using filtering and prediction techniques) and ? Robert Elliott wishes to thank SSHRC for its support. John van der Hoek wishes to thank the Department of Mathematical Sciences, University of Alberta, for its hospitality in August and September 1995, and SSHRC for its support. Manuscript received: December 1995 / final version received: August 1996

230

R.J. Elliott, J. van der Hoek

indicate how to apply this methodology to the (strategic) asset allocation problem using a mean-variance type utility criterion. Earlier work in filtering is presented in the classical work [4] of Liptser and Shiryayev. Previous work on parameter identification for linear, Gaussian models includes the paper by Anderson et al. [1]. However, this reference discusses only continuous state systems for which the dimension of the observations is the same as the dimension of the unobserved state. Parameter estimation for noisily observed discrete state Markov chains is fully treated in [2]. The present paper includes an extension of techniques from [2] to hybrid systems, that is, those with both continuous and discrete state spaces. 2. Mathematical model A bank typically keeps track of between 30 and 40 different factors. These include observables such as interest rates, inflation and industrial indices. The observable factors will be represented by xk ∈ R n . In addition we suppose there are unobserved factors which will be represented by the finite state Markov chain Xk ∈ S . We suppose Xk is not observed directly. Its states might represent in some approximate way the psychology of the market, or the actions of competitors. Therefore, we suppose the state space of the factors generating the rates of return on asset classes has a decomposition as a hybrid combination of Rn and a finite set Sb = {s1 , s2 , . . . , sN } for some N ∈ N. Note that if Sb = {s1 , s2 , . . . , sN } is any finite set, by considering the indicator functions Isi : Sb → {0, 1}, (Isi (s) = 1 if s = si and 0 otherwise), there is a canonical bijection of Sb with S = {e1 , e2 , . . . , eN }. Here, the ei = (0, 0, . . . , 1, . . . , 0); 1 ≤ i ≤ N are the canonical unit vectors in RN . The time index of state evolution will be the non-negative integers N = {0, 1, 2, . . .}. Let {xk }k ∈N be a process with state space Rn and {Xk }k ∈N be a process whose state space we can, without loss of generality, identify with S . Suppose (Ω, F , P ) is a probability space upon which {wk }k ∈N , {bk }k ∈N are independent, identically distributed (i.i.d.) sequences of Gaussian random variables, with zero means and non-singular covariance matrices Σ and Γ respectively; x0 is normally distributed, X0 is uniformly distributed, independently of each other and of the sequences {wk }k ∈N and {bk }k ∈N . Let {Fk }k ∈N be the complete filtration with Fk generated by {x0 , x1 , . . . , xk , X0 , X1 , . . . , Xk } and {Xk }k ∈N the complete filtration with Xk generated by {X0 , X1 , . . . , Xk }. The state {xk }k ∈N is assumed to satisfy xk +1 = Axk + wk +1

(1)

that is, {xk }k ∈N is a VAR(1) process, [5], and A is an n × n matrix. {Xk }k ∈N is assumed to be a X -Markov chain with state space S . The Markov property implies that   P Xk +1 = ej |Xk = P Xk +1 = ej |Xk . Write πji = P (Xk +1 = ej |Xk = ei ) and Π for the N × N matrix (πji ). Then

An application of hidden Markov models to asset allocation problems

  E Xk +1 |Xk

=

  E Xk +1 |Xk

=

ΠXk .

231

Define Mk +1 := Xk +1 − ΠXk . Then   E Mk +1 |Xk

=

  E Xk +1 − ΠXk |Xk

=

0

so Mk +1 is a martingale increment and Xk +1 = ΠXk + Mk +1

(2)

is the semimartingale representation of the Markov chain. This form of dynamics is developed in [2] and [3]. The process xk with dynamics given by (1) will be a model for the observed factors, and the process Xk , with dynamics given by (2), will model the unobserved factors. Suppose that there are m asset classes in which a fund manager can make investments. (For strategic asset allocation, the number of asset classes is usually between 5 and 10 and it includes general groups of assets such as bonds, manufacturing and energy securities.) Let rk ∈ Rm denote the vector of the rates of return on the m asset classes in the k th period for, k = 1, 2, 3, . . . . We shall assume that the model for the “true” rates of return, which apply between times k − 1 and k , is given by: rk = Cxk + HXk .

(3)

Here C and H are m ×n and m ×N matrices respectively. However, we shall assume that these rates are observed in Gaussian noise. This means that, if {yk }k ∈N is the sequence of our observations of the rates of return, then yk = Cxk + HXk + bk .

(4)

Let {Yk }k ∈N be the complete filtration with Yk generated by bk the conditional ex{y0 , y1 , . . . , yk , x0 , x1 , . . . , xk }, for k ∈ N, and denote by X pectation under P of Xk given Yk . The Markov chain Xk is noisily observed through (4) and so we have a variation of the Hidden Markov Model, HMM, discussed in [3]. Our first task will be to identify this model. This means that we shall provide best estimates for A, Π, C , H given the observations of {yk }k ∈N and {xk }k ∈N . These estimates will then be used to predict future rates of return. Finally we shall provide algorithms for allocating funds to the asset classes based on these predictions.

232

R.J. Elliott, J. van der Hoek

3. A measure change We first consider all processes initially defined in an ‘ideal’ probability space (Ω, F , P ); under a new probability P , to be defined, the dynamics (1), (2) and (4) will hold. Suppose that under P {yk }k ∈N is an i.i.d. N (0, Γ ) sequence of Gaussian random variables. For y ∈ Rm , set φ(y) = φΓ (y) =

1 1 1 exp[− y 0 Γ −1 y]. 2 (2π)n/2 | det Γ |1/2

(5)

With λ` := φ(y` − Cx` − HX` )/φ(y` ), write Λk =

k Y

` = 0, 1, . . .

λ` .

(6)

(7)

`=0

Define {Gk }k ∈N to be the complete filtration with terms generated by {x0 , . . . , xk , y0 , . . . , yk , X0 , . . . , Xk −1 }, and define P on (Ω, F ) by setting the restriction of the Radon-Nikodym derivative dP /d P to Gk equal to Λk . One can check, (see [3], page 61), that on (Ω, F ) under P , {bk }k ∈N are i.i.d. N (0, Γ ), where bk := yk − Cxk − HXk . Furthermore, by Bayes’ Theorem ([3], page 23)  E [Xk |Yk ] = E [Λk Xk |Yk ] E [Λk |Yk ] and E [Λk |Yk ] = hE [Λk Xk |Yk ], 1i, where 1 = (1, 1, . . . , 1). Here, h , i denotes the scalar product in R N . Write qk = E [Λk Xk |Yk ]. Then, by a simple modification of the arguments in [3]: qk +1

= E [Λk +1 Xk +1 |Yk +1 ] 1 = E [Λk φ(yk +1 − Cxk +1 − HXk +1 )Xk +1 |Yk +1 ] φ(yk +1 ) N X 1 = E [Λk φ(yk +1 − Cxk +1 − HXk +1 )Xk +1 hXk +1 , ei i|Yk +1 ] φ(yk +1 ) i =1

=

=

1 φ(yk +1 )

N X

φ(yk +1 − Cxk +1 − Hei )E [Λk hXk +1 , ei i|Yk +1 ]ei

i =1

N N i X φ(yk +1 − Cxk +1 − Hei ) h X E Λk hXk , ej i hΠXk , ei i|Yk ei φ(yk +1 ) i =1

j =1

An application of hidden Markov models to asset allocation problems

=

233

N X φ(yk +1 − Cxk +1 − Hei ) E [ΛhXk , ej i hΠXk , ei i|Yk ]ei φ(yk +1 )

i ,j =1

=

N X φ(yk +1 − Cxk +1 − Hei ) πij E [Λk hXk , ej i|Yk ]ei φ(yk +1 )

i ,j =1

=

N X φ(yk +1 − Cxk +1 − Hei ) πij hqk , ej iei . φ(yk +1 )

i ,j =1

Alternatively, for the i th component of qk qki = E [Λk hXk , ei i|Yk ],

i = 1, . . . , N

we have the recurrence relationship: qki +1 =

N X φ(yk +1 − Cxk +1 − Hei ) πij qkj , φ(yk +1 )

(8)

j =1

for i = 1, 2, . . . , N . If we define the normalized conditional expectation pk by pki = E [hXk , ei i|Yk ], then pki = qki

N hX

i = 1, . . . , N

qki

i−1

.

(9)

(10)

i =1

If the vector p0 is given, or estimated, then (8) can be initialized with q0

= E [Λ0 X0 |Y0 ]

 −1 E [Λ−1 0 Λ0 X0 |Yk ] E [Λ0 |Y0 ]  = p0 E [Λ−1 0 |Y0 ]

=

where E [Λ−1 0 |Yk ]

i φ(y0 ) |x0 , y0 φ(y0 − Cx0 − HX0 ) N X φ(y0 ) pi . φ(y0 − Cx0 − Hei ) 0 h

= =

E

(11)

i =1

This discussion leads to the following proposition: Proposition 1. For given A, Π, C , H in equations (1), (2), (4) the estimate of the H.M.M. is given by bk = E [Xk |Yk ] = pk X where the pk are determined by (10).



234

R.J. Elliott, J. van der Hoek

Now consider the issue of estimating the coefficient matrices A, C , Π, H . We estimate A by maximizing an appropriate log-likelihood function. As this is relatively standard procedure, we only outline the details. To this end suppose that under probability Q, {xk }k ∈N is an i.i.d. N (0, Σ) sequence of Gaussian random variables. Choose φ = φΣ and define γ`A ΓkA

φ(x` −Ax`−1 ) φ(x` ) Qk := `=1 γ`A .

:=

If {Hk }k ∈N is the complete filtration with Hk the sigma-algebra generated by {x0 , x1 , . . . , xk }, define Q A on (Ω, F ) by setting the Radon-Nikodym derivative dQ A /dQ, restricted to Hk , equal to ΓkA . Then {wk }k ∈N defined by wk +1 := xk +1 − Axk will, under Q A , be an i.i.d. sequence of N (0, Σ) Gaussian random variables. For b to maximimize some interim estimate A1 of A we choose A = A log

h dQ A

k i 1 X = − |H (x` − Ax`−1 )0 Σ −1 (x` − Ax`−1 ) + R k dQ A1 2 `=1

where R does not depend on A. This leads to k X

0 x` x`−1 =A

`=1

or b=A bk = A

k X

0 x`−1 x`−1

`=1

k X `=1

0 x` x`−1

k  X

0 x`−1 x`−1

−1

.

(12)

`=1

Pk 0 Here we have used the result that for k ≥ n, `=1 x`−1 x`−1 is, almost surely, invertible [4]. This will be our estimate for A, given observations of x0 , x1 , . . . , xk . We now provide analogous estimates for C . Again use the measure change, as in (5), (6), (7), but instead write P = P C , Λ = ΛC to indicate that we are focusing on the dependence of these variables of the matrix C . b to maximize Given C1 , an interim estimate for C , we choose C h i dP C |Y (13) E1 log k dP C1 (see [2], p. 36). Here E1 denotes expectation taken with respect to P C1 , given the observations of x0 , x1 , . . . , xk and y0 , y1 , . . . , yk . The expression in (13) is k n o 1X b E1 − (y` − Cx` − HX` )0 Γ −1 (y` − Cx` − HX` )|Yk + R 2 `=0

b does not depend on C . Using the notation hA, B i = where R matrices A, B , we can write the expression (13) in the form:

P ij

Aij Bij for

An application of hidden Markov models to asset allocation problems k n 1X  0 −1 E1 − y` Γ y` 2

235

−2hC , Γ −1 y` x`0 i − 2hH , Γ −1 y` X`0 i + hC , Γ −1 Cx` x`0 i

`=0

o  b +hH , Γ −1 HX` X`0 i + 2hC , Γ −1 HX` x`0 i |Yk + R.

The first order optimality condition for C yields: C

k X

x` x`0 + H

`=0

k X

E1 [X` |Yk ]x`0 =

`=0

k X

y` x`0 .

(14)

`=0

This implies that b =C bk = C

k hX

y` x`0 − H

`=0

k X

E1 [X` |Yk ]x`0

`=0

k ih X

x` x`0

i−1

(15)

`=0

Pk where we have again used the fact that `=0 x` x`0 is almost surely invertible if P bk we need k E1 [X` |Y` ]x 0 . k ≥ n. In order to compute C ` Pk Pk −1 `=0 0 Now `=1 E1 [X` |Y` ]x`0 = qk xk0 + `=1 E1 [X` |Y` ]x`−1 . Write :=

Tk

k −1 X

X` x`0

`=0

Tkrs

:=

k −1 X

hX` , er i x`s

`=0

where = ∈R . We need to compute: E1 [Tkrs |Yk ]. We proceed as in Elliott [2] using Theorem 5.3 etc. viz., x`0

(x`0 , x`2 , . . . , x`n )

n

E1 [Tkrs |Yk ]

=

E [ΛCk 1 Tkrs |Yk ]/E [ΛCk 1 |Yk ]



σk (Tkrs )/σk (1)

where σk (Hk ) := E [ΛCk 1 Hk |Yk ]. With Hk ≡ Tkrs , for Theorem 5.3 we have αk = hXk −1 , er i xks−1 , Then σk (Tkrs Xk )

=

N X

βk = δk = 0.

Γ j (yk ) hσk −1 (Tkrs−1 Xk −1 ), ej i πj

j =1

+Γ r (yk ) hσk −1 (Xk −1 , er i xks−1 πr σk (Xk )

=

N X

Γ j (yk ) hσk −1 (Xk −1 ), ej i πj

j =1

where πj = Πej , We then have

Γ j (yk ) is given by φ(yk − Cxk − Hej )/φ(yk ), with φ as in (5).

236

R.J. Elliott, J. van der Hoek

σk (Tkrs )

=

hσk (Tkrs Xk ), 1i

σk (1)

=

hσk (Xk ), 1i

as usual. Summarizing the results so far we can state the following: b and C b , given obserProposition 2. Given Π, H the log-likelihood estimates A vations x0 , x1 , . . . , xk and y0 , y1 , . . . , yk , are determined by (12) and (15).

 Once C has been estimated, we can estimate Π, H using the HMM methodology of [3]. In fact setting zk = yk − Cxk we can estimate Π, H with the observations (16) zk = HXk + bk using the procedures of [3], pages 68-70. Thus, estimates of the model parameters can be made in the order: A, C , Π, H , and then updated in this order given new observations. An issue for further investigation is the estimation of N . We expect that estimates of N will be smaller for the case n ≥ 1. 4. Asset allocation Suppose we have a vector rk +1 ∈ Rm of rates of return given by (3). We wish to determine at time k a vector w ∈ Rm with w0 1 = 1, 1 = (1, 1, . . . , 1)0 which maximizes (17) J (w) = νE [w0 rk +1 |Yk ] − var [w0 rk +1 |Yk ], for some ν > 0. This expresses the utility of the rate of return of a portfolio in which wealth is distributed among the m asset classes in ratios expressed by w. bk = E [Xk |Yk ] we can estimate this objective function and compute the Writing X optimal w. Note such a choice w = wk makes {wk }k ∈N a predictable sequence of decision variables. In fact J (w)

=

νE [w0 rk +1 |Yk ] − E [(w0 rk +1 )2 |Yk ] +(E [w0 rk +1 |Yk ])2

=

rk +1 + w0b rk +1b r 0k +1 w νw0b −w0 E [rk +1 rk0 +1 |Yk ]w

(18)

bk , and X bk = E [Xk |Yk ]. Now by (1), (2) where b rk +1 = E [rk +1 |Yk ] = CAxk + H Π X E [rk +1 rk0 +1 |Yk ]

=

E [CAxk xk0 A0 C 0 + C wk +1 wk0 +1 C 0 + H ΠXk Xk0 Π 0 H 0 +HMk +1 Mk0+1 H 0 + 2CAxk Xk0 Π 0 H 0 |Yk ]

where we have used E [wk +1 |Yk ] = 0 ∈ R n ,

(19)

An application of hidden Markov models to asset allocation problems

237

and E [wk +1 Mk0+1 |Yk ] = 0 ∈ R n×N .

E [Mk +1 |Yk ] = 0 ∈ R N

Furthermore, Xk Xk0 = diag Xk , E [wk +1 wk0 +1 |Yk ] = D, and E [Mk +1 Mk0+1 |Yk ] = bk Π 0 , (see [3], page 20). Consequently, bk − Π diag X diag Π X E [rk +1 rk0 +1 |Yk ]

bk ]Π 0 H 0 CAxk xk0 A0 C 0 + CDC 0 + H Π[diag X bx0 Π 0 H 0 . bk ]H 0 + 2CAxk X +H [diag Π X

=

(20)

Noting rk +1 )2 (w0b

= =

rk +1b rk0 +1 w w 0b

bk0 Π 0 H 0 bk X CAxk xk0 A0 C 0 + H Π X bk0 Π 0 H 0 , +2CAxk X

then J (w) = w0 K − w0 V w where bk ] K = νb rk +1 = ν[CAxk + H Π X

(21)

bk )H 0 CDC 0 + H (diag Π X bk )(Xk − X bk )0 |Yk ]Π 0 H 0 . +H ΠE [(Xk − X

(22)

and V

=

We can assume that C has rank m, and CDC 0 is positive definite (as D is). The other terms in V are non-negative definite so V is positive definite and hence invertible. The maximum of J (w), subject to w0 · 1 = 1, is given by the first order (Kuhn-Tucker) conditions: K − 2V w + λ1 0

w ·1

=

0

(23)

=

1

(24)

for some Lagrange multipler λ. Solving (23) and (24) we obtain w=

1 −1 V [K + λ1], 2

(25)

with λ = [2 − K 0 V −1 1]/(10 V −1 1).

(26)

This solves the one-period asset allocation problem. In subsequent periods, we have the option to update estimates for Π, A, C , H . We can update K = Kk and bk = pk , (see (10)), V = Vk , (given by (21) and (22)), in terms of updates on X since bk )(Xk − X bk )0 |Yk ] = diag X bk − (X bk )2 . E [(Xk − X

238

R.J. Elliott, J. van der Hoek

References 1. Anderson, W.N. Jr., Kleindorfer, G.B., Kleindorfer, P.R., Woodroofe, M.B.: Consistent estimates of the parameters of a linear system. Ann. Math. Stat. 40, 2064–2075 (1969) 2. Elliott, R.J.: Exact adaptive filters for Markov chains observed in Gaussian noise. Automatica 30, 1399–1408 (1994) 3. Elliott, R.J., Aggoun, L., Moore, J.B.: Hidden Markov models: Estimation and control. Berlin, Heidelberg, New York: Springer 1995 4. Liptser, R.S., Shiryayev, A.N.: Statistics of random processes. Vols. I and II. Berlin, Heidelberg, New York: Springer 1977 5. L¨utkepohl, H.: Introduction to multiple time series analysis. Berlin, Heidelberg, New York: Springer 1991