IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 49, NO. 3, MARCH 2004
This seems to be a simpler function since it is in particular continuous. However, we were not able to exploit this fact for our analysis.
Risk Control Over Bankruptcy in Dynamic Portfolio Selection: A Generalized Mean-Variance Formulation
Shu-Shang Zhu, Duan Li, and Shou-Yang Wang
III. CONCLUSION
Portfolio optimization with stochastic market data is more realistic than standard models with constant coefficients. The formulation of the market condition as a continuous-time Markov chain makes the analysis simpler as in the case of a driving diffusion. For the utility functions treated here, the maximal portfolio value can be computed as a solution of a simple linear differential equation. More complicated is the case of benchmark optimization. It remains open whether a closed form solution can be derived in the general Markov modulated case.

REFERENCES
[1] S. Browne, “Reaching goals by deadline: Digital options and continuous-time active portfolio management,” Adv. Appl. Probab., vol. 31, pp. 551–577, 1999.
[2] M. H. A. Davis, Markov Models and Optimization. London, U.K.: Chapman and Hall, 1993.
[3] G. B. Di Masi, Y. M. Kabanov, and W. J. Runggaldier, “Mean-variance hedging of options on stocks with Markov volatilities,” Theory Probab. Appl., vol. 39, pp. 211–222, 1994.
[4] W. H. Fleming and D. Hernández-Hernández, “An optimal consumption model with stochastic volatility,” Finance Stoch., vol. 7, pp. 245–262, 2003.
[5] W. H. Fleming and R. W. Rishel, Deterministic and Stochastic Optimal Control. New York: Springer-Verlag, 1975.
[6] W. H. Fleming and H. M. Soner, Controlled Markov Processes and Viscosity Solutions. New York: Springer-Verlag, 1993.
[7] H. Föllmer and P. Leukert, “Quantile hedging,” Finance Stoch., vol. 3, pp. 251–273, 1999.
[8] J. P. Fouque, G. Papanicolaou, and K. R. Sircar, Derivatives in Financial Markets with Stochastic Volatility. Cambridge, U.K.: Cambridge Univ. Press, 2000.
[9] R. Frey and W. J. Runggaldier, “A nonlinear filtering approach to volatility estimation with a view toward high frequency data,” Int. J. Theoret. Appl. Finance, vol. 4, pp. 199–210, 2001.
[10] T. Goll and J. Kallsen, “Optimal portfolios for logarithmic utility,” Stoch. Proc. Appl., vol. 89, pp. 31–48, 2000.
[11] S. Heston, “A closed-form solution for options with stochastic volatility with applications to bond and currency options,” Rev. Financial Stud., vol. 6, pp. 327–343, 1993.
[12] T. S. Y. Ho and S. B. Lee, “Term structure movements and pricing interest contingent claims,” J. Finance, vol. 41, pp. 1011–1029, 1986.
[13] J. Jacod and A. N. Shiryaev, Limit Theorems for Stochastic Processes. New York: Springer-Verlag, 1987.
[14] R. Korn and H. Kraft, “A stochastic control approach to portfolio problems with stochastic interest rates,” SIAM J. Control Optim., vol. 40, pp. 1250–1269, 2001.
[15] M. Kulldorff, “Optimal control of favorable games with a time limit,” SIAM J. Control Optim., vol. 31, pp. 52–69, 1993.
[16] H. J. Kushner and P. G. Dupuis, Numerical Methods for Stochastic Control Problems in Continuous Time. New York: Springer-Verlag, 1992.
[17] A. Lioui and P. Poncet, “On optimal portfolio choice under stochastic interest rates,” J. Econ. Dyn. Control, vol. 25, pp. 1841–1865, 2001.
[18] R. C. Merton, “Optimum consumption and portfolio rules in a continuous-time model,” J. Econ. Theory, pp. 373–413, 1971.
[19] H. Pham and M. C. Quenez, “Optimal portfolio in partially observed stochastic volatility models,” Ann. Appl. Probab., vol. 11, pp. 210–238, 2001.
[20] O. Vasicek, “An equilibrium characterization of the term structure,” J. Financial Econ., vol. 5, pp. 177–188, 1977.
[21] T. Zariphopoulou, “Optimal investment and consumption models with nonlinear stock dynamics,” Math. Meth. Oper. Res., vol. 50, pp. 271–296, 1999.
[22] T. Zariphopoulou, “A solution approach to valuation with unhedgeable risks,” Finance Stoch., vol. 5, pp. 61–82, 2001.
[23] Q. Zhang, “Stock trading: An optimal selling rule,” SIAM J. Control Optim., vol. 40, pp. 64–87, 2001.
Abstract—For an investor to claim his wealth resulted from his multiperiod portfolio policy, he has to sustain a possibility of bankruptcy before reaching the end of an investment horizon. Risk control over bankruptcy is thus an indispensable ingredient of optimal dynamic portfolio selection. We propose in this note a generalized mean-variance model via which an optimal investment policy can be generated to help investors not only achieve an optimal return in the sense of a mean-variance tradeoff, but also have a good risk control over bankruptcy. One key difficulty in solving the proposed generalized mean-variance model is the nonseparability in the associated stochastic control problem in the sense of dynamic programming. A solution scheme using embedding is developed in this note to overcome this difficulty and to obtain an analytical optimal portfolio policy. Index Terms—Dynamic portfolio selection, dynamic programming, mean-variance formulation, stochastic control.
I. INTRODUCTION
Optimal dynamic portfolio selection is to redistribute successively in each time period an investor’s current wealth among a basket of securities in an optimal way in order to maximize a measure of the investor’s final wealth. The literature on dynamic portfolio selection has been dominated by results on maximizing expected utility functions of the terminal wealth [2]–[4], [6], [7], [10]–[14], [17]. Markowitz’s mean-variance model [9] has recently been extended in [8] to a multiperiod setting, where an analytical expression of the efficient frontier for multiperiod portfolio selection is derived. The continuous-time mean-variance formulation is studied in [19]. The dynamic mean-variance formulation in [8] and [19] enables an investor to specify a risk level that he can afford when he is seeking to maximize his expected terminal wealth, or to specify an expected terminal wealth that he would like to achieve when he is seeking to minimize the corresponding risk. It is easier and more direct for investors to provide this kind of subjective information than to construct a utility function in terms of the terminal wealth. The tradeoff between the expected return and the risk is clearly shown on the efficient frontier, which is most useful for an investor in making his investment decision.

Performing an optimal investment policy in accordance with a dynamic portfolio formulation does not eliminate the possibility that an investor goes bankrupt in a volatile financial market before he claims his wealth at the terminal stage. Although risk control over bankruptcy is crucial to a successful investment via dynamic portfolio selection, to our knowledge, no existing literature has addressed it in the context of dynamic portfolio selection. An integration of risk control over bankruptcy and dynamic portfolio selection is evidently needed.
Manuscript received October 15, 2002; revised November 15, 2003. Recommended by Guest Editor B. Pasik-Duncan. This work was supported in part by CAS, the National Science Foundation of China, and the Hong Kong RGC under Grant CUHK4392/99E.
S.-S. Zhu is with the Department of Management Science, School of Management, Fudan University, Shanghai 200433, China.
D. Li is with the Department of Systems Engineering and Engineering Management, Chinese University of Hong Kong, Shatin, N.T., Hong Kong (e-mail: [email protected]).
S.-Y. Wang is with the Institute of Systems Science, Academy of Mathematics and Systems Sciences, Chinese Academy of Sciences, Beijing 100080, China.
Digital Object Identifier 10.1109/TAC.2004.824474
0018-9286/04$20.00 © 2004 IEEE

In this note, we propose a model via which an optimal investment policy
can be generated to help investors not only achieve an optimal return in the sense of a mean-variance tradeoff, but also maintain good risk control over bankruptcy.

The note is organized as follows. In Section II, the phenomenon of bankruptcy in multiperiod portfolio selection is investigated, which highlights the importance of risk control over bankruptcy. In Section III, a generalized mean-variance model is proposed for risk control over bankruptcy in dynamic portfolio selection. One key difficulty in solving the proposed generalized mean-variance model is the nonseparability in the associated stochastic control problem in the sense of dynamic programming. A solution scheme adopting a Lagrangian dual formulation and using embedding is developed in this note to obtain an analytical optimal portfolio policy. In Section IV, a case study is presented to gain insights into the significance of risk control in dynamic portfolio selection. Conclusions are drawn in Section V.

II. POSSIBILITY OF BANKRUPTCY IN DYNAMIC PORTFOLIO SELECTION
As in [8], we consider a capital market with $(n+1)$ risky securities with random rates of return. An investor joins the market at time 0 with an initial wealth $x_0$. The investor can allocate his wealth among the $(n+1)$ assets. The wealth can be reallocated among the $(n+1)$ assets at the beginning of each of the following $(T-1)$ consecutive time periods. The rates of return of the risky securities at time period $t$ are denoted by the vector $e_t = [e_t^0, e_t^1, \ldots, e_t^n]'$, where $e_t^i$ is the random return for security $i$ at time period $t$. It is assumed that the vectors $e_t$, $t = 0, 1, \ldots, T-1$, are statistically independent and that $e_t$ has a known mean $E(e_t) = [E(e_t^0), E(e_t^1), \ldots, E(e_t^n)]'$ and a known covariance

\[
\mathrm{Cov}(e_t) =
\begin{bmatrix}
\sigma_{t,00} & \cdots & \sigma_{t,0n} \\
\vdots & \ddots & \vdots \\
\sigma_{t,0n} & \cdots & \sigma_{t,nn}
\end{bmatrix}.
\]

Let $x_t$ be the wealth of the investor at the beginning of the $t$th period, and let $u_t^i$, $i = 1, 2, \ldots, n$, be the amount invested in the $i$th risky asset at the beginning of the $t$th time period. The amount invested in the 0th risky asset at the beginning of the $t$th time period is equal to $x_t - \sum_{i=1}^{n} u_t^i$. The wealth dynamics can thus be given as

\[
x_{t+1} = \sum_{i=1}^{n} e_t^i u_t^i + \Big( x_t - \sum_{i=1}^{n} u_t^i \Big) e_t^0
        = e_t^0 x_t + P_t' u_t, \qquad t = 0, 1, 2, \ldots, T-1 \tag{1}
\]

where $u_t = [u_t^1, u_t^2, \ldots, u_t^n]'$ and $P_t = [P_t^1, P_t^2, \ldots, P_t^n]' = [(e_t^1 - e_t^0), (e_t^2 - e_t^0), \ldots, (e_t^n - e_t^0)]'$.
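The wealth dynamics (1) can be sketched in a few lines of Python. This is only an illustration: the function names and the numerical values are hypothetical, and the second function simply recomputes the same quantity from the direct bookkeeping (amounts held times gross returns) to confirm that the two forms in (1) agree.

```python
import numpy as np

def wealth_step(x_t, u_t, e_t):
    """One step of (1): x_{t+1} = e_t^0 x_t + P_t' u_t.

    e_t holds the returns (e_t^0, e_t^1, ..., e_t^n); u_t holds the amounts
    placed in securities 1..n. The remainder x_t - sum(u_t) stays in security 0.
    """
    e_t = np.asarray(e_t, dtype=float)
    u_t = np.asarray(u_t, dtype=float)
    excess = e_t[1:] - e_t[0]            # P_t^i = e_t^i - e_t^0
    return e_t[0] * x_t + excess @ u_t

def wealth_step_direct(x_t, u_t, e_t):
    """Same quantity from the first expression in (1)."""
    e_t = np.asarray(e_t, dtype=float)
    u_t = np.asarray(u_t, dtype=float)
    return e_t[1:] @ u_t + (x_t - u_t.sum()) * e_t[0]

# Hypothetical numbers: wealth 1, two securities besides the reference one.
x_next = wealth_step(1.0, [0.3, 0.2], [1.1, 1.2, 0.9])
```

Working with the excess returns $P_t$ keeps the budget constraint implicit, which is why only $u_t^1, \ldots, u_t^n$ appear as decision variables.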
The mean-variance formulation $(MV(w_T))$ for multiperiod portfolio selection [8] can be posed as follows when security 0 is taken as a reference:

\[
\max_{u} \; E(x_T) - w_T \mathrm{Var}(x_T) \qquad \text{s.t. (1)}
\]

where $w_T \in (0, +\infty)$ represents the tradeoff between the expected terminal wealth and the associated risk represented by the variance of the terminal wealth.

A bankruptcy occurs when the total wealth of an investor falls below a predefined “disaster” level in any intermediate or the final time period. It is assumed that when an investor is in bankruptcy, he is not able to pursue further investment due to his high liability and low credit. Mathematically, we denote the “disaster” level at period $t$ by $b_t$ and label the event of a bankruptcy at period $t$ as $BR_t$. The probability of $BR_t$ is

\[
P(BR_t) = P(x_t \le b_t,\; x_i > b_i,\; i = 1, \ldots, t-1), \qquad t = 1, \ldots, T.
\]

Obviously, $BR_i \cap BR_j = \emptyset$ for any $i \ne j$, $i, j \in \{1, \ldots, T\}$. Thus, the total probability of bankruptcy over the whole investment horizon is $\sum_{t=1}^{T} P(BR_t)$.

We give an example in the following to demonstrate a possibly high frequency of bankruptcies when adopting the optimal dynamic portfolio policy for the multiperiod mean-variance formulation $(MV(w_T))$.

Example 1: Consider the case study in [18, Ch. 7] by assuming a stationary multiperiod process. An investor has one unit of wealth at the beginning of the planning horizon. The investor is trying to find the best allocation of his wealth among three risky securities, A, B, and C. The expected returns for risky securities, A, B, and C are
\[
r = E(e) = E(e^A, e^B, e^C)' = (1.162, 1.246, 1.228)'.
\]

The covariance matrix of $e$ is

\[
\Sigma = \mathrm{Cov}(e) =
\begin{bmatrix}
0.0146 & 0.0187 & 0.0145 \\
0.0187 & 0.0854 & 0.0104 \\
0.0145 & 0.0104 & 0.0289
\end{bmatrix}.
\]

Suppose that the return vector $e$ is normally distributed as $N(r, \Sigma)$. Let $w_T = 0.2$, $T = 6$, $b_t = 0$, $t = 1, \ldots, 6$. The optimal dynamic portfolio policy of $(MV(w_T))$ can be derived for this example using the results from [8]. A Monte Carlo simulation of $SN$ (= 10 000)
samples was performed to examine this optimal dynamic portfolio policy. The corresponding results are shown in Table I, where $E(x_t)$ and $\mathrm{Var}(x_t)$ denote the derived theoretical expected value and variance of $x_t$, respectively, $\bar{E}(x_t)$ and $\overline{\mathrm{Var}}(x_t)$ denote the sample expected value and the sample variance of $x_t$, respectively, $BN_t$ denotes the number of samples in which the investor goes bankrupt at period $t$ in the simulation, and $\bar{P}(BR_t)$ is an estimate of $P(BR_t)$ defined by $BN_t / SN$.

From Table I, the optimal policy derived from the dynamic mean-variance portfolio selection model $(MV(w_T))$ becomes questionable for real implementation, due to its high probability of bankruptcy (22.32%). Bankruptcies occur mostly in the first three periods. A dramatically high rate of increase in the expected wealth in the early periods is always accompanied by a relatively high variance. The ratio of the variance to the expected wealth is higher in the earlier periods than in the later periods. The reason behind this phenomenon seems to be clear: a dynamic portfolio policy derived from maximizing a terminal objective may result in some very aggressive policies in the early stages, which are not penalized in the mathematical formulation $(MV(w_T))$. In real implementation, however, this neglect of risk control will be penalized by a high possibility of a miscarriage of an investment plan.

III. GENERALIZED MEAN-VARIANCE MODEL FOR DYNAMIC PORTFOLIO SELECTION WITH RISK CONTROL OVER BANKRUPTCY
From the discussion in Section II, it becomes clear that bankruptcy control must be considered in dynamic portfolio selection models. Let the “disaster” level at period $t$ be $b_t$, which is reasonably assumed to be less than $E(x_t)$. By the Tchebycheff inequality, we have

\[
P(x_t \le b_t) \le \frac{\mathrm{Var}(x_t)}{[E(x_t) - b_t]^2}.
\]

Because $P(x_t \le b_t) \ge P(BR_t) = P(x_t \le b_t,\; x_i > b_i,\; i = 1, \ldots, t-1)$, controlling the risk of bankruptcy at period $t$ can be achieved by setting a small value $\alpha_t$ to bound $\mathrm{Var}(x_t)/[E(x_t) - b_t]^2$.
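The bankruptcy counts $BN_t$ and the Tchebycheff bound above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the wealth paths are assumed to be already simulated and stored as an array with one row per sample and one column per period $t = 1, \ldots, T$, and no particular portfolio policy is implied.

```python
import numpy as np

def bankruptcy_counts(paths, b):
    """Return BN_t, t = 1..T: samples whose FIRST crossing x_t <= b_t is at t.

    paths: array of shape (SN, T) holding x_1..x_T for each sample.
    b:     length-T array of "disaster" levels b_1..b_T.
    """
    paths = np.asarray(paths, dtype=float)
    below = paths <= np.asarray(b, dtype=float)   # x_t <= b_t, per sample and period
    ever = below.any(axis=1)                      # samples that go bankrupt at all
    first = below.argmax(axis=1)                  # index of first True in each row
    return np.bincount(first[ever], minlength=paths.shape[1])

def tchebycheff_bound(mean, var, b):
    """Upper bound Var(x_t) / [E(x_t) - b_t]^2 on P(x_t <= b_t), for b_t < E(x_t)."""
    mean, var, b = (np.asarray(a, dtype=float) for a in (mean, var, b))
    if np.any(b >= mean):
        raise ValueError("the bound requires b_t < E(x_t)")
    return var / (mean - b) ** 2

# Hypothetical paths: four samples over three periods, disaster level 0.
paths = [[1.0, -1.0, 2.0],
         [0.5, 0.2, -0.1],
         [2.0, 3.0, 4.0],
         [-1.0, -2.0, -3.0]]
counts = bankruptcy_counts(paths, [0.0, 0.0, 0.0])
estimate = counts / len(paths)                    # BN_t / SN
```

Counting only the first crossing matches the definition of $BR_t$, so the per-period estimates add up to the total bankruptcy frequency.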
We thus propose in this note the following generalized mean-variance model $(GMV(w_T, \alpha))$ for dynamic portfolio selection with risk control over bankruptcy:

\[
\begin{aligned}
\max_{u} \quad & E(x_T) - w_T \mathrm{Var}(x_T) \\
\text{s.t.} \quad & x_{t+1} = e_t^0 x_t + P_t' u_t, \qquad t = 0, 1, \ldots, T-1 \\
& \mathrm{Var}(x_t) \le \alpha_t [E(x_t) - b_t]^2, \qquad t = 1, 2, \ldots, T-1
\end{aligned}
\]
where $w_T \in (0, +\infty)$ is a parameter that denotes the tradeoff between the expected return and the risk associated with the terminal wealth, and $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_{T-1})'$ is a vector in $\mathbb{R}_+^{T-1}$ whose components represent the levels of risk control over bankruptcy for the intermediate periods in a dynamic investment practice. Parameters $w_T$ and $\alpha$ reflect an investor’s attitude toward risk, and they should be assigned by the investor before searching for an optimal dynamic portfolio policy.

TABLE I BANKRUPTCY FREQUENCY IN EXAMPLE 1 WITH THE MEAN-VARIANCE FORMULATION

Notice that $E(e_t e_t') = \mathrm{Cov}(e_t) + E(e_t)E(e_t')$. It is reasonable to assume that $E(e_t e_t')$ is positive definite for all time periods, i.e.,

\[
E(e_t e_t') =
\begin{bmatrix}
E((e_t^0)^2) & E(e_t^0 e_t^1) & \cdots & E(e_t^0 e_t^n) \\
E(e_t^1 e_t^0) & E((e_t^1)^2) & \cdots & E(e_t^1 e_t^n) \\
\vdots & \vdots & \ddots & \vdots \\
E(e_t^n e_t^0) & E(e_t^n e_t^1) & \cdots & E((e_t^n)^2)
\end{bmatrix}
\succ 0.
\]

Thus, we have

\[
\begin{bmatrix}
E((e_t^0)^2) & E(e_t^0 P_t') \\
E(e_t^0 P_t) & E(P_t P_t')
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 & \cdots & 0 \\
-1 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
-1 & 0 & \cdots & 1
\end{bmatrix}
E(e_t e_t')
\begin{bmatrix}
1 & -1 & \cdots & -1 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
\succ 0.
\]

Furthermore, we have

\[
E(P_t P_t') \succ 0, \qquad t = 0, 1, \ldots, T-1
\]

and

\[
E((e_t^0)^2) - E(e_t^0 P_t')\, E^{-1}(P_t P_t')\, E(e_t^0 P_t) > 0, \qquad t = 0, 1, \ldots, T-1.
\]

Define $I^t$ to be the information set at the $t$th period,

\[
I^t = \{x_0, u_0, x_1, u_1, \ldots, x_{t-1}, u_{t-1}, x_t\}.
\]

A dynamic portfolio policy is an investment sequence

\[
\pi = \{\pi_0, \pi_1, \ldots, \pi_{T-1}\}
= \left\{
\begin{bmatrix} \pi_0^1 \\ \pi_0^2 \\ \vdots \\ \pi_0^n \end{bmatrix},
\begin{bmatrix} \pi_1^1 \\ \pi_1^2 \\ \vdots \\ \pi_1^n \end{bmatrix},
\ldots,
\begin{bmatrix} \pi_{T-1}^1 \\ \pi_{T-1}^2 \\ \vdots \\ \pi_{T-1}^n \end{bmatrix}
\right\}.
\]

A dynamic portfolio policy is admissible if, for every $t$, $\pi_t$ depends only on $I^t$, the information set at the beginning of the $t$th period. More specifically, $\pi_t$ maps $I^t$ into a portfolio decision in the $t$th period

\[
u_t =
\begin{bmatrix} u_t^1 \\ u_t^2 \\ \vdots \\ u_t^n \end{bmatrix}
= \pi_t(I^t) =
\begin{bmatrix} \pi_t^1(I^t) \\ \pi_t^2(I^t) \\ \vdots \\ \pi_t^n(I^t) \end{bmatrix}.
\]

A primal-dual method is adopted to solve $(GMV(w_T, \alpha))$. By introducing nonnegative Lagrangian multipliers $w_1, w_2, \ldots, w_{T-1}$, a Lagrangian maximization problem $(L(w, w_T, \alpha))$ of $(GMV(w_T, \alpha))$ is formed by attaching the constraints for risk control over bankruptcy to the objective function

\[
\begin{aligned}
\max_{u} \quad & E(x_T) - w_T \mathrm{Var}(x_T) - \sum_{t=1}^{T-1} w_t \left\{ \mathrm{Var}(x_t) - \alpha_t (E(x_t) - b_t)^2 \right\} \\
\text{s.t.} \quad & x_{t+1} = e_t^0 x_t + P_t' u_t, \qquad t = 0, 1, \ldots, T-1
\end{aligned}
\]

where $w = (w_1, w_2, \ldots, w_{T-1})'$. Denote $\Pi^*(w, w_T, \alpha)$ to be the set of optimal admissible policies of problem $(L(w, w_T, \alpha))$, i.e., $\Pi^*(w, w_T, \alpha) = \{\pi \mid \pi \text{ is an optimal admissible policy of } (L(w, w_T, \alpha))\}$.

The stochastic control problem $(L(w, w_T, \alpha))$ is nonseparable in the sense of dynamic programming. Note that a variance term involves a nonlinear function of an expectation term, $\mathrm{Var}(x_t) = E(x_t^2) - E^2(x_t)$. A key fact is that while the expectation operator satisfies the smoothing property $E[E(\cdot \mid I^j) \mid I^k] = E(\cdot \mid I^k)$, $\forall\, j > k$, any nonlinear function of the expectation does not. More specifically, we have $E[E^2(\cdot \mid I^j) \mid I^k] \ne E^2(\cdot \mid I^k)$, $\forall\, j > k$. Thus, $(L(w, w_T, \alpha))$ cannot be directly solved by dynamic programming or any other existing stochastic control method. A solution procedure using an embedding scheme will be developed in this note to seek an optimal dynamic portfolio policy for problem $(L(w, w_T, \alpha))$.

Consider the following auxiliary problem $(A(\lambda, w, w_T))$ of $(L(w, w_T, \alpha))$:

\[
\begin{aligned}
\max_{u} \quad & E\left\{ \sum_{t=1}^{T} (\lambda_t x_t - w_t x_t^2) \right\} \\
\text{s.t.} \quad & x_{t+1} = e_t^0 x_t + P_t' u_t, \qquad t = 0, 1, \ldots, T-1
\end{aligned}
\]

where $\lambda = (\lambda_1, \ldots, \lambda_T)'$.

Remark 1: In the following discussion, we assume that all $w_1, w_2, \ldots, w_{T-1} > 0$. If some $w_t = 0$ for $t \in \{1, 2, \ldots, T-1\}$, then $(L(w, w_T, \alpha))$ can be embedded into $(A(\lambda, w, w_T))$ with the corresponding $\lambda_t$ set to zero. The analysis remains the same.

Denote $\Pi_A^*(\lambda, w, w_T)$ to be the set of optimal admissible policies of problem $(A(\lambda, w, w_T))$, i.e., $\Pi_A^*(\lambda, w, w_T) = \{\pi \mid \pi \text{ is an optimal admissible policy of } (A(\lambda, w, w_T))\}$.
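The failure of the smoothing property for squared expectations, which is what makes $(L(w, w_T, \alpha))$ nonseparable, can be checked on a toy example. The sketch below is only an illustration: the two-point joint distribution and the function name are hypothetical, and the outer conditioning is taken to be trivial (no information) to keep the example short.

```python
import numpy as np

def smoothing_check(joint, x_vals):
    """Compare E[E(X|Y)] with E(X), and E[E(X|Y)^2] with E(X)^2.

    joint[y, x] is P(Y = y, X = x_vals[x]) for a discrete pair (Y, X).
    """
    joint = np.asarray(joint, dtype=float)
    x_vals = np.asarray(x_vals, dtype=float)
    p_y = joint.sum(axis=1)                      # marginal of Y
    cond_mean = (joint @ x_vals) / p_y           # E(X | Y = y)
    mean_x = joint.sum(axis=0) @ x_vals          # E(X) from the marginal of X
    outer_of_inner = p_y @ cond_mean             # E[E(X | Y)]
    outer_of_inner_sq = p_y @ cond_mean ** 2     # E[E(X | Y)^2]
    return outer_of_inner, mean_x, outer_of_inner_sq, mean_x ** 2

# X = 0 when Y = 0 and X = 2 when Y = 1, each with probability 1/2.
lin, mean_x, sq, mean_x_sq = smoothing_check([[0.5, 0.0], [0.0, 0.5]], [0.0, 2.0])
```

In this example the linear quantities coincide while the squared ones differ, which is why the term $E^2(x_t)$ cannot be propagated backward stage by stage.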
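Consider the objective function of $(L(w, w_T, \alpha))$

\[
\begin{aligned}
U\big(E(x_1), E(x_1^2), \ldots, E(x_T), E(x_T^2)\big)
&= E(x_T) - w_T \mathrm{Var}(x_T) - \sum_{t=1}^{T-1} w_t \left\{ \mathrm{Var}(x_t) - \alpha_t (E(x_t) - b_t)^2 \right\} \\
&= E(x_T) - \sum_{t=1}^{T-1} 2 w_t \alpha_t b_t E(x_t) + w_T E^2(x_T) + \sum_{t=1}^{T-1} w_t (1 + \alpha_t) E^2(x_t) \\
&\quad - w_T E(x_T^2) - \sum_{t=1}^{T-1} w_t E(x_t^2) + \sum_{t=1}^{T-1} w_t \alpha_t b_t^2.
\end{aligned}
\tag{2}
\]

It is obvious that $U$ is a convex function of $E(x_1^2), \ldots, E(x_T^2)$ and $E(x_1), \ldots, E(x_T)$. Denote

\[
d(\pi) = \left. \left[ \frac{\partial U}{\partial E(x_1)}, \ldots, \frac{\partial U}{\partial E(x_{T-1})}, \frac{\partial U}{\partial E(x_T)} \right]' \right|_{\pi}
\]

where

\[
\left. \frac{\partial U}{\partial E(x_t)} \right|_{\pi} = -2 w_t \alpha_t b_t + 2 w_t (1 + \alpha_t) E(x_t)\big|_{\pi}, \qquad t = 1, 2, \ldots, T-1 \tag{3}
\]

\[
\left. \frac{\partial U}{\partial E(x_T)} \right|_{\pi} = 1 + 2 w_T E(x_T)\big|_{\pi}. \tag{4}
\]

Theorem 1: For any $\pi^* \in \Pi^*(w, w_T, \alpha)$, $\pi^* \in \Pi_A^*(d(\pi^*), w, w_T)$.

Proof: By contradiction, assume that $\pi^* \notin \Pi_A^*(d(\pi^*), w, w_T)$, i.e., there exists an admissible policy $\pi$ such that

\[
[-w', -w_T, d(\pi^*)']
\left.
\begin{bmatrix} E(x_1^2) \\ \vdots \\ E(x_T^2) \\ E(x_1) \\ \vdots \\ E(x_T) \end{bmatrix}
\right|_{\pi}
>
[-w', -w_T, d(\pi^*)']
\left.
\begin{bmatrix} E(x_1^2) \\ \vdots \\ E(x_T^2) \\ E(x_1) \\ \vdots \\ E(x_T) \end{bmatrix}
\right|_{\pi^*}.
\tag{5}
\]

Since $U$ is a convex function of $E(x_1^2), \ldots, E(x_T^2)$ and $E(x_1), \ldots, E(x_T)$, it follows that

\[
\begin{aligned}
U\big(E(x_1), E(x_1^2), \ldots, E(x_T), E(x_T^2)\big)\big|_{\pi}
\ge{}& U\big(E(x_1), E(x_1^2), \ldots, E(x_T), E(x_T^2)\big)\big|_{\pi^*} \\
&+ [-w', -w_T, d(\pi^*)']
\left(
\left.
\begin{bmatrix} E(x_1^2) \\ \vdots \\ E(x_T^2) \\ E(x_1) \\ \vdots \\ E(x_T) \end{bmatrix}
\right|_{\pi}
-
\left.
\begin{bmatrix} E(x_1^2) \\ \vdots \\ E(x_T^2) \\ E(x_1) \\ \vdots \\ E(x_T) \end{bmatrix}
\right|_{\pi^*}
\right)
\end{aligned}
\]

where the fact that $\partial U / \partial E(x_t^2) = -w_t$, $t = 1, 2, \ldots, T-1, T$, for any $\pi$ is used. Combining the previous inequality and (5) yields

\[
U\big(E(x_1), E(x_1^2), \ldots, E(x_T), E(x_T^2)\big)\big|_{\pi}
> U\big(E(x_1), E(x_1^2), \ldots, E(x_T), E(x_T^2)\big)\big|_{\pi^*}.
\]

This contradicts the assumption that $\pi^* \in \Pi^*(w, w_T, \alpha)$. The proof is completed.

Theorem 1 implies that $\Pi^*(w, w_T, \alpha) \subseteq \bigcup_{\lambda} \Pi_A^*(\lambda, w, w_T)$. It is clear now that the optimal portfolio policy of $(L(w, w_T, \alpha))$ can be generated by solving a tractable auxiliary problem $(A(\lambda, w, w_T))$. This embedding has been validated by characterizing the relationship between $(L(w, w_T, \alpha))$ and $(A(\lambda, w, w_T))$. Prominent features of problem $(A(\lambda, w, w_T))$ are its separability in the sense of dynamic programming and its linear-quadratic structure.

Denote for $t = 0, 1, \ldots, T-1$

\[
\begin{aligned}
A_t^1 &= E(e_t^0) - E(P_t')\, E^{-1}(P_t P_t')\, E(e_t^0 P_t) \\
A_t^2 &= E((e_t^0)^2) - E(e_t^0 P_t')\, E^{-1}(P_t P_t')\, E(e_t^0 P_t) \\
B_t &= E(P_t')\, E^{-1}(P_t P_t')\, E(P_t) \\
K_t &= E^{-1}(P_t P_t')\, E(e_t^0 P_t) \\
\Gamma_t &= \tfrac{1}{2} E^{-1}(P_t P_t')\, E(P_t).
\end{aligned}
\]

Let $\tilde{\lambda}_T = \lambda_T$ and $\tilde{w}_T = w_T$. We further introduce the following two recursions and one notation:

\[
\begin{aligned}
\tilde{\lambda}_t &= \lambda_t + \tilde{\lambda}_{t+1} A_t^1, \qquad t = 1, \ldots, T-1 \\
\tilde{w}_t &= w_t + \tilde{w}_{t+1} A_t^2, \qquad t = 1, \ldots, T-1 \\
\gamma_t &= \frac{\tilde{\lambda}_{t+1}}{\tilde{w}_{t+1}}, \qquad t = 0, 1, \ldots, T-1.
\end{aligned}
\]

The optimal portfolio policy of $(A(\lambda, w, w_T))$ is derived analytically in the appendix of this note by using dynamic programming. The result is summarized as follows. The portfolio policy at time period $t$ takes the following form:

\[
u_t^*(x_t, \gamma_t) = -K_t x_t + \Gamma_t \gamma_t, \qquad t = 0, 1, \ldots, T-1. \tag{6}
\]

The wealth dynamics under portfolio policy $u_t^*(x_t, \gamma_t)$ is given by

\[
x_{t+1} = (e_t^0 - P_t' K_t)\, x_t + P_t' \Gamma_t \gamma_t, \qquad t = 0, 1, \ldots, T-1. \tag{7}
\]

Taking the expectation on both sides of (7) and noticing the statistical independence between $(e_t^0, P_t)$ and $x_t$, we have

\[
\begin{aligned}
E[x_{t+1}(\lambda, w, w_T)]
&= \left[ E(e_t^0) - E(P_t')\, E^{-1}(P_t P_t')\, E(e_t^0 P_t) \right] E[x_t(\lambda, w, w_T)]
 + E(P_t')\, E^{-1}(P_t P_t')\, E(P_t)\, \frac{\gamma_t}{2} \\
&= A_t^1 E[x_t(\lambda, w, w_T)] + \frac{B_t \gamma_t}{2}, \qquad t = 0, 1, \ldots, T-1.
\end{aligned}
\tag{8}
\]

Squaring both sides of (7) yields

\[
x_{t+1}^2 = \left[ (e_t^0)^2 - 2 e_t^0 P_t' K_t + K_t' P_t P_t' K_t \right] x_t^2
+ 2 (e_t^0 - P_t' K_t)\, x_t\, P_t' \Gamma_t \gamma_t
+ \gamma_t \Gamma_t' P_t P_t' \Gamma_t \gamma_t, \qquad t = 0, 1, \ldots, T-1. \tag{9}
\]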
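Taking the expectation on both sides of (9) and then simplifying yields the following recursive relation for the expected value of the squared wealth under policy $u_t^*(x_t, \gamma_t)$ for $t = 0, 1, \ldots, T-1$:

\[
\begin{aligned}
E[x_{t+1}^2(\lambda, w, w_T)]
&= \left[ E((e_t^0)^2) - E(e_t^0 P_t')\, E^{-1}(P_t P_t')\, E(e_t^0 P_t) \right] E[x_t^2(\lambda, w, w_T)]
 + E(P_t')\, E^{-1}(P_t P_t')\, E(P_t)\, \frac{\gamma_t^2}{4} \\
&= A_t^2 E[x_t^2(\lambda, w, w_T)] + \frac{1}{4} B_t \gamma_t^2.
\end{aligned}
\tag{10}
\]

Theorem 2: Suppose $\pi^* \in \Pi_A^*(\lambda^*, w, w_T)$. A necessary condition for $\pi^* \in \Pi^*(w, w_T, \alpha)$ is

\[
\lambda^* = d(\pi^*). \tag{11}
\]

Proof: For given $w$ and $w_T$, the solution set of $(A(\lambda, w, w_T))$ can be parameterized by $\lambda$. Since $\Pi^*(w, w_T, \alpha) \subseteq \bigcup_{\lambda} \Pi_A^*(\lambda, w, w_T)$, problem $(L(w, w_T, \alpha))$ can be reduced to the following equivalent maximization problem:

\[
\max_{\lambda} \; U\big(E[x_1(\lambda, w, w_T)], E[x_1^2(\lambda, w, w_T)], \ldots, E[x_T(\lambda, w, w_T)], E[x_T^2(\lambda, w, w_T)]\big) \tag{12}
\]

where $E[x_t(\lambda, w, w_T)]$ and $E[x_t^2(\lambda, w, w_T)]$ $(t = 1, \ldots, T)$ satisfy (8) and (10). From (2), a first-order necessary condition for an optimal solution $\lambda^*$ to (12) is

\[
\begin{aligned}
&\left[ 1 + 2 w_T E(x_T)\big|_{\pi^*} \right] \frac{\partial E[x_T(\lambda^*, w, w_T)]}{\partial \lambda_k}
+ \sum_{t=1}^{T-1} \left[ -2 w_t \alpha_t b_t + 2 w_t (1 + \alpha_t) E(x_t)\big|_{\pi^*} \right] \frac{\partial E[x_t(\lambda^*, w, w_T)]}{\partial \lambda_k} \\
&\qquad - \sum_{t=1}^{T} w_t \frac{\partial E[x_t^2(\lambda^*, w, w_T)]}{\partial \lambda_k} = 0, \qquad k = 1, \ldots, T.
\end{aligned}
\tag{13}
\]

On the other hand, when $\pi^* \in \Pi_A^*(\lambda^*, w, w_T)$, we have from [16]

\[
\sum_{t=1}^{T} \lambda_t^* \frac{\partial E[x_t(\lambda^*, w, w_T)]}{\partial \lambda_k}
- \sum_{t=1}^{T} w_t \frac{\partial E[x_t^2(\lambda^*, w, w_T)]}{\partial \lambda_k} = 0, \qquad k = 1, \ldots, T. \tag{14}
\]

Combining (13) and (14) yields

\[
\left[ \lambda_T^* - 1 - 2 w_T E(x_T)\big|_{\pi^*} \right] \frac{\partial E[x_T(\lambda^*, w, w_T)]}{\partial \lambda_k}
+ \sum_{t=1}^{T-1} \left[ \lambda_t^* + 2 w_t \alpha_t b_t - 2 w_t (1 + \alpha_t) E(x_t)\big|_{\pi^*} \right] \frac{\partial E[x_t(\lambda^*, w, w_T)]}{\partial \lambda_k} = 0, \qquad k = 1, \ldots, T.
\]

To complete the proof, we only need to show that the following vectors:

\[
\left[ \frac{\partial E[x_1(\lambda^*, w, w_T)]}{\partial \lambda_k}, \frac{\partial E[x_2(\lambda^*, w, w_T)]}{\partial \lambda_k}, \ldots, \frac{\partial E[x_T(\lambda^*, w, w_T)]}{\partial \lambda_k} \right]', \qquad k = 1, \ldots, T
\]

are linearly independent. This can be verified from (15). Note in (15) that

\[
a_{ij} = \frac{\partial E[x_i(\lambda^*, w, w_T)]}{\partial \lambda_j}, \qquad i, j = 1, \ldots, T.
\]

Equation (8) is used in deriving (15). Multiplying row $t$ by $A_t^1$ and subtracting it from row $t+1$ recursively for $t = T-1, T-2, \ldots, 2, 1$ yields (15).

The solution to (8) is

\[
E[x_t(\lambda, w, w_T)] = \prod_{k=0}^{t-1} A_k^1\, x_0 + \sum_{k=0}^{t-1} \left( \prod_{s=k+1}^{t-1} A_s^1 \right) \frac{B_k \gamma_k}{2} \tag{16}
\]

where

\[
\gamma_t = \frac{\lambda_{t+1} + \lambda_{t+2} A_{t+1}^1 + \cdots + \lambda_T \prod_{i=t+1}^{T-1} A_i^1}{w_{t+1} + w_{t+2} A_{t+1}^2 + \cdots + w_T \prod_{i=t+1}^{T-1} A_i^2}, \qquad t = 0, 1, \ldots, T-1 \tag{17}
\]

and $\prod_{s=t}^{t-1} A_s^1$ is defined to be equal to 1. Since, for $t = 1, \ldots, T$, $E[x_t(\lambda, w, w_T)]$ is a linear function of $\lambda$, we can express $E[x_t(\lambda, w, w_T)]$ in the following way:

\[
E[x_t(\lambda, w, w_T)] = \prod_{k=0}^{t-1} A_k^1\, x_0
+ \left[ \frac{\partial E[x_t(\lambda, w, w_T)]}{\partial \lambda_1}, \ldots, \frac{\partial E[x_t(\lambda, w, w_T)]}{\partial \lambda_T} \right]
\begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_T \end{bmatrix},
\qquad t = 1, \ldots, T \tag{18}
\]

where the partial derivatives $\partial E[x_t(\lambda, w, w_T)] / \partial \lambda_k$, $k = 1, 2, \ldots, T$, can be obtained using (16) and (17). Denote

\[
\Psi = \mathrm{diag}\big( 2 w_1 (1 + \alpha_1),\; 2 w_2 (1 + \alpha_2),\; \ldots,\; 2 w_{T-1} (1 + \alpha_{T-1}),\; 2 w_T \big)
\]

and

\[
\Lambda =
\begin{bmatrix}
\dfrac{\partial E[x_1(\lambda, w, w_T)]}{\partial \lambda_1} & \cdots & \dfrac{\partial E[x_1(\lambda, w, w_T)]}{\partial \lambda_T} \\
\dfrac{\partial E[x_2(\lambda, w, w_T)]}{\partial \lambda_1} & \cdots & \dfrac{\partial E[x_2(\lambda, w, w_T)]}{\partial \lambda_T} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial E[x_T(\lambda, w, w_T)]}{\partial \lambda_1} & \cdots & \dfrac{\partial E[x_T(\lambda, w, w_T)]}{\partial \lambda_T}
\end{bmatrix}.
\]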
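The coefficients $A_t^1$, $A_t^2$, $B_t$, $K_t$, $\Gamma_t$ and the moment recursions (8) and (10) can be evaluated numerically. The sketch below is a minimal illustration under stated assumptions: the market is stationary, the function names are hypothetical, and security A of Example 1 is assumed to play the role of the reference security 0.

```python
import numpy as np

def policy_coefficients(r, sigma):
    """Return (A1, A2, B, K, Gamma) for a stationary market with
    E(e) = r and Cov(e) = sigma, security 0 being the reference."""
    r = np.asarray(r, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    second_moment = sigma + np.outer(r, r)       # E(e e') = Cov(e) + E(e)E(e')
    n = r.size - 1
    # Linear map e -> (e^0, P), with P^i = e^i - e^0
    M = np.eye(n + 1)
    M[1:, 0] = -1.0
    m = M @ second_moment @ M.T                  # second moments of (e^0, P)
    e0_sq, e0_P, PP = m[0, 0], m[1:, 0], m[1:, 1:]
    mean_P = r[1:] - r[0]                        # E(P)
    K = np.linalg.solve(PP, e0_P)                # K = E^{-1}(PP') E(e^0 P)
    Gamma = 0.5 * np.linalg.solve(PP, mean_P)    # Gamma = (1/2) E^{-1}(PP') E(P)
    A1 = r[0] - mean_P @ K
    A2 = e0_sq - e0_P @ K
    B = 2.0 * (mean_P @ Gamma)
    return A1, A2, B, K, Gamma

def moment_recursion(x0, gammas, A1, A2, B):
    """Iterate (8) and (10); returns E[x_1..x_T] and E[x_1^2..x_T^2]."""
    mean, mean_sq = float(x0), float(x0) ** 2
    means, mean_sqs = [], []
    for g in gammas:                             # g = gamma_t, t = 0..T-1
        mean = A1 * mean + 0.5 * B * g
        mean_sq = A2 * mean_sq + 0.25 * B * g * g
        means.append(mean)
        mean_sqs.append(mean_sq)
    return np.array(means), np.array(mean_sqs)

# Data of Example 1 (A taken as security 0, an assumption of this sketch).
r = [1.162, 1.246, 1.228]
sigma = [[0.0146, 0.0187, 0.0145],
         [0.0187, 0.0854, 0.0104],
         [0.0145, 0.0104, 0.0289]]
A1, A2, B, K, Gamma = policy_coefficients(r, sigma)
```

The variance of $x_t$ then follows as $E(x_t^2) - E^2(x_t)$ from the two returned arrays, which is all that is needed to evaluate the bankruptcy constraints for a given sequence $\gamma_0, \ldots, \gamma_{T-1}$.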
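By (3), (4), (11), and (18), we have

\[
\begin{aligned}
\begin{bmatrix} \lambda_1^* \\ \lambda_2^* \\ \vdots \\ \lambda_T^* \end{bmatrix}
&= \Psi
\begin{bmatrix} E[x_1(\lambda^*, w, w_T)] \\ E[x_2(\lambda^*, w, w_T)] \\ \vdots \\ E[x_T(\lambda^*, w, w_T)] \end{bmatrix}
+
\begin{bmatrix} -2 w_1 \alpha_1 b_1 \\ \vdots \\ -2 w_{T-1} \alpha_{T-1} b_{T-1} \\ 1 \end{bmatrix} \\
&= \Psi
\begin{bmatrix} A_0^1 x_0 \\ A_1^1 A_0^1 x_0 \\ \vdots \\ \prod_{t=0}^{T-1} A_t^1\, x_0 \end{bmatrix}
+ \Psi \Lambda
\begin{bmatrix} \lambda_1^* \\ \lambda_2^* \\ \vdots \\ \lambda_T^* \end{bmatrix}
+
\begin{bmatrix} -2 w_1 \alpha_1 b_1 \\ \vdots \\ -2 w_{T-1} \alpha_{T-1} b_{T-1} \\ 1 \end{bmatrix}.
\end{aligned}
\tag{19}
\]

Notice that $\Psi$ is invertible. Reformulating (19) yields

\[
(\Lambda - \Psi^{-1})
\begin{bmatrix} \lambda_1^* \\ \lambda_2^* \\ \vdots \\ \lambda_T^* \end{bmatrix}
= -
\begin{bmatrix} A_0^1 x_0 \\ A_1^1 A_0^1 x_0 \\ \vdots \\ \prod_{t=0}^{T-1} A_t^1\, x_0 \end{bmatrix}
- \Psi^{-1}
\begin{bmatrix} -2 w_1 \alpha_1 b_1 \\ \vdots \\ -2 w_{T-1} \alpha_{T-1} b_{T-1} \\ 1 \end{bmatrix}.
\tag{20}
\]

Similar to the proof in (15), multiplying row $t$ of matrix $\Lambda - \Psi^{-1}$ by $A_t^1$ and subtracting it from row $t+1$ recursively for $t = T-1, T-2, \ldots, 2, 1$ yields matrix $\Phi$ in (21). It is obvious that $\mathrm{rank}(\Phi) \ge T-1$. Therefore, $\mathrm{rank}(\Lambda - \Psi^{-1}) \ge T-1$. If $\Lambda - \Psi^{-1}$ is nonsingular, a unique optimal $\lambda^*$ can be found by solving the set of linear equations in (20). Otherwise, when $\mathrm{rank}(\Lambda - \Psi^{-1}) = T-1$, the degree of freedom in $\lambda^*$ is one. Thus, $\lambda^*$ can be found efficiently by a line search method for the following optimization problem:

\[
\begin{aligned}
\max_{\lambda} \quad & U\big(E[x_1(\lambda, w, w_T)], E[x_1^2(\lambda, w, w_T)], \ldots, E[x_T(\lambda, w, w_T)], E[x_T^2(\lambda, w, w_T)]\big) \\
\text{s.t.} \quad & (20).
\end{aligned}
\]

Substituting the optimal value of $\lambda$ back to (6) yields the optimal portfolio policy for $(L(w, w_T, \alpha))$, which is the Lagrangian problem of $(GMV(w_T, \alpha))$. The remaining task is to search for the optimal parameter vector $w$ for the Lagrangian problem $(L(w, w_T, \alpha))$ such that an optimal solution of the primal problem $(GMV(w_T, \alpha))$ can be attained. A primal-dual solution method is used to solve $(GMV(w_T, \alpha))$. One can refer to [1], [5], and [15] for a detailed discussion of Lagrangian duality and the gradient method for solving the dual problem.

\[
\begin{aligned}
\begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1,T-1} & a_{1,T} \\
a_{21} & a_{22} & \cdots & a_{2,T-1} & a_{2,T} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{T-1,1} & a_{T-1,2} & \cdots & a_{T-1,T-1} & a_{T-1,T} \\
a_{T,1} & a_{T,2} & \cdots & a_{T,T-1} & a_{T,T}
\end{vmatrix}
&=
\begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1,T-1} & a_{1,T} \\
a_{21} & a_{22} & \cdots & a_{2,T-1} & a_{2,T} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{T-1,1} & a_{T-1,2} & \cdots & a_{T-1,T-1} & a_{T-1,T} \\
A_{T-1}^1 a_{T-1,1} & A_{T-1}^1 a_{T-1,2} & \cdots & A_{T-1}^1 a_{T-1,T-1} & A_{T-1}^1 a_{T-1,T} + \dfrac{B_{T-1}}{2 \tilde{w}_T}
\end{vmatrix} \\
&=
\begin{vmatrix}
\dfrac{B_0}{2 \tilde{w}_1} & \dfrac{B_0}{2 \tilde{w}_1} A_1^1 & \dfrac{B_0}{2 \tilde{w}_1} A_1^1 A_2^1 & \cdots & \dfrac{B_0}{2 \tilde{w}_1} \prod_{t=1}^{T-2} A_t^1 & \dfrac{B_0}{2 \tilde{w}_1} \prod_{t=1}^{T-1} A_t^1 \\
0 & \dfrac{B_1}{2 \tilde{w}_2} & \dfrac{B_1}{2 \tilde{w}_2} A_2^1 & \cdots & \dfrac{B_1}{2 \tilde{w}_2} \prod_{t=2}^{T-2} A_t^1 & \dfrac{B_1}{2 \tilde{w}_2} \prod_{t=2}^{T-1} A_t^1 \\
0 & 0 & \dfrac{B_2}{2 \tilde{w}_3} & \cdots & \dfrac{B_2}{2 \tilde{w}_3} \prod_{t=3}^{T-2} A_t^1 & \dfrac{B_2}{2 \tilde{w}_3} \prod_{t=3}^{T-1} A_t^1 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & \dfrac{B_{T-2}}{2 \tilde{w}_{T-1}} & \dfrac{B_{T-2}}{2 \tilde{w}_{T-1}} A_{T-1}^1 \\
0 & 0 & 0 & \cdots & 0 & \dfrac{B_{T-1}}{2 \tilde{w}_T}
\end{vmatrix} \\
&= \prod_{t=1}^{T} \frac{B_{t-1}}{2 \tilde{w}_t} > 0.
\end{aligned}
\tag{15}
\]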
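The linear system (20) can be sketched numerically in its equivalent fixed-point form $\lambda = \Psi(c + \Lambda \lambda) + v$, where $c$ collects the terms $\prod A_t^1 x_0$ and $v = (-2 w_1 \alpha_1 b_1, \ldots, -2 w_{T-1} \alpha_{T-1} b_{T-1}, 1)'$. The sketch below is only an illustration under stated assumptions: the market is stationary (constant $A^1$, $A^2$, $B$), the function names and the numerical values are hypothetical, $\Lambda$ is obtained by evaluating the affine map $\lambda \mapsto E[x_t]$ on unit vectors rather than from closed-form derivatives, and the nonsingular case is assumed.

```python
import numpy as np

def expected_wealth(lam, w, A1, A2, B, x0):
    """E[x_1..x_T] under policy (6) for multipliers lam = (lambda_1..lambda_T)
    and weights w = (w_1..w_T), using the recursions for lambda~, w~ and (8)."""
    lam = np.asarray(lam, dtype=float)
    w = np.asarray(w, dtype=float)
    T = w.size
    lam_tilde = np.empty(T)                 # index i holds period i + 1
    w_tilde = np.empty(T)
    lam_tilde[-1], w_tilde[-1] = lam[-1], w[-1]
    for i in range(T - 2, -1, -1):
        lam_tilde[i] = lam[i] + lam_tilde[i + 1] * A1
        w_tilde[i] = w[i] + w_tilde[i + 1] * A2
    gamma = lam_tilde / w_tilde             # gamma_t = lambda~_{t+1} / w~_{t+1}
    means = np.empty(T)
    m = float(x0)
    for t in range(T):
        m = A1 * m + 0.5 * B * gamma[t]     # recursion (8)
        means[t] = m
    return means

def solve_lambda(w, alpha, b, A1, A2, B, x0):
    """Solve lambda = Psi (c + Lambda lambda) + v, an equivalent form of (20)."""
    w = np.asarray(w, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    b = np.asarray(b, dtype=float)
    T = w.size
    c = expected_wealth(np.zeros(T), w, A1, A2, B, x0)
    # Column j of Lambda: change in E[x] when lambda is the j-th unit vector
    Lam = np.column_stack([expected_wealth(unit, w, A1, A2, B, x0) - c
                           for unit in np.eye(T)])
    psi = 2.0 * w * np.append(1.0 + alpha, 1.0)       # diagonal of Psi
    v = np.append(-2.0 * w[:-1] * alpha * b, 1.0)
    lam = np.linalg.solve(np.eye(T) - psi[:, None] * Lam, psi * c + v)
    return lam, psi, v

# Hypothetical two-period data.
w = [0.5, 0.2]
lam, psi, v = solve_lambda(w, alpha=[0.15], b=[0.0], A1=1.1, A2=1.25, B=0.2, x0=1.0)
```

The returned $\lambda$ satisfies only the necessary condition (11); when $\Lambda - \Psi^{-1}$ is singular, `np.linalg.solve` fails and the line search described above is needed instead.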
\[
\Phi =
\begin{bmatrix}
\dfrac{B_0}{2 \tilde{w}_1} - \dfrac{1}{2 w_1 (1 + \alpha_1)} & \dfrac{B_0}{2 \tilde{w}_1} A_1^1 & \dfrac{B_0}{2 \tilde{w}_1} A_1^1 A_2^1 & \cdots & \dfrac{B_0}{2 \tilde{w}_1} \prod_{t=1}^{T-2} A_t^1 & \dfrac{B_0}{2 \tilde{w}_1} \prod_{t=1}^{T-1} A_t^1 \\
\dfrac{A_1^1}{2 w_1 (1 + \alpha_1)} & \dfrac{B_1}{2 \tilde{w}_2} - \dfrac{1}{2 w_2 (1 + \alpha_2)} & \dfrac{B_1}{2 \tilde{w}_2} A_2^1 & \cdots & \dfrac{B_1}{2 \tilde{w}_2} \prod_{t=2}^{T-2} A_t^1 & \dfrac{B_1}{2 \tilde{w}_2} \prod_{t=2}^{T-1} A_t^1 \\
0 & \dfrac{A_2^1}{2 w_2 (1 + \alpha_2)} & \dfrac{B_2}{2 \tilde{w}_3} - \dfrac{1}{2 w_3 (1 + \alpha_3)} & \cdots & \dfrac{B_2}{2 \tilde{w}_3} \prod_{t=3}^{T-2} A_t^1 & \dfrac{B_2}{2 \tilde{w}_3} \prod_{t=3}^{T-1} A_t^1 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & \dfrac{A_{T-1}^1}{2 w_{T-1} (1 + \alpha_{T-1})} & \dfrac{B_{T-1}}{2 \tilde{w}_T} - \dfrac{1}{2 w_T}
\end{bmatrix}.
\tag{21}
\]

Consider the Lagrangian dual problem $(LD(w_T, \alpha))$ of $(GMV(w_T, \alpha))$

\[
\min_{w \ge 0} H(w)
\]

where the dual function $H(w)$ is the maximum value of $(L(w, w_T, \alpha))$ in (2) for a given $w$, i.e.,

\[
\begin{aligned}
H(w) = \max_{u} \quad & E(x_T) - w_T \mathrm{Var}(x_T) - \sum_{t=1}^{T-1} w_t \left\{ \mathrm{Var}(x_t) - \alpha_t [E(x_t) - b_t]^2 \right\} \\
\text{s.t.} \quad & x_{t+1} = e_t^0 x_t + P_t' u_t, \qquad t = 0, 1, \ldots, T-1.
\end{aligned}
\]

$H(w)$ is a convex function of $w$ [1]. Given $w = \bar{w}$, suppose that $\bar{\pi}$ is the optimal policy to $(L(\bar{w}, w_T, \alpha))$. Denote $g(\bar{w}, \bar{\pi}) = [g_1(\bar{w}, \bar{\pi}), g_2(\bar{w}, \bar{\pi}), \ldots, g_{T-1}(\bar{w}, \bar{\pi})]'$ where for $t = 1, 2, \ldots, T-1$

\[
g_t(\bar{w}, \bar{\pi}) =
\begin{cases}
\mathrm{Var}(x_t) - \alpha_t (E(x_t) - b_t)^2 \big|_{\bar{\pi}}, & \bar{w}_t > 0 \\
\max\left\{ 0,\; \mathrm{Var}(x_t) - \alpha_t (E(x_t) - b_t)^2 \big|_{\bar{\pi}} \right\}, & \bar{w}_t = 0.
\end{cases}
\]

If $g(\bar{w}, \bar{\pi}) = 0$, then $\bar{w}$ is an optimal solution to $(LD(w_T, \alpha))$, and $\bar{\pi}$ is the optimal portfolio policy of $(GMV(w_T, \alpha))$. Otherwise, $g(\bar{w}, \bar{\pi})$ is a feasible descent direction of $H(w)$ at $\bar{w}$.

Based on the previous discussion, a primal-dual iterative algorithm is proposed as follows.

0) Choose an initial point $w^0 \ge 0$ and a very small positive number $\varepsilon$. Let $k = 0$.
1) Solve $(L(w^k, w_T, \alpha))$. Denote its optimal policy as $\pi^k$. If all $|g_t(w^k, \pi^k)| \le \varepsilon$ for $t = 1, \ldots, T-1$, then stop. Otherwise, go to 2).
2) Solve
\[
\begin{aligned}
\min_{\theta} \quad & H\big(w^k + \theta\, g(w^k, \pi^k)\big) \\
\text{s.t.} \quad & w^k + \theta\, g(w^k, \pi^k) \ge 0, \quad \theta \ge 0.
\end{aligned}
\]
Let $\theta^k$ be the optimal solution, and let $w^{k+1} = w^k + \theta^k g(w^k, \pi^k)$. Replace $k$ by $k+1$, and go to 1).

Substituting the optimal value of $w$ and the corresponding optimal $\lambda$ back to (6) gives us the optimal portfolio policy of $(GMV(w_T, \alpha))$. The derived optimal multiperiod portfolio policy consists of two terms and exhibits a decomposition property between the investor’s risk attitude and his current wealth. The second term in $u_t^*$ is dependent on the investor’s risk attitude specified by $w_T$ and $\alpha$ and is independent of his current wealth. It can be calculated offline before the real investment process starts. The first term in $u_t^*$ is dependent on the current wealth and is independent of the investor’s risk attitude. It is calculated online at every time period when the current wealth is observed. Recall from (20) that the determination of the optimal value for $\lambda$ (thus, also the optimal value for $w$) relies on the initial value of $x_0$. Therefore, the second term in $u_t^*$ is dependent on the initial wealth $x_0$. More specifically, the optimal portfolio policy is of the following form:

\[
u_t^* = \pi_t(I^t) = -K_t x_t + \varphi_t(x_0)
\]

where $\varphi_t$ is a nonlinear function of $x_0$. One important conclusion is that the optimal portfolio policy for the stochastic control problem $(GMV(w_T, \alpha))$ is of a feedback form, but not Markovian. At each period $t$, the optimal control policy depends only on two pieces of information from the given information set $I^t$, the current state $x_t$ (current wealth) and the initial state $x_0$ (initial wealth), linear in the former and nonlinear in the latter.

Consider now the application of $(GMV(w_T, \alpha))$ to Example 1 with $w_T = 0.2$ and $\alpha = (0.12, 0.15, 0.15, 0.20, 0.20)'$. The corresponding optimal portfolio policy is derived for $(GMV(w_T, \alpha))$ by using the solution method described in this note. A Monte Carlo simulation with $SN$ (= 10 000) samples was performed. The corresponding results are shown in Table II. We can observe from Table II that the bankruptcy probability decreases significantly when compared to the results in Table I.

Of course, tradeoffs exist between adopting models $(MV(w_T))$ and $(GMV(w_T, \alpha))$. Fig. 1 shows the $E(x_T)$-$\mathrm{Var}(x_T)$ efficient frontiers of Example 1 generated by solving $(MV(w_T))$ and $(GMV(w_T, \alpha))$ with $\alpha = (0.12, 0.15, 0.15, 0.20, 0.20)'$, respectively. Although an investor needs to sacrifice a certain amount of the expected final wealth to gain a decrease in the probability of bankruptcy, an integration of dynamic portfolio selection with risk control over bankruptcy yields an assurance of stable growth of the wealth.

It is interesting to compare the optimal portfolio policies for $(MV(w_T))$ and $(GMV(w_T, \alpha))$. Examining the optimal policy
for $(MV(w_T))$ in [8, eqs. (22) and (23)] and the optimal policy for $(GMV(w_T, \alpha))$ in (6), it is clear that the first terms in both control laws are exactly the same. The information concerning the current wealth is fed back into the portfolio policy in the same manner. The difference exists only in the second term, which is related to the investor’s risk attitude.

TABLE II BANKRUPTCY FREQUENCY IN EXAMPLE 1 WITH THE GENERALIZED MEAN-VARIANCE FORMULATION

Fig. 1. Efficient frontiers of Example 1 with the mean-variance formulation and the generalized mean-variance formulation.

IV. CASE STUDY
To illustrate our proposed model and solution method, a case study in [9] is investigated again with certain modifications. The data set contains the yearly net returns of nine stocks from 1937 to 1954: American Tobacco; AT&T; United States Steel; General Motors; Atchison, Topeka and Santa Fe; Coca-Cola; Borden; Firestone; and Sharon Steel. We assume reasonably that the yearly returns of the nine stocks follow a stationary joint normal process as

\[
e_t \sim N(r, \Sigma), \qquad t = 1, 2, \ldots, T
\]

where $r$ and $\Sigma$, estimated from the data set, are given in (22) and (23), respectively. Assume that an investor invests his wealth in the nine stocks with an initial wealth of 1 unit. The planning horizon is eight years, and one time period is one year. The “disaster” level is 0 for all time periods. Tables III–VI show the simulation results for different policies generated by $(MV(w_T))$ and $(GMV(w_T, \alpha))$ with different $w_T$ and $\alpha$. The number of samples, $SN$, in the Monte Carlo simulation is 10 000 in all situations.

From these simulation results, it is easy to see that the portfolio policy generated by $(MV(w_T))$ with a high expected terminal wealth is always accompanied by a high risk of bankruptcy, especially in the first several periods. Because $(MV(w_T))$ does not directly consider risk control over the intermediate periods, the corresponding strategies in the early periods are aggressive (as evidenced by an unreasonably rapid growth of the expected wealth). The probability of bankruptcy is relatively high in the first several periods due to these aggressive strategies. If an investor tries to generate a moderate portfolio policy (without being overly aggressive) by increasing $w_T$, then the expected terminal wealth may become unacceptably low (see Table IV). Comparing the simulation results generated by $(GMV(w_T, \alpha))$ with those generated by $(MV(w_T))$, we can clearly observe that the probability of bankruptcy is substantially decreased. The aggressive behaviors in the first several periods can be successfully eliminated by setting a relatively small $\alpha$ in $(GMV(w_T, \alpha))$. The growth of the expected wealth under the policy generated by $(GMV(w_T, \alpha))$ is
r = E (e t ) = (1:0659; 1:0616; 1:1461; 1:1734; 1:1981; 1:0551; 1:1276; 1:1903; 1:1156) : 6 = Cov(et ) 0:0534 0:0215 0:0287 0:0490 0:0162 0:0322 0:0243 0:0400 0:0215 0:0147 0:0188 0:0244 0:0080 0:0100 0:0145 0:0254 0:0287 0:0188 0:0855 0:0626 0:0444 0:0133 0:0104 0:0686 0:0490 0:0244 0:0626 0:0955 0:0515 0:0290 0:0208 0:0900 = 0:0162 0:0080 0:0444 0:0515 0:1279 0:0128 0:0209 0:1015 0:0322 0:0100 0:0133 0:0290 0:0128 0:0413 0:0113 0:0296 0:0243 0:0145 0:0104 0:0208 0:0209 0:0113 0:0288 0:0291 0:0400 0:0254 0:0686 0:0900 0:1015 0:0296 0:0291 0:1467 0:0362 0:0208 0:0420 0:0366 0:0450 0:0217 0:0174 0:0528 0
(22)
0:0362 0:0208 0:0420 0:0366 0:0450 : 0:0217 0:0174 0:0528 0:0793
(23)
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 49, NO. 3, MARCH 2004
CASE STUDY—GMV (w
TABLE III CASE STUDY MV (w ) WITH w
= 3
TABLE IV CASE STUDY—MV (w ) WITH w
= 8
; ) WITH w
CASE STUDY—GMV (w
455
; ) WITH w
:
TABLE V
= 0 10 AND
= (0:10; 0:10; 0:10; 0:10; 0:10; 0:10; 0:10)
TABLE VI = 1 AND
= (0:10; 0:10; 0:10; 0:10; 0:08; 0:08; 0:08)
456
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 49, NO. 3, MARCH 2004
much more stable than that under the policy generated by (MV (wT )). Observing the simulation results, an investor will be able to adjust the values of wT and interactively in a decision-making process to achieve a portfolio policy to his satisfaction. The policy in Table VI could be a rational choice for an investor, as the expected final wealth is about ten times the initial wealth, and the probability of bankruptcy is low.
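The kind of Monte Carlo experiment behind the bankruptcy-frequency tables can be sketched as follows. This is a minimal illustration under assumptions, not the authors' simulation code: returns are drawn from $N(r, \Sigma)$ via a hand-written Cholesky factor, wealth follows $x_{t+1} = e_t^0 x_t + P_t' u_t$ with $P_t = e_t - e_t^0$ as in the Appendix, the riskless gross return `e0` is taken as a constant, and bankruptcy is treated as absorbing (a path stops once wealth reaches the disaster level). The function names, the two-asset parameters, and both policies are hypothetical.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular L with L L' = cov (cov symmetric positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def bankruptcy_frequency(policy, r, cov, e0, T, x0=1.0, b=0.0,
                         n_samples=10_000, seed=0):
    """Fraction of sample paths whose wealth first falls to or below the
    disaster level b in each period t = 1..T (returned as a list of length T)."""
    rng = random.Random(seed)
    L = cholesky(cov)
    n = len(r)
    first_hits = [0] * T
    for _ in range(n_samples):
        x = x0
        for t in range(T):
            z = [rng.gauss(0.0, 1.0) for _ in range(n)]
            # e = r + L z is one draw from N(r, cov)
            e = [r[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
            u = policy(t, x)  # amounts invested in the risky assets
            # wealth dynamics: x_{t+1} = e0 * x_t + (e_t - e0)' u_t
            x = e0 * x + sum((e[i] - e0) * u[i] for i in range(n))
            if x <= b:
                first_hits[t] += 1
                break  # assumption: bankruptcy is absorbing
    return [h / n_samples for h in first_hits]


# Hypothetical two-asset example (not the nine-stock data of the case study).
r = [1.15, 1.10]
cov = [[0.09, 0.02], [0.02, 0.04]]
riskless_only = lambda t, x: [0.0, 0.0]        # everything in the riskless asset
leveraged = lambda t, x: [2.0 * x, 1.0 * x]    # hypothetical aggressive policy
print(bankruptcy_frequency(riskless_only, r, cov, e0=1.04, T=4))  # [0.0, 0.0, 0.0, 0.0]
print(bankruptcy_frequency(leveraged, r, cov, e0=1.04, T=4))
```

Passing the policy as a function of the period and the current wealth mirrors the feedback form of the control laws discussed in the text, so an aggressive and a moderate policy can be compared on the same sample size and seed.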
V. CONCLUSION

Due to the volatility of financial markets, bankruptcy control is an indispensable issue in dynamic portfolio selection. This note has considered an integration of bankruptcy control and dynamic portfolio selection. We have proposed a generalized mean-variance formulation from which an optimal investment policy can be generated to help investors not only achieve an optimal return in the sense of a mean-variance tradeoff, but also maintain good risk control over bankruptcy. Notice that in the mean-variance formulation, the probability of bankruptcy can be reduced by increasing the weighting coefficient for the variance. This, however, is always accompanied by a heavy reduction in the expected final wealth. By incorporating control of the probability of bankruptcy in the generalized mean-variance formulation proposed in this note, the dynamic portfolio selection problem becomes one of seeking a balance among three objectives, which could lead to a more satisfactory tradeoff between the probability of bankruptcy and the expected value of the final wealth.

While traditional stochastic control theory concerns only the sole objective of minimizing an expected performance measure, variance control occurs naturally in dynamic portfolio selection problems with a mean-variance formulation and in many other applications. The celebrated dynamic programming method is the only universal solution scheme for achieving optimality in stochastic control problems. Dynamic programming, however, is applicable only to problems that satisfy the properties of separability and monotonicity. Variance control problems are not directly solvable by dynamic programming because of their nonseparability. In this respect, variance minimization is a notoriously difficult kind of stochastic control problem. Using an embedding scheme, a feedback optimal portfolio policy can be obtained for variance control problems via a parametric dynamic programming method, while the corresponding optimality condition for the parameter can be derived by examining the relationship between the primal and the auxiliary problems. The generalized mean-variance formulation proposed in this note for risk control over bankruptcy in discrete-time dynamic portfolio selection can be extended to continuous-time dynamic portfolio selection by imposing probability constraints at distinct time instants in the continuous time horizon.

APPENDIX
SOLUTION FOR (A($\lambda, w, w_T$)) USING DYNAMIC PROGRAMMING

Consider the following discrete-time stochastic control problem:

$$(A(\lambda, w, w_T)):\quad \max \sum_{t=1}^{T} E\left(-w_t x_t^2 + \lambda_t x_t\right) \quad \text{s.t.}\quad x_{t+1} = e_t^0 x_t + P_t' u_t, \qquad t = 0, 1, \ldots, T-1$$

where $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_T)'$, $w = (w_1, w_2, \ldots, w_{T-1})'$, and $w_T$ are given, $u_t = [u_t^1, u_t^2, \ldots, u_t^n]'$, and $P_t = [(e_t^1 - e_t^0), (e_t^2 - e_t^0), \ldots, (e_t^n - e_t^0)]'$ has known first- and second-order moments. Dynamic programming is used as a solution scheme to identify an optimal feedback policy for problem (A($\lambda, w, w_T$)).

The dynamic programming algorithm starts from stage $T-1$. At stage $T-1$, the optimization problem for a given $x_{T-1}$ is as follows:

$$
\begin{aligned}
\max_{u_{T-1}} J_{T-1}(x_{T-1}, u_{T-1})
&= \max_{u_{T-1}} E\left\{-w_{T-1}x_{T-1}^2 + \lambda_{T-1}x_{T-1} - w_T x_T^2 + \lambda_T x_T \mid x_{T-1}\right\} \\
&= \max_{u_{T-1}} \Big\{ -\left[w_{T-1} + w_T E\big((e_{T-1}^0)^2\big)\right] x_{T-1}^2 + \left[\lambda_{T-1} + \lambda_T E\big(e_{T-1}^0\big)\right] x_{T-1} \\
&\qquad + \left[\lambda_T E\big(P_{T-1}'\big) - 2 w_T x_{T-1} E\big(e_{T-1}^0 P_{T-1}'\big)\right] u_{T-1} - w_T u_{T-1}' E\big(P_{T-1}P_{T-1}'\big) u_{T-1} \Big\}.
\end{aligned}
\tag{24}
$$

The optimal $u_{T-1}$ for (24) can be obtained by solving $dJ_{T-1}(x_{T-1}, u_{T-1})/du_{T-1} = 0$:

$$u_{T-1}^* = E^{-1}\big(P_{T-1}P_{T-1}'\big)\left[\frac{\lambda_T}{2w_T} E\big(P_{T-1}\big) - E\big(e_{T-1}^0 P_{T-1}\big) x_{T-1}\right].$$

Substituting $u_{T-1}^*$ back into $J_{T-1}(x_{T-1}, u_{T-1})$ yields the optimal wealth-to-go at given $x_{T-1}$:

$$J_{T-1}^*(x_{T-1}) = -\tilde{w}_{T-1} x_{T-1}^2 + \tilde{\lambda}_{T-1} x_{T-1} + \nu_{T-1} E\big(P_{T-1}'\big) E^{-1}\big(P_{T-1}P_{T-1}'\big) E\big(P_{T-1}\big)$$

where

$$
\begin{aligned}
\tilde{w}_{T-1} &= w_{T-1} + w_T\left[E\big((e_{T-1}^0)^2\big) - E\big(e_{T-1}^0 P_{T-1}'\big) E^{-1}\big(P_{T-1}P_{T-1}'\big) E\big(e_{T-1}^0 P_{T-1}\big)\right] \\
\tilde{\lambda}_{T-1} &= \lambda_{T-1} + \lambda_T\left[E\big(e_{T-1}^0\big) - E\big(P_{T-1}'\big) E^{-1}\big(P_{T-1}P_{T-1}'\big) E\big(e_{T-1}^0 P_{T-1}\big)\right] \\
\nu_{T-1} &= \frac{\lambda_T^2}{4w_T}.
\end{aligned}
$$

Suppose that the wealth-to-go at stage $t$ is of the following form:

$$J_t^*(x_t) = -\tilde{w}_t x_t^2 + \tilde{\lambda}_t x_t + \Xi_t$$

where

$$
\begin{aligned}
\tilde{w}_t &= w_t + \tilde{w}_{t+1}\left[E\big((e_t^0)^2\big) - E\big(e_t^0 P_t'\big) E^{-1}\big(P_t P_t'\big) E\big(e_t^0 P_t\big)\right] \\
\tilde{\lambda}_t &= \lambda_t + \tilde{\lambda}_{t+1}\left[E\big(e_t^0\big) - E\big(P_t'\big) E^{-1}\big(P_t P_t'\big) E\big(e_t^0 P_t\big)\right] \\
\Xi_t &= \sum_{k=t}^{T-1} \nu_k E\big(P_k'\big) E^{-1}\big(P_k P_k'\big) E\big(P_k\big) \\
\nu_k &= \frac{\tilde{\lambda}_{k+1}^2}{4\tilde{w}_{k+1}}, \qquad k = t, t+1, \ldots, T-1
\end{aligned}
$$

with boundary conditions $\tilde{w}_T = w_T$ and $\tilde{\lambda}_T = \lambda_T$. Notice that this induction assumption holds for stage $T-1$.
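The backward recursion for the quadratic and linear coefficients of the wealth-to-go, and the resulting feedback rule, can be sketched numerically. The sketch below is an illustration under assumptions rather than the authors' implementation: it treats the special case of a single risky asset, so that $E(P_t P_t')$ is a scalar and no matrix inversion is needed, and it assumes the quadratic coefficients stay positive. The function and argument names are hypothetical.

```python
def backward_coefficients(w, lam, Ee0, Ee0sq, EP, EPP, Ee0P):
    """Backward recursion for the wealth-to-go coefficients (one risky asset).

    w, lam : lists of length T holding w_1..w_T and lambda_1..lambda_T.
    Ee0, Ee0sq, EP, EPP, Ee0P : lists of length T; entry t holds
        E(e0_t), E((e0_t)^2), E(P_t), E(P_t^2), E(e0_t * P_t) for t = 0..T-1.
    Returns (tw, tl); tw[t] and tl[t] are the quadratic and linear
    coefficients of the wealth-to-go at stage t, valid for t = 1..T.
    """
    T = len(w)
    tw = [0.0] * (T + 1)
    tl = [0.0] * (T + 1)
    tw[T], tl[T] = w[T - 1], lam[T - 1]        # boundary conditions at stage T
    for t in range(T, 1, -1):                  # build stage t-1 from stage t
        s = t - 1                              # moments of period t-1
        tw[s] = w[s - 1] + tw[t] * (Ee0sq[s] - Ee0P[s] ** 2 / EPP[s])
        tl[s] = lam[s - 1] + tl[t] * (Ee0[s] - EP[s] * Ee0P[s] / EPP[s])
    return tw, tl


def optimal_control(t, x, tw, tl, EP, EPP, Ee0P):
    """Feedback rule for period t = 0..T-1: amount held in the risky asset."""
    return (tl[t + 1] / (2.0 * tw[t + 1]) * EP[t] - Ee0P[t] * x) / EPP[t]


# Hypothetical stationary moments: riskless gross return 1.04, excess return
# with mean 0.08 and standard deviation 0.2, so E(P^2) = 0.08**2 + 0.2**2.
T = 3
Ee0, Ee0sq = [1.04] * T, [1.04 ** 2] * T
EP, EPP, Ee0P = [0.08] * T, [0.08 ** 2 + 0.2 ** 2] * T, [1.04 * 0.08] * T
tw, tl = backward_coefficients([0.1, 0.1, 0.5], [0.0, 0.0, 2.0], Ee0, Ee0sq, EP, EPP, Ee0P)
print(optimal_control(0, 1.0, tw, tl, EP, EPP, Ee0P))
```

Because the control at period $t$ needs only the stage-$(t+1)$ coefficients, the recursion can be run once offline and the feedback rule then evaluated along each simulated wealth path.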
At stage $t-1$, the optimization problem for a given $x_{t-1}$ is as follows:

$$
\begin{aligned}
\max_{u_{t-1}} J_{t-1}(x_{t-1}, u_{t-1})
&= \max_{u_{t-1}} E\left\{-w_{t-1}x_{t-1}^2 + \lambda_{t-1}x_{t-1} - \tilde{w}_t x_t^2 + \tilde{\lambda}_t x_t + \Xi_t \mid x_{t-1}\right\} \\
&= \max_{u_{t-1}} \Big\{ -\left[w_{t-1} + \tilde{w}_t E\big((e_{t-1}^0)^2\big)\right] x_{t-1}^2 + \left[\lambda_{t-1} + \tilde{\lambda}_t E\big(e_{t-1}^0\big)\right] x_{t-1} + \Xi_t \\
&\qquad + \left[\tilde{\lambda}_t E\big(P_{t-1}'\big) - 2\tilde{w}_t x_{t-1} E\big(e_{t-1}^0 P_{t-1}'\big)\right] u_{t-1} - \tilde{w}_t u_{t-1}' E\big(P_{t-1}P_{t-1}'\big) u_{t-1} \Big\}.
\end{aligned}
\tag{25}
$$

The optimal $u_{t-1}$ for (25) can be obtained by solving $dJ_{t-1}(x_{t-1}, u_{t-1})/du_{t-1} = 0$:

$$u_{t-1}^* = E^{-1}\big(P_{t-1}P_{t-1}'\big)\left[\frac{\tilde{\lambda}_t}{2\tilde{w}_t} E\big(P_{t-1}\big) - E\big(e_{t-1}^0 P_{t-1}\big) x_{t-1}\right].$$

Substituting $u_{t-1}^*$ back into $J_{t-1}(x_{t-1}, u_{t-1})$ yields the optimal wealth-to-go at given $x_{t-1}$:

$$J_{t-1}^*(x_{t-1}) = -\tilde{w}_{t-1} x_{t-1}^2 + \tilde{\lambda}_{t-1} x_{t-1} + \Xi_{t-1}$$

where

$$
\begin{aligned}
\tilde{w}_{t-1} &= w_{t-1} + \tilde{w}_t\left[E\big((e_{t-1}^0)^2\big) - E\big(e_{t-1}^0 P_{t-1}'\big) E^{-1}\big(P_{t-1}P_{t-1}'\big) E\big(e_{t-1}^0 P_{t-1}\big)\right] \\
\tilde{\lambda}_{t-1} &= \lambda_{t-1} + \tilde{\lambda}_t\left[E\big(e_{t-1}^0\big) - E\big(P_{t-1}'\big) E^{-1}\big(P_{t-1}P_{t-1}'\big) E\big(e_{t-1}^0 P_{t-1}\big)\right] \\
\Xi_{t-1} &= \sum_{k=t-1}^{T-1} \nu_k E\big(P_k'\big) E^{-1}\big(P_k P_k'\big) E\big(P_k\big) \\
\nu_{t-1} &= \frac{\tilde{\lambda}_t^2}{4\tilde{w}_t}.
\end{aligned}
$$

We have thus proved the optimal policy of (A($\lambda, w, w_T$)) in (6).

ACKNOWLEDGMENT

The authors would like to thank B. Li for editorial assistance.

REFERENCES

[1] M. S. Bazaraa and C. M. Shetty, Nonlinear Programming: Theory and Algorithms. New York: Wiley, 1979.
[2] B. Dumas and E. Luciano, "An exact solution to a dynamic portfolio choice problem under transaction costs," J. Finance, vol. 46, pp. 577–595, 1991.
[3] E. J. Elton and M. J. Gruber, "On the optimality of some multiperiod portfolio selection criteria," J. Business, vol. 47, pp. 231–243, 1974.
[4] E. F. Fama, "Multiperiod consumption-investment decisions," Amer. Econ. Rev., vol. 60, pp. 163–174, 1970.
[5] A. M. Geoffrion, "Duality in nonlinear programming: A simplified applications-oriented development," SIAM Rev., vol. 13, pp. 1–37, 1971.
[6] R. R. Grauer and N. H. Hakansson, "On the use of mean-variance and quadratic approximations in implementing dynamic investment strategies: A comparison of returns and investment policies," Manage. Sci., vol. 39, pp. 856–871, 1993.
[7] N. H. Hakansson, "On optimal myopic portfolio policies, with and without serial correlation of yields," J. Business, vol. 44, pp. 324–334, 1971.
[8] D. Li and W. L. Ng, "Optimal dynamic portfolio selection: Multiperiod mean variance formulation," Math. Finance, vol. 10, pp. 387–406, 2000.
[9] H. M. Markowitz, Portfolio Selection: Efficient Diversification of Investments. New York: Wiley, 1959.
[10] R. C. Merton, "Lifetime portfolio selection under uncertainty: The continuous-time case," Rev. Econ. Statist., vol. 51, pp. 247–257, 1969.
[11] R. C. Merton, Continuous-Time Finance. Cambridge, MA: Basil Blackwell, 1990.
[12] J. Mossin, "Optimal multiperiod portfolio policies," J. Business, vol. 41, pp. 215–229, 1968.
[13] R. Östermark, "Vector forecasting and dynamic portfolio selection: Empirical efficiency of recursive multiperiod strategies," Eur. J. Oper. Res., vol. 55, pp. 46–56, 1991.
[14] S. R. Pliska, Introduction to Mathematical Finance. Cambridge, MA: Basil Blackwell, 1997.
[15] B. T. Polyak, Introduction to Optimization. New York: Optimization Software, Inc., 1987.
[16] R. W. Reid and S. J. Citron, "On noninferior performance index vector," J. Optim. Theory Applicat., vol. 7, pp. 11–28, 1971.
[17] P. A. Samuelson, "Lifetime portfolio selection by dynamic stochastic programming," Rev. Econ. Statist., vol. 51, pp. 239–246, 1969.
[18] W. F. Sharpe, G. J. Alexander, and J. V. Bailey, Investments. London, U.K.: Prentice-Hall, 1995.
[19] X. Y. Zhou and D. Li, "Continuous time mean-variance portfolio selection: A stochastic LQ framework," Appl. Math. Optim., vol. 42, pp. 19–33, 2000.
Risk-Sensitive Portfolio Optimization With Completely and Partially Observed Factors
Lukasz Stettner

Abstract: Using a discounted cost approach, this note shows an optimal portfolio that maximizes the long-run risk-sensitized growth rate of the capital process when the dynamics of the asset prices depend on economic factors that are completely or partially observed.

Index Terms: Factors, optimal portfolio, partial observation, risk-sensitive cost.
I. INTRODUCTION

Assume we are given a market consisting of $m$ securities and $k_1 + k_2$ factors. The price $S_i(t)$ of the $i$th security at time $t$ depends on the values of economic factors $x(t) = (z_1(t), z_2(t))$, such as dividend yields, price-earnings ratios, or the rate of inflation (see, e.g., [1] and the references therein for examples of such factors), at discrete-time moments, denoted for simplicity by $n = 0, 1, \ldots$. Given a probability space $(\Omega, F, P)$, assume that

$$S_i(n+1) = S_i(n)\, w_i(n) \tag{1}$$

where $w(n) = (w_1(n), \ldots, w_m(n))$ is a random vector with positive coordinates.

Manuscript received October 15, 2002; revised November 15, 2003. Recommended by Guest Editor B. Pasik-Duncan. This paper was supported by Grant PBZ KBN 016/P03/99.
The author is with the Institute of Mathematics, Polish Academy of Sciences, 00-950 Warsaw, Poland (e-mail: [email protected]).
Digital Object Identifier 10.1109/TAC.2004.824476
0018-9286/04$20.00 © 2004 IEEE