J Syst Sci Complex (2011) 24: 907–918
DYNAMIC CVAR WITH MULTI-PERIOD RISK PROBLEMS∗ Zhiqing MENG · Min JIANG · Qiying HU
DOI: 10.1007/s11424-011-9010-7 Received: 15 January 2009 / 30 June 2010 © The Editorial Office of JSSC & Springer-Verlag Berlin Heidelberg 2011 Abstract This paper studies multi-period risk management problems by presenting a dynamic risk measure. This risk measure is the sum of the conditional value-at-risk of each period. The authors model it by Markov decision processes and derive its optimality equation. This equation is further transformed into an equivalent, analytically tractable one. The authors then apply the model and its results to multi-period portfolio optimization when the return rate vectors at each period form a Markov chain. Key words α-CVaR, multi-period, optimality equation, optimal policy.
1 Introduction Value-at-risk (VaR) is a measure of the potential losses of decision making in financial markets. With respect to a specified probability level α, the α-VaR of a decision is the lowest amount y such that, with probability α, the loss will not exceed y. VaR has achieved great success in practice, but it has some drawbacks, e.g., it is not subadditive. The concept of conditional value-at-risk (CVaR) was introduced mainly to overcome this lack of subadditivity. With a specified probability α, α-CVaR is the conditional expectation of the loss above VaR. CVaR has good properties, in particular good computability. Rockafellar and Uryasev[1] presented a function through which the value of α-CVaR can be easily computed. Chernozhukov and Umantsev[2] stressed the important aspects of measuring external and intermediate conditional risk and gave an empirical application characterizing key economic determinants of various levels of conditional risk. Andersson, et al.[3] presented an efficient algorithm to minimize CVaR. Rockafellar and Uryasev[4] derived some fundamental properties of CVaR. The losses that should be considered in practical risk management are often multiple, such as those due to interest risk, exchange risk, shares risk, and commercial risk. So, risk management problems are often multiobjective. Krokhmal, Palmquist, and Uryasev[5], and Wang and Li[6] solved efficient frontier problems with three losses under the framework of CVaR.
Zhiqing MENG · Min JIANG College of Business and Administration, Zhejiang University of Technology, Hangzhou 310023, China. Email:
[email protected];
[email protected]. Qiying HU (Corresponding author) School of Management, Fudan University, Shanghai 200433, China. Email:
[email protected]. ∗ This research was supported in part by the National Natural Science Foundation of China under Grant Nos. 70971023 and 71001089 and in part by the Natural Science Foundation of Zhejiang Province under Grant No. Y60860040. This paper was recommended for publication by Editor Shouyang WANG.
Jiang, Hu and Meng[7−9] studied general multi-loss CVaR optimization problems, using multi-objective programming and neural network methods to solve them. Fábián[10] used CVaR to study a two-stage stochastic model. Boda and Filar[11] studied VaR and CVaR for multi-period risk control problems. They defined the concept of time consistent dynamic risk measures. Roughly, time consistency of a risk measure means that if a decision-maker uses a risk measure minimizing policy for an n-period problem, then the component of that multi-period policy from the tth period to the end should be a risk measure minimizing policy in the remaining (n − t + 1)-period problem, for each t = 1, 2, · · · , n. They showed that VaR and CVaR need not be time consistent. By using Markov decision processes (MDP), they presented new measures of VaR and CVaR for multi-period problems. Based on the optimality equation from MDP, they presented an algorithm to compute optimal policies. The dynamic risk measure for multi-period risk control problems presented in [11] is the CVaR of the total loss over all periods. In practice, however, the risk of loss in each period may need to be controlled at the same time as the risk of the total loss. For example, in the real estate industry, a project may be divided into several periods such as financing, buying land, designing, constructing, and retailing. For each period, we may have a different objective for controlling risk. Certainly, the risk of the project over its whole life cycle should also be controlled. In this paper, we present another time consistent dynamic risk measure for multi-period risk control problems. The measure presented here is the sum of the CVaR of each period. This measure can be computed more easily by a standard MDP method. Unlike in [11], the optimal policy can be computed directly via the optimality equation. So, the method here is simpler.
In the multi-period dynamic problem we consider, the system at each period is in some state (say s), and after observing this state a decision (say x) is chosen from some decision set. Then a loss occurs, which depends on some random variables, and the system transits randomly to a new state at the next period. By introducing the minimum expected total CVaR over the multiple periods, we present a Markov decision process model for it and obtain the optimality equation. Then, we transform it into an equivalent one that is analytically tractable. Finally, we apply the model and its results to a multi-period portfolio optimization problem. The remainder of this paper is organized as follows. In Section 2, we introduce the concept of α-CVaR and the main results in [1]. Then, in Section 3, we present the model for the multi-period dynamic CVaR and its Markov decision process model with the corresponding optimality equations. In Section 4, we apply the model and its results to a multi-period portfolio optimization problem. Section 5 gives some conclusions.
2 The Model and Preliminaries We consider a problem with N + 1 periods denoted by n = 0, 1, · · · , N. At period n, the system is in some state sn ∈ Sn with Sn ⊂ Rm for some integer m > 0, and a decision xn ∈ Xn must be chosen, with Xn ⊂ Rl for some integer l > 0. Then two things occur: a) The system incurs a loss fn(sn, xn, ξn) ∈ R1, where ξn ∈ Rr is a random vector depending on the state sn at period n, with Gn(·|s) being its distribution function conditioned on sn = s; and b) The system transits to state sn+1 at period n + 1 according to the following state transition probability:
$$H_n(s' \mid s_n, x_n) = P\{s_{n+1} \le s' \mid s_n, x_n\}. \tag{1}$$
The problem faced by the decision maker is how to choose an action at each period to minimize
his/her risk. We call (s, x) ∈ Sn × Xn a (state-decision) pair at period n if the system is in state s ∈ Sn and action x is chosen from Xn at period n. For each n = 0, 1, · · · , N, we introduce the following notation and definitions, as in [1]. We denote by Ψn(s, x, ·) the distribution function of the loss fn(s, x, ξn), that is,
$$\Psi_n(s,x,y) = P\{f_n(s,x,\xi_n) \le y\} = \int_{f_n(s,x,z)\le y} dG_n(z \mid s). \tag{2}$$
For α ∈ (0, 1), let
$$y_{n,\alpha}(s,x) = \min\{y \mid \Psi_n(s,x,y) \ge \alpha\} \tag{3}$$
be the α-VaR of the pair (s, x) ∈ Sn × Xn at period n under the confidence level α. That is, with probability α, the loss of pair (s, x) at period n will not exceed yn,α(s, x), and yn,α(s, x) is the lowest amount y with this property. Clearly, when Ψn(s, x, y) is continuous in y, yn,α(s, x) is the smallest root y of the equation Ψn(s, x, y) = α. Besides the α-VaR of (s, x), we also want to know how large the expected loss is when the loss reaches the α-VaR. So, for each n = 0, 1, · · · , N, we define
$$\varphi_{n,\alpha}(s,x,y) = (1-\alpha)^{-1}\int_{f_n(s,x,z)\ge y} f_n(s,x,z)\,dG_n(z \mid s) \tag{4}$$
and let ϕn,α(s, x) = ϕn,α(s, x, yn,α(s, x)) be the α-CVaR of pair (s, x) at period n. ϕn,α(s, x) is the expected loss conditional on the loss being at least the α-VaR yn,α(s, x) for (s, x) at period n and α. The α-CVaR describes the risk of the loss of pair (s, x) (i.e., of decision x at state s). Finally, we introduce a function Fn,α(s, x, y) as follows:
$$F_{n,\alpha}(s,x,y) = y + (1-\alpha)^{-1}\int_{z\in\mathbb{R}^r} [f_n(s,x,z) - y]^+\,dG_n(z \mid s), \quad y\in\mathbb{R}, \tag{5}$$
for (s, x) ∈ Sn × Xn, where t+ = max(0, t) is the positive part of the real number t. From [1], we have the following lemma. Lemma 1 Suppose n = 0, 1, · · · , N. 1) For each pair (s, x) ∈ Sn × Xn and confidence level α ∈ (0, 1), Fn,α(s, x, y) is convex and continuously differentiable in y and
$$\frac{\partial}{\partial y}F_{n,\alpha}(s,x,y) = (1-\alpha)^{-1}[\Psi_n(s,x,y) - \alpha].$$
Moreover, for each s ∈ Sn, Fn,α(s, x, y) is convex with respect to (x, y) and ϕn,α(s, x) is convex with respect to x when fn(s, x, z) is convex with respect to x. 2) Under the condition
$$P\{f_n(s,x,\xi_n) = y\} = 0, \quad \forall (s,x)\in S_n\times X_n,\ y\in\mathbb{R}, \tag{6}$$
we have
$$\varphi_{n,\alpha}(s,x) = \min_{y\in\mathbb{R}} F_{n,\alpha}(s,x,y), \quad (s,x)\in S_n\times X_n, \tag{7}$$
$$\min_{x\in X_n}\varphi_{n,\alpha}(s,x) = \min_{(x,y)\in X_n\times\mathbb{R}} F_{n,\alpha}(s,x,y), \quad s\in S_n. \tag{8}$$
Lemma 1 tells us that the function Fn,α(s, x, y) has better analytical properties than ϕn,α(s, x) does, whereas ϕn,α(s, x) itself is difficult to compute. Hence, by (8), it suffices to minimize Fn,α(s, x, y) over (x, y) to get the minimum α-CVaR for each period. When fn(s, x, z) is convex with respect to x, Fn,α(s, x, y) is convex with respect to (x, y). So, the minimization of Fn,α(s, x, y) over (x, y) can be solved by its first order condition.
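The relation between the α-CVaR and the function F in Lemma 1 can be checked numerically on simulated losses. The sketch below is only an illustration under stated assumptions: the standard normal loss sample, the sample size, and the helper names `empirical_var_cvar` and `F` are hypothetical and not part of the paper. Because an empirical distribution has atoms, condition (6) does not hold exactly, so the α-CVaR is taken here as the mean of the worst (1 − α) fraction of the sample, with αn chosen to be an integer.

```python
import math
import random

def empirical_var_cvar(losses, alpha):
    """Empirical alpha-VaR and alpha-CVaR (mean of the worst (1 - alpha) share)."""
    xs = sorted(losses)
    n = len(xs)
    k = round(alpha * n)            # number of samples at or below the VaR
    var = xs[k - 1]                 # smallest y with empirical d.f. >= alpha
    cvar = sum(xs[k:]) / (n - k)    # average of the n - k largest losses
    return var, cvar

def F(losses, alpha, y):
    """Sample version of (5): y + mean((loss - y)^+) / (1 - alpha)."""
    excess = sum(max(v - y, 0.0) for v in losses)
    return y + excess / ((1.0 - alpha) * len(losses))

random.seed(0)
alpha = 0.9
losses = [random.gauss(0.0, 1.0) for _ in range(1000)]  # hypothetical loss sample

var, cvar = empirical_var_cvar(losses, alpha)
f_at_var = F(losses, alpha, var)

# F is convex in y, so scanning a grid around the VaR should not find a smaller value.
grid = [var + 0.05 * j for j in range(-40, 41)]
f_min_on_grid = min(F(losses, alpha, y) for y in grid)

print(var, cvar, f_at_var, f_min_on_grid)
```

In this sample version, F evaluated at the α-VaR reproduces the tail mean, which is the practical content of (7): the CVaR can be obtained by a one-dimensional convex minimization without computing the VaR first.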
3 The Minimal CVaR and Optimal Policy In this section, we first formulate the CVaR over multiple periods as a Markov decision process model and then solve for the minimal CVaR and the optimal policy. First, we introduce policies. A policy is a sequence π = (π0, π1, · · · , πN), where for each period n = 0, 1, · · · , N, πn is a mapping from the state set Sn to the decision set Xn, i.e., πn : Sn → Xn. We call πn a decision function at period n. Here, using a policy π means that action πn(s) is chosen whenever the state at period n is s. Denote by Π the set of all policies. Let Ωn and Δn be the state and action at period n, respectively. Then, under a given policy π, decision Δn at period n is determined by state Ωn and the decision function πn, that is, Δn = πn(Ωn). Hence, from (1), we know that, under a given policy π, the state sequence {Ω0, Ω1, · · · , ΩN} forms a Markov chain with finite horizon. Its state transition law is simply given by (1) with xn = πn(sn). We let Pπ,s0 and Eπ,s0 be the probability distribution and the expectation, respectively, under the policy π with the initial state s0. The details can be found in textbooks on Markov decision processes, e.g., [12–13]. For period n, the αn-CVaR is ϕn,αn(s, x) when Ωn = s and Δn = x. But the state at period n is random. Then under policy π with the initial state Ω0 = s, the expected αn-CVaR of period n is Eπ,s ϕn,αn(Ωn, Δn). Hence, for the confidence level sequence α = (α0, α1, · · · , αN), where all αn ∈ (0, 1), we define the expected total CVaR under policy π by
$$\Phi_\alpha(\pi,s) = \sum_{n=0}^{N}\beta^{n}E_{\pi,s}\,\varphi_{n,\alpha_n}(\Omega_n,\Delta_n), \quad s\in S_0. \tag{9}$$
Here, β ∈ [0, 1] is the discount factor. Φα(π, s) describes the total discounted conditional expected loss with the confidence level sequence α under policy π and initial state s. Definition 1 We call Φα(π, s) the (multi-period) α-CVaR of policy π at the initial state s. In Definition 1, we do not require the same confidence level for each period, that is, αn may vary with period n. The problem is to choose a policy π∗ to minimize the expected total α-CVaR Φα(π, s), i.e.,
$$\Phi_\alpha(s) := \min_{\pi\in\Pi}\Phi_\alpha(\pi,s), \quad s\in S_0, \tag{10}$$
with the given α. The policy π∗ achieving the minimum in the above equation is called an optimal policy and Φα(s) is called the minimum (total) α-CVaR at the initial state s. In the following, the confidence level α is arbitrary but fixed.
In summary, we obtain a Markov decision process model[13] for the multi-period α-CVaR problem: {Sn, Xn, ϕn,αn(s, x), Hn(· | s, x), Φn}, where Sn and Xn are the state space and the action set at period n, ϕn,αn(s, x) is the reward function (here a cost to be minimized), Hn(· | s, x) describes the state transition, and Φn is the objective function. Let Φn(s) be the minimum expected total CVaR from period n to N when the state at period n is s, that is,
$$\Phi_n(s) = \min_{\pi\in\Pi}\sum_{k=n}^{N}\beta^{k-n}E_\pi\{\varphi_{k,\alpha_k}(\Omega_k,\Delta_k) \mid \Omega_n = s\}, \quad s\in S_n.$$
Then, from the standard results in Markov decision processes (see Hu and Yue [13]), we can conclude that Φn(s) satisfies the following optimality equation:
$$\Phi_n(s) = \min_{x\in X_n}\Big\{\varphi_{n,\alpha_n}(s,x) + \beta\int_{S_{n+1}}\Phi_{n+1}(s')\,d_{s'}H_n(s' \mid s,x)\Big\}, \quad s\in S_n,\ n = 0,1,\cdots,N, \tag{11}$$
with the boundary condition ΦN+1(s) = 0. Certainly, Φα(s) = Φ0(s). Moreover, any policy achieving the minimum in the above optimality equation is an optimal policy, i.e., if πn∗ satisfies
$$\min_{x\in X_n}\Big\{\varphi_{n,\alpha_n}(s,x) + \beta\int_{S_{n+1}}\Phi_{n+1}(s')\,d_{s'}H_n(s' \mid s,x)\Big\} = \varphi_{n,\alpha_n}(s,\pi_n^*(s)) + \beta\int_{S_{n+1}}\Phi_{n+1}(s')\,d_{s'}H_n(s' \mid s,\pi_n^*(s)), \quad s\in S_n,$$
for n = 0, 1, · · · , N, then π∗ := (π0∗, π1∗, · · · , πN∗) is an optimal policy. However, it may not be easy to find the minimizer in Equation (11). In view of Lemma 1, we naturally consider the following equation:
$$\Phi_n(s) = \min_{(x,y)\in X_n\times\mathbb{R}}\Big\{F_{n,\alpha_n}(s,x,y) + \beta\int_{S_{n+1}}\Phi_{n+1}(s')\,d_{s'}H_n(s' \mid s,x)\Big\}, \quad s\in S_n,\ n = 0,1,\cdots,N, \tag{12}$$
with the boundary condition ΦN+1(s) = 0. This equation is the same as (11) except for the reward function, i.e., the first term in the braces on the right-hand side. The following theorem shows that Equation (12) is equivalent to Equation (11) under the following condition:
$$P\{f_n(s,x,\xi_n) = y\} = 0, \quad \forall (s,x)\in S_n\times X_n,\ y\in\mathbb{R},\ n = 0,1,\cdots,N. \tag{13}$$
Theorem 1 1) Φn (s) satisfies the optimality Equation (11) and any policy achieving the minimum in (11) is optimal. 2) Under the condition (13), Φn (s) is the solution of the optimality Equation (11) if and only if Φn (s) is the solution of Equation (12).
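For a finite state and decision set, the optimality Equation (11) can be solved by ordinary backward induction. The following sketch assumes a hypothetical two-state, two-action problem with N = 2; the tables `phi` (per-period αn-CVaR values ϕn,αn(s, x)) and `H` (a time-homogeneous transition matrix, chosen only for brevity) contain made-up numbers, and the function names are illustrative. A brute-force enumeration of all policies is included as a sanity check of the statement that a policy attaining the minimum in (11) is optimal.

```python
from itertools import product

def backward_induction(phi, H, beta):
    """Solve (11). phi[n][s][x]: one-period CVaR; H[s][x][s2]: transition probability."""
    N = len(phi) - 1
    n_states = len(H)
    value = [0.0] * n_states              # boundary condition Phi_{N+1} = 0
    policy = [None] * (N + 1)
    for n in range(N, -1, -1):
        new_value, decision = [], []
        for s in range(n_states):
            q = [phi[n][s][x] + beta * sum(p * v for p, v in zip(H[s][x], value))
                 for x in range(len(H[s]))]
            best = min(range(len(q)), key=q.__getitem__)
            decision.append(best)
            new_value.append(q[best])
        value, policy[n] = new_value, decision
    return value, policy                  # value[s] = Phi_0(s)

def evaluate(policy, phi, H, beta):
    """Expected total discounted CVaR (9) of a fixed policy, for each initial state."""
    value = [0.0] * len(H)
    for n in range(len(phi) - 1, -1, -1):
        value = [phi[n][s][policy[n][s]]
                 + beta * sum(p * v for p, v in zip(H[s][policy[n][s]], value))
                 for s in range(len(H))]
    return value

# Hypothetical data: 3 periods, 2 states, 2 actions.
phi = [[[1.0, 1.5], [2.0, 1.2]],
       [[0.8, 0.9], [1.1, 2.0]],
       [[1.3, 0.7], [0.9, 1.0]]]
H = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.5, 0.5], [0.3, 0.7]]]
beta = 0.95

value, policy = backward_induction(phi, H, beta)

# Brute force over all 4^3 policies (each decision function maps 2 states to 2 actions).
rules = list(product(range(2), repeat=2))
brute = [evaluate(pi, phi, H, beta) for pi in product(rules, repeat=3)]
best = [min(v[s] for v in brute) for s in range(2)]
print(value, policy, best)
```

The recursion visits each (period, state, action) triple once, whereas the enumeration grows exponentially in N, which is the computational reason for working with the optimality equation.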
Proof 1) is obvious. For 2), suppose that Φn(s) is the solution of the optimality Equation (11). Then, from Lemma 1, we have
$$\begin{aligned}
\Phi_n(s) &= \min_{x\in X_n}\Big\{\varphi_{n,\alpha_n}(s,x) + \beta\int_{S_{n+1}}\Phi_{n+1}(s')\,d_{s'}H_n(s' \mid s,x)\Big\}\\
&= \min_{x\in X_n}\Big\{\min_{y\in\mathbb{R}}F_{n,\alpha_n}(s,x,y) + \beta\int_{S_{n+1}}\Phi_{n+1}(s')\,d_{s'}H_n(s' \mid s,x)\Big\}\\
&= \min_{x\in X_n}\min_{y\in\mathbb{R}}\Big\{F_{n,\alpha_n}(s,x,y) + \beta\int_{S_{n+1}}\Phi_{n+1}(s')\,d_{s'}H_n(s' \mid s,x)\Big\}\\
&= \min_{(x,y)\in X_n\times\mathbb{R}}\Big\{F_{n,\alpha_n}(s,x,y) + \beta\int_{S_{n+1}}\Phi_{n+1}(s')\,d_{s'}H_n(s' \mid s,x)\Big\}, \quad s\in S_n,
\end{aligned}$$
for n = 0, 1, · · · , N. So, Φn(s) is the solution of Equation (12). The converse can be proved similarly. This proves the theorem. From Theorem 1, we call (12) the optimality equation, too. When N is large enough, the finite-horizon problem can be approximated by the infinite-horizon one. For the infinite-horizon case, a policy is an infinite sequence π = (π0, π1, · · ·). Moreover, for the given confidence level sequence α = (α0, α1, · · ·), with αn ∈ (0, 1), the total conditional expected loss with the confidence level sequence α under policy π from state s at period n is defined by
$$\Phi_n(\pi,s) = \sum_{k=n}^{\infty}\beta^{k}E_\pi\{\varphi_{k,\alpha_k}(\Omega_k,\Delta_k) \mid \Omega_n = s\}, \quad s\in S_n. \tag{14}$$
We let
$$\Phi_n(s) = \min_{\pi\in\Pi}\Phi_n(\pi,s), \quad s\in S_n,\ n\ge 0. \tag{15}$$
This is just Φn(s) with N = ∞, so we use the same notation Φn(s) here. From the standard results in Markov decision processes (see [12]), we know that Theorem 1 is still true when N = ∞. In what follows, we consider the stationary case, that is, Hn, Gn, fn, gn, αn are all independent of n = 0, 1, · · ·. Then, from Markov decision processes, Φn(s) and ϕn,α(s, x) are also independent of n. We write them as Φ(s) and ϕα(s, x), respectively. Moreover, a policy π = (π0, π1, · · ·) is called a stationary policy if π0 = π1 = · · ·. We have the following theorem for the stationary case. Theorem 2 For the stationary case with infinite horizon, 1) Φ(s) satisfies the following optimality equation:
$$\Phi(s) = \min_{x\in X}\Big\{\varphi_\alpha(s,x) + \beta\int_{S}\Phi(s')\,d_{s'}H(s' \mid s,x)\Big\}, \quad s\in S. \tag{16}$$
Moreover, any stationary policy achieving the minimum in the above optimality equation is optimal. 2) Suppose that
$$P\{f(s,x,\xi) = y\} = 0, \quad \forall (s,x)\in S\times X,\ y\in\mathbb{R}. \tag{17}$$
Then, Φ(s) is the solution of the optimality Equation (16) if and only if Φ(s) is the solution of the following equation:
$$\Phi(s) = \min_{(x,y)\in X\times\mathbb{R}}\Big\{F_\alpha(s,x,y) + \beta\int_{S}\Phi(s')\,d_{s'}H(s' \mid s,x)\Big\}, \quad s\in S. \tag{18}$$
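For the stationary infinite-horizon case, one common way to approximate the solution of (16) on a finite model is value iteration. This is a swapped-in standard technique, not a method stated in the paper, and it assumes a discount factor β strictly below 1 so that the iteration contracts. The tables `phi` and `H` and the function name `value_iteration` below are hypothetical.

```python
def value_iteration(phi, H, beta, tol=1e-12, max_iter=100_000):
    """Approximate the fixed point of (16). phi[s][x]: CVaR; H[s][x][s2]: transition prob."""
    n_states = len(H)

    def q(s, x, value):
        # One-step cost plus discounted expected continuation value.
        return phi[s][x] + beta * sum(p * v for p, v in zip(H[s][x], value))

    value = [0.0] * n_states
    for _ in range(max_iter):
        new_value = [min(q(s, x, value) for x in range(len(H[s])))
                     for s in range(n_states)]
        gap = max(abs(a - b) for a, b in zip(new_value, value))
        value = new_value
        if gap < tol:
            break
    # Stationary policy: the minimizing action in each state.
    policy = [min(range(len(H[s])), key=lambda x, s=s: q(s, x, value))
              for s in range(n_states)]
    return value, policy

# Hypothetical stationary data: 2 states, 2 actions, beta < 1 assumed.
phi = [[1.0, 1.5], [2.0, 1.2]]
H = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.5, 0.5], [0.3, 0.7]]]
beta = 0.9
value, policy = value_iteration(phi, H, beta)

# One-state check with a closed form: always paying 1.0 gives 1 / (1 - beta).
single_value, single_policy = value_iteration([[2.0, 1.0]], [[[1.0], [1.0]]], 0.5)
print(value, policy, single_value, single_policy)
```

The returned decision rule is applied in every period, which matches the stationary policies of Theorem 2.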
Due to Theorem 2, Equation (18) is also called the optimality equation for the stationary infinite-horizon case. The model, the methods used, and the results obtained here differ from those in [11]. First, their model is limited to countably many possible values of losses, as required in the MDP model Γ in Equation (6) there. Our model has no such limitation, i.e., the loss functions can take their values in Rm for some integer m. Second, based on Γ, Boda and Filar presented a method to compute the optimal VaR, CVaR, and the corresponding optimal policies in Subsection 4.2 there. This method is, in fact, numerical on a suitable grid of x ∈ R := (−∞, ∞) and so is not easy to perform. Our measure together with the optimal policy can be computed simply via the optimality equation. Hence, our method is easier. Third, it is difficult to verify the condition A∗n(i) = ∅ for all i ∈ S and n = 0, 1, · · · , N, required in the main theorems (Theorems 4 and 5) there. Certainly, the measure presented in this paper differs from that in [11]. In the following, we consider two special cases where there is a termination period for the problem and the loss occurs only when the problem is terminated. The two cases are distinguished by whether the termination time is deterministic or random. Case I Fixed termination time The first case has N + 1 periods and there is no loss except in the final period N, that is, the losses in the previous periods are zero:
$$f_n(s_n,x_n,\xi_n) = 0, \quad n = 0,1,\cdots,N-1,$$
while the loss in the final period N is fN(sN, xN, ξN) with the distribution function ΨN(s, x, ·). We call this the terminal loss case. In this case, for n = 0, 1, 2, · · · , N − 1, we obviously have Ψn(s, x, y) = χ{y ≥ 0}, where χ is the indicator function, the VaR yn,α(s, x) = 0, the α-CVaR ϕn,α(s, x, y) = ϕn,α(s, x) = 0, and Fn,α(s, x, y) = y for y ≥ 0 and Fn,α(s, x, y) = −αy/(1 − α) for y ≤ 0. Then, the minimum expected total CVaR satisfies the following equations:
$$\begin{aligned}
\Phi_n(s) &= \beta\min_{x\in X_n}\int_{S_{n+1}}\Phi_{n+1}(s')\,d_{s'}H_n(s' \mid s,x), \quad s\in S_n,\ n = 0,1,\cdots,N-1,\\
\Phi_N(s) &= \min_{x\in X_N}\varphi_{N,\alpha_N}(s,x) = \min_{(x,y)\in X_N\times\mathbb{R}}F_{N,\alpha_N}(s,x,y),
\end{aligned} \tag{19}$$
where the last equality holds under the condition that
$$P\{f_N(s,x,\xi_N) = y\} = 0, \quad \forall (s,x)\in S_N\times X_N,\ y\in\mathbb{R}.$$
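As a short worked check of the expressions used for the loss-free periods in Case I, one can substitute a loss that is identically zero into the definition (5). The derivation below uses only symbols already defined in Section 2.

```latex
% Zero loss in period n < N: f_n(s,x,z) = 0 for all z.
F_{n,\alpha}(s,x,y)
  = y + (1-\alpha)^{-1}\int_{z\in\mathbb{R}^r}[0-y]^{+}\,dG_n(z\mid s)
  = y + (1-\alpha)^{-1}[-y]^{+}
  = \begin{cases}
      y, & y \ge 0,\\[2pt]
      y - \dfrac{y}{1-\alpha} = -\dfrac{\alpha y}{1-\alpha}, & y < 0.
    \end{cases}
% Both branches are nonnegative and vanish only at y = 0, hence
\min_{y\in\mathbb{R}} F_{n,\alpha}(s,x,y) = F_{n,\alpha}(s,x,0) = 0 = \varphi_{n,\alpha}(s,x).
```

So the per-period term drops out of the optimality equation for n < N, which leaves only the discounted continuation value in the first line of (19).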
Case II Random termination time In the above special case, the terminal time N is deterministic. In some other cases, the terminal time is a random variable. We denote it by τ, with probability distribution
$$p_n = P\{\tau = n\}, \quad n = 0,1,\cdots.$$
Moreover, the loss occurs only at the terminal time, that is, we have fn(sn, xn, ξn) = 0 if τ ≠ n.
It should be noted that if p0 + p1 + · · · + pN = 1 for some integer N, then the problem has a finite horizon. Otherwise, the problem has an infinite horizon. Let Φn(s) also be the minimum expected total CVaR over the infinite horizon from period n, given that the problem has not terminated before period n, i.e., τ ≥ n. Then, from the proof of Theorem 1 we know that Φn(s) satisfies the following equation:
$$\Phi_n(s) = \frac{1}{\sum_{m=n}^{\infty}p_m}\min_{x\in X_n}\Big\{p_n\varphi_{n,\alpha_n}(s,x) + \beta\sum_{m=n+1}^{\infty}p_m\int_{S_{n+1}}\Phi_{n+1}(s')\,d_{s'}H_n(s' \mid s,x)\Big\}, \quad n\ge 0.$$
Letting
$$\tilde{\Phi}_n(s) = \sum_{m=n}^{\infty}p_m\,\Phi_n(s),$$
the equation above is obviously equivalent to the following one:
$$\tilde{\Phi}_n(s) = \min_{x\in X_n}\Big\{p_n\varphi_{n,\alpha_n}(s,x) + \beta\int_{S_{n+1}}\tilde{\Phi}_{n+1}(s')\,d_{s'}H_n(s' \mid s,x)\Big\}, \quad n\ge 0.$$
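The equivalence of the two equations above follows from one rescaling step, spelled out here for completeness. The symbol for the rescaled function, written with a tilde in this sketch, is a notational assumption; it stands for Φn(s) multiplied by the probability that the problem has not terminated before period n.

```latex
% Rescaled value function: \tilde{\Phi}_n(s) = \Big(\sum_{m=n}^{\infty} p_m\Big)\,\Phi_n(s).
% Multiply the equation for \Phi_n(s) by \sum_{m=n}^{\infty} p_m:
\Big(\sum_{m=n}^{\infty}p_m\Big)\Phi_n(s)
  = \min_{x\in X_n}\Big\{p_n\varphi_{n,\alpha_n}(s,x)
    + \beta\int_{S_{n+1}}\Big(\sum_{m=n+1}^{\infty}p_m\Big)\Phi_{n+1}(s')\,d_{s'}H_n(s'\mid s,x)\Big\}.
% The factor inside the integral does not depend on s', and
\Big(\sum_{m=n+1}^{\infty}p_m\Big)\Phi_{n+1}(s') = \tilde{\Phi}_{n+1}(s'),
% so the right-hand side is exactly the recursion in terms of \tilde{\Phi}.
```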
Similarly, when the following condition is true:
$$P\{f_n(s,x,\xi_n) = y\} = 0, \quad \forall (s,x)\in S_n\times X_n,\ y\in\mathbb{R},\ n\ge 0,$$
the previous equation is equivalent to the following one:
$$\tilde{\Phi}_n(s) = \min_{(x,y)\in X_n\times\mathbb{R}}\Big\{p_n F_{n,\alpha_n}(s,x,y) + \beta\int_{S_{n+1}}\tilde{\Phi}_{n+1}(s')\,d_{s'}H_n(s' \mid s,x)\Big\}, \quad n\ge 0. \tag{20}$$
Examples for the terminal losses include the terminal wealth in portfolio optimization problems, as discussed in [14]. When loss occurs only when the problem is terminated, the expected total CVaR surely coincides with that defined in [11]. But the computation here is obviously simpler. In the next section, we study the multi-period portfolio optimization by using the models and results in this section.
4 Applications to Portfolio Optimization In this section, we apply the model and results discussed in the previous section to multi-period portfolio optimization. Suppose that there are m securities in the financial market. Let the decision be a portfolio x = (x1, x2, · · · , xm), with xj being the position in security j, j = 1, 2, · · · , m, and
$$\sum_{j=1}^{m}x_j = 1.$$
Let X be the set of all portfolios. Let the random vector ξ = (ξ1, ξ2, · · · , ξm) be the return rate vector of these securities. Thus, the return rate of portfolio x is
$$x^{T}\xi = \sum_{j=1}^{m}x_j\xi_j,$$
and so the loss can be described by −xTξ. We consider a multi-period portfolio optimization problem as follows. The periods are indexed by n = 0, 1, · · · , N. Suppose security j has a return rate ξnj at the end of period n for j = 1, 2, · · · , m. Let ξn = (ξn1, ξn2, · · · , ξnm) be the return rate vector at period n. Then, ξn is random for the decision maker at the beginning of period n, but will be realized at the end of the period. As usual, we assume that the decision maker knows that {ξ0, ξ1, · · ·} is a Markov chain with the state transition probability
$$G_n(z \mid z') = P\{\xi_n \le z \mid \xi_{n-1} = z'\}, \quad z, z'\in\mathbb{R}^m.$$
At the beginning of period n, the decision maker knows ξn−1 = z′ and so the conditional distribution function (d.f.), Gn(· | z′), of ξn. Let xn = (xn1, xn2, · · · , xnm) be the new portfolio at period n. Then, at the beginning of period n, the decision maker knows xn−1 and ξn−1, based on which xn should be chosen. Hence, we define for period n the state variable by sn = (xn−1, ξn−1) and the decision variable by xn. For n = 0, it is assumed that s0 = (x−1, ξ−1) is given. Suppose that there is a transmission cost c(x′, x) when portfolio x′ is replaced by x. Let f(x′, x, z) = −xTz − c(x′, x). Then the loss function at period n with state s = (x′, z′) and decision x is defined as
$$f_n(s,x,\xi_n) := f(x',x,\xi_n|z') = -x^{T}\xi_n|z' - c(x',x), \quad n = 0,1,\cdots,N,$$
where ξn|z′ is the random variable ξn conditioned on ξn−1 = z′, that is, ξn|z′ is a random variable with d.f. Gn(·|z′). We also write fn(s, x, ξn) as fn(x′, z′, x, ξn). Moreover, the state transition law is described as follows. The system at state sn = (xn−1, ξn−1) = (x′, z′) with action xn at period n will transit at the next period into state sn+1 := (xn, ξn|z′), where xn is the action (portfolio) chosen at period n and ξn|z′ is a random variable with conditional d.f. Gn(·|z′). We write this transition as
$$(x_{n-1},\xi_{n-1}) \xrightarrow{\;x_n\;} (x_n,\xi_n).$$
From the definitions in Section 2, we have
$$\begin{aligned}
\Psi_n(x',z',x,y) &= P\{f_n(x',z',x,\xi_n)\le y\} = P\{x^{T}\xi_n|z' \ge -y - c(x',x)\},\\
y_{n,\alpha}(x',z',x) &= \min\{y \mid \Psi_n(x',z',x,y)\ge\alpha\},\\
\varphi_{n,\alpha}(x',z',x,y) &= (1-\alpha)^{-1}\int_{f(x',x,z)\ge y} f(x',x,z)\,dG_n(z \mid z')\\
&= -(1-\alpha)^{-1}\int_{x^{T}z\le -y-c(x',x)} x^{T}z\,dG_n(z \mid z') - (1-\alpha)^{-1}c(x',x)\,P\{x^{T}\xi_n|z' \le -y - c(x',x)\},
\end{aligned}$$
the α-CVaR of (s, x) with s = (x′, z′) at period n is ϕn,α(x′, z′, x) = ϕn,α(x′, z′, x, yn,α(x′, z′, x)), and
$$\begin{aligned}
F_{n,\alpha}(x',z',x,y) &= y + (1-\alpha)^{-1}\int_{z\in\mathbb{R}^m}[f(x',x,z) - y]^{+}\,dG_n(z \mid z')\\
&= y + (1-\alpha)^{-1}\int_{z\in\mathbb{R}^m}[-x^{T}z - c(x',x) - y]^{+}\,dG_n(z \mid z').
\end{aligned}$$
From (11), the optimality equation for the multi-period portfolio optimization is
$$\Phi_n(x',z') = \min_{x\in X}\Big\{\varphi_{n,\alpha_n}(x',z',x) + \beta\int_{z\in\mathbb{R}^m}\Phi_{n+1}(x,z)\,dG_n(z \mid z')\Big\}, \quad n = 0,1,\cdots,N, \tag{21}$$
with the boundary condition ΦN+1(x′, z′) = 0. Condition (13) here becomes equivalently
$$P\{x^{T}\xi_n|z' = y\} = \int_{x^{T}z = y} dG_n(z \mid z') = 0, \quad \forall x\in X,\ y\in\mathbb{R},\ z'\in\mathbb{R}^m,\ n = 0,1,\cdots,N.$$
This is obviously equivalent to Gn(z′|z) = 0 for all z, z′ ∈ Rm and n ≥ 0, which is true when the distribution functions Gn(·|z) are of continuous type for all z ∈ Rm and n ≥ 0. Under this condition, the optimality Equation (21) is equivalent to the following one:
$$\Phi_n(x',z') = \min_{(x,y)\in X\times\mathbb{R}}\Big\{F_{n,\alpha_n}(x',z',x,y) + \beta\int_{z\in\mathbb{R}^m}\Phi_{n+1}(x,z)\,dG_n(z \mid z')\Big\}, \quad n\le N, \tag{22}$$
with the boundary condition ΦN+1(x′, z′) = 0. So, we also call (22) the optimality equation for the multi-period portfolio problem. Thus, we have the following theorem. Theorem 3 For the multi-period portfolio optimization problem, the minimum expected total CVaR Φn(x′, z′) satisfies the optimality Equation (21), or (22) when Gn(z′|z) = 0 for all z, z′ ∈ Rm and n ≥ 0, and any policy achieving the minimum in the optimality equation is optimal. In the following, we consider several special cases of the multi-period portfolio optimization, for which sharper results can be expected. Case I There is no transmission cost, i.e., c(x′, x) = 0. Then all of Ψn(x′, z′, x, y), yn,α(x′, z′, x), ϕn,α(x′, z′, x), Fn,α(x′, z′, x, y) and Φn(x′, z′) are independent of x′. Hence, in this case, the optimality Equation (22) simplifies to
$$\Phi_n(z') = \min_{(x,y)\in X\times\mathbb{R}}F_{n,\alpha_n}(z',x,y) + \beta\int_{z\in\mathbb{R}^m}\Phi_{n+1}(z)\,dG_n(z \mid z'), \quad n\le N, \tag{23}$$
with the boundary condition ΦN+1(z′) = 0. It should be noted that in the above equation, the minimum is taken only over the function Fn,αn(z′, x, y), not over the sum of the two terms on the right-hand side. Let πn∗ be such that
$$\pi_n^*(z') = \arg\min_{(x,y)\in X\times\mathbb{R}}F_{n,\alpha_n}(z',x,y), \quad n = 0,1,\cdots,N. \tag{24}$$
Then, π∗ = (π0∗, π1∗, · · · , πN∗) is an optimal policy. This means that the optimal policy is myopic, i.e., the optimal action at period n depends only on the loss of period n. Suppose that the problem is stationary, that is, {ξn} is stationary, i.e., Gn is independent of n, and αn = α is a constant. Then, the loss function fn = f and the functions Fn,αn(z′, x, y) = Fα(z′, x, y) are independent of n. So, we have an optimal stationary myopic policy π∗ = (π0∗, π0∗, · · · , π0∗) with
$$\pi_0^*(z') = \arg\min_{(x,y)\in X\times\mathbb{R}}F_\alpha(z',x,y).$$
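To make the myopic rule of Case I concrete, the sketch below minimizes a sample version of F jointly over a portfolio x and the level y for a single period. It is only an illustration under stated assumptions: two securities, no short selling (portfolios x = (w, 1 − w) with w on a grid in [0, 1]), no transmission cost, and return scenarios drawn from hypothetical normal distributions in place of the conditional d.f. G(· | z′). The names `F` and `myopic_portfolio` are illustrative.

```python
import random

def F(losses, alpha, y):
    """Sample version of F_alpha: y + mean((loss - y)^+) / (1 - alpha)."""
    excess = sum(max(v - y, 0.0) for v in losses)
    return y + excess / ((1.0 - alpha) * len(losses))

def myopic_portfolio(scenarios, alpha, steps=20):
    """Grid search over x = (w, 1 - w), jointly minimizing F over (x, y)."""
    best_value, best_x = None, None
    for i in range(steps + 1):
        w = i / steps
        x = (w, 1.0 - w)
        # Loss of portfolio x in each return scenario z: -x^T z.
        losses = [-(x[0] * z[0] + x[1] * z[1]) for z in scenarios]
        # F is convex and piecewise linear in y, so a minimizer lies at a scenario loss.
        value = min(F(losses, alpha, y) for y in losses)
        if best_value is None or value < best_value:
            best_value, best_x = value, x
    return best_value, best_x

random.seed(1)
# Hypothetical return scenarios for two securities (stand-in for G(. | z')).
scenarios = [(random.gauss(0.01, 0.05), random.gauss(0.005, 0.02)) for _ in range(200)]
best_value, best_x = myopic_portfolio(scenarios, alpha=0.9)
print(best_value, best_x)
```

In the Markov setting of Case I the same minimization would be repeated for each observed z′, using scenarios drawn from the corresponding conditional distribution. For larger problems the joint minimization is usually posed as a linear program rather than a grid search.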
Case II {ξ0, ξ1, · · ·} are mutually independent. In this case, Gn(z|z′) = Gn(z) is independent of z′. Hence, the state variable sn = (xn−1, ξn−1) can be simplified to sn = xn−1. So, ϕn,α(x′, z′, x), Fn,α(x′, z′, x, y), and Φn(x′, z′) are all independent of z′ and will be denoted by ϕn,α(x′, x), Fn,α(x′, x, y), and Φn(x′), respectively. Therefore, the optimality Equations (21) and (22) simplify to
$$\Phi_n(x') = \min_{x\in X}\{\varphi_{n,\alpha_n}(x',x) + \beta\Phi_{n+1}(x)\} = \min_{(x,y)\in X\times\mathbb{R}}\{F_{n,\alpha_n}(x',x,y) + \beta\Phi_{n+1}(x)\}, \quad n = 0,1,\cdots,N. \tag{25}$$
This is the converse of Case I, where Φn(z′) is independent of x′ but depends on z′. Case III Both conditions in Cases I and II hold, that is, there is no transmission cost and {ξ0, ξ1, · · ·} are mutually independent. Then, the loss function becomes
$$f(x',x,z) = -\sum_{k=1}^{m}x_k z_k,$$
which is independent of x′ and will be denoted by f(x, z). So,
$$\varphi_{n,\alpha}(x',x,y) = (1-\alpha)^{-1}\int_{f(x,z)\ge y} f(x,z)\,dG_n(z), \qquad F_{n,\alpha}(x',x,y) = y + (1-\alpha)^{-1}\int_{z\in\mathbb{R}^m}[f(x,z) - y]^{+}\,dG_n(z),$$
and yn,α(x′, x), ϕn,α(x′, x) are all independent of x′. So, we drop the argument x′ from them. Hence, from the optimality Equation (25), we know that Φn(x′) is also independent of x′. We write Φn = Φn(x′). Therefore, the optimality Equation (25) becomes
$$\Phi_n = \min_{x\in X}\{\varphi_{n,\alpha_n}(x) + \beta\Phi_{n+1}\} = \min_{x\in X}\varphi_{n,\alpha_n}(x) + \beta\Phi_{n+1} = \min_{(x,y)\in X\times\mathbb{R}}F_{n,\alpha_n}(x,y) + \beta\Phi_{n+1}, \quad n = 0,1,\cdots,N,$$
with ΦN+1 = 0. This results in
$$\Phi_n = \sum_{k=n}^{N}\beta^{k-n}\min_{x\in X}\varphi_{k,\alpha_k}(x) = \sum_{k=n}^{N}\beta^{k-n}\min_{(x,y)\in X\times\mathbb{R}}F_{k,\alpha_k}(x,y), \quad n = 0,1,\cdots,N. \tag{26}$$
Therefore, the minimum expected total CVaR is just the discounted sum of the minimum αn-CVaR of each period. Similar to Case I, we have an optimal myopic policy π∗ = (π0∗, π1∗, · · · , πN∗), where
$$\pi_n^* = \arg\min_{x\in X}\varphi_{n,\alpha_n}(x), \quad n = 0,1,\cdots,N. \tag{27}$$
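The step from the recursion Φn = min ϕn,αn(x) + βΦn+1 to the closed form (26) can be checked numerically. In the sketch below, the list `m` holds hypothetical values of the per-period minimum CVaR, min over x of ϕk,αk(x); the numbers and the function names are illustrative only.

```python
def total_cvar_by_recursion(m, beta):
    """Phi_n = m[n] + beta * Phi_{n+1}, with boundary condition Phi_{N+1} = 0."""
    out = [0.0] * len(m)
    phi_next = 0.0
    for n in range(len(m) - 1, -1, -1):
        phi_next = m[n] + beta * phi_next
        out[n] = phi_next
    return out

def total_cvar_closed_form(m, beta):
    """Closed form (26): Phi_n = sum over k = n..N of beta^(k - n) * m[k]."""
    N = len(m) - 1
    return [sum(beta ** (k - n) * m[k] for k in range(n, N + 1))
            for n in range(N + 1)]

m = [0.8, 1.1, 0.9, 1.4]        # hypothetical per-period minimum CVaR values
beta = 0.95
by_recursion = total_cvar_by_recursion(m, beta)
by_formula = total_cvar_closed_form(m, beta)
print(by_recursion, by_formula)
```

Since each m[k] is obtained by a separate single-period minimization, the N + 1 periods in Case III can be solved independently of one another.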
Hence, the action in each period under this optimal policy is fixed, whatever the state is, that is, the optimal action at period n depends only on the period index n. In summary, we have the following proposition. Proposition 1 For the multi-period portfolio optimization problem, a) when there is no transmission cost, the minimum expected total CVaR Φn(z′) is independent of the last decision x′ and satisfies the optimality Equation (23), and there is an optimal myopic policy satisfying (24); b) when {ξ0, ξ1, · · ·} are mutually independent, the minimum expected total CVaR Φn(x′) is independent of the random realization ξn−1 in the last period and satisfies the optimality Equation (25); c) when both conditions in a) and b) are true, the minimum expected total CVaR Φn depends only on n and is given by (26), the discounted sum of the minimum αn-CVaR of each period, and the constant myopic policy π∗ given by (27) is optimal. Part a) of the above proposition implies that when the transmission cost can be ignored, we can adjust the portfolio freely and so the minimum expected total CVaR is independent of the portfolio in the last period. Furthermore, the optimal policy is myopic, i.e., it considers only the loss of the current period. When, in addition, the return rate vectors ξn of the securities in different periods are independent, Part c) implies that the CVaR at period n is minimized at the portfolio πn∗, the optimal policy is a portfolio sequence, and the minimum total CVaR is just the discounted sum of the minimum CVaR of each period. The terminal wealth is studied in many papers on portfolio optimization, e.g., in [14]. When we are concerned only with the wealth at the terminal time, the multi-period portfolio optimization problem falls exactly into the two special cases discussed at the end of the last section. These problems can be solved easily and we omit the details here.
5 Conclusion This paper studied the optimal control of multi-period risk management problems. We defined the minimum expected total α-CVaR for these problems. Then, we presented a Markov decision process model and derived two equivalent optimality equations. Finally, we applied the model and its results to multi-period portfolio optimization when the return rate vectors form a Markov chain. Future research may include applying the model and results discussed in this paper to the optimal exercise of stock options. The American stock option is modeled by the deterministic terminal loss while the European stock option is modeled by the random terminal loss. However, these problems differ from the terminal losses discussed in Section 3.
References
[1] R. T. Rockafellar and S. Uryasev, Optimization of conditional value-at-risk, Journal of Risk, 2000(2): 21–41.
[2] V. Chernozhukov and L. Umantsev, Conditional value-at-risk: Aspects of modeling and estimation, Empirical Economics, 2001, 26: 271–292.
[3] F. Andersson, H. Mausser, D. Rosen, and S. Uryasev, Credit risk optimization with conditional value-at-risk criterion, Math. Program., 2001, 89: 273–291.
[4] R. T. Rockafellar and S. Uryasev, Conditional value-at-risk for general loss distributions, Journal of Banking & Finance, 2002, 26: 1443–1471.
[5] P. Krokhmal, J. Palmquist, and S. Uryasev, Portfolio optimization with conditional value-at-risk objectives and constraints, Journal of Risk, 2002(2): 124–129.
[6] J. H. Wang and C. L. Li, New method of measurement and control finance risk, Journal of Wuhan University of Technology, 2002, 24(2): 60–63.
[7] M. Jiang, Q. Hu, and Z. Meng, A method on solving multiobjective conditional value-at-risk, Lecture Notes in Computer Science, 2004, 3039: 923–930.
[8] M. Jiang, Z. Meng, and Q. Hu, A neural network model on solving multiobjective conditional value-at-risk, Lecture Notes in Computer Science, 2004, 3174: 1000–1006.
[9] M. Jiang, Q. Hu, and Z. Meng, A method on solving multiple conditional value-at-risk based on weights, Far East Journal of Applied Mathematics, 2004, 17(3): 359–369.
[10] C. I. Fábián, Handling CVaR objectives and constraints in two-stage stochastic models, European Journal of Operational Research, 2008, 191(3): 888–911.
[11] K. Boda and J. A. Filar, Time consistent dynamic risk measures, Math. Meth. Oper. Res., 2006, 63: 169–186.
[12] K. Hinderer, Foundations of Non-Stationary Dynamic Programming with Discrete Time Parameter, Springer-Verlag, Berlin, 1970.
[13] Q. Hu and W. Yue, Markov Decision Processes with Their Application, Springer, New York, 2008.
[14] D. Li and W. L. Ng, Optimal dynamic portfolio selection: Multiperiod mean-variance formulation, Mathematical Finance, 2000, 10: 387–406.