A Linear-Quadratic Optimal Control Problem for Mean-Field Stochastic Differential Equations*

arXiv:1110.1564v1 [math.OC] 7 Oct 2011

Jiongmin Yong
Department of Mathematics, University of Central Florida, Orlando, FL 32816, USA

October 10, 2011
Abstract. A linear-quadratic optimal control problem is considered for mean-field stochastic differential equations with deterministic coefficients. By a variational method, the optimality system is derived, which turns out to be a linear mean-field forward-backward stochastic differential equation. Using a decoupling technique, two Riccati differential equations are obtained, which are uniquely solvable under certain conditions. A feedback representation of the optimal control is then obtained.

Keywords. Mean-field stochastic differential equation, linear-quadratic optimal control, Riccati differential equation, feedback representation.

AMS Mathematics subject classification. 49N10, 49N35, 93E20.
1  Introduction.
Let $(\Omega,\mathcal F,\mathbb P,\mathbb F)$ be a complete filtered probability space on which a one-dimensional standard Brownian motion $W(\cdot)$ is defined, with $\mathbb F\equiv\{\mathcal F_t\}_{t\ge 0}$ being its natural filtration augmented by all the $\mathbb P$-null sets. Consider the following controlled linear stochastic differential equation (SDE, for short):
\[
\begin{cases}
dX(s)=\big\{A(s)X(s)+B(s)u(s)+\hat A(s)\mathbb E[X(s)]+\hat B(s)\mathbb E[u(s)]\big\}\,ds\\
\qquad\qquad+\big\{A_1(s)X(s)+B_1(s)u(s)+\hat A_1(s)\mathbb E[X(s)]+\hat B_1(s)\mathbb E[u(s)]\big\}\,dW(s),\qquad s\in[0,T],\\
X(0)=x,
\end{cases}\tag{1.1}
\]
where $A(\cdot)$, $B(\cdot)$, $\hat A(\cdot)$, $\hat B(\cdot)$, $A_1(\cdot)$, $B_1(\cdot)$, $\hat A_1(\cdot)$, $\hat B_1(\cdot)$ are given deterministic matrix-valued functions. In the above, $X(\cdot)$, valued in $\mathbb R^n$, is the state process, and $u(\cdot)$, valued in $\mathbb R^m$, is the control process.
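Before turning to the exact analysis, the state equation (1.1) can be explored numerically: replacing $\mathbb E[X(s)]$ by the empirical mean of a system of interacting copies gives a standard particle approximation. The following is a minimal Euler–Maruyama sketch in the scalar case ($n=m=1$); all coefficient values and the open-loop control are hypothetical choices for illustration only.

```python
import random

def simulate_mf_sde(x0=1.0, T=1.0, n_steps=200, n_particles=2000, seed=0):
    """Particle (Euler-Maruyama) approximation of a scalar instance of (1.1):
        dX = (a*X + b*u + a_hat*E[X] + b_hat*E[u]) ds
           + (a1*X + b1*u + a1_hat*E[X] + b1_hat*E[u]) dW,
    with E[X(s)] replaced by the empirical mean over n_particles copies.
    All coefficient values and the control u(.) are hypothetical."""
    rng = random.Random(seed)
    a, a_hat, b, b_hat = -1.0, 0.5, 1.0, 0.0
    a1, a1_hat, b1, b1_hat = 0.3, 0.0, 0.2, 0.0
    def u(s):                        # open-loop control; zero for this sketch
        return 0.0
    dt = T / n_steps
    X = [x0] * n_particles
    for k in range(n_steps):
        s = k * dt
        m = sum(X) / n_particles     # empirical proxy for E[X(s)]
        us = u(s)                    # deterministic, so E[u(s)] = u(s)
        X = [xi + (a * xi + b * us + a_hat * m + b_hat * us) * dt
                + (a1 * xi + b1 * us + a1_hat * m + b1_hat * us)
                  * rng.gauss(0.0, dt ** 0.5)
             for xi in X]
    return sum(X) / n_particles      # empirical E[X(T)]

mean_T = simulate_mf_sde()
```

With $u\equiv 0$, taking expectations in the dynamics gives $d\mathbb E[X]=(a+\hat a)\mathbb E[X]\,ds$, so in this sketch $\mathbb E[X(1)]$ should be close to $e^{-0.5}\approx 0.61$; the particle mean approximates this up to discretization and Monte Carlo error.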
We note that $\mathbb E[X(\cdot)]$ and $\mathbb E[u(\cdot)]$ appear in the state equation. Such an equation is referred to as a linear controlled mean-field (forward) SDE (MF-FSDE, for short). MF-FSDEs can be used to describe particle systems at the mesoscopic level, which is of great importance in applications. Historically, the McKean–Vlasov SDE, a kind of MF-FSDE, was suggested by Kac [18] in 1956 as a stochastic toy model for the Vlasov kinetic equation of plasma, and its study was initiated by McKean [24] in 1966. Since then, many authors have made contributions to McKean–Vlasov type SDEs and their applications; see, for example, Dawson [13], Dawson–Gärtner [14], Gärtner [15], Scheutzow [29], Graham [16], Chan [10], Chiang [11], Ahmed–Ding [1], to mention a few. In recent years, related topics and problems have attracted more and more researchers' attention; see, for example, Veretennikov [32], Huang–Malhamé–Caines [17], Mahmudov–McKibben [23], Buckdahn–Djehiche–Li–Peng [8], Buckdahn–Li–Peng [9], Borkar–Kumar [6], Crisan–Xiong [12], Kotelenez–Kurtz [20], Kloeden–Lorenz [19], and so on. More interestingly, control problems for McKean–Vlasov equations or MF-FSDEs were investigated by Ahmed–Ding [2], Ahmed [3], Buckdahn–Djehiche–Li [7], Park–Balasubramaniam–Kang [28], Andersson–Djehiche [4], Meyer-Brandis–Oksendal–Zhou [25], and so on. This paper can be regarded as an addition to the study of optimal control for MF-FSDEs.

*This work is supported in part by NSF Grant DMS-1007514.

For the state equation (1.1), we introduce the following:
\[
\mathcal U[0,T]\triangleq L^2_{\mathbb F}(0,T;\mathbb R^m)=\Big\{u:[0,T]\times\Omega\to\mathbb R^m\;\Big|\;u(\cdot)\text{ is }\mathbb F\text{-adapted},\ \mathbb E\int_0^T|u(s)|^2\,ds<\infty\Big\}.
\]
Any $u(\cdot)\in\mathcal U[0,T]$ is called an admissible control. Under mild conditions, one can show (see below) that for any $(x,u(\cdot))\in\mathbb R^n\times\mathcal U[0,T]$, (1.1) admits a unique solution $X(\cdot)=X(\cdot\,;x,u(\cdot))$. We introduce the following cost functional:
\[
\begin{aligned}
J(x;u(\cdot))=\mathbb E\Big\{\int_0^T\big[&\langle Q(s)X(s),X(s)\rangle+\langle\hat Q(s)\mathbb E[X(s)],\mathbb E[X(s)]\rangle+\langle R(s)u(s),u(s)\rangle\\
&+\langle\hat R(s)\mathbb E[u(s)],\mathbb E[u(s)]\rangle\big]ds+\langle GX(T),X(T)\rangle+\langle\hat G\mathbb E[X(T)],\mathbb E[X(T)]\rangle\Big\},
\end{aligned}\tag{1.2}
\]
with $Q(\cdot)$, $R(\cdot)$, $\hat Q(\cdot)$, $\hat R(\cdot)$ being suitable deterministic symmetric matrix-valued functions, and $G$, $\hat G$ being symmetric matrices. Our optimal control problem can be stated as follows:

Problem (MF). For given $x\in\mathbb R^n$, find a $\bar u(\cdot)\in\mathcal U[0,T]$ such that
\[
J(x;\bar u(\cdot))=\inf_{u(\cdot)\in\mathcal U[0,T]}J(x;u(\cdot)).\tag{1.3}
\]
Any $\bar u(\cdot)\in\mathcal U[0,T]$ satisfying the above is called an optimal control, and the corresponding state process $\bar X(\cdot)\equiv X(\cdot\,;x,\bar u(\cdot))$ is called an optimal state process; the pair $(\bar X(\cdot),\bar u(\cdot))$ is called an optimal pair.

From the above-listed literature, one has some motivations for the inclusion of $\mathbb E[X(\cdot)]$ and $\mathbb E[u(\cdot)]$ in the state equation. We now briefly explain a motivation for including $\mathbb E[X(\cdot)]$ and $\mathbb E[u(\cdot)]$ in the cost functional. Recall that for a classical LQ problem with the state equation
\[
\begin{cases}
dX(s)=\big[A(s)X(s)+B(s)u(s)\big]ds+\big[A_1(s)X(s)+B_1(s)u(s)\big]dW(s),\qquad s\in[0,T],\\
X(0)=x,
\end{cases}\tag{1.4}
\]
one has the following cost functional:
\[
J_0(x;u(\cdot))=\mathbb E\Big\{\int_0^T\big[\langle Q_0(s)X(s),X(s)\rangle+\langle R_0(s)u(s),u(s)\rangle\big]ds+\langle G_0X(T),X(T)\rangle\Big\}.\tag{1.5}
\]
For the corresponding optimal control problem, it is natural to hope that the optimal state process and/or control process is not too sensitive to the possible variation of the random events. One way to achieve this is to try to make the variances $\mathrm{var}\,[X(\cdot)]$ and $\mathrm{var}\,[u(\cdot)]$ small. Therefore, one could include $\mathrm{var}\,[X(\cdot)]$ and $\mathrm{var}\,[u(\cdot)]$ in the cost functional. Consequently, one might want to replace (1.5) by the following:
\[
\begin{aligned}
\hat J_0(x;u(\cdot))=\mathbb E\Big\{\int_0^T\big[&\langle Q_0(s)X(s),X(s)\rangle+q(s)\,\mathrm{var}\,[X(s)]+\langle R_0(s)u(s),u(s)\rangle+\rho(s)\,\mathrm{var}\,[u(s)]\big]ds\\
&+\langle G_0X(T),X(T)\rangle+g\,\mathrm{var}\,[X(T)]\Big\},
\end{aligned}\tag{1.6}
\]
for some (positive) weighting factors $q(\cdot)$, $\rho(\cdot)$, and $g$. Since
\[
\mathrm{var}\,[X(s)]=\mathbb E|X(s)|^2-\big|\mathbb E[X(s)]\big|^2,
\]
and similar identities hold for $\mathrm{var}\,[X(T)]$ and $\mathrm{var}\,[u(s)]$, we see that
\[
\begin{aligned}
\hat J_0(x;u(\cdot))=\mathbb E\Big\{\int_0^T\big[&\langle[Q_0(s)+q(s)I]X(s),X(s)\rangle-q(s)\big|\mathbb E[X(s)]\big|^2+\langle[R_0(s)+\rho(s)I]u(s),u(s)\rangle\\
&-\rho(s)\big|\mathbb E[u(s)]\big|^2\big]ds+\langle[G_0+gI]X(T),X(T)\rangle-g\big|\mathbb E[X(T)]\big|^2\Big\}.
\end{aligned}
\]
Clearly, the above is a special case of (1.2) with
\[
Q(\cdot)=Q_0(\cdot)+q(\cdot)I,\quad R(\cdot)=R_0(\cdot)+\rho(\cdot)I,\quad G=G_0+gI,\quad\hat Q(\cdot)=-q(\cdot)I,\quad\hat R(\cdot)=-\rho(\cdot)I,\quad\hat G=-gI.
\]
Note that in the above case, $\hat Q(\cdot)$, $\hat R(\cdot)$, and $\hat G$ are not positive semi-definite. Because of this, we do not assume positive semi-definiteness for $\hat Q(\cdot)$, $\hat R(\cdot)$, and $\hat G$.
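The rewriting above rests only on the identity $\mathrm{var}\,[X]=\mathbb E|X|^2-|\mathbb E[X]|^2$. As a quick sanity check, the following scalar sketch evaluates both forms of the running cost on an empirical distribution; the sample values and weights are hypothetical.

```python
def var_cost_identity(samples, q0, q):
    """Scalar check of the rewriting used above:
        E[q0*X^2] + q*var[X] = E[(q0 + q)*X^2] - q*(E[X])^2,
    i.e. the variance penalty corresponds to Q = Q0 + q*I, Q_hat = -q*I
    in the mean-field cost (1.2).  Inputs are an empirical distribution
    and (hypothetical) weights q0, q."""
    n = len(samples)
    mean = sum(x for x in samples) / n
    ex2 = sum(x * x for x in samples) / n
    var = ex2 - mean * mean                  # var[X] = E|X|^2 - |E[X]|^2
    lhs = q0 * ex2 + q * var                 # Q0-cost plus variance penalty
    rhs = (q0 + q) * ex2 - q * mean * mean   # (Q, Q_hat) form of (1.2)
    return lhs, rhs

lhs, rhs = var_cost_identity([1.0, 2.0, -0.5, 3.0], q0=2.0, q=0.7)
```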
The purpose of this paper is to study Problem (MF). We begin with the well-posedness of the state equation and the solvability of Problem (MF) in Section 2. Then, in Section 3, we establish necessary and sufficient conditions for optimal pairs. Naturally, a linear backward stochastic differential equation of mean-field type (MF-BSDE, for short) is derived. Consequently, the optimality system turns out to be a coupled forward-backward stochastic differential equation of mean-field type (MF-FBSDE, for short). Inspired by invariant imbedding [5] and the Four-Step Scheme [22], we derive two Riccati differential equations in Section 4, so that the optimal control can be represented in state feedback form. Well-posedness of these Riccati equations is established, and we also present a direct verification of the optimality of the state feedback control by means of completing squares. In Section 5, we look at a modified LQ problem, which is one motivation of the current paper.
2  Preliminaries.
In this section, we make some preliminary observations. First of all, for any Euclidean space $H=\mathbb R^n,\mathbb R^{n\times m},\mathcal S^n$ (with $\mathcal S^n$ being the set of all $(n\times n)$ symmetric matrices), we let $L^p(0,t;H)$ be the set of all $H$-valued functions that are $L^p$-integrable, $p\in[1,\infty]$. Next, we introduce the following spaces:
\[
\begin{aligned}
\mathcal X_t&\equiv L^2_{\mathcal F_t}(\Omega;\mathbb R^n)=\big\{\xi:\Omega\to\mathbb R^n\;\big|\;\xi\text{ is }\mathcal F_t\text{-measurable},\ \mathbb E|\xi|^2<\infty\big\},\\
\mathcal U_t&\equiv L^2_{\mathcal F_t}(\Omega;\mathbb R^m)=\big\{\eta:\Omega\to\mathbb R^m\;\big|\;\eta\text{ is }\mathcal F_t\text{-measurable},\ \mathbb E|\eta|^2<\infty\big\},\\
L^2_{\mathbb F}(0,t;\mathbb R^n)&=\Big\{X:[0,t]\times\Omega\to\mathbb R^n\;\Big|\;X(\cdot)\text{ is }\mathbb F\text{-adapted},\ \mathbb E\int_0^t|X(s)|^2ds<\infty\Big\},\\
\mathcal X[0,t]&\equiv C_{\mathbb F}([0,t];\mathbb R^n)=\Big\{X:[0,t]\to\mathcal X_t\;\Big|\;X(\cdot)\text{ is }\mathbb F\text{-adapted},\ s\mapsto X(s)\text{ is continuous},\ \sup_{s\in[0,t]}\mathbb E|X(s)|^2<\infty\Big\},\\
\hat{\mathcal X}[0,t]&\equiv L^2_{\mathbb F}(\Omega;C([0,t];\mathbb R^n))=\Big\{X:[0,t]\times\Omega\to\mathbb R^n\;\Big|\;X(\cdot)\text{ is }\mathbb F\text{-adapted},\ X(\cdot)\text{ has continuous paths},\ \mathbb E\big[\sup_{s\in[0,t]}|X(s)|^2\big]<\infty\Big\}.
\end{aligned}
\]
Note that in the definition of $\mathcal X[0,t]$, the continuity of $s\mapsto X(s)$ means continuity as a map from $[0,t]$ to $\mathcal X_t$; whereas in the definition of $\hat{\mathcal X}[0,t]$, "$X(\cdot)$ has continuous paths" means that for almost every $\omega\in\Omega$, $s\mapsto X(s,\omega)$ is continuous. It is known that
\[
\hat{\mathcal X}[0,t]\subseteq\mathcal X[0,t]\subseteq L^2_{\mathbb F}(0,t;\mathbb R^n),\qquad\hat{\mathcal X}[0,t]\ne\mathcal X[0,t]\ne L^2_{\mathbb F}(0,t;\mathbb R^n).
\]
We now introduce the following assumptions on the coefficients of the state equation.

(H1) The following hold:
\[
A(\cdot),\hat A(\cdot),A_1(\cdot),\hat A_1(\cdot)\in L^\infty(0,T;\mathbb R^{n\times n}),\qquad B(\cdot),\hat B(\cdot),B_1(\cdot),\hat B_1(\cdot)\in L^\infty(0,T;\mathbb R^{n\times m}).\tag{2.1}
\]

(H1)$'$ The following hold:
\[
A(\cdot),\hat A(\cdot)\in L^2(0,T;\mathbb R^{n\times n}),\quad A_1(\cdot),\hat A_1(\cdot)\in L^\infty(0,T;\mathbb R^{n\times n}),\quad B(\cdot),\hat B(\cdot)\in L^2(0,T;\mathbb R^{n\times m}),\quad B_1(\cdot),\hat B_1(\cdot)\in L^\infty(0,T;\mathbb R^{n\times m}).\tag{2.2}
\]

(H1)$''$ The following hold:
\[
A(\cdot),\hat A(\cdot)\in L^1(0,T;\mathbb R^{n\times n}),\quad A_1(\cdot),\hat A_1(\cdot)\in L^2(0,T;\mathbb R^{n\times n}),\quad B(\cdot),\hat B(\cdot)\in L^2(0,T;\mathbb R^{n\times m}),\quad B_1(\cdot),\hat B_1(\cdot)\in L^\infty(0,T;\mathbb R^{n\times m}).\tag{2.3}
\]

Clearly, (H1) implies (H1)$'$, which further implies (H1)$''$; namely, (H1)$''$ is the weakest assumption among the three, while (H1) is the most common one. For the weighting matrices in the cost functional, we introduce the following assumptions.

(H2) The following hold:
\[
Q(\cdot),\hat Q(\cdot)\in L^\infty(0,T;\mathcal S^n),\qquad R(\cdot),\hat R(\cdot)\in L^\infty(0,T;\mathcal S^m),\qquad G,\hat G\in\mathcal S^n.\tag{2.4}
\]

(H2)$'$ In addition to (H2), the following holds:
\[
Q(s),\ Q(s)+\hat Q(s)\ge 0,\quad R(s),\ R(s)+\hat R(s)\ge 0,\quad s\in[0,T],\qquad G,\ G+\hat G\ge 0.\tag{2.5}
\]

(H2)$''$ In addition to (H2), the following holds: for some $\delta>0$,
\[
Q(s),\ Q(s)+\hat Q(s)\ge 0,\quad R(s),\ R(s)+\hat R(s)\ge\delta I,\quad s\in[0,T],\qquad G,\ G+\hat G\ge 0.\tag{2.6}
\]
From the example in Section 1, we see that $\hat Q(\cdot)$, $\hat R(\cdot)$, and $\hat G$ are not necessarily positive semi-definite. Therefore, in (H2) we do not require positive semi-definiteness of the involved matrices and matrix-valued functions.

Now, for any $X(\cdot)\in L^2_{\mathbb F}(0,T;\mathbb R^n)$ and any $u(\cdot)\in\mathcal U[0,T]$, we define
\[
\begin{aligned}
[\mathcal AX(\cdot)](t)&=\int_0^t\big\{A(s)X(s)+\hat A(s)\mathbb E[X(s)]\big\}ds+\int_0^t\big\{A_1(s)X(s)+\hat A_1(s)\mathbb E[X(s)]\big\}dW(s),&&t\in[0,T],\\
[\mathcal Bu(\cdot)](t)&=\int_0^t\big\{B(s)u(s)+\hat B(s)\mathbb E[u(s)]\big\}ds+\int_0^t\big\{B_1(s)u(s)+\hat B_1(s)\mathbb E[u(s)]\big\}dW(s),&&t\in[0,T].
\end{aligned}\tag{2.7}
\]
The following result is concerned with the operators $\mathcal A$ and $\mathcal B$.

Lemma 2.1. The following estimates hold as long as the norms on the right-hand sides are meaningful: for any $t\in[0,T]$,
\[
\|\mathcal AX(\cdot)\|^2_{\hat{\mathcal X}[0,t]}\le C\Big[\|A(\cdot)\|^2_{L^2(0,t;\mathbb R^{n\times n})}+\|\hat A(\cdot)\|^2_{L^2(0,t;\mathbb R^{n\times n})}+\|A_1(\cdot)\|^2_{L^\infty(0,t;\mathbb R^{n\times n})}+\|\hat A_1(\cdot)\|^2_{L^\infty(0,t;\mathbb R^{n\times n})}\Big]\|X(\cdot)\|^2_{L^2_{\mathbb F}(0,t;\mathbb R^n)},\tag{2.8}
\]
\[
\|\mathcal AX(\cdot)\|^2_{\hat{\mathcal X}[0,t]}\le C\Big[\|A(\cdot)\|^2_{L^1(0,t;\mathbb R^{n\times n})}+\|\hat A(\cdot)\|^2_{L^1(0,t;\mathbb R^{n\times n})}+\|A_1(\cdot)\|^2_{L^2(0,t;\mathbb R^{n\times n})}+\|\hat A_1(\cdot)\|^2_{L^2(0,t;\mathbb R^{n\times n})}\Big]\|X(\cdot)\|^2_{\mathcal X[0,t]},\tag{2.9}
\]
and
\[
\|\mathcal Bu(\cdot)\|^2_{\hat{\mathcal X}[0,t]}\le C\Big[\|B(\cdot)\|^2_{L^2(0,t;\mathbb R^{n\times m})}+\|\hat B(\cdot)\|^2_{L^2(0,t;\mathbb R^{n\times m})}+\|B_1(\cdot)\|^2_{L^\infty(0,t;\mathbb R^{n\times m})}+\|\hat B_1(\cdot)\|^2_{L^\infty(0,t;\mathbb R^{n\times m})}\Big]\|u(\cdot)\|^2_{\mathcal U[0,t]}.\tag{2.10}
\]
Hereafter, $C>0$ denotes a generic constant which may differ from line to line.

Proof. For any $t\in(0,T]$ and any $X(\cdot)\in L^2_{\mathbb F}(0,t;\mathbb R^n)$,
\[
\begin{aligned}
\mathbb E\sup_{s\in[0,t]}\big|[\mathcal AX(\cdot)](s)\big|^2&\le C\,\mathbb E\Big\{\Big(\int_0^t|A(s)||X(s)|ds\Big)^2+\Big(\int_0^t|\hat A(s)||\mathbb E[X(s)]|ds\Big)^2\\
&\qquad\qquad+\int_0^t|A_1(s)|^2|X(s)|^2ds+\int_0^t|\hat A_1(s)|^2|\mathbb E[X(s)]|^2ds\Big\}\\
&\le C\Big\{\mathbb E\Big[\int_0^t|A(s)|^2ds\int_0^t|X(s)|^2ds\Big]+\int_0^t|\hat A(s)|^2ds\int_0^t|\mathbb E[X(s)]|^2ds\\
&\qquad\qquad+\sup_{s\in[0,t]}|A_1(s)|^2\int_0^t\mathbb E|X(s)|^2ds+\sup_{s\in[0,t]}|\hat A_1(s)|^2\int_0^t\mathbb E|X(s)|^2ds\Big\}\\
&\le C\Big\{\int_0^t|A(s)|^2ds+\int_0^t|\hat A(s)|^2ds+\sup_{s\in[0,t]}|A_1(s)|^2+\sup_{s\in[0,t]}|\hat A_1(s)|^2\Big\}\int_0^t\mathbb E|X(s)|^2ds.
\end{aligned}
\]
Thus, estimate (2.8) holds. Next, for any $X(\cdot)\in\mathcal X[0,t]$, making use of the Burkholder–Davis–Gundy inequality ([34]), we have
\[
\begin{aligned}
\mathbb E\sup_{s\in[0,t]}\big|[\mathcal AX(\cdot)](s)\big|^2&\le C\,\mathbb E\Big\{\Big(\int_0^t|A(s)||X(s)|ds\Big)^2+\Big(\int_0^t|\hat A(s)||\mathbb E[X(s)]|ds\Big)^2\\
&\qquad\qquad+\int_0^t|A_1(s)|^2|X(s)|^2ds+\int_0^t|\hat A_1(s)|^2|\mathbb E[X(s)]|^2ds\Big\}\\
&\le C\Big\{\int_0^t|A(s)|ds\int_0^t|A(s)|\,\mathbb E|X(s)|^2ds+\Big(\int_0^t|\hat A(s)|ds\Big)^2\sup_{s\in[0,t]}|\mathbb E[X(s)]|^2\\
&\qquad\qquad+\int_0^t|A_1(s)|^2\mathbb E|X(s)|^2ds+\int_0^t|\hat A_1(s)|^2\mathbb E|X(s)|^2ds\Big\}\\
&\le C\Big[\Big(\int_0^t|A(s)|ds\Big)^2+\Big(\int_0^t|\hat A(s)|ds\Big)^2+\int_0^t|A_1(s)|^2ds+\int_0^t|\hat A_1(s)|^2ds\Big]\sup_{s\in[0,t]}\mathbb E|X(s)|^2.
\end{aligned}
\]
Hence, (2.9) follows. Estimate (2.10) can be proved similarly to (2.8).

The above lemma leads to the following corollary.
Corollary 2.2. If (H1)$''$ holds, then $\mathcal A:\mathcal X[0,T]\to\hat{\mathcal X}[0,T]$ and $\mathcal B:\mathcal U[0,T]\to\hat{\mathcal X}[0,T]$ are bounded. Further, if (H1)$'$ holds, then $\mathcal A:L^2_{\mathbb F}(0,T;\mathbb R^n)\to\hat{\mathcal X}[0,T]$ is also bounded. In particular, all of the above hold if (H1) holds.

Next, we define
\[
\begin{aligned}
I_TX(\cdot)&=X(T),\\
\mathcal A_TX(\cdot)&=I_T\mathcal AX(\cdot)\equiv[\mathcal AX(\cdot)](T)=\int_0^T\big\{A(s)X(s)+\hat A(s)\mathbb E[X(s)]\big\}ds+\int_0^T\big\{A_1(s)X(s)+\hat A_1(s)\mathbb E[X(s)]\big\}dW(s),\\
\mathcal B_Tu(\cdot)&=I_T\mathcal Bu(\cdot)\equiv[\mathcal Bu(\cdot)](T)=\int_0^T\big\{B(s)u(s)+\hat B(s)\mathbb E[u(s)]\big\}ds+\int_0^T\big\{B_1(s)u(s)+\hat B_1(s)\mathbb E[u(s)]\big\}dW(s).
\end{aligned}\tag{2.11}
\]
It is easy to see that $I_T:\mathcal X[0,T]\to\mathcal X_T$ is bounded. According to Lemma 2.1, we have the following result.

Corollary 2.3. If (H1)$''$ holds, then $\mathcal A_T:\mathcal X[0,T]\to\mathcal X_T$ and $\mathcal B_T:\mathcal U[0,T]\to\mathcal X_T$ are bounded. Further, if (H1)$'$ holds, then $\mathcal A_T:L^2_{\mathbb F}(0,T;\mathbb R^n)\to\mathcal X_T$ is also bounded. In particular, all of the above hold if (H1) holds.

Recall that if $\xi\in\mathcal X_T$, then there exists a unique $\zeta(\cdot)\in L^2_{\mathbb F}(0,T;\mathbb R^n)$ such that
\[
\xi=\mathbb E\xi+\int_0^T\zeta(s)dW(s).
\]
We denote $\zeta(s)=D_s\xi$, $s\in[0,T]$, and call it the Malliavin derivative of $\xi$ ([27]). Next, we have the following result, which gives representations of the adjoint operators of $\mathcal A$, $\mathcal B$, $\mathcal A_T$, and $\mathcal B_T$.

Proposition 2.4. The following hold:
\[
\begin{aligned}
(\mathcal A^*Y)(s)&=\int_s^T\big\{A(s)^TY(t)+\hat A(s)^T\mathbb E[Y(t)]+A_1(s)^TD_sY(t)+\hat A_1(s)^T\mathbb E[D_sY(t)]\big\}dt,\\
(\mathcal B^*Y)(s)&=\int_s^T\big\{B(s)^TY(t)+\hat B(s)^T\mathbb E[Y(t)]+B_1(s)^TD_sY(t)+\hat B_1(s)^T\mathbb E[D_sY(t)]\big\}dt,\\
(\mathcal A_T^*\xi)(s)&=A(s)^T\xi+\hat A(s)^T\mathbb E\xi+A_1(s)^TD_s\xi+\hat A_1(s)^T\mathbb E[D_s\xi],\\
(\mathcal B_T^*\xi)(s)&=B(s)^T\xi+\hat B(s)^T\mathbb E\xi+B_1(s)^TD_s\xi+\hat B_1(s)^T\mathbb E[D_s\xi].
\end{aligned}\tag{2.12}
\]
Proof. For any $Y(\cdot)\in L^2_{\mathbb F}(0,T;\mathbb R^n)$,
\[
\begin{aligned}
\langle X,\mathcal A^*Y\rangle&=\langle\mathcal AX,Y\rangle=\mathbb E\int_0^T\langle[\mathcal AX](t),Y(t)\rangle dt\\
&=\mathbb E\int_0^T\Big\langle\int_0^t\big\{A(s)X(s)+\hat A(s)\mathbb E[X(s)]\big\}ds+\int_0^t\big\{A_1(s)X(s)+\hat A_1(s)\mathbb E[X(s)]\big\}dW(s),\ Y(t)\Big\rangle dt\\
&=\mathbb E\int_0^T\!\!\int_s^T\big\langle A(s)X(s)+\hat A(s)\mathbb E[X(s)],Y(t)\big\rangle dt\,ds+\mathbb E\int_0^T\!\!\int_s^T\big\langle A_1(s)X(s)+\hat A_1(s)\mathbb E[X(s)],D_sY(t)\big\rangle dt\,ds\\
&=\mathbb E\int_0^T\Big\langle X(s),\int_s^T\big\{A(s)^TY(t)+\hat A(s)^T\mathbb E[Y(t)]+A_1(s)^TD_sY(t)+\hat A_1(s)^T\mathbb E[D_sY(t)]\big\}dt\Big\rangle ds.
\end{aligned}
\]
Thus, the representation of $\mathcal A^*$ follows. Similarly, we can obtain the representation of $\mathcal B^*$. Next, for any $\xi\in\mathcal X_T$,
\[
\begin{aligned}
\langle X,\mathcal A_T^*\xi\rangle&=\langle\mathcal A_TX,\xi\rangle=\mathbb E\Big\langle\int_0^T\big\{A(s)X(s)+\hat A(s)\mathbb E[X(s)]\big\}ds,\ \xi\Big\rangle+\mathbb E\Big\langle\int_0^T\big\{A_1(s)X(s)+\hat A_1(s)\mathbb E[X(s)]\big\}dW(s),\ \xi\Big\rangle\\
&=\mathbb E\int_0^T\big\langle X(s),\ A(s)^T\xi+\hat A(s)^T\mathbb E\xi+A_1(s)^TD_s\xi+\hat A_1(s)^T\mathbb E[D_s\xi]\big\rangle ds.
\end{aligned}
\]
Therefore, the representation of $\mathcal A_T^*$ follows. Similarly, we can obtain the representation of $\mathcal B_T^*$.

For completeness, let us also prove the following result.

Proposition 2.5. It holds that
\[
I_T^*\xi=\xi\,\delta_{\{T\}},\qquad\forall\,\xi\in\mathcal X_T,\tag{2.13}
\]
where $\delta_{\{T\}}$ is the Dirac measure at $T$, and
\[
\mathbb E^*x=x,\qquad\forall\,x\in\mathbb R^n,\tag{2.14}
\]
where on the right-hand side $x$ is regarded as a constant random variable in $\mathcal X_T$.

Proof. First of all, since $I_T:\mathcal X[0,T]\to\mathcal X_T$ is bounded, we have $I_T^*:\mathcal X_T^*\equiv\mathcal X_T\to\mathcal X[0,T]^*$. For any $\xi\in\mathcal X_T$ and any $Y(\cdot)\in\mathcal X[0,T]$, we have
\[
\langle I_T^*\xi,Y(\cdot)\rangle=\langle\xi,I_TY(\cdot)\rangle=\mathbb E\langle\xi,Y(T)\rangle=\mathbb E\int_0^T\langle Y(s),\xi\rangle\,\delta_{\{T\}}(ds).
\]
Next, since $\mathbb E:\mathcal X_T\to\mathbb R^n$ is bounded, we have $\mathbb E^*:\mathbb R^n\to\mathcal X_T$. For any $\xi\in\mathcal X_T$ and $x\in\mathbb R^n$,
\[
\langle\mathbb E^*x,\xi\rangle=\langle x,\mathbb E\xi\rangle=x^T\mathbb E\xi=\mathbb E\langle x,\xi\rangle.
\]
This completes the proof.

With the operators $\mathcal A$ and $\mathcal B$, we can write the state equation (1.1) as follows:
\[
X=x+\mathcal AX+\mathcal Bu.\tag{2.15}
\]
We now have the following result on the well-posedness of the state equation.

Proposition 2.6. Let (H1) hold. Then for any $(x,u(\cdot))\in\mathbb R^n\times\mathcal U[0,T]$, the state equation (1.1) admits a unique solution $X(\cdot)\equiv X(\cdot\,;x,u(\cdot))\in\hat{\mathcal X}[0,T]$.

Proof. For any $X(\cdot)\in\mathcal X[0,T]$ and $u(\cdot)\in\mathcal U[0,T]$, by (2.9) we have
\[
\mathbb E\Big[\sup_{s\in[0,t]}|(\mathcal AX)(s)|^2\Big]\le\alpha(t)\,\mathbb E\Big[\sup_{s\in[0,t]}|X(s)|^2\Big],
\]
with $\alpha(t)\in(0,1)$ when $t>0$ is small. Hence, by the contraction mapping theorem, we obtain the well-posedness of the state equation on $[0,t]$; a usual continuation argument then yields well-posedness on $[0,T]$.

From (2.9), we see that if, for some $\varepsilon>0$,
\[
A(\cdot),\hat A(\cdot)\in L^{1+\varepsilon}(0,T;\mathbb R^{n\times n}),\quad A_1(\cdot),\hat A_1(\cdot)\in L^{2+\varepsilon}(0,T;\mathbb R^{n\times n}),\quad B(\cdot),\hat B(\cdot)\in L^2(0,T;\mathbb R^{n\times m}),\quad B_1(\cdot),\hat B_1(\cdot)\in L^\infty(0,T;\mathbb R^{n\times m}),\tag{2.16}
\]
then the result of Proposition 2.6 still holds. It is easy to see that (2.16) is stronger than (H1)$''$ and weaker than (H1)$'$. In what follows, for convenience, we will assume (H1); we keep in mind, however, that (H1) can actually be relaxed.

Proposition 2.6 tells us that under, say, (H1), the operator $I-\mathcal A:\hat{\mathcal X}[0,T]\to\hat{\mathcal X}[0,T]$ is invertible, and the solution $X$ of the state equation corresponding to $(x,u(\cdot))\in\mathbb R^n\times\mathcal U[0,T]$ is given by
\[
X=(I-\mathcal A)^{-1}x+(I-\mathcal A)^{-1}\mathcal Bu.
\]
Note that
\[
I_T\big[(I-\mathcal A)^{-1}x+(I-\mathcal A)^{-1}\mathcal Bu\big]=I_TX=X(T)=x+\mathcal A_TX+\mathcal B_Tu=\big[I+\mathcal A_T(I-\mathcal A)^{-1}\big]x+\big[\mathcal A_T(I-\mathcal A)^{-1}\mathcal B+\mathcal B_T\big]u.
\]
Therefore,
\[
I_T(I-\mathcal A)^{-1}=I+\mathcal A_T(I-\mathcal A)^{-1},\qquad I_T(I-\mathcal A)^{-1}\mathcal B=\mathcal A_T(I-\mathcal A)^{-1}\mathcal B+\mathcal B_T.
\]
Now, let
Then the cost functional can be written as
b b b J(x; u(·)) = hQX, Xi+hQlEX, lEXi+hRu, ui+hRlEu, lEui+hGX(T ), X(T )i+hGlEX(T ), lEX(T )i
= h Q[(I − A)−1 x + (I − A)−1 Bu], (I − A)−1 x + (I − A)−1 Bu i b b + h QlE[(I − A)−1 x + (I − A)−1 Bu], lE[(I − A)−1 x + (I − A)−1 Bu] i + h Ru, u i + h RlEu, lEu i
+h G{[I +AT (I −A)−1 ]x+[AT (I −A)−1 B+BT ]u}, [I +AT (I −A)−1 ]x+[AT (I −A)−1 B+BT ]u i b +hGlE{[I +AT (I −A)−1 ]x+[AT (I −A)−1 B+BT ]u}, lE{[I +AT (I −A)−1 ]x+[AT (I −A)−1 B+BT ]u}i
(2.17)
≡ h Θ2 u, u i +2 h Θ1 x, u i + h Θ0 x, x i, where
b + B ∗ (I − A∗ )−1 Q(I − A)−1 B + B ∗ (I − A∗ )−1 lE∗ QlE(I b Θ2 = R + lE∗ RlE − A)−1 B −1 b +[B ∗ (I −A∗ )−1 A∗ +B ∗ ]G[AT (I −A)−1 B+BT ]+[B ∗(I −A∗ )−1 A∗ +B ∗ ]lE∗ GlE[A B+BT ] T (I −A) T
T
T
T
b + B ∗ (I − A∗ )−1 (Q + lE∗ QlE)(I b = R + lE RlE − A)−1 B −1 b +[B ∗ (I − A∗ )−1 A∗ + B ∗ ](G + lE∗ GlE)[A B + BT ], T (I − A) ∗
T
∗
∗ −1
Θ1 = B (I − A )
T
−1
Q(I − A)
b − A)−1 + B ∗ (I − A∗ )−1 lE∗ QlE(I
+[B ∗ (I − A∗ )−1 A∗T + BT∗ ]G[I + AT (I − A)−1 ] b +[B ∗ (I − A∗ )−1 A∗ + B ∗ ]lE∗ GlE[I + AT (I − A)−1 ] T ∗
T
b b = B (I − A ) (Q + lE QlE)(I − A)−1 + [B ∗ (I − A∗ )−1 A∗T + BT∗ ](G + lE∗ GlE)[I + AT (I − A)−1 ], b Θ0 = (I − A∗ )−1 Q(I − A)−1 + (I − A∗ )−1 lE∗ QlE(I − A)−1 ∗
∗ −1
b + AT (I − A)−1 ] +[I + (I − A∗ )−1 A∗T ]G[I + AT (I − A)−1 ] + [I + (I − A∗ )−1 A∗T ]lE∗ GlE[I b b = (I − A∗ )−1 (Q + lE∗ QlE)(I − A)−1 + [I + (I − A∗ )−1 A∗T ](G + lE∗ GlE)[I + AT (I − A)−1 ]. 8
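The computation that follows relies only on $J$ being quadratic in $(x,u)$. As a sanity check, the sketch below replaces the operators $\Theta_2,\Theta_1,\Theta_0$ by scalars (hypothetical values) and verifies numerically the completion-of-squares expansion $J(x;v)=J(x;u)+2\langle\Theta_2u+\Theta_1x,v-u\rangle+\langle\Theta_2(v-u),v-u\rangle$ used in (2.18).

```python
def J(u, x, th2, th1, th0):
    """Finite-dimensional stand-in for (2.17):
        J(x; u) = <Theta2 u, u> + 2 <Theta1 x, u> + <Theta0 x, x>,
    with scalars th2, th1, th0 replacing the operators (hypothetical values)."""
    return th2 * u * u + 2.0 * th1 * x * u + th0 * x * x

th2, th1, th0 = 3.0, -1.0, 2.0      # th2 > 0, so a unique minimum exists
x, u, v = 0.5, 0.7, -1.3
# Expansion of J(x; v) around u, as in (2.18):
lhs = J(v, x, th2, th1, th0)
rhs = (J(u, x, th2, th1, th0)
       + 2.0 * (th2 * u + th1 * x) * (v - u)
       + th2 * (v - u) ** 2)
```

In this scalar setting the unique minimizer is $u^\star=-\Theta_2^{-1}\Theta_1x$, mirroring the role of (2.20)–(2.21) below.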
Consequently, for any $u(\cdot),v(\cdot)\in\mathcal U[0,T]$ and $x\in\mathbb R^n$,
\[
\begin{aligned}
J(x;v(\cdot))&=J(x;u(\cdot)+[v(\cdot)-u(\cdot)])\\
&=\langle\Theta_2[u+(v-u)],u+(v-u)\rangle+2\langle\Theta_1x,u+(v-u)\rangle+\langle\Theta_0x,x\rangle\\
&=\langle\Theta_2u,u\rangle+2\langle\Theta_1x,u\rangle+\langle\Theta_0x,x\rangle+2\langle\Theta_2u+\Theta_1x,v-u\rangle+\langle\Theta_2(v-u),v-u\rangle\\
&=J(x;u(\cdot))+2\langle\Theta_2u+\Theta_1x,v-u\rangle+\langle\Theta_2(v-u),v-u\rangle.
\end{aligned}\tag{2.18}
\]
We now present the following result, whose proof is standard and makes use of the above (see [26] for details).

Proposition 2.7. If $u(\cdot)\mapsto J(x;u(\cdot))$ admits a minimum, then
\[
\Theta_2\ge 0.\tag{2.19}
\]
Conversely, if in addition to (2.19) one has
\[
\Theta_1x\in\Theta_2\big(\mathcal U[0,T]\big),\tag{2.20}
\]
then $u(\cdot)\mapsto J(x;u(\cdot))$ admits a minimum $\bar u(\cdot)\in\mathcal U[0,T]$. Further, if
\[
\Theta_2\ge\delta I\tag{2.21}
\]
for some $\delta>0$, then for any given $x\in\mathbb R^n$, $u(\cdot)\mapsto J(x;u(\cdot))$ admits a unique minimum.

By the definition of $\Theta_2$, we see that (2.19) is implied by the following:
\[
\mathcal R+\mathbb E^*\hat{\mathcal R}\mathbb E\ge 0,\qquad\mathcal Q+\mathbb E^*\hat{\mathcal Q}\mathbb E\ge 0,\qquad\mathcal G+\mathbb E^*\hat{\mathcal G}\mathbb E\ge 0,\tag{2.22}
\]
and (2.21) is implied by
\[
\mathcal R+\mathbb E^*\hat{\mathcal R}\mathbb E\ge\delta I,\qquad\mathcal Q+\mathbb E^*\hat{\mathcal Q}\mathbb E\ge 0,\qquad\mathcal G+\mathbb E^*\hat{\mathcal G}\mathbb E\ge 0,\tag{2.23}
\]
for some $\delta>0$. We would now like to present more direct conditions under which (2.19) and (2.21) hold, respectively.

Proposition 2.8. Let (H1) and (H2)$'$ hold. Then (2.19) holds. Further, if (H2)$''$ holds for some $\delta>0$, then (2.21) holds and Problem (MF) admits a unique solution.

Proof. For any $\xi\in\mathcal X_T$,
\[
\begin{aligned}
\mathbb E\big[\langle G\xi,\xi\rangle+\langle\hat G\mathbb E[\xi],\mathbb E[\xi]\rangle\big]&=\mathbb E\big[\langle G(\xi-\mathbb E[\xi]),\xi-\mathbb E[\xi]\rangle\big]+\langle(G+\hat G)\mathbb E[\xi],\mathbb E[\xi]\rangle\ge 0,\\
\mathbb E\big[\langle Q(s)\xi,\xi\rangle+\langle\hat Q(s)\mathbb E[\xi],\mathbb E[\xi]\rangle\big]&=\mathbb E\big[\langle Q(s)(\xi-\mathbb E[\xi]),\xi-\mathbb E[\xi]\rangle\big]+\langle[Q(s)+\hat Q(s)]\mathbb E[\xi],\mathbb E[\xi]\rangle\ge 0,
\end{aligned}
\]
and for any $\eta\in\mathcal U_T$,
\[
\mathbb E\big[\langle R(s)\eta,\eta\rangle+\langle\hat R(s)\mathbb E[\eta],\mathbb E[\eta]\rangle\big]=\mathbb E\big[\langle R(s)(\eta-\mathbb E[\eta]),\eta-\mathbb E[\eta]\rangle\big]+\langle[R(s)+\hat R(s)]\mathbb E[\eta],\mathbb E[\eta]\rangle\ge 0.
\]
Thus, (2.19) holds. Next, if (H2)$''$ holds, then
\[
\mathbb E\big[\langle R(s)\eta,\eta\rangle+\langle\hat R(s)\mathbb E[\eta],\mathbb E[\eta]\rangle\big]\ge\delta\,\mathbb E\big[|\eta-\mathbb E[\eta]|^2+|\mathbb E[\eta]|^2\big]=\delta\,\mathbb E|\eta|^2.
\]
Hence, (2.21) holds.
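The decomposition used in the proof above can be checked on an empirical distribution; the sketch below treats the scalar case with hypothetical weights, including a negative $\hat R$, as allowed by (H2)$''$.

```python
import random

def mean_field_quadratic(samples, R, R_hat):
    """Empirical check of the decomposition in the proof of Proposition 2.8:
        E[R*eta^2] + R_hat*(E[eta])^2
          = E[R*(eta - E[eta])^2] + (R + R_hat)*(E[eta])^2   (scalar case).
    R and R_hat are hypothetical weights; R_hat may be negative as long as
    R + R_hat stays positive."""
    n = len(samples)
    m = sum(samples) / n
    lhs = R * sum(e * e for e in samples) / n + R_hat * m * m
    rhs = R * sum((e - m) ** 2 for e in samples) / n + (R + R_hat) * m * m
    return lhs, rhs

rng = random.Random(1)
eta = [rng.uniform(-2.0, 2.0) for _ in range(500)]
lhs, rhs = mean_field_quadratic(eta, R=1.5, R_hat=-0.8)
```

Here $\min(R,R+\hat R)=0.7$, so the decomposition gives $\mathrm{lhs}\ge 0.7\,\mathbb E[\eta^2]$ even though $\hat R<0$, which is exactly how (H2)$''$ yields (2.21).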
3  Optimality Conditions.

In this section, we first derive a necessary condition for an optimal pair of Problem (MF).

Theorem 3.1. Let (H1) and (H2) hold, and let $(\bar X(\cdot),\bar u(\cdot))$ be an optimal pair of Problem (MF). Then the following mean-field backward stochastic differential equation (MF-BSDE, for short) admits a unique adapted solution $(Y(\cdot),Z(\cdot))$:
\[
\begin{cases}
dY(s)=-\big\{A(s)^TY(s)+A_1(s)^TZ(s)+\hat A(s)^T\mathbb E[Y(s)]+\hat A_1(s)^T\mathbb E[Z(s)]\\
\qquad\qquad\quad+Q(s)\bar X(s)+\hat Q(s)\mathbb E[\bar X(s)]\big\}ds+Z(s)dW(s),\qquad s\in[0,T],\\
Y(T)=G\bar X(T)+\hat G\mathbb E[\bar X(T)],
\end{cases}\tag{3.1}
\]
such that
\[
R(s)\bar u(s)+\hat R(s)\mathbb E[\bar u(s)]+B(s)^TY(s)+B_1(s)^TZ(s)+\hat B(s)^T\mathbb E[Y(s)]+\hat B_1(s)^T\mathbb E[Z(s)]=0,\qquad s\in[0,T],\ \text{a.s.}\tag{3.2}
\]
Proof. Let $(\bar X(\cdot),\bar u(\cdot))$ be an optimal pair of Problem (MF). For any $u(\cdot)\in\mathcal U[0,T]$, let $X(\cdot)$ be the state process corresponding to the zero initial condition and the control $u(\cdot)$. Since $J$ is quadratic and $\bar u(\cdot)$ is optimal, the first variation of $J$ at $\bar u(\cdot)$ in the direction $u(\cdot)$ vanishes:
\[
\begin{aligned}
0&=\mathbb E\Big\{\int_0^T\big[\langle Q(s)\bar X(s),X(s)\rangle+\langle\hat Q(s)\mathbb E[\bar X(s)],\mathbb E[X(s)]\rangle+\langle R(s)\bar u(s),u(s)\rangle+\langle\hat R(s)\mathbb E[\bar u(s)],\mathbb E[u(s)]\rangle\big]ds\\
&\qquad\qquad+\langle G\bar X(T),X(T)\rangle+\langle\hat G\mathbb E[\bar X(T)],\mathbb E[X(T)]\rangle\Big\}\\
&=\mathbb E\Big\{\int_0^T\big[\langle Q(s)\bar X(s)+\hat Q(s)\mathbb E[\bar X(s)],X(s)\rangle+\langle R(s)\bar u(s)+\hat R(s)\mathbb E[\bar u(s)],u(s)\rangle\big]ds+\langle G\bar X(T)+\hat G\mathbb E[\bar X(T)],X(T)\rangle\Big\}.
\end{aligned}
\]
On the other hand, by [9], we know that (3.1) admits a unique adapted solution $(Y(\cdot),Z(\cdot))$. Then, by Itô's formula,
\[
\begin{aligned}
\mathbb E\langle X(T),G\bar X(T)+\hat G\mathbb E[\bar X(T)]\rangle&=\mathbb E\langle X(T),Y(T)\rangle\\
&=\mathbb E\int_0^T\Big\{\big\langle A(s)X(s)+B(s)u(s)+\hat A(s)\mathbb E[X(s)]+\hat B(s)\mathbb E[u(s)],\ Y(s)\big\rangle\\
&\qquad-\big\langle X(s),\ A(s)^TY(s)+A_1(s)^TZ(s)+\hat A(s)^T\mathbb E[Y(s)]+\hat A_1(s)^T\mathbb E[Z(s)]+Q(s)\bar X(s)+\hat Q(s)\mathbb E[\bar X(s)]\big\rangle\\
&\qquad+\big\langle A_1(s)X(s)+B_1(s)u(s)+\hat A_1(s)\mathbb E[X(s)]+\hat B_1(s)\mathbb E[u(s)],\ Z(s)\big\rangle\Big\}ds\\
&=\mathbb E\int_0^T\Big\{-\big\langle X(s),\ Q(s)\bar X(s)+\hat Q(s)\mathbb E[\bar X(s)]\big\rangle\\
&\qquad+\big\langle u(s),\ B(s)^TY(s)+B_1(s)^TZ(s)+\hat B(s)^T\mathbb E[Y(s)]+\hat B_1(s)^T\mathbb E[Z(s)]\big\rangle\Big\}ds.
\end{aligned}
\]
Hence,
\[
\mathbb E\int_0^T\big\langle u(s),\ B(s)^TY(s)+B_1(s)^TZ(s)+\hat B(s)^T\mathbb E[Y(s)]+\hat B_1(s)^T\mathbb E[Z(s)]+R(s)\bar u(s)+\hat R(s)\mathbb E[\bar u(s)]\big\rangle ds=0,
\]
and since $u(\cdot)\in\mathcal U[0,T]$ is arbitrary, this leads to
\[
R(s)\bar u(s)+\hat R(s)\mathbb E[\bar u(s)]+B(s)^TY(s)+B_1(s)^TZ(s)+\hat B(s)^T\mathbb E[Y(s)]+\hat B_1(s)^T\mathbb E[Z(s)]=0.
\]
This completes the proof.
From the above, we end up with the following optimality system (with $s$ suppressed):
\[
\begin{cases}
d\bar X=\big\{A\bar X+B\bar u+\hat A\mathbb E[\bar X]+\hat B\mathbb E[\bar u]\big\}ds+\big\{A_1\bar X+B_1\bar u+\hat A_1\mathbb E[\bar X]+\hat B_1\mathbb E[\bar u]\big\}dW(s),\\
dY=-\big\{A^TY+A_1^TZ+\hat A^T\mathbb E[Y]+\hat A_1^T\mathbb E[Z]+Q\bar X+\hat Q\mathbb E[\bar X]\big\}ds+Z\,dW(s),\\
\bar X(0)=x,\qquad Y(T)=G\bar X(T)+\hat G\mathbb E[\bar X(T)],\\
R\bar u+\hat R\mathbb E[\bar u]+B^TY+B_1^TZ+\hat B^T\mathbb E[Y]+\hat B_1^T\mathbb E[Z]=0.
\end{cases}\tag{3.3}
\]
This is called a (coupled) forward-backward stochastic differential equation of mean-field type (MF-FBSDE, for short). Note that the coupling comes from the last relation (which is essentially the maximum condition in the usual Pontryagin-type maximum principle). The 4-tuple $(\bar X(\cdot),\bar u(\cdot),Y(\cdot),Z(\cdot))$ of $\mathbb F$-adapted processes satisfying the above is called an adapted solution of (3.3).

We now look at the sufficiency of the above result.

Theorem 3.2. Let (H1), (H2), and (2.19) hold. Suppose $(\bar X(\cdot),\bar u(\cdot),Y(\cdot),Z(\cdot))$ is an adapted solution to the MF-FBSDE (3.3). Then $(\bar X(\cdot),\bar u(\cdot))$ is an optimal pair.

Proof. Let $(\bar X(\cdot),\bar u(\cdot),Y(\cdot),Z(\cdot))$ be an adapted solution to the MF-FBSDE. For any $u(\cdot)\in\mathcal U[0,T]$, let $X_1(\cdot)\equiv X(\cdot\,;0,u(\cdot)-\bar u(\cdot))$. Then
\[
X(s;x,u(\cdot))=\bar X(s)+X_1(s),\qquad s\in[0,T].
\]
Hence (suppressing $s$),
\[
\begin{aligned}
J(x;u(\cdot))-J(x;\bar u(\cdot))&=2\,\mathbb E\Big\{\int_0^T\big[\langle Q\bar X,X_1\rangle+\langle\hat Q\mathbb E[\bar X],\mathbb E[X_1]\rangle+\langle R\bar u,u-\bar u\rangle+\langle\hat R\mathbb E[\bar u],\mathbb E[u-\bar u]\rangle\big]ds\\
&\qquad\qquad+\langle G\bar X(T),X_1(T)\rangle+\langle\hat G\mathbb E[\bar X(T)],\mathbb E[X_1(T)]\rangle\Big\}\\
&\quad+\mathbb E\Big\{\int_0^T\big[\langle QX_1,X_1\rangle+\langle\hat Q\mathbb E[X_1],\mathbb E[X_1]\rangle+\langle R(u-\bar u),u-\bar u\rangle+\langle\hat R\mathbb E[u-\bar u],\mathbb E[u-\bar u]\rangle\big]ds\\
&\qquad\qquad+\langle GX_1(T),X_1(T)\rangle+\langle\hat G\mathbb E[X_1(T)],\mathbb E[X_1(T)]\rangle\Big\}\\
&=2\,\mathbb E\Big\{\int_0^T\big[\langle X_1,Q\bar X+\hat Q\mathbb E[\bar X]\rangle+\langle u-\bar u,R\bar u+\hat R\mathbb E[\bar u]\rangle\big]ds+\langle X_1(T),G\bar X(T)+\hat G\mathbb E[\bar X(T)]\rangle\Big\}\\
&\quad+J(0;u(\cdot)-\bar u(\cdot)).
\end{aligned}
\]
Note that, by Itô's formula,
\[
\begin{aligned}
\mathbb E\langle X_1(T),G\bar X(T)+\hat G\mathbb E[\bar X(T)]\rangle&=\mathbb E\langle X_1(T),Y(T)\rangle\\
&=\mathbb E\int_0^T\Big\{\big\langle AX_1+B(u-\bar u)+\hat A\mathbb E[X_1]+\hat B\mathbb E[u-\bar u],\ Y\big\rangle\\
&\qquad-\big\langle X_1,\ A^TY+A_1^TZ+\hat A^T\mathbb E[Y]+\hat A_1^T\mathbb E[Z]+Q\bar X+\hat Q\mathbb E[\bar X]\big\rangle\\
&\qquad+\big\langle A_1X_1+B_1(u-\bar u)+\hat A_1\mathbb E[X_1]+\hat B_1\mathbb E[u-\bar u],\ Z\big\rangle\Big\}ds\\
&=\mathbb E\int_0^T\Big\{-\langle X_1,Q\bar X+\hat Q\mathbb E[\bar X]\rangle+\big\langle u-\bar u,\ B^TY+B_1^TZ+\hat B^T\mathbb E[Y]+\hat B_1^T\mathbb E[Z]\big\rangle\Big\}ds.
\end{aligned}
\]
Thus,
\[
\begin{aligned}
J(x;u(\cdot))-J(x;\bar u(\cdot))&=2\,\mathbb E\int_0^T\big\langle u-\bar u,\ R\bar u+\hat R\mathbb E[\bar u]+B^TY+B_1^TZ+\hat B^T\mathbb E[Y]+\hat B_1^T\mathbb E[Z]\big\rangle ds+J(0;u(\cdot)-\bar u(\cdot))\\
&=J(0;u(\cdot)-\bar u(\cdot))=\langle\Theta_2(u-\bar u),u-\bar u\rangle\ge 0.
\end{aligned}
\]
e = X(·), ¯ X(·)
u e(·) = u ¯(·).
Ye (·) = Y (·),
e = Z(·), Z(·)
Then by the uniqueness of MF-BSDE (3.1), one must have
proving the corollary.
4  Decoupling the MF-FBSDE and Riccati Equations

From Corollary 3.3, we see that under (H1) and (H2)$''$, to solve Problem (MF) we need only solve MF-FBSDE (3.3). To solve (3.3), we use the idea of decoupling inspired by the Four-Step Scheme ([21, 22]). More precisely, we have the following result (for simplicity, the time variable $s$ is suppressed below).

Theorem 4.1. Let (H1) and (H2)$''$ hold. Then the unique adapted solution $(\bar X(\cdot),\bar u(\cdot),Y(\cdot),Z(\cdot))$ of MF-FBSDE (3.3) admits the following representation:
\[
\begin{cases}
\bar u=-K_0^{-1}(B^TP+B_1^TPA_1)\big(\bar X-\mathbb E[\bar X]\big)-K_1^{-1}\big[(B+\hat B)^T\Pi+(B_1+\hat B_1)^TP(A_1+\hat A_1)\big]\mathbb E[\bar X],\\
Y=P\big(\bar X-\mathbb E[\bar X]\big)+\Pi\,\mathbb E[\bar X],\\
Z=\big[PA_1-PB_1K_0^{-1}(B^TP+B_1^TPA_1)\big]\big(\bar X-\mathbb E[\bar X]\big)\\
\qquad+\big\{P(A_1+\hat A_1)-P(B_1+\hat B_1)K_1^{-1}\big[(B+\hat B)^T\Pi+(B_1+\hat B_1)^TP(A_1+\hat A_1)\big]\big\}\mathbb E[\bar X],
\end{cases}\tag{4.1}
\]
where
\[
K_0=R+B_1^TPB_1,\qquad K_1=R+\hat R+(B_1+\hat B_1)^TP(B_1+\hat B_1),\tag{4.2}
\]
and $P(\cdot)$ and $\Pi(\cdot)$ are the solutions to the following Riccati equations, respectively:
\[
\begin{cases}
P'+PA+A^TP+A_1^TPA_1+Q-(PB+A_1^TPB_1)K_0^{-1}(B^TP+B_1^TPA_1)=0,\qquad s\in[0,T],\\
P(T)=G,
\end{cases}\tag{4.3}
\]
and
\[
\begin{cases}
\Pi'+\Pi\big[(A+\hat A)-(B+\hat B)K_1^{-1}(B_1+\hat B_1)^TP(A_1+\hat A_1)\big]\\
\quad+\big[(A+\hat A)^T-(A_1+\hat A_1)^TP(B_1+\hat B_1)K_1^{-1}(B+\hat B)^T\big]\Pi-\Pi(B+\hat B)K_1^{-1}(B+\hat B)^T\Pi\\
\quad+(A_1+\hat A_1)^T\big[P-P(B_1+\hat B_1)K_1^{-1}(B_1+\hat B_1)^TP\big](A_1+\hat A_1)+Q+\hat Q=0,\qquad s\in[0,T],\\
\Pi(T)=G+\hat G.
\end{cases}\tag{4.4}
\]
Finally, $\bar X(\cdot)$ solves the following closed-loop system:
\[
\begin{cases}
d\bar X=\Big\{\big[A-BK_0^{-1}(B^TP+B_1^TPA_1)\big]\big(\bar X-\mathbb E[\bar X]\big)\\
\qquad\quad+\big[(A+\hat A)-(B+\hat B)K_1^{-1}\big((B+\hat B)^T\Pi+(B_1+\hat B_1)^TP(A_1+\hat A_1)\big)\big]\mathbb E[\bar X]\Big\}dt\\
\qquad+\Big\{\big[A_1-B_1K_0^{-1}(B^TP+B_1^TPA_1)\big]\big(\bar X-\mathbb E[\bar X]\big)\\
\qquad\quad+\big[(A_1+\hat A_1)-(B_1+\hat B_1)K_1^{-1}\big((B+\hat B)^T\Pi+(B_1+\hat B_1)^TP(A_1+\hat A_1)\big)\big]\mathbb E[\bar X]\Big\}dW(t),\\
\bar X(0)=x.
\end{cases}\tag{4.5}
\]
1 1 1 1 eR e−1 B e TP 12 )−1P 21 ≡ P 12[I+P 21(B1+ B b1 )(R+ R) b −1(B1+ B b1 )TP 21 ]−1P 21 ≥ 0. = P 2(I +ΓΓT)−1P 2 ≡ P 2 (I +P 2 B
In the above, we have denoted
and used the fact
e = B1 + B b1 , B
e = R + R, b R
1 eR e 21 , Γ = P 2B
I − Γ(I + ΓT Γ)−1 ΓT = (I + ΓT Γ)−1 . Hence, Riccati equation (4.4) is equivalent to the following:
Since
h i ′ b − (B + B)K b −1 (B1 + B b1 )T P (A1 + A b1 ) Π + Π (A + A) 1 i h T T + (A+ A) b −1 (B + B) b TΠ b −(A1 + A b1 ) P (B1 + B b1 )K −1 (B + B) b T Π−Π(B + B)K 1 1 h i−1 1 1 −1 T 12 T 12 b b b b1 )+Q + Q b = 0, b 2 (B + B I +P )(R+ R) (B + B ) P P 2 (A1+ A +(A + A ) P 1 1 1 1 1 1 b Π(T ) = G + G.
(4.6)
b ≥ 0, G+G
h i−1 1 b1 )T P 21 I +P 12 (B1 + B b1 )(R+ R) b −1 (B1 + B b1 )T P 21 b1 )+Q + Q b ≥ 0, (A1 + A P 2 (A1 + A
according to [34], Riccati equation (4.4) admits a unique solution Π(·) which is also positive definite matrix valued. ¯ Now, suppose (X(·), u ¯(·), Y (·), Z(·)) is the adapted solution to (3.3). Assume that Y (s) = P (s)X(s) + Pb (s)lE[X(s)],
s ∈ [0, T ],
for some deterministic and differentiable functions P (·) and Pb (·) such that P (T ) = G,
13
b Pb (T ) = G.
(4.7)
(4.8)
For the time being, we do not assume that P (·) is the solution to (4.3). Then (suppressing s)
Then (suppressing $s$)
\[
\begin{aligned}
-\big\{A^TY&+A_1^TZ+\hat A^T\mathbb E[Y]+\hat A_1^T\mathbb E[Z]+Q\bar X+\hat Q\mathbb E[\bar X]\big\}ds+Z\,dW(s)=dY=d\big(P\bar X+\hat P\mathbb E[\bar X]\big)\\
&=\Big\{P'\bar X+P\big(A\bar X+B\bar u+\hat A\mathbb E[\bar X]+\hat B\mathbb E[\bar u]\big)+\hat P'\mathbb E[\bar X]+\hat P\big[(A+\hat A)\mathbb E[\bar X]+(B+\hat B)\mathbb E[\bar u]\big]\Big\}ds\\
&\qquad+P\big(A_1\bar X+B_1\bar u+\hat A_1\mathbb E[\bar X]+\hat B_1\mathbb E[\bar u]\big)dW(s)\\
&=\Big\{(P'+PA)\bar X+PB\bar u+\big[\hat P'+\hat P(A+\hat A)+P\hat A\big]\mathbb E[\bar X]+\big[\hat P(B+\hat B)+P\hat B\big]\mathbb E[\bar u]\Big\}ds\\
&\qquad+P\big(A_1\bar X+B_1\bar u+\hat A_1\mathbb E[\bar X]+\hat B_1\mathbb E[\bar u]\big)dW(s).
\end{aligned}\tag{4.9}
\]
Comparing the diffusion terms, we should have
\[
Z=P\big(A_1\bar X+B_1\bar u+\hat A_1\mathbb E[\bar X]+\hat B_1\mathbb E[\bar u]\big).
\]
This yields, from (3.2), that
\[
\begin{aligned}
0&=R\bar u+\hat R\mathbb E[\bar u]+B^TY+\hat B^T\mathbb E[Y]+B_1^TZ+\hat B_1^T\mathbb E[Z]\\
&=R\bar u+\hat R\mathbb E[\bar u]+B^T\big(P\bar X+\hat P\mathbb E[\bar X]\big)+\hat B^T(P+\hat P)\mathbb E[\bar X]\\
&\qquad+B_1^TP\big(A_1\bar X+B_1\bar u+\hat A_1\mathbb E[\bar X]+\hat B_1\mathbb E[\bar u]\big)+\hat B_1^TP\big[(A_1+\hat A_1)\mathbb E[\bar X]+(B_1+\hat B_1)\mathbb E[\bar u]\big]\\
&=(R+B_1^TPB_1)\bar u+\big(\hat R+B_1^TP\hat B_1+\hat B_1^TPB_1+\hat B_1^TP\hat B_1\big)\mathbb E[\bar u]+(B^TP+B_1^TPA_1)\bar X\\
&\qquad+\big[B^T\hat P+\hat B^T(P+\hat P)+B_1^TP\hat A_1+\hat B_1^TP(A_1+\hat A_1)\big]\mathbb E[\bar X].
\end{aligned}
\]
Taking expectations, we obtain
\[
0=\big[R+\hat R+(B_1+\hat B_1)^TP(B_1+\hat B_1)\big]\mathbb E[\bar u]+\big[(B+\hat B)^T(P+\hat P)+(B_1+\hat B_1)^TP(A_1+\hat A_1)\big]\mathbb E[\bar X]\equiv K_1\mathbb E[\bar u]+\big[(B+\hat B)^T(P+\hat P)+(B_1+\hat B_1)^TP(A_1+\hat A_1)\big]\mathbb E[\bar X].
\]
Assuming
\[
K_1\equiv K_1(P)=R+\hat R+(B_1+\hat B_1)^TP(B_1+\hat B_1)\tag{4.10}
\]
to be invertible, one gets
\[
\mathbb E[\bar u]=-K_1^{-1}\big[(B+\hat B)^T(P+\hat P)+(B_1+\hat B_1)^TP(A_1+\hat A_1)\big]\mathbb E[\bar X].
\]
Consequently, by assuming
\[
K_0\equiv K_0(P)=R+B_1^TPB_1
\]
to be invertible, we obtain
\[
\begin{aligned}
\bar u&=-K_0^{-1}(B^TP+B_1^TPA_1)\bar X-K_0^{-1}\big[B^T\hat P+\hat B^T(P+\hat P)+B_1^TP\hat A_1+\hat B_1^TP(A_1+\hat A_1)\big]\mathbb E[\bar X]\\
&\qquad-K_0^{-1}\big(\hat R+B_1^TP\hat B_1+\hat B_1^TPB_1+\hat B_1^TP\hat B_1\big)\mathbb E[\bar u]\\
&=-K_0^{-1}(B^TP+B_1^TPA_1)\bar X+K_0^{-1}(B^TP+B_1^TPA_1)\mathbb E[\bar X]\\
&\qquad-K_1^{-1}\big[(B+\hat B)^T(P+\hat P)+(B_1+\hat B_1)^TP(A_1+\hat A_1)\big]\mathbb E[\bar X].
\end{aligned}\tag{4.11}
\]
b + (B1 + B b1 )T P (B1 + B b1 ) − R − B T P B1 K1 − K0 = R + R 1 T T T b b b b b = R + B P B1 + B P B1 + B P B1 . 1
1
1
Hence, comparing the drift terms in (4.9), we have
$$\begin{aligned}
0 &= (P' + PA)\bar X + PB\bar u + \big[P\hat A + \hat P' + \hat P(A+\hat A)\big]\mathbb{E}[\bar X] + \big[P\hat B + \hat P(B+\hat B)\big]\mathbb{E}[\bar u]\\
&\quad + A^TP\bar X + A^T\hat P\,\mathbb{E}[\bar X] + A_1^TP\big(A_1\bar X + B_1\bar u + \hat A_1\mathbb{E}[\bar X] + \hat B_1\mathbb{E}[\bar u]\big)\\
&\quad + \hat A^T(P+\hat P)\,\mathbb{E}[\bar X] + \hat A_1^TP\big[(A_1+\hat A_1)\mathbb{E}[\bar X] + (B_1+\hat B_1)\mathbb{E}[\bar u]\big] + Q\bar X + \hat Q\,\mathbb{E}[\bar X]\\
&= \big[P' + PA + A^TP + A_1^TPA_1 + Q\big]\bar X + (PB + A_1^TPB_1)\bar u\\
&\quad + \big[\hat P' + \hat P(A+\hat A) + (A+\hat A)^T\hat P + P\hat A + \hat A^TP + (A_1+\hat A_1)^TP(A_1+\hat A_1) - A_1^TPA_1 + \hat Q\big]\mathbb{E}[\bar X]\\
&\quad + \big[\hat P(B+\hat B) + P\hat B + A_1^TP\hat B_1 + \hat A_1^TP(B_1+\hat B_1)\big]\mathbb{E}[\bar u].
\end{aligned}$$
Substituting (4.11) and the above expression for $\mathbb{E}[\bar u]$ then yields
$$\begin{aligned}
0 &= \big[P' + PA + A^TP + A_1^TPA_1 + Q - (PB + A_1^TPB_1)K_0^{-1}(B^TP + B_1^TPA_1)\big]\bar X\\
&\quad + \Big\{\hat P' + \hat P(A+\hat A) + (A+\hat A)^T\hat P + P\hat A + \hat A^TP + (A_1+\hat A_1)^TP(A_1+\hat A_1) - A_1^TPA_1 + \hat Q\\
&\qquad + (PB + A_1^TPB_1)K_0^{-1}(B^TP + B_1^TPA_1)\\
&\qquad - \big[(P+\hat P)(B+\hat B) + (A_1+\hat A_1)^TP(B_1+\hat B_1)\big]K_1^{-1}\big[(B+\hat B)^T(P+\hat P) + (B_1+\hat B_1)^TP(A_1+\hat A_1)\big]\Big\}\mathbb{E}[\bar X].
\end{aligned}$$
Therefore, by choosing $P(\cdot)$ to be the solution to Riccati equation (4.3), we have that $K_0$ and $K_1$ are positive definite, and the above leads to
$$\begin{aligned}
\Big\{\hat P' &+ \hat P(A+\hat A) + (A+\hat A)^T\hat P + P\hat A + \hat A^TP + (A_1+\hat A_1)^TP(A_1+\hat A_1) - A_1^TPA_1 + \hat Q\\
&+ (PB + A_1^TPB_1)K_0^{-1}(B^TP + B_1^TPA_1)\\
&- \big[(P+\hat P)(B+\hat B) + (A_1+\hat A_1)^TP(B_1+\hat B_1)\big]K_1^{-1}\big[(B+\hat B)^T(P+\hat P) + (B_1+\hat B_1)^TP(A_1+\hat A_1)\big]\Big\}\,\mathbb{E}[\bar X] = 0.
\end{aligned}$$
Now, if $\hat P(\cdot)$ satisfies the following:
$$\begin{aligned}
0 &= \hat P' + \hat P(A+\hat A) + (A+\hat A)^T\hat P + P\hat A + \hat A^TP + (A_1+\hat A_1)^TP(A_1+\hat A_1) - A_1^TPA_1 + \hat Q\\
&\quad + (PB + A_1^TPB_1)K_0^{-1}(B^TP + B_1^TPA_1)\\
&\quad - \big[(P+\hat P)(B+\hat B) + (A_1+\hat A_1)^TP(B_1+\hat B_1)\big]K_1^{-1}\big[(B+\hat B)^T(P+\hat P) + (B_1+\hat B_1)^TP(A_1+\hat A_1)\big]\\
&= \hat P' + \hat P\Big[A+\hat A - (B+\hat B)K_1^{-1}\big[(B+\hat B)^TP + (B_1+\hat B_1)^TP(A_1+\hat A_1)\big]\Big]\\
&\quad + (A+\hat A)^T\hat P - \big[P(B+\hat B) + (A_1+\hat A_1)^TP(B_1+\hat B_1)\big]K_1^{-1}(B+\hat B)^T\hat P - \hat P(B+\hat B)K_1^{-1}(B+\hat B)^T\hat P\\
&\quad + P\hat A + \hat A^TP + (A_1+\hat A_1)^TP(A_1+\hat A_1) - A_1^TPA_1 + \hat Q + (PB + A_1^TPB_1)K_0^{-1}(B^TP + B_1^TPA_1)\\
&\quad - \big[P(B+\hat B) + (A_1+\hat A_1)^TP(B_1+\hat B_1)\big]K_1^{-1}\big[(B+\hat B)^TP + (B_1+\hat B_1)^TP(A_1+\hat A_1)\big],
\end{aligned}$$
then $(Y(\cdot),Z(\cdot))$ defined by (4.7) and (4.10), with $\bar u(\cdot)$ given by (4.11), satisfies the MF-BSDE in (3.3). Hence, we introduce the following Riccati equation for $\hat P(\cdot)$:
$$\begin{cases}
\hat P' + \hat P\Big[A+\hat A - (B+\hat B)K_1^{-1}\big[(B+\hat B)^TP + (B_1+\hat B_1)^TP(A_1+\hat A_1)\big]\Big]\\
\quad + \Big[(A+\hat A)^T - \big[P(B+\hat B) + (A_1+\hat A_1)^TP(B_1+\hat B_1)\big]K_1^{-1}(B+\hat B)^T\Big]\hat P - \hat P(B+\hat B)K_1^{-1}(B+\hat B)^T\hat P\\
\quad + P\hat A + \hat A^TP + (A_1+\hat A_1)^TP(A_1+\hat A_1) - A_1^TPA_1 + \hat Q + (PB + A_1^TPB_1)K_0^{-1}(B^TP + B_1^TPA_1)\\
\quad - \big[P(B+\hat B) + (A_1+\hat A_1)^TP(B_1+\hat B_1)\big]K_1^{-1}\big[(B+\hat B)^TP + (B_1+\hat B_1)^TP(A_1+\hat A_1)\big] = 0,\\
\hat P(T) = \hat G.
\end{cases}\tag{4.12}$$
The solvability of this Riccati equation is not obvious, since $\hat G$ is only assumed to be symmetric, and
$$\begin{aligned}
\tilde Q &\equiv P\hat A + \hat A^TP + (A_1+\hat A_1)^TP(A_1+\hat A_1) - A_1^TPA_1 + \hat Q + (PB + A_1^TPB_1)K_0^{-1}(B^TP + B_1^TPA_1)\\
&\quad - \big[P(B+\hat B) + (A_1+\hat A_1)^TP(B_1+\hat B_1)\big]K_1^{-1}\big[(B+\hat B)^TP + (B_1+\hat B_1)^TP(A_1+\hat A_1)\big]
\end{aligned}$$
is also only symmetric. To look at the solvability of such a Riccati equation, we let
$$\Pi = P + \hat P.$$
Then
$$\Pi(T) = G + \hat G \ge 0,$$
and, adding (4.3) and (4.12), we find
$$\begin{aligned}
0 &= P' + PA + A^TP + A_1^TPA_1 + Q - (PB + A_1^TPB_1)K_0^{-1}(B^TP + B_1^TPA_1)\\
&\quad + \hat P' + \hat P(A+\hat A) + (A+\hat A)^T\hat P + P\hat A + \hat A^TP + (A_1+\hat A_1)^TP(A_1+\hat A_1) - A_1^TPA_1 + \hat Q\\
&\quad + (PB + A_1^TPB_1)K_0^{-1}(B^TP + B_1^TPA_1)\\
&\quad - \big[(P+\hat P)(B+\hat B) + (A_1+\hat A_1)^TP(B_1+\hat B_1)\big]K_1^{-1}\big[(B+\hat B)^T(P+\hat P) + (B_1+\hat B_1)^TP(A_1+\hat A_1)\big]\\
&= \Pi' + \Pi(A+\hat A) + (A+\hat A)^T\Pi + (A_1+\hat A_1)^TP(A_1+\hat A_1) + Q + \hat Q\\
&\quad - \big[\Pi(B+\hat B) + (A_1+\hat A_1)^TP(B_1+\hat B_1)\big]K_1^{-1}\big[(B+\hat B)^T\Pi + (B_1+\hat B_1)^TP(A_1+\hat A_1)\big].
\end{aligned}$$
Thus, $\Pi(\cdot)$ is the solution to Riccati equation (4.4). Consequently, Riccati equation (4.12) admits a unique solution $\hat P(\cdot) = \Pi(\cdot) - P(\cdot)$. Then we obtain (from (4.7) and (4.11))
$$Y = P\big(\bar X - \mathbb{E}[\bar X]\big) + \Pi\,\mathbb{E}[\bar X],$$
and
$$\bar u = -K_0^{-1}(B^TP + B_1^TPA_1)\big(\bar X - \mathbb{E}[\bar X]\big) - K_1^{-1}\big[(B+\hat B)^T\Pi + (B_1+\hat B_1)^TP(A_1+\hat A_1)\big]\mathbb{E}[\bar X].$$
Also, from (4.10), it follows that
$$\begin{aligned}
Z &= P\big(A_1\bar X + B_1\bar u + \hat A_1\mathbb{E}[\bar X] + \hat B_1\mathbb{E}[\bar u]\big)\\
&= \big[PA_1 - PB_1K_0^{-1}(B^TP + B_1^TPA_1)\big]\big(\bar X - \mathbb{E}[\bar X]\big)\\
&\quad + \Big[P(A_1+\hat A_1) - P(B_1+\hat B_1)K_1^{-1}\big[(B+\hat B)^T\Pi + (B_1+\hat B_1)^TP(A_1+\hat A_1)\big]\Big]\mathbb{E}[\bar X].
\end{aligned}$$
Hence, (4.1) follows. Plugging the above representations into the state equation, we obtain
$$\begin{cases}
d\bar X = \Big\{\big[A - BK_0^{-1}(B^TP + B_1^TPA_1)\big]\big(\bar X - \mathbb{E}[\bar X]\big)\\
\qquad\quad + \Big[(A+\hat A) - (B+\hat B)K_1^{-1}\big[(B+\hat B)^T\Pi + (B_1+\hat B_1)^TP(A_1+\hat A_1)\big]\Big]\mathbb{E}[\bar X]\Big\}dt\\
\qquad + \Big\{\big[A_1 - B_1K_0^{-1}(B^TP + B_1^TPA_1)\big]\big(\bar X - \mathbb{E}[\bar X]\big)\\
\qquad\quad + \Big[(A_1+\hat A_1) - (B_1+\hat B_1)K_1^{-1}\big[(B+\hat B)^T\Pi + (B_1+\hat B_1)^TP(A_1+\hat A_1)\big]\Big]\mathbb{E}[\bar X]\Big\}dW(t).
\end{cases}$$
This gives the closed-loop system. From the above derivation, we see that if $\bar X(\cdot)$ is a solution to this closed-loop system, and $(\bar u(\cdot),Y(\cdot),Z(\cdot))$ are given by (4.1), then $(\bar X(\cdot),\bar u(\cdot),Y(\cdot),Z(\cdot))$ is an adapted solution to MF-FBSDE (3.3). By Corollary 3.3, the 4-tuple $(\bar X(\cdot),\bar u(\cdot),Y(\cdot),Z(\cdot))$ so constructed is the unique adapted solution to (3.3).

The following is a direct verification of the optimality of the state feedback control.

Theorem 4.2. Let (H1) and (H2)$''$ hold. Let $P(\cdot)$ and $\Pi(\cdot)$ be the solutions to the Riccati equations (4.3) and (4.4), respectively. Then the state feedback control $\bar u(\cdot)$ given in (4.1) is the optimal control of Problem (MF). Moreover, the optimal value of the cost is given by
$$\inf_{u(\cdot)\in\,\mathcal U[0,T]} J(x;u(\cdot)) = \langle\Pi(0)x,x\rangle, \qquad \forall x\in\mathbb{R}^n.\tag{4.13}$$
Proof. Let $P(\cdot)$ and $\Pi(\cdot)$ be the solutions to the Riccati equations (4.3) and (4.4), respectively, and denote $K_0$ and $K_1$ as in (4.2), which are positive definite. Let $\hat P(\cdot) = \Pi(\cdot) - P(\cdot)$. Then $\hat P(\cdot)$ solves (4.12). Now, applying Itô's formula to $\langle P(s)X(s),X(s)\rangle$, and differentiating $\langle\hat P(s)\mathbb{E}[X(s)],\mathbb{E}[X(s)]\rangle$, together with the terminal conditions $P(T)=G$ and $\hat P(T)=\hat G$, we observe
$$\begin{aligned}
&J(x;u(\cdot)) - \langle\Pi(0)x,x\rangle = J(x;u(\cdot)) - \langle[P(0)+\hat P(0)]x,x\rangle\\
&= \mathbb{E}\int_0^T\Big\{\langle QX,X\rangle + \langle\hat Q\,\mathbb{E}[X],\mathbb{E}[X]\rangle + \langle Ru,u\rangle + \langle\hat R\,\mathbb{E}[u],\mathbb{E}[u]\rangle\\
&\qquad + \langle P'X,X\rangle + 2\big\langle P\big(AX + Bu + \hat A\,\mathbb{E}[X] + \hat B\,\mathbb{E}[u]\big),X\big\rangle\\
&\qquad + \big\langle P\big(A_1X + B_1u + \hat A_1\mathbb{E}[X] + \hat B_1\mathbb{E}[u]\big),\,A_1X + B_1u + \hat A_1\mathbb{E}[X] + \hat B_1\mathbb{E}[u]\big\rangle\\
&\qquad + \langle\hat P'\,\mathbb{E}[X],\mathbb{E}[X]\rangle + 2\big\langle\hat P\,\mathbb{E}[X],(A+\hat A)\mathbb{E}[X] + (B+\hat B)\mathbb{E}[u]\big\rangle\Big\}ds\\
&= \mathbb{E}\int_0^T\Big\{\big\langle(P' + PA + A^TP + A_1^TPA_1 + Q)X,X\big\rangle + 2\big\langle u,(B^TP + B_1^TPA_1)X\big\rangle + \langle(R + B_1^TPB_1)u,u\rangle\\
&\qquad + \langle\hat Q\,\mathbb{E}[X],\mathbb{E}[X]\rangle + \langle\hat R\,\mathbb{E}[u],\mathbb{E}[u]\rangle + 2\big\langle P\big(\hat A\,\mathbb{E}[X] + \hat B\,\mathbb{E}[u]\big),\mathbb{E}[X]\big\rangle\\
&\qquad + 2\big\langle P\big(A_1\mathbb{E}[X] + B_1\mathbb{E}[u]\big),\hat A_1\mathbb{E}[X] + \hat B_1\mathbb{E}[u]\big\rangle + \big\langle P\big(\hat A_1\mathbb{E}[X] + \hat B_1\mathbb{E}[u]\big),\hat A_1\mathbb{E}[X] + \hat B_1\mathbb{E}[u]\big\rangle\\
&\qquad + \langle\hat P'\,\mathbb{E}[X],\mathbb{E}[X]\rangle + 2\big\langle\hat P\,\mathbb{E}[X],(A+\hat A)\mathbb{E}[X] + (B+\hat B)\mathbb{E}[u]\big\rangle\Big\}ds\\
&= \mathbb{E}\int_0^T\Big\{\big\langle(P' + PA + A^TP + A_1^TPA_1 + Q)X,X\big\rangle + 2\big\langle u - \mathbb{E}[u],(B^TP + B_1^TPA_1)(X - \mathbb{E}[X])\big\rangle\\
&\qquad + \big\langle K_0(u - \mathbb{E}[u]),u - \mathbb{E}[u]\big\rangle\\
&\qquad + \big\langle\big[\hat P' + P\hat A + \hat A^TP + \hat A_1^TPA_1 + A_1^TP\hat A_1 + \hat A_1^TP\hat A_1 + \hat P(A+\hat A) + (A+\hat A)^T\hat P + \hat Q\big]\mathbb{E}[X],\mathbb{E}[X]\big\rangle\\
&\qquad + \langle K_1\,\mathbb{E}[u],\mathbb{E}[u]\rangle + 2\big\langle\mathbb{E}[u],\big[(B+\hat B)^T(P+\hat P) + (B_1+\hat B_1)^TPA_1 + (B_1+\hat B_1)^TP\hat A_1\big]\mathbb{E}[X]\big\rangle\Big\}ds.
\end{aligned}$$
Completing the squares with respect to $u - \mathbb{E}[u]$ and $\mathbb{E}[u]$, and noting that $P(\cdot)$ solves (4.3) while the resulting coefficient of $\langle\cdot\,\mathbb{E}[X],\mathbb{E}[X]\rangle$ equals the left-hand side of (4.12) and hence vanishes, we obtain
$$\begin{aligned}
J(x;u(\cdot)) - \langle\Pi(0)x,x\rangle
&= \mathbb{E}\int_0^T\Big\{\Big|K_0^{\frac12}\big[u - \mathbb{E}[u] + K_0^{-1}(B^TP + B_1^TPA_1)(X - \mathbb{E}[X])\big]\Big|^2\\
&\qquad + \Big|K_1^{\frac12}\big[\mathbb{E}[u] + K_1^{-1}\big((B+\hat B)^T\Pi + (B_1+\hat B_1)^TP(A_1+\hat A_1)\big)\mathbb{E}[X]\big]\Big|^2\Big\}ds \ge 0.
\end{aligned}$$
Then our claim follows.
We see that Riccati equation (4.4) can be written as follows:
$$\begin{cases}
\Pi' + \Pi(A+\hat A) + (A+\hat A)^T\Pi + (A_1+\hat A_1)^TP(A_1+\hat A_1) + Q + \hat Q\\
\quad - \big[\Pi(B+\hat B) + (A_1+\hat A_1)^TP(B_1+\hat B_1)\big]K_1^{-1}\big[(B+\hat B)^T\Pi + (B_1+\hat B_1)^TP(A_1+\hat A_1)\big] = 0,\quad s\in[0,T],\\
\Pi(T) = G + \hat G.
\end{cases}\tag{4.14}$$
When
$$\hat A = \hat A_1 = 0,\quad \hat B = \hat B_1 = 0,\quad \hat Q = 0,\quad \hat R = 0,\quad \hat G = 0,$$
we have $K_0 = K_1 = R + B_1^TPB_1$, and the Riccati equation for $\Pi(\cdot)$ can be written as
$$\begin{cases}
\Pi' + \Pi A + A^T\Pi - (\Pi B + A_1^TPB_1)K_0^{-1}(B^T\Pi + B_1^TPA_1) + A_1^TPA_1 + Q = 0,\\
\Pi(T) = G.
\end{cases}$$
Then
$$\begin{aligned}
0 &= (\Pi - P)' + (\Pi - P)A + A^T(\Pi - P) - (\Pi B + A_1^TPB_1)K_0^{-1}(B^T\Pi + B_1^TPA_1)\\
&\quad + (PB + A_1^TPB_1)K_0^{-1}(B^TP + B_1^TPA_1)\\
&= (\Pi - P)' + (\Pi - P)A + A^T(\Pi - P) - (\Pi - P)BK_0^{-1}(B^T\Pi + B_1^TPA_1) - (PB + A_1^TPB_1)K_0^{-1}B^T(\Pi - P).
\end{aligned}$$
Therefore, by uniqueness, we have $\Pi = P$. Consequently, the feedback control can be written as
$$\begin{aligned}
\bar u &= -K_0^{-1}(B^TP + B_1^TPA_1)\big(\bar X - \mathbb{E}[\bar X]\big) - K_1^{-1}\big[(B+\hat B)^T\Pi + (B_1+\hat B_1)^TP(A_1+\hat A_1)\big]\mathbb{E}[\bar X]\\
&= -K_0^{-1}(B^TP + B_1^TPA_1)\big(\bar X - \mathbb{E}[\bar X]\big) - K_1^{-1}\big(B^TP + B_1^TPA_1\big)\mathbb{E}[\bar X]\\
&= -K_0^{-1}(B^TP + B_1^TPA_1)\bar X.
\end{aligned}$$
This recovers the result for the classical LQ problem ([34]).
5  A Modification of Standard LQ Problems.

In this section, we are going to look at a special case which was mentioned in the introduction. For convenience, let us rewrite the state equation here:
$$\begin{cases}
dX(s) = \big[A(s)X(s) + B(s)u(s)\big]ds + \big[A_1(s)X(s) + B_1(s)u(s)\big]dW(s),\\
X(0) = x,
\end{cases}\tag{5.1}$$
with the cost functional
$$J_0(x;u(\cdot)) = \mathbb{E}\Big[\int_0^T\big(\langle Q_0(s)X(s),X(s)\rangle + \langle R_0(s)u(s),u(s)\rangle\big)ds + \langle G_0X(T),X(T)\rangle\Big].\tag{5.2}$$
The classical LQ problem can be stated as follows.

Problem (LQ). For any given $x\in\mathbb{R}^n$, find a $\bar u(\cdot)\in\mathcal U[0,T]$ such that
$$J_0(x;\bar u(\cdot)) = \inf_{u(\cdot)\in\,\mathcal U[0,T]} J_0(x;u(\cdot)).\tag{5.3}$$
The following result is standard (see [34]).

Theorem 5.1. Let (H1) hold and
$$Q_0(s) \ge 0,\quad R_0(s) \ge \delta I,\quad s\in[0,T];\qquad G_0 \ge 0.\tag{5.4}$$
Then Problem (LQ) admits a unique optimal pair $(\bar X_0(\cdot),\bar u_0(\cdot))$. Moreover, the following holds:
$$\bar u_0(s) = -\big[R_0(s) + B_1(s)^TP_0(s)B_1(s)\big]^{-1}\big[B(s)^TP_0(s) + B_1(s)^TP_0(s)A_1(s)\big]\bar X_0(s),\quad s\in[0,T],\tag{5.5}$$
where $P_0(\cdot)$ is the solution to the following Riccati equation:
$$\begin{cases}
P_0' + P_0A + A^TP_0 + A_1^TP_0A_1 + Q_0 - (P_0B + A_1^TP_0B_1)(R_0 + B_1^TP_0B_1)^{-1}(B^TP_0 + B_1^TP_0A_1) = 0,\quad s\in[0,T],\\
P_0(T) = G_0,
\end{cases}\tag{5.6}$$
and $\bar X_0(\cdot)$ is the solution to the following closed-loop system:
$$\begin{cases}
d\bar X_0(s) = \big[A - B(R_0 + B_1^TP_0B_1)^{-1}(B^TP_0 + B_1^TP_0A_1)\big]\bar X_0(s)ds\\
\qquad\qquad + \big[A_1 - B_1(R_0 + B_1^TP_0B_1)^{-1}(B^TP_0 + B_1^TP_0A_1)\big]\bar X_0(s)dW(s),\quad s\in[0,T],\\
\bar X_0(0) = x.
\end{cases}\tag{5.7}$$
We now introduce the following modified cost functional:
$$\begin{aligned}
\hat J_0(x;u(\cdot)) &= \mathbb{E}\Big[\int_0^T\big(\langle Q_0(s)X(s),X(s)\rangle + \langle R_0(s)u(s),u(s)\rangle\big)ds + \langle G_0X(T),X(T)\rangle\Big]\\
&\quad + \int_0^T\big(q(s)\,\mathrm{var}[X(s)] + \rho(s)\,\mathrm{var}[u(s)]\big)ds + g\,\mathrm{var}[X(T)]\\
&= \mathbb{E}\Big\{\int_0^T\Big[\big\langle\big(Q_0(s) + q(s)I\big)X(s),X(s)\big\rangle - q(s)\big|\mathbb{E}[X(s)]\big|^2\\
&\qquad\qquad + \big\langle\big(R_0(s) + \rho(s)I\big)u(s),u(s)\big\rangle - \rho(s)\big|\mathbb{E}[u(s)]\big|^2\Big]ds\\
&\qquad\quad + \big\langle(G_0 + gI)X(T),X(T)\big\rangle - g\big|\mathbb{E}[X(T)]\big|^2\Big\},
\end{aligned}\tag{5.8}$$
with $q(\cdot),\rho(\cdot)\in L^\infty(0,T)$, $g\in[0,\infty)$ such that
$$q(s),\rho(s) \ge 0,\quad s\in[0,T].\tag{5.9}$$
Also, of course, we assume that
$$\int_0^T\big[q(s) + \rho(s)\big]ds + g > 0.\tag{5.10}$$
We want to compare the above Problem (LQ) with the following problem:

Problem (LQ)$'$. For any given $x\in\mathbb{R}^n$, find a $\bar u(\cdot)\in\mathcal U[0,T]$ such that
$$\hat J_0(x;\bar u(\cdot)) = \inf_{u(\cdot)\in\,\mathcal U[0,T]}\hat J_0(x;u(\cdot)).\tag{5.11}$$
We refer to the above Problem (LQ)$'$ as a modified LQ problem. This is a special case of Problem (MF) with
$$\hat A = \hat A_1 = 0,\quad \hat B = \hat B_1 = 0,\quad Q = Q_0 + qI,\quad \hat Q = -qI,\quad R = R_0 + \rho I,\quad \hat R = -\rho I,\quad G = G_0 + gI,\quad \hat G = -gI.$$
Then the Riccati equations are
$$\begin{cases}
P' + PA + A^TP + A_1^TPA_1 + Q_0 + qI - (PB + A_1^TPB_1)(R_0 + \rho I + B_1^TPB_1)^{-1}(B^TP + B_1^TPA_1) = 0,\quad s\in[0,T],\\
P(T) = G_0 + gI,
\end{cases}\tag{5.12}$$
and
$$\begin{cases}
\Pi' + \Pi A + A^T\Pi + A_1^TPA_1 + Q_0 - (\Pi B + A_1^TPB_1)(R_0 + B_1^TPB_1)^{-1}(B^T\Pi + B_1^TPA_1) = 0,\quad s\in[0,T],\\
\Pi(T) = G_0.
\end{cases}\tag{5.13}$$
The optimal control is given by
$$\bar u = -(R_0 + \rho I + B_1^TPB_1)^{-1}(B^TP + B_1^TPA_1)\big(\bar X - \mathbb{E}[\bar X]\big) - (R_0 + B_1^TPB_1)^{-1}(B^T\Pi + B_1^TPA_1)\mathbb{E}[\bar X],\tag{5.14}$$
and the closed-loop system reads
$$\begin{cases}
d\bar X = \Big\{\big[A - BK_0^{-1}(B^TP + B_1^TPA_1)\big]\big(\bar X - \mathbb{E}[\bar X]\big) + \big[A - BK_1^{-1}\big(B^T\Pi + B_1^TPA_1\big)\big]\mathbb{E}[\bar X]\Big\}ds\\
\qquad + \Big\{\big[A_1 - B_1K_0^{-1}(B^TP + B_1^TPA_1)\big]\big(\bar X - \mathbb{E}[\bar X]\big) + \big[A_1 - B_1K_1^{-1}\big(B^T\Pi + B_1^TPA_1\big)\big]\mathbb{E}[\bar X]\Big\}dW(s),\\
\bar X(0) = x,
\end{cases}\tag{5.15}$$
where, in the present case, $K_0 = R_0 + \rho I + B_1^TPB_1$ and $K_1 = R_0 + B_1^TPB_1$.
By the optimality of $\bar u_0(\cdot)$ and $\bar u(\cdot)$, we have that
$$\langle P_0(0)x,x\rangle = J_0(x;\bar u_0(\cdot)) \le J_0(x;\bar u(\cdot)),\tag{5.16}$$
and
$$\begin{aligned}
\langle\Pi(0)x,x\rangle &= J_0(x;\bar u(\cdot)) + \int_0^T\big(q(s)\,\mathrm{var}[\bar X(s)] + \rho(s)\,\mathrm{var}[\bar u(s)]\big)ds + g\,\mathrm{var}[\bar X(T)]\\
&= \hat J_0(x;\bar u(\cdot)) \le \hat J_0(x;\bar u_0(\cdot))\\
&= J_0(x;\bar u_0(\cdot)) + \int_0^T\big(q(s)\,\mathrm{var}[\bar X_0(s)] + \rho(s)\,\mathrm{var}[\bar u_0(s)]\big)ds + g\,\mathrm{var}[\bar X_0(T)]\\
&\le J_0(x;\bar u(\cdot)) + \int_0^T\big(q(s)\,\mathrm{var}[\bar X_0(s)] + \rho(s)\,\mathrm{var}[\bar u_0(s)]\big)ds + g\,\mathrm{var}[\bar X_0(T)].
\end{aligned}\tag{5.17}$$
This implies
$$\int_0^T\big(q(s)\,\mathrm{var}[\bar X(s)] + \rho(s)\,\mathrm{var}[\bar u(s)]\big)ds + g\,\mathrm{var}[\bar X(T)]
\le \int_0^T\big(q(s)\,\mathrm{var}[\bar X_0(s)] + \rho(s)\,\mathrm{var}[\bar u_0(s)]\big)ds + g\,\mathrm{var}[\bar X_0(T)].\tag{5.18}$$
Hence, $J_0(x;\bar u(\cdot)) - J_0(x;\bar u_0(\cdot))$ is the price paid for the decrease
$$\int_0^T\big(q(s)\,\mathrm{var}[\bar X_0(s)] + \rho(s)\,\mathrm{var}[\bar u_0(s)]\big)ds + g\,\mathrm{var}[\bar X_0(T)]
- \int_0^T\big(q(s)\,\mathrm{var}[\bar X(s)] + \rho(s)\,\mathrm{var}[\bar u(s)]\big)ds - g\,\mathrm{var}[\bar X(T)]$$
in the (weighted) variances of the optimal state-control pair $(\bar X_0(\cdot),\bar u_0(\cdot))$. Moreover, (5.17) further implies that
$$J_0(x;\bar u(\cdot)) - J_0(x;\bar u_0(\cdot))
\le \int_0^T\big(q(s)\,\mathrm{var}[\bar X_0(s)] + \rho(s)\,\mathrm{var}[\bar u_0(s)]\big)ds + g\,\mathrm{var}[\bar X_0(T)]
- \int_0^T\big(q(s)\,\mathrm{var}[\bar X(s)] + \rho(s)\,\mathrm{var}[\bar u(s)]\big)ds - g\,\mathrm{var}[\bar X(T)].$$
The above roughly means that the amount by which the cost increases is "covered" by the amount by which the weighted variances of the optimal state-control pair decrease.

We now look at a simple case to illustrate the above. Consider the one-dimensional controlled linear SDE
$$\begin{cases}
dX(s) = bu(s)ds + X(s)dW(s),\\
X(0) = x,
\end{cases}\tag{5.19}$$
with the cost functionals
$$J_0(x;u(\cdot)) = \mathbb{E}\Big[\int_0^T|u(s)|^2ds + g_0|X(T)|^2\Big],\tag{5.20}$$
and
$$\begin{aligned}
\hat J_0(x;u(\cdot)) &= \mathbb{E}\Big[\int_0^T|u(s)|^2ds + g_0|X(T)|^2\Big] + g\,\mathrm{var}[X(T)]\\
&= \mathbb{E}\Big[\int_0^T|u(s)|^2ds + (g_0+g)|X(T)|^2\Big] - g\big|\mathbb{E}[X(T)]\big|^2,
\end{aligned}\tag{5.21}$$
where $g_0 \ge 0$ and $g > 0$. As above, we refer to the optimal control problem associated with (5.19)–(5.20) as the standard LQ problem, and to that associated with (5.19) and (5.21) as the modified LQ problem. The Riccati equation for the standard LQ problem is
$$\begin{cases}
p_0'(s) + p_0(s) - b^2p_0(s)^2 = 0,\quad s\in[0,T],\\
p_0(T) = g_0.
\end{cases}\tag{5.22}$$
A straightforward calculation shows that
$$p_0(s) = \frac{e^{T-s}g_0}{(e^{T-s}-1)b^2g_0 + 1} > 0,\quad s\in[0,T].\tag{5.23}$$
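Since (5.22) is a scalar Riccati equation, the closed form (5.23) is easy to sanity-check numerically. The following sketch (not part of the paper; the parameter values $b$, $T$, $g_0$ are illustrative choices of our own) integrates (5.22) backward from $p_0(T)=g_0$ with a classical Runge–Kutta scheme and compares the result at $s=0$ with (5.23):

```python
import math

def p0_closed_form(s, b, T, g0):
    """Closed form (5.23): p0(s) = e^{T-s} g0 / ((e^{T-s} - 1) b^2 g0 + 1)."""
    e = math.exp(T - s)
    return e * g0 / ((e - 1.0) * b * b * g0 + 1.0)

def p0_numeric(b, T, g0, n=2000):
    """Integrate (5.22), i.e. p0' = b^2 p0^2 - p0, backward from p0(T) = g0
    with classical RK4, and return the value p0(0)."""
    h = T / n
    f = lambda p: b * b * p * p - p          # p0' solved from (5.22)
    p = g0
    for _ in range(n):                       # step backward from s = T to s = 0
        k1 = f(p)
        k2 = f(p - 0.5 * h * k1)
        k3 = f(p - 0.5 * h * k2)
        k4 = f(p - h * k3)
        p -= h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return p

b, T, g0 = 0.7, 1.0, 2.0                     # illustrative values only
assert abs(p0_numeric(b, T, g0) - p0_closed_form(0.0, b, T, g0)) < 1e-9
```

Any choice of $b\ne 0$, $T>0$, $g_0>0$ gives agreement to within the integrator's accuracy.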
The optimal control is
$$\bar u_0(s) = -bp_0(s)\bar X_0(s),\quad s\in[0,T],\tag{5.24}$$
and the closed-loop system is
$$\begin{cases}
d\bar X_0(s) = -b^2p_0(s)\bar X_0(s)ds + \bar X_0(s)dW(s),\quad s\in[0,T],\\
\bar X_0(0) = x.
\end{cases}\tag{5.25}$$
Thus,
$$\bar X_0(s) = x\exp\Big\{-b^2\int_0^s p_0(\tau)d\tau - \frac{s}{2} + W(s)\Big\},\quad s\in[0,T].\tag{5.26}$$
Consequently,
$$\mathbb{E}[\bar X_0(T)] = x\,e^{-b^2\int_0^T p_0(\tau)d\tau - \frac{T}{2} + \frac{T}{2}} = x\,e^{-b^2\int_0^T p_0(\tau)d\tau},\tag{5.27}$$
and
$$\mathbb{E}[\bar X_0(T)^2] = x^2\,e^{-2b^2\int_0^T p_0(\tau)d\tau - T + 2T} = x^2\,e^{-2b^2\int_0^T p_0(\tau)d\tau + T}.\tag{5.28}$$
Hence,
$$\mathrm{var}[\bar X_0(T)] = \mathbb{E}[\bar X_0(T)^2] - \big(\mathbb{E}[\bar X_0(T)]\big)^2 = x^2\,e^{-2b^2\int_0^T p_0(\tau)d\tau}\big(e^T - 1\big).\tag{5.29}$$
Also, the optimal expected cost is
$$J_0(x;\bar u_0(\cdot)) = p_0(0)x^2 = \frac{e^Tg_0}{(e^T-1)b^2g_0 + 1}\,x^2.\tag{5.30}$$
Next, for the modified LQ problem, the Riccati equations are
$$\begin{cases}
p'(s) + p(s) - b^2p(s)^2 = 0,\quad s\in[0,T],\\
p(T) = g_0 + g,
\end{cases}\tag{5.31}$$
and
$$\begin{cases}
\pi'(s) - b^2\pi(s)^2 + p(s) = 0,\quad s\in[0,T],\\
\pi(T) = g_0.
\end{cases}\tag{5.32}$$
Clearly,
$$p(s) = \frac{e^{T-s}(g_0+g)}{(e^{T-s}-1)b^2(g_0+g) + 1} > \frac{e^{T-s}g_0}{(e^{T-s}-1)b^2g_0 + 1} = p_0(s) > 0,\quad s\in[0,T].\tag{5.33}$$
We now show that
$$p_0(s) < \pi(s) < p(s),\quad s\in[0,T].\tag{5.34}$$
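Before proving (5.34), one can observe the ordering numerically. The following sketch (ours, with illustrative parameter values) integrates the three scalar Riccati equations (5.22), (5.31), and (5.32) backward jointly and checks $p_0 < \pi < p$ on the grid:

```python
def solve_riccatis(b, T, g0, g, n=4000):
    """Backward RK4 for the coupled scalar equations (5.22), (5.31), (5.32):
    p0' = b^2 p0^2 - p0,  p' = b^2 p^2 - p,  pi' = b^2 pi^2 - p,
    with p0(T) = g0, p(T) = g0 + g, pi(T) = g0.
    Returns a list of (p0, p, pi) at s_k = k*T/n, k = 0..n."""
    h = T / n

    def f(y):
        p0, p, pi = y
        return (b * b * p0 * p0 - p0, b * b * p * p - p, b * b * pi * pi - p)

    y = (g0, g0 + g, g0)
    out = [y]
    for _ in range(n):                       # step backward from s = T to s = 0
        k1 = f(y)
        k2 = f(tuple(v - 0.5 * h * k for v, k in zip(y, k1)))
        k3 = f(tuple(v - 0.5 * h * k for v, k in zip(y, k2)))
        k4 = f(tuple(v - h * k for v, k in zip(y, k3)))
        y = tuple(v - h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0
                  for i, v in enumerate(y))
        out.append(y)
    out.reverse()                            # out[k] holds the values at s_k
    return out

b, T, g0, g = 0.7, 1.0, 2.0, 1.0             # illustrative values only
grid = solve_riccatis(b, T, g0, g)
for p0_s, p_s, pi_s in grid[:-1]:            # all grid points with s < T
    assert p0_s < pi_s < p_s                 # the ordering (5.34)
```

Note that at $s=T$ we have $\pi(T) = p_0(T) = g_0$, which is why the check is run only at grid points with $s < T$, consistent with the proof below.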
In fact,
$$\begin{cases}
\dfrac{d}{ds}\big[\pi(s) - p_0(s)\big] - b^2\big[\pi(s) + p_0(s)\big]\big[\pi(s) - p_0(s)\big] + p(s) - p_0(s) = 0,\\
\pi(T) - p_0(T) = 0.
\end{cases}$$
Thus,
$$\pi(s) - p_0(s) = \int_s^T e^{-\int_s^t b^2[\pi(\tau)+p_0(\tau)]d\tau}\big[p(t) - p_0(t)\big]dt > 0,\quad s\in[0,T).$$
Next,
$$\begin{cases}
\dfrac{d}{ds}\big[p(s) - \pi(s)\big] - b^2\big[p(s) + \pi(s)\big]\big[p(s) - \pi(s)\big] = 0,\\
p(T) - \pi(T) = g.
\end{cases}$$
Hence,
$$p(s) - \pi(s) = e^{-\int_s^T b^2[p(\tau)+\pi(\tau)]d\tau}\,g > 0,\quad s\in[0,T].$$
This proves (5.34). Note that the optimal control of the modified LQ problem is given by
$$\bar u(s) = -bp(s)\big(\bar X(s) - \mathbb{E}[\bar X(s)]\big) - b\pi(s)\,\mathbb{E}[\bar X(s)],\quad s\in[0,T],\tag{5.35}$$
and the closed-loop system is
$$\begin{cases}
d\bar X(s) = -\Big[b^2p(s)\big(\bar X(s) - \mathbb{E}[\bar X(s)]\big) + b^2\pi(s)\,\mathbb{E}[\bar X(s)]\Big]ds + \bar X(s)dW(s),\quad s\in[0,T],\\
\bar X(0) = x.
\end{cases}$$
Thus,
$$\frac{d}{ds}\mathbb{E}[\bar X(s)] = -b^2\pi(s)\,\mathbb{E}[\bar X(s)],$$
which leads to
$$\mathbb{E}[\bar X(s)] = e^{-b^2\int_0^s\pi(\tau)d\tau}\,x,\quad s\in[0,T].$$
On the other hand, by Itô's formula,
$$d[\bar X^2] = \Big\{-2\big[b^2p\bar X\big(\bar X - \mathbb{E}[\bar X]\big) + b^2\pi\bar X\,\mathbb{E}[\bar X]\big] + \bar X^2\Big\}ds + [\cdots]\,dW.$$
Then
$$\begin{aligned}
d\,\mathbb{E}[\bar X^2] &= \Big\{-2b^2p\big(\mathbb{E}[\bar X^2] - \big(\mathbb{E}[\bar X]\big)^2\big) - 2b^2\pi\big(\mathbb{E}[\bar X]\big)^2 + \mathbb{E}[\bar X^2]\Big\}ds\\
&= \Big\{(1 - 2b^2p)\,\mathbb{E}[\bar X^2] + 2b^2(p - \pi)\big(\mathbb{E}[\bar X]\big)^2\Big\}ds\\
&= \Big\{(1 - 2b^2p)\,\mathbb{E}[\bar X^2] + 2b^2(p - \pi)\,e^{-2b^2\int_0^s\pi(\tau)d\tau}x^2\Big\}ds.
\end{aligned}$$
Hence,
$$\begin{aligned}
\mathbb{E}[\bar X(s)^2] &= e^{s - 2b^2\int_0^s p(\tau)d\tau}\,x^2\Big[1 + \int_0^s e^{-t + 2b^2\int_0^t[p(\tau)-\pi(\tau)]d\tau}\,2b^2\big[p(t) - \pi(t)\big]dt\Big]\\
&= e^{s - 2b^2\int_0^s p(\tau)d\tau}\,x^2\Big[e^{-s + 2b^2\int_0^s[p(\tau)-\pi(\tau)]d\tau} + \int_0^s e^{-t + 2b^2\int_0^t[p(\tau)-\pi(\tau)]d\tau}dt\Big]\\
&= \Big[e^{-2b^2\int_0^s\pi(\tau)d\tau} + \int_0^s e^{s - t - 2b^2\int_t^s p(\tau)d\tau - 2b^2\int_0^t\pi(\tau)d\tau}dt\Big]x^2.
\end{aligned}\tag{5.36}$$
It follows that
$$\mathrm{var}[\bar X(s)] = \mathbb{E}[\bar X(s)^2] - \big(\mathbb{E}[\bar X(s)]\big)^2 = \Big[\int_0^s e^{s - t - 2b^2\int_t^s p(\tau)d\tau - 2b^2\int_0^t\pi(\tau)d\tau}dt\Big]x^2.$$
Consequently, noting $p_0(s) < p(s)$ and $p_0(s) \le \pi(s)$, we have
$$\mathrm{var}[\bar X(T)] = \Big[\int_0^T e^{T - t - 2b^2\int_t^T p(\tau)d\tau - 2b^2\int_0^t\pi(\tau)d\tau}dt\Big]x^2
< \Big[\int_0^T e^{T - t - 2b^2\int_t^T p_0(\tau)d\tau - 2b^2\int_0^t p_0(\tau)d\tau}dt\Big]x^2
= x^2\,e^{-2b^2\int_0^T p_0(\tau)d\tau}\big(e^T - 1\big) = \mathrm{var}[\bar X_0(T)].\tag{5.37}$$
Next, we claim that
$$p_0(s) > \frac{g_0}{g_0+g}\,p(s),\quad s\in[0,T).\tag{5.38}$$
In fact, by letting $\tilde p = \frac{g_0}{g_0+g}\,p$, we have
$$\begin{cases}
\tilde p'(s) + \tilde p(s) - b^2\tilde p(s)^2 - \dfrac{b^2g_0g}{(g_0+g)^2}\,p(s)^2 = 0,\quad s\in[0,T],\\
\tilde p(T) = g_0.
\end{cases}$$
Then
$$\begin{cases}
\big[p_0'(s) - \tilde p'(s)\big] + \big[p_0(s) - \tilde p(s)\big] - b^2\big[p_0(s) + \tilde p(s)\big]\big[p_0(s) - \tilde p(s)\big] + \dfrac{b^2g_0g}{(g_0+g)^2}\,p(s)^2 = 0,\\
p_0(T) - \tilde p(T) = 0,
\end{cases}$$
so that, as before, $p_0(s) - \tilde p(s) > 0$ for $s\in[0,T)$.
This leads to (5.38). Consequently,
$$J_0(x;\bar u(\cdot)) + g\,\mathrm{var}[\bar X(T)] = \hat J_0(x;\bar u(\cdot)) = \pi(0)x^2 \le p(0)x^2 \le \frac{g_0+g}{g_0}\,p_0(0)x^2 = \frac{g_0+g}{g_0}\,J_0(x;\bar u_0(\cdot)).$$
Therefore,
$$0 \le J_0(x;\bar u(\cdot)) - J_0(x;\bar u_0(\cdot)) + g\,\mathrm{var}[\bar X(T)] \le \frac{g}{g_0}\,J_0(x;\bar u_0(\cdot)).$$
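Since $J_0(x;\bar u(\cdot)) - J_0(x;\bar u_0(\cdot)) + g\,\mathrm{var}[\bar X(T)] = \big(\pi(0) - p_0(0)\big)x^2$, the bound above can be checked numerically from the Riccati solutions alone. A sketch (ours, with illustrative parameter values):

```python
def riccati_values_at_zero(b, T, g0, g, n=4000):
    """Backward RK4 for (5.22), (5.31), (5.32); returns (p0(0), p(0), pi(0))."""
    h = T / n

    def f(y):
        p0, p, pi = y
        return (b * b * p0 * p0 - p0, b * b * p * p - p, b * b * pi * pi - p)

    y = (g0, g0 + g, g0)                     # terminal data at s = T
    for _ in range(n):                       # step backward to s = 0
        k1 = f(y)
        k2 = f(tuple(v - 0.5 * h * k for v, k in zip(y, k1)))
        k3 = f(tuple(v - 0.5 * h * k for v, k in zip(y, k2)))
        k4 = f(tuple(v - h * k for v, k in zip(y, k3)))
        y = tuple(v - h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0
                  for i, v in enumerate(y))
    return y

b, T, g0, g, x = 0.7, 1.0, 2.0, 1.0, 1.0     # illustrative values only
p0_0, p_0, pi_0 = riccati_values_at_zero(b, T, g0, g)
# J0(x;u) - J0(x;u0) + g var[X(T)] = (pi(0) - p0(0)) x^2, and the bound says
# this gap lies between 0 and (g/g0) p0(0) x^2.
gap = (pi_0 - p0_0) * x * x
assert 0.0 <= gap <= (g / g0) * p0_0 * x * x
assert p0_0 < pi_0 < p_0                     # consistent with (5.34) at s = 0
```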
The above gives an upper bound for the cost increase incurred in exchange for a smaller $\mathrm{var}[X(T)]$. Taking (5.37) into account, we see that the modified LQ problem offers a very good trade-off when one wishes to reduce $\mathrm{var}[X(T)]$. It is possible to calculate the price difference $J_0(x;\bar u(\cdot)) - J_0(x;\bar u_0(\cdot))$ more precisely; we omit the details here. It is also possible to treat the case in which $\mathrm{var}[u(s)]$ and/or $\mathrm{var}[X(s)]$ appear in the integrand of the modified cost functional; the details are omitted as well.

To conclude this paper, let us make some remarks. We have presented some results on the LQ problem for MF-SDEs with deterministic coefficients. The optimal control is represented in a state feedback form involving both $X(\cdot)$ and $\mathbb{E}[X(\cdot)]$, via the solutions of two Riccati equations. Apparently, many problems are left unsolved (and some of them might be challenging). To mention a few, one may consider the infinite-horizon problem, following the idea of [31], and, more interestingly, the case of random coefficients, for which one might have to introduce other techniques, since the approach used in this paper will not work. We will continue our study and report new results in future publications.
References

[1] N. U. Ahmed and X. Ding, A semilinear McKean-Vlasov stochastic evolution equation in Hilbert space, Stoch. Proc. Appl., 60 (1995), 65–85.
[2] N. U. Ahmed and X. Ding, Controlled McKean-Vlasov equations, Comm. Appl. Anal., 5 (2001), 183–206.
[3] N. U. Ahmed, Nonlinear diffusion governed by McKean-Vlasov equation on Hilbert space and optimal control, SIAM J. Control Optim., 46 (2007), 356–378.
[4] D. Andersson and B. Djehiche, A maximum principle for SDEs of mean-field type, Appl. Math. Optim., 63 (2011), 341–356.
[5] R. Bellman, R. Kalaba, and G. M. Wing, Invariant imbedding and the reduction of two-point boundary value problems to initial value problems, Proc. Natl. Acad. Sci. USA, 46 (1960), 1646–1649.
[6] V. S. Borkar and K. S. Kumar, McKean-Vlasov limit in portfolio optimization, Stoch. Anal. Appl., 28 (2010), 884–906.
[7] R. Buckdahn, B. Djehiche, and J. Li, A general maximum principle for SDEs of mean-field type, preprint.
[8] R. Buckdahn, B. Djehiche, J. Li, and S. Peng, Mean-field backward stochastic differential equations: a limit approach, Ann. Probab., 37 (2009), 1524–1565.
[9] R. Buckdahn, J. Li, and S. Peng, Mean-field backward stochastic differential equations and related partial differential equations, Stoch. Process. Appl., 119 (2009), 3133–3154.
[10] T. Chan, Dynamics of the McKean-Vlasov equation, Ann. Probab., 22 (1994), 431–441.
[11] T. Chiang, McKean-Vlasov equations with discontinuous coefficients, Soochow J. Math., 20 (1994), 507–526.
[12] D. Crisan and J. Xiong, Approximate McKean-Vlasov representations for a class of SPDEs, Stochastics, 82 (2010), 53–68.
[13] D. A. Dawson, Critical dynamics and fluctuations for a mean-field model of cooperative behavior, J. Statist. Phys., 31 (1983), 29–85.
[14] D. A. Dawson and J. Gärtner, Large deviations from the McKean-Vlasov limit for weakly interacting diffusions, Stochastics, 20 (1987), 247–308.
[15] J. Gärtner, On the McKean-Vlasov limit for interacting diffusions, Math. Nachr., 137 (1988), 197–248.
[16] C. Graham, McKean-Vlasov Ito-Skorohod equations, and nonlinear diffusions with discrete jump sets, Stoch. Proc. Appl., 40 (1992), 69–82.
[17] M. Huang, R. P. Malhamé, and P. E. Caines, Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle, Comm. Inform. Systems, 6 (2006), 221–252.
[18] M. Kac, Foundations of kinetic theory, Proc. 3rd Berkeley Sympos. Math. Statist. Prob., 3 (1956), 171–197.
[19] P. E. Kloeden and T. Lorenz, Stochastic differential equations with nonlocal sample dependence, Stoch. Anal. Appl., 28 (2010), 937–945.
[20] P. M. Kotelenez and T. G. Kurtz, Macroscopic limit for stochastic partial differential equations of McKean-Vlasov type, Prob. Theory Rel. Fields, 146 (2010), 189–222.
[21] J. Ma, P. Protter, and J. Yong, Solving forward-backward stochastic differential equations explicitly — a four-step scheme, Probab. Theory Rel. Fields, 98 (1994), 339–359.
[22] J. Ma and J. Yong, Forward-Backward Stochastic Differential Equations and Their Applications, Lecture Notes in Math., Vol. 1702, Springer-Verlag, 1999.
[23] N. I. Mahmudov and M. A. McKibben, On a class of backward McKean-Vlasov stochastic equations in Hilbert space: existence and convergence properties, Dynamic Systems Appl., 16 (2007), 643–664.
[24] H. P. McKean, A class of Markov processes associated with nonlinear parabolic equations, Proc. Natl. Acad. Sci. USA, 56 (1966), 1907–1911.
[25] T. Meyer-Brandis, B. Oksendal, and X. Zhou, A mean-field stochastic maximum principle via Malliavin calculus, a special issue for Mark Davis' Festschrift, to appear in Stochastics.
[26] L. Mou and J. Yong, Two-person zero-sum linear quadratic stochastic differential games by a Hilbert space method, J. Industrial Management Optim., 2 (2006), 95–117.
[27] D. Nualart, The Malliavin Calculus and Related Topics, 2nd ed., Springer-Verlag, Berlin, 2006.
[28] J. Y. Park, P. Balasubramaniam, and Y. H. Kang, Controllability of McKean-Vlasov stochastic integrodifferential evolution equation in Hilbert spaces, Numer. Funct. Anal. Optim., 29 (2008), 1328–1346.
[29] M. Scheutzow, Uniqueness and non-uniqueness of solutions of Vlasov-McKean equations, J. Austral. Math. Soc., Ser. A, 43 (1987), 246–256.
[30] S. Tang, General linear quadratic optimal stochastic control problems with random coefficients: linear stochastic Hamilton systems and backward stochastic Riccati equations, SIAM J. Control Optim., 42 (2003), 53–75.
[31] H. Wu and X. Y. Zhou, Stochastic frequency characteristics, SIAM J. Control Optim., 40 (2001), 557–576.
[32] A. Yu. Veretennikov, On ergodic measures for McKean-Vlasov stochastic equations, From Stochastic Calculus to Mathematical Finance, 623–633, Springer, Berlin, 2006.
[33] J. Yong, Stochastic optimal control and forward-backward stochastic differential equations, Computational & Appl. Math., 21 (2002), 369–403.
[34] J. Yong and X. Y. Zhou, Stochastic Controls: Hamiltonian Systems and HJB Equations, Springer-Verlag, 1999.