Hindawi Publishing Corporation
Mathematical Problems in Engineering
Volume 2013, Article ID 285241, 12 pages
http://dx.doi.org/10.1155/2013/285241

Research Article

Relationship between Maximum Principle and Dynamic Programming for Stochastic Recursive Optimal Control Problems and Applications

Jingtao Shi and Zhiyong Yu

School of Mathematics, Shandong University, Jinan 250100, China

Correspondence should be addressed to Jingtao Shi; [email protected]

Received 26 October 2012; Accepted 23 December 2012

Academic Editor: Guangchen Wang

Copyright © 2013 J. Shi and Z. Yu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This paper is concerned with the relationship between the maximum principle and dynamic programming for stochastic recursive optimal control problems. Under certain differentiability conditions, relations among the adjoint processes, the generalized Hamiltonian function, and the value function are given. A linear quadratic recursive utility portfolio optimization problem in financial engineering is discussed as an explicit illustration of the main result.
1. Introduction

The nonlinear backward stochastic differential equation (BSDE) was introduced by Pardoux and Peng [1]. Independently, Duffie and Epstein [2] introduced BSDEs from an economic background and presented a stochastic differential formulation of recursive utility. Recursive utility is an extension of the standard additive utility in which the instantaneous utility depends not only on the instantaneous consumption rate but also on the future utility. As found by El Karoui et al. [3], the utility process can be regarded as the solution to a special BSDE. An optimal control problem in which the cost functional is described by the solution to a BSDE is called a stochastic recursive optimal control problem; in this case, the control system becomes a forward-backward stochastic differential equation (FBSDE). This kind of optimal control problem has found important applications in real-world problems arising in mathematical economics, mathematical finance, and engineering (see Schroder and Skiadas [4], El Karoui et al. [3, 5], Ji and Zhou [6], Williams [7], and Wang and Wu [8]). It is well known that Pontryagin's maximum principle and Bellman's dynamic programming are two of the most important tools for solving stochastic optimal control problems; see the famous reference book by Yong and Zhou [9]
for a systematic discussion. For stochastic recursive optimal control problems, Peng [10] first obtained a maximum principle when the control domain is convex. Xu [11] then studied the nonconvex control domain case, under the assumption that the diffusion coefficient does not contain the control variable. Ji and Zhou [6] established a maximum principle when the forward state is constrained in a convex set at the terminal time. Wu [12] established a general maximum principle, where the control domain is nonconvex and the diffusion coefficient depends on the control variable. A maximum principle for stochastic recursive optimal control systems with Poisson jumps, together with applications in finance, was studied by Shi and Wu [13] for a convex control domain. Turning to dynamic programming, the other important approach to stochastic recursive optimal control problems, Peng [14] (see also Peng [15]) first obtained the generalized dynamic programming principle and introduced a generalized Hamilton-Jacobi-Bellman (HJB) equation, which is a second-order parabolic partial differential equation (PDE). The result that the value function is a viscosity solution to the generalized HJB equation is also proved in [14]. Wu and Yu [16] extended the results of [14, 15] to an obstacle constraint, with the cost functional described by the solution to a reflected backward stochastic differential equation, and proved that
the value function is the unique viscosity solution to their generalized HJB equation. Li and Peng [17] generalized the results of [14, 15] by considering a cost functional defined by a controlled BSDE with jumps; they proved that the value function is a viscosity solution to the associated generalized HJB equation with integro-differential operators.

Hence, a natural question arises: are there any relations between these two methods? Such a topic was intuitively discussed by Bismut [18] and Bensoussan [19] and then studied by many researchers. Under certain differentiability conditions, the relationship between the maximum principle and dynamic programming is essentially the relationship between the derivatives of the value function and the solution to the adjoint equation along the optimal state. However, the smoothness conditions do not hold in general and are difficult to verify a priori; see Zhou [20] for the deterministic case and Yong and Zhou [9] for its stochastic counterpart. Zhou [21] first obtained the relationship between the general maximum principle and dynamic programming using viscosity solution theory (see also Zhou [22] or Yong and Zhou [9]), without assuming that the value function is smooth. For diffusions with jumps, the relationship between the maximum principle and dynamic programming was first given by Framstad et al. [23, 24] under certain differentiability conditions; Shi and Wu [25] later eliminated these restrictions within the framework of viscosity solutions. For the singular stochastic optimal control problem, the relationship between the maximum principle and dynamic programming was given by Bahlali et al. [26], in terms of the derivatives of the value function. For the Markovian regime-switching jump diffusion model, the corresponding relationship was given by Zhang et al. [27], also in terms of the derivatives of the value function.

In this paper, we derive the relationship between the maximum principle and dynamic programming for the stochastic recursive optimal control problem. Specifically, we connect the maximum principle of [10] with the dynamic programming of [14, 15] under certain differentiability conditions: when the value function is smooth, we give relations among the adjoint processes, the generalized Hamiltonian function, and the value function. To this end, in Section 2 we first adapt some related results of [14, 15], which are stated here as a stochastic verification theorem; we also prove that, under additional convexity conditions, the necessary conditions in the maximum principle of [10] are in fact sufficient. In Section 3, we establish the relationship between the maximum principle and dynamic programming for our stochastic recursive optimal control problem, under certain differentiability conditions, by the martingale representation technique. In Section 4, we discuss a linear quadratic (LQ) recursive utility portfolio optimization problem in financial engineering; in this problem, the state feedback optimal control is obtained by both the maximum principle and dynamic programming approaches, and the relations we obtained are illustrated explicitly. Finally, we end this paper with some concluding remarks in Section 5.

Notations. Throughout this paper, we denote by $\mathbb{R}^n$ the $n$-dimensional Euclidean space, by $\mathbb{R}^{n\times d}$ the space of $n\times d$ matrices, and by $\mathcal{S}^n$ the space of $n\times n$ symmetric matrices. $\langle\cdot,\cdot\rangle$ and $|\cdot|$ denote the scalar product and norm in Euclidean space, respectively. The superscript $\top$ denotes the transpose of a matrix.
2. Problem Statement and Preliminaries

Let $(\Omega,\mathcal{F},\mathbb{P})$ be a complete probability space equipped with a $d$-dimensional standard Brownian motion $\{W(t)\}_{t\ge 0}$. For fixed $t\ge 0$, the filtration $\{\mathcal{F}^t_s\}_{s\ge t}$ is generated as $\mathcal{F}^t_s=\sigma\{W(r)-W(t);\ t\le r\le s\}\bigvee\mathcal{N}$, where $\mathcal{N}$ contains all $\mathbb{P}$-null sets in $\mathcal{F}$ and $\sigma_1\bigvee\sigma_2$ denotes the $\sigma$-field generated by $\sigma_1\cup\sigma_2$. In particular, if $t=0$, we write $\mathcal{F}_s\equiv\mathcal{F}^0_s$.

Let $T>0$ be finite and let $U\subset\mathbb{R}^k$ be nonempty and convex. For any initial time and state $(t,x)\in[0,T)\times\mathbb{R}^n$, consider the state process $X^{t,x;u}(\cdot)\in\mathbb{R}^n$ given by the following controlled SDE:
$$dX^{t,x;u}(s)=b\big(s,X^{t,x;u}(s),u(s)\big)\,ds+\sigma\big(s,X^{t,x;u}(s),u(s)\big)\,dW(s),\quad s\in[t,T],\qquad X^{t,x;u}(t)=x.\tag{1}$$
Here $b:[0,T]\times\mathbb{R}^n\times U\to\mathbb{R}^n$ and $\sigma:[0,T]\times\mathbb{R}^n\times U\to\mathbb{R}^{n\times d}$ are given functions. Given $t\in[0,T)$, we denote by $\mathcal{U}[t,T]$ the set of $\{\mathcal{F}^t_s\}_{s\ge t}$-adapted processes valued in $U$. For given $u(\cdot)\in\mathcal{U}[t,T]$ and $x\in\mathbb{R}^n$, an $\mathbb{R}^n$-valued process $X^{t,x;u}(\cdot)$ is called a solution to (1) if it is an $\mathcal{F}^t_s$-adapted process such that (1) holds. We refer to such a $u(\cdot)\in\mathcal{U}[t,T]$ as an admissible control and to $(X^{t,x;u}(\cdot),u(\cdot))$ as an admissible pair. We assume the following.

(H1) $b,\sigma$ are uniformly continuous in $(s,x,u)$, and there exists a constant $C>0$ such that for all $s\in[0,T]$, $x,\hat{x}\in\mathbb{R}^n$, $u,\hat{u}\in U$,
$$|b(s,x,u)-b(s,\hat{x},\hat{u})|+|\sigma(s,x,u)-\sigma(s,\hat{x},\hat{u})|\le C\big(|x-\hat{x}|+|u-\hat{u}|\big),\qquad |b(s,x,u)|+|\sigma(s,x,u)|\le C(1+|x|).\tag{2}$$

For any $u(\cdot)\in\mathcal{U}[t,T]$, under (H1), SDE (1) has a unique solution $X^{t,x;u}(\cdot)$ by the classical SDE theory (see, e.g., Yong and Zhou [9]). Next, we introduce the following controlled BSDE coupled with the controlled SDE (1):
$$-dY^{t,x;u}(s)=f\big(s,X^{t,x;u}(s),Y^{t,x;u}(s),Z^{t,x;u}(s),u(s)\big)\,ds-Z^{t,x;u}(s)\,dW(s),\quad s\in[t,T],\qquad Y^{t,x;u}(T)=\phi\big(X^{t,x;u}(T)\big).\tag{3}$$
Here $f:[0,T]\times\mathbb{R}^n\times\mathbb{R}\times\mathbb{R}^d\times U\to\mathbb{R}$ and $\phi:\mathbb{R}^n\to\mathbb{R}$ are given functions. We assume the following.
(H2) $f,\phi$ are uniformly continuous in $(s,x,y,z,u)$, and there exists a constant $C>0$ such that for all $s\in[0,T]$, $x,\hat{x}\in\mathbb{R}^n$, $y,\hat{y}\in\mathbb{R}$, $z,\hat{z}\in\mathbb{R}^d$, $u,\hat{u}\in U$,
$$|f(s,x,y,z,u)-f(s,\hat{x},\hat{y},\hat{z},\hat{u})|\le C\big(|x-\hat{x}|+|y-\hat{y}|+|z-\hat{z}|+|u-\hat{u}|\big),\qquad |f(s,x,0,0,u)|+|\phi(x)|\le C(1+|x|),\qquad |\phi(x)-\phi(\hat{x})|\le C|x-\hat{x}|.\tag{4}$$

Then, for any $u(\cdot)\in\mathcal{U}[t,T]$ and the corresponding unique solution $X^{t,x;u}(\cdot)$ to (1), under (H2), BSDE (3) admits a unique solution $(Y^{t,x;u}(\cdot),Z^{t,x;u}(\cdot))$ by the classical BSDE theory (see Pardoux and Peng [1] or Yong and Zhou [9]).

Given $u(\cdot)\in\mathcal{U}[t,T]$, we introduce the cost functional
$$J(t,x;u(\cdot)):=-Y^{t,x;u}(s)\big|_{s=t},\quad (t,x)\in[0,T]\times\mathbb{R}^n.\tag{5}$$

Our stochastic recursive optimal control problem is the following.

Problem 1 (RSOCP). For given $(t,x)\in[0,T)\times\mathbb{R}^n$, minimize (5) subject to (1)–(3) over $\mathcal{U}[t,T]$.

We define the value function
$$V(t,x):=\inf_{u(\cdot)\in\mathcal{U}[t,T]}J(t,x;u(\cdot)),\quad (t,x)\in[0,T]\times\mathbb{R}^n.\tag{6}$$
Any $\bar{u}(\cdot)\in\mathcal{U}[t,T]$ that achieves the above infimum is called an optimal control, and the corresponding solutions $\bar{X}(\cdot):=X^{t,x;\bar{u}}(\cdot)$ to (1) and $(\bar{Y}(\cdot),\bar{Z}(\cdot)):=(Y^{t,x;\bar{u}}(\cdot),Z^{t,x;\bar{u}}(\cdot))$ to (3) are called the optimal state. For simplicity, we refer to $(\bar{X}(\cdot),\bar{Y}(\cdot),\bar{Z}(\cdot),\bar{u}(\cdot))$ as the optimal quartet.

Remark 1. Because $b,\sigma,f,\phi$ are all deterministic functions, we know from Peng [15, Proposition 5.1] that, under (H1) and (H2), the value function is a deterministic function, so our definition (6) is meaningful.

We introduce the following generalized HJB equation:
$$-v_s(t,x)+\sup_{u\in U}G\big(t,x,-v(t,x),-v_x(t,x),-v_{xx}(t,x),u\big)=0,\quad (t,x)\in[0,T)\times\mathbb{R}^n,\qquad v(T,x)=-\phi(x),\quad \forall x\in\mathbb{R}^n,\tag{7}$$
where the generalized Hamiltonian function $G:[0,T]\times\mathbb{R}^n\times\mathbb{R}\times\mathbb{R}^n\times\mathcal{S}^n\times U\to\mathbb{R}$ is defined as
$$G(t,x,r,p,A,u):=\frac{1}{2}\operatorname{tr}\big\{\sigma(t,x,u)^\top A\,\sigma(t,x,u)\big\}+\langle p,b(t,x,u)\rangle+f\big(t,x,r,\sigma(t,x,u)^\top p,u\big).\tag{8}$$
We have the following result.

Lemma 2 (stochastic verification theorem). Let (H1)-(H2) hold and let $(t,x)\in[0,T)\times\mathbb{R}^n$ be fixed. Suppose that $V\in C^{1,2}([0,T]\times\mathbb{R}^n)$ is a solution to (7). Then
$$V(t,x)\le J(t,x;u(\cdot)),\quad \forall u(\cdot)\in\mathcal{U}[t,T],\ (t,x)\in[0,T]\times\mathbb{R}^n.\tag{9}$$
Furthermore, an admissible pair $(\bar{X}(\cdot),\bar{u}(\cdot))$ is optimal for Problem (RSOCP) if and only if
$$G\big(s,\bar{X}(s),-V(s,\bar{X}(s)),-V_x(s,\bar{X}(s)),-V_{xx}(s,\bar{X}(s)),\bar{u}(s)\big)=\max_{u\in U}G\big(s,\bar{X}(s),-V(s,\bar{X}(s)),-V_x(s,\bar{X}(s)),-V_{xx}(s,\bar{X}(s)),u\big),\tag{10}$$
a.e. $s\in[t,T]$, $\mathbb{P}$-a.s.

Proof. For any $u(\cdot)\in\mathcal{U}[t,T]$ with corresponding state $X^{t,x;u}(\cdot)$, applying Itô's formula to $V(s,X^{t,x;u}(s))$ and recalling (8), we obtain
$$V(t,x)=-\mathbb{E}\,\phi\big(X^{t,x;u}(T)\big)-\mathbb{E}\int_t^T\Big\{V_s\big(s,X^{t,x;u}(s)\big)+\big\langle V_x\big(s,X^{t,x;u}(s)\big),b\big(s,X^{t,x;u}(s),u(s)\big)\big\rangle+\frac{1}{2}\operatorname{tr}\Big(\sigma\big(s,X^{t,x;u}(s),u(s)\big)^\top V_{xx}\big(s,X^{t,x;u}(s)\big)\,\sigma\big(s,X^{t,x;u}(s),u(s)\big)\Big)\Big\}\,ds$$
$$=-\mathbb{E}\,\phi\big(X^{t,x;u}(T)\big)-\mathbb{E}\int_t^T\Big\{V_s\big(s,X^{t,x;u}(s)\big)+f\Big(s,X^{t,x;u}(s),-V\big(s,X^{t,x;u}(s)\big),-\sigma\big(s,X^{t,x;u}(s),u(s)\big)^\top V_x\big(s,X^{t,x;u}(s)\big),u(s)\Big)-G\Big(s,X^{t,x;u}(s),-V\big(s,X^{t,x;u}(s)\big),-V_x\big(s,X^{t,x;u}(s)\big),-V_{xx}\big(s,X^{t,x;u}(s)\big),u(s)\Big)\Big\}\,ds$$
$$=J\big(t,x;u(\cdot)\big)+\mathbb{E}\int_t^T\Big\{-V_s\big(s,X^{t,x;u}(s)\big)+G\Big(s,X^{t,x;u}(s),-V\big(s,X^{t,x;u}(s)\big),-V_x\big(s,X^{t,x;u}(s)\big),-V_{xx}\big(s,X^{t,x;u}(s)\big),u(s)\Big)\Big\}\,ds$$
$$\le J\big(t,x;u(\cdot)\big)+\mathbb{E}\int_t^T\Big\{-V_s\big(s,X^{t,x;u}(s)\big)+\max_{u\in U}G\Big(s,X^{t,x;u}(s),-V\big(s,X^{t,x;u}(s)\big),-V_x\big(s,X^{t,x;u}(s)\big),-V_{xx}\big(s,X^{t,x;u}(s)\big),u\Big)\Big\}\,ds=J\big(t,x;u(\cdot)\big).\tag{11}$$
Thus (9) holds. Next, applying the above argument to the admissible pair $(\bar{X}(\cdot),\bar{u}(\cdot))$, we have
$$V(t,x)=J(t,x;\bar{u}(\cdot))+\mathbb{E}\int_t^T\Big\{-V_s(s,\bar{X}(s))+G\big(s,\bar{X}(s),-V(s,\bar{X}(s)),-V_x(s,\bar{X}(s)),-V_{xx}(s,\bar{X}(s)),\bar{u}(s)\big)\Big\}\,ds.\tag{12}$$
The desired result follows immediately from the fact that
$$-V_s(s,\bar{X}(s))+G\big(s,\bar{X}(s),-V(s,\bar{X}(s)),-V_x(s,\bar{X}(s)),-V_{xx}(s,\bar{X}(s)),\bar{u}(s)\big)\le 0,\tag{13}$$
which is due to the generalized HJB equation (7). The proof is complete.

To state the maximum principle conveniently, we regard the controlled SDE (1) and BSDE (3) as one controlled FBSDE:
$$dX^{t,x;u}(s)=b\big(s,X^{t,x;u}(s),u(s)\big)\,ds+\sigma\big(s,X^{t,x;u}(s),u(s)\big)\,dW(s),\qquad -dY^{t,x;u}(s)=f\big(s,X^{t,x;u}(s),Y^{t,x;u}(s),Z^{t,x;u}(s),u(s)\big)\,ds-Z^{t,x;u}(s)\,dW(s),\quad s\in[t,T],\qquad X^{t,x;u}(t)=x,\qquad Y^{t,x;u}(T)=\phi\big(X^{t,x;u}(T)\big).\tag{14}$$
This kind of control system was studied by Peng [10], and a maximum principle was obtained. To state his result, we need the following assumption.

(H3) $b,\sigma,\phi$ are continuously differentiable in $(x,u)$, and $f$ is continuously differentiable in $(x,y,z,u)$. Moreover, $b_x,\sigma_x,f_x,f_y,f_z,b_u,\sigma_u,f_u$ are bounded, and there exists a constant $C>0$ such that
$$|\phi_x(x)|\le C(1+|x|),\quad \forall x\in\mathbb{R}^n.\tag{15}$$

Let $(\bar{X}(\cdot),\bar{Y}(\cdot),\bar{Z}(\cdot),\bar{u}(\cdot))$ be an optimal quartet. For all $s\in[0,T]$, we denote
$$\bar{b}(s):=b\big(s,\bar{X}(s),\bar{u}(s)\big),\qquad \bar{\sigma}(s):=\sigma\big(s,\bar{X}(s),\bar{u}(s)\big),\qquad \bar{f}(s):=f\big(s,\bar{X}(s),\bar{Y}(s),\bar{Z}(s),\bar{u}(s)\big),\tag{16}$$
and similar notations are used for all their derivatives (e.g., $b_x(s):=b_x(s,\bar{X}(s),\bar{u}(s))$ and $f_z(s):=f_z(s,\bar{X}(s),\bar{Y}(s),\bar{Z}(s),\bar{u}(s))$). We introduce the adjoint equation, which is itself an FBSDE:
$$-dp(s)=\big[b_x(s)^\top p(s)-f_x(s)^\top q(s)+\sigma_x(s)k(s)\big]\,ds-k(s)\,dW(s),\qquad dq(s)=f_y(s)^\top q(s)\,ds+f_z(s)^\top q(s)\,dW(s),\quad s\in[t,T],\qquad p(T)=-\phi_x\big(\bar{X}(T)\big)^\top q(T),\qquad q(t)=1,\tag{17}$$
and the Hamiltonian function $H:[0,T]\times\mathbb{R}^n\times\mathbb{R}\times\mathbb{R}^d\times U\times\mathbb{R}^n\times\mathbb{R}\times\mathbb{R}^{n\times d}\to\mathbb{R}$ is defined as
$$H(t,x,y,z,u,p,q,k):=\langle p,b(t,x,u)\rangle-\langle q,f(t,x,y,z,u)\rangle+\operatorname{tr}\big[\sigma(t,x,u)^\top k\big].\tag{18}$$
Under (H1)–(H3), the forward equation in (17) admits an obvious unique solution $q(\cdot)$, and then the backward equation in (17) admits a unique solution $(p(\cdot),k(\cdot))$. We call $p,q,k$ the adjoint processes. The following result holds.
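Both Hamiltonians (8) and (18) are purely algebraic and can be evaluated directly once the coefficients $b,\sigma,f$ are supplied as functions. The following minimal Python sketch is our own illustration (it does not appear in the paper; the scalar toy coefficients at the bottom are hypothetical), showing how $G$ and $H$ are assembled:

```python
import numpy as np

def G(t, x, r, p, A, u, b, sigma, f):
    # Generalized Hamiltonian (8):
    # G = 1/2 tr(sigma^T A sigma) + <p, b> + f(t, x, r, sigma^T p, u)
    sig = np.atleast_2d(sigma(t, x, u))                      # n x d
    return (0.5 * np.trace(sig.T @ np.atleast_2d(A) @ sig)
            + np.dot(np.atleast_1d(p), np.atleast_1d(b(t, x, u)))
            + f(t, x, r, sig.T @ np.atleast_1d(p), u))

def H(t, x, y, z, u, p, q, k, b, sigma, f):
    # Hamiltonian (18): H = <p, b> - q * f(t, x, y, z, u) + tr(sigma^T k)
    sig = np.atleast_2d(sigma(t, x, u))
    return (np.dot(np.atleast_1d(p), np.atleast_1d(b(t, x, u)))
            - q * f(t, x, y, z, u)
            + np.trace(sig.T @ np.atleast_2d(k)))

# Hypothetical scalar coefficients (n = d = k = 1), purely for illustration.
b_fun = lambda t, x, u: 0.1 * x + u
sigma_fun = lambda t, x, u: 0.2 * u
f_fun = lambda t, x, y, z, u: x + u - 0.5 * y   # a linear recursive generator
print(G(0.0, 1.0, 0.5, 0.3, -0.4, 0.2, b_fun, sigma_fun, f_fun))
print(H(0.0, 1.0, 0.5, 0.1, 0.2, 0.3, 1.0, 0.05, b_fun, sigma_fun, f_fun))
```

Such helpers are convenient for checking the maximum condition (19) or the HJB equation (7) numerically in low-dimensional examples, as in Section 4 below.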
Lemma 3 (necessary maximum principle). Let (H1)–(H3) hold and let $(t,x)\in[0,T)\times\mathbb{R}^n$ be fixed. Suppose that $\bar{u}(\cdot)$ is an optimal control for Problem (RSOCP) and that $(\bar{X}(\cdot),\bar{Y}(\cdot),\bar{Z}(\cdot))$ is the corresponding optimal state. Let $(p(\cdot),q(\cdot),k(\cdot))$ be the adjoint processes. Then
$$\big\langle H_u\big(s,\bar{X}(s),\bar{Y}(s),\bar{Z}(s),\bar{u}(s),p(s),q(s),k(s)\big),u-\bar{u}(s)\big\rangle\ge 0,\quad \forall u\in U,\tag{19}$$
a.e. $s\in[t,T]$, $\mathbb{P}$-a.s.

Proof. This is an immediate consequence of Peng [10, Theorem 4.4].

As mentioned in the introduction, we can also prove that, under additional convexity conditions, the necessary condition in Lemma 3 is in fact sufficient.

Lemma 4 (sufficient maximum principle). Let (H1)–(H3) hold. Suppose that $\bar{u}(\cdot)$ is an admissible control and that $(\bar{X}(\cdot),\bar{Y}(\cdot),\bar{Z}(\cdot))$ is the corresponding state, with $\bar{Y}(T)=M_T^\top\bar{X}(T)$ for some $M_T\in\mathbb{R}^n$. Let $(p(\cdot),q(\cdot),k(\cdot))$ be the adjoint processes. Suppose that the Hamiltonian function $H$ is convex in $(x,y,z,u)$. Then $\bar{u}(\cdot)$ is an optimal control for Problem (RSOCP) if it satisfies (19).

Proof. Let $u(\cdot)\in\mathcal{U}[t,T]$ be any admissible control with corresponding state $(X^{t,x;u}(\cdot),Y^{t,x;u}(\cdot),Z^{t,x;u}(\cdot))$. By Remark 1, we have, for fixed $t\in[0,T]$,
$$J(t,x;\bar{u}(\cdot))-J(t,x;u(\cdot))=Y^{t,x;u}(t)-\bar{Y}(t)=\mathbb{E}\big[Y^{t,x;u}(t)-\bar{Y}(t)\big]=-\mathbb{E}\big[\big(\bar{Y}(t)-Y^{t,x;u}(t)\big)q(t)\big].\tag{20}$$
Applying Itô's formula to $\langle\bar{Y}(s)-Y^{t,x;u}(s),q(s)\rangle+\langle\bar{X}(s)-X^{t,x;u}(s),p(s)\rangle$, and noting (14), (17), $\bar{Y}(T)-Y^{t,x;u}(T)=M_T^\top\big(\bar{X}(T)-X^{t,x;u}(T)\big)$, and $p(T)=-M_Tq(T)$, we get
$$-\mathbb{E}\big[\big(\bar{Y}(t)-Y^{t,x;u}(t)\big)q(t)\big]=\mathbb{E}\int_t^T\Big\{\big\langle\bar{Y}(s)-Y^{t,x;u}(s),f_y(s)^\top q(s)\big\rangle+\big\langle\bar{Z}(s)-Z^{t,x;u}(s),f_z(s)^\top q(s)\big\rangle-\big\langle\bar{f}(s)-f\big(s,X^{t,x;u}(s),Y^{t,x;u}(s),Z^{t,x;u}(s),u(s)\big),q(s)\big\rangle+\big\langle\bar{X}(s)-X^{t,x;u}(s),-b_x(s)^\top p(s)+f_x(s)^\top q(s)-\sigma_x(s)k(s)\big\rangle+\big\langle\bar{b}(s)-b\big(s,X^{t,x;u}(s),u(s)\big),p(s)\big\rangle+\operatorname{tr}\Big[\big(\bar{\sigma}(s)-\sigma\big(s,X^{t,x;u}(s),u(s)\big)\big)^\top k(s)\Big]\Big\}\,ds$$
$$=\mathbb{E}\int_t^T\Big\{H\big(s,\bar{X}(s),\bar{Y}(s),\bar{Z}(s),\bar{u}(s),p(s),q(s),k(s)\big)-H\big(s,X^{t,x;u}(s),Y^{t,x;u}(s),Z^{t,x;u}(s),u(s),p(s),q(s),k(s)\big)-\big\langle H_x(s),\bar{X}(s)-X^{t,x;u}(s)\big\rangle-\big\langle H_y(s),\bar{Y}(s)-Y^{t,x;u}(s)\big\rangle-\big\langle H_z(s),\bar{Z}(s)-Z^{t,x;u}(s)\big\rangle\Big\}\,ds,\tag{21}$$
where $H_x(s),H_y(s),H_z(s),H_u(s)$ denote the derivatives of $H$ evaluated at $(s,\bar{X}(s),\bar{Y}(s),\bar{Z}(s),\bar{u}(s),p(s),q(s),k(s))$. Since $H$ is convex in $(x,y,z,u)$,
$$H\big(s,X^{t,x;u}(s),Y^{t,x;u}(s),Z^{t,x;u}(s),u(s),p(s),q(s),k(s)\big)-H\big(s,\bar{X}(s),\bar{Y}(s),\bar{Z}(s),\bar{u}(s),p(s),q(s),k(s)\big)\ge\big\langle H_x(s),X^{t,x;u}(s)-\bar{X}(s)\big\rangle+\big\langle H_y(s),Y^{t,x;u}(s)-\bar{Y}(s)\big\rangle+\big\langle H_z(s),Z^{t,x;u}(s)-\bar{Z}(s)\big\rangle+\big\langle H_u(s),u(s)-\bar{u}(s)\big\rangle,$$
so that, by (20), (21), and the maximum condition (19),
$$J(t,x;\bar{u}(\cdot))-J(t,x;u(\cdot))\le\mathbb{E}\int_t^T\big\langle H_u(s),\bar{u}(s)-u(s)\big\rangle\,ds\le 0.\tag{22}$$
Thus $\bar{u}(\cdot)$ is indeed an optimal control for Problem (RSOCP). The proof is complete.
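The convex-analysis step behind Lemma 4 is worth seeing in isolation: for a function that is convex in the control, the variational inequality (19) already forces global optimality over a convex control domain. A toy numerical check follows (our own illustration, with an arbitrary convex quadratic standing in for the Hamiltonian as a function of $u$ alone):

```python
import numpy as np

# Toy check of the convexity step in Lemma 4 (illustration only): if h is
# convex on a convex set U and h'(u_bar) * (u - u_bar) >= 0 for all u in U,
# then u_bar minimizes h over U.
h  = lambda u: (u - 1.0) ** 2 + 0.5 * u      # a convex "Hamiltonian in u"
dh = lambda u: 2.0 * (u - 1.0) + 0.5

U = np.linspace(-2.0, 2.0, 2001)             # grid over the control domain
u_bar = U[np.argmin(h(U))]                   # candidate from the grid

first_order = np.all(dh(u_bar) * (U - u_bar) >= -1e-8)   # condition (19)
global_min  = np.all(h(U) >= h(u_bar) - 1e-8)            # optimality
print(first_order, global_min)               # both True
```

In the proof above, the same inequality is applied jointly in $(x,y,z,u)$, with the non-control terms cancelled through the duality relation (21).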
3. Relationship between Maximum Principle and Dynamic Programming

In this section, we investigate the relationship between the maximum principle and dynamic programming, that is, the connection among the value function $V$, the generalized Hamiltonian function $G$, and the adjoint processes $p,q,k$. Our main result is the following.

Theorem 5. Let (H1)–(H3) hold and let $(t,x)\in[0,T)\times\mathbb{R}^n$ be fixed. Suppose that $\bar{u}(\cdot)$ is an optimal control for Problem (RSOCP) and that $(\bar{X}(\cdot),\bar{Y}(\cdot),\bar{Z}(\cdot))$ is the corresponding optimal state. Let $(p(\cdot),q(\cdot),k(\cdot))$ be the adjoint processes. If $V\in C^{1,2}([0,T]\times\mathbb{R}^n)$, then
$$V_s(s,\bar{X}(s))=G\big(s,\bar{X}(s),-V(s,\bar{X}(s)),-V_x(s,\bar{X}(s)),-V_{xx}(s,\bar{X}(s)),\bar{u}(s)\big)=\max_{u\in U}G\big(s,\bar{X}(s),-V(s,\bar{X}(s)),-V_x(s,\bar{X}(s)),-V_{xx}(s,\bar{X}(s)),u\big),\tag{23}$$
a.e. $s\in[t,T]$, $\mathbb{P}$-a.s. Furthermore, if $V\in C^{1,3}([0,T]\times\mathbb{R}^n)$ and $V_{sx}$ is also continuous, then
$$p(s)=V_x(s,\bar{X}(s))\,q(s),\quad \forall s\in[t,T],\ \mathbb{P}\text{-a.s.},\qquad k(s)=\Big[V_{xx}(s,\bar{X}(s))\bar{\sigma}(s)+V_x(s,\bar{X}(s))f_z\big(s,\bar{X}(s),-V(s,\bar{X}(s)),-V_x(s,\bar{X}(s))^\top\bar{\sigma}(s),\bar{u}(s)\big)\Big]q(s),\quad \text{a.e. }s\in[t,T],\ \mathbb{P}\text{-a.s.},\tag{24}$$
where
$$q(s)=\exp\Big\{\int_t^s\Big[f_y\big(r,\bar{X}(r),-V(r,\bar{X}(r)),-V_x(r,\bar{X}(r))^\top\bar{\sigma}(r),\bar{u}(r)\big)-\frac{1}{2}\big|f_z\big(r,\bar{X}(r),-V(r,\bar{X}(r)),-V_x(r,\bar{X}(r))^\top\bar{\sigma}(r),\bar{u}(r)\big)\big|^2\Big]\,dr+\int_t^s f_z\big(r,\bar{X}(r),-V(r,\bar{X}(r)),-V_x(r,\bar{X}(r))^\top\bar{\sigma}(r),\bar{u}(r)\big)\,dW(r)\Big\}.\tag{25}$$

Proof. Equation (25) is obtained by solving the forward SDE in (17) directly. Now let us prove (24). By Peng [15, Theorem 5.4], for fixed $t\in[0,T)$, it is easy to obtain
$$V(s,\bar{X}(s))=-\bar{Y}(s)=-\mathbb{E}\Big[\int_s^T\bar{f}(r)\,dr+\phi(\bar{X}(T))\;\Big|\;\mathcal{F}^t_s\Big],\quad \forall s\in[t,T],\ \mathbb{P}\text{-a.s.}\tag{26}$$
Define the square-integrable $\mathcal{F}^t_s$-martingale
$$m(s):=-\mathbb{E}\Big[\int_t^T\bar{f}(r)\,dr+\phi(\bar{X}(T))\;\Big|\;\mathcal{F}^t_s\Big],\quad s\in[t,T],\tag{27}$$
where $t\in[0,T)$ is fixed. By the martingale representation theorem (see Yong and Zhou [9]), there exists a unique $M(\cdot)\in L^2_{\mathcal{F}}([t,T];\mathbb{R}^d)$ satisfying
$$m(s)=m(t)+\int_t^s M(r)\,dW(r).\tag{28}$$
Then
$$V(s,\bar{X}(s))=-\int_s^T\bar{f}(r)\,dr-\int_s^T M(r)\,dW(r)+V(T,\bar{X}(T)),\quad s\in[t,T].\tag{29}$$
On the other hand, applying Itô's formula to $V(s,\bar{X}(s))$, we obtain
$$dV(s,\bar{X}(s))=\Big\{V_s(s,\bar{X}(s))+\big\langle V_x(s,\bar{X}(s)),\bar{b}(s)\big\rangle+\frac{1}{2}\operatorname{tr}\big(\bar{\sigma}(s)^\top V_{xx}(s,\bar{X}(s))\bar{\sigma}(s)\big)\Big\}\,ds+V_x(s,\bar{X}(s))^\top\bar{\sigma}(s)\,dW(s).\tag{30}$$
Comparing the above two equalities, we conclude that
$$V_s(s,\bar{X}(s))+\big\langle V_x(s,\bar{X}(s)),\bar{b}(s)\big\rangle+\frac{1}{2}\operatorname{tr}\big(\bar{\sigma}(s)^\top V_{xx}(s,\bar{X}(s))\bar{\sigma}(s)\big)=\bar{f}(s),\qquad V_x(s,\bar{X}(s))^\top\bar{\sigma}(s)=M(s),\quad \text{a.e. }s\in[t,T],\ \mathbb{P}\text{-a.s.}\tag{31}$$
Moreover, by the uniqueness of the solution to BSDE (3), we have
$$\bar{Y}(s)=-V(s,\bar{X}(s)),\qquad \bar{Z}(s)=-V_x(s,\bar{X}(s))^\top\bar{\sigma}(s),\quad \text{a.e. }s\in[t,T],\ \mathbb{P}\text{-a.s.}\tag{32}$$
In particular, by (32), the derivatives $f_x(s),f_y(s),f_z(s)$ introduced after (16) coincide with the derivatives of $f$ evaluated at $(s,\bar{X}(s),-V(s,\bar{X}(s)),-V_x(s,\bar{X}(s))^\top\bar{\sigma}(s),\bar{u}(s))$; this notation is used below. Since $V\in C^{1,2}([0,T]\times\mathbb{R}^n)$, it satisfies the generalized HJB equation (7), which together with (31) and (32) implies (23). Also, by (7) and (23), we have
$$0=-V_s(s,\bar{X}(s))+G\big(s,\bar{X}(s),-V(s,\bar{X}(s)),-V_x(s,\bar{X}(s)),-V_{xx}(s,\bar{X}(s)),\bar{u}(s)\big)\ge -V_s(s,x)+G\big(s,x,-V(s,x),-V_x(s,x),-V_{xx}(s,x),\bar{u}(s)\big),\quad \forall x\in\mathbb{R}^n,\tag{33}$$
so that the map $x\mapsto -V_s(s,x)+G(s,x,-V(s,x),-V_x(s,x),-V_{xx}(s,x),\bar{u}(s))$ attains its maximum, zero, at $x=\bar{X}(s)$. Consequently, if $V\in C^{1,3}([0,T]\times\mathbb{R}^n)$ and $V_{sx}$ is also continuous, then
$$\frac{\partial}{\partial x}\Big\{-V_s(s,x)+G\big(s,x,-V(s,x),-V_x(s,x),-V_{xx}(s,x),\bar{u}(s)\big)\Big\}\Big|_{x=\bar{X}(s)}=0,\quad \forall s\in[t,T].\tag{34}$$
Recalling (8), this is equivalent to: for all $s\in[t,T]$,
$$0=-V_{sx}(s,\bar{X}(s))-V_{xx}(s,\bar{X}(s))\bar{b}(s)-b_x(s)^\top V_x(s,\bar{X}(s))-\frac{1}{2}\operatorname{tr}\big(\bar{\sigma}(s)^\top V_{xxx}(s,\bar{X}(s))\bar{\sigma}(s)\big)-\sigma_x(s)^\top V_{xx}(s,\bar{X}(s))\bar{\sigma}(s)+f_x(s)-f_y(s)V_x(s,\bar{X}(s))-\big[V_{xx}(s,\bar{X}(s))\bar{\sigma}(s)+V_x(s,\bar{X}(s))\sigma_x(s)\big]f_z(s),\tag{35}$$
where
$$\operatorname{tr}\big(\bar{\sigma}^\top V_{xxx}\bar{\sigma}\big):=\Big(\operatorname{tr}\big(\bar{\sigma}^\top((V_x)_1)_{xx}\bar{\sigma}\big),\ldots,\operatorname{tr}\big(\bar{\sigma}^\top((V_x)_n)_{xx}\bar{\sigma}\big)\Big)^\top,\tag{36}$$
with $((V_x)_1,\ldots,(V_x)_n)^\top=V_x$. On the other hand, applying Itô's formula to $V_x(s,\bar{X}(s))$ and using (35), we get
$$dV_x(s,\bar{X}(s))=-\Big\{b_x(s)^\top V_x(s,\bar{X}(s))+\sigma_x(s)^\top V_{xx}(s,\bar{X}(s))\bar{\sigma}(s)-f_x(s)+f_y(s)V_x(s,\bar{X}(s))+f_z(s)\big[V_{xx}(s,\bar{X}(s))\bar{\sigma}(s)+V_x(s,\bar{X}(s))\sigma_x(s)\big]\Big\}\,ds+V_{xx}(s,\bar{X}(s))\bar{\sigma}(s)\,dW(s).\tag{37}$$
Note that $V_x(T,\bar{X}(T))=-\phi_x(\bar{X}(T))$. Applying Itô's formula once more, now to $V_x(s,\bar{X}(s))q(s)$, we have
$$d\big[V_x(s,\bar{X}(s))q(s)\big]=\Big\{-b_x(s)^\top V_x(s,\bar{X}(s))+f_x(s)-\sigma_x(s)^\top\big[V_{xx}(s,\bar{X}(s))\bar{\sigma}(s)+V_x(s,\bar{X}(s))f_z(s)\big]\Big\}q(s)\,ds+\big[V_{xx}(s,\bar{X}(s))\bar{\sigma}(s)+V_x(s,\bar{X}(s))f_z(s)\big]q(s)\,dW(s).\tag{38}$$
Hence, by the uniqueness of the solution to the adjoint equation (17), we obtain (24). The proof is complete.

4. Applications to Financial Portfolio Optimization

In this section, we consider an LQ recursive utility portfolio optimization problem in financial engineering. In this problem, the optimal portfolio in state feedback form is obtained by both the maximum principle and dynamic programming approaches, and the relations obtained in Theorem 5 are illustrated explicitly.

Suppose that investors have two kinds of securities in the market for possible investment choice:

(i) a risk-free security (e.g., a bond), whose price $S_0(t)$ at time $t$ is given by
$$dS_0(t)=\rho_t S_0(t)\,dt,\qquad S_0(0)>0,\tag{39}$$
where $\rho_t$ is a bounded deterministic function;

(ii) a risky security (e.g., a stock), whose price $S_1(t)$ at time $t$ is given by
$$dS_1(t)=\mu_t S_1(t)\,dt+\sigma_t S_1(t)\,dW(t),\qquad S_1(0)>0,\tag{40}$$
where $W(\cdot)$ is a one-dimensional Brownian motion and $\mu_t$, $\sigma_t\neq 0$ are bounded deterministic functions with $\mu_t>\rho_t$.

Let $u(t)$ denote the total market value of the investor's wealth invested in the risky security, which we call the portfolio. Given the initial wealth $X^u(0)=X_0\ge 0$, combining (39) and (40), we obtain the following wealth dynamics:
$$dX^u(t)=\big[\rho_t X^u(t)+(\mu_t-\rho_t)u(t)\big]\,dt+\sigma_t u(t)\,dW(t),\quad t\ge 0,\qquad X^u(0)=X_0.\tag{41}$$
We denote by $\mathcal{U}_{ad}$ the set of admissible portfolios valued in $U=\mathbb{R}$.

For any given initial wealth $X_0>0$, Kohlmann and Zhou [28] discussed a mean-variance portfolio optimization problem: the investor's objective is to find an admissible portfolio $u^*(\cdot)$ which minimizes the variance $\operatorname{Var}[X^u(T)]:=\mathbb{E}[(X^u(T)-\mathbb{E}[X^u(T)])^2]$ at some future time $T>0$ under the condition that $\mathbb{E}[X^u(T)]=A$ for some given $A\in\mathbb{R}$. By the Lagrange multiplier method, this is equivalent to studying the following problem:
$$\sup_{u(\cdot)\in\mathcal{U}_{ad}}\mathbb{E}\Big[-\frac{1}{2}\big(X^u(T)-a\big)^2\Big],\tag{42}$$
where some $a\in\mathbb{R}$ is given. Using the completion-of-squares technique, [28] obtained an optimal portfolio in state feedback form through a stochastic Riccati equation and a BSDE; the optimal value function was also obtained.

In this paper, we generalize the above mean-variance portfolio optimization problem to a recursive utility portfolio optimization problem. Recursive utility means that the utility at time $t$ is a function of the future utility (in this section, we do not consider consumption); in our framework, the recursive utility can be assumed to satisfy a controlled BSDE. We consider a small investor, endowed with initial wealth $X_0>0$, who chooses at each time $t$ his/her portfolio $u(t)$. The investor wants to choose an optimal portfolio $u^*(\cdot)\in\mathcal{U}_{ad}$ to maximize the recursive utility functional with generator
$$f(t,x,y,u)=\rho_t x+(\mu_t-\rho_t)u-\beta y,\tag{43}$$
where $\beta\ge 0$ is a constant.

Remark 6. The recursive utility functional defined through (43) stands for a standard additive utility of recursive type. It is a meaningful and nontrivial generalization of the classical standard additive utility and has many applications in mathematical economics and mathematical finance. For more details about utility functions, see Duffie and Epstein [2], Section 1.4 of El Karoui et al. [3], or Schroder and Skiadas [4].

More precisely, for any $u(\cdot)\in\mathcal{U}_{ad}$, the investor's utility functional is defined by
$$J(u(\cdot)):=Y^u(t)\big|_{t=0},\tag{44}$$
where
$$Y^u(t):=\mathbb{E}\Big[-\frac{1}{2}\big(X^u(T)-a\big)^2+\int_t^T\big[\rho_s X^u(s)+(\mu_s-\rho_s)u(s)-\beta Y^u(s)\big]\,ds\;\Big|\;\mathcal{F}_t\Big],\quad \forall t\in[0,T].\tag{45}$$
In fact, in our framework the wealth process $X^u(\cdot)$ and the recursive utility process $Y^u(\cdot)$ can be regarded as the solution to the following controlled FBSDE:
$$dX^u(t)=\big[\rho_t X^u(t)+(\mu_t-\rho_t)u(t)\big]\,dt+\sigma_t u(t)\,dW(t),\qquad -dY^u(t)=\big[\rho_t X^u(t)+(\mu_t-\rho_t)u(t)-\beta Y^u(t)\big]\,dt-Z^u(t)\,dW(t),\quad t\in[0,T],\qquad X^u(0)=X_0,\qquad Y^u(T)=-\frac{1}{2}\big(X^u(T)-a\big)^2,\tag{46}$$
and our portfolio optimization problem can be rewritten as (denoting $\bar{J}:=-J$)
$$\bar{J}(u^*(\cdot))=\inf_{u(\cdot)\in\mathcal{U}_{ad}}\bar{J}(u(\cdot)).\tag{47}$$
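Because the generator (43) is linear in $y$ and does not involve $z$, the recursive utility (45) reduces to a discounted expectation, $Y^u(0)=\mathbb{E}\big[\int_0^T e^{-\beta s}\big(\rho_s X^u(s)+(\mu_s-\rho_s)u(s)\big)\,ds-\frac{1}{2}e^{-\beta T}\big(X^u(T)-a\big)^2\big]$, which makes a direct Monte Carlo evaluation of the utility functional possible. The sketch below is our own illustration (the constant market coefficients and the constant-fraction feedback rule are assumptions, not taken from the paper); it simulates the wealth equation (41) with the Euler-Maruyama scheme:

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, M = 1.0, 200, 100_000          # horizon, time steps, sample paths
dt = T / N
rho, mu, sig, beta, a, X0 = 0.03, 0.08, 0.2, 0.1, 1.5, 1.0  # toy constants

def recursive_utility(feedback):
    """Estimate Y^u(0) in (45) for u(t) = feedback(t, X(t)).

    Valid because f in (43) is linear in y and z-free, so
    Y^u(0) = E[ int_0^T e^{-beta s} (rho X + (mu - rho) u) ds
                - 1/2 e^{-beta T} (X(T) - a)^2 ].
    """
    X = np.full(M, X0)
    running = np.zeros(M)
    for i in range(N):
        t = i * dt
        u = feedback(t, X)
        running += np.exp(-beta * t) * (rho * X + (mu - rho) * u) * dt
        dW = rng.normal(0.0, np.sqrt(dt), M)
        X = X + (rho * X + (mu - rho) * u) * dt + sig * u * dW
    return np.mean(running - 0.5 * np.exp(-beta * T) * (X - a) ** 2)

# A hypothetical feedback rule: invest a fixed fraction of wealth.
print(recursive_utility(lambda t, X: 0.5 * X))
```

For a general recursive generator depending on $y$ and $z$, one would instead need a genuine numerical BSDE scheme (e.g., backward regression on the simulated paths).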
Since we will also apply dynamic programming to this problem, we adopt the formulation of Section 2. Let $T>0$ be given. For any $(t,x)\in[0,T)\times\mathbb{R}$, consider the following controlled FBSDE:
$$dX^{t,x;u}(s)=\big[\rho_s X^{t,x;u}(s)+(\mu_s-\rho_s)u(s)\big]\,ds+\sigma_s u(s)\,dW(s),\qquad -dY^{t,x;u}(s)=\big[\rho_s X^{t,x;u}(s)+(\mu_s-\rho_s)u(s)-\beta Y^{t,x;u}(s)\big]\,ds-Z^{t,x;u}(s)\,dW(s),\quad s\in[t,T],\qquad X^{t,x;u}(t)=x,\qquad Y^{t,x;u}(T)=-\frac{1}{2}\big(X^{t,x;u}(T)-a\big)^2.\tag{48}$$
Our recursive utility portfolio optimization problem is to find an optimal portfolio $u^*(\cdot)\in\mathcal{U}_{ad}$ minimizing the recursive utility functional $J(t,x;u(\cdot)):=-Y^{t,x;u}(s)|_{s=t}$. We define the value function as
$$V(t,x):=J(t,x;u^*(\cdot))=\inf_{u(\cdot)\in\mathcal{U}_{ad}}J(t,x;u(\cdot)).\tag{49}$$
We can check that all the assumptions in Section 2 are satisfied, so both the dynamic programming (Lemma 2) and the maximum principle (Lemmas 3 and 4) approaches can be used to solve problem (49).

4.1. Maximum Principle Approach. Let $u^*(\cdot)$ be a candidate optimal portfolio and write $X^*(\cdot):=X^{t,x;u^*}(\cdot)$, $Y^*(\cdot):=Y^{t,x;u^*}(\cdot)$, $Z^*(\cdot):=Z^{t,x;u^*}(\cdot)$ for the corresponding solution to the controlled FBSDE (48). In this case, the Hamiltonian function (18) reduces to
$$H(s,x,y,z,u,p,q,k)=p\big[\rho_s x+(\mu_s-\rho_s)u\big]-q\big[\rho_s x+(\mu_s-\rho_s)u-\beta y\big]+k\sigma_s u,\tag{50}$$
and the adjoint equation (17) reads
$$-dp^*(s)=\rho_s\big[p^*(s)-q^*(s)\big]\,ds-k^*(s)\,dW(s),\qquad dq^*(s)=-\beta q^*(s)\,ds,\quad s\in[t,T],\qquad p^*(T)=\big[X^*(T)-a\big]q^*(T),\qquad q^*(t)=1.\tag{51}$$
Note that the adjoint process $q^*(\cdot)$ reduces to a deterministic function because the generator $f$ does not contain the process $z(\cdot)$. We immediately have
$$q^*(s)=e^{-\beta(s-t)},\quad s\in[t,T].\tag{52}$$
In view of the terminal condition in (51), we try a process $p^*(\cdot)$ of the form
$$p^*(s)=\big[\phi_s X^*(s)+\psi_s\big]e^{-\beta(s-t)},\quad s\in[t,T],\tag{53}$$
where $\phi_s,\psi_s$ are deterministic differentiable functions. Applying Itô's formula to (53), we have
$$dp^*(s)=e^{-\beta(s-t)}\Big[\big(\dot{\phi}_s+(\rho_s-\beta)\phi_s\big)X^*(s)+\phi_s(\mu_s-\rho_s)u^*(s)+\dot{\psi}_s-\beta\psi_s\Big]\,ds+e^{-\beta(s-t)}\phi_s\sigma_s u^*(s)\,dW(s).\tag{54}$$
Comparing (51) with (54), and noting (52) and (53), we get
$$\big(\dot{\phi}_s+(\rho_s-\beta)\phi_s\big)X^*(s)+\phi_s(\mu_s-\rho_s)u^*(s)+\dot{\psi}_s-\beta\psi_s=-\rho_s\big(\phi_s X^*(s)+\psi_s-1\big),\tag{55}$$
$$k^*(s)=\phi_s\sigma_s u^*(s)e^{-\beta(s-t)}.\tag{56}$$
Along the candidate quartet, the Hamiltonian function (50) is
$$H\big(s,X^*(s),Y^*(s),Z^*(s),u,p^*(s),q^*(s),k^*(s)\big)=p^*(s)\big[\rho_s X^*(s)+(\mu_s-\rho_s)u\big]-q^*(s)\big[\rho_s X^*(s)+(\mu_s-\rho_s)u-\beta Y^*(s)\big]+k^*(s)\sigma_s u.\tag{57}$$
Since this expression is linear in $u$, the maximum condition (19) yields
$$(\mu_s-\rho_s)\big(p^*(s)-q^*(s)\big)+\sigma_s k^*(s)=0.\tag{58}$$
Substituting (52), (53), and (56) into (58), we get
$$u^*(s)=\frac{(\rho_s-\mu_s)\big(\phi_s X^*(s)+\psi_s-1\big)}{\phi_s\sigma_s^2}.\tag{59}$$
On the other hand, (55) gives
$$u^*(s)=\frac{\big(\dot{\phi}_s+2\rho_s\phi_s-\beta\phi_s\big)X^*(s)+\dot{\psi}_s+(\rho_s-\beta)\psi_s-\rho_s}{\phi_s(\rho_s-\mu_s)}.\tag{60}$$
Combining (59) and (60), and noting the terminal condition in (51), we get
$$\dot{\phi}_s=\Big[\frac{(\rho_s-\mu_s)^2}{\sigma_s^2}-2\rho_s+\beta\Big]\phi_s,\qquad \phi_T=1,\tag{61}$$
$$\dot{\psi}_s=\Big[\frac{(\rho_s-\mu_s)^2}{\sigma_s^2}-\rho_s+\beta\Big]\psi_s-\frac{(\rho_s-\mu_s)^2}{\sigma_s^2}+\rho_s,\qquad \psi_T=-a.\tag{62}$$
The solutions to these equations are
$$\phi_s=e^{-\int_s^T[(\rho_r-\mu_r)^2/\sigma_r^2-2\rho_r+\beta]\,dr},\tag{63}$$
$$\psi_s=e^{-\int_s^T[(\rho_r-\mu_r)^2/\sigma_r^2-\rho_r+\beta]\,dr}\Big\{\int_s^T\Big[\frac{(\rho_r-\mu_r)^2}{\sigma_r^2}-\rho_r\Big]e^{\int_r^T[(\rho_\lambda-\mu_\lambda)^2/\sigma_\lambda^2-\rho_\lambda+\beta]\,d\lambda}\,dr-a\Big\}.\tag{64}$$
With this choice of $\phi_s$ and $\psi_s$, the processes
$$q^*(s)=e^{-\beta(s-t)},\qquad p^*(s)=\big[\phi_s X^*(s)+\psi_s\big]e^{-\beta(s-t)},\qquad k^*(s)=\phi_s\sigma_s u^*(s)e^{-\beta(s-t)}\tag{65}$$
satisfy the adjoint equation (51) with $u^*(s)$ given by (59). With this choice of $u^*(s)$, the maximum condition (19) of Lemma 3 holds. Moreover, we can check that all the conditions of Lemma 4 hold, so $u^*(s)$ given by (59) is indeed the optimal control. Finally, letting $t=0$, we solve the initial problem (47) and obtain the explicit optimal portfolio in state feedback form.

Theorem 7. The optimal solution $u^*(\cdot)$ of the recursive utility portfolio optimization problem (47), when the wealth dynamics obeys (41), is given in state feedback form by
$$u^*(s,X^*)=\frac{(\rho_s-\mu_s)\big(\phi_s X^*+\psi_s-1\big)}{\phi_s\sigma_s^2}\tag{66}$$
for $s\in[0,T]$, where $\phi_s$ and $\psi_s$ are given by (63) and (64), respectively.
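The coefficients (63) and (64) are explicit up to one-dimensional integrals, so the optimal feedback (66) is easy to tabulate numerically. A minimal sketch follows (our own illustration; the time-varying coefficient curves are hypothetical), using trapezoidal quadrature:

```python
import numpy as np

T = 1.0
rho   = lambda t: 0.03 + 0.0 * t        # hypothetical coefficient curves
mu    = lambda t: 0.08 + 0.01 * t
sigma = lambda t: 0.20 + 0.0 * t
beta, a = 0.1, 1.5

grid = np.linspace(0.0, T, 2001)

def integral(f, lo, hi):
    """Trapezoidal quadrature of f over [lo, hi] on the common grid."""
    mask = (grid >= lo) & (grid <= hi)
    return np.trapz(f(grid[mask]), grid[mask])

theta2 = lambda r: (rho(r) - mu(r)) ** 2 / sigma(r) ** 2   # squared risk-premium ratio

def phi(s):                              # eq. (63)
    return np.exp(-integral(lambda r: theta2(r) - 2 * rho(r) + beta, s, T))

def psi(s):                              # eq. (64)
    outer = np.exp(-integral(lambda r: theta2(r) - rho(r) + beta, s, T))
    rs = grid[(grid >= s) & (grid <= T)]
    inner = np.trapz(
        [(theta2(r) - rho(r))
         * np.exp(integral(lambda l: theta2(l) - rho(l) + beta, r, T)) for r in rs],
        rs)
    return outer * (inner - a)

def u_star(s, x):                        # feedback portfolio, eq. (66)
    return (rho(s) - mu(s)) * (phi(s) * x + psi(s) - 1.0) / (phi(s) * sigma(s) ** 2)

print(phi(0.0), psi(0.0), u_star(0.0, 1.0))
```

For constant coefficients the integrals are elementary and the quadrature can be bypassed entirely; see the consistency check after Section 4.3 below.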
4.2. Dynamic Programming Approach. In this case, the value function $V$ should satisfy the generalized HJB equation
$$-V_t(t,x)+\sup_{u\in U}G\big(t,x,-V(t,x),-V_x(t,x),-V_{xx}(t,x),u\big)=0,\quad (t,x)\in[0,T]\times\mathbb{R},\qquad V(T,x)=\frac{1}{2}(x-a)^2,\quad x\in\mathbb{R},\tag{67}$$
where the generalized Hamiltonian function (8) is
$$G\big(t,x,-V(t,x),-V_x(t,x),-V_{xx}(t,x),u\big)=-\frac{1}{2}V_{xx}(t,x)\sigma_t^2u^2-V_x(t,x)\big[\rho_t x+(\mu_t-\rho_t)u\big]+\rho_t x+(\mu_t-\rho_t)u+\beta V(t,x).\tag{68}$$
We conjecture that $V(t,x)$ is quadratic in $x$, namely,
$$V(t,x)=\frac{1}{2}P_t x^2+\varphi_t x+\lambda_t,\tag{69}$$
for some deterministic differentiable functions $P_t,\varphi_t,\lambda_t$ with
$$P_T=1,\qquad \varphi_T=-a,\qquad \lambda_T=\frac{1}{2}a^2.\tag{70}$$
Substituting (69) into (68) and completing squares, we get
$$G\big(t,x,-V(t,x),-V_x(t,x),-V_{xx}(t,x),u\big)=-\frac{1}{2}P_t\sigma_t^2\Big[u-\frac{(\rho_t-\mu_t)(P_tx+\varphi_t-1)}{P_t\sigma_t^2}\Big]^2+\Big[\frac{(\mu_t-\rho_t)^2P_t}{2\sigma_t^2}+\frac{1}{2}\beta P_t-\rho_tP_t\Big]x^2+\Big[\frac{(\rho_t-\mu_t)^2(\varphi_t-1)}{\sigma_t^2}+\beta\varphi_t-\rho_t\varphi_t+\rho_t\Big]x+\frac{(\rho_t-\mu_t)^2(\varphi_t-1)^2}{2P_t\sigma_t^2}+\beta\lambda_t$$
$$\le\Big[\frac{(\mu_t-\rho_t)^2P_t}{2\sigma_t^2}+\frac{1}{2}\beta P_t-\rho_tP_t\Big]x^2+\Big[\frac{(\rho_t-\mu_t)^2(\varphi_t-1)}{\sigma_t^2}+\beta\varphi_t-\rho_t\varphi_t+\rho_t\Big]x+\frac{(\rho_t-\mu_t)^2(\varphi_t-1)^2}{2P_t\sigma_t^2}+\beta\lambda_t,\tag{71}$$
provided that $P_t>0$ for all $t\in[0,T]$, which we will verify below. We see that the optimal state feedback portfolio is given by
$$u^*(t,X^*)=\frac{(\rho_t-\mu_t)(P_tX^*+\varphi_t-1)}{P_t\sigma_t^2}.\tag{72}$$
In addition, the generalized HJB equation (67) now reads
$$\frac{1}{2}\dot{P}_tx^2+\dot{\varphi}_tx+\dot{\lambda}_t=\Big[\frac{(\mu_t-\rho_t)^2P_t}{2\sigma_t^2}+\frac{1}{2}\beta P_t-\rho_tP_t\Big]x^2+\Big[\frac{(\rho_t-\mu_t)^2(\varphi_t-1)}{\sigma_t^2}+\beta\varphi_t-\rho_t\varphi_t+\rho_t\Big]x+\frac{(\rho_t-\mu_t)^2(\varphi_t-1)^2}{2P_t\sigma_t^2}+\beta\lambda_t.\tag{73}$$
Noting (70) and comparing the quadratic and linear terms in $x$, we recover (61) and (62), respectively. That is to say, $P_t$ coincides with $\phi_t$ and $\varphi_t$ coincides with $\psi_t$. Then, by (63), we have $\phi_t>0$ for all $t\in[0,T]$, as claimed above. Comparing the constant terms, we also have
$$\dot{\lambda}_t=\frac{(\rho_t-\mu_t)^2(\psi_t-1)^2}{2\phi_t\sigma_t^2}+\beta\lambda_t,\qquad \lambda_T=\frac{1}{2}a^2.\tag{74}$$
The solution to (74) is
$$\lambda_t=e^{-\beta(T-t)}\Big\{\frac{1}{2}a^2-\int_t^T\frac{(\rho_r-\mu_r)^2(\psi_r-1)^2}{2\phi_r\sigma_r^2}\,e^{\beta(T-r)}\,dr\Big\}.\tag{75}$$
Then the value function is
$$V(t,x)=\frac{1}{2}\phi_tx^2+\psi_tx+\lambda_t,\tag{76}$$
where $\phi_t,\psi_t,\lambda_t$ are determined by (61), (62), and (75), respectively. By Lemma 2, we have proved the following.

Theorem 8. The optimal solution $u^*(\cdot)$ of the recursive utility portfolio optimization problem (47), when the wealth dynamics obeys (41), is given in state feedback form by
$$u^*(t,X^*)=\frac{(\rho_t-\mu_t)(\phi_tX^*+\psi_t-1)}{\phi_t\sigma_t^2}\tag{77}$$
for $t\in[0,T]$, and the value function is given by (76), where $\phi_t,\psi_t,\lambda_t$ are determined by (61), (62), and (75), respectively.

4.3. Relationship. We can now explicitly illustrate the relations of Theorem 5. Indeed, relation (23) is obvious from (67), and (65) is exactly the relation given by (24) and (25): here $V_x(t,x)=\phi_tx+\psi_t$ and $V_{xx}(t,x)=\phi_t$, so $p^*(s)=V_x(s,X^*(s))q^*(s)$ and, since $f_z\equiv 0$, $k^*(s)=V_{xx}(s,X^*(s))\sigma_su^*(s)q^*(s)$, while (25) reduces to $q^*(s)=e^{-\beta(s-t)}$.
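The identity $V(t,x)=J(t,x;u^*(\cdot))$ underlying Theorem 8 can also be checked numerically, tying the two approaches together. The sketch below is our own consistency check under assumed constant market coefficients (in that case (61) and (62) have the closed forms used here): it evaluates $V(0,X_0)$ from (76) and, independently, estimates $J(0,X_0;u^*(\cdot))$ by Monte Carlo simulation of the wealth equation under the feedback (77), again using the discounted representation of the linear-generator BSDE. If the reconstruction above is consistent, the two numbers agree up to discretization and Monte Carlo error.

```python
import numpy as np

rng = np.random.default_rng(1)
T, rho, mu, sig, beta, a, X0 = 1.0, 0.03, 0.08, 0.2, 0.1, 1.5, 1.0
th2 = (rho - mu) ** 2 / sig ** 2            # (rho - mu)^2 / sigma^2
A, B = th2 - 2 * rho + beta, th2 - rho + beta

phi = lambda s: np.exp(-A * (T - s))                     # eq. (61)/(63)
C = (th2 - rho) / B                                      # fixed point of (62)
psi = lambda s: C + (-a - C) * np.exp(-B * (T - s))      # eq. (62), constant case

def lam(t, n=4000):                                      # eq. (75) by quadrature
    r = np.linspace(t, T, n)
    c = th2 * (psi(r) - 1.0) ** 2 / (2.0 * phi(r))
    return np.exp(-beta * (T - t)) * (0.5 * a ** 2
                                      - np.trapz(c * np.exp(beta * (T - r)), r))

V0 = 0.5 * phi(0.0) * X0 ** 2 + psi(0.0) * X0 + lam(0.0)   # eq. (76)

# Monte Carlo estimate of J(0, X0; u*) = -Y(0) under the feedback (77),
# via the discounted representation of the linear-generator BSDE.
N, M = 400, 100_000
dt = T / N
X = np.full(M, X0)
run = np.zeros(M)
for i in range(N):
    t = i * dt
    u = (rho - mu) * (phi(t) * X + psi(t) - 1.0) / (phi(t) * sig ** 2)
    run += np.exp(-beta * t) * (rho * X + (mu - rho) * u) * dt
    X += (rho * X + (mu - rho) * u) * dt + sig * u * rng.normal(0.0, np.sqrt(dt), M)
J_mc = np.mean(0.5 * np.exp(-beta * T) * (X - a) ** 2 - run)

print(V0, J_mc)   # should agree up to discretization / Monte Carlo error
```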
5. Concluding Remarks

In this paper, we have studied the relationship between the maximum principle and dynamic programming for stochastic recursive optimal control problems. Under certain differentiability conditions, we give relations among the adjoint processes, the generalized Hamiltonian function, and the value function. A linear quadratic recursive utility portfolio optimization problem in the financial market is discussed as an explicit illustration of our result.

An interesting and challenging problem remains open: for the stochastic recursive optimal control problem, what is the relationship between the maximum principle and dynamic programming without the differentiability conditions on the value function? This problem may be solved in the framework of nonsmooth analysis, and viscosity solution theory is certainly a suitable tool (see, e.g., Yong and Zhou [9]). A new result on the stochastic verification theorem for forward-backward controlled systems using viscosity solutions has been published very recently by Zhang [29]. However, at this moment, we do not have publishable results on the relationship within the framework of viscosity solutions. We hope to address this problem in future work.

Acknowledgments

The authors would like to thank the anonymous referees for many constructive comments that led to an improved version of the paper. The authors also thank the Academic Editor for his efficient handling of this paper. Finally, many thanks are devoted to Dr. Qingxin Meng for helpful discussions during the revision process. This work is supported by the China Postdoctoral Science Foundation Funded Project (no. 20100481278), the Postdoctoral Innovation Foundation Funded Project of Shandong Province (no. 201002026), the National Natural Science Foundation of China (nos. 11201264 and 11101242), the Natural Science Foundation of Shandong Province (nos. ZR2011AQ012 and ZR2010AQ004), and the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry, China.
References

[1] E. Pardoux and S. G. Peng, "Adapted solution of a backward stochastic differential equation," Systems & Control Letters, vol. 14, no. 1, pp. 55–61, 1990.
[2] D. Duffie and L. G. Epstein, "Stochastic differential utility," Econometrica, vol. 60, no. 2, pp. 353–394, 1992.
[3] N. El Karoui, S. G. Peng, and M. C. Quenez, "Backward stochastic differential equations in finance," Mathematical Finance, vol. 7, no. 1, pp. 1–71, 1997.
[4] M. Schroder and C. Skiadas, "Optimal consumption and portfolio selection with stochastic differential utility," Journal of Economic Theory, vol. 89, no. 1, pp. 68–126, 1999.
[5] N. El Karoui, S. G. Peng, and M. C. Quenez, "A dynamic maximum principle for the optimization of recursive utilities under constraints," The Annals of Applied Probability, vol. 11, no. 3, pp. 664–693, 2001.
[6] S. L. Ji and X. Y. Zhou, "A maximum principle for stochastic optimal control with terminal state constraints, and its applications," Communications in Information and Systems, vol. 6, no. 4, pp. 321–337, 2006.
[7] N. Williams, "On dynamic principal-agent problems in continuous time," Working paper, http://www.ssc.wisc.edu/~nwilliam/dynamic-pa1.pdf, 2008.
[8] G. C. Wang and Z. Wu, "The maximum principles for stochastic recursive optimal control problems under partial information," IEEE Transactions on Automatic Control, vol. 54, no. 6, pp. 1230–1242, 2009.
[9] J. M. Yong and X. Y. Zhou, Stochastic Controls: Hamiltonian Systems and HJB Equations, Springer-Verlag, New York, NY, USA, 1999.
[10] S. G. Peng, "Backward stochastic differential equations and applications to optimal control," Applied Mathematics and Optimization, vol. 27, no. 2, pp. 125–144, 1993.
[11] W. S. Xu, "Stochastic maximum principle for optimal control problem of forward and backward system," Journal of the Australian Mathematical Society B, vol. 37, no. 2, pp. 172–185, 1995.
[12] Z. Wu, "A general maximum principle for optimal control problems of forward-backward stochastic control systems," In press.
[13] J. T. Shi and Z. Wu, "Maximum principle for forward-backward stochastic control system with random jumps and applications to finance," Journal of Systems Science & Complexity, vol. 23, no. 2, pp. 219–231, 2010.
[14] S. G. Peng, "A generalized dynamic programming principle and Hamilton-Jacobi-Bellman equation," Stochastics & Stochastics Reports, vol. 38, no. 2, pp. 119–134, 1992.
[15] S. G. Peng, "Backward stochastic differential equations: stochastic optimization theory and viscosity solutions of HJB equations," in Topics on Stochastic Analysis, J. Yan, S. Peng, S. Fang, and L. Wu, Eds., pp. 85–138, Science Press, Beijing, China, 1997.
[16] Z. Wu and Z. Y. Yu, "Dynamic programming principle for one kind of stochastic recursive optimal control problem and Hamilton-Jacobi-Bellman equation," SIAM Journal on Control and Optimization, vol. 47, no. 5, pp. 2616–2641, 2008.
[17] J. Li and S. G. Peng, "Stochastic optimization theory of backward stochastic differential equations with jumps and viscosity solutions of Hamilton-Jacobi-Bellman equations," Nonlinear Analysis: Theory, Methods & Applications, vol. 70, no. 4, pp. 1776–1796, 2009.
[18] J. M. Bismut, "An introductory approach to duality in optimal stochastic control," SIAM Review, vol. 20, no. 1, pp. 62–78, 1978.
[19] A. Bensoussan, Lectures on Stochastic Control, vol. 972 of Lecture Notes in Mathematics, Springer-Verlag, Berlin, Germany, 1982.
[20] X. Y. Zhou, "Maximum principle, dynamic programming, and their connection in deterministic control," Journal of Optimization Theory and Applications, vol. 65, no. 2, pp. 363–373, 1990.
[21] X. Y. Zhou, "The connection between the maximum principle and dynamic programming in stochastic control," Stochastics & Stochastics Reports, vol. 31, no. 1–4, pp. 1–13, 1990.
[22] X. Y. Zhou, "A unified treatment of maximum principle and dynamic programming in stochastic controls," Stochastics & Stochastics Reports, vol. 36, no. 3-4, pp. 137–161, 1991.
[23] N. C. Framstad, B. Øksendal, and A. Sulem, "Sufficient stochastic maximum principle for the optimal control of jump diffusions and applications to finance," Journal of Optimization Theory and Applications, vol. 121, no. 1, pp. 77–98, 2004.
[24] N. C. Framstad, B. Øksendal, and A. Sulem, "Errata corrige: sufficient stochastic maximum principle for the optimal control of jump diffusions and applications to finance," Journal of Optimization Theory and Applications, vol. 124, no. 2, pp. 511–512, 2005.
[25] J. T. Shi and Z. Wu, "Relationship between MP and DPP for the stochastic optimal control problem of jump diffusions," Applied Mathematics and Optimization, vol. 63, no. 2, pp. 151–189, 2011.
[26] K. Bahlali, F. Chighoub, and B. Mezerdi, "On the relationship between the stochastic maximum principle and dynamic programming in singular stochastic control," Stochastics, vol. 84, no. 2-3, pp. 233–249, 2012.
[27] X. Zhang, R. J. Elliott, and T. K. Siu, "A stochastic maximum principle for a Markov regime-switching jump-diffusion model and its application to finance," SIAM Journal on Control and Optimization, vol. 50, no. 2, pp. 964–990, 2012.
[28] M. Kohlmann and X. Y. Zhou, "Relationship between backward stochastic differential equations and stochastic controls: a linear-quadratic approach," SIAM Journal on Control and Optimization, vol. 38, no. 5, pp. 1392–1407, 2000.
[29] L. Q. Zhang, "Stochastic verification theorem of forward-backward controlled systems for viscosity solutions," Systems & Control Letters, vol. 61, no. 5, pp. 649–654, 2012.