European Journal of Control (2006) 12:455–463 © 2006 EUCA
A Dual Dynamic Programming for Multidimensional Parabolic Optimal Control Problems
E. Galewska¹ and A. Nowakowski²
¹Faculty of Mathematics, University of Lodz, Banacha 22, 90-238 Lodz, Poland; ²Faculty of Mathematics, University of Lodz, Banacha 22, 90-238 Lodz, Poland
In this paper, optimal control problems governed by parabolic equations are considered. We apply a new dual dynamic programming approach to derive sufficient optimality conditions for such problems. The idea is to move all the notions from a state space to a dual space and to obtain a new verification theorem providing the conditions which should be satisfied by a solution of the dual partial differential equation of dynamic programming. We also give sufficient conditions for the existence of an optimal dual feedback control and an approximation of the problem considered, which seems to be very useful from the practical point of view.
Keywords: Dual Dynamic Programming; Dual Feedback Control; Parabolic Equation; Optimal Control Problem; Sufficient Optimality Conditions; Verification Theorem
1. Introduction

Consider the following optimal control problem (P):

$$\text{minimize } J(x,u) = \int_{[0,T]\times\Omega} L(t,z,x(t,z),u(t,z))\,dt\,dz + \int_{\Omega} l(x(T,z))\,dz$$
E-mail: [email protected]. Correspondence to: A. Nowakowski; e-mail: annowako@math.uni.lodz.pl
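subject to

$$x_t(t,z) + \Delta_z x(t,z) = f(t,z,x(t,z),u(t,z)) \quad \text{a.e. on } [0,T]\times\Omega, \tag{1}$$

$$x(0,z) = \varphi(0,z) \quad \text{on } \Omega, \tag{2}$$

$$x(t,z) = \psi(t,z) \quad \text{on } [0,T]\times\partial\Omega, \tag{3}$$

$$u(t,z) \in U \quad \text{a.e. on } [0,T]\times\Omega, \tag{4}$$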
where $\Omega$ is a given bounded subset of $\mathbb{R}^n$ with Lipschitz boundary and $U$ is a given nonempty set in $\mathbb{R}^m$; $L, f : [0,T]\times\Omega\times\mathbb{R}\times\mathbb{R}^m \to \mathbb{R}$, $l : \mathbb{R} \to \mathbb{R}$ and $\varphi : \mathbb{R}^{n+1} \to \mathbb{R}$ are given functions; $x : [0,T]\times\Omega \to \mathbb{R}$, $x \in W^{2,2}(\Omega)$, and $u : [0,T]\times\Omega \to \mathbb{R}^m$ is a Lebesgue measurable function. We assume that for each $s$ in $\mathbb{R}$, the functions $(t,z,u) \to L(t,z,s,u)$, $(t,z,u) \to f(t,z,s,u)$ are $(\mathcal{L}\times\mathcal{B})$-measurable, where $\mathcal{L}\times\mathcal{B}$ is the $\sigma$-algebra of subsets of $[0,T]\times\Omega\times\mathbb{R}^m$ generated by products of Lebesgue measurable subsets of $[0,T]\times\Omega$ and Borel subsets of $\mathbb{R}^m$, and for each $(t,z,u) \in [0,T]\times\Omega\times\mathbb{R}^m$, the functions $s \to L(t,z,s,u)$, $s \to f(t,z,s,u)$ are continuous. We call a pair $(x(t,z),u(t,z))$ admissible if it satisfies (1)–(4) and $L(t,z,x(t,z),u(t,z))$ is summable; the corresponding trajectory $x(t,z)$ is then said to be admissible. The aim of the paper is to present sufficient optimality conditions for problem (P) in terms of
Received 18 January 2005; Accepted 12 February 2006 Recommended by J. Tsinias, D. Normand-Cyrot
dynamic programming conditions directly. In the literature, there is no work in which problem (P) is studied directly by a dynamic programming method. The only results known to the authors (see e.g. [1]–[13] and references therein) treat problem (P) as an abstract problem with an abstract evolution equation (1) and then derive from abstract Hamilton–Jacobi equations suitable sufficient optimality conditions for problem (P). We propose an almost direct method to study (P) by a dual dynamic programming approach, following the method described in [14] for the one-dimensional case and in [9] for the multidimensional case. We move all notions of dynamic programming to a dual space (the space of multipliers) and then develop a dual dynamic approach together with a dual Hamilton–Jacobi equation and, as a consequence, sufficient optimality conditions for (P). We also define an optimal dual feedback control, in terms of which we formulate sufficient conditions for optimality. Such an approach allows us to weaken significantly the assumptions on the data. An approximate minimum in terms of dual dynamic programming is also investigated.
2. A Dual Dynamic Programming

In this section we describe the intuition behind a dual dynamic approach to optimal control problems governed by parabolic equations. Let us recall what dynamic programming means. We have an initial condition $(t_0, x_0(t_0,z))$, $z \in \Omega$, for which we assume that we have an optimal solution $(x,u)$. Then by necessary optimality conditions there exists a conjugate function $p(t,z) = (y^0, y(t,z))$ on $[0,T]\times\Omega$ which is a solution to the corresponding adjoint system (see e.g. [5], [11]). The element $p = (y^0, y)$ plays the role of the multipliers in the classical Lagrange problem with constraints (with the multiplier $y^0$ standing by the functional and $y$ corresponding to the constraint). If we perturb $(t_0, x_0)$ then, assuming that the optimal solution exists for each perturbed problem, we also have a conjugate function corresponding to it. Therefore, by perturbing our initial conditions we obtain two sets of functions: optimal trajectories $x$ and the conjugate functions $p$ corresponding to them. The graphs of optimal trajectories cover some set in the state space $(t,z,x)$, say a set $X$ (in the classical calculus of variations it is called the field of extremals), and the graphs of conjugate functions cover some set in the conjugate space $(t,z,p)$, say a set $P$ (in classical mechanics it is called the space of momenta). In the classical dynamic programming
approach we explore the state space $(t,z,x)$, i.e. the set $X$ (see e.g. [1]), but in the dual dynamic programming approach we explore the conjugate space (the dual space) $(t,z,p)$, i.e. the set $P$ (see [14] for the one-dimensional case and [9] for the multidimensional case). It is worth noting that although in elliptic optimal control problems we have no possibility to perturb the problem, dual dynamic programming can still be applied (see [10]).

It is natural that if we want to explore the dual space $(t,z,p)$ then we need a mapping between the set $P$ and the set $X$: $P \ni (t,z,p) \to (t,z,\tilde x(t,z,p)) \in X$, in order to be able to formulate, at the end of some considerations in $P$, conditions for optimality in our original problem as well as for an optimal solution $x$. Of course, such a mapping should have the property that for each admissible trajectory $x(t,z)$ lying in $X$ we must have a function $p(t,z)$ lying in $P$ such that $x(t,z) = \tilde x(t,z,p(t,z))$. Hence, we conduct all our investigations in the dual space $(t,z,p)$, i.e. most of our notions concerning dynamic programming are defined in the dual space, including the dynamic programming equation, which now becomes a dual dynamic programming equation.

Therefore let $P \subset \mathbb{R}^{n+3}$ be a set of the variables $(t,z,p) = (t,z,y^0,y)$, $(t,z) \in [0,T]\times\Omega$, $y^0 \le 0$, $y \in \mathbb{R}$, and let $c = (c^0, c) \in \mathbb{R}^2$ be fixed. The constant $c$ is introduced for practical purposes only, i.e. in order to simplify the calculation of some relations stated below for concrete problems (see Section 5, An Example). We adopt the convention that $(t,z,cp) = (t,z,c^0y^0,cy)$ for $(t,z,p) \in P$. Let $\tilde x : P \to \mathbb{R}$ be such a function that for each admissible trajectory $x(t,z)$ there exists a function $p(t,z) = (y^0, y(t,z))$, $p \in W^{2,2}([0,T]\times\Omega)$, $(t,z,p(t,z)) \in P$, such that

$$x(t,z) = \tilde x(t,z,p(t,z)) \quad \text{for } (t,z) \in [0,T]\times\Omega. \tag{5}$$

Now, let us introduce an auxiliary $C^2$ function $V(t,z,p) : P \to \mathbb{R}$ such that for $(t,z,p) \in P$, $(t,z,cp) \in P$ the following condition is satisfied:

$$V(t,z,cp) = c^0y^0 V_{y^0}(t,z,cp) + cy V_y(t,z,cp) = cp V_p(t,z,cp). \tag{6}$$

Condition (6) is the generalization of a transversality condition known in classical mechanics as the orthogonality of the momentum to the wave front. Similarly as in classical dynamic programming, define at $(t,p(\cdot))$, where $p(z) = (y^0, y(z))$ is any function $p \in W^{2,2}(\Omega)$, $(t,z,p(z)) \in P$, a dual value function $S_D$ by the formula

$$S_D(t,p(\cdot)) := \inf\left\{ -c^0y^0 \int_{[t,T]\times\Omega} L(\tau,z,x(\tau,z),u(\tau,z))\,d\tau\,dz - c^0y^0 \int_{\Omega} l(x(T,z))\,dz \right\}, \tag{7}$$

where the infimum is taken over all admissible pairs $x(\tau,\cdot)$, $u(\tau,\cdot)$, $\tau \in [t,T]$, such that

$$x(t,z) = \tilde x(t,z,p(z)) \quad \text{for } z \in \Omega, \tag{8}$$

$$\tilde x(t,z,p(z)) = \psi(t,z) \quad \text{for } z \in \partial\Omega, \tag{9}$$

i.e. whose trajectories start at $(t, \tilde x(t,\cdot,p(\cdot)))$. Then, integrating (6) over $\Omega$, for any function $p(z) = (y^0,y(z))$, $p \in W^{2,2}(\Omega)$, $(t,z,p(z)) \in P$, $(t,z,cp(z)) \in P$, such that $x(\cdot,\cdot)$ satisfying $x(t,z) = \tilde x(t,z,p(z))$ for $z \in \Omega$ is an admissible trajectory, we also have the equality

$$\int_{\partial\Omega} \nabla_z V(t,z,cp(z))\,\nu(z)\,dz = -c \int_{\partial\Omega} y(z)\,\nabla \tilde x(t,z,p(z))\,\nu(z)\,dz - S_D(t,p(\cdot)) \tag{10}$$

with

$$c^0y^0 \int_{\partial\Omega} \nabla_z V_{y^0}(t,z,cp(z))\,\nu(z)\,dz = -S_D(t,p(\cdot)) \tag{11}$$

and assuming that $\tilde x(t,z,p(z)) = -V_y(t,z,cp(z))$ for $(t,z) \in [0,T]\times\Omega$, $(t,z,cp(z)) \in P$. Here $\nu(\cdot)$ is the exterior unit normal vector to $\partial\Omega$ and $\nabla \tilde x(t,z,p(z))$ means the gradient of the function $z \to \tilde x(t,z,p(z))$. Denote by the symbol $\Delta_z h$ the sum of the second partial derivatives of the function $h : P \to \mathbb{R}$ with respect to the variables $z_i$, $i = 1,\dots,n$, i.e.

$$\Delta_z h(t,z,p) := \sum_{i=1}^{n} \frac{\partial^2}{\partial z_i^2} h(t,z,p). \tag{12}$$

It turns out that the function $V(t,z,p)$ defined by (10), (11) satisfies the second order partial differential equation

$$V_t(t,z,cp) + \Delta_z V(t,z,cp) + H(t,z,-V_y(t,z,cp),p) = 0, \tag{13}$$

where

$$H(t,z,v,p) = c^0y^0 L(t,z,v,u(t,z,p)) + cy f(t,z,v,u(t,z,p)) \tag{14}$$

and $u(t,z,p)$ is an optimal dual feedback control, and the dual second order partial differential equation of multidimensional dynamic programming (DSPDEMDP)

$$\max\left\{ V_t(t,z,cp) + \Delta_z V(t,z,cp) + c^0y^0 L(t,z,-V_y(t,z,cp),u) + cy f(t,z,-V_y(t,z,cp),u) : u \in U \right\} = 0. \tag{15}$$

Let us note that the function $\tilde x(t,z,p)$, introduced somewhat artificially at the beginning of this section, is in fact defined by $-V_y(t,z,p)$, where $V$ is a solution to (15); i.e. knowing the set $P$ and $V_y$ we are able to describe the set $X$ in which our original problem has to be considered. The assumption that the auxiliary function $V(t,z,p)$ is of class $C^2$ is important in this paper and we cannot weaken it. However, we would like to stress that it is only an auxiliary assumption and it is not imposed on the value function, which in our case need not even be continuous.
Remark. We would like to stress that the duality sketched in this section is not a duality in the sense of convex optimization. It is a new nonconvex duality, first described in [14] and further developed in [9], for which we do not have the usual relation between $\sup(D)$ and $\inf(P)$ ($D$ denotes a dual problem, $P$ a primal one). Instead we have other relations, namely (6) and (10), which are generalizations of transversality conditions from classical mechanics. If we find a solution to (13), then checking relation (6) for concrete problems is not very difficult.
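To see why checking (6) tends to be routine, it may help to note that, writing $q = (q^0, q^1) = (c^0y^0, cy)$, condition (6) is Euler's identity for a function that is positively homogeneous of degree one in $q$. The sketch below verifies this for a hypothetical function; the coefficient $a(t,z)$ and the square-root form are assumptions chosen purely for illustration (with $q^0 < 0$, $q^1 < 0$) and are not the $V$ of this paper.

```latex
% Hypothetical example: V(t,z,q) = a(t,z) (-q^0)^{1/2} (-q^1)^{1/2},  q^0 < 0,  q^1 < 0
\begin{align*}
V_{q^0} &= -\tfrac12\, a\, (-q^0)^{-1/2} (-q^1)^{1/2}, &
V_{q^1} &= -\tfrac12\, a\, (-q^0)^{1/2} (-q^1)^{-1/2}, \\
q^0 V_{q^0} &= \tfrac12\, a\, (-q^0)^{1/2} (-q^1)^{1/2}, &
q^1 V_{q^1} &= \tfrac12\, a\, (-q^0)^{1/2} (-q^1)^{1/2}, \\
q^0 V_{q^0} + q^1 V_{q^1} &= a\, (-q^0)^{1/2} (-q^1)^{1/2} = V. &&
\end{align*}
```

Any candidate $V$ that fails this homogeneity test can be discarded before the partial differential equation is examined at all.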
3. A Verification Theorem

The most important conclusion of dynamic programming is a verification theorem. We present it in a dual form, in accordance with the dual dynamic programming approach described in the previous section.

Theorem 1. Let $(\bar x(t,z), \bar u(t,z))$, $(t,z) \in [0,T]\times\Omega$, be an admissible pair. Assume that there exist $c = (c^0,c) \in \mathbb{R}^2$ and a $C^2$ solution $V(t,z,p)$ of DSPDEMDP (15) on $P$ such that (6) holds. Let further $\bar p(t,z) = (\bar y^0, \bar y(t,z))$, $\bar p \in W^{2,2}([0,T]\times\Omega)$, $(t,z,c\bar p(t,z)) \in P$, be such a function that $\bar x(t,z) = -V_y(t,z,c\bar p(t,z))$ for $(t,z) \in [0,T]\times\Omega$. Suppose that $V(t,z,p)$ satisfies the boundary condition for $(T,z,cp) \in P$,

$$c^0y^0 \int_{\Omega} V_{y^0}(T,z,cp)\,dz = c^0y^0 \int_{\Omega} l(-V_y(T,z,cp))\,dz. \tag{16}$$

Moreover, assume that for almost all $(t,z) \in [0,T]\times\Omega$,

$$V_t(t,z,c\bar p(t,z)) + \Delta_z V(t,z,c\bar p(t,z)) + c^0\bar y^0 L(t,z,-V_y(t,z,c\bar p(t,z)),\bar u(t,z)) + c\bar y(t,z) f(t,z,-V_y(t,z,c\bar p(t,z)),\bar u(t,z)) = 0. \tag{17}$$

Then $(\bar x(t,z),\bar u(t,z))$, $(t,z) \in [0,T]\times\Omega$, is an optimal pair relative to all admissible pairs $(x(t,z),u(t,z))$, $(t,z) \in [0,T]\times\Omega$, for which there exists a function $p(t,z) = (\bar y^0, y(t,z))$, $p \in W^{2,2}([0,T]\times\Omega)$, $(t,z,cp(t,z)) \in P$, such that $x(t,z) = -V_y(t,z,cp(t,z))$ for $(t,z) \in [0,T]\times\Omega$ and

$$y(0,z) = \bar y(0,z) \quad \text{for } z \in \Omega, \tag{18}$$

$$y(t,z) = \bar y(t,z) \quad \text{for } (t,z) \in [0,T]\times\partial\Omega. \tag{19}$$

Proof. Let $(x(t,z),u(t,z))$, $(t,z) \in [0,T]\times\Omega$, be an admissible pair for which there exists a function $p(t,z) = (\bar y^0, y(t,z))$, $p \in W^{2,2}([0,T]\times\Omega)$, $(t,z,cp(t,z)) \in P$, such that $x(t,z) = -V_y(t,z,cp(t,z))$ for $(t,z) \in [0,T]\times\Omega$ and (18), (19) are satisfied. From the transversality condition (6), we obtain that for $(t,z) \in [0,T]\times\Omega$,

$$V_t(t,z,cp(t,z)) + \Delta_z V(t,z,cp(t,z)) = c^0\bar y^0\left[\tfrac{d}{dt} V_{y^0}(t,z,cp(t,z)) + \Delta_z V_{y^0}(t,z,cp(t,z))\right] + cy(t,z)\left[\tfrac{d}{dt} V_y(t,z,cp(t,z)) + \Delta_z V_y(t,z,cp(t,z))\right]. \tag{20}$$

Since $x(t,z) = -V_y(t,z,cp(t,z))$ for $(t,z) \in [0,T]\times\Omega$, (1) shows that for $(t,z) \in [0,T]\times\Omega$,

$$\tfrac{d}{dt} V_y(t,z,cp(t,z)) + \Delta_z V_y(t,z,cp(t,z)) = -f(t,z,-V_y(t,z,cp(t,z)),u(t,z)). \tag{21}$$

Now define a function $W(t,z,p(t,z))$ on $P$ by the following requirement for $(t,z) \in [0,T]\times\Omega$:

$$\tfrac{d}{dt} W(t,z,cp(t,z)) + \Delta_z W(t,z,cp(t,z)) := c^0\bar y^0\left[\tfrac{d}{dt} V_{y^0}(t,z,cp(t,z)) + \Delta_z V_{y^0}(t,z,cp(t,z)) + L(t,z,-V_y(t,z,cp(t,z)),u(t,z))\right]. \tag{22}$$

We conclude from (20)–(22) that for $(t,z) \in [0,T]\times\Omega$,

$$\tfrac{d}{dt} W(t,z,cp(t,z)) + \Delta_z W(t,z,cp(t,z)) = V_t(t,z,cp(t,z)) + \Delta_z V(t,z,cp(t,z)) + c^0\bar y^0 L(t,z,-V_y(t,z,cp(t,z)),u(t,z)) + cy(t,z) f(t,z,-V_y(t,z,cp(t,z)),u(t,z)), \tag{23}$$

hence, by (15) and (23), that

$$\tfrac{d}{dt} W(t,z,cp(t,z)) + \Delta_z W(t,z,cp(t,z)) \le 0 \quad \text{for } (t,z) \in [0,T]\times\Omega, \tag{24}$$

and finally, after integrating (24) and applying (22), that

$$c^0\bar y^0 \int_{[0,T]\times\Omega} \left[\tfrac{d}{dt} V_{y^0}(t,z,cp(t,z)) + \operatorname{div}\nabla_z V_{y^0}(t,z,cp(t,z))\right] dt\,dz \le -c^0\bar y^0 \int_{[0,T]\times\Omega} L(t,z,x(t,z),u(t,z))\,dt\,dz. \tag{25}$$

Thus from (25), (16), (18), (19) and by the Green formula it follows that

$$c^0\bar y^0 \int_{\Omega} \left[ l(-V_y(T,z,cp(T,z))) - V_{y^0}(0,z,c^0\bar y^0,c\bar y(0,z)) \right] dz + c^0\bar y^0 \int_{[0,T]} \int_{\partial\Omega} \nabla_z V_{y^0}(t,z,c^0\bar y^0,c\bar y(t,z))\,\nu(z)\,dz\,dt \le -c^0\bar y^0 \int_{[0,T]\times\Omega} L(t,z,x(t,z),u(t,z))\,dt\,dz, \tag{26}$$

where $\nu(\cdot)$ is the exterior unit normal vector to $\partial\Omega$. So by (26) we get

$$c^0\bar y^0 \int_{[0,T]} \int_{\partial\Omega} \nabla_z V_{y^0}(t,z,c^0\bar y^0,c\bar y(t,z))\,\nu(z)\,dz\,dt - c^0\bar y^0 \int_{\Omega} V_{y^0}(0,z,c^0\bar y^0,c\bar y(0,z))\,dz \le -c^0\bar y^0 \int_{[0,T]\times\Omega} L(t,z,x(t,z),u(t,z))\,dt\,dz - c^0\bar y^0 \int_{\Omega} l(x(T,z))\,dz. \tag{27}$$

In the same manner, applying (17) and (23) we have

$$\tfrac{d}{dt} W(t,z,c\bar p(t,z)) + \Delta_z W(t,z,c\bar p(t,z)) = 0 \quad \text{for } (t,z) \in [0,T]\times\Omega. \tag{28}$$

Now from (28), (22), (16) and the Green formula we have

$$c^0\bar y^0 \int_{[0,T]} \int_{\partial\Omega} \nabla_z V_{y^0}(t,z,c^0\bar y^0,c\bar y(t,z))\,\nu(z)\,dz\,dt - c^0\bar y^0 \int_{\Omega} V_{y^0}(0,z,c^0\bar y^0,c\bar y(0,z))\,dz = -c^0\bar y^0 \int_{[0,T]\times\Omega} L(t,z,\bar x(t,z),\bar u(t,z))\,dt\,dz - c^0\bar y^0 \int_{\Omega} l(\bar x(T,z))\,dz. \tag{29}$$

Combining (27) with (29) gives

$$-c^0\bar y^0 \int_{[0,T]\times\Omega} L(t,z,\bar x(t,z),\bar u(t,z))\,dt\,dz - c^0\bar y^0 \int_{\Omega} l(\bar x(T,z))\,dz \le -c^0\bar y^0 \int_{[0,T]\times\Omega} L(t,z,x(t,z),u(t,z))\,dt\,dz - c^0\bar y^0 \int_{\Omega} l(x(T,z))\,dz, \tag{30}$$

which completes the proof.
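The passage from the transversality condition (6) to identity (20) is stated without detail. One possible justification of its time-derivative part is sketched below, under the assumption that (6) holds identically on an open set of dual variables, so that it may be differentiated. The shorthand $q = (q^0, q^1) = (c^0y^0, c\,y(t,z))$, with $y^0$ constant, is introduced only for this sketch.

```latex
\begin{align*}
&\text{(6):}\quad V = q^0 V_{q^0} + q^1 V_{q^1}. \\
&\partial_{q^1}\text{ of (6):}\quad V_{q^1} = q^0 V_{q^0 q^1} + V_{q^1} + q^1 V_{q^1 q^1}
  \;\Longrightarrow\; q^0 V_{q^0 q^1} + q^1 V_{q^1 q^1} = 0, \\
&\partial_t\text{ of (6):}\quad V_t = q^0 V_{q^0 t} + q^1 V_{q^1 t}, \\
&\tfrac{d}{dt} V_{q^0}(t,z,q(t,z)) = V_{q^0 t} + V_{q^0 q^1}\,\partial_t q^1, \qquad
 \tfrac{d}{dt} V_{q^1}(t,z,q(t,z)) = V_{q^1 t} + V_{q^1 q^1}\,\partial_t q^1, \\
&q^0 \tfrac{d}{dt} V_{q^0} + q^1 \tfrac{d}{dt} V_{q^1}
  = V_t + \partial_t q^1 \bigl( q^0 V_{q^0 q^1} + q^1 V_{q^1 q^1} \bigr) = V_t .
\end{align*}
```

The Laplacian terms in (20) would need an analogous argument in the variable $z$, which is not reproduced here.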
4. An Optimal Dual Feedback Control

For engineers and in practice, a feedback control is often more important than a value function. It turns out that the dual dynamic programming approach also allows us to investigate a kind of feedback control, which we call a dual feedback control. Surprisingly, it can have better properties than the classical one: the state equation now depends only on the parameter and not additionally on the state through the feedback function, which is what makes the state equation difficult to solve in the classical setting.

Definition 2. A function $\tilde u = \tilde u(t,z,p)$ from a subset $P$ of $\mathbb{R}^{n+3}$ of the points $(t,z,p) = (t,z,y^0,y)$, $(t,z) \in [0,T]\times\Omega$, $y^0 \le 0$, $y \in \mathbb{R}$, into $U$ is called a dual feedback control if there is a solution $\tilde x(t,z,p)$, $(t,z,p) \in P$, of the partial differential equation

$$x_t(t,z,p) + \Delta_z x(t,z,p) = f(t,z,x(t,z,p),\tilde u(t,z,p)) \tag{31}$$

such that for each admissible trajectory $x(t,z)$, $(t,z) \in [0,T]\times\Omega$, there exists a function $p(t,z) = (y^0, y(t,z))$, $p \in W^{2,2}([0,T]\times\Omega)$, $(t,z,p(t,z)) \in P$, such that (5) holds.

Definition 3. A dual feedback control $\bar u(t,z,p)$ is called an optimal dual feedback control if there exist a function $\bar x(t,z,p)$, $(t,z,p) \in P$, corresponding to $\bar u(t,z,p)$ as in Definition 2, and a function $\bar p(t,z) = (\bar y^0, \bar y(t,z))$, $\bar p \in W^{2,2}([0,T]\times\Omega)$, $(t,z,\bar p(t,z)) \in P$, $(t,z,c\bar p(t,z)) \in P$ with $c = (c^0,c)$, such that, for

$$S_D(t,\bar p(t,\cdot)) = -c^0\bar y^0 \int_{[t,T]\times\Omega} L(\tau,z,\bar x(\tau,z,\bar p(\tau,z)),\bar u(\tau,z,\bar p(\tau,z)))\,d\tau\,dz - c^0\bar y^0 \int_{\Omega} l(\bar x(T,z,\bar p(T,z)))\,dz, \tag{32}$$

defining $V_{y^0}(t,z,c\bar p(t,z))$ by

$$c^0\bar y^0 \int_{\partial\Omega} \nabla_z V_{y^0}(t,z,c\bar p(t,z))\,\nu(z)\,dz = -S_D(t,\bar p(t,\cdot)), \tag{33}$$

and for

$$V_y(t,z,cp) = -\bar x(t,z,p) \quad \text{for } (t,z,p) \in P,\ (t,z,cp) \in P, \tag{34}$$

there is $V(t,z,p)$ satisfying (6).

The next theorem is nothing more than the verification theorem formulated in terms of a dual feedback control.

Theorem 4. Let $\bar u(t,z,p)$ be a dual feedback control in $P$. Suppose that there exist $c = (c^0,c) \in \mathbb{R}^2$ and a $C^2$ solution $V(t,z,p)$ of DSPDEMDP (15) on $P$ such that (6) and (16) hold. Let $\bar p(t,z) = (\bar y^0, \bar y(t,z))$, $\bar p \in W^{2,2}([0,T]\times\Omega)$, $(t,z,\bar p(t,z)) \in P$, $(t,z,c\bar p(t,z)) \in P$, be such a function that $(\bar x(t,z),\bar u(t,z))$, where $\bar x(t,z) = \bar x(t,z,\bar p(t,z))$ and $\bar u(t,z) = \bar u(t,z,\bar p(t,z))$, $(t,z) \in [0,T]\times\Omega$, is an admissible pair with $\bar x(t,z,p)$, $(t,z,p) \in P$, corresponding to $\bar u(t,z,p)$ as in Definition 2. Assume further that

$$V_y(t,z,cp) = -\bar x(t,z,p) \quad \text{for } (t,z,p) \in P,\ (t,z,cp) \in P, \tag{35}$$

$$c^0\bar y^0 \int_{[0,T]} \int_{\partial\Omega} \nabla_z V_{y^0}(t,z,c^0\bar y^0,c\bar y(t,z))\,\nu(z)\,dz\,dt - c^0\bar y^0 \int_{\Omega} V_{y^0}(0,z,c^0\bar y^0,c\bar y(0,z))\,dz = -c^0\bar y^0 \int_{[0,T]\times\Omega} L(t,z,\bar x(t,z,\bar p(t,z)),\bar u(t,z,\bar p(t,z)))\,dt\,dz - c^0\bar y^0 \int_{\Omega} l(\bar x(T,z,\bar p(T,z)))\,dz. \tag{36}$$

Then $\bar u(t,z,p)$ is an optimal dual feedback control.

Proof. Take any function $p(t,z) = (\bar y^0, y(t,z))$, $p \in W^{2,2}([0,T]\times\Omega)$, $(t,z,p(t,z)) \in P$, $(t,z,cp(t,z)) \in P$,
such that ðxðt, zÞ, uðt, zÞÞ, where xðt, zÞ ¼ xðt, z, pðt, zÞÞ, uðt, zÞ ¼ uðt, z, pðt, zÞÞ, ðt, zÞ 2 ½0, T , is an admissible pair and (18), (19) hold. By (35), it follows that xðt, zÞ ¼ Vy ðt, z, cpðt, zÞÞ for ðt, zÞ 2 ½0, T . As in the proof of Theorem 1, equation (36) gives Z 0 0 c y Lðt, z, xðt, z, pðt, zÞÞ, uðt, z, pðt, zÞÞÞdtdz ½0, T Z c0 y0 lðxðT, z, pðT, zÞÞÞdz
pffiffiffi Hðt, z, v, cpÞ ¼ 2 6ðy0 Þ3=2 = 27ðyÞ1=2 v2 : ð41Þ Hence second order partial differential equation (13) has the form Vt ðt,z,cpÞ þ z Vðt,z,cpÞþ pffiffiffi
3=2 2 6 y0 = 27ðyÞ1=2 ðVy ðt,z,cpÞÞ2 ¼ 0:
Z c0 y0
From (39)–(40) we have
½0, T
Lðt, z, xðt, z, pðt, zÞÞ, uðt, z, pðt, zÞÞdtdz
c 0 y0
Z
ð42Þ Let
xðT, z, pðT, zÞÞdz:
Vðt, z, cpÞ :¼ ð4=3Þ1=4 ð4y=3Þ3=4 Xn þ ð2y0 =3Þ3=2 z2 =ð6nÞ: ð43Þ i¼1 i
(37)

We conclude from (37) that

$$S_D(t,\bar p(t,\cdot)) = -c^0\bar y^0 \int_{[t,T]\times\Omega} L(\tau,z,\bar x(\tau,z,\bar p(\tau,z)),\bar u(\tau,z,\bar p(\tau,z)))\,d\tau\,dz - c^0\bar y^0 \int_{\Omega} l(\bar x(T,z,\bar p(T,z)))\,dz, \tag{38}$$

and this is sufficient to show that $\bar u(t,z,p)$ is an optimal dual feedback control, by Theorem 1 and Definition 3. □
Then the function Vðt, pÞ satisfies on P both DPDEMDP (15) and transversality conditions (6). Let uðt, z, pÞ : P ! U be a function defined as follows ð44Þ uðt, z, pÞ :¼ y0 = 6t4 ðyÞ7=12 Hence equation (31) has the following form xt ðt,z,pÞþz xðt,z,pÞ , 0 3=2
¼ðy Þ
ð6yÞ
3=2
ðyÞ
1=2
Xn
2=3 2
!
z i¼1 i ð45Þ
5. An Example
If we take y0 ¼ 6ð2n=9Þ2=3 then (45) has the solution
Put for ðt, z, x, uÞ 2 ½0, T Rþ Rþ ,
xðt, z, pÞ ¼ ðyÞ1=2
2 7=6 1=2
Lðt, z, x, uÞ :¼ t x
u
fðt, z, x, uÞ :¼ t6 x1=2 u3=2 where :¼ fz 2 Rn : 0 < zi < 1, i ¼ 1, . . . , ng. Let further Y :¼ p ¼ ðy0 , yÞ2 R2 : y0 0, y < 0 , P :¼ ½0, T Y, U :¼ Rþ and c :¼ ð2=3, 4=3Þ 2 R2 . Define the Hamiltonian H : ½0, T Rþ Y ! R by the following formulas Hðt, z, v, cpÞ :¼ minþ Hðt, z, v, cp, uÞ,
ð39Þ
u2R
where H : ½0, T Rþ Y U ! R denotes the Pontryagin Hamiltonian Hðt, z, v, cp, uÞ :¼ ð2=3Þy0 t2 v7=6 u1=2 ð4=3Þyt v
6 1=2 3=2
u
:
ð40Þ
Xn
2=3
z i¼1 i
ð46Þ
on P. We denote that y0 by y0 . Therefore uðt, z, y0 , yÞ is a dual feedback control in P. The function (16) for RVðt, z, pÞ satisfies boundary condition 1=3 . lðVy ðT, z, cpÞÞdz ¼ ð1=9Þð2n=9Þ Let pðt, zÞ :¼ ðy0 , yðt, zÞÞ, where y0 comes from (45) and Xn 8=3 yðt, zÞ :¼ z : ð47Þ i i¼1 Then xðt, z, pðt, zÞÞ ¼
Xn
z i¼1 i
2=3
,
ð48Þ
X 14=9 n 4 ð49aÞ uðt, z, pðt, zÞÞ ¼ y = 6t z i¼1 i 0
:
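and $(\bar x(t,z),\bar u(t,z))$, where $\bar x(t,z) = \bar x(t,z,\bar p(t,z))$ and $\bar u(t,z) = \bar u(t,z,\bar p(t,z))$, $(t,z) \in [0,T]\times\Omega$, is an admissible pair for $\varphi(0,z) = \left(\sum_{i=1}^{n} z_i\right)^{2/3}$ on $\Omega$ and $\psi(t,z) = \left(\sum_{i=1}^{n} z_i\right)^{2/3}$ on $[0,T]\times\partial\Omega$ such that

$$V_y(t,z,c\bar p(t,z)) = -\bar x(t,z,\bar p(t,z)) \quad \text{for } (t,z) \in [0,T]\times\Omega. \tag{50}$$

Moreover, assumption (36) of Theorem 4 holds for $\int_{\Omega} l(\bar x(T,z,\bar p(T,z)))\,dz = (1/3)(2n/9)^{1/3}(T + 1/3)$. Therefore, by Theorem 4 we obtain that $\bar u(t,z,p)$ is an optimal dual feedback control, and from (32) we conclude that the dual value function $S_D$ equals

$$S_D(t,\bar p(t,\cdot)) = -(8n/81)(6T - 9t - 1). \tag{51}$$

6. An ε-optimization

If we want to solve a concrete problem (1)–(4) for particular data, then usually we are not able to solve it exactly, especially when the problem we consider is nonlinear. Therefore each possibility of approximating our optimal control problem (1)–(4) may turn out to be very useful. Below we describe a certain type of such an approximation.

Definition 5. Let $\varepsilon > 0$ and $c^0 > 0$ be fixed. A function $S_{\varepsilon D}(t,p(t,\cdot))$ is called an $\varepsilon$-dual value function if

$$S_D(t,p(t,\cdot)) \le S_{\varepsilon D}(t,p(t,\cdot)) \le S_D(t,p(t,\cdot)) - \varepsilon c^0 y^0_\varepsilon\, T\,\mathrm{vol}(\Omega) \tag{52}$$

for any $y^0_\varepsilon \le 0$.

Definition 6. Let $\varepsilon > 0$ and $c = (c^0,c) \in \mathbb{R}^2$, $c^0 > 0$, be fixed and let $\widetilde V(t,z,p)$ be a given $C^2$ function. Let $(x_\varepsilon(t,z),u_\varepsilon(t,z))$, $(t,z) \in [0,T]\times\Omega$, be an admissible pair and let $p_\varepsilon(t,z) = (y^0_\varepsilon, y_\varepsilon(t,z))$, $p_\varepsilon \in W^{2,2}([0,T]\times\Omega)$, $(t,z,cp_\varepsilon(t,z)) \in P$, be such a function that $x_\varepsilon(t,z) = -\widetilde V_y(t,z,cp_\varepsilon(t,z))$ for $(t,z) \in [0,T]\times\Omega$. The pair $(x_\varepsilon(t,z),u_\varepsilon(t,z))$, $(t,z) \in [0,T]\times\Omega$, is called an $\varepsilon$-optimal pair relative to all admissible pairs $(x(t,z),u(t,z))$, $(t,z) \in [0,T]\times\Omega$, for which there exists a function $p(t,z) = (y^0_\varepsilon, y(t,z))$, $p \in W^{2,2}([0,T]\times\Omega)$, $(t,z,cp(t,z)) \in P$, such that $x(t,z) = -\widetilde V_y(t,z,cp(t,z))$ for $(t,z) \in [0,T]\times\Omega$ and

$$y(0,z) = y_\varepsilon(0,z) \quad \text{for } z \in \Omega, \tag{53}$$

$$y(t,z) = y_\varepsilon(t,z) \quad \text{for } (t,z) \in [0,T]\times\partial\Omega, \tag{54}$$

if

$$-c^0y^0_\varepsilon \int_{[0,T]\times\Omega} L(t,z,x_\varepsilon(t,z),u_\varepsilon(t,z))\,dt\,dz - c^0y^0_\varepsilon \int_{\Omega} l(x_\varepsilon(T,z))\,dz \le -c^0y^0_\varepsilon \int_{[0,T]\times\Omega} L(t,z,x(t,z),u(t,z))\,dt\,dz - c^0y^0_\varepsilon \int_{\Omega} l(x(T,z))\,dz - \varepsilon c^0y^0_\varepsilon\, T\,\mathrm{vol}(\Omega). \tag{55}$$

Theorem 7. Let $(x_\varepsilon(t,z),u_\varepsilon(t,z))$, $(t,z) \in [0,T]\times\Omega$, be an admissible pair. Assume that there exist $\varepsilon > 0$, $c = (c^0,c) \in \mathbb{R}^2$, $c^0 > 0$, and a $C^2$ function $\widetilde V(t,z,p)$ such that for $(t,z,cp) \in P$:

$$\max\left\{ \widetilde V_t(t,z,cp) + \Delta_z \widetilde V(t,z,cp) + c^0y^0 L(t,z,-\widetilde V_y(t,z,cp),u) + cy f(t,z,-\widetilde V_y(t,z,cp),u) : u \in U \right\} \le -\varepsilon c^0y^0_\varepsilon, \tag{56}$$

$$\widetilde V(t,z,cp) = cp \widetilde V_p(t,z,cp). \tag{57}$$

Let further $p_\varepsilon(t,z) = (y^0_\varepsilon, y_\varepsilon(t,z))$, $p_\varepsilon \in W^{2,2}([0,T]\times\Omega)$, $(t,z,cp_\varepsilon(t,z)) \in P$, be such a function that $x_\varepsilon(t,z) = -\widetilde V_y(t,z,cp_\varepsilon(t,z))$ for $(t,z) \in [0,T]\times\Omega$. Suppose that $\widetilde V(t,z,p)$ satisfies the boundary condition for $(T,z,cp) \in P$,

$$c^0y^0_\varepsilon \int_{\Omega} \widetilde V_{y^0}(T,z,cp)\,dz = c^0y^0_\varepsilon \int_{\Omega} l(-\widetilde V_y(T,z,cp))\,dz. \tag{58}$$

Moreover, suppose that for almost all $(t,z) \in [0,T]\times\Omega$,

$$\widetilde V_t(t,z,cp_\varepsilon(t,z)) + \Delta_z \widetilde V(t,z,cp_\varepsilon(t,z)) + c^0y^0_\varepsilon L(t,z,-\widetilde V_y(t,z,cp_\varepsilon(t,z)),u_\varepsilon(t,z)) + cy_\varepsilon(t,z) f(t,z,-\widetilde V_y(t,z,cp_\varepsilon(t,z)),u_\varepsilon(t,z)) \ge 0. \tag{59}$$

Then $(x_\varepsilon(t,z),u_\varepsilon(t,z))$, $(t,z) \in [0,T]\times\Omega$, is an $\varepsilon$-optimal pair relative to all admissible pairs $(x(t,z),u(t,z))$, $(t,z) \in [0,T]\times\Omega$, for which there exists a function $p(t,z) = (y^0_\varepsilon, y(t,z))$, $p \in W^{2,2}([0,T]\times\Omega)$, $(t,z,cp(t,z)) \in P$, such that $x(t,z) = -\widetilde V_y(t,z,cp(t,z))$ for $(t,z) \in [0,T]\times\Omega$ and (53), (54) are satisfied.

Proof. Take any admissible pair $(x(t,z),u(t,z))$, $(t,z) \in [0,T]\times\Omega$, for which there exists a function $p(t,z) = (y^0_\varepsilon, y(t,z))$, $p \in W^{2,2}([0,T]\times\Omega)$, $(t,z,cp(t,z)) \in P$, such that $x(t,z) = -\widetilde V_y(t,z,cp(t,z))$ for $(t,z) \in [0,T]\times\Omega$ and (53), (54) hold. Then, from (57), we have for $(t,z) \in [0,T]\times\Omega$,

$$\widetilde V_t(t,z,cp(t,z)) + \Delta_z \widetilde V(t,z,cp(t,z)) = c^0y^0_\varepsilon\left[\tfrac{d}{dt} \widetilde V_{y^0}(t,z,cp(t,z)) + \Delta_z \widetilde V_{y^0}(t,z,cp(t,z))\right] + cy(t,z)\left[\tfrac{d}{dt} \widetilde V_y(t,z,cp(t,z)) + \Delta_z \widetilde V_y(t,z,cp(t,z))\right]. \tag{60}$$

Let $\widetilde W(t,z,p(t,z))$ be any function defined on $P$ such that for $(t,z) \in [0,T]\times\Omega$,

$$\tfrac{d}{dt} \widetilde W(t,z,cp(t,z)) + \Delta_z \widetilde W(t,z,cp(t,z)) := c^0y^0_\varepsilon\left[\tfrac{d}{dt} \widetilde V_{y^0}(t,z,cp(t,z)) + \Delta_z \widetilde V_{y^0}(t,z,cp(t,z)) + L(t,z,-\widetilde V_y(t,z,cp(t,z)),u(t,z))\right]. \tag{61}$$

Since $\tfrac{d}{dt} \widetilde V_y(t,z,cp(t,z)) + \Delta_z \widetilde V_y(t,z,cp(t,z)) = -f(t,z,-\widetilde V_y(t,z,cp(t,z)),u(t,z))$ for $(t,z) \in [0,T]\times\Omega$, it follows, by (60) and (61), that for $(t,z) \in [0,T]\times\Omega$,

$$\tfrac{d}{dt} \widetilde W(t,z,cp(t,z)) + \Delta_z \widetilde W(t,z,cp(t,z)) = \widetilde V_t(t,z,cp(t,z)) + \Delta_z \widetilde V(t,z,cp(t,z)) + c^0y^0_\varepsilon L(t,z,-\widetilde V_y(t,z,cp(t,z)),u(t,z)) + cy(t,z) f(t,z,-\widetilde V_y(t,z,cp(t,z)),u(t,z)). \tag{62}$$

Thus, by (56) and (62), we get

$$\tfrac{d}{dt} \widetilde W(t,z,cp(t,z)) + \Delta_z \widetilde W(t,z,cp(t,z)) \le -\varepsilon c^0y^0_\varepsilon \quad \text{for } (t,z) \in [0,T]\times\Omega. \tag{63}$$

Integrating (63) and applying (61) we obtain

$$c^0y^0_\varepsilon \int_{[0,T]\times\Omega} \left[\tfrac{d}{dt} \widetilde V_{y^0}(t,z,cp(t,z)) + \operatorname{div}\nabla_z \widetilde V_{y^0}(t,z,cp(t,z))\right] dt\,dz \le -c^0y^0_\varepsilon \int_{[0,T]\times\Omega} L(t,z,x(t,z),u(t,z))\,dt\,dz - \varepsilon c^0y^0_\varepsilon\, T\,\mathrm{vol}(\Omega). \tag{64}$$

From (64), (58), (53), (54) and the Green formula it follows that

$$c^0y^0_\varepsilon \int_{[0,T]} \int_{\partial\Omega} \nabla_z \widetilde V_{y^0}(t,z,c^0y^0_\varepsilon,cy_\varepsilon(t,z))\,\nu(z)\,dz\,dt - c^0y^0_\varepsilon \int_{\Omega} \widetilde V_{y^0}(0,z,c^0y^0_\varepsilon,cy_\varepsilon(0,z))\,dz \le -c^0y^0_\varepsilon \int_{[0,T]\times\Omega} L(t,z,x(t,z),u(t,z))\,dt\,dz - c^0y^0_\varepsilon \int_{\Omega} l(x(T,z))\,dz - \varepsilon c^0y^0_\varepsilon\, T\,\mathrm{vol}(\Omega), \tag{65}$$

where $\nu(\cdot)$ is the exterior unit normal vector to $\partial\Omega$. Similarly, by (59) and (61) we obtain

$$\tfrac{d}{dt} \widetilde W(t,z,cp_\varepsilon(t,z)) + \Delta_z \widetilde W(t,z,cp_\varepsilon(t,z)) \ge 0 \quad \text{for } (t,z) \in [0,T]\times\Omega. \tag{66}$$

Now from (66), (61), (58) and the Green formula we have

$$-c^0y^0_\varepsilon \int_{[0,T]\times\Omega} L(t,z,x_\varepsilon(t,z),u_\varepsilon(t,z))\,dt\,dz - c^0y^0_\varepsilon \int_{\Omega} l(x_\varepsilon(T,z))\,dz \le c^0y^0_\varepsilon \int_{[0,T]} \int_{\partial\Omega} \nabla_z \widetilde V_{y^0}(t,z,c^0y^0_\varepsilon,cy_\varepsilon(t,z))\,\nu(z)\,dz\,dt - c^0y^0_\varepsilon \int_{\Omega} \widetilde V_{y^0}(0,z,c^0y^0_\varepsilon,cy_\varepsilon(0,z))\,dz. \tag{67}$$

Therefore, combining (65) with (67) yields

$$-c^0y^0_\varepsilon \int_{[0,T]\times\Omega} L(t,z,x_\varepsilon(t,z),u_\varepsilon(t,z))\,dt\,dz - c^0y^0_\varepsilon \int_{\Omega} l(x_\varepsilon(T,z))\,dz \le -c^0y^0_\varepsilon \int_{[0,T]\times\Omega} L(t,z,x(t,z),u(t,z))\,dt\,dz - c^0y^0_\varepsilon \int_{\Omega} l(x(T,z))\,dz - \varepsilon c^0y^0_\varepsilon\, T\,\mathrm{vol}(\Omega), \tag{68}$$

which proves the assertion of the theorem. □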
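The factor $T\,\mathrm{vol}(\Omega)$ appearing in (55) and (64) comes from integrating the constant bound of (63) over the cylinder $[0,T]\times\Omega$. A short worked version of that step is given below, assuming that $\mathrm{vol}(\Omega)$ denotes the Lebesgue measure of $\Omega$.

```latex
\begin{align*}
\int_{[0,T]\times\Omega} \Bigl[\tfrac{d}{dt}\widetilde W + \Delta_z \widetilde W\Bigr]\,dt\,dz
 \;\le\; \int_{[0,T]\times\Omega} \bigl(-\varepsilon c^0 y^0_\varepsilon\bigr)\,dt\,dz
 \;=\; -\varepsilon c^0 y^0_\varepsilon\, T\,\mathrm{vol}(\Omega).
\end{align*}
```

Since $c^0 > 0$ and $y^0_\varepsilon \le 0$, the right-hand side is nonnegative and tends to zero as $\varepsilon \to 0$; in that limit (55) takes the same form as the inequality obtained at the end of the proof of Theorem 1.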
References

1. Barbu V. Analysis and control of nonlinear infinite dimensional systems. Academic Press, Boston, 1993
2. Barbu V. The dynamic programming equation for the time-optimal control problem in infinite dimensions. SIAM J Control Optim 1991; 29: 445–456
3. Barbu V, Da Prato G. Hamilton-Jacobi equations in Hilbert spaces. Pitman Advanced Publishing Program, Boston, 1983
4. Cannarsa P, Carja O. On the Bellman equation for the minimum time problem in infinite dimensions. SIAM J Control Optim 2004; 43: 532–548
5. Casas E. Pontryagin's principle for state-constrained boundary control problems of semilinear parabolic equations. SIAM J Control Optim 1997; 35: 1297–1327
6. Fattorini HO. Infinite-dimensional optimization and control theory. Cambridge University Press, Cambridge, 1999
7. Fattorini HO, Murphy T. Optimal control for nonlinear parabolic boundary control systems: the Dirichlet boundary conditions. Diff Integ Equ 1994; 7: 1367–1388
8. Fursikov AV. Optimal control of distributed systems. Theory and applications. American Mathematical Society, Providence, RI, 2000
9. Galewska E, Nowakowski A. Multidimensional dual dynamic programming. J Optim Theory Appl 2005; 124: 175–186
10. Galewska E, Nowakowski A. A dual dynamic programming for multidimensional elliptic optimal control problems. Numer Funct Anal Optimization 2006; 27: 279–280
11. Gozzi F, Tessitore ME. Optimality conditions for Dirichlet boundary control problems of parabolic type. J Math Systems Estim Control 1998; 8: 143–146
12. Li X, Yong J. Optimal control theory for infinite dimensional systems. Birkhauser, Boston, 1994
13. Neittaanmaki P, Tiba D. Optimal control of nonlinear parabolic systems. Marcel Dekker, New York, 1994
14. Nowakowski A. The dual dynamic programming. Proc Am Math Soc 1992; 116: 1089–1096