
European Journal of Control (2006) 12:455–463 © 2006 EUCA

A Dual Dynamic Programming for Multidimensional Parabolic Optimal Control Problems

E. Galewska¹ and A. Nowakowski²

¹Faculty of Mathematics, University of Lodz, Banacha 22, 90-238 Lodz, Poland; ²Faculty of Mathematics, University of Lodz, Banacha 22, 90-238 Lodz, Poland

In this paper, optimal control problems governed by parabolic equations are considered. We apply a new dual dynamic programming approach to derive sufficient optimality conditions for such problems. The idea is to move all the notions from a state space to a dual space and to obtain a new verification theorem providing the conditions which should be satisfied by a solution of the dual partial differential equation of dynamic programming. We also give sufficient conditions for the existence of an optimal dual feedback control, together with an approximation of the problem considered which seems to be very useful from the practical point of view.

Keywords: Dual Dynamic Programming; Dual Feedback Control; Parabolic Equation; Optimal Control Problem; Sufficient Optimality Conditions; Verification Theorem

E-mail: [email protected]. Correspondence to: A. Nowakowski; e-mail: annowako@math.uni.lodz.pl

1. Introduction

Consider the following optimal control problem (P):

$$\text{minimize } J(x,u) = \int_{[0,T]\times\Omega} L(t,z,x(t,z),u(t,z))\,dt\,dz + \int_{\Omega} l(x(T,z))\,dz$$

subject to

$$x_t(t,z) + \Delta_z x(t,z) = f(t,z,x(t,z),u(t,z)) \quad \text{a.e. on } [0,T]\times\Omega, \tag{1}$$

$$x(0,z) = \varphi(0,z) \quad \text{on } \Omega, \tag{2}$$

$$x(t,z) = \psi(t,z) \quad \text{on } [0,T]\times\partial\Omega, \tag{3}$$

$$u(t,z) \in U \quad \text{a.e. on } [0,T]\times\Omega, \tag{4}$$

where $\Omega$ is a given subset of $\mathbb{R}^n$ which is bounded with Lipschitz boundary and $U$ is a given nonempty set in $\mathbb{R}^m$; $L, f : [0,T]\times\Omega\times\mathbb{R}\times\mathbb{R}^m \to \mathbb{R}$, $l : \mathbb{R}\to\mathbb{R}$ and $\varphi, \psi : \mathbb{R}^{n+1}\to\mathbb{R}$ are given functions; $x : [0,T]\times\Omega\to\mathbb{R}$, $x\in W^{2,2}(\Omega)$, and $u : [0,T]\times\Omega\to\mathbb{R}^m$ is a Lebesgue measurable function. We assume that for each $s$ in $\mathbb{R}$, the functions $(t,z,u)\to L(t,z,s,u)$, $(t,z,u)\to f(t,z,s,u)$ are $(\mathcal{L}\times\mathcal{B})$-measurable, where $\mathcal{L}\times\mathcal{B}$ is the $\sigma$-algebra of subsets of $[0,T]\times\Omega\times\mathbb{R}^m$ generated by products of Lebesgue measurable subsets of $[0,T]\times\Omega$ and Borel subsets of $\mathbb{R}^m$, and for each $(t,z,u)\in[0,T]\times\Omega\times\mathbb{R}^m$, the functions $s\to L(t,z,s,u)$, $s\to f(t,z,s,u)$ are continuous. A pair $(x(t,z),u(t,z))$ is called admissible if it satisfies (1)–(4) and $L(t,z,x(t,z),u(t,z))$ is summable; the corresponding trajectory $x(t,z)$ is then said to be admissible. The aim of the paper is to present sufficient optimality conditions for problem (P) in terms of

Received 18 January 2005; Accepted 12 February 2006 Recommended by J. Tsinias, D. Normand-Cyrot


dynamic programming conditions directly. In the literature, there is no work in which problem (P) is studied directly by a dynamic programming method. The only results known to the authors (see e.g. [1]–[13] and references therein) treat problem (P) as an abstract problem with an abstract evolution equation (1) and later derive from abstract Hamilton–Jacobi equations the suitable sufficient optimality conditions for problem (P). We propose an almost direct method to study (P) by a dual dynamic programming approach, following the method described in [14] for the one-dimensional case and in [9] for the multidimensional case. We move all notions of dynamic programming to a dual space (the space of multipliers) and then develop a dual dynamic approach together with a dual Hamilton–Jacobi equation and, as a consequence, sufficient optimality conditions for (P). We also define an optimal dual feedback control, in terms of which we formulate sufficient conditions for optimality. Such an approach allows us to weaken significantly the assumptions on the data. An approximate minimum in terms of the dual dynamic programming is also investigated.
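To make the cost functional of problem (P) concrete, the following minimal sketch approximates J(x, u) by a midpoint rule. It assumes the one-dimensional domain (0, 1) in place of Ω, and the function name `cost_functional` together with the sample data are hypothetical, chosen only for illustration. The sample pair is used to exercise the quadrature and is not claimed to satisfy (1)–(4).

```python
def cost_functional(L, l, x, u, T, nt=200, nz=200):
    """Midpoint-rule approximation of
    J(x, u) = int_0^T int_0^1 L(t, z, x(t, z), u(t, z)) dz dt + int_0^1 l(x(T, z)) dz.
    Assumption: Omega = (0, 1), i.e. n = 1."""
    dt, dz = T / nt, 1.0 / nz
    running = 0.0
    for i in range(nt):
        t = (i + 0.5) * dt              # midpoint of the i-th time cell
        for j in range(nz):
            z = (j + 0.5) * dz          # midpoint of the j-th space cell
            running += L(t, z, x(t, z), u(t, z)) * dt * dz
    terminal = sum(l(x(T, (j + 0.5) * dz)) * dz for j in range(nz))
    return running + terminal


# Hypothetical data: L = x*u, l(s) = s, x(t, z) = z, u(t, z) = t, T = 2.
# Exact value: int_0^2 int_0^1 z*t dz dt + int_0^1 z dz = 1 + 1/2.
J = cost_functional(lambda t, z, x, u: x * u, lambda s: s,
                    lambda t, z: z, lambda t, z: t, T=2.0)
print(J)  # close to 1.5
```

The midpoint rule is exact for this bilinear integrand, which makes the sample a convenient sanity check; for general data a finer grid or a higher-order rule may be needed.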

2. A Dual Dynamic Programming

In this section we describe the intuition behind a dual dynamic programming approach to optimal control problems governed by parabolic equations. Let us recall what dynamic programming means. We have an initial condition $(t_0, x_0(t_0,z))$, $z\in\Omega$, for which we assume that we have an optimal solution $(x,u)$. Then by necessary optimality conditions there exists a conjugate function $p(t,z) = (y^0, y(t,z))$ on $[0,T]\times\Omega$ being a solution to the corresponding adjoint system (see e.g. [5], [11]). The element $p = (y^0, y)$ plays the role of multipliers from the classical Lagrange problem with constraints (with the multiplier $y^0$ attached to the functional and $y$ corresponding to the constraint). If we perturb $(t_0, x_0)$ then, assuming that an optimal solution exists for each perturbed problem, we also have a conjugate function corresponding to it. Therefore, by perturbing our initial conditions we obtain two sets of functions: optimal trajectories $x$ and the conjugate functions $p$ corresponding to them. The graphs of optimal trajectories cover some set in a state space $(t,z,x)$, say a set $X$ (in the classical calculus of variations it is called the field of extremals), and the graphs of conjugate functions cover some set in a conjugate space $(t,z,p)$, say a set $P$ (in classical mechanics it is called the space of momenta). In the classical dynamic programming approach we explore the state space $(t,z,x)$, i.e. the set $X$ (see e.g. [1]), but in the dual dynamic programming approach we explore the conjugate space (the dual space) $(t,z,p)$, i.e. the set $P$ (see [14] for the one-dimensional case and [9] for the multidimensional case). It is worth noting that although elliptic optimal control problems offer no possibility of perturbing the problem, dual dynamic programming can still be applied (see [10]).

It is natural that if we want to explore the dual space $(t,z,p)$ then we need a mapping between the set $P$ and the set $X$:

$$P \ni (t,z,p) \to (t,z,\tilde x(t,z,p)) \in X,$$

in order to be able to formulate, at the end of some consideration in $P$, conditions for optimality in our original problem as well as on an optimal solution $x$. Of course, such a mapping should have the property that for each admissible trajectory $x(t,z)$ lying in $X$ we must have a function $p(t,z)$ lying in $P$ such that $x(t,z) = \tilde x(t,z,p(t,z))$. Hence, we conduct all our investigations in the dual space $(t,z,p)$, i.e. most of our notions concerning dynamic programming are defined in the dual space, including the dynamic programming equation, which now becomes a dual dynamic programming equation.

Therefore let $P\subset\mathbb{R}^{n+3}$ be a set of the variables $(t,z,p) = (t,z,y^0,y)$, $(t,z)\in[0,T]\times\Omega$, $y^0\le 0$, $y\in\mathbb{R}$, and let $\mathbf{c} = (c^0,c)\in\mathbb{R}^2$ be fixed. The constant $\mathbf{c}$ is introduced for practical purposes only, i.e. in order to simplify the calculation of some relations stated below for concrete problems (see the section An Example). We adopt the convention that $(t,z,\mathbf{c}p) = (t,z,c^0y^0,cy)$ for $(t,z,p)\in P$. Let $\tilde x : P\to\mathbb{R}$ be such a function that for each admissible trajectory $x(t,z)$ there exists a function $p(t,z) = (y^0,y(t,z))$, $p\in W^{2,2}([0,T]\times\Omega)$, $(t,z,p(t,z))\in P$, such that

$$x(t,z) = \tilde x(t,z,p(t,z)) \quad \text{for } (t,z)\in[0,T]\times\Omega. \tag{5}$$

Now, let us introduce an auxiliary $C^2$ function $V(t,z,p) : P\to\mathbb{R}$ such that for $(t,z,p)\in P$, $(t,z,\mathbf{c}p)\in P$ the following condition is satisfied:

$$V(t,z,\mathbf{c}p) = c^0y^0V_{y^0}(t,z,\mathbf{c}p) + cyV_y(t,z,\mathbf{c}p) = \mathbf{c}p\,V_p(t,z,\mathbf{c}p). \tag{6}$$

Condition (6) is a generalization of the transversality condition known in classical mechanics as the orthogonality of a momentum to the front of a wave. As in classical dynamic programming, define at $(t,p(\cdot))$, where $p(z) = (y^0,y(z))$ is any

function $p\in W^{2,2}(\Omega)$, $(t,z,p(z))\in P$, a dual value function $S_D$ by the formula

$$S_D(t,p(\cdot)) := \inf\left\{-c^0y^0\int_{[t,T]\times\Omega} L(\tau,z,x(\tau,z),u(\tau,z))\,d\tau\,dz - c^0y^0\int_{\Omega} l(x(T,z))\,dz\right\}, \tag{7}$$

where the infimum is taken over all admissible pairs $x(\tau,\cdot)$, $u(\tau,\cdot)$, $\tau\in[t,T]$, such that

$$x(t,z) = \tilde x(t,z,p(z)) \quad \text{for } z\in\Omega, \tag{8}$$

$$\tilde x(t,z,p(z)) = \psi(t,z) \quad \text{for } z\in\partial\Omega, \tag{9}$$

i.e. whose trajectories start at $(t,\tilde x(t,\cdot,p(\cdot)))$. Then, integrating (6) over $\partial\Omega$, for any function $p(z) = (y^0,y(z))$, $p\in W^{2,2}(\Omega)$, $(t,z,p(z))\in P$, $(t,z,\mathbf{c}p(z))\in P$, such that $x(\cdot,\cdot)$ satisfying $x(t,z) = \tilde x(t,z,p(z))$ for $z\in\Omega$ is an admissible trajectory, we also have the equality

$$\int_{\partial\Omega}\nabla_z V(t,z,\mathbf{c}p(z))\,\nu(z)\,dz = -c\int_{\partial\Omega} y(z)\,\nabla\tilde x(t,z,p(z))\,\nu(z)\,dz - S_D(t,p(\cdot)) \tag{10}$$

with

$$c^0y^0\int_{\partial\Omega}\nabla_z V_{y^0}(t,z,\mathbf{c}p(z))\,\nu(z)\,dz = -S_D(t,p(\cdot)) \tag{11}$$

and assuming that $\tilde x(t,z,p(z)) = -V_y(t,z,\mathbf{c}p(z))$ for $(t,z)\in[0,T]\times\Omega$, $(t,z,\mathbf{c}p(z))\in P$. Here $\nu(\cdot)$ is the exterior unit normal vector to $\partial\Omega$ and $\nabla\tilde x(t,z,p(z))$ means "$\nabla$" of the function $z\to x(t,z)$. Denote by the symbol $\Delta_z h$ the sum of the second partial derivatives of the function $h : P\to\mathbb{R}$ with respect to the variables $z_i$, $i=1,\dots,n$, i.e.

$$\Delta_z h(t,z,p) := \sum_{i=1}^{n}\left(\partial^2/\partial z_i^2\right)h(t,z,p). \tag{12}$$

It turns out that the function $V(t,z,p)$ defined by (10), (11) satisfies the second order partial differential equation

$$V_t(t,z,\mathbf{c}p) + \Delta_z V(t,z,\mathbf{c}p) + H(t,z,-V_y(t,z,\mathbf{c}p),p) = 0, \tag{13}$$

where

$$H(t,z,v,p) = c^0y^0L(t,z,v,u(t,z,p)) + cy\,f(t,z,v,u(t,z,p)) \tag{14}$$

and $u(t,z,p)$ is an optimal dual feedback control, and the dual second order partial differential equation of multidimensional dynamic programming (DSPDEMDP)

$$\max\left\{V_t(t,z,\mathbf{c}p) + \Delta_z V(t,z,\mathbf{c}p) + c^0y^0L(t,z,-V_y(t,z,\mathbf{c}p),u) + cy\,f(t,z,-V_y(t,z,\mathbf{c}p),u) : u\in U\right\} = 0. \tag{15}$$

Let us note that the function $\tilde x(t,z,p)$, introduced somewhat artificially at the beginning of this section, is in fact defined by $-V_y(t,z,p)$, where $V$ is a solution to (15); i.e. knowing the set $P$ and $V_y$ we are able to describe the set $X$ in which our original problem needs to be considered. The assumption that the auxiliary function $V(t,z,p)$ is of class $C^2$ is important in this paper and we cannot weaken it. However, we would like to stress that it is only an auxiliary assumption and it is not imposed on the value function, which in our case need not even be continuous.

Remark. We would like to stress that the duality sketched in this section is not a duality in the sense of convex optimization. It is a new nonconvex duality, first described in [14] and then developed in [9], for which the relation sup(D) ≤ inf(P) is not available (D denotes a dual problem, P a primal one). Instead we have other relations, namely (6) and (10), which are generalizations of transversality conditions from classical mechanics. If we find a solution to (13), then checking relation (6) for concrete problems is not very difficult.
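Relation (6) has the same form as Euler's identity for functions that are positively homogeneous of degree one in the dual variable, which suggests a simple numerical check for a candidate V. The sketch below is only an illustration: the function name `transversality_residual`, the two sample functions and the finite-difference step are hypothetical, and the dual point (q0, q1) stands for (c⁰y⁰, cy).

```python
import math


def transversality_residual(V, t, z, q0, q1, h=1e-6):
    """Residual V - q0*V_{q0} - q1*V_{q1} of condition (6) at the dual point
    (q0, q1) = (c0*y0, c*y), with central finite differences for the partials."""
    dV_dq0 = (V(t, z, q0 + h, q1) - V(t, z, q0 - h, q1)) / (2 * h)
    dV_dq1 = (V(t, z, q0, q1 + h) - V(t, z, q0, q1 - h)) / (2 * h)
    return V(t, z, q0, q1) - q0 * dV_dq0 - q1 * dV_dq1


# Hypothetical candidate, homogeneous of degree one in (q0, q1): (6) holds.
def V_hom(t, z, q0, q1):
    return (t + z) * math.hypot(q0, q1)


# Hypothetical candidate of degree two in (q0, q1): (6) fails.
def V_bil(t, z, q0, q1):
    return q0 * q1


print(abs(transversality_residual(V_hom, 0.5, 0.25, -1.0, -2.0)) < 1e-6)
print(abs(transversality_residual(V_bil, 0.5, 0.25, -1.0, -2.0)) > 1.0)
```

A vanishing residual at sampled points is of course only a necessary check, not a proof that (6) holds on all of P.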

3. A Verification Theorem

The most important conclusion of dynamic programming is a verification theorem. We present it in a dual form, in accordance with the dual dynamic programming approach described in the previous section.

Theorem 1. Let $(\bar x(t,z),\bar u(t,z))$, $(t,z)\in[0,T]\times\Omega$, be an admissible pair. Assume that there exist $\mathbf{c} = (c^0,c)\in\mathbb{R}^2$ and a $C^2$ solution $V(t,z,p)$ of DSPDEMDP (15) on $P$ such that (6) holds. Let further $\bar p(t,z) = (y^0,\bar y(t,z))$, $\bar p\in W^{2,2}([0,T]\times\Omega)$, $(t,z,\mathbf{c}\bar p(t,z))\in P$, be such a function that $\bar x(t,z) = -V_y(t,z,\mathbf{c}\bar p(t,z))$ for $(t,z)\in[0,T]\times\Omega$. Suppose that $V(t,z,p)$ satisfies the boundary condition for $(T,z,\mathbf{c}p)\in P$,

$$c^0y^0\int_{\Omega}V_{y^0}(T,z,\mathbf{c}p)\,dz = c^0y^0\int_{\Omega}l(-V_y(T,z,\mathbf{c}p))\,dz. \tag{16}$$

Moreover, assume that for almost all $(t,z)\in[0,T]\times\Omega$,

$$V_t(t,z,\mathbf{c}\bar p(t,z)) + \Delta_z V(t,z,\mathbf{c}\bar p(t,z)) + c^0y^0L(t,z,-V_y(t,z,\mathbf{c}\bar p(t,z)),\bar u(t,z)) + c\bar y(t,z)f(t,z,-V_y(t,z,\mathbf{c}\bar p(t,z)),\bar u(t,z)) = 0. \tag{17}$$

Then $(\bar x(t,z),\bar u(t,z))$, $(t,z)\in[0,T]\times\Omega$, is an optimal pair relative to all admissible pairs $(x(t,z),u(t,z))$, $(t,z)\in[0,T]\times\Omega$, for which there exists such a function $p(t,z) = (y^0,y(t,z))$, $p\in W^{2,2}([0,T]\times\Omega)$, $(t,z,\mathbf{c}p(t,z))\in P$, that $x(t,z) = -V_y(t,z,\mathbf{c}p(t,z))$ for $(t,z)\in[0,T]\times\Omega$ and

$$y(0,z) = \bar y(0,z) \quad \text{for } z\in\Omega, \tag{18}$$

$$y(t,z) = \bar y(t,z) \quad \text{for } (t,z)\in[0,T]\times\partial\Omega. \tag{19}$$

Proof. Let $(x(t,z),u(t,z))$, $(t,z)\in[0,T]\times\Omega$, be an admissible pair for which there exists such a function $p(t,z) = (y^0,y(t,z))$, $p\in W^{2,2}([0,T]\times\Omega)$, $(t,z,\mathbf{c}p(t,z))\in P$, that $x(t,z) = -V_y(t,z,\mathbf{c}p(t,z))$ for $(t,z)\in[0,T]\times\Omega$ and (18), (19) are satisfied. From the transversality condition (6), we obtain that for $(t,z)\in[0,T]\times\Omega$,

$$V_t(t,z,\mathbf{c}p(t,z)) + \Delta_z V(t,z,\mathbf{c}p(t,z)) = c^0y^0\left[(d/dt)V_{y^0}(t,z,\mathbf{c}p(t,z)) + \Delta_z V_{y^0}(t,z,\mathbf{c}p(t,z))\right] + cy(t,z)\left[(d/dt)V_y(t,z,\mathbf{c}p(t,z)) + \Delta_z V_y(t,z,\mathbf{c}p(t,z))\right]. \tag{20}$$

Since $x(t,z) = -V_y(t,z,\mathbf{c}p(t,z))$ for $(t,z)\in[0,T]\times\Omega$, (1) shows that for $(t,z)\in[0,T]\times\Omega$,

$$(d/dt)V_y(t,z,\mathbf{c}p(t,z)) + \Delta_z V_y(t,z,\mathbf{c}p(t,z)) = -f(t,z,-V_y(t,z,\mathbf{c}p(t,z)),u(t,z)). \tag{21}$$

Now define a function $W(t,z,p(t,z))$ on $P$ by the following requirement for $(t,z)\in[0,T]\times\Omega$:

$$(d/dt)W(t,z,\mathbf{c}p(t,z)) + \Delta_z W(t,z,\mathbf{c}p(t,z)) := c^0y^0\left[(d/dt)V_{y^0}(t,z,\mathbf{c}p(t,z)) + \Delta_z V_{y^0}(t,z,\mathbf{c}p(t,z)) + L(t,z,-V_y(t,z,\mathbf{c}p(t,z)),u(t,z))\right]. \tag{22}$$

We conclude from (20)–(22) that for $(t,z)\in[0,T]\times\Omega$,

$$(d/dt)W(t,z,\mathbf{c}p(t,z)) + \Delta_z W(t,z,\mathbf{c}p(t,z)) = V_t(t,z,\mathbf{c}p(t,z)) + \Delta_z V(t,z,\mathbf{c}p(t,z)) + c^0y^0L(t,z,-V_y(t,z,\mathbf{c}p(t,z)),u(t,z)) + cy(t,z)f(t,z,-V_y(t,z,\mathbf{c}p(t,z)),u(t,z)), \tag{23}$$

hence, by (15) and (23), that

$$(d/dt)W(t,z,\mathbf{c}p(t,z)) + \Delta_z W(t,z,\mathbf{c}p(t,z)) \le 0 \quad \text{for } (t,z)\in[0,T]\times\Omega, \tag{24}$$

and finally, after integrating (24) and applying (22), that

$$c^0y^0\int_{[0,T]\times\Omega}\left[(d/dt)V_{y^0}(t,z,\mathbf{c}p(t,z)) + \operatorname{div}\nabla_z V_{y^0}(t,z,\mathbf{c}p(t,z))\right]dt\,dz \le -c^0y^0\int_{[0,T]\times\Omega}L(t,z,x(t,z),u(t,z))\,dt\,dz. \tag{25}$$

Thus from (25), (16), (18), (19) and by the Green formula it follows that

$$c^0y^0\int_{\Omega}\left[l(-V_y(T,z,\mathbf{c}p(T,z))) - V_{y^0}(0,z,c^0y^0,c\bar y(0,z))\right]dz + c^0y^0\int_{[0,T]}\left(\int_{\partial\Omega}\nabla_z V_{y^0}(t,z,c^0y^0,c\bar y(t,z))\,\nu(z)\,dz\right)dt \le -c^0y^0\int_{[0,T]\times\Omega}L(t,z,x(t,z),u(t,z))\,dt\,dz, \tag{26}$$

where $\nu(\cdot)$ is the exterior unit normal vector to $\partial\Omega$. So by (26) we get

$$c^0y^0\int_{[0,T]}\left(\int_{\partial\Omega}\nabla_z V_{y^0}(t,z,c^0y^0,c\bar y(t,z))\,\nu(z)\,dz\right)dt - c^0y^0\int_{\Omega}V_{y^0}(0,z,c^0y^0,c\bar y(0,z))\,dz \le -c^0y^0\int_{[0,T]\times\Omega}L(t,z,x(t,z),u(t,z))\,dt\,dz - c^0y^0\int_{\Omega}l(x(T,z))\,dz. \tag{27}$$

In the same manner, applying (17) and (23), we have

$$(d/dt)W(t,z,\mathbf{c}\bar p(t,z)) + \Delta_z W(t,z,\mathbf{c}\bar p(t,z)) = 0 \quad \text{for } (t,z)\in[0,T]\times\Omega. \tag{28}$$

Now from (28), (22), (16) and the Green formula we have

$$c^0y^0\int_{[0,T]}\left(\int_{\partial\Omega}\nabla_z V_{y^0}(t,z,c^0y^0,c\bar y(t,z))\,\nu(z)\,dz\right)dt - c^0y^0\int_{\Omega}V_{y^0}(0,z,c^0y^0,c\bar y(0,z))\,dz = -c^0y^0\int_{[0,T]\times\Omega}L(t,z,\bar x(t,z),\bar u(t,z))\,dt\,dz - c^0y^0\int_{\Omega}l(\bar x(T,z))\,dz. \tag{29}$$

Combining (27) with (29) gives

$$-c^0y^0\int_{[0,T]\times\Omega}L(t,z,\bar x(t,z),\bar u(t,z))\,dt\,dz - c^0y^0\int_{\Omega}l(\bar x(T,z))\,dz \le -c^0y^0\int_{[0,T]\times\Omega}L(t,z,x(t,z),u(t,z))\,dt\,dz - c^0y^0\int_{\Omega}l(x(T,z))\,dz, \tag{30}$$

which completes the proof. □
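The passage from inequality (30) to optimality in the sense of problem (P) can be spelled out as follows. This is a sketch under the assumption, not stated explicitly at this point of the text, that $c^0y^0 < 0$ (for instance $c^0 > 0$ and $y^0 < 0$), so that the common factor $-c^0y^0$ is strictly positive; barred symbols denote the candidate pair of Theorem 1. For $y^0 = 0$ both sides of (30) vanish and the inequality carries no information.

```latex
% Assumption: -c^0 y^0 > 0, so dividing (30) by it preserves the inequality.
\begin{align*}
-c^0y^0\Big[\int_{[0,T]\times\Omega} L(t,z,\bar x,\bar u)\,dt\,dz + \int_\Omega l(\bar x(T,z))\,dz\Big]
  &\le -c^0y^0\Big[\int_{[0,T]\times\Omega} L(t,z,x,u)\,dt\,dz + \int_\Omega l(x(T,z))\,dz\Big] \\
\Longleftrightarrow\quad -c^0y^0\,J(\bar x,\bar u) &\le -c^0y^0\,J(x,u) \\
\Longrightarrow\quad J(\bar x,\bar u) &\le J(x,u).
\end{align*}
```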

4. An Optimal Dual Feedback Control

It often occurs that, for engineers and in practice, a feedback control is more important than a value function. It turns out that a dual dynamic programming approach also allows us to investigate a kind of feedback control which we call a dual feedback control. Surprisingly, it can have better properties than the classical one: our state equation now depends only on the parameter and not additionally on the state through the feedback function, which is what makes the state equation difficult to solve.

Definition 2. A function $\tilde u = \tilde u(t,z,p)$ from a subset $P$ of $\mathbb{R}^{n+3}$ of the points $(t,z,p) = (t,z,y^0,y)$, $(t,z)\in[0,T]\times\Omega$, $y^0\le 0$, $y\in\mathbb{R}$, into $U$ is called a dual feedback control, if there is any solution $\tilde x(t,z,p)$, $(t,z,p)\in P$, of the partial differential equation

$$x_t(t,z,p) + \Delta_z x(t,z,p) = f(t,z,x(t,z,p),\tilde u(t,z,p)) \tag{31}$$

such that for each admissible trajectory $x(t,z)$, $(t,z)\in[0,T]\times\Omega$, there exists such a function $p(t,z) = (y^0,y(t,z))$, $p\in W^{2,2}([0,T]\times\Omega)$, $(t,z,p(t,z))\in P$, that (5) holds.

Definition 3. A dual feedback control $\bar u(t,z,p)$ is called an optimal dual feedback control, if there exist a function $\bar x(t,z,p)$, $(t,z,p)\in P$, corresponding to $\bar u(t,z,p)$ as in Definition 2, and a function $\bar p(t,z) = (y^0,\bar y(t,z))$, $\bar p\in W^{2,2}([0,T]\times\Omega)$, $(t,z,\bar p(t,z))\in P$, $(t,z,\mathbf{c}\bar p(t,z))\in P$ with $\mathbf{c} = (c^0,c)$, such that, for

$$S_D(t,\bar p(t,\cdot)) = -c^0y^0\int_{[t,T]\times\Omega}L(\tau,z,\bar x(\tau,z,\bar p(\tau,z)),\bar u(\tau,z,\bar p(\tau,z)))\,d\tau\,dz - c^0y^0\int_{\Omega}l(\bar x(T,z,\bar p(T,z)))\,dz, \tag{32}$$

defining $V_{y^0}(t,z,\mathbf{c}\bar p(t,z))$ by

$$c^0y^0\int_{\partial\Omega}\nabla_z V_{y^0}(t,z,\mathbf{c}\bar p(t,z))\,\nu(z)\,dz = -S_D(t,\bar p(t,\cdot)) \tag{33}$$

and for

$$V_y(t,z,\mathbf{c}p) = -\bar x(t,z,p) \quad \text{for } (t,z,p)\in P,\ (t,z,\mathbf{c}p)\in P, \tag{34}$$

there is $V(t,z,p)$ satisfying (6).

The next theorem is nothing more than the verification theorem formulated in terms of a dual feedback control.

Theorem 4. Let $\bar u(t,z,p)$ be a dual feedback control in $P$. Suppose that there exist $\mathbf{c} = (c^0,c)\in\mathbb{R}^2$ and a $C^2$ solution $V(t,z,p)$ of DSPDEMDP (15) on $P$ such that (6) and (16) hold. Let $\bar p(t,z) = (y^0,\bar y(t,z))$, $\bar p\in W^{2,2}([0,T]\times\Omega)$, $(t,z,\bar p(t,z))\in P$, $(t,z,\mathbf{c}\bar p(t,z))\in P$, be such a function that $(\bar x(t,z),\bar u(t,z))$, where $\bar x(t,z) = \bar x(t,z,\bar p(t,z))$ and $\bar u(t,z) = \bar u(t,z,\bar p(t,z))$, $(t,z)\in[0,T]\times\Omega$, is an admissible pair with $\bar x(t,z,p)$, $(t,z,p)\in P$, corresponding to $\bar u(t,z,p)$ as in Definition 2. Assume further that:

$$V_y(t,z,\mathbf{c}p) = -\bar x(t,z,p) \quad \text{for } (t,z,p)\in P,\ (t,z,\mathbf{c}p)\in P, \tag{35}$$

$$c^0y^0\int_{[0,T]}\left(\int_{\partial\Omega}\nabla_z V_{y^0}(t,z,c^0y^0,c\bar y(t,z))\,\nu(z)\,dz\right)dt - c^0y^0\int_{\Omega}V_{y^0}(0,z,c^0y^0,c\bar y(0,z))\,dz = -c^0y^0\int_{[0,T]\times\Omega}L(t,z,\bar x(t,z,\bar p(t,z)),\bar u(t,z,\bar p(t,z)))\,dt\,dz - c^0y^0\int_{\Omega}l(\bar x(T,z,\bar p(T,z)))\,dz. \tag{36}$$

Then $\bar u(t,z,p)$ is an optimal dual feedback control.

Proof. Take any function $p(t,z) = (y^0,y(t,z))$, $p\in W^{2,2}([0,T]\times\Omega)$, $(t,z,p(t,z))\in P$, $(t,z,\mathbf{c}p(t,z))\in P$,

such that $(x(t,z),u(t,z))$, where $x(t,z) = \bar x(t,z,p(t,z))$, $u(t,z) = \bar u(t,z,p(t,z))$, $(t,z)\in[0,T]\times\Omega$, is an admissible pair and (18), (19) hold. By (35), it follows that $x(t,z) = -V_y(t,z,\mathbf{c}p(t,z))$ for $(t,z)\in[0,T]\times\Omega$. As in the proof of Theorem 1, equation (36) gives

$$-c^0y^0\int_{[0,T]\times\Omega}L(t,z,\bar x(t,z,\bar p(t,z)),\bar u(t,z,\bar p(t,z)))\,dt\,dz - c^0y^0\int_{\Omega}l(\bar x(T,z,\bar p(T,z)))\,dz \le -c^0y^0\int_{[0,T]\times\Omega}L(t,z,\bar x(t,z,p(t,z)),\bar u(t,z,p(t,z)))\,dt\,dz - c^0y^0\int_{\Omega}l(\bar x(T,z,p(T,z)))\,dz. \tag{37}$$

We conclude from (37) that

$$S_D(t,\bar p(t,\cdot)) = -c^0y^0\int_{[t,T]\times\Omega}L(\tau,z,\bar x(\tau,z,\bar p(\tau,z)),\bar u(\tau,z,\bar p(\tau,z)))\,d\tau\,dz - c^0y^0\int_{\Omega}l(\bar x(T,z,\bar p(T,z)))\,dz, \tag{38}$$

which, by Theorem 1 and Definition 3, is sufficient to show that $\bar u(t,z,p)$ is an optimal dual feedback control. □

5. An Example

Put for $(t,z,x,u)\in[0,T]\times\Omega\times\mathbb{R}^+\times\mathbb{R}^+$,

$$L(t,z,x,u) := t^2x^{7/6}u^{1/2},$$

$$f(t,z,x,u) := -t^6x^{1/2}u^{3/2},$$

where $\Omega := \{z\in\mathbb{R}^n : 0 < z_i < 1,\ i=1,\dots,n\}$. Let further $Y := \{p = (y^0,y)\in\mathbb{R}^2 : y^0\le 0,\ y < 0\}$, $P := [0,T]\times\Omega\times Y$, $U := \mathbb{R}^+$ and $\mathbf{c} := (2/3,4/3)\in\mathbb{R}^2$. Define the Hamiltonian $H : [0,T]\times\Omega\times\mathbb{R}^+\times Y\to\mathbb{R}$ by the following formula

$$H(t,z,v,\mathbf{c}p) := \min_{u\in\mathbb{R}^+}\mathcal{H}(t,z,v,\mathbf{c}p,u), \tag{39}$$

where $\mathcal{H} : [0,T]\times\Omega\times\mathbb{R}^+\times Y\times U\to\mathbb{R}$ denotes the Pontryagin Hamiltonian

$$\mathcal{H}(t,z,v,\mathbf{c}p,u) := (2/3)y^0t^2v^{7/6}u^{1/2} - (4/3)y\,t^6v^{1/2}u^{3/2}. \tag{40}$$

From (39)–(40) we have

$$H(t,z,v,\mathbf{c}p) = -\frac{2\sqrt{6}\,(-y^0)^{3/2}}{27(-y)^{1/2}}\,v^2. \tag{41}$$

Hence the second order partial differential equation (13) has the form

$$V_t(t,z,\mathbf{c}p) + \Delta_z V(t,z,\mathbf{c}p) - \frac{2\sqrt{6}\,(-y^0)^{3/2}}{27(-y)^{1/2}}\left(V_y(t,z,\mathbf{c}p)\right)^2 = 0. \tag{42}$$

Let

$$V(t,z,\mathbf{c}p) := (4/3)^{1/4}(-4y/3)^{3/4} + (-2y^0/3)^{3/2}\sum_{i=1}^{n}z_i^2/(6n). \tag{43}$$

Then the function $V(t,z,p)$ satisfies on $P$ both DSPDEMDP (15) and the transversality condition (6). Let $\bar u(t,z,p) : P\to U$ be a function defined as follows:

$$\bar u(t,z,p) := -y^0\big/\left(6t^4(-y)^{7/12}\right). \tag{44}$$

Hence equation (31) has the following form:

$$x_t(t,z,p) + \Delta_z x(t,z,p) = f(t,z,x(t,z,p),\bar u(t,z,p)). \tag{45}$$

If we take $y^0 = -6(2n/9)^{2/3}$ then (45) has the solution

$$\bar x(t,z,p) = (-y)^{1/2}\sum_{i=1}^{n}z_i^{2/3} \tag{46}$$

on $P$. We denote that $y^0$ by $\bar y^0$. Therefore $\bar u(t,z,\bar y^0,y)$ is a dual feedback control in $P$. The function $V(t,z,p)$ satisfies boundary condition (16) for $\int_{\Omega}l(-V_y(T,z,\mathbf{c}p))\,dz = (1/9)(2n/9)^{1/3}$. Let $\bar p(t,z) := (\bar y^0,\bar y(t,z))$, where $\bar y^0$ comes from (45) and

$$\bar y(t,z) := -\sum_{i=1}^{n}z_i^{8/3}. \tag{47}$$

Then

$$\bar x(t,z,\bar p(t,z)) = \sum_{i=1}^{n}z_i^{2/3}, \tag{48}$$

$$\bar u(t,z,\bar p(t,z)) = -\bar y^0\Big/\left(6t^4\sum_{i=1}^{n}z_i^{14/9}\right) \tag{49a}$$

and $(\bar x(t,z),\bar u(t,z))$, where $\bar x(t,z) = \bar x(t,z,\bar p(t,z))$ and $\bar u(t,z) = \bar u(t,z,\bar p(t,z))$, $(t,z)\in[0,T]\times\Omega$, is an admissible pair for $\varphi(0,z) = \sum_{i=1}^{n}z_i^{2/3}$ on $\Omega$ and $\psi(t,z) = \sum_{i=1}^{n}z_i^{2/3}$ on $[0,T]\times\partial\Omega$ such that

$$V_y(t,z,\mathbf{c}\bar p(t,z)) = -\bar x(t,z,\bar p(t,z)) \quad \text{for } (t,z)\in[0,T]\times\Omega. \tag{50}$$

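Moreover, assumption (36) of Theorem 4 holds for $\int_{\Omega}l(\bar x(T,z,\bar p(T,z)))\,dz = (1/3)(2n/9)^{1/3}(T+1/3)$. Therefore, by Theorem 4 we obtain that $\bar u(t,z,p)$ is an optimal dual feedback control and from (32) we conclude that the dual value function $S_D$ equals

$$S_D(t,\bar p(t,\cdot)) = (8n/81)(6T - 9t - 1). \tag{51}$$

6. An ε-optimization

If we want to solve a concrete problem (1)–(4) for particular data, then usually we are not able to solve it exactly, especially when the problem we consider is nonlinear. Therefore each possibility of approximating our optimal control problem (1)–(4) may turn out to be very useful. Below we describe one such type of approximation.

Definition 5. Let $\varepsilon > 0$ and $c^0 > 0$ be fixed. A function $S_{\varepsilon D}(t,p(t,\cdot))$ is called an ε-dual value function, if

$$S_D(t,p(t,\cdot)) \le S_{\varepsilon D}(t,p(t,\cdot)) \le S_D(t,p(t,\cdot)) - \varepsilon c^0y^0_\varepsilon T\operatorname{vol}(\Omega) \tag{52}$$

for any $y^0_\varepsilon\le 0$.

Definition 6. Let $\varepsilon > 0$ and $\mathbf{c} = (c^0,c)\in\mathbb{R}^2$, $c^0 > 0$, be fixed and let $\tilde V(t,z,p)$ be a given $C^2$ function. Let $(x_\varepsilon(t,z),u_\varepsilon(t,z))$, $(t,z)\in[0,T]\times\Omega$, be an admissible pair and let $p_\varepsilon(t,z) = (y^0_\varepsilon,y_\varepsilon(t,z))$, $p_\varepsilon\in W^{2,2}([0,T]\times\Omega)$, $(t,z,\mathbf{c}p_\varepsilon(t,z))\in P$, be such a function that $x_\varepsilon(t,z) = -\tilde V_y(t,z,\mathbf{c}p_\varepsilon(t,z))$ for $(t,z)\in[0,T]\times\Omega$. The pair $(x_\varepsilon(t,z),u_\varepsilon(t,z))$, $(t,z)\in[0,T]\times\Omega$, is called an ε-optimal pair relative to all admissible pairs $(x(t,z),u(t,z))$, $(t,z)\in[0,T]\times\Omega$, for which there exists such a function $p(t,z) = (y^0_\varepsilon,y(t,z))$, $p\in W^{2,2}([0,T]\times\Omega)$, $(t,z,\mathbf{c}p(t,z))\in P$, that $x(t,z) = -\tilde V_y(t,z,\mathbf{c}p(t,z))$ for $(t,z)\in[0,T]\times\Omega$ and

$$y(0,z) = y_\varepsilon(0,z) \quad \text{for } z\in\Omega, \tag{53}$$

$$y(t,z) = y_\varepsilon(t,z) \quad \text{for } (t,z)\in[0,T]\times\partial\Omega, \tag{54}$$

if

$$-c^0y^0_\varepsilon\int_{[0,T]\times\Omega}L(t,z,x_\varepsilon(t,z),u_\varepsilon(t,z))\,dt\,dz - c^0y^0_\varepsilon\int_{\Omega}l(x_\varepsilon(T,z))\,dz \le -c^0y^0_\varepsilon\int_{[0,T]\times\Omega}L(t,z,x(t,z),u(t,z))\,dt\,dz - c^0y^0_\varepsilon\int_{\Omega}l(x(T,z))\,dz - \varepsilon c^0y^0_\varepsilon T\operatorname{vol}(\Omega). \tag{55}$$

Theorem 7. Let $(x_\varepsilon(t,z),u_\varepsilon(t,z))$, $(t,z)\in[0,T]\times\Omega$, be an admissible pair. Assume that there exist $\varepsilon > 0$, $\mathbf{c} = (c^0,c)\in\mathbb{R}^2$, $c^0 > 0$, and a $C^2$ function $\tilde V(t,z,p)$ such that for $(t,z,\mathbf{c}p)\in P$:

$$\max\left\{\tilde V_t(t,z,\mathbf{c}p) + \Delta_z\tilde V(t,z,\mathbf{c}p) + c^0y^0L(t,z,-\tilde V_y(t,z,\mathbf{c}p),u) + cy\,f(t,z,-\tilde V_y(t,z,\mathbf{c}p),u) : u\in U\right\} \le -\varepsilon c^0y^0_\varepsilon, \tag{56}$$

$$\tilde V(t,z,\mathbf{c}p) = \mathbf{c}p\,\tilde V_p(t,z,\mathbf{c}p). \tag{57}$$

Let further $p_\varepsilon(t,z) = (y^0_\varepsilon,y_\varepsilon(t,z))$, $p_\varepsilon\in W^{2,2}([0,T]\times\Omega)$, $(t,z,\mathbf{c}p_\varepsilon(t,z))\in P$, be such a function that $x_\varepsilon(t,z) = -\tilde V_y(t,z,\mathbf{c}p_\varepsilon(t,z))$ for $(t,z)\in[0,T]\times\Omega$. Suppose that $\tilde V(t,z,p)$ satisfies the boundary condition for $(T,z,\mathbf{c}p)\in P$,

$$c^0y^0_\varepsilon\int_{\Omega}\tilde V_{y^0}(T,z,\mathbf{c}p)\,dz = c^0y^0_\varepsilon\int_{\Omega}l\left(-\tilde V_y(T,z,\mathbf{c}p)\right)dz. \tag{58}$$

Moreover, suppose that for almost all $(t,z)\in[0,T]\times\Omega$,

$$\tilde V_t(t,z,\mathbf{c}p_\varepsilon(t,z)) + \Delta_z\tilde V(t,z,\mathbf{c}p_\varepsilon(t,z)) + c^0y^0_\varepsilon L(t,z,-\tilde V_y(t,z,\mathbf{c}p_\varepsilon(t,z)),u_\varepsilon(t,z)) + cy_\varepsilon(t,z)f(t,z,-\tilde V_y(t,z,\mathbf{c}p_\varepsilon(t,z)),u_\varepsilon(t,z)) \ge 0. \tag{59}$$

Then $(x_\varepsilon(t,z),u_\varepsilon(t,z))$, $(t,z)\in[0,T]\times\Omega$, is an ε-optimal pair relative to all admissible pairs $(x(t,z),u(t,z))$, $(t,z)\in[0,T]\times\Omega$, for which there exists such a function $p(t,z) = (y^0_\varepsilon,y(t,z))$, $p\in W^{2,2}([0,T]\times\Omega)$, $(t,z,\mathbf{c}p(t,z))\in P$, that $x(t,z) = -\tilde V_y(t,z,\mathbf{c}p(t,z))$ for $(t,z)\in[0,T]\times\Omega$ and (53), (54) are satisfied.

Proof. Take any admissible pair $(x(t,z),u(t,z))$, $(t,z)\in[0,T]\times\Omega$, for which there exists such a function $p(t,z) = (y^0_\varepsilon,y(t,z))$, $p\in W^{2,2}([0,T]\times\Omega)$, $(t,z,\mathbf{c}p(t,z))\in P$, that $x(t,z) = -\tilde V_y(t,z,\mathbf{c}p(t,z))$ for $(t,z)\in[0,T]\times\Omega$ and (53), (54) hold. Then, from (57), we have for $(t,z)\in[0,T]\times\Omega$,

$$\tilde V_t(t,z,\mathbf{c}p(t,z)) + \Delta_z\tilde V(t,z,\mathbf{c}p(t,z)) = c^0y^0_\varepsilon\left[(d/dt)\tilde V_{y^0}(t,z,\mathbf{c}p(t,z)) + \Delta_z\tilde V_{y^0}(t,z,\mathbf{c}p(t,z))\right] + cy(t,z)\left[(d/dt)\tilde V_y(t,z,\mathbf{c}p(t,z)) + \Delta_z\tilde V_y(t,z,\mathbf{c}p(t,z))\right]. \tag{60}$$

Let $\tilde W(t,z,p(t,z))$ be any function defined on $P$ such that for $(t,z)\in[0,T]\times\Omega$,

$$(d/dt)\tilde W(t,z,\mathbf{c}p(t,z)) + \Delta_z\tilde W(t,z,\mathbf{c}p(t,z)) := c^0y^0_\varepsilon\left[(d/dt)\tilde V_{y^0}(t,z,\mathbf{c}p(t,z)) + \Delta_z\tilde V_{y^0}(t,z,\mathbf{c}p(t,z)) + L(t,z,-\tilde V_y(t,z,\mathbf{c}p(t,z)),u(t,z))\right]. \tag{61}$$

Since $(d/dt)\tilde V_y(t,z,\mathbf{c}p(t,z)) + \Delta_z\tilde V_y(t,z,\mathbf{c}p(t,z)) = -f(t,z,-\tilde V_y(t,z,\mathbf{c}p(t,z)),u(t,z))$ for $(t,z)\in[0,T]\times\Omega$, it follows, by (60) and (61), that for $(t,z)\in[0,T]\times\Omega$,

$$(d/dt)\tilde W(t,z,\mathbf{c}p(t,z)) + \Delta_z\tilde W(t,z,\mathbf{c}p(t,z)) = \tilde V_t(t,z,\mathbf{c}p(t,z)) + \Delta_z\tilde V(t,z,\mathbf{c}p(t,z)) + c^0y^0_\varepsilon L(t,z,-\tilde V_y(t,z,\mathbf{c}p(t,z)),u(t,z)) + cy(t,z)f(t,z,-\tilde V_y(t,z,\mathbf{c}p(t,z)),u(t,z)). \tag{62}$$

Thus, by (56) and (62), we get

$$(d/dt)\tilde W(t,z,\mathbf{c}p(t,z)) + \Delta_z\tilde W(t,z,\mathbf{c}p(t,z)) \le -\varepsilon c^0y^0_\varepsilon \quad \text{for } (t,z)\in[0,T]\times\Omega. \tag{63}$$

Integrating (63) now and applying (61) we obtain

$$c^0y^0_\varepsilon\int_{[0,T]\times\Omega}\left[(d/dt)\tilde V_{y^0}(t,z,\mathbf{c}p(t,z)) + \operatorname{div}\nabla_z\tilde V_{y^0}(t,z,\mathbf{c}p(t,z))\right]dt\,dz \le -c^0y^0_\varepsilon\int_{[0,T]\times\Omega}L(t,z,x(t,z),u(t,z))\,dt\,dz - \varepsilon c^0y^0_\varepsilon T\operatorname{vol}(\Omega). \tag{64}$$

From (64), (58), (53), (54) and the Green formula it follows that

$$c^0y^0_\varepsilon\int_{[0,T]}\left(\int_{\partial\Omega}\nabla_z\tilde V_{y^0}(t,z,c^0y^0_\varepsilon,cy_\varepsilon(t,z))\,\nu(z)\,dz\right)dt - c^0y^0_\varepsilon\int_{\Omega}\tilde V_{y^0}(0,z,c^0y^0_\varepsilon,cy_\varepsilon(0,z))\,dz \le -c^0y^0_\varepsilon\int_{[0,T]\times\Omega}L(t,z,x(t,z),u(t,z))\,dt\,dz - c^0y^0_\varepsilon\int_{\Omega}l(x(T,z))\,dz - \varepsilon c^0y^0_\varepsilon T\operatorname{vol}(\Omega), \tag{65}$$

where $\nu(\cdot)$ is the exterior unit normal vector to $\partial\Omega$. Similarly, by (59) and (61) we obtain

$$(d/dt)\tilde W(t,z,\mathbf{c}p_\varepsilon(t,z)) + \Delta_z\tilde W(t,z,\mathbf{c}p_\varepsilon(t,z)) \ge 0 \quad \text{for } (t,z)\in[0,T]\times\Omega. \tag{66}$$

Now from (66), (61), (58) and the Green formula we have

$$-c^0y^0_\varepsilon\int_{[0,T]\times\Omega}L(t,z,x_\varepsilon(t,z),u_\varepsilon(t,z))\,dt\,dz - c^0y^0_\varepsilon\int_{\Omega}l(x_\varepsilon(T,z))\,dz \le c^0y^0_\varepsilon\int_{[0,T]}\left(\int_{\partial\Omega}\nabla_z\tilde V_{y^0}(t,z,c^0y^0_\varepsilon,cy_\varepsilon(t,z))\,\nu(z)\,dz\right)dt - c^0y^0_\varepsilon\int_{\Omega}\tilde V_{y^0}(0,z,c^0y^0_\varepsilon,cy_\varepsilon(0,z))\,dz. \tag{67}$$

Therefore, combining (65) with (67) yields

$$-c^0y^0_\varepsilon\int_{[0,T]\times\Omega}L(t,z,x_\varepsilon(t,z),u_\varepsilon(t,z))\,dt\,dz - c^0y^0_\varepsilon\int_{\Omega}l(x_\varepsilon(T,z))\,dz \le -c^0y^0_\varepsilon\int_{[0,T]\times\Omega}L(t,z,x(t,z),u(t,z))\,dt\,dz - c^0y^0_\varepsilon\int_{\Omega}l(x(T,z))\,dz - \varepsilon c^0y^0_\varepsilon T\operatorname{vol}(\Omega), \tag{68}$$

which proves the assertion of the theorem. □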
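Once the two cost values are known, the ε-optimality inequality (55) reduces to a comparison of numbers. The following minimal sketch assumes the sign convention read from (55), with the costs J = ∫L + ∫l already computed by some quadrature; the function name `is_eps_optimal`, its parameter names and the sample values are hypothetical and serve only as an illustration.

```python
def is_eps_optimal(J_eps, J_other, eps, c0, y0_eps, T, vol_omega):
    """Check inequality (55) in the form
    -c0*y0_eps*J_eps <= -c0*y0_eps*J_other - eps*c0*y0_eps*T*vol(Omega),
    where J_eps and J_other are the costs of the candidate pair and of a
    competing admissible pair. Assumption: c0 > 0 and y0_eps <= 0."""
    lhs = -c0 * y0_eps * J_eps
    rhs = -c0 * y0_eps * J_other - eps * c0 * y0_eps * T * vol_omega
    return lhs <= rhs


# Hypothetical numbers: c0 = 2/3, y0_eps = -1.5, T = 2, vol(Omega) = 1, eps = 0.1.
# Here -c0*y0_eps = 1, so (55) reads J_eps <= J_other + eps*T*vol(Omega) = J_other + 0.2.
print(is_eps_optimal(1.15, 1.0, 0.1, 2 / 3, -1.5, 2.0, 1.0))  # True
print(is_eps_optimal(1.25, 1.0, 0.1, 2 / 3, -1.5, 2.0, 1.0))  # False
```

When $-c^0y^0_\varepsilon$ is strictly positive, the check is equivalent to requiring that the candidate cost exceed the competing cost by at most $\varepsilon T\operatorname{vol}(\Omega)$.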


References

1. Barbu V. Analysis and control of nonlinear infinite dimensional systems. Academic Press, Boston, 1993
2. Barbu V. The dynamic programming equation for the time-optimal control problem in infinite dimensions. SIAM J Control Optim 1991; 29: 445–456
3. Barbu V, Da Prato G. Hamilton-Jacobi equations in Hilbert spaces. Pitman Advanced Publishing Program, Boston, 1983
4. Cannarsa P, Carja O. On the Bellman equation for the minimum time problem in infinite dimensions. SIAM J Control Optim 2004; 43: 532–548
5. Casas E. Pontryagin’s principle for state-constrained boundary control problems of semilinear parabolic equations. SIAM J Control Optim 1997; 35: 1297–1327
6. Fattorini HO. Infinite-dimensional optimization and control theory. Cambridge University Press, Cambridge, 1999
7. Fattorini HO, Murphy T. Optimal control for nonlinear parabolic boundary control systems: the Dirichlet boundary conditions. Diff Integ Equ 1994; 7: 1367–1388
8. Fursikov AV. Optimal control of distributed systems. Theory and applications. American Mathematical Society, Providence, RI, 2000
9. Galewska E, Nowakowski A. Multidimensional dual dynamic programming. J Optim Theory Appl 2005; 124: 175–186
10. Galewska E, Nowakowski A. A dual dynamic programming for multidimensional elliptic optimal control problems. Numer Funct Anal Optimization 2006; 27: 279–280
11. Gozzi F, Tessitore ME. Optimality conditions for Dirichlet boundary control problems of parabolic type. J Math Systems Estim Control 1998; 8: 143–146
12. Li X, Yong J. Optimal control theory for infinite dimensional systems. Birkhauser, Boston, 1994
13. Neittaanmaki P, Tiba D. Optimal control of nonlinear parabolic systems. Marcel Dekker, New York, 1994
14. Nowakowski A. The dual dynamic programming. Proc Am Math Soc 1992; 116: 1089–1096
