Lie Algebraic Methods in Optimal Control of Stochastic Systems with Exponential-of-Integral Cost

Charalambos D. Charalambous
Department of Electrical Engineering, McGill University, Montreal, P.Q., Canada H3A 2A7
email:
[email protected]
Submitted to Systems and Control Letters, June 12, 1997
Paper I.D.: SCL 97-118
This work was supported by the Natural Sciences and Engineering Research Council of Canada under grant OGP0183720.
Abstract

The purpose of this paper is to formulate and study the optimal control of partially observed stochastic systems with exponential-of-integral-sample cost, known as risk-sensitive problems, using Lie algebraic tools. This leads to the introduction of the sufficient statistic algebra, $L_s$, through which one can determine a priori the maximum order of the controller. When $\dim(L_s) < \infty$, the construction of the control laws is addressed through extensions of the Wei-Norman method, as in nonlinear filtering problems. Aside from specific known finite-dimensional examples, which are studied in order to illustrate the application of the Lie algebraic tools, new classes of finite-dimensional controllers are identified as well. In addition, relations with minimax dynamic games are explored to assess the importance and generality of the finite-dimensional control systems.
1 Introduction

In [1], Charalambous and Elliott studied the optimal control of partially observed stochastic systems with integral cost. They introduced the notion of the sufficient statistic algebra, which is responsible for propagating the information state of stochastic control problems. The sufficient statistic algebra is a generalization of the estimation algebra of nonlinear filtering problems introduced by Brockett and Clark in [2]. The concept of estimation algebra is by now well studied and understood; it enables us to address the questions of classification, equivalence, minimal realization, and construction of finite-dimensional minimum variance filters. The sufficient statistic algebra is a recent concept which is equally valuable in addressing these questions in the context of optimal control. However, there are certain fundamental differences between the sufficient statistic algebras of the exponential-of-integral and the integral control problems. It is the purpose of this note to introduce the sufficient statistic algebra for stochastic optimal control problems with exponential-of-integral cost, so that the value of the Lie algebraic methods can be exploited. The sufficient statistic algebra is important in determining a priori whether there exist finite-dimensional controllers.

The main contributions are the following:
1. New classes of nonlinear control systems with finite-dimensional controllers are identified;
2. A procedure is introduced for deriving finite-dimensional sub-optimal controllers for a large class of nonlinear systems which are inherently infinite-dimensional;
3. The controller design is related to that of minimax dynamic games used in addressing disturbance attenuation problems.

In section 2, we introduce the sufficient statistic algebra, $L_s$, and the associated tools needed in subsequent sections. In section 3.1, we consider the Linear-Exponential-Quadratic-Gaussian (LEQG) problem; we verify the finite-dimensionality of $L_s$ and its isomorphism with a related algebra. In section 3.2, we obtain finite-dimensionality of $L_s$ for a class of nonlinear systems with exponential-of-quadratic cost. Generalizations to exponential-of-integral costs are discussed. In addition, a sub-optimal methodology is introduced which leads to finite-dimensional controllers for very general classes of nonlinear systems. Relations between risk-sensitive control problems and minimax dynamic games are introduced next, and subsequently used to show that the sub-optimal controllers are important in addressing disturbance attenuation problems with respect to a quadratic cost.

Let $(\Omega, A, P^u)$ be a complete probability space and let $\{F_t;\ t \in [0,T]\}$ be an increasing family of sub-$\sigma$-fields of $A$. All stochastic processes are assumed to be defined on this probability space. Consider a stochastic system based on the following dynamics and observations model:
$$dx(t) = f(x(t))\,dt + \sum_{j=1}^{k} g_j(x(t))\,u_j(t,y)\,dt + \sum_{j=1}^{n} \sigma_j(x(t))\,dw_j(t), \quad x(0) \in \mathbb{R}^n, \tag{1.1}$$

$$dy_j(t) = h_j(x(t))\,dt + db_j(t), \quad y_j(0) = 0 \in \mathbb{R}, \quad 1 \le j \le d. \tag{1.2}$$
Here $\{w_j(t);\ t \in [0,T]\}_{j=1}^{n}$ and $\{b_j(t);\ t \in [0,T]\}_{j=1}^{d}$ are mutually independent standard Brownian motion processes, which are also independent of $x(0)$. $u(\cdot) = [u_1, u_2, \ldots, u_k]'(\cdot)$ is a vector of control processes, with "$'$" denoting the transpose of a matrix. Let $\{F^y_{0,t};\ t \in [0,T]\}$ denote the increasing family of sub-$\sigma$-fields of $A$ generated by the observations up to time $t$. To make the problem precise, we assume $u(t,y) \in U \subset \mathbb{R}^k$, a compact subset, and $u(\cdot) \in L^2_y([0,T]; \mathbb{R}^k)$ (the set of square-integrable $\{F^y_{0,t};\ t \in [0,T]\}$-adapted stochastic processes); we denote the set of such control laws by $U_{ad}$. Our goal is to minimize over the controls $u(\cdot) \in U_{ad}$ the expectation of an exponential-of-integral cost:
$$J_{0,T}(u^*) = \inf_{u \in U_{ad}} E^u\Big\{ \exp\theta\Big[ \int_0^T \ell(x(t), u(t,y))\,dt + \varphi(x(T)) \Big] \Big\}, \quad \theta > 0, \tag{1.3}$$
in which $E^u[\cdot]$ denotes expectation with respect to the measure $P^u$. For simplicity we assume

$$\ell(x,u) = \ell_0(x) + \sum_{j=1}^{k} \ell_j(x)\,u_j + \sum_{j=1}^{k} \rho_j(x)\,u_j^2, \tag{1.4}$$
and that $f, g_j, \sigma_i$ are $C^\infty(\mathbb{R}^n)$ vector fields, $1 \le j \le k$, $1 \le i \le n$, and $h_i, \ell_0, \ell_j, \rho_j$ are $C^\infty(\mathbb{R}^n)$ functions, $1 \le i \le d$, $1 \le j \le k$. Finally, we assume that on $(\Omega, A)$ there is a measure $P$ which is equivalent to $P^u$ (see [3]), with Radon-Nikodym derivative

$$\frac{dP}{dP^u} = \exp\Big( -\sum_{j=1}^{d} \int_0^T h_j(x(s))\,dy_j(s) + \frac{1}{2} \sum_{j=1}^{d} \int_0^T h_j^2(x(s))\,ds \Big) = \Lambda^{u,-1}_{0,T}. \tag{1.5}$$
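Under this measure (i.e., $P$), $\{y_j(t);\ t \in [0,T]\}_{j=1}^{d}$ are standard Brownian motion processes and the distribution of $\{x(t);\ t \in [0,T]\}$ is the same as under $P^u$. Using a version of Bayes' theorem it is easily shown (see [4]) that conditional expectations (resp. unconditional expectations) under $P^u$ are related to those under $P$ by

$$E^u[\Phi(x(t)) \mid F^y_{0,t}] = \frac{E[\Phi(x(t))\,\Lambda^u_{0,t} \mid F^y_{0,t}]}{E[\Lambda^u_{0,t} \mid F^y_{0,t}]} \quad \Big(\text{resp. } E^u[\Phi(x(t))] = E[\Phi(x(t))\,\Lambda^u_{0,t}]\Big), \tag{1.6}$$

in which $E[\cdot]$ denotes expectation with respect to the measure $P$ and

$$\Lambda^u_{0,t} = E\Big[\frac{dP^u}{dP} \,\Big|\, F_{0,t}\Big] = \exp\Big( \sum_{j=1}^{d} \int_0^t h_j(x(s))\,dy_j(s) - \frac{1}{2} \sum_{j=1}^{d} \int_0^t h_j^2(x(s))\,ds \Big). \tag{1.7}$$

Using (1.6) we introduce the following quantities, where $\pi^u_t(\Phi)$ denotes the normalized and $\mu^u_t(\Phi)$ the unnormalized conditional expectation:

$$\pi^u_t(\Phi) = E^u\Big[ \Phi(x(t)) \exp\Big(\theta \int_0^t \ell(x(s), u(s,y))\,ds\Big) \,\Big|\, F^y_{0,t} \Big] = \frac{E\Big[ \Phi(x(t)) \exp\Big(\theta \int_0^t \ell(x(s), u(s,y))\,ds\Big) \Lambda^u_{0,t} \,\Big|\, F^y_{0,t} \Big]}{E[\Lambda^u_{0,t} \mid F^y_{0,t}]} = \frac{\mu^u_t(\Phi)}{E[\Lambda^u_{0,t} \mid F^y_{0,t}]}. \tag{1.8}$$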
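Using (1.8) in (1.3) we now have:

$$\begin{aligned}
J_{0,T}(u) &= E^u\Big\{ E^u\Big[ \exp(\theta\varphi(x(T))) \exp\Big(\theta \int_0^T \ell(x(t), u(t,y))\,dt\Big) \,\Big|\, F^y_{0,T} \Big] \Big\} = E^u\big[ \pi^u_T(\exp(\theta\varphi)) \big] \\
&= E\big[ \Lambda^u_{0,T}\, \pi^u_T(\exp(\theta\varphi)) \big] = E\big\{ E\big[ \Lambda^u_{0,T}\, \pi^u_T(\exp(\theta\varphi)) \mid F^y_{0,T} \big] \big\} \\
&= E\big\{ E\big[ \Lambda^u_{0,T} \mid F^y_{0,T} \big]\, \pi^u_T(\exp(\theta\varphi)) \big\}.
\end{aligned} \tag{1.9}$$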
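Hence, the original cost function (1.3) is now given by

$$J_{0,T}(u) = E\big[ \mu^u_T(\exp(\theta\varphi)) \big], \tag{1.10}$$

where $\mu^u_T(\cdot)$ is the unnormalized conditional expectation appearing in the numerator of (1.8).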
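Under some weak integrability conditions on $\varphi, \ell_0, \ell_j, \rho_j, h_i$, $1 \le j \le k$, $1 \le i \le d$, usually expressed as growth conditions in "$x$" (see [5]), and for sufficiently small $0 < \theta$, it can be shown that $\mu^u_t(\exp(\theta\varphi))$ has a density function $q^u(x,t)$, $(x,t) \in \mathbb{R}^n \times [0,T]$, which is the unique $\{F^y_{0,t};\ t \in [0,T]\}$-adapted solution ($P$-a.s. for any $u \in U_{ad}$) of the Itô stochastic equation

$$\begin{aligned}
q^u(x,t) = q^u(x,0) &+ \int_0^t (A + \theta\ell_0)\,q^u(x,s)\,ds + \sum_{j=1}^{k} \int_0^t L_j\, q^u(x,s)\,u_j(s,y)\,ds \\
&+ \sum_{j=1}^{k} \int_0^t \theta\rho_j\, q^u(x,s)\,u_j^2(s,y)\,ds + \sum_{j=1}^{d} \int_0^t h_j\, q^u(x,s)\,dy_j(s),
\end{aligned} \tag{1.11}$$

in which

$$A\Phi(x) = \frac{1}{2} \sum_{i,j=1}^{n} \frac{\partial^2}{\partial x_i \partial x_j}\big( [\sigma\sigma']_{i,j}\,\Phi \big)(x) - \sum_{j=1}^{n} \frac{\partial}{\partial x_j}(f_j \Phi)(x), \tag{1.12}$$

$$L_j \Phi(x) = -\sum_{i=1}^{n} \frac{\partial}{\partial x_i}(g_{i,j}\Phi)(x) + \theta\ell_j(x)\Phi(x). \tag{1.13}$$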
Therefore, the original optimal control problem (1.1)-(1.4) is now understood in terms of minimizing the linear functional (1.10) with new state $\{q^u(x,s);\ 0 \le s \le t \le T\}$, which is a solution of (1.11). In the latter formulation the quantity $\{q^u(x,s);\ 0 \le s \le t \le T\}$ is an information state, in the sense that this density propagates the information available to the controller. In fact, the cost functional (1.10) together with the new state given by (1.11) is the starting point of our analysis. We now make our earlier exposition precise by introducing the following assumption.
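Assumptions 1.1

The conditional expectation $\mu^u_t(\exp(\theta\varphi))$ in (1.10) has a unique density function, $\mu^u_t(\exp(\theta\varphi)) = \int_{\mathbb{R}^n} \exp(\theta\varphi(z))\, q^u(z,t)\,dz$, satisfying (1.11), and $E|\mu^u_t(\exp(\theta\varphi))| < \infty$ for some $0 < \theta$. □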
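2 Sufficient Statistic Algebras

First, we introduce the corresponding Fisk-Stratonovich version of (1.11), namely,

$$\begin{aligned}
q^u(x,t) = q^u(x,0) &+ \int_0^t L_0^\theta\, q^u(x,s)\,ds + \sum_{j=1}^{k} \int_0^t L_j\, q^u(x,s)\,u_j(s,y)\,ds \\
&+ \sum_{j=1}^{k} \int_0^t \theta\rho_j\, q^u(x,s)\,u_j^2(s,y)\,ds + \sum_{j=1}^{d} \int_0^t h_j\, q^u(x,s) \circ dy_j(s),
\end{aligned} \tag{2.1}$$

in which

$$L_0^\theta \Phi(x) = \Big( A + \theta\ell_0 - \frac{1}{2} \sum_{j=1}^{d} h_j^2 \Big)\Phi(x). \tag{2.2}$$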
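The term $-\frac{1}{2}\sum_j h_j^2$ in (2.2) is the usual Itô-to-Fisk-Stratonovich correction. The following is a short sketch of that step, assuming (as stated after (1.5)) that the $y_j$ are independent standard Brownian motions under $P$; here $q^u$ is the symbol used for the density solving (1.11), and $\langle \cdot, \cdot \rangle_t$ denotes the quadratic covariation.

```latex
\begin{align*}
\int_0^t h_j(x)\, q^u(x,s) \circ dy_j(s)
  &= \int_0^t h_j(x)\, q^u(x,s)\, dy_j(s)
     + \tfrac{1}{2} \big\langle h_j(x)\, q^u(x,\cdot),\, y_j \big\rangle_t \\
  &= \int_0^t h_j(x)\, q^u(x,s)\, dy_j(s)
     + \tfrac{1}{2} \int_0^t h_j^2(x)\, q^u(x,s)\, ds ,
\end{align*}
% because, by (1.11), the dy_j-coefficient of d(h_j q^u) is h_j * (h_j q^u).
% Solving for the Ito integral and substituting into (1.11) moves
% -(1/2) sum_j h_j^2 q^u into the ds-term, which gives (2.1)-(2.2).
```

The control-dependent terms are integrals against $ds$ and are therefore unaffected by the change of stochastic integral.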
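The importance of the next two definitions will be delineated shortly.

Definition 2.1

The information state $q^u(\cdot)$ is a finite-dimensional sufficient statistic (for the control process) if there exist a smooth manifold $M$ with $C^\infty$ vector fields $\alpha : M \to \mathbb{R}^n$; $\beta_{i,1}, \beta_{i,2} : M \to \mathbb{R}^n$; $\gamma_j : M \to \mathbb{R}^n$, $1 \le i \le k$, $1 \le j \le d$, and $C^\infty$ functions $\Psi : \mathbb{R}^n \times M \times [0,T] \to \mathbb{R}$, $\Gamma : [0,T] \times M \to \mathbb{R}$, such that $q^u(\cdot)$ is represented by

$$q^u(x,t) = \Psi(x, \hat{x}^u(t), t), \tag{2.3}$$

where

$$\begin{aligned}
\hat{x}^u(t) = \hat{x}(0) &+ \int_0^t \alpha(\hat{x}^u(s))\,ds + \sum_{j=1}^{k} \int_0^t \beta_{j,1}(\hat{x}^u(s))\,u_j(s,y)\,ds \\
&+ \sum_{j=1}^{k} \int_0^t \beta_{j,2}(\hat{x}^u(s))\,u_j^2(s,y)\,ds + \sum_{j=1}^{d} \int_0^t \gamma_j(\hat{x}^u(s)) \circ dy_j(s),
\end{aligned} \tag{2.4}$$

and the cost criterion $J_{0,T}(u)$ is represented by

$$J_{0,T}(u) = E\big[ \Gamma(T, \hat{x}^u(T)) \big]. \tag{2.5}$$

□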
Note that in the above definition $q^u(x,t)$ depends on $\{u_j(s,y);\ 0 \le s \le t\}_{j=1}^{k}$ through $\{\hat{x}^u(s);\ 0 \le s \le t\}$. One aspect of Definition 2.1 worth noting is the structure of (2.4) with respect to the inputs $u_j, u_j^2, dy_i$, $1 \le j \le k$, $1 \le i \le d$, which is chosen to be consistent with that of (2.1). □
Definition 2.2

The sufficient statistic algebra $L_s$ of the control problem (1.1)-(1.4) is the Lie algebra generated by the operators $L_0^\theta, L_1, L_2, \ldots, L_k, \theta\rho_1, \theta\rho_2, \ldots, \theta\rho_k, h_1, h_2, \ldots, h_d$, which is denoted by

$$L_s = \{ L_0^\theta, L_1, \ldots, L_k, \theta\rho_1, \ldots, \theta\rho_k, h_1, \ldots, h_d \}_{L.A.} \tag{2.6}$$

□
Since the elements of (2.6) are differential operators, we recall that the Lie bracket of differential operators $X, Y : C^\infty(M) \to C^\infty(M)$ with $C^\infty$ coefficients is defined by $[X,Y](\Phi) = X(Y(\Phi)) - Y(X(\Phi))$, $\forall \Phi \in C^\infty(M)$.

We are now equipped with the tools to give a formal discussion of the ideas that enable us to generalize the Lie algebraic methods to the stochastic control problem introduced above. Notice that the conditional density equation (2.1), together with the linear functional $\mu^u_T(\exp(\theta\varphi))$, can be viewed as an infinite-dimensional realization (with state $q^u(x,t)$) of an input-output map from the input functions $u_j(\cdot), u_j^2(\cdot), dy_i(\cdot)$, $1 \le j \le k$, $1 \le i \le d$, to the output $\mu^u_T(\exp(\theta\varphi))$. In this case, if the control problem (1.1)-(1.4) has a finite-dimensional sufficient statistic, which by definition implies that $\mu^u_T(\exp(\theta\varphi))$ is finite-dimensionally computable, then (2.1), (1.10) and (2.4), (2.5) represent two realizations of the same input-output map. It then follows from nonlinear realization theory (see [6]) that if (2.4), $\Gamma(T, \hat{x}^u(T))$ is observable, there is a smooth map from the reachable part of (2.1) to the reachable part of (2.4).

In addition, if (2.4), $\Gamma(T, \hat{x}^u(T))$ is a minimal realization, then there exists a differentiable map which maps $L_0^\theta$ to $\alpha$, $L_j$ to $\beta_{j,1}$, $\theta\rho_i$ to $\beta_{i,2}$, and $h_m$ to $\gamma_m$, $1 \le i, j \le k$, $1 \le m \le d$, and which extends to a homomorphism of Lie algebras

$$\tau : L_s = \{ L_0^\theta, L_1, \ldots, L_k, \theta\rho_1, \ldots, \theta\rho_k, h_1, \ldots, h_d \}_{L.A.} \to \hat{L}_s = \{ \alpha, \beta_{1,1}, \ldots, \beta_{k,1}, \beta_{1,2}, \ldots, \beta_{k,2}, \gamma_1, \ldots, \gamma_d \}_{L.A.}$$

(i.e., $\tau$ preserves the Lie brackets: $\tau([u,v]) = [\tau(u), \tau(v)]$, $\forall u, v \in L_s$). Moreover, $\tau$ is also an isomorphism if it is both injective and surjective. In fact, if there exist finite-dimensional sufficient statistics, then (2.1) together with $\mu^u_T(\exp(\theta\varphi))$ and (2.4) together with $\Gamma(T, \hat{x}^u(T))$ realize the same input-output map. The dimension of $L_s$ is an upper bound on the minimum number of sufficient statistics.

Finally, we note that when $\dim(L_s) < \infty$, the Wei-Norman method can be used to construct the explicit representation of (2.1), which is of the form (2.3), (2.4) (see [6]).
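3 Finite-Dimensional Sufficient Statistic Algebras

In the rest of the paper we assume validity of the following conditions.

Assumptions 3.1

Assumptions 1.1 hold, $[\sigma_1, \sigma_2, \ldots, \sigma_n][\sigma_1, \sigma_2, \ldots, \sigma_n]'(x) = I_n$, that is, $\sigma(x)$ is orthogonal, and for the case of scalar observations $H \ne 0$. □

Define

$$D_i = \frac{\partial}{\partial x_i} - f_i, \ 1 \le i \le n; \quad \eta = \sum_{i=1}^{n} \frac{\partial}{\partial x_i}(f_i) + \sum_{i=1}^{n} f_i^2 + \sum_{i=1}^{d} h_i^2; \quad w_{i,j} = \frac{\partial}{\partial x_i}(f_j) - \frac{\partial}{\partial x_j}(f_i), \ 1 \le i, j \le n. \tag{3.1}$$

Then

$$L_0^\theta = \frac{1}{2}\Big( \sum_{i=1}^{n} D_i^2 - \eta \Big) + \theta\ell_0, \tag{3.2}$$

where $D_i^2(\Phi)(x) = \Big( \frac{\partial^2}{\partial x_i^2} - \frac{\partial}{\partial x_i}(f_i) - 2 f_i \frac{\partial}{\partial x_i} + f_i^2 \Big)(\Phi)(x)$, $1 \le i \le n$.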
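The step from (2.2) to (3.2) can be spelled out as follows. This is a sketch under Assumptions 3.1 (so that $[\sigma\sigma'] = I_n$ in (1.12)), with $\Phi$ a smooth test function and $\eta$ taken to include the observation term $\sum_j h_j^2$, as it does in (3.7) and (3.27).

```latex
\begin{align*}
A\Phi &= \tfrac{1}{2}\sum_{i=1}^{n}\frac{\partial^2\Phi}{\partial x_i^2}
        - \sum_{i=1}^{n} f_i\,\frac{\partial\Phi}{\partial x_i}
        - \sum_{i=1}^{n}\frac{\partial f_i}{\partial x_i}\,\Phi , \\
\tfrac{1}{2}\sum_{i=1}^{n} D_i^2\Phi
      &= \tfrac{1}{2}\sum_{i=1}^{n}\frac{\partial^2\Phi}{\partial x_i^2}
        - \sum_{i=1}^{n} f_i\,\frac{\partial\Phi}{\partial x_i}
        - \tfrac{1}{2}\sum_{i=1}^{n}\frac{\partial f_i}{\partial x_i}\,\Phi
        + \tfrac{1}{2}\sum_{i=1}^{n} f_i^2\,\Phi , \\
A\Phi &= \tfrac{1}{2}\sum_{i=1}^{n} D_i^2\Phi
        - \tfrac{1}{2}\Big(\sum_{i=1}^{n}\frac{\partial f_i}{\partial x_i}
        + \sum_{i=1}^{n} f_i^2\Big)\Phi , \\
L_0^{\theta}\Phi &= \Big(A + \theta\ell_0 - \tfrac{1}{2}\sum_{j=1}^{d} h_j^2\Big)\Phi
        = \tfrac{1}{2}\Big(\sum_{i=1}^{n} D_i^2 - \eta\Big)\Phi + \theta\ell_0\,\Phi .
\end{align*}
```

The second line is the expansion of $D_i^2$ given above, summed over $i$; the third line is the difference of the first two.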
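3.1 The Linear-Exponential-Quadratic Case

Much insight into the Lie algebraic tools is gained by investigating the Linear-Exponential-Quadratic-Gaussian (LEQG) problem, whose complete solution was first announced in [7]. Consider the system

$$dx(t) = F x(t)\,dt + \sum_{j=1}^{k} B_j u_j(t,y)\,dt + dw(t), \quad x(0) \in \mathbb{R}^n, \tag{3.3}$$

$$dy_j(t) = \sum_{i=1}^{n} H_{j,i} x_i(t)\,dt + db_j(t), \quad y_j(0) \in \mathbb{R}, \quad 1 \le j \le d, \tag{3.4}$$

$$\ell(x,u) = \frac{1}{2} \sum_{i=1}^{n} Q_{i,i} x_i^2 + \frac{1}{2} \sum_{i=1}^{k} R_{i,i} u_i^2 = \ell_q(x,u), \quad Q_{i,i} \ge 0, \ R_{j,j} > 0, \ \forall i, j. \tag{3.5}$$

The sufficient statistic algebra $L_s$ is generated by the operators

$$L_0^\theta = \frac{1}{2}\Big( \sum_{i=1}^{n} D_i^2 - \eta \Big) + \frac{\theta}{2} \sum_{i=1}^{n} Q_{i,i} x_i^2, \quad \Big\{ L_i = -\sum_{j=1}^{n} B_{j,i} \frac{\partial}{\partial x_j} \Big\}_{i=1}^{k}, \quad \Big\{ h_i = \sum_{j=1}^{n} H_{i,j} x_j \Big\}_{i=1}^{d}, \quad \Big\{ \theta\rho_i = \frac{\theta}{2} R_{i,i} \cdot 1 \Big\}_{i=1}^{k}, \tag{3.6}$$

where

$$D_i = \frac{\partial}{\partial x_i} - \sum_{j=1}^{n} F_{i,j} x_j, \ 1 \le i \le n; \quad \eta = \sum_{i=1}^{n} F_{i,i} + \sum_{i=1}^{n} \Big( \sum_{j=1}^{n} F_{i,j} x_j \Big)^2 + \sum_{j=1}^{d} \Big( \sum_{i=1}^{n} H_{j,i} x_i \Big)^2. \tag{3.7}$$

Using the definition of the Lie bracket we obtain the following non-zero commutation relations:

$$[L_0^\theta, x_j] = D_j, \quad [L_0^\theta, D_j] = \sum_{i=1}^{n} (F_{i,j} - F_{j,i}) D_i + \frac{1}{2} \frac{\partial}{\partial x_j}\Big( \eta - \theta \sum_{i=1}^{n} Q_{i,i} x_i^2 \Big), \quad 1 \le j \le n, \tag{3.8}$$

$$[D_i, x_j] = \delta_{i,j}, \tag{3.9}$$

$$\text{where } \delta_{i,j} = 1 \text{ if } i = j \text{ and } \delta_{i,j} = 0 \text{ if } i \ne j. \tag{3.10}$$

Consequently, we deduce that $L_s$ has dimension at most $2n + 2$, with

$$L_s \subseteq \mathrm{Span}\{ L_0^\theta, x_1, x_2, \ldots, x_n, D_1, D_2, \ldots, D_n, 1 \}. \tag{3.11}$$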
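The relations (3.8)-(3.9) can be spot-checked by machine in the scalar case $n = d = 1$. The sketch below is not part of the original derivation: it applies the operators to polynomial test functions with exact rational arithmetic, and all function names (`scalar_leqg_operators`, `check_relations`, and the polynomial helpers) are hypothetical. It assumes $\eta = (Fx)^2 + F + (Hx)^2$ and $L_0^\theta = \frac{1}{2}(D^2 - \eta) + \frac{\theta}{2} Q x^2$, so that $\frac{1}{2}\frac{\partial}{\partial x}(\eta - \theta Q x^2) = (F^2 + H^2 - \theta Q)x$.

```python
from fractions import Fraction as Fr

# A polynomial in x is a list of coefficients: p[i] multiplies x**i.

def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def scale(c, p):
    return trim([c * a for a in p])

def mul_x(p, k=1):
    """Multiplication operator x**k."""
    return trim([Fr(0)] * k + list(p))

def ddx(p):
    """Differentiation d/dx."""
    return trim([i * p[i] for i in range(1, len(p))] or [Fr(0)])

def bracket(X, Y):
    """Commutator [X, Y] = XY - YX of two operators on polynomials."""
    return lambda p: add(X(Y(p)), scale(-1, Y(X(p))))

def scalar_leqg_operators(F, H, Q, theta):
    """Operators D, L0 and x for n = 1 (hypothetical helper)."""
    def D(p):    # D = d/dx - F x
        return add(ddx(p), scale(-F, mul_x(p)))
    def eta(p):  # eta = (F x)^2 + F + (H x)^2, acting by multiplication
        return add(scale(F * F + H * H, mul_x(p, 2)), scale(F, p))
    def L0(p):   # L0 = (D^2 - eta)/2 + (theta/2) Q x^2
        half = scale(Fr(1, 2), add(D(D(p)), scale(-1, eta(p))))
        return add(half, scale(theta * Q / 2, mul_x(p, 2)))
    def x(p):
        return mul_x(p)
    return D, L0, x

def check_relations(F, H, Q, theta, p):
    """Evaluate both sides of the scalar relations on the test polynomial p."""
    D, L0, x = scalar_leqg_operators(F, H, Q, theta)
    p = trim(p)
    # (1/2) d/dx (eta - theta Q x^2) = (F^2 + H^2 - theta Q) x
    rhs = scale(F * F + H * H - theta * Q, mul_x(p))
    return {
        "[L0, x] = D": bracket(L0, x)(p) == D(p),
        "[D, x] = 1": bracket(D, x)(p) == p,
        "[L0, D] = (1/2) d/dx(eta - theta Q x^2)": bracket(L0, D)(p) == rhs,
    }

# Test function 3 - 2x + 5x^3 with arbitrary rational parameters.
results = check_relations(Fr(2), Fr(3), Fr(1), Fr(1, 4),
                          [Fr(3), Fr(-2), Fr(0), Fr(5)])
for name, ok in results.items():
    print(name, ok)
```

Polynomials are used only because they make exact comparison possible; the identities themselves hold for any smooth test function.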
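Example 3.1 (The scalar LEQG case).

Consider the system

$$dx(t) = F x(t)\,dt + B u(t,y)\,dt + dw(t), \quad x(0) \in \mathbb{R}, \tag{3.12}$$

$$dy(t) = H x(t)\,dt + db(t), \quad y(0) = 0 \in \mathbb{R}, \tag{3.13}$$

$$\ell(x,u) = \frac{1}{2} Q x^2 + \frac{1}{2} R u^2 = \ell_q(x,u), \quad Q \ge 0, \ R > 0. \tag{3.14}$$

Then $L_s$ is generated by the operators

$$L_0^\theta = \frac{1}{2}(D^2 - \eta) + \frac{\theta}{2} Q x^2, \quad L = -B \frac{\partial}{\partial x}, \quad \theta\rho = \frac{\theta}{2} R \cdot 1, \quad h = Hx, \tag{3.15}$$

in which $D = \frac{\partial}{\partial x} - Fx$ and $\eta = (Fx)^2 + F + (Hx)^2$. Next we compute a sequence of Lie brackets to obtain the elements $\{e_i\} \subset L_s$ (which we expect to be finite in number). Below, $\kappa_1, \ldots, \kappa_5$ denote constants.

$$[L_0^\theta, L] = -BF \frac{\partial}{\partial x} - B(H^2 - \theta Q)x \ \Rightarrow\ [L_0^\theta, L] = \kappa_1 D + \kappa_2 x.$$

Hence $D \in L_s$. Continuing,

$$[L_0^\theta, h] = HD, \quad [[L_0^\theta, h], h] = H^2.$$

Hence $1 \in L_s$ and $[L_0^\theta, h] \in L_s$. Moreover,

$$[L_0^\theta, [L_0^\theta, h]] = H(F^2 + H^2)x - \theta Q H x \ \Rightarrow\ [L_0^\theta, [L_0^\theta, h]] = \kappa_3 x,$$

$$[L, h] = -BH \ \Rightarrow\ [L, h] = \kappa_4 \cdot 1, \quad [L, [L_0^\theta, h]] = BHF \ \Rightarrow\ [L, [L_0^\theta, h]] = \kappa_5 \cdot 1.$$

From the above calculations we conclude that $L_s$ is 4-dimensional (if $H$ is nonzero), spanned by the elements

$$L_s = \mathrm{Span}\{ L_0^\theta, x, D, 1 \}. \tag{3.16}$$

One can easily show, either by using the Wei-Norman method or by solving (2.1), that when $x(0)$ is Gaussian (with mean $\xi$ and variance $P_0$) the finite-dimensional sufficient statistic of Definition 2.1 is given by

$$q^u(x,t) = \frac{1}{(2\pi)^{1/2} P^{1/2}(t)} \exp\Big( -\frac{1}{2} (x - \hat{x}(t))^2 P^{-1}(t) \Big)\, \hat{\Lambda}(t), \tag{3.17}$$

where

$$d\hat{x}(t) = \big( F + \theta P(t) Q - P(t) H^2 \big) \hat{x}(t)\,dt + B u(t,y)\,dt + P(t) H \circ dy(t), \quad \hat{x}(0) = \xi, \tag{3.18}$$

$$\dot{P}(t) = 2 F P(t) - P^2(t)\big( H^2 - \theta Q \big) + 1, \quad P(0) = P_0, \tag{3.19}$$

$$d\hat{\Lambda}(t) = -\frac{1}{2} \hat{\Lambda}(t) \big( H^2 \hat{x}^2(t) + H^2 P(t) \big)\,dt + \hat{\Lambda}(t) H \hat{x}(t) \circ dy(t) + \frac{\theta}{2} \hat{\Lambda}(t) \big( Q \hat{x}^2(t) + R u^2(t,y) + P(t) Q \big)\,dt, \quad \hat{\Lambda}(0) = 1. \tag{3.20}$$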
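The role of the risk parameter $\theta$ in (3.19) is easy to see numerically: the quadratic coefficient is $H^2 - \theta Q$, and when it is positive the right-hand side of (3.19) has the unique positive equilibrium $P_\infty = \big(F + \sqrt{F^2 + H^2 - \theta Q}\big)/(H^2 - \theta Q)$. The sketch below integrates (3.19) with a classical fourth-order Runge-Kutta scheme; the function names, step count and parameter values are illustrative assumptions, not taken from the paper.

```python
import math

def riccati_rhs(P, F, H, Q, theta):
    """Right-hand side of (3.19): dP/dt = 2 F P - P^2 (H^2 - theta Q) + 1."""
    return 2.0 * F * P - P * P * (H * H - theta * Q) + 1.0

def integrate_riccati(P0, F, H, Q, theta, T, steps):
    """Classical RK4 integration of (3.19) on [0, T]; returns P(T)."""
    dt = T / steps
    P = P0
    for _ in range(steps):
        k1 = riccati_rhs(P, F, H, Q, theta)
        k2 = riccati_rhs(P + 0.5 * dt * k1, F, H, Q, theta)
        k3 = riccati_rhs(P + 0.5 * dt * k2, F, H, Q, theta)
        k4 = riccati_rhs(P + dt * k3, F, H, Q, theta)
        P += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return P

def steady_state(F, H, Q, theta):
    """Positive root of 0 = 2 F P - a P^2 + 1 with a = H^2 - theta Q > 0."""
    a = H * H - theta * Q
    if a <= 0.0:
        raise ValueError("no positive equilibrium unless H^2 - theta*Q > 0")
    return (F + math.sqrt(F * F + a)) / a

# Illustrative values: F = 0, H = Q = 1, theta = 0.5, P(0) = 1.
P_T = integrate_riccati(1.0, 0.0, 1.0, 1.0, 0.5, 20.0, 2000)
print(P_T, steady_state(0.0, 1.0, 1.0, 0.5))
```

For $\theta Q \ge H^2$ the helper `steady_state` raises an error, which reflects the restriction to sufficiently small $\theta$ made before (1.11).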
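Actually, the case when $x(0)$ is non-Gaussian can be handled as well. Next, we wish to exploit the relationship between the Lie algebra associated with (3.18)-(3.20) and the sufficient statistic algebra $L_s$. For simplicity, we set $F = 0$, $B = H = Q = R = 1$. The corresponding equations are

$$d\hat{x}(t) = \big( \theta P(t) - P(t) \big) \hat{x}(t)\,dt + u(t,y)\,dt + P(t) \circ dy(t), \tag{3.21}$$

$$\dot{P}(t) = -P^2(t)(1 - \theta) + 1, \tag{3.22}$$

$$d\hat{\Lambda}(t) = -\frac{1}{2} \hat{\Lambda}(t) \big( \hat{x}^2(t) + P(t) \big)\,dt + \hat{\Lambda}(t) \hat{x}(t) \circ dy(t) + \frac{\theta}{2} \hat{\Lambda}(t) \big( \hat{x}^2(t) + u^2(t,y) + P(t) \big)\,dt. \tag{3.23}$$

The Lie algebra $\hat{L}_s$ of the 3-dimensional system (3.21)-(3.23), with state $(\hat{x}, P, \hat{\Lambda})$ and inputs $dy, u, u^2$, is the one generated by the vector fields

$$f_0 = \begin{pmatrix} P(\theta - 1)\hat{x} \\ (\theta - 1)P^2 + 1 \\ -\frac{1}{2}(\hat{x}^2 + P)\hat{\Lambda} + \frac{\theta}{2}(\hat{x}^2 + P)\hat{\Lambda} \end{pmatrix}, \quad f_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad f_2 = \begin{pmatrix} 0 \\ 0 \\ \frac{\theta}{2}\hat{\Lambda} \end{pmatrix}, \quad f_3 = \begin{pmatrix} P \\ 0 \\ \hat{x}\hat{\Lambda} \end{pmatrix}. \tag{3.24}$$

Also, $L_s = \mathrm{Span}\big\{ \frac{1}{2}\frac{\partial^2}{\partial x^2} - \frac{1}{2}(1 - \theta)x^2,\ x,\ \frac{\partial}{\partial x},\ 1 \big\}$.

Using the Lie bracket of vector fields defined by $[f, g] = \frac{\partial}{\partial x}(f)\, g - \frac{\partial}{\partial x}(g)\, f$ (i.e., $g, f \in C^\infty(\mathbb{R}^n)$), we calculate the following:

$$[f_0, f_1] = \begin{pmatrix} P(\theta - 1) \\ 0 \\ \hat{x}\hat{\Lambda}(\theta - 1) \end{pmatrix} = (\theta - 1) f_3, \quad [f_0, f_2] = 0, \quad [f_0, f_3] = \begin{pmatrix} -1 \\ 0 \\ 0 \end{pmatrix} = -f_1,$$

$$[f_1, f_2] = 0, \quad [f_1, f_3] = -\begin{pmatrix} 0 \\ 0 \\ \hat{\Lambda} \end{pmatrix} = -\frac{2}{\theta} f_2, \quad [f_2, f_3] = 0.$$

Therefore, $\hat{L}_s$ is 4-dimensional with basis $\hat{L}_s = \mathrm{Span}\{ f_0, f_1, f_2, f_3 \}$. Moreover, it can be verified that there is a Lie algebra homomorphism $\tau : L_s \to \hat{L}_s$, which takes $L_0^\theta \to f_0$, $x \to f_3$, $\frac{\partial}{\partial x} \to f_1$, $1 \to f_2$, and which is also an isomorphism. □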
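The bracket table for the vector fields (3.24) can be checked numerically. The sketch below is an illustration rather than the paper's method: it approximates the Jacobian-vector products by central differences and uses the bracket convention stated above, $[f, g] = \frac{\partial f}{\partial s} g - \frac{\partial g}{\partial s} f$, with state $s = (\hat{x}, P, \hat{\Lambda})$. The helper names, the evaluation point and the value of $\theta$ are assumptions made for the example.

```python
def f0(s, theta):
    xh, P, lam = s
    return [P * (theta - 1.0) * xh,
            (theta - 1.0) * P * P + 1.0,
            -0.5 * (xh * xh + P) * lam + 0.5 * theta * (xh * xh + P) * lam]

def f1(s, theta):
    return [1.0, 0.0, 0.0]

def f2(s, theta):
    return [0.0, 0.0, 0.5 * theta * s[2]]

def f3(s, theta):
    xh, P, lam = s
    return [P, 0.0, xh * lam]

def directional_derivative(f, s, v, theta, eps=1e-5):
    """Central-difference approximation of the Jacobian of f at s applied to v."""
    plus = f([si + eps * vi for si, vi in zip(s, v)], theta)
    minus = f([si - eps * vi for si, vi in zip(s, v)], theta)
    return [(a - b) / (2.0 * eps) for a, b in zip(plus, minus)]

def bracket(f, g, s, theta):
    """[f, g] = (df/ds) g - (dg/ds) f, the convention used in the text."""
    fg = directional_derivative(f, s, g(s, theta), theta)
    gf = directional_derivative(g, s, f(s, theta), theta)
    return [a - b for a, b in zip(fg, gf)]

def close(u, v, tol=1e-6):
    return all(abs(a - b) < tol for a, b in zip(u, v))

# Arbitrary evaluation point and risk parameter.
s, theta = [0.7, 1.3, 2.0], 0.4
checks = {
    "[f0,f1] = (theta-1) f3": close(bracket(f0, f1, s, theta),
                                    [(theta - 1.0) * c for c in f3(s, theta)]),
    "[f0,f2] = 0": close(bracket(f0, f2, s, theta), [0.0, 0.0, 0.0]),
    "[f0,f3] = -f1": close(bracket(f0, f3, s, theta),
                           [-c for c in f1(s, theta)]),
    "[f1,f2] = 0": close(bracket(f1, f2, s, theta), [0.0, 0.0, 0.0]),
    "[f1,f3] = -(2/theta) f2": close(bracket(f1, f3, s, theta),
                                     [-(2.0 / theta) * c for c in f2(s, theta)]),
    "[f2,f3] = 0": close(bracket(f2, f3, s, theta), [0.0, 0.0, 0.0]),
}
for name, ok in checks.items():
    print(name, ok)
```

Since every bracket of $f_0, \ldots, f_3$ is again a linear combination of $f_0, \ldots, f_3$, the span is closed, which is the content of the 4-dimensionality claim.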
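3.2 The Nonlinear-Drift Exponential-of-Integral Cost

First, we introduce a class of systems with nonlinear drift terms which have a finite-dimensional sufficient statistic; the cost is of exponential-of-quadratic form.

Theorem 3.1

Consider system (1.1)-(1.3) with

$$\begin{aligned}
f_i &= \sum_{j=1}^{n} F_{i,j} x_j, \quad 1 \le i \le m, \\
f_{m+1} &= f_{m+1}(x_1, x_2, \ldots, x_n), \\
&\ \ \vdots \\
f_n &= f_n(x_1, x_2, \ldots, x_n), \\
h_i &= \sum_{j=1}^{n} H_{i,j} x_j, \quad 1 \le i \le d,
\end{aligned} \tag{3.25}$$

$$B_{i,j} = 0, \ \forall i > m, \ 1 \le j \le k; \quad \ell(x,u) = \frac{1}{2} \sum_{i=1}^{n} Q_{i,i} x_i^2 + \frac{1}{2} \sum_{i=1}^{k} R_{i,i} u_i^2 = \ell_q(x,u), \quad Q_{i,i} \ge 0, \ R_{i,i} > 0, \ \forall i. \tag{3.26}$$

1. If

$$\eta = \sum_{i=1}^{n} \frac{\partial}{\partial x_i}(f_i) + \sum_{i=1}^{n} f_i^2 + \sum_{i=1}^{d} h_i^2 = \text{a nonnegative quadratic function of } (x_1, \ldots, x_n), \tag{3.27}$$

and

$$w_{i,j} = \text{constant}, \quad 1 \le i, j \le n, \tag{3.28}$$

then $L_s$ has dimension at most $2n + 2$, with

$$L_s \subseteq \mathrm{Span}\Big\{ L_0^\theta = \frac{1}{2}\Big( \sum_{i=1}^{n} D_i^2 - \eta \Big) + \theta\ell_0,\ \{x_i\}_{i=1}^{n},\ \Big\{\frac{\partial}{\partial x_i}\Big\}_{i=1}^{m},\ \{D_i\}_{i=m+1}^{n},\ 1 \Big\}, \quad \ell_0(x) = \frac{1}{2} \sum_{i=1}^{n} Q_{i,i} x_i^2. \tag{3.29}$$

2. If

$$\frac{\partial}{\partial x_j} f_i = \frac{\partial}{\partial x_i} f_j = 0, \ 1 \le i \le m, \ m + 1 \le j \le n; \quad h_i = \sum_{j=1}^{m} H_{i,j} x_j, \ 1 \le i \le d, \tag{3.30}$$

and $\eta$ = a nonnegative quadratic function of $(x_1, \ldots, x_m)$ + $\zeta(x_{m+1}, \ldots, x_n)$, for some $\zeta \in C^\infty(\mathbb{R}^{n-m})$, then $L_s$ has dimension at most $2m + 2$, with

$$L_s \subseteq \mathrm{Span}\Big\{ L_0^\theta,\ \{x_i\}_{i=1}^{m},\ \Big\{\frac{\partial}{\partial x_i}\Big\}_{i=1}^{m},\ 1 \Big\}. \tag{3.31}$$

The non-zero commutation relations are

$$[L_0^\theta, x_j] = D_j, \quad 1 \le j \le n,$$

$$\Big[L_0^\theta, \frac{\partial}{\partial x_j}\Big] = \sum_{i=1}^{m} \Big( w_{j,i} D_i + \frac{\partial}{\partial x_i}(f_j) D_i \Big) + \sum_{i=m+1}^{n} \Big( w_{j,i} D_i + \frac{\partial}{\partial x_i}(f_j) D_i \Big) + \frac{1}{2} \frac{\partial}{\partial x_j}\Big( \eta - \theta \sum_{i=1}^{n} Q_{i,i} x_i^2 \Big), \quad 1 \le j \le m,$$

$$[L_0^\theta, D_j] = \sum_{i=1}^{n} w_{j,i} D_i + \frac{1}{2} \frac{\partial}{\partial x_j}\Big( \eta - \theta \sum_{i=1}^{n} Q_{i,i} x_i^2 \Big), \quad m + 1 \le j \le n,$$

$$\Big[\frac{\partial}{\partial x_i}, x_j\Big] = \delta_{i,j}, \ 1 \le j \le m, \ 1 \le i \le n; \quad [D_i, x_j] = \delta_{i,j}, \ 1 \le j \le n, \ m + 1 \le i \le n,$$

$$\Big[\frac{\partial}{\partial x_i}, D_j\Big] = -w_{i,j} - \frac{\partial}{\partial x_j}(f_i), \quad 1 \le i \le m, \ m + 1 \le j \le n.$$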
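Proof. 1. Let $\tilde{f}_1 = L_0^\theta$, $\big\{ \tilde{f}_{2,i} = -\sum_{j=1}^{n} B_{j,i} \frac{\partial}{\partial x_j} \big\}_{i=1}^{k}$, $\big\{ \tilde{f}_{3,i} = \sum_{j=1}^{n} H_{i,j} x_j \big\}_{i=1}^{d}$, $\big\{ \tilde{f}_{4,i} = \frac{\theta}{2} R_{i,i} \cdot 1 \big\}_{i=1}^{k}$. Then, with $p$ denoting a summation index,

$$[\tilde{f}_1, \tilde{f}_{2,j}] = -\sum_{i=1}^{n} \sum_{p=1}^{n} \Big( B_{p,j} \frac{\partial}{\partial x_p}(f_i) D_i + \frac{1}{2} B_{p,j} \frac{\partial^2}{\partial x_i \partial x_p}(f_i) \Big) - \frac{1}{2} \sum_{p=1}^{n} B_{p,j} \frac{\partial}{\partial x_p}\Big( \eta - \theta \sum_{i=1}^{n} Q_{i,i} x_i^2 \Big).$$

Since $B_{i,j} = 0$ for $i > m$, $1 \le j \le k$, and $\frac{\partial}{\partial x_p}(f_i) - \frac{\partial}{\partial x_i}(f_p) = w_{p,i}$, we have, for some constants $\kappa_i, \lambda_{1,i}, \lambda_{2,i}$,

$$\begin{aligned}
[\tilde{f}_1, \tilde{f}_{2,j}] &= -\sum_{i=1}^{n} \sum_{p=1}^{m} \Big( B_{p,j} \frac{\partial}{\partial x_i}(f_p) D_i + B_{p,j} w_{p,i} D_i + \frac{1}{2} B_{p,j} \frac{\partial^2}{\partial x_i^2}(f_p) \Big) - \frac{1}{2} \sum_{p=1}^{m} B_{p,j} \frac{\partial}{\partial x_p}\Big( \eta - \theta \sum_{i=1}^{n} Q_{i,i} x_i^2 \Big) \\
&= \sum_{i=1}^{n} \kappa_i x_i + \sum_{i=1}^{m} \lambda_{1,i} \frac{\partial}{\partial x_i} + \sum_{i=m+1}^{n} \lambda_{2,i} D_i + \kappa_{n+1} \cdot 1, \quad 1 \le j \le k.
\end{aligned}$$

Hence $\{ \tilde{f}_{5,i} = x_i \}_{i=1}^{n}$, $\big\{ \tilde{f}_{6,i} = \frac{\partial}{\partial x_i} \big\}_{i=1}^{m}$, $\{ \tilde{f}_{7,i} = D_i \}_{i=m+1}^{n} \subset L_s$. Continuing, again with generic constants,

$$[\tilde{f}_1, \tilde{f}_{3,j}] = \sum_{i=1}^{n} H_{j,i} D_i = \sum_{i=1}^{n} \kappa_i \tilde{f}_{5,i} + \sum_{i=1}^{m} \lambda_{1,i} \frac{\partial}{\partial x_i} + \sum_{i=m+1}^{n} \lambda_{2,i} D_i, \quad 1 \le j \le d,$$

$$[\tilde{f}_{2,i}, \tilde{f}_{3,j}] = -\sum_{p=1}^{m} B_{p,i} H_{j,p} = \kappa_1 \cdot 1, \quad 1 \le i \le k, \ 1 \le j \le d.$$

From the commutation relations we deduce (3.29).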
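2. Consider the case $\frac{\partial}{\partial x_j} f_i = \frac{\partial}{\partial x_i} f_j = 0$, $1 \le i \le m$, $m + 1 \le j \le n$; $h_i = \sum_{j=1}^{m} H_{i,j} x_j$, $1 \le i \le d$. Then, with $p$ denoting a summation index,

$$[\tilde{f}_1, \tilde{f}_{2,j}] = -\sum_{i=1}^{m} \sum_{p=1}^{m} \Big\{ B_{p,j} \frac{\partial}{\partial x_p}(f_i) D_i \Big\} - \frac{1}{2} \sum_{p=1}^{m} B_{p,j} \frac{\partial}{\partial x_p}\Big( \eta - \theta \sum_{i=1}^{n} Q_{i,i} x_i^2 \Big), \quad 1 \le j \le k.$$

From the assumption on $\eta$ we also deduce, for some constants $\kappa_i, \lambda_{1,i}$,

$$[\tilde{f}_1, \tilde{f}_{2,j}] = \sum_{i=1}^{m} \kappa_i D_i + \sum_{i=1}^{m} \lambda_{1,i} x_i + \kappa_{m+1} \cdot 1, \quad 1 \le j \le k,$$

$$[\tilde{f}_1, \tilde{f}_{3,j}] = \sum_{i=1}^{m} H_{j,i} D_i, \quad 1 \le j \le d.$$

Hence $\{ \tilde{f}_{5,j} = x_j \}_{j=1}^{m} \subset L_s$, and from the commutation relations we obtain (3.31). □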
Remark 3.1

At this stage it is clear that, in view of the requirement that (3.27) be satisfied, the class of systems with finite-dimensional controllers is very restrictive. This is, however, the case for nonlinear filtering problems as well. Next, we shall remove (3.27) by considering cost functions which are not necessarily of exponential-of-quadratic form. □
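Corollary 3.1

Consider system (1.1)-(1.4) with

$$\begin{aligned}
f_i &= \sum_{j=1}^{n} F_{i,j} x_j, \quad 1 \le i \le m, \\
f_{m+1} &= f_{m+1}(x_1, x_2, \ldots, x_n), \\
&\ \ \vdots \\
f_n &= f_n(x_1, x_2, \ldots, x_n), \\
h_i &= \sum_{j=1}^{n} H_{i,j} x_j, \quad 1 \le i \le d,
\end{aligned} \tag{3.32}$$

$$B_{i,j} = 0, \ \forall i > m, \ 1 \le j \le k; \quad \ell(x,u) = \frac{1}{2} \sum_{i=1}^{n} Q_{i,i} x_i^2 + \frac{1}{2} \sum_{i=1}^{k} R_{i,i} u_i^2 + \frac{1}{2\theta} \bar{\ell}_0(x), \quad Q_{i,i} \ge 0, \ R_{j,j} > 0, \ \forall i, j. \tag{3.33}$$

1. If there exists an $\bar{\ell}_0$ such that

$$\eta - \bar{\ell}_0 = \text{a nonnegative quadratic function of } (x_1, \ldots, x_n), \tag{3.34}$$

and

$$w_{i,j} = \text{constant}, \quad 1 \le i, j \le n, \tag{3.35}$$

then $L_s$ has dimension at most $2n + 2$, with

$$L_s \subseteq \mathrm{Span}\Big\{ L_0^\theta = \frac{1}{2}\Big( \sum_{i=1}^{n} D_i^2 - [x' \tilde{Q} x + m x + \delta] \Big) + \frac{\theta}{2} \sum_{i=1}^{n} Q_{i,i} x_i^2,\ \{x_i\}_{i=1}^{n},\ \Big\{\frac{\partial}{\partial x_i}\Big\}_{i=1}^{m},\ \{D_i\}_{i=m+1}^{n},\ 1 \Big\}, \tag{3.36}$$

where $\tilde{Q} = \tilde{Q}' \ge 0$, $m \in (\mathbb{R}^n)'$, $\delta \in \mathbb{R}$, and $x' \tilde{Q} x + m x + \delta$ is the quadratic function in (3.34).

2. Moreover, if $\bar{\ell}_0 \ge 0$, the finite-dimensional controller of part 1 is also sub-optimal with respect to an exponential-of-quadratic cost.

Proof. 1. The first part is a corollary of Theorem 3.1: by (3.33), $\theta\ell_0 = \frac{\theta}{2} \sum_i Q_{i,i} x_i^2 + \frac{1}{2}\bar{\ell}_0$, so that $\eta$ in (3.2) is replaced by $\eta - \bar{\ell}_0$.
2. Since $\bar{\ell}_0 \ge 0$ we have the upper bound

$$\frac{1}{2} \sum_{i=1}^{n} Q_{i,i} x_i^2 + \frac{1}{2} \sum_{i=1}^{k} R_{i,i} u_i^2 \le \ell(x,u), \tag{3.37}$$

which implies 2. □

Remark 3.2

Notice that by setting

$$\bar{\ell}_0(x) = \sum_{i=1}^{n} f_i^2 + \sum_{i=1}^{n} \frac{\partial}{\partial x_i}(f_i) + \sum_{i=1}^{d} h_i^2 \tag{3.38}$$

we deduce the existence of finite-dimensional controllers for systems which satisfy $w_{i,j}$ = constant, $1 \le i, j \le n$. Contrary to the control case, this flexibility is absent in nonlinear filtering problems. □

Next, we introduce two examples of the earlier results.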
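Example 3.2

Consider the system

$$dx_1(t) = \big( F_{1,1} x_1(t) + F_{1,2} x_2(t) \big)\,dt + B_{1,1} u(t,y)\,dt + dw_1(t), \quad x_1(0) \in \mathbb{R}, \tag{3.39}$$

$$dx_2(t) = \big( F_{2,1} x_1(t) + f_{2,2}(x_2(t)) \big)\,dt + dw_2(t), \quad x_2(0) \in \mathbb{R}, \tag{3.40}$$

$$dy(t) = \big( H_{1,1} x_1(t) + H_{1,2} x_2(t) \big)\,dt + db(t), \quad y(0) = 0 \in \mathbb{R}, \tag{3.41}$$

$$\ell(x,u) = \frac{1}{2} \sum_{i=1}^{2} Q_{i,i} x_i^2 + \frac{1}{2} R_{1,1} u^2 + \frac{1}{2\theta} \bar{\ell}_0(x), \quad Q_{i,i} \ge 0, \ i = 1, 2, \ R_{1,1} > 0. \tag{3.42}$$

Since $\eta = F_{1,1} + \frac{\partial}{\partial x_2}(f_{2,2}(x_2)) + (F_{1,1} x_1 + F_{1,2} x_2)^2 + (F_{2,1} x_1 + f_{2,2}(x_2))^2 + (H_{1,1} x_1 + H_{1,2} x_2)^2$ and $w_{1,2} = F_{2,1} - F_{1,2}$ = constant, if we set

$$\bar{\ell}_0(x) = f_{2,2}^2(x_2) + \frac{\partial}{\partial x_2} f_{2,2}(x_2) + 2 f_{2,2}(x_2) F_{2,1} x_1 = \big( f_{2,2}(x_2) + F_{2,1} x_1 \big)^2 - \big( F_{2,1} x_1 \big)^2 + \frac{\partial}{\partial x_2} f_{2,2}(x_2), \tag{3.43}$$

then

$$\eta - \bar{\ell}_0 = F_{1,1} + (F_{1,1} x_1 + F_{1,2} x_2)^2 + (F_{2,1} x_1)^2 + (H_{1,1} x_1 + H_{1,2} x_2)^2 = x' \tilde{Q} x + m x + \delta, \quad \tilde{Q} \ge 0. \tag{3.44}$$

By Theorem 3.1 we deduce that

$$L_s = \mathrm{Span}\Big\{ L_0^\theta = \frac{1}{2}\Big( \sum_{i=1}^{2} D_i^2 - [x' \tilde{Q} x + m x + \delta] \Big) + \frac{\theta}{2} \sum_{i=1}^{2} Q_{i,i} x_i^2,\ \{x_i\}_{i=1}^{2},\ \frac{\partial}{\partial x_1},\ D_2 = \frac{\partial}{\partial x_2} - f_2,\ 1 \Big\},$$

where $f_2 = F_{2,1} x_1 + f_{2,2}(x_2)$ is the drift of (3.40).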
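Example 3.3 (The quadratic sensor non-EQ case).

Consider the system

$$dx(t) = F x(t)\,dt + B u(t,y)\,dt + dw(t), \quad x(0) \in \mathbb{R}, \tag{3.45}$$

$$dy(t) = H x^2(t)\,dt + db(t), \quad y(0) = 0 \in \mathbb{R}, \tag{3.46}$$

$$\ell(x,u) = \frac{1}{2} Q x^2 + \frac{1}{2} R u^2 = \ell_q(x,u). \tag{3.47}$$

This problem has an infinite-dimensional sufficient statistic algebra. However, if we consider instead the non-quadratic cost function $\ell_q(x,u) + \frac{1}{2\theta}(H x^2)^2$, then the control law is finite-dimensional, which is verified as follows. The sufficient statistic algebra of the sub-optimal problem, $L_s^{up}$, is generated by the operators

$$L_0^\theta = \frac{1}{2}(D^2 - \eta) + \frac{\theta}{2} Q x^2 + \frac{1}{2}(H x^2)^2, \quad L = -B \frac{\partial}{\partial x}, \quad \theta\rho = \frac{\theta}{2} R \cdot 1, \quad h = H x^2, \tag{3.48}$$

where $\eta = (Fx)^2 + F + (H x^2)^2$ and $D = \frac{\partial}{\partial x} - Fx$. Here $L_0^\theta = \frac{1}{2}(D^2 - \eta_s) + \frac{\theta}{2} Q x^2$, where $\eta_s = (Fx)^2 + F$. Simple calculations yield that $L_s^{up}$ is 6-dimensional with basis

$$L_s^{up} = \mathrm{Span}\Big\{ L_0^\theta = \frac{1}{2}(D^2 - \eta_s) + \frac{\theta}{2} Q x^2,\ x \frac{\partial}{\partial x},\ \frac{\partial}{\partial x},\ x^2,\ x,\ 1 \Big\}. \tag{3.49}$$

□

Next, we shall link the earlier findings to the design of finite-dimensional minimax dynamic games. It has recently been established in [14] that the risk-sensitive system (1.1)-(1.3) is logarithmically equivalent to a minimax dynamic game with upper value ($\theta > 0$)

$$\frac{1}{\theta} \log \inf_{u \in U_{ad}} J_{0,T}(u) = \inf_{u \in U_{ad}} \sup_{(\nu, \psi) \in D_{ad}} E^Q\Big\{ \int_0^T \Big[ \ell(x(t), u(t,y)) - \frac{1}{2\theta} |\nu(t)|^2 - \frac{1}{2\theta} |\psi(t)|^2 \Big]\,dt + \varphi(x(T)) \Big\}, \tag{3.50}$$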
where under the measure $P^Q$ system (1.1), (1.2) becomes

$$dx(t) = f(x(t))\,dt + \sum_{j=1}^{k} g_j(x(t))\,u_j(t,y)\,dt + \sigma(x(t))\,\nu(t)\,dt + \sum_{j=1}^{n} \sigma_j(x(t))\,dw_j(t), \quad x(0) \in \mathbb{R}^n, \tag{3.51}$$

$$dy_j(t) = h_j(x(t))\,dt + \psi_j(t)\,dt + db_j(t), \quad y_j(0) = 0 \in \mathbb{R}, \quad 1 \le j \le d. \tag{3.52}$$
Here $\{\nu_j(t);\ t \in [0,T]\}_{j=1}^{n}$ and $\{\psi_j(t);\ t \in [0,T]\}_{j=1}^{d}$ are $\{F_{0,t};\ t \in [0,T]\}$-adapted square-integrable stochastic processes with values in $\mathbb{R}$; $D_{ad}$ denotes the set of such processes. An analogous relation in terms of a deterministic minimax dynamic game is derived in [15] by considering the small-noise version of (1.1)-(1.4), that is, by introducing the changes $\sigma \to \sqrt{\epsilon}\,\sigma$, $db_j \to \sqrt{\epsilon}\,db_j$, $\theta \to \theta/\epsilon$, and then pursuing the limit $\lim_{\epsilon \to 0} \inf_{u \in U_{ad}} \frac{\epsilon}{\theta} \log J_{0,T}(u)$. The resulting game is of the form (3.50)-(3.52), free of the random inputs and the mathematical expectation. Consequently, the Lie algebraic techniques are applicable to dynamic games as well.

Moreover, disturbance attenuation problems are posed in terms of finding a control law $u(\cdot) \in L^2_y([0,T]; \mathbb{R}^k)$ and a constant $\beta < \infty$ such that the $L^2$-gain inequality

$$E^Q\Big[ \int_0^T \ell_q(x(t), u(t,y))\,dt + \varphi(x(T)) \Big] \le \frac{1}{2\theta} E^Q\Big[ \int_0^T \big( |\nu(t)|^2 + |\psi(t)|^2 \big)\,dt \Big] + \beta, \quad \forall (\nu, \psi) \in D_{ad}, \tag{3.53}$$

holds, which is solved using the dynamic minimax game introduced above. Hence, the sub-optimal controller design methodology introduced earlier is very important in finding finite-dimensional controllers which satisfy an $L^2$-gain inequality with quadratic cost, namely (3.53), with $\ell \to \ell_q$.
4 Conclusion

This paper introduces the sufficient statistic algebra, which is responsible for propagating the information state of optimal control problems with exponential-of-integral cost. The sufficient statistic algebra is shown to be finite-dimensional for LEQG problems and for certain nonlinear problems as well. Its importance in deriving finite-dimensional sub-optimal control laws is further delineated. The Lie algebraic techniques are also shown to be valuable in addressing finite-dimensionality of minimax dynamic games; this is due to the equivalence between continuous-time risk-sensitive control problems and minimax dynamic games [14, 15]. Finally, we note that Lie algebraic techniques are also important in risk-sensitive filtering problems, in which the cost is $E\Big[ \exp\Big( \theta \int_0^t (x(s) - \chi(s,y))'(x(s) - \chi(s,y))\,ds \Big) \Big]$, where the minimization is with respect to the non-anticipative functional $\chi(t, \{y(s);\ 0 \le s \le t\})$ of the observation process.
REFERENCES

[1] C.D. Charalambous and R. Elliott, "Information States in Optimal Control and Filtering: A Lie Algebraic Theoretic Approach," IEEE Transactions on Automatic Control (submitted April 1997).
[2] R. Brockett and J. Clark, "Geometry of the Conditional Density Equation," in Proceedings of the International Conference on Analysis and Optimization of Stochastic Systems, Oxford, 1978.
[3] C.D. Charalambous and J.L. Hibey, "Minimum Principle for Partially Observable Nonlinear Risk-Sensitive Control Problems using Measure-Valued Decompositions," Stochastics and Stochastics Reports, vol. 57, pp. 247-288, 1996.
[4] E. Wong and B. Hajek, Stochastic Processes in Engineering Systems, Springer-Verlag, 1985.
[5] C.D. Charalambous and R. Elliott, "Certain Nonlinear Stochastic Optimal Control Problems with Explicit Control Laws Equivalent to LEQG/LQG Problems," IEEE Transactions on Automatic Control, vol. 42, no. 4, pp. 482-497, 1997.
[6] S. Marcus, "Algebraic and Geometric Methods in Nonlinear Filtering," SIAM Journal on Control and Optimization, vol. 26, no. 5, pp. 817-844, 1984.
[7] A. Bensoussan and J. van Schuppen, "Optimal Control of Partially Observable Stochastic Systems with an Exponential-of-Integral Performance Index," SIAM Journal on Control and Optimization, vol. 23, no. 4, pp. 599-613, 1985.
[8] J. Chen, S.-T. Yau, C.-W. Leung, "Finite-Dimensional Estimation Algebras of Maximum Rank with State-Space Dimension 3," SIAM Journal on Control and Optimization, vol. 34, no. 1, pp. 179-198, 1996.
[9] W. Wong, "On a New Class of Finite-Dimensional Estimation Algebras," Systems and Control Letters, vol. 9, pp. 79-83, 1987.
[10] A. Bensoussan and R. Elliott, "General Finite Dimensional Risk Sensitive Problems and Small Noise Limits," IEEE Transactions on Automatic Control, vol. 41, no. 2, pp. 210-215, 1996.
[11] C.D. Charalambous, D. Naidu, K. Moore, "Solvable Risk-Sensitive Control Problems with Output Feedback," in Proceedings of the 33rd IEEE Conference on Decision and Control, pp. 1433-1434, Lake Buena Vista, Florida, December 1994.
[12] C.D. Charalambous, "Partially Observable Risk-Sensitive Control Problems: Dynamic Programming and Verification Theorems," IEEE Transactions on Automatic Control, vol. 42, no. 8, pp. 1130-1138, 1997.
[13] R.E. Kalman, "When is a Linear System Optimal," Journal of Basic Engineering, vol. 86, pp. 51-60, 1964.
[14] P. Dai Pra, L. Meneghini, W.J. Runggaldier, "Some Connections Between Stochastic Control and Dynamic Games," preprint, 1997.
[15] C.D. Charalambous, "The Role of Information State and its Adjoint in Relating Nonlinear Output Feedback Risk-Sensitive Control and Dynamic Games," IEEE Transactions on Automatic Control, vol. 42, no. 8, pp. 1163-1170, 1997.