Information States in Optimal Control and Filtering: A Lie Algebraic Theoretic Approach

Charalambos D. Charalambous¹ and Robert J. Elliott²

¹ Department of Electrical Engineering, McGill University, Montreal, P.Q., Canada H3A 2A7
² Department of Mathematical Sciences, University of Alberta, Edmonton, Alberta, Canada T6G 2G1
IEEE Transactions on Automatic Control: Submitted April 21, 1997; Revised April 17, 1998
Reference Number: 97-182

Abstract.
The purpose of this paper is twofold: (i) to introduce the sufficient statistic algebra, which is responsible for propagating the sufficient statistic, or information state, in the optimal control of stochastic systems, and (ii) to apply certain Lie algebraic methods, widely used in nonlinear control theory, to derive new results concerning finite-dimensional controllers. This enhances our understanding of the role played by the sufficient statistic. The sufficient statistic algebra enables us to determine a priori whether there exist finite-dimensional controllers; it also enables us to classify all finite-dimensional controllers. Relations to minimax dynamic games are delineated.
Key Words: Optimal Control, Filtering, Minimax, Partial Observations, Feynman-Kac, Lie Algebras, Sufficient Statistic, Finite-Dimensional.
¹ This author's work was supported by the Natural Sciences and Engineering Research Council of Canada under grant OGP0183720.
1 Introduction

The practical utility of the Duncan-Mortensen-Zakai (DMZ) equation is greatly appreciated in both nonlinear filtering and stochastic control problems with partial information. The DMZ equation of nonlinear filtering of diffusion processes is a linear stochastic partial differential equation (PDE) which describes, in a recursive manner, the evolution of the unnormalized conditional distribution of the state process, $\{x(t);\ t \ge 0\}$, given the observations, $\{y(t);\ t \ge 0\}$. If this distribution has a density function, say, $\{\rho(x,t);\ t \ge 0\}$, then [1]
\[
\frac{d}{dt}\rho(x,t) = L_0\,\rho(x,t) + h(x)\,\rho(x,t)\,\frac{d}{dt}y(t), \qquad (x,t) \in \Re^n \times (0,T]. \tag{1.1}
\]

Consequently, $\{\rho(x,s);\ 0 \le s \le t\}$ evolves forward in time with initial condition $\rho(x,0)$. Here, $L_0$ is a certain second-order differential operator related to the drift and diffusion coefficients of the state process, the Kolmogorov forward operator, and $h(x)$ is a zero-order differential operator related to the signal in the observations. In filtering problems one is concerned with conditional expectations

\[
E\big[\phi(x(t)) \,\big|\, \{y(s);\ 0 \le s \le t\}\big] = \frac{\int \phi(z)\,\rho(z,t)\,dz}{\int \rho(z,t)\,dz} = \int \phi(z)\,\rho^N(z,t)\,dz
\]

[...]

\[
\ldots = 0. \tag{4.88}
\]
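The recursion expressed by the DMZ equation (1.1), together with the normalization used to form conditional expectations, can be illustrated numerically. The following is a minimal sketch and not the authors' method: it assumes a scalar state with drift $-x$ and unit diffusion, a linear signal $h(x) = x$, a standard Gaussian prior, and a simple splitting-up scheme (a Bayes-type correction followed by an explicit finite-difference prediction). The function names `simulate_linear_system`, `zakai_filter_1d` and `conditional_mean` are hypothetical.

```python
import math
import random


def simulate_linear_system(n_steps, dt, rng):
    """Euler simulation of dx = -x dt + dw, dy = x dt + dv (assumed model).

    Returns the list of observation increments dy_k.
    """
    x = rng.gauss(0.0, 1.0)  # initial state drawn from the assumed N(0, 1) prior
    increments = []
    for _ in range(n_steps):
        increments.append(x * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0))
        x += -x * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return increments


def zakai_filter_1d(increments, dt, grid, h, drift, sigma=1.0):
    """Propagate an unnormalized conditional density on a uniform grid.

    Each time step applies
      (1) correction: multiply by exp(h(x) dy - 0.5 h(x)^2 dt),
      (2) prediction: one explicit finite-difference step of the
          Kolmogorov forward equation, with zero boundary values.
    """
    dx = grid[1] - grid[0]
    n = len(grid)
    # Unnormalized N(0, 1) prior; the missing constant cancels on normalization.
    rho = [math.exp(-0.5 * x * x) for x in grid]
    for dy in increments:
        # Correction step driven by the observation increment.
        rho = [r * math.exp(h(x) * dy - 0.5 * h(x) ** 2 * dt)
               for x, r in zip(grid, rho)]
        # Prediction step: rho_t = 0.5 sigma^2 rho_xx - (drift * rho)_x.
        flux = [drift(x) * r for x, r in zip(grid, rho)]
        new = [0.0] * n
        for i in range(1, n - 1):
            diffusion = 0.5 * sigma ** 2 * (rho[i + 1] - 2.0 * rho[i] + rho[i - 1]) / dx ** 2
            advection = (flux[i + 1] - flux[i - 1]) / (2.0 * dx)
            new[i] = max(rho[i] + dt * (diffusion - advection), 0.0)
        rho = new
    return rho


def conditional_mean(rho, grid):
    """Normalize the density and return the conditional mean of the state."""
    mass = sum(rho)
    return sum(x * r for x, r in zip(grid, rho)) / mass
```

For this linear-Gaussian choice the conditional density stays Gaussian, so the grid estimate can be compared against a Kalman filter run on the same observation increments. For nonlinear `h` or `drift` no such finite-dimensional comparison is available in general, which is the situation the sufficient statistic algebra is meant to classify.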
Similar to Theorem 2.4 (see [20]), the information state approach to this control problem yields:
\[
J(u^*) = \inf_{u \in U_{ad}} E\left[ \int_{\Re^n} \exp\big(\varphi(x)\big)\,\rho(x,T)\,dx \right]. \tag{4.89}
\]
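The inner integral in (4.89) weights the terminal information state by the exponential of the terminal cost. As a hedged illustration only, the sketch below evaluates that integral on a grid for one fixed observation path, assuming a scalar state, a Gaussian terminal density, and a quadratic exponent $\tfrac{1}{2}cx^2$; none of these choices come from the paper, and the outer expectation over observation paths is not computed. The helper names `information_state_integral`, `gaussian_density` and `closed_form_quadratic` are hypothetical.

```python
import math


def gaussian_density(x, mean, var):
    """Density of N(mean, var) at x (assumed form of the terminal information state)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def information_state_integral(rho, exponent, grid):
    """Riemann-sum approximation of the integral of exp(exponent(x)) * rho(x)."""
    dx = grid[1] - grid[0]
    return dx * sum(math.exp(exponent(x)) * rho(x) for x in grid)


def closed_form_quadratic(mean, var, c):
    """E[exp(0.5 * c * X^2)] for X ~ N(mean, var); infinite unless c * var < 1."""
    if c * var >= 1.0:
        return math.inf
    return math.exp(0.5 * c * mean ** 2 / (1.0 - c * var)) / math.sqrt(1.0 - c * var)
```

A short usage example under the same assumptions:

```python
mean, var, c = 0.5, 0.4, 0.8
grid = [-10.0 + 0.01 * i for i in range(2001)]
numeric = information_state_integral(
    lambda x: gaussian_density(x, mean, var),
    lambda x: 0.5 * c * x * x,
    grid,
)
exact = closed_form_quadratic(mean, var, c)
```

The closed form makes one feature of exponential costs visible: the integral is finite only when `c * var < 1`, so a quadratic terminal cost that is too large relative to the conditional variance makes the cost blow up.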
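\[
\log \inf_{u \in U_{ad}} J(u) = \inf_{u \in U_{ad}} \ \sup_{(\nu,\eta) \in D_{ad}} E^Q\left\{ \int_0^T \Big[ \lambda\big(x(t),u(t,y)\big) - \tfrac{1}{2}|\nu(t)|^2 - \tfrac{1}{2}|\eta(t)|^2 \Big]\,dt + \varphi\big(x(T)\big) \right\}, \tag{4.95}
\]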
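where $E^Q[\cdot]$ denotes expectation with respect to the measure $P^Q$ under which the system (4.87), (4.88) becomes

\[
dx(t) = f(x(t))\,dt + \sum_{j=1}^{\ell} g_j(x(t))\,u_j(t,y)\,dt + \sigma(x(t))\,\nu(t)\,dt + \sum_{j=1}^{n} \sigma_j(x(t))\,dw_j(t), \qquad x(0) \in \ldots
\]

[...]

\[
\ldots\ k; \quad 1 \le j \le \ell; \qquad w_{i,j} = \text{constant}, \quad 1 \le i,j \le n; \tag{4.101}
\]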
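and

\[
\lambda(x,u) = \sum_{i=1}^{n} Q_{i,i}\,x_i^2 + \sum_{i=1}^{\ell} R_{i,i}\,u_i^2 + \lambda_0(x) \ge 0; \qquad Q_{i,i} \ge 0, \quad R_{j,j} > 0, \quad \forall i,j. \tag{4.102}
\]

1. If there exists an [...] $\ge 0$ such that

\[
[\ldots] - 2\lambda_0 = \text{nonnegative quadratic function of } (x_1,\ldots,x_n), \tag{4.103}
\]

then $\mathcal{L}_S$ has dimension at most $2n+2$ and

\[
\mathcal{L}_S \subseteq \mathcal{L} = \mathrm{Span}\Big\{ L_0 = \frac{1}{2}\Big( \sum_{i=1}^{n} D_i^2 - \big[x'\tilde{Q}x + mx + \delta\big] \Big),\ x_1,\ldots,x_n,\ D_1,\ldots,D_n,\ 1 \Big\}, \tag{4.104}
\]

where $\tilde{Q} = \tilde{Q}' \ge 0$, $m \in$ (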