Linear Estimation in Krein Spaces - Part II: Applications

Babak Hassibi†, Ali H. Sayed‡ and Thomas Kailath*

Information Systems Laboratory, Stanford University, Stanford CA 94305

June 2, 1995

Abstract

We show that several interesting problems in $H^\infty$ filtering, quadratic game theory, and risk-sensitive control and estimation follow as special cases of the Krein space linear estimation theory developed in [1]. We show that all these problems can be cast as the problem of calculating the stationary point of certain second-order forms, and that by considering the appropriate state-space models and error Gramians, we can use the Krein space estimation theory to calculate the stationary points and study their properties. The approach discussed here allows for interesting generalizations, such as finite memory adaptive filtering with varying sliding patterns.

* This work was supported in part by the Air Force Office of Scientific Research, Air Force Systems Command, under Contract AFOSR91-0060 and by the Army Research Office under contract DAAL03-89-K-0109. This manuscript is submitted for publication with the understanding that the US Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either express or implied, of the Air Force Office of Scientific Research or the U.S. Government.

† Contact author: Information Systems Laboratory, Stanford University, Stanford CA 94305. Phone (415) 723-1538. Fax (415) 723-8473. E-mail: [email protected]

‡ The work of A.H. Sayed was also supported by a grant from NSF under award no. MIP-9409319. He is with the Dept. of Electrical and Computer Engineering, University of California, Santa Barbara, CA 93106.


1 Introduction

Classical results in linear least-squares estimation and Kalman filtering are based on an $L^2$ or $H^2$ criterion and require a priori knowledge of the statistical properties of the noise signals. In some applications, however, one is faced with model uncertainties and lack of statistical information on the exogenous signals, which has led to an increasing interest in minimax estimation (see, e.g., [2, 3, 4, 5, 6, 7, 8, 9] and the references therein), with the belief that the resulting so-called $H^\infty$ algorithms will be more robust and less sensitive to parameter variations.

Furthermore, while the statistical Kalman filtering algorithm can be viewed as a recursive procedure that minimizes a certain quadratic cost function, there has also been increasing interest in an alternative so-called exponential (LEQG) cost function [13, 14, 15, 16], which is risk sensitive, in the sense that it depends on a real parameter that determines whether more or less weight should be given to higher or smaller errors. The corresponding filters have been termed risk-sensitive and include the Kalman filter as a special case. We show in this paper that the $H^\infty$ and risk-sensitive filters can both be obtained by using appropriate Krein space Kalman filters, based on the theory developed in Part I [1].

$H^\infty$ and risk-sensitive estimation and control problems, quadratic games (and finite memory adaptive filtering problems) lead almost by inspection to indefinite deterministic quadratic forms. Following [1], we solve these problems by constructing the corresponding Krein space "stochastic" problems for which the Kalman filter solutions can be written down immediately; moreover, the conditions for a minimum can also be expressed in terms of quantities easily related to the basic Riccati equations of the Kalman filter. This approach also explains the many similarities between, say, the $H^\infty$ solutions and the classical LQ solutions, and in addition marks out their key differences.

The paper is organized as follows. In Section 2 we introduce the $H^\infty$ estimation problem, state the conventional solution, and discuss its similarities with and differences from the conventional Kalman filter. In Section 3 we reduce the $H^\infty$ estimation problem to guaranteeing the positivity of a certain indefinite quadratic form. We then relate this quadratic form to a certain Krein state-space model, which allows us to use the results of the companion paper [1] to derive conditions for its positivity, and to show that projection in the Krein space allows us to solve the $H^\infty$ estimation problem. In this context we derive the $H^\infty$ a posteriori, a priori, and smoothing filters, and show that $H^\infty$ estimation is essentially Kalman filtering in Krein space; we also obtain a natural parametrization of all $H^\infty$ estimators. One advantage of our approach is that it suggests how well-known conventional Kalman filtering algorithms, such as square-root arrays and Chandrasekhar recursions, can be extended to the $H^\infty$ setting. In Section 4 we describe the problem of risk-sensitive estimation [13, 14, 15], and show that a risk-sensitive estimator is one that computes the stationary point of a certain second-order form, provided that this second-order form has a minimum over a certain set of variables. By considering a corresponding Krein state-space model, we use the results of [1] to derive conditions for the existence of the minimum, and to show that the Krein space projection also solves the risk-sensitive estimation problem. We then derive risk-sensitive a posteriori, a priori, and smoothing filters, parallel to what was done in Section 2. We also use this parallel to stress the connection between $H^\infty$ and risk-sensitive estimation that was first discovered in [21], using different arguments. Before concluding with Section 6, we describe the finite memory adaptive filtering problem in Section 5, use the Krein space approach to solve it, and connect it to state-space approaches to adaptive filtering.

As was done in the companion paper [1], we shall use bold letters for elements in a Krein space, and normal letters for corresponding complex numbers. Also, we shall use $\breve{z}$ to denote the estimate of $z$ (according to some criterion), and $\hat{z}$ to denote the Krein space projection, thereby stressing the fact that they need not coincide. Many of the results discussed here were obtained earlier by several other authors, using different methods and arguments. Our approach, we believe, provides a powerful unification, with immediate insights to various extensions.

2 $H^\infty$ Estimation

Several $H^\infty$ filtering algorithms have recently been derived by a variety of methods in both the continuous- and discrete-time cases (see, e.g., [2, 3, 4, 5, 6, 7, 8, 9] and the references therein). Many authors have noticed some formal similarities between the $H^\infty$ filters and the conventional Kalman filter; we shall further clarify this connection by showing that $H^\infty$ filters are nothing more than certain Krein space Kalman filters. In other words, the $H^\infty$ filters can be viewed as recursively performing a (Gram-Schmidt) orthogonalization (or projection) procedure on a convenient set of observation data that obey a state-space model whose state evolves in an indefinite metric space. This is of significance since it yields a geometric derivation of the $H^\infty$ filters, and because it unifies $H^2$ and $H^\infty$ estimation in a simple framework. Moreover, once this connection has been made explicit, many known alternative and more efficient algorithms, such as square-root arrays and Chandrasekhar equations [24], can be applied to the $H^\infty$ setting as well. Also, our results deal directly with the time-varying scenario. Finally, we note that although we restrict ourselves here to the discrete-time case, the continuous-time analogs follow the same principles.

2.1 Formulation of the $H^\infty$ Filtering Problem

Consider a time-variant state-space model of the form

$$x_{i+1} = F_i x_i + G_i u_i, \qquad y_i = H_i x_i + v_i, \qquad i \ge 0 \qquad (1)$$

where $F_i \in \mathbb{C}^{n \times n}$, $G_i \in \mathbb{C}^{n \times m}$ and $H_i \in \mathbb{C}^{p \times n}$ are known matrices, $x_0$, $\{u_i\}$, and $\{v_i\}$ are unknown quantities, and $y_i$ is the measured output. We can regard $v_i$ as a measurement noise and $u_i$ as a process noise or driving disturbance. We make no assumption on the nature of the disturbances (e.g., normally distributed, uncorrelated, etc.). In general, we would like to estimate some arbitrary linear combination of the states, say

$$z_i = L_i x_i,$$

where $L_i \in \mathbb{C}^{q \times n}$ is given, using the observations $\{y_j\}$. Let $\breve{z}_{i|i} = \mathcal{F}_f(y_0, y_1, \ldots, y_i)$ denote the estimate of $z_i$ given observations $\{y_j\}$ from time 0 up to and including time $i$, and $\breve{z}_i = \mathcal{F}_p(y_0, y_1, \ldots, y_{i-1})$ denote the estimate of $z_i$ given observations $\{y_j\}$ from time 0 to time $i-1$. We then have the following two estimation errors: the filtered error

$$e_{f,i} = \breve{z}_{i|i} - L_i x_i, \qquad (2)$$

and the predicted error

$$e_{p,i} = \breve{z}_i - L_i x_i. \qquad (3)$$

[Figure 1 shows two block diagrams: the disturbances $\{\Pi_0^{-1/2}(x_0 - \bar{x}_0), \{u_j\}_{j=0}^i, \{v_j\}_{j=0}^i\}$ enter the operators $T_i(\mathcal{F}_f)$ and $T_i(\mathcal{F}_p)$, whose outputs are the filtered errors $\{\breve{z}_{j|j} - L_j x_j\}_{j=0}^i$ and the predicted errors $\{\breve{z}_j - L_j x_j\}_{j=0}^i$, respectively.]

Figure 1: Transfer matrices from disturbances to filtered and predicted estimation errors.

As depicted in Fig. 1, let $T_i(\mathcal{F}_f)$ and $T_i(\mathcal{F}_p)$ denote the transfer operators that map the unknown disturbances $\{\Pi_0^{-1/2}(x_0 - \bar{x}_0), \{u_j\}_{j=0}^i, \{v_j\}_{j=0}^i\}$ (where $\bar{x}_0$ denotes an initial guess for $x_0$, and $\Pi_0$ is a given positive definite matrix) to the filtered and predicted errors $\{e_{f,j}\}_{j=0}^i$ and $\{e_{p,j}\}_{j=0}^i$, respectively. The problem is to choose the functionals $\mathcal{F}_f(\cdot)$ and $\mathcal{F}_p(\cdot)$ so as to respectively minimize the $H^\infty$ norm of the transfer operators $T_i(\mathcal{F}_f)$ and $T_i(\mathcal{F}_p)$.

Definition 1 The $H^\infty$ norm of a transfer operator $T$ is defined as

$$\|T\|_\infty = \sup_{u \in h_2, u \neq 0} \frac{\|Tu\|_2}{\|u\|_2}$$

where $\|u\|_2$ is the $h_2$-norm of the causal sequence $\{u_k\}$, i.e., $\|u\|_2^2 = \sum_{k=0}^{\infty} u_k^* u_k$.

The $H^\infty$ norm thus has the interpretation of being the maximum energy gain from the input $u$ to the output $y$. Our problem may now be formally stated as follows.

Problem 1 (Optimal $H^\infty$ Problem) Find $H^\infty$-optimal estimation strategies $\breve{z}_{i|i} = \mathcal{F}_f(y_0, y_1, \ldots, y_i)$ and $\breve{z}_i = \mathcal{F}_p(y_0, y_1, \ldots, y_{i-1})$ that respectively minimize $\|T_i(\mathcal{F}_f)\|_\infty$ and $\|T_i(\mathcal{F}_p)\|_\infty$, and obtain the resulting

$$\gamma_{f,o}^2 = \inf_{\mathcal{F}_f} \|T_i(\mathcal{F}_f)\|_\infty^2 = \inf_{\mathcal{F}_f} \sup_{x_0, u \in h_2, v \in h_2} \frac{\sum_{j=0}^i e_{f,j}^* e_{f,j}}{(x_0 - \bar{x}_0)^* \Pi_0^{-1}(x_0 - \bar{x}_0) + \sum_{j=0}^i u_j^* u_j + \sum_{j=0}^i v_j^* v_j} \qquad (4)$$

and

$$\gamma_{p,o}^2 = \inf_{\mathcal{F}_p} \|T_i(\mathcal{F}_p)\|_\infty^2 = \inf_{\mathcal{F}_p} \sup_{x_0, u \in h_2, v \in h_2} \frac{\sum_{j=0}^i e_{p,j}^* e_{p,j}}{(x_0 - \bar{x}_0)^* \Pi_0^{-1}(x_0 - \bar{x}_0) + \sum_{j=0}^{i-1} u_j^* u_j + \sum_{j=0}^{i-1} v_j^* v_j} \qquad (5)$$

where $\Pi_0$ is a positive definite matrix that reflects a priori knowledge of how close $x_0$ is to the initial guess $\bar{x}_0$.

Note that the infimum in (5) is taken over all strictly causal estimators $\mathcal{F}_p$, whereas in (4) the estimators $\mathcal{F}_f$ are only causal since they have additional access to $y_i$. This is relevant since the solution to the $H^\infty$ problem, as we shall see, depends on the structure of the information available to the estimator. The above problem formulation shows that $H^\infty$-optimal estimators guarantee the smallest estimation error energy over all possible disturbances of fixed energy. They are therefore over-conservative, which results in more robust behaviour with respect to disturbance variation. A closed-form solution to the optimal $H^\infty$ estimation problem is available only in some special cases (see, e.g., [25]), and so it is common in the literature to settle for a suboptimal solution.

Problem 2 (Suboptimal $H^\infty$ Problem) Given scalars $\gamma_f > 0$ and $\gamma_p > 0$, find $H^\infty$ suboptimal estimation strategies $\breve{z}_{i|i} = \mathcal{F}_f(y_0, y_1, \ldots, y_i)$ (known as an a posteriori filter) and $\breve{z}_i = \mathcal{F}_p(y_0, y_1, \ldots, y_{i-1})$ (known as an a priori filter) that respectively achieve $\|T_i(\mathcal{F}_f)\|_\infty < \gamma_f$ and $\|T_i(\mathcal{F}_p)\|_\infty < \gamma_p$. In other words, find strategies that respectively achieve

$$\sup_{x_0, u \in h_2, v \in h_2} \frac{\sum_{j=0}^i e_{f,j}^* e_{f,j}}{(x_0 - \bar{x}_0)^* \Pi_0^{-1}(x_0 - \bar{x}_0) + \sum_{j=0}^i u_j^* u_j + \sum_{j=0}^i v_j^* v_j} < \gamma_f^2 \qquad (6)$$

and

$$\sup_{x_0, u \in h_2, v \in h_2} \frac{\sum_{j=0}^i e_{p,j}^* e_{p,j}}{(x_0 - \bar{x}_0)^* \Pi_0^{-1}(x_0 - \bar{x}_0) + \sum_{j=0}^{i-1} u_j^* u_j + \sum_{j=0}^{i-1} v_j^* v_j} < \gamma_p^2. \qquad (7)$$

This clearly requires checking whether $\gamma_f \ge \gamma_{f,o}$ and $\gamma_p \ge \gamma_{p,o}$.

Note that the solutions to Problem 1 can be obtained to desired accuracy by iterating on the $\gamma_f$ and $\gamma_p$ of Problem 2. From here on we shall be dealing only with Problem 2. Note also that the problems defined above are finite-horizon problems. So-called infinite-horizon problems can be considered if we define $T(\mathcal{F}_f)$ and $T(\mathcal{F}_p)$ as the transfer operators that map $\{x_0 - \bar{x}_0, \{u_j\}_{j=0}^\infty, \{v_j\}_{j=0}^\infty\}$ to $\{e_{f,j}\}_{j=0}^\infty$ and $\{e_{p,j}\}_{j=0}^\infty$, respectively. Then by guaranteeing $\|T_i(\mathcal{F}_f)\|_\infty < \gamma_f$ and $\|T_i(\mathcal{F}_p)\|_\infty < \gamma_p$ for all $i$, we can solve the infinite-horizon problems $\|T(\mathcal{F}_f)\|_\infty \le \gamma_f$ and $\|T(\mathcal{F}_p)\|_\infty \le \gamma_p$, respectively. However, direct solutions are also possible.

2.2 Solution of the Suboptimal $H^\infty$ Filtering Problem

We now present the existing solutions (see, e.g., [4, 7]) to the suboptimal $H^\infty$ filtering problem and note that they are intriguingly similar in several ways to the conventional Kalman filter. It was this similarity in structure that led us to extend Kalman filtering to Krein spaces (see [1]): in effect, $H^\infty$ filters are just Kalman filters in Krein space.

Theorem 1 (An $H^\infty$ A Posteriori Filter) [7] For a given $\gamma_f > 0$, if the $[F_j \; G_j]$ have full rank, then an estimator that achieves $\|T_i(\mathcal{F}_f)\|_\infty < \gamma_f$ exists if, and only if,

$$P_j^{-1} + H_j^* H_j - \gamma_f^{-2} L_j^* L_j > 0, \qquad j = 0, \ldots, i \qquad (8)$$

where $P_0 = \Pi_0$ and $P_j$ satisfies the Riccati recursion

$$P_{j+1} = F_j P_j F_j^* + G_j G_j^* - F_j P_j \begin{bmatrix} H_j^* & L_j^* \end{bmatrix} R_{e,j}^{-1} \begin{bmatrix} H_j \\ L_j \end{bmatrix} P_j F_j^* \qquad (9)$$

with

$$R_{e,j} = \begin{bmatrix} I & 0 \\ 0 & -\gamma_f^2 I \end{bmatrix} + \begin{bmatrix} H_j \\ L_j \end{bmatrix} P_j \begin{bmatrix} H_j^* & L_j^* \end{bmatrix}.$$

If this is the case, then one possible level-$\gamma_f$ $H^\infty$ filter is given by

$$\breve{z}_{j|j} = L_j \hat{x}_{j|j} \qquad (10)$$

where $\hat{x}_{j|j}$ is recursively computed as

$$\hat{x}_{j+1|j+1} = F_j \hat{x}_{j|j} + K_{s,j+1}(y_{j+1} - H_{j+1} F_j \hat{x}_{j|j}), \qquad \hat{x}_{-1|-1} = \text{initial guess} \qquad (11)$$

and

$$K_{s,j+1} = P_{j+1} H_{j+1}^* (I + H_{j+1} P_{j+1} H_{j+1}^*)^{-1}. \qquad (12)$$

Theorem 2 (An $H^\infty$ A Priori Filter) [7] For a given $\gamma_p > 0$, if the $[F_j \; G_j]$ have full rank, then an estimator that achieves $\|T_i(\mathcal{F}_p)\|_\infty < \gamma_p$ exists if, and only if,

$$\tilde{P}_j^{-1} = P_j^{-1} - \gamma_p^{-2} L_j^* L_j > 0, \qquad j = 0, \ldots, i \qquad (13)$$

where $P_j$ is the same as in Theorem 1. If this is the case, then one possible level-$\gamma_p$ $H^\infty$ filter is given by

$$\breve{z}_j = L_j \hat{x}_j \qquad (14)$$

$$\hat{x}_{j+1} = F_j \hat{x}_j + K_{a,j}(y_j - H_j \hat{x}_j), \qquad \hat{x}_0 = \text{initial guess} \qquad (15)$$

where

$$K_{a,j} = F_j \tilde{P}_j H_j^* (I + H_j \tilde{P}_j H_j^*)^{-1}. \qquad (16)$$
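To make Theorem 1 concrete, the following is a minimal numerical sketch in Python/NumPy of the a posteriori recursions (8)-(12). It is our own illustration, not code from the original development; the function and variable names are ours, and real-valued matrices are assumed for simplicity.

```python
# A minimal sketch of the level-gamma H-infinity a posteriori filter of Theorem 1.
# Helper names (hinf_aposteriori_filter, etc.) are ours; real matrices assumed.
import numpy as np

def hinf_aposteriori_filter(F, G, H, L, Pi0, gamma, y, x0_guess=None):
    """Run the a posteriori H-infinity filter; returns the estimates z_{j|j}.

    F, G, H, L : lists of system matrices F_j, G_j, H_j, L_j
    Pi0        : initial weighting matrix (positive definite)
    gamma      : desired level gamma_f
    y          : list of measurements y_j
    """
    n = F[0].shape[0]
    P = Pi0.copy()
    x = np.zeros(n) if x0_guess is None else x0_guess   # predicted state
    z_est = []
    for j, yj in enumerate(y):
        # existence check (8): P_j^{-1} + H_j' H_j - gamma^{-2} L_j' L_j > 0
        M = np.linalg.inv(P) + H[j].T @ H[j] - L[j].T @ L[j] / gamma**2
        if np.any(np.linalg.eigvalsh(M) <= 0):
            raise ValueError(f"no level-{gamma} a posteriori filter at step {j}")
        # measurement update with gain (12): K = P H' (I + H P H')^{-1}
        Ks = P @ H[j].T @ np.linalg.inv(np.eye(H[j].shape[0]) + H[j] @ P @ H[j].T)
        x = x + Ks @ (yj - H[j] @ x)            # filtered state x_{j|j}
        z_est.append(L[j] @ x)                   # z_{j|j} = L_j x_{j|j}, eq. (10)
        # Riccati update (9) with the indefinite weight diag(I, -gamma^2 I)
        C = np.vstack([H[j], L[j]])
        R = np.block([[np.eye(H[j].shape[0]), np.zeros((H[j].shape[0], L[j].shape[0]))],
                      [np.zeros((L[j].shape[0], H[j].shape[0])), -gamma**2 * np.eye(L[j].shape[0])]])
        Re = R + C @ P @ C.T
        P = F[j] @ P @ F[j].T + G[j] @ G[j].T \
            - F[j] @ P @ C.T @ np.linalg.inv(Re) @ C @ P @ F[j].T
        x = F[j] @ x                             # time update to x_{j+1|j}
    return z_est
```

The same structure, with the output blocks reordered and the gain $K_{a,j}$ of (16) built from $\tilde{P}_j$, yields the a priori filter of Theorem 2.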

Comparisons with the Kalman Filter

The Kalman filter algorithm for estimating the states in (1), assuming that the $\{u_i\}$ and $\{v_i\}$ are uncorrelated unit-variance white-noise processes, is

$$\hat{x}_{j+1} = F_j \hat{x}_j + F_j P_j H_j^* (I + H_j P_j H_j^*)^{-1}(y_j - H_j \hat{x}_j)$$

$$\hat{x}_{j+1|j+1} = F_j \hat{x}_{j|j} + P_{j+1} H_{j+1}^* (I + H_{j+1} P_{j+1} H_{j+1}^*)^{-1}(y_{j+1} - H_{j+1} \hat{x}_{j+1}), \qquad \hat{x}_{j+1} = F_j \hat{x}_{j|j},$$

where

$$P_{j+1} = F_j P_j F_j^* + G_j G_j^* - F_j P_j H_j^* (I + H_j P_j H_j^*)^{-1} H_j P_j F_j^*, \qquad P_0 = \Pi_0. \qquad (17)$$

As several authors have noted, the $H^\infty$ solutions are very similar to the conventional Kalman filter. The major differences are the following:

- The structure of the $H^\infty$ estimators depends, via the Riccati recursion (9), on the linear combination of the states that we intend to estimate (i.e., the $L_i$). This is in contrast to the Kalman filter, where the estimate of any linear combination of the state is given by that linear combination of the state estimate. Intuitively, this means that the $H^\infty$ filters are specifically tuned towards the linear combination $L_i x_i$.

- We have additional conditions, (8) or (13), that must be satisfied for the filter to exist; in the Kalman filter problem the $L_i$ would not appear, and the $P_i$ would be positive definite, so that (8) and (13) would be immediate.

- We have indefinite (covariance) matrices, e.g., $\begin{bmatrix} I & 0 \\ 0 & -\gamma^2 I \end{bmatrix}$ vs. just $I$ in the Kalman filter.

- As $\gamma \to \infty$, the Riccati recursion (9) reduces to the Kalman filter recursion (17) (a small numerical check of this limit is sketched below). [This suggests that the $H^\infty$ norm of the conventional Kalman filter may be quite large, and that it may have poor robustness properties. Note also that condition (13) is more stringent than condition (8), indicating that the existence of an a priori filter of level $\gamma$ implies the existence of an a posteriori filter of level $\gamma$, but not necessarily vice versa.]

Despite these differences, we shall show, by applying the results of the companion paper [1], that the filters of Theorems 1 and 2 can in fact be obtained as certain Kalman filters, not in an $H^2$ (Hilbert) space, but in a certain indefinite vector space, called a Krein space. The indefinite covariances and the appearance of $L_i$ in the Riccati equation will be easily explained in this framework. The additional condition (8) will be seen to arise from the fact that in Krein space, unlike in the usual Hilbert space context, quadratic forms need not always have minima or maxima, unless certain additional conditions are met. Moreover, our approach will provide a simpler and more general alternative to the tests (8) and (13).
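The following small check (our own, with arbitrary illustrative matrices) confirms numerically that the $H^\infty$ Riccati update approaches the Kalman update as $\gamma$ grows:

```python
# A small numerical check (not from the paper) that the H-infinity Riccati update
# (9) approaches the Kalman filter update (17) as gamma grows. The matrices are
# arbitrary illustrative choices.
import numpy as np

F = np.array([[0.9, 0.1], [0.0, 0.8]])
G = np.array([[1.0], [0.5]])
H = np.array([[1.0, 0.0]])
L = np.array([[0.0, 1.0]])
P = np.eye(2)

def riccati_hinf(P, gamma):
    C = np.vstack([H, L])
    R = np.diag([1.0, -gamma**2])
    Re = R + C @ P @ C.T
    return F @ P @ F.T + G @ G.T - F @ P @ C.T @ np.linalg.inv(Re) @ C @ P @ F.T

def riccati_kalman(P):
    Re = 1.0 + H @ P @ H.T
    return F @ P @ F.T + G @ G.T - F @ P @ H.T @ np.linalg.inv(Re) @ H @ P @ F.T

for gamma in [2.0, 10.0, 100.0]:
    diff = np.linalg.norm(riccati_hinf(P, gamma) - riccati_kalman(P))
    print(f"gamma = {gamma:6.1f}:  ||P_hinf - P_kf|| = {diff:.2e}")
# the printed difference decays roughly like 1/gamma^2
```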

3 Derivation of the $H^\infty$ Filters

As shown in the companion paper [1], the first step is to associate an indefinite quadratic form with each of the (level $\gamma$) a posteriori and a priori filtering problems. This will lead us to construct an appropriate (so-called partially equivalent) Krein space state-space model, the Kalman filter for which will allow us to compute the stationary points of the $H^\infty$ quadratic forms; conditions that these are actually minima will be deduced from the general results of Part I, and shown to be just (8) and (13); simpler equivalent conditions will also be noted. Therefore we begin by examining the structure of the $H^\infty$ problem in more detail. The goal will be to relate the problem to an indefinite quadratic form. We shall first consider the a posteriori filtering problem.

3.1 The Suboptimal $H^\infty$ Problem and Quadratic Forms

Referring to Problem 2, we first note that $\|T_i(\mathcal{F}_f)\|_\infty < \gamma_f$ implies that for all nonzero $\{x_0, \{u_j\}_{j=0}^i, \{v_j\}_{j=0}^i\}$,

$$\frac{\sum_{j=0}^i |\breve{z}_{j|j} - L_j x_j|^2}{(x_0 - \bar{x}_0)^* \Pi_0^{-1}(x_0 - \bar{x}_0) + \sum_{j=0}^i |u_j|^2 + \sum_{j=0}^i |y_j - H_j x_j|^2} < \gamma_f^2. \qquad (18)$$

Moreover, (18) implies that for all $k \le i$ we must have

$$\frac{\sum_{j=0}^k |\breve{z}_{j|j} - L_j x_j|^2}{(x_0 - \bar{x}_0)^* \Pi_0^{-1}(x_0 - \bar{x}_0) + \sum_{j=0}^k |u_j|^2 + \sum_{j=0}^k |y_j - H_j x_j|^2} < \gamma_f^2. \qquad (19)$$

We remark that if the $\{y_j\}_{j=0}^i$ are all zero, then it is easy to see that the $\{\breve{z}_{j|j}\}$ must all be zero as well. Therefore we need only consider the case where $\{y_j\}_{j=0}^i$ is a nonzero sequence. We shall then prove the following result, relating the condition $\|T_i(\mathcal{F}_f)\|_\infty < \gamma_f$ to the positivity of a certain indefinite quadratic form. Incidentally, from now on, without loss of generality, we assume $\bar{x}_0 = 0$; a nonzero $\bar{x}_0$ will only change the initial condition of the filter.

Lemma 1 (Indefinite Quadratic Form) Given a scalar $\gamma_f > 0$, then $\|T_i(\mathcal{F}_f)\|_\infty < \gamma_f$ if, and only if, there exists $\breve{z}_{k|k} = \mathcal{F}_f(y_0, \ldots, y_k)$ (for all $0 \le k \le i$) such that for all complex vectors $x_0$, for all causal sequences $\{u_j\}_{j=0}^i$, and for all nonzero causal sequences $\{y_j\}_{j=0}^i$, the scalar quadratic form

$$J_{f,k}(x_0, u_0, \ldots, u_k, y_0, \ldots, y_k) = x_0^* \Pi_0^{-1} x_0 + \sum_{j=0}^k u_j^* u_j + \sum_{j=0}^k (y_j - H_j x_j)^*(y_j - H_j x_j) - \gamma_f^{-2} \sum_{j=0}^k (\breve{z}_{j|j} - L_j x_j)^*(\breve{z}_{j|j} - L_j x_j) \qquad (20)$$

satisfies

$$J_{f,k}(x_0, u_0, \ldots, u_k, y_0, \ldots, y_k) > 0 \qquad \text{for all } 0 \le k \le i. \qquad (21)$$

Proof: Assume there exists a solution $\breve{z}_{k|k}$ (for all $k \le i$) that achieves $\|T_i(\mathcal{F}_f)\|_\infty < \gamma_f$. Then if we multiply both sides of (19) by the positive denominator on the LHS, we obtain (21). Conversely, if there exists a solution $\breve{z}_{k|k}$ (for all $k \le i$) that achieves (21), we can divide both sides of (21) by the positive quantity

$$x_0^* \Pi_0^{-1} x_0 + \sum_{j=0}^k u_j^* u_j + \sum_{j=0}^k (y_j - H_j x_j)^*(y_j - H_j x_j)$$

to obtain (19), and thereby $\|T_i(\mathcal{F}_f)\|_\infty < \gamma_f$.

Remark: Lemma 1 is a straightforward restatement of the inequality (19), which is required of all suboptimal $H^\infty$ a posteriori filters with level $\gamma_f$. However, the statement of Lemma 2, given below, is a key result, since it shows how to check the conditions of Lemma 1 by computing the stationary point of the indefinite quadratic form $J_{f,k}(x_0, u_0, \ldots, u_k, y_0, \ldots, y_k)$ and checking its condition for a minimum. This is in the spirit of the approach taken in [1]. Note that since the $\breve{z}_{k|k}$ are functions of the $\{y_j\}_{j=0}^k$, $J_{f,k}$ is really a function of only $\{x_0, u_0, \ldots, u_k, y_0, \ldots, y_k\}$. Moreover, since the $\{y_j\}$ are fixed observations, the only free variables in $J_{f,k}(x_0, u_0, \ldots, u_k, y_0, \ldots, y_k)$ are the disturbances $\{x_0, u_0, \ldots, u_k\}$. We then have the following result.

Lemma 2 (Positivity Condition) The scalar quadratic forms $J_{f,k}(x_0, u_0, \ldots, u_k, y_0, \ldots, y_k)$ satisfy the conditions (21) if, and only if, for all $0 \le k \le i$:

(i) $J_{f,k}(x_0, u_0, \ldots, u_k, y_0, \ldots, y_k)$ has a minimum with respect to $\{x_0, u_0, u_1, \ldots, u_k\}$.

(ii) The $\{\breve{z}_{k|k}\}_{k=0}^i$ can be chosen such that the value of $J_{f,k}(x_0, u_0, \ldots, u_k, y_0, \ldots, y_k)$ at this minimum is positive, viz.,

$$\min_{\{x_0, u_0, \ldots, u_k\}} J_{f,k}(x_0, u_0, \ldots, u_k, y_0, \ldots, y_k, \breve{z}_{0|0}, \ldots, \breve{z}_{k|k}) > 0.$$

Proof: Assume $J_{f,k}(x_0, \{u_j\}_{j=0}^k, \{y_j\}_{j=0}^k) > 0$; then condition (i) is clearly satisfied, because if $J_{f,k}$ did not have a minimum over $\{x_0, u_0, \ldots, u_k\}$, it would always be possible to choose $\{x_0, u_0, \ldots, u_k\}$ so as to make $J_{f,k}$ arbitrarily small and negative. Moreover, the existence of a minimum, along with $J_{f,k} > 0$, guarantees condition (ii) since the value at the minimum must be positive. Conversely, if (i) and (ii) hold, then it follows that $J_{f,k}(x_0, \{u_j\}_{j=0}^k, \{y_j\}_{j=0}^k) > 0$.

3.2 A Krein Space State-Space Model

To apply the methodology of Part I, we first identify the indefinite quadratic form $J_{f,k}$ as a special case of the general form studied in Theorem 6 of [1] by rewriting it as

$$J_{f,k}(x_0, u_0, \ldots, u_k, y_0, \ldots, y_k) = x_0^* \Pi_0^{-1} x_0 + \sum_{j=0}^k u_j^* u_j + \sum_{j=0}^k \left( \begin{bmatrix} y_j \\ \breve{z}_{j|j} \end{bmatrix} - \begin{bmatrix} H_j \\ L_j \end{bmatrix} x_j \right)^* \begin{bmatrix} I & 0 \\ 0 & -\gamma_f^2 I \end{bmatrix}^{-1} \left( \begin{bmatrix} y_j \\ \breve{z}_{j|j} \end{bmatrix} - \begin{bmatrix} H_j \\ L_j \end{bmatrix} x_j \right). \qquad (22)$$

Then by Lemmas 6 and 7 in [1], we can introduce the following Krein space system

$$\mathbf{x}_{j+1} = F_j \mathbf{x}_j + G_j \mathbf{u}_j, \qquad \begin{bmatrix} \mathbf{y}_j \\ \mathbf{z}_{j|j} \end{bmatrix} = \begin{bmatrix} H_j \\ L_j \end{bmatrix} \mathbf{x}_j + \mathbf{v}_j \qquad (23)$$

with

$$\left\langle \begin{bmatrix} \mathbf{x}_0 \\ \mathbf{u}_j \\ \mathbf{v}_j \end{bmatrix}, \begin{bmatrix} \mathbf{x}_0 \\ \mathbf{u}_k \\ \mathbf{v}_k \end{bmatrix} \right\rangle = \begin{bmatrix} \Pi_0 & 0 & 0 \\ 0 & I \delta_{jk} & 0 \\ 0 & 0 & \begin{bmatrix} I & 0 \\ 0 & -\gamma_f^2 I \end{bmatrix} \delta_{jk} \end{bmatrix}. \qquad (24)$$

Note that $Q_j = I$, $S_j = 0$, $\Pi_0 > 0$, and that we must consider a Krein space since

$$R_j = \begin{bmatrix} I & 0 \\ 0 & -\gamma_f^2 I \end{bmatrix} \qquad (25)$$

is indefinite.
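As a quick sanity check of the rewriting (22), one can verify numerically that the stacked form reproduces $J_{f,k}$ on a random trajectory. The following script is our own illustration (all names and the random data are ours):

```python
# Numerical sanity check (ours) that the stacked rewriting (22) reproduces the
# quadratic form (20) for a random trajectory of the model.
import numpy as np

rng = np.random.default_rng(0)
n, m, p, q, k, gamma = 3, 2, 2, 1, 4, 1.5
F = [rng.standard_normal((n, n)) for _ in range(k + 1)]
G = [rng.standard_normal((n, m)) for _ in range(k + 1)]
H = [rng.standard_normal((p, n)) for _ in range(k + 1)]
L = [rng.standard_normal((q, n)) for _ in range(k + 1)]
Pi0 = np.eye(n)
x0 = rng.standard_normal(n)
u = [rng.standard_normal(m) for _ in range(k + 1)]
y = [rng.standard_normal(p) for _ in range(k + 1)]
z = [rng.standard_normal(q) for _ in range(k + 1)]   # plays the role of z_{j|j}

# propagate the state x_{j+1} = F_j x_j + G_j u_j
x = [x0]
for j in range(k):
    x.append(F[j] @ x[j] + G[j] @ u[j])

# form (20): x0' Pi0^{-1} x0 + sum u'u + sum |y-Hx|^2 - gamma^{-2} sum |z-Lx|^2
J20 = x0 @ np.linalg.inv(Pi0) @ x0 \
    + sum(u[j] @ u[j] for j in range(k + 1)) \
    + sum((y[j] - H[j] @ x[j]) @ (y[j] - H[j] @ x[j]) for j in range(k + 1)) \
    - sum((z[j] - L[j] @ x[j]) @ (z[j] - L[j] @ x[j]) for j in range(k + 1)) / gamma**2

# form (22): same sum written with the stacked output and indefinite weight R_j
Rinv = np.linalg.inv(np.block([[np.eye(p), np.zeros((p, q))],
                               [np.zeros((q, p)), -gamma**2 * np.eye(q)]]))
J22 = x0 @ np.linalg.inv(Pi0) @ x0 + sum(u[j] @ u[j] for j in range(k + 1))
for j in range(k + 1):
    e = np.concatenate([y[j], z[j]]) - np.vstack([H[j], L[j]]) @ x[j]
    J22 += e @ Rinv @ e

print(np.isclose(J20, J22))   # True
```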

3.3 Proof of Theorem 1

In order to focus the discussion we briefly review the procedure of the proof.

- Referring to Lemma 2, we first need to check whether $J_{f,k}(x_0, u_0, \ldots, u_k, y_0, \ldots, y_k)$ has a minimum with respect to $\{x_0, u_0, u_1, \ldots, u_k\}$. This is done via the Krein space Kalman filter corresponding to (23)-(24), and yields the condition (8) along with several equivalent conditions.

- Next we need to choose the $\{\breve{z}_{k|k}\}_{k=0}^i$ such that the value of $J_{f,k}(x_0, u_0, \ldots, u_k, y_0, \ldots, y_k)$ is positive at its minimum. Now according to Theorem 6 in Part I, the value at the minimum is $J_{f,k}(\min) = \sum_{j=0}^k e_j^* R_{e,j}^{-1} e_j$, where $e_j$ is the innovations sequence corresponding to (23)-(24). We can then compute the $\{e_j\}$ using the Krein space Kalman filter, and thereby choose the appropriate $\{\breve{z}_{k|k}\}_{k=0}^i$, which yields the desired a posteriori filter.

A remark on the strong regularity of the model (23)-(24): In what follows we would like to use the Krein space Kalman filter corresponding to the state-space model (23)-(24). This of course requires the strong regularity of its output Gramian matrix, which we denote by $R_{zy}$ (since the output of (23) consists of both a $y$ and a $z$ component). If $R_{zy}$ is strongly regular, then the Krein space Kalman filter may be applied to check for the positivity of $J_{f,k}$ for each $0 \le k \le i$. But what if $R_{zy}$ is not strongly regular? Then it turns out that $J_{f,k}$ cannot be positive for all $0 \le k \le i$. To see why, suppose that $J_{f,k} > 0$ for some arbitrary $k$. Then $J_{f,k}$ must have a minimum, and according to Lemma 9 in [1], the leading block submatrices of $R_{zy}$ and $R - S^* Q S = R$ (recall $S = 0$) must have the same inertia. Now due to (25), all leading submatrices of $R$ are nonsingular, and since $k$ was arbitrary, the same will be true of $R_{zy}$. Therefore $R_{zy}$ will be strongly regular.

To summarize, we may use the Krein space Kalman filter to check the positivity of $J_{f,k}$. If one of the $R_{e,k}$ becomes singular (so that $R_{zy}$ is no longer strongly regular), $J_{f,k}$ loses its positivity by default.

3.3.1 Proof of the Existence Condition (8)

The Riccati recursion corresponding to (23) is exactly the Riccati recursion that was given by (9) in Theorem 1. We can now apply any of the conditions for a minimum developed in [1] to check whether a minimum exists for $J_{f,k}(x_0, u_0, \ldots, u_k, y_0, \ldots, y_k)$ for all $0 \le k \le i$. If we assume that the $[F_k \; G_k]$ have full rank, then according to Lemma 13 in [1], $J_{f,k}(x_0, u_0, \ldots, u_k, y_0, \ldots, y_k)$ will have a minimum for all $0 \le k \le i$ if, and only if,

$$P_{j|j}^{-1} = P_j^{-1} + \begin{bmatrix} H_j^* & L_j^* \end{bmatrix} \begin{bmatrix} I & 0 \\ 0 & -\gamma_f^2 I \end{bmatrix}^{-1} \begin{bmatrix} H_j \\ L_j \end{bmatrix} > 0,$$

which yields the condition (8). Since we still need to satisfy the second condition of Lemma 2, this, of course, only shows that (8) is a necessary condition for the existence of an $H^\infty$ a posteriori filter of level $\gamma_f$. However, we shall later show that if the minimum condition is satisfied then the second condition of Lemma 2 can also be satisfied. Therefore (8) is indeed necessary and sufficient for the existence of the filter.

3.3.2 Other Existence Conditions

Using the results of [1], we can obtain alternative conditions for the existence of $H^\infty$ a posteriori filters of level $\gamma_f$. If we use Lemma 12 in [1] we have the following condition.

Lemma 3 (Alternative Test for Existence) The condition (8) can be replaced by the condition that

$$R_j = \begin{bmatrix} I & 0 \\ 0 & -\gamma_f^2 I \end{bmatrix} \quad \text{and} \quad R_{e,j} = \begin{bmatrix} I & 0 \\ 0 & -\gamma_f^2 I \end{bmatrix} + \begin{bmatrix} H_j \\ L_j \end{bmatrix} P_j \begin{bmatrix} H_j^* & L_j^* \end{bmatrix}$$

have the same inertia for all $0 \le j \le i$. We no longer require that $[F_j \; G_j]$ have full rank, and the size of the matrices involved is generally smaller than in (8).

Using a block triangular factorization of $R_{e,j}$, and the fact that when we have a minimum $P_j$ is positive definite, we can show the following result.

Corollary 1 (Alternative Test for Existence) The condition of Lemma 3 is equivalent to

$$I + H_j P_j H_j^* > 0 \quad \text{and} \quad -\gamma_f^2 I + L_j (P_j^{-1} + H_j^* H_j)^{-1} L_j^* < 0, \qquad (26)$$

for all $0 \le j \le i$.

The test of Lemma 3 has various advantages over (8) that are mentioned in the discussion following Lemma 13 in [1]. In particular, Lemma 3 allows us to go to a square-root form of the $H^\infty$ filtering algorithm, where there is no need to explicitly check for the existence condition; these conditions are built into the square-root recursions themselves, so that a solution exists if, and only if, the algorithm can be performed [24]. Many alternative existence conditions can also be obtained. Here is one that follows from Lemma 14 in [1].

Lemma 4 (Alternative Test for Existence) If the $\{F_j\}$ are nonsingular, an $H^\infty$ a posteriori filter of level $\gamma_f$ exists if, and only if,

$$P_{i+1} > 0, \quad \text{and} \quad I - G_j^* P_{j+1}^{-1} G_j > 0, \qquad j = 0, 1, \ldots, i.$$
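The inertia test of Lemma 3 is straightforward to implement; the sketch below (our own helper names, real matrices assumed) checks at a given step that $R_j$ and $R_{e,j}$ have the same signature:

```python
# A sketch of the inertia-based existence test of Lemma 3 / Corollary 1:
# at each step, R_j and R_{e,j} must have the same inertia. Names are ours.
import numpy as np

def inertia(A, tol=1e-10):
    """Return (n_plus, n_zero, n_minus) for a Hermitian matrix A."""
    w = np.linalg.eigvalsh(A)
    return (int(np.sum(w > tol)), int(np.sum(np.abs(w) <= tol)), int(np.sum(w < -tol)))

def exists_level_gamma(P, Hj, Lj, gamma):
    """Check the step-j condition of Lemma 3 given the current Riccati variable P."""
    p, q = Hj.shape[0], Lj.shape[0]
    R = np.block([[np.eye(p), np.zeros((p, q))],
                  [np.zeros((q, p)), -gamma**2 * np.eye(q)]])
    C = np.vstack([Hj, Lj])
    Re = R + C @ P @ C.T
    return inertia(R) == inertia(Re)
```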

3.3.3 Construction of the $H^\infty$ A Posteriori Filters

To complete the proof of Theorem 1 we still need to show that if a minimum over $\{x_0, u_0, \ldots, u_k\}$ exists for all $0 \le k \le i$, then we can find the estimates $\{\breve{z}_{k|k}\}_{k=0}^i$ such that the value of $J_{f,k}(x_0, u_0, \ldots, u_k, y_0, \ldots, y_k)$ at its minimum is positive.

According to Theorem 6 in [1], the minimum value of $J_{f,k}(x_0, u_0, \ldots, u_k, y_0, \ldots, y_k)$ is

$$\sum_{j=0}^k \begin{bmatrix} e_{y,j} \\ e_{z,j} \end{bmatrix}^* R_{e,j}^{-1} \begin{bmatrix} e_{y,j} \\ e_{z,j} \end{bmatrix} = \sum_{j=0}^k \begin{bmatrix} y_j - \hat{y}_{j|j-1} \\ \breve{z}_{j|j} - \hat{z}_{j|j-1} \end{bmatrix}^* \begin{bmatrix} I + H_j P_j H_j^* & H_j P_j L_j^* \\ L_j P_j H_j^* & -\gamma_f^2 I + L_j P_j L_j^* \end{bmatrix}^{-1} \begin{bmatrix} y_j - \hat{y}_{j|j-1} \\ \breve{z}_{j|j} - \hat{z}_{j|j-1} \end{bmatrix}$$

where $\hat{y}_{j|j-1}$ and $\hat{z}_{j|j-1}$ are obtained from the Krein space projections of $\mathbf{y}_j$ and $\mathbf{z}_{j|j}$ onto $\mathcal{L}\{\{\mathbf{y}_l\}_{l=0}^{j-1}, \{\mathbf{z}_{l|l}\}_{l=0}^{j-1}\}$, respectively. Thus $\hat{z}_{j|j-1}$ is a linear function of $\{y_l\}_{l=0}^{j-1}$. Using the block triangular factorization of the $R_{e,j}$ we may rewrite the above as

$$\sum_{j=0}^k \begin{bmatrix} y_j - \hat{y}_{j|j-1} \\ \breve{z}_{j|j} - \hat{z}_{j|j} \end{bmatrix}^* \begin{bmatrix} I + H_j P_j H_j^* & 0 \\ 0 & -\gamma_f^2 I + L_j (P_j^{-1} + H_j^* H_j)^{-1} L_j^* \end{bmatrix}^{-1} \begin{bmatrix} y_j - \hat{y}_{j|j-1} \\ \breve{z}_{j|j} - \hat{z}_{j|j} \end{bmatrix}$$

where

$$\begin{bmatrix} y_j - \hat{y}_{j|j-1} \\ \breve{z}_{j|j} - \hat{z}_{j|j} \end{bmatrix} = \begin{bmatrix} I & 0 \\ -L_j P_j H_j^* (I + H_j P_j H_j^*)^{-1} & I \end{bmatrix} \begin{bmatrix} y_j - \hat{y}_{j|j-1} \\ \breve{z}_{j|j} - \hat{z}_{j|j-1} \end{bmatrix}. \qquad (27)$$

Note that $\hat{z}_{j|j}$ is obtained from the Krein space projection of $\mathbf{z}_{j|j}$ onto $\mathcal{L}\{\{\mathbf{y}_l\}_{l=0}^{j}, \{\mathbf{z}_{l|l}\}_{l=0}^{j-1}\}$, and is therefore a linear function of $\{y_l\}_{l=0}^{j}$. Recall from Corollary 1 that

$$I + H_j P_j H_j^* > 0 \quad \text{and} \quad -\gamma_f^2 I + L_j (P_j^{-1} + H_j^* H_j)^{-1} L_j^* < 0. \qquad (28)$$

Therefore all we must do is choose some $\breve{z}_{j|j}$ such that

$$\sum_{j=0}^k (y_j - \hat{y}_{j|j-1})^* (I + H_j P_j H_j^*)^{-1} (y_j - \hat{y}_{j|j-1}) + \sum_{j=0}^k (\breve{z}_{j|j} - \hat{z}_{j|j})^* \left[ -\gamma_f^2 I + L_j (P_j^{-1} + H_j^* H_j)^{-1} L_j^* \right]^{-1} (\breve{z}_{j|j} - \hat{z}_{j|j}) > 0. \qquad (29)$$

There are many such choices, but in view of (28), the simplest is

$$\breve{z}_{j|j} = \hat{z}_{j|j} = L_j \hat{x}_{j|j} \qquad (0 \le j \le k \le i)$$

where $\hat{x}_{j|j}$ is given by the Krein space projection of the state $\mathbf{x}_j$ onto $\mathcal{L}\{\{\mathbf{y}_l\}_{l=0}^{j}, \{\mathbf{z}_{l|l}\}_{l=0}^{j-1}\}$. We may now utilize the filtered form of the Krein space Kalman filter corresponding to the state-space model (23) to recursively compute $\hat{x}_{j|j}$ (see Corollary 4 in [1]),

$$\hat{x}_{j+1|j+1} = F_j \hat{x}_{j|j} + P_{j+1} \begin{bmatrix} H_{j+1}^* & L_{j+1}^* \end{bmatrix} R_{e,j+1}^{-1} \begin{bmatrix} y_{j+1} - \hat{y}_{j+1|j} \\ \breve{z}_{j+1|j+1} - \hat{z}_{j+1|j} \end{bmatrix}. \qquad (30)$$

Using $\hat{y}_{j+1|j} = H_{j+1} F_j \hat{x}_{j|j}$ and the above-mentioned triangular factorization of $R_{e,j+1}$, we have

$$\hat{x}_{j+1|j+1} = F_j \hat{x}_{j|j} + P_{j+1} \begin{bmatrix} H_{j+1}^* & L_{j+1}^* \end{bmatrix} \begin{bmatrix} I & -(I + H_{j+1} P_{j+1} H_{j+1}^*)^{-1} H_{j+1} P_{j+1} L_{j+1}^* \\ 0 & I \end{bmatrix} \times$$

$$\begin{bmatrix} I + H_{j+1} P_{j+1} H_{j+1}^* & 0 \\ 0 & -\gamma_f^2 I + L_{j+1}(P_{j+1}^{-1} + H_{j+1}^* H_{j+1})^{-1} L_{j+1}^* \end{bmatrix}^{-1} \begin{bmatrix} y_{j+1} - H_{j+1} F_j \hat{x}_{j|j} \\ \breve{z}_{j+1|j+1} - \hat{z}_{j+1|j+1} \end{bmatrix}.$$

Choosing $\breve{z}_{j+1|j+1} = \hat{z}_{j+1|j+1}$ yields the desired recursion of Theorem 1:

$$\hat{x}_{j+1|j+1} = F_j \hat{x}_{j|j} + P_{j+1} H_{j+1}^* (I + H_{j+1} P_{j+1} H_{j+1}^*)^{-1} (y_{j+1} - H_{j+1} F_j \hat{x}_{j|j}).$$

3.4 Parametrization of All $H^\infty$ A Posteriori Filters

The filter of Theorem 1 is one among many possible filters with level $\gamma_f$. All filters that guarantee $J_{f,k}(x_0, u_0, \ldots, u_k, y_0, \ldots, y_k) > 0$ are characterized by (28) and (29). We may use these expressions to obtain a more explicit characterization of all possible estimators. Similar results appear in [4, 7, 11].

Theorem 3 (All $H^\infty$ A Posteriori Estimators) All $H^\infty$ a posteriori estimators that achieve a level $\gamma_f$ (assuming they exist) are given by

$$\breve{z}_{j|j} = L_j \hat{x}_{j|j} + \left[ \gamma_f^2 I - L_j (P_j^{-1} + H_j^* H_j)^{-1} L_j^* \right]^{1/2} \mathcal{S}_j \left( (I + H_j P_j H_j^*)^{1/2}(y_j - H_j \hat{x}_{j|j}), \ldots, (I + H_0 P_0 H_0^*)^{1/2}(y_0 - H_0 \hat{x}_{0|0}) \right) \qquad (31)$$

where $\hat{x}_{j|j}$ satisfies the recursion

$$\hat{x}_{j+1|j+1} = F_j \hat{x}_{j|j} + K_{s,j+1}(y_{j+1} - H_{j+1} F_j \hat{x}_{j|j}) - K_{c,j}(\breve{z}_{j|j} - L_j \hat{x}_{j|j}) \qquad (32)$$

with $K_{s,j+1}$ the same as in Theorem 1,

$$K_{c,j} = (I + P_{j+1} H_{j+1}^* H_{j+1})^{-1} F_j (P_j^{-1} + H_j^* H_j - \gamma_f^{-2} L_j^* L_j)^{-1} L_j^*,$$

and

$$\mathcal{S}(a_j, \ldots, a_0) = \begin{bmatrix} \mathcal{S}_0(a_0) \\ \mathcal{S}_1(a_1, a_0) \\ \vdots \\ \mathcal{S}_j(a_j, \ldots, a_0) \end{bmatrix}$$

is any (possibly nonlinear) contractive causal mapping, i.e.,

$$\sum_{j=0}^k |\mathcal{S}_j(a_j, \ldots, a_0)|^2 \le \sum_{j=0}^k |a_j|^2 \qquad \text{for all } 0 \le k \le i.$$

Proof: Recall from (29) that all level-$\gamma_f$ a posteriori estimators must satisfy

$$\sum_{j=0}^k (y_j - \hat{y}_{j|j-1})^* (I + H_j P_j H_j^*)^{-1}(y_j - \hat{y}_{j|j-1}) + \sum_{j=0}^k (\breve{z}_{j|j} - \hat{z}_{j|j})^* \left[ -\gamma_f^2 I + L_j (P_j^{-1} + H_j^* H_j)^{-1} L_j^* \right]^{-1} (\breve{z}_{j|j} - \hat{z}_{j|j}) > 0, \qquad (34)$$

where $\hat{x}_j$ and $\hat{x}_{j|j}$ denote the Krein space projections of $\mathbf{x}_j$ onto $\mathcal{L}\{\{\mathbf{y}_l\}_{l=0}^{j-1}, \{\mathbf{z}_{l|l}\}_{l=0}^{j-1}\}$ and $\mathcal{L}\{\{\mathbf{y}_l\}_{l=0}^{j}, \{\mathbf{z}_{l|l}\}_{l=0}^{j-1}\}$, respectively. Therefore $\hat{x}_j$ and $\hat{x}_{j|j}$ are related through one additional projection onto $\mathbf{y}_j$, and we may write

$$\hat{x}_{j|j} = \hat{x}_j + P_j H_j^* (I + H_j P_j H_j^*)^{-1}(y_j - H_j \hat{x}_j). \qquad (35)$$

Therefore

$$y_j - H_j \hat{x}_{j|j} = \left[ I - H_j P_j H_j^* (I + H_j P_j H_j^*)^{-1} \right](y_j - H_j \hat{x}_j) = (I + H_j P_j H_j^*)^{-1}(y_j - H_j \hat{x}_j),$$

so that $(y_j - H_j \hat{x}_j) = (I + H_j P_j H_j^*)(y_j - H_j \hat{x}_{j|j})$. Now (34) can be written as

$$\sum_{j=0}^k (y_j - H_j \hat{x}_{j|j})^* (I + H_j P_j H_j^*)(y_j - H_j \hat{x}_{j|j}) + \sum_{j=0}^k (\breve{z}_{j|j} - L_j \hat{x}_{j|j})^* \left[ -\gamma_f^2 I + L_j (P_j^{-1} + H_j^* H_j)^{-1} L_j^* \right]^{-1} (\breve{z}_{j|j} - L_j \hat{x}_{j|j}) > 0$$

or, equivalently,

$$\left\| \left( \gamma_f^2 I - L_j (P_j^{-1} + H_j^* H_j)^{-1} L_j^* \right)^{-1/2} (\breve{z}_{j|j} - \hat{z}_{j|j}) \right\|_2^2 < \left\| (I + H_j P_j H_j^*)^{1/2}(y_j - H_j \hat{x}_{j|j}) \right\|_2^2.$$

Since $\breve{z}_{j|j}$ is a causal function of the observations $y_j$, $(\breve{z}_{j|j} - \hat{z}_{j|j})$ will also be a causal function of $(y_j - H_j \hat{x}_{j|j})$. Therefore, using the above expression, we can write

$$\begin{bmatrix} (\gamma_f^2 I - L_0 (P_0^{-1} + H_0^* H_0)^{-1} L_0^*)^{-1/2}(\breve{z}_{0|0} - \hat{z}_{0|0}) \\ \vdots \\ (\gamma_f^2 I - L_i (P_i^{-1} + H_i^* H_i)^{-1} L_i^*)^{-1/2}(\breve{z}_{i|i} - \hat{z}_{i|i}) \end{bmatrix} = \mathcal{S} \begin{bmatrix} (I + H_0 P_0 H_0^*)^{1/2}(y_0 - H_0 \hat{x}_{0|0}) \\ \vdots \\ (I + H_i P_i H_i^*)^{1/2}(y_i - H_i \hat{x}_{i|i}) \end{bmatrix}$$

for some causal contractive mapping $\mathcal{S}$. Eq. (31) now readily follows. Finally, we must show (32). To this end, recall from the proof of Theorem 1 (see (30)) that the recursion for $\hat{x}_{j|j}$ is given by

$$\hat{x}_{j+1|j+1} = F_j \hat{x}_{j|j} + P_{j+1} \begin{bmatrix} H_{j+1}^* & L_{j+1}^* \end{bmatrix} R_{e,j+1}^{-1} \begin{bmatrix} y_{j+1} - \hat{y}_{j+1|j} \\ \breve{z}_{j+1|j+1} - \hat{z}_{j+1|j} \end{bmatrix}.$$

Using (35) to replace $\hat{x}_j$ by $\hat{x}_{j|j}$ yields, after some algebra, the desired recursion (32).

Note that although the filter obtained in Theorem 1 is linear, the full parametrization of all $H^\infty$ filters with level $\gamma_f$ is given by a nonlinear causal contractive mapping $\mathcal{S}$. The filter of Theorem 1 is known as the central filter and, as we have seen, corresponds to $\mathcal{S} = 0$. This central filter has a number of other interesting properties. It corresponds, as we shall see in the next section, to the risk-sensitive optimal filter, and can be shown to be the maximum entropy filter [10]. Moreover, in the game theoretic formulation of the $H^\infty$ problem, the central filter corresponds to the solution of the game [12]. In our context, the central filter is recognized as the Krein space Kalman filter corresponding to the state-space model (23).
Using (35) to replace x^j by x^j jj yields, after some algebra, the desired recursion (32). Note that although the lter obtained in Theorem 1 is linear, the full parametrization of all H 1 lters with level f is given by a nonlinear causal contractive mapping S . The lter of Theorem 1 is known as the central lter and as we have seen, corresponds to S = 0. This central lter has a number of other interesting properties. It corresponds, as we shall see in the next section, to the risk-sensitive optimal lter, and can be shown to be the maximum entropy lter [10]. Moreover, in the game theoretic formulation of the H 1 problem, the central lter corresponds to the solution of the game [12]. In our context, the central lter is recognized as the Krein space Kalman lter corresponding to the state-space model (23).

3.5 Derivation of the Apriori H 1 Filter We shall now turn to the H 1 apriori lter of Problem 2 and our main goal will be to prove the results of Theorem 2. Our approach will follow the one used for the aposteriori case, namely we will relate an inde nite quadratic form to the apriori problem, construct its corresponding Krein space state-space model, and use the Krein space Kalman lter to obtain the solution. Since our derivations parallel the ones given earlier, we shall omit several details.

The Suboptimal H 1 Apriori Problem and Quadratic Forms Referring to Problem 2 we rst note that kTi(Fp)k1

fx0; fuj gji?=01 ; fvj gij?=01 g

< p ,

implies that for all nonzero

Pi jz ? L x j2 j j j =0 j < p2; P P i ? 1 i ? 1 ? 1  2 2 x00 x0 + j =0 juj j + j =0 jyj ? Hj xj j 19

(36)

where, without loss of generality, we have assumed x0 = 0. Moreover, (36) implies that for all k  i, we

must have

Pk jz ? L x j2 j j j =0 j 2 P P ? 1 k ? 1 ?1 jyj ? Hj xj j2 < p :  2 x00 x0 + j =0 juj j + jk=0

(37)

As before, we may easily show the following result.

Lemma 5 (Inde nite Quadratic Form) Given a scalar p > 0, then kTi(Fp)k1 < p if, and only if, there exists zk = Fp(y0 ; : : : ; yk?1 ) (for all 0  k  i) such that for all complex vectors x0, for all causal sequences fuj gij?=01 and for all nonzero causal sequences fyj gij?=01 the scalar quadratic form 1 x0 ? 0 x0 +

Jp;k (x0; u0; : : : ; uk?1 ; y0; : : : ; yk?1 ) =

Pk?1 uu + Pk?1(y ? H x )(y ? H x ) ? ?2 Pk (z ? L x )(z ? L x ) j j j j j j j j j j p j =0 j j =0 j j j =0 j

(38)

satis es Jp;k (x0; u0; : : : ; uk?1 ; y0; : : : ; yk?1 ) > 0

for all 0  k  i:

(39)

We can also readily obtain the analog of Lemma 2.

Lemma 6 (Positivity Condition) The scalar quadratic forms Jp;k (x0; u0; : : : ; uk?1; y0; : : : ; yk?1) satisfy the conditions (39), i , for all 0  k  i, (i) Jp;k (x0; u0; : : : ; uk?1; y0 ; : : : ; yk?1) has a minimum with respect to fx0 ; u0; u1; :::uk?1g. (ii) The fzk gik=0 can be chosen such that the value of Jp;k (x0 ; u0; : : :; uk?1 ; y0; : : : ; yk?1 ) at this minimum is positive, viz.,

min J (x ; u ; : : : ; uk?1 ; y0; : : : ; yk?1 ; z0; : : : ; zk ) > 0: fx0 ;u0 ;:::;uk?1 g p;k 0 0

20

3.5.1 A Krein Space State-Space Model Due to the fact that the summations in Jp;k go up to both k and k ? 1 (see (38)), it is slightly more dicult to come up with a Krein state-space model whose corresponding quadratic form is Jp;k . However, with some e ort, we see that the appropriate Krein state-space model is

8 > < 2j+1 > : z^j 8 > < 2j+2 > : yj

= 2j ; 0 = x0 = Lj 2j + v2j = Fj 2j +1 + Gj u 2j +1 = x2i+2 =

j

0

(40)

Hj 2j +1 + v2j +1

where 0 > 0, Q2j = 0, Q2j +1 = I , R2j = ? p2I , R2j +1 = I , and Sj = 0. To see why, let us construct the deterministic quadratic form corresponding to (40). Thus: J;2k

= 0 ?0 1  + = 0 ?0 1  + = 0 ?0 1  +

2k X j =0 kX ?1 j =0 kX ?1 j =0

1 uj Q? j uj

+

2k X

1 vj R? j vj

j =0 kX ?1

u2j +1 u2j +1 + u2j +1 u2j +1 +

j =0 kX ?1 j =0

1 v2j +1 R? 2j +1 v2j +1 +

k X j =0

jyj ? Hj 2j+1j2 ? p?2

1 v2j R? 2j v2j

k X j =0

jzj ? Lj 2j j2:

From (40) we see that 2j = 2j +1 = xj . Using this fact, and de ning u2j +1 = uj , we readily see that J;2k = Jp;k . Note also that the Riccati recursion for the model (40) is

8 > < 2j+1 = 2j ? 2j Lj (? p2I + Lj 2j Lj )?1Lj 2j > : 2j+2 = Fj 2j+1Fj + Gj Gj ? Fj 2j+1Hj(I + Hj 2j+1Hj)?1Hj 2j+1Fj

0 = 0 : (41)

3.5.2 Existence Conditions Using Lemma 12 from [1], the condition for a minimum is that Re;j and Rj should have the same inertia for all j = 0; 1; : : :; 2i (since each two time steps in (40) correspond to one time step in Jp;k ). Thus the condition for a minimum is

? p2I + Lj 2j Lj < 0 and 21

I + Hj 2j +1 Hj > 0:

(42)

The second of the above conditions is obvious since when we have a minimum j is positive de nite. If the



minimum is

Fj Gj



have full rank, then using Lemma 13 from [1], the condition for a

?2j1 ? p?2Lj Lj > 0 and ?2j1+1 + Hj Hj > 0

(43)

where once more the second of the above conditions is redundant. In order to connect with the results of Theorem 2, we may note that by de ning Pj = 2j and combining the coupled pair of Riccati recursions in (41), we can write the following Riccati recursion for Pj Pj +1

with

= Fj Pj Fj + Gj Gj ? Fj Pj



Lj Hj

3 2 2 2 ? I 0 7 6 Re;j = 6 5+4 4 p

Lj



2 16 R? e;j 4

Lj Hj

3  75 Pj L j

3 75 Pj Fj;

P0 =  0

(44)



Hj :

(45) 0 I Hj But this is the same Riccati as (9) in Theorem 2. Thus the condition for a minimum (43) becomes

Pj?1 ? p?2Lj Lj > 0;

(46)

which is the condition (13) of Theorem 2. To express the condition (42) in a form that is more similar to that of Lemma 3, we introduce the Krein space state-space model

8 > x = Fj xj + Gj uj > > 2 3 < 2 j+13  64 zjjj 75 = 64 Lj 75 xj + vj > > > : yj Hj

where 0 > 0, Qj = I , Sj = 0 and

(47)

3 2 2 6 ? p I 0 75 : Rj = 4 0

I

Note that the only di erence between the state-space models (23) and (47) is that the order of the output equations has been reversed. 22

We can now use the state-space model (47) to express the condition (42) in the form of the following Lemma.

Lemma 7 (Alternative Test for Existence) The condition (13) can be replaced by the condition that all leading submatrices of

3 2 3 2 3 2 2p I 0 2p I 0 Lj 7  ?

?

6 7 6 7 6 Rj = 4 5 + 4 5 Pj Lj 5 and Re;j = 4 Hj 0 I 0 I have the same inertia for all 0  j  i. In other words, ? p2I + Lj Pj Lj < 0 and

Hj



I + Hj P~j Hj > 0

where P~j?1 = Pj?1 ? p?2Lj Lj . We no longer require that



Fj Gj



have full rank, and the

size of the matrices involved is generally smaller than in (8).

Note that compared to Lemma 3, the condition in Lemma 7 is more stringent since it requires that all leading submatrices of Rj and Re;j have the same inertia. This distinction is especially important in square-root implementations of the H 1 lters [24].

Construction of the H 1 Apriori Filter To complete the proof of Theorem 2 we still need to show that if a minimum over fx0 ; u0; : : :; uk?1 g exists for all 0 

k

 i, then we can nd the estimates fzk gik=0 such that the value of

Jp;k (x0; u0; : : : ; uk?1 ; y0; : : : ; yk?1 )

at its minimum is positive.

According to Theorem 6 in [1], the minimum value of Jp;k (x0 ; u0; : : :; uk?1 ; y0; : : : ; yk?1 ) is

Pk?1 j =0



ez;j ey;j



2 3 1 6 ez;j 75 + e (? 2I + Lk Pk L )?1 ez;k = R? p e;j 4 z;k k ey;j

2 3 2 3 3?1 2 2   Pk?1 64 zj ? z^jjj?1 75 64 ? p I + Lj Pj Lj Lj Pj Hj 75 64 zj ? z^jjj?1 75 j =0 yj ? y^j jj ?1 Hj Pj Lj I + Hj Pj Hj yj ? y^j jj ?1 +(zk ? z^kjk?1 )(? p2I + Lk Pk Lk )?1 (zk ? z^kjk?1 ) > 0 where z^j jj ?1 and y^j jj ?1 are obtained from the Krein space projections of zj and yj onto n ?1 ?1 o, respectively. Thus z^ j ?1 ; fylgjl=0 L fzlgjl=0 j jj ?1 is a linear function of fyl gl=0 . Using the block 23

triangular factorization of the Re;j we may rewrite the above as k X

j =0

(zj ? z^j jj ?1 ) (? p2I + Lj Pj Lj )?1 (zj ? z^j jj ?1 ) +

kX ?1 j =0

(yj ? yj jj ?1 )(I + Hj P~j Hj)?1 (yj ? yj jj ?1 ) > 0(48)

where

32 3 2 3 2 I 0 7 6 zj ? z^j jj ?1 7 64 zj ? z^jjj?1 75 = 64 (49) 54 5 ?Hj Pj Lj (? p2I + Lj Pj Lj )?1 I yj ? y^j jj ?1 yj ? yj jj ?1 n ?1 o. Recall Note that yj jj ?1 is given by the Krein space projection of yj onto fzl gjl=0 ; fylgjl=0

from Lemma 7 that

? p2I + Lj Pj Lj < 0

;

I + Hj P~j Hj > 0:

Any choice of zj jj ?1 that renders (48) positive will do, and the simplest choice is zj jj ?1 = n ?1 ?1 o. ; fyj gjl=0 z^j jj ?1 = Lj x^j jj ?1 , where x^j jj ?1 is given by the Krein space projection of xj onto fzl gjl=0 We may now utilize the Krein space Kalman lter corresponding to the state-space model (47) to recursively compute x^j jj ?1 , viz., x ^j +1jj

= Fj x^j jj ?1 + Fj Pj



Lj Hj



2 3 z  ? L x ^ j j jj ?1 7 16 j R? 5 e;j 4 yj ? Hj x^j jj ?1

(50)

Setting zj ? Lj x^j jj ?1 = 0 and simplifying we get the desired recursion for x^j +1jj .

3.6 All H 1 Apriori Filters The positivity condition (48) gives a full parametrization of all H 1 apriori estimators. We thus have the following result.

Theorem 4 (All H 1 Apriori Estimators) All H 1 apriori estimators that achieve a level

p

(assuming they exist) are given by z^j

=

Lj x^j

1

+ ( p2I ? Lj Pj Lj ) 2

(51)

  Sj (I + Hj?1P~j?1 Hj?1)? 21 (yj?1 ? Hj?1xj?1); : : : ; (I + H0P~0H0)? 12 (y0 ? H0x0)

where xk

= x^k + Pk Lk (? p2I + Lk Lk )?1 (zk ? Lk x^k ); 24

(52)

x^j

satis es the recursion x^j +1jj

= Fj x^j jj ?1 + Fj Pj



Lj Hj



2 3 z  ? L x ^ j j jj ?1 7 16 j R? 5; e;j 4 yj ? Hj x^j jj ?1

(53)

with Pj , P~j and Re;j given by Theorem 2, and S is any (possibly nonlinear) contractive causal mapping.

Proof: Referring to (49), we see that the yjjj?1 = Hj xj di er from y^jjj?1 = Hj x^j , via the additional projection onto zj jj ?1 . Thus we can write xj jj ?1

= x^j jj ?1 + Pj Lj (? p2I + Lj Pj Lj )?1 (zj jj ?1 ? z^j jj ?1 );

which proves (52). Moreover from the proof of Theorem 2 (see (50)), the recursion for x^j is given by (53). Condition (48) can now be rewritten as k X

j =0

(zj jj ?1 ? z^j jj ?1 )(? p2I + Lj Pj Lj )?1 (zj jj ?1 ? z^j jj ?1 ) + kX ?1 j =0

(yj ? yj jj ?1 )(I + Hj P~j Hj)?1 (yj ? yj jj ?1 ) > 0

(54)

and an argument similar to the one given in the proof of Theorem 3 will yield the desired result.

3.7 The H 1 Smoother If instead of ef;k and ep;k , which correspond to the aposteriori and apriori lters, respectively, we consider the smoothed error es;k

= zkji ? Lk xk ;

ki

where zkji = Fs (y0; y1 ; : : :; yi ) is the estimate of zk given all observations fyj g from time 0 until time i, we are led to the so-called H 1 smoothers. Such estimators guarantee that the maximum

o

n

energy gain from the disturbances ?0 1=2(x0 ? x0 ); fuj gij =0 ; fvj gij =0 to the smoothing errors

fes;j gij=0 is bounded by s, i.e.,

Pi e e j =0 s;j s;j sup P P ? 1  x0 ;u2h2 ;v2h2 (x0 ? x0 ) 0 (x0 ? x0 ) + ij =0 uj uj + ij =0 vj vj 25

< s2:

(55)

Using an argument similar to the ones given before, we are led to the following quadratic form 1 x0? 0 x0 +

Js;i (x0; u0; : : : ; ui; y0 ; : : :; yi ) =

Pi

Pi   ?2 Pi (zkji ? Lk xk )(zkji ? Lk xk ): k=0 uk uk + k=0 (yk ? Hk xk ) (yk ? Hk xk ) ? k=0 (56)

Note that the only di erence between Js;i and Jf;i is that zkjk has been replaced by zkji (i.e., ltered estimates have been replaced by smoothed estimates). Once more it can be shown that an H 1 smoother of level s will exist if, and only if, there exists some zkji such that Js;i  0. The rather interesting result shown below, and which has already been pointed out

in the literature (see e.g., [4, 7, 16]), is that one H 1 smoother is given by the conventional H 2 smoother (which does not even depend on s ).

Theorem 5 (H 1 Smoother) For a given s > 0, an H 1 smoother that achieves level s exists, i the block diagonal matrix Re

where

= Re;0  Re;1  : : :  Re;i ;

2 I Re;j = 6 4

3 2

3

0 7 6 Hj 7  5 + 4 5 Pj Hj

Lj



0 ? s2 Lj and Pj is the same as in Theorem 1, has (i + 1)p positive eigenvalues and (i + 1)q negative eigenvalues. In other words, i



In [Re ] = (i + 1)p 0 (i + 1)q



:

If this is the case, one possible H1 smoother is given by the H 2 smoother.

Proof: The condition for Js;i(x0; u0; : : : ; ui; y0; : : : ; yi) to have a minimum is slightly di erent than the earlier cases, since we do not require that Js;k (x0; u0; : : : ; uk ; y0; : : : ; yk ) have a minimum over the disturbances for all past values k < i. Thus using Lemma 9 from the companion paper [1], the condition for a minimum over fx0 ; u0; : : :; ui g is that the matrices Re

and

R = R 0  R1  : : :  R i

26

have the same inertia, where Rj = Ip  (? s2Iq ). But this is precisely the inertia condition given in the statement of the Theorem. The value of Js;i at its minimum is (see [1])

 where we have de ned

y

2 6 zji 4 

Ry

Ryz

Rzy

Rz

3?1 2 75 64

y zji

3 75

3 2 32 3 75 = h64 y 75 ; 64 y 75i (57) zji zji Rzy Rz with 2 3 2 3 2 3 2 3 66 z0ji 77 66 y0 77 66 z0ji 77 66 y0 77 y = 666 ... 777 ; zji = 666 ... 777 ; y = 666 ... 777 ; zji = 666 ... 777 ; 4 5 4 5 4 5 4 5 ziji yi ziji yi and where the fyj g and fzj ji g satisfy the Krein state-space model (23). In this case all the entries in zji are unknown and there is no causal dependence between the fzj jk g and the fyj g. 2 64

Ry

Ryz

Using a block triangular factorization, or a completion of squares argument, the value at the minimum can be rewritten as ?1 z ? Rzy R?1 y ): ?1 1  1 zji ? Rzy R? y  R? ji y y y ) (Rz ? Rzy Ry Ryz) ( y y + (

But Ry

>

0 (since it is the covariance of a Hilbert space state-space model), and hence one

possible choice of zji to guarantee Js;i > 0 is to choose zji = Rzy R?y 1 y = z^ji , which is clearly the H 2 smoothed estimate of z.

The following result is now straightforward.

Theorem 6 (All H 1 Smoothers) All H 1 smoothers that achieve a level s (assuming they exist) are given by zji

= z^ji + (Rz ? Rzy R?y 1 Ryz)1=2S (R?y 1=2y )

where S is any (not necessarily causal) contractive mapping, z^ji is the usual estimate, and Ry and Rz ? Rzy R?y 1 yRyz are de ned in (57).

27

(58) H2

smoothed

It is clear from the discussions so far in the paper that the Krein space estimation formalism provide simple derivations of H 1 estimators. These estimators turn out to be certain Krein space Kalman lters, and show that Krein space estimation yields a uni ed approach to H 2 and H 1 problems.

To derive such lters, and to solve other related problems as discussed ahead, all

one essentially needs is to identify an inde nite quadratic form, and to construct a convenient auxiliary state-space model with the appropriate Gramians. Two further applications of this approach are discussed next.

4 Risk-Sensitive Estimation Filters The so-called risk-sensitive (or exponential cost) criterion was introduced in [13] and further studied in [14, 15, 16]. Glover and Doyle [21] noticed their close connection to the H 1 lters discussd earlier. We shall make this connection in a di erent way by bringing in an appropriate quadratic form.

4.1 The Exponential Cost Function We again start with a state-space model of the form

8 > < xj+1 = > : yj =

Fj xj + Gj uj ; Hj xj + vj

j

0

(59)

However, we now assume that x0, fuj g; and fvj g are independent zero mean Gaussian random variables with covariances 0 , Qi ; and Ri ; respectively. We further assume that the fuj g and fvj g are white-noise processes. Conventional H 2 estimators, such as the Kalman lter, estimate the quantity zi = Li xi from the observations fyj g by performing the following minimization (see e.g., [22, 23, 1]) min E [Ci ]

fzjjl g

(60)

P where Ci = ij =0 (zj jl ?Lj xj ) (zj jl ?Lj xj ), zj jl denotes the estimate of zj given the observations up to and including time l, and E [] denotes expectation. As we have seen earlier, l = j , l= j?1

and l = i correspond to the aposteriori, apriori and smoothed estimation problems, 28

respectively. Moreover, the expectation is taken over the Gaussian random variables x0 and

fuig whose joint conditional distribution is given by   1 p(x0 ; u0; : : :; ui jy0; : : : ; yi ) / exp ? Ji (x0; u0; : : : ; ui; y0 ; : : :; yi ) 2

(61)

where the symbol / stands for \proportional to" and Ji (x0; u0; : : : ; ui; y0; : : : ; yi ) is equal to

(using the fact that x0, fuj g; and fvj g are independent, and that vj = yj ? Hj xj ) 1 x0? 0 x0 +

i X j =0

1 uj Q? j uj

+

Xi j =0

(yj ? Hj xj )R?j 1 (yj ? Hj xj ):

(62)

In the terminology of [15], the lter that minimizes (60) is known as a risk-neutral lter. An alternative criterion that is risk-sensitive has been extensively studied in [13, 14, 15, 16] and corresponds to the minimization problem



min  () = min ? 2 log f g i f g  zj

jl

zj

jl



Eexp(?



C) 2 i



(63)

The criterion in (63) is known as an exponential cost criterion, and any lter that minimizes i ()

is referred to as a risk-sensitive lter. The scalar parameter  is correspondingly called

the risk-sensitivity parameter. Some intuition concerning the nature of this modi ed criterion is obtained by expanding i () in terms of  and writing,  i () = E (Ci) ? V ar(Ci) + O(2 )

4

The above equation shows that for  = 0, we have the risk-neutral case (i.e., conventional H 2 estimation). When  > 0, we seek to maximize Eexp(? 2 Ci), which is convex and decreasing

in Ci . Such a criterion is termed risk-seeking (or optimistic) since larger weights are on small

values of Ci , and hence we are more concerned with the frequent occurrence of moderate values of Ci than with the occasional occurrence of large values. When  < 0, we seek to minimize Eexp(? 2 Ci ), which

is convex and increasing in Ci . Such a criterion is termed risk-averse (or

pessimistic) since large weights are on large values of Ci , and hence we are more concerned with the occasional occurence of large values than with the frequent occurence of moderate ones. In what follows, we shall see that in the risk-averse case minimizing (63) makes sense is the optimal H 1 criterion. 29


0 :

maxfzjjl g

 (ii)  < 0 :

minfzjjl g

R exp h?  C ? 1 J (x ; u ; : : : ; u ; y ; : : :; y )i dx du : : : du : i 0 i 0 0 i 2 i 2 i 0 0 R exp h?  C ? 1 J (x ; u ; : : : ; u ; y ; : : : ; y )i dx du : : : du : 2 i

2 i 0 0

i 0

0

i

0

i

This suggests that we define the second-order scalar form

$$\bar{J}_i(x_0, u_0, \ldots, u_i, y_0, \ldots, y_i) = J_i(x_0, u_0, \ldots, u_i, y_0, \ldots, y_i) + \theta C_i = x_0^* \Pi_0^{-1} x_0 + \sum_{j=0}^i u_j^* Q_j^{-1} u_j + \sum_{j=0}^i (y_j - H_j x_j)^* R_j^{-1}(y_j - H_j x_j) + \theta \sum_{j=0}^i (\breve{z}_{j|l} - L_j x_j)^*(\breve{z}_{j|l} - L_j x_j).$$

Before proceeding with the extremizations in (i) and (ii), we need to ensure that the integrals in (i) and (ii) are finite. The condition is given by the following lemma, which is easy to prove.

Lemma 8 (Finiteness Condition) The integral

$$\int \exp\left[ -\frac{1}{2} \bar{J}_i(x_0, u_0, \ldots, u_i, y_0, \ldots, y_i) \right] dx_0 \, du_0 \cdots du_i$$

is finite if, and only if, $\bar{J}_i(x_0, u_0, \ldots, u_i, y_0, \ldots, y_i)$ has a minimum over $\{x_0, u_0, \ldots, u_i\}$. In that case it is proportional to

$$\exp\left[ -\frac{1}{2} \min_{x_0, u} \bar{J}_i(x_0, u_0, \ldots, u_i, y_0, \ldots, y_i) \right].$$

The above lemma thus reduces the risk-sensitive problem to one of finding the minimum of a second-order scalar form. More precisely, the criterion becomes:

(i) $\theta > 0$: $\displaystyle \min_{\{\breve{z}_{j|l}\}} \min_{x_0, u} \bar{J}_i(x_0, u_0, \ldots, u_i, y_0, \ldots, y_i)$.

(ii) $\theta < 0$: $\displaystyle \max_{\{\breve{z}_{j|l}\}} \min_{x_0, u} \bar{J}_i(x_0, u_0, \ldots, u_i, y_0, \ldots, y_i)$.

Note that the second of the above problems is a quadratic game problem [12]. Though we shall not consider quadratic games here, it is also possible to solve them using the approach given in this and the companion paper. In order to solve the above problems, we can introduce the following auxiliary Krein state-space model that corresponds to the (possibly indefinite) quadratic form $\bar{J}_i(x_0, u_0, \ldots, u_i, y_0, \ldots, y_i)$:

$$\mathbf{x}_{j+1} = F_j \mathbf{x}_j + G_j \mathbf{u}_j, \qquad \begin{bmatrix} \mathbf{y}_i \\ \mathbf{z}_{i|l} \end{bmatrix} = \begin{bmatrix} H_i \\ L_i \end{bmatrix} \mathbf{x}_i + \mathbf{v}_i, \qquad j \ge 0 \qquad (64)$$

with

$$\left\langle \begin{bmatrix} \mathbf{x}_0 \\ \mathbf{u}_j \\ \mathbf{v}_j \end{bmatrix}, \begin{bmatrix} \mathbf{x}_0 \\ \mathbf{u}_k \\ \mathbf{v}_k \end{bmatrix} \right\rangle = \begin{bmatrix} \Pi_0 & 0 & 0 \\ 0 & Q_i \delta_{ij} & 0 \\ 0 & 0 & \begin{bmatrix} R_j & 0 \\ 0 & \theta^{-1} I \end{bmatrix} \delta_{jk} \end{bmatrix}. \qquad (65)$$

We can now readily use the state-space model (64), and the results of the companion paper [1], to check for the condition of a minimum over $\{x_0, u_0, \ldots, u_i\}$, and to compute the value at the minimum. Then the resulting quadratic form can be further extremized via a Krein space projection. The details will not be repeated here, since they are the same as those given in the derivation of the $H^\infty$ filters. We shall just note that the quadratic form $\bar{J}_i(x_0, u_0, \ldots, u_i, y_0, \ldots, y_i)$ is exactly the same as the quadratic forms $J_{f,i}(x_0, u_0, \ldots, u_i, y_0, \ldots, y_i)$, $J_{p,i}(x_0, u_0, \ldots, u_i, y_0, \ldots, y_i)$ and $J_{s,i}(x_0, u_0, \ldots, u_i, y_0, \ldots, y_i)$ (with $Q_j = I$ and $R_j = I$) when we choose $\theta = -\gamma_f^{-2}$, $\theta = -\gamma_p^{-2}$ and $\theta = -\gamma_s^{-2}$, and when the estimate is chosen as a filtered, predicted and smoothed estimate, respectively. Therefore the derivations of the risk-sensitive filters follow exactly the same lines as the derivations of the $H^\infty$ filters discussed earlier. We thus have the following results.

Theorem 7 (A Posteriori Risk-Sensitive Filter) For a given $\theta > 0$, the risk-sensitive estimation problem always has a solution. For a given $\theta < 0$, a solution exists if, and only if,

$$\begin{bmatrix} R_j & 0 \\ 0 & \theta^{-1} I \end{bmatrix} \quad \text{and} \quad R_{e,j} = \begin{bmatrix} R_j & 0 \\ 0 & \theta^{-1} I \end{bmatrix} + \begin{bmatrix} H_j \\ L_j \end{bmatrix} P_j \begin{bmatrix} H_j^* & L_j^* \end{bmatrix}$$

have the same inertia for all $j = 0, 1, \ldots, i$, where $P_0 = \Pi_0$ and

$$P_{j+1} = F_j P_j F_j^* + G_j Q_j G_j^* - F_j P_j \begin{bmatrix} H_j^* & L_j^* \end{bmatrix} R_{e,j}^{-1} \begin{bmatrix} H_j \\ L_j \end{bmatrix} P_j F_j^*.$$

In both cases the optimal risk-sensitive filter with parameter $\theta$ is given by $\breve{z}_{i|i} = L_i \hat{x}_{i|i}$,

$$\hat{x}_{i+1|i+1} = F_i \hat{x}_{i|i} + K_{s,i+1}(y_{i+1} - H_{i+1} F_i \hat{x}_{i|i}), \qquad \hat{x}_{-1|-1} = 0,$$

and

$$K_{s,i+1} = P_{i+1} H_{i+1}^* (I + H_{i+1} P_{i+1} H_{i+1}^*)^{-1}.$$

Proof: The proof is exactly that of Theorem 1. We only note here that for $\theta > 0$ a solution always exists, since

$$\begin{bmatrix} R_i & 0 \\ 0 & \theta^{-1} I \end{bmatrix} > 0$$

and the state-space model reduces to the usual Hilbert-space setting.

Theorem 8 (A Priori Risk-Sensitive Filter) For a given $\theta > 0$, the a priori risk-sensitive estimation problem always has a solution. For a given $\theta < 0$, a solution exists if, and only if, all leading submatrices of

$$\begin{bmatrix} \theta^{-1} I & 0 \\ 0 & R_j \end{bmatrix} \quad \text{and} \quad R_{e,j} = \begin{bmatrix} \theta^{-1} I & 0 \\ 0 & R_j \end{bmatrix} + \begin{bmatrix} L_j \\ H_j \end{bmatrix} P_j \begin{bmatrix} L_j^* & H_j^* \end{bmatrix}$$

have the same inertia for all $j = 0, 1, \ldots, i$, where $P_j$ is the same as in the a posteriori case. In both cases the a priori risk-sensitive filter with parameter $\theta$ is given by $\breve{z}_{i|i-1} = L_i \hat{x}_{i|i-1}$,

$$\hat{x}_{i+1|i} = F_i \hat{x}_{i|i-1} + K_{a,i}(y_i - H_i \hat{x}_{i|i-1}), \qquad \hat{x}_{0|-1} = 0,$$

where

$$K_{a,i} = F_i \tilde{P}_i H_i^* (I + H_i \tilde{P}_i H_i^*)^{-1}$$

(in analogy with Theorem 2, $\tilde{P}_i^{-1} = P_i^{-1} + \theta L_i^* L_i$).

Theorem 9 (Risk-Sensitive Smoother) For a given $\theta > 0$, the risk-sensitive smoothing problem always has a solution. For a given $\theta < 0$, a solution exists if, and only if, the block diagonal matrix

$$R_e = R_{e,0} \oplus R_{e,1} \oplus \cdots \oplus R_{e,i},$$

where

$$R_{e,j} = \begin{bmatrix} R_j & 0 \\ 0 & \theta^{-1} I \end{bmatrix} + \begin{bmatrix} H_j \\ L_j \end{bmatrix} P_j \begin{bmatrix} H_j^* & L_j^* \end{bmatrix}$$

and $P_j$ is the same as in the a posteriori case, has $(i+1)p$ positive eigenvalues and $(i+1)q$ negative eigenvalues. In other words, if and only if

$$\mathrm{In}[R_e] = \left( (i+1)p, \; 0, \; (i+1)q \right).$$

In both cases the risk-sensitive smoother is the $H^2$ smoother.
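The correspondence $\theta = -\gamma^{-2}$, which underlies the resemblances discussed next, can be checked numerically. The following small script is our own illustration (arbitrary matrices, $Q_j = I$, $R_j = I$); it confirms that the risk-sensitive Riccati recursion of Theorem 7 coincides with the $H^\infty$ recursion (9):

```python
# Numerical check (ours) of theta = -1/gamma^2: with Q_j = I, R_j = I, the
# risk-sensitive Riccati of Theorem 7 equals the H-infinity Riccati (9), so the
# two central filters coincide.
import numpy as np

F = np.array([[0.95, 0.2], [0.0, 0.9]]); G = np.eye(2)
H = np.array([[1.0, 0.0]]); L = np.array([[0.0, 1.0]])
gamma = 3.0; theta = -1.0 / gamma**2
P_hinf, P_rs = np.eye(2), np.eye(2)

def step(P, weight):
    # common Riccati step with output weight diag(I, weight)
    C = np.vstack([H, L])
    Re = np.diag([1.0, weight]) + C @ P @ C.T
    return F @ P @ F.T + G @ G.T - F @ P @ C.T @ np.linalg.inv(Re) @ C @ P @ F.T

for _ in range(10):
    P_hinf = step(P_hinf, -gamma**2)     # weight -gamma^2 as in (9)
    P_rs = step(P_rs, 1.0 / theta)       # weight theta^{-1} as in Theorem 7
print(np.allclose(P_hinf, P_rs))         # True: identical Riccati variables
```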

We can now state the striking resemblances between the $H^\infty$ and the risk-sensitive filters. The $H^\infty$ filters obtained earlier are essentially risk-sensitive filters with parameter $\theta = -\gamma^{-2}$. Note, however, that at each level $\gamma$ the $H^\infty$ filters are not unique, whereas for each $\theta$ the risk-sensitive filters are unique. Also, the risk-sensitive filters generalize to the $\theta > 0$ case. It is also noteworthy that the optimal $H^\infty$ filter corresponds to the risk-sensitive filter with $\theta = -\gamma_o^{-2}$, and that this is precisely the value of $\theta$ for which the minimizing property of $\bar{J}_i$ breaks down and $\mu_i(\theta)$ becomes infinite. This relationship between the optimal $H^\infty$ filter and the corresponding risk-sensitive filter was first noted in [21].

5 Finite Memory Adaptive Filtering

We now consider an application of the Krein space Kalman filter to the problem of finite memory (or sliding window) adaptive filtering. It has recently been shown [20] that a unified derivation of adaptive filtering algorithms, and of their corresponding fast versions, can be obtained by properly recasting the adaptive problem as a standard state-space estimation problem. We now verify that if we further allow the elements of the state-space model to belong to a Krein space, then the so-called sliding window problem can also be handled within the same framework. In fact, this framework also allows us to easily consider more general sliding patterns with windows of varying lengths, as explained ahead. Moreover, we shall obtain a physical interpretation of innovations with negative Gramian, and see that they correspond to the loss of information.

5.1 The Standard Problem

The finite memory adaptive filtering problem can be formulated as follows: given the input-output pairs $\{h_j, d_j\}$, where $h_j \in \mathbb{C}^{1 \times n}$ is a known input vector and $d_j \in \mathbb{C}$ is a known output scalar, recursively determine estimates of an unknown weight vector $w \in \mathbb{C}^n$ such that the scalar second-order form

$$J_i(w, d_{i-l_i+1}, \ldots, d_i, h_{i-l_i+1}, \ldots, h_i) = w^* \Pi_0^{-1} w + \sum_{j=i-l_i+1}^{i} (d_j - h_j w)^*(d_j - h_j w), \qquad (66)$$

where $\Pi_0 > 0$, is minimized for each $i$. Since $J_i$ is a function of the pairs $\{h_j, d_j\}_{j=i-l_i+1}^i$, at each time instant $i$ we are interested in determining the estimate of $w$ using only the data given over an interval of length $l_i$. The quantity $l_i \ge 0$ is therefore referred to as the (memory) length of the sliding window. Note that we allow for a time-variant window length. To clarify this point, consider the example of Fig. 2, where at time $i$ we have a window length of $l_i = l$. At the next time instant we add the data point $\{h_{i+1}, d_{i+1}\}$, so that the window length changes to $l_{i+1} = l+1$. At time $i+2$ we add the data point $\{h_{i+2}, d_{i+2}\}$ and drop the data point $\{h_{i-l}, d_{i-l}\}$, so that the window length remains $l_{i+2} = l+1$. In a similar fashion, more general sliding window patterns can be considered as well. To recast expression (66) into the usual quadratic form considered in this paper, the lower index of the summation term needs to start at the fixed time 0. For this purpose, we rewrite $J_i$

as follows:
$$J_i(w, d_{i-l_i+1}, \ldots, d_i, h_{i-l_i+1}, \ldots, h_i) = w^* \Pi_0^{-1} w + \sum_{j=i-l_i+1}^{i} (d_j - h_j w)^* (d_j - h_j w) + \sum_{j=0}^{i-l_i} (d_j - h_j w)^* (d_j - h_j w) - \sum_{j=0}^{i-l_i} (d_j - h_j w)^* (d_j - h_j w), \tag{67}$$
where we have added and subtracted identical terms.
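Although the cancellation behind (67) is elementary, a short numerical check makes it concrete; the dimensions and data below are arbitrary placeholders added here for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, i, li = 3, 9, 4                       # weight dim, current time, window length
h = rng.standard_normal((i + 1, n))      # rows are the input vectors h_j
d = rng.standard_normal(i + 1)           # output scalars d_j
w = rng.standard_normal(n)
Pi0_inv = np.eye(n)                      # placeholder Pi_0^{-1}

err = d - h @ w                          # residuals d_j - h_j w
J_window = w @ Pi0_inv @ w + np.sum(err[i - li + 1:] ** 2)
J_rewrit = w @ Pi0_inv @ w + np.sum(err ** 2) - np.sum(err[: i - li + 1] ** 2)
assert np.isclose(J_window, J_rewrit)    # added and subtracted terms cancel
```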

[Figure 2 here: three snapshots of the sliding window, at times $i$, $i+1$, and $i+2$, covering the data points $(h_{i-l}, d_{i-l})$ through $(h_{i+2}, d_{i+2})$ and showing window lengths $l_i = l$, $l_{i+1} = l+1$, and $l_{i+2} = l+1$.]

Figure 2: Sliding window with varying window length.

We now invoke a change of variables and replace the time index $i$ by another time index $k$, which allows us to replace $J_i$ by a $\bar J_k$. The new index $k$ is incremented whenever a new data point is added (i.e., whenever $i$ is incremented), and it is incremented again whenever a data point is discarded from the window. Thus if at time $i$ the length of the window is $l_i$, then the index $k$ will run from 0 to $2i - l_i + 1$ (since there will have been $i + 1$ data points added and $i - l_i + 1$ data points removed). To be more specific, the change of variables is as follows.

(a) At each time $i$, since the data point $\{h_i, d_i\}$ is added, we increment the index $k$ and define
$$\bar d_k = d_i, \qquad \bar h_k = h_i \qquad \mbox{and} \qquad \bar R_k = 1. \tag{68}$$

(b) If at time $i$ the data point $\{h_{i-l_i}, d_{i-l_i}\}$ is removed, we increment the index $k$ once more and define

$$\bar d_k = d_{i-l_i}, \qquad \bar h_k = h_{i-l_i} \qquad \mbox{and} \qquad \bar R_k = -1. \tag{69}$$
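The bookkeeping in (68)-(69) is easy to mechanize; the following sketch (with placeholder data, and a window that first grows and then slides) generates the re-indexed sequence $\{\bar d_k, \bar h_k, \bar R_k\}$:

```python
def reindex(data, drops):
    """Map the time series {(h_i, d_i)} plus a schedule of drop times into
    the sequence {(dbar_k, hbar_k, Rbar_k)} of (68)-(69): each arrival
    appends (d_i, h_i, +1); each discarded point appends (d_j, h_j, -1)."""
    out = []
    for i, (h_i, d_i) in enumerate(data):
        out.append((d_i, h_i, +1))       # (68): data point added at time i
        for j in drops.get(i, []):       # (69): point(s) dropped at time i
            h_j, d_j = data[j]
            out.append((d_j, h_j, -1))
    return out

# Placeholder data; the window grows to length 3, then slides from time 3 on.
data = [((1.0, 0.0), 0.5), ((0.0, 1.0), -0.2),
        ((1.0, 1.0), 0.3), ((0.5, -0.5), 0.1)]
drops = {3: [0]}                         # at time 3, discard the time-0 point
seq = reindex(data, drops)               # k runs over len(seq) = 5 steps
```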

With this convention we may write the quadratic form $J_i(w, d_{i-l_i+1}, \ldots, d_i, h_{i-l_i+1}, \ldots, h_i)$ as
$$J_i = \bar J_k(w, \bar d_0, \ldots, \bar d_k, \bar h_0, \ldots, \bar h_k) = w^* \Pi_0^{-1} w + \sum_{j=0}^{k} (\bar d_j - \bar h_j w)^* \bar R_j^{-1} (\bar d_j - \bar h_j w), \tag{70}$$

which is of the form that we have been considering in this and the companion paper [1]. Note that the quadratic form $\bar J_k(w, \bar d_0, \ldots, \bar d_k, \bar h_0, \ldots, \bar h_k)$ is indefinite, since whenever a data point is dropped we have $\bar R_k = -1$. We can therefore use Krein space methods to solve the problem.

Using the same approach that we have used so far, we now construct the partially equivalent state-space model for the indefinite quadratic form $\bar J_k$:
$$\begin{cases} x_{j+1} = x_j, \qquad x_0 = w, \qquad 0 \leq j \leq k \\ \bar d_j = \bar h_j x_j + v_j \end{cases} \tag{71}$$
with $\Pi_0 > 0$, $Q_j = 0$, $S_j = 0$ and $\bar R_j$ as in (68) and (69). We can now state the following result.

Theorem 10 (Finite Memory Adaptive Filter) The finite memory adaptive filter is given by the following recursions.

(a) For updating the data point $\{h_i, d_i\}$ at time $i$ we have
$$\hat w_{i:i-l_{i-1}} = \hat w_{i-1:i-l_{i-1}} + K_{p,k}\,(d_i - h_i \hat w_{i-1:i-l_{i-1}}), \tag{72}$$
where $\hat w_{i:j}$ is the estimate when the sliding window encompasses all the data from time $j$ to time $i$, and
$$K_{p,k} = P_k h_i^* R_{e,k}^{-1}, \qquad R_{e,k} = 1 + h_i P_k h_i^*, \tag{73}$$
and where $P_k$ satisfies the recursion
$$P_{k+1} = P_k - K_{p,k} R_{e,k} K_{p,k}^*, \qquad P_0 = \Pi_0. \tag{74}$$

(b) For downdating the data point $\{h_{i-l_i}, d_{i-l_i}\}$ at time $i$ we have
$$\hat w_{i:i-l_i+1} = \hat w_{i:i-l_i} + K_{p,k}\,(d_{i-l_i} - h_{i-l_i} \hat w_{i:i-l_i}), \tag{75}$$
where
$$K_{p,k} = P_k h_{i-l_i}^* R_{e,k}^{-1}, \qquad R_{e,k} = -1 + h_{i-l_i} P_k h_{i-l_i}^*, \tag{76}$$
and $P_k$ satisfies the recursion
$$P_{k+1} = P_k - K_{p,k} R_{e,k} K_{p,k}^*. \tag{77}$$

Moreover, the above solutions for $\hat w_{i:j}$ always correspond to a minimum, and in particular
$$R_{e,k} = 1 + h_i P_k h_i^* > 0 \tag{78}$$
when we are updating, and
$$R_{e,k} = -1 + h_{i-l_i} P_k h_{i-l_i}^* < 0 \tag{79}$$
when we are downdating.

Proof: The solutions given by (a) and (b) in the above Theorem are simply the Krein space Kalman lter recursions for the state-space model (71) which we know computes the stationary point of Jk over w. However, this stationary point is always a minimum since @ 2Jk @w2

i X

2

= @@wJ2i = ?0 1 +

(recall that ?0 1 > 0).

j =i?li +1

hj hj > 0;

Using Lemma 12 of the companion paper [1], having a minimum means that $R_{e,k}$ and $\bar R_k$ have the same inertia for all $k$. The statements (78) and (79) then readily follow.

The fact that $R_{e,k} < 0$ whenever we drop data has an interesting interpretation. Consider the equation
$$P_{k+1} = P_k - K_{p,k} R_{e,k} K_{p,k}^*.$$
If we drop data at step $k$, we would expect $P_{k+1}$ to become larger (more positive definite) than $P_k$. This can only happen if $R_{e,k} < 0$. Thus we may infer that innovations with negative Gramian correspond to a loss of information.
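A compact implementation of the recursions of Theorem 10, with the sign checks (78)-(79) asserted at every step, is sketched below; the data, the fixed window length, and the choice $\Pi_0 = I$ are placeholders for illustration:

```python
import numpy as np

def fm_step(w, P, h, d, sign):
    """One step of the Theorem 10 recursions: sign=+1 updates with the
    point (h, d) as in (72)-(74); sign=-1 downdates it as in (75)-(77)."""
    h = h.reshape(1, -1)
    Re = sign + (h @ P @ h.T).item()        # R_{e,k} = +-1 + h P h*
    assert (Re > 0) if sign > 0 else (Re < 0)   # checks (78) and (79)
    Kp = (P @ h.T) / Re                     # K_{p,k} = P_k h* R_{e,k}^{-1}
    w = w + (Kp * (d - (h @ w).item())).ravel()
    P = P - Kp @ Kp.T * Re                  # P_{k+1} = P_k - K R_{e,k} K*
    return w, P

rng = np.random.default_rng(1)
n = 2
w, P = np.zeros(n), np.eye(n)               # P_0 = Pi_0 (placeholder Pi_0 = I)
window = []
for _ in range(6):
    h, d = rng.standard_normal(n), rng.standard_normal()
    window.append((h, d))
    w, P = fm_step(w, P, h, d, +1)          # add the new point
    if len(window) > 3:                     # keep a window of length 3
        h_old, d_old = window.pop(0)
        w, P = fm_step(w, P, h_old, d_old, -1)  # drop the oldest point
```

Note that the downdating assertion never fires: as the proof shows, $R_{e,k} < 0$ is automatic whenever the removed point still lies inside the current window, since the windowed quadratic form retains a minimum.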

If we drop data at step k we would expect Pk+1 to get larger (more positive-de nite) than Pk . This can only happen if Re;k < 0. Thus, we may infer that innovations with negative Gramian correspond to a loss of information. The above discussion puts the problem of nite memory adaptive ltering into the same state-space estimation framework as conventional adaptive ltering techniques (see [20]). Therefore the various algorithmic extensions discussed there may be applied to nite memory problems, albeit that we now need to consider a Krein space. We shall not give the details here, but shall just mention that when the elements of the input vectors fhi g form a time sequence, viz., hi

=



ui ui?1 : : : ui?n+1

37



;

and when the window length is constant, i.e. , li  l, then the state-space model (71) is periodic with period T = 2, and we may speed up the the estimation algorithm by a so-called Chandrasekhar-type recursion. Similar results, obtained via a di erent approach, have been reported in [26].

6 Conclusion

Certain studies in least-squares estimation, adaptive filtering and $H^\infty$ filtering motivated us to develop a theory for linear estimation in certain indefinite metric spaces, called Krein spaces. The main differences from the conventional Hilbert space framework for Kalman filtering and LQG control are that projections in Krein spaces may not necessarily exist or be unique, and that quadratic forms may have stationary points that are not necessarily extreme points (i.e., minima or maxima). We showed that these simple but fundamental differences explain both the unexpected similarities and the differences between the well-known Kalman filter solution for stochastic state-space systems and the solution for the completely nonstochastic $H^\infty$ filtering problem.

The main points are the following. There are many problems whose solution can be reduced to the recursive minimization of some indefinite quadratic form. A stationary point of the quadratic form, when it exists, can be computed by setting up a (partially equivalent) problem of projecting a vector in a Krein space onto a certain subspace. The advantage is that when there is state-space structure, this projection can be computed recursively by using the innovations approach to derive a Krein space Kalman filter. The equivalence is only partial because the Krein space projection only defines the stationary point of the quadratic form; further conditions need to be checked to determine whether this point is also a minimum. It turns out that this checking can also be done recursively, using quantities arising in the Kalman filtering algorithms.

Apart from quite straightforward derivations of known results in $H^2$, $H^\infty$ and risk-sensitive estimation and control, the above approach allows us to extend to the $H^\infty$ setting some of the huge body of results and insights developed over the last three decades in the field of Kalman filtering (and LQG control). A first bonus is the derivation (see [24]) of square-root and (fast) Chandrasekhar algorithms for $H^\infty$ estimation and control, a possibility that is much less obvious in current approaches. These square-root algorithms, which are now increasingly standard in $H^2$ Kalman filtering, have two advantages over the earlier $H^\infty$ algorithms: they eliminate the need for explicitly checking the existence conditions of the filters, and they have various potential numerical and implementational advantages.

Application of the Krein space formulation to adaptive filtering arises from the approach in [20], where it was shown how to recast adaptive filtering problems as state-space estimation problems. If we further allow the elements of the state-space model to belong to a Krein space, then we can solve finite memory and $H^\infty$ adaptive filtering problems. In the finite memory case, this allows us to consider general sliding patterns with windows of varying lengths. In the $H^\infty$ adaptive case, it has allowed us to establish that the famed LMS (or stochastic gradient) algorithm is an optimal $H^\infty$ filter [25].

We also remark that, although not pursued here, it is possible to construct dual (rather than partially equivalent) Krein state-space models (via the concept of a dual basis), which can be used to extend the methods of this paper to the solution of $H^2$ and $H^\infty$ control problems. We may finally remark that a major motivation for the Krein space formulation is that it provides geometric insight into various estimation and control problems. Such geometric insights are also useful elsewhere; for example, they can be used to provide a stochastic interpretation and a geometric proof of the KYP (Kalman-Yakubovich-Popov) Lemma [27]-[29].

Acknowledgement

The authors would like to thank Professors P. P. Khargonekar and D. J. N. Limebeer for helpful discussions during the preparation of this manuscript. Seminars by Dr. P. Park on the KYP Lemma were also helpful in leading us to begin this research.


References

[1] B. Hassibi, A. H. Sayed, and T. Kailath. Linear estimation in Krein spaces - Part I: Theory. This issue.

[2] H. Kwakernaak. A polynomial approach to minimax frequency domain optimization of multivariable feedback systems. Int. J. of Control, 44:117-156, 1986.

[3] J. C. Doyle, K. Glover, P. Khargonekar, and B. Francis. State-space solutions to standard $H^2$ and $H^\infty$ control problems. IEEE Transactions on Automatic Control, 34(8):831-847, August 1989.

[4] P. P. Khargonekar and K. M. Nagpal. Filtering and smoothing in an $H^\infty$ setting. IEEE Transactions on Automatic Control, AC-36:151-166, 1991.

[5] T. Basar. Optimum performance levels for minimax filters, predictors and smoothers. Systems and Control Letters, 16:309-317, 1991.

[6] D. J. Limebeer and U. Shaked. New results in $H^\infty$ filtering. In Proc. Int. Symp. on MTNS, pages 317-322, June 1991.

[7] U. Shaked and Y. Theodor. $H^\infty$-optimal estimation: A tutorial. In Proc. IEEE Conference on Decision and Control, pages 2278-2286, Tucson, AZ, December 1992.

[8] I. Yaesh and U. Shaked. $H^\infty$-optimal estimation - the discrete time case. In Proc. Int. Symp. on MTNS, pages 261-267, June 1991.

[9] M. J. Grimble. Polynomial matrix solution of the $H^\infty$ filtering problem and the relationship to Riccati equation state-space results. IEEE Transactions on Signal Processing, 41(1):67-81, January 1993.

[10] K. Glover and D. Mustafa. Derivation of the maximum entropy $H^\infty$ controller and a state space formula for its entropy. Int. J. Control, 50:899-916, 1989.

[11] M. Green and D. J. N. Limebeer. Linear Robust Control. Prentice Hall, Englewood Cliffs, NJ, 1995.

[12] T. Basar and P. Bernhard. $H^\infty$-Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach. Birkhauser, Boston, 1991.

[13] D. H. Jacobson. Optimal stochastic linear systems with exponential performance criteria and their relation to deterministic games. IEEE Transactions on Automatic Control, 18(2), April 1973.

[14] J. Speyer, J. Deyst, and D. H. Jacobson. Optimization of stochastic linear systems with additive measurement and process noise using exponential performance criteria. IEEE Transactions on Automatic Control, 19:358-366, August 1974.

[15] P. Whittle. Risk Sensitive Optimal Control. John Wiley and Sons, New York, 1990.

[16] J. L. Speyer, C. Fan, and R. N. Banavar. Optimal stochastic estimation with exponential cost criteria. In Proc. IEEE Conference on Decision and Control, pages 2293-2298, Tucson, AZ, December 1992.

[17] N. Young. An Introduction to Hilbert Space. Cambridge University Press, Cambridge, 1988.

[18] B. A. Francis. A Course in $H^\infty$ Control Theory. Springer-Verlag, New York, 1987.

[19] P. Henrici. Applied and Computational Complex Analysis, volume 1. Wiley, New York, 1974.

[20] A. H. Sayed and T. Kailath. A state-space approach to adaptive RLS filtering. IEEE Signal Processing Magazine, July 1994, pp. 18-60.

[21] K. Glover and J. C. Doyle. State-space formulae for all stabilizing controllers that satisfy an $H^\infty$-norm bound and relations to risk sensitivity. Systems and Control Letters, 11:167-172, 1988.

[22] A. H. Jazwinski. Stochastic Processes and Filtering Theory, volume 64 of Mathematics in Science and Engineering. Academic Press, New York, 1970.

[23] B. D. O. Anderson and J. B. Moore. Optimal Filtering. Prentice-Hall, Englewood Cliffs, NJ, 1979.

[24] B. Hassibi, A. H. Sayed, and T. Kailath. Square root arrays and Chandrasekhar recursions for $H^\infty$ problems. Submitted to IEEE Transactions on Automatic Control. Also in Proc. 33rd IEEE Conference on Decision and Control, 1994, pp. 2237-2243.

[25] B. Hassibi, A. H. Sayed, and T. Kailath. $H^\infty$ optimality of the LMS algorithm. Submitted to IEEE Transactions on Signal Processing. Also in Proc. 32nd IEEE Conference on Decision and Control, 1993, pp. 74-80.

[26] A. Houacine. Regularized fast recursive least squares algorithms for finite memory filtering. IEEE Transactions on Signal Processing, 40(4):758-769, April 1992.

[27] R. E. Kalman. Lyapunov functions for the problem of Lur'e in automatic control. Proceedings of the National Academy of Sciences, 49(2):201-205, 1963.

[28] V. A. Yakubovich. The solution of certain matrix inequalities in automatic control theory. Doklady Akademii Nauk SSSR, 143:1304-1307, 1962.

[29] V. M. Popov. Hyperstability and optimality of automatic systems with several control functions. Rev. Roumaine Sci. Tech. - Electrotechn. et Energ., 9(4):629-690, 1964.
