IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. AC-16, NO. 6, DECEMBER 1971
An Innovations Approach to Least Squares Estimation–Part IV: Recursive Estimation Given Lumped Covariance Functions

THOMAS KAILATH AND ROGER A. GEESEY
Abstract–We show how to recursively compute linear least squares filtered and smoothed estimates for a lumped signal process in additive white noise. However, unlike the Kalman-Bucy problem, here only the covariance function of the signal process is known and not a specific state-variable model. The solutions are based on the innovations representation for the observation process.
I. INTRODUCTION

In the engineering literature two versions of the least squares problem, sometimes called the Wiener problem and the Kalman problem, are usually studied. In the present paper we shall develop a synthesis of the two problems. To explain this, we recall that in the most widely studied special case of the Wiener problem, we are given observations of the form

z(s) = y(s) + v(s), -∞ < s ≤ t (1)

where z(·), y(·), and v(·) are stationary processes with power spectra

S_z(f) = 1 + S_y(f), S_v(f) = 1, S_yv(f) = 0. (2)

The linear least squares estimate ŷ(t) is to be found in the form

ŷ(t) = ∫_0^∞ h(s)z(t - s) ds (3)

where h(·) is determined by the Wiener-Hopf equation

h(τ) + ∫_0^∞ h(s)R_y(τ - s) ds = R_y(τ), τ ≥ 0. (4)

In the important case when y(·) has a rational power-spectral-density function (and, of course, also in certain more general cases; see [1]), (4) can be explicitly solved and yields the fairly well-known result (see, e.g., [2, ch. 6])

H(f) = 1 - [S_z^+(f)]^{-1} (5)

where

S_z^+(f) = the transfer function of the unique causal and causally invertible linear filter whose response to white noise has power spectral density S_z(f). (6)

As a point of notation, we should remark that the representation of a stochastic process as the response of a causal and causally invertible filter to a white-noise process will be called a canonical representation (CR) or an innovations representation (IR); compare [3]-[5]. When we turn to problems with finite-time observation of stationary and/or nonstationary processes, the estimate ŷ(t) must be written in the form
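As a concrete illustration of the causal factor in (6) (our own numerical sketch, not an example from the paper): for an exponentially correlated signal with assumed spectrum S_y(w) = 2a/(w^2 + a^2) observed in unit white noise, the factor can be written in closed form and checked numerically. The constants a and b below are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical example: S_y(w) = 2a / (w^2 + a^2), unit white observation
# noise, so S_z(w) = 1 + S_y(w) = (w^2 + b^2) / (w^2 + a^2), b = sqrt(a^2 + 2a).
# The causal, causally invertible factor is S_z^+(jw) = (jw + b) / (jw + a).
a = 1.0
b = np.sqrt(a**2 + 2*a)

w = np.linspace(-10.0, 10.0, 1001)
S_z = 1 + 2*a / (w**2 + a**2)
S_z_plus = (1j*w + b) / (1j*w + a)

# |S_z^+(jw)|^2 must reproduce S_z(w) along the whole frequency axis
assert np.allclose(np.abs(S_z_plus)**2, S_z)
print("causal spectral factor verified; b =", b)
```

The check confirms that the factor is minimum phase (zero at -b, pole at -a, both in the left half-plane), which is exactly the causal-and-causally-invertible property that (6) requires.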
E[y(t)y'(s)] = K(t, s), z(t) = M(t)y(t) + v(t). (16b)

The notation is t∨s = max(t, s); t∧s = min(t, s).
Let z(·) be a process that is known to arise from some, not necessarily known, lumped model of the form (9), and suppose its covariance function has the known form
R_z(t, s) = δ(t - s) + M(t∨s)Φ(t∨s, t∧s)N(t∧s). (22)
ŷ(t) = M_y(t)x̂(t) (23a)
ν(t) = z(t) - ŷ(t) (23b)

where ν(·) and x̂(·) are as in Theorem 1, and M_y(·) = M(·). If, in addition, y(·) and v(·) are completely uncorrelated, then also N_y(·) = N(·).

Remark 3: These formulas for the smoothed estimates generalize those of Bryson and Frazier [21] for smoothing with known lumped models. We may note that in the present problem only covariances, and not lumped models of the form (9), are specified.

Corollary 3–State Estimation: Suppose we have a model of the form (9). Then the estimate of the ith state component is

x̂_i(t) = e_i x̂(t) (25)

where e_i is a row vector with a one in the ith position and zeros elsewhere, and x̂(·) is as in Theorem 1.

The Mean-Square Error: The mean-square error (mse) can be directly calculated from knowledge of the covariance function of y(·) as well as that of z(·).

Corollary 4–Mean-Square Error: We have

mse = R_y(t, t) - R_ŷ(t, t) = R_y(t, t) - M(t)P(t)M'(t). (26)

If the signal and noise processes are independent, then

R_y(t, t) = K(t, t) = M(t)Π(t)M'(t) = M(t)N(t) (27)

and then

mse = M(t)[N(t) - P(t)M'(t)] = M(t)K̄(t). (28)

Related Results and Historical Remarks: To begin with, it should be remarked that in much of the current control literature it seems to have been forgotten that historically the linear least squares estimation problems were all based on the assumption of known covariances (or power spectra) and not on the assumption of a known lumped model. Thus Wiener and Kolmogorov started with known power spectra, and Dolph and Woodbury [22], Darlington [23], Shinbrot [24], Peterson [25], and others treated various problems with nonstationary processes. The major handicaps of the work of Shinbrot and others were that 1) their solutions were not recursive and 2) they sought explicit solutions rather than just effective computational algorithms. The major contribution of Kalman, Bucy, and Stratonovich was their demonstration of the significance of the above two facts.
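A hedged numerical sketch of Corollary 4 (the scalar model, the discretization, and all constants are our assumptions, not the paper's): run the innovations filter together with the scalar Riccati equation and compare the empirical mean-square error against mse = M(N - PM') from (28).

```python
import numpy as np

rng = np.random.default_rng(0)
dt, T = 0.01, 200.0
n = int(T / dt)

# assumed scalar model: dy = -y dt + sqrt(2) dw, so the state variance -> 1
A, G, M = -1.0, np.sqrt(2.0), 1.0
y = np.zeros(n)
for k in range(n - 1):
    y[k + 1] = y[k] + A * y[k] * dt + G * np.sqrt(dt) * rng.standard_normal()
z = M * y + rng.standard_normal(n) / np.sqrt(dt)   # unit-intensity white noise

# Pi' = 2 A Pi + G^2 (state covariance), Riccati P' = 2 A P + K^2, K = N - P M
Pi = P = 0.0
yh = np.zeros(n)
Ks = np.zeros(n)
for k in range(n - 1):
    N = Pi * M
    K = N - P * M
    Ks[k] = K
    yh[k + 1] = yh[k] + (A * yh[k] + K * (z[k] - M * yh[k])) * dt
    Pi += (2 * A * Pi + G**2) * dt
    P += (2 * A * P + K**2) * dt

mse_emp = np.mean((y[n // 2:] - yh[n // 2:]) ** 2)   # after the transient
mse_th = M * Ks[n - 2]                               # (28): M (N - P M')
print(round(mse_th, 3))
```

For this model the steady-state theoretical value is M K̄ = sqrt(3) - 1, and the empirical error of the simulated filter should agree with it up to sampling fluctuation.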
The Smoothing Problem: So far we have only discussed causal, or filtered, estimates ŷ(t). However, from knowledge of ŷ(t) we can readily determine the noncausal or smoothed estimate ŷ(t|T). To do this we have only to use the general formula developed in Part II of this series of papers (compare [20, (8)]), which will yield

ŵ(t|T) = ŵ(t) + ∫_t^T E[w(t)ν'(s)]ν(s) ds (29)

where ν(·) is the innovations process in Theorem 1. To go beyond this formula we shall need one more assumption:

E[w(t)z(s)] = M_w(t)Φ(t, s)N(s), t ≥ s (30a)
E[w(t)z(s)] = M(s)Φ(s, t)N_w(t), t ≤ s (30b)

where M(·), N(·), and Φ(·, ·) are as in Theorem 1 [compare (15)]. Then

ŵ(t|T) = M_w(t)x̂(t) + [N_w'(t) - M_w(t)P(t)]λ(t) (31)

where

λ̇(t) = -[A'(t) - M'(t)K̄'(t)]λ(t) - M'(t)ν(t), λ(T) = 0 (32)

and K̄(·), P(·), and ν(·) are as in Theorem 1 [compare (15), (17)].

Corollary 5–Signals in Noise: Suppose z(t) = y(t) + v(t), where y(·) and v(·) are as in Corollary 2. Then the smoothed estimate ŷ(t|T) is obtained from (31), (32) with M_w(·) = M(·) and N_w(·) = N(·).

In fact, in Kalman's papers the major examples of nonstationary filtering were drawn from the work of Shinbrot and Peterson. However, it had seemed that the reason for the power of the Stratonovich-Kalman-Bucy approach lay in their assumption of a known model for the signal process. The main point of the present paper is that this assumption is not essential and that one can also obtain a recursive solution of the problems of Shinbrot et al. Thus the choice is really up to the proposer of the estimation problem as to whether a known lumped model or a known lumped covariance is specified.

As explained earlier, the key to our result is the innovations representation of the observation process. Partly for pedagogical reasons, in this paper we shall exploit this relationship in both directions. More precisely, we shall start with the Stratonovich-Kalman-Bucy formulas for the case of a known model in order to obtain the IR. (We recall that the simple derivation of these formulas in [5] and in [17] was based on the innovations concept.) Having obtained the IR for this case, we shall rewrite it in terms of the covariance function and then apply it to obtain our results on estimation. Thus we obtain a reasonably self-contained derivation, and this is one of the major pedagogical aims of this paper. We say this because the filtering result of Theorem 2 was already proved by us in a more general context [10, (24) and footnote 11], where we replaced the assumption that δ(t - s) + K(t, s) was known to be the output covariance function of some lumped system with the assumption that δ(t - s) +
K(t, s) was strictly positive definite; however, our proof was not self-contained but relied on an identity of Siegert, Krein, and Bellman (compare [26]). Results equivalent to our Corollaries 2 and 3 have recently also been obtained by Anderson and Moore [27] (see also [28]), but for the proof of existence they appeal to some results derived in [29] for quadratic regulator problems. We should also mention here the somewhat overlooked work of Swerling [30], [31], who was one of the first in modern times to apply recursive techniques to least squares problems.² In [31], Swerling gives some general recursive formulas for the smoothing of random signals in independent white noise; see especially (86), (87). However, no formulas are explicitly given for filtering or for lumped processes or covariances. Furthermore, in order to have both Swerling's (86) and (87), the assumption of independence is essential; see the discussion in [26] and [20]. Incidentally, (86) and (87) of Swerling [31] are similar to some formulas obtained in 1953 by Siegert [32]; as already mentioned in the introduction of [17], Siegert had, in a different context, already derived recursive formulas for the solution of certain Fredholm (related to smoothing) and Wiener-Hopf (related to filtering) equations.

III. PROOFS

We begin with the assumption that we are given a process with a known lumped Markov model and show how to find the innovations representation (IR) for it. Now it is known that the IR is uniquely determined, at least up to its impulse response, by the covariance function of the process (compare [4], [8]). Therefore we shall be able to express the IR found from a known model entirely in terms of the covariance function of the process. Doing this will prove Theorem 1, from which the other results follow readily.

Therefore, let us assume that we have a process z(·) with a lumped Markov model of the form (9). This model for z(·) is causal but is not
in general causally invertible. The IR for such a process was already obtained in Part I of this series of papers (compare [17, appendix II] and also [18]). It is actually just a rearrangement of the Kalman-Bucy filter (10a)-(10d), viz., it is
x̂̇(t) = A(t)x̂(t) + K̄(t)ν(t), x̂(t₀) = 0 (33a)
z(t) = C(t)x̂(t) + ν(t) (33b)

where ν(·) is the white-noise innovations process and K̄(·) is the Kalman-Bucy gain function given by (10c) and (10d). It is easily checked that the representation (33) for z(·) is causally invertible: given z(·), we replace ν(t) in (33a) by z(t) - C(t)x̂(t) and solve for x̂(t). When this is done, ν(·) can be readily calculated. We may recall that the fact that the innovations process ν = z - Cx̂ is white was proved, actually in a more general context, in [17, appendix II] (see also [5]).

Next, we shall show how to express the IR (33) in terms of the parameters of the covariance function of z(·), say R_z(t, s). By direct calculation, using some well-known formulas for the covariance function of lumped models driven by white noise (compare [17, appendix I-B]), we obtain the expression
R_z(t, s) = δ(t - s) + K(t, s) (34a)

with

K(t, s) = M(t∨s)Φ(t∨s, t∧s)N(t∧s) (34b)

where Φ(t, s) is the state-transition matrix of A(·), and

M(t) = C(t), N(t) = Π(t)C'(t) + Γ(t)S(t). (34c)

Finally, Π(·) is the covariance matrix of the states x(·),

Π(t) = E x(t)x'(t) (35a)

and obeys the differential equation

Π̇(t) = A(t)Π(t) + Π(t)A'(t) + Γ(t)Q(t)Γ'(t), Π(t₀) = Π₀. (35b)
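Equation (35b) can be spot-checked numerically. The two-state matrices below are assumed values chosen for illustration; the integrated covariance is verified against the algebraic steady-state condition A Π + Π A' + Γ Q Γ' = 0, solved in Kronecker (vectorized) form.

```python
import numpy as np

# assumed 2-state model x' = A x + G u with E[u(t)u'(s)] = Q delta(t - s)
A = np.array([[0., 1.], [-2., -3.]])   # Hurwitz, eigenvalues -1 and -2
G = np.array([[0.], [1.]])
Q = np.array([[1.0]])

# Euler integration of (35b): Pi' = A Pi + Pi A' + G Q G', Pi(0) = 0
dt, steps = 1e-3, 20000
Pi = np.zeros((2, 2))
for _ in range(steps):
    Pi = Pi + (A @ Pi + Pi @ A.T + G @ Q @ G.T) * dt

# steady state: A Pi + Pi A' = -G Q G'; vectorize with Kronecker products
I = np.eye(2)
vec = np.linalg.solve(np.kron(A, I) + np.kron(I, A), -(G @ Q @ G.T).ravel())
Pi_inf = vec.reshape(2, 2)

print(np.allclose(Pi, Pi_inf, atol=1e-6))
```

For this particular model the steady state works out to Π = diag(1/12, 1/6), so the agreement of the two computations is easy to confirm by hand as well.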
Note that there is no loss of generality here in assuming a particular matrix A(·), for by a simple linear transformation of the states it is possible to modify the A(·) matrix without affecting the function K(t, s). Now let us assume that we are given M(·), N(·), and Φ(·, ·) [or equivalently A(·)]. Then to rewrite the IR (33) in terms of these parameters, we must be able to express the gain function K̄(·) entirely in terms of them. To do so, let us first define a matrix P(t) as the covariance matrix of the states x̂(t) of the IR (33);
P(t) = E x̂(t)x̂'(t). (36)
We note that by the orthogonality of x̂(t) and x̃(t) = x(t) - x̂(t) (compare [17, theorem 1]), we will have

E x(t)x'(t) = E x̂(t)x̂'(t) + E x̃(t)x̃'(t)

i.e.,

Π(t) = P(t) + P̃(t), P̃(t) = E x̃(t)x̃'(t). (37)
Now direct calculation from (33a), or from the differential equations (35b) for Π(·) and (10e) for P̃(·), yields the equation

Ṗ = AP + PA' + K̄K̄', P(t₀) = 0. (38)
But the Kalman-Bucy gain function K̄(·) can itself be expressed in terms of P(·), M(·), and N(·):

K̄(t) = P̃(t)C'(t) + Γ(t)S(t)
     = Π(t)C'(t) + Γ(t)S(t) - P(t)C'(t)
     = N(t) - P(t)M'(t). (39)

²Similar applications were made in 1958 by Breakwell and Striebel (personal communication), and perhaps also by others, including Follin, Carlton, Hanson, and Bucy, at the Applied Physics Laboratory, Johns Hopkins University (compare the discussion in [38]). Swerling's general results can be specialized, with hindsight, to yield the fundamental differential equations of Stratonovich, Kalman, and Bucy. Reference [39] is a very early report on this general problem area.
Therefore, to express the IR (33) entirely in terms of M(·), N(·), and A(·), we first solve the following Riccati equation for P(·):

Ṗ(t) = A(t)P(t) + P(t)A'(t) + [N(t) - P(t)M'(t)]·[N(t) - P(t)M'(t)]', P(t₀) = 0. (40)
Then K̄(·) can be calculated from (39), and we thereby have the IR entirely in terms of the parameters of the covariance function. The existence³ of a solution of the nonlinear equation (40) is easy to establish, for we note from the relation (37) that

P(t) ≤ Π(t), t₀ ≤ t ≤ T. (41)
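A minimal sketch, under assumed model values (our choice, not the paper's), of the computation just described: integrate (35b) and (40) side by side, form the gain from (39), and confirm the bound (41) by checking that Π - P stays positive semidefinite. We take M(t) = C and N(t) = Π(t)C', i.e., no correlation term Γ(t)S(t).

```python
import numpy as np

# assumed 2-state model; observation row C, no input/observation correlation
A = np.array([[0., 1.], [-2., -3.]])
G = np.array([[0.], [1.]])
Q = np.array([[1.0]])
C = np.array([[1., 0.]])

dt, steps = 1e-3, 20000
Pi = np.zeros((2, 2))   # state covariance, integrates (35b)
P = np.zeros((2, 2))    # IR state covariance, integrates (40)
for _ in range(steps):
    N = Pi @ C.T
    K = N - P @ C.T                                    # gain (39)
    Pi = Pi + (A @ Pi + Pi @ A.T + G @ Q @ G.T) * dt   # (35b)
    P = P + (A @ P + P @ A.T + K @ K.T) * dt           # (40)

# (41): Pi(t) - P(t) must remain positive semidefinite
print(np.linalg.eigvalsh(Pi - P).min() > -1e-6)
print(np.round((Pi - P) @ C.T, 3).ravel())   # steady-state gain direction
```

Note that Π - P is exactly the filtering error covariance in the model-based formulation, which is why the bound (41) holds along the whole trajectory.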
Moreover, by (34b) and (34c),

K(t, t) = M(t)N(t) = M(t)[Π(t)M'(t) + Γ(t)S(t)]. (42)

To complete the proof of Theorem 1, we now merely have to replace x̂(·) in (33) by ψ(·). The proof of Corollary 1 is obvious from the discussion below (33).

To prove Theorem 2, we note that a process w(·) satisfying the required covariance relations for 0 ≤ s ≤ t can be exhibited, and the relations verified by direct calculation. But then

ŷ(t) = M_y(t)x̂(t) = M_y(t)ψ(t)

which is the content of Theorem 2.

Corollary 2 now follows by noting that E y(t)z(s) = M(t)Φ(t, s)N(s) for t > s. Corollary 3 follows from the calculation E x_i(t)z(s) = e_iΦ(t, s)N(s). Corollary 4 is based on some obvious calculations.

Finally, for Theorem 3, we note that the assumption (30) for E w(t)z(s) is consistent with the assumptions

w(t) = M_w(t)x(t), N_w(t) = Π(t)M_w'(t). (43)

Now we use our earlier result [20, (28)] for smoothed estimates with the model (9).

The general discrete-time problem can be treated by methods similar to those used in Section III; in fact, it is a bit simpler because it is not necessary to consider separately problems without additive white noise in the observations. The explicit formulas and some other interesting features of the discrete-time case will be presented elsewhere.

Some comments may be useful on the assumption that the covariance function is available in the form (11), (12). For stationary processes, a large number of algorithms are available for obtaining the form (11), (12) from knowledge of the covariance sequence. Much less has been done on covariance estimation for nonstationary processes, though some analogs of the Ho algorithm have been presented [33], [41]. An interesting open problem is whether it may be better to attempt to directly identify the IR (33) from observations of z(·) rather than to go via the covariance function of z(·). It should be noted that, for purposes of identification, the innovations model (33) will usually have fewer parameters than models of the form (9): in the latter, Γ(·) may often be a rectangular matrix, while, for scalar y(·), K̄(·) in the IR (33) will always be a column matrix. Åström [34] came earlier to this conclusion for scalar-valued infinite-time stationary processes. We have pointed out the appropriate generalizations in [18] and in the above; following these generalizations, Mehra has made several studies [35], [36] of the identification problem.

³Note that the existence of a solution for the nonlinear Riccati equation (40) is vital to our problem. It is this argument that has been more difficult to handle in earlier solutions [16]-[19]. In our proof here, it is the assumption that some lumped Markov model is known to exist for y(·) that guarantees the existence of a solution. However, as noted at the end of Section I, the existence of the IR in the form of Theorem 1 can also be proved under weaker conditions, but with somewhat more difficulty (compare [11], [12]).

In final summary, we have shown how to develop recursive filtering and smoothing solutions for an observation process specified not by a lumped model but by a separable covariance function of the form (11), (12). Apart from
their intrinsic interest, these results are useful because they show the intimate relations between the problem of least squares estimation and the problem of determining the innovations representation of a given stochastic process. These relationships will also be useful in studying estimation in colored noise, and their dual singular control problems, as we shall show in later papers.

REFERENCES
[1] M. G. Krein, "Integral equations on a half-line with kernel depending upon the difference of the arguments," Amer. Math. Soc. Transl., ser. 2, vol. 22, pp. 163-288, 1962.
[2] H. L. Van Trees, Detection, Estimation and Modulation Theory, vol. 1. New York: Wiley, 1968.
[3] H. Cramér, "A contribution to the multiplicity theory of stochastic processes," in Proc. 5th Berkeley Symp. Mathematics, Statistics, and Probability, vol. 2, pp. 216-221, 1967.
[4] T. Hida, "Canonical representations of Gaussian processes and their applications," Mem. Coll. Sci., Kyoto Univ., ser. A, vol. 33, 1960.
[5] T. Kailath, "The innovations approach to detection and estimation theory," Proc. IEEE, vol. 58, pp. 680-695, May 1970.
[6] R. L. Stratonovich, "Application of the theory of Markov processes for optimum filtration of signals," Radio Eng. Electron. Phys. (USSR), vol. 1, pp. 1-19, Nov. 1960.
[7] R. E. Kalman and R. S. Bucy, "New results in linear filtering and prediction theory," Trans. ASME, J. Basic Eng., ser. D, vol. 83, pp. 95-108, Mar. 1961.
[8] R. Geesey, "Canonical representations of second-order processes with applications," Ph.D. dissertation, Dep. Elec. Eng., Stanford Univ., Stanford, Calif., Dec. 1968; also SEL Tech. Rep. 7050-17, June 1969.
[9] T. Kailath and R. Geesey, "Covariance factorization–An explication via examples," in Proc. 2nd Asilomar Conf. Systems and Circuits, Monterey, Calif., Nov. 1968.
[10] T. Kailath, "Fredholm resolvents, Wiener-Hopf equations, and Riccati differential equations," IEEE Trans. Inform. Theory, vol. IT-15, pp. 665-672, Nov. 1969.
[11] R. Geesey and T. Kailath, "Application of the canonical representation of second-order processes to estimation and detection in colored noise," in Proc. Symp. Computer Processing in Communications, Polytechnic Institute of Brooklyn, Brooklyn, N.Y., Apr. 1969.
[12] J. B. Moore and B. D. O. Anderson, "Spectral factorization of time-varying covariance functions: The singular case," Math. Syst. Theory, vol. 4, no. 1, pp. 10-23, 1970.
[13] L. Brandenburg, "Shaping filter models for nonstationary random processes," Ph.D. dissertation, Columbia Univ., New York, N.Y., June 1968.
[14] P. Faurre, "Representation of stochastic processes," Ph.D. dissertation, Dep. Elec. Eng., Stanford Univ., Stanford, Calif., Feb. 1967; also Lecture Notes in Mathematics (from Nice Symp. Optimization), vol. 132. New York: Springer, 1970, pp. 85-107.
[15] R. E. Kalman, "Linear stochastic filtering–Reappraisal and outlook," in Proc. Symp. System Theory, Polytechnic Institute of Brooklyn, Brooklyn, N.Y., pp. 197-205, Apr. 1965.
[16] I. Gohberg and M. G. Krein, "On the factorization of operators in Hilbert spaces," Amer. Math. Soc. Transl., ser. 2, vol. 51, pp. 135-188, 1966.
[17] T. Kailath, "An innovations approach to least-squares estimation–Part I: Linear filtering in additive white noise," IEEE Trans. Automat. Contr., vol. AC-13, pp. 646-655, Dec. 1968.
[18] R. Geesey and T. Kailath, "Comments on 'The relationship of alternate state-space representations in linear filtering problems'," IEEE Trans. Automat. Contr. (Corresp.), vol. AC-14, pp. 113-114, Feb. 1969.
[19] G. M. Jenkins and D. Watts, Spectral Analysis and Its Applications. San Francisco, Calif.: Holden-Day, 1968.
[20] T. Kailath and P. Frost, "An innovations approach to least-squares estimation–Part II: Linear smoothing in additive white noise," IEEE Trans. Automat. Contr., vol. AC-13, pp. 655-660, Dec. 1968.
[21] A. E. Bryson and M. Frazier, "Smoothing for linear and nonlinear dynamic systems," Aeronautical Syst. Div., Wright-Patterson AFB, Ohio, Tech. Rep. ASD-TDR-63-119, Feb. 1963.
[22] C. L. Dolph and M. A. Woodbury, "On the relation between Green's functions and covariances of certain stochastic processes and its application to unbiased linear prediction," Trans. Amer. Math. Soc., vol. 72, pp. 519-550, 1952.
[23] S. Darlington, "Nonstationary smoothing and prediction using network theory concepts," in Trans. 1959 Int. Symp. Circuit and Information Theory, Los Angeles, Calif., pp. 1-13, 1959.
[24] M. Shinbrot, "A generalization of a method for the solution of the integral equations arising in optimization of time-varying linear systems with nonstationary inputs," IRE Trans. Inform. Theory, vol. IT-3, pp. 220-224, Dec. 1957.
[25] E. L. Peterson, Statistical Analysis and Optimization of Systems. New York: Wiley, 1961.
[26] T. Kailath, "Application of a resolvent identity to a linear smoothing problem," SIAM J. Contr., vol. 7, pp. 68-74, Feb. 1969.
[27] B. D. O. Anderson and J. B. Moore, "The Kalman-Bucy filter as a true time-varying Wiener filter," IEEE Trans. Syst., Man, Cybern., vol. SMC-1, pp. 119-128, Apr. 1971.
[28] B. D. O. Anderson, J. B. Moore, and S. G. Loo, "Spectral factorization of time-varying covariance functions," IEEE Trans. Inform. Theory, vol. IT-15, pp. 550-557, Sept. 1969.
[29] J. B. Moore and B. D. O. Anderson, "Extensions of quadratic minimization theory, I," Int. J. Contr., vol. 7, pp. 465-472, 1968.
[30] P. Swerling, "First-order error propagation in a stagewise smoothing procedure for satellite observations," J. Astronaut. Sci., vol. 6, pp. 46-52, 1959.
[31] --, "Topics in generalized least-squares signal estimation," SIAM J. Appl. Math., vol. 14, pp. 998-1031, Sept. 1966; also "Modern state estimation methods from the viewpoint of least squares," this issue, pp. 707-719.
[32] A. J. F. Siegert, "A systematic approach to a class of problems in the theory of noise and other random phenomena," Parts II and III, IRE Trans. Inform. Theory, vol. IT-3, pp. 38-43, Mar. 1957; vol. IT-4, pp. 4-14, Mar. 1958.
[33] C. Bruni, A. Isidori, and A. Ruberti, "A method of factorization of the impulse-response matrix," IEEE Trans. Automat. Contr. (Corresp.), vol. AC-13, pp. 739-741, Dec. 1968.
[34] K. Åström and S. Wenmark, "Numerical identification of stationary time series," in Proc. 6th Int. Instruments and Measurements Congr., Stockholm, Sweden, Sept. 1964.
[35] R. Mehra, "On-line identification of linear dynamic systems with applications to Kalman filtering," IEEE Trans. Automat. Contr., vol. AC-16, pp. 12-21, Feb. 1971.
[36] --, "Approaches to adaptive filtering," in Proc. 1970 IEEE Symp. Adaptive Processes, Austin, Tex., Dec. 1970.
[37] J. Rissanen, "Recursive identification of linear systems," SIAM J. Contr., vol. 9, Aug. 1971.
[38] R. S. Bucy and P. D. Joseph, Filtering for Stochastic Processes with Applications to Guidance. New York: Wiley, 1968.
[39] A. G. Carlton and J. W. Follin, Jr., "Recent developments in fixed and adaptive filtering," AGARDograph 21, 1956.
[40] R. E. Kalman, P. L. Falb, and M. A. Arbib, Topics in Mathematical System Theory. New York: McGraw-Hill, 1969.
[41] L. Silverman, "Realization of linear dynamical systems," this issue, pp. 554-567; also Ph.D. dissertation, Columbia Univ., New York, N.Y., 1966; also D. C. Youla and P. Tissi, in 1966 IEEE Int. Conv. Rec., part I.
Thomas Kailath (S'57-M'62-F'70) was born in Poona, India, on June 7, 1935. He received the B.E. degree in telecommunications engineering from the University of Poona in 1956 and the S.M. and Sc.D. degrees in electrical engineering from the Massachusetts Institute of Technology, Cambridge, in 1959 and 1961, respectively.

During 1961-1962 he worked at the Jet Propulsion Laboratories, Pasadena, Calif., where he also taught part-time at the California Institute of Technology, Pasadena. Since then he has been at Stanford University, Stanford, Calif., where, in 1968, he became a Professor of Electrical Engineering. From January to June 1963 he was a Visiting Scholar at the University of California, Berkeley; during 1969-1970 he was a Fellow of the John Guggenheim Memorial Foundation and a Visiting Professor at the Indian Institute of Science, Bangalore, India. His research interests are generally in statistical data processing in communications and control. Along with a coauthor, he received the 1965-1966 prize of the Information Theory Group for a paper, "Feedback Communications Systems." He is also editor of a Prentice-Hall series of books in information and system sciences and a consultant to several industrial companies. He will be the co-chairman of the forthcoming 1972 International Symposium on Information Theory.

Dr. Kailath is a member of the Institute of Mathematical Statistics, Commission VI of URSI (The International Scientific Radio Union), Sigma Xi, and several other scientific societies. In January 1970, he received the Mastech Award of the Council of Scientific and Industrial Research, India, for a paper in the field of probability theory and statistics and their applications to science and technology.
Roger A. Geesey (M'61) was born in Hagerstown, Md., on May 25, 1936. He received the B.S. degree in electrical engineering from Lehigh University, Bethlehem, Pa., in 1958 and the M.S. and Ph.D. degrees from Stanford University, Stanford, Calif., in 1963 and 1969, respectively.

Since 1958 he has been on active duty with the U.S. Air Force and presently serves at HQ Aeronautical Chart and Information Center, St. Louis, Mo. Previously he had Air Force assignments at the Rome Air Development Center, Griffiss AFB, N.Y., and the Frank J. Seiler Research Laboratory, USAF Academy, Colo. He is also presently affiliated with the graduate program in control systems science and engineering at Washington University, St. Louis, Mo., where he has taught a course in stochastic filtering theory.
Discrete Square Root Filtering: A Survey of Current Techniques

P. G. KAMINSKI, A. E. BRYSON, JR., AND S. F. SCHMIDT
Abstract–The conventional Kalman approach to discrete filtering involves propagation of a state estimate and an error covariance matrix from stage to stage. Alternate recursive relationships have been developed to propagate a state estimate and a square root error covariance instead. Although algebraically equivalent to the conventional approach, the square root filters exhibit improved numerical characteristics, particularly in ill-conditioned problems. In this paper, current techniques in square root filtering are surveyed and related by applying a duality association. Four efficient square root implementations are suggested and compared with three common conventional implementations in terms of computational complexity and precision. The square root computational burden should not exceed the conventional by more than 50 percent in most practical problems. An examination of numerical conditioning predicts that the square root approach can yield twice the effective precision of the conventional filter in ill-conditioned problems. This prediction is verified in two examples. The excellent numerical characteristics and reasonable computation requirements of the square root approach make it a viable alternative to the conventional filter in many applications, particularly when computer word length is limited or the estimation problem is badly conditioned.
I. INTRODUCTION

In a significant class of filtering problems, propagation of the error covariance matrix by means of the Kalman filter equations [1] results in a matrix which is not positive semidefinite, a theoretical impossibility. This may occur when 1) the covariance matrix is rapidly reduced by processing very accurate measurements, or 2) a linear combination of state vector components is known with great precision, while other combinations are essentially unobservable. The source of trouble in both cases is numerical computation of ill-conditioned quantities in finite word length.

To circumvent this difficulty, Potter [2] gave a method for propagating the error covariance matrix in a square root form in the absence of process noise. This method is completely successful in maintaining the positive semidefinite nature of the error covariance, and it can provide twice the effective precision of the conventional filter implementation in ill-conditioned problems. The outstanding numerical characteristics and relative simplicity of this Potter square root approach led to its implementation in the Apollo navigation filters [3].

Extensions of the Potter square root approach, and the development of an information square root filter, have provided several recursive square root solutions to the discrete filtering problem. The purpose of this paper is to review and relate these square root filters, summarize the most promising computational approaches, and compare these with the conventional approach in terms of computational complexity and precision.
Manuscript received July 12, 1971. Paper recommended by D. G. Luenberger, Associate Guest Editor. This research was sponsored in part by Research Grant NASA-NgL-05-020-007, a National Science Foundation graduate fellowship, and the U.S. Air Force. The views expressed herein are the authors' and do not necessarily reflect those of Air University, the U.S. Air Force, or the Department of Defense. P. G. Kaminski is with the U.S. Air Force Space and Missile Systems Organization, El Segundo, Calif. A. E. Bryson, Jr., is with the Department of Aeronautics and Astronautics, Stanford University, Stanford, Calif. S. F. Schmidt is with Analytical Mechanics Associates, Inc., Mountain View, Calif.
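The Potter update mentioned above can be sketched for a single scalar measurement. This is a minimal illustration under our own assumptions (a lower-triangular Cholesky factor convention and made-up numbers); the rank-one factor identity used to form the new square root is the standard Potter construction for the no-process-noise measurement update.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
T = rng.standard_normal((n, n))
P = T @ T.T + 0.1 * np.eye(n)      # prior error covariance (assumed)
S = np.linalg.cholesky(P)          # square root: P = S S'
h = rng.standard_normal(n)         # scalar observation z = h'x + v
r = 1e-6                           # very accurate measurement, E v^2 = r

# conventional covariance update: P+ = P - P h h' P / (h' P h + r)
P_conv = P - np.outer(P @ h, P @ h) / (h @ P @ h + r)

# Potter-style update applied to the square root itself
phi = S.T @ h
delta = phi @ phi + r
beta = (1.0 - np.sqrt(r / delta)) / (phi @ phi)
S_new = S @ (np.eye(n) - beta * np.outer(phi, phi))
P_potter = S_new @ S_new.T         # positive semidefinite by construction

print(np.allclose(P_conv, P_potter))
```

The two updates agree algebraically, but only the square root form guarantees a positive semidefinite result in finite word length, which is the property the Apollo implementation relied on.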
II. CONVENTIONAL FILTERING

The following notation is used to describe the discrete-time problem:

observation: z(k) = C(k)x(k) + v(k) (2)
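For reference, the conventional propagation that the square root filters replace can be sketched as one predict/update step. The function below is a generic illustration under our assumptions; the symbols A, Gq, C, R and the toy numbers are ours, not the paper's full notation.

```python
import numpy as np

def kf_step(x, P, z, A, Gq, C, R):
    """One conventional discrete Kalman step: propagate the estimate x and
    covariance P, then incorporate the observation z = C x + v."""
    # time update
    x = A @ x
    P = A @ P @ A.T + Gq
    # measurement update
    S = C @ P @ C.T + R                # innovations covariance
    K = P @ C.T @ np.linalg.inv(S)     # Kalman gain
    x = x + K @ (z - C @ x)
    P = P - K @ C @ P                  # covariance update (the fragile step)
    return x, P

# toy usage with assumed numbers
A = np.eye(2)
Gq = 0.01 * np.eye(2)
C = np.array([[1., 0.]])
R = np.array([[1.]])
x, P = kf_step(np.zeros(2), np.eye(2), np.array([0.5]), A, Gq, C, R)
print(P[0, 0] < 1.0)   # variance of the observed component shrinks
```

The final subtraction P - KCP is precisely where round-off can destroy positive semidefiniteness in ill-conditioned problems, which motivates the square root alternatives surveyed in this paper.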