AC-16, NO. 6, DECEMBER 1971
An Innovations Approach to Least Squares Estimation-Part IV: Recursive Estimation Given Lumped Covariance Functions
Abstract-We show how to recursively compute linear least squares filtered and smoothed estimates for a lumped signal process in additive white noise. However, unlike the Kalman-Bucy problem, here only the covariance function of the signal process is known and not a specific state-variable model. The solutions are based on the innovations representation for the observation process.
I. INTRODUCTION

IN THE engineering literature two versions of the least squares problem, sometimes called the Wiener problem and the Kalman problem, are usually studied. In the present paper we shall develop a synthesis of the two problems. To explain this, we recall that in the most widely studied special case of the Wiener problem, we are given observations of the form

$$z(s) = y(s) + e(s), \qquad -\infty < s \le t \tag{1}$$

where $z(\cdot)$, $y(\cdot)$, and $e(\cdot)$ are stationary processes with power spectra

$$S_y(f), \qquad S_e(f) = 1, \qquad S_z(f) = 1 + S_y(f). \tag{2}$$

The linear least squares estimate $\hat{y}(t)$ is to be found in the form

In the important case when $y(\cdot)$ has a rational power-spectral-density function (and, of course, also in certain more general cases; see [1]), (4) can be explicitly solved and yields the fairly well-known result (see, e.g., [2, ch. 6])

where

$S_z^+(f)$ = the transfer function of the unique causal and causally invertible linear filter whose response to white noise has power spectral density $S_z(f)$. (6)

As a point of notation, we should remark that the representation of a stochastic process as the response of a causal and causally invertible filter to a white-noise process will be called a canonical representation (CR) or an innovations representation (IR); compare [3]-[5]. When we turn to problems with finite-time observation of stationary and/or nonstationary processes, the estimate $\hat{y}(t)$ must be written
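As a worked illustration of the canonical factor $S_z^+(f)$, consider a first-order rational signal spectrum. This is a sketch under stated assumptions rather than a formula taken from the paper: the constants $a > 0$, $k > 0$, and $b$ are hypothetical, $\omega = 2\pi f$, the signal and noise are taken to be uncorrelated as in (2), and the well-known result referred to above is assumed to take the form $H(f) = 1 - 1/S_z^+(f)$ for the causal filter that produces $\hat{y}(t)$ from $z(\cdot)$.

$$
\begin{aligned}
S_y(f) &= \frac{k}{\omega^2 + a^2}, \qquad
S_z(f) = 1 + \frac{k}{\omega^2 + a^2} = \frac{\omega^2 + b^2}{\omega^2 + a^2}, \qquad b = \sqrt{a^2 + k},\\
S_z^+(f) &= \frac{j\omega + b}{j\omega + a}, \qquad \left|S_z^+(f)\right|^2 = S_z(f),\\
H(f) &= 1 - \frac{1}{S_z^+(f)} = \frac{b - a}{j\omega + b}.
\end{aligned}
$$

Both $S_z^+$ and its inverse have their single pole in the left half-plane (at $-a$ and $-b$, respectively), which is what makes the factor causal and causally invertible. For example, $a = 1$ and $k = 3$ give $b = 2$ and $H(f) = 1/(j\omega + 2)$.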
Then $( .) can be calculated from (39) and we thereby have the IR entirely in terms of the parameters of the covariance function. The existence³ of a solution of the nonlinear equation (40) is easy to establish, for we note from the relation (37) that
P(t) ≤ Π(t), ... ≤ t ≤ T
K(t, t) = M(t)N(t) = M(t)[Π(t)M'(t) ...]. (42)

To complete the proof of Theorem 1, we now merely have to replace x̂( .) in (33) by +( .). The proof of Corollary 1 is obvious from the discussion below (33).

To prove Theorem 2, we note that a process w(·) satisfying ... 0 ≤ s ... fied by direct calculation. But then

$(t) = M_w(t)x̂(t) = M_w(t)+(t)

which is the content of Theorem 2.

Corollary 2 now follows by noting that E y(t)z(s) = M(t)Φ(t, s)N(s) for t > s. Corollary 3 follows from the calculation E x_i(t)z(s) = e_i Φ(t, s)N(s). Corollary 4 is based on some obvious calculations.

Finally for Theorem 3, we note that the assumption (30) for E w(t)z(s) is consistent with the assumptions

w(t) = M_w(t)x(t), N_w(t) = Π(t)M_w'(t). (43)

Now we use our earlier result [20, (28)] for smoothed estimates with the model (9):

...tained by methods similar to those used in Section III; in fact, the general discrete-time problem is a bit simpler because it is not necessary to consider separately problems without additive white noise in the observations. The explicit formulas and some other features of the discrete-time ... will be presented elsewhere.

Some comments may be useful on the assumption that the covariance function is available in the form (11), (12). For stationary processes, a large number of algorithms are ... the form (11), (12) from knowledge of the covariance sequence. Much less has been done on covariance estimation for nonstationary processes, though some analogs of the Ho algorithm have been presented [33], [41]. An interesting open problem is whether it may be ... to attempt to directly identify the IR (33) from observations of z(·) rather than to go via the covariance function of z(·). It should be noted that, for purposes of identification, the innovations model (33) will usually have fewer parameters than models of the form (9): in the latter Γ(·) may often be a rectangular matrix, while, for scalar y(·), K(·) in the IR (33) will always be a column matrix. Astrom [34] came earlier to this conclusion for scalar-valued infinite-time stationary processes. We have pointed out the appropriate generalizations in [18] and in the above; following these generalizations, Mehra has several studies [35], [36] of the identification problem.

In final summary, we have shown how to develop recursive filtering and smoothing solutions for an observation process specified not by a lumped model but by a separable covariance function of the form (11), (12). Apart from their intrinsic interest, these results are useful because they show the intimate relations between the problem of least squares estimation and the problem of determining the innovations representation of a given stochastic process. These relationships will also be useful in studying estimation in colored noise, and their dual singular control problems, as we shall show in later papers.

³ Note that the existence of a solution for the nonlinear Riccati equation (40) is vital to our arguments. It is this problem that has been more difficult to handle in earlier solutions [16]-[19]. In our proof here, it is the assumption that some lumped Markov model is known to exist for y(·) that guarantees the existence of a solution. However, as noted at the end of Section I, the existence of the IR in the form of Theorem 1 can also be proved under weaker conditions, but with somewhat more difficulty (compare [11], [12]).

[17] ...mation-Part I: Linear filtering in additive white noise," IEEE Trans. Automat. Contr., vol. AC-13, pp. 646-655, Dec. 1968.
[18] R. Geesey and T. Kailath, "Comments on 'The relationship of alternate state-space representations in linear filtering problems'," IEEE Trans. Automat. Contr. (Corresp.), vol. AC-14, pp. 113-114, Feb. 1969.
[19] G. M. Jenkins and D. Watts, Spectral Analysis and its Applications. San Francisco, Calif.: Holden-Day, 1968.
[20] T. Kailath and P. Frost, "An innovations approach to least-squares estimation-Part II: Linear smoothing in additive white noise," IEEE Trans. Automat. Contr., vol. AC-13, pp. 655-660, Dec. 1968.
[21] A. E. Bryson and M. Frazier, "Smoothing for linear and nonlinear dynamic systems," Aeronautical Syst. Div., Wright-Patterson AFB, Ohio, Tech. Rep. ASD-TDR-63-119, Feb. 1963.
[22] C. L. Dolph and M. A. Woodbury, "On the relation between Green's functions and covariances of certain stochastic processes and its application to unbiased linear prediction," Trans. Amer. Math. Soc., vol. 72, pp. 519-550, 1952.
[23] S. Darlington, "... smoothing and prediction using network theory concepts," in Trans. 1959 Int. Symp. Circuit and Information Theory, Los Angeles, Calif., pp. 1-13, 1959.
[24] M. Shinbrot, "A generalization of a method for the solution of the integral equations arising in optimization of time-varying linear systems with nonstationary inputs," IRE Trans. Inform. Theory, vol. IT-3, pp. 220-224, Dec. 1957.
[25] E. L. Peterson, Statistical Analysis and Optimization of Systems. New York: Wiley, 1961.
[26] T. Kailath, "Application of a resolvent identity to a linear smoothing problem," SIAM J. Contr., vol. 7, pp. 68-74, Feb. 1969.
[27] B. D. O. Anderson and J. B. Moore, "The Kalman-Bucy filter as a true time-varying Wiener filter," IEEE Trans. Syst., Man,
[30] P. Swerling, "First-order error propagation in a stagewise smoothing procedure for satellite observations," J. Astronaut. Sci., vol. 6, pp. 46-52, 1959.
[31] --, "Topics in generalized least-squares signal estimation," SIAM J. Appl. Math., vol. 14, pp. 998-1031, Sept. 1966.
--, "Modern state estimation methods from the viewpoint of least squares," this issue, pp. 707-719.
[32] A. J. F. Siegert, "A systematic approach to a class of problems in the theory of noise and other random phenomena," Parts II and III, IRE Trans. Inform. Theory, vol. IT-3, pp. 38-43, Mar. 1957; vol. IT-4, pp. 4-14, Mar. 1958.
[33] C. Bruni, A. Isidori, and A. Ruberti, "A method of factorization of the impulse-response matrix," IEEE Trans. Automat. Contr. (Corresp.), vol. AC-13, pp. 739-741, Dec. 1968.
[34] K. Astrom and S. Wensmark, "Numerical identification of stationary time series," in Proc. 6th Int. Instruments and Measurements Congr., Stockholm, Sweden, Sept. 1964.
[35] R. Mehra, "On-line identification of linear dynamic systems with applications to Kalman filtering," IEEE Trans. Automat. Contr., vol. AC-16, pp. 12-21, Feb. 1971.
[36] --, "Approaches to adaptive filtering," in Proc. 1970 IEEE Symp. Adaptive Processes, Austin, Tex., Dec. 1970.
[37] J. Rissanen, "Recursive identification of linear systems," SIAM J. Contr., vol. 9, Aug. 1971.
[38] R. S. Bucy and P. D. Joseph, Filtering for Stochastic Processes with Applications to Guidance. New York: Wiley, 1968.
[39] N. G. Carlton and J. W. Follin, Jr., "Recent developments in fixed and adaptive filtering," AGARDograph 81, 1956.
[40] R. E. Kalman, P. Falb, and M. A. Arbib, Topics in Mathematical System Theory. New York: McGraw-Hill, 1969.
[41] L. Silverman, "Realization of linear dynamical systems," this issue, pp. 554-567; also, Ph.D. dissertation, Columbia Univ., New York, N.Y., 1966; also D. C. Youla and P. Tissi, in 1966 IEEE Int. Conv. Rec., part I.
Discrete Square Root Filtering: A Survey of Current Techniques
Abstract-The conventional Kalman approach to discrete filtering involves propagation of a state estimate and an error covariance matrix from stage to stage. Alternate recursive relationships have been developed to propagate a state estimate and a square root error covariance instead. Although equivalent algebraically to the conventional approach, the square root filters exhibit improved numerical characteristics, particularly in ill-conditioned problems. In this paper, current techniques in square root filtering are surveyed and related by applying a duality association. Four efficient square root implementations are suggested, and compared with three common conventional implementations in terms of computational complexity and precision. The square root computational burden should not exceed the conventional by more than 50 percent in most practical problems. An examination of numerical conditioning predicts that the square root approach can yield twice the effective precision of the conventional filter in ill-conditioned problems. This prediction is verified in two examples. The excellent numerical characteristics and reasonable computation requirements of the square root approach make it a viable alternative to the conventional filter in many applications, particularly when computer word length is limited, or the estimation problem is badly conditioned.
I. INTRODUCTION

IN A significant class of filtering problems, propagation of the error covariance matrix by means of the Kalman equations [1] results in a matrix which is not positive semidefinite, a theoretical impossibility. This may occur when 1) the covariance matrix is rapidly reduced by processing very accurate measurements, 2) a linear combination of state vector components is known with great precision, while other combinations are essentially unobservable. The source of trouble in both cases is numerical computation of ill-conditioned quantities in finite word length.

To circumvent this difficulty, Potter [2] gave a method for propagating the error covariance matrix in a square root form in the absence of process noise. This method is completely successful in maintaining the positive semidefinite nature of the error covariance, and it can provide twice the effective precision of the conventional filter implementation in ill-conditioned problems. The outstanding numerical characteristics and relative simplicity of this Potter square root approach led to its implementation in the Apollo navigation filters [3].

Extensions of the Potter square root approach, and the development of an information square root filter have provided several recursive square root solutions to the discrete filtering problem. The purpose of this paper is to review and relate these square root filters, summarize the most promising computational approaches, and compare these with the conventional approach in terms of computational complexity and precision.
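The square root idea described above can be made concrete with a small sketch. The code below implements the measurement update usually attributed to Potter for a single scalar measurement with no process noise, in the form commonly given in textbooks; it is not copied from this paper. The function names (`potter_update`, `gram`, `conventional_update`) and the numerical values are hypothetical, chosen only for illustration. The point is that the covariance is never formed directly: only a factor S with P = S Sᵀ is updated, so the implied covariance S Sᵀ is positive semidefinite by construction.

```python
import math


def potter_update(x, S, h, r, z):
    """One scalar-measurement update in square-root form (illustrative sketch).

    x : state estimate, list of n floats
    S : square root of the error covariance, P = S S^T, n x n list of lists
    h : measurement row vector, list of n floats (z = h . x + noise)
    r : measurement noise variance, r > 0
    z : the scalar measurement
    Returns the updated (x, S).
    """
    n = len(x)
    # f = S^T h^T
    f = [sum(S[i][j] * h[i] for i in range(n)) for j in range(n)]
    # alpha = 1 / (h P h^T + r), since f^T f = h S S^T h^T
    alpha = 1.0 / (sum(fj * fj for fj in f) + r)
    gamma = 1.0 / (1.0 + math.sqrt(r * alpha))
    # Kalman gain K = alpha * S f = P h^T / (h P h^T + r)
    K = [alpha * sum(S[i][j] * f[j] for j in range(n)) for i in range(n)]
    residual = z - sum(h[i] * x[i] for i in range(n))
    x_new = [x[i] + K[i] * residual for i in range(n)]
    # S_new = S - gamma * K f^T, which satisfies S_new S_new^T = P - K h P
    S_new = [[S[i][j] - gamma * K[i] * f[j] for j in range(n)] for i in range(n)]
    return x_new, S_new


def gram(S):
    """Return S S^T, the covariance implied by the square-root factor S."""
    n = len(S)
    return [[sum(S[i][k] * S[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


def conventional_update(P, h, r):
    """Conventional covariance update P - P h^T (h P h^T + r)^-1 h P."""
    n = len(P)
    Ph = [sum(P[i][j] * h[j] for j in range(n)) for i in range(n)]
    s = sum(h[i] * Ph[i] for i in range(n)) + r
    return [[P[i][j] - Ph[i] * Ph[j] / s for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    S0 = [[2.0, 0.0], [1.0, 1.0]]          # hypothetical prior factor
    x1, S1 = potter_update([0.0, 0.0], S0, [1.0, 0.0], 1.0, 1.0)
    print(x1)
    print(gram(S1))                         # covariance from the square root
    print(conventional_update(gram(S0), [1.0, 0.0], 1.0))
```

In a well-conditioned case such as this one the two updates agree to rounding error; the difference the paper is concerned with appears only when the problem is ill-conditioned or the word length is short.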
Manuscript received July 12, 1971. Paper recommended by D. G. Luenberger, Associate Guest Editor. This research was sponsored in part by Research Grant NASA-NgL-05-020-007, a National Science Foundation graduate fellowship, and the U.S. Air Force. The views expressed herein are the authors' and do not necessarily reflect those of Air University, the U.S. Air Force, or the Department of Defense. P. G. Kaminski is with the U.S. Air Force Space and Missile Systems Organization, El Segundo, Calif. A. E. Bryson, Jr., is with the Department of Aeronautics and Astronautics, Stanford University, Stanford, Calif. S. F. Schmidt is with Analytical Mechanics Associates, Inc., Mountain View, Calif.
II. CONVENTIONAL FILTERING

The following notation is used to describe the discrete time problem:
observation: z(k) = C(k)x(k) + v(k)