IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. AC-18, NO. 5, OCTOBER 1973
Robert L. Carroll (S'69) was born in Turkey, N.C., on June 8, 1945. He received the B.S. degree in electrical engineering from North Carolina State University, Raleigh, in 1967, the Master of Philosophy degree in systems theory from Yale University, New Haven, Conn., in 1970, and is currently working toward the Ph.D. degree in adaptive systems theory at the University of Connecticut, Storrs, under the direction of Dr. D. P. Lindorff. While a graduate student he held an NDEA Title IV fellowship and a NASA Research Traineeship. During the summer months of 1963 he worked for the George C. Marshall Space Flight Center, Huntsville, Ala., and in 1967 for Radiation, Inc., Melbourne, Fla. His current research interest is the problem of output control of unknown systems. Mr. Carroll is a member of Phi Kappa Phi, Sigma Xi, Tau Beta Pi, Eta Kappa Nu, Phi Eta Sigma, Mu Beta Psi, and the American Daffodil Society.
David P. Lindorff (SM'62) was born in Flushing, N.Y., in 1922. He received the B.S.E.E. degree from the Massachusetts Institute of Technology, Cambridge, in 1948, the M.S.E.E. degree from the University of Pennsylvania, Philadelphia, in 1950, and the Dr. Ing. degree from T. H. Darmstadt, Germany, in 1965. Since 1951 he has been on the Electrical Engineering Faculty at the University of Connecticut, Storrs, where he is Professor and Chairman of the Systems Group. In 1958 he was a Visitor in the Department of Engineering at Cambridge University, Cambridge, England. In 1965 and 1971 he received Humboldt Fellowships for visitations at the Technical Universities in Darmstadt and Stuttgart, Germany, respectively. Dr. Lindorff is a member of Sigma Xi, Eta Kappa Nu, and the Editorial Advisory Board for the journal Computers and Electrical Engineering.
An Innovations Approach to Least-Squares Estimation-Part V: Innovations Representations and Recursive Estimation in Colored Noise

THOMAS KAILATH AND ROGER A. GEESEY

Abstract-We show that least-squares filtered and smoothed estimates of a random process given observations of another colored noise process can be expressed as certain linear combinations of the state vector of the so-called innovations representation (IR) of the observed process. The IR of a process is a representation of it as the response of a causal and causally invertible linear filter to a white-noise "innovations" process. For nonstationary colored noise processes, the IR may not always exist, and a major part of this paper is devoted to the problem of identifying a proper class of such processes and of giving effective recursive algorithms for their determination. The IR can be expressed either in terms of the parameters of a known lumped model for the process or in terms of its covariance function. In the first case, our results on estimation encompass most of those found in the previous literature on the subject; in the second case, there seems to be no prior literature. Finally, we may note that our proofs rely on, and exploit in both directions, the intimate relation that exists between least-squares estimation and the innovations representation.

Manuscript received March 31, 1972; revised February 22, 1973. Paper recommended by P. A. Frost, Chairman of the IEEE S-CS Estimation and Identification Committee, and D. L. Snyder, Chairman of the IEEE S-CS Stochastic Control Committee. This work was supported in part by the Air Force Office of Scientific Research, Air Force Systems Command, under Contract AF 44-620-69-C-0101 and in part by the Joint Services Electronics Program under Contract N00014-67-A-0112-0044. Part of the work by T. Kailath was done at the Indian Institute of Science, Bangalore, India, with the aid of a Fellowship from the John Simon Guggenheim Memorial Foundation.

T. Kailath is with the Department of Electrical Engineering, Stanford University, Stanford, Calif. 94305.

R. A. Geesey is with HQ USAF/SAGR, Washington, D.C. 20330.
I. INTRODUCTION

THE previous papers in this series [1]-[4] dealt with least-squares estimation given observation processes that contained a white Gaussian noise component. Now we turn to some linear estimation problems where this assumption can be relaxed; in particular, we shall study the linear estimation of a signal in colored additive noise and, more generally, the estimation of one random process from observations of a related but smooth random process. As in [1]-[4], we shall find that the concept of an innovations representation is very helpful in the study of such problems. The innovations representation (IR) of a process will be defined as a representation of it as the response of a causal and causally invertible linear filter to a white-noise innovations process. The innovations will be a little more difficult to find than when the observations contain white noise; but once found, they immediately yield the solutions of associated least-squares filtering and smoothing problems.
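A discrete-time sketch may help fix ideas: on a finite time grid, a positive-definite covariance matrix factors as R = LL' with L lower triangular (causal) and invertible, so L plays the role of the IR filter and its inverse the whitening filter. The covariance and grid below are illustrative choices, not taken from the paper.

```python
import numpy as np

# Discrete-time sketch of an innovations representation (IR):
# a positive-definite covariance matrix factors as R = L L' with L
# lower triangular (causal) and invertible, so y = L e maps white
# "innovations" e to the colored process, and solving L e = y whitens.
n = 50
t = np.linspace(0.1, 2.0, n)
R = np.exp(-np.abs(t[:, None] - t[None, :]))   # illustrative covariance

L = np.linalg.cholesky(R)                      # causal, causally invertible
rng = np.random.default_rng(0)
e = rng.standard_normal(n)                     # white innovations sample
y = L @ e                                      # colored observation sample
e_back = np.linalg.solve(L, y)                 # whitening recovers e

print(np.allclose(L @ L.T, R), np.allclose(e_back, e))
```

The lower-triangular factor is the finite-dimensional counterpart of the causal, causally invertible filter; the continuous-time difficulties discussed below (smooth paths, initial conditions) have no analogue on a finite grid.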
As one example of our results, consider the determination of the least-squares filtered estimate of a lumped signal process z(·) given observations

y(t) = z(t) + n(t),  0 ≤ t ≤ T < ∞   (1)

where n(·) is also a lumped process. Suppose that z(·) and n(·) are jointly Markov and that the covariance function of z(·) is of the form (2). We shall find that, as in [4], the question of whether dynamical-system models are known for z(·) and n(·) is not too important. The reason is that the innovations representation of y(·), which completely determines the solution, depends only upon the covariance function of y(·), and is independent of any particular a priori models for z(·) and/or n(·). Therefore, even if the estimate ẑ(·) is expressed in terms of a known model for z(·), it must turn out that the formulas for ẑ(·) can be rearranged to depend only upon quantities that enter into the covariances of z(·) and y(·). This simple observation is essentially the key to [4] and to this paper.

From the above comments it should not be surprising that the chief problem is the determination of the IR of observation processes. Once this problem is well understood, the filtering and smoothing problem will be quite easy, as we shall illustrate in the rather short Section V.

Suppose the observations can be written in the signal-plus-white-noise form (7), where z(·) is a finite-variance process. We showed [1, theorem 2] that the innovations process for such y(·) can be obtained as

ν(t) = y(t) − ẑ(t) = y(t) − ŷ(t)   (8)

where the caret denotes the least-squares estimate based on observations of y(·) up to t. Such a procedure breaks down when y(·) is smooth, e.g., if it has continuous paths, for then ŷ(t) = y(t). The problem of determining the innovations for smooth processes is difficult, and there are nonstationary processes that have no innovations representations. In this paper we shall describe a special class of differentiable processes that do have such innovations representations.

Consider, for example, a process with the covariance function

R₁(t,s) = min(t,s) − ts/2,  0 ≤ t,s ≤ T.   (10)

The covariance function of the first derivative of this process is

∂²R₁(t,s)/∂t∂s = δ(t − s) − 1/2.   (11)

It is true that white noise appears in the first derivative; but because −1/2 is not a covariance, it is not clear if we can write the derivative in the signal-plus-white-noise form (7), and, therefore, it is not clear if the construction (8) can be used to obtain the innovations.

Example 2: Consider a process with covariance

R₂(t,s) = (1 − |t − s|)/2,  0 ≤ t,s ≤ 2.   (12)

The covariance of its first derivative is δ(t − s). This would incorrectly suggest that the original process is integrated white noise; the difficulty, of course, is with "initial" conditions.

This second difficulty is easier to resolve; we have to treat separately the effect of the initial conditions, and this can be done by following a procedure, due to Shepp [19], that we shall describe presently. The first difficulty is resolved by noting [13], [30] that if the covariance of the differentiated process is strictly positive-definite (rather than just nonnegative-definite), then the process can always be rewritten in the signal-plus-noise form (7), so that the construction (8) can again be used. This important result is proved in some generality in [13], where the importance of the condition Eν(t)z'(s) ≡ 0 is brought out clearly; the significance of this condition is apparently only slowly being appreciated in the control literature (cf. [53]).

In the following we consider the case of vector-valued observation processes y(·) where the component processes have the same orders of differentiability or, more precisely, have the same "relative order," where this term will be defined presently. This is a strong assumption, which essentially makes this vector case the same as the scalar case; we actually only treat the vector case here for ease of reference in other contexts. Once this special vector case is understood, however, there would seem to be no essential difficulty in extending our results to the general case. It is our opinion, reinforced by the paper of Bryson and Johansen [6], that the main complication is to keep track of the various orders of differentiability of the component processes; Silverman and Anderson (personal communication) have made some progress on organizing this procedure. For more general classes of "nonlumped" processes, of course, the multivariate case can introduce further complications.
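For concreteness, the covariance R₁(t,s) = min(t,s) − ts/2 above is realized, for T = 2, by a Brownian bridge pinned to zero at t = 2, since Cov[B(t) − (t/2)B(2), B(s) − (s/2)B(2)] = min(t,s) − ts/2. A Monte Carlo check, with illustrative grid and sample count:

```python
import numpy as np

# The covariance R1(t,s) = min(t,s) - ts/2 on [0,2] is realized by a
# Brownian bridge pinned to zero at t = 2:
#   x(t) = B(t) - (t/2) B(2),  B standard Brownian motion.
rng = np.random.default_rng(1)
n_paths, n_steps, T = 100_000, 40, 2.0
dt = T / n_steps
t = np.arange(1, n_steps + 1) * dt

dB = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
B = np.cumsum(dB, axis=1)
x = B - (t / T) * B[:, -1:]        # pin every path to zero at t = T

emp = x.T @ x / n_paths            # empirical covariance on the grid
theory = np.minimum.outer(t, t) - np.outer(t, t) / 2.0
print(np.max(np.abs(emp - theory)))
```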
and Ragazzini, but somewhat unaccountably, was not raised in [28]-[31]. It happens that, in an attempt to obtain some kind of unique solution, conditions equivalent to invertibility are obtained in [28]-[31], but invertibility itself is not mentioned. On the other hand, we exploit the invertibility and its relationship to least-squares estimation extremely heavily and in both directions: in Part I of this series of papers we used it to give an easy proof of the Kalman-Bucy formulas; in Part II and in the present paper we use the IR.

For covariances of relative order α, α > 0, we determine models that can be used to generate such covariances. The main result is Lemma 1. We also describe how the somewhat difficult problem of initial conditions for smooth (colored-noise) processes can be handled by using a result of Shepp [19], which we were aware of from some of our work in detection theory. The proper choice of initial conditions has been of concern there for some time (see, e.g., [37, footnote 1] and [38]). Even in estimation theory, the physical reason for the particular form of the proper initial conditions does not seem to have been known (see, e.g., [6] and [10]); we give a simple physical explanation below.

As a prelude to these results, we may note that a major difficulty in the analysis of nonstationary lumped processes is in trying to be too general. More rapid progress can be made if attention is restricted to some proper but nontrivial subset of these processes. We shall work with what we feel is a "natural" subset, where by "natural" we mean that in the stationary case these processes should be just the well-known processes with rational power-spectral-density functions. In addition to isolating this class of processes, we shall also find very useful a well-known preliminary transformation of the usual state-variable "feedback" form of system description into an equivalent form without any state-variable feedback but with the same impulse response. Therefore, instead of plunging right away into the calculation of the IR, we shall first go somewhat slowly through some definitions and some easy transformations and calculations. Despite our best efforts, there is a considerable amount of algebraic detail in this section and the next; however, this is apparently unavoidable and perhaps explains why, despite the early efforts of Darlington, Batkov, Stear, and others (cf. the references in [33]), a complete solution has taken so long to appear. The casual reader might find it helpful to ignore all proofs except perhaps that of Theorem 1.

We shall study processes with covariance functions of the form

R(t,s) = A(t)B(s)1(t − s) + B'(t)A'(s)1(s − t)   (13)

where, as explained in Section I, A(·) and B(·) are matrices of compatible dimensions. We shall assume, without loss of generality, that the columns of A(·) and B(·) are linearly independent over [0,T]. In order to be able to find innovations (or canonical) representations for the process, say y(·), with covariance (13), we shall have to make certain smoothness assumptions on the elements of A(·) and B(·). We shall assume that there is a finite integer α, α > 0, such that the conditions (14a)-(14c) hold for 0 ≤ t ≤ T. The integer α will be called the relative order of the process and of the covariance R(·,·). It can be verified that a stationary process with a rational power-spectral-density function will satisfy the above conditions, with the relative order α being equal to one-half the difference between the degrees of the denominator and numerator polynomials of the spectral density. It may be noted that a process of relative order α will have α − 1, and no more, derivatives in the mean-square sense. We recall (Example 0) that there are processes (usually degenerate) for which the relative order may be infinite.

Next we shall examine the properties that a lumped Markov process determined by a state-variable model must have if its covariance has relative order α. A general Markov model is

ẋ(t) = F(t)x(t) + G(t)u(t),  x(0) = x₀   (15a)

y(t) = H(t)x(t)   (15b)

where F(·), G(·), H(·) are known functions, and

Ex₀ = 0,  Ex₀x₀' = Π(0),  Eu(t)x₀' ≡ 0   (15c)

Eu(t) = 0,  Eu(t)u'(s) = Iδ(t − s).   (15d)

Note that though this model is causal, it need not be causally invertible. To see most clearly the constraints that the relative-order properties (14) impose on the model (15), it is convenient to first make a well-known preliminary state transformation of the model. Thus let T(t) be a nonsingular, and continuously differentiable, matrix such that

F(t) = Ṫ(t)T⁻¹(t).   (16)

There can be many such T(·); in particular, the state-transition matrix (or fundamental matrix) of F(·) is readily seen to be a suitable T(·). Now if we set

ξ(t) = T⁻¹(t)x(t)   (17)

then it is easy to see that the system can be rewritten as

ξ̇(t) = Ĝ(t)u(t),  ξ(0) = ξ₀,  Eξ₀ξ₀' = Π̂(0)   (18a)

y(t) = Â(t)ξ(t)   (18b)

where u(·) is as in (15), but

Π̂(0) = [T⁻¹(0)]Π(0)[T⁻¹(0)]'   (19a)

and

Ĝ(t) = T⁻¹(t)G(t),  Â(t) = H(t)T(t).   (19b)
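The feedback-removing transformation (16)-(19) can be spot-checked numerically for a constant F, for which T(t) = e^{Ft} satisfies (16); the matrices below are illustrative choices, not from the paper.

```python
import numpy as np
from scipy.linalg import expm

# Check of the feedback-removing transformation (16)-(19): with
# T(t) = expm(F t) (a valid choice, since dT/dt = F T), the feedback-free
# model  d(xi)/dt = Ghat(t) u,  y = Ahat(t) xi,  with Ghat(t) = inv(T(t)) G
# and Ahat(t) = H T(t), has the same impulse response
#   h(t,s) = H expm(F (t-s)) G = Ahat(t) Ghat(s),   t >= s.
F = np.array([[0.0, 1.0], [-2.0, -3.0]])
G = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])

def h_feedback(t, s):
    return H @ expm(F * (t - s)) @ G

def h_no_feedback(t, s):
    Ahat = H @ expm(F * t)          # Ahat(t) = H T(t)
    Ghat = expm(-F * s) @ G         # Ghat(s) = inv(T(s)) G
    return Ahat @ Ghat

for t, s in [(1.0, 0.3), (2.0, 1.5), (0.7, 0.0)]:
    assert np.allclose(h_feedback(t, s), h_no_feedback(t, s))
print("impulse responses agree")
```

The agreement holds because expm(F t) and expm(-F s) commute with F; for time-varying F(·) the same check goes through with the transition matrix in place of the matrix exponential.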
R(t,s) = A(t)Π(0)A'(s) + ∫₀^{t∧s} {[(t − τ)(s − τ)]^{α−1}/[(α − 1)!(α − 1)!]} ⋯ dτ   (27)
Relative Order of Models and Covariances

The formulas (27) bring out the physical content of the statement that the covariance of a lumped process has relative order α. Note that (27) will always be true for stationary lumped Markov processes of relative order α. It is important to note that all models generating the covariance R(t,s) share the smoothness properties of R(t,s). This fact is not true in discrete time and is the source of certain somewhat unexpected differences between discrete-time and continuous-time results (cf. [39] and [40]).

It may be useful to formalize the above statement as follows. We shall say that an impulse response

h(t,τ) = φ(t)ψ(τ),  t ≥ τ   (28)

has relative order α (α ≤ n) if φ(·) has α derivatives and

[φ(t) φ^{(1)}(t) ⋯ φ^{(α−1)}(t)]ψ(t) = [0 ⋯ 0 r(t)]   (29)

where φ(·), ψ(·), and φ^{(α)}(·)r(·) are all square integrable. Then we can state the following lemma.

Lemma 2: Let h(t,s) be the impulse response of a lumped system whose response to initial conditions and a white-noise input has covariance R(t,s). Then R(t,s) has relative order α if and only if h(t,s) has relative order α.
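The earlier rule that a rational spectral density has relative order equal to one-half the denominator-numerator degree difference can be checked mechanically; working with polynomials in x = ω² halves both degrees, so the difference of x-degrees gives α directly. The density below is an illustrative choice.

```python
import numpy as np

# Relative order of a stationary process with rational spectral density:
# alpha = (deg of denominator - deg of numerator) / 2, degrees taken in
# the frequency variable w.  In x = w^2 both degrees are halved, so the
# difference of x-degrees equals alpha directly.
num = np.polynomial.Polynomial([4, 1])      # x + 4        -> w^2 + 4
den = np.polynomial.Polynomial([9, 10, 1])  # x^2 + 10x + 9 -> (w^2+1)(w^2+9)

alpha = den.degree() - num.degree()
print(alpha)  # relative order of the process
```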
The Initial Conditions

As pointed out in Example 2, initial conditions will be lost in the process of differentiation; therefore their effect must be treated separately.

Proofs of Theorem 1 and Corollaries 1 and 2: We shall first determine the IR for the process y_r(·) and then shall show that the initial-condition process can be easily combined with the IR for y_r(·). Now for the process y_r(·) we have the relations [cf. (46) and (48)] y_r(t) = Â(t)x_r(t), and

P_r(t) = Ex̃_r(t)x̃_r'(t),  x̃_r(t) = x_r(t) − x̂_r(t)   (61)

satisfies (55). The hypothesis (14a) that A^{(α)}(·) is a continuous function ensures the existence and uniqueness of P_r(·). The proof essentially follows from the observation that ...
Suppose that y(·) has a covariance of the form

R(t,s) = A(t)B(s)1(t − s) + B'(t)A'(s)1(s − t)

and suppose that y(·) has a model of the form (51) where, however, Π₀ and H(0) may not be known. Then the IR of y(·) can be written in the same form as in Theorem 1, viz., as in (52) and (53), except that K(·) is expressed as

K(t) = [B^{(α)}(t) − Z(t)A^{(α)'}(t)]r⁻¹(t)   (71)

where Z(·) is the unique nonnegative-definite solution of the Riccati equation

Ż = KK' = [B^{(α)} − ZA^{(α)'}]r⁻²[B^{(α)} − ZA^{(α)'}]'   (72a)

Z(0) = Q_B(0)[Q_A(0)Q_B(0)]⁻¹Q_B'(0).   (72b)

(The model assumption here is not unreasonable, but there are other, potentially weaker, conditions; in particular, it suffices if the function R_r(t,s) of (49) is strictly positive definite on L² ⊗ L², where L² is the space of square-integrable functions. These conditions are established and further discussed in Appendix III. We note again that the result of Theorem 2 was obtained in a different way by Moore and Anderson [29]; they, however, were not aware of the canonical (invertible) nature of this representation.)

Corollary 3-Whitening Filter Given the Covariance Function: The innovations process {ν(0), ν(·)} can be obtained from y(·) by the same construction as in Corollary 1, viz., via (56) and (57), except that K(·) should now be given by (71) and (72).

Proofs: Our task is to rewrite the IR in Theorem 1 in terms of the functions A(·) and B(·). This means that we must express K(·) in terms of A(·) and B(·). To this end let us introduce the covariance matrix of the states x̂_r(·) of the IR (60) for y_r(·):

Σ_r(t) = Ex̂_r(t)x̂_r'(t) = Π_r(t) − P_r(t).   (73)

Then we have

K(t)r(t) = [P_r(t)A^{(α)'}(t) + ψ̂(t)C'(t)] = [Π_r(t)A^{(α)'}(t) + ψ̂(t)C'(t) − Σ_r(t)A^{(α)'}(t)] = [B_r^{(α)}(t) − Σ_r(t)A^{(α)'}(t)]   (74)

where we have used the formula (50) for B_r^{(α)}(·). But now observe from the formula (47c) for Π_r and (55) for P_r that

dΣ_r/dt = KK' = [B_r^{(α)} − Σ_rA^{(α)'}]r⁻²[B_r^{(α)} − Σ_rA^{(α)'}]',  Σ_r(0) = 0   (75)

so that Σ_r(·), and hence K(·), can be calculated directly from A(·) and B_r(·). By a simple change of variables we can express K(·) in terms of A(·) and B(·), not B_r(·). Thus let

Z(t) = Σ_r(t) + Q_B(0)[Q_A(0)Q_B(0)]⁻¹Q_B'(0) = Σ_r(t) + Z₀, say.   (76)

Then, using (76) and (47), the function K(·) of (54) can be written

K(t)r(t) = [B_r^{(α)}(t) + Z₀A^{(α)'}(t) − Z(t)A^{(α)'}(t)] = [B^{(α)}(t) − Z(t)A^{(α)'}(t)]   (77)

and, by (75) and (76),

Ż = KK',  Z(0) = Z₀.   (78)

Finally, with K(·) defined by (77) and (78), and ỹ(·) replaced by ν(·), the IR (52) of Theorem 1 is completely expressed in terms of the parameters A(·) and B(·) of the covariance function. This completes the proof of Theorem 2. Corollary 3 follows similarly from (69) and (70) and (77) and (78).

Reduced-Order Riccati Equation

Our Riccati equations (55) and (72) are n × n matrix equations. But the proofs of Theorems 1 and 2 show that these equations are obtained by use of the matrix Z(t), which, though it has dimension n, is subject to certain constraints because of the relative-order properties. In particular, by (24), we have the algebraic constraints

Π(t)Q_A'(t) = Q_B(t)   (79)

where Q_A(·) and Q_B(·) are p × n and n × p matrices, respectively, where n is the dimension of x and p the dimension of y. It turns out (cf. Appendix I) that the matrix Z(·), which essentially determines the IR, is similarly constrained, viz.,

Z(t)Q_A'(t) = Q_B(t),  0 ≤ t ≤ T.   (80)

We can exploit these algebraic constraints on Z(·) to reduce, by a suitable change of variables, the order of the Riccati equation that we have to solve to obtain the IR. We shall illustrate the procedure for the IR based on the parameters of the covariance function; similar, but less explicit, results were also obtained by Brandenburg [31]. For notational simplicity, we shall consider only the scalar case, where p = 1. First, some experimentation will show that the transformation matrix for the change of variables should be of the form given in (81), where the (n − α) × n matrix T₁(·) is arbitrary provided that it is differentiable on [0,T] and that the matrix T(t) is nonsingular for all t. (In many problems it happens that rows of the matrices A^{(α)}(·), A^{(α+1)}(·), ..., A^{(n)}(·) can be taken for the rows of T₁(·); see Appendix II.) Then we change variables from Z(·) to Ẑ(·), where

Ẑ(t) = T(t)Z(t)T'(t).   (82)

Also let

m = n − α,  F̂(t) = Ṫ(t)T⁻¹(t)   (83)

and let us partition the n × n matrices Ẑ(·) and F̂(·) as in (84). Then it turns out that all the submatrices of Ẑ except Ẑ_mm are immediately determined from A(·), B(·), and T₁(·) as in (85).
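In the scalar case, the relative-order machinery rests on relations such as A(t)B^{(1)}(t) − A^{(1)}(t)B(t) = r²(t) between the covariance factors; this is easy to check numerically for an assumed pair of exponential factors (the pair below is illustrative).

```python
import numpy as np

# Scalar relative-order check for a separable covariance R(t,s) = A(t)B(s),
# t >= s: definite relative order 1 requires the "jump"
#     r(t)^2 = A(t)B'(t) - A'(t)B(t)
# to be positive; for the factors below it is identically 1.
A  = lambda t: np.array([np.exp(-t), np.exp(-3 * t)])
dA = lambda t: np.array([-np.exp(-t), -3 * np.exp(-3 * t)])
B  = lambda t: np.array([(3 / 16) * np.exp(t), (5 / 48) * np.exp(3 * t)])
dB = lambda t: np.array([(3 / 16) * np.exp(t), (15 / 48) * np.exp(3 * t)])

jumps = [A(t) @ dB(t) - dA(t) @ B(t) for t in (0.0, 0.5, 1.3)]
print(jumps)  # constant, as expected for a stationary factorization
```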
Having determined Ẑ, we can go back via the transformation T to the matrix Z and from thence to the innovations representation. However, if our interest is in the state-variable models (71) and (72) for the IR and the whitening filter, then we can work directly with Ẑ if we apply the same transformation T to the models (71) and (72). After some algebraic manipulations, we obtain the following result.

Theorem 3-Modified IR: With the various quantities as defined in (81)-(86), we can write the IR as in (87) and the corresponding whitening filter as in (88).

IV. EXAMPLES

We shall illustrate the above procedures, and also obtain some further insights into the innovations representation, by considering some examples.

Example 3-A Stationary Process Over a Finite Interval: Consider a process y(·) over a finite interval [0,T], where y(·) is stationary with power spectral density

S(ω) = (ω² + 4)/[(ω² + 1)(ω² + 9)]

and corresponding covariance function

R(t,s) = (3/16)e^{−|t−s|} + (5/48)e^{−3|t−s|},  0 ≤ t,s ≤ T.   (90)

There are several linear systems driven by white noise that will yield an output process y(t), t ∈ [0,T], having the covariance (90) on [0,T] × [0,T]. For example, the factorization of the spectral density that takes the left-half-plane (LHP) poles and zeros in the s plane, s = σ + jω, of S(s) has the realization (91a)-(91b), and it has covariance (90) on [0,T] × [0,T] for the initial-condition covariance matrix given in (91b). This initial-condition matrix is found as the solution of the linear Lyapunov matrix equation

FΠ₀ + Π₀F' + GG' = 0   (91c)

where F is the 2 × 2 matrix and G is the 2 × 1 matrix in the state equation (91a). An alternate realization is a system whose transfer function contains the LHP poles and the RHP zeros of S(s); this realization is (92a)-(92b). The initial-condition matrix is again determined from (91c), with F and G now being as in (92a).

However, neither of these representations has the canonical property, since the systems in (91) and (92) are not invertible; the difficulty is that the initial conditions are not suitably constrained and cannot be determined from y(·). The relative order of each system is 1, but the covariance matrices of the initial conditions are of rank 2, requiring two independent initial-condition random variables for the representation. Relative order 1 implies that only a single initial condition can be determined from the output [namely y(0) = x₁(0)]. Thus in these representations the white noise u(·) and the random variables (x₁(0), x₂(0)) have a larger span than that of the observed process {y(t), t ∈ [0,T]}.

To determine the IR, we first note that the covariance function has the separable form R(t,s) = A(t)B(s)1(t − s) + B'(t)A'(s)1(s − t) with

A(t) = [e^{−t}  e^{−3t}],  B'(t) = [(3/16)e^{t}  (5/48)e^{3t}].

Then AB^{(1)} − A^{(1)}B = 1, and therefore the definite relative order is α = 1, with Q_A(t) = A(t), Q_B(t) = B(t), and r(t) ≡ 1. The two-dimensional separation of the covariance function implies that the representation will be a second-order system, and the definite relative order 1 shows that the difference between the number of poles and zeros in the transfer function of the system is 1, as is also easily seen from the spectral density.

The equivalence transformation T(t) leading to the realization of the IR given in (87) is conveniently formed for this example by taking T₁(t) = A^{(1)}(t), which is linearly independent of A(t). Thus

F̂(t) = Ṫ(t)T⁻¹(t) = [0  1; −3  −4].

This corresponds to the system matrix for the coordinates used in the above noncanonical realizations (91) and (92). The reduced-order Riccati equation is the scalar differential equation [cf. (86a)]

dẐ₂₂/dt = −8Ẑ₂₂ + (Ẑ₂₂ + 9/8)² + 3   (93a)

with initial condition

Ẑ₂₂(0) = (1/2)(24/7)(1/2) = 6/7.   (93b)

The Riccati variable enters the realization as K̂(t) [cf. (86d)], and since the Riccati variable in this example is scalar, we may obtain a differential equation for K(·) directly, since

K = −Ẑ₂₂ − 9/8.   (94)

The canonical representation is then given by (95). The canonical property of this representation is evident from the appearance of a single initial-condition random variable ν(0), which is determined from the observation process as ν(0) = √(24/7) y(0). The complete whitening filter is obtained from (88), or by simply inverting (95), through the relations

dw/dt = −[4 + K]w − 3y + Ky,  w(0) = −(12/7)y(0)   (96a)

ν = −w + ẏ,  ν(0) = √(24/7) y(0).   (96b)

The most notable feature of the IR obtained for this example is the appearance of a time-variant gain in (95a) for the representation of a stationary process. It is because the representation problem considered here involves a finite time interval that the time-variable function K(t) arises. The IR in (95) actually converges to the stationary representation (91), since as t → ∞, lim K(t) = −2 or −6, with K∞ = −2 being the stable solution of (94) obtained for the initial condition K(0) = −111/56, while K∞ = −6 is an unstable steady-state value that corresponds to the representation (92). Notice that the initial-condition covariance matrix for the IR shown in (95b) is of rank 1 and "smaller" than the covariance matrix of either of the representations (91) and (92). Also, the covariance matrix of initial conditions for the "nonminimum-phase" representation (92) is "larger" than that of the "minimum-phase" representation (91). This is an interesting general property, which is further discussed in Appendix I.

Example 4-The Stationary Process of Example 2: We shall reconsider the covariance

R₂(t,s) = (1 − |t − s|)/2,  0 ≤ t,s ≤ 2   (97)

which can also be written in the separable form

R(t,s) = A(t∨s)B(t∧s),  A(t) = (1/2)[1 − t  1],  B'(t) = [1  t].   (98)

The definite relative order is α = 1, since AB^{(1)} − A^{(1)}B = r² = 1; then Q_A(t) = A(t) and Q_B(t) = B(t). Forming the equivalence transformation as before, with T₁(t) = A^{(1)}(t), we find

F̂(t) = Ṫ(t)T⁻¹(t) = [0  1; 0  0].

The reduced-order Riccati equation is the scalar differential equation

dẐ₂₂/dt = Ẑ₂₂²,  Ẑ₂₂(0) = 1/2   (99)

which has the closed-form solution

Ẑ₂₂(t) = 1/(2 − t),  0 ≤ t < 2.
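This closed form is easy to confirm by direct numerical integration of the scalar Riccati equation (99); the step size and horizon below are illustrative choices (the horizon stays safely below the escape time t = 2).

```python
# RK4 integration of the scalar Riccati equation of Example 4:
#   dS/dt = S^2,  S(0) = 1/2,  with closed form S(t) = 1/(2 - t), t < 2.
def rk4_step(f, t, y, h):
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

f = lambda t, s: s * s
h, S, t = 1e-3, 0.5, 0.0
while t < 1.5 - 1e-12:            # stay well below the escape time t = 2
    S = rk4_step(f, t, S, h)
    t += h

print(abs(S - 1.0 / (2.0 - t)))   # discrepancy with the closed form
```

Any standard ODE integrator gives the same agreement; RK4 is used here only to keep the sketch dependency-free.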
The function K(·) appearing in the innovations representation is K = −Ẑ₂₂, so that the IR can be written as in (100a), with

[x₁(0)  x₂(0)] = [1/√2  −1/√2]ν(0).   (100b)

The present example provides a good illustration of the fact that the state-variable representation of the IR is not necessarily minimal (i.e., of lowest state dimension) even if the elements of the A(·) and B(·) matrices in the covariance function are linearly independent. Thus in the present problem we can calculate that the impulse response (with zero initial conditions, of course) of the IR is (t − 2)/(s − 2), which yields the representation

y(t) = (1 − t)ν(0)/√2 + (t − 2)∫₀ᵗ [ν(s)/(s − 2)] ds.

But this simple explicit representation shows that we can obtain a one-dimensional state representation, alternative to (100), as in (101)-(102); the solution of (104) will also yield the closed form (103). To conclude this example, and our discussion of IR's for colored noise, we may note that the forms (101)-(103) for the IR and the whitening filter could, in this example, be directly obtained by using the general technique described in [14] (the actual calculations are shown in [26, examples 11 and 13]).

We turn now to some applications to filtering and smoothing of the above innovations representations.

V. SOME APPLICATIONS TO FILTERING AND SMOOTHING

We shall apply the above theorems and corollaries to some least-squares estimation problems. We shall first treat problems in which lumped Markov models are given for the signal process and for the observations. Then we shall solve problems in which only the covariance information is known. As noted in the introduction, our results for the first class of problems essentially encompass all previously known results on this problem, as found in [5]-[12]. Furthermore, our derivations, especially for the smoothing problem, are generally much more transparent. For problems of the second class, there seems to be no prior literature.

Our discussions for processes with known models will be based on Theorem 1 and Corollaries 1 and 2.
where x̂(·) is the filtered estimate and

ν(t) = r⁻¹(t)[y^{(α)}(t) − A^{(α)}(t)x̂(t)]   (109)
and

P(t,s) = Ex̃(t)x̃'(s),  x̃(t) = x(t) − x̂(t).   (110)
By using t,heeasily derived fornlula (cf. [l,Appendix I])
P(t) = EZ(t)Z’(t) (111)
P(t,s) = P(t)CP’(s,t),
is the fundament,al matrix of (105), viz., the where unique solution of e )
@(e,
d@(t,s)/dt
=
+(o,o) = I ,
-K(t)r-l(t)B(“’(t)CP(t,s),
(112)
=
--Kr-1A(*)g
+ (+ - K r - l c ) u ,
where X(
a )
+ P(t>X(t>
$ ( t l ~ )= [++’
-
(K - P A ’ ( ~ ) ~ - ~ ) K ~ ] x ( ~ )
+ ( K - Pat(a)r-l)v(t) =
(113)
[++’ - ( K -
~~’(“)r-l)~’]~-l[ii(tJ~
- ?(t)]
is the so-called adjoint variable,
JT
- ?,,.
Here

  λ(t) = ∫_t^T Φ'(s,t) A^{(α)'}(s) r^{−1}(s) ν(s) ds   (114)

and the function P(·) is given by [cf. (55)]

  P(t) = P_r(t) + Q_B(0) [Q_A(0) Q_B(0)]^{−1} Q_B'(t)   (115)

where P_r(·) obeys the differential equation (55). The adjoint variable λ(·) satisfies the differential equation

  λ̇(t) = A^{(α)'}(t) r^{−1}(t) K'(t) λ(t) − A^{(α)'}(t) r^{−1}(t) ν(t),   λ(T) = 0.   (116)

As in Part II, the basic fixed-interval formula (113) can easily be rearranged to yield the proper formulas for fixed-point and fixed-lag smoothing. For reasons of space, we shall not give the formulas here. We should note that the Rauch-Tung-Striebel formula of Part II [2, eq. (34)] has no analog in the present problem [cf. the discussion in the paragraph below (118)]. Finally, for future reference, we note that for the related process w(·) of (106), the smoothed estimate can be written

  ŵ(t|T) = M_w(t) x̂(t|T) = ŵ(t) + M_w(t) P(t) λ(t)   (117)

where P(·) and λ(·) are as in (115) and (116), respectively.

Proof of the Smoothing Formulas: The arguments are similar to those of Part II. We first replace the given observations {y(·)} by the innovations process {ν(0), ν(·)}. Then, by direct calculation, using the orthogonality property of least-squares estimates, we have
.) a.re coloredprocesses where z( .) and Y( lumped models of the form
X,
= +z,u.z,
&(O)
=
m,
withknown
Ex~~x= . ~ ITzo ’
(12la)
e )
+
lT
? ( t ( T ) = E[x(f)V‘(O)]V(O)
z
=
Ezl.z(t)uz’(s)= I6(t - s),
A&
Eu,(t)~o’--=O (121b)
and
X,
= 2’
E[x(t)v’(s)]v(s) ds
=
EX,~X,~’ = IIov, Euo(t)xn0’ = 0 (122a) A,x,, E,uv(t)uz’(s)= Z6(1 - s). (122b)
We also assume that. =
530
+ ~TE[x(t)v’(s)lv(s) ds.
(11s)
But
+ u’(s)C’(s) 11
E[x(t)v’(s)lr(s) = E ( x ( t )[Z’(s)A’(“)(s) =
~ ( t , s ) ~ ’ ( * ) ( s ) , for s
> t.
This est.ablishes (108)-(110). Formulas (111) and (112) from follow the differential equat,ion for a( .) :
Eu,(t)u,’(s) = I!L,(t)G(t - s),
E X ~ O X= ~ OI L‘o .
(123)
These assumptions imply that. z( .) and v( - ) arc joint,ly Markov. We can reduce this problem t o t h e one we first. studied by defining
x!(.)
=
[xz’(.>rxn’(.)l,
#(.)
=
diad#z(.);#.o(.>l,
u t ( . ) = [u,’(.):u,‘(~)] (124a.)
44s
IEEE T R ~ ~ X S A ~ I O ON S S AUTOJLLTIC CONTROL, OCTOBER
A(.)
[xZo‘~x,~’] (124b)
XO’ =
= [Az(.)
for then ~e can write
X(t)
=
+(t)u(t),
x(0) =
y(t) = A ( t ) X ( t ) .
XO?
(12.3)
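The stacking in (124)-(125) is mechanical but easy to get wrong with dimensions. A minimal discrete-time numerical sketch of the same idea (all matrices and numbers are our own illustration, with an explicit transition term rather than the paper's Φ-driven form): block-diagonal dynamics and a concatenated output row reproduce y = z + v exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical discrete-time analogs of the lumped models (121)-(122):
# signal:  x_z[k+1] = Fz x_z[k] + u_z[k],  z[k] = Az x_z[k]
# noise:   x_v[k+1] = Fv x_v[k] + u_v[k],  v[k] = Av x_v[k]
Fz = np.array([[0.9, 0.1], [0.0, 0.8]]); Az = np.array([[1.0, 0.0]])
Fv = np.array([[0.7]]);                  Av = np.array([[1.0]])

# Stacked model, as in (124)-(125): block-diagonal dynamics and a
# concatenated output row, so that y = A x = z + v.
F = np.block([[Fz, np.zeros((2, 1))], [np.zeros((1, 2)), Fv]])
A = np.hstack([Az, Av])

N = 50
uz = rng.normal(size=(N, 2)); uv = rng.normal(size=(N, 1))
xz = np.zeros(2); xv = np.zeros(1); x = np.zeros(3)
for k in range(N):
    # the two separate models
    xz = Fz @ xz + uz[k]; xv = Fv @ xv + uv[k]
    # the stacked model driven by the stacked noise
    x = F @ x + np.concatenate([uz[k], uv[k]])

y_sep = (Az @ xz + Av @ xv).item()   # z + v from the separate models
y_stk = (A @ x).item()               # A x from the stacked model
assert np.isclose(y_sep, y_stk)
```

The point of the exercise is only that the stacked pair (F, A) carries exactly the same information as the two separate models, so any filter derived for the single model (125) applies unchanged.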
Now x̂(·) can be found as before, viz., via (105), and then by (106) and (107) we will have

  ẑ(t) = [A_z(t) ⋮ 0] x̂(t).   (126)

We may note that we can also estimate v(·) by

  v̂(t) = [0 ⋮ A_v(t)] x̂(t) = y(t) − ẑ(t).   (127)

The smoothed estimates can also be found, as

  ẑ(t|T) = [A_z(t) ⋮ 0] x̂(t|T)   (128)

where x̂(t|T) is obtained via (108) or (113)-(116).

The Special Case α = 1: Several of the previous papers on this problem have restricted themselves to the special case α = 1 [7]-[9]. As noted by Bryson and Johansen [6] and others, in this special case differentiation can be avoided by using a special trick well known in analog-computer practice. However, this trick does not seem to work when α > 1.

Relation to Observer Theory: It is well known (see, e.g., Åström [43, pp. 156-158]) that with additive white noise the Kalman filter can be obtained by assuming a feedback structure for the estimator and choosing the feedback gain to minimize the error variance. The particular assumed feedback structure is sometimes called an observer, because if the feedback gain is chosen differently one obtains the so-called Bertram-Bass-Luenberger observer for deterministic systems. Similar calculations can be made with colored noise [·], but now the optimum conditional-mean estimate will be obtained only when α = 1; when α > 1, the so-called observer is suboptimum because it does not use higher order derivatives of y(·). This is sometimes regarded as an advantage of the observer, but the real point is one of proper modeling: if one cannot use derivatives, then one should not start with a model in which α > 1.

We turn next to problems in which only the covariance functions are known.
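For contrast with the colored-noise case, where (119) shows that the Rauch-Tung-Striebel rearrangement fails, both smoothing forms exist and agree in the standard discrete-time additive-white-noise problem. A self-contained sketch (the model, numbers, and discrete recursions are ours, not the paper's): the adjoint variable lam is driven purely by the stored innovations, and x_filt + P_filt @ lam reproduces the Rauch-Tung-Striebel smoother.

```python
import numpy as np

rng = np.random.default_rng(1)
F = np.array([[1.0, 0.1], [0.0, 1.0]])   # dynamics
H = np.array([[1.0, 0.0]])               # observation row
Q = 0.05 * np.eye(2); R = np.array([[0.5]])
N = 40

# simulate one trajectory
x = np.zeros(2); ys = []
for _ in range(N):
    x = F @ x + rng.multivariate_normal(np.zeros(2), Q)
    ys.append(H @ x + rng.multivariate_normal(np.zeros(1), R))

# Kalman filter, storing everything both smoothers need
xp = np.zeros(2); Pp = np.eye(2)
xf, Pf, xpred, Ppred, innov, Sinv = [], [], [], [], [], []
for y in ys:
    e = y - H @ xp; S = H @ Pp @ H.T + R; K = Pp @ H.T @ np.linalg.inv(S)
    xu = xp + K @ e; Pu = (np.eye(2) - K @ H) @ Pp
    xpred.append(xp); Ppred.append(Pp); xf.append(xu); Pf.append(Pu)
    innov.append(e); Sinv.append(np.linalg.inv(S))
    xp = F @ xu; Pp = F @ Pu @ F.T + Q

# Rauch-Tung-Striebel backward pass
xs_all = [None] * N; xs_all[-1] = xf[-1].copy(); xs = xs_all[-1]
for k in range(N - 2, -1, -1):
    C = Pf[k] @ F.T @ np.linalg.inv(Ppred[k + 1])
    xs = xf[k] + C @ (xs - xpred[k + 1])
    xs_all[k] = xs

# Bryson-Frazier (adjoint) form: x_s[k] = x_filt[k] + P_filt[k] lam[k],
# with lam driven backward by the innovations (lam at the final step is 0)
lam = np.zeros(2)
for k in range(N - 2, -1, -1):
    G = H.T @ Sinv[k + 1]
    lam = F.T @ (G @ innov[k + 1]
                 + (np.eye(2) - G @ H @ Ppred[k + 1]) @ lam)
    assert np.allclose(xf[k] + Pf[k] @ lam, xs_all[k])
```

The algebraic identity behind the agreement is exactly the identification that (119) shows is unavailable in the colored-noise problem: here the coefficient of the innovation inside lam is determined entirely by the filter quantities.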
B. Recursive Estimation Given Covariance Functions

We shall present analogs of some of the results of the first half of this section.

Filtered Estimates: We are given observations of a lumped Markov process y(·) with covariance function

  R(t,τ) = A(t)B(τ),   for t ≥ τ   (129)

and of definite relative order α. We wish to find the least-squares filtered estimate of a signal process w(·) that is such that

  E w(t) y'(τ) = M_w(t) B(τ),   0 ≤ τ ≤ t ≤ T.   (130)

We shall show that

  ŵ(t) = M_w(t) x̂(t)   (131)

where x̂(·) is the state vector of the whitening filter for the process y(·), i.e., x̂(·) is given by the differential equation (cf. Corollary 3)

  ẋ̂(t) = K(t) r^{−1}(t) [y^{(α)}(t) − A^{(α)}(t) x̂(t)],   x̂(0) = Q_B(0) [Q_A(0) Q_B(0)]^{−1} Y(0)

where K(·) is given by (71) and (72).

Proof of (131): Since we know that R(t,s) is the covariance of some lumped Markov process, suppose we have the model

  ẋ(t) = Φ(t) u(t),   x(0) = x_0,   y(t) = A(t) x(t)

with

  E u(t) u'(τ) = I δ(t − τ),   E x_0 x_0' = Π_0

and

  E x(t) x'(τ) = Π(t ∧ τ),   Π(t) = Π_0 + ∫_0^t Φ(τ) Φ'(τ) dτ

so that R(t,τ) = A(t) Π(t ∧ τ) A'(τ) and B(t) = Π(t) A'(t). Now it is easy to see that the following is consistent with our assumption on E[w(t)y'(τ)]:

  w(t) = M_w(t) x(t) + γ(t)   (132)

where γ(·) is a zero-mean process uncorrelated with y(·). Therefore,

  ŵ(t) = M_w(t) x̂(t)   (133)

and if we refer to Corollary 2 and the proof of Theorem 2, we see that x̂(·) can be expressed entirely in terms of the parameters A(·) and B(·) of the covariance function; in fact, x̂ is just the state vector of the whitening filter, as described in Corollary 3, for y(·).

Smoothed Estimates: To calculate the smoothed estimate ŵ(t|T), we shall need to supplement the cross-covariance information (130) for t ≥ τ with a specification for t ≤ τ. However, it will be clear from the proof of (131) that this should be done in a way that is consistent with the assumption (132) on w(·). For this, we must have

  E w(t) y'(τ) = M_w(t) E[x(t) x'(τ)] A'(τ)   (134a)
             = M_w(t) Π(τ) A'(τ),   for τ ≤ t   (134b)
             = M_w(t) Π(t) A'(τ),   for τ ≥ t.   (134c)

With this calculation in mind, we shall assume for the smoothing problem that the cross covariance between w(·) and y(·) is

  E w(t) y'(τ) = N_w'(t) A'(τ),   0 ≤ t ≤ τ ≤ T.   (135)

We shall show that

  ŵ(t|T) = ŵ(t) + [N_w(t) − Σ(t) M_w'(t)]' λ(t)   (136)

where λ(·) is as in (114) and (116), and ν(·), Σ(·), and K(·) are as in Theorem 2 and Corollary 3 [(71) and (72)].
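The covariance-only filtered estimate (129)-(133) has a transparent finite-dimensional analog: factor the observation covariance matrix as R0 = L L' by Cholesky (the discrete counterpart of the causal factorization R0 = (I + k)(I + k*)), whiten the data causally, and read the estimate off the whitened record. A numerical sketch, with all matrices our own construction:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8

# A discrete stand-in for R0 = I delta + K with K symmetric:
G = rng.normal(size=(n, n))
R0 = np.eye(n) + 0.1 * (G @ G.T)        # positive definite by construction

# Causal factorization R0 = L L'
L = np.linalg.cholesky(R0)

# Whitening: e = L^{-1} y has identity covariance, and e[k] depends
# only on y[0..k] because L^{-1} is lower triangular (a causal filter).
Linv = np.linalg.inv(L)
assert np.allclose(Linv @ R0 @ Linv.T, np.eye(n))
assert np.allclose(np.triu(Linv, 1), 0)   # strictly upper part vanishes

# Estimation from covariance data alone: with C = E[w y'], the batch
# least-squares estimate is w_hat = C R0^{-1} y = (C Linv') (Linv y),
# i.e., a memoryless combination of the whitened data.
C = rng.normal(size=(1, n))               # hypothetical cross-covariance
W = C @ Linv.T @ Linv                     # equals C @ inv(R0)
assert np.allclose(W, C @ np.linalg.inv(R0))
```

This is only the batch picture; the continuous-time results above say much more, namely that the whitening operation can be realized recursively by the filter built from A(·) and B(·).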
KAILATH AND GEESEY: LEAST-SQUARES ESTIMATION-PART V

Proof of (136): Since we have assumed that the cross-covariance information is consistent with the assumption that

  w(t) = M_w(t) x(t) + γ(t),   0 ≤ t ≤ T   (137)

we can begin with the previously derived formula (117),

  ŵ(t|T) = ŵ(t) + M_w(t) P(t) λ(t)

and try to express it entirely in terms of the functions A(·), B(·), M_w(·), and N_w(·). We know already that, since the function K(·) can be expressed [viz., (71) and (72)] in such terms, x̂(·) and λ(·), as given by (116), can also be so expressed. Finally, since

  P(t) = E x̃(t) x̃'(t) = E x(t) x'(t) − E x̂(t) x̂'(t) = Π(t) − Σ(t)

where Σ(·) is as in Theorem 2, we can write

  M_w(t) P(t) = M_w(t) Π(t) − M_w(t) Σ(t) = N_w'(t) − M_w(t) Σ(t)

which, together with (117), gives (136).

Signals in Additive Noise: To treat the signal-in-additive-noise problem, as solved for known models in (120)-(128), we need the covariance functions of the process y(·) as described by (120)-(122). Straightforward calculation yields

  R(t,τ) = E y(t) y'(τ) = A(t) Π(t ∧ τ) A'(τ)   (139)

where Π(·) is the variance of the stacked state of (125).

VI. CONCLUDING REMARKS

The major part of this paper has been devoted to the determination of finite-time innovations (or canonical) representations for stationary processes with rational power-spectral-density functions and for the natural nonstationary analogs of such processes. The history of this important problem has been discussed in some detail. However, the arguments in this paper are essentially self-contained and have the important feature of highlighting the fundamental and intimate relationship between the representation problem and the least-squares filtering problem. An immediate benefit of this recognition, which does not seem to have been made so explicitly before (for nonstationary problems), has been the very simple solution of the filtering and smoothing problems of Section V-A (for known models) and Section V-B (for known covariances). This may be compared with the several papers in the literature on the known-models case [·]-[12] and the absence of any literature on the known-covariance problem. Our work attempts to explain and emphasize the fundamental role of the canonical whitening filter or, equivalently, of the innovations representation. This importance was very clear to the earliest workers on innovations representations for nonstationary processes (including stationary processes over finite intervals). We feel that the approach in our paper helps in gaining a simultaneous appreciation both of how IR's can be used and how they can be calculated. However, this new derivation does not directly bring out the intimate connection between the problems of finding innovations representations and of calculating least-squares estimates, and thus is less useful in developing applications, e.g., those in Section V.
APPENDIX I

For the processes considered here we have

  y^{(i)}(t) = A^{(i)}(t) x(t),   i = 0, ···, α − 1.   (A-4)

Therefore,

  ŷ^{(i)}(t) = y^{(i)}(t) = A^{(i)}(t) x̂(t).   (A-5)

Hence

  A^{(i)}(t) x̃(t) = 0   (A-6)

from which there immediately follow

  A^{(i)}(t) E x̃(t) ỹ'(t) = 0   and   A^{(i)}(t) E ỹ(t) ỹ'(t) = 0,   i = 0, ···, α − 1.

These are just the desired equations (A-3). This proof is of interest because of the physical significance of the property (A-5). Actually, (A-1) also follows by noting that the innovations model (32), which has state variance Σ(·), has exactly the same form as the model (18), which had state variance Π(·) and for which the analogous relation (79) was proved as in (20)-(24).

Incidentally, the above also proves that, of all models of the form (18) for a given covariance (13), the innovations model (32) has the smallest state variance. For, given any model with variance Π, there is a unique (depending only on the covariance) innovations model with variance Σ, and our analysis in Theorem 1 shows that Π − Σ = P ≥ 0.

APPENDIX II
SPECIAL FORM FOR REDUCED-ORDER MODEL

Suppose that the observed process y(·) is scalar and that its covariance

  R(t,s) = A(t) B(s),   s ≤ t

is such that A(·) is n-times continuously differentiable, and define

  Q_n'(t) = [A'(t) ⋮ A^{(1)'}(t) ⋮ ··· ⋮ A^{(n−1)'}(t)].

If R(t,s) is regarded as the impulse response of a linear system, then with the above constraints on A(·) the system is said [·] to be "uniformly observable." In any case, if we take the matrix T(t) of (81) to be equal to Q_n(t), then some calculation will show that the matrix F(·) of (83) can be written in "companion" form, with ones on the first superdiagonal, −a_0(t), ···, −a_{n−1}(t) in the last row, and zeros elsewhere, where the a_j(·) satisfy the equation

  A^{(n)}(t) = Σ_{j=0}^{n−1} a_j(t) A^{(j)}(t).

APPENDIX III
SOME GENERAL FACTORIZATION RESULTS

In the text we have shown how to obtain innovations representations under the assumption that a lumped model is available; here we give some more general factorization results.

Theorem A-1 (Separable Covariances Have Separable Factors): Let

  R_0(t,s) = δ(t − s) + K(t,s),   [t,s] ∈ I = [0,T]

where K(·,·) is symmetric and square integrable on I ⊗ I and

  K(t,s) = M(t) N(s),   for t ≥ s.

Then, if R_0(t,s) is strictly positive definite on I ⊗ I, i.e., if

  ∫_0^T ∫_0^T φ'(t) R_0(t,s) φ(s) dt ds > 0

for all φ(·) such that ∫_0^T φ'(t) φ(t) dt < ∞, there exists a function k(t,τ) of the form

  k(t,τ) = M(t) Ñ(τ),   with k(t,τ) = 0 for τ > t

such that the response to white noise of a filter with impulse response Iδ(t − τ) + k(t,τ) has covariance R_0(t,s).

Proof: Under the hypotheses on R_0, Gohberg and Krein [19] have shown that there exists a square-integrable Volterra function

  k(t,τ) = 0,   τ > t

such that, in an obvious operator notation,

  R_0 = I + K = (I + k)(I + k*)

where k* is the adjoint of k. Moreover, (I + k*)^{−1} exists and can be written

  (I + k*)^{−1} = I − h*

where h* is square integrable and h*(t,τ) = 0, t > τ. Therefore, we can write

  I + k = (I + K)(I − h*),   i.e.,   k = −h* + K(I − h*).

Let us define the causal part of a function A(t,s) as

  [A]_+(t,s) = A(t,s) 1(t − s).

Since k is causal and h*(t,s) = 0 for t > s, taking causal parts gives

  k(t,s) = K(t,s) 1(t − s) − (Kh*)_+(t,s).

Now, since h*(τ,s) = 0 for τ > s, we have for t > s

  (Kh*)(t,s) = ∫_0^{t∧s} K(t,τ) h*(τ,s) dτ = M(t) ∫_0^{t∧s} N(τ) h*(τ,s) dτ.

Therefore, we have for k

  k(t,s) = M(t) N(s) − M(t) ∫_0^s N(τ) h*(τ,s) dτ = M(t) Ñ(s),   t > s

where

  Ñ(s) = N(s) − ∫_0^s N(τ) h*(τ,s) dτ.

The IR for y(·) is then the filter with impulse response Iδ(t − s) + M(t)Ñ(s)1(t − s) driven by white noise, and there is a function k_0(t,s), vanishing for s > t, such that (I + k)^{−1} = I + k_0; this inverse gives the corresponding whitening filter.
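Theorem A-1 has an exact finite-dimensional counterpart that is easy to check numerically: if the strictly lower part of R0 = I + K is m_i n_j, then the strictly lower part of its Cholesky factor is m_i ñ_j, i.e., still separable with the same left factor. A sketch (the discretization and numbers are our own; the scaling keeps R0 diagonally dominant, hence positive definite):

```python
import numpy as np

n = 6
m = np.linspace(0.2, 0.7, n)       # samples of M(t)
nn = np.linspace(0.3, 0.8, n)      # samples of N(s)

# Symmetric separable kernel: K[i, j] = m[max(i, j)] * nn[min(i, j)]
K = np.empty((n, n))
for i in range(n):
    for j in range(n):
        K[i, j] = m[max(i, j)] * nn[min(i, j)]
R0 = np.eye(n) + 0.2 * K            # discrete analog of delta + K

L = np.linalg.cholesky(R0)          # causal factor: R0 = L L'
S = np.tril(L, -1)                  # strictly lower ("Volterra") part

# Separability of the factor: S[i, j] = m[i] * n_tilde[j] for i > j,
# so every row of S is proportional to m's entry for that row.
n_tilde = S[n - 1] / m[n - 1]       # recover n_tilde from the last row
for i in range(1, n):
    assert np.allclose(S[i, :i], m[i] * n_tilde[:i])
```

The assertion holds exactly (up to roundoff): running the Cholesky recursions column by column shows that each new column of the strict lower triangle inherits the factor m_i from the kernel, which mirrors the operator argument in the proof above.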