OPTIMUM MINIMUM VARIANCE FIXED INTERVAL SMOOTHING

Garry A. Einicke
CSIRO – Exploration and Mining, Queensland, Australia

ABSTRACT

The paper describes an optimal minimum-variance fixed-interval smoother. The optimal solution involves a cascade of a Kalman predictor and an adjoint Kalman predictor. Speech enhancement and nonlinear demodulation examples are presented which demonstrate that optimal and extended Kalman smoothers can provide performance benefits.
1. INTRODUCTION

Fixed-interval smoothers (see [1] – [8]) can outperform filters when processing blocks of noisy measurements. The optimal noncausal filter, or fixed-interval smoother, was first described in the frequency domain by Wiener [1, 2]. An early time-domain fixed-interval smoothing algorithm using the Kalman framework was developed by Rauch [3]. Some further developments of [3] are reported by Mayne in [4]. Rauch, Tung and Striebel developed a maximum-likelihood smoother [5] which uses a Kalman filter, stores the corrected states and the predicted and corrected covariances, and employs them within a backward recursion. In the Fraser-Potter smoother [6], Kalman filters are applied in the forward and backward directions. The smoothed state estimate arises as a linear combination of the forward and backward state estimates, weighted by the inverses of the underlying error covariance matrices. Catlin subsequently showed that the same smoother can be derived without assuming that the forward and backward errors are independent [7]. A discrete-time fixed-interval smoother is described in [8], which employs a Riccati difference equation (RDE) within a filter to calculate forward states that are combined with adjoint states to obtain smoothed estimates.

A minimum-variance fixed-interval smoother is described here which derives from classical frequency-domain (Wiener) filtering. The noncausal filter involves a cascade of a forward Kalman predictor and an adjoint Kalman predictor. An extended Kalman smoother is also described. Linear and nonlinear applications are presented which demonstrate that smoothers can provide performance benefits.
2. OPTIMAL SMOOTHING

2.1 Adjoint Systems

In Section 2.2, where a minimum-variance smoother is described, the discrete-time adjoint system is required. Let G : R^{m×N} → R^{p×N} be a linear operator between the two Hilbert spaces R^{m×N} and R^{p×N}. Suppose that G has the state-space realization

[x_{k+1}; y_k] = [A  B; C  D] [x_k; w_k],   (1)

where x = (x_1, x_2, ..., x_N), x_k ∈ R^n; w = (w_1, w_2, ..., w_N), w_k ∈ R^m, with E{w_k w_k^T} = Q ∀k; y = (y_1, y_2, ..., y_N), y_k ∈ R^p; and A ∈ R^{n×n}, B ∈ R^{n×m}, C ∈ R^{p×n}, D ∈ R^{p×m}. Let <·,·> denote the inner product. Then the adjoint of G is the linear system β = G^H α that maps R^{p×N} → R^{m×N} and satisfies <α, Gw> = <G^H α, w> for all α ∈ R^{p×N} and w ∈ R^{m×N} [9]. Following the continuous-time approach [9], it can be shown that the adjoint G^H is the linear system

[ξ_{k−1}; β_k] = [A^T  −C^T; −B^T  D^T] [ξ_k; α_k].   (2)

The adjoint (2) has the same 2-norm as the original system (1). Note that the adjoint system is anti-causal, i.e., it evolves backwards in time k. Suppose, hypothetically, that G is asymptotically stable and it is desired to realize G^H. The G^H cannot be realized directly because it is unstable. By exploiting (G^H)^H = G, the G^H can be realized by operating G on the transposed time-reversed input data and then taking the transposed time reverse of the result (see [10] Ex. 2.3).

2.2 Fixed Interval Smoothing

Consider the output estimation problem depicted in Fig. 1, where G is given by (1). A linear system ŷ_1 = Hz mapping R^{p×N} → R^{q×N} is to be designed that produces estimates ŷ_k of the output of G from the measurements

z_k = y_k + v_k,   (3)

where v = (v_1, v_2, ..., v_N),
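The adjoint property and the time-reversal realization of Section 2.1 can be checked numerically. The sketch below (NumPy; all matrices and seeds are illustrative choices, not from the paper) runs (1) forward and (2) backward, and confirms both the inner-product identity and the transposed time-reversal trick:

```python
import numpy as np

def apply_G(A, B, C, D, w):
    """Forward system (1): x_{k+1} = A x_k + B w_k, y_k = C x_k + D w_k, x_1 = 0."""
    x = np.zeros(A.shape[0])
    y = np.zeros((C.shape[0], w.shape[1]))
    for k in range(w.shape[1]):
        y[:, k] = C @ x + D @ w[:, k]
        x = A @ x + B @ w[:, k]
    return y

def apply_G_adjoint(A, B, C, D, alpha):
    """Adjoint system (2), run backwards in time with zero terminal state:
       xi_{k-1} = A^T xi_k - C^T alpha_k,  beta_k = -B^T xi_k + D^T alpha_k."""
    xi = np.zeros(A.shape[0])
    beta = np.zeros((B.shape[1], alpha.shape[1]))
    for k in reversed(range(alpha.shape[1])):
        beta[:, k] = -B.T @ xi + D.T @ alpha[:, k]
        xi = A.T @ xi - C.T @ alpha[:, k]
    return beta

rng = np.random.default_rng(0)
n, m, p, N = 3, 2, 2, 50
A = 0.5 * rng.standard_normal((n, n)) / np.sqrt(n)  # scaled down so G is stable
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
D = rng.standard_normal((p, m))
w = rng.standard_normal((m, N))
alpha = rng.standard_normal((p, N))

# Adjoint property: <alpha, G w> = <G^H alpha, w>
beta = apply_G_adjoint(A, B, C, D, alpha)
assert np.isclose(np.sum(alpha * apply_G(A, B, C, D, w)),
                  np.sum(beta * w))

# Realize G^H causally: run the transposed system on the time-reversed input,
# then time-reverse the output (state signs differ, the input-output map agrees)
beta_tr = apply_G(A.T, C.T, B.T, D.T, alpha[:, ::-1])[:, ::-1]
assert np.allclose(beta, beta_tr)
```

The sign convention in (2) is immaterial to the input-output map: negating the adjoint state ξ flips both minus signs, which is why the transposed system with positive entries reproduces the same β.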
v_k ∈ R^p ∀k, is an independent process with E{v_k v_k^T} = R. Denote the estimation error as

e_k = ŷ_k − y_k = [H  HG − G] [v_k; w_k]   (4)

and define

R_ei R_ei^H = [H  HG − G] [R  0; 0  Q] [H^H; G^H H^H − G^H].   (5)

The R_ei is known as the map from the inputs i_k = [v_k; w_k] to the estimation error.

Fig. 1. The output estimation problem.

The Kalman filter [11] is given by

[x̂_{k+1/k}; ŷ_{1,k/k}] = [A_k − K_k C_k  K_k; C_k − L_k C_k  L_k] [x̂_{k/k−1}; z_k],   (6)

where K_k = (A_k P_{k/k−1} C_k^T + B_k Q_k D_k^T)(Ω_k)^{−1} is the predictor gain, L_k = (C_k P_{k/k−1} C_k^T + D_k Q_k D_k^T)(Ω_k)^{−1} is the filter gain, Ω_k = C_k P_{k/k−1} C_k^T + D_k Q_k D_k^T + R_k, and P_{k/k−1} is the solution of the RDE

P_{k+1/k} = A_k P_{k/k−1} A_k^T − K_k Ω_k (K_k)^T + B_k Q_k B_k^T.

The Kalman filter is the optimal causal solution that minimizes ||R_ei R_ei^H||_2, which is equivalent to minimizing the variance of the estimation error. Next, an optimal noncausal solution is stated which minimizes ||R_ei R_ei^H||_2.

Theorem [12]: For the above time-varying output estimation problem, suppose that the factor

Δ̂ = [A_k  K_k (Ω_k)^{1/2}; C_k  (Ω_k)^{1/2}]

and its inverse

Δ̂^{−1} = [A_k − K_k C_k  K_k (Ω_k)^{−1/2}; −(Ω_k)^{−1/2} C_k  (Ω_k)^{−1/2}]

exist ∀k. Then, in the output estimation case, the smoother that minimizes ||R_ei R_ei^H||_2 is given by

[x_{k+1}; α_k] = [A_k − K_k C_k  K_k; −(Ω_k)^{−1/2} C_k  (Ω_k)^{−1/2}] [x_k; z_k],   (7)

[λ_{k−1}; β_k] = [A_k^T − C_k^T (K_k)^T  C_k^T (Ω_k)^{−1/2}; −(K_k)^T  (Ω_k)^{−1/2}] [λ_k; α_k],   (8)

ŷ_k = z_k − R_k β_k.   (9)

A proof is detailed in [12]. The structure of the noncausal filter for output estimation is shown in Fig. 2, in which H_P = [A_k − K_k C_k  K_k; C_k  0] denotes the Kalman predictor.

The optimal smoother for output estimation (7), (8), (9) involves a cascade of a Δ^{−1} evolving forward in time followed by a Δ^{−H} evolving backward in time. It can be implemented via the following three-step procedure. Step 1. Realize the system (7) operating on the measurements z_k. Step 2. In lieu of (8), which is unrealizable, exploit (Δ^{−H})^H = Δ^{−1} (see [10] Ex. 2.3); that is, operate (7) on the time-reversed transpose of the α_k and then take the time-reversed transpose of the result. Step 3. Apply (9).

Fig. 2. The smoother for output estimation.

The minimum-variance smoother (7) – (9) estimates the output of G, i.e., it produces the estimate C x̂_k. Smoother formulations for the more general case, including input estimation or equalisation, are described in [12]. Another possible way of calculating state estimates is via (7), (8) and

x̂_k = C^# (z_k − R_k β_k),   (10)

where C^# = (C^T C)^{−1} C^T is known as the Moore-Penrose pseudo-inverse, which assumes that (C^T C)^{−1} exists.
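The three-step procedure can be sketched for a scalar time-invariant example. The model parameters below are illustrative assumptions; over a finite interval the anticausal recursion (8) can be run directly as a backward pass, which is equivalent to the time-reversal construction of Step 2:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative scalar model: x_{k+1} = A x_k + B w_k, y_k = C x_k, z_k = y_k + v_k
A, B, C, D = 0.9, 1.0, 1.0, 0.0
Q, R = 1.0, 1.0
N = 2000

w = rng.normal(0.0, np.sqrt(Q), N)
v = rng.normal(0.0, np.sqrt(R), N)
x = np.zeros(N + 1)
for k in range(N):
    x[k + 1] = A * x[k] + B * w[k]
y = C * x[:N]
z = y + v

# Riccati recursion for the predictor gain K_k and innovation covariance Omega_k
P = 1.0
K, Om = np.zeros(N), np.zeros(N)
for k in range(N):
    Om[k] = C * P * C + D * Q * D + R
    K[k] = (A * P * C + B * Q * D) / Om[k]
    P = A * P * A - K[k] * Om[k] * K[k] + B * Q * B

# Step 1: forward pass (7) producing the normalized innovations alpha_k
xf = 0.0
alpha = np.zeros(N)
for k in range(N):
    alpha[k] = (z[k] - C * xf) / np.sqrt(Om[k])
    xf = (A - K[k] * C) * xf + K[k] * z[k]

# Step 2: adjoint pass (8), run directly as a backward recursion (lambda_N = 0)
lam = 0.0
beta = np.zeros(N)
for k in reversed(range(N)):
    beta[k] = -K[k] * lam + alpha[k] / np.sqrt(Om[k])
    lam = (A - C * K[k]) * lam + C * alpha[k] / np.sqrt(Om[k])

# Step 3: output estimate (9)
y_hat = z - R * beta

mse_meas = np.mean((z - y) ** 2)
mse_smooth = np.mean((y_hat - y) ** 2)
assert mse_smooth < mse_meas  # the smoother improves on the raw measurements
```

Dropping the −K_k λ_k term in Step 2 recovers the causal minimum-variance filter's output estimate; the backward term is precisely the anticausal correction that the smoother adds.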
3. APPLICATIONS

3.1 Speech Enhancement

Speech is often modelled as a time-varying autoregressive (AR) process. In high measurement noise applications, it is convenient to employ a time-invariant AR1 process

x_{k+1} = µ x_k + w_k,   (11)
z_k = x_k + v_k,   (12)

where µ ∈ R, 0 < µ < 1, and the speech message w_k and measurement noise v_k are uncorrelated, zero-mean, gaussian processes of variances σ_w^2 and σ_v^2, respectively. Suppose that a filter designed for (11) and (12) has produced N state estimates x̂_k from N measurements z_k. Using the approach of [13] and assuming that z_k ~ N(x_k, σ_v^2) and x_{k+1} ~ N(µ x_k, σ_w^2), it is straightforward to show that the maximum-likelihood estimates (MLEs) for the unknown parameters are

µ̂ = (Σ_{k=1}^{N−1} x̂_{k+1} x̂_k) / (Σ_{k=1}^{N−1} x̂_k^2),
σ̂_v^2 = (1/N) Σ_{k=1}^{N} (z_k − x̂_k)^2, and
σ̂_w^2 = (1/(N−1)) Σ_{k=1}^{N−1} (x̂_{k+1} − µ̂ x̂_k)^2.
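The MLE formulas can be exercised on simulated AR1 data. In this sketch the true states stand in for the filtered estimates x̂_k (an idealization, purely for illustration; the parameter values are assumptions, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)

# AR(1) speech surrogate with illustrative parameters
mu, sigma_w, sigma_v = 0.9, 1.0, 0.5
N = 20000
x = np.zeros(N + 1)
for k in range(N):
    x[k + 1] = mu * x[k] + rng.normal(0.0, sigma_w)
z = x[:N] + rng.normal(0.0, sigma_v, N)

# MLEs, with the true states used in place of the filtered estimates x_hat_k
xh = x[:N]
mu_hat = np.sum(xh[1:] * xh[:-1]) / np.sum(xh[:-1] ** 2)
sigma_v2_hat = np.mean((z - xh) ** 2)
sigma_w2_hat = np.mean((xh[1:] - mu_hat * xh[:-1]) ** 2)

# With N = 20000 samples the estimates land close to the true parameters
assert abs(mu_hat - mu) < 0.05
assert abs(sigma_v2_hat - sigma_v ** 2) < 0.05
assert abs(sigma_w2_hat - sigma_w ** 2) < 0.1
```

In practice the x̂_k come from a filter or smoother pass, and the MLEs are re-estimated iteratively, as in the expectation-maximization procedure mentioned below.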
A voiced speech utterance "a e i o u" was sampled at 8 kHz for the purpose of comparing smoother performance. Simulations were conducted with the zero-mean, unity-variance speech sample interpolated to a 16 kHz sample rate, to which 200 realizations of gaussian noise were added, and the signal-to-noise ratio (SNR) was varied from −5 to 5 dB. The µ̂ and σ̂_w^2 were calculated at 20 dB SNR using Kalman filter state estimates within an expectation-maximization algorithm. The time-varying minimum-variance smoother was applied to the above-mentioned noisy speech data and the simulation results are shown in Fig. 3. In this example, the maximum-likelihood smoother [5] and the minimum-variance smoother yield indistinguishable performance. The figure shows that the smoothers outperform the Kalman filter. A robust smoother which can yield further performance improvements is described in [12].

Fig. 3. Speech estimate performance comparison: (i) data, (ii) Kalman filter and (iii) maximum-likelihood and minimum-variance smoothers.

3.2 FM Demodulation

Consider a nonlinear plant of the form

x_{k+1} = a_k(x_k) + B w_k,
y_k = c_k(x_k) + v_k,

in which the nonlinearities a_k(·), c_k(·) are assumed to be smooth, differentiable functions of appropriate dimensions (see [11]). An extended Kalman smoother for the corresponding output estimation problem can be implemented using linearizations akin to the extended Kalman filter (EKF) via the following three-step procedure.

Step 1. In view of (7), calculate

α_k = (Ω_k)^{−1/2} (z_k − c_k(x̂_{k/k−1})),
x̂_{k/k} = x̂_{k/k−1} + L_k (z_k − c_k(x̂_{k/k−1})),
x̂_{k+1/k} = a_k(x̂_{k/k}),

where L_k = P_{k/k−1} C_k^T (Ω_k)^{−1}, Ω_k = C_k P_{k/k−1} C_k^T + R_k, P_{k/k} = P_{k/k−1} − L_k Ω_k (L_k)^T and P_{k+1/k} = A_k P_{k/k} A_k^T + B_k Q_k B_k^T, in which C_k = ∂c_k(x)/∂x |_{x = x̂_{k/k−1}} and A_k = ∂a_k(x)/∂x |_{x = x̂_{k/k}}.

Step 2. In lieu of (8), calculate the β_k by carrying out Step 1 operating on the transposed time-reversed data, and then taking the transposed time-reverse of the result.

Step 3. Calculate ŷ_k = z_k − R β_k.
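Step 1 of the extended Kalman smoother amounts to an EKF pass that also stores the normalized innovations α_k. A minimal scalar sketch, with an illustrative nonlinearity c(x) = sin(x) and parameter values that are assumptions rather than the paper's:

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative scalar nonlinear model: a(x) = 0.9 x, c(x) = sin(x)
def a(x): return 0.9 * x
def c(x): return np.sin(x)
def A_jac(x): return 0.9          # da/dx
def C_jac(x): return np.cos(x)    # dc/dx

B, Q, R = 1.0, 0.01, 0.01
N = 1000
x = np.zeros(N + 1)
for k in range(N):
    x[k + 1] = a(x[k]) + B * rng.normal(0.0, np.sqrt(Q))
z = c(x[:N]) + rng.normal(0.0, np.sqrt(R), N)

# Step 1: EKF pass producing corrected states and normalized innovations alpha_k
xp, P = 0.0, 1.0                  # x_hat_{k/k-1} and P_{k/k-1}
alpha = np.zeros(N)
xf = np.zeros(N)
for k in range(N):
    Ck = C_jac(xp)
    Om = Ck * P * Ck + R          # innovation covariance Omega_k
    L = P * Ck / Om               # filter gain L_k
    alpha[k] = (z[k] - c(xp)) / np.sqrt(Om)
    xf[k] = xp + L * (z[k] - c(xp))   # x_hat_{k/k}
    Pf = P - L * Om * L               # P_{k/k}
    Ak = A_jac(xf[k])
    xp = a(xf[k])                     # x_hat_{k+1/k}
    P = Ak * Pf * Ak + B * Q * B      # P_{k+1/k}

mse_filt = np.mean((xf - x[:N]) ** 2)
assert mse_filt < np.mean(x[:N] ** 2)  # EKF beats the trivial zero estimate
```

Steps 2 and 3 then reuse this pass on the transposed time-reversed α_k, exactly as in the linear case, with the stored linearizations A_k, C_k playing the role of the time-varying state-space matrices.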
Consider the problem of demodulating a unity-amplitude frequency-modulated (FM) signal. Let x_k = [x_k^(1); x_k^(2)],

A = [µ  0; .99  .99],  B = [1  0]^T,

[z_k^(1); z_k^(2)] = [cos x_k^(2); sin x_k^(2)] + [v_k^(1); v_k^(2)],

where x_k^(1), x_k^(2), z_k and v_k denote the instantaneous frequency, instantaneous phase, complex observations and measurement noise, respectively. Simulations were conducted in which an FM signal was generated using the speech message described in Section 3.1. The MLE parameter estimates of Section 3.1 were used within an extended Kalman filter demodulator.

An FM discriminator [14], i.e.,

z_k^(3) = (z_k^(1) dz_k^(2)/dt − z_k^(2) dz_k^(1)/dt) ((z_k^(1))^2 + (z_k^(2))^2)^{−1},

serves as a benchmark and as an auxiliary frequency measurement for an extended Kalman smoother. The innovations within Steps 1 and 2 are given by

([z_k^(1); z_k^(2); z_k^(3)] − [cos(x̂_k^(2)); sin(x̂_k^(2)); x̂_k^(1)]) and ([α_k^(1); α_k^(2); α_k^(3)] − [cos(x̂_k^(2)); sin(x̂_k^(2)); x̂_k^(1)]),

respectively.
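The discriminator formula can be checked on noiseless data. The derivatives are approximated here by central differences (a discretization choice assumed for illustration, not specified in the paper); for a constant frequency ω the interior samples return exactly sin(ω) ≈ ω:

```python
import numpy as np

def discriminator(z1, z2):
    """FM discriminator z3 = (z1*dz2/dt - z2*dz1/dt) / (z1^2 + z2^2),
       with derivatives approximated by central differences."""
    dz1 = (z1[2:] - z1[:-2]) / 2.0
    dz2 = (z2[2:] - z2[:-2]) / 2.0
    num = z1[1:-1] * dz2 - z2[1:-1] * dz1
    den = z1[1:-1] ** 2 + z2[1:-1] ** 2
    return num / den

omega = 0.2                        # constant instantaneous frequency (rad/sample)
phi = omega * np.arange(500)       # instantaneous phase
z1, z2 = np.cos(phi), np.sin(phi)  # noiseless quadrature observations

z3 = discriminator(z1, z2)
# For noiseless constant-frequency data the estimate is sin(omega) exactly
assert np.allclose(z3, np.sin(omega))
```

The sin(ω) bias of the central difference is negligible at low frequencies; in the noisy simulations below the discriminator output is simply treated as an auxiliary measurement z_k^(3).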
The SNR was varied in 1.5 dB steps from 3 dB to 15 dB. The MSEs were calculated over 200 realizations of gaussian measurement noise and are shown in Fig. 4. It can be seen from the figure that, at 7.5 dB SNR, the EKF improves on the FM discriminator MSE by about 12 dB. The improvement arises because the EKF demodulator exploits the signal model whereas the FM discriminator does not. The figure shows that the extended Kalman smoother further reduces the MSE by about 2 dB, which illustrates the advantage of exploiting all the data in the time interval. This nonlinear example illustrates once again that smoothers can outperform filters.
Fig. 4. FM demodulation performance comparison: (i) FM discriminator, (ii) EKF and (iii) extended Kalman smoother.
4. CONCLUSION

The paper introduces an optimum fixed-interval smoother for discrete-time systems. The smoother involves a cascade of a Kalman predictor and an adjoint Kalman predictor. The solution is optimal in the sense that it minimizes the 2-norm of the map from the input processes to the estimation error. The advantage of a state-space formulation is that time-varying and nonlinear applications can be described. A speech enhancement example is discussed which demonstrates that combining forward and adjoint predictors within a smoother can be beneficial. An FM demodulation example is presented which illustrates that an extended Kalman smoother can provide performance benefits compared to an extended Kalman filter.
REFERENCES

[1] J. S. Meditch, "A Survey of Data Smoothing for Linear and Nonlinear Dynamic Systems", Automatica, vol. 9, pp. 151-162, 1973.
[2] T. Kailath, "A View of Three Decades of Linear Filtering Theory", IEEE Trans. Info. Theory, vol. 20, no. 2, pp. 146-181, Mar., 1974.
[3] H. E. Rauch, "Solutions to the Linear Smoothing Problem", IEEE Trans. Automat. Contr., vol. 8, pp. 371-372, 1963.
[4] D. Q. Mayne, "A Solution of the Smoothing Problem for Linear Dynamic Systems", Automatica, vol. 4, pp. 73-92, 1966.
[5] H. E. Rauch, F. Tung and C. T. Striebel, "Maximum Likelihood Estimates of Linear Dynamic Systems", AIAA Journal, vol. 3, no. 8, pp. 1445-1450, Aug., 1965.
[6] D. C. Fraser and J. E. Potter, "The Optimum Linear Smoother as a Combination of Two Optimum Linear Filters", IEEE Trans. Automat. Contr., vol. AC-14, no. 4, pp. 387-390, Aug., 1969.
[7] D. C. Catlin, "The Independence of Forward and Backward Estimation Errors in the Two-Filter Form of the Fixed Interval Kalman Smoother", IEEE Trans. Automat. Contr., vol. 25, no. 6, pp. 1111-1114, Dec., 1980.
[8] Y. Theodor, U. Shaked and C. E. de Souza, "A Game Theory Approach to Robust Discrete-Time H∞ Estimation", IEEE Trans. Signal Processing, vol. 44, no. 6, pp. 1486-1495, Jun., 1996.
[9] D. J. N. Limebeer, B. D. O. Anderson, P. Khargonekar and M. Green, "A Game Theoretic Approach to H∞ Control for Time-varying Systems", SIAM J. Control and Optimization, vol. 30, no. 2, pp. 262-283, 1992.
[10] C. S. Burrus, J. H. McClellan, A. V. Oppenheim, T. W. Parks, R. W. Schafer and H. W. Schuessler, Computer-Based Exercises for Signal Processing Using Matlab, Prentice-Hall Inc., Englewood Cliffs, New Jersey, pp. 38-40, 1994.
[11] B. D. O. Anderson and J. B. Moore, Optimal Filtering, Prentice-Hall Inc., Englewood Cliffs, New Jersey, 1979.
[12] G. A. Einicke, "Optimal and Robust Noncausal Filter Formulations", IEEE Trans. Signal Processing, (to appear), 2005.
[13] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory, Prentice Hall, Englewood Cliffs, New Jersey, ch. 7, pp. 157-204, 1993.
[14] J. Aisbett, "Automatic Modulation Recognition Using Time Domain Parameters", Signal Processing, vol. 13, pp. 311-323, 1987.