MULTICHANNEL ADAPTIVE LEAST SQUARES - RELATING THE “KALMAN” RECURSIVE LEAST SQUARES (RLS) AND LEAST SQUARES LATTICE (LSL) ADAPTIVE ALGORITHMS

Paul S. Lewis
Los Alamos National Laboratory, Mechanical and Electronic Engineering Division
Mail Stop 5580, Los Alamos, NM 87545, U.S.A.
Abstract

A multichannel adaptive lattice algorithm is derived that provides a least squares estimate of a primary signal in terms of the last p samples of m reference signals. The recursions developed are optimized for multichannel use and provide an algorithm with computational complexity of O(pm²). In the single reference channel case (m = 1), the algorithm simplifies to the O(p) least squares lattice (LSL) algorithm. In the single filter tap case (p = 1), the algorithm simplifies to the O(m²) “Kalman” recursive least squares (RLS) algorithm. Hence, the general multichannel least squares lattice (MLSL) algorithm presented here provides an explicit relationship between, and common derivation of, both the LSL and RLS algorithms.
1 Introduction

Least squares adaptive algorithms provide an estimate of a primary signal via a linear combination of reference signals. This estimate is derived by minimizing the exact squared difference between the estimate and the given data. The traditional recursive least squares (RLS) algorithm, also known as the “Kalman” algorithm, has complexity of O(n²), where n is the number of inputs. This algorithm is valid for any linear combiner, as no assumptions are made about relationships between inputs. If the inputs are the taps of an adaptive finite impulse response (FIR) filter, then the shift invariance property of the inputs may be utilized to develop faster algorithms. Among the most efficient, stable, and robust is the class of least squares lattice (LSL) algorithms. LSL algorithms have a complexity of O(n), providing a significant performance improvement over the RLS algorithm. LSL algorithms are both order and time recursive and are based upon solutions to the forward/backward predictor filters for the input data. In the literature, these two approaches have been derived and analyzed separately, and even compared, without any hint of a common connection. In this paper it is demonstrated that the algorithms are complementary and can be related by a least squares multichannel FIR adaptive filter.

In a multichannel adaptive FIR filter, the desired signal is estimated by a weighted combination of the last p samples of m input channels. Hence, the problem is a combination of both the linear combiner and FIR cases. Multichannel LSL (MLSL) algorithms have been derived before, but usually as straightforward generalizations of single-channel LSL algorithms. In a single-channel LSL algorithm, the time and order recursive quantities are scalars. In a multichannel algorithm they are a combination of matrices, vectors, and scalars. A straightforward multichannel generalization of an LSL algorithm will lead to an O(pm³) algorithm. This is due chiefly to the computational expense of matrix inversion. The MLSL algorithm presented in this paper is optimized for multichannel operation and has computational complexity of O(pm²). This is achieved by choosing a particular set of the available recursive relationships and by utilizing the matrix inversion lemma to eliminate explicit matrix inversions. In addition, algebraic manipulation and the introduction of specific intermediate variables permit a minimization of the vector-matrix multiplications that contribute to the quadratic term of the operation count. The final per-time-step operation count, including joint estimation, is 3p divisions, 10pm² + 9pm + 5p multiplications, and 7pm² + 3pm additions.

Just as the multichannel problem represents a combination of the FIR and linear combiner cases, the MLSL algorithm presented here can be interpreted as a combination of the RLS and LSL algorithms. When the number of channels in the MLSL is set to one (m = 1), the MLSL algorithm simplifies to an efficient O(p) “unnormalized” LSL algorithm. Likewise, when the number of filter taps is set to one (p = 1), the MLSL algorithm simplifies to the O(m²) RLS algorithm. Hence, the LSL and RLS algorithms form complementary parts of the MLSL algorithm. Although, in the general case, the components are not separable, a rough interpretation is that the RLS part handles the multiple channels and the LSL part handles the multiple taps within each channel. The MLSL algorithm presented in this paper thus provides an explicit relationship between, and common derivation of, these two algorithms.
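A quick arithmetic check of how these counts collapse in the two limiting cases treated in Section 5 (multiplication counts only):

$$ m = 1:\quad 10p + 9p + 5p = 24p \;\in\; O(p); \qquad p = 1:\quad 10m^2 + 9m + 5 \;\in\; O(m^2). $$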
2 Problem Definition
A block diagram of the problem is shown in Figure 1. Here there is a single¹ primary channel z(k) with samples available for 1 ≤ k ≤ t. In addition, there are m sampled reference channels, represented by the m-dimensional vector y(k), with samples available for 1 ≤ k ≤ t. An estimate of z(t) is constructed by forming a linear combination of the past p signal samples of these m channels. The linear combination of past reference samples used to estimate z(t) can be specified by mp filter weights. The values of these weights are determined by a weighted least squares criterion. The output ε^z_{p,t} is then the difference between the actual and estimated values. The function of the adaptive algorithm is to (implicitly) set the weights so that the error in estimating both the present and past values of z(t) is minimized.

The derivation presented here is a finite dimensional vector space approach that combines aspects of the derivations found in References [1,2]. To simplify notation, a “large” z-dimensional vector space is used, where z ≫ t. Problem quantities are defined either as vectors in this space or as matrices consisting of vectors in this space (the matrix has z rows). To begin, define

$$ z_t \triangleq [\,z(t)\;\; z(t-1)\;\cdots\; z(1)\;\; 0\;\cdots\; 0\,]' \qquad (1) $$

¹The single primary channel case is general. A treatment of the multiple primary channel case along these lines leads to a solution for each channel that is independent of the others; the solutions are separable.
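Spelled out under the notation above, the weighted least squares criterion reads as follows (the weight vectors w_i are our notation; the paper leaves the weights implicit):

$$ \min_{w_0,\dots,w_{p-1}} \; \sum_{k=1}^{t} \lambda^{\,t-k}\Big( z(k) \;-\; \sum_{i=0}^{p-1} w_i'\, y(k-i) \Big)^{2}, $$

where each w_i is an m-vector (mp weights in all) and λ ∈ (0, 1] is the exponential weighting factor that also appears in Equation 2 below.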
as the z-dimensional signal vector, and

$$ Y_t \triangleq [\,y(t)\;\; \lambda^{1/2}y(t-1)\;\cdots\;\lambda^{(t-1)/2}y(1)\;\; 0\;\cdots\;0\,]' \qquad (2) $$

as the z × m exponentially weighted reference matrix. Now define

$$ Y_{p,t} \triangleq [\,Y_t\;\; Y_{t-1}\;\cdots\; Y_{t-p+1}\,] \qquad (3) $$

as the z × mp matrix of past reference matrices.² Define

$$ P^{\perp}_{p,t} \triangleq I - Y_{p,t}(Y_{p,t}'Y_{p,t})^{-1}Y_{p,t}' $$

as the z × z projection matrix onto the orthogonal complement of the column space of Y_{p,t}. The overall error is given by

$$ e^z_{p,t} = P^{\perp}_{p,t}\, z_t . \qquad (4) $$

The error of this “joint process” estimation at time t is the leading component of e^z_{p,t}. By defining a “pinning” vector π [1,2,3,4] as π ≜ [1 0 ⋯ 0]', the error can be expressed as the inner product

$$ \varepsilon^z_{p,t} = \pi' P^{\perp}_{p,t}\, z_t . \qquad (5) $$

²This corresponds to the exponentially weighted, “prewindowed” form of the problem [3], in which it is assumed that y(k) = 0 and z(k) = 0 for k < 0.

3 Lattice Recursions

The recursive least squares lattice solution to the above joint estimation problem [estimating z(t) in terms of y(t) … y(t-p+1)] can be expressed as an extension of the lattice solution to the least squares forward/backward prediction problem. In the forward prediction problem, y(t+1) is estimated in terms of y(t) … y(t-p+1). Likewise, in the backward prediction problem, y(t-p) is estimated in terms of y(t) … y(t-p+1). In this section prediction equations are developed and then extended to solve the overall joint estimation problem.

The variables for the prediction lattice are defined in the top part of Table 1. The recursive relationships between these variables are listed in the top part of Table 2. These relations are based on the following projection matrix update formula [1,2,3,4]. Given four matrices X, Y, U, and V, then

$$ U'P^{\perp}_{[Y\,X]}V = U'P^{\perp}_{Y}V - (U'P^{\perp}_{Y}X)(X'P^{\perp}_{Y}X)^{-1}(X'P^{\perp}_{Y}V), \qquad (6) $$

where P^{\perp}_{Y} ≜ I - Y(Y'Y)^{-1}Y' is a generalization of the previous notation and denotes the matrix that projects onto the orthogonal complement of the column space of Y. Use of the above identity with Y = Y_{p,t} allows the definition of recursions for time and/or order updates by using various values of X. For an order update,

$$ Y_{p+1,t} = [\,Y_{t-p}\;\; Y_{p,t}\,] . \qquad (7) $$

For a time and order update,

$$ Y_{p+1,t+1} = [\,Y_{t+1}\;\; Y_{p,t}\,] . \qquad (8) $$

For a time only update, the following identity is used [1,2,3,4]:

$$ P^{\perp}_{[\pi\;Y_{p,t}]} = \begin{bmatrix} 0 & \mathbf{0}' \\ \mathbf{0} & P^{\perp}_{p,t-1} \end{bmatrix} . \qquad (9) $$

Here P^{\perp}_{p,t-1} is the (z-1) × (z-1) projection matrix corresponding to a Y_{p,t-1} defined in a (z-1)-dimensional space. Using these identities and normalizing the indices yields the recursions of Table 2 (Equations 10-17).

The above results may be used to set up an estimation/prediction of z(k) in terms of y(k) by adding two more variables and their associated recursions to those given above. The first variable is the previously defined error. The second intermediate variable represents a correlation between z(k) and y(k-p). Using these variables in the update formula generates two additional equations. The variable definitions and additional equations are listed at the bottom of Tables 1 and 2.
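The update formula (6) can be checked numerically. The sketch below (NumPy; the random test matrices and the helper name P_perp are ours, not the paper's) verifies the identity in a 30-dimensional ambient space:

```python
import numpy as np

def P_perp(A):
    """Projector onto the orthogonal complement of the column space of A."""
    return np.eye(A.shape[0]) - A @ np.linalg.inv(A.T @ A) @ A.T

rng = np.random.default_rng(0)
z = 30                                   # ambient "large" dimension
Y = rng.standard_normal((z, 4))
X = rng.standard_normal((z, 2))
U = rng.standard_normal((z, 3))
V = rng.standard_normal((z, 3))

# Left side: project against the augmented matrix [Y X]
lhs = U.T @ P_perp(np.hstack([Y, X])) @ V

# Right side of Equation 6
PY = P_perp(Y)
rhs = U.T @ PY @ V - (U.T @ PY @ X) @ np.linalg.inv(X.T @ PY @ X) @ (X.T @ PY @ V)

print(np.allclose(lhs, rhs))             # True
```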
4 Multichannel LSL Algorithm

There are numerous ways to organize the equations of Table 2 into an adaptive algorithm. The variations involve the selection of equations to use (each of C^e, C^r, and γ has two recursive relations to choose between), the ordering in which they are computed, and the grouping of intermediate terms. Because the multichannel case is under consideration, a particular set of these relations is chosen to minimize operations.
In the chosen equations, matrix inversions are required, producing an O(pm³) algorithm. To avoid this, the update equations for C^e and C^r can be rewritten in terms of their inverses by using a special form of the matrix inversion identity, yielding the working set of Equations 19-26.
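The rank-one form of this identity is the core of the trick. A small sketch (NumPy; the matrix C and weighting lam are arbitrary stand-ins) shows the inverse of λC + yy' obtained in O(m²) work without an explicit inversion:

```python
import numpy as np

rng = np.random.default_rng(1)
m, lam = 3, 0.99
A = rng.standard_normal((m, m))
C = A @ A.T + np.eye(m)        # a positive definite "covariance"
Cinv = np.linalg.inv(C)
y = rng.standard_normal(m)

# Direct update of the covariance: C_new = lam*C + y y'
C_new = lam * C + np.outer(y, y)

# Matrix inversion lemma: update the inverse directly, O(m^2) work
Cy = Cinv @ y
Cinv_new = (Cinv - np.outer(Cy, Cy) / (lam + y @ Cy)) / lam

print(np.allclose(Cinv_new, np.linalg.inv(C_new)))   # True
```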
The recursions of Equations 19-26 run over both p and t. To form an efficient serial algorithm they must be placed in order. To reduce the computational load and to eliminate redundant computations, intermediate variables may be defined. In particular, the matrix C^{e,-1}_{p,t} must be multiplied by two different vectors at each step, and the same situation holds for C^{r,-1}_{p,t}. Introduction of intermediate variables, along with some algebraic manipulation, permits the elimination of one of these matrix-vector multiplications. For example, multiplying both sides of Equation 22 by e_{p,t}, and likewise Equation 23 by r_{p,t}, leads to the definitions
$$ \tilde e_{p,t} \triangleq C^{e,-1}_{p,t-1}\, e_{p,t} \qquad (28) $$

$$ \tilde r_{p,t} \triangleq C^{r,-1}_{p,t}\, r_{p,t} \qquad (29) $$

Substituting these intermediate variables into the remaining updates yields the modified recursions of Equations 30 and 31, and placing the recursions in proper order leads to the serial algorithm in Table 3.

The numerical robustness of this algorithm can be improved by utilizing QR orthogonalization-based techniques [5] to replace the “covariance” update calculations of Equations 22 and 23. This topic, along with parallel implementations and architectures, is covered in a companion paper in these proceedings [6]. An alternate Gram-Schmidt orthogonalization-based approach is presented in Reference [7].
5 Relating the RLS and LSL Algorithms

5.1 Single Channel MLSL: The LSL Algorithm

In the single channel case (m = 1), all of the recursive quantities are scalars. Because no matrices are involved, it is unnecessary to use the matrix inversion lemma to reformulate the recursions, and an algorithm may be derived directly from Equations 10-17. This yields the O(p) LSL algorithm:

$$ e_{p,t} = e_{p-1,t} - \frac{\Delta_{p-1,t}}{C^{r}_{p-1,t-1}}\, r_{p-1,t-1} \qquad (32) $$

$$ r_{p,t} = r_{p-1,t-1} - \frac{\Delta_{p-1,t}}{C^{e}_{p-1,t}}\, e_{p-1,t} \qquad (33) $$

$$ \Delta_{p-1,t} = \lambda\,\Delta_{p-1,t-1} + \frac{e_{p-1,t}\, r_{p-1,t-1}}{\gamma_{p-1,t-1}} \qquad (34) $$

$$ C^{e}_{p,t} = \lambda\, C^{e}_{p,t-1} + \frac{e_{p,t}^{2}}{\gamma_{p,t-1}} \qquad (35) $$

$$ C^{r}_{p,t} = \lambda\, C^{r}_{p,t-1} + \frac{r_{p,t}^{2}}{\gamma_{p,t}} \qquad (36) $$

$$ \gamma_{p,t} = \gamma_{p-1,t} - \frac{r_{p-1,t}^{2}}{C^{r}_{p-1,t}} \qquad (37) $$
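A minimal runnable sketch of these single-channel recursions follows (NumPy; the eps regularization of the covariances and the AR(2) test signal are our choices, not the paper's):

```python
import numpy as np

def lsl_forward_errors(y, p, lam=0.99, eps=1e-2):
    """Unnormalized single-channel LSL predictor; a sketch of Eqs. 32-37.

    eps initializes the error covariances (a common regularization, not
    taken from the paper). Returns e_{p,t}: y(t) minus its least squares
    prediction from y(t-1) ... y(t-p)."""
    T = len(y)
    D       = np.zeros(p)      # Delta_{q,t-1}, q = 0..p-1  (Eq. 34 state)
    Ce      = np.full(p, eps)  # C^e_{q,t-1}                (Eq. 35 state)
    Cr_prev = np.full(p, eps)  # C^r_{q,t-1}                (Eq. 36 state)
    r_prev  = np.zeros(p)      # r_{q,t-1}
    g_prev  = np.ones(p)       # gamma_{q,t-1}
    e_out = np.zeros(T)

    for t in range(T):
        e = r = y[t]           # order-0 errors equal the input sample
        g = 1.0                # gamma_{0,t} = 1
        r_new = np.zeros(p); Cr_new = np.zeros(p); g_new = np.ones(p)
        for q in range(p):
            D[q]  = lam * D[q]  + e * r_prev[q] / g_prev[q]   # Eq. 34
            Ce[q] = lam * Ce[q] + e * e / g_prev[q]           # Eq. 35
            Cr_new[q] = lam * Cr_prev[q] + r * r / g          # Eq. 36
            r_new[q], g_new[q] = r, g                         # save order-q state
            e_next = e - (D[q] / Cr_prev[q]) * r_prev[q]      # Eq. 32
            r_next = r_prev[q] - (D[q] / Ce[q]) * e           # Eq. 33
            g = g - r * r / Cr_new[q]                         # Eq. 37
            e, r = e_next, r_next
        r_prev, Cr_prev, g_prev = r_new, Cr_new, g_new
        e_out[t] = e
    return e_out

# The order-2 prediction errors of an AR(2) process approach the driving noise.
rng = np.random.default_rng(0)
w = rng.standard_normal(2000)
y = np.zeros(2000)
for t in range(2, 2000):
    y[t] = 1.5 * y[t - 1] - 0.7 * y[t - 2] + w[t]
print(np.var(lsl_forward_errors(y, p=2)[-500:]) / np.var(w[-500:]))  # near 1.0
```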
5.2 Single Tap MLSL: The RLS Algorithm

When the number of filter taps is set to one (p = 1), the problem reduces to that of a general linear combiner, with no shift invariance relation between the inputs. In this case the multichannel equations reduce to the traditional O(m²) RLS algorithm. To derive this, set p = 1 and apply the appropriate definitions and substitutions in Equations 19-26 (Equations 40-46). This decouples each of the joint process, forward, and backward estimation problems. Joint process estimation can be achieved by using the last three equations, which can be manipulated into the traditional RLS form as follows. Define the weight vector as

$$ w_t \triangleq C_t\, a_t , \qquad (47) $$

where a_t is the exponentially weighted cross-correlation vector of Equation 46. Then using Equation 46 yields

$$ w_t = \lambda\, C_t\, a_{t-1} + C_t\, y(t)\, z(t) . \qquad (48) $$

Use Equation 44 to substitute for C_t. Defining the “Kalman” gain vector as k_t ≜ C_t y(t) and collecting terms yields

$$ w_t = w_{t-1} + k_t \big[\, z(t) - w_{t-1}'\, y(t) \,\big] , \qquad (53) $$

the O(m²) RLS algorithm, which computes both the weight vector and the error signal.
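A compact runnable sketch of this single-tap case (NumPy; the eps initialization of C and the synthetic test data are our choices, not the paper's):

```python
import numpy as np

def rls(Y, z, lam=0.99, eps=1e-2):
    """Exponentially weighted RLS; a sketch of the single-tap (p = 1) case.

    Y: (T, m) array of reference samples y(t)'; z: length-T primary signal.
    eps sets the initial inverse covariance (a common choice, not the
    paper's). Returns the final weights and the a priori errors."""
    T, m = Y.shape
    C = np.eye(m) / eps          # C_t: inverse of the weighted covariance
    w = np.zeros(m)
    err = np.zeros(T)
    for t in range(T):
        y = Y[t]
        # a priori estimation error: z(t) - w_{t-1}' y(t)
        err[t] = z[t] - w @ y
        # gain k_t = C_t y(t), with C_t updated by the matrix inversion lemma
        Cy = C @ y
        k = Cy / (lam + y @ Cy)
        C = (C - np.outer(k, Cy)) / lam
        # Equation 53: w_t = w_{t-1} + k_t [z(t) - w_{t-1}' y(t)]
        w = w + k * err[t]
    return w, err

# Example: recover a fixed 3-channel combiner from noisy data.
rng = np.random.default_rng(1)
Y = rng.standard_normal((5000, 3))
w_true = np.array([0.5, -1.0, 2.0])
z = Y @ w_true + 0.01 * rng.standard_normal(5000)
w_hat, _ = rls(Y, z)
print(np.round(w_hat, 3))        # close to w_true
```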
6 Summary

A multichannel least squares lattice (MLSL) algorithm has been derived. It provides estimates of a primary signal in terms of the last p samples of m reference signals. The recursions developed were optimized for multichannel use and provide an algorithm with computational complexity of O(pm²). The final per-time-step operation count, including joint estimation, is 3p divisions, 10pm² + 9pm + 5p multiplications, and 7pm² + 3pm additions. In the single reference channel case (m = 1), the algorithm simplifies to the O(p) least squares lattice (LSL) algorithm. In the single filter tap case (p = 1), the algorithm simplifies to the O(m²) “Kalman” recursive least squares (RLS) algorithm. Hence the MLSL algorithm provides an explicit relationship between, and common derivation of, both the LSL and RLS algorithms.

References

[1] B. Friedlander. Lattice filters for adaptive processing. Proc. IEEE, 70(8):829-867, August 1982.

[2] J. M. Cioffi and T. Kailath. Fast, recursive-least-squares transversal filters for adaptive filtering. IEEE Trans. Acoustics, Speech, and Signal Processing, ASSP-32(2):304-337, April 1984.

[3] H. Lev-Ari, T. Kailath, and J. Cioffi. Least-squares adaptive lattice and transversal filters: A unified geometric theory. IEEE Trans. Information Theory, IT-30(2):222-236, March 1984.

[4] B. Porat, B. Friedlander, and M. Morf. Square root covariance ladder algorithms. IEEE Trans. Automatic Control, AC-27(4):813-829, August 1982.

[5] P. S. Lewis. Algorithms and architectures for multichannel enhancement of magnetoencephalographic signals. In Proc. 21st Annual Asilomar Conference on Signals, Systems, and Computers, IEEE, Pacific Grove, CA, November 1987. Los Alamos National Laboratory Report LA-UR-87-3527.

[6] P. S. Lewis. QR algorithm and array architectures for multichannel adaptive least squares lattice filters. In Proc. 1988 Int. Conf. Acoustics, Speech, and Signal Processing, IEEE, New York, NY, April 1988. Los Alamos National Laboratory Report LA-UR-87-3972.

[7] F. Ling and J. G. Proakis. A generalized multichannel least squares lattice algorithm based on sequential processing stages. IEEE Trans. Acoustics, Speech, and Signal Processing, ASSP-32(2):381-389, April 1984.
Figure 1: Multichannel least squares estimation block diagram.

Table 1: Variable definitions. (Rows: forward prediction error, m × 1; backward prediction error, m × 1; cross correlation coefficient, m × m; forward error covariance, m × m; backward error covariance, m × m; likelihood variable, 1 × 1; joint estimation error, 1 × 1.)

Table 2: Update recursions for the Table 1 variables.

Table 3: Serial MLSL algorithm and operation count. (Intermediate variables are marked by a tilde.)