IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 40, NO. 4, APRIL 1992

Modular and Numerically Stable Fast Transversal Filters for Multichannel and Multiexperiment RLS

Dirk T. M. Slock, Member, IEEE, Luigi Chisci, Hanoch Lev-Ari, Member, IEEE, and Thomas Kailath, Fellow, IEEE

Abstract: In this paper, we present scalar implementations of multichannel and multiexperiment fast recursive least squares algorithms in transversal filter form (the so-called FTF algorithms). The point is that by processing the different channels and/or experiments sequentially, i.e., one at a time, the multichannel and/or multiexperiment algorithm is decomposed into a set of intertwined single-channel single-experiment algorithms. For multichannel algorithms, the general case of possibly different filter orders in different channels is handled. Geometrically, this modular decomposition approach corresponds to a Gram-Schmidt orthogonalization of multiple error vectors. Algebraically, this technique corresponds to matrix triangularization of error covariance matrices and converts matrix operations into a regular set of scalar operations. Modular algorithm structures that are amenable to VLSI implementation on arrays of parallel processors follow naturally from our approach. Numerically, the resulting algorithm benefits from the advantages of triangularization techniques in block processing, which are a well-known part of Kalman filtering expertise. Furthermore, recently introduced stabilization techniques for proper control of the propagation of numerical errors in the update recursions of FTF algorithms are also incorporated.

I. INTRODUCTION

OVER the last few decades, adaptive filters have become increasingly popular in various signal processing applications (see [1], [2]). Within the family of adaptive algorithms, the group of recursive least squares (RLS) algorithms is often favored over their stochastic-approximation-based least mean square (LMS) counterparts, based on the consideration of issues such as convergence rate, tracking capability, and steady-state performance, as discussed extensively in the literature [3]-[5]. Single-channel algorithms have found a wide range of applications [6], such as high resolution spectrum estimation, noise cancellation, speech and biomedical signal processing, etc. The multichannel algorithm broadens this range significantly further and accommodates such applications as identification of systems described by difference equations with multiple polynomials [7], adaptive minimum-variance control [8, sec. 6.3], fractionally spaced and decision-feedback equalizers [9], frequency domain adaptive filtering [10], multirate signal processing [11, p. 271], image enhancement [12], and adaptive beamforming with antenna arrays [13]. Although authors have generally made an effort to present algorithms in the general multichannel context, it appears that single-channel algorithms have been substantially more popular with practitioners. This discrepancy between theory and practice is at least partly due to the fact that multichannel algorithms (in their straightforward extension of the single-channel forms) require matrix computations. Because of these matrix operations, both the simplicity and the relatively high throughput of single-channel algorithms are sacrificed in such multichannel implementations. Also, potential numerical difficulties are inherently associated with these matrix operations. The difficulties in implementation of multichannel algorithms have spurred a growing interest in scalar implementations of multichannel recursions, namely, implementations that require no matrix processing. Moreover, the increasing interest in dedicated VLSI hardware implementation favors algorithms that can be implemented in modular architectures with a regular and highly parallel structure.

Manuscript received May 11, 1989; revised March 3, 1991. This work was supported in part by the Joint Services Program at Stanford University (U.S. Army, U.S. Navy, U.S. Air Force) under Contract DAAL03-88-C0011 and the SDI/IST Program managed by the Office of Naval Research under Contract N00014-85-K-0550. D. T. M. Slock was with the Department of Electrical Engineering, Information Systems Laboratory, Stanford University, Stanford, CA 94305. He is now with Eurecom, CICA, F-06560 Sophia Antipolis, France. L. Chisci was with the Department of Electrical Engineering, Information Systems Laboratory, Stanford University, Stanford, CA 94305, on leave from the Dipartimento di Sistemi e Informatica, Universita degli Studi di Firenze, 50139 Firenze, Italy, while this work was being done. H. Lev-Ari was with the Department of Electrical Engineering, Information Systems Laboratory, Stanford University, Stanford, CA 94305. He is now with the Department of Electrical and Computer Engineering, Northeastern University, Boston, MA 02115. T. Kailath is with the Department of Electrical Engineering, Information Systems Laboratory, Stanford University, Stanford, CA 94305. IEEE Log Number 9106021.
In [14], a general principle for modular decomposition of multichannel recursions was outlined and then applied to multichannel lattice algorithms; modular multichannel lattice algorithms were also independently derived in [15], [16], [37]. However, due to their lower computational complexity, fast transversal filters (FTF's) [3] have challenged the popularity of their lattice counterparts. Furthermore, the recent numerical stabilization of the FTF

1053-587X/92$03.00 © 1992 IEEE


algorithms [17], [18] has significantly increased the potential applicability of these algorithms. Another type of block processing appearing in least squares parameter estimation is multiexperiment filtering [19]. This type of filtering arises when we have simultaneous measurements available from different replicas of the same model (different experiments). All signals available will yield valuable information about the same unknown system, and it will be to our advantage to combine these measurements in the estimation of the system parameters. An application might be the identification of the eigenmodes of a large vibrating structure with several sensors attached. Another application arises in the following form of block processing. Assume it is sufficient to update the FIR filter estimate only every q samples instead of every sample. Then one can redefine the sampling frequency to have a period that is q times larger, and consider the q data samples available in every new sample period to come from q experiments with the same system. In this way we will have the same sequence of FIR filter estimates available, but subsampled by a factor q. This approach does not provide a reduction in the total computation count, but leads to a different algorithm structure that may be more amenable to parallel implementation, allowing indirectly for an increased throughput. Again, the classical approach to multiexperiment filtering [19] is to process the measurements en bloc by defining the measurements to be vectors, leading to an algorithm that is notationally a straightforward extension of the single-experiment version. We will see that the multichannel and the multiexperiment algorithms are in some sense dual extensions of the same basic single-channel single-experiment algorithm.
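The q-fold blocking argument above amounts to simple index bookkeeping; a minimal sketch (Python; the values of q and the sample stream are illustrative assumptions of this note, not from the paper):

```python
# Regroup a stream sampled at the original rate so that each new (q times
# slower) sample period delivers q simultaneous "experiments" with the same
# underlying system, as described in the text.
q, n_periods = 4, 5
stream = list(range(q * n_periods))              # original-rate samples 0..19
experiments = [stream[k::q] for k in range(q)]   # experiment k sees samples k, k+q, ...

assert experiments[0] == [0, 4, 8, 12, 16]
assert len(experiments) == q
```

Each sublist is then processed as one experiment per (slower) sample period, so the filter estimate sequence is obtained subsampled by q, exactly as the text describes.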
In this paper, we apply the geometric modular decomposition principle of [14] to arrive at new multichannel and multiexperiment FTF algorithms in which the updates are processed sequentially (i.e., one at a time). The geometric approach to recursive least-squares estimation focuses on the orthogonality principle of least squares [20] and involves projection operators. In this approach, the following geometric identity is the basis for all recursive updating identities¹:

P_{[X Y]} = P_X + P_{P_X^⊥ Y} = P_X + P_X^⊥ Y (Y^H P_X^⊥ Y)^{-1} Y^H P_X^⊥   (1)

where P_X (P_X^⊥) denotes the orthogonal projection on (the orthogonal complement of) the subspace spanned by the columns of the matrix X, and [X Y] denotes the matrix that has matrices X and Y as block columns (see also Table I for explicit definitions). To put (1) in words, if one makes an orthogonal decomposition {X, Y} = {X} ⊕ {P_X^⊥ Y} of the space spanned by the columns of data matrices X and Y, then the projection onto this space {X, Y} is equal to the sum of the projections onto the two orthogonal subspaces {X} and {P_X^⊥ Y}. The matrix P_X^⊥ Y is known as the residual (also error or innovation) of Y with respect to (orthogonal projection on) X. The matrix Y^H P_X^⊥ Y = (P_X^⊥ Y)^H (P_X^⊥ Y) is known as the residual (or error) covariance (matrix). Now, suppose we decompose the column space of Y into k subspaces Y_i, viz.,

Y = [Y_1 Y_2 ⋯ Y_k]   (2)

then the orthogonal decomposition in (1) can be done recursively as

P_{[X Y]} = P_X + P_{P_X^⊥ Y_1} + P_{P_{[X Y_1]}^⊥ Y_2} + ⋯ + P_{P_{[X Y_1 ⋯ Y_{k-1}]}^⊥ Y_k}   (3)

where the set of subspaces {P_X^⊥ Y_i, i = 1, ⋯, k} is replaced recursively by an orthogonalized set, viz.,

{P_X^⊥ Y_1, P_X^⊥ Y_2, ⋯, P_X^⊥ Y_k} ⇒ {P_X^⊥ Y_1} ⊕ {P_{[X Y_1]}^⊥ Y_2} ⊕ ⋯ ⊕ {P_{[X Y_1 ⋯ Y_{k-1}]}^⊥ Y_k}.   (4)

It is well known that the FTF (as well as the lattice [21] and even the conventional) RLS algorithm is built up from an interconnection of building blocks, where each block corresponds to the evaluation of P_{[X Y]} from P_X. Instead of carrying out the matrix computations that are associated with the multichannel/multiexperiment update, our new modular algorithm carries out a sequence of scalar updates (where "scalar" refers to the dimension of the subspaces Y_i), which corresponds to the sequential processing of the Y_i, viz.,

P_X → P_{[X Y_1]} → P_{[X Y_1 Y_2]} → ⋯ → P_{[X Y_1 ⋯ Y_k]} = P_{[X Y]}.   (5)

When applied to specific estimation algorithms, this sequential processing leads to a modular architecture consisting of regularly interconnected simple processing blocks that perform scalar operations (if we take each Y_i to be a single column), hence the label modular. Algebraically, the orthogonalization procedure of (4) corresponds to a so-called lower-diagonal-upper (LDU) triangular factorization of the error covariance matrix Y^H P_X^⊥ Y in (1). The (inverses of the) elements of the diagonal factor remain in (3), while the triangular factors of the LDU factorization are absorbed in the multiplying quantities. This decomposition eliminates the issue of matrix inversion in (1) and in the algorithms based on this modular approach. Furthermore, the diagonal elements reveal very valuable information about the conditioning of a multichannel problem.

¹In this paper the superscripts H, # denote Hermitian (complex conjugate) transpose and arbitrary generalized inverse, respectively.
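The identities (1)-(5) are easy to verify numerically. The sketch below (Python with NumPy; the sizes and random data are illustrative assumptions, and the generalized inverse is taken as an ordinary inverse for full-rank data) contrasts the block update (1) with the sequential scalar updates of (3)-(5):

```python
import numpy as np

rng = np.random.default_rng(0)

def proj(X):
    # Orthogonal projection onto the column space of X: P_X = X (X^H X)^{-1} X^H
    return X @ np.linalg.solve(X.conj().T @ X, X.conj().T)

m = 8
X = rng.standard_normal((m, 2))
Y = rng.standard_normal((m, 3))   # k = 3 single-column subspaces Y_1, Y_2, Y_3

# Block update, eq. (1): P_[X Y] = P_X + P_X^perp Y (Y^H P_X^perp Y)^{-1} Y^H P_X^perp
Pperp = np.eye(m) - proj(X)
E = Pperp @ Y
P_block = proj(X) + E @ np.linalg.solve(E.conj().T @ E, E.conj().T)

# Sequential (Gram-Schmidt) updates, eqs. (3)-(5): append one column Y_i at a time
Z = X
P_seq = proj(X)
for i in range(Y.shape[1]):
    yi = Y[:, i:i+1]
    ei = yi - P_seq @ yi                                   # residual P_[X Y_1..Y_{i-1}]^perp Y_i
    P_seq = P_seq + ei @ ei.conj().T / (ei.conj().T @ ei)  # scalar (rank-1) update
    Z = np.hstack([Z, yi])

assert np.allclose(P_block, P_seq)
assert np.allclose(P_seq, proj(Z))
```

The sequential path only ever inverts scalars (the residual energies), which is precisely the point of the modular decomposition.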

We define the exponentially weighted, prewindowed data vectors d_T and x_{i,T}, containing the following entries (see also [3], [25]):

d_T = [d^H(T)  λ^{1/2} d^H(T-1)  ⋯  λ^{T/2} d^H(0)  0  ⋯]^H,
x_{i,T} = [x_i^H(T)  λ^{1/2} x_i^H(T-1)  ⋯  λ^{T/2} x_i^H(0)  0  ⋯]^H.   (6)

The output of the ith transversal filter forms a linear combination of the N_i current and previous samples of the input signal x_i(T). Let W_{N,T} = [W^{(1)}_{N_1,T} ⋯ W^{(p)}_{N_p,T}] denote the concatenation of the time-varying impulse responses of the p FIR filters (i.e., a row vector of length N). The error signal resulting from the approximation of the desired response can be described as

ε_N(T) = d(T) + W_{N,T} X_N(T)   (7)

where

X_N(T) ≜ [X^{(1)H}_{N_1}(T) ⋯ X^{(p)H}_{N_p}(T)]^H,   with   X^{(i)}_{N_i}(T) ≜ [x_i^H(T) ⋯ x_i^H(T - N_i + 1)]^H   (8)

is the composite regressor vector. In the prewindowed least squares method with exponential weighting, we assume that no data is available before time T = 0, and that the distant past has an exponentially decreasing weight in order to be able to track slowly time-varying phenomena. In the general case of multiple desired-response signals, d(T), ε_N(T), and the transversal filter coefficients are r × 1 complex vectors. The filter coefficients W_{N,T} are obtained as the solution of the following weighted quadratic minimization problem:

E_N(T) = min_{W_{N,T}} Σ_{t=0}^{T} λ^{T-t} ε_N^H(t) Ω ε_N(t)   (9)

where Ω is an r × r positive definite weighting matrix. The solution turns out to be independent of the weighting matrix Ω, and hence we can consider Ω = I in the criterion (9). Furthermore, for Ω = I it can easily be seen that the r error components can be minimized independently. Hence, even for general Ω, the problem decomposes into r independent subproblems, each one of which corresponds to the case r = 1. We will therefore henceforth assume r = 1.

We will consider the following data matrices:

X_{i,N,T} = [x^{(1)}_{N_1,T} ⋯ x^{(i)}_{N_i,T}  x^{(i+1)}_{N_{i+1},T-1} ⋯ x^{(p)}_{N_p,T-1}]   (13)

X_{i,N+1,T} ≜ [x^{(1)}_{N_1,T} ⋯ x^{(i-1)}_{N_{i-1},T}  x^{(i)}_{N_i+1,T}  x^{(i+1)}_{N_{i+1},T-1} ⋯ x^{(p)}_{N_p,T-1}]   (14)

where x^{(i)}_{N_i,T} is the partial data matrix with the data from channel i, X_{i,N,T} is the data matrix in which the order update and downdate processing at time T has been done for channels 1 through i (the remaining channels still containing the data from time T - 1), and X_{i,N+1,T} is similar to X_{i,N,T} but with channel i being right in between order update and order downdate (whence an order increase of 1). This implies X_{0,N,T} = X_{N,T-1}, and we also define X_{N,T} ≜ X_{p,N,T}. The least squares criterion E_N(T) in (9) can now be seen to be the squared norm of the following error vector:

ε_{N,T} ≜ d_T + X_{N,T} W^H_{N,T}.   (15)

The solution W_{N,T} to the LLS problem satisfies the orthogonality condition X^H_{N,T} ε_{N,T} = 0, and hence we find

W_{N,T} = -d_T^H X_{N,T} [X^H_{N,T} X_{N,T}]^{-1}.   (16)

Apart from the projection operator P_X defined earlier, we introduce the sample covariance matrix R_X ≜ X^H X, and the filter operator (transpose of the left inverse of X) K_X ≜ X (X^H X)^{-1} = X R_X^{-1}. Note that P_X Y = X (Y^H K_X)^H, so Y^H K_X is the filter that, when operating on X, will produce the projection of Y onto X. To shorten expressions, let, e.g., P_{N,T} denote P_{X_{N,T}}. With the notation thus introduced,


we can rewrite (16), (15), and (9) as

W_{N,T} = -d_T^H K_{N,T},   ε_{N,T} = P^⊥_{N,T} d_T,   E_N(T) = d_T^H P^⊥_{N,T} d_T.   (17)
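The weighted criterion (9) and the solution (16)/(17) can be made concrete with a small numerical sketch (Python/NumPy; the signals, filter length, forgetting factor, and noise level are illustrative assumptions for a single channel with r = 1):

```python
import numpy as np

rng = np.random.default_rng(1)
lam, N, T = 0.95, 4, 50          # forgetting factor, filter length, last time index

# Prewindowed single-channel data (x(t) = 0 for t < 0); w_true is illustrative
x = rng.standard_normal(T + 1)
w_true = np.array([0.5, -0.3, 0.2, 0.1])
d = np.convolve(x, w_true)[: T + 1] + 0.01 * rng.standard_normal(T + 1)

# Row t of the weighted data matrix: lam^{(T-t)/2} [x(t) ... x(t-N+1)]
xpad = np.concatenate([np.zeros(N - 1), x])
X = np.array([lam ** ((T - t) / 2) * xpad[t : t + N][::-1] for t in range(T + 1)])
dT = lam ** ((T - np.arange(T + 1)) / 2) * d

# Eq. (16)/(17): W = -d^H X (X^H X)^{-1} = -d^H K_X (real data here, so H -> T)
K = X @ np.linalg.inv(X.T @ X)
W = -dT @ K
eps = dT + X @ W                  # error vector (15): the part of d orthogonal to X

assert np.allclose(X.T @ eps, 0)  # orthogonality condition X^H eps = 0
assert np.allclose(W, -w_true, atol=0.05)
```

Note that W comes out close to the negative of the generating filter because of the sign convention ε = d + X W^H used in (15).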

In order to arrive at a time recursion for the filter solution W_{N,T}, it is useful to introduce a pinning vector u ≜ [1 0 ⋯ 0]^H [26], [24]. Then we can describe the influence of the most recent data samples d(T) = d_T^H u and x_i(T) = x_{i,T}^H u on the old solution W_{N,T-1} via the Kalman gain C_{N,T}, which has an interesting interpretation (in unnormalized form [3]) as the optimal LLS estimation filter for the pinning vector, viz.,

min_{C_{N,T}} || u + X_{N,T} C_{N,T}^H ||²  =  γ_N(T)  =  u^H P^⊥_{N,T} u   (18)

resulting in C_{N,T} = -u^H K_{N,T} (see also Table I). The notation X(T) will generically denote the upper element (row) of a vector (matrix) X_T: X(T) = X_T^H u. The matrix S will denote a shift matrix (ones on the first subdiagonal, zeros elsewhere), corresponding to the pinning vector u (namely, uu^H + S S^H = I), with the shifting property S^H x_T = λ^{1/2} x_{T-1}. Finally, we also introduce a set of (N + 1) × (N + 1) permutation matrices:

𝒫_i ≜ [ I_{Σ_i} ⊕ S_{N+1-Σ_i} ] + u_{Σ_i+1} u_{N+1}^H

where Σ_i ≜ Σ_{k=1}^{i} N_k, I_a and S_a are unity and shift matrices of the indicated dimension, and u_j is the jth unit vector. The permutation matrix 𝒫_i is such that the zero column in [X_{i+1,N,T} 0] 𝒫_i is permuted to the position of column x_{i+1,T} in X_{i+1,N+1,T}, or the zero column in [X_{i,N,T} 0] 𝒫_i to the position of column x_{i,T-N_i} in X_{i,N+1,T}.

III. MULTICHANNEL ALGORITHM DERIVATION

The key updating identities needed for the derivation of FTF algorithms are summarized in Table I. A derivation of these updating identities can be found in [3], [25]. These identities are all rooted in the previously noted basic geometric updating identity (1). This update formula (1) is often referred to as an order update, since it corresponds to an increase in dimension of the projection subspace. The particular choice Y = u in (1) yields the so-called time update formulas (P_u^⊥ = I - uu^H = S S^H). From the updating formulas for the projection operator P_X, one can derive corresponding update formulas for K_X, R_X^{-1}, etc. In the alternative algebraic derivation [27], one can start with an update formula for R_X^{-1}, using a well-known inversion lemma for partitioned matrices, and build on that to find the update formulas for K_X and P_X. Note that the projection onto the space spanned by a set X of vectors is invariant w.r.t. a permutation X𝒫 of the order of these vectors: P_{X𝒫} = P_X. For the effect of a permutation 𝒫 of the columns of X on K_X and R_X, we have K_{X𝒫} = K_X 𝒫 and R_{X𝒫} = 𝒫^H R_X 𝒫.

A. Conventional Multichannel Algorithm

As in [3], [24], we will denote the conventional multichannel FTF forward and backward prediction filters and error covariances by A_{N,T}, B_{N,T}, α_N(T), and β_N(T). However, the treatment of multichannel problems in this paper differs from the one in [3], [24], [28], [29] in that we are handling multichannel problems with possibly different filter orders in different channels. To that end we have defined a different regression vector X_N(T) in (8), differing from the usual regression vector by a permutation of the elements. The columns of A_{N,T} and B_{N,T} are ordered correspondingly, so that their outputs e_{N,T} and r_{N,T} are obtained by taking the inner product of the filters with the following extended data matrix

X_{N+p,T} = [x^{(1)}_{N_1+1,T} ⋯ x^{(p)}_{N_p+1,T}].   (19)

This permutation does not alter e_{N,T}, r_{N,T}, α_N(T), or β_N(T). Similar (permutation) comments also apply to W_{N,T}, ε_{N,T}, and γ_N(T). This permuted arrangement was first proposed by the authors of [9] for the fast Kalman algorithm. The key ingredient in FTF algorithms is the existence of two different time-updating strategies. The first strategy is straightforward and is also used in the conventional RLS algorithm. It follows from the decomposition {X_{N,T}, u} = {u} ⊕ {S S^H X_{N,T}} = {u} ⊕ {λ^{1/2} S X_{N,T-1}}. Using (9) in Table I (i.e., I-(9)), this gives rise to (N = Σ_p)

W_{N,T} = W_{N,T-1} + ε_N(T) C_{N,T}   (21)
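As a concrete check of this first time-updating strategy, the sketch below (Python/NumPy; synthetic single-channel data, a small start-up regularization, and the a priori error convention are all assumptions of this illustration, since the paper's Table I conventions are not reproduced in this chunk) runs the gain-based recursion against the batch solution (16):

```python
import numpy as np

rng = np.random.default_rng(2)
lam, N, T = 0.98, 3, 40
x = rng.standard_normal(T + 1)
d = rng.standard_normal(T + 1)
xpad = np.concatenate([np.zeros(N - 1), x])
reg = lambda t: xpad[t : t + N][::-1]     # regressor x_N(t) = [x(t) ... x(t-N+1)]

# Recursion of the form (21): W_T = W_{T-1} + e(T) C_{N,T}, with the
# unnormalized Kalman gain C_{N,T} = -x_N^H(T) R_{N,T}^{-1} and e(T) the
# a priori error (an assumption of this sketch).
delta = 1e-6                               # start-up regularization (assumption)
R = delta * np.eye(N)
W = np.zeros(N)
for t in range(T + 1):
    xt = reg(t)
    R = lam * R + np.outer(xt, xt)         # sample covariance time update
    C = -np.linalg.solve(R, xt)            # Kalman gain (as a column vector)
    e = d[t] + W @ xt                      # a priori error
    W = W + e * C
    gamma = 1 + C @ xt                     # likelihood variable gamma_N(t)
    assert np.allclose(d[t] + W @ xt, gamma * e)   # a posteriori = gamma * a priori

# Batch solution (16) over the same exponentially weighted data
Xb = np.array([lam ** ((T - t) / 2) * reg(t) for t in range(T + 1)])
db = lam ** ((T - np.arange(T + 1)) / 2) * d
Wb = -db @ Xb @ np.linalg.inv(Xb.T @ Xb + delta * lam ** (T + 1) * np.eye(N))
assert np.allclose(W, Wb, atol=1e-6)
```

The inner assertion also exhibits the role of γ_N(T) from (18) as the conversion factor between a priori and a posteriori errors.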

which is III-(19); III-(17), (18) follow similarly from I-(5), I-(6). To complete the time update, we need a recursion for C_{N,T}. The conventional RLS algorithm invokes a Riccati equation to update R_{N,T}^{-1}, from which C_{N,T} can be computed (see I-(2)). The FTF algorithm however exploits a second time-updating strategy to find a complete set of filter recursions that can replace the Riccati equation, reducing the computational requirements from O(N²) to O(Np). This second time-updating strategy follows from the special structure of the data matrix X_{N+p,T}, viz.,

X_{N+p,T} = [X_{N,T-1}  x_{p,T} ⋯ x_{1,T}] 𝒫_0 ⋯ 𝒫_{p-1} = [X_{N,T}  x_{p,T-N_p} ⋯ x_{1,T-N_1}] 𝒫_1 ⋯ 𝒫_p.   (22)

Taking the above two partitions of the space spanned by the columns of X_{N+p,T} and substituting in I-(7), I-(8) yields

C_{N+p,T} = [C_{N,T-1}  0_{1×p}] 𝒫_0 ⋯ 𝒫_{p-1} - e_N^H(T) λ^{-1} α_N^{-1}(T-1) A_{N,T-1}
          = [C_{N,T}  0_{1×p}] 𝒫_1 ⋯ 𝒫_p - r_N^H(T) λ^{-1} β_N^{-1}(T-1) B_{N,T-1}   (23)

and the corresponding update equations for the likelihood quantity γ_N^{-1}(T). These order updates and downdates involve the forward and backward prediction filters A_{N,T} and B_{N,T}

TABLE II: CONVENTIONAL MULTICHANNEL FTF ALGORITHM STRUCTURE (Prediction Problem / Joint-Process Extension)

and their outputs (prediction errors) e_N(T) and r_N(T) with sample covariances α_N(T) and β_N(T), respectively (definitions of these quantities can be found in [3], [25], [24], or one can parallel the definitions in Tables V, VIII below). The time update for A_{N,T} and B_{N,T} progresses in the same way as for W_{N,T} (using I-(9)). The time updates for the forward and backward error covariances α_N(T) and β_N(T) are found by using their definition in I-(10). Table II depicts the main blocks in a FTF algorithm for the joint-process filtering problem. In the prediction part, the Kalman gain C_{N,T} is propagated via an order update followed by an order downdate. After updating, the Kalman gain is then fed into the joint-process filtering part. In the prediction part, all update equations can be grouped into the two parts of order update and downdate, and update and downdate can each be viewed as a transformation of a set of two vectors. From Table II, one can easily derive that the transformation matrix Θ_U associated with the order update satisfies an orthogonality relation; hence Θ_U is an unnormalized orthogonal rotation. Similarly, one can show that the transformation matrix Θ_D associated with the order downdate satisfies a corresponding relation with a hyperbolic signature; hence Θ_D is an unnormalized hyperbolic rotation. The actual details of the complete algorithm are spelled out in Table III (see also [3], [25], [24] for more extensive derivations). They encompass a numerical stabilization

mechanism that will be discussed later and that is based on the introduction of some redundancies. If one is interested in the algorithm of lowest complexity though, it can be found by stripping the algorithm in Table III of all redundancies. To do so, omit the quantity r_N^s(T) (hence omit III-(8), put K_1 = K_2 = 0 in III-(10), and replace r_N^s(T) in III-(12) by r_N^f(T)) and ignore the superscript s. With an abuse of notation for indexing elements in vectors, carrying over the notation from the single-channel case [3], [25], we let (1 × p) quantities denote the p coefficients of C_{N+p,T} at the position of the p zeros in [C_{N,T-1} 0_{1×p}] 𝒫_0 ⋯ 𝒫_{p-1} and [C_{N,T} 0_{1×p}] 𝒫_1 ⋯ 𝒫_p, respectively. To compare the algorithm with other algorithms to be considered below, we have reproduced the algorithm in a more compact form in Table IV. To that end we have introduced an operator f^U which represents compactly the operations III-(1)-(7) associated with the order update step for the Kalman gain. Similarly, the operator f^D is defined via III-(8)-(15) and corresponds to the order downdate, and f^J denotes the joint-process operation of III-(17)-(19). This algorithm representation retains the FTF skeleton with the rotations interpretation of Table II, but is also appropriately defined to have all the correct details. As can be seen from (22), the second time-updating strategy involves a total of 2p updates and downdates. The conventional multichannel FTF algorithm splits up these 2p operations into a block of p order updates and a block of p downdates. This leads to matrix quantities of size p; e.g., A_{N,T} and B_{N,T} are of dimension p × (N + p) and α_N(T) and β_N(T) are of dimension p × p. The algorithm also requires quantities like α_N^{-1}(T) and involves quite a number of matrix operations. Next we describe an alternative approach.

TABLE III: CONVENTIONAL MULTICHANNEL FTF ALGORITHM (Prediction Problem / Joint-Process Extension; computation and multiplication-cost columns)

TABLE IV: CONVENTIONAL MULTICHANNEL FTF ALGORITHM (COMPACT FORM); p-channel total cost (3 divisions): (6p + 2)N + 5p² + 10p + 12 multiplications

B. Modular Multichannel Algorithm

The modular multichannel FTF algorithm goes one step further in granularity and handles all 2p up- and downdates individually, in a sequential way. This leads to operations involving only scalar quantities. Several benefits for (parallel) implementation and for numerical behavior accrue from this approach. One issue arising from this sequential processing is the freedom of choice in the ordering of the up- and downdates. This issue is illustrated in Fig. 2, where the evolution of the order of the Kalman gain filter is plotted versus up- or downdate operations. All possible paths have to go from (0, N) to (2p, N) and have to lie within or on the dotted square. The path corresponding to the upper two sides of the square is the strategy followed in [30]. This sequential-processing strategy is the unique strategy (up to a permutation of the individual channels) that one obtains if one starts from the conventional multichannel algorithm and straightforwardly applies the decomposition strategy described by (3)-(5) to obtain LDU factorizations of the p × p error covariance matrices α_N(T) and β_N(T). Other possible strategies do not become apparent in such an approach, but they are clear if one takes a more global view of the second FTF time-updating mechanism of (22). We propose a particular strategy (see Fig. 2) involving an intertwining of up- and downdates; in other words, process each channel sequentially. This has the advantage that the processing of any given channel, involving one update and one downdate, corresponds exactly to the processing in a single-channel FTF algorithm. Apart from regularity in the implementation, this means that all expertise on single-channel algorithms (e.g., numerical properties) can be carried over straightforwardly to the modular multichannel algorithm.

Also, the filter order (N) of the resulting p forward and backward prediction filters is constant for each channel and is lower than in the case of the obvious sequential processing strategy of doing all p order updates first, in which case the prediction filter orders range from N to N + p - 1. Of course, one could also consider paths in the lower triangle in Fig. 2, which would lead to even lower prediction filter orders. In particular, following the path along the lower two sides of the dotted square yields the lowest orders. However, the algorithms resulting from such strategies do not correspond well with existing single-channel algorithms. In any case, the difference in computational complexity between all possible strategies is negligible for N >> p (the case of interest for fast algorithms).

Fig. 2. Possible updating and downdating schedules for the Kalman gain in modular multichannel FTF algorithms.

We now turn to the derivation of the algorithm, which is described compactly in Table VI. For a definition and overview of the several quantities that arise in the algorithm, we refer to Table V. The first time-updating strategy remains unaltered, hence VI-(4). For the second strategy, in order to process channel i, we need to go from X_{i-1,N,T} to X_{i,N,T}. After having processed all p channels, we will have moved from X_{0,N,T} = X_{N,T-1} to X_{p,N,T} = X_{N,T}. To depict the particular up- and downdate for channel i, we note that

X_{i,N+1,T} = [X_{i-1,N,T}  x_{i,T}] 𝒫_{i-1} = [X_{i,N,T}  x_{i,T-N_i}] 𝒫_i.   (29)
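The scheduling freedom depicted in Fig. 2 can be sketched with a toy enumeration (Python; the values of p and N and the U/D encoding are illustrative assumptions): each order update raises the Kalman gain filter order by one and each downdate lowers it.

```python
# Order of the Kalman gain filter along two up/downdate schedules, for p = 3
# channels: the blockwise strategy of [30] (all p updates, then all p
# downdates) versus the proposed intertwined per-channel schedule.
p, N = 3, 12
blockwise = ["U"] * p + ["D"] * p     # upper two sides of the square in Fig. 2
intertwined = ["U", "D"] * p          # one update + one downdate per channel

def orders(schedule, start=N):
    out, n = [start], start
    for step in schedule:
        n += 1 if step == "U" else -1
        out.append(n)
    return out

print(orders(blockwise))    # [12, 13, 14, 15, 14, 13, 12]: peaks at N + p
print(orders(intertwined))  # [12, 13, 12, 13, 12, 13, 12]: never exceeds N + 1
```

Both paths start and end at order N, but the intertwined schedule keeps every intermediate filter at order N or N + 1, which is what lets each channel's processing mirror a single-channel FTF step.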

One can now parallel the steps leading to (23) of the conventional algorithm. This yields the following up- and downdates:

C_{i,N+1,T} = [C_{i-1,N,T}  0] 𝒫_{i-1} - e_{i,N}^H(T) λ^{-1} α_{i,N}^{-1}(T-1) A_{i,N,T-1}
           = [C_{i,N,T}  0] 𝒫_i - r_{i,N}^H(T) λ^{-1} β_{i,N}^{-1}(T-1) B_{i,N,T-1}   (30)
and similarly for γ_{i,N}^{-1}(T). Note that the position of the permuted zeros in [C_{i-1,N,T} 0] 𝒫_{i-1} and [C_{i,N,T} 0] 𝒫_i corresponds to the position of the 1 entries in A_{i,N,T-1} and B_{i,N,T-1}, respectively, which also corresponds to the position of x_{i,T} and x_{i,T-N_i}, respectively, in X_{i,N+1,T}, and to the position of C^{(i)0}_{i,N+1,T} and C^{(i)N_i}_{i,N+1,T}, respectively, in C_{i,N+1,T} (with again an abuse of notation). To illustrate the structure of the filter update equations more explicitly, consider the notation C_{i,N,T} = [C^{(1)}_{i,N,T} ⋯ C^{(p)}_{i,N,T}], where C^{(j)}_{i,N,T} denotes the segment of C_{i,N,T} that applies to the signal in channel j, and let A^{(j)}_{i,N,T} and B^{(j)}_{i,N,T} have a similar meaning. Then we can rewrite (30) as

C^{(j)}_{i,N+1,T} = C^{(j)}_{i-1,N,T} + C^{(i)0}_{i,N+1,T} A^{(j)}_{i,N,T-1},   j ≠ i
C^{(i)}_{i,N+1,T} = [0  C^{(i)}_{i-1,N,T}] + C^{(i)0}_{i,N+1,T} A^{(i)}_{i,N,T-1}
C^{(j)}_{i,N,T} = C^{(j)}_{i,N+1,T} - C^{(i)N_i}_{i,N+1,T} B^{(j)}_{i,N,T-1},   j ≠ i
[C^{(i)}_{i,N,T}  0] = C^{(i)}_{i,N+1,T} - C^{(i)N_i}_{i,N+1,T} B^{(i)}_{i,N,T-1}.   (31)

The part with j = i corresponds exactly to the single-channel situation (here the superscripts 0 and N_i correctly indicate the position of these scalars within C^{(i)}_{i,N+1,T}). Note how the modular multichannel algorithm corresponds to a set of intertwined single-channel algorithms. The quantities that are


updated via the first mechanism are recursive in the time index, while the quantities that are updated via the second mechanism are recursive in the channel index.

TABLE V: DEFINITION OF MODULAR MULTICHANNEL FTF VARIABLES (Variable / Definition / TF Computation)

TABLE VI: MODULAR MULTICHANNEL FTF ALGORITHM (Prediction Problem: for i = 1, ⋯, p; p-channel total cost (2p divisions): (6p + 2)N + 16p + 3 multiplications)

C. Modular Ordinary RLS Algorithm

In this subsection we want to briefly indicate how the modular multichannel FTF algorithm actually also encompasses a factorized estimation algorithm for ordinary RLS, or a so-called square-root RLS algorithm [22], [23]. Indeed, since the ordinary RLS algorithm does not assume a shift-invariant structure in the regressor vector, one can assign each element in the regressor vector to a separate channel of first order each. This corresponds to taking N = p in the joint-process part VI-(4). For the prediction part we will take N_i = 0, and hence N = 0. Since

TABLE VII: MODULAR ORDINARY RLS ALGORITHM (Prediction Problem: for i = 1, ⋯, p)

there is no shift invariance to be exploited, we can straightforwardly apply the Gram-Schmidt procedure of (3)-(5) to obtain an order-recursive updating scheme for the Kalman gain involving only order updates. This scheme corresponds to the straight path in Fig. 2 connecting the points (0, 0) and (p, p), which coincides with a side of the square and hence with the first half of the modular FTF strategy of [30]. This path will lead us from C_{0,0,T}, which has no dimensions and hence is easy to initialize, and γ_{0,0}(T) = 1, to the desired C_{p,T} = C_{p,p,T} and γ_p(T) = γ_{p,p}(T). So, taking N = i - 1 in VI-(1), we have gathered all the necessary parts for a modular ordinary RLS algorithm as shown in Table VII. Note that this algorithm only involves the orthogonal rotations f^U and not the hyperbolic rotations f^D. This is indicative of the numerical stability of the ordinary RLS algorithm (see [31], [32]). From the conventional multichannel FTF point of view, the prediction problem involves a zeroth-order prediction, and hence we can write a corresponding expression (32) for the sample covariance matrix R_{p,T}.

In the introduction, we have indicated that the modular approach corresponds to triangular factorization of error covariance matrices. In particular, we will show below in (55) that the modular algorithm of Table VII propagates the factors in a UDL factorization of α_p^{-1}(T), namely,

α_p^{-1}(T) = L_p^H D_p L_p.   (33)

The particular modular RLS algorithm of Table VII is one among many. The Gram-Schmidt orthogonalization procedure of the modular FTF approach leads to a modular algorithm that involves inner products (to compute the output of the filters A_{i,i-1,T}). An alternative modified Gram-Schmidt orthogonalization procedure leads to modular RLS algorithms without inner products. Such algorithms are considered in [33]-[35]. An embedding of the RLS problem into a first-order multichannel joint-process filtering problem and application of modular multichannel fast lattice RLS algorithms [36], [15], [14], [37] or a modular form of multichannel fast QR-RLS algorithms [38] would lead to such inner product-free modular RLS algorithms (see [39] for an example). Different lattice and/or QR algorithms would lead to different forms of modular RLS algorithms. For instance, the lattice algorithm of [40] leads to the (main) modular RLS algorithm of [41].

D. Initialization

We now show how appending a soft constraint to the basic cost function (9) results in a specific initialization of the modular multichannel FTF algorithm. Nonzero initial conditions can arise from additional side information, previous use of other (adaptive) algorithms, or when restarting the algorithm. The augmented cost function becomes

Ē_N(T) = E_N(T) + λ^{T+1} (W_{N,T} - W_0) R_{N,-1} (W_{N,T} - W_0)^H   (36)

where WO = [W: . . . W r - ' ] represents the initial condition and RN,- = X c , - X N ,- ,. The matrix X N ,- I should have (full) rank N . One can absorb the soft constraint into the time series d( . ) and x, ( . ) if X N , conforms to the structure indicated in (12)-(14), and one takes d - , = - X N , W:. In this way, the augmented cost function (36) can still be minimized recursively using the FTF algorithm and the soft-constraint part just leads to a specific -I, initialization for the algorithmic quantities {AlLN, B ~ . N , - l > a ! , N ( - I ) > ~ I , N ( - ~ ) i , = 1, * * * p > , cN,-l> y N ( - 1) and W N -,I = WO.One can in principle choose 7

any X_{N,-1} and figure out the corresponding initial conditions for the above quantities. The following is an easy choice:

[X_{N,-1} as displayed in (37): for each channel i, a single scaled impulse μ_i, with the channels separated by rows of zeros; the displayed matrix is not recoverable from this extraction]

The rows of zeros are introduced so that X_{i,N+1,T} can be of the form indicated in (14). This specification of X_{N,-1} corresponds to the following noncausal part of the signals: x_i(t) = μ_i δ(t + N_i + 1), i = 1, …, p, t < 0. The resulting initial conditions at time -1 with which to start the steady-state algorithm (Table VI) at time 0 are

A_{i,N,-1} = [1 0 ⋯ 0],   B_{i,N,-1} = [0 ⋯ 0 1],
α_{i,N}(-1) = μ_i^2 λ^{N_i+1},   β_{i,N}(-1) = μ_i^2,   i = 1, …, p,
c_{N,-1} = [0 ⋯ 0],   γ_N(-1) = 1,   W_{N,-1} = W_0.   (39)
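The mechanism of absorbing the soft constraint into the data can be checked in a minimal single-channel sketch (illustrative sizes, λ = 1, and R_{N,-1} = μ²I, i.e., the easy choice above specialized to one channel): ordinary least squares on data augmented with fictitious prewindow rows equals the softly constrained solution.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 4, 30                            # filter order and number of data samples
X = rng.standard_normal((T, N))         # data matrix
d = rng.standard_normal(T)              # desired response
w0 = rng.standard_normal(N)             # prior (initial) filter setting
mu = 0.3                                # soft-constraint weight

# Softly constrained LS: min ||d - X w||^2 + mu^2 ||w - w0||^2,
# solved via the regularized normal equations.
w_soft = np.linalg.solve(X.T @ X + mu**2 * np.eye(N),
                         X.T @ d + mu**2 * w0)

# Same solution from ordinary LS on data augmented with fictitious
# prewindow rows X_pre = mu*I and desired response d_pre = X_pre @ w0.
X_aug = np.vstack([mu * np.eye(N), X])
d_aug = np.concatenate([mu * np.eye(N) @ w0, d])
w_aug, *_ = np.linalg.lstsq(X_aug, d_aug, rcond=None)

print(np.allclose(w_soft, w_aug))
```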

Appropriate choices for the μ_i have been discussed in [3], [25].

IV. NUMERICAL CONSIDERATIONS

A major issue in FTF algorithms is the propagation of numerical errors as the algorithm cycles through the recursions. This issue has been studied in [17] for single-channel FTF algorithms, and a solution was proposed there as well. This solution involves computing the quantities r_N^p(T) and γ_N^{-1}(T) in two different ways, denoted r_N^{pf}(T), r_N^{ps}(T) and γ_N^{-1,f}(T), γ_N^{-1,s}(T), respectively. Extensions to the conventional multichannel algorithm were also considered there (and are reflected in Table III). Since the modular multichannel algorithm can be viewed as a set of intertwined single-channel algorithms, its stabilization is fairly straightforward given the single-channel solution. Another numerical issue that makes the multichannel algorithm more intricate than the single-channel one is the possibility of ill-conditioned error covariance matrices α_N(T) and β_N(T). Especially in this respect, the modular multichannel algorithm is clearly preferable to its conventional equivalent.

A. Redundancies for Numerical Robustness

One way of computing r^p_{i,N}(T), which we will denote by r^{pf}_{i,N}(T) (the superscript f denotes filtering), is by exploiting its definition as the output of the backward prediction filter: r^{pf}_{i,N}(T) = B_{i,N,T-1} X_{i,N+1}(T). Another possibility follows from the order downdate of c_{i,N+1,T} to [c_{i,N,T} 0]: the zero entry in the LHS of III-(11) is a known quantity that can be exploited to compute the scalar multiplying B_{i,N,T-1} (we refer to Table III since it contains the definitions of the operators f_O and f_D used in the modular multichannel FTF algorithm of Table VI). This yields an alternative way of computing r^p_{i,N}(T), as in III-(9); we shall denote this second computation by r^{ps}_{i,N}(T) (the superscript s denotes computation by manipulation of scalars). The computationally most efficient form of the FTF algorithm [3], [27] uses r^{ps}_{i,N}(T) and avoids the inner product of the filtering operation.

When infinite precision is used, the two ways of computing r^p_{i,N}(T) yield identical answers. In any practical implementation, however, the answers will differ, and the difference will be purely due to numerical error. Hence, the difference signal can be regarded as an output signal of the error-propagation system, and one can use it in a feedback mechanism in order to influence the error propagation. One such feedback mechanism is obtained by taking as the final value for r^p_{i,N}(T) a convex combination of the two computations, viz.,

r^p_{i,N}(T) = K r^{ps}_{i,N}(T) + (1 - K) r^{pf}_{i,N}(T).   (40)
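A toy sketch (illustrative, not the paper's recursions) of the redundancy idea behind (40): the same inner product is computed in two mathematically equivalent but numerically different single-precision ways, and the error of the convex combination can never exceed the worse of the two.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
b = rng.standard_normal(n).astype(np.float32)   # stand-in for a backward filter B
x = rng.standard_normal(n).astype(np.float32)   # stand-in for a data vector X

# Reference value in double precision.
r_exact = float(b.astype(np.float64) @ x.astype(np.float64))

# "Filtering" computation: one single-precision inner product.
r_f = float(b @ x)

# "Scalar" computation: the same quantity accumulated term by term in
# single precision -- mathematically identical, numerically not.
acc = np.float32(0.0)
for t in range(n):
    acc = np.float32(acc + np.float32(b[t] * x[t]))
r_s = float(acc)

# Convex combination as in (40); in infinite precision the value of K
# would be irrelevant.
K = 0.5
r = K * r_s + (1.0 - K) * r_f
print(abs(r - r_exact) <= max(abs(r_f - r_exact), abs(r_s - r_exact)) + 1e-12)
```

The bound printed above is just the convexity property of (40) for 0 ≤ K ≤ 1; the stabilizing role of K in the actual FTF error system is a dynamic feedback effect that this static sketch does not capture.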

Now, the quantity r^p_{i,N}(T) is used in several instances in the algorithm. So if we use different values for the feedback constant K_j at those different places, then we have more freedom in affecting the error-system dynamics. Again, we emphasize that the quantities computed by the algorithm are independent of the K_j if infinite precision is available. In the conventional multichannel approach, there is the issue of whether to keep the feedback coefficients K_j as scalars or whether to make them p × p matrices. In the modular approach, we choose a straightforward extension of the single-channel solution, involving scalar feedback coefficients. Since time-independent feedback coefficients work in the single-channel algorithm, we can choose channel-independent feedback coefficients in the modular multichannel algorithm.

Also γ^{-1}_{i,N}(T) can be computed in two ways: once by using its interpretation (see (18)) as an inverse residual error energy, see γ^{-1,s}_{i,N}(T) in III-(4), (12), and also in an alternative way as the ratio shown in VI-(3). We might elaborate a bit on the origin of γ^f_p(T) in VI-(3): using the update relations for α, β, and γ, one may show identities that lead to this ratio form.

We can work with γ^{-1,s}_{i,N}(T) for the processing of the p order up- and downdates and limit the use of the alternative computation γ^f_p(T) to once per time update, i.e., once per processing cycle of all p channels, as in VI-(3). The stabilization thus incorporated in Table VI is a "bare bones" solution that works. Some possible generalizations are considered in [24] that could be straightforwardly incorporated here also. As mentioned before, one can obtain the FTF algorithm with the lowest computational complexity (and hence no numerical stabilization) by putting the feedback coefficients K_1 and K_2 equal to zero.

B. Analysis of the Error Propagation

The issue of the propagation of numerical errors as the algorithm cycles through the recursions has been studied extensively in [24] for single-channel FTF algorithms. It was shown there that the computationally most efficient form of the FTF algorithms (all K_j = 0) has an error-propagation system that is exponentially unstable. It was also shown that the proper introduction of redundancies can stabilize the error propagation. Now consider the specific structure of the modular multichannel FTF algorithm proposed in this paper. From Table VI, it is easy to see the structure of the error-propagation transformation (the displayed equation is lost in this extraction); using the γ^f(T) recursion only once every p updates is sufficient to stabilize that part of the error system.

The 2p divisions mentioned in the total computation count of Table VI are "1/x" operations for γ^{-1}_{i,N}(T) in f_O (to be used in III-(7)) and for γ^s_{i,N}(T) in f_D (to be used in III-(13)). The algorithm also requires a "1/x" operation for γ_N(T) (to be used in III-(4)). However, since γ^{-1,f} - γ^{-1,s} presumably is very small compared to γ^{-1}, one can use a first-order Taylor-series expansion for computing γ_N^{-1}(T), viz.,

γ_N^{-1}(T) = γ_N^{-1,s}(T) [2 - γ_N^f(T) γ_N^{-1,s}(T)],   (44)

requiring two multiplications, which are indicated in the cost of VI-(3). The explanation for the computation count in III-(16) can be found in [24]. The total computational cost of the modular multichannel FTF algorithm is (6p + 2)N + 16p + 3 multiplications and 2p divisions, as indicated in Table VI. This should be compared with (6p + 2)N + 5p^2 + 10p + 12 multiplications and 3 divisions for the conventional multichannel algorithm of Table III. Both algorithms have the same coefficient of N, and the difference in computation count is minor (at least for p
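The division-avoiding expansion (44) is one Newton step for the reciprocal, so its error is second order in the mismatch between the two available estimates of γ. A minimal sketch with illustrative numbers:

```python
gamma_f = 0.8125                       # "filtered" value of gamma_N(T)
eps = 1e-4                             # small relative mismatch between the two estimates
gamma_inv_s = (1.0 / gamma_f) * (1.0 + eps)   # "scalar" estimate of 1/gamma, slightly off

# First-order Taylor / Newton step for the reciprocal, as in (44):
# gamma^{-1} ~= gamma^{-1,s} * (2 - gamma^f * gamma^{-1,s});
# two multiplications, no division.
gamma_inv = gamma_inv_s * (2.0 - gamma_f * gamma_inv_s)

rel_err_in = abs(gamma_inv_s - 1.0 / gamma_f) * gamma_f    # ~ eps
rel_err_out = abs(gamma_inv - 1.0 / gamma_f) * gamma_f     # ~ eps**2
print(rel_err_out < rel_err_in**2 * 1.01)
```

Algebraically, if the scalar estimate is (1/γ)(1 + ε), the step returns (1/γ)(1 - ε²), which is why a single correction per processing cycle of all p channels suffices.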
