A New Block Exact Affine Projection Algorithm

Felix Albu*
Department of Telecommunications, “Politehnica” University of Bucharest, Bucharest, Romania
[email protected]

H. K. Kwan
Department of Electrical and Computer Engineering, University of Windsor, Windsor, Canada
kwan
[email protected]
Abstract: A block affine projection algorithm that is mathematically equivalent to the recently proposed Gauss-Seidel Pseudo Affine Projection (GS-PAP) algorithm is proposed. A partitioning method is applied to the original sample-by-sample algorithm. It is shown that the derived algorithm has better convergence and tracking abilities than the NLMS algorithm, at a much lower complexity. Its application in an acoustic echo cancellation system is investigated.
I. INTRODUCTION
Adaptive filtering is essential in applications such as system identification, acoustic echo cancellation, channel equalization and speech coding. In acoustic echo cancellation systems an adaptive filter algorithm is used to reduce the echo. The well-known normalized LMS (NLMS) algorithm has been widely used, but it has slow asymptotic convergence. The affine projection algorithm (APA) [1] can be considered a generalization of the NLMS algorithm. It provides a much improved convergence speed compared to stochastic gradient descent algorithms. Its performance rivals that of the recursive least squares algorithms in many situations, but its complexity is high [1]. Its fast version proposed in [2], when implemented with an embedded FRLS (Fast Recursive Least Squares) algorithm, suffers from numerical instability. In [3] we proposed a simpler, sub-optimal implementation of the AP algorithm based on the Gauss-Seidel method, called the Gauss-Seidel Pseudo Affine Projection (GS-PAP) algorithm. This algorithm is briefly presented in Section 2. In order to further reduce the numerical complexity of the affine projection algorithms, several subband and block exact variants were proposed [6-8]. In Section 3 a novel block exact version of the GS-PAP, called Fast Block Exact GS-PAP (FBEGS-PAP), is proposed. The results of simulations for echo cancellation systems and the computational complexity of the algorithm are evaluated in Section 4. Section 5 concludes this work.

II. GS-PAP ALGORITHM
We use most of the notation from [3]. Let $x(n)$ be the input signal and $y(n)$ the desired output signal. $e(n)$ is the output error and $\bar{e}(n)$ is the normalized error. $X(n) = [x(n), \ldots, x(n-L+1)]^T$, where $L$ is the filter length. $R(n)$ is the auto-correlation matrix of the signal. $\xi(n) = [x(n), \ldots, x(n-N+1)]^T$, where $N$ is the affine projection order. $\delta(n)$ is a regularization factor that prevents the input auto-correlation matrix from becoming ill-conditioned. The initial value of the regularization parameter is computed as 1% of the maximum possible value of $R(n)$. $U(n) = [u(n), \ldots, u(n-L+1)]^T$ is the approximated de-correlation vector. $b$ is an $N$ vector with only one non-zero element, which is unity at the top. $H(n) = [h_1(n), \ldots, h_L(n)]^T$ is the filter coefficients vector. $P$ is an $N$ length vector and $P_i$, $i = 0$ to $N-1$, is its $i$th element.

* The author is now with Aristotle University of Thessaloniki, Greece.

Below are the equations of the derived Gauss-Seidel Pseudo Affine Projection (GS-PAP) algorithm [3]:

Initialization:

$$X(-1) = 0, \quad R(-1) = \delta(-1) I, \quad P(-1) = b / \delta(-1), \quad U(-1) = 0, \quad H(-1) = 0 \qquad (1)$$
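At each sample $n \ge 0$:

$$R(n) = R(n-1) + \xi(n)\xi^T(n) - \xi(n-L)\xi^T(n-L) + [\delta(n) - \delta(n-1)] I \qquad (2)$$

$$R(n) P(n) = b \quad \text{(solved with the Gauss-Seidel method)} \qquad (3)$$

$$e(n) = y(n) - X^T(n) H(n-1) \qquad (4)$$

$$U(n) = \frac{1}{P_0(n)} \sum_{i=0}^{N-1} P_i(n) X(n-i) \qquad (5)$$

$$\bar{e}(n) = \frac{e(n)}{U^T(n) U(n) + \delta(n)} \qquad (6)$$

$$H(n) = H(n-1) + \mu U(n) \bar{e}(n) \qquad (7)$$

where $\mu$ is the step size.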
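The recursion in equations (1)-(7) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the authors' MATLAB code: the regularization is kept constant (so only the rank-one terms remain in the update of R), one Gauss-Seidel sweep is run per sample starting from the previous P, the normalized error is read as e(n) divided by U^T(n)U(n) plus the regularization, and the step size is called `mu`. The helper names `dot`, `gauss_seidel_sweep`, `gs_pap` and `demo` are hypothetical.

```python
import random

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def gauss_seidel_sweep(R, p, b):
    """One in-place Gauss-Seidel sweep on R p = b (one sweep per sample is an assumption)."""
    n = len(b)
    for i in range(n):
        s = sum(R[i][j] * p[j] for j in range(n) if j != i)
        p[i] = (b[i] - s) / R[i][i]

def gs_pap(x, y, L, N, mu=1.0, delta=0.1):
    """Sample-by-sample GS-PAP sketch with a constant regularization delta."""
    pad = L + N
    xp = [0.0] * pad + list(x)            # x(n) is stored at xp[n + pad]

    def vec(n, length):                   # [x(n), ..., x(n-length+1)]
        k = n + pad
        return [xp[k - m] for m in range(length)]

    b = [1.0] + [0.0] * (N - 1)
    R = [[delta if i == j else 0.0 for j in range(N)] for i in range(N)]
    p = [bi / delta for bi in b]          # initialization, equation (1)
    H = [0.0] * L
    errors = []
    for n in range(len(x)):
        new, old = vec(n, N), vec(n - L, N)
        for i in range(N):                # equation (2) with constant delta
            for j in range(N):
                R[i][j] += new[i] * new[j] - old[i] * old[j]
        gauss_seidel_sweep(R, p, b)       # equation (3)
        e = y[n] - dot(vec(n, L), H)      # equation (4)
        U = [0.0] * L
        for i in range(N):                # equation (5)
            Xi = vec(n - i, L)
            for m in range(L):
                U[m] += p[i] * Xi[m] / p[0]
        e_bar = e / (dot(U, U) + delta)   # equation (6), assumed form
        H = [h + mu * u * e_bar for h, u in zip(H, U)]   # equation (7)
        errors.append(e)
    return H, errors

def demo(seed=1, L=8, N=2, samples=3000):
    """Identify a random length-L system from white noise; return the relative error."""
    rng = random.Random(seed)
    h = [rng.uniform(-1.0, 1.0) for _ in range(L)]
    x = [rng.gauss(0.0, 1.0) for _ in range(samples)]
    y = [sum(h[m] * x[n - m] for m in range(L) if n - m >= 0) for n in range(samples)]
    H, _ = gs_pap(x, y, L, N)
    return sum((a - c) ** 2 for a, c in zip(h, H)) / sum(a * a for a in h)
```

Calling `demo()` runs a noise-free system identification with white input and returns the squared coefficient error relative to the energy of the true response. Unlike the fast AP variants, the sketch works directly on the filter coefficients `H`.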
We solve $R(n)P(n) = b$ with the Gauss-Seidel method [2]. The Gauss-Seidel method is guaranteed to converge because the matrix $R(n)$ is symmetric and positive definite. An advantage of the GS-PAP algorithm is that it provides the filter coefficients, unlike the other FAP algorithms, which compute only auxiliary coefficients [2], [6-7]. Therefore, it can also be applied to other system identification applications. Acoustic echo cancellation systems should have a reliable and prompt double-talk detector in order to avoid the divergence of the adaptive algorithm. Also, in noisy conditions or in the presence of near-end disturbances, regularization is a necessary part of the affine projection algorithms [4-5]. We used the following regularization, which proved efficient in our simulations for the usual values of the projection order:

$$\delta(n) = \max\left( L \sigma_x^2(n) / 10, \; L \sigma_y^2(n), \; \delta(-1) \right) \qquad (8)$$

where the time-averaged powers $\sigma_x^2(n)$ and $\sigma_y^2(n)$ are obtained by IIR filtering.

III. FBEGS-PAP ALGORITHM
Next, we derive a novel algorithm that reduces the complexity of GS-PAP by using the same idea that derives the Modified Fast Exact NLMS (MFENLMS) algorithm from the NLMS algorithm [9]. Equations 6 and 7 lead to an update equation similar to that of the NLMS algorithm. We denote by $M$ the block size. We partition the vectors $X(n)$, $U(n)$ and $H(n)$ according to:

$$X^T(n) = [X_a^T(n), \; X_b^T(n-2M)] \qquad (9)$$

$$U^T(n) = [U_a^T(n), \; U_b^T(n-2M)] \qquad (10)$$

$$H^T(n) = [H_a^T(n), \; H_b^T(n)] \qquad (11)$$

with

$$X_a^T(n) = [x(n), \ldots, x(n-2M+1)] \qquad (12)$$

$$U_a^T(n) = [u(n), \ldots, u(n-2M+1)] \qquad (13)$$

$$H_a^T(n) = [h_1(n), \ldots, h_{2M}(n)] \qquad (14)$$

$$X_b^T(n-2M) = [x(n-2M), \ldots, x(n-L+1)] \qquad (15)$$

$$U_b^T(n-2M) = [u(n-2M), \ldots, u(n-L+1)] \qquad (16)$$

$$H_b^T(n) = [h_{2M+1}(n), \ldots, h_L(n)] \qquad (17)$$
The equations (4-7) of the GS-PAP algorithm can be partitioned into an “updating” part and a “fixed” filtering part as in [9]. The “updating” part, which has a lower order of complexity, is computed at each recursion step, whereas the “fixed” part can be computed over $M$ recursion steps. Therefore, the non-uniform distribution of the arithmetic operations within a block of $M$ samples encountered in other block algorithms ([7-8], [10]) can be alleviated. Equation (7) is repeatedly substituted in equation (4), taking into account the previous partitions. The following equations are obtained for $i = n, \ldots, n+M-1$:

$$\hat{y}_a(i) = X_a^T(i) H_a(i-1) + \mu \sum_{j=1}^{M+i-n-1} r_j(i-2M) \, \bar{e}(i-j) \qquad (18)$$

$$r_j(i-2M) = U_b^T(i-2M-j) \, X_b(i-2M) \qquad (19)$$

$$\hat{y}_b(i) = X_b^T(i-2M) \, H_b(n-M) \qquad (20)$$

$$H_a(i) = H_a(i-1) + \mu U_a(i) \, \bar{e}(i) \qquad (21)$$

$$H_b(n) = H_b(n-M) + \mu \sum_{j=0}^{M-1} U_b(n-2M-j) \, \bar{e}(n-j) \qquad (22)$$

$$e(i) = y(i) - \hat{y}_a(i) - \hat{y}_b(i) \qquad (23)$$
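The block-exact property rests on a purely algebraic identity: the output of the tail coefficients at time i, computed with the coefficients updated sample by sample, equals the output of the coefficients frozen at time n-M plus a correction built from correlations between the de-correlated and input tail vectors. The sketch below checks this numerically. It is an illustration under assumptions, not the authors' code: the tail holds coefficients 2M+1 to L, the per-sample tail update is the coefficient update of equation (7) restricted to that tail, and random sequences stand in for u(n) and the normalized errors, since the identity holds for any such sequences. The names `tail`, `dot` and `check_block_identity` are hypothetical.

```python
import random

def tail(seq, t, start, stop):
    """[seq(t-start), ..., seq(t-stop)], with zeros before time 0."""
    return [seq[t - m] if t - m >= 0 else 0.0 for m in range(start, stop + 1)]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def check_block_identity(L=12, M=2, mu=0.7, n=20, seed=0):
    """Return the largest gap between the direct and the partitioned tail output."""
    rng = random.Random(seed)
    T = n + M
    x = [rng.gauss(0.0, 1.0) for _ in range(T)]
    u = [rng.gauss(0.0, 1.0) for _ in range(T)]      # stand-in for the de-correlated signal
    e_bar = [rng.gauss(0.0, 1.0) for _ in range(T)]  # stand-in for the normalized errors

    # Sample-by-sample update of the tail coefficients h_{2M+1}, ..., h_L.
    Hb = {-1: [0.0] * (L - 2 * M)}
    for k in range(T):
        Ub = tail(u, k, 2 * M, L - 1)
        Hb[k] = [h + mu * ub * e_bar[k] for h, ub in zip(Hb[k - 1], Ub)]

    worst = 0.0
    for i in range(n, n + M):
        Xb = tail(x, i, 2 * M, L - 1)
        direct = dot(Xb, Hb[i - 1])                  # uses the latest coefficients
        fixed = dot(Xb, Hb[n - M])                   # coefficients frozen at n - M
        correction = mu * sum(
            e_bar[i - j] * dot(tail(u, i - j, 2 * M, L - 1), Xb)
            for j in range(1, M + i - n)
        )
        worst = max(worst, abs(direct - (fixed + correction)))
    return worst
```

The returned gap should be at the level of floating-point rounding. Because the correction involves at most 2M-2 correlation terms per sample, the expensive length L-2M filtering with the frozen coefficients can be deferred and computed once per block with a fast FIR scheme.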
The Fast Block Exact GS-PAP (FBEGS-PAP) algorithm includes the equations 2-3, 5-6, 18-23. All equations except equations 20 and 22 are performed at every step. Their complexity is small because $M$ and $N$ are usually much smaller than the filter length $L$. The “filtering part” (equations 20 and 22) depends on the data from the last block. This part involves the computation of $M$ successive outputs of a fixed coefficient filter. Several fast FIR filtering procedures that significantly reduce the number of operations exist (e.g. based on linear or circular convolution). We used the efficient FIR filtering architecture presented in Fig. 2 of [11] for cases when the block size is a power of two ($M = 2^k$). The numerical complexities of the algorithms are measured by counting the number of multiplications and divisions per recursion. Therefore we have:

$$C_{NLMS} = 2L + 4 \qquad (24)$$

$$C_{MFENLMS} = 10M + 1 + [2 (3/4)^k (L - 2M)] \qquad (25)$$

$$C_{FBEGS\text{-}PAP} = 10M + 2 + [2 (3/4)^k (L - 2M)] + N^2 + 4N \qquad (26)$$

$$C_{GS\text{-}PAP} = 2L + N^2 + 3N + 5 \qquad (27)$$
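The complexity counts (24)-(27) are easy to evaluate directly. The sketch below assumes the following reading of the formulas: 2L + 4 for NLMS, 10M + 1 + 2(3/4)^k (L - 2M) for MFENLMS, the same fast FIR term plus 10M + 2 + N^2 + 4N for FBEGS-PAP, and 2L + N^2 + 3N + 5 for GS-PAP, with M = 2^k. Rounding fractional counts up to the next integer is also an assumption, chosen because it reproduces the unparenthesized entries of Table I. The function names are hypothetical.

```python
import math

def fast_fir_term(L, M):
    """2 * (3/4)^k * (L - 2M) for a block size M = 2^k."""
    k = M.bit_length() - 1
    if M < 1 or M != 1 << k:
        raise ValueError("M must be a power of two")
    return 2.0 * 0.75 ** k * (L - 2 * M)

def c_nlms(L):
    return 2 * L + 4

def c_mfenlms(L, M):
    return 10 * M + 1 + fast_fir_term(L, M)

def c_fbegs_pap(L, M, N):
    return 10 * M + 2 + fast_fir_term(L, M) + N * N + 4 * N

def c_gs_pap(L, N):
    return 2 * L + N * N + 3 * N + 5

def table_row(L, M, N):
    """Rounded-up counts in the order (FBEGS-PAP, GS-PAP, MFENLMS, NLMS)."""
    return (
        math.ceil(c_fbegs_pap(L, M, N)),
        math.ceil(c_gs_pap(L, N)),
        math.ceil(c_mfenlms(L, M)),
        math.ceil(c_nlms(L)),
    )
```

For instance, `table_row(256, 8, 10)` gives the four counts for L = 256, M = 8, N = 10. Under this reading the difference between the FBEGS-PAP and MFENLMS counts is N^2 + 4N + 1, which depends on neither L nor M.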
The increase in complexity of the FBEGS-PAP algorithm in comparison with MFENLMS is rather small for the usual values of N and does not depend on L or M. The proper choice of the block size is crucial in reducing the number of multiplies and divisions. As shown in [9], there is an “optimal” block size M that minimizes the complexity if the projection order is fixed. The FBEGS-PAP algorithm has a more uniform distribution of the operations within the block of M samples and compares favorably with other such algorithms from the complexity point of view. Different block sizes can be used for each fast FIR procedure, but the practical implementation is then more complicated.
IV. SIMULATIONS
We tested the convergence and tracking abilities of the FBEGS-PAP algorithm using speech as the excitation signal in an acoustic echo cancellation example. As expected, because it is mathematically equivalent to GS-PAP, it inherits its performance. An example of the superior convergence and tracking abilities of the FBEGS-PAP algorithm compared to NLMS when using speech excitation signals is shown in Fig. 1. In our simulations we used L = 256, N = 10, M = 8. The measured car cabin impulse response was truncated to 259 coefficients, so that the theoretical minimum misalignment was calculated to be -49.06 dB. The tracking abilities of the algorithms were investigated by a sudden change in the sign of the measured echo path coefficients after 25,000 samples.
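A minimum misalignment exists because the adaptive filter (L = 256) is shorter than the 259-coefficient echo path, so the last coefficients can never be matched. The sketch below shows one way such a floor can be computed. It assumes the misalignment is normalized by the energy of the true impulse response, which is a common convention but is not stated explicitly here. The function names are hypothetical, and no measured car cabin data is included.

```python
import math

def misalignment_db(h_true, h_est):
    """10*log10(||h_true - h_est||^2 / ||h_true||^2), zero-padding the shorter vector."""
    n = max(len(h_true), len(h_est))
    a = list(h_true) + [0.0] * (n - len(h_true))
    b = list(h_est) + [0.0] * (n - len(h_est))
    err = sum((p - q) ** 2 for p, q in zip(a, b))
    ref = sum(p * p for p in a)
    if err == 0.0:
        return float("-inf")              # perfect match, no finite floor
    return 10.0 * math.log10(err / ref)

def misalignment_floor_db(h_true, L):
    """Misalignment of a length-L filter that matches the first L coefficients exactly."""
    return misalignment_db(h_true, h_true[:L])
```

With a measured response `h` of 259 coefficients, `misalignment_floor_db(h, 256)` would give the level below which the learning curve cannot fall.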
Figure 1. The misalignment of the FBEGS-PAP and NLMS algorithms for speech excitation under a sudden change in the echo path (L = 256, N = 10, M = 8, μ = 1).

The convergence of the algorithms was compared by using the squared norm of the difference between the car cabin impulse response and the adaptive filter (in dB). Table 1 compares the numerical complexity of the considered algorithms for different filter lengths, block sizes and projection orders. It can be seen that the complexity of FBEGS-PAP is significantly reduced in comparison with NLMS or GS-PAP (between 50% and 70%, depending on the block size). Also, it is only marginally more complex than MFENLMS for moderate values of the affine projection order (see Table 1). We found by simulations that updating the P vector or the optimal forward predictor vector only every 10 samples does not seriously change the learning curves of the GS-PAP or FBEGS-PAP algorithms. The values in parentheses in Table 1 show the average number of multiplies and divisions when this less frequent updating procedure is used.

TABLE I. COMPUTATIONAL COMPLEXITY (NUMBER OF MULTIPLICATIONS AND DIVISIONS PER RECURSION) OF THE INVESTIGATED ALGORITHMS FOR DIFFERENT FILTER LENGTHS, L, BLOCK SIZES, M, AND PROJECTION ORDERS, N.

ALGORITHM      L = 256, M = 8              L = 1024, N = 10
               N = 10        N = 4         M = 32         M = 8
FBEGS-PAP      425 (335)     317 (313)     918 (828)      1073 (983)
GS-PAP [3]     647 (557)     545 (531)     2183 (2093)    2183 (2093)
MFENLMS [9]    284           284           777            932
NLMS           516           516           2052           2052

The proper choice of the block size is crucial in reducing the number of multiplies and divisions (see Fig. 2). There is an “optimal” block size value of M that minimizes the complexity indicated by equation 26. For example (see Fig. 3), if L = 8192, the optimum block size is 128 (i.e. k = 7, M = 2^k). It can be seen that the block size that minimizes the number of multiplies and divisions is higher for higher filter lengths (e.g. if N = 10, the “optimal” value of M is 8 for L = 256, 32 for L = 1024, etc.). This dependence of the “optimal” M on the filter length is represented in Fig. 3. Fig. 4 shows that the FBEGS-PAP algorithm behaves well in a double-talk scenario.
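The “optimal” block size can be found by scanning the powers of two and evaluating the complexity of equation 26 for each. The sketch below assumes that equation 26 reads 10M + 2 + 2(3/4)^k (L - 2M) + N^2 + 4N with M = 2^k, and that candidate block sizes must satisfy 2M < L so that the fixed filtering part is not empty. The function names are hypothetical.

```python
def fbegs_pap_cost(L, M, N):
    """Multiplications and divisions per recursion for block size M = 2^k (assumed form)."""
    k = M.bit_length() - 1
    return 10 * M + 2 + 2.0 * 0.75 ** k * (L - 2 * M) + N * N + 4 * N

def optimal_block_size(L, N):
    """Power-of-two block size with 2M < L that minimizes fbegs_pap_cost."""
    candidates = []
    M = 1
    while 2 * M < L:
        candidates.append(M)
        M *= 2
    return min(candidates, key=lambda m: fbegs_pap_cost(L, m, N))
```

Under these assumptions the scan selects M = 8 for L = 256, M = 32 for L = 1024 and M = 128 for L = 8192 when N = 10, in agreement with the values quoted in the text. Since the N-dependent term does not involve M, the minimizing block size is the same for any projection order.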
Figure 2. The number of multiplies and divisions of FBEGS-PAP as a function of the block size M = 2^k (L = 8192, N = 10).
Figure 3. The dependence of the “optimal” block size on the filter length.

Figure 4. The misalignment (in dB) provided by FBEGS-PAP with time-varying regularization in a double-talk scenario for SNR = 30 dB, μ = 1.

Therefore, if a certain delay is acceptable, the FBEGS-PAP algorithm is an attractive algorithm for practical real-time implementations of AEC systems, although it needs more memory than GS-PAP or other fast affine projection algorithms.

V. CONCLUSIONS

We have proposed an efficient implementation of the GS-PAP algorithm. It was shown that the FBEGS-PAP algorithm has significantly better performance and reduced complexity in comparison with NLMS. Its stability and uniform distribution of operations recommend it for practical acoustic echo cancellation systems.

REFERENCES

[1] K. Ozeki, T. Umeda, “An adaptive filtering algorithm using an orthogonal projection to an affine subspace and its properties,” Electronics and Communications in Japan, vol. 67-A, no. 5, 1984.
[2] S. L. Gay, S. Tavathia, “The fast affine projection algorithm,” Proceedings of ICASSP 1995, Detroit (MI), May 1995, vol. 5, pp. 3023-3026.
[3] F. Albu, H. K. Kwan, “Combined Echo and Noise Cancellation Based on Gauss-Seidel Pseudo Affine Projection Algorithm,” Proceedings of ISCAS 2004, Vancouver, Canada, May 23-26, 2004, pp. 505-508.
[4] V. Myllyla, G. Schmidt, “Pseudo-optimal regularization for affine projection algorithms,” Proceedings of IEEE ICASSP 2002, pp. II 1917-1920, 2002.
[5] H. Sheikhzadeh, R. L. Brennan, K. R. L. Whyte, “Near-end distortion in over-sampled subband adaptive implementation of affine projection algorithm,” Proc. of Eusipco 2004, Vienna, Austria, September 2004, pp. 413-416.
[6] Q. G. Liu, B. Champagne, K. C. Ho, “On the use of a modified fast affine projection algorithm in subbands for acoustic echo cancellation,” IEEE Digital Signal Processing Workshop Proceedings, 1-4 Sept. 1996, pp. 354-357.
[7] M. Tanaka, S. Makino, Y. Kojima, “A block exact fast affine projection algorithm,” IEEE Transactions on Speech and Audio Processing, vol. 7, pp. 79-86, Jan. 1999.
[8] G. Rombouts, M. Moonen, “A sparse block exact affine projection algorithm,” IEEE Transactions on Speech and Audio Processing, vol. 10, no. 2, Feb. 2002, pp. 100-108.
[9] H. Schutze, Z. Ren, I. Hartmann, “Modified fast exact NLMS adaptive algorithm,” Electronics Letters, vol. 30, pp. 1029-1031, June 1994.
[10] J. Benesty, P. Duhamel, “A fast exact least mean square adaptive algorithm,” IEEE Trans., SP-40, (12), pp. 2904-2920, 1992.
[11] H. K. Kwan, M. T. Tsim, “High speed 1-D FIR digital filtering architectures using polynomial convolution,” Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Dallas, U.S.A., April 6-9, 1987, pp. 1863-1866.
The MATLAB code for the proposed algorithm can be obtained from http://falbu.50webs.com/gs.html
The reference for the paper is: F. Albu, H. K. Kwan, “A new block exact affine projection algorithm”, Proceedings of ISCAS 2005, Kobe, Japan, May 2005, pp. 4337-4340.