New Eigensystem-Based Method for Blind Source Separation

Rubén Martín-Clemente¹, Susana Hornillo-Mellado¹, Carlos G. Puntonet², and José I. Acha¹

¹ Área de Teoría de la Señal y Comunicaciones, Universidad de Sevilla, Avda. de los Descubrimientos s/n, 41092 Sevilla, Spain
{ruben,susanah}@us.es
² Departamento de Arquitectura y Tecnología de Computadores, Universidad de Granada, E-18071 Granada, Spain
{carlos,mrodriguez}@atc.ugr.es

Abstract. This paper presents an algorithm to construct a cumulant matrix that has a well-separated extremal eigenvalue. The corresponding eigenvector is well-conditioned and can be used to develop robust algorithms for blind source extraction. Simulations demonstrate the effectiveness of the proposed approach.

1 Introduction

Blind Source Separation (BSS) is a challenging problem in Signal Processing: it consists of extracting source signals from sensor measurements alone. Here, the qualifier 'blind' emphasizes that neither the sources nor the mapping between the sources and the sensor measurements are known a priori. Applications arise in numerous fields, e.g., array processing, speech enhancement, noise cancellation, data communications and biomedical signal processing. Consider the linear instantaneous BSS model:

x(t) = A s(t),    (1)

where s(t) denotes the N × 1 vector whose components s_i(t) are the sources, x(t) is the N × 1 vector of sensor measurements and A denotes an unknown mixing matrix. Starting from the seminal work [11], the problem has been studied by a large number of researchers (see [6, 10] and the references therein). In recent times, independence criteria based on information-theoretic models have attracted a great deal of attention. The algebraic structure of the so-called 'quadricovariance' has been exploited as well: roughly speaking, the quadricovariance is a fourth-order tensor whose coordinates are the cumulants of the whitened sensor measurements; the matrix N formed by the contraction of the quadricovariance with any arbitrary matrix M is always diagonalized by the mixing matrix [3], and consequently the eigenvectors of N give the columns of the mixing matrix. The problem arises when N has close eigenvalues since, in that case, its eigenvectors are very sensitive to errors in the computation of the statistics of the data. To obtain more robust estimates, the joint diagonalization of several matrices N_i has been proposed [3, 5], whereby each matrix N_i is the contraction of the cumulant tensor with a different matrix M_i; however, this approach is computationally demanding.

The purpose of this paper is to propose a simple algorithm that produces a cumulant matrix N with a well-separated extremal eigenvalue. Consequently, the corresponding eigenvector is expected to be numerically stable, in the sense that small changes in N do not induce large changes in the eigenvector. The new method could be used to develop fast and robust algorithms for blind source extraction.

2 Problem Statement and Notation

The aim of BSS is to determine an N × N matrix B, from the sole observation of the data x(t), such that

y(t) = B x(t) = G s(t)    (2)

is an estimate of the source vector (up to permutation and scaling). The following hypotheses are assumed:

(H1) The sources s_i(t) are statistically independent.
(H2) Each source s_i(t) is a stationary zero-mean unit-variance process.
(H3) At most one source is Gaussian distributed.
(H4) The mixing matrix A is nonsingular.
(H5) The observed vector x(t) is spatially white at order 2, i.e., E{x(t) x(t)^H} = I.

Hypothesis (H5) is not restrictive: one can always whiten the observations. It follows that matrix A is unitary, i.e., A A^H = I. This can be seen from

I = E{x(t) x(t)^H} = A E{s(t) s(t)^H} A^H = A A^H,

where the first equality follows from (H5) and the last from (H1)-(H2). Consequently, the search for the inverse of A can be restricted to the space of unitary matrices; matrix B is therefore assumed to be unitary. Similarly, it follows that G = B A is unitary as well. A minimal sketch of the whitening step is given below.
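Since all subsequent processing operates on whitened data, the preprocessing can be illustrated as follows; this is a minimal numpy sketch, and the function and variable names are ours, for illustration only:

```python
import numpy as np

def whiten(x):
    """Spatially whiten zero-mean data x (N x T) so that E{x x^H} = I.

    Returns the whitened data xw and the whitening matrix W (xw = W @ x).
    """
    x = x - x.mean(axis=1, keepdims=True)     # enforce zero mean, cf. (H2)
    T = x.shape[1]
    R = x @ x.conj().T / T                    # sample covariance E{x x^H}
    d, E = np.linalg.eigh(R)                  # R = E diag(d) E^H
    W = E @ np.diag(d ** -0.5) @ E.conj().T   # W = R^{-1/2}
    return W @ x, W
```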

2.1 Quadricovariance: Definition and Properties

Under the term "quadricovariance" [2], we understand the fourth-order tensor with coordinates

q_{iljk} = cum(x_i, x_j^*, x_k^*, x_l),    (3)

where

cum(x_i, x_j^*, x_k^*, x_l) = E{x_i x_j^* x_k^* x_l} − E{x_i x_j^*} E{x_k^* x_l} − E{x_i x_k^*} E{x_j^* x_l} − E{x_i x_l} E{x_j^* x_k^*}.

Let N be the matrix with entries defined as

n_{ij} = Σ_{1≤k,l≤N} q_{iljk} m_{kl},    (4)

where the m_{kl} are arbitrary constants¹. It can be shown [3, 5] that

N A = A Λ_M,    (5)

where Λ_M is a diagonal matrix whose diagonal elements depend on the statistics of x(t) as well as on the particular constants m_{kl}. Eqn. (5) is the usual definition of eigenvalues and eigenvectors: in view of (5), the eigenvectors of N are the columns of the unitary mixing matrix A (up to complex constants of unit modulus). Hence, a true separating matrix B is obtained simply by taking the complex conjugates of the columns of A as the rows of B. This approach is very elegant. However, if N has close eigenvalues, the method is very sensitive to errors in the estimation of the cumulants: small changes in N then produce large changes in its eigenvectors. Several ideas have been proposed to overcome this serious drawback [2, 3, 15]. Our own approach is presented in the next Section.
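For concreteness, the following sketch estimates the cumulants (3) by sample averages and forms the contraction (4) for an arbitrary matrix M; by (5), the eigenvectors of the result estimate the columns of A. The helper names are ours, and zero-mean whitened data are assumed:

```python
import numpy as np

def cum4(a, b, c, d):
    """Sample estimate of cum(a, b*, c*, d) for zero-mean 1-D arrays, cf. (3)."""
    E = lambda *v: np.mean(np.prod(v, axis=0))   # sample expectation of a product
    b, c = b.conj(), c.conj()
    return E(a, b, c, d) - E(a, b) * E(c, d) - E(a, c) * E(b, d) - E(a, d) * E(b, c)

def contract(x, M):
    """N of (4): n_ij = sum_{k,l} cum(x_i, x_j*, x_k*, x_l) m_kl (x is N x T)."""
    n = x.shape[0]
    N = np.zeros((n, n), dtype=complex)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for l in range(n):
                    N[i, j] += cum4(x[i], x[j], x[k], x[l]) * M[k, l]
    return N

# By (5), the eigenvectors np.linalg.eig(contract(x, M))[1] estimate the columns
# of A (up to phase and permutation), provided the eigenvalues are well separated.
```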

3 Extraction of a Single Source

Our idea is to produce a matrix N that has one eigenvalue that is well separated from the others. The corresponding eigenvector is hence expected to be numerically stable, in the sense that small changes in N do not induce large changes in the vector (see [8], Theorem 8.1.12 and Example 8.1.6).

Let m = (m_1, ..., m_N)^T be a unit-norm vector, i.e., Σ_{k=1}^{N} |m_k|^2 = 1, and let N be the N × N matrix defined entrywise by

(N)_{ij} = Σ_{1≤k,l≤N} q_{iljk} m_k^* m_l    (6)

– observe that (6) is nothing but a particular instance of (4). It is obtained that the eigenvalues of N are

|h_1|^2 κ_{s_1}, ..., |h_N|^2 κ_{s_N},    (7)

where

h_n = Σ_{k=1}^{N} m_k a_{kn},    (8)

and κ_{s_n} = cum(s_n, s_n^*, s_n, s_n^*) is the kurtosis of the nth source signal; since A is unitary, we have that Σ_{n=1}^{N} |h_n|^2 = 1.

¹ Technically speaking, N is said to be the contraction of the quadricovariance with the matrix M = (m_{ij}).


We propose to make all the eigenvalues of N zero except one. The rationale is as follows: the separation between one eigenvalue, e.g., |h_i|^2 κ_{s_i}, and the closest other eigenvalue, say |h_j|^2 κ_{s_j}, is |h_i|^2 κ_{s_i} − |h_j|^2 κ_{s_j}. Suppose that |κ_{s_i}| > |κ_{s_j}| (no loss of generality). Then the distance between the two eigenvalues is maximized when |h_i| = 1, which implies that |h_k| = 0 for k ≠ i, since Σ_n |h_n|^2 = 1. This is the situation in which there is only one nonzero eigenvalue; the separation between this particular eigenvalue and the others is hence maximal, as desired. As a further note, it is known that the sensitivity of the corresponding eigenvector is upper bounded by the inverse of that separation [8].

3.1 Choice of the Coefficients m_k

The question arises: how do we compute the coefficients m_k? In this paper, the vector m = [m_1, ..., m_N]^T is computed as the solution to the optimization problem

max_m m^H N_i^T m, subject to ||m||_2 = 1,    (9)

where N_i is the N × N matrix defined entrywise by

(N_i)_{mn} = cum(x_m, x_n^*, y_i^*, y_i).    (10)

Using basic algebra [8], the optimum m is immediately found to be the conjugate of the principal eigenvector (the one associated with the largest eigenvalue) of N_i. The rationale behind this choice of m is the following: the change of variables G = B A allows us to rewrite [12]

m^H N_i^T m ≡ Σ_{n=1}^{N} |h_n|^2 |g_{in}|^2 κ_{s_n},    (11)

where g_{in} is the (i, n)th coordinate of the global matrix G = B A and h_1, ..., h_N were defined in (8). Suppose that

|g_{i1}|^2 κ_{s_1} > |g_{i2}|^2 κ_{s_2} > ... > |g_{iN}|^2 κ_{s_N}.    (12)

For instance, (12) holds (with no loss of generality) when the coordinates b_{ij} have been chosen at random, in which case the numbers |g_{in}|^2 κ_{s_n} are distinct with probability one. Then, (11) is maximized by making |h_1| as large as possible, which occurs when |h_1| = 1 and |h_n| = 0 for n ≠ 1 (since Σ_n |h_n|^2 = 1). In other words, the optimum m makes all the coefficients h_1, ..., h_N zero except one. Returning to the main problem: since the eigenvalues of N are precisely |h_1|^2 κ_{s_1}, ..., |h_N|^2 κ_{s_N}, it readily follows that the matrix N obtained by substituting this m into definition (6) possesses only one nonzero eigenvalue, as desired. A numerical sketch of this computation of m is given below.
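The following sketch estimates N_i from samples via (10) and returns m; the names are ours. One caveat we add: when all source kurtoses are negative (as for the PSK/QASK signals of Section 5), the well-separated eigenvalue is negative, so we read "principal" as the eigenvalue of largest magnitude:

```python
import numpy as np

def choose_m(x, y):
    """m = conjugate of the principal eigenvector of N_i in (9)-(10).

    x : (N, T) whitened data; y : (T,) current source estimate y_i.
    """
    n = x.shape[0]
    yc = y.conj()
    Ni = np.empty((n, n), dtype=complex)
    for p in range(n):
        for q in range(n):
            a, b = x[p], x[q].conj()     # (N_i)_{pq} = cum(x_p, x_q*, y*, y)
            Ni[p, q] = (np.mean(a * b * yc * y)
                        - np.mean(a * b) * np.mean(yc * y)
                        - np.mean(a * yc) * np.mean(b * y)
                        - np.mean(a * y) * np.mean(b * yc))
    d, E = np.linalg.eigh(Ni)            # N_i is Hermitian
    return E[:, np.argmax(np.abs(d))].conj()
```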

3.2 Algorithm

The computation of m, N and its principal eigenvector, collectively and in that order, constitutes the basis of our method for extracting a single source. The corresponding algorithm may take the following simple form:

(0) Apply the whitening transformation to the data.
(1) Start with a unit-norm vector b_i = (b_{i1}, ..., b_{iN})^T (initial guess).
(2) for k = 1, 2, ..., k_max
    (2.1) Set y_i = Σ_n b_{in} x_n.
    (2.2) Estimate matrix N_i.
    (2.3) Set m to the conjugate of the principal eigenvector of N_i.
    (2.4) Estimate matrix N.
    (2.5) Set b_i to the conjugate of the principal eigenvector of N.
(3) end for
(4) return y_i = Σ_n b_{in} x_n, the estimated source.

Regarding the for loop, it is just a mechanism for the iterative refinement of the solution: it is necessary because the true cumulant matrices N and N_i cannot be perfectly estimated in practice, owing to the finite sample size. A numpy sketch of these steps is given below.
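The following self-contained sketch implements steps (1)-(4) under our reading of the method (sample averages for cumulants, largest-magnitude eigenvalues as "principal"; all names are ours). It exploits the multilinearity of cumulants: both N_i in (10) and N in (6) have the form C_{ij} = cum(x_i, x_j^*, z^*, z), with z = y_i and z = Σ_k m_k x_k respectively, so a single helper serves for both:

```python
import numpy as np

def cum_matrix(x, z):
    """C with C_ij = cum(x_i, x_j*, z*, z), estimated by sample averages.

    Covers both N_i in (10) (z = y_i) and N in (6) (z = sum_k m_k x_k).
    """
    n = x.shape[0]
    zc = z.conj()
    C = np.empty((n, n), dtype=complex)
    for i in range(n):
        for j in range(n):
            a, b = x[i], x[j].conj()
            C[i, j] = (np.mean(a * b * zc * z)
                       - np.mean(a * b) * np.mean(zc * z)
                       - np.mean(a * zc) * np.mean(b * z)
                       - np.mean(a * z) * np.mean(b * zc))
    return C

def principal_eigvec(C):
    """Eigenvector of the largest-magnitude eigenvalue (C is Hermitian)."""
    d, E = np.linalg.eigh(C)
    return E[:, np.argmax(np.abs(d))]

def extract_one_source(xw, n_iter=10, seed=0):
    """Steps (1)-(4) on whitened data xw (N x T); step (0) is done beforehand."""
    rng = np.random.default_rng(seed)
    n = xw.shape[0]
    bi = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    bi /= np.linalg.norm(bi)                            # (1) unit-norm guess
    for _ in range(n_iter):                             # (2)
        yi = bi @ xw                                    # (2.1) y_i = sum_n b_in x_n
        m = principal_eigvec(cum_matrix(xw, yi)).conj() # (2.2)-(2.3)
        N = cum_matrix(xw, m @ xw)                      # (2.4) contraction (6)
        bi = principal_eigvec(N).conj()                 # (2.5)
    return bi @ xw, bi                                  # (4) estimated source
```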

4 Extraction of Several Sources

The rows of B are decoupled from each other by virtue of their orthogonality. This decoupling property makes it possible to tackle the global problem as a sequence of local optimizations. That is, in order to estimate M ≤ N sources, we may compute at each step the principal eigenvectors of the different matrices N_i (i = 1, ..., M) and then apply Gram-Schmidt to orthonormalize them; the orthonormalization step is sketched below. The procedure is repeated until convergence. A similar approach is used in [9].
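A minimal sketch of the row orthonormalization (our helper; one outer sweep updates each b_i as in steps (2.1)-(2.5) of Section 3.2 and then calls this routine on the stacked rows):

```python
import numpy as np

def orthonormalize_rows(B):
    """Gram-Schmidt orthonormalization of the rows of B (M x N, complex)."""
    B = B.copy()
    for i in range(B.shape[0]):
        for j in range(i):
            B[i] -= (B[j].conj() @ B[i]) * B[j]   # project out earlier rows
        B[i] /= np.linalg.norm(B[i])
    return B
```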

5 Numerical Experiments

In this Section we explore the algorithm through a simulation example. The performance is measured by the signal-to-noise ratio (SNR) of each source at the separator output. It is defined for source k by

SNR_k = 10 log (E{|s_k|^2} / E{|s_k − ŝ_k|^2}) = −10 log E{|s_k − ŝ_k|^2},

where ŝ_k is the estimate of the kth source; the second equality holds because the sources have unit variance.
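For reference, a minimal sketch of this measure (our helper; it assumes the permutation and the phase/scale ambiguity have already been resolved before comparing ŝ_k with s_k):

```python
import numpy as np

def snr_db(s, s_hat):
    """SNR_k in dB for a unit-variance source s and its aligned estimate s_hat."""
    return -10.0 * np.log10(np.mean(np.abs(s - s_hat) ** 2))
```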

The source signals s_1(t) and s_2(t) were 16-PSK digitally modulated signals, whereas s_3(t) and s_4(t) were 16-QASK baseband signals. All of them are often used in communication systems. The complex baseband equivalent waveform of each source signal was used in the simulations. The coefficients of the mixing matrix were random numbers whose real and imaginary parts were drawn from the

normal distribution with zero mean and unit variance. Figures 1, 2 and 3 depict, respectively, the scatter plots of the sources, the measured signals and the estimated sources². The constellation of each estimated source appears clearly in Figure 3, showing that the separation is successful. In fact, the mean signal-to-noise ratio equals 31.64 dB after the separation (averaged over 100 independent experiments).

Fig. 1. Scatter plots of the four sources s_1(n), s_2(n), s_3(n), s_4(n) (in-phase vs. quadrature; panels omitted).

Fig. 2. Scatter plots of the four measured signals x_1(n), x_2(n), x_3(n), x_4(n) (panels omitted).

Fig. 3. Scatter plots of the four estimated sources y_1(n), y_2(n), y_3(n), y_4(n) (panels omitted).

6 Conclusions

This paper introduces a cumulant matrix defined as the contraction of the fourth-order cumulant tensor with m^* m^T, where m is a unit-norm vector. The specific structure of this definition allows us to make all the eigenvalues of N zero except one, which makes the computation of the associated eigenvector more robust. The method is then used to develop a new algorithm for BSS.

² The scatter plot represents the imaginary part (termed the 'Quadrature' component) versus the real part (termed the 'In-Phase' component) of the signal.

References

1. A. Belouchrani, K. Abed-Meraim, J.-F. Cardoso and E. Moulines, "A Blind Source Separation Technique Based on Second Order Statistics", IEEE Transactions on Signal Processing, vol. 45(2), pp. 434-444, 1997.


2. J.-F. Cardoso, "Eigenstructure of the Fourth-Order Cumulant Tensor with Application to the Blind Source Separation Problem", in Proceedings ICASSP'90, pp. 2655-2658, Albuquerque, 1990.
3. J.-F. Cardoso and A. Souloumiac, "Blind Beamforming for non-Gaussian Signals", IEE Proceedings-F, vol. 140(6), pp. 362-370, 1993.
4. Matlab implementation of JADE, available at: ftp://tsi.enst.fr/pub/jfc/Algo/Jade/jade.m
5. J.-F. Cardoso, "High-Order Contrasts for Independent Component Analysis", Neural Computation, vol. 11, pp. 157-192, 1999.
6. A. Cichocki and S. I. Amari, "Adaptive Blind Signal and Image Processing", John Wiley and Sons, 2002.
7. N. Delfosse and P. Loubaton, "Adaptive Blind Separation of Independent Sources: A Deflation Approach", Signal Processing, vol. 45, pp. 59-83, 1995.
8. G. Golub and C. van Loan, "Matrix Computations", The Johns Hopkins University Press, 1996.
9. A. Hyvärinen and E. Oja, "A Fast Fixed-Point Algorithm for Independent Component Analysis", Neural Computation, vol. 9, pp. 1483-1492, 1997.
10. A. Hyvärinen, J. Karhunen and E. Oja, "Independent Component Analysis", John Wiley and Sons, 2001.
11. C. Jutten and J. Herault, "Blind Separation of Sources, Part I: An Adaptive Algorithm Based on Neuromimetic Architecture", Signal Processing, vol. 24, pp. 1-10, 1991.
12. R. Martín-Clemente and J. I. Acha, "Eigendecomposition of Self-Tuned Cumulant Matrices for Blind Source Separation", submitted.
13. E. Moulines and J.-F. Cardoso, "Second-Order versus Fourth-Order MUSIC Algorithms: An Asymptotical Statistical Performance Analysis", in Proceedings of the Workshop on Higher-Order Statistics, pp. 121-130, Chamrousse, France, 1991.
14. C. Nikias and A. Petropulu, "Higher-Order Spectra Analysis", Prentice-Hall, 1993.
15. L. Tong, Y. Inouye and R. Liu, "Waveform Preserving Blind Estimation of Multiple Independent Sources", IEEE Transactions on Signal Processing, vol. 41, pp. 2461-2470, 1993.