SEPARATING MORE SOURCES THAN SENSORS USING TIME-FREQUENCY DISTRIBUTIONS

Linh-Trung Nguyen†,††, Adel Belouchrani‡, Karim Abed-Meraim†† & Boualem Boashash†

† Signal Processing Research Centre, Queensland University of Technology, Australia
‡ Electrical Engineering Department, National Polytechnique, Algiers, Algeria
†† Signal & Image Processing Department, Telecom Paris (ENST), Paris, France

[email protected], [email protected]

ABSTRACT

This paper deals with the problem of blind source separation of nonstationary signals of which only instantaneous linear mixtures are observed. Exploiting the effectiveness of time-frequency signal processing for nonstationary signals, a blind source separation approach is considered using the spatial time-frequency distributions (STFD) of the observations. Existing solutions are limited to the situation in which the number of sources being separated is less than the number of available sensors measuring the mixed sources. In this paper, we consider the more general case in which there can be more sources than sensors, assuming that the sources are "separable" in the time-frequency domain. The proposed solution proceeds through three main steps: (i) a testing procedure is applied (after whitening the STFD) to first separate the cross-terms from the auto-terms; (ii) the sources are then separated in the time-frequency domain (from the auto-terms only) using a vector classification approach; and finally (iii) the source signatures are obtained using time-frequency synthesis.

1. INTRODUCTION

Blind source separation (BSS) is a fundamental problem in signal processing that is sometimes known under different names: blind array processing, signal copy, independent component analysis, waveform preserving estimation, etc. In all these instances, the underlying model is that of n 'statistically' independent signals whose m (possibly noisy) mixtures are observed. Neither the structure of the mixtures nor the source signals are known to the receivers. In this environment, we want to identify the mixtures (blind identification problem) and decouple the mixtures (blind source decoupling). This area has been very active over the last two decades. Surprisingly, this seemingly impossible problem has elegant solutions that depend on the nature of the mixtures and the nature of the source statistical information.

Most of the approaches to blind source separation are based (explicitly or not) on a model where each source signal is a sequence of independently and identically distributed (iid) variables, e.g., [1]. In this context, blind source separation is possible only if at most one of the sources has a Gaussian distribution. In contrast, if the source sequences are not iid, it is possible to blindly separate the sources even for Gaussian processes. Several authors have considered the case where each source sequence is a temporally correlated stationary process, e.g., [2], in which case blind source separation is possible if the source signals have different spectra. Other contributors, e.g., [3, 4], have addressed the case where the second 'i' of 'iid' fails, that is, the nonstationary case.

We consider here the exploitation of nonstationarity for blind source separation. In that case, one can take advantage of the powerful tool of time-frequency signal representations to separate and recover the incoming signals. The underlying problem can be posed as a signal synthesis from the time-frequency (TF) plane with the incorporation of the spatial diversity provided by the multisensor array.

Our focus is on the blind separation of more sources than sensors, also known as the underdetermined BSS problem. This challenging problem has recently been addressed in [5, 6], where a priori knowledge of the source pdfs was used, and in [7], where the concept of 'disjoint orthogonality of Short Fourier Transforms' was exploited. In this work, we propose a new BSS method based on the concept of orthogonality in the TF domain. We believe the latter is more general than the former and more likely to be satisfied in practice.

2. DATA MODEL & ASSUMPTIONS

Consider m sensors receiving an instantaneous linear mixture of signals emitted from n sources (with, possibly, n > m). The $m \times 1$ vector $\mathbf{x}(t)$ denotes the output of the sensors at time instant t, which may be corrupted by additive noise $\mathbf{n}(t)$. Hence, the linear data model is given by

$$\mathbf{x}(t) = \mathbf{A}\mathbf{s}(t) + \mathbf{n}(t) \tag{1}$$

where the $m \times n$ matrix $\mathbf{A}$ is called the 'mixing matrix'. The n source signals are collected in an $n \times 1$ vector $\mathbf{s}(t)$, which is referred to as the source signal vector.
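As an illustration of the data model (1), the following minimal sketch generates an underdetermined mixture with m = 2 sensors and n = 3 sources. The helper names `mix_sources` and `chirp`, the chirp frequencies, and the random complex mixing matrix are hypothetical choices made for this example only; they are not taken from the paper.

```python
import numpy as np

def mix_sources(A, s, snr_db=None, rng=None):
    """Return x(t) = A s(t) + n(t); the columns of s are time samples."""
    x = A @ s
    if snr_db is not None:
        rng = np.random.default_rng() if rng is None else rng
        signal_power = np.mean(np.abs(x) ** 2)
        noise_power = signal_power / 10 ** (snr_db / 10)
        # Complex white Gaussian noise with total power `noise_power`.
        noise = np.sqrt(noise_power / 2) * (
            rng.standard_normal(x.shape) + 1j * rng.standard_normal(x.shape)
        )
        x = x + noise
    return x

N = 128                      # number of samples per signal
t = np.arange(N)

def chirp(f0, f1):
    """Complex linear FM signal sweeping from f0 to f1 (normalized frequency)."""
    return np.exp(2j * np.pi * (f0 * t + 0.5 * (f1 - f0) / N * t ** 2))

# Three sources occupying different regions of the TF plane (assumed values).
s = np.vstack([chirp(0.05, 0.15), chirp(0.20, 0.30), chirp(0.35, 0.45)])

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))  # m = 2, n = 3
x = mix_sources(A, s, snr_db=20, rng=rng)
```

Because A has more columns than rows, it cannot be inverted to recover s(t) from x(t); this is why the method relies on the TF structure of the sources rather than on unmixing by a matrix inverse.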

A1) The sources are assumed to have different structures and localization properties in the TF domain. More precisely, we assume the sources to be orthogonal in the TF domain according to the following definition:

Definition. Let $S_1(t,f)$ and $S_2(t,f)$ be the TF distributions¹ (TFD) of two source signals $s_1(t)$ and $s_2(t)$, respectively. Let $\Omega_1$ and $\Omega_2$ be the TF supports of $S_1$ and $S_2$, respectively, i.e.,

$$S_1(t,f) \neq 0 \ \text{iff}\ (t,f) \in \Omega_1, \qquad S_2(t,f) \neq 0 \ \text{iff}\ (t,f) \in \Omega_2.$$

The sources $s_1(t)$ and $s_2(t)$ are then said to be orthogonal in the TF domain if

$$\Omega_1 \cap \Omega_2 = \emptyset.$$

It is clear that this definition is too restrictive and will almost never be satisfied exactly in practice. However, as shown in our simulation results, it suffices that the sources satisfy this TF orthogonality approximately (i.e., most of the source energy is localized in disjoint TF regions) to achieve their separation using the proposed BSS algorithm.

A2) The column vectors of the matrix $\mathbf{A}$ are assumed to be pairwise linearly independent, i.e., for any $i, j \in \{1, 2, \ldots, n\}$ with $i \neq j$, $\mathbf{a}_i$ and $\mathbf{a}_j$ are linearly independent, where $\mathbf{A} = [\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n]$. Obviously, if two sources, for example $s_1$ and $s_2$, have linearly dependent vectors, i.e., $\mathbf{a}_2 = \alpha \mathbf{a}_1$, their separation is inherently impossible, since we can write $\mathbf{x} = \tilde{\mathbf{A}}\tilde{\mathbf{s}}$, where $\tilde{\mathbf{A}} = [\mathbf{a}_1, \mathbf{a}_3, \ldots, \mathbf{a}_n]$ and $\tilde{\mathbf{s}} = [s_1 + \alpha s_2, s_3, \ldots, s_n]^T$. It is known that BSS is only possible up to an unknown scaling and an unknown permutation [2]. We take advantage of this indeterminacy to assume, without loss of generality, that the column vectors of $\mathbf{A}$ are unit norm, i.e., $\|\mathbf{a}_i\| = 1 \ \forall i$.
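Assumption A2 and the unit-norm convention can be checked numerically. The sketch below is only an illustration: the function names and the tolerance are assumptions of this example. It relies on the Cauchy-Schwarz inequality, by which two unit-norm vectors are collinear exactly when the magnitude of their inner product equals 1.

```python
import numpy as np

def normalize_columns(A):
    """Scale every column of A to unit Euclidean norm."""
    return A / np.linalg.norm(A, axis=0, keepdims=True)

def pairwise_independent(A, tol=1e-8):
    """Return True if no two columns of A are collinear (assumption A2)."""
    U = normalize_columns(A)
    G = np.abs(U.conj().T @ U)      # |a_i^H a_j|; equals 1 iff a_i and a_j are collinear
    np.fill_diagonal(G, 0.0)        # ignore the trivial i == j entries
    return bool(np.all(G < 1.0 - tol))

# Two sensors, three sources: columns are pairwise independent although n > m.
A_ok = np.array([[1.0, 1.0, 0.0],
                 [0.0, 1.0, 1.0]], dtype=complex)

# Second column is a complex multiple of the first: sources 1 and 2 cannot be separated.
A_bad = np.array([[1.0, 2.0j, 0.0],
                  [0.0, 0.0, 1.0]], dtype=complex)

ok = pairwise_independent(A_ok)
bad = pairwise_independent(A_bad)
```

Note that pairwise independence is much weaker than full column rank, which is impossible when n > m; this is what allows the underdetermined case to be treated.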

3. SPATIAL TIME-FREQUENCY DISTRIBUTIONS

The discrete-time form of Cohen's class of TFDs, for a signal $x(t)$, is given by [8]

$$D_{xx}(t,f) = \sum_{l=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \phi(k,l)\, x(t+k+l)\, x^*(t+k-l)\, e^{-j4\pi f l} \tag{2}$$

where t and f represent the time index and the frequency index, respectively. The kernel $\phi(k,l)$ characterizes the distribution and is a function of both the time and lag variables. The cross-TFD of two signals $x_1(t)$ and $x_2(t)$ is defined by

$$D_{x_1 x_2}(t,f) = \sum_{l=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \phi(k,l)\, x_1(t+k+l)\, x_2^*(t+k-l)\, e^{-j4\pi f l} \tag{3}$$

Expressions (2) and (3) are now used to define the following data spatial time-frequency distribution (STFD) matrix,

$$\mathbf{D}_{xx}(t,f) = \sum_{l=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \phi(k,l)\, \mathbf{x}(t+k+l)\, \mathbf{x}^H(t+k-l)\, e^{-j4\pi f l} \tag{4}$$

where $[\mathbf{D}_{xx}(t,f)]_{ij} = D_{x_i x_j}(t,f)$, for $i, j = 1, 2, \ldots, m$, and $\mathbf{x}^H$ denotes the conjugate transpose of $\mathbf{x}$. Under the linear data model of equation (1) and assuming a noise-free environment, the STFD matrix defined in (4) takes the following structure:

$$\mathbf{D}_{xx}(t,f) = \mathbf{A}\,\mathbf{D}_{ss}(t,f)\,\mathbf{A}^H$$

where $\mathbf{D}_{ss}(t,f)$ is the source TFD matrix whose entries are the auto- and cross-TFDs of the sources. By selecting auto-term TF points, $\mathbf{D}_{ss}(t,f)$ will be diagonal (the off-diagonal elements of $\mathbf{D}_{ss}(t,f)$ are cross-terms, and thus the source TFD matrix is quasi-diagonal for each TF point that corresponds to a true power concentration, i.e., a source auto-term). Moreover, since the sources have orthogonal TF supports, the diagonal entries of $\mathbf{D}_{ss}(t,f)$ are all zero except for one, which corresponds to the particular TF domain containing the given TF point. This leads to

$$\mathbf{D}_{xx}(t,f) = D_{s_i s_i}(t,f)\, \mathbf{a}_i \mathbf{a}_i^H, \quad \text{where } (t,f) \in \Omega_i.$$

It is this particular structure that will be used next to achieve the BSS.

¹ This concept can be applied to any time-frequency distribution.

4. PROPOSED ALGORITHM

First, notice that two auto-term points $(t_1, f_1)$ and $(t_2, f_2)$ corresponding to the same source $s_i(t)$ are such that:

$$\mathbf{D}_{xx}(t_1,f_1) = D_{s_i s_i}(t_1,f_1)\, \mathbf{a}_i \mathbf{a}_i^H, \qquad \mathbf{D}_{xx}(t_2,f_2) = D_{s_i s_i}(t_2,f_2)\, \mathbf{a}_i \mathbf{a}_i^H$$

which means that $\mathbf{D}_{xx}(t_1,f_1)$ and $\mathbf{D}_{xx}(t_2,f_2)$ have the same principal eigenvector $\mathbf{a}_i$.
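The rank-one structure of the STFD matrix at an auto-term point can be verified with a short numerical sketch. Everything here is illustrative: the vector `a`, the two TFD values, and the function `principal_eigenpair` are assumptions of this example. Since some TFDs, such as the WVD, can take negative values, the sketch takes "principal" to mean the eigenvalue of largest magnitude, which is an interpretation rather than something stated in the text.

```python
import numpy as np

def principal_eigenpair(D):
    """Largest-magnitude eigenvalue of a Hermitian matrix and its eigenvector.

    The eigenvector is rotated so that its first entry is real and non-negative,
    which removes the arbitrary phase returned by the eigensolver.
    """
    D = 0.5 * (D + D.conj().T)            # enforce Hermitian symmetry numerically
    w, V = np.linalg.eigh(D)
    k = np.argmax(np.abs(w))
    v = V[:, k]
    v = v * np.exp(-1j * np.angle(v[0]))  # first entry becomes real, non-negative
    return w[k], v

# A unit-norm mixing vector for a two-sensor array (assumed for illustration).
a = np.array([1.0, np.exp(1j * 0.8)]) / np.sqrt(2)

# STFD matrices at two auto-term points of the same source: same a, different TFD values.
D1 = 3.0 * np.outer(a, a.conj())
D2 = -0.7 * np.outer(a, a.conj())

lam1, v1 = principal_eigenpair(D1)
lam2, v2 = principal_eigenpair(D2)
```

Both calls return the same vector `a`, while the eigenvalues return the source TFD values at the two points. This is the property that the grouping of TF points exploits.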

The idea of the proposed algorithm is to group together the auto-term $(t,f)$ points associated with the same principal eigenvector. The TFDs of the sources are then obtained as the principal eigenvalues of the auto-term STFD matrices. In summary, we have the following algorithm:

1. Compute the STFD of the observation as given in (4).

2. To reduce the complexity and avoid processing the STFD matrices for all $(t,f)$ points, apply a threshold to keep only the points $\{(t_{ac}, f_{ac})\}$ with sufficient energy, i.e., keep $(t_{ac}, f_{ac})$ iff

$$\|\mathbf{D}_{xx}(t_{ac}, f_{ac})\| > \epsilon_1$$

with $\epsilon_1$ being a chosen positive scalar.

3. Separate the auto-terms from the cross-terms using the testing procedure given in [9].

4. For each auto-term point $(t_a, f_a)$, compute the main eigenvector, $\mathbf{a}(t_a, f_a)$, and eigenvalue, $\lambda(t_a, f_a)$, of $\mathbf{D}_{xx}(t_a, f_a)$.

5. Given the set of vectors² $\{\mathbf{a}(t_a, f_a)\}$ corresponding to all selected auto-term points, classify them into different classes $\{C_i\}$, each containing closely spaced vectors, i.e., $\mathbf{a}(t_i, f_i)$ and $\mathbf{a}(t_j, f_j)$ belong to the same class if

$$d(\mathbf{a}(t_i, f_i), \mathbf{a}(t_j, f_j)) < \epsilon_2$$

where $\epsilon_2$ is a properly chosen positive scalar. As an example, we have classified the vectors in our simulation experiment according to their angles, defined as

$$d(\mathbf{a}_1, \mathbf{a}_2) = \arccos(\tilde{\mathbf{a}}_1^T \tilde{\mathbf{a}}_2)$$

where $\tilde{\mathbf{a}}_i = [\mathrm{Re}(\mathbf{a}_i)^T, \mathrm{Im}(\mathbf{a}_i)^T]^T$ and $\|\tilde{\mathbf{a}}_i\| = 1$.

6. Set the number of sources equal to the number of classes and, for each source $s_i$ (i.e., each class $C_i$), estimate its TFD as

$$\hat{D}_{s_i s_i}(t,f) = \begin{cases} \lambda(t_a, f_a) & \text{if } (t,f) = (t_a, f_a) \in C_i \\ 0 & \text{otherwise} \end{cases}$$

7. Use an adequate source synthesis procedure to estimate the source signals $s_i(t)$, $i = 1, 2, \ldots, n$, from their respective TFD estimates $\hat{D}_{s_i s_i}$.

² These vectors are estimated up to a random phase $e^{j\theta}$, $\theta \in [0, 2\pi)$. To get rid of this phase, we force all vectors to have their first entry real and positive.

We need to address the following issues regarding the proposed algorithm:

TFD orthogonality. It is important to have orthogonality in the TF domain for different sources in order to achieve the BSS. However, according to our simulation results, this condition can be relaxed to near orthogonality in the TF domain.

Choice of the TFD. We have chosen the Wigner-Ville distribution (WVD) as the TFD forming the STFD matrices in our simulation as an example only. The reason stems from the fact that it is an invertible distribution up to a constant phase [10]. However, the choice of the TFD should be made according to the nature of the application of interest and the desired properties one is looking for. More tools in this area can be found in [8, 10].

Noise thresholding. The threshold can be chosen based on the signal-to-noise ratio and the possible structure of the mixed signals. However, it is used mainly to reduce the computational complexity, i.e., we need not consider the $(t,f)$ points with negligible energy.

Vector classification. A very simple vector classification algorithm was used in our simulation in order to show the feasibility of BSS for the case of more sources than sensors. More sophisticated algorithms are available in the literature [11] and should be applied to achieve robust separation.

Number of sources. We have noticed that the number of classes found was often greater than the actual number of sources set in our experiment. A simple thresholding scheme based on energy levels was used to eliminate the classes with insignificant energy compared to the others. These classes may or may not be considered as noise, depending on the nature of the sources in the particular application of interest. At this level, problems may arise if one or more sources have much higher energy than the others. In that case, a solution consists in using our BSS algorithm in conjunction with a deflation technique [12]. This point will be investigated in future work.

TF synthesis. After proper classification, the source signatures can be used to reconstruct the original waveforms through TF synthesis. We have not yet applied this procedure in our simulations. However, such tools can be found in [10, 13].

5. SIMULATIONS

We considered two experiments, both with a uniform linear array of m = 2 sensors having half-wavelength spacing and receiving signals from n = 3 independent sources in the presence of additive white Gaussian noise with an SNR of 20 dB. The sources arrive from different directions (30°, 45° and 60°). The WVD was used to represent the individual source signals, and the cross-WVD was used to compute the STFD matrices. The number of data samples in each signal vector is N = 128.

Experiment 1 (cf. Fig.1). The three sources were chosen to be monocomponent linear FM signals (chirps) that are well separated in the TF domain (Fig.1.a–c). The "noisy" $(t,f)$ points appearing in the data mixture (Fig.1.d, which represents the TFD of the first sensor output) were first removed, as shown in Fig.1.e, and then the cross-terms were removed, as in Fig.1.f. After applying the vector classification procedure, three classes representing the three original source signals were separated (up to a permutation), as shown in Fig.1.g–i, thus indicating the success of our BSS algorithm.
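The vector classification step can be illustrated for the array geometry of these experiments. The following sketch is not the authors' implementation: the steering-vector convention (angle measured from broadside), the perturbation level, the greedy classifier, and the threshold value 0.15 rad are all assumptions made for illustration. It skips the WVD computation and starts directly from noisy principal eigenvectors, each carrying an unknown phase that is removed by forcing the first entry to be real and positive.

```python
import numpy as np

def steering_vector(theta_deg, m=2):
    """Unit-norm ULA steering vector, half-wavelength spacing (assumed convention)."""
    k = np.arange(m)
    return np.exp(1j * np.pi * k * np.sin(np.deg2rad(theta_deg))) / np.sqrt(m)

def fix_phase(v):
    """Rotate v so that its first entry is real and non-negative."""
    return v * np.exp(-1j * np.angle(v[0]))

def angle_distance(a1, a2):
    """d(a1, a2) = arccos(r1^T r2), with r = [Re(a); Im(a)] scaled to unit norm."""
    r1 = np.concatenate([a1.real, a1.imag])
    r2 = np.concatenate([a2.real, a2.imag])
    c = r1 @ r2 / (np.linalg.norm(r1) * np.linalg.norm(r2))
    return np.arccos(np.clip(c, -1.0, 1.0))

def classify(vectors, eps2):
    """Greedy classification: join the first class whose representative is
    closer than eps2, otherwise start a new class."""
    reps, labels = [], []
    for v in vectors:
        for i, r in enumerate(reps):
            if angle_distance(v, r) < eps2:
                labels.append(i)
                break
        else:
            reps.append(v)
            labels.append(len(reps) - 1)
    return np.array(labels), reps

rng = np.random.default_rng(0)
angles = (30.0, 45.0, 60.0)
truth = np.arange(60) % 3                  # source owning each auto-term point
vectors = []
for i in truth:
    a = steering_vector(angles[i])
    a = a + 0.005 * (rng.standard_normal(2) + 1j * rng.standard_normal(2))
    a = a * np.exp(1j * rng.uniform(0, 2 * np.pi))   # unknown eigenvector phase
    vectors.append(fix_phase(a / np.linalg.norm(a)))

labels, reps = classify(vectors, eps2=0.15)
```

With these values the three steering vectors are at least about 0.35 rad apart, so a threshold of 0.15 rad keeps the classes distinct. A greedy scheme of this kind depends on the order of the vectors and on the threshold, so it can create spurious classes when the eigenvector estimates are noisier.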

Figure 1: Experiment 1. Panels: (1.a–c) WVD of sources s1, s2 and s3; (1.d) WVD of data mixture X; (1.e) truncated TF representation (auto+cross terms); (1.f) truncated TF representation (auto terms); (1.g–i) plots of TF points in classes 1, 2 and 3 of 3. Axes in all panels: frequency (Hz) and time (secs).

Experiment 2 (cf. Fig.2). Unlike in the first experiment, we used here a mixture of two monocomponent chirps (Fig.2.a–b) and one multicomponent chirp (Fig.2.c) as our three source signals. Similarly, the auto-terms were obtained as in Fig.2.f. Finally, the vector classification procedure separated the original source signals, as expected (Fig.2.g–i). However, there are two "virtual" sources (Fig.2.j–k) resulting from the use of simple vector classification in the simulation. As already mentioned in the discussion, more sophisticated algorithms need to be applied to achieve robust BSS.

Figure 2: Experiment 2. Panels: (2.a–c) WVD of sources s1, s2 and s3; (2.d) WVD of data mixtures X; (2.e) truncated TF representation (auto+cross terms); (2.f) truncated TF representation (auto terms); (2.g–k) plots of TF points in classes 1 to 5 of 5. Axes in all panels: frequency (Hz) and time (secs).

6. REFERENCES

[1] J. F. Cardoso, "Blind signal separation: Statistical principles," Proc. IEEE, vol. 9, pp. 2009–2025, Oct. 1998.

[2] A. Belouchrani, K. Abed-Meraim, J. F. Cardoso, and E. Moulines, "Blind source separation using second order statistics," IEEE Trans. Sig. Proc., vol. 42, pp. 434–444, Feb. 1997.

[3] A. Belouchrani and M. G. Amin, "Blind source separation based on time-frequency signal representations," IEEE Trans. Sig. Proc., vol. 46, pp. 2888–2897, Nov. 1998.

[4] L. Parra and C. Spence, "Convolutive blind separation of non-stationary sources," IEEE Trans. Spch. Aud. Proc., vol. 8, pp. 320–327, May 2000.

[5] P. Comon and O. Grellier, "Nonlinear inversion of underdetermined mixtures," in ICA'99, (Aussois, France), pp. 461–465, Jan. 1999.

[6] K. I. Diamantaras, "Blind separation of multiple binary sources using a single linear mixture," in ICASSP'2000, vol. V, (Istanbul, Turkey), pp. 2657–2660, June 2000.

[7] A. Jourjine, S. Rickard, and O. Yilmaz, "Blind separation of disjoint orthogonal signals: demixing n sources from 2 mixtures," in ICASSP'2000, vol. 5, (Istanbul, Turkey), pp. 2985–2988, June 2000.

[8] L. Cohen, Time-Frequency Analysis. Englewood Cliffs, NJ: Prentice-Hall, 1995.

[9] A. Belouchrani, K. Abed-Meraim, M. G. Amin, and A. M. Zoubir, "Joint anti-diagonalization for blind source separation," to appear in ICASSP2001, Utah.

[10] B. Boashash, ed., Time-Frequency Signal Analysis: Methods and Applications. Melbourne, Australia: Longman Cheshire, 1992.

[11] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Kluwer Academic Publishers, 1991.

[12] A. Delfosse and P. Loubaton, "Adaptive separation of independent sources: a deflation approach," in ICASSP'94, vol. IV, pp. 41–44, 1994.

[13] A. Francos and M. Porat, "Analysis and synthesis of multicomponent signals using positive time-frequency distributions," IEEE Trans. Sig. Proc., vol. 47, pp. 493–504, Feb. 1999.