Second-order blind source separation based on multi-dimensional autocovariances

Fabian J. Theis¹,², Anke Meyer-Baese², and Elmar W. Lang¹

¹ Institute of Biophysics, University of Regensburg, D-93040 Regensburg, Germany
² Department of Electrical and Computer Engineering, Florida State University, Tallahassee, FL 32310-6046, USA
[email protected]

Abstract. SOBI is a blind source separation algorithm based on time decorrelation. It uses multiple time autocovariance matrices and performs joint diagonalization, thus being more robust than previous time-decorrelation algorithms such as AMUSE. We propose an extension called mdSOBI that uses multidimensional autocovariances, which can be calculated for data sets with multidimensional parametrizations such as images or fMRI scans. mdSOBI has the advantage of using the spatial data in all directions, whereas SOBI uses only a single direction. These findings are confirmed by simulations and an application to fMRI analysis, where mdSOBI outperforms SOBI considerably.

Blind source separation (BSS) describes the task of recovering the unknown mixing process and the underlying sources of an observed data set. Currently, many BSS algorithms assume independence of the sources (ICA); see for instance [1, 2] and references therein. In this work, we consider BSS algorithms based on time decorrelation. Such algorithms include AMUSE [3] and extensions such as SOBI [4] and the similar TDSEP [5]. These algorithms rely on the fact that the data sets have non-trivial autocorrelations. We give an extension to data sets which have more than one direction in the parametrization, such as images, by replacing one-dimensional autocovariances by multidimensional autocovariances. The paper is organized as follows: in section 1 we introduce the linear mixture model; section 2 recalls results on time-decorrelation BSS algorithms. We then define multidimensional autocovariances and use them to propose mdSOBI in section 3. The paper finishes with both artificial and real-world results in section 4.

1 Linear BSS

We consider the following blind source separation (BSS) problem: let x(t) be an (observed) stationary m-dimensional real stochastic process (with not necessarily discrete time t) and A an invertible real matrix such that

x(t) = As(t) + n(t)    (1)


where the source signals s(t) have diagonal autocovariances

Rs(τ) := E[(s(t + τ) − E(s(t)))(s(t) − E(s(t)))ᵀ]
for all τ, and the additive noise n(t) is modelled by a stationary, temporally and spatially white zero-mean process with variance σ². x(t) is observed, and the goal is to recover A and s(t). Having found A, s(t) can be estimated by A⁻¹x(t), which is optimal in the maximum-likelihood sense (if the density of n(t) is maximal at 0, which is the case for usual noise models such as Gaussian or Laplacian noise). So the BSS task reduces to the estimation of the mixing matrix A. Extensions of the above model include for example the complex case [4] or allowing different dimensions for s(t) and x(t), where the case of larger mixing dimension can easily be reduced to the presented complete case by dimension reduction, resulting in a lower noise level [6].

By centering the processes, we can assume that x(t) and hence s(t) have zero mean. The autocovariances then have the following structure:

Rx(τ) = E[x(t + τ)x(t)ᵀ] = { A Rs(0) Aᵀ + σ²I   if τ = 0
                            { A Rs(τ) Aᵀ         if τ ≠ 0    (2)

Clearly, A (and hence s(t)) can be determined by equation (1) only up to permutation and scaling of columns. Since we assume existing variances of x(t) and hence s(t), the scaling indeterminacy can be eliminated by the convention Rs(0) = I. In order to guarantee identifiability of A from the above model except for permutation, we have to additionally assume that there exists a delay τ such that Rs(τ) has pairwise different eigenvalues (for a generalization see [4], theorem 2). Then, using the spectral theorem, it is easy to see from equation (2) that A is determined uniquely by x(t) except for permutation.
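The factorization in equation (2) can be checked numerically. The following sketch (our illustration, not code from the paper) estimates lagged autocovariances from samples of mixed autoregressive sources; all names are ours, and the estimator assumes (approximately) zero-mean data.

```python
import numpy as np

def autocov(x, tau):
    """Estimate the lagged autocovariance R_x(tau) = E[x(t+tau) x(t)^T]
    from zero-mean samples x of shape (m, T)."""
    m, T = x.shape
    if tau == 0:
        return x @ x.T / T
    return x[:, tau:] @ x[:, :T - tau].T / (T - tau)

# Toy check of equation (2): mix two decorrelated AR(1) sources, whose
# lagged autocovariance matrices are (close to) diagonal.
rng = np.random.default_rng(0)
T = 10000
s = np.zeros((2, T))
for i, a in enumerate((0.9, -0.5)):
    for t in range(1, T):
        s[i, t] = a * s[i, t - 1] + rng.standard_normal()
A = rng.standard_normal((2, 2))
x = A @ s
# For tau != 0: R_x(tau) = A R_s(tau) A^T, with R_s(tau) nearly diagonal.
```

Since x = As holds exactly for the samples, the finite-sample estimates satisfy the same factorization exactly, while the off-diagonal entries of the estimated Rs(τ) vanish only up to estimation error of order 1/√T.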

2 AMUSE and SOBI

Equation (2) also gives an indication of how to perform BSS, i.e. how to recover A from x(t). The usual first step consists of whitening the noise-free term x̃(t) := As(t) of the observed mixtures x(t) using an invertible matrix V such that Vx̃(t) has unit covariance. V can simply be estimated from x(t) by diagonalization of the symmetric matrix Rx̃(0) = Rx(0) − σ²I, provided that the noise variance σ² is known. If more signals than sources are observed, dimension reduction can be performed in this step, and the noise level can be reduced [6].

Without loss of generality, we will therefore assume in the following that x̃(t) = As(t) has unit covariance for each t. By assumption, s(t) also has unit covariance, hence I = E[As(t)s(t)ᵀAᵀ] = A Rs(0) Aᵀ = AAᵀ, so A is orthogonal. Now define the symmetrized autocovariance of x(t) as R̄x(τ) := ½(Rx(τ) + Rx(τ)ᵀ). Equation (2) shows that the symmetrized autocovariance of x(t) also factors, and we get

R̄x(τ) = A R̄s(τ) Aᵀ    (3)


for τ ≠ 0. By assumption, R̄s(τ) is diagonal, so equation (3) is an eigenvalue decomposition of the symmetric matrix R̄x(τ). If we furthermore assume that R̄x(τ), or equivalently R̄s(τ), has n different eigenvalues, then the above decomposition is uniquely determined by R̄x(τ) except for orthogonal transformations within each eigenspace and permutation; since the eigenspaces are one-dimensional, this means A is uniquely determined by equation (3) except for permutation. In addition to this separability result, A can be recovered algorithmically by simply calculating the eigenvalue decomposition of R̄x(τ) (AMUSE, [3]).

In practice, if the eigenvalue decomposition is problematic, a different choice of τ often resolves the problem. Nonetheless, there are sources in which some components have equal autocovariances. Also, since the autocovariance matrices are only estimated from a finite number of samples, and due to possible colored noise, the autocovariance at τ may be badly estimated. A more general BSS algorithm called SOBI (second-order blind identification), based on time decorrelation, was therefore proposed by Belouchrani et al. [4]. Instead of diagonalizing only a single autocovariance matrix, it takes a whole set of autocovariance matrices of x(t) with varying time lags τ and jointly diagonalizes the set. It has been shown that increasing the size of this set improves SOBI performance in noisy settings [1]. Algorithms for performing joint diagonalization of a set of symmetric commuting matrices include gradient descent on the sum of the off-diagonal terms, iterative construction of A by Givens rotations in two coordinates [7] (used in the simulations in section 4), an iterative two-step recovery of A [8], and, more recently, a linear least-squares algorithm for diagonalization [9]; the latter two algorithms can also search for non-orthogonal matrices A.
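The AMUSE recovery described above — whitening followed by the eigenvalue decomposition of one symmetrized lagged autocovariance — can be sketched as follows. This is a minimal illustration under the noise-free model; all function names are ours.

```python
import numpy as np

def amuse(x, tau=1):
    """AMUSE sketch: whiten x of shape (m, T), then diagonalize one
    symmetrized lagged autocovariance. Returns an unmixing matrix W
    with W @ x recovering the sources up to permutation, sign, scale."""
    m, T = x.shape
    x = x - x.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(x @ x.T / T)      # R_x(0) = E diag(d) E^T
    V = (E / np.sqrt(d)).T                  # whitening: V R_x(0) V^T = I
    z = V @ x
    C = z[:, tau:] @ z[:, :T - tau].T / (T - tau)
    C = (C + C.T) / 2                       # symmetrized autocovariance
    _, U = np.linalg.eigh(C)                # rotation from eigenvectors
    return U.T @ V

# Demo: two AR(1) sources with different coefficients, hence different
# eigenvalues of the lag-1 autocovariance (the identifiability condition).
rng = np.random.default_rng(1)
T = 10000
s = np.zeros((2, T))
for i, a in enumerate((0.9, -0.5)):
    for t in range(1, T):
        s[i, t] = a * s[i, t - 1] + rng.standard_normal()
A = rng.standard_normal((2, 2))
W = amuse(A @ s, tau=1)
P = W @ A   # should be close to a scaled permutation matrix
```

If the two AR coefficients were equal, the eigenvalues of R̄s(τ) would coincide and the eigendecomposition would leave the corresponding rotation undetermined, which is precisely the failure mode SOBI addresses.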
Joint diagonalization has been used in BSS both with cumulant matrices [10] and with time autocovariances [4, 5].
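For the joint diagonalization step itself, a minimal sketch of the Givens-rotation approach of [7] might look as follows. The rotation-angle formula follows Cardoso and Souloumiac's Jacobi angles; the surrounding implementation details (sweep structure, stopping rule) are our assumptions, not the paper's code.

```python
import numpy as np

def joint_diag(Cs, eps=1e-12, sweeps=100):
    """Orthogonal joint diagonalization of symmetric matrices Cs of
    shape (K, n, n) by Jacobi (Givens) rotations; returns U such that
    U.T @ C @ U is jointly as diagonal as possible for all C in Cs."""
    Cs = np.array(Cs, dtype=float)
    K, n, _ = Cs.shape
    U = np.eye(n)
    for _ in range(sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                # Closed-form optimal Jacobi angle for the pair (p, q).
                g1 = Cs[:, p, p] - Cs[:, q, q]
                g2 = Cs[:, p, q] + Cs[:, q, p]
                ton = g1 @ g1 - g2 @ g2
                toff = 2.0 * (g1 @ g2)
                theta = 0.5 * np.arctan2(toff, ton + np.hypot(ton, toff))
                c, s = np.cos(theta), np.sin(theta)
                if abs(s) > eps:
                    rotated = True
                    R = np.eye(n)
                    R[p, p] = R[q, q] = c
                    R[p, q], R[q, p] = -s, s
                    Cs = R.T @ Cs @ R   # rotate all K matrices at once
                    U = U @ R
        if not rotated:
            break
    return U

# Example: matrices sharing an orthogonal diagonalizer Q; joint_diag
# recovers Q up to permutation and signs.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
Cs = np.stack([Q @ np.diag(rng.standard_normal(3)) @ Q.T for _ in range(4)])
U = joint_diag(Cs)
```

For exactly jointly diagonalizable inputs the sweeps converge quickly; with estimated autocovariances the returned U minimizes the off-diagonal energy only approximately, which is the intended behaviour of SOBI.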

3 Multidimensional SOBI

The goal of this work is to improve SOBI performance for random processes with a higher-dimensional parametrization, i.e. for data sets where the random processes s and x do not depend on a single variable t but on multiple variables (z1, . . . , zM). A typical example is a source data set in which each component si represents an image of size h × w. Then M = 2, and samples of s are given at z1 = 1, . . . , h and z2 = 1, . . . , w. Classically, s(z1, z2) is transformed to s(t) by fixing a mapping from the two-dimensional parameter set to the one-dimensional time parametrization of s(t), for example by concatenating columns or rows in the case of a finite number of samples. If the time structure of s(t) is not used, as in all classical ICA algorithms in which i.i.d. samples are assumed, this choice does not influence the result. However, in time-structure based algorithms such as AMUSE and SOBI, results can vary greatly depending on the choice of this mapping, see figure 2.

Without loss of generality, we again assume centered random vectors. Then define the multidimensional autocovariance to be

Rs(τ1, . . . , τM) := E[s(z1 + τ1, . . . , zM + τM) s(z1, . . . , zM)ᵀ]

[Figure 1: one- and two-dimensional autocovariance against τ respectively |(τ1, τ2)| (rescaled to N)]

Fig. 1. Example of the one- and two-dimensional autocovariance coefficients of the grayscale 128 × 128 Lena image after normalization to variance 1.

where the expectation is taken over (z1, . . . , zM). Rs(τ1, . . . , τM) can be estimated from equidistant samples by replacing random variables with sample values and expectations with sums, as usual.

The advantage of using multidimensional autocovariances lies in the fact that the multidimensional structure of the data set can now be used explicitly. For example, if row concatenation is used to construct s(t) from the images, horizontal lines in the image will only give trivial contributions to the autocovariance (see the examples in figure 2 and section 4). Figure 1 shows the one- and two-dimensional autocovariances of the Lena image for varying τ respectively (τ1, τ2) after normalization of the image to variance 1. Clearly, the two-dimensional autocovariance does not decay as quickly with increasing radius as the one-dimensional covariance. Only at multiples of the image height is the one-dimensional autocovariance significantly high, i.e. does it capture image structure.

Our contribution consists of using multidimensional autocovariances for joint diagonalization. We replace the BSS assumption of diagonal one-dimensional autocovariances by diagonal multidimensional autocovariances of the sources. Note that the multidimensional covariance also satisfies equation (2). Again we assume whitened x(z1, . . . , zM). Given an autocovariance matrix R̄x(τ1^(1), . . . , τM^(1)) with n different eigenvalues, multidimensional AMUSE (mdAMUSE) detects the orthogonal unmixing mapping W by diagonalization of this matrix.

In section 2, we discussed the advantages of using SOBI over AMUSE; these of course also hold in this generalized case. Hence, the multidimensional SOBI algorithm (mdSOBI) consists of the joint diagonalization of a set of symmetrized multidimensional autocovariances

{ R̄x(τ1^(1), . . . , τM^(1)), . . . , R̄x(τ1^(K), . . . , τM^(K)) }


[Figure 2: (a) source images; (b) performance comparison of SOBI and mdSOBI on original and transposed images]

Fig. 2. Comparison of SOBI and mdSOBI when applied to the (unmixed) images from (a). Plot (b) shows the number K of time lags versus the crosstalking error E1 of the recovered matrix Â and the unit matrix I; here Â has been recovered by both SOBI and mdSOBI, given the images from (a) respectively the transposed images.

after whitening of x(z1, . . . , zM). The joint diagonalizer then equals A except for permutation, given the generalized identifiability conditions from [4], theorem 2; therefore the identifiability result does not change either, see [4]. In practice, we choose the (τ1^(k), . . . , τM^(k)) with increasing modulus for increasing k, but with the restriction τ1^(k) > 0, in order to avoid using the same autocovariances on the diagonal of the matrix twice.

Often, data sets do not have any substantial long-distance autocorrelations, but quite high multidimensional close-distance correlations (see figure 1). When performing joint diagonalization, SOBI weights each matrix equally strongly, which can deteriorate the performance for large K, see the simulation in section 4. Figure 2(a) shows an example in which the images have considerable vertical structure, but rather random horizontal structure. Each of the two images consists of a concatenation of stripes of two images. For visual purposes, we chose the width of the stripes to be rather large at 16 pixels. According to the previous discussion, we expect one-dimensional algorithms such as AMUSE and SOBI to perform well on the images, but badly (for numbers of time lags ≫ 16) on the transposed images. If we apply AMUSE with τ = 20 to the images, we get excellent performance with a low crosstalking error with the unit matrix of 0.084; if we however apply AMUSE to the transposed images, the error is high at 1.1. This result is further confirmed by the comparison plot in figure 2(b); mdSOBI performs equally well on the images and the transposed

[Figure 3]

Fig. 3. SOBI and mdSOBI performance depending on the noise level σ, for K = 32 and K = 128 time lags. Plotted is the crosstalking error E1 of the recovered matrix Â with the real mixing matrix A. See text for more details.

images, whereas the performance of SOBI depends strongly on whether column or row concatenation was used to construct a one-dimensional random process out of each image. The SOBI breakpoint of around K = 52 can be decreased by choosing smaller stripes. In future work, we want to provide an analytical discussion of the performance increase of mdSOBI over SOBI, similar to the performance evaluation in [4].
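The two mdSOBI ingredients described in this section — estimating multidimensional autocovariances and choosing lags of increasing modulus — might be sketched as follows for M = 2. `autocov2d` and `md_lags` are hypothetical helper names of ours, and the half-plane condition (t1 > 0, or t1 = 0 and t2 > 0) is our reading of the paper's restriction τ1 > 0.

```python
import numpy as np

def autocov2d(S, t1, t2):
    """Estimate R_S(t1, t2) = E[S(z1 + t1, z2 + t2) S(z1, z2)^T] for n
    zero-mean image sources stacked in S of shape (n, h, w); lags may be
    negative. The expectation becomes an average over overlapping pixels."""
    n, h, w = S.shape
    a = S[:, max(t1, 0):h + min(t1, 0), max(t2, 0):w + min(t2, 0)]
    b = S[:, max(-t1, 0):h + min(-t1, 0), max(-t2, 0):w + min(-t2, 0)]
    return a.reshape(n, -1) @ b.reshape(n, -1).T / a[0].size

def md_lags(K, max_r=30):
    """Choose K two-dimensional lags (t1, t2) of increasing modulus,
    keeping one representative per pair {t, -t}: R(-t) equals R(t)
    transposed, so both yield the same symmetrized autocovariance."""
    cands = [(t1, t2) for t1 in range(max_r + 1)
             for t2 in range(-max_r, max_r + 1) if t1 > 0 or t2 > 0]
    cands.sort(key=lambda t: (t[0] ** 2 + t[1] ** 2, t))
    return cands[:K]
```

mdSOBI would then jointly diagonalize the symmetrized matrices ½(R + Rᵀ) for R = autocov2d(Z, t1, t2) over all (t1, t2) in md_lags(K), with Z the whitened image data.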

4 Results

Artificial mixtures. We consider the linear mixture of three images (baboon, black-haired lady, and Lena) with a randomly chosen 3 × 3 matrix A. Figure 3 shows how SOBI and mdSOBI perform depending on the noise level σ. For small K, both SOBI and mdSOBI perform equally well in the low-noise case, but mdSOBI performs better under stronger noise. For larger K, mdSOBI substantially outperforms SOBI, which is due to the fact that natural images do not have any substantial long-distance autocorrelations (see figure 1), whereas mdSOBI uses the non-trivial two-dimensional autocorrelations.

fMRI analysis. We analyze the performance of mdSOBI when applied to fMRI measurements. fMRI data were recorded from six subjects (3 female, 3 male, age 20–37) performing a visual task. In five subjects, five slices with 100 images (TR/TE = 3000/60 msec) were acquired with five periods of rest and five


[Figure 4: (a) component maps; (b) time courses, with crosscorrelations cc to the stimulus: component 1: −0.08, 2: 0.19, 3: −0.11, 4: −0.21, 5: −0.43, 6: −0.21, 7: −0.16, 8: −0.86]

Fig. 4. mdSOBI fMRI analysis. The data was reduced to the first 8 principal components. (a) shows the recovered component maps (white points indicate values stronger than 3 standard deviations), and (b) their time courses. mdSOBI was performed with K = 32. Component 5 represents the inner ventricles, component 6 the frontal eye fields. Component 8 is the desired stimulus component, which is mainly active in the visual cortex; its time course closely follows the on-off stimulus (indicated by the gray boxes), with a crosscorrelation of cc = −0.86 and a delay of roughly 2 seconds induced by the BOLD effect.

photic stimulation periods. Stimulation and rest periods comprised 10 repetitions each, i.e. 30 s. Resolution was 3 × 3 × 4 mm. The slices were oriented parallel to the calcarine fissure. Photic stimulation was performed using an 8 Hz alternating checkerboard stimulus with a central fixation point; during the control periods, a dark background with a central fixation point was shown. The first scans were discarded because of remaining saturation effects. Motion artifacts were compensated by automatic image alignment (AIR, [11]).

BSS, mainly based on ICA, is nowadays a quite common tool in fMRI analysis (see for example [12]). Here, we analyze the fMRI data set using spatial decorrelation as the separation criterion. Figure 4 shows the performance of mdSOBI; see the figure caption for interpretation. Using only the first 8 principal components, mdSOBI could recover the stimulus component as well as detect additional components. When SOBI was applied to the data set, it could not properly detect the stimulus component, but found two components with crosscorrelations cc = −0.81 and −0.84 with the stimulus time course.

5 Conclusion

We have proposed mdSOBI, an extension of SOBI for data sets with multidimensional parametrizations, such as images. Our main contribution lies in


replacing the one-dimensional autocovariances by multidimensional autocovariances. In both simulations and real-world applications, mdSOBI outperforms SOBI for these multidimensional structures. In future work, we will show how to perform spatiotemporal BSS by jointly diagonalizing both spatial and temporal autocovariance matrices. We plan to apply these results to fMRI analysis, where we also want to use three-dimensional autocovariances for 3d scans of the whole brain.

Acknowledgements The authors would like to thank Dr. Dorothee Auer from the Max Planck Institute of Psychiatry in Munich, Germany, for providing the fMRI data, and Oliver Lange from the Department of Clinical Radiology, Ludwig-Maximilian University, Munich, Germany, for data preprocessing and visualization. FT and EL acknowledge partial financial support by the BMBF in the project ’ModKog’.

References

1. Cichocki, A., Amari, S.: Adaptive Blind Signal and Image Processing. John Wiley & Sons (2002)
2. Hyvärinen, A., Karhunen, J., Oja, E.: Independent Component Analysis. John Wiley & Sons (2001)
3. Tong, L., Liu, R.W., Soon, V., Huang, Y.F.: Indeterminacy and identifiability of blind identification. IEEE Transactions on Circuits and Systems 38 (1991) 499–509
4. Belouchrani, A., Meraim, K.A., Cardoso, J.F., Moulines, E.: A blind source separation technique based on second order statistics. IEEE Transactions on Signal Processing 45 (1997) 434–444
5. Ziehe, A., Mueller, K.R.: TDSEP – an efficient algorithm for blind separation using time structure. In Niklasson, L., Bodén, M., Ziemke, T., eds.: Proc. of ICANN'98, Skövde, Sweden, Springer Verlag, Berlin (1998) 675–680
6. Joho, M., Mathis, H., Lambert, R.: Overdetermined blind source separation: using more sensors than source signals in a noisy mixture. In: Proc. of ICA 2000, Helsinki, Finland (2000) 81–86
7. Cardoso, J.F., Souloumiac, A.: Jacobi angles for simultaneous diagonalization. SIAM J. Mat. Anal. Appl. 17 (1995) 161–164
8. Yeredor, A.: Non-orthogonal joint diagonalization in the least-squares sense with application in blind source separation. IEEE Trans. Signal Processing 50 (2002) 1545–1553
9. Ziehe, A., Laskov, P., Mueller, K.R., Nolte, G.: A linear least-squares algorithm for joint diagonalization. In: Proc. of ICA 2003, Nara, Japan (2003) 469–474
10. Cardoso, J.F., Souloumiac, A.: Blind beamforming for non-Gaussian signals. IEE Proceedings-F 140 (1993) 362–370
11. Woods, R., Cherry, S., Mazziotta, J.: Rapid automated algorithm for aligning and reslicing PET images. Journal of Computer Assisted Tomography 16 (1992) 620–633
12. McKeown, M., Jung, T., Makeig, S., Brown, G., Kindermann, S., Bell, A., Sejnowski, T.: Analysis of fMRI data by blind separation into independent spatial components. Human Brain Mapping 6 (1998) 160–188
