Comparison of ICA Algorithms for the Isolation of Biological Artifacts in ...

J. Rosca et al. (Eds.): ICA 2006, LNCS 3889, pp. 511–518, 2006. © Springer-Verlag Berlin Heidelberg 2006.
Comparison of ICA Algorithms for the Isolation of Biological Artifacts in Magnetoencephalography

Heriberto Zavala-Fernández1, Tilmann H. Sander2, Martin Burghoff2, Reinhold Orglmeister1, and Lutz Trahms2

1 Technische Universität Berlin, Institut für Elektronik, Einsteinufer 17, 10587 Berlin, Germany [email protected]
2 Physikalisch-Technische Bundesanstalt, Abbestr. 2-10, 10587 Berlin, Germany

Abstract. The application of Independent Component Analysis (ICA) to achieve blind source separation is now an accepted technique in the field of biosignal processing. The reduction of biological artifacts in magneto- and electroencephalographic recordings is a frequent goal. Four of the most common ICA methods, extended Infomax, FastICA, JADE, and SOBI, are compared here with respect to their ability to isolate magnetoencephalographic (MEG) artifacts. The four algorithms are applied to the same data set containing heart beat and eye movement artifacts. To quantify the results, simple spatial and temporal correlation measures and the use of reference signals are suggested. Of the four algorithms only JADE was marginally less successful.

1 Introduction

For the analysis of magnetoencephalographic (MEG) recordings, the suppression of unwanted signal components is an important preprocessing step. Independent Component Analysis (see recent introductory texts [1-3] for references) is widely used for this purpose. Given the multitude of algorithms described theoretically, it is very difficult for the practitioner to choose a suitable algorithm for the task at hand. Only very few studies address this issue, which is important for experimental data. The power line interference artifact in MEG was isolated in [4] using three ICA algorithms. In [5] respiratory and eye movement artifacts were removed from MEG data using a second-order algorithm followed by a higher-order algorithm. Four algorithms were applied to remove eye movement and blinking artifacts from EEG data in [6]. These studies differ considerably in their methodology. In the present study the focus is on the two most common biological artifacts in MEG data: heart beat and eye movements. For these artifacts we suggest a testing framework using temporal and spatial correlation. To start, results for four frequently used algorithms are presented: extended Infomax, FastICA, JADE, and SOBI. Success of ICA can only be proven on simulated data. As a simulation is not possible in all situations (ICA is often applied because the signal content is unknown), the more practical route of validation using separately recorded reference signals is applied here.


2 Algorithms Used

These algorithms are derived from the ICA concept, which supposes that each observed signal $x_i$ of a multi-channel recording with m channels can be described by a linear superposition

$$x_i(t) = \sum_{j=1}^{n} a_{ij}\, s_j(t) \qquad (1)$$

of n source signals $s_j$, where the number of components n equals the number of sensors m. Assuming that the sources are statistically independent, the joint probability density function of the signals $s_j$ factorizes. Then the sources can be separated theoretically by estimating a demixing matrix W. Estimates $y_i$ of the original sources $s_j$ are found by applying the demixing matrix to the measured variables: y(t) = W x(t). Before the ICA algorithms are applied, the observed signals are pre-processed in two steps: centering [1-3] and whitening [1-3].

FastICA. The FastICA [7] algorithm estimates the non-Gaussianity of the signal distributions using higher-order statistics and the negentropy, which is a non-negative function of the differential entropy. FastICA is based on an iteration scheme for finding a projection $u = W^T x$ maximizing non-Gaussianity. It can be summarized by the update rule (with $\langle \cdot \rangle$ denoting the average over samples, as in [7])

$$W \propto \langle x\, g(W^T x) \rangle - \langle g'(W^T x) \rangle\, W, \qquad (2)$$

and the subsequent normalization of the updated W until convergence is reached. There exist several possible choices for the non-linear function g(u).

Extended Infomax. The extended Infomax [9] algorithm is an extension of the Infomax algorithm, which is based on the information maximization principle [8], with the ability to separate mixed signals with sub- and super-Gaussian distributions. This is achieved by introducing a learning rule able to switch between both distributions. The switching criterion between the sub- and super-Gaussian distributions for y(t) is contained in the following learning rule

$$\Delta W \propto \left[ I - K \tanh(y)\, y^T - y\, y^T \right] W, \qquad k_i = \begin{cases} 1 & \text{super-Gaussian} \\ -1 & \text{sub-Gaussian.} \end{cases} \qquad (3)$$

The elements $k_i$ of the diagonal matrix K are obtained according to

$$k_i = \operatorname{sign}\!\left( \langle \operatorname{sech}^2(y_i) \rangle \langle y_i^2 \rangle - \langle \tanh(y_i)\, y_i \rangle \right), \qquad (4)$$

where $\langle \cdot \rangle$ denotes the average over samples, as in [9].

The Infomax algorithm maximizes the entropy of the outputs H(y). The maximization of this joint entropy consists of maximizing the individual entropies while minimizing the mutual information I(x) shared between them. When this latter quantity is zero, the variables are statistically independent.
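The learning rule of Eq. 3 together with the switching criterion of Eq. 4 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the EEGLAB implementation used in this study: the function name `extended_infomax_step`, the batch averaging over all samples, and the step size `lr` are hypothetical choices.

```python
import numpy as np

def extended_infomax_step(W, x, lr=0.01):
    """One batch update following Eq. 3, with k_i chosen as in Eq. 4.

    W  : (n, n) current demixing matrix
    x  : (n, T) centered and whitened observations
    lr : step size (illustrative value, not taken from the paper)
    """
    n, T = x.shape
    y = W @ x                                   # current source estimates
    tanh_y = np.tanh(y)
    sech2_y = 1.0 - tanh_y ** 2                 # sech^2(y) = 1 - tanh^2(y)
    # Eq. 4 with sample averages: +1 -> super-Gaussian, -1 -> sub-Gaussian
    k = np.sign(sech2_y.mean(axis=1) * (y ** 2).mean(axis=1)
                - (tanh_y * y).mean(axis=1))
    # Eq. 3, with the outer products averaged over the T samples
    grad = np.eye(n) - (k[:, None] * tanh_y) @ y.T / T - y @ y.T / T
    return W + lr * grad @ W, k

# Synthetic check of the switching rule: one Laplacian (super-Gaussian) and
# one uniform (sub-Gaussian) source, both scaled to unit variance.
rng = np.random.default_rng(0)
T = 20_000
sources = np.vstack([
    rng.laplace(scale=1 / np.sqrt(2), size=T),
    rng.uniform(-np.sqrt(3), np.sqrt(3), size=T),
])
W_next, k = extended_infomax_step(np.eye(2), sources)
print(k)
```

For these two synthetic sources the criterion selects k = +1 for the Laplacian and k = -1 for the uniform signal, which is the switching behavior described above.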


JADE. The Joint Approximate Diagonalization of Eigenmatrices [10,11] (JADE) algorithm uses fourth-order cumulants $Q = \mathrm{Cum}(x_i, x_j, x_k, x_l)$, which present in a clear form the additional information provided by the higher-order statistics. The JADE algorithm aims to reduce the mutual information contained in the cumulant matrices by looking for a rotation matrix such that the cumulant matrices are as diagonal as possible. The joint diagonalization is found by the Jacobi technique.

SOBI. In contrast to the three algorithms sketched so far, the Second Order Blind Identification algorithm (SOBI [12], TDSEP [13,14]) takes advantage of the temporal structure in the observed data. The basis of the SOBI algorithm is a set of time-lagged covariance matrices

$$R_x(\tau) = \langle x(t)\, x^T(t+\tau) \rangle, \qquad \tau \neq 0. \qquad (5)$$

For independent sources the time-lagged covariance matrices of the source signals have to be diagonal. To estimate the sources, a joint diagonalization of the time-lagged covariance matrices is performed similarly to the JADE algorithm. A set of τ values is used to avoid an inferior source separation, as there is no theoretically proven choice of τ values.
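The role of the time-lagged covariance matrices of Eq. 5 can be illustrated on synthetic data. The sketch below is not the SOBI implementation used in the study; the helper names, the symmetrization of each matrix, the sign convention of the lag, the sinusoidal test sources, and the mixing matrix `A` are all assumptions made for illustration. It only shows that the lagged covariances are nearly diagonal for independent sources and clearly non-diagonal for their mixtures, which is the property the joint diagonalization exploits.

```python
import numpy as np

def lagged_covariances(x, lags):
    """Symmetrized time-lagged covariance matrices of x with shape (n, T)."""
    x = x - x.mean(axis=1, keepdims=True)        # centering
    T = x.shape[1]
    mats = []
    for tau in lags:
        R = x[:, :T - tau] @ x[:, tau:].T / (T - tau)   # <x(t) x(t+tau)^T>
        mats.append(0.5 * (R + R.T))             # symmetrize (assumed convention)
    return np.stack(mats)

def off_diagonal_share(R):
    """Fraction of the squared matrix entries that lie off the diagonal."""
    off = R - np.diag(np.diag(R))
    return float(np.sum(off ** 2) / np.sum(R ** 2))

fs = 250.0                                       # samples/s, as after downsampling
t = np.arange(2500) / fs                         # 10 s of synthetic data
s = np.vstack([np.sin(2 * np.pi * 3 * t),        # two sources with different
               np.sin(2 * np.pi * 7 * t + 1.0)]) # temporal structure
A = np.array([[1.0, 0.6], [0.4, 1.0]])           # hypothetical mixing matrix
x = A @ s

lags = (1, 2, 3, 4, 5)
share_sources = max(off_diagonal_share(R) for R in lagged_covariances(s, lags))
share_mixtures = min(off_diagonal_share(R) for R in lagged_covariances(x, lags))
print(share_sources, share_mixtures)
```

In the study itself the lags τ = 1, …, 100 were used; the shorter lag set here only keeps the example small.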

3 MEG Measurement and ICA Calculations

The MEG recordings were performed in a shielded room, where the level of technical interference signals was reduced. The data were measured using a helmet-shaped MEG sensor (www.eagle-tek.com) with 93 channels. To generate typical biological artifacts, the subject was instructed to follow this protocol during the measurement: rest for 30 s, horizontal eye movements for 30 s, rest for 30 s, vertical eye movements for 30 s, rest for 30 s, and eye blinking for 30 s. To simplify the artifact identification in the MEG, the electrooculogram (EOG: single lead below the eye relative to the forehead) and the electrocardiogram (ECG: single lead on sternum relative to forehead) signals were recorded simultaneously. The raw data sampled at 2 kHz were downsampled offline by a factor of 8 to 250 samples/s to reduce the computational load for the ICA algorithms. In total 62500 data points (250 s × 250 samples/s) were input into the ICA calculations for each of the 93 channels without dimension reduction of the signal space. This study was realized using the software package EEGLAB [15]. EEGLAB is a freely available MATLAB® package for the analysis of single-trial EEG dynamics including various ICA algorithms. The following parameters were chosen: FastICA with g(u) = u^3, SOBI with the vector τ = {1, 2, …, 100} (no specific parameters for JADE and extended Infomax).
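To make the FastICA setting g(u) = u^3 concrete, the following NumPy sketch runs the fixed-point iteration of Eq. 2 on whitened synthetic data. It is a minimal illustration and not the EEGLAB routine used for the MEG data: the function names, the symmetric orthogonalization used as the normalization step, the fixed iteration count, and the synthetic sources and mixing matrix are assumptions. Note that the rows of `W` hold the individual projection vectors here, so the estimates are obtained as y = W z.

```python
import numpy as np

def whiten(x):
    """Center x (n, T) and transform it to unit covariance."""
    x = x - x.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(x @ x.T / x.shape[1])
    return E @ np.diag(d ** -0.5) @ E.T @ x

def fastica_cubic(z, n_iter=100, seed=0):
    """Symmetric fixed-point iteration on whitened data z with g(u) = u**3."""
    n, T = z.shape
    rng = np.random.default_rng(seed)
    W, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal start
    for _ in range(n_iter):
        u = W @ z                                      # rows: current projections
        # Row-wise update <z g(u)> - <g'(u)> w with g(u) = u^3, g'(u) = 3 u^2
        W = (u ** 3) @ z.T / T - 3.0 * (u ** 2).mean(axis=1, keepdims=True) * W
        # Normalization step: symmetric orthogonalization via the SVD
        U, _, Vt = np.linalg.svd(W)
        W = U @ Vt
    return W

# Synthetic example: a super-Gaussian and a sub-Gaussian source, linearly mixed.
rng = np.random.default_rng(1)
T = 20_000
s = np.vstack([rng.laplace(size=T), rng.uniform(-1, 1, size=T)])
A = np.array([[1.0, 0.5], [0.3, 1.0]])                 # hypothetical mixing matrix
z = whiten(A @ s)
y = fastica_cubic(z) @ z
match = np.abs(np.corrcoef(y, s)[:2, 2:])              # |correlation| estimates vs. sources
print(match.round(3))
```

Each row of `match` should contain one entry close to 1, reflecting that ICA recovers the sources only up to order and sign.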

4 Results

After the ICA calculation the component due to the heart beat was identified manually by comparison with the ECG signal, and the result is shown in Fig. 1. All algorithms have successfully isolated a Cardiac Artifact (CA) component. On the left side the time series for the CA components (FastICA = FICA, extended Infomax = eINFO) are shown together with the ECG reference signal. The algorithm associated with each time series is indicated, and the time series were rescaled for ease of comparison. The time series shown in Fig. 1 a) is only a short section of the time series input into the calculation. On the right side the CA maps for the four different algorithms are shown. The maps are interpolations using a projection of the three-dimensional MEG sensor coordinates onto a plane, showing the magnetic field as level curves. The maps shown are views onto the top of the head with nose and ears indicated.

[Figure 1. Panel (a): time series f(t), scaled to the range −1 to 1, for ECG (Q, R, and S peaks marked), CA_JADE, CA_SOBI, CA_eINFO, and CA_FICA over 20.5 to 22.5 s. Panel (b): field maps.]

Fig. 1. a) Typical ECG signal (top trace) and the time series of the CAs isolated from MEG by the ICA algorithms, b) associated CA maps (nose and ears indicated)

The four ICA time series in Fig. 1a) agree with each other, but they differ from the ECG (the S-peak is not visible as it is in the ECG trace). Such differences are well known [17] and reflect the complementary nature of electrical and magnetic measurements. Comparing the maps in Fig. 1b) visually, it can be seen that the CA_JADE map is different from the others and that the CA_eINFO and CA_SOBI maps are most similar. The CA map is expected [16] to exhibit a homogeneous (smooth) field distribution due to the relatively large sensor-to-source distance. The small extrema in the CA_JADE map indicate a suboptimal identification of the CA. The strongest field in all maps is on the left side, in agreement with the position of the heart on the left side of the body.

For a quantitative analysis of the ICA result two types of comparison were made. Firstly, the appropriate reference time series (ECG, EOG) was correlated using Eq. 7 with the full-length time series $y_i(t)$ resulting from the ICA after demixing, i.e. inverting Eq. 1, and the ICA time series were correlated with each other. Secondly, the vector angle between the ICA field maps, i.e. the ICA base vectors, was calculated using Eq. 6, where $\vec{v}_i$ with $V = W^{-1}$ denotes the base vector.

$$\alpha = \cos^{-1}\!\left( \frac{\vec{v}_i \cdot \vec{v}_j}{|\vec{v}_i|\,|\vec{v}_j|} \right) \qquad (6)$$

$$\rho_{fy} = \frac{\langle (f(t) - \bar{f})\,(y_i(t) - \bar{y}_i) \rangle}{\sigma(f)\,\sigma(y_i)}, \qquad f(t) = \text{ECG, EOG, } y_i. \qquad (7)$$

Here $\langle \cdot \rangle$ and the overbar denote averages over time, and σ denotes the standard deviation.
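The two measures of Eqs. 6 and 7 are straightforward to compute. The following sketch assumes that each field map is available as a vector (a column of V = W^{-1}) and each time series as a one-dimensional array; the function names are hypothetical. Because the sign of an ICA component is arbitrary, maps and time series may need a sign flip before comparison; that alignment step is an assumption and is left to the caller here.

```python
import numpy as np

def map_angle_deg(v_i, v_j):
    """Angle of Eq. 6 between two base vectors (field maps), in degrees."""
    v_i = np.asarray(v_i, dtype=float)
    v_j = np.asarray(v_j, dtype=float)
    cos_alpha = v_i @ v_j / (np.linalg.norm(v_i) * np.linalg.norm(v_j))
    # Clip to guard against rounding slightly outside [-1, 1]
    return float(np.degrees(np.arccos(np.clip(cos_alpha, -1.0, 1.0))))

def correlation(f, y):
    """Correlation coefficient of Eq. 7 between a reference f and a series y."""
    f = np.asarray(f, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.mean((f - f.mean()) * (y - y.mean())) / (f.std() * y.std()))

# Small synthetic checks (not MEG data)
angle = map_angle_deg([1.0, 0.0, 0.0], [1.0, 1.0, 0.0])
rho_same = correlation([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
rho_flip = correlation([1.0, 2.0, 3.0, 4.0], [-1.0, -2.0, -3.0, -4.0])
print(angle, rho_same, rho_flip)
```

With a demixing matrix `W` from any of the algorithms, the base vectors would be the columns of `np.linalg.inv(W)`, for example `map_angle_deg(V_a[:, i], V_b[:, j])` for two hypothetical results `V_a` and `V_b`.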

The angles between the CA component maps resulting from the different ICA calculations are shown in matrix format in Fig. 2a), where the labels are F = FastICA, eI = extended Infomax, J = JADE, S = SOBI, and the grey scale for the angles is indicated on the right side. It can be seen immediately that the JADE result differs from the other results. The angle between CA_JADE and the other CAs has a minimum value of 29° and a maximum of 41°. In comparison, the angles between the other CAs are less than 12° (Fig. 2a).

[Figure 2. Panel (a): matrix of angles α between the maps of J, S, eI, and F, grey scale from 0° to 40°. Panel (b): matrix of correlations ρ between the time series of J, S, eI, and F, grey scale from 0.93 to 1.]

Fig. 2. a) Matrix representation of angles between the CA component maps calculated using Eq. 6, b) the correlation values between the CA time series calculated using Eq. 7. The labels are F = FastICA, eI = extended Infomax, J = JADE, S = SOBI.

The correlations between the CA time series are shown in Fig. 2b). The correlations have generally high values between 0.92 and 0.99, and the lowest correlation occurs between the time series of SOBI and JADE. In contrast to the high correlations between the results from the ICA algorithms, the correlation between the ECG and the ICA results was 0.42 to 0.45 (values not shown in Fig. 2b). This is a consequence of the differences in signal morphology mentioned above.

The ICA results for the horizontal eye movement artifact (hEMA) are shown in Fig. 3, which displays time series on the left and maps on the right as in Fig. 1. All algorithms identified the hEMA, as can be seen from the time series in Fig. 3 a), although the JADE time series appears noisier. The maps on the right side have basically the same structure, with the strongest signals at the front. This is expected for the signal due to eye movements. Similar to the behavior observed for the CA, the JADE map of the hEMA appears less regular than the other maps. A quantification of these observations was made using Eqns. 6 and 7 on the hEMA maps and time series, and additionally the correlation between the EOG and hEMA time series was calculated. The smallest angle between ICA maps is 6°, occurring between extended Infomax and FastICA. The angle between the JADE hEMA map and the other maps is 13° to 20°, indicating that the JADE result is slightly different from the others. For the correlation values the result is similar: the JADE time series always has the lowest correlations, in the range from 0.94 to 0.96. The correlations between the other algorithms and the EOG signal always exceed 0.98.

[Figure 3. Panel (a): time series f(t), scaled to the range −1 to 1, for hEOG, hEMA_JADE, hEMA_SOBI, hEMA_eINFO, and hEMA_FICA over 20 to 60 s. Panel (b): field maps.]

Fig. 3. a) EOG signal (top trace) and the time series of the horizontal EMAs isolated by the ICA algorithms, b) associated EMA maps (nose and ears indicated)

[Figure 4. Panel (a): time series f(t), scaled to the range −1 to 1, for vEOG, vEMA_JADE1, vEMA_JADE2, vEMA_SOBI, vEMA_eINFO, and vEMA_FICA over 100 to 140 s. Panel (b): time series f(t) for bEOG, bEMA_JADE, bEMA_SOBI1, bEMA_SOBI2, bEMA_eINFO, and bEMA_FICA over 180 to 210 s.]

Fig. 4. EOG signal (top trace) and the time series of: a) the vertical EMAs and b) the blinking EMAs isolated by the ICA algorithms

The ICA results for the vertical and blinking eye movement artifacts (vEMA and bEMA, respectively) are shown in Fig. 4, displaying the time series only. In the case of the vEMA in Fig. 4a), JADE found two separate components related to the vertical EOG, while ext. Infomax, SOBI and FastICA each extracted only a single component appearing less noisy than the others. The JADE and ext. Infomax time series have relatively low correlations (Eq. 7) with the vertical EOG, in the range from 0.6 to 0.88, while the corresponding correlations for the other algorithms always exceed 0.95. The separation of the bEMA in Fig. 4b) shows that SOBI found two components related to the blinking EOG, while the other algorithms found only one. The correlations between bEOG and bEMA of SOBI are in the range from 0.5 to 0.83. The corresponding JADE and ext. Infomax correlations are in the range from 0.8 to 0.9. The best correlation was 0.95 for the FastICA algorithm. The angles calculated using Eq. 6 do not contradict the correlation result (omitted due to lack of space).

Principal Component Analysis (PCA) was also applied to the full data set. The five strongest PCA components are artifact related. The CA is well identified in one component, but in contrast to the ICA algorithms PCA was not able to separate the EMAs into single PCA components. Furthermore, their time series show temporal overlap between the CA and the EMAs. As is well known, decorrelation is not sufficient to achieve independence.
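The closing remark that decorrelation does not imply independence can be checked on a small synthetic example (not the MEG data). In the sketch below two independent uniform sources are rotated by 45°; the rotation angle, the sample size, and the use of the correlation between squared signals as a simple dependence indicator are illustrative assumptions. The rotated signals remain uncorrelated, as PCA components would be, but their squares are clearly correlated, so they are not independent.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 50_000
# Two independent, unit-variance, sub-Gaussian sources
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, T))

theta = np.pi / 4                                  # illustrative rotation angle
R = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])
y = R @ s                                          # still white, but mixed

def corr(a, b):
    """Pearson correlation coefficient of two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

linear_mixed = corr(y[0], y[1])                    # second-order dependence
squares_mixed = corr(y[0] ** 2, y[1] ** 2)         # higher-order dependence
squares_sources = corr(s[0] ** 2, s[1] ** 2)
print(linear_mixed, squares_mixed, squares_sources)
```

The linear correlation of the rotated signals stays near zero while the correlation of their squares is clearly negative; for the original sources both are near zero. Second-order methods alone cannot detect this residual dependence, which is what the higher-order and time-structure criteria of the ICA algorithms address.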

5 Conclusions

Four different ICA algorithms were applied to a 93-channel MEG data set containing heart beat and eye movement artifact signals. Using the simultaneously measured ECG and EOG and prior knowledge, artifact-related independent components could be identified in the results from all four algorithms. Comparing the results from the four algorithms using the correlation between the time series and the angle between the maps, it was found that JADE performs slightly worse than the other algorithms for the single MEG data set used here. The computation time needed to obtain the independent components using JADE was ten times longer than for the other ICA algorithms. All algorithms were capable of isolating the super-Gaussian probability distribution of the heart beat, and most algorithms succeeded for the essentially bimodal distributions of the eye movements. In contrast to the continuous presence of the heart beat, eye movements can be controlled to a certain degree during an experiment, and the data set used seems to be unrealistic in this respect. There were two reasons for this choice: firstly, patients or elderly people often cannot control their eye movements, and secondly, infrequently occurring artifacts violate the stationarity assumption of ICA. Therefore a comparative study using a data set with infrequent artifacts might mainly test the ability of algorithms to cope with non-stationarity. To summarize, the suggested comparative framework appears to be of high practical value, as demonstrated on a limited data set. Future work will assess the separability between cortical activity and artifact signals.

Acknowledgments

Help with the measurements by Alf Pretzell, the DAAD (German Academic Exchange Service) scholarship for H.Z.F. (PKZ: A/04/21558), and the Berlin Neuroimaging Centre (BMBF 01GO0208 BNIC) are gratefully acknowledged.


References

1. Hyvärinen, A., Karhunen, J., Oja, E.: ICA, John Wiley and Sons, New York, 2001.
2. Roberts, S., Everson, R., Eds.: ICA - Principles and Practice, Cambridge University Press, Cambridge, 2001.
3. Cichocki, A., Amari, S.: Adaptive Blind Signal and Image Processing, John Wiley and Sons, New York, 2002.
4. Ziehe, A., Nolte, G., Sander, T.H., Mueller, K.-R., Curio, G.: A comparison of ICA-based artifact reduction methods for MEG, in Proc. of Biomag2001, J. Nenonen, R.J. Ilmoniemi, and T. Katila Eds., Helsinki Univ. of Technology, pp. 895–899, 2001.
5. Moran, J.E., Drake, C.L., Tepley, N.: ICA Methods for MEG Imaging, Neurol. and Clin. Neurophysiol., vol. 72, pp. 1–4, 2004.
6. Joyce, C.A., Gorodnitsky, I.F., Kutas, M.: Automatic Removal of Eye Movement and Blink Artifacts from EEG Data Using Blind Component Separation, Psychophysiol., vol. 72, pp. 313–325, 2005.
7. Hyvärinen, A., Oja, E.: A Fast Fixed-Point Algorithm for Independent Component Analysis, Neural Computation, vol. 9, pp. 1483–1492, 1997.
8. Bell, A., Sejnowski, T.: An Information-Maximization Approach to Blind Separation and Blind Deconvolution, Neural Comput., vol. 7, pp. 1129–1159, 1995.
9. Lee, T.-W., Girolami, M., Sejnowski, T.-J.: ICA Using an Extended Infomax Algorithm for Mixed Sub- and Supergaussian Sources, Neural Comp., vol. 11, pp. 417–441, 1999.
10. Cardoso, J.-F., Souloumiac, A.: Blind Beamforming for Non Gaussian Signals, IEE Proceedings-F, vol. 140, no. 6, pp. 362–370, 1993.
11. Cardoso, J.-F.: High-Order Contrasts for Independent Component Analysis, Neural Computation, vol. 11, pp. 157–192, 1999.
12. Belouchrani, A., Abed-Meraim, K., Cardoso, J.-F., Moulines, E.: A Blind Source Separation Technique Based on Second-Order Statistics, IEEE Trans. on Sig. Proc., vol. 45, pp. 434–444, 1997.
13. Ziehe, A., Müller, K.-R.: TDSEP – An Efficient Algorithm for Blind Separation Using Time Structure, in Proceedings of the 8th ICANN, L. Niklasson, M. Bodén, and T. Ziemke Eds., pp. 675–680, Springer Verlag, 1998.
14. Köhler, B.-U., Orglmeister, R.: A Blind Source Separation Algorithm Using Weighted Time Delays, in Proc. of the 2nd Intern. Workshop on ICA and BSS, P. Pajunen and J. Karhunen Eds., Helsinki, pp. 471–475, 2000.
15. Delorme, A., Makeig, S.: EEGLAB: An Open Source Toolbox for Analysis of Single-Trial EEG Dynamics Including Independent Component Analysis, Journal of Neuroscience Methods, vol. 134, pp. 9–21, 2003.
16. Sander, T.H., Wuebbeler, G., Lueschow, A., Curio, G., Trahms, L.: Cardiac Artifact Subspace Identification and Elimination in Cognitive MEG-Data Using Time-Delayed Decorrelation, IEEE Trans. Biomed. Eng., vol. 49, pp. 345–354, 2002.
17. Brockmeier, K., Schmitz, L., Bobadilla-Chavez, J.-J., Burghoff, M., Koch, H., Zimmermann, R., Trahms, L.: Magnetocardiography and 32-Lead Potential Mapping: Repolarization in Normal Subjects During Pharmacologically Induced Stress, J. Cardiovasc. Electrophys., vol. 8, pp. 615–626, 1997.
