N. Nicolaou and S. J. Nasuto, Comparison of Temporal and Standard Independent Component Analysis (ICA) Algorithms for EEG Analysis
Comparison of Temporal and Standard Independent Component Analysis (ICA) Algorithms for EEG Analysis
Nicoletta Nicolaou and Slawomir J. Nasuto
CIRG, Department of Cybernetics, University of Reading, Whiteknights, P.O. Box 225, Reading, RG6 6AY, UK
[email protected]
Abstract - Growing interest in Electroencephalogram (EEG) classification brings a need for the development of appropriate analysis and processing techniques. One of the most significant issues associated with EEG analysis is the high contamination of the recorded signals with various artefacts, both from the subject and from equipment interference. This paper discusses the advantages of using temporal Independent Component Analysis (ICA) over standard ICA for artefact removal from EEG signals. The performance of three ICA algorithms, standard ICA (FastICA) and two extensions including temporal information (Temporal FastICA and TDSEP), has been compared using both artificial and physiological data. It has been found that, in both cases, the temporal algorithm TDSEP displays a significant improvement in performance over the remaining two algorithms.
1. INTRODUCTION
The externally recorded EEG is a highly attenuated and mixed signal, since it originates from the activity of thousands of neurons and passes through different tissue layers before reaching the recording electrodes. An important problem in EEG analysis is the contamination of the signals with various artefacts, such as eye blinks or facial muscle movements [1]. These artefacts must be removed prior to any analysis. One method of achieving this is thresholding, where any data whose amplitude exceeds a set threshold are discarded. A drawback of thresholding is the loss of valuable EEG activity that has been masked by the large-amplitude artefacts. An alternative method is the application of Blind Source Separation (BSS), the most commonly used BSS algorithm being Independent Component Analysis (ICA) [2]-[6]. ICA is a statistical technique for obtaining independent sources, s, from their linear mixtures, x, when neither the original sources nor the actual mixing matrix, A, are known (equation 1). This is achieved by exploiting higher order signal statistics and optimisation techniques.

x = As    (1)

In order to apply ICA successfully to EEG analysis, the assumptions on which ICA is based must be fulfilled. In standard ICA it is assumed that (i) the recorded signals are a result of linear mixing; (ii) the artefacts are statistically independent from the EEG signals; and (iii) the number of sources is the same as the number of mixtures. There are a number of issues associated with the application of ICA to EEG analysis:

(i) Both the EEG and artefact signals have a temporal structure [7]. This is illustrated in figure 1, which shows the correlation matrices of the EEG and artefact signals for different time lags. The correlation matrices, C_\tau, are defined as in equation 2.

C_\tau = \langle s_i(t)\, s_j(t+\tau) \rangle    (2)

where s_i are the signals, \langle \cdot \rangle denotes expectation and \tau is the time lag, \tau = 0, 1, ...; the right-hand side gives element (i, j) of C_\tau. The diagonal structure of the correlation matrices for different time lags is clearly visible. It signifies that the signals, s_i, are mutually uncorrelated. The presence of nonzero diagonal elements at nonzero lags indicates that all signals show some dependence on their past values, and hence have a clear temporal structure. However, standard ICA, which uses only higher order statistics, effectively discards the information contained in the temporal structure of the signals.

(ii) The amplitude distribution of the EEG, at least in quasi-stationary regions, is approximately Gaussian, as can be seen in figure 2. This is also supported by the extensive use of Autoregression (AR) models with Gaussian noise distribution for modelling of the EEG in several studies [2], [4], [8]. However, standard ICA fails to separate the mixtures if more than one of the sources has a Gaussian amplitude distribution.

Therefore, employing the time structure information in ICA calculations can potentially improve artefact removal and enhance the overall classification.
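The temporal structure discussed in issue (i) can be illustrated numerically. The following Python/NumPy sketch (not part of the original Matlab experiments) estimates the time-lagged correlation matrices of equation 2 for two synthetic, mutually independent signals. The first-order AR signals, their coefficients and the function names lagged_correlation and ar1_source are assumptions made purely for illustration; they are not the EEG and artefact signals of figure 1.

```python
import numpy as np


def lagged_correlation(s, tau):
    """Estimate C_tau[i, j] = <s_i(t) s_j(t + tau)> from the rows of s."""
    s = s - s.mean(axis=1, keepdims=True)      # remove the mean of each signal
    n = s.shape[1]
    # Rows index the signal at time t, columns the signal at time t + tau.
    return s[:, :n - tau] @ s[:, tau:].T / (n - tau)


def ar1_source(coeff, n_samples, rng):
    """Hypothetical first-order AR signal driven by unit-variance Gaussian noise."""
    noise = rng.standard_normal(n_samples)
    x = np.zeros(n_samples)
    for t in range(1, n_samples):
        x[t] = coeff * x[t - 1] + noise[t]
    return x


rng = np.random.default_rng(0)
sources = np.vstack([ar1_source(0.9, 20000, rng),
                     ar1_source(-0.5, 20000, rng)])

# Off-diagonal entries stay close to zero (uncorrelated signals), whereas the
# diagonal entries remain clearly nonzero at nonzero lags (temporal structure).
for tau in range(3):
    print("lag", tau)
    print(np.round(lagged_correlation(sources, tau), 2))
```

In this sketch the matrices are estimated by time averaging, which stands in for the expectation in equation 2 under the assumption of stationary signals.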
A number of ICA algorithms that incorporate the temporal structure of the sources have been developed [9]-[13]. This paper is concerned with three ICA algorithms: (i) FastICA [14], which represents traditional ICA, (ii) Temporal FastICA (TFastICA) [11] and (iii) Temporal Decorrelation Source Separation (TDSEP) [9]. In TFastICA the temporal structure is incorporated through pre-processing of the mixture prior to the application of traditional ICA. In TDSEP the temporal structure is incorporated by modifying the operation of standard ICA: the use of higher order statistics is replaced by the use of time-delayed correlation matrices for source separation. TDSEP is representative of a number of temporal ICA algorithms, e.g. the algorithm by Cichocki et al. [10].
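To make the idea of separating sources with time-delayed correlation matrices more concrete, the following Python/NumPy sketch uses a single time lag: the mixtures are whitened, and the lagged correlation matrix of the whitened data is then diagonalised. This is a deliberate simplification in the spirit of the AMUSE algorithm and is not the TDSEP implementation of [9], which jointly diagonalises several time-delayed matrices. The AR(1) sources, the mixing matrix and all function names are hypothetical and chosen only for illustration.

```python
import numpy as np


def lagged_cov(z, tau):
    """Symmetrised time-lagged covariance of zero-mean signals (rows of z)."""
    n = z.shape[1]
    c = z[:, :n - tau] @ z[:, tau:].T / (n - tau)
    return (c + c.T) / 2.0


def separate_single_lag(x, tau=1):
    """Estimate a de-mixing matrix W from one time-lagged covariance matrix."""
    x = x - x.mean(axis=1, keepdims=True)
    # Step 1: whitening, so that the zero-lag covariance becomes the identity.
    eigval, eigvec = np.linalg.eigh(lagged_cov(x, 0))
    whitener = np.diag(eigval ** -0.5) @ eigvec.T
    z = whitener @ x
    # Step 2: an orthogonal rotation that diagonalises the lagged covariance.
    _, rotation = np.linalg.eigh(lagged_cov(z, tau))
    return rotation.T @ whitener


def ar1(coeff, n_samples, rng):
    """Hypothetical AR(1) source with Gaussian innovations."""
    noise = rng.standard_normal(n_samples)
    s = np.zeros(n_samples)
    for t in range(1, n_samples):
        s[t] = coeff * s[t - 1] + noise[t]
    return s


rng = np.random.default_rng(1)
sources = np.vstack([ar1(0.9, 10000, rng), ar1(-0.5, 10000, rng)])
mixing = np.array([[1.0, 0.6], [0.4, 1.0]])    # hypothetical mixing matrix A
mixtures = mixing @ sources                     # x = As

demixing = separate_single_lag(mixtures, tau=1)
# W A should be close to a scaled permutation matrix if separation succeeded.
print(np.round(demixing @ mixing, 2))
```

Both sources here are Gaussian, so the separation relies only on their different autocorrelations at the chosen lag. A single lag fails when two sources share the same autocorrelation at that lag, which is one motivation for using several lags simultaneously.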
Fig. 1. Correlation matrices for two EEG signals and two artefact signals (ECG and EMG) for a time lag of 0 (top left graph) up to a time lag of 5 (bottom right graph). The column and row labels are the same.
Fig. 2. Histogram of the amplitude distribution of two EEG signals (top row) and two artefact signals, ECG and EMG (bottom row). It can be seen that the EEG signals display an approximately Gaussian distribution, as opposed to the artefact signals.

2. ALGORITHMS
FastICA is a fixed-point ICA algorithm that employs higher order statistics for the recovery of independent sources [14]. Separation is performed by maximising the negentropy of the estimated components, so that uncorrelated and independent sources whose amplitude distribution is as non-Gaussian as possible are obtained. The independent components are considered to be random variables, thus making the assumption of independence crucial for the separation [11].

The basic algorithm of FastICA has been extended to include the temporal structure of the signals, resulting in Temporal FastICA (TFastICA) [11]. The temporal structure is incorporated by modelling the mixture using an AR model and applying traditional ICA to the innovation process of the mixture, which is defined as the ‘error of the best prediction of a stochastic process given its past’ in [11]. The resulting de-mixing matrix, W, can then be applied in order to de-mix the signals. The rationale for using the innovations is that they are usually more statistically independent and less Gaussian than the original sources. Thus, in TFastICA the assumption of independence is relaxed. An advantage of TFastICA is that separation is improved in cases when traditional ICA fails. However, adequate separation may still not be achieved when the mixed sources have a Gaussian distribution.

The third ICA method considered, TDSEP, is based on the ‘simultaneous diagonalisation of several time-delayed correlation matrices’ [9]. A cost function describing the time structure of the sources is minimised; the minimum is reached when the estimated sources are decorrelated over time. One of the advantages of TDSEP is that, because separation is based on the correlation of the sources, it can separate signals whose amplitude distribution is Gaussian. However, it cannot separate signals that are correlated over time.

3. EXPERIMENTS AND RESULTS
The experiments assessed the performance of the algorithms in separating both synthetic AR data mixtures and an artificial mixture of EEG and artefact signals. All algorithms were implemented in Matlab®. TFastICA was implemented by the authors, whereas FastICA and TDSEP were obtained from [15] and [16] respectively. The performance of the algorithms was measured using the SIR index (equation 3), which quantifies the distance of the obtained permutation matrix, P = WA, from the optimum permutation matrix (identity matrix) [9]. The lower the SIR index, the better the achieved separation. A SIR index of zero implies a perfect separation.
SIR(P) = \frac{1}{n} \sum_{i=1}^{n} \left( \sum_{j=1}^{n} \frac{|P_{ij}|}{\max_k |P_{ik}|} - 1 \right)    (3)

3.1. AR data
The first dataset consisted of a mixture of three second-order AR sources (equation 4), identical to the data used in [9]. The length of each source was 4500 samples and the noise, n_i, was Gaussian with zero mean and unit variance.

s_1(t) = -0.8 s_1(t-1) - 0.6 s_1(t-2) + n_1(t)
s_2(t) = -0.7 s_2(t-1) - 0.3 s_2(t-2) + n_2(t)    (4)
s_3(t) = -0.9 s_3(t-1) - 0.15 s_3(t-2) + n_3(t)

The tests consisted of 50 trials, in which different realisations of the three AR processes were generated and mixed using a randomly chosen square matrix. Traditional ICA is expected to fail, as all three sources have Gaussian distributions, but the temporal ICA methods are expected to perform well. Figure 3 shows the histogram of the obtained SIR index. TDSEP displays the lowest SIR index and the lowest variation of the three, which implies a good separation. Even though TFastICA performed better than FastICA, the SIR index values obtained are high, implying a poor separation. This suggests that taking into account the temporal structure in the form of innovations is not beneficial in this case, since their distribution remains Gaussian, which standard ICA cannot handle. Also, both FastICA and TFastICA display a large variation of the SIR index.
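As a rough illustration of this experimental setup, the following Python/NumPy sketch generates one realisation of the three AR(2) sources of equation 4, mixes them with a random square matrix and evaluates a SIR-type index. It is not the authors' Matlab code. The index is assumed here to take the row-normalised form (1/n) \sum_i ( \sum_j |P_ij| / \max_k |P_ik| - 1 ), which is one reading of equation 3; the function names, the random seed and the zero initial conditions are likewise assumptions.

```python
import numpy as np

# AR(2) coefficients (a1, a2) of the three sources in equation 4.
AR_COEFFS = [(-0.8, -0.6), (-0.7, -0.3), (-0.9, -0.15)]


def ar2_source(a1, a2, n_samples, rng):
    """s(t) = a1 s(t-1) + a2 s(t-2) + n(t), with n(t) ~ N(0, 1)."""
    noise = rng.standard_normal(n_samples)
    s = np.zeros(n_samples)                  # assumed zero initial conditions
    for t in range(2, n_samples):
        s[t] = a1 * s[t - 1] + a2 * s[t - 2] + noise[t]
    return s


def sir_index(p):
    """Assumed row-normalised index of P = WA; zero for a scaled permutation."""
    p = np.abs(np.asarray(p, dtype=float))
    row_sums = (p / p.max(axis=1, keepdims=True)).sum(axis=1)
    return float(np.mean(row_sums - 1.0))


rng = np.random.default_rng(0)
sources = np.vstack([ar2_source(a1, a2, 4500, rng) for a1, a2 in AR_COEFFS])
mixing = rng.standard_normal((3, 3))         # one random square mixing matrix A
mixtures = mixing @ sources                  # x = As

# Ideal de-mixing (W equal to the inverse of A) gives an index close to zero.
print(sir_index(np.linalg.inv(mixing) @ mixing))
# No de-mixing at all (W equal to the identity) gives a clearly positive index.
print(sir_index(mixing))
```

In a full trial, the de-mixing matrix W returned by each algorithm would replace the inverse of A in the call to sir_index, and the procedure would be repeated over 50 realisations.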
Fig. 3. SIR index of FastICA, TFastICA and TDSEP for separation of a mixture of 3 AR(2) sources for 50 trials. The average SIR index is displayed in the legend. The SIR index for TDSEP is concentrated around low values. The SIR index for both FastICA and TFastICA displays a larger spread.

3.2. EEG data
Since AR models have successfully been used for modelling of EEG signals, it is expected that if a method performs well in separating mixtures of AR data, it should also perform well in separating EEG from artefact signals. Traditional ICA, however, is expected to fail in this task, as the recorded EEG is inherently a mixture in which more than one source has a Gaussian distribution. To test this hypothesis, the second dataset consisted of an artificial mixture of two EEG signals recorded from the Frontal, Fp1, and Occipital, O1, positions according to the International 10/20 System, and two artefact signals, the ElectroCardiogram (ECG) and the ElectroMyogram (EMG). All signals were obtained from a single subject during sleep (courtesy of the “Siesta” project [17]).

The reason for testing the algorithms on an artificial mixture rather than on real EEG data is mainly to allow a better assessment of the algorithm performance, since a priori knowledge of both the original sources and the mixing process is available. With real EEG data it is possible to isolate certain artefacts, such as eye blinks, by placing electrodes in appropriate positions. The recorded artefact signals can then be used as reference signals for comparison between the obtained independent components and the actual artefact signals. However, it is still impossible to know the exact number of artefacts contained in the recorded EEG signals and, therefore, the performance of the algorithms cannot be adequately assessed. Also, the nature of attenuation and mixing of different EEG sources is thought to be well captured by an instantaneous linear mixing assumption. Comparing the algorithms on an artificial mixture will therefore give an initial indication of their performance when used for de-mixing of real EEG data.

The tests consisted of 50 trials in which the four signals were mixed using a randomly generated square matrix. The number of time-lagged correlation matrices used by TDSEP was varied. The results obtained for the different maximum lags are presented in table I.

TABLE I
SIR INDEX OF TDSEP FOR SEPARATION OF EEG AND ARTEFACT SIGNALS FOR DIFFERENT TIME LAGS

MAXIMUM LAG    SIR INDEX
0              0.6205
3              0.4645
6              0.3195
10             0.2388
13             0.1193
15             0.1419
20             0.1997
Fig. 4. Histogram of the SIR index for EEG and artefact signal separation over 50 trials. The average SIR index is displayed in the legend. (TDSEP uses time lag 13). TDSEP displays a very low SIR index, whereas the other two methods display higher SIR index, with TFastICA displaying a larger spread of values.
TDSEP achieved the best separation and the lowest variation of the SIR index (figure 4). Even though FastICA resulted in inadequate separation, its SIR index has a low variation, with an average value approximately equal to the average SIR index of TFastICA. Using the innovations in TFastICA was, therefore, unable to improve the overall performance of standard ICA. An interesting observation is that the optimum SIR index in TDSEP is obtained for a maximum lag of 13. This seems to agree with results obtained in [18] and [7], suggesting that an AR model of order 11-13 provides an adequate description of the temporal dependencies of the EEG signals.

4. CONCLUSIONS
The performance of standard ICA and ICA with temporal modifications has been assessed in order to establish whether the inclusion of the temporal structure is beneficial for EEG analysis. The results indicate that utilising time-delayed correlation matrices to incorporate the temporal structure of the sources significantly improves their separation. Further work will focus on: (i) investigating the behaviour of other temporal ICA algorithms, such as the ones by Matsuoka et al. [19] and Attias [12], where the temporal structure is incorporated in a different way; (ii) using temporal ICA for artefact removal from the EEG and investigating the possibility of automating the process; (iii) investigating whether using temporal ICA to separate the EEG into its components prior to its analysis could enhance EEG classification; and (iv) using ICA for feature extraction, since the quality of classification, of EEG in particular [20], depends on both the choice of classifier and the choice of the representative signal features.

ACKNOWLEDGEMENTS
The first author would like to thank ORS (UK) and the Department of Cybernetics for their support provided for this project.

REFERENCES
[1] B. J. Fisch, “Fisch and Spehlmann’s EEG Primer”, 3rd Edition, Elsevier Science, 1999
[2] Tzyy-Ping Jung, S. Makeig, C. Humphries, Te-Won Lee, M. J. Mckeown, V. Iragui and T. J. Sejnowski, “Removing electroencephalographic artefacts by blind source separation”, in Psychophysiology, 37, pp. 163-178, Cambridge University Press, 2000
[3] R. Vigario, V. Jousmaki, M. Hamalainen, R. Hari and E. Oja, “Independent Component Analysis for identification of artefacts in MEG recordings”, in Neural Information Processing Systems 10, Proceedings of NIPS '97, Denver, December, 1997
[4] W. D. Penny, S. J. Roberts and R. M. Everson, “Hidden Markov Independent Components for biosignal analysis”, in Proceedings of MEDSIP-2000, International Conference on Advances in Medical Signal and Information Processing, 2000
[5] L. De Lathauwer, B. De Moor and J. Vandewalle, “Fetal Electrocardiogram extraction by source subspace separation”, in Proceedings IEEE SP / Athos Workshop on Higher-Order Statistics, pp. 134-138, June 12-14, Girona, Spain, 1995
[6] A. Cichocki and Shun-ichi Amari, “Adaptive blind signal and image processing”, England: John Wiley & Sons, 2002, ch. 1
[7] F. Lopes Da Silva, “EEG Analysis: theory and practice”, in 'Electroencephalography: Basic Principles, Clinical Applications and Related Fields', Chapter 61, pp. 1135-1163, 1998
[8] C. Guger, A. Schlogl, C. Neuper, D. Walterspacher, T. Strein and G. Pfurtscheller, “Rapid prototyping of an EEG-based BCI”, in IEEE Trans. on Neural Systems and Rehab. Eng., Vol. 9, No. 1, March, 2001
[9] A. Ziehe and K. R. Muller, “TDSEP - an efficient algorithm for blind separation using time structure”, in Proceedings of ICANN '98, pp. 675-680, December, 1998
[10] K. Barros and A. Cichocki, “Extraction of specific signals with temporal structure”, in Neural Computation, 13, pp. 1995-2003, MIT, 2001
[11] A. Hyvarinen, “Independent Component Analysis for time-dependent stochastic processes”, in Proc. Int. Conf. on Artificial Neural Networks (ICANN'98), Skövde, Sweden, pp. 541-546, 1998
[12] H. Attias, “Independent Factor Analysis with temporally structured sources”, in Neural Computation, Vol. 11, No. 4, pp. 803-851, 1999
[13] K. R. Muller, P. Philips and A. Ziehe, “JADEtd: Combining higher-order statistics and temporal information for blind source separation (with noise)”, in Proc. Int. Workshop on Independent Component Analysis and Signal Separation (ICA'99), Aussois, France, pp. 87-92, 1999
[14] A. Hyvarinen, “A survey on Independent Component Analysis”, in Neural Computing Surveys, Vol. 2, pp. 94-124, 1999
[15] FastICA toolbox available from: http://www.cis.hut.fi/projects/ica/fastica/
[16] TDSEP available from: http://www.first.gmd.de/persons/Mueller.Klaus-Robert.html
[17] EEG signals: http://www.dpmi.tu-graz.ac.at/~schloegl/siesta/t310/CD/
[18] J. J. Wright, R. R. Kydd and A. A. Sergejew, “Autoregression models of EEG”, in Biological Cybernetics, Vol. 62, pp. 201-210, Elsevier Science, 1990
[19] K. Matsuoka, M. Ohya and M. Kawamoto, “A Neural Net for Blind Separation of Nonstationary Signals”, in Neural Networks, Vol. 8, No. 3, pp. 411-419, Elsevier Science, 1995
[20] A. Flexer, “Data Mining and EEG”, in Statistical Methods in Medical Research, Vol. 9, pp. 395-413, 2000