Neural Networks with Wavelet Preprocessing in EEG Artifact Recognition

Rafal Ksiezyk, Katarzyna Blinowska, Piotr Durka
Laboratory of Medical Physics, Institute of Experimental Physics, Warsaw University, Hoza 69, 00 681 Warszawa, Poland.

W. Szelenberger, W. Androsiuk
Department of Psychiatry, Warsaw Medical Academy

Abstract: Almost every evaluation of an EEG signal is preceded by the elimination of artifacts, usually performed by a human expert. We have applied Artificial Neural Networks (ANN) to automate this laborious task. The choice of input pre-processing proved crucial for both the learning speed and the performance of the network. The best results were achieved when certain combinations of the signal's wavelet coefficients were used as input values. The artifact detection we obtained was comparable to human judgement.

INTRODUCTION
Recognition and elimination of artifacts in the EEG signal is a complicated and tedious task, but it is essential to the development of practical systems of EEG analysis. Major types of physiological artifacts include EOG artifacts, muscular activity, respiration, and head and body movements. Various techniques for automatic detection and/or elimination of EEG artifacts have been reported. However, the methods applied so far either required individual manual adjustment or were based on limited and inflexible decision criteria, and none of them has proven satisfactory for routine application in the clinical or experimental laboratory.

The methods of correction for eye movement artifacts were essentially based on autoregressive procedures for subtracting the EOG signal from the EEG (Van den Berg-Lenssen et al. 1989, Jervis et al. 1989). Takano et al. (1996) used an adaptive recursive least squares (RLS) algorithm to cancel eye movement artifacts in the EEG. The elimination of muscle artifacts usually relied on low-pass filtering (Gotman et al. 1981, Brunner et al. 1996). Gevins et al. (1977, 1988) proposed an adaptive approach for setting the thresholds of the spectral parameters used to distinguish various artifacts. Logar et al. (1992) claimed that elimination of artifacts by means of visual control gives better results than automated detection.
The introduction of ANNs brought new possibilities for the development of adaptive methods of structure recognition and for solving complex classification problems, owing to their ability to learn a mapping from a set of examples. However, the performance of ANNs depends heavily on the input parameters. We apply ANNs to artifact recognition and test their performance for different sets of input parameters.

MATERIAL AND METHOD
An overnight recording of 8 hours of sleep EEG included the standard polysomnographic derivations, extended to 21 channels of EEG (10-20 standard). The raw multichannel EEG data were divided into 256-point (2.5 second) segments and scored by an experienced electroencephalographer. The learning set for neural network training was built from the signal segments at even positions in the recording, and the testing set from the segments at odd positions (approximately 2000 segments in each set).

In this study we used 3-layer feed-forward networks with nodes using the classical sigmoidal transfer function. Rumelhart's error backpropagation algorithm with momentum was used for training. Initial weights were set to small random numbers in the range (-0.1, 0.1). Training was performed by presenting the learning set to the network epoch by epoch. Five different data pre-processing schemes were used. The number of neurons in the first layer and the structure of connections between layers depended on the set of input parameters used. The last layer contained only one neuron, producing an output in the normalised range (0, 1): values close to 1 correspond to an artifact, values close to 0 to a non-artifact.
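
The training setup described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the class name `ThreeLayerNet`, the hidden-layer size, the learning rate, the momentum value and the use of one batch gradient step per epoch are all assumptions made for illustration. Only the sigmoid units, the single output neuron, the momentum term and the initial weight range (-0.1, 0.1) come from the text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ThreeLayerNet:
    """Input layer -> sigmoid hidden layer -> one sigmoid output neuron."""

    def __init__(self, n_in, n_hidden, rng):
        # Small random initial weights in (-0.1, 0.1), as stated in the text.
        self.W1 = rng.uniform(-0.1, 0.1, (n_in, n_hidden))
        self.b1 = rng.uniform(-0.1, 0.1, n_hidden)
        self.W2 = rng.uniform(-0.1, 0.1, (n_hidden, 1))
        self.b2 = rng.uniform(-0.1, 0.1, 1)
        self.params = [self.W1, self.b1, self.W2, self.b2]
        self.velocity = [np.zeros_like(p) for p in self.params]

    def forward(self, X):
        self.h = sigmoid(X @ self.W1 + self.b1)
        self.y = sigmoid(self.h @ self.W2 + self.b2)
        return self.y[:, 0]  # values in (0, 1); near 1 means "artifact"

    def train_epoch(self, X, t, lr=0.5, momentum=0.9):
        """One pass over the learning set; returns the mean squared error.

        lr and momentum are illustrative values, not taken from the paper.
        """
        y = self.forward(X)
        err = y - t
        # Backpropagate the error 0.5 * err**2 through the sigmoid units.
        delta2 = (err * y * (1.0 - y))[:, None]
        delta1 = (delta2 @ self.W2.T) * self.h * (1.0 - self.h)
        n = len(X)
        grads = [X.T @ delta1 / n, delta1.mean(axis=0),
                 self.h.T @ delta2 / n, delta2.mean(axis=0)]
        for p, v, g in zip(self.params, self.velocity, grads):
            v *= momentum   # keep a fraction of the previous update
            v -= lr * g     # add the current gradient step
            p += v          # in-place weight update
        return float(np.mean(err ** 2))
```

With segment features in an array `X` and expert labels in `t`, the even/odd split from the text corresponds to `X[0::2], t[0::2]` for learning and `X[1::2], t[1::2]` for testing.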
Figure 1. Scheme of the network.

ANN1. Raw 256-point segments of the 27-channel recording.
ANN2. Correlation coefficient between the two EOG channels; correlation coefficients between one of the two EOG channels (EOG1) and the EEG from electrodes Fp1, Fp2, F7, F8; averages and standard deviations of the signal segment for every channel.
ANN3. Raw wavelet coefficients for segments of 256 points for 27 channels. Orthogonal wavelets (Mallat 1989) were used.
ANN4. Correlation coefficients calculated in the 0.4-3.2 Hz frequency band between EOG1 and the EEG from electrodes Fp1, Fp2, F7, F8; normalised powers in the 25.6-51.2 Hz band (muscular artifacts) and below 0.8 Hz (movement artifacts). The parameters were calculated from the wavelet transform, which made it possible to estimate the correlation coefficients only in the frequency bands of interest and saved computation time.
ANN5. Wavelet coefficients integrated in the seven frequency bands.
Figure 2. Five ways of pre-processing of the input data.
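
The wavelet-based inputs of ANN4 and ANN5 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the paper specifies only orthogonal wavelets (Mallat 1989), so the Haar wavelet is a stand-in chosen for brevity; treating "integrated" coefficients as sums of squares per band is an assumption; and the function names `haar_dwt`, `band_energies` and `band_correlation` are hypothetical. With 256 points per 2.5 second segment (102.4 samples per second), detail level 1 of a dyadic decomposition covers roughly 25.6-51.2 Hz and levels 5 to 7 together cover roughly 0.4-3.2 Hz, which matches the bands quoted above.

```python
import numpy as np

def haar_dwt(x, levels):
    """Orthonormal Haar DWT; returns ([d1, ..., d_levels], approximation)."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # upper half-band
        approx = (even + odd) / np.sqrt(2.0)         # lower half-band
    return details, approx

def band_energies(x, levels=7):
    """Sum of squared detail coefficients in each dyadic band (ANN5-like)."""
    details, _ = haar_dwt(x, levels)
    return np.array([np.sum(d ** 2) for d in details])

def band_correlation(x, y, band_levels=(5, 6, 7), levels=7):
    """Correlation of two channels restricted to chosen levels (ANN4-like).

    Computed as a normalised inner product of the detail coefficients,
    without mean removal (an assumption of this sketch).
    """
    dx, _ = haar_dwt(x, levels)
    dy, _ = haar_dwt(y, levels)
    cx = np.concatenate([dx[level - 1] for level in band_levels])
    cy = np.concatenate([dy[level - 1] for level in band_levels])
    denom = np.sqrt(np.sum(cx ** 2) * np.sum(cy ** 2))
    return float(np.sum(cx * cy) / denom) if denom > 0 else 0.0
```

Because the band-limited correlation uses only the coefficients of the selected levels, it works on far fewer values than a correlation over the full 256-point segment, which is in line with the time saving mentioned for ANN4.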
RESULTS
Performance of the network was tested by means of ROC (Receiver Operating Characteristic) curves. They are defined by the ratios of correctly recognised non-artifacts or artifacts as a function of the threshold setting for the output neuron. For a more accurate evaluation of our ANNs we considered both detectability and selectivity, defined as:

detectability of artifacts = TP / (TP + FP)
selectivity of artifacts = TP / (TP + FN)
detectability of non-artifacts = TN / (TN + FN)
selectivity of non-artifacts = TN / (TN + FP)

where TP, TN, FP, FN are defined in Table 1.

Table 1. Definitions.

ANN \ expert    non-artifact    artifact
artifact        FP              TP
non-artifact    TN              FN

[Figure 3: three panels, I. Learning set, II. Testing set, III. Additional testing set, each with curves a) to d).]

Figure 3. Curves representing: a) detectability of artifacts, b) selectivity of artifact detection, c) detectability of non-artifacts, d) selectivity of non-artifact detection, as a function of the threshold parameter. The three panels show results for: I. the learning set, II. the testing set (data not used in the learning phase but coming from the same patient), III. the additional testing set (data coming from a subject not used in the learning phase). The solid line with bars represents the ROC curves obtained by us, and the thin line with crosses represents the results for a random detector (given for comparison).
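
The four ratios can be computed directly from the network output and the expert scoring. The sketch below follows the definitions and the layout of Table 1; the function names are hypothetical, and the rule that an epoch is called an artifact when the output is at or above the threshold is an assumption consistent with outputs near 1 denoting artifacts.

```python
import numpy as np

def confusion_counts(output, expert_artifact, threshold):
    """Return (TP, FP, TN, FN); the ANN says 'artifact' if output >= threshold."""
    ann = np.asarray(output, dtype=float) >= threshold
    expert = np.asarray(expert_artifact, dtype=bool)
    tp = int(np.sum(ann & expert))     # both say artifact
    fp = int(np.sum(ann & ~expert))    # ANN artifact, expert non-artifact
    tn = int(np.sum(~ann & ~expert))   # both say non-artifact
    fn = int(np.sum(~ann & expert))    # ANN non-artifact, expert artifact
    return tp, fp, tn, fn

def ratio(num, den):
    # Undefined ratios (empty denominator) are reported as NaN.
    return num / den if den else float("nan")

def performance(output, expert_artifact, threshold):
    tp, fp, tn, fn = confusion_counts(output, expert_artifact, threshold)
    return {
        "detectability_artifacts": ratio(tp, tp + fp),
        "selectivity_artifacts": ratio(tp, tp + fn),
        "detectability_non_artifacts": ratio(tn, tn + fn),
        "selectivity_non_artifacts": ratio(tn, tn + fp),
    }
```

Evaluating `performance` over a grid of thresholds between 0 and 1 gives curves of the kind described in the caption of Figure 3.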
Examples of the ROC curves are shown in Figure 3. The classification of a data epoch as artifact or artifact-free depends on the threshold setting. The threshold can be adjusted so as to eliminate all epochs suspected of containing artifacts, or set more liberally, leaving more EEG epochs for which the presence of an artifact cannot be completely excluded.

Table 2. Convergence and generalisation ability of ANN for different input parameter sets.

network   input pre-processing                                    input size   convergence (iterations)   generalisation
ANN 1     none (raw signal)                                       6912         18*10^6                    poor
ANN 2     correlations, variances and averages                    45           22*10^6                    good
ANN 3     raw wavelet coefficients                                6912         2*10^6                     poor
ANN 4     wavelet-based band-delimited correlations and powers    46           2.1*10^6                   good
ANN 5     integrated wavelet coefficients in freq. bands          8            2.4*10^6                   good
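
The trade-off between a strict and a liberal threshold setting can be made concrete with a small sketch. The function names are hypothetical, and the decision rule (artifact when the output is at or above the threshold) is an assumption. The strict setting is taken here as the largest threshold that still flags every expert-marked artifact in a scored set.

```python
import numpy as np

def strict_threshold(output, expert_artifact):
    """Largest threshold at which no expert-marked artifact is missed."""
    output = np.asarray(output, dtype=float)
    expert = np.asarray(expert_artifact, dtype=bool)
    if not expert.any():
        raise ValueError("no expert-marked artifacts in the scored set")
    return float(output[expert].min())

def rejected_fraction(output, threshold):
    """Fraction of epochs discarded as artifacts at the given threshold."""
    return float(np.mean(np.asarray(output, dtype=float) >= threshold))
```

A higher threshold rejects fewer epochs and so keeps more data for analysis, at the cost of letting some artifacts through.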

ANN1. During the training period, segments from the learning set were presented to the network up to 18 million times, after which there was no appreciable progress in learning. The percentage of correctly recognised artifacts (TP) was at the level of 85%, while correctly recognised non-artifact segments (TN) reached 40% for a threshold level of 0.3. For a threshold level of 0.6 we obtained TP and TN of 35% and 80%, respectively. Testing the network's performance on a separate testing set revealed poor generalisation.
ANN2. The network converged after 22 million learning iterations with 68% artifact detection efficiency (TP) and 81% non-artifact detection efficiency (TN). A test on data not presented before showed generalisation at the level of 57% TP and 90% TN.
ANN3. The learning capability of the network increased significantly: after 2 million learning iterations we obtained nearly 100% for both TP and TN, but the generalisation test failed again.
ANN4. Tests performed after 2.1 million learning iterations showed similar results for both the learning and the testing set, with optimal detection of 71% for artifacts (TP) and 82% for non-artifacts (TN). Calculating the correlations by means of the wavelet transform only in the relevant frequency band substantially shortened the calculation time.
ANN5. Tests performed after 2.1 million learning iterations showed good generalisation, with similar results for both the learning and the testing set: 80% correct classification for artifacts (TP) and 75% for non-artifacts (TN).

A summary of the results is presented in Table 2 and in Figure 4.
[Figure 4: axis labels Ratio (ticks 0.2 to 0.9), Performance and Preprocessing (ANN 1, ANN 3, ANN 2, ANN 4, ANN 5); series: selectivity / non-artifacts, selectivity / artifacts, detectability / non-artifacts, detectability / artifacts.]

Figure 4. Performance of ANNs expressed in terms of selectivity and detectability for artifact and non-artifact detection.

DISCUSSION
The results of our analysis show that using raw data, or data parametrised without taking into account the specificity of the problem, leads to poor generalisation, and in the case of raw data also to a very long computation time. The best results were found for the networks using input parameters evaluated on the basis of wavelet coefficients, ANN4 and ANN5. These networks learned very quickly, the percentage of learned examples from the training set and the generalisation were good, and the number of iterations was an order of magnitude lower than for the raw data or for input parameters based on correlation coefficients computed in the conventional way. If we compare the performance of these ANNs with results obtained in other works, e.g. Gevins et al. 1977 (65% TP and 56% TN), or with the agreement between a single EEG scorer and the consensus of scorers (86% TP and 72% TN), we can consider the results satisfactory.

We can conclude that when ANNs are applied to time series classification, pre-processing should include frequency information about the signal. In many other applications power spectra calculated from the Fourier transform are used. We can recommend the wavelet transform as a more efficient and faster method, providing both the time and the frequency characteristics of the signal and thus offering universal pre-processing.

REFERENCES
Brunner D.P., Vasko R.C., Detka C.S., Monahan J.P., Reynolds C.F. 3rd, Kupfer D.J. 1996, Muscle artifacts in the sleep EEG: automated detection and effect on all-night power spectra, Journal of Sleep Research, 5(3):155-164.
Gevins A.S., Yeager C.L., Zeitlin G.M., Ancoli S., Dedon M.F. 1977, On-line Computer Rejection of EEG Artifact, Electroencephalography and Clinical Neurophysiology, 42:267-274.
Gevins A.S., Morgan N.H. 1988, Applications of Neural-Network (NN) Signal Processing in Brain Research, IEEE Trans. Acoust., Speech and Signal Processing, 36:1152-1160.
Gotman J., Ives J.R., Gloor P. 1981, Frequency content of EEG and EMG at seizure onset: possibility of removal of EMG artefact by digital filtering, Electroenceph. Clin. Neurophys., 52:626-639.
Jervis B., Coelho M., Morgan G.W. 1989, Effect on EEG responses of removing ocular artefacts by proportional EOG subtraction, Med. & Biol. Engin. & Comput., 27:484-490.
Logar C., Freidl W., Lechner H. 1992, A comparison of EEG mapping with and without visual artefact control in focal cerebral lesions, EEG-EMG Zeitschrift fur Elektroenzephalographie Elektromyographie und Verwandte Gebiete, 23(2):101-104.
Mallat S.G. 1989, A Theory for Multiresolution Signal Decomposition: The Wavelet Representation, IEEE Trans. on Patt. Anal. and Mach. Intel., 11:674-693.
Takano N., Maruyama T., Tamagawa M., Yana K. 1996, Artifact Cancelling Using RLS Adaptive Filtering Technique, Proceedings of IEEE Engineering in Medicine and Biology 18th Annual International Conference, Amsterdam.
Van den Berg-Lenssen M.M.C., Brunia C.H.M., Blom J.A. 1989, Correction of ocular artifacts in EEGs using an autoregressive model to describe EEG: a pilot study, Electroenceph. Clin. Neurophys., 73:72-83.