
2013 Fourth International Conference on Emerging Security Technologies

Wavelet-based EEG Preprocessing for Biometric Applications
Su Yang, Farzin Deravi
School of Engineering and Digital Arts, University of Kent, Canterbury, UK

subjects, the performance with a larger number of subjects still needs to be explored. There is also a continued need to explore how to make the deployment of EEG biometrics more practical by reducing the number of electrodes and the amount of data needed for training and testing, while maintaining high performance for a larger number of users. To achieve these goals the preprocessing stage is of crucial importance, and this paper investigates some approaches for its enhancement.

Abstract—EEG signals, measuring transient brain activities, can be used as a source of biometric information with potential application in high-security person recognition scenarios. However, due to the inherent nature of these signals and the process used for their acquisition, their effective preprocessing is critical for their successful utilisation. In this paper we compare the effectiveness of different wavelet-based noise removal methods and propose an EEG-based biometric identification system which combines two such de-noising methods to enhance the signal preprocessing stage. In tests using 50 subjects from a public database, the proposed new approach is shown to provide improved identification performance over alternative techniques. Another important preprocessing consideration is the segmentation of the EEG record prior to de-noising. Different segmentation approaches were investigated and the trade-off between performance and computation time is explored. Finally the paper reports on the impact of the choice of wavelet function used for feature extraction on system performance.

The rest of the paper is organized as follows: Section 2 gives an outline of the proposed EEG-based biometric system. Section 3 gives details of the database used for testing and of the evaluation scheme. In Section 4 different noise removal methods are described and a new hybrid method is proposed and evaluated. In Section 5 the effect of preprocessing window sizes on performance is investigated. In Section 6 the choice of the wavelet functions used for feature extraction is explored. Finally, conclusions and suggestions for further work are presented in Section 7.

Keywords—EEG, Biometrics, De-noising, Wavelets

1. INTRODUCTION With the rapid development of machine learning techniques and the increasing availability of low-cost sensors, biometric person recognition technologies have been an active area of research in recent years, leading to significant deployments in a range of application domains. However, despite their considerable success, important challenges still face their widespread adoption and acceptance [1], and because of this the search for new biometric modalities continues. This paper is concerned with a relatively new biometric modality: the Electroencephalogram (EEG).

2. SYSTEM DESIGN The proposed system can be divided into three main modules: Preprocessing, Feature Extraction and Classification, as shown in Figure 1. The raw EEG signals were segmented into non-overlapping windows of short duration (approximately 6 seconds) and were subjected to a wavelet-based noise removal operation. After signal preprocessing, the feature extraction was performed in two stages. Wavelet Packet Decomposition (WPD) was performed first and the signals were decomposed to Level 5 [4]. Only the coefficient bands in the range from 1 to 50Hz were kept. The wavelet coefficients were then partitioned into three equal sets corresponding to three different frequency bands: 1~10Hz, 10~20Hz and 20~50Hz.
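As a rough illustration of the band bookkeeping described above, the following sketch lists the frequency range covered by each terminal node of a Level-5 wavelet packet tree at the 160 Hz sampling rate stated for the database. The function names are hypothetical, the nodes are assumed to be in frequency order (not the natural order some toolboxes return), and the lower band edge is taken as 0 Hz because 1 Hz does not fall on the 2.5 Hz node grid.

```python
# Sketch only: frequency ranges of the terminal nodes of a Level-5 WPD tree.
FS_HZ = 160   # sampling rate stated for the database
LEVEL = 5     # decomposition depth stated in the text


def wpd_band_edges(fs_hz, level):
    """Return (low, high) edges in Hz for each terminal node, in frequency order."""
    n_nodes = 2 ** level
    width = (fs_hz / 2) / n_nodes          # the Nyquist range is split evenly
    return [(k * width, (k + 1) * width) for k in range(n_nodes)]


def nodes_in_band(edges, low_hz, high_hz):
    """Indices of the nodes lying entirely inside [low_hz, high_hz]."""
    return [k for k, (lo, hi) in enumerate(edges) if lo >= low_hz and hi <= high_hz]


edges = wpd_band_edges(FS_HZ, LEVEL)       # 32 nodes, each 2.5 Hz wide
kept = nodes_in_band(edges, 0.0, 50.0)     # nodes retained below 50 Hz
print(len(edges), edges[0], len(kept))
```

Under these assumptions each node spans 2.5 Hz, so the retained range up to 50 Hz corresponds to the first 20 of the 32 nodes.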

Wavelet techniques have been used effectively for EEG signal processing. Gupta et al. measured the visual evoked potential (VEP) using 8 electrodes on 4 subjects. Wavelet packet analysis was used for feature extraction and a radial basis function network for classification. They reported a performance of 85% in an identification scenario [2]. Abdullah et al. captured EEG data from 10 subjects in the resting state with eyes open and eyes closed, and used wavelet features and neural network classifiers. Identification accuracies of 81% with 4 channels and 71% with 2 channels were reported [3]. These reports seem to suggest that although a good performance can be achieved with a relatively small number of electrodes for a small number of
978-0-7695-5077-0/13 $26.00 © 2013 IEEE DOI 10.1109/EST.2013.14

In the second feature extraction stage, the standard deviations (SD) of the wavelet coefficients for each band and for each window were calculated and used as features for classification. This was done to achieve a significant reduction in the dimensionality of the feature vectors. The choice of SD as the feature follows experiments comparing several different dimension reduction strategies (mean, variance, entropy and kurtosis). It seems that the SD could better capture the useful transient activities revealed in the EEG

However, the selected five electrodes were not individually the top-five performing electrodes amongst the ten electrodes tested: it appears that there might be correlations between the signals captured by these electrodes which affect their overall performance.

wavelet coefficients. SDs from all the windows were concatenated and used as the feature vector to train a Linear Discriminant Classifier (by assuming normal densities with equal covariance matrices) for person identification [5].
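The second feature extraction stage can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the toy coefficients are invented, and the population SD is assumed because the text does not say whether the population or sample form was used.

```python
from statistics import pstdev


def sd_features(windows):
    """Concatenate the SD of every coefficient band of every window.

    `windows` is a list of windows; each window is a list of bands;
    each band is a list of wavelet coefficients (floats).
    """
    return [pstdev(band) for window in windows for band in window]


# Toy input: 2 windows x 3 bands, with made-up coefficients.
toy_windows = [
    [[1.0, -1.0, 1.0, -1.0], [2.0, 2.0, 2.0, 2.0], [0.0, 3.0, 0.0, -3.0]],
    [[0.5, -0.5, 0.5, -0.5], [1.0, 3.0, 1.0, 3.0], [4.0, -4.0, 4.0, -4.0]],
]
features = sd_features(toy_windows)   # one value per (window, band) pair
print(len(features))
```

With three bands per window, a recording split into W windows yields a feature vector of length 3W, which is the dimensionality reduction the text refers to.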

4. WAVELET BASED DE-NOISING In this section three wavelet-based noise removal methods as well as a novel hybrid de-noising strategy are introduced. The performances of these methods were compared in an identification scenario using EEG data from 25 users (S1-S25), with the feature extraction and classification methods described in the previous section.

4.1 WAVELET THRESHOLDING The wavelet shrinkage approach to noise removal assumes a noise model for its operation [12]. Regression models are often used to recover the underlying signal which is mixed up with the noise. Such models may be expressed in the form $y_i = f(x_i) + \epsilon_i$ ($i = 1, \dots, n$), where $y_i$ is the mixed signal, $f(x_i)$ is the “clean” signal function and $\epsilon_i$ is the noise function; $\epsilon_i$ is assumed to be a centered Gaussian white noise of unknown variance $\sigma^2$ [11].

Fig. 1 System diagram

3. EXPERIMENTAL DESIGN To evaluate different pre-processing schemes, part of the database (the first 50 subjects, S1-S50) from the “EEG Motor Movement/Imagery Dataset”, supplied by the developers of the BCI2000 instrumentation system [6, 7], was used for testing in an identification (one-to-many recognition) scenario. The sampling rate of this database was 160 Hz. Only the data from Task 4 (motor imagery task) were used, as earlier research results indicate that motor imagery tasks may contain a rich source of biometric information [8].

The wavelet shrinkage approach assumes that the useful information is mostly represented by the approximation coefficients generated by wavelet decomposition. The other set of coefficients produced by the wavelet transform, namely the high-frequency or detail coefficients, is regarded as noise. However, if all of the detail coefficients were removed, some useful information might be lost. The “hard” threshold strategy was adopted for the current experiment, which performs a “keep or kill” policy on the wavelet detail coefficients using the minimax principle [13, 14]. We tested two wavelet coefficient thresholding schemes: a global threshold and level-dependent thresholds. The global scheme calculates a single threshold based on the minimum-maximum estimation, whereas level-dependent thresholding allows a separate threshold to be specified for each decomposition level.
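The “keep or kill” rule can be sketched in a few lines. The paper does not give its threshold formula, so two assumptions are made here and should be read as such: the closed-form minimax approximation used by common wavelet toolboxes (0.3936 + 0.1829 log2 n for n > 32), and a noise level estimated from the median absolute detail coefficient divided by 0.6745. All function names are hypothetical.

```python
import math
from statistics import median


def minimax_threshold(n):
    """Minimax threshold for unit-variance noise (toolbox-style approximation; assumed)."""
    return 0.3936 + 0.1829 * math.log2(n) if n > 32 else 0.0


def mad_sigma(detail):
    """Robust noise-level estimate: median absolute coefficient / 0.6745 (assumed)."""
    return median(abs(c) for c in detail) / 0.6745


def hard_threshold(coeffs, thr):
    """Keep-or-kill: zero every coefficient whose magnitude does not exceed thr."""
    return [c if abs(c) > thr else 0.0 for c in coeffs]


# Toy detail coefficients: small ripple plus two large, signal-like values.
detail = [0.1, -0.1] * 31 + [5.0, -4.0]
thr = mad_sigma(detail) * minimax_threshold(len(detail))
cleaned = hard_threshold(detail, thr)
print([c for c in cleaned if c != 0.0])
```

A global scheme would compute one such threshold for all detail coefficients, while a level-dependent scheme would repeat the noise estimate separately for each decomposition level.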

The database contained, for each subject, three separate EEG recording sessions, each lasting two minutes. From each of these recordings, 20 non-overlapping windows of 960 samples were extracted for preprocessing, each window corresponding to 6 seconds of EEG data.
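The windowing arithmetic above can be checked with a short sketch. The `segment` helper is hypothetical, and the recording is a placeholder of exactly two minutes; real records may differ slightly in length, in which case the incomplete tail is simply dropped.

```python
FS_HZ = 160          # sampling rate stated for the database
WINDOW_LEN = 960     # samples per window, i.e. 6 s at 160 Hz


def segment(signal, window_len):
    """Split a sequence into consecutive non-overlapping windows, dropping any short tail."""
    n_windows = len(signal) // window_len
    return [signal[i * window_len:(i + 1) * window_len] for i in range(n_windows)]


recording = [0.0] * (2 * 60 * FS_HZ)       # placeholder two-minute record (19200 samples)
windows = segment(recording, WINDOW_LEN)
print(len(windows), len(windows[0]) / FS_HZ)
```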

4.2 MULTIVARIATE AND MULTI-SCALE PRINCIPAL COMPONENT ANALYSIS As the five electrodes are closely positioned and share the same sampling frequency, some noise components of the signal may be correlated and the signal quality could be improved by removing these [15]. However, the wavelet shrinkage approach assumes only independent Gaussian noise. One option is to perform principal component analysis (PCA) on the wavelet coefficients to eliminate the cross-correlation effects among electrodes. The multivariate de-noising method combines wavelet decomposition and PCA in this way: first the wavelet decomposition is performed; next PCA is applied to the approximation coefficients for de-noising. After reconstruction by the inverse wavelet transform, PCA is applied again to the signal [16].
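The PCA step shared by these methods can be illustrated on a toy multi-channel array. This is a sketch under stated assumptions rather than the published algorithm: it operates on any set of per-channel sequences (for example coefficients at one scale), retains only the leading principal component, and finds that component by power iteration. How many components the authors retained is not stated, and all names and data below are invented.

```python
import math


def centre(channels):
    """Subtract each channel's mean; return (centred_channels, means)."""
    means = [sum(ch) / len(ch) for ch in channels]
    return [[x - m for x in ch] for ch, m in zip(channels, means)], means


def covariance(centred):
    """Sample covariance matrix between channels."""
    n = len(centred[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centred]
            for ci in centred]


def leading_component(cov, iters=200):
    """Dominant eigenvector of a symmetric matrix by power iteration."""
    v = [1.0 / math.sqrt(len(cov))] * len(cov)
    for _ in range(iters):
        w = [sum(r * x for r, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def keep_leading_component(channels):
    """Rank-1 PCA reconstruction across channels."""
    centred, means = centre(channels)
    v = leading_component(covariance(centred))
    n = len(channels[0])
    scores = [sum(v[c] * centred[c][t] for c in range(len(v))) for t in range(n)]
    rebuilt = [[means[c] + v[c] * s for s in scores] for c in range(len(v))]
    return rebuilt, v


# Toy data: three channels sharing one sinusoid plus small channel-specific ripple.
shared = [math.sin(2 * math.pi * t / 16) for t in range(64)]
channels = [
    [s + 0.05 * (-1) ** t for t, s in enumerate(shared)],
    [s + 0.05 * math.cos(math.pi * t / 2) for t, s in enumerate(shared)],
    list(shared),
]
rebuilt, v = keep_leading_component(channels)
print([round(x, 3) for x in v])
```

In the multi-scale variant the same projection would be applied separately to the approximation and to each level of detail coefficients before reconstruction.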

Fig. 2 Identification accuracy of the tested electrode positions, after [9] (10-10 system [10])

Ten electrode positions were individually evaluated to select a subset of these electrodes for subsequent system evaluations as shown in Figure 2. Two of the recording sessions were used for training and the remaining one was used for testing. The reported identification accuracies for each electrode are the averages of the accuracies obtained from the three sessions used in turn for testing. The dot-dash lines indicate the five selected electrodes (FCz, Cz, CPz, Pz, POz). These electrodes in combination provided the highest classification rate compared with other possible five-electrode combination schemes.
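The session rotation described above (two sessions for training, one for testing, each session held out in turn) can be sketched as follows. The session labels and the accuracy values in the usage line are placeholders, not results from the paper.

```python
def session_folds(sessions):
    """Yield (training_sessions, test_session), holding out each session once."""
    for held_out in sessions:
        yield [s for s in sessions if s != held_out], held_out


def mean_accuracy(per_fold_accuracy):
    """Average the accuracies obtained over the folds."""
    return sum(per_fold_accuracy) / len(per_fold_accuracy)


folds = list(session_folds(["session1", "session2", "session3"]))
for train, test in folds:
    print("train on", train, "test on", test)

# Placeholder per-fold accuracies, only to show the averaging step.
print(mean_accuracy([0.5, 1.0, 0.75]))
```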

According to [17], this Multivariate Analysis is particularly suitable for stationary signal de-noising. However, this method may not perform well for the EEG data used for the evaluation of the proposed system [15]. The experiments reported in this


paper verified this conjecture; multivariate analysis for de-noising degraded performance, giving a worse result than without any noise removal (Fig. 3).

To further investigate the sensitivity of the proposed methodology, the 1~50Hz band-passed signal was divided into three frequency ranges: 0~10Hz, 10~20Hz and 20~50Hz. The training and testing data for 25 subjects (S1-S25) were based on the respective frequency bands. Figure 4 shows that the proposed hybrid method was more effective in the higher frequency range. The performance within 0 to 10Hz even suffered some degradation. The low frequency range of the signal might not contain significant noise and may not need to be de-noised.

The multi-scale PCA de-noising method [18] applies the PCA algorithm to both the approximation and the detail coefficients, and hence may better remove correlated noise. In the results reported below this method indeed provided good performance, as indicated by the identification rate.

4.3 HYBRID DE-NOISING METHOD Wavelet shrinkage de-noising is good at removing white noise that might be generated by electrical equipment and sensors during the EEG recording process. Since five closely related electrodes were used for data capture, spatially correlated noise may also affect the quality of the signal: this kind of contamination might be alleviated by applying multi-scale PCA analysis. We propose a novel strategy which combines both wavelet coefficient thresholding and PCA methodologies to further de-noise the raw signals. The proposed methodology is as follows:

1) Apply the Discrete Wavelet Transform (DWT) to the raw signal up to Level 5 with the sym8 (Symlets order 8) wavelet function [7], using the minimum-maximum rule to estimate the mean square error with a “hard” level-dependent threshold, and reconstruct the signal after thresholding.

Fig. 4 The sensitivity of the proposed hybrid noise-removing method in different frequencies

2) Apply multi-scale PCA analysis to the thresholded signals to further remove the spatially correlated noise by preserving only some of the (uncorrelated) principal component vectors.
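Step 1 of the procedure above can be sketched without external libraries. Note the substitutions: the Haar wavelet stands in for sym8 purely to keep the example short, the threshold uses a toolbox-style minimax approximation scaled by a per-level noise estimate, and the function names are hypothetical. None of these details are confirmed by the paper, and step 2 (multi-scale PCA across electrodes) is not covered here.

```python
import math
from statistics import median

SQRT2 = math.sqrt(2.0)


def haar_dwt(x):
    """One Haar analysis step: returns (approximation, detail)."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    out = []
    for a, d in zip(approx, detail):
        out += [(a + d) / SQRT2, (a - d) / SQRT2]
    return out


def minimax_threshold(n):
    """Minimax threshold for unit-variance noise (assumed approximation)."""
    return 0.3936 + 0.1829 * math.log2(n) if n > 32 else 0.0


def denoise_level_dependent(signal, levels):
    """Decompose, hard-threshold each detail level separately, then reconstruct.

    len(signal) must be divisible by 2 ** levels.
    """
    base = minimax_threshold(len(signal))
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        sigma = median(abs(c) for c in d) / 0.6745    # noise estimate for this level
        thr = sigma * base
        details.append([c if abs(c) > thr else 0.0 for c in d])
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx


# Toy signal: a constant level of 2.0 with a small alternating ripple.
noisy = [2.0 + 0.1 * (-1) ** i for i in range(64)]
clean = denoise_level_dependent(noisy, levels=3)
print(round(min(clean), 6), round(max(clean), 6))
```

In this toy case the ripple lives entirely in the first-level detail coefficients, which fall below that level's threshold, so the reconstruction returns to the constant level.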

5. WINDOW SIZE OPTIMIZATION The raw signal was segmented into several non-overlapping windows before de-noising. The proposed de-noising method may be sensitive to the size of the window. The feature extraction stage may also be affected by the window size. It is, therefore, important to investigate the impact of window size on performance. Originally each of the two-minute recordings was divided into 20 windows of 960 samples each. After several tests, windows of 3840 samples were chosen for de-noising. This scheme showed a slight improvement compared with smaller or larger window sizes such as 960 or 4800 samples per window. The feature extraction stage, however, was found to be quite sensitive to the window size.
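For orientation, the window sizes mentioned above translate into durations and window counts as in the sketch below. It assumes the stated 160 Hz sampling rate and a nominal recording length of exactly two minutes; the helper name is hypothetical.

```python
FS_HZ = 160
RECORD_SAMPLES = 2 * 60 * FS_HZ    # nominal two-minute recording (19200 samples)


def window_summary(window_len):
    """Return (duration in seconds, whole windows per nominal recording)."""
    return window_len / FS_HZ, RECORD_SAMPLES // window_len


for size in (960, 3840, 4800):
    print(size, window_summary(size))
```

Under these assumptions the three sizes correspond to 6 s, 24 s and 30 s windows, giving 20, 5 and 4 windows per recording respectively.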

Five wavelet-based de-noising methods were compared using data from the first 15 subjects (S1-S15) in the database. By applying the proposed hybrid method the identification rate could be improved by more than 5.5% compared with no noise removal (Fig. 3). The level-dependent wavelet shrinkage provided better performance than the global threshold shrinkage method; it appears that the characteristics of the noise differ between decomposition levels and are best treated separately. The proposed method aims to remove both the independent Gaussian noise and the correlated noise, and indeed it appears to result in better performance than all of the other methods investigated.

Fig. 5 50-fold-cross-validation tests of 6 seconds recording for windowing schemes

Fig. 3 Comparison of the de-noising methods based on biometric identification score


adopted database improved by more than 10% compared with the results without preprocessing optimization, reaching 95.5%.

One of the two-minute recording sessions was used to provide data for testing. Figure 5 shows the accuracy rates for the different window-size schemes. By randomly picking several 6-second samples from the two-minute recording, tests were performed at different preprocessing window sizes. The “×” symbol of each boxplot represents the mean accuracy rate of 50 testing attempts. As the window size increases, the computation time decreases and the identification rate increases until it reaches its peak at a window size of 4800 samples; the performance then drops to a plateau as the size is increased further. Figure 5 also indicates the variation of the results achieved with different window sizes. For this dataset “4800 samples per window” results in the most stable performance, as indicated by the most compact boxplot, which represents the smallest variance over the 50 attempts.

More research is still required to concurrently optimize all these parameters that affect EEG signal preprocessing. The performance might be further improved by applying the hybrid de-noising method on only the higher frequency EEG bands.

REFERENCES
[1] Jain, Anil K., Sharath Pankanti, Salil Prabhakar, Lin Hong, and Arun Ross. "Biometrics: a grand challenge." In Pattern Recognition, 2004. ICPR 2004. Proceedings of the 17th International Conference on, vol. 2, pp. 935-942. IEEE, 2004.
[2] Gupta, Cota Navin, Yusuf U. Khan, Ramaswamy Palaniappan, and Francisco Sepulveda. "Wavelet framework for improved target detection in oddball paradigms using P300 and gamma band analysis." Biomedical Soft Computing and Human Sciences 14, no. 2 (2009): pp. 61-67.
[3] Abdullah, Muhammad Kamil, Khazaimatol S. Subari, Justin Leo Cheang Loong, and Nurul Nadia Ahmad. "Analysis of the EEG signal for a practical biometric system." World Academy of Science, Engineering and Technology 68 (2010): pp. 1123-1127.
[4] Daubechies, Ingrid. Ten lectures on wavelets. Vol. 61. Philadelphia: Society for Industrial and Applied Mathematics, 1992.
[5] Duin, R. P. W., Juszczak, P., de Ridder, D., Paclík, P., Pękalska, E., and Tax, D. M. J. (2004). PR-Tools: Pattern Recognition Tools. Accessed from http://www.37steps.com/prhtml/prtools/ldc.html.
[6] Schalk, Gerwin, Dennis J. McFarland, Thilo Hinterberger, Niels Birbaumer, and Jonathan R. Wolpaw. "BCI2000: a general-purpose brain-computer interface (BCI) system." IEEE Transactions on Biomedical Engineering 51, no. 6 (2004): pp. 1034-1043.
[7] Goldberger, Ary L., Luis AN Amaral, Leon Glass, Jeffrey M. Hausdorff, Plamen Ch Ivanov, Roger G. Mark, Joseph E. Mietus, George B. Moody, Chung-Kang Peng, and H. Eugene Stanley. "PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals." Circulation 101, no. 23 (2000): e215-e220.
[8] Yang, Su, and Farzin Deravi. "On the Effectiveness of EEG Signals as a Source of Biometric Information." In Emerging Security Technologies (EST), 2012 Third International Conference on, pp. 49-52. IEEE, 2012.
[9] EEG Motor Movement / Imagery Dataset. Available from: http://www.physionet.org/pn4/eegmmidb/, accessed April 2013.
[10] Chatrian, G. E. "Ten percent electrode system for topographic studies of spontaneous and evoked EEG activity." Am J Electroencephalogr Technol 25 (1985): pp. 83-92.
[11] Kutner, Michael H., Chris J. Nachtsheim, and John Neter. Applied Linear Regression Models, 4th edition. McGraw-Hill Higher Education.
[12] Taswell, Carl. "The what, how, and why of wavelet shrinkage denoising." Computing in Science & Engineering 2, no. 3 (2000): pp. 12-19.
[13] Vidakovic, Brani. "Nonlinear wavelet shrinkage with Bayes rules and Bayes factors." Journal of the American Statistical Association 93, no. 441 (1998): pp. 173-179.
[14] Donoho, David L. "De-noising by soft-thresholding." IEEE Transactions on Information Theory 41, no. 3 (1995): pp. 613-627.
[15] Suter, Bruce W. Multirate and wavelet signal processing. Vol. 8. Academic Press, 1997.
[16] Aminghafari, Mina, Nathalie Cheze, and Jean-Michel Poggi. "Multivariate denoising using wavelets and principal component analysis." Computational Statistics & Data Analysis 50, no. 9 (2006): pp. 2381-2398.
[17] Xie, Shengkun, Pietro Lio, and Anna T. Lawniczak. "A comparative study of noise effect on wavelet based de-noising methods." In Science and Technology for Humanity (TIC-STH), 2009 IEEE Toronto International Conference, pp. 919-926. IEEE, 2009.
[18] Bakshi, Bhavik R. "Multiscale PCA with application to multivariate statistical process monitoring." AIChE Journal 44, no. 7 (1998): pp. 1596-1610.

6. WAVELET FUNCTION OPTIMIZATION The choice of the wavelet function was also considered. Extensive tests on the wavelet-based de-noising module suggest that it does not significantly depend on the type of wavelet chosen for its implementation. However, the effectiveness of the features extracted by the WPD algorithm is affected by the types and orders of wavelet functions used in the feature extraction module. Three types of wavelets were examined: Daubechies, Symlets and Coiflets, and some results are shown in Table 1.

Table 1 Performance of Different Wavelet Functions with De-noising and Window-size Optimization.

For the Daubechies and Symlets wavelet families, orders from 1 to 20 were evaluated, and the Coiflets were tested from order 1 to 5. Test results indicate that as the wavelet order increases, its impact on performance becomes less significant. Daubechies 4 was chosen as the wavelet function for WPD feature extraction.

7. CONCLUSION AND FUTURE WORK The experiments reported in this paper suggest that preprocessing has the potential to substantially improve the performance of EEG biometric identification and may help it reach the levels of accuracy needed for some high-security applications. The proposed hybrid system with the selected windowing scheme and wavelet type improved the accuracy rate: the identification accuracy for 50 subjects from the
