Proceedings of the 2015 International Conference on Machine Learning and Cybernetics, Guangzhou, 12-15 July, 2015

DECISION FUSION FOR EEG-BASED EMOTION RECOGNITION

SHUAI WANG, JIACHEN DU, RUIFENG XU

Shenzhen Engineering Laboratory of Performance Robots at Digital Stage, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China
E-MAIL: [email protected], [email protected]

Abstract: Emotion recognition using electroencephalography (EEG) has received much attention in recent years, and various features characterizing EEG signals from different angles have been proposed. In this paper, we propose an EEG-based emotion recognition framework based on the weighted fusion of the outputs of base classifiers. Three base classifiers based on SVM (RBF kernel) using Power Spectral, Higuchi Fractal Dimension and Lempel-Ziv Complexity features are developed, respectively. The outputs of the base classifiers are integrated by a weighted fusion strategy based on the confidence estimated for each class by each base classifier. The evaluation on the DEAP dataset shows that our proposed decision fusion based method outperforms both the individual base classifiers and feature fusion based classifier integration on EEG-based emotion recognition.

Keywords: EEG; Emotion recognition; Decision fusion

1. Introduction

Emotion is an aspect of human consciousness and behavior that reflects human perception and attitudes. Emotion plays a critical role in daily life, especially in the interaction between human beings; furthermore, emotion recognition plays an important role in the area of human-machine interaction [1]. In recent decades, many studies on emotion recognition have been reported. The majority of these studies are based on non-physiological signals such as facial expression, posture, body motion and voice. However, facial expressions and tone of voice can be deliberately hidden, so the corresponding recognition is not always reliable. EEG-based emotion recognition is regarded as more effective and reliable, because physiological signals cannot be controlled by humans intentionally. Feature extraction plays the key role in EEG-based emotion recognition. With the rapid development of digital signal processing, computational geometry and nonlinear dynamical analysis, many EEG analysis methods have been proposed. These methods analyze EEG signals in the time domain, frequency domain, or time-frequency domain.

On this basis, features based on band power [2,3], information entropy [4], fractal dimension [5] and complexity [6] have been proposed. These features are designed to quantify EEG characteristics from different angles, so feature integration or fusion is naturally expected to improve EEG-based emotion recognition. However, human brain activity can be seen as a complicated nonlinear dynamic system, and simple fusion that concatenates all features extracted from EEG signals into one vector sometimes has negative effects. Considering these problems, this paper proposes an effective decision fusion method based on multiple features. Firstly, three groups of basic features are extracted from 32 electrodes: (1) power spectral features from the theta (4-8 Hz), slow alpha (8-10 Hz), alpha (8-12 Hz), beta (12-30 Hz), and gamma (30+ Hz) bands [7]; (2) Lempel-Ziv complexity features [6,23]; (3) Higuchi Fractal Dimension (HFD) features [9]. Secondly, we construct three base classifiers, each using one of the above feature groups. The weight coefficient of each feature is learned automatically from the training dataset. Finally, the outputs of these base classifiers are integrated through confidence-based decision fusion. Evaluations on the DEAP dataset (Dataset for Emotion Analysis using EEG, Physiological and Video Signals) show that our proposed decision fusion approach outperforms both the base classifiers and the feature fusion method.

The rest of this paper is organized as follows. Section 2 briefly reviews the research background. Section 3 describes our method in detail. Section 4 gives the evaluation and discussion. Finally, Section 5 concludes the paper.

2. Research Background

2.1. Emotion and EEG

An emotional state refers to a psychological and physiological state in which emotions and behaviors are interrelated and appraised within a context [10]. Generally speaking, the space of emotional states can be divided into discrete models and dimensional models.

978-1-4673-7220-6/15/$31.00 © 2015 IEEE


In the discrete model, an emotional state is defined as one of a finite set of core emotions (anger, fear, disgust, surprise, happiness, and sadness) or a combination of them [11]. The dimensional model defines an emotional state along basic dimensions of emotion such as valence and arousal [12]. According to this model, emotions are specified by their position in the two-dimensional space shown in Fig. 1. Valence represents the quality of an emotion, ranging from unpleasant to pleasant. Arousal refers to the quantitative activation level, ranging from calm to excited [13]. Based on these emotion models, the neurophysiologic mechanisms underlying emotional states have been investigated. Recently, EEG-based emotion recognition has gained much research attention in the field of Brain-Computer Interfaces (BCIs), where it is known as affective BCI [14].

[FIGURE 1. TWO-DIMENSIONAL EMOTION MODEL. Quadrants of the valence/arousal plane: Q1 (positive valence, high arousal): joy; Q2 (negative valence, high arousal): fear, anger; Q3 (negative valence, low arousal): disgust, sadness; Q4 (positive valence, low arousal): relaxation.]

2.2. EEG-based Emotion Recognition

EEG-based emotion recognition has gained increasing research interest, as it offers benefits such as good time resolution, low intrusiveness, and signals captured at the origin of emotion genesis [15]. Specific EEG-based emotion recognition tasks differ in the stimulus materials presented to subjects and in the emotion evaluation models used. Commonly used stimulus materials in EEG-based emotion recognition tasks are videos, audio, or pictures. Psychologists often describe emotions not as discrete states but as continuous ones, and thus represent emotion in an n-dimensional space. The most frequently used representation is the two-dimensional valence/arousal space (VAS) shown in Figure 1. Up to now, few affective EEG databases are publicly available; they include the eNTERFACE database, the DEAP database, and the one established by Yazdani et al. [16]. The DEAP database [17] involves a comparatively large number of subjects (32) who participated in the data collection; the stimuli used to elicit emotions are one-minute music videos. It has been used as the evaluation dataset in much EEG-based emotion recognition research.

2.3. EEG Feature Extraction

Feature extraction aims to build a computational model that finds emotion-related features based on neurophysiologic and neuropsychological knowledge. Feature extraction methods can be divided into three categories: time domain analysis, frequency domain analysis, and time-frequency domain analysis. Event-related potentials (ERP) and statistics of the signal, such as mean, standard deviation, Higher Order Crossings (HOC), and fractal dimension (FD), are typical time domain features [9,18,19]. In frequency domain analysis, the most popular features are power features from different frequency bands [3,7,17]. If the signal is non-stationary, time-frequency analysis brings additional information by considering dynamical changes [18]; typical examples are the Hilbert-Huang Spectrum [20] and the Discrete Wavelet Transform [21]. With the development of information theory and nonlinear dynamical analysis, entropy-based features and complexity features, such as Fuzzy Entropy [4], Approximate Entropy [8], and Lempel-Ziv complexity [6,22,23], have also been employed and shown to be effective in EEG-related analysis tasks.

3. Decision Fusion of Multiple Features for EEG-based Emotion Recognition

As discussed above, abundant features are available from the time domain, frequency domain and time-frequency domain. Naturally, feature integration or fusion is expected to improve EEG-based emotion recognition. Traditional feature fusion methods, which concatenate multiple features into one single vector, cannot improve the recognition performance stably, because poorly performing features bring negative influences. Furthermore, simple feature concatenation ignores the non-linear structure of the original EEG signals. To overcome these problems, this paper proposes an effective decision fusion method of multiple features based on the Support Vector Machine (SVM). Firstly, three SVM-RBF based binary base classifiers are constructed from the power spectral, Higuchi Fractal Dimension, and Lempel-Ziv complexity features of the 32 EEG channels, respectively. Secondly, the weight coefficient of each feature is learned based on the confidence output by each base classifier. Thirdly, a decision fusion based classifier is constructed by integrating the outputs of the three base classifiers with the weight coefficients to estimate the overall confidence of a sample belonging to each class. Following the principle of minimum error, the class with the maximum confidence is output as the emotion recognition result. The framework of our proposed decision fusion based EEG classification is shown in Figure 2.
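The weighting-and-voting pipeline just described can be sketched as follows. This is a minimal illustration and not the authors' implementation: the training error rates and per-class probability outputs are hypothetical placeholders, and `classifier_weight` follows the error-based weight formula detailed in Section 3.2.

```python
import math

def classifier_weight(error_rate):
    """Weight of a base classifier from its training error rate (Formula 6)."""
    return 0.5 * math.log((1.0 - error_rate) / error_rate)

def fuse(confidences, weights):
    """Confidence-weighted decision fusion (Formulas 7-9).

    confidences: one dict {"positive": p, "negative": q} per base classifier;
    weights: the matching list of alpha_m coefficients.
    """
    p_pos = sum(w * c["positive"] for c, w in zip(confidences, weights))
    p_neg = sum(w * c["negative"] for c, w in zip(confidences, weights))
    return "positive" if p_pos >= p_neg else "negative"

# Hypothetical training error rates of the three base classifiers
# (power spectral, HFD, LZC) for one subject.
errors = [0.34, 0.33, 0.35]
weights = [classifier_weight(e) for e in errors]

# Hypothetical per-class probability outputs for a single test trial.
confidences = [
    {"positive": 0.62, "negative": 0.38},
    {"positive": 0.45, "negative": 0.55},
    {"positive": 0.58, "negative": 0.42},
]
print(fuse(confidences, weights))  # -> positive
```

Note that a classifier whose error approaches 0.5 receives a weight near zero, so an unreliable feature contributes little to the fused decision.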

[FIGURE 2. THE FRAMEWORK OF DECISION FUSION BASED EEG CLASSIFICATION. EEG signals pass through feature extraction into three base classifiers (F-power, F-hfd, F-LZC); training optimizes the weights of the base classifiers, whose outputs are combined by the final classifier.]

3.1. Feature Selection

In order to model the human emotion status from each channel's data [x1, x2, ..., xN], we extract the power spectral feature from signal processing, the Higuchi Fractal Dimension feature from geometric computation, and the Lempel-Ziv complexity feature from nonlinear dynamical analysis, respectively. These features are regarded as reflecting emotion status from different angles. Thus, their fusion is expected to take advantage of different analysis methods and domains in EEG-based emotion recognition.

Power Spectral

The most popular features in EEG-based emotion recognition are power features from different frequency bands. The logarithms of the spectral power of the theta (4-8 Hz), slow alpha (8-10 Hz), alpha (8-12 Hz), beta (12-30 Hz), and gamma (30+ Hz) bands, with band boundary vector [f1, f2, ..., fK] = [0.5, 4, 7, 12, 30, 100], were extracted from all 32 electrodes as the feature collection Fpower. The power spectral intensity (PSI) of band k is computed as Formula 1:

    PSI_k = sum_{i = floor(N(f_k / f_s))}^{floor(N(f_{k+1} / f_s))} |X_i|,  k = 1, 2, ..., K-1,    (1)

where f_s is the sampling rate, X_i is the result of the Fast Fourier Transform (FFT), and N is the series length. The total number of PSI-based features for 32 electrodes is 32*5 = 160.

Higuchi Fractal Dimension

Another frequently used measure of complexity is the fractal dimension, which can be computed via Higuchi's algorithm [9]. Thus, the Higuchi Fractal Dimension (HFD) of each EEG channel is employed; HFD is known to produce results close to the theoretical FD values. Higuchi's algorithm constructs k new series from the original series by:

    x_m, x_{m+k}, x_{m+2k}, ..., x_{m + floor((N-m)/k) * k},    (2)

where m = 1, 2, ..., k. For each time series constructed by Formula 2, the curve length L(m, k) is calculated by Formula 3:

    L(m, k) = { [ sum_{i=1}^{floor((N-m)/k)} |x_{m+ik} - x_{m+(i-1)k}| ] * (N-1) / ( floor((N-m)/k) * k ) } / k,    (3)

and the average length for each k is

    L(k) = [ sum_{i=1}^{k} L(i, k) ] / k.    (4)

We repeat this for each k from 1 to kmax, and then use a least-squares method to determine the slope of the line that best fits the curve of ln(L(k)) versus ln(1/k). The obtained slope is the Higuchi Fractal Dimension. We extract 32 fractal dimension features from the 32 EEG channels, respectively, as the feature collection Fhfd.

Lempel-Ziv Complexity

Methods using nonlinear dynamics to extract features from time series for data mining are not widely investigated. However, there is solid evidence that using nonlinear dynamics for feature extraction from EEG signals is reasonable. Lempel-Ziv complexity (LZC) is one of the most effective measures used in nonlinear dynamics analysis. To use Lempel-Ziv complexity as a feature, the EEG signals are transformed into 0-1 discrete symbols, and a 32-dimension feature vector is then extracted from the symbolic strings corresponding to the 32 channels. This group of features is denoted as FLZC. The detailed algorithms are given in [6, 23].
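The three feature groups of Section 3.1 can be sketched as below. This is an illustrative re-implementation under stated assumptions, not the authors' code: the 45 Hz gamma cap matches the DEAP 4-45 Hz pre-processing rather than the 100 Hz boundary above, median thresholding is one common (assumed) symbolization for LZC, and the LZC routine is a compact variant of the exhaustive LZ76 parsing of [23].

```python
import numpy as np

def band_power(x, fs, bands=((4, 8), (8, 10), (8, 12), (12, 30), (30, 45))):
    """Log band power per Formula 1: sum of |X_i| over the FFT bins in each band."""
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return [np.log(spectrum[(freqs >= lo) & (freqs < hi)].sum() + 1e-12)
            for lo, hi in bands]

def higuchi_fd(x, kmax=8):
    """Higuchi Fractal Dimension (Formulas 2-4): slope of ln L(k) vs ln(1/k)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    ln_k, ln_L = [], []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):                       # offsets m = 1..k (0-based here)
            idx = np.arange(m, n, k)             # x_m, x_{m+k}, x_{m+2k}, ...
            if len(idx) < 2:
                continue
            diff = np.abs(np.diff(x[idx])).sum()
            norm = (n - 1) / ((len(idx) - 1) * k)  # curve-length normalization
            lengths.append(diff * norm / k)
        ln_k.append(np.log(1.0 / k))
        ln_L.append(np.log(np.mean(lengths)))
    slope, _ = np.polyfit(ln_k, ln_L, 1)         # least-squares fit of the slope
    return slope

def lzc(x):
    """Lempel-Ziv complexity of the median-binarized signal (assumed symbolization)."""
    s = ''.join('1' if v > np.median(x) else '0' for v in x)
    i, c, n = 0, 0, len(s)
    while i < n:
        length = 1
        # grow the phrase until it stops being reproducible from the prefix
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        c += 1
        i += length
    return c

# Hypothetical single-channel example: one minute of noise at 128 Hz.
rng = np.random.default_rng(0)
x = rng.standard_normal(128 * 60)
print(len(band_power(x, fs=128)), round(higuchi_fd(x), 2), lzc(x))
```

Per channel this yields 5 band-power values, one HFD value, and one LZC value; over 32 channels that gives the 160-dimension Fpower and the 32-dimension Fhfd and FLZC collections described above. As a sanity check, white noise should give an HFD close to its theoretical value of 2.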



3.2. Decision Fusion of Multiple Classifiers

The fusion of the above features aims to improve classification performance by exploiting the complementary nature of different features [17]. In general, fusion can be classified into two categories: feature fusion and decision fusion. Feature fusion concatenates the different features extracted from the signals into a composite feature vector. In decision fusion, each feature is processed independently by a corresponding classifier, and the classifier outputs are integrated to generate the final result. Feature fusion methods are straightforward and consider the synchronous characteristics of the involved features. However, brain activity is a complicated nonlinear dynamic system, so feature fusion sometimes has a negative impact; in particular, the non-linear structures in the signal may be lost during dimension reduction. Decision fusion, in contrast, is based on fusing the classification results of base classifiers and can model the asynchronous characteristics of the features flexibly. An important advantage of decision fusion over feature fusion is that, since each signal is processed and classified independently, it is relatively easy to employ an optimal weighting scheme that adjusts the contribution of each feature to the final decision according to its reliability. Therefore, decision fusion is adopted in this study.

Firstly, we build the base classifiers based on SVM with an RBF kernel using the three groups of features, namely Fpower, Fhfd and FLZC. Then the classification error rate of each base classifier on the training dataset is calculated as Formula 5:

    error_m = P(f(x_i) != y_i) = sum_{i=1}^{N} w_{mi} * I(f(x_i) != y_i),    (5)

where f(x_i) is the predicted label of sample x_i, y_i is the true label, and w_{mi} is the sample weight of x_i in the m-th feature-based classification, which is set to 1.0. The weight coefficient of each classifier on the training dataset is then computed by Formula 6:

    alpha_m = (1/2) * log( (1 - error_m) / error_m ).    (6)

For an input testing sample, the confidence conf_m on each class given by the corresponding classifier is its probability estimate for that class, and the fused confidences are:

    P_positive = sum_{m in F} alpha_m * conf_m^positive,    (7)

    P_negative = sum_{m in F} alpha_m * conf_m^negative.    (8)

The final decision function outputs the predicted label as follows:

    Label = argmax{ P_positive, P_negative }.    (9)

4. Evaluations and Results

4.1. Experimental Settings

Our proposed decision fusion method for EEG-based emotion classification on two emotion dimensions (arousal and valence) is evaluated on the DEAP dataset (Dataset for Emotion Analysis using EEG, Physiological and Video Signals). The DEAP dataset records 32-channel EEG and multiple peripheral physiological signals of 32 participants while they watched 40 one-minute music videos. After the presentation of each stimulus (music video), the participant rated its content in terms of arousal, valence, likability, and dominance (ranging from 1 to 9) and familiarity (ranging from 1 to 5). We applied the same pre-processing steps for the EEG signals as in [6]: down-sampling to 128 Hz, removing eye-blinking artifacts, band-pass filtering each channel to the 4-45 Hz interval, and averaging the channels to a common reference. For binary classification, we transformed the arousal and valence ratings into two categories, negative and positive, corresponding respectively to the rating intervals [1, 5) and [5, 9]. In the following experiments, we evaluate the classification accuracies in a single-trial setup for each subject separately; in other words, we used a leave-one-out cross-validation scheme with SVM to obtain single-subject accuracy. The evaluation metrics are the averaged accuracies for the two emotional dimensions: arousal (ARO) and valence (VAL).

4.2. Experimental Results and Analysis

Firstly, the individual performances achieved by the three base classifiers, corresponding to the Power Spectral, Higuchi Fractal Dimension and Lempel-Ziv Complexity features, are evaluated. The accuracies achieved on valence and arousal for the 32 subjects are shown in Figure 3 and Figure 4, respectively. It is observed that the performances on valence and arousal show obvious differences between the base classifiers on most subjects. Table 1 gives the overall performance of the three base classifiers on the DEAP dataset. Generally speaking, the achieved performance on arousal classification is higher than on valence classification. Meanwhile, the three base classifiers achieve similar performance, with the one based on the Higuchi Fractal Dimension (HFD) achieving the highest.


[FIGURE 3. VALENCE ACCURACY ON 32 SUBJECTS BY THE THREE BASE CLASSIFIERS (binPower, hfd, clzc).]

[FIGURE 4. AROUSAL ACCURACY ON 32 SUBJECTS BY THE THREE BASE CLASSIFIERS (binPower, hfd, clzc).]

TABLE 1. THE ACHIEVED PERFORMANCES OF THE BASE CLASSIFIERS INDIVIDUALLY

            Fpower    Fhfd      FLZC
Valence     0.6384    0.6416    0.6352
Arousal     0.6659    0.6625    0.6694

The second experiment evaluates the performance improvement brought by classifier integration. Both the feature fusion method and our proposed decision fusion method are evaluated, with the system reported in [17] used as the baseline for comparison. Here, the feature fusion method integrates all of the features into one vector. The achieved performances on valence and arousal are listed in Table 2. It is observed that feature fusion achieves performances higher than the lowest but lower than the highest performance of the base classifiers; in other words, feature fusion cannot clearly improve the classification performance. Our proposed decision fusion method achieves better performance: a 0.0186 accuracy improvement on valence over feature fusion and a 0.0155 improvement over the best base classifier, Fhfd. On arousal, decision fusion outperforms feature fusion by 0.0114 and FLZC by 0.0049. These results show that our proposed decision fusion method effectively improves the classification performance of EEG-based emotion recognition.

TABLE 2. THE ACHIEVED PERFORMANCES BY FEATURE FUSION AND DECISION FUSION

            [17]     Feature Fusion    Decision Fusion
Valence     0.620    0.6385            0.6571
Arousal     0.583    0.6629            0.6743

To estimate the upper limit of the performance improvement achievable by fusion, we count, for each trial, the number of base classifiers giving the correct answer; the fraction of trials on which at least one base classifier is correct can be regarded as the upper limit of any fusion method. The counts are given in Table 3. They show that fusion methods still have large room for improvement.

TABLE 3. THE COUNTS OF CORRECT CLASSIFICATIONS BY THE BASE CLASSIFIERS

            0-answer    1-answer    2-answer    3-answer    >=1-answer
Valence     0.26        0.10        0.12        0.52        0.74
Arousal     0.24        0.11        0.11        0.54        0.76
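The oracle statistic used for Table 3 (the fraction of trials on which 0, 1, 2, or 3 of the base classifiers are correct, with ">=1" as the upper bound on any decision fusion) amounts to a simple counting routine. A minimal sketch, where the predictions and labels are toy placeholders rather than the paper's data:

```python
def oracle_counts(predictions, labels):
    """Fractions of trials on which exactly 0..3 of the base classifiers
    are correct; 'at_least_1' is the oracle upper bound for decision fusion.

    predictions: one list of predicted labels per base classifier;
    labels: the true labels, in the same trial order.
    """
    n = len(labels)
    counts = {0: 0, 1: 0, 2: 0, 3: 0}
    for i, y in enumerate(labels):
        hits = sum(1 for p in predictions if p[i] == y)
        counts[hits] += 1
    fracs = {k: v / n for k, v in counts.items()}
    fracs["at_least_1"] = 1.0 - fracs[0]
    return fracs

# Toy binary labels for five trials and three hypothetical base classifiers.
labels = [1, 0, 1, 1, 0]
predictions = [
    [1, 0, 0, 0, 1],   # e.g. the power-spectral classifier
    [1, 1, 1, 0, 0],   # e.g. the HFD classifier
    [1, 0, 1, 0, 1],   # e.g. the LZC classifier
]
print(oracle_counts(predictions, labels))
```

Run over each subject's leave-one-out predictions and averaged, this reproduces the kind of per-dimension fractions reported in Table 3.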

5. Conclusions

In this paper, we propose an effective decision fusion method that integrates the outputs of multiple base classifiers. Three base classifiers based on SVM (RBF kernel) using Power Spectral, Higuchi Fractal Dimension and Lempel-Ziv Complexity features are developed, respectively. The confidence on each class by each base classifier is then estimated. Finally, the outputs of the base classifiers are integrated through confidence-based decision fusion to produce the final classification. The evaluation on the DEAP dataset shows that our proposed decision fusion classifier outperforms both the individual base classifiers and the feature fusion method, which integrates the different features into a single composite feature vector. This shows that the proposed decision fusion method helps exploit the advantages of different features from different angles. Furthermore, a decision fusion based EEG emotion recognition system is flexible: more powerful features can easily be appended and weak, low-performance features removed.



Acknowledgements

This work is supported by the National Natural Science Foundation of China (No. 61370165, 61203378), National 863 Program of China (No. 2015AA015405), the Natural Science Foundation of Guangdong Province (No. S2013010014475), Shenzhen Development and Reform Commission Grant No. [2014]1507, Shenzhen Peacock Plan Research Grant KQCX20140521144507925, and Baidu Collaborative Research Funding.

References

[1] R. Cowie, E. Douglas-Cowie, N. Tsapatsoulis, G. Votsis, S. Kollias, W. Fellenz, and J. G. Taylor, Emotion recognition in human-computer interaction, Signal Processing Magazine, vol. 18, no. 1, pp. 32-80, 2001.
[2] M. Li and B.-L. Lu, Emotion classification based on gamma band EEG, Proceedings of the IEEE Int. Conf. on Engineering in Medicine and Biology Society, vol. 1, Sep. 2009, pp. 1323-1326.
[3] X. Wang, D. Nie, and B. Lu, EEG-based emotion recognition using frequency domain features and support vector machines, Proceedings of ICONIP 2011, pp. 734-743.
[4] H. Liu, H. Xie, W. He, and Z. Wang, Characterization and classification of EEG sleep stages based on fuzzy entropy, Journal of Data Acquisition and Processing, vol. 25, no. 4, 2010.
[5] Y. Liu and O. Sourina, Real-time fractal-based valence level recognition from EEG, Transactions on Computational Science XVIII, pp. 101-120, 2013.
[6] D. Zhang, D. Chen, Y. You, and H. Li, Analysing emotion EEG signal features based on adaptive Lempel-Ziv complexity, Computer Application and Software, vol. 31, no. 9, 2014.
[7] R. Q. Quiroga, S. Blanco, O. A. Rosso, H. Garcia, and A. Rabinowicz, Searching for hidden information with Gabor transform in generalized tonic-clonic seizures, Electroencephalography and Clinical Neurophysiology, vol. 103, no. 4, pp. 434-439, 1997.
[8] S. M. Pincus, I. M. Gladstone, and R. A. Ehrenkranz, A regularity statistic for medical data analysis, Journal of Clinical Monitoring and Computing, vol. 7, no. 4, pp. 335-345, 1991.
[9] T. Higuchi, Approach to an irregular time series on the basis of the fractal theory, Physica D, vol. 31, no. 2, pp. 277-283, 1988.
[10] K. R. Scherer, What are emotions? And how can they be measured? Social Science Information, vol. 44, no. 4, pp. 695-729, 2005.
[11] L. F. Barrett, Discrete emotions or dimensions? The role of valence focus and arousal focus, Cognition and Emotion, vol. 12, no. 4, pp. 579-599, 1998.
[12] I. B. Mauss and M. D. Robinson, Measures of emotion: a review, Cognition and Emotion, vol. 23, no. 2, pp. 209-237, 2009.
[13] P. Ekman and R. Davidson, The Nature of Emotion: Fundamental Questions, Oxford University Press, 1994.
[14] C. Muhl, A.-M. Brouwer, N. C. van Wouwe, E. van den Broek, F. Nijboer, and D. Heylen, Modality-specific affective responses and their implications for affective BCI, Proceedings of the Int. Conf. on Brain-Computer Interfaces, 2011, pp. 4-7.
[15] P. C. Petrantonakis and L. J. Hadjileontiadis, A novel emotion elicitation index using frontal brain asymmetry for enhanced EEG-based emotion recognition, IEEE Transactions on Information Technology in Biomedicine, vol. 15, no. 5, 2011.
[16] Y. Liu and O. Sourina, EEG databases for emotion recognition, Proceedings of the 2013 International Conference on Cyberworlds.
[17] S. Koelstra, C. Muehl, M. Soleymani, J.-S. Lee, A. Yazdani, T. Ebrahimi, T. Pun, A. Nijholt, and I. Patras, DEAP: a database for emotion analysis using physiological signals, IEEE Transactions on Affective Computing, Special Issue on Naturalistic Affect Resources for System Building and Evaluation.
[18] R. Jenke, A. Peer, and M. Buss, Feature extraction and selection for emotion recognition from EEG, IEEE Transactions on Affective Computing, 2014.
[19] P. C. Petrantonakis and L. J. Hadjileontiadis, Emotion recognition from EEG using higher order crossings, IEEE Transactions on Information Technology in Biomedicine, vol. 14, no. 2, pp. 186-197, 2010.
[20] R. Khosrowabadi and A. Rahman, Classification of EEG correlates of emotion using features from Gaussian mixtures of EEG spectrogram, Proceedings of the 3rd Int. Conf. on Information and Communication Technology for the Moslem World, 2010.
[21] M. Akin, Comparison of wavelet transform and FFT methods in the analysis of EEG signals, Journal of Medical Systems, vol. 26, no. 3, pp. 241-247, Jun. 2002.
[22] S. J. Roberts, W. Penny, and I. Rezek, Temporal and spatial complexity measures for electroencephalogram based brain computer interfacing, Medical and Biological Engineering and Computing, vol. 37, no. 1, pp. 93-98, 1999.
[23] A. Lempel and J. Ziv, On the complexity of finite sequences, IEEE Transactions on Information Theory, vol. 22, no. 1, pp. 75-81, 1976.
