Extracting temporal EEG features with BCIpy

Justis Peters, Sagar Jauhari, and Tiffany Barnes
North Carolina State University, Raleigh, NC
{jpgeter2,sjauhar,tmbarnes}@ncsu.edu

Abstract. We present BCIpy, an open source toolkit¹ written in Python, which focuses on temporal features in EEG (electroencephalography) recordings from a single channel. BCIpy extracts and charts features and trains a support-vector classifier (SVC) with these features. BCIpy is intended to support classifying subject responses to stimuli in intelligent tutoring systems, particularly when those responses include event-related potentials (ERPs). We present a case study of EEGs recorded while students read passages in the Project LISTEN reading tutor [1]. To test our hypothesis that ERPs distinguish between hard and easy passages, we transform the first second of the EEG signal with a rolling median, train an SVC, and evaluate its classification accuracy. We conclude with recommendations for study designs and data collection that would support more accurate detection of ERPs, potentially leading to successful classification using temporal features of EEGs.

Keywords: EEG, BCI, brain-computer interface, NeuroSky, ERP, event-related potential, SVM, SVC

1 Introduction

Accurately adapting the difficulty of exercises in an intelligent tutoring system (ITS) can speed learning and maintain flow for the student, but traditional forms of assessment, such as in-system quizzes and feedback questionnaires, can interrupt flow and frustrate students. Ideally, an ITS would adapt difficulty without interrupting flow. As Tan states in his thesis, EEG could move us closer to this goal by aiding inference about the student’s mental state and supporting personalization of instruction [8]. Such inference from EEG allows insight without interrupting flow. Low-cost systems, such as those from NeuroSky, enable in situ studies with electroencephalography (EEG) at greater scale and lower cost. EEG provides rich information for personalization of instruction, but we must first understand how voltage samples from single-channel EEG systems relate to learning.

In this paper, we present BCIpy, an open source toolkit which focuses on temporal features in EEG recordings from a single channel. BCIpy extracts and charts features and can train a support-vector classifier (SVC) with these features. Our main goal was to support classifying subject response to stimuli in intelligent tutoring systems, particularly when those responses include event-related potentials (ERPs). BCIpy is open source, written in Python, licensed with GPLv3, and its code is published on GitHub².

¹ See http://bcipy.org for code and documentation.

Identifying temporal patterns in EEG is difficult because the underlying processes of EEG are non-stationary [4]. Classifying subject response to a stimulus via EEG is also difficult, because neural activity includes many processes that are unrelated to engagement with the stimulus, and these processes create noise that may obscure the signal of the process under classification [7]. Research on ERPs typically addresses non-stationarity by registering the timeline of stimulus presentation with the timeline of the EEG recording, and addresses noise by presenting the same stimulus multiple times and averaging the signal across the recordings. To use EEG in an ITS, though, we want to maximize inference from the subject’s first engagement with each stimulus. Therefore, we present rolling median as a means of averaging the signal from one trial. To address non-stationarity, we discuss the importance of precise timestamps, accurate measures of subject engagement with the stimulus, and proper registration of EEG recordings with timestamps of subject engagement.

As a case study, we test the hypothesis that a rolling median on one second of EEG after stimulus is enough to differentiate subject response to easy and hard passages in the Project LISTEN reading tutor [1]. We focus on the first second because ERPs typically occur within the first second after subject engagement with the stimulus (e.g., N1, P2, and P300). Further, Chang et al. found evidence that the first second contains enough information to classify the difficulty of a passage a student reads aloud [1]. This evidence inspired our hypothesis and supported the idea that ERPs may be present in these data. SVCs have been used in classification of EEG [9, 11].
To test our hypothesis, we trained an SVC and found that its accuracy was no better than chance. We include analysis and discussion of how careful data collection and temporal registration could support classification with temporal features of an EEG.

2 Data, Toolkit, and Case Study

Our research analyzes EEG data collected by Chang et al. [1] while students read passages in Project LISTEN’s reading tutor, an ITS. Their pilot study recorded data from a NeuroSky MindSet, an inexpensive headset equipped with a single-channel, dry-contact EEG sensor. As Chang notes, the limitations of this headset are well balanced by the opportunities its convenience affords. The participants were allowed to make body movements, read at their own pace, and click the next button on the screen [8]. Using these data, they trained binary classifiers to predict exercise difficulty with above-chance accuracy, thus demonstrating that one or more correlations exist between EEG signal and passage difficulty. Their experiment showed that EEG data could be used to classify the difficulty of reading passages with 41% to 69% accuracy across different classification tasks, with chance accuracy equal to 50%.

² BCIpy uses the Pandas data analysis library [5, 6] for the data structures and functions it provides, and uses NumPy and SciPy [3] for matrix operations, SVCs, cross-validation, and statistics on classifier accuracy.

The EEG data were recorded at 512Hz, but the timestamp of each sample was truncated to the second. Fortunately, the timestamps for presentation of stimulus included milliseconds. To align these with the EEG data, we treated the first EEG sample within a second as 0ms and linearly interpolated the time between seconds.
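The timestamp alignment described above can be sketched as follows. This is a minimal illustration rather than BCIpy's actual code: the function name `interpolate_subsecond` and the input layout (one truncated integer second per sample, in recording order) are assumptions made for the example, as is the choice to spread the samples of each second evenly according to how many samples were observed in it.

```python
import numpy as np

def interpolate_subsecond(seconds):
    """Assign sub-second times to samples whose timestamps were truncated to whole seconds.

    seconds: non-decreasing sequence of integer timestamps, one per sample.
    Returns float timestamps in which the first sample of each second sits at
    offset 0 and the remaining samples are spread evenly across that second.
    """
    seconds = np.asarray(seconds, dtype=np.int64)
    # Find where each second starts and how many samples it contains.
    _, starts, counts = np.unique(seconds, return_index=True, return_counts=True)
    # Position of each sample within its own second: 0, 1, 2, ...
    position = np.arange(len(seconds)) - np.repeat(starts, counts)
    # Evenly spaced fractional offsets in [0, 1).
    return seconds + position / np.repeat(counts, counts)

# Hypothetical input: four samples share one second, two share the next.
times = interpolate_subsecond([100, 100, 100, 100, 101, 101])
print(times)
```

With real 512Hz data each second would normally hold 512 samples, so consecutive samples would end up roughly 1.95ms apart.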

Table 1: Extracted features

Feature         Scope       Units    Hz   Description
word count      stimulus    words    -    Number of words in passage.
is passage      stimulus    boolean  -    Stimuli with more than one word.
filtered EEG    timeseries  mV       512  Butterworth filter
rolling power   timeseries  dB       8    power spectral density
rolling median  timeseries  mV       10   median

We detail the extracted features in Table 1. The word count and is passage features are useful for filtering the tasks, in order to train classifiers that consider only certain subsets of tasks. They could also be used as features in training, particularly if the other features in training share patterns regardless of word count but also include some features that depend on word count. For rolling window features, our default window size is 128 samples, which corresponds to 0.25 seconds of 512Hz data. The window size is configurable at runtime via a window size parameter.

We hypothesize that the EEG recorded while the subject reads text stimuli exhibits ERPs that are a function of the subject’s mental processing of qualities within the text. An ERP is a notable deflection in the mean EEG voltage measured during a specific window of time after presentation of a stimulus. Some ERPs are well studied and have commonly accepted names, such as P300 and N400. Figure 1a illustrates this concept, with shortened versions of the names. In most research, ERPs are studied by presenting the stimulus multiple times and averaging the signal from all trials. This attenuates noise from other mental processing and, relative to that noise, strengthens the signal from activity that is strongly dependent on the time at which the stimulus is presented.

To accomplish a similar effect with a single trial, we selected rolling median as our feature. We compared different window sizes for the rolling median function and chose 128 (0.25 seconds), as it smooths out high-frequency variation while preserving low-frequency variation. Further, we selected median instead of mean in order to accommodate some variance in the latency between presentation of the stimulus on the screen and the subject’s engagement with the stimulus. Figure 1b compares the original EEG signal with the rolling median of window size 128.
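To make the rolling median concrete, the sketch below applies a 128-sample window to a synthetic 512Hz signal. It is an illustration under stated assumptions, not BCIpy's implementation: the function `rolling_median`, the use of NumPy's `sliding_window_view`, the choice to return only full windows, and the test signal (a slow 2Hz wave plus a faster 64Hz component) are all ours.

```python
import numpy as np

SAMPLE_RATE_HZ = 512  # recording rate stated in the text

def rolling_median(signal, window_size=128):
    """Median of every full window of `window_size` consecutive samples.

    Returns len(signal) - window_size + 1 values, one per window position.
    """
    signal = np.asarray(signal, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(signal, window_size)
    return np.median(windows, axis=1)

# Synthetic stand-in for one second of EEG (not real data).
t = np.arange(SAMPLE_RATE_HZ) / SAMPLE_RATE_HZ
eeg = 50.0 * np.sin(2 * np.pi * 2 * t) + 10.0 * np.sin(2 * np.pi * 64 * t)

smoothed = rolling_median(eeg, window_size=128)
window_seconds = 128 / SAMPLE_RATE_HZ  # duration covered by one window

# Mean absolute sample-to-sample change, as a rough measure of high-frequency content.
raw_roughness = np.mean(np.abs(np.diff(eeg)))
smooth_roughness = np.mean(np.abs(np.diff(smoothed)))
print(window_seconds, raw_roughness, smooth_roughness)
```

Comparing the two roughness values for window sizes such as 32, 64, and 128 is one simple way to explore the trade-off between smoothing and temporal detail that the text describes.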
Finally, we downsample the rolling median to 10Hz, both because the window function has already smoothed out higher-frequency information and because downsampling reduces the dimensionality of the feature vector. Although we selected a window size of 128 to accommodate the variance in latency that may exist in this corpus, we encourage experiments with better controls over this variance to consider a window size of 64 (0.125 seconds). This will reduce the overlap between windows and will thus reduce the effect of activity occurring before or after distinct ERPs.

Fig. 1: Using rolling median to find Event Related Potentials (ERP). (a) ERPs after a stimulus (Source: Wikipedia [10]). (b) 10Hz rolling median calculated for 1 second on data of 512Hz; the plot shows potential (µV) against time after stimulus (ms) for the 512Hz EEG and for window sizes 32, 64, and 128.

We trained an SVC with an RBF (radial basis function) kernel on the first second of rolling median as its feature vector. We included data only for tasks where the student read a passage aloud, excluding any single-word trials.
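One way to turn the first second of a smoothed signal into a 10-value feature vector is sketched below. The source does not say how the 10Hz downsampling was performed, so the approach here (keeping the sample nearest to each ideal 10Hz time point) and the function name `first_second_features` are assumptions for illustration only.

```python
import numpy as np

SAMPLE_RATE_HZ = 512
TARGET_RATE_HZ = 10

def first_second_features(smoothed, sample_rate=SAMPLE_RATE_HZ, target_rate=TARGET_RATE_HZ):
    """Reduce the first second of an already-smoothed signal to `target_rate` values.

    Because 512 is not a multiple of 10, this sketch keeps the sample nearest to
    each ideal 10Hz time point (0.0 s, 0.1 s, ..., 0.9 s).
    """
    smoothed = np.asarray(smoothed, dtype=float)
    if len(smoothed) < sample_rate:
        raise ValueError("need at least one second of samples")
    ideal_times = np.arange(target_rate) / target_rate          # in seconds
    indices = np.round(ideal_times * sample_rate).astype(int)   # nearest sample index
    return smoothed[indices]

# Placeholder input whose values equal their own sample index (0..511),
# which makes it easy to see which samples were kept.
one_second = np.arange(SAMPLE_RATE_HZ, dtype=float)
features = first_second_features(one_second)
print(features)
```

Each trial then contributes one row of 10 values to the matrix on which a classifier is trained.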

Table 2: Subject 24, unbalanced classes

Class  Precision  Recall  F1    Support
Easy   0.75       0.75    0.75  12
Hard   0.5        0.5     0.5   6
Avg    0.67       0.67    0.67  18

Table 3: All subjects, balanced classes

Class  Precision  Recall  F1    Support
Easy   0.00       0.00    0.00  83
Hard   0.5        0.99    0.66  83
Avg    0.25       0.49    0.33  166
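The Avg rows in Tables 2 and 3 are consistent with support-weighted means of the per-class rows, and the support-weighted mean of recall equals overall accuracy. The short check below recomputes those averages from the printed values and compares them with the accuracy of always predicting the most frequent class. The helper names are ours, and small differences from the printed Avg rows can arise because the inputs are already rounded.

```python
# Rows copied from Table 2 and Table 3: (precision, recall, F1, support).
table2 = {"Easy": (0.75, 0.75, 0.75, 12), "Hard": (0.5, 0.5, 0.5, 6)}
table3 = {"Easy": (0.00, 0.00, 0.00, 83), "Hard": (0.5, 0.99, 0.66, 83)}

def weighted_average(rows, column):
    """Support-weighted mean of one metric column (0=precision, 1=recall, 2=F1)."""
    total_support = sum(row[3] for row in rows.values())
    return sum(row[column] * row[3] for row in rows.values()) / total_support

def majority_baseline(rows):
    """Accuracy obtained by always predicting the most frequent class."""
    supports = [row[3] for row in rows.values()]
    return max(supports) / sum(supports)

for name, rows in (("Table 2", table2), ("Table 3", table3)):
    averages = [weighted_average(rows, column) for column in range(3)]
    print(name, [f"{a:.3f}" for a in averages], f"baseline={majority_baseline(rows):.3f}")
```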

With these data, we ran two experiments: one on a specific subject (#24) and one on all subjects. We began with subject #24 because it was the subject for which the classifier in Chang et al. [1] had the highest accuracy. In both experiments, we reserved 20% of the data for our test set and used the remaining 80% as the training set. Using scikit-learn’s StratifiedKFold cross-validation with 4 folds, we trained the SVC on the training set and tested its classification accuracy on the test set.

We did not balance the class sizes for the SVC trained on data from subject 24. The results in Table 2 show accuracy of 67%, but this is the same as a naive classifier which predicts “easy” for every passage. We did balance the class sizes for the SVC trained on all subjects and, as Table 3 demonstrates, it had accuracy no better than chance.
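The evaluation procedure described above could look roughly like the following scikit-learn sketch. It is not the authors' script: the data are random placeholders with no real signal, the 2:1 class ratio and all variable names are assumptions, and the exact way cross-validation was combined with the held-out test set in the original work is not specified in the text.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Placeholder data: 90 trials x 10 features of pure noise.
# Labels mimic an unbalanced 2:1 split (0 = easy, 1 = hard).
X = rng.normal(size=(90, 10))
y = np.array([0] * 60 + [1] * 30)

# Hold out 20% of the trials as a test set, preserving the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

clf = SVC(kernel="rbf")

# 4-fold stratified cross-validation on the training set only.
cv = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)
cv_scores = cross_val_score(clf, X_train, y_train, cv=cv)

# Fit on the full training set, then score once on the held-out test set.
clf.fit(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)

print(f"mean CV accuracy: {cv_scores.mean():.2f}")
print(f"test accuracy:    {test_accuracy:.2f}")
```

Because the placeholder features carry no information about the labels, any accuracy this sketch reports should be read against the proportion of the majority class, which is the comparison the text makes for subject 24.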

3 Discussion

An ERP is a strongly timed process, so it is important to align training data such that each example begins at the same point in the process. We call this alignment “temporal registration”, analogous to image registration in computer vision. In this case, we analyze the EEG signal as it responds to the subject experiencing the stimulus. Thus, we want to align the beginning of each example with the exact moment at which the subject’s experience begins.

We partly addressed temporal registration by interpolating timestamps for the 512Hz data and aligning task boundaries at millisecond resolution. The task boundaries, however, are not an accurate measure of the start and end times of the subject’s engagement with the stimulus. The beginning of a task is recorded as the moment at which the system presented the stimulus, but the subject may have been looking away or thinking about something other than the stimulus. Further, each word may trigger one or more ERPs. Thus, variance in reading speed can confound temporal alignment. Focusing on the first second reduces but does not eliminate this effect.

For passages which the student reads aloud, we could infer time of experience through features in timestamped recordings of the student’s speech. At minimum, we could consider the first signal above a silence threshold after presentation of the stimulus. Having timestamped audio could further allow segmentation on word boundaries, similar to the methods used in Chen et al. [2]. This could provide far more specific information about the mental process, including reaction to lexical qualities and dynamic processes which may capture semantic interplay between words in a passage. We also note that some of the subject’s experience with a passage may occur during visual processing and before vocalization of the words. Thus, it may be further interesting to include eye tracking equipment or video recordings in future studies of EEG in a reading tutor.
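The silence-threshold idea mentioned above can be illustrated with a few lines of NumPy. This is only a sketch of the simplest variant: the function `first_sound_onset`, the amplitude-based threshold, and the synthetic recording are hypothetical, and a practical detector would likely need smoothing and a threshold tuned to the microphone.

```python
import numpy as np

def first_sound_onset(audio, sample_rate, stimulus_time, threshold):
    """Return the time in seconds of the first sample whose absolute amplitude
    exceeds `threshold` at or after `stimulus_time`, or None if there is none."""
    audio = np.asarray(audio, dtype=float)
    start = int(np.ceil(stimulus_time * sample_rate))
    above = np.flatnonzero(np.abs(audio[start:]) > threshold)
    if above.size == 0:
        return None
    return (start + above[0]) / sample_rate

# Hypothetical recording: 2 s at 1000 Hz, silent until speech starts at 1.2 s.
rate = 1000
audio = np.zeros(2 * rate)
audio[1200:] = 0.5

onset = first_sound_onset(audio, rate, stimulus_time=1.0, threshold=0.1)
latency = onset - 1.0  # delay between stimulus presentation and first sound
print(onset, latency)
```

The resulting latency could then be used to shift the start of the EEG window for that trial, instead of relying on the presentation timestamp alone.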
If it does not detract from other goals of the experiment, one could also consider rapid serial visual presentation (RSVP) of one or two words at a time, for more granular control of the subject’s encounter with each stimulus.

Acknowledging that there may exist other processes informing our model, we maintain our hypothesis that ERPs are present in these data and that these ERPs are, at least partially, a function of the difficulty class of the presented stimulus. We look forward to testing this hypothesis in future work.

To continue our work toward building classifiers on temporal features of EEG, we plan to use synthetic data to test the properties of rolling median as the feature vector for an SVC. Synthesizing data allows us to model idealized ERPs and test how varying levels of noise and errors in temporal registration affect classifier accuracy. Establishing these bounds will help inform parameters for feature extraction and controls on data collection. We may also explore public data which were collected in experiments designed to elicit specific ERPs.

As we have discussed here, temporal features of EEG signal from a single electrode may be useful in classification of subject response to stimulus. BCIpy helps you extract and analyze these features, and our examples demonstrate how to train classifiers with these features. We hope that BCIpy and the analyses we present here move our community closer to useful applications of EEG in ITS, as a form of assessment which minimizes interruptions and helps a student in “flow” remain there. We look forward to further study of this topic.
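As a possible starting point for the synthetic-data experiments proposed in this section, the sketch below generates one second of artificial single-channel EEG containing one idealized ERP. Every modeling choice here is an assumption made for illustration: the ERP is a Gaussian bump, the background activity is white Gaussian noise, registration error is a random shift of the peak latency, and the function and parameter names are hypothetical.

```python
import numpy as np

SAMPLE_RATE_HZ = 512

def synthetic_trial(rng, peak_latency_s=0.3, amplitude=10.0, width_s=0.05,
                    noise_sd=5.0, jitter_sd_s=0.0):
    """One second of synthetic EEG containing a single idealized ERP.

    The ERP is a Gaussian bump centered at `peak_latency_s`; `jitter_sd_s`
    randomly shifts that center to mimic errors in temporal registration,
    and `noise_sd` sets the level of additive white noise.
    """
    t = np.arange(SAMPLE_RATE_HZ) / SAMPLE_RATE_HZ
    latency = peak_latency_s + rng.normal(0.0, jitter_sd_s)
    erp = amplitude * np.exp(-0.5 * ((t - latency) / width_s) ** 2)
    noise = rng.normal(0.0, noise_sd, size=t.shape)
    return erp + noise

rng = np.random.default_rng(0)
clean = synthetic_trial(rng, noise_sd=0.0, jitter_sd_s=0.0)   # ideal case
noisy = synthetic_trial(rng, noise_sd=5.0, jitter_sd_s=0.05)  # noise plus jitter

peak_time = np.argmax(clean) / SAMPLE_RATE_HZ
print(peak_time)
```

Sweeping `noise_sd` and `jitter_sd_s` while holding the classifier fixed would give the kind of bounds on noise and registration error that the text describes.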

4 Acknowledgments

We provide our sincere thanks to Vinaya Polamreddi, Yueran Yuan, Kai-min Chang, Jack Mostow, Ryan Baker, Alper Bozkurt, and Thomas Price.

References

1. K.-m. Chang, J. Nelson, U. Pant, and J. Mostow. Toward exploiting EEG input in a reading tutor. International Journal of Artificial Intelligence in Education, 22(1):19–38, 2013.
2. Y.-N. Chen, K.-M. Chang, and J. Mostow. Towards using EEG to improve ASR accuracy. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 382–385. Association for Computational Linguistics, 2012.
3. E. Jones, T. Oliphant, and P. Peterson. SciPy: Open source scientific tools for Python, 2001–.
4. A. Y. Kaplan, A. A. Fingelkurts, A. A. Fingelkurts, S. V. Borisov, and B. S. Darkhovsky. Nonstationary nature of the brain activity as revealed by EEG/MEG: methodological, practical and conceptual challenges. Signal Processing, 85(11):2190–2212, 2005.
5. W. McKinney. Pandas: Python data analysis library. http://pandas.pydata.org. Accessed: 2013-12-13.
6. W. McKinney. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O’Reilly Media, 2012.
7. S. Sanei and J. A. Chambers. EEG Signal Processing. John Wiley & Sons, 2008.
8. B. H. Tan. Using a low-cost EEG sensor to detect mental states. Master’s thesis, Carnegie Mellon University, 2012.
9. B. Wang and F. Wan. Classification of single-trial EEG based on support vector clustering during finger movement. In Advances in Neural Networks–ISNN 2009, pages 354–363. Springer, 2009.
10. Wikipedia. File:componentsoferp.svg, 2009. [Online; accessed 13-December-2013].
11. J. Zhou, J. Yao, J. Deng, and J. Dewald. EEG-based classification for elbow versus shoulder torque intentions involving stroke subjects. Computers in Biology and Medicine, 39(5):443–452, 2009.
