
IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 62, NO. 9, SEPTEMBER 2015

Optimization of Single-Trial Detection of Event-Related Potentials Through Artificial Trials

Hubert Cecotti∗, Member, IEEE, Amar R. Marathe, and Anthony J. Ries

Abstract—Goal: Many brain–computer interface (BCI) classification techniques rely on a large number of labeled brain responses to create efficient classifiers. A large database representing all of the possible variability in the signal is impossible to obtain in a short period of time, and prolonged calibration times prevent efficient BCI use. We propose to improve BCIs based on the detection of event-related potentials (ERPs) in two ways. Methods: First, we increase the size of the training database by considering additional deformed trials. The creation of the additional deformed trials is based on the addition of Gaussian noise and on the variability of the ERP latencies. Second, we exploit the variability of the ERP latencies by combining decisions across multiple deformed trials. These new methods are evaluated on data from 16 healthy subjects participating in a rapid serial visual presentation task. Results: The results show a significant increase in the performance of single-trial detection with the addition of artificial trials and with the combination of decisions obtained from altered trials. When the number of trials available to train a classifier is low, the proposed approach allows us to improve performance from an AUC of 0.533 ± 0.080 to 0.905 ± 0.053. This improvement represents approximately an 80% reduction in classification error. Conclusion: These results demonstrate that artificially increasing the training dataset leads to improved single-trial detection. Significance: Calibration sessions can be shortened for BCIs based on ERP detection.

Index Terms—Brain–computer interface (BCI), event-related potentials (ERPs), signal detection, single-trial detection.

Manuscript received December 17, 2014; revised February 13, 2015; accepted March 20, 2015. Date of publication March 25, 2015; date of current version August 18, 2015. This work was supported in part by U.S. Army Prime Contract No. W911NF-09-D-0001 and in part by the Office of the Secretary of Defense ARPI program MIPR DWAM31168. Asterisk indicates corresponding author.

∗H. Cecotti is with the School of Computing and Intelligent Systems, University of Ulster, Londonderry BT52 1SA, U.K. (e-mail: [email protected]). A. R. Marathe and A. J. Ries are with the Human Research and Engineering Directorate, US Army Research Laboratory, Aberdeen Proving Ground.

Digital Object Identifier 10.1109/TBME.2015.2417054

I. INTRODUCTION

There is widespread interest in developing brain–computer interfaces (BCIs) for a variety of application domains such as communication, prosthetics, and visual target identification. In each of these cases, BCI usage is typically divided into two stages. The first stage is the calibration stage, where the system attempts to create a model that relates brain activity to different "actions" of the BCI system. While this model can be tuned and improved over time by analyzing the outputs from the application and/or monitoring the current neural activity [1], it is highly advantageous to start with a model that performs well. Upon completion of the calibration stage, the system moves into an operation stage where the model is used to detect brain responses, such as event-related potentials (ERPs), in real time in order to carry out the task at hand (e.g., spelling a word, moving a computer cursor).

The model created through the calibration stage estimates a function that aims to minimize the variability of ERP signals within one condition and to maximize the difference between ERP signals in different conditions. There are two primary approaches to creating this model. One is to use an adaptive method that estimates the evolution of the nonstationary neural activity over time. An alternative is to find a static model based on features that are invariant to the intraclass variability in the signal, e.g., after spatial filtering [2] or spatial-temporal classification [3], or by creating a classifier that is able to absorb all of the possible intraclass variability by using an appropriate training database.

BCIs require improvements in different areas to be more suitable for both healthy and disabled people [4]. Since the early work on the P300 speller [5], different improvements have been proposed for optimizing the classification of ERP responses [6], the number of repetitions of the visual stimuli [7], [8], the number of sensors [9], and the reduction of the calibration session [10]. The latter point is the focus of this study. A typical approach is to optimize the duration of a session to find the best tradeoff between the length of the calibration session and classifier accuracy. In some cases, a BCI is the only means of communication [11], and as such it is critical to develop a user-friendly system that provides a quick way to calibrate the system while still maintaining high accuracy. However, reducing the calibration time implies reducing the training data available for tuning a classifier. With less training data, the quality of the estimates produced by the model will likely degrade, which can lead to poor system performance.

One solution is to create a larger training set by adding transformed artificial training examples [12], [13]. The classifier then extracts the invariances from both the original training data and the artificially generated training data. A toy example is depicted in Fig. 1. The creation of artificial training examples increases the size of the database and has the advantage of being readily applicable to multiple classifiers. We propose to add deformed brain response signals to increase the size of the training database in order to create a model that is invariant to small signal deformations that may occur in each trial. This technique requires a priori knowledge of the problem, i.e., the relationships between input features. The approach has been used successfully in other classification problems such as handwritten character recognition, where it was observed that small deformations (rotations, shifts, small distortions) do not affect the label of the image [14]. The addition of deformed trials serves two purposes. First, it can increase the size of the database to improve the performance of the model; second, it can increase the possible variability that can be modeled by the classifier.



Fig. 1. Example with two bivariate normal distributions of 50 examples each. The ellipses represent the directions of the eigenvectors of the covariance matrix of each distribution. The dashed and solid lines correspond to the linear classification boundary obtained with the original points and with the addition of points shifted in both directions along the second dimension (x2), respectively.
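For readers who want to reproduce the toy example, a minimal sketch is given below. It is not taken from the paper: the class means, covariances, shift magnitude, and the use of scikit-learn's LDA are assumptions chosen purely to illustrate how augmenting the data with shifted points moves a linear decision boundary.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

# Two bivariate normal classes, 50 examples each (means/covariances are illustrative).
X0 = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=50)
X1 = rng.multivariate_normal([2.0, 1.0], [[1.0, 0.6], [0.6, 1.0]], size=50)
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)

# Baseline linear classifier on the original points.
base = LinearDiscriminantAnalysis().fit(X, y)

# Augmented set: every point is also shifted up and down along x2.
delta = np.array([0.0, 0.5])
X_aug = np.vstack([X, X + delta, X - delta])
y_aug = np.concatenate([y, y, y])
aug = LinearDiscriminantAnalysis().fit(X_aug, y_aug)

# The two fitted boundaries (w1*x1 + w2*x2 + b = 0) generally differ.
print("original :", base.coef_, base.intercept_)
print("augmented:", aug.coef_, aug.intercept_)
```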

Whereas it is relatively easy to estimate deformations of handwritten characters that do not change the label of the images, it is more challenging to estimate deformations that do not alter the discriminant features of brain responses, as their discriminant characteristics may not be directly observable. Despite this difficulty, studies in cognitive neuroscience reveal key characteristics of ERPs that can be exploited as prior information, indicating relationships between ERP characteristics (amplitude and latency) and behavioral traits [15]. For instance, the latency of the P300 is related to stimulus evaluation time [16] and reaction time, such that longer evaluation leads to slower reaction times and delayed P300 latency. Its peak latency is assumed to be proportional to stimulus evaluation time and is sensitive to task processing demands. In addition, latency can vary with individual differences in cognitive capability [17]. Therefore, it can be assumed that a robust single-trial detection classifier should be invariant to small time shifts.

In this paper, we show how time-shifted and noisy trials can be used to extend the training database in order to improve single-trial classification performance in a target detection task using a rapid serial visual presentation (RSVP) paradigm. In addition, we show how combining the decisions obtained from shifted trials can increase the reliability of the detection when jitter exists between the signal features and the stimulus onsets.

The remainder of this paper is organized as follows. The methods and experimental protocol are detailed in Section II. The artificial deformations of the signal are proposed in Section II-E. Finally, the results are analyzed and discussed in Sections III and IV.

II. METHODS

A. Subjects

Eighteen participants volunteered for this study. Participants provided written informed consent, reported normal or corrected-to-normal vision, and reported no history of neurological problems. Due to excessive noise artifacts in the EEG, two participants were excluded from analysis. The resulting 16 participants had an average age of 33.5 years (13 males, 15 right-handed).


Fig. 2. (a) RSVP task. (b) Representative examples of stimuli on target (bottom) and nontarget (top) trials. The inset highlighting the target is shown here for illustration purposes and did not appear in the actual stimuli.

The voluntary, fully informed consent of the persons used in this research was obtained as required by federal and Army regulations. The investigator has adhered to Army policies for the protection of human subjects [18], [19].

B. Visual Stimuli and Procedure

Participants were seated 75 cm from a Dell P2210 monitor and viewed a series of simulated images of a desert metropolitan environment in a rapid serial visual presentation (RSVP) paradigm [see Fig. 2(a)]. Images (960 × 600 pixels, 96 dpi, subtending 36.3° × 22.5°) were presented using E-Prime software on a Dell Precision T7400 PC. Images were presented for 500 ms (2 Hz) with no interstimulus interval. Images contained either a scene without any people (nontarget) or a scene with a person holding a gun (target). A total of 110 target images and 1346 nontarget images were presented to each participant. Scenes in which a target appeared were also presented without the person in the nontarget condition. All stimuli appeared within 6.5° of the center of the monitor. The goal of the task was to classify target images from nontarget images. Behavioral analysis was conducted on a session in which subjects responded to target stimuli by pressing a key while also counting the number of target images. Single-trial detection was conducted on a second session in which the subjects only had to count the number of target images.

C. Signal Acquisition

Electrophysiological recordings were digitally sampled at 1024 Hz from 64 scalp electrodes arranged in a 10–10 montage using a BioSemi ActiveTwo system (Amsterdam, The Netherlands). Impedances were kept below 25 kΩ. External leads were placed on the outer canthus of both eyes and above and below the right orbital fossa to record the EOG.
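As an aside, recordings acquired with such a setup can be loaded with standard tools. The sketch below uses MNE-Python as an assumed toolbox; the file name and the "EXG" naming of the external leads are hypothetical and only serve to illustrate the acquisition parameters described above.

```python
import mne

# Hypothetical file name; BioSemi ActiveTwo recordings are stored as .bdf files.
raw = mne.io.read_raw_bdf("subject01.bdf", preload=True)

# Mark the external leads (named EXG* on BioSemi systems, an assumption here) as EOG.
raw.set_channel_types({ch: "eog" for ch in raw.ch_names if ch.startswith("EXG")})

print(raw.info["sfreq"])  # 1024 Hz for the recordings described above
```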



D. Temporal and Spatial Filtering

The brain response to infrequent target stimuli contains several distinguishing features, including the P300 and N200. To capture these features, the signal is often filtered and downsampled (decimated) to emphasize the discriminating components. Decimation was carried out through standard MATLAB functions, and filtering used fourth-order Butterworth filters. The following preprocessing steps were evaluated: bandpass filters of [1–42.66 Hz], [1–21.33 Hz], and [1–10.66 Hz], and decimation factors of 8, 16, and 32. The combinations of bandpass filter and decimation factor were limited such that the upper cutoff of the bandpass filter was at most a third of the sampling rate obtained after decimation. After preprocessing, the signal was epoched from stimulus onset to 640 ms after stimulus onset for subsequent analysis.

The next step consisted of enhancing the relevant signal using the xDAWN spatial filtering approach [9], [20], [21]. In this method, spatial filters are obtained through a Rayleigh quotient by maximizing the signal-to-signal plus noise ratio (SSNR), where the signal corresponds to the information contained in the ERPs evoked by the presentation of a target. The result of this process is a set of Nf spatial filters, ranked in terms of their SSNR. The enhanced signal XU is composed of three terms: the ERP response of the target class (D_1 A_1), a response common to all stimuli, i.e., to both targets (images with a person) and nontargets (images without a person) (D_2 A_2), and the residual noise (H), all filtered spatially by U:

    X U = (D_1 A_1 + D_2 A_2 + H) U    (1)

where {D_1, D_2} ∈ R^{N_t × N_1} are two Toeplitz matrices, N_1 is the number of sampling points representing the target and superimposed evoked potentials (640 ms), and H ∈ R^{N_t × N_s}. The spatial filters U are those that maximize the SSNR:

    \mathrm{SSNR}(U) = \frac{\mathrm{Tr}(U^T \hat{A}_1^T D_1^T D_1 \hat{A}_1 U)}{\mathrm{Tr}(U^T X^T X U)}    (2)

where \hat{A}_1 represents the least mean square estimate of A_1:

    \hat{A} = [\hat{A}_1 ; \hat{A}_2] = ([D_1 ; D_2]^T [D_1 ; D_2])^{-1} [D_1 ; D_2]^T X    (3)

where [D_1 ; D_2] ∈ R^{N_t × (N_1 + N_2)} is obtained by concatenation of D_1 and D_2, and Tr(·) denotes the trace operator.
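To make the pipeline concrete, the following Python sketch (not the authors' MATLAB implementation) applies the Butterworth bandpass filtering, decimation, and epoching described above, and then estimates xDAWN-style spatial filters by turning the Rayleigh quotient of (2) into a generalized eigenvalue problem. The function names, the Toeplitz construction, and the least-squares/eigendecomposition route are assumptions made for illustration.

```python
import numpy as np
from scipy.signal import butter, filtfilt, decimate
from scipy.linalg import eigh

def preprocess(raw, fs, band=(1.0, 42.66), dec=8):
    """Fourth-order Butterworth bandpass followed by decimation.
    raw: continuous EEG, shape (n_samples, n_channels); fs: sampling rate in Hz."""
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, raw, axis=0)
    return decimate(filtered, dec, axis=0, zerophase=True), fs / dec

def epochs_from_onsets(x, onsets, n1):
    """Cut epochs of n1 samples starting at each stimulus onset (onsets in samples)."""
    return np.stack([x[o:o + n1] for o in onsets])

def toeplitz_design(onsets, n_total, n1):
    """Toeplitz indicator matrix D (n_total x n1): column k marks the sample k steps after each onset."""
    D = np.zeros((n_total, n1))
    for o in onsets:
        for k in range(n1):
            if o + k < n_total:
                D[o + k, k] = 1.0
    return D

def xdawn_filters(x, target_onsets, all_onsets, n1, n_filters=4):
    """Spatial filters maximizing the SSNR, following Eqs. (1)-(3).
    x: preprocessed continuous EEG, shape (n_total, n_channels)."""
    n_total = x.shape[0]
    D1 = toeplitz_design(target_onsets, n_total, n1)   # target responses
    D2 = toeplitz_design(all_onsets, n_total, n1)      # response common to all stimuli
    D = np.hstack([D1, D2])
    A_hat = np.linalg.lstsq(D, x, rcond=None)[0]       # least squares estimate, Eq. (3)
    A1_hat = A_hat[:n1]
    num = A1_hat.T @ D1.T @ D1 @ A1_hat                # numerator of the Rayleigh quotient
    den = x.T @ x                                      # denominator of the Rayleigh quotient
    vals, vecs = eigh(num, den)                        # generalized eigenvalue problem
    return vecs[:, np.argsort(vals)[::-1][:n_filters]] # U: (n_channels, n_filters)
```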

E. Transformations

We consider a trial X_1 ∈ R^{N_1 × N_f} for the response corresponding to the presentation of a visual stimulus (target or nontarget), where N_1 is the number of sampling points. We denote by s_{i,j} the standard deviation of the feature X_1(i, j) across all of the trials of its corresponding class (target or nontarget). We propose two deformations to create additional trials in the training database.

In the first deformation, F1, we create a random deformation vector V of size N_1 × 1, with values between −1 and 1. The deformation vector is filtered with a Gaussian filter (n = 3, σ = 4). Then, the vector is amplified by s_{i,j} on each feature, and the resulting vector is added to X_1. The resulting EEG trial is as follows:

    Y(i, j) = X_1(i, j) + s_{i,j} \cdot V_i    (4)
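A minimal sketch of the F1 noise deformation is given below, assuming numpy/scipy and a trial stored as an N1 × Nf array; scipy's gaussian_filter1d is used here as an approximation of the (n = 3, σ = 4) Gaussian filter mentioned above.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def noise_deformation(trial, class_std, sigma=4.0, rng=None):
    """F1: add smoothed random noise scaled by the class-wise feature std, as in Eq. (4).
    trial: (N1, Nf) epoch; class_std: (N1, Nf) std of each feature across all
    training trials of the same class (target or nontarget)."""
    rng = np.random.default_rng() if rng is None else rng
    v = rng.uniform(-1.0, 1.0, size=trial.shape[0])  # deformation vector V, values in [-1, 1]
    v = gaussian_filter1d(v, sigma)                   # smooth V with a Gaussian filter
    return trial + class_std * v[:, None]             # Y(i, j) = X1(i, j) + s_ij * V_i
```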

Fig. 3. EEG cap with the placement of the 64 electrodes. Grand average neural activity from the electrodes highlighted in gray is shown in Fig. 4.

The second deformation, F2, is simply a shift by several sampling points of all the trials from the training database. The resulting EEG trial is

    Y(i, j) = X_1(i + \Delta t, j)  (right shift in time)    (5)

    Y(i, j) = X_1(i - \Delta t, j)  (left shift in time)    (6)

where Δt is the number of sampling points by which the signal is shifted in the time domain. Hence, the size of the training database is multiplied by 2 for F1 and by 3 for F2. For reference, a shift of one point represents a shift of 7.81, 15.62, and 31.25 ms in the signal for sampling frequencies of 128, 64, and 32 Hz, respectively.
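In the same spirit, the F2 shift and the resulting threefold expansion of the training set could be sketched as follows; re-epoching the shifted trials from the continuous recording would be more faithful than the simple edge padding used here, which is an assumption made for brevity.

```python
import numpy as np

def shift_deformation(trial, dt):
    """F2: shift the epoch by dt sampling points (dt > 0 and dt < 0 give the two directions).
    Samples shifted in from outside the epoch are filled by repeating the edge value."""
    shifted = np.roll(trial, dt, axis=0)
    if dt > 0:
        shifted[:dt] = trial[0]
    elif dt < 0:
        shifted[dt:] = trial[-1]
    return shifted

def augment_with_shifts(epochs, labels, dt):
    """Add a left- and a right-shifted copy of every trial (training set size x3 for F2)."""
    shifted_right = np.stack([shift_deformation(e, +dt) for e in epochs])
    shifted_left = np.stack([shift_deformation(e, -dt) for e in epochs])
    X_aug = np.concatenate([epochs, shifted_right, shifted_left])
    y_aug = np.concatenate([labels, labels, labels])
    return X_aug, y_aug
```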

F. Classifiers

The first four spatial filters generated by xDAWN were used as inputs for the classification (Nf = 4). Bayesian linear discriminant analysis (BLDA) [22], [23] was used for the binary classification of the evoked brain responses corresponding to the presentation of target versus nontarget images. Performance was evaluated with a fivefold cross-validation procedure under two conditions. In the first condition (C1), one block is used for testing and the remaining four blocks are used for training. In the second condition (C2), four blocks are used for testing and one block is used for training. In both cases, the process is repeated five times such that each block is used once for testing (C1) or training (C2). Classifier performance is assessed using the area under the receiver operating characteristic curve (AUC) [24]. The results are presented for conditions C1 and C2. We also compare performance when the classification is done with and without spatial filters.
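The block-wise cross-validation and AUC evaluation might be sketched as below. Ordinary LDA from scikit-learn is used as a stand-in for BLDA, and the assignment of trials to five blocks is assumed to be given; neither detail reflects the authors' exact implementation.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import roc_auc_score

def evaluate_condition(epochs, labels, blocks, train_on_one_block=True, augment=None):
    """Fivefold block cross-validation.
    train_on_one_block=True corresponds to condition C2 (train on 1 block, test on 4);
    False corresponds to C1. `augment` optionally maps (X, y) -> (X_aug, y_aug)."""
    aucs = []
    for b in np.unique(blocks):
        train = (blocks == b) if train_on_one_block else (blocks != b)
        X_tr, y_tr = epochs[train], labels[train]
        X_te, y_te = epochs[~train], labels[~train]
        if augment is not None:
            X_tr, y_tr = augment(X_tr, y_tr)        # e.g., add shifted and/or noisy trials
        clf = LinearDiscriminantAnalysis()
        clf.fit(X_tr.reshape(len(X_tr), -1), y_tr)  # flatten the (N1 x Nf) features
        scores = clf.predict_proba(X_te.reshape(len(X_te), -1))[:, 1]
        aucs.append(roc_auc_score(y_te, scores))
    return float(np.mean(aucs))
```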



Fig. 4. Grand average of the ERP waveforms corresponding to the presentation of target (bold line) and nontarget stimuli. The gray area corresponds to the envelope ± the standard error of the ERP waveform corresponding to the presentation of targets. The electrodes are shown in gray in Fig. 3.

G. Decision Combination

As a final test, we examined the effect of combining classifier decisions across several time-shifted versions of an ERP trial. There are a number of methods in the literature for combining signals and/or decisions from different sources [25], [26]. In this paper, we limit ourselves to strategies applied to binary classification decisions. In particular, we wanted to compare several methods for combining the decisions of a single classifier applied to different input signals. For our purposes, the input signals were multiple time-shifted representations of a single trial. We used the original trial along with two additional trials representing a positive and a negative time shift of the same magnitude (e.g., time lags of 0, −15.62, and +15.62 ms). The classifier described previously is applied to each input to produce a score (between 0 and 1) whose magnitude reflects the confidence with which that trial can be classified into either class. The individual scores are then combined using one of three methods. The mean method averages the individual scores to create an aggregate score. Likewise, the median method takes the median of the individual scores. The min/max method sets the aggregate score to the maximum individual score if the mean of the individual scores is above 0.5, and to the minimum individual score if the mean is below 0.5. In all three cases, aggregate scores above 0.5 indicate a target trial, and aggregate scores below 0.5 indicate a nontarget trial. The decision combination methods are compared against a null condition in which the classifier is applied to the original trial data without a time shift.
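The three combination rules can be written compactly. The sketch below assumes that the per-trial classifier scores for the original and the ±Δt shifted versions of a trial have already been computed.

```python
import numpy as np

def combine_scores(scores, method="mean"):
    """Combine classifier scores obtained from time-shifted versions of one trial.
    scores: iterable of values in [0, 1], e.g., for shifts (-dt, 0, +dt)."""
    s = np.asarray(scores, dtype=float)
    if method == "mean":
        agg = s.mean()
    elif method == "median":
        agg = np.median(s)
    elif method == "minmax":
        agg = s.max() if s.mean() > 0.5 else s.min()
    else:
        raise ValueError(f"unknown method: {method}")
    return agg, agg > 0.5  # aggregate score and target/nontarget decision

# Example: scores for the -15.62 ms, 0 ms, and +15.62 ms versions of one trial.
print(combine_scores([0.62, 0.55, 0.41], method="minmax"))
```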

Fig. 5. Spatial distribution of the P300 peak amplitude and peak amplitude latency.

III. RESULTS

A. Behavioral Performance

The mean reaction time (RT) across subjects for the session with a manual response was 530.83 ± 85.67 ms. The hit rate was 96.15 ± 4.49% when considering a time window of up to 1.5 s after target onset. The standard deviation of the RT for each subject is used in the next section as a parameter to determine the potential shifts.

B. Event-Related Potentials

Fig. 4 depicts the grand average ERP waveforms from the electrodes Fz, Cz, Pz, Oz, P8, and P7 for the target and nontarget stimuli. The amplitude and latency of the target P300 are depicted in Fig. 5. The amplitude corresponds to the maximum value from 300 to 600 ms after stimulus onset. The figure represents the mean (across subjects) and the standard deviation (SD) across trials. It shows that the P300 typically peaks around 450–500 ms. More importantly, the standard deviation of the P300 peak latency is about 80 ms over the parietal/occipital electrodes, which is similar to the standard deviation found in the RT.

Fig. 6. AUC as a function of the shifted trials included in the training database.

Since the standard deviation of the latency increases toward anterior electrodes, this suggests that the ERP waveform, and therefore the response, is more tightly time locked over the parietal/occipital area.

C. Single-Trial Detection

Fig. 6 depicts the AUC obtained with different low-pass filtering and decimation parameters, as a function of the shifted signals (left and right) included in the database, for condition C2. The best performance, AUC = 0.922, is obtained with a sampling rate of 128 Hz, a low-pass cutoff frequency set to 42.66 Hz, and signals shifted by ±31.25 ms. First, these results show that the addition of shifted signals can improve the AUC, but the addition of signals with a shift greater than 75 ms has a negative impact on the performance.



Fig. 7. AUC as a function of the shifted trials included in the training database.

Fig. 8. AUC as a function of σ in the distribution of the shifts.

Based on these results, subsequent analyses and tests were performed on signals downsampled to 128 Hz with a bandpass filter of [1–42.66 Hz].

Adding deformed trials to the training set improves performance when the training set contains a small number of trials (see Fig. 7). When the training set contains ten examples of target stimuli, it is possible to reach an AUC of 0.854 with the addition of time-shifted trials, while the addition of noise and the default case (no additional trials) provide performance barely above chance level, with AUCs of 0.532 and 0.519, respectively. When 20 target trials are present in the training set, the addition of time-shifted examples or noisy examples both achieve high accuracy, while the performance of the null condition remains around chance. When 60 target trials are available in the training set, the performance of the three methods is equivalent, with an AUC of about 0.935. With the proposed method, only 30 target trials are required for an AUC above 0.9 (AUC = 0.923).

In the previous tests, the amount of data for each of the possible shifts was the same, i.e., if there is a shift of up to n time points, then there are 2n possible shifts (left and right), and the size of the database increases by 2n times the size of the initial database. In Fig. 8, we represent the impact of the distribution of the different shifts when it follows a (2n + 1)-point Gaussian window with zero mean and a standard deviation σ. When σ has a low value, the number of shifted examples is low. The results suggest that the choice of σ is not critical, and that it is better to use the same number of examples for every possible shift.

The utility of including deformations in the training set depends on the amount of real data available. The AUC for conditions C1 and C2 is presented in Fig. 10. With a large training set (condition C1), the number of real trials available is sufficient to erase the differences across the training methods. With no additional examples, the AUC is 0.944 ± 0.036. The AUC is 0.940 ± 0.039, 0.944 ± 0.035, and 0.940 ± 0.038 for the additional examples created with the noise deformation, the time shift, and the combination of both, respectively. A Friedman test indicates a significant difference across conditions (p < 10^-2). Post hoc analysis with Wilcoxon signed-rank tests, with a Bonferroni correction, reveals a difference between the default condition and the additional examples created with noise (p < 10^-3), and between noise and shift (p < 10^-2), suggesting that the addition of noise has a negative contribution to the overall performance. However, with limited training data (condition C2), the AUC is barely above chance level (0.533 ± 0.080) when no deformations are included. When deformations are included, the AUC is 0.884 ± 0.060, 0.905 ± 0.053, and 0.900 ± 0.055 for the additional examples created with noise, time shift, and time shift+noise, respectively. A Friedman test indicates a clear difference across conditions (p < 10^-8). Post hoc analysis with Wilcoxon signed-rank tests, with a Bonferroni correction, shows that there is no difference between the additional examples created with time shift and with time shift+noise. Time shift and time shift+noise are both better than the default condition and the condition with noisy examples only.

Next, due to the emerging prevalence of wireless EEG amplifiers and other devices with low timing resolution, EEG signals cannot always be precisely time locked to a specific stimulus onset [27]. Thus, rather than assume high precision, we tested the performance of a classifier that assumes some amount of jitter. To do this, we compared single-trial classification performance when multiple decisions were combined across three time-shifted ERP signals (Nshift = 2, {−Δt; 0; +Δt}) that are meant to account for the jitter in the signal. Fig. 9 depicts the AUC when different amounts of jitter are included in the combined decision. With a time shift of 46.87 ms in the stimulus onset, the AUC is 0.909, 0.917, 0.914, and 0.918 for the null, mean, median, and min/max methods, respectively. A Friedman test confirms a significant difference across methods (p < 10^-5) when there is a shift of 46.87 ms. Post hoc analysis with Wilcoxon signed-rank tests, with a Bonferroni correction, shows that the null method is the worst, and that there is no difference between mean and min/max, which are both better than the median (p < 10^-5 when there is a difference). These results show that the combination of decisions from shifted examples, when there is a shift in the stimulus onsets, can improve performance.

D. Effect of the Classifier

In order to examine the generalizability of this approach, we compared the effect of the addition of artificial trials in condition C2 on four classifiers: SVM with a linear kernel, SVM with a radial basis function (RBF) kernel, BLDA, and LDA, each with and without spatial filters obtained with the xDAWN framework, resulting in eight approaches.


Fig. 9. AUC as a function of a shift in the stimulus onsets. The error bars represent the standard deviation across subjects.


Fig. 11. Comparison of classification performance (AUC) of the default training method and the method with artificial examples in the training database for various classification approaches, in condition C2. The error bars represent the standard error across subjects. The ∗ symbol indicates classifiers for which there is a significant improvement with the addition of the new artificial examples.

The mean AUC for the default training condition and for the condition with artificial examples is depicted in Fig. 11. In the default condition, SVM classifiers outperform the LDA-based classifiers with the limited training set. However, when the artificial trials are included in the training set, the SVM classifiers actually decrease in performance, whereas the LDA-based classifiers all show a significant improvement (Wilcoxon signed-rank test, Bonferroni correction, p < 10^-2). Across all classifiers, with and without the artificial trials, the best performance is obtained with xDAWN+BLDA (AUC = 0.905) when artificial trials are included in the training set. It is unclear why the artificial trials are unable to improve classification with the SVM-based classifiers. Nevertheless, there is an improvement with artificial trials from AUC = 0.804 to 0.817 with xDAWN+linear SVM when the processed signal is downsampled to 64 Hz (p < 10^-2).
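The statistical recipe used throughout this section (a Friedman test across conditions followed by pairwise Wilcoxon signed-rank tests with a Bonferroni correction) can be reproduced with scipy; the per-subject AUC values below are hypothetical placeholders, not the study's data.

```python
import numpy as np
from itertools import combinations
from scipy.stats import friedmanchisquare, wilcoxon

def compare_conditions(auc_by_condition, alpha=0.05):
    """auc_by_condition: dict mapping condition name -> array of per-subject AUCs."""
    names = list(auc_by_condition)
    stat, p = friedmanchisquare(*[auc_by_condition[n] for n in names])
    print(f"Friedman test: chi2 = {stat:.2f}, p = {p:.2e}")
    pairs = list(combinations(names, 2))
    corrected_alpha = alpha / len(pairs)  # Bonferroni correction
    for a, b in pairs:
        _, p_pair = wilcoxon(auc_by_condition[a], auc_by_condition[b])
        flag = "significant" if p_pair < corrected_alpha else "n.s."
        print(f"{a} vs {b}: p = {p_pair:.2e} ({flag})")

# Hypothetical per-subject AUCs for 16 subjects under three training conditions.
rng = np.random.default_rng(1)
compare_conditions({
    "default": rng.uniform(0.45, 0.65, 16),
    "noise": rng.uniform(0.80, 0.95, 16),
    "time shift": rng.uniform(0.82, 0.97, 16),
})
```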

Fig. 10. AUC for the conditions C1 and C2, with the different training conditions. The error bars represent the standard deviation across folds (subjects 1 to 16) and across subjects (for the mean).

Furthermore, these results also provide insight into the importance of spatial filtering. Each of the SVM classifiers shows improved performance when xDAWN is applied, both with and without artificial trials. It is worth noting that in the current implementation, the spatial filters are calculated only on the original training set. In separate testing, spatial filters were estimated using both the original training set and the artificial trials, yet no change in performance was observed with this approach. The local invariance is therefore only useful for training the classifier, and not for the estimation of the spatial filters.

IV. DISCUSSION

Despite recent progress in EEG signal acquisition, such as wireless amplifiers and dry electrodes, the use of brain–computer or brain–machine interfaces remains highly idiosyncratic to the subject. While the combination of classifiers based on previous subjects and adaptive learning strategies can partially address this problem, a calibration session is often necessary to provide an efficient model for the classification of particular brain responses such as ERPs. BCIs are mainly aimed at disabled people as a means of communication, and they ideally require short calibration sessions and accurate prediction to be practically used. A BCI is a practical solution for communication for locked-in patients, but for other groups of patients who can use an eye tracker or other adaptive joysticks, a BCI should satisfy both performance and usability constraints.

With a large amount of data, efficient machine learning techniques can capture high-level feature representations in order to achieve high performance [28]. These methods typically require nonlinear classifiers with deep architectures. However, in many classification problems, the amount of available labeled data is limited, and methods that are efficient with a large training database may suffer from a reduction of the data or may not outperform simpler approaches [29]. We have shown that significant classification performance can still be achieved when the training database is small. We have shown that a minimum of 60 images of the target class should be present in the training database in order to reach a plateau with a state-of-the-art method.



In the RSVP task, the plateau was at an AUC of about 0.910. With the addition of deformed trials, the performance can be significantly above chance level (AUC = 0.854) with only ten images corresponding to the presentation of a target, whereas the AUC barely reaches chance level with only the nondeformed labeled trials in the training database. Due to the nature of the RSVP experimental paradigm, the number of targets can be directly related to the duration of the experiment. As the target probability is typically low in RSVP tasks (less than 20%), the duration is significantly impacted by the addition of new targets. By decreasing the number of targets needed in the training set, the proposed method can decrease the calibration time in RSVP-based BCIs. The same type of improvement can be expected in other ERP-based BCIs such as the P300 speller [30]. This approach may be complemented with the addition of trials from other subjects, or with new trials if the system embeds incremental learning and/or semisupervised learning techniques [31].

V. CONCLUSION

A new method has been proposed to improve the performance of single-trial detection during a rapid serial visual presentation task involving realistic images. The method has two main contributions. First, it uses deformed ERP signals to increase the size of the training database, which leads to a significant improvement of the classification performance relative to a smaller database of nondeformed trials. While the addition of artificial trials can be adapted to other classifiers, the local shift invariance of the ERPs could also be included directly in the classifiers (e.g., with a particular kernel for SVM [32]). Second, when jitter is present in the stimulus onsets, exploiting shifted ERP responses may increase overall classification accuracy.

REFERENCES

[1] Q. Zhao et al., "Incremental common spatial pattern algorithm for BCI," in Proc. IEEE Int. Joint Conf. Neural Netw., 2008, pp. 2656–2659.
[2] W. Samek et al., "Stationary common spatial patterns for brain computer interfacing," J. Neural Eng., vol. 9, no. 2, art. no. 026013 (14 pages), Apr. 2012, doi: 10.1088/1741-2560/9/2/026013.
[3] Y. Zhang et al., "Spatial-temporal discriminant analysis for ERP-based brain–computer interface," IEEE Trans. Neural Syst. Rehab. Eng., vol. 21, no. 2, pp. 233–234, Feb. 2013.
[4] J. R. Wolpaw, "Brain-computer interface research comes of age: Traditional assumptions meet emerging realities," J. Motor Behavior, vol. 42, no. 6, pp. 351–353, 2010.
[5] L. Farwell and E. Donchin, "Talking off the top of your head: Toward a mental prosthesis utilizing event-related brain potentials," Electroencephalogr. Clin. Neurophysiol., vol. 70, pp. 510–523, 1988.
[6] H. Cecotti and A. Gräser, "Convolutional neural networks for P300 detection with application to brain-computer interfaces," IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 3, pp. 433–445, Mar. 2011.
[7] M. Schreuder et al., "Optimizing event-related potential based brain-computer interface: A systematic evaluation of dynamic stopping methods," J. Neural Eng., vol. 10, no. 3, art. no. 036025 (13 pages), Jun. 2013, doi: 10.1088/1741-2560/10/3/036025.
[8] H. Zhang et al., "Asynchronous P300-based brain-computer interface: A computational approach with statistical models," IEEE Trans. Biomed. Eng., vol. 55, no. 6, pp. 1754–1763, Jun. 2008.

[9] H. Cecotti et al., "A robust sensor selection method for P300 brain-computer interfaces," J. Neural Eng., vol. 8, no. 1, art. no. 016001 (12 pages), Feb. 2011, doi: 10.1088/1741-2560/8/1/016001.
[10] B. Rivet et al., "Adaptive training session for a P300 speller brain-computer interface," J. Physiol. (Paris), vol. 105, no. 1–3, pp. 123–129, 2011.
[11] E. W. Sellers et al., "A brain-computer interface for long-term independent home use," Amyotrophic Lateral Sclerosis, vol. 11, no. 5, pp. 449–455, 2010.
[12] H. Baird, "Document image defect models," in Proc. IAPR Workshop Syntactic Struct. Pattern Recog., 1990, pp. 38–46.
[13] P. Simard et al., "Tangent prop - a formalism for specifying selected invariances in an adaptive network," in Proc. Adv. Neural Inform. Process. Syst., 1991, pp. 895–903.
[14] P. Y. Simard et al., "Best practices for convolutional neural networks applied to visual document analysis," in Proc. 7th Int. Conf. Document Anal. Recog., 2003, pp. 958–962.
[15] J. Polich and A. Kok, "Cognitive and biological determinants of P300: An integrative review," Biol. Psychol., vol. 41, pp. 103–146, 1995.
[16] M. Kutas et al., "Augmenting mental chronometry: The P300 as a measure of stimulus evaluation time," Science, vol. 197, no. 4305, pp. 792–795, 1977.
[17] J. Polich, "Updating P300: An integrative theory of P3a and P3b," Clin. Neurophysiol., vol. 118, pp. 2128–2148, 2007.
[18] "Use of volunteers as subjects of research," U.S. Department of the Army, Washington, DC, USA, Tech. Rep. AR 70-25, 1990.
[19] "Code of federal regulations, protection of human subjects," U.S. Department of Defense, Office of the Secretary of Defense, Washington, DC, USA, Tech. Rep. 32 CFR 219, 1999.
[20] B. Rivet et al., "xDAWN algorithm to enhance evoked potentials: Application to brain-computer interface," IEEE Trans. Biomed. Eng., vol. 56, no. 8, pp. 2035–2043, Aug. 2009.
[21] B. Rivet and A. Souloumiac, "Optimal linear spatial filters for event-related potentials based on a spatio-temporal model: Asymptotical performance analysis," Signal Process., vol. 93, no. 2, pp. 387–398, 2013.
[22] D. J. C. MacKay, "Bayesian interpolation," Neural Comput., vol. 4, no. 3, pp. 415–447, 1992.
[23] U. Hoffmann et al., "An efficient P300-based brain-computer interface for disabled subjects," J. Neurosci. Methods, vol. 167, no. 1, pp. 115–125, 2008.
[24] T. Fawcett, "An introduction to ROC analysis," Pattern Recog. Lett., vol. 27, pp. 861–874, 2006.
[25] L. I. Kuncheva, Combining Pattern Classifiers: Methods and Algorithms. New York, NY, USA: Wiley, 2004.
[26] G. Fumera and F. Roli, "Performance analysis and comparison of linear combiners for classifier fusion," in Proc. Joint IAPR Int. Workshop Struct. Syntactic Statist. Pattern Recog., 2002, pp. 424–432.
[27] A. J. Ries et al., "A comparison of electroencephalography signals acquired from conventional and mobile systems," J. Neurosci. Neuroeng., vol. 3, no. 1, pp. 10–20, 2014.
[28] D. Cireşan et al., "Multi-column deep neural networks for image classification," in Proc. Comput. Vision Pattern Recog., 2012, pp. 3642–3649.
[29] D. J. Krusienski et al., "A comparison of classification techniques for the P300 speller," J. Neural Eng., vol. 3, pp. 299–305, 2006.
[30] H. Cecotti et al., "Single-trial classification of event-related potentials in rapid serial visual presentation tasks using supervised spatial filtering," IEEE Trans. Neural Netw. Learning Syst., vol. 25, no. 11, pp. 2030–2042, Nov. 2014.
[31] Y. Li et al., "A self-training semi-supervised SVM algorithm and its application in an EEG-based brain computer interface speller system," Pattern Recog. Lett., vol. 29, no. 9, pp. 1285–1294, 2008.
[32] D. DeCoste and B. Schölkopf, "Training invariant support vector machines," Mach. Learning, vol. 46, no. 1–3, pp. 161–190, 2002.

Authors’ photographs and biographies not available at the time of publication.
