This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
A Novel Algorithm for Learning Sparse Spatio-Spectral Patterns for Event-Related Potentials Chaohua Wu, Student Member, IEEE, Ke Lin, Wei Wu, Member, IEEE, and Xiaorong Gao, Member, IEEE
Abstract— Recent years have witnessed brain–computer interface (BCI) as a promising technology for integrating human intelligence and machine intelligence. Currently, event-related potential (ERP)-based BCI is an important branch of noninvasive electroencephalogram (EEG)-based BCIs. Extracting ERPs from a limited number of trials remains challenging due to their low signal-to-noise ratio (SNR) and low spatial resolution caused by volume conduction. In this paper, we propose a probabilistic model for trial-by-trial concatenated EEG, in which the concatenated ERPs are expressed as a linear combination of a set of discrete sine and cosine bases. The bases are simply determined by the data length of a single trial. A sparse prior on the rank of the spatio-spectral pattern matrix is introduced into the model to allow the number of components to be automatically determined. A maximum a posteriori estimation algorithm based on cyclic descent is then developed to estimate the spatio-spectral patterns. A spatial filter can then be obtained by maximizing the SNR of the ERP components. Experiments on both synthetic data and real N170 ERP data from 13 subjects were conducted to test the efficacy and efficiency of the algorithm. The results showed that the proposed algorithm can estimate the ERPs more accurately than several state-of-the-art algorithms.
Index Terms— Convex optimization, event-related potential (ERP), N170, nuclear norm, regularization, sparse learning.
I. INTRODUCTION
EVENT-RELATED potentials (ERPs) are the stereotypical brain electrical responses to an internal or external stimulus. ERPs are measured with an electroencephalogram (EEG), which is one of the most popular noninvasive technologies for detecting human brain activities. Because of their high temporal resolution, ERPs have been widely used in cognitive neuroscience and psychology research, and are particularly attractive when investigators are interested in studying fast brain responses (typically within 1 s). In recent years,
Manuscript received December 14, 2014; revised July 20, 2015 and October 9, 2015; accepted October 22, 2015. This work was supported in part by the Chinese 863 Project under Grant 2012AA011601 and in part by the National Natural Science Foundation of China under Grant 61403144, Grant 61431007, Grant 91220301 and Grant 91320202. C. Wu, K. Lin, and X. Gao are with the Department of Biomedical Engineering, Tsinghua University, Beijing 100084, China (e-mail:
[email protected];
[email protected]; gxr-dea@tsinghua.edu.cn). W. Wu is with the College of Automation Science and Engineering, South China University of Technology, Guangzhou 510641, China, and also with the Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA 94305 USA (e-mail:
[email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TNNLS.2015.2496284
2162-237X © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
ERPs have also been employed as information carriers in brain–computer interfaces (BCIs). In general, an ERP wave contains multiple components, which are named according to their polarities and latencies (e.g., P100, N170, and P300). These components often indicate the underlying neural activities and can be used to study the brain’s responses to different stimuli. For example, P300 can be observed in an oddball paradigm: its amplitude increases when a low-probability stimulus appears. Another example is N170, an ERP component that is sensitive to human faces. The amplitude of N170 evoked by a face is larger than that evoked by a nonface object. The amplitude difference of the ERP can be used to generate control commands in a BCI system, e.g., the P300-based BCI, which has become an important BCI paradigm over the past decade [1]. Recently, Kaufmann et al. [2], [3] used familiar faces instead of characters in a P300 speller to improve its performance. Zhang et al. [4] compared several types of stimuli in ERP-based BCI systems and determined that inverted faces resulted in the highest accuracy. In addition to the amplitude, the spectral power of ERPs in a certain frequency band is another useful feature in neurophysiology research [5]. In all these applications, accurate extraction of ERP components is of tremendous importance.

However, extracting ERPs from the scalp-recorded EEG is not trivial. On the one hand, ERPs have a very low signal-to-noise ratio (SNR). Their amplitude is often much smaller than that of spontaneous EEG. They may also suffer from artifacts due to muscle activities, eye movements, and electrode polarization potentials. To reduce noise, temporal filtering techniques [6], [7] and trial averaging are widely used. Such methods generally require tens or even hundreds of trials. On the other hand, ERPs have a low spatial resolution due to volume conduction. As a result, the scalp EEG is a mixture of activities from multiple brain sources.

To deal with the above problems, spatial filtering techniques have been introduced into ERP analysis and have proved to be helpful [8]. Because ERPs, spontaneous EEG activities, and other noises usually exhibit distinct spatial patterns corresponding to different sources, a well-designed spatial filter can be used to extract an ERP component by removing the irrelevant neural and non-neural activities. For example, techniques such as principal component analysis (PCA) and independent component analysis (ICA) have been widely used in EEG analysis [9], [10]. For ERP analysis, ICA with regularization was proposed to enhance the SNR of a phase-locked response [11]. However, because ICA and its variants cannot determine the number of ERP components, substantial effort is required to select the components of interest from a large number of candidates.

Instead of separating EEG into statistically independent components, some algorithms aim to maximize the SNR of ERPs. The max-SNR beamformer requires EEG data both with and without ERP (e.g., P300 for target and nontarget stimuli) to derive the spatial filter that maximizes the SNR of the ERP by solving a generalized eigenvalue problem [12]. The SNR maximizer (SIM) for ERPs estimates the ERP and the noise covariance matrix simultaneously, and can be used to extract the ERP when noise data are not explicitly available [13]. In these algorithms, the number of ERP components needs to be set manually. However, the number of ERP components often varies under different experimental settings. The Bayesian estimation of ERP algorithm provides a Bayesian framework to determine it automatically [14]. A regularization term can also be introduced for this purpose [15].

Another category of algorithms, such as sparse component analysis (SCA) [16] and one-unit second order blind identification with reference [17], requires a template of the ERP. When a proper template or reference is provided, these algorithms have impressive performance. However, in real applications, the template for the ERP is generally difficult to obtain, since the ERP waveform varies across subjects, experimental settings, and even trials.

In this paper, in order to overcome the above drawbacks, we propose a probabilistic model of trial-by-trial concatenated EEG and develop an algorithm to extract ERPs with maximized SNRs. First, since ERPs are phase-locked to stimuli, the spectrum of the concatenated ERPs is discrete. The fundamental and harmonic frequencies of ERPs are simply determined by the length of a single trial.
As such, any ERP component can be expressed as a linear combination of a set of discrete sine and cosine bases, regardless of its waveform. The ERP spectrum can be recovered from the weight coefficients of the bases. Second, a sparse prior on the rank of the ERP spatiospectral pattern matrix is enforced, since the number of ERP components is typically less than the number of EEG channels in real situations [11], [14]. With this prior, the number of ERP components can be determined automatically from the data. Third, inspired by SIM, a spatial filter can be obtained by maximizing the SNR of the ERPs when their spatiospectral pattern matrices are available. In summary, the two main contributions of this paper are as follows.
1) Inspired by the spectral structure of steady-state visual evoked potentials, we concatenate ERPs trial by trial in the time domain. The discrete spectral structure enables the flexible selection of a set of templates or bases.
2) The number of components can be determined automatically by introducing a sparse prior on the rank of the spatiospectral pattern matrix of ERPs. A simple but efficient algorithm is developed to estimate the spatial and spectral patterns.
The rest of this paper is organized as follows. Section II describes the model of concatenated EEG and the algorithm to estimate the spatial and spectral patterns of ERPs.
Section III describes experiments evaluating the performance of the proposed algorithm and several state-of-the-art algorithms on synthetic and real ERP data. Section IV discusses some issues that may affect the performance of the proposed algorithm. These issues include the trial-by-trial variability, the spontaneous EEG pattern, and the number of channels and trials. The connections between the proposed algorithm and some existing algorithms are also established in this section.
II. METHOD
A. Probabilistic Model of Concatenated ERPs
Let $x_l(t)$ denote the M-channel EEG data of the $l$th trial, $l \in [1, L]$, at sampling time $t \in [1, T]$. The probabilistic model of the ERP can be expressed as
$$x_l(t) = \sum_{n=1}^{N} a_n s_n(t) + \varepsilon_l(t) \tag{1}$$
where $a_n$ and $s_n(t)$ denote the spatial pattern and the waveform of the $n$th ERP component, respectively, and $\varepsilon_l(t)$ denotes the noise, which follows a multivariate normal distribution $\mathcal{N}(0, \Sigma)$.
Remark 1: The ERP is assumed to be identical across all trials since it is approximately phase-locked to the stimuli [6]. The most commonly used trial averaging method to extract the ERP is also based on this assumption. Here, we do not make any assumptions about the distribution and waveform of the ERP signal.
Remark 2: The spontaneous EEG is modeled as a random effect. It also arises from brain activities and its spectrum overlaps with that of the ERP. In contrast to the ERP, spontaneous EEG is random across trials. The spontaneous EEG recorded on the scalp is a linear mixture of multiple source activities due to volume conduction. According to the central limit theorem, its distribution is close to a normal distribution [18], [19]. No particular covariance matrix structure (e.g., diagonal) is assumed in our model.
If EEG data are concatenated trial by trial, the ERPs will become a circular signal. Let $\hat{x}(t)$ denote the concatenated EEG, which can be expressed as
$$\hat{x}(t) = \sum_{n=1}^{N} a_n \hat{s}_n(t) + \varepsilon(t) \tag{2}$$
where $\hat{s}_n(t)$ denotes the concatenated $L$ copies of $s_n(t)$. The spectrum of this circular signal is discrete, and therefore, it can be expressed as a linear mixture of a set of sine and cosine bases with frequency $k/T$, where $k$ is an integer (see Appendix A for details) and $T$ is the length of a trial. Let $\alpha_{nk}$ and $\beta_{nk}$ denote the weight coefficients of the sine and cosine bases; then the $n$th component $\hat{s}_n(t)$ can be expressed as
$$\hat{s}_n(t) = \sum_{k} \alpha_{nk} \sin\left(\frac{2\pi k}{T} t\right) + \sum_{k} \beta_{nk} \cos\left(\frac{2\pi k}{T} t\right).$$
According to the definition of the Fourier series, $(\alpha_{nk}^2 + \beta_{nk}^2)^{1/2}$ and $\tan^{-1}(\alpha_{nk}/\beta_{nk})$ are the amplitude and phase of frequency $k/T$, where the sine and cosine terms share the same index ($k_1 = k_2 = k$). Thus, once the coefficients have been estimated, the spectral patterns of $\hat{s}_n(t)$ can be obtained.
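As an illustration of the expansion above, the following is a minimal NumPy sketch, not the authors' code. It builds sine and cosine bases at frequencies $k/T$ over a concatenated signal, fits the weights by ordinary least squares, and reads off the amplitude at each harmonic. The function name `fourier_basis`, the toy waveform, and the sizes `T`, `L`, `K` are assumptions chosen for the example.

```python
import numpy as np

def fourier_basis(T, L, K):
    """Hypothetical helper: stack K sine rows and K cosine rows.

    Row k has frequency k/T (k = 0..K-1) and is sampled over the
    T*L points of the trial-by-trial concatenated signal.
    """
    t = np.arange(T * L)
    k = np.arange(K)[:, None]
    phase = 2.0 * np.pi * k * t / T
    return np.vstack([np.sin(phase), np.cos(phase)])  # shape (2K, T*L)

# Toy single-trial waveform with one harmonic (k = 2): 3*sin + 4*cos.
T, L, K = 64, 5, 8
t = np.arange(T)
s = 3.0 * np.sin(2 * np.pi * 2 * t / T) + 4.0 * np.cos(2 * np.pi * 2 * t / T)

# Concatenating L identical copies gives a signal that is periodic in T.
s_hat = np.tile(s, L)

# Least-squares weights; lstsq copes with the all-zero sin(0) row.
Phi = fourier_basis(T, L, K)
coef, *_ = np.linalg.lstsq(Phi.T, s_hat, rcond=None)
alpha, beta = coef[:K], coef[K:]

amplitude = np.hypot(alpha, beta)   # (alpha^2 + beta^2)^(1/2) per harmonic
recon = coef @ Phi                  # reconstruction from the bases
```

In this toy case the only nonzero amplitude sits at $k = 2$, and the reconstruction matches the concatenated waveform because the waveform repeats exactly every $T$ samples.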
Remark 3: The selection of $k$ depends on the purpose of analysis. To analyze phase-locked signals in all frequency bands, all $k < f_s T/2$ should be used. Otherwise, only the frequencies in the band of interest will be used. Our algorithm can extract ERPs in different frequency bands by adjusting $k$ in the model.
For notational convenience, let $K$ denote the number of sine bases used, $A = [a_1, a_2, \ldots, a_N]$, and
$$\Phi(t) = \left[\sin\left(\tfrac{2\pi 0}{T}t\right); \sin\left(\tfrac{2\pi 1}{T}t\right); \ldots; \sin\left(\tfrac{2\pi (K-1)}{T}t\right); \cos\left(\tfrac{2\pi 0}{T}t\right); \cos\left(\tfrac{2\pi 1}{T}t\right); \ldots; \cos\left(\tfrac{2\pi (K-1)}{T}t\right)\right].$$
Then, (2) can be rewritten as
$$\hat{x}(t) = AC\Phi(t) + \varepsilon(t) = B\Phi(t) + \varepsilon(t) \tag{3}$$
where $C \in \mathbb{R}^{N \times 2K}$ is the coefficient matrix
$$C_{n,k} = \begin{cases} \alpha_{n\kappa},\ \kappa = k, & \text{if } k < K \\ \beta_{n\kappa},\ \kappa = k - K, & \text{if } k \ge K. \end{cases} \tag{4}$$
In this model, each column of $A$ represents the spatial pattern of an ERP component, each row of $C$ represents the spectral pattern, and the rank of $B$, which is termed the spatio-spectral pattern matrix, equals the number of components. Since the number of ERP components is generally smaller than the number of EEG channels, $A$ and $C$ should be sparse in rank. Intuitively, sparse priors on $A$ and $C$ can be introduced into the model [20]. In order to reduce model complexity, a more convenient approach is to set a prior that encourages rank sparsity on $B$. Once the rank of $B$ is determined, the number of ERP components and the rank of $A$ or $C$ can be obtained. A possible sparse prior distribution is
$$p(B) \propto \exp(-\mu h(B)) \tag{5}$$
where $h(\cdot)$ is a function encouraging rank sparsity, for example, $\mathrm{rank}(\cdot)$. The probabilistic model can be expressed as $\mathcal{N}(\hat{x} \mid B\Phi, \Sigma)\, p(B)\, p(\Sigma)$.
B. Algorithm
We develop an algorithm to compute the maximum a posteriori (MAP) estimate of the model parameters. Here, we have no specific prior knowledge regarding the distribution of $\Sigma$; therefore, the prior of $\Sigma$ is assumed to be uniform. Then, the MAP estimation is equivalent to solving the following optimization problem:
$$\mathcal{L} = \min_{B, \Sigma} \sum_{t=1}^{TL} \left( (\hat{x}(t) - B\Phi(t))^T \Sigma^{-1} (\hat{x}(t) - B\Phi(t)) \right) + \log|\Sigma| + \mu \cdot \mathrm{rank}(B). \tag{6}$$
Since the closed-form solution of (6) is unavailable, cyclic descent is employed to derive a numerical solution. In this procedure, each of $\Sigma$ and $B$ is updated in turn with the other fixed. The update rule for $\Sigma$ can be readily obtained by setting $\partial \mathcal{L} / \partial \Sigma = 0$:
$$\Sigma = \frac{1}{TL} \sum_{t=1}^{TL} (\hat{x}(t) - B\Phi(t))(\hat{x}(t) - B\Phi(t))^T. \tag{7}$$
$B$ can be obtained by solving the following regularized least squares problem:
$$\min_{B} \sum_{t=1}^{TL} \left\| \tilde{x}(t) - \tilde{B}\Phi(t) \right\|_F^2 + \mu \cdot \mathrm{rank}(B) \tag{8}$$
where $\tilde{x}(t) = \Sigma^{-1/2} \hat{x}(t)$ and $\tilde{B} = \Sigma^{-1/2} B$. Since $\Sigma^{-1/2}$ is usually a full rank matrix, $\mathrm{rank}(B) = \mathrm{rank}(\tilde{B})$ (see Appendix B for details). However, the rank function makes (8) an NP-hard nonconvex optimization problem. A convex relaxation of this problem with the nuclear norm regularization has proved to be effective [21]. Thus, we use $\|\tilde{B}\|_*$ to replace the rank function in (8), where $\|\cdot\|_*$ denotes the nuclear norm of a matrix. This leads to the following convex relaxation of (8):
$$\min_{\tilde{B}} \sum_{t=1}^{TL} \left\| \tilde{x}(t) - \tilde{B}\Phi(t) \right\|_F^2 + \mu \|\tilde{B}\|_*. \tag{9}$$
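To make the nuclear norm step in (9) concrete, here is a minimal sketch that uses plain proximal gradient descent with singular value thresholding. This is a simpler stand-in and an assumption of this example, not the accelerated NNLS solver of [22] used in the experiments. The names `svt` and `nuclear_norm_ls`, the toy sizes, and the value of `mu` are all hypothetical.

```python
import numpy as np

def svt(Z, tau):
    """Singular value thresholding: the proximal map of tau * ||.||_* at Z."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def nuclear_norm_ls(X, Phi, mu, n_iter=100):
    """Minimise ||X - B Phi||_F^2 + mu * ||B||_* by proximal gradient.

    X is (M, T*L) whitened data, Phi is (2K, T*L) basis samples.
    """
    # Step size 1/Lipschitz, where the gradient of the quadratic term
    # has Lipschitz constant 2 * sigma_max(Phi)^2.
    step = 1.0 / (2.0 * np.linalg.norm(Phi, 2) ** 2)
    B = np.zeros((X.shape[0], Phi.shape[0]))
    for _ in range(n_iter):
        grad = 2.0 * (B @ Phi - X) @ Phi.T
        B = svt(B - step * grad, step * mu)
    return B

# Toy problem: rank-2 pattern matrix observed through Fourier bases plus noise.
rng = np.random.default_rng(0)
T, L, K, M = 64, 8, 5, 8
t = np.arange(T * L)
k = np.arange(1, K + 1)[:, None]
Phi = np.vstack([np.sin(2 * np.pi * k * t / T), np.cos(2 * np.pi * k * t / T)])

U, _ = np.linalg.qr(rng.standard_normal((M, 2)))
V, _ = np.linalg.qr(rng.standard_normal((2 * K, 2)))
B_true = U @ np.diag([5.0, 3.0]) @ V.T
X = B_true @ Phi + 0.3 * rng.standard_normal((M, T * L))

B_ls = nuclear_norm_ls(X, Phi, mu=0.0)      # no regularisation
B_reg = nuclear_norm_ls(X, Phi, mu=256.0)   # nuclear norm penalty
rank_ls = np.linalg.matrix_rank(B_ls)
rank_reg = np.linalg.matrix_rank(B_reg)
```

With `mu = 0` the estimate is the ordinary least-squares fit and is generically full rank, whereas a sufficiently large `mu` zeroes the small singular values so that the rank of the estimate reflects the number of components.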
Several algorithms have been developed to solve the nuclear norm regularized least squares problem [22], [23]. The regularization coefficient $\mu$ can be determined by cross-validation as in LASSO [24]. Since neither (7) nor (8) will increase the cost function, the algorithm’s convergence is guaranteed. The decrease in cost function between adjacent iterations, $\Delta\mathcal{L}$, is used to form the stopping criterion
$$\Delta\mathcal{L}_i = \frac{\mathcal{L}_i - \mathcal{L}_{i-1}}{\mathcal{L}_{i-1}}$$
where $\mathcal{L}_i$ is the value of the cost function in iteration $i$. By repeating steps (7) and (9), both $\Sigma$ and $B$ can be estimated. $B$ should be further decomposed to obtain the spatial pattern matrix $A$ and spectral pattern matrix $C$. To ensure the uniqueness of the decomposition, we exploit the idea from [13] to maximize the SNR of the ERP components. The spatial filter matrix $A^{\dagger}$, where each row is a spatial filter, can be obtained by
$$\max_{A^{\dagger}} \mathrm{Tr}\left(A^{\dagger} \tilde{B} \tilde{B}^T A^{\dagger T}\right). \tag{10}$$
Because $B$ is not full-rank, to ensure computational stability, the components whose singular values are less than a preset threshold ($10^{-5}$ in our experiment) are pruned. It is noteworthy that sign and amplitude ambiguities exist in model (3): multiplying the spatial pattern by a constant $e$ and simultaneously multiplying the spectral pattern by $1/e$ yields an equally optimal solution. These ambiguities are also inherent in most spatial filtering methods owing to the nonuniqueness of matrix decomposition. To eliminate the amplitude ambiguity, additional constraints can be introduced, e.g., constraining the $L_2$ norm of the spatial filters to be unity. As for the sign ambiguity, as long as the ERP spatial patterns/filters are analyzed in conjunction with the ERP time courses, interpretation of the results will be unequivocal. The main steps of the algorithm to estimate ERP spatial and spectral patterns (ESSPs) are summarized in Algorithm 1.
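The pruning and the scaling ambiguity described above can be sketched as follows. This is an illustrative NumPy example under assumptions, not the authors' implementation: it follows the SVD-based decomposition listed in Algorithm 1, and the function name `decompose`, the random test matrices, and the constant `e` are hypothetical.

```python
import numpy as np

def decompose(B_tilde, Sigma_sqrt, tol=1e-5):
    """Split the whitened pattern matrix into spatial and spectral parts.

    Components whose singular values fall below `tol` are pruned.
    """
    U, s, Vt = np.linalg.svd(B_tilde, full_matrices=False)
    keep = s > tol
    U, s, Vt = U[:, keep], s[keep], Vt[keep]
    A = Sigma_sqrt @ U        # spatial patterns, one per column
    C = s[:, None] * Vt       # spectral patterns, one per row
    W = U.T                   # pinv(U), since U has orthonormal columns
    return A, C, W

rng = np.random.default_rng(0)
M, P = 6, 10

# Toy rank-3 whitened pattern matrix and a toy noise covariance.
B_tilde = rng.standard_normal((M, 3)) @ rng.standard_normal((3, P))
G = rng.standard_normal((M, M))
Sigma = G @ G.T + M * np.eye(M)
w, Q = np.linalg.eigh(Sigma)
Sigma_sqrt = (Q * np.sqrt(w)) @ Q.T

A, C, W = decompose(B_tilde, Sigma_sqrt)
B = Sigma_sqrt @ B_tilde      # unwhitened pattern matrix

# Amplitude/sign ambiguity: rescaling A by e and C by 1/e leaves B unchanged.
e = -2.5
B_rescaled = (e * A) @ (C / e)
```

Because the kept columns of `U` are orthonormal, each row of `W` already has unit $L_2$ norm in this sketch, which is one way of fixing the amplitude ambiguity.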
Algorithm 1 ESSP
Input:
  Trial-by-trial concatenated EEG data: $\hat{X}$
  Sine and cosine bases: $\Phi$
Initial:
  $\Sigma = \frac{1}{TL} \sum_{l,t} (x_l(t) - \bar{x}(t))(x_l(t) - \bar{x}(t))^T$, where $\bar{x}(t)$ is the trial-averaged EEG.
Iteration:
  while $\Delta\mathcal{L} > 10^{-6}$ and iteration times < 20 do
    Update $\tilde{B}$: $\tilde{B} = \arg\min_{\tilde{B}} \sum_{t=1}^{TL} \|\tilde{x}(t) - \tilde{B}\Phi(t)\|_F^2 + \mu \|\tilde{B}\|_*$
    Update $\Sigma$: $\Sigma = \frac{1}{TL} \sum_{t=1}^{TL} (\hat{x}(t) - B\Phi(t))(\hat{x}(t) - B\Phi(t))^T$
  end while
  $[U, S, V] = \mathrm{SVD}(\tilde{B})$
  Spatial Pattern: $A = \Sigma^{1/2} U$
  Spectral Pattern: $C = S V^T$
  Spatial Filter: $A^{\dagger} = \mathrm{pinv}(U)$
  Covariance matrix of spontaneous EEG: $\Sigma$
Output: $A$, $C$, $A^{\dagger}$
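The covariance-related steps of Algorithm 1 can be sketched in NumPy as below. This is a hedged illustration rather than the authors' code: it covers only the initialization, the update (7), and the whitening by $\Sigma^{-1/2}$, and it assumes trials are stored as an array of shape (L, M, T). The helper names and the toy data are hypothetical.

```python
import numpy as np

def initial_covariance(trials):
    """Covariance of what remains after subtracting the trial average."""
    L, M, T = trials.shape
    resid = trials - trials.mean(axis=0)                 # x_l(t) - xbar(t)
    resid = resid.transpose(1, 0, 2).reshape(M, L * T)
    return resid @ resid.T / (T * L)

def update_covariance(X_hat, B, Phi):
    """Update (7): covariance of the residual X_hat - B Phi."""
    R = X_hat - B @ Phi
    return R @ R.T / X_hat.shape[1]

def inv_sqrt(Sigma):
    """Sigma^{-1/2} for a symmetric positive definite matrix."""
    w, Q = np.linalg.eigh(Sigma)
    return (Q / np.sqrt(w)) @ Q.T

# Toy data: one phase-locked waveform plus spatially correlated noise.
rng = np.random.default_rng(1)
L, M, T = 50, 4, 32
erp = np.outer(rng.standard_normal(M), np.hanning(T))
mix = rng.standard_normal((M, M))
noise = np.einsum("ij,ljt->lit", mix, rng.standard_normal((L, M, T)))
trials = erp[None] + noise

Sigma0 = initial_covariance(trials)
X_hat = trials.transpose(1, 0, 2).reshape(M, L * T)      # concatenate trials

# With every harmonic included and no rank penalty, B Phi is the tiled
# trial average, so (7) gives back the initial covariance.
t = np.arange(T * L)
ks = np.arange(1, T // 2)[:, None]          # sine harmonics 1..T/2-1
kc = np.arange(0, T // 2 + 1)[:, None]      # cosine harmonics 0..T/2
Phi = np.vstack([np.sin(2 * np.pi * ks * t / T), np.cos(2 * np.pi * kc * t / T)])
B = np.linalg.lstsq(Phi.T, X_hat.T, rcond=None)[0].T
Sigma1 = update_covariance(X_hat, B, Phi)

W = inv_sqrt(Sigma0)                        # whitening matrix
X_tilde = W @ X_hat                         # whitened data
```

The check that `Sigma1` equals `Sigma0` shows why the trial-average residual is a natural starting point: it is what (7) returns for the unregularized fit on the full set of harmonics.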
Fig. 1. Synthetic ERP components.
III. EXPERIMENT
To verify ESSP’s efficacy and efficiency, we tested it on both synthetic data and real EEG data and compared it with two state-of-the-art algorithms: 1) SCA [16] and 2) regularized ICA (r-ICA) [11].
A. Experimental Setup
1) Synthetic Data Generation: The 30-channel synthetic EEG data contained three ERP components, which were orthogonal to each other (see Fig. 1). The spatial filters for these three ERP components were represented by a randomly generated orthogonal matrix. The spontaneous EEG was represented by 30-channel $1/f$ Gaussian noise. The spatial pattern of the spontaneous EEG was represented by a random matrix following a multivariate Gaussian distribution. A range of SNR levels of ERPs was used in our experiments: −20, −15, −10, −5, 0, and 5 dB. Here, the SNR of ERPs was defined as the ratio of the ERP power to the spontaneous EEG power. We also added white noise to each channel of data to simulate nonphysiological noise. The power of the white noise was 1/100 of the ERP and spontaneous EEG power. For each SNR of ERP, 100 data sets were generated. Each data set contained 100-trial synthetic EEG data. The length of each trial was 512 ms. The sampling rate of the synthetic data was 1000 Hz.
2) Algorithms Applied to Synthetic Data: In r-ICA, data with regularization were generated according to the following equation:
$$\dot{x}_l(t) = (1 - \lambda) x_l(t) + \lambda \bar{x}_l(t)$$
where $\bar{x}$ is the trial-averaged ERP and $\lambda \in [0, 1)$ is the regularization parameter. Then, second order blind
identification (SOBI) was applied to the trial-by-trial concatenated regularized data. From the resulting 30 components, the three components with the highest SNR were chosen. $\lambda$ was determined according to the sum of the SNRs of these three components. $\lambda = [0, 0.1, 0.2, \ldots, 0.9]$ were tested on each data set and the $\lambda$ corresponding to the highest SNR was used. In SCA, three templates were used to extract the three components. In simulation, three types of templates were used: 1) square waves centered at the peaks of the ERP components; 2) Gamma functions with peaks corresponding to the main peaks of the ERP components; and 3) the ERP waveforms used in the synthetic data. SCA was applied to the trial-averaged EEG data. This algorithm was implemented using the MATLAB code from https://sites.google.com/site/michaelzibulevsky/home. In ESSP, 60 sine and cosine bases (with corresponding frequencies 1000/512 Hz, 2 × 1000/512 Hz, . . . , 30 × 1000/512 Hz) were used. Fivefold cross-validation was used to select the optimal $\mu$ from $[10^{-2}, 10^{-1.8}, 10^{-1.6}, \ldots, 10^{0}]$. The $\mu$ corresponding to the minimum cross-validation error was selected. The number of ERP components was then determined automatically in the algorithm. We used the MATLAB software for nuclear norm regularized linear least squares problems based on an accelerated proximal gradient method (NNLS) to solve (9) [22].
3) Performance Evaluation: We used the SNR of the estimated ERP to compare the performance of different algorithms on the synthetic data. The SNR of the estimated ERP was defined as
$$\mathrm{SNR}_{\mathrm{est}} = \frac{\mathrm{var}(\text{real ERP})}{\mathrm{var}(\text{real ERP} - \text{estimated ERP})}.$$
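The SNR measure above can be computed as in the following sketch. The normalization used here (unit variance and a sign flip when the correlation is negative) is an assumption made for this example, since the text only states that amplitudes were normalized; the dB conversion and the names `snr_est` and `to_db` are likewise hypothetical.

```python
import numpy as np

def snr_est(real, estimated):
    """var(real) / var(real - estimated) after amplitude normalisation.

    Assumed normalisation: both signals are centred and scaled to unit
    variance, and the estimate is flipped if it is anticorrelated.
    """
    real = np.asarray(real, dtype=float)
    est = np.asarray(estimated, dtype=float)
    real = (real - real.mean()) / real.std()
    est = (est - est.mean()) / est.std()
    if np.dot(real, est) < 0:
        est = -est
    return np.var(real) / np.var(real - est)

def to_db(ratio):
    """Express a power ratio in decibels."""
    return 10.0 * np.log10(ratio)

# Toy check: the estimate is a flipped, rescaled copy of the real ERP plus
# a small orthogonal error term.
n = 100
t = np.arange(n)
real = np.sin(2 * np.pi * t / n)
est = -2.0 * (real + 0.1 * np.cos(2 * np.pi * t / n))

snr = snr_est(real, est)
snr_db = to_db(snr)
rho = np.corrcoef(real, est)[0, 1]
```

Under this normalization the ratio reduces to $1/(2 - 2|\rho|)$, where $\rho$ is the correlation between the real and estimated waveforms, so the measure is unaffected by the sign and scale of the estimate.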
A higher SNR indicated better performance in recovering ERP from noisy data.
4) Real EEG Experiment Protocol: The real EEG data set containing 13 subjects was collected in the Neural Engineering Laboratory at Tsinghua University. All the subjects were volunteers between 20 and 30 years old and were all right-handed. None had any history of mental disorder. They all had normal or corrected-to-normal vision when the experiment was conducted. Before the experiment, they were informed about the content of the experiment and signed consent agreements. In the experiment, four categories of grayscale pictures were presented on a CRT screen: 1) faces; 2) inverted faces; 3) cars; and 4) meaningless noise pictures (see [14, Fig. 4] for details). Subjects were required to mentally count the number of meaningless noise pictures. In each run, 60 faces, 60 inverted faces, 60 cars, and a random number (between 25 and 30) of meaningless pictures were presented. The pictures were presented on screen for 250 ms with random interstimulus intervals between 1.8 and 2.5 s. All pictures had been preprocessed such that their mean grayscale value was identical. Subjects sat 1 m from the screen, and the visual angle of each picture was 11° × 7.6°. Only the data collected during the runs in which subjects reported the correct number of meaningless pictures were used. According to previous findings, the amplitude of N170 evoked by the face was
Fig. 2. Positions of electrodes.
Fig. 4. ERP waveforms recovered by different algorithms. Here, the SNR of ERP was −10 dB.
Fig. 3. Grand-averaged ERP (P8) from 13 subjects. The amplitude of N170 evoked by the face was larger than that evoked by the car, which is consistent with previous reports.
larger than that evoked by the car [25]–[27]. The typical N170 waveform was recorded by electrodes in the occipital–temporal area. Sixty-channel EEG data with an averaged mastoids reference were recorded by a NeuroScan SymAmp2 system (with electrode positions as shown in Fig. 2) in a shielded room. The sampling rate was 1000 Hz. The electrode impedance was kept below 10 kΩ during the recording. EEG data between 0 and 256 ms after the stimulus onset were used in the analysis. The algorithms applied to the synthetic data were used to extract N170 from the real EEG data and estimate the corresponding spatial filter. In r-ICA, we set the number of ERP components to be three (r-ICA1) or five (r-ICA2). $\lambda$ was determined by the sum of the SNRs of the components. N170 was chosen according to the correlation coefficients between the extracted components and the templates. Here, the template was the grand-averaged EEG on P8 (see Fig. 3). In SCA, two different templates were used. The first was the grand-averaged ERP on P8 (SCA1); the second was a square wave whose peak was centered at the peak of the first template (SCA2). In ESSP, all bases with frequencies below 60 Hz (i.e., 1000/256 Hz, 2 × 1000/256 Hz, 3 × 1000/256 Hz, …) were used. The N170 component was chosen in the same way as in the r-ICA-based algorithm. Before applying these algorithms, direct current components were removed from each channel, and EEG data were temporally filtered with a passband of 0.5–60 Hz to remove the low-frequency shifts and high-frequency artifacts. To evaluate the performance of different algorithms on the real data, we used the amplitude of the single-trial
spatial-filtered ERP to identify the type of the stimulus (i.e., face or car). First, EEG data evoked by the face were used as a training set to estimate the spatial filters using different algorithms. Both r-ICA and ESSP could obtain more than one spatial filter. The spatial filter for N170 was selected according to the correlation coefficient between the filtered trial-averaged ERP and the template. Since the spatial patterns of N170 evoked by face and car are similar, the spatial filter was then applied to each trial in the test set. The N170 peak latency was defined as the time point where the spatial-filtered trial-averaged EEG reached the negative peak between 150 and 190 ms after the stimulus onset. The single-trial ERP peak amplitude was defined as the amplitude at the peak latency. If N170 was correctly extracted, the peak amplitude of N170 from the face stimuli would be larger (numerically lower since it was negative) than that from the car stimuli. A threshold of the peak amplitude could be set to discriminate the face and car stimuli. The true positive ratio (TPR) and false positive ratio (FPR) were calculated as the threshold varied, and the receiver operating characteristic (ROC) curve was generated by plotting TPR against FPR. The area under the curve (AUC) of ROC could be obtained using the trapezoidal rule. Higher values of AUC indicated better classification performance.
B. Experimental Results
1) Synthetic Data: We compared ERP waveforms estimated by different algorithms. To eliminate the effect of amplitude ambiguity, the amplitude of the recovered signals was normalized before calculating the SNR of the estimated ERP. Fig. 4 shows the recovered ERP waveforms when the SNR was −10 dB. The three components estimated by ESSP were very close to the real ones with little noise. ICA with regularization produced similar results, but with more noise. The performance of SCA was strongly affected by the template, especially for the third component.
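The ROC analysis described in the evaluation protocol above can be sketched as follows. This is an illustrative example under assumptions, not the authors' code: face trials are treated as the positive class, a trial is labeled "face" when its peak amplitude is at or below the threshold, and the function name `roc_auc` and the toy amplitudes are hypothetical.

```python
import numpy as np

def roc_auc(face_amp, car_amp):
    """Area under the ROC curve for single-trial N170 peak amplitudes.

    A trial is called a face trial when its amplitude is <= threshold,
    because a stronger N170 is more negative.
    """
    face = np.asarray(face_amp, dtype=float)
    car = np.asarray(car_amp, dtype=float)
    # Sweep the threshold over every observed amplitude, starting below all.
    values = np.unique(np.concatenate([face, car]))
    thresholds = np.concatenate(([-np.inf], values))
    tpr = np.array([(face <= th).mean() for th in thresholds])
    fpr = np.array([(car <= th).mean() for th in thresholds])
    # Trapezoidal rule over the (FPR, TPR) points.
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

auc_separable = roc_auc([-5.0, -4.0, -3.0], [-2.0, -1.0, 0.0])
auc_chance = roc_auc([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
auc_mixed = roc_auc([-3.0, -1.0], [-2.0, 0.0])
```

With fully separated amplitudes the area is 1, with identical amplitude distributions it is 0.5, and in the mixed toy case three of the four face/car pairs are ordered correctly, which gives 0.75.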
Fig. 5 shows the SNR of the estimated ERP at different noise levels. Since ESSP was designed to enhance the SNR of ERP components, it showed better performance at all noise levels (according to a two-way repeated-measures ANOVA with the SNR and algorithm as factors, $p < 10^{-3}$ for all components).
Fig. 7. Mean convergence curve from 100 synthetic data sets.
Fig. 8. ERP waveform (P8) averaged over 60 trials from a single subject.
Fig. 5. SNR of estimated ERPs at different noise levels. Height of the bars: mean SNR of the estimated ERP from 100 data sets. Error bars: standard deviation.
Fig. 6. Singular values of the matrix B. The area of each square indicates the absolute value of the singular value. Left: result from ESSP. Right: result from the proposed algorithm without nuclear norm regularization, i.e., μ = 0.
To verify whether ESSP estimated the correct number of ERP components, we calculated the 30 largest singular values of the matrix $B$. The number of nonzero singular values equals the number of ERP components. The experiment showed that ESSP estimated the correct number on all synthetic data sets. Fig. 6 shows the 30 largest singular values when the SNR of the ERP was −20 dB. When there was no regularization, it was difficult to find a proper threshold to determine the number of ERP components. To verify the convergence of ESSP, we calculated the value of the cost function in each iteration. Fig. 7 shows the mean decrease curve of $\Delta\mathcal{L}$ over 100 data sets. In most cases, $\Delta\mathcal{L}$ decreased to $10^{-6}$ after seven iterations and the algorithm converged on all data sets.
2) Real Data: The amplitude of the grand-averaged ERP evoked by faces was larger than that evoked by cars, which was consistent with the previous reports (Fig. 3). However, this phenomenon could not be observed in every single subject (Fig. 8), since the number of trials used was limited and the SNR of the ERP was low. For some subjects, the data were
so noisy that even the peak of N170 was not obvious. After applying ESSP, the number of ERP components was estimated and the noise irrelevant to ERPs was reduced. The number of ERP components was between 2 and 4 for all 13 subjects. It was not difficult to select the N170 component using the template (see the first column in each of Figs. 9 and 10). The spatial patterns indicated that N170 originated from the occipital–temporal area. The peak amplitude of face-evoked N170 was larger than that of car-evoked N170 for all subjects. The N170 estimated by the other algorithms is also shown in Figs. 9 and 10. The result of ICA with regularization was close to that of the proposed algorithm. However, for some subjects (Subjects 4, 5, and 11 for r-ICA1; Subjects 1, 4, 5, and 11 for r-ICA2), the amplitude of car-evoked N170 was larger. ICA with regularization was affected by the number of candidate components when applied to multicomponent extraction. SCA suffered from overfitting on some subjects (Subjects 5, 6, and 7 for SCA1; Subjects 2, 7, 9, and 10 for SCA2). We could not find the peak corresponding to car-evoked N170 on these subjects. The amplitude of car-evoked N170 was larger on some subjects (Subjects 12 and 13 for SCA1; Subjects 3, 5, 6, and 11 for SCA2). We also compared the separability of the peak amplitude of N170. The AUC values for different algorithms are listed in Table I. The AUC for ESSP was significantly larger than that for the other algorithms (paired t-test with Bonferroni correction; p-values are shown in Table I). To test the stability of ESSP, we randomly chose 55 trials with replacement from each subject’s data and applied ESSP to
Fig. 9. N170 estimated by different algorithms. The topographic maps are the corresponding spatial patterns. SCA cannot estimate the spatial patterns, so we show the pseudoinverse of the spatial filter here. The amplitude of ERPs is normalized for better visualization here.
each of these surrogate data sets. This procedure was repeated ten times on each subject. The correlation coefficients between the spatial patterns from the surrogate data sets and that from the full data set were greater than 0.95 for all subjects. To verify that the components estimated by ESSP were reliable, brain electrical source analysis was applied to localize the sources of these components. Here, we used the DIPFIT toolbox in EEGLAB [10]. The residual variance was