This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

INVITED PAPER
Multimodal BCIs: Target Detection, Multidimensional Control, and Awareness Evaluation in Patients With Disorder of Consciousness By Yuanqing Li, Jiahui Pan, Jinyi Long, Tianyou Yu, Fei Wang, Zhuliang Yu, and Wei Wu
ABSTRACT | Despite rapid advances in the study of brain–computer interfaces (BCIs) in recent decades, two fundamental challenges, namely, improvement of target detection performance and multidimensional control, continue to be major barriers for further development and applications. In this paper, we review the recent progress in multimodal BCIs (also called hybrid BCIs), which may provide potential solutions for addressing these challenges. In particular, improved target detection can be achieved by developing multimodal BCIs that utilize multiple brain patterns, multimodal signals, or multisensory stimuli. Furthermore, multidimensional object control can be accomplished by generating multiple control signals from different brain patterns or signal modalities. Here, we highlight several representative multimodal BCI systems by analyzing their paradigm designs, detection/control methods, and experimental results. To demonstrate their practicality, we report several initial clinical applications of these multimodal BCI systems, including awareness evaluation/detection in patients with disorder of consciousness (DOC). As an evolving research area, the study of multimodal BCIs increasingly requires synergetic efforts from multiple disciplines for the exploration of the underlying brain mechanisms, the design of new effective paradigms and means of neurofeedback, and the expansion of the clinical applications of these systems.

KEYWORDS | Audiovisual BCI; awareness evaluation; brain switch; cursor control; multimodal brain–computer interface (BCI)
Manuscript received April 30, 2015; revised July 17, 2015; accepted August 10, 2015. This work was supported by the National High-Tech R&D Program of China (863 Program) under Grant 2012AA011601; the National Key Basic Research Program of China (973 Program) under Grant 2015CB351703; the National Natural Science Foundation of China under Grants 91120305, 91420302, 61175114, and 61503143; and the Guangdong Natural Science Foundation under Grants 2014A030312005 and 2014A030310244.
Y. Li, J. Long, T. Yu, F. Wang, Z. Yu, and W. Wu are with the School of Automation Science and Engineering, South China University of Technology, Guangzhou 510640, China, and also with the Guangzhou Key Laboratory of Brain Computer Interaction and Applications, Guangzhou 510640, China (e-mail: [email protected]).
J. Pan is with the School of Software, South China Normal University, Guangzhou 510641, China.
Digital Object Identifier: 10.1109/JPROC.2015.2469106
I. INTRODUCTION
A brain–computer interface (BCI) system directly translates the signals produced by brain activities into control signals and uses these signals to control external devices without the participation of peripheral nerves and muscles [1]. In recent years, BCIs have attracted increasing attention from both academia and the public, primarily because of their potential for clinical applications. For instance, BCIs can offer augmented or restored sensory-motor functions and may appeal to individuals with severe motor disabilities [2]. Based on the methods used for brain signal acquisition, two broad categories of BCIs can be distinguished: invasive and noninvasive. In the case of invasive BCIs, the recording electrodes are implanted in the
0018-9219 © 2015 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Proceedings of the IEEE
cerebral cortex (microelectrode arrays or neurotrophic electrodes) [3] or on the cortical surface [electrocorticography (ECoG)] [4]. These invasive BCIs may produce highly reliable control signals with high-speed information transfer. However, for this class of BCIs, there are prominent risks related to surgery as well as concerns regarding potential infection and the long-term viability of chronic invasive neural probes [3].

Currently, the brain signals for noninvasive BCIs include functional magnetic resonance imaging (fMRI), magnetoencephalography (MEG), functional near-infrared spectroscopy (fNIRS), and electroencephalography (EEG). Although EEG has a low signal-to-noise ratio (SNR) and low spatial resolution, it is noninvasive, portable, reasonably inexpensive with good real-time response, and technically less demanding than the other modalities; thus, it has been commonly used in BCIs. In this paper, we focus on EEG-based BCIs. The brain patterns that are utilized for EEG-based BCIs mainly include the P300 potential [5]; steady-state evoked potentials (SSEPs), particularly steady-state visual evoked potentials (SSVEPs) [6]; and event-related desynchronization/synchronization (ERD/ERS) produced by motor imagery (MI) [7].

In most cases, the setup of an EEG-based BCI system is fairly "simple." For instance, a "simple" BCI system generally relies on a single signal input (EEG) and a single brain pattern, such as the P300, generated by stimuli of a single sensory modality (such as visual stimuli), without incorporating any other intelligent/automation techniques. Remarkable progress has been made in the paradigm designs, brain signal processing algorithms, and control systems of these simple BCIs. However, these BCI systems also face multiple challenges, including low information transfer rates (ITRs), low man–machine adaptability, multidegree control problems, and high dynamics/nonstationarity of brain signals.
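As a concrete illustration of the last of these brain patterns, ERD/ERS is typically quantified as the percentage change in band power of a task epoch relative to a resting baseline. The following minimal sketch uses synthetic single-channel data; the sampling rate, epoch length, and 8–12-Hz mu band are illustrative choices, not parameters taken from any cited study.

```python
import numpy as np
from scipy.signal import welch

def bandpower(x, fs, fmin, fmax):
    """Mean power spectral density of x within [fmin, fmax] Hz (Welch estimate)."""
    f, psd = welch(x, fs=fs, nperseg=fs)
    band = (f >= fmin) & (f <= fmax)
    return psd[band].mean()

def erd_percent(task_eeg, rest_eeg, fs=250, band=(8.0, 12.0)):
    """ERD/ERS as the percentage band-power change of a task epoch relative
    to a resting baseline; negative values indicate desynchronization."""
    p_task = bandpower(task_eeg, fs, *band)
    p_rest = bandpower(rest_eeg, fs, *band)
    return 100.0 * (p_task - p_rest) / p_rest

# Synthetic single-channel example: a 10-Hz mu rhythm attenuated during MI.
fs = 250
t = np.arange(0, 4, 1 / fs)
rng = np.random.default_rng(0)
rest = np.sin(2 * np.pi * 10 * t) + 0.3 * rng.standard_normal(t.size)
task = 0.5 * np.sin(2 * np.pi * 10 * t) + 0.3 * rng.standard_normal(t.size)
print(f"ERD: {erd_percent(task, rest):.1f}%")  # negative: mu power drops during MI
```

In a real MI-based BCI, such band-power features would be computed per channel and per trial and then fed to a classifier.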
Here, we consider two fundamental challenges and then introduce a class of multimodal/hybrid BCI techniques intended to address these challenges.

1) The target detection performance of EEG-based BCI systems: To date, tremendous effort has been devoted to improving the target detection performance of BCI systems, which is commonly measured in terms of their classification accuracy (alternatively, the ITR may be used as a performance measure), and a large number of BCIs with high classification accuracy have been reported. For example, Jin et al. proposed a method of feedback based on changes in facial expression and thus improved the detection performance of an event-related potential (ERP)-based BCI [8]. Furthermore, it has been shown that different familiar face stimuli enhance the N200 and N400 components and thus decrease actual error outputs [9]. Several modern signal processing techniques have also been developed for BCIs, such as multiway learning and tensor factorization [10],
[11]. However, with a few exceptions, most BCI systems were developed for research and were tested mostly with healthy subjects; thus, they may not perform well on patients for the following reasons: a) in general, there are large differences in brain signals between healthy subjects and patients with severe brain injuries [12]; and b) cognitive levels, which often play an important role in a BCI system, differ substantially between healthy subjects and patients [13]. For instance, many patients, such as those with disorder of consciousness (DOC), have much lower attention levels [14], [15]. Even among healthy subjects, approximately 13% suffer from BCI illiteracy and are unable to achieve sufficient proficiency [16], [17]. 2) Multidimensional object control: Multiple control signals, which are difficult for a simple BCI system to generate, are essential for the operation of sophisticated objects or devices such as computer mice, wheelchairs, prosthetic limbs, or robots. For example, for wheelchair control, several different control signals are required for directional control (left or right forward), speed control (acceleration and deceleration), and start/stop control. Several studies have discussed multidimensional object control in the context of BCIs [18]–[20]. For instance, McFarland et al. presented a BCI system for continuous 3-D cursor movement control, which was implemented through the modulation of various EEG rhythms [19]. Doud et al. developed several BCI systems to control a virtual helicopter in 3-D space with commands generated through MI of left hand, right hand, tongue, foot and both hands, and the resting state [20]. For these BCI systems, the generation of the control signals is based on the single modality of MI and the training task may be challenging for the subjects. 
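For reference, the ITR mentioned above is most commonly computed with Wolpaw's formula, which combines the number of selectable targets, the classification accuracy, and the selection time. The sketch below is a generic illustration; the 4-choice, 90%-accuracy, 4-s-per-selection example is hypothetical.

```python
import math

def wolpaw_itr(n_targets, accuracy, trial_seconds):
    """Wolpaw information transfer rate in bits/min for an N-target BCI
    with classification accuracy P and one selection every trial_seconds."""
    n, p = n_targets, accuracy
    if p <= 1.0 / n:
        return 0.0  # at or below chance level: no information transferred
    bits = math.log2(n)
    if p < 1.0:
        bits += p * math.log2(p) + (1 - p) * math.log2((1 - p) / (n - 1))
    return bits * 60.0 / trial_seconds

# Example: a 4-choice BCI at 90% accuracy, one selection every 4 s.
print(round(wolpaw_itr(4, 0.90, 4.0), 2))  # → 20.59
```

Note that the formula assumes equiprobable targets and uniformly distributed errors, so it is an idealized upper bound rather than a measured channel capacity.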
In this paper, we demonstrate that potential solutions to the above two challenges may be achieved by means of a new class of BCI techniques, namely, multimodal/hybrid BCIs. In the following sections, four typical classes of multimodal BCIs are first reviewed from the perspective of the two challenging problems discussed above. Several representative multimodal BCI systems are then reviewed to illustrate how multimodal BCI techniques can be implemented to address or alleviate these issues. In particular, we describe several P300-and-SSVEP-based BCI systems for idle state detection and spelling, an audiovisual BCI for target detection, and several MI-and-P300-based BCIs for 2-D control and for applications including a web browser, a mail client, a Windows-based explorer, and wheelchair control. Third, we review several clinical applications of multimodal BCIs, including awareness evaluation/detection in patients with DOC. Finally, our concluding remarks are presented, including several principles for the design of multimodal BCIs and future perspectives.
II. MULTIMODAL BCIs
As described in [17], a multimodal/hybrid BCI is composed of a BCI and another system [which might be another BCI based on different brain patterns or brain signals, or might be a system based on other physiological signals, such as eye gaze, electrocardiography (ECG), or electromyography (EMG)] and is capable of achieving specific goals more successfully than a conventional BCI. In the following sections, we adhere to the term "multimodal BCIs" for two reasons: first, "multimodal BCIs" and "hybrid BCIs" are interchangeable terms referring to the same category of BCIs in the literature, and second, it is our opinion that the term "multimodal BCIs" may be more precise in many cases, such as for BCIs based on multimodal stimuli (e.g., audiovisual BCIs) and BCIs based on multimodal signals (e.g., EEG-and-fNIRS-based BCIs). We now review the studies related to this topic and extend the definition of multimodal BCIs given in [17]. As shown in Fig. 1, four main classes of multimodal BCIs have been devised.
A. Multimodal BCIs Combining Multiple Brain Patterns
In this type of multimodal BCI, at least two brain patterns (e.g., SSVEPs and P300, MI and SSVEPs, or MI and P300) are used. Several groups have proposed multimodal BCIs combining the P300 and SSVEPs to improve the performance of an on/off brain switch [21] or a BCI speller [22]–[24]. Multimodal BCIs combining MI and SSVEPs have also been reported [25], [26] and used in various applications, such as for controlling a cursor [26] or an orthosis [27]. For instance, Pfurtscheller et al. proposed a multimodal MI-and-SSVEP-based BCI, in which an MI-based brain switch was used to turn an SSVEP-controlled orthosis on and off [27]. Li et al. proposed a multimodal BCI paradigm that incorporated MI and the P300 potential for 2-D cursor control [28], in which the vertical and horizontal movements of the cursor were controlled by MI and the P300, respectively. In [29], a multimodal BCI combining MI and SSVEPs was proposed to provide positive feedback and facilitate MI training. Note that in a multimodal BCI combining the P300 and another brain pattern such as SSVEPs, an oddball paradigm is used to evoke the P300. In fact, in addition to the P300, other ERPs such as the N200 and N400 may also be evoked by the oddball paradigm, and these potentials are also useful for BCI classification, as shown in several studies [8], [28].

B. Multisensory BCIs
In this class of multimodal BCIs, the brain patterns are evoked by multisensory stimuli, such as audiovisual stimuli, which are simultaneously presented. For instance, Belitski et al. presented several offline analysis results demonstrating that audiovisual stimulation increased the
Fig. 1. Multimodal BCIs and their applications with several illustrative references included.
average strength of the P300 response compared with that evoked through either visual-only or auditory-only stimulation [30]. Initial results for visual-tactile and audio-tactile BCIs have also been reported. Thurlings et al. investigated the effects of bimodal visual-tactile stimulus presentation on ERP components and reported the enhancement of early components such as the N1, which may improve BCI performance [31]. Rutkowski and Mori reported a tactile and bone-conduction auditory BCI for vision- and hearing-impaired users [32]. Studies have indicated that multimodal integration associated with multimodal stimuli has the potential to enhance brain patterns [33], which may be beneficial for BCI performance.
C. BCIs Based on Multimodal Signals
Two or more of EEG, MEG, fMRI, fNIRS, EMG, and electrooculography (EOG) signals can be combined to serve as inputs to a multimodal BCI system [17], [34]–[39]. For example, in [34], the MEG and EEG signals generated in the sensorimotor cortex were used to index the finger movements of three tetraplegics. The experimental results indicated that reasonably good accuracy could be achieved using the combined MEG and EEG features because MEG signals are less smeared and may provide finer spatial information than EEG signals. The use of a combination of fMRI and EEG in BCIs may allow the spatial and temporal information of brain activity to be fully exploited [35], [40], which would enable the study of the basic psychophysiological mechanisms of BCIs, such as the correlations between local BOLD responses and changes in the slow cortical potentials (SCPs). Although it offers less spatial information than fMRI, fNIRS is portable and also reflects the hemodynamic response of brain activity. In [36], it was demonstrated that the performance of an MI-based BCI could be significantly enhanced by combining EEG and fNIRS signals, and a meaningful classification was established for those subjects who were not able to operate a purely EEG-based BCI. Several studies have also investigated EEG-and-EMG-based BCIs. In [41], a multimodal P300 BCI communication system was proposed in which a P300 BCI was combined with the use of EMG for error correction. The effectiveness of this multimodal letter correction method for a spelling system was demonstrated by experimental data collected from four severely motor-restricted potential BCI end users.

D. BCI-Based Shared Control Systems
In multimodal BCIs of this type, BCIs are combined with other intelligent systems to achieve shared control and to make the BCI systems more reliable, flexible, user friendly, and efficient by allowing the subjects to focus their attention on the ultimate target while ignoring low-level details related to the execution of an action [42]. For instance, an intelligent wheelchair may incorporate the advantages of both a BCI system and an autonomous navigation system [37], [43]–[46]. In one study [44], shared
control of a wheelchair was achieved in a series of stages. At each stage, a 3-D environmental map, which presented to the user a set of distributed candidate destinations, was constructed using a laser range finder (LRF). After the user selected a destination using a P300-based BCI, the wheelchair autonomously moved to this destination by means of the navigation system. Through a series of destination selections and subsequent navigation, the final destination could be reached. Guan et al. proposed a semiautomatic wheelchair to alleviate the workload of the user to the greatest possible extent [45]. Their method consisted of two steps: first, the user selected one of several predetermined destinations using a P300-based BCI, and the autonomous system then navigated the wheelchair to the selected destination along a predefined path. BCI-based shared control systems can also be established by combining a BCI system with an intelligent robot or intelligent prosthesis.

A multimodal BCI system may provide only a single control signal or output to improve target detection performance. In this case, when multiple brain patterns or multiple signals are involved, data fusion is generally required at the feature or decision level. By contrast, when multiple control signals or outputs are available, a multimodal BCI system may attempt to implement multidimensional object control; in this case, the multiple control signals may be separately manipulated by the different brain patterns probed by the system, and the fusion of these brain patterns is generally not necessary.
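The distinction between feature-level and decision-level fusion drawn above can be made concrete with a minimal sketch. The function names, the z-score/min-max normalizations, and the equal weighting are illustrative assumptions rather than the method of any particular cited system.

```python
import numpy as np

def feature_level_fusion(feat_a, feat_b):
    """Concatenate z-scored feature vectors from two modalities so that a
    single classifier sees both (fusion before classification)."""
    z = lambda v: (v - v.mean()) / (v.std() + 1e-12)
    return np.concatenate([z(np.asarray(feat_a)), z(np.asarray(feat_b))])

def decision_level_fusion(scores_a, scores_b, w=0.5):
    """Combine per-class scores from two independent detectors by a weighted
    sum of their min-max normalized outputs (fusion after classification)."""
    norm = lambda s: (s - s.min()) / (s.max() - s.min() + 1e-12)
    fused = w * norm(np.asarray(scores_a)) + (1 - w) * norm(np.asarray(scores_b))
    return int(np.argmax(fused))

# Decision-level example: two detectors scoring four candidate targets.
p300_scores = np.array([0.2, 0.9, 0.1, 0.3])
ssvep_scores = np.array([1.0, 3.0, 0.5, 0.2])
print(decision_level_fusion(p300_scores, ssvep_scores))  # → 1
```

Normalization before combining is essential in either case, because scores from different detectors (e.g., SVM margins versus power ratios) live on incommensurable scales.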
III. IMPROVING TARGET DETECTION PERFORMANCE USING MULTIMODAL BCIs
Multimodal BCIs have been shown to improve detection performance [21], [24], and two major strategies for such improvements include 1) combining multiple brain patterns, such as MI-based ERD/ERS, the P300, and SSVEPs; and 2) enhancing brain patterns through the presentation of multisensory stimuli, such as audiovisual stimuli. The first strategy is applied in multimodal BCIs combining multiple brain patterns, whereas the second is employed in multisensory BCIs. In this section, we explain these two strategies by analyzing several SSVEP-and-P300-based BCIs, MI-and-SSVEP-based BCIs, and audiovisual BCIs.
A. An SSVEP-and-P300-Based BCI Brain Switch
In [21], we proposed a multimodal BCI for brain switch applications in which the P300 and SSVEPs were combined to improve the performance of idle/control state detection, as described below.

1) Paradigm and Detection: The GUI is shown in Fig. 2. To evoke SSVEPs, four groups of buttons in the GUI, each consisting of one large button and eight small ones, flicker at frequencies of 6.0, 6.67, 7.5, and 8.57 Hz. The large
buttons in the four groups flash in a random order with changing shapes and colors to simultaneously generate the P300. The signal processing task is to discriminate between the control and idle states in an asynchronous manner. In the control state, the user may input a number by focusing on the corresponding button and counting its flashes, whereas in the idle state, no particular button is the focus. P300 detection and SSVEP detection are performed separately (see [21] for details). In the asynchronous algorithm, the P300 detection, including low-pass filtering, P300 feature extraction, and support vector machine (SVM) classification, is performed every 800 ms, corresponding to one round of button flashes, thereby yielding four SVM scores for the four button groups. Simultaneously, SSVEP detection, including band-pass filtering and power feature extraction, is performed every 200 ms, and four power ratios are computed based on the flickering frequencies of the four groups. Based on the results of P300 and SSVEP detection, a decision is made every 200 ms. By summing the normalized P300 SVM scores and SSVEP power ratios, a detection index that can discriminate the control and idle states is obtained, and the target is determined. The control state is detected by judging whether one of the following two conditions is satisfied: 1) the difference between the detection index of one group of buttons and those of the other groups exceeds a predefined threshold; or 2) the same button is recognized simultaneously by both the P300 and SSVEP detection for a predefined number of consecutive times, such as three times in [21].

Fig. 2. GUI of the multimodal brain switch combining the P300 and SSVEPs [21]. (a) Each group of buttons flickers at a fixed frequency (6.0, 6.67, 7.5, or 8.57 Hz) by changing color between black and red. (b) The large buttons of the four groups are intensified with green squares in a random order, with each intensification lasting for 100 ms and the interval between two consecutive intensifications being 100 ms. When a large button is intensified, a number "1" (top), "2" (right), "3" (bottom), or "4" (left) appears on the button. With the exception of the 100-ms period of intensification, each large button continues to flicker at its fixed frequency.
2) Experimental Procedure and Results: Eight healthy subjects participated in two experiments to evaluate the described multimodal BCI. For purposes of comparison, three sessions were conducted in the first experiment, one each for the multimodal P300-and-SSVEP-based BCI, the P300-based BCI, and the SSVEP-based BCI. Each session consisted of 80 trials. In each trial, one of five possible cues, "1," "2," "3," "4," or "idle," appeared for 2 s in the center of the GUI; here, the digit cues corresponded to the groups of buttons that were the targets of the control state, whereas the "idle" cue corresponded to the idle state. In the control state, the subjects were instructed to focus on the target group of buttons for 16 s. In the idle state, the subjects were asked to take a 16-s rest without focusing on any buttons. There was a 2-s break between consecutive trials, and 40 control and 40 idle trials were alternately presented. The collected data were used for offline analysis.

Three receiver operating characteristic (ROC) curves corresponding to the three sessions were first calculated to assess the system performance. As shown in Fig. 3(a), the performance in terms of the area under the ROC curve was better for the multimodal BCI than for the P300- or SSVEP-based BCI. Next, for each subject, the ERP waveforms of the "Pz" channel were calculated using the data corresponding to the control states collected in the multimodal and P300 sessions; Fig. 3(b) shows the average ERP waveforms across all eight subjects. In Fig. 3(b), the P300 and N200 components are clearly visible in the target curves for both the multimodal and P300 sessions. Third, we calculated the power density spectra for each subject using the data corresponding to the control states collected in the multimodal and SSVEP sessions; Fig. 3(c) shows the two average power density spectrum curves across all eight subjects for the target button of 6 Hz, each of which contains an apparent SSVEP response.

Our offline analysis results demonstrated that for each subject, both SSVEPs and P300 potentials were simultaneously evoked in the multimodal session, as illustrated in Fig. 3(b) and (c). However, from Fig. 3(c), we observed that the SSVEP strength was decreased in the multimodal session compared with that in the SSVEP session. This might be caused by interference between the two BCI subsystems. It is essential to mitigate such interference when designing a P300-and-SSVEP-based BCI. A previous study [47] suggests that using shape
Fig. 3. (a) Three average ROC curves across all eight subjects from the multimodal, P300, and SSVEP sessions. TPR: true positive rate, FPR: false positive rate. (b) Grand-average ERP waveforms across all eight subjects at electrode ‘‘Pz’’ from the multimodal and P300 sessions, where the solid curves containing P300 correspond to the target group of buttons, and the dashed ones without P300 correspond to the nontarget group of buttons. (c) Average power density spectra of the EEG signals across all eight subjects from the multimodal and SSVEP sessions (target frequency: 6 Hz), which show SSVEP responses.
changes instead of color changes for P300 evocation could alleviate the degradation of the SSVEP strength. In the second experiment, an online asynchronous test was conducted, and the effectiveness of the multimodal BCI system was demonstrated (see [21] for details).

We designed this multimodal BCI primarily for a brain switch application, in which one group of buttons (the top buttons) is designated as the target for "on/off" commands and the other buttons are designated as pseudokeys that do not activate any commands (as explained in detail in [48]). In an asynchronous manner, the user can switch between the "on" and "off" states of the system by focusing on the target or can remain in an idle state to avoid changing the system state. In [21], we successfully used this SSVEP-and-P300-based brain switch to issue "go/stop" commands for wheelchair control.
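The two-condition detection rule of this brain switch can be sketched as follows. This is a simplified illustration, not the implementation of [21]: the z-score normalization, the margin value, and the consecutive-agreement count are hypothetical stand-ins for the thresholds calibrated in that study.

```python
import numpy as np

def detect_control_state(p300_scores, ssvep_ratios, history,
                         margin=0.5, n_consecutive=3):
    """One decision step of the hybrid switch.

    p300_scores  -- four SVM scores, one per button group
    ssvep_ratios -- four SSVEP power ratios, one per button group
    history      -- mutable list of (p300_winner, ssvep_winner) pairs
    Returns the selected group index, or None for the idle state.
    """
    p300 = np.asarray(p300_scores, dtype=float)
    ssvep = np.asarray(ssvep_ratios, dtype=float)
    z = lambda s: (s - s.mean()) / (s.std() + 1e-12)
    index = z(p300) + z(ssvep)  # summed normalized detection index

    # Condition 1: the winner's index exceeds the runner-up by a margin.
    ranked = np.sort(index)
    if ranked[-1] - ranked[-2] > margin:
        return int(np.argmax(index))

    # Condition 2: both detectors pick the same button for
    # n_consecutive decision steps in a row.
    history.append((int(np.argmax(p300)), int(np.argmax(ssvep))))
    recent = history[-n_consecutive:]
    if len(recent) == n_consecutive and all(a == b == recent[0][0] for a, b in recent):
        return recent[0][0]
    return None
```

When the two detectors disagree and no group dominates the index, the function returns `None`, which corresponds to the idle state that leaves the switch unchanged.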
B. SSVEP-and-P300-Based BCI Spellers
In [22], a semi-asynchronous multimodal BCI speller was proposed, in which the P300 and SSVEPs were combined to improve detection performance.

1) Paradigm and Detection: The GUI of the semi-asynchronous multimodal BCI speller presented in [22] is a 6 × 6 button matrix consisting of 36 characters. The displays of the rows and columns are intensified in an orange color in a random order, similar to a standard P300 speller. Simultaneously, all of the buttons in the GUI flicker between white and black at 17.7 Hz to produce SSVEPs when the user is concentrating on one particular character. In this system, the SSVEPs are utilized for control state detection, and the P300 is used to determine the target character.
In the semi-asynchronous algorithm, SSVEP detection and P300 detection are performed separately. SSVEP detection is performed for each round of intensifications of the rows and columns. Specifically, the mean power values in both a narrowband and a wideband are utilized to calculate a power ratio; the central frequency is 17.7 Hz, and the bandwidths are 0.6 and 4 Hz, respectively. When the power ratio is greater than a predefined threshold, a control state is detected for that round; otherwise, an idle state is detected. If at least three of five adjacent rounds are detected as the control state, the user's desire to input a character is identified, and the target character is determined and output according to the result of P300 detection.

2) Experimental Procedure and Results: An online experiment involving ten healthy subjects was implemented in a semi-asynchronous manner. Specifically, the BCI system was operated in discrete predefined blocks with an interstimulus interval (ISI) of 225 ms, and each block consisted of five rounds of intensifications. A decision was made for each block, and if the subject was deemed to be in the control state for a block according to SSVEP detection, a P300 classification was employed to output a character; otherwise, no output was produced in that block. Each subject participated in three experimental sessions, each involving a string of 18 characters, with each character corresponding to a block. In each session, the subjects were required to attend to the first six characters, gaze away for the next six characters, and again attend to the last six characters. In total, therefore, there were 36 blocks in the control state and 12 blocks in the idle state. For all subjects, the average control state detection accuracy was 88.15%, the average P300 classification accuracy in the control state was
94.44%, and the ITR was 19.05 b/min. These results demonstrated the effectiveness of the multimodal BCI speller.

Several other multimodal BCI spellers combining the P300 and SSVEPs have also been designed. For instance, Yin et al. proposed a synchronous multimodal BCI speller, in which the P300 and SSVEPs were combined for target button recognition [24]. During the experiment conducted by these authors, each button in the 6 × 6 matrix flickered at a specific frequency to elicit SSVEPs; however, only six frequencies were used in the button matrix (in each row or column, the six buttons all had different frequencies; see [24]). Simultaneously, the six rows and six columns of the matrix were intensified in a random order to elicit the P300 potential. The authors found that the P300-and-SSVEP-based multimodal speller performed better than the standard P300-based BCI. In a subsequent study [49], a maximum-probability estimation (MPE) fusion approach was proposed to combine the P300 and SSVEPs at the decision level; in this approach, the scores obtained in P300 and SSVEP detection were converted into probabilities, and the button with the maximum probability was selected as the target. The experimental results indicated that the MPE method achieved higher accuracy than did the other considered approaches, which were based on linear discriminant analysis (LDA), SVM, and a naïve Bayesian classifier. Allison et al. developed a P300-and-SSVEP-based multimodal BCI with a four-choice paradigm. Four buttons flickered from red to black at four specific frequencies to elicit SSVEPs, while all of the buttons flashed in a fixed order (up, down, left, and then right) to elicit the P300 potential. They compared the performance of this system with those of the P300-only and SSVEP-only BCIs.
Experimental results obtained from ten healthy subjects showed that the detection performance in the multimodal condition was superior to that in the SSVEP-only condition, although it was not improved compared with the P300-only condition. Furthermore, all of the subjects were capable of using the multimodal BCI and did not consider it to be more annoying, difficult, or fatiguing to use than the P300- or SSVEP-based BCI [50].

In this section, we have shown that detection performance can be improved when the P300 and SSVEPs are combined appropriately. This improvement might be explained by the fusion of the time-frequency features of the P300 and SSVEPs, which may provide additional information to facilitate the classification of a target versus a nontarget, as demonstrated in Fig. 3. In the design of a BCI paradigm that combines the P300 and SSVEPs, it is critical for the two potentials to be elicited simultaneously without distraction. Such elicitation was successfully achieved in the aforementioned studies by combining periodic flickers with random flashes.
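The narrowband/wideband power-ratio rule used for control state detection in [22] (Section III-B1) can be sketched as below. The Welch spectral-estimation parameters and the threshold value are illustrative assumptions; [22] specifies only the 17.7-Hz center frequency, the 0.6- and 4-Hz bandwidths, and the three-of-five-rounds rule.

```python
import numpy as np
from scipy.signal import welch

def ssvep_power_ratio(eeg, fs, f0=17.7, narrow=0.6, wide=4.0):
    """Ratio of the mean PSD in a narrow band around the flicker frequency
    to the mean PSD in a wide band around it; large values suggest an SSVEP."""
    f, psd = welch(eeg, fs=fs, nperseg=min(len(eeg), 2 * fs))
    nb = psd[(f >= f0 - narrow / 2) & (f <= f0 + narrow / 2)].mean()
    wb = psd[(f >= f0 - wide / 2) & (f <= f0 + wide / 2)].mean()
    return nb / wb

def control_state(round_ratios, threshold=1.5, min_hits=3):
    """A block is in the control state when at least min_hits of its
    per-round power ratios exceed the threshold (the 3-of-5 rule)."""
    return sum(r > threshold for r in round_ratios) >= min_hits

# Synthetic check: a 17.7-Hz flicker response versus background noise.
fs = 250
t = np.arange(0, 4, 1 / fs)
rng = np.random.default_rng(1)
background = rng.standard_normal(t.size)
flicker = np.sin(2 * np.pi * 17.7 * t) + 0.5 * rng.standard_normal(t.size)
print(ssvep_power_ratio(flicker, fs) > ssvep_power_ratio(background, fs))  # → True
```

Because the wideband includes the narrowband, the ratio hovers near 1 for broadband noise and rises well above it only when spectral power concentrates at the flicker frequency.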
C. Audiovisual P300 BCIs
Upon the simultaneous receipt of sensory inputs of the auditory and visual modalities, our brains may integrate both the auditory and visual features of these stimuli. Studies have shown that audiovisual integration may be accompanied by the enhancement of ERPs [33], which may be beneficial for BCI performance. Here, we present an audiovisual BCI and investigate the effect of audiovisual stimuli on BCI performance [51].

Fig. 4. GUI and experimental procedure of a single trial in the audiovisual condition [51].

1) Paradigm and Detection: Fig. 4 shows the GUI and procedure used in a single experimental trial of the audiovisual condition. In the GUI, two number buttons are located on the left and right sides of the monitor, which randomly display two different integer numbers from 0 to 9. There are also two speakers placed laterally to the monitor, and the two buttons flash in an alternating manner, with the color of the flashing button changing from green to black and the color of the corresponding number changing from black to white. When a number button is visually intensified, the corresponding spoken number is presented from the ipsilateral speaker. In this manner, the user is presented with a temporally, spatially, and semantically congruent audiovisual stimulus that lasts for 300 ms.

In each trial, two audiovisual stimuli are presented, corresponding to the two numbers. At the beginning of a trial, two random numbers and an audiovisual Chinese instruction are presented for 6 s. The instruction is "Focus on the target number and count its repetitions silently" (e.g., 6 in Fig. 4). Next, there are two rounds of audiovisual stimulation. Each round consists of five consecutive repetitions of one audiovisual stimulus corresponding to
one number and then five consecutive repetitions of the other, in which the interstimulus interval (ISI) is randomized between 700 and 1500 ms. After stimulation, if the target number was detected by an SVM model as described below, then auditory applause and the visual target number are presented for 4 s as feedback; otherwise, a cross appears for 4 s. Finally, there is a 2-s break at the end of the trial. In total, each trial lasts approximately 42 s. The detection method is similar to that for a visual P300-based BCI (see [51] for details). Note that the time interval from 0 to 500 ms after the onset of each stimulus is used for feature extraction and that the extracted features encompass multiple ERP components, such as the P100, N200, and P300, which may be useful for classification in the audiovisual BCI, as demonstrated in the following. 2) Experimental Procedure and Results: Ten healthy subjects participated in the experiment, which consisted of three sessions administered in a random order, corresponding to the visual-only (V), auditory-only (A), and audiovisual (AV) conditions. The experimental procedure used in the visual-only and auditory-only conditions was similar to that used in the audiovisual condition except that only visual stimuli were presented in the visual session and only auditory stimuli were presented in the auditory session. In each session, an initial training run of ten trials was performed to establish the SVM model, followed by a test run of 30 trials. The online average accuracies across all subjects were 95.67%, 86.33%, and 62.33% for the audiovisual, visual-only, and auditory-only conditions, respectively. A one-way repeated measures ANOVA revealed that the stimulus condition exerted a significant effect (p < 10^-6, F(2,18) = 41.82).
Furthermore, post hoc Bonferroni-corrected paired t-tests indicated that the online average accuracy was significantly higher for the audiovisual condition than for the visual-only or auditory-only condition (all p < 0.05, corrected). Therefore, the audiovisual BCI was observed to significantly outperform the visual-only and auditory-only BCIs. Furthermore, as shown in Fig. 5(a), the ERP waveforms at the ‘‘Pz’’ electrode indicated that for the target stimuli, stronger P100, N200, and P300 responses were observed in the audiovisual condition than in the visual-only and auditory-only conditions. This phenomenon was not observed for the nontarget stimuli. The enhanced ERP components were beneficial for classification and allowed us to improve the performance of the BCI, as demonstrated below. To determine the discriminative features, we performed pointwise running t-tests for target versus nontarget responses in the audiovisual, visual-only, and auditory-only stimulus conditions across all subjects. As shown in Fig. 5(b), more discriminative features were present in certain specific time windows, such as the 100–200-, 200–300-, and 300–500-ms windows, in the audiovisual condition, compared with those in the visual-only or auditory-only condition.
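The pointwise running t-test used to locate discriminative time windows can be sketched as follows. The epochs here are synthetic, and a real analysis would additionally apply the alpha and cluster-size criteria described for Fig. 5(b):

```python
import numpy as np

def pointwise_ttest(target, nontarget):
    """Pointwise two-sample t-statistics (pooled variance) comparing
    target vs. nontarget epochs at every time sample.
    target/nontarget: trials x samples arrays for one electrode."""
    n1, n2 = len(target), len(nontarget)
    m1, m2 = target.mean(0), nontarget.mean(0)
    v1, v2 = target.var(0, ddof=1), nontarget.var(0, ddof=1)
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    return (m1 - m2) / np.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))

rng = np.random.default_rng(1)
samples = 150
nontgt = rng.standard_normal((40, samples))
tgt = rng.standard_normal((40, samples))
tgt[:, 60:90] += 1.0                 # simulated P300-like deflection
t_vals = pointwise_ttest(tgt, nontgt)
```

Large |t| values cluster around the simulated deflection, which is how such running tests expose the 100–500-ms windows discussed above.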
The improved performance of the audiovisual BCI compared with the visual-only or auditory-only BCI may be related to audiovisual integration. Audiovisual integration processes are generally assessed by comparing the ERPs elicited by audiovisual stimuli with the summed ERPs elicited by the corresponding visual-only and auditory-only stimuli [33]. For instance, using the criterion AV > A + V, Talsma and Woldorff observed multisensory integration effects at approximately 100 ms (frontal positivity), 160 ms (centro–medial positivity), 250 ms (centro–medial negativity), and 300–500 ms (centro–medial positivity) [33]. For the attended target stimuli, our extensive experimental results also revealed the effects of audiovisual integration in the time windows of 40–60 ms (‘‘Cz’’ electrode), 120–140 ms (‘‘Cz’’ and ‘‘Pz’’ electrodes), and 140–160 ms (‘‘Cz,’’ ‘‘Pz,’’ and ‘‘Oz’’ electrodes), whereas such effects were not observed for the nontarget stimuli [51]. The enhanced ERP components associated with multisensory integration, such as the P100, N200, and P300, improved the performance of the BCI system. The paradigm of our audiovisual BCI, in which we present only two number buttons paired with semantically congruent auditory stimuli, differs from the classic oddball paradigm. Specifically, the oddball effect is generated by randomizing the time interval between two adjacent stimuli, as has been verified in previous studies [52]. Our experimental results also indicated that the target stimuli indeed produced a P300 response. The design of an effective audiovisual BCI with a larger number of stimulus buttons will be a subject of future work. Studies of audiovisual BCIs, although promising, are still in the preliminary stages. Data analysis results for the offline audiovisual P300-based speller proposed in [30] showed that the strength of the P300 response was higher in the audiovisual condition than in the visual-only or auditory-only condition.
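The additive criterion AV > A + V can be illustrated with a toy computation; the Gaussian ‘‘ERPs’’ below are synthetic stand-ins for real averaged waveforms:

```python
import numpy as np

def av_integration_effect(erp_av, erp_a, erp_v):
    """Additive-model test of audiovisual integration: the difference
    between the AV ERP and the sum of the unisensory A and V ERPs.
    Positive values indicate super-additive (AV > A + V) integration."""
    return erp_av - (erp_a + erp_v)

t = np.linspace(0, 0.5, 126)                       # 0-500 ms post-stimulus
erp_a = 0.5 * np.exp(-((t - 0.3) ** 2) / 0.002)    # synthetic auditory ERP
erp_v = 0.8 * np.exp(-((t - 0.3) ** 2) / 0.002)    # synthetic visual ERP
erp_av = 1.6 * np.exp(-((t - 0.3) ** 2) / 0.002)   # enhanced AV response
effect = av_integration_effect(erp_av, erp_a, erp_v)
```

In a real analysis this difference waveform would be tested sample by sample, as in the time windows reported in [33] and [51].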
Promising results (including ours) have demonstrated that audiovisual integration is useful for enhancing brain patterns and improving BCI performance. In the design of an audiovisual BCI, it is essential to consider the factors that may influence audiovisual integration, such as the temporal, spatial, and semantic congruity of the audiovisual stimuli. In this section, we have described several promising multimodal BCIs, including BCIs that combine the P300 and SSVEPs and audiovisual BCIs, and have illustrated their potential for improving detection performance. A nonnegligible proportion of healthy users cannot effectively control simple BCIs based on the MI, P300, or SSVEP brain pattern [16], [17], [53]. These users are typically regarded as BCI illiterates. However, a multimodal BCI involving two or more brain patterns offers a possible means of decreasing BCI illiteracy. For instance, a multimodal BCI combining MI and SSVEPs was presented in [25] to improve BCI performance. The experimental results for 14 healthy subjects showed that the average accuracy of the multimodal task was
Fig. 5. (a) Average ERP waveforms in each stimulus condition from the ‘‘Pz’’ electrode for all subjects. (b) Pointwise running t-tests compared the target responses with the nontarget responses in multisensory and unisensory stimulus conditions across all subjects for 30 electrodes. Significant differences were plotted when data points met an alpha criterion of 0.05 with a cluster size larger than 7 (modified from [51]).
improved by approximately 6% compared with that for the conventional MI-only or SSVEP-only method. Furthermore, although five of the participants were found to be BCI illiterates when using the MI- or SSVEP-only BCI, none was a BCI illiterate when using the multimodal BCI, possibly as a result of the addition of the second task, which provided more information to the classifier [25].
IV. MULTIDIMENSIONAL CONTROL BASED ON MULTIMODAL BCIs
An important issue in EEG-based BCIs is multidimensional control involving multiple independent control signals. These control signals may be obtained from multiple brain patterns or from multimodal signals in a multimodal BCI. In [28] and [54], a multimodal BCI combining MI and P300 potentials was proposed for 2-D cursor control and target selection, and a BCI mouse was thus implemented. Furthermore, this multimodal BCI was extended to several applications, including a web browser [55], a mail client [56], a Windows-based explorer [57], and a wheelchair [21], [58]. We review these systems in this section.
A. Multimodal BCI-Based 2-D Cursor Control and BCI Mouse
In the following, we present a multimodal BCI combining MI and P300 potentials for 2-D cursor control and target selection [28], [54].
Fig. 6. GUI for 2-D cursor control and target selection in which a cursor (circle), target (square), and eight flashing buttons (three ‘‘up,’’ three ‘‘down,’’ and two ‘‘stop’’ buttons) are included [28], [54].
1) Paradigm and Control Methods: The GUI is shown in Fig. 6. The circle and the square are the cursor and the target, respectively, and the initial position of the cursor and the initial position and color (green or blue) of the target are randomly chosen. The three ‘‘up’’ buttons, three ‘‘down’’ buttons, and two ‘‘stop’’ buttons flash in a random order to evoke the P300 potential. The task of the subject is to move the cursor to the target and then to select/reject the green/blue target. The user can move the cursor to the left or right by imagining his or her left- or right-hand movement, whereas he or she can move the cursor up or down by focusing on one of the three flashing ‘‘up’’ or ‘‘down’’ buttons to evoke the P300 potential. If the user does not intend to move the cursor in the vertical direction, then he or she can focus on either of the two ‘‘stop’’ buttons. To further implement a BCI mouse, a target selection/rejection function is needed. Specifically, once the cursor reaches the target of interest (green square), the user can select it by focusing his or her attention on a flashing button while simultaneously maintaining the MI idle state. If the target is not of interest (blue square), the user can reject it by imagining his or her left- or right-hand movement without focusing on any flashing buttons. The cursor’s position is updated every 200 ms based on simultaneously performed MI and P300 detection. Specifically, once left- or right-hand movement imagery is detected, the cursor will move to the left or right. Meanwhile, the cursor will move up or down when the P300 is detected on one of the three ‘‘up’’ or ‘‘down’’ buttons. If no MI or P300 is detected, no corresponding horizontal or vertical movement of the cursor is generated (see [28] for details). Once the cursor reaches the target, selection or rejection is performed within 2 s, as described below.
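The 200-ms update rule described above can be sketched as a small state machine; the function and label names are illustrative, not those of the original implementation:

```python
def update_cursor(pos, mi_label, p300_button, dx=5, dy=5):
    """One 200-ms control update for the hybrid BCI mouse.
    mi_label: 'left', 'right', or None (MI idle state).
    p300_button: 'up', 'down', 'stop', or None (no P300 detected).
    Horizontal (MI) and vertical (P300) movement are driven independently."""
    x, y = pos
    if mi_label == 'left':
        x -= dx
    elif mi_label == 'right':
        x += dx
    if p300_button == 'up':
        y += dy
    elif p300_button == 'down':
        y -= dy
    # A 'stop' button or no detection leaves the vertical position unchanged.
    return (x, y)

pos = update_cursor((100, 100), 'right', 'up')   # diagonal step
pos2 = update_cursor(pos, None, 'stop')          # hold position
```

Because the two detections are evaluated in the same cycle, simultaneous MI and P300 activity yields diagonal movement, which is what makes the control genuinely 2-D.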
In each trial, we extract the features of the P300 and MI signals separately and then concatenate them to construct a hybrid feature vector, which is fed into the SVM for classification [54].
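A minimal sketch of this feature-level fusion is shown below; the feature dimensions and the per-modality unit-norm scaling are assumptions for illustration, not details taken from [54]:

```python
import numpy as np

def hybrid_features(p300_feat, mi_feat):
    """Concatenate P300 (time-domain ERP) features and MI (band-power)
    features into one hybrid vector for a single SVM classifier.
    Each part is scaled to unit norm so neither modality dominates."""
    def unit(v):
        v = np.asarray(v, dtype=float)
        n = np.linalg.norm(v)
        return v / n if n > 0 else v
    return np.concatenate([unit(p300_feat), unit(mi_feat)])

p300_feat = np.random.default_rng(2).standard_normal(390)  # e.g., downsampled ERP samples
mi_feat = np.random.default_rng(3).standard_normal(8)      # e.g., mu/beta band powers
hv = hybrid_features(p300_feat, mi_feat)
```

The resulting vector can be passed to any standard classifier; the offline comparison described below asks whether this concatenation outperforms either part alone.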
2) Experimental Procedure and Results: Eleven healthy subjects participated in an online experiment that consisted of one session of 80 trials for each subject. Each trial included two sequential tasks. The first was to move the cursor to a target in a randomized position on the screen. Once the cursor reached the target, the subject was instructed to perform the second task, which was to select or reject the target depending on its color (green for selection and blue for rejection). The time interval for the second task was set to 2 s. For all subjects, the average time for one trial was 18.96 s, the average percentage of successful trials was 92.84%, and the average target selection accuracy for the case in which the cursor was successfully moved to the target was 93.99%. Additionally, we collected several data sets and performed an offline analysis to demonstrate the advantage offered by the hybrid P300 and MI features over the P300-only features or the MI-only features in target selection/rejection. The experimental results showed that the accuracy for the hybrid features was significantly higher than those for the MI-only and P300-only features (see [54] for details). Overall, we demonstrated that when using a multimodal BCI combining MI and P300 potentials, 2-D cursor control can be successfully performed and the cursor can be moved from an arbitrary initial position to an arbitrary target position. Once the cursor reaches the target, the user can effectively select or reject it within a time window of 2 s based on hybrid MI and P300 features. A simple BCI mouse was thus developed. Previously, an elegant BCI based on MI for 2-D cursor control was reported in [18]. In this system, the user can produce two independent signals by regulating his or her mu and beta rhythms.
However, this approach has limited application because of the required intensive user training, the fixed initial position of the cursor, and the fixed positions of the eight target candidates in the working space. In addition, Allison et al. presented a novel type of multimodal BCI based on MI and SSVEPs for continuous 2-D cursor control [26]. In this system, the user controls the vertical movement of a virtual ball through left- and right-hand MI while simultaneously controlling the horizontal movement via SSVEP activity. Similar to the system presented in [18], the target in this system is also confined to one of eight candidates.
B. BCI Mouse-Based Web Browser, Mail Client, and File Explorer Several computer applications have been developed using the BCI mouse implemented by means of the multimodal BCI presented above for 2-D cursor control and target selection. Specifically, in [55], we proposed a BCI web browser that allows the user to surf the Internet without any limb movements. Moreover, Yu et al. proposed an alternative BCI-mouse-based mail client that can implement electronic mail communication [56]. Using this mail client, the user can receive, read, write, and attach files to their mails. Bai et al. proposed a BCI-mouse-based
Fig. 7. GUI for a BCI-based Internet browser [55].
file explorer [57]. Using this system, the user can access his or her computer and manipulate (open, close, copy, paste, and delete) files such as documents, pictures, music, and movies. The results of online experiments have demonstrated the efficacy of these application systems. As an example, we describe the BCI web browser below. The principal methods of implementation for the BCI mail client and BCI file explorer, which are presented in [56] and [57], are similar to those of the BCI web browser. 1) Paradigm and Control Methods: The conceptual GUI for a BCI-mouse-based web browser, an extension of the GUI presented in Fig. 6, is shown in Fig. 7. The GUI includes a mouse, multiple targets in the client area, and eight buttons around the client area (three ‘‘UP,’’ three ‘‘DOWN,’’ and two ‘‘STOP’’ buttons) that flash in a random order to evoke the P300 potential. Furthermore, a web browser with a navigation bar at the top is embedded in the central client area. Several self-explanatory navigation buttons, including ‘‘home,’’ ‘‘reload,’’ ‘‘backward,’’ and ‘‘forward,’’ are provided in the navigation bar. In addition, a textbox to input a URL, a ‘‘go’’ button to access the webpage with the given address, and a filter textbox for filtering out targets of no interest are also included. The buttons, the URL and filter textboxes, and the hyperlinks in an open webpage constitute multiple targets that can be accessed by the BCI mouse. Using the methods of 2-D cursor control and target selection/rejection presented above, the user can move the mouse onto a target and select or reject it depending on whether the target is of interest. The major challenge here is the selection of the intended target in a client area that contains multiple targets. The BCI browser was implemented by addressing the following three major concerns.
a) Navigation: A navigation page that contains links to several popular sites, including search engines and news sites as well as e-mail, video, and shopping and social network services, was designed and implemented. Because there are only a small number of targets on this page, the user can open a site of interest by directing the mouse to indicate and select it. If the mouse reaches an unintended target, the user can reject it and continue to move the mouse until the intended target is reached and selected.
b) Target visualization: All selectable targets are extracted based on the content of an accessed webpage. For each extracted target, an enlarged translucent target box is placed over it to allow it to be indicated with reasonable ease by the BCI mouse. However, these target boxes are invisible unless the mouse is touching them.
c) Target filtering, mouse movement, and target selection: Because a webpage may contain a large number of hyperlinks, it can be difficult for the user to select an intended target using the BCI mouse; thus, a target filter was implemented to exclude most targets of no interest. Specifically, the user may select the filter textbox to activate a P300 speller and input keywords. Target filtering is automatically performed by the system based on these keywords, with the remaining targets highlighted in yellow. Then, through a series of mouse movements and target rejections/selections, the user can open the webpage of interest. If the mouse remains in the blank region of the webpage, no target will be activated and the user will be able to read the page contents.
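The keyword-based target filter can be sketched as a simple case-insensitive substring match over the extracted hyperlink texts; the actual browser implementation in [55] may differ:

```python
def filter_targets(links, keywords):
    """Keep only hyperlinks whose text contains any of the user's
    keywords (case-insensitive).  In the browser, the surviving targets
    would be highlighted for selection and the rest deactivated."""
    kws = [k.lower() for k in keywords]
    return [link for link in links if any(k in link.lower() for k in kws)]

links = ["World News", "Sports Scores", "Weather Forecast", "Tech News"]
hits = filter_targets(links, ["news"])
```

Reducing dozens of hyperlinks to a handful of highlighted candidates is what makes target selection tractable with the relatively slow BCI mouse.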
2) Experimental Procedure and Results: Seven subjects participated in an online experiment to evaluate the BCI-mouse-based web browser. The users were asked to perform a comprehensive task, such as executing most of the functions offered by the browser to open a real-world webpage and then to select an item on that page [55]. Although the task was quite complex, requiring the opening of a webpage, keyword input, and item selection, and involved multiple selections/rejections, all of the subjects completed it in an average of 247 s with few mistakes. These findings demonstrated that the BCI-mouse-based browser could be used to efficiently surf the Internet with high selection accuracy.
C. Wheelchair Control Based on Multimodal BCIs In Section IV-A, we presented a multimodal BCI that uses a combination of MI and P300 potential to provide two control signals for 2-D cursor control. In [58], we extended this multimodal BCI to the control of the direction and speed of a wheelchair, as described below. 1) Paradigm and Control Methods: The GUI is similar to that shown in Fig. 6 for 2-D cursor control except that neither the cursor nor the target is present. In this multimodal BCI system, the user can control the leftward and
rightward movements of the wheelchair through left- and right-hand MI, respectively. For speed control, a multimodal paradigm is employed. Specifically, if the user wishes to decelerate the movement of the wheelchair, he or she must imagine a third motor event (such as a foot movement) without attending to any flashing buttons. If the user wishes to accelerate the movement of the wheelchair, then he or she must focus on a specific flashing button without imagining any movement. Furthermore, the user can remain in the idle state without issuing any commands. The algorithm first detects the left-/right-hand MI for direction control. When no left- or right-hand MI is detected, no direction control command is issued. In this case, speed control is performed via the discrimination of three states: foot imagery without the P300 (lower speed), the P300 without foot imagery (higher speed), and the idle state (no speed control). It is rather difficult to stop a wheelchair using a BCI because the stop command needs to be elicited as promptly and accurately as possible. In Section III-A, we described a multimodal brain switch combining P300 and SSVEPs. We used this switch to issue ‘‘go/stop’’ commands in real-time wheelchair control [21]. In this system, the ‘‘go’’ and ‘‘stop’’ commands were always sent under static and in-motion conditions, respectively, and the direction control was achieved based on left- and right-hand MI [21]. Specifically, in the GUI shown in Fig. 2, the group of buttons at the top is designated as a target key for issuing ‘‘go/stop’’ commands, and the other groups of buttons are designated as pseudokeys that do not activate any commands. By focusing on the target key, the user can turn the wheelchair on or off. The system distinguishes between the control and idle states by assessing whether both the P300 and the SSVEPs are directed toward the target group of buttons. 
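The hierarchical direction-then-speed decision rule can be summarized as follows; the labels and priority order are a simplified reading of the description above, not the exact detection pipeline:

```python
def wheelchair_command(mi_label, p300_detected):
    """Hierarchical decision rule for hybrid wheelchair control.
    Direction first: left-/right-hand MI steers the chair.  Otherwise,
    speed control discriminates three states: foot MI without the P300
    (decelerate), the P300 without foot MI (accelerate), idle (no command)."""
    if mi_label in ('left', 'right'):
        return 'turn_' + mi_label
    if mi_label == 'foot' and not p300_detected:
        return 'decelerate'
    if p300_detected and mi_label is None:
        return 'accelerate'
    return 'idle'

cmds = [wheelchair_command('left', False),   # steer left
        wheelchair_command('foot', False),   # slow down
        wheelchair_command(None, True),      # speed up
        wheelchair_command(None, False)]     # idle state
```

Requiring each command to be the *absence* of the other signal (foot MI without a P300, and vice versa) is what keeps the three speed-control states separable.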
2) Experimental Procedure and Results: To test the effectiveness of the control mechanism described above, two subjects were instructed to direct a real wheelchair along a predefined route. This route consisted of ten segments (alternating between low- and high-speed sections) and three turns. The subjects were asked to drive the wheelchair at a low speed of 0.1 m/s in the low-speed sections and at a high speed of 0.3 m/s in the high-speed sections. Each subject performed five trials. In each trial, the subject was required to drive the wheelchair from the starting point to the end of the route and then perform a ‘‘U’’-turn to return to the starting point. For the two subjects, the average driving distance (path length) for one low- or high-speed section was 5.43 or 5.21 m, respectively. Furthermore, the average time required to complete one low- or high-speed section was 45.38 or 20.14 s, during which the average time corresponding to incorrect speed control was 4.54 or 4.1 s, respectively. In addition, no collisions were observed. Overall, both subjects who directed the wheelchair using the developed multimodal BCI successfully traced the predefined route with effective direction and velocity control.
In [21], we conducted another wheelchair control experiment to demonstrate the effectiveness of the above method in producing stop commands for a wheelchair. Five healthy subjects participated in this experiment. When in the static condition, the subjects could send a ‘‘go’’ command with an average response time (RT) of 4.21 s and an average false activation rate (FAR) of 0.48 events/min. When in the in-motion condition, the subjects could send a ‘‘stop’’ command with an average RT of 5.50 s and an average FAR of 0.52 events/min. In this section, we have described the implementation of multidimensional object control based on multimodal BCIs and have presented several application systems, such as a web browser and a wheelchair control system. Note that all of these multimodal BCI systems were implemented with healthy subjects and that there are several obstacles to be overcome when applying these systems to patients. For instance, the continuous control of a wheelchair using a BCI may pose a heavy mental burden for some patients. One feasible solution to this problem is to integrate a BCI with an automated navigation system to implement shared control [43]–[45], in which the subject may simply focus on his or her final target and the execution of any actions necessary to achieve that target is performed by the automated navigation system (see Section II for further discussion). Other multimodal BCIs for multidimensional object control have also been reported. For instance, Pfurtscheller et al. introduced a multimodal BCI consisting of an MI-based brain switch and an SSVEP-based BCI [27]. The former was used to activate the SSVEP-based control of an orthosis for working periods and to deactivate the LEDs during resting periods. This combination of two BCIs operated using different mental strategies yielded a much lower FPR during resting periods compared with that achieved using the SSVEP-based BCI alone.
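The RT and FAR figures reported here can be computed from a command log as sketched below; the log format of (issued time, cue time, false-activation flag) triples is hypothetical:

```python
def response_stats(events, duration_min):
    """Average response time (s) and false activation rate (events/min)
    from a log of (issued_time, cue_time, is_false) command events."""
    rts = [t - cue for t, cue, is_false in events if not is_false]
    n_false = sum(1 for _, _, is_false in events if is_false)
    avg_rt = sum(rts) / len(rts) if rts else float('nan')
    far = n_false / duration_min
    return avg_rt, far

# Two cued commands and one spurious activation over a 2-min run.
log = [(14.2, 10.0, False), (33.5, 30.0, False), (47.0, 47.0, True)]
avg_rt, far = response_stats(log, duration_min=2.0)
```

RT measures how promptly the switch responds to an intended command, while FAR counts unintended activations per minute of monitoring time; both must be low for a usable go/stop switch.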
Several multimodal BCIs combining eye movements and EEG have also been reported and used for robot control [38], [39]. Additionally, several multimodal BCIs based on EEG and EMG have also been presented for multidimensional neuroprosthesis control, such as that proposed in [59], which will be reviewed in Section V.
V. CLINICAL APPLICATIONS: AWARENESS EVALUATION/DETECTION IN PATIENTS WITH DOC
In this section, we illustrate several clinical applications of multimodal BCIs, predominantly oriented toward BCI-based awareness detection in patients with DOC. Patients with DOC include patients in a coma, patients in a vegetative state (VS) or with unresponsive wakefulness syndrome (UWS), patients in a minimally conscious state (MCS), and patients who have emerged from an MCS (EMCS) [60], [61]. Certain patients suffering from severe brain injury may progress to a VS/UWS, in which they may
be awake but show no awareness of themselves or their environment [62]. Other patients may improve to an MCS, in which they demonstrate inconsistent but reproducible signs of awareness [63]. Furthermore, EMCS patients reliably and consistently demonstrate functional interactive communication or functional use of two different objects. Currently, the clinical diagnosis of patients with DOC relies primarily on behavioral observation scales such as the Glasgow Coma Scale (GCS) and the JFK Coma Recovery Scale-Revised (JFK CRS-R). Because such patients lack the ability to perform normal physical movements and because it is difficult to distinguish reflexes from voluntary movements, misdiagnoses based on behavioral observation scales are common. For instance, the misdiagnosis rates for VS/UWS and MCS patients are rather high, ranging from 37% to 43% [64]. Detecting awareness in these patients is extremely challenging. Recently, potential applications of BCIs in awareness detection and online communication for DOC patients have been explored in several studies [14], [15]. Lulé et al. tested a four-choice auditory oddball BCI in a study involving 16 healthy subjects and 18 patients with DOC [13 MCS, three VS, and two locked-in syndrome (LIS)]; 13 healthy subjects, one MCS patient, and one LIS patient were shown to be able to communicate using the BCI [14]. Kübler and Birbaumer presented an auditory P300-based BCI speller for patients with LIS [65]. Two of the four patients achieved a mean accuracy of 24%, and the other two patients were unable to output any correct characters. An important advantage of BCIs in awareness detection is that real-time feedback based on brain activities instead of behavioral responses is available. However, BCI-based awareness detection in patients with DOC is still in its infancy.
The detection performance of BCIs designed for such patients is generally poor because the cognitive levels of DOC patients are much lower than those of healthy subjects and the brain patterns produced by DOC patients are generally much weaker than those of healthy subjects. To achieve awareness detection in DOC patients, improving BCI detection performance is therefore an important research task. In Section III, we presented several multimodal BCIs, including BCIs that combine the P300 potential and SSVEPs and an audiovisual BCI, and demonstrated their advantages in improving detection performance based on data collected from healthy subjects. We expect that these multimodal BCIs might also be suitable for the challenging population of patients with DOC. In the following sections, we will present our three preliminary results regarding the applications of these multimodal BCIs (or their variants) for awareness detection in DOC patients; the first two are based on multimodal BCIs that use a combination of the P300 and SSVEPs, which are variants of the multimodal brain switch presented in Section III-A, and the third is based on the audiovisual BCI presented in Section III-C.
A. Awareness Detection Based on a Multimodal BCI Combining the P300 Potential and SSVEPs In Section III, we reviewed several multimodal BCIs combining P300 and SSVEPs and reported their improved detection performance compared with P300-only or SSVEP-only BCIs. We have also applied this type of multimodal BCI for the detection of awareness in DOC patients, as reported in [15]. 1) Paradigm: The GUI is depicted in Fig. 8. In each trial, a photograph of the subject’s own face and a photograph of an unfamiliar face, each embedded in a photo frame, are randomly displayed on the left and right sides of the GUI. The left- and right-hand photos flicker between appearance and disappearance on a black background at frequencies of 6.0 and 7.5 Hz, respectively, to evoke SSVEPs. Meanwhile, the two photo frames also flash between appearance and disappearance in a random order to evoke the P300. The subject is instructed to focus on his/her own photograph or the unfamiliar one and to count the flashes of the corresponding photo frame. In this multimodal BCI paradigm, SSVEP and P300 responses are simultaneously elicited by the flickering target photograph and flashing target photo frame, respectively. Here, face photographs are utilized as stimuli based on the following considerations. Among the various available emotionally laden visual stimuli, faces are especially capable of capturing one’s attention because of their particular biological and social significance [66]. Furthermore, several BCI studies [67], [68] have demonstrated that multiple ERP components, including the N170, the vertex-positive potential (VPP), and the P300, can be elicited using human faces as stimuli, and these components are useful for improving the target detection performance of BCIs. 2) Experimental Procedure and Results: Three experimental runs were performed with eight patients (four VS, three MCS, and one LIS, diagnosed using the JFK CRS-R;
Fig. 8. GUI of the multimodal BCI for awareness detection [15].
four males; native Chinese; mean age ± SD of 37.8 ± 19 years) and four healthy controls. Each run consisted of five blocks, and each block consisted of only ten trials because the patients were easily fatigued during the experiments. In runs 1 and 2, the subjects were asked to focus on their own photographs and the unfamiliar photographs, respectively. In run 3, the subjects had to focus on their own photographs or the unfamiliar photographs in a pseudorandom order. The tasks of runs 1, 2, and 3 were progressively more difficult for the DOC patients. A subject was asked to perform run 3 only if he or she had successfully performed the tasks in runs 1 and 2. Note that prior to the online experiment, each subject performed a calibration run of ten trials identical to those of run 1 and trained an initial P300 classification model. The P300 classification model was updated after each block of the online experiment based on the data obtained in the calibration run and the data collected online. The BCI system determined which photograph the subject was focused on by means of both P300 and SSVEP detection and then presented the result on the screen as feedback, similarly to the feedback method described in Section III-A (see [15] and [21] for details). The accuracy was calculated as the ratio of the number of all correct responses (hits) to the total number of presented trials, and the significance of this value was assessed using χ² statistics [65]. At a significance level of p = 0.05, the criterion value of χ² is 3.84, corresponding to 32 hits in 50 trials or an accuracy of 64% in our two-choice BCI. The accuracies for patients VS1, MCS1, and LIS1 were significantly higher than the chance level for each of runs 1, 2, and 3, indicating that these three patients were able to follow commands. The average accuracies across all three runs were 72%, 69.3%, and 75.3% for patients VS1, MCS1, and LIS1, respectively. The accuracies for patients VS2 and MCS2 were significant in run 1 but not in run 2. This result may have occurred because emotional value or familiarity increases the likelihood that one's attention will be attracted to a photograph of one's own face [66]. For subjects VS1, MCS1, and LIS1 and one healthy subject (HC1), two ERP waveforms corresponding to each of runs 1, 2, and 3 were obtained from 200 ms pre-stimulus to 1000 ms post-stimulus by averaging over the 50 trials, one for target stimuli and the other for nontarget stimuli. For patients VS3, VS4, and MCS3, the accuracies were not significant in either run 1 or run 2. Fig. 9 shows these ERP waveforms corresponding to the ‘‘Pz’’ electrode. In this figure, a P300-like component is evident in each of the target curves. It follows from Fig. 9 that significant differences in the P300 potential existed between the DOC patients and the healthy controls. For instance, the latency of the P300 was generally longer for the patients than for the healthy controls, a finding that has also been reported in other studies [69], [70]. Additionally, recall that in this experiment, each run consisted of five blocks, and different blocks were administered on different days because the patients were easily fatigued. It took approximately two months for each
Fig. 9. Grand-average P300 ERP waveforms at the ''Pz'' electrode from 200 ms pre-stimulus to 1000 ms post-stimulus in runs 1 (upper panels), 2 (middle panels), and 3 (lower panels) for the four subjects (VS1, MCS1, LIS1, and HC1). The red curves containing P300 responses and the blue curves without P300 responses correspond to the target and nontarget photo frames, respectively [15].
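The chance-level threshold reported above (32 hits in 50 trials, i.e., 64%) can be reproduced directly from the χ² test with one degree of freedom. A minimal sketch in Python, using only the trial count and significance level stated in the text:

```python
def chi2_stat(hits, n, p=0.5):
    """Pearson chi-square statistic (1 degree of freedom) for n two-choice trials."""
    expected_hits, expected_misses = n * p, n * (1 - p)
    misses = n - hits
    return ((hits - expected_hits) ** 2 / expected_hits
            + (misses - expected_misses) ** 2 / expected_misses)

CRITICAL = 3.84  # chi-square critical value for 1 df at p = 0.05

def min_significant_hits(n=50):
    """Smallest above-chance hit count whose chi-square statistic reaches 3.84."""
    return next(k for k in range(n // 2 + 1, n + 1) if chi2_stat(k, n) >= CRITICAL)
```

For 50 trials, `min_significant_hits(50)` returns 32, i.e., a 64% accuracy threshold, matching the value used in the study.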
Fig. 10. Average power density spectra of the EEG signals from the eight channels (''P7,'' ''P3,'' ''Pz,'' ''P4,'' ''P8,'' ''O1,'' ''Oz,'' and ''O2'') in runs 1 (upper panels), 2 (middle panels), and 3 (lower panels) for the four subjects (VS1, MCS1, LIS1, and HC1). The red points indicate the SSVEP responses corresponding to the flickering frequency of the target photograph, and the green points correspond to the flickering frequency of the nontarget photograph. The target frequencies of 6 and 7.5 Hz imply that the target photographs appeared on the left and right sides of the GUI, respectively [15].
subject to complete all three runs of the experiment. Because of the fluctuating levels of consciousness in DOC patients, the P300 waveforms varied from run to run during the experiment. Furthermore, for each run (runs 1, 2, and 3) and each subject (subjects VS1, MCS1, LIS1, and HC1), Fig. 10 shows two average power density spectrum curves of the EEG signals across trials, one corresponding to the target photographs appearing on the left-hand side of the GUI and one corresponding to those on the right. In this figure, SSVEP responses are evident at the target photograph frequencies for all four subjects. Overall, for the three DOC patients, P300 and SSVEP responses were simultaneously evoked by the multimodal BCI. Using data collected from healthy subjects, we have verified that such simultaneous responses are useful for BCI classification (Section III). We did not perform any experiments to compare the results of the multimodal BCI with those of the P300-only and SSVEP-only BCIs for DOC patients because such experiments would have posed a heavy workload for these patients. However, we expect that the simultaneously evoked P300 and SSVEP responses should be similarly useful for BCI classification for these DOC patients based on results of the comparison for healthy subjects. Notably, it has been demonstrated that both P300 and SSVEP responses involve sequential activations of cortical networks and reflect higher order cognitive
abilities [71]. The P300 and SSVEP responses observed in this study therefore also led to the conclusion that awareness was present in the three patients who achieved significant accuracies in all three runs (VS1, MCS1, and LIS1).
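The SSVEP detection underlying Fig. 10 amounts to comparing narrow-band spectral power at the two flicker frequencies. A minimal sketch, assuming a 250-Hz sampling rate, a plain periodogram, and a single channel (the study's actual pipeline, described in [15], uses multiple parietal/occipital channels):

```python
import numpy as np

FLICKER_FREQS = (6.0, 7.5)  # left and right photograph frequencies (Hz)

def band_power(x, fs, f0, half_width=0.2):
    """Mean periodogram power within f0 +/- half_width Hz."""
    psd = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    band = (freqs >= f0 - half_width) & (freqs <= f0 + half_width)
    return psd[band].mean()

def ssvep_target(x, fs=250.0):
    """Index (0 = left, 1 = right) of the flicker frequency with more power."""
    return int(np.argmax([band_power(x, fs, f) for f in FLICKER_FREQS]))
```

Applied to a 10-s EEG segment, the function reports which photograph's flicker frequency dominates the spectrum, i.e., the SSVEP decision for that trial.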
B. Number Processing and Mental Calculation Based on a Multimodal BCI Combining P300 and SSVEPs

Number processing and mental calculation are important brain functions that are associated with other cognitive functions, such as symbol representation and operation, attention, working memory, and linguistic processing [72]. We developed a multimodal BCI that uses a combination of P300 potentials and SSVEPs for the assessment and training of number recognition and arithmetic calculation abilities in DOC patients.

1) Paradigm: The GUI is similar to that presented in Fig. 8 except that the two photographs are replaced by two buttons displaying single-digit Arabic numbers (e.g., 6 and 8), one of which is the correct answer to a presented arithmetic problem. The patients are instructed to focus on the button corresponding to the target number (the answer to the arithmetic problem) and to count the number of flashes of the corresponding button frame. In this manner, the SSVEP and P300 responses can be simultaneously elicited by the flickering target number button and its flashing button frame, respectively.
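In both this paradigm and the one in Section V-A, the system decides on the target by combining P300 and SSVEP evidence. A minimal score-fusion sketch, assuming a linear combination of min-max-normalized scores with one value per candidate button (the actual detection method follows [15] and [21]):

```python
import numpy as np

def fused_target(p300_scores, ssvep_powers):
    """Pick the candidate with the largest sum of min-max-normalized
    P300 classifier scores and SSVEP narrow-band powers."""
    def rescale(v):
        v = np.asarray(v, dtype=float)
        span = v.max() - v.min()
        return (v - v.min()) / span if span > 0 else np.zeros_like(v)
    return int(np.argmax(rescale(p300_scores) + rescale(ssvep_powers)))
```

Normalizing both score vectors before summing keeps either modality from dominating simply because of its scale; more refined weighted-fusion rules are studied in [49].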
2) Experimental Procedure and Results: The experiment consisted of three runs in which the subjects were instructed to perform tasks of number recognition (run 1), number comparison (run 2), and mental calculation, namely, single-digit addition and subtraction (run 3), respectively. Each run consisted of five blocks, and each block was composed of ten trials. Different blocks were administered on different days because the patients were easily fatigued. Each trial began with visual and auditory presentation of the task instructions in Chinese, which lasted 8 s. Then, two flickering number buttons, one of which was the correct answer, appeared on the GUI for 10 s. In each trial of run 1, the subjects were asked to recognize the target number specified in the instruction and focus on it. In each trial of run 2, the subjects were asked to focus on either the larger or smaller number. In each trial of run 3, a predefined addition or subtraction problem, e.g., ''3 + 5 = ?,'' was first presented through the instruction. The subjects were then asked to perform the required mental calculation and focus on the correct answer. The BCI system determined the target number on which the subject was focused by means of both P300 and SSVEP detection, in a manner similar to that described in Section V-A. Note that the tasks of runs 1, 2, and 3 became increasingly more difficult for the patients with DOC. Eleven patients (six VS, three MCS, and two EMCS, diagnosed using the JFK CRS-R; five males; native Chinese; mean age ± SD of 38.6 ± 12 years) participated in runs 1 and 2, and only those patients who achieved accuracies significantly higher than the chance level (50%) in runs 1 and 2 also underwent run 3. Five of the 11 patients (two VS, one MCS, and two EMCS) achieved accuracies higher than the chance level in runs 1 and 2 (ranging from 66% to 80%; p < 0.05, χ²-test).
Among these five patients, three (one VS, one MCS, and one EMCS) also achieved accuracies higher than the chance level (50%) in run 3 (ranging from 64% to 66%; p < 0.05, χ²-test). Furthermore, both P300 potentials and SSVEPs were observed in the EEG signals from these five patients. For the other six patients, the accuracies were not significant in runs 1 and 2, and run 3, which represented the most difficult task, was therefore not conducted. Our experimental results demonstrated not only command following but also number and arithmetic abilities in these five patients. Additionally, all five patients exhibited somewhat improved consciousness levels according to JFK CRS-R-based behavioral assessments before and after the experiment. However, no conclusion could be drawn regarding whether the improved diagnoses for these patients were a result of these number- and calculation-based BCI training sessions. As demonstrated in this experiment, BCIs can assist patients with DOC who lack sufficient motor responses to output the results of number recognition and mental calculation tasks and thus provide an effective tool for the detection of related abilities, in which the BCI detection
plays an important role. Furthermore, our multimodal BCI system can also be used for number recognition and arithmetic calculation training for those DOC patients who can achieve accuracies significantly higher than the chance level.
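Each run-3 trial pairs an arithmetic problem with a correct answer button and a distractor button placed at random. A small sketch of such trial generation; the digit ranges and the distractor-selection rule are illustrative assumptions, not taken from the paper:

```python
import random

def make_run3_trial(rng=random):
    """Build one mental-calculation trial: a single-digit addition or
    subtraction problem plus two answer buttons (correct and distractor)."""
    a, b = rng.randint(0, 9), rng.randint(0, 9)
    if rng.random() < 0.5:
        problem, answer = f"{a} + {b} = ?", a + b
    else:
        a, b = max(a, b), min(a, b)           # keep the result non-negative
        problem, answer = f"{a} - {b} = ?", a - b
    distractor = answer + rng.choice([-2, -1, 1, 2])  # nearby wrong answer
    buttons = [answer, distractor]
    rng.shuffle(buttons)                      # random left/right placement
    return problem, buttons, answer
```

A nearby distractor (rather than a random digit) forces an actual calculation rather than a rough magnitude judgment.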
C. Awareness Detection Based on an Audiovisual BCI

An audiovisual BCI was described in Section III-C, and its detection performance was superior to those of the corresponding visual-only and auditory-only P300 BCIs. We applied this audiovisual BCI for awareness detection in patients with DOC [51].

1) Experimental Procedure and Results: Seven patients with DOC (six males; three VS, four MCS; mean age ± SD of 37 ± 17 years) from a local hospital participated in the experiment. None of the patients had a history of impaired visual or auditory acuity. The experimental procedure was similar to that presented for healthy subjects in Section III-C except for the following changes. Because the patients were easily fatigued, the test run was divided into five blocks, each consisting of ten trials and each conducted on a different day. The SVM classification model was updated after each test block using the data from that block. Most of the patients performed at least 50 trials, except patients P3 and P6 (they completed 40 trials and then left the hospital for financial reasons). As shown in Table 1, the online accuracies for five patients (P1, P4, P5, P6, and P7) were significantly higher than the chance level (ranging from 66% to 74%; p < 0.05, χ²-test). These experimental results demonstrate the presence of command following and residual number recognition ability in the five DOC patients. For healthy subjects, we demonstrated in Section III that audiovisual stimuli enhanced several ERP components, including the P100, N200, and P300, and therefore improved the BCI classification performance. For the DOC patients, we did not conduct experiments to compare the audiovisual BCI with the visual-only or auditory-only BCI because it was

TABLE 1 Online Accuracy of Each Patient [51]
difficult for them to sustain such a heavy experimental workload. However, the effectiveness of the audiovisual BCI was demonstrated for the DOC patient population. Furthermore, we can infer that the paradigm of the audiovisual BCI is useful for BCI classification for these DOC patients based on the results from the healthy subjects. As mentioned previously, misdiagnosis rates based on behavioral observation scales such as the GCS and the JFK CRS-R are rather high because patients often cannot produce sufficient behavioral responses. In this section, we have shown that the proposed multimodal BCI systems, including an SSVEP-and-P300-based multimodal BCI and an audiovisual BCI, may offer a potential solution to this problem and, moreover, can be potentially used as a supportive bedside tool. During the experiments using the SSVEP-and-P300-based multimodal BCI and the audiovisual BCI, two VS patients achieved accuracies significantly higher than the chance level. These findings suggest the possibility of misdiagnosis for these two VS patients. Previous fMRI (e.g., [73]) and EEG (e.g., [74]) studies have also shown that some patients who meet the behavioral criteria for VS may have residual cognitive functions and even conscious awareness. For patients to perform the presented experimental tasks, many cognitive functions are needed, such as language comprehension, object selection, working memory (i.e., the ability to remember the instructions), and sustained attention. All such cognitive functions, and thus residual awareness, were demonstrated in those patients that achieved the positive results. Notably, visual photograph stimuli and audiovisual number stimuli were presented separately in the two multimodal BCIs. Thus, either or both may be used for detecting awareness in a DOC patient depending on his or her status. 
For instance, if neither the visual nor the auditory acuity of a patient is impaired, we may choose the audiovisual BCI; whereas if the auditory acuity but not the visual acuity is impaired, we may choose the SSVEP-and-P300-based multimodal BCI (if the condition of the patient permits, we can also use both multimodal BCIs). In this section, we have illustrated the clinical applications of multimodal BCIs in awareness detection in DOC patients based on the improved target detection performance offered by multimodal BCIs. Additionally, as we have discussed before, multimodal BCIs can provide multidimensional object control, which also has clinical applications. For instance, EEG and EMG can be combined in multimodal BCIs for the multidimensional control of a neuroprosthesis [37], [59], [75] or for the rehabilitation of severely motor-restricted patients, such as those with stroke or spinal cord injuries [76]. Another effective rehabilitative system that has been proposed for stroke patients consists of an MI-based BCI and the InMotion II MIT-Manus planar shoulder and elbow robotic arm [77]. Although still in a nascent stage, multimodal BCIs have demonstrated their potential in clinical applications.
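The block-wise classifier updating used in these patient experiments, in which the model is refit after each ten-trial block on all data collected so far, can be sketched as follows. This is a minimal sketch under stated assumptions: the EEG feature extraction is omitted, and a least-squares linear classifier stands in for the SVM used in the study:

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares linear classifier (a simple stand-in for the study's SVM)."""
    A = np.hstack([np.asarray(X, float), np.ones((len(X), 1))])  # add bias column
    w, *_ = np.linalg.lstsq(A, 2 * np.asarray(y, float) - 1, rcond=None)
    return w

def predict(w, X):
    A = np.hstack([np.asarray(X, float), np.ones((len(X), 1))])
    return (A @ w > 0).astype(int)

def run_online_blocks(blocks, calib_X, calib_y):
    """Classify each block with a model trained on all data seen so far,
    then fold that block into the training set (block-wise updating)."""
    train_X, train_y = list(calib_X), list(calib_y)
    out = []
    for block_X, block_y in blocks:           # one ten-trial block at a time
        w = fit_linear(train_X, train_y)
        out.extend(int(p) for p in predict(w, block_X))
        train_X.extend(block_X)               # grow the training set
        train_y.extend(block_y)
    return out
```

Refitting on the accumulated data after every block lets the classifier track the nonstationarity of patients' EEG across the days over which the blocks are spread.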
VI. CONCLUSION AND DISCUSSION OF FUTURE PERSPECTIVES

This paper critically reviewed state-of-the-art multimodal BCI systems, including their paradigm designs, control methods, and corresponding experimental results, and demonstrated that multimodal BCI techniques can be utilized to improve the target detection performance of BCIs and to perform multidimensional object control. We first summarized four basic classes of multimodal BCIs: BCIs combining multiple brain patterns, multisensory BCIs, BCIs based on multimodal signals, and BCI-based shared control systems. However, this categorization is not strict because different classes of multimodal BCIs may overlap. For instance, a shared control wheelchair system may consist of a multimodal BCI subsystem and an intelligent navigation system. Next, we analyzed two classes of multimodal BCIs, namely, BCIs combining multiple brain patterns and multisensory BCIs, to illustrate how these multimodal BCIs were designed to address two major challenges in the BCI field. BCI detection performance can be improved through the fusion of features from multiple brain patterns or by enhancing brain patterns through the presentation of multisensory stimuli, whereas multidimensional object control can be achieved by generating multiple separate control signals based on different brain patterns. Finally, we reviewed the emerging clinical applications of multimodal BCIs, including awareness detection in patients with DOC. Here, we offer several concluding remarks regarding the design principles of a multimodal BCI system and possibilities for future studies. Based on the various aspects of brain mechanisms, system implementation, and end-user workloads, we can summarize several principles for the design of a multimodal BCI. First, to evaluate a multimodal BCI system, it is preferable to consider multiple factors, such as accuracy, ITR, system complexity, cost, and user workload [17].
However, one or several particular aspects may be emphasized in specific applications. For instance, for patients with DOC, whose cognitive capabilities are much lower than those of healthy individuals, we primarily consider concerns related to classification accuracy and user workload. In this case, the ITR may not be a key factor. To provide functional assistance for these patients, a shared control system is an appropriate choice to reduce the user workload. Second, in a multimodal BCI system, we may combine the use of multiple brain patterns, sensory modalities, signal inputs, or intelligent techniques. However, the optimal combination may differ considerably among users. Therefore, we should consider the possibility of personalized design, especially when a multimodal BCI system is designed for a specific group of patients. Third, the seamless integration of different input mechanisms and mental tasks must be carefully considered. For instance, when we design an audiovisual BCI, we should ensure the spatial, temporal, and semantic congruence of the audiovisual stimuli to evoke stronger brain patterns, such as the P300, that are
stronger than those evoked through auditory-only or visual-only stimuli. In the following, several future perspectives on the study of multimodal BCIs are identified.

1) Brain mechanisms for multimodal BCIs: Brain mechanisms play an important role in determining the effectiveness of a BCI and its applications. MI is generally assumed to activate the same representations as the execution of the corresponding motor tasks [78]. Mental practice of movement sequences has been shown to activate similar brain areas as physical practice and also leads to improved performance when learning a sequence of foot movements [79]. In addition, there is evidence to suggest that mental rehearsal may have rehabilitative effects in cases of stroke and spinal cord injury [80]. These results may provide a theoretical basis for motor rehabilitative training using MI-based BCIs. When designing a multimodal BCI, we must consider the underlying brain mechanisms. For instance, crossmodal integration/interaction in the brain may provide the brain mechanisms for multisensory BCIs. We have shown that temporally, spatially, and semantically congruent audiovisual stimuli may enhance the P300, N200, and P100, which may lead to improved performance of an audiovisual P300 BCI. In summary, a multimodal BCI system may involve multiple brain patterns, multisensory modalities, multiple signal inputs, or multiple intelligent systems. To ensure the effective coordination of these components in a multimodal BCI system, the related brain mechanisms must be considered. However, few brain mechanism studies have yet been conducted for multimodal BCIs.

2) Multimodal BCI design and implementation: For multimodal BCIs combining multiple brain patterns, future studies may be devoted to identifying the optimal combinations of brain patterns to accomplish certain desired goals, which will most likely differ considerably among users [17].
When designing a multisensory BCI, we may consider more combinations of multisensory stimuli involving visual, auditory, and tactile modalities, where the key is to ensure that the desired brain patterns are enhanced by the multisensory stimuli. For multimodal BCIs based on multiple signal inputs, combinations of multiple brain signals, such as EEG and MEG, or multiple biosignals, such as EEG and EMG, have been studied [34], [41]. In the future, we may develop real-time multimodal BCIs based on EEG and fMRI that address the following challenges: high noise in the EEG data (produced by the fMRI scanner), slow BOLD responses, and the high dimensionality and low time resolution of fMRI data. One potential application of this type of multimodal BCI is in brain mechanism studies for BCIs. For shared control BCI systems, one should also consider the paradigm of man-machine adaptation/learning to implement the coupling of human and machine, and to establish a model that can effectively merge the user's intention and the machine's decision making.

3) Multimodal neurofeedback: It is believed that neurofeedback can assist a subject in learning to perform various mental tasks, such as MI, and to achieve effective BCI control [81]. Furthermore, neurofeedback-based training can play an important role in rehabilitation for many patients, such as those with stroke [76]. In [29], a multimodal BCI, in which the neurofeedback was constructed based on a combination of MI and SSVEPs, was proposed to facilitate MI training, and the experimental results demonstrated its efficacy. However, further studies are still necessary to investigate appropriate means of establishing effective multimodal neurofeedback paradigms based on multiple brain patterns or multiple sensory stimuli (such as visual, tactile, and auditory stimuli).

4) Clinical applications: To date, most of the multimodal BCI systems mentioned in this paper, such as BCI browsers and BCI wheelchairs, have been designed and tested using healthy subjects and are far from application to the end-user market. These systems must be extended for patients by carefully considering the differences between healthy subjects and patients. In addition, individual designs for use by a specific patient may be necessary. For patients with DOC, the application of multimodal BCIs is in a nascent stage, and multimodal-BCI-based communication and rehabilitation will be an important topic for future study. An important advantage of shared-control BCI systems is their ability to reduce user workloads. Such advantages are promising for patients with low-level cognition and low control abilities. However, to our knowledge, no such systems have yet been developed for patients.

REFERENCES

[1] J. R. Wolpaw, N. Birbaumer, D. J. McFarland, G. Pfurtscheller, and T. M. Vaughan, "Brain-computer interfaces for communication and control," Clin. Neurophysiol., vol. 113, no. 6, pp. 767–791, 2002.
[2] N. Birbaumer et al., "A spelling device for the paralysed," Nature, vol. 398, no. 6725, pp. 297–298, 1999.
[3] S. H. Scott, "Neuroscience: Converting thoughts into action," Nature, vol. 442, no. 7099, pp. 141–142, 2006.
[4] J. A. Wilson, E. A. Felton, P. C. Garell, G. Schalk, and J. C. Williams, "ECoG factors underlying multimodal control of a brain-computer interface," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 14, no. 2, pp. 246–250, Jun. 2006.
[5] L. A. Farwell and E. Donchin, "Talking off the top of your head: Toward a mental prosthesis utilizing event-related brain potentials," Electroencephalogr. Clin. Neurophysiol., vol. 70, no. 6, pp. 510–523, 1988.
[6] G. R. Müller-Putz, R. Scherer, C. Neuper, and G. Pfurtscheller, "Steady-state somatosensory evoked potentials: Suitable brain signals for brain-computer interfaces?," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 14, no. 1, pp. 30–37, Mar. 2006.
[7] G. Pfurtscheller and F. H. Lopes da Silva, "Event-related EEG/MEG synchronization and desynchronization: Basic principles," Clin. Neurophysiol., vol. 110, no. 11, pp. 1842–1857, 1999.
[8] J. Jin, I. Daly, Y. Zhang, X. Wang, and A. Cichocki, "An optimized ERP brain-computer interface based on facial expression changes," J. Neural Eng., vol. 11, no. 3, 2014, Art. ID 036004.
[9] J. Jin, B. Z. Allison, Y. Zhang, X. Wang, and A. Cichocki, "An ERP-based BCI using an oddball paradigm with different faces and reduced errors in critical functions," Int. J. Neural Syst., vol. 24, no. 8, 2014, Art. ID 1450027.
[10] Y. Zhang et al., "Spatial-temporal discriminant analysis for ERP-based brain-computer interface," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 21, no. 2, pp. 233–243, Mar. 2013.
[11] Q. Zhao, G. Zhou, T. Adali, L. Zhang, and A. Cichocki, "Kernelization of tensor-based models for multiway data analysis: Processing of multidimensional structured data," IEEE Signal Process. Mag., vol. 30, no. 4, pp. 137–148, Jul. 2013.
[12] R. Lehembre et al., "Electrophysiological investigations of brain function in coma, vegetative and minimally conscious patients," Arch. Ital. Biol., vol. 150, no. 2/3, pp. 122–139, 2012.
[13] A. J. Calder, J. Keane, F. Manes, N. Antoun, and A. W. Young, "Impaired recognition and experience of disgust following brain injury," Nature Neurosci., vol. 3, no. 11, pp. 1077–1078, 2000.
[14] D. Lulé et al., "Probing command following in patients with disorders of consciousness using a brain-computer interface," Clin. Neurophysiol., vol. 124, no. 1, pp. 101–106, 2013.
[15] J. Pan et al., "Detecting awareness in patients with disorders of consciousness using a hybrid brain-computer interface," J. Neural Eng., vol. 11, no. 5, 2014, Art. ID 056007.
[16] A. Kübler and K.-R. Müller, "An introduction to brain-computer interfacing," in Toward Brain-Computer Interfacing. Cambridge, MA, USA: MIT Press, 2007, pp. 1–25.
[17] G. Pfurtscheller et al., "The hybrid BCI," Front. Neurosci., vol. 4, pp. 1–11, 2010.
[18] J. R. Wolpaw and D. J. McFarland, "Control of a two-dimensional movement signal by a noninvasive brain-computer interface in humans," Proc. Nat. Acad. Sci. USA, vol. 101, no. 51, pp. 17849–17854, 2004.
[19] D. J. McFarland, W. A. Sarnacki, and J. R. Wolpaw, "Electroencephalographic (EEG) control of three-dimensional movement," J. Neural Eng., vol. 7, no. 3, 2010, Art. ID 036007.
[20] A. J. Doud, J. P. Lucas, M. T. Pisansky, and B. He, "Continuous three-dimensional control of a virtual helicopter using a motor imagery based brain-computer interface," PLoS One, vol. 6, no. 10, 2011, Art. ID e26322.
[21] Y. Li, J. Pan, F. Wang, and Z. Yu, "A hybrid BCI system combining P300 and SSVEP and its application to wheelchair control," IEEE Trans. Biomed. Eng., vol. 60, no. 11, pp. 3156–3166, Nov. 2013.
[22] R. C. Panicker, S. Puthusserypady, and Y. Sun, "An asynchronous P300 BCI with SSVEP-based control state detection," IEEE Trans. Biomed. Eng., vol. 58, no. 6, pp. 1781–1788, Jun. 2011.
[23] M. Xu et al., "A hybrid BCI speller paradigm combining P300 potential and the SSVEP blocking feature," J. Neural Eng., vol. 10, no. 2, 2013, Art. ID 026001.
[24] E. Yin et al., "A novel hybrid BCI speller based on the incorporation of SSVEP into the P300 paradigm," J. Neural Eng., vol. 10, no. 2, 2013, Art. ID 026012.
[25] B. Z. Allison et al., "Toward a hybrid brain-computer interface based on imagined movement and visual attention," J. Neural Eng., vol. 7, no. 2, 2010, Art. ID 026007.
[26] B. Z. Allison et al., "A hybrid ERD/SSVEP BCI for continuous simultaneous two dimensional cursor control," J. Neurosci. Methods, vol. 209, no. 2, pp. 299–307, 2012.
[27] G. Pfurtscheller, T. Solis-Escalante, R. Ortner, P. Linortner, and G. R. Müller-Putz, "Self-paced operation of an SSVEP-based orthosis with and without an imagery-based 'brain switch': A feasibility study towards a hybrid BCI," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 18, no. 4, pp. 409–414, Aug. 2010.
[28] Y. Li et al., "An EEG-based BCI system for 2-D cursor control by combining Mu/Beta rhythm and P300 potential," IEEE Trans. Biomed. Eng., vol. 57, no. 10, pp. 2495–2505, Oct. 2010.
[29] T. Yu et al., "Enhanced motor imagery training using a hybrid BCI with feedback," IEEE Trans. Biomed. Eng., vol. 62, no. 7, pp. 1706–1717, Jul. 2015.
[30] A. Belitski, J. Farquhar, and P. Desain, "P300 audio-visual speller," J. Neural Eng., vol. 8, no. 2, 2011, Art. ID 025022.
[31] M. E. Thurlings, A. M. Brouwer, J. B. Van Erp, B. Blankertz, and P. J. Werkhoven, "Does bimodal stimulus presentation increase ERP components usable in BCIs?," J. Neural Eng., vol. 9, no. 4, 2012, Art. ID 045005.
[32] T. M. Rutkowski and H. Mori, "Tactile and bone-conduction auditory brain computer interface for vision and hearing impaired users," J. Neurosci. Methods, vol. 244, pp. 45–51, Apr. 2015.
[33] D. Talsma and M. G. Woldorff, "Selective attention and multisensory integration: Multiple phases of effects on the evoked brain activity," J. Cogn. Neurosci., vol. 17, no. 7, pp. 1098–1114, 2005.
[34] L. Kauhanen et al., "EEG and MEG brain-computer interface for tetraplegic patients," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 14, no. 2, pp. 190–193, Jun. 2006.
[35] T. Hinterberger et al., "An EEG-driven brain-computer interface combined with functional magnetic resonance imaging (fMRI)," IEEE Trans. Biomed. Eng., vol. 51, no. 6, pp. 971–974, Jun. 2004.
[36] S. Fazli et al., "Enhanced performance by a hybrid NIRS-EEG brain computer interface," NeuroImage, vol. 59, no. 1, pp. 519–529, 2012.
[37] G. Müller-Putz et al., "Towards non-invasive hybrid brain-computer interfaces: Framework, practice, clinical application and beyond," Proc. IEEE, vol. 103, no. 6, pp. 926–943, Jun. 2015.
[38] J. Ma, Y. Zhang, A. Cichocki, and F. Matsuno, "A novel EOG/EEG hybrid human-machine interface adopting eye movements and ERPs: Application to robot control," IEEE Trans. Biomed. Eng., vol. 62, no. 3, pp. 876–889, Mar. 2015.
[39] B. Choi and S. Jo, "A low-cost EEG system-based hybrid brain-computer interface for humanoid robot navigation and recognition," PLoS One, vol. 8, no. 9, 2013, Art. ID e74583.
[40] F. Putze et al., "Hybrid fNIRS-EEG based classification of auditory and visual perception processes," Front. Neurosci., vol. 8, pp. 1–13, 2014.
[41] E. Holz et al., "Hybrid-P300 BCI: Usability testing by severely motor-restricted end-users," in Proc. TOBI Workshop IV, 2013, pp. 111–112.
[42] G. R. Müller-Putz et al., "Tools for brain-computer interaction: A general concept for a hybrid BCI," Front. Neuroinf., vol. 5, p. 30, 2011.
[43] T. Carlson and J. del R. Millán, "Brain-controlled wheelchairs: A robotic architecture," IEEE Robot. Autom. Mag., vol. 20, no. 1, pp. 65–73, Mar. 2013.
[44] I. Iturrate, J. M. Antelis, A. Kübler, and J. Minguez, "A noninvasive brain-actuated wheelchair based on a P300 neurophysiological protocol and automated navigation," IEEE Trans. Robot., vol. 25, no. 3, pp. 614–627, Jun. 2009.
[45] B. Rebsamen et al., "A brain controlled wheelchair to navigate in familiar environments," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 18, no. 6, pp. 590–598, Dec. 2010.
[46] R. Zhang et al., "Control of a wheelchair in an indoor environment based on a brain-computer interface and automated navigation," IEEE Trans. Neural Syst. Rehabil. Eng., 2015, DOI: 10.1109/TNSRE.2015.2439298.
[47] M. Wang et al., "A new hybrid BCI paradigm based on P300 and SSVEP," J. Neurosci. Methods, vol. 244, pp. 16–25, 2015.
[48] J. Pan, Y. Li, R. Zhang, Z. Gu, and F. Li, "Discrimination between control and idle states in asynchronous SSVEP-based brain switches: A pseudo-key-based approach," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 21, no. 3, pp. 435–443, May 2013.
[49] E. Yin et al., "A hybrid brain-computer interface based on the fusion of P300 and SSVEP scores," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 23, no. 4, pp. 693–701, Jul. 2015.
[50] B. Z. Allison, J. Jin, H. Zhang, and C. Wang, "A four-choice hybrid P300/SSVEP BCI for improved accuracy," Brain-Comput. Interfaces, vol. 1, no. 1, pp. 17–26, 2014.
[51] F. Wang et al., "A novel audiovisual brain-computer interface and its application in awareness detection," Sci. Rep., vol. 5, 2015, DOI: 10.1038/srep09962.
[52] J. Polich and C. Margala, "P300 and probability: Comparison of oddball and single-stimulus paradigms," Int. J. Psychophysiol., vol. 25, no. 2, pp. 169–176, 1997.
| Proceedings of the IEEE
[53] C. Guger et al., ‘‘How many people are able to control a P300-based brain-computer interface (BCI)?,’’ Neurosci. Lett., vol. 462, no. 1, pp. 94–98, 2009.
[54] J. Long, Y. Li, T. Yu, and Z. Gu, ‘‘Target selection with hybrid feature for BCI-based 2-D cursor control,’’ IEEE Trans. Biomed. Eng., vol. 59, no. 1, pp. 132–140, Jan. 2012.
[55] T. Yu, Y. Li, J. Long, and Z. Gu, ‘‘Surfing the internet with a BCI mouse,’’ J. Neural Eng., vol. 9, no. 3, 2012, Art. ID. 036012.
[56] T. Yu, Y. Li, J. Long, and F. Li, ‘‘A hybrid brain-computer interface-based mail client,’’ Comput. Math. Methods Med., vol. 2013, 2013, Art. ID. 750934.
[57] L. Bai, T. Yu, and Y. Li, ‘‘A brain computer interface-based explorer,’’ J. Neurosci. Methods, vol. 244, pp. 2–7, 2015.
[58] J. Long et al., ‘‘A hybrid brain computer interface to control the direction and speed of a simulated or real wheelchair,’’ IEEE Trans. Neural Syst. Rehabil. Eng., vol. 20, no. 5, pp. 720–729, May 2012.
[59] M. Rohm et al., ‘‘Hybrid brain-computer interfaces and hybrid neuroprostheses for restoration of upper limb functions in individuals with high-level spinal cord injury,’’ Artif. Intell. Med., vol. 59, no. 2, pp. 133–142, 2013.
[60] J. L. Bernat, ‘‘Chronic consciousness disorders,’’ Annu. Rev. Med., vol. 60, pp. 381–392, 2009.
[61] J. T. Giacino, J. J. Fins, S. Laureys, and N. D. Schiff, ‘‘Disorders of consciousness after acquired brain injury: The state of the science,’’ Nature Rev. Neurol., vol. 10, no. 2, pp. 99–114, 2014.
[62] B. Jennett and F. Plum, ‘‘Persistent vegetative state after brain damage: A syndrome in search of a name,’’ Lancet, vol. 299, no. 7753, pp. 734–737, 1972.
[63] J. T. Giacino et al., ‘‘The minimally conscious state: Definition and diagnostic criteria,’’ Neurology, vol. 58, no. 3, pp. 349–353, 2002.
[64] C. Schnakers et al., ‘‘Diagnostic accuracy of the vegetative and minimally conscious state: Clinical consensus versus standardized neurobehavioral assessment,’’ BMC Neurol., vol. 9, no. 1, p. 35, 2009.
[65] A. Kübler and N. Birbaumer, ‘‘Brain-computer interfaces and communication in paralysis: Extinction of goal directed thinking in completely paralysed patients?,’’ Clin. Neurophysiol., vol. 119, no. 11, pp. 2658–2666, 2008.
[66] T. T. Kircher et al., ‘‘Recognizing one’s own face,’’ Cognition, vol. 78, no. 1, pp. B1–B15, 2001.
[67] Y. Zhang, Q. Zhao, J. Jin, X. Wang, and A. Cichocki, ‘‘A novel BCI based on ERP components sensitive to configural processing of human faces,’’ J. Neural Eng., vol. 9, no. 2, 2012, Art. ID. 026018.
[68] T. Kaufmann, S. Schulz, C. Grünzinger, and A. Kübler, ‘‘Flashing characters with famous faces improves ERP-based brain-computer interface performance,’’ J. Neural Eng., vol. 8, no. 5, 2011, Art. ID. 056016.
[69] R. B. Sangal, J. M. Sangal, and C. Belisle, ‘‘Longer auditory and visual P300 latencies in patients with narcolepsy,’’ Clin. EEG Neurosci., vol. 30, no. 1, pp. 28–32, 1999.
[70] W. Gerschlager et al., ‘‘Bilateral subthalamic nucleus stimulation does not improve prolonged P300 latencies in Parkinson’s disease,’’ J. Neurol., vol. 248, no. 4, pp. 285–289, 2001.
[71] M. A. Pastor, J. Artieda, J. Arbizu, M. Valencia, and J. C. Masdeu, ‘‘Human cerebral activation during steady-state visual-evoked responses,’’ J. Neurosci., vol. 23, no. 37, pp. 11621–11627, 2003.
[72] L. Feigenson, S. Dehaene, and E. Spelke, ‘‘Core systems of number,’’ Trends Cogn. Sci., vol. 8, no. 7, pp. 307–314, 2004.
[73] M. M. Monti et al., ‘‘Willful modulation of brain activity in disorders of consciousness,’’ New England J. Med., vol. 362, no. 7, pp. 579–589, 2010.
[74] D. Cruse et al., ‘‘Detecting awareness in the vegetative state: Electroencephalographic evidence for attempted movements to command,’’ PLoS One, vol. 7, no. 11, 2012, Art. ID. e49933.
[75] A. Kreilinger, V. Kaiser, M. Rohm, R. Rupp, and G. R. Müller-Putz, ‘‘BCI and FES training of a spinal cord injured end-user to control a neuroprosthesis,’’ Biomed. Eng./Biomed. Tech., vol. 58, 2013, DOI: 10.1515/bmt-2013-4443.
[76] D. Mattia, F. Pichiorri, P. Aricò, F. Aloise, and F. Cincotti, ‘‘Hybrid brain-computer interaction for functional motor recovery after stroke,’’ in Converging Clinical and Engineering Research on Neurorehabilitation. New York, NY, USA: Springer-Verlag, 2013, pp. 1275–1279.
[77] K. K. Ang et al., ‘‘A large clinical study on the ability of stroke patients to use an EEG-based motor imagery brain-computer interface,’’ Clin. EEG Neurosci., vol. 42, no. 4, pp. 253–258, 2011.
[78] S. Halder et al., ‘‘Neural mechanisms of brain-computer interface control,’’ Neuroimage, vol. 55, no. 4, pp. 1779–1790, 2011.
[79] P. L. Jackson, M. F. Lafleur, F. Malouin, C. L. Richards, and J. Doyon, ‘‘Functional cerebral reorganization following motor sequence learning through mental practice with motor imagery,’’ Neuroimage, vol. 20, no. 2, pp. 1171–1180, 2003.
[80] F. Malouin, C. L. Richards, and A. Durand, ‘‘Normal aging and motor imagery vividness: Implications for mental practice training in rehabilitation,’’ Arch. Phys. Med. Rehabil., vol. 91, no. 7, pp. 1122–1127, 2010.
[81] C. Neuper, A. Schlögl, and G. Pfurtscheller, ‘‘Enhancement of left-right sensorimotor EEG differences during feedback-regulated motor imagery,’’ J. Clin. Neurophysiol., vol. 16, no. 4, pp. 373–382, 1999.
ABOUT THE AUTHORS

Yuanqing Li was born in Hunan, China, in 1966. He received the B.S. degree in applied mathematics from Wuhan University, Wuhan, China, in 1988, the M.S. degree in applied mathematics from South China Normal University, Guangzhou, China, in 1994, and the Ph.D. degree in control theory and applications from the South China University of Technology, Guangzhou, in 1997. Since 1997, he has been with the South China University of Technology, where he became a Full Professor in 2004. From 2002 to 2004, he was with the Laboratory for Advanced Brain Signal Processing, RIKEN Brain Science Institute, Saitama, Japan, as a Researcher. From 2004 to 2008, he was with the Laboratory for Neural Signal Processing, Institute for Infocomm Research, Singapore, as a Research Scientist. He is the author or coauthor of more than 80 scientific papers in journals and conference proceedings. His research interests include blind signal processing, sparse representation, machine learning, brain–computer interface, EEG, and fMRI data analysis.
Jiahui Pan received the B.S. degree in computer science and technology and the M.S. degree in computer application technology from South China Normal University, Guangzhou, China, in 2005 and 2008, respectively, and the Ph.D. degree in pattern recognition and intelligent systems from the South China University of Technology, Guangzhou, in 2014. He is currently a Lecturer in the School of Software, South China Normal University. His research interests include brain–computer interfaces, signal processing, image processing, machine learning, and their applications in biomedical engineering.
Jinyi Long was born in Guangdong, China, in 1983. He received the B.S. degree in materials processing and controlling engineering from Northeastern University, Shenyang, China, and the Ph.D. degree in pattern recognition and intelligent systems from the South China University of Technology, Guangzhou. He is currently a Lecturer in the College of Automation Science and Engineering, South China University of Technology. His research interests include machine learning, brain–computer interfaces, EEG, and fMRI data analysis.

Tianyou Yu received the Ph.D. degree in pattern recognition and intelligent systems from the South China University of Technology, Guangzhou, China, in 2013. He is now with the School of Automation Science and Engineering, South China University of Technology. His research interests include pattern recognition, noninvasive brain–computer interfaces, and machine learning.
Fei Wang received the B.Sc. degree from the Beijing Institute of Technology, Beijing, China, in 2009. She is currently working toward the Ph.D. degree in pattern recognition and intelligent systems at the South China University of Technology, Guangzhou, China. Her research interests include noninvasive brain–computer interfaces and brain signal analysis.
Zhuliang Yu received the B.S.E.E. and M.S.E.E. degrees in electronic engineering from the Nanjing University of Aeronautics and Astronautics, Nanjing, China, in 1995 and 1998, respectively, and the Ph.D. degree from Nanyang Technological University, Singapore, in 2006. He joined the Center for Signal Processing, Nanyang Technological University, in 2000, as a Research Engineer, and became a Group Leader in 2001. In 2008, he joined the College of Automation Science and Engineering, South China University of Technology, Guangzhou, China, and was promoted to Full Professor in 2011. His research interests include video and image processing, array signal processing, pattern recognition, machine learning, and their applications in communications and biomedical engineering.
Wei Wu received the Ph.D. degree in biomedical engineering from Tsinghua University, Beijing, China, in 2012. He was a visiting student at the Neuroscience Statistics Laboratory, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA (2008–2012). Since 2012, he has been with the School of Automation Science and Engineering, South China University of Technology, Guangzhou, China, as an Associate Professor. Since 2014, he has also been with the Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA, as a Research Fellow. He works in the field of neural signal processing and neural engineering, specializing in statistical models and algorithms for the analysis and decoding of brain signals. Dr. Wu is an Associate Editor of Neurocomputing (Elsevier) and a member of the IEEE Biomedical Signal Processing Technical Committee.