Towards increased data transmission rate for a three-class metabolic brain-computer interface based on transcranial Doppler ultrasound
Andrew Myrden1,2, Azadeh Kushki1, Ervin Sejdić1,3, Tom Chau1,2∗
1 - Holland Bloorview Kids Rehabilitation Hospital, Bloorview Research Institute, Toronto, Ontario, Canada
2 - The Institute of Biomaterials and Biomedical Engineering, University of Toronto, Toronto, Ontario, Canada
3 - Department of Electrical and Computer Engineering, University of Pittsburgh, PA, USA
∗ - Corresponding Author, [email protected] (416)425-6220 150 Kilgour Road, Toronto, ON M4G 1R8
Abstract
In this study, we conducted an offline analysis of transcranial Doppler (TCD) ultrasound recordings to investigate potential methods for increasing data transmission rate in a TCD-based brain-computer interface. Cerebral blood flow velocity was recorded within the left and right middle cerebral arteries while nine able-bodied participants alternated between rest and two different mental activities (word generation and mental rotation). We differentiated these three states using a three-class linear discriminant analysis classifier while the duration of each state was varied between 5 and 30 seconds. Maximum classification accuracies exceeded 70%, and data transmission rate was maximized at 1.2 bits per minute, representing a four-fold increase in data transmission rate over previous two-class analysis of TCD recordings.
Keywords: brain-computer interface, transcranial Doppler
1 Introduction
Brain-computer interfaces (BCIs) allow users to generate control signals for external devices using only their thoughts [7]. Due to their ability to bypass typical output channels such as movement and speech, BCIs are of interest within the field of rehabilitation engineering [29]. Specifically, BCIs can be used as an alternative means of communication in individuals with severe physical disabilities resulting from conditions such as stroke and amyotrophic lateral sclerosis (ALS). In extreme cases, these disabilities can result in total immobility and inability to communicate while retaining full consciousness. This condition is referred to as “locked-in syndrome” (LIS) [26]. The provision of a means of communication for individuals with LIS continues to be an important goal of BCI research [8].
Previous non-invasive BCI research has focused on a small number of measurement modalities, of which the foremost has been electroencephalography (EEG) [3, 14]. Recent research has also investigated alternative measurement modalities such as functional magnetic resonance imaging (fMRI) and near-infrared spectroscopy (NIRS) [25, 32]. While EEG directly measures neuronal activity, fMRI and NIRS measure changes in blood hemoglobin concentrations [15]. Consequently, BCIs using the latter modalities are often referred to as hemodynamic or metabolic BCIs [19]. These BCIs do not generally possess the same temporal resolution as EEG BCIs, but have still attracted attention due to their intuitive training methods and robustness against electrical artifacts [10]. Recent research in this area has produced a number of real-time fMRI and NIRS-based BCIs [1, 4, 6, 11, 18]. These BCIs, many of which rely upon detection of motor imagery (e.g. imagined hand movement or finger tapping), suggest that metabolic BCIs are worthy of further study. Another metabolic signal that may be suitable for BCI development is transcranial Doppler ultrasound (TCD) [20]. TCD measures cerebral blood flow velocity (CBFV) within the circle of Willis (the network of arteries that supply the brain) [30]. Cognitive activation produces increases in CBFV within these arteries that can be detected using TCD [28]. These changes have been observed for a wide variety of different mental tasks [31], suggesting the potential to automatically detect mental activity on the basis of changes in CBFV. This possibility was investigated by Myrden et al. in [20], where it was shown that two different mental activities (word generation and mental rotation) can be differentiated from rest with greater than 80% accuracy. However, these results were achieved using very long durations for each activity (45 seconds), yielding a very low data transmission rate. This limits the practicality of such a BCI. 
Consequently, improvement of the data transmission rate is necessary in order to demonstrate the practical viability of a TCD-based BCI. In BCIs, data transmission rate depends on three parameters: the number of potential classes (N), the classification accuracy (P), and the state duration, i.e. the length of time for which a mental activity is performed before it is classified. The first two variables determine the data transmission rate in bits per trial (B), which can be expressed as [22, 33]:
$$B = \log_2 N + P \log_2 P + (1 - P) \log_2 \frac{1 - P}{N - 1} \tag{1}$$
Using the state duration, data transmission rate can be converted to bits per second or bits per minute. It is clear that data transmission rate can be augmented by increasing either N or P, or by decreasing the state duration. The effects of each parameter on data transmission rate (in bits per minute) are shown in Figure 1. Increasing the number of classes and reducing state durations is likely to decrease classification accuracy. This limits the maximum achievable data transmission rate. In this paper, we investigate the net gain in data transmission rate that can be attained by varying these parameters for a TCD-based BCI. We have expanded the classification problem introduced in [20] to a three-class problem by attempting to differentiate word generation, mental rotation, and rest from each other. Furthermore, state durations have been limited to a range of durations between five and thirty seconds. If state durations can be substantially reduced without greatly decreasing classification accuracy, data transmission rate will be improved.
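As a worked illustration of (1) and of the conversion to bits per minute, the following minimal sketch evaluates the rate for a given number of classes, accuracy, and state duration. The function names are illustrative only and do not come from the study; the handling of P = 0 and P = 1 (treating 0 log2 0 as 0) is an assumption made so that the boundary cases are defined.

```python
import math


def bits_per_trial(n_classes: int, accuracy: float) -> float:
    """Equation (1): information transferred per classified state, in bits."""
    if n_classes < 2 or not 0.0 <= accuracy <= 1.0:
        raise ValueError("need n_classes >= 2 and 0 <= accuracy <= 1")
    bits = math.log2(n_classes)
    # Assumption: 0 * log2(0) is taken as 0, its limiting value.
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_classes - 1))
    return bits


def bits_per_minute(n_classes: int, accuracy: float, state_duration_s: float) -> float:
    """Scale bits per trial by the number of states completed per minute."""
    return bits_per_trial(n_classes, accuracy) * 60.0 / state_duration_s


# Three classes, 70% accuracy, 20-second states.
print(round(bits_per_minute(3, 0.70, 20.0), 2))  # 1.21
```

With three classes, 70% accuracy, and 20-second states, the sketch gives roughly 1.2 bits per minute. At chance accuracy (P = 1/N) the expression evaluates to zero bits per trial.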
2 Materials and Methods
2.1 Participants
Nine able-bodied participants (6 female, mean age 25.6 ± 2.4 years) were recruited from the Bloorview Research Institute. All participants were right-handed, as quantified by the Edinburgh Handedness Inventory [21], with a mean score of 79.4 ± 16.3. Participants had no history of migraine and no known neurological, cardiopulmonary, or respiratory conditions. All participants gave informed written consent. This study was approved by the Research Ethics Boards of both Holland Bloorview Kids Rehabilitation Hospital and the University of Toronto.
2.2 Signal Acquisition
CBFV was monitored using a Multi-Dop X4 TCD instrument (Compumedics USA). Dual 2 MHz ultrasonic transducers were fitted on the included headgear and placed over the left and right transtemporal windows. The insonation procedure detailed by Alexandrov et al. [2] was used to acquire CBFV signals from the left and right middle cerebral arteries (MCAs). These arteries perfuse approximately 80% of the brain and have been implicated in a wide variety of mental tasks [31]. Probe position and measurement depth were adjusted until optimal signals were located from each MCA at depths between 45 and 60 millimetres. Signals were acquired from approximately the same depth for each MCA. The sampling rate was 100 Hz. The signal acquisition process is further detailed in [20].
2.3 Experimental Protocol
Participants completed two experimental sessions. Each session consisted of a 10-minute baseline period and two 15-minute experimental blocks. The baseline period allowed cerebral blood flow velocity to stabilize, and data from this period were not used for analysis. During each experimental block, participants completed ten rest states, five mental rotation states, and five word generation states. Participants alternated between rest and one of the two activation states until the block was completed. Each state was 45 seconds in duration.
Participants were seated facing a monitor on which the instructions and images for each task were displayed. During the word generation task, participants were presented with a letter and prompted to silently generate words beginning with that letter. During the mental rotation task, participants were presented with pairs of images of similar objects rotated to different angles, and were prompted to mentally rotate the objects until they could determine whether they were identical or mirror images. Further information regarding the word generation and mental rotation tasks is given in [20]. Participants were instructed to keep their eyes open during both activation and rest, and to perform each task as quickly as possible. Participants were also instructed to refrain from vocalizing their answers to prevent speech-related increases in CBFV. During rest states, participants were instructed to relax naturally.
2.4 Pre-Processing
TCD data were exported from the Multi-Dop X4, and the mean of the maximum velocity was extracted for analysis. The raw data from each block were normalized and then filtered using a third-order low-pass Butterworth filter with a cutoff frequency of 0.6 Hz to remove the effects of beat-to-beat fluctuations in CBFV. The data were then segmented into rest, word generation, and mental rotation states using markers that were automatically inserted into the TCD recordings at the beginning of each state during the experiment. During analysis, each of these segments was truncated to produce states of various shorter durations.
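The segmentation and truncation steps can be sketched as follows. This is a minimal illustration under stated assumptions: the markers are taken to be sample indices at the start of each state, the function names are hypothetical, and the normalization and Butterworth filtering steps are not shown.

```python
SAMPLING_RATE_HZ = 100  # sampling rate stated in Section 2.2


def segment_states(signal, marker_indices, state_duration_s=45):
    """Cut one block into full-length states starting at each marker sample."""
    n = state_duration_s * SAMPLING_RATE_HZ
    return [signal[start:start + n] for start in marker_indices]


def truncate_states(states, duration_s):
    """Keep only the first `duration_s` seconds of every state."""
    n = duration_s * SAMPLING_RATE_HZ
    return [state[:n] for state in states]


# Hypothetical block: three 45-second states, one marker per state onset.
block = list(range(3 * 45 * SAMPLING_RATE_HZ))
states = segment_states(block, marker_indices=[0, 4500, 9000])
short = truncate_states(states, duration_s=20)
print(len(states[0]), len(short[0]))  # 4500 2000
```

Because truncation always keeps the beginning of a state, each shortened segment starts at the onset of the mental activity.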
2.5 Feature Extraction
After segmentation, twelve features were extracted from each state. These included the mean, slope, and standard deviation from both the left and right MCAs; the difference in means and difference in slopes between the left and right MCAs; the cross-correlation of the signals from each MCA; and the maximum and minimum instantaneous differences in CBFV between the left and right MCAs during each state.
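The time-domain features listed above might be computed along the following lines. This is only a sketch: the exact definitions are assumptions, namely a least-squares slope against time, the sample standard deviation, and a zero-lag normalized cross-correlation. Under this reading the listed quantities yield eleven values; the text does not spell out how the count of twelve is reached, so no further feature is invented here. All names are hypothetical.

```python
import math


def _mean(x):
    return sum(x) / len(x)


def _std(x):
    """Sample standard deviation (n - 1 in the denominator; an assumption)."""
    m = _mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / (len(x) - 1))


def _slope(x, fs):
    """Least-squares slope of x against time in seconds."""
    t = [i / fs for i in range(len(x))]
    tm, xm = _mean(t), _mean(x)
    num = sum((ti - tm) * (xi - xm) for ti, xi in zip(t, x))
    den = sum((ti - tm) ** 2 for ti in t)
    return num / den


def _correlation(x, y):
    """Zero-lag normalized cross-correlation (assumed definition)."""
    xm, ym = _mean(x), _mean(y)
    num = sum((a - xm) * (b - ym) for a, b in zip(x, y))
    den = math.sqrt(sum((a - xm) ** 2 for a in x) * sum((b - ym) ** 2 for b in y))
    return num / den


def extract_features(left, right, fs=100):
    """Features for one state from left and right MCA velocity signals."""
    diff = [a - b for a, b in zip(left, right)]
    feats = {
        "mean_left": _mean(left), "mean_right": _mean(right),
        "slope_left": _slope(left, fs), "slope_right": _slope(right, fs),
        "std_left": _std(left), "std_right": _std(right),
        "cross_correlation": _correlation(left, right),
        "max_difference": max(diff), "min_difference": min(diff),
    }
    feats["mean_difference"] = feats["mean_left"] - feats["mean_right"]
    feats["slope_difference"] = feats["slope_left"] - feats["slope_right"]
    return feats


# Hypothetical four-sample signals at 1 Hz, for illustration only.
example = extract_features([0, 1, 2, 3], [0, 2, 4, 6], fs=1)
print(example["slope_left"], example["slope_right"])  # 1.0 2.0
```

A constant signal has zero variance, so the correlation term would need separate handling in that case.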
2.6 Feature Selection and Classification
Classification was performed separately for state durations ranging from 5 to 30 seconds in one-second increments. To test each state duration, a signal of length corresponding to the state duration was extracted from the beginning of all states. Feature extraction was then performed for the set of shortened signals. Five runs of five-fold cross-validation were performed, with feature selection based on the training data set only. An exhaustive feature selection algorithm was used to select optimal feature sets from the pool of twelve features. Every possible combination of two and three features (referred to as two- and three-dimensional feature sets, respectively) was used to classify the training data. Fisher linear discriminant analysis (LDA) was used for classification [13]. The two- and three-dimensional feature sets that produced the best performance on the training data were then used to classify the test data, again using Fisher LDA. The reported classification accuracies are the average of the accuracies for all three classes. All comparisons between classification accuracies at different state durations were performed using the Wilcoxon rank-sum test.
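The exhaustive search and the averaged per-class accuracy can be sketched as follows. The sketch only enumerates the candidate feature sets and scores predictions; the Fisher LDA classifier and the cross-validation loop are not shown. The `training_score` argument is a hypothetical callable standing in for fitting a classifier on the training folds, and per-class accuracy is assumed to mean the fraction of states of a class that were labelled correctly.

```python
from itertools import combinations


def candidate_feature_sets(n_features=12, sizes=(2, 3)):
    """Every combination of feature indices for each requested set size."""
    return {k: list(combinations(range(n_features), k)) for k in sizes}


def mean_per_class_accuracy(y_true, y_pred):
    """Average of the per-class accuracies, so each class counts equally."""
    per_class = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        per_class.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(per_class) / len(per_class)


def select_best(feature_sets, training_score):
    """Return the subset with the highest training score.

    `training_score` is a hypothetical callable: subset -> accuracy on the
    training data only, so the test fold never influences the selection.
    """
    return max(feature_sets, key=training_score)


sets = candidate_feature_sets()
print(len(sets[2]), len(sets[3]))  # 66 220
```

With twelve features there are 66 two-dimensional and 220 three-dimensional candidates, which keeps an exhaustive search inexpensive.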
3 Results
The mean classification accuracy across all participants using two- and three-dimensional feature sets is displayed in Figure 2 for state durations ranging from 5 to 30 seconds. Mean classification accuracy ranged between 40% and 69% for two-dimensional feature sets, and between 37% and 74% for three-dimensional feature sets. For both sets, classification accuracy increased with increasing state duration, but tended to stabilize as state duration exceeded 20 seconds. In Figure 3, these curves have been converted to reflect data transmission rate using (1). Data transmission rate was maximized at 1.2 bits per minute for 20-second state durations using a three-dimensional feature set. For both two- and three-dimensional feature sets, mean classification accuracy across all participants exceeded chance levels for state durations longer than eight seconds. For durations greater than 10 seconds, classification accuracy was generally higher when using three-dimensional feature sets. This difference in classification accuracy was statistically significant for durations between 18 and 29 seconds (p < 0.02). Figure 4 depicts the mean accuracy across all participants for each class at each state duration for three-dimensional feature sets. For these sets, an accuracy exceeding 70% was first achieved for 20-second durations. Classification accuracies for word generation and mental rotation were significantly higher than classification accuracy for the rest class for durations longer than 10 seconds (p < 0.05) and 14 seconds (p < 0.001), respectively. Classification accuracy peaked for word generation at 77%, for mental rotation at 78%, and for the rest class at 66%. This was also observed in [20] when each task was independently differentiated from rest. The cubic polynomial of best fit was computed for the accuracy curve for each participant. From these curves, state durations at which several temporal milestones were achieved were computed for each participant. These include the duration at which maximum accuracy was achieved, the shortest duration at which classification accuracy was within 5% of the maximum value, and the duration at which classification accuracy stabilized.
The final parameter represents the state duration for which further increases in duration yielded only marginal gains in classification accuracy. It was defined as the state duration for which the magnitude of the derivative of the polynomial of best fit was less than 1% of the maximum accuracy. All parameters were calculated for three-dimensional feature sets and can be found in Table 1.
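The first two milestones can be illustrated with a short sketch that evaluates a fitted cubic at each tested duration. Several assumptions apply: the polynomial coefficients shown are hypothetical rather than fitted to study data, the milestones are evaluated at whole-second durations from 5 to 30 seconds, and "within 5%" is read as five percentage points, which appears consistent with the values in Table 1. The stabilization milestone is not sketched because its definition leaves details open.

```python
def poly_value(coeffs, t):
    """Evaluate a0 + a1*t + a2*t**2 + a3*t**3 with Horner's rule."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * t + a
    return result


def milestones(coeffs, durations=range(5, 31), margin_points=5.0):
    """Duration of maximum fitted accuracy and shortest near-maximum duration."""
    fitted = {t: poly_value(coeffs, t) for t in durations}
    t_max = max(fitted, key=fitted.get)
    # Assumption: "within 5%" means within five percentage points of the maximum.
    t_near = min(t for t, acc in fitted.items()
                 if acc >= fitted[t_max] - margin_points)
    return {"max": (t_max, fitted[t_max]), "near_max": (t_near, fitted[t_near])}


# Hypothetical cubic (a0, a1, a2, a3) in percent accuracy versus seconds.
print(milestones((20.0, 4.0, -0.05, -0.001)))
```

Evaluating the fitted curve rather than the raw accuracies makes both milestones less sensitive to fluctuations between adjacent durations.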
4 Discussion
In this study, we have shown that mean classification accuracies exceeding 70% can be achieved for a three-class problem within 20 seconds of the onset of cognitive activity using bilateral TCD measurements, time-domain features, and a linear classifier. This corresponds to a maximum data transmission rate of 1.2 bits per minute, compared to a maximum rate of 0.3 bits per minute previously reported for a TCD-based BCI [20]. This four-fold improvement highlights the advantages of a three-class BCI and the importance of reducing state duration. It is important to note that these results represent very early research into a TCD-based BCI, and it is likely that further improvement is possible. The present study used only the time-averaged mean of the maximum velocity due to limitations of the instrument. However, it is possible that response time may be further reduced using frequency-domain features extracted from the maximum velocity envelopes, if these signals are available. Our results should be compared to those from other metabolic BCIs, particularly those based on NIRS. Recent studies in this area have produced BCIs with data transmission rates ranging from 0.6 to 1.3 bits per minute [5, 16, 23]. Our BCI places near the upper end of this range, indicating that TCD is a worthwhile alternative to NIRS for metabolic BCIs. Older work in this area has achieved data transmission rates approaching 3 bits per minute [25], a level that may be possible for TCD when more powerful feature selection and classification algorithms are used. State duration has a significant effect on classification accuracy. As seen in Figure 2, classification accuracy for a three-dimensional feature set improves from 37% for five-second durations to approximately 50% for 10-second durations and 64% for 15-second durations. After state duration is extended to 20 seconds, further increases in classification accuracy are small. In previous TCD research, event-related peaks in CBFV have been observed between 4 and 20 seconds after the onset of cognitive activity [17, 24, 28]. These measurements support our results, and may explain why state durations beyond 20 seconds provide only marginal increases in classification accuracy. In this study, state duration was varied by extracting a segment of appropriate length from the beginning of a 45-second task. This was a practical necessity due to the goal of investigating a wide variety of different state durations. If state durations were limited during data collection rather than during data processing, the results might differ slightly. Future work should investigate the effect of using shorter state durations during data collection. The results from this study could be used as a basis for selecting shorter state durations in future studies. This study was necessary to verify that three-class classification accuracies comparable to other metabolic BCIs could be achieved in an offline analysis of TCD data from able-bodied participants. The present findings encourage future research on the development of an online TCD-based BCI and its application to individuals within the target population. Although we have focused on communication and control applications, it has also been recently proposed that BCIs may have applications for neurological rehabilitation [9, 12]. For example, it has been proposed that the usage of motor imagery BCIs could help restore movement in individuals partially paralyzed by stroke or neurotrauma [27].
Similarly, it is possible that a TCD-based BCI that uses the word generation task could be used to aid rehabilitation when language areas of the brain are affected by a stroke. It may also be possible to use TCD to provide neurofeedback to individuals with abnormal cerebral blood flow velocities. This could allow them to voluntarily regulate their brain activity to aid recovery.
5 Acknowledgments
This work was supported by the Canada Research Chairs program, the Natural Sciences and Engineering Research Council, Holland Bloorview Kids Rehabilitation Hospital, and Barbara and Frank Milligan.
References
[1] A. F. Abdelnour, T. Huppert, Real-time imaging of human brain function by near-infrared spectroscopy using an adaptive general linear model, NeuroImage, 46 (2009) 133–143.
[2] A. Alexandrov, M. Sloan, L. K. S. Wong, C. Douville, A. Razumovsky, W. Koroshetz, M. Kaps, C. Tegeler, Practice standards for transcranial Doppler ultrasound: Part I – test performance, J Neuroimaging, 17 (2007) 11–18.
[3] B. Z. Allison, E. W. Wolpaw, J. R. Wolpaw, Brain-computer interface systems: progress and prospects, Expert Rev Med Devic, 4 (2007) 463–474.
[4] H. Ayaz, P. Shewokis, S. Bunce, M. Schultheis, B. Onaral, Assessment of cognitive neural correlates for a function near infrared-based brain computer interface system, Found Augmented Cognition. Neuroergonomics and Operational Neuroscience, (2009) 699–708.
[5] G. Bauernfeind, R. Scherer, G. Pfurtscheller, C. Neuper, Single-trial classification of antagonistic oxyhemoglobin responses during mental arithmetic, Med Biol Eng Comput, 49 (2011) 1–6.
[6] B. D. Berman, S. G. Horovitz, G. Venkataraman, M. Hallett, Self-modulation of primary motor cortex activity with motor and motor imagery tasks using real-time fMRI-based neurofeedback, NeuroImage, 59 (2012) 917–925.
[7] N. Birbaumer, Breaking the silence: brain-computer interfaces (BCI) for communication and motor control, Psychophysiology, 43 (2006) 517–536.
[8] N. Birbaumer, A. Murguialday, L. Cohen, Brain-computer interface in paralysis, Curr Opin Neurol, 21 (2008) 634–638.
[9] N. Birbaumer, A. Ramos Murguialday, C. Weber, P. Montoya, Neurofeedback and brain-computer interface: clinical applications, Int Rev Neurobiol, 86 (2009) 107–117.
[10] S. Coyle, T. Ward, C. Markham, G. McDarby, On the suitability of near-infrared (NIR) systems for next-generation brain–computer interfaces, Physiol Meas, 25 (2004) 815–822.
[11] S. M. Coyle, T. E. Ward, C. M. Markham, Brain-computer interface using a simplified near-infrared spectroscopy system, J Neur Eng, 4 (2007) 219.
[12] J. J. Daly, J. R. Wolpaw, Brain-computer interfaces in neurological rehabilitation, Lancet Neurol, 7 (2008) 1032–1043.
[13] R. Duda, P. Hart, Pattern classification and scene analysis, Wiley, New York, 1996.
[14] L. A. Farwell, E. Donchin, Talking off the top of your head: toward a mental prosthesis utilizing event-related brain potentials, Electroen Clin Neuro, 70 (1988) 510–523.
[15] M. Gerven, J. Farquhar, R. Schaefer, R. Vlek, J. Geuze, A. Nijholt, N. Ramsey, P. Haselager, L. Vuurpijl, S. Gielen, et al., The brain–computer interface cycle, J Neural Eng, 6 (2009) 041001.
[16] L. Holper, M. Wolf, Single-trial classification of motor imagery differing in task complexity: a functional near-infrared spectroscopy study, J Neuroeng Rehabil, 8 (2011) 34.
[17] S. Knecht, H. Henningsen, M. Deppe, T. Huber, A. Ebner, E. B. Ringelstein, Successive activation of both cerebral hemispheres during cued word generation, NeuroReport, 7 (1996) 820–824.
[18] S. M. Laconte, Decoding fMRI brain states in real-time, NeuroImage, 56 (2011) 440–454.
[19] F. Matthews, B. A. Pearlmutter, T. E. Ward, C. Soraghan, C. Markham, Hemodynamics for brain-computer interfaces, IEEE Signal Proc Mag, 25 (2008) 87–94.
[20] A. J. B. Myrden, A. Kushki, E. Sejdic, A-M. Guerguerian, T. Chau, A brain-computer interface based on bilateral transcranial doppler ultrasound, PLoS-ONE, 6 (2011).
[21] R. C. Oldfield, The assessment and analysis of handedness: the Edinburgh inventory, Neuropsychologia, 9 (1971) 97–113.
[22] J. R. Pierce, An introduction to information theory: symbols, signals & noise, Dover Publishing, New York, 1980.
[23] S. D. Power, T. H. Falk, T. Chau, Classification of prefrontal activity due to mental arithmetic and music imagery using hidden Markov models and frequency domain near-infrared spectroscopy, J Neural Eng, 7 (2010) 026002.
[24] C. Schnittger, S. Johannes, A. Arnavaz, T. F. Münte, Relation of cerebral blood flow velocity and level of vigilance in humans, NeuroReport, 8 (1997) 1637–1639.
[25] R. Sitaram, H. Zhang, C. Guan, M. Thulasidas, Y. Hoshi, A. Ishikawa, K. Shimizu, N. Birbaumer, Temporal classification of multichannel near-infrared spectroscopy signals of motor imagery for developing a brain-computer interface, NeuroImage, 34 (2007) 1416–1427.
[26] E. Smith, M. Delargy, Locked-in syndrome, Brit Med J, 330 (2005) 406–409.
[27] S. R. Soekadar, N. Birbaumer, L. G. Cohen, Brain-computer interfaces in the rehabilitation of stroke and neurotrauma, IEEE T Neur Sys Rehabil Eng, 19 (2011) 542–549.
[28] N. Stroobant, G. Vingerhoets, Transcranial Doppler ultrasonography monitoring of cerebral hemodynamics during performance of cognitive tasks: a review, Neuropsychol Rev, 10 (2000) 213–231.
[29] K. Tai, S. Blain, T. Chau, A review of emerging access technologies for individuals with severe motor impairments, Assist Technol, 20 (2008) 204–219.
[30] C. H. Tegeler, D. Ratanakorn, Physics and principles, in V. L. Babikian, L. Wechsler (Eds.), Transcranial Doppler ultrasonography, second ed., Butterworth-Heinemann, Boston, 1999, pp. 3–11.
[31] G. Vingerhoets, N. Stroobant, Lateralization of cerebral blood flow velocity changes during cognitive tasks. A simultaneous bilateral transcranial Doppler study, Stroke, 30 (1999) 2152–2158.
[32] N. Weiskopf, K. Mathiak, S. W. Bock, F. Scharnowski, R. Veit, W. Grodd, R. Goebel, N. Birbaumer, Principles of a brain-computer interface (BCI) based on real-time functional magnetic resonance imaging (fMRI), IEEE T Bio-Med Eng, 51 (2004) 966–970.
[33] J. R. Wolpaw, N. Birbaumer, W. J. Heetderks, D. J. McFarland, P. H. Peckham, G. Schalk, E. Donchin, L. A. Quatrano, C. J. Robinson, T. M. Vaughan, Brain-computer interface technology: a review of the first international meeting, IEEE T Rehabil Eng, 8 (2000) 164–173.
Fig. 1: (a) Effects of classification accuracy and number of classes on data transmission rate for a 30-second state duration. (b) Effects of state duration and number of classes on data transmission rate for 100% classification accuracy.
Fig. 2: Mean classification accuracy across all participants for durations ranging from five to thirty seconds. Both the raw results and the polynomial of best fit are shown for two and three-dimensional feature sets. Classification using three-dimensional sets was significantly more accurate for durations between 18 and 29 seconds.
Fig. 3: Data transmission rate for state durations between 5 and 30 seconds. The maximum attained data transmission rate is 1.2 bits per minute for 20-second durations using a three-dimensional feature set.

Tab. 1: State duration and classification accuracy for each participant when maximum accuracy, near-maximum accuracy (within 5%), and stabilization occurred. Values computed for three-dimensional feature sets and cubic polynomials of best fit to the accuracy curves. Standard deviations are given in brackets.

              Maximum Accuracy            Within 5% of Max            Stabilization
Participant   Time (s)     Accuracy (%)   Time (s)     Accuracy (%)   Time (s)     Accuracy (%)
1             26           74.3           18           70.4           21           73.0
2             19           84.5           14           80.7           20           84.3
3             30           73.3           22           68.6           26           72.2
4             24           90.5           18           86.3           21           89.5
5             30           43.6           19           38.7           26           42.4
6             26           86.3           22           82.1           25           86.1
7             26           53.8           19           49.3           24           53.5
8             19           69.9           13           65.2           20           69.8
9             24           81.9           18           77.3           21           80.7
Mean          24.9 (4.0)   73.1 (15.6)    18.1 (3.1)   68.7 (15.8)    22.7 (2.5)   72.4 (15.6)
Fig. 4: Mean classification accuracy for each class across all participants for three-dimensional feature sets. Cubic polynomials of best fit are presented for word generation (full line) and mental rotation (dashed line), and the line of best fit for rest (dotted line).
[Figure 1 plot: data transmission rate (bits per minute) versus (a) classification accuracy and (b) state duration (s), for two, three, and four classes.]
[Figure 2 plot: classification accuracy (%) versus duration (s) for two-dimensional and three-dimensional feature sets, raw results and best fit.]
[Figure 3 plot: transmission rate (bits per minute) versus duration (s) for two-dimensional and three-dimensional feature sets.]
[Figure 4 plot: classification accuracy (%) versus duration (s).]