Interactive Neurotechnology Platform: A Real-time Window on Human Information Processing at the Millisecond Level

Berka, Chris. Advanced Brain Monitoring, Inc. / 2237 Faraday Ave., Ste. 100 / Carlsbad / CA 92008 USA. E-mail: [email protected]
Hale, Kelly. Design Interactive, Inc. / 1221 E. Broadway, Ste. 110 / Oviedo / FL 32765 USA. E-mail: [email protected]
Cowell, Andrew J. Pacific Northwest National Laboratory / 902 Battelle Blvd. / Richland / WA 99354 USA. E-mail: [email protected]
Fuchs, Sven. Design Interactive, Inc. / 1221 E. Broadway, Ste. 110 / Oviedo / FL 32765 USA. E-mail: [email protected]
Johnson, Robin. Advanced Brain Monitoring, Inc. / 2237 Faraday Ave., Ste. 100 / Carlsbad / CA 92008 USA. E-mail: [email protected]
Davis, Gene. Advanced Brain Monitoring, Inc. / 2237 Faraday Ave., Ste. 100 / Carlsbad / CA 92008 USA. E-mail: [email protected]
Leathers, Robin. Advanced Brain Monitoring, Inc. / 2237 Faraday Ave., Ste. 100 / Carlsbad / CA 92008 USA. E-mail: [email protected]
Popovic, Djordje. Advanced Brain Monitoring, Inc. / 2237 Faraday Ave., Ste. 100 / Carlsbad / CA 92008 USA. E-mail: [email protected]
Whitmoyer, Melissa. Advanced Brain Monitoring, Inc. / 2237 Faraday Ave., Ste. 100 / Carlsbad / CA 92008 USA. E-mail: [email protected]
Olmstead, Rich. Advanced Brain Monitoring, Inc. / 2237 Faraday Ave., Ste. 100 / Carlsbad / CA 92008 USA. E-mail: [email protected]

ABSTRACT
The ability to track salient targets is common to contemporary work environments ranging from airport security to imagery analysis. A neurotechnology platform was implemented for real-time EEG monitoring of operator state during review of complex satellite imagery. Consistent, quantifiable stimulus-locked ERP signatures for targets and non-targets were identified. A single-trial classifier achieved accuracy approaching 80% across tasks, participants and test stimuli. An alternative approach maximizes the accuracy of the time-locked analysis by integrating eye fixations as triggers. Evidence suggests that the waveshape characteristics of fixation-locked ERPs ("FLERPs") resemble those of stimulus-evoked ERPs and, more importantly, that distinct signatures are associated with hits, correct rejections and misses.

Keywords
Neurotechnology, Augmented Cognition, EEG, event-related potentials, eye tracking, cognitive state, imagery analysis, neural signatures

INTRODUCTION
Despite enormous gains in robotic vision and machine-automated image processing and analysis, human intervention is often necessary to interpret visual subtleties, particularly in cases of ambiguous stimuli. The human visual system is uniquely equipped for continuous, rapid processing of a constant barrage of stimuli, often automatically and without conscious intervention. Because tracking salient targets is common to many contemporary work environments, elucidating the capabilities and limitations of human visual processing will improve human-machine interfaces and assist in optimizing the design of work, home and recreational environments. An integrated human-machine approach is needed to extend and enhance the powers of human perception. Advances in neuroscience suggest that neural processes can be monitored in real time: electroencephalographic (EEG) analysis can be applied continuously to monitor operator state and quantify levels of fatigue, attention, and workload. A more detailed window on sensory and cognitive processing can be obtained by tracking specific event-related EEG potentials (ERPs) known to reflect levels of information processing. Real-time ERP monitoring has been proposed for brain-computer interfaces to increase the throughput of intelligence analysts tasked with processing considerable volumes of satellite imagery. This paper presents progress in the development of the Revolutionary Advanced Processing Image Detection (RAPID) system, designed to incorporate neurophysiological measurement into a closed-loop system that tracks the image analysis process, automatically identifies images of interest using stimulus- and/or response-locked EEG/ERP, and identifies specific areas of interest within each image using fixation-locked EEG/ERP.
RAPID incorporates two distinct neurophysiological instruments, EEG/ERP and eye-tracking technology, with the goal of optimizing the efficiency and effectiveness of the image analysis process. Recent reports have shown the feasibility of using single-trial ERPs to identify when participants detected targets presented in satellite images or natural scenes (Parra, Spence et al. 2005; Mathan, Whitlow et al. 2006; Sajda, Gerson et al. 2007). There is growing interest in applications of real-time ERP classification as a brain-computer interface and for increasing the throughput of Image Analysts (IAs); however, the generalizability of previous results to real-world applications is limited by constraints on the selection of target stimuli (one centrally located target per image), the use of very fast stimulus presentation rates (3-5 images/second), the low probability of targets used to induce a novelty ("oddball") ERP, and the creation of highly individualized classification models that have not been evaluated across multiple stimulus sets, times or participants. Taken together, these factors limit the ecological validity of previous studies. The present studies were designed to move beyond the simple stimulus parameters used previously and to explore a number of issues, including the relevance of image size and complexity. The goal is to move beyond simple imagery with one centrally located target or one distractor to much more complex images (currently 1280×1024 pixel resolution) with target locations varied within images. These studies aim to answer questions such as: What is the impact on the neural signatures of changing the target/non-target ratio (previous work used 20/80)? Do the neural signatures differ at 50/50 versus 20/80? How robust are the classifiers across stimulus sets, times and subjects? Finally, because the goal is to minimize the number of EEG sensors required for the final RAPID system, this study was designed to begin assessing the optimal number and location of sensor sites/channels required to capture neural signatures of target detection for complex visual stimuli.
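Throughout these studies, ERPs are derived by time-locking EEG epochs to stimulus onset and averaging. The sketch below illustrates that derivation in its simplest form; it is not the authors' B-Alert pipeline, and the sampling rate, epoch bounds and function names are our assumptions:

```python
import numpy as np

FS = 256  # sampling rate in Hz; an assumption -- the paper does not state it

def stimulus_locked_erp(eeg, onsets, pre_ms=200, post_ms=1000, fs=FS):
    """Average stimulus-locked epochs of one EEG channel into an ERP.

    eeg    : 1-D array of a continuous EEG channel (microvolts)
    onsets : sample indices at which stimuli were presented
    Returns the mean epoch, baseline-corrected on the pre-stimulus interval.
    """
    pre = int(pre_ms * fs / 1000)
    post = int(post_ms * fs / 1000)
    epochs = []
    for t in onsets:
        if t - pre < 0 or t + post > len(eeg):
            continue  # skip epochs that run off the ends of the record
        ep = eeg[t - pre:t + post].astype(float)
        ep -= ep[:pre].mean()  # subtract the pre-stimulus baseline
        epochs.append(ep)
    return np.mean(epochs, axis=0)
```

In the same spirit, the response-locked ERPs discussed later would be obtained by passing keypress sample indices instead of stimulus onsets.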

METHODS
Participants
Study one included seven participants (2M, 5F) ranging in age from 24 to 35. Study two included a total of 15 participants; however, complete data sets were obtained for only 13. All participants were in good health and had normal or corrected-to-normal vision. None had prior imagery analysis experience.

Apparatus

Figure 1: EEG Sensor Headset

EEG data were acquired with the B-Alert® wireless EEG Sensor Headset with sensors at F3, F4, C3, C4, P3, P4, Fz, Cz and POz, referenced to linked mastoids. Although 9 channels of EEG data were acquired, the preliminary data reported in this paper include a subset of five channels (Fz, Cz, POz, Fz-POz and Cz-POz). The ABM wireless EEG Sensor Headset is a lightweight, easy-to-apply system for acquiring and analyzing six to nine channels of high-quality EEG. The sensors require no scalp preparation and provide a comfortable and secure sensor-scalp interface for 8 to 12 hours of continuous use with 2 AAA batteries. Amplification, digitization and radio-frequency (RF) transmission of the signals are accomplished with electronics in a portable unit worn on the head. Amplifying and digitizing the EEG close to the sensors and transmitting the data wirelessly facilitates the acquisition of high-quality signals even in high electromagnetic interference environments. EEG absolute and relative power spectral density (PSD) variables were computed for each 1-second epoch. Identification and decontamination of spikes, amplifier saturation, and environmental artifacts were accomplished using previously described methods (Berka et al., 2004). Event-related potentials (ERPs), power event-related potentials (PERPs) and event-related wavelet transformations were derived by time-locking to the presentation of the stimuli.

Tasks
Each participant was presented with five tasks in which they were asked to locate a target of interest. The tasks differed in difficulty as well as in target/non-target ratio (Table 1). Each task consisted of four parts:
- Training session: familiarized the participant with the target by showing 5 sets of images, the first of each set being the raw image and the second the same image with the target circled in red.
- Practice session: the participant practiced identifying targets (n=10); if he/she was incorrect, the program prompted the participant to make the correct choice.
- Short reminder training set: three images with targets circled in red.
- Testing session: 100 distinct images containing targets or non-targets; no performance feedback was given during this set. Images were displayed for 2 seconds with a 2-second ISI, and the program advanced automatically through the image set. Participants were instructed to respond to each image: if they saw a target (a YES), they pressed the left arrow key; if they did not (a NO), they pressed the right arrow key.

Table 1 lists the five image stimulus classes used in this experiment. Each stimulus class included 100 test stimuli comprising targets and non-targets. Samples of each of the five classes of stimuli and a sample non-target are presented in Figure 2.

Table 1: Stimulus Classes, Difficulty Level, T/NT Ratio, Measurement Window, Optimal Site, Mean Accuracy and Response Time (RT)

Target Stimuli   Difficulty  T/NT    Measurement   Optimal  Accuracy     RT Mean
                 Level       Ratio   Window (ms)   Site     Mean (SEM)   (s, SEM)
Cranes           Easy        26:74   200-800       POz      98.1 (1.9)   0.818 (.19)
Water Towers     Easy        20:80   200-600       POz      99.4 (1.1)   0.898 (.29)
Storage Tanks    Easy        50:50   200-600       POz      99.0 (1.5)   0.806 (.12)
Generators       Difficult   50:50   200-800       Fz-POz   91.7 (3.5)   1.092 (.17)
Dozers           Difficult   50:50   300-900       Fz-POz   90.7 (6.0)   1.244 (.18)

RESULTS
Because the initial goals of this pilot study were to determine whether consistent ERP templates could distinguish targets from non-targets using highly complex satellite images unlike any stimuli previously reported, the first analytical step was to determine whether a consistent and identifiable difference between correctly identified targets and correctly identified non-targets could be found across participants. Visual inspection of the ERPs (averaged correct targets vs. correct non-targets) for each subject and task was used to select one optimal channel and to set a universal measurement window (msec post-stimulus) for the target/non-target (T/NT) distinction. Across subjects, the parieto-occipital (POz) region provided the best neural signatures of "targetness" for the easy stimulus sets (cranes, water towers, storage tanks); the difficult stimuli (generators, dozers), however, were more distinctive at Fz-POz, in large part due to a combination of a sustained frontal negativity (likely related to attention) and a parietal positivity for the targets. T/NT distinctions were observed for some participants in the very early 100-190 msec post-stimulus period, but this was inconsistent across participants. All participants, however, showed T/NT distinctions from 200 msec post-stimulus onward.

Figure 2: Examples of targets and grand mean ERPs showing the T/NT template distinction from the optimal site with measurement window

For the easy storage tanks (50:50 ratio), the T/NT distinction was maximal between 200 and 600 msec post-stimulus across all participants. For the easy water towers (20:80 ratio), some subjects showed a prolonged time window (up to 800 msec), but the majority evidenced the distinction between 200 and 600 msec. The cranes were best distinguished for all participants using the 200-800 msec window. The prolonged T/NT window of discrimination may relate in part to the ratio parameters (overlapping the ERP signatures related to "target detection" with those related to "novelty"). Because in a real-world scenario there is no way to predict the number of targets encountered in a given environment, any classifier must be designed to be universally effective for any T/NT ratio. As task difficulty increased, the window of T/NT distinction lengthened to as much as 900 msec (e.g., during the dozers task), consistent with increasing reaction times for each stimulus set. This suggests that limiting the ERP sampling window to one second may be too short to fully characterize the templates, particularly in more complex, real-world tasks. The differences between target and non-target ERPs for each subject and stimulus class were quantified by summating the point-by-point difference of the correct-target ERPs from the correct-non-target ERPs within the designated measurement window for each of the 5 tasks (see Figure 2). All subjects had measurable T/NT differences ("Linear Difference Analysis"; left graph in Figure 3). The data show that the most difficult target detection (dozers) prolonged the reaction time and the ERP measurement window, but the mean T/NT distinction was relatively small.
This is likely due to the fact that the search in the non-targets required studying each vehicle in the image to make sure that it was not a dozer; thus, the non-target trials were more difficult to distinguish.
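The "Linear Difference Analysis" just described reduces to a point-by-point subtraction summed over the measurement window. A minimal sketch, assuming both ERPs are sampled on a shared time base (the function and argument names are ours, not the authors'):

```python
import numpy as np

def summated_linear_difference(target_erp, nontarget_erp, times_ms, window_ms):
    """Summate the point-by-point difference between the correct-target and
    correct-non-target ERPs inside the designated measurement window.

    target_erp, nontarget_erp : 1-D arrays (microvolts) on a shared time base
    times_ms  : time of each sample relative to stimulus onset (ms)
    window_ms : (start, end) of the measurement window, e.g. (200, 800)
    """
    t = np.asarray(times_ms)
    lo, hi = window_ms
    mask = (t >= lo) & (t <= hi)  # keep only samples inside the window
    diff = np.asarray(target_erp)[mask] - np.asarray(nontarget_erp)[mask]
    return float(np.sum(diff))
```

For the dozers task, for example, one would pass window_ms=(300, 900) with the Fz-POz ERPs, matching the windows in Table 1.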


Figure 3: T/NT mean ERP Linear Difference Analysis, mean Accuracy and RT by task, and mean reaction time by image classification type

Examples of Variability Introduced by Difficulty
Varying both the task and the stimulus difficulty produced differences in the ERP components between targets and non-targets (Figure 4). To investigate the variability introduced by difficulty, the data were sorted by difficulty and the ERPs recalculated. Using a larger sampling window might make these differences more distinctive during hard tasks, as reaction and processing times increase with task difficulty.

Figure 4: ERP signatures for Easy Stimuli seen in both Easy and Hard tasks, Hard Stimuli as seen in both Easy and Hard tasks

Evaluation of a single-trial stimulus-locked classifier model
Several simple classification approaches are being investigated to identify the best combination of algorithms for developing the single-trial classifier. The ideal classifier would generalize across tasks, participants and environments. Table 2 presents data from one model designed to classify across subjects, tasks and difficulty levels. To create the model, PERP and wavelet variables were subdivided into three separate bands (0-2, 2-4 and 4-8 Hz) from each of five selected EEG channels (Fz, Cz, POz, Fz-POz and Cz-POz). This very large set of variables was first scanned using forward-selection discriminant function analyses (DFA) to select EEG variables that distinguished targets from non-targets; up to 400 candidate variables were selected in this manner. This smaller set was then used in a stepwise DFA to determine the final set of predictor variables. The classification used only these channels and simple features extracted from the EEG. This preliminary attempt at developing a global classifier is promising. The classification accuracy could be greatly improved by: 1) expanding the number of channels to all 9 monopolars and creating an additional set of up to 32 bipolars by adding, averaging or subtracting channels; 2) deriving alternative multi-dimensional features from the PSD and wavelet data; and 3) examining time windows larger than one second.

Table 2: Preliminary Classification of Single-Trial Correct Targets and Non-Targets

                 Targets      Targets       Non-Targets   Non-Targets
                 Classified   Classified as Classified as Classified   Sensitivity  Specificity
                 as Targets   Non-Targets   Non-Targets   as Targets   (%)          (%)
Cranes           115          30            314           84           79.3         78.9
Water Towers     157          43            167           41           78.5         80.3
Storage Tanks    173          28            147           37           86.1         79.9
Generators       95           71            376           24           57.2         94.1
Dozers           152          74            186           70           67.3         72.7
Totals:          692          240           1190          262          73.7         81.2
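In outline, the feature extraction and variable screening behind this classifier can be sketched as follows. This is a deliberately simplified stand-in: band power is computed with a plain FFT rather than the authors' PSD/wavelet variables, a univariate Fisher ratio replaces the forward-selection DFA scan, and the sampling rate is an assumption:

```python
import numpy as np

FS = 256  # assumed sampling rate in Hz; not stated in the paper

def band_powers(epoch, fs=FS, bands=((0, 2), (2, 4), (4, 8))):
    """Absolute power in each frequency band for one 1-second EEG epoch,
    using the paper's 0-2, 2-4 and 4-8 Hz bands by default."""
    freqs = np.fft.rfftfreq(len(epoch), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(epoch)) ** 2 / len(epoch)
    return np.array([psd[(freqs >= lo) & (freqs < hi)].sum() for lo, hi in bands])

def fisher_screen(X, y, k):
    """Rank features by the Fisher ratio (between-class over within-class
    variance) and keep the top k -- a simple stand-in for the
    forward-selection DFA scan used to pick candidate variables."""
    X, y = np.asarray(X, float), np.asarray(y)
    t, nt = X[y == 1], X[y == 0]          # target vs. non-target epochs
    num = (t.mean(0) - nt.mean(0)) ** 2   # between-class separation
    den = t.var(0) + nt.var(0) + 1e-12    # within-class spread
    return np.argsort(num / den)[::-1][:k]
```

The retained features would then feed a final discriminant model, analogous to the stepwise DFA stage described above.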

Response-locked Single Trial ERPs for Targets and Non-Targets
The variability in response times for the single trials (and, by inference, the variable decision times) appeared to be due to variability in stimulus difficulty, so a second preliminary classifier was developed by time-locking to the response. Response-locked ERPs were generated to help identify sources of reaction-time variability between targets and non-targets. Although the preliminary single-trial response-locked classification increased sensitivity to 81.3% across all subjects and paradigms, the specificity (63%) was much lower than that of the stimulus-locked model. Figure 5 presents data in microvolts for targets and non-targets during the time immediately preceding a response. Colors represent microvolt values, with red indicating positive values, yellow values near zero and blue negative values. Targets can be distinguished as negative (blue) values from 600 ms to 100 ms prior to the response.


Figure 5: Response-locked ERP for Targets and Non-Targets (left) and microvolt values for the 1 second preceding a response (right)
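The sensitivity and specificity figures quoted for both classifiers follow the standard definitions over single-trial counts; a minimal helper (the function name is ours):

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity (% of targets classified as targets) and specificity
    (% of non-targets classified as non-targets) from single-trial counts,
    as tabulated in Table 2.

    tp : targets classified as targets        fn : targets classified as non-targets
    tn : non-targets classified as non-targets  fp : non-targets classified as targets
    """
    return 100.0 * tp / (tp + fn), 100.0 * tn / (tn + fp)
```

For the cranes row of Table 2, for example, sensitivity_specificity(115, 30, 314, 84) returns approximately (79.3, 78.9).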

Preliminary assessment of Eye Fixation-Locked ERPs
Study 2 examined an additional alternative for time-locking the EEG to target detection: data were acquired with a synchronized combination of EEG and eye tracking during a target detection task (n=13). This allowed eye fixations (>100 msec duration) to be inserted into the EEG record as triggers for computing time-locked potentials that could be sorted and averaged for hits, correct rejections, false alarms and misses (definitions: hit, participant fixating on a target and responding; miss, fixating on a target with no response; correct rejection, fixating on a non-target with no response; false alarm, not fixating on a target yet responding). The preliminary evidence suggests that the waveshape characteristics of fixation-locked ERPs, or "FLERPs", resemble those of stimulus-evoked ERPs and, more importantly, that distinct template signatures are associated with hits, correct rejections and misses. Figure 6 shows the waveshapes of fixation ERPs associated with hits, correct rejections, false alarms and misses. Single-trial classifiers are currently under development.

Figure 6: Fixation-locked ERPs showing the template signatures for hits, correct rejections and misses at channels Fz, Cz and POz averaged across 13 participants.
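The four-way sorting of fixation events defined above can be written down directly; the function name and the handling of sub-100-msec fixations are illustrative assumptions, not the authors' implementation:

```python
def classify_fixation(on_target, responded, duration_ms, min_fix_ms=100):
    """Sort one fixation event into the paper's outcome classes:
    hit  = fixating on a target and responding
    miss = fixating on a target, no response
    CR   = fixating on a non-target, no response
    FA   = not fixating on a target but responding
    Fixations shorter than min_fix_ms are too brief to serve as
    time-locking triggers and are skipped (returns None).
    """
    if duration_ms < min_fix_ms:
        return None
    if on_target:
        return "hit" if responded else "miss"
    return "FA" if responded else "CR"
```

Averaging the FLERP epochs within each resulting class would then yield the four template signatures shown in Figure 6.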

GENERAL DISCUSSION
Results from these preliminary investigations show the feasibility of characterizing consistent ERP templates that distinguish targets from non-targets using highly complex satellite images that provide a closer approximation to real-world IA tasking than previously reported (Parra, Spence et al. 2005; Mathan, Whitlow et al. 2006; Sajda, Gerson et al. 2007). Consistent, quantifiable ERP templates for targets/non-targets (T/NT) were identified across participants, an important first step given the use of more complex images than previously reported. The first attempt at developing a single-trial classifier that generalizes across tasks, participants and stimulus sets showed promise given the small number of samples, the limited selection of EEG channels, the use of just one 1000-msec time window, and a single-level classifier. The key to a successful single-trial classifier is identifying known and controllable sources of variability. Latency and difficulty level are two such sources that have been identified, and they are being addressed by adjusting the time windows for classification and developing a multi-layer model. The variability in response times for the single trials (and, by inference, the variable decision times) appeared to be due to variability in stimulus difficulty, so a preliminary classifier was developed by time-locking to the response. Although the response-locked model increased classification sensitivity, its specificity was reduced relative to the stimulus-locked model. The applicability of response-locking may also be limited because a keypress or other user response cannot be guaranteed in many real-world applications, limiting the utility of the platform. Multiple approaches will therefore be required to address needs across applications. The final alternative approach under investigation showed the feasibility of maximizing the accuracy of the time-locked EEG analysis by integrating eye tracking as a trigger for cognitive assessment.
Preliminary evidence suggested that the waveshape characteristics of fixation-locked ERPs, or "FLERPs", resemble those of stimulus-evoked ERPs and, more importantly, that distinct template signatures were associated with hits, correct rejections and misses. To the investigators' knowledge, this is the first report to suggest that eye tracking combined with event-related EEG analysis can be used to classify objects in complex images in a paradigm designed to resemble the working environment of image analysts.

Anticipated Benefits of the RAPID System
The ability to identify patterns or track salient targets across multiple displays is common to contemporary work environments ranging from airport security to satellite imagery analysis to videogame development. The RAPID system is being designed to augment the innate capacities of the human visual processing system. A clearer understanding of the strengths and limitations of human information processing will improve human-machine system interfaces and assist in optimizing the design of work, home and recreational environments. Real-time electroencephalographic (EEG) analysis can now be applied to continuously monitor operator state and quantify levels of fatigue, attention, and workload. Real-time ERP monitoring has been proposed for brain-computer interfaces to increase the throughput of intelligence analysts tasked with processing considerable volumes of satellite imagery. The RAPID architecture is expected to lead to significant increases in the amount of imagery reviewed, as identification and localization of areas of interest within images may occur without an analyst's conscious intervention or motor responses and may be associated with an interest score based on physiological indicators.
By using eye tracking to identify fixation points and EEG/ERP analysis to determine the level of interest at specific fixation locations, imagery review should become substantially faster and more accurate: points of interest can be identified faster than by overt behavioral marking, and potential misuses of tacit knowledge that may result in misses, false alarms or confirmation/anchoring/impact biases may be identified and mitigated in real time. These results provide further support for the feasibility of routinely applying EEG outside the laboratory and suggest applications in education and training, human factors evaluations, military operations and market research. The neurotechnology platform was designed to be interactive, facilitating the creation of closed-loop computational systems that sense cognitive-state changes and adapt to and learn from their human operators, fundamentally changing the way humans interact with technology.

REFERENCES
Berka, C., et al., 2004, Real-time Analysis of EEG Indices of Alertness, Cognition and Memory with a Wireless EEG Headset. International Journal of Human-Computer Interaction, 17(2): 151-170.
Fuchs, S., Jones, D.L. & Hale, K.S., 2007, Approaches for bias mitigation during imagery analysis. Presented at Augmented Cognition International 2007, Baltimore, MD, October 1, 2007.
Itti, L., 2005, Models of Bottom-Up Attention and Saliency. In L. Itti, G. Rees & J. K. Tsotsos (Eds.), Neurobiology of Attention (pp. 576-582). San Diego, CA: Elsevier.
Mathan, S., Whitlow, S., et al., 2006, Neurophysiologically driven image triage: a pilot study. Conference on Human Factors in Computing Systems, Montréal, Québec, Canada: ACM Press.
Parra, L. C., Spence, C. D., et al., 2005, Recipes for the Linear Analysis of EEG. NeuroImage, 28(2): 326-341.
Sajda, P., Gerson, A. D., et al., 2007, Single-trial analysis of EEG during rapid visual discrimination: Enabling cortically-coupled computer vision. In G. Dornhege, K.-R. Müller, et al. (Eds.), Brain-Computer Interface. Cambridge, MA: MIT Press.
WMD Commission, 2005, The Commission on the Intelligence Capabilities of the United States Regarding Weapons of Mass Destruction. March 31, 2005, US Government.
Zelinsky, G.J., Zhang, W., Yu, B., Chen, X., & Samaras, D., 2006, The role of top-down and bottom-up processes in guiding eye movements during visual search. In Y. Weiss, B. Schölkopf & J. Platt (Eds.), Advances in Neural Information Processing Systems, Vol. 18 (pp. 1569-1576). Cambridge, MA: MIT Press.