Conjoining Three Auditory Features: An Event-Related Brain Potential Study
David L. Woods1 and Claude Alain2
Abstract
The mechanisms of auditory feature processing and conjunction were examined with event-related brain potential (ERP) recording in a task in which participants responded to target tones defined by the combination of location, frequency, and duration features amid distractor tones varying randomly along all feature dimensions. Attention effects were isolated as negative difference (Nd) waves by subtracting ERPs to tones with no target features from ERPs to tones with one, two, or three target features. Nd waves were seen to all tones sharing a single feature with the target, including tones sharing only target duration. Nd waves associated with the analysis of frequency and location features began at latencies of 60 msec, whereas Nd-Duration waves began at 120 msec. Nd waves to tones with single target features continued until 400+ msec, suggesting that once begun, the analysis of tone features continued exhaustively to conclusion. Nd-Frequency and Nd-Location waves had distinct scalp distributions, consistent with generation in different auditory cortical areas. Three stages of feature processing were identified: (1) Parallel feature processing (60–140 msec): Nd waves combined linearly, such that Nd-wave amplitudes following tones with two or three target features were equal to the sum of the Nd waves elicited by tones with only one target feature. (2) Conjunction-specific (CS) processing (140–220 msec): Nd amplitudes were enhanced following tones with any pair of attended features. (3) Target-specific (TS) processing (220–300 msec): Nd amplitudes were specifically enhanced to target tones with all three features. These results are consistent with a facilitatory interactive feature analysis (FIFA) model in which feature conjunction is associated with the amplified processing of individual stimulus features. Activation of N-methyl-D-aspartate (NMDA) receptors is proposed to underlie the FIFA process.
Human listeners are often faced with the problem of processing auditory signals in complex environments where multiple sound sources occur simultaneously. For example, at a cocktail party, a listener may attend to one voice amid others from nearby locations. Given that different features of sounds are processed in different regions of auditory cortex (Rauschecker, 1997), voices from different sources will simultaneously excite multiple foci in different cortical fields. This creates a formidable computational problem: How are the features of attended sounds accurately conjoined while excluding the corresponding features of unattended sounds? There has been relatively little speculation about the mechanisms of auditory feature conjunction. The most influential theory of feature conjunction, Feature Integration Theory (FIT, Treisman & Gelade, 1980), was developed to explain the results of visual search experiments. FIT hypothesizes that visual features are preattentively processed in different feature maps (Treisman, 1999). These different features are then conjoined into
a coherent percept through focused spatial attention. The feature-conjunction process depends on a serial scan of a spatially organized “master map” that receives projections from the individual feature maps. As each location is analyzed, the features from stimuli at that location are bound together and evaluated. The features of unattended objects remain unbound and can be misconjoined with one another to produce illusory conjunctions.
Although FIT can successfully explain many aspects of visual feature conjunction, its extension to the auditory modality faces several problems. First, FIT hypothesizes that features are conjoined at attended locations while excluding the features of objects in nearby spatial locations. However, human listeners have only a limited ability to discriminate sound locations. For example, tones differing in frequency by one-half octave must be separated by 15–30° for frequency to be accurately assigned to the correct spatial location (Divenyi, 1999). This poor spatial acuity would make an auditory spatial scan susceptible to the intrusion of features from sounds in nearby locations. Second, while visual stimuli persist at attended locations and can be scanned at leisure, auditory signals are often brief and transient. Although these features may be preserved in short-term auditory memory (Cowan, 1988), the transience of
1 University of California-Davis and Northern California System of Clinics; 2 Rotman Research Institute, Baycrest Center and University of Toronto, Toronto, Canada
© 2001 Massachusetts Institute of Technology
Journal of Cognitive Neuroscience 13:4, pp. 492–509
auditory signals implies that even a rapid serial scan would be faced with the loss of signal integrity. Third, models like FIT that hypothesize a parallel feature-processing stage followed by a serial feature-conjunction process have difficulty in explaining the results of auditory discrimination experiments in which reaction times (RTs) can be faster to conjunction than to single-feature targets (Woods, Alain, & Ogawa, 1998). According to parallel/serial models, the RT to conjunction targets should never be faster than the RT to the most slowly processed individual stimulus feature.
Research using event-related brain potentials (ERPs) has provided another perspective on auditory feature processing and conjunction, inspired in part by ERP-based models of auditory attention. Hansen and Hillyard (1983) developed a parallel/serial model based on the results of an experiment in which participants attended to tones of a specified location (L+) and frequency (F+) amid distractors varying along both dimensions. The participants’ task was to respond to targets of a slightly longer duration. In most conditions, frequency and location features differed in discriminability. For example, in the Location-Easy, Frequency-Hard condition, participants attended to 900-Hz tones in one ear while ignoring 960-Hz tones in the same ear and tones of both frequencies in the opposite ear. Hansen and Hillyard found that attention effects associated with the processing of the difficult-to-discriminate feature only occurred when the tone could not be rejected on the basis of the easier feature dimension. For example, in the Location-Easy, Frequency-Hard condition, attention effects associated with frequency processing only emerged for tones in attended locations.
Hansen and Hillyard proposed a parallel self-terminating (PST) model that posits that the features of auditory signals are analyzed in parallel until evidence accrues, along any feature dimension, that the stimulus is not a target, whereupon processing terminates. Thus, no attention effects would be observed for the analysis of small frequency differences in the wrong spatial location because these tones would be rejected on the basis of location analysis before frequency analysis had begun. The PST model was similar to the attentional trace (AT) theory that had been previously developed by Näätänen (1982). According to Näätänen, incoming stimuli are matched against an attentional template, with the matching process associated with a processing negativity (PN), a negative deflection recorded over those cortical regions that are involved in stimulus analysis. ERP attention effects, revealed in “difference waves,” derive from differences in the duration of the matching processes. Target-like distractors undergo a more prolonged analysis and produce longer-duration PNs. Because the duration of the PN varies with the resemblance of the distractor to the template, a negative difference (Nd) wave can be obtained by subtracting ERPs to rapidly rejected stimuli from ERPs to more
target-like distractors. For example, in Hansen and Hillyard’s Location-Easy, Frequency-Hard condition, location would be rapidly analyzed. Since the matching process would terminate as soon as location differences between the distractor and target were detected, no frequency-related Nd wave would be present. In contrast, the PN to L+F– tones would be evident because the tone would match the target on the basis of location and only be distinguished from the target by longer-duration frequency processing. Consequently, an Nd wave would be evident when ERPs to L–F– tones were subtracted from ERPs to L+F– tones. Its onset latency would reflect the time required to reject tones on the basis of location and its duration would index the additional time needed to reject tones on the basis of frequency.
ERP data from selective attention tasks can also be explained with serial, self-terminating (SST) models. According to SST models, one stimulus feature after another would be matched against a target template until the stimulus was distinguished from the target. The order of the matching process would reflect feature salience, beginning with the most salient feature. As in the PST and AT models, the differential duration of processing of stimuli with different features would give rise to Nd waves. For example, in Hansen and Hillyard’s (1983) Location-Easy, Frequency-Hard condition, participants would initially analyze stimulus location. Since the serial matching process would terminate as soon as the location difference between the distractor and target was detected, no Nd wave would be evident for tones in unattended locations (e.g., L–F+ tones). However, an Nd would be evident for L+F– tones because rejection by frequency could only occur after the more salient location feature had been processed.
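The subtraction that defines an Nd wave can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis pipeline used in the study: the function names, the toy amplitudes, and the fixed-amplitude onset criterion are all assumptions introduced here (the article evaluates Nd onsets statistically, not with a fixed threshold).

```python
def nd_wave(erp_test, erp_baseline):
    """Pointwise difference wave: test ERP minus baseline ERP (same sampling)."""
    if len(erp_test) != len(erp_baseline):
        raise ValueError("ERPs must have the same number of samples")
    return [t - b for t, b in zip(erp_test, erp_baseline)]


def onset_latency(nd, times_ms, criterion_uv):
    """First time point at which the Nd is at least criterion_uv negative.

    Hypothetical onset rule for illustration; returns None if never reached.
    """
    for time, value in zip(times_ms, nd):
        if value <= -criterion_uv:
            return time
    return None


# Toy amplitudes in microvolts (illustrative only, not data from the study).
times_ms = [0, 20, 40, 60, 80, 100]
erp_l_minus_f_minus = [0.0, -0.5, -1.0, -2.0, -3.0, -2.5]  # L-F- tones
erp_l_plus_f_minus = [0.0, -0.5, -1.0, -2.5, -4.0, -3.5]   # L+F- tones

# Nd-Location: ERP to L+F- tones minus ERP to L-F- tones.
nd_location = nd_wave(erp_l_plus_f_minus, erp_l_minus_f_minus)
print(nd_location)
print(onset_latency(nd_location, times_ms, criterion_uv=0.4))
```

With these toy values the difference wave is zero until the two ERPs diverge, so the onset estimate depends only on where the L+F– response becomes more negative than the L–F– response.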
PST, AT, and SST models make similar predictions about Nd waves in conditions where distractors share a distinctive feature with the target but are distinguished by a harder-to-discriminate feature: Nd waves begin at the latency at which distractors differing from the target along the salient feature dimension would be rejected, and continue until they can be rejected on the basis of the harder-to-discriminate feature. However, PST/AT and SST models make different predictions about Nd durations in cases in which targets and distractors differ along two equally salient feature dimensions (e.g., Location-Easy, Frequency-Easy). For example, according to the PST/AT models, L+F– and L–F+ tones would both be rejected at short latencies, as soon as either frequency or location distinguished the distractor from the target. Assuming that the analysis of location and frequency features were equally rapid, L+F– and L–F+ tones would produce Nd waves of short duration and small amplitude. Indeed, Nd waves to the location feature would be expected to be even smaller in amplitude than in Location-Easy, Frequency-Hard conditions because on half of the trials L+F– tones could be
rejected on the basis of frequency before location analysis was complete. The duration of the location Nd would also be expected to be shorter than in asymmetric conditions, because processing would sometimes terminate based on frequency. Hence, the duration of the Nd would no longer reflect the duration of location processing, but rather the difference in the duration of processing of location and frequency. SST models make similar predictions about Nd amplitudes, but predict that Nd wave durations would be similar to those obtained in asymmetric conditions. According to SST models, if the two features were equally discriminable, subjects would attend to each feature equiprobably. Single-feature Nd waves would be obtained when the features of the incoming stimulus matched the attended dimension (i.e., to L+F– tones when the participant was attending to location). Hence, single-feature Nd waves would occur on 50% of the trials, compared to 100% of the trials in asymmetric conditions. However, Nd waves would have the same durations in the two cases, reflecting the time required to analyze the location feature.
Several aspects of the results of recent ERP experiments using two highly discriminable feature dimensions (Frequency-Easy, Location-Easy) cast doubt on PST/AT and SST models (Woods & Alain, 1993; Woods, Alho, & Algazi, 1994). First, Nd-Frequency and Nd-Location waves in these experiments had uniformly long durations (400–450 msec), substantially longer than those predicted by either PST/AT or SST models (see Alho, Sams, Paavilainen, Reinikainen, & Näätänen, 1989). Second, single-feature Nd amplitudes were similar to those obtained in single-dimension attention tasks where the participant’s attention remained focused on a single feature dimension (Hansen & Hillyard, 1988). The ERP results from these feature-conjunction tasks using highly discriminable feature dimensions supported a parallel-exhaustive processing model.
According to this model, the processing of target features, once begun, would continue exhaustively until conclusion, even after evidence had accrued distinguishing distractors from the targets. Prominent conjunction-specific (CS)-Nd waves were also noted in these experiments. These could be visualized by subtracting the sum of single-feature Nd waves (e.g., Nd-Frequency plus Nd-Location) from Nd waves to tones that contain both attended features. CS-Nd waves began at longer latencies than Nd waves to individual stimulus features. PST/AT and SST models also predict the existence of CS-Nd waves at longer latencies than Nd waves associated with single-feature processing. According to PST/AT models, the initial processing of frequency and location would occur in parallel and independently. During this initial period, Nd waves to L+F+ tones would equal the sum of Nd waves to L+F– and L–F+ tones, so no CS-Nd wave would be recorded. However, since L+F+ tones would undergo a longer-duration
analysis, a CS-Nd wave would occur at long latencies. Its onset latency would reflect the time required to reject distractors sharing a single feature with the target, and its duration would reflect the time required to finish the analysis of the second target feature, as well as any possible target-related processing instigated after sensory analysis was complete. According to the SST model, Nd waves would occur on 50% of single-feature trials and to 100% of L+F+ tones. Therefore, the initial Nd-L+F+ wave would equal the sum of Nd-L+F– and Nd-L–F+ waves. Consequently, there would be a short-latency stage of apparently parallel feature processing. However, the processing of L+F+ tones would continue after the processing of L–F+ and L+F– tones had terminated. Hence, differences between Nd-L+F+ and the sum of Nd waves to tones with single target features would emerge. The onset latency of this CS-Nd would reflect the time required to reject a tone based on one feature, and its duration would reflect the additional time needed to process the other feature. We have found that CS-Nd waves in two-dimensional attention tasks were delayed in latency by 50–80 msec with respect to single-feature Nd waves (e.g., Woods et al., 1994), a magnitude of delay that could be plausibly explained by either model.
A facilitatory interactive feature analysis (FIFA) model was also proposed to account for CS-Nd results (Woods et al., 1994). The FIFA model assumes that features are processed in parallel and exhaustively, but that CS-Nds emerge because the neuronal activity associated with later stages of feature processing is amplified when several target features are present concurrently. This amplification occurs after a delay that reflects intrinsic properties of cortical connectivity (described in more detail below). According to the FIFA model, it is this enhanced processing that is responsible for CS-Nd waves.
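The CS-Nd described above is a residual: the Nd to a tone with several attended features minus the sum of the corresponding single-feature Nd waves. A minimal Python sketch follows; the function name and the three-sample toy waveforms are hypothetical and serve only to make the arithmetic concrete.

```python
def cs_nd(nd_conjunction, single_feature_nds):
    """Conjunction-specific residual.

    Subtracts the pointwise sum of the single-feature Nd waves from the Nd
    wave elicited by tones that carry all of those features together.
    """
    lengths = {len(nd_conjunction)} | {len(nd) for nd in single_feature_nds}
    if len(lengths) != 1:
        raise ValueError("all Nd waves must have the same number of samples")
    summed = [sum(samples) for samples in zip(*single_feature_nds)]
    return [c - s for c, s in zip(nd_conjunction, summed)]


# Toy amplitudes in microvolts at three successive latencies (illustrative only).
nd_location = [-0.5, -1.0, -1.0]   # Nd to L+F- tones
nd_frequency = [-0.5, -1.0, -1.5]  # Nd to L-F+ tones
nd_both = [-1.0, -2.0, -4.0]       # Nd to L+F+ tones

residual = cs_nd(nd_both, [nd_location, nd_frequency])
print(residual)
```

A residual of zero at early samples corresponds to linear summation of the single-feature effects; a negative residual at later samples is what the text calls a CS-Nd wave. The same function accepts three single-feature waves for a three-feature conjunction.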
FIFA, PST/AT, and SST models make different predictions about single-feature and CS-Nd waves in feature-conjunction tasks using more than two stimulus dimensions. The current experiment was performed to test several of the predictions. The paradigm is shown in Figure 1. Three highly salient cues were used: frequency, location, and an easily discriminable difference in stimulus duration. Participants performed a three-feature conjunction task, detecting target tones of a prespecified location, frequency, and duration (L+F+D+). There were 18 different tones, derived from the combination of three frequencies (250, 1000, and 4000 Hz), three durations (8, 24, and 72 msec), and two locations (left and right ears). The tones were presented randomly and equiprobably with the stipulation that mid-duration tones (24 msec) were never targets. Although the duration cue was more salient than in previous studies, both behavioral and Nd data (see below) indicated that duration was more slowly processed than either frequency or location. Such a difference in feature salience leads to contrasting predictions from PST/AT, SST, and FIFA models.
First, PST/AT and SST models predict an absence of Nd-Duration waves to L–F–D+ tones. According to these models, the more salient location and frequency cues would be processed before duration, so that L–F–D+ tones would be rejected prior to duration analysis. In contrast, the FIFA model proposes that stimulus features are processed in parallel and exhaustively, and hence predicts that Nd-Duration waves should be present even if they have significantly longer onset latencies than Nd-Frequency or Nd-Location waves.
Second, PST/AT and SST models predict absent CS-Nd waves for two-feature conjunctions that include target duration but that are distinguished from the target by a salient feature (either frequency or location). According to the PST/AT model, the location and frequency features would be processed before duration. Hence, both L–F+D+ and L+F–D+ tones would be rejected before duration analysis took place (e.g., L+F–D– tones would be rejected as fast as L+F–D+ tones). According to the SST model, participants would analyze one salient feature (either frequency or location) and then the other. Duration would only be analyzed for L+F+ tones. Again, the analysis of L–F+D+ tones would terminate before duration analysis, and no CS-Nd should be evident. In contrast, the FIFA model posits that feature processing occurs in parallel and exhaustively. Because the enhanced feature processing of tones with several target features is a consequence of the intrinsic properties of cortical connectivity, the full spectrum of two-feature CS-Nd waves should be evident, including for two-feature pairings that include duration.
Figure 1. The paradigm. Participants performed an auditory conjunction task, detecting target tones of a prespecified location, frequency, and duration (L+F+D+) among tones varying randomly in location (left or right ear), frequency (250, 1000, or 4000 Hz), and duration (8, 24, or 72 msec). An example of the stimulus sequence is shown during “attend 72 msec, 250 Hz Left” conditions. Letters indicate tone frequencies; fonts (lower case, italic, and upper case) indicate tone durations. Mid-duration tones (24 msec) were never targets.
RESULTS
Behavioral Performance
Task performance is summarized in Table 1. Participants detected 78.3% of targets. RTs averaged 453 msec. Although duration information was delivered earlier for short- than for long-duration targets (target information became available at 8 vs. 24 msec), responses to long-duration targets were significantly faster (416 vs. 489 msec) and more accurate (83% vs. 71%) [F(1,11) = 43.6 and F(1,11) = 11.3, p < .001 and p < .006] than responses to short-duration targets. The Duration × Frequency interaction was also significant, reflecting greater duration-related RT facilitation for 250-Hz tones (91 msec) than for 1000- or 4000-Hz tones [76 and 71 msec, respectively, F(2,22) = 4.6, p < .03]. Participants were more accurate at detecting targets in the right than in the left ear [80% vs. 75%, F(1,11) = 6.1, p < .03], but the small right-ear advantage in RT failed to reach significance [F(1,11) = 2.1, p < .2]. Overall false alarm (FA) rates were low (1.19%/nontarget stimulus) but varied significantly with distractor type [F(10,110) = 12.4, p < .001]. FA rates varied substantially for distractors differing from the target in a single feature, producing a significant main effect of distinguishing feature on FA rate [F(2,22) = 10.43,
Table 1. Mean (±SEM) RT and Accuracy (Hit Percentage − FA Percentage) for Each Condition
Duration   Frequency (Hz)   Ear     RT (msec)   Accuracy (%)
Long       250              Left    426 ± 30    80 ± 8
Long       250              Right   394 ± 15    91 ± 5
Long       1000             Left    424 ± 29    77 ± 7
Long       1000             Right   422 ± 31    82 ± 8
Long       4000             Left    418 ± 29    86 ± 7
Long       4000             Right   414 ± 27    82 ± 8
Short      250              Left    497 ± 14    70 ± 7
Short      250              Right   505 ± 18    71 ± 5
p < .001]. Pairwise comparisons revealed that FA rates were higher for L+F+D– (24 msec) tones distinguished only by duration (7.6%/stimulus) than for L+F–D+ [1.7%, F(1,11) = 17.5, p < .002] or L–F+D+ tones [2.6%, F(1,11) = 7.7, p < .02], distinguished respectively by frequency and location. FA rates for tones with single attended features did not differ significantly from each other or from FA rates to tones with no attended features (1.05%).
ERPs: Exogenous Components
Figure 2 shows grand mean ERPs to tones with no target features (L–F–D–). ERPs to unattended tones were characterized by contralateral fronto-central N1 and symmetrical P2 deflections (further details of these results are presented in Alain, Woods, & Covarrubias, 1997). The N1 deflection receives contributions from multiple generator sources (Picton et al., 1999; Näätänen & Picton, 1987) including those that produce N1a, N10, N1b, and N1c subcomponents (Woods, 1994). The N1a was measured on the rising phase of the N1 at midtemporal sites. It was larger over the left than the right hemisphere [at 60 msec, F(1,11) = 14.03, p < .01; at 80 msec, F(1,11) = 7.75, p < .025], consistent with previous reports (Knight, Scabini, Woods, & Clayworth, 1988). The N10 (80–110 msec) occurred over central/midline sites concurrently with the N1a. It showed a tonotopic distribution similar to that previously described (Alcaini, Giard, Thevenet, & Pernier, 1994; Woods et al., 1994), reflected in a significant Frequency × Electrode interaction [at 70–90 msec, F(44,484) = 7.29, p < .001]. The N10 showed a parietal–central maximum following 250-Hz tones, a central maximum following 1000-Hz tones, and a frontal maximum following 4000-Hz tones. A P90 component became evident at contralateral posterior temporal sites following 4000-Hz stimuli. The P90 has been explained as the positive face of a rotating dipole field that becomes evident at the scalp surface for higher frequencies (Woods et al., 1994). The longer-latency N1b (110–130 msec) component showed a fronto-central midline distribution. A subsequent N1c component (130–150 msec) was most clearly evident at contralateral mid-temporal sites and was larger over the right than the left hemisphere [at 130–150 msec, F(1,11) = 15.2, p < .01]. The N1c was followed by a centrally distributed P2 wave (190–220 msec).
ERPs: Single-Feature Attention Effects
Figure 3 shows the method used to extract Nd waves: ERPs to tones with no attended features (L–F–D–, dotted line) were subtracted from ERPs to tones that shared one or more features with the target (e.g., thin solid line, Figure 3). The most prominent attention-related enhancements occurred at frontal–central scalp sites and reflected a negative displacement of the ERP (Nd) that began at
Figure 2. Exogenous ERPs. ERPs to tones with no attended features (L–F–D–, dotted line) shown separately for 250-, 1000-, and 4000-Hz tones. Electrodes have been transposed so that those on the right of the figure were opposite to the stimulated ear. The insert shows enlarged ERPs from the Fz electrode. ERPs shown were averaged over participants, frequencies, and ear of delivery.
Figure 3. Nd-wave extraction. Attention effects were examined by subtracting ERPs to tones with no attended features (L–F–D–, dotted line) from ERPs to tones with one or more attended features, revealing negative difference (Nd) waves. Grand average ERPs from the Fz and contra- and ipsilateral posterior temporal electrodes (PTc, PTi). ERPs and Nd waves are shown to L+F–D– and L–F+D– tones of 24- and 72-msec durations. Attention effects included the modulation of the exogenous, tonotopic N10/P90 component elicited by high frequencies. ERPs shown were averaged over participants and ear of stimulus delivery.
about 60 msec and continued for hundreds of milliseconds. The Nd included early and late phases: the Nde (60–240 msec) and Ndl (260–700 msec) (Hansen & Hillyard, 1980). Tones sharing target location (L+F–D–) or frequency (L–F+D–) produced short-latency Nd waves [e.g., at 50–70 msec for 24-msec tones, F(1,11) = 11.7, p < .001]. Initially, these Nd-Location and Nd-Frequency effects were of comparable amplitude. However, from 100 to 260 msec, Nd-Frequency waves were significantly larger than Nd-Location waves [F(1,11) ranged from 4.8 to 61.9, p < .05 for all comparisons]. This is consistent with behavioral results indicating that tone frequency is a more salient cue than location in guiding attention in high-rate auditory selective attention tasks (cf. Woods et al., 1998; Woods, Alain, Diaz, Rhodes, & Ogawa, 2001). Figure 4 shows the Nd waves elicited by tones with single attended features. ERP data provided evidence of a short-duration stage of parallel, independent feature
processing. During this stage (60–120 msec), Nd-wave amplitudes to tones with two or three target features equaled the sum of Nd waves to individual tones, each with a single attended feature. These short-latency (400 msec) exceeded the ISIs used in the current experiment (mean 300 msec). This implies that several stimuli often underwent concurrent analysis. The temporally overlapping analysis of successively presented stimuli is more clearly seen in experiments using even higher rates of stimulus delivery (Woods & Alain, 1993), and can be associated with illusory conjunctions of features between successively presented tones (Woods et al., 1998). Thus, the Nd-wave data suggest that tone features are analyzed in parallel both in space (different tone features are concurrently analyzed in different cortical locations) and in time (successive tones undergo concurrent analysis).
The Effects of Stimulus Duration on Feature Processing
Single-feature Nde latencies showed a graded relationship to stimulus duration. Nde onset latencies were
shorter for 8- than for 24- or 72-msec tones, with the latency differences similar to the physical differences in tone duration. This suggests that Nde may begin at a constant delay after stimulus offset rather than after a constant delay following stimulus onset. This would be computationally economical: If auditory feature analysis began at stimulus onset, computational resources might often be prematurely committed prior to the delivery of the relevant stimulus information. It suggests that the burst of neuronal activity seen at tone offset (e.g., deCharms & Merzenich, 1996) may serve to initiate the analysis of higher-order stimulus attributes. In contrast to Ndes, Ndl components had short latencies following 72-msec tones, intermediate latencies following 24-msec tones, and long latencies following 8-msec tones. Consequently, estimates of processing time derived from the differences in peak latency of Ndl and Nde components were about 70 msec longer for 8- than for 72-msec duration tones. These differences were similar to those seen in RTs. One possible explanation is that the strategic/mnemonic processing associated with the Ndl is accomplished more rapidly for the clearer sensory representation provided by longer-duration stimuli.
Stages of Feature Processing and Conjunction
There was evidence for three separate stages of feature processing. During the first stage (60–120 msec), auditory features were processed in parallel and independently and attention effects combined linearly. During the second stage (120–220 msec), Nd amplitudes elicited by tones with two or three attended features exceeded the sum of the Nd amplitudes to tones with the constituent individual features. This CS-Nd stage was delayed by 60–80 msec with respect to the slowest component single-feature Nd wave. This suggests that CS processing only began after the most slowly processed constituent single feature had been partially analyzed.
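The offset-locked account of Nde onset discussed earlier in this section implies a simple relation: if analysis begins at a constant delay after tone offset, onset latency measured from tone onset equals tone duration plus that delay. The Python sketch below illustrates the consequence. The 40-msec delay is a hypothetical placeholder, not a value reported in the article, and the function names are introduced here for illustration.

```python
def offset_locked_onsets(durations_ms, delay_after_offset_ms):
    """Predicted Nde onset (re: tone onset) if analysis starts a fixed delay
    after tone offset: onset = duration + delay."""
    return {d: d + delay_after_offset_ms for d in durations_ms}


def onset_shifts(onsets, reference_duration_ms):
    """Onset latency of each tone duration relative to a reference duration."""
    reference = onsets[reference_duration_ms]
    return {d: onset - reference for d, onset in onsets.items()}


DURATIONS_MS = (8, 24, 72)  # tone durations used in the experiment

# The delay value is an assumption chosen only for illustration.
onsets = offset_locked_onsets(DURATIONS_MS, delay_after_offset_ms=40)
shifts = onset_shifts(onsets, reference_duration_ms=8)
print(shifts)
```

Whatever delay is assumed, the predicted latency shifts relative to the 8-msec tone equal the physical duration differences (16 and 64 msec), which is the pattern the text describes for Nde onsets. An onset-locked account would instead predict no shift across durations.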
During the third, TS stage (200–720 msec), final sensory analysis and response selection/production occurred. This stage began at 200 msec, with the earliest signs of target detection evident bilaterally over dorsolateral prefrontal cortex. Later TS deflections included ERP components specifically associated with target detection (the N2b and P3), as well as premotor potentials associated with response preparation and production.
Feature Processing: Self-Terminating or Exhaustive?
PST models like that of Hansen and Hillyard (1983), Näätänen’s (1992) AT model, and SST models argue that stimulus features are matched against a template of the target, with stimuli being rejected as soon as any stimulus feature mismatches any feature of the target. A PN is
produced as long as the matching process continues. These models predict longer-latency CS-Nds because targets and target-like distractors undergo longer-duration analysis. While these models can effectively account for the results of many experiments, they have difficulty accounting for three aspects of the current data: (1) the long duration of single-feature Nd waves; (2) the occurrence of Nd-Duration waves; and (3) relative Nd-wave amplitudes. According to PST/AT models, single-feature processing should cease as soon as evidence is accrued, along any dimension, that a stimulus is not a target. Behavioral studies suggest that this evidence is available at quite short latencies. For example, RTs average about 300 msec for discrimination of frequency separations similar to those used in the current experiment (Woods et al., 2001). Since the transmission of motor commands and muscle contraction associated with the button press require about 130 msec (see above), the discrimination of frequency would appear to initiate motor commands at about 170 msec (300 msec minus 130 msec). Thus, L+F–D– and L+F–D+, L+F–D– and L–F–D+ tones would presumably be rejected before 170 msec. This model therefore predicts that Nd-wave durations should be much shorter than the Nd-wave durations actually observed. Both PST/AT and SST models also have difficulty in explaining the occurrence of Nd waves to the tone duration feature. Both location and frequency were more salient cues than duration, as reflected in the pattern of FAs produced by distractors, and by the fact that Nd-Frequency and Nd-Location waves had larger amplitudes and began some 50 msec before Nd-Duration waves. Therefore, according to PST/AT models the rejection of tones with nontarget location or frequency should occur before Nd-Duration waves would be generated. SST models have a similar difficulty. Participants would be expected to focus attention on the more salient frequency or location cues.
Consequently, F–L–D+ tones would be rejected prior to duration analysis. Of course, it might be argued that participants occasionally attended to tone duration. However, if so, the nature of the task would presumably have encouraged them to do so more often when targets were of short duration (where duration and frequency/location information were delivered at similar latencies), rather than in long-duration target conditions (where duration cues lagged frequency/location cues by 24 msec). Thus, this line of reasoning would imply larger Nd-Duration waves for short-duration tones. In fact, Nd-Duration waves were larger in 72- than in 8-msec conditions.
PST/AT models make the more general prediction that Nd waves to tones with single attended features should be of low amplitude and short duration in multidimensional attention tasks with stimuli varying along several salient feature dimensions. Consider an experiment in which the target is characterized by three salient features, A+B+C+. Since the features are analyzed in parallel and are equally discriminable, A–B–C– tones will be rejected on the basis of each feature on one-third of the trials. By extension, A+B–C– tones will be rejected on two-thirds of the trials as rapidly as A–B–C– tones. Only one-third of the A+B–C– trials (those that would have been rejected on the basis of the A feature) would undergo additional analysis. Thus, in PST/AT models Nd-wave amplitudes to tones sharing single target features would be inversely related to the number of salient features. In addition, Nd waves to tones sharing single target features would be of very short duration. For example, the analysis of A+B–C– tones would terminate as soon as evidence from either the B or C feature became available. Since all features are analyzed in parallel, the rejection of the A+B–C– tone on the basis of the B or C features would occur almost simultaneously with its selection on the basis of the A feature. Thus, the duration of single-feature Nd waves should also diminish as the number of features distinguishing the target and distractors increases.
SST models make different predictions about Nd-wave amplitudes and durations. In cases where features differ in discriminability, participants should always analyze the most discriminable feature first. In this case, the SST model predicts a single-feature Nd wave to the most salient tone feature, but no Nd waves of any sort to tones lacking this feature. Alternatively, if feature salience were carefully equated, participants might attend to different features on different trials or on different blocks of trials. If so, the amplitude of single-feature Nd waves would vary inversely with the number of features that distinguish targets and distractors. Consider the aforementioned "A, B, C" set of equally salient features. An Nd-A wave would only be generated on trials when the participant processed the A feature first.
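The amplitude predictions in the two preceding paragraphs reduce to simple counting. The sketch below is illustrative only: the function names are hypothetical, and it assumes n equally discriminable features, with each feature equally likely to supply the first rejecting evidence (PST/AT reading) or to be analyzed first (SST reading with equated salience).

```python
from fractions import Fraction


def pst_at_fraction_analyzed(n_features: int) -> Fraction:
    """Fraction of single-target-feature trials (e.g., A+B-C-) that escape
    early rejection under the PST/AT reading described in the text.

    Assumption: on a tone with no target features, each of the n features is
    equally likely to supply the first rejecting evidence. A tone carrying one
    target feature is spared only on the trials where that feature would have
    been the one to trigger rejection.
    """
    return Fraction(1, n_features)


def sst_fraction_analyzed(n_features: int) -> Fraction:
    """Fraction of trials on which an Nd-A wave is generated under the SST
    reading with equated salience: only trials where A is analyzed first."""
    return Fraction(1, n_features)


# Predicted share of trials contributing to a single-feature Nd wave
# as the number of distinguishing features grows.
for n in (1, 2, 3):
    print(n, pst_at_fraction_analyzed(n), sst_fraction_analyzed(n))
```

Under these assumptions both accounts give one-third for the three-feature case, which is why each predicts that averaged single-feature Nd amplitudes should shrink as feature dimensions are added.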
However, unlike PST/AT models, the SST model predicts that the duration of a single-feature Nd wave would reflect the time needed to analyze a feature and hence would be unaffected by the number of feature dimensions. Thus, the SST model predicts that the amplitude of Nd waves to tones with single target features would vary inversely with the number of distinguishing target features, but that the duration of the Nd wave would be similar to that obtained in single-feature attention tasks. In contrast, parallel exhaustive models posit that Nd waves should have similar amplitudes and durations in single and multidimensional tasks. This prediction was supported in the current experiment, where Nd-Frequency and Nd-Location waves had long durations and amplitudes that were similar to those in previous unidimensional attention experiments using stimuli of similar discriminability (e.g., Hansen & Hillyard, 1988).
Finally, PST/AT, SST, and FIFA models make different predictions about RTs in feature-conjunction tasks. According to PST/AT and SST models, RTs to targets in single-feature conditions should always be faster than RTs to the same stimuli in feature-conjunction conditions. In PST/AT models, stimulus features are processed in parallel so that the time needed to process two features would necessarily equal or exceed the time needed to process the more slowly processed feature when presented alone. According to SST models, RTs would be faster in single-feature than in feature-conjunction conditions because one feature-processing stage would be required in the former condition but two successive stages would be required in the latter. In contrast, the FIFA model hypothesizes that the processing of one target feature can actively facilitate the processing of concurrent features of the same stimulus. Consequently, the FIFA model predicts that RTs could theoretically be shorter in feature-conjunction than in single-feature conditions. Consistent with this prediction, behavioral studies have shown that RTs can sometimes be faster in feature-conjunction than in single-feature conditions (e.g., Woods et al., 1998; Woods et al., 2001).
Neuronal Substrates of Auditory Feature Conjunction
There are several possible neuronal substrates of the CS-Nd waves. First, it is possible that intermediate stages of feature combination are mediated by a "master map" (Treisman & Gelade, 1980) that receives projections from individual feature maps. As each stimulus is analyzed, features from the feature maps are bound together and evaluated, with the CS-Nd reflecting the activation in the master map. Alternatively, CS-Nd waves may be generated from increased activity in cortical fields processing the individual stimulus features. Both possibilities remain compatible with the scalp distributions of the CS-Nds. Fronto-central scalp distributions could be produced by a master map along the superior temporal plane, or alternatively by increased activity in the generators responsible for single-feature Nd waves.
The basic observation of the current experiment was that Nd waves associated with the processing of single attended features were disproportionately enhanced and prolonged when stimuli contained several attended features. Thus, the ERP data appear to reflect FIFA. The FIFA mechanism differs from those proposed in guided search (Wolfe et al., 1989) because of its nonlinearity. In the guided search model, the magnitude of the response to a stimulus with two attended features would equal the sum of the activations elicited by each feature. In contrast, the FIFA model is based on the ERP results that suggest that neuronal responses to attended features undergo active amplification during conjunction. Although in a linear system the most salient stimuli would automatically undergo the greatest neuronal processing, the activation caused by a target stimulus would be similar to that caused by a target-like distractor. For example, consider a three-feature conjunction task, with two highly discriminable features (A and B), and a less discriminable third feature (C). Because of trial-to-trial variability, the magnitude of neuronal response to A+B+C– distractors might often overlap with responses to A+B+C+ targets. In contrast, in the FIFA model, feedback between the cortical fields processing different stimulus features would enhance responses in the presence of other attended features, and, hence, amplify differences between targets and A+B+C– tones. This feedback system has a Bayesian character: mutual coactivation would increase as stimuli became more target-like. One result would be that less evidence of target frequency would be required for tones with target location. The mechanism proposed would thus enhance the neuronal contrast between targets and target-like distractors.
How could the processing of target features be amplified when they occur in combination with other target features, but not when they occur alone? The current experiment offers two clues. First, ERP signs of feature conjunction began 60–80 msec after the onset of Nd waves associated with single-feature processing. This delay appeared to reflect the requirement that the more slowly processed feature undergo at least 60–80 msec of analysis prior to feature conjunction. Second, ERP scalp distributions to CS-Nd waves were consistent with the suggestion that CS-Nds reflected cross-field facilitation that occurred 60–80 msec after the onset of single-feature processing. It might be argued that any such facilitation would require implausible anatomical specificity, namely point-to-point connections between neurons responding to one attended feature and those responding to another.
Since attention can be focused on any arbitrary combination of features, models that hypothesize such specific interactions are faced with a seemingly insurmountable combinatorial problem, because participants can attend to a potentially infinite number of feature combinations. Insofar as the occurrence of a particular value in one feature dimension enhances the processing of a particular value in another, and vice versa, the FIFA model would appear to require a correspondingly infinite number of point-to-point connections. However, projections would not need to be so anatomically specific if they could also be modulated by temporal factors. There exists a neurotransmitter system with a temporally intrinsic, conjunction-related activation pattern: N-methyl-D-aspartate (NMDA) receptors (see Shors & Matzel, 1993 for a further discussion of the role of NMDA receptors in attention). The activation of NMDA receptors occurs at long latencies, and, more importantly, NMDA receptors are only activated when neurons are already depolarized. Hence, NMDA projections incorporate an essential aspect of the positive feedback proposed in the FIFA model: only neurons that are already active would have their activity enhanced by NMDA activation. Since Nd waves have longer durations than exogenous ERPs, at long latencies neuronal excitation becomes increasingly limited to stimuli with attended features. Therefore, if relatively diffuse glutamatergic projections arrived at NMDA channels after some delay, excitation would only occur in active neurons, i.e., those neurons involved in the continued processing of attended stimulus features. This temporal contingency would reduce the requirements for precise point-to-point neural projections. The effects of even a diffuse projection would be functionally restricted to enhancing attention-dependent neuronal activity.
A schematic diagram of the FIFA model is shown in Figure 9. The model assumes that NMDA projections originate in the neuronal fields involved in generating single-feature Nd waves and modulate a common source of diffuse NMDA projections to all auditory fields. The activation of NMDA projections will occur after a delay, relative to both the exogenous and the initial attention-related activity elicited by the stimulus. In the case where only single target features were present (Figure 9A), glutamatergic projections would arrive at relatively long latencies (140+ msec) in other cortical fields. At
this latency, the ERP data suggest that neuronal activity is largely restricted to neurons responding to attended tone features. Hence, the NMDA projection would have little effect, since few depolarized neurons would be present in the recipient fields. However, if two or three target features were present in the same tone (Figure 9B), the necessary conditions would be present for CS neuronal facilitation. Efferent projections would then activate NMDA receptors in all fields. Moreover, because of the delay in NMDA projections and NMDA receptor activation, reafferent amplification would be particularly important for less salient features, since the projections to NMDA receptors would arrive at the regions processing less salient features concurrently with maximal bottom-up excitation.
The NMDA mechanism implies that enhanced processing might also occur for distractors that occur in close temporal proximity with attended sounds. Indeed, there is behavioral evidence of false conjunctions between successively presented auditory signals (Woods et al., 1998). However, three factors might mitigate the problem. First, the time course of NMDA activation would be optimized for processing the different features of the same sound. The decay in activity of NMDA projections would reduce the probability of facilitated processing of stimuli occurring with significant temporal offsets from attended stimuli. Second, NMDA projections would be primarily restricted to auditory cortical fields that show substantial attentional lability. Since attentionally sensitive neurons in these fields appear to be engaged primarily by stimuli with attended features (see above), the processing of distractors overlapping in time with attended stimuli, but lacking attended features, would not be enhanced. Finally, the efferent projections are proposed to incorporate a crude anatomical specificity, so that they would project to attentionally sensitive zones responding to similar sounds. For example, they might project tonotopically to the attentionally labile fields. Thus, the first, gating stage of auditory selection (the N10/P90) might restrict processing to those stimuli of attended frequency, possibly utilizing an "attentional parasol" mechanism diminishing subsequent activation caused by stimuli without attended features (Woods, 1990; cf. Treisman & Sato, 1990). In contrast, later stages would utilize an "attentional spotlight" to enhance neuronal activity in stimuli with attended features, with this feature activity further amplified by the FIFA mechanism.
Figure 9. (A) FIFA. A schematic representation of the stages of feature processing of tones with two salient feature dimensions (A and B) and one less salient dimension (C). The A–B–C+ stimulus activates attentionally primed areas in the C-feature field that are activated after some delay. Areas in the activated fields project to a modulatory center that sends diffuse NMDA projections to areas in other fields processing attended features. However, since there is no activation in these fields, NMDA projections have no effect. At longer latencies, excitation gradually increases in the C-processing field. (B) NMDA-mediated FIFA occurring for A+B+C+ tones. Salient target features A and B elicit short-latency activation in feature-processing fields and delayed activation of efferent NMDA projections to regions processing similar stimuli in other fields. Efferent NMDA projections potentiate the long-latency activation of attentionally labile neurons, with efferent activation having relatively more impact on the processing of less salient (e.g., C) features. See text for further details.
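The contrast drawn in this section between linear summation and gated facilitation can be made concrete with a toy calculation. This is a minimal sketch under stated assumptions, not the authors' quantitative model: the activation values, the `gain` constant, and both function names are hypothetical, and the NMDA-like gate is approximated by multiplying the incoming diffuse projection by the recipient field's own activity, so that silent fields receive no boost.

```python
def linear_response(activations: dict) -> float:
    """Linear account: total response is the plain sum of field activations."""
    return sum(activations.values())


def fifa_response(activations: dict, gain: float = 0.5) -> float:
    """Toy facilitatory account with an NMDA-like gate.

    Each field receives a diffuse projection proportional to the summed
    activity of the *other* fields. The boost is multiplied by the field's
    own activity, so a silent (non-depolarized) field gains nothing.
    `gain` is an arbitrary illustrative constant.
    """
    total = 0.0
    for field, own in activations.items():
        others = sum(a for f, a in activations.items() if f != field)
        total += own + gain * own * others
    return total


# Hypothetical activations: A and B salient, C less salient.
single = {"A": 1.0, "B": 0.0, "C": 0.0}        # one target feature only
distractor = {"A": 1.0, "B": 1.0, "C": 0.0}    # A+B+C- (target-like)
target = {"A": 1.0, "B": 1.0, "C": 0.4}        # A+B+C+

for name, tone in [("single", single), ("A+B+C-", distractor), ("A+B+C+", target)]:
    print(name, linear_response(tone), round(fifa_response(tone), 3))
```

With these made-up numbers the single-feature tone is unaffected by the gate, while the difference between the target and the target-like distractor is larger under facilitation than under linear summation, which is the qualitative point of the argument above.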
CONCLUSIONS
ERP recordings in a three-feature auditory feature-conjunction task provided evidence of attentional modulation of ERPs to tones with one, two, and three target features. Evidence of attentional modulation of tones with single target features began at latencies as short as 60 msec, lasted for more than 400 msec, and suggested that different features were processed in different auditory cortical fields. The long duration of single-feature Nd waves contradicts models that hypothesize that processing is terminated as soon as evidence is available that permits the distinction of the distractor from the target. Rather, auditory features appear to undergo exhaustive feature processing that occurs in parallel in both space (different features are processed in different locations) and in time (features of successive stimuli can undergo concurrent analysis). When stimuli contain two or three target features, ERPs show evidence of FIFA: the attention effects associated with processing tone features are enhanced. The FIFA model proposes a nonlinear amplification of neuronal activity that distinguishes the feature-conjunction target from distractors through reciprocal NMDA projections between feature-processing fields.
METHODS
Participants
Twelve participants (5 women, age range 18–42) gave informed consent according to procedures approved by the Veterans Administration Medical Center and University of California–Davis. All participants were right-handed with normal hearing and were paid for their participation.
Design
The paradigm is shown in Figure 1. Eighteen different tones, derived from the combination of three frequencies (250, 1000, and 4000 Hz), three durations (8, 24, and 72 msec), and two spatial locations (left and right ears), were presented randomly and equiprobably. Tone intensities were adjusted so that subjective loudness was equated for the 72-msec duration tones of different frequencies (98 dB SPL for 250 Hz, 92 dB for 1000 Hz, and 90 dB SPL for 4000 Hz). The same peak intensities were then used for the 8- and 24-msec tone durations. All stimuli were shaped with 4-msec rise/fall times. Stimuli were presented through insert earphones to the left or right ear over broadband, binaural masking noise (76 dB SPL). Stimuli were presented in random order and at variable stimulus onset asynchronies (SOA, 200–400 msec, mean 300 msec).
Participants were tested on 3 successive days, with the testing sessions separated by 3–10 days. On the first day, participants were audiometrically screened and given a series of training sequences to assist them in discriminating different tone frequencies and durations. On the second and third days, they pressed a button to a designated target of a prespecified frequency, location, and duration. There were 12 conditions, corresponding to the 12 possible targets derived from the combination of frequency (250, 1000, or 4000 Hz), location (left or right ears), and duration (8 or 72 msec; 24-msec duration tones were never targets). The order of conditions was counterbalanced across participants. Each stimulus block lasted 13 min and 40 sec and contained 2412 stimuli (134 of each type). Stimulus blocks were presented in three segments lasting 4 min and 57 sec each and separated by short pauses. ERPs were recorded to 57,888 trials for each participant.
Behavior was continuously monitored, and participants were encouraged to be both fast and accurate. Correct responses were defined as button presses occurring 150–1000 msec after attended targets. Trials in which participants failed to respond to a target during this interval were categorized as "misses." Responses falling outside the target-hit window (i.e., not preceded by a target for 1.0 sec) were categorized as FAs. A corrected accuracy measure was obtained by subtracting the mean FA rate/stimulus from the hit rate.
EEG Recording
The EEG (bandpass 0.1–300 Hz) was continuously digitized (833 Hz/channel) from 28 electrodes over the scalp (Fp1, Fpz, Fp2, nose, left preauricular, right preauricular, T1, F7, F3, Fz, F4, F8, T2, left mastoid, T3, C3, Cz, C4, T4, right mastoid, T5, P3, Pz, P4, T6, O1, Oz, and O2). The EEG was then decimated by a factor of three to produce an effective sampling rate of 278 Hz. Vertical and horizontal eye movements were recorded from electrodes lateral to and below the left eye. All electrodes were referenced to four interconnected, electrocardiogram-balanced electrodes at the base of the neck (Woods & Clayworth, 1985).
Data Analysis
ERPs were extracted off-line by computer. Trials contaminated by blinks, vertical or horizontal eye movements (in excess of 100 µV), excessive peak-to-peak deflections, amplifier clipping, or bursts of EMG activity were excluded from the average. Following averaging, peak amplitudes and latencies of ERPs were quantified under software control. Measurements were obtained relative to a 200-msec prestimulus baseline, after low-pass digital filtering of the averages to eliminate frequencies above 40 Hz. Mean voltages were obtained at 20-msec intervals starting at 50 msec after stimulus onset and continuing until 310 msec poststimulus, and at 50-msec intervals from 300 to 700 msec poststimulus. The results were statistically evaluated with analyses of variance for repeated measures. The data were initially analyzed at four midline electrodes (Fpz, Fz, Cz, and Pz) and four temporal electrodes (T3, T4, T5, and T6). Only those effects that were significant in these montages were statistically evaluated at all 27 scalp electrodes. Hemispheric effects were examined on the 20 lateral electrodes (including mastoids, anterior temporal, and preauricular placements) that remained after excluding midline and EOG electrodes.
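The sampling and measurement scheme just described can be sketched as follows. This is an illustration rather than the original analysis software: the function names are hypothetical, the windows are assumed to be contiguous and half-open, and the baseline is taken as the 200 msec preceding stimulus onset.

```python
RAW_RATE_HZ = 833      # digitization rate per channel, from the text
DECIMATION = 3         # decimation factor, from the text

# 833 / 3 is about 277.7 Hz, which the text reports as 278 Hz.
effective_rate_hz = RAW_RATE_HZ / DECIMATION


def measurement_windows(start_ms: int, stop_ms: int, step_ms: int) -> list:
    """Contiguous (onset, offset) windows covering start_ms..stop_ms."""
    return [(t, t + step_ms) for t in range(start_ms, stop_ms, step_ms)]


def mean_voltage(samples_uv, times_ms, window, baseline=(-200, 0)) -> float:
    """Mean voltage in `window` minus the mean over the prestimulus baseline.

    Windows are treated as half-open intervals [onset, offset); this is an
    assumption, since the text does not specify the boundary handling.
    """
    def mean_in(lo, hi):
        values = [v for v, t in zip(samples_uv, times_ms) if lo <= t < hi]
        return sum(values) / len(values)

    return mean_in(*window) - mean_in(*baseline)


early_windows = measurement_windows(50, 310, 20)   # 20-msec steps, 50-310 msec
late_windows = measurement_windows(300, 700, 50)   # 50-msec steps, 300-700 msec
print(round(effective_rate_hz), len(early_windows), len(late_windows))
```

Under the contiguous-window assumption this yields 13 early and 8 late measurement windows per waveform.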
In analyzing scalp distributions, Type 1 errors associated with inhomogeneity of variance were controlled by reducing the degrees of freedom associated with individual electrodes and electrode–participant interactions according to a modified Greenhouse–Geisser procedure. Previous studies using five to seven scalp electrodes have established that the degrees of freedom associated with electrode and electrode interactions should be reduced by 40–70% (ε = 0.4–0.7; Ruchkin, 1990). Since we used more electrodes, we adopted a more stringent epsilon correction so that the total degrees of freedom for the electrode factor remained at 2. For example, using the reduced set of eight electrodes, an Electrode (eight levels) × Pitch (three levels) interaction for 12 participants [F(14,154)] would only be reported as significant if its F ratio exceeded that required for significance at 2 and 22 degrees of freedom [F(2,22) = 3.44, p < .05]. In comparing scalp distributions, data were normalized by dividing voltages at all electrode sites by the sum of the squared voltages (McCarthy & Wood, 1985).
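The degrees-of-freedom example above can be checked with a few lines of arithmetic. The sketch is an assumption-laden illustration: the function names are hypothetical, and the correction is modeled as a single epsilon chosen so that the effect degrees of freedom fall to 2, which reproduces the F(2,22) figure quoted in the text but may not match every detail of the modified procedure.

```python
from fractions import Fraction


def nominal_df(levels_a: int, levels_b: int, n_participants: int):
    """Uncorrected df for a two-way repeated-measures interaction."""
    df_effect = (levels_a - 1) * (levels_b - 1)
    df_error = df_effect * (n_participants - 1)
    return df_effect, df_error


def corrected_df(df_effect: int, df_error: int, target_effect_df: int = 2):
    """Scale both df by one epsilon so that the effect df equals the target.

    Assumption: epsilon = target_effect_df / df_effect, applied to numerator
    and denominator alike, as in a Greenhouse-Geisser style correction.
    """
    epsilon = Fraction(target_effect_df, df_effect)
    return epsilon, df_effect * epsilon, df_error * epsilon


# Electrode (8 levels) x Pitch (3 levels), 12 participants, as in the text.
effect_df, error_df = nominal_df(8, 3, 12)
epsilon, effect_df_corr, error_df_corr = corrected_df(effect_df, error_df)
print(effect_df, error_df, epsilon, effect_df_corr, error_df_corr)
```

The nominal values come out as 14 and 154, and the assumed epsilon of 1/7 brings them to 2 and 22, so the observed F ratio would be compared against the 3.44 criterion quoted in the text.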
Journal of Cognitive Neuroscience, Volume 13, Number 4
Acknowledgments
Special thanks to Bill Yund, Kimmo Alho, and Janelle Weaver for careful readings of the earlier versions of this manuscript, and to Diego Covarrubias and John Lackey for software development that made these studies possible. This work was supported by grants from the NIMH, the NINDS, by the V.A. Research Service to D.L.W. and by an FRSQ Postdoctoral Fellowship to C.A.
Reprint requests should be sent to: David L. Woods, Clinical Neurophysiology Neurology Service (127) VA-NCSC, 150 Muir Road, Martinez, CA 94553, USA. E-mail: [email protected].
REFERENCES
Alain, C., Woods, D. L., & Covarrubias, D. (1997). Activation of duration-sensitive auditory cortical neurons in humans. Electroencephalography and Clinical Neurophysiology, 104, 531–539. Alcaini, M., Giard, M. H., Thevenet, M., & Pernier, J. (1994). Two separate frontal components in the N1 wave of the human auditory evoked response. Psychophysiology, 31, 611–615. Alho, K., Sams, M., Paavilainen, P., Reinikainen, K., & Näätänen, R. (1989). Event-related brain potentials reflecting processing of relevant and irrelevant stimuli during selective listening. Psychophysiology, 26, 514–528. Cohen, A. (1993). Asymmetries in visual search for conjunctive targets. Journal of Experimental Psychology: Human Perception and Performance, 19, 775–797. Cowan, N. (1988). Evolving conceptions of memory storage, selective attention, and their mutual constraints within the human information-processing system. Psychological Bulletin, 104, 163–191. deCharms, R. C., & Merzenich, M. M. (1996). Primary cortical representation of sounds by the coordination of action-potential timing. Nature, 381, 610–613. Divenyi, P. L. (1999). The "cocktail party effect" viewed through the lens of psychophysics. Journal of the Acoustical Society of America, 105, 1150. Giard, M. H., Perrin, F., Pernier, J., & Peronnet, F. (1988). Several attention related wave forms in auditory areas: A topographic study. Electroencephalography and Clinical Neurophysiology, 69, 371–384. Hansen, J. C., & Hillyard, S. A. (1980). Endogenous brain potentials associated with selective auditory attention. Electroencephalography and Clinical Neurophysiology, 49, 277–290. Hansen, J. C., & Hillyard, S. A. (1983). Selective attention to multidimensional auditory stimuli. Journal of Experimental Psychology: Human Perception and Performance, 9, 1–19. Hansen, J. C., & Hillyard, S. A. (1988). Temporal dynamics of human auditory selective attention. Psychophysiology, 25, 316–329. Heinze, H. J., Mangun, G.
R., Burchert, W., Hinrichs, H., & Hillyard, S. A. (1994). Combined spatial and temporal imaging of brain activity during visual selective attention in humans. Nature, 372, 543–546. Hillyard, S. A., & Münte, T. F. (1984). Selective attention to color and location: An analysis with event-related brain potentials. Perception and Psychophysics, 36, 185–198. Hillyard, S. A., Vogel, E. K., & Luck, S. J. (1999). Sensory gain control (amplification) as a mechanism of selective attention: Electrophysiological and neuroimaging evidence. In G. W. Humphreys & J. Duncan (Eds.), Attention, space, and action: Studies in cognitive neuroscience. New York: Oxford University Press.
Knight, R. T., Scabini, D., Woods, D. L., & Clayworth, C. (1988). The effects of lesions of superior temporal gyrus and inferior parietal lobe on temporal and vertex components of the human AEP. Electroencephalography and Clinical Neurophysiology, 70, 499–509. McCarthy, G., & Wood, C. C. (1985). Scalp distributions of event-related potentials: An ambiguity associated with analysis of variance models. Electroencephalography and Clinical Neurophysiology, 62, 203–208. Michie, P. T., Bearpark, H. M., Crawford, J. M., & Glue, L. C. (1990). The nature of selective attention effects on auditory event-related potentials. Biological Psychology, 30, 219–250. Näätänen, R. (1982). Processing negativity: An evoked-potential reflection. Psychological Bulletin, 92, 605–640. Näätänen, R. (1992). Attention and brain function. Hillsdale, NJ: Erlbaum. Näätänen, R., & Picton, T. W. (1987). The N1 wave of the human electric and magnetic response to sound: A review and an analysis of the component structure. Psychophysiology, 24, 375–425. Owen, A. M., Stern, C. E., Look, R. B., Tracey, I., Rosen, B. R., & Petrides, M. (1998). Functional organization of spatial and nonspatial working memory processing within the human lateral frontal cortex. Proceedings of the National Academy of Sciences, U.S.A., 95, 7721–7726. Picton, T. W., Alain, C., Woods, D. L., John, M. S., Scherg, M., Valdes-Sosa, P., Bosch-Bayard, J., & Trujillo, N. J. (1999). Intracerebral sources of human auditory-evoked potentials. Audiology & Neuro-Otology, 4, 64–79. Rauschecker, J. P. (1997). Processing of complex sounds in the auditory cortex of cat, monkey, and man. Acta Oto-Laryngologica, Supplement, 532, 34–38. Ruchkin, D. S. (1990). Comments on the editorial policy on analysis of variance. Psychophysiology, 24, 476–477. Shors, T. J., & Matzel, L. D. (1993). Long-term potentiation: What’s learning got to do with it? Behavioral and Brain Sciences, 20, 597–655. Singh, J., Woods, D. L., & Knight, R. T. (1990).
Psychophysiology of movement related brain potentials: Task dependence and neural generators. In K. A. Sinha (Ed.), Progress in clinical neurosciences, vol. 6 (pp. 63–78). Patna, India: Catholic University Press. Treisman, A. (1999). Solutions to the binding problem: Progress through controversy and convergence. Neuron, 24, 105–110. Treisman, A., & Sato, S. (1990). Conjunction search revisited. Journal of Experimental Psychology: Human Perception and Performance, 16, 459–478. Treisman, A. M., & Gelade, G. (1980). A feature-integration theory of attention. Cognitive Psychology, 12, 97–136. Wijers, A. A., Mulder, G., Okita, T., Mulder, L. J., & Scheffers, M. K. (1989). Attention to color: An analysis of selection, controlled search, and motor activation, using event-related potentials. Psychophysiology, 26, 89–109.
Wilson, F. A., O’Scalaidhe, S. P., & Goldman-Rakic, P. S. (1993). Dissociation of object and spatial processing domains in the primate prefrontal cortex. Science, 260, 1955–1958. Woldorff, M., & Hillyard, S. A. (1991). Modulation of early auditory processing during selective listening to rapidly presented tones. Electroencephalography and Clinical Neurophysiology, 79, 170–191. Woldorff, M. G., Gallen, C. C., Hampson, S. A., Hillyard, S. A., Pantev, C., Sobel, D., & Bloom, F. E. (1993). Modulation of early sensory processing in human auditory cortex during auditory selective attention. Proceedings of the National Academy of Sciences, U.S.A., 90, 8722–8726. Wolfe, J. M., Cave, K. R., & Franzel, S. L. (1989). Guided search: An alternative to the feature integration model for visual search. Journal of Experimental Psychology: Human Perception and Performance, 15, 419–433. Wolfe, J. M., & Friedman-Hill, S. R. (1992). Visual search for oriented lines: The role of angular relations between targets and distractors. Spatial Vision, 6, 199–207. Woods, D. L. (1990). The physiological basis of selective attention: Implications of event-related potential studies. In J. W. Rohrbaugh, R. Johnson, & R. Parasuraman (Eds.), Event-related brain potentials: Issues and interdisciplinary vantages (pp. 178–209). New York: Oxford University Press. Woods, D. L. (1994). The component structure of the N1 wave of the human auditory evoked potential. Electroencephalography and Clinical Neurophysiology, Supplement. Perspectives on Event-Related Potential Research, 44, 102–109. Woods, D. L., & Alain, C. (1993). Feature processing during high-rate auditory selective attention. Perception and Psychophysics, 53, 391–402. Woods, D. L., Alain, C., Diaz, R., Rhodes, D., & Ogawa, K. H. (2001). Location and frequency cues in auditory selective attention. Journal of Experimental Psychology: Human Perception and Performance, 27, 65–74. Woods, D. L., Alain, C., & Ogawa, K. H. (1998).
Conjoining auditory and visual features during high-rate serial presentation: Processing and conjoining two features can be faster than processing one. Perception and Psychophysics, 60, 239–249. Woods, D. L., Alho, K., & Algazi, A. (1994). Stages of auditory feature conjunction: An event-related brain potential study. Journal of Experimental Psychology: Human Perception and Performance, 22, 81–94. Woods, D. L., & Clayworth, C. C. (1985). Click spatial position influences middle latency auditory evoked potentials (MAEPs) in humans. Electroencephalography and Clinical Neurophysiology, 60, 122–129. Woods, D. L., & Courchesne, E. (1987). Intersubject variability elucidates the cerebral generators and psychological correlates of ERPs. Electroencephalography and Clinical Neurophysiology, Supplement, 40, 293–299.