The Neurophysiological Basis of the Auditory Continuity Illusion: A Mismatch Negativity Study
Christophe Micheyl, Robert P. Carlyon, Yury Shtyrov, Olaf Hauk, Tara Dodson, and Friedemann Pulvermüller
Abstract
A sound turned off for a short moment can be perceived as continuous if the silent gap is filled with noise. The neural mechanisms underlying this “continuity illusion” were investigated using the mismatch negativity (MMN), an event-related potential reflecting the perception of a sudden change in an otherwise regular stimulus sequence. The MMN was recorded in four conditions using an oddball paradigm. The standards consisted of 500-Hz, 120-msec tone pips that were either interrupted by a 40-msec silent gap (Condition 1) or physically continuous (Condition 2). The deviants consisted of the interrupted tone, but with the silent gap filled by a burst of bandpass-filtered noise. The noise either occupied the same frequency region as the tone and elicited the continuity illusion (Conditions 1a and 2a), or occupied a remote frequency region and did not elicit the illusion (Conditions 1b and 2b). We predicted that, if the continuity illusion is determined before MMN generation, then, other things being equal, the MMN should be larger in conditions where the deviants are perceived as continuous and the standards as interrupted, or vice versa, than when both are perceived as continuous or both as interrupted. Consistent with this prediction, we observed an interaction between standard type and noise frequency region, with the MMN being larger in Condition 1a than in Condition 1b, but smaller in Condition 2a than in Condition 2b. Because the subjects were instructed to ignore the tones and watch a silent movie during the recordings, the results indicate that the continuity illusion can occur outside the focus of attention. Furthermore, the latency of the MMN (less than approximately 200 msec post-deviance onset) places an upper limit on the stage of neural processing responsible for the illusion.
MRC Cognition and Brain Sciences Unit, Cambridge, UK
© 2003 Massachusetts Institute of Technology. Journal of Cognitive Neuroscience 15:5, pp. 747–758
INTRODUCTION
Perceptual illusions have long been of interest to psychologists and, more recently, neuroscientists, as a means of gaining insights into the inner workings of perception (see Eagleman, 2001). The best known and most comprehensively studied class of illusions are completion illusions, also known as perceptual “filling in” (Dennett, 1992). Striking illustrations of such illusions are provided, in the visual domain, by figures that are partially obliterated or erased, but nevertheless perceived as a whole (Kanizsa & Gerbino, 1982). An analogue of such completion illusions in the auditory modality is the continuity illusion (Warren, Obusek, & Ackroff, 1972; Warren, Wrightson, & Puretz, 1988; for a review see Warren, 1999). In this illusion, an acoustic signal interrupted momentarily by extraneous noise is perceived as continuing through the noise, even if it is in fact physically turned off during the noise. Far more than just a psychoacoustical oddity occurring in artificial laboratory conditions, the continuity illusion may reflect a very important mechanism of auditory perception in everyday life, in which sounds are often interrupted by extraneous noises. Yet, we generally perceive these sounds as if they were continuous. This perceptual auditory restoration effect may account for our ability to follow and recognize auditory signals such as speech or music under noisy conditions (Carlyon, Deeks, Norris, & Butterfield, 2002; Miller, Dibble, & Hauser, 2001; Plack & White, 2000; Warren, 1970, 1999; Bashford, Meyers, Brubaker, & Warren, 1988; Cherry & Wiley, 1967; Schubert & Parker, 1956). The neural mechanisms underlying the auditory continuity illusion remain largely unknown at present, and a wide range of interpretations of the existing data is possible. At one extreme, the illusion might result from early and automatic brain processes that literally complete missing parts of the sensory information at the neural level. Examples of such neural information-completion mechanisms have been documented in the visual modality. A first example is provided by Pettet and Gilbert’s (1992) finding of a neural “filling in” phenomenon in the primary visual cortex, which may explain why visual scotomas are also “filled in” at the perceptual level. Another example of a neural mechanism that may account for completion illusions stems
from demonstrations that cortical neuron populations become active synchronously when common visual patterns are being perceived (Tallon-Baudry & Bertrand, 1999; Pulvermüller, Birbaumer, Lutzenberger, & Mohr, 1997). These cells and neuron ensembles act as templates against which the visual input is evaluated, and a perceptual illusion can arise if the available sensory information sufficiently activates a cortical template. This type of mechanism has been proposed to account for Kanizsa’s famous illusory triangle (Hubel, 1995; Kanizsa & Gerbino, 1982). Transposing continuity in the spatial domain in the visual modality to continuity in the temporal domain in the auditory modality, it is easy to conceive how brain mechanisms similar to those above may underlie the auditory continuity illusion. At the other extreme, however, the auditory continuity illusion might result from a late, but possibly unconscious, cognitive inference: since it is very improbable that the signal ended exactly when the noise started and resumed exactly when the noise ended, listeners may decide that the most likely correct interpretation of the available sensory evidence is that the signal simply continued through the noise. In other words, it is quite possible that, rather than proceeding from an early stage of processing of the sensory information, the continuity illusion results from a reinterpretation of the sensory material at a much later stage of processing. Therefore, a key issue in the investigation of the mechanisms of the auditory continuity illusion is whether the illusion occurs at an early stage of sensory information processing (the brain being then “fooled from the beginning”), or whether it is the result of a late postperceptual process. Another important question regards the role of attention, if any, in perceptual illusions. Is attention to the stimulus required for the illusion to be perceived, or is the illusion generated outside the focus of attention?
It is difficult to answer this question solely on the basis of studies that rely on subjective reports, because they implicitly require that subjects pay attention to the stimuli that they have to judge. Again, existing data permit a wide range of possible interpretations. At one extreme, the continuity illusion might be generated by completely automatic and preattentive mechanisms. At the other extreme, it might result from the fact that the noise directs the listeners’ attention away from the signal for a brief moment, so that if the interruption is short enough, they are unable to detect that the signal was turned off during the noise. One way of overcoming the limits of the behavioral approach for the study of the underlying mechanisms of auditory perceptual illusions involves using event-related brain potentials (ERPs). The mismatch negativity (MMN) component of the auditory ERP appears to be an important tool for gathering information about the time course of cortical activation during the perception of acoustic stimuli. The MMN is also well known to be observed in
the absence of focused attention toward the stimuli eliciting it (Picton, Alain, Otten, Ritter, & Achim, 2000; Näätänen, 1995). The MMN is generally elicited when a rare, deviant stimulus is presented among repetitive standard stimuli. It is usually measured with the subject instructed to ignore the sounds and attend to a visual stimulus, which, in the present study, was a silent video. It is traditionally interpreted as reflecting the response of an automatic, preattentive change-detection system (Näätänen, Gaillard, & Mäntysalo, 1978; Näätänen, 1990). Recent research indicates that its amplitude reflects the presence of auditory templates or long-term memory traces that may play a role in perceptual illusions (Näätänen, 2001; Pulvermüller et al., 2001; Shtyrov, Kujala, Palva, Ilmoniemi, & Näätänen, 2000). If an illusion is reflected in the MMN, this suggests that the illusion is determined at a fairly early stage of processing, and that the process reflected does not require that subjects focus their attention on the auditory signals. A similar logic has been applied to the study of other aspects of auditory perception, namely, auditory stream segregation (Yabe et al., 2001; Ritter, Sussman, & Molholm, 2000; Shinozaki et al., 2000; Sussman, Ritter, & Vaughan, 1999; for a review, see Näätänen, Tervaniemi, Sussman, Paavilainen, & Winkler, 2001) and auditory language processing (Näätänen, 2001; Pulvermüller et al., 2001; Shtyrov et al., 2000). In this study, we tested the hypothesis that the continuity illusion is reflected in the MMN. Using an oddball paradigm with standard and deviant stimuli, we recorded ERPs in four different conditions, which are shown in Figure 1. These conditions were formed by combining four basic stimuli: a continuous tone, an interrupted tone, and bursts of noise bandpass filtered in a frequency region that either encompassed the tone or did not.
The standards were obtained by mixing a continuous or interrupted tone with a noise burst constrained to occur at a pseudorandomly selected delay after the onset of the tone or tone pair (but never temporally overlapping with it). This interspersing of the standard tones with noise bursts was used in order to ensure that the presence of these noise bursts during the deviants would not elicit an MMN simply because they were rare; otherwise stated, this was done to eliminate the “noise-in-deviant” novelty effect. In the deviants, the tone was always interrupted, and the noise always occurred during the silent middle gap. As expected on the basis of earlier data, and confirmed by psychophysical measures described later on in this article, the interrupted tone was perceived as continuous when, and only when, the gap was filled with “on-frequency” noise. On the basis of these conditions, and under the hypothesis that the processes determining the continuity illusion are reflected in the generation of the MMN, two predictions were made: (1) The MMN should be larger in Condition 1a, where the deviant is perceived as
continuous and the standard as interrupted, than it is in Condition 1b, where both standard and deviant stimuli are perceived as interrupted; (2) The MMN should be larger in Condition 2b, where the standards are perceived as continuous and the deviants as interrupted, than in Condition 2a, characterized by perception of both standard and deviant stimuli as continuous. The key prediction is that of a significant ‘‘interaction’’ of factors type of standard (Conditions 1 vs. 2) and type of deviant (‘‘a’’ vs. ‘‘b’’). This is a methodologically important feature of this study, because it implies appropriate controls for all other aspects of the stimuli that might affect the size of the MMN. For example, in Condition 1a, the deviant contains an on-frequency noise in the temporal center of the tone, and because the standards do not, this may elicit an MMN. Because subjects are generally poorer at judging the relative timing of stimuli in different frequency regions than of
stimuli in the same region, this effect might be smaller in Condition 1b. However, it would also be expected to be smaller in Condition 2b than in Condition 2a, and so the predicted interaction would not occur; instead, there would be a main effect of deviant type. Similarly, it might be that the size of the MMN depends on whether the standard is continuous or interrupted, due to the difference in sound energy between the two cases, but again this would not lead to the predicted interaction. It is only by the continuity illusion increasing the size of the MMN in Condition 1a, and reducing it in Condition 2a, that the predicted interaction should occur.
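The logic of the predicted interaction can be illustrated with a small numerical sketch. It is only an illustration under stated assumptions: all MMN sizes, effect magnitudes, and names (`BASE`, `STANDARD_EFFECT`, `DEVIANT_EFFECT`, `ILLUSION_BOOST`) are hypothetical and are not taken from the study. The sketch shows that additive main effects of standard type and deviant type leave the difference-of-differences contrast at zero, whereas an extra contribution in the two conditions where standard and deviant differ in perceived continuity (1a and 2b) makes it nonzero.

```python
# Hypothetical MMN sizes (magnitudes, arbitrary units); none of these
# numbers are data from the study.

def interaction_contrast(size):
    """Difference of differences: (1a - 1b) - (2a - 2b)."""
    return (size["1a"] - size["1b"]) - (size["2a"] - size["2b"])

BASE = 1.0
STANDARD_EFFECT = {"1": 0.0, "2": 0.8}  # assumed main effect of standard type
DEVIANT_EFFECT = {"a": 0.5, "b": 0.0}   # assumed main effect of deviant type

# Case 1: purely additive main effects, no effect of perceived continuity.
additive = {
    s + d: BASE + STANDARD_EFFECT[s] + DEVIANT_EFFECT[d]
    for s in "12"
    for d in "ab"
}

# Case 2: the MMN also grows by ILLUSION_BOOST whenever standard and deviant
# differ in perceived continuity (Conditions 1a and 2b).
ILLUSION_BOOST = 0.6
with_illusion = dict(additive)
for condition in ("1a", "2b"):
    with_illusion[condition] += ILLUSION_BOOST

print(interaction_contrast(additive))       # approximately 0
print(interaction_contrast(with_illusion))  # approximately 1.2
```

In this toy example the contrast equals twice the assumed illusion-related contribution, which is why the interaction term, rather than either main effect, carries the test of the hypothesis.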
RESULTS
Psychophysical Results
A psychophysical experiment was performed to confirm that the continuity illusion was indeed induced when,
Figure 1. Schematic representation of the stimuli used in the four experimental conditions of the main electrophysiological experiment. The names of the conditions are indicated on the left. In each condition, two standards followed by one deviant are shown. In Conditions 1a and 1b, the tone pip stimulus was interrupted in both standard and deviant intervals. In Conditions 2a and 2b, the tone pip was continuous in standard intervals, but was interrupted in deviant intervals. In Conditions 1a and 2a, the noise bursts were filtered in the same frequency region as the tones. In Conditions 1b and 2b, they were filtered in a higher frequency region, and were not spectrally overlapping with the tones. The suspension marks indicate that the intervals between the standards and the interspersed noise bursts have been reduced for clarity. Note also that, although these intervals are shown to be identical in the figure, they in fact varied from noise burst to noise burst (see text for details).
Figure 2. Psychophysical experiment results: Mean adjusted gap duration in the six experimental conditions (2 overall durations × 3 gap conditions). The names of the conditions are self-explanatory. The mean adjusted gap durations shown on this graph were computed as the geometric mean of the adjustments across runs and subjects; accordingly, each histogram bar represents an estimate based on 50 adjustments (5 subjects × 10 runs). They are expressed as a percentage of the overall stimulus duration. The horizontal dashed line indicates the 33.3% point, which is the veridical duration of the gap in the reference tone. The error bars represent the standard error of the geometric mean across subjects.
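The summary statistic named in the caption of Figure 2 can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names and the example adjustments are hypothetical, and the sketch assumes strictly positive adjustments (how adjustments of exactly zero were handled is not stated in this section).

```python
import math

def geometric_mean(values):
    """Geometric mean, computed as exp(mean(log(x))); needs positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean requires non-empty, strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))

def percent_of_total(gap_ms, total_ms):
    """Express a gap duration as a percentage of the overall stimulus duration."""
    return 100.0 * gap_ms / total_ms

# Hypothetical adjusted gap durations (msec) for a 120-msec stimulus;
# these are illustrative values, not data from the study.
adjusted_gaps_ms = [30.0, 36.0, 43.2]
percents = [percent_of_total(g, 120.0) for g in adjusted_gaps_ms]

print(geometric_mean(percents))       # summary value for these three adjustments
print(percent_of_total(40.0, 120.0))  # veridical gap: one-third of the stimulus
```

Averaging in the log domain is consistent with the use of the natural logarithm of the gap duration as the dependent variable in the analysis of variance reported for these data.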
and only when, a silent gap between two tones was filled by an on-frequency noise. To test this, listeners were required to adjust the duration of a silent gap in the middle of a comparison tone to match the perceived duration of the gap in a preceding (reference) tone. Depending on the condition tested, the gap in the reference tone was either left silent or filled with a burst of noise, which was bandpass filtered in a spectral region that either encompassed the tone frequency (“on-frequency noise”) or occupied a remote frequency region (“off-frequency noise”). The main prediction was that subjects would adjust the duration of the gap in the comparison tone to about zero in the on-frequency noise condition, but to a duration closer to the true silent gap in the reference tone in the off-frequency and no-noise conditions. This was tested at two different overall stimulus durations: 120 msec, equal to that used in the electrophysiological recordings, and 240 msec, which was more similar to that used in previous studies on the continuity illusion. In each case, the duration of the silent gap and, where present, noise burst, was equal to one-third of the overall stimulus duration. Figure 2 represents the mean adjusted gap duration in the six experimental conditions, expressed as a percentage of the overall stimulus duration. When the gap in the reference tone was silent, subjects adjusted the duration of the gap in the comparison tone to a value close to, but slightly smaller than, the veridical
duration of 33.3% (dashed line). This difference presumably reflects a response bias. The important features of the results are that the off-frequency noise produced results similar to those obtained with a silent gap, and that when the gap in the reference tone was filled with on-frequency noise, subjects adjusted the gap in the comparison to a value close to zero. These data were analyzed using an analysis of variance (ANOVA) with the natural logarithm of the gap duration as the dependent variable,¹ and the condition, the overall duration, and the run number as repeated-measures factors. The results of the ANOVA indicated a significant overall difference across conditions, F(2,8) = 38.85, p < .0005. Planned comparisons revealed that the mean adjusted gap duration in the silent gap condition was significantly larger than in the on-frequency noise condition, F(1,4) = 58.10, p = .002, and not significantly different from that in the off-frequency noise condition, F(1,4) = 1.79, p = .252.
Neurophysiological Results
Figure 3 shows the standard, deviant, and deviant − standard difference traces averaged across all subjects for each of the four experimental conditions. The difference traces were computed by first averaging the standard traces obtained in Conditions 1a and 1b on the one hand, and those obtained in Conditions 2a and 2b on the other hand, because within each of these pairs of conditions the standard stimuli were exactly the same; the resulting enhanced standard traces were then subtracted from the deviant traces measured in the corresponding conditions. As can be seen in Figure 3, substantial negative deflections, corresponding to MMNs, were observed in the 120- to 280-msec poststimulus-onset time range in Conditions 1a, 2a, and 2b. Only a small negative deflection was present in Condition 1b. Figure 3 also shows that, in Condition 2b, the MMN was followed by a sharp positive peak, presumably corresponding to a P3a component.
Also in Conditions 1b and 2a, the MMNs were followed by minor positive deviations, but these were much less pronounced than the P3a seen in Condition 2b. Figure 4 shows the scalp topography of the MMN responses averaged across the three conditions which produced statistically significant MMN (1a, 2a, 2b); the time window common to the MMN in all three conditions (190–220 msec) was used for averaging. Each of these three conditions individually exhibited a fronto-centrally distributed negativity, typical of the usually reported MMN topography. To assess the statistical significance of the MMN in each condition, and to compare its size between conditions, the MMN amplitude was quantified in each subject by computing the average amplitude, within a specific time window, of the amplitude averaged across several recording sites (see Methods section for details).
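The subtraction used to obtain the difference traces of Figure 3 can be sketched as follows. This is a minimal illustration under stated assumptions: traces are represented as plain lists of amplitudes sampled at the same time points, the function names are hypothetical, and the sample values are made up for the example rather than taken from the recordings.

```python
def mean_trace(traces):
    """Point-by-point average of several equally sampled traces."""
    n = len(traces)
    return [sum(samples) / n for samples in zip(*traces)]

def difference_trace(deviant, standards):
    """Deviant minus the pooled standard trace.

    `standards` holds the standard traces from the two conditions that share
    the same standard stimulus (e.g. Conditions 1a and 1b).
    """
    pooled_standard = mean_trace(standards)
    return [d - s for d, s in zip(deviant, pooled_standard)]

# Made-up three-sample traces, for illustration only.
standard_1a = [0.0, 1.0, 1.0]
standard_1b = [0.0, -1.0, 1.0]
deviant_1a = [0.0, -3.0, -1.0]

print(difference_trace(deviant_1a, [standard_1a, standard_1b]))
# -> [0.0, -3.0, -2.0]
```

Pooling the two standard traces before subtraction doubles the number of trials contributing to the standard estimate, which is the signal-to-noise benefit mentioned in the caption of Figure 3.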
Since MMN latencies varied quite substantially between conditions, we used condition-specific time windows for calculating the MMN amplitude. For each condition, 100-msec-wide time windows were centered around
the MMN maximum defined as the most negative-going point found between stimulus onset and 300 msec thereafter. The resulting intervals were as follows: 190 – 290 msec poststimulus onset in Condition 1a,
Figure 3. Grand-average standard, deviant, and difference ERP traces in the four experimental conditions. The difference traces were obtained by subtracting the standard traces averaged over all 16 subjects from the average deviant traces, in each of the four experimental conditions. To improve the signal-to-noise ratio, the standard responses were averaged over a and b conditions (which involved the same standards) before subtraction. Note that negativities are down.
Figure 4. MMN scalp topographies. These maps show the amplitude of the MMN averaged across the three experimental conditions (1a, 2a, and 2b) in which a large and significant MMN was observed. The time window used for averaging (190 – 220 msec) encompassed the MMN in all three conditions.
110–210 msec in Condition 1b, 150–250 msec in Condition 2a, and 120–220 msec in Condition 2b. MMN latency was taken to be the latency of the largest negative peak within a time window ranging from 110 to 280 msec poststimulus onset in all conditions. As evidenced by Figure 5, the average amplitude of the MMN varied substantially across conditions. One-sample t tests showed that the MMN differed significantly from zero in Conditions 1a [t(15) = −3.503, p < .005], 2a [t(15) = −5.989, p < .0005], and 2b [t(15) = −7.282, p < .0005], but not in Condition 1b [t(15) = −1.565, p = .139]. The statistical significance of differences between conditions was assessed using a two-way ANOVA. The type of standards (interrupted or continuous) and the spectral position of the noise in the deviants (on- or off-frequency) were used as within-subjects factors. This analysis revealed a significant interaction, consistent with the hypothesis that the continuity illusion is reflected in the size of the MMN [F(1,15) = 16.057, p = .001]. Specifically, as revealed by planned comparisons, the MMN was significantly larger in Condition 1a than in Condition 1b [t(15) = 2.464, p < .05], and was significantly smaller in Condition 2a than in Condition 2b [t(15) = −2.938, p < .05]. In addition, there was an overall main effect of standard type, with larger
MMNs when the standards were continuous (Condition 2) than when they were interrupted, F(1,15) = 49.268, p < .0005. Although this effect has different possible
Figure 5. Average MMN amplitude in the four experimental conditions. The error bars represent the standard error of the mean (see text for details).
sources, the difference in sound energy between the continuous and discontinuous tones may provide the most parsimonious explanation for this difference. Figure 6 shows the average MMN peak latency in the four conditions. A significant main effect of deviant type was observed: The latency was significantly longer in conditions where the deviants contained an on-frequency noise and were perceived as continuous, than in conditions where they contained an off-frequency noise and sounded interrupted [two-way ANOVA: F(1,15) = 33.628, p < .0005]. This occurred both in Condition 1 and in Condition 2: Post hoc comparisons revealed that the MMN latency was significantly longer in Condition 1a than in Condition 1b [t(15) = 4.714, p < .0005], and in Condition 2a than in Condition 2b [t(15) = 5.166, p < .0005]. One possible interpretation for this effect is that, when the deviant contained the on-frequency noise, the brain’s assessment of whether the tone was interrupted had to wait until the end of the noise burst. This is plausible because the continuity illusion depends not only on the portion of the tone preceding the noise, but also on that following it, and because the average difference in latency between Conditions a and b was 41.6 msec, quite close to the 40-msec duration of the noise burst. However, caution should be exercised when interpreting both of these differences. For Condition 1, one of the MMNs (Condition 1b) had an average amplitude that did not differ significantly from zero. For Condition 2, one of the MMNs (in Condition 2b) was followed by a large positive deviation. It is possible that this deviation partially cancelled the later portions of an MMN, thereby reducing the measured latency.
Figure 6. Average MMN peak latency in the four experimental conditions. The error bars represent the standard error of the mean (see text for details).
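The two MMN measures used in the Results, a mean amplitude over a 100-msec window centered on the most negative point and a peak latency within a fixed search range, can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, window boundaries are treated as inclusive (the text does not say), and the difference trace is a synthetic triangular dip rather than recorded data.

```python
def most_negative_latency(times_ms, trace, start_ms, end_ms):
    """Latency (msec) of the most negative sample within [start_ms, end_ms]."""
    candidates = [
        (value, t) for t, value in zip(times_ms, trace) if start_ms <= t <= end_ms
    ]
    _, latency = min(candidates)  # ties resolve to the earliest latency
    return latency

def window_mean(times_ms, trace, center_ms, width_ms=100):
    """Mean amplitude in a window of width_ms centered on center_ms."""
    half = width_ms / 2
    inside = [
        value
        for t, value in zip(times_ms, trace)
        if center_ms - half <= t <= center_ms + half
    ]
    return sum(inside) / len(inside)

# Synthetic difference trace sampled every 10 msec: a triangular negativity
# with its minimum (-5, arbitrary units) at 240 msec. Illustrative only.
times_ms = list(range(0, 400, 10))
trace = [-max(0.0, 5.0 - abs(t - 240) / 10.0) for t in times_ms]

# Amplitude window: centered on the most negative point in 0-300 msec.
center = most_negative_latency(times_ms, trace, 0, 300)
amplitude = window_mean(times_ms, trace, center)

# Peak latency: most negative point in the 110-280 msec search range.
latency = most_negative_latency(times_ms, trace, 110, 280)

print(center, latency)  # 240 240
print(amplitude)        # mean over the 190-290 msec window
```

With a minimum at 240 msec, the resulting window runs from 190 to 290 msec, the same interval as the one reported for Condition 1a; the synthetic trace was chosen to make that correspondence easy to check.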
DISCUSSION
Psychophysical Experiment
The results of the preliminary psychophysical experiment show that when the silent gap between the two tone pips was filled with off-frequency noise, the gap between the tone pips in the first interval was perceived to be about as large as when no noise was present. However, when the gap in the reference stimulus was filled with on-frequency noise, the mean adjusted gap duration in the comparison interval was very small and indeed not significantly different from zero. This provides indirect evidence that subjects perceived the tone as discontinuous when no noise or off-frequency noise was present, and as continuous when on-frequency noise was present. These results are consistent with previous results in the psychoacoustical literature on the continuity illusion (reviewed in Warren, 1999), and confirm our assumptions about how the stimuli used in the EEG experiment would be perceived.
Electrophysiological Study
Evidence for a Reflection of the Continuity Illusion in the MMN
The electrophysiological study revealed a significant interaction between the standard-type and deviant-type factors for MMN amplitude, which proved to be larger in Condition 1a than in Condition 1b, and smaller in Condition 2a than in Condition 2b. This pattern of results is consistent with that predicted under the hypothesis that the continuity illusion is reflected in the MMN. Indeed, as stated in the Introduction, the finding of a larger MMN in Condition 1a than in Condition 1b, together with that of a smaller MMN in Condition 2a than in Condition 2b, can only be explained by assuming that the continuity illusion is already determined at the level at which the MMN is generated. We therefore conclude that (1) mechanisms important for the continuity illusion occur within the first 200 msec after the onset of the gap filled by the illusion and (2) the illusion occurs in the absence of directed attention toward the sounds eliciting it.
We do not claim that the MMN reflects illusory perceptions in the sense that its occurrence is entirely due to such phenomena, or indeed that the illusion is “complete” by the time the MMN is elicited.² Rather, the conclusion offered here is that the MMN, which is known to be elicited by perceived changes in the auditory environment, is “modulated” by illusory perceptions in the same way as, for instance, its amplitude reflects the presence of cortical long-term memory traces for units of spoken language, including phonemes and words (Näätänen, 2001; Pulvermüller et al., 2001). The increase of the MMN, compared to adequate control conditions (e.g., Condition 1b), would
thus be related to the mechanisms directly involved in generating the MMN. Such mechanisms have been shown to occur within 200 msec after gap onset in the absence of directed attention. The modulation of the MMN as a function of illusory perceptions does not exclude the possibility that factors other than the perceived continuity or discontinuity of the standard and deviant tones affected the MMN. The finding of an MMN in Condition 2a suggests that the existence of a noise midway through a tone can contribute to the MMN, even when it does not cause a difference in perceived continuity relative to the standards. In addition, the MMN was generally larger in conditions where the standards were continuous than in conditions where they were interrupted. What is important is that these other influences would not be expected to lead to the interaction which was observed, and which forms the basis for our conclusion that the continuity illusion can be measured using the MMN.
At What Stage of Processing in the Central Nervous System is the Continuity Illusion Generated?
The finding that the continuity illusion is reflected in the MMN may have some implications regarding the brain locus at which the continuity illusion is generated. Although the MMN is thought to have multiple generators, its major source has been shown to be in auditory areas along the supratemporal plane (Kasai et al., 1999; Kropotov et al., 1995; Alho, Woods, Algazi, Knight, & Näätänen, 1994; Javitt, Steinschneider, Schroeder, Vaughan, & Arezzo, 1994; Giard, Perrin, Pernier, & Bouchet, 1990). Thus, the finding that the continuity illusion is reflected in the MMN makes it likely, although it admittedly does not prove, that the illusion is generated at or below the level of the auditory cortex. In the visual modality, neural correlates of perceptual completion illusions have been identified in primary visual cortical areas (e.g., De Weerd, Gattass, Desimone, & Ungerleider, 1995; Pettet & Gilbert, 1992).
One will probably have to await the results of similar animal electrophysiological explorations on the underlying mechanisms of perceptual filling-in before a firm conclusion can be reached as to whether the continuity illusion is already determined at the level of the primary auditory cortex or not. To our knowledge, only two studies have started to address this question. One of these (Petkov, O’Connor, & Sutter, 2001) is still in its preliminary stages, and the results presented so far provide no clear answer yet. The other (Sugita, 1997) measured the response of neurons in the primary auditory cortex of anaesthetized cats to tone sweeps that contained a silent gap that either was or was not filled with a burst of noise. Sugita reported that some neurons responded more strongly to the combination of the gapped tone sweep and the noise than to either stimulus alone. However, it should be noted that
the frequency content of the noise bursts was often remote from that of the portion of the sweep that was deleted, leading to conditions that would not usually be expected to induce the continuity illusion.³ From a more general point of view, there is growing evidence in the MMN literature that the primary auditory cortex is the locus of relatively more complex perceptual mechanisms than was originally thought. Perceptual auditory mechanisms, whereby complex auditory scenes made of several sequential or simultaneous stimuli are integrated into coherent auditory entities, have recently been shown to be indexed by the MMN (Yabe et al., 2001; Ritter et al., 2000; Shinozaki et al., 2000; Sussman et al., 1999; for a review, see Näätänen et al., 2001). Interestingly, it has been proposed that such perceptual auditory organization phenomena share common underlying mechanisms with the continuity illusion, both depending on a common preliminary process of linking together the parts of sequences that have similar frequencies (Bregman, Colantonio, & Ahad, 1999). The present results are generally consistent with this view. However, it should be noted that generators in the superior temporal lobe anterior to the primary auditory cortex and additional inferior frontal areas contribute to the MMN, so that any firm conclusion on the particular area underlying the illusion must remain tentative at this stage.
Is Attention Required for the Generation of the Continuity Illusion?
Our finding that the continuity illusion is reflected in the MMN, which is widely thought to be generated preattentively, suggests at first sight that the generation of this illusion is an automatic process that does not require attention. Although auditory attention was not explicitly manipulated in this study, subjects were not required to attend to the auditory stimuli.
In fact, informal reports from the subjects at the end of the recordings suggested that they generally had become unaware of the auditory stimuli soon after the video started, being captured by the movie of their choice, and that they were rarely distracted by the sounds. A confirmation of this can be found in the fact that in most cases, the MMN was not followed by a clear P3a component, the only exception to this being Condition 2b, in which a positive peak was observed after the MMN. In general, the P3a is elicited only when subjects pay attention to the stimuli. However, it can also be obtained in inattentive conditions when the magnitude of the stimulus deviation is great (Friedman, Cycowicz, & Gaeta, 2001; Snyder & Hillyard, 1976; Squires, Squires, & Hillyard, 1975). Thus, the most likely interpretation of the positivity obtained in Condition 2b is that it reflects the fact that in this condition, the difference between standards and deviants was particularly striking. Thus, it appears that “focused” attention to the
stimulus is not required for the generation of the continuity illusion. To our knowledge, the present results are the first physiological measurements of the continuity illusion in human subjects. The fact that it was observed when subjects’ attention was directed elsewhere has implications for our understanding of the mechanisms responsible for it. In particular, it makes it unlikely that the illusion is mediated by an attention-directed integration of different stimulus features of the type proposed by Treisman and Gelade (1980) and Treisman (1998) to account for certain types of perceptual illusions. However, we should stress that the present results do not demonstrate that the continuity illusion can never be modulated by attentional manipulations. This is illustrated by the literature on auditory streaming, which can be reflected in the MMN (Yabe et al., 2001; Ritter et al., 2000; Shinozaki et al., 2000; Sussman et al., 1999), yet, when a highly demanding competing task is imposed, has been shown to be strongly dependent on attention (Carlyon, Cusack, Foxton, & Robertson, 2001). What the present results do achieve is to demonstrate that the illusion can be objectively measured without requiring subjects to either focus on the stimuli or respond to them. These objective measurements not only constrain the temporal and anatomical stage of processing at which the illusion occurs, but also pave the way for experiments in which auditory attention is manipulated explicitly.
METHODS
Psychophysical Study
Subjects
The psychophysical experiment involved 5 subjects aged between 24 and 36 years, with no known auditory deficit. The subjects were tested in a double-walled soundproof chamber (IAC). Written informed consent was obtained. Although described before the main experiment, this study was actually performed last. This was done because the subjects also participated in the electrophysiological experiment, in which they were instructed to ignore the sounds, and because we did not wish to prejudice their acquiescence to these instructions by imposing a prior task in which they were required to explicitly judge the illusion.
Stimuli and Procedure
On each trial, the subjects were presented with two successive auditory stimuli, separated by 300 msec. Both stimuli consisted of a 500-Hz tone pip having a total duration T = 120 or 240 msec (including 10-msec onset and offset Hanning ramps). The first tone pip of each trial had its amplitude reduced to 0 (using 1-msec ramps in order to avoid clicks) over an interval of constant
duration t = T/3 msec around its temporal center. The second tone also contained a silent gap around its temporal center, but the duration of this gap was variable and could be adjusted using four buttons on a response box, which produced small (×2^{1/4}) or large (×2^{3/4}) increases or decreases, at the subject’s will. In one condition (“silent gap”), the gap in the tone was left silent. In the other two conditions, the gap was filled with a burst of one-octave-wide, bandpass-filtered noise. In one of these two conditions (“on-frequency noise”), the center frequency of the noise was equal to the tone frequency (500 Hz); in the other condition (“off-frequency noise”), it was set to 6000 Hz, so that none of the noise components fell at the tone frequency. The noise samples were generated by the addition of a series of sinusoids with frequencies increasing from the lower to the upper frequency of the defined passband in 1-Hz steps, with a constant amplitude and random starting phases uniformly distributed over 360°. The frozen noise samples used for the experiment were selected from a total sample of 1,000 waveforms, on the basis of having the smallest envelope fluctuations. This was done because the brief duration of the noises often introduced marked envelope fluctuations that subjects might mistake for a temporal gap. The overall duration of the noise was D + 2 msec, that is, the duration of the tone gap plus 1-msec attack and decay ramps, which overlapped with those of the preceding and following tone portions. The level of the tones was set at 65 dB SPL. The spectral level of the noise was adjusted to 13 dB below the level of the tone in the on-frequency condition. This difference was chosen in such a way that the noise would mask the 500-Hz tone if the two were simultaneously present.
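The noise synthesis described above (equal-amplitude sinusoids in 1-Hz steps, random phases, selection of the flattest waveform) can be illustrated with a minimal Python sketch. Several details are assumptions made for illustration and are not stated in the text: the octave band is centered geometrically on the center frequency, envelope fluctuation is approximated by the crest factor (peak divided by RMS), and the function names `bandpass_noise`, `crest_factor`, and `pick_flattest` are hypothetical. The demonstration call uses a reduced sampling rate and only 5 candidates to keep it fast, whereas the study used 40000 Hz and 1,000 waveforms.

```python
import math
import random


def bandpass_noise(center_hz, duration_s, fs, rng):
    """Sum equal-amplitude sinusoids spaced 1 Hz apart across a one-octave band."""
    # Assumption: the octave is centered geometrically on center_hz.
    f_lo = center_hz / math.sqrt(2.0)
    f_hi = center_hz * math.sqrt(2.0)
    n_components = int(math.floor(f_hi - f_lo)) + 1
    freqs = [f_lo + k for k in range(n_components)]
    # Random starting phases, uniformly distributed over 360 degrees.
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in freqs]
    n_samples = int(round(duration_s * fs))
    return [
        sum(math.sin(2.0 * math.pi * f * (i / fs) + p) for f, p in zip(freqs, phases))
        for i in range(n_samples)
    ]


def crest_factor(wave):
    """Peak-to-RMS ratio, used here as a simple proxy for envelope fluctuation."""
    rms = math.sqrt(sum(x * x for x in wave) / len(wave))
    return max(abs(x) for x in wave) / rms


def pick_flattest(center_hz, duration_s, fs, n_candidates, seed=0):
    """Generate several frozen noises and keep the one with the lowest crest factor."""
    rng = random.Random(seed)
    candidates = [
        bandpass_noise(center_hz, duration_s, fs, rng) for _ in range(n_candidates)
    ]
    return min(candidates, key=crest_factor)


# Demonstration: a 42-msec on-frequency burst (40-msec gap plus two 1-msec ramps).
noise = pick_flattest(500.0, 0.042, 8000, n_candidates=5)
```

The 1-msec attack and decay ramps and the level calibration are left out of this sketch.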
In the off-frequency noise condition, the level of the noise was reduced by a further 9.3 dB in order to roughly equate the specific loudness of the noise at the two considered frequencies (see Note 4).
The subject’s task was to adjust the duration of the silent gap in the second stimulus so that its perceived duration was equal to that of the gap in the first-interval tone. After each button press, the second stimulus was modified accordingly, and the stimuli were played again. One additional button was used to play the stimulus again without modification. Finally, another button was used to indicate that the gap duration was subjectively the same in the two intervals and that the procedure could stop. In order to allow subjects to make the second tone continuous when they heard no gap at all in the first tone, a continuous stimulus was played each time the adjusted duration of the gap in the second stimulus became less than 0.52% of the overall duration (that is, 0.625 msec in the 120-msec overall duration condition, and 1.25 msec in the 240-msec overall duration condition). Subjects performed 10 adjustments in each of the 6 experimental conditions, in a completely randomized design.
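The adjustment rule in the procedure can be sketched as follows, reading the button steps as multiplicative factors of 2^{1/4} (small) and 2^{3/4} (large) and treating any adjusted gap below 0.625/120 of the overall duration as a continuous tone. The starting gap of 40 msec, the button labels, and the function names `press` and `presented_gap` are hypothetical choices for illustration; the text does not specify the initial value of the adjustable gap.

```python
SMALL_STEP = 2 ** (1 / 4)  # factor applied by the "small" buttons
LARGE_STEP = 2 ** (3 / 4)  # factor applied by the "large" buttons
CONTINUOUS_FRACTION = 0.625 / 120.0  # about 0.52% of the overall duration

# Hypothetical button labels mapped to multiplicative factors.
BUTTON_FACTORS = {
    "small_up": SMALL_STEP,
    "small_down": 1.0 / SMALL_STEP,
    "large_up": LARGE_STEP,
    "large_down": 1.0 / LARGE_STEP,
}


def press(gap_ms, button):
    """Return the adjusted gap duration after one button press."""
    return gap_ms * BUTTON_FACTORS[button]


def presented_gap(gap_ms, total_ms):
    """Gap actually played: 0 (continuous tone) once below the threshold."""
    threshold_ms = CONTINUOUS_FRACTION * total_ms
    return 0.0 if gap_ms < threshold_ms else gap_ms


# Assumed starting point: a 40-msec gap, reduced by 24 small steps (a factor of 2**6).
gap = 40.0
for _ in range(24):
    gap = press(gap, "small_down")
```

Under these assumptions, 24 small decreases take a 40-msec gap to about 0.625 msec, so one further decrease yields a continuous tone in the 120-msec condition. Because every step is multiplicative, the adjustments are evenly spaced on a logarithmic scale.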
Electrophysiological Study
Sixteen subjects (aged 18–41) took part in the electrophysiological experiment. Informed consent was obtained from all subjects prior to the measurements. Subjects were paid an hourly wage for their participation. The experiments received the approval of the Ethics Committee (Cambridge, UK) and were performed in accordance with the Declaration of Helsinki.
Stimuli
The stimuli were generated digitally in the time domain on a Pentium computer using MATLAB v6.0. They were sent to a 16-bit D/A converter (CED1401plus) at a sampling rate of 40000 Hz. They were then passed through an anti-aliasing filter (Kemo VBF 25.01, cutoff = 17200 Hz, attenuation rate = 100 dB/oct), attenuated (TDT PA4), and fed to one input of a headphone amplifier. Finally, they were delivered to the subject’s left ear through Sennheiser 250 Linear II earphones.
Stimulus Sequences
The stimulus sequences used in the main physiological experiment are represented schematically in Figure 1. Four experimental conditions, arranged into two main conditions with two subconditions each, were tested. The stimuli used in these four conditions were similar to those used in the psychophysical experiment. In Conditions 1a and 1b, the standards contained the previously described 120-msec-long tone interrupted in its middle by a 40-msec silent gap. In Conditions 2a and 2b, the standards consisted of the uninterrupted version of this tone. The deviants consisted of the interrupted tone with the middle gap filled by a 40-msec burst of noise. Depending on the condition tested, the noise was either on-frequency (Conditions 1a and 2a) or off-frequency (Conditions 1b and 2b), with the same spectral characteristics as described above in relation to the psychophysical experiment. In standard intervals, the added noise burst was not allowed to overlap temporally with the tonal stimulus. It occurred pseudorandomly at any of six equally probable temporal positions corresponding to delays of 40 + n × 120 msec, where n is an integer between 1 and 6. Only one noise burst was allowed to occur within each deviant or standard stimulus interval. An 840-msec delay separated the onsets of two successive tonal stimuli. Each condition involved the presentation of a total of 1050 stimuli, among which 150 were deviants.
EEG Recording and Analysis
Subjects were comfortably seated in an electrically shielded and acoustically insulated chamber. They were instructed to watch a silent video film of their own choice and to ignore auditory signals. The EEG was recorded with Ag/AgCl electrodes mounted in an extended 10–20 system cap (Quick-Cap, Neuromedical Supplies, VA, USA) with a 64-channel EEG set-up (Neuroscan Labs). AFz was used as the reference electrode for recordings. The ground electrode was positioned on the subject’s right cheek. Horizontal eye movements were monitored using electrodes on the cap (the positions of which were near the outer canthi of the left and right eyes). Vertical eye movements were monitored using two extra electrodes placed below and above the left eye. Signals were amplified, sampled at a rate of 500 Hz, and bandpass filtered between 0.1 and 100 Hz. The acquired EEG traces were stored on the hard disk of a Pentium computer and further processing was carried out off-line in the digital domain, starting with bandpass filtering (1–30 Hz, 24 dB/oct slopes). Event-related potentials were obtained by averaging epochs, which started 100 msec before and ended 340 msec after stimulus onset time (in other words, the analysis period was 440 msec, including a 100-msec prestimulus baseline and extending 220 msec after the offset of the tonal stimulus). Epochs containing voltage variations in excess of 100 µV at any EEG or EOG channel were discarded. Also discarded were all epochs containing a standard tone stimulus followed by a noise burst occurring less than 400 msec after the tone onset, and standard or deviant tones preceded by a noise burst occurring less than 320 msec earlier than the onset of the tone. This way, the responses to the standard or deviant tones were not contaminated by responses to the surrounding noises. All epochs corresponding to standards that just followed a deviant stimulus were excluded. The following processing was performed in MATLAB v6.0. After the average across channels was subtracted from all traces (average-referencing), the traces were rereferenced to linked mastoids (in this case, the average of channels TP7 and TP8, which were the closest to the left and right mastoids). Based on the scalp distribution of the responses, which showed largest MMN amplitudes at fronto-central locations, we chose to average responses across 20 fronto-central electrodes arranged in a 5 × 4 matrix with symmetry around the vertex; namely: F3, F1, Fz, F2, F4; FC3, FC1, FCz, FC2, FC4; C3, C1, Cz, C2, C4; and CP3, CP1, CPz, CP2, CP4. The traces shown in this article thus correspond to averages over these 20 electrodes. Finally, the data were baseline corrected. The baseline was chosen to cover an interval between −100 and +40 msec relative to the onset of the tonal stimulus, since up to 40 msec poststimulus onset, the standard and deviant stimuli were identical.
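The timing constraints in the stimulus sequences and the epoch-rejection rule of the EEG analysis can be worked through with a short Python sketch. It assumes that the noise delays of 40 + n × 120 msec are measured from the onset of the tone in the same stimulus interval; the function name `keep_epoch` and its arguments are hypothetical, and the amplitude criterion and the exclusion of standards that follow a deviant are not modeled.

```python
SOA_MS = 840  # onset-to-onset delay between successive tones
# Six possible noise onsets in standard intervals: 160, 280, ..., 760 msec.
NOISE_DELAYS_MS = [40 + n * 120 for n in range(1, 7)]


def keep_epoch(is_standard, noise_delay_ms, prev_noise_delay_ms):
    """Apply the two timing-based rejection rules to one epoch.

    noise_delay_ms: noise onset relative to this tone's onset (None if no noise).
    prev_noise_delay_ms: noise onset relative to the previous tone's onset
    (None if the previous interval contained no noise).
    """
    # Rule 1: standards followed by a noise burst less than 400 msec after tone onset.
    if is_standard and noise_delay_ms is not None and noise_delay_ms < 400:
        return False
    # Rule 2: any tone preceded by a noise burst less than 320 msec before its onset.
    if prev_noise_delay_ms is not None and SOA_MS - prev_noise_delay_ms < 320:
        return False
    return True


# Noise positions that survive each rule when considered in isolation.
ok_following = [d for d in NOISE_DELAYS_MS if keep_epoch(True, d, None)]
ok_preceding = [d for d in NOISE_DELAYS_MS if keep_epoch(True, None, d)]
```

Under this reading, a standard epoch survives the first rule only when its own noise burst starts at 400 msec or later (n = 3 to 6), and any epoch survives the second rule only when the noise in the previous interval started at 520 msec or earlier (n = 1 to 4).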
Journal of Cognitive Neuroscience, Volume 15, Number 5
Acknowledgments
The authors are grateful to two anonymous reviewers for helpful comments on an earlier version of this article. This study was performed while the first author was on a “Mise à disposition” from CNRS UMR 5020, Lyon, France, and was
partly supported by grant number GR/N64861/01 to B. C. J. Moore and R. P. Carlyon from the (British) Engineering and Physical Sciences Research Council. Reprint requests should be sent to Dr. Robert P. Carlyon, MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge CB2-2EF, UK.
Notes
1. The logarithmic transformation was used because the standard deviation of the arithmetic mean adjusted gap duration across runs was found to increase linearly with the mean itself, over all subjects and conditions (Pearson’s r = .76, p < .0005, N = 30).
2. One possible interpretation, which is consistent with the pattern of MMNs shown in Figure 5, is that the illusion is only partially complete. Assume, for example, that the deviants in Conditions 1a and 2a are processed as if the tone was “30% continuous,” and that those in conditions with off-frequency noise (1b and 2b) are treated as being completely discontinuous. Then, in Condition 1a, where the standards are interrupted, there should be a small MMN, whereas none should occur in Condition 1b. In Condition 2a, where the standards are (physically) 100% continuous, there should be a large MMN to the (“30% continuous”) deviant. The MMN should be even larger in Condition 2b, where the continuous standards are contrasted with the (completely discontinuous) deviants. This interpretation suggests that the MMN reflects a partial illusion, an illusionary filling of