The temporal organization of perception
A. Holcombe

UPDATED 22 AUGUST, 2013
To appear in: Oxford Handbook of Perceptual Organization. Oxford University Press. Edited by Johan Wagemans.
1. Introduction

Visual perception textbooks and handbooks customarily do not include sections devoted to the topic of time perception (the exception is van de Grind, Grusser, & Lunkenheimer, 1973). But that may soon change, with this chapter a sign of the times. In journals, the literature on temporal factors has grown very rapidly, and reviews of time perception have proliferated (Vroomen & Keetels, 2010; Holcombe, 2009; Wittmann, 2011; Eagleman, 2010; Grondin, 2010; Nishida & Johnston, 2010; Spence & Parise, 2010). To restrict this review to fundamental issues, only simple judgments of temporal order will be considered; the rapidly growing literature on duration judgments will not be discussed.

Interpreting experimental results requires assumptions. For temporal experience, it is tempting to think of experience as forming a single timeline, with all sensations mapped to points or segments of that timeline. This assumption is often implicit in the literature, together with another assumption that allows for the experience of simultaneity: that sensations closer together than a certain interval, the duration of the “simultaneity window”, are perceived as simultaneous (Meredith et al., 1987). Yet it is far from clear whether experience comprises a single ordered timeline. This chapter will question this assumption and ultimately suggest that our experience is frequently the product of organizational processes whose purpose is not to create an ordered timeline. Rather, simpler grouping and segmentation processes can be more important, with ordering sometimes only a byproduct or not occurring at all.

Similar issues have arisen in the study of spatial perception. Marr (1982) suggested that the visual system delivers a representation of the ordered 3-D layout of all the objects and surfaces in a scene. This is analogous to the ordered timeline view of temporal experience.
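The single-timeline-plus-window assumption can be stated as a toy model. In the sketch below, the 50 ms window is purely illustrative, not an empirical estimate:

```python
def perceived_simultaneous(t1_ms, t2_ms, window_ms=50.0):
    """Toy model of the single-timeline view: each sensation is assigned a
    point on one internal timeline, and two sensations are experienced as
    simultaneous iff they fall within the 'simultaneity window' of each
    other. The 50 ms default is illustrative only."""
    return abs(t1_ms - t2_ms) <= window_ms

print(perceived_simultaneous(100.0, 130.0))  # True: inside the window
print(perceived_simultaneous(100.0, 260.0))  # False: experienced as ordered
```

Part of this chapter's argument is that this model is too strong: some stimulus pairs well outside any plausible window still yield no experienced order.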
The evidence from space suggests that visual representation may be more impoverished than what Marr envisioned (Koenderink, Richards, & van Doorn, 2012) but still provides ordered and metric depth relations (van Doorn et al., 2011). Whether our timeline of experience achieves the analogous level of organization, a consistent ordering, remains unclear.

One alternative to a well-ordered timeline is that we sometimes experience objects and qualities with undefined temporal relationships. That is, there may be some percepts for which we do not have an experience of before or after, and where the explanation for this failure is not simply that the two stimuli fall within the simultaneity window. A possible example is provided in the animations showcased at http://www.psych.usyd.edu.au/staff/alexh/research/colorMotionSimple . In those animations, a field of dots alternates between leftward motion and rightward motion. In synchrony with the motion direction alternation, the dots’ color alternates between red and green. Yet at alternation rates above about six times per second, one is unable to judge the pairing of motion and color, for example whether the leftward motion is paired with red or with green (Arnold, 2005; Holcombe & Clifford, 2012). Yet this rate is slow enough that the successive colors and motions should not fall inside the same simultaneity window (Wittmann, 2011).
A potentially related phenomenon was reported by William James in 1890. In Chapter 15 of his Principles of Psychology, James claimed that “When many impressions follow in excessively rapid succession in time, although we may be distinctly aware that they occupy some duration, and are not simultaneous, we may be quite at a loss to tell which comes first and which last”. Unfortunately, James provided no examples, so we do not know to what he was referring. Some detailed descriptions of dissociations of temporal order judgments and asynchrony judgments have been provided by Jaśkowski and others (Jaśkowski, 1991; Allan, 1975); however, these may be explainable by differences of a few tens of milliseconds in the decision criteria for the two tasks. A temporal order deficit that seems less likely to be explained by decision criterion differences was reported by Holcombe, Kanwisher, & Treisman (2001), and can be experienced here: http://www.psych.usyd.edu.au/staff/alexh/research/MOD/demo.html. When four letters are presented serially, each for about 200 ms, and the sequence repeats, observers are typically unable to report their order. Yet if the sequence is presented just once, the order of the items is easily perceived (for a possible auditory analogue, see Warren et al., 1969).

What are the implications of this phenomenon for the nature of temporal experience? It may mean that temporal experience is less organized than spatial experience. Ordering seems more integral to our representations of space, which benefit from the retinotopic organization of visual cortices. The positions of items on the retina are readily available thanks to this topography (although determining their locations in external space is another matter, requiring more mysterious mechanisms). This organization also affords parallel processing of a large range of locations.
Orientation and boundary processing as well as local motion processing occur at many locations simultaneously, providing some spatial relationships preattentively and continuously (e.g. Levi, 1996; Forte, Hogben, & Ross, 1999). At a larger scale, perception of certain global forms is based on massively parallel processing (Clifford, Holcombe, & Pearson, 2004), which may also be true of perceiving the location of the centroid of a large array (Alvarez, 2011).

Although the visual brain has retinotopy, it does not seem to have chronotopy. That is, no brain area seems to include an array of neurons that systematically respond to different times, arranged in temporal order. A possible exception is neurons selective for temporal rank order in movement-related areas of cortex (Berdyyeva & Olson, 2010), but as far as we know these are not involved in time perception. Our knowledge of the relative times of stimuli surely suffers for lack of a chronotopic representation. Not only does the lack of chronotopy suggest the absence of a readily available ordered temporal array, it may also mean less parallel processing of distinct times than of distinct locations. It is difficult to imagine that the brain gets by without any parallel temporal processing, and without any sort of temporally structured buffer. Smithson & Mollon (2006) and Smith et al. (2011) have provided some evidence for a temporally structured buffer in vision, but overall temporal processing seems less pre-organized than spatial processing.

Retinotopy (or chronotopy) is not a full solution to the problem of perceiving spatial (or temporal) relationships, even ignoring the complication of movements of the eyes and body. There are aspects of spatial perception that are not achieved by specialised parallel processing, and those solutions might also be used in temporal processing.
Two recent pieces of research suggest that some spatial relationships become available via serial, one-by-one processing, through shifts of attention (Holcombe, Linares & Vaziri-Pashkam, 2011; Franconeri et al., 2011). Using a moving spatial array, Holcombe et al. (2011) documented an inability to apprehend the spatial order of the items in the array when the items moved faster than the speed limit on attentional tracking. This, together with a telling pattern of errors, indicated that a time-consuming shift of spatial attention was necessary to determine the spatial relationships among the stimuli. Converging evidence
from Franconeri et al. (2011) suggests that shifts of spatial attention are also involved in perceiving spatial relationships among static stimuli. Attention may serve to select stimuli of interest for the limited-capacity processing that determines temporal and spatial relations. Some aspects of the rich spatial layout we enjoy are thus a result of accumulated representations from multiple shifts of attention (see Cavanagh et al., 2010 for related ideas). In this dependence on serial processing, spatial experience may be similar to temporal experience. But even these attention-mediated aspects of spatial perception seem to capitalize on the parallel processing advantage of retinotopy. Shifting attention involves moving from activating one set of location-labeled neurons to another set of location-labeled neurons (assuming local sign has been set during the development of the organism; Lotze, 1881). This may help to calculate the vector of the attention shift, which then indicates the relative location of the two regions.

Although it is limited by the absence of chronotopy, temporal processing does reap some benefits from retinotopy. Thanks to retinotopy, motion detectors can operate in parallel across the visual field. The motion direction they compute indicates the temporal order of stimuli. It has also been suggested that retinotopy allows the visual system to compute in parallel whether stimuli across the visual field change together (in synchrony) or not. Some investigators suggested that this occurs not just for the luminance transients known to engage the motion system, but also for direction and contrast changes (Usher & Donnelly, 1998; Lee & Blake, 1999). Follow-up work however supported alternative explanations (Dakin & Bex, 2002; Beaudot, 2002; Farid & Adelson, 2001; Farid, 2002).
The issue remains unsettled, but the continued absence of good evidence for parallel temporal processing feeds the suspicion that perception of relative timing is serial and possibly attention-mediated. Temporal processing may be restricted to what can be processed serially in the short interval before it disappears from our sensory buffer.

In some ways even better than chronotopy would be time-stamping of all stimuli by an internal clock. The time stamp might be provided by a dedicated internal clock comprising a pacemaker and counter (Treisman, 1963; Ivry & Schlerf, 2008) or by a neural network with intrinsic dynamics and an internal model of the network that translates the network state into the current time (Karmarkar & Buonomano, 2007). With time-stamping, the relative timing of two events could be judged by simply comparing their timestamps, just as is done by desktop computers with files on a hard drive. If this were automatic and preattentive, then we might have better-organized temporal experience than spatial experience. But there is little or no evidence for extensive time-stamping.

Instead the system may rely on less reliable information, like the relative activation of different stimulus types. Because activation in cortex and presumably short-term memory typically decreases over time, the most active item is likely to be the last one presented, the second most active the item presented before that, etc. This “recency” scheme is subject to distortion, as other factors like attention can affect which item is most active (Reeves & Sperling, 1986). The use of relative activation might also be thwarted by repeating displays that result in saturation of the activation of multiple items. An earlier paragraph described the alternating-motion display for which one cannot determine which color goes with which motion direction (http://www.psych.usyd.edu.au/staff/alexh/research/colorMotionSimple).
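The recency scheme, and its vulnerability to attention, can be sketched as a toy model. All numbers here (the decay constant, the presentation ages, the attentional boost) are illustrative assumptions, not fitted values:

```python
import math

def activation(age_s, boost=1.0, tau=0.5):
    """Activation of a memory trace that decays exponentially with age.
    `tau` (decay constant, in seconds) and the multiplicative attentional
    `boost` are illustrative parameters, not empirical estimates."""
    return boost * math.exp(-age_s / tau)

def infer_order(items):
    """Infer presentation order from activation alone: the most active
    item is assumed to be the most recent."""
    ranked = sorted(items, key=lambda kv: activation(*kv[1]), reverse=True)
    return [name for name, _ in ranked]  # inferred most-recent-first

# Items presented 0.1, 0.4, and 0.7 s ago, all equally attended:
items = {"C": (0.1, 1.0), "B": (0.4, 1.0), "A": (0.7, 1.0)}
print(infer_order(list(items.items())))  # ['C', 'B', 'A'] — correct ordering

# Attention boosts the earliest item enough to distort the inferred order:
items_biased = {"C": (0.1, 1.0), "B": (0.4, 1.0), "A": (0.7, 4.0)}
print(infer_order(list(items_biased.items())))  # ['A', 'C', 'B'] — distorted
```

The second call illustrates the Reeves & Sperling (1986) point: any factor that raises an item's activation masquerades as recency, so the scheme cannot deliver reliable order when attention or repetition perturbs activation levels.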
The repetition of this display may saturate in memory the activation levels of the colors and motions, preventing the use of relative activation levels to pair the features. Another reason feature pairing may be difficult here is that pairing ordinarily involves using salient temporal transients to temporally segment the dynamic scene (Holcombe & Cavanagh, 2008; Nishida & Johnston, 2010; Nishida & Johnston, 2002). The unusual uninterrupted motion of the alternating-motion display results in continual transients that swamp the transient associated with the color change, and without other cues to rapidly guide attention to the transients of
interest (Holcombe & Cavanagh, 2008), temporal experience of the color and motion remains poorly organized. Only when the rate is slow can attention select an individual phase of the cycle, and that selection returns two features, indicating they occurred at the same time (Holcombe & Cavanagh, 2008). This is like spatial visual search, for which Treisman & Gelade (1980) suggested that attentional mediation is required to perceive that a color and shape originate from the same location. For time, strong luminance transients serve to engage the selective mechanism (perhaps attention, or a “when” pathway) that can make temporal relations explicit. Thus determination of temporal order and simultaneity is best when just two punctate, discrete events with strong transients are presented.

In the remainder of this chapter we will set aside the segmentation and processing capacity problems created by complex scenes. For the ideal situation of only two stimuli, we will examine how sophisticated visual temporal processing can be.

There is an important basic theoretical distinction between the time a percept is created and the time at which the observer experiences the event to have taken place. The analogous distinction in spatial perception is uncontroversial, with the phrase “where an object is perceived” taken to mean “where an object is perceived to be” rather than where in the brain the percept is created. Yet if time is substituted for space and we write “when an object is perceived”, this will be interpreted by many as the time the percept was created rather than the time the percept refers to. This is the issue of brain time versus event time: whether the brain processes events such that when a percept arises is not identical to the time it is experienced as having occurred (Dennett & Kinsbourne, 1995).
Event time advocates have affirmed the distinction and moreover claimed that the system routinely considers the time of sensory signals together with other cues to infer the time of the corresponding stimuli in the external world. But this conclusion may be premature.

2. Brain time theory versus event time theory

Conceivably, there is no distinction between when an object is perceived and the time that it is perceived to refer to. In other words, the time a percept occurs may be identical to the time that its object is perceived to have occurred. This possibility is referred to as the brain time theory of temporal perception. As Köhler put it in 1947, “Experienced order in time is always structurally identical with a functional order in the sequence of correlated brain processes” (p.62). (Köhler’s statement might also allow stretching of time that preserves order, but we will put aside this complication.) According to this brain time theory, an event is perceived as occurring when the signals it evokes in the senses reach the processes responsible for consciousness. Some signals may take longer than others to travel from the receptors to the processes responsible for consciousness, and this will result in temporal illusions, because there is no processing that could compensate for delays. That is, signals with long latencies will be perceived as having occurred later than signals with short latencies.

The alternative to brain time theory is that some property of signals other than when they arrive affects when the associated events are perceived to have taken place. The brain may have adaptive processes that result in perceived timing being closer to veridical than it would be otherwise. But some question this supposition, among them Moutoussis (2012), writing that “the idea of the perception of the time of a percept being different to the time that the actual percept is being perceived, seems quite awkward” (Moutoussis, 2012, p.4).
To other thinkers (e.g. Dennett & Kinsbourne, 1992), this would be no more peculiar than spatial illusions, wherein the perceived location of an object is dissociated from its retinal location (e.g. Roelofs, 1935; de Valois & de Valois, 1991). Time perception may be as much a constructive, interpretational process as is space perception. But to date, the evidence is that time perception does not adaptively take into account various cues to correct timing as comprehensively as spatial perception uses spatial cues.

2.1. Event time theory and simultaneity constancy

Event time refers to the time that events occur in the environment rather than the time that they are processed by various stages of the brain. Event time theory is the idea that the perceived timing of events does not always correspond to brain time; rather, the brain may effectively label a percept as referring to a time different from when the percept became conscious. This could result in the perceived time of events being more accurate.

For the brain, there are two aspects to the problem of recovering event time. The first aspect is the different latencies and processing times that re-order the temporal sequence of signals as they ascend the neural processing hierarchy. This is referred to as the differential neural latency problem. The second aspect is the different times signals require to travel from their physical sources to the receptors of the organism. For example, the light emanating from an object will arrive at the eye sooner than its sound will arrive at the ear. This is the problem of differential external latencies. In the face of these two differential latency problems, recovering actual event time would be a major achievement. It is sometimes claimed that the brain does accomplish this feat (Kopinska & Harris, 2004).
Just as the visual system recovers the correct spatial size of external objects despite wide variation in retinal extent (size constancy), the brain may also recover the correct time of events, a feat dubbed “simultaneity constancy” (Kopinska & Harris, 2004).

2.2. Brain time rules the day, and the minute

At the very coarse time frame of years, days, or hours, it is clear that brain time rules and simultaneity constancy fails. At night, when we look up at the sky and see stars, all the light we receive was caused by events that took place years ago. But our brain does not compensate for this travel time, and we perceive the stars’ appearance as their appearance at present, rather than of years ago. When we look at the moon, we see what it was 1.3 seconds ago, but again the brain does not compensate for this lag. Clearly any “simultaneity constancy” or compensation for differential latencies is only partial at best. It is unreasonable to expect the brain to know the distance of heavenly bodies, but more than this, no evidence for simultaneity constancy on the scale of minutes or longer has ever been offered (as far as I know). On the scale of minutes, hours, and days, brain time rules. At the finer sub-second timescale, however, some researchers have provided evidence for event time recovery.

2.3. Does brain time rule the split-second?

Some researchers suggest that the brain generally does reconstruct event times, at least at the sub-second scale (Harris et al., 2010). Eagleman writes that “the brain can keep account of latencies” (Eagleman, 2010). His theory is that the brain waits until the slowest signals arrive, and then reconstructs the order of events, compensating for the latencies of their neural signals. The full range of evidence however includes some conspicuous failures of the system to account for latencies, even at the subsecond scale and with good cues available.
These failures rule out the strong form of the event time theory, in which latencies are comprehensively accounted for. After discussing those failures, we will examine evidence for successful event time reconstruction, which will lead to rejection of the other extreme, brain time theory. We will conclude that partial compensation occurs.

2.4. Failures to compensate for differential neural and external latencies
The strength of a sensory signal can have a dramatic effect on its neural latency. The neural signals evoked by a high-contrast flash reach visual cortex tens of milliseconds faster than those evoked by a low-contrast one (Maunsell, Ghose et al., 1999; Oram et al., 2002). This effect is very consistent, and Oram et al. (2002) reported that also at higher-order cortical areas such as STS, stimulus contrast is the major determinant of response latency. Successful compensation would amount to low-contrast flashes being perceived at the same time as high-contrast flashes. But if people are asked to report which of two simultaneous flashes of different contrasts came first, they more frequently report the higher-contrast one (Allik & Kreegipuu, 1999; Alpern, 1954; Arden & Weale, 1954; Exner, 1875). It is natural to conclude that high-contrast flashes are perceived before low-contrast flashes, constituting a failure of event time perception. But that conclusion would be premature, because the greater salience of the high-contrast stimulus may bias decisions regarding temporal order, even if perception is unaffected (Yarrow et al., 2011; Schneider & Bavelier, 2003). Such biases complicate the interpretation of much of the literature on temporal judgments. Fortunately, more convincing evidence comes from two other illusions where decisional biases are unlikely to be responsible for the phenomenon.

The first of these illusions was described by Hess in 1904. Hess and his subjects viewed two patches, one directly above the other, while both moved from left to right. When one patch was dimmer than the other, it appeared to lag the brighter patch, suggesting a difference in perceptual latency. The spatial size of the lag seems to scale with speed (Wilson & Anstis, 1969), consistent with a constant temporal delay between two stimuli with a particular luminance difference.
And the delay is substantial, around a few dozen milliseconds per log unit difference in luminance (Wilson & Anstis, 1969; White, Linares, & Holcombe, 2008). Eagleman (2010) argued that the Hess effect displays were one of only a few special cases where the brain cannot succeed in accounting for differential latencies. Eagleman argued that it was a very special case indeed, suggesting that the Hess effect only occurs “when one uses a neutral density filter over half the screen – simply reducing the contrast of a single dot is insufficient”. Contrary to this proposal, however, White, Linares, & Holcombe (2008), for example, obtained a Hess effect without changing the background luminance. And for the additional illusions reviewed below, stimuli also were typically not presented in a larger filtered region.

The perceptual correlate of the intensity-related neural delay also manifests in motion signal processing. Roufs (1963) and Arden & Weale (1954) presented two flashes simultaneously and side-by-side on a dark background. When one flash was brighter than the other, motion was perceived from the brighter flash to the dimmer flash. Stromeyer & Martini (2003) documented a similar effect for two gratings differing in contrast rather than luminance. Motion was perceived in the direction from the higher-contrast grating to the lower-contrast grating, consistent with physiological evidence for latency decreasing with contrast as well as with luminance (Shapley & Victor, 1978; Benardete & Kaplan, 1999). A number of other motion illusions are also consistent with the effect of luminance or contrast on latency (Purushothaman et al., 1998; Ogmen, Patel, Bedell, & Camuz, 2004; Lappe & Krekelberg, 1998; White, Linares, & Holcombe, 2008; Kitaoka & Ashida, 2007). An apparent concordance of physiological latency and percepts is also observed for stimuli darker than the background versus stimuli brighter than the background.
ON-center ganglion cells in primate retina respond ~5 ms faster than OFF-center cells. Correspondingly, psychophysical motion nulling experiments in humans indicate that dark dots have a processing latency about 3 ms shorter than that of bright dots (Del Viva, Gori, & Burr, 2006).

Together these illusions indicate that brain time rules when it comes to neural latency differences caused by variations in luminance or contrast. Unfortunately we cannot exclude the possibility that the brain
engages in partial compensation for the latency difference while consistently falling short of full compensation. But the size of the effects seems similar in human perceptual studies and in the latency of physiological responses in nonhuman animals (Maunsell et al., 1999; Oram et al., 2002), so any neural accounting for latency differences must be woefully incomplete. To explain these phenomena, defenders of the event time hypothesis may argue that they are an exception, perhaps because these luminance-related latency differences are unimportant to the organism. But this argument is less than compelling, as explained in the next section.

2.5. Compensation in action but not perception?

Well-timed behavior is critical when playing a sport, fighting, or hunting. The size of the Hess effect in the photopic range is roughly 8 ms per log unit of luminance (White, Linares, & Holcombe, 2008). Comparing a daylight-illuminated object to one in dark shadow (5 log units or more), then, the object in shadow will be delayed by about 40 ms. If the objects were moving at 10 km/hr, this would result in a perceived spatial offset of 11 cm. Although these numbers may seem small, they are large relative to the accuracy of human performance in hitting a ball with a bat. Even amateurs hitting a ball with a bat achieve better than 15 ms resolution (McLeod, McLaughlin, & Nimmo-Smith, 1985), and some expert cricket batters seem to have 2 ms resolution (McLeod & Jenkins, 1991). The size of the Hess effect is large enough, then, to substantially impair performance. Its existence should therefore be surprising for theorists who are sanguine about the general ability of the visual system to compensate for latencies. But even if sensory learning does not compensate for delays caused by low luminance, this does not mean that sportsmen are condemned to miss the ball when the sun begins to set.
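The 40 ms and 11 cm figures follow directly from the quantities quoted above; the sketch below simply restates that back-of-envelope arithmetic:

```python
# Back-of-envelope check of the Hess-effect figures quoted in the text.
hess_ms_per_log_unit = 8.0   # photopic estimate (White, Linares, & Holcombe, 2008)
log_units = 5.0              # daylight-illuminated object vs. one in deep shadow

delay_s = hess_ms_per_log_unit * log_units / 1000.0  # 40 ms, in seconds

speed_m_per_s = 10.0 * 1000.0 / 3600.0  # 10 km/hr converted to m/s (~2.78)
offset_cm = speed_m_per_s * delay_s * 100.0

print(f"delay: {delay_s * 1000:.0f} ms, perceived offset: {offset_cm:.0f} cm")
# delay: 40 ms, perceived offset: 11 cm
```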
Sensorimotor (as opposed to sensory) learning may save the day (White, Linares, & Holcombe, 2008; Nijhawan, 2008). Actions like hitting a ball involve mapping the timing of sensory signals onto behavior. Mappings between particular luminances and particular timings could perhaps be learned thanks to the feedback involved in successful action. But if this learning does not occur in the sensation→perception mapping (as argued in this chapter), then it may apply only to the perception→action mapping. That is, the error signal may not propagate to the deeper (sensation and perception) layers of the system because they are farther from the teaching feedback.

2.6. Evidence for event time reconstruction

As reviewed above, luminance contrast has a consistent effect on latencies in the visual system, but perception does not seem to take account of these effects for reconstruction of event time. Let’s consider another factor that consistently affects latencies: the sensory modality of the signal. Auditory signals reach cortex faster than visual signals, by roughly 30 to 50 ms (Regan, 1989; Musacchia & Schroeder, 2009). Yet the sight and sound of snapped fingers is not noticeably out of sync. This apparent discrepancy between perception and neural latencies has been cited as a case of simultaneity constancy or “active editing” of time (Eagleman, 2007; 2009; 2010).

The sight and sound of snapped fingers may indeed be typically perceived as simultaneous. This does not however imply editing of event time. Rather, the perceived simultaneity may simply be due to our poor acuity for perceiving temporal differences, or to a broad simultaneity window. Consider the relevant sort of psychophysical experiment. Such experiments reveal that although in many cases people are more likely to judge physically simultaneous sounds and flashes as simultaneous than as having occurred at different times, physical simultaneity is not the timing most likely to yield a percept of simultaneity.
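The point of subjective simultaneity (PSS) discussed in the following paragraphs can be illustrated with a minimal sketch. The data below are invented for illustration; negative SOAs mean the flash preceded the sound, and a linear interpolation stands in for the psychometric-function fit a real study would use:

```python
# Illustrative PSS estimate from made-up audiovisual temporal-order data.
# SOA < 0: flash presented before sound. The PSS is the SOA at which
# "sound first" and "flash first" responses are equally likely.
soas_ms = [-120, -80, -40, 0, 40, 80, 120]
p_sound_first = [0.05, 0.15, 0.35, 0.60, 0.85, 0.95, 0.99]  # hypothetical

def pss_by_interpolation(soas, probs, criterion=0.5):
    """Linearly interpolate the SOA at which the response proportion
    crosses the criterion (0.5 for the PSS)."""
    for (x0, p0), (x1, p1) in zip(zip(soas, probs), zip(soas[1:], probs[1:])):
        if p0 <= criterion <= p1:
            return x0 + (criterion - p0) * (x1 - x0) / (p1 - p0)
    raise ValueError("criterion not spanned by the data")

print(f"PSS = {pss_by_interpolation(soas_ms, p_sound_first):.0f} ms")
# PSS = -16 ms
```

A negative PSS of this kind, in which the flash must lead the sound to be judged simultaneous, is the pattern reported by Stone et al. (2001) and is consistent with sounds being processed faster than flashes.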
Instead, the best timing for perceptual simultaneity is, for most participants, to present the flash before the sound (Stone et al., 2001), consistent with sounds being processed faster than flashes. The point of
subjective simultaneity is the relative timing value at which both responses are equally likely when a person is forced to choose which of two signals was presented first. The non-zero point of subjective simultaneity suggests that the differences in latency were not entirely compensated for, or not compensated at all.

Then why do the sight and sound of snapped fingers seem in sync? The perceptual asynchrony may simply not be large enough to be detected; temporal order discrimination ability is just too poor (e.g. van Eijk et al., 2008). Active editing or reconstruction of event time need not be invoked. An additional factor that might make the snapped-fingers asynchrony even more difficult to notice is the ambiguity in which moment of the temporally extended visual event generated the sound. It is not until the end of the fingers’ movement that the finger generates the snapping sound. If the brain instead assumes that the sound corresponds to the beginning of the movement, an earlier visual event, the apparent difference in neural latencies between sound and corresponding sight is diminished.

While the auditory/visual latency difference and luminance contrast effects demonstrate failures to reconstruct event time, they do not imply that the perceptual system never reconstructs event time. After all, even the clear successes of adaptive vision turn into failures when certain limits are exceeded. In the case of size constancy, for example, while the visual system does an acceptable job, failures are common (McBeath, Neuhoff, & Schiano, 1993; Granrud et al., 2003). If an organism must learn its own latencies over its lifespan, we might end up with a patchwork of partial event time reconstructions. To fully evaluate whether the brain takes account of latencies, we must review the other phenomena promulgated as evidence for simultaneity constancy.

2.6.1. Compensation for auditory distance?
Several researchers have suggested that the brain compensates for the effect of the slow speed of sound relative to the faster speed of light. Although the difference in timing of sound and sight is small for most events, during storms we sometimes experience a very large timing difference: a distant thunderclap is heard a few seconds after the light from the physically simultaneous lightning bolt. Because we do not perceive distant thunder and lightning as simultaneous, clearly our brain does not reconstruct the simultaneity of these events. This is unsurprising even for advocates of event time reconstruction, because the nature of the event and its distance are not easily perceived. However, for much closer events, from a few centimeters to a few dozen meters away, some have suggested that neural processing does result in perceiving an associated sound and light as simultaneous.

Studies of the issue have generally presented a light and a sound at different distances and different relative timings. According to the event time hypothesis, the point of subjective simultaneity for the sound and the light should shift with greater object distance. That is, for greater object distances, larger sound delays should be considered simultaneous. However, different studies have yielded very different results. Keetels & Vroomen (2012) and Vroomen & Keetels (2010) provide good reviews of the subject and consider various explanations for the discrepancy between those that favor the hypothesis (Sugita & Suzuki, 2003; Alais & Carlile, 2005; Engel & Dougherty, 1971; Kopinska & Harris, 2004) and those that do not (Arnold, Johnston, & Nishida, 2005; Heron et al., 2007; Lewald & Guski, 2003; Stone et al., 2001). The issue is complex, for example because negative findings can be blamed on the experimenters presenting the visual and auditory information in such a way that the observer perceives the distance to the sound inaccurately.
A second complication is that whether trials with different times and distances were blocked or mixed can change the adaptation state of the observer, and as adaptation can shift the simultaneity point (as described in a section below), it might explain some of the findings supporting latency compensation.
2.6.2. Compensation for the length of tactile nerves?
Simultaneity constancy in tactile perception would be more straightforward to assess, and presumably for the brain to implement, than simultaneity constancy in the audiovisual domain. Tactile signals from the toe reach the brain about 40 ms after signals from the face (Macefield et al., 1989). The brain might
compensate for the longer latencies from parts of the body farther from the brain, so that a simultaneous touch on toe and forehead feels simultaneous. Whereas audiovisual simultaneity constancy is complicated by the fact that the transmission time of sounds varies with the distance of the source, the latency differences of tactile stimulation should be more stable, possibly making them easier to learn. Otto Klemm, at the time a junior colleague of Wilhelm Wundt in Leipzig, published a series of studies of the topic (Klemm, 1925). Klemm presented tactile stimuli to the forehead, index finger, and ankle. The method he used is not entirely clear, but he seems to have asked participants to report which of two touches was presented first, while also giving them the option of responding "simultaneous". An interesting complication he encountered may be relevant to whether sensations are consistently assigned to points on a timeline or instead are represented differently. In the simple situation of a touch on the head accompanied by one near the ankle, Klemm reports (p. 215): "At the beginning of the series some of the observers were helpless even when fairly large temporal separations were used... observers had a lot of trouble to judge direct simultaneity: Since the two tactile impressions did not go together [zusammengingen] into one common Gestalt it was difficult to merge [zusammenfassen] them to simultaneity" (translation courtesy of Lars T. Boenke). Fraisse (1964) makes a related observation: it is difficult to combine stimuli of different modalities and perceive them as synchronous. Klemm pressed on with testing his subjects until they produced reliable measurements (he did not report how much experience was required). He determined that five of his six participants, when presented with simultaneous stimulation to ankle and forehead, tended to report that the forehead was stimulated first.
More specifically, in those five participants the ankle had to be touched 23 to 30 ms earlier than the forehead for the best chance of perceived simultaneity. In the sixth observer, he instead found evidence for simultaneity constancy, with the point of subjective simultaneity lying at true physical simultaneity. It is hard to know what to conclude, and indeed Klemm himself expressed some frustration. Klemm also noted that even when participants performed the temporal task without a problem, some continued to report, as described in the previous paragraph, that it felt artificial to categorize temporal order. Halliday & Mingay (1964) performed a similar study, but unfortunately with only two participants. For both, Halliday & Mingay concluded that touches of more distal body parts (toe vs. index finger, in their case) were perceived to have occurred later. Harrar & Harris (2005) followed with more experiments that yielded the same result, using temporal order judgments to infer the time difference for subjective simultaneity. Quantitatively, pooling the data across their six participants, they reported that the difference in perceived timing was approximately that predicted by the differences in simple reaction time to the body parts involved. Unfortunately they did not assess whether some participants differed from others, so we do not know whether there was the substantial variation between participants that Klemm found. Bergenheim et al. (1996) also investigated the issue, and like the others found evidence that stimulation of the more distal body parts was perceived later than stimulation of more proximal areas. However, Bergenheim et al. suggested that the discrepancy they found between foot and arm (12 ms) was not as large as would be expected from the difference in conduction latency indicated by physiological studies.
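The size of the conduction delays at stake can be estimated with back-of-envelope arithmetic. The velocity and path lengths below are illustrative round numbers (large myelinated cutaneous afferents conduct at very roughly 50-60 m/s), not values taken from any of the studies above:

```python
# Back-of-envelope conduction delays for touches at different body
# sites. Both the conduction velocity (~55 m/s for large myelinated
# cutaneous afferents) and the path lengths are illustrative round
# numbers, not measurements from any particular study.

CONDUCTION_VELOCITY = 55.0  # m/s, assumed round figure

def conduction_delay_ms(path_length_m: float) -> float:
    """Travel time along the afferent path, in milliseconds."""
    return path_length_m / CONDUCTION_VELOCITY * 1000.0

forehead = conduction_delay_ms(0.1)  # short path from face to brain
toe = conduction_delay_ms(1.6)       # rough toe-to-brain path in a tall adult

print(f"forehead: {forehead:5.1f} ms, toe: {toe:5.1f} ms, "
      f"difference: {toe - forehead:5.1f} ms")
```

With these assumed figures the toe-forehead difference comes out in the few-tens-of-milliseconds range, the same order of magnitude as the ~40 ms of Macefield et al. (1989) and the 23-30 ms shifts Klemm observed.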
In summary, all researchers found that on average, stimulation of distal areas of the skin was perceived as occurring later in time than stimulation of more proximal areas. If there is any compensation at all, it appears that either the proportion of the latency difference compensated for is small, or the proportion of people who compensate is small. Settling the issue will require more studies using modern physiological methods, larger numbers of participants, and enough data per participant to assess simultaneity constancy in each individual. To evaluate whether the times at which signals are perceived reflect compensation for signal processing latencies, we have reviewed the effects on perceptual latency of luminance, originating modality, the speed
of sound, and the length of tactile fibers. The support in the literature for adaptive compensation in these instances ranges from none to mixed. Yet one class of studies provides strong evidence for limited compensation: studies of adaptation to asynchrony. The phenomenon involved suggests a path to understanding the imperfect and limited processing that can compensate for differential latency.
3. Inter-sensory adaptation to take account of latency differences
Fujisaki, Shimojo, Kashino, & Nishida (2004) repeatedly exposed participants to a particular asynchrony between auditory and visual information, and found consistent effects on the point of subjective simultaneity. In one condition, a tone pip was followed 235 ms later by a flashed ring. After about 3 min of repeated exposure to that sequence, participants made temporal order judgments for a range of temporal offsets, which revealed that the point of subjective simultaneity had shifted by an average of 22 ms. The shift was in the direction appropriate to compensate for the 235-ms offset between sight and sound. Other studies have shown this result to be robust (Vroomen et al., 2004; Hanson, Heron, & Whitaker, 2008; Harrar & Harris, 2008; Di Luca, Machulla, & Ernst, 2009; Roach et al., 2010), and a similar phenomenon has been observed for other modality pairings (Di Luca, Machulla, & Ernst, 2009). Compensation for a particular asynchrony has also been observed for the temporal delay between actions and their sensory consequences (Cunningham, Billock, & Tsou, 2001; Stetson et al., 2006), and these shifts do not seem to be caused by shifting the physical time of stimulus-evoked neural signals (Roach, Heron, Whitaker, & McGraw, 2010). Not only do these results constitute evidence for event time reconstruction rather than reliance on brain time, but they also indicate how latency differences might come to be known: through learning.
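The logic of such a measurement can be made concrete with a small sketch. Fitting a cumulative Gaussian to temporal order judgments and taking its 50% point as the point of subjective simultaneity (PSS) is the standard approach; the response proportions below are fabricated purely for illustration (they merely mimic a shift of a few tens of milliseconds) and do not reproduce any study's actual data.

```python
# Estimating a point of subjective simultaneity (PSS) from temporal
# order judgments: fit a cumulative Gaussian to the proportion of
# "sound first" responses across a range of offsets; the offset at
# which the fitted curve crosses 50% is the PSS. All response
# proportions below are fabricated for illustration only.
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

def psychometric(soa, pss, sigma):
    # soa: sound-lead time in ms (positive = sound presented first)
    return norm.cdf(soa, loc=pss, scale=sigma)

soas = np.array([-240.0, -120.0, -60.0, 0.0, 60.0, 120.0, 240.0])
pre  = np.array([0.02, 0.10, 0.30, 0.55, 0.75, 0.92, 0.99])  # before adaptation
post = np.array([0.02, 0.08, 0.22, 0.42, 0.68, 0.88, 0.98])  # after adapting to sound-first

popt_pre, _ = curve_fit(psychometric, soas, pre, p0=(0.0, 80.0))
popt_post, _ = curve_fit(psychometric, soas, post, p0=(0.0, 80.0))
pss_pre, pss_post = popt_pre[0], popt_post[0]
shift_ms = pss_post - pss_pre
print(f"PSS before: {pss_pre:.1f} ms, after: {pss_post:.1f} ms, "
      f"shift: {shift_ms:.1f} ms")
```

After adaptation to a sound-leading sequence, the fitted PSS moves toward larger sound leads, which is the direction of compensation reported in the adaptation studies.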
These shifts may have their function in the statistics of the natural environment, where the distribution of the relative timing of stimulation by external events is likely to be centered on or near zero (simultaneity). Processes that compensate for consistent departures from this average may therefore cause the adaptation effects. These adaptation effects are analogous to aftereffects for other aspects of perception, such as motion and orientation. Accordingly, to explain these effects researchers typically invoke neural mechanisms similar to those proposed to explain traditional adaptation effects. Specifically, a typical suggestion is that neurons in the brain are selective for the adapted feature, and that adaptation of these neurons causes the aftereffect. In the case of the intersensory timing shifts, both Roach et al. (2010) and Cai, Stetson, & Eagleman (2012) suggest that the responsible neurons are multimodal neurons tuned to different asynchronies between the modalities. In the cat, there are indeed multimodal neurons that prefer different asynchronies (Meredith et al., 1987), and these also appear to exist in rhesus monkeys (Wallace, Wilkinson, & Stein, 1996) and perhaps humans. The relative timing perceived may reflect the differing activity of these neurons, and adaptation may shift this activity difference in a manner that compensates for the asynchrony (Roach et al., 2009; Cai, Stetson & Eagleman, 2012).
3.1. Timing-selective neurons vs. criterion shifts and expectations
The explanation of asynchrony aftereffects in terms of a population of neurons tuned to various asynchronies is appealing. But other possible explanations should be considered, especially because one recent result is difficult to explain in the standard way. An adaptation effect reported by Roseboom & Arnold (2011) amounts to a shift in perceived audiovisual timing that is specific to the visual stimulus used.
Participants in the experiment saw video clips of a male and a female actor on different trials, all saying the syllable "ba". In one condition the auditory signal of the male actor was always presented 300 ms before the video, whereas the auditory signal of the female actor was always presented 300 ms after the video. In other words, participants adapted to opposite A-V timing
shifts for the male speaker and for the female speaker. After fifty presentations of these stimuli, participants were tested to determine what timing relationship they considered simultaneous. In the test phase, participants were shown the videos with a range of relative timings between the auditory and visual components, and each time asked to judge whether the sound and the video were simultaneous. It turned out that the point of subjective simultaneity had shifted by a few dozen milliseconds in the direction of compensation for the adapted asynchrony, in different directions for the male actor and the female actor. The temporal shift maintained this association with the actor even though the locations of the two actors were switched during the test, meaning that the timing shift was contingent on the actor rather than on the location at which the actor was presented during the adaptation phase. Unlike the experiments involving a simple, single auditory-visual timing offset, these results cannot be explained by the adaptation of a population of multimodal neurons tuned to various auditory-visual timings. The contingency on the actor requires additional processes. One might extend the logic of explaining simple asynchrony adaptation with multimodal neurons by positing neurons that are jointly selective for actor and audio-visual timing. But this might lead to a combinatorial explosion of neurons, as the contingency on "actor" is unlikely to be the only possible contingency: a range of neurons would be needed for each kind of contingency. A process with more flexibility should be considered. The processing that shifts decision criteria may fit the bill. In signal detection theory, the criterion is a threshold level of the internal signal that the observer uses to decide which response to make.
In the context of a simultaneity judgment, the relevant signal may be something like the difference in the internal timing of the auditory response and the visual response. This signal is assumed to have a Gaussian distribution. As the timing difference is signed (indicating whether auditory came before or after visual), two criteria may be involved, one for the positive side of the distribution (discriminating simultaneous from auditory after visual) and one for the negative side (discriminating simultaneous from visual after auditory); see Yarrow et al. (2011) for discussion. Shifts of these decision criteria result in shifts in the point of subjective simultaneity. Repeated exposure to a particular asynchrony might cause the brain to shift the decision criteria in the direction of compensation. This criterion shift account is quite different from those involving adaptation of a population of asynchrony-tuned neurons (Roach et al., 2009; Cai, Stetson, & Eagleman, 2012). Among psychophysicists, criterion shifts are often considered uninteresting. The notion seems to be that a criterion shift is more likely to be caused by observers taking a different attitude towards their percepts than by perception itself changing. In contrast, the asynchrony-tuned neuron account is firmly a theory of change of percepts, via a shift in underlying neural populations. Fortunately there is some hope of distinguishing these accounts by experiment, although this has not yet been done. The asynchrony-tuned neuron account appears to predict that sensitivity will change, not just criterion. The evidence in the literature so far appears consistent with a shift in criteria (Fujisaki et al., 2004; Vroomen et al., 2004; Yarrow et al., 2011; Hanson, Heron, & Whitaker, 2008). Certainly, no one has demonstrated that their result could not be explained by a shift in criteria or greater variability in criteria (Roach et al., 2010; Yarrow et al., 2011).
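The two-criterion account can be made concrete with a minimal sketch (all numbers hypothetical): the internal audio-visual timing difference is Gaussian, "simultaneous" is reported when it falls between a lower and an upper criterion, and shifting both criteria by the same amount relocates the simultaneity window without touching sensitivity:

```python
# A sketch of the two-criterion account of simultaneity judgments.
# The internal signal is the signed audio-visual timing difference,
# assumed Gaussian around the physical SOA. "Simultaneous" is
# reported when the signal falls between two criteria; shifting both
# criteria by the same amount moves the simultaneity window while
# leaving sensitivity (sigma) unchanged. All numbers illustrative.
from scipy.stats import norm

SIGMA = 60.0  # internal timing noise, ms (assumed)

def p_simultaneous(true_soa, lo_crit, hi_crit, sigma=SIGMA):
    """Probability of a 'simultaneous' response at a given physical SOA (ms)."""
    return (norm.cdf(hi_crit, loc=true_soa, scale=sigma)
            - norm.cdf(lo_crit, loc=true_soa, scale=sigma))

# Before adaptation: window centred on zero.
before = p_simultaneous(0.0, -50.0, 50.0)
# After adapting to sound-leads-light: both criteria shifted by +20 ms,
# so a 20-ms sound lead is now the most "simultaneous" offset.
after_at_zero = p_simultaneous(0.0, -30.0, 70.0)
after_at_20 = p_simultaneous(20.0, -30.0, 70.0)
assert after_at_20 > after_at_zero  # the peak of the window has moved
print(f"before, at 0 ms: {before:.3f}; after, at 0 ms: {after_at_zero:.3f}; "
      f"after, at 20 ms: {after_at_20:.3f}")
```

Because only the criteria move, the width of the window, and hence discrimination sensitivity, is identical before and after the shift, which is why this account predicts a PSS shift with no accompanying sensitivity change.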
But one should not take the lack of evidence for a sensitivity change as implying that percepts did not change. As Michael Morgan and colleagues have pointed out, even some indisputably perceptual effects, like the motion aftereffect, may be caused by criterion shifts (or "subtractive adaptation") rather than sensitivity changes (Morgan, Chubb, & Solomon, 2011; Morgan & Glennerster, 1991; Morgan, Hole, & Glennerster, 1990).
Thus an aftereffect that manifests only as a criterion shift is not necessarily non-perceptual. To get a fuller view of what needs to be explained, future investigations should document the scope of contingencies adapted to. Perhaps, given an appropriate task and stimulus exposure protocol, timing shifts could be produced for completely arbitrary stimulus pairings, with one pair of criteria for pictures of Jennifer Aniston, another for pictures of pink koalas, and another for a person whose face you didn't encounter until the experiment began. For the brain to accomplish such a feat, some process has to store these criteria and trot them out for the appropriate tasks and stimuli. This topic is rarely discussed in the adaptation literature, but it raises interesting issues that may be widespread in the study of human cognition and learning. While the Roseboom & Arnold (2011) result may herald an explosion of contingent timing shifts, these may be restricted to situations of high uncertainty regarding the time of sensory signals. For rather than the simple tone and flash used in previous studies, Roseboom & Arnold (2011) presented extended, time-varying video and auditory stimuli. The video clip involved facial movements of the actor that extended over what appears to be (from the supplementary clip provided with the paper) several hundred milliseconds, and the duration of the auditory syllable signal was probably also at least a few hundred milliseconds. Both were complex stimuli with multiple features occurring over their time-course, with differing durations and without unambiguous discrete onsets and offsets. In such a situation, to determine whether the stimuli were simultaneous, one must identify which stimulus features should go together. The adaptation process may then be one of associating particular features of the extended video signal that occur at certain times with particular features of the auditory train.
This might be the explanation of the results: after repeated experience hearing a particular part of the auditory train presented simultaneously with a particular lip movement, one may learn that this is the way that particular speaker talks. Deviations from that learned timing are then perceived, correctly, as temporally shifted from that speaker's usual timing. This may thus be a criterion shift, and one that does not generalize to cases where the auditory-visual matching is unambiguous. This interpretation, that the contingent asynchrony adaptation found by Roseboom & Arnold (2011) will not generalize to situations of unambiguous audio-visual correspondence, gets some support from the results of Heron et al. (2012). Like Roseboom & Arnold (2011), Heron et al. (2012) tested whether intersensory asynchrony adaptation could be contingent on the identity of the stimulus. Instead of using different actors paired with their respective voices, they used high spatial frequency gratings with high-pitched tones and low spatial frequency gratings with low-pitched tones. Other researchers have shown that observers tend to spontaneously associate these values (Evans & Treisman, 2010; Gallace & Spence, 2006; Spence, 2011), suggesting they are not entirely unnatural associations. Yet unlike Roseboom & Arnold (2011), these authors found that the asynchrony adaptation did not "stick" to the identity of the stimulus, but was instead tied to the spatial location. Thus they demonstrated adaptation to opposite asynchronies (visual before auditory and visual after auditory) tied to distinct locations. This is compatible with mediation by a brain area like the superior colliculus that is retinotopically organized and has neurons tuned to audiovisual asynchronies. The account based on a population of neurons tuned to various asynchronies therefore remains viable.
We have considered whether the brain sets the perceived timing of sensory signals to compensate for learned or imputed sensory latencies. In a limited way it does, but the scope of the phenomenon and the nature of the underlying processing remain obscure.
4. Grouping and Gestalts
Auditory stimuli can have a powerful effect on temporal aspects of visual perception. A single flash looks like two if two sounds are presented within about 100 ms of the time of the flash (Shams, Kamitani, & Shimojo, 2000; 2002). Sounds also shift the perceived timing of flashes, in a manner suggesting strong perceptual integration (Morein-Zamir et al., 2003; Freeman & Driver, 2008; Kafaligonul & Stoner, 2010). But these shifts in perceived timing are not necessarily consequences of processing that evolved to extract event time. That is, although they may mean that the brain time theory is wrong, they do not mean that the event time theory is right. Instead of the brain being bent on recovering the time of sensory events and achieving simultaneity constancy, perceived timing may be a secondary effect of grouping and integration. Evolutionary selection pressure may primarily have driven the brain towards organising ambiguous stimuli into the most likely groupings, without special consideration for timing. A striking auditory illusion discovered a century ago supports this theory that the brain prioritises grouping over correct timing. Benussi in 1913 reported that simple punctate sound sequences result in consistent illusions of temporal order (Sinico, 1999; Albertazzi, 1999). In a demonstration available online (http://i-perception.perceptionweb.com/journal/I/volume/3/article/i0490sas), Koenderink et al. (2012) present one example: a sequence comprising a low tone, a high tone, and a noise burst. When the noise burst is presented as the middle sound, so that the tones are not temporally adjacent, one nevertheless hears the tones as grouped together and the noise as occurring afterwards. This likely occurs because the tones form a good gestalt, and the noise is segmented away from them.
A similar phenomenon with more complex stimuli was reported by Ladefoged & Broadbent (1960), who presented participants with a recording of a sentence, such as "John that was the boy that had a top". A click sound was superimposed on one of the words in the sentence. Participants had difficulty determining when the click was presented, and tended to report the click as occurring at an earlier word than the one on which it was actually presented. Further work (Fodor & Bever, 1965; Garrett, Bever, & Fodor, 1966) indicated that clicks are subjectively attracted toward clause boundaries. In sum, the click is perceived to have occurred at a completely different time than it was presented, and presumably a time quite different from when it was processed. The shifting of the perceived time may be a byproduct of processes driven primarily by the need for auditory comprehension and source identification (see also Spence, this volume). This is very different from the view of event time theorists, who assume the goal of perceiving the correct time of events is the primary factor determining perceived timing. UPDATE AFTER PUBLICATION: AFTER WRITING THIS, I FOUND LATER PAPERS THAT REPORTED EVIDENCE THIS WAS ALL DUE TO RESPONSE BIAS, SO THE POINT OF THIS PARAGRAPH RESTS LARGELY ON BENUSSI'S EFFECT. Brain time theory is wrong, but so is the strong form of event time theory. Instead, the brain's priority may be grouping together sensory signals originating with a common event. But this does not exclude the existence of adaptation and criterion shift phenomena that on average push perceived timing towards veridicality.
5. Summary
We do not yet know whether perception consistently represents event sequences as a timeline, in the way that in the spatial domain we have a strong sense of the layout of a scene. It may be that temporal experience is more impoverished.
When several to many stimuli are presented rather than just a few, most of the temporal relations may be unavailable, or reliant on erratic cues like the relative strength of the items in short-term memory (Reeves & Sperling, 1986). When just two stimuli accompanied by strong transients are presented, they are more likely to engage attention and also to result in a clear percept of temporal order (Fujisaki & Nishida, 2007).
Extracting certain spatial relationships also seems to require attentional mediation (Holcombe, Linares, & Vaziri-Pashkam, 2011; Franconeri et al., 2012). But aspects of spatial perception take advantage of the brain's topographic arrays to process information in parallel, whereas the visual brain may lack a chronotopic bank of processors. In recent years much literature has focused on deciding between the event time reconstruction theory and the brain time theory. But the reality may be a modest amount of event time reconstruction that emerges from a recalibration process, one that shifts crossmodal simultaneity points after prolonged exposure to asynchrony. Operating in parallel with recalibration may be organizational processes that create temporal illusions as a byproduct of Gestalt grouping (Benussi, 1913; Ladefoged & Broadbent, 1960). In evolutionary history, success at event reconstruction has likely been a factor in selecting the winning organisms over the extinct losers. But segmenting events and identifying them may have been both more important for the organism and more feasible than determining exact event timing. When absolute timing is critical, learning of sensorimotor mappings may be used to time behaviour correctly, rather than changes to perception.
6. Acknowledgments
I thank Lars T. Boenke, Colin Clifford, and Paolo Martini for discussions, and Lars T. Boenke, Alex L. White, and Daniel Linares for comments on an earlier version of the manuscript. I thank Alex L. White for the point that in snapping one's fingers, it is not obvious which part of the visual sequence generated the sound. Lars T. Boenke translated Klemm (1925) from German into English. The writing of this chapter was supported by ARC grants DP110100432 and FT0990767.
7. References
Alais, D., & Carlile, S. (2005). Synchronizing to real events: Subjective audiovisual alignment scales with perceived auditory depth and speed of sound. Proceedings of the National Academy of Sciences of the United States of America, 102(6), 2244-2247.
Albertazzi, L. (1999). The time of presentness. A chapter in positivistic and descriptive psychology. Axiomathes, 49-73.
Allan, L. G. (1975). The relationship between judgments of successiveness and judgments of order. Perception & Psychophysics, 18, 29-36.
Allik, J., & Kreegipuu, K. (1998). Multiple visual latency. Psychological Science, 9, 135-138.
Alpern, M. (1954). The relation of visual latency to intensity. A.M.A. Archives of Ophthalmology, 51, 369-374.
Alvarez, G. A. (2011). Representing multiple objects as an ensemble enhances visual cognition. Trends in Cognitive Sciences, 15(3), 122-131. doi:10.1016/j.tics.2011.01.003
Arden, G. B., & Weale, R. A. (1954). Variations of the latent period of vision. Proceedings of the Royal Society of London B, 142, 258-267.
Arnold, D. H. (2005). Perceptual pairing of colour and motion. Vision Research, 45(24), 3015-3026.
Arstila, V. (2012). Why the transitivity of perceptual simultaneity should be taken seriously. Frontiers in Integrative Neuroscience, 6, 3. doi:10.3389/fnint.2012.00003
Benardete, E. A., & Kaplan, E. (1999). The dynamics of primate M retinal ganglion cells. Visual Neuroscience, 16, 355-368.
Benussi, V. (1913). Psychologie der Zeitauffassung. Heidelberg: Winter.
Berdyyeva, T. K., & Olson, C. R. (2010). Rank signals in four areas of macaque frontal cortex during selection of actions and objects in serial order. Journal of Neurophysiology, 104(1), 141-159.
Bergenheim, M., Johansson, H., Granlund, B., & Pedersen, J. (1996). Experimental evidence for a sensory synchronization of sensory information to conscious experience. In S. R. Hameroff, A. W. Kaszniak, & A. C. Scott (Eds.), Towards a Science of Consciousness: The First Tucson Discussions and Debates (pp. 301-310). Cambridge, MA: MIT Press.
Blake, R., & Sekuler, R. (2006). Perception (5th ed.). New York: McGraw-Hill.
Bouvier, S., & Treisman, A. (2010). Visual feature binding requires reentry. Psychological Science, 21(2), 200-204. doi:10.1177/0956797609357858
Cai, M., Stetson, C., & Eagleman, D. M. (2012). A neural model for temporal order judgments and their active recalibration: A common mechanism for space and time? Frontiers in Psychology, 3, 1-11. doi:10.3389/fpsyg.2012.00470
Cavanagh, P., Hunt, A. R., Afraz, A., & Rolfs, M. (2010). Visual stability based on remapping of attention pointers. Trends in Cognitive Sciences, 14(4), 147-153. doi:10.1016/j.tics.2010.01.007
Crespi, S., Biagi, L., d'Avossa, G., Burr, D. C., Tosetti, M., & Morrone, M. C. (2011). Spatiotopic coding of BOLD signal in human visual cortex depends on spatial attention. PLoS One, 7(6), e21661.
Cunningham, D. W., Billock, V. A., & Tsou, B. H. (2001). Sensorimotor adaptation to violations of temporal contiguity. Psychological Science, 12, 532-535.
De Valois, R. L., & De Valois, K. K. (1991). Vernier acuity with stationary moving Gabors. Vision Research, 31(9), 1619-1626.
Del Viva, M. M., Gori, M., & Burr, D. C. (2006). Powerful motion illusion caused by temporal asymmetries in ON and OFF visual pathways. Journal of Neurophysiology, 95(6), 3928-3932. doi:10.1152/jn.01335.2005
Dennett, D., & Kinsbourne, M. (1992). Time and the observer: The where and when of consciousness in the brain. Behavioral and Brain Sciences, 15, 183-247.
Di Luca, M., Machulla, T. K., & Ernst, M. O. (2009). Recalibration of multisensory simultaneity: Cross-modal transfer coincides with a change in perceptual latency. Journal of Vision, 9, 7-16.
Eagleman, D. M. (2011). Incognito: The Hidden Life of the Brain. Canongate Books.
Eagleman, D. M. (2010).
The strange mapping between the timing of neural signals and perception. In R. Nijhawan (Ed.), Issues of Space and Time in Perception and Action. Cambridge University Press.
Eagleman, D. M. (2009). Brain time. In M. Brockman (Ed.), What's Next: Dispatches From the Future of Science. Vintage Books.
Eagleman, D. M. (2007). 10 unsolved mysteries of the brain. Discover, August, 1-3.
van Eijk, R. L., Kohlrausch, A., Juola, J. F., & van de Par, S. (2008). Audiovisual synchrony and temporal order judgments: Effects of experimental method and stimulus type. Perception & Psychophysics, 70(6), 955-968.
Engel, G. R., & Dougherty, W. G. (1971). Visual-auditory distance constancy. Nature, 234(5327), 308.
Exner, S. (1875). Experimentelle Untersuchung der einfachsten psychischen Processe. III Abhandlung [Experimental research on the simplest psychical processes]. Pflügers Archiv für die gesammte Physiologie des Menschen und Thiere, 11, 403-432.
Fodor, J. A., & Bever, T. G. (1965). The psychological reality of linguistic segments. Journal of Verbal Learning & Verbal Behavior, 4, 414-420.
Forte, J., Hogben, J. H., & Ross, J. (1999). Spatial limitations of temporal segmentation. Vision Research, 39, 4052-4061.
Fraisse, P. (1964). The Psychology of Time. London: Eyre and Spottiswoode.
Franconeri, S., Scimeca, J., Roth, J., Helseth, S., & Kahn, L. (2012). Flexible visual processing of spatial relationships. Cognition, 122, 210-227.
Fujisaki, W., & Nishida, S. (2007). Feature-based processing of audio-visual synchrony perception revealed by random pulse trains. Vision Research, 47(8), 1075-1093.
Fujisaki, W., Shimojo, S., Kashino, M., & Nishida, S. (2004). Recalibration of audiovisual simultaneity. Nature Neuroscience, 7(7), 773-778.
García-Pérez, M. A., & Alcalá-Quintana, R. (2012). On the discrepant results in synchrony judgment and temporal-order judgment tasks: A quantitative model. Psychonomic Bulletin & Review. doi:10.3758/s13423-012-0278-y
Garrett, M. E., Bever, T. G., & Fodor, J. A. (1966). The active use of grammar in speech perception. Perception & Psychophysics, 1, 30-32.
Granrud, C. E., Granrud, M. A., Koc, J. C., Peterson, R. W., & Wright, S. M. (2003). Perceived size of traffic lights: A failure of size constancy for objects viewed at a distance. Journal of Vision, 3(9), 491.
Grondin, S. (2010).
Timing and time perception: A review of recent behavioral and neuroscience findings. Attention, Perception & Psychophysics, 72(3), 561-582. doi:10.3758/APP
Halliday, A., & Mingay, R. (1964). On the resolution of small time intervals and the effect of conduction delays on the judgement of simultaneity. Quarterly Journal of Experimental Psychology, 16(1), 37-41.
Hanson, J. V., Heron, J., & Whitaker, D. (2008). Recalibration of perceived time across sensory modalities. Experimental Brain Research, 185, 347-352.
Harrar, V., & Harris, L. R. (2008). The effect of exposure to asynchronous audio, visual, and tactile stimulus combinations on the perception of simultaneity. Experimental Brain Research, 186, 517-524.
Harris, L. R., Harrar, V., Jaekl, P., & Kopinska, A. (2010). Mechanisms of simultaneity constancy. In R. Nijhawan (Ed.), Space and Time in Perception and Action (pp. 232-253). Cambridge, UK: Cambridge University Press.
Heron, J., Roach, N. W., Hanson, J. V. M., McGraw, P. V., & Whitaker, D. (2012). Audiovisual time perception is spatially specific. Experimental Brain Research, 218(3), 477-485. doi:10.1007/s00221-012-3038-3
Heron, J., Hanson, J. V. M., & Whitaker, D. (2009). Effect before cause: Supramodal recalibration of sensorimotor timing. PLoS ONE, 4, e7681. doi:10.1371/journal.pone.0007681
Heron, J., Whitaker, D., McGraw, P. V., & Horoshenkov, K. V. (2007). Adaptation minimizes distance-related audiovisual delays. Journal of Vision, 7, 1-8.
Holcombe, A. O., & Cavanagh, P. (2008). Independent, synchronous access to color and motion features. Cognition, 107(2), 552-580.
Holcombe, A. O., & Clifford, C. W. (2012). Failures to bind spatially coincident features: Comment on Di Lollo. Trends in Cognitive Sciences, 16(8), 402.
Holcombe, A. O., Linares, D., & Vaziri-Pashkam, M. (2011). Perceiving spatial relationships via attentional tracking and shifting. Current Biology, 21, 1-5.
Holcombe, A. O. (2009). Seeing slow and seeing fast: Two limits on perception. Trends in Cognitive Sciences, 13(5), 216-221.
Holcombe, A. O., Kanwisher, N., & Treisman, A. (2001). The midstream order deficit. Perception & Psychophysics, 63(2), 322-329.
Jaśkowski, P. (1991). Two-stage model for order discrimination. Perception & Psychophysics, 50, 76-82.
Kafaligonul, H., & Stoner, G. R. (2010). Auditory modulation of visual apparent motion with short spatial and temporal intervals. Journal of Vision, 10, 1–13. doi:10.1167/10.12.31
Kitaoka, A., & Ashida, H. (2003). Phenomenal characteristics of the peripheral drift illusion. Vision Research, 15, 261–262.
Keetels, M., & Vroomen, J. (2012). Perception of synchrony between the senses. In Frontiers in the neural basis of multisensory processes (pp. 147–178). London: Taylor & Francis.
Klemm, O. (1925). Über die Wirksamkeit kleinster Zeitunterschiede auf dem Gebiete des Tastsinns. Archiv für die gesamte Psychologie, 50, 205–220.
Koenderink, J., Richards, W., & van Doorn, A. (2012). Space-time disarray and visual awareness. i-Perception, 3(3), 159–162. doi:10.1068/i0490sas
Köhler, W. (1947). Gestalt psychology: An introduction to new concepts in modern psychology. New York: Liveright.
Kopinska, A., & Harris, L. R. (2004). Simultaneity constancy. Perception, 33(9), 1049–1060.
Ladefoged, P., & Broadbent, D. E. (1960). Perception of sequence in auditory events. Quarterly Journal of Experimental Psychology, 12(3), 162–170.
Levi, D. (1996). Pattern perception at high velocities. Current Biology, 6(8), 1020–1024.
Lewald, J., & Guski, R. (2004). Auditory–visual temporal integration as a function of distance: No compensation for sound-transmission time in human perception. Neuroscience Letters, 357(2), 119–122.
Lotze, H. (1881). Grundzüge der Psychologie: Dictate aus den Vorlesungen. Leipzig: S. Hirzel.
Macefield, G., Gandevia, S. C., & Burke, D. (1989). Conduction velocities of muscle and cutaneous afferents in the upper and lower limbs of human subjects. Brain, 112(6), 1519–1532.
von der Malsburg, C. (1981). The correlation theory of brain function. Reprinted (1994) in Domany et al. (Eds.), Models of neural networks II: Temporal aspects of coding and information processing in biological systems (pp. 95–119). Springer-Verlag.
Marr, D. (1982). Vision. San Francisco, CA: Freeman.
Maunsell, J. H., Ghose, G. M., Assad, J. A., McAdams, C. J., Boudreau, C. E., & Noerager, B. D. (1999). Visual response latencies of magnocellular and parvocellular LGN neurons in macaque monkeys. Visual Neuroscience, 16(1), 1–14.
McBeath, M. K., Neuhoff, J. G., & Schiano, D. J. (1993). Familiar suspended objects appear smaller than actual independent of viewing distance. Paper presented at the Annual Convention of the American Psychological Society, Chicago, IL.
McLeod, P., & Jenkins, S. (1991). Timing accuracy and decision time in high-speed ball games. International Journal of Sport Psychology, 22, 279–295.
McLeod, P., McLaughlin, C., & Nimmo-Smith, I. (1985). Information encapsulation and automaticity: Evidence from the visual control of finely timed actions. In M. I. Posner & O. S. Marin (Eds.), Attention and performance XI. Hillsdale, NJ: Erlbaum.
Morein-Zamir, S., Soto-Faraco, S., & Kingstone, A. (2003). Auditory capture of vision: Examining temporal ventriloquism. Cognitive Brain Research, 17(1), 154–163.
Morgan, M. J., & Glennerster, A. (1991). Efficiency of locating centres of dot-clusters by human observers. Vision Research, 31, 2075–2083.
Morgan, M. J., Hole, G. J., & Glennerster, A. (1990). Biases and sensitivities in geometrical illusions. Vision Research, 30, 1793–1810.
Morgan, M. J., Chubb, C., & Solomon, J. A. (2011). Evidence for a subtractive component in motion adaptation. Vision Research, 51, 2312–2316.
Morgan, M., Dillenburger, B., Raphael, S., & Solomon, J. A. (2012). Observers can voluntarily shift their psychometric functions without losing sensitivity. Attention, Perception, & Psychophysics, 74, 185–193.
Musacchia, G., & Schroeder, C. E. (2009). Neuronal mechanisms, response dynamics and perceptual functions of multisensory interactions in auditory cortex. Hearing Research, 258(1–2), 72–79. doi:10.1016/j.heares.2009.06.018
Nijhawan, R. (2008). Visual prediction: Psychophysics and neurophysiology of compensation for time delays. Behavioral and Brain Sciences, 31, 179–239.
Nishida, S., & Johnston, A. (2010). The time marker account of cross-channel temporal judgments. In R. Nijhawan & B. Khurana (Eds.), Space and time in perception and action (pp. 278–300). Cambridge, UK: Cambridge University Press.
Nishida, S., & Johnston, A. (2002). Marker correspondence, not processing latency, determines temporal binding of visual attributes. Current Biology, 12(5), 359–368.
Oram, M. W., Xiao, D., Dritschel, B., & Payne, K. R. (2002). The temporal resolution of neural codes: Does response latency have a unique role? Philosophical Transactions of the Royal Society B: Biological Sciences, 357(1424), 987–1001.
Reeves, A., & Sperling, G. (1986). Attention gating in short-term visual memory. Psychological Review, 93(2), 180–206.
Regan, D. (1989). Human brain electrophysiology: Evoked potentials and evoked magnetic fields in science and medicine. New York: Elsevier.
Roach, N. W., Heron, J., Whitaker, D., & McGraw, P. V. (2010). Asynchrony adaptation reveals neural population code for audio-visual timing. Proceedings of the Royal Society B: Biological Sciences, 1314–1322. doi:10.1098/rspb.2010.1737
Roelofs, C. (1935). Optische Localisation. Archiv für Augenheilkunde, 109, 395–415.
Roseboom, W., Nishida, S., Fujisaki, W., & Arnold, D. H. (2011). Audio-visual speech timing sensitivity is enhanced in cluttered conditions. PLoS ONE, 6(4), 1–8. doi:10.1371/journal.pone.0018309
Roufs, J. A. J. (1963). Perception lag as a function of stimulus luminance. Vision Research, 3, 81–91.
Schneider, K. A., & Bavelier, D. (2003). Components of visual prior entry. Cognitive Psychology, 47(4), 333–366.
Shams, L., Kamitani, Y., & Shimojo, S. (2002). Visual illusion induced by sound. Cognitive Brain Research, 14(1), 147–152.
Shams, L., Kamitani, Y., & Shimojo, S. (2000). Illusions: What you see is what you hear. Nature, 408(6814), 788.
Shapley, R. M., & Victor, J. D. (1978). The effect of contrast on the transfer properties of cat retinal ganglion cells. Journal of Physiology, 285, 275–298.
Shore, D. I., Spry, E., & Spence, C. (2002). Confusing the mind by crossing the hands. Cognitive Brain Research, 14, 153–163.
Sinico, M. (1999). Benussi and the history of temporal displacement. Axiomathes, 75–93.
Smith, W. S., Mollon, J. D., Bhardwaj, R., & Smithson, H. E. (2011). Is there brief temporal buffering of successive visual inputs? The Quarterly Journal of Experimental Psychology, 64(4), 767–791.
Smithson, H., & Mollon, J. (2006). Do masks terminate the icon? Quarterly Journal of Experimental Psychology, 59(1), 150–160.
Snowden, R., Thompson, P., & Troscianko, T. (2006). Basic vision. Oxford: Oxford University Press.
Spence, C. (2011). Crossmodal correspondences: A tutorial review. Attention, Perception, & Psychophysics, 73, 971–995.
Spence, C., & Parise, C. (2010). Prior-entry: A review. Consciousness and Cognition, 19(1), 364–379. doi:10.1016/j.concog.2009.12.001
Stetson, C., Cui, X., Montague, P. R., & Eagleman, D. M. (2006). Motor-sensory recalibration leads to an illusory reversal of action and sensation. Neuron, 51, 651–659.
Stone, J. V., Hunkin, N. M., Porrill, J., Wood, R., Keeler, V., Beanland, M., Port, M., et al. (2001). When is now? Perception of simultaneity. Proceedings of the Royal Society B: Biological Sciences, 268(1462), 31–38. doi:10.1098/rspb.2000.1326
Stromeyer, C. F., & Martini, P. (2003). Human temporal impulse response speeds up with increased stimulus contrast. Vision Research, 43(3), 285–298.
Tanji, J. (2001). Sequential organization of multiple movements: Involvement of cortical motor areas. Annual Review of Neuroscience, 24, 631–651.
Uttal, W. R. (1979). Do central nonlinearities exist? Behavioral and Brain Sciences, 2, 286.
Vicario, G. B. (2003). Temporal displacement. In The nature of time: Geometry, physics, and perception (pp. 53–66). Kluwer Academic.
Vroomen, J., & Keetels, M. (2010). Perception of intersensory synchrony: A tutorial review. Attention, Perception, & Psychophysics, 72(4), 871–884. doi:10.3758/APP
Vroomen, J., Keetels, M., de Gelder, B., & Bertelson, P. (2004). Recalibration of temporal order perception by exposure to audio-visual asynchrony. Cognitive Brain Research, 22(1), 32–35. doi:10.1016/j.cogbrainres.2004.07.003
Wackermann, J. (2007). Inner and outer horizons of time experience. The Spanish Journal of Psychology, 10(1), 20–32.
Wallace, M. T., Wilkinson, L. K., & Stein, B. E. (2012). Representation and integration of multiple sensory inputs in primate superior colliculus. Journal of Neurophysiology, 76, 1246–1266.
Warren, R. M., Obusek, C. J., Farmer, R. M., & Warren, R. P. (1969). Auditory sequence: Confusion of patterns other than speech or music. Science, 164, 586–587.
Williams, J. M., & Lit, A. (1983). Luminance-dependent visual latency for the Hess effect, the Pulfrich effect and simple reaction time. Vision Research, 23, 171–179.
Wittmann, M. (2011). Moments in time. Frontiers in Integrative Neuroscience, 5, 1–9. doi:10.3389/fnint.2011.00066
Wolfe, J. M., et al. (2012). Sensation & perception (3rd ed.). Sunderland, MA: Sinauer.
Woodrow, H. (1951). Time perception. In S. S. Stevens (Ed.), Handbook of experimental psychology (pp. 1224–1236). New York: Wiley.
Yarrow, K., Jahn, N., Durant, S., & Arnold, D. H. (2011). Shifts of criteria or neural timing? The assumptions underlying timing perception studies. Consciousness and Cognition, 20(4), 1518–1531. doi:10.1016/j.concog.2011.07.003