Coexistence of Multiple Modal Dominances

Marvin Chandra ([email protected])
Department of Psychology, University of Hawaii at Manoa
2530 Dole Street, Honolulu, HI 96822, USA

Christopher W. Robinson ([email protected])
Department of Psychology, The Ohio State University
208F Ohio Stadium East, 1961 Tuttle Park Place, Columbus, OH 43210, USA

Scott Sinnett ([email protected])
Department of Psychology, University of Hawaii at Manoa
2530 Dole Street, Honolulu, HI 96822, USA

Abstract

Research has demonstrated visual dominance effects: when presented with compound visual and auditory stimuli, participants are inclined to focus on the visual information. A recent study, however, found auditory dominance using a passive oddball detection task. As that task did not require an explicit response, the first aim of the present study was to require a response from the participant. Using a single-response oddball task, Experiment 1 found auditory dominance when examining response times to auditory and visual oddballs, and Experiment 2 confirmed the finding even when visual stimuli were presented 100 ms prior to auditory stimuli. Experiment 3 extended the task to measure error rates, requiring participants to make separate responses for auditory, visual, and bimodal stimuli. Auditory dominance was eliminated, with a reversal to visual dominance. The current study provides evidence for the coexistence of multiple sensory dominances. Mechanisms underlying sensory dominance and factors that may modulate it are discussed.

Keywords: Cross-modal processing; Sensory Dominance; Attention.
Introduction

Our multisensory milieu necessitates the interaction of information arriving at different sensory modalities to create the world we perceive. In fact, research involving multimodal presentations has highlighted the brain's specialization for multisensory relationships. For instance, imaging studies have shown that the combination of sensory stimuli arriving at different senses (e.g., visual and auditory stimuli) activates multisensory neurons in non-human mammals (Meredith & Stein, 1986), non-human primates (Hikosaka et al., 1988), and humans (Calvert et al., 2000). Behaviorally, this has been correlated with performance facilitation for multisensory presentations. For instance, Frassinetti and Bolognini (2002) demonstrated that the presentation of concurrent auditory stimuli reduces the threshold to detect visual items in a difficult detection task. In addition, Laurienti et al. (2004) found that presenting compound audiovisual stimuli prior to target unimodal visual stimuli reduced response latency to task targets. This decrease in
reaction time was not observed when the audiovisual stimuli were replaced with an equivalent amount of unisensory stimulation (i.e., either two visual or two auditory stimuli). Facilitation is not the only consequence of multisensory exposure, however. Using detection tasks, Colavita (1974) documented competition between the visual and auditory modalities when visual and auditory stimuli were simultaneously presented. Participants were asked to press one button when exposed to a sound, a separate button when exposed to a flash of light, and both buttons for the simultaneous presentation of the sound and the light. On trials with bimodal presentations, participants pressed the unisensory visual response button in 98% of occurrences, despite faster reaction times being recorded for unisensory auditory responses when presented in isolation (i.e., they made visually biased errors; note that recent demonstrations have shown no difference between unisensory response latencies). The robustness of this visual dominance effect has been observed in a number of recent investigations (see Spence, 2009, for a review). The dominance of the visual modality is not limited to detection tasks involving simple stimuli such as beeps and flashes. Sinnett, Spence, and Soto-Faraco (2007) required participants to respond to specific visual (i.e., the picture of a stoplight), auditory (i.e., the sound of a cat), or bimodal (i.e., the stoplight and cat presented simultaneously) stimuli embedded within a rapid serial presentation of pictures and sounds. In this case, errors to bimodal stimuli were significantly biased toward visual responses (i.e., a visual dominance effect). Building upon this finding, Koppen, Alsius, and Spence (2008) instructed participants to press one key for a visual stimulus (i.e., a full-color picture of an animal), another key for an auditory stimulus (i.e., the sound of an animal), or both keys for bimodal presentations. Employing a small set of visual and auditory stimuli reproduced the visual dominance effect; however, increasing the set to include more stimuli magnified the effect, leading to more visual-only responses (i.e., errors) to bimodal stimuli. Recent investigations have explored to what degree the visual dominance effect can be modulated, in part to answer
whether visual dominance is attentional in nature or sensory based. For example, Sinnett et al. (2007) presented a higher ratio of unimodal auditory events and did indeed observe a reduction in visual dominance, but not a reversal (i.e., auditory dominance). Furthermore, these same authors also showed that visual dominance could be modulated by manipulating the attentional resources available in either the auditory or visual modality. For instance, visual dominance effects were larger when more visual attentional resources were available, suggesting that the effect must be at least partly based on an attentional mechanism. Koppen and Spence (2007) also demonstrated that increasing the frequency of bimodal trials to 60% recreates the visual dominance effect, whereas increasing the frequency to 90% nullifies any bias. These recent demonstrations show that the visual dominance effect can be modulated; however, a complete reversal to auditory dominance still largely eludes researchers. Visual dominance effects persist even under conditions that typically favor auditory dominance. The repetition detection task has typically shown advantages for the auditory modality (Welch & Warren, 1980; 1986; Welch, Duttonhurt, & Warren, 1986). A recent investigation by Ngo, Sinnett, Soto-Faraco, and Spence (2010) used such a task (see Soto-Faraco & Spence, 2002) to examine whether a task that favors the auditory modality would lead to a reversal of the visual dominance effect. A stream of images and sounds was presented, and participants were instructed to respond with three different keys whenever an image repeated on consecutive trials, a sound repeated, or both repeated. Despite the task favoring the auditory modality, significant visual dominance effects were still observed. A distinct possibility explaining visual dominance effects could be related to the response set that the participant is required to use. For instance, in the original Colavita (1974) experiment, responses were recorded from two different buttons, one for visual responses and one for auditory responses, with both pressed for bimodal responses. Recent examples (see Koppen et al., 2007; Sinnett et al., 2007) have required responses using three buttons. Interestingly, Sinnett et al. (2008) demonstrated response inhibition on bimodal trials when participants were asked to respond with three different keys. However, a facilitation effect for bimodal trials was observed when participants only had to press one button for the presence of any unimodal auditory, unimodal visual, or bimodal target. Their results suggest that both multisensory facilitation and inhibition can be observed when responding to the same bimodal event, depending on how the response is given. Thus, it is possible that visual dominance may be explained, at least in part, by some form of response-related artifact. Directly addressing this question, Robinson, Ahmar, and Sloutsky (2010) examined how quickly participants detected changes in visual and auditory information using a task that did not require participants to make an explicit response. Participants in this study were presented with
frequent stimuli (i.e., standards) and infrequent stimuli (i.e., oddballs) occurring in either modality while event-related potentials (ERPs) were collected during passive observation. The latency of the P300 component was assessed when visual and auditory oddballs were presented unimodally and when the same auditory and visual stimuli were paired together (i.e., bimodal presentation). Compared to the respective unimodal baselines, multimodal presentation slowed the processing of visual information (as indicated by an increased latency of the visual P300) and sped up the processing of auditory information (as indicated by a decreased latency of the auditory P300). Therefore, using a task that does not require participants to make a response, auditory dominance effects were observed, with auditory input delaying visual oddball detection. These findings suggest that previous examples of visual dominance may indeed be explained by a response bias. Further evidence suggesting that visual dominance might be explained by a response bias comes from infant studies in which auditory and visual processing is assessed by examining infants' looking times to visual and auditory compounds. For example, using familiarization and habituation tasks, infants familiarized to a visual and auditory compound stimulus often dishabituate when the auditory component changes at test but fail to dishabituate when the visual component changes at test (Lewkowicz, 1988a; 1988b; Robinson & Sloutsky, 2004; 2010a; Sloutsky & Robinson, 2008). This finding is noteworthy for two reasons. First, infants ably discriminate the same visual stimuli when presented unimodally, which suggests that the auditory stimulus interferes with processing of the visual stimulus. Second, the presence of the visual stimulus during habituation appears to have no cost on auditory processing. Thus, when using tasks that do not require an explicit response (e.g., passive ERP tasks, looking-time tasks with infants), auditory dominance effects are sometimes observed. However, it is unlikely that a response bias can fully account for auditory and visual dominance effects. For example, using an immediate recognition task, Sloutsky and Napolitano (2003) presented children and adults with a visual and auditory target stimulus followed by a visual and auditory test stimulus. If the target and test items are identical, children are instructed to respond "same"; if the picture changes, the sound changes, or both the picture and sound change, children are instructed to respond "different". Four-year-olds, but not adults, often fail to report that the picture changed, while at the same time ably discriminating the same visual stimuli when presented unimodally. These findings have been replicated using a variety of procedures examining children's responses to visual and auditory stimuli (e.g., Napolitano & Sloutsky, 2004; Robinson & Sloutsky, 2004), which suggests that factors other than response bias influence auditory and visual dominance effects. One potential explanation that may also account for some of the reported differences is the manner in which auditory
and visual dominance are measured. Whereas more traditional visual dominance paradigms have almost exclusively examined error rates to bimodal stimuli (e.g., Colavita, 1974; Sinnett et al., 2007), auditory dominance effects are often measured by directly comparing processing of a unimodal stimulus with processing of the same stimulus when presented multimodally. Auditory dominance occurs when multimodal presentation has greater costs on visual processing than on auditory processing (e.g., Robinson et al., 2010; Robinson & Sloutsky, 2004; 2010a; Sloutsky & Napolitano, 2003). Addressing whether sensory dominance is mediated by this methodological difference is the primary aim of the current study. In summary, applications of various methodologies for measuring sensory dominance have arrived at different conclusions. In addition to the response versus no-response issue, there are major differences in how dominance is measured. That is, examples of visual dominance often look at errors on bimodal trials, while examples of auditory dominance often look at how responses to visual and auditory stimuli are slowed when these stimuli are presented in a bimodal format. Therefore, the present study aims to reconcile those differences and to examine and disentangle the underlying mechanisms leading to auditory and visual dominance when processing multimodal information. In Experiment 1, we replicate the oddball task from Robinson et al. (2010) using a behavioral measure rather than ERP recordings. Robinson et al.'s findings challenge numerous findings of visual dominance, and a replication of their task using a methodology similar to traditional sensory dominance tasks will either validate or weaken the evidence for auditory dominance. Our experiment deviates from their paradigm in that we require a response from the participant.
Experiment 1

Method

Participants Thirty-three participants were recruited from The Ohio State University in exchange for course credit. Participants were naïve to the experiment and had normal or corrected-to-normal hearing. Written informed consent was obtained before the experiment began.

Materials The auditory and visual stimuli consisted of five novel monochromatic shapes created in Microsoft Word and exported as jpeg images (not exceeding 400 x 400 pixels), and five sounds created in CoolEdit 2000 (see Figure 1 for examples). The auditory stimuli were pure tones ranging between 200 Hz and 1000 Hz, varying in 200 Hz intervals; they were saved as 22 kHz files, and their volume ranged between 68 and 72 dB. A Dell 17" LCD displayed the images, sounds were presented via two Polk PLKRC651 wall-mount speakers on either side of the screen, and a Harmon Kardon AVR-154 receiver amplified the sounds. Of the five shapes and sounds, one of each was randomly chosen at the beginning of the experiment to serve as the standard, while the remaining stimuli served as oddballs. For each subject, the auditory and visual standards were randomly chosen at the onset of the experiment, and the same standards were used across the unimodal and bimodal conditions. While the auditory and visual stimuli differed in pitch and shape, respectively, standards differed from oddballs in their frequency of occurrence. In particular, the standard was presented 280 times in the unimodal conditions (approximately 78% of the time), whereas the oddballs were presented only 80 times. In the bimodal condition, the standard was presented 560 times (approximately 78% of the time), whereas the oddballs were presented 160 times (80 auditory oddballs and 80 visual oddballs). A subset of the oddballs differed from the standard on two dimensions (i.e., shape and hue for visual oddballs and pitch and timbre for auditory oddballs). These oddballs were not included in Experiment 2; therefore, to allow for comparisons across Experiments 1 and 2, we excluded them when examining response times and accuracies in the current experiment.

Figure 1. Example of visual stimuli and overview of the experiment. A single asterisk represents a visual oddball, and two asterisks represent auditory oddballs.

Procedure There were three testing blocks: unimodal visual, unimodal auditory, and bimodal. In the bimodal condition, auditory oddballs were constructed by pairing the visual standard with an auditory oddball, and visual oddballs were constructed by pairing the auditory standard with a visual oddball. The same oddballs were used in the unimodal conditions; however, visual stimuli were presented in silence (unimodal visual condition) or auditory stimuli were not paired with pictures (unimodal auditory condition). In order to reduce any possible response bias, the procedure for each condition was identical: participants were required to detect oddballs as quickly as possible by depressing any of the buttons on a response pad. The presentation order of the auditory, visual, and bimodal blocks was pseudo-randomized, with half of the participants starting with the bimodal task and finishing with the unimodal tasks (order randomized), while the other half began with the unimodal tasks (order randomized) and finished with the bimodal task. A short practice session began each block to ensure that the participant understood the instructions. Feedback was
given when participants false alarmed to the standard or missed an oddball. Each trial began with the presentation of a stimulus (standard or oddball) for 200 ms, and the interstimulus interval varied randomly from 1000 to 1400 ms. In the unimodal visual and bimodal tasks, a blank white screen followed the presentation of the stimuli before the next trial began. For the unimodal auditory task, participants focused on a piece of paper taped to the top of the LCD monitor. The task in all three conditions was to respond to the oddballs by pressing any of the buttons on a response pad as quickly and as accurately as possible. Thus, participants made the same response when the sound changed, the shape changed, or when both the shape and sound changed. The experiment took approximately 40 minutes to complete, with the bimodal task requiring approximately 20 minutes and each unimodal task approximately 10 minutes.
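For illustration only, the following Python sketch (not the software used in the actual experiment; all names are hypothetical) shows how a bimodal block matching the numbers above could be constructed: 560 standards and 160 oddballs (roughly 78% standards), with each stimulus shown for 200 ms and an interstimulus interval drawn at random between 1000 and 1400 ms.

    import random

    def build_bimodal_block(n_standards=560, n_auditory_oddballs=80, n_visual_oddballs=80):
        # Return a shuffled trial list for the bimodal condition (illustrative sketch).
        trials = (["standard"] * n_standards
                  + ["auditory_oddball"] * n_auditory_oddballs
                  + ["visual_oddball"] * n_visual_oddballs)
        random.shuffle(trials)
        return trials

    def trial_timing():
        # Each stimulus is shown for 200 ms; the ISI varies randomly between 1000 and 1400 ms.
        stimulus_ms = 200
        isi_ms = random.uniform(1000, 1400)
        return stimulus_ms, isi_ms

    trials = build_bimodal_block()
    print(len(trials), trials.count("standard") / len(trials))  # 720 trials, ~0.78 standards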
Results and Discussion

Oddball detection (proportion of hits to oddballs minus proportion of false alarms to standards) approached ceiling across all oddball types in Experiments 1 and 2 (all accuracies > 97%). Therefore, primary analyses focused on response times to oddballs across the different oddball types (mean response times are presented on the left side of Figure 2). A Modality (Auditory vs. Visual) x Presentation Mode (Unimodal vs. Bimodal) ANOVA revealed a main effect of Presentation Mode, F (1, 32) = 5.03, p < .05, and a Modality x Presentation Mode interaction, F (1, 32) = 9.53, p < .005. Paired t tests revealed that visual oddball detection was significantly slower when visual oddballs were paired with sounds (412 ms) than when they were presented unimodally (388 ms), t (32) = -4.66, p < .001, while no differences were found when auditory oddballs were presented unimodally (391 ms) or bimodally (386 ms), p > .47.

Figure 2. Mean response times (ms) to auditory and visual oddballs in the unimodal and bimodal conditions of Experiment 1 (synchronous) and Experiment 2 (asynchronous).
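As a rough illustration of the analysis just reported (not the authors' analysis code; the data below are placeholders), the Python sketch runs the 2 (Modality) x 2 (Presentation Mode) repeated measures ANOVA and the two paired t tests on per-participant mean response times. Oddball detection accuracy (hits minus false alarms) would be computed analogously per participant before confirming it is at ceiling.

    import numpy as np
    import pandas as pd
    from scipy.stats import ttest_rel
    from statsmodels.stats.anova import AnovaRM

    # Placeholder data standing in for per-participant mean RTs (ms); in the real
    # analysis each row would be one participant's mean RT in one of the four cells.
    rng = np.random.default_rng(0)
    cells = [("auditory", "unimodal"), ("auditory", "bimodal"),
             ("visual", "unimodal"), ("visual", "bimodal")]
    rows = [{"subject": s, "modality": m, "mode": p, "rt_ms": rng.normal(400, 20)}
            for s in range(33) for (m, p) in cells]
    rt = pd.DataFrame(rows)

    # 2 (Modality) x 2 (Presentation Mode) repeated measures ANOVA on response times
    print(AnovaRM(rt, depvar="rt_ms", subject="subject", within=["modality", "mode"]).fit())

    # Paired t tests: unimodal vs. bimodal oddball detection within each modality
    wide = rt.pivot_table(index="subject", columns=["modality", "mode"], values="rt_ms")
    print(ttest_rel(wide[("visual", "unimodal")], wide[("visual", "bimodal")]))
    print(ttest_rel(wide[("auditory", "unimodal")], wide[("auditory", "bimodal")]))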
The main results of Experiment 1 replicate the auditory dominance found by Robinson et al. (2010) when examining the latency of oddball detection in a behavioral task. In particular, while pairing the auditory standard with a visual oddball slowed down visual oddball detection, the presence of the visual standard had no effect on auditory oddball detection. This suggests that the modalities are competing for attention; however, it is unclear when this competition occurs in the course of processing. If the competition occurs during encoding, with auditory input engaging attention prior to visual input (cf. Robinson & Sloutsky, 2010b), then it should be possible to attenuate or reverse these effects by presenting visual input prior to auditory input.
Experiment 2

The primary goal of Experiment 2 was to determine whether auditory dominance could be attenuated or reversed by presenting visual input 100 ms prior to auditory input. If competition occurs during encoding, with auditory input being faster to engage attention, then it may be possible to reverse these effects by giving the visual stimulus a chance to engage attention before presenting the auditory stimulus. However, if auditory dominance effects occur late in the course of processing (e.g., during the response/decision phase), then manipulating the relative onset of the visual stimulus may have little or no effect on participants' responses to auditory and visual oddballs.
Method

Participants, Materials, and Procedure Thirty-seven participants were recruited from The Ohio State University. Participant recruitment was identical to Experiment 1. The procedure was also identical to Experiment 1, with the following exception: visual stimuli appeared 100 ms prior to the auditory stimulus and were presented for 300 ms (i.e., the same offset as the 200 ms auditory stimulus). As in Experiment 1, the standard was presented 280 times in the unimodal conditions (approximately 78% of the time), and oddballs were presented 80 times. In the bimodal condition, the standard was presented 560 times (approximately 78% of the time), and oddballs were presented 160 times (80 auditory oddballs and 80 visual oddballs). All oddballs in the current experiment differed from the standard on one dimension (i.e., either shape or pitch); thus, mean response times were averaged across all auditory and visual oddballs.
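A minimal Python sketch of the Experiment 2 trial timing (illustrative only; the helper name is hypothetical): the visual stimulus leads by 100 ms and remains visible for 300 ms, so the visual stimulus and the 200 ms auditory stimulus share a common offset.

    def bimodal_trial_schedule(trial_start_ms=0):
        # Return onset/offset times (ms) for one bimodal trial in Experiment 2 (sketch).
        visual_onset = trial_start_ms            # visual leads
        auditory_onset = visual_onset + 100      # auditory begins 100 ms later
        visual_offset = visual_onset + 300       # visual shown for 300 ms
        auditory_offset = auditory_onset + 200   # auditory lasts 200 ms -> same offset
        assert visual_offset == auditory_offset  # common offset, as described above
        return {"visual": (visual_onset, visual_offset),
                "auditory": (auditory_onset, auditory_offset)}

    print(bimodal_trial_schedule())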
Results and Discussion
Mean response time data are presented on the right side of Figure 2. A repeated measures ANOVA was conducted on the data with Modality (Auditory vs. Visual) and Presentation Mode (Unimodal vs. Bimodal) as factors. The analysis revealed a main effect of Presentation Mode, F (1, 36) = 5.02, p < .05, and a Modality x Presentation Mode interaction, F (1, 36) = 33.46, p < .001. The interaction arose due to response times to auditory oddballs being
slightly faster when paired with the visual standard (379 ms) than when presented unimodally (391 ms), t (36) = 1.90, p = .066. At the same time, visual oddball detection was slower in the bimodal condition (404 ms) than in the unimodal condition (372 ms), t (36) = 7.10, p < .001. In Experiment 3, which required separate responses for unimodal auditory, unimodal visual, and bimodal oddballs, errors on bimodal trials were significantly biased toward the visual modality, ts > 3.97, ps < .001, indicating a heightened degree of multisensory competition when responding with multiple keys.
General Discussion

The sensory dominance literature has largely been dominated by findings supporting visual dominance (Colavita, 1974; Koppen et al., 2007; Sinnett et al., 2007; Spence, 2009). Recently, Robinson et al. (2010) challenged this long-standing notion by demonstrating auditory dominance in adults. While their task did not require responses and utilized an oddball paradigm, Experiment 1 of the current study demonstrated auditory dominance using a similar paradigm but requiring a behavioral response (i.e., one button for all responses). Furthermore, when visual stimuli were presented in advance of auditory stimuli in Experiment 2, auditory dominance persisted, suggesting that auditory dominance effects may occur after initial encoding. Accordingly, the results of this paper strengthen the position that auditory dominance can be observed in adults, dovetailing with a number of demonstrations with infants and young children (Lewkowicz, 1988a; 1988b; Robinson & Sloutsky, 2004; 2010a; Sloutsky & Robinson, 2008). Yet this oddball paradigm differs from traditional visual dominance tasks in two key ways. While the oddball paradigm does pit conflicting information from separate modalities against each other, only one response is required. When doing precisely this in a Colavita visual dominance task, Sinnett et al. (2008) found evidence for multisensory facilitation. However, the oddball task employed here led to consistent auditory dominance effects. While further research is needed to address these differing patterns of results, one could speculate at this point that the oddball task creates competition between the oddball stimulus (either visual or auditory) and the standard stimulus that accompanies the oddball (auditory or visual, respectively). This differs from the paradigm used by Sinnett et al. that led to facilitation, in that both the auditory and visual components of their target bimodal stimuli were unimodal targets, thereby leading to a redundant target effect. In a separate investigation we have included analogous double-oddball stimuli (i.e., both auditory and visual stimuli were infrequent) that did result in faster response latencies when compared with unimodal response latencies (Sinnett, Chandra, & Robinson,
in preparation). It should be noted that when using only one response, it is impossible to gauge error performance on bimodal trials (i.e., the number of unimodal-based responses to bimodal trials). Experiment 3 of the current study therefore required participants to make separate responses for unimodal auditory, unimodal visual, and bimodal oddballs, with the result being an abolishment of auditory dominance and a shift to visual dominance, as demonstrated by a significant bias toward the visual modality in error rates on bimodal trials. We would like to finish by proposing a theoretical possibility that requires future research and by commenting on the measurement of sensory dominance. First, given the contrary findings, it seems a distinct possibility that auditory and visual dominance can coexist. In fact, this might be likely, given the enhanced alerting capabilities of the auditory modality. That is, perhaps the auditory sense is dominant, but top-down attentional control mitigates this dominance depending on task difficulty. Therefore, as task difficulty increases, auditory dominance morphs into visual dominance. Note that reaction times to bimodal stimuli increased by nearly 250 ms across experiments. Thus, when designing sensory dominance experiments, it is crucial to manipulate task difficulty and to measure both response latency and accuracy, as it seems possible to design an experiment to substantiate either theoretical possibility.
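For concreteness, the Python sketch below (illustrative placeholder data, not the Experiment 3 dataset) shows one common way to quantify such a bias in error rates: tally, per participant, the bimodal trials answered with only the visual key versus only the auditory key, and compare the two error counts.

    import pandas as pd
    from scipy.stats import ttest_rel

    # Placeholder trial-level data for the bimodal trials of a three-response task:
    # each row is one bimodal trial and the key(s) the participant pressed.
    bimodal = pd.DataFrame({
        "subject":  [1, 1, 1, 2, 2, 2, 3, 3, 3],
        "response": ["visual", "both", "visual", "visual", "auditory",
                     "both", "visual", "visual", "both"],
    })

    # Unimodal-based errors on bimodal trials, tallied per participant
    errors = bimodal.groupby("subject")["response"].agg(
        visual_only=lambda r: (r == "visual").sum(),
        auditory_only=lambda r: (r == "auditory").sum())

    # A Colavita-style visual bias appears as more visual-only than auditory-only errors
    print(errors)
    print(ttest_rel(errors["visual_only"], errors["auditory_only"]))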
References

Calvert, G. A., Campbell, R., & Brammer, M. J. (2000). Evidence from functional magnetic resonance imaging of crossmodal binding in the human heteromodal cortex. Current Biology, 10, 649-657.
Colavita, F. B. (1974). Human sensory dominance. Perception & Psychophysics, 16, 409-412.
Frassinetti, F., & Bolognini, N. (2002). Enhancement of visual perception by crossmodal visuo-auditory interaction. Experimental Brain Research, 147, 332-343.
Hikosaka, O., Iwai, E., Saito, H., & Tanaka, K. (1988). Polysensory properties of neurons in the anterior bank of the caudal superior temporal sulcus of the macaque monkey. Journal of Neurophysiology, 60, 1615-1637.
Koppen, C., & Spence, C. (2007). Seeing the light: Exploring the Colavita visual dominance effect. Experimental Brain Research, 180, 737-754.
Koppen, C., Alsius, A., & Spence, C. (2008). Semantic congruency and the Colavita visual dominance effect. Experimental Brain Research, 184, 533-546.
Laurienti, P. J., & Kraft, R. A. (2004). Semantic congruence is a critical factor in multisensory behavioral performance. Experimental Brain Research, 158, 405-414.
Lewkowicz, D. J. (1988a). Sensory dominance in infants: 1. Six-month-old infants' response to auditory-visual compounds. Developmental Psychology, 24, 155-171.
Lewkowicz, D. J. (1988b). Sensory dominance in infants: 2. Ten-month-old infants' response to auditory-visual compounds. Developmental Psychology, 24, 172-182.
Meredith, M. A., & Stein, B. E. (1986). Visual, auditory, and somatosensory convergence on cells in superior colliculus results in multisensory integration. Journal of Neurophysiology, 56.
Napolitano, A. C., & Sloutsky, V. M. (2004). Is a picture worth a thousand words? The flexible nature of modality dominance in young children. Child Development, 75, 1850-1870.
Ngo, M. K., Sinnett, S., Soto-Faraco, S., & Spence, C. (2010). Repetition blindness and the Colavita effect. Neuroscience Letters, 480, 186-190.
Robinson, C. W., Ahmar, N., & Sloutsky, V. M. (2010). Evidence for auditory dominance in a passive oddball task. In S. Ohlsson & R. Catrambone (Eds.), Proceedings of the 32nd Annual Conference of the Cognitive Science Society (pp. 2644-2649). Austin, TX: Cognitive Science Society.
Robinson, C. W., & Sloutsky, V. M. (2004). Auditory dominance and its change in the course of development. Child Development, 75, 1387-1401.
Robinson, C. W., & Sloutsky, V. M. (2010a). Effects of multimodal presentation and stimulus familiarity on auditory and visual processing. Journal of Experimental Child Psychology, 107, 351-358.
Robinson, C. W., & Sloutsky, V. M. (2010b). Development of cross-modal processing. Wiley Interdisciplinary Reviews: Cognitive Science, 1, 135-141.
Sinnett, S., Soto-Faraco, S., & Spence, C. (2008). The co-occurrence of multisensory competition and facilitation. Acta Psychologica, 128, 153-161.
Sinnett, S., Spence, C., & Soto-Faraco, S. (2007). Visual dominance and attention: Revisiting the Colavita effect. Perception & Psychophysics, 69, 673-686.
Sloutsky, V. M., & Napolitano, A. (2003). Is a picture worth a thousand words? Preference for auditory modality in young children. Child Development, 74, 822-833.
Sloutsky, V. M., & Robinson, C. W. (2008). The role of words and sounds in visual processing: From overshadowing to attentional tuning. Cognitive Science, 32, 354-377.
Soto-Faraco, S., & Spence, C. (2002). Modality-specific auditory and visual temporal processing deficits. Quarterly Journal of Experimental Psychology, 55A, 23-40.
Spence, C. (2009). Explaining the Colavita visual dominance effect. Progress in Brain Research, 176, 245-258.
Welch, R. B., & Warren, D. H. (1980). Immediate perceptual response to intersensory discrepancy. Psychological Bulletin, 88, 638-667.
Welch, R. B., & Warren, D. H. (1986). Intersensory interactions. Handbook of Perception and Human Performance, 1, 25-36.
Welch, R. B., Duttonhurt, L. D., & Warren, D. H. (1986). Contributions of audition and vision to temporal rate perception. Perception & Psychophysics, 39, 294-300.