2011 2nd International Conference on Behavioral, Cognitive and Psychological Sciences IPCSIT vol.23 (2011) © (2011) IACSIT Press, Singapore
Musical sight-reading expertise: cross-modality investigations
Véronique Drai-Zerbib 1, Thierry Baccino 2
CHART/LUTIN (EA 4004), Paris, France
[email protected]
Abstract. It is often said that experienced musicians are capable of hearing what they read from a score, and vice versa. This suggests that they are able to process and integrate multimodal information. Does expertise in music rely on efficient cross-modal integration? This paper investigates this issue with two experiments. In the first, 30 expert and 31 non-expert musicians were required to report whether two successively presented fragments of classical music were the same or different. In half the conditions, the participants received both fragments visually (same-modality presentation); in the other half, they received one fragment auditorily and the other visually (cross-modal presentation). As expected, analyses of response times and errors showed that experts performed the task more accurately and rapidly than non-experts regardless of the presentation modality, whereas non-experts performed more accurately and rapidly under same-modality presentation. More experienced performers thus seem better able to transfer information from one modality to another. In the second experiment, 15 expert and 10 non-expert musicians were required to listen to, read, and perform classical piano excerpts. The experiment was run in two consecutive phases during which each excerpt was (1) read without playing and (2) sight-read (read and played). In half the conditions, the participants heard the music before the reading phases. The excerpts contained suggested fingering of variable difficulty (difficult, easy, or no fingering). Eye movements were recorded during the reading phases. Analyses of fixations and playing mistakes supported the modal independence of information among expert musicians observed in the first experiment, and further confirmed the cross-modal capacities of expert memory.
For experts, the mere viewing of a musical score may facilitate planning and preparation for motor execution, whereas non-experts do not appear to have this cross-modal integration ability. Results are discussed in terms of an amodal memory for expert musicians, in support of the theoretical work of Ericsson and Kintsch [24]: more experienced performers better integrate knowledge across modalities.
Keywords: Musical sight-reading, music perception, psychology of music, cross-modality, expert memory
1. Introduction
The composer Robert Schumann [1] claimed that an expert musician must "hear music from the page". Although this idea may seem obvious to most of us and is generally thought to result from extensive playing of a musical instrument, there is still no satisfactory scientific explanation of this ability. The underlying explanation has something to do with the ability to handle several sources of information in parallel. This ability (called the cross-modality effect) is involved in everyday life and is important to describe because of its adaptive value in enhancing learning, comprehension, and problem solving. Mixing several sources of information ensures coherent knowledge of our multidimensional world. In music, sight-reading consists of reading a score and playing it concurrently on an instrument. It is typically a cross-modal task, since it requires extracting visual information from the score while producing simultaneous motor responses (playing or singing), relying on auditory feedback. A commonsense idea is that expert musicians can hear what they read from a score and can visually represent the music they are listening to. In cognitive terms, this means that a kind of expert memory has developed over the years of practice, facilitating the shift between vision and audition and, to some extent, merging the two (i.e., the amodal memory hypothesis), whereas non-experts should keep the two types of processing separate (i.e., the modal memory hypothesis).
Moreover, fMRI studies have shown that there is an "intermodal transformation network of auditory and motor areas which is subject to a certain degree of plasticity by means of intensive training" [2].
2. Cross-modal conversion (modal memory and recoding hypothesis) versus cross-modal integration (amodal memory and integration hypothesis)
Empirical studies using a variety of methodologies (response times, ERPs, etc.) have attempted to demonstrate various cross-modality effects, mostly by manipulating interference between different perceptual systems (olfaction, audition, vision, etc.) [3, 4, 5]. These studies can be classified into two categories, depending on their underlying hypothesis: (1) cross-modal conversion (recoding hypothesis), wherein information in one modality is converted into the other modality, and (2) cross-modal integration (amodal hypothesis), wherein information is not encoded in a modality-dependent way but is integrated at a higher level in an amodal representation. These two hypotheses are not mutually exclusive; they operate at different information-processing levels (perceptual for the former, conceptual for the latter) and depend on the individual's prior knowledge and skill level in the activity or task to be carried out. The amodality hypothesis is brought to bear in cases where a higher representation level must be accessed in order to integrate information from various sources. This integrating role is often assigned to memory representations and to the inference processes that build mental representations. For language in particular, the amodality of semantic memory has been confirmed [3, 6]. Based on an ERP study using a semantic priming task, Holcomb and Anderson [3] argued that amodal semantic representations are accessed by way of modality-specific encoding mechanisms. Further support for the amodality hypothesis has been obtained recently in brain imaging studies (fMRI, MEG), where cerebral structures generally associated with different perceptual modalities were shown to be interconnected or overlapping [7, 8, 9].
Moreover, intracranial recordings from human subjects have provided direct electrophysiological evidence for audio-visual multisensory integration in the superior parietal lobule [10]. In a previous study, recording the eye movements of expert and non-expert pianists, we observed amodal memory in piano sight-reading, crossing visual and auditory information [11]. Skilled musicians were found to have very low sensitivity to the written form of the score and seemed to reactivate a representation of the musical passage from the material they had listened to. In contrast, less skilled musicians were very dependent on the written code and on the input modality, and had to build a new representation based on visual cues (Fig. 1).
Fig. 1: Eye tracking of expert and non-expert pianists (Drai-Zerbib & Baccino, 2005).
3. Expert Memory
Memory is the basis not only for the mechanisms of visual encoding [12, 13] and information retrieval [14], but also for inference-making processes [15]. Saying that the expert musician's memory is amodal means that experts encode musical information independently of the input modality and can retrieve it regardless of how the information was perceived (visually or auditorily). It follows that perceptual cues might be less important for experts, since they are capable of using their musical knowledge to compensate for missing [11] or incorrect information [1, 16, 17]. A more experienced performer may have both better representations of musical structure and a better ability to apply them, or only one of these attributes. Conversely, less expert musicians, who probably do not possess this ability, can be assumed to go through a slower recoding phase.
4. Motor involvement in music reading
A study using brain imaging showed that the same cortical areas were activated whether piano music was being read or played [18], which is consistent with the idea that music reading involves a sensorimotor transcription of the music's spatial code [19]. One of the sensorimotor activities taking place during music
reading on the piano consists of anticipating the positions of the fingers. The presence of fingering on the score aids this anticipation process by providing visuo-motor cues, and helps the pianist find the fingering combinations prescribed for virtuoso playing [20]. Motor areas also seem to be active when people listen to musical rhythms [21], even when they do not play an instrument or sing themselves. Moreover, an increased representation in premotor and primary motor cortex has been observed for muscles trained by imagery alone [22]. When a musical excerpt is presented for reading, auditory imagery of the future sound is activated [23]. Thus, expertise in sight-reading, as in all complex activities, relies heavily on memory structures that seem to be amodal. Our research addresses the question of how this amodal kind of memory representation is the keystone of musical expertise. Could expert musicians' ability to hear what they read result from the amodal nature of the expert musician's memory?
5. Experiments
In order to investigate whether expertise in music relies on efficient cross-modal integration, we ran two experiments. In the first experiment, 30 expert musicians (more than 12 years of academic musical practice) and 31 non-expert musicians (5 to 8 years of academic musical practice) were required to report whether two successively presented fragments (shown on a computer screen) were the same or different. Forty-four excerpts of classical piano music were used; each fragment was 4 measures long. In half the conditions, both fragments were presented visually; in the other half, the first fragment was presented auditorily and the second visually. Musicians had to right-click with the mouse whenever the fragments were the same and left-click whenever they were different. If they found a difference, they had to indicate the measure in which they found it. ANOVAs were run on response times, global errors (based on the same/different response), and local errors (based on the localisation of the note modification). As expected, the experts performed the task more rapidly than non-experts, F(1, 59) = 15.91, p
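The scoring scheme described above distinguishes global errors (a wrong same/different judgment) from local errors (a correct "different" judgment but a wrong measure located), aggregated per expertise group and presentation condition. The following is a minimal sketch of that scoring step, under an assumed trial-record layout; the field names and example values are hypothetical illustrations, not data from the study.

```python
from statistics import mean

# Hypothetical trial records (not actual study data):
# (group, modality, response, truth, located_measure, true_measure, rt_ms)
trials = [
    ("expert", "cross",      "different", "different", 2,    2, 850),
    ("expert", "same-modal", "same",      "same",      None, None, 790),
    ("novice", "cross",      "same",      "different", None, 3, 1400),
    ("novice", "same-modal", "different", "different", 1,    3, 1220),
]

def score(trials):
    """Aggregate mean RT, global-error rate, and local-error rate per (group, modality) cell."""
    cells = {}
    for group, modality, resp, truth, loc, true_loc, rt in trials:
        cell = cells.setdefault((group, modality), {"rt": [], "glob": [], "loc": []})
        cell["rt"].append(rt)
        cell["glob"].append(resp != truth)        # global error: wrong same/different judgment
        if truth == "different" and resp == "different":
            cell["loc"].append(loc != true_loc)   # local error: right judgment, wrong measure
    return {
        k: {
            "mean_rt": mean(v["rt"]),
            "global_err": mean(v["glob"]),
            "local_err": mean(v["loc"]) if v["loc"] else None,
        }
        for k, v in cells.items()
    }
```

Cell means produced this way would then feed the group x modality ANOVAs reported in the text; the ANOVA itself is not sketched here.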