Alma Mater Studiorum University of Bologna, August 22-26 2006
Setting words to music: Effects of phoneme on the experience of interval size

Frank A. Russo1
Department of Psychology, University of Toronto, Mississauga, ON, Canada

Dominique Vuvan
Department of Psychology, University of Toronto, Mississauga, ON, Canada

William F. Thompson
Department of Psychology, University of Toronto, Mississauga, ON, Canada

ABSTRACT
The experience of pitch relations is known to be influenced by timbral brightness. Because phonemes naturally vary in their extent of brightness, we wondered whether the experience of pitch relations in vocal music was influenced by phoneme content. Three-tone sequences were synthesized so as to vary with regard to both pitch and phoneme content. Perceived interval size was found to vary as a function of the interaction between normalized spectral centroid and pitch contour, with larger pitch distances associated with sequences possessing congruent contours. Proposals concerning the underlying mechanism and various implications of the findings are discussed.

Keywords
Interval size, pitch, timbre, phoneme, lyrics, vocal music

In: M. Baroni, A. R. Addessi, R. Caterina, M. Costa (2006) Proceedings of the 9th International Conference on Music Perception & Cognition (ICMPC9), Bologna/Italy, August 22-26 2006. ©2006 The Society for Music Perception & Cognition (SMPC) and European Society for the Cognitive Sciences of Music (ESCOM). Copyright of the content of an individual paper is held by the primary (first-named) author of that paper. All rights reserved. No paper from this proceedings may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information retrieval systems, without permission in writing from the paper's primary author. No other part of this proceedings may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information retrieval system, without permission in writing from SMPC and ESCOM.

ISBN 88-7395-155-4 © 2006 ICMPC

INTRODUCTION
Recent research has demonstrated that the experience of interval size is influenced by several variables not directly related to pitch distance. These contextual variables include spectral centroid (Russo & Thompson, 2005), pitch register and pitch direction (Russo & Thompson, in press), and in the case of vocal music, facial expression (Thompson, Graham, & Russo, 2005; also see proceedings in “Media” Symposium, ICMPC9). The influence of spectral centroid is somewhat surprising because its perception is generally considered to be a component of timbre (i.e., brightness), and orthogonal to pitch perception. The current investigation sought to uncover contextual effects of phonemic content on the experience of pitch relations in vocal music.

Russo & Thompson (2005) presented listeners with musical intervals where the timbres of the two component tones varied. Congruent trials were those where an ascending interval was accompanied by a dull to bright timbral shift, or a descending interval was accompanied by a bright to dull timbral shift. Incongruent trials were those where an ascending interval was accompanied by a bright to dull timbral shift, or a descending interval was accompanied by a dull to bright timbral shift. Both musically untrained and musically trained participants rated congruent intervals as larger than incongruent intervals even when the actual pitch change was held constant.

Like timbre, phonemes vary with respect to brightness (i.e., spectral centroid). If the same mechanism linking pitch relations and timbre is at work in vocal music, it can be predicted that ascending intervals will be perceived as larger when accompanied by a shift from dull to bright phoneme and that descending intervals will be perceived as larger when accompanied by a shift from bright to dull phoneme. We therefore expected an interaction between pitch contour and the normalized spectral centroid contour, with larger estimates of interval size corresponding to congruent shifts in pitch and normalized spectral centroid. We also expected that the strength of the interaction would be dependent on the extent of the normalized centroid change.
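The congruency prediction can be illustrated with a minimal Python sketch. It assumes the centroid ordering reported in the Stimuli section (/da:/ high, /du/ intermediate, /di/ low); the numeric ranks and the function names `contour` and `is_congruent` are hypothetical and serve only to encode that ordering.

```python
# Ordinal brightness ranks, following the ordering reported in the Stimuli
# section (/da:/ high, /du/ intermediate, /di/ low). The numbers themselves
# are illustrative placeholders, not measured centroid values.
CENTROID_RANK = {"da:": 3, "du": 2, "di": 1}


def contour(outer: float, centre: float) -> str:
    """Name the contour of an outer-centre-outer pattern."""
    if centre == outer:
        raise ValueError("centre and outer values must differ")
    return "rise-fall" if centre > outer else "fall-rise"


def is_congruent(outer_syllable: str, centre_syllable: str, pitch_contour: str) -> bool:
    """True when the centroid contour has the same shape as the pitch contour."""
    centroid_contour = contour(
        CENTROID_RANK[outer_syllable], CENTROID_RANK[centre_syllable]
    )
    return centroid_contour == pitch_contour
```

Under these assumptions, `is_congruent("da:", "di", "fall-rise")` is true and `is_congruent("da:", "di", "rise-fall")` is false, which matches the /da:/di/da:/ example given in the Stimuli section.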
METHODS

Participants
Seventeen participants with varying levels of musical experience were recruited from an introductory psychology course at the University of Toronto at Mississauga. These participants included 8 males and 9 females, ranging in age from 18 to 21. All received course credit for their contribution. No participants had abnormal hearing.

Stimuli
To minimize uncontrolled variability in the production of phonemes, all stimuli were presented in the form of a standard consonant-vowel syllable structure, synthesized by VocalWriter 2.0 software (KAE Labs, 2005) running on a Power Mac computer. Three syllables were produced at varying pitch levels: /da:/, /di/, and /du/. Across pitch levels, the relative frequency of the normalized spectral centroid was high for /da:/, low for /di/, and intermediate for /du/. The synthesized syllables were combined to form “syllable sandwiches” in which two identical syllables produced at the same pitch surrounded a unique central syllable produced at a different pitch. Each syllable sandwich followed either a rise-fall or a fall-rise pitch contour, with interval size corresponding to a perfect fifth (seven semitones) or a tritone (six semitones). Changes in the normalized spectral centroid of syllable sandwiches were either congruent or incongruent with the pitch contour change. For example, /da:/di/da:/ is incongruent with a rise-fall pitch contour but congruent with a fall-rise pitch contour. The change in normalized centroid values was largest for syllable sandwiches combining /da:/ and /di/ (33%), smallest for syllable sandwiches combining /da:/ and /du/ (7%), and intermediate for syllable sandwiches combining /du/ and /di/ (24%).

Syllable sandwiches were produced at three different pitch transpositions (low, middle, high) in order to keep participants focused on pitch change and not absolute pitch. For the rise-fall pitch contour, the anchor tone was F2, G2, or A2, and for the fall-rise pitch contour, the anchor tone was C3, D3, or A3. The orthogonal manipulation of all of these dimensions gave 72 unique stimuli (6 syllable sandwiches × 2 pitch contours × 2 interval sizes × 3 transpositions).

Procedure
Participants were briefed as to the general form of the syllable sandwiches they would be hearing, and instructed to judge the interval size between the book-ending syllables and the central syllable on a 5-point scale, with “1” being a very small pitch change and “5” being a very large pitch change. They were advised to make each judgment as quickly as possible. Stimuli were blocked by pitch contour. Within each pitch contour block, the trials were independently randomized in 2 sets of 36, thus yielding 2 repetitions of each stimulus and a total of 72 trials. The participants were given each block twice, for a total of 4 repetitions of each stimulus (288 trials). Block order was counterbalanced across participants so as to minimize any possible carry-over effects. At the beginning of the first instance of each block type, participants received either five practice trials, or as many as they needed to become familiar with the task. Stimuli were presented and responses recorded with Experiment Creator Software (Thompson, 2005) running on an eMac computer.
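The stimulus and trial counts given in the Methods can be checked with a short Python sketch. It enumerates the factorial design and builds one participant's schedule. The function names are hypothetical, and the alternating block order is an assumption: the text states only that block order was counterbalanced and that each block was given twice. This is not a reconstruction of the Experiment Creator configuration.

```python
import itertools
import random
from collections import Counter

SYLLABLE_SANDWICHES = [
    ("da:", "di", "da:"), ("di", "da:", "di"),
    ("da:", "du", "da:"), ("du", "da:", "du"),
    ("di", "du", "di"), ("du", "di", "du"),
]
PITCH_CONTOURS = ["rise-fall", "fall-rise"]
INTERVALS = ["P5", "TT"]  # perfect fifth (7 semitones), tritone (6 semitones)
TRANSPOSITIONS = ["low", "middle", "high"]


def build_stimuli():
    """Cross all four factors: 6 x 2 x 2 x 3 = 72 unique stimuli."""
    return list(itertools.product(
        SYLLABLE_SANDWICHES, PITCH_CONTOURS, INTERVALS, TRANSPOSITIONS
    ))


def build_block(stimuli, pitch_contour, rng):
    """One block: the 36 stimuli of one contour, shuffled independently twice."""
    subset = [s for s in stimuli if s[1] == pitch_contour]
    trials = []
    for _ in range(2):
        shuffled = subset[:]
        rng.shuffle(shuffled)
        trials.extend(shuffled)
    return trials


def build_session(stimuli, first_contour, rng):
    """Four blocks per participant; the alternating order is an assumption."""
    second = next(c for c in PITCH_CONTOURS if c != first_contour)
    order = [first_contour, second, first_contour, second]
    return [build_block(stimuli, c, rng) for c in order]


def count_repetitions(session):
    """Count how often each unique stimulus occurs across all blocks."""
    return Counter(trial for block in session for trial in block)
```

Calling `build_session(build_stimuli(), "rise-fall", random.Random(0))` yields four blocks of 72 trials, 288 trials in total, with every stimulus presented four times. Counterbalancing would be sketched by passing `"fall-rise"` as `first_contour` for half of the participants.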
RESULTS
As expected, the strength of the interaction between pitch contour and the normalized centroid contour varied across syllable sandwiches, depending on the extent of the difference between the normalized centroid values of the component syllables. Results from each type of syllable sandwich are reported separately.
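To make the interaction concrete, the following Python sketch expresses it as a contrast over the four cell means of a 2 × 2 design (normalized centroid contour by pitch contour). The function names are hypothetical and the sketch does not reproduce the analysis of variance. Any numbers passed to it for illustration would be placeholders, not data from this study.

```python
CONTOURS = ("rise-fall", "fall-rise")


def congruency_effect(cell_means):
    """Mean rating in congruent cells minus mean rating in incongruent cells.

    cell_means maps (centroid_contour, pitch_contour) to a mean rating on
    the 5-point scale. A cell is congruent when both contours match.
    """
    congruent = [cell_means[(c, c)] for c in CONTOURS]
    incongruent = [
        cell_means[(a, b)] for a in CONTOURS for b in CONTOURS if a != b
    ]
    return sum(congruent) / len(congruent) - sum(incongruent) / len(incongruent)


def interaction_contrast(cell_means):
    """Difference between the pitch-contour effects of the two centroid contours."""
    rise_fall, fall_rise = CONTOURS
    effect_for_rise_fall_centroid = (
        cell_means[(rise_fall, rise_fall)] - cell_means[(rise_fall, fall_rise)]
    )
    effect_for_fall_rise_centroid = (
        cell_means[(fall_rise, rise_fall)] - cell_means[(fall_rise, fall_rise)]
    )
    return effect_for_rise_fall_centroid - effect_for_fall_rise_centroid
```

In a 2 × 2 design the interaction contrast is exactly twice the congruency effect, so a positive contrast corresponds to larger ratings for congruent than for incongruent syllable sandwiches.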
Large Changes in Normalized Spectral Centroid (/di/da:/di/ vs. /da:/di/da:/)
There was a significant main effect of interval (F [1, 16] = 21.12, p < .0001), meaning that participants were able to distinguish between the perfect fifth (P5) and tritone (TT) syllable sandwiches. As expected, there was a significant interaction between pitch contour and syllable sandwich (F [1, 16] = 12.60, p < .01). As may be seen in Figure 1, syllable sandwiches with congruent shifts in normalized spectral centroid led to larger judgments of interval size than syllable sandwiches with incongruent shifts in normalized spectral centroid. All other effects (direction, phoneme, repetition, transposition, and remaining interactions) were found to be non-significant (p > .05).

1 Frank A. Russo is currently at the Department of Psychology, Ryerson University, 350 Victoria Street, Toronto, Canada, M5B 2K3.
Figure 1: Perceived interval size for intervals involving large changes in normalized spectral centroid. [Plot: ratings on a scale from 1 to 5 by pitch contour (rise-fall, fall-rise) for /di/da:/di/ and /da:/di/da:/.]

Intermediate Changes in Normalized Spectral Centroid (/di/du/di/ vs. /du/di/du/)
There was a significant main effect of interval (F [1, 16] = 23.32, p < .0001), indicating that participants were sensitive to the difference between P5 and TT trials. Although there was no significant interaction between pitch contour and the normalized centroid contour (F [1, 16] = 2.89, p > .05), a trend emerged that was consistent with that observed for syllable sandwiches involving /da:/ and /di/ (see Figure 2). All other effects were found to be non-significant (p > .05).

Figure 2: Perceived interval size for intervals involving intermediate changes in normalized spectral centroid. [Plot: ratings on a scale from 1 to 5 by pitch contour (rise-fall, fall-rise) for /di/du/di/ and /du/di/du/.]

Small Changes in Normalized Spectral Centroid (/da:/du/da:/ vs. /du/da:/du/)
There was a significant main effect of interval (F [1, 16] = 14.61, p < .01), indicating that participants were sensitive to the difference between P5 and TT trials. The interaction between pitch contour and normalized centroid contour was not significant (F < 1). This was an expected finding in that the normalized centroids of /da:/ and /du/ are quite similar. All other effects, as with the previous two analyses, were non-significant (p > .05).

Figure 3: Perceived interval size for intervals involving small changes in normalized spectral centroid. [Plot: ratings on a scale from 1 to 5 by pitch contour (rise-fall, fall-rise) for /du/da:/du/ and /da:/du/da:/.]
DISCUSSION
Based on the results of this preliminary investigation, it appears that phonemic content may exert a small but reliable influence on the experience of relative pitch in normal music listening. This influence may be due to an emphasis on spectral pitch over periodicity pitch in the experience of relative pitch. The determination of interval size may involve a comparison of energy across all resolved partials in addition to more refined pitch estimates derived from temporal analysis of neural firing patterns. By contrast, interval naming may emphasize temporal analysis, thereby minimizing any influence of phonemic shifts.

An alternative explanation is that the observed influence on interval size judgments was actually caused by simple pitch distortions. Different vowel sounds are associated with unique intrinsic pitch levels (Whalen & Levitt, 1995; Sapir, 1989; Ladd & Silverman, 1984). For instance, with everything else equal, “high” vowels (those with relatively high values of F2, e.g., /di/) are produced with a higher fundamental frequency (F0) than “low” vowels (those with relatively low values of F2, e.g., /da:/). In the case of synthetic vowels where such minor adjustments in tuning are absent, there may be a compensatory expansion or contraction of perceived interval size. The merit of this alternative explanation could be assessed by conducting pitch matching experiments with the syllables tested here. If the observed distortions in pitch are large enough to account for the effects observed here, then this alternative explanation would be supported. However, in the case of synthetic timbres, this pitch distortion explanation has not been supported (Russo & Thompson, 2005).

Regardless of the underlying mechanism, an important implication of these findings is that lyrics may influence the experience of pitch relations in a melody. It is possible that congruent changes in pitch and normalized centroid contours may sound more natural than incongruent changes. Other possibilities include the potential role words may play in naturally accenting a prosodic line. If this sort of phonemic accenting does occur, it would imply that prosodic and phonemic processing interact on some level despite evidence suggesting their dissociation.

In summary, this investigation represents a first demonstration of the role of phonemic content in the processing of pitch relations. Although the findings appear promising, it will be important to follow up this work with more naturalistic stimuli produced by real singers and speakers.
ACKNOWLEDGMENTS
We thank David Beckford for his assistance with preparing the stimuli.
REFERENCES
KAE Labs. (2005). VocalWriter [Software]. Woodinville: Author.
Ladd, D. R., & Silverman, K. E. A. (1984). Vowel intrinsic pitch in connected speech. Phonetica, 41, 31-40.
Russo, F. A., & Thompson, W. F. (2005). An interval-size illusion: The influence of timbre on the perceived size of melodic intervals. Perception & Psychophysics, 67, 559-568.
Russo, F. A., & Thompson, W. F. (2005). The subjective size of melodic intervals over a two-octave range. Psychonomic Bulletin & Review, 12, 1068-1075.
Sapir, S. (1989). The intrinsic pitch of vowels: Theoretical, physiological and clinical considerations. Journal of Voice, 3, 44-51.
Thompson, W. F. (2005). Experiment Creator [Software]. Toronto: Author.
Thompson, W. F., Graham, P., & Russo, F. A. (2005). Seeing music performance: Visual influences on perception and experience. Semiotica, 156, 203-227.
Whalen, D., & Levitt, A. (1995). The universality of intrinsic F0 of vowels. Journal of Phonetics, 23, 349-366.