2018. Proc Ling Soc Amer 3. 24:1-9. https://doi.org/10.3765/plsa.v3i1.4311

Investigating a possible “musician advantage” for speech-in-speech perception: The role of f0 separation

Michelle D. Cohn*

* Thank you to Dr. Georgia Zellou and Dr. Santiago Barreda for their guidance with the experiment and helpful comments on the manuscript, and to Steven Ilagan, our undergraduate research assistant, for his help running the subjects. Author: Michelle D. Cohn, University of California, Davis ([email protected]).

Abstract. Does listeners’ musical experience improve their ability to perceive speech-in-speech? In the present experiment, musicians and nonmusicians heard two sentences played simultaneously: a target and a masker sentence that varied in terms of fundamental frequency (f0) separation. Results reveal that accuracy in identifying the target sentence was highest for younger musicians (relative to younger nonmusicians). No such difference was observed between older musicians and nonmusicians. These results provide support for musicians’ purported advantage for speech-in-speech – but the advantage is limited by listener age. This work is relevant to our understanding of cross-domain transfer of nonlinguistic experience on speech perception.

Keywords. speech-in-speech perception, fundamental frequency, cross-domain plasticity

1. Introduction. As listeners, we rarely experience ideal listening conditions; often we must contend with one or more competing speakers (e.g., in a busy café) to hear the target talker. Our ability to tease apart these competing voices, however, is not trivial, and listeners may vary in the strategies involved in successful speech-in-speech perception, such as in leveraging certain acoustic cues. Furthermore, with age, filtering out this noise (i.e., the competing signal) becomes an even more difficult task. An understanding of the mechanisms underlying successful speech perception amidst background babble – and of the sources of individual variation in this ability – is important for addressing this common concern for listeners and for informing our models of speech-in-noise perception (e.g., Anderson et al. 2013).

A number of groups have shown that introducing acoustic differences between the target and competing voice(s) improves intelligibility, including differences in spatial separation (Hawley, Litovsky & Colburn 1999), timing (Carhart, Tillman & Greetis 1969), amplitude (Brungart 2001), and fundamental frequency (f0) (Summerfield & Assmann 1991; Summers & Leek 1998). Furthermore, listeners’ ability to leverage these acoustic cues has been shown to vary according to their linguistic backgrounds. For example, speech perception in multitalker babble is more difficult for non-native than for native speakers (e.g., Mayo, Florentine & Buus 1997) and for accents listeners have less experience with (e.g., for French speakers listening to French- vs. British-accented English in babble; Pinet & Iverson 2010).

Many listeners also have another type of experience that may impact their ability to perceive speech in the presence of background speakers: musical training. Musicians have specialized auditory training involving fine-grained acoustic distinctions of musical sounds (e.g., pitch, amplitude, timing). Whether this experience can transfer to speech perception, however, is an unresolved question – some studies show musicians’ enhanced speech-in-speech perception relative to nonmusicians (Parbery-Clark et al. 2009; Strait et al. 2013; Vasuki et al. 2016; Başkent & Gaudrain 2016; Zendel et al. 2017; Meha-Bettison et al. 2018), while others report no significant difference between these groups (Ruggles, Freyman & Oxenham 2014; Boebinger et al. 2015; Madsen, Whiteford & Oxenham 2017) or an enhancement limited by musicians’ age (e.g., only for musicians aged ≥40 in Zendel & Alain 2012). Yet the majority of these studies did not control for f0 separation and fluctuation – two acoustic cues that lead to increased intelligibility in perceiving competing speakers (e.g., Summerfield & Assmann 1991; Patel, Xu & Wang 2010) – or for other speaker-related cues (e.g., using different talkers for the target and masker(s) in Boebinger et al. 2015; Madsen, Whiteford & Oxenham 2017).

2. Present Experiment. To address this gap in the literature (i.e., the need to control for acoustic characteristics of the target and masker(s)), the present experiment investigates the role of f0 separation in speech-in-speech perception across the two groups, controlling for f0 fluctuation, spatial separation, and amplitude, as well as for speaker-related cues, by using the same talker for the target and masker. Our focus on f0 separation is based on empirical work showing musicians’ enhanced encoding of f0 both in pure tones (Kishon-Rabin et al. 2011) and in language (e.g., detecting weakly incongruous prosodic contours in Schön, Magne & Besson 2004). We hypothesize that musicians (relative to nonmusicians) will better leverage small f0 differences between competing talkers for improved speech-in-speech perception. Such a transfer of skills developed in musical training, such as pitch perception, is supported by longitudinal studies showing increased fidelity in brainstem encoding of periodicity cues in speech sounds following musical training (e.g., Kraus et al. 2014; Tierney, Krizman & Kraus 2015) and by theoretical models of plasticity, such as the OPERA Hypothesis (Patel 2011). On the other hand, it is also possible that musicians and nonmusicians will perform similarly on the task; that is, all listeners may be equally sensitive to f0 manipulations, given the importance of f0 for intonational contours and as a cue for speaker gender.

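As a rough illustration of the manipulation at issue (the experiment’s actual resynthesis procedure is not described in this excerpt), a fixed f0 separation of n semitones corresponds to multiplying every f0 value in the masker’s pitch contour by 2^(n/12); because the scaling is multiplicative, the contour’s fluctuation pattern is preserved on a log scale. The sketch below shows one plausible way to impose such a shift with overlap-add resynthesis in Praat via the parselmouth Python library; the file names and the 2-semitone shift are illustrative placeholders, not values from this study.

```python
# Illustrative sketch only (not the procedure reported in this paper): impose a
# fixed f0 separation on a masker recording while preserving its f0 fluctuation.
# File names and the shift size are placeholders.
import parselmouth
from parselmouth.praat import call

SEMITONE_SHIFT = 2                    # hypothetical separation between target and masker
factor = 2 ** (SEMITONE_SHIFT / 12)   # n semitones = multiplying f0 by 2^(n/12)

masker = parselmouth.Sound("masker_same_talker.wav")

# Build a Praat Manipulation object (time step 0.01 s, pitch floor 75 Hz, ceiling 600 Hz)
manipulation = call(masker, "To Manipulation", 0.01, 75, 600)
pitch_tier = call(manipulation, "Extract pitch tier")

# Multiplying all pitch points by a constant shifts the whole contour by a fixed
# number of semitones, leaving its shape (f0 fluctuation) unchanged on a log scale.
call(pitch_tier, "Multiply frequencies", masker.xmin, masker.xmax, factor)
call([pitch_tier, manipulation], "Replace pitch tier")

# Resynthesize (overlap-add) and save the f0-shifted masker
shifted_masker = call(manipulation, "Get resynthesis (overlap-add)")
shifted_masker.save("masker_shifted_2st.wav", "WAV")
```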
2.1 SUBJECTS. Musicians (n=41) and nonmusicians (n=41) were all native English speakers with no prior experience with a tonal language and who reported normal hearing. Musicians had at least 10 years of musical training (x̅ = 23.26 yrs, sd = 14.59), while nonmusicians had minimal (