Northwestern University

MA/BA Thesis

Semantic Contextual Cues and Listener Adaptation to Foreign-Accented English

Author: Page Piccinini

Advisor: Dr. Ann Bradlow

May 8, 2009

Acknowledgements

I would like to sincerely thank Ann Bradlow for all of her guidance, not only on this paper, but for everything she has done for me over the last three years. I would also like to thank my second reader, Matt Goldrick, for his valuable insight. Thank you to Kristin Van Engen for teaching me so much about the lab and helping me get to this point, as well as to Midam Kim, Rachel Baker, and Melissa Baese-Berk. I am also grateful to Brady Clark and everyone in Ling 500 for their input and advice. To the Bradlow RAs Kelsey Mok and Sudha Ayala, I have greatly valued all your help over the years. And finally, thank you to Eric Kramer for dealing with all of my last-minute LaTeX questions.

Abstract

The context surrounding words can facilitate word recognition for native and non-native listeners. However, for degraded speech signals (e.g. speech in noise), non-natives may lose the ability to take advantage of contextual cues unless the speech is clearly produced (Bradlow and Alexander, 2007). This study investigated whether: 1) foreign-accented speech degrades the signal such that natives show a reduced ability to take advantage of context, and 2) both native and non-native listeners adapt to foreign-accented speech. Native and non-native listeners were exposed to accented speech in two blocks. In each block, half the sentence-final words appeared in high predictability contexts and half in low predictability contexts. Listeners were asked to identify the sentence-final words. In a single talker condition, natives benefitted from context and adapted to the accent. In a multi-talker condition natives also showed both context and adaptation effects, but context effects were only seen when several talkers shared the same L1, not for accented speech in general. In the single talker condition non-natives could not use context but did show adaptation. In the multi-talker condition non-natives benefitted from context and adapted; however, the context effect was only seen when listener and talker L1 matched. These results suggest that accented speech disrupts natives' ability to take advantage of context, but that this can be overcome by adaptation to a specific accent given enough exposure through the task. Non-natives benefit from a talker-listener L1 match, which allows them to function at higher processing levels. Both natives and non-natives can adapt to accented speech in general over time, whether the exposure is talker-specific or multi-talker.

Contents

1 Introduction
2 Experiment 1 - Native Listeners
   2.1 Materials
   2.2 Procedures
   2.3 Subjects
   2.4 Results
   2.5 Discussion
3 Experiment 2 - Non-native Listeners
   3.1 Materials
   3.2 Procedures
   3.3 Subjects
   3.4 Results
   3.5 Discussion
4 Experiment Comparisons: Bradlow/Alexander (2007) vs. Single Talker vs. Multi Talker
   4.1 Results
   4.2 Discussion
5 Conclusion
Appendix A
Bibliography

Chapter 1

Introduction

How we as listeners perceive words involves much more than simply processing what the speech signal offers us. Other factors come into play, such as listeners' expectations, and as a result we can at times predict a speaker's words before they have been produced. This becomes most evident when the source is somehow distorted, making it difficult for the listener to accurately perceive the signal. In these situations listeners have various compensatory tools at their disposal for understanding the signal being given to them. One of these tools is the context of the sentence as a whole, which listeners can use to fill in the gaps. Listeners use context as a means to predict sentence-final words. This use of context indicates the role played by top-down, signal-independent information in speech recognition. Listener adaptation to the talker indicates a role played by perceptual learning in enhancing speech recognition. This greater intelligibility, or adaptation, can be achieved through increased exposure to a speaker. It is especially evident when the speaker's speech somehow deviates from the norm, perhaps due to speaking style, such that the listener will not immediately be able to understand the speaker. Studies have been conducted looking at perceptual learning for foreign-accented speech in particular (Bradlow and Bent, 2007). This study further investigates the effects of perceptual learning for a specific speaker's accent and for many speakers of different L1 backgrounds. Foreign-accented speech is used experimentally as a means of manipulating a signal-related feature, i.e. variation in bottom-up processing. This is important because: (a) it can lead to a better understanding of speech communication between natives and non-natives, and (b) it provides insight into what happens to bottom-up and top-down interactions when native and non-native talkers communicate. By using accented speech as stimuli, two relevant research questions can be investigated: 1) Will the added distortion of accent put too much strain on processing efforts to allow listeners to take advantage of context? 2) Will listeners adapt to the speaker's (or speakers') accent(s) during the task?

Speech-in-noise studies have shed light on the different ways in which natives and non-natives perceive speech at the various processing levels. Research shows that non-natives perform worse than natives in word identification when the signal is somehow degraded. Past research hypothesized that this was due to difficulties in phonemic identification related to the language background(s) of the listener (Best, 1995; Strange, 1995). However, more recent work demonstrated that when natives and non-natives were asked to identify particular phonemes in a word, the results by phoneme were parallel between the two groups, with non-natives simply performing at a constant lower degree of accuracy across various SNRs; in fact, each group had greater difficulty with the same phonemes (Cutler et al., 2004). This suggests that the disproportionately lower scores of non-natives relative to natives are not a result of trouble with non-native phoneme identification in noise, but instead result from an accumulation of decreased processing accuracy at multiple levels of processing. Those with greater experience with a language (natives) can process the phonemic through syntactic through semantic levels with less exposure to the language than those with less experience (non-natives). This more accurate processing would also allow natives to take advantage of higher-level linguistic information, such as context, that non-natives perhaps could not.

Studies have also examined how age of acquisition of an L2 can affect phoneme and word recognition, and therefore perhaps processing accuracy in speech-in-noise. While listeners of all ages of acquisition display native-like accuracy in silence on many tasks, it is in noise that the differences in processing accuracy come out. The earlier the L2 was acquired, the better a subject performed in speech-in-noise (Mayo et al., 1997). Again context was shown to be a factor, as only the early bilinguals were able to take advantage of this semantic information in noise. However, it is important to note that even the early bilinguals did not perform at the same accuracy level as monolinguals in noise. Rogers et al. (2006) proposed three possible explanations for early bilinguals' decreased processing speeds: 1) the need to deactivate the language not being used in the task, 2) the presence of a large phonemic inventory to choose from, with several similar phonemes, and 3) having to "match native speaker productions to a perceptual category that may be intermediate between the norms for the two languages."

Most importantly, all these studies point to the theory that non-natives' decreased accuracy in speech-in-noise is not a result of an inability to identify L2 phonemes, but stems from the interference of L1 linguistic knowledge, which decreases processing accuracy. L1 and L2 linguistic knowledge are both active for non-natives at all levels of speech processing, so it stands to reason that at higher levels of processing non-natives may be so far behind that there are some linguistic cues they cannot take advantage of at all. Again, this is only seen when the signal is somehow degraded. Bradlow and Alexander (2007) conducted a study that compared natives' and non-natives' ability to take advantage of the semantic information of context when speech was distorted in two manners: 1) the addition of noise, and 2) production in either clear speech or plain speech (plain speech can be considered a distorted version of clear speech). Not surprisingly, natives were able to take advantage of context in both speaking styles, as well as exhibit a jump in accuracy from plain to clear speech. Conversely, the non-native listeners performed about the same between the high and low predictability contexts in the plain speech sentences. There was some improvement in clear speech, showing better perception of high predictability context words, but this was nowhere near the difference seen in native listeners. This study demonstrates that non-native listeners are not able to take advantage of context the way native listeners are in speech-in-noise. However, when provided with the clearer speaking style, and thus less stress on their processing abilities at the level of signal decoding, it was possible for the non-natives to use the higher-level semantic information.

Another question of the present study concerns listener adaptation to a single speaker or to multiple speakers. Individual speakers have differing acoustic characteristics that carry indexical information specific to that speaker (Abercrombie, 1967). Studies have shown that this indexical information interacts with linguistic, speaker-independent information in the speech signal, and that the two are linked in speech processing (Nygaard and Pisoni, 1998). This linking also has different effects on the different levels of speech processing, which can affect our overall perception. While some studies have shown that the perceptual learning spurred by this interaction occurs at the segmental and prelexical levels and is speaker-specific, other studies show the perceptual learning to occur at the featural level and to generalize across speakers (Eisner and McQueen, 2005; Kraljic and Samuel, 2006). Kraljic and Samuel (2007) tried to understand this difference by studying listener adaptation to two speakers on two phoneme continua, /p/ to /b/ and /s/ to /ʃ/. Results mirrored those of former studies, and Kraljic and Samuel hypothesized that when there is a temporal-voicing contrast (/p/ to /b/), perceptual learning occurs at the featural level and learning is reset for each speaker encountered. Conversely, for a spectral-place contrast, perceptual learning occurs at the prelexical level and learning is maintained across various speakers. This could be due to the fact that voicing is a speaker-independent acoustic cue, while spectral-place contrasts differ in production from speaker to speaker.

These different kinds of contrast cues can affect a listener's ability to adapt to not just one speaker but several speakers at once, particularly when foreign accent is involved. The greater the degree of a speaker's accent, the more time it takes to adapt to the accent, which is not surprising considering it will require learning a greater number of phonemic adjustments (Bradlow and Bent, 2007). When listeners are exposed to multiple talkers with accented speech, they actually have higher accuracy scores when exposed to several talkers before testing, compared to just one, and subjects perform better when the training and testing accents match. This finding points to the conclusion that when subjects adapt to an accent they adapt to a specific type of accent rather than to accented speech in general, but that the accent does not have to be speaker-specific.

The current study will examine both natives' and non-natives' ability to take advantage of the higher-level semantic feature of context in speech-in-noise, as well as their ability to adapt to speaker-specific and speaker-independent accented speech. The stimuli are all foreign-accented sentences, drawn from a subset of the sentences used in Bradlow and Alexander (2007). This will allow for a direct comparison of both native and non-native listeners' performance and ability to access higher-level cognitive processes in both unaccented and accented speech.


Chapter 2

Experiment 1 - Native Listeners

2.1 Materials

Recordings were taken from the Wildcat Corpus, a collection of scripted and spontaneous speech produced by both native and non-native speakers of English. For the stimuli in this experiment talkers produced 60 sentences, presented to them one at a time through Superlab, a computer program used to build and run perception experiments. The sentences included 30 sentence-final words in high predictability (HP) context and 30 sentence-final words in low predictability (LP) context; 30 key words were used in total, each occurring in both types of context, for a total of 60 sentences. An example of a word in high predictability context would be "For dessert, he had apple pie", compared to the same word in a low predictability context, such as "Mom talked about the pie". The sentences were a subset of those from lists previously created by Kalikow et al. (1977), Bench and Bamford (1979), Munro and Derwing (1995), and Fallon et al. (2002), as well as some originally produced by Bradlow and Alexander (2007). The full list is provided in Appendix A. The stimuli were equated for overall amplitude and then mixed with speech-shaped noise at a signal-to-noise ratio (SNR) of -2 dB for native listeners, to mimic the setup of the Bradlow and Alexander (2007) experiment.

In the first condition one talker produced all 60 sentences; this will be referred to as the single talker (ST) condition. The talker for this condition was selected on the basis of a separate test in which 50 native subjects listened to a passage produced by each of the non-native talkers in the Wildcat Corpus (the "Stella" passage; for details see the Wildcat Corpus website) and provided an accent rating for each talker on a scale of 1 (no accent) to 9 (strong accent) (Choi, 2007). The talker for the present condition was chosen because his accent rating was deemed "medium": an average rating of 6.34 within the non-native talkers' range of 3.05 to 8.27. His first language was Korean.

The second condition used the same sentences as the ST condition, but with 30 talkers, each producing one word in each of the two contexts, HP and LP (30 talkers each producing 2 sentences equals 60 sentences). These talkers made up the multi talker (MT) condition. The talkers were chosen based on the accent ratings of their speech so as to produce a range of degrees of accent. As shown in Table 2.1, the talkers in this experiment had accent ratings ranging from 3.05 to 8.27, with a mean across all talkers of 6.30; both the lowest and highest rated talkers were Chinese. There were a total of 11 Chinese talkers (7 female, 4 male), 9 Korean (6 female, 3 male), 2 Turkish (both male), 1 Russian male, 1 Japanese male, 1 Macedonian female, 1 Spanish male, 1 Indian male, 1 Iranian male, 1 Italian male, and 1 Thai male. In total there were 14 females and 16 males. The talker from the ST condition was also included in this group, as talker number 14 when ordered from lowest to highest accent rating. In the questionnaire filled out by all talkers, those who spoke a dialect of Chinese did not always specify which dialect was their native language; for the purposes of this experiment, speakers of both dialects were collapsed together as "Chinese" talkers.
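The amplitude equation and noise mixing described above can be summarized in code. This is a minimal sketch rather than the lab's actual processing script: the RMS target level, the function names, and the assumption that a speech-shaped noise waveform of sufficient length is already available are all illustrative.

    import numpy as np

    def rms(x):
        # Root-mean-square amplitude of a waveform.
        return np.sqrt(np.mean(x ** 2))

    def mix_at_snr(speech, noise, snr_db=-2.0, target_rms=0.05):
        # Equate overall amplitude by scaling the speech to a common RMS,
        # then scale the noise so that 20 * log10(rms(speech) / rms(noise))
        # equals snr_db (-2 dB for the native-listener conditions here).
        speech = speech * (target_rms / rms(speech))
        noise = noise[:len(speech)]  # assumes noise is at least speech-length
        noise = noise * (rms(speech) / rms(noise)) / 10 ** (snr_db / 20)
        return speech + noise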


Table 2.1: Talkers in Multi Talker (MT) condition and accent ratings with L1 background and gender.

Speaker   L1           Gender   Mean Accent Rating
1         Chinese      Male     3.05
2         Spanish      Male     3.50
3         Telugu       Male     4.68
4         Chinese      Female   4.70
5         Korean       Female   4.84
6         Korean       Female   5.27
7         Iranian      Male     5.59
8         Korean       Male     5.59
9         Russian      Male     5.59
10        Macedonian   Female   5.82
11        Korean       Female   6.07
12        Chinese      Female   6.11
13        Turkish      Male     6.27
14        Korean       Male     6.34
15        Turkish      Male     6.43
16        Korean       Male     6.61
17        Korean       Female   6.77
18        Chinese      Female   6.89
19        Korean       Female   6.89
20        Chinese      Male     7.11
21        Korean       Female   7.11
22        Italian      Male     7.18
23        Thai         Male     7.18
24        Chinese      Male     7.23
25        Chinese      Female   7.34
26        Chinese      Male     7.43
27        Chinese      Female   7.50
28        Chinese      Female   7.73
29        Japanese     Male     7.77
30        Chinese      Female   8.27

2.2 Procedures

In both the ST and MT conditions the stimuli were presented in two blocks, each composed of a different list of sentences: List 1 included words 1 through 15 in the high predictability (HP) context and words 16 through 30 in the low predictability (LP) context; List 2 included words 16 through 30 in HP context and words 1 through 15 in LP context (see Appendix A). Two presentation orders were used, one with List 1 followed by List 2 (Setup A) and the other with List 2 followed by List 1 (Setup B). Sentences were randomized within each block. This design was used to avoid a subject hearing an HP and an LP sentence with the same target word one after another; in this design that could only occur if one of the sentences was the final sentence of Block 1 and the other the first sentence of Block 2. Moreover, subjects were given the chance to take a break between the blocks, which served to separate the two sections and thus avoid any effects of hearing the same key word in two consecutive sentences.

All subjects began with six practice sentences to become familiar with the task. The practice sentences, which differed from all of the test sentences, were produced in plain speech by a male native speaker of English. Plain speech, or conversational speech, is representative of the way people normally speak to one another. This is in contrast to clear speech, which is hyper-articulated and more clearly produced, such as when one is talking to someone hard of hearing. Since the non-native talkers who produced the stimuli were not instructed to speak in clear speech, their speech was considered more likely to be labeled plain speech, and thus the practice sentences were also in plain speech.
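For concreteness, the list and block design described at the start of this section can be sketched as follows; the data structures and function names are illustrative, not the actual Superlab configuration.

    import random

    WORDS = list(range(1, 31))  # the 30 key words

    # List 1: words 1-15 in HP context, words 16-30 in LP context;
    # List 2 reverses the assignment.
    LIST1 = [(w, "HP") for w in WORDS[:15]] + [(w, "LP") for w in WORDS[15:]]
    LIST2 = [(w, "HP") for w in WORDS[15:]] + [(w, "LP") for w in WORDS[:15]]

    def make_session(setup):
        # Setup A presents List 1 then List 2; Setup B presents List 2 then
        # List 1. Sentences are randomized within each block, so the two
        # sentences sharing a key word can be adjacent only at the block seam.
        blocks = (LIST1, LIST2) if setup == "A" else (LIST2, LIST1)
        return [random.sample(block, len(block)) for block in blocks]

    session = make_session("A")
    assert sum(len(block) for block in session) == 60  # 60 trials per subject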

2.3 Subjects

Native English speakers were asked to listen to all 60 sentences and write down the final word of each sentence to the best of their abilities. The subjects were Northwestern University undergraduates, ages 18 to 21, who participated in the experiment to fulfill part of a class requirement. There were 40 subjects in total, 20 participating in the ST condition and 20 in the MT condition, with 10 in Setup A and 10 in Setup B within each condition. Each of the 40 subjects identified themselves as a native monolingual speaker of English; all self-identified bilingual or non-native speakers were excluded from the final analysis (six bilinguals in the ST condition, and five bilinguals and two non-natives in the MT condition). None of the subjects reported any hearing problems.


% P"-&K$%N(#0&,+%,3$"7$2,%1$2$%",7$4%-5%0&,-$(%-5%"00%@B%,$(-$('$,%"(4%12&-$%451(%-+$%H&("0% 1524%&(%$"'+%,$(-$('$%-5%-+$%6$,-%5H%-+$&2%"6&0&-&$,G%*+$%,/6L$'-,%1$2$%P52-+1$,-$2(%Q(&K$2,&-;% /(4$2#2"4/"-$,%H25.%"#$,%:R%-5%E:%1+5%3"2-&'&3"-$4%&(%-+$%$?3$2&.$(-%-5%H/0H&00%3"2-%5H%"%'0",,% 2$S/&2$.$(-G%*+$2$%1$2$%TB%,/6L$'-,%&(%-5-"0D%EB%3"2-&'&3"-&(#%&(%-+$%I*%'5(4&-&5(%"(4%EB%&(%-+$%U*% '5(4&-&5(%1&-+%:B%&(%I$-/3%F%"(4%:B%&(%I$-/3%9%1&-+&(%$"'+%'5(4&-&5(G%F00%,$0HO&4$(-&H&$4%6&0&(#/"0% 52%(5(O("-&K$%,3$"7$2,%1$2$%$?'0/4$4%H25.%-+$%H&("0%"("0;,&,D%,&?%6&0&(#/"0,%&(%-+$%I*%'5(4&-&5(% "(4%H&K$%6&0&(#/"0,%"(4%-15%(5(O("-&K$,%&(%-+$%U*%'5(4&-&5(G%P5($%5H%-+$%,/6L$'-,%2$352-$4%"(;% 2.4 Results +$"2&(#%32560$.,G% ."'*/#'( Table 2.2: Native listeners’ final word recognition accuracy in percentages in the Single Talker (ST) and !"#$%&'(&)*+"$&,-./&.%0-1+*2*-+&"003."04&*+&5%.0%+2"1%6&7-.&+"2*8%&"+/&$*62%+%.6&9*+1$%&!"$:%.&;9!,/& K>, .%>@/ .%>

!"!"

0

Final Word Recognition Accuracy (% correct)

!"!"

Final Word Recognition Accuracy (% correct) 20 40 60 80

80

"!""!"

Native Listeners: Multi Talker (MT) Native Listeners: Multi Talker (MT) (-2 (-2 dBdB Signal-to-Noise Ratio) Signal-to-Noise Ratio)

0

Final Word Recognition Accuracy (% correct)

Final Word Recognition Accuracy (% correct) 20 40 60 80 0

100

Native Listeners: Single Talker (ST) Native Listeners: Single Talker (ST) (-2 (-2 dBdB Signal-to-Noise Ratio) Signal-to-Noise Ratio)

#\# '#\# c'&1/$# N#c'&1/$# 6=1&3H# /6=1&3H# 8%#'$6L# 18%#'$6L# 1&3M#K)%# =1&3M#K)%# $/%+01# 3#$/%+01#

9#K)%# )9#K)%#

," :,&( :, 2"69 2" C$#" C$ "90$% "9 !"#$% !"# D51&/ D5 :;/2 K> ()#% ()

?":& ?

% % % % % % (3( (3 $:, $: J)%Q J)

5,#,)>&?7@&&&&&&&&&&&&&&&&&&&&&&&&&&& A".B&C),*"-5$D"%"52&'()*>& #!!" +!" *!" )!" (!" '!" &!" %!" $!" #!" !" #"

$"

%"

&"

'"

("

)"

*"

+"

!"#$%&'()*&+,-(.#"/(#&0--1)$-2&34-()),-56&

!"#$%&'()*&+,-(.#"/(#&0--1)$-2&34-()),-56&

averages for each talker:

& 0%%&3AB&$#*&=B6&'()*>& !"#$%&'()*&+,-(.#"/(#&0--1)$-2&34-()),-56&

!"#$%&'()*&+,-(.#"/(#&0--1)$-2&34-()),-56&

(a) High predictability context words.

#!!" +!" *!" )!" (!" '!" &!" %!" $!" #!" !" (" #" $" %" &" '" )" *" +" #" $" %" &" '" (" )" *" +" 7$%8,)&0--,#5&+$/#.&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&& 9,)2&#$/9,:%"8,&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&9,)2&;(),".#:$--,#5,*&7$%8,)&0--,#5&+$/#.&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&& 9,)2&#$/9,:%"8,&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&9,)2&;(),".#:$--,#5,*&

(c) All words.

Figure 2.2: MT Talker Accent Rating vs. Native Listeners’ Percent Correct Responses

As should be evident, there does not appear to be any correlation between accent rating and intelligibility scores, either by context type or for word totals. The correlation coefficient for Accent Ratings vs. HP Words was -0.11, for LP Words 0.25, and for All Words 0.07.

Turning to the L1s of the talkers in the MT condition, one finds a particularly high number of L1 Chinese (11 out of 30) and L1 Korean (9 out of 30) talkers. While the ST condition was all one Korean talker, perhaps subjects are adapting to Chinese- and Korean-accented English to the extent that it boosts overall intelligibility scores in the MT condition. The subjects' responses were therefore separated according to the L1 of the talker into three groupings: 1) a dialect of Chinese (as previously mentioned, due to the lack of specific information provided by the talkers, speakers of both dialects were collapsed together), 2) Korean, and 3) all other accents, since no other accent had a significant number of talkers. These L1 talker groupings were also compared to the ST condition. The boxplot in Figure 2.3 shows that native listeners obtained higher scores when listening to L1 Chinese talkers than when listening to either L1 Korean talkers or the "other" talkers. Once again an ANOVA was run, this time adding talker L1 as a within-subject factor. There proved to be a significant effect of talker L1 and an interaction between context and talker L1 [talker L1: F(2, 19) = 7.96; context-talker L1: F(2, 19) = 9.62]. ANOVAs were then conducted for each L1 talker group with context and block as within-subject factors. For L1 Chinese talkers and L1 Korean talkers there was a significant effect of context [L1 Chinese-context: F(1, 19) = 6.24; L1 Korean-context: F(1, 19) = 17.68]. Interestingly though, and not surprisingly given the boxplot, there was no effect of context when analyzing the scores for all the other accents, none of which made up a prominent number of the talkers. Again, all tests were conducted on data converted to rationalized arcsine units (RAU). It appears that the native listeners are only able to take advantage of context when they have enough stimuli to become familiar with the accent; this would have occurred for the L1 Chinese and L1 Korean talkers in the MT condition and the single L1 Korean talker in the ST condition. However, when subjects are listening to many different accents they are not able to adapt enough to access higher-level semantic processes and show context effects.
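The RAU conversion used for these tests is presumably the standard rationalized arcsine transform of Studebaker (1985); since the thesis does not spell out its implementation, the following is a sketch under that assumption.

    import math

    def to_rau(correct, n):
        # Rationalized arcsine units (Studebaker, 1985): theta is the
        # arcsine-transformed proportion correct (in radians, 0 to pi),
        # linearly rescaled so scores run from about -23 to 123 and are
        # near-linear in percent correct over the mid-range.
        theta = (math.asin(math.sqrt(correct / (n + 1))) +
                 math.asin(math.sqrt((correct + 1) / (n + 1))))
        return (146.0 / math.pi) * theta - 23.0

    # e.g. a listener who identifies 24 of 30 final words correctly:
    print(to_rau(24, 30))  # roughly 79 RAU, close to the raw 80%

Unlike raw percentages, RAU scores have approximately uniform variance across the scale, which is why they are preferred for ANOVAs on proportion-correct data.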


[Figure 2.3 here: boxplots titled "Native Listeners: MT by Talker L1 and ST by Context (-2 dB Signal-to-Noise Ratio)"; categories: CH LP, CH HP, KO LP, KO HP, Other LP, Other HP, ST LP, ST HP; y-axis: final word recognition accuracy (% correct).]

Figure 2.3: Native listeners' percent correct recognition for MT separated by talker L1 and ST results by context.

The data were then separated and analyzed according to block to see if there were differences in adaptability across the L1 groupings; Figure 2.4 summarizes this below. There appears to be a block effect for the L1 Chinese talkers. For the L1 Korean talkers there does not appear to be increased accuracy, but there is decreased variability. For the other accents accuracy increased for some subjects, but overall the median was about the same from Block 1 to Block 2. In fact, the ANOVAs found no significant effect of block for any of the groups, even though there was a significant effect for the data as a whole. This may suggest that while there were enough stimuli, in the case of the L1 Chinese and L1 Korean talkers, for listeners to adapt to the point of taking advantage of context, they lacked enough stimuli within each block to show adaptation effects across the entire condition for specific L1 groups. This is contrary to the ST condition, where there was clear adaptation to the accent.

[Figure 2.4 here: boxplots titled "Native Listeners: MT by Talker L1 and ST by Block (-2 dB Signal-to-Noise Ratio)"; categories: CH B1, CH B2, KO B1, KO B2, Other B1, Other B2, ST B1, ST B2; y-axis: final word recognition accuracy (% correct).]

Figure 2.4: Native listeners’ percent correct recognition for MT separated by talker L1 and ST results by block.
