AJSLP
Research Article
A Clear Speech Approach to Accent Management
Alison Behrman^a
Purpose: A 5-session twice-weekly clear speech protocol with daily home practice was developed to enable Spanish-accented speakers of English to code-switch for increased listener ease of understanding. This study provides preliminary data to test the hypothesis that this protocol results in increased ease of understanding for native English listeners, but not in decreased talker accentedness.
Method: Using a single-case experimental design, 6 adult native Spanish speakers with English proficiency participated in the protocol. Ease of understanding and accentedness were probed at least 5 times pretraining, at each training session, and once per week for 5 weeks posttraining. Thirty native English–speaking listeners assessed the probes using 7-point scales for each measure.
Results: Ease of understanding improved for all participants (mean improvement = 3.5 points; effect size range = 6.98 to 15.33). Accentedness improved for 4 of 6 participants (mean improvement = 2.3 points; effect size range = 4.04 to 10.48). At the outset, most participants expressed concern that this approach would highlight speech errors. Upon follow-up, all participants reported confidence in using the approach and found it helpful in daily communication.
Conclusions: Further research should explore the effects of this protocol on intelligibility and acoustic metrics and their relationship to ease of understanding and accentedness.
A nonnative accent is a communication difference characterized by phonological and prosodic features that differ systematically from those of native speakers. Such differences are generally derived from perceptual and production characteristics of the native language (L1), which are applied to the second language (L2; Best & Tyler, 2007; Flege, 1999). English language learners, particularly those who begin their studies as teenagers or older (Flege, Munro, & MacKay, 1995; Piske, MacKay, & Flege, 2001), typically retain phonological and prosodic features of their L1. Nonnative accents can, but do not necessarily, interfere with intelligibility and the listener’s ease of understanding (Behrman, 2014; Kennedy & Trofimovich, 2008; Munro & Derwing, 1995; Neel & Long, 2015). The goal of traditional intervention is accent reduction, that is, to bring the phonological and prosodic features closer to those of a native speaker and thereby improve communication. Outcomes data elucidating the effect of accent reduction on interpersonal communication are limited. Furthermore, the degree to which a nonnative accent impedes communication is not straightforward (Behrman, 2014). Therefore, additional research is needed. Here, the term accent management is used to encompass a broad range of strategies, including use of global strategies of communication enhancement, such as clear speech (hyperarticulation), as well as traditional goals of reducing segmental and prosodic differences in the L2. The goal of the present study is to inform clinical practice by contributing novel outcomes data on the use of clear speech in accent management and by clarifying the impact of nonnative accent on communication.
^a Department of Speech-Language-Hearing Sciences, Lehman College, The City University of New York, Bronx
Correspondence to Alison Behrman: [email protected]
Editor: Krista Wilkinson
Associate Editor: Jack Ryalls
Received October 12, 2016
Revision received March 6, 2017
Accepted June 13, 2017
https://doi.org/10.1044/2017_AJSLP-16-0177
Disclosure: The author has declared that no competing interests existed at the time of publication.
American Journal of Speech-Language Pathology • Vol. 26 • 1178–1192 • November 2017 • Copyright © 2017 American Speech-Language-Hearing Association
Accentedness and Communication
Accentedness is the listener’s perception of how closely an individual’s speech approaches that of a native talker (Munro & Derwing, 1995). Factors that influence accentedness include the speaker’s age of acquisition of the L2 and the pattern of frequency of its use (Flege et al., 1995; Piske et al., 2001). A speaker’s attitudes toward the target language and motivation for learning the target language may also influence L2 production abilities (Moyer, 2007). The effect of accentedness on communication can be measured by intelligibility (the accuracy with which a listener understands a talker) and by ease of understanding (the amount of effort invested by the listener). Many factors may influence intelligibility and ease of understanding, particularly the language background and experience of the listener (Bent & Bradlow, 2003; Cutler, 2001; Gass & Varonis, 1984), the semantic context (Behrman & Akhund, 2013) and lexical frequency (Levi, Winters, & Pisoni, 2007) of the spoken content, and background noise levels (Wilson & Spaulding, 2010). Many speakers with very strong accentedness score low on intelligibility and ease of understanding, whereas many speakers with very mild accentedness are perceived to be highly intelligible and easy to understand (Behrman & Akhund, 2013; Kennedy & Trofimovich, 2008; Munro & Derwing, 1999). However, L2 speakers may be intelligible and relatively easy to understand despite being perceived as having moderate to high accentedness (Behrman & Akhund, 2013; Kennedy & Trofimovich, 2008; Munro & Derwing, 1999). Therefore, accentedness alone does not necessarily predict intelligibility and ease of understanding. As such, approaches to accent management that focus on increased intelligibility and ease of understanding, such as clear speech, rather than directly focusing on reducing an accent so that a speaker sounds more like a monolingual English speaker, are worthy of investigation.
Accent Management Outcomes
The use of clear speech as an accent management strategy has been acknowledged (Amy Neel, personal communication, February 2016), but it has not been experimentally tested. Rather, research has focused on accent reduction, that is, the minimization of phonotactic and prosodic differences with the goal of producing more nativelike speech. Most such research has been conducted within the context of instructional outcomes of classes in English as a second language (ESL; Barb, 2005; Derwing, Munro, & Wiebe, 1998; Derwing & Rossiter, 2003). Using diverse outcome measures, such as accentedness, ease of understanding, and/or rhythm and segmental errors, those studies all found improvement upon completion of the classes. However, control groups were not included in the study designs.
Two studies have examined individual training of accent management, both using single-subject designs with repeated baseline measures (Behrman, 2014; Schmidt & Meyers, 1995). Schmidt and Meyers (1995) comparatively assessed the outcome of articulation and phonological approaches for fricative and affricate errors in four L1 Korean speakers of English. Two participants were randomly assigned to each training approach. Articulation training consisted of practice in perceptual discrimination and motor skills associated with the target phoneme. Phonological training consisted of contrastive production of minimal sets of phonemes. Both approaches resulted in reduced phonemic errors. However, the effect of the reduced errors on communicative function was not measured. Behrman (2014) comparatively assessed segmental and prosodic approaches in a single-subject alternating treatment design in four L1 Hindi talkers proficient in English. Monolingual English listeners rated accentedness
and ease of understanding of phrases elicited from the participants throughout the baseline, treatment, and posttraining phases. In addition, analyses of segmental and prosodic errors were conducted. Segmental and prosodic errors were reduced after each type of training, respectively, with no carryover effect of one type of training to the other. Accentedness decreased and ease of understanding increased after each type of training. A clear advantage of one approach over the other was not found for any outcome measure. Of note, greater improvement was obtained for ease of understanding than for accentedness for all participants. Taken together with other similar findings (Behrman & Akhund, 2013; Kennedy & Trofimovich, 2008; Munro & Derwing, 1999), the data provide support to explore an accent management strategy that targets increased clarity without directly addressing language-based segmental and prosodic differences.
Clear Speech
Clear speech is a natural speaking style used to enhance intelligibility in adverse listening conditions, such as when talking within high ambient noise levels or with individuals with hearing impairment (Ferguson, 2004; Smiljanić & Bradlow, 2005). Clear speech is hypothesized to represent a global increase in the overall effort of speech production (Lindblom, 1990; Searl & Evitts, 2013). In L1 speakers, acoustic changes resulting from clear speech include decreased rate of articulation, increased vowel duration, and longer and more frequent pauses (Ferguson & Kewley-Port, 2002, 2007; Ferguson & Quené, 2014; Picheny, Durlach, & Braida, 1986; Smiljanić & Bradlow, 2005; Uchanski, Choi, Braida, Reed, & Durlach, 1996). In addition to these changes in speech-timing features, other reported changes include increases in acoustic vowel space area, extent of formant movements (Ferguson & Kewley-Port, 2002, 2007), overall acoustic intensity, and consonant power (Picheny et al., 1986), as well as kinematic adjustments (Matthies, Perrier, Perkell, & Zandipour, 2001; Tasko & Greilick, 2010).
An increase in intelligibility achieved with the use of clear speech, when compared with conversational speech, is referred to as the clear speech gain. In L1 speakers, the reported clear speech gains vary widely, depending upon context, from 5% to 40% (e.g., Ferguson & Kewley-Port, 2002; Picheny, Durlach, & Braida, 1985). Only two experiments reported in the literature have studied the clear speech gain achieved by L2 speakers. Rogers, DeMasi, and Krause (2010) compared the clear speech benefit produced by English monolingual (n = 15) and native Spanish talkers who were either early (n = 22) or late (n = 14) learners of English. The average age of immersion into an English-speaking environment was 6.5 years for early learners and 25 years for late learners.
Comparison of intelligibility scores of six target vowels (V) embedded in the carrier phrase “I see a /bVd/ again” in conversational and clear speech revealed that early learners and monolinguals produced a similar clear speech benefit of 7–8 percentage points. Late learners produced a smaller
benefit of just 3 percentage points. Smiljanić and Bradlow (2011) used semantically anomalous sentences to assess intelligibility differences between conversational and clear speech in English for four L1 Croatian talkers with high proficiency in English. Results showed a clear speech benefit of 10 percentage points, whereas monolingual English talkers in a previous study (Smiljanić & Bradlow, 2005) produced a clear speech benefit of 17 percentage points for the same materials. While both studies found that L2 talkers achieved some clear speech benefit, the gains were smaller than those achieved by native talkers. Nevertheless, the data provide strong motivation for continued exploration of clear speech in L2 speakers. In particular, an intelligibility gain of 10 percentage points, as reported by Smiljanić and Bradlow (2011), may represent a clinically meaningful improvement for some individuals. In fact, those authors noted that the proportional clear speech benefit relative to conversational speech was similar (32%) across both L1 and L2 talkers. This finding suggests that the absolute benefit achieved may be related to baseline intelligibility. Furthermore, although Rogers et al. (2010) found only a 3-percentage-point benefit for late learners of English, they assessed only vowel intelligibility. The effect of clear speech on minimally contrastive /bVd/ words produced by L2 speakers may not be representative of its effect on connected speech.
Given that the number of nonnative speakers of English is projected to increase significantly over the next decade (Shin & Ortman, 2011), a deeper understanding of accented speech, treatment considerations, and the impact of accentedness on interpersonal communication are critical issues for speech-language pathologists. The present study focused on native speakers of Spanish.
Spanish is the second most common language (after English) spoken in the United States (National Education Association, 2010), and people who identify themselves as Hispanic comprised 17% of the total U.S. population in 2014: Of those individuals, 19% were foreign-born (Pew Research Center, 2016). Although not all Hispanic individuals speak English with an accent, four in 10 U.S. adult Hispanics do not speak English proficiently, and 38% of Hispanics mostly use Spanish in their daily communication (Pew Research Center, 2016). In summary, the aim of this study was to provide preliminary data to test the hypothesis that use of a clear speech approach to accent management for Spanish-accented speakers of English would result in increased ease of understanding for native English listeners. It was hypothesized that this approach would not affect accentedness, because clear speech is a global approach that does not target specific phonemic and prosodic differences.
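The distinction drawn above between an absolute clear speech benefit (in percentage points) and a proportional benefit (relative to conversational intelligibility) can be illustrated with a short calculation. The following is a minimal sketch only: the function names and the baseline intelligibility values are hypothetical and were chosen for illustration, not taken from the cited studies. It shows how gains of 10 and 17 percentage points can both correspond to a proportional benefit of roughly 32% when baseline intelligibility differs.

```python
def absolute_gain(conversational_pct, clear_pct):
    """Clear speech gain in percentage points."""
    return clear_pct - conversational_pct


def proportional_gain(conversational_pct, clear_pct):
    """Clear speech gain as a fraction of conversational intelligibility."""
    return (clear_pct - conversational_pct) / conversational_pct


# Hypothetical intelligibility scores (percent correct), for illustration only.
l2_conversational, l2_clear = 31.0, 41.0  # 10-percentage-point gain
l1_conversational, l1_clear = 53.0, 70.0  # 17-percentage-point gain

for label, conv, clear in [
    ("L2 talkers", l2_conversational, l2_clear),
    ("L1 talkers", l1_conversational, l1_clear),
]:
    print(
        f"{label}: absolute = {absolute_gain(conv, clear):.0f} points, "
        f"proportional = {proportional_gain(conv, clear):.0%}"
    )
```

Under these assumed baselines, the smaller absolute gain for the L2 talkers reflects a lower starting point rather than a weaker relative effect of clear speech.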
Method
Design
The hypothesis was tested using a single-case experimental design (SCED). SCEDs can contribute important data for establishing evidence-based practice, as described
most recently by Beeson and Robey (2006), Kazdin (2010), and Byiers, Reichle, and Symons (2012). An important advantage of SCEDs is that detailed information can be obtained about variations in the treatment effect related to the specific individuals under study. Individuals are measured at multiple time points in all phases; thus, trends over time can be identified. Such information is often obscured in group-comparison designs because only group averages and group effect sizes are analyzed (Barlow, Nock, & Hersen, 2009). In the present study, a multiple-baseline design was used, in which the three phases consisted of baseline, clear speech training, and posttraining assessments. In this study, replication across six participants and multiple-baseline observations minimized the possibility that a co-occurring event, and not the treatment, was responsible for the change in the outcome measures (Byiers et al., 2012; Kazdin, 2010; Kratochwill et al., 2010). In effect, the six participants each served as their own controls (Kratochwill et al., 2010). A minimum of five probes were conducted to establish performance on the outcome measures in each of the three phases (Kratochwill et al., 2010). Furthermore, to increase experimental control of extraneous variables, the start of the experimental intervention was staggered across participants, such that some participants remained in an extended baseline phase while others entered the training phase (Byiers et al., 2012). Kratochwill et al. (2010) recommend a minimum of three session lags between participants. In the present study, instead of staggering all participants, three baseline start points were used with two participants in each start point group. This modification was used to control for potential dropout of some participants during an extended baseline phase. In summary, the baseline phase consisted of five, eight, or 11 probes on the outcome measures administered every other day (excluding weekends).
The training phase consisted of five sessions of 45 min each, administered twice per week for 2.5 consecutive weeks, with the probes on the outcome measures administered at the end of each training session. The posttraining phase consisted of probes on the outcome measures once per week for 5 weeks. This study was approved by the Lehman College/ City University of New York Institutional Review Board for the Protection of Human Subjects, and all procedures adhered to the approved protocol. The participants, training protocol, and probes are described as follows.
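The probe schedule described in this section can be summarized in a short sketch. The phase lengths (baselines of five, eight, or 11 probes; five training sessions with one probe each; five weekly posttraining probes) come from the text. The assignment of particular participant numbers to start points is an assumption made for illustration, because the text states only that two participants shared each start point; the class and function names are likewise hypothetical.

```python
from dataclasses import dataclass

BASELINE_PROBES = (5, 8, 11)  # three staggered start points, three-probe lag
TRAINING_PROBES = 5           # one probe at the end of each training session
POSTTRAINING_PROBES = 5       # one probe per week for 5 weeks


@dataclass
class ProbeSchedule:
    participant: int
    baseline: int
    training: int
    posttraining: int

    @property
    def total_probes(self) -> int:
        return self.baseline + self.training + self.posttraining


def build_schedules(n_participants: int = 6) -> list[ProbeSchedule]:
    """Pair participants across the three baseline start points.

    Assumption: participants are grouped in consecutive pairs (1-2, 3-4, 5-6);
    the actual grouping is not stated in the text.
    """
    return [
        ProbeSchedule(i + 1, BASELINE_PROBES[i // 2], TRAINING_PROBES, POSTTRAINING_PROBES)
        for i in range(n_participants)
    ]


for schedule in build_schedules():
    print(schedule, "total probes:", schedule.total_probes)
```

Every participant meets the minimum of five probes per phase, and the difference of three baseline probes between groups matches the recommended lag.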
Participants
Participants were recruited by word of mouth from a college student population. Enrollment criteria specified that volunteers must be native Spanish speakers. They could not have studied English intensively or have been immersed in an English-dominant environment prior to the age of 14 years. They needed to consider themselves proficient speakers of English but with a moderate-to-heavy Spanish accent. Volunteers were required to be full-time
college students to ensure frequent exposure to English for opportunities to practice speech strategies learned in the experimental training sessions. They could not be proficient in a third language. Volunteers could not be participating in any type of accent management or ESL classes during the experiment. An additional requirement was the ability to commit to regular attendance in all sessions and completion of daily homework practice. Finally, enrollment criteria specified a score of 50% accuracy or lower on two subtests (Articulation and Word-Level Syllabic Stress) of the Proficiency in Oral English Communication (Sikorski, 2014). The first six volunteers who met the enrollment criteria were enrolled in the study. One individual dropped out after the first session due to scheduling problems, and a replacement volunteer was recruited and enrolled.
Table 1 presents demographic information on the six participants. All the participants reported seeking accent management believing that it would improve their abilities to get into graduate school and/or to become more employable after graduation. They had all studied English in secondary schools in their native countries. The median duration of residence in the United States was 11 years (range = 8 to 15 years). All participants reported using English in the classroom, and three participants reported using English frequently at work. However, all participants reported feeling more comfortable speaking in Spanish than in English, and Spanish remained their dominant language used socially on a daily basis with friends and family members. None of the participants had previously participated in an accent management program. Four of the participants had been enrolled in one or more ESL classes but not within the year prior to enrollment in this project.
Two participants were undeclared majors, three participants were in their first semester of undergraduate study in communication sciences and disorders, and one participant was an undergraduate majoring in social work.
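The residence figures reported above can be cross-checked against the ages listed in Table 1. The sketch below is illustrative only and assumes that duration of residence equals current age minus age at immigration, which the text does not state explicitly; the variable and function names are hypothetical.

```python
from statistics import median

# (current age, age at immigration to the U.S.) for Participants 1-6, from Table 1.
participants = {
    1: (31, 19),
    2: (24, 16),
    3: (35, 20),
    4: (28, 17),
    5: (26, 15),
    6: (27, 18),
}


def years_of_residence(data):
    """Assumed estimate: current age minus age at immigration."""
    return {pid: age - arrival for pid, (age, arrival) in data.items()}


residence = years_of_residence(participants)
print("Years of residence:", residence)
print("Median:", median(residence.values()))
print("Range:", min(residence.values()), "to", max(residence.values()))
```

Under this assumption, the computed median and range agree with the values reported in the text (median of 11 years, range of 8 to 15 years).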
Table 1. Demographic data of the participants.

| Participant | Age | Gender | Native country | Age of immigration to U.S.^a | Dominant employment language^b | Dominant academic language | Dominant social language |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 31 | F | D.R. | 19 | Full-time employment: equal use of Spanish and English | English | Spanish |
| 2 | 24 | M | Mexico | 16 | Not employed | English | Spanish |
| 3 | 35 | F | D.R. | 20 | Full-time employment: mainly Spanish | English | Spanish |
| 4 | 28 | F | Ecuador | 17 | Part-time employment: mainly English | English | Spanish |
| 5 | 26 | M | D.R. | 15 | Part-time employment: mainly Spanish | English | Spanish |
| 6 | 27 | F | D.R. | 18 | Part-time employment: equal use of Spanish and English | English | Spanish |

Note. D.R. = Dominican Republic; F = female; M = male.
^a In the United States, all participants have lived only in New York City.
^b Includes field placement/externship training.

Language Background of the Participants
The phonotactic and prosodic features of a speaker’s L1 are important to consider in accent management, because the L1 influences the perception and production of the L2 (Best & Tyler, 2007; Davidson, 2011; Ellis, 1994; Pavlenko & Jarvis, 2002). While a comprehensive analysis of Spanish (Colina, 2009; Eddington, 2004) is beyond the scope of this article, some broad differences between Spanish and English are highlighted. The inventory of Spanish vowels consists of /i, e, a, o, u/. In contrast, American English has a relatively large vowel inventory of non-rhotic monophthongs, consisting of seven tense vowels (/i, e, æ, ɑ, ɔ, o, u/) and four lax vowels (/ɪ, ɛ, ʌ, ʊ/). (Dialectal variations are common within both languages.) The two languages have a similar number of consonants. English contains a greater number of fricatives, although Spanish plosives may be produced with frication in some contexts. Spanish plosives are achieved with more anterior place of articulation, different voice onset times, and different rules of aspiration than in English. In addition, in Spanish, /b/ and /v/ may be interchanged, depending upon the phonotactic context. English has a greater variety of syllable structures than does Spanish. In English, more consonants, as well as consonant clusters, are permitted in all positions. Spanish allows only the singleton consonants /s, n, r, l, d/ in the coda position, although /n/ is often replaced with nasalization of the preceding vowel, and /s/ is omitted in many dialects. Therefore, closed syllables (having a terminal consonant) are much less common in Spanish, with a higher proportion of simple consonant–vowel syllables in Spanish than in English. As a result, Spanish L1 speakers talking in English will often omit one consonant from a cluster or use epenthesis (insertion of a vowel) to break up clusters and omit final consonants that do not typically occur in the coda position in Spanish. In Caribbean Spanish, especially in the Dominican Republic, deletion of final /s/ is quite common. It is also demonstrated, although less extensively, in the coastal communities of Ecuador (Salcedo, 2010). Therefore, although all six participants were native Spanish speakers, they represented different dialects. Furthermore, it is likely that their English has also absorbed some local features (Wolfram & Schilling, 2015, p. 228), in this case the English and Spanish dialects spoken in New York City.
In terms of prosody, syllabic stress is more variable in English than in Spanish (Carter, 2005; Delattre, 1966), with vowel reduction in unstressed syllables commonly used in English but rarely used in Spanish. Again, it was hypothesized that accentedness scores would not be influenced by use of clear speech, because the language-based phonemic and prosodic differences are not directly addressed by clear speech.
Clear Speech Protocol
The training was provided by two female graduate students in speech-language pathology. Every training session for each participant was completely supervised by the author. The training program was designed to teach participants how to produce clear speech consistently throughout each utterance without articulatory or prosodic distortion and to become comfortable and confident using this style of speaking so that speakers could elect to use it, in effect to code-switch, in a variety of communicative contexts. This approach is consistent with Lindblom’s (1990) hyperspeech versus hypospeech theory, which states that when a listener expresses difficulty understanding the talker’s speech, the talker will sacrifice motor economy (hypoarticulation) by using hyperarticulation to achieve greater accuracy in the speech signal. In effect, the talker is code-switching to hyperspeech.
The overall structure of the five training sessions consisted of review of the home practice material (except for the first session), speech production practice with cuing and feedback, explanation of the home practice material for the upcoming week (except for the fifth session), and recording of the probe. Throughout the five sessions, segmental and prosodic errors were not addressed. When questions did arise from the participants about how to produce a given word, the discussion was deferred until the conclusion of the individual’s experimental participation. Instructions for elicitation of the probes were the same as those used during the baseline phase: to speak clearly at a comfortable loudness level. Practice material used during the training sessions never included the stimulus phrases used for the probes.
An important component of the first training session was to help the participants believe that the clear speech–based approach would meet their personal needs relative to accent management.
All the participants had volunteered for the experiment because of their desire to decrease their Spanish accent when speaking English. The experimental program was not designed for that purpose. Although the informed consent process identified that distinction, in the first session, the purpose of the research-based training protocol was explained again, as it had been at the time of the informed consent: to explore a novel and untested approach to accent management that focused upon increased speech clarity rather than nativelike speech. A nonnative accent was defined for the participants as differences in pronunciation of speech sounds and speech rhythm that arise from differences between the Spanish and English languages.
To confirm that participants understood the discussion, they were asked for two examples: a sound difference and an English word that was difficult to pronounce because it was hard to know where to put the emphasis. All the participants readily provided examples. The rationale for the experimental protocol was then provided as follows. Some sound and rhythm differences are very hard for adults to learn because of long-established pronunciation habits. Furthermore, the English language has many pronunciation and rhythm rules and many exceptions to those rules, such that it can be difficult for adults to remember all of them. An alternative strategy, being tested in the experiment, is to produce words more clearly and with greater deliberation. Rather than amplifying errors, the strategy might help listeners hear and decode differences in pronunciation and rhythm. It was emphasized that this strategy was not intended to be used all the time in “real life,” but rather as a code-switching option in high-consequence communication or when the talker elected to use it for any reason.
Clear speech was then introduced. The discussion of clear speech began with situational examples, such as talking to friends in a noisy restaurant, trying to talk surreptitiously in class to a friend across the room by exaggerating mouth movements without producing sound (“mouthing words”), or talking with a person who has a hearing impairment and needs to read lips. To confirm understanding, the participants were asked to provide an example of when they had used clear speech in either language. All gave appropriate examples (although two of the participants had no familiarity with the concept of lip reading). Throughout the discussion, the investigator used clear speech to provide a model and directed the participant’s attention to the visual and aural features of the model. Modeling may assist in the early stages of skill acquisition (Maas et al., 2008).
The researcher then drew a parallel between a noisy environment and the “noise” of a nonnative accent, for which the speaker needs to use increased clarity to rise above the noise and aid in listener comprehension. Care was taken in using this analogy to remove the negative connotation of noise as “bad” by providing examples of pleasant noise, such as the sound of leaves rustling or water flowing in a stream. Emphasis was placed on noise as a neutral feature that could impede understanding and needs to be managed. Next, clear speech was elicited from the participants through immediate repetition of a model provided by the investigator using lists (such as counting and days of the week) and common phrases (such as “Hello, how are you?” and “My name is …”). Although therapy was conducted in English, one instruction was provided in Spanish during this first session and intermittently throughout the therapy: “Habla claramente y con determinación” (speak clearly and purposefully). Additional prompts were used as needed to help participants achieve maximal use of clear speech. These prompts included direct instruction (“Exaggerate the movements of your mouth”), contrasting conversational style and clear speech production in a minimal pairs format, and magnitude production (“Your
regular speech corresponds to a clearness of 100. Now you should speak using twice the clearness—a clearness of 200”). Intermittent verbal feedback (“good,” “nice, that was said clearly”) and augmented verbal feedback (“remember to use clear speech,” “good, speak twice as clearly”) were used during practice. The feedback was most frequent in the first session and at the beginning of each subsequent session and then faded as the sessions progressed. Feedback that is intermittent or that summarizes multiple trials and that does not contain too much information may be most effective in skill acquisition (Maas et al., 2008; Schmidt & Lee, 2014). The feedback provided knowledge of results, which is hypothesized to be of equal effect as knowledge of performance (Maas et al., 2008; Schmidt & Lee, 2014). It should be noted that the effects of different clear speech instructions on acoustic and perceptual characteristics have been assessed in monolingual talkers without deficits (Lam & Tjaden, 2013a, 2013b; Lam, Tjaden, & Wilding, 2012) and with dysarthria (Lam & Tjaden, 2016a). Although some effects for the type of instruction have been found, no overall advantage of one instruction over the others has been demonstrated. Furthermore, the interaction of cultural and linguistic effects with the instruction type has not been studied. Therefore, a variety of instructions were used and modified, depending upon the results produced by each participant. The three most common clear speech errors initially produced by all participants were insufficient effort in clear speech production, a rapid decrease in clear speech toward the end of the phrase, and distortion in prosody (particularly minimization of the intonation and stress contours). In addition, two participants used excessive lip movement (particularly lip spreading), which resulted in increased phonemic and prosodic distortion. The prosody distortion was not addressed in this first session. 
In the case of all the other errors, the investigator imitated the participant’s production, highlighting the error, and then repeated the utterance while modeling the correct clear speech target. On occasion, the investigator video-recorded herself and then the participant producing clear speech for comparison. Exaggerated clear speech production (without distortion) was targeted in each session under the assumption that participants would naturally decrease production effort when using it outside of the clinical setting. Participants were guided to use high-effort clear speech in the training sessions and in the home practice exercises, defined as being purposeful, deliberate, and focusing on producing every word throughout every sentence while speaking twice as clearly as typical, without distortion. To make sure that the participants understood the concept of high-effort clear speech, the investigator asked the participants to identify in the investigator and then in themselves speech produced slightly more clearly than typical style contrasted with high-effort clear speech. All the participants could correctly identify the contrasts. The home practice portion of the program consisted of four segments, all conducted in English. First, the participants were required to time themselves while reading aloud
for 2 min using clear speech with high effort. The participants selected their own reading material. Many of them chose reading assignments from their classes. The second segment was to speak aloud for 2 min (timed) on any topic, either to themselves or to another person, again using high-effort clear speech. These two components had to be performed twice each day, except for the days on which they had a training session, in which case the training counted as one practice session. Third, they were required to use clear speech in actual conversation, either in person or on the phone, for 25% of their total daily talking time. The amount of time devoted to the use of clear speech in conversation was explained as follows: “Imagine that the amount of time you spend talking to others on a given day is equal in cost to one dollar. You must spend 25 cents’ worth of conversation on clear speech.” Fourth, participants were required to keep a daily log of observations of their experiences with clear speech. Training sessions 2 through 5 always began with a review of the homework and the daily log, and then clear speech was practiced. The participants were instructed to use clear speech all the time throughout every session. The most common errors observed during these sessions were forgetting to use clear speech during conversation, especially during brief comments (for example, “I know”) and, as in the first session, a decrease in effort over the duration of an utterance. Again, prosody was not addressed. In all cases, the investigator required repetition of the phrase using clear speech. Starting in the third training session, prosody was addressed, with the focus on using clear speech with the participant’s conversational rhythm, intonation contour, and stress patterns. 
It is important to note that, to maintain the integrity of the experiment by testing only the clear speech protocol, at no time did the training protocol address language-based differences in stress or intonation contour between Spanish and English. Instead, the participant was instructed to use clear speech while still using his or her natural conversational style. Instructions were also provided in Spanish: “No quiero que suene como un robot. Habla en su manera más natural.” (I don’t want you to sound like a robot. Speak in your way that is more natural.) Additionally, and in conjunction with this focus, the amount of time spent on reading aloud to practice clear speech was minimized. Activities included verbal games (such as 20 Questions), artificial conversations (“Tell me how you would travel from here to Central Park”), and actual conversations on a variety of topics. The content complexity of the actual conversations was varied throughout each session and always included simple and complex topics. For example, a relatively simple content conversation might be describing a recent family wedding, whereas a complex level might be explaining content from one of their classes. Upon completion of the training phase of the experiment, the participants were instructed to cease all homework activities and to use clear speech in actual conversations Behrman: Clear Speech Approach to Accent Management
only to the extent that they wanted to and found it helpful in communication. During the posttraining phase, a probe was recorded from each participant once per week for 5 weeks. Instructions for elicitation of the probes during the posttraining period were the same as those used during the baseline and training phases—to speak clearly at a comfortable loudness level. After the final probe, the investigator interviewed the participants about their experience with clear speech.
Probes (Baseline and Outcome Measures) Accent management outcomes were assessed with measures of ease of understanding and accentedness. These measures were probed a minimum of five times at baseline, at the end of each of the five training sessions, and once per week for 5 weeks after completion of the last training session. The speech stimuli used to construct the probes consisted of 25 anomalous phrases drawn from Liss et al. (2009). The phrases were syntactically permissible but contained limited contextual support (e.g., unseen machines agree) to minimize sentential cues in the ease-of-understanding assessments. All phrases consisted of six syllables (in three to five mono- or bisyllabic words), with alternation of weak and strong syllables, where strong syllables were defined as those carrying lexical stress in citation form. The 25 phrases were randomly divided into five groups of five unique phrases, with three exemplars of each phrase, for a total of 15 phrases per group. Each probe consisted of one phrase group, and a different phrase group was used for each of the five probes within a phase (baseline, training, posttraining). For each participant, within each phase, the order of the phrase groups and the phrases within each group were randomized. Therefore, no participant read the same group of phrases more than once within a phase. In this way, the participant’s familiarity with a given phrase was minimized. This method also decreased listener familiarity with the phrases when rating outcome measures, to be described below. At the time of recording of a probe, the participants were provided with a printout of the phrases and given the opportunity to read through them silently. They were cautioned that the phrases did not make sense, but no guidance in pronunciation was provided. Participants were instructed to use their best guess in pronouncing any unfamiliar words and to speak clearly at a comfortable loudness level. 
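The probe construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the phrase strings are placeholders (the actual stimuli come from Liss et al., 2009), "three exemplars" is interpreted here as three repetitions of each phrase within a group, and the function names and seeds are hypothetical.

```python
import random

def build_phrase_groups(phrases, n_groups=5, exemplars=3, seed=None):
    """Randomly split the phrases into equal groups; repeat each phrase `exemplars` times."""
    if len(phrases) % n_groups != 0:
        raise ValueError("phrases must divide evenly into groups")
    rng = random.Random(seed)
    shuffled = list(phrases)
    rng.shuffle(shuffled)
    size = len(shuffled) // n_groups
    groups = []
    for g in range(n_groups):
        unique = shuffled[g * size:(g + 1) * size]
        # Assumption: each unique phrase appears `exemplars` times in its group.
        groups.append([p for p in unique for _ in range(exemplars)])
    return groups

def schedule_phase(groups, seed=None):
    """Return one probe per group, randomizing group order and item order within each group."""
    rng = random.Random(seed)
    order = list(range(len(groups)))
    rng.shuffle(order)
    probes = []
    for idx in order:
        items = list(groups[idx])
        rng.shuffle(items)
        probes.append(items)
    return probes

# Placeholder phrase labels; not the real stimuli.
phrases = [f"phrase_{i:02d}" for i in range(1, 26)]
groups = build_phrase_groups(phrases, seed=1)
baseline_probes = schedule_phase(groups, seed=2)
```

Because every phase uses each group exactly once, no group of phrases recurs within a phase, which mirrors the familiarity control described in the text.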
Each participant was positioned standing in front of a computer screen on which the stimuli were displayed, one phrase at a time. The rate of phrase presentation, controlled by the investigator, was at a comfortable pace for each speaker. The participants were recorded individually in a quiet environment using a digital audio recorder (Zoom H4N) using a 44.1-kHz sampling rate and 16-bit resolution connected to a head-mounted microphone (Shure SM35) positioned approximately 5 cm away and at a 45° angle from the corner of the mouth. Each recording was saved to a .wav file. The audio files were equalized for loudness level, and 200 ms of silence was appended to the start
and end of each file. All the audio files from each participant were randomized, so that the files from the baseline, training, and posttraining phases were intermixed in a different order for each participant. Thirty adult male and female listeners, all functionally monolingual English speakers without communication disorders, rated the outcome measures. All the listeners were college undergraduate students in a variety of majors, and all had daily exposure to Spanish-accented speakers. Many of them had studied Spanish in school, but none of them considered themselves to be proficient speakers of Spanish. The listeners were divided into six groups of five raters each, and each group was arbitrarily assigned to one speaker. Each rater used headphones (Shure SRH840) to listen to the stimuli played through a computer. Listeners were told that they would hear sentences that were grammatically correct but did not make sense. Listeners then practiced with two phrases that were like the stimuli, but not otherwise used in the experiment, produced by a Spanish-accented talker who was not part of the experiment. Listeners could adjust the output gain so that it was at a comfortable loudness level, but then the gain had to remain unchanged throughout the experimental task. They could listen to each audio file only one time for each judgment. Ease of understanding was always assessed first for all stimuli. Then, after a 5-min break, they listened to the recordings again and made judgments of accentedness. Judgments of ease of understanding and accentedness were each made with 7-point Likert-type scales. The end point labels for the ease-of-understanding scale were very easy to understand and very hard to understand. For the accentedness scale, the end point labels were no Spanish accent and very strong Spanish accent. Listeners were reminded to make use of the entire scale for each measure.
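The audio preparation steps mentioned above (loudness equalization, 200 ms of silence at the start and end of each file, and intermixing of files across phases) could be approximated as follows. This is a hedged sketch, not the procedure actually used: the article does not state how loudness was equalized, so root-mean-square (RMS) scaling is an assumption, as are the target level, the function names, and the file names.

```python
import math
import random

SAMPLE_RATE_HZ = 44_100  # sampling rate reported for the recordings

def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def equalize_rms(samples, target_rms=0.1):
    """Scale samples to a common RMS level (assumed method; target is arbitrary)."""
    current = rms(samples)
    if current == 0:
        return list(samples)
    gain = target_rms / current
    return [s * gain for s in samples]

def pad_with_silence(samples, sample_rate=SAMPLE_RATE_HZ, pad_ms=200):
    """Add pad_ms of digital silence before and after the samples."""
    n_pad = round(sample_rate * pad_ms / 1000)
    silence = [0.0] * n_pad
    return silence + list(samples) + silence

def randomized_playlist(files_by_phase, seed=None):
    """Intermix files from all phases into one shuffled presentation order."""
    rng = random.Random(seed)
    playlist = [(phase, name) for phase, names in files_by_phase.items() for name in names]
    rng.shuffle(playlist)
    return playlist

# Hypothetical file names for one talker.
files = {"baseline": ["b1.wav", "b2.wav"], "training": ["t1.wav"], "posttraining": ["p1.wav"]}
playlist = randomized_playlist(files, seed=3)
```

At 44.1 kHz, 200 ms corresponds to 8,820 samples on each side of the recording. Equalizing before padding keeps the added silence from lowering the measured RMS level.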
Data Analysis Data were plotted to allow for a visual inspection of levels, trends, and variability within and across each phase (Byiers et al., 2012; Kazdin, 2010). The effect size, broadly equivalent to a Z score of a standard normal distribution, was used to quantify differences between phases. The effect size (d index; Beeson & Robey, 2006; Byiers et al., 2012) was computed as the difference in mean performance between baseline and posttraining phases and divided by the weighted standard deviation of the baseline phase to account for differences in the number of baseline observations (Olive & Smith, 2005). Interpretation of the effect size is not straightforward and “requires an informed means of developing benchmarks to discern the magnitude of…effect sizes for a particular treatment” (Beeson & Robey, 2006, p. 167). Cohen’s (1988) benchmarks of 0.2, 0.5, and 0.8 for small-, medium-, and large-sized effects, respectively, are based upon group design applications in psychology and are therefore inappropriate in SCEDs in speech research. Insufficient effect size data are available in accent management outcomes, with only one other study reporting effect sizes (Behrman, 2014). Therefore, effect
American Journal of Speech-Language Pathology • Vol. 26 • 1178–1192 • November 2017
size is not comparatively assessed here. Instead, a criterion level of greater than 1.0 was used to define improvement, similar to the criterion used in Maas and Farinella (2012) in their SCED of childhood apraxia of speech. Likewise, mean score differences of 1.0 or less were defined as stable during baseline and posttraining phases. Interrater agreement was calculated for approximately 30% of the measures of accentedness and ease of understanding. A difference of 1 point or less between raters occurred for 82% and 84% of the assessments of accentedness and ease of understanding, respectively, indicating good agreement, consistent with other investigations (Behrman & Akhund, 2013; Kennedy & Trofimovich, 2008). To calculate intrarater reliability, 15% of the samples per talker were replicated. Intrarater reliability across the five listeners for each talker ranged from 80% to 91%.
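A minimal sketch of the analysis criteria described above follows. It makes several simplifying assumptions that should not be read as the study's exact computation: the plain sample standard deviation of the baseline stands in for the weighted standard deviation (Olive & Smith, 2005), the difference is taken as baseline minus posttraining so that a decrease in scores (improvement on these scales) yields a positive value, and the scores shown are invented for illustration.

```python
from statistics import mean, stdev

def effect_size_d(baseline, posttraining):
    """(Baseline mean - posttraining mean) / baseline SD.

    Simplification: uses the unweighted sample SD of the baseline phase.
    """
    sd = stdev(baseline)
    if sd == 0:
        raise ValueError("baseline scores have zero variance")
    return (mean(baseline) - mean(posttraining)) / sd

def is_improved(d, criterion=1.0):
    """Improvement is defined as an effect size greater than the criterion."""
    return d > criterion

def is_stable(phase_means, max_range=1.0):
    """A phase is stable when its mean scores vary by 1.0 point or less."""
    return max(phase_means) - min(phase_means) <= max_range

def within_one_point_agreement(rater_a, rater_b):
    """Proportion of paired ratings that differ by 1 point or less."""
    pairs = list(zip(rater_a, rater_b))
    return sum(abs(a - b) <= 1 for a, b in pairs) / len(pairs)

# Hypothetical probe means on the 7-point scale (not data from the study).
baseline = [6.5, 6.75, 6.25, 6.5, 6.5]
posttraining = [2.0, 2.25, 1.75, 2.0, 2.0]
d = effect_size_d(baseline, posttraining)
```

Because the baseline standard deviation sits in the denominator, a very stable baseline produces a large effect size even for a moderate change in means, which is one reason group-design benchmarks are hard to apply to single-case data.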
Results The mean scores for accentedness and ease of understanding are plotted for each session within the three phases in Figures 1 and 2. Mean scores and standard deviations for the baseline and posttraining phases and effect sizes are given in Table 2.
Participant 1 (P1) The baseline and posttraining phases for P1 were stable, with mean scores varying by less than 1 point for both measures in both phases. P1 responded quickly to the training and demonstrated little difficulty in learning to use clear speech. She required only infrequent prompting to maintain use of clear speech throughout each sentence. She achieved lower mean scores for ease of understanding than for accentedness throughout all three phases. In the posttraining phase, the mean difference between the two measures was 2.08 points. The largest effect size of the six participants was obtained by P1 for both accentedness and ease of understanding. (Data are presented in Figure 1, top panel.) Participant 2 (P2) The baseline and posttraining phases for P2 were stable, with mean scores varying by 1 point or less for both outcome measures in both phases. P2 achieved lower mean scores for ease of understanding than for accentedness throughout all three phases. In the posttraining phase, the mean difference between the two measures was 2.04 points. Although he quickly learned how to produce clear speech, initially, his rate of speech slowed markedly, and he tended
Figure 1. Accentedness (black lines) and ease of understanding (blue lines) for participants 1 through 3. Each data point represents the average value of ratings from five listeners. Scale values closer to 1.0 represent less accentedness and greater ease of understanding, whereas scale values closer to 7.0 represent greater accentedness and lesser ease of understanding. Note that the time between data points varied by phase. Baseline (B) probes occurred every other day. Training (T) probes occurred twice per week. Posttraining (P) probes occurred once per week.
Figure 2. Accentedness (black lines) and ease of understanding (blue lines) for participants 4 through 6. Each data point represents the average value of ratings from five listeners. Scale values closer to 1.0 represent less accentedness and greater ease of understanding, whereas scale values closer to 7.0 represent greater accentedness and lesser ease of understanding. Note that the time between data points varied by phase. Baseline (B) probes occurred every other day. Training (T) probes occurred twice per week. Posttraining (P) probes occurred once per week.
to produce each syllable with equal stress. The third training session focused on the use of a more natural rate of speech and a return to his pretraining speech rhythm, while maintaining high-effort clear speech. He successfully achieved this modification and maintained clear speech. These changes may have been responsible for the large
drop in mean accentedness scores after the third training session. The effect size for accentedness was in the middle range compared with the scores of the other participants. The effect size for ease of understanding was larger than for accentedness. However, it was in the lower range of effect size scores when compared with the scores of the
Table 2. Mean and standard deviation of accentedness and ease-of-understanding scores in the baseline and posttraining phases, and effect sizes from baseline to posttraining phases.

              Accentedness                                          Ease of understanding
Participant   Baseline M (SD)   Posttraining M (SD)   Effect size   Baseline M (SD)   Posttraining M (SD)   Effect size
1             6.68 (0.27)       4.08 (0.23)           10.48         6.40 (0.37)       2.00 (0.20)           15.33
2             6.32 (0.33)       4.28 (0.44)            5.28         5.36 (0.38)       2.24 (0.33)            8.75
3             4.98 (0.31)       4.92 (0.33)            0.17         4.98 (0.59)       2.12 (0.23)            6.98
4             6.15 (0.42)       4.68 (0.30)            4.04         6.60 (0.28)       3.52 (0.36)            9.53
5             4.47 (0.48)       4.72 (0.23)            0.70         5.45 (0.42)       1.92 (0.36)            9.02
6             6.58 (0.28)       3.28 (0.41)            9.57         6.45 (0.42)       2.36 (0.46)            9.34
other participants. (Data are presented in Figure 1, middle panel.) Participant 3 (P3) The baseline phase was not stable for P3, with mean scores varying by more than 1 point for accentedness and ease of understanding (1.8 and 2.0 points, respectively). The posttraining phase was stable, however, with mean scores varying by less than 1 point for both measures. P3 had milder (lower) mean accentedness scores in the baseline phase than four of the five other participants and demonstrated no improvement (decrease) in mean scores across the three phases. P3 achieved lower mean scores for ease of understanding than for accentedness in both the training and posttraining phases. In the posttraining phase, mean scores for ease of understanding were 2.8 points lower than for accentedness. P3 had the smallest effect size for accentedness, reflecting the lack of change in mean scores as a result of training. Although P3 achieved an improvement (decrease) of 2.86 points in the mean scores for ease of understanding from baseline to posttraining, the effect size was the smallest compared with the scores of the other participants. Accentedness scores increased at the first and second training sessions, whereas ease-of-understanding scores remained unchanged. To start with, the speaker distorted articulatory movements, particularly with excessive lip spreading, and her fluency decreased. The third training session focused on lingual and mandibular movements during clear speech with decreased attention to labial movements. As a result, articulatory distortions decreased and fluency increased. Afterward, mean accentedness scores returned to their baseline levels and ease-of-understanding scores improved (decreased). (Data are presented in Figure 1, bottom panel.) Participant 4 (P4) The baseline and posttraining phases for P4 were stable, with mean scores varying by no more than 1 point for either measure in both phases. 
P4 achieved lower mean scores for ease of understanding than for accentedness in both the training and posttraining phases, although the mean scores were separated by only 1.16 points, and she demonstrated the smallest change in ease of understanding compared with the scores of the other participants. The effect size for accentedness was the smallest of the four participants who achieved a training effect for this measure, whereas the effect size for ease of understanding was among the highest scores of all of the participants. During the early training phase, marked prosodic distortion was noted, particularly a flat intonation contour and equal stress on all syllables. Although P4 produced clear speech well by the end of the training phase with improved prosody, vowel errors were prominent. Even in conversational speech, she tended to pronounce American English vowels by their orthographic representation, and she used mainly Spanish vowels, resulting in numerous errors. (Data are presented in Figure 2, top panel.)
Participant 5 (P5) The baseline phase was not stable for P5, with mean scores varying by 1.2 points for both measures. The posttraining phase was stable, however, with mean scores varying by less than 1 point for both measures. P5 had milder (lower) mean accentedness scores at baseline than four of the five other participants and demonstrated no improvement (decrease) in mean scores across the three phases, with an effect size of less than 1. He was the only participant with lower scores for ease of understanding than for accentedness during the baseline phase, with a mean difference of approximately 1 point. However, by the second training session probe, mean scores for ease of understanding decreased and were 2.8 points below the mean scores for accentedness during the posttraining phase. Although P5 quickly adapted to the use of clear speech, he struggled initially to maintain a high effort of clear speech through to the end of a phrase. The effect size for ease of understanding was in the middle range compared with the scores of the other participants. (Data are presented in Figure 2, middle panel.) Participant 6 (P6) The baseline and posttraining phases for P6 were stable, with mean scores varying by no more than 1 point for either measure in both phases. Mean scores for accentedness and ease of understanding were similar throughout the baseline phase (separated by approximately a 10th of a point) and remained close throughout the training and posttraining phases. P6 struggled to maintain high-effort clear speech throughout an utterance and demonstrated markedly reduced prosody. By the fourth training session, however, she achieved success in maintaining clear speech with minimal cuing and improved prosody. That change is likely the reason that both accentedness and ease-of-understanding scores improved (decreased) suddenly from the third to the fourth training session. 
The effect size for both outcome measures was among the largest obtained when compared with those of the other participants. (Data are presented in Figure 2, bottom panel.) Overall Overall, the mean values for accentedness and ease of understanding during the baseline phase did not vary by more than 1 point, except for P3 (1.8 points for accentedness and 2 points for ease of understanding) and P5 (1.2 points for both accentedness and ease of understanding). During the posttraining phase, the mean values did not vary by more than 1 point for any participant on either outcome measure. The scores for ease of understanding improved (decreased) for all six participants, and only P4 did not achieve mean scores under 2.5 during the posttraining phase. P1 and P6 showed the greatest changes in both accentedness and ease of understanding from baseline to posttraining phases. Both P3 and P5 had the mildest scores on both outcome measures during the baseline phase, and both talkers showed the smallest changes in accentedness from baseline to posttraining phases.
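The summary figures can be checked directly against the values in Table 2. The short script below is a sketch for that purpose only; the dictionary layout and function names are illustrative assumptions, and the numbers are transcribed from the table (baseline mean, posttraining mean, effect size).

```python
# Values transcribed from Table 2: (baseline mean, posttraining mean, effect size).
TABLE_2 = {
    "P1": {"accentedness": (6.68, 4.08, 10.48), "ease": (6.40, 2.00, 15.33)},
    "P2": {"accentedness": (6.32, 4.28, 5.28), "ease": (5.36, 2.24, 8.75)},
    "P3": {"accentedness": (4.98, 4.92, 0.17), "ease": (4.98, 2.12, 6.98)},
    "P4": {"accentedness": (6.15, 4.68, 4.04), "ease": (6.60, 3.52, 9.53)},
    "P5": {"accentedness": (4.47, 4.72, 0.70), "ease": (5.45, 1.92, 9.02)},
    "P6": {"accentedness": (6.58, 3.28, 9.57), "ease": (6.45, 2.36, 9.34)},
}
CRITERION = 1.0  # improvement is defined as an effect size greater than 1.0

def improved(measure):
    """Participants whose effect size for the measure exceeds the criterion."""
    return [p for p, row in TABLE_2.items() if row[measure][2] > CRITERION]

def mean_change(measure, participants):
    """Mean decrease (baseline minus posttraining) across the given participants."""
    changes = [TABLE_2[p][measure][0] - TABLE_2[p][measure][1] for p in participants]
    return sum(changes) / len(changes)

ease_improved = improved("ease")
accent_improved = improved("accentedness")
print(ease_improved, round(mean_change("ease", ease_improved), 2))
print(accent_improved, round(mean_change("accentedness", accent_improved), 2))
```

Under the effect size criterion, all six participants improved in ease of understanding and four (P1, P2, P4, and P6) improved in accentedness, in line with the summary given in the text.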
The immediacy of the training effect varied across participants and outcome measures. Four of the participants (P1, P2, P4, and P6) began to show an improvement (decrease) in scores for ease of understanding after the first training session, whereas the other two participants did not show improvement until the second (P5) or the fourth (P3) training session. In contrast, for the participants who achieved an improvement (decrease) in accentedness scores, only P1 demonstrated improvement at the first session. For the other participants, the improvement was demonstrated at the third (P2, P6) or the fourth (P4) training session. The plateauing of scores for ease of understanding occurred in the third session for P1 and P4, in the fourth session for P5 and P6, in the fifth session for P2, and not until the posttraining phase for P3.
Discussion Two major findings arise from this study. First, this five-session, 2.5-week clear speech protocol, which included daily home practice, improved the ease with which listeners understood Spanish-accented speakers of English. This study was limited to six participants and, therefore, other individuals may not achieve similar, favorable outcomes. Nevertheless, a positive training effect for ease of understanding was evident for all the participants in this study. No direct comparisons to other research are available. However, the outcome is consistent with Smiljanić and Bradlow (2005, 2011), who found that clear speech improved intelligibility for Croatian-accented talkers of English. The reason why clear speech improved ease of understanding for the listeners in this experiment is uncertain. While numerous studies have examined the acoustic changes that accompany clear speech in monolingual (Ferguson & Kewley-Port, 2002, 2007; Ferguson & Quené, 2014; Picheny et al., 1986; Smiljanić & Bradlow, 2005; Uchanski et al., 1996) and nonnative (Smiljanić & Bradlow, 2011) talkers, a definitive set of acoustic features of clear speech that result in increased intelligibility has not been identified. Studies of synthesized hybridized acoustic signals (Kain, Amano-Kusumoto, & Hosom, 2008; Liu & Zeng, 2006), in which one or more acoustic features of clear speech are incorporated into conversational speech, suggest that a combination of temporal and spectral characteristics is important for increased intelligibility. It is likely that ease of understanding of nonnative speakers is similarly dependent upon multiple acoustic characteristics of clear speech. Another contributing factor to the improved ease of understanding could be that clear speech, with its emphasis on phoneme hyperarticulation, resulted in greater production of syllable-final consonants and, thus, increased clarity of word boundaries. 
Segmentation of the continuous speech signal into words, using acoustic, syntactic, and lexical information, is necessary for speech perception. Rhythmic cues may be an important acoustic cue (Cutler & Butterfield, 1990). Enhanced rhythmic contrast between
strong and weak syllables may, by itself, have enhanced listener ease of understanding. In English, word boundaries are generally perceived to occur before strong syllables, those in which the vowel quality of the nucleus is not reduced (Cutler & Norris, 1988). Compared with English, vowel reduction is less common in Spanish. Therefore, it may also be possible that the distinction between strong and weak syllables, which drives word segmentation in English, is less distinct in the production of Spanish-accented talkers of English. Omission of word-final consonants in English, associated with the dominance of open syllables in Spanish, could likely exacerbate listener confusion. The result could be a greater number of word boundary errors and lesser ease of understanding on the part of monolingual English-speaking listeners. In the only investigation of word boundaries in clear speech, Cutler and Butterfield (1990) found that monolingual talkers increased the frequency and duration of preboundary pauses in clear speech compared with conversational speech. In the present study, it is possible that talkers enhanced word boundaries through clearer production of final consonants (including release of final stops) and increased use of brief interword silence. However, phrase stimuli in this study were not designed to analyze final consonant production or word boundary pauses systematically across phonemes. Therefore, this explanation remains hypothetical. Future research should be designed to examine acoustic characteristics of word boundaries, as well as other changes generally observed in the clear speech literature, such as overall intensity and rate of speech. The second major finding of this study is that the clear speech protocol resulted in a decrease in monolingual English listeners’ perceptions of the strength of the nonnative accent for four of the six participants. 
This finding appears to contrast with that of Smiljanić and Bradlow (2011), who found that clear speech did not affect accentedness in Croatian-accented talkers of English. A closer examination of the SCED data of this experiment, however, suggests that the findings between the two studies are, in fact, similar. Data from the probes obtained at the end of the first training session are more comparable to the findings of Smiljanić and Bradlow, who elicited clear speech from one-time instructions and modeling, than are the mean scores from the probes obtained in the posttraining phase. In the present study, only P1 demonstrated a decrease in mean accentedness scores at the first training session, and it was only 0.6 points lower than the lowest baseline mean score. Therefore, if group statistical analysis had been conducted based upon the first training session, no change in accentedness would have been observed, consistent with the findings of Smiljanić and Bradlow (2011). The change in accentedness scores in this study was not anticipated. It had been considered unlikely that clear speech would affect accentedness. Clear speech does not directly address phonemic differences such as production of lax vowels, or the consonants /v/, /b/, /tʃ/, for example (which are often difficult for L1 Spanish speakers of English),
or prosodic differences such as syllabic stress and phrase intonation patterns. However, the process of hyperarticulation, in which talkers increase their focus upon accuracy of phoneme production, may have resulted in reduced phonemic errors. Thus, together with increased production of final consonants, clear speech may, in effect, be an indirect approach to reduction of accentedness. In addition, it is likely that the features used to identify accentedness may vary among listeners. It might be that accentedness is partially dependent upon, or confounded with, ease of understanding. Therefore, as ease of understanding improved, the listeners’ perceptions of accentedness were altered. In fact, for the four participants who did achieve decreased accentedness, the scores grossly mirrored the ease-of-understanding scores, consistent with data from other studies (Behrman, 2014; Behrman & Akhund, 2013; Kennedy & Trofimovich, 2008; Munro & Derwing, 1999). In general, the acoustic parameters upon which listeners base their judgments of accentedness are uncertain. One challenge in research on accentedness is that the perception of accentedness is not stable within a given listener. For example, within an experimental paradigm, the number of times a listener hears a given utterance and the number of native utterances with which the L2 speaker is compared both influence perception of accentedness (Flege & Fletcher, 1992). However, the difference in the immediacy of the training effect on ease-of-understanding and accentedness scores suggests that different features of the training protocol drove the change in scores, at least to some degree. Overall, ease of understanding responded quickly, consistent with the introduction of clear speech and correction of errors in its production, such as insufficient effort or articulatory distortions during production of clear speech. 
In contrast, for the four participants who achieved decreased accentedness scores, the decrease occurred more slowly. Speech naturalness may have been a factor in this timing difference. Naturalness is the degree to which speech conforms to a listener’s standard of prosody: the rate, rhythm, intonation, and stress patterning of an utterance (Yorkston, Beukelman, Strand, & Hakel, 2010). Prosody was not addressed in the first two sessions of the training phase. Instead, the training focused upon enabling the participants to achieve success in high-effort clear speech and to maintain clear speech throughout each utterance. It is possible that clear speech disrupted prosody in the early training sessions, which resulted in a decrease in naturalness, which, in turn, contributed to the perception of accentedness. Research in speech naturalness has focused on speakers with dysarthria (Yorkston, Hammen, Beukelman, & Traynor, 1990) and fluency disorders (Metz, Schiavetti, & Sacco, 1989), and the findings of those studies suggest that prosodic features are an important component of naturalness. However, data from disordered speech have limited application to the speech of nonnative speakers, and no data are available that describe the contribution of naturalness to accentedness in nonnative speakers. Substantial additional research is needed to tease apart the factors that contribute to the listener’s perception of accent, including speech naturalness.
Two additional features regarding this study are worthy of emphasis. First, although ease-of-understanding and accentedness scores increased temporarily for one participant (P3) during the first two training sessions, scores did not increase over baseline for any participant by the time the posttraining phase was reached. In other words, the clear speech protocol did not heighten accentedness or decrease ease of understanding, at least for the participants in this study. This finding is important, because many of the participants expressed initial concerns that clear speech would draw attention to segmental and prosodic differences and further impede communication. As is evident, that was not the case—at least for the participants in this study. Second, the performance during the training phase did not diminish during the posttraining phase. The participants were able to maintain the effect of clear speech consistently for the 5 weeks after training had concluded. This outcome suggests that the protocol enabled participants to maintain the production of clear speech and its communicative benefits. It is unknown whether the duration of the training protocol was optimal. The design specified five training sessions because five probes per phase are recommended for the strongest experimental evidence (Kratochwill et al., 2010). However, the optimal training time required to achieve maximal performance is uncertain. The participants required at least three training sessions to achieve their lowest (best) ease-of-understanding scores. Furthermore, the data from five of the participants appeared to plateau by the fourth training session. When queried during the exit interview, five of the six participants perceived the training duration to be appropriate. P5, who achieved no change in accentedness but a large improvement in ease of understanding, reported that a greater number of sessions addressing clear speech specifically would have been helpful. 
Another consideration, relative to the duration of the training, is the generalizability of the training to communicative activities of daily living. Smiljanić and Bradlow (2008), in their study of temporal features of clear speech in native speakers of English, found that talkers maintained clear speech throughout oral reading of paragraphs. In contrast, in the present study, most participants decreased the effort of clear speech in the training sessions when practicing clear speech while engaged in conversation. The participants required repeated practice and cuing to maintain consistent effort of clear speech. This difference in talker abilities to maintain clear speech in the two studies suggests that practice using clear speech within conversation is critical for speakers to be able to ultimately use clear speech beneficially in real-world contexts.
Upon conclusion of their participation, each participant was interviewed regarding his or her experience with the protocol. Three of the participants (P1, P2, and P6) reported that, prior to participation, they would sometimes purposely not speak clearly, try to “run words together,” or speak more quietly when conversing with native speakers, all to avoid drawing attention to pronunciation and grammatical errors. All the participants commented that, initially,
they hesitated to use clear speech in actual conversation because they were concerned that it would highlight their grammatical and pronunciation errors and, secondarily, that it would look or sound “odd.” Despite these concerns, all participants reported that they grew more comfortable in using clear speech as they found that listeners asked for repetition less frequently and that the clear speech did not appear to call attention to itself. Three of the participants (P1, P3, and P5) commented that they also used clear speech in Spanish in certain situations, particularly at work, to make sure that others understood them easily. All the participants reported greater confidence in speaking English. However, they all also stated that they would like to participate in more sessions to work on pronunciation of specific consonants and vowels. Perhaps of utmost importance is that all participants reported that they did elect to use clear speech when important communicative contexts arose at work and in class and planned to continue to use clear speech. That is, not only did the training protocol result in improved ease of understanding, but also participants reported that they found the speaking style useful in real-world communication.
The range of effect sizes obtained in this study was from 6.98 to 15.33 for ease of understanding and from 0.17 to 10.48 for accentedness. The only other accent management outcomes study reporting effect size was Behrman’s (2014) comparative study of segmental and prosodic training, in which effect size ranged from 3.4 to 9.7 for ease of understanding and from 5.0 to 11.2 for accentedness. It is not surprising that a larger effect size for accentedness was obtained by addressing segmental and prosodic differences directly than by using clear speech. It is notable, however, that larger effect sizes for ease of understanding were obtained with clear speech than in the earlier study.
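The effect sizes discussed above are standardized differences between phases. This section does not state the exact formula used, so the following is only a minimal sketch under an assumption: that the change from the baseline mean to the posttraining mean is divided by the baseline standard deviation, a formulation often applied to single-case data. The function name and the rating values are hypothetical and are not data from this study.

```python
from statistics import mean, stdev


def single_case_effect_size(baseline, posttraining):
    """Standardized mean difference between two phases (assumed formula).

    Lower ratings are better on the scales described in this study, so the
    posttraining mean is subtracted from the baseline mean and a positive
    result indicates improvement.
    """
    baseline_sd = stdev(baseline)  # sample SD of the baseline probes
    if baseline_sd == 0:
        raise ValueError("baseline has zero variance; effect size is undefined")
    return (mean(baseline) - mean(posttraining)) / baseline_sd


# Hypothetical listener-averaged ratings on a 7-point scale
# (five probes per phase, illustrative values only).
baseline_probes = [4, 5, 6, 5, 5]
posttraining_probes = [2, 2, 2, 2, 2]

d = single_case_effect_size(baseline_probes, posttraining_probes)
print(f"effect size: {d:.2f}")
```

Because the denominator is the baseline standard deviation, very stable baseline ratings produce large values. Under this assumed formula, effect sizes from single-case designs are therefore best compared with other single-case studies that use the same calculation rather than with group-design benchmarks.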
The findings of the present study contribute to the existing literature showing that a speaker can be perceived as having a moderately strong nonnative accent and yet be easy to understand and highly intelligible to the listener (Behrman, 2014; Derwing & Munro, 1997, 2009; Munro & Derwing, 1999; Smiljanić & Bradlow, 2011). Furthermore, ease of understanding may be more important to many listeners in the workplace than is accentedness (Derwing & Munro, 2009).
The use of ease of understanding without concomitant measurement of intelligibility is an important caveat in the interpretation of the present findings. Listener accuracy of understanding is unknown. Therefore, it is possible that some listeners perceived talkers to be easy to understand but were, in fact, misinterpreting what the talkers were saying. Listener accuracy and ease of understanding are both important considerations in treatment planning. Future research needs to address this study limitation.
The clinical implications of this study are that speech-language pathologists should discuss ease of understanding and accentedness with their accent management clients. Many L2 speakers of English elect to participate in accent management programs with the goal of increased communicative competence in English. Accent reduction, with its
focus on sounding more nativelike through segmental and prosodic changes, may not be the only or even the optimal approach for some clients. This clear speech approach to accent management may provide L2 speakers with the ability to code-switch in high-consequence conversational contexts to achieve increased ease of understanding on the part of their listeners. Furthermore, the benefits of clear speech that potentially can be achieved by nonnative speakers may require more than instruction to elicit “naturally occurring” clear speech, such as that found in the research literature. Instead, a multiple-session training program may be required to obtain the maximum benefit. These findings may have limited generalization to other L1 dialects and languages, and additional research is needed. Nevertheless, the data from this study suggest that clear speech is worthy of further exploration in the practice of accent management.
Acknowledgment
The participants are gratefully acknowledged. This research was supported in part by a clinical research grant from the American Speech-Language-Hearing Foundation, awarded to Alison Behrman.
References
Barb, C. (2005). Suprasegmentals and comprehensibility: A comparative study in accent modification (Unpublished doctoral dissertation). Wichita State University, KS. Retrieved from http://soar.wichita.edu/xmlui/bitstream/handle/10057/570/d05005.pdf.txt?sequence=4
Barlow, D. H., Nock, M. K., & Hersen, M. (2009). Single-case experimental designs: Strategies for studying behavior change (3rd ed.). Boston, MA: Pearson.
Beeson, P. M., & Robey, R. R. (2006). Evaluating single-subject treatment research: Lessons learned from the aphasia literature. Neuropsychological Review, 16, 161–169.
Behrman, A. (2014). Segmental and prosodic approaches to accent management. American Journal of Speech-Language Pathology, 23, 546–561.
Behrman, A., & Akhund, A. (2013). The influence of semantic context on the perception of Spanish-accented American English. Journal of Speech, Language, and Hearing Research, 56, 1567–1578.
Bent, T., & Bradlow, A. R. (2003). The interlanguage speech intelligibility benefit. The Journal of the Acoustical Society of America, 114, 1600–1610.
Best, C. T., & Tyler, M. D. (2007). Nonnative and second-language speech perception: Commonalities and complementarities. In M. J. Munro & O.-S. Bohn (Eds.), Second language speech learning: The role of language experience in speech perception and production (pp. 13–34). Amsterdam, PA: John Benjamins.
Byiers, B. J., Reichle, J., & Symons, F. J. (2012). Single-subject experimental design for evidence-based practice. American Journal of Speech-Language Pathology, 21, 397–414.
Carter, P. M. (2005). Quantifying rhythmic differences between Spanish, English, and Hispanic English. In R. S. Gess & E. J. Rubin (Eds.), Theoretical and experimental approaches to romance linguistics: Selected papers from the 34th Linguistic Symposium on Romance Languages (pp. 63–75). Amsterdam, PA: John Benjamins.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Mahwah, NJ: Erlbaum.
American Journal of Speech-Language Pathology • Vol. 26 • 1178–1192 • November 2017
Colina, S. (2009). Spanish phonology: A syllabic perspective. Washington, DC: Georgetown University Press.
Cutler, A. (2003). Listening to a second language through the ears of a first. Interpreting, 5, 1–23.
Cutler, A., & Butterfield, S. (1990). Durational cues to word boundaries in clear speech. Speech Communication, 9, 485–495.
Cutler, A., & Norris, D. (1988). The role of strong syllables in segmentation for lexical access. Journal of Experimental Psychology: Human Perception and Performance, 14, 113–121.
Davidson, L. (2011). Phonetic and phonological factors in the second language production of phonemes and phonotactics. Language and Linguistics Compass, 5, 126–139.
Delattre, P. (1966). A comparison of syllable length conditioning among languages. International Review of Applied Linguistics in Language Teaching, 4, 183–198.
Derwing, T. M., & Munro, M. J. (1997). Accent, intelligibility, and comprehensibility: Evidence from four L1s. Studies in Second Language Acquisition, 19, 1–16.
Derwing, T. M., & Munro, M. J. (2009). Comprehensibility as a factor in listener interaction preferences: Implications for the workplace. The Canadian Modern Language Review, 66, 181–202.
Derwing, T. M., Munro, M. J., & Wiebe, G. (1998). Evidence in favor of a broad framework for pronunciation instruction. Language Learning, 48, 393–410.
Derwing, T. M., & Rossiter, M. J. (2003). The effects of pronunciation instruction on the accuracy, fluency, and complexity of L2 accented speech. Applied Language Learning, 13, 1–17.
Eddington, D. (2004). Spanish phonology and morphology: Experimental and quantitative perspectives. Amsterdam, PA: John Benjamins.
Ellis, R. (1994). The study of second language acquisition. New York, NY: Oxford University Press.
Ferguson, S. H. (2004). Talker differences in clear and conversational speech: Vowel intelligibility for normal-hearing listeners. The Journal of the Acoustical Society of America, 116, 2365–2373.
Ferguson, S. H., & Kewley-Port, D. (2002). Vowel intelligibility in clear and conversational speech for normal-hearing and hearing-impaired listeners. The Journal of the Acoustical Society of America, 112, 259–271.
Ferguson, S. H., & Kewley-Port, D. (2007). Talker differences in clear and conversational speech: Acoustic characteristics of vowels. Journal of Speech, Language, and Hearing Research, 50, 1241–1255.
Ferguson, S. H., & Quené, H. (2014). Acoustic correlates of vowel intelligibility in clear and conversational speech for young normal-hearing and elderly hearing-impaired listeners. The Journal of the Acoustical Society of America, 135, 3570–3574.
Flege, J. E. (1999). Age of learning and second-language speech. In D. Birdsong (Ed.), Second language acquisition and the critical period hypothesis (pp. 101–132). Hillsdale, NJ: Lawrence Erlbaum.
Flege, J. E., & Fletcher, K. L. (1992). Talker and listener effects on degree of perceived foreign accent. The Journal of the Acoustical Society of America, 91, 370–389.
Flege, J. E., Munro, M. J., & MacKay, I. R. A. (1995). Factors affecting strength of perceived foreign accent in a second language. The Journal of the Acoustical Society of America, 97, 3125–3134.
Gass, S., & Varonis, E. M. (1984). The effect of familiarity on the comprehensibility of nonnative speech. Language Learning, 34, 65–89.
Kain, A., Amano-Kusumoto, A., & Hosom, J.-P. (2008). Hybridizing conversational and clear speech to determine the degree of contribution of acoustic features to intelligibility. The Journal of the Acoustical Society of America, 124, 2308–2319.
Kazdin, A. E. (2010). Single-case research design: Methods for clinical and applied settings (2nd ed.). New York, NY: Oxford University Press.
Kennedy, S., & Trofimovich, P. (2008). Intelligibility, comprehensibility, and accentedness of L2 speech: The role of listener experience and semantic context. The Canadian Modern Language Review, 64, 459–489.
Kratochwill, T. R., Hitchcock, J., Horner, R. H., Levin, J. R., Odom, S. L., Rindskopf, D. M., & Shadish, W. R. (2010). Single-case designs technical documentation. Retrieved from What Works Clearinghouse website: http://ies.ed.gov/ncee/wwc/pdf/wwc_scd.pdf
Lam, J., & Tjaden, K. (2013a). Intelligibility of clear speech: Effect of instruction. Journal of Speech, Language, and Hearing Research, 56, 1429–1440.
Lam, J., & Tjaden, K. (2013b). Acoustic-perceptual relationships in variants of clear speech. Folia Phoniatrica et Logopaedica, 65, 148–153.
Lam, J., Tjaden, K., & Wilding, G. (2012). Acoustics of clear speech: Effect of instruction. Journal of Speech, Language, and Hearing Research, 55, 1807–1821.
Levi, S. V., Winters, S. J., & Pisoni, D. B. (2007). Speaker-independent factors affecting the perception of foreign accent in a second language. The Journal of the Acoustical Society of America, 121, 2327–2338.
Lindblom, B. (1990). Explaining phonetic variation: A sketch of the H&H theory. In W. J. Hardcastle & A. Marchal (Eds.), Speech production and speech modeling (pp. 403–439). Dordrecht, the Netherlands: Kluwer Academic.
Liss, J. M., White, L., Mattys, S. L., Lansford, K., Lotto, A. J., Spitzer, S., & Caviness, J. N. (2009). Quantifying speech rhythm deficits in the dysarthrias. Journal of Speech, Language, and Hearing Research, 52, 1334–1352.
Liu, S., & Zeng, F. G. (2006). Temporal properties in clear speech perception. The Journal of the Acoustical Society of America, 120, 424–432.
Maas, E., & Farinella, K. A. (2012). Random versus blocked practice in treatment for childhood apraxia of speech. Journal of Speech, Language, and Hearing Research, 55, 561–578.
Maas, E., Robin, D. A., Austermann Hula, S. N., Freedman, S. E., Wulf, G., Ballard, K. J., & Schmidt, R. A. (2008). Principles of motor learning in treatment of motor speech disorders. American Journal of Speech-Language Pathology, 17, 277–298.
Matthies, M., Perrier, P., Perkell, J. S., & Zandipour, M. (2001). Variation in anticipatory coarticulation with changes in clarity and rate. Journal of Speech, Language, and Hearing Research, 44, 340–353.
Metz, D. E., Schiavetti, N., & Sacco, P. R. (1989). Acoustic and psychophysical dimensions of the perceived speech naturalness of nonstutterers and posttreatment stutterers. Journal of Speech and Hearing Disorders, 55, 516–525.
Moyer, A. (2007). Do language attitudes determine accent? A study of bilinguals in the USA. Journal of Multilingual and Multicultural Development, 28, 502–518.
Munro, M. J., & Derwing, T. M. (1995). Foreign accent, comprehensibility, and intelligibility in the speech of second language learners. Language Learning, 45, 73–97. https://doi.org/10.1111/0023-8333.49.s1.8
Munro, M. J., & Derwing, T. M. (1999). Foreign accent, comprehensibility, and intelligibility in the speech of second language learners. Language Learning, 49, 285–310.
National Education Association. (2010). Hispanics: Education issues. Retrieved from http://www.nea.org/home/HispanicsEducation%20Issues.htm
Neel, A. T., & Long, E. (2015). Listener profiles of accented English speech. Presented at the annual convention of the American Speech-Language-Hearing Association, Denver, CO.
Olive, M. L., & Smith, B. W. (2005). Effect size calculations and single subject designs. Educational Psychology: An International Journal of Experimental Educational Psychology, 25, 313–324.
Pavlenko, A., & Jarvis, S. (2002). Bidirectional transfer. Applied Linguistics, 23, 190–214.
Pew Research Center. (2016). American Community Survey (IPUMS). Retrieved from http://www.pewhispanic.org/2016/04/19/statistical-portrait-of-hispanics-in-the-united-states-key-charts/
Picheny, M. A., Durlach, N. I., & Braida, L. D. (1986). Speaking clearly for the hard of hearing. I. Intelligibility differences between clear and conversational speech. Journal of Speech and Hearing Research, 28, 96–103.
Piske, T., MacKay, I. R. A., & Flege, J. E. (2001). Factors affecting degree of foreign accent in an L2: A review. Journal of Phonetics, 29, 191–215.
Rogers, C. L., DeMasi, T. M., & Krause, J. C. (2010). Conversational and clear speech intelligibility of /bVd/ syllables produced by native and non-native English speakers. The Journal of the Acoustical Society of America, 128, 410–423.
Salcedo, C. S. (2010). The phonological system of Spanish. Revista de Linguistica y Lenguas Aplicadas, 5, 195–209. Translated at https://dialnet.unirioja.es/descarga/articulo/3269828.pdf
Schmidt, A. M., & Meyers, K. A. (1995). Traditional and phonological treatment for teaching English fricatives and affricates to Koreans. Journal of Speech and Hearing Research, 38, 828–838.
Schmidt, R. A., & Lee, T. D. (2014). Motor learning and performance: From principles to application (5th ed.). Champaign, IL: Human Kinetics Publishers, Inc.
Searl, J., & Evitts, P. M. (2013). Tongue-palate contact pressure, oral air pressure, and acoustics of clear speech. Journal of Speech, Language, and Hearing Research, 56, 826–839.
Shin, H. B., & Ortman, J. (2011). Language projections: 2010 to 2020. Paper presented at the Federal Forecasters Conference, Washington, D.C. Retrieved from http://www.census.gov/hhes/socdemo/language/data/acs/Shin_Ortman_FFC2011_paper.pdf
Sikorski, L. D. (2014). Proficiency in Oral English Communication: An assessment battery of accented oral English (4th ed.). Tustin, CA: LDS & Associates.
Smiljanić, R., & Bradlow, A. (2007). Clear speech intelligibility: Listener and talker effects. ICPhS, XVI, 661–664.
Smiljanić, R., & Bradlow, A. R. (2005). Production and perception of clear speech in Croatian and English. The Journal of the Acoustical Society of America, 118, 1677–1688.
Smiljanić, R., & Bradlow, A. R. (2008). Temporal organization of English clear and conversational speech. The Journal of the Acoustical Society of America, 124, 3171–3182.
Smiljanić, R., & Bradlow, A. R. (2011). Bidirectional clear speech perception benefit for native and high-proficiency nonnative talkers and listeners: Intelligibility and accentedness. The Journal of the Acoustical Society of America, 130, 4020–4031.
Tasko, S. M., & Greilick, K. (2010). Acoustic and articulatory features of diphthong production: A speech clarity study. Journal of Speech, Language, and Hearing Research, 53, 84–99.
Uchanski, R. M., Choi, S. S., Braida, L. D., Reed, C. M., & Durlach, N. I. (1996). Speaking clearly for the hard of hearing IV: Further studies of the role of speaking rate. Journal of Speech and Hearing Research, 39, 494–509.
Wilson, E. O., & Spaulding, T. J. (2010). Effects of noise and speech intelligibility on listener comprehension and processing time of Korean-accented English. Journal of Speech, Language, and Hearing Research, 53, 1543–1554.
Wolfram, W., & Schilling, N. (2015). American English: Dialects and variations (3rd ed.). West Sussex, United Kingdom: Wiley-Blackwell.
Yorkston, K. M., Beukelman, D. R., Strand, E. A., & Hakel, M. (2010). Management of motor speech disorders in children and adults (3rd ed.). Austin, TX: Pro-Ed.
Yorkston, K. M., Hammen, V. L., Beukelman, D. R., & Traynor, C. D. (1990). The effect of rate control on the intelligibility and naturalness of dysarthric speech. Journal of Speech and Hearing Disorders, 55, 550–560.