
Journal of Communication Disorders 42 (2009) 235–244

The effects of self-generated synchronous and asynchronous visual speech feedback on overt stuttering frequency

Gregory J. Snyder a,*, Monica Strauss Hough b,1, Paul Blanchet c,2, Lennette J. Ivy d,3, Dwight Waddell e,4

a The Laboratory for Stuttering Research, Department of Communication Sciences & Disorders, University of Mississippi, 310 George Hall, University, MS 38677, United States
b Department of Communication Sciences & Disorders, East Carolina University, Health Sciences Building, Greenville, NC 27858, United States
c Department of Speech Pathology & Audiology, State University of New York at Fredonia, W117 Thompson Hall, Fredonia, NY 14063, United States
d The Laboratory for Stuttering Research, Department of Communication Sciences & Disorders, University of Mississippi, 308 George Hall, University, MS 38677, United States
e Department of Health, Exercise Science & Recreation Management, University of Mississippi, 215 Turner Center, University, MS 38677, United States

Received 20 August 2007; received in revised form 16 January 2009; accepted 17 February 2009

Abstract

Purpose: Relatively recent research documents that visual choral speech, which represents an externally generated form of synchronous visual speech feedback, significantly enhanced fluency in those who stutter. As a consequence, it was hypothesized that self-generated synchronous and asynchronous visual speech feedback would likewise enhance fluency. Therefore, the purpose of this study was to investigate the effects of self-generated visual feedback (i.e., synchronous speech feedback with a mirror and asynchronous speech feedback via delayed visual feedback) on overt stuttering frequency in those who stutter.

Method: Eight people who stutter (4 males, 4 females), ranging from 18 to 42 years of age, participated in this study. Due to the nature of visual speech feedback, the speaking task required that participants recite memorized phrases in control and experimental speaking conditions, so that visual attention could be focused on the speech feedback rather than on a written passage. During experimental conditions, participants recited memorized phrases while simultaneously focusing on the movement of their lips, mouth, and jaw within their own synchronous (i.e., mirror) and asynchronous (i.e., delayed video signal) visual speech feedback.

Results: Results indicated that the self-generated visual feedback speaking conditions significantly decreased stuttering frequency (Greenhouse–Geisser p < .001); post hoc orthogonal comparisons revealed no significant differences in stuttering frequency reduction between the synchronous and asynchronous visual feedback speaking conditions (p = .2554).

Conclusions: These data suggest that synchronous and asynchronous self-generated visual speech feedback is associated with significant reductions in overt stuttering frequency. Study results are discussed relative to existing theoretical models of fluency-enhancement via speech feedback, such as the engagement of mirror neuron networks, the EXPLAN model, and the Dual Premotor System Hypothesis. Further research in the area of self-generated visual speech feedback, as well as theoretical constructs accounting for how exposure to multi-sensory speech feedback enhances fluency, is warranted.

Learning outcomes: Readers will be able to (1) discuss the multi-sensory nature of fluency-enhancing speech feedback, (2) compare and contrast synchronous and asynchronous self-generated and externally generated visual speech feedback, and (3) compare and contrast self-generated and externally generated visual speech feedback.

© 2009 Elsevier Inc. All rights reserved.

* Corresponding author. Tel.: +1 662 915 1202; fax: +1 662 915 5717.
E-mail addresses: [email protected] (G.J. Snyder), [email protected] (M.S. Hough), [email protected] (P. Blanchet), [email protected] (L.J. Ivy), [email protected] (D. Waddell).
1 Tel.: +1 252 744 6090; fax: +1 252 744 6109.
2 Tel.: +1 716 673 3169; fax: +1 716 673 3235.
3 Tel.: +1 662 915 5122; fax: +1 662 915 5717.
4 Tel.: +1 662 915 5563; fax: +1 662 915 5525.

0021-9924/$ – see front matter © 2009 Elsevier Inc. All rights reserved.
doi:10.1016/j.jcomdis.2009.02.002

1. Introduction

Stuttering is generally considered to be a speech disorder that emerges between 2 and 4 years of age, affects approximately 1% of the global population, and is characterized by part- and whole-word repetitions, prolongations, and inaudible postural fixations during speech production (Bloodstein & Bernstein-Ratner, 2008). While approximately 5% of all children exhibit stuttered speaking behaviors at some point during their speech and language development, approximately 80% of children demonstrating stuttering behaviors spontaneously recover from the stuttering phenomenon; the remaining 20% continue to exhibit stuttering into adulthood (Yairi & Ambrose, 2005). Substantial evidence indicates that the etiology of the stuttering phenomenon has a genetic and/or neurological genesis, with the possibility of environmental factors contributing to the development of the pathology as well (Yairi & Ambrose, 2005).

Research reliably documents that overt stuttered speaking behaviors are dramatically, albeit transiently, reduced with the use of various forms of speech feedback (Bloodstein & Bernstein-Ratner, 2008; Starkweather, 1987). One such example is rhythmic or metronome-timed speech, which paces the initiation of each syllable or word with a rhythmic beat from an exogenous auditory, visual, or tactile stimulus (Bloodstein & Bernstein-Ratner, 2008). Moreover, fluency-enhancement via the metronome effect remains significant at both normal and fast speech rates (Hanna & Morris, 1977). Although Stager, Denman, and Ludlow (1997) reported that metronome-timed speech resulted in increased subglottal pressure rise time, as well as decreased vowel intensity and peak pressure, it remains unclear whether these changes in speech production are the cause or the result of the subsequent fluency-enhancement.

Other forms of fluency-enhancing speech feedback include exposure to auditory masking noise and white noise (Bloodstein & Bernstein-Ratner, 2008). A number of studies document significant reductions in overt stuttering frequency in the presence of auditory masking noise (Cherry & Sayers, 1956; Shane, 1955). Significant fluency-enhancement via auditory masking noise is documented to occur with exposure to both low (below 500 Hz) and high (above 500 Hz) frequency masking noise (Cherry & Sayers, 1956; Conture, 1974; May & Hackwood, 1968). However, research also reveals that even the monaural presentation of moderately intense white noise (i.e., 50 dB) enhances fluency in those who stutter (Maraist & Hutton, 1957). In other words, although auditory masking noise has been documented to enhance fluency in those who stutter, simple monaural (and binaural) exposure to white noise (as low as 50 dB) also serves as a significant fluency-enhancer (Barr & Carmel, 1969; Yairi, 1976). Although researchers have tried to account for how and why exposure to either auditory masking or white noise significantly enhances fluency in those who stutter, the relationship between these two feedback conditions, as well as their mechanisms of efficacy, remains unknown (Bloodstein & Bernstein-Ratner, 2008). Finally, research has suggested fluency-enhancement via exposure to speech feedback of a second speech signal (Guntupalli, Kalinowski, Saltuklaroglu, & Nanjundeswaran, 2005; Kalinowski & Dayalu, 2002; Kalinowski, Stuart, Rastatter, Snyder, & Dayalu, 2000).
A second speech signal (SSS) is the speech feedback of a second, gesturally similar, and concurrent speech signal relative to the (primary) spoken speech signal (Andrews, Howie, Dozsa, & Guitar, 1982). A variety of methodologies employ the use of synchronous and asynchronous SSSs through various sensory modalities. Specific examples of the SSS include delayed auditory feedback (DAF; Andrews et al., 1983), frequency altered feedback (FAF; Hargrave, Kalinowski, Stuart, Armson, & Jones, 1994; Howell, El-Yaniv, & Powell, 1987), auditory choral speech (ACS; Bloodstein & Bernstein-Ratner, 2008), and visual choral speech (VCS; Kalinowski et al., 2000). Relatively synchronous SSSs that are documented to significantly enhance fluency include methodologies such as FAF, ACS, and VCS, as the primary and second speech signals are in relative unison. Asynchronous forms of a SSS include DAF, with data revealing that delays from 50 ms to over 250 ms are sufficient to significantly enhance fluency in those who stutter (Kalinowski, Stuart, Sark, & Armson, 1996). Although fluency-enhancement via exposure to any number of SSSs is widely documented, a single prevailing paradigm accounting for how and why exposure to a SSS enhances fluency has not emerged (Bloodstein & Bernstein-Ratner, 2008).

Research has revealed remarkable data regarding fluency-enhancement via visual speech feedback. Visual feedback, in the form of speech-contingent flashing lights, was documented to enhance fluency in those who stutter (Kuniszyk-Jozkowiak, Smolka, & Adamczyk, 1996, 1997). A few years later, Kalinowski et al. (2000) documented a more efficient fluency-enhancing visual speech feedback methodology in the form of an externally generated synchronous visual second speech signal (i.e., visual choral speech). This latter finding was seminal in that it suggested that fluency-enhancement via speech feedback of a SSS is not solely an auditory phenomenon (such as DAF, FAF, or ACS), but rather a multi-sensory phenomenon (Kalinowski et al., 2000).

The speculation that fluency-enhancement via a SSS functions as a multi-sensory phenomenon led to the hypothesis that exposure to self-generated synchronous and asynchronous visual SSSs would likewise enhance fluency in those who stutter. Therefore, the purpose of the present study was twofold: first, to report preliminary data on fluency-enhancement secondary to exposure to synchronous and asynchronous self-generated visual SSSs; second, to report any differential effects on fluency-enhancement as a result of exposure to synchronous and asynchronous self-generated visual SSSs. For the purposes of this study, synchronous self-generated visual speech feedback was presented by the use of a mirror; asynchronous self-generated visual speech feedback was presented by the use of a delayed visual feedback apparatus.

2. Methods

2.1. Participants

Eight adults who stutter (4 males, 4 females), ranging from 18 to 42 years of age (median age = 26, mean age = 30.14, S.D. = 10.21), participated in this study. Participants reported either normal or corrected vision, and no other diagnosed speech, language, hearing, or attention disorders. Although all participants had a history of speech therapy, only one was currently enrolled. All participants, at a minimum, had graduated from high school.

2.2. Task and stimuli

Measuring the effects of visual SSS speech feedback requires specially designed speaking and reading tasks. These tasks require that participants' eye gaze remain focused on the visual SSS speech feedback, thereby disallowing focused eye gaze on another speaker, written text, or text scrolling across a monitor. Consequently, the principal speaking task and stimuli employed in this study modified a methodology used in previous visual speech feedback research (Kalinowski et al., 2000). During each speaking condition, participants read passages taken from junior high school science textbooks, all of which have been used in previous research (Kalinowski et al., 2000). Each passage, consisting of approximately 300 syllables, was divided into phrases of 10–15 words and printed on large double-sided cue cards. Participants sat at a table (approximately 75 cm in height) and were instructed to silently read and memorize a "phrase of comfortable length" (generally ranging from 5 to 7 words). Participants were then instructed to look up from the cue card, direct their eye gaze into the visual SSS speech feedback, and recite the phrase they had just silently read and memorized.
Although this task required participants to silently rehearse each phrase, the same procedure was used for all speaking conditions, thereby controlling for any differential effects of silent rehearsal on stuttering frequency. Practice trials, using an unrelated reading passage, were allowed until participants reported feeling comfortable with each speaking condition. Participants were instructed to speak at a normal rate and not to use any speaking techniques that could alter, control, or reduce stuttering. Both the speaking conditions and the passages were counterbalanced using a Latin square, as sketched below.
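The paper does not describe how its Latin square was constructed, so the following Python sketch is purely illustrative: it builds a cyclic Latin square over the three speaking-condition labels used in the study and cycles the eight participants through the resulting row orders.

def cyclic_latin_square(conditions):
    """Cyclic Latin square: each row shifts the condition order by one,
    so every condition appears exactly once in each row and column."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)]
            for row in range(n)]

# Condition labels from the study: no visual feedback (NVF),
# synchronous visual feedback (SVF), asynchronous visual feedback (AVF).
rows = cyclic_latin_square(["NVF", "SVF", "AVF"])
for participant in range(1, 9):          # eight participants
    order = rows[(participant - 1) % 3]  # cycle through the three orders
    print(f"Participant {participant}: {' -> '.join(order)}")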

2.3. Apparatus and procedure

A "no visual feedback" (NVF) speaking condition served as the control condition for the study. During this condition, participants were instructed to read and silently memorize a phrase of comfortable length from the cue card, and then to look up (away from the cue card) and initiate speech.

A vertically positioned mirror (24 cm high × 33 cm wide, placed approximately 46 cm from the participant's face) positioned at eye level was used to create synchronous self-generated visual feedback (SVF) for the second speaking condition. During this experimental condition, participants were instructed to read and silently memorize a phrase of comfortable length from the cue card, and then to "look at your reflection and follow the movement from your lips, mouth, tongue, and jaw to initiate and maintain speech."

A third speaking condition tested fluency-enhancement via asynchronous self-generated visual feedback (AVF). The AVF was generated using a 3Com HomeConnect universal serial bus netcam (model #0776) connected to a Winbook XL2 laptop computer. The netcam was positioned approximately 61 cm from the participant, mounted slightly above eye level, and aimed at the participant's lips, mouth, and jaw. The participant's AVF was displayed on the laptop computer's monitor, which was no more than 46 cm from the participant and positioned at eye level. The laptop computer ran Microsoft Windows 98 (Version 4.10.1998) and was configured with 128 megabytes of random access memory and a 300-MHz Pentium II processor. It was equipped with a 14-in. active matrix liquid crystal display configured to 800 × 600 pixel resolution. This hardware and software configuration provided a noticeable and reliable visual delay.

The AVF speaking condition followed a nearly identical protocol to the SVF condition, with the exception that participants were instructed to look at their image (displayed on the laptop's monitor) and pause for the asynchronous (delayed) visual feedback to become temporarily synchronous with their head position. Once the asynchronous (delayed) visual image "caught up" with the participant's head movement, thereby providing participants with direct visual access to their lips, mouth, and jaw displayed on the laptop monitor, they were instructed to "look at your image on the screen and follow the movement from your lips, mouth, tongue, and jaw to initiate and maintain speech."
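The delay in the original apparatus was a byproduct of late-1990s webcam hardware rather than something configured in software. As a rough modern illustration of the same idea with a controllable delay, the sketch below is a minimal reimplementation assuming OpenCV (not the authors' tooling): it buffers webcam frames in a FIFO queue and displays the oldest one, approximating the 0.36 s delay the study measured.

import collections

import cv2  # OpenCV -- an assumption; the original delay was inherent
            # to the 3Com netcam / Windows 98 rendering pipeline.

DELAY_SECONDS = 0.36  # mean delay measured for the study's apparatus
FPS = 30              # assumed capture rate

def delayed_visual_feedback(delay_s=DELAY_SECONDS, fps=FPS):
    """Show webcam video delayed by roughly delay_s seconds."""
    buffer_len = max(2, round(delay_s * fps) + 1)
    frames = collections.deque(maxlen=buffer_len)
    cap = cv2.VideoCapture(0)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            frames.append(frame)
            if len(frames) == buffer_len:
                # frames[0] was captured buffer_len - 1 frames ago,
                # i.e. approximately delay_s seconds in the past.
                cv2.imshow("Delayed visual feedback", frames[0])
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()

if __name__ == "__main__":
    delayed_visual_feedback()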
Although the methodology described above successfully provided asynchronous visual feedback, the precise delay time of the visual feedback was initially unknown. Consequently, a specialized protocol was developed to precisely quantify the visual delay created by the methodology in the present study. This protocol involved the netcam and laptop computer configured to the previously described study specifications, a strobe light (RS #41-3048), and a stand-alone digital video camera (Sony #DSR-PD100). The protocol was designed to quantify the visual delay by allowing the stand-alone digital video camera to capture a single strobe flash directly from the strobe light, as well as the image of that flash as captured by the netcam and displayed on the laptop's LCD monitor. As such, the video captured by the stand-alone digital video camera contained both the original strobe flash and the image of the strobe flash rendered on the laptop computer, and was used to quantify the latency period between the two.

This video signal was digitized using Broadway Pro (version 5.10.9) into an .avi video file (at 75 megabytes per minute) at 30 frames per second, and analyzed with Ulead's Video Editor (version 6.0). The individual video frames containing the original strobe flash and the rendered image of the strobe flash displayed on the laptop computer's monitor were identified. The difference between the two frame indices was divided by 30, thereby providing the latency period between the original strobe flash and the rendered strobe flash in seconds. This process was iterated five times and averaged, yielding an average visual delay of 0.36 s (S.D. = .054 s) for the study's AVF protocol. No other asynchronous (i.e., delayed) visual feedback time delay settings were employed in this preliminary study.
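Concretely, the latency computation reduces to a frame-index difference divided by the frame rate. A small sketch follows; the per-trial frame indices are hypothetical, since the paper reports only the resulting mean and standard deviation.

def strobe_delay_seconds(flash_frame, rendered_frame, fps=30):
    """Latency between the original strobe flash and its rendered image,
    given the video frame indices at which each first appears."""
    return (rendered_frame - flash_frame) / fps

# Hypothetical frame-index pairs for the five measurement trials;
# only the mean (0.36 s) and S.D. (.054 s) are reported in the paper.
trials = [(100, 111), (220, 230), (340, 351), (460, 471), (580, 591)]
delays = [strobe_delay_seconds(flash, rendered) for flash, rendered in trials]
print(f"mean visual delay: {sum(delays) / len(delays):.2f} s")  # ~0.36 s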

During the experimental speaking conditions, participants perceived either synchronous or asynchronous visual speech feedback; no forms of altered auditory speech feedback were introduced during any speaking condition. All participants were video recorded using a Hi-8 mm video camera (Sony #CCD-TRV75) and a lapel microphone (RS #33-3003) attached no more than 15 cm from the mouth at an approximate orientation of 0° azimuth and 180° altitude.

2.4. Data collection and analysis

Given that stuttering is often behaviorally defined as the production of three percent or greater stuttered syllables during speech production (Bloodstein & Bernstein-Ratner, 2008; Starkweather, 1987; Van Riper, 1982), only those participants demonstrating 3% or greater stuttering frequency in the control speaking condition were included in this study. For the purposes of this study, moments of overt stuttering were operationally defined as whole- and part-word repetitions, sound or syllable prolongations, or inaudible postural fixations (i.e., "blocking") (Bloodstein & Bernstein-Ratner, 2008).

Fig. 1. The minimum, maximum, inter-quartile range, and median values as a function of the no visual feedback (NVF), synchronous visual feedback (SVF), and asynchronous visual feedback (AVF) speaking conditions. The asterisk in the AVF speaking condition denotes an outlier.

Stuttered syllables were counted from the first 300 syllables of each speaking condition by the primary author of the study. Intrajudge syllable-by-syllable agreement for 25% of the data, as indexed by Cohen's kappa (Cohen, 1960), was .94. A trained stuttering research assistant, blind to the purpose of the study, randomly selected and independently analyzed 25% of the data, revealing an interjudge syllable-by-syllable agreement of .83, a value that represents excellent agreement (Fleiss, 1981).

3. Results

The distributions of stuttering frequency as a function of visual feedback speaking condition are presented in Fig. 1. Specifically, the mean stuttering frequency was 37.13 stuttered syllables (S.E. = 10.55) for the NVF speaking condition, 14.88 stuttered syllables (S.E. = 6.41) for the SVF speaking condition, and 7.38 stuttered syllables (S.E. = 3.33) for the AVF speaking condition. As shown in Fig. 1, approximately 60% and 80% reductions of stuttered syllables occurred in the SVF and AVF speaking conditions, respectively.
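Those percentages follow directly from the condition means reported above; a short Python check:

# Mean stuttered syllables (per 300-syllable sample) reported above.
means = {"NVF": 37.13, "SVF": 14.88, "AVF": 7.38}

def percent_reduction(control, experimental):
    """Percent reduction in stuttered syllables relative to the control."""
    return 100 * (control - experimental) / control

for condition in ("SVF", "AVF"):
    reduction = percent_reduction(means["NVF"], means[condition])
    print(f"{condition}: {reduction:.0f}% reduction")  # 60% and 80%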

Due to the relatively small sample used in this study (i.e.,