Comparison of acoustic evidence of syllabic strength to lexical boundary error patterns revealed a source of .... measures taken from the hypokinetic dysarthric speech samples and those .... lables of the following words: ''teacher, release, legal, im- peach, veto ... occupied by the vowel quadrilateral derived from vowels in.
Syllabic strength and lexical boundary decisions in the perception of hypokinetic dysarthric speech Julie M. Liss and Stephanie Spitzer Motor Speech Disorders Laboratory, Arizona State University, Box 871908, Tempe, Arizona 85281
John N. Caviness, Charles Adler, and Brian Edwards Department of Neurology, Mayo Clinic-Scottsdale, 13400 Shea Boulevard, Scottsdale, Arizona 85259
~Received 13 August 1997; revised 19 May 1998; accepted 30 June 1998! This investigation evaluated a possible source of reduced intelligibility in hypokinetic dysarthric speech, namely the mismatch between listeners’ perceptual strategies and the acoustic information available in the dysarthric speech signal. A paradigm of error analysis was adopted in which listener transcriptions of phrases were coded for the presence and type of word boundary errors. Seventy listeners heard 60 phrases produced by speakers with hypokinetic dysarthria. The six-syllable phrases alternated strong and weak syllables and ranged in length from three to five words. Lexical boundary violations were defined as erroneous insertions or deletions of lexical boundaries that occurred either before strong or before weak syllables. A total of 1596 lexical boundary errors in the listeners’ transcriptions was identified unanimously by three independent judges. The pattern of errors generally conformed with the predictions of the Metrical Segmentation Strategy hypothesis @Cutler and Norris, J. Exp. Psychol. 14, 113–121 ~1988!# which posits that listeners attend to strong syllables to identify word onsets. However, the strength of adherence to this pattern varied across speakers. Comparison of acoustic evidence of syllabic strength to lexical boundary error patterns revealed a source of intelligibility deficit associated with this particular type of dysarthric speech pattern. © 1998 Acoustical Society of America. @S0001-4966~98!03410-9# PACS numbers: 43.71.Gv @WS#
INTRODUCTION
Speech intelligibility is central to the diagnosis, treatment, and study of dysarthria. However, speech intelligibility is not a unitary construct, and it is not a simple byproduct of the quality or clarity of the speech signal. Estimates of speech intelligibility depend on how the speaker talks, on what he or she is saying, and on who is doing the listening ~Connolly, 1986; Weismer and Martin, 1992; Yorkston and Beukelman, 1978, 1980a; Yorkston et al., 1990!. As a result of this complexity, research on intelligibility in dysarthria has taken a variety of forms. At one end of the continuum are those investigations of the causal relationship between acoustic information in the dysarthric speech signal and the perceptual consequences of this information, particularly at the single-word level ~Kent et al., 1989; Tikofsky et al., 1966; Yorkston and Beukelman, 1980a!. Other studies have explored message and listener effects on measures of intelligibility in dysarthria, primarily in connected speech ~e.g., Dongilli, 1994; Hammen et al., 1991; Garcia and Cannito, 1996; Tjaden and Liss, 1995; Yorkston and Beukelman, 1980b!. Taken together, these studies have delineated a wide range of variables that may affect estimates of intelligibility in dysarthria. They also have highlighted the conceptual gap between studies of single-word and connected speech perception, leading to conclusions such as those of Weismer and Martin ~1992!: ‘‘The explanatory principles that are emerging from the study of single-word intelligibility are likely to be quite different than those that will be discovered in the study of sentence intelligibility’’ ~p. 67!. The present report is the first of a series of studies that 2457
J. Acoust. Soc. Am. 104 (4), October 1998
adopts a different vantage point than those of previous investigations: the interface between the speech signal and the listener’s response to that signal ~see Lindblom, 1990!. Our studies examine evidence of the perceptual strategies listeners use to decipher the impoverished dysarthric speech signal, and how certain aspects of the dysarthric speech signal affect these perceptual strategies. To address these issues, we selected a paradigm that would allow us to consider jointly speaker and listener variables in the identification of word boundaries in dysarthric connected speech. This task, although seemingly effortless under normal circumstances, is perhaps the single most important process in the perception of connected speech ~Cutler and Norris, 1988; Gow and Gordon, 1995; Quene’, 1992!. Words must be extracted from the continuous acoustic stream for lexical access, and thus speech perception, to occur. No research to date has directly examined the special challenge that dysarthria poses to the task of lexical segmentation in connected speech, although it has been suggested that listeners modify their perceptual strategies when faced with impoverished acoustic signals ~Forster, 1989; Marslen-Wilson, 1989; McQueen, 1991; Pisoni and Luce, 1986!. If we can identify a mismatch between perceptual strategies and the available acoustic information, we will have learned something about the source or nature of intelligibility deficit. This is in contrast to other studies of intelligibility that regard listener perception as the tool or metric by which speaker performance is assessed. The paradigm used here to study lexical segmentation of dysarthric speech derives from a model of normal speech perception, termed the Metrical Segmentation Strategy ~MSS; Cutler and Butterfield, 1992; Cutler and Norris,
0001-4966/98/104(4)/2457/10/$15.00
© 1998 Acoustical Society of America
2457
1988!. This model has produced a growing body of evidence suggesting that listeners exploit syllabic strength, specifically the juxtaposition of strong and weak syllables, to parse the continuous acoustic stream into its component words. According to this model, strong syllables are those that contain full vowels and that may or may not receive prosodic stress, and weak syllables contain reduced vowels and do not receive prosodic stress1 ~Cutler and Butterfield, 1990, 1991; Cutler and Carter, 1987; Fear et al., 1995; Smith et al., 1989!. Central to the MSS hypothesis is the assumption that segmentation of the speech signal is activated by the occurrence of a strong syllable ~Cutler and Norris, 1988; Grosjean and Gee, 1987! Support for the MSS hypothesis is found in the statistical probabilities of syllabic strength in the English language, as well as in perception and production studies. Cutler and Carter ~1987! determined that, in English, the occurrence of strong syllables is associated with 73% of word–initial syllables or single-syllable words. They reported that words with word–initial strong syllables are most frequently openclass words such as nouns, verbs, and adjectives; weak syllables are associated most often with second syllable placement or word–initial placement ~including single-syllable words! in closed-class words, such as articles and pronouns ~see also Cutler, 1993!. Thus when listeners recognize strong syllables as word–initial, there is a high probability that they will be successful in their lexical segmentation. In a specific test of the MSS hypothesis, Cutler and Butterfield ~1992! presented listeners with phrases containing six alternating strong and weak syllables at a very low intensity, just above threshold. The pattern of errors evident in the listeners’ transcriptions was as expected if they recognized strong syllables as word onsets: Listeners were more likely to incorrectly insert lexical boundaries before strong than weak syllables, and they were more likely to delete lexical boundaries before weak than strong syllables. Moreover, they found that the words following a lexical boundary inserted before a strong syllable were most often open class, and the words following a lexical boundary inserted before a weak syllable were most often closed class. This pattern of results suggests that the listeners attended to the relatively robust acoustic cues associated with strong syllables ~Smith et al., 1989!. If listeners indeed rely on syllabic strength to identify word boundaries, speech with reduced syllabic strength contrasts—such as that produced by some speakers with dysarthria—should reduce the effectiveness of this perceptual strategy. Evidence for this reduction in effectiveness would be apparent in the patterns of lexical boundary errors ~LBEs! the listeners produce when deciphering such speech. Specifically, we would expect to see either a more even distribution of lexical boundary insertions and deletions between strong and weak syllables than has been shown to occur with normal degraded speech, or we would see a distribution that more closely aligns with the opportunities for these errors to occur. Hypokinetic dysarthria was chosen as a test case for the present investigation because the principal perceptual speech characteristics, by definition, serve to diminish syllabic 2458
J. Acoust. Soc. Am., Vol. 104, No. 4, October 1998
strength contrastivity: rapid speaking rate, monotony, monoloudness, and phoneme imprecision ~Darley et al., 1969!. The perception of reduced syllabic strength contrasts in hypokinetic dysarthria is supported indirectly by acoustic studies that have examined segmental and suprasegmental characteristics. These include demonstrations of rapid speaking rate and short segment durations ~Adams, 1991; Forrest et al., 1989!, reduced vowel formants and trajectories ~Forrest et al., 1989; Weismer, 1984!, consonant imprecision ~Logemann and Fisher, 1981; Weismer, 1991!, and the vocal characteristics of monopitch, monoloudness, and reduced stress ~Ludlow and Bassich, 1984; Ramig, 1992!. Thus from both perceptual and acoustic standpoints, hypokinetic speech that possesses the cardinal constellation of symptoms listed above has reduced syllabic contrastivity. The purpose of the present study was to examine the relationship between listeners’ perceptual strategies and the nature of the dysarthric speech signal to determine whether evidence of a mismatch could be identified. The variables of interest were the quantity and location of LBEs produced by the listeners as they attempted to decipher a set of phrases marked by reduced syllabic strength contrastivity. The following questions were addressed: ~1! Does this hypokinetic dysarthric speech elicit LBEs; ~2! Do patterns of LBEs support or refute the MSS hypothesis and the listeners’ apparent reliance on syllabic strength information to segment the acoustic stream; ~3! To what extent do acoustic indices of reduced syllabic strength contrastivity correspond with the number and pattern of LBEs produced by the listeners. I. METHOD A. Listeners
Seventy graduate and undergraduate ~63 females and 7 males! students primarily from the Speech and Hearing Science Department of Arizona State University served as listeners for this experiment. Their ages ranged from 21 to 50 years, with a mean age of 27 years. All listeners self-reported normal hearing and were native speakers of Standard American English. All listeners reported having little or no experience listening to dysarthric speech. B. Speech stimuli
The goal of speaker selection was to obtain a group of speakers with hypokinetic dysarthria whose segmental and suprasegmental characteristics were highly similar ~as per the operational definition of hypokinetic dysarthria herein!, and who were at least moderately impaired in intelligibility on the stimulus phrases. Therefore, only speech characteristics at the time of recording the stimulus phrases were relevant to the purposes of the present investigation. The two certified speech-language pathologists associated with this study confirmed that all speakers who participated exhibited some degree of all components of the operational definition: perceptually rapid speaking rate with monopitch and monoloudness; little use of variation in pitch or loudness to achieve differential syllabic stress; imprecise articulation that gives the impression of a blurring of phonemes and syllables; and a breathy and perhaps hoarse/harsh voice. These percepLiss et al.: Lexical boundary errors
2458
TABLE I. Information on speakers with hypokinetic dysarthria. Speaker HF1 HF2 HF3 HM1 HM2 HM3 HM4
a
Age
Years Post-Dx
H&Y
AIDS word
AIDS sentence
48 73 74 62 80 74 60
7.0 5.5 6.5 8.0 5.5 17.0 27.0
2 3 3 ¯ ¯ 3.5 3
92% 54% 20% 74% 76% 96% 72%
96% 45% 24% 70% 58% 88% 91%
a
Note. ‘‘HF’’ and ‘‘HM’’ refer to hypokinetic dysarthric female and male speakers, respectively. Speaker’s ages and the number of years that have elapsed since their diagnoses of Parkinson’s disease are presented in the first two data columns. The third data column contains the Hoehn and Yahr ~Hoehn and Yahr, 1967! scores that were obtained by trained neurology personnel within six months of speech sample collection. Dashes indicate no score was obtained. The last columns contain the best of two scores for the word and sentence subtests of the Assessment for Intelligibility of Dysarthric Speech ~Yorkston and Beukelman, 1981!.
tual impressions were supported by comparisons of acoustic measures taken from the hypokinetic dysarthric speech samples and those from a group of a control speakers, as will be described later. Thirty-three people were identified by their neurologists or speech-language pathologists as potential speakers for this study. Of these, 12 people were brought in to participate in recording based on their general speech characteristics and severity levels at the time of an initial telephone conversation with the first author. Five of these speakers who provided speech samples ultimately were not used in the investigation because of additional speech characteristics either not present or not noted during the initial telephone screening. These characteristics included the presence of a distinctive regional accent, a pervasive vocal tremor and oral dyskinesia, insufficient impairment of intelligibility, and the absence of one or more components of the operational definition of hypokinetic dysarthria. The seven remaining speakers ~4 men, coded HM1–4; and 3 women, coded HF1–3!, all who carried a primary diagnosis of Parkinson’s disease, served as speakers in this investigation. Their ages ranged from 48 to 81 years, with a mean age of 67 years ~see Table I!. Speech samples from a group of neurologically normal speakers who were similar to the dysarthric speakers in age also were collected. The acoustic measures derived from these control speech samples were compared to those of the hypokinetic samples to assess the perceptual impressions of reduced syllabic strength contrastivity among the speakers with hypokinetic dysarthria. The three women ~coded F1– F3! were aged 79, 52, and 56 years, and the three men ~coded M1–M3! were aged 47, 54, and 73 years. The mean age of the control group was 60 years. These speakers had no history or presentation of speech, language, or hearing disorders, and they exhibited no distinctive regional accents. Speech sample collection and data analysis procedures for these participants’ speech samples were identical to those described herein for the speakers with hypokinetic dysarthria. As with all speakers of this investigation, the control group was blinded to the purpose of the study. Speech samples were collected during a single hour-long session with each speaker. The protocol included the admin2459
J. Acoust. Soc. Am., Vol. 104, No. 4, October 1998
istration of the word and sentence subtests of the Assessment of Intelligibility of Dysarthric Speech ~AIDS; Yorkston and Beukelman, 1981!, the elicitation of several minutes of spontaneous speech, and the production of a set of stimulus phrases.2 The phrases were modeled after those of Cutler and Butterfield ~1992! and they were designed to have low interword predictability to reduce the contribution of semantic information to intelligibility. The 60 six-syllable phrases alternated strong ~S! and weak ~W! syllables, such that half of the phrases contained an SWSWSW phrasal stress pattern, and the other half contained a WSWSWS phrasal stress pattern. In addition to the phrasal stress distinction, the vast majority of strong and weak syllables contained full and reduced vowels, respectively. The phrases ranged in length from three to five words, and no word contained more than two syllables. None of the words in the phrases was repeated except articles and auxiliary verbs; and all English phonemes except /c/ were represented. For the purposes of the larger investigation, a portion of the phrases contained words that were deemed to have political connotation—these words have no particular import for the present study. Digital audio recordings of the speech protocol were made in a sound-damped booth with an initial microphoneto-mouth distance of 8 in. A Panasonic SV-3700 digital audio tape recorder, a Microtech Gefell GMBH condenser microphone mounted on a stand, a Mackie 1202 mixer, and high quality digital audio tapes were used to record the speech samples. Recording levels were monitored closely. A digital Tenma 72-860 sound level meter was used to measure sound pressure level at the microphone during each speaker’s sustained /a/ and connected speech, both prior to recording and at intervals during the session. When the maximum levels obtained by a speaker did not reach a minimum of 65 dB SPL at the microphone, the microphone was moved incrementally toward the mouth. No microphone-to-mouth distance was less than 5 in., none was more than 8 in. The average maximum sound pressure level for the hypokinetic dysarthric speakers across phrase productions was 73.7 dB SPL at the microphone, with a standard deviation of 5.4; the normal control speakers averaged 76.1 dB SPL with a standard deviation of 5.5. Throughout recording, the LED voltage indicator on the DAT recorder was monitored continuously to insure an identical peak voltage range for each phrase or production. For the production of the stimulus phrases, speakers were encouraged to produce them in their ‘‘normal, conversational voice.’’ All intelligibility test words and sentences and stimulus phrases were printed in large bold print on 8 21 by 11 in. cards and were read aloud by the speakers. Each production of the phrases was low-pass filtered at 10 kHz, digitized at a 22-kHz sampling rate and stored in a computer file using CSpeech Laboratory Automation System ~Milenkovic and Read, 1992!. Each speaker typically produced four iterations of each stimulus phrase during the course of speech sample collection. The first token which contained no word omissions, substitutions, dysfluencies, or interword pauses, and which most closely represented our operational definition of hypokinetic dysarthria was selected as the experimental token. The 60 phrases per speaker were Liss et al.: Lexical boundary errors
2459
then downloaded on to DAT audiocassettes for use in the perceptual experiment, and saved in computer files for acoustic analysis. Seven listening tapes for the hypokinetic speakers were created. Each tape contained one production of the 60 phrases in identical order to evenly distribute any practice effects that may have accrued over the course of the listening task. Each tape contained contiguous productions from 3 speakers, the first 20 from 1 speaker, 21–40 from another speaker, and 41–60 from a third speaker. Thus each speaker’s 60 phrases were distributed across 3 different tapes, such that 30 listeners ~10 listeners in each tape group! would hear and transcribe the productions of each of the 7 speakers. This provided an opportunity to identify any group of listeners which may have performed significantly better or worse than any other group. Phrases were preceded by the phrase number ~1–60! spoken by a neurologically normal female, and followed with a 12-s interstimulus silent interval. The subjective recording quality was judged to be high and signal intensity consistent across speakers and phrases. Identical procedures were used for the creation of listening tapes for the control speech samples. These tapes were used in the present investigation only for acoustic analysis and not LBE assessment because the normal speech was found to elicit only a negligible number of LBEs from among listeners. To obtain acoustic evidence of the perceived decreased syllabic strength contrasts, all of the 420 digitized hypokinetic phrases (60 phrases37 speakers) used in the listening experiment were subjected to the following acoustic measures: phrase duration, F0 and amplitude variation within and between phrases, and first and second formant frequencies of selected strong vowels across the phrases. Although these measures were not conducted on consecutive strong and weak syllables, they were meant to capture global reductions in prosody and vowel working space, particularly as compared to the measures from the control phrases. Reductions in phrase duration, F0 and amplitude variation, and vowel working space relative to the normal controls were regarded as support for the perceived characteristics of rapid speaking rate, monotony, and reduced vowel contrastivity, respectively. The same measures from the 360 phrases of the control group (60 phrases36 speakers) were used as a source of comparison. All acoustic measures were accomplished using the software CSpeech ~Milenkovic and Read, 1992!. Phrase duration was obtained during the initial editing of the phrases by placing cursors on the first and last acoustic evidence of phonemes on the spectrographic display. This included the first or last glottal pulse in the case of initial or final voiced phonemes, respectively; the beginning or end of noise energy in the case of initial or final fricatives; and the beginning or end of the of the burst release in the case of initial or final stop consonants. One hundred milliseconds of silence was then appended to the beginning and end of each phrase to reduce onset–offset effects and the entire screen was saved as a digital file for all subsequent acoustic analysis. Fundamental frequency (F0) and its variation within each digitized phrase was computed automatically using the short-term autocorrelation function with center clipping. All 2460
J. Acoust. Soc. Am., Vol. 104, No. 4, October 1998
pitch traces were inspected visually to identify and edit tracking errors which occurred in this set of phrases. The majority of tracking errors consisted of brief regions of pitch doubling that required interpolation. Approximately 10% of the phrases that required hand-editing were remeasured by a second judge to calculate interjudge reliability. One hundred percent of the interjudge differences were less than 7 Hz; 70% of these were less than 3 Hz. The rms amplitude envelope of each phrase was converted automatically to mean decibels and variation around the mean was calculated. Frequency and amplitude data were saved in computer files for the automated calculation of across-speaker values. First and second formant frequencies were measured at the temporal midpoints of seven occurrences of the vowels /i/, /,/, /a/, and /u/, using both broadband spectrograms and LPC displays. The vowels were taken from the strong syllables of the following words: ‘‘teacher, release, legal, impeach, veto, appear,’’ and ‘‘eager’’ for /i/; ‘‘ballot, taxes, catch, batch, rally, attack,’’ and ‘‘fast’’ for /,/; ‘‘congress, lobby, water, follow, solid, proper,’’ and ‘‘models’’ for /a/; and ‘‘choose, soon, nuisance, new, assume, remove,’’ and ‘‘include’’ for /u/. A total of 364 vowels were measured (4 vowels37 instances37 hypokinetic and 6 control speakers!. Means and standard deviations were used to create F1 2F2 plots to define the vowel quadrilaterals for each speaker, and geometric area values were calculated by summing the areas of the two triangles created by bisecting each quadrilateral ~see Fourakis, 1991; Turner et al., 1995!. Approximately 30% of the hypokinetic and 10% of the normal control vowels were remeasured and intrajudge reliability was found to be high and acceptable: 98% of the differences between the original and second measures of F1 were less than 20 Hz; 93% of the differences between the original and second measures of F2 were less than 40 Hz. Interjudge reliability was calculated on approximately 20% of all vowel measurements and also was found to be acceptable: 84% of the differences between the original and second measures of F1 were less than 20 Hz; 80% of the differences between the original and second measures of F2 were less than 40 Hz. Inter- and intrajudge differences that exceeded 40 Hz for F1 and 60 Hz for F2, which all occurred in the dysarthric samples, were reassessed and modified as appropriate. Severity of speech intelligibility impairment was estimated by performance on the word and sentence subtests of the AIDS, as scored by two naive judges. The mean difference between the two judges’ scores was 10% for both the word and sentence subtests. The highest score obtained by either of the judges for each subtest was used as an index of best case speaker intelligibility ~see Table I!. Phrase duration measures support the perceptual impression of rapid rate among these speakers with hypokinetic dysarthria. Figure 1 shows the mean phrase duration and standard deviation for each of the hypokinetic dysarthric and control speakers. All but one of the speakers with hypokinetic dysarthria exhibited more rapid phrase production than any of the normal controls ~HF1!. To visually compare mean intraphrase F0 and intensity variations across speakers and between speaker groups, coefficients of variation were calculated by dividing each stanLiss et al.: Lexical boundary errors
2460
FIG. 1. Mean phrase durations are shown for all speakers. Error bars indicate one standard deviation about the mean. The open bars represent the speakers with hypokinetic dysarthria ~H!; the filled bars represent the means of the control speakers ~C!. The M and F preceding speaker numbers refer to male and female, respectively.
FIG. 3. Geometric areas of vowel quadrilaterals, expressed in units2, were constructed from the mean F1 and F2 formant values. Open bars represents the speakers with hypokinetic dysarthria ~H!; the filled bars represent the means for the control speakers ~C!. The M and F preceding speaker numbers refer to male and female, respectively.
dard deviation by its mean. These values are plotted in Fig. 2 where the filled and unfilled circles represent data for the control and hypokinetic groups, respectively. Coordinates toward the upper right of the graph are associated with greater degrees of prosodic variation within phrases; those toward the lower left are associated with lesser degrees of prosodic variation. Thus control speakers M2 and F3 exhibited the greatest prosodic variation, and hypokinetic speakers HM2 and HF2 exhibited the least. The F0 variation appears to capture the majority of group difference, however, in that the means all speakers with hypokinetic dysarthria except F1 fall to the left of the control speakers. This corresponds favorably with the perception of monotony among the hypokinetic phrases. The perceptual impression of reduced vowel strength
contrasts in the hypokinetic phrases is supported by the measures of vowel working space, namely the geometric area occupied by the vowel quadrilateral derived from vowels in strong syllables. This measure offers an estimate of the ‘‘outer limits’’ of the vowel working space for this set of phrases. By inference, smaller working spaces may be related to less capacity for strong–weak distinctions because the strong vowels are closer to ‘‘reduced.’’ Figure 3 contains the geometric area values derived from each speaker’s vowel quadrilateral. All of the quadrilaterals from the hypokinetic dysarthric vowels were smaller than those of the control phrases—there was virtually no overlap between members of the two groups. To summarize, this group of speakers with hypokinetic dysarthria produced stimulus phrases that were perceptually rapid, monotonous, and lacking in syllabic strength contrastivity. These perceptual impressions are supported by the associated acoustic measures as they compare to those of a group of control speakers. C. Procedures
FIG. 2. Coefficients of variation for F0 are plotted as a function of coefficients of variation for intensity for all speakers. The open circles represent speakers with hypokinetic dysarthria; the filled circles represent the control speakers. 2461
J. Acoust. Soc. Am., Vol. 104, No. 4, October 1998
Ten listeners were randomly assigned to transcribe each of the seven tapes. Their task was to listen to each phrase and to write down exactly what they heard. They were told that all phrases consisted of real words in the English language, and they were encouraged to guess if they did not know what the speaker was saying. The listeners were seated in individual cubicles. The audio tapes were presented via the Tandberg Educational sound system in the ASU Language Laboratory over high quality Tandberg supra-aural headphones. Equivalent sound pressure levels across headphones were verified with a headphone coupler sound level meter ~Quest 215 Sound Level Meter!. Listeners were instructed to adjust the volume to a comfortable listening level ~in 4-dB increments up or down! during the preliminary instructions. They were directed not Liss et al.: Lexical boundary errors
2461
TABLE II. Examples of coding lexical boundary errors from the listeners’ transcriptions. Target phrase amend the slower page she describes a nuisance they repaired the veto ballot formal circles answer dying temper wait beyond the mention watch him join the caucus sticks are best for pencils soon the men were asking test a law for methods
Listener response
TABLE III. Proportions of possible lexical boundary error sites based on syllable position.
Error typea
I’m in the slower page see this guy’s a nuisance baby pass the veal battle for all circles that’s his dying panther way beyond dementia enjoy the carcass pigs are eventful sooner men were asking test for other methods
IS IS IS IW IW DS DS DS DW DW
Syllable position Word–Initial Error type Deletions Insertions
Strong 34.00% N/A
Weak 26.67% N/A
Nonword–Initial Strong a
N/A 16.00%
Weak N/Aa 23.33%
a
Note. N/A refers to the fact that lexical boundary deletions cannot occur in nonword–initial syllable positions, and that lexical boundary insertions cannot occur in word–initial syllable positions. Phrase–initial syllables were excluded from the count because they are not subject to LBEs.
a
Note. IS refers to insertion of a lexical boundary before a strong syllable; IW refers to insertion before a weak syllable. DS and DW refer to deletions of lexical boundaries before strong and weak syllables, respectively.
to alter the volume once the stimulus phrases had begun. The listeners transcribed three practice phrases which were read by a neurologically normal female speaker. Listeners who made more than one word-transcription error in the practice phrases were not eligible for the study. Only one listener was excluded by this criterion. D. Analysis
Three trained judges independently coded the listener transcripts for the presence and type of LBEs. Lexical boundary violations were defined as erroneous insertions or deletions of lexical boundaries. These insertions or deletions were coded as occurring either before strong or before weak syllables ~as determined by the target phrasal stress pattern of the phrase, SWSWSW or WSWSWS!. Thus four error types were possible and each phrase had the possibility of containing more than one LBE. Examples from the actual transcripts are provided in Table II. The 60 phrases consisted of 360 syllables, 60 of which were phrase–initial syllables and were therefore not subject to LBEs. Of those 300 nonphrase–initial syllables, 102 were word–initial strong syllables; 80 were word–initial weak syllables; 48 were nonword–initial strong syllables; and 70 were nonword–initial weak syllables. The occurrence of each word–initial strong syllable in the target corresponded to the opportunity for the deletion of a lexical boundary before a strong syllable. Similarly, word–initial weak syllables corresponded to the opportunity for the deletion of a lexical boundary before a weak syllable; nonword–initial strong syllables to the opportunity for an insertion of a lexical boundary before a strong syllable; and nonword–initial weak syllables to the opportunity for an insertion of a lexical boundary before a weak syllable. Thus the opportunities for producing the different types of LBEs were not equal, but are representative of the opportunities generally available in the English language ~Cutler and Carter, 1987!. The proportions of opportunities across the series of phrases are shown in Table III. The codes generated for each speaker by the three judges were merged into one composite data set that reflected instances in which there was 100% agreement among the judges. Twenty-four LBE decisions were discarded due 2462
J. Acoust. Soc. Am., Vol. 104, No. 4, October 1998
to interjudge disagreement. The number, type ~insertion or deletion!, and location ~before strong or before weak syllables! was then tallied for each speaker and for the group. The errors also were analyzed for the distribution of openand closed-class words following the insertion of a lexical boundary. All words following the insertion of a lexical boundary were classified as open, closed, or nonclassifiable in the case of nonword responses. Chi-square analyses were performed to determine the relationship between the variables of insert/delete and strong/ weak, and between the variables open/closed class and insertion before strong/weak syllables. Spearman Rank Order Correlation analyses were performed to assess the strength of the speaker-rank relationships between LBE patterns and acoustic measures. II. RESULTS A. Lexical boundary error incidence
The first finding of this investigation is that the hypokinetic speech samples elicited many LBEs from the listeners. In all, 1596 LBEs were identified unanimously by the three independent judges. The data columns of Table IV contain values from the coded listener transcripts. The number of phrases ~out of 420 total! for which listeners provided no response whatsoever ranged from 9 to 75. The percentage of words transcribed correctly ranged from 18.3% to 64.9%. The number of phrases transcribed entirely correctly for each speaker ranged from 1 to 157. The number of LBEs elicited by each speaker ranged from 151 to 317, with a mean of 228 (s.d.571.2). B. Lexical boundary error pattern
The second finding is that the pattern of LBEs supports the MSS hypothesis. Of the total 1596 errors identified, there were nearly three times as many insertion as deletion errors. Of these, 47% were insertions of a lexical boundary before a strong syllable, 27% were insertions before weak syllables, 10% were deletions before strong syllables, and 15% were deletions before weak syllables. Pattern strength can be expressed as two ratios, IS–IW and DW–DS. Ratio values of ‘‘1’’ indicate that insertions and deletions occur equally as often before strong and weak syllables: the greater the positive distance from ‘‘1,’’ the greater the strength of adherence to the predicted pattern. For the present data, insertion errors Liss et al.: Lexical boundary errors
2462
TABLE IV. Listener performance on phrase transcription task by speaker. Note. The first column contains number of phrases for which listeners provided no response. The second column contains the percentages of wordscorrect in the 60 stimulus phrases. Column 3 contains the number of phrase transcriptions that were entirely correct. The fourth column contains the absolute number of LBEs produced for each speaker. Speaker
No response
% Words-correct
# Phrase correct
# LBEs
HF1 HF2 HF3 HM1 HM2 HM3 HM4
11 75 49 25 9 18 21
64.9% 18.3% 26.7% 57.5% 47.7% 60.0% 43.9%
157 1 11 132 66 132 70
161 317 286 151 227 161 293
occurred 1.7 times more often before strong than before weak syllables, and deletion errors occurred 1.5 times more often before weak than before strong syllables. A chi-square analysis indicated a significant interaction between the variables of insert/delete and strong/weak @x 2 (1,N54)567.3, p,0.0001#. Also, the majority of the post-boundary words created by the insertion of a lexical boundary before a strong syllable were open-class words ~73.7%!, and the majority of the words created by the insertion of a boundary before a weak syllable were closed-class words ~74%!. A chi-square analysis revealed a significant interaction between the variables of open/closed class and strong/weak syllables @x 2 (1,N54) 5281.8, p,0.0001#. This general pattern of LBEs for the group also held for each of the seven speakers. The first four data columns of Table V contain the percentages of error types out of the total errors elicited by each of the speakers. In accordance with MSS hypothesis predictions, insertions were more common before strong than before weak syllables; and deletion errors were more common before weak than before strong syllables. The final column of this table expresses the strength of these error patterns with IS–IW ratio values:3 Lexical boundary insertions before strong syllables outnumbered those before weak syllables by 2.70 to 1.38 times. Thus although all speakers elicited the general pattern of LBEs predicted by the MSS hypothesis, the pattern was stronger for some speakers than for others. C. Lexical boundary error pattern and speech characteristics
The final question was the extent to which the acoustic evidence of reduced syllabic strength contrasts was associ-
ated with the listener performance patterns elicited by the seven speakers with hypokinetic dysarthria. Spearman correlation coefficients were calculated on the rank orderings of the speakers on the various measures of intelligibility; aspects of LBE patterns; and acoustic measures associated with syllabic contrastivity. This was meant to identify any intelligibility or acoustic measures that might be associated with LBE incidence or pattern strength. The rankings of the word and sentence subtests of the AIDS were correlated strongly with the rankings of percent words-correct from the stimulus phrases, accounting for 79% and 56% of the variance, respectively. However, neither standard intelligibility ranking was associated strongly with that of the number of LBEs (R50.64 and 0.43, for word and sentence subtests, respectively!, nor with the IS–IW ratio ranking (R50.39 and 0.46, respectively!. Only one of the three acoustic measure rankings was associated strongly with measures of intelligibility and LBE pattern strength. The F0 variability (CVF0) and sentence subtest rankings were correlated at 0.71 (p50.05). The CVF0 and IS–IW rankings were correlated at 0.93 ( p50.00). Although the correlations of rankings provide some insight to the relationships among the variables of interest, it is the joint consideration of individual speaker patterns and LBE results that is most revealing. The speakers in this study fall roughly into three groups when the strength of the IS–IW ratio is examined. The high IS–IW ratio group consists of speakers HF1 and HM3 who also were the most intelligible, and elicited among the fewest LBEs. The low IS–IW ratio group consists of speakers HF2 and HM2 who were among the least intelligible of this group, and who produced among the greatest numbers of LBEs. The mid IS–IW ratio group consists of HF3, HM1, and HM4. Although more variable in their presentation, they obtained IS–IW ratio values in the narrow range of 1.6–1.8. Each group is addressed in turn. The high IS–IW ratio group ~HF1 and HM3! obtained high standard intelligibility scores, and elicited the highest percentages of words-correct on the stimulus phrases. Subjective impressions also support the conclusion that these two speakers had the highest levels of articulatory integrity of the group. The acoustic data for HF1 are consistent with the notion that strong–weak contrasts in her phrases were well marked. Her mean phrase durations were the longest of the hypokinetic speakers, and comparable to those of the fastest control speakers ~Fig. 1!. Her measures of F0 and amplitude variation placed her among the control speakers ~Fig. 2!, and her vowel quadrilateral was the largest of the
TABLE V. Distribution and type of lexical boundary errors elicited per speaker.
2463
Speaker
% IS
% IW
% DS
% DW
IS–IW
HF1 HF2 HF3 HM1 HM2 HM3 HM4
55.28% 43.85% 49.65% 50.33% 37.44% 55.90% 44.71%
26.71% 30.28% 27.62% 32.45% 26.87% 20.50% 24.91%
8.70% 9.78% 10.84% 5.30% 11.89% 11.18% 11.95%
9.32% 16.09% 11.89% 11.92% 23.79% 12.42% 18.43%
2.03 1.44 1.78 1.56 1.38 2.70 1.78
J. Acoust. Soc. Am., Vol. 104, No. 4, October 1998
Liss et al.: Lexical boundary errors
2463
hypokinetic speakers ~Fig. 3!. In contradistinction, evidence for the acoustic marking of syllabic strength contrasts for HM3 is not so apparent. His mean phrase duration was almost half that of HF1 ~Fig. 1!, and his vowel quadrilateral area was the smallest of the group ~Fig. 3!. The only positive factor was his high CVF0 ranking. Although substantially smaller than that of HF1, his was second highest for the group of hypokinetic speakers ~Fig. 2!. Thus despite equivalent LBE pattern evidence of relatively robust strong–weak syllabic distinctions, only one of the two speakers offered commensurate acoustic evidence. The low IS–IW ratio group ~HF2 and HM2! did not exhibit high levels of articulatory integrity based on their standard intelligibility scores, their percentages of wordscorrect on the stimulus phrases, and on subjective impressions. Acoustically, these speakers exhibited the least prosodic variation ~see Fig. 2!, and among the smallest vowel quadrilateral areas ~see Fig. 3! of the group. Their mean phrase durations were short compared with those of the normal control, but they were in the middle of the range for the hypokinetic group ~see Fig. 1!. Thus the LBE pattern evidence of reduced syllabic strength contrasts was supported by the acoustic evidence for these two speakers. Two of the mid IS–IW ratio group, HM1, and HM4, had very similar intelligibility profiles on both the standard measure and on the percent words-correct on the stimulus phrases. The third member, HF3, was the least intelligible. Acoustically, the three speakers had highly similar CVF0 values ~Fig. 2!, and they formed the midrange for vowel quadrilateral area ~Fig. 3!. The only acoustic measure that distinguished these speakers was phrase duration ~Fig. 1!: HM1 was the slowest male of the group, and HF3 and HM4 tied for the most rapid speakers. Interestingly, this corresponds with the number of LBEs elicited: HM1 elicited nearly half as many errors as HF3 and HM4. III. DISCUSSION
The present report offers several new findings relevant to the study of speech perception and speech intelligibility. First, these hypokinetic speech samples elicited large numbers of LBEs. Also, like systematically degraded normal speech, the pattern of LBEs elicited by these hypokinetic dysarthric speakers generally is consistent with the predictions offered by the MSS hypothesis. The third and most critical finding is the evidence of diminished effectiveness of this strategy when syllabic strength contrasts were most reduced. More than 1500 LBEs were identified in the transcribed phrases. Even speakers with high scores on the standard measures of word and sentence intelligibility elicited relatively large numbers of LBEs. Although there is no direct point of comparison, Cutler and Butterfield ~1992! reported just 256 LBEs in their corpus of 864 phrase transcriptions evoked by normal speech presented at low listening levels. Their coding criteria were more stringent than those used in the present investigation, however, this cannot explain the magnitude of the discrepancy in the occurrence of LBEs. The answer may be related to the fact that degradation of normal speech results in a rather systematic and consistent 2464
J. Acoust. Soc. Am., Vol. 104, No. 4, October 1998
manipulation of segmental and suprasegmental components. Dysarthric speech, in contrast, may vary more randomly on these dimensions, thereby providing more opportunities for lexical boundary misperceptions to occur. This question may be addressed by the evaluation of LBEs elicited by other types of dysarthria. Having established the presence of LBEs, the next issue is that of error pattern. The primary hypothesis of this investigation was that a group of speakers who produced poor distinctions between strong and weak syllables would elicit lexical boundary error patterns that either are random, or closely aligned with the opportunity for such errors to occur. In other words, one would expect to see lexical boundary insertions distributed either equally often before strong and weak syllables, or more often before weak than before strong syllables. The opposite would be expected for lexical boundary deletions. This certainly was not the case. These data conformed precisely with the predictions of the MSS hypothesis: Lexical boundary insertions occurred most often before strong syllables and created open-class words; deletions occurred most often before weak syllables and created closedclass words. The conclusion must be, then, that listeners relied on some sufficient level of strong–weak syllabic contrastivity to make their lexical boundary decisions. Investigations have provided evidence that, despite overall reductions in stress production, speakers with hypokinetic dysarthria are able to maintain relative stress distinctions. Ackermann and Ziegler ~1991! measured sound pressure levels associated with syllable–initial stop consonants to estimate degree of articulatory closure in speakers with parkinsonian dysarthria. They reported greater degrees of articulatory closure for the initial stop consonants of stressed syllables as compared with those of unstressed syllables. Their interpretation was that, because of linguistic import, stressed syllables received articulatory attention at the expense of the unstressed syllables. Kinematic findings reported by Forrest and Weismer ~1995! further support the notion of maintained stress contrasts in hypokinetic dysarthria. They documented reductions in movement amplitude and velocity for stressed syllables, and even greater reductions for unstressed syllables. It may be the case that the patterns of LBEs found in the present study conformed with the MSS hypothesis predictions because, although reduced, relative syllabic strong–weak contrastivity was maintained. However, there exists one critical difference between the present study and those of Ackermann and Ziegler, and Forrest and Weismer: All of our speakers and speech tokens ~phrases! were selected because of the perceptual reductions in syllabic contrastivity, rather than the perceptual presence of correct stress, as was the case in the other two studies. The present study indicates that the perceptual strategy of attending to syllabic strength was less effective for some speech samples than for others. This evidence is found in discrepancies in the IS–IW ratios, or the strength of pattern adherence. There exists in the literature no entirely comparable data base. However, Cutler and Butterfield ~1992! report insertion and deletion values for a study of lexical boundary errors elicited by faint speech. The ratio values derived from their data indicated that insertion errors ocLiss et al.: Lexical boundary errors
2464
curred before strong syllables three times more often than before weak ones (n5195). Deletion errors before weak syllables outnumbered those before strong by nearly four times (n569). In the present study these ratios were substantially lower, with an IS–IW ratio of 1.7 (n51180), and a DW–DS ratio of 1.5 (n5399). At face value, this suggests that either the hypokinetic speech contained less strong-weak contrast information than the faint speech, or that the listeners in the present study attended less to the contrasts than did those of Cutler and Butterfield. Support for the validity of the IS–IW ratio as an index of pattern adherence is found in the assessment of individual speaker data. From this assessment, three conclusions can be drawn. The first two are related: The rankings of IS–IW ratios were strongly associated with the rankings of CVFO; and the lowest IS–IW ratios were elicited by speakers with the most overall acoustic evidence of decreased syllabic contrastivity. An interpretation of the first finding is that larger variations in F0 within phrases correspond to larger prosodic distinctions between strong and weak syllables. When these strong and weak syllables are most perceptually distinctive, and when listeners use this distinction to motivate lexical segmentation, the IS–IW ratio is largest. Even though overall low performance on the acoustic measures was related to low IS–IW ratios, the rankings of these measures did not line up precisely with LBE pattern rankings. This may reflect the nonspecific nature of the measures of phrase duration and vowel space relative to syllabic strength. Relative durations of syllabic strength, specifically ratios of strong to weak durations, may be more valuable indicators of syllabic strength contrastivity in the temporal domain. Similarly, comparisons of strong to weak vowel formant data would provide information specific to vowel quality differences that were not presented here. The final conclusion is that the speakers with highest measures of intelligibility elicited the highest IS–IW ratios, even when syllabic contrastivity apparently was most reduced ~HM3!. However, when measures of intelligibility were lower, the degree of syllabic contrastivity seemed to ‘‘matter more’’ for the IS–IW ratios. In other words, the IS–IW ratio appeared to be more vulnerable to decrement when articulatory integrity was most impaired. This supports the proposal of Cutler and Butterfield ~1992! that, ‘‘it is precisely under the conditions of phonetic uncertainty that rhythmic segmentation proves most useful’’ ~p. 233!. In the present case, the listeners may rely on the rhythmic information when articulatory integrity is reduced, but their success is restricted by the available syllabic strength acoustic information. The findings of this study must be viewed within the limitations of the methodology and design. First, syllabic strength was defined herein as a relative prosody-based entity with secondary vowel quality constituents. Although most of the strong and weak syllables of the stimulus phrases contain full and reduced vowels, respectively, not all do. If it were found that vowel quality has primary perceptual saliency for strong–weak distinctions, the findings of this study would have to be revisited ~see Fear et al., 1995!. Second, as mentioned earlier, this study did not use relative intersyllabic 2465
J. Acoust. Soc. Am., Vol. 104, No. 4, October 1998
measures to document the acoustic correlates of syllabic stress reductions. Because of the perceived reductions, it is expected that finer acoustic measures, such as strong–weak syllable duration ratios, will offer greater predictive value than the ones used here. Finally, individual listener patterns were not presented. Although all listener error patterns generally conformed to the group results, there was a range of performance across listeners. For example, some listeners appeared to be more vigilant than others in their transcription attempts, as evidenced by the number of ‘‘no responses’’ they produced. The design of the listening tapes did reduce potential listener effects by having each listener transcribe 20 phrases from three different speakers. However, the potential for different listening ~and segmentation! strategies across the listeners should be considered. IV. CONCLUSION
The examination of lexical boundary errors revealed that listeners attended to syllabic strength information to segment the connected hypokinetic dysarthric speech, just as they have been shown to do in the segmentation of normal speech. In general, adherence to the predicted pattern of errors was weakest for speakers with the greatest evidence of reduced syllabic contrastivity. This constitutes a mismatch between the perceptual strategy and the available acoustic information, and is therefore a source of intelligibility deficit for these speakers. ACKNOWLEDGMENTS
This research was supported by research Grant No. 5 R29 DC 02672 from the National Institute on Deafness and Other Communication Disorders, National Institutes of Health. Gratitude is extended to the patients of the Mayo Clinic—Scottsdale who participated in this investigation, and to Shannon Beatty and Michele Marshall for their assistance on the project. 1
This definition of syllabic strength is distinguished from that of prosodic phonology which holds that all strong syllables are stressed, and weak syllables are unstressed ~Halle and Keyser, 1971!. The relative importance of vowel quality versus prosodic stress in the perceptual designation of syllabic strength remains to be determined ~Fear et al., 1995; Gow and Gordon, 1995; van Ooijen et al., 1997!. 2 The list of phrases is available electronically from the first author. 3 Analogous data for the strength of deletion error patterns are not provided because of the relatively small numbers of deletion errors for several of the speakers. For example, speakers HF1, HM1, and HM3 elicited only 29, 26, and 38 deletion errors, respectively. These small numbers would render a comparison of DW–DS ratios across speakers difficult to interpret. Ackermann, H., and Ziegler, W. ~1991!. ‘‘Articulatory deficits in Parkinsonian dysarthria: An acoustic analysis,’’ J. Neurol. Neurosurg. Psychiat. 54, 1093–1098. Adams, S. G. ~1991!. ‘‘Accelerating speech in a case of hypokinetic dysarthria: Descriptions and treatment,’’ in Motor Speech Disorders: Advances in Assessment and Treatment, edited by J. Till, K. Yorkston, and D. Beukelman ~Paul H. Brookes, Baltimore!, pp. 213–228. Connolly, F. ~1986!. ‘‘Intelligibility: A linguistic view,’’ British Journal of Disorders of Communication 21, 371–376. Cutler, A. ~1993!. ‘‘Phonological cues to open- and closed-class words in the processing of spoken sentences,’’ Journal of Psycholinguistic Research 22, 109–131. Liss et al.: Lexical boundary errors
2465
Cutler, A., and Butterfield, S. ~1990!. ‘‘Durational cues to word boundaries in clear speech,’’ Speech Commun. 10, 485–495. Cutler, A., and Butterfield, S. ~1991!. ‘‘Word boundary cues in clear speech: A supplementary report,’’ Speech Commun. 9, 335–353. Cutler, A., and Butterfield, S. ~1992!. ‘‘Rhythmic cues to speech segmentation: Evidence from juncture misperception,’’ Journal of Memory and Language 31, 218–236. Cutler, A., and Carter, D. M. ~1987!. ‘‘The predominance of strong syllables in the English vocabulary,’’ Comput. Speech Lang. 2, 133–142. Cutler, A., and Norris, D. ~1988!. ‘‘The role of strong syllables in segmentation for lexical access,’’ J. Exp. Psychol. 14, 113–121. Darley, F., Aronson, A., and Brown, J. ~1969!. ‘‘Differential diagnostic patterns of dysarthria,’’ J. Speech Hear. Res. 12, 246–269. Dongilli, P. ~1994!. ‘‘Semantic context and speech intelligibility,’’ in Motor Speech Disorders: Advances in Assessment and Treatment, edited by J. Till, K. Yorkston, and D. Beukelman ~Paul H. Brookes, Baltimore!, pp. 175–192. Fear, B. D., Cutler, A., and Butterfield, S. ~1995!. ‘‘The strong/weak syllable distinction in English,’’ J. Acoust. Soc. Am. 97, 1893–1904. Forrest, K., and Weismer, G. ~1995!. ‘‘Dynamic aspects of lower lip movement in Parkinsonian and neurologically normal geriatric speakers’ production of stress,’’ J. Speech Hear. Res. 38, 260–272. Forrest, K., Weismer, G., and Turner, G. ~1989!. ‘‘Kinematic, acoustic, and perceptual analyses of connected speech produced by Parkinsonian and normal geriatric adults,’’ J. Acoust. Soc. Am. 85, 2608–2622. Forster, K. I. ~1989!. ‘‘Basic issues in lexical processing,’’ in Lexical Representation and Process, edited by W. Marslen-Wilson ~MIT, Cambridge!, pp. 75–107. Fourakis, M. ~1991!. ‘‘Tempo, stress and vowel reduction in American English,’’ J. Acoust. Soc. Am. 90, 1816–1827. Garcia, J., and Cannito, M. P. ~1996!. ‘‘Influence of verbal and nonverbal contexts on the sentence intelligibility of a speaker with dysarthria,’’ J. Speech Hear. Res. 39, 750–760. Gow, D. W., and Gordon, P. C. ~1995!. ‘‘Lexical and prelexical influences on word segmentation: Evidence from priming,’’ J. Exp. Psychol. 21, 344–359. Grosjean, F., and Gee, J. ~1987!. ‘‘Prosodic structure and spoken word recognition,’’ Cognition 25, 135–155. Halle, M., and Keyser, S. J. ~1971!. English Stress: Its Form, Its Growth, and Its Role in Verse ~Harper and Row, New York!. Hammen, V. L., Yorkston, K. M., and Dowden, P. ~1991!. ‘‘Index of contextual intelligibility: Impact of semantic context in dysarthria,’’ in Dysarthria and Apraxia of Speech: Perspectives on Management, edited by C. Moore, K. Yorkston, and D. Beukelman ~Paul H. Brookes, Baltimore!, pp. 43–53. Hoehn, M. M., and Yahr, M. D. ~1967!. ‘‘Parkinsonism: Onset, progression, and mortality,’’ Neurology 17, 427–442. Kent, R., Weismer, G., Kent, J., and Rosenbek, J. ~1989!. ‘‘Toward phonetic intelligibility testing in dysarthria,’’ J. Speech Hear. Dis. 54, 482–499. Lindblom, B. ~1990!. ‘‘Explaining phonetic variation: A sketch of the H and H theory,’’ in Speech Production and Speech Modeling, edited by W. J. Hardcastle and A. Marchal ~Kluwer Academic, Dordrecht, The Netherlands!, pp. 403–439. Logemann, J. A., and Fisher, H. B. ~1981!. ‘‘Vocal tract control in Parkinson disease: Phonetic feature analysis of misarticulations,’’ J. Speech Hear. Dis. 46, 348–352. Ludlow, C. L., and Bassich, C. J. ~1984!. ‘‘Relationships between perceptual ratings and acoustic measures of hypokinetic speech,’’ in The Dysar-
2466
J. Acoust. Soc. Am., Vol. 104, No. 4, October 1998
thrias, edited by M. R. McNeil, J. C. Rosenbek, and A. Aronson ~College Hill, San Diego!, pp. 163–195. Marslen-Wilson, W. ~1989!. ‘‘Access and integration: Projecting sound onto meaning,’’ in Lexical Representation and Process, edited by W. MarslenWilson ~MIT, Cambridge!, pp. 3–24. McQueen, J. ~1991!. ‘‘The influence of the lexicon on phonetic categorization: Stimulus quality in word-final ambiguity,’’ J. Exp. Psychol. 17, 433– 443. Milenkovic, P. H., and Read, C. ~1992!. CSpeech ~Version 4! @Computer software#, Madison, WI. Pisoni, D. B., and Luce, P. A. ~1986!. ‘‘Speech perception: Research, theory, and the principal issues,’’ in Pattern Recognition by Humans and Machines: Speech Perception, edited by E. C. Schwab and H. C. Nusbaum ~Academic, Orlando!, Vol. 1, pp. 1–50. Quene’, H. ~1992!. ‘‘Durational cues for word segmentation in Dutch,’’ Journal of Phonetics 20, 331–350. Ramig, L. O. ~1992!. ‘‘The role of phonation in speech intelligibility: A review and preliminary data from patients with Parkinson disease,’’ in Intelligibility in Speech Disorders, edited by R. D. Kent ~John Benjamin, Amsterdam!, pp. 120–155. Smith, M. R., Cutler, A., Butterfield, S., and Nimmo-Smith, I. ~1989!. ‘‘The perception of rhythm and word boundaries in noise-masked speech,’’ J. Speech Hear. Res. 32, 912–920. Tikofsky, R. S., Glattke, T. J., and Tikofsky, R. P. ~1966!. ‘‘Listener confusions in response to dysarthric speech,’’ Folia Phoniatr. 18, 280–292. Tjaden, K., and Liss, J. M. ~1995!. ‘‘The influence of familiarity of judgments of treated speech,’’ American Journal of Speech-Language Pathology 4, 39–48. Turner, G. S., Tjaden, K., and Weismer, G. ~1995!. ‘‘The influence of speaking rate on vowel space and speech intelligibility for individuals with amyotrophic lateral sclerosis,’’ J. Speech Hear. Res. 38, 1001–1013. Van Ooijen, B., Bertoncini, J., Sansavini, A., and Mehler, J. ~1997!. ‘‘Do weak syllables count in newborns?’’ J. Acoust. Soc. Am. 102, 3735–3741. Weismer, G. ~1984!. ‘‘Articulatory characteristics of Parkinsonian dysarthria: Segmental and phrase-level timing, spirantization, and glottalsupraglottal coordination,’’ in The Dysarthrias, edited by M. R. McNeil, J. C. Rosenbek, and A. Aronson ~College Hill, San Diego!, pp. 101–130. Weismer, G. ~1991!. ‘‘Assessment of articulatory timing,’’ in Assessment of Speech and Voice Production: Research and Clinical Applications, NIDCD Monograph, Vol. 1, pp. 84–95. Weismer, G., and Martin, R. ~1992!. ‘‘Acoustic and perceptual approaches to the study of intelligibility,’’ in Intelligibility in Speech Disorders, edited by R. D. Kent ~John Benjamin, Amsterdam!, pp. 67–118. Yorkston, K. M., and Beukelman, D. R. ~1978!. ‘‘A comparison of techniques for measuring intelligibility of dysarthric speech,’’ Journal of Communication Disorders 11, 499–512. Yorkston, K., and Beukelman, D. ~1980a!. ‘‘A clinician-judged technique for quantifying dysarthric speech based on single-word intelligibility,’’ Journal of Communication Disorders 13, 15–31. Yorkston, K. M., and Beukelman, D. R. ~1980b!. ‘‘The influence of passage familiarity of intelligibility estimates of dysarthric speech,’’ Journal of Communication Disorders 13, 33–41. Yorkston, K. M., and Beukelman, D. R. ~1981!. Assessment of the Intelligibility of Dysarthric Speech ~Manual! ~CC Publications, Oregon!. Yorkston, K. M., Hammen, V., Beukelman, D. R., and Traynor, A. ~1990!. ‘‘The effect of rate control on the intelligibility and naturalness of dysarthric speech,’’ J. Speech Hear. Disord. 55, 550–560.
Liss et al.: Lexical boundary errors
2466