Vibrotactile Speech Tracking Support: Cognitive Prerequisites
Jerker Ronnberg, Ulf Andersson, Björn Lyxell
Linkoping University
Karl-Erik Spens
KTH, Stockholm
Correspondence should be sent to Jerker Ronnberg, Dept. of Education & Psychology, Linkoping University, S-581 83 Linkoping, Sweden (e-mail: [email protected]). © 1998 Oxford University Press

Research suggests that visual speechreading can be improved by means of natural tactile support, where the receiving person holds his or her hands against the throat/neck of the talker (i.e., so-called tactiling; Ohngren, 1992, for a review; Ohngren, Ronnberg, & Lyxell, 1992; Ronnberg, 1993). The possibility that tactually supported speechreading works as a speechreading supplement is here assumed to build on the complementary relations that exist between vision and touch. Speech cues that cannot be seen on the lips of the talker are in some instances discriminable by means of touch, and low-frequency contrasts are often within the realm of skin sensitivity (Sherrick, 1982). Thus, there is a potential informational basis for a tactile-visual perceptual mechanism, grounded on the complementarity between visual and tactile cues, analogous to the complementarity revealed for audio-visual speech integration in noise (Summerfield, 1987). Empirically, studies of visual speech perception with tactile aids have also generally shown an added benefit of tactile support (Bernstein, 1995; Bernstein, Eberhart, & Demorest, 1989).

In principle, two categories of tactile aids are currently available for transmitting different aspects of speech: one-channel and multi-channel aids. These categories also seem to build on different philosophies. Whereas the one-channel aids emphasize the pick-up of time-intensity information, and hence environmental sounds and prosodic support (e.g., Carney & Beachler, 1986; Kishon-Rabin, Boothroyd, & Hanin, 1996; Weisenberger & Russel, 1989), the philosophy underlying multi-channel aids is that some phonemic information can be decoded reliably (e.g., Brooks, Frost, Mason, & Gibson, 1986a, 1986b; Lynch, Eilers, Oller, Urbano, & Pero, 1989; Weisenberger & Miller, 1987). Multi-channel effects have also been demonstrated for electro-tactile stimulation (e.g., Alcantara, Cowan, Blamey, & Clark, 1990; Saunders & Franklin, 1985; Sparks, Ardell, Bourgeois, Wiedmer, & Kuhl, 1979). Effects of training are well-established in the literature (e.g., Alcantara et al., 1990; Bernstein, Demorest, Coulter, & O'Connel, 1991; Boothroyd & Hnath-Chisolm, 1986; Brooks et al., 1986a, 1986b; Weisenberger & Russel, 1989) and should be conceptually dis-
Downloaded from jdsde.oxfordjournals.org by guest on July 6, 2011
Fourteen postlingually hearing-impaired participants took part in an intervention study on the potential benefit of three types of tactile aids (i.e., the Tactilator, Minivib 3, and the Tactaid 7). Although training by means of computerized tracking had substantial effects on speech tracking rate, no differential effects of type of aid emerged. However, a cognitive test battery revealed that training efficacy is directly dependent on the cognitive prerequisites of the individual speechreader. The speed with which an individual can make phonological judgments (i.e., rhyme judgments) and visual word decoding from lipreading proved to be critical cognitive skills. We conclude that these skills must be further assessed and taken into account when rehabilitation/training programs are launched.
Journal of Deaf Studies and Deaf Education 3:2 Spring 1998
speech processing conditions is always underspecified. The ability to make sense of an underspecified speech signal necessarily draws on cognitive abilities related to the filling in of missing information. A cognitive test battery was used to estimate the working-memory capacity, speed of verbal information processing, and internal speech of the tactile-aid users. These cognitive abilities also have in common that they have been shown to be critical predictors of speechreading (Ronnberg, 1995) and cochlear implant outcomes (Lyxell et al., 1996). In addition to the above cognitive abilities, tests of verbal ability (Lyxell & Ronnberg, 1992), verbal inference making (Lyxell & Ronnberg, 1989), and visual word decoding (Lyxell & Ronnberg, 1991) were employed because these abilities have been demonstrated to be important to sentence-based speechreading as such. This was done to allow for alternative sets of cognitive predictions of tactile-visual speech perception abilities. The main purpose of the present study was thus to address some of the observed constraints on generality in previous research on tactile-aided speechreading, namely that (1) the studies typically involve a few cases, (2) they have not typically evaluated relative differences in tactile aid efficiency in the same subject, (3) they have not evaluated effects of training for a relatively large number of subjects, and (4) the outcomes have rarely been related to the cognitive abilities of the individual.
Method
General Procedure and Training Design
Prior to the actual training study, all participants took part in a three-part initial pretesting session. The pretest consisted of (1) a speechreading test section, (2) a cognitive test section, and (3) a vocabulary test section. Order of section was balanced across subjects according to a Latin square. The first training session was conducted approximately 1 week after the pretest, and training continued for a total of 10 sessions for all 14 subjects. In each training session there were four conditions, one referring to speechreading and the other three to the different visual-tactile speechreading conditions. Or-
tinguished from initial effects of a given aid (Lyxell, Ronnberg, Andersson, & Linderoth, 1993). The reason is that certain aspects of an individual's information processing capacity can be initially important for the use of an aid, whereas prolonged training with the aid perhaps demands another set of cognitive abilities at later points of skill development (Ackerman, 1992; Lyxell, Andersson, Arlinger, Bredberg, Harder, & Ronnberg, 1996). However, the relative successes of some of the multi-channel aids have not escaped criticism (Bernstein et al., 1991; Carney, 1988), nor is it the case that multi-channel aids are always superior to single-channel aids (Plant, 1989). It is also important to avoid generalizations of results across materials and subject populations. In terms of the information provided, one tactile aid may provide important information at an analytic level of linguistic processing, whereas other aids prove to be most efficient for connected speech (Weisenberger, Broadstone, & Kozma-Spytek, 1991). The generality of some studies is constrained because they are case studies or multiple case studies, not group studies (e.g., Boothroyd & Hnath-Chisolm, 1988; Brooks & Frost, 1983; Lynch et al., 1989; Osberger, Robbins, Todd, & Brown, 1991; Plant, 1987). By including a relatively larger number of participants, the present study aimed to overcome some of these constraints. Using a larger number of participants also allows for a general assessment of cognitive capacities assumed important for visual speech perception and for success with cochlear implants (Lyxell et al., 1996; Ronnberg, 1995). This individual difference aspect of successes and failures with different kinds of devices has been sadly neglected in the literature on tactile aid use (see Plant & Spens, 1995).
As complementary information to evaluations of relative benefits of sensory speech aids, it is therefore important to study the potential generality of cognitive capacities previously found important for other speech processing conditions than the visual-tactile condition. In this study, we assess similarities and differences in cognitive predictions of cochlear implant success with the cognitive predictions that can be made for proficient tactile aid use (Carney, Kienle, & Miyamoto, 1990; Lyxell et al., 1996). A second general reason for analyzing cognitive capacities is that the information provided in these
Table 1 Participant characteristics

Individual   Age   Age of onset   Yrs with hearing aid   500 Hz   1 kHz   2 kHz   3-frequency average
1            41    31             10                     60       90      100     83
2            52    36             8                      90       80      40      70
3            61    1              19                     65       60      45      57
4            61    34             18                     55       100     110     88
5            76    61             9                      60       100     120     93
6            40    7              12                     90       95      95      93
7            72    32             25                     55       80      120     85
8            39    3              36                     60       60      60      60
9            21    3              17                     60       60      50      57
10           57    7              34                     80       75      70      75
11           64    39             22                     75       80      75      77
12           68    33             32                     75       75      70      73
13           45    1              40                     75       90      95      87
14           37    1              15                     85       65      60      70
Mean         52    21             21                     70       79      79      76
SD           16    19             11                     13       14      27      12
Participants
Thirteen profoundly hearing-impaired women and one man participated in this study. Table 1 presents participants' characteristics. Mean age was 52 years (SD = 16). The mean duration of the participants' hearing impairment was 32 years (SD = 14). Audiograms show three-frequency averages of 76 dB (SD = 13) for the best ear according to the most recently available medical records. It should be noted that these audiograms in some cases are relatively old (M = 32 months, SD = 45), which for all practical purposes means that the hearing loss of the group as a whole is somewhat underestimated. The participants had worn hearing aids for a mean of 21 years (SD = 11 years). All participants regarded themselves as profoundly hearing-impaired adults, and all preferred an oral communication mode.
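The group statistics reported above can be checked against the individual values in Table 1. The sketch below is only an illustration: it assumes that duration of hearing impairment is age minus age of onset and that the reported SDs are sample standard deviations. Neither assumption is stated in the text, but both reproduce the reported figures.

```python
from statistics import mean, stdev

# Values transcribed from Table 1 (participants 1-14).
AGE = [41, 52, 61, 61, 76, 40, 72, 39, 21, 57, 64, 68, 45, 37]
AGE_OF_ONSET = [31, 36, 1, 34, 61, 7, 32, 3, 3, 7, 39, 33, 1, 1]

# Assumption (not stated in the paper): duration = age - age of onset.
DURATION = [age - onset for age, onset in zip(AGE, AGE_OF_ONSET)]


def summarize(values):
    """Return (mean, sample SD), each rounded to whole years."""
    return round(mean(values)), round(stdev(values))


if __name__ == "__main__":
    print("age (mean, SD):", summarize(AGE))
    print("duration of impairment (mean, SD):", summarize(DURATION))
```

Under these assumptions the script gives 52 (SD 16) for age and 32 (SD 14) for duration, in line with the values in the text.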
Apparatus
In this study three different types of vibrotactile aids were used. Two were single-channel aids (the Minivib 3 and the Tactilator Ek 1011), and one was a multichannel aid (Tactaid 7).
1. Minivib 3 (AB Specialinstrument, Stockholm). The bone-conductor was attached to the participant's wrist with a wristband. The Minivib picks up the intensity contour at 500 to 1800 Hz, but time/intensity variations are presented to the subject at a fixed 230 Hz frequency.
2. Sennheiser wireless Tactilator Ek 1011. The experimenter held the contact-microphone against his throat/larynx and the participant held a bone-conductor in her hand between the thumb and the index finger. The information provided by the bone-conductor included fundamental frequency (F0) and amplitude variations. All information was transmitted wirelessly (via radio frequency) to the bone-conductor. The approximate frequency range of the Tactilator was 150 to 900 Hz. Above 900 Hz the linearity of the transmission is lost.
3. Tactaid 7. The seven vibrators were attached to a specially designed hand prototype, and the participant held the hand prototype in his or her right hand. Four of the channels were designed to cover the first formant frequencies; the remaining four covered the second formant frequencies. The first five vibrator elements stimulated the finger tips and the sixth and seventh stimulated the palm. The fourth channel is shared by the first and second formant analyzers (see Tactaid 7, user's manual, 1991, for technical details; cf. Osberger, Robbins, Todd, & Brown, 1991).
A Finlux 26" color TV and a tape-recorder (JVC HR-7700EG) were also used. The Minivib 3 and the Tactaid 7 were connected directly to a video-recorder during all the speechreading pretests. Thus, all auditory information from the environment was excluded. This was not the case for the Tactilator, where all visual and tactile information was presented live by the experimenter. In all conditions (visual and tactile supported), sound was presented with the visual and the visual-tactile information.

Section 1: Speechreading Tests
1. Sentence-based speechreading test. In the sentence-based speechreading test, the participant's task was to speechread sentences with and without tactile support.
der of condition was counterbalanced across sessions and subjects. Thus, the design was a within-subjects 10 × 4 (training session × condition) factorial combination.
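The balancing of pretest section order according to a Latin square can be illustrated with a short sketch. The paper does not say which Latin square was used or how subjects were assigned to its rows, so the cyclic construction and the round-robin assignment below are assumptions, and the function names are hypothetical.

```python
def cyclic_latin_square(items):
    """Build an n x n Latin square: row i is the item list rotated left by i."""
    n = len(items)
    return [[items[(i + j) % n] for j in range(n)] for i in range(n)]


def assign_orders(n_subjects, items):
    """Assumed scheme: subject k (0-based) receives row k mod n of the square."""
    square = cyclic_latin_square(items)
    return [square[k % len(items)] for k in range(n_subjects)]


# The three pretest sections named in the General Procedure.
SECTIONS = ["speechreading", "cognitive", "vocabulary"]

if __name__ == "__main__":
    for subject, order in enumerate(assign_orders(14, SECTIONS), start=1):
        print(subject, order)
```

With 14 subjects and three orders, the rows cannot be used equally often (5, 5, and 4 subjects under this assignment), so the balance is approximate. The same function applied to the four training conditions would yield one possible counterbalancing scheme for the training sessions.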
Twenty-four sentences were subdivided into three separate context-blocks of eight sentences each: a "restaurant context," a "train context," and a "shop context." The sentences were of an ordinary type that may occur in real life (e.g., "Har ni en speciell avdelning for herrklader," "Do you have a special department for men's wear"). The participants speechread two sentences without tactile support and six sentences with tactile support for each context-block (two with each aid). A TV and a video-recorder were used to present the sentences, except when the Tactilator was used. The presentation procedure was as follows: the actor (a female native speaker of Swedish) or the experimenter (with the contact-microphone) sat quietly and looked straight forward. Then, she or he (the experimenter) uttered the sentence, continued to look straight forward, and finally looked down. The experimenter stopped the video-recorder when the subject wrote down what she had perceived. Speechreading performance was measured by the proportion of words correctly perceived in each one of the four test conditions. Two different sets of sentences were used in the pretest. Both the presentation order of the sentences and the order of the aids that were employed were randomized.

2. Word decoding test. In the word decoding test the participant's task was to decode (i.e., name) 40 Swedish ordinary bisyllabic words (e.g., "kvinna," "woman"). Two balanced word lists were used. Ten of the 40 words were presented without tactile support and 30 words with tactile support (10 words with each aid). The participants were asked to write down the word on an answer-sheet. Four separate randomized orders were created based on word order and aid order. Speechreading performance was measured by the proportion of words correctly perceived in each one of the four test conditions.

Section 2: Cognitive Tests
The cognitive test battery was administered by a computer program called TIPS, Text-Information-Processing-System (Ausmeel, 1988). The program permits textual items to be presented letter-, word- or line-wise in a fixed, moving, or growing text-window. Presentation rate (i.e., time per item and interitem interval) can be manipulated using the parameter menu of the program. The same subroutine is used in all kinds of access tests and no source of variation in precision emanates from the software per se. The order of test presentation is automatically re-randomized by TIPS and is unique for each subject.

1. Lexical decision making (speed of verbal information-processing). The task was to decide whether a string of letters constituted a real word or not. Fifty true words (e.g., "sno," "snow") and 50 lures were used. Out of the 50 lures, 25 were possible to pronounce (e.g., "gar") and 25 impossible to pronounce (e.g., "nsi"). The participant responded "yes" or "no" by pressing predefined buttons (green = yes and red = no). The letter string was presented for 2 sec. Reaction time was measured from the onset of the 2-sec interval, and the 2-sec interval also served as the maximum response time. After the response another 2-sec interitem interval commenced before presentation of the next word. Accuracy and speed of performance were measured.

2. Semantic decision making (speed of verbal information-processing). The task was to decide whether a word belonged to a certain predefined semantic category or not. There were four categories: "colors" (e.g., "brun," "brown"), "occupations" (e.g., "larare," "teacher"), "diseases" (e.g., "cancer," "cancer"), and "body parts" (e.g., "ben," "leg"). Within each category there were 24 items. Presentation was blocked, with a small pause between category-blocks, where the next category key word was presented. Twelve items belonged to the category and 12 items were lures. The presentation of the items was the same as for the lexical decision making test. Accuracy and speed of performance were measured.

3. The rhyme tests (internal speech). Four lists of word-pairs were employed as test material. The first list consisted of 50 pairs of monosyllabic and bisyllabic common Swedish words. Twenty-five of the word-pairs in the first list were orthographically similar and 12 of this subset rhymed (e.g., "sal-bal," "hall-ball"), and 13 did not (e.g., "bil-bal," "car-punch"). The remaining 25 word-pairs were orthographically dissimilar and 12 of them rhymed (e.g., "kurs-dusch," "course-shower"), and 13 did not (e.g., "cykel-paron," "bicycle-pear"). Thus, the first list consisted of four conditions. In the
5. Digit span test (short-term memory capacity). A series of digits was presented in a fixed text-window at a rate of one digit per 0.8 sec with an interitem interval of .075 sec. After the participants had attempted to recall the digit series orally in the correct serial order, the experimenter pushed a button and the next sequence of digits was presented. The first span-size employed was three digits, the next four digits and so on, ending with a span of eight digits. Three different sequences were presented for each span-size. The response interval was limited to a maximum of 2 min. The participants' responses were scored in terms of total number of recalled digits.

6. Sentence completion test (inference-making). The participants were presented with 28 sentences that had some words missing and their task was to fill in the missing words (e.g., "Kan jag ett par _ byxor?" "May I a pair of _ trousers?"). Half of the sentences were related to a shop context and the other half were related to a restaurant context. The sentences involved 4 to 13 words each, all ordinary and familiar Swedish words according to frequency counts (Allen, 1970). From each sentence two to four words were omitted. The incomplete sentence was exposed on the computer screen for 7 sec. After the exposure of the sentence the response interval started, which was set to 30 sec. The participants had to complete the sentence orally, and the experimenter wrote down all answers on an answer-sheet. The filled-in words were scored according to their semantic and syntactic appropriateness. For each sentence, the number of correct words was divided by the number of deleted words, and the average of these proportions over the 28 sentences was used as the score for each subject.
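The scoring rule for the sentence completion test can be written out as a short sketch. The function name and the example data are hypothetical; only the rule itself (a per-sentence proportion of correctly filled-in words, averaged over sentences) is taken from the description above.

```python
def sentence_completion_score(sentences):
    """Average per-sentence proportion of correctly filled-in words.

    `sentences` is a list of (n_correct, n_deleted) pairs, one per sentence.
    """
    if not sentences:
        raise ValueError("at least one sentence is required")
    proportions = []
    for n_correct, n_deleted in sentences:
        if n_deleted <= 0 or not 0 <= n_correct <= n_deleted:
            raise ValueError(f"invalid counts: {n_correct} of {n_deleted}")
        proportions.append(n_correct / n_deleted)
    return sum(proportions) / len(proportions)


if __name__ == "__main__":
    # Hypothetical responses to three sentences with 2, 4, and 3 deleted words.
    example = [(2, 2), (1, 4), (3, 3)]
    print(sentence_completion_score(example))
```

Averaging proportions rather than pooling word counts gives every sentence the same weight, regardless of whether two or four words were omitted from it.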
Section 3: Vocabulary Tests
1. Antonym test (verbal ability). The task consisted of deciding which two words out of five were antonyms (e.g., "vacker, gammal, ledsen, snabb, ung," "beautiful, old, sad, fast, young"). The whole test contained 29 five-word strings and was conducted under time pressure (maximum 5 min). The proportion of correct responses was used as the dependent measure.
2. Analogy test (verbal ability). In this test the participant's task was to decide which two of five alternative words were related to each other in a similar (analogical) way as the two target words were related (e.g., "penna-rita: tavla, pensel, bild, mala, ram," "pen-draw: picture, brush, illustration, paint, frame"). Thus, the right answer would be "brush" and "paint." The whole test contained 27 five-word strings and was conducted under time pressure (maximum 5 min and 30 sec). The proportion of correct responses was used as the dependent measure.
second list there were 50 pairs of bisyllabic words only, and in each pair there was one "real" word and one nonword (e.g., "citron-mirol," "lemon-mirol"). The third list contained 30 monosyllabic pairs of nonwords (e.g., "pret-blet") and the fourth 30 bisyllabic pairs of nonwords (e.g., "volir-sjolir"). One pair of words at a time was displayed for 5 sec on the computer screen, and the participant's task was to respond "yes" or "no" by pressing predefined buttons to indicate whether or not the two words rhymed. The response time was 5 sec. Accuracy and speed of performance were measured (cf. Lyxell, Ronnberg, & Samuelsson, 1994).

4. Reading span test (working memory capacity). The participant's task was to comprehend sentences and to recall the first or the final word in each sentence of a presented sequence of sentences (cf. Baddeley, Logie, Nimmo-Smith, & Brereton, 1985). The participants were presented with sentences word-by-word, at a rate of one word per 0.8 sec and with an interword interval of .075 sec. The sentences involved three words each. Half of the sentences were absurd (e.g., "Fisken korde bilen," "The fish drove the car") and half were normal sentences (e.g., "Kaninen var snabb," "The rabbit was fast"). After each presented sentence there was a 1.75 sec interval during which the participants were asked to respond "yes" (for a normal sentence) or "no" (for an absurd sentence). After a sequence of sentences (3, 4, 5, or 6 sentences) the participants were asked to recall the first or the final word in each sentence in the correct serial order. Three sequences per span-size were used, from span-size three (three sentences) to span-size six (six sentences). The response interval was set to 80 sec. However, no subject needed more than 30 sec to respond. The experimenter started the next sequence of sentences by pushing a button. The participants' responses were scored in terms of total number of recalled words.
[Figure 1. Speech tracking performance as a function of training for each condition. X-axis: training sessions.]
Training Procedure
A computerized speech tracking procedure (Gnosspelius & Spens, 1992) derived from the original procedure (DeFilippo & Scott, 1978) was employed to evaluate the training effects. The experimenter read a text from a computer screen, sentence by sentence, and the participant had to orally repeat each sentence verbatim. As a repair strategy, the words the speechreader could not perceive were repeated verbatim (twice) orally. If the speechreader was still unable to speechread the word (i.e., after approximately 4-5 sec), it was presented a third time on an electronic display (cf. Boothroyd & Hnath-Chisholm, 1988). This type of verbatim correction strategy is effective and appropriate for computer-assisted tracking implementations (Lunato & Weisenberger, 1994). Ten training sessions were administered at intervals of approximately one week. The sessions lasted for approximately one hour. The participants practiced 10 min (2 × 5 min) without any aid and 10 min (2 × 5 min) with each of the three tactile aids. All auditory and tactile information was presented live by the experimenter; that is, the tactile devices were not connected to the apparatus as they had been in the pretest. The order of training conditions was counterbalanced across participants for each session, and the participants' hearing aids were always turned off. The text materials used in the training sessions were simplified, easy-reading books: Per-Anders Fogelstrom, "Mina drommars stad" ("The Town of My Dreams"); Hakan Bostrom, "Avskedet" ("The Farewell"); and Jan Fridegard, "Yxskaftet" ("The Axe Handle"). The conventional words per minute (wpm) rate was used as the performance measure in the speech tracking task. That is, the total number of words correctly perceived in the test session was counted and divided by the time elapsed to give a wpm rate. This measure was automatically calculated by the computer program.

Results
As can be seen in Figure 1, where the dependent variable, wpm, is plotted as a function of training session and tactile aid condition, there is a substantial gain from training by means of the computerized speech tracking procedure, F(9, 117) = 13.02, p < .0001, MSE = 53.87. As is also evident from inspection of Figure 1, the main effect of tactile aid condition is harder to detect visually. However, the overall averages range from 40.2 wpm to 42.2 wpm, a difference that is in fact statistically significant, in favor of the Tactilator, F(3, 39) = 4.13, p < .05, MSE = 23.71. The interaction between the two variables did not reach statistical significance. Thus, speech tracking is a method that can readily detect improvements as a function of training. Effects due to tactile aid are small, but significant. However, the clinical and practical relevance of a 2 wpm increase after 10 training sessions must be seriously doubted. Two further analyses of variance (ANOVAs) were
[Figure 2. Speech tracking performance for conventional and optimum rate as a function of training. X-axis: training sessions. Series: Conventional, Optimum.]
One further aspect of the data should be noted, viz., that the training effect reaches its asymptote for the so-called optimum rates after three trials only, whereas there is still a gain in conventional rates up to the tenth trial (see Figure 2). This may be a consequence of the fact that optimum rates are calculated on the basis of error-free sentences. The fact that the asymptote was reached after three trials suggests that it should be relatively straightforward to evaluate the benefit of tactile aids for easy conditions to get a reliable and quick estimate of their optimal efficiency. A final procedural aspect should be noted: We employed fast repair strategies (i.e., 4-5 sec before electronic display of the word). This is important as we believe that natural conversations are mimicked better with fast than with slow repair strategies (e.g., 15-30 sec). For slower repair strategies (higher k-values), Spens (1995) has shown that the relative benefit of the aid increases dramatically and nonlinearly. We obviously used a conservative k-value, deflating rather than inflating the relative effects of the aided conditions. We therefore argue that the impressive benefits observed in the literature may at least partially be a function of too high k-values (Spens, 1995) and not always a function of ecologically valid improvements. Thus, the picture of these overall ANOVAs is that small differences among the visual and visual-tactile conditions can be discerned, none with any statistical or clinical generality. The training procedure, on the other hand, has generally revealed strong effects of speech tracking training, a result important in its own right. This holds true especially for conventional wpm rates. It is also in agreement with the claim that computerized verbatim tracking seems to be an appropriate method (Lunato & Weisenberger, 1994). We know that individual cognitive differences are
computed on the overall data. The first ANOVA analyzed the data for "ceiling" or optimum wpm rate, that is, an estimate of the wpm speed where no errors were present (technically, those sentences in the computer presentation where the subjects had made no errors served as an estimate of the optimal speed; see Spens, 1995). The second ANOVA assessed the ratio wpm/wpm(optimal). The first additional ANOVA was computed to study whether there would be an interaction between tactile aid and dependent measure, the possibility being that a certain aid should work better when the text is less complicated, and others when the text is relatively complicated (cf. Weisenberger & Kozma-Spytek, 1991). The second ANOVA can be viewed as a computation of a standardization, where relative differences in optimal speed are accounted for in each tactile aid condition. However, the results of both ANOVAs suggest only a main effect of training. No effect involving the tactile aid variable proved to be significant.
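The three tracking measures used in these analyses (conventional wpm, optimum wpm, and their ratio) can be sketched as follows. This is an illustration rather than the tracking program's actual implementation: the `Sentence` record and the function names are hypothetical, and the sketch assumes that every word of a sentence counts as perceived once the sentence has been repeated verbatim, so that errors show up only as extra time.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    n_words: int       # words in the sentence
    seconds: float     # time spent until the sentence was repeated verbatim
    error_free: bool   # True if no repair was needed


def wpm(sentences):
    """Conventional rate: words tracked divided by elapsed time, per minute."""
    total_seconds = sum(s.seconds for s in sentences)
    if total_seconds <= 0:
        raise ValueError("no elapsed time to compute a rate from")
    return 60.0 * sum(s.n_words for s in sentences) / total_seconds


def optimum_wpm(sentences):
    """Optimum ("ceiling") rate: the same rate over error-free sentences only."""
    return wpm([s for s in sentences if s.error_free])


def relative_rate(sentences):
    """Ratio of conventional to optimum rate."""
    return wpm(sentences) / optimum_wpm(sentences)


if __name__ == "__main__":
    # Hypothetical 23-second excerpt of one training block.
    block = [Sentence(6, 6.0, True), Sentence(4, 12.0, False), Sentence(5, 5.0, True)]
    print(wpm(block), optimum_wpm(block), relative_rate(block))
```

In this made-up excerpt the single repaired sentence pulls the conventional rate well below the optimum rate, which is why the ratio can serve as a standardized measure across aids that differ in optimal speed.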
important to speechreading skill (Ronnberg, 1990). Therefore, the present within-participants design must be viewed as the most appropriate, as the individual information-processing abilities are disentangled from the assessment of training and tactile aid effects proper. Although we counterbalanced the order of presentation of tactile supplements, and averaged over two measurements per session, tactile aid condition, and subject, there still seem to exist learning effects that transfer among the tactile aid conditions. In line with this, there is also evidence that unimodal training (tactile or visual), as opposed to bimodal training, does not impair bimodal perception (Alcantara, Blamey, & Clark, 1993).
Cognitive Analysis
The cognitive analysis of prerequisites for speechreading with tactile aids departed from a description of two
In fact, it is quite feasible that becoming a better visual speechreader may positively influence the use of tactile signals, and vice versa. Nevertheless, it may be somewhat harder to argue this way for different methods of presenting the tactile signal. And, we know of no study which has explicitly addressed the possibility that stimulation of flexible visual-tactile strategies can be accomplished by a certain combination of different aids. The lack of tactile effects compares well with at least some of the studies in the literature (Bernstein et al., 1991; Carney, 1988). A final set of ANOVAs was computed on the subject variables (e.g., age, age of onset, number of years with a hearing aid, and the three-frequency average hearing loss). We used the median split technique to define the between-groups variable in the ANOVA to get a quick estimate of potential effects due to the particular variable. Only the effect of chronological age revealed an interaction so that the young benefitted relatively more from the Tactilator in particular. The interaction was relatively weak clinically and amounts to only a few wpm relative differences. However, for optimum rates, the main effect of age was significant (68 vs. 46 wpm), suggesting that the young are generally faster under optimal speech processing conditions, which should not come as a surprise based on the literature on cognitive aging and speechreading (Ronnberg, 1990).
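The median split used to define the between-groups variable can be illustrated with the ages from Table 1. The paper does not say how values equal to the median were assigned, so the rule below (values at or below the median go to the "low" group) is an assumption, and the function name is hypothetical.

```python
from statistics import median

# Ages of participants 1-14, transcribed from Table 1.
AGE = [41, 52, 61, 61, 76, 40, 72, 39, 21, 57, 64, 68, 45, 37]


def median_split(values):
    """Label each value 'low' (at or below the median) or 'high' (above it)."""
    cut = median(values)
    return ["low" if v <= cut else "high" for v in values]


if __name__ == "__main__":
    labels = median_split(AGE)
    print("median age:", median(AGE))
    print("younger group:", [a for a, g in zip(AGE, labels) if g == "low"])
    print("older group:", [a for a, g in zip(AGE, labels) if g == "high"])
```

For age, the median falls between two observed values, so the split produces two groups of seven and the tie rule does not matter.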
extreme groups of participants (i.e., the five best performing participants and the five lowest performing participants). This subgrouping was done for each of the four conditions separately, averaged over the initial two training sessions. It is possible that other cognitive sets of abilities determine performance, once the participant has familiarized/automatized processing in the tracking task compared to the initial processing requirements (cf. Ackerman, 1992). One way of analyzing this possibility and to simplify the cognitive analysis would be to compute rank order correlations between the initial and final rank orders of all 14 individuals for each condition. Thus, the rank correlations were based on the average rank for the initial and final two training sessions. The overall correlation was very high, r = .94, ranging from r = .89-.96 for each condition separately, suggesting that an initially high (or low) performing individual remains, relative to the whole group of participants, high (or low) performing following training. In Figure 3 the tracking data have been plotted as a function of subgroup (five subjects in each) and initial and final testing for each of the four conditions. As can be seen in Figure 3, speechreading skill accounts for the main variation in the tracking data. The skill factor is indeed powerful as compared to the tactile aid conditions and the training effects. This leaves us in a position where we have to account for speechreading skill differences in terms of the underlying cognitive processing abilities (Ronnberg, 1990). The data were subsequently partitioned on the basis of the tracking skill subgroups in the tactile aid conditions for initial testing. The subgroups were the same for all conditions with the exception of the Tactilator condition. Two columns are therefore presented in Table 2 instead of four. 
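The rank order correlation between initial and final tracking performance can be sketched as a Spearman coefficient, computed as a Pearson correlation on ranks. The paper does not name the coefficient it used, so this choice is an assumption, and the wpm values in the example are invented for illustration only.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical mean wpm over the first two and last two sessions.
    initial = [22.0, 35.5, 41.0, 28.0, 50.5]
    final = [30.0, 44.0, 47.5, 41.0, 58.0]
    print(spearman(initial, final))
```

A coefficient close to 1 means that participants keep their relative positions from initial to final testing, which is the pattern the text describes.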
The rank order correlations among columns was also very high, r = .89-.96, again suggesting that cognitive differences may be a powerful indicator of tracking success in terms of abilities pertinent to all tactile aid conditions, as well as to prediction of training outcomes. As can be observed in Table 2, there is a quite systematic pattern across tactile aid conditions such that the largest differences between the skilled and the less skilled subgroups can be seen for hard-wired, basic cognitive functions. Visual word decoding differs signifi-
Figure 3 Initial and final speech tracking performance as a function of speechreading skill for each condition. [Figure not reproduced: line plot of tracking rate in words (axis ticks 10 to 70) against training session (initial, final), with separate lines for the skilled and less skilled subgroups in the Minivib3, Tactilator, Tactaid 7, and visual conditions.]
Table 2 Mean performance for the skilled and less skilled speech tracking in the tests used in the study, expressed in proportions (speechreading tests and the memory tests) and in seconds (verbal information-processing speed and rhyme-judgment)

Speech tracking ability, initial training sessions

| Test | Minivib3/Tactaid 7/Visual: Skilled | Minivib3/Tactaid 7/Visual: Less skilled | Tactilator: Skilled | Tactilator: Less skilled |
|---|---|---|---|---|
| Speechreading ability | | | | |
| Word decoding | .48 (.15) | .20 (.24)* | .52 (.11) | .20 (.24)* |
| Sentence-based | .65 (.23) | .12 (.09)* | .77 (.14) | .12 (.09)* |
| Verbal ability | | | | |
| Antonym test | .46 (.13) | .40 (.29) | .53 (.09) | .40 (.29) |
| Analogy test | .49 (.15) | .31 (.35) | .61 (.15) | .31 (.35) |
| Inference-making | | | | |
| Sentence-completion | .72 (.06) | .68 (.16) | .76 (.08) | .68 (.16) |
| Short-term/working memory | | | | |
| Reading span | .40 (.10) | .45 (.11) | .46 (.11) | .45 (.11) |
| Digit span | .79 (.05) | .76 (.02) | .81 (.06) | .76 (.02) |
| Verbal info-processing speed | | | | |
| Semantic decision making | .64 (.13) | .80 (.14) | .62 (.10) | .80 (.14)* |
| Lexical decision making | .67 (.10) | .88 (.16)* | .66 (.10) | .88 (.16)* |
| Rhyme-judgment | | | | |
| Word-pairs | 1.11 (.20) | 1.67 (.53) | 1.14 (.22) | 1.67 (.53) |
| Pairs words/nonwords | 1.22 (.28) | 1.88 (.62) | 1.24 (.29) | 1.88 (.62) |
| Pairs nonwords (monosyllabic) | 1.28 (.38) | 1.79 (.52) | 1.21 (.27) | 1.79 (.52) |
| Pairs nonwords (bisyllabic) | 1.27 (.31) | 1.80 (.56) | 1.22 (.23) | 1.80 (.56) |
| Rhyme overall | 1.21 (.27) | 1.77 (.46)* | 1.20 (.25) | 1.77 (.46)* |

*p < .05, two-tailed t test.

Downloaded from jdsde.oxfordjournals.org by guest on July 6, 2011
Journal of Deaf Studies and Deaf Education 3:2 Spring 1998
Table 3 Accuracy level for skilled and less skilled speech tracking for the tests of verbal information-processing speed and rhyme-judgment for each group

Speech tracking ability, initial training sessions

| Test | Minivib3/Tactaid 7/Visual: Skilled | Minivib3/Tactaid 7/Visual: Less skilled | Tactilator: Skilled | Tactilator: Less skilled |
|---|---|---|---|---|
| Verbal information-processing speed | | | | |
| Semantic decision making | .95 (.02) | .95 (.02) | .97 (.03) | .95 (.02) |
| Lexical decision making | .92 (.04) | .87 (.10) | .94 (.03) | .87 (.10) |
| Rhyme-judgment | | | | |
| Word-pairs | .77 (.12) | .76 (.19) | .83 (.13) | .76 (.19) |
| Pairs words/nonwords | .84 (.08) | .87 (.08) | .91 (.04) | .87 (.08) |
| Pairs nonwords (monosyllabic) | .85 (.07) | .79 (.14) | .87 (.04) | .79 (.14) |
| Pairs nonwords (bisyllabic) | .92 (.08) | .95 (.06) | .96 (.06) | .95 (.06) |

*p < .05, two-tailed t test.
that only the processing speed indices, especially the rhyme (overall) condition (i.e., internal speech), were significant. The less skilled subjects were more than half a second slower in deciding whether two items rhymed. Accuracy data for the reaction-time tests presented in Table 3 suggest that speech tracking skill is not a factor when it comes to variations in the representational aspects of phonology. This leaves us with the interpretation that the observed speed deficit is tied to a slowing down in internal speech processing rather than to an impoverished representation of print. Although not presented in table format, this state of affairs was further reinforced by the absence of any significant correlation between the accuracy levels of the speed indices and tracking performance. However, the correlational analysis among the rest of the variables in the battery and tracking performance confirms the picture from the subgroup data in that decoding, sentence-based speechreading, and rhyme speed all are significantly correlated with the tracking rates in all four conditions (see Table 4). Also, it seems clear that when rhyming is mediated by a lexical component (i.e., conditions with real words), the correlations are somewhat higher and systematically significant.
All in all, the cognitive data replicate the general message from our previous models on the cognitive architecture underlying speechreading skill (cf. Ronnberg, Samuelsson, & Lyxell, in press) and success with cochlear implants (Lyxell et al., 1996). Basic speed, phonological, and decoding functions predominate as descriptors of subgroups and as predictors of speechreading performance in the general population. This pattern of cognitive prerequisites is apparently generalizable across many speechreading conditions and tracking training, suggesting that cognitive abilities are always important for assessment as well as rehabilitation issues in the area of speech communication in the hearing-impaired and deaf.
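The significance markers attached to the correlations in Table 4 can be checked with a short calculation. The sketch below assumes, since this is not stated with the table, that the coefficients are Pearson correlations over all 14 participants and that the test is two-tailed. It converts r to a t statistic with n - 2 degrees of freedom and integrates the t density numerically; the function names are illustrative only.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tailed_p(r, n, steps=2000):
    """Two-tailed p value for a correlation r from n pairs (t test, df = n - 2)."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule (steps must be even) for the area under the density on [0, t].
    h = t / steps
    area = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    # The density is symmetric, so both tails together hold 1 - 2 * area.
    return 1 - 2 * area

N = 14  # assumption: every participant contributes to each correlation
for r in (-0.48, -0.52, -0.54, 0.58):
    print(f"r = {r:+.2f}  p = {two_tailed_p(r, N):.3f}")
```

Under these assumptions the .05 boundary falls between |r| = .52 and |r| = .54, which agrees with the pattern of asterisks in Table 4.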
Discussion
The aim of the present study was to examine whether profoundly hearing-impaired adults with a postlingual acquisition of their impairment can improve their speechreading ability, with or without tactile support, during speech tracking training. Training in four types of speechreading/tactile-supported speechreading conditions was carried out by means of a computer-assisted speech tracking procedure (Gnosspelius & Spens, 1992). A cognitive analysis based on previous modeling constituted the basis for the chosen test battery (Ronnberg, 1990; Ronnberg et al., in press).
cantly between the two subgroups for all tracking conditions. This also holds true for the visual sentence-based speechreading test. Sentence-based speechreading has previously been found to be highly correlated with visual word decoding (Ronnberg, 1990), here indirectly suggesting that word decoding is important to both the tracking and the sentence-based speechreading measures. And, a further important piece of information is
Table 4 Correlations among speech tracking conditions and cognitive tests in the battery

| Test | Minivib3 | Tactilator | Tactaid 7 | Visual |
|---|---|---|---|---|
| Speechreading ability | | | | |
| Word decoding | .58* | .61* | .62* | .59* |
| Sentence-based | .79* | .85* | .80* | .78* |
| Verbal ability | | | | |
| Antonym test | .21 | .28 | .25 | .21 |
| Analogy test | .35 | .44 | .40 | .37 |
| Inference-making | | | | |
| Sentence-completion | .29 | .31 | .30 | .28 |
| Short-term/working memory | | | | |
| Reading span | .07 | .15 | .07 | .01 |
| Digit span | .36 | .33 | .32 | .33 |
| Verbal info-processing speed | | | | |
| Semantic decision making | -.48 | -.61* | -.54* | -.52 |
| Lexical decision making | -.58* | -.65* | -.62* | -.60* |
| Rhyme-judgment | | | | |
| Word-pairs | -.54* | -.61* | -.61* | -.58* |
| Pairs words/nonwords | -.56* | -.62* | -.62* | -.58* |
| Pairs nonwords (monosyllabic) | -.45 | -.53 | -.50 | -.44 |
| Pairs nonwords (bisyllabic) | -.46 | -.52 | -.51 | -.46 |
| Rhyme overall | -.56* | -.63* | -.62* | -.57* |

*p < .05.

First, the results suggest that it is possible to improve speech tracking ability substantially by means of a relatively low-intensity training program, extended over 10 sessions. It may be the case that the computerized presentation program itself helps in making the connected discourse tracking more effective (Lunato & Weisenberger, 1994). Although the use of speech tracking as a method is less analytical in its approach (Plant, 1986), training effects obviously do occur. However, the lack of differentiation of training effects across the specific tactile aids is not what we had initially expected. One argument (Plant, 1986) is that when analytical training is not explicitly used, as was the case in the present study, one should not expect differences in the ways tactile aids convey manner of speech and voicing (Alcantara et al., 1987). Also, the present group of participants had not used any sensory devices before puberty, which may hinder the development of more specific speech perception/production skills (cf. Osberger, Maso, & Sam, 1993; Tillberg, Ronnberg, Svard, & Ahlner, 1996). If the above is true, one should expect no interaction with type of tactile aid, as all of them roughly contribute to the prosodic patterning of the signal only, throughout the entire training period. Also, as previously mentioned, the literature is not always conclusive on whether differences between single- and multichannel aids should be detected after training (Carney et al., 1988; Plant, 1986). The obvious possibility that we had not trained the participants long enough does of course exist. Hence, the potential advantages of more advanced speech processing (as for the Tactaid 7) may not have been picked up during the present 10-week program, especially given the nonanalytical training procedure used. The counterargument is that the training effects were rapidly detectable, especially so for the optimum rates, asymptoting after 3-4 trials. Still, there is no significant interaction with tactile aid condition in the data for optimum rates.
Second, the present data provide strong evidence to support the notion (as suggested by Ronnberg, 1995) that the cognitive prerequisites of the individual speechreader determine the potential benefit from speechreading training, as well as the benefit from the
tactile aid. Thus, the utility of a training program is directly dependent on the speechreader's cognitive prerequisites to process information. Assessment and rehabilitation procedures must obviously accept and take into account the speechreader's information-processing capacities to a larger extent than has previously been the case. This general thesis is beginning to gain some recognition in the general literature (Knutson, 1995; Pisoni, 1997; Summerfield & Marshall, 1995).
A minimal test battery based on the present data should consequently focus on speed (especially phonology, cf. Lyxell et al., 1994) and decoding functions, as long as the group of speechreaders is drawn from a representative segment of the population. However, we know that for expert visual speechreaders, working memory capacity and inference-making ability come into play in a more pronounced fashion, as if there were a threshold that determined what can be effectively decoded from the visual speech signal as such (Lyxell, 1994; Ronnberg, 1993; Ronnberg et al., in press). To shed further light on the rehabilitation aspect, we computed a three-group ANOVA using the skilled, the less skilled, and the intermediate group. Here, evidence was found for a significant interaction between group and training, generally suggesting that the intermediate group gains the most from the initial to the final training sessions. Below a certain hypothetical threshold of cognitive abilities (Ronnberg et al., in press), there seems to be little gain in terms of training (i.e., for the less skilled group, 15 to 21 wpm). For the skilled group, the improvement is presumably restricted by functional ceiling effects (i.e., 56 to 66 wpm). The five best participants are very skilled compared to other independent data on experts (Ronnberg, 1993). This leaves us with the intermediate group, which demonstrated a steeper learning slope (i.e., from 37 to 50 wpm). This constraint is important to keep in mind when aural rehabilitation procedures are implemented as well as evaluated.
Although similarities prevail between the cognitive prerequisites for tactile aid usage and cochlear implant success (Lyxell et al., 1996), there is one interesting difference that needs to be pointed out. Working memory capacity is not important for tactile aid use but seems to be a crucial prerequisite for implantees to function well (i.e., at the level of understanding speech without visual contact). We hypothesize that this difference is to be attributed to the fact that tactile information interacts with the visual in a way that provides information about speech, but only indirectly about phonology, whereas a cochlear implantation more directly stimulates auditory/phonological neural transformations. For that reason, we suggest that the phonological loop, which is inherent in a working memory system for lipreading (Ronnberg et al., in press), at least initially demands more relearning and recoding with a cochlear implant compared to the use of supplementary tactile information in speechreading. The implication of this reasoning is that the cochlear implantee is in need of a more capacious working memory.
References
Ackerman, P. L. (1992). Predicting individual differences in complex skill acquisition: Dynamics of ability determinants. Journal of Applied Psychology, 77, 598-614.
Alcantara, J. I., Blamey, P. J., & Clark, G. M. (1993). Tactile-auditory speech perception by unimodally and bimodally trained normal-hearing subjects. Journal of American Academic Audiology, 4, 98-108.
Alcantara, J. I., Cowan, R. S. C., Blamey, P. J., & Clark, G. M. (1990). A comparison of two training strategies for speech recognition with an electrotactile device. Journal of Speech and Hearing Research, 33, 195-204.
Allen, S. (1970). Frequency dictionary of present-day Swedish. (In Swedish: Nusvensk frekvensbok.) Stockholm: Almquist & Wiksell.
Ausmeel, (1988). TIPS (Text-Information-Processing-System): A user's guide. Manuscript, Department of Education and Psychology, Linkoping University, Sweden.
Baddeley, A. D., Logie, R., Nimmo-Smith, I., & Brereton, N. (1985). Components of fluent reading. Journal of Memory and Language, 24, 119-131.
Bernstein, L. E. (1995). Toward future tactile aids. In G. Plant & K.-E. Spens (Eds.), Profound deafness and speech communication (pp. 147-162). London: Whurr Publications.
Bernstein, L. E., Demorest, M. E., Coulter, D. C., & O'Connell, M. P. (1991). Lipreading sentences with vibrotactile vocoders: Performance of normal-hearing and hearing-impaired subjects. Journal of the Acoustical Society of America, 90, 2971-2984.
Bernstein, L. E., Eberhart, S. P., & Demorest, M. E. (1989). Single-channel vibrotactile supplements to visual perception of intonation and stress. Journal of the Acoustical Society of America, 85, 397-405.
Boothroyd, A., & Hnath-Chisolm, T. (1986). Lipreading with tactile supplement. Journal of Rehabilitation Research and Development, 23, 139-146.
Boothroyd, A., & Hnath-Chisolm, T. (1988). Spatial, tactile presentation of voice fundamental frequency as a supplement to lipreading: Results of extended training with a single subject. Journal of Rehabilitation Research and Development, 25, 51-56.
Brooks, P. L., & Frost, B. J. (1983). Evaluation of a tactile vocoder for word recognition. Journal of the Acoustical Society of America, 74, 34-39.
Brooks, P. L., Frost, B. J., Mason, J. L., & Gibson, D. M. (1986a). Continuing evaluation of Queen's University tactile vocoder I: Identification of open set words. Journal of Rehabilitation Research and Development, 23, 119-128.
Brooks, P. L., Frost, B. J., Mason, J. L., & Gibson, D. M. (1986b). Continuing evaluation of Queen's University tactile vocoder II: Identification of open set sentences and tracking narrative. Journal of Rehabilitation Research and Development, 23, 129-138.
Carney, A. E. (1988). Vibrotactile perception of segmental features of speech: A comparison of single-channel and multichannel instruments. Journal of Speech and Hearing Research, 31, 438-448.
Carney, A. E., & Beachler, C. R. (1986). Vibrotactile perception of suprasegmental features of speech: A comparison of single-channel and multi-channel instruments. Journal of the Acoustical Society of America, 79, 131-140.
Carney, A. E., Kienle, M., & Miyamoto, R. T. (1990). Speech perception with a single-channel cochlear implant: A comparison with a single-channel tactile device. Journal of Speech and Hearing Research, 33, 229-237.
DeFilippo, C. L., & Scott, B. L. (1978). A method for training and evaluating the reception of ongoing speech. Journal of the Acoustical Society of America, 63, 1186-1192.
Gnosspelius, J., & Spens, K.-E. (1992). A computer-based speech tracking procedure. Speech Transmissions Laboratory Quarterly Progress and Status Report, 2, 131-137.
Kishon-Rabin, L., Boothroyd, A., & Hanin, L. (1996). Speechreading enhancement: A comparison of spatial-tactile display of voice fundamental frequency (F0) with auditory F0. Journal of the Acoustical Society of America, 100, 593-602.
Knutson, J. F. (1995). Psychological and social issues in cochlear implant use (abstract). 100th NIH Consensus Development Conference: Cochlear Implants in Adults and Children. Washington, DC: National Institute of Health.
Lunato, K. E., & Weisenberger, J. M. (1994). Comparative effectiveness of correction strategies in connected discourse tracking. Ear & Hearing, 15, 362-370.
Lynch, M. P., Eilers, R. E., Oller, D. K., Urbano, R. G., & Pero, P. J. (1989). Multisensory narrative tracking by a profoundly deaf subject using an electrocutaneous vocoder and a vibrotactile aid. Journal of Speech and Hearing Research, 32, 331-338.
Lyxell, B. (1994). Skilled speechreading: A single case study. Scandinavian Journal of Psychology, 35, 212-219.
Lyxell, B., Andersson, J., Arlinger, S., Bredberg, G., Harder, H., & Ronnberg, J. (1996). Verbal information-processing capabilities and cochlear implants: Implications for preoperative predictors of speech understanding. Journal of Deaf Studies and Deaf Education, 1, 190-201.
Lyxell, B., & Ronnberg, J. (1989). Information-processing skills and speechreading. British Journal of Audiology, 23, 339-347.
Lyxell, B., & Ronnberg, J. (1991). Visual speech processing: Word decoding and word discrimination related to sentence-based speechreading and hearing-impairment. Scandinavian Journal of Psychology, 32, 9-17.
Lyxell, B., & Ronnberg, J. (1992). Verbal ability and speechreading. Scandinavian Audiology, 21, 67-72.
Lyxell, B., Ronnberg, J., & Samuelsson, S. (1994). Internal speech functioning and speechreading in deafened and normal hearing adults. Scandinavian Audiology, 23, 179-185.
Ohngren, G. (1992). Touching voices: Components of direct tactually supported speechreading. PhD dissertation, University of Uppsala, Sweden.
Ohngren, G., Ronnberg, J., & Lyxell, B. (1992). Tactiling: A usable support system for speechreading? British Journal of Audiology, 26, 167-173.
Osberger, M. J., Maso, M., & Sam, L. K. (1993). Speech intelligibility of children with cochlear implants, tactile aids, or hearing aids. Journal of Speech and Hearing Research, 36, 186-203.
Osberger, M. J., Robbins, A. M., Todd, S. L., & Brown, C. J. (1991). Initial findings with a wearable multichannel vibrotactile aid. The American Journal of Otology, 12(suppl), 179-191.
Pisoni, D. B. (1997, May). Cognitive factors and cochlear implants: A theoretical overview of the role of perception, attention, learning and memory in speech perception. Paper presented at the 5th International Cochlear Implant Conference, New York.
Plant, G. (1986). A single-transducer vibrotactile aid to lipreading. Speech Transmissions Laboratory Quarterly Progress and Status Report, 1, 41-63.
Plant, G. (1987). A single-transducer vibrotactile aid to lipreading. Speech Communication, 6, 335-342.
Plant, G. (1989). A comparison of five commercially available tactile aids. Australian Journal of Audiology, 11, 11-19.
Plant, G., & Spens, K.-E. (Eds.). (1995). Profound deafness and speech communication. London: Whurr Publications.
Ronnberg, J. (1990). Cognitive and communicative function: The effects of chronological age and "handicap age." European Journal of Cognitive Psychology, 2, 253-273.
Ronnberg, J. (1993). Cognitive characteristics of skilled Tactiling: The case of GS. European Journal of Cognitive Psychology, 5, 19-33.
Ronnberg, J. (1995). What makes a skilled speechreader? In G. Plant & K.-E. Spens (Eds.), Profound deafness and speech communication (pp. 393-416). London: Whurr Publications.
Ronnberg, J., Samuelsson, S., & Lyxell, B. (in press). Conceptual constraints in speechreading. In R. Campbell & B. Dodd (Eds.), Hearing by eye II: The psychology of speechreading and audiovisual speech. London: Lawrence Erlbaum.
Saunders, F. A., & Franklin, B. (1985). Field tests of a wearable 16-channel electrotactile sensory aid in a classroom for the deaf. Journal of the Acoustical Society of America, 78(suppl), 17.
Sherrick, C. E. (1982). Cutaneous communication. In W. D. Neff (Ed.), Contributions to sensory physiology (pp. 1-43). New York: Academic Press.
Sparks, D. W., Ardell, L. A., Bourgeois, M., Wiedmer, B., & Kuhl, P. K. (1979). Investigating the MESA (Multipoint Electrotactile Speech Aid): The transmission of connected discourse. Journal of the Acoustical Society of America, 65, 810-815.
Spens, K.-E. (1995). Evaluation of speech tracking results: Some numerical considerations and results. In G. Plant & K.-E. Spens (Eds.), Profound deafness and speech communication (pp. 417-437). London: Whurr Publications.
Summerfield, Q. (1987). Some preliminaries to a comprehensive account of audio-visual speech perception. In B. Dodd & R. Campbell (Eds.), Hearing by eye: The psychology of lipreading (pp. 3-51). London: Lawrence Erlbaum.
Summerfield, Q., & Marshall, D. H. (1995). Cochlear implantation in the UK 1990-1994. Report by the MRC Institute of Hearing Research on the Evaluation of the National Cochlear Implant Programme.
Tactaid 7: User's manual. (1991). Audiological Engineering Corporation, Somerville, MA, USA.
Tillberg, I., Ronnberg, J., Svard, I., & Ahlner, B. (1996). Audiovisual tests in a group of hearing-aid users: The effects of onset age, handicap age, and degree of hearing loss. Scandinavian Audiology, 25, 267-272.
Weisenberger, J. M., Broadstone, S. P., & Kozma-Spytek, L. (1991). Relative performance of single-channel and multichannel tactile aids for speech perception. Journal of Rehabilitation Research and Development, 28, 45-56.
Weisenberger, J. M., & Kozma-Spytek, L. (1991). Evaluating tactile aids for speech perception and production by hearing-impaired adults and children. American Journal of Otology, 12(suppl), 188-200.
Weisenberger, J. M., & Miller, J. D. (1987). The role of tactile aids in providing information about acoustical stimuli. Journal of the Acoustical Society of America, 82, 906-916.
Weisenberger, J. M., & Russel, A. F. (1989). Comparison of two single-channel vibrotactile aids for the hearing-impaired. Journal of Speech and Hearing Research, 32, 83-92.