ISSN 0974-2131

Journal of Indian Speech Language and Hearing Association Vol. 26 No. 2

2012

Table of Contents

Representation of Speech Sounds at the Auditory Brainstem ............ 01
    Hemanth N. & Manjula P.
Test Retest Reliability of Cochlear Hydrops Analysis Masking Procedure (CHAMP) ............ 14
    Kanchan Kumari, Jyothi & Sujith Kumar Sinha
Acoustic Analysis of Speech Processed Through a Compression Hearing Aid ............ 20
    Srikanth Chundu
Sound Localization Abilities in Children using Cochlear Implant ............ 30
    Prakash S. G. R., Sharath N., Winnie Alex & Rathna Kumar
A Pilot Ultrasound Study of Kannada Lingual Articulations ............ 38
    Alexei Kochetov, N. Sreedevi, Midula Kasim & R. Manjula
Symbolic Play, Language and Cognition ............ 50
    Devika M. R., Swapna N. & Navitha U.
Phonological Mean Length of Utterance ............ 69
    Priyanka Jaisinghani, Akshay M. & N. Sreedevi
Iconic Memory and Echoic Memory ............ 76
    Vinaya Ann Koshy, Jyothi Thomas, Theaja Kuriakose & Meghashree
Automatic versus Volitional Mechanisms ............ 82
    Abhishek B. P. & Prema K. S. Rao

Journal of Indian Speech Language and Hearing Association

EDITORIAL BOARD
Prof. N. Shivashankar, Chairperson - JISHA
Dr. S. P. Goswami, Editor - JISHA
Dr. Vijaya Kumar Narne, Assistant Editor
Dr. Divya Menon, Associate Editor
Dr. Gowrishankar Patil, Associate Editor
Dr. Banumathy N., Associate Editor

Reviewers

Mrs. Aparna N. Nandurkar

Dr. Sandeep Maruthy

Dr. Ajith Kumar U.

Dr. Sridevi N

Prof. Y. V. Geetha

Mr. Sujithkumar Sinha

Dr. Preeja Balan

Dr. Vani Rupella

Dr. Swapna N

Prof. S. Venkatesan

Dr. Santhosh M

© 2012 A Publication of the Indian Speech & Hearing Association (Registered under the Karnataka Societies Registration Act - Karnataka Act No. 17. Registration No. S 25/67-68) under the title "Journal of Indian Speech Language and Hearing Association"
Price: ₹ 500/-    Pages: 90

Indian Speech and Hearing Association Regd. under the Karnataka Societies Registration Act. Karnataka Act No. 17 Registration No. 25/67-68

Executive Committee for 2012-13

President Prof. M. Jayaram E-mail: [email protected]

Past President Dr. Kalyani N. Mandke E-mail: [email protected]

President Elect Shri Babul Basu E-mail: [email protected]

Gen. Secretary Dr. Krishna Y. E-mail: [email protected]

Additional Secretary Dr. Lakshmi Venkatesh E-mail: [email protected]

Treasurer Dr. Prakash B Email : [email protected]

Chairperson - JISHA Prof. N. Shivashankar E-mail: [email protected]

Chairman of BOS & ES Prof. T. A. Subbarao E-mail : [email protected]

Editor, JISHA Dr. S. P. Goswami E-mail: [email protected]

EC member Mrs. Aparna N. Nandurkar E-mail: [email protected]

EC member Dr. Venkataraja Aithal U E-mail: [email protected]

EC member Mr. Suman Kumar Email : [email protected]

JISHA, 26 (2) 1-13

Representation of Speech Sounds at the Auditory Brainstem


Hemanth N.¹ & Manjula P.¹
¹All India Institute of Speech and Hearing, Mysore

Abstract

Objectives: To study auditory evoked brainstem potentials in terms of transient, transition and sustained portions in response to the naturally produced onset, transition and sustained portions of speech stimuli. Method: Frequency following responses (FFR) were recorded from 40 participants between 15 and 65 years of age. The participants were divided into four subgroups: Group A = 15-25 yrs; B = 26-35 yrs; C = 36-55 yrs; and D = 56-65 yrs. The speech stimuli /da/ and /si/ were presented separately through a loudspeaker at 65 dB SPL. The responses, recorded from a vertical montage, were measured in terms of latency and amplitude. Results: The latency of the transient responses and FFR for both stimuli was prolonged with age. Further, the latencies of the transient response and FFR in Group 'D' were significantly different from the other groups. However, the findings on amplitude did not follow a specific pattern with respect to age for either stimulus. Conclusion: Taken with the existing body of evidence on brainstem encoding of speech sounds, the data of the present study show that speech sounds are represented uniquely at the auditory brainstem and that this representation depends on the inherent acoustic cues.

Key Words: Frequency Following Response (FFR); loudspeaker (LS); speech stimuli.

The frequency-following response (FFR) is an electrophysiologic measure which assesses the auditory neural activity of the brainstem nuclei. It is a measure of phase-locked neural activity that represents responses to periodic acoustic stimuli, up to approximately 1,000 Hz (Smith, Marsh, & Brown, 1975; Stillman, Crow, & Moushegian, 1978). There is ample information regarding the representation of acoustic events at the brainstem nuclei using various stimuli. The FFR for the synthetic speech syllable /da/ (Cunningham, Nicol, Zecker, Bradlow, & Kraus, 2001; Plyler & Krishnan, 2001; King, Warrier, Hayes, & Kraus, 2002; Russo, Nicol, Musacchia, & Kraus, 2005; Wible, Nicol, & Kraus, 2004; Kraus & Nicol, 2005; Johnson, Nicol, Zecker, Bradlow, Skoe, & Kraus, 2008; Banai, Hornickel, Skoe, Nicol, Zecker, & Kraus, 2009; Chandrasekaran, Hornickel, Skoe, Nicol, & Kraus, 2009) and for Mandarin syllables of varying tone contours (Krishnan, Xu, Gandour, & Cariani, 2005; Xu, Krishnan, & Gandour, 2006; Wong, Skoe, Russo, Dees, & Kraus, 2007; Song, Skoe, Wong, & Kraus, 2008; Krishnan & Gandour, 2009) has been extensively studied. In addition to recording the FFR to synthetic /da/ under various recording conditions, viz., mode of stimulation (Cunningham et al., 2001; Parbery-Clark, Skoe, & Kraus, 2009), ear effect (Hornickel, Skoe, & Nicol, 2009), and the presence of background noise (Cunningham et al., 2001), the manipulation of stimulus parameters such as stimulus duration and formant frequency settings (Johnson et al., 2008) has also been investigated.

It is evident from the literature that synthetic speech syllables have been presented through insert earphones for recording the FFR. However, little has been reported on the utility of a loudspeaker for presenting the stimuli while recording the FFR. Investigating the effect of naturally produced speech stimuli and of a loudspeaker (LS) as the transducer for presenting the stimuli would be advantageous. It provides a means of assessing hearing sensitivity in difficult-to-test populations such as infants and very young children, who are reluctant to wear insert earphones (Walker, Dillon, & Byrne, 1984). It can also be used to measure the aided performance of individuals using a hearing device, which in turn helps in determining the functional gain (Beynon & Munro, 1995). Although synthetic speech allows the researcher to systematically manipulate acoustic features and thereby to know their representation in the auditory system, it does compromise naturalness. The synthetic English /da/ utilized in recording the FFR is different from that produced naturally in colloquial Kannada. In synthetic /da/, there is no voice onset time, which is an important feature in deciding whether a speech sound is voiced or voiceless.


In addition, the F0 is flat in the transition duration and carries the phonetic information of the verbal message alone. In naturally produced speech, however, the F0 varies and conveys even the para-linguistic information, or how the message is expressed (e.g., statement versus question, angry versus happy emotional state). The para-linguistic acoustic elements add a multidimensional aspect to speech that is separate from the phonetic information of the verbal message. The FFR for the alveolar plosive /da/ has been widely studied, as the initial phoneme /d/ is typically difficult to perceive for listeners with hearing loss (Bilger & Wang, 1976). The possible reason is the likelihood of backward masking, i.e., the vowel masking the consonant (Song, Banai, Russo, & Kraus, 2006). Though the alveolar fricative CV syllable /si/ has not been reported as a stimulus for eliciting the FFR in the literature, an attempt was made in the present study to learn how a fricative CV stimulus registers in the brainstem nuclei. The alveolar fricative /s/ has high frequency noise (Fant, 1960) and is here followed by the vowel /i/. This class of speech sounds is difficult to perceive for individuals with high frequency hearing loss. Megighian, Savastano, Salvador, Frigo, and Bolzan (2000) reported that the incidence of high frequency hearing loss is relatively high in elderly individuals. Thus, including a high frequency speech stimulus while recording auditory brainstem potentials would throw light on how these speech sounds are represented in elderly individuals. In the past several decades, a considerable amount of research and theoretical speculation has accumulated on age-related anatomical (Caspary, Schatteman, & Hughes, 2005) and functional (Chisolm, Willott, & Lister, 2003) changes. Schneider, Daneman, and Pichora-Fuller (2002) have reported concomitant changes in peripheral mechanisms with aging, leading to disrupted access to information at the higher auditory centers. This might result in difficulty understanding speech. Electrophysiologic measurements can be utilized to understand how the auditory physiological mechanisms that change with aging affect speech processing. Thus, the present study aimed at recording the FFR for the naturally produced speech stimuli /da/ and /si/ presented through a loudspeaker. The specific objectives formulated were to study the

latency and amplitude of (a) the transient response to the onset and CV boundary of the /da/ stimulus and the FFR to the transition portion of the /da/ stimulus; (b) the transient response to the CV boundary of the /si/ stimulus and the FFR to the steady state portion of the /si/ stimulus; and (c) the transient responses and FFR to the /da/ and /si/ stimuli across age groups.

Method

Participants

Forty individuals between 15 and 65 years of age participated in this study. The participants were sub-grouped based on age range: Group A, 15 to 25 years; Group B, 26 to 35 years; Group C, 36 to 55 years; and Group D, 56 to 65 years. Each sub-group consisted of ten participants. All the participants had normal hearing as assessed by pure tone air- and bone-conduction thresholds from 250 to 8000 Hz (less than 15 dB HL). Their tympanometric and acoustic reflex findings indicated normal middle ear status. The participants had no history of neurological, cognitive or speech and language problems.

Preparation of Stimuli

Three adult male speakers with normal voice, whose mother tongue was Kannada (a Dravidian language widely spoken in Karnataka, South India), were chosen to utter the voiced retroflex plosive (/da/) and unvoiced alveolar fricative (/si/) consonant-vowel (CV) syllables with normal vocal effort. Sridhar (1990) defines Kannada as one of the four major literary languages of the Dravidian family, spoken by over 30 million people, primarily in the state of Karnataka (formerly Mysore), South India. Tokens uttered by male speakers were used because the male fundamental frequency ranges from 100 to 140 Hz (Fant, 1960), which aids in studying the temporally spaced inter-peak differences corresponding to the wavelength of the stimulus. A total of six CV tokens (three /da/ and three /si/, one of each from each speaker; the /da/ tokens had the same duration, as did the /si/ tokens) were recorded using Adobe Audition (V-3) software. The recorded stimuli were digitized using a 32-bit processor at a 44,100 Hz sampling frequency. A goodness test was performed in order to verify the test stimuli, in which ten listeners with normal hearing rated the stimuli for naturalness.


Two stimuli, /da/ and /si/, produced by the same speaker, which were rated as being most natural, were selected as the test stimuli. The duration of the /da/ and /si/ stimuli was 94 msec. and 301 msec. respectively. For the syllable /da/, the voice onset time was 18 msec., the burst duration was 5 msec., the transition duration was 37 msec., and the vowel duration was 34 msec. For the syllable /si/, the fricative duration was 149.4 msec. and the vowel duration was 151.6 msec. Both the stimuli were converted from '.wav' to '.avg' format using the wavtoavg m-file of the Brainstem Toolbox. The '.avg' versions of both stimuli were band pass filtered from 100 to 1000 Hz using Neuroscan (Scan 2-version 4.4) to examine the functional relationship between the acoustic structure of speech and the brainstem response to speech. The 'stimulus.avg' files, waveforms and spectrograms of the two CV tokens are depicted in Figure 1. Table 1 summarizes the fundamental frequency and the first two formant frequencies of the /da/ and /si/ stimuli. The onset and steady state F0, F1 and F2 within the transition duration (37 msec.) of the /da/ stimulus, and the frequency components within the initial sustained duration (42 msec.) of the /i/ portion of the /si/ stimulus, were measured using Praat (version 5.1.29) software.

Table 1. Fundamental frequency and the first 2 formant frequencies of each stimulus

            F0 (Hz)               F1 (Hz)               F2 (Hz)
Stimuli     Onset  Steady state   Onset  Steady state   Onset  Steady state
/da/        135    133            480    552            1811   1703
/si/        147    138            300    318            2099   2423

Figure 1: (a) and (d) show the /da/ and /si/ stimuli; the dark solid line in each indicates the F0, which has a falling pattern, and formant frequencies are represented with white lines. The F1 and F2 of /da/ show a rising pattern; the F1 of the /si/ stimulus shows a falling pattern and its F2 a rising pattern. (b) and (e) are the waveforms of the /da/ and /si/ stimuli: for the syllable /da/, the voice onset time was 18 msec., the burst duration 5 msec., the transition duration 37 msec. and the vowel duration 34 msec.; for the syllable /si/, the fricative duration was 149.4 msec. and the vowel duration 151.6 msec. (c) and (f) are the stimulus.avg of the speech stimuli /da/ and /si/.

Acquiring the Frequency Following Response

A new session for each participant was created in the Neuro-scan AEP system by entering and saving the details of the participant in the patient demographics. Each participant was seated comfortably in an armed reclining chair. The electrode sites were cleaned with skin preparing gel. Disc type silver coated electrodes were placed using conduction gel at the test sites. The FFR was recorded using a vertical montage: the non-inverting electrode (+) was placed on the vertex (Cz), the ground electrode on the upper forehead (Fpz), and the inverting electrode (-) on the mastoid of the test ear (Ai). It was ensured that the electrode impedance was less than 5 kOhms for all the electrodes and that the inter-electrode impedance was less than 2 kOhms.

The loudspeaker was placed at 45 degrees azimuth from the test ear of the participant, at a calibrated distance of 1 meter. The height of the loudspeaker was at the level of the test ear of the participant. The participation of the left ear was ruled out by presenting a 65 dB HL speech noise through an insert earphone from a calibrated audiometer. The participant was instructed to ignore the stimulus and watch a muted movie played on a battery operated laptop computer. He/she was also asked to minimize head movement.

For recording the FFR, the stimulus /da/ was presented through the loudspeaker at a presentation level of 65 dB SPL to the right ear. The PC-based evoked potential system, Neuroscan (Stim 2-version 4.4), controlled the timing of stimulus presentation and delivered an external trigger to the evoked potential recording system, Neuroscan (Scan 2-version 4.4). To allow for a sufficient refractory period within the stimulus sweep, while minimizing the total recording time, an inter-stimulus interval of 93 msec. was used.


The FFR recording was initiated once a stable EEG was obtained. All artifacts exceeding ±35 microvolts were rejected while averaging the response. The recording window consisted of a 30 msec. pre-stimulus period and a 130 msec. post-stimulus time. The evoked responses were converted from analog to digital at a rate of 20,000 Hz and band-pass filtered online from 100 to 2000 Hz, with a 12 dB/octave roll-off. Further analysis was carried out offline: the epoched waveform was corrected for baseline, and the responses were averaged and filtered off-line from 100 Hz (high-pass, 12 dB/octave) to 1000 Hz (low-pass, 12 dB/octave). The waveforms recorded from 1000 sweeps, obtained separately in condensation and rarefaction polarities, were added for further analysis of the discrete peaks of the FFR. A similar procedure was repeated to record the FFR for the /si/ stimulus; however, the recording window consisted of a pre-stimulus period of 30 msec. and a post-stimulus period of 330 msec., and an inter-stimulus interval of 113 msec. was used. The order of the stimuli was counterbalanced across participants.
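The offline steps described above (artifact rejection, averaging, addition of the two stimulus polarities, and 100-1000 Hz band-pass filtering) can be sketched in Python as follows. This is a minimal illustration rather than the Neuroscan pipeline itself; the epoch arrays are placeholders, and the filter order is an assumption chosen to approximate the reported 12 dB/octave roll-off.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 20_000  # A/D rate in Hz, as reported above

def average_sweeps(epochs_uv, reject_uv=35.0):
    """Average epochs (n_sweeps x n_samples, in microvolts), rejecting
    any sweep whose absolute amplitude exceeds +/-35 microvolts."""
    keep = np.max(np.abs(epochs_uv), axis=1) <= reject_uv
    return epochs_uv[keep].mean(axis=0)

def offline_bandpass(avg, fs=FS, lo=100.0, hi=1000.0):
    """Zero-phase 100-1000 Hz band-pass; a first-order Butterworth run
    forward and backward roughly matches a 12 dB/octave roll-off."""
    b, a = butter(1, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, avg)

# 1000 sweeps per polarity; 160 msec. epochs (-30 to +130 msec.) at 20 kHz
# give 3200 samples. Adding the condensation and rarefaction averages
# emphasises the phase-locked FFR and cancels stimulus artifact.
cond = np.random.randn(1000, 3200)  # placeholder condensation epochs
rare = np.random.randn(1000, 3200)  # placeholder rarefaction epochs
ffr = offline_bandpass(average_sweeps(cond) + average_sweeps(rare))
```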

Description of the brainstem response to speech stimuli

The brainstem response to speech is a complex response. The onset/transient response to /d/ and the FFR to the transition portion of the /da/ stimulus, and the CV boundary response from /s/ to /i/ and the FFR to the steady state portion of the /si/ stimulus, are depicted in Figure 2.

The stimulus /da/ is a voiced plosive CV syllable (Fant, 1960). The initial stop /d/ carries a VOT of 18 msec. and is produced by placing the tongue at the roof of the mouth and blocking the airflow, followed by a sudden release of the blockage resulting in a transient burst of noise. The response to the onset of the speech stimulus /da/ includes a positive peak (V), analogous to the peak V elicited by click stimuli, followed immediately by a negative trough (A) (Russo, Nicol, Musacchia, & Kraus, 2004). Following the burst of noise there is a shift of the articulators from /d/ to /a/, resulting in the CV boundary, which is represented by the peak C (Russo, Nicol, Musacchia, & Kraus, 2004).

Figure 2: The peak V and trough A responses to the onset/transient portion of /d/ had latencies of less than 45 msec., and peak C had a latency of 46.45 msec. The FFR to the transition portion of the /da/ stimulus includes discrete peaks D, E, F, and G, which represent the falling pattern of F0. In the stimulus /si/, the CV boundary corresponds to peak a1, with a latency of 188.68 msec. The response to the steady state portion of /i/ in /si/ contains the discrete peaks a, b, c, d, e, and f, which follow the wavelength of the fundamental frequency of the sustained portion of /i/ in /si/.

Table 2. Brainstem response to naturally produced speech stimuli

                      Peak latency components              Peak amplitude components
Response              /da/           /si/                  /da/           /si/
Transient response    V, A, & C      a1                    V, A, & C      a1
Transition response   D, E, F, & G   -                     D, E, F, & G   -
Sustained response    -              a, b, c, d, e, & f    -              a, b, c, d, e, & f


The response to the transition portion of the speech stimulus is its periodicity, which represents the frequency information contained in the stimulus (FFR). The formant transition period is defined as the portion of the response corresponding to the formant transition period of the stimulus (37 msec.) (Russo, Nicol, Musacchia, & Kraus, 2004). The response to the sustained portion of the stimulus /da/ was not taken into account in the analysis.

The stimulus /s/ is a fricative sound: it is produced by forcing air through a narrow constriction at the anterior part of the vocal tract, the release of air yielding aperiodic energy with a frication amplitude envelope (Fant, 1960). The response pertaining to the /s/ portion of /si/ is accordingly represented by aperiodic amplitude. Following the fricative aperiodic energy there is a shift of the articulators from /s/ to /i/, resulting in the CV boundary, which is represented by peak 'a1'. For the response to the sustained portion of /i/ in /si/, the waveform was analyzed over the initial 42 msec., in which the acoustic parameters were found to be variable. The sustained response is defined as the portion of the response corresponding to the steady-state portion of the stimulus. The response waveform was evaluated for the latency and amplitude of the onset/transient response and of the FFR to the transition portion of /da/ and the steady state portion of /si/; the measured components are tabulated in Table 2.

Discrete peak measures

Measures of both latency and amplitude were utilized to assess the discrete peaks of the FFR. For participants in Group A, for the /da/ stimulus, the onset response peak 'V' and trough 'A' had latencies of less than 45 msec., while the peak C had a latency of about 46.45 msec. However, these latencies varied across sub-groups. The response to the transition portion of the stimulus /da/ includes the discrete peaks 'D', 'E', 'F', and 'G', which are negative potentials. The inter-peak differences gradually increase and represent the falling pattern of F0 in the transition duration of the /da/ stimulus. In the stimulus /si/, the CV boundary corresponds to peak 'a1', which occurred at a latency of 188.68 msec. The response to the steady state portion of /i/ in /si/ contains the discrete peaks 'a', 'b', 'c', 'd', 'e', and 'f', which are negative potentials, unequally spaced in time. The inter-peak differences follow the wavelength of the fundamental frequency of the sustained portion of /i/ in /si/.

Results

Descriptive analysis was carried out to study the nature of the transient response, the FFR to the transition portion of the /da/ stimulus, and the FFR to the steady state portion of the /si/ stimulus in the four age groups. The latency and amplitude of the transient responses and FFR obtained for the two stimuli /da/ and /si/ across age groups were analyzed using MANOVA in the Statistical Package for the Social Sciences (SPSS for Windows, Version 18). The post-hoc Duncan test was used when indicated.
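Because the discrete FFR peaks phase-lock to the stimulus periodicity, the inter-peak intervals can be read as instantaneous pitch periods. The sketch below picks negative peaks from an averaged waveform and converts the Group A mean latencies from Table 3 (below) into implied F0 values; the prominence threshold and array layout are illustrative assumptions.

```python
import numpy as np
from scipy.signal import find_peaks

def negative_peak_latencies(ffr, fs=20_000, t0_ms=30.0):
    """Latencies (msec. re: stimulus onset) of negative FFR peaks
    (D, E, F, G or a-f); `t0_ms` is the pre-stimulus interval."""
    idx, _ = find_peaks(-ffr, prominence=0.1)  # troughs = peaks of -ffr
    return idx / fs * 1000.0 - t0_ms

# Worked example with the Group A mean latencies from Table 3 (msec.):
lat = np.array([52.16, 59.11, 66.52, 74.61])  # peaks D, E, F, G
periods = np.diff(lat)                        # 6.95, 7.41, 8.09 msec.
implied_f0 = 1000.0 / periods                 # ~144, ~135, ~124 Hz
# The gradually widening intervals mirror the falling F0 of /da/:
# a longer period corresponds to a lower instantaneous F0.
print(periods, implied_f0)
```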

Table 3. The mean (M) and standard deviation (SD) of latency in msec. of transient response and FFR for the /da/ stimulus in each age group

Groups            V             A             C             D             E             F             G
Group A (N=10)    40.51 (3.07)  43.05 (3.11)  46.45 (2.49)  52.16 (2.08)  59.11 (3.26)  66.52 (3.01)  74.61 (3.56)
Group B (N=10)    42.29 (2.51)  44.59 (2.03)  48.32 (0.64)  54.04 (1.66)  61.61 (1.60)  69.52 (1.57)  76.76 (2.04)
Group C (N=10)    46.25 (2.44)  48.25 (2.52)  52.89 (3.27)  58.94 (4.48)  67.02 (4.74)  74.88 (4.51)  82.21 (4.11)
Group D (N=10)    52.53 (0.86)  55.13 (0.25)  57.71 (0.45)  63.89 (0.41)  71.71 (0.75)  78.47 (0.23)  85.94 (0.75)


The latency and amplitude of transient response and FFR to the transition portion of the /da/ stimulus

The mean (M) and standard deviation (SD) of the latency and amplitude of the transient response and FFR to the transition portion of the /da/ stimulus in each age group are tabulated in Tables 3 and 4. In Group 'A', the mean latencies of the transient response, i.e., 'V', 'A' and 'C', were 40.51 msec., 43.05 msec., and 46.45 msec. respectively. A similar trend in the latencies of the transient response was noticed in all other age groups. Further, the mean amplitudes of the transient response components 'V', 'A' and 'C' were 0.33 µV, -0.21 µV, and -0.39 µV respectively in Group 'A'. A similar trend was obtained in the amplitude of the transient response in the other groups.

Table 4. The mean (M) and standard deviation (SD) of amplitude in µV of transient response and FFR for the /da/ stimulus in each age group

Groups            V             A              C              D              E              F              G
Group A (N=10)    0.33 (0.24)   -0.21 (0.29)   -0.39 (0.28)   -0.75 (0.43)   -0.67 (0.35)   -0.45 (0.21)   -0.37 (0.18)
Group B (N=10)    0.19 (0.09)   -0.23 (0.14)   -0.28 (0.24)   -0.53 (0.66)   -0.39 (0.44)   -0.32 (0.43)   -0.27 (0.15)
Group C (N=10)    0.32 (0.27)   -0.23 (0.26)   -0.53 (0.97)   -0.54 (0.43)   -0.40 (0.29)   -0.22 (0.14)   -0.27 (0.24)
Group D (N=10)    0.74 (0.31)   -0.72 (0.43)   -0.24 (0.13)   -0.89 (0.33)   -0.49 (0.37)   -0.58 (0.41)   -0.51 (0.30)
The FFR to the transition portion of the stimulus includes the negative peaks 'D', 'E', 'F', and 'G'. The mean latencies of these FFR peaks were 52.16 msec., 59.11 msec., 66.52 msec. and 74.61 msec. respectively in Group 'A'. The inter-peak latency differences from 'D' to 'E', 'E' to 'F' and 'F' to 'G' increased gradually, and similar inter-peak latency differences were observed in the other age groups. Further, the amplitudes of the FFR peaks were -0.75 µV, -0.67 µV, -0.45 µV, and -0.37 µV respectively in Group 'A'. A similar trend was obtained in the amplitude of the FFR to the transition portion of /da/ in the other groups.

Table 6. The mean (M) and standard deviation (SD) of latency in msec. of transient response and FFR for the /si/ stimulus in each age group

Groups            a1             a              b              c              d              e               f
Group A (N=10)    188.68 (1.53)  195.06 (2.48)  201.73 (2.96)  208.72 (2.47)  215.50 (2.47)  223.13 (2.67)   230.62 (1.90)
Group B (N=10)    189.35 (3.98)  195.21 (2.80)  202.65 (2.08)  209.53 (3.11)  217.05 (2.33)  224.61 (0.67)   231.70 (0.83)
Group C (N=10)    190.58 (2.01)  196.89 (0.43)  203.56 (0.64)  210.90 (0.52)  218.04 (0.73)  224.93 (0.52)   233.09 (2.70)
Group D (N=10)    194.24 (5.53)  200.57 (5.89)  207.22 (6.72)  214.47 (7.11)  222.29 (7.98)  230.65 (10.90)  236.58 (7.13)


The latency and amplitude of transient response and FFR to the steady state portion of the /si/ stimulus

The mean (M) and standard deviation (SD) of the latency and amplitude of the transient response and FFR to the steady state portion of the /si/ stimulus in each age group are tabulated in Tables 6 and 7. The mean latency of the transient response 'a1' was 188.68 msec. in Group A, and the mean latencies of the transient response were prolonged with increasing age. Further, the amplitude of the transient response 'a1' was -0.38 µV in Group 'A', -0.27 µV in Group 'B', -0.36 µV in Group 'C' and -0.28 µV in Group 'D'.

Table 7. The mean (M) and standard deviation (SD) of amplitude in µV of transient response and FFR for the /si/ stimulus in each age group

Groups            a1             a              b              c              d              e              f
Group A (N=10)    -0.38 (0.17)   -0.50 (0.27)   -0.52 (0.29)   -0.39 (0.18)   -0.41 (0.27)   -0.37 (0.17)   -0.32 (0.31)
Group B (N=10)    -0.27 (0.14)   -0.31 (0.10)   -0.33 (0.12)   -0.26 (0.17)   -0.26 (0.16)   -0.31 (0.11)   -0.28 (0.16)
Group C (N=10)    -0.36 (0.20)   -0.50 (0.45)   -0.50 (0.47)   -0.37 (0.36)   -0.38 (0.39)   -0.32 (0.34)   -0.33 (0.27)
Group D (N=10)    -0.28 (0.98)   -0.31 (0.84)   -0.32 (0.85)   -0.22 (0.05)   -0.22 (0.09)   -0.23 (0.08)   -0.33 (0.13)
The FFR to the steady state portion includes the negative peaks 'a', 'b', 'c', 'd', 'e', and 'f'. The mean latencies of these peaks were 195.06 msec., 201.73 msec., 208.72 msec., 215.50 msec., 223.13 msec. and 230.62 msec. respectively in Group 'A'. The inter-peak latency differences from 'a' to 'b', 'b' to 'c', 'c' to 'd', 'd' to 'e', and 'e' to 'f' increased gradually, and a similar trend of inter-peak differences was observed in the other groups. Further, the amplitudes of the FFR peaks were -0.50 µV, -0.52 µV, -0.39 µV, -0.41 µV, -0.37 µV, and -0.32 µV respectively in Group 'A'; a similar trend was noticed in the amplitude of the FFR to the steady state portion of /i/ in /si/ in the other groups.

Latency and amplitude of transient response and FFR to the transition portion of the /da/ stimulus in different age groups

The relationship of the stimulus.avg waveform of /da/ to the transient/onset response and FFR across age groups is depicted in Figure 3. To determine whether there was any significant difference in the mean latency of the transient response and FFR across age groups, MANOVA was performed. A similar analysis was carried out on amplitude to see whether there was any significant difference across age groups in the transient response and FFR. It was found that there was a significant difference between age groups in the latency of the transient response and FFR; however, there was no significant difference in the amplitude of the transient response and FFR. The F ratios and significance values for the latency and amplitude of the transient response and FFR for the /da/ stimulus are tabulated in Table 5.

Table 5. The F (3,36) and significance values of MANOVA across age on latency and amplitude of transient response and FFR for the /da/ stimulus

              Latencies          Amplitudes
Components    F        p         F        p
V             50.53    0.00*     9.26     0.21
A             57.20    0.00*     6.60     0.31
C             57.59    0.00*     0.60     0.61
D             35.85    0.00*     1.32     0.28
E             34.97    0.00*     1.22     0.31
F             35.80    0.00*     2.21     0.10
G             30.90    0.00*     2.48     0.07

Note: * = significant at 0.01 level.
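The per-component F(3,36) values reported in Tables 5 and 8 correspond to univariate tests over four groups of ten participants each. A sketch of such an analysis in Python is given below; the study itself used SPSS, and the file name and column layout here are assumptions.

```python
import pandas as pd
from scipy.stats import f_oneway

# Hypothetical layout: one row per participant, columns V..G holding
# peak latencies in msec., plus an age-group label (A, B, C or D).
df = pd.read_csv("ffr_da_latencies.csv")

for comp in ["V", "A", "C", "D", "E", "F", "G"]:
    groups = [g[comp].to_numpy() for _, g in df.groupby("group")]
    F, p = f_oneway(*groups)  # df = (4 - 1, 40 - 4) = (3, 36)
    print(f"{comp}: F(3,36) = {F:.2f}, p = {p:.3f}")
```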


To determine whether there was any significant difference across age groups in the latency and amplitude of the transient response and FFR to the steady state portion of the /si/ stimulus, MANOVA was performed separately. The results showed that there was a significant difference among the age groups in the latency of the transient response and FFR; however, there was no significant difference in amplitude. The F ratios and significance values for the latency and amplitude of the transient response and FFR for the /si/ stimulus are tabulated in Table 8.

Table 8. The F (3,36) and significance values of MANOVA across age on latency and amplitude of transient response and FFR for the /si/ stimulus

              Latencies          Amplitudes
Components    F        p         F        p
a1            4.64     0.00*     1.14     0.34
a             5.37     0.00*     1.66     0.19
b             3.72     0.02*     1.45     0.24
c             3.82     0.01*     1.39     0.26
d             4.45     0.00*     1.28     0.29
e             3.45     0.02*     0.78     0.51
f             4.28     0.01*     0.07     0.97

Note: * = significant at 0.01 level.

Figure 3: The FFR for the /da/ stimulus across age groups. The stimulus waveform has been shifted by 19 msec. to compensate for the neural lag and instrument delay in the response. The time-amplitude waveforms illustrate the elements of the stimulus and the corresponding peaks in the response. The latency of the transient response and FFR increases with age, whereas the amplitude shows no systematic trend.

Duncan post-hoc analysis was performed to identify the age groups that accounted for the significant difference in the latency of the transient response and FFR. The results of the Duncan test revealed that Group 'D' was significantly different from the other groups.

Test Retest Reliability of Cochlear Hydrops Analysis Masking Procedure (CHAMP)

JISHA, 26 (2) 14-19

Kanchan Kumari, Jyothi & Sujith Kumar Sinha

The Wilcoxon signed rank test revealed no significant difference between the two recordings (p > 0.05) for clicks alone and for clicks presented with the different masking noises. Table 3 shows the Z values and the level of significance for the Wilcoxon signed rank test.

Table 3. Z value and the level of significance

Parameters           Z value    Level of significance
Click alone          0.65       p > 0.05
Click plus 8 kHz     0.23       p > 0.05
Click plus 4 kHz     0.09       p > 0.05
Click plus 2 kHz     1.39       p > 0.05
Click plus 1 kHz     1.06       p > 0.05
Click plus 500 Hz    2.51       p > 0.05

Error bar graphs were plotted for the click alone and the other high pass masking noise conditions. It can be seen from the error bar graph (Figure 2) that there is no significant difference in the standard error bars for latency between the first trial and the second trial for any of the conditions.
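The trial-to-trial comparison summarized in Table 3 is a paired non-parametric test. A minimal sketch follows; the wave V latency values are illustrative, not the study's data.

```python
import numpy as np
from scipy.stats import wilcoxon

# Hypothetical wave V latencies (msec.) for ten participants, first
# versus second CHAMP recording in one masking condition.
trial1 = np.array([5.8, 5.9, 6.1, 5.7, 6.0, 5.9, 6.2, 5.8, 6.0, 5.9])
trial2 = np.array([5.9, 5.8, 6.1, 5.8, 6.0, 6.0, 6.1, 5.8, 6.1, 5.9])

stat, p = wilcoxon(trial1, trial2)  # paired signed-rank test
print(f"W = {stat}, p = {p:.2f}")   # p > 0.05 here: no reliable change
```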


Figure 2. Error bar graphs of the click alone recording and the CHAMP recordings with different high pass masking noise.

DISCUSSION

The usefulness of any diagnostic test depends upon its sensitivity and specificity. By manipulating certain conditions of a test, one can develop a highly sensitive test in which all patients, regardless of their status, will yield an abnormal result. Similarly, if one uses stringent settings in a test, the test can be made highly specific so that the results in all patients will be normal, regardless of the presence or absence of pathology. Any new test should first be evaluated in a normal patient population that has no known pathology. Normative testing on a normal population allows examiners to understand the conditions and variables that are consistent with normal findings, and also to establish quantifiable ranges of normative values.

CHAMP is a relatively new technique for detecting cochlear hydrops. In the present study, the CHAMP analysis revealed remarkable test-retest reliability across the different high-pass masking noises when recorded from the same individual. Given this stability, it follows that any significant alteration in morphology is likely to reflect changes in neural activation, and not simply random variability. Establishing this level of stability has broad positive implications for research and for clinical assessment whenever Meniere's disease is of interest. However, the same test-retest study should also be carried out in a clinical population to establish the reliability of the test in that population.


Since thresholds fluctuate in individuals with Meniere's disease, it would be interesting to know the properties of the basilar membrane during the fluctuations of the hearing loss. It would also be interesting to record CHAMP after the administration of glycerol (under medical supervision, as it evokes nausea, headache and dizziness), as this would help in understanding cochlear physiology after glycerol administration.

Conclusions

The present study showed remarkable test-retest reliability in normal hearing individuals, and the technique appears valid in normal hearing subjects. However, the test-retest reliability of CHAMP must also be obtained in subjects with Meniere's disease. Although CHAMP showed good test-retest reliability, the standard deviation of the wave V latency obtained for click plus 500 Hz noise was larger. Hence, one may start CHAMP testing directly from click plus 1000 Hz.

Address for Correspondence: Kanchan Kumari, All India Institute of Speech and Hearing, Mysore-570006, Karnataka.



JISHA, 26 (2) 20-29

Acoustic Analysis of Speech Processed Through a Compression Hearing Aid

Srikanth Chundu
South of England Cochlear Implant Centre, United Kingdom

Abstract

Modern digital and non-linear analogue hearing aids are capable of highly complex signal processing designed to optimise the user's auditory perception, particularly of speech. This project involved acoustic analysis of speech signals (VCV syllables) after they were processed through a non-linear hearing aid, in order to determine the effect of non-linear hearing aid processing on acoustic cues useful for speech perception. Four acoustic cues, burst/frication duration, burst/frication amplitude, spectral peak of fricatives, and second formant transition onset frequency, were analysed at four different compression ratios: linear, 2:1, 3:1 and 4:1. An analogue two channel compression hearing aid was used on a KEMAR. The output of the KEMAR was recorded onto a laptop and spectrogram analysis was carried out. Results indicated that burst/frication amplitude decreased with increasing compression ratio, with the lowest amplitude at the 4:1 compression ratio; the other cues did not show any significant changes with increasing compression ratio. The acoustic analysis of hearing aid processed speech was carried out for plosives and fricatives, with the hearing aid set to different compression ratios and an input signal of 70 dB SPL. It was observed that the hearing aid preserved most of the cues analysed: only the burst/frication amplitude was significantly affected, and the other cues were altered minimally.

Key Words: Hearing aids; acoustic analysis of speech; compression hearing aid; spectrogram analysis.

Speech is a highly redundant stimulus for normal hearing individuals. In hearing impaired listeners, however, this redundancy is considerably reduced, as the acoustic cues available to normal hearing listeners may not be usable by the person with hearing impairment. Vowels are more intense than consonants, and consonants have more high frequency energy than vowels; because of this, vowels are easier to perceive than consonants for people with hearing impairment (Hood, 1990). The first formant of a vowel is mostly audible, but to discriminate vowels the second formant is necessary, and it is usually affected by hearing loss. Individuals with mild to moderate sensorineural hearing loss can easily identify the place of articulation of voiced stop consonants once the stimuli are at comfortably high sensation levels (Van Tasell et al., 1982). Fricatives are difficult to perceive for people with high frequency hearing loss owing to difficulty in extracting high frequency acoustic cues (e.g., the spectrum of the noise). This makes perception of /s/ difficult, thus impacting language in the case of "plural, present progressive, possessive, third person singular, and copula -s" (Ameredith, 2005). The other problems faced by the hearing impaired are upward spread of masking (Ferrand, 1997), reduced audibility (Pickett, 1999), poor frequency selectivity, difficulty in resolving spectral differences, and reduced temporal resolution. Place of articulation is the most common perceptual error made by the hearing impaired, with frequent mistakes in identifying syllable-final consonants, voiceless consonants and consonants paired with the vowel /i/ (Dubno et al., 1982). Hearing impaired people depend on temporal cues, i.e., the duration differences between vowels preceding voiced and voiceless stops (Revoile et al., 1982). They rely on the spectral peak frequency for the perception of fricatives and on the second formant transition onset frequency for the perception of plosives (Pickett, 1999). Vowel duration is the main cue used by the hearing impaired to perceive voicing in final fricatives (Revoile et al., 1991), and they rely on burst duration for voicing distinctions of word-initial and inter-vocalic stops (Stevens and Klatt, 1974). The relative amplitude of the burst significantly improves the perception of the place of articulation of both voiceless and voiced stops (Ohde and Stevens, 1983).


The amplification of the release burst for /t/ and /k/ and of the murmur for /d/ and /g/ improves stop voicing perception (Revoile et al., 1987). Revoile et al. (1987) reported that raising the energy of the consonant relative to the vowel, i.e., reducing the vowel-to-consonant intensity ratio, can increase the audibility of a consonant for a hearing impaired person.

Hearing aids compensate for some of these deficits, such as reduced audibility and recruitment. The speech perception of the hearing impaired is degraded by distortion introduced both by the hearing loss and by hearing aids (Van Tasell, 1993). Hearing aids increase the amplitude of the burst more than that of the vowel in the high frequencies, which restricts the relative amplitude between consonant and vowel, significantly affecting perception and causing place of articulation errors (Hedrick and Rice, 2000). The compression system of a hearing aid helps in maintaining the input signal within the dynamic range of a person; this is achieved by changing the amplitude relationships between the input signal and the signal reaching the listener's ear. Compression introduces temporal distortion of the phonemes, and speech perception scores for voiceless consonants, initial consonants and fricatives were superior with linear amplification compared to compression (Hickson et al., 1994). Speech perception was better with linear hearing aids, where there is minimal alteration of the acoustic parameters, than with compression hearing aids (Hickson and Thyer, 2003). Speech perception was affected by increasing compression ratio at 65 dB, and hearing impaired individuals performed more poorly at high presentation levels and high compression ratios (Hornsby and Ricketts, 2001). The temporal distortion of phonemes introduced by compression is not disadvantageous for subjects with reduced temporal resolving capacity; in that case, performance with compression is better than with linear aids (Dreshler, 1989). The benefit from compression decreased with increasing SNR, and more benefit was seen at negative SNRs (Olsen et al., 2005).
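The input-output behaviour of the compression system described above can be illustrated with a simple static gain rule. This is a sketch only: the kneepoint and gain values are illustrative assumptions, not the settings of the study's hearing aid.

```python
def output_level_db(input_db, ratio, kneepoint_db=45.0, gain_db=20.0):
    """Static WDRC input-output rule: linear gain below the kneepoint,
    compressed growth (1/ratio dB per dB) above it."""
    if input_db <= kneepoint_db:
        return input_db + gain_db
    return kneepoint_db + gain_db + (input_db - kneepoint_db) / ratio

# A 20 dB rise in input (50 -> 70 dB SPL) grows the output by 20 dB when
# linear but by only ~6.7 dB at 3:1, shrinking consonant-vowel contrasts.
for ratio in (1.0, 2.0, 3.0, 4.0):  # linear, 2:1, 3:1 and 4:1 as studied
    growth = output_level_db(70.0, ratio) - output_level_db(50.0, ratio)
    print(f"{ratio}:1 -> output grows {growth:.1f} dB")
```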


Multichannel compression systems have an advantage when the signal is presented at low levels; there is not much difference at higher intensity levels, and performance deteriorates in the presence of noise (Jenstad et al., 2005). In hearing aid processed stimuli, F1 was unidentifiable on a spectrograph due to the low frequency roll-off typical of both compression and linear aids: hearing aids attenuate the low frequency gain in order to prevent masking of the speech signal by amplified background noise. There is also a loss of the periodicity characteristic of vowel production, due to amplified background noise and noise generated by the hearing aid circuitry (Stelmachowicz et al., 1995).

A plethora of studies has been carried out using phoneme or syllable identification, mostly based on the perception of subjects. Some studies used two-dimensional acoustic analysis, such as the spectrum, to study the effects of hearing aids on acoustic cues. Various studies recorded the acoustic output in a 2 cc coupler, which is not a true representation of the human ear canal because of differences in compliance and resonance characteristics. Plosives and fricatives are important speech sounds in English, and burst/frication related parameters are very important for their perception: burst/frication amplitude, duration, spectral peak and second formant transition frequency form the most important cues for the perception of plosives and fricatives. The question of how compression amplification affects the acoustic cues of speech has not been resolved entirely, and the potential effects of hearing aid processing on the speech signal need to be explored fully. This led to the formulation of a few specific questions: How well does the hearing aid transmit speech without altering the acoustic cues? How are these acoustic cues altered with variation in compression ratio, time constants and crossover frequency? How do the effects of hearing aid processing differ with the manner and place of articulation? Are these effects the same for all syllables, or do they vary with the manner and place of articulation? The present study cannot adequately address all of these issues in full; however, it can contribute toward their being addressed.


Method

The aim of the study was to acoustically analyze (via spectrograms) the output of a hearing aid to determine the effects of the compression system on acoustic cues important for speech understanding. The second aim was to see how these effects differ with variation in the compression ratio of the hearing aid.

Test material

Four vowel consonant vowel (VCV) syllables, /iki/, /ipi/, /isi/ and /iʃi/, were selected for analysis. Voiceless consonants were selected for this study for two reasons. First, with regard to the acoustic analysis, a consistent rule (the onset of voicing) could be used to define the consonant and vowel boundary. Second, it has been demonstrated, at least in single channel devices, that compression amplification affects voiceless more than voiced consonant perception (Hickson & Bryne, 1995).

Acoustic analysis

Acoustic analysis was conducted using the computerized software Adobe Audition and the Speech Filing System (SFS). Initially, consonant and vowel boundaries were determined by careful examination of the waveform and spectrogram and by listening to the speech signal via headphones. Subjective analysis was done by utilising the information available in the literature for the different cues (Borden et al., 2002). An expert second observer validated the acoustic analysis, and there was good consistency between the analyses performed by the two observers.

For the burst/frication amplitude, the root mean square level of the consonant portion of the syllable was computed as the square root of the mean of the squared amplitudes (average power) of the sampled points within the time segment of each sound. The average RMS level for stops and fricatives was calculated from the burst and frication respectively. This average RMS amplitude was then converted into dB SPL at the level of the eardrum using a correction factor.
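The RMS-to-dB SPL computation just described can be sketched as follows; the segment boundaries and the calibration offset are illustrative assumptions (the study's own correction factor came from the calibration procedure described below).

```python
import numpy as np

def segment_level_db_spl(samples, cal_offset_db):
    """RMS level of a hand-marked burst or frication segment, converted
    to dB SPL at the eardrum using a calibration correction factor."""
    rms = np.sqrt(np.mean(np.square(samples.astype(float))))
    return 20.0 * np.log10(rms) + cal_offset_db

# Placeholder recording at 22060 Hz; the boundary indices stand in for
# the consonant segment marked on the spectrogram.
wave = np.random.randn(22060)
print(segment_level_db_spl(wave[8000:11300], cal_offset_db=95.0))
```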


Fricative and burst duration: the consonant and vowel boundaries were identified by careful examination of the spectrogram and the consonant boundaries were marked; the duration of the burst/frication was then measured for each sound. The cue of second formant transition onset frequency was analysed with the help of cross-sectional spectrograms: the second formant transition was carefully observed and selected. The spectral peak of the fricative was analysed by observing the cross-sectional spectrograms, and the frequency peak with the highest amplitude was selected as the spectral peak of the consonant.

Hearing aid selection

The hearing aid was a Starkey BE39FX dual channel analogue wide dynamic range compression hearing aid. The compression ratio can be changed by the trimmer controls without varying other features such as the time constants, the gain in the low frequency channel, the threshold kneepoint and the crossover frequency. A digital hearing aid was not used, to avoid variables such as noise reduction algorithms.

Recording system and procedure

Recordings were from a single female speaker and were undertaken by the UCL Department of Phonetics and Linguistics. They were stored as digitized sound (.wav) files with 16-bit resolution and a sampling rate of 22060 Hz. The stimuli were routed to a mixer and sent to a loudspeaker inside a double walled soundproof room. The average RMS level of the stimulus emanating from the loudspeaker was 70 dB SPL, as measured without the hearing aid using a sound level meter (Bruel and Kjaer) set on the 'fast' detector setting, located 3 feet from the loudspeaker, near the ear canal of the KEMAR. The unaided stimuli were recorded by a 1-inch pressure microphone located in the ear canal of the KEMAR. The output of the microphone was fed to a preamplifier and then to an amplifier, whose output was fed into the laptop through a DX1 interface. The signal was digitized at a sampling rate of 22060 Hz in the Adobe Audition software on the laptop. Doing the recordings on a KEMAR gives novelty to the study, since the KEMAR, unlike a 2 cc coupler, takes into account the compliance and resonance characteristics of the human ear canal.
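The spectral-peak measure described under Acoustic analysis above, i.e., the frequency of the highest-amplitude component in a cross-sectional spectrum of the frication, can be sketched as below; the window choice and placeholder segment are assumptions.

```python
import numpy as np

def spectral_peak_hz(segment, fs=22060):
    """Frequency (Hz) of the highest-amplitude spectral component of a
    frication segment, akin to reading a cross-sectional spectrogram."""
    windowed = segment * np.hanning(len(segment))
    magnitude = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / fs)
    return freqs[np.argmax(magnitude)]

# For /s/ the peak is expected in the high-frequency noise region.
frication = np.random.randn(3300)  # placeholder for the marked segment
print(spectral_peak_hz(frication))
```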


The aided stimuli were recorded with the same test setup, except that the behind-the-ear hearing aid was coupled to the ear of the KEMAR via the bore of a skeleton shell ear mould with a 1 mm vent. The experimental conditions were unaided, hearing aid in linear mode, and 2:1, 3:1 and 4:1 compression ratio settings. All the aided recordings were made with the same amplifier settings.

Calibration

Speech is a fluctuating signal that varies over time, so a loudness equalization calibration was done to obtain a steady sound pressure level in the free field. In order to ascertain the level of the sound recorded by the computer, a calibration factor was determined to convert the decibel scale on the computer into dB SPL at the eardrum of the KEMAR.

ratios were lower than those in the linear condition, for all the syllables. Higher compression ratios decreased the levels of both vowels and consonants compared to the linear condition. A series of repeated measures analyses of variance revealed a significant effect of compression on the burst amplitude of plosives [F (3, 2) = 1.000; p < 0.05].

Phonological Mean Length of Utterance

JISHA, 26 (2) 69-75

Priyanka Jaisinghani, Akshay M. & N. Sreedevi

Though females scored higher on all the variables (PMLU, PWP and PWC) as compared to males, the difference was not statistically significant (see Table 2).

Table 2. Mean and standard deviation (in parentheses) of PMLU, PWP and PWC across gender in both age groups

             Males                            Females
             2.0-2.6 years   2.6-3.0 years    2.0-2.6 years   2.6-3.0 years
Mean PMLU    5.29 (0.47)     5.87 (0.31)      5.59 (0.36)     6.18 (0.46)
Mean PWP     0.79 (0.70)     0.88 (0.05)      0.84 (0.06)     0.92 (0.07)
Mean PWC     0.28 (0.17)     0.44 (0.15)      0.32 (0.13)     0.64 (0.25)

Discussion

The present study was planned to assess PMLU and its related measures in typically developing native Hindi speaking children aged 2 to 3 years. Results indicated that there was a significant difference in mean PMLU and PWP values across the age groups studied (2-2.6 vs 2.6-3 years). Similar results were reported by Radish and Jayashree (2009) in Kannada. In the present study, children in the age group of 2.6-3 years were able to produce more vowels and consonants correctly, resulting in higher PMLU and PWP values compared to the younger age group. This observation can be attributed to improved articulatory abilities with age due to better neuromuscular maturation. However, no significant difference was seen in PWC values across the age groups. This is possibly because PWC checks for whole word correctness, and the number of segment errors in each word was greater in the lower age group than in the higher age group.


That is, though the higher age group showed better approximation of the target words, whole word correctness was not completely achieved. Also, the younger age group evidenced the phonological process of deaspiration more than the higher age group, which supports the results of several Indian studies (Rahul, 2006, in Hindi; Divya, 2010, in Malayalam; Usha, 2010, in Telugu) reporting that aspirated sounds are acquired towards 3 years of age. It was also found that phonological processes like stopping, affrication, initial consonant deletion and final consonant deletion were manifested more in the lower age group than in the higher age group, which is in consonance with the findings of Rahul (2006) in Hindi. The present study also evidenced a high occurrence of affrication in the younger age group, which was also reported by Usha (2010) in Telugu, who stated that children in this age group are in the period of learning fricatives, which usually follows the learning of affricates. The higher PMLU scores of the female subjects, though not statistically significant, can be attributed to maturational factors, as female children are phonologically more proficient than male children (Anderson & Shames, 2006).

Conclusions

This study was an attempt to investigate PMLU and its related measures, which are whole word measures for assessing phonological development, in Hindi. The results revealed a developmental trend, and hence these measures could be regarded as yardsticks for phonological development. They are easy to apply and, as suggested by several authors in earlier studies, highly valid and reliable, since aspects of phonological acquisition that have been missed due to a focus on segments can be identified. Therefore, whole word measures can be incorporated into treatment plans for children with phonological disorders. However, PMLU needs to be further refined with respect to rules for the correctness of vowels and the complexity of clusters.

Address for Correspondence: Ms. Priyanka Jaisinghani, All India Institute of Speech and Hearing, Manasagangothri, Mysore-570006. E-mail: [email protected]

References

Anderson, & Shames (2006). Human communication disorders: An introduction (7th ed.). Boston: Pearson Education.

Archana, G., John, S., Veena, K. D., Mohite, S., & Rajashekhar, B. (2011). Comparison of PMLU in Kannada speaking Down's syndrome and typically developing children. Language in India, 11, 118-125.

Balasubramanium, R., & Bhat, J. S. (2009). Phonological mean length of utterance (pMLU) in Kannada speaking children. Language in India, 9, 489-502.

Bernhardt, B., & Stemberger, J. (1997). Handbook of phonological development from the perspective of constraint-based nonlinear phonology. New York: Academic Press.

Davidovich, I., Bunta, F., & Ingram, D. (2006). The relationship between the phonological complexity of a bilingual child's words and those of the target languages. International Journal of Bilingualism, 10(1), 71-88.

Divya (2010). Articulatory acquisition in typically developing Malayalam speaking children: 2-3 years. Unpublished Masters dissertation, University of Mysore.

Fee, E. J. (1995). Segments and syllables in early language acquisition. In J. Archibald (Ed.), Phonological acquisition and phonological theory. Hillsdale, NJ: Erlbaum.

Garlant, M. (2001). Early phonological patterns of young Spanish-English bilinguals. Unpublished Masters thesis, Arizona State University.

Grunwell, P. (1985). Cited in Ingram, D. (2002). The measurement of whole word productions. Journal of Child Language, 29, 713-733.

Helin, K. S. (2009). Measuring phonological development: A follow-up study of five children acquiring Finnish. Language and Speech, 52(1), 55-77.


Helin, K. S., Makkonen, T. S., & Kunnari, S. (2006). The phonological mean length of utterance: Methodological challenges from a cross-linguistic perspective. Journal of Child Language, 33, 179-190.

Polite, J., & Leonard, B. (2006). Finite verb morphology and phonological length in the speech of children with specific language impairment. Clinical Linguistics and Phonetics, 20, 751–760.

Hodson, B., & Paden, E. (1991). Targeting intelligible speech: A phonological approach to remediation. Austin, TX: Pro-Ed.

Radish, K. B., & Jayashree, S. B. (2009). Phonological mean length of utterance (PMLU) in Kannada-speaking children. Language in India, 9, 489-502.

Radish, K. B., Jayashree, S. B., & Prasad, N. (2011). Phonological mean length of utterance in children with phonological disorders. Asia Pacific Journal of Speech, Language, and Hearing, 14(2), 103-109.

Ingram, D. (1981). Procedures for phonological analysis of children's language. Baltimore, MD: University Park Press Ingram, D., & Ingram, K.(2001). A whole-word approach to phonological analysis and intervention. Language, Speech and Hearing Services in Schools, 32, 271-283. Ingram, D. (2002). The measurement of whole word productions. Journal of child language, 29, 713-733. Jalieski, K. (2000). Cited in Ingram, D. (2002). The measurement of whole word productions. Journal of child language, 29, 713-733. Kunnari, S. (2000). Characteristics of early lexical and phonological development in children acquiring Finnish. Acta Universitatis Ouluensis. Oulu University Press. Kunnari, S. (2002). Word length in syllables: evidence from early word production in Finnish. First language, 22, 119-135. Masterson, J., & Kamhi, A. (1992). Linguistic interrelationships in school- age children with and without language disorders. Journal of speech and hearing research, 35, 64- 75.

74

Rahul. (2006). Study of phonological processes of 2-3 year old Hindi speaking normal children. Unpublished Masters Dissertation. University of Mysore. Savinainen- Makkonen, T. (2000a). Wordinitial consonant omissions- a developmental process in children learning Finnish. First language, 20, 161-185. Schauwers, R., Taelman, S., Gillis, R., & Govierts,M. (2005). The phonological development in young hearing-impaired children with a cochlear implant. From http://www.ddl.ishlyon.cnrs.fr/ELA2005/ abstracts/Ela2005SCHA_O.pdf. Usha. (2010). Articulatory acquisition in typically developing Telugu speaking children: 2-3 years. Unpublished Masters Dissertation. University of Mysore. Appendix-1

Rules for the calculation of the PMLU (Ingram, 2001)

Sample-Size Rule: Select at least 25 words, and preferably 50, for analysis, depending on sample size. If the sample is larger than 50 words, select a subset of words that covers the entire sample, e.g. every other word in a sample of 100 words.

Lexical-Class Rule: Count words (e.g. common nouns, verbs, adjectives, prepositions and adverbs) that are used in normal conversation between adults. This excludes child words, e.g. mommy, daddy, tata, etc. Counting child words can inflate the PMLU if a child is a reduplicator.

Compound Rule: Do not count compounds as a single word unless they are spelled as a single word, e.g. 'cowboy' but not 'teddy bear'; i.e. 'teddy bear' would be excluded from the count. This rule simplifies decisions about what constitutes a word in the child's sample.

Variability Rule: Count only a single production for each word. If more than one occurs, count the most frequent one; if there is no most frequent production, count the last one produced. Counting variable productions may distort the count if a single word is highly variable.

Production Rule: Count 1 point for each consonant and vowel that occurs in the child's production. Syllabic consonants receive one point, e.g. syllabic 'l', 'r' and 'n'. (Some transcriptions may show these as two segments, i.e. a schwa plus consonant, e.g. 'bottle' [badel], but they should be counted as one consonantal segment.) Do not count more segments than are in the adult word. For example, a child who says 'foot' as [hwut] has two consonants counted, not three. Otherwise, children who add segments would get higher scores despite making errors.

Consonants Correct Rule: Assign 1 additional point for each correct consonant. Correctness of vowels is not counted, since vowel transcriptions are typically of low reliability. Syllabic consonants receive an additional point in the same way as nonsyllabic consonants. A child who applies liquid simplification, for example, will get 1 point for producing a vowel, e.g. 'bottle' [bado], but 2 points if the syllabic consonant is correct.
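Taken together, the Production and Consonants Correct rules reduce to a simple per-word score that is averaged over the sampled words. The Python sketch below is illustrative only and is not from the paper or from Ingram's own materials: it assumes each word is available as a (target, production) pair of phoneme lists, uses a toy vowel inventory, and matches consonants by order of occurrence, which glosses over the alignment judgments a transcriber would make by hand.

# Illustrative sketch (not from the paper) of PMLU scoring per Ingram's rules.
# Assumes each word is a (target, production) pair of phoneme lists; the vowel
# set is a toy inventory and consonants are matched by order of occurrence.

VOWELS = set("aeiou")  # toy vowel inventory for the examples below

def is_consonant(seg):
    return seg[0] not in VOWELS

def word_pmlu(target, production):
    """Points for one word: 1 per produced segment, capped at the target's
    length so added segments never raise the score (Production Rule), plus
    1 per correct consonant (Consonants Correct Rule)."""
    points = min(len(production), len(target))
    target_cons = [s for s in target if is_consonant(s)]
    produced_cons = [s for s in production if is_consonant(s)]
    points += sum(1 for t, p in zip(target_cons, produced_cons) if t == p)
    return points

def pmlu(words):
    """Mean per-word points over the sampled words (25-50 words per the
    Sample-Size Rule)."""
    return sum(word_pmlu(t, p) for t, p in words) / len(words)

def whole_word_proximity(words):
    """Ingram's (2002) related measure PWP: the child's PMLU divided by
    the PMLU the target words themselves would earn."""
    target_points = sum(
        len(t) + sum(is_consonant(s) for s in t) for t, _ in words
    ) / len(words)
    return pmlu(words) / target_points

sample = [
    (list("dog"), list("dog")),  # produced correctly: 3 segments + 2 consonants = 5
    (list("fut"), list("wut")),  # 'foot' as [wut]: 3 segments + 1 correct consonant = 4
]
print(pmlu(sample))                  # (5 + 4) / 2 = 4.5
print(whole_word_proximity(sample))  # 4.5 / 5.0 = 0.9

A PWP value approaching 1.0 indicates closer whole-word approximation of the targets, which is the developmental trend the whole-word measures are designed to capture.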


Comparison of Iconic Memory and Echoic Memory in Children With Learning Disability

Vinaya Ann Koshy, Jyothi Thomas, Theaja Kuriakose & Meghashree

Abstract

The present study compared iconic memory and echoic memory in children with learning disability (LD). A total of 35 subjects participated in the study and were divided into two groups: Group I consisted of 15 children with LD and Group II consisted of 20 normal children. All the subjects were in the age range of 8-12 years. Standardized line-drawn pictures of frequently occurring nouns were taken as stimuli from the Early Language Training Kit (Karanth, 1999). The study consisted of two tasks; the first task assessed iconic memory, while the second task assessed echoic memory. For task one, fifteen slides were made, each containing one picture. The participants were instructed in Kannada as follows: "We will show you some pictures on the computer screen one after the other. At the end you have to name all the pictures which you have seen". A score of '1' was given for each correct verbal response and '0' for an incorrect response. For task two, the names of the nouns used in task one were uttered by a female native speaker of Kannada and recorded using a SONY digital IC recorder (ICD-P320); the recorded sample served as the stimulus. Scoring was identical to that of task one. Results revealed that children with LD performed poorly on both the iconic and the echoic memory task compared to normal children, and that children with LD performed better on task 1 (iconic memory) than on task 2 (echoic memory). From the study it can be concluded that iconic memory is better than echoic memory in children with LD. Hence, presenting stimuli in the visual mode during a management programme may enhance performance.

Key Words: Learning Disability, Iconic memory, Echoic memory.

Memory is an active system that stores, organizes, alters and recovers information (Baddeley, 1996). There are three major processes in memory: encoding, storage and retrieval. During every moment of an organism's life, sensory information is taken in by sensory receptors and processed by the nervous system. Humans have five main senses: sight, hearing, taste, smell and touch. Sensory memory (SM) allows individuals to retain impressions of sensory information after the original stimulus has ceased. Cognitive studies of memory in normal individuals, functional neuroimaging studies and neuropsychological investigations of individuals with memory loss indicate that memory is not a unitary phenomenon (Giovanello & Verfaille, 2001). Rather, it comprises several functional systems, each of which helps in a unique way to encode, store and retrieve information. The generally accepted classification of memory is based on the duration of memory retention and identifies two types of memory: short-term memory (or working memory) and long-term memory.

Short-term memory allows one to recall information for several seconds to as long as a minute without rehearsal, and its capacity is very limited. Short-term memory can take the form of either verbal or nonverbal memory and, based on the type of stimuli, can be further classified into visual short-term memory and auditory short-term memory. Visual memory (iconic memory) involves the ability to store and retrieve previously experienced visual sensations and perceptions when the stimuli that originally evoked them are no longer present (Cusimano, 2010). That is, the student must be capable of forming a vivid visual image of a stimulus, such as a word, in his mind and, once that stimulus is removed, of visualizing or recalling this image without help. Various researchers have stated that as much as eighty percent of all learning takes place through the eye, with visual memory existing as a crucial aspect of learning (Farrald & Schamber, 1973). Auditory memory involves being able to take in information that is presented orally, process it, store it in the mind and then recall what was heard (Cusimano, 2010).


Basically, it involves the tasks of attending, listening, processing, storing and recalling. Cusimano (2010) stated that children who have not developed their visual memory skills cannot readily reproduce a sequence of visual stimuli. They frequently experience difficulty in remembering the overall visual appearance of words or the letter sequence of words for reading and spelling. They may remember the letters of a word but often cannot remember their order, or they may know the initial letter and configuration of the word without having absorbed the details, that is, the subsequent letters of the word. When teachers introduce a new word, they generally write it on the chalkboard and have the children spell it, read it and then use it in a sentence. Students with good visual memory will recognize that same word later in their readers or other texts and will be able to recall the appearance of the word to spell it. Students with visual memory problems often will not. Without well developed visual memory, these students fail to develop a good sight vocabulary and frequently experience serious writing and spelling difficulties.

Learning disability (LD) is a generic term that refers to a heterogeneous group of disorders manifested by significant difficulties in the acquisition and use of listening, speaking, reading, writing, reasoning or mathematical abilities. These disorders are intrinsic to the individual and presumed to be due to central nervous system dysfunction (National Joint Committee on Learning Disabilities, 1980). Children with learning disability face a variety of memory problems. It has been reported that many dyslexics have poor visual sequential memory, i.e. a poor ability to perceive things in sequence and then remember the sequence, which in turn affects their ability to read and spell correctly. Individuals with poor visual memory find it difficult to recall visual images immediately or after a long period of time. A large number of memory studies undertaken with children exhibiting reading deficiencies have shown consistently that these children, relative to their peers without disability, have difficulty with short-term verbal memory tasks. Verbatim, sequential memory appears to be one area of primary deficit.

These children exhibit difficulty on a large number of short-term memory tasks that require recall of letters, digits, words or phrases in exact sequence (Corkin, 1974; Lindgren & Richman, 1984; McKeever & VanDeventer, 1975; Ritchie & Aten, 1976). Students with LD often experience difficulty in developing a good understanding of words and in remembering terms and information that have been presented orally. Poor readers have been reported to perform more poorly than younger typical readers on tasks requiring the recall of serial verbal information, lists of words and multisyllabic names (Bradley & Bryant, 1981; Hulme, 1981; Watson & Willows, 1995). Roy, Paul and Goswami (2009) compared verbal memory span and sequential memory, in both forward and backward order, in children with LD and reported that verbal memory span was better than sequential memory and that forward-ordered tasks were easier than backward-ordered tasks.

The review of literature suggests that children with LD show deficits in both iconic and echoic memory. It has also been proposed that variations in stimuli, i.e. auditory and/or visual stimuli, are important variables that can influence performance in this clinical population. Against this background, determining the better mode of stimulus presentation for children with LD is a pivotal aspect of their management. However, in the Indian scenario, few attempts have been made to study echoic and iconic memory in children with LD, and studies comparing iconic memory and echoic memory in these children are scanty. Hence, the present study attempts to compare echoic and iconic memory in children with LD.

Method

Subjects: A total of 35 subjects took part in the study. All the participants were in the age range of 8-12 years and had Kannada (L1) as their mother tongue and English (L2) as the medium of instruction in their school. The participants were divided into two groups: Group 1 consisted of 15 children with LD (8 males and 7 females) and Group 2 consisted of 20 normal children (10 males and 10 females).


Stimuli: To test iconic memory, 15 line-drawn pictures of frequently occurring nouns were selected from the Early Language Training Kit (Karanth, 1999). Microsoft PowerPoint (2007 version) slides were made with one picture on each slide, and each stimulus was displayed for a duration of 30 seconds, since a minimum of 30 seconds is required to form a visual image. To test echoic memory, the same nouns used to test iconic memory were considered. A female native speaker of Kannada named these 15 pictures with an inter-word interval of 3 seconds, and the naming was recorded using a SONY digital IC recorder (ICD-P320). The stimulus was played from a DELL laptop and presented through headphones.

Procedure: Each participant was seated comfortably in a quiet room in front of the computer screen. The environment was made as distraction-free as possible by removing potential visual distracters, and all the participants were given a verbal explanation of the nature of the test. The test consisted of two tasks: task 1 assessed iconic memory and task 2 assessed echoic memory. For task 1, the participants were instructed in Kannada as follows: "We will be showing you some pictures on the computer screen one after the other. Each picture will be displayed for 30 seconds. At the end you have to name all the pictures which you have seen". A score of '1' was given for each correct verbal response and '0' for an incorrect response. For task 2, the instruction given was: "You will be hearing few names, after listening you have to repeat all the names which you have heard". Scoring was identical to that of task 1. To avoid a familiarity effect, the two tasks were carried out on two different days.

Statistical analyses were done using SPSS software (version 17). Univariate analysis of variance was carried out to compare the difference in performance between normal children and children with LD, and a paired t-test was used to compare the two tasks within each group.
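For readers who want to reproduce this analysis pipeline outside SPSS, the sketch below shows the same two comparisons in Python with scipy. It is a hedged illustration: the score arrays are randomly generated placeholders standing in for the per-child scores out of 15, not the study's raw data, which are not published.

# Hedged sketch of the study's statistics using scipy instead of SPSS.
# The score arrays below are random placeholders, not the study's data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
normal_task1 = rng.integers(10, 16, size=20)  # 20 normal children, iconic task
ld_task1 = rng.integers(7, 13, size=15)       # 15 children with LD, iconic task
ld_task2 = rng.integers(6, 12, size=15)       # same 15 children, echoic task

# Between-group comparison on task 1: a univariate one-way ANOVA (with
# only two groups this is equivalent to an independent-samples t-test).
f_stat, p_between = stats.f_oneway(normal_task1, ld_task1)
print(f"Task 1, normal vs LD: F = {f_stat:.2f}, p = {p_between:.4f}")

# Within-group comparison for the LD group: a paired t-test, since the
# two task scores come from the same children.
t_stat, p_within = stats.ttest_rel(ld_task1, ld_task2)
print(f"LD group, task 1 vs task 2: t = {t_stat:.2f}, p = {p_within:.4f}")

The paired t-test is the appropriate within-group choice here because each child contributes a score to both tasks, so the two samples are not independent.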

Results

The scores obtained for task 1 and task 2 in both groups were subjected to statistical analysis using SPSS (version 17). Mean scores were calculated for task 1 (iconic) and task 2 (echoic) for both normal children and children with LD, and the scores for the two tasks were compared between the groups using univariate ANOVA. The results are as follows.

Table 1: Mean values for task 1 for children with LD and normal children

Group                                Mean     Standard deviation    Sig. value
Normal children                      12.55    1.538                 0.000
Children with learning disability     9.63    1.496

Graph 1: Mean values for task 1 for LD and normal children.

Table 1 and Graph 1 reveal that the mean score for task 1 was better in normal children than in children with LD: 12.55 for normal children as against 9.63 for children with LD. Univariate ANOVA showed a significant difference between normal children and children with LD on task 1 (p < 0.05).

Graph 2: Mean values for task 2 for LD and normal children.

Table 2 and Graph 2 depict the mean scores for task 2 in normal children and children with LD. A similar trend was noticed in task 2, i.e. normal children had a better mean score than children with LD: 12.35 for normal children as against 8.56 for children with LD. Univariate ANOVA revealed a significant difference between normal children and children with LD on task 2 as well (p < 0.05).

Table 4: Mean values for task 1 and task 2 for normal children