SIMULATION OF HEARING LOSS USING COMPRESSIVE GAMMACHIRP AUDITORY FILTERS

Hongmei Hu (1,2), Jinqiu Sang (1), Mark E. Lutman (1) and Stefan Bleeck (1)

(1) Institute of Sound and Vibration Research, Southampton University, UK
(2) School of Mechanical Engineering, Jiangsu University, Zhenjiang, Jiangsu, China

ABSTRACT

Hearing loss simulation (HLS) systems can provide normal-hearing (NH) listeners with demonstrations of the consequences of hearing impairment. An accurate simulation of hearing loss can be a valuable tool for developing signal processing strategies for hearing aids. This paper presents a novel HLS system based on a physiologically motivated compressive gammachirp auditory filter bank. It simulates several aspects of hearing loss, including elevated hearing threshold, loudness recruitment and reduced frequency selectivity. The model was evaluated by speech-in-noise tests. An experiment with normal-hearing and hearing-impaired (HI) listeners showed that the proposed HLS model can mimic typical hearing loss. It is concluded that a physiologically inspired hearing loss model can perform in the same way as phenomenological models, yet has a more fundamental underpinning.

Index Terms: Hearing loss simulation, gammachirp auditory filter, sensorineural hearing loss

1. INTRODUCTION

Hearing loss is a major concern for many people; more than 10% of the world population have some form of hearing impairment. Research to better understand hearing loss has used hearing loss simulation (HLS) systems [1-4]. HLS can provide normal-hearing (NH) listeners with demonstrations of the consequences of hearing impairment. More importantly, an accurate simulation of hearing loss can be a valuable tool for developing signal processing strategies for hearing aids, because NH participants can be used instead of hearing-impaired (HI) participants in preliminary experiments, which is often easier and cheaper. Especially in the initial stage of research on advanced signal processing algorithms designed for HI listeners, such models provide a powerful tool to evaluate algorithms.

Traditional algorithms used in HLS simulate only certain phenomenological aspects of hearing loss and are often based on the Fourier transform. This choice is made for simplicity rather than for physiological reasons.
This paper aims to develop a novel HLS system that is physiologically motivated and based on compressive gammachirp auditory filters to simulate several aspects of hearing loss, including elevated hearing threshold, loudness recruitment, and reduced frequency selectivity.

2. AUDITORY FILTER BANK AND HLS

2.1. Auditory filter banks

Auditory filter banks are arrays of non-uniform band-pass filters that are designed to imitate the frequency resolution of human hearing. Auditory filters used to describe cochlear impulse responses are based on psychoacoustic and physiological measurements using the reverse correlation (revcor) technique [5]. The rounded exponential (roex) filter was suggested by Patterson as an approximation to the auditory filter frequency response derived from psychophysical measurements [6]. A disadvantage of the roex filter, however, is that it has no closed-form time-domain expression. Physiologists, on the other hand, who work with time-domain signals, often use gammatone functions as an analytic expression of the revcor function [7]. The roex and the gammatone filter are similar in some respects: the magnitude response of a fourth-order gammatone filter agrees well with the roex auditory filter [6].

Original versions of the gammatone filter were linear and had a symmetric frequency response. Psychophysical measurements of the auditory filter shape, however, indicate that the filter is approximately symmetric only at low stimulus levels [8-9], and asymmetric at high stimulus levels, with the low-frequency skirt shallower than the high-frequency skirt. These findings are also consistent with physiological observations of basilar membrane motion [6]. Consequently, a level-dependent gammachirp auditory filter with an asymmetric correction of the basic gammatone frequency response was developed by Irino and Patterson [10-11]. This accommodated the level-dependent asymmetry in human masking data [8-9, 12-13].
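For orientation, the time-domain forms usually quoted for the two filters discussed above are sketched below. They are not reproduced from this paper: the notation follows the general form given by Irino and Patterson [10], and the symbols a, b, c, n and the ERB expression are assumptions taken from the wider literature.

```latex
% Gammatone impulse response (t > 0), centre frequency f_c, order n
g_t(t) = a\, t^{\,n-1} \exp\!\bigl(-2\pi b\, \mathrm{ERB}(f_c)\, t\bigr)
         \cos\!\bigl(2\pi f_c t + \phi\bigr)

% Analytic gammachirp: the c ln(t) term adds a frequency glide
g_c(t) = a\, t^{\,n-1} \exp\!\bigl(-2\pi b\, \mathrm{ERB}(f_r)\, t\bigr)
         \cos\!\bigl(2\pi f_r t + c \ln t + \phi\bigr)

% Equivalent rectangular bandwidth in Hz (f in Hz)
\mathrm{ERB}(f) = 24.7 + 0.108\, f
```

With c = 0 the gammachirp reduces to the gammatone, so the chirp parameter c is the term that controls the asymmetry of the magnitude response.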
Subsequently, the analytic gammachirp filter was modified to produce a compressive gammachirp filter, which is a better representation of cochlear observations, encompassing both psychophysical human masking data and physiological revcor data [14-16].

2.2. HLS

Sensorineural hearing loss is the result of damage at the level of the cochlea or along the neural pathways. Various methods have been used to simulate different types and degrees of sensorineural hearing loss. The two most common methods of simulation involve masking noise and spectral filtering. Various processing schemes, such as multi-band dynamic expansion, modulation of a carrier signal pattern and complex modifications of the speech waveform, have been used to simulate different aspects of sensorineural hearing impairment. For example, Moore simulated aspects of hearing loss in NH subjects via digital signal processing [2-3]. This has been applied to loudness recruitment [2] as well as to reduced frequency selectivity [3]. In addition, different techniques for spectral [3] and temporal smearing [4] were applied to account for reduced spectral and temporal resolution. The method proposed in this paper is novel in that it simulates all main characteristics of sensorineural hearing loss in one coherent model: elevated absolute threshold, loudness recruitment, reduced frequency selectivity and compression as a function of input signal level.

This work was supported by the European Commission within the ITN AUDIS, grant agreement number PITN-GA-2008-214699.

978-1-4577-0539-7/11/$26.00 ©2011 IEEE. ICASSP 2011.

3. GAMMACHIRP BASED SENSORINEURAL HLS

3.1. Gammachirp auditory filter

As mentioned in Section 2.1, the gammachirp filter was introduced by Irino and Patterson [10] and subsequently improved [15-16] to describe auditory filters. Irino and Patterson [15-16] demonstrated that the dynamic compressive gammachirp filter provides a good description of observed human auditory filter shapes. Its main advantages over traditional descriptions are that its amplitude spectrum is asymmetric and level dependent. Compressive gammachirp filters are physiologically motivated in the sense that they simulate the main passive and active mechanical processes in the basilar membrane. The filter is therefore a functional model for simulating and describing observed auditory filter shapes and excitation patterns. Gammachirp filter banks that are based on normal listeners' responses are well developed [15-16].

In our model the auditory filter bank is described by 26 compressive gammachirp filters with centre frequencies spaced logarithmically between 100 Hz and 8000 Hz. Figure 1 shows six examples of the filter shapes at different centre frequencies (250, 500, 1000, 2000, 4000 and 6000 Hz) and three different levels (40, 60, 80 dB).

Fig.1. Filter shapes of six examples of compressive gammachirp filters with different centre frequencies (out of 26) at three levels. [Axes: filter gain (dB) against centre frequency (Hz).]

3.2. Gammachirp auditory filter based HLS

The most obvious perceptual consequence of cochlear damage is an elevated threshold in quiet [18]. There are also supra-threshold effects of hearing loss, specifically loudness recruitment, reduced spectral and temporal resolution, and reduced dynamic range.

A complete simulation of hearing loss should incorporate all the important auditory distortions that have been observed among HI people. The described HLS system was realised by combining a gammachirp auditory filter bank with traditional HLS algorithms: the input signal was divided into 26 sub-bands by the gammachirp auditory filter bank, and digital signal processing methods were then applied in each band to realise reduced frequency selectivity, loudness recruitment and increased hearing threshold. Figure 2 shows the detail of the sensorineural hearing loss simulation techniques used in this paper.

Fig.2. Compressive gammachirp based HLS

Fig.3. BKB sentence, “the car engine is running.” Top: original signal, bottom: after HLS. [Panels: amplitude against time (s) and frequency against time (s), for the original signal and the hearing loss simulation signal.]
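The 26-channel analysis stage described in Sections 3.1 and 3.2 can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses a passive (linear) gammachirp impulse response instead of the level-dependent compressive filter, and the function names and the parameter values `n`, `b`, `c` and `dur` are illustrative assumptions. Only the channel count and the 100 Hz to 8000 Hz logarithmic spacing come from the text.

```python
import numpy as np
from scipy.signal import fftconvolve


def erb_hz(fc):
    """Equivalent rectangular bandwidth in Hz (commonly used linear form)."""
    return 24.7 + 0.108 * fc


def gammachirp_ir(fc, fs, n=4, b=1.019, c=-2.0, dur=0.05):
    """Passive gammachirp impulse response, peak-normalised.

    n, b, c and dur are illustrative values, not taken from the paper.
    """
    # Start at t = 1/fs so that log(t) is finite.
    t = np.arange(1, int(dur * fs) + 1) / fs
    envelope = t ** (n - 1) * np.exp(-2 * np.pi * b * erb_hz(fc) * t)
    ir = envelope * np.cos(2 * np.pi * fc * t + c * np.log(t))
    return ir / np.max(np.abs(ir))


def analyse(signal, fs, n_channels=26, f_lo=100.0, f_hi=8000.0):
    """Split `signal` into sub-bands with log-spaced centre frequencies."""
    cfs = np.geomspace(f_lo, f_hi, n_channels)
    bands = np.stack([
        fftconvolve(signal, gammachirp_ir(fc, fs), mode="full")[:len(signal)]
        for fc in cfs
    ])
    return cfs, bands
```

The sampling rate passed to `analyse` should be well above 16 kHz so that the highest centre frequency stays below the Nyquist frequency. A level-dependent version would additionally adjust the filter shape in each channel according to the estimated signal level, which this sketch omits.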

The signal processing steps in Figure 2 were designed to produce the same excitation pattern in a normal ear as an impaired ear would produce with unprocessed signals. The envelope and the phase information in each channel were separated using the Hilbert transform. Spectral smearing of the test signals was applied in each channel, based on a modification of Ref. [2], to simulate reduced frequency selectivity, loudness recruitment and elevated threshold.
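The envelope and phase separation mentioned above can be sketched as follows. The per-channel processing is not specified in detail in the text, so the power-law envelope expansion shown here is only an assumption, in the spirit of the recruitment simulation of Ref. [2]; the function names and the exponent value are hypothetical.

```python
import numpy as np
from scipy.signal import hilbert


def split_envelope_phase(band):
    """Return the Hilbert envelope and instantaneous phase of one channel."""
    analytic = hilbert(band)
    return np.abs(analytic), np.angle(analytic)


def expand_envelope(envelope, power=2.0, ref=None):
    """Power-law expansion of the envelope (assumed recruitment model).

    Values at `ref` are unchanged; lower values are attenuated, which
    steepens loudness growth. `power` is a hypothetical parameter.
    """
    if ref is None:
        ref = np.max(envelope)
    if ref == 0:
        return envelope.copy()
    return ref * (envelope / ref) ** power


def recombine(envelope, phase):
    """Rebuild a real channel signal from envelope and phase."""
    return envelope * np.cos(phase)
```

With `power=1.0` the channel signal is reconstructed unchanged, which gives a simple check that the envelope and phase split is lossless before any impairment is applied.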


4. EVALUATION OF THE HLS SYSTEM

Sensorineural hearing loss leads to significant degradation of speech intelligibility [19]. Speech intelligibility is often used as a criterion to evaluate sound perception. For example, Moore and Glasberg simulated threshold elevation combined with loudness recruitment to examine the intelligibility of speech in quiet and speech with different background noises [3]. In this paper, a series of experiments is described that were performed to evaluate the HLS model in a similar way. Various listening conditions were used to estimate the validity of the simulation. The experiments consisted of two parts: measuring the rate of correctly identified keywords, and an open questionnaire in which the participants were asked to describe the sound quality of the processed and unprocessed speech.

4.1. Experimental evaluation of HLS

To evaluate the HLS model, speech-in-noise listening tests were performed. Both the HI and NH participants were trained with clean speech before the formal test to familiarise them with the test procedure. In the formal test, HI participants were asked to listen to speech-in-noise materials, while NH subjects were asked to listen to the same speech-in-noise materials processed with the HLS model. The speech-in-noise materials were prepared at three signal-to-noise ratios (5, 2 and 0 dB SNR) and with two background noises (babble noise and speech-shaped noise). Correct keyword recognition rates were calculated.

Six NH and two HI paid volunteers with no previous experience of the BKB sentence lists (3 males, 3 females) participated in these experiments. The parameters in the HLS model were chosen to simulate a mild hearing loss. The HI subjects had a mild sensorineural hearing loss with pure tone thresholds between 20 dB and 40 dB across all frequencies. The BKB sentence lists are standard British speech materials comprising 21 lists, each containing 50 keywords in 16 sentences.
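Preparing a sentence at one of the stated SNRs can be sketched as below. The paper does not say how the mixtures were generated, so this is an assumption: the noise is scaled so that the ratio of mean speech power to mean noise power matches the target, and the function name `mix_at_snr` is hypothetical.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech` at the requested SNR (in dB).

    Returns the mixture and the gain applied to the noise.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    if p_noise == 0:
        raise ValueError("noise has zero power")
    # Choose the gain so that p_speech / (gain**2 * p_noise) = 10**(snr_db / 10).
    gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + gain * noise, gain
```

Calling `mix_at_snr(speech, babble, 5)`, then with 2 and 0, would give the three SNR conditions for one noise type. Power here is averaged over the whole sentence, including pauses, which is one of several possible conventions.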
To avoid memorisation of the keywords, each subject listened to different, randomly selected sentence lists in the different conditions. Three lists were used for practice. The remaining 18 lists were randomly split into the six conditions (3 SNRs × 2 noises), so each subject listened to three sentence lists in each condition. The average of the three list scores in the same condition was recorded as the ‘percent correct’ for each subject in each condition. The listeners sat in a sound-isolated room in front of a computer screen and listened to the sentences through earphones (Telephonics TDH-49P). The presentation level for each participant was 50 dB. Afterwards, the NH participants were asked to fill in a questionnaire to describe the sound quality subjectively in their own words.

4.2. Experimental results

Figure 4 shows the results of the experiment. The general trend that the percent correct recognition rate goes down with reduced SNR is expected and well described. The main result of this paper is that there is a close correlation between the results of the HI subjects and those of the NH participants listening to the simulation. At 5 dB SNR, the average percent correct scores of both groups are quite similar. At 2 dB SNR and 0 dB SNR, the average percent correct score in the HI group is slightly worse than in the NH group. However, the dispersion of the results overlaps, suggesting no significant difference between the two groups.

Performance in the babble noise condition varied within both groups of listeners. Babble noise might represent a more natural background, for example in social situations (cocktail party effect), and listeners are therefore more used to babble noise. That would explain why the dispersion in the babble noise condition is larger than in the speech-shaped noise condition.
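The list allocation and scoring procedure of Section 4.1 can be sketched as follows. The counts (21 lists, 3 practice lists, 6 conditions, 50 keywords per list) come from the text; the random choice of practice lists, the function names and the seed argument are assumptions made for illustration.

```python
import random
from itertools import product
from statistics import mean

SNRS_DB = (5, 2, 0)
NOISES = ("babble", "speech-shaped")
N_LISTS, N_PRACTICE, KEYWORDS_PER_LIST = 21, 3, 50


def allocate_lists(seed=None):
    """Randomly assign BKB lists to practice and to the six test conditions."""
    rng = random.Random(seed)
    lists = list(range(1, N_LISTS + 1))
    rng.shuffle(lists)
    practice, test = lists[:N_PRACTICE], lists[N_PRACTICE:]
    conditions = list(product(SNRS_DB, NOISES))
    per_condition = len(test) // len(conditions)  # 18 lists / 6 conditions = 3
    plan = {
        cond: test[i * per_condition:(i + 1) * per_condition]
        for i, cond in enumerate(conditions)
    }
    return practice, plan


def percent_correct(keywords_correct):
    """Average percent correct over the lists heard in one condition."""
    return mean(100.0 * k / KEYWORDS_PER_LIST for k in keywords_correct)
```

Calling `allocate_lists` once per subject gives each listener a different list-to-condition mapping, which is the property the design relies on to avoid memorisation.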


Figure 3 shows the original and the processed Bamford-Kowal-Bench (BKB) sentence “the car engine’s running”. It can be seen that the spectral detail of the simulated degraded signal is blurred and the onsets of the consonants are weakened, mainly because of the frequency-domain smearing. Furthermore, the processed signal has less energy as a result of loudness recruitment and elevated threshold.

Fig.4. Percent correct scores at three signal-to-noise ratios (5, 2 and 0 dB SNR) and for two types of noise (babble and speech-shaped) for both HI and NH subjects. NH scores show the average and the standard deviation; HI scores show the average and the range. [Axes: percent correct against speech-to-background ratio (dB). Legend: HI (babble noise), NH (babble noise), HI (speech-shaped noise), NH (speech-shaped noise).]

We conclude from these results that, at least with respect to speech intelligibility, the presented HLS system is successful: the intelligibility of processed speech signals for NH people was similar to the intelligibility of unprocessed original sounds for HI people. Unsurprisingly, NH participants judged in the questionnaires that the original speech in noise was of better quality than the degraded speech. Furthermore, they often reported, especially at low SNR, that the simulated speech had a ‘hoarse’ quality but remained intelligible.

5. CONCLUSIONS

This paper proposed a novel HLS system that combines the physiologically motivated compressive gammachirp auditory filter bank with traditional signal processing algorithms to simulate supra-threshold loss of hearing ability. The purpose of the project was to build a versatile and inexpensive HLS system that is simple to operate and maintain, yet has strong theoretical foundations. It is shown that such a system can simulate various perceptual aspects of sensorineural hearing loss, including reduced frequency selectivity, increased hearing thresholds and loudness recruitment. Furthermore, the compressive gammachirp-based HLS system can simulate the compressive, level-dependent characteristic of the basilar membrane. It can therefore be modified to simulate the reduced compression of individual HI people. This aspect, to our knowledge, is novel. In future we aim to combine individually adjusted HI gammachirp filters with traditional HLS algorithms to realise a novel HLS, which may improve simulation of the reduced compression of HI ears.

6. ACKNOWLEDGEMENTS

This work is supported by the European Commission within the ITN AUDIS. The authors want to thank Toshio Irino and Masashi Unoki for their code for the gammachirp; they wish to thank Roy D. Patterson and Shouyan Wang for their helpful comments on hearing loss simulation. Lastly, the authors thank all the reviewers for their valuable suggestions on the final modification.

7. REFERENCES

[1] E. Villchur, “Simulation of the Effect of Recruitment on Loudness Relationships in Speech,” JASA, vol. 56, no. 5, pp. 1601-1611, 1974.
[2] B. C. J. Moore, and B. R. Glasberg, “Simulation of the Effects of Loudness Recruitment and Threshold Elevation on the Intelligibility of Speech in Quiet and in a Background of Speech,” JASA, vol. 94, no. 4, pp. 2050-2062, Oct, 1993.
[3] Y. Nejime, and B. C. J. Moore, “Simulation of the Effect of Threshold Elevation and Loudness Recruitment Combined with Reduced Frequency Selectivity on the Intelligibility of Speech in Noise,” JASA, vol. 102, no. 1, pp. 603-615, Jul, 1997.
[4] R. Drullman, J. M. Festen, and R. Plomp, “Effect of Temporal Envelope Smearing on Speech Reception,” JASA, vol. 95, no. 2, pp. 1053-1064, 1994.
[5] E. De Boer, and P. Kuyper, “Triggered Correlation,” Biomedical Engineering, IEEE Transactions on, vol. BME-15, no. 3, pp. 169-179, 1968.
[6] R. D. Patterson, and B. C. J. Moore, “Auditory Filters and Excitation Patterns as Representations of Frequency Resolution,” Frequency Selectivity in Hearing, B. C. J. Moore, ed., pp. 123-177, London: Academic Press, 1986.


[7] P. I. M. Johannesma, “The Pre-Response Stimulus Ensemble of Neurons in the Cochlear Nucleus,” in IPO Symposium on Hearing Theory, B. L. Cardozo, E. de Boer, and R. Plomp, Eds., pp. 58-69, Eindhoven, The Netherlands, 1972.
[8] R. A. Lutfi, and R. D. Patterson, “On the Growth of Masking Asymmetry with Stimulus Intensity,” JASA, vol. 76, no. 3, pp. 739-745, 1984.
[9] S. Rosen, and R. J. Baker, “Characterising Auditory Filter Nonlinearity,” Hearing Research, vol. 73, no. 2, pp. 231-243, 1994.
[10] T. Irino, and R. D. Patterson, “A Time-Domain, Level-Dependent Auditory Filter: The Gammachirp,” JASA, vol. 101, no. 1, pp. 412-419, 1997.
[11] T. Irino, and R. D. Patterson, “A Gammachirp Framework for Auditory Filtering: Unification of Cochlear Frequency-Glide Data and Psychoacoustical Masking Data,” in 12th International Symposium on Hearing, Mierlo, The Netherlands, 2000.
[12] B. C. J. Moore, and B. R. Glasberg, “Formulas Describing Frequency-Selectivity as a Function of Frequency and Level, and Their Use in Calculating Excitation Patterns,” Hearing Research, vol. 28, no. 2-3, pp. 209-225, 1987.
[13] B. C. J. Moore, R. W. Peters, and B. R. Glasberg, “Auditory Filter Shapes at Low Center Frequencies,” JASA, vol. 88, no. 1, pp. 132-140, 1990.
[14] T. Irino, and R. D. Patterson, “A Compressive Gammachirp Auditory Filter for Both Physiological and Psychophysical Data,” JASA, vol. 109, no. 5, pp. 2008-2022, 2001.
[15] T. Irino, and R. D. Patterson, “A Dynamic Compressive Gammachirp Auditory Filterbank,” Audio, Speech, and Language Processing, IEEE Transactions on, vol. 14, no. 6, pp. 2222-2232, 2006.
[16] R. D. Patterson, M. Unoki, and T. Irino, “Extending the Domain of Center Frequencies for the Compressive Gammachirp Auditory Filter,” JASA, vol. 114, no. 3, pp. 1529-1542, 2003.
[17] M. Unoki, T. Irino, B. Glasberg et al., “Comparison of the Roex and Gammachirp Filters as Representations of the Auditory Filter,” JASA, vol. 120, no. 3, pp. 1474-1492, 2006.
[18] J. Chalupper, and H. Fastl, “Simulation of Hearing Impairment Based on the Fourier Time Transformation,” in ICASSP, 2000 IEEE International Conference on, pp. II857-II860, 2000.
[19] D.-O. Chung, W. Doh, D.-H. Youn et al., “Hearing Impairment Simulation for the Performance Evaluation of Hearing Aid System,” in Proceedings of the 18th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 415-416, 1996.
