Wavelet-Based Speech Enhancement for Hearing Aids


November 1999 Technical Report No: UMA-DAC-99/11

Published in: European Medical & Biological Engineering Conf. (EMBEC’99) Vienna, Austria, November 4-7, 1999

University of Malaga Department of Computer Architecture C. Tecnologico • PO Box 4114 • E-29080 Malaga • Spain

WAVELET-BASED SPEECH ENHANCEMENT FOR HEARING AIDS

Maria A. Trenas*, Janet C. Rutledge**, Nathaniel A. Whitmal, III***
* Dept. of Computer Architecture, Univ. of Málaga, Málaga, Spain
** Dept. of Otolaryngology, University of Maryland, Baltimore, MD 21201, USA
*** School of CTI, DePaul University, Chicago, IL 60604, USA
e-mail: [email protected]

Introduction

Patients with sensorineural hearing losses generally experience a high-frequency loss, resulting in a reduced dynamic range of hearing. In addition, many listeners experience reduced spectral resolution related to the phenomenon of upward spread of masking. Speech discrimination is therefore adversely affected. In a listener who suffers from recruitment of loudness, perceived loudness grows more rapidly with increasing sound intensity than it does in the normal ear. Thus, for sensorineural hearing losses with severely restricted dynamic ranges, linear processing has limitations. The amplitude compression approach makes fast adjustments to the gain of a speech signal, amplifying low-level components without letting high-amplitude components exceed the threshold of discomfort. Hearing aids must also cope with background noise, a major problem: noise not only masks consonants, but its amplification is distracting and often painful. Many noise-reduction techniques available today require a separate source input for the noise signal, which is not always practical in hearing aids. We use a single-microphone noise-reduction technique. This wavelet-based denoising technique can also serve as a preprocessor for frequency-dependent compensation of hearing impairments.
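The denoising stage used in this work is based on local discriminant bases and the MDL criterion, described later. As a simplified, hypothetical illustration of single-microphone wavelet denoising, the sketch below instead applies soft thresholding to Haar wavelet detail coefficients; the function names and the universal-threshold choice are illustrative assumptions, not the authors' algorithm:

```python
import math

def haar_step(x):
    """One Haar analysis step: orthonormal averages (a) and differences (d)."""
    a = [(x[2*i] + x[2*i + 1]) / math.sqrt(2) for i in range(len(x) // 2)]
    d = [(x[2*i] - x[2*i + 1]) / math.sqrt(2) for i in range(len(x) // 2)]
    return a, d

def haar_inverse_step(a, d):
    """Invert one Haar step (exact, since the transform is orthonormal)."""
    x = []
    for ai, di in zip(a, d):
        x.append((ai + di) / math.sqrt(2))
        x.append((ai - di) / math.sqrt(2))
    return x

def soft(c, t):
    """Soft thresholding: shrink a coefficient toward zero by t."""
    return math.copysign(max(abs(c) - t, 0.0), c)

def denoise(x, levels, sigma):
    """Threshold Haar detail coefficients with the universal threshold
    t = sigma * sqrt(2 ln N), then reconstruct the signal."""
    t = sigma * math.sqrt(2.0 * math.log(len(x)))
    a, details = list(x), []
    for _ in range(levels):
        a, d = haar_step(a)
        details.append([soft(c, t) for c in d])
    for d in reversed(details):
        a = haar_inverse_step(a, d)
    return a
```

With sigma = 0 the transform is inverted exactly; with sigma > 0, small detail coefficients (presumed noise) are zeroed while large ones (presumed speech) are retained.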

The gain is chosen such that the ratio of log intensity above hearing threshold to dynamic range of hearing is the same for the hearing-impaired listener as the corresponding ratio is for the normal-hearing listener (see figure 1):

    δ*/Δ* = δ/Δ

From this relationship, the compressed value of a coefficient can be derived as follows:

    C*_mn = T_im + (C_mn − T_nor) · (Δ*/Δ)

where C*_mn and C_mn are, respectively, the compensated and non-compensated wavelet coefficients.
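The gain equation above can be sketched as a minimal dB-domain mapping, applied to each coefficient's log intensity. The function name and the example threshold values below are illustrative assumptions; the threshold of pain T_pain is taken to be the same for both listeners, as in figure 1:

```python
def compress_db(C_db, T_nor, T_im, T_pain=120.0):
    """Map a coefficient level C (dB) above the normal threshold T_nor
    into the impaired dynamic range [T_im, T_pain]:
        C* = T_im + (C - T_nor) * (Delta*/Delta),
    where Delta  = T_pain - T_nor is the normal dynamic range and
          Delta* = T_pain - T_im  is the impaired dynamic range."""
    ratio = (T_pain - T_im) / (T_pain - T_nor)  # Delta*/Delta
    return T_im + (C_db - T_nor) * ratio

# A level at the normal threshold maps to the impaired threshold,
# and a level at the pain threshold stays at the pain threshold:
print(compress_db(20.0, T_nor=20.0, T_im=60.0))   # 60.0
print(compress_db(120.0, T_nor=20.0, T_im=60.0))  # 120.0
```

Intermediate levels are scaled linearly in the log domain, which compresses the normal dynamic range into the listener's residual one.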


Abstract: Several wavelet-based methods have been applied to compensating the speech signal to improve intelligibility for a common hearing impairment known as recruitment of loudness, a sensorineural hearing loss of cochlear origin. The most complete method performs both denoising and amplitude compression, using the same wavelet coefficients for both stages.

[Figure 1 plots intensity (dB SPL) against frequency (Hz), showing the threshold of pain for both listeners (T_pain), the hearing-impaired threshold of hearing (T_im) with the standard WT and with a fixed-tree WP, the normal threshold of hearing (T_nor), and the quantities δ, Δ, C and δ*, Δ*, C* used in the gain computation.]

Materials and Methods

Figure 1: Parameters in compression gain computation

For the purposes of building compensation systems, it has been shown that recruitment of loudness can be modeled as an expander followed by an attenuator. The objective of compression methods is to invert this expander/attenuator model. In wavelet-based compression, a gain is calculated for each wavelet coefficient in each frequency band, as given by the gain equation above.

In previous work, the use of a standard wavelet basis for amplitude compression showed performance similar to that of traditional techniques [1]. In this work, new methods using wavelet packets (WP) have been implemented. WP with a fixed decomposition tree: the tree depends on the patient, as we calculate the compensation gains

more carefully in regions where the threshold of hearing changes rapidly. For example, if the patient has the hearing levels shown in figure 2, we could use a decomposition tree with more terminal nodes in the frequency range 1000-4000 Hz.

[Figure 2 plots hearing threshold (dB HL) against frequency (125-8000 Hz), with severity regions labeled Normal, Mild, Moderate, Moderately Severe, and Severe.]

Figure 2: Example of thresholds of hearing with high-frequency losses

In figure 1 we compare the use of a fixed WP tree with terminal nodes [(2,0),(3,2),(3,3),(3,4),(3,5),(2,3)] with the use of the standard WT tree (see figure 3). Note that with the WP tree we use a better approximation to the patient's thresholds of hearing when computing the compression gain.


Figure 3: (a) Standard WT tree, (b) the example's WP tree.

WP with best-tree searching: a best basis is searched for on a frame-by-frame basis, which gives a better match to the input speech waveform. This last algorithm has also been implemented with a preprocessing stage for denoising. The denoising algorithm uses local discriminant bases (LDBs), selecting as the best basis the one that maximizes discrimination between the "signal+noise" and "noise" classes. It then selects the coefficients assumed to correspond to the original clean speech according to the MDL criterion (see [2] for references), an information-theoretic measure. The resulting wavelet coefficients are then used for the compression stage.
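A fixed-tree wavelet-packet decomposition like the one above can be sketched as follows. This is a hypothetical illustration using a Haar filter for brevity (the actual filters are not specified here); nodes are numbered (level, position) in the natural low-pass/high-pass ordering, and all names are illustrative:

```python
import math

def haar_split(x):
    """Split a band into low-pass (a) and high-pass (d) halves."""
    a = [(x[2*i] + x[2*i + 1]) / math.sqrt(2) for i in range(len(x) // 2)]
    d = [(x[2*i] - x[2*i + 1]) / math.sqrt(2) for i in range(len(x) // 2)]
    return a, d

def wp_fixed_tree(x, terminal):
    """Decompose x recursively, stopping at the given terminal
    (level, index) nodes and returning one coefficient band per node."""
    out = {}
    def recurse(sig, level, idx):
        if (level, idx) in terminal:
            out[(level, idx)] = sig
            return
        a, d = haar_split(sig)
        recurse(a, level + 1, 2 * idx)      # low-pass child
        recurse(d, level + 1, 2 * idx + 1)  # high-pass child
    recurse(list(x), 0, 0)
    return out

# The example tree from the text:
tree = {(2, 0), (3, 2), (3, 3), (3, 4), (3, 5), (2, 3)}
bands = wp_fixed_tree(range(16), tree)
print(sorted(bands))  # the six terminal nodes
```

Deeper nodes (level 3) yield narrower bands, so the tree refines the analysis exactly where the audiogram changes fastest; being orthonormal, the decomposition preserves signal energy across the bands.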

Results

Preliminary results, including measurements of the Articulation Index (AI), showed some benefit from the new wavelet compression methods over the use of a standard wavelet transform. The listening tests were conducted with normal-hearing subjects using noise masking to simulate hearing impairment. Objective evaluation of the WP with a fixed decomposition tree showed no difference in the AI values, but listening tests showed that, through a suitable tree selection, intelligibility and perceived quality could be improved in many cases. The best performance, however, came from the wavelet approach to the combined denoising-plus-compression problem. After all, even the most sophisticated signal-processing hearing aid can be defeated by the presence of background noise. The algorithm applied in the denoising stage has already been shown (see [2]) to be well suited to speech processing: it improves not only the speech-to-noise ratio but also retains consonants better than other conventional approaches, and thus improves both perceived quality and intelligibility. The compression stage, using the same best basis computed during denoising, always performed better than the other wavelet approaches. In this case the advantage was reflected not only in the subjective tests, but also in the AI measurements (an improvement of 82%, versus the 78% obtained with the standard wavelet approach, for example).

Conclusions

Several wavelet-based methods for amplitude compression have been studied. The most complete one performs both denoising and amplitude compression. This wavelet approach to the combined problem of noise reduction and amplitude compression offers high quality and flexibility, since the parameters can be modified to fit the individual hearing loss and the characteristics of the noise.
The compression gains are calculated from the impaired listener's thresholds of hearing and discomfort, and adapt to the characteristics of the incoming speech signal. Before compression, the noise-reduction algorithm decides whether the wavelet coefficients in a given band are likely to come from the speech signal or from noise, and reduces them accordingly. We are currently studying hardware architectures on which to implement the above algorithms.

References

[1] L.A. Drake, J.C. Rutledge and J. Cohen, "Wavelet Analysis in Recruitment of Loudness Compensation," IEEE Trans. Signal Processing, vol. 41, pp. 3306-3312, 1993.
[2] N.A. Whitmal, J.C. Rutledge and J. Cohen, "Reducing Correlated Noise in Digital Hearing Aids," IEEE Engineering in Medicine and Biology, pp. 88-96, Sept/Oct 1996.