TITLE: Intelligibility Enhancement of Neutral Speech based on Lombard Effect Modification with Application to Cochlear Implant Users
AUTHORS: Jaewook Lee, Hussnain Ali, John H. L. Hansen
AFFILIATION: Center for Robust Speech Systems – Cochlear Implant Lab (CRSS-CIL), Department of Electrical Engineering, The University of Texas at Dallas, Richardson, TX, US
ABSTRACT (no longer than 400 words):
Background While most cochlear implant (CI) users perform well in quiet, their speech intelligibility degrades significantly in the presence of noise. To maintain communication quality in noise, speakers employ the “Lombard effect” in their speech production. The Lombard effect is considered a type of stressed speech induced by noise. Normal hearing (NH) individuals adjust their speech production parameters to convey speech information more robustly in adverse listening environments. Recent CI research (Lee et al., 2015) has suggested that the Lombard effect is also present in the speech of postlingually deaf CI users. In that study, speakers altered their vocal effort, including fundamental frequency, vocal intensity, glottal spectral tilt, and formant characteristics, in response to noisy environments. Motivated by this finding, the present study focused on developing a Lombard effect based speech enhancement algorithm that is compatible with CI users. This study also investigated how CI users perceive the algorithmically modified Lombard speech in challenging listening conditions.
Methods To develop an effective modification scheme, a previously proposed framework based on Source Generator theory (Hansen, 1994) was employed. This theory presumes that speech produced in noisy environments can be obtained by transforming the neutral speech parameters. Based on this assumption, speech parameter variations between neutral and Lombard conditions were modeled. The modification areas considered here were (1) voice intensity, (2) overall spectral contour, and (3) sentence duration. The model for each parameter was trained on the variations across a number of speakers using the UT-Scope database (Ikeno et al., 2007). Modification transformations were then calculated based on differences from neutral speaking conditions. Finally, the transformations were used to modify the speaking style of input neutral speech and thereby generate synthetic Lombard speech output.
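As a rough illustration of the three modification areas named above, the following minimal Python sketch applies an intensity gain, a spectral tilt change, and a duration stretch to a neutral signal. It is not the algorithm of this study. The function names, the first-order pre-emphasis filter, the naive overlap-add stretch, and all parameter values (`gain_db`, `alpha`, `factor`) are assumptions chosen for illustration; in the study, the transformations are derived from models trained on the UT-Scope database.

```python
import numpy as np


def apply_gain(x, gain_db):
    """Intensity modification: scale the signal by a gain given in dB."""
    return np.asarray(x, dtype=float) * 10.0 ** (gain_db / 20.0)


def tilt_spectrum(x, alpha):
    """Spectral contour modification (assumed form): first-order pre-emphasis
    y[n] = x[n] - alpha * x[n-1], which raises high frequencies relative to low."""
    x = np.asarray(x, dtype=float)
    y = x.copy()
    y[1:] = x[1:] - alpha * x[:-1]
    return y


def stretch_ola(x, factor, frame=512, hop=128):
    """Duration modification (assumed form): naive overlap-add time stretch.
    Frames are read every `hop` samples and written every `hop * factor` samples,
    so the output is roughly `factor` times as long as the input."""
    x = np.asarray(x, dtype=float)
    if len(x) < frame:
        raise ValueError("signal must contain at least one full frame")
    window = np.hanning(frame)
    syn_hop = int(round(hop * factor))
    n_frames = 1 + (len(x) - frame) // hop
    out_len = (n_frames - 1) * syn_hop + frame
    out = np.zeros(out_len)
    norm = np.zeros(out_len)
    for i in range(n_frames):
        a = i * hop        # read position in the input
        s = i * syn_hop    # write position in the output
        out[s:s + frame] += x[a:a + frame] * window
        norm[s:s + frame] += window
    norm[norm < 1e-8] = 1.0  # avoid division by zero at the window edges
    return out / norm


def lombard_like_modify(x, gain_db=6.0, alpha=0.9, factor=1.2):
    """Chain the three steps. The default values are placeholders, not the
    trained transformations described in the abstract."""
    y = tilt_spectrum(x, alpha)
    y = stretch_ola(y, factor)
    return apply_gain(y, gain_db)
```

For a one-second signal sampled at 16 kHz, `lombard_like_modify(x)` returns a louder, high-frequency-emphasized signal that is about 20% longer than the input. The naive overlap-add stretch introduces audible artifacts; a pitch-synchronous or phase-vocoder method would be a more realistic choice in practice.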
Results Acoustic characteristics of the original and modified speech sentences were analyzed. The proposed modification algorithm amplified the high-frequency region of the input speech signal, which is more robust against noise than the low-frequency region. The modification of neutral speech also time-stretched the input sentence, which gives listeners more opportunity to hear the speech signal. A subjective listening evaluation will be performed with CI users to demonstrate the effectiveness of the proposed speech modification algorithm.
Conclusion A new speech modification criterion based on Lombard speech characteristics was proposed. The results are encouraging and suggest that the Lombard based speech modification scheme has the potential to provide perceptual benefit to CI users in noisy environments.
Fund Research supported by NIDCD/NIH R01 DC010494-01A