Introduction Distortion Modeling Experimental Setup

A PARAMETRIC APPROACH FOR CLASSIFICATION OF DISTORTIONS IN PATHOLOGICAL VOICES Amir Hossein Poorjam 1, Max A. Little 2,3, Jesper Rindom Jensen 1, Mads Græsbøll Christensen 1 1

2

Audio Analysis Lab, CREATE, Aalborg University, DK

1

 Information

on the type of distortion corrupting a signal can be used to inform the choice of appropriate enhancement algorithms.  Most existing methods focused on detecting a single and specific type of distortion in a signal.  In [1], we proposed a method to classify four major types of distortion in vowels directly from MFCCs extracted from speech signals.  Limitations of [1]:  MFCCs encode not only distortion in signals, but also other variability (speaker, articulation and disorder).  Distortion classification decision is made by majority vote over all frames, and the computation time increases with increasing signal length.  In this paper, distortion in variable duration recordings is modeled with a fixed-length, low-dimensional vector.

Distortion Modeling





[email protected]

channel factor 𝒙𝑠,𝑟 ~𝑁 𝝁𝑠,𝑟 , 𝚲𝑠,𝑟 and the channel subspace 𝑼 are estimated by applying an EM algorithm [2].

 In the E-step, using a random initialization of 𝑼, the posterior distribution

Method:  Fitting a Gaussian mixture model (GMM) to the features of a recording.  Assuming that the GMM mean supervector of the rth recording from the sth speaker can be decomposed as:

𝝁𝑠,𝑟 = 𝐸 𝒙𝑠,𝑟 = 𝑰 + 𝑼𝑇 𝚺−1 𝑵𝑠 𝑼

𝑼𝑇 𝚺−1 𝐟𝑠,𝑟

(2) −1

.

(3)

 In the M-step, the channel subspace is updated by solving the equations:

𝑼𝑖 𝚯𝒄 = 𝚿𝒊 .

(4)

 Definitions:

    

(1)

Estimating the matrices 𝑽, 𝑼, 𝑫, and the vectors 𝒙𝑠,𝑟 , 𝒚𝑠 and 𝒛𝑠 [2]: 1) Train 𝑽 , assuming that 𝑼 and 𝑫 are zero. 2) Estimate 𝑼 given the estimate of 𝑽 and assuming that 𝑫 is zero. 3) Estimate the residual matrix 𝑫 given the estimates of 𝑽 and 𝑼. 4) 𝒙𝑠,𝑟 , 𝒚𝑠 and 𝒛𝑠 are then calculated given the estimates of 𝑽, 𝑼and 𝑫.

−1

𝚲𝑠,𝑟 = 𝐸 𝒙𝑠,𝑟 𝒙𝑇𝑠,𝑟 = 𝝁𝑠,𝑟 𝝁𝑇𝑠,𝑟 + 𝑰 + 𝑼𝑇 𝚺 −1 𝑵𝑠 𝑼



Definitions:  𝒎 is speaker- and channel-independent supervector,  𝑽 is a rectangular matrix of low rank with high speaker variability  𝒚𝑠 is the speaker factor  𝑼 is a rectangular matrix of low rank with high channel variability  𝒙𝑠,𝑟 is the channel factor containing channel related information  𝑫 is a diagonal matrix describing any remaining speaker variability  𝒛𝑠 is the speaker-specific residual factor  The factors 𝒙𝑠,𝑟 , 𝒚𝑠 and 𝒛𝑠 are assumed to be independent of each other and have a standard normal prior distribution.

of

the channel factor is calculated as:



Channel variability can be produced artificially by corrupting the clean recording by different types and levels of distortion.

𝑴𝑠,𝑟 = 𝒎 + 𝑽𝒚𝑠 + 𝑼𝒙𝑠,𝑟 + 𝑫𝒛𝑠 .

2

 The





{ahp, jrj, mgc}@create.aau.dk,

Media Lab, MIT, Cambridge, Massachusetts, USA

Channel Factor and Subspace Estimation

Introduction



3

Engineering and Applied Science, Aston University, Birmingham, UK

 

𝚺 is a block-diagonal matrix entries form the covariance matrix of the cth mixture of the UBM, 𝑁𝑠,𝑟,𝑐 = 𝐿𝑙=1 𝛾𝑐,𝑙 and 𝐟𝑠,𝑟,𝑐 = 𝐿𝑙=1 𝛾𝑐,𝑙 𝝆𝑙 − 𝒎𝑐 + 𝑽𝑐 𝒚𝑠 are the zeroand first order statistics for each speaker s, recording r and mixture component c. 𝝆𝑙 is the acoustic features of the 𝑙th frame 𝑰 is an identity matrix, 𝑵𝑠 is a block-diagonal matrix which its entries are 𝑟 𝑁𝑠,𝑟,𝑐 𝑰 𝐟𝑠,𝑟 is a vector constructed by concatenation of 𝐟𝑠,𝑟,𝑐 𝛾𝑐,𝑙 is the posterior probability of the cth mixture generating 𝝆𝑙 , 𝒎𝑐 and 𝑽𝑐 are, respectively, the subvector of 𝒎 and the submatrix of 𝑽 of mixture component c. 𝚯𝒄 = 𝒔 𝒓 𝑁𝑠,𝑟,𝑐 𝚲𝑠,𝑟 𝑐 = 1, … , 𝐶 𝚿𝒊 is the i th row of 𝚿 = 𝒔 𝒓 𝐟𝑠,𝑟,𝑐 𝝁𝑇𝑠,𝑟

The Proposed Method

Experimental Setup • Database: • Parkinson’s voice database (sustained vowels, 750 telephone recordings).

• Distortion Classes: • • • •

Additive noise (white Gaussian, babble, office ambiance noises) Reverberation (8 different real room impulse responses) Peak clipping (clipping level: 0.3, 0.4, 0.5, 0.6) Coding (6.3 kbps, 9.6 kbps and 16 kbps CELP codecs)

• Acoustic features: • 39 dimensional vector (12 MFCCs + frame energy + ∆ + ∆∆)

• Distortion Modeling: • GMM with 256 mixtures • Speaker factor dim.: 0 • Channel factor dim.: 210

• Classifiers: • SVM with RBF kernel • PLDA Fig. 2: Performance of different configuration of the FA model.

Results Table 1: Comparison of [1] and the proposed method before and after pre-processing channel vectors using LDA. Results are in the form of mean ± STD. System Baseline PLDA PLDA + LDA SVM SVM + LDA

Clean 55 ± 11 100 ± 0 77 ± 4 28 ± 18 78 ± 3

Noisy 97 ± 4 0±0 98 ± 2 33 ± 5 97 ± 2

Reverb. 77 ± 4 0±0 86 ± 4 31 ± 16 87 ± 4

Clipped 82 ± 7 0±0 82 ± 2 35 ± 14 85 ± 2

Coded 85 ± 9 0±0 93 ± 3 68 ± 12 93 ± 3

Overall 79 ± 3 20 ± 0 87 ± 1 39 ± 4 88 ± 1

Conclusions • Distortion in variable duration signals is modeled by a fixed-length, lowdimensional vector which is more suitable for classification algorithms. • Channel vectors are more robust to small changes in signal characteristics than MFCCs, they are more suitable for distortion classification in pathological voices.

References

Fig. 1: The block diagram of the proposed distortion classification system.

[1] A. H. Poorjam, J. R. Jensen, M. A. Little, and M. G. Christensen, “Dominant distortion classification for pre-processing of vowels in remote biomedical voice analysis,” in INTERSPEECH, 2017, pp. 289–293. [2] P. Kenny, P. Ouellet, N. Dehak, V. Gupta, and P. Dumouchel, “A Study of inter-speaker variability in speaker verification,” IEEE Trans. Audio. Speech. Lang. Processing, vol. 16, no. 5, pp. 980–988, 2008.

Introduction Distortion Modeling Experimental Setup

Introduction Distortion Modeling Experimental Setup

Suggest Documents

Introduction Distortion Modeling Experimental Setup The ... - SigPort

Introduction Experimental setup

1. Introduction 2. Experimental setup

Abstract Introduction Experimental Setup

Abstract Introduction Experimental Setup

Introduction System Setup Experimental Procedure ...

1 Introduction 2 Experimental setup - inspire-hep

1. Introduction 3. Experimental setup 4 ...

Introduction Experimental Setup Results and Discussion - UCC

1. INTRODUCTION 2. EXPERIMENTAL SETUP 2.1 Experiment

Experimental setup - Lytix Biopharma

Experimental setup - CORE

Experimental setup - PLOS

Experimental Setup - arXiv

Effects of Experimental Setup and Modeling Assumptions on Predicted

INTRODUCTION DISCLAIMER SETUP

Calibration of wavefront distortion in light modulator setup by Fourier ...

Methodology and Experimental Setup Cultural ...

An experimental setup for practical differential ...

Experimental setup for determining ammonia-salt ...

1 Supplementary Note 1: Experimental setup The

INNOVATIVE EXPERIMENTAL SETUP FOR THE ... - CiteSeerX

experimental investigation of inlet distortion effect on

Background Results Hypothesis Experimental Setup ...