utterances of each word. The two data sets were recorded separately, under different recording conditions and with a time interval of five months. In the experiments, some utterances from one or both sets were used for training, and the remaining utterances in either or both sets were used for testing. In all our experiments, the state- and dependence-optimised likelihood was found to increase monotonically as the iteration proceeded. A threshold of 0.001 on the relative increase in likelihood between adjacent iterations was adopted to stop the iteration. Table 1 summarises the recognition results of IFDHMM and MVGHMM for the different training and testing conditions and acoustic features.

Table 1: Recognition comparison between IFDHMM and MVGHMM for English E-set letters

Training utterances   Testing utterances   Feature     Accuracy /%
(Set 1 + Set 2)       (Set 1 + Set 2)                  IFDHMM   MVGHMM
-------------------------------------------------------------------------
5 + 5                 40 + 45              C           87.5     84.1
5 + 5                 40 + 45              C+dP+dC     91.3     88.1
10                    40                   C           88.6     85.2
10                    40                   C+dP+dC     95.0     90.4
10                    40 + 50              C           79.0     71.2
10                    40 + 50              C+dP+dC     90.0     76.5
10                    50                   C           72.3     63.4
10                    50                   C+dP+dC     86.0     68.8
20                    50                   C           83.1     66.5
20                    50                   C+dP+dC     88.0     73.4
50                    50                   C           90.3     76.5
50                    50                   C+dP+dC     92.6     80.6

C, dC and dP are, respectively, the cepstral coefficients, the differenced cepstral coefficients and the differenced power.
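As a side note, the stopping rule quoted above (iterate until the relative likelihood gain between adjacent iterations falls below 0.001) can be sketched as follows. This is our own illustration, not code from the letter; `reestimate` is a hypothetical stand-in for one pass of model re-estimation that returns the updated model and its likelihood.

```python
# Sketch of the iteration-stopping rule described in the text (our illustration).
def train(model, data, reestimate, tol=1e-3):
    prev = None
    while True:
        model, likelihood = reestimate(model, data)  # hypothetical re-estimation pass
        # Stop once the relative likelihood gain between adjacent
        # iterations drops below the 0.001 threshold.
        if prev is not None and (likelihood - prev) / abs(prev) < tol:
            return model
        prev = likelihood
```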


The results indicate that, when trained and tested on matched utterance sets, MVGHMM incorporating the differenced cepstral and power features performs only slightly better than IFDHMM using cepstral features alone. More importantly, IFDHMM exhibits stronger robustness against mismatch between the training and testing conditions, a property that is particularly desirable in a speech recognition algorithm. The significance of these results is that modelling the spectral dependence of speech in an optimal way is beneficial: it may help to capture more accurate information about a word, and can therefore increase both the descriptive and the discriminative power of HMMs for speech recognition. Because the observation densities used here for IFDHMM were of the differenced-frame form, which emphasises only the differenced spectral structure, much temporal spectral information was lost. We therefore believe that further improvement in recognition accuracy may be achieved if better dependent densities are used.

Acknowledgment: The authors would like to thank A.D. Irvine for his assistance with the experiments and the Industrial Research and Technology Unit of the Northern Ireland Department of Economic Development for its support.

22 November 1993
© IEE 1994
Electronics Letters Online No: 19940134
J. Ming and F.J. Smith (Computer Science Department, The Queen's University, Belfast BT7 1NN, United Kingdom)

References

1 LEE, K., HON, H., and REDDY, R.: 'An overview of the SPHINX speech recognition system', IEEE Trans., 1990, ASSP-38, pp. 35-45
2 KENNY, P., LENNIG, M., and MERMELSTEIN, P.: 'A linear predictive HMM for vector-valued observations with applications to speech recognition', IEEE Trans., 1990, ASSP-38, pp. 220-225
3 JUANG, B.H., and RABINER, L.R.: 'The segmental K-means algorithm for estimating parameters of hidden Markov models', IEEE Trans., 1990, ASSP-38, pp. 1639-1641
4 FORNEY, G.D., JR.: 'The Viterbi algorithm', Proc. IEEE, 1973, 61, pp. 268-278

Least-mean kurtosis: A novel higher-order statistics based adaptive filtering algorithm

O. Tanrikulu and A.G. Constantinides

Indexing terms: Adaptive filters, Algorithms

The least-mean kurtosis (LMK) adaptive FIR filtering algorithm is described, which uses the negated kurtosis of the error signal as the cost function to be minimised. Unlike other adaptive algorithms based on higher-order statistics, it is computationally efficient, and it is best suited to applications in which noise contamination degrades the performance of classical adaptive filtering algorithms.

Introduction: Almost all existing adaptive filtering algorithms operate by iteratively minimising a mean-squared-error cost function, owing to the mathematical tractability it provides [1, 5, 6]. The most common algorithms used in practice are the least mean-square (LMS) algorithm and its derivatives [1]. Adaptive algorithms that use non-mean-squared-error cost functions are represented by the sign-error (SE) algorithm and the least mean-fourth (LMF) family of adaptive algorithms, which use the absolute value and the fourth-order moment of the error signal as cost functions, respectively. The SE algorithm has poor convergence properties, but it is computationally less demanding than LMS [1]. The LMF algorithm is reported to converge better than the LMS algorithm when the desired signal to be predicted is contaminated by a noise signal with certain properties [2]; on the other hand, if the noise is Gaussian distributed, there is no real benefit in using LMF [2]. The LMK algorithm introduced here is found to be robust for a wide range of noise signals, such as impulsive, periodic, uniformly distributed and Gaussian-distributed noise [3].

LMK algorithm: The rationale behind the LMK algorithm stems from the simple fact that, for two independent random variables x and y,

    C_n(x + y) = C_n(x) + C_n(y)      M_n(x + y) != M_n(x) + M_n(y)      (1)

where C_n(.) and M_n(.) are the nth-order cumulant and moment, respectively. In other words, the cumulant of the sum of two independent random variables is equal to the sum of the cumulants of the random variables, but the same is not true of the moments [4].
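Eqn. 1 is easy to check numerically. The following sketch is our own illustration, not code from the letter: it estimates fourth-order cumulants and moments from samples of two independent variables, using the zero-mean cumulant estimator C_4(s) = E{s^4} - 3(E{s^2})^2 and distributions chosen by us.

```python
# Numerical check of eqn. 1 (our illustration): fourth-order cumulants of
# independent variables add, fourth-order moments do not.
import numpy as np

rng = np.random.default_rng(0)

def cum4(s):
    """Sample fourth-order cumulant of a zero-mean signal: E[s^4] - 3*E[s^2]^2."""
    return np.mean(s**4) - 3.0 * np.mean(s**2) ** 2

def mom4(s):
    """Sample fourth-order moment E[s^4]."""
    return np.mean(s**4)

N = 1_000_000
x = rng.laplace(size=N)          # heavy-tailed: positive fourth-order cumulant
y = rng.uniform(-1, 1, size=N)   # light-tailed: negative fourth-order cumulant

# Cumulants of the sum match the sum of cumulants (up to sampling error) ...
print(cum4(x + y), "~=", cum4(x) + cum4(y))
# ... whereas the fourth-order moments clearly do not add.
print(mom4(x + y), "!=", mom4(x) + mom4(y))
```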


Fig. 1 Adaptive system identification in noise

The configuration for adaptive FIR system identification in noise is shown in Fig. 1. The error signal is e(n) = w(n) + v^T(n)u(n), where w(n) is the noise signal, v(n) = h_opt - h(n) is the coefficient error vector and u(n) is the input signal. The vector h_opt is the optimal set of coefficients that describes the unknown system. Assuming that w(n) and u(n) are statistically independent and that w(n) is symmetrically distributed, the mean-squared cost function J_LMS(n) = E{e^2(n)} and the mean-fourth cost function J_LMF(n) = E{e^4(n)} can be rewritten as



    J_LMS(n) = sigma_w^2 + v^T(n)Rv(n)

    J_LMF(n) = E{w^4(n)} + 6 sigma_w^2 v^T(n)Rv(n) + E{(v^T(n)u(n))^4}

where sigma_w^2 = E{w^2(n)} and R = E{u(n)u^T(n)} is the autocorrelation matrix of the input signal.
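As a concrete illustration of the Fig. 1 configuration, the sketch below simulates FIR system identification in noise. This is our own example, not code from the letter: it uses a plain LMS update (whose cost surface J_LMS(n) is given above) as the baseline rather than the LMK update itself, and the unknown system h_opt, step size mu and signal models are illustrative assumptions.

```python
# Minimal system-identification simulation for the Fig. 1 configuration
# (our illustration). The adaptive filter is driven by a standard LMS
# update; the LMK coefficient update is not reproduced here.
import numpy as np

rng = np.random.default_rng(1)

h_opt = np.array([0.5, -0.3, 0.2, 0.1])    # assumed unknown system
M = len(h_opt)
h = np.zeros(M)                            # adaptive filter coefficients h(n)
mu = 0.01                                  # LMS step size (assumption)

u_sig = rng.standard_normal(20_000)        # input signal u(n)
w_sig = 0.1 * rng.standard_normal(20_000)  # additive noise w(n)

for n in range(M, len(u_sig)):
    u = u_sig[n - M + 1 : n + 1][::-1]     # input vector, most recent sample first
    d = h_opt @ u + w_sig[n]               # noisy desired signal
    e = d - h @ u                          # error e(n) = w(n) + v^T(n)u(n)
    h += mu * e * u                        # LMS update (stochastic gradient of J_LMS)

print("residual coefficient error ||v|| =", np.linalg.norm(h_opt - h))
```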