Wavelet-based Dynamic Time Warping for Speech Recognition
Fabrício Lopes Sanchez
[email protected]
Sylvio Barbon Júnior
[email protected]
Rodrigo Capobianco Guido
[email protected]
Lucimar Sasso Vieira
[email protected]
IFSC - USP - Instituto de Física de São Carlos – Universidade de São Paulo - Avenida Trabalhador São Carlense 400, 13560-970, São Carlos – SP, Brazil.
Natália Ferraz de Carvalho
[email protected]
NEC - UNORP - Centro Universitário do Norte Paulista – Rua Ipiranga, 3460 – Jardim Alto Rio Preto, 15020-040, São José do Rio Preto – SP, Brazil.
ABSTRACT
The well-known Dynamic Time Warping (DTW) algorithm [1], a pattern matching approach for speech recognition [2], is based on the computation of two n × m matrices, where n and m are, respectively, the length of the input data and the length of the template signal. Therefore, the order of computational complexity (OC) and the time delay of this procedure grow in proportion to the product of n and m.
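To make the n × m cost concrete, the following is a minimal Java sketch of classic DTW, not the authors' implementation. It assumes that the two matrices are a local-distance matrix and an accumulated-cost matrix, that the signals are scalar sequences, and that the local cost is the absolute difference; the class and method names (`DtwSketch`, `distance`) are hypothetical.

```java
/** Minimal sketch of classic DTW on scalar sequences (illustrative names). */
final class DtwSketch {

    /**
     * Returns the accumulated cost of the cheapest warping path between
     * the input x (length n) and the template y (length m).
     */
    static double distance(double[] x, double[] y) {
        int n = x.length;
        int m = y.length;
        if (n == 0 || m == 0) {
            throw new IllegalArgumentException("sequences must be non-empty");
        }

        // Matrix 1 (assumed): local distance between every pair of samples.
        double[][] local = new double[n][m];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < m; j++) {
                local[i][j] = Math.abs(x[i] - y[j]);
            }
        }

        // Matrix 2 (assumed): accumulated cost of the cheapest path to (i, j).
        double[][] acc = new double[n][m];
        acc[0][0] = local[0][0];
        for (int i = 1; i < n; i++) {
            acc[i][0] = acc[i - 1][0] + local[i][0];
        }
        for (int j = 1; j < m; j++) {
            acc[0][j] = acc[0][j - 1] + local[0][j];
        }
        for (int i = 1; i < n; i++) {
            for (int j = 1; j < m; j++) {
                // Allowed predecessors: diagonal, vertical and horizontal steps.
                double best = Math.min(acc[i - 1][j - 1],
                        Math.min(acc[i - 1][j], acc[i][j - 1]));
                acc[i][j] = local[i][j] + best;
            }
        }
        return acc[n - 1][m - 1];
    }
}
```

Both nested loops visit every one of the n × m cells, so shortening either sequence shrinks the work proportionally. For example, `DtwSketch.distance(new double[]{1, 2, 3}, new double[]{1, 2, 2, 3})` evaluates to 0, because the repeated sample is absorbed by the warping path.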
The proposed algorithm, a wavelet-based version of the original DTW, has a much lower OC. The discrete wavelet transform (DWT) [3] is applied to both signals involved, the input and the template. Then, only the high-energy sub-bands are used to compute the matrices and to find the best path, which reduces the OC.
Generally speaking, for speech signals, it is possible to reduce the OC by a factor of more than 20, depending on the sampling rate of the input data and the number of sub-bands used, without decreasing the accuracy relative to the original version of the algorithm. The Java [4][5] version of the proposed algorithm runs under the Linux operating system and can be easily adapted to work in real time using a digital signal processor [6].
References
[1] John Coleman, Introducing Speech and Language Processing, Cambridge University Press (April 11, 2005).
[2] Li Deng, Speech Processing: A Dynamic and Optimization-Oriented Approach, CRC (June 1, 2003).
[3] Paul S. Addison, The Illustrated Wavelet Transform Handbook, Taylor & Francis, 1st edition (July 1, 2002).
[4] Harvey M. Deitel, Paul J. Deitel, Java How to Program (6th Edition), Prentice Hall (August 4, 2004).
[5] Cay Horstmann, Gary Cornell, Core Java(TM) 2, Volume II - Advanced Features, Prentice Hall PTR, 7th edition (November 22, 2004).
[6] Alan V. Oppenheim, Ronald W. Schafer, John R. Buck, Discrete-Time Signal Processing, Prentice Hall, 2nd edition (February 15, 1999).