
[7] Wang Shou-jue, Li Zhaozhou, et al., "Discussion on the Basic Mathematical Models of Neurons in General Purpose Neurocomputer", Acta Electronica Sinica ...
Biomimetic Pattern Recognition for Speaker-Independent Speech Recognition

Hong Qin, Shoujue Wang and Hua Sun
Laboratory of Artificial Neural Networks, Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China

Abstract: In speaker-independent speech recognition, the most widely used technology, Hidden Markov Models (HMMs), has the disadvantage of requiring not only many more training samples but also a long training time. This paper describes the use of Biomimetic Pattern Recognition (BPR) to recognize Mandarin speech in a speaker-independent manner. The vocabulary of the system consists of the names of 15 Chinese dishes. Neural networks based on the Multi-Weight Neuron (MWN) model are used to train on and recognize the speech sounds. Experimental results show that the system, which performs real-time recognition of common Chinese speech spoken by people from different provinces, outperforms HMMs, especially when only a limited number of samples is available.

Key words: Speech Recognition, Biomimetic Pattern Recognition, Hidden Markov Models, Dynamic Time Warping

I. INTRODUCTION

The main goal of automatic speech recognition (ASR) is to produce a system that accurately recognizes normal human speech from any speaker. A recognition system may be classified as speaker-dependent or speaker-independent. A speaker-dependent system must be trained on the speech of the person who will operate it in order to achieve a high recognition rate. For applications in public facilities, on the other hand, the system must be capable of recognizing speech uttered by many different people of different gender, age, accent, etc. Speaker independence therefore has many more applications, primarily in the general area of public facilities. In recent years, Biomimetic Pattern Recognition (BPR) [1][2] was proposed. Because it is simple and feasible, it has already been applied to object recognition [3], face identification [4] and face recognition [5], among others, and has achieved much better performance. With some adaptations, such modeling techniques could easily be used in speech recognition too. In this paper, a real-time Mandarin speech recognition system based on BPR for the names of Chinese dishes is proposed. The system is a small-vocabulary, speaker-independent, continuous speech recognition system. The whole system is implemented on a PC with the CASSANN-II neurocomputer [6][7].

0-7803-9422-4/05/$20.00 ©2005 IEEE

It supports a standard 16-bit sound card.

II. INTRODUCTION OF BIOMIMETIC PATTERN RECOGNITION AND MULTI-WEIGHTS NEURON NETWORKS

A. Biomimetic Pattern Recognition

Traditional pattern recognition aims at finding the optimal classification of different classes of samples in the feature space. BPR, by contrast, aims to find the optimal coverage of the samples of the same type. It is based on the Principle of Homology-Continuity: if two samples of the same class are not exactly the same, the difference between them must change gradually, so a gradually changing sequence must exist between the two samples. Let R^n be the n-dimensional feature space, and let A be the point set including all samples in class A. According to BPR, if x, y ∈ A and ε > 0 are given, there must exist a set B, B = {x_1 = x, x_2, ..., x_{n-1}, x_n = y | ρ(x_m, x_{m+1})