SPIE Vol. 697, Applications of Digital Image Processing IX (1986)
TWO DIMENSIONAL PROCESSING OF SPEECH AND ECG SIGNALS USING THE WIGNER-VILLE DISTRIBUTION

Boualem Boashash and Saman S. Abeysekera
CRISSP, Department of Electrical Engineering, University of Queensland, Brisbane 4067, Australia

Abstract

The Wigner-Ville Distribution (WVD) has been shown to be a valuable tool for the analysis of non-stationary signals such as speech and electrocardiogram (ECG) data. The one-dimensional real data are first transformed into a complex analytic signal using the Hilbert Transform, and then a 2-dimensional image is formed using the Wigner-Ville Transform. For speech signals, a contour plot is determined and used as a basic feature for a pattern recognition algorithm. This method is compared with the classical Short Time Fourier Transform (STFT) and is shown to be able to recognize isolated words better in a noisy environment. The same method, together with the concept of instantaneous frequency of the signal, is applied to the analysis of ECG signals. This technique allows one to classify diseased heart-beat signals. Examples are shown.

Introduction to time-frequency signal analysis

Introduction

The classical representations used in signal theory are the time-domain representation and the frequency-domain representation (based on the Fourier Transform). Both are idealizations, as the t-representation uses rigorously defined instants of time and the f-representation uses infinite sinewaves of rigorously defined frequencies. In these representations, the variables time, t, and frequency, f, are mutually exclusive. The changing frequency in a device such as a siren is impossible to represent in this way because its representation involves both time and frequency. To provide a representation closer to reality we are therefore naturally led to represent signals in two dimensions, with time and frequency as co-ordinates [3]. We show here that the time-frequency distribution (TFD), by taking account of both variables, time and frequency, allows one to distribute the energy evolution of observed phenomena in a time-frequency domain and consequently gives a solution to this problem.

Spectrogram and Sonagram

To distribute the energy of a signal in a time-frequency domain, many t-f representations have been proposed. One of them is the Short-Time Fourier Transform (Spectrogram). This device calculates the spectrum of a weighted slice of a signal by applying a window to the signal. The "sonagram" is an equivalent form obtained by a windowing of the frequency domain with a filter bank. Both methods are widely used for the processing of data from various sources. But the simple use of these methods does not lead to sufficient accuracy in the localisation of time and frequency. One is always faced with the fact that an increase in time resolution is accompanied by a decrease in frequency resolution and vice-versa. The optimum resolution of this analysis is obtained in the case of the spectrogram with a window width given by: (1.1)

where $f_i(t)$ is the instantaneous frequency, defined as the derivative of the phase of the analytic signal [2]. For a frequency modulated signal, $df_i/dt$ represents the slope of the frequency modulation law. Therefore, an optimum t-f analysis can be obtained only by choosing a varying window that is a function of this slope [17], and can be achieved only after an iterative analysis.
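The definition of instantaneous frequency used above can be illustrated numerically. The following is a minimal sketch, assuming NumPy and SciPy are available; the test chirp, the sampling rate `fs`, and the names `f0`, `alpha` and `instantaneous_frequency` are illustrative choices, not taken from the paper. The phase derivative is divided by 2π so that the result is in Hz.

```python
import numpy as np
from scipy.signal import hilbert

def instantaneous_frequency(s, fs):
    """Estimate f_i(t) in Hz from the phase of the analytic signal of s."""
    z = hilbert(s)                          # analytic signal z = s + j*H[s]
    phase = np.unwrap(np.angle(z))          # continuous phase in radians
    # Finite-difference phase derivative, scaled from rad/sample to Hz.
    return np.diff(phase) * fs / (2.0 * np.pi)

fs = 1000.0                                 # sampling rate in Hz (assumed)
t = np.arange(0, 1.0, 1.0 / fs)
f0, alpha = 50.0, 100.0                     # start frequency (Hz), sweep rate (Hz/s)
s = np.cos(2 * np.pi * (f0 * t + alpha * t**2 / 2))

fi = instantaneous_frequency(s, fs)
# Away from the record edges, fi should follow the linear law f0 + alpha * t,
# whose slope d(fi)/dt = alpha is the quantity that governs the window choice.
```

The estimate has one sample fewer than the signal because of the finite difference, and each value is best associated with the midpoint between two consecutive samples.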

A General Formulation for Time-Frequency Analysis

To distribute the energy of a signal in a time-frequency domain, many other t-f representations have been proposed by different authors, each one with its own merits and disadvantages [2][5][6]. The TFD is a more general formula that generalizes all the previous definitions. It is expressed as [4]:

$$\rho(t,f) = \iiint e^{j2\pi n(u-t)} \, w(n,\tau) \, z(u+\tau/2) \, z^*(u-\tau/2) \, e^{-j2\pi f\tau} \, dn \, du \, d\tau \tag{1.2}$$

where we define $z(t)$ as the analytic signal associated with the real signal $s(t)$:

$$z(t) = s(t) + j \, H[s(t)], \qquad H = \text{Hilbert Transform} \tag{1.3}$$

An appropriate choice of $w(n,\tau)$ yields one of the previously proposed definitions of t-f representation [2]. A comparative study has shown that among this class of TFDs the Wigner-Ville Distribution (WVD) is the most suitable tool for t-f analysis [11].

Properties of the Wigner-Ville Distribution (WVD)

It is defined by:

$$W(t,f) = \int_{-\infty}^{+\infty} z(t+\tau/2) \, z^*(t-\tau/2) \, e^{-j2\pi f\tau} \, d\tau \tag{1.4}$$
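One possible discrete-time reading of definition (1.4) is sketched below. It is not the authors' implementation: the function name `wigner_ville`, the rectangular lag window limited by the record edges, and the test chirp are assumptions made for illustration. The lag is sampled as τ = 2m/fs, so the FFT bins correspond to frequencies k·fs/(2N).

```python
import numpy as np
from scipy.signal import hilbert

def wigner_ville(s, fs):
    """Discrete WVD of a real signal s sampled at fs Hz.

    Returns (W, freqs): W[n, k] is real; rows index time samples and
    columns index the frequencies freqs[k] = k * fs / (2 * N).
    """
    z = hilbert(np.asarray(s, dtype=float))      # analytic signal, eq. (1.3)
    N = len(z)
    W = np.empty((N, N))
    for n in range(N):
        mmax = min(n, N - 1 - n)                 # largest lag inside the record
        m = np.arange(-mmax, mmax + 1)
        kernel = np.zeros(N, dtype=complex)
        # z(t + tau/2) * conj(z(t - tau/2)) with tau = 2m / fs
        kernel[m % N] = z[n + m] * np.conj(z[n - m])
        # The kernel is Hermitian in m, so its FFT is real up to rounding.
        W[n] = np.fft.fft(kernel).real
    freqs = np.arange(N) * fs / (2.0 * N)
    return W, freqs

# Usage on an illustrative linear FM signal (parameters are assumptions).
fs = 256.0
t = np.arange(256) / fs
f0, alpha = 20.0, 60.0                           # start frequency (Hz), sweep rate (Hz/s)
s = np.cos(2 * np.pi * (f0 * t + alpha * t**2 / 2))
W, freqs = wigner_ville(s, fs)
ridge = freqs[np.argmax(W, axis=1)]              # peak detection along frequency
# Away from the record edges, ridge should track the law f0 + alpha * t.
```

Each row is an FFT of a full-length lag kernel, so the cost is of order N² log N; this is adequate for short records but a windowed (pseudo) WVD would normally be preferred for long ones.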

Previous studies have shown that the Wigner Distribution exhibits some very interesting properties with regard to time-frequency signal analysis [7][8][9]. Here are the most important:

• $W(t,f)$ is real for all values of $t$ and $f$.

• $\int_{-\infty}^{+\infty} W(t,f) \, dt = |Z(f)|^2$ : spectral density (1.5)

• $\int_{-\infty}^{+\infty} W(t,f) \, df = |z(t)|^2$ : instantaneous power (1.6)

• $\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} W(t,f) \, dt \, df = E$ : energy of the signal (1.7)

• if $z(t) = 0$ for $t < T_1$ and $t > T_2$, then $W(t,f) = 0$ for $t < T_1$ and $t > T_2$.

• if $Z(f) = 0$ for $f < f_1$ and $f > f_2$, then $W(t,f) = 0$ for $f < f_1$ and $f > f_2$.

• the first-order moments of the WVD yield $f_i(t)$, the instantaneous frequency, and $\tau_g$, the group delay.

• if $y(t) = s(t) * h(t)$, then we have: $W_y(t,f) = W_s(t,f) \underset{t}{*} W_h(t,f)$

• Reversibility of the Wigner Distribution: we can reconstruct $s(t)$ as follows [8]:

$$\int_{-\infty}^{+\infty} W\!\left(\tfrac{t}{2}, f\right) e^{j2\pi f t} \, df = z(t) \, z^*(0) \tag{1.8}$$

Thus, the WVD contains all of the information carried by the signal. No information is lost. (For the discussion of those properties, see [8] and [9].)

• All other Time-Frequency Distributions are smoothed versions of the WVD:

$$\rho(t,f) = W(t,f) \underset{t,f}{*\,*} \hat{w}(t,f) \tag{1.9}$$

where $\hat{w}(t,f)$ is the 2D Fourier Transform of $w(n,\tau)$.
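As a worked check of the instantaneous power property (1.6), the following short derivation starts from definition (1.4). It is a sketch that assumes the order of integration may be exchanged and uses the identity $\int e^{-j2\pi f\tau} \, df = \delta(\tau)$:

$$\int_{-\infty}^{+\infty} W(t,f) \, df = \int_{-\infty}^{+\infty} z(t+\tau/2) \, z^*(t-\tau/2) \left[ \int_{-\infty}^{+\infty} e^{-j2\pi f\tau} \, df \right] d\tau = \int_{-\infty}^{+\infty} z(t+\tau/2) \, z^*(t-\tau/2) \, \delta(\tau) \, d\tau = |z(t)|^2$$

Integrating this result over $t$ then gives the total energy $E$ of property (1.7).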

Wigner-Ville Analysis of Modulated Signals

The signals considered are real, causal, almost time-and-bandlimited, of finite energy, centered ($Z(f=0) = 0$) and satisfy Bedrosian's conditions ($BT \gg 1$). They are expressed as:

$$s(t) = a(t) \cos \phi(t)$$

in which case the analytic signal $z(t)$ associated with $s(t)$ can be expressed as:

$$z(t) = a(t) \, e^{j\phi(t)} \tag{1.10}$$


and thus the instantaneous frequency has a physical meaning; it defines the frequency modulation law of the signal [17].

WVD of Monocomponent Signals

We call a signal monocomponent if, for that signal, the instantaneous frequency law represents the frequency modulation law of the signal and is invertible, so that the law $\tau_g(f) = f_i^{-1}(f)$ exists and is not multiple valued. The latter then represents the group delay of the signal. In this case, the WVD makes the frequency modulation law of the signal clearly apparent by visual correlation of the maximum amplitude curve, and the law can therefore be estimated simply by peak detection [17].

WVD of Multicomponent Signals

Such a signal is characterised as the sum of several monocomponent signals. The WVD's behaviour depends on whether the frequency modulation laws of each component have the same gradient or not (same slope in the time-frequency domain or not):

• if these gradients are different (non-parallel in the time-frequency plane), WVD analysis allows one to separate the characteristics of each component in the time-frequency domain.

• if these gradients are equal (parallel in the time-frequency plane), WVD analysis creates artefacts (a ghost law 3 appears equidistant between the two real laws 1 and 2). However, this structure can be used to advantage in the interpretation of the time-frequency analysis [17].

Wigner-Ville Time-Frequency Representation of Typical Signals

Example 1:

FM signal with a linear modulation law (Chirp Signal)

This signal can be expressed as follows: $s(t) = \cos 2\pi (f_0 t + \alpha t^2 / 2)$, for 0 ...
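For this example the WVD can be worked out in closed form. The sketch below assumes an idealised, unit-amplitude analytic chirp of infinite duration, $z(t) = e^{j2\pi(f_0 t + \alpha t^2/2)}$; reading the two chirp parameters as a start frequency $f_0$ and a sweep rate $\alpha$ is itself an assumption about the notation. Substituting into definition (1.4):

$$z(t+\tau/2) \, z^*(t-\tau/2) = \exp\!\left\{ j2\pi \left[ f_0 \tau + \tfrac{\alpha}{2} \left( (t+\tau/2)^2 - (t-\tau/2)^2 \right) \right] \right\} = e^{j2\pi (f_0 + \alpha t)\tau}$$

$$W(t,f) = \int_{-\infty}^{+\infty} e^{j2\pi (f_0 + \alpha t - f)\tau} \, d\tau = \delta\big(f - f_0 - \alpha t\big)$$

Under these assumptions the distribution is concentrated exactly on the linear law $f_i(t) = f_0 + \alpha t$. For a finite-duration signal the delta ridge is broadened, but its maximum remains on the same line, which is why peak detection recovers the modulation law.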

Fig. 3.1c: Contours of WVD of signal of Fig. 3.1a (horizontal axis: t (s), 0 to 1.0)

Fig. 3.2c: Contours of WVD of signal of Fig. 3.2a (horizontal axis: t (s), 0 to 1.0)

Fig. 3.3c: Contours of WVD of signal of Fig. 3.3a (horizontal axis: t (s), 0 to 1.0)

Finally, in Figures 3.1c, 3.2c and 3.3c we present the resulting WVD "images" for the 3 different heart beats shown in Figures 3.1a, 3.2a and 3.3a. The relative weighting function used for local normalising is shown on top of each image. These time-frequency images encode all the information relevant to a cardiologist and are, and will be, useful in automatic monitoring and classification of heart beats.

Conclusion

Due to the non-stationary behaviour of ECG signals and the importance of the relative temporal positions of different waves, recognition and classification in a time-frequency domain is more appropriate. We have shown that the WVD provides sufficient frequency resolution, in contrast to the STFT, and also produces artefacts which simplify recognition problems. Our current research includes developing algorithms for identifying P, QRS and T complexes, and automatic classification of cardiac arrhythmias using time-frequency patterns, transforming for higher reliability in separation (i.e. increased inter-class distance and decreased intra-class distance) and improvements in noise performance.

Acknowledgments

The authors wish to thank Mr Ralph Seabrook for his contribution in the processing of the speech signals. They also wish to acknowledge Dr L C Westphal, Senior Lecturer in Electrical Engineering, and Dr S W Manley, Senior Lecturer in Physiology, for helpful discussion. The authors are also grateful to Mr H J Whitehouse for presenting this paper at the conference. In addition, this work is funded by the Australian Research Grant Scheme and the Australian Telecommunication and Electronics Research Board.

References

[1] Wigner, E. (1932), "On the Quantum correction for thermodynamic equilibrium", Phys. Rev., 40, pp. 749-759.
[2] Ville, J. (1948), "Theorie et application de la notion de signal analytique", Cables et Transmissions, vol. 2A (1), pp. 61-74, Paris (France), 1948.

[3] Gabor, D. (1946), "Theory of Communication", J. IEE, London, vol. 93, pp. 429-457, November 1946.
[4] Cohen, L. (1966), "Generalized phase space distributions", Journ. Math. Phys., vol. 7, pp. 781-786, 1966.
[5] Rihaczek (1968), "Signal energy distribution in time and frequency", IEEE Trans. on Inf. Th., IT-14, pp. 369-374, May 1968.
[6] Flanagan, J.L. (1965), "Speech Analysis and Perception", Springer-Verlag, 1965.
[7] Bouachache, B. (1978), "Representation temps-frequence", ELF-Aquitaine Research publication No. 373/78, Geophysical Research Center, Pau, France.
[8] Bouachache, B. et al. (1979), "Sur la possibilite d'utiliser la representation conjointe en temps et frequence de Ville aux signaux vibrosismiques", 7th Colloque GRETSI, Proceed., p. 121, Nice, France.
[9] Claasen, T. et al. (1983), "The Aliasing Problem in Discrete-Time Wigner Distribution", IEEE ASSP, Vol. 31, 5, October 1983.
[10] Bouachache, B. (1979), "Representation temps-frequence", These Docteur-Ingenieur, Univ. de Grenoble, France, 1982.
[11] Bouachache, B. et al. (1982), "Wigner-Ville analysis of time-varying signals", IEEE Intern. Conf. on Ac., Sp. and Sign. Process., Paris, France, May 1982, ICASSP Proceed., pp. 1329, Paris.
[12] Bouachache, B. and Rodriguez, F. (1984), "Recognition of time-varying signals in the time-frequency domain by means of the Wigner distribution", IEEE ICASSP proceedings, p. 22.5, San Diego, March 1984.
[13] Imberger, J. and Boashash, B. (1985), "The Wigner-Ville Distribution applied to Turbulent Microstructure Signals", submitted to Journal of Physical Oceanography.
[14] Boashash, B. (1985), Report 85/2, Univ. Qld., Elec. Eng., February.
[15] Bouachache, B. and Bazelaire, E. de (1983), 9th Colloque Gretsi, Proceed., pp. 879-884, Nice, France.
[16] Bedrosian, E. (1963), "A Product Theorem for Hilbert Transforms", Proceed. IRE Corresp., Vol. 51, pp. 868-869.
[17] Bouachache, B. (1983), Eusipco 83, Erlangen, West Germany, pp. 703-705.
[18] Kay, S. and Boudreaux-Bartels, "On the Optimality of the Wigner Distribution for Detection", Proc. ICASSP 85, pp. 27.2.1-4.
[19] Chalker, D., PhD Thesis, "An Acoustic-to-Articulatory Transformation and its Application in a Speech Training Aid for Hearing-Impaired Speakers", submitted 4/7/86, Department of Electrical Engineering, University of Queensland.
[20] Pipberger, H.V., "Computer Analysis of the Electrocardiogram", in Computers in Biomedical Research, ed. R.W. Stacy and B.D. Waxman, Academic Press, 1965, pp. 377-407.
[21] Cox, J.R., F.M. Nolle, H.A. Fozzard, and G.C. Oliver, "AZTEC: A Preprocessing for Real-time ECG Rhythm Analysis", IEEE Transaction on Biomedical Engineering, V. BME-15, no. 2, April 1968, pp. 128-129.
[22] Nolle, F.M., "ARGUS, A Clinical Computer System for Monitoring Electrocardiographic Rhythms", D.Sc. dissertation, Washington University, Sever Institute of Technology, St. Louis, Missouri, USA, 1972.
[23] Mead, C.N. et al., "Expanded Frequency Domain ECG Waveforms processing: Integration into a New Version of ARGUS/2H", Proceedings of Computers in Cardiology, pp. 205-208, 1982.
[24] Kitney, R. and McDonald, A., "Frequency Analysis for Detecting Changes in QRS Shape", Proceedings of Computers in Cardiology, pp. 369-371, 1980.
[25] Nygards, M.E. and Hulting, J., "Recognition of Ventricular Fibrillation Utilizing the Power Spectrum of ECG", Proceedings of Computers in Cardiology, pp. 393-397, 1977.
[26] Butrous, G.S. et al., "Fast Fourier Transform for the Quantification of preexcitation in Wolff-Parkinson-White Syndrome", Proceedings of Computers in Cardiology, pp. 209-212, 1983.
[27] Guyton, A.C., "Textbook of Medical Physiology", W.B. Saunders Co., Philadelphia, PA 19105, USA.
[28] Abeysekera, R.M.S.S. et al., "Patterns in Hilbert Transform and Wigner-Ville Distribution of Electrocardiogram Data", ICASSP 86, Tokyo, pp. 34.18.1-4, 1986.
[29] Abeysekera, R.M.S.S., Personal Notes, Department of Electrical Engineering, University of Queensland, St Lucia, Brisbane, 198?.
[30] Cohen, L. and Pickover, C.A., "A comparison of joint time-frequency distributions for speech signals", 1986 International Symposium on Circuits and Systems, IEEE, proceedings pp. 42-45, San Jose, USA, May 1986.
[31] Chester, D. et al., "The Wigner distribution in speech processing applications", J. Franklin Inst., Vol. 318, No. 6, pp. 415-430, December 1984.
[32] Garudari, H. et al., "Identification of invariant acoustic cues in stop consonants using the Wigner distribution", in "Applied Signal Processing", IASTED Conference proceedings, Editor: M.H. Hamza, Calgary, 1985.
