Feature Extraction Using Wavelet Transform for Neural Network Based Image Classification

Manish N. Sarlashkar, M. Bodruzzaman and M. J. Malkani
Center of Neural Engineering, Department of Electrical and Computer Engineering
Tennessee State University, Nashville, TN
sarlashkarm@harpo.tnstate.edu

1. INTRODUCTION

Pattern recognition is concerned with the design and development of methods for the classification or description of patterns, objects, signals, and processes. In the case of image classification, because of the image sizes involved, it is essential to extract features of the image. The pattern or image is then classified on the basis of the features extracted. It is therefore very important to select a small number of features that represent the image uniquely and contain the most discriminatory information for robust classification. The features extracted should reflect global properties of the image. The same mechanism is used in the human vision system. When a person looks at two different images or objects, for example the letters 'A' and 'B', he can recognize them immediately as 'A' and 'B', without careful inspection of the images. The reason is that the human brain picks up the general structures, or the coarse variations, of the images, which allows it to make immediate decisions. These general structures are not local properties of the images; they represent global properties. That is, even scaling or rotating the image does not affect the brain's decision at all. So, in order to design an artificial image classification or recognition scheme whose robustness in classification approaches as closely as possible that of the human biological recognition system, two factors must be taken into account:

1. It must be able to automatically extract global properties of the images.
2. It must be able to filter out variations such as scaling and rotation in the images.

Wavelet transforms of the images, with the high frequency components truncated off, seem able to meet both of these conditions. This is because the low frequency components are spread in the time domain and can be treated as a global property, while the high frequency components, concentrated in the time domain, can be discarded. Information at the different resolution scales provided by wavelet features leads to highly discriminating, robust classifiers. Wavelets can examine data at different scales and frequencies. The rest of the paper is organized as follows: the second section discusses the theory behind wavelets and their suitability for image classification; the third section discusses feature extraction and how the wavelet transform is implemented; in the fourth section, the results of feature extraction are shown.
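The robustness argument above can be illustrated numerically. The following sketch uses a single Haar averaging/differencing step as the low/high frequency split (an assumption for illustration; the paper does not name its filters here): a smooth signal concentrates its energy in the averages, so the low frequency half is far less disturbed, in relative terms, by additive noise than the detail half that gets discarded.

```python
import numpy as np

rng = np.random.default_rng(0)

# A smooth test signal and a noisy copy of it.
x = np.sin(np.linspace(0, 2 * np.pi, 256))
noisy = x + 0.1 * rng.standard_normal(256)

def haar_step(s):
    """One Haar level: pairwise averages (low frequency) and differences (high frequency)."""
    avg = (s[0::2] + s[1::2]) / np.sqrt(2)
    det = (s[0::2] - s[1::2]) / np.sqrt(2)
    return avg, det

a_clean, d_clean = haar_step(x)
a_noisy, d_noisy = haar_step(noisy)

# A smooth signal's energy concentrates in the low frequency half ...
sig_ratio = np.linalg.norm(a_clean) / np.linalg.norm(d_clean)

# ... so the averages have a much smaller *relative* error under noise than
# the details, which is why truncating the high frequency half is robust.
rel_err_a = np.linalg.norm(a_noisy - a_clean) / np.linalg.norm(a_clean)
rel_err_d = np.linalg.norm(d_noisy - d_clean) / np.linalg.norm(d_clean)
print(sig_ratio > 10, rel_err_a < rel_err_d)  # True True
```

Note that white noise itself splits its energy evenly between the two halves; the advantage of the low frequency half comes entirely from the signal's energy being concentrated there.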

2. WAVELETS

The wavelet transform is a powerful technique for representing data at different scales and frequencies [4]. Wavelets are especially attractive from the standpoint of signal classification because the human ear computes an approximate wavelet transform [5], and the eye has been shown to have wavelet-like receptive fields [6]. In wavelets, the basis functions are scaled and translated versions of the same prototype function ψ, known as the mother wavelet.
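The scaled-and-translated family can be made concrete. The sketch below takes the Haar wavelet as the prototype ψ (an assumption; the paper does not fix a particular mother wavelet) and checks numerically that a few members of the family ψ_vk(x) = 2^(v/2) ψ(2^v x − k) are orthonormal.

```python
import numpy as np

def psi(x):
    """Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    return np.where((x >= 0) & (x < 0.5), 1.0,
                    np.where((x >= 0.5) & (x < 1.0), -1.0, 0.0))

def psi_vk(x, v, k):
    """Scaled and translated copy: psi_vk(x) = 2**(v/2) * psi(2**v * x - k)."""
    return 2.0 ** (v / 2) * psi(2.0 ** v * x - k)

# Dense dyadic grid on [0, 1) for numerical inner products.
n = 1 << 14
x = np.arange(n) / n

def inner(f, g):
    """Riemann approximation of the L2 inner product on [0, 1)."""
    return np.sum(f * g) / n

print(inner(psi_vk(x, 2, 1), psi_vk(x, 2, 1)))  # ~1.0: unit norm
print(inner(psi_vk(x, 2, 1), psi_vk(x, 2, 2)))  # ~0.0: orthogonal shifts
print(inner(psi_vk(x, 1, 0), psi_vk(x, 2, 0)))  # ~0.0: orthogonal scales
```

The 2^(v/2) factor is exactly what keeps every member of the family at unit norm as the support shrinks by octaves.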

0-7803-4547-9/98/$10.00 ©1998 IEEE


A wide basis function can examine a large region of the signal and resolve the low frequency details accurately, while a short basis function can examine a small region of the signal to resolve time details accurately. If ψ(x) represents the mother wavelet, scaling is accomplished by multiplying x by some scale factor. If the scale factor is a power of 2, yielding ψ(2^v x), where v is some integer, we get a cascaded octave bandpass filter structure. As ψ has finite support, it must be translated along the time axis in order to cover an entire signal. This translation is accomplished by considering all the integral shifts of ψ, i.e. ψ(2^v x − k), k ∈ Z. Putting this all together gives a wavelet decomposition of a signal:

    f(x) = Σ_v Σ_k c_vk ψ_vk(x),   where ψ_vk(x) = 2^(v/2) ψ(2^v x − k)

The c_vk are the transform coefficients. They are computed by the wavelet transform, which is just the inner product of the signal f(x) with the basis functions ψ_vk(x). For classification features we do not compute the inverse wavelet transform, since there is no need to reconstruct the original signal for classification.

3. IMAGE CLASSIFICATION

The two-dimensional wavelet transform is applied to the image, which results in four subimages, as shown in Figure 2: an average image (f_ll) and three detail images (f_lh, f_hl and f_hh). For the purpose of image classification the three detail images are discarded, and the average subimage is converted into a vector by concatenating its columns. This vector is used as the image representation for the purpose of image classification. The block diagram of the wavelet transform and neural network based image classifier is shown in Figure 1.

3.1 WAVELET TRANSFORM

The forward and inverse transforms can each be efficiently implemented by a pair of appropriately designed Quadrature Mirror Filters (QMFs). Wavelet based image decomposition can be viewed as a form of subband decomposition. Each QMF pair consists of a low pass filter (H) and a high pass filter (G) which split a signal's bandwidth in half. The impulse responses of H and G are mirror images, related by

    g_n = (−1)^n h_(1−n)

The impulse responses of the forward and inverse transform QMFs, denoted by (H̃, G̃) and (H, G) respectively, are time-reversed versions of each other.

Figure 2 illustrates a single 2-D forward wavelet transform of an image, which is accomplished by two separate 1-D transforms. The image f(x,y) is first filtered along the x direction, resulting in a low pass image f_l(x,y) and a high pass image f_h(x,y). Since the bandwidth of f_l(x,y) and f_h(x,y) along the x direction is now half that of f, each of the filtered images can be down-sampled in the x direction by two without loss of information. The down-sampling is accomplished by dropping every other filtered value. Both f_l and f_h are then filtered along the y direction, resulting in four subimages: f_ll, f_lh, f_hl and f_hh. Once again the subimages are down-sampled by two along the y direction.
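The single-level 2-D decomposition and feature extraction can be sketched directly. The code below assumes the Haar QMF pair (the paper does not specify which filters were used): filter and down-sample along x, then along y, keep the average subimage f_ll, and flatten it column by column into the feature vector.

```python
import numpy as np

# Haar QMF pair; note g[n] = (-1)**n * h[1-n], the mirror relation in the text.
h = np.array([1.0, 1.0]) / np.sqrt(2)   # low pass H
g = np.array([1.0, -1.0]) / np.sqrt(2)  # high pass G

def filter_downsample(rows, taps):
    """Apply a 2-tap filter along each row, dropping every other output sample."""
    return taps[0] * rows[:, 0::2] + taps[1] * rows[:, 1::2]

def dwt2_single_level(img):
    """One level of the 2-D transform: filter along x, then along y."""
    lo = filter_downsample(img, h)            # f_l(x, y)
    hi = filter_downsample(img, g)            # f_h(x, y)
    f_ll = filter_downsample(lo.T, h).T       # average image
    f_lh = filter_downsample(lo.T, g).T       # detail images ...
    f_hl = filter_downsample(hi.T, h).T
    f_hh = filter_downsample(hi.T, g).T
    return f_ll, f_lh, f_hl, f_hh

img = np.arange(64.0).reshape(8, 8)
f_ll, f_lh, f_hl, f_hh = dwt2_single_level(img)

# The three detail images are discarded; the average image, flattened column
# by column, is the feature vector fed to the classifier.
features = f_ll.flatten(order="F")
print(f_ll.shape, features.shape)  # (4, 4) (16,)
```

Each subimage is half the original size in both directions, so one level shrinks the feature vector to a quarter of the image's pixel count; applying further levels to f_ll would shrink it by another factor of four per level.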

[Figure 1 block diagram: input image → wavelet transform → average subimage in the form of a vector, used as image features → feed-forward neural network for image classification.]

Figure 1: Wavelet Transform and Neural Network Based Image Classifier.

[Figure 2 block diagram: the original image is filtered with H(x) and G(x) and down-sampled by 2 along x, then filtered and down-sampled by 2 along y, producing the four subimages (including f_hl(x,y)); the average subimage feeds the neural network, which produces the output.]

Figure 2: Block Diagram of the 2-D Forward Wavelet Transform and Neural Network Classifier.

4. RESULTS AND CONCLUSIONS

The technique proposed in this paper was tested on various test images. Figures 3.a, 4.a, and 5.a show the baboon image at different scales. Figures 3.b, 4.b, and 5.b show the features, extracted using wavelets, for the corresponding images. It can be seen that the features are similar. Figures 7.a, 8.a, and 9.a show the lenna image at different scales. Figures 7.b, 8.b, and 9.b show the features, extracted using wavelets, for the corresponding images. Figures 11.a, 12.a, and 13.a show the same picture at different scales, taken by a robot camera. Figures 11.b, 12.b, and 13.b show the features, extracted using wavelets, for the corresponding images. Features were also extracted from the images mentioned above with the addition of 20% random noise. The features extracted from the scaled, as well as scaled and noisy, images were used as training input to a Feed Forward Neural Network. The network was trained using the 'multi-directional search learning rule' [2]. To test the trained network, three test images of size 64x64 were created from the baboon, lenna and scene images with the addition of 20% noise. The test images with added noise are shown in figures 6.a, 10.a and 14.a. The corresponding features, extracted using wavelets, are shown in figures 6.b, 10.b and 14.b. The network classified the test images as baboon, lenna and scene, without any confusion. This result shows that features extracted using wavelets are scale-invariant, and also noise-invariant up to a certain noise level. Future work will concentrate on classifying images which are scaled as well as rotated, and on implementing the same technique in a mobile robot for object recognition and classification.
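The classification stage can be sketched as follows. This is a hedged stand-in, not the paper's implementation: it trains a one-hidden-layer feed-forward network by plain gradient descent on synthetic 16-dimensional "feature" vectors, in place of the multi-directional search learning rule [2] and the real wavelet features.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins for the wavelet feature vectors of the three classes
# (baboon, lenna, scene): well-separated prototypes plus small noise.
protos = 3.0 * rng.standard_normal((3, 16))
X = np.vstack([p + 0.1 * rng.standard_normal((20, 16)) for p in protos])
y = np.repeat(np.arange(3), 20)
onehot = np.eye(3)[y]

# One-hidden-layer feed-forward network trained by plain gradient descent
# (standing in for the multi-directional search learning rule of [2]).
W1 = 0.1 * rng.standard_normal((16, 8)); b1 = np.zeros(8)
W2 = 0.1 * rng.standard_normal((8, 3));  b2 = np.zeros(3)

for _ in range(1500):
    hact = np.tanh(X @ W1 + b1)                      # hidden activations
    logits = hact @ W2 + b2
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)                # softmax probabilities
    dlogits = (p - onehot) / len(X)                  # cross-entropy gradient
    dW2 = hact.T @ dlogits; db2 = dlogits.sum(axis=0)
    dh = (dlogits @ W2.T) * (1 - hact ** 2)          # backprop through tanh
    dW1 = X.T @ dh; db1 = dh.sum(axis=0)
    W1 -= 0.2 * dW1; b1 -= 0.2 * db1
    W2 -= 0.2 * dW2; b2 -= 0.2 * db2

pred = np.argmax(np.tanh(X @ W1 + b1) @ W2 + b2, axis=1)
acc = (pred == y).mean()   # training accuracy on the synthetic features
print(acc)
```

Because the wavelet features are (approximately) scale- and noise-invariant, a modest network of this shape suffices; the hard invariance work is done by the feature extraction, not the classifier.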

[Figures 3 and 4: images with wavelet-feature plots; axis residue omitted.]

Figure 3: (a) Original baboon image [512x512]. (b) The wavelet feature representing the original baboon.

Figure 4: (a) Scaled version of the baboon image [256x256]. (b) The wavelet feature representing the scaled baboon.


[Figures 5 and 6: images with wavelet-feature plots; axis residue omitted.]

Figure 5: (a) Scaled version of the baboon image [128x128]. (b) The wavelet feature representing the scaled baboon image.

Figure 6: (a) Scaled version of baboon [64x64] with 20% random noise. (b) The wavelet feature representing the scaled baboon (noisy).

[Figures 7 and 8: images with wavelet-feature plots; axis residue omitted.]

Figure 7: (a) Scaled version of the lenna image [256x256]. (b) The wavelet feature representing the scaled lenna image.

Figure 8: (a) Original lenna [512x512]. (b) The wavelet feature representing the original lenna image.

[Figures 9 and 10: images with wavelet-feature plots; axis residue omitted.]

Figure 9: (a) Scaled version of the lenna image [128x128]. (b) The wavelet feature representing the scaled lenna image.

Figure 10: (a) Scaled version of lenna [64x64] with 20% random noise. (b) The wavelet feature representing the scaled lenna (noisy).

[Figures 11 and 12: images with wavelet-feature plots; axis residue omitted.]

Figure 11: (a) Scaled version of the scene (hallway) [512x512] captured by the robot camera. (b) The wavelet feature representing the scene in (a).

Figure 12: (a) Scaled version of the scene (hallway) [256x256] captured by the robot camera. (b) The wavelet feature representing the scene in (a).


[Figures 13 and 14: images with wavelet-feature plots; axis residue omitted.]

Figure 13: (a) Scaled version of the scene (hallway) [128x128] captured by the robot camera. (b) The wavelet feature representing the scene in (a).

Figure 14: (a) Scaled version of the scene (hallway) [64x64] with 20% noise. (b) The wavelet feature representing the scene in (a).

5. ACKNOWLEDGEMENT

This research work was supported by the Office of Naval Research through the Center for Neural Engineering at Tennessee State University under Grant No. N00014-92-J-1372.

6. REFERENCES

[1] M. Bodruzzaman, X. Li, K. Kuah, L. Crowder and M. Malkani, "Speaker Recognition Using Neural Network and Adaptive Wavelet Transform". SPIE Conference, 1993.
[2] C. Wang, M. Bodruzzaman, X. Li, J. A. Cadzow, "Multi-directional Search Learning Algorithms for Multi Feed Forward Neural Network". Proc. of the Canadian Conference on Electrical and Computer Engineering (CCECE'92), Ontario, Canada, September 1992.
[3] H. Szu, B. Telfer, S. Kadambe, "Adaptive Wavelets for Signal Representation and Classification". Optical Engineering, vol. 31, pp. 1907-1916, Sept. 1992.
[4] I. Daubechies, "Orthonormal bases of compactly supported wavelets". Comm. Pure Appl. Math., vol. XLI, pp. 909-996, 1988.
[5] X. Yang, K. Wang, S. Shamma, "Auditory Representations of Acoustic Signals". IEEE Trans. Inform. Theory, vol. 38, pp. 824-839, Mar. 1992.
[6] J. Daugman, "Complete Discrete 2-D Gabor Transforms by Neural Networks for Image Analysis and Compression". IEEE Trans. ASSP, vol. 36, pp. 1169-1179, 1988.
[7] S. Mallat, "A theory for multiresolution signal decomposition: The wavelet representation". IEEE Trans. Patt. Anal. Machine Intell., vol. 11, pp. 674-693, 1989.
[8] Chia-Lun J. Hu, "Robustness of Pattern Recognition in a Noniteratively Trained, Unsupervised, Hard-Limited Perceptron".
[9] M. L. Hilton, B. D. Jawerth and A. Sengupta, "Compressing Still and Moving Images With Wavelets". Multimedia Systems, vol. 2, no. 3.
