A new algorithm for image indexing and retrieval using ... - IEEE Xplore

0 downloads 0 Views 303KB Size Report
image color correlogram or wavelet transform. 1. INTRODUCTION. Digital image libraries and other multimedia databases have been dramatically expanded in ...
A New Algorithm for Image Indexing and Retrieval Using Wavelet Correlogram

H Abrishami Moghaddam’

T.Taghizadeh Khajoie’

A.H. Rouhi2

moghadam@aba. knfu.ac.ir

[email protected]

[email protected]

’ K. N. Toosi University of Technology, Tehran, Iran

* Azad University, Science & Research Campus, Tehran, Iran ABSTRACT

Shape based and color-shape based systems using

In this paper, a new algorithm for content-based image indexing and retrieval i s presented. The proposed method i s based on a combination of multiresolution analysis and color correlation histogram of the image. According to the new algorithm, wavelet coefficients of the image are computed first using Daubechieswavelets. A quantization step is then applied before computing horizontal and vertical color correlograms o f the wavelet coefficients. Finally, index vectors are constructed using these wavelet correlograms. The retrieval results obtained by the new method on a 1000 image database demonstrated a significant improvement in effectiveness and efficiency compared to the indexing and retrieval methods based on image color correlogram or wavelet transform. 1. INTRODUCTION Digital image libraries and other multimedia databases have been dramatically expanded in recent years. Storage and retrieval o f images in such libraries become a real demand in industrial, medical, and other applications (91. Content-based image indexing and retrieval (CBIR) i s considered as a solution. I n such systems, some features are extracted from every picture and stored as an index vector [12]. I n retrieval phase, every index is then compared (using a similarity criterion) to find some similar pictures to the query image [4]. Two major approaches including spatial and transform domain based methods can be identified in CBIR systems. The first approach usually uses pixel (or a group of adjacent pixels) features like color and shape. Among all these features, color is the most used signature for indexing [3]. Color histogram [I41 and its variations were the first algorithms introduced in the pixel domain. However, color histogram i s unable to cany local spatial information of pixels. Therefore, in such systems retrieved images may have many inaccuracies, especially in large image databases. For these reasons, two variations called image partitioning and regional color histogram were proposed to improve the effectiveness o f such systems [2, 131.

0-7803-7750-8/03/$17.00 02003 IEEE

snakes [SI, contour images [IO] and other boundary detection methods [Ill, were also proposed as pixel domain methods. These systems are more sensitive to semantic information but have large computational cost. In the second approach. transformed data are used to extract some higher level features. Recently, wavelet based methods. which provide better local spatial information in transform domain have been used [7,9]. Daubechies wavelets are the most used in CBIR, because o f their fast computation and regularity. In [9], Daubechies wavelets in three scales were used to obtain transformed data. Then, histograms of wavelet coefficients in each sub-band were computed and stored to construct indexing feature vectors. In SlMPLlcity method [7], the image i s first classified into different semantic classes using a kind o f texture classification algorithm. Then, Daubechies wavelets were used to extract feature vectors. In the present work, Daubechies wavelet coefficients are computed first to extract space-frequency information o f the image. Then, the correlation o f these coefficients in LH and HL sub-bands is computed [l]. These directional sub-bands enable us to compute more efficiently the spatial correlation, while taking into consideration the semantic image information. This paper is organized as follows: in section 2, the color correlogram method is reviewed. In section 3, our new approach called wavelet correlogram is presented. Experimental results are given in section 4. Finally, section 5 is devoted to concluding remarks.

2. COLOR CORRELOGRAM Color correlogram, recently introduced by Huang el al. [6], is an approach for CBIR. Color correlogram of an image i s a three dimensional matrix whose elements r(i. j,k)represent the probability o f finding two pixels in

the image with color c, and c j placed in a distance

k

of

each other. Color correlogram expresses how the spatial correlation of pairs of colors changes with distance. This type o f feature turns out to be robust in tolerating large variations in appearance of the same scene caused by

111 - 497

changes of viewing position, background, partial occlusions and effects of camera zoom. Let f be an N X N image (a square image for simplicity) consisting of M different colnrsc,,c2,..., cM . For a pixel p(x,y, E f , let j c pexpresses , the pixel color and

f, =(p

=.}.

Therefore, P E

r;

is equivalent

t o p e f , f,,,= c . For simplicity, L _ - N o r m is used to measure the distance between pixels. This measure is computed for two pixels p , ( * , , y , ) . p 2 ( ~ 2 , ) ' 2as) follows:

IPI

-P2l="{lr,

-x2I>IYl

-hIl

Pig. I A sample binary image and it's autowmelogram in four distances{l.2,3,4).

;[---'--

0)

Y

The histogram h of the image f is defined by:

A ,(/)=N*.Pr{b

L,11

(2)

Assume that k is a specified distance and i, j E {I,...._, M}.The correlogram of ,f is defined by:

According to Eq. (3), y(i,j,k)denotes the probability of finding pixels with color c, at the distance

k of the pixel

with color c,. The computation order of the above equation isO(M'd), where d represents the number of different distances. Therefore. the correlogram computational cost seems to he too large for CBIR. 2.1 Autocorrelogram The autocorrelogram o f f represents spatial correlation between identical colors and is defined by:

a(i,k) = &,k)

(4)

This equation has only O(Md) order and can he computed much faster than Eq. (3). Figures ( I ) and (2) show two different images (M=2, N=7) with the same histograms but completely different autocorrelograms. According to 161, the autocorrelogram computing equations are

A i , i , k ) = \ b , , p Ez f, I ~ P- ,p 2 (= k l J

(5)

Fig. 2 A binary image with the same histogram of Figure ( I ) and it's autowrrelogram in four distances ( I , 2, 3, 4 ) .

The denominator of Eq. (6) is the total number of pixels of color ci at the distance k from any pixel with the same color and 8k is due to properties of L- - N o r m . Since this equation has the order of O(N'd') some methods have been proposed for fast computation of autocorrelogram 161. The main advantages of using color correlogram are: i) taking into consideration the spatial correlation of colors, ii) describing the global distribution of local spatial correlation of colors, iii) being easy to implement, and iv) producing fairly small size index.

3. WAVELET CORRELOGRAM One of the most important properties of wavelet transform is space-frequency decomposition of the input signal [PI,This property enables u s to apply pixel domain tools, such as color correlogram, to the wavelet coefficients. The wavelet correlogram approach is based on computing the spatial correlation of wavelet coefficients of an image. In this way, the multiscalemultiresolution property of the wavelet transform and TR' invariancy of color correlogram will be combined. Consequently, the image indexes obtained by the wavelet correlogram method may have better discriminative performance. 3.1 Wavelet correlogram indexing algorithm

According to the wavelet correlogram indexing algorithm [I], the wavelet transform of the input image is computed first (in three consecutive levels). In the second step, the computed wavelet coefticients are quantized to a limited number of levels. Then, one-dimensional auto1. Translation and Rotation

111 - 498

correlograms of vertical and horizontal quantized wavelet coefficients are computed. Finally, onedimensional autocorrelogram results are used to form feature vectors for image indexing. Figure (3) illustrates three major parts o f the wavelet correlogram indexing algorithm, including preprocessing, processing and feature construction phases.

coefficients corresponding to HH filters have no significant spatial correlation. Therefore, there will be no need to obtain the autocorrelogram o f these coefficients. Since HL and L H coefficients are real numbers with a large dynamic range, a quantization step is required before computing the auto correlogram. To take into consideration the dynamic range variations of wavelet coefficients, we use different quantized levels in each scale. Figure (4) shows the quantization procedure performed on 3 consecutive scales o f wavelet transform. In each scale, small coefficients around zero are considered as noise and discarded.

--0.5

0.23

-.

&&IS

-2

-7

Fig. 3 Wavelet correlogram indexing algorithm.

0.25

0

a

3.1.2. Wavelet transform Wavelet decomposition i s an important part o f the presented image indexing algorithm. Here, Daubechies wavelets are used because o f their regularity, separability and wmpact support properties. A comprehensive performance evaluation is currently being made in order to determine the most appropriate wavelet function to be used. Primary results demonstrated the importance of wavelet function regularity in the algorithm performance [I].Shifl invariancy i s another important requirement for image indexing algorithm. We have used a modified version o f Mallat's DWT algorithm in order to ensure the shift invariancy of the wavelet decomposition [I].

3.1.3. Wavelet Correlogram Computation Since color correlogram i s primarily computation o f the spatial correlation between different colors present in the image, a particular scheme for computing wavelet correlogram is required. Wavelet coefficients computed using LH (HL) filters correspond to low pass and high pass filtering in horizontal (vertical) and vertical (horizontal) directions, respectively. Logically, autocorrelogram computation on LH (HL) matrix can be made only in the direction o f low-pass filtering. Consecutively, monodimensional horizontal and vertical autocorrelograms are computed on LH and HL matrices, respectively. This method has less computational cost compared to the two-dimensional autocorrelogram proposed by Huang et al. [6].On the other hand, wavelet

-1s

0~25

-5

0

b

0.5

I -25

-15

7

2

t_

13

5

0.75

Noire LAsi*

-

c _

0.75

0~5

N o i n o MPrp,

3.1.1. Preprocessing The used image database 171, has one thousand color images from I O different classes provided in different sizes. In the preprocessing phase, color pictures in different input formats are first transformed to a unified gray level format. This transform i s primarily aimed to reduce the input data dimensionality while preserving image information content.

--0.75

Noi.c Mar-

b

0

IS

23

c

Fig. 4 Quantized levels in HL and LH maVices (a) scale I,(b) scale 2, and ( c ) scale 3.

3. I . 4. Feature Vector Structure The structure o f feature vector is simple. It consists of 96 real numbers computed using one-dimensional version o f Eq. (6). According to Fig. (4), wavelet coefficients are quantized into four levels in each scale. Hence, the computation of autocorrelogram in four distances ( k = l , ...,4), provides 4*4=16 real numbers for each matrix. Using two matrices (HL and LH) per scale and three consecutive scales, result in 16*2*3=96 total real numbers, equivalent to 768 bytes per feature vector.

4. RESULTS AND DISCUSSION For evaluation o f the proposed algorithm, a number o f query images were selected randomly from a 1000 image database'. The images in the database are categorized in I O classes including tribes, elephants. flowers, dinosaurs and so on. Each class contains 100 pictures in JPEG format. Five pictures from each class (totally 50) were selected to be used in the test. Figure (5) illustrates two query results. A retrieved image will be considered a match if it belongs to the same class of the query image. The developed indexing retrieval software based on wavelet correlogram, outputs ten best retrieved images in retrieval phase. Therefore, to compute the precision between k retrieved images, for every query image the following equation has been used:

2. From SlMPLlcity site http://wang.ist.psu.eduidocs/rel~

111 - 499

were combined to build a new method called wavelet correlogram. The new method demonstrated higher performance compared lo the methods based on each domain individually, such as wavelet based methods. The index vector using wavelet correlogram is fairly small and independent of the picture size.

6. REFERENCES [I] H. Abrishami Moghaddam, T. Taghizadeh Khajoie, A. H.

‘(4 (b) Fig. 5 Results obtained from query image (a) 359 and (b) 575.

p,

cn.

=‘-‘oxloo 10

n.

={

I correct answe, 0 Otherwise

(7)

The average precision is also computed using:

The total average is computed as follows:

Table ( I ) shows lhe results of the algorithm applied on the database. A total average of 71% matched retrieved images is achievable using wavelet correlogram method. The same experiment, performed using the SIMPLlcity method [7]. demonstrated a total average significantly less than the result obtained using our method.

5. CONCLUDING REMARKS In this paper, a new approach in CBlR was presented. Two different tools from pixel and transform domains

Rouhi, “Wavelet Correlogram: A New Approach for Image Indexing and retrieval”, Accepled in the 2”’ Iranian Conf on Machine Vision ond Image Process., Tehran, Iran, Feb. 2003. 121 C. Carson, M. Thomas, S. Blongie, J.M. Hellenline, and 1. Malik “Blobworld: a system for Region-Based Image Indexing and Retrieval”, Proc. of SPlE Visual Information Sysrems. pp. 509-516. June 1999. [3] Th. Gevers, “Color Based Image Retrieval”, Multimedia Search, Ed. M. Lew, Springer Verlag. Jan. 2001. 141 V. N. Gudivada J. V. Raghavan, “Special issue on Contentbased image retrieval systems“, IEEE Computer Magazine,Vol. 28, No. 9, pp. 18-22, Sep. 1995. 151 Ip. H. H. S. and D. Shen, “An afine-invariant active contour model (AI-snake) for model-based segmentation”, Image and Vision Compsling, Vol.16, No.2, pp. 135-146, Japan, Apr. 1998. [6] J. Huang, S. R. Kumar, M. Mitm, W. J. Zhu, R. Zabih, “Image Indexing Using Color correlograms”, IEEE ConJ Comp. Vision and Pattern Recognition, pp. 762-768, San Juan, 1997. [7] J. Li., J. Z. Wan& G. Wiederhold, “SIMPLlcity: Semanticssensitive Integrated Matching for Picture Llbraries”, IEEE Trans. on Parrern Analysis and Machine Intelligence, Vol. 23, No. 9, pp. 947-963, Sep. 2001. (81 S. Mallat, A wovelet lour of signal processing, 2ndEdition. Academic Press, 1999. [9] M. K. Mandal, S. Panchanathan, and T. Aboulnasr, “Fast Wavelet Histogram Techniques for Image Indexing”, Journal of Computer. Vision and Image Understanding, Vol. 75, No. 112, pp. 99-1 IO, JulylAug. 1999. [IO] A. Quddus, F. Alaya Cheikh, and M. Gabbouj, “Contentbased Object Retrieval Using Maximum Curvature Points In Contour Images”, Proc. of SPIE/EI’2000 Symposium, on Storage and Rerrievalfor Media Darabases, 2000. LI I ] A. Saijanhar and G. Lu, “A grid based shape indexing and retrieval melhod, Special issue ofAustrrrlion Computer Journal on Multimedia Sroroge and Archiving Systems, Vol. 29. No. 4,

pp. 131-140, Nov. 1997. 1121 A. W. M. Smeulders, M. Worring, S. Santini, A. Gupla, and R. Jain, “Content-Based Image Retrieval at the End of Early Years,” IEEE Trans on Partern Analysis arzd Machine Intelligeirce, Vol. 22; No. 12, Dec. 2000. 1131 M. Stricker, A. Dimai, “Spectral covariance and fuzzy regions for image indexine”. Machine Vision and Aoolienrions. .. Vol. 1O.pp. 66-73, 1997. 1141 M. J. Swain. D. H. Ballard. “Color indexing”. In1 Journal . _ ofCnmpsrer Vision, Vol. 7,pp. 11-32, 1991.

-

111 - 500

I

I

Suggest Documents