CBIR using Perception based Texture and Colour Measures

Sanjoy Kumar Saha, CSE Department, Jadavpur Univ., India
Amit Kumar Das, CST Department, B.E. College, India
Bhabatosh Chanda, Electronics & Communication Sciences Unit, ISI, Kolkata, India

Abstract

We have designed and implemented an experimental CBIR system that uses a texture co-occurrence matrix. Fuzzy indices of the major colours are also used as colour features to improve performance. A new measure is suggested to find the relevance of the retrieved images and to evaluate the CBIR system. Accordingly, a performance study of the proposed system is carried out and compared with a similar system. The study establishes the effectiveness of the features.

Keywords: CBIR, texture co-occurrence matrix, fuzzy index of colour, performance measure, relevance measure

1. Introduction

Since the early 90's, content based image retrieval has become a very active research area. Many image retrieval systems, such as QBIC, MARS, Virage and FIDS, have been built. CBIR systems can be classified broadly into two categories: (i) low level feature based systems and (ii) high level/semantic feature based systems. Low level features are general features computed from pixel values. High level features, on the other hand, are abstract attributes involving a significant amount of reasoning. Our work falls into the first category.

Shape, texture and colour are the three main groups of features used in CBIR systems. A number of schemes on shape based retrieval have been reported in the literature, where shape is represented by circularity and eccentricity [16], Fourier descriptors [22], moment invariants [5], a histogram of values after applying a Sobel edge filter [1], an edge layout vector [11], a histogram of angles between two colour edges [7], the shape context descriptor [17], etc. A variety of techniques have been used for measuring texture, such as the co-occurrence matrix [8], Gabor filters [15] and fractals [10]. From these it is possible to measure features like contrast, coarseness, directionality and regularity, as used in a number of systems like QBIC, NETRA and MARS. A number of systems have explored different variations of the wavelet transform [18, 19, 14, 1] to extract appropriate texture features. Several systems have used colour histograms based on different colour models like RGB, HSV, HLS etc. [16, 20, 13]. Some systems maintain an adjacency matrix [3], colour layout vector [11], colour correlogram [9], colour coherence vector [2], etc. Yu et al. [21] have proposed 48-dimensional colour texture moments based on the Fourier transform.

Here, we present a CBIR system using texture and colour features. This paper has five sections. Section 2 describes the computation of the features. Section 3 suggests the performance measure. Experimental results are presented in section 4. Concluding remarks are given in section 5.

2. Computation of Features
The term 'texture' is used to specify the roughness or coarseness of an object surface. In an intensity image, texture puts its signature as the variation in intensity from pixel to pixel. Here we propose a texture co-occurrence matrix for describing the texture of the image. A fuzzy index of colour based on the hue histogram is also used for improved performance.

Texture Co-occurrence Matrix: Usually a small patch of finite area of an image is required to perceive or measure local texture. The smallest region for this purpose could be a block. So, in order to compute the texture co-occurrence matrix, the intensity image is divided into blocks of size 2 × 2. The grey levels of each block are then converted to binary by thresholding at the average intensity of the block. This operation is the same as the method of obtaining the binary pattern in block truncation coding [4]. The binary pattern obtained this way provides an idea of the local texture within the block. By arranging this pattern in raster order, a binary string is formed. It is treated as a Gray code, and the corresponding decimal equivalent is the texture value of the block. Thus, by virtue of the Gray code, blocks with similar texture are expected to have closer values. Some examples of blocks and their corresponding texture values are shown in Fig. 1. We get 15 such texture values, as the block of all 1's does not occur. A problem with this approach is that a smooth intensity block [see Fig. 1(e)] and
0-7695-2128-2/04 $20.00 (C) 2004 IEEE
a coarse textured block [see Fig. 1(d)] may produce the same binary pattern and hence the same texture value. To surmount this problem, we define a smooth block as one having intensity variance less than a small threshold. In our experiment, the threshold is 0.0025 times the average intensity variance computed over all the blocks. All such smooth blocks have texture value 0. Thus we get an image, scaled both in space and in value, whose height and width are half those of the original image and whose pixel values range from 0 to 15 except 10 (the all-1 combination). This new image may be considered as the image representing the texture of the original image [see Fig. 2].
Figure 1. Blocks and texture values: example intensity blocks with their binary patterns, Gray codes and resulting texture values.
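The per-block texture value described above can be sketched as follows. This is a minimal illustration, not the authors' code: the 2 × 2 block size and strict thresholding (so that an all-1 pattern never occurs) are inferred from the text, and the smoothness threshold is passed in as a plain number.

```python
import numpy as np

def gray_to_decimal(bits):
    """Decode a bit sequence interpreted as a Gray code into its decimal value."""
    value = 0
    prev = 0
    for b in bits:
        prev ^= int(b)              # binary digit = running XOR of Gray bits
        value = (value << 1) | prev
    return value

def texture_value(block, smooth_thresh):
    """Texture value of one 2x2 intensity block (illustrative sketch)."""
    block = np.asarray(block, dtype=float)
    if block.var() < smooth_thresh:  # smooth block -> texture value 0
        return 0
    # Binarise strictly above the block mean, raster order, then decode
    # the resulting bit string as a Gray code.
    bits = (block > block.mean()).astype(int).ravel()
    return gray_to_decimal(bits)
```

Note that the all-1 pattern decodes to 10 as a Gray code, which is why the texture values range over 0 to 15 except 10.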
Finally, considering the left-to-right and top-to-bottom directions, a co-occurrence matrix of size 16 × 16 is computed from this texture image. To make this matrix translation invariant, the block frames are shifted by one pixel horizontally, vertically and in both directions, and a co-occurrence matrix is computed for each case. To make the measure flip invariant, co-occurrence matrices are also computed for the mirrored image. Thus, we have sixteen such matrices. We then take the element-wise average of all the matrices and normalize it to obtain the final one. In the case of a landscape, this is computed over the whole image, while in the case of an image containing dominant object(s), the texture feature is computed over the segmented region(s) of interest only.
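A sketch of one such co-occurrence matrix, together with typical statistics that can be read from the normalised result, is given below. This assumes texture values 0–15; the shifted-grid and mirrored variants would be computed the same way and averaged, and the choice of moment orders is illustrative.

```python
import numpy as np

def cooccurrence(tex, levels=16):
    """Normalised co-occurrence matrix of a 2-D array of texture values,
    counting left-to-right and top-to-bottom neighbour pairs."""
    tex = np.asarray(tex)
    m = np.zeros((levels, levels), dtype=float)
    h, w = tex.shape
    for i in range(h):
        for j in range(w):
            if j + 1 < w:                       # left-to-right neighbour
                m[tex[i, j], tex[i, j + 1]] += 1
            if i + 1 < h:                       # top-to-bottom neighbour
                m[tex[i, j], tex[i + 1, j]] += 1
    total = m.sum()
    return m / total if total else m

def matrix_features(p, orders=(1, 2, 3)):
    """Entropy, energy and difference moments of a normalised matrix p."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    entropy = -(nz * np.log2(nz)).sum()
    energy = (p ** 2).sum()
    i, j = np.indices(p.shape)
    diff = np.abs(i - j)
    moments = [float((p * diff ** k).sum()) for k in orders]
    return entropy, energy, moments
```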
The texture co-occurrence matrix provides a detailed description of the image texture, but handling such a multi-valued feature is always difficult, particularly in the context of indexing and comparison cost. Hence, to obtain more perceivable features, statistical measures like entropy, energy and texture moments [8] are computed from this matrix. We have considered moments only up to a low order, as the higher orders are not perceivable. The use of the Gray code has enabled us to measure homogeneity and variation in texture.

Fuzzy index of colour: Colour is represented using the HSV model, and a hue histogram is formed. The hue histogram thus obtained cannot be used directly for searching similar images. For example, a red image and an almost red image (with similar contents) are visually similar, but their hue histograms may differ. Hence, to compute the colour features, the hue histogram is first smoothed with a Gaussian kernel and normalized. Then, for each of the six major colours (red, yellow, green, cyan, blue and magenta), an index of fuzziness is computed as follows. It is assumed that, in the ideal case, for an image with one dominant colour of a given hue, the hue histogram would follow a Gaussian distribution centred at that hue with some standard deviation. In our experiment, the standard deviation is chosen so that most of the population falls within a fixed interval around the mean. The Bhattacharyya distance [6] between the actual distribution and this ideal one indicates the closeness of the image colour to that hue; a decreasing function of this distance therefore gives a measure of similarity between the two distributions. Finally, an S-function [12] maps this similarity to a fuzzy membership value.
Thus, membership values corresponding to red, yellow, green etc. are obtained. In our experiment, the standard deviation is taken as 15.
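The fuzzy colour index can be sketched as below. This is a hedged illustration: the 360-bin hue histogram, the use of the Bhattacharyya coefficient (which is 1 for identical distributions and decreases with distance), and the S-function breakpoints are assumptions; only the overall pipeline follows the description above.

```python
import numpy as np

def bhattacharyya_coeff(p, q):
    """Bhattacharyya coefficient of two discrete distributions (1 = identical)."""
    return float(np.sum(np.sqrt(np.asarray(p) * np.asarray(q))))

def s_function(x, a=0.0, b=0.5, c=1.0):
    """Standard S-function mapping a similarity in [0, 1] to a fuzzy membership."""
    if x <= a:
        return 0.0
    if x >= c:
        return 1.0
    if x <= b:
        return 2 * ((x - a) / (c - a)) ** 2
    return 1 - 2 * ((x - c) / (c - a)) ** 2

def fuzzy_colour_index(hue_hist, major_hue, sigma, n_bins=360):
    """Fuzzy membership of the image colour in the major colour at major_hue."""
    bins = np.arange(n_bins)
    ideal = np.exp(-0.5 * ((bins - major_hue) / sigma) ** 2)
    ideal /= ideal.sum()               # ideal single-colour hue distribution
    return s_function(bhattacharyya_coeff(hue_hist, ideal))
```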
3. Performance Measure

Evaluation of the performance of a CBIR system is a major challenge. A collective effort is going on to set up a platform for benchmarking CBIR systems (http://www.benchathlon.net). While looking for a number of similar images, precision is an important measure. It may be computed as

Precision = (number of relevant images retrieved) / (total number of images retrieved)
Figure 2. An image and corresponding texture image.
Obviously, to facilitate such a measure, the database has to be groundtruthed. If the images contain exactly one dominant object, then a retrieved image will either match the query image or not. The situation becomes non-trivial when the images depict a natural scene or a collection of many significant objects. The groundtruth is then a set of keywords describing the scene. Now the concept of partial match/relevance comes into the picture. To cope with
this situation, we suggest a modified definition of precision as

Precision = (sum of the relevance values of the retrieved images) / (total number of images retrieved)
where the relevance value (RV) is the degree of matching and varies from 0 to 1, both inclusive. A liberal approach considers the relevance value to be 1 if the groundtruth of the query image and that of the retrieved image have at least one keyword in common, and 0 otherwise. Such a relevance measure leads back to the initial definition of precision, but the liberal approach may count almost dissimilar images as a perfect match. A pragmatic approach provides a solution to this problem. If all the components of the query image are present in a retrieved image, or vice versa, then it is considered a full match and the relevance value is taken as 1. Otherwise, it is either a partial match or a complete mismatch, and the relevance value is computed as:
RV = K / N

where N is the minimum of the number of keywords in the query image and that in the retrieved image, and K is the number of keywords common to the retrieved and query images. All these measures can be further improved if the groundtruth provides the relative ordering of the importance of the different keywords in an image.
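The pragmatic relevance value and the modified precision can be sketched as follows; the keyword sets stand in for the groundtruth annotations, and the function names are illustrative.

```python
def relevance(query_kw, retrieved_kw):
    """RV = 1 for a full match (one keyword set contains the other), else K/N."""
    q, r = set(query_kw), set(retrieved_kw)
    if q <= r or r <= q:
        return 1.0                 # full match
    k = len(q & r)                 # K: keywords in common
    n = min(len(q), len(r))        # N: smaller keyword count
    return k / n                   # partial match (0 when nothing is shared)

def precision(query_kw, retrieved_lists):
    """Modified precision: mean relevance value over the retrieved images."""
    rvs = [relevance(query_kw, kw) for kw in retrieved_lists]
    return sum(rvs) / len(rvs)
```

With the liberal approach, `relevance` would simply return 1.0 whenever `q & r` is non-empty.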
4. Experimental results

We have used a feature vector some of whose components represent texture, while the remaining six correspond to the fuzzy indices of the six major colours. The distance between two images is computed from the Euclidean distance between their feature vectors. The database used by us consists of around 1000 groundtruthed images downloaded from http://www.cs.washington.edu/research/imagedatabase; it describes each individual image by a set of phrases or keywords. The performance of our system is studied by using each image in the database as the query image and retrieving the top 10 similar images by Euclidean distance based exhaustive search. In each case, the query image itself has appeared as the best match. A few results are presented in Fig. 3.
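The exhaustive search step can be sketched as below, assuming the feature vectors have already been computed; the function name and signature are illustrative.

```python
import numpy as np

def retrieve(query_vec, db_vecs, top_k=10):
    """Exhaustive search: indices of the top_k nearest feature vectors by
    Euclidean distance (the query itself comes first when it is in the db)."""
    db = np.asarray(db_vecs, dtype=float)
    q = np.asarray(query_vec, dtype=float)
    d = np.linalg.norm(db - q, axis=1)   # distance of every db image to query
    return list(np.argsort(d)[:top_k])   # indices sorted by increasing distance
```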
To compare the retrieval performance of our system, we have also implemented the system proposed by Yu et al. [21]. Analysing the groundtruth information, we have adopted both the liberal and the pragmatic approach for relevance value computation. Precision figures are shown in Table 1. It is clear from Table 1 that the proposed features have considerable capability and that the performance is better than that of the system proposed by Yu et al. [21].
Figure 3. Six sets of results; each spread over two rows and the first one is the query image for all six sets.
Table 1: All figures indicate % precision of retrieval.

            Liberal Approach          Pragmatic Approach
            Our System   Yu's System  Our System   Yu's System
  P(10)     83.54        74.94        69.68        55.51
  P(20)     81.87        73.71        64.76        51.49
  P(30)     80.42        72.88        60.96        49.24
5. Conclusion

In this paper we have presented the idea of computing local texture based on blocks of pixels of the intensity image, together with a fuzzy index to denote the presence of the major colours. Based on the texture co-occurrence matrix, a few perceivable features have been proposed to reduce the comparison cost. We have also suggested relevance and performance measures. Accordingly, our system has been evaluated, and the results indicate that the proposed features bear strong potential for improving retrieval performance.
References

[1] A. P. Berman and L. G. Shapiro. A flexible image database system for content-based retrieval. Computer Vision and Image Understanding, 75:175–195, 1999.
[2] I. J. Cox, M. L. Miller, T. P. Minka, T. Papathomas, and P. N. Yianilos. The Bayesian image retrieval system, PicHunter: Theory, implementation and psychophysical experiments. IEEE Transactions on Image Processing, 9(1):20–37, 2000.
[3] M. Das, E. M. Riseman, and B. Draper. FOCUS: Searching for multi-colored objects in a diverse image database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 756–761, 1997.
[4] E. J. Delp and O. R. Mitchell. Image compression using block truncation coding. IEEE Transactions on Communications, 27:1335–1342, 1979.
[5] S. A. Dudani, K. J. Breeding, and R. B. McGhee. Aircraft identification by moment invariants. IEEE Transactions on Computers, C-26:39–45, January 1977.
[6] K. Fukunaga. Introduction to Statistical Pattern Recognition. Academic Press, NY, USA, 1972.
[7] T. Gevers and A. Smeulders. PicToSeek: Combining color and shape invariant features for shape retrieval. IEEE Transactions on Image Processing, 9(1):102–119, 2000.
[8] R. M. Haralick, K. Shanmugam, and I. Dinstein. Texture features for image classification. IEEE Transactions on Systems, Man, and Cybernetics, 3(11):610–622, 1973.
[9] J. Huang, S. Kumar, M. Mitra, W. J. Zhu, and R. Zabih. Image indexing using color correlograms. In IEEE Conference on Computer Vision and Pattern Recognition, 1997.
[10] L. M. Kaplan. Fast texture database retrieval using extended fractal features. SPIE 3312, SRIVD VI:162–173, 1998.
[11] Z. N. Li, D. R. Zaiane, and Z. Tauber. Illumination invariance and object model in content-based image and video retrieval. Journal of Visual Communication and Image Representation, 10(3):219–244, 1999.
[12] C. T. Lin and C. S. G. Lee. Neural Fuzzy Systems. Prentice Hall, NJ, 1996.
[13] W. Y. Ma and B. S. Manjunath. NeTra: A toolbox for navigating large image databases. Multimedia Systems, 7(3):184–198, 1997.
[14] W. Y. Ma and B. S. Manjunath. A comparison of wavelet transform features for texture image annotation. In IEEE International Conference on Image Processing, pages 256–259, 1995.
[15] B. S. Manjunath and W. Y. Ma. Texture features for browsing and retrieval of image data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18:837–842, 1996.
[16] W. Niblack, R. Barber, W. Equitz, M. Flickner, E. Glasman, D. Petkovic, P. Yanker, C. Faloutsos, and G. Taubin. The QBIC project: Querying images by content using color, texture and shape. SPIE, SRIVD, 1996.
[17] S. Belongie, J. Malik, and J. Puzicha. Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(4):509–522, 2002.
[18] J. R. Smith and S. F. Chang. Transform features for texture classification and discrimination in large image databases. In IEEE International Conference on Image Processing, volume 3, pages 407–411, 1994.
[19] J. R. Smith and S. F. Chang. Automated binary texture feature sets for image retrieval. In IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 2239–2242, USA, 1996.
[20] M. J. Swain. Interactive indexing into image databases. In SPIE Storage and Retrieval for Image and Video Databases, volume 1908, 1993.
[21] H. Yu, M. Li, H.-J. Zhang, and J. Feng. Color texture moments for content-based image retrieval. In IEEE International Conference on Image Processing, New York, USA, September 2002.
[22] C. T. Zahn and R. Z. Roskies. Fourier descriptors for plane closed curves. IEEE Transactions on Computers, C-21(1):269–281, 1972.