Color interest points detector for visual information retrieval

Jérôme Da Rugna and Hubert Konik
LIGIV, Université Jean Monnet, 3 rue Javelin Pagnon, 42000 St Etienne, France

ABSTRACT

The purpose of our visual information retrieval tool is to extract from a database the images that are similar to a query image. Color features are generally used to define a similarity measure between images, as they are usually very robust to noise, image degradation, and changes in size, resolution or orientation. Nevertheless, most of these features suffer from a lack of spatial knowledge about color. Our purpose is therefore to merge two classical methods, the color pyramid and interest point detection, both well known in grey-level image analysis. The relevance of this new method is demonstrated by an evaluation and a comparison with other keypoint detectors. We show its value for image indexing through concrete tests on our large image database, using the iCobra system [1].

Keywords: image, multi-resolution, pyramids, indexation, keypoints

1. INTRODUCTION

With the growth of large image databases, content-based image retrieval has become a highly challenging problem. The common approach is to extract a signature from every image, based on different features (texture, color, shape analysis, ...), and to retrieve the images that minimize a distance to a query image. Feature extraction therefore becomes the most important issue. A large panel of systems [2, 3] and methods exists, based on statistical features [4], visual parameters, color histograms [5], region-based search, and so on. Particular attention must be paid to developing features that are insensitive to intensity variation, scaling, rotation and compression effects.

In this paper, we present our approach, which combines keypoint detection with the tracking of keypoints through a multi-resolution bottom-up process. Among the different detectors, we first present the Harris detector, which is unfortunately very sensitive to a parameter that has no clear visual interpretation. To improve its use, we then couple it with a common multi-resolution tool, namely the color pyramid in a γ-corrected RGB space. Finally, we describe how numerical features are extracted from every image, and conclude with a presentation of our content-based retrieval system, called iCobra∗. The efficiency of this method is validated on a large classical color texture database composed notably of VisTex images [6].

Further author information: (Send correspondence to Jérôme Da Rugna) Jérôme Da Rugna: E-mail: [email protected], Telephone: (33)4 77 92 30 38. Work supported by the region Rhône-Alpes grant ACTIV2.
∗ http://www.ligiv.org/icobra/

2. INTEREST POINTS - THE HARRIS DETECTOR

The goal of the keypoint (or interest point) approach is to extract from an image only the most representative points, those that reduce the amount of information needed to approximately describe the entire image. In practice, interest point methods most often extract corners, because insensitivity to classical transformations is sought. The first advantage of these algorithms is thus their robustness to rotation, translation, illumination and scale effects. This is why the main applications of keypoints are pattern matching and pattern recognition. Three kinds of interest point detectors can be distinguished:

• Differential methods. Differential invariants are used to extract corners and then keypoints. Most of them are adapted from the Harris detector [7], and this kind of algorithm is a good approach for point pairing.
• Pyramidal methods. Contrast pyramids extract points that carry more visual information, since these methods do not principally extract corners [8]. Because of the pyramidal approach, this kind of method is less sensitive to compression and noise.
• Filter-based methods. By using circular or adaptive filters, these methods are robust to noise and extract visual interest points [9].

The Harris detector is a reference in keypoint extraction. Many methods are in fact adaptations, for a specific kind of image or problem, of the first method introduced by Harris, itself based on the Plessey detector. We define in this paper the classical Harris method, as the other variants have more or less the same properties.

Definition 2.1. An image I is defined by the intensity function I(x, y). $I_x$ and $I_y$ are its derivatives with respect to x and y. Here, "image" denotes a grey-level image only.

Definition 2.2. C is the matrix defined by

\[ C = \begin{pmatrix} I_x^2 & I_x \cdot I_y \\ I_x \cdot I_y & I_y^2 \end{pmatrix} \]

Smoothing (Gaussian) filters are generally applied before computing the differential images.

Definition 2.3. The image $H_k$ is then defined by

\[ H_k = \mathrm{Det}[C] - k \cdot \mathrm{Trace}^2[C] \]

Definition 2.4. Interest points are the local maxima of the image $H_k$.

Figure 1 illustrates the detector for two different values of k on the same image.

Figure 1: Example of Harris detector results with different values of k: (a) k = 0.4, (b) k = 0.05.
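Definitions 2.1 to 2.3 can be illustrated with a minimal sketch in plain Python. Several choices here are assumptions and not taken from the paper: central differences for the derivatives, a 3x3 box average standing in for the Gaussian filter, and the averaging of the products $I_x^2$, $I_x \cdot I_y$ and $I_y^2$ over a window (the usual Harris and Stephens step, without which the determinant of C is zero at every pixel). The function names are hypothetical.

```python
def gradients(img):
    """Central-difference derivatives Ix, Iy of a 2-D list (borders replicated)."""
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ix[y][x] = (img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]) / 2.0
            iy[y][x] = (img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]) / 2.0
    return ix, iy


def box_smooth(img, radius=1):
    """Mean over a (2*radius+1)^2 window; an assumed stand-in for a Gaussian."""
    h, w = len(img), len(img[0])
    n = (2 * radius + 1) ** 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / n
    return out


def harris_response(img, k=0.05):
    """H_k = Det[C] - k * Trace^2[C], with the entries of C averaged locally."""
    ix, iy = gradients(img)
    h, w = len(img), len(img[0])
    a = box_smooth([[ix[y][x] ** 2 for x in range(w)] for y in range(h)])
    b = box_smooth([[ix[y][x] * iy[y][x] for x in range(w)] for y in range(h)])
    c = box_smooth([[iy[y][x] ** 2 for x in range(w)] for y in range(h)])
    return [[a[y][x] * c[y][x] - b[y][x] ** 2 - k * (a[y][x] + c[y][x]) ** 2
             for x in range(w)] for y in range(h)]


# Toy image: a bright quadrant whose corner sits at row 4, column 4.
image = [[1.0 if (x >= 4 and y >= 4) else 0.0 for x in range(9)] for y in range(9)]
response = harris_response(image, k=0.05)
```

On this toy image the response is positive at the corner, negative along the straight edges and zero in the flat region, which is the behaviour that makes the local maxima of $H_k$ corner-like.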

The complexity of the method is low: it only needs to compute first-order derivatives, apply a few binary operations and extract local maxima. The first problem of the method, however, is the large number of choices left to the programmer, each of which strongly influences the results:

• The choice of filters and how they are applied.
• The computation of the differential images.
• The maxima extraction and the noise filtering.
• The value of k, usually fixed around 0.04 - 0.05.

As the definition of k is not clear, its interpretation is very difficult. While the density of $H_k$ increases with k, the number of maxima behaves quite differently, as shown in Figure 3(b). It is therefore not possible to define a priori the number of interest points, as the number of maxima is not linear in k. We can also note that this method, like many differential methods, is not very robust to compression such as JPEG, which is commonly used in large image databases.

In summary, the Harris detector is a classical, fast and easy-to-apply method for corner (keypoint) extraction, but it suffers from parameters that are hard to control, particularly in a content-based image retrieval application. Let us now present our first approach for adapting the Harris detector to the image indexing context.
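The maxima extraction named in the list above is one of the implementer's choices. The sketch below shows one possible option, assumed here for illustration: a pixel is kept when it is strictly greater than its 8 neighbours and above a threshold that acts as a crude noise filter. The function name and the threshold parameter are hypothetical.

```python
def local_maxima(response, threshold=0.0):
    """Return (row, col) of interior pixels strictly above all 8 neighbours."""
    h, w = len(response), len(response[0])
    points = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            value = response[y][x]
            if value <= threshold:
                continue  # reject weak responses (simple noise filtering)
            neighbours = [response[y + dy][x + dx]
                          for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                          if (dy, dx) != (0, 0)]
            if all(value > n for n in neighbours):
                points.append((y, x))
    return points


# Toy response map with a single clear peak at row 2, column 2.
toy = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 1, 0],
    [0, 2, 5, 2, 0],
    [0, 1, 2, 1, 0],
    [0, 0, 0, 0, 0],
]
peaks = local_maxima(toy)
strong_peaks = local_maxima(toy, threshold=10)
```

Counting the points returned for several values of k is a simple way to observe that the number of maxima does not vary linearly with k.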

3. GREY LEVEL ADAPTED HARRIS FOR TEXTURE INDEXATION

In this approach the goal is not to build a new keypoint detector but to analyze how the Harris detector can be used in image indexing, especially in texture database indexing. Even if applying a keypoint detection algorithm to (mono-)textured images may not seem logical, it is nevertheless interesting to explore this direction by adapting the Harris detector. In this section we work only on grey-level images, so we can use a classical Gaussian pyramid to extract a set of multi-resolution images from the initial image. Only the first levels of the pyramid are computed, so the pyramid can be extracted from images of size $a \cdot 2^n \times b \cdot 2^n$, with n usually between 2 and 5. The study is thus not restricted to square images.

On each pyramid level a set of interest points is extracted with the Harris detector, and features are then computed in order to describe the entire image. Each interest point in fact represents a local region: the local invariants around this point carry useful information. The features of the image then consist of the average and standard deviation of all local invariants computed at each point. An image is thus described by a vector built from this information for each level of the pyramid. The pyramid provides a multi-resolution approach that gives a better characterization, closer to the visual perception of the image contents. Figure 2 shows a result on color textures. However, as color perception is still ignored, this approach is not sufficient for indexing color image databases. Let us first define the color Gaussian pyramid before introducing our new keypoint detector.
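The grey-level pyramid and the per-level features can be sketched as follows. This is only an illustration under stated assumptions: the 5-tap binomial kernel (1, 4, 6, 4, 1)/16 is a common choice that the paper does not specify, the local invariants are taken as an already computed list of numbers, and all function names are hypothetical.

```python
import statistics

# Assumed 5-tap binomial kernel; the paper does not give its filter.
KERNEL = (1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16)


def blur(img):
    """Separable 5-tap smoothing with replicated borders."""
    h, w = len(img), len(img[0])
    rows = [[sum(k * img[y][min(max(x + i - 2, 0), w - 1)]
                 for i, k in enumerate(KERNEL))
             for x in range(w)] for y in range(h)]
    return [[sum(k * rows[min(max(y + i - 2, 0), h - 1)][x]
                 for i, k in enumerate(KERNEL))
             for x in range(w)] for y in range(h)]


def reduce_level(img):
    """Smooth, then keep every second row and column."""
    return [row[::2] for row in blur(img)[::2]]


def gaussian_pyramid(img, levels):
    """Level 0 is the input image; each further level halves both dimensions."""
    pyramid = [img]
    for _ in range(levels):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid


def level_features(invariants):
    """Mean and (population) standard deviation of the invariants of one level."""
    return statistics.fmean(invariants), statistics.pstdev(invariants)


# A non-square image of size (3 * 2^2) x (5 * 2^2) = 12 x 20, reduced twice.
image = [[7.0] * 20 for _ in range(12)]
pyramid = gaussian_pyramid(image, levels=2)
sizes = [(len(level), len(level[0])) for level in pyramid]
```

The image signature would then be the concatenation of the `level_features` pairs over all pyramid levels.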

4. COLOR GAUSSIAN PYRAMID

As described in [10], a color Gaussian pyramid is a powerful tool for image processing. The basic idea of the pyramid structure, first formalized for grey-level images only, is to produce a stack of interrelated images with progressively reduced resolution. The sampling rate of these lower-resolution images is reduced in accordance with the elimination of the higher frequencies. Considering the grey-level pyramid and the different color spaces, let us introduce the color pyramid. The Gaussian pyramid construction consists mainly of two steps: a Gaussian convolution and a subsampling operation. Applied to a color image, the first step, which is a linear operation, amounts to performing an additive color mixture. The γ-corrected RGB color space can be used to compute the additive color mixture, using Grassmann's laws, as outlined by Berns et al. [11]. This construction yields a true color image at each level of the pyramid, rather than three distinct pyramids, one per dimension of the color space. Our method simulates the human visual system by mimicking the focus-of-attention principle, assuming that there exists a resolution at which the detection of points of interest is appropriate. Let us now construct a color interest point detector.
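The idea of an additive color mixture can be sketched as below, under assumptions that go beyond the text: the γ correction is read as a conversion of 8-bit values to linear light with an assumed exponent of 2.2, and the Gaussian convolution is replaced by a plain 2x2 average to keep the example short. The names `GAMMA`, `to_linear`, `to_encoded` and `reduce_color` are hypothetical.

```python
GAMMA = 2.2  # assumed display gamma; the paper does not give a value


def to_linear(code):
    """Map an 8-bit code value to linear light in [0, 1]."""
    return (code / 255.0) ** GAMMA


def to_encoded(linear):
    """Map linear light in [0, 1] back to an 8-bit code value."""
    return round(255.0 * linear ** (1.0 / GAMMA))


def reduce_color(img):
    """Halve a color image (rows of (R, G, B) tuples).

    Each 2x2 block is mixed additively in linear light, so every output
    pixel is a single color triplet rather than three independent values.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(0, h - 1, 2):
        row = []
        for x in range(0, w - 1, 2):
            block = [img[y][x], img[y][x + 1], img[y + 1][x], img[y + 1][x + 1]]
            row.append(tuple(
                to_encoded(sum(to_linear(p[ch]) for p in block) / 4.0)
                for ch in range(3)
            ))
        out.append(row)
    return out


# A uniform image is unchanged; a black and white checker mixes to a grey
# that is brighter than the naive average of the code values.
uniform = [[(200, 100, 50)] * 4 for _ in range(4)]
checker = [[(0, 0, 0), (255, 255, 255)], [(255, 255, 255), (0, 0, 0)]]
reduced_uniform = reduce_color(uniform)
reduced_checker = reduce_color(checker)
```

Mixing in linear light is what distinguishes this from averaging the stored code values directly, which would darken the coarse levels.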

Figure 2: iCobra: the most similar images, from left to right, top to bottom.

5. COLOR INTEREST POINTS DETECTOR

Here we present our construction in the YUV color space, but with minor changes it can be applied to any other color space, especially classical non-linear ones such as L*a*b*. Although the color pyramid cannot be constructed in a non-linear color space, such spaces make it possible to better estimate human visual perception.

The first step is to compute a specific image on each plane of the γ-corrected RGB space for each pyramid level. For this, we apply the Harris algorithm directly to each plane but, instead of extracting local maxima, we compute only the image $H_k$ (Definition 2.3). The choice of k, even if it is not critical, of course has some influence on the result. Fixing k to 0.05, an empirical value, seems to be the best general choice for our goal, color texture classification.

The second step is to evaluate the persistence of a keypoint along the Gaussian pyramid and the color space, which makes it possible to select only the interesting points. The triplet $(H_k^R, H_k^G, H_k^B)$ is used to compute an image in YUV space for each pyramid level. This image can be seen as a perceptual image of the texture. We thus obtain a pyramid of color images called $H_k^{YUV}$. As in the Harris detector definition, the geometric positions of the interest points are the local maxima of the Y plane. The persistence of points is evaluated first by the geometric persistence through the pyramid and second with local color invariants around each point, computed in YUV space. Indeed, if corners are sought, for example, the local color dispersion of the selected keypoints has to be large.

Figure 3: Keypoints Pyramid. (a) Keypoints Pyramid; (b) Keypoints Density (axis ticks from 9000 to 15000 and from -0.1 to 0.3).

The last part consists in computing features on the limited number of selected keypoints. This paper presents results with local invariants calculated around each selected point. These invariants have to be representative of the local region of each point. We used first- and second-order moments and a measure of local color dispersion [12].
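The two ingredients of this section, the conversion of the triplet to YUV and the geometric persistence of a point through the pyramid, can be sketched as follows. Both parts rest on assumptions: the BT.601 coefficients are a standard YUV definition that the paper does not state, and persistence is interpreted here as the number of consecutive coarser levels in which a maximum is still found near the subsampled position. The function names and the `tolerance` parameter are hypothetical.

```python
def rgb_to_yuv(r, g, b):
    """BT.601 conversion (assumed; the paper does not state its coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


def persistence(point, maxima_per_level, tolerance=1):
    """Count consecutive coarser levels where `point` is still near a maximum.

    `point` is a (row, col) position at level 0, and `maxima_per_level[i]`
    lists the (row, col) maxima of the Y plane at pyramid level i.
    """
    row, col = point
    count = 0
    for maxima in maxima_per_level[1:]:
        row, col = row // 2, col // 2  # position after one 2x subsampling
        near = any(abs(row - r) <= tolerance and abs(col - c) <= tolerance
                   for r, c in maxima)
        if not near:
            break
        count += 1
    return count


# A response triplet (H_k^R, H_k^G, H_k^B) with equal values has no chroma.
y, u, v = rgb_to_yuv(1.0, 1.0, 1.0)

# The point (8, 8) is matched at levels 1 and 2, then disappears at level 3.
levels = [[(8, 8)], [(4, 4)], [(2, 3)], []]
depth = persistence((8, 8), levels)
```

Keeping only the points whose persistence exceeds some minimum is one way to obtain the limited set of keypoints on which the features are computed.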

6. INDEXATION APPROACH

Table 1: Comparative results on color texture database

Method                       First error found   Recognize Rate
Co-occurrence Matrix         11                  81
Pyramid sigma                10                  75
Grey level adapted Harris    12                  82
Color Pyramid Detector       15                  91

Figure 4: Examples of Color Pyramidal detector results: (a) Texture, (b) Scene images.

Table 2: Standard Deviation

Method                       Intra-class   Inter-class
Pyramid sigma                0.023         0.027
Grey level adapted Harris    0.011         0.048
Color Pyramid Detector       0.0091        0.058

On our large texture database (more than 4000 images) we established a protocol for estimating the power of our method in texture classification. Our panel of textures is a good representation of the diversity of textures, with different perceptual properties such as directionality, regularity, complexity and coarseness [13]. For each different texture (in high resolution), we extract a subset of medium-resolution texture images (size: 256x256). These subsets constitute the classes, since by construction all the textures of one subset are similar. The principle of our texture classification test is to classify all the subsets created in this way, using the k-nearest neighbor method (k-NN). We call “first error found” the mean rank of the first nearest neighbor that is not in the class of the query texture, and “Recognize Rate” the rate of correct classification by the k-NN method. Tables 1 and 2 show a comparative study of the different methods implemented in iCobra. The sigma-pyramid method, for example, consists in tracking mean-square statistics through the different levels of the pyramid.
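The two measures of this protocol can be sketched on toy data as follows. The Euclidean distance, the value k = 3, the majority vote and the function names are assumptions made for the illustration; the paper does not specify the distance or the value of k used in iCobra.

```python
import math
from collections import Counter


def ranked_neighbours(query_index, features):
    """Indices of all other items, sorted by Euclidean distance to the query."""
    query = features[query_index]
    others = [i for i in range(len(features)) if i != query_index]
    return sorted(others, key=lambda i: math.dist(query, features[i]))


def first_error_rank(query_index, features, labels):
    """1-based rank of the first neighbour outside the query's class."""
    for rank, i in enumerate(ranked_neighbours(query_index, features), start=1):
        if labels[i] != labels[query_index]:
            return rank
    return len(features)  # every neighbour belongs to the query's class


def knn_label(query_index, features, labels, k=3):
    """Majority class among the k nearest neighbours of the query."""
    nearest = ranked_neighbours(query_index, features)[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


def evaluate(features, labels, k=3):
    """Return (mean first error rank, rate of correct k-NN classification)."""
    n = len(features)
    mean_first_error = sum(first_error_rank(i, features, labels)
                           for i in range(n)) / n
    rate = sum(knn_label(i, features, labels, k) == labels[i]
               for i in range(n)) / n
    return mean_first_error, rate


# Two well separated toy classes of three one-dimensional signatures each.
features = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
labels = ["a", "a", "a", "b", "b", "b"]
mean_first_error, rate = evaluate(features, labels, k=3)
```

A larger first error rank means that more same-class images are retrieved before the first mistake, which is why both columns of Table 1 are read as "higher is better".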

7. CONCLUSION

We have presented an efficient and low-cost method, based on coupling the Harris detector with tracking through a color Gaussian pyramid, to achieve a robust database retrieval tool for color texture images. Nevertheless, this is still an early stage towards a complete content-based image retrieval system, named iCobra, whose user interface can be viewed from any browser. Moreover, the system makes it easy to include new features and to test them on a large database. Objectively, the system seems insufficient to discriminate scene images in particular, for which our method has general limitations. For instance, the Harris detector suffers from the lack of relevance of its parameter k, although this is improved by the tracking through a bottom-up process in a color Gaussian pyramid.

Figure 5: iCobra: Color adapted detector

Figure 6: iCobra: Color adapted detector

Nevertheless, even if this approach seems quite attractive for minimizing the visual information contained in the image, coarse segmentation tools should make it easier to define similarity between images. Indeed, the effectiveness of a content-based retrieval system strongly depends on the way the “visual information” is defined, and the region-based approach [14] seems better adapted. The next step of our work will therefore consist in quantifying the similarity between regions of interest through an adaptive blob selection process for querying interfaces.

REFERENCES

1. “http://www.ligiv.org/icobra/.” LIGIV Lab.
2. M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, and P. Yanker, “Query by image and video content: the QBIC system,” IEEE Computer 28(9), pp. 310–315, 1995.
3. J. Z. Wang, J. Li, and G. Wiederhold, “SIMPLIcity: Semantics-sensitive integrated matching for picture libraries,” IEEE Transactions on Pattern Analysis and Machine Intelligence 23(9), pp. 947–963, 2001.
4. A. Del Bimbo, Visual Information Retrieval, Morgan Kaufmann Publishers, Inc., 1999.
5. M. J. Swain and D. H. Ballard, “Indexing via color histograms,” in IEEE Proceedings, pp. 390–393, 1990.
6. “http://www-white.media.mit.edu/vismod/imagery/visiontexture/vistex.html.” MIT Media Lab.
7. C. Harris and M. Stephens, “A combined corner and edge detector,” 4th Alvey Vision Conf., Manchester, pp. 189+, 1988.
8. S. Bres and J.-M. Jolion, “Multiresolution contrast based detection of interest points,” Tech. Rep. RR-98-02, LRFV INSA - Lyon, 1998.
9. S. Smith and J. Brady, “SUSAN - a new approach to low level image processing,” Int. Journal of Computer Vision 23, pp. 45–78, May 1997.
10. H. Konik, V. Lozano, and B. Laget, “Color pyramids for image processing,” Journal of Imaging Science and Technology 40, pp. 535–542, Nov. 1996.
11. R. S. Berns, R. J. Motta, and M. E. Gorzynski, “CRT colorimetry. Part I: Theory and practice,” Color Research and Application 18, pp. 299–314, Oct. 1993.
12. A. Trémeau and P. Colantoni, “Color difference descriptors for image analysis,” in OSA Annual Conference, (Baltimore), Oct. 1998.
13. A. R. Rao, A Taxonomy for Texture Description and Identification, Springer-Verlag, 1990.
14. C. Carson, M. Thomas, S. Belongie, J. M. Hellerstein, and J. Malik, “Blobworld: A system for region-based image indexing and retrieval,” Visual Information Systems, 1999.
