A METHOD FOR COLOR CONTENT MATCHING OF IMAGES

Aleksandra Mojsilovic, IBM TJ Watson Research Center, 30 Saw Mill River Road, H4 B31, Hawthorne, NY 10532
Jianying Hu, Lucent Technologies Bell Labs, 600 Mountain Avenue, Murray Hill, NJ 07974

ABSTRACT
Color features are among the most important features used in image database retrieval. Due to its compact representation and low complexity, direct histogram comparison is the most commonly used technique for measuring the color similarity of images. However, it has many serious drawbacks, including a high degree of dependency on color codebook design, sensitivity to quantization boundaries, and inefficiency in representing images with few dominant colors. In this paper we present a new algorithm for color matching. We describe a new method for color codebook design in the Lab space. We introduce a statistical technique to extract perceptually relevant colors. We propose a new color distance measure that guarantees optimality in matching the different color components of two images. Experiments comparing the new algorithm to some existing techniques show that these novel elements lead to a better match to human perception in judging image color similarity.

1. INTRODUCTION
Color features are among the most extensively used low-level features in image database retrieval. They are usually very robust to noise, image degradation, and changes in size, resolution or orientation. Furthermore, since color alone has little semantic meaning, color features tend to be more domain independent than other features such as shape or even texture. The color histogram, representing the joint probability of the intensities of the three color channels, is the simplest and most often used color feature. It is often employed in combination with the Euclidean distance as the color metric, providing an undemanding yet efficient retrieval method. As improvements on basic histogram search, numerous more sophisticated representations and metrics have been developed. For example, Swain and Ballard proposed histogram intersection, an L1 metric, as a similarity measure for histogram comparison [1]. To measure similarity between close but not identical colors, Ioka and Niblack introduced L2-based metrics [2][3]. However, although the histogram representation is simple to compute, it lacks discriminatory power in the retrieval of large image databases. Another disadvantage of the traditional histogram approach is the huge amount of data needed for the representation, which further increases the complexity of the retrieval process. To facilitate fast search of large image databases and better measurements of image similarity in terms of color composition, different research groups have proposed more compact and flexible representations [4]-[6]. However, these algorithms still do not match human perception very well. In this paper we present a new technique for color matching. We describe a new method for color codebook design in the Lab space and introduce a technique for the extraction of perceptually relevant colors. We propose a new color metric with guaranteed optimality in matching different color components of two images. Finally, we present experimental results comparing this new algorithm to some existing methods.
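As a concrete baseline, the joint histogram and the two metrics discussed above (Euclidean distance and Swain-Ballard histogram intersection) can be sketched as follows. This is a minimal illustration, not the paper's method; the 4-bins-per-channel RGB quantization is an arbitrary illustrative choice.

```python
from collections import Counter

def color_histogram(pixels, bins_per_channel=4):
    """Joint RGB histogram with bins_per_channel**3 bins, normalized to sum to 1.
    pixels: iterable of (r, g, b) tuples with 8-bit channel values."""
    n_bins = bins_per_channel ** 3
    counts = Counter()
    total = 0
    for r, g, b in pixels:
        # uniform separable quantization of each channel, then a joint bin index
        qr = (r * bins_per_channel) // 256
        qg = (g * bins_per_channel) // 256
        qb = (b * bins_per_channel) // 256
        counts[(qr * bins_per_channel + qg) * bins_per_channel + qb] += 1
        total += 1
    return [counts[i] / total for i in range(n_bins)]

def euclidean_distance(h1, h2):
    """L2 distance between two histograms (smaller = more similar)."""
    return sum((a - b) ** 2 for a, b in zip(h1, h2)) ** 0.5

def histogram_intersection(h1, h2):
    """Swain-Ballard intersection: 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

Note how the intersection of a black and a white image is exactly zero: the two colors fall into different bins, so this baseline never credits near-matching colors, which is precisely the weakness OCCD addresses later.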

2. COLOR FEATURE EXTRACTION
The goal of feature extraction is to obtain a compact, perceptually relevant representation of the color content of an image. So far, features based on the image histogram have been widely used in image retrieval. However, a feature set based solely on that information does not adequately model the way humans perceive the color appearance of an image. It has been shown that in the early perception stage the human visual system identifies dominant colors by eliminating fine details and averaging colors within small areas [6]. Consequently, at the global level, humans perceive an image only as a combination of its few most prominent colors, even though the color histogram of the observed image might be very "busy". Based on these findings, we have made several improvements to color feature extraction, including a new method for color codebook design and a statistical method for extracting perceptually relevant colors. These improvements are explained in detail in the following subsections.

2.1 Color Codebook Design
A color image is first transformed from the RGB space into the Lab color space. This step is crucial, since our metric relies on the perceptual uniformity of the Lab space, where a fixed Euclidean distance represents a fixed perceptual distance regardless of the position in the space [7]. The objective of the codebook design algorithm is to sample the color space into N points, where N is the predetermined codebook size. In linear color spaces (such as RGB and XYZ) approximated by 3D cubes, color codebooks are typically obtained by performing separable uniform quantization along each coordinate axis. Unfortunately, this sampling scheme is not optimal in the Lab space, due to the nonlinearity of the space. Therefore, in the Lab space, color codebooks for image retrieval applications are usually designed using vector quantization (VQ) techniques. When applied to the color codebook design problem, a VQ algorithm determines a set of bin centers and corresponding decision boundaries so that the mean-square quantization error over all pixels from all images in the particular database is minimized. Hence, in VQ design the training data have a crucial effect on the final result, and this approach is optimal when the codebook is designed for a particular application and the training set contains images representative of the given problem. However, our aim is to design a reliable color codebook to be used in different retrieval applications dealing with arbitrary input images. In that case, to obtain an accurate estimate of the color distribution, the training set for the VQ algorithm has to span the whole input space. This requires a huge number of training points, resulting in a computationally expensive and possibly intractable design task. To overcome this problem, we propose a design technique that provides uniform sampling of the Lab space and generates color codebooks very well suited for general retrieval applications. In the proposed scheme we first sample the luminance axis into NL levels. Then, for each discrete luminance value, the Np colors in the corresponding (a,b) plane are chosen as points zn of a hexagonal spiral lattice in the complex plane [8]. This sampling scheme generates points that are uniformly distributed over the area of a disk centered at the particular gray level, regardless of the disk's size. Hence, there is neither crowding in the center nor sparsity near the edges, providing an optimal packing of the space. The proposed sampling scheme is optimal for creating small fixed color codebooks for image matching and retrieval.
In addition to the uniform distribution of points, another advantage of this sampling method is that, for a given index n, one can easily determine which points zi are in the neighborhood of zn.
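The codebook construction above can be sketched as follows. This is a minimal illustration using the golden-angle (Fibonacci) spiral, a standard uniform-density disk sampling; the exact hexagonal spiral lattice of [8] may differ, and the parameters NL, Np, and the (a,b) disk radius below are hypothetical (7 luminance levels × 13 points per plane happens to yield the 91-color codebook size mentioned in Section 2.2, but the authors' actual settings are not stated).

```python
import cmath
import math

GOLDEN_ANGLE = math.pi * (3 - math.sqrt(5))  # ~137.5 degrees

def spiral_lattice(n_points, radius=1.0):
    """Spiral-lattice points filling a disk of the given radius.
    The sqrt radial scaling keeps point density uniform over the disk area,
    so there is neither crowding at the center nor sparsity near the edge."""
    return [cmath.rect(radius * math.sqrt((n + 0.5) / n_points), n * GOLDEN_ANGLE)
            for n in range(n_points)]

def lab_codebook(num_luminance=7, points_per_plane=13, ab_radius=60.0):
    """Sketch of the codebook: sample the L axis into num_luminance levels,
    then place points_per_plane spiral-lattice points in each (a, b) plane.
    Returns a list of (L, a, b) triples."""
    codebook = []
    for i in range(num_luminance):
        L = 100.0 * (i + 0.5) / num_luminance  # midpoints of uniform L bins
        for z in spiral_lattice(points_per_plane, ab_radius):
            codebook.append((L, z.real, z.imag))
    return codebook
```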

2.2 Extracting Visually Dominant Colors
We developed a statistical method to identify the colors of speckle noise and remap them to the surrounding dominant color. The method is based on the observation that human beings tend to ignore isolated spots of a different color randomly distributed within a dominant color. We first partition each image into non-overlapping N×N windows (N is typically 20) and then proceed independently in each window. For each window, we compute an m×m neighborhood color histogram matrix H, where m is the number of colors found in the region. H[i,j] is the number of times a pixel having color j appears in the D×D (D is a small number, typically 3-5) neighborhood of a pixel having color i, divided by the total number of pixels in the D×D neighborhoods of pixels having color i. Note that, unlike the commonly used color co-occurrence matrix, the neighborhood color histogram matrix is not symmetric. Each row i of H represents the color histogram over the collection of neighborhoods of pixels having color i. Based on this histogram matrix, speckle colors are detected and remapped in the following manner. For each color i, we examine row i and find the entry H[i,k] with the maximum value. If k equals i, then i is determined to be a dominant color, and no remapping is done. Otherwise, i is determined to be a speckle color, occurring mostly in the neighborhood of color k, and all pixels of color i in the window are remapped to color k. Fig. 1 illustrates the process of color feature extraction. Fig. 1a is the original image, Fig. 1b is the image after quantization using a 91-color codebook, and Fig. 1c is the image after speckle color detection and remapping.

Fig. 1. Color feature extraction: a) original image, b) after quantization, c) after speckle color detection and remapping.
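The detection-and-remapping rule of Section 2.2 can be sketched directly for a single window. One simplification, noted in the comment: the normalization of H by the total neighborhood pixel count is omitted here, because it cancels in the row-wise maximum that drives the decision.

```python
from collections import defaultdict

def remap_speckle_colors(window, d=3):
    """Speckle-color remapping within one window.
    window: 2D list of color indices. h[i][j] counts how often color j appears
    in the d x d neighborhoods of pixels having color i; if the largest entry
    of row i is some k != i, color i is a speckle and is remapped to k.
    (Per-row normalization is skipped: it cancels in the row-wise argmax.)"""
    rows, cols = len(window), len(window[0])
    h = defaultdict(lambda: defaultdict(int))  # neighborhood color histogram matrix
    r0 = d // 2
    for y in range(rows):
        for x in range(cols):
            i = window[y][x]
            for dy in range(-r0, r0 + 1):
                for dx in range(-r0, r0 + 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < rows and 0 <= nx < cols:
                        h[i][window[ny][nx]] += 1
    # dominant colors map to themselves; speckle colors map to their argmax k
    remap = {i: max(row, key=row.get) for i, row in h.items()}
    return [[remap[c] for c in r] for r in window]
```

For example, a window of a uniform color with one isolated odd-colored pixel comes back uniform, since the lone pixel's neighborhood is dominated by the surrounding color.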

3. OPTIMAL COLOR COMPOSITION DISTANCE
This section introduces a new method for measuring the distance between two images in terms of color composition. We first define a color component of an image as a pair CCi(Ii, Pi), where Ii is the index of a color in a particular color codebook and Pi is the area percentage occupied by that color. A color component CCi is considered dominant if Ii represents a perceptually relevant color. Hence, the color composition of an image is represented by the set of dominant color components (DCC) found in the image. Based on human perception, for two images to be considered similar in terms of color composition, two conditions need to be satisfied [6]. First, the colors of the dominant color components of the two images need to be similar. Second, the color components with similar colors need to have similar area percentages. Therefore, the distance between two images in terms of color composition should be a measure of the optimal match between the color components of the two images. Previously proposed color metrics all fail to capture both factors [1][4][5][6]. We define a new metric called the optimal color composition distance (OCCD). The OCCD metric measures the difference between two images in terms of color composition based on the optimal mapping between the two corresponding sets of color components. First, the set of color components of each image is quantized into a set of n color units, each with the same area percentage p, where n×p = 100. We call this set the quantized color component (QCC) set. Different color units in the QCC set may have the same color, and the number of color units labeled with a particular color Ii is proportional to the corresponding area percentage Pi. Since every unit now has the same area percentage, it suffices to label each one by its color index alone. Thus the color composition of an image is now represented as a set of n labeled color units.
Suppose we have two images A and B, with QCC sets {C_A | U_A^1, U_A^2, ..., U_A^n} and {C_B | U_B^1, U_B^2, ..., U_B^n}. Let I(U_x^k), x = A, B; k = 1, ..., n, denote the color index of unit U_x^k, and let {M_AB | m_AB : C_A → C_B} be the set of one-to-one mapping functions from set C_A to set C_B. Each mapping function defines a mapping distance between the two sets:

MD(C_A, C_B) = \sum_{i=1}^{n} W( I(U_A^i), I(m_{AB}(U_A^i)) ),


where W(i,j) is the distance between color i and color j in a given color codebook. Our goal is to find the optimal mapping function that minimizes the overall mapping distance. The distance between images A and B is then defined to be this minimal mapping distance. To solve the optimization problem, we create a graph GAB. It contains 2n nodes, one for each color unit in CA or CB, and n^2 edges, one between each node in CA and each node in CB. The cost of an edge is defined as the distance between the two corresponding colors. The resulting graph is an undirected, bipartite graph. The problem of finding the optimal mapping between the color units in A and those in B can now be cast as a minimum cost graph matching problem. The latter is a well-studied problem, with well-known solutions of O(k^3) complexity, where k is the number of nodes in the graph. We used Rothberg's implementation [9] of Gabow's algorithm [10]. The parameters n and p are chosen based on experimental results on human perception. Experiments on the images from our databases have shown that, with the current codebook, colors occupying less than 5% of an image are usually not perceived. Also, previously reported subjective experiments demonstrate that humans are not able to perceive a large number of colors within an image (typically not more than 10) [6]. Hence, we found it sufficient to set n = 20 and p = 5, allowing a maximum of 20 dominant color components. Given the set of color components, the color units for each image are filled in the following fashion. For each color component we compute the number of units it occupies as ni = ceiling(Pi/p). Then we sort the dominant color components according to the area they occupy and assign the color units starting with the most dominant color, until all twenty units have been assigned.
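A small sketch of the OCCD computation, under stated assumptions: colors are represented here as Lab triples and W is taken as Euclidean distance in Lab (consistent with Section 2.1, though the paper only requires a codebook distance), and the optimal unit mapping is found by exhaustive search over permutations. That search is tractable only for tiny n, so the defaults n = 4, p = 25 below are purely illustrative; the paper uses n = 20, p = 5 with an O(k^3) minimum-cost bipartite matching algorithm instead.

```python
from itertools import permutations
from math import ceil, dist

def quantize_components(components, n=4, p=25):
    """Expand dominant color components [(lab_color, area_percent), ...] into
    n color units of p percent each, assigned most-dominant-first."""
    units = []
    for color, percent in sorted(components, key=lambda c: -c[1]):
        units.extend([color] * ceil(percent / p))  # n_i = ceiling(P_i / p)
    return units[:n]  # stop once all n units are assigned

def occd(components_a, components_b, n=4, p=25):
    """Optimal Color Composition Distance between two DCC sets.
    Exhaustive search over one-to-one unit mappings stands in for the
    minimum-cost bipartite matching used in the paper."""
    ua = quantize_components(components_a, n, p)
    ub = quantize_components(components_b, n, p)
    return min(sum(dist(a, b) for a, b in zip(ua, perm))
               for perm in permutations(ub))
```

Because the minimum runs over all cross-pairings of units, two images whose dominant colors are close but not identical still receive a small distance, which is exactly what direct histogram comparison fails to do.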


Fig. 2. Retrieval results using: a) traditional histogram, b) histogram intersection, and c) OCCD methods.


4. EXPERIMENTAL RESULTS
In this section we illustrate the performance of the new algorithm (OCCD) by comparing it to several previously proposed methods: 1) the simplest scheme, based on the Euclidean distance between the target and query histograms; 2) Swain's histogram intersection method [1]; 3) the MDM metric [4]; and 4) the modified MDM metric [6]. For the comparison we used an interior design database consisting of 335 color patterns. This database was chosen because there is little meaning attached to its images, allowing us to test our scheme and compare it to the other methods without much bias from semantic information. In the examples shown below, all retrieval results are displayed with the query image in the upper-left corner and the 5 retrieved images arranged from left to right and top to bottom in order of decreasing similarity. The same color codebook of 99 colors was used in all methods.


Fig. 3. Retrieval results using: a) MDM, b) modified MDM, and c) OCCD methods.

Fig. 2 illustrates the performance of the traditional histogram method (Fig. 2a) and histogram intersection (Fig. 2b) compared to that of OCCD (Fig. 2c). In this example, both histogram-based methods fail to retrieve some of the very similar images. This is because in these methods colors that are close to each other but happen to fall into different quantization bins are never compared against each other. OCCD succeeds in retrieving these images by allowing all possible cross-comparisons of colors. Fig. 3 shows a comparison of the performance of the MDM method (Fig. 3a), the modified MDM method (Fig. 3b), and the OCCD method (Fig. 3c). In this case both the MDM and modified MDM methods perform poorly, with the former biased toward the background and the latter biased toward the foreground. Such unpredictable behavior is likely caused by the fact that the metric is ill defined in both methods. OCCD provides a well-defined metric and guaranteed optimality in the distance computation, thus achieving much more reliable results. A subjective experiment was carried out to compare the different methods more definitively. The traditional histogram method was not included in this experiment because its problems are well established; also, including too many schemes in the comparison tends to confuse the subjects. Fifteen representative patterns were chosen from the interior design database as query images. For each query image, four different retrieval results were generated using the histogram intersection method, the MDM method, the modified MDM method and the OCCD method, respectively. Each query result contained the top 5 matching images retrieved from our database, displayed in the same manner as the examples in Figs. 2 and 3. Thirteen subjects (7 men, 6 women) were asked to evaluate these results. Each subject was presented with the four retrieval results for each query image, arranged in random order on one page, and asked to rank order them according to their own judgment. A ranking of 1 was given to the best retrieval result, 2 to the second best, and so on.
We expected large variations in the evaluation results due to the difficulties reported by the subjects in comparing different rankings, as well as in separating color from other factors such as pattern or spatial arrangement. Despite these difficulties, the evaluation results showed a significant amount of consistency among the subjects for most images. Ten of the fifteen images yielded majority votes (i.e., >6) for a single scheme as the best, indicating that the corresponding rankings were reasonably consistent; the remaining five images were discarded. Of the ten query images yielding consistent rankings, the results produced by the OCCD method received majority votes as the best for eight images, and the results produced by histogram intersection were voted best for two. Table 1 gives the average rank of each scheme computed over these ten query images. OCCD clearly has the best ranking. These results demonstrate that the OCCD method indeed best matches human perception overall.

Table 1. Average rankings for the tested methods.
  Histogram Intersection   2.3
  MDM                      2.8
  Modified MDM             3.3
  OCCD                     1.6

5. CONCLUSIONS
Due to its very compact representation and low complexity, direct histogram comparison is the most often used technique in color-based image database retrieval. However, there are serious drawbacks to this approach. First, the discrimination ability of the color histogram largely depends on the choice of color quantization method, as well as on the size of the color codebook. Second, for most natural images the histogram representation is very sparse, and therefore inefficient. Furthermore, direct histogram comparison is highly sensitive to quantization boundaries. This paper describes a new algorithm for color matching of images aimed at overcoming these drawbacks. The major novelties of this algorithm are: a new technique for color codebook design in the perceptually uniform Lab space; a statistical method for extracting visually dominant colors from images; and a new color distance measure with guaranteed optimality in terms of color composition matching. All these features contribute to a better match to human perception in judging color content similarity between images.

6. REFERENCES
[1] Swain M. and Ballard D., "Color indexing", International Journal of Computer Vision, vol. 7, no. 1, 1991, pp. 11-32.
[2] Ioka M., "A method of defining the similarity of images on the basis of color information", Technical Report RT-0030, IBM Research, Tokyo Research Laboratory, Nov. 1989.
[3] Niblack W., Barber R., Equitz W., Flickner M., Glasman E., Petkovic D., and Yanker P., "The QBIC project: Querying images by content using color, texture and shape", in Proc. SPIE Storage and Retrieval for Image and Video Databases, 1994, pp. 172-187.
[4] Ma W.Y., Deng Y., and Manjunath B.S., "Tools for texture/color based search of images", in Proc. SPIE, vol. 3016, 1997, pp. 496-505.
[5] Pei S.C. and Cheng C.M., "Extracting color features and dynamic matching for image database retrieval", IEEE Trans. on Circuits and Systems for Video Technology, vol. 9, no. 3, April 1999, pp. 501-512.
[6] Mojsilovic A., Kovacevic J., Hu J., Safranek R.J., and Ganapathy K., "Matching and retrieval based on the vocabulary and grammar of color patterns", IEEE Trans. Image Processing, vol. 9, no. 1, Jan. 2000, pp. 38-54.
[7] Wyszecki G. and Stiles W.S., Color Science: Concepts and Methods, Quantitative Data and Formulae, John Wiley and Sons, New York, 1982.
[8] Mojsilovic A. and Soljanin E., "Quantization of color spaces and processing of color quantized images by Fibonacci lattices", IEEE Trans. on Image Processing, submitted.
[9] ftp://dimacs.rutgers.edu/pub/netflow/matching/weighted/
[10] Gabow H., "Implementation of algorithms for maximum matching on nonbipartite graphs", Ph.D. Thesis, Stanford University, 1973.