WAVELET-BASED COLOUR TEXTURE RETRIEVAL USING THE KULLBACK-LEIBLER DIVERGENCE BETWEEN BIVARIATE GENERALIZED GAUSSIAN MODELS
Geert Verdoolaege, Yves Rosseel
Michiel Lambrechts, Paul Scheunders
Department of Data Analysis, Ghent University, Henri Dunantlaan 1, 9000 Gent, Belgium
IBBT, Vision Lab, Department of Physics, University of Antwerp, Universiteitsplein 1, 2610 Wilrijk, Belgium
ABSTRACT
We study the retrieval of coloured textures from a database. In a statistical framework we model the heavy-tailed wavelet histograms through a generalized Gaussian distribution (GGD). We choose the Kullback-Leibler divergence (KLD) as a similarity measure and we obtain a closed-form expression for the KLD between two zero-mean bivariate GGDs. This allows us to take into account the rich correlation structure between the colour bands two by two. We show that this results in a considerably improved retrieval rate and, in addition, we demonstrate the superior performance of the bivariate GGD in comparison with the bivariate Gaussian.

Index Terms: colour texture retrieval, Kullback-Leibler divergence, multivariate generalized Gaussian distribution

1. INTRODUCTION
Our contribution is concerned with the task of content-based image retrieval (CBIR) (see [1] for a review). This automated retrieval of images from a database, based solely on their graphic content, is essentially a two-step process. First, the image information needs to be captured by an image signature that is adequately precise and at the same time sufficiently concise. The latter is an important quality when storing or transmitting images. This feature extraction step is followed by the similarity measurement, which compares the images by calculating a distance measure between their respective features. Since querying an image database is often an online activity, the evaluation of the distance measure should be sufficiently fast. In practice, this typically means that a closed-form expression for the distance function should be used.

With a view to the extraction of an efficient set of image features, we will apply a discrete wavelet transform to the images. Indeed, this leads to a set of multiscale
978-1-4244-5654-3/09/$26.00 ©2009 IEEE
oriented subbands that are sensitive to horizontal, vertical and diagonal edges in the original image [2]. Several approaches to texture characterization in the wavelet domain assume that the wavelet representation accurately characterizes texture (see e.g. [3, 4]). Besides the texture information itself, we wish to distinguish textured images by colour as well. Moreover, we want to make use of the information residing in the correlation structure between the colour bands, since we expect that the retrieval task will benefit from these additional data. In a statistical context, we accomplish this by modelling the wavelet coefficients corresponding to multiple colour bands by a multivariate probability distribution.

The wavelet transform results in a sparse representation of the images and often the corresponding histograms, which have zero mean, are super-Gaussian and heavy-tailed. For instance, in [5], wavelet coefficients for grey-level images were modelled in a retrieval context using a univariate generalized Gaussian distribution (GGD), also known as the exponential power distribution, which in general provided a better fit than the Gaussian density. In conjunction with the use of the Kullback-Leibler divergence (KLD) as a similarity measure, good retrieval performance was obtained. In order to extend this work to the joint modelling of colour band correlation (while assuming independence among the wavelet subbands belonging to a single colour component), we employ a multivariate generalized Gaussian distribution. We have obtained a closed-form expression for the KLD between bivariate GGDs, which we put into practice through the two by two modelling of colour bands.

The joint modelling of wavelet coefficients has been considered before. In [6], a joint alpha-stable sub-Gaussian distribution was used to describe the joint wavelet histograms, modelling dependence across wavelet orientations and scales.
With the aid of the KLD, good retrieval rates were obtained, but a computationally complex gaussianization step was required. In
ICIP 2009
contrast, the KLD between bivariate GGDs can be evaluated quickly and directly.

This contribution is organized as follows. In Section 2 we introduce the multivariate generalized Gaussian distribution and we present the expression for the KLD between two zero-mean bivariate GGDs. Section 3 discusses the results of several retrieval experiments using the KLD as a similarity measure on grey-level data and on colour images, with the colour bands assumed either uncorrelated or correlated. The performance of the bivariate GGD is compared to the case of the bivariate Gaussian distribution. Section 4 provides a conclusion and an outlook towards a possible improvement of the method.

2. MULTIVARIATE GENERALIZED GAUSSIAN DISTRIBUTION AND KULLBACK-LEIBLER DIVERGENCE
There does not appear to exist a generally agreed upon multivariate extension of the univariate generalized Gaussian distribution. However, in this work we have considered the multivariate exponential power distribution, which we call in this context the multivariate generalized Gaussian distribution. In [7], this distribution is defined through the following density:

$$p(x|\mu,\Sigma,\beta) = \frac{\Gamma\left(\frac{m}{2}\right)\beta}{\pi^{\frac{m}{2}}\,\Gamma\left(\frac{m}{2\beta}\right)2^{\frac{m}{2\beta}}\,|\Sigma|^{\frac{1}{2}}} \exp\left\{-\frac{1}{2}\left[(x-\mu)^{\top}\Sigma^{-1}(x-\mu)\right]^{\beta}\right\}.$$

Here, $\Gamma$ denotes the gamma function and $m$ is the dimensionality of the probability space ($m = 2$ in our application). $\mu$ is the mean vector, $\Sigma$ is the dispersion matrix and $\beta$ is called the shape parameter. Clearly, $\beta = 1$ corresponds to the multivariate Gaussian density. In the following, we will assume that $\mu = 0$, since the wavelet detail coefficients have zero expectation. We used a maximum likelihood (ML) approach to fit the GGDs to the wavelet coefficients. The ML equations were solved iteratively.

The KLD between two zero-mean bivariate GGDs $p_1(x|\Sigma_1,\beta_1)$ and $p_2(x|\Sigma_2,\beta_2)$ is defined as
$$\mathrm{KLD}(p_1\|p_2) \equiv \int_{\mathbb{R}^2} p_1 \ln\frac{p_1}{p_2}\,\mathrm{d}x.$$
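As a numerical illustration of the density and of the KLD integral defined above, the following minimal sketch evaluates the zero-mean bivariate GGD log-density ($m = 2$) and approximates the integral by a midpoint rule on a truncated square. The function names, the grid size and the truncation width are assumptions made for this illustration only and are not part of the original method; the brute-force integral is useful only as a sanity check.

```python
import math


def ggd2_logpdf(x, y, sigma, beta):
    """Log-density of the zero-mean bivariate GGD (m = 2) at the point (x, y).

    sigma is the 2x2 dispersion matrix ((a, b), (b, c)), beta the shape parameter.
    """
    (a, b), (_, c) = sigma
    det = a * c - b * b
    # Quadratic form x^T Sigma^{-1} x, using the explicit inverse of a 2x2 matrix.
    q = max((c * x * x - 2.0 * b * x * y + a * y * y) / det, 0.0)
    # For m = 2 the normalization is beta / (pi * Gamma(1/beta) * 2^(1/beta) * |Sigma|^(1/2)).
    log_norm = (math.log(beta) - math.log(math.pi) - math.lgamma(1.0 / beta)
                - math.log(2.0) / beta - 0.5 * math.log(det))
    return log_norm - 0.5 * q ** beta


def kld_numeric(sigma1, beta1, sigma2, beta2, half_width=8.0, n=400):
    """Midpoint-rule approximation of KLD(p1 || p2) on [-half_width, half_width]^2.

    The truncation is an assumption: it is only adequate when p1 has
    negligible mass outside the square.
    """
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        for j in range(n):
            y = -half_width + (j + 0.5) * h
            lp1 = ggd2_logpdf(x, y, sigma1, beta1)
            lp2 = ggd2_logpdf(x, y, sigma2, beta2)
            # Integrand p1 * ln(p1 / p2), evaluated in log space to avoid 0/0.
            total += math.exp(lp1) * (lp1 - lp2)
    return total * h * h
```

For $\beta = 1$ the log-density reduces to that of a bivariate Gaussian, which gives a simple way to check the normalization constant.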
We wish to derive a closed-form expression for the KLD, in order to apply this in a retrieval experiment, as described in the next section. To this end, we may carry out the coordinate transform $y = Hx$, with $H$ the matrix that jointly diagonalizes the matrices $\Sigma_1$ and $\Sigma_2$ (see e.g. [8]). It is then not difficult to see that the KLD becomes

$$\mathrm{KLD}(p_1\|p_2) = \ln\left[\frac{\Gamma\left(\frac{1}{\beta_2}\right)}{\Gamma\left(\frac{1}{\beta_1}\right)}\,2^{\left(\frac{1}{\beta_2}-\frac{1}{\beta_1}\right)}\left(\frac{|\Sigma_2|}{|\Sigma_1|}\right)^{\frac{1}{2}}\frac{\beta_1}{\beta_2}\right] - \frac{1}{\beta_1} + 2^{\left(\frac{\beta_2}{\beta_1}-1\right)}\,\frac{\Gamma\left(\frac{\beta_2+1}{\beta_1}\right)}{\Gamma\left(\frac{1}{\beta_1}\right)}\left(\frac{\gamma_1+\gamma_2}{2}\right)^{\beta_2}\,{}_2F_1\left(\frac{1-\beta_2}{2},-\frac{\beta_2}{2};1;A^2\right). \tag{1}$$

Here, $\gamma_i \equiv \lambda_i^{-1}$, $i = 1, 2$, with $\lambda_i$ the eigenvalues of $\Sigma_1^{-1}\Sigma_2$, while $A \equiv \frac{\gamma_1-\gamma_2}{\gamma_1+\gamma_2}$. ${}_2F_1$ represents the Gauss hypergeometric function [9], which may be tabulated for $-1 < A < 1$ and for reasonable values of $\beta$. In the case of two Gaussians, $\beta_1 = \beta_2 = 1$, the hypergeometric function in (1) becomes identically 1 and it can easily be verified that (1) reduces to the familiar expression for the KLD between two bivariate Gaussians [10].

3. RETRIEVAL EXPERIMENTS
In this section, we apply the expression for the KLD between bivariate GGDs as a similarity measure in several retrieval experiments. We consider grey-level images, colour images (RGB colour space) assuming no interband correlation and finally colour images considering the correlation between the three colour bands two by two.

We started by using the same 40 images from the MIT Vision Texture (VisTex) database [11] as proposed in [5]. These images, displayed in Figure 1 (in grey scales), are real world 512×512 images from different natural scenes. Every image was divided into 16 non-overlapping 128×128 subimages, constituting a database of 640 images. Every colour (or grey-level) component of each image was individually normalized to zero mean and unit variance, followed by a discrete wavelet transform with three levels using the Daubechies filters of length eight. The wavelet detail coefficients of every subband over the colour components two by two (or the grey-level) were modelled using a GGD or, for comparison, a Gaussian. The parameters of the GGD models for all subbands, namely the dispersion matrix and the shape parameter, comprise the feature set for a single image. A retrieval experiment was conducted by sequentially presenting every subimage as a query image.
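The similarity measure used in these experiments, Equation (1), can be sketched in a few lines of standard-library Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, and the hypergeometric function is summed directly from its power series (valid for $A^2 < 1$) instead of being read from a table as described in the text. For a 2×2 problem, $\gamma_1$ and $\gamma_2$ follow from the trace and determinant of $\Sigma_2^{-1}\Sigma_1$, whose eigenvalues are the inverses of those of $\Sigma_1^{-1}\Sigma_2$.

```python
import math


def hyp2f1_series(a, b, c, z, tol=1e-15, max_terms=100000):
    """Gauss hypergeometric function 2F1(a, b; c; z) by its power series (|z| < 1)."""
    term, total = 1.0, 1.0
    for n in range(max_terms):
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def kld_ggd2(sigma1, beta1, sigma2, beta2):
    """Closed-form KLD between two zero-mean bivariate GGDs, following Eq. (1).

    Each sigma is a 2x2 dispersion matrix ((a, b), (b, c)).
    """
    (a1, b1), (_, c1) = sigma1
    (a2, b2), (_, c2) = sigma2
    det1 = a1 * c1 - b1 * b1
    det2 = a2 * c2 - b2 * b2
    # Trace and determinant of Sigma2^{-1} Sigma1; its eigenvalues are gamma_1, gamma_2.
    tr = (c2 * a1 - 2.0 * b1 * b2 + a2 * c1) / det2
    dt = det1 / det2
    root = math.sqrt(max(tr * tr - 4.0 * dt, 0.0))
    g1, g2 = 0.5 * (tr + root), 0.5 * (tr - root)
    A = (g1 - g2) / (g1 + g2)
    # Logarithmic term of Eq. (1), written as a sum of logs.
    log_term = (math.lgamma(1.0 / beta2) - math.lgamma(1.0 / beta1)
                + (1.0 / beta2 - 1.0 / beta1) * math.log(2.0)
                + 0.5 * math.log(det2 / det1)
                + math.log(beta1 / beta2))
    # 2^(beta2/beta1 - 1) * Gamma((beta2 + 1)/beta1) / Gamma(1/beta1), via log-gamma.
    radial = 2.0 ** (beta2 / beta1 - 1.0) * math.exp(
        math.lgamma((beta2 + 1.0) / beta1) - math.lgamma(1.0 / beta1))
    angular = (0.5 * (g1 + g2)) ** beta2 * hyp2f1_series(
        0.5 * (1.0 - beta2), -0.5 * beta2, 1.0, A * A)
    return log_term - 1.0 / beta1 + radial * angular
```

Two quick checks follow from the text: the divergence of a model from itself is zero, and for $\beta_1 = \beta_2 = 1$ the result agrees with the Gaussian expression $\frac{1}{2}\left[\mathrm{tr}(\Sigma_2^{-1}\Sigma_1) - 2 + \ln(|\Sigma_2|/|\Sigma_1|)\right]$.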
The retrieval effectiveness was then measured by calculating the average ratio of relevant images to the top 15 images (excluding the query image). Here, an image is considered relevant if it is part of the same original 512 × 512 image as the query image.

The experiments were first performed for grey-level textures, calculated from the luminance component of the colour images. Next, the information in all colour bands was treated in parallel, assuming zero interband correlation. This still allowed us to use univariate distributions. Finally, we took into account the strong correlation that generally exists between the colour components of an image, by modelling the wavelet coefficients corresponding to the colour bands two by two jointly through bivariate distributions. The results of all retrieval experiments are listed in Table 1.

Fig. 1. The 512 × 512 texture images from the VisTex database used in our experiments. In reality, these are RGB colour images but for convenience they are shown in grey scales here. From left to right and top to bottom the images are: Bark0, Bark6, Bark8, Bark9, Brick1, Brick4, Brick5, Buildings9, Fabric0, Fabric4, Fabric7, Fabric9, Fabric11, Fabric14, Fabric15, Fabric17, Fabric18, Flowers5, Food0, Food5, Food8, Grass1, Leaves8, Leaves10, Leaves11, Leaves12, Leaves16, Metal0, Metal2, Misc2, Sand0, Stone1, Stone4, Terrain10, Tile1, Tile4, Tile7, Water5, Wood1 and Wood2.

The results from the experiments can be summarized as follows. When considering the information in all colour components in parallel, without assuming interband correlation, the obtained retrieval rates are higher than when using only the grey levels. In turn, as anticipated, the joint modelling of the wavelet subbands across colour bands yields considerably better results still, due to the valuable information residing in the correlation structure. This constitutes the largest improvement of our method compared to the approach taken in [5]. Furthermore, the GGD model produces better results than the Gaussian in all experiments. This is due to the additional model flexibility in terms of the shape parameter, compared to the Gaussian distribution.

In order to verify the scalability of the retrieval method, the same technique was applied to a larger database. This database was obtained from the full set of 167 colour texture images of size 512 × 512 in
the VisTex database, again dividing every image into 16 non-overlapping subimages, resulting in a total of 2672 database images. Hence, the database contains as a subset our previous smaller database of 640 images. The original 167 images are in general characterized by more heterogeneous textures compared to the selection of 40 images employed above. This renders the retrieval task considerably more difficult on the larger database. Indeed, the obtained retrieval rates, also shown in Table 1, are substantially lower than the corresponding rates observed using the database of 640 images. Nevertheless, very similar tendencies can be seen using both databases, in particular confirming the superior performance when modelling the interband correlation and when using the GGD model.

As far as the computational demands of our method are concerned, naturally the calculation of the KLD between bivariate GGDs takes more resources than its evaluation in the case of univariate distributions. To give an idea of the computational load, we mention the amount of time necessary to calculate the total KLD between two images at three wavelet scales, on the one hand modelling the wavelet coefficients for each colour band independently through univariate GGDs and on the other hand considering the correlation between the colour bands two by two, using bivariate GGDs. This
duration was obtained on the machine on which all calculations in this work were performed, namely a Dell Optiplex 755 equipped with an Intel Core Duo Quad CPU at 2.7 GHz and 8 GB of RAM, running the 64-bit version of the Windows Vista operating system. In the univariate case, the evaluation of the KLD between two images takes about 0.1 ms on this machine, while for bivariate distributions, using the tabulated hypergeometric function, this becomes about 4 ms, still fast enough to be practical in a retrieval context.

Database       Model    Grey    UV      MV
640 images     Gauss    64.1    70.4    85.8
               GGD      76.6    77.9    87.6
2672 images    Gauss    34.6    42.0    58.5
               GGD      41.7    46.9    60.3

Table 1. Average retrieval rates (%) using the KLD and different models (Grey = grey-level, UV = univariate colour, MV = multivariate colour) in a database of 640 colour texture images and a larger database consisting of 2672 images.

4. CONCLUSION
The colour components in real world textured images tend to be strongly correlated. In this work we have shown that image retrieval results improve considerably when the interband correlation structure in the wavelet space is modelled jointly. In a statistical framework, we have proposed a generalized Gaussian distribution as a model, using the Kullback-Leibler divergence as a similarity measure. We have obtained a closed-form expression for the KLD between two zero-mean bivariate GGDs. This enables a fast evaluation of the KLD between bivariate GGDs that model the correlation between the RGB colour bands two by two. Our experimental results on a database of 640 textured images and on a larger database of 2672 images show the advantage of modelling the correlation between colour bands in terms of increased retrieval rate. In addition, the GGD model clearly performs better than the Gaussian distribution.

Given the observed improved retrieval performance when modelling interband correlation, it would be interesting to capture the full correlation structure between the three colour bands through a trivariate GGD and calculate the KLD between these extended models. However, since so far we did not obtain a closed-form expression for the KLD between GGDs with dimension higher than 2, we have not attempted to include experimental results using trivariate GGDs. This could be the subject of future work.

5. REFERENCES
[1] R. Datta, D. Joshi, J. Li, and J.Z. Wang, “Image retrieval: Ideas, influences, and trends of the new age,” ACM Computing Surveys, vol. 40, article 5, 2008.
[2] S. Mallat, A wavelet tour of signal processing, Academic Press, New York, second edition, 1999.
[3] B.S. Manjunath and W.Y. Ma, “Texture features for browsing and retrieval of image data,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 18, pp. 837–842, 1996.
[4] G. Van de Wouwer, P. Scheunders, and D. Van Dyck, “Statistical texture characterization from discrete wavelet representations,” IEEE Trans. Image Process., vol. 8, pp. 592–598, 1999.
[5] M. Do and M. Vetterli, “Wavelet-based texture retrieval using generalized Gaussian density and Kullback-Leibler distance,” IEEE Trans. Image Process., vol. 11, pp. 146–158, 2002.
[6] G. Tzagkarakis, B. Beferull-Lozano, and P. Tsakalides, “Rotation-invariant texture retrieval with gaussianized steerable pyramids,” IEEE Trans. Image Process., vol. 15, pp. 2702–2718, 2006.
[7] E. Gómez, M.A. Gómez-Villegas, and J.M. Marín, “A multivariate generalization of the power exponential family of distributions,” Commun. Statist. - Theory Meth., vol. 27, pp. 589–600, 1998.
[8] S. Theodoridis and K. Koutroumbas, Pattern recognition, sec. B.2, Academic Press, London, second edition, 2003.
[9] M. Abramowitz and I.A. Stegun, Handbook of Mathematical Functions, Dover Publications, New York, 1965.
[10] S. Kullback, Information theory and statistics, Dover Publications, New York, 1968.
[11] MIT Vision and Modeling Group, “Vision texture,” online at http://vismod.media.mit.edu/vismod/imagery/VisionTexture/.