Available online at www.sciencedirect.com
ScienceDirect Procedia Computer Science 73 (2015) 320 – 327
The International Conference on Advanced Wireless, Information, and Communication Technologies (AWICT 2015)
Image Retrieval Based On Using Hamming Distance
Sana FAKHFAKH, Mohamed TMAR, Walid MAHDI
Laboratory MIRACL, Institute of Computer Science and Multimedia of Sfax, Sfax University, Tunisia
Abstract
The rapid growth in the volume of images used by recent applications creates a need for new indexing and retrieval tools that facilitate access to relevant content. In this paper, we build a visual vocabulary from descriptors extracted for color, texture and interest points. The method assigns a signature to each image in the collection. The signature is extracted by applying a multiscale Haar wavelet transform, detecting Harris interest points and analyzing the color histogram. The signature then undergoes a dimension reduction step using Principal Component Analysis (PCA), followed by a binarization stage. The resulting binary codes are compared using the Hamming distance for similarity matching. Experiments are carried out on four data sets: "Caltech101", "Caltech256", "ImageCLEF 2013" and "ImageCLEF 2014". The results obtained show the effectiveness of our approach.
© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of organizing committee of the International Conference on Advanced Wireless, Information, and Communication Technologies (AWICT 2015).
Keywords: Content; binary signature; image retrieval; Hamming distance; visual descriptor.
1. Introduction
Recent years have seen considerable progress in digital image acquisition and storage techniques, which has led to the continuous production of digital image databases in areas such as health, audiovisual media, architecture, remote surveillance and remote sensing. The constant increase in the number and volume of these databases calls for effective indexing methods and search tools. Content-Based Image Retrieval (CBIR) is a field that relies on a set of low-level features such as histograms, texture, color distribution, shape and brightness. Two inseparable aspects coexist in a content-based image retrieval system: indexing and search (Figure 1). Both indexing and retrieval essentially rely on extracting and representing effective descriptors from the database images and from the query submitted by the user. Indeed, automatic image indexing requires extracting the parameters of these images
1877-0509 © 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of organizing committee of the International Conference on Advanced Wireless, Information, and Communication Technologies (AWICT 2015) doi:10.1016/j.procs.2015.12.040
Sana Fakhfakh et al. / Procedia Computer Science 73 (2015) 320 – 327
beforehand. These parameters quantify the color, texture, intensity or shapes contained in the image and provide an image signature that serves as a visual descriptor. The retrieval phase, in turn, consists in analyzing a query sent by the user and returning the result as an ordered list of images, ranked by the similarity between the visual descriptor extracted from the query image and those of the database images, using an adequate distance measure (Figure 1).
[Figure 1: block diagram. Indexing System: DB, Features Extraction. Retrieval System: Query, Features Extraction, Similarity Matching, Selected Images.]
Fig. 1. General architecture of an image indexing and retrieval system.
The need for indexing and retrieval methods based directly on image content is now well established. The first prototype system was proposed in 1970 and attracted the attention of many researchers. Some systems have become commercial products, such as IBM's QBIC (Query By Image Content) [1] and CIRES [2]. The Frip (Finding Regions in the Pictures) system [3] proposes searching by regions of interest drawn by the user [4]. The RETIN system (REcherche et Traque INteractive) [5], developed at the University of Cergy-Pontoise, France, selects a random set of pixels in each image and extracts their color values. The texture of these pixels is obtained by applying Gabor filters [6]. These values are grouped and classified using a neural network. Images are then compared by computing the similarity between their feature vectors [7]. Some studies have proposed changing the feature space, for example the color feature space [8]. Sometimes the user wants to submit a query such as "Find all the images in the database with certain regions similar to those of the query image" [9]. Work in this direction either segments the image or uses a spatial structure. In the first case, given a segmented image, the user selects the regions of interest from it. In the second case, a spatial structure is used, often a quadtree [9]. This structure stores the visual characteristics of the different image regions and filters images by progressively increasing the level of detail. Other work reduces the search space by using nearest-neighbor computation to group similar data into classes [10], so that searching for an image amounts to searching a class. The disadvantage of these systems is that the user does not always have an image that expresses the actual need, which makes such systems difficult to use.
One solution to this problem is the vectorization technique, which makes it possible to find images that are relevant to a query but are not returned by the initial system. It requires choosing a set of documents, called references. Choosing these references remains problematic for the construction of the vector space. In this context, it is necessary to select a heterogeneous set of documents. The number of references is also an issue, since it determines the dimension of the new vector space: if we choose too small a set, we may not be able to represent all documents properly. The references are selected randomly [11], taken from the first results of an initial search, or chosen as the centers of gravity of homogeneous groups of images [12]. In this paper, we are interested in content-based image retrieval. We began with an overview of content-based image retrieval and of the existing work in the literature. In what follows, we present our approach to image retrieval, which is essentially based on color, texture and interest point descriptors. Finally, we conclude with an evaluation of our approach on four test collections: Caltech101, Caltech256, ImageCLEF 2013 (Plant task) and ImageCLEF 2014 (Plant task).
2. Proposed system
In this section, we propose a new scheme for content-based image retrieval. Signature extraction consists of three essential steps: extracting visual features, reducing their dimension using Principal Component Analysis (PCA), and normalizing the signature to obtain a binary signature for each image of the corpus. All these steps are shown in Figure 2. Our system builds a visual vocabulary from descriptors extracted for color, texture and interest points, and assigns a signature to each image in the collection. The signature is extracted by applying a multiscale Haar wavelet transform, detecting Harris interest points and analyzing the color histogram. It then undergoes dimension reduction by PCA, followed by binarization. The resulting binary codes are compared using the Hamming distance in the similarity matching. Signature extraction is preceded by a crucial segmentation step, whose purpose is to isolate a single object in the image.
[Figure 2: block diagram. Features Extraction with three branches: interest points (Harris color detector), texture (Haar wavelet decomposition) and color (quantified RGB histogram). The extracted values form an M × N matrix (X11 ... XMN), which goes through Principal component analysis, then Normalization, then Extraction of binary signature.]
Fig. 2. Architecture of proposed system.
2.1. Automatic Segmentation
In this step, we use the concept of active contour introduced by Kass, Witkin and Terzopoulos [13]. An active contour (snake) is a parametric curve $V(s)$ that attempts to move into a position where its energy is minimized, as expressed by equation 1.

$$E^{*}_{snake} = \int_{0}^{1} \left[ E_{int}(V(s)) + E_{img}(V(s)) + E_{con}(V(s)) \right] ds \tag{1}$$

The energy of the active contour is broken down into three terms. The first term, $E_{int}$, represents the internal energy; the second term, $E_{img}$, represents the image forces; the last term, $E_{con}$, represents the external constraint forces. The sum of the image forces $E_{img}$ and the external constraint forces $E_{con}$ is also known as the external energy of the snake, $E_{ext}$. The internal energy of the snake is written

$$E_{int} = \frac{1}{2} \left( \alpha(s)\,\|V_{s}(s)\|^{2} + \beta(s)\,\|V_{ss}(s)\|^{2} \right) \tag{2}$$

where $\|V_{s}(s)\|^{2}$ measures the elasticity and $\|V_{ss}(s)\|^{2}$ measures the curvature. This means that in parts of the snake where the curve is stretched, the elasticity term is high, while in parts where the curve is bent, the curvature term is high.
Fig. 3. Stages of automatic photo segmentation by active contour, panels (a) to (d).
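As an illustration of the internal energy term in equation 2, the following is a minimal sketch of its discrete form for a closed contour. It is not the segmentation code used in this work: the function name `snake_internal_energy`, the constant weights `alpha` and `beta`, and the finite-difference approximation of the derivatives are all assumptions made for the example.

```python
import numpy as np

def snake_internal_energy(points, alpha=1.0, beta=1.0):
    """Discrete internal energy of a closed contour (sketch of equation 2).

    points: (n, 2) array of contour vertices, assumed closed and evenly sampled.
    alpha, beta: elasticity and curvature weights, assumed constant along s.
    """
    pts = np.asarray(points, dtype=float)
    # First derivative V_s, approximated by forward differences on the closed curve.
    v_s = np.roll(pts, -1, axis=0) - pts
    # Second derivative V_ss, approximated by central differences.
    v_ss = np.roll(pts, -1, axis=0) - 2.0 * pts + np.roll(pts, 1, axis=0)
    elastic = alpha * np.sum(v_s ** 2, axis=1)
    bending = beta * np.sum(v_ss ** 2, axis=1)
    # Summing over the vertices stands in for the integral over s.
    return 0.5 * float(np.sum(elastic + bending))

if __name__ == "__main__":
    square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]])
    print(snake_internal_energy(square))
    print(snake_internal_energy(2 * square))
```

A stretched contour (the scaled square) yields a larger value than the original one, which matches the role of the elasticity term described above. A full snake would also need the image and constraint terms of equation 1 and an iterative minimization, which are left out here.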
2.2. Feature extraction
Our image retrieval system employs three features to describe the huge variety of images effectively. These descriptors are chosen to be invariant to translation, rotation and, if possible, scale. In this section, we review each feature family (interest points, texture, color) in detail.
2.2.1. Interest points
Over the past twenty years, several interest point detectors have been developed. Schmid compared the performance of several of them [14] [15]. The well-known Harris detector was evaluated as the best point detector and has a good reputation in the field of automatic interest point extraction. Harris and Stephens [14] defined a detector that relies on the first derivatives of the image signal over a window and gives better results on two important criteria [15]. The first criterion is robustness with respect to rotation, scale changes of the camera and changes of viewpoint. The second is robustness with respect to variations in illumination. The Harris detector was initially applicable only to grayscale images. Montesinos [15] generalized it to color images. Figure 4 shows an example of interest point detection in an image [16].
2.2.2. Color features
For a monochrome image, the histogram is defined as a discrete function that maps each intensity value to the number of pixels taking this value. The histogram is determined by counting the number of pixels for each intensity of the image. Quantization, which groups several intensity values into one class, gives a better view of the distribution of image intensities. Our task is to build a histogram of a color
Fig. 4. Detection of interest points by the Harris detector.
image in RGB space, with its red, green and blue components. Each component is quantized into eight intervals. For each interval of each of the red, green and blue components, we compute the normalized frequency of the pixels it contains [16]. The image can then be represented by a color histogram, as shown in Figure 5.
[Figure 5: (a) RGB image; (b) bar chart of the quantized histogram, x-axis: bins 1 to 8, y-axis: 0 to 1.]
Fig. 5. Example of color histogram: (a) RGB image, (b) quantized histogram.
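The quantized color histogram described above can be sketched as follows. This is an illustrative reading rather than the exact implementation: it assumes an 8-bit RGB image, treats the three channels independently (giving 3 × 8 = 24 values), and normalizes each channel by the number of pixels. The function name `quantized_rgb_histogram` is hypothetical.

```python
import numpy as np

def quantized_rgb_histogram(image, bins=8):
    """Per-channel histogram of an 8-bit RGB image, quantized into `bins` intervals.

    image: (H, W, 3) array with values in 0..255.
    Returns a vector of length 3 * bins of normalized frequencies
    (the block of each channel sums to 1).
    """
    img = np.asarray(image)
    n_pixels = img.shape[0] * img.shape[1]
    features = []
    for channel in range(3):  # 0 = red, 1 = green, 2 = blue
        counts, _ = np.histogram(img[..., channel], bins=bins, range=(0, 256))
        features.append(counts / n_pixels)
    return np.concatenate(features)

if __name__ == "__main__":
    # A 2 x 2 image that is pure red: R = 255, G = B = 0.
    red = np.zeros((2, 2, 3), dtype=np.uint8)
    red[..., 0] = 255
    print(quantized_rgb_histogram(red))
```

With eight intervals per channel, each interval covers 32 consecutive intensity values, so the pure red image places all of its red mass in the last red bin and all of its green and blue mass in the first bin of those channels.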
2.2.3. Texture features
The wavelet transform replaces the sinusoids of the Fourier transform by a family of translations and dilations of a single function. The translation and dilation parameters are the two arguments of the wavelet transform. The Haar transform was introduced in 1910 and is the oldest wavelet transform. It is based on the Haar function (equation 3).

$$h(j, k; x) = 2^{j/2}\, H(2^{j} x - k) \tag{3}$$

where $H(x) = 1$ for $x \in\, ]0, 1/2[$, $H(x) = -1$ for $x \in\, ]1/2, 1[$ and $H(x) = 0$ elsewhere. These functions form a complete orthogonal set for the space $L^{2}(\mathbb{R})$. The Haar wavelet is simple to study and implement. Figure 6 shows the decomposition of an image for texture analysis, which extracts the diagonal details, the vertical details, the horizontal details and finally the image description.
2.3. Dimension reduction and normalization step
After the feature extraction phase based on texture, color and interest points, we move on to a dimension reduction step that keeps the most important information. We then go through a binarization stage that uses a dynamic threshold to obtain a binary signature. For dimension reduction, we use principal component analysis (PCA), an exploratory statistical method popularized in France in the 1960s by J.-P. Benzécri and used to describe large tables of individuals/variables data. When individuals are described by a large number of variables, no simple graphical representation makes it possible to visualize the
Fig. 6. Decomposition of the image using the Haar wavelet: (a) image description, (b) horizontal details, (c) vertical details, (d) diagonal details.
point cloud formed by the data. PCA provides a representation in a space of reduced dimension, which makes it possible to highlight any structure in the data. To this end, we look for the subspaces in which the projection of the cloud distorts the initial cloud as little as possible. Applying PCA to the previously computed descriptors yields a signal $f(x)$ that gathers all the characteristic values. This signal is then subjected to automatic thresholding (equation 4) to be transformed into a binary signature $C(x)$ (equation 5).

$$M_f = \sum_{n=1}^{m} f(x_n) \tag{4}$$

where the sum runs over the $m$ values of the signal.
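The dimension reduction step can be sketched with a PCA computed from the singular value decomposition of the centered descriptor matrix. This is one common way to compute PCA and is given only as an illustration: the function name `pca_project` and the parameter `n_components` are assumptions, and the number of components actually retained is not stated here.

```python
import numpy as np

def pca_project(descriptors, n_components):
    """Project descriptor vectors onto their first principal components.

    descriptors: (n_images, n_features) array, one row per image.
    Returns (projected, mean, components), where `projected` has shape
    (n_images, n_components).
    """
    X = np.asarray(descriptors, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean
    # The rows of vt are the principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, mean, components

if __name__ == "__main__":
    # Three points on a line: a single component captures all the variance.
    X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
    projected, mean, components = pca_project(X, n_components=1)
    print(projected.ravel())
```

The sign of each principal direction is arbitrary, so the projected values may come out negated from one run or library to another. A query descriptor would be projected with the stored `mean` and `components` so that it lands in the same reduced space as the indexed images.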
The binary signature $C(x)$ is given by

$$C(x) = \begin{cases} 1 & \text{if } f(x) \geq M_f \\ 0 & \text{if } f(x) < M_f \end{cases} \tag{5}$$

In this way, each image is represented by a binary code, which is stored in a database for later use in the retrieval phase.
2.4. Estimating the similarity between codes
Once the signatures are extracted, we need a technique that quantifies the distance between them in order to identify similar images. The bit-wise comparison of two image codes A and B is given by the normalized Hamming distance (HD) [17], defined as follows:

$$HD = \frac{1}{N} \sum_{k=1}^{N} A(k) \otimes B(k) \tag{6}$$
where $\otimes$ is the Boolean exclusive-or (XOR) operator, described by equation 7:

$$a \otimes b = a\bar{b} + \bar{a}b \tag{7}$$
For two arbitrary image codes, the probability that any two corresponding bits are equal is P = 0.5. When the majority of the bits A(k) and B(k) are equal, the two compared codes are considered equivalent; otherwise, the two codes are different. In general, the closer HD is to zero, the more relevant the image is to the query.
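The thresholding of equations 4 and 5 and the normalized Hamming distance of equation 6 can be sketched together as follows. The code is a hedged illustration, not the implementation evaluated in this paper: in particular, it assumes that the dynamic threshold $M_f$ is the mean of the signal (equation 4 as printed shows the summation only), and the function names are hypothetical.

```python
import numpy as np

def binarize(signal):
    """Binary signature: 1 where the value reaches the threshold, 0 otherwise."""
    f = np.asarray(signal, dtype=float)
    threshold = f.mean()  # assumption: M_f is taken as the mean of the signal
    return (f >= threshold).astype(np.uint8)

def hamming_distance(code_a, code_b):
    """Normalized Hamming distance between two binary codes of equal length."""
    a = np.asarray(code_a, dtype=np.uint8)
    b = np.asarray(code_b, dtype=np.uint8)
    if a.shape != b.shape:
        raise ValueError("codes must have the same length")
    # XOR marks the differing bits; dividing by N normalizes to [0, 1].
    return float(np.count_nonzero(np.bitwise_xor(a, b))) / a.size

def rank_by_hamming(query_code, database_codes):
    """Database indices sorted from most to least similar, with the distances."""
    distances = [hamming_distance(query_code, c) for c in database_codes]
    return np.argsort(distances, kind="stable"), distances

if __name__ == "__main__":
    query = binarize([0.9, 0.1, 0.8, 0.2])
    database = [[0, 1, 0, 1], [1, 0, 1, 0], [1, 0, 0, 0]]
    order, distances = rank_by_hamming(query, database)
    print(order, distances)
```

Because the codes are binary, the comparison reduces to an XOR and a bit count, which is what makes the Hamming distance attractive for matching against a large collection.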
3. Experiments and Results
In the field of image retrieval, there is a tradition of evaluation campaigns dedicated to content-based image retrieval, such as ImageCLEF. The objective of an evaluation campaign is to provide a common data corpus, a clear definition of tasks and a metric to measure performance. Placed under the same experimental conditions, different approaches can be compared more easily and their strengths and weaknesses identified. In this context, we evaluated our contribution on four standard collections: Caltech 101, Caltech 256, ImageCLEF 2013 Plant Task and ImageCLEF 2014 Plant Task. Table 1 gives the number of images and queries used to evaluate our approach.
Table 1. Experimental conditions on the four collections used.
Collection        Number of images   Number of queries
Caltech 101       9146               101
Caltech 256       30607              256
ImageCLEF 2013    20985              5092
ImageCLEF 2014    47815              8163
Figure 7 presents the results of our approach on the four collections. We can conclude that an image retrieval system returns more relevant results when it uses a combination of descriptors (color, texture, interest points), and that dimension reduction appears promising for image indexing.
[Figure 7: horizontal bar chart. Vertical axis: Collection (Caltech 101, Caltech 256, ImageCLEF 2013, ImageCLEF 2014). Horizontal axis: MAP, from 0 to 1.]
Fig. 7. Impact of our approach on Caltech 101, Caltech 256, ImageCLEF 2013 and ImageCLEF 2014 in terms of MAP (Mean Average Precision).
The aim of the experiments reported in this figure is to show the effectiveness of using the Hamming distance in image retrieval. For this purpose, we evaluated our approach on four collections: "Caltech101", "Caltech256", "ImageCLEF 2013" Plant task and "ImageCLEF 2014" Plant task, for which we obtain MAP values of 0.79, 0.87, 0.48 and 0.54, respectively. We observe a difference in results between the four collections. The lower results on the ImageCLEF collections are due to the nature of the plant task, which includes heterogeneous images taken in very natural conditions and sometimes by amateurs.
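The text does not detail how the MAP values are computed. The sketch below uses the usual definition of Mean Average Precision, under the assumption that every relevant image of a query appears somewhere in its ranked list; the function names and the boolean relevance lists are illustrative only and are not taken from the evaluation tools of the campaigns.

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance: list of booleans in rank order, True where the
    returned image is relevant. Assumes all relevant images are in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision measured at the rank of each relevant image.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

def mean_average_precision(all_ranked_relevance):
    """Mean of the average precisions over all queries."""
    aps = [average_precision(r) for r in all_ranked_relevance]
    return sum(aps) / len(aps) if aps else 0.0

if __name__ == "__main__":
    queries = [[True, False, True], [False, True]]
    print(mean_average_precision(queries))
```

A MAP close to 1 means that relevant images are consistently ranked at the top of the result lists, which is how the values reported for the four collections should be read.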
4. Conclusions
In this article, we have presented a content-based image retrieval method that relies on extracting a signature from each image. This signature is composed of a set of descriptors: the color histogram of the image, Harris interest points and the Haar wavelet descriptor. The signature undergoes a dimension reduction step using PCA. An experimental study on four image collections showed the effectiveness of our descriptors.
References
1. Flickner, M., Sawhney, H., Niblack, W., Ashley, J., Huang, Q., Dom, B., et al. Query by image and video content: The QBIC system. vol. 28. Los Alamitos, CA, USA: IEEE Computer Society Press; 1995, p. 23–32. doi:10.1109/2.410146.
2. Iqbal, K., Odetayo, M.O., James, A. Content-based image retrieval approach for biometric security using colour, texture and shape features controlled by fuzzy heuristics. vol. 78. Orlando, FL, USA: Academic Press, Inc.; 2012, p. 1258–1277. URL: http://dx.doi.org/10.1016/j.jcss.2011.10.013. doi:10.1016/j.jcss.2011.10.013.
3. Ko, B., Byun, H. Frip: a region-based image retrieval tool using automatic image segmentation and stepwise boolean and matching. vol. 7. 2005, p. 105–113.
4. Caron, Y., Makris, P., Vincent, N. Caractérisation d'une région d'intérêt dans les images. In: EGC. 2005, p. 451–462.
5. Fournier, J., Cord, M., Philipp-Foliguet, S. Retin: A content-based image indexing and retrieval system. 2001.
6. Rivero-Moreno, C., Bres, S. Les filtres de Hermite et de Gabor donnent-ils des modèles équivalents du système visuel humain? In: ORASIS 2003, Journées francophones des jeunes chercheurs en vision par ordinateur. 2003, p. 423–432. URL: http://liris.cnrs.fr/publis/?id=1174; article published in a national conference with proceedings and review committee.
7. Tollari, S. Indexation et recherche d'images par fusion d'informations textuelles et visuelles. Ph.D. thesis; 2006.
8. Braquelaire, J.-P., Brun, L. Comparison and optimization of methods of color image quantization. vol. 6. 1997, p. 1048–1051.
9. Malki, M., Ayache, M., Rahmouni, M.K. Rétro-ingénierie des bases de données relationnelles : approche basée sur l'analyse de formulaires. In: INFORSID. 1999, p. 55–74.
10. Berrani, S.A., Amsaleg, L., Gros, P. Recherche par similarités dans les bases de données multidimensionnelles: panorama des techniques d'indexation. vol. 7. 2002, p. 9–44.
11. Claveau, V., Tavenard, R., Amsaleg, L. Vectorisation des processus d'appariement document-requête. In: CORIA. 2010, p. 313–324.
12. Karamti, H. Vectorisation du modèle d'appariement pour la recherche d'images par le contenu. In: CORIA. 2013, p. 335–340.
13. Kass, M., Witkin, A., Terzopoulos, D. Snakes: Active contour models. International Journal of Computer Vision 1988;1(4):321–331.
14. Harris, C., Stephens, M. A combined corner and edge detector. In: Proc. of Fourth Alvey Vision Conference. 1988, p. 147–151.
15. Montesinos, P., Gouet, V., Deriche, R. Differential invariants for color images. In: Pattern Recognition, 1998. Proceedings. Fourteenth International Conference on; vol. 1. 1998, p. 838–840. doi:10.1109/ICPR.1998.711280.
16. Fakhfakh, S., Akrout, B., Tmar, M., Mahdi, W. A visual search of multimedia documents in ImageCLEF 2014. In: Working Notes for CLEF 2014 Conference, Sheffield, UK, September 15-18, 2014. 2014, p. 715–723.
17. Akrout, B., Khanfir Kallel, I., Benamar, C., Ben Amor, B. A new scheme of signature extraction for iris authentication. In: Systems, Signals and Devices, 2009. SSD '09. 6th International Multi-Conference on. 2009, p. 1–8. doi:10.1109/SSD.2009.4956749.