Content Based Image Retrieval Using Pure Color Histogram and Multi Structures Occurrences in Uniform Patterns of RLBP

Submitted by Aminullah

Supervised by DR. MUHAMMAD SAJJAD

DEPARTMENT OF COMPUTER SCIENCE ISLAMIA COLLEGE PESHAWAR SESSION 2012-2016

Content Based Image Retrieval Using Pure Color Histogram and Multi Structures Occurrences in Uniform Patterns of RLBP

Aminullah
Project Thesis Submitted to the Department of Computer Science in Partial Fulfillment of the Requirements for the Degree of BS Computer Science

Approved By DR. MUHAMMAD SAJJAD

Islamia College University Peshawar Department of Computer Science Session 2012-2016

Dedicated to my dear parents and respected teachers


Abstract
Due to the exponential growth of digital image databases and their wide deployment in applications in health, the military, social media and art, the need for content based image retrieval (CBIR) has emerged. In a CBIR system, the actual content of the image (color, texture and shape) is analyzed, and similar images are retrieved from the database on that basis. However, retrieving similar images from a massive image collection is a tough task because of the complex properties of visual data. A digital image has many visual properties, such as color, shape and texture. In the proposed method we analyze two properties of an image for a CBIR system. First, the HSV color space is quantized and a pure color histogram is calculated. Second, texture features based on structures found in the uniform patterns of rotated local binary patterns (RLBP) are extracted. The characteristics of the color histogram and the texture features are then compared and analyzed for CBIR. The proposed system fuses the color histogram and the structures found in uniform patterns of RLBP into a single feature vector. The proposed method is evaluated on three benchmark datasets; the image retrieval experiments show that it generates better results than other state-of-the-art techniques.

Keywords: Content based image retrieval, fused features, pure color histogram, and rotated local binary patterns.


List of Figures
Figure 1: Concept of query by text and query by example.
Figure 2: Working of a typical CBIR system.
Figure 3: Various ways of describing image content.
Figure 4: The proposed framework.
Figure 5: RGB to HSV.
Figure 6: (a) the 3x3 neighborhood taken from the image, (b) the mask window based on the LBP comparison, (c) the weights window for the 8-bit binary code with radius equal to 1.
Figure 7: Effect of rotation on the LBP and RLBP operators; the RLBP values for the original and rotated image are the same.
Figure 8: Five predefined structures used to count the occurrences of each uniform pattern in UPATTERN.
Figure 9: (a) the IPATTERN; (b) the occurrences of the pattern 00111000 in five directions according to the 2x2 structures.
Figure 10: Sample images from each category of the Corel-1k image dataset.
Figure 11: Sample images from the ZB building image dataset.
Figure 12: Comparison of retrieval using color only, texture only and the proposed fused features.
Figure 13: Precision and recall graph for the Corel-10k dataset compared against other general CBIR methods.
Figure 14: Precision and recall graph for the Corel dataset compared against LBP and related CBIR methods.
Figure 15: Visual results on the Corel-10k dataset.
Figure 16: Retrieval result on the Corel-1k dataset for African people images.
Figure 17: Retrieval result on the Corel-1k dataset for bus images.
Figure 18: Retrieval result on the Corel-1k dataset for beach images.
Figure 19: Retrieval result on the Corel-1k dataset for building images.
Figure 20: Retrieval result on the Corel-1k dataset for dinosaur images.
Figure 21: Retrieval result on the Corel-1k dataset for horse images.
Figure 22: Retrieval result on the Corel-1k dataset for mountain images.
Figure 23: Retrieval result on the Corel-1k dataset for flower images.
Figure 24: Visual result on the ZB building dataset.

List of Tables
Table 1: Difference between CBIR and DBIR.
Table 2: Summary of the input and output parameters in the proposed content based image retrieval (CBIR) system.
Table 3: Transition counts for example patterns; each change from 0 to 1 or from 1 to 0 counts as one transition.
Table 4: Performance comparison of our method with state-of-the-art methods on the ZB dataset.
Table 5: Quantitative evaluation of frame retrieval in videos.
Table 6: Retrieval results of frames from video.
Table 7: Time complexity on different database images.

List of Abbreviations
CBIR      Content Based Image Retrieval
DBIR      Description Based Image Retrieval
QBIC      Query by Image Content
EHD       Edge Histogram Descriptor
MPEG      Moving Pictures Expert Group
MTH       Multi Texton Histogram
GLCM      Gray Level Co-occurrence Matrix
MSD       Micro Structure Descriptor
SED       Structure Element Descriptor
SEH       Structure Element Histogram
LBP       Local Binary Pattern
LDP       Local Direction Pattern
LTxXORP   Local Texton XOR Pattern
LTP       Local Ternary Pattern
HSV       Hue, Saturation, Value
RLBP      Rotated Local Binary Pattern
MAP       Mean Average Precision
ZB        Zurich Building Image Database

TABLE OF CONTENTS
Abstract
List of Figures
List of Tables
List of Abbreviations
Acknowledgment
CHAPTER NO.1 Introduction
1.1 Description based image retrieval (DBIR)
1.2 Introduction to the content based image retrieval (CBIR)
1.3 Difference between CBIR and DBIR
1.4 Hybrid Approach of CBIR and DBIR
1.5 Problem Definition
1.6 Problem Statement
1.7 Motivation
1.8 Organization of Thesis
CHAPTER NO.2 Literature Review
2.1 Overview of related works
2.2 Contents of an image
2.2.1 Color content
2.2.2 Texture content
2.2.3 Shape content
2.3 Types of CBIR system
2.3.1 Region-based
2.3.2 Object-based
2.3.3 Example-based
2.3.4 Feedback-based
2.4 General CBIR techniques
2.5 Local binary pattern (LBP) and its relevant patterns
CHAPTER NO.3 Proposed Method
3.1 Overview
3.2 Color Feature Extraction
3.2.1 RGB to HSV
3.2.2 Color quantization in HSV color space
3.2.3 Proof for 70 bins
3.3 Texture feature extraction
3.3.1 Local binary pattern
3.3.2 Rotated local binary pattern
3.3.3 Labeling uniform patterns
3.4 Algorithm for Content Based Image Retrieval Using Pure Color Histogram and Multi Structures Occurrences in Uniform Patterns of RLBP
CHAPTER NO.4 Experimental Evaluation
4.1 Databases for experimental evaluation
4.2 Similarity measure (Euclidean distance)
4.3 Precision and Recall
4.4 Mean average precision (MAP)
4.5 Subjective and Visual Result Assessment
4.5.1 Corel dataset
4.5.2 ZB building dataset
4.5.3 Retrieval performance in videos
4.6 Time Complexity Performance
CHAPTER NO.5 Conclusion and Future Works
5.1 Conclusion
5.2 Suggestions for Future Work
5.3 Promoting Code to C++
References

ACKNOWLEDGEMENT
First and above all, I would like to thank Allah Almighty for His infinite bounties, blessings and help, without which nothing would have been possible. Countless blessings upon Prophet Muhammad ‫ ﷺ‬, the final Prophet of Allah, the Prophet of Rahmah, the intercessor of the Ummah, whose spiritual guidance and help are always with me. I would like to express my deepest gratitude to my supervisor Dr. Muhammad Sajjad for his guidance, his help, his patience and his never-ending support. His continuous confidence in me was a real source of motivation. I would like to thank all my other teachers who taught me during my course work, delivering their precious and valuable thoughts and building my concepts, with special thanks to my dear teacher Mr. Jamil Ahmad for his support and guidance in this project. I would like to thank my parents and my brothers for providing me an education, raising me, and enabling me to complete this work.


CHAPTER NO.1 INTRODUCTION

1.1 Description based image retrieval (DBIR):
In description based image retrieval (also called text based image retrieval), which predates CBIR systems, images are searched and retrieved using text labels associated with each image. For example, images of elephants would first be labeled with the text "Elephant"; at search time, a text query is passed to retrieve the similar images. In such techniques the images are manually categorized with text labels, which is a laborious task requiring human interaction; and since digital image collections are growing day by day, from commercial to personal use, up to several terabytes, it is very difficult to name each image separately. Because labeled text based image retrieval techniques are burdensome, content based image retrieval techniques began to emerge.

[Figure 1 diagram: a query, either a textual query or a query by example (an image or a sketch), is matched against the database of images to produce the retrieved images.]

Figure 1: Concept of query by text and query by example.


The advantages of text based image retrieval are that retrieval is fast and easy to implement; only full text search algorithms are required for the labeled images. The disadvantages are that it needs manual annotation, which is a laborious task, and it is very difficult to give a text label to each image. Manual annotations are also subjective: anyone can give their own suggested name to an image. The polysemy problem, where more than one object can be referred to by the same word, is a further negative point of DBIR.

1.2 Introduction to the content based image retrieval (CBIR):
Content-based image retrieval (CBIR) is a system in which search is based on the actual contents of the image. The term content refers to color, shape, texture, or any other information that can be derived from the image itself [1, 2]. In CBIR the content of the image is analyzed and some information is extracted, through which similar images are then retrieved from the database. Due to the exponential growth of digital images in image databases, CBIR has become one of the most popular research areas in the field of digital image processing. A considerable amount of research is underway around the world on the problem of pair-wise image matching, the one-to-one comparison of images, as in face recognition, fingerprint recognition and facial expression analysis, where an image is compared with its pair, because pair-wise matching is the core component of CBIR systems [3]. Humans tend to use high-level concepts to recognize an object, such as the object's name, dimensions and physical characteristics. On the other hand, features extracted automatically using computer vision techniques are mostly low-level features such as color, texture and shape. In general, there is no direct link between the high-level concepts and the low-level features [4]. Low-level features fail to describe high-level semantic concepts, but researchers have proposed machine learning based image retrieval techniques which narrow the gap between low-level features and high-level semantic concepts [5-7]. The proposed framework, however, works well in content based image retrieval without using machine learning techniques. A typical CBIR system contains four parts in its process, as depicted in Figure 2:
1. Creating the image data collection.
2. Building the feature database by automatically extracting features of the images in the image data collection.


3. Searching for a required image using the feature database.
4. Arranging the order of the retrieved results.

[Figure 2 diagram: the input image and the database images each pass through preprocessing and feature extraction; the query features are matched against the feature database to produce the retrieval results.]

Figure 2: Working of a typical CBIR system.

The content based image retrieval technique takes one query image and retrieves similar images from the database. The database images are preprocessed to extract their features, and these features are stored for further processing. The extracted features are declared as feature vectors. The features of the query image are also extracted, so that they can be compared with the features of the database images to measure similarity. An image consists of colors, textures and shapes as its contents. We have proposed a technique for image retrieval based on color and local texture features. Low-level visual features of the images, such as color and texture, are used in our technique because they are easy to extract and they contain useful information for measuring image similarity. Color information in an image is very important, so we have used the HSV color model for the color feature, because the HSV color model is close to human perception. Two planes of HSV (hue and saturation) are first quantized to a one-dimensional plane and then its histogram is calculated as the color feature. Texture is also an important visual feature: it represents the surface properties in an image and helps to analyze the relationship between pixels. Many objects in an image can be detected easily by their textures without any other information. In our approach we have used the rotated local binary patterns (RLBP) [8] to analyze the texture by counting the occurrences of five predefined structures for each texture pattern. Through RLBP, rotation invariance is achieved by computing the descriptor with respect to a reference in a local neighborhood. In the end we calculate the histogram of the five structure occurrences in the texture of the image. Through experiments it was found that these structures in the uniform patterns of RLBP serve as a useful descriptor for image retrieval.

1.3 Difference between CBIR and DBIR:

Table 1: Difference between CBIR and DBIR.

DBIR positives:
- Only full text search algorithms are required for labeled images.
- Easy to implement.
- Fast retrieval.

DBIR negatives:
- Manual annotation is a laborious task.
- Manual annotations are subjective.
- Polysemy problem (more than one object can be referred to by the same word).

CBIR positives:
- Automatic index construction using the content of the image (color, texture, shape).
- Similarities of images are based on the distances between features.

CBIR negatives:
- Semantic gap.
- Querying by example is not convenient for a user.
- Slow due to high resolution image processing.

1.4 Hybrid Approach of CBIR and DBIR:
The recent trend in image search is to fuse the basic techniques for Web images, i.e., textual context (usually represented by keywords) and visual features, for retrieval. In the hybrid approach we join the existing textual and visual features to provide better results. The simplest approach is based on counting the frequency of occurrence of words for automatic indexing. A second approach takes a different stand and treats images and text as equivalent data: it attempts to discover the correlation between visual features and textual words on an unsupervised basis, by estimating the joint distribution of features and words and posing


annotation as statistical inference in a graphical model. However, a pure combination of text based and content based image retrieval approaches is not sufficient for dealing with the problem of image retrieval on the Web. We can also combine features within content based image retrieval itself to improve the retrieval technique; such hybrid features, e.g. color with texture or texture with shape, likewise enhance retrieval. In the proposed method we have fused the color and texture features of the image for the content based image retrieval system.

1.5 Problem Definition:
The task of content based image retrieval is to get visually similar images from the database Φ = {φ1, φ2, φ3, ..., φN}, given an input image II. The database Φ contains different categories of images φ1, φ2, φ3, ..., φN; if the query image is given from any category, the retrieved images should be visually similar to it. For that purpose we first preprocess the query image and all images of the database to form feature vectors. The preprocessing consists of two parts: in the first part we calculate the pure color histogram CFEATURES as the color feature vector, and in the second part some basic predefined structures are counted in the uniform patterns of RLBP to form the texture feature vector TFEATURES. Finally, the similarity of the query image and each database image is calculated through the Euclidean distance; a lower Euclidean distance value means a higher similarity with the query image.

1.6 Problem Statement:
The problem is defined as follows: given a query image, retrieve images from the database that are as visually similar to the query image as possible.

1.7 Motivation:
Color and texture are among the crucial primitives in human vision. Color and texture features have been used to identify the contents of images; for example, texture can describe content such as sky, grass, clouds, bricks and hair. Identifying and describing the characteristics of an image is accelerated when texture is integrated with color, so the details of the important features of image objects for human vision can be provided. One crucial difference between color and texture features is that color is a point, or pixel, property, whereas texture is a local-neighborhood property. As a result, it does not make sense to discuss texture content at the pixel level without considering the neighborhood. The color and texture information together therefore give us a very good descriptor for building a content based image retrieval system.

1.8 Organization of Thesis:
The thesis is organized as follows: Chapter 2 presents the techniques that are used for CBIR based on the texture and color content of images, along with a survey of some existing content-based retrieval systems. Chapter 3 explains the proposed content-based retrieval system in detail: color feature extraction is discussed in Section 3.2, texture feature extraction in Section 3.3, and an algorithm for the proposed method in Section 3.4. The performance experiments are given in Chapter 4: the databases and quality assessment metrics are discussed in Sections 4.1 to 4.4, and the subjective and visual result assessment for each database in Section 4.5. Finally, Chapter 5 concludes the thesis and outlines future work.


CHAPTER NO.2 LITERATURE REVIEW

2.1 Overview of related works:
Content based image retrieval is currently a very active area of research, and several state-of-the-art techniques have been proposed in this area over the last two decades. The term "content-based image retrieval" was first used by T. Kato [9] in 1992 to describe experiments on automatic retrieval of images from a database based on the color and shape features of the image. Since then, the term "content based" has been used to describe the process of retrieving similar images from a database on the basis of the actual content of the image.

[Figure 3 diagram: image content branches into color, texture and shape.]

Figure 3: Various ways of describing image content.

2.2 Contents of an image:

2.2.1 Color content:
The color feature is the most important one when searching collections of color images; color plays a very important role in the human visual perception mechanism. Methods for color feature representation can be divided into two groups: color histograms and statistical methods of color representation. The most commonly used color spaces are RGB (red, green and blue, used in color monitors and cameras), CMY (cyan, magenta and yellow) and CMYK (cyan, magenta, yellow and black, used in color printers); RGB, CMY and CMYK are mostly used on the hardware side, while on the software side we have Lab (L*a*b: lightness plus two color dimensions, from green to red and from blue to yellow), HSI and HSV (hue, saturation and value). The simplest and most widely used form of color feature representation is the color histogram: for each intensity level of the desired color space, the number of occurrences is counted. Such a representation of color information is simple and ordinary, but it has one very clear disadvantage: two images may be visually similar although their colors are not the same, or two images may be visually different although their color histograms come out the same. Moreover, such histograms alone have little discriminative power for image retrieval.

2.2.2 Texture Content:
The texture of an image gives us information about the structural arrangement of pixels and objects in the image. Texture is not defined for a single pixel; it is formed by textons in a neighborhood and depends on the distribution of color levels over the image. Texture possesses periodicity and scalability properties, and it can be described by main directions, contrast and sharpness. Texture analysis plays an important role in the comparison of images, adding to the color feature. The most popular methods that use texture features for image retrieval, such as EHD, MTH, MSD and SED, are discussed in the review of general CBIR methods below.

2.2.3 Shape Content:
In computer vision and image processing, alongside the widely used color and texture characteristics, the shape features of objects in an image are also often used for image matching. Methods for describing and representing the shape of an object in an image are divided into two groups: external methods, which represent the region of an object in terms of its external properties such as its boundary, and internal ones, which represent the region of an object in terms of its internal properties such as area, perimeter and centroid. Accordingly, shape features are divided into two types: boundary properties (boundary descriptors) and regional properties (regional descriptors).

Boundary Descriptors:
Chain code: It describes an object boundary as a sequence of line segments with a given orientation. To build a chain code, the image is covered with a grid, and the boundary points are approximated by the nearest grid nodes. The line segments connect the neighboring nodes.

Fourier descriptors: The Fourier descriptors are one of the most popular methods of contour parameterization. The basic idea of this method is to apply the discrete Fourier transform to the boundary signature and use the resulting Fourier coefficients as parameters describing the contour.

Regional descriptors:
Regional properties can be described by the simplest geometrical parameters, such as area, perimeter and centroid. Moments and their invariants are presently the most common and widely used region descriptors.

2.3 Types of CBIR system: 2.3.1 Region-based: Region-based image retrieval (RBIR) was recently proposed as an extension of content-based image retrieval (CBIR). An RBIR system automatically segments images into a variable number of regions, and extracts for each region a set of features. Then, a dissimilarity function determines the distance between a database image and a set of reference regions. During retrieval, a user is provided with segmented regions of the query image, and is required to assign several properties, such as the regions to be matched, the features of the regions, and even the weights of different features [10]. 2.3.2 Object-based: Object-based image retrieval systems retrieve images from a database based on the appearance of physical objects in those images. These objects can be elephants, stop signs, helicopters, buildings, faces, or any other object that the user wishes to find. One common way to search for objects in images is to first segment the image in the database and then compare each segmented region against a region in some query image presented by the user [11]. Such image retrieval systems are generally successful for objects that can be easily separated from the background and that have distinctive colors or textures.


2.3.3 Example-based:
Query by example (QBE), or query by image, is a query technique in which the CBIR system is provided with an example query image, or part of one, upon which it then bases its search. Searching with multiple model images or with a sketched image is also possible. The result images should all share common elements with the provided sample image. Users give a sample image, or part of an image, that the system uses as the query for the search; the system then finds images that are similar to it. The proposed method is also a query by example technique for retrieving similar images from a database.

2.3.4 Feedback-based:
Relevance feedback, originally developed for information retrieval, is a supervised learning technique used to improve the effectiveness of information retrieval systems. The main idea of relevance feedback is to use positive and negative examples provided by the user to improve the system's performance. For a given query, the system first retrieves a list of ranked images according to predefined similarity metrics, often defined as distances between feature vectors of images. The user then selects a set of positive and/or negative examples from the retrieved images, and the system subsequently refines the query and retrieves a new list of images. The key issue is how to incorporate the positive and negative examples to refine the query and how to adjust the similarity measure according to the feedback. The system shows the user a sample of pictures and asks the user to rate them; using these ratings, the system re-queries and repeats until the right image is found [12].

2.4 General CBIR techniques:
Generally, in CBIR, several descriptors have been proposed to represent the image with respect to its color, shape or texture, and various algorithms have been designed to extract features for image retrieval. The edge histogram descriptor (EHD) represents the spatial distribution of five types of edges in the image. It calculates a histogram of the local edge distribution over five edge types (vertical, horizontal, 45° diagonal, 135° diagonal and isotropic). It is an efficient texture descriptor for images and gives good results on the MPEG-7 dataset [13]. The edge orientation autocorrelogram (EOAC) was proposed as a shape based descriptor; it classifies image edges based on two factors, their orientations and the correlation between neighboring edges [14]. The texton co-occurrence matrices (TCM) describe the spatial correlation of textons for image retrieval [15]. Guang-Hai Liu proposed an image feature representation method called the multi-texton histogram (MTH). MTH integrates the advantages of the co-occurrence matrix and the histogram by representing the attributes of the co-occurrence matrix as a histogram; it works directly on natural images as a shape descriptor [16]. The micro-structures descriptor (MSD), built around edge orientation similarity, is a low dimensional feature extraction method which effectively integrates color, texture, shape and color layout information based on the underlying colors in microstructures with similar edge orientation [17]. Another method, the structure element descriptor (SED), represents local features of the image and extracts and describes color and texture features; the structure element histogram (SEH) is computed from the SED. SEH integrates the advantages of both statistical and structural texture description methods, and it can represent the spatial correlation of color and texture [18]. The dominant color descriptor (DCD) was proposed in MPEG-7. DCD extracts semantic features from dominant colors (a weight for each dominant color); it helps to reduce the effect of the image background on the matching decision, so that an object's colors receive much more focus [19]. Fusion-based approaches have been developed with the idea that a single feature representation cannot represent the heterogeneous and complex structures in content with sufficient discrimination; hence, different methods are combined by researchers to obtain improved performance. Deepak John and S. T. Tharani fused the HSV color histogram and the gray level co-occurrence matrix (GLCM) for image retrieval: the image is divided into sub-blocks, the color of each sub-block is extracted by quantizing the HSV color space into non-equal intervals and represented by a cumulative histogram, and the texture of each sub-block is obtained using the gray level co-occurrence matrix. The experimental results improve with this fusion [20].

2.5 Local binary pattern (LBP) and its relevant patterns:
There are several texture analysis methods used in the field of computer vision and image processing, and the local binary pattern is one of the most efficient of them. The LBP operator is invariant to light and contrast changes in the image. A lot of work has been done with LBP in object recognition, face recognition and facial expression analysis because of its low complexity, simplicity and low dimensionality. For content based image retrieval, Oana-Astrid and Mihaela [21] analyzed the histogram of LBP patterns and the histogram of uniform patterns in LBP. In another LBP based approach, the image is first divided into blocks and then the histogram of the LBP of each block is calculated to form the feature vector [22]. However, the histogram alone is not an efficient way to discriminate the visual content of an image. In addition, LBP has been extended to the local direction pattern (LDP), which modifies the input image with Kirsch edge response masks in eight directions to get the edge response values; the top k values are set to 1 and the rest to 0 to form the LDP patterns. The histogram of LDP patterns is taken as a feature vector and a color histogram is fused with it for image retrieval [23]. The local texton XOR patterns (LTxXORPs) are formed by applying seven 2x2 textons to the image and then XORing the central pixel's value with the neighboring pixels to form the LTxXORP pattern image, whose histogram is calculated as a feature vector. However, the procedures used for texture analysis, such as LBP, LDP, LTP, LTrP and LTxXORP, are based only on the histogram of the patterns to obtain the feature vector, and an image cannot be represented by the histogram of the patterns alone, as it does not take the texture, edge and shape features into account. Therefore we analyzed the texture in a different way, forming the feature vector by analyzing different structures in the texture patterns.


CHAPTER NO.3 PROPOSED METHOD

3.1 Overview:
In this section, the framework of the proposed system and its constituents are discussed in detail, covering the concepts involved in retrieving images from the database Φ = {φ1, φ2, φ3, ..., φN} based on content similarity with a query image II. The input and output parameters of the proposed method are given in Table 2. The proposed procedure for retrieving similar images from the database is divided into two parts. In the first part, we convert the RGB image II to HSV, IHSV, quantize IHSV to QHSV, and then calculate the color histogram of the quantized image QHSV, i.e., the color feature vector CFEATURES. In the second part, the image II is converted from RGB to grayscale, the rotated local binary patterns (RLBP) are extracted into IPATTERN, the uniform patterns are labeled, and finally five 2x2 structures are counted for each pattern to form the texture feature vector TFEATURES. The proposed method fuses the color feature CFEATURES and the texture feature TFEATURES into one feature vector. The same process is repeated for all images of the database Φ = {φ1, φ2, φ3, ..., φN}. Similarity is measured at the end using the Euclidean distance between the query image and all images in the database Φ. The two parts of the proposed method are discussed separately below; Figure 4 shows our proposed framework.


Table 2: Summary of the input and output parameters in the proposed content based image retrieval (CBIR) system.

Description of model parameters:
II        Input query image.
IHSV      Transformed H, S and V image.
H, S, V   The hue, saturation and value components of the image.
QHSV      Quantized image.
CFEATURES Color feature vector.
LS, LV    Quantization levels of the S and V components.
Q1D       One-dimensional image.
TFEATURES Texture feature vector.
IG        Transformed gray image.
IPATTERN  Image consisting of rotated local binary patterns (RLBP).
UPATTERN  Image with uniform patterns labeled.
ΨF        Query image fused feature vector.
ΦF        Database image fused feature vector.

[Figure 4 diagram: the query image and each database image follow two paths. Color path: RGB to HSV, HSV quantization, hue and saturation histogram. Texture path: RGB to grayscale, RLBP patterns, labeling of uniform patterns, histogram of patterns in structures. The color and texture features are fused, and similarity measures between the query features and each database image's features return the similar images.]

Figure 4: The proposed framework.

3.2 Color Feature Extraction:
In a content based image retrieval system, feature (content) extraction is a basic task. In text-based retrieval, the search is based on text features such as keywords and annotations, while a CBIR system uses visual features such as color, texture and shape. General visual features are color, texture and shape, while domain specific features include fingerprints and human faces; domain specific features are studied in pattern recognition according to the domain knowledge of the problem. The color feature plays the most significant role in searching for a desired image in an image database, and color plays an important role in the human visual perception system. There exist several color spaces, which are specific forms of color models used in image processing, each with its own properties; some color models are RGB, CMYK, Y'UV, YIQ, HSI and HSV [24]. The HSV color model is closer to human conceptual understanding of colors and it contains rich information about the purity of color [25-27]. The RGB color model is a mixture of red, green and blue in which chromatic information cannot be separated from intensity; therefore we use the HSV color model.

3.2.1 RGB to HSV:
HSV is the color model in which we can separate the chroma information from the light intensity information, gaining robustness to lighting changes. The hue and saturation components are related to the way the human eye observes colors. In other color models the colors are derived from primary colors, as in RGB, where the other colors are derived from combinations of red, green and blue; in HSV, the hue refers to which color the pixel looks like, so all the tints and shades of red have the same hue. The hue component is expressed from 0° to 360°: red starts at 0°, yellow at 60°, green at 120°, cyan at 180°, blue at 240° and magenta at 300°. Saturation measures how little white is mixed into the color; a pure color is fully saturated, with a saturation of 1. The value describes how dark the color is: a value of 0 refers to black for every hue, and increasing lightness shifts the color away from black. The major information in the HSV color model is carried by hue and saturation: if we separate out the light information we get the pure color, because lighting changes the apparent color. For example, the sky looks grayish at dawn, blue at noon and yellowish at sunset, although the sky's color is blue; by separating the light information from hue and saturation we obtain the pure color.


The conversion formulas are given in equations (1), (2) and (3) below:

$I_H = \cos^{-1}\left(\dfrac{\tfrac{1}{2}\left[(R-G)+(R-B)\right]}{\sqrt{(R-G)^2+(R-B)(G-B)}}\right)$ … (1)

$I_S = 1 - \dfrac{3\min(R,G,B)}{R+G+B}$ … (2)

$I_V = \dfrac{1}{3}(R+G+B)$ … (3)

Figure 5: RGB to HSV.

3.2.2 Color quantization in HSV color space:
The HSV color space is widely used in color feature extraction. In this space, hue is used to distinguish colors, saturation is the percentage of white added to a pure color, and value refers to the perceived light intensity. The advantage of the HSV color space is that it is closer to human conceptual understanding of colors [25-27]. Color provides rich information for image retrieval systems, but it is difficult and time consuming to compute a large number of colors for an individual image; therefore we quantize the color space in a way that loses as little information as possible. According to the characteristics of the HSV color space, with hue in [0, 360], saturation in [0, 1] and value in [0, 1], we quantize the HSV color space to 70 colors using the following formulas [28], given in equations (4) and (5):

17

0,  1, 2,  3, QH   4, 5,  6,  7,

IH [0,24]  IH [345,360] IH [25,49] IH [50,79] IH [80,159] IH [160,194] IH [195,264] IH [265,284] IH [285,344]

... (4)

0, IS [0,0.15]  QS  1, IS [0.15,0.8]  2, IS [0.8,1]

After quantization of

IH SV

 QHSV (quantized HSV) we

image based on the above classification using the following Q1D  LSLV QH  LV QS

… (5)

construct one-dimensional equations 4 and 5.

… (6)

In the above formula, LS, LV are the quantization orders or levels of the color space component S and V, in this paper the S and V are quantized into three levels, therefore LS, LV are equal to 3, the formula becomes [28].

Q1D  3 * 3* QH  3* QS

… (7)

We get one-dimensional image Q1D from hue and saturation because we have used pure color information for retrieval. We call this a pure color information because it is taken from the Chroma information and we ignore the intensity of light information. After getting the one-dimensional image we calculate 70 bins color histogram as a color feature vector CFEAUTURES of an image. Because the above formula and the classification of H and S, we know, that the value of image’s one dimensional feature vector is from 0 to 69, therefore the feature of every image will be expressed as a histogram with 70 bins.


3.2.3 Proof for 70 Bins:
The highest quantized hue value QH is 7 and the highest quantized saturation value QS is 2. Putting these values into the formula $Q_{1D} = 3 \times 3 \times Q_H + 3 \times Q_S$ gives $Q_{1D} = 3 \times 3 \times 7 + 3 \times 2 = 69$, so Q1D ranges from 0 to 69, i.e., 70 bins.

3.3 Texture feature extraction:
Texture feature extraction is an important research area of computer vision, for which various classification and matching algorithms have been developed. The richness of texture useful for matching and classification can be seen in objects such as wood, plants, materials and skin, because each object has its own texture properties. In the proposed method we analyze the texture through a set of structures with different orientations applied to the uniform patterns of RLBP, discussed in detail later in this section.

3.3.1 Local binary pattern:
Before describing the rotated local binary patterns in detail, let us briefly review the local binary pattern. The LBP operator is computed in a local circular region by taking the difference of the central pixel with respect to its neighbors, as follows [29]:

… (9)

where gc and gp are the gray values of the central pixel and its neighbor respectively, p is the index of the neighboring pixel, R is the radius of the circular neighborhood and N is the number of neighbors. The term $2^p$ indicates the corresponding value in the weights window; in our approach the first weight is placed at window cell (2, 3) and the weights proceed in the clockwise direction.


(a) 3x3 neighborhood:
66 65 62
67 66 61
69 68 67

(b) mask window (neighbor >= center 66 gives 1):
1 0 0
1 - 0
1 1 1

(c) weights window:
32 64 128
16 --   1
 8  4   2

Figure 6: (a) the 3x3 neighborhood taken from the image, (b) the mask window based on the LBP comparison, (c) the weights window for the 8-bit binary code with radius equal to 1.

Pattern = 00111110, LBP = 0 + 0 + 32 + 16 + 8 + 4 + 2 + 0 = 62.

The LBP operator compares each neighborhood pixel with the central pixel's value. If the neighboring pixel's value is greater than or equal to the center pixel's value, we put 1 in the corresponding location of the mask window; if it is less, we put 0. We then multiply the values in the mask window by the corresponding values of the weights window. The order of the weights is fixed in the circular neighborhood. If we rotate the image, the arrangement of the pixels around the central pixel also changes; therefore, computing the LBP of an image and computing it again after rotation gives different results. So LBP is not rotation invariant, while in content based image retrieval we have to retrieve visually similar images even if an image is rotated by any angle.
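As an illustration, here is a minimal NumPy sketch of equation (9); the thesis does not provide its implementation, so the function name and the vectorized formulation are ours. The neighbor order reproduces the weight layout of Figure 6(c): clockwise starting at the middle-right cell (2, 3).

import numpy as np

def lbp_8_1(img):
    """Plain LBP per equation (9): radius 1, 8 neighbors on the 3x3
    square, weights 2^p assigned clockwise from the middle-right cell."""
    img = img.astype(np.int32)
    h, w = img.shape
    c = img[1:-1, 1:-1]                            # all central pixels at once
    offs = [(0, 1), (1, 1), (1, 0), (1, -1),       # clockwise from middle-right
            (0, -1), (-1, -1), (-1, 0), (-1, 1)]
    code = np.zeros_like(c)
    for p, (dy, dx) in enumerate(offs):
        nb = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        code |= (nb >= c).astype(np.int32) << p    # s(g_p - g_c) * 2^p
    return code

On the Figure 6 neighborhood this reproduces the worked value: the mask bits at weights 32, 16, 8, 4 and 2 are set, giving 62.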

3.3.2 Rotated local binary pattern:
The problem with LBP is that it is not rotation invariant, because of the fixed arrangement of weights for the whole image; therefore we use the rotated local binary pattern (RLBP) [8]. In RLBP the arrangement of the weights is not fixed: since the weights are aligned in a circular manner, the effect of image rotation is neutralized by rotating the weights in the clockwise direction while computing the weights window. RLBP is rotation invariant [8]; if we rotate the image by any angle, the weights window of RLBP also rotates in the same direction. A 3 by 3 neighborhood is moved across the image to extract its texture, and the difference between the central and surrounding pixels of the neighborhood is calculated. The index of the location having the maximum difference from the central pixel is termed the dominant location, computed as:

$D = \operatorname{index}\!\left(\max_{p} \lvert g_p - g_c \rvert\right), \quad p \in \{1, 2, 3, \ldots, 8\}$ … (10)

In the above equation gp represents the value of the neighboring pixel, p the index of the neighboring pixel, and gc the value of the central pixel. The patterns are computed by rotating the weights with respect to the dominant location, which is why the descriptor is called the rotated local binary pattern (RLBP). Since the dominant location is determined by taking the circular neighborhood as a reference, the weights of the weight window are assigned with respect to the dominant location. The RLBP operator is defined as follows [8]:

$RLBP_{R,N} = \sum_{p=0}^{N-1} s(g_p - g_c)\,2^{\operatorname{mod}(p-D,\,N)}, \qquad s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}$ … (11)

In the above equation R is the radius of the circular neighborhood; in our proposed method R = 1. N is the number of neighboring pixels, N = 8; gp indicates the neighboring pixel value, where p is the index of the pixel compared with the central pixel gc. The modulus operator assigns values to the weight window: the term $2^{\operatorname{mod}(p-D,N)}$ depends on the dominant location D, and weights are assigned in the clockwise direction from it. The weights therefore do not depend on a preselected arrangement in the weight window, which is how RLBP achieves rotation invariance. Figure 7 shows the effect of rotation on LBP and RLBP.
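A sketch of equations (10) and (11) in the same hypothetical NumPy setting as the previous snippet; it loops over pixels for clarity rather than speed, and uses the same clockwise neighbor order as lbp_8_1 above.

import numpy as np

def rlbp_8_1(img):
    """RLBP per equations (10)-(11): for each 3x3 neighborhood, find the
    dominant location D (largest |g_p - g_c|) and rotate the weights so
    that 2^0 sits at D."""
    img = img.astype(np.int32)
    offs = [(0, 1), (1, 1), (1, 0), (1, -1),
            (0, -1), (-1, -1), (-1, 0), (-1, 1)]
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.int32)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gc = img[y, x]
            gp = [img[y + dy, x + dx] for dy, dx in offs]
            d = int(np.argmax([abs(g - gc) for g in gp]))   # dominant location
            code = 0
            for p, g in enumerate(gp):
                if g >= gc:
                    code |= 1 << ((p - d) % 8)              # 2^mod(p - D, N)
            out[y - 1, x - 1] = code
    return out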


[Figure 7 worked example: a 3x3 neighborhood is taken from the original image and from the same image rotated by 90° counter-clockwise. LBP yields different codes for the two neighborhoods (11 and 194), while RLBP yields the same code (97) for both.]

Figure 7: Effect of rotation on the LBP and RLBP operators. (A) top: the original image; bottom: the image rotated by 90° counter-clockwise. (B) the neighborhoods for the original (top) and rotated (bottom) image. (C) the mask windows with respect to the neighbors for both. (D) the weight window for LBP. (E) the LBP values. (F) the mask window for RLBP. (G) the weights window of RLBP with respect to the dominant direction, denoted in yellow. (H) the RLBP values, which are the same for the original and rotated image.

In Figures 7(B) and 7(G) the yellow color represents the dominant location, and the pink color in Figures 7(C) and 7(F) marks the pixels whose values are greater than the center pixel. The location of the weights window corresponding to the index D always receives the minimum weight, equal to 1; the other locations receive values $2^n$, n = 1, 2, 3, ..., 7, in the clockwise direction from the dominant location. In Figure 7(B) it can be seen that the dominant location of the original image is at the bottom left position, while in the rotated image it is at the bottom right position. In Figure 7(H) we can see that the RLBP value is the same for the original and the rotated image, which shows that RLBP is rotation invariant.

3.3.3 Labeling uniform patterns:
After calculating RLBP for the whole image, we get a total of 256 possible patterns. Many of these patterns are redundant; such patterns are termed uniform patterns, and they are the ones further analyzed for the feature vector calculation. Local binary patterns are dominated by uniform patterns, which make up approximately 90% of all local binary patterns [30]. A local binary pattern is called uniform if its bits contain at most two transitions, either from 0 to 1 or from 1 to 0. For example, the patterns 00011100 and 01000000 are uniform, as both contain 2 transitions, while 00101000 and 00011010 are non-uniform, as they contain 4 transitions. Experimentally it was found that among the 256 patterns there are 58 with a uniform structure. Once the uniform patterns have been found, they are labeled from 1 to 58 so that they can be used easily in further processing. Table 3 shows the concept of transitions.

Table 3: Transition counts for example patterns; each change from 0 to 1 or from 1 to 0 counts as one transition.

Pattern     Transitions
00001111    1
11100111    2
00011100    2
00111111    1
00110011    3
01101001    5
11001011    4
10101011    6
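A small sketch of the labeling step, under the same assumptions as the earlier snippets: transitions are counted left to right, matching the counts in Table 3, which yields exactly 58 uniform patterns; the table name is ours.

def transitions(pattern, bits=8):
    """Number of 0->1 / 1->0 changes when the pattern is read left to
    right, as in Table 3."""
    b = format(pattern, f'0{bits}b')
    return sum(b[i] != b[i + 1] for i in range(bits - 1))

# Label the 58 uniform patterns 1..58 in ascending order; everything
# else maps to 0 and is ignored in the texture feature.
UNIFORM_LABEL = {}
for pat in range(256):
    if transitions(pat) <= 2:
        UNIFORM_LABEL[pat] = len(UNIFORM_LABEL) + 1

assert len(UNIFORM_LABEL) == 58    # matches the count reported above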

Analyzing texture orientation plays an important role in computer vision and image processing. Therefore we have defined five 2x2 structures with which to analyze those 58 patterns, so that we obtain a feature vector for measuring the similarity of visually similar images. Figure 8 shows the five structures defined for analyzing the uniform patterns.

[Figure 8 diagram: five 2x2 structures; structures (a) through (d) each denote one direction, and structure (e) denotes no direction.]

Figure 8: Five predefined structures used to count the occurrences of each uniform pattern in UPATTERN.

We count the number of occurrences of each pattern according to each of the five structures. With five structures and 58 patterns in total, we get a 58x5 array of counts, which we convert into a one-dimensional vector, resulting in a 1x290 texture feature vector; a sketch of this counting step follows Figure 9.

[Figure 9 diagram: (a) a grid of RLBP patterns forming the IPATTERN; (b) the occurrence counts of the pattern 00111000 in the five directions according to the 2x2 structures.]

Figure 9: (a) the IPATTERN; (b) the occurrences of the pattern 00111000 in five directions according to the 2x2 structures.
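The exact five 2x2 structures are defined only pictorially in Figure 8, so the sketch below substitutes a plausible reading: the four pairwise orientations inside a 2x2 block (horizontal, vertical, diagonal, anti-diagonal) plus a no-direction case. Treat the structure tests as placeholders for the thesis's actual definitions; the function name is ours.

import numpy as np

def structure_histogram(upattern, n_labels=58):
    """Count, for each uniform-pattern label in UPATTERN, how often it
    occurs in each of five assumed 2x2 structures, giving a 58x5 matrix
    flattened to a 1x290 texture feature vector."""
    h, w = upattern.shape
    counts = np.zeros((n_labels + 1, 5), dtype=np.int64)   # row 0 = non-uniform
    for y in range(h - 1):
        for x in range(w - 1):
            a, b = upattern[y, x], upattern[y, x + 1]
            c, d = upattern[y + 1, x], upattern[y + 1, x + 1]
            if a and a == b: counts[a, 0] += 1             # horizontal pair
            if a and a == c: counts[a, 1] += 1             # vertical pair
            if a and a == d: counts[a, 2] += 1             # diagonal pair
            if b and b == c: counts[b, 3] += 1             # anti-diagonal pair
            if a and a not in (b, c, d): counts[a, 4] += 1 # no direction
    return counts[1:].ravel()                              # 1 x 290 vector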

3.4 Algorithm for Content Based Image Retrieval Using Pure Color Histogram and Multi Structures Occurrences in Uniform Patterns of RLBP:
1. Select the query image II.
2. Convert II from RGB to HSV: IHSV.
3. Reduce the number of colors from $2^{24}$ to $2^6 + 6 = 70$: IHSV → QHSV.
4. Convert the hue and saturation of the quantized image QHSV into the one-dimensional image Q1D, forming an image of pure colors.
5. Calculate the 70-bin histogram of Q1D to get the color feature vector CFEATURES.
6. Convert II from RGB to gray: IG.
7. Apply RLBP to IG to get IPATTERN.
8. Label the uniform patterns in IPATTERN → UPATTERN.
9. Count the five predefined 2x2 structures for each uniform pattern; the resulting histogram is the texture feature TFEATURES.
10. Fuse the color feature CFEATURES and the texture feature TFEATURES; store the query image features in ΨF and the database image features in ΦF.
11. Repeat steps 2 to 10 for each image of the database.
12. Find the Euclidean distance between the query image features ΨF and each database image's features ΦF by the formula $d(\Psi_F, \Phi_F) = \sqrt{\sum_{i=1}^{v} \left[\Psi_F(i) - \Phi_F(i)\right]^2}$.
13. Retrieve the images having the smallest Euclidean distance to the query image.
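Tying the pieces together, a hypothetical end-to-end sketch of steps 1-10, reusing the functions defined in the earlier snippets; the normalization of the texture part before fusion is our assumption, as the thesis does not state how the two vectors are scaled.

import numpy as np

def extract_features(rgb_img, gray_img):
    """Fused feature vector per steps 1-10: a 70-bin pure color
    histogram concatenated with the 290-bin structure histogram,
    giving a 360-dimensional descriptor."""
    h, s, _ = rgb_to_hsv(rgb_img)                   # steps 2-4
    c_features = pure_color_histogram(h, s)         # step 5: 70 bins
    ipattern = rlbp_8_1(gray_img)                   # steps 6-7
    label = np.vectorize(lambda p: UNIFORM_LABEL.get(int(p), 0))
    upattern = label(ipattern)                      # step 8
    t_features = structure_histogram(upattern).astype(float)   # step 9
    t_features /= t_features.sum() + 1e-8           # scaling is our assumption
    return np.concatenate([c_features, t_features]) # step 10: 360-D vector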

CHAPTER NO.4 EXPERIMENTAL RESULTS

4.1 Databases for experimental evaluation:
In this section, the proposed technique is evaluated qualitatively as well as quantitatively using various image quality assessment metrics. We have used the Corel-1k, Corel-10k [31] and ZB building [32] datasets for our experimental results; these are benchmark image datasets for CBIR. The Corel-1k dataset has 10 categories, each containing 100 images, and the size of each image is 384 x 256 or 256 x 384. Corel-10k has 100 categories, and the ZB building dataset contains images of 201 buildings with 5 images per building; the size of each image in the ZB dataset is 640 x 480. We randomly choose 10 images from each category of the Corel-1k dataset and use them as query images; for each category, we compute the precision at different recall levels.

Figure 10: Sample images from each category of the Corel-1k image dataset.

Figure 11: Sample images from the ZB building image dataset.


4.2 Similarity measure (Euclidean distance): For feature comparison, the standard Euclidean distance measure given in [33] was used throughout the experiments:

$$d(F_q, F_d) = \sqrt{\sum_{i=1}^{v}\left[F_q(i) - F_d(i)\right]^2} \qquad (12)$$

where $F_q$ are the features extracted from the query image, $F_d$ are the features extracted from a database image, $v$ is the length of the feature vector and $d$ is the Euclidean distance between the two feature vectors. The distance $d$ is calculated between the query image and every image of the database. The smaller the value of $d$, the greater the similarity between the query image and the database image, and hence the higher the rank of that image in the retrieved results.

4.3 Precision and Recall: The proposed method was evaluated using the precision-recall pair, the evaluation metric most commonly used to assess the performance of CBIR techniques. Precision P is defined as the ratio between the number of retrieved relevant images M and the total number of retrieved images N; it measures the accuracy of the retrieval. Recall R is defined as the ratio between the number of retrieved relevant images M and the total number of relevant images S in the whole database; it measures the robustness of the retrieval. Precision and recall are computed by the following equations [34].

$$P = \frac{\text{Number of relevant images retrieved } (M)}{\text{Number of images retrieved from the database } (N)} \qquad (13)$$

$$R = \frac{\text{Number of relevant images retrieved } (M)}{\text{Total number of relevant images in the database } (S)} \qquad (14)$$
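As a worked sketch, both measures can be computed for a single query from the ranked result list. The variables retrieved (indices of the N returned images) and relevantSet (indices of the S relevant images) are assumed names, not part of the thesis code.

    % Precision and recall for one query (Eqs. 13 and 14).
    hits = ismember(retrieved, relevantSet);  % logical: retrieved AND relevant
    M = sum(hits);                            % number of relevant images retrieved
    P = M / numel(retrieved);                 % Eq. (13): M / N
    R = M / numel(relevantSet);               % Eq. (14): M / S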


4.4 Mean average precision (MAP): In addition to precision and recall, another widely used metric for evaluating the performance of CBIR systems is MAP, which can be computed as [35]:

$$MAP = \frac{1}{Q}\sum_{i=1}^{Q} AP_i \qquad (15)$$

where Q is the number of queries run on the CBIR system and AP is the average precision, i.e., the mean of the precision values obtained at the ranks of the relevant images, computed as:

$$AP = \frac{1}{r}\sum_{i=1}^{r} P_i \qquad (16)$$

where $P_i$ is the precision value at the $i$-th of the $r$ relevant images. MAP returns values in the range [0, 1], with higher values representing good retrieval rankings and lower values indicating bad ones.
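The following sketch computes AP for one query from the same ranked list used above, taking the precision at the rank of each relevant image; MAP is then the mean of the AP values over all Q queries. Variable names are assumptions, as before.

    % Average precision for one ranked result list (Eq. 16).
    hits  = ismember(retrieved, relevantSet);   % logical, in rank order
    ranks = find(hits);                         % ranks of the relevant images
    AP    = mean((1:numel(ranks))' ./ ranks(:)); % precision at each relevant rank
    % Eq. (15): MAP is the mean of AP over all queries, e.g. MAP = mean(APs);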

Figure 12: Comparison of retrieval performance using color features only, texture features only, and the proposed fused features.

4.5 Subjective and Visual Result Assessment: In this work, the proposed method has been evaluated subjectively and visually on the publicly available, well-known Corel-1k, Corel-10k and ZB building datasets. Various quantitative and qualitative experiments have been


conducted to assess the performance of the proposed scheme against other state-of-the-art techniques, confirming its superiority. 4.5.1 Corel dataset: The Corel dataset is broadly used for evaluating the performance of CBIR systems. It consists of a variety of images with statistically rich color and texture. Ten query images are selected from each category to evaluate performance. The visual results of the proposed method on the Corel-1k and Corel-10k datasets are shown in figures 15 to 23. In each figure, the top-left image is the query image, while all the other images are those retrieved by our proposed method. The value of the Euclidean distance is given as a label on each retrieved image: the lower the Euclidean distance, the higher the similarity between the query image and the database image. The retrieved images are ordered by increasing distance from the query image. We have compared our proposed method with other general CBIR techniques such as MSD [17], SED [18], FF [35], ART [35], ECDH [35] and CTDD [36] through the precision-recall curves in figure 13 (the values for the other methods are taken from the reference papers), and also with the LBP, LDP, LTP, LTrP and LTxXORP pattern-based methods in figure 14. Figure 12 shows the precision-recall curves of retrieval using only color, only texture, and our proposed fused technique. The experimental results outperform the other state-of-the-art techniques by a significant margin.

Figure 13: Precision-recall curves for the Corel-10k dataset, compared against other general CBIR methods.


Figure 14: Precision-recall curves for the Corel dataset, compared against LBP-based and related CBIR methods.

The visual results of the proposed method on the Corel-1k and Corel-10k datasets are shown in figures 16 to 23. The effect of the structures we use to capture texture information is evident in the retrieval of similar images with different colors but similar structure and texture: bus images of different structures and orientations are retrieved, and the same holds for buildings, flowers, dinosaurs, horses and elephants, where images with different colors but similar texture and edges are retrieved; the remaining categories are shown in figures 16 to 23. Some unmatched images are also retrieved: a flower is retrieved among the African people images, and a bowl and a dinner dish are retrieved instead of a flower, because the structure of the flower is similar to the structure of the food on the dish. An elephant image is retrieved among the beach images, and a beach image is retrieved among the building and mountain images, because the backgrounds of the elephant, building and mountain images share the same sky structure. Such results arise because our fusion of the color and texture features is a simple concatenation; it could be improved by better information fusion methods and similarity metrics, which will be studied in future work. We have evaluated the results of our proposed method using precision-recall curves and compared them with other state-of-the-art methods. From the precision-recall curves in figures 12, 13 and 14, it can be seen that the retrieval performance of our proposed method is better than that of the other state-of-the-art methods. Hence, it is clear from these results that the proposed algorithm outclasses existing techniques by a significant margin.


Figure 15: Visual results on the Corel-10k dataset.


Figure 16: Retrieval results on the African people category of the Corel-1k dataset.

Figure 17: Retrieval results on the bus category of the Corel-1k dataset.


Figure 18: Retrieval results on the beach category of the Corel-1k dataset.

Figure 19: Retrieval results on the building category of the Corel-1k dataset.


Figure 20: Retrieval results on the dinosaur category of the Corel-1k dataset.

Figure 21: Retrieval results on the horse category of the Corel-1k dataset.


Figure 22: Retrieval results on the mountain category of the Corel-1k dataset.

Figure 23: Retrieval results on the flower category of the Corel-1k dataset.


4.5.2 ZB building dataset: The ZB dataset contains images of buildings. The images of each building are taken at various angles, and the colors, walls, windows and doors of all the buildings are very similar. Due to this similarity, retrieval evaluation on this dataset is a very challenging task. The proposed method achieves rotation invariance through rotated local binary patterns (RLBP); although the images of a building are taken at different angles, the features extracted using RLBP are rotation invariant, so the retrieval performance is better than that of other state-of-the-art techniques. In certain queries, our algorithm failed to retrieve all the relevant images within the top-5, because the structures of the windows, doors and walls are almost identical across buildings. The percentage MAP retrieval performance compared with other state-of-the-art techniques on the ZB dataset [32] is shown in Table 4, where the MAP value of the proposed method is the highest. The values indicate the percentage MAP retrieval accuracy over all five relevant images for randomly selected query images. Our algorithm achieves an average performance gain over other similar approaches for recognizing buildings. Some visual results are shown in figure 24; for each retrieval, the leftmost image is the query image. Table 4: Performance comparison of our method with state-of-the-art methods on the ZB dataset.

Approach        Performance (MAP %)
MSD [17]        74.52
ECDH [35]       76.10
ART [35]        76.84
FF [35]         77.95
SED [18]        81.66
CTDD [36]       82.74
Our method      87.77


Figure 24: Visual results on the ZB building dataset.


4.5.3 Retrieval performance in videos: In addition to evaluating the method on image databases, we have evaluated its image-matching performance on videos by extracting frames similar to a query scene. The purpose of this case study is to show the strength of the proposed fusion-based CBIR when applied to video searching and browsing. For this evaluation, different types of videos were acquired from YouTube, including short movies, military documentaries, music videos and cartoons, each at least 4-5 minutes long. Frames of different types were selected randomly as queries from each video. The similarity of the frames is searched through the proposed method; since videos consist mostly of redundant frames, we skip frames when checking similarity, which means some frames that might match the query are skipped. We skip 30 frames at a time in each video because of this redundancy. Table 5: Quantitative evaluation of frame retrieval in videos.

Video No.   Total Number of Frames   Frames Checked   Precision   Recall
1           8789                     293              0.95        0.68
2           11173                    372              0.80        0.74
3           12465                    415              0.75        0.87
4           8733                     291              0.80        0.64
5           5880                     196              0.90        0.72
6           6932                     231              0.90        0.87
7           10530                    351              0.70        0.95
8           12732                    424              0.85        0.66
9           7348                     244              0.80        0.76
10          12944                    431              0.90        0.96
Average     9752                     325              0.835       0.785

Skipping frames in this way means our method compares the query frame with about two frames per second of video, so the algorithm examines just the right frames. The retrieval results are strongest among the top-most relevant images. Some irrelevant images are also retrieved, partly because of the skipped frames and partly because some videos simply do not contain a frame similar to the particular query scene. A graphics processing unit is needed because of the


high number of frames in a video. The experimental results show that the proposed method can be used for applications such as video recommendation and video search; a minimal frame-sampling sketch is given after Table 6. Table 6: Retrieval results of frames from video.

Video 1 - Title: U.S. Marines Maritime Raid Force - MH-60S Helicopter Casting and SPIE. Description: a preparation video of the U.S. Marines. Duration: 4:52; total frames: 8789; frames checked for retrieval: 293. The query frame and the top-5 retrieved frames are shown in the original figure.

Video 2 - Title: Tom and Jerry - Funny Best Moments. Description: a short clip of the best funny moments. Duration: 6:12; total frames: 11173; frames checked for retrieval: 373.

Video 3 - Title: Watch recipe - Gosht Awadhi Korma - YouTube. Description: a cooking video in which korma is cooked. Duration: 5:17; total frames: 7947; frames checked for retrieval: 264.
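A minimal MATLAB sketch of the frame-sampling loop follows. Here extractFeatures stands in for the fused color-texture pipeline of chapter 3 and is not a MATLAB built-in; the file name and the query feature vector Fq are likewise assumed.

    % Check every 30th frame of a video against the query features Fq.
    v = VideoReader('clip.mp4');
    k = 0; d = []; idx = [];
    while hasFrame(v)
        frame = readFrame(v); k = k + 1;
        if mod(k, 30) ~= 1, continue; end     % skip redundant frames
        F = extractFeatures(frame);           % fused 1x360 vector (placeholder)
        d(end+1)   = sqrt(sum((F - Fq).^2));  % Euclidean distance to query
        idx(end+1) = k;                       % remember the frame number
    end
    [~, o] = sort(d); top5 = idx(o(1:5));     % five most similar frames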


4.6 Time Complexity Performance: The running time of the proposed algorithm was measured on an Intel(R) Core i3 (fourth generation) 1.7 GHz CPU with 4 GB RAM and a 512 MB graphics processing unit (GPU), under 64-bit Microsoft Windows 8.1. As already mentioned, the proposed algorithm was implemented in MATLAB (R2016a), with the features of each database image saved in a .mat file; feature extraction per image is reported on the order of microseconds, while retrieval from the .mat file is on the order of nanoseconds. The average CPU time (in seconds) required from feature extraction to image retrieval, which differs across databases due to the different image resolutions, is presented in Table 7.

Table 7: Time complexity for different databases.

Database                              Texture Feature    Color Feature      Time for Top    Time for Top
                                      Extraction Time    Extraction Time    10 Images       20 Images
Corel-1k                              0.391s             0.038s             0.451s          0.6224s
Corel-10k                             0.283s             0.042s             0.493s          0.7201s
ZB Building                           0.631s             0.032s             0.332s          1.0201s
Retrieving similar frames in video    0.303s             0.066s             0.566s          --------


CHAPTER NO.5 CONCLUSION AND FUTURE WORKS The main theme of this research is to examine and improve content based image retrieval systems. State-of-the-art CBIR methods were studied and their advantages and disadvantages identified; since feature extraction is the basic task in CBIR, the feature extraction methods most commonly used for content based image retrieval and classification were also determined.

5.1 Conclusion:

In this work, we have proposed a method that uses two kinds of image content, color and texture, and fuses them for image retrieval. Color features are extracted by calculating the histogram of the chroma information in the HSV color model, and texture features are extracted by analyzing predefined structures in the uniform patterns of rotated local binary patterns. The proposed approach is significantly robust to lighting changes in color and to texture changes caused by rotation. The similarity between the query image and the images of the database is computed using the Euclidean distance. The performance was evaluated using precision-recall curves and by calculating the mean average precision (MAP %) value. The experimental results outperform the other state-of-the-art techniques in precision, recall and MAP % value.

5.2 Suggestions for Future Work:

A fusion-based strategy has been proposed in this work, but the improvement of content based image retrieval and image classification procedures still has a long way to go; it remains an active research area. A few suggestions for future work follow. 1. The structures used in the proposed method for the analysis of texture can be extended to further orientations or scales to achieve higher accuracy in content based image retrieval.


2. The fusion of the color and texture features is very simple, a plain concatenation; in the future, a better information fusion method can be studied to improve the performance. 3. The Euclidean distance is a very simple similarity measure; a better similarity metric would further improve content based image retrieval.

5.3 Promoting the Code to C++: The major problem in executing graphics, image and video programs is speed. We implemented the code in MATLAB, which runs on a virtual machine and is not linked directly to the hardware; its slow processing speed and computational overhead remain a pressing problem. To overcome this overhead and slow processing, the code implemented in MATLAB can be ported to C++: because C++ code runs directly on the machine, it executes faster than the same code in MATLAB.


References:
[1] Y.-H. Yu, T.-T. Lee, P.-Y. Chen, and N. Kwok, "On-chip real-time feature extraction using semantic annotations for object recognition," Journal of Real-Time Image Processing, pp. 1-16, 2014.
[2] A. W. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349-1380, 2000.
[3] R. Ji, L.-Y. Duan, J. Chen, H. Yao, T. Huang, and W. Gao, "Learning compact visual descriptor for low bit rate mobile landmark search," p. 2456.
[4] I. K. Sethi, I. L. Coman, and D. Stan, "Mining association rules between low-level image features and high-level concepts," pp. 279-290.
[5] M. Boutell, and J. Luo, "A generalized temporal context model for semantic scene classification," pp. 104-104.
[6] S. C. Hoi, R. Jin, J. Zhu, and M. R. Lyu, "Semi-supervised SVM batch mode active learning for image retrieval," pp. 1-7.
[7] R. Marée, P. Geurts, and L. Wehenkel, "Content-based image retrieval by indexing random subwindows with randomized trees," pp. 611-620.
[8] R. Mehta, and K. Egiazarian, "Dominant rotated local binary patterns (DRLBP) for texture classification," Pattern Recognition Letters, vol. 71, pp. 16-22, 2016.
[9] T. Kato, "Database architecture for content-based image retrieval," pp. 112-123.
[10] F. Jing, B. Zhang, F. Lin, W.-Y. Ma, and H.-J. Zhang, "A novel region-based image retrieval method using relevance feedback," pp. 28-31.
[11] D. Hoiem, R. Sukthankar, H. Schneiderman, and L. Huston, "Object-based image retrieval using the statistical structure of images," pp. II-490-II-497, vol. 2.
[12] M. Inoue, "Image retrieval: Research and use in the information explosion," Progress in Informatics, vol. 6, no. 3, 2009.
[13] C. S. Won, D. K. Park, and S.-J. Park, "Efficient use of MPEG-7 edge histogram descriptor," ETRI Journal, vol. 24, no. 1, pp. 23-30, 2002.
[14] F. Mahmoudi, J. Shanbehzadeh, A.-M. Eftekhari-Moghadam, and H. Soltanian-Zadeh, "Image retrieval based on shape similarity by edge orientation autocorrelogram," Pattern Recognition, vol. 36, no. 8, pp. 1725-1736, 2003.
[15] G.-H. Liu, and J.-Y. Yang, "Image retrieval based on the texton co-occurrence matrix," Pattern Recognition, vol. 41, no. 12, pp. 3521-3527, 2008.
[16] G.-H. Liu, L. Zhang, Y.-K. Hou, Z.-Y. Li, and J.-Y. Yang, "Image retrieval based on multi-texton histogram," Pattern Recognition, vol. 43, no. 7, pp. 2380-2389, 2010.
[17] G.-H. Liu, Z.-Y. Li, L. Zhang, and Y. Xu, "Image retrieval based on micro-structure descriptor," Pattern Recognition, vol. 44, no. 9, pp. 2123-2133, 2011.
[18] X. Wang, and Z. Wang, "A novel method for image retrieval based on structure elements' descriptor," Journal of Visual Communication and Image Representation, vol. 24, no. 1, pp. 63-74, 2013.
[19] A. Talib, M. Mahmuddin, H. Husni, and L. E. George, "A weighted dominant color descriptor for content-based image retrieval," Journal of Visual Communication and Image Representation, vol. 24, no. 3, pp. 345-360, 2013.
[20] D. John, and S. Tharani, "Content Based Image Retrieval using HSV-Color Histogram and GLCM," International Journal of Advance Research in Computer Science and Management Studies, vol. 2, no. 1, 2014.
[21] O. A. Vatamanu, M. Frandes, M. Ionescu, and S. Apostol, "Content-Based Image Retrieval using Local Binary Pattern, Intensity Histogram and Color Coherence Vector," pp. 1-6.
[22] V. Takala, T. Ahonen, and M. Pietikäinen, "Block-based methods for image retrieval using local binary patterns," pp. 882-891.
[23] J. Zhou, T. Xu, and W. Gao, "Content based image retrieval using local directional pattern and color histogram," Optimization and Control Techniques and Applications, pp. 197-211, Springer, 2014.
[24] Y. Mistry, and D. Ingole, "Survey on content based image retrieval systems," International Journal of Innovative Research in Computer and Communication Engineering (IJIRCCE), vol. 1, 2013.
[25] M. Eimer, "The neural basis of attentional control in visual search," Trends in Cognitive Sciences, vol. 18, no. 10, pp. 526-535, 2014.
[26] S. Kastner, and L. G. Ungerleider, "The neural basis of biased competition in human visual cortex," Neuropsychologia, vol. 39, no. 12, pp. 1263-1276, 2001.
[27] M. S. Livingstone, and D. H. Hubel, "Anatomy and physiology of a color system in the primate visual cortex," The Journal of Neuroscience, vol. 4, no. 1, pp. 309-356, 1984.
[28] L. Junling, Z. HongWei, K. Degang, and C. Chongxu, "Image retrieval based on weighted blocks and color feature," pp. 921-924.
[29] T. Ahonen, A. Hadid, and M. Pietikainen, "Face description with local binary patterns: Application to face recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 12, pp. 2037-2041, 2006.
[30] G. Zhao, and M. Pietikainen, "Dynamic texture recognition using local binary patterns with an application to facial expressions," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 915-928, 2007.
[31] J. Tang, and P. H. Lewis, "A study of quality issues for image auto-annotation with the Corel dataset," IEEE Transactions on Circuits and Systems for Video Technology, vol. 17, no. 3, pp. 384-389, 2007.
[32] H. Shao, T. Svoboda, and L. Van Gool, "ZuBuD - Zurich buildings database for image based recognition," Computer Vision Lab, Swiss Federal Institute of Technology, Switzerland, Tech. Rep. 260, p. 20, 2003.
[33] J. Ahmad, M. Sajjad, I. Mehmood, and S. W. Baik, "SSH: Salient structures histogram for content based image retrieval," pp. 212-217.
[34] H. Müller, W. Müller, D. M. Squire, S. Marchand-Maillet, and T. Pun, "Performance evaluation in content-based image retrieval: overview and proposals," Pattern Recognition Letters, vol. 22, no. 5, pp. 593-601, 2001.
[35] J. Ahmad, M. Sajjad, I. Mehmood, S. Rho, and S. W. Baik, "Saliency-weighted graphs for efficient visual content description and their applications in real-time image retrieval systems," Journal of Real-Time Image Processing, pp. 1-17.
[36] M. Rahimi, and M. E. Moghaddam, "A content-based image retrieval system based on Color Ton Distribution descriptors," Signal, Image and Video Processing, vol. 9, no. 3, pp. 691-704, 2015.

