Content Based Web Image Search Engine Evaluation Using Arabic Text Queries

Lina Al-Quraan, Sawsan Nusir and Belal Abuata*
Department of Computer Information Systems, Yarmouk University, Irbid, Jordan
[email protected], [email protected], [email protected]

Abstract
"Image search engines" are Web-based services that gather and index images on the Internet. Image searching is offered both by general search engines and by some specialized search engines. In addition, there are a few "meta-search engines", which pass search queries to more than one search engine and then return the combined results. Content-based image retrieval was the focus of attention of many researchers during the last decade. This paper compares the performance of three image search engines in answering Arabic text queries, and also evaluates whether the query's language has an effect on the content of the images retrieved. The research consists of two phases. In the first phase, ten Arabic text queries were used and the first ten images retrieved were judged for relevancy; precision ratios were then calculated for each query. In the second phase, image features, namely color and shape, are analyzed and evaluated using the Euclidian distance. The results of the first phase indicated that Google has the best retrieval effectiveness. The second phase results showed that the image content was not similar among the relevant images retrieved for a specific query, nor among the irrelevant images.

Keywords: Arabic Queries, Search Engine (SE), Content-Based Image Retrieval (CBIR), Image Features

1. Introduction
With the huge increase of information on the Internet, it has become difficult for users to satisfy their information needs without Search Engines (SEs). SEs play a significant role in allowing users to search for different kinds of content such as images, audio and video; users can search for almost anything and find it using an SE, and therefore SE performance is important. The type of search responsible for finding images is called an Image Search Engine (ISE), or simply image search [13]. There are two main difficulties in retrieving images. The first is the use of text-based queries for retrieving/searching for images: a query cannot contain the whole description of an image [18]. This causes many top-ranked images to be irrelevant to the user's text query, because images are indexed based on texts extracted from different sources (file name, image metadata, manual index terms, etc.). The second difficulty concerns the relevancy of the retrieved images when searching for a certain image: the judgments may differ from one user to another according to the user's perception of the retrieved image. So, the relevancy of the retrieved images depends on human perception.


To overcome these difficulties, Content Based Image Retrieval (CBIR) was proposed. In CBIR, images are indexed by their own visual content (color, texture or shape) instead of being manually annotated with text-based keywords [11][6][18]. The work of [11] presents a method that combines color, texture and shape information and achieves higher retrieval efficiency when retrieving images from an image database. Their main idea is to take an image and its complement and partition both into non-overlapping tiles of equal size. The combination of the color and texture features of the image and its complement, in conjunction with the shape features, provides a robust feature set for image retrieval. Their experimental results demonstrated the good efficacy of their method.

The researchers in [18] introduced a method that achieves high efficiency and effectiveness of CBIR on large-scale image data. They achieve effectiveness by extracting color and texture features, combining a color feature with the Discrete Wavelet Transform (DWT); their method also uses a Color Correction Matrix (CCM) separately. Using these features, image retrieval based on multi-feature fusion is achieved with a normalized Euclidean distance classifier. To increase efficiency, they propose a method in which the DWT learns online user behavior. The method can be applied in medical, Photoshop and web fields. The main purpose of this paper is to evaluate the performance of several ISEs (Yahoo!, Ask and Google) using Arabic queries. We also measure the effectiveness of the three ISEs according to their image retrieval results for Arabic queries and compare them using the recall and precision measures. MATLAB is used for the evaluation and analysis of features common to the images retrieved by the three ISEs, in order to identify the searchable image features and analyze their impact on retrieval effectiveness.

2. Related Work
Increasing broadband penetration, the growing popularity of social networking sites and decreasing digital camera prices have resulted in millions of different images on the web, so the task of image retrieval becomes more complicated as the internet grows [9]. However, ISEs are available to facilitate searching and retrieving through hundreds of millions of web images, and they try to give access to the wide range of available web images [4]. As described in [24], image retrieval has been one of the most active research fields. Various studies, such as [22][7][4], have been carried out to evaluate ISEs and to improve their effectiveness.


Unlike text-based retrieval systems, images must be annotated in advance. In popular commercial libraries, to ensure service quality, the images are manually annotated by editors with titles, keywords or short abstracts. However, for web images, due to their huge volume, different techniques for automatic annotation from web pages are needed [17]. The web image retrieval problem has been discussed for years and different approaches have been proposed. Based on the information used in the retrieval algorithms, the existing solutions can be roughly divided into three categories:
• Text-based approach: extracts text information as annotations from the web pages containing the images; traditional text retrieval algorithms are then executed to search the images. Most popular ISEs, such as Google and AltaVista, have adopted this approach [19].
• Content-based approach: uses image analysis techniques to extract visual image features, such as color, texture, orientation and shape. These features are used to find the images most similar to a query image [22].
• Link-based approach: motivated by the huge success of Google's PageRank algorithm, researchers use the web link structure to improve image search results [17].
Shijian et al. [21] proposed a document retrieval technique capable of searching document images without optical character recognition (OCR). The technique retrieves document images using a new word shape coding scheme: it captures the document content by annotating each word image with a word shape code. In particular, they annotate word images using a set of topological shape features including character ascenders/descenders, character holes, and character water reservoirs. Effective CBIR needs efficient extraction of low-level features such as color, texture and shape for indexing, and fast matching of the query image with the indexed images for the retrieval of similar images [8].
In their paper [8], the researchers studied the issues of efficient feature extraction and effective image matching in the compressed domain. They extracted quantized-histogram statistical texture features from the DCT blocks of the image, using the significant energy of the DC and the first three AC coefficients of the blocks, and used various distance metrics to match the query image with indexed images based on these texture features. They analyzed the effectiveness of CBIR for the various distance metrics and different numbers of quantization bins, testing the proposed method on the Corel image database. The experimental results showed robust image retrieval for various distance metrics with different histogram quantizations in the compressed domain.


In [15], the researchers proposed a unified learning framework for heterogeneous medical image retrieval based on a Full Range Autoregressive Model (FRAR) with a Bayesian approach (BA). The proposed system employed an adaptive binary tree based support vector machine (ABTSVM) for efficient and fast classification of medical images in the feature vector space. They measured the performance of the system using precision and recall (PR). The experimental results revealed that the retrieval performance of the proposed system on a heterogeneous medical image database is better than that of existing systems, at low computational and storage cost.

Different visual image features such as shape, color and texture are extracted to characterize images, each implemented using one or more feature descriptors. During retrieval, the features of the query are compared to those of the images in the database in order to rank each indexed image according to its distance from the query. On the other hand, in biometric systems the images used as patterns (e.g. fingerprint, iris, hand) are also represented by feature vectors, and candidate patterns are retrieved from the database by comparing the distances of their feature vectors [5]. On the web [2], current ISEs depend purely on the keywords around the images and on the filenames, which produces a lot of rubbish in the search results. Because web ISEs are blind to the content of images, the query results are often irrelevant, although a lot of research has been done on content-based image retrieval (CBIR), which has received increasing attention in recent years. The exponential growth in the number and size of digital images on the web makes it necessary to develop powerful tools for retrieving this unconstrained imagery; furthermore, CBIR is a key technology for improving the interface between user and computer [16]. Color, texture and shape information have been the primitive image descriptors in CBIR systems [11], and these are the three main groups of features used in such systems. Color is one of the most used features for examining images, and modern image search studies use color as the comparison feature between images [23]. For two images, color similarity can be measured by comparing their color histograms. The color histogram is a common color descriptor, denoting the occurrence frequencies of the colors in an image [10][20]; the color of an image is represented using some color model. Shape is another important visual feature for image content description.
Shape-based image retrieval measures the similarity between the shapes of images as represented by their features [5]. The shape feature of an image refers to the particular region that is being sought. Shape is determined by applying segmentation or edge detection to an image; in image analysis it is very important to apply edge detection, which gives an indication of the shapes of the objects present in the image [14]. The authors in [2] proposed an approach named ReSPEC (Re-ranking Sets of Pictures by Exploiting Consistency), a hybrid of the two methods (the CBIR system IMALBUM and the MPEG7 technique) used by [12]. It first retrieves the results of a selected keyword query from an existing image search engine, clusters the results based on extracted image features, and returns the cluster inferred to be the most relevant to the search query; it then ranks the remaining results in order of relevance. However, this approach is not discriminative enough to sort huge collections of highly variable photos, because the distribution of image features in personal photo collections is in general different from the distribution of image features conditioned on a selected text query. One study in the field of ISE performance evaluation finds that for one-word queries, ISE pairs have no overlap; however, the overlap ratios increase with the number of query words. The Yahoo-Ask pair has more common image items in its retrieval outputs than other ISE pairs, and there are no common image items among more than two ISEs. While Google has the better precision ratio at cut-off point 10, its precision ratios generally decrease with increasing cut-off point values; at cut-off points 30 and 40, Google retrieved the lowest number of relevant images [5]. Another study observes that the precision ratio of each ISE differs and changes from one query to another, and that the precision ratios of all ISEs decrease gradually as the cut-off point increases from 10 to 20. Its results indicated that Google has the best overall retrieval effectiveness with a 95% precision ratio, followed by Yahoo with 91%, and lastly Ask with 83.7%. In this study two features are identified: color and shape.
However, one thing is common to all ISEs: the color feature is not sufficient to retrieve images that are 100% relevant, and two images with similar color histograms can have different content. Likewise for the shape feature, the number of edges is not sufficient to identify relevant images [1]. From the above and other published research on CBIR it is clear that research on Arabic CBIR is very rare. One related study discusses Farsi/Arabic document image retrieval through sub-letter shape coding for mixed Farsi/Arabic and English text [3]. The authors proposed a system that retrieves document images using a new sub-letter shape coding scheme for Farsi/Arabic documents, in which the document content is captured through sub-letter coding of words. A decision tree-based classifier partitions the sub-letter space into a number of sub-regions by splitting the space using one topological shape feature at a time.


The topological shape features include height, width, holes, openings, valleys, jags, and sub-letter ascenders/descenders. Experimental results show the advantages of this method for Farsi/Arabic document image retrieval.

3. Methodology
The methodology is divided into two phases: the first phase concerns image SE evaluation and the second phase concerns image feature analysis. The overall objective is to evaluate image search engines and to improve their retrieval performance and effectiveness using different techniques. The details of each phase are explained in the following sections.

3.1 First Phase: Image Search Engines Evaluation
The target of this phase is to evaluate the three selected ISEs by calculating recall and precision for each one. This is done in several steps: first, 10 queries are selected and each is run on all three ISEs; the retrieved images are then labeled as relevant or irrelevant; after that, recall and precision are calculated; finally, the results are compared. The steps of this phase are:

A. Queries Selection
For the evaluation and comparison of the web image search engines, 10 queries were initially selected. The queries consist of a few words without explicit Boolean operators such as AND and OR, as shown in table 1. The queries were given to 5 students, and each query was run on the selected image search engines separately.

Table 1: Selected queries list.
Query number | Query in Arabic | Query (English translation)
Q1 | قطه | cat
Q2 | قدم | foot
Q3 | كرة السلة | basketball
Q4 | استراليا فيكتوريا | Victoria Australia
Q5 | الحياة الجامعية | university life
Q6 | احذية كرة القدم | football shoes
Q7 | الكاميرا الرقمية | digital camera
Q8 | بيبسي ماكس | Pepsi max
Q9 | ريد بول | Red bull
Q10 | هابي فيت | Happy feet

B. Image Search Engines Tested
Three popular search engines are tested: Google, Yahoo and Ask. The reasons for this selection were as follows: first, Google, Yahoo and Ask were among the three most frequently used search engines worldwide; second, all three offer both Internet Search and Image Search features.

C. Retrieved Images Labeling (Relevant or Irrelevant)

After each student ran each query on the selected ISEs separately, the first 10 images retrieved in each result set were evaluated manually as "relevant" or "non-relevant" according to the user.

D. Recall and Precision Calculation
The standard recall and precision measures are used for the relevance evaluation. To evaluate the retrieval performance over all test queries, we calculate the average precision value at the points at which each relevant image is retrieved. In addition, the mean average precision (MAP) value is calculated for the set of queries run on each individual ISE.
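The precision and MAP computations described above can be sketched as follows. This is a minimal Python illustration (not the scripts used in the study), and the relevance labels are hypothetical examples, not data from the experiment:

```python
def precision_at_k(labels, k):
    """Precision at cut-off k: fraction of the first k results labeled relevant (1)."""
    top = labels[:k]
    return sum(top) / len(top)

def average_precision(labels):
    """Average of the precision values at each rank where a relevant image occurs."""
    precisions = []
    hits = 0
    for rank, rel in enumerate(labels, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(queries):
    """MAP over a set of queries, each given as a list of 0/1 relevance labels."""
    return sum(average_precision(q) for q in queries) / len(queries)

# Hypothetical relevance labels (1 = relevant) for the top-10 results of two queries.
q1 = [1, 1, 0, 1, 1, 1, 1, 1, 1, 1]
q2 = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
print(precision_at_k(q1, 10))                     # precision at cut-off 10
print(round(mean_average_precision([q1, q2]), 3))
```

MAP then averages these per-query values over the 10 test queries for each ISE, as done in section 4.1.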

E. Comparison of Results with Other Studies
A comparison between our results and the results of other studies will be analyzed, to determine whether our evaluation gives the same results as other studies or whether the results vary.

3.2 Second Phase: Image Features Analysis
In this phase we focus on analyzing a set of image features. Using MATLAB, the color and shape features are analyzed to identify their impact on the retrieved images.

A- Color Analysis
The Euclidian distance is used to analyze the color of the images retrieved by the SEs; it is the most common approach for comparing images. The images should be of similar size to be comparable using the Euclidian distance, which can be achieved easily in MATLAB. The distance between two images is calculated by equation (1):

D(H′, H) = D(H, H′) = √((H_R − H′_R)² + (H_G − H′_G)² + (H_B − H′_B)²)    (1)
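As an illustration of equation (1), the following minimal Python/numpy sketch (an assumption of this edit, not the study's MATLAB code) computes per-channel RGB histograms and compares them with the Euclidean distance; the image data is synthetic:

```python
import numpy as np

def rgb_histogram(image, bins=16):
    """Per-channel color histogram of an RGB image (H x W x 3, values 0-255)."""
    channels = [np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
                for c in range(3)]
    return np.concatenate(channels).astype(float)

def euclidean_distance(h1, h2):
    """Equation (1): Euclidean distance between two concatenated R/G/B histograms."""
    return float(np.sqrt(np.sum((h1 - h2) ** 2)))

# Synthetic stand-ins for two retrieved images of the same size.
rng = np.random.default_rng(0)
img_a = rng.integers(0, 256, size=(64, 64, 3))
img_b = rng.integers(0, 256, size=(64, 64, 3))

d = euclidean_distance(rgb_histogram(img_a), rgb_histogram(img_b))
print(d)  # smaller distance = more similar color distributions
```

A distance of zero means identical histograms; note, as discussed later, that two images with similar histograms can still have different content.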

B- Shape Analysis The strategy followed for shape feature analysis consists of the following three steps: 1. Detect edges for each image. 2. Count the number of edges. 3. Compare images based on the number of edges and show result.
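The three steps above can be sketched in code. The following is a minimal Python/numpy illustration, not the MATLAB implementation used in the study: edges are detected with a Sobel filter (one common choice of edge detector), the count of edge pixels serves as a simple proxy for the number of edges, and the images are synthetic:

```python
import numpy as np

def sobel_edge_map(gray, threshold=100.0):
    """Step 1: detect edges with a Sobel filter; returns a boolean edge map."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    g = gray.astype(float)
    gx = np.zeros_like(g)
    gy = np.zeros_like(g)
    for i in range(1, g.shape[0] - 1):
        for j in range(1, g.shape[1] - 1):
            patch = g[i - 1:i + 2, j - 1:j + 2]
            gx[i, j] = np.sum(patch * kx)
            gy[i, j] = np.sum(patch * ky)
    return np.hypot(gx, gy) > threshold

def edge_pixel_count(gray):
    """Step 2: count edge pixels (a simple proxy for the number of edges)."""
    return int(sobel_edge_map(gray).sum())

def compare_by_edges(gray_a, gray_b):
    """Step 3: compare two images by the difference of their edge counts."""
    return abs(edge_pixel_count(gray_a) - edge_pixel_count(gray_b))

# Synthetic grayscale images: one with a bright square (strong edges), one flat.
img_square = np.zeros((32, 32)); img_square[8:24, 8:24] = 255
img_flat = np.zeros((32, 32))
print(edge_pixel_count(img_square), edge_pixel_count(img_flat))
```

Comparing images only by edge counts is deliberately crude; the results in section 4.2.2 show why this feature alone does not separate relevant from irrelevant images.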

4. Experiments and Evaluation
In this section we present the evaluation of the ISEs and compare our results with the results of previous research. Furthermore, we present an analysis of common image features using MATLAB.

4.1 First Phase Results and Evaluation

As presented in the methodology section, three web ISEs, namely Google, Yahoo and Ask, are evaluated. The average precision values are calculated over all test queries; afterwards, the mean average precision (MAP) value is calculated for the set of queries run on each individual ISE. The overall results of the experiment are discussed as follows. Table 2 shows the MAP values of the three ISEs for all queries:

Table 2: MAP values for the three ISEs.
Query number | Query name (Arabic) | Query name (English translation) | Google | Yahoo | Ask
1 | قطه | cat | 1 | 1 | 0.88
2 | قدم | foot | 0.92 | 0.94 | 0.14
3 | كرة السلة | basketball | 0.91 | 0.76 | 0.81
4 | استراليا فيكتوريا | Victoria Australia | 0.95 | 0.43 | 0.90
5 | الحياة الجامعية | university life | 0.69 | 0.72 | 0.72
6 | احذية كرة القدم | football shoes | 0.96 | 1 | 0
7 | الكاميرا الرقمية | digital camera | 0.98 | 0.95 | 0.95
8 | بيبسي ماكس | Pepsi max | 0.95 | 0.53 | 0.51
9 | ريد بول | Red bull | 0.98 | 0.90 | 0.75
10 | هابي فيت | Happy feet | 0.82 | 0.87 | 0.79

The average precision ratios calculated over all test queries are shown in fig. 1.

Fig. 1: The average precision ratio calculated over all test queries (per query, for Google, Yahoo and Ask).

Fig. 1 shows the precision ratio achieved by each ISE for every query; the highest and lowest values for each ISE can be read from the figure. To evaluate the performance and effectiveness of the three ISEs over the 10 queries, the mean average precision (MAP) value is calculated. The results obtained from the experiment are: Google 89%, Yahoo 81% and Ask 64%.

Fig. 2: Mean Average Precision (MAP) for the three ISEs.

As shown in fig. 2, Google achieved the highest MAP, Yahoo comes second, and Ask obtained the lowest precision ratio.

4.1.2 Result Comparison and Discussion
In this section we compare our evaluation results with the results reported in [1]. Our results for the 10 queries are compared for the three ISEs (Google, Yahoo and Ask); the comparison is shown in table 3.

Table 3: Precision results of the first 10 queries compared with the previous results in [1].
Query number | Arabic query (ours: Google / Yahoo / Ask) | English query ([1]: Google / Yahoo / Ask)
1 | قطه (100% / 100% / 100%) | cat (99% / 98% / 92%)
2 | قدم (92% / 94% / 15%) | foot (87% / 92.5% / 73%)
3 | كرة السلة (92% / 77% / 81%) | basketball (95.5% / 94.5% / 96%)
4 | استراليا فيكتوريا (96% / 44% / 91%) | Australia Victoria (92% / 93% / 76.5%)
5 | الحياة الجامعية (96% / 72% / 72%) | university life (89.5% / 69% / 78.5%)
6 | احذية كرة القدم (96% / 100% / 0%) | football shoes (100% / 95.5% / 71.5%)
7 | الكاميرا الرقمية (98% / 95% / 96%) | digital camera (99.7% / 99% / 75.5%)
8 | بيبسي ماكس (95% / 53% / 51%) | Pepsi max (78% / 89% / 64%)
9 | ريد بول (99% / 90% / 76%) | red bull (96.5% / 89.7% / 95%)
10 | هابي فيت (82% / 87% / 79%) | happy feet (100% / 93% / 99.5%)

As shown in table 3, the results for the 10 queries in our research have different values than those in [1]; however, in both studies Google has the highest precision ratio, Yahoo comes second and Ask has the lowest precision ratio, as shown in fig. 3. All ISEs obtained higher results in the previous study than in our experiment.

Fig. 3: MAP result of the previous study compared with our study.

4.2 Second Phase Results and Evaluation
As mentioned before, two important features are selected to analyze the images: color and shape. The analysis was performed on all the images in order to compare different images and to determine whether these features affect the images retrieved. The results of the image analysis are presented and described in the following subsections.

4.2.1 Color Analysis Results
Color analysis is performed using the color histogram. The color histogram was computed for the retrieved images, and the histograms of these images were then compared using the Euclidian distance. The Euclidian distances are calculated between selected images (relevant and irrelevant). The Euclidian distances for query 2 = "قدم" in Google are shown in fig. 4 and fig. 5.

Fig. 4: Euclidian distances between irrelevant image 10 and relevant images 1-7 for query 2 in Google (5.28E+03, 4.16E+03, 5.67E+03, 1.45E+03, 4.68E+03, 2.07E+03, 2.37E+03).

Fig. 5: Euclidian distances between relevant image 1 and relevant images 2-7 for query 2 in Google (3.38E+03, 2.71E+03, 5.49E+03, 4.17E+03, 5.68E+03, 5.83E+03).

The Euclidian distances for query 4 = "كاميرا رقمية" in Yahoo are shown in fig. 6 and fig. 7.

Fig. 6: Euclidian distances between irrelevant image 1 and relevant images 3, 4 and 5 for query 4 in Yahoo (3.06E+03, 2.50E+03, 4.00E+03).

Fig. 7: Euclidian distances between relevant image 5 and relevant images 3 and 4 for query 4 in Yahoo (4.04E+03, 3.13E+03).

The Euclidian distances for query 10 = "هابي فيت" in Ask are shown in fig. 8 and fig. 9.


Fig. 8: Euclidian distances between irrelevant image 2 and relevant images 1, 3 and 6 for query 10 in Ask (1.33E+03, 1.35E+03, 1.42E+03).

Fig. 9: Euclidian distances between relevant image 1 and relevant images 3 and 6 for query 10 in Ask (1.46E+03, 1.47E+03).

The results from this method indicate that the distance between relevant images is often greater than the distance between relevant and irrelevant images.

4.2.2 Shape Analysis Results
As mentioned earlier, the shape feature is analyzed in terms of edge detection. The selected images (relevant and irrelevant) were compared based on the number of image edges: the edge detection finds and counts the number of edges for each image. The results obtained by running query 5 = "الحياة الجامعية" in Google are shown in fig. 10 and fig. 11.


Fig. 10: Number of edges for query 5 relevant images returned by Google (images 1, 2, 3, 6, 8, 9: 29, 28, 12, 22, 35, 22).

Fig. 11: Number of edges for query 5 irrelevant images returned by Google (images 4, 5, 7, 10: 13, 24, 16, 20).

The results obtained by running query 2 = "قدم" in Yahoo are shown in fig. 12 and fig. 13.

Fig. 12: Number of edges for query 2 relevant images returned by Yahoo (images 1, 2, 4, 5, 6, 7, 8, 9: 12, 7, 4, 16, 7, 12, 12, 9).


Fig. 13: Number of edges for query 2 irrelevant images returned by Yahoo (images 3, 10: 19, 9).

The results obtained by running query 3 = "هابي فيت" in Ask are shown in fig. 14 and fig. 15.

Fig. 14: Number of edges for query 3 relevant images returned by Ask (images 1, 2, 3, 7, 8, 10: 17, 12, 17, 15, 21, 19).

Fig. 15: Number of edges for query 3 irrelevant images returned by Ask (images 4, 5, 6, 9: 26, 19, 33, 26).

The results show that the numbers of edges of the images retrieved by running different queries on the three SEs fall in the same range. We also found that some relevant images have the same number of edges as irrelevant images.


4.2.3 Results Comparison and Discussion
In this section we compare our evaluation results with the results in [1]. Our results indicated that, for all three SEs (Google, Yahoo and Ask), the histogram difference between the irrelevant and relevant images is greater than the histogram difference between the relevant images themselves; in [1], however, the results indicated that for all three SEs the histogram difference between relevant images is higher than the histogram difference between relevant and irrelevant images. As for the Euclidian distances calculated between selected images (relevant and irrelevant), our results showed that the distance between relevant images is often greater than the distance between relevant and irrelevant images; the results in [1] similarly indicated that the distance between relevant images is often higher than the distance between relevant and irrelevant images. As for the shape feature, the number of edges is not sufficient to identify relevant images: in both studies, two different images may have a similar number of edges.

5. Conclusion
In most ISEs, images are indexed not by their appearance but by the text found in the image context; current SE technology is based on keyword searches over the text that accompanies an image. In the first phase of this paper we developed a systematic collection of image queries written in Arabic to evaluate the performance of three major ISEs (Google, Yahoo and Ask). The target of these queries is to test the retrieval of content that is humanly meaningful, as opposed to machine-oriented features such as color, texture and shape. The major findings of this paper can be summarized as follows. The results indicated that Google has the best retrieval effectiveness with an 89% precision ratio, followed by Yahoo with 81% and lastly Ask with 64%. Google had the highest precision ratio in 5 queries and Yahoo had the best precision ratio in 5 queries, while Ask did not have the best precision ratio for any query. Our results were compared with the results of a previous study [1]: the MAP for Google in [1] was 95% versus 89% in our results, for Yahoo 91% versus 81%, and for Ask 83.7% versus 64%. This shows that the MAP of the three SEs in [1] is higher than in our results, which we attribute to the language of the queries. The second phase of this project tested the relationship between an image's color and shape information (content-based features) and the relevancy status of the images retrieved by the three ISEs, and compared the results obtained for the same English queries. A color histogram search represents an image by its color distribution, or histogram, but the main drawback of a global histogram representation is that object location, shape and texture information are discarded, leading to some false retrievals. This project showed that images retrieved using the global color histogram may not be semantically related even though they share a similar color distribution, a drawback that also reduces ISE performance. As for the shape feature, the number of edges is not sufficient to identify relevant images, since relevant and irrelevant images fall in the same range of edge counts. So, the three ISEs do not take image content-based information into consideration when retrieving image results for the user.

References
1. Abu Sini, M. and Abu Ata, B., 2013. Web Image Search Engine Evaluation. International Arab Journal of e-Technology (IAJeT). 3(2): 90-98.
2. Babenko, B. and Belongie, S., 2006. Improving web-based image search via content based clustering. Proceedings of the 2006 Conference on Computer Vision and Pattern Recognition Workshop, p. 106, June 17-22, New York, USA.
3. Bahmani, Z. and Azmi, R., 2011. Farsi/Arabic Document Image Retrieval through Sub-Letter Shape Coding for mixed Farsi/Arabic and English text. International Journal of Computer Science Issues (IJCSI). 8(5): 166-172.
4. Cakir, E., Bahceci, H. and Bitirim, Y., 2008. An Evaluation of Major Image Search Engines on Various Query Topics. The Third International Conference on Digital Telecommunications, pp. 161-165, June 29 - July 5, Bucharest, Romania.
5. Choras, R. S., 2007. Image Feature Extraction Techniques and Their Applications for CBIR and Biometrics Systems. International Journal of Biology and Biomedical Engineering. 1(1): 6-16.
6. Dinakaran, D., Annapurna, J. and Aswani, C., 2010. Interactive Image Retrieval Using Text and Image Content. Cybernetics and Information Technology. 10(3): 20-30.
7. Elagoz, M.T., Mendeli, M., Manioglulari, R.Z. and Bitirim, Y., 2008. An empirical evaluation on meta-image search engines. The Third International Conference on Digital Telecommunications, pp. 135-139, June 29 - July 5, Bucharest, Romania.
8. Malik, F. and Baharudin, B., 2013. Analysis of distance metrics in content-based image retrieval using statistical quantized histogram texture features in the DCT domain. Journal of King Saud University - Computer and Information Sciences. 25(2): 207-218.
9. Fukumoto, T., 2006. An analysis of image retrieval behavior for metadata type and Google image database. Information Processing and Management. 42: 723-728.
10. Gonzalez, R. C. and Woods, R. E., 2001. Digital Image Processing. Addison-Wesley, Boston, MA, USA.
11. Hiremath, P. S. and Pujari, J., 2007. Content Based Image Retrieval Based on Color, Texture and Shape Features using Image and its Complement. International Journal of Computer Science and Security. 1(4): 25-35.
12. Idrissi, K., Ricard, J. and Baskurt, A., 2002. An Objective Performance Evaluation Tool for Color Based Image Retrieval Systems. Proceedings of the IEEE International Conference on Image Processing, vol. 2, pp. 389-392, Sept. 22-25, New York, USA.
13. Kaur, M., Bhatia, N. and Singh, S., 2011. Web search engines evaluation based on features and end-user experience. International Journal of Enterprise Computing and Business Systems. 1(2): 1-19.
14. Kekre, H.B. and Sudeep, T., 2010. Image Retrieval with Shape Features Extracted using Gradient Operators and Slope Magnitude Technique with BTC. International Journal of Computer Applications. 6(8): 28-33.
15. Seetharaman, K. and Sathiamoorthy, S., 2016. A unified learning framework for content based medical image retrieval using a statistical model. Journal of King Saud University - Computer and Information Sciences. 28(1): 110-124.


16. Lei, Z., Fuzong, L. and Bo, Z., 1999. A CBIR Method Based on Color-spatial Feature. Proceedings of the IEEE Region 10 Annual International Conference (TENCON 99), pp. 166-169, Sept. 15-17, Cheju, Korea.
17. Liu, H., Xie, X., Tang, X., Li, Z.W. and Ma, W.Y., 2004. Effective browsing of web image search results. Proceedings of the 6th ACM SIGMM International Workshop on Multimedia Information Retrieval, pp. 84-90, Oct. 10-16, New York, USA.
18. Meenachi, S. and Srinivasagan, K. G., 2013. Design of color and texture based relevant image search engine. International Journal of Advanced Research in Computer Engineering & Technology. 2(1): 267-272.
19. Pu, H.T., 2005. A comparative analysis of web image and textual queries. Online Information Review. 29(5): 457-467.
20. Saha, S. K., Das, A. K. and Chanda, B., 2004. CBIR Using Perception Based Texture and Color Measures. International Conference on Pattern Recognition (ICPR'04), vol. 2, pp. 28-33, Aug. 23-26, Cambridge, UK.
21. Shijian, L., Linlin, L. and Chew, L. T., 2008. Document Image Retrieval through Word Shape Coding. IEEE Transactions on Pattern Analysis and Machine Intelligence. 30(11): 1913-1918.
22. Stevanson, K. and Leung, C., 2005. Comparative evaluation of web image search engines for multimedia applications. IEEE International Conference on Multimedia and Expo (ICME 2005), pp. 1-4, July 6-8, Amsterdam, The Netherlands.
23. Tico, M., Haverinen, T. and Kuosmanen, P., 2000. A Method of Color Histogram Creation for Image Retrieval. Proceedings of the Nordic Signal Processing Symposium (NORSIG), pp. 157-160, June 13-15, Kolmården, Sweden.
24. Wang, H., Liu, S. and Chia, L.T., 2006. Does ontology help in image retrieval? - A comparison between keyword, text ontology and multi-modality ontology approaches. Proceedings of the 14th Annual ACM International Conference on Multimedia, pp. 109-112, Oct. 23-27, Santa Barbara, CA, USA.

