EFFICIENT TAG MINING VIA MIXTURE MODELING FOR REAL-TIME SEARCH-BASED IMAGE ANNOTATION∗

Lican Dai†, Xin-Jing Wang‡, Lei Zhang‡, Nenghai Yu†

† Department of EEIS, University of Science and Technology of China, Hefei, 230027, China
‡ Microsoft Research Asia, Haidian District, Beijing, 100080, China
[email protected], {xjwang, leizhang}@microsoft.com, [email protected]

ABSTRACT

Although it has been extensively studied for many years, automatic image annotation remains a challenging problem. Recently, data-driven approaches have demonstrated great success in image auto-annotation. Such approaches leverage abundant partially annotated web images to annotate an uncaptioned image. Specifically, given an uncaptioned image as a query, they first retrieve a group of visually similar images and then mine meaningful phrases from the surrounding texts of the image search results. Since the surrounding texts are generally noisy, effectively mining meaningful phrases is crucial to the success of such approaches. We propose a mixture modeling approach which assumes that a tag is generated from a convex combination of topics. Different from a typical topic modeling approach such as LDA, topics in our approach are explicitly learnt from a definitive catalog of the Web, i.e. the Open Directory Project (ODP). Compared with previous works, it has two advantages: first, it uses an open vocabulary rather than one limited by a training set; second, it is efficient enough for real-time annotation. Experimental results on two billion web images show the efficiency and effectiveness of the proposed approach.

Index Terms— Search-based image annotation, tag mining, topic space modeling

∗ This work was performed at Microsoft Research Asia.

1. INTRODUCTION

Automatic image annotation has become one of the core research topics in computer vision and multimedia due to its importance to many useful applications, including image search and photo management. Recently, the emergence of large image databases has provided new opportunities to advance research in image annotation. Data-driven approaches have demonstrated great potential to solve the image annotation problem [1, 2, 3, 4]. Such approaches leverage a search-to-annotation strategy and utilize rich media information (such as image filename, URL, and surrounding texts) which is publicly available on the Web to understand an image. Compared with traditional computer vision or machine learning approaches, they do not require a training stage and annotate an image with an open vocabulary.

Given an uncaptioned image as a query, a typical data-driven approach annotates it in two steps: 1) image search: search for a group of visually similar images; and 2) tag mining: mine meaningful tags from the associated texts of the retrieved images. A great number of works discuss how to discover visually similar images [2, 3, 5, 6]. Some recent works paid special attention to the effect of dataset size on image retrieval. For example, Wang et al. [1] collected 2.4 million high-quality web images and employed low-level color and texture features to measure visual similarity. Torralba et al. [3] collected about 80 million low-resolution images and confirmed that recognition performance improves as the image dataset grows. The Arista approach [2] further investigated this problem on two billion images. In contrast, few previous works focus on efficient and effective tag mining techniques for real-time annotation. In [1] and [2], the Search Results Clustering (SRC) technique [7] was adopted to discover salient phrases from image surrounding texts; it measures the statistical importance of an n-gram, i.e. a sequence of n words. Though SRC is efficient, we will show that its effectiveness can be greatly improved. A few researchers adopted topic modeling approaches, such as Latent Dirichlet Allocation (LDA) [8] and Probabilistic Latent Semantic Analysis (pLSA) [9], to generate image annotations and have shown promising performance. Such approaches assume that image tags are generated from a hidden topic layer rather than directly from raw image features.
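The two-step search-to-annotation flow above can be sketched as a minimal skeleton. This is an illustration only: `retrieve_similar_images` and `mine_tags` are hypothetical placeholders for the retrieval and tag mining components, not part of any cited system.

```python
def annotate(query_image, retrieve_similar_images, mine_tags, k=50):
    """Data-driven annotation sketch: 1) search for visually similar web
    images, 2) mine salient phrases from their noisy surrounding texts."""
    # Step 1: image search over a large, partially annotated web corpus.
    results = retrieve_similar_images(query_image, top_k=k)
    # Step 2: tag mining over the surrounding texts of the search results.
    surrounding_texts = [img["surrounding_text"] for img in results]
    return mine_tags(surrounding_texts)
```

Any concrete retrieval backend and phrase miner can be plugged in; the rest of the paper concerns the second step.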
However, such approaches have two disadvantages when adopted in a practical real-time image annotation engine. First, they handle limited vocabularies: there is a training stage, and the vocabulary is limited to the terms appearing in the training set. Second, the training process is too time-expensive. On the other hand, there are a few interesting works tackling the annotation refinement problem [10, 11, 12]. However,

such works require that an initial annotation step is available and aim to refine the suggested tags for better effectiveness. There are three basic requirements for a practical tag mining approach for real-time image annotation: 1) it supports an open vocabulary; 2) it is efficient enough for an online approach; and 3) it is effective in generating specific annotations, which are preferably long phrases. In this study, we formulate the tag mining problem as ranking candidate n-grams, which are generated by evaluating certain statistical properties, and propose a mixture modeling approach for n-gram ranking. The top-ranked n-grams are taken as the image annotations. Specifically, inspired by the idea of topic modeling [8, 9], we assume that a semantic annotation is generated from a convex combination of topics, while the topic probabilities are conditioned on the image to be annotated. Moreover, the topic space in our approach is explicit rather than latent, and is learnt from the Open Directory Project (ODP) [13], in which a topic is defined as a word distribution. The motivation of learning an explicit topic space from ODP is twofold. First, ODP is the largest and most comprehensive human-edited directory of the Web. It not only supports an open vocabulary, making it a good fit for a practical image annotation system, but also defines a ground-truth topic space with its manually edited ontology. The webpages associated with each tree node provide ideal training sets for our topic learning. Second, based on the explicit topic space, an online service can be built so that, given a textual query, its related topics can be retrieved instantly. This is crucial to meet the real-time annotation requirement. We dumped about two billion web images from a commercial search engine, based on which a series of evaluations were conducted to measure the proposed tag mining approach.
Annotation results show the efficiency and effectiveness of our proposed approach. The rest of this paper is organized as follows. In Section 2, we present the topic space generation approach. Section 3 details the mixture model. Experimental results and discussion are provided in Section 4, and we conclude in Section 5.

2. EFFICIENT TOPIC SPACE LEARNING

Before presenting the mixture model for image annotation, we describe how we learn a hierarchical topic space leveraging the Open Directory Project (ODP) [13], which is a core component of the mixture model. The idea of capturing the correlations between an image and a word via their interrelationship of topics is inspired by the recent success of topic modeling approaches [8, 9] in text mining and computer vision. A desirable feature of ODP for topic learning is that each node is associated with a large amount of manually selected

webpages. We leverage this feature to build our hierarchical topic space. The space is defined on the entire ODP ontology, and each topic is a term distribution generated from the associated webpages of a certain tree node. We refer to an ODP tree node as a category hereafter. This process is non-trivial. First, identifying topic-specific terms is not easy since they generally occupy a small fraction of the associated webpages of a category. Second, efficiently matching a query to the learnt topics is challenging since there are nearly one million categories. We tackle the two problems as follows.

2.1. Sentence-wise Topic Relevance Identification

Though the associated webpages of an ODP category are manually chosen by human experts, which means they are about the same topic, it is still reasonable to assume that a webpage contains two types of sentences: topic-related and topic-unrelated. Typically, topic-related sentences use a small vocabulary since the corresponding terms concern the same topic. In contrast, topic-unrelated terms cover diverse semantics and generally form a much larger vocabulary. Motivated by this, we propose a sentence importance measure that scores a sentence according to its importance in interpreting the topic of an ODP category. We define a sentence s_i to be similar to a sentence s_j if their cosine similarity is larger than a predefined threshold ε. Meanwhile, s_i is assumed to be similar to a webpage d_j if at least one sentence of d_j is similar to s_i. Denoting this relationship by S(s_i, d_j), we have S(s_i, d_j) = 1 if s_i is similar to d_j, and S(s_i, d_j) = 0 otherwise. Based on these assumptions, we define the sentence frequency (SF) of s_i as in Eqn. (1):

    SF_i = Σ_{j=1}^{n} S(s_i, d_j)    (1)

where n is the total number of webpages associated with the category under consideration. Fundamentally, SF measures the confidence that a sentence is topic-related. Intuitively, the larger the frequency, the more important the sentence is in representing the topic of the corresponding category. On the other hand, if a sentence s_i is also similar to webpages from other categories, then it is unimportant or noisy for representing the topic, i.e. topic-unrelated. We define the inverse webpage frequency (IWF) to quantify this as in Eqn. (2):

    IWF_i = log( N / Σ_{k=1}^{N} S(s_i, d_k) )    (2)

where N is the number of webpages randomly sampled from all the ODP categories.
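As a concrete illustration, the SF and IWF measures of Eqns. (1)–(2) can be sketched as below. This is a minimal sketch under assumptions not stated in the paper: sentences are represented as sparse {term: weight} dictionaries, and `eps` plays the role of the threshold ε.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors ({term: weight} dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def similar_to_page(s, page, eps=0.5):
    """S(s_i, d_j): 1 if at least one sentence of the webpage is similar to s."""
    return 1 if any(cosine(s, t) >= eps for t in page) else 0

def sentence_frequency(s, category_pages, eps=0.5):
    """SF_i = sum_j S(s_i, d_j) over the n pages of one category (Eqn. 1)."""
    return sum(similar_to_page(s, d, eps) for d in category_pages)

def inverse_webpage_frequency(s, sampled_pages, eps=0.5):
    """IWF_i = log(N / sum_k S(s_i, d_k)) over N sampled pages (Eqn. 2)."""
    hits = sum(similar_to_page(s, d, eps) for d in sampled_pages)
    n = len(sampled_pages)
    # Guard (an assumption): a sentence unseen in the sample gets log(N).
    return math.log(n / max(hits, 1))
```

A sentence frequent within its own category (high SF) but rare in the random sample (high IWF) is the localized, topic-related case the paper targets.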

Based on the two measures, we define the sentence importance (SI) of s_i as:

    SI_i = (SF_i × IWF_i) / Σ_{j=1}^{m} (SF_j × IWF_j)    (3)

which is the normalized SF-IWF score, where m is the total number of sentences in the webpages belonging to a certain category. The higher the SI score, the more important the corresponding sentence is for the topic. The underlying idea is that a sentence is important for a certain category if it has a localized distribution, i.e. it has many similar sentences in this category but few in other categories. This is similar to the TF-IDF weighting scheme [14] widely used in text search and mining. Based on the proposed SI score, we adopt a simple yet efficient sentence-wise topic relevance identification method to separate topic-related sentences from topic-unrelated ones. First, we parse webpages into sentences and represent each sentence in the vector space model. Second, for each category, we cluster all its sentences using k-means clustering. We define the cluster importance (CI) of the kth cluster as its average sentence importance:

    CI_k = (1/M) Σ_{i=1}^{M} SI_i    (4)

where M is the number of sentences in the cluster. Since topic-related sentences tend to share similar terms, they are likely to be grouped into one cluster. Moreover, since these sentences tend to have higher sentence importance scores, the corresponding cluster is likely to have a higher cluster importance score. Therefore, we rank clusters by their CI scores and assume the top cluster contains the topic-related sentences. We then build a vector space model from the topic-related sentences, which defines the corresponding topic. This process is iterated over all the ODP categories, resulting in the hierarchical topic space.

2.2. Efficient Topic Space Indexing and Matching

To efficiently identify the topic distribution of a given textual query, we index the learnt topics, an idea widely adopted by web search applications. Specifically, since a topic is represented as a weighted list of terms, we treat it as a document and build an inverted index over the topics for efficient retrieval. By loading the inverted index file into memory, we construct an online service for topic matching: given a query term, all the topics that contain this term are instantly returned. If a query contains multiple terms, exactly the same technique as in general web search is applied, which outputs a ranked list of the intersection of the topics indexed by each distinct query term. By these means, real-time topic matching with an open vocabulary is realized.
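The cluster selection of Section 2.1 and the inverted-index matching of Section 2.2 can be sketched together as follows. This is a stdlib-only illustration under assumed inputs: clusters arrive as (SI score list, sentence vector list) pairs, and topic names are arbitrary labels; the paper's actual k-means step and ranking are not reproduced.

```python
from collections import defaultdict

def cluster_importance(si_scores):
    """CI_k: average sentence importance of one cluster (Eqn. 4)."""
    return sum(si_scores) / len(si_scores)

def build_topic(clusters):
    """Keep the top-CI cluster and merge its sentences into one normalized
    term distribution. Each cluster is (SI_scores, sentence_vectors)."""
    best = max(clusters, key=lambda c: cluster_importance(c[0]))
    topic = defaultdict(float)
    for sent in best[1]:
        for term, w in sent.items():
            topic[term] += w
    z = sum(topic.values())
    return {t: w / z for t, w in topic.items()}

def build_inverted_index(topics):
    """Map each term to the set of topics containing it, so a query term
    retrieves its candidate topics in one lookup."""
    index = defaultdict(set)
    for name, dist in topics.items():
        for term in dist:
            index[term].add(name)
    return index

def match_topics(index, query_terms):
    """Intersect the topic sets indexed by each distinct query term."""
    sets = [index[t] for t in set(query_terms) if t in index]
    return set.intersection(*sets) if sets else set()
```

In the paper's service the matched topics would further be re-ranked by cosine similarity to the query; that step is omitted here for brevity.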

3. MIXTURE MODELING FOR TAG MINING

In this section, we present the mixture modeling approach for tag mining in detail. Fundamentally, the aim of image annotation is to find a group of n-grams w∗ that maximizes the conditional probability p(w|I_q), as described in Eqn. (5), where I_q is the query image and w is an n-gram in the vocabulary. By applying the Bayesian rule, we obtain Eqn. (6), where I_i denotes the ith image in the database, p(I_i|I_q) captures the similarity between I_i and I_q, and p(w|I_i) evaluates the semantic correlation between I_i and w. If we assume that there is a topic layer, so that an image is represented as a mixture of topics and terms or phrases are generated from these topics, then we obtain Eqn. (7). To evaluate p(t|I) and p(w|t), the topic modeling approaches [8, 9] use a generative process. Though technically elegant, they need to learn a great number of parameters, which makes the process time-expensive and less scalable. The SRC technique [7] adopted in [1, 2] instead simulates the two probabilities via a label-based document clustering approach. It is efficient but not effective enough, as we will show in Section 4. In this study, we propose an efficient and effective approach based on the hierarchical topic space presented above. Specifically, by further applying the Bayesian rule, we reformulate Eqn. (7) as Eqn. (8). The motivation is to leverage the topic matching service, in which an n-gram w is used to query a topic t, as presented in Section 2.2. This is key to ensuring real-time tag mining.

    w∗ = arg max_w p(w|I_q)    (5)
       = arg max_w Σ_i p(w|I_i) · p(I_i|I_q)    (6)
       = arg max_w Σ_t Σ_i p(w|t) · p(t|I_i) · p(I_i|I_q)    (7)
       = arg max_w Σ_t Σ_i [p(w) · p(t|w) / p(t)] · p(t|I_i) · p(I_i|I_q)    (8)
       ≈ arg max_w Σ_t Σ_{i∈Θ_q} [p(w) · p(t|w) / p(t)] · p(t|I_i) · p(I_i|I_q)    (9)

Note that we focus on the tag mining step in the data-driven image annotation framework. That is, we are given a group of image search results, and the task is to mine semantic annotations from their surrounding texts. Therefore, the target can be simplified as Eqn. (9), where Θ_q denotes the set of retrieved images and p(I_i|I_q) simulates the search process¹. In the following subsections, we present in detail our methods of 1) candidate phrase extraction and 2) probability evaluation, based on which a candidate phrase is scored and

¹ For each retrieved closely similar result I_i, we simply assume a constant value of p(I_i|I_q) and in fact ignore it in the implementation.

the top-ranked ones whose scores are above a predefined threshold η are taken as annotations.

3.1. Candidate phrase extraction

There are two key factors in Eqn. (9): w and t (I_i is known). We have shown how t is explicitly represented in Section 2. As for w, a simple definition is to tokenize the surrounding texts of an image into words, or unigrams. Intuitively, this is not desirable for an annotation system, which prefers long phrases as annotations. To tackle this problem, we propose a novel method to generate meaningful n-grams as candidates for salient phrases, and we will show in Section 4 that this step improves annotation precision. The challenge lies in that there is no prior knowledge of which n-grams are good candidates, and we must balance efficiency and effectiveness. Specifically, we extract candidate phrases as follows. First, we parse the surrounding texts into words; all stopwords are kept because they can form meaningful phrases when adjacent to meaningful words. Second, a number of n-grams are identified according to the co-occurrences between terms. Highly frequent terms are also kept in this step. Note that an n-gram here does not necessarily consist of consecutive terms. Third, we compute a few properties for each n-gram, based on which candidate phrase ranking is performed. The properties are:

Document Frequency: D(w). This property adopts the traditional definition of document frequency. Since the documents under consideration (i.e. surrounding texts in our scenario) are related to closely similar images, a phrase mentioned by many documents is intuitively more likely to represent the semantics of the images than a less frequent one. Meanwhile, since noisy phrases can also be highly frequent, a manually constructed stop-phrase list is maintained for filtering them.
Phrase Frequency: F(w). Phrase frequency is the frequency of occurrence of a phrase within a certain document. It is an important supplement to document frequency D(w): the higher the value, the more likely a phrase is a good candidate.

Phrase Length: L(w). Intuitively, a longer phrase is more likely to be specific than a shorter one and is preferred by an annotation model.

    S(w) = a × D(w) + b × F(w) + c × L(w) + d    (10)

Based on the above properties, we train a linear regression model as in Eqn. (10) from a manually labeled training set {(w_i, y_i); i = 1, ..., Φ}, y_i ∈ {0, 1}. To collect this training set, we randomly selected 800 query images and extracted all n-grams from their search results. We then asked labelers to judge each w_i as positive or not: y_i = 1 if w_i is judged positive, and y_i = 0 otherwise. Based on this model, we score all the n-grams and output the top-ranked ones as candidate phrases, which are then re-ranked by probability evaluation to generate the final annotations. The details of probability evaluation are presented in the following subsection.

3.2. Probability evaluation

From Eqn. (9), we can see that the probability that a phrase w is a good annotation for image I_q is measured by the following probabilities: 1) the topic distribution of a retrieved image I_i, i.e. p(t|I_i); 2) the topic distribution of a candidate phrase, i.e. p(t|w); 3) the prior probability of a topic, p(t); and 4) the prior probability of a candidate phrase, p(w). We describe how to evaluate these probabilities below.

Image topic distribution: p(t|I_i). We leverage the learnt topic space to measure this probability. Specifically, given the surrounding text of an image I_i, we use it to query the topic matching service, which returns the intersection of the topics indexed by each distinct term of the query. The output topics are then re-ranked by their cosine similarities to the query, and the top-ranked ones are kept, which give p(t|I_i) after normalization.

Phrase topic distribution: p(t|w). The quantification of p(t|w) is similar to that of p(t|I_i). The only difference is that a candidate phrase is generally shorter than the surrounding text of an image. In the extreme case where a candidate phrase contains only one word, the topic search results of this word are used directly and no topic re-ranking step is applied.

Topic prior: p(t). To estimate p(t), we randomly sampled W = 50k images from the two-billion-image dataset. For each image I_l, we evaluate its topic distribution p(t|I_l), which gives a weighted topic vector. The average topic vector then approximates the topic prior²:

    p(t) = (1/W) Σ_{l=1}^{W} p(t|I_l)

Phrase prior: p(w). For each candidate phrase, we use it as a query to a commercial search engine. Let the number of search results be τ; we use the equation p(w) = e^{−τ/T} to estimate the phrase prior, where T = 3B. Since online querying for the phrase prior is time-expensive, we perform this process offline. In the implementation, we first collect a large phrase vocabulary and then compute the priors from the search-result counts. If a candidate phrase is not included in our vocabulary, we treat it as a specific phrase and assign it a constant value of p(w). This equation is designed based on the intuition that a common phrase (large τ) represents more general semantics than a rare phrase (small τ).

² For a topic that is not included in the sampled set, we set its p(t) to the minimal non-zero value of {p(t)} and then re-normalize {p(t)}, where {p(t)} denotes the list of priors defined on topics appearing in the sampled set.
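Putting the pieces together, the scoring of Eqn. (9) with the phrase prior p(w) = e^{−τ/T} can be sketched as follows. This is a minimal sketch under assumptions: the topic distributions are given as {topic: probability} dictionaries (as if returned by the topic matching service), p(I_i|I_q) is treated as constant per footnote 1, and `num_results` stands for the search-result count τ.

```python
import math

def phrase_prior(num_results, T=3e9):
    """p(w) = exp(-tau / T): common phrases (large tau) get a lower prior."""
    return math.exp(-num_results / T)

def score_phrase(p_t_given_w, p_t_given_images, p_t, num_results, T=3e9):
    """Score one candidate phrase w by Eqn. (9):
    sum_t [p(w) * p(t|w) / p(t)] * sum_{i in retrieved set} p(t|I_i),
    with p(I_i|I_q) assumed constant and dropped (footnote 1).
    Assumes p_t covers every topic in p_t_given_w (cf. footnote 2)."""
    pw = phrase_prior(num_results, T)
    score = 0.0
    for t, ptw in p_t_given_w.items():
        # Accumulated topic mass over the retrieved images Theta_q.
        image_mass = sum(dist.get(t, 0.0) for dist in p_t_given_images)
        score += pw * ptw / p_t[t] * image_mass
    return score

def rank_candidates(candidates, eta=0.0):
    """candidates: {phrase: score}; keep phrases above threshold eta, best first."""
    kept = [(w, s) for w, s in candidates.items() if s > eta]
    return sorted(kept, key=lambda x: -x[1])
```

The top-ranked phrases above the threshold η would then be emitted as the annotations.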

[Figure 1 plot: annotation precision (y-axis, 0.3 to 1.0) of each compared method for the overall, top-3, and top-1 settings.]

Fig. 1. Precision comparison of annotation results generated by different methods.

4. EXPERIMENTS

We conducted a series of experiments to evaluate the performance of the proposed method as well as the effects of its components, i.e. candidate phrase extraction and prior estimation. We employ the same near-duplicate detection technique as the Arista approach [2] to retrieve closely similar images from the two-billion-image dataset. Based on the detected duplicate images, we apply the proposed mixture model to annotate images. 1,200 images with no fewer than three duplicates were randomly selected to form the query set for the evaluations.

[Fig. 1 legend: SRC; Mixture model with SRC; Mixture model without priors; Mixture model with topic prior; Mixture model with candidate phrases.]

[Fig. 2 table: for each query image, the top-3 annotations from SRC and from our approach, e.g. "house painting", "hardwood floor", "interior design"; "michael jackson the essential", "pop music"; "apple ipod touch", "touch screen"; "sony bravia kdl 1080p flat panel lcd hdtv"; alongside noisier outputs such as "themoneytimes comments, featured stock, food", "lcd tv lcd monitor product id", "general mills logo money brand", and "inde, asp note comments pages, public text quote clip".]

Fig. 2. Examples of annotation results generated by our approach and SRC. To save space, only the top 3 annotation results are shown, and positive ones are highlighted in bold.

4.1. Experimental setting

4.1.1. Evaluation criterion

It is difficult to create ground-truth annotations for this work because an image is worth a thousand words and we target practical image annotation with an open vocabulary: it is very challenging, if not impossible, to enumerate all possible annotations for an image. Therefore, we simply asked labelers to judge each suggested annotation as positive or not, based on which the annotation precision was evaluated. Ten volunteers were involved in the evaluation, and each of them evaluated all the results. An annotation was counted as positive only if seven of the ten labelers marked it as positive. To ensure quality, Bing and Google image search results for each annotation were provided to assist the labeling. We use P@K = |C ∩ R| / |R| to measure the annotation precision, where R is the set of top-K annotations and C is the set of manually labeled positive ones.

4.1.2. Compared algorithms

For comparison, the SRC technique used by the state-of-the-art Arista approach [2] served as the baseline. To evaluate the effects of candidate phrase extraction and probability evaluation on annotation performance, two methods named "Mixture model with SRC" and "Mixture model with candidate phrases" were compared with the baseline. They apply the probability evaluation and ranking to SRC's results and to the candidate phrases extracted by the regression model, respectively. Furthermore, to evaluate the influence of topic and phrase prior estimation in the probability evaluation, another two methods, "Mixture model without priors" and "Mixture model with topic prior", were developed as well: one ranks the candidate phrases without any prior, and the other ranks them with the topic prior only.

4.2. Experimental results

Figure 1 shows the annotation precision of the different methods, where P@1, P@3, and the overall precision (i.e. all annotation results included in the evaluation) are reported. Specifically, "SRC" denotes the SRC technique used in [1, 2]. "Mixture model with SRC" denotes the method which uses the SRC-suggested annotations as w in Eqn. (9) and then applies our proposed mixture model to re-rank them. The performance gap between this method and "Mixture model with candidate phrases" shows the effectiveness of our candidate phrase extraction method, whereas the gap between this method and SRC shows the effectiveness of the proposed mixture model. "Mixture model without priors" uses identical p(w) and p(t) over all images and candidate phrases in Eqn. (9), whereas "Mixture model with topic prior" fixes p(w) but varies p(t). These two methods demonstrate the importance of estimating the priors. Several observations from Figure 1 are: 1) Our approach greatly improves annotation performance. The overall precision of our method is 81.03%, which is 23.95% better than SRC. Figure 2 shows a few examples of annotation results generated by our approach and SRC. From the figure, it can be seen that our approach not only gives a higher precision but also provides more specific

[Fig. 3 content: query images with their top-one annotations, e.g. cool haircut, crying, nicole kidman, friendship, arturo gatti wife, pencil drawing fairy, nokia 2720 fold, fusion sushi, girls cowboy boots, pork shoulder roast, sony psp 3000, root beer float, mermaid swim fins, thierry mugler angel perfume, niagara falls, lion king kovu, 2010 chevrolet camaro ss, wedding bouquet, miniature pinscher, super mario 64 ds, taj mahal.]

Fig. 3. Examples of query images and their top-one annotations generated by our approach.

annotations. 2) We achieved very good performance on the top-one result. This is quite appealing for scenarios which require short and precise texts for an image, such as web image search and image search on mobile devices. A few examples of top-one annotations are shown in Figure 3. 3) The candidate phrase extraction approach is quite effective, which also suggests that generating good candidate phrases is important for high annotation performance. 4) Topic priors and phrase priors provide more knowledge of the importance of a candidate phrase, which helps improve annotation performance.

The efficiency of our approach was also evaluated. Based on the online topic matching service, our tag mining approach costs about 0.33 seconds on average on one Intel Core 2 Quad CPU with 2 GB memory. This time includes candidate n-gram generation, probability evaluation, and ranking. To guarantee efficiency, at most the top 50 duplicate image search results were used for tag mining.

5. CONCLUSION

In this paper, we proposed a topic-based mixture model for tag mining in the scenario of data-driven image annotation. Based on a human-edited Web directory, we learn a topic space in which each topic is represented as a word distribution. We build an inverted index over the topic space so that, given a textual query, its related topics can be efficiently obtained; this is crucial to support real-time image annotation. Meanwhile, the learnt topic space supports practical image annotation since it uses an open vocabulary. Given an uncaptioned image, we extract candidate phrases and rank them with a mixture model leveraging the learnt topic space. Experimental results on real web images demonstrate the effectiveness and efficiency of our proposed approach.

6. REFERENCES

[1] Xin-Jing Wang, Lei Zhang, Feng Jing, and Wei-Ying Ma, "Annosearch: Image auto-annotation by search," in CVPR, 2006.
[2] Xin-Jing Wang, Lei Zhang, Ming Liu, Yi Li, and Wei-Ying Ma, "Arista - image search to annotation on billions of web photos," in CVPR, 2010.
[3] Antonio Torralba, Robert Fergus, and William T. Freeman, "80 million tiny images: A large data set for nonparametric object and scene recognition," IEEE Transactions on PAMI, 2008.
[4] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Fei-Fei Li, "Imagenet: A large-scale hierarchical image database," in CVPR, 2009.
[5] Arnold W. M. Smeulders, Marcel Worring, Simone Santini, Amarnath Gupta, and Ramesh Jain, "Content-based image retrieval at the end of the early years," IEEE Transactions on PAMI, 2000.
[6] Yong Rui, Thomas S. Huang, and Shih-Fu Chang, "Image retrieval: Current techniques, promising directions and open issues," Journal of Visual Communication and Image Representation, 1999.
[7] Hua-Jun Zeng, Qi-Cai He, Zheng Chen, Wei-Ying Ma, and Jinwen Ma, "Learning to cluster web search results," in SIGIR, 2004.
[8] David M. Blei, Andrew Y. Ng, and Michael I. Jordan, "Latent dirichlet allocation," Journal of Machine Learning Research, 2003.
[9] Thomas Hofmann, "Unsupervised learning by probabilistic latent semantic analysis," Machine Learning, 2001.
[10] Yohan Jin, Latifur Khan, Lei Wang, and Mamoun Awad, "Image annotations by combining multiple evidence & wordnet," in ACM Multimedia, 2005.
[11] Changhu Wang, Feng Jing, Lei Zhang, and Hong-Jiang Zhang, "Image annotation refinement using random walk with restarts," in ACM Multimedia, 2006.
[12] Changhu Wang, Feng Jing, Lei Zhang, and Hong-Jiang Zhang, "Content-based image annotation refinement," in CVPR, 2007.
[13] ODP, "The open directory project," http://www.dmoz.org/, 2011.
[14] Gerard Salton and Chris Buckley, "Term weighting approaches in automatic text retrieval," Information Processing and Management, 1988.
