Ranking Domain Objects by Wisdom of Web Pages

Honghan Wu
School of Computer and Software, Nanjing University of Information Science & Technology, China
[email protected]

Ago Luberg
Tallinn University of Technology, Estonia
ELIKO Technology Competence Centre, Tallinn, Estonia
[email protected]

Tanel Tammet
Department of Computer Science, Tallinn University of Technology, Estonia
ELIKO Technology Competence Centre, Tallinn, Estonia
[email protected]

ABSTRACT
Collaborative ranking is a way to utilize the wisdom of the crowd to recommend objects. It has been shown to be popular and effective in many domains. However, the wisdom of the crowd is not easy to obtain: it requires an active community, and it takes time for the true wisdom to form. In this paper, we introduce an approach that exploits the wisdom of web pages to rank domain objects, as a substitute when the wisdom of the crowd is not yet formed or not available at all. We evaluate our work on three real-world datasets. The results show that web-page collaborative ranking is a promising way to imitate the wisdom of the crowd.


Categories and Subject Descriptors
H.3.3 [Information Systems]: Information Storage and Retrieval – Information Search and Retrieval

General Terms
Experimentation, Algorithms, Measurement

Keywords
Object Ranking, Collaborative Ranking, Web Mining

1. INTRODUCTION
User collaborative ranking is a way to recommend items (such as music, books or sightseeing spots) or social elements (such as celebrities or groups) by leveraging user experiences (such as reviews or ratings). This method has been argued to be effective for decision making [1]. In practice, it is widely adopted and has proven very popular in many domains, for example Amazon in e-commerce and Tripadvisor in tourism.

At this writing, 6 of the 10 top-ranked web sites are Web 2.0 sites according to the Alexa traffic rank (http://www.alexa.com/topsites). This means a large share of internet traffic goes to user-generated content, and users' experiences and opinions on the Web keep growing rapidly. Hence, one possible way to generate a ranking that conforms to user collaborative rankings is to exploit this information to compile the desired object rankings.

The wisdom of the crowd is effective, but it is costly to obtain. It requires an active community with a considerable user base, and it takes time for the true wisdom to form, especially for newly emerging items. An interesting question is therefore: "Is there an automatic way to rank domain objects similarly to user collaborative rankings?" Inspired by this question, in this paper we focus on the problem of imitating user collaborative ranking and propose an approach based on web-page collaborative ranking.

To summarize, our contributions in this work are as follows.

- We propose a framework for ranking objects by exploiting the "wisdom of web pages".
- Within the framework, we introduce several models that characterize how web pages collaboratively rank objects.
- We use various metrics to rank objects, and conduct experiments on real-world datasets in the tourism domain.

In Section 2, we give an overview of our framework. In Section 3, we introduce the web-page collaborative ranking models. Related work is surveyed in Section 4. In Section 5, we discuss our experimental results, followed by a conclusion in Section 6.

2. OVERVIEW

2.1 General Framework
Inspired by the aforementioned observations, the basic idea of our approach is to take web pages as the opinion holders. The wisdom of web pages is then aggregated as a substitute for the wisdom of the crowd (users) when the latter is missing or not yet formed. Formally, our ranking approach can be described as a function R, defined in Definition 1.

R(O) = V(F(O))

- O denotes the set of objects to be ranked;
- F(x) is a function that returns a data structure describing the relation between a set of web pages and the objects they describe;
- V(x) is a function that generates an object ranking from the result of F(x).

Definition 1. Object Ranking Function

The input of the function R is a set of objects to be ranked, such as restaurants in London or books about data mining. More specifically, the input is a set of object names. The output is a ranking of these objects, represented by a list of tuples. Each tuple has two elements: the first is the object (o_i), and the second is a numeric score (s_i) representing the popularity of that object.

R(O) = (⟨o_1, s_1⟩, ⟨o_2, s_2⟩, …, ⟨o_n, s_n⟩)

Basically, our ranking approach is composed of two steps. The first step (function F) finds the web pages which describe the objects in question. The second step (function V) generates ranking(s) from the result of the first step.
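To make the composition concrete, the following is a minimal sketch in Python; the parameter names and types are our own illustration, not part of the paper.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical stand-ins for the paper's F and V functions.
# F: object names -> webpage-object relation (here: page URL -> objects it describes)
# V: that relation -> ranked list of <object, score> tuples
Relation = Dict[str, Set[str]]

def rank_objects(objects: List[str],
                 F: Callable[[List[str]], Relation],
                 V: Callable[[Relation], List[Tuple[str, float]]]
                 ) -> List[Tuple[str, float]]:
    """R(O) = V(F(O)): find opinion-holding pages, then score the objects."""
    return V(F(objects))
```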

2.2 Finding Opinion Holders (Web Pages)

[Figure 1. The process of the F(x) function: Generating Queries → Searching Web Pages → Identifying Objects]

The function F is designed to locate web pages holding relevant opinions about the objects in question. Its process is composed of three sub-steps, as shown in Figure 1.

1. Generating queries. Given a set of objects, this function first generates a set of queries, which will then be submitted to a search engine to find relevant web pages. Query generation is a critical step in our framework: different queries result in different sets of web pages, which in turn result in different opinions being exploited. This sub-step produces an object-query relation.

2. Searching web pages. After the queries are generated, they are submitted to a web search engine to retrieve relevant web pages. Note that the rank of each result webpage is recorded for later use. The result of this sub-step is a query-webpage relation.

3. Identifying objects. The last sub-step identifies objects in the relevant web pages. This is a typical named entity recognition process; we adopt a dictionary-based method for its simplicity and efficiency. The result of this sub-step is a webpage-object relation.
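A possible sketch of these three sub-steps, assuming a hypothetical `search(query)` call that returns result URLs in rank order and a hypothetical `fetch_text(url)` that returns page text; the query template is likewise an illustrative assumption:

```python
from typing import Callable, Dict, List, Set

def generate_queries(objects: List[str]) -> Dict[str, str]:
    """Sub-step 1: object-query relation (the template is an assumption)."""
    return {name: f'"{name}"' for name in objects}

def search_pages(queries: Dict[str, str],
                 search: Callable[[str], List[str]]) -> Dict[str, List[str]]:
    """Sub-step 2: query-webpage relation; list order preserves the search rank."""
    return {name: search(query) for name, query in queries.items()}

def identify_objects(query_pages: Dict[str, List[str]], objects: List[str],
                     fetch_text: Callable[[str], str]) -> Dict[str, Set[str]]:
    """Sub-step 3: dictionary-based named entity recognition over page text,
    yielding the webpage-object relation."""
    page_objects: Dict[str, Set[str]] = {}
    for urls in query_pages.values():
        for url in urls:
            text = fetch_text(url).lower()
            page_objects[url] = {o for o in objects if o.lower() in text}
    return page_objects
```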

The result of the function F is composed of three bipartite graphs, as depicted in Figure 2. From left to right, they are the intermediate results of sub-steps 1 to 3, respectively. In Section 3, we will discuss how rankings are generated from these intermediate results.

[Figure 2. The three bipartite graphs generated by the function F(x): object-query (o_1…o_n to q_1…q_k), query-webpage (q_1…q_k to d_1…d_m), and webpage-object (d_1…d_m to o_1…o_n) relations]

3. COLLABORATIVE RANKING
In this section, we discuss the design considerations in generating a collective webpage ranking.

3.1 Modeling Opinion Holders
In our scenario, web pages are the opinion holders. To generate a collective rating from them, we have to know the opinion expressed in each webpage, i.e., whether it is positive or negative. Different opinions should be differentiated explicitly and aggregated accordingly. Moreover, the trustworthiness of opinion holders is also very important in a collaborative ranking model: opinions from highly trustworthy web pages should be boosted, while spam opinions should be ignored completely. In this subsection, we present our considerations on these two issues.

3.1.1 Differentiating Opinions

Understanding the opinions in web pages is a non-trivial problem, and a line of work [2-5] in the web mining community has targeted it. Instead of modeling it as a data mining task, our framework addresses the issue through webpage selection. We adopt two strategies for query generation, each targeting a specific subset of relevant web pages whose opinions share a common characteristic. The first strategy, "select the describing", generates queries for finding web pages which simply describe the object in question. The second strategy, "select the recommending", generates specific queries to find web pages which recommend the object. Results from both strategies are integrated by a linear combination function, in which results from the second strategy are given a higher weight. Due to space limitations, we omit the details of the query generation methods and the combination function.
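Since the paper omits the query templates and the combination function, the sketch below fills them in with illustrative assumptions: a plain "describing" query, a "recommending" query biased toward recommendation pages, and an assumed weight of 0.7 on the recommending side.

```python
def strategy_queries(name: str) -> dict:
    """Two query-generation strategies; both templates are assumptions."""
    return {
        "describing": f'"{name}"',
        "recommending": f'"{name}" recommended OR "must see"',
    }

def combine(describing_score: float, recommending_score: float,
            w_rec: float = 0.7) -> float:
    """Linear combination with the recommending strategy weighted higher;
    w_rec = 0.7 is an assumed value, not reported in the paper."""
    return (1.0 - w_rec) * describing_score + w_rec * recommending_score
```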

3.1.2 Trusting the Trustworthy
The Web is a diverse information source, and it is not surprising to encounter untrustworthy information on it; writing fake reviews has even turned into a business. To keep high-quality results at the top, search engine companies such as Google appear to have effective defenses against such spam. This means that, in most cases, the top-ranked web pages are more trustworthy than those ranked lower. Based on this assumption, we use the rank of a web page in the search results as a measure of its trustworthiness.
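The paper does not specify how a search rank maps to a weight W(d); a reciprocal-rank weight is one natural, hedged choice:

```python
def page_weight(rank: int) -> float:
    """W(d) from the page's 1-based position in the search results.
    Reciprocal rank is an assumption; any monotonically decreasing
    function of the rank would fit the model."""
    return 1.0 / rank
```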

3.2 Object Importance Assessment
The modeling work above considered the factors that characterize the opinion of each individual webpage. In this subsection, we aggregate these individual opinions into an overall ranking of objects. More specifically, we discuss how to assess the importance of objects from the data structures depicted in Figure 2.

3.2.1 Frequency Metric
Given the webpage-object bipartite graph in Figure 2, a straightforward method to assess the importance of an object is to calculate how many times it is mentioned in web pages. This frequency-based metric is defined as follows.

A(G_{D,O}) = { ⟨o_i, f_i⟩ | f_i = Σ_{d ∈ D_{o_i}} W(d), o_i ∈ O }

- G_{D,O} is the webpage-object bipartite graph;
- D_{o_i} is the set of documents describing o_i;
- f_i denotes the importance of o_i;
- W(d) is the weight value of webpage d.

Definition 2. Frequency-Based Assessment

In Definition 2, the returned result is a set of tuples; the two elements in each tuple are the object and its importance value, respectively. W(d) is a numeric value indicating the aforementioned trustworthiness of a webpage.
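A direct reading of Definition 2 in Python, assuming the webpage-object graph is given as a mapping from each page to the objects it mentions and `page_rank` records each page's search rank (used for W(d) as sketched above):

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

def frequency_assessment(page_objects: Dict[str, Set[str]],
                         page_rank: Dict[str, int]) -> List[Tuple[str, float]]:
    """Definition 2: f_i is the weighted count of pages describing o_i."""
    scores: Dict[str, float] = defaultdict(float)
    for page, objects in page_objects.items():
        weight = 1.0 / page_rank[page]   # W(d); reciprocal rank is an assumption
        for obj in objects:
            scores[obj] += weight
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
```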

3.2.2 Centrality Metrics
Link-analysis-based methods have proven effective and popular in both academic research and practical use [6, 7]. Inspired by this, we also adopt centrality-based metrics to assess the importance of objects.

3.2.2.1 Object Linking Graph
The prerequisite for link analysis is an object linking graph. We introduce a reference model to generate the object linking graph from the three bipartite graphs in Figure 2.

M_ref(G_{D,O}) = { ⟨o_i, o_j⟩ | Top(D_{q_i}, k) ∩ D_{o_j} ≠ ∅ }

- D_{q_i} is the set of web pages found by query q_i;
- Top(D_{q_i}, k) denotes the top k ranked web pages in the result of q_i.

Definition 3. Reference Model

The intuition is to characterize the reference relations between objects. If one object o_i's introduction page(s) also mention another object o_j, then we say that o_i references o_j. In our scenario, the introduction pages of o_i are the top-ranked results of query q_i. The formal definition is given in Definition 3. Top(D_{q_i}, k) denotes the top-k ranked results of the query generated from o_i. If k is set to a low value, say 1 to 3, we can say that these documents are mainly about o_i. If they also describe o_j, then we add a link from o_i to o_j in the object linking graph.
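A sketch of the reference model, assuming `query_pages[o]` lists the results of object o's query in rank order and `page_objects` is the webpage-object relation from Section 2.2:

```python
from typing import Dict, List, Set, Tuple

def reference_edges(query_pages: Dict[str, List[str]],
                    page_objects: Dict[str, Set[str]],
                    k: int = 3) -> Set[Tuple[str, str]]:
    """Definition 3: link o_i -> o_j when a top-k page for o_i's query
    also describes o_j."""
    edges: Set[Tuple[str, str]] = set()
    for o_i, pages in query_pages.items():
        for page in pages[:k]:                      # Top(D_{q_i}, k)
            for o_j in page_objects.get(page, set()):
                if o_j != o_i:
                    edges.add((o_i, o_j))
    return edges
```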

3.2.2.2 Centrality Measurements
In our experiments, several well-known centrality metrics are used to assess the importance of nodes in the object linking graphs: degree, closeness, betweenness, and one variant of eigenvector centrality, PageRank. An introduction to these centrality metrics can be found on Wikipedia (http://en.wikipedia.org/wiki/Centrality).
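Given the object linking graph, these metrics are standard; a minimal sketch using the networkx library (our tooling choice, the paper does not name one):

```python
import networkx as nx

def centrality_scores(edges):
    """Compute the four centrality metrics used in the experiments."""
    graph = nx.DiGraph()
    graph.add_edges_from(edges)
    return {
        "degree": nx.degree_centrality(graph),
        "closeness": nx.closeness_centrality(graph),
        "betweenness": nx.betweenness_centrality(graph),
        "pagerank": nx.pagerank(graph),   # eigenvector-centrality variant
    }
```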

4. RELATED WORK
Among ranking algorithms, the most practical are those used by search engines. Hyper Search [8] was the first published technique to introduce link analysis to search engines; the most well-known ranking algorithms are Google's PageRank [6] and Kleinberg's HITS algorithm [7]. In network theory, centrality measures are used to determine the importance of nodes or edges in a graph. Another branch of work on ranking problems is learning to rank, which has emerged over the past decade; Tie-Yan Liu has presented a very good overview of this area [4]. Finally, collaborative filtering is another type of related work; readers can refer to the survey [9] for a comprehensive introduction.

5. EXPERIMENTS

5.1 Datasets, Standards and the Metric
We select museums as the type of objects to be ranked and collect data from three cities: Tallinn, London and New York. In total, three datasets are tested in our experiments.


Table 1. Dataset Statistics

Statistic Type              Tallinn museums            London museums            New York museums
# of Objects                21 (171)                   21 (227)                  21 (288)
# of Documents              9,736                      20,505                    27,068
# of Nodes                  116                        227                       288
# of Edges                  2,670                      16,245                    20,834
Standard ranking sources    Foursquare, Sightsplanner  Foursquare, Tripadvisor   Foursquare, Tripadvisor

Detailed statistics of these datasets are given in Table 1. The first data row gives the number of objects in each dataset. For each dataset, we collect rankings from two different web sites, and objects used in the experiments have to be ranked on both sites. Hence each dataset has two object counts: the number in brackets is the total number of objects collected, and the number outside brackets is the number of objects actually ranked in the experiment. The second data row lists the number of web pages found and used in each experiment. The third and fourth data rows give the statistics of the generated object linking graph. The last row lists the web sites from which the standard rankings are collected. For each dataset we run two rounds of experiments, because we have two standard rankings: in each round, one standard ranking is chosen as the ground truth for comparison and the other is used as the state-of-the-art ranking. The similarity of rankings is evaluated by Kendall's tau distance [10]. Standard rankings are collected from popular tourism web sites: for Tallinn museums from sightsplanner.com and foursquare.com, and for the other datasets from foursquare.com and tripadvisor.com.
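For reference, a small sketch of the normalized Kendall's tau distance between two rankings of the same objects (0 means identical order, 1 means fully reversed); this is the standard definition, though the paper's exact normalization may differ:

```python
from itertools import combinations
from typing import List

def kendall_tau_distance(rank_a: List[str], rank_b: List[str]) -> float:
    """Fraction of object pairs ordered differently by the two rankings."""
    pos_a = {obj: i for i, obj in enumerate(rank_a)}
    pos_b = {obj: i for i, obj in enumerate(rank_b)}
    shared = [obj for obj in rank_a if obj in pos_b]  # compare shared objects only
    discordant = sum(
        1 for x, y in combinations(shared, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )
    pairs = len(shared) * (len(shared) - 1) // 2
    return discordant / pairs if pairs else 0.0
```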

5.2 Experiment Results
The results are shown in Table 2. The three columns correspond to the three cities: London, New York and Tallinn. Each column contains two charts, giving the results under two different ground truths; the name of the ground-truth ranking is shown below each chart. In each chart, the y-axis is the Kendall's tau distance and the x-axis is the number of objects ranked. A point (x, y) on the line of a ranking r means that y is the distance between the ranking of the top x objects in the ground truth and the ranking of those objects in r. In our experiments, x starts at 3 and increases in steps of 3. This stepped measurement reveals more information about the similarity between two rankings.

Several observations can be made from the results.

Object importance assessment metrics. Betweenness performs best, followed by Degree and Frequency; closeness centrality is the worst. The charts also show that the frequency-based method is the most stable one.

Ranking standards. The generated rankings agree more with the Foursquare rankings. The overall distance between the standard rankings themselves is around 0.2, and the generated rankings achieve similar performance with respect to the selected standard rankings.

Cross-dataset performance. On the London and New York datasets, our approach performs even better than the standard rankings, while it is slightly worse on the Tallinn data.

Table 2. Kendall's tau Distance among Rankings of Museum Objects

[Six charts in three columns (London Museums, New York Museums, Tallinn Museums) and two rows. Top row: Tripadvisor as ground truth for London and New York, Sightsplanner as ground truth for Tallinn. Bottom row: Foursquare as ground truth for all three cities. In each chart the y-axis is the Kendall's tau distance (roughly 0 to 1) and the x-axis is the number of top objects ranked, from 3 to 21 in steps of 3.]

In summary, according to the distance-based metric, web-page collaborative ranking performs similarly to user collaborative rankings on all three datasets. The frequency metric is the best metric when considering both performance and stability. Among the link-analysis-based metrics, the best is betweenness centrality and the worst is closeness centrality. It is noticeable that the PageRank-based centrality does not perform very well in this task.

6. CONCLUSION
User collaborative ranking is a way to exploit the wisdom of the crowd for object recommendation, and in practice it has proven very popular and effective. However, the wisdom of the crowd is not easy to obtain. In this paper, we have introduced web-page collaborative ranking to imitate user collaborative ranking. The experimental results show that web-page collaborative ranking can be a good substitute when user collaborative rankings are missing.

7. ACKNOWLEDGMENTS
This work has been funded by the European Commission as part of the FP7 Marie Curie IAPP project TEAM (grant No. 251514) and by the National Natural Science Foundation of China (No. 61105007).

8. REFERENCES
[1] Surowiecki, J. The Wisdom of Crowds. Doubleday; Anchor, 2004.
[2] Liu, B. Sentiment Analysis: A Multifaceted Problem. IEEE Intelligent Systems, 25, 3 (May-June 2010), 76-80.
[3] Liu, B., Hu, M. and Cheng, J. Opinion observer: analyzing and comparing opinions on the Web. In Proceedings of the 14th International Conference on World Wide Web (New York, NY, USA, 2005). ACM, 342-351.
[4] Liu, T.-Y. Learning to Rank for Information Retrieval. Foundations and Trends in Information Retrieval, 3, 3 (Dec 2009), 225-331.
[5] Mukherjee, A., Liu, B., Wang, J., Glance, N. and Jindal, N. Detecting Group Review Spam. In Proceedings of WWW 2011 (Poster) (Hyderabad, India, 2011), 93-94.
[6] Brin, S. and Page, L. The anatomy of a large-scale hypertextual Web search engine. In Proceedings of the Seventh International Conference on World Wide Web (April 1998).
[7] Kleinberg, J. M. Authoritative sources in a hyperlinked environment. In Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms (San Francisco, California, United States, 1998). Society for Industrial and Applied Mathematics, 668-677.
[8] Marchiori, M. The Quest for Correct Information on the Web: Hyper Search Engines. In Proceedings of the Sixth International World Wide Web Conference (WWW6) (1997).
[9] Su, X. and Khoshgoftaar, T. M. A survey of collaborative filtering techniques. Advances in Artificial Intelligence, 2009 (January 2009), 1-19.
[10] Kendall, M. and Gibbons, J. D. Rank Correlation Methods. A Charles Griffin Title, 1990.