Knowledge-Based Systems 24 (2011) 943–951


Random walk based rank aggregation to improving web search

Lin Li a,*, Guandong Xu b, Yanchun Zhang b, Masaru Kitsuregawa c

a School of Computer Science & Technology, Wuhan University of Technology, China
b School of Engineering & Science, Victoria University, Australia
c Institute of Industrial Science, University of Tokyo, Japan

Article history:
Received 19 August 2010
Received in revised form 1 April 2011
Accepted 1 April 2011
Available online 30 April 2011

Keywords:
Random walk
Rank aggregation
Query suggestion
Web search
Pairwise contest
Pairwise majority contest

Abstract

In Web search, with the aid of related query recommendation, Web users can revise their initial queries over several serial rounds in pursuit of the Web pages they need. In this paper, we address the Web search problem of aggregating the search results of related queries to improve retrieval quality. Given an initial query and the suggested related queries, our search system concurrently processes their search result lists from an existing search engine and then forms a single list aggregated from all the retrieved lists. We specifically propose a generic rank aggregation framework which consists of three steps. First, we build a so-called Win/Loss graph of Web pages according to a competition rule; we then apply the random walk mechanism on the Win/Loss graph; last, we sort these Web pages by their ranks using a PageRank-like rank mechanism. The proposed framework considers not only the number of wins that an item earned in competitions, but also the quality of its competitor items when calculating the ranking of Web pages. Experimental results show that our search system can clearly improve retrieval quality in a parallel manner over the traditional search strategy that serially returns result lists. Moreover, we provide empirical evidence demonstrating how different rank aggregation methods affect retrieval quality.
© 2011 Elsevier B.V. All rights reserved.

1. Introduction

Keyword-based search queries are a popular way for Web users to specify their information needs, and they are supported by commercial search engines such as Google and Yahoo!. However, finding only the pages that satisfy an individual's information goal is becoming more difficult due to the continued rapid growth of Web data and users' inexperience in phrasing queries. Related query suggestion [2,18,28,32] has been investigated to help users formulate alternative related queries to satisfy their information needs. Furthermore, commercial search engines have implemented methods to suggest alternative queries to users, such as Related search terms in Google, Search Assist in Yahoo!, and Related Searches in Bing Search. However, the current utilization of query suggestion is still naïve. After getting suggestions, users usually have to submit the recommended queries one by one to find the results matching their information need. Moreover, they sometimes have to navigate through the result pages repeatedly because they cannot tell whether the recommended queries exactly match their needs before they read the actual contents of the Web pages. Obviously, it is tedious for users to manipulate several search windows.

In addition, if a single query is deficient in accurately representing the user's information need, the set of recommended related queries is likely to provide broader search coverage of Web pages, thereby more likely including the information the user wants. In this paper we address this search problem and devise a novel enhanced Web search approach that aggregates the results of related Web queries, aiming to help users locate the information they need. Our search system takes a set of related queries as search input and outputs a final search result list which is the aggregation of the result lists of these input queries. The strength of the combined query collection can substantially enhance the utilization of query suggestion and thus improve Web search quality. The technical issue in our system is the rank aggregation of the search result lists given a set of queries. In the literature, various rank-based aggregation methods have been studied and employed in many applications [11,23,30,33]. In this paper we propose a generic framework of rank aggregation based on the random walk mechanism by constructing a so-called Win/Loss graph of Web pages. The random walk on the Win/Loss graph determines the aggregated rank of each page in the final result list by using competition rules. In particular, two kinds of competition rules are studied to determine the Win/Loss relationship between two nodes in the graph. One is based on a pairwise contest that chooses the next Web page based on the number of pairwise contests (within all the lists) the page won. The other is based on a pairwise majority contest that decides the next Web page by the number of pairwise majority contests the page won.


The advantage of the proposed framework on the two kinds of Win/Loss graphs is that it generalizes two main schools of solutions in the field of rank aggregation, spearheaded by Borda [4] and Copeland [9]. These two main solutions also lay the foundation for Markov chain based methods [11,23,12]. In addition, the proposed framework takes into account not only the number of wins that a page item earned, but also the quality of the competitor page items. In summary, this paper aims to make contributions on (a) enhancing Web search by aggregating the search results of related queries (Section 2), (b) devising a generic rank aggregation framework based on the random walk mechanism and discussing how to employ various rank aggregation methods in the proposed framework (Section 3), and (c) providing empirical evidence demonstrating how the result aggregation improves search quality and how the different rank aggregation methods affect the system performance (Section 4). Last, we review related work in Section 5 and conclude our work in Section 6.

2. Overview of our search system

In this section, we give an overview of our search system, which includes two important components: related query suggestion and rank aggregation. Given a query input by a user, we first need its related queries. In this paper, we are mainly interested in how to make use of related queries to enhance Web search, not how to find related queries. Therefore, we assume that related queries are already available. Our idea can be supported not only by the popular query suggestion services in commercial search engines, but also by other approaches to finding related queries [2,18,28,32]. After a set of related queries is selected by the user, we get their search results from a search engine (e.g., Google). Last, we aggregate these search results. The aggregation is implemented in three steps: (1) constructing a Win/Loss graph in which the order relationship among search results is encoded, (2) applying the random walk mechanism on the Win/Loss graph to assign a new score to each search result, and (3) sorting the nodes (search results) in the graph based on the standing probability distribution of the random walk. The technical details are discussed in the next section.

An example is given in Fig. 1. In this example we directly utilize the query suggestion service of the Google search engine to get related queries. We extract Related search terms from the HTML source code of the result page after sending the initial query to Google. In Fig. 1 a user inputs an initial query "mouse" in the box named Your Query. Then, he/she clicks the button Query Suggestion to get a list of queries related to the input query, such as "Mouse Disney", "house mouse" and so on. After getting recommendations, the user selects the recommended queries which better represent his/her information need by clicking the radio button Add?. In this example, the user selects "house mouse" and "house mouse biology" as related queries. The selected queries together with the initial query are sent in parallel to a search engine (e.g., Google) to get the ranked lists of Web pages. Last, we aggregate these lists and return the top Web pages in the final list. Moreover, the title of each search result is followed by the queries that retrieved it, so the user can change some recommended queries if their search results do not match his/her information need.
Our system thus provides an interactive retrieval interface that allows users to search conveniently.

3. Our generic rank aggregation framework

Suppose that we have some related queries for an input query; our goal is to combine the search result lists returned by a search engine for these queries and generate a single final list for users. Here, we propose a generic rank aggregation framework for combining search lists.
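Before detailing the framework, the overall flow described in Section 2 can be summarized by the following minimal, illustrative sketch (Python). The helper names fetch_results and aggregate are hypothetical stand-ins, not part of the original system: fetch_results(query) is assumed to return the top-N URLs from an existing search engine, and aggregate(lists) is assumed to be any rank aggregation method from this section.

from typing import Callable, List

def aggregated_search(initial_query: str,
                      selected_related_queries: List[str],
                      fetch_results: Callable[[str], List[str]],
                      aggregate: Callable[[List[List[str]]], List[str]]) -> List[str]:
    """Illustrative driver for the search system of Section 2 (assumed helpers)."""
    queries = [initial_query] + selected_related_queries
    # The selected queries are issued to the search engine (in parallel in the
    # real system; sequentially here for brevity).
    result_lists = [fetch_results(q) for q in queries]
    # Fuse the individual result lists into a single aggregated list.
    return aggregate(result_lists)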

3.1. Preliminary

In the field of rank aggregation, let U be a set of items; a rank list (or simply a ranking) s w.r.t. U is an ordering of a subset of U. If i ∈ U is present in s, written i ∈ s, let s(i) denote the position or rank of item i in the list. A highly ranked or preferred item has a low-numbered position in the list, i.e., if s(i) < s(j), then i is ranked higher than j in s. If s contains all the items in U, then s is said to be a full list.

In the context of our problem, given an information need, the k related queries (Q_1, . . . , Q_k) are submitted to a search engine. We let s_i denote the top N (say, N = 100) results of the search engine w.r.t. the query Q_i, and U the set of all Web pages returned by these k queries. Since s_i is usually only a subset of U, we have |s_i| ≤ |U|. Such lists that rank only some of the items (pages) in U are called partial lists. Clearly, the pages in U that are not present in the list s_i can be assumed to be ranked below N by the search engine.

Given several search lists, traditional rank aggregation methods directly re-order an item based on its positions in the lists. Usually they count the number of wins that the item gets according to a position-based competition rule [4,9,31]. For example, the Condorcet condition [31] specifies that if an item (e.g., p) wins or ties with every other item (e.g., q) in a pairwise competition, i.e., s_i(p) ≤ s_i(q) for p ≠ q, q = 1, . . . , |U| and i = 1, . . . , k, then that item, as the winner, is put in the first position of the final fused list. The traditional aggregation methods mainly consider the number of wins that an item earned (or the positions of its competitors), but ignore the quality of the items it won against.

In the field of random walks, let G = (V, E) be a connected directed graph, where V represents the set of nodes and E the set of edges. Consider a random walk on G: a random walker starts at an arbitrary node v_0 randomly selected from V; if after several steps the walker is at a node v_t, he/she uniformly moves to a neighbor of v_t with probability 1/|O(v_t)|, where O(v_t) represents the outlinks of v_t. The sequence of nodes visited by the random walker (v_t: t = 0, 1, . . .) constitutes state transitions in a Markov chain, and the edges connecting those nodes are unweighted. If the state transition of a Markov chain is not uniform (edges are weighted differently), we can treat it as a random walk with some transition probability. In both the unweighted and weighted cases, the standing probability distribution of a random walk on the graph naturally generates an ordering of all the nodes in V. The random walk based rank mechanism has been widely used in the area of Web search, e.g., PageRank [22].

The aforementioned two research paradigms were developed independently but are still correlated with each other. The representative work on Markov chain based rank aggregation is [11], which proposed four specific methods. However, the relationship between those four methods and traditional rank aggregation methods, as presented by the authors, could be made clearer, and for each method the reasonableness of its transition probability needs theoretical analysis for further improvement. Therefore, we are encouraged to devise a generic rank aggregation framework which combines traditional and random walk based rank aggregation in a unified manner. Upon this unified framework, we can theoretically and empirically study how to employ various rank aggregation methods, and which of them is the most effective for solving our problem.
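As a concrete reference for the notation above, the sketch below (Python, purely illustrative) represents each partial list s_i as a mapping from a page to its position, with pages absent from a list treated as ranked below the top N, exactly as assumed in the preliminary.

from typing import Dict, List

# A (partial) rank list s_i is stored as {page: position}, position 1 being the best.
RankList = Dict[str, int]

def universe(rank_lists: List[RankList]) -> List[str]:
    """U: the set of all Web pages returned by the k related queries."""
    return sorted({page for s in rank_lists for page in s})

def position(s: RankList, page: str, n: int = 100) -> int:
    """Position of `page` in the partial list `s`; pages absent from `s`
    are assumed to be ranked below the top N results (here, n + 1)."""
    return s.get(page, n + 1)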
3.2. Three steps for rank aggregation

Based on the order relationships between search results within the individual lists, we need a rank mechanism to score each search result (i.e., Web page) and re-sort the results according to their assigned scores. Our idea is that traditional rank aggregation can be used as a rule to determine the probability of state transition, or the weight of an edge connecting two nodes, on a connected directed graph.


Fig. 1. Our search system interface.

Then, the random walk based rank mechanism iteratively refines the ordering, extending the traditional competition-based rank aggregation methods by considering not only the number of wins that a page item earned, but also the quality of the competitor page items. For example, if search result a beat search result b in one competition, a gets one point; if a also beat c, and c's score is higher than b's, then a gets more than one point (e.g., two points).

3.2.1. Step 1: Constructing Win/Loss graph

We first give the definition of the Win/Loss graph, and then describe two kinds of competition rules to construct it (e.g., the probability of state transition between nodes).

Win/Loss graph: Web pages and their Win/Loss relationships are modeled as a directed graph G(V, E), where nodes in V represent Web pages and a directed edge ⟨p, q⟩ in E corresponds to a Win/Loss relationship between p and q. Specifically, if p wins q according to a competition rule, the link direction is from q to p (q → p); if p is defeated by q and loses the competition, the link direction is from p to q (p → q). Based on this definition, the number of inlinks of a node (e.g., q) indicates how many times the node won against other nodes in the competition, and the number of outlinks of a node indicates how many times the node was defeated by other nodes and lost in the competition. Here we draw the connection between Win and Inlink, and between Loss and Outlink. The connection closely resembles PageRank, where a good page has many good inlinks (pages). We know that PageRank running on the Web graph simulates a walker following outlinks or hopping to random pages to surf the Web. The random walk mechanism utilized by PageRank iteratively transfers the significance scores of Web pages to measure their final quality or importance until the significance scores converge. Therefore, PageRank does not just consider the sheer number of links that a page receives.

It also takes into account the importance or quality of the pages that convey the prestige. In our proposed Win/Loss graph, the random walker transfers the competition scores of search results (the competition rules are discussed below) to compute their ranking scores in an aggregated manner. Like PageRank, the random walk based rank mechanism on the Win/Loss graph iteratively refines the ordering of pages by considering not only the number of wins that a page earned, but also the quality of the competitor pages.

Let us now discuss the competition rules that determine the Win/Loss (Inlink/Outlink) relationship between nodes. In this paper, we mainly study two popular competition rules, i.e., the pairwise contest and the pairwise majority contest.

Win/Loss graph on pairwise contest: the competition rule using the pairwise contest is as follows. In the rank list s_i w.r.t. the query Q_i (i = 1, . . . , k) and for two Web pages q and p,
(a) if s_i(p) < s_i(q), i.e., p wins q in this pairwise contest, q is treated as an inlink of p (q → p);
(b) if s_i(p) > s_i(q), i.e., p is defeated by q in this pairwise contest, q is treated as an outlink of p (p → q).
Note that s_i(q) cannot be the same as s_i(p) in an individual list. Duplicate inlinks and outlinks are omitted as a post-processing step. The indegree of p is the number of pairwise contests with all other pages in U that p won.

Win/Loss graph on pairwise majority contest: the competition rule using the pairwise majority contest is as follows. In the rank lists s_i w.r.t. the query Q_i (i = 1, . . . , k), let the number of lists where s_i(p) < s_i(q) be the wins of page p over page q, and the number of lists where s_i(p) > s_i(q) be the losses of p to q;
(a) if wins > losses, i.e., p wins q in this pairwise majority contest, q is treated as an inlink of p (q → p);


(b) if wins < losses, i.e., p is defeated by q in this pairwise majority contest, q is treated as an outlink of p (p → q);
(c) if wins = losses, there is no link between p and q because they are tied.
The indegree of p is the number of pairwise majority contests p won. After constructing the Win/Loss graph, we can apply a PageRank-like iterative method to sort the nodes (Web pages) in the graph.
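As an illustration of Step 1, here is a minimal sketch (not the authors' Boost C++ implementation) that builds the Win/Loss edges under both competition rules, using the rank-list representation sketched in Section 3.1. An edge (p, q) points from the loser p to the winner q, so winners collect inlinks. The implicit relation between pages that never co-occur in a list is deliberately left out here; it is handled by the jump behavior of Step 2.

from itertools import combinations
from typing import Dict, List, Set, Tuple

RankList = Dict[str, int]      # page -> position (1 = best), as in Section 3.1
Edge = Tuple[str, str]         # (loser, winner): the link points from the loser to the winner

def pairwise_contest_edges(rank_lists: List[RankList]) -> Set[Edge]:
    """Win/Loss graph under the pairwise contest rule: within every list,
    the better-ranked page of each pair receives an inlink from the other.
    Duplicate inlinks/outlinks collapse because edges are kept in a set."""
    edges: Set[Edge] = set()
    for s in rank_lists:
        for p, q in combinations(s, 2):       # positions within one list are never equal
            if s[p] < s[q]:
                edges.add((q, p))             # q -> p: p wins q
            else:
                edges.add((p, q))             # p -> q: q wins p
    return edges

def pairwise_majority_contest_edges(rank_lists: List[RankList]) -> Set[Edge]:
    """Win/Loss graph under the pairwise majority contest rule: p receives an
    inlink from q only if p beats q in more lists than it loses; a tie
    (wins = losses) produces no link at all."""
    pages = {page for s in rank_lists for page in s}
    edges: Set[Edge] = set()
    for p, q in combinations(sorted(pages), 2):
        wins = sum(1 for s in rank_lists if p in s and q in s and s[p] < s[q])
        losses = sum(1 for s in rank_lists if p in s and q in s and s[p] > s[q])
        if wins > losses:
            edges.add((q, p))                 # q -> p: p wins the majority contest
        elif losses > wins:
            edges.add((p, q))                 # p -> q: q wins the majority contest
    return edges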

3.2.2. Step 2: Applying random walk on Win/Loss graph

In the above descriptions of the two competition rules, we utilize the ordering relationship between two Web pages that appear in the same list s_i to construct the Win/Loss graph. The order of two pages appearing in the same list s_i is an explicit relationship. However, as discussed in Section 3.1, each list s_i in our problem is a partial list, which means that some pages in U are left unranked by s_i. Clearly, the pages that are not present in the list s_i can be assumed to be ranked below |s_i|, which is an implicit ordering relationship. We argue that a random walk on the Win/Loss graph naturally reflects both the explicit and the implicit ordering relationships between two pages.

We model a random walk on a Win/Loss graph as follows. If a walker is at page p, the walker has two choices: (a) he/she may walk to the next page q chosen from p's outlinks, either uniformly or based on the edges' weights. The main idea is that in each step we move from the current page p to a better page q, since q wins p. This walk behavior represents the explicit ordering relationship within a list according to a competition rule (the pairwise contest or the pairwise majority contest); (b) or he/she may jump to the next page q uniformly chosen from the set of pages that never appear together with p in any list (denoted NO(p)). The reason is that we assume the pages (e.g., p) that are not present in a list are ranked below all the pages in that list (e.g., q). This jump behavior models the implicit ordering relationship based on this assumption. We rank nodes according to the standing probability distribution (i.e., score) of the walker on the graph. Thus, we have

r^{(t)}(q) = \frac{1 - \alpha}{|J_q|} + \alpha \sum_{p \in I(q)} r^{(t-1)}(p) \, w(p, q),        (1)

where r^{(t)}(q) is the rank score of node q after t iterations, J_q is ∪_{p∈I(q)} NO(p), I(q) denotes the inlinks of q, and w(p, q) is the weight of the edge (p → q). We tune the walk behavior and the jump behavior by a mixing parameter α, 0 < α < 1. With this formula, we determine the overall score of a target node by counting both the number of nodes linking to it and the relative quality of each pointing node.

Note that in the rank computation we also include a small uniform probability epsilon of transition from every node to every other node, to overcome a sort of trap in the Win/Loss graph, called a rank sink by Brin and Page [22]. In addition, the Win/Loss graph assumes that each node has at least one outlink, which, however, may not always be true. These epsilon transitions modify the Web pages with no outlinks in the Win/Loss graph to include virtual links to all other pages in the graph, which guarantees convergence to a unique rank score distribution of the nodes [3]. Finally, we can ensure a smooth, complete ranking of all the |U| items (nodes) in the graph. This smoothing technique has been used in a number of random walk based ranking methods, including Google PageRank [22].
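The following is a simplified sketch of the Step 2 iteration (and the sorting of Step 3), illustrative rather than the authors' exact implementation of Eq. (1): for brevity it folds the jump to the non-co-occurring pages NO(p) and the epsilon smoothing into a single uniform teleport term, and it treats nodes without outlinks as linking virtually to all nodes, as described above.

from typing import Dict, Iterable, List, Optional, Set, Tuple

Edge = Tuple[str, str]   # (loser, winner), as constructed in Step 1

def random_walk_scores(nodes: Iterable[str],
                       edges: Set[Edge],
                       weight: Optional[Dict[Edge, float]] = None,
                       alpha: float = 0.85,
                       iterations: int = 40) -> Dict[str, float]:
    """PageRank-style power iteration on the Win/Loss graph (simplified).
    Each page spreads its current score along its outlinks, i.e., to the pages
    that beat it, optionally scaled by edge weights; the (1 - alpha) term is a
    uniform teleport standing in for the jump/smoothing behavior of Step 2."""
    nodes = list(nodes)
    n = len(nodes)
    out: Dict[str, List[str]] = {v: [] for v in nodes}
    for loser, winner in edges:
        out[loser].append(winner)
    rank = {v: 1.0 / n for v in nodes}                   # uniform initial state
    for _ in range(iterations):
        nxt = {v: (1.0 - alpha) / n for v in nodes}      # teleport / smoothing term
        for p in nodes:
            if not out[p]:                               # no outlinks: virtual links to all nodes
                for v in nodes:
                    nxt[v] += alpha * rank[p] / n
                continue
            w = [weight.get((p, q), 1.0) if weight else 1.0 for q in out[p]]
            total = sum(w)
            for q, wq in zip(out[p], w):
                nxt[q] += alpha * rank[p] * wq / total   # score flows from loser p to winner q
        rank = nxt
    return rank

def final_ranking(rank: Dict[str, float]) -> List[str]:
    """Step 3: sort the pages by their converged scores, best first."""
    return sorted(rank, key=rank.get, reverse=True)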

3.2.3. Step 3: Sorting nodes after t iterations

After constructing the Win/Loss graph and applying the random walk on it, we can sort the nodes by their ranks using Eq. (1). The recursive running of Eq. (1) gives the probability distribution of the walker being on page q after t iterations. When t equals 1, no heuristic is used. When t is large enough, r^{(t)}(q) gradually converges to a stationary distribution. Then, the distribution induced on the state transitions of all the pages in the Win/Loss graph produces a final ranking of these pages. The ranked position of each page represents its relative significance in matching the information need of users. The initial state is chosen uniformly at random because, in general, the initial value does not affect the final values, only the rate of convergence [22].

3.3. Discussions

Here we discuss how to incorporate typical rank aggregation methods, i.e., Borda Count (BC) [4], the Copeland Method (CM) [9] and the four specific random walk methods proposed by [11], i.e., MC1, MC2, MC3, and MC4, into our proposed framework. The differences in their parameter settings are listed in Table 1. Using the proposed generic framework, we find that the random walk based methods generalize BC and CM and further extend them by considering the quality of items in the competitions.

3.3.1. Generalization of Borda Count (BC)

The random walk on the Win/Loss graph using the pairwise contest is a generalization of the Borda Count method. BC is a popular rank fusion method that is also widely discussed in [23,1]. For each list, the topmost item receives n − 1 points, the second item receives n − 2 points, and so on. The item with the maximal points is put in the first position of the fused list. In fact, we can view the points received by an item in a list as the number of pairwise contests it won against all other items in this list, and the items are ranked in decreasing order of the total points obtained over all lists. Following the direction of BC, in our framework, if the weight of the edge between q and p (w(p, q)) is based on the number of lists where q is ranked higher than its inlinked nodes (e.g., p), then on this weighted Win/Loss graph, WeightedPW-t (t iterations) and WeightedPW-1 (one iteration) in Table 1 produce aggregation results similar to those calculated by MC3 and BC, respectively. On the other hand, the total points of an item over all lists may include duplicate pairwise comparisons. For example, if the page q is ranked higher than the page p in two lists, BC adds two points for q. In an unweighted Win/Loss graph built by the pairwise contest (i.e., PW-1 and PW-t), we assign only one point to q in the pairwise comparisons between q and p over all the lists, which means we do not consider in how many lists q wins p, but only whether q wins p. In this case, therefore, w(p, q) is set to 1/O(p), where O(p) is the outdegree of p. On this unweighted Win/Loss graph, t iterations by PW-t generate aggregation results similar to MC1. Moreover, MC2 utilizes a different weighting strategy from MC3, so that the distribution induced on its states produces a ranking of the pages in which q is ranked higher than p if the geometric mean of the ranks of q is lower than that of p. Generally speaking, MC2 and MC3 can be regarded as different weighted versions of MC1. In a word, PW-t represents the kind of unweighted, iterative method such as MC1, and WeightedPW-t represents the kind of weighted, iterative method such as MC2 and MC3.
In addition, WeightedPW-1 represents the kind of weighted method without iterations, such as BC, and PW-1 can be regarded as an unweighted BC.

3.3.2. Generalization of Copeland Method (CM)

Similar to the Win/Loss graph based on the pairwise contest, the random walk on the Win/Loss graph using the pairwise majority contest generalizes Copeland's suggestion of sorting the items by the number of pairwise majority contests they won.

Table 1
The generic rank aggregation framework.

Method        | α    | w(p, q)                                                                      | r^(0)(p) | Rule                      | Iteration
BC            | 0.5  | The number of lists where q is ranked higher than p (BCWeights)             | O(p)     | Pairwise contest          | 1
CM            | 0.5  | The value wins − losses of each pairwise majority contest q won (CMWeights) | O(p)     | Pairwise majority contest | 1
MC1           | 1    | Unweighted                                                                   | Randomly | Pairwise contest          | t (t > 1)
MC2           | 1    | The number of lists where p is …                                             | Randomly | Pairwise contest          | t (t > 1)
MC3           | 1    | BCWeights                                                                    | Randomly | Pairwise contest          | t (t > 1)
MC4           | 1    | Unweighted                                                                   | Randomly | Pairwise majority contest | t (t > 1)
PW-1          | 0.85 | Unweighted 1/O(p)                                                            | Randomly | Pairwise contest          | 1
PW-t          | 0.85 | Unweighted 1/O(p)                                                            | Randomly | Pairwise contest          | t (t > 1)
WeightedPW-1  | 0.85 | BCWeights × 1/O(p)                                                           | Randomly | Pairwise contest          | 1
WeightedPW-t  | 0.85 | BCWeights × 1/O(p)                                                           | Randomly | Pairwise contest          | t (t > 1)
PWM-1         | 0.85 | Unweighted 1/O(p)                                                            | Randomly | Pairwise majority contest | 1
PWM-t         | 0.85 | Unweighted 1/O(p)                                                            | Randomly | Pairwise majority contest | t (t > 1)
WeightedPWM-1 | 0.85 | CMWeights × 1/O(p)                                                           | Randomly | Pairwise majority contest | 1
WeightedPWM-t | 0.85 | CMWeights × 1/O(p)                                                           | Randomly | Pairwise majority contest | t (t > 1)

This amounts to sorting nodes by their indegrees in the Win/Loss graph using the pairwise majority contest. If the initial probability of a node p is set to O(p), i.e., r^{(0)}(p) = O(p), the random walk on this unweighted graph after one iteration (i.e., PWM-1) exactly models CM, which is also popular in the rank aggregation literature [21]. MC4 is based on the above-defined graph with t iterations and is similar to our PWM-t. The other parameters are also given in Table 1. In addition, the edges in the Win/Loss graph can be weighted by the value wins − losses of each pairwise majority contest a node won (i.e., WeightedPWM-t), thus generating a weighted version of MC4. In a word, PWM-1 can be considered as CM, and PWM-t represents the kind of unweighted, iterative method such as MC4. Both of them can be extended to weighted versions, namely WeightedPWM-1 and WeightedPWM-t. Therefore, typical rank aggregation methods are included in our generic framework. In the experiments, we will mainly be concerned with which kind of method is the most effective.

3.3.3. Eight methods used in our generic framework

As shown in Table 1, we find that most existing methods differ in several aspects, which makes them difficult to compare without a unified framework. We outline three aspects of a rank aggregation method involved in our framework:

(1) Competition rule for the Win/Loss relationship of competitor items. PW utilizes the pairwise contest, and PWM uses the pairwise majority contest. We investigate which kind of competition rule (pairwise contest or pairwise majority contest) is the most effective in our system. We will experimentally compare the two kinds of methods on the Win/Loss graph, i.e., the PW-based methods including PW-1, PW-t, WeightedPW-1, and WeightedPW-t, and the PWM-based methods including PWM-1, PWM-t, WeightedPWM-1, and WeightedPWM-t in Table 1.

(2) Iterative computation for the quality of competitor items. From the above discussions, we know that BC (pairwise contest) and CM (pairwise majority contest) directly rank items by the number of wins in their contests, while a random walk on a directed graph propagates the indegree (wins) of a node over the whole Win/Loss graph, inducing an iterative computation for combining rankings. The random walk based methods ensure that a node which has more highly ranked nodes as its inlinks is ranked higher. For the evaluations, we conduct experiments on two kinds of computation: one uses only one iteration, and the other uses t (t > 1) iterations.


(3) Weighting strategy for enhancing the Win/Loss relationship. w(p, q) in Table 1 represents the different weights on the Win/Loss graph. Two weighting strategies are included, as shown in the first two rows of Table 1. One is called BCWeights, which counts the number of lists where q is ranked higher than p. The other is called CMWeights, which computes the value wins − losses of each pairwise majority contest q won. We further study the effect of the two weighting strategies on retrieval quality, i.e., WeightedPW-1, WeightedPW-t, WeightedPWM-1, and WeightedPWM-t.
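To make the two weighting strategies concrete, here is a small illustrative sketch of BCWeights and CMWeights computed on the Win/Loss edges from Step 1. The resulting dictionaries could be passed as the weight argument of the random-walk sketch above, which normalizes by the total outgoing weight, a close analogue of the 1/O(p) scaling shown in Table 1.

from typing import Dict, List, Set, Tuple

RankList = Dict[str, int]
Edge = Tuple[str, str]        # (p, q): p lost to q, so the edge points from p to q

def bc_weights(edges: Set[Edge], rank_lists: List[RankList]) -> Dict[Edge, float]:
    """BCWeights: the number of lists in which the winner q is ranked higher
    than the loser p."""
    return {
        (p, q): float(sum(1 for s in rank_lists
                          if p in s and q in s and s[q] < s[p]))
        for (p, q) in edges
    }

def cm_weights(edges: Set[Edge], rank_lists: List[RankList]) -> Dict[Edge, float]:
    """CMWeights: the margin (wins - losses) by which the winner q beat the
    loser p over all the lists in which they co-occur."""
    weights: Dict[Edge, float] = {}
    for p, q in edges:
        q_wins = sum(1 for s in rank_lists if p in s and q in s and s[q] < s[p])
        q_losses = sum(1 for s in rank_lists if p in s and q in s and s[q] > s[p])
        weights[(p, q)] = float(q_wins - q_losses)
    return weights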

4. Experiments

In this section, we present the experimental evaluation results: we first explore how the aggregation of search results can greatly improve search quality in comparison to the traditional retrieval method that returns Web pages query by query, and we then investigate the impact of the different rank aggregation methods in our framework on search quality.

4.1. Experimental setup and evaluation measure

First, we need a collection of queries that reflect various users' information needs, where each information need consists of a chain of specific queries, for the purpose of experiments. Here we use the original queries collected by [24] as a seed set, where the queries labelled with the same topic number are considered semantically related (30 topics and 146 queries in total). We use the topic number to represent one information need of a user. For example, there is a topic representing the information need of buying surveillance equipment, including the queries "bug", "bug spy", and "bug spy security"; the three queries within this topic are submitted to search engines to get their initial search results. We then evaluate the retrieval performance of the proposed approaches in terms of MAP [19], which computes relevant page precision scores. In a word, given an information need, the retrieval performance (i.e., MAP scores) of the different methods is evaluated. In addition, we use the Boost C++ Libraries (http://www.boost.org/) to build the proposed random walk based approaches. The latest version of Boost can be downloaded freely (http://sourceforge.net/projects/boost/files/boost/1.45.0/). After downloading Boost, the Boost graph library contains a file named pagerank.hpp, and the various random walk based approaches can be coded using pagerank.hpp and other .hpp files, mainly from the graph library (such as adjacency_list.hpp, graph_utility.hpp and so forth).


4.1.1. Evaluation measure

Mean Average Precision (MAP) is a standard metric for evaluating ranking methods. In our problem, the unit of evaluation is the information need rather than the individual query. Given an information need ID_i, its average precision (AP) is defined as:

AP_i@N = \frac{\sum_{j=1}^{N} (P@j \times pos(j))}{\#\text{ of relevant Web pages in } ID_i},        (2)

where N, as an evaluation parameter, is the number of top Web pages evaluated (e.g., 10), P@j is the precision score of the returned search results at position j, and pos(j) is a binary function indicating whether the Web page at position j is relevant to ID_i. We then obtain MAP@N scores by averaging the AP@N values over all the evaluated information needs.

4.1.2. Evaluation design

The top 100 search results of each query are retrieved from Google Directory Search, and the total number of returned search results is 12,103 for the 146 queries. Furthermore, we also obtained the topical category information of the search results corresponding to each topic number (i.e., each information need). Google Directory Search integrates Google's search technology with Open Directory pages (ODP, http://www.dmoz.org/), which assign topics to Web pages. For example, the query "apple" retrieves Web pages belonging to several topics such as computers, cooking, shopping and so forth.

For the evaluations, we have to judge whether a search result is relevant to its information need (represented by a topic number). The topical category information of a search result can represent its main topic. Therefore, we can say that the information need of a set of related search queries corresponds to the common topics shared by their search results. Here, an automatic judgment method is introduced rather than having human experts manually view each search result. It works as follows: we assume that the category indicative of an information need is the topical category with the maximal occurrence frequency among all the categories of the search results for that topic number. The search results exactly matching this indicative topical category are considered relevant. Therefore, we can determine the number of relevant Web pages corresponding to each topic number (i.e., information need) by counting the number of exact matches. Meanwhile, in order to investigate the impact of the diversity of an information need on search performance, we use a parameter M in our automatic judgment method as a lower bound on the number of relevant search results, representing the diversity of an information need. M is increased from 10 to 40 with a step of 10; the larger M is, the more diverse an information need is. In the experimental study, if the number of relevant Web pages does not reach the lower bound, which means the selected search results are not sufficiently diverse, we take the secondary topical category of the search results into account. Thus, the category with the second maximal occurrence frequency is selected as an additional indicative category. This process continues until the size reaches the lower bound.

4.2. Results and discussions

We conduct evaluations when M (the relevant result size) varies from 10 to 40 with a step of 10. The experimental results of the PW based methods and the PWM based methods are shown in Figs. 2 and 3, respectively.
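Before turning to the numbers, the following is a small illustrative sketch of how the evaluation measure of Section 4.1.1 (Eq. (2)) can be computed. The ranked lists and relevance judgments are assumed to be given as plain Python structures; this is not the authors' actual evaluation code.

from typing import Dict, List, Set

def average_precision_at_n(ranked: List[str], relevant: Set[str], n: int) -> float:
    """AP@N as in Eq. (2): precision P@j accumulated at relevant positions
    (pos(j) = 1), divided by the number of relevant Web pages for the need."""
    hits, score = 0, 0.0
    for j, page in enumerate(ranked[:n], start=1):
        if page in relevant:
            hits += 1
            score += hits / j           # P@j at a relevant position
    return score / len(relevant) if relevant else 0.0

def map_at_n(runs: Dict[str, List[str]],
             judgments: Dict[str, Set[str]], n: int) -> float:
    """MAP@N: the mean of AP@N over all evaluated information needs (topics)."""
    aps = [average_precision_at_n(ranked, judgments[topic], n)
           for topic, ranked in runs.items()]
    return sum(aps) / len(aps) if aps else 0.0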


Google is our baseline, in which Google's MAP score averaged over all the related queries within a topic number is calculated. It can be seen in Fig. 2 that the four PW based methods, i.e., PW-1, PW-t, WeightedPW-1 and WeightedPW-t, generally produce higher MAP scores than the baseline at different values of N and M. For example, when M = 30, the improvements of these four methods at N = 5 are 5.43%, 5.43%, 23.47%, and 15.47%, respectively. Also, in Fig. 3 the four PWM based methods, i.e., PWM-1, PWM-t, WeightedPWM-1 and WeightedPWM-t, generally produce higher MAP scores than the Google baseline at different values of N and M. For example, when M = 30, the improvements of these four methods at N = 5 are 20.83%, 25.15%, 21.16%, and 24.07%, respectively. The experimental results validate the effectiveness and importance of making good use of the collection of related queries to achieve the performance improvement. In other words, we conclude that the proposed rank aggregation algorithm can outperform the traditional search methods in terms of retrieval quality.

In addition, we want to identify which rank aggregation method is the most effective for retrieval quality. As shown in Fig. 2, among all the PW based methods, the highest MAP scores are achieved by WeightedPW-1 at different values of M and N. An interesting observation is that PW-1 and PW-t (two unweighted random walk models on the pairwise contest with different numbers of iterations) have the same MAP scores in Fig. 2. After checking the rank aggregation values, we found that although the values were different, the ordering produced by these values did not change. This means that more iterations (e.g., t = 40) only bring the values closer to convergence rather than producing a new ordering. However, the different numbers of iterations used by WeightedPW-1 and WeightedPW-t (two weighted random walk models on the pairwise contest) change both the values and the ordering. Also, as shown in Fig. 3, when M = 10 and 20 the best results are obtained by PWM-1, and when M = 30 and 40 PWM-t is the most effective approach. We can say that, when using the PWM based methods, the unweighted approaches perform better than the weighted ones. Moreover, let us compare PWM-1 with PWM-t. M represents the number of relevant Web pages corresponding to an information need. When the information need is simple, like finding a website, with a relatively small set of answers, PWM-1 looks more effective than PWM-t. When the information need covers several aspects and the set of answers is relatively large, PWM-t seems more useful than PWM-1. In such cases, t iterations of computation can further improve search performance, since the iterative computation affects the coverage of the whole result set by considering not only the number of wins that a page item earned, but also the quality of its competitor page items.

In order to compare the PW based methods and the PWM based methods, the highest MAP scores at different values of M and N are listed in Table 2. Notice that these highest scores are produced by different approaches. When M = 10 and 20, the values in the first and second columns of Table 2 come from WeightedPW-1 and PWM-1. The pairwise contest based approach (WeightedPW-1) exhibits better results than the pairwise majority contest based approach (PWM-t). For example, when M = 10, WeightedPW-1 outperforms PWM-t by 8.54% (N = 5), 10.17% (N = 10), and 11.33% (N = 15). When M = 30 and 40, the values in the last two columns of Table 2 are obtained by WeightedPW-1 and PWM-t.
There, the pairwise majority contest based approach (PWM-t) exhibits better results than the pairwise contest based approach (WeightedPW-1). For example, when M = 30, PWM-t outperforms WeightedPW-1 by 2.18% (N = 5), 2.63% (N = 10), and 1.25% (N = 15). These results are similar to what we have discussed for the performance of PWM-1 and PWM-t. When the information need of users covers only a relatively small number of answers, the PW based method, i.e., WeightedPW-1 (e.g., BC), works better than the PWM based method, i.e., PWM-t (e.g., MC4). When the information need of users is covered by a relatively large number of answers, the PWM based method, i.e., PWM-t, is more effective than the PW based method, i.e., WeightedPW-1.


Fig. 2. MAP@5, 10, 15 of PW based methods when M = 10, 20, 30, 40.

Fig. 3. MAP@5, 10, 15 of PWM based methods when M = 10, 20, 30, 40.

In such cases, t iterations of computation and the pairwise majority contest rule can further improve the search performance. From these results, we can say that different retrieval mechanisms are appropriate for different kinds of information needs in order to achieve high retrieval quality. This is consistent with the results given by [13], which first classified queries into informational and navigational ones and then used different retrieval models for the two kinds of queries.

Table 2
PW vs. PWM (MAP scores).

TopN | Method | M = 10 | M = 20 | M = 30 | M = 40
5    | PW     | 0.2999 | 0.3434 | 0.4119 | 0.4396
5    | PWM    | 0.2763 | 0.3224 | 0.4175 | 0.4492
10   | PW     | 0.3076 | 0.3728 | 0.4006 | 0.4301
10   | PWM    | 0.2792 | 0.3434 | 0.4284 | 0.4414
15   | PW     | 0.2987 | 0.3612 | 0.3874 | 0.4071
15   | PWM    | 0.2683 | 0.3307 | 0.402  | 0.4122

4.3. Case study

Here, we also present a case study of finding more relevant Web pages via the proposed rank aggregation framework. Table 3 lists the top five search results corresponding to an information need which consists of three queries, i.e., mouse, house mouse, and house mouse biology. The URLs of the Web pages are given, but the titles and snippets are omitted due to the space limit. The judgments are manually given by our human expert.


Table 3
Search results of individual queries and our search system.

Query 1: MOUSE
1. http://en.wikipedia.org/wiki/mouse (Highly Relevant)
2. http://en.wikipedia.org/wiki/computer_mouse
3. http://www.apple.com/mightymouse/
4. http://www.apple.com/keyboard/
5. http://www.mousebreaker.com/

Query 2: HOUSE MOUSE
1. http://www.house-mouse.com/
2. http://www.house-mouse.com/eeek-mail/
3. http://psc.disney.go.com/abcnetworks/toondisney/shows/hom/
4. http://psc.disney.go.com/abcnetworks/toondisney/games/pack_the_house/index.html
5. http://en.wikipedia.org/wiki/mus_musculus (Highly Relevant)

Query 3: HOUSE MOUSE BIOLOGY
1. http://www.pestproducts.com/mice.htm (Highly Relevant)
2. http://www.pestproducts.com/rodents.htm (Relevant)
3. http://en.wikipedia.org/wiki/mus_musculus (Highly Relevant)
4. http://www.ipm.ucdavis.edu/pmg/pestnotes/pn7483.html
5. http://doyourownpestcontrol.com/mice.htm (Highly Relevant)

Our search system
1. http://en.wikipedia.org/wiki/mus_musculus (2,3) (Highly Relevant)
2. http://www.nsrl.ttu.edu/tmot1/mus_musc.htm (2) (Highly Relevant)
3. http://www.house-mouse.com/ (2)
4. http://en.wikipedia.org/wiki/mouse (1) (Highly Relevant)
5. http://animaldiversity.ummz.umich.edu/site/accounts/information/mus_musculus.html (3) (Highly Relevant)

The highly relevant and relevant pages are marked accordingly, and the numbers in round brackets following each URL denote which queries retrieved it. The aggregation scores are calculated by PWM-t. From Table 3, we observe that the number of highly relevant pages in the top five produced by our system is larger than for the initial query (mouse) issued by the user and for any single recommended query (house mouse, house mouse biology). Moreover, our system covers broad search results derived from the different queries, which increases the diversity of the fused search result list. For example, "house mouse" (query #2) returns "http://www.nsrl.ttu.edu/tmot1/mus_musc.htm" as one of its search results, "http://en.wikipedia.org/wiki/mouse" comes from the query "mouse" (query #1), and the query "house mouse biology" (query #3) contributes the Web page "http://www.pestproducts.com/mice.htm". In contrast to current search engines, which efficiently retrieve results for a single query, our system thus provides an alternative and useful way to produce more relevant Web pages by using a parallel search strategy with related queries.

5. Related work

Various methods have been proposed in the literature on rank aggregation. They can be classified according to whether (1) they rely on scores (values) [23,20,25], (2) they rely on ranks (orders) [11,33,1,21,16], and (3) they require training data or not [23]. The performance of some rank-based methods is comparable to that of score-based methods [23], which is useful in the context of Web search since scores are usually unavailable from search engines. Certainly, it would be reasonable to estimate approximate relevance values, but this topic is out of the scope of this paper. We have proposed a generic rank aggregation framework based on random walk. Various representative rank aggregation methods can be applied in our framework, as listed in Table 1. This generic rank aggregation framework lays down a foundation for researchers to theoretically and empirically compare different rank aggregation methods under a unified framework.

Recently, researchers have shown interest in the Web search topic from different viewpoints. There are many studies on how to

enhance Web search [15,29,17,6,5]. Chen and Lin [6] describe a new algorithm called Word AdHoc Network (WANET) and use it to extract the most important sequences of keywords in order to provide the most relevant search results to the user. Champin et al. [5] propose an approach to improving mainstream Web search by harnessing the search experiences of groups of like-minded searchers. Metasearch [1], which submits a query to several search engines to obtain several lists of search results, is orthogonal to our work; we assume that we can obtain useful search contexts from related queries and perform the searches on a single search engine. In addition, query expansion is an alternative way to improve search quality [14,26,7,8,10,27]. Most query expansion techniques suggest terms extracted from Web pages. However, some terms are difficult to suggest because of their high document frequencies. Moreover, query expansion generates artificial queries, while in this paper we focus on how the rank aggregation of search results can improve retrieval quality. Terms in related queries are actually input by previous users and reflect their search intents. Query expansion using related queries would be an interesting topic for our future work.

6. Conclusions and future work

In this paper, we studied random walk based rank aggregation to improve Web search by combining the search results of related search queries given an input query. To find an effective method of rank aggregation, we have proposed a generic rank aggregation framework which applies a random walk on a novel Win/Loss graph as our rank mechanism. The proposed Win/Loss graph can utilize a variety of competition rules to determine its edge directions and weights. In addition, we have discussed how some typical rank aggregation methods are integrated into our framework and empirically showed how the different methods affect the retrieval quality of our search strategy. Experimental results have verified that our strategy is quite effective in helping users locate relevant information. In the future, incorporating the title and content of a URL into our approach is probably an interesting and promising topic.

References

[1] J.A. Aslam, M.H. Montague, Models for metasearch, in: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'01), New Orleans, Louisiana, USA, 2001, pp. 275–284.
[2] R.A. Baeza-Yates, C.A. Hurtado, M. Mendoza, Improving search engines by query clustering, JASIST 58 (12) (2007) 1793–1804.
[3] M. Bianchini, M. Gori, F. Scarselli, Inside PageRank, ACM Trans. Internet Technol. 5 (1) (2005) 92–128.
[4] J. Borda, Mémoire sur les élections au scrutin, Comptes rendus de l'Académie des sciences 44 (1781) 42–51.
[5] P.-A. Champin, P. Briggs, M. Coyle, B. Smyth, Coping with noisy search experiences, Knowl.-Based Syst. 23 (4) (2010) 287–294.
[6] P.-I. Chen, S.-J. Lin, Word AdHoc Network: using Google core distance to extract the most relevant information, Knowl.-Based Syst. 24 (3) (2011) 393–405.
[7] P.-A. Chirita, C.S. Firan, W. Nejdl, Personalized query expansion for the web, in: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'07), Amsterdam, The Netherlands, 2007, pp. 7–14.
[8] K. Collins-Thompson, J. Callan, Query expansion using random walk models, in: Proceedings of the 2005 ACM CIKM International Conference on Information and Knowledge Management (CIKM'05), Bremen, Germany, 2005, pp. 704–711.
[9] A. Copeland, A Reasonable Social Welfare Function, Mimeo, University of Michigan, 1951.
[10] H. Cui, J.-R. Wen, J.-Y. Nie, W.-Y. Ma, Query expansion by mining user logs, IEEE Trans. Knowl. Data Eng. 15 (4) (2003) 829–839.
[11] C. Dwork, R. Kumar, M. Naor, D. Sivakumar, Rank aggregation methods for the web, in: Proceedings of the 10th International Conference on World Wide Web (WWW'01), Hong Kong, China, 2001, pp. 613–622.
[12] M. Farah, D. Vanderpooten, An outranking approach for rank aggregation in information retrieval, in: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'07), Amsterdam, The Netherlands, 2007, pp. 591–598.

[13] A. Fujii, Modeling anchor text and classifying queries to enhance web document retrieval, in: Proceedings of the 17th International Conference on World Wide Web (WWW'08), Beijing, China, 2008, pp. 337–346.
[14] F.A. Grootjen, T.P. van der Weide, Conceptual query expansion, Data Knowl. Eng. 56 (2) (2006) 174–193.
[15] S. Lakshminarayana, Categorization of web pages – performance enhancement to search engine, Knowl.-Based Syst. 22 (1) (2009) 100–104.
[16] G. Lebanon, J.D. Lafferty, Cranking: combining rankings using conditional probability models on permutations, in: Proceedings of the 19th International Conference on Machine Learning (ICML'02), Sydney, Australia, 2002, pp. 363–370.
[17] L. Li, Z. Yang, M. Kitsuregawa, Using ontology-based user preferences to aggregate rank lists in web search, in: Proceedings of Advances in Knowledge Discovery and Data Mining, 12th Pacific-Asia Conference (PAKDD'08), Osaka, Japan, 2008, pp. 923–931.
[18] L. Li, Z. Yang, L. Liu, M. Kitsuregawa, Query-URL bipartite based approach to personalized query recommendation, in: Proceedings of the 23rd AAAI Conference on Artificial Intelligence (AAAI'08), Chicago, Illinois, USA, 2008, pp. 1189–1194.
[19] C.D. Manning, P. Raghavan, H. Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008.
[20] M.H. Montague, J.A. Aslam, Relevance score normalization for metasearch, in: Proceedings of the 2001 ACM CIKM International Conference on Information and Knowledge Management (CIKM'01), Atlanta, Georgia, USA, 2001, pp. 427–433.
[21] M.H. Montague, J.A. Aslam, Condorcet fusion for improved retrieval, in: Proceedings of the 2002 ACM CIKM International Conference on Information and Knowledge Management (CIKM'02), McLean, VA, USA, 2002, pp. 538–548.
[22] L. Page, S. Brin, R. Motwani, T. Winograd, The PageRank citation ranking: bringing order to the web, Tech. rep., Stanford Digital Library Technologies Project, 1998.
[23] M.E. Renda, U. Straccia, Web metasearch: rank vs. score based rank aggregation methods, in: Proceedings of the 2003 ACM Symposium on Applied Computing (SAC'03), Melbourne, FL, USA, 2003, pp. 841–846.


[24] X. Shen, B. Tan, C. Zhai, Context-sensitive information retrieval using implicit feedback, in: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'05), Salvador, Brazil, 2005, pp. 43–50.
[25] M. Shokouhi, Segmentation of search engine results for effective data-fusion, in: Proceedings of the 29th European Conference on IR Research (ECIR'07), Rome, Italy, 2007, pp. 185–197.
[26] M. Song, I.-Y. Song, X. Hu, R.B. Allen, Integration of association rules and ontologies for semantic query expansion, Data Knowl. Eng. 63 (1) (2007) 63–75.
[27] R. Sun, C.-H. Ong, T.-S. Chua, Mining dependency relations for query expansion in passage retrieval, in: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'06), Seattle, Washington, USA, 2006, pp. 382–389.
[28] J.-R. Wen, J.-Y. Nie, H. Zhang, Query clustering using user logs, ACM Trans. Inf. Syst. 20 (1) (2002) 59–81.
[29] Y. Xu, K. Wang, B. Zhang, Z. Chen, Privacy-enhancing personalized web search, in: Proceedings of the 16th International Conference on World Wide Web (WWW'07), Banff, Alberta, Canada, 2007, pp. 591–600.
[30] Z. Yang, L. Li, M. Kitsuregawa, Efficient querying relaxed dominant relationship between product items based on rank aggregation, in: Proceedings of the 23rd AAAI Conference on Artificial Intelligence (AAAI'08), Chicago, Illinois, USA, 2008, pp. 1261–1266.
[31] H.P. Young, Condorcet's theory of voting, American Political Sci. Rev. 82 (4) (1988) 1231–1244.
[32] Z. Zhang, O. Nasraoui, Mining search engine query logs for query recommendation, in: Proceedings of the 15th International Conference on World Wide Web (WWW'06), Edinburgh, Scotland, UK, 2006, pp. 1039–1040.
[33] S. Zhu, Q. Fang, X. Deng, W. Zheng, Metasearch via voting, in: Proceedings of the Fourth International Conference on Intelligent Data Engineering and Automated Learning (IDEAL'03), Hong Kong, China, 2003, pp. 734–741.