Page Hunt: Using Human Computation Games to Improve Web Search

Hao Ma *
Dept. of CSE, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
[email protected]

Raman Chandrasekar, Chris Quirk
Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA
{ramanc, chrisq}@microsoft.com

Abhishek Gupta *
Digital Media, LCC, Georgia Institute of Technology, Atlanta, GA 30332, USA
[email protected]

* Work done on summer internships at Microsoft Research.

ABSTRACT

There has been a lot of work on evaluating and improving the relevance of web search engines, primarily using human relevance judgments or clickthrough data. Both approaches look at the problem of learning the mapping from queries to web pages. In contrast, Page Hunt is a single-player human computation game that seeks to learn a mapping from web pages to queries. In particular, Page Hunt is used to elicit data from players about web pages that can be used to improve search. The data we elicit from players has several applications, including providing metadata for pages, providing query alterations for use in query refinement, and identifying ranking issues. The demo has features that make the game fun while eliciting useful data.

Categories and Subject Descriptors

H.3.m [Information Retrieval]: Miscellaneous; H.5.3 [HCI]: Web-based Interaction

General Terms

Measurement, Design, Experimentation, Human Factors.

Keywords

Web Search, Human Computation Games, Relevance

1. INTRODUCTION

Search engines have become an integral part of our everyday lives. It is clearly important to evaluate the "goodness" of search engines, to help improve the search experience of users and advertisers, and to build traffic and revenue. One common method of evaluating search relevance uses large hand-annotated evaluation corpora, where query/document pairs are assigned relevance judgments. Another method uses implicit measures of relevance, such as clicks on results. Both methods are based on pages surfaced by a search engine. But some pages may never get surfaced, for a variety of reasons. If a page is not surfaced, it will not get into a pool of results to be evaluated, nor will it figure in clickthrough data, since no one clicks on it.

To obviate such problems in finding pages relevant to a given query, we suggest an alternative method: a human computation game, inspired by the ESP game [2, 3], that looks in the other direction. Given a web page, what queries will effectively find this web page?

2. PAGE HUNT

The basic idea of Page Hunt [4] is to show the player a random web page, and ask the player to formulate a query that would bring up this page in the top few results (say, the top 5) on a search engine. The web page being 'hunted' is shown in the background. The player views the page and types in a word or phrase as the current query Q. The system gets the top N search results for this query from a search engine and displays them. Against each result, a big check mark is shown if the match is successful, and a cross if it is not. We get results from Microsoft's Live Search engine using its public SOAP API.

A match is successful if the URL U is in the top N results for the query Q. Players get points based on the rank of U in the result set: 100 points for rank 1, 90 for rank 2, and so on. Thus the search engine in the background determines whether the player wins or loses. If query Q does not lead to success, the player edits the query, changing the phrase or adding a word or phrase. Otherwise, the player's score is updated, and the game advances to the next web page. This repeats for each page until the player quits or hits a fixed time limit for the game.
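
As a minimal sketch of the scoring rule just described, the following Python snippet checks whether the hunted URL appears in the top N results and assigns points by rank. The `get_top_results` stub is hypothetical; the actual game used the Live Search SOAP API, whose interface we do not reproduce here.

```python
from typing import List

TOP_N = 5  # the hunted page must appear in the top N results


def get_top_results(query: str, n: int = TOP_N) -> List[str]:
    """Hypothetical stand-in for the search API call (Page Hunt used
    Live Search's SOAP API); returns the top-n result URLs for `query`."""
    raise NotImplementedError("plug in a real search backend here")


def score_query(hunted_url: str, results: List[str]) -> int:
    """Points by rank of the hunted URL in the result list:
    100 for rank 1, 90 for rank 2, and so on; 0 if the URL
    is not among the top N results."""
    for rank, url in enumerate(results[:TOP_N], start=1):
        if url == hunted_url:
            return 110 - 10 * rank
    return 0
```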

From pilot tests with Page Hunt, we have seen that players readily comprehend the task. From these tests, we also found that the query data we obtain from the game is not very different from queries used on search engines. Using a transparent overlay on top of the web page, we prevent players from cutting and pasting long phrases from the page. Coupled with the time limit on each game, this seems to discourage people from providing unrealistic phrases as queries simply because they are unique. To verify this assumption, we took a random sample of 100 query/URL pairs from the pilot data and categorized each query as over-specified (containing more words than required to retrieve the associated web page), under-specified (containing fewer words than required), or OK. In our analysis, 78% of the queries were OK, 15% were over-specified, and 7% were under-specified.
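
To make the categorization above concrete, a toy tally over hand-labeled (query, URL, label) triples might look like the following; the three sample rows are invented for illustration, and the labels mirror the paper's three categories.

```python
from collections import Counter

# Invented examples; the pilot study used 100 hand-labeled pairs.
labeled_pairs = [
    ("page hunt game", "http://example.com/a", "ok"),
    ("microsoft research page hunt demo game 2009", "http://example.com/b", "over-specified"),
    ("hunt", "http://example.com/c", "under-specified"),
]

counts = Counter(label for _, _, label in labeled_pairs)
total = sum(counts.values())
for label, n in counts.most_common():
    print(f"{label}: {n} ({100 * n / total:.0f}%)")
```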

Figure 1. Screenshot of Page Hunt with the web page that is being hunted in the background, and the panel with the player’s query and search results floating on top.

We have devised a process to extract query alterations from the game data, using the idea of bitext matching [1]. Given a set of queries that all describe the same URL, we extract all possible query pairs. From these pairs, we can learn transformations between approximately equivalent queries, including synonym replacement, acronym expansion and contraction, and many other operations. There is considerable scope for further work, including evaluation of the utility of query alterations gleaned by this method.
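
The pairing step is simple to sketch. Assuming the game logs are grouped as a mapping from each URL to the queries that successfully found it (the variable names here are illustrative, not the paper's), every unordered pair of such queries is a candidate approximately-equivalent pair for the alteration learner:

```python
from itertools import combinations
from typing import Dict, List, Tuple


def extract_query_pairs(logs: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """For each URL, emit every unordered pair of distinct queries that
    retrieved it; these pairs feed the transformation learner."""
    pairs: List[Tuple[str, str]] = []
    for queries in logs.values():
        pairs.extend(combinations(sorted(set(queries)), 2))
    return pairs


# Example: two players found the same page with different queries.
logs = {"http://example.com/sigir": ["sigir 2009 boston", "sigir conference 2009"]}
print(extract_query_pairs(logs))
# -> [('sigir 2009 boston', 'sigir conference 2009')]
```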

The data that we elicit from online players has several other applications as well, such as providing metadata for web pages, evaluating the performance of different search engines, examining the cognitive style of online users, and identifying ranking issues in search results.

In our demo, we will highlight aspects of the game that seek to make it fun and engaging, and point out features that improve the quality of the data collected.

3. ACKNOWLEDGMENTS

In any human computation game, the data comes from the players. We thank our players for their help and for the data they provided while playing our games.
4. REFERENCES

[1] C. Quirk, C. Brockett, and W. B. Dolan. Monolingual Machine Translation for Paraphrase Generation. In Proc. of EMNLP, Barcelona, Spain, 2004.

[2] L. von Ahn. Games with a purpose. IEEE Computer, 39(6):92-94, 2006.

[3] L. von Ahn and L. Dabbish. Designing games with a purpose. Communications of the ACM, 51(8):58-67, 2008.

[4] H. Ma, R. Chandrasekar, C. Quirk, and A. Gupta. Page Hunt: Improving Search Engines Using Human Computation Games. In Proc. of SIGIR, Boston, USA, 2009.