ConSearch: A Concept-Associating Search Interface Using Commonsense
Chia-Hsun Lee, Henry Lieberman
MIT Media Laboratory
20 Ames St., E15-324
{jackylee, lieber}@media.mit.edu
+1 617.253.4564
ABSTRACT
This paper presents ConSearch, a concept-associating search interface based on a cognitive model of web searching. Web search is rarely a good experience when the list of possible results is hard to explore. Users carry a heavy mental load filtering out irrelevant web links. To make searching easier, the search mechanism should map onto the user’s mental model. Human cognition has a great advantage over machines in recognizing things that make sense, so adding a layer of conceptual relationships can help users see which way to go. ConSearch provides an interactive way of retrieving search results by associating concepts.
Author Keywords
Commonsense Reasoning, Concept Association, Interactive Search Interface.
ACM Classification Keywords
H.5.2. User Interfaces, H.5.3. Web-based interaction, H.3.3 Information Search and Retrieval.
INTRODUCTION
The Internet has been regarded as a huge, ill-structured information resource. Tim Berners-Lee’s Semantic Web [1] is a vision of the Internet as a machine-understandable structure in which an agent can follow semantics and meaning to understand and respond to users’ situations and needs. However, this vision still needs time to develop. Web search engines such as Google [2] and Yahoo [3] have simplified web search into simple keyword input. By combining popularity, keyword frequency, and certain information retrieval techniques [4] into relevance ranks, today’s search engines seem to do a good job. But finding the right information should not be a probability test.
Semantic search [5] augments traditional search results with information drawn from a resource-rich network. GOOSE [6] is a search interface that lets users search based on their goals. Keyword search can thus be extended into goal-oriented search that finds results conceptually related to the input keywords. Openmind [7] is an online database storing over 700,000 simple facts about obvious, everyday real-world knowledge. Its semantically rich structure makes it possible to draw analogies from text input to related commonsense knowledge. This creates an opportunity to enrich web search with commonsense instead of leaving users to filter statistical results manually.
Web search engines are usually built on a statistical model of keyword repetition. Search results from Google can be hard to explore if a good one does not appear on the first result page. Browsing the results one by one imposes a heavy mental load. Inspecting a very long list at random is seldom productive. The results appear unrelated to one another, and users usually pick links that seem useful more or less at random.
We propose ConSearch, a concept-associating search interface that makes sense of the web search process. The main idea is to classify search results by concept and to offer users a comprehensible way to find their answers. Web search should be as easy as human analogy rather than trial and error. A semantically rich representation of search results may require less mental effort to get on the right path toward the search goal.
A COGNITIVE MODEL OF WEB SEARCH
Web search involves a series of mental stages and information-seeking activities aimed at finding useful information. Ellis and Haugan [8] propose a general model of information-seeking behavior based on studies of the information-seeking patterns of researchers and practitioners. The model describes six generic categories of information-seeking activity: starting, chaining, browsing, differentiating, monitoring, and extracting. In this paper, we focus on the browsing, differentiating, and extracting activities applied to search results. They are the most time-consuming activities, yet not always productive.
A system that reduces some of the complexity of the search process could greatly improve the search experience. In his model of the information-seeking process [9], Marchionini proposes that information seeking is composed of eight sub-processes that develop in parallel: 1. recognize and accept an information problem, 2. define and understand the problem, 3. choose a search system, 4. formulate a query, 5. execute search, 6. examine results, 7. extract information, 8. reflect/iterate/stop. Modern web search interfaces such as Google and Yahoo usually address only the execute-search step among these eight sub-processes. In other words, the human brain still has much of the work to do during a search.
As shown in Figure 1, Google tries to simplify certain processes by leading users to a simple keyword search portal without asking them to define their problem in detail. Yahoo provides both keyword search and categories to narrow the search scope. But few search engines really attend to improving the user interface so that users can examine results and extract information easily.
Figure 1. Google and Yahoo web search interfaces. Google makes users start from a keyword search. Yahoo categorizes websites to provide visible search paths.
A COMMONSENSE REASONING APPROACH
To make search results readable, they should be in a form of knowledge that we are familiar with. Semantic memory takes less cognitive load in reasoning [10]. The obvious way to think about something and make analogies is to use commonsense. Commonsense interface agents [11] demonstrated that adding a little knowledge about everything to an under-constrained user interface can at least make sense as far as the user is concerned. Openmind Commonsense, a large commonsense knowledge database, stores over 700,000 English sentences stating simple commonsense facts, e.g. things fall down, not up, and dinner is a meal eaten in the evening. ConceptNet [12] is a natural language processing tool that makes sense of the commonsense database. ConceptNet can make concept associations from English sentence input.
INTERACTIVE SEARCH BY ASSOCIATING CONCEPTS
The hard part of web search is usually filtering out irrelevant results when the right answers are not easy to find. Users suffer a heavy mental load switching back and forth between web pages. Grouping search results by similar concept and retrieving them in a concept-associated manner can offload some of this irrelevant work. This paper suggests an interactive way of filtering results. A user enters some keywords, and the system returns grouped results conceptually related to those keywords. It is relatively easy for users to recognize whether a concept falls within their goal.
Web links are treated as nodes. We use ConceptNet to find the most relevant topics among a group of web links. The relevant topics come from spreading activation within a semantic network [13]. We consider all known concepts to be represented as nodes connected in a network. Each node is extended by reasoning over context information from ConceptNet. A concept is directly connected by links to its most closely related concepts. When a concept receives an activation signal from one of its neighbors, it in turn passes the activation signal along to each of its other neighbors. This is referred to as spreading activation. Concept-associated search also encourages users to follow a reasonable search path: once a user finds an appropriate concept, the search scope is narrowed down to related concepts.
Figure 2. The ConSearch interface extracts concepts from each result link and color-codes them.
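The spreading activation described in this section can be illustrated with a minimal sketch. It is not ConceptNet's implementation: the toy network, the `decay` factor, the `max_hops` limit, and the function names `spread_activation` and `top_concepts` are all assumptions made for illustration, and the concept names are borrowed from the search scenario in this paper.

```python
from collections import defaultdict

# Hypothetical toy semantic network: each pair links two closely related
# concepts. The links are illustrative and are not taken from ConceptNet.
EDGES = [
    ("frog pond", "park"),
    ("frog pond", "ice skate"),
    ("ice skate", "skate rental"),
    ("park", "child"),
    ("boston common", "park"),
]

def build_graph(edges):
    """Build an undirected adjacency map from (concept, concept) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def spread_activation(graph, seeds, decay=0.5, max_hops=2):
    """Spread activation outward from the seed concepts.

    Each seed starts with activation 1.0. At every hop, an active concept
    passes its signal, scaled by `decay`, to each neighbor other than the
    one it received the signal from. Signals arriving over several paths
    are summed.
    """
    activation = defaultdict(float)
    for seed in seeds:
        activation[seed] += 1.0
    # Frontier entries are (concept, sender, signal strength).
    frontier = [(seed, None, 1.0) for seed in seeds]
    for _ in range(max_hops):
        next_frontier = []
        for concept, sender, signal in frontier:
            for neighbor in graph[concept]:
                if neighbor == sender:
                    continue  # do not echo the signal straight back
                passed = signal * decay
                activation[neighbor] += passed
                next_frontier.append((neighbor, concept, passed))
        frontier = next_frontier
    return dict(activation)

def top_concepts(activation, seeds, k=3):
    """Return the k most activated concepts that are not seeds."""
    ranked = sorted(
        ((c, a) for c, a in activation.items() if c not in seeds),
        key=lambda item: (-item[1], item[0]),  # ties broken alphabetically
    )
    return [concept for concept, _ in ranked[:k]]

graph = build_graph(EDGES)
seeds = ["frog pond", "boston common"]
activation = spread_activation(graph, seeds)
print(top_concepts(activation, seeds))
```

In this toy network, 'park' receives a signal from both seed concepts, so it accumulates the most activation and surfaces as the most relevant topic. The hop limit and decay factor keep distant concepts from dominating; both are tuning choices of the sketch, not values reported in this paper.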
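Grouping result links into concept folders while keeping the search engine's ranking might look like the sketch below. It assumes every result already carries a list of concept tags (in ConSearch these would come from ConceptNet). The `SearchResult` type, the `group_by_concept` function, and the concept tags assigned to each title are hypothetical; only the result titles are taken from the 'Boston common frog pond' scenario in this paper.

```python
from collections import OrderedDict
from dataclasses import dataclass, field

@dataclass
class SearchResult:
    rank: int    # position in the original search engine ranking
    title: str
    concepts: list = field(default_factory=list)  # assumed concept tags

def group_by_concept(results):
    """Return an ordered mapping from concept to the results tagged with it.

    Folders appear in the order their concept is first seen when walking
    the results by rank, and results inside a folder stay in rank order,
    so the search engine's relevance ranking is preserved.
    """
    folders = OrderedDict()
    for result in sorted(results, key=lambda r: r.rank):
        for concept in result.concepts:
            folders.setdefault(concept, []).append(result)
    return folders

# Illustrative data: the concept tags are made up for this sketch.
results = [
    SearchResult(2, "frog pond in Boston reopens", ["news", "ice skate"]),
    SearchResult(1, "city of Boston", ["park"]),
    SearchResult(3, "Boston common frog pond wading pool", ["park", "child"]),
]

folders = group_by_concept(results)
for concept, items in folders.items():
    print(concept, [r.rank for r in items])
```

A result tagged with several concepts appears in several folders, which lets a user reach the same link along more than one conceptual path. Whether ConSearch allows this overlap is not stated in the paper; it is a design choice of the sketch.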
SEARCH SCENARIOS
As shown in Figure 2, suppose you want to know more about 'frog pond' and you know it is in 'boston common'. You search for 'Boston common frog pond' in Google, and the results range from 'city of Boston', 'frog pond in Boston reopens', and 'Boston common frog pond wading pool' to 'project for public spaces..', etc. ConceptNet extracts a topic (guest_topic) for each link and connects related topics together, such as 'park', 'news', 'ice skate', 'skate rental', 'child', 'win baseball game', 'stay', 'hotel'. The system presents the results in concept folders such as 'park', 'news', and 'ice skate'.
IMPLEMENTATION
The ConSearch system is implemented in three parts: Google search result clones, concept association, and concept visualization. Google search results are crawled into a template for retrieving the concepts behind every hyperlink. ConceptNet parses the template to find a group of concepts to which all the results relate. The concept visualization part manages the grouping of result links and re-sorts them into a series of concepts.
[Figure 3 diagram; labels: Keyword, Google search results, Google search result clones, ConceptNet, Concept Association, Concept Visualization, Park, Skating]
Figure 3. System architecture of ConSearch.
CONCLUSION
This paper presents a concept-associating interface for annotating web search results. The system clusters Google search results by similar concepts using commonsense reasoning. Making search results human-readable is as important as making search engines more efficient. Statistical ranking systems have had much success in finding the right results with high probability. But the search engine never understands what the user may need, and so cannot guide the user toward a successful search. By annotating the result pages with color-coded concepts, ConSearch allows users to retrieve and associate information through simple semantic relations. The concept visualization also preserves the relevance ranks from the Google search engine.
REFERENCES
1. Berners-Lee, T., Hendler, J., Lassila, O. Semantic Web. Scientific American 1(1), 2000, pp. 68-88.
2. Google, http://www.google.com
3. Yahoo, http://www.yahoo.com
4. Broder, A., Henzinger, M. Algorithmic aspects of information retrieval on the web. Handbook of massive data sets, Kluwer Academic Publisher, Norwell, MA, 2002.
5. Guha, R., McCool, R., Miller, E. Using the Semantic Web: Semantic Search. In Proc. of the Twelfth International Conference on World Wide Web, May 2003, pp. 700-709.
6. Liu, H., Lieberman, H., Selker, T. GOOSE: A Goal-Oriented Search Engine with Commonsense. In Proc. AH2002, pp. 253-263.
7. Openmind commonsense project, http://commonsense.media.mit.edu/
8. Ellis, D., Haugan, M. Modelling the Information Seeking Patterns of Engineers and Research Scientists in an Industrial Environment. Journal of Documentation, vol. 53, no. 4, 1997, pp. 384-403.
9. Marchionini, G. Information Seeking in Electronic Environments. Cambridge University Press, 1995.
10. Conrad, C. Cognitive Economy in Semantic Memory. Journal of Experimental Psychology, Volume 92, 1972, pp. 149-154.
11. Lieberman, H., Liu, H., Singh, P., Barry, B. Beating Some Common Sense into Interactive Applications. Submitted to the International Joint Conference on Artificial Intelligence, 2003.
12. Liu, H., Singh, P. ConceptNet: A Practical Commonsense Reasoning Tool-Kit. BT Technology Journal, Vol. 22, No. 4, October 2004, pp. 211-226.
13. Quillian, M. R. Semantic Memory. In M. Minsky (ed.), Semantic Information Processing, 1968.