The Many Ways of Searching the Web Together: A Comparison of Social Search Engines
Manuel Burghardt, Markus Heckner and Christian Wolff
Media Informatics Group, Institute for Information and Media, Language and Culture, University of Regensburg, 93040 Regensburg, Germany
E-mail: {firstname.lastname}@ur.de

Abstract: This article illustrates and explains the ambiguity and vagueness of the term social search and aims at describing and classifying the heterogeneous landscape of social search implementations on the WWW. We have examined different definitions as well as the context of social search in an extensive literature review, and have tried to unify and enhance existing ideas and concepts. Our definition of social search is illustrated by a general review of existing social search engines, which are analyzed and described by their specific features and social aspects.

Keywords: social search engines, social tagging, social question-answering, collaborative filtering, collaborative search, personalized social search engines
1. Introduction

Before the digital age and the rise of the WWW, information seeking almost always occurred in a social context, i.e. users had to ask a person from their social environment – either a qualified friend or an information professional – when they wanted to obtain some kind of information. The advantage of information sought in such a socially mediated way lies in the ease of evaluating its quality, as the source of information is in most cases personally known for his or her competence in a specific field of knowledge. Today, most algorithmic web search engines suffer from a lack of trust in the quality of the retrieved information. The problem is no longer to find any information about a certain topic at all, but to judge which piece of information from the vast, automatically generated list of results is actually relevant and of decent quality. Another problem is that existing search engines adopt a "one size fits all" (Ahn, Brusilovsky, & Farzan, 2005) approach, i.e. they do not consider user-specific and contextual aspects of the search process. Thus, it seems natural to integrate some kind of social context into the process of web search, as real people from a user's social environment are trusted more than abstract and non-transparent search algorithms.
Social search can be understood as a generic term coined to summarize a wide variety of concepts that approach this issue. Most traditional web search engines focus on text- and content-based retrieval algorithms, whereas social search engines focus on human judgment when it comes to ranking and assessing the relevance of online resources. This means the ranking algorithm of a social search engine is not (primarily) based on the frequency and distribution of specific keywords in a single document, but on the fact that other users evaluate certain documents as interesting and relevant with respect to a certain information need. Ahn et al. (2005) point out another problem of current web search engines, which lies in the traditional IR assumption that the users' queries and the document representation share a common language and can be matched. Social approaches such as social tagging try to overcome this mismatch by allowing users to describe documents in their own terms.
2. What is social search?

While the social web is still changing at a very fast pace, the concept of social search is evolving, too. As a result of this development, and because of the highly generic nature of the term social search, a wide range of different interpretations and implementations of social search engines can be found on the web. This chapter aims at describing and comparing the heterogeneous landscape of social search implementations on the WWW.

2.1. Context and history of social information retrieval

Although there is a plethora of different social search engines on the web, most of them build on basic concepts and ideas from the IR field that have been around for at least 50 years. The problems and limitations of exclusively automatic, query-based search, which cause a low precision of the result set (at least in the case of output without relevance ranking), have been known to IR researchers for years. A first approach to using human judgment for increased precision of results can be seen in Rocchio's introduction of relevance feedback (Rocchio, 1971), which allows the user to gradually refine a query by evaluating the relevance of the particular results. Although this idea dates back to the 1970s, it has been difficult to implement effectively ever since. The reasons are basic problems of motivation and acceptance: it is very difficult to get users to voluntarily give feedback to an IR system about the relevance of the results, as users primarily want to find information, not give feedback.
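To make the relevance feedback idea more tangible, the following is a minimal Python sketch of the classic Rocchio query update, q' = α·q + β·centroid(relevant documents) − γ·centroid(non-relevant documents); the toy vocabulary, document vectors and parameter values are illustrative assumptions, not taken from the original paper.

```python
import numpy as np

def rocchio_update(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio relevance feedback: move the query vector towards the centroid
    of documents judged relevant and away from the centroid of documents
    judged non-relevant. All inputs are term-weight vectors (e.g. tf-idf)."""
    q = alpha * np.asarray(query, dtype=float)
    if len(relevant):
        q += beta * np.mean(relevant, axis=0)
    if len(non_relevant):
        q -= gamma * np.mean(non_relevant, axis=0)
    return np.maximum(q, 0.0)  # negative term weights are usually clipped to zero

# Hypothetical vocabulary: ["jaguar", "car", "animal"]
query = [1.0, 0.0, 0.0]
relevant = np.array([[0.9, 0.8, 0.0]])      # user marked a car-related hit as relevant
non_relevant = np.array([[0.8, 0.0, 0.9]])  # ...and an animal-related hit as irrelevant
print(rocchio_update(query, relevant, non_relevant))
```

After one iteration the query vector is pulled towards the "car" sense of the ambiguous term, which is exactly the gradual refinement Rocchio envisaged.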
The various applications of the social web (social networks, social tagging, etc.) as well as a change in the mentality of its users (from passive consumers to active contributors and collaborators, or prosumers) enable a large-scale realization of the relevance feedback idea, adding the possibility of socially sharing feedback with many other users. Another crucial development of the 1990s, setting the course for social search, was the consideration of cognitive and contextual aspects of the information seeking process. Before this development, IR systems were evaluated using the so-called "Laboratory Model of Information Retrieval" (Ingwersen & Järvelin, 2010) in the Cranfield paradigm, which focuses on theoretical aspects and the system itself. In the last decades this mainly system- and theory-driven tradition of IR research has been broadened by a more user-oriented strand of research, especially by the Scandinavian school of information retrieval (Ingwersen, 1996): Starting from the special situation of information seeking as an "anomalous state of knowledge" (ASK) (Belkin, Oddy, & Brooks, 1982), the adequate cognitive modeling of information needs as well as the user's perception of information as presented by information systems gain importance. Wilson models the search process as an integral part of information (use) behavior (Wilson, 1999), and Ingwersen & Järvelin (2010) see information retrieval and information behavior as part of the social context. Their theory of polyrepresentation claims that retrieval quality may improve if many different representations of documents and media can be analyzed for information search. Among these representations are information types as different as full text indices, keywords attached to documents by information professionals, or users' tags as offered on tagging platforms like Connotea. The result of this approach is not equivalent to a genuine theory of social search, but approaches like social tagging can be explained with the polyrepresentation model: By adding new methods of collaborative indexing such as social tagging, an additional and different representation of the original unit of information is created, which can help optimize the search process. This kind of social indexing is prominent in those cases where no general purpose algorithms for automatic indexing exist, while at the same time the sheer number (and quality) of informational units precludes intellectual indexing by information professionals, e.g. in the case of image and video retrieval; well-known platforms exemplifying this are Flickr and YouTube.

2.2. The social graph

Graph theory can be used to formalize a user's social context. Graphs are frequently used to represent networks, as they allow for an easy modeling of objects (as nodes) and relations (as edges).
user’s social context on the web is called social graph (de Choudhury, Sundaram, John, & Seligmann, 2010). The nodes of the social graph describe people on the Internet, the edges describe relations between those people. People may have an immediate connection, i.e. they know each other directly, or they only know each other indirectly that means they share a connection via a third person they both know. Milgram introduced the so called “small world phenomenon” in 1967, claiming that everybody on the world is connected via six degrees of separation (Milgram, 1967) which means that everybody knows everybody else via a maximum of six other persons. Depending on the degree of separation, a user’s social graph can be divided into several sub-‐‑graphs, denoting different groups of social connections and indicating different levels of intimacy. In practice, a user may have several parallel social graphs for different social networks. One of the main challenges for social search engines will be to aggregate different social graphs and provide the user with information from his entire social circle, or only from a pre-‐‑selected sub-‐‑circle. Markup languages for social networks like e.g. the RDF-‐‑based FOAF-‐‑format (friend of a friend) (Brickley & Miller, 2010) constitute an important step towards a standardized and interoperable representation of a user’s social graph. 2.3. Defining social search The heterogeneous field of existing social search engines illustrates that the concept of social search allows for a wide range for interpretations, and obviously lacks a standardized and generally accepted definition. In order to understand and define social search, it is necessary to clarify the slightly ambiguous term social. Some search engines are labeled social because they search for social data, which can be either information about real people (cf. Table 1), or real-‐‑time search in social media (cf. Table 2), whereas other search engines use the knowledge and judgment of people to back up web search in various ways. In the first case, search engines are just used for social data-‐‑ mining (Evans & Chi, 2008), i.e. they are treated as “systems searching for social data”. Name Pipl iSearch Wink 123people Yasni
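As a minimal illustration of these graph-theoretic notions, the following Python sketch computes the degree of separation between two people via breadth-first search; the adjacency-list representation and the example network are hypothetical.

```python
from collections import deque

def degrees_of_separation(graph, start, target):
    """Breadth-first search over a social graph given as adjacency lists:
    returns the number of edges between two people, or None if they are
    not connected at all."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        person, distance = frontier.popleft()
        if person == target:
            return distance
        for friend in graph.get(person, ()):
            if friend not in seen:
                seen.add(friend)
                frontier.append((friend, distance + 1))
    return None

# Hypothetical social graph: Ann knows Bob directly, Dan only via Bob and Cem.
graph = {
    "Ann": ["Bob"],
    "Bob": ["Ann", "Cem"],
    "Cem": ["Bob", "Dan"],
    "Dan": ["Cem"],
}
print(degrees_of_separation(graph, "Ann", "Dan"))  # 3 degrees of separation
```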
2.3. Defining social search

The heterogeneous field of existing social search engines illustrates that the concept of social search allows for a wide range of interpretations, and obviously lacks a standardized and generally accepted definition. In order to understand and define social search, it is necessary to clarify the slightly ambiguous term social. Some search engines are labeled social because they search for social data, which can be either information about real people (cf. Table 1) or real-time search in social media (cf. Table 2), whereas other search engines use the knowledge and judgment of people to back up web search in various ways. In the first case, search engines are just used for social data mining (Evans & Chi, 2008), i.e. they are treated as "systems searching for social data".

Name        URL
Pipl        http://pipl.com
iSearch     http://www.isearch.com
Wink        http://wink.com
123people   http://www.123people.com
Yasni       http://www.yasni.com

Table 1: Examples for people search engines
Name             URL
Socialmention    http://www.socialmention.com
Whostalking      http://www.whostalkin.com
Technorati       http://technorati.com
Google Realtime  http://www.google.com/realtime/
Kurrently        http://www.kurrently.com

Table 2: Examples for social media search engines
In the second case, search engines are treated as "systems searching socially", which means they rely on the actions of real people and are therefore often called people powered search engines. Although social search is often used as an umbrella term to capture both definitions, we argue for understanding social search and social search engines in the tradition of social software: In its loosest definition, social software describes any software that enables and supports people to communicate and to collaborate (Deans, 2008). By seeing social search engines as one of the many types of social software (in line with e.g. blogs, wikis, etc.), we will use the term in the meaning of "systems searching socially", as the search for a person or some kind of social media content is per se not an act of communication and collaboration. Croft, Metzler & Strohman (2010) claim that social search needs some kind of social environment, which "can be defined as an environment where a community of users actively participate in the search process". McDonnell & Shiri (2011: 9), in line with the previous definitions, define social search as "the use of social media to aid finding information on the Internet". In addition, they include a special case of the "systems searching for social data" interpretation in their definition, by claiming that the search for an expert – which is actually some kind of people search – is social search as well. This seems plausible, as the search for a person with expertise in a certain domain can be seen as a kind of meta-search, in which the searcher is actually trying to find somebody who can fulfill his or her information need. After experts are found, they are integrated into the user's social context, enabling communication and collaboration. This scenario actually occurs in some social question-answering systems, where users search for experts to answer their specific questions. The social search engine Aardvark hides this meta-search step from the user, as it tries to find experts for a question automatically, and only returns the expert's answer to the initial questioner. Building on the context of social software, we propose to use the term social search for any IR system that in some way relies on the user's social context in order to enhance the search process. Such a social enhancement requires active communication and collaboration. Another distinctive criterion is needed to narrow down the mode of communication and collaboration in a social environment, as Sherman (2006) rightly claims that social collaboration has been around on the web since its very beginning and is almost omnipresent.
He argues that every web page and every search algorithm was created by a human initially, so even if a user types a query into Google, the results he gets back are based on an algorithm which ultimately reflects human judgments about quality and relevance. In order to distinguish forms of social collaboration, the concept of intent, which can be implicit or explicit, has been suggested by several authors (Golovchinsky, Pickens, & Back, 2009; Lewandowski, 2009; McDonnell & Shiri, 2011). Implicit collaboration can take many forms, some of which are more implicit than others. Google's PageRank (Brin & Page, 1998) is a good example of the implicit collaboration of all people editing and linking web pages, as they evaluate and rank other websites by linking to them. The algorithm considers the hyperlink structure (link topology) of the web and derives relevance criteria which are based on the human judgment of the link builders. Another implicit form of collaboration can be found in the statistical analysis of users' surfing behavior on the web, where click popularity gives clues to the relevance of web pages. These two basic forms of implicit collaboration represent two user groups:

• link topology: authors of websites
• click popularity and user statistics: consumers of websites
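To illustrate how the link-topology signal aggregates implicit human judgment, the following is a deliberately simplified sketch of the PageRank power iteration; the damping factor and the toy link structure are illustrative, and details of the real algorithm (e.g. the handling of dangling pages) are omitted.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank power iteration: a page is important if important
    pages link to it. `links` maps each page to the pages it links to
    (dangling pages, which link nowhere, are omitted here for brevity)."""
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outgoing in links.items():
            for target in outgoing:
                # Each link passes on a share of its author's "vote".
                new_rank[target] += damping * rank[page] / len(outgoing)
        rank = new_rank
    return rank

# Hypothetical link topology: several authors "vote" for page A by linking to it.
links = {"A": ["B"], "B": ["A"], "C": ["A"], "D": ["A", "B"]}
print(pagerank(links))  # A accumulates the highest rank
```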
Implicit collaboration in the context of social search almost always means that people perform some kind of action which is not primarily intended to enhance search, but to fulfill some other task. In contrast, explicit collaboration in social search is always (at least to some degree) directed and deliberate, and can take many different forms, such as:

(1) social tagging
(2) social question answering
(3) collaborative search
(4) collaborative filtering
(5) personalized social search engines

Figure 1 gives a visualization of the various social approaches to information retrieval as described above.

[Figure 1: Taxonomy of Social Search Approaches – social search divides into social data mining (social media search, i.e. search for social media content, and people search*, i.e. search for people) and people powered search (systems searching socially); the latter comprises explicit approaches (social tagging, social question answering, collaborative search, personalized social search engines) and implicit approaches (collaborative filtering, click popularity and user statistics, link topology). *Not to be confused with systems that search for experts, who are able to support the search process because of their expertise.]
In the following we will investigate the various possibilities of realizing active and explicit participation of a community in the search process. We will also show in some more detail that our definition of social search still leaves plenty of room for interpretation, as "search" is a vague concept that includes different activities such as tagging, querying or ranking. Accordingly, the possibilities to search socially are numerous. This is well reflected by the vast and heterogeneous field of actual social search implementations which can be found on the web.
3. Social tagging systems

Storing and retrieving books and documents with the help of catalogs had been a well-established practice long before electronic information retrieval systems were developed. Analog catalogs require manual indexing with bibliographic metadata such as author or title, as well as content-related categories and keywords.
The goal of document indexing is to create a representation of the document which can be easily stored in a catalog and is therefore available for later retrieval (Lancaster, 2003). This manual indexing process is still applied in certain web directories, where human editors collect and annotate relevant links for a given set of topics. One of the first web directories was founded by Yahoo in 1994 under the name "Jerry and David's Guide to the World Wide Web". This directory was an attempt to create a basic catalog of the WWW (Hayhurst & Weston, 2007). Today, this catalog only plays a minor role on Yahoo's web sites and has largely been replaced by their web search engine (Griesbaum, Bekavac, & Rittberger, 2009). Within the confines of physical libraries and their catalogs, manual indexing and cataloging is a viable task, whereas the web's scale makes manual indexing a futile effort: No organization can provide a large enough number of professional indexers to annotate all documents and websites on the WWW. The fact that web documents are constantly changing and growing in number makes this task even harder. In order to alleviate this problem, the social tagging approach suggests that (web) authors and users both become indexers (i.e., taggers) of resources on the web.

3.1. Fundamentals and motivation for social tagging

A major problem of manual indexing lies in the fact that indexers are in most cases completely separated from the retrieval process (Mathes, 2004), i.e. the professional indexer or author of a document usually does not search for the same document. This can hinder the retrieval process: When a user has to formulate a search query based on an abstract information need, a gap occurs between the representation created by the professional indexer and the query formulated by the searcher (vocabulary problem) (Furnas, Landauer, Gomez, & Dumais, 1987). Social tagging alleviates this problem, as indexing becomes a collaborative effort in which all users of the system search and annotate the same set of documents (Blank, Bopp, Hampel, & Schulte, 2008; Mathes, 2004). Users can assign an arbitrary number of keywords to describe various resource types (Marlow, Naaman, Boyd, & Davis, 2006), which can be employed by other users to search for documents later on. In the social web, any web user can become a potential document indexer. This additional manpower can, at least theoretically, match the number of sites that can be automatically indexed. In addition, social tagging can apply human intellect to resources that are notoriously difficult to index automatically, as users can easily assign content-based keywords to images, videos and audio files. Moreover, certain platforms allow parallel tagging of one resource by several taggers, which creates different user perspectives on the same document (Blank et al., 2008).
Various attempts to define social tagging have been made: Barsky & Purdon (2006) describe tagging as a form of classification through tags or keywords, whereas Tonkin (2006) emphasizes the informal character of tagging by describing tags as free text with "unconstrained and arbitrary values". Voß (2007) describes tagging as a manual form of indexing and thus makes an explicit connection to the classic method of assigning keywords through professional indexers (e.g. librarians). Finally, Huang & Chuang (2009) analyze tagging as a form of communication in the tradition of Peirce's semiotics. Social tagging also differs from indexing by professionals and authors with respect to the motivation for tagging: Professional indexers and authors consciously and systematically index for others, whereas social taggers are not bound to any such conventions. The major motivations for social taggers presumably are information sharing (information should be discovered by others) and information management (information should be retrieved by the tagger at a later stage) (Heckner, Heilemann, & Wolff, 2009; Thom-Santelli, Muller, & Millen, 2008).

3.2. Direct usage of social tagging systems

Social tagging systems allow users to add any kind of web resource to an existing collection. Tags are typically assigned when the resources are added to the collection. The explicit act of adding a resource to a collection can be regarded as an implicit positive judgment of relevance. Social tagging can affect different resource types, including bookmarks (cf. Delicious), images (cf. Flickr), videos (cf. YouTube), presentations (cf. Slideshare) as well as different text types such as scientific papers (cf. Connotea) or social media content like blog posts (cf. Technorati). Typically, social tagging systems offer the following three approaches for search:

(1) Query based search through tags: Users transform their information need into query terms and the system compares these terms with the internal document representation, which at least partly consists of tags.

(2) Serendipitous findings in the collections of others: Social tagging systems often contain collections of items which can be allocated to a certain user. Since a resource which occurs in the collections of two different users also indicates some kind of shared interest, navigating the collections of other users can surface valuable resources which could not have been discovered otherwise.

(3) Navigation based search through tag clouds: Users can navigate through tag clouds, which visualize the information space indexed through tags by consolidating and unifying the most frequent terms in a single view (Hearst & Rosner, 2008; Sinclair & Cardew-Hall, 2008).
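A minimal Python sketch of approaches (1) and (3) shows how little machinery tag-based search and a tag cloud actually require; the tagged resources and the font-size mapping are made-up assumptions.

```python
from collections import Counter

# Hypothetical tag assignments: resource -> tags assigned by various users.
tagged = {
    "http://example.org/paper1": ["ir", "tagging", "folksonomy"],
    "http://example.org/paper2": ["ir", "evaluation"],
    "http://example.org/video1": ["tagging", "tutorial", "ir"],
}

def search_by_tag(tag):
    """(1) Query-based search: match the query term against assigned tags."""
    return [resource for resource, tags in tagged.items() if tag in tags]

def tag_cloud(max_font=40, min_font=10):
    """(3) Tag cloud: map tag frequency to a font size so that the most
    frequent terms visually dominate the single consolidated view."""
    counts = Counter(tag for tags in tagged.values() for tag in tags)
    peak = counts.most_common(1)[0][1]
    return {tag: min_font + (max_font - min_font) * n / peak
            for tag, n in counts.items()}

print(search_by_tag("ir"))  # all three resources carry the tag "ir"
print(tag_cloud())          # "ir" gets the largest font size
```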
Using the Connotea interface as an example, Figure 2 illustrates these three types of tag usage for social search.
Figure 2: Alternatives for search in social tagging systems exemplified with Connotea (image source: http://www.connotea.org).
Indexing by a large number of users of a social tagging system makes it possible to index collections that would otherwise be much too large to be handled by professional indexers (Golder, 2006; Marlow et al., 2006). Additionally, serendipitous discoveries can be made in the collections of other users (N. Ford, 2005). Social tagging systems potentially alleviate the vocabulary problem, because the representation of the document in the system and the representation of the user's information need are created by users with a shared set of previous knowledge; the gap between the vocabulary of a professional indexer and that of a user thus potentially becomes narrower.

3.3. Indirect usage of social tags as input for search algorithms

Apart from being used directly as search terms, tags can also be used as additional parameters for retrieval algorithms. Bao et al. (2007) propose two algorithms for optimizing web search: SocialSimRank and SocialPageRank. Hotho, Jäschke, Schmitz & Stumme (2006) propose a ranking algorithm (FolkRank) for optimizing search. Begelman, Keller & Smadja (2006) employ clustering techniques to optimize the user experience of a social tagging system, and Milicevic, Nanopoulos, & Ivanovic (2010) provide an overview of using tags in recommender systems. Aurnhammer, Hanappe & Steels (2006) combine visual properties of images and social tagging for image retrieval.
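These specific algorithms are beyond the scope of this overview, but the underlying idea of using tags as an additional ranking parameter can be sketched as follows; this is a simple hand-rolled blend of a content score with a tag-based popularity signal under assumed weights, not an implementation of SocialSimRank, SocialPageRank or FolkRank.

```python
def social_score(content_score, tag_matches, taggers, w_content=0.7, w_social=0.3):
    """Blend a classic content-based retrieval score with a social signal:
    how many of the document's tags match the query, scaled by how many
    distinct users annotated the document (a crude popularity proxy)."""
    social = tag_matches * (1 + taggers) ** 0.5
    return w_content * content_score + w_social * social

# Document B has a weaker text match, but many users tagged it with query terms,
# so the social signal lifts it above the purely text-based winner.
print(social_score(content_score=0.8, tag_matches=0, taggers=0))   # 0.56
print(social_score(content_score=0.5, tag_matches=2, taggers=24))  # 3.35
```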
Finally, systems like 50matches² directly search in social tagging systems (social powered search).

² http://www.50matches.com
4. Social question-answering

Question answering (QA) systems constitute a special kind of search engine, which most notably differs from other IR systems in how users can formulate their information need. QA systems have a long tradition which can be traced back to the era of command line interfaces, when question answering was proposed to enhance human computer interaction by enabling natural language communication with IR systems (Simmons, 1970). The basic constituents of any QA system that answers a user's information need on the basis of a collection of documents are a component for matching the natural language query to the internal document representation, a component for extracting relevant answers, and a processing component, i.e. an IR engine (Kwok, Etzioni, & Weld, 2001). To this day, the biggest advantage of any QA system is its intuitive handling: users can formulate their information needs in natural language, which means they are not forced to translate them into less intuitive descriptors and operators that can be understood by an automatic retrieval function. The results are also returned in natural language, and constitute an immediate answer to the user's question rather than a list of interesting or relevant websites.

4.1. Fundamentals of social QA

Social QA systems realize some of these components by means of human intelligence. Similar to Amazon's Mechanical Turk platform, the knowledge base of social QA systems may be called artificial artificial intelligence (Pontin, 2007). These systems are social because they mediate between an asking user and a user who might know the answer to that question, and provide features to interact with and to evaluate, rank or revise questions and answers. In social QA systems the user not only gets answers from real humans, but also has the chance to get in contact with like-minded users or experts in a certain field of interest. These contacts can add to the user's social circle even after a specific QA process has ended (Croft et al., 2010: 419-420), thus building a network of people which can be tapped to answer future questions in a similar field. Another crucial feature of social QA systems, implied by the possibility to formulate queries in natural language, is that people can ask even complex questions directly, without needing to translate them into manageable subtasks or sub-questions which can be processed and computed by a machine.
A basic problem of social QA systems is that there is no guarantee for the correctness of an answer. Social mechanisms, like rating those users who have given answers before, as well as the position of the answering person in a user's personal social circle, help to evaluate the quality and correctness of an answer. Another drawback of asynchronous QA systems is that one never knows when an answer will arrive, or whether one will get an answer at all. The biggest problem in open social QA systems lies in the poor quality of answers, which is often closely related to the bad quality of the actual questions. The fact that users can pose questions and formulate answers in natural language is not always beneficial, but actually promotes informal, off-topic dialogues that lack a neutral point of view and are often reminiscent of forum threads (Dearman & Truong, 2010). Agichtein, Castillo, Donato, Gionis, & Mishne (2008) try to address this problem by proposing a framework for identifying high quality content in social media, using the social QA system Yahoo! Answers as their test case. Social QA systems are numerous (cf. Table 3), but most of them can be characterized by three dimensions:

• temporal dimension: synchronous vs. asynchronous
• cost dimension: free vs. fee
• social dimension: community vs. experts³

Name             URL                         temporal dimension   cost dimension   social dimension
Yahoo Answers    http://answers.yahoo.com    asynchronous         free             open community
Amazon Askville  http://askville.amazon.com  asynchronous         free             open community
Wiki Answers     http://wiki.answers.com     asynchronous         free             open community
Google Answers   http://answers.google.com   asynchronous         fee              preselected experts
UClue            http://uclue.com            asynchronous         fee              preselected experts
Aardvark         http://vark.com             asynchronous         free             preselected experts (from the user's social graph)
Ether            http://www.ether.com        synchronous          fee              self-proclaimed experts

Table 3: Examples for social question answering systems
The social dimension in particular makes existing social QA systems distinguishable. We will describe community-based systems and expert-based (but socially mediated) systems in some more detail.
³ In an open community anybody can ask and give answers. In semi-open communities only people from the user's closer social environment can answer questions. In expert-based systems self-proclaimed or preselected experts (similar to paid editors) can answer questions.
4.2. Community-based systems

One of the biggest and most prominent social QA systems is Yahoo! Answers (Arrington, 2006). It allows registered users to pose questions and give answers to other users' questions. Users can search for answered or unanswered questions via full-text search or by browsing different question categories. They may answer open questions themselves, or rate and comment on questions and existing answers on a meta-level. Some social QA systems like Wiki Answers not only allow commenting on given answers, but also editing and revising answers in a wiki-like manner (users don't even have to register with the service). Building on the idea of the wisdom of the crowd (Surowiecki, 2007), such social QA systems guarantee a maximum of social interaction, but also suffer from the corresponding problems known from Wikipedia, such as vandalism and edit wars (Viégas, Wattenberg, & Dave, 2004; Wilkinson & Huberman, 2007). Many free and open social QA systems try to counter such quality problems with internal motivation systems in which users are promoted to different levels according to their achievements. Yahoo! Answers even has a kind of currency system: users can earn points when they answer questions or rate other users' answers, and have to pay with some of their points when they want to pose a question. A high number of points also brings special privileges, such as being allowed to post more comments or pose more questions. Systems like Amazon's Askville try to activate and motivate their users with a sophisticated achievements system which shows elements of game experience design as known from MMORPGs (Yee, 2005). In Askville, users can get rewards for all kinds of tasks, e.g. for posing over 100 questions with at least two answers each. Additionally, as a kind of community-building social glue, users can give and receive compliments. Recently, this type of information seeking – posing natural language questions to other humans over some QA platform – has also emerged in social networks like Facebook and Twitter, where users post questions as status updates and get answers from the community via the comment function. This kind of QA system is semi-open, as only users from a person's social graph can read and answer the question.

4.3. Expert-based systems

In addition to the many free and open social QA systems, there are numerous systems that are available for a fee, which in most cases implies a closed community of answering persons who have some kind of expertise in a specific field. Experts are either preselected by the operator of the QA system, or they can register as self-proclaimed experts with the system. As people are paid to answer a question, the quality of answers is for the most part significantly higher than in open systems.
This holds true for the quality of questions, too. A basic challenge is that questioners obviously want to assess whether an expert is really competent in a certain field. For both preselected and self-proclaimed experts, questioners can consult an expert's previous answers, or the ratings given to an expert by other questioners. Therefore, an expert-based QA system has to provide a history of answered questions and/or a method to evaluate and rate experts. Another basic distinction for expert-based systems can be made along the temporal dimension. Google Answers (which is no longer answering new questions, but still provides an archive of answered questions) and UClue are two examples of asynchronous systems, where a person posts a question in a specific category and an expert takes some time to deliver an appropriate answer. Another variant of expert-based systems, operating in real-time and therefore enabling synchronous communication, are so-called call-an-expert services (Del Conte, 2006). Platforms like Ether allow users to present themselves as experts in a certain field. Questioners can either search for those experts directly, or browse through a topic directory – similar to web directories like the open directory project (DMOZ) – to find the appropriate expert for their question. The main difference between such systems and social QA services like Yahoo! Answers or Google Answers lies in the mode of interaction, which is live and allows spoken, natural language (via Skype or telephone). An advantage is that questioners can give immediate relevance feedback and reformulate their questions if necessary while speaking to the answering expert. Additionally, many such platforms allow the experts to provide (and bill) additional digital material which can be used to answer a question in more detail. As mentioned before, this kind of social QA resembles a meta people search, and thus poses a special case of our previous social search definition: In such systems users search for social content (i.e. a human expert), but also search socially, as they interact with the detected experts and rate their answers, which in turn constitutes an indirect collaboration with other users who are searching for experts. Zhang & Ackerman (2005) note that this kind of expertise search does not necessarily require a social QA system, but can also be observed in social networks. Aardvark, which was acquired by Google in 2010, is another system that mediates between questioners and experts, but does so automatically. A striking feature of Aardvark is that it identifies experts ad hoc from the user's social graph, based on the user's questions. The basic functionality of Aardvark is described by Horowitz & Kamvar (2010): In order to find appropriate experts from a user's social graph, which is derived from social media activities such as the user's participation on Facebook, the system does not search for an answer, but for a person who might be able to give an answer.
Thus, Aardvark does not index documents, but people (social crawling), and tries to assign areas of expertise (so-called topics) to these people. A topic parser tries to identify topics in structured data fields, e.g. an expert's interests and activities on his Facebook profile, while a topic extractor tries to identify topics in unstructured text data, e.g. social media content created by the user (in his blog or on his Facebook wall). In addition to topic information, Aardvark stores other users' ratings of an expert's actual topic competence, called the score: An expert's score is high if he gives many good answers, and low if he gives few and/or bad answers.
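Horowitz & Kamvar (2010) describe Aardvark's actual ranking in detail; the following is merely a hypothetical illustration of such a score, combining a smoothed fraction of positively rated answers with an activity weight, with all constants chosen for the example.

```python
def expert_score(ratings, prior_good=1, prior_bad=1):
    """Hypothetical expert score: a smoothed fraction of positively rated
    answers, weighted by how many answers were given at all.
    `ratings` is a list of booleans (True = answer rated as good)."""
    good = sum(ratings) + prior_good
    bad = len(ratings) - sum(ratings) + prior_bad
    quality = good / (good + bad)                   # many good answers -> high
    activity = len(ratings) / (len(ratings) + 10)   # few answers -> low
    return quality * activity

print(expert_score([True] * 20 + [False] * 2))  # active, mostly good -> ~0.60
print(expert_score([False, False, True]))       # few and mostly bad  -> ~0.09
```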
5. Collaborative search

Collaborative search can be seen as a subset of social search with the most explicit form of cooperation, where users "share an information need and [...] actively work together to fulfill that need" (Morris & Teevan, 2010: 2). Collaboration always indicates some kind of direct cooperation between users, whereas social search may take looser and more indirect forms of cooperation during the search process. Collaborative searchers often know each other and have a specific goal they want to achieve together. Collaborative search is best suited for complex search processes which can be divided amongst several individual searchers. Therefore, a collaborative search system has to provide mechanisms to coordinate the combined search of different users. Van Setten & Moelaert-El Hadidy (2000) identify an improved group understanding of a complex search process and the division of labor as the main advantages of collaborative search. Collaborative search can be further distinguished into collaborative querying and collaborative browsing. Collaborative querying supports the information seeking process by sharing other users' search experiences (expressed via queries) and helping users to reformulate those queries for their own needs (Fu, Ciszek, Marchionini, & Solomon, 2006). Collaborative browsing can take various forms, such as "searching for specific information, exploring previously unexplored territory to see what's interesting, or some combination of the two" (Lieberman, Van Dyke, & Vivacqua, 1999). A third form of collaborative search may be seen in collaborative filtering, which is – due to its highly implicit nature (Golovchinsky et al., 2009) – treated as a social search genre of its own. One challenge for collaborative search systems is an adequate visualization of other users' searches. It is necessary to present relevant search paths to fellow searchers in order to avoid redundant searches, and to allow them to join the search somewhere along the way, or to branch off to other sub-paths of the given search route. Another requirement for collaborative systems is the availability of commenting and rating mechanisms as a means to communicate relevance assessments between the collaborators.
Collaborative systems can be classified using the time-space taxonomy that has been proposed for early groupware systems (Ellis, Gibbs, & Rein, 1991). According to this taxonomy, a collaborative search can be performed by several persons at one shared workplace (co-located search), or from distributed workplaces (remote search) connected via the Internet. Considering the temporal dimension, collaborative search can happen in real-time (synchronous search) or with some kind of delay (asynchronous search). A good example of co-located collaboration can be found in the CoSearch system (Amershi & Morris, 2008), where one user acts as the so-called driver, who heads the search group and starts the search process on a collectively used monitor. All other searchers take the role of so-called observers, who can affect the driver's search efforts via input devices such as multiple computer mice or smartphones. Observers may click on relevant results of the driver's query, forwarding them to a waiting queue which is worked off successively by the driver, or they may formulate new queries, which are collected in a query queue and can be executed later by the driver. Additionally, results and queries can be annotated and commented on. Although not all users have equal rights in CoSearch, they can search for information synchronously. SearchTogether is another system for collaborative search, which enables remote search on distributed computers (Morris & Horvitz, 2007). In this system all queries are visible to all collaborating users and may be commented on or even edited. Remote searches, where users work from different workplaces, often allow asynchronous collaboration, which implies the need for a mechanism to persistently store and document the joint search efforts for the duration of the search project (Croft et al., 2010: 426) and eases the entry of collaborators who join the search project at a later point in time. The biggest problem of collaborative search systems lies in the absolute transparency of each user's actions during the process of collaboration. By sharing personal queries and relevance rankings directly and visibly with other users, collaborative search systems are faced with privacy and trust issues (Burghardt, Buchmann, Boehm, & Clifton, 2009), as users might hesitate to make a supposedly silly, naive or redundant query or comment.
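The driver/observer coordination described above for CoSearch can be illustrated with a small data-structure sketch; the class below is a hypothetical reconstruction with invented names (suggest_query, flag_result, etc.), not the actual CoSearch implementation.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class CoSearchSession:
    """Hypothetical driver/observer coordination in the spirit of CoSearch:
    observers enqueue queries and promising results; the driver works
    through both queues on the shared screen."""
    query_queue: deque = field(default_factory=deque)
    result_queue: deque = field(default_factory=deque)
    comments: dict = field(default_factory=dict)

    def suggest_query(self, observer, query):
        self.query_queue.append((observer, query))

    def flag_result(self, observer, url):
        self.result_queue.append((observer, url))

    def annotate(self, item, text):
        self.comments.setdefault(item, []).append(text)

    def next_for_driver(self):
        # Flagged results take priority over queued queries (an assumption).
        if self.result_queue:
            return ("open", self.result_queue.popleft())
        if self.query_queue:
            return ("search", self.query_queue.popleft())
        return None

session = CoSearchSession()
session.suggest_query("observer1", "social search evaluation")
session.flag_result("observer2", "http://example.org/survey")
session.annotate("http://example.org/survey", "looks relevant for our report")
print(session.next_for_driver())  # the driver opens the flagged result first
```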
6. Collaborative filtering and recommender services

Baeza-Yates & Ribeiro-Neto (1999: 21) make a distinction between two basic approaches towards information retrieval: On the one hand there is browsing as a way of interacting with information; on the other hand there is the query-based model of retrieval, with ad hoc retrieval and information filtering as further subtypes.
In the framework of ad hoc retrieval, the user continuously formulates queries against an IR system with a more or less static document collection. Filtering, on the other hand, describes IR processes where documents are dynamically added to the collection while others may be removed, and the same query (or filter) is constantly matched against the changing collection. The query may be formulated as a user profile used for selecting relevant items from the stream of documents being filtered ("selective dissemination of information", Callan, 1996: 262). To create a basic (search) profile, the user has to specify his information need using keywords as precisely as possible. By giving relevance judgments on selected documents, this profile may successively be refined during the filtering process. Ideally, the profile will become stable after some iterations of the relevance feedback loop (Baeza-Yates & Ribeiro-Neto, 1999: 22-23). Although document filtering can be seen as an example of personalized IR, its main aspect is the matching between documents and a single user profile. Based on the assumption that users with similar profiles – or similar information needs – tend to judge the same documents as relevant, collaborative filtering (Su & Khoshgoftaar, 2009) extends document filtering by analyzing the relevance judgments of different users. By adding the degree of similarity between different profiles to the search process, document filtering is enhanced by a social or collaborative dimension (Dumais et al., 2000): Users are presented documents which are judged as relevant based on other users' profiles and which would not have been selected using the individual user's profile alone (Croft et al., 2010: 436). Collaborative filtering is typically used for implementing recommendation services which suggest potentially relevant items that cannot be found with the original user's query. Relevant items are selected automatically using the judgments of "similar" users. The key tasks of a collaborative filtering system are the identification of similar user profiles and the generation of helpful recommendations from the differences between otherwise similar profiles. A central problem in collaborative filtering is the users' lack of willingness to give explicit relevance judgments. More implicit ratings work fine, though, such as buying decisions on e-commerce platforms like Amazon (Linden, Smith, & York, 2003): An article bought by a user can typically be judged as relevant with respect to the user's information need. In the case of text documents, implicit relevance criteria can be reading duration or the downloading or printing of a document (Ferber, 2003). Another typical problem of recommendation systems is the cold start or new item / new user problem (Adomavicius & Tuzhilin, 2005): New documents cannot be recommended without ratings, and new users get few recommendations, as too little is known about them and their profile is not very specific yet.
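A minimal user-based collaborative filtering sketch illustrates the two key tasks named above – finding similar profiles and deriving recommendations from their differences; the rating data, the cosine similarity measure and the neighbourhood size are illustrative choices, not prescriptions.

```python
import numpy as np

def recommend(ratings, user, k=2):
    """User-based collaborative filtering sketch: find the k profiles most
    similar to `user` (cosine similarity over item ratings) and recommend
    items those neighbours rated but the user has not seen yet."""
    items = sorted({item for profile in ratings.values() for item in profile})
    def vec(profile):
        return np.array([profile.get(item, 0.0) for item in items])
    u = vec(ratings[user])
    sims = {}
    for other, profile in ratings.items():
        if other == user:
            continue
        v = vec(profile)
        sims[other] = float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    neighbours = sorted(sims, key=sims.get, reverse=True)[:k]
    scores = {}
    for n in neighbours:
        for item, rating in ratings[n].items():
            if item not in ratings[user]:
                # Weight each neighbour's judgment by profile similarity.
                scores[item] = scores.get(item, 0.0) + sims[n] * rating
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical relevance judgments (ratings 1-5): bob's profile closely
# resembles alice's, so his judgment of doc3 carries the most weight.
ratings = {
    "alice": {"doc1": 5, "doc2": 4},
    "bob":   {"doc1": 5, "doc2": 4, "doc3": 5},
    "carol": {"doc2": 1, "doc4": 5},
}
print(recommend(ratings, "alice"))  # doc3 ranks above doc4
```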
Table 4 gives some examples of web-based recommendation systems which make use of collaborative filtering. These services may be treated as vertical search engines, as they recommend collaboratively filtered results for different subareas, e.g. a library or an e-commerce platform.
Name       URL                       Type of recommendation
Amazon     http://www.amazon.de      products (books, software etc.)
BibTip     http://www.bibtip.com     books
Hunch      http://hunch.com/         generic recommendations
Last.fm    http://www.lastfm.de      music
Mendeley   http://www.mendeley.com   scientific articles / research trends
MovieLens  http://www.movielens.org  movies

Table 4: Examples for recommender services based on collaborative filtering.
Alongside these, we find systems like Hunch, which pursue a more comprehensive filtering approach in order to build a so-called taste graph. In the long run, this graph is meant to represent all objects and people on the Internet together with their specific connections (edges of the graph), e.g. person A likes object B (K. Ford, 2010). The taste graph can also be exported and mapped onto users' individual social graphs, which allows customized applications such as a recommender service for gifts and presents for a user's friends (Schonfeld, 2010). Hunch also offers an interesting approach to alleviating the cold-start problem for new users: When users register with Hunch, they are asked to answer 20 (voluntarily even more) questions about their personal interests and tastes (cf. Figure 3), allowing a basic comparison with existing profiles.
Figure 3: "TeachHunchAboutYou" – exemplary questions for the definition and refinement of user profiles (image source: http://hunch.com/).
Hunch can either be browsed via existing categories or queried with a specific search term. Categories, which are called topics in Hunch, can be created and edited socially. Any suggestion for a new category is first discussed by the community in a topic workshop. For each topic, users can define restrictive questions that allow later users to focus their search within a topic.⁵
The collaboratively filtered results can be ranked (pro and con) or commented on by the users, to optimize future result lists for similar searches. As in many social QA systems, Hunch provides a sophisticated awards system of points and medals to keep its users motivated. By matching user profiles with regard to their specific interests, collaborative filtering extends the social dimension from "friends and acquaintances" to "like-minded people", who share the same interests and attitudes but don't necessarily know each other (Anderson, 2005). The downside of this approach is its lack of transparency, i.e. users often don't know why they get certain recommendations, or with whom they implicitly collaborate. Although collaborative filtering is a rather implicit form of collaboration during the search process, such systems require users to define detailed and specific profiles, which is best achieved by users explicitly refining their profiles (cf. the TeachHunchAboutYou approach).

⁵ Users searching in the category "new cars" could e.g. be asked whether they prefer "speedy sports cars", "roomy family cars" or "something else" (cf. the Hunch style guide: http://hunch.com/info/style-guide/).
7. Personalized social search

Although at least location-based selection of retrieval results has become a widespread feature, many algorithmic web search engines still don't offer personalized search results, i.e. every user gets the same list of results for the same query. In contrast, a personalized search engine has the ability to produce individual results for different users and different contexts (Dalal, 2007). In order to consider a specific user and his context for personalized search results, such a system needs a persistent user profile which stores information about personal preferences, search history and other contextual aspects. Social search engines that enable personalized results can be described as "systems that consider the behavior of other users of the system when generating search results and recommendations" (Keenoy & Levene, 2005), which means they consider not only the user's own profile, but also the profiles of other users in the actual searcher's social graph. Personalized social search engines (cf. Table 5) like Rollyo and Eurekster may also be labeled programmable search engines (Marchiori, 2007), as they build on Yahoo's web search and allow users to define individual sub-search engines which only search in preselected document collections (i.e., websites). The approach assumes that different users have expertise in different domains, and thus know the sources that are more likely to contain potential results for a query in a specific field.
A web designer, for instance, might know some insider blogs on design and web programming, a monthly online journal with a huge digital archive of freely available articles, and some sites for the official documentation of recent web standards – all queries from the domain of web design have a good chance of finding relevant results and answers in this personalized document collection. The immediate effect of searching only in interesting, high quality sites is an increased precision of the result set. At the same time, recall decreases considerably, as only a fraction of the existing websites is searched. The social aspect of these personalized search engines, which Rollyo for instance calls searchrolls, lies in the possibility to share specific sub-search engines with other users by tagging and classifying them. In this way, which can be seen as a special application of social tagging, users jointly build a classification of personalized search engines, which can be browsed or searched by other users. If users find a predefined personal search engine, they can modify it and build a personalized version of it for themselves, e.g. by dropping some sites the original creator had considered relevant and adding some pages which are of personal relevance for themselves. Eurekster implements another social component by allowing users to rate and comment on others' personalized search engines (called Swickis), so that other users can assess the usefulness of a specific personalized search engine. Additionally, Eurekster aggregates and visualizes the most frequent queries in a tag cloud, allowing spontaneous discovery and serendipity effects for its users. Blekko, another example of a personalized social search engine, allows defining restrictions of the search area and saving them in so-called slashtags. These slashtags can be added to a query as a suffix, enabling the combination of different users' personalized slashtags for a specific information need (cf. Figure 4).
Figure 4: Exemplary search for "global warming". Blekko only searches websites which are defined in the slashtags /tech and /date (image source: http://blekko.com/).
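A small sketch of the slashtag idea illustrates both the domain restriction and its effect on precision and recall; the slashtag definitions, result URLs and relevance judgments are invented, and this is not Blekko's actual implementation.

```python
from urllib.parse import urlparse

# Hypothetical slashtag definitions: each tag maps to a hand-picked site list.
slashtags = {
    "/tech": {"arstechnica.com", "wired.com", "slashdot.org"},
    "/health": {"nih.gov", "who.int"},
}

def apply_slashtags(results, tags):
    """Keep only results whose host belongs to one of the selected
    slashtags - the vertical-search restriction described above."""
    allowed = set().union(*(slashtags[t] for t in tags))
    return [url for url in results if urlparse(url).netloc in allowed]

results = [
    "http://wired.com/global-warming-sensors",
    "http://randomblog.example.com/global-warming",
    "http://nih.gov/heat-stress",
]
filtered = apply_slashtags(results, ["/tech"])
print(filtered)  # only the wired.com hit survives

# The precision/recall trade-off in two lines: assume 2 of the 3 original
# results were actually relevant, and 1 relevant one survived the filter.
relevant = {"http://wired.com/global-warming-sensors", "http://nih.gov/heat-stress"}
precision = len(set(filtered) & relevant) / len(filtered)  # 1.0
recall = len(set(filtered) & relevant) / len(relevant)     # 0.5
print(precision, recall)
```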
Recently, Google and Yahoo have started to provide personalization features of their own (Google Co-op and Yahoo Search BOSS), expanding the infrastructure for a social usage of personalized search engines.

Name               URL
Rollyo             http://www.rollyo.com
Eurekster          http://www.eurekster.com
Blekko             http://blekko.com
Google Co-op       http://www.google.com/cse/
Yahoo Search BOSS  http://developer.yahoo.com/search/boss/

Table 5: Examples for personalized social search engines
The advantage of personalized social search engines certainly lies in the intelligent human preselection and narrowing of the search area, which not only increases precision but also takes into account criteria like topicality and provenance of the source of information. This advantage at the same time implies that personalized search engines have to be maintained manually, which can become problematic if their initial creators lose interest over time. The low recall of personalized searches can be disadvantageous too, as users of such systems constantly run the risk of missing relevant information from sources which are not defined in the vertical search setups. Putting all this together, personalized social search engines are particularly useful for quick, non-exhaustive searches, but have a serious drawback when it comes to exhaustive research aiming at high recall.
8. Outlook

In this chapter we have discussed the concept of social search and its context, as well as different genres of social search engines, each with its own approach to enhancing search with a social dimension. Social search engines have the potential to overcome traditional IR challenges, such as the vocabulary problem, a realistic implementation of relevance feedback, or the personalization and customization of search. Although there are sound arguments for considering and utilizing a user's social context during the search process, the actual benefits of such systems over algorithmic search engines like Google have so far been evaluated only sporadically (Lewandowski & Maaß, 2008). Results of such evaluation work could bring about best practices for how and when to use different types of social search engines, and how to effectively combine them with each other as well as with algorithmic search engines. While there is a lack of social search evaluation, the latest social search efforts of Google ("social search is the future" (Sherrets & Mayer, 2008)) can be interpreted as an indicator of the relevance of the concept (cf. Table 6). Besides the acquisition of the social QA system Aardvark, Google provides a browser toolbar called Google Sidewiki which allows its users to annotate and share websites directly in the browser. Another approach, which aims at integrating Google's algorithmic search with social interaction on the search results, can be found in Google SearchWiki: Users can rearrange documents from the Google list of results, delete irrelevant documents or add new documents that weren't found by the algorithmic search.
SearchWiki is social because users can look at other users' search wikis for a specific query, i.e. if a user searches for topic X and knows that user Y is an expert in this field, he might want to use that user's search wiki for his own query. With Google Social Search the company presents a comprehensive approach to integrating results from different social graphs (e.g. graphs from Twitter, Friendfeed, etc.). Recently, Google announced the +1 button, which takes after Facebook's like button: both allow ubiquitous tagging of resources all across the web and integrate them into the user's social graph.

Name                  URL
Aardvark              http://vark.com/
Google Sidewiki       http://www.google.com/support/toolbar/bin/static.py?page=guide.cs&guide=24296
Google SearchWiki     http://googleblog.blogspot.com/2008/11/searchwiki-make-search-your-own.html
Google Social Search  http://googleblog.blogspot.com/2009/10/introducing-google-social-search-i.html
Google +1 button      http://www.google.com/+1/button/

Table 6: Google's efforts in the social search field.
Google's efforts point towards an understanding of social search in which the concept will probably not replace traditional search engines, but rather complement them. Concepts like the Facebook like button promote a rapid growth of a user's social graph, which causes a blurred view of one's social contacts and a loss of overview and control. Often the social graph of a user in a specific social service is not in accordance with the user's social context in real life. Consequently, a basic problem of existing social search engines can be seen in trust issues with relevance rankings that do not stem from a user's immediate social context. The possibility of human relevance ranking not only produces highly subjective results, but also brings about the risk of deliberate manipulation of search results, which is in most cases commercially motivated. The biggest drawback of social search may be seen in its limited document collection, causing high precision but low recall. This means that extensive searches for a maximum number of relevant documents will still depend on content-based, algorithmic search in the near future; social search can, at this stage, be seen as a complementary means to perform deliberately vertical searches, where precision is more important than recall.
9. References

Adomavicius, G., & Tuzhilin, A. (2005). Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering, 17(6), 734-749.

Agichtein, E., Castillo, C., Donato, D., Gionis, A., & Mishne, G. (2008). Finding high-quality content in social media. Proceedings of the international conference on Web search and web data mining - WSDM '08 (pp. 183-194). New York: ACM Press.

Ahn, J.-W., Brusilovsky, P., & Farzan, R. (2005). Investigating users' needs and behavior for social search. Proceedings of the workshop on new technologies for personalized information access (part of the 10th international conference on user modelling (UM'05)), Edinburgh, Scotland, UK (pp. 1-12).

Amershi, S., & Morris, M. R. (2008). CoSearch: a system for co-located collaborative web search. Proceedings of the twenty-sixth annual SIGCHI conference on Human factors in computing systems (pp. 1647-1656). ACM.

Anderson, C. (2005). Why Social Software Makes for Poor Recommendations. Wired (Blog Network). Retrieved from http://longtail.typepad.com/the_long_tail/2005/02/why_social_netw.html.

Arrington, M. (2006). Yahoo's Big Win. TechCrunch. Retrieved May 3, 2011, from http://techcrunch.com/2006/11/30/yahoos-big-win/.

Aurnhammer, M., Hanappe, P., & Steels, L. (2006). Integrating collaborative tagging and emergent semantics for image retrieval. Proceedings WWW 2006, Collaborative Web Tagging Workshop.

Baeza-Yates, R., & Ribeiro-Neto, B. (1999). Modern Information Retrieval. Harlow: Addison-Wesley.

Bao, S., Wu, X., Fei, B., Xue, G., Su, Z., & Yu, Y. (2007). Optimizing web search using social annotations. Proceedings of the 16th international conference on World Wide Web - WWW '07 (pp. 501-510). New York: ACM Press.

Barsky, E., & Purdon, M. (2006). Introducing Web 2.0: Social networking and social bookmarking for health librarians. Journal of the Canadian Health Libraries Association, 27, 65-67.

Begelman, G., Keller, P., & Smadja, F. (2006). Automated tag clustering: Improving search and exploration in the tag space. Collaborative Web Tagging Workshop at WWW2006 (pp. 15-33).

Belkin, N. J., Oddy, R. N., & Brooks, H. M. (1982). ASK for Information Retrieval: Part II. Results of a Design Study. Journal of Documentation, 38(3), 145-164.

Blank, M., Bopp, T., Hampel, T., & Schulte, J. (2008). Social Tagging = Soziale Suche? In B. Gaiser, T. Hampel, & S. Panke (Eds.), Good Tags - Bad Tags. Social Tagging in der Wissensorganisation (pp. 85-97). Münster: Waxmann.

Brickley, D., & Miller, L. (2010). FOAF Vocabulary Specification 0.98. Namespace Document 9 August 2010 - Marco Polo Edition. Retrieved May 25, 2011, from http://xmlns.com/foaf/spec/20100809.html.
Brin, S., & Page, L. (1998). The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems, 30(1-7), 107-117.
Burghardt, T., Buchmann, E., Boehm, K., & Clifton, C. (2009). Collaborative Search and User Privacy: How Can They Be Reconciled? Collaborative Computing: Networking, Applications and Worksharing (pp. 85-99). Springer.
Callan, J. (1996). Document Filtering with Inference Networks. SIGIR '96: Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval (pp. 262-269).
Choudhury, M. de, Sundaram, H., John, A., & Seligmann, D. D. (2010). Analyzing the Dynamics of Communication in Online Social Networks. In B. Furht (Ed.), Handbook of Social Network Technologies and Applications (pp. 59-94). New York: Springer.
Croft, B., Metzler, D., & Strohman, T. (2010). Search Engines: Information Retrieval in Practice. Addison-Wesley.
Dalal, M. (2007). Personalized social & real-time collaborative search. Proceedings of the 16th international conference on World Wide Web - WWW '07 (p. 1285). New York, NY, USA: ACM Press.
Deans, P. C. (2008). Social Software and Web 2.0 Technology Trends. Retrieved June 15, 2011, from http://portal.acm.org/citation.cfm?id=1502262.
Dearman, D., & Truong, K. N. (2010). Why users of Yahoo! Answers do not answer questions. Proceedings of the 28th international conference on Human factors in computing systems (pp. 329-332). ACM.
Del Conte, N. (2006). BitWine Gives Access To Those In The Know. TechCrunch. Retrieved June 2, 2011, from http://techcrunch.com/2006/11/28/bitwine-gives-acces-to-those-in-the-know/.
Dumais, S., Grudin, J., Poltrock, S., Bruce, H., Fidel, R., & Pejtersen, A. M. (2000). Collaborative information retrieval (CIR). CHI '00 extended abstracts on Human factors in computing systems - CHI '00. New York, NY, USA: ACM Press.
Ellis, C. A., Gibbs, S. J., & Rein, G. (1991). Groupware: Some issues and experiences. Communications of the ACM, 34(1), 39-58.
Evans, B. M., & Chi, E. H. (2008). Towards a model of understanding social search. Proceedings of the ACM 2008 conference on Computer supported cooperative work - CSCW '08. New York, NY, USA: ACM Press.
Ferber, R. (2003). Information Retrieval. Suchmodelle und Data-Mining-Verfahren für Textsammlungen und das Web. Heidelberg: dpunkt.
Ford, K. (2010). Hunch's "taste graph" now exceeds 10 billion connections. HunchBlog. Retrieved May 12, 2011, from http://blog.hunch.com/?p=20404.
Ford, N. (2005). New cognitive directions. In A. Spink & C. Cole (Eds.), New directions in cognitive information retrieval (pp. 81-98). Dordrecht: Springer.
Fu, X., Ciszek, T., Marchionini, G., & Solomon, P. (2006). Annotating the Web: An exploratory study of Web users' needs for personal annotation tools. Proceedings of the American Society for Information Science and Technology, 42(1).
Furnas, G. W., Landauer, T. K., Gomez, L. M., & Dumais, S. T. (1987). The vocabulary problem in human-system communication. Communications of the ACM, 30(11), 964-971.
Golder, S. A. (2006). Usage patterns of collaborative tagging systems. Journal of Information Science, 32(2), 198-208.
Golovchinsky, G., Pickens, J., & Back, M. (2009). A taxonomy of collaboration in online information seeking. 1st International Workshop on Collaborative Information Seeking, 1-3.
Griesbaum, J., Bekavac, B., & Rittberger, M. (2009). Typologie der Suchdienste im Internet. In D. Lewandowski (Ed.), Handbuch Internet-Suchmaschinen. Nutzerorientierung in Wissenschaft und Praxis (pp. 18-52). Heidelberg: AKA Verlag.
Hayhurst, C., & Weston, M. R. (2007). Jerry Yang and David Filo (Internet Career Biographies). Rosen Publishing Group.
Hearst, M., & Rosner, D. (2008). Tag Clouds: Data Analysis Tool or Social Signaller? Proceedings of the 41st Annual Hawaii International Conference on System Sciences (p. 160). IEEE Computer Society.
Heckner, M., Heilemann, M., & Wolff, C. (2009). Personal information management vs. resource sharing: Towards a model of information behaviour in social tagging systems. Int'l AAAI Conference on Weblogs and Social Media (ICWSM) (pp. 42-49).
Horowitz, D., & Kamvar, S. D. (2010). The anatomy of a large-scale social search engine. Proceedings of the 19th international conference on World Wide Web - WWW '10 (p. 431). New York, NY, USA: ACM Press.
Hotho, A., Jäschke, R., Schmitz, C., & Stumme, G. (2006). Information retrieval in folksonomies: Search and ranking. The Semantic Web: Research and Applications (LNCS 4011), 411-426. Springer.
Huang, A. W.-C., & Chuang, T.-R. (2009). Social tagging, online communication, and Peircean semiotics: A conceptual framework. Journal of Information Science, 35(3), 340-357.
Ingwersen, P. (1996). Cognitive perspectives of information retrieval interaction: Elements of a cognitive IR theory. Journal of Documentation, 52(1), 3-50.
Ingwersen, P., & Järvelin, K. (2010). The Turn: Integration of Information Seeking and Retrieval in Context (The Information Retrieval Series). Springer.
Keenoy, K., & Levene, M. (2005). Personalisation of web search. In B. Mobasher & S. Anand (Eds.), Intelligent Techniques for Web Personalization (pp. 201-228). Berlin/Heidelberg: Springer.
Kwok, C., Etzioni, O., & Weld, D. S. (2001). Scaling question answering to the Web. ACM Transactions on Information Systems (TOIS), 19(3), 242-262.
Lancaster, F. W. (2003). Indexing and Abstracting in Theory and Practice. Facet Publishing.
Lewandowski, D. (2009). Wie Suchmaschinen von Social Software profitieren. Proceedings des Workshops "Social Software @ Work" (Vol. 5, pp. 1-6). Düsseldorf.
Lewandowski, D., & Maaß, C. (Eds.). (2008). Web-2.0-Dienste als Ergänzung zu algorithmischen Suchmaschinen. Berlin: Logos-Verlag.
Lieberman, H., Van Dyke, N. W., & Vivacqua, A. S. (1999). Let's browse: A collaborative browsing agent. Knowledge-Based Systems, 12(8), 427-431.
Linden, G., Smith, B., & York, J. (2003). Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing, 7(1), 76-80.
Marchiori, M. (2007). Social search engines. International Journal of Bifurcation and Chaos, 17(7), 2355-2361. World Scientific Publishing Company.
Marlow, C., Naaman, M., Boyd, D., & Davis, M. (2006). HT06, tagging paper, taxonomy, Flickr, academic article, to read. Proceedings of the seventeenth conference on Hypertext and hypermedia - HYPERTEXT '06. New York, NY, USA: ACM Press.
Mathes, A. (2004). Folksonomies - Cooperative Classification and Communication Through Shared Metadata. Retrieved June 11, 2011, from http://www.adammathes.com/academic/computer-mediated-communication/folksonomies.html.
McDonnell, M., & Shiri, A. (2011). Social search: A taxonomy of, and a user-centred approach to, social web search. Program: Electronic Library and Information Systems, 45(1), 6-28.
Milgram, S. (1967). The Small World Problem. Psychology Today, 1(1), 60-67.
Milicevic, A. K., Nanopoulos, A., & Ivanovic, M. (2010). Social tagging in recommender systems: A survey of the state-of-the-art and possible extensions. Artificial Intelligence Review, 33(3), 187-209. Springer Netherlands.
Morris, M. R., & Horvitz, E. (2007). SearchTogether: An interface for collaborative web search. Proceedings of the 20th annual ACM symposium on User interface software and technology (pp. 3-12). ACM.
Morris, M. R., & Teevan, J. (2010). Collaborative Search: Who, What, Where, When, Why, and How. Morgan and Claypool Publishers.
Pontin, J. (2007). Artificial Intelligence, With Help From the Humans. The New York Times (online edition).
Rocchio, J. (1971). Relevance Feedback in Information Retrieval. In G. Salton (Ed.), SMART Retrieval System. Experiments in Automatic Document Processing (pp. 313-323). Englewood Cliffs, NJ: Prentice-Hall.
Schonfeld, E. (2010). The Hunch Gift-O-Matic Churns Out Holiday Gift Ideas For Your Twitter Pals. TechCrunch. Retrieved from http://techcrunch.com/2010/12/10/hunch-gift-o-matic/.
Setten, M. van, & Moelaert-El Hadidy, F. (2000). Collaborative Search and Retrieval: Finding Information Together. Submitted to ACM CSCW.
Sherman, C. (2006). What's the Big Deal With Social Search? Search Engine Watch. Retrieved May 22, 2011, from http://searchenginewatch.com/article/2068090/Whats-the-Big-Deal-With-Social-Search.
Sherrets, D., & Mayer, M. (2008). Google's Marissa Mayer: Social search is the future. VentureBeat. Retrieved December 1, 2010, from http://venturebeat.com/2008/01/31/googles-marissa-mayer-social-search-is-the-future/.
Simmons, R. F. (1970). Natural language question-answering systems: 1969. Communications of the ACM, 13(1), 15-30.
Sinclair, J., & Cardew-Hall, M. (2008). The folksonomy tag cloud: When is it useful? Journal of Information Science, 34(1), 15-29.
Su, X., & Khoshgoftaar, T. (2009). A Survey of Collaborative Filtering Techniques. Advances in Artificial Intelligence, 2009, 1-20.
Surowiecki, J. (2007). Die Weisheit der Vielen. Goldmann Verlag.
Thom-Santelli, J., Muller, M. J., & Millen, D. R. (2008). Social tagging roles. Proceeding of the twenty-sixth annual CHI conference on Human factors in computing systems - CHI '08. New York, NY, USA: ACM Press.
Tonkin, E. (2006). Searching the long tail: Hidden structure in social tagging. In J. Furner & J. T. Tennis (Eds.), Proceedings of the 17th ASIS&T SIG/CR Classification Research Workshop (Vol. 17, pp. 1-10).
Viégas, F. B., Wattenberg, M., & Dave, K. (2004). Studying cooperation and conflict between authors with history flow visualizations. Proceedings of the 2004 conference on Human factors in computing systems - CHI '04 (pp. 575-582). New York, NY, USA: ACM Press.
Voß, J. (2007). Tagging, folksonomy & co - Renaissance of manual indexing? In A. Oßwald, M. Stempfhuber, & C. Wolff (Eds.), Open Innovation. Neue Perspektiven im Kontext von Information und Wissen, ISI 2007 (pp. 243-254).
Wilkinson, D., & Huberman, B. (2007). Cooperation and quality in Wikipedia. Proceedings of the 2007 international symposium on Wikis (pp. 157-164). New York, NY, USA: ACM.
Wilson, T. D. (1999). Models in information behaviour research. Journal of Documentation, 55(3), 249-270.
Yee, N. (2005). Motivations of play in MMORPGs. Proceedings of DiGRA 2005.
Zhang, J., & Ackerman, M. S. (2005). Searching for expertise in social networks: A simulation of potential strategies. Proceedings of the 2005 international ACM SIGGROUP conference on Supporting group work (pp. 71-80). ACM.