2015 8th International Conference on u- and e-Service, Science and Technology
Towards Networks of Search Engines and Other Digital Experts: A Distributed Intelligence Approach
Predrag T. Tošić, Yinghui Wu
School of EECS, Washington State University, Pullman, Washington, USA
{ptosic, yinghui}@eecs.wsu.edu
Abstract— We outline our vision and high-level architecture of a network of “digital experts” such as online recommendation systems and general- and special-purpose search engines. Instead of choosing between, e.g., Google, Bing and Baidu, a user may want to get integrated results from all three search engines (and more) via a single search query. The user may also want results or suggestions for questions implied by, but not explicitly stated in, his query. Ideally, the combined results from different search engines will be relevance-ranked, taking into account each user’s individual preferences. There exist “expert systems” that integrate results or recommendations from multiple different websites and/or other search engines – e.g., the meta-search engines for finding the best flights and airfares. However, these meta-search engines do not (i) relevance-rank on behalf of the end-user, (ii) learn over time which websites/individual search engines are most trustworthy and relevant to a particular user, (iii) maintain a quality assurance model of the individual sources of information or recommendation that they harvest, or (iv) create sub-queries or new queries based on inference of the user’s intent, beyond what the user has explicitly asked for. We propose a unified framework to address all these issues. In particular, our goal is to enable the end-user to seamlessly obtain integrated expertise from a variety of sources, with recommendations ranked based on both (a) the user’s preferences and (b) the reputation and trustworthiness of the individual sources of recommendation. Our vision for achieving these goals is a decentralized, transparent marketplace of search engines, recommenders and knowledge bases, where the burdens of integrating, ranking and evaluating the quality of different knowledge sources are taken off the end-user.
Keywords— web search, search engines, recommendation systems, meta-search engines, distributed intelligence
I. INTRODUCTION AND MOTIVATION
Consider a situation where you have just arrived in a town somewhere in the Midwest region of the United States, and after checking in at your hotel, you decide to find a local restaurant for dinner. In addition to the hotel staff, there are always Google, Bing and Yelp at your fingertips. Among those, Yelp not only provides information on restaurants in this town, but also diners’ reviews and a bucket-based ranking of each restaurant’s reputation (each restaurant is evaluated, and therefore implicitly ranked, with respect to a fixed scale; Yelp’s default is between 1 and 5 stars). However, you are in a small town, and in the mood for Chinese food. There are only two Chinese restaurants nearby, and they both have practically identical reviews – and each is reviewed and scored by only a handful of diners. So you don’t find this information very helpful in determining which restaurant is the better option for you. You then start thinking of friends who may have visited the area and dined at one or both of the restaurants. You decide to call up a couple of friends – if any of them expresses a definite preference for Restaurant A over Restaurant B, you will surely rely on that preference much more than on what Google or Yelp may say. (The reason being, you trust your friends’ opinions.) Furthermore, of the two friends who have recently visited the town you are currently in, one grew up in China whereas the other grew up in Greece. So you will call your Chinese-born friend first: the Greek friend’s opinion on Chinese restaurants is unlikely to dissuade you from whichever recommendation your Chinese friend may make. You then pause to think about choosing a dinner venue for the following night; if you opt for Greek or other Mediterranean food, you will call your Greek friend first (and the Chinese friend’s opinion is unlikely to stop you from following the Greek friend’s recommendation in that situation).
The reasoning behind these preferences is intuitively clear: you consider your Chinese friend’s expertise to exceed your Greek friend’s expertise in matters related to Chinese cuisine, and your confidence in your friends’ knowledge is reversed when it comes to Mediterranean food. Moreover, you trust each of your friends’ recommendations more than you trust the rankings and reviews by unknown people on Yelp – and this applies across different restaurants and cuisines. Now consider a different scenario, where you wish to find an air itinerary from a small town on the West Coast of North America to a city in South-Eastern Europe. You know you will need at least three, possibly even four, different flight segments. You are travelling to a conference, and therefore the travel dates aren’t flexible. You want the lowest price on economy class seats (after all, you are to use your research grant to cover the cost of this trip), but you also want to use only reputable airlines, and to minimize the layovers between consecutive flight segments. Moreover, you are following the weather news about certain central regions of North America, and would prefer to avoid flight segments landing in any of the regions potentially troubled by the inclement weather. So, clearly, you have a number of different constraints and preferences; to make matters even more complicated, some of your preferences and (soft) constraints, such as wanting both a relatively low airfare and “good” departure and arrival times with short layovers, may be in direct conflict with each other. Certainly, Expedia
or Orbitz or another flight search engine can provide long listings of candidate itineraries, and you can have those options ranked w.r.t. the price, the total duration of the itinerary, or one of several other criteria provided by the flight search engines. However, at the end of the day, “balancing out” your various preferences and soft constraints, while ensuring that the itinerary you end up picking satisfies all of your hard constraints, is ultimately still left to you – and it can be a daunting task! Furthermore, even if you use one of the several existing “meta-search engines” that simultaneously crawl multiple airline websites as well as individual “ordinary” search engines, the burden of sifting through tens if not hundreds of candidate itineraries in order to find an “optimal” one is still on you, the human end-user. Meta-search engines in essence do “parallel web crawling”, and then integrate the results into a single ranked list of options. They may help you find an itinerary on Orbitz that Expedia has missed, or a lower airfare for a flight purchased on Air Papa than what you’d pay for the same flight at OneTravel. But what these meta-search engines currently cannot do is take your complex, mutually conflicting preferences into the mix, appropriately weigh different options, and present you with a ranked list of recommendations taking into account all of the above: your user profile and history, your specific current situation and preferences, the reputation of different sources of information (such as individual online travel agencies, airline websites, etc.), and your likely preferential ordering over these different sources of information. Hence, having a digital assistant that can do not just meta-search, but also meta-reasoning and meta-ranking on your behalf, would make decision-making in complex scenarios like this so much easier!
We propose in this paper a vision of a decentralized, transparent marketplace of general- and special-purpose search engines and recommender systems (“experts”), together with meta-reasoning and meta-ranking mechanisms to crawl, integrate, digest, analyze and reason about those different experts and their recommendations. The ultimate goal is to ease the burden of human decision-making when data and knowledge come from a variety of different sources, especially when reconciling multiple (and often mutually conflicting) objectives is necessary. Our vision is inspired by how social networks, such as Facebook, function in terms of not only “virtual socializing” and data exchange, but also creating (or destroying) individuals’ reputations on different topics, and using perceptions or reputations of “friends” (or, for that matter, total strangers) on such a social network to decide which posts to read in detail, which YouTube videos to view, which links to news sources or blogs to click on as opposed to ignore, etc. The quality of information found on Facebook, Twitter and other social networks varies broadly. But these and other social networks represent open, transparent and democratic marketplaces of ideas, opinions and content. In such a marketplace, each participant gets to choose what content to view, “like”, etc. In a nutshell, our overarching goal is to reduce the burden of complex multi-objective, multi-constraint decision-making as much as possible, by designing and deploying meta-search, meta-reasoning and meta-learning “digital assistants” that would carry out most of the “heavy lifting” on the end-users’ behalf, and then present to the human user a simplified, relevance-ranked list of recommendations from a variety of knowledge sources, thus making currently daunting decision-making tasks much easier.
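To make the flight-itinerary trade-offs concrete: one simple way such a digital assistant could reconcile mutually conflicting soft constraints is to scalarize them into a single weighted penalty. The following sketch is purely illustrative – the fields, weights and numbers are our hypothetical assumptions, not part of any existing system.

```python
# Hypothetical sketch: scoring candidate itineraries under conflicting
# soft constraints (price vs. layovers vs. airline reputation) by
# collapsing them into one weighted penalty. All names/weights illustrative.
from dataclasses import dataclass

@dataclass
class Itinerary:
    price: float               # total airfare in USD
    layover_hours: float       # summed layover time across segments
    airline_reputation: float  # 0.0 (unknown) .. 1.0 (highly reputable)

def penalty(it: Itinerary, weights: tuple = (1.0, 25.0, 200.0)) -> float:
    """Lower is better. The weights encode one user's trade-offs:
    each layover hour 'costs' $25, and poor airline reputation can
    add up to $200 of perceived cost."""
    w_price, w_layover, w_rep = weights
    return (w_price * it.price
            + w_layover * it.layover_hours
            + w_rep * (1.0 - it.airline_reputation))

candidates = [
    Itinerary(price=900.0, layover_hours=2.0, airline_reputation=0.9),
    Itinerary(price=750.0, layover_hours=9.0, airline_reputation=0.5),
]
# The cheaper itinerary loses once layovers and reputation are weighed in.
best = min(candidates, key=penalty)
```

Under these illustrative weights, the nominally cheaper itinerary is penalized for its long layovers and less reputable airline; changing the weights (i.e., the user’s preferences) can flip the ranking, which is exactly the personalization burden we want the assistant to carry.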
In the following sections, we outline the overall architecture of this network of experts, and how to automate extracting the most relevant pieces of knowledge and expertise from such a network.

II. RECEIVING KNOWLEDGE FROM A NETWORK OF EXPERTS: A DISTRIBUTED INTELLIGENCE VIEW
Our vision of a network of recommenders/experts is that of achieving distributed intelligence via a transparent marketplace model; this marketplace applies to the individual sources of expertise, recommendations, answers to user queries, etc. To enable such a distributed architecture with the desired capabilities, we need paradigms and models from Distributed AI (multi-agent systems), as well as from mathematical economics (specifically, collaborative game theory and mechanism design). Due to space constraints, we focus our attention in this paper on the multi-agent, distributed intelligence aspect. From a multi-agent system perspective, each individual search engine or recommender system can be viewed as an autonomous agent that (i) has some internal computational resources and expertise, including its own knowledge base (KB) as a repository of relevant domain knowledge, and (ii) has the ability to communicate and exchange information with other agents. In particular, different agents will in general have different domains of expertise and access to different KBs, and may operate at different speeds in processing the user’s (or another agent’s) queries. The meta-recommender acts on behalf of an individual user; it collects, aggregates, analyzes and ranks responses from different expert agents to a given user’s query or request for expertise (such as, in our examples, asking for recommendations on Chinese restaurants or the best options for a multi-leg flight itinerary), based on what it knows both about this user and her preferences, and about the different experts (i.e., the various agents representing individual search engines, websites, domain-specific recommendation systems, and similar). From the system architecture standpoint, this meta-recommender is also an autonomous agent; typically, each user (a single human, an organization, etc.) will have its own meta-recommender agent as its individual “digital assistant”.
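The meta-recommender/expert relationship described above can be sketched in a few lines of Python. This is a minimal sketch under our own (hypothetical) naming, not a reference implementation: the coordinator fans a query out to domain-matching expert agents and re-ranks their answers by each expert’s reputation.

```python
# Illustrative sketch of the meta-recommender acting over expert agents.
# All class/field names are our assumptions; real agents would wrap
# actual search engines or recommender systems.
from dataclasses import dataclass, field

@dataclass
class ExpertAgent:
    name: str
    domain: str

    def answer(self, query: str) -> list:
        # Placeholder: a real expert would consult its own knowledge base
        # and return (result, local_relevance_score) pairs.
        return [(f"{self.name}: result for '{query}'", 0.5)]

@dataclass
class CoordinatorAgent:
    experts: list
    # Per-expert reputation in [0, 1], maintained over time (see Section IV).
    reputation: dict = field(default_factory=dict)

    def ask(self, query: str, domain: str) -> list:
        """Fan the query out to domain-matching experts; re-rank their
        answers by reputation-weighted local relevance."""
        ranked = []
        for expert in self.experts:
            if expert.domain != domain:
                continue
            weight = self.reputation.get(expert.name, 0.5)  # neutral prior
            for item, score in expert.answer(query):
                ranked.append((item, weight * score))
        return sorted(ranked, key=lambda pair: pair[1], reverse=True)

coordinator = CoordinatorAgent(
    experts=[ExpertAgent("Yelp", "dining"), ExpertAgent("Orbitz", "travel")],
    reputation={"Yelp": 0.9},
)
results = coordinator.ask("best Chinese restaurant", "dining")
```

The design choice worth noting is that ranking happens at the coordinator, not at the individual experts: each expert reports only its local relevance, and the user-specific weighting (reputation, preferences) is applied once, in one place, on the user’s behalf.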
It is the responsibility of this meta-recommender, which (in the vernacular of Distributed AI) we will refer to as the coordinator agent, to maintain and, as appropriate, update both (i) its model of the end-user and her preferences, and (ii) its models of the individual expert agents (such as domain-specific search engines, recommender systems, and so on); the latter models include how reputable or trustworthy each such individual agent is with respect to the domain of the end-user’s query. This knowledge maintenance entails periodically updating both the model of the user’s preferences and the models of the individual expert agents’ reputations in various domains.
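These two maintenance duties can be sketched as simple update rules. The exponential-moving-average form, the learning rate and the pruning threshold below are our illustrative assumptions; they are one plausible instantiation, not the method this paper commits to.

```python
# Illustrative sketch of the coordinator's knowledge maintenance:
# (1) updating an expert's reputation from user feedback, and
# (2) pruning experts whose reputation falls below a threshold.
# Constants and names are hypothetical.

def update_reputation(rep: dict, expert: str,
                      feedback: float, rate: float = 0.2) -> None:
    """feedback in [0, 1] (e.g., 1.0 = the user clicked/accepted the
    recommendation, 0.0 = ignored it). An exponential moving average
    keeps the reputation score in [0, 1]."""
    old = rep.get(expert, 0.5)  # neutral prior for unknown experts
    rep[expert] = (1.0 - rate) * old + rate * feedback

def prune_pool(rep: dict, threshold: float = 0.2) -> dict:
    """Drop experts whose reputation has fallen below the threshold,
    removing them from the 'pool of expertise'."""
    return {name: r for name, r in rep.items() if r >= threshold}

reputations = {"ChineseFriendAgent": 0.8, "StaleAgent": 0.15}
update_reputation(reputations, "YelpAgent", 1.0)   # positive feedback
pool = prune_pool(reputations)                     # StaleAgent is dropped
```

A moving-average update of this kind is deliberately forgetful: an expert’s reputation tracks recent performance, so a source that degrades over time is eventually demoted and pruned, matching the maintenance behavior described above.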
However, the above two responsibilities do not exhaust the coordinator agent’s role. This coordinator agent also engages in meta-learning and meta-reasoning on the end-user’s behalf. Examples of meta-learning and meta-reasoning tasks could include (but are not necessarily limited to) the following:
• Incorporating the user’s feedback into the evaluation, ranking, and reputation maintenance and updating of individual experts; and, as appropriate, removing certain experts (for example, those whose reputation on a particular subject or domain drops below a user-defined or coordinator-defined threshold) from the “pool of expertise”, and/or adding newly discovered experts to the pool of knowledge sources.
• Making inferences both about the user’s shifting preferences and needs, and about individual experts’ reputations and trustworthiness, based on either explicit (e.g., in the form of responses to automatically generated surveys) or implicit (e.g., in the form of whether the user has clicked on recommended links or not) feedback from the user.
• Making inferences about the user’s needs and preferences, including anticipating information needs not explicitly requested by the user (see some examples of this in the next section).
• Making inferences about the individual expert agents even in the absence of explicit and/or implicit feedback from the user(s); an example of such inference is mining graph patterns on the network of experts to identify previously unknown expert agents who may have expertise in a given domain, based on their connections with the experts known to the coordinator agent to have a particular expertise or domain knowledge. (An example would be, among all Yelp reviews, identifying and trusting a particular Yelp recommendation on the best Chinese restaurant in town from a reviewer whom we don’t personally know, but who, as our coordinator agent has discovered via e.g. mining the Facebook or Twitter network, happens to be a friend of our trustworthy friend.)
The reputation maintenance and preference updating roles of the coordinator agent can be accomplished using existing reinforcement learning (and, when we have access to explicit feedback from the end-users, potentially also supervised learning) approaches, as well as meta-learning techniques (e.g., [4, 5]). Inferences about unknown experts based on their relationships with known experts whose expertise we trust and whose current reputation is high can be accomplished in a computationally efficient manner by taking advantage of recent results on graph pattern mining [2]. We will elaborate further on the application of these techniques to our context, and on experimental findings regarding their effectiveness, in our future work.

III. PROCESSING USER’S QUERIES
We envision the distributed query processing framework to be applicable to a number of routinely used Web query classes, including keyword and graph pattern queries [1]. We describe the distributed search framework below (illustrated in Fig. 1).

Coordinator Agent. The coordinator maintains (1) an agent network topology, where each node is a registered search agent with associated information, including query format, URL and dynamically maintained expertise metrics (e.g., reputation and trustworthiness); (2) a task mapper, which interprets a given query Q, selects a set of proper search agents, and maps (sub)queries of Q to each search agent; and (3) an assembler to synthesize the results from all the agents to generate final answers. In addition, the coordinator provides an interface to receive feedback from users/applications.

Individual Search Agents. Each agent manages (a) its local data source, (b) a communication protocol with other search agents, (c) a local knowledge base to facilitate the search process, and (d) a local learner to enrich the knowledge base by leveraging the knowledge from other search agents, via the communication protocol. During the search process, each agent maintains its knowledge base by reasoning about and integrating (run-time or offline) information from other agents. The knowledge base is thereby enriched for better search results in the future.

Figure 1: Distributed Search Agent Framework for trip recommendation. Each agent (e.g., a hotel agent, a restaurant agent, an airfare agent) exchanges information (e.g., social recommendations, itineraries) with other expert agents to improve search on the user query.

Online distributed query evaluation. Following distributed query processing architectures, our distributed intelligence framework works with a coordinator (query server) and a network of search engines/agents. Upon receiving a (possibly complex and vague) query Q (e.g., “find me a cheap but fun trip from Seattle to LA”) [7, 8], the coordinator interprets Q and identifies several sub-patterns (e.g., airfare, hotel, places to visit, and social recommendations). The coordinator agent then, using the task mapper, selects the most suitable individual expert agents, based on the dynamically maintained agent reputations, and assigns search
tasks and sub-queries to the selected experts. It then collects the returned answers/recommendations and synthesizes the results into the best complete answers, via the assembler. Lastly, the coordinator agent interactively communicates with the end-user to receive feedback, adjust individual agents’ reputations, and refine the search whenever necessary (for example, by receiving refined queries from the user). Upon receiving a (sub)query, each individual “expert agent”, in parallel, exchanges knowledge and intermediate answers with a set of other agents in the distributed agent network, to gather the information needed for its local search results. It then conducts a local search and returns its partial answer/recommendation for the query. Each expert agent also keeps logs of its communications with other agents (both domain experts and coordinators) for effective knowledge exchange.

IV. META-REASONING ABOUT INDIVIDUAL EXPERTS’ REPUTATIONS
As mentioned in Section II, one of the core functionalities of the coordinator agent is to define, maintain and, as appropriate, update the reputations of individual expert agents on behalf of the end-user. Inferring or extracting reputations of different agents in a networked multi-agent system has been investigated by the Distributed AI community for over a decade (see, e.g., [3]). Modeling of different agents’ reputations can be useful both in purely collaborative distributed domains, which are typically those where the same entity or organization designs and controls the entire multi-agent system in order to accomplish its objectives (where those objectives are perhaps more feasible to accomplish in a distributed rather than centralized manner), and in more general multi-agent domains, where different agents in general belong to different entities, and therefore have different objectives; the latter type of domains, from a game-theoretic standpoint, are of a competitive (or, at the very least, mixed collaborative-competitive) nature. The usefulness of taking reputation into account in collaborative multi-agent contexts such as distributed task allocation and multi-agent coalition formation has recently been argued in [6]. Our “network of experts”, by definition, is a complex multi-agent domain where different individual experts belong to different entities – for example, the Google vs. Yahoo vs. Baidu general-purpose search engines, or the Orbitz vs. Expedia vs. OneTravel special-purpose (travel) search engines. In particular, these individual experts in general compete with each other. Also, a general- or special-purpose search engine may, for example, give undeserved preference (from a relevance standpoint) to its top advertisers. The end-user may want the most relevant results to her query, not the most advertised ones.
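One simple baseline among the scoring mechanisms we intend to investigate is a normalized weighted sum over per-expert quality factors. The factor names, values and weights below are purely illustrative assumptions; they merely make concrete the idea that reputation is a composite of several measurable aspects rather than a single number handed down by the expert itself.

```python
# Hedged sketch: a composite reputation score as a normalized weighted
# sum of quality factors (ranking unbiasedness, knowledge-base freshness,
# retrieval speed, past user satisfaction). Factors/weights are our
# illustrative assumptions, not measured values.

def reputation_score(factors: dict, weights: dict) -> float:
    """Each factor is normalized to [0, 1]; the weights sum to 1,
    so the composite score also lies in [0, 1]."""
    return sum(weights[name] * factors[name] for name in weights)

# Hypothetical factor readings for one travel search engine:
travel_expert = {"unbiasedness": 0.9, "freshness": 0.8,
                 "speed": 0.7, "satisfaction": 0.6}
# Hypothetical user-specific weighting (this user cares most about
# unbiased rankings, least about speed):
weights = {"unbiasedness": 0.4, "freshness": 0.2,
           "speed": 0.1, "satisfaction": 0.3}

score = reputation_score(travel_expert, weights)  # approximately 0.77
```

Because the weights are user-specific, two users can legitimately assign different reputation scores to the same expert – for instance, one user may tolerate stale data but heavily penalize advertiser bias.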
The coordinator agent’s job is to maintain and update knowledge of which search engines give unbiased rankings and which ones give preferential treatment to their advertisers. This aspect, together with several other factors (such as the quality and freshness of each expert’s knowledge base, the speed of retrieval, the past track record of the user’s satisfaction with this expert’s recommendations, and so on), contributes to the reputation score of an expert. The coordinator agent uses these reputation scores of the various experts in the distributed “pool of knowledge” when combining and ranking those experts’ recommendations in response to a given user’s query. In our future work, we will investigate different scoring mechanisms and functions, trying to determine the most effective (yet computationally efficient) models of reinforcement and/or supervised learning based reputation scoring. We will then evaluate and quantify how useful these reputation tracking mechanisms are for the overarching objective of providing the end-user with the most relevant, personalized recommendations extracted from a broad variety of knowledge sources.

V. SUMMARY
We present a novel vision of a distributed query processing framework over a network of expert agents. A user deploys a personalized digital assistant, in the form of a coordinator agent, that expands on the user’s query, crawls and mines a distributed “pool of knowledge” on the Internet, identifies and ranks various suitable individual sources of expertise from that pool, and presents the end-user with a unified, relevance-ranked list of recommendations meeting both the user’s explicit and inferred informational needs. With such personalized coordinator agents tapping into the vast amounts of information and knowledge found on the World Wide Web, humans and organizations could hugely reduce the effort, time, cognitive burden and financial costs of formulating complex queries, reconciling multiple, often mutually conflicting objectives and constraints, and making complex multi-faceted decisions.

REFERENCES
[1] W. Fan et al., “Distributed Graph Simulation: Impossibility and Possibility”, Proc. Int’l Conf. on Very Large Data Bases (VLDB), 2014.
[2] W. Fan et al., “Association Rules with Graph Patterns”, Proc. Int’l Conf. on Very Large Data Bases (VLDB), 2015.
[3] J. M. Pujol et al., “Extracting Reputation in Multi-Agent Systems by Means of Social Network Topology”, Proc. AAMAS-02, Bologna, Italy, pp. 467-474, ACM, July 2002.
[4] P. Stone, M. Veloso, “Multiagent Systems: A Survey from a Machine Learning Perspective”, Autonomous Robots, vol. 8, pp. 345-383, 2000.
[5] P. Tosic, R. Vilalta, “A unified framework for reinforcement learning, co-learning and meta-learning how to coordinate in collaborative multi-agent systems”, Procedia Computer Science, vol. 1 (1), pp. 2217-2226, ScienceDirect, 2010.
[6] P. Tosic, “Learning to Get Better at Distributed Coalition Formation in Collaborative Multi-Agent Systems”, Proc. Agreement Technologies (AT), within EUMAS-2015, Athens, Greece, Dec. 2015 (to appear).
[7] Y. Wu et al., “Ontology-based Subgraph Querying”, Proc. Int’l Conf. on Data Engineering (ICDE), 2013.
[8] S. Yang et al., “Schemaless and Structureless Graph Querying”, Proc. Int’l Conf. on Very Large Data Bases (VLDB), 2015.