Information market based recommender systems fusion

Efthimios Bothos, National Technical University of Athens, Athens, Greece ([email protected])
Konstantinos Christidis, National Technical University of Athens, Athens, Greece ([email protected])
Dimitris Apostolou, University of Piraeus, Piraeus, Greece ([email protected])
Gregoris Mentzas, National Technical University of Athens, Athens, Greece ([email protected])
ABSTRACT

Recommender systems have emerged as a way to tackle information overload, reflected in the increasing volume of information artefacts on the web and elsewhere. Recommender systems analyse existing information on user activities in order to estimate future preferences. However, in real life situations different types of information can be found, and their interpretation can vary as well. Each recommender system implements a different approach for utilizing the known information and predicting user preferences. An open problem is how to blend the recommendations of different systems in an adaptive, intuitive way while performing better than the base recommenders. In this work we propose an approach based on information markets for the fusion of recommender systems. Information markets have unique characteristics that make them suitable building blocks for ensemble recommenders. We evaluate our approach with the Movielens and Netflix datasets and discuss the results of our experiments.
Categories and Subject Descriptors
H.3.3 [Information Search and Retrieval]: Information filtering, Selection process; H.3.4 [Systems and Software]: User profiles and alert services

General Terms
Algorithms, Performance, Experimentation

Keywords
Content-Based Recommendation, Collaborative Filtering, Ensemble Recommenders, Information Markets
1. INTRODUCTION
The explosive growth of available content in the form of movies, books, songs and other items renders the process of information search and selection increasingly difficult. Users are overwhelmed by the abundance and diversity of the options to consider and may not have the time or knowledge to personally evaluate them. This situation can lead to an inability to choose and to other negative effects for users [26]. Recommender systems have proven to be a valuable way for online users to address problems of this kind. The task of recommendation techniques or models is to accurately identify user preferences and utilise this information to predict future user reactions to unseen items. Many of these systems have been successfully deployed in commercial environments [16, 4].

Recommender systems are generally categorized into two large groups: collaborative filtering and content-based recommenders. Collaborative filtering utilizes similarity between user interests in the past in order to predict future preferences, based on the premise that users with similar preferences in the past tend to like similar items in the future. Content-based recommenders, on the other hand, analyse user preferences, generate a user profile and compare it with the content of the item at hand for the same purpose.

Recent works in the area propose obtaining results by combining sets of recommendation models, each of which solves the same original task. Such methodologies are known as ensembles and provide composite global models with more accurate and reliable estimates than those produced by a single model [23]. The main idea of an ensemble method is to aggregate the predictions of different base algorithms - the ensemble - to obtain a final prediction. The combination of different predictions into a final prediction is also referred to as blending or fusion. Many of the top performing teams in the Netflix Prize contest employed ensemble methods in their solutions [3, 20, 32]. In one case [3], 107 different models were combined using a linear regression approach, and this solution won the Progress Prize in the Netflix challenge. A number of substantially different base algorithms were employed for creating the models. According to [3]: "Predictive accuracy is substantially improved when blending multiple predictors [...] most efforts should be concentrated in deriving substantially different approaches, rather than refining a single technique."
Over the last two decades many powerful collaborative filtering and content-based algorithms have been published, and it is argued that their performance can be significantly improved if ensemble methods are used. The ensemble models proposed for the Netflix Prize were combined to address the recommendation problem on that specific dataset. Existing combiner methods have limitations, including the requirement for a training phase. Additionally, current methods rest on restrictive assumptions, such as a constant composition of the base recommenders in the ensemble, and that performance on the training data is a good proxy for subsequent actual performance.

In this paper we propose a novel approach for blending heterogeneous recommender systems using information markets. Market participants are computational agents representing different base recommenders. Agents bet on the recommendation they foresee to be correct, based on the information provided by their corresponding base recommender and/or information revealed through market prices. The market outcome depends on the wealth of the participants and reflects the weighted 'opinions' of the base recommenders. Our approach (1) has comparable performance to linear regression blending methods, (2) does not require offline training data, and (3) through online learning can adapt to changes in base-recommender composition and performance.

The rest of this paper is organized as follows. The following section contains background information on the blending of recommenders and on information markets. In Section 3 we describe our methodology for using information markets to build an ensemble of recommenders. In Section 4 we provide the details and the results of the experiments performed on the Netflix and Movielens datasets. Section 5 contains related work, while conclusions and further work are given in Section 6.
2. BACKGROUND

2.1 Blending Recommenders
An ensemble is a group of methods or models that can be used together to address a specific problem. The main idea of using an ensemble (or a blending) of models is to assign a task to different single models and then combine their outputs into a global model. Ensembles can be classified into one of two types, homogeneous and heterogeneous. In general, homogeneous ensembles utilize different versions of the same model, while heterogeneous approaches combine models of different types. In the field of recommender systems this approach is also known as hybrid recommender systems: any recommender system that combines multiple recommendation techniques to produce its output is considered a hybrid recommender system [10]. Hybrid recommenders can combine several different techniques of the same type; for example, two different content-based recommenders could work together. However, they can also combine information across different sources. The latter are particularly effective in resolving the cold-start problem [10]. One of the simplest designs for implementing a hybrid system is weighted aggregation. In this setting, each component of the hybrid recommender estimates the utility of a given item and the estimates are combined using a linear formula (a minimal sketch is given at the end of this subsection). An early example of weighted hybrid recommenders
can be found in [12]. This type of hybridization combines evidence from the base recommenders in a static manner [10] and therefore seems inappropriate if the base recommenders have changing prediction accuracy. These hybrid recommenders involve a training phase, in which each individual recommender processes the training data. Then, when a prediction is generated for a test user, the recommenders jointly propose candidates. Besides weighted aggregation, other approaches for creating hybrid recommender systems have also been proposed. In switching algorithms, the system switches between different techniques based on the situation at hand. Mixed hybrid recommenders present different recommendations at the same time. In recommenders that apply feature combination, features from different sources are fed into a single algorithm. In cascade recommenders, one recommender refines the recommendations provided by another, while feature augmentation algorithms can use the output of one technique as an input feature for another. Finally, in metalevel hybrids, the model learned by one recommender is used as input to another recommender [9].
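As an illustration of weighted hybridization, the following is a minimal sketch; the weight values are hypothetical and would normally be fit on held-out data (e.g. by least squares):

```python
# Minimal sketch of a weighted hybrid: each base recommender returns a
# utility estimate for a (user, item) pair; the hybrid is their weighted sum.
def weighted_hybrid(estimates, weights):
    """estimates: list of base-recommender predictions for one (user, item).
    weights: one non-negative weight per base recommender (hypothetical
    values here; in practice fit on held-out data)."""
    assert len(estimates) == len(weights)
    return sum(w * r for w, r in zip(weights, estimates))

# e.g. CF predicts 4.2, content-based 3.6, popularity 3.9
print(weighted_hybrid([4.2, 3.6, 3.9], [0.6, 0.25, 0.15]))  # -> 4.005
```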
2.2 Information Markets
Markets are institutional settings able to efficiently allocate scarce resources and to accommodate hedging against various types of risk. Besides these properties, markets are known to aggregate and disseminate information into prices; the efficient market hypothesis, in its strongest form, states that all private and public information is reflected in market equilibrium prices [13]. Information Markets (IMs) are designed and run for the primary purpose of mining and aggregating information scattered among participants, and subsequently using this information, in the form of market values, to make predictions about specific future events. IMs make use of specifically designed contracts that yield payments based on the outcome of uncertain future events, and they differ from traditional equity markets in that they are not typically tied to a claim of an ownership stake in a firm. Instead, the assets are claims that will pay off an amount which depends upon a future state of the world. IMs, also known as prediction markets, virtual stock markets and decision markets, have provided accurate predictions of future outcomes in a wide range of domains, e.g., political elections, economic indices and sport events [34]. Prices of contracts can be interpreted as a measure of the probability of the event; this interpretation depends on the contract specifications and the market design. Contracts are sold for each of the possible outcomes of the event of interest, and the contract price fluctuates based on supply and demand. In the Iowa Electronic Markets, a winning contract (one that predicted the correct outcome) pays 1 after the outcome is known; therefore, in such a setting the contract price will always be between 0 and 1. In order for IMs to be most efficient, they should attract a group of individuals with specific characteristics. Surowiecki [31] has provided a qualitative analysis of the participant characteristics necessary for the market to be trustworthy: diversity of opinion, independence of thought and decentralization of knowledge. Wolfers and Zitzewitz [34] established a theoretical model and provided an account of sufficient conditions under which IM prices aggregate private information held among participants. They concluded that,
when participants are typically well-informed, IM prices aggregate their private information into useful predictions. This is in accordance with the Condorcet Jury Theorem [5], which suggests that a group of individuals has a higher probability of making the correct decision as the size of the group increases, provided sufficiently many individuals have a better than even probability of making the correct decision, i.e. they are well informed. In an IM, participants who hold correct information are rewarded and their wealth increases; on the contrary, the assets of those who provide erroneous input decrease. The resulting effect is higher long term accuracy, as 'good predictors' obtain greater influence on the market results due to their increased wealth.
3. METHODOLOGY
In our work, IMs constitute an ensemble recommender which can potentially be employed in any recommendation problem. We consider items that have been rated by the users in an inconsistent and sparse manner, and we seek to predict the ratings of items that users have not rated yet. For a given dataset, a set of base recommenders is considered. These recommenders are trained and then prompted for estimates on the testing part of the dataset. A market is composed whose participants are $M$ computational agents representing the different base recommenders, assigned indices $i$ in the set $A = \{1, \ldots, M\}$. Agents bet on the rating they foresee to be correct, based on the information provided by their corresponding base recommender and/or information revealed through market prices. On a conceptual level we also introduce the role of the market operator, who is responsible for gathering agent bets, rewarding agents according to their performance (i.e. forecasting accuracy), and generating the market outcome (i.e. the recommendation), which is equal to the wealth-weighted ratings provided by the base recommenders. The market works in parallel with the base recommenders on the task of estimating user ratings. Agents are rewarded based on their performance, i.e. those who perform well receive greater rewards, and their impact increases when they accurately predict subsequent items. The wealth accumulated by the agents is thus considered to reflect their predictive power and is used to aggregate the recommendations of the base algorithms. Note that although the base recommenders need to be trained, the market converges to a combination of base recommendations without training, through its resource allocation mechanism. The concept of the proposed recommender fusion can be seen in Figure 1. Without loss of generality, in this work we consider ratings on a scale of 1 to 5, since this scale is commonly used by most online recommendation services such as Movielens and Netflix. The possible outcomes may differ depending on the online service under consideration; Movielens users are allowed to place ratings with a 0.5 step on the 1 to 5 scale (in total 11 options), whereas Netflix users rate using 5 options. In any case, the available options are represented by corresponding contracts in the market, and their prices form a $K$-dimensional vector $p = (p_1, \ldots, p_K) \in [0, 1]^K$. The real rating of each item becomes known after the prediction (output) of the market.
Figure 1: Conceptual Architecture of the proposed Information Market based fusion of Recommenders

3.1 Base Recommenders
As stated above, the ensemble may consist of a variety of recommendation techniques; it is the market's task to aggregate them in order to achieve low prediction errors. In order to be consistent with the principles of IMs, we seek input from diverse recommenders. The most prominent recommender systems rely on collaborative filtering and content analysis. Content analysis is commonly used when the problem includes content that will either be consumed itself (for example, a book) or that describes the consumable content (for example, a movie). Unsupervised content processing can lead to reasonable suggestions for the type of items the user may enjoy next. As in [1], the process is generally modeled as follows: the utility $u(c, s)$ of an item $s$ for user $c$ is estimated based on the utilities $u(c, s_i)$ assigned by user $c$ to items $s_i$, where $s_i$ belongs to the set $S$ of items that are similar to item $s$. To compute the similarities between items and to estimate the utility of items to users, we need ways to represent both items and users. Items can be described either by predefined fields or, in the case of text, by a structured representation obtained using methods such as tf-idf and the vector space model [25]. User profiles are represented by a model of preferences, which is a description of the kind of items that the user prefers. Information for this model is obtained either by asking the user to enter information about himself or by analyzing user interactions with the system. Collaborative filtering (CF) methods aim to learn user preferences and make recommendations based on user and community data. These techniques use a database of preferences for items by users to predict additional artifacts or products a new user might like. In a typical scenario, there is a list of $m$ users $u_1, u_2, \ldots, u_m$ and a list of $n$ items $i_1, i_2, \ldots, i_n$, and each user $u_i$ has a list of items $S_{u_i}$ which the user has rated, or about which his preferences have been inferred through his behaviour. The ratings can either be explicit indications on an ordinal scale (e.g. 1-5) or implicit indications, such as purchases or click-throughs [30]. CF algorithms commonly use neighbourhoods of users or items based on a similarity measure $w_{ij}$, which reflects the distance, correlation, or weight between two users or two items $i$ and $j$. They then produce a prediction for the active user by taking a weighted average of all the ratings of similar users (or items) with respect to the item (or user) of interest. Several approaches can be used for computing similarity between items or users, e.g. correlation or vector cosine similarity [30].
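To make the neighbourhood-based scheme just described concrete, here is a minimal sketch of the user-based variant with cosine similarity; the conventions (0 = unrated, fallbacks) are ours, not tied to any specific library:

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity over co-rated items (0 is treated as unrated)."""
    mask = (a > 0) & (b > 0)
    if not mask.any():
        return 0.0
    va, vb = a[mask], b[mask]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb) + 1e-9))

def predict(R, u, i, k=30):
    """Weighted average of the k most similar users' ratings for item i.
    R: user x item rating matrix; row u is the active user."""
    sims = np.array([cosine_sim(R[u], R[v]) if v != u and R[v, i] > 0 else 0.0
                     for v in range(R.shape[0])])
    top = np.argsort(sims)[::-1][:k]
    top = top[sims[top] > 0]
    if len(top) == 0:
        rated = R[R[:, i] > 0, i]
        return float(rated.mean()) if rated.size else 3.0  # crude fallback
    return float(sims[top] @ R[top, i] / sims[top].sum())
```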
3.2 Agents

Our agents follow the Belief-Desire-Intention (BDI) design paradigm. According to the BDI framework, an agent is characterized by its beliefs, goals (desires), and intentions: it will intend to do what it believes will achieve its goals given its beliefs about the world [21]. Our agents build a view (belief) of the world by retrieving and interpreting information, have the goal (desire) of maximizing their resources through rational trading, and engage (intend to do) in transactions based on the available information. This follows the processes of real world IMs, where human traders receive information signals and, based on their personal assessment of that information, buy or sell contracts in order to maximize their portfolios.

Agents' intentions are modeled by a betting function that describes what the participant plans to do for each possible market price $p$ and belief $b$. Considering this, our computational agents can be represented as a pair $(w, f(x, p))$ of a wealth $w$ and a betting function $f(x, p): \Omega \times [0,1]^K \to [0,1]^K$, similarly to [2]. The wealth $w$ represents the weight or importance of the participating agent in the market. The betting function is used to calculate what percentage of its wealth the participant will invest in each outcome, with respect to item $x$ and the market prices $p$. Participants can bet at most their available wealth $w$, so the betting functions must satisfy $\sum_{k=1}^{K} f^k(x, p) \le 1$ ($K$ refers to the possible outcomes). Barbu and Lay [2] have defined a set of betting functions for classification problems using markets, which we list below:

- Constant betting functions $f^k(x, p) = h^k(x)$ (the agent's bet on option $k$ depends on the item $x$ only)
- Linear betting functions $f^k(x, p) = (1 - p_k) h^k(x)$ (the agent's bet depends on the item $x$ and the market price $p$)
- Aggressive betting functions
$$f^k(x, p) = \begin{cases} h^k(x) & \text{if } p_k \le h^k(x) - 1 \\ 0 & \text{if } p_k > h^k(x) \\ h^k(x) - p_k & \text{otherwise} \end{cases}$$

In this paper we adopt the constant betting function for the recommendation problem: $f^k(x, p) = f^k(x)$. The agent decides what percentage $f^k$ of its wealth will be invested in each of the $K$ possible ratings considering the base recommender prediction, without being affected by current market prices. When a new item is presented to an agent, its task is to bet the wealth in its possession on the available rating options according to the belief generated by the base recommender it represents in the market. The investment depends on the distance between the rating option and the prediction of the base recommender as follows: $bet_{mk} = \frac{5 - |k - b|}{5} \cdot w_m$, where $k$ denotes the rating option (or possible outcome, e.g. for Netflix one of 1, 2, 3, 4, 5) and $b$ the belief generated by the base recommender. The result is that the largest portion of the wealth is placed on the rating option that is closest to the prediction of the base recommender.

3.3 Market

Our information market is a zero-sum game where the total amount of money collectively owned by the participants is conserved after each new item is presented. Thus the sum of all participants' wealth $\sum_{m=1}^{M} w_m$ should always be $M \cdot w_0$, i.e. the total wealth distributed at the beginning. In our experiments the participating agents begin with a wealth of $1/M$ (one over the number of participants), thus the total wealth is equal to one (this is an arbitrary selection; any initial state with equal wealth would fit). When the winning option becomes known, the wealth of each participant is updated as follows:

$$w'_m = \left( w_m - w_m \sum_{k=1}^{K} f^k_m(x) \right) + \frac{w_m}{p_k} f^k_m(x) \qquad (1)$$

where $k$ in the last term denotes the correct outcome. Equation (1) shows that the invested amount is subtracted from the available wealth and then the winning amount is added. The earnings depend on the correct outcome $k$, its price $p_k$ and the amount invested $w_m f^k_m(x)$. For the selection of constant betting functions, the equilibrium prices are obtained by linear aggregation as follows (the proof is provided in [2]):

$$p_k = \frac{\sum_{m=1}^{M} w_m f^k_m(x)}{\sum_{m=1}^{M} w_m} = \sum_{m=1}^{M} \alpha_m f^k_m(x) \qquad (2)$$

Furthermore, the analytic form of the agent rewards is provided by:

$$w'_m = w_m (1 - \lambda) + \lambda w_m \frac{f^y_m(x) \sum_{j=1}^{M} w_j}{\sum_{j=1}^{M} w_j f^y_j(x)} \qquad (3)$$

where $y$ denotes the correct outcome and $\lambda$ controls how much of the wealth will be affected; consequently, through $\lambda$ we affect the learning rate of the market.

3.4 Recommendation

The expected rating of a new item $i$ for user $u$ is calculated using the current wealth of the agents, which is known at the end of each round of trading:

$$\hat{R}_{u,i} = \sum_{m=1}^{M} w_m \cdot r_m \qquad (4)$$

In equation (4), $\hat{R}_{u,i}$ denotes the rating predicted by our market based blending, $w_m$ is the wealth of agent $m$ when the rating is calculated, and $r_m$ is the predicted rating of the base recommender the agent represents. While the market runs, there can be agents who dominate and accumulate the total wealth while others end up with no money. This situation is unwanted, as the ensemble has optimal performance when all agents contribute to the final prediction. Agents are greedy, since they care only for their own wellbeing. Nonetheless, the market operator maintains a 'system utility' which depends on how well the ensemble works as a team. While the team performs well, agents are rewarded; if performance drops, agents maintain the same wealth level. Note that trading never stops. The system level utility depends on the root mean square error between the ensemble prediction and the actual user rating. The market operator monitors this error, and if it is found to be decreasing across the last $N$ observations ($N$ is defined upon market setup) he considers that agents should be rewarded; otherwise they are not. Such an approach resembles gradient descent [28], where we use the RMSE to determine whether we are at a minimum or moving away from one.
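To make the mechanism of Sections 3.2-3.4 concrete, the following is a minimal sketch under our own assumptions: bets are normalized to sum to exactly one (the formulation above only requires at most one), the value of λ is illustrative, and the operator's RMSE-based reward gating is omitted:

```python
import numpy as np

K = 5       # rating options 1..5 (Netflix-style scale)
LAM = 0.1   # hypothetical learning rate λ; would be tuned per market

def constant_bets(belief):
    """Allocation over the K options for a base prediction `belief`,
    following bet_k ∝ (5 - |k - belief|)/5; normalized to sum to 1
    (our assumption -- the paper only requires the sum to be <= 1)."""
    raw = np.array([(5 - abs(k - belief)) / 5 for k in range(1, K + 1)])
    return raw / raw.sum()

def market_round(wealth, beliefs, outcome):
    """One trading round: prices via Eq. (2), wealth update via Eq. (3)."""
    F = np.array([constant_bets(b) for b in beliefs])   # M x K bet matrix
    prices = wealth @ F / wealth.sum()                  # Eq. (2)
    y = outcome - 1                                     # index of true rating
    scale = wealth.sum() / (wealth @ F[:, y])           # Σ_j w_j / Σ_j w_j f_j^y
    wealth = wealth * (1 - LAM) + LAM * wealth * F[:, y] * scale  # Eq. (3)
    return wealth, prices                               # total wealth conserved

def recommend(wealth, beliefs):
    """Wealth-weighted blend of base predictions, Eq. (4); with total
    wealth 1 the normalization below is a no-op."""
    return float(np.dot(wealth / wealth.sum(), beliefs))

# toy run: three agents start with equal wealth 1/M
wealth = np.full(3, 1 / 3)
for beliefs, true_rating in [((4.1, 3.2, 3.8), 4), ((2.0, 2.9, 3.5), 3)]:
    print("prediction:", recommend(wealth, np.array(beliefs)))
    wealth, _ = market_round(wealth, beliefs, true_rating)
```

Since Eq. (3) redistributes a λ-fraction of total wealth in proportion to the amount each agent placed on the observed outcome, repeated rounds shift wealth towards the more accurate base recommenders without any offline training step.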
4. EXPERIMENTS

We verify our approach on two well known datasets: Movielens and Netflix. We compare with the linear aggregation of base recommenders using the least squares approach, as it has been proven to provide significant improvements (see e.g. the winner of the Netflix Prize). We want to minimize the prediction Root Mean Square Error (RMSE) on our test sets:

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (\hat{R}_i - R_i)^2} \qquad (5)$$

In equation (5), $\hat{R}_i$ denotes the predicted rating and $R_i$ the actual rating of the movie. Additionally, we use top-N recommendation metrics, namely precision and recall. We sort the list of items rated by the user, based on the user rating. Subsequently we evaluate the capability of the algorithms to predict the items that are rated 4 or higher (which we arbitrarily consider as the ones the user enjoyed, in a binary rating setting). For all the recommendations provided for a user we can compute the precision and recall metrics and then compose the results in a graph.
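For clarity, a minimal sketch of these per-user metrics under the stated convention (ratings of 4 or higher count as relevant); the helper name is ours:

```python
def precision_recall_at_n(ranked_items, actual_ratings, n, threshold=4.0):
    """ranked_items: item ids sorted by predicted rating, best first.
    actual_ratings: {item id: true rating} for one user; items rated at or
    above `threshold` are treated as relevant (the binary cut-off above)."""
    relevant = {i for i, r in actual_ratings.items() if r >= threshold}
    if not relevant:
        return None  # user has no 'enjoyed' items; skip when averaging
    hits = sum(1 for i in ranked_items[:n] if i in relevant)
    return hits / n, hits / len(relevant)  # (precision, recall)
```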
4.1 Dataset Analysis

4.1.1 Movielens
The first round of our experiments was based on one of the datasets provided by the HetRec 2011 workshop (http://ir.ii.uam.es/hetrec2011/datasets.html). This dataset is an extension of the Movielens dataset and contains personal ratings, tags and tag assignments to movies. The part of the dataset used for recommender training and testing is the ratings, which range from 1 to 5 in 0.5 steps. Summary statistics can be found in Table 1. Users and ratings can be used in a straightforward way as input for collaborative filtering recommenders, but this is not true for the movie content. To apply content based recommendation on this dataset, we have made the assumption that the tags assigned by the users can be considered as documents, also taking into consideration the number of assignments. Subsequently, the documents generated for each movie were subjected to latent semantic analysis, specifically Latent Dirichlet Allocation [6], in order to extract the latent semantics of the tags. We used the Mallet [17] implementation of the LDA algorithm with k = 1000 topics and 2000 iterations. This process allowed us to infer a fixed number of topics that we could later use to characterize the movies, and also the users, in a content-based recommendation setting. To evaluate our methodology we applied 5-fold cross validation. The total ratings are split into 5 parts randomly. In each fold, four fifths are used as a
training set for the base recommenders, while the remaining ratings are used for testing. The least squares factors per fold were derived using 20% of the ratings contained in the testing set. All reported results are averages over the cross validation folds.
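The tag-to-topic step can be sketched as follows; we use gensim's LDA here purely as a stand-in for the Mallet implementation used in the experiments, and `movie_tag_docs` is a hypothetical toy input:

```python
# Sketch of the tag-to-topic step; gensim's LDA stands in for Mallet.
from gensim import corpora, models

# Each movie becomes a "document": its tags repeated by assignment count.
# Hypothetical toy input: movie id -> list of tag tokens.
movie_tag_docs = {1: ["noir", "noir", "detective"], 2: ["space", "opera"]}

docs = list(movie_tag_docs.values())
dictionary = corpora.Dictionary(docs)
corpus = [dictionary.doc2bow(d) for d in docs]
# k = 1000 topics and 2000 iterations mirror the settings reported above;
# a toy corpus this small would of course not support that many topics.
lda = models.LdaModel(corpus, num_topics=1000, id2word=dictionary,
                      iterations=2000)

# Topic mixture used to characterize each movie (and, by aggregation, users)
movie_topics = {m: lda[dictionary.doc2bow(d)] for m, d in movie_tag_docs.items()}
```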
Table 1: Movielens Dataset Analysis (HetRec Movielens dataset)
Users            2,113
Movies           10,197
Ratings          855,598
Tags             13,222
Tag Assignments  47,957

4.1.2 Netflix
The Netflix dataset was published as part of the Netflix Prize contest and consists of a training set with approximately 100 million ratings, together with a fixed hold-out set called the probe set. The dataset provides information on user ratings only; no demographics or other user-related information are included. The top performing teams focused on variations of collaborative filtering algorithms. These algorithms include k-nearest neighbour (KNN) methods (KNNitem, KNNuser), methods based on matrix factorization (SVD, AFM, SVDe), restricted Boltzmann machines (RBM), and global effects (GE); see [15] for a thorough review. Each single algorithm models the data in a different way, and a suitable combination of these predictions was proven to significantly improve the prediction performance. For our experiments we make use of the calculated predictions provided by [15] (see also http://elf-project.sourceforge.net/). In this setting, the Netflix probe set with 1.4 million ratings was randomly split into two equally sized sets called pTrain and pTest. The collaborative filtering models were trained on the pTrain set, whereas pTest is used as a hold-out set.
4.2 Results

4.2.1 Movielens
For the Movielens dataset we created three base recommenders. The first was based on collaborative filtering, using cosine similarity to build user neighbourhoods. We set the neighbourhood size to 30 (i.e. the weighted recommendations of up to the 30 most similar neighbours are used to calculate the predicted rating for an item the user has not rated). We used the Apache Mahout machine learning library (http://mahout.apache.org/) to train our model and obtain the training set recommendations. For the second recommender we employ content analysis based on latent topic analysis. The tag description of each movie is used to map the movie to latent topics. Then, user profiles are generated from the topics of the items each user rated highly in the past. The correlation between the topics found in a new movie and those in the user profile is used to predict the user rating. This is a custom implementation of a generic content based recommender system. The third recommendation model was based on a simple average rating: the predicted rating of every unrated item for user x is derived from the weighted average of the ratings the rest of the users have already provided.
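The second (content-based) recommender can be sketched as follows; this is our reading of the profile-correlation scheme described above, and the "liked" threshold and rating-scale mapping are our own illustrative choices:

```python
import numpy as np

def user_profile(rated_topic_vecs, ratings, like_threshold=3.5):
    """Average topic vector of the items the user rated highly (our reading
    of 'topics they rated highly'; the threshold value is ours)."""
    liked = [v for v, r in zip(rated_topic_vecs, ratings) if r >= like_threshold]
    return np.mean(liked, axis=0) if liked else np.mean(rated_topic_vecs, axis=0)

def predict_rating(profile, movie_topics, lo=1.0, hi=5.0):
    """Map the profile/movie topic correlation onto the rating scale."""
    c = np.corrcoef(profile, movie_topics)[0, 1]
    c = 0.0 if np.isnan(c) else c           # constant vectors -> no signal
    return lo + (hi - lo) * (c + 1) / 2     # [-1, 1] -> [1, 5]; our mapping
```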
Table 2: Movielens RMSE results
Method                                        RMSE
Collaborative Filtering                       0.8876
Content Analysis                              0.9436
Average Recommender Rating                    1.088
Linear Least Squares                          0.8758
Information Market Based Recommender Fusion   0.8797
Figure 3: RMSE evolution over time for the Movielens Dataset
Figure 2: Precision and Recall Graph for the Movielens Dataset

Once we calculated the predicted rating of each method for the training sets, we ran a linear least squares regression in order to identify the factors $\alpha, \beta, \gamma$ of our linear model $R_{LS} = \alpha \hat{R}_1 + \beta \hat{R}_2 + \gamma \hat{R}_3$, where $R_{LS}$ denotes the predicted rating from the linear least squares aggregation and $\hat{R}_i$ the prediction of recommender $i$ (collaborative filtering, content analysis, average rating). The least squares problem was solved using the Java Matrix Package (http://math.nist.gov/javanumerics/jama/) provided by the National Institute of Standards and Technology. This package uses the QR decomposition to compute the least squares minimum. The factors are given by $f = R^{-1} Q^T b$, where $Q$ is orthonormal, $R$ is upper triangular, $QR$ equals the matrix $A$ of predicted ratings, and $b$ is the vector of observations (true ratings). With our information market based approach we obtained results similar to those of linear least squares regression and better than those of the individual recommenders. Table 2 summarizes the results in terms of RMSE. Figure 2 depicts a precision/recall graph, where our approach is seen to perform very similarly to linear least squares. Figure 3 shows the evolution of RMSE as more users are evaluated.
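As an illustration of this blending step, here is a minimal numpy sketch; numpy's least-squares solver stands in for the JAMA QR-based solver, and the toy matrices are ours:

```python
import numpy as np

# A: N x 3 matrix of base predictions (CF, content, average); b: true ratings
A = np.array([[4.1, 3.6, 3.8],
              [2.2, 2.9, 3.1],
              [4.8, 4.1, 3.9],
              [3.0, 3.4, 3.3]])
b = np.array([4.0, 2.5, 5.0, 3.0])

# Solve min ||A f - b||; equivalent to the QR-based f = R^{-1} Q^T b
factors, *_ = np.linalg.lstsq(A, b, rcond=None)
blended = A @ factors   # R_LS = alpha*R1 + beta*R2 + gamma*R3
```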
4.2.2 Netflix
Table 3 lists the most useful collaborative filtering algorithms of the Netflix Prize challenge; see also [15]. These algorithms include k-nearest neighbour (KNN) methods (KNNitem, KNNuser), methods based on matrix factorization (SVD, AFM, SVDe), restricted Boltzmann machines (RBM), and global effects (GE). For KNNitem, predictions are the result of a weighted sum over the ratings of user u for the k items most similar to item i. The weights and the similarities are proportional to the correlations c_ij between item i and other items j. KNNuser is similar to KNNitem, but the weighted sum is over the ratings for item i of the k users most similar to user u.
Table 3: Netflix results as provided by [15], linear least squares aggregation, and our information market based recommender systems fusion
Method   RMSE      Method   RMSE
AFM-1    0.9362    GE-1     0.9079
AFM-2    0.9231    GE-2     0.9710
AFM-3    0.9340    GE-3     0.9443
AFM-4    0.9391    GE-4     0.9209
KNN-1    0.9110    RBM-1    0.9493
KNN-2    0.8904    RBM-2    0.9123
KNN-3    0.8970    SVD-1    0.9074
KNN-4    0.9463    SVD-2    0.9172
                   SVD-3    0.9033
                   SVD-4    0.8871
Linear Least Squares    0.8756
Information Market      0.8840
Singular Value Decomposition (SVD) methods predict ratings by calculating the dot product of a user feature vector $p_u$ and an item feature vector $q_i$; the user and item features are learned via stochastic gradient descent (a toy gradient step is sketched after this paragraph). Asymmetric factor model (AFM) methods parameterize item features only and represent users by the items they have rated [18]. A Restricted Boltzmann Machine (RBM) is a neural network with one input and one hidden layer [24]; training is performed user-wise and converges after a few tens of epochs. Global effects (GE) are based on user and item features, such as support (number of votes), mean rating, mean standard deviation, mean rating date, etc. Bell et al. originally described ten global effects in [3]. Global effects can be effective when applied to the residuals of other algorithms. The RMSE column lists the root mean square error that a single algorithm can achieve with good learning parameters, e.g. a proper learning rate and regularization constants. Each single algorithm models the data in a different way; therefore a suitable combination of these predictions can significantly improve the prediction performance. Table 3 also shows that the results of the information market based ensemble recommender are better than those of all base recommenders. Figure 4 depicts the evolution of RMSE. It should be noted that the linear least squares method outperforms our approach; however, our information market does not require a training phase.
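As an illustration of the plain SVD model mentioned above, a toy stochastic gradient step; the learning rate and regularization values are ours, not those used in [15]:

```python
import numpy as np

def sgd_step(p_u, q_i, r_ui, lr=0.005, reg=0.02):
    """One stochastic gradient step for the model r̂_ui = p_u · q_i
    (learning rate and regularization constants are illustrative)."""
    err = r_ui - p_u @ q_i
    p_old = p_u.copy()                    # update both from the old values
    p_u += lr * (err * q_i - reg * p_u)
    q_i += lr * (err * p_old - reg * q_i)
    return err
```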
Figure 4: RMSE evolution over time in Netflix
Figure 5: Agents wealth fluctuation and the performance of Linear Least Squares against the Information Market based Ensemble in a synthetic dataset
Our approach is aimed at providing an adaptive framework. In order to demonstrate this quality, we created a synthetic dataset based on the available Movielens data. We ordered the items to be predicted according to the best performing base recommender: first the items for which the CF recommender performed better than the others, then the items for which the content-based recommender performed better, and last the items for which the average rating provided the best prediction. We calculated the least squares factors using the first 20% of the synthetic dataset (this means that the least squares approach was optimized for CF, as the items for which CF provided the best results were at the top of the list). To the remaining 80% of the items we applied both our market based ensemble and the least squares approach. Figure 5 depicts the results and shows the ability of our approach to converge gradually to the optimal values for recommender aggregation. The RMSE is lower for the market based ensemble, and the wealth of the agents (see Figure 5) is distributed according to their performance. For the first 30000 ratings the CF base recommender accumulates most of the wealth, whereas the agent that represents the average rating approach becomes richer towards the end of the dataset.
5. RELATED WORK
In recent years, many approaches have been proposed for blending different sources and recommenders. An extensive survey on hybrid recommenders can be found in [9].
An example of a technique for aggregating information from heterogeneous sources in user profiles can be found in [27]. For blending recommenders, the reader can refer to some of the top teams that took part in the Netflix Prize (e.g. [32, 20, 3]). Our approach, however, is based on the demonstrated ability of markets to aggregate information, which can be useful for approaching machine learning and recommendation problems with multiple experts. The connection between markets and learning is described in [11]. The authors consider a basic 'learning from expert advice' framework, in which an algorithm makes a sequence of predictions based on the advice of a set of N experts (which can be individual features, weak learners, human advisers, or any other forecasters) and receives a corresponding sequence of losses. According to [11], any cost-function-based information market with bounded loss can be interpreted as a no-regret learning algorithm. Furthermore, that work shows the benefits of considering incentives in machine learning research. In [19], markets are used to solve a multi-classifier combination problem instead of traditional machine learning techniques. When combining classifiers in dynamic real world settings, several challenges emerge, as the relative performance of the base classifiers and the ensemble composition may change over time (i.e. base classifiers are removed, added or rendered temporarily unavailable). Methods based on offline training do not take this into consideration, as they cannot change subsequent to training and validation and cannot incorporate further learning. The theoretical underpinnings of IMs for supervised learning in classification problems are provided in [29, 2]. Additionally, IMs have recently been utilized to aggregate user preferences expressed in social media [8]. The proposed approach considers agents that process user generated content and subsequently make transactions in an IM; the market acts as an aggregator of user opinions. IMs for recommender systems have been considered in the recent work of Resnick and Sami [22], who propose an influence-limiting algorithm that can turn existing recommender systems into manipulation-resistant systems. They make use of a scoring function inspired by the widely used 'market scoring rules' [14] and assign reputation to recommendations. Reputation depends on the performance of the recommendations, and honest reporting is the optimal strategy for raters who wish to maximize their influence. Finally, in [33, 7] markets are used to shortlist recommendations in decreasing order of user perceived quality; moreover, the marketplace in [33] coordinates multiple recommendation methods. In contrast, our information markets predict user ratings in a machine learning manner.
6. CONCLUSIONS AND FURTHER WORK
In this paper we applied an information market based approach to generate a fusion of recommenders. We proposed a methodology that leads to a hybrid recommender system that combines the recommendations of multiple base recommenders. Experimental results on two datasets show that our approach consistently outperforms the base recommenders and performs similarly to least squares linear aggregation. The proposed approach does not require a training phase and provides an adaptive framework that can evolve to address possible changes in recommender predictive power.
A number of issues have not been addressed in this work and can be the subject of future research. We intend to examine the performance of other betting functions which also consider market prices. Such functions will allow our trading agents to learn from each other by monitoring price fluctuations, similarly to real life markets, where prices convey information to traders. Another open issue is evaluating the performance of our approach on more datasets, especially ones that exhibit significant fluctuations in the performance of the recommendation algorithms.
7. REFERENCES
[1] G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering, 17(6):734–749, 2005.
[2] A. Barbu and N. Lay. An introduction to artificial prediction markets for classification. ArXiv e-prints, Feb. 2011.
[3] R. M. Bell, Y. Koren, and C. Volinsky. The BellKor solution to the Netflix Prize. KorBell Team's report to Netflix, 2007.
[4] J. Bennett and S. Lanning. The Netflix Prize. In Proceedings of the KDD Cup and Workshop, in conjunction with KDD, 2007.
[5] D. Berend and J. Paroush. When is Condorcet's jury theorem valid? Social Choice and Welfare, 15(4):481–488, 1998.
[6] D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. The Journal of Machine Learning Research, 3:993–1022, 2003.
[7] S. M. Bohte, E. Gerding, and H. La Poutré. Market-based recommendation: Agents that compete for consumer attention. ACM Transactions on Internet Technology, 4:420–448, November 2004.
[8] E. Bothos, D. Apostolou, and G. Mentzas. Using social media to predict future events with agent-based markets. IEEE Intelligent Systems, 2010.
[9] R. Burke. Hybrid recommender systems: Survey and experiments. User Modeling and User-Adapted Interaction, 12(4):331–370, 2002.
[10] R. Burke. Hybrid web recommender systems. In P. Brusilovsky, A. Kobsa, and W. Nejdl, editors, The Adaptive Web, volume 4321 of Lecture Notes in Computer Science, chapter 12, pages 377–408. Springer Berlin/Heidelberg, 2007.
[11] Y. Chen and J. Vaughan. A new understanding of prediction markets via no-regret learning. In Proceedings of the 11th ACM Conference on Electronic Commerce, pages 189–198. ACM, 2010.
[12] M. Claypool, A. Gokhale, T. Miranda, P. Murnikov, D. Netes, and M. Sartin. Combining content-based and collaborative filters in an online newspaper. In Proceedings of the ACM SIGIR Workshop on Recommender Systems, 1999.
[13] E. Fama. Efficient capital markets: A review of theory and empirical work. The Journal of Finance, 25(2):383–417, 1970.
[14] R. Hanson. Combinatorial information market design. Information Systems Frontiers, 5(1):107–119, 2003.
[15] M. Jahrer, A. Töscher, and R. Legenstein. Combining predictions for accurate recommender systems. In KDD '10, pages 693–702, New York, NY, USA, 2010. ACM.
[16] G. Linden, B. Smith, and J. York. Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing, 7:76–80, January 2003.
[17] A. K. McCallum. MALLET: A machine learning for language toolkit. http://mallet.cs.umass.edu, 2002.
[18] A. Paterek. Improving regularized singular value decomposition for collaborative filtering. In Proceedings of the KDD Cup Workshop at SIGKDD '07, 13th ACM International Conference on Knowledge Discovery and Data Mining, pages 39–42, 2007.
[19] J. Perols, K. Chari, and M. Agrawal. Information market-based decision fusion. Management Science, 55(5):827–842, 2009.
[20] M. Piotte and M. Chabbert. The Pragmatic Theory solution to the Netflix Grand Prize. Netflix Prize documentation, 2009.
[21] A. S. Rao and M. P. Georgeff. BDI agents: From theory to practice. In ICMAS-95, pages 312–319, 1995.
[22] P. Resnick and R. Sami. The influence limiter: Provably manipulation-resistant recommender systems. In Proceedings of the 2007 ACM Conference on Recommender Systems, pages 25–32. ACM, 2007.
[23] L. Rokach. Taxonomy for characterizing ensemble methods in classification tasks: A review and annotated bibliography. Computational Statistics & Data Analysis, 53(12):4046–4072, 2009.
[24] R. Salakhutdinov, A. Mnih, and G. Hinton. Restricted Boltzmann machines for collaborative filtering. In ICML '07, pages 791–798, New York, NY, USA, 2007. ACM.
[25] G. Salton, A. Wong, and C. Yang. A vector space model for automatic indexing. Communications of the ACM, 18(11):613–620, 1975.
[26] B. Schwartz. The Paradox of Choice. HarperCollins, New York, 2004.
[27] A. Sieg, B. Mobasher, and R. Burke. Improving the effectiveness of collaborative recommendation with ontology-based user profiles. In HetRec '10, pages 39–46, New York, NY, USA, 2010. ACM.
[28] J. A. Snyman. Practical Mathematical Optimization: An Introduction to Basic Optimization Theory and Classical and New Gradient-Based Algorithms. Applied Optimization, Vol. 97. Springer-Verlag New York, second edition, 2005.
[29] A. Storkey. Machine learning markets. Quantitative Finance Papers 1106.4509, arXiv.org, June 2011.
[30] X. Su and T. M. Khoshgoftaar. A survey of collaborative filtering techniques. Advances in Artificial Intelligence, 2009, January 2009.
[31] J. Surowiecki. The Wisdom of Crowds. Doubleday, 2004.
[32] A. Töscher and M. Jahrer. The BigChaos solution to the Netflix Prize 2008. Netflix Prize documentation, 2008.
[33] Y. Z. Wei, L. Moreau, and N. R. Jennings. A market-based approach to recommender systems. ACM Transactions on Information Systems, 23, 2005.
[34] J. Wolfers and E. Zitzewitz. Prediction markets. Journal of Economic Perspectives, 18(2):107–126, 2004.