The Role of Trust in Collaborative Filtering

Neal Lathia, Stephen Hailes, Licia Capra
Abstract Recommender systems are amongst the most prominent and successful fruits of social computing; they harvest profiles from a community of users in order to offer individuals personalised recommendations. The notion of trust plays a central role in this process, since users are unlikely to interact with a system or respond positively to recommendations that they do not trust. However, trust is a multi-faceted concept, and has been applied to both recommender system interfaces (to explore the explainability of computed recommendations) and algorithms (to algorithmically reproduce the social activity of exchanging recommendations in an accurate and robust manner). This chapter focuses on the algorithmic aspect of trust-based recommender systems. When recommender system algorithms manipulate a set of ratings, they connect users to each other, either implicitly or by explicit trust relationships: users, in effect, become each other's recommenders. This chapter therefore describes the key characteristics of trust in a collaborative environment: subjectivity, or the ability to include asymmetric relationships between users in a system; the adaptivity of methods for generating recommendations; an awareness of the temporal nature of the system; and the robustness of the system to malicious attack. The chapter then reviews and assesses the extent to which current models exhibit or reproduce the properties of a network of trust links; we find that while particular aspects have been thoroughly examined, a large proportion of recommender system research focuses on a limited facet of trust relationships.
Neal Lathia, Stephen Hailes, Licia Capra
Department of Computer Science, University College London, London WC1E 6BT, UK
e-mail: n.lathia, s.hailes, [email protected]

1 Introduction

Recommender systems have experienced a growing presence on the web, becoming ever more accurate and powerful tools to enhance users' online experience [1]. The
motivation for their existence is two-fold: on the one hand, users need to be provided with tools to confront the growing problem of web information overload; on the other hand, users respond to and seek personalised, tailored items on the web that will cater for their individual needs. Collaborative filtering provides an automated means of building such systems, based on capturing the tastes of like-minded users, and has grown to become the dominant method of sifting through and recommending web and e-commerce content [1]. The traditional approach to collaborative filtering does not include any notion of the underlying system users, both in terms of who they are and how they implicitly interact with one another. For example, the k-Nearest Neighbour (kNN) algorithm has been widely applied in this context [2, 3]. As described in [4], kNN can be broken down into a number of steps. The first is neighbourhood formation: every user is assigned k neighbours, the k most similar other users in the system. The ratings from this neighbourhood are aggregated in the second step, to generate predicted ratings for each user. In the last step, the predictions are ordered to generate a list of recommendations. This process is repeated iteratively, and recommender systems are continuously updated in order to include the latest ratings from each user. The same method can also be applied from the item-based perspective. In this case, rather than populating user neighbourhoods with other users, each item in the system is given a set of similar items. The two approaches are exactly the same in terms of how they operate, but differ in the perspective that they take on the user-item ratings. In this chapter we adopt the language of the user-based approach (i.e. we discuss similarity between users rather than items); however, all of the algorithms reviewed can be equally applied using the item-based approach. The kNN algorithm is not the only candidate available for collaborative filtering: the bulk of research on recommender systems has been focused on improving the accuracy of collaborative filtering by exploring the many algorithms that can operate on a set of user-item ratings. As reviewed in [5], a range of classifiers have been applied to the problem; in particular, a number of classifiers are often combined to produce hybrid recommendation algorithms. The overriding factor is that collaborative filtering is a mechanism for implicitly connecting users to each other based on the opinions each member of the system has expressed. However, as we explore below, there is a tight coupling between how well users understand the process that generated the recommendations they are given and the extent that they trust them. Therefore, while accuracy remains an important metric of recommender system performance, the need to understand and explain the underlying and emergent models of user interaction in collaborative filtering gains greater traction. This result has motivated a wide range of research that aims to incorporate the idea of trust between users into collaborative filtering. Trust needs to transpire from the users’ experience with the system interface, to the algorithms that compute the recommendations and, ultimately, down to the honesty of the ratings used by the algorithms themselves.
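To make the three steps concrete, the following is a minimal sketch of user-based kNN collaborative filtering in Python, using the Pearson correlation coefficient as the similarity weight; the data layout (a dictionary of per-user rating dictionaries) and all names are illustrative assumptions rather than a reference implementation.

from math import sqrt

def pearson(ratings_a, ratings_b):
    # Similarity over the items that both users have rated.
    common = set(ratings_a) & set(ratings_b)
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den = sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common)) * \
          sqrt(sum((ratings_b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0

def neighbourhood(user, ratings, k):
    # Step 1: assign each user their k most similar other users.
    scores = [(pearson(ratings[user], ratings[other]), other)
              for other in ratings if other != user]
    return sorted(scores, reverse=True)[:k]

def predict(user, item, ratings, neighbours):
    # Step 2: aggregate the neighbourhood's ratings into a predicted rating.
    mean_u = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for sim, other in neighbours:
        if item in ratings[other] and sim > 0:
            mean_o = sum(ratings[other].values()) / len(ratings[other])
            num += sim * (ratings[other][item] - mean_o)
            den += sim
    return mean_u + (num / den) if den else mean_u

def recommend(user, ratings, k=10, n=5):
    # Step 3: rank the user's unrated items by predicted rating.
    neighbours = neighbourhood(user, ratings, k)
    unrated = {i for profile in ratings.values() for i in profile} - set(ratings[user])
    predictions = {i: predict(user, i, ratings, neighbours) for i in unrated}
    return sorted(predictions, key=predictions.get, reverse=True)[:n]

An item-based variant applies the same three steps to the transposed rating data, computing similarities between items rather than between users.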
The kNN algorithm has been subject to a number of modifications to improve the recommendations it generates; for example, [6] combines the algorithm with information about the content being recommended, and [2] augments the user-similarity computation with a significance weight, to penalise user pairs that do not share a large number of common ratings. In fact, many of the techniques that we describe below can be viewed in a similar light: they extend and modify traditional collaborative filtering in order to incorporate facets of human trust, albeit with different goals in mind. This chapter therefore begins by exploring the underlying motivations for the use of trust models, followed by a discussion of trust itself from a computational point of view. The purpose of this chapter is to review the variety of approaches that have been adopted when modeling a recommender system as interactions between trusting users.
1.1 Notation

Prior to delving into the topic of trust in recommender systems, we define the notation that will be used throughout the chapter. The notation mainly relates to the ratings that are input to collaborative filtering algorithms: a rating r_{u,i} is the value that user u input for item i. This rating may be an explicit integer value, or inferred from the user's behaviour; in either case, it represents a judgement that user u has expressed on item i. The set of ratings R_u input by user u constitutes user u's profile, which has size |R_u|; equivalently, the set of ratings input for item i is denoted R_i, and has size |R_i|. Lastly, the mean rating \bar{r}_u of user u is the average of all ratings contained in R_u.
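To make the notation concrete, a small Python sketch with invented ratings, purely for illustration:

R = {                      # R[u][i] corresponds to the rating r_{u,i}
    "alice": {"item1": 5, "item2": 3, "item3": 4},   # the profile R_alice, |R_alice| = 3
    "bob":   {"item1": 4, "item3": 2},               # the profile R_bob, |R_bob| = 2
}

def mean_rating(R, u):
    # \bar{r}_u: the average of all ratings in user u's profile.
    return sum(R[u].values()) / len(R[u])

def item_profile(R, i):
    # R_i: every rating that has been input for item i.
    return {u: profile[i] for u, profile in R.items() if i in profile}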
2 Motivating Trust in Recommender Systems

The motivations for using a notion of trust in collaborative filtering can be decomposed into three categories: (1) to accommodate the explainability required by system users, (2) to overcome the limitations and uncertainty that arise from similarity-based collaborative filtering, and (3) to improve the robustness of collaborative filtering to malicious attacks. A consequence of incorporating trust models into collaborative filtering is also often a measurable benefit in terms of prediction accuracy; however, state of the art algorithms that are tuned only for accuracy [3] do not mention trust models at all. The first motivation is centred around the system users. For example, Tintarev [7], when expanding upon what an explainable recommendation consists of, cites user-system trust as an advantageous quality of a recommender system. Similarly, Herlocker et al [8] discuss the effect that explainability has on users' perception of the recommendations that they receive, especially those recommendations that are
significantly irrelevant or disliked by the users. Chen and Pu [9] further investigate this issue by building explanation interfaces that are linked to, and based on, a formal model of trust. Although a major component of these works revolves around presenting information to the end users, they recognise that building an explainable algorithm is a key component of transparency: it converts a "black-box" recommendation engine into something that users can relate to. This is related to what Abdul-Rahman and Hailes [10] describe as context-specific user-user trust: users, who will be implicitly connected with one another by the collaborative filtering algorithm, need a mechanism to represent this interaction with each other during the process. Descriptions of trust models are based on the language of everyday human interaction, and therefore promise to fulfill this requirement.

The second motivation shifts focus from the users to the filtering algorithms themselves. These algorithms rely on user-rating data as input; however, ratings are notoriously sparse. The volume of missing data has a two-fold implication. First, new users cannot be recommended items until the system has elicited preferences from them [11]. Even when ratings are present, a user pair who may share common interests will never be cited in each other's neighbourhoods unless they share common ratings: information cannot be propagated beyond each user's neighbourhood. Secondly, computed similarity will be incomplete, uncertain, and potentially unreliable. To highlight this point, Lathia et al [12] showed that assigning a random-valued similarity between every pair of users in the MovieLens dataset produced comparable accuracy to baseline correlation methods. Why does this happen? Comparing the distribution of similarity that emerges from the set of users when different metrics are used reveals a strong disagreement between, say, the Pearson correlation coefficient and the cosine-based similarity. Even the qualitative interpretation of similarity (i.e. positive values implying high similarity, negative values showing dissimilarity) is dependent on how similarity itself is measured and not on the ratings in user profiles: a pair of profiles may shift from being highly similar to highly dissimilar when one metric is replaced by another. These observations can themselves be interpreted in two ways; on the one hand, they highlight the inefficacy of accuracy-based evaluations of recommender systems, and further evaluation metrics are required. On the other hand, they again emphasise the need to base these algorithms on methods that people can understand, in order to encourage participation and offer transparent reasons for recommendations. In other words, while collaborative filtering offers a means of implicitly connecting people, it is not evaluated or treated as such.

The last motivation moves another level down, to the ratings themselves. Since recommender systems are often deployed in an e-commerce environment, there are many parties who may be interested in trying to game the system for their benefit, using what are known as shilling or profile injection attacks [13]. Items that are recommended more will have greater visibility amongst the system's customers; equivalently, items that are negatively rated may never be recommended at all, and there may be measurable economic benefits from being able to control the recommendation process. The problem stems from the fact that collaborative filtering, when
automating the process of implicitly connecting users, operates in a near-anonymous environment. From the point of view of the ratings themselves, it is difficult to differentiate between what was input by honest users and the ratings that have been added in order to perform an attack. This last motivation deals with the case where the collaborative filtering data itself may have been moulded in order to influence the results. Trust models come to the rescue: by augmenting traditional collaborative filtering with a notion of how users interact, recommender systems can be made more robust.
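Before moving on, the disagreement between similarity metrics noted in the second motivation is easy to reproduce; in the hypothetical example below, two users who rank the same items in opposite orders appear maximally dissimilar under the Pearson correlation coefficient yet highly similar under the cosine measure.

from math import sqrt

def pearson(a, b):
    common = set(a) & set(b)
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * \
          sqrt(sum((b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0

def cosine(a, b):
    common = set(a) & set(b)
    num = sum(a[i] * b[i] for i in common)
    den = sqrt(sum(a[i] ** 2 for i in common)) * sqrt(sum(b[i] ** 2 for i in common))
    return num / den if den else 0.0

a = {"x": 5, "y": 4, "z": 3}   # invented profiles on a 1-5 rating scale
b = {"x": 3, "y": 4, "z": 5}
print(pearson(a, b))  # -1.0: maximally dissimilar under Pearson
print(cosine(a, b))   # 0.92: highly similar under the cosine measure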
3 Trust Models

The previous section highlighted an array of problems faced by collaborative filtering; what follows is a review of state of the art approaches that aim to address these issues by extending collaborative filtering with the facets of human trust. However, before we proceed, we explore trust itself: what is trust? How has it been formalised as a computational concept? Most importantly, what are its characteristics? A wide range of research [10, 14, 15] begins from sociologist Gambetta's definition of trust [16]. Gambetta states "trust (or, symmetrically, distrust) is a particular level of the subjective probability with which an agent will perform a particular action [...]." Trust is described as the level of belief established between two entities in a given context. Discussing trust as a probability paved the way for computational models of trust to be developed, as first explored by Marsh [17] and subsequently by a wide range of researchers [18]. The underlying assumption of trust models is that users' (or agents', peers', etc.) historical behaviour is representative of how they will act in the future: much like collaborative filtering, the common theme is one of learning. The differences between the two emerge from the stance they adopt toward their target scenarios; unlike collaborative filtering, trust models are often adopted as a control mechanism (by, for example, rewarding good behaviour in commerce sites with reputation credit) and are user-centred techniques that are both aware of and responsive to the particular characteristics desired of the system (such as, in the previous example, reliable online trade). Trust models have been applied to a wide range of contexts, ranging from online reputation systems (e.g. eBay.com) to dynamic networks [19] and mobile environments [20]; a survey of trust in online service provision can be found in [14]. Due to this, trust modeling and computational trust may draw strong criticism with regard to their name: it is arguable that, in many of these contexts, "trust" is a vague synonym of "reliability," "competence," "predictability," or "security." However, encapsulating these scenarios under the guise of trust emphasises the common themes that flow between them; namely, that researchers are developing mechanisms for users to operate in computational environments that mimic and reflect the way humans conduct these interactions between each other outside of the realm of information
technology. Given this view of trust, what are the high-level common characteristics that emerge from trust models? We outline four features of trust here:

1. Subjectivity: Trust relationships are, for the most part, asymmetric. In other words, if we represent the trust value of user a for user b as trust(a, b), then we may have that trust(a, b) ≠ trust(b, a).
2. Temporality: Trust model-based decisions are made by learning from previous interactions. In other words, the set of historical interactions has a direct influence on the current decisions; trust relationships are viewed as lasting through time. This particular point has led to the application of game theory in the analysis of trust, where interactions between users are viewed as a sequence of events [15].
3. Adaptivity: There is a hidden feedback loop built into trust models; current trust values affect the decisions that are made, which in turn affect the update of the trust values. Given that trust relationships will be asymmetric, different users will learn to trust at different rates (or potentially with different methods): trust-based systems therefore adapt according to the feedback provided by each user.
4. Robustness: Trust models are often built within a decision framework, the aim being to encourage interaction between user pairs that will lead to fruitful outcomes and to discourage the participation of malicious users. Trust models are therefore often deployed to secure contexts where traditional security paradigms no longer hold.

In the following sections we look at how these trust model characteristics have been implemented in recommender systems; in particular, we examine the extent to which work done to date addresses the characteristics we have outlined above.
4 Using Trust For Neighbour Selection

One of the central roles that trust modeling has served in collaborative filtering is to address the problem of neighbour selection. Traditional approaches to collaborative filtering are based on populating users' kNN neighbourhoods with others who share the highest measurable amount of similarity with them [2]. However, as described above, these methods suffer from many shortcomings, including:

• Poor explainability;
• Vulnerability to attack;
• Since they base their computations on profile intersections, it is likely that each user shares measurable similarity with only a small subset of other users;
• Similarly, users who have recently joined the system and have next to no ratings may have no neighbourhood at all; and
• Similarity metrics are symmetric and are computed on the co-rated items between user pairs, implying that high similarity is often shared with users who have very
sparse profiles, if the few ratings they have input are the same as those co-rated with the current user.

These weaknesses limit the extent to which recommendations can be propagated around a community of users. The aim of using trust for neighbour selection is to capture information that resides outside of each user's local similarity neighbourhood in a transparent, robust and accurate way. Two main approaches have been adopted: implicit methods, which aim to infer trust values between users based on item ratings, and explicit methods, which draw trust values from pre-established (or manually input) social links between users. Both methods share a common vision: the underlying relationships (whether inferred or pre-existing) can be described and reasoned upon in a web of trust, a graph where users are nodes and the links are weighted according to the extent that users trust each other. In this section, we review a range of techniques that have been applied and evaluated from both of these perspectives. Comparing these methods side by side highlights both the common traits and the differences that emerge from trust-based collaborative filtering.
4.1 Computing Implicit Trust Values

The first perspective of trust in collaborative filtering considers values that can be inferred from the rating data. In other words, a web of trust between users is built from how each user has rated the system's content. In these cases, trust is used to denote predictability and to accommodate the different ways that users interact with the recommender system; in fact, many of these measures build upon generic error measures, such as the mean absolute error. For example, Pitsilis and Marshall focus on deriving trust by measuring the uncertainty that similarity computations include [21, 22]. To do so, they quantify the uncertainty u(a, b) between users a and b, which is computed as the average absolute difference of the ratings in the intersection of the two users' profiles. The authors scale each difference by dividing it by the maximum possible rating, max(r):

u(a, b) = \frac{1}{|R_a \cap R_b|} \sum_{i \in (R_a \cap R_b)} \frac{|r_{a,i} - r_{b,i}|}{\max(r)}     (1)
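A minimal sketch of the Pitsilis and Marshall uncertainty measure (Eq. 1), assuming profiles are dictionaries of item ratings on a 1-5 scale (so max(r) = 5):

def uncertainty(profile_a, profile_b, max_rating=5.0):
    # Eq. 1: average scaled rating difference over the co-rated items.
    common = set(profile_a) & set(profile_b)
    if not common:
        return None  # undefined when the two profiles do not intersect
    return sum(abs(profile_a[i] - profile_b[i]) / max_rating for i in common) / len(common)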
The authors then use this uncertainty measure in conjunction with the Pearson correlation coefficient to quantify how much a user should believe another. In other words, trust is used to scale similarity, rather than replace it. Similarly, O’Donovan and Smyth define a trust metric based on the recommendation error generated if a single user were to predict the ratings of another [23, 24]. The authors first define a
rating's correctness, as a binary function. A rating r_{b,i} is correct relative to a target user's rating r_{a,i} if the absolute difference between the two falls below a threshold ε:

correct(r_{a,i}, r_{b,i}) \iff |r_{a,i} - r_{b,i}| \leq ε     (2)

The notion of correctness has two applications. The first is at the profile level, Trust^P: the amount of trust that user a bestows on another user b is equivalent to the proportion of times that b generates correct recommendations. Formally, if RecSet(b) represents the set of b's ratings used to generate recommendations, and CorrectSet(b) the subset of those ratings that are correct, then profile-level trust is computed as:

Trust^P(b) = \frac{|CorrectSet(b)|}{|RecSet(b)|}     (3)

The second application of Equation 2 is item-level trust Trust^I; this maps to the reputation a user carries as a good predictor for item i, and is a finer-grained form of Equation 3, as discussed in [23]. Both applications rely on an appropriate value of ε: setting it too low hinders the formation of trust, while setting it too high will give the same amount of trust to neighbours who co-rate items with the current user, regardless of how the items are rated (since correct is a binary function). Similar to Pitsilis and Marshall, this metric also operates on the intersection of user profiles, and does not consider what has not been rated when computing trust.

Lathia, Hailes and Capra approach trust inference from a similar perspective, but extend it from a binary to a continuous scale, and include ratings that fall outside of the profile intersection of a user pair [25]. To do so, rather than quantifying the correctness of a neighbour's rating, they consider the value that b's rating of item i would have provided to a's prediction, based on a's rating:

value(a, b, i) = 1 - ρ|r_{a,i} - r_{b,i}|     (4)

This equation returns 1 if the two ratings are the same, and 0 if user b has not rated item i; otherwise, its value depends on the penalising factor ρ ∈ [0, 1]. The role of the penalising factor is to moderate the extent to which large differences between input ratings are punished; even though the two ratings may diverge, they share the common feature of having been input to the system, which is nevertheless relevant in sparse environments such as collaborative filtering. A low penalising factor will therefore have the effect of populating neighbourhoods with profiles that are very similar in terms of what was rated, whereas a high penalising factor places the emphasis on how items are rated. In [25], the authors use ρ = 1/5. The trust between two users is computed as the average value that b's ratings provide to a:

trust(a, b) = \frac{1}{|R_a|} \sum_{i \in R_a} value(a, b, i)     (5)
This trust metric differs from that of O'Donovan and Smyth by being a pairwise measure, focusing on the value that user b gives to user a. Unlike the measures explored above, the value sum is divided by the size of the target user's profile, |R_a|, which is greater than or equal to the size of the pair's profile intersection, |R_a ∩ R_b|; the gap between the two grows with the number of items that a has rated and b has not. This affects the trust that can be awarded to those who have the sparsest profiles: it becomes impossible for a user who rates a lot of content to highly trust those who do not, while not preventing the inverse from happening. The three methods we have presented here are not the only proposals for trust inference between users in collaborative filtering contexts. For example, Weng et al [26] liken the web of trust structure in a collaborative filtering context to a distributed peer-to-peer network overlay and describe a model that updates trust accordingly. Hwang and Chen [27] proposed another model that again marries trust and similarity values, taking advantage of both trust propagation and local similarity neighbourhoods. Papagelis et al [28] do not differentiate between similarity and trust, defining the trust between a pair of users as the correlation their profiles share; they then apply a propagation scheme in order to extend user neighbourhoods. Many of the problems of computed trust values are akin to those of similarity; for example, it is difficult to set a neighbourhood for a new user who has not rated any items [11]. As the above work highlights, the characteristics of trust modeling allow for solutions that would not emerge from similarity-centric collaborative filtering. For example, Lathia et al extend the above measure to include a constant bootstrapping value β, which translates to initial recommendations that are based on popularity and become more personalised as the user inputs ratings. However, none of the measures take into account the potential noise in the user ratings, or the actual identity of the neighbours themselves (leading to system vulnerabilities that are explored below). All of the methods we have explored share the common theme of using the error between profiles as an indication of trust. Similarly, there is a broad literature on similarity estimation that does not adopt the language of trust modeling, such as the "horting" approach by Aggarwal et al [29] and the probabilistic approach by Blanzieri and Ricci [30]. In all of the above, each user pair is evaluated independently; the significant differences appear in how each method reflects an underlying user model of trust.
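The correctness-based and value-based measures above can be sketched as follows; epsilon and rho are the parameters discussed in the text (rho = 0.2 corresponds to the 1/5 used in [25]), and the dictionary-based profiles are an assumed data layout rather than part of the original proposals.

def correct(r_a, r_b, epsilon=1.0):
    # Eq. 2: b's rating is 'correct' if it falls within epsilon of a's rating.
    return abs(r_a - r_b) <= epsilon

def profile_trust(rating_pairs, epsilon=1.0):
    # Eq. 3: proportion of correct recommendations generated by b;
    # rating_pairs is a list of (target_rating, neighbour_rating) tuples.
    if not rating_pairs:
        return 0.0
    hits = sum(1 for r_a, r_b in rating_pairs if correct(r_a, r_b, epsilon))
    return hits / len(rating_pairs)

def value(r_a, r_b, rho=0.2):
    # Eq. 4: the value of b's rating to a's prediction; 0 if b has not rated the item.
    if r_b is None:
        return 0.0
    return 1.0 - rho * abs(r_a - r_b)

def pairwise_trust(profile_a, profile_b, rho=0.2):
    # Eq. 5: average value of b's ratings over everything that a has rated.
    if not profile_a:
        return 0.0
    return sum(value(r_a, profile_b.get(i), rho) for i, r_a in profile_a.items()) / len(profile_a)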
4.2 Extracting Explicit Trust

The alternative to computing trust values between users is to transfer pre-existing social ties to the collaborative filtering context. There are two approaches that have been followed here: on the one hand, users may be asked to explicitly select
trustworthy neighbours. On the other hand, social ties may be drawn from online social networks, where it is possible to identify each user's friends. For example, Massa and Avesani describe trust-aware recommender systems [31, 32]. In this scenario, users are asked to rate both items and other users. Doing so paves the way to the construction of a web of trust between the system users. Since users cannot rate a significant portion of the other users, the problem of sparsity remains. However, assuming that user-input trust ratings for other users are more reliable than computed values, trust propagation can be more effective. This chapter does not address the details of trust propagation; however, there are some points worth noting. Trust propagation is a highly explainable process: if a trusts b, and b trusts c, then it is likely that a will trust c. However, this transparency is obscured as the propagation extends beyond a two-hop relationship. Propagation rests on the assumption that trust is transitive, an assumption that can be challenged once the propagation extends beyond "reasonable" limits. In small-world scenarios (such as social networks), this limit is likely to be less than the famed six degrees of separation, since it is apparent that people do not trust everyone else in an entire social network. Much like similarity and computed trust, the efficacy of trust propagation is therefore dependent on the method used and the characteristics of the underlying data.

A range of other works centre their focus on the social aspect of recommendations. For example, Bonhard and Sasse [33, 34] perform a series of experiments that analyse users' perception of recommendations: they conclude that participants overwhelmingly prefer recommendations from familiar recommenders. The experiments reflect the ongoing asymmetry between algorithmic approaches to collaborative filtering, which tend to focus on predictive accuracy, and user studies that mainly consider recommender system interfaces. It is difficult to evaluate one independently of the other, and Bonhard's motivations for the use of social networks echo those used to motivate the use of trust models in Section 2: to reconcile the end users' mental model of the system and the system's model of the users. Golbeck explored the power of social networking in the FilmTrust system [35], showing that these systems produce comparable accuracy to similarity-based collaborative filtering. Research along these lines departs from pure trust-based modeling towards the Semantic Web [36] and multi-agent systems [37]. The application of social networks can also be beneficial to collaborative filtering, since relationships in the web of trust can be augmented from simple weighted links to annotated, contextual relationships (i.e. b is my sister, c is my friend). Context-aware recommender systems are a nascent research area; Adomavicius et al [38] provide a first view into this subject by looking at multidimensional rating models. Full coverage of this falls beyond the scope of this chapter; however, it is apparent how network ties can be fed into mechanisms that include who and where the users are before providing recommendations.
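Returning to the propagation step discussed above, the following sketch infers trust along paths in the web of trust by multiplying edge weights, with an explicit hop limit standing in for the "reasonable" transitivity bound; the graph, weights, and two-hop limit are hypothetical choices rather than the scheme of any particular system cited here.

def propagate_trust(web_of_trust, source, max_hops=2):
    # Infer trust from `source` to users it is not directly linked to,
    # multiplying weights along the strongest path of at most `max_hops` links.
    trust = {source: 1.0}
    frontier = {source}
    for _ in range(max_hops):
        next_frontier = set()
        for a in frontier:
            for b, weight in web_of_trust.get(a, {}).items():
                inferred = trust[a] * weight
                if inferred > trust.get(b, 0.0):
                    trust[b] = inferred
                    next_frontier.add(b)
        frontier = next_frontier
    trust.pop(source)
    return trust

# Hypothetical, asymmetric web of trust with weights in [0, 1].
web = {"a": {"b": 0.9}, "b": {"c": 0.8}, "c": {"d": 0.7}}
print(propagate_trust(web, "a"))  # {'b': 0.9, 'c': 0.72}; 'd' lies beyond two hops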
The main criticism of many of these approaches is that they require additional manual labour from the end user; in effect, they move against the fully automated view of recommender systems that collaborative filtering originally proposed. However, social networks are on the rise, and users proactively dedicate a significant portion of their time to social networking. The implementation of these methods therefore aims to harness the information that users input in order to serve them better. It is important to note that the computed and explicit methods of finding trustworthy neighbours are not in conflict; in fact, they can be implemented side by side. Both require users to rate items in order to provide recommendations, while the latter also requires social structure. Popular social networking sites, such as Facebook (http://www.facebook.com/apps/), include a plethora of applications where users are requested to rate items, making the conjunction of the two methods ever easier.
4.3 Trust-Based Collaborative Filtering

Once neighbours have been chosen, content can be filtered. However, there is a range of choices available for doing so; in this section we outline the methods implemented by the researchers discussed in the previous section. The approaches revolve around rating aggregation: taking a set of neighbour ratings for an item and predicting the user's rating. The most widely adopted is Resnick's formula [39], where the predicted rating \hat{r}_{a,i} of item i for user a is computed from the weighted ratings of each neighbour b:

\hat{r}_{a,i} = \bar{r}_a + \frac{\sum_b (r_{b,i} - \bar{r}_b) \times w_{a,b}}{\sum_b w_{a,b}}     (6)

The difference between each method lies in (a) which neighbours are selected, and (b) how the ratings from each neighbour are weighted. We decompose the methods into three strategies: trust-based filtering, trust-based weighting, and social filtering.

1. Trust-Based Filtering. In this case, neighbours are selected (filtered) using the computed trust values. The ratings they contribute are then weighted according to how similar they are to the target user.
2. Trust-Based Weighting departs fully from similarity-based collaborative filtering: neighbours are both selected and their contributions weighted according to the trust they share with the target user.
3. Social Filtering. Neighbours are selected based on the social ties they share with the target user. Ratings can then be weighted according to either their shared similarity or trust with the target user.

All of these methods assume that users use the rating scale symmetrically, i.e. if two users predict each other perfectly, then the difference (r_{a,i} − \bar{r}_a) will
be the same as (r_{b,i} − \bar{r}_b), regardless of what each user's mean rating actually is. In practice, this is not always the case: predictions often need to be adjusted to fit each user's rating scale, since users each use the scale differently. This notion was first explored in the aforementioned work by Aggarwal et al [29], who aimed to find a linear mapping between different users' ratings. However, Lathia et al [25] extend this notion to encompass what they refer to as semantic distance, by learning a non-linear mapping between user profiles based on the rating contingency table between the two profiles. The results offer accuracy benefits on the MovieLens dataset, but do not hold in all cases: translating from one rating scheme to another is thus another research area that has yet to be fully explored.

The above work further assumes that the underlying classifier is a kNN algorithm. Recent work, however, has been moving away from kNN-based recommender systems. In fact, the data derived from users telling the system whom they trust can also be input to other algorithms, such as matrix factorisation techniques [40, 41]. In these works, Ma et al describe matrix factorisation models that account for both what users rate (their preferences) and who they explicitly connect to (who they trust). While certainly beneficial to cold-start users, introducing trust data into factorisation models reignites the problem of transparency: how will users understand how their input trust values contribute to their recommendations? A potential avenue for research lies in the effect that combining trust models has on users. For example, Koren describes how a neighbourhood and a factorisation model can be combined [42], and this work may begin to bridge the chasm between the model-based and traditional kNN-based uses of trust in recommender systems. On the other hand, recent work shows that the responses of different users to the same algorithm may themselves vary. In [43], the authors describe a number of experiments that show the potential of selecting between, rather than combining, different recommender system algorithms; this relates to the broader notion of user-adaptivity that we described above.

The above trust-based methods have, for the most part, been evaluated according to traditional recommender system metrics: mean error, coverage and recommendation list precision [44]. Mean error and coverage go hand in hand: the error considers the difference between the predictions the system generates and the ratings input by users, while coverage gives insight into the proportion of user-item pairs that the system could make predictions for. There are a number of modified mean error metrics that aim to introduce a fairer representation of the users. For example, Massa uses the mean absolute user error in order to compensate for the fact that more predictions are often made for some users than for others [32]; O'Donovan and Smyth compare algorithm performance by looking at how often one is more accurate than another [23], and Lathia et al measure mean error exclusively on items that have been predicted, to measure accuracy and coverage separately [25]. However, while these have provided explicit metrics that researchers have aimed to optimise towards, accuracy is of disputed utility from the perspective of end users [45]. Examining an algorithm from the point of view of top-N recommendations provides an alternative
evaluation; rather than considering the predictions themselves, information retrieval metrics are used on a list of items sorted using the predictions. However, sorting a list of recommendations often relies on more than predicted ratings: for example, how should one sort two items that both have the highest possible rating predicted? How large should the list size N be? These kinds of design decisions affect the items that appear in top-N lists, and have motivated some to change from deterministic recommendation lists to looking at whether items are "recommendable" (i.e. their prediction falls above a predefined threshold) or not [46].
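A sketch of the trust-based weighting strategy, which plugs pairwise trust values into Equation 6 in place of similarity; the trust and rating dictionaries are assumed inputs, and the neighbourhood size is illustrative.

def predict(target, item, ratings, trust, k=10):
    # Select the k most trusted neighbours and aggregate their ratings (Eq. 6).
    mean_t = sum(ratings[target].values()) / len(ratings[target])
    neighbours = sorted(trust[target], key=trust[target].get, reverse=True)[:k]
    num = den = 0.0
    for b in neighbours:
        if item in ratings[b]:
            mean_b = sum(ratings[b].values()) / len(ratings[b])
            num += trust[target][b] * (ratings[b][item] - mean_b)
            den += trust[target][b]
    return mean_t + (num / den) if den else mean_t

Trust-based filtering would instead use the trust values only to select the neighbours, keeping similarity as the weight w_{a,b}; social filtering would draw the candidate set from the user's social ties.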
4.4 Are Current Methods Sufficient?

Given the above review of trust in collaborative filtering, both in terms of neighbourhood selection and neighbour rating aggregation, we return to the characteristics outlined in Section 3. In particular, we ask: do these algorithms address all the features required of trust models?

1. Subjectivity. This characteristic stated that trust relationships are, for the most part, asymmetric, and is addressed by the majority of the models presented above. The parallel that computed trust values have with error measures (or, their focus on the correctness of a rating with respect to another) ensures that the weight given to a relationship in the web of trust depends on which user is being considered.
2. Temporality: This factor was based on the idea that trust model-based decisions are made by learning from previous interactions. However, the perspective that the above models adopt is that all previous interactions can be captured in the rating data; they do not consider whether users had been neighbours in the past, or any previous trust values shared between them. Recommender systems are continuously updated, but none of the previous decisions that they made are fed forward toward making the best decision at the next update. Consider, for example, the manually input trust values described in Massa's trust-aware system: these are not subject to ageing, nor updated according to how useful they are. The values remain static, and responsibility for updating them is left to the user.
3. Adaptivity: The idea of adaptivity is based on the assumption that the "one size fits all" model does not hold in recommender systems; the ways that users think about and rate items, and respond to recommendations, vary from user to user. However, all of the models presented above assume that the same model is applicable to all users: the one-size-fits-all vision returns.
4. Robustness: The last quality relates to the ability that users have to game the system's recommendations. This topic is covered more extensively in Section 6; however, as explored by O'Donovan and Smyth [47], implementations of trust may fend off certain shilling attacks, but pave the way for others to be adopted.

Based on the definition in Section 3, we find that much of the state of the art addresses only one of the four facets of trust models, subjectivity. In the following
sections, we review further work from related fields that may fill this void and complete the trust-based vision of collaborative filtering.
5 Temporal Analysis of a Web of Trust

As seen above, trust models have primarily focused on creating the web of trust between users; we can equivalently say that the role of trust has been to build a graph that connects the system users (or items) to each other. Thinking of a recommender system in this way has led to a growing body of work that analyses and makes use of the graph when recommending items. For example, Zanin et al [48] applied the analytic techniques used in complex network analysis to collaborative filtering graphs. Celma and Cano [49] used the same graph-based structure to highlight the bias that popular artists introduce in music recommendation and discovery, and proposed a method to navigate users towards niche content based on graph traversal. Similar work by Zhou et al operates on citation graphs, and explores supervised techniques to exploit the graph structure [50]. Mirza et al [51] examine the graphs of movie rating datasets in order to evaluate the ability that different collaborative filtering algorithms have to connect users. All of these proposals begin by drawing the comparison between recommender system graphs and real-world networks; moreover, they assume that the computed graph structure is a representative depiction of the links between content, and then make their proposals based on this. However, one of the aspects of recommender system graph analysis that is not considered is the importance of time.

Time can play three rather different roles in a recommender system; the most notable recent application of temporal features is using when people rate items in order to better predict how they will rate them [52]. A temporal view of recommender systems also suggests the idea of recommender systems devoted to developing, or incrementally changing, people's tastes. These systems could reflect and respond to growing knowledge of their users in the domain of items being recommended; an apparent application is in recommender systems for education. The last view of time relates back to recommender system graphs: as time passes and people continue rating items, the recommender system will be updated and (in the case of computed trust relationships) the web of trust will be recreated. In the following sections we review work that analyses how these graphs evolve over time; in particular, we compare and highlight the differences between how the two types of graph evolve.
5.1 Explicit Trust Networks

The structure and evolution of explicit social networks have been extensively analysed; for example, Kumar et al [53] looked at how two large Yahoo! social network
datasets changed over time. They identify three components in the graph: singletons, who join the system and never connect to other users, the giant component of connected users, and a middle region of isolated communities. In particular, they note that the majority of the relationships are reciprocal, and a significant fraction of the community is found in the middle region. Interestingly, many of the isolated communities in the middle region form star-shaped graphs; they consist of a single user who connects to a range of singletons (who may then only connect back to this central user). The networks display an initial rapid growth, followed by a period of steep decline, before settling into a pattern of steady growth; the authors postulate that this comes as a consequence of both early adopters and the following majority of users joining the system. Collaborative filtering systems that rely on explicit trust are likely to demonstrate similar patterns, especially if they are purely based on users' social networks. This structure highlights the weakness of explicit trust: it is difficult to propagate information across a community of users if the graph is largely disconnected and a large proportion of users are singletons or fall into isolated communities.
5.2 Computed Web of Trust

Lathia et al [54] performed a similar analysis on rating data where the links between users are computed and iteratively updated as the rating dataset grows. The main differences between this analysis and that of Kumar et al emerge from the nature of the kNN algorithm: each user's out-degree is bounded by the k parameter (hence the out-degree distribution is constant), and users are not guaranteed to link to others to whom they previously linked (i.e. links are not persistent). The authors decompose their analysis into a number of groups, considering single nodes, node pairs, node neighbourhoods, and finally the entire graph, comparing results across a range of similarity measures. The differences between computed and explicit trust are highlighted if their results are compared to those of Kumar et al. Each analysis displays different growth patterns; in particular, the computed graphs do not exhibit the same early adopter-late majority behaviour. The link weight (i.e. computed trust or similarity measure, depending on the algorithm details) between pairs of users is not constant. In fact, the way that the weights evolve over time leads to a classification of similarity measures: some are incremental, displaying small variation over time; others are corrective, initially awarding high similarity to neighbours and then incrementally reducing the shared value; and the last set are near-random, as there is no pattern governing the subsequent similarity shared between a pair of users, given their current value. This leads to an ever-changing ranking of neighbours, and each user's neighbourhood is therefore also continuously subject to change. The equivalent of this result in Kumar et al's work would have been observing very quick churn in the isolated communities, which they do not report. The entire computed graph also differs from the explicit social network: once the k parameter is greater than one, the graph is fully connected. Intuitively, as k is increased the graph will tend
toward becoming a clique; the only factors impeding each user from being linked to everyone else are the k parameter and the requirement that user pairs co-rate a number of items in order to have a non-zero trust (and similarity) value. Unlike explicit social networks, the computed graph has a very short average path length, a small diameter, and a much smaller level of reciprocity: collaborative filtering based on implicit connectivity between users therefore generates a structure that seems to favour recommendation propagation more than social networks do. However, Lathia et al also examine the in-degree distribution of users. Assuming that a node points to another if the latter is in the former's top-k, the in-degree distribution shows the "fairness" that collaborative filtering exhibits when assigning users to each other's neighbourhoods. The authors found that this distribution has a very long tail. There is a distinct bias in collaborative filtering: some users are highly cited in others' neighbourhoods, while others are never selected for this role. These so-called power users, whose in-degree falls at the head of the distribution, will exert a high amount of influence on others' recommendations. In the case of the datasets explored in [54], this subset of users contributes positively to the system's accuracy and coverage; however, it also leads to a strong vulnerability of collaborative filtering, as will be explored in the next section.
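The long-tailed in-degree distribution described above can be measured with a short sketch; `similarity` stands for any precomputed trust or similarity lookup, and the value of k is illustrative.

from collections import Counter

def knn_in_degrees(users, similarity, k=10):
    # Build the directed top-k graph and count how often each user is chosen
    # as someone else's neighbour; power users sit at the head of this count.
    in_degree = Counter()
    for u in users:
        ranked = sorted((v for v in users if v != u),
                        key=lambda v: similarity(u, v), reverse=True)
        for v in ranked[:k]:
            in_degree[v] += 1
    return in_degree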
5.3 Temporal Collaborative Filtering

The above works focus on analysing the temporal structure of the graphs that recommendation algorithms can operate upon. The flipside of this work is to explore the temporal performance of the same algorithms. The temporal performance of computed trust is considered in [55]; rather than performing the traditional partitioning of the data into training and test sets, the authors view collaborative filtering data as a growing set that is continuously required to predict future ratings. In other words, given the current dataset, how well does the algorithm predict next week's ratings? Then, after the subsequent week's ratings have been added to the dataset, how well does it predict the following week's? The results that the authors report again highlight the difficulty of accuracy-based evaluations: the relative performance of different collaborative filtering algorithms changes based on the current state of the dataset. In particular, different neighbourhood sizes (k values) offer varying levels of accuracy to each of the system's users. This observation leads to the proposal of a system where the k value is adapted on a per-user basis, by modifying it according to the temporal performance measured to date. There is a need for similar work to be done on the temporal performance of explicit, social network-based recommendations; in particular, it would be interesting to observe and measure the performance of a system as the recommendation algorithm is itself changed, to be able to quantify the short- and long-term benefits of changing filtering algorithms.

Once collaborative filtering is cast onto the temporal scale, the evaluation metrics that are used to measure performance need to be modified as well. While changes in
the structure of the graph can be measured in terms of neighbourhood convergence, accuracy is measured by transforming mean error metrics to include time. For example, the root mean squared error (RMSE) can be converted to the time-averaged RMSE:

RMSE_t = \sqrt{\frac{\sum_{\hat{r}_{u,i} \in R_t} (\hat{r}_{u,i} - r_{u,i})^2}{|R_t|}}     (7)

In other words, the RMSE achieved at time t is the root mean squared error over all predictions made to date. Similar transformations can be used to adapt coverage and recommendation list precision in order to evaluate the temporal performance of collaborative filtering on dynamic data.
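A sketch of the time-averaged RMSE of Equation 7, assuming the predictions made up to time t are available as (predicted, actual) pairs:

from math import sqrt

def time_averaged_rmse(predictions_to_date):
    # Eq. 7: RMSE over every prediction made up to (and including) time t.
    if not predictions_to_date:
        return 0.0
    squared = sum((predicted - actual) ** 2 for predicted, actual in predictions_to_date)
    return sqrt(squared / len(predictions_to_date))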
6 Building Robust Recommender Systems With Trust

As introduced in Section 1, the metaphor of trust needs to transpire from the user experience with the system, to the algorithm modeling interaction between users, and down to the ratings themselves. Trust in the rating data is motivated by the uncertainty regarding the intentions behind why users rate items (or other users) the way they do. In an ideal world, each user would be expressing an independent, personal opinion of each item. However, it is more likely that other interests and factors pervade the rating process [44]; furthermore, as explored above, the structure of the web of trust implies that different users exert varying levels of influence [56] on others, and some users may be interested in gaining as much influence as possible.

Research on attacks in collaborative filtering must make two a priori decisions. First, what constitutes an attack? This involves suitably formalising attack models in collaborative filtering. Researchers often assume that attacks are executed by a set of user profiles; a single profile with malicious ratings is not considered. More important, however, is the second: how is the success of an attack measured? Answering this question requires appropriate metrics that reflect whether attackers have achieved the purpose they intended. In this section, we briefly introduce attack models and evaluation metrics in order to discuss the contributions that trust research has made to the area.
6.1 Attack Models

The first aspect of collaborative filtering vulnerability research models the attacks themselves. Examples of these attacks, as mentioned by Lam and Riedl [57], include
buying reputation credit on eBay and companies using fake ratings to promote recent product releases. Attacks are often viewed as pertaining to one of three groups [58]:

• Push Attack: Where the aim is to promote an item until it is recommended where it otherwise would not be;
• Nuke Attack: Which aims to demote an item so that it is never recommended;
• System Attack: Where the aim is to reduce the entire system's performance in order to attack the trust users have in the recommendations it produces.

At a high level, all attacks are performed by inserting a number of user profiles into the system; each profile rates some items following a pre-defined strategy. They also rate the target item in accordance with the attacker's intentions (i.e. high ratings for a push attack, low ratings for a nuke attack). A variety of strategies have been implemented to rate the non-target items [59]; these include random attacks, where a subset of the items are rated around the overall mean rating; average attacks, where items are rated according to the individual items' means; or bandwagon attacks, where subsets of items are rated randomly and popular items are rated highly. Finer-grained differences between each attack can be drawn from the amount of knowledge used by attackers to achieve their goals. For example, attackers may or may not have access to summary statistics about the items (including their mean rating and rating standard deviation); this differentiates between high- and low-knowledge attacks.

Mobasher et al provide a formal definition of collaborative filtering attacks [13]. An attack model is a 4-tuple that consists of:

• A Choice Function. Given a target item i, the set of all items I and users U, the choice function partitions the set I into three subsets: the set of selected items I_S, the set of filler items I_F, and the remaining set of unrated items I_∅.
• Filler and Selected Item Mapping Functions that will be used to generate ratings for the items in I_S and I_F. These may be the same function (in the case of random attacks), or may be different in order to modify the similarity between items. As described in [13], a segment attack does the latter, and can be envisaged in scenarios where attackers are interested in binding the target item to other content through similarity (for example, making a fantasy book highly similar to Harry Potter).
• A Target Rating Function that rates item i according to the attacker's intention.

The process of performing an attack involves inserting these generated profiles into the system; once they are there, the standard collaborative filtering algorithm (which is not aware of the malicious data) will automatically skew the recommendations for the target items. The difficulty of counteracting these attacks stems from the fact that it is difficult to differentiate between honest ratings (which perhaps express seemingly contradictory opinions) and the ratings of malicious profiles [60]. It is also difficult to differentiate between instances where attacks have taken place and where highly inaccurate recommendations have been made. Therefore, a primary concern of research in this field has been on how to quantify the effect of an attack.
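As an illustration of the model, the sketch below generates a single shill profile for an average-knowledge push attack; the 1-5 rating scale, filler size, and use of per-item means are assumptions made for the example rather than part of the formal definition.

import random

def average_push_profile(target_item, item_means, filler_size=50, push_rating=5):
    # Filler items are rated around their individual means (average attack),
    # while the target item receives the maximum rating (push attack).
    candidates = [i for i in item_means if i != target_item]
    fillers = random.sample(candidates, min(filler_size, len(candidates)))
    profile = {i: max(1, min(5, round(random.gauss(item_means[i], 1.0))))
               for i in fillers}
    profile[target_item] = push_rating
    return profile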
6.2 Measuring Attacks: Success or Failure?

A key component of a robust system is the ability to detect when an attack is taking place. Similarly, researchers require methods to measure both the vulnerability that algorithms have to attack and the success that different attack strategies have against collaborative filtering algorithms. While traditional collaborative filtering evaluation methods tend to focus on mean error, or the distance between the ratings that users input and the predictions produced by the system, attack models are evaluated according to the change in performance produced by an attack, or the difference between predicted ratings when attack profiles are present or not (they therefore do not focus on the "true" rating input by the user [60]). O'Mahony [58] defines two suitable metrics: a system's robustness, which compares system performance pre- and post-profile injection, and stability, which focuses on the skew in the target items' predictions induced by the fake profiles.

Although the attack models described above aim to change the prediction for an item, the attack is unlikely to have much effect (from the users' perspective) if the produced recommendations do not change; similarly, an attack is less effective if it changes the position of a low-ranking item by a small amount. In other words, changing the top 10 recommendations users see is more likely to influence them than changing the 90th-100th recommendations. It therefore also becomes important to measure the likelihood that an item appears in a user's top-N recommendations, which motivates the use of the hit ratio. The hit ratio is based on a proportion of binary values: an item's score is incremented by 1 each time it appears in a user's recommendation list. However, the problem with the hit ratio metric is that it relies on a notion of a top-N list, as explored above; it therefore also inherits the same difficulties that top-N evaluation suffers from. Chirita et al [61] propose a number of metrics that can complement algorithms in order to identify when an attack is taking place. These include: the number of predictions that a single user profile is involved in, the user's rating standard deviation, the degree of agreement with others, and the degree of similarity with top-k neighbours. In fact, they highlight how automated attacks will display consistent patterns, and can be identified accordingly (although some non-shill profiles will also be identified as false positives).
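The stability and hit-ratio measurements can be sketched as follows; `predict_before` and `predict_after` stand for the system's prediction function without and with the injected profiles, and `recommend` for its top-N list generator, all assumed to be supplied.

def prediction_shift(users, item, predict_before, predict_after):
    # Average change in the target item's predicted rating caused by the attack.
    shifts = [predict_after(u, item) - predict_before(u, item) for u in users]
    return sum(shifts) / len(shifts) if shifts else 0.0

def hit_ratio(users, item, recommend, n=10):
    # Proportion of users whose top-n recommendation list contains the target item.
    hits = sum(1 for u in users if item in recommend(u, n))
    return hits / len(users) if users else 0.0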
6.3 Robustness of Trust-Based Collaborative Filtering

Based on these metrics, we explore the extent to which trust-based collaborative filtering manages to produce attack-resistant recommendations. Lam and Riedl [57] show that collaborative filtering algorithms differ in their ability to fend off attacks; in particular, they show that the most effective push attack on an item-based kNN algorithm produces the same results as the least effective attack on user-based kNN. They discuss the results of a number of experiments using the MovieLens
dataset: they find it easier to push (rather than nuke) items in user-based kNN, and that new items are the most vulnerable to manipulation. Trust modeling and attack resistance are closely related: both have attempted to improve the way collaborative filtering selects neighbours for users. Mehta and Hofmann review the robustness of a variety of non-kNN collaborative filtering methods [59]; in this section we focus on the robustness of the above trust models to attack. O'Donovan and Smyth discuss the reinforcement problem as a primary vulnerability of their model (described in Section 4.1): if a large number of injected profiles generate "correct" ratings (Eq. 2) for each other, they will reinforce the trust values endowed on them. The model proposed by Lathia et al [25], which is based on pairwise computations, does not have this ripple effect. However, any trust computation that is based on measuring error between profiles is intuitively subject to attack, and is an insufficient defence mechanism if implemented alone.

The explicit trust-based methods, described in Section 4.2, do not suffer from these same vulnerabilities. In fact, injected profiles would have little effect in such a system unless the profiles managed to befriend (or be trusted by) a number of users. On the one hand, this lessens the vulnerability of collaborative filtering; on the other hand, it does so by excluding all honest raters who are not directly trusted. Dell'Amico and Capra proposed a scheme that blends both explicit and implicit methods [62]. To do so, they formalise relationships in a web of trust as being composed of two distinct, but equally important, characteristics: intent, or the willingness to provide honest ratings, and competence, the ability to provide correct ratings. An injected profile may therefore seemingly satisfy the competence requirement (by being appropriately populated with filler ratings) but will be excluded if its intent cannot be determined. Intent is measured by traversing the web of trust, taking advantage of the structure of social networks and the role that injected profiles play in them [63].

A potential solution to this problem, approached from the perspective of generating recommendations using identifiable raters, was explored by Amatriain et al [46]. In contexts where expert ratings are widely available (such as movies, since a wide range of media publishes reviews that include ratings), the effect of profile injection attacks can be nullified by only using the expert opinions when computing users' recommendations. This translates the collaborative filtering process from one of identifying the best k neighbours for each user (within a dataset or user community) to matching each user with the best k experts (across different datasets). The solution is therefore centred around finding experts whom each user trusts. The work in [46] evaluates the potential of using a dataset where members' identities are known (since their ratings are crawled from their reviews in published media) to predict and produce useful recommendations for a much larger set of anonymous users; the work highlights that while the experts are not as accurate as traditional nearest neighbour filtering, they nevertheless provide recommendations that score highly in terms of recommendation list precision and are favoured by a large sample
All of the above work views collaborative filtering attacks from the perspective adopted in traditional evaluations, i.e. attackers insert profiles, gain influence, and modify predictions in a single step. However, it is unclear both whether the above models are sufficient and whether it may not be easier for attackers to adopt different strategies. For example, large groups of users often proactively rate items they have strong feelings against, such as political figures during an ongoing election (see R. Burke, Robust Recommendation, ACM RecSys '08 Tutorial). Groups of users rallying to nuke an item have the same effect as a profile injection attack, except that the attack has not been initiated by fake profiles: removing their ratings from the system is likely to delete a large volume of valid information in order to revert the effects of manually nuking a controversial item, and traditional defense methods no longer seem appropriate.
7 Conclusions

This chapter began by describing the three-fold motivation for the use of trust models in collaborative filtering: trust models help to transform a "black-box" into a transparent, explainable recommender system; they help users select neighbours in an intelligent way and reach others who are not in their local neighbourhood; and they can address the vulnerability of collaborative filtering to malicious attack. Reviewing the trust model literature highlights the range of contexts where trust is deployed, and the similarities that arise between them. Trust models are used to construct, maintain, and moderate interaction between a set of independently operating users, agents, or peers. It is therefore possible to summarise the characteristics of trust models; in Section 3 we described subjectivity, temporality, adaptivity, and robustness.

The remainder of the chapter examined the extent to which the use of trust in collaborative filtering matches the requirements of a complete trust model. The majority of the work performed to date seeks to improve neighbour selection by reasoning about trust, which can either be computed, derived from the rating data, or explicit, drawn from users' social networks. Both techniques can make use of trust propagation to extend the reach of users' neighbourhoods, and can be implemented side by side to draw on the benefits of each method. Trust modeling in collaborative filtering has tended to centre on the kNN approach; however, recent contributions are beginning to explore how to reason about datasets of trust values with other classifiers.
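As a concrete illustration of this neighbour-selection theme, the sketch below blends a similarity weight with a trust score (however that score was obtained, whether computed from rating errors or read from an explicit web of trust) when choosing and weighting neighbours in a Resnick-style prediction. The linear blend and the parameter alpha are assumptions made for the example, not the formulation of any one model reviewed in this chapter; setting alpha to 1 recovers plain similarity-weighted kNN.

```python
def trust_weighted_prediction(user_mean, neighbours, item, alpha=0.5, k=20):
    """Weighted prediction where neighbours are ranked by a blend of
    similarity and trust (illustrative; alpha is an assumed blending knob).

    `neighbours` is a list of (similarity, trust, mean_rating, ratings)
    tuples; `ratings` maps items to that neighbour's ratings.
    """
    # Blend similarity and trust, keeping only the k best raters of `item`.
    scored = [(alpha * sim + (1 - alpha) * trust, mean, ratings)
              for sim, trust, mean, ratings in neighbours if item in ratings]
    top_k = sorted(scored, key=lambda t: t[0], reverse=True)[:k]

    # Mean-centred weighted average, in the style of classic user-based kNN.
    num = sum(w * (ratings[item] - mean) for w, mean, ratings in top_k)
    den = sum(abs(w) for w, _, _ in top_k)
    return user_mean + num / den if den else user_mean
```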
As discussed in Section 4.3, trust-based neighbourhood selection only addresses one of the four identified features of trust (subjectivity, adaptivity, temporality, robustness); it does not examine the temporal aspect of trust relationships, has been implemented as a non-adaptive "one size fits all" trust model, and remains vulnerable to attack. To gain insight into these characteristics, the chapter turned to work in social network analysis, comparing the evolution of explicit social networks with that of computed webs of trust in Section 5, and discussing the opportunity that temporal changes offer for applying adaptive, user-centric techniques to collaborative filtering. Lastly, Section 6 formalised the set of attacks that have been studied in the literature and reviewed the extent to which trust models address these weaknesses.

Trust has also been applied to recommender systems that operate in a distributed setting [64]. In fact, as recommender systems are ported from centralised web environments to distributed and mobile contexts, they inherit and intensify the pre-existing problems faced by collaborative filtering, adding uncertainty about both the provenance and the trustworthiness of ratings as they are exchanged between peers. This context also provides a useful framework for extending the notion of trust. In this chapter, we only considered "positive" trust: users who explicitly tell the system that they trust another user, or computed trust values that determine who the most trustworthy neighbours are. There is certainly space for extending this in the future, to include, for example, distrust [65].

The striking aspect of trust model research related to collaborative filtering is the independence of the goals set by each method: trust models have been implemented and evaluated to address a single characteristic of trust, while the characteristics seem intuitively correlated. For example, building a robust algorithm assumes that users' neighbourhoods should not be veered toward injected, noisy profiles and should be able to adapt quickly to the changing data environment when an attack is mounted: research into the utility of trust model application thus leaves a broad set of unanswered questions that beg for attention.
References

1. J. Schafer, J. Konstan, and J. Riedl. E-commerce recommendation applications. Data Mining and Knowledge Discovery, 5:115–153, 2001.
2. J. L. Herlocker, J. A. Konstan, A. Borchers, and J. Riedl. An Algorithmic Framework for Performing Collaborative Filtering. In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 230–237, 1999.
3. R. Bell and Y. Koren. Scalable collaborative filtering with jointly derived neighborhood interpolation weights. In IEEE International Conference on Data Mining (ICDM '07). IEEE, 2007.
4. N. Lathia. Chapter 2: Computing Recommendations With Collaborative Filtering. In Collaborative and Social Information Retrieval and Access: Techniques for Improved User Modeling, September 2008.
5. G. Adomavicius and A. Tuzhilin. Towards the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions. IEEE Transactions on Knowledge and Data Engineering, 17(6), June 2005.
6. P. Melville, R. J. Mooney, and R. Nagarajan. Content-boosted collaborative filtering for improved recommendations. In Proceedings of the Eighteenth National Conference on Artificial Intelligence (AAAI), pages 187–192, July 2002.
7. N. Tintarev and J. Masthoff. Effective explanations of recommendations: User-centered design. In Proceedings of Recommender Systems (RecSys '07), Minneapolis, USA, 2007.
8. J. Herlocker, J. A. Konstan, and J. Riedl. Explaining collaborative filtering recommendations. In Proceedings of the ACM 2000 Conference on Computer Supported Cooperative Work, 2000.
9. P. Pu and L. Chen. Trust building with explanation interfaces. In Proceedings of the 2006 International Conference on Intelligent User Interfaces. ACM, 2006.
10. A. Abdul-Rahman and S. Hailes. A distributed trust model. New Security Paradigms, pages 48–60, 1997.
11. A. M. Rashid, I. Albert, D. Cosley, S. K. Lam, J. A. Konstan, and J. Riedl. Getting to know you: Learning new user preferences in recommender systems. In ACM International Conference on Intelligent User Interfaces, 2002.
12. N. Lathia, S. Hailes, and L. Capra. The effect of correlation coefficients on communities of recommenders. In ACM Symposium on Applied Computing (TRECK track), Fortaleza, Brazil, March 2008.
13. B. Mobasher, R. Burke, R. Bhaumik, and C. Williams. Toward trustworthy recommender systems: An analysis of attack models and algorithm robustness. ACM Transactions on Internet Technology, 2007.
14. A. Josang, R. Ismail, and C. Boyd. A survey of trust and reputation systems for online service provision. Decision Support Systems, 43(2):618–644, 2007.
15. M. Ahmed and S. Hailes. A game theoretic analysis of the utility of reputation management. Technical Report, Department of Computer Science, University College London, January 2008.
16. D. Gambetta. Can We Trust Trust? In Trust: Making and Breaking Cooperative Relations, pages 213–238, 1990.
17. S. Marsh. Formalising trust as a computational concept. PhD Thesis, Department of Mathematics and Computer Science, University of Stirling, UK, 1994.
18. J. Golbeck, editor. Computing With Social Trust. Springer, 2008.
19. M. Carbone, M. Nielsen, and V. Sassone. A formal model for trust in dynamic networks. In International Conference on Software Engineering and Formal Methods (SEFM), pages 54–63, Brisbane, Australia, September 2003.
20. D. Quercia, S. Hailes, and L. Capra. B-trust: Bayesian trust framework for pervasive computing. In Proceedings of the 4th International Conference on Trust Management, LNCS, pages 298–312, Pisa, Italy, May 2006.
21. G. Pitsilis and L. Marshall. A model of trust derivation from evidence for use in recommendation systems. Technical Report CS-TR-874, University of Newcastle Upon Tyne, 2004.
22. G. Pitsilis and L. Marshall. Trust as a Key to Improving Recommendation Systems. In Trust Management, pages 210–223. Springer Berlin / Heidelberg, 2005.
23. J. O'Donovan and B. Smyth. Trust in recommender systems. In IUI '05: Proceedings of the 10th International Conference on Intelligent User Interfaces, pages 167–174. ACM Press, 2005.
24. J. O'Donovan and B. Smyth. Eliciting trust values from recommendation errors. International Journal of Artificial Intelligence Tools, 2006.
25. N. Lathia, S. Hailes, and L. Capra. Trust-based collaborative filtering. In Joint iTrust and PST Conferences on Privacy, Trust Management and Security (IFIPTM), Trondheim, Norway, 2008.
26. J. Weng, C. Miao, and A. Goh. Improving Collaborative Filtering With Trust-Based Metrics. In Proceedings of the 2006 ACM Symposium on Applied Computing, pages 1860–1864, Dijon, France, 2006.
27. C-S. Hwang and Y-P. Chen. Using Trust in Collaborative Filtering Recommendation. In New Trends in Applied Artificial Intelligence, pages 1052–1060, 2007.
28. M. Papagelis, D. Plexousakis, and T. Kutsuras. Alleviating the Sparsity Problem of Collaborative Filtering Using Trust Inferences. In Proceedings of the 3rd International Conference on Trust Management (iTrust), 2005.
29. C. Aggarwal, J. L. Wolf, K. Wu, and P. S. Yu. Horting Hatches an Egg: A New Graph-Theoretic Approach to Collaborative Filtering. In Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, California, USA, 1999.
30. E. Blanzieri and F. Ricci. A Minimum Risk Metric for Nearest Neighbour Classification. In Sixteenth International Conference on Machine Learning, June 1999.
31. P. Massa and B. Bhattacharjee. Using trust in recommender systems: An experimental analysis. In iTrust International Conference, 2004.
32. P. Massa and P. Avesani. Trust-aware recommender systems. In Proceedings of Recommender Systems (RecSys), Minneapolis, USA, October 2007.
33. P. Bonhard. Improving recommender systems with social networking. In Addendum of CSCW, Chicago, USA, November 2004.
34. P. Bonhard, M. A. Sasse, and C. Harries. The Devil You Know Knows Best: The Case for Integrating Recommender and Social Networking Functionality. In Proceedings of the 21st British HCI Group Annual Conference, Lancaster, UK, September 2007.
35. J. Golbeck. Generating Predictive Movie Recommendations from Trust in Social Networks. In Proceedings of the Fourth International Conference on Trust Management, Pisa, Italy, May 2006.
36. P. Bedi, H. Kaur, and S. Marwaha. Trust based recommender system for the semantic web. In International Joint Conference on Artificial Intelligence, Hyderabad, India, 2007.
37. F. E. Walter, S. Battiston, and F. Schweitzer. A model of a trust-based recommendation system on a social network. Autonomous Agents and Multi-Agent Systems, 2007.
38. G. Adomavicius, R. Sankaranarayanan, S. Sen, and A. Tuzhilin. Incorporating Contextual Information in Recommender Systems Using a Multidimensional Approach. ACM Transactions on Information Systems, 23(1), January 2005.
39. P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl. GroupLens: An Open Architecture for Collaborative Filtering of Netnews. In CSCW '94: Conference on Computer Supported Cooperative Work, pages 175–186, Chapel Hill, 1994.
40. H. Ma, H. Yang, M. R. Lyu, and I. King. SoRec: Social Recommendation Using Probabilistic Matrix Factorization. In ACM Conference on Information and Knowledge Management (CIKM), Napa Valley, California, USA, October 2008.
41. H. Ma, I. King, and M. R. Lyu. Learning to Recommend With Social Trust Ensemble. In ACM Conference on Research and Development in Information Retrieval (SIGIR), Boston, MA, USA, July 2009.
42. Y. Koren. Factorization meets the neighborhood: A multifaceted collaborative filtering model. In ACM SIGKDD Conference, 2008.
43. N. Lathia, X. Amatriain, and J. M. Pujol. Collaborative Filtering With Adaptive Information Sources. In IJCAI Workshop on Intelligent Techniques for Web Personalization and Recommender Systems, Pasadena, California, USA, July 2009.
44. J. Herlocker, J. Konstan, L. Terveen, and J. Riedl. Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems, 22:5–53, 2004.
45. S. M. McNee, J. Riedl, and J. A. Konstan. Being accurate is not enough: How accuracy metrics have hurt recommender systems. In Extended Abstracts of the 2006 ACM Conference on Human Factors in Computing Systems. ACM Press, 2006.
46. X. Amatriain, N. Lathia, J. M. Pujol, N. Oliver, and H. Kwak. The wisdom of the few: Mining the web to uncover expert recommendations for the crowds. In ACM SIGIR, Boston, MA, USA, July 2009.
47. J. O'Donovan and B. Smyth. Is Trust Robust? An Analysis of Trust-Based Recommendation. In Proceedings of the 11th International Conference on Intelligent User Interfaces, Sydney, Australia, 2006.
48. M. Zanin, P. Cano, J. M. Buldu, and O. Celma. Complex Networks in Recommendation Systems. In Proceedings of the 2nd WSEAS International Conference on Computer Engineering and Applications, Acapulco, Mexico, January 2008.
49. O. Celma and P. Cano. From hits to niches? Or how popular artists can bias music recommendation and discovery. In 2nd Workshop on Large-Scale Recommender Systems and the Netflix Prize Competition (ACM KDD), Las Vegas, USA, August 2008. ACM.
50. D. Zhou, S. Zhu, K. Yu, X. Song, B. L. Tseng, H. Zha, and C. L. Giles. Learning Multiple Graphs For Document Recommendations. In International World Wide Web Conference, Beijing, China, 2008.
51. B. J. Mirza, N. Ramakrishnan, and B. J. Keller. Studying Recommendation Algorithms by Graph Analysis. Journal of Intelligent Information Systems, 20:131–160, 2003.
52. G. Potter. Putting the Collaborator Back Into Collaborative Filtering. In Proceedings of the 2nd Netflix-KDD Workshop, 2008.
53. R. Kumar, J. Novak, and A. Tomkins. Structure and evolution of online social networks. In International Conference on Knowledge Discovery and Data Mining. ACM, 2006.
54. N. Lathia, S. Hailes, and L. Capra. kNN CF: A temporal social network. In Proceedings of Recommender Systems (RecSys '08), Lausanne, Switzerland, 2008.
55. N. Lathia, S. Hailes, and L. Capra. Temporal Collaborative Filtering With Adaptive Neighbourhoods. Under submission, January 2009.
56. A. M. Rashid, G. Karypis, and J. Riedl. Influence in ratings-based recommender systems: An algorithm-independent approach. In Proceedings of the SIAM 2005 Data Mining Conference, 2005.
57. S. K. Lam and J. Riedl. Shilling recommender systems for fun and profit. In Proceedings of the 13th International Conference on World Wide Web, New York, USA, 2004.
58. M. P. O'Mahony, N. Hurley, and N. Kushmerick. Collaborative Recommendation: A Robustness Analysis. ACM Transactions on Internet Technology, 2003.
59. B. Mehta and T. Hofmann. A Survey of Attack Resistant Collaborative Filtering. Data Engineering: Special Issue on Recommendation and Search in Social Systems, 31(2), June 2008.
60. M. P. O'Mahony, N. Hurley, and G. Silvestre. Promoting Recommendations: An Attack on Collaborative Filtering. In Proceedings of the 13th International Conference on Database and Expert Systems Applications, pages 494–503, Aix-en-Provence, France, 2002.
61. P. A. Chirita, W. Nejdl, and C. Zamfir. Preventing shilling attacks in online recommender systems. In Proceedings of the 7th Annual ACM International Workshop on Web Information and Data Management, Bremen, Germany, 2004.
62. M. Dell'Amico and L. Capra. SOFIA: Social Filtering for Robust Recommendations. In Joint iTrust and PST Conferences on Privacy, Trust Management and Security (IFIPTM), Trondheim, Norway, June 2008.
63. H. Yu, M. Kaminsky, P. B. Gibbons, and A. Flaxman. SybilGuard: Defending against sybil attacks via social networks. In Proceedings of ACM SIGCOMM, Pisa, Italy, September 2006.
64. C. Ziegler. Towards Decentralised Recommender Systems. PhD Thesis, University of Freiburg, 2005.
65. R. Guha, R. Kumar, P. Raghavan, and A. Tomkins. Propagation of trust and distrust. In Proceedings of the 13th International Conference on World Wide Web, pages 403–412, 2004.