
Incorporating Social Networks and User Opinions for Collaborative Recommendation: Local Trust Network based Method∗

Bin Liu

Zheng Yuan

Department of Computer Engineering University of California, Santa Cruz Santa Cruz, CA 95064 USA

Dept. of Electrical and Computer Engineering University of Florida Gainesville, FL 32603 USA

[email protected]

[email protected]

ABSTRACT

The sparse nature of historical rating profiles hinders reliable similarity metrics between users, leading to poor recommendation performance. User social networks and user opinions, where available, can be incorporated to improve prediction accuracy. One of the key points is how to make the multiple sources of information consistent for the purpose of recommendation. In this paper, we propose a Local Trust Network (LTN) based recommendation method in the setting of movie recommendation, which mines the social network and multiple sources of user opinions to generate a highly reliable network of trusted users, upon which a recommendation is made. With transductive reasoning, LTN interprets trusted users as a collection of instances, so it is well suited to the sparsity of social network information. Our experiments on the CAMRa10 data set show that the proposed method improves recommendation performance significantly.

Categories and Subject Descriptors
H.3.3 [Information Systems]: Information Search and Retrieval; H.3.5 [Information Storage and Retrieval]: Online Information Service

Keywords
Recommendation Systems, Social Networks, Collaborative Filtering, Trust Network

1. INTRODUCTION

Recommender systems use the opinions of an online community of users to provide individuals with personalized recommendations on various types of products and services [11]. Typically, a recommender system collects user profiles, more specifically a set of users, a set of items, and the users' ratings of a collection of items. A recommendation is then computed by exploiting the historical data of the users' profiles to predict the rating that a user would give to an item they had not yet considered [1]. Given the historical user rating profile, the task of recommendation is to predict the rating of an active user u on an unrated item i. A Collaborative Filtering (CF) [3] based recommendation algorithm identifies a set of users most similar to the active user u, using a similarity metric derived from the rating profile; the rating on item i is then forecast as the active user's historical average rating, compensated with an interpolation from the similar users' ratings of the item. CF-based recommendation becomes ineffective because of the sparsity problem of the rating profile.

A social network, to a great extent, reflects social trust between the individuals in a social group. Intuitively, when faced with overwhelming choices and lacking specific domain knowledge, a user would refer to his or her friends for opinions on a specific item. Sinha and Swearingen found that recommendations from friends are preferred to those of collaborative filtering recommender systems in terms of quality and usefulness [13]. Friends are thus seen as more qualified referrers than recommender systems for making good and useful recommendations. Massa and Avesani [9] proposed CF based on a user trust network, which can be regarded as a friend network, by propagating the trustworthiness of users with respect to the active one. However, when user opinions are available besides the friendship social network, a user's opinion on certain tags or labels of a given item reflects his or her taste for, or at least attention to, the item. We argue that, for a trust network in terms of movie preference, friendship does not guarantee that two users like the same type of movie; on the other hand, people who take an interest in the same movies do not have to know each other at all. These observations motivate us to explore social network and user opinion information to find an active user's highly trusted referrer¹ group, thus providing much better personalized recommendations.

In this paper, we investigate exploiting implicit user social networks and user opinion information to improve collaborative recommendation performance. We propose a local trust network² (LTN) based recommendation method that mines the social network and multiple sources of user opinions to generate an LTN for a given active user. The advantage of LTN in the recommendation setting is that the trust network can extend far beyond the friendship network. We analyzed all the available sources of user opinion in the CAMRa10 [12] dataset and selected those that are highly relevant. More importantly, the inferred trusted referrers are reconciled when the movie preferences inferred for the same referrer from different combinations of opinions are inconsistent. The prediction is then made over the local trust network. Through experiments on the CAMRa10 data set, we show that the LTN based method significantly outperforms standard CF, with a 20.70% increase in MAP, a 29.60% reduction in MAE, and a 24.61% reduction in RMSE.

The rest of this paper is organized as follows: Section 2 introduces related work. We discuss the details of our work in Section 3. The experiments and results are discussed in Section 4. We conclude the paper in Section 5.

∗ Bin Liu and Zheng Yuan contributed equally to this work.
¹ By trusted referrers we mean a group of trusted friends, upon whom the recommendation is made.
² By local trust network, we mean a set of users who share highly reliable similarities with the active user in terms of potential item ratings. It is a subgraph of the entire user social network graph.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. CAMRa2010, September 30, 2010, Barcelona, Spain. Copyright 2010 ACM 978-1-4503-0258-6 ...$10.00.

2. RELATED WORK

2.1 Collaborative Filtering

Collaborative Filtering (CF) is perhaps the most widely known algorithm for recommendation. In CF, users explicitly give like/dislike judgments for items in the form of ratings. Past user ratings are then used to predict the ratings of new items. CF can be categorized into model-based and memory-based classes. Memory-based collaborative filtering systems can be further classified into user based and item based systems, depending on how the historical profiles are used [3, 4]. In user based CF, given a rating matrix $R_{m \times n}$ and an active user $a$, the predicted rating of user $a$ for movie $i$ is

$$\hat{r}_{a,i} = \bar{r}_a + \frac{\sum_{u=1}^{m} (r_{u,i} - \bar{r}_u)\, w_{a,u}}{\sum_{u=1}^{m} w_{a,u}} \qquad (1)$$

where $w_{a,u}$ is the Pearson correlation

$$w_{a,u} = \frac{\sum_j (r_{a,j} - \bar{r}_a)(r_{u,j} - \bar{r}_u)}{\sqrt{\sum_j (r_{a,j} - \bar{r}_a)^2 \sum_j (r_{u,j} - \bar{r}_u)^2}} \qquad (2)$$

and the summation over $j$ runs over the movies co-rated by both $a$ and $u$. Typically, the rating matrix $R_{m \times n}$ is quite sparse. Herlocker et al. [5] suggested that prediction accuracy can be improved by penalizing the correlation values based on the number of items that two users have co-rated: if the number $k$ of items co-rated by two users is less than a threshold value $Thr$, their similarity is multiplied by the weight $k / Thr$. This is the similarity measure we adopt in our baseline CF implementation.

2.2 Social Networks and Collaborative Recommendation

Although recent years have witnessed a flourishing of diverse online social media and online social networks, it remains difficult for recommender systems to incorporate social network information, and for good reason: almost all current recommender systems provide no access to information about the transactions of a user's friends. However, social network information can be very useful for recommendation. Lee and Brusilovsky [8] argued that users' self-defined social networks could be valuable for increasing the quality of recommendation in CF systems. They supported this indirectly by showing that users connected by social networks exhibit significantly higher similarity than non-connected users. Konstas et al. [7] presented a dataset based on data from last.fm³ that included friendship bonds in the social network and collaborative annotation. They applied a Random Walk with Restart (RWR) based algorithm for recommendation and showed that the extra knowledge provided by the users' social activity can improve the performance of a recommender system using the RWR method. Konstas et al. incorporated social network information through a linear combination of the user historical profile and social information. Are there better approaches to incorporating social network and user opinion information?

A social network reflects the trust relationships between the individuals in a social group. Andersen et al. proposed an axiomatic approach to trust-based recommendation systems, in which recommendations are made by aggregating the opinions of other users in the trust network [2]. There are two main steps in trust-based recommendation: determining the trusted individuals for a given user, and making a recommendation from the trusted users' information. Jamali and Ester [6] proposed a random walk model on the trust network that returns target items from the trusted users visited in the walk. Our work is closely related to previous work [9, 10], which builds a trust network by propagating the trustworthiness of users with respect to the active one. In contrast, our work generates a local trust network by mining the friendship social network and multiple sources of user opinions in the setting of the movie recommendation problem on the CAMRa10 dataset, and the recommendation is made upon the LTN.

³ http://www.last.fm

3. INCORPORATING SOCIAL NETWORKS AND USER OPINIONS FOR RECOMMENDATION

3.1 Analysis of Friendship Network and Multiple Sources of User Opinions

This paper focuses on exploiting the social network and multiple sources of user opinions for recommendation. Here we analyze recommendation within a specific movie recommendation setting. The new release of CAMRa2010 data from Filmtipset provides significantly more sources of information, making it possible to mine latent social network patterns. This information allows us to identify a local trust network for the specific movie recommendation task and thus make recommendations for a user with more accuracy and serendipity. We analyze the user-user relations that may be mined from the multiple sources of user opinions. The trusted referrer network of a user can then be inferred by comparing the other users' movie preferences with the user's.

1. friend: this document provides direct and asymmetric social relations between two users. A user may pay attention to a movie recommended by a friend; this does not necessarily mean the user will like it, but the user will seldom hate such a movie, in order to stay involved in the friend group. So a pair of users being friends is a strong indicator of their similarity in movie preference.

2. peopleinmovies: this document provides user-person-in-movie relations. A user may favor or disfavor a movie for the simple reason that a particular person (actor/writer/director) is in it. So users who favor or disfavor the same people in a movie are a strong indicator of similarity in movie preference. In reality, people who like the movies of Tom Hanks would hold a different taste in movies from those who are big fans of Robert Pattinson.

3. moviecomments: this document records when a user leaves a comment on a movie. Although it does not expose the preference expressed in the comment, the comments themselves suggest that the user at least takes an interest in the movie. So a pair of users who leave comments on the same movie is a fair indicator of similarity in movie preference.

4. personcomments: this document records when a user leaves a comment on a person (actor/writer/director) of a movie. Similarly, it implies that users may take an interest in the same persons and thus in the same movies. We consider a pair of users who leave comments on the same person in a movie to be a weak indicator of similarity in movie preference.

5. review and reviewratings: these two documents supposedly show a user rating a review written by another user for a movie. However, the review ID is not provided.

6. users: this document gives the features of users (their age, location and join time), allowing user similarity to be discovered. However, this information is fairly sparse, resulting in unreliable user similarity measurements.

The other documents, collections, genres and lists, suggest the similarity between movies. While extending the potential recommended items from the exact items preferred by trusted referrers to all similar items helps to increase the coverage of recommended items, the accuracy of recommendation may not be guaranteed. For now, we withhold this information: the trust network already recommends items that its members like but that the user of interest may not exactly like, so how could we also recommend items that even the trust network may not exactly like? In this paper, we settle on friend, peopleinmovies, moviecomments and personcomments to infer a user-user network and discover the highly trusted referrers.
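As a rough illustration of how the user-user relations above might be mined, the sketch below counts, for every pair of users, how many targets (movies or persons) they share in one opinion source. The record layout, a list of (user, target) pairs, and the function name `co_occurrence_counts` are assumptions made for this example; they are not the CAMRa10 file format or the authors' code.

```python
from collections import defaultdict
from itertools import combinations


def co_occurrence_counts(records):
    """Count shared targets for each unordered pair of users.

    `records` is an iterable of (user_id, target_id) pairs, for example one
    pair per comment in a hypothetical parsed moviecomments file.
    """
    # Group the users that touched each target (movie or person).
    users_by_target = defaultdict(set)
    for user, target in records:
        users_by_target[target].add(user)

    # Every pair of users sharing a target gets one more co-occurrence.
    counts = defaultdict(int)
    for users in users_by_target.values():
        for i, j in combinations(sorted(users), 2):
            counts[(i, j)] += 1
    return dict(counts)


# Toy data (hypothetical): u1 and u2 both commented on m1 and m2.
moviecomments = [
    ("u1", "m1"), ("u2", "m1"),
    ("u1", "m2"), ("u2", "m2"), ("u3", "m2"),
]
shared = co_occurrence_counts(moviecomments)
print(shared[("u1", "u2")])  # 2
```

The same helper could be applied separately to peopleinmovies and personcomments records, giving one count per source for each user pair.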

3.2 Modeling Users Relationship Graph

Based on the above analysis of multiple sources of information, we model the users' relationships as a weighted graph

$$G = (V, E, W)$$

where $V$ is the set of users, $E_{i,j}$ is the edge between users $i$ and $j$, and $W(i,j)$ is a weight that reflects the closeness between the two users. $W(i,j)$ should incorporate as much information describing the relationship between the pair of users as possible.

Users' friendship social network: we are given the social network graph

$$G_{SN} = (V, E, W_{SN})$$

where $V$ is the set of users and $W_{SN}(i,j)$ represents the relationship between users $i$ and $j$. In the binary case, $W_{SN}(i,j) = 1$ if they are friends and 0 if not. It can also be extended to the case $W_{SN}(i,j) \in \mathbb{R}^+$, where it represents the closeness of the relationship.

Multiple sources of user opinions: users' comments on the items they have viewed can also be an important implicit factor of taste similarity. Here we consider the three sources of user opinions discussed in Section 3.1. For example, the number of actors, directors or writers that two users share an interest in can be an appropriate index of the two users' shared interest. Let $N_{\mathrm{copeopleinmovies}}(i,j)$, $N_{\mathrm{comoviecomments}}(i,j)$ and $N_{\mathrm{copersoncomments}}(i,j)$ be the number of co-interested people in a movie, co-comments on a movie, and co-comments on the same person in a movie, respectively, for a pair of users $i$ and $j$. Let

$$N_{\mathrm{cointerest}}(i,j) = \varphi N_{\mathrm{copeopleinmovies}}(i,j) + \phi N_{\mathrm{comoviecomments}}(i,j) + \eta N_{\mathrm{copersoncomments}}(i,j).$$

We then define $\exp\left(\frac{\gamma N_{\mathrm{cointerest}}(i,j)}{2}\right)$ to measure the extent to which two users share the same interests. We can then derive the weighted social network graph $G = (V, E, W)$ by combining the two sources of information as

$$W(i,j) = \alpha \exp\left(W_{SN}(i,j)\right) + \beta \exp\left(\frac{\gamma N_{\mathrm{cointerest}}(i,j)}{2}\right) \qquad (3)$$

where $0 < \alpha < 1$, $0 < \beta < 1$ and $0 < \gamma$ are parameters.

3.3 Generate Local Trust Network from Social Network and User Opinions

The most straightforward method of constructing a local trust network is clustering based on similarity features between users. However, it is widely acknowledged that one of the difficulties in clustering lies in feature extraction: it is hard to assign a numeric distance to the majority of user node pairs, which prevents reliable clustering. Moreover, the clustering results depend on the number of clusters, which is usually set by a rough guess. Therefore, clustering cannot produce a consistently reliable local trust network. As shown in Table 1, the friendship graph is very sparse, with an average degree of 1.8926, which means that each person is connected to few other people in the graph. The average clustering coefficient is 0.1908, which indicates that the probability that a person's friend's friend is also a friend is low.

Table 1: Statistics of Friendship Social Network

average degree    average clustering coefficient
1.8926            0.1908

In this paper, we use transductive reasoning to circumvent the cluster model behind the user data. Instead, we construct the local trust network for an active user from particular 'friend' user instances or 'co-interest' user instances. In an uninformed manner, we search for the user instances with connections explicitly indicated by the social network and user opinion documents. This method discards the model assumption of clustering in situations where a large number of user-user relations are missing, and characterizes the local trust network by a pool of particular instances. Therefore, it is well suited to the sparse nature of social network and user opinion information.

Given the user social network information, we construct a graph $G_{SN} = (V, E)$ where $V$ is the set of users and $E(i,j)$ represents the relationship between users $i$ and $j$: $E(i,j) = 1$ if they are friends, 0 if not. Starting from the active user node, we apply Depth First Search (DFS) and record the first and second level child nodes of the search tree as the trusted friends of the active user. Then, for each trusted friend node, we apply a one-level DFS again to explore its friends and compare them with the active user's friend list, counting the number of shared co-friends as the user-user similarity measure. Likewise, we construct another user network in which an edge exists between two users who rated the same people in a movie, who commented on the same movie, or who commented on the same people in a movie. We then apply DFS to search the first level of child nodes, and take as the user-user similarity the number of co-interest indices shared by the two users.
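The friendship-based construction described in this section, together with the weight of Equation (3), can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the adjacency-list layout, the function names `edge_weight` and `local_trust_network`, and the default values of alpha, beta and gamma are all hypothetical, since the paper does not report the parameter settings it used.

```python
import math


def edge_weight(w_sn, n_cointerest, alpha=0.5, beta=0.5, gamma=0.1):
    """Equation (3): alpha*exp(W_SN) + beta*exp(gamma*N_cointerest/2).

    The default parameter values are placeholders, not values from the paper.
    """
    return alpha * math.exp(w_sn) + beta * math.exp(gamma * n_cointerest / 2)


def local_trust_network(active, friends, max_depth=2):
    """Depth-limited DFS from `active` over an adjacency list `friends`.

    Returns {user: number of friends shared with the active user} for every
    user reachable within `max_depth` hops (first and second level nodes).
    """
    best_depth = {active: 0}
    stack = [(active, 0)]
    while stack:
        node, depth = stack.pop()
        if depth == max_depth:
            continue  # do not expand beyond the second level
        for nxt in friends.get(node, ()):
            # Re-expand a node if we reach it by a shorter path.
            if nxt not in best_depth or depth + 1 < best_depth[nxt]:
                best_depth[nxt] = depth + 1
                stack.append((nxt, depth + 1))

    active_friends = set(friends.get(active, ()))
    return {
        user: len(active_friends & set(friends.get(user, ())))
        for user in best_depth
        if user != active
    }


# Toy friendship lists (hypothetical data).
friends = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b"],
    "d": ["b", "e"],
    "e": ["d"],
}
ltn = local_trust_network("a", friends)
print(ltn)  # b, c and d each share one friend with a; e is three hops away
```

The co-interest network could be searched the same way with `max_depth=1`, using the co-occurrence counts as the similarity instead of the shared-friend count.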

3.4 Recommendation on Local Trust Network

Let $T_a$ denote the local trust network generated for the active user $a$. The predicted rating of user $a$ for movie $i$ is

$$\hat{r}_{a,i} = \begin{cases} \dfrac{\sum_{u \in T_a} I(r_{u,i})\, W(a,u)\, r_{u,i}}{\sum_{u \in T_a} I(r_{u,i})\, W(a,u)}, & \text{if } \sum_{u \in T_a} I(r_{u,i}) \neq 0, \\[2ex] \bar{r}_a + \dfrac{\sum_{u \in U} (r_{u,i} - \bar{r}_u)\, w_{a,u}}{\sum_{u \in U} w_{a,u}}, & \text{otherwise,} \end{cases} \qquad (4)$$

where $I(r_{u,i})$ is an indicator function with $I(r_{u,i}) = 1$ if user $u$ has rated item $i$ and $I(r_{u,i}) = 0$ otherwise, and $W(a,u) = \alpha \exp\left(W_{SN}(a,u)\right) + \beta \exp\left(\frac{\gamma N_{\mathrm{cointerest}}(a,u)}{2}\right)$ is a weight function that measures the similarity between the active user $a$ and user $u$. The recommendation is made on the local trust network; if none of the users in the trust network has rated item $i$, the prediction of the CF algorithm is returned.

4. EVALUATION

4.1 Evaluation Data Set

The data set we used for our evaluation is from CAMRa10⁴. It was collected from Filmtipset⁵, where users take part in the social network of the service and leave comments on the movies they have seen. It contains a training set of 3,075,346 user-movie-rating records of 16,473 users on 24,222 movies, a friendship social network between the users with 12,171 records, and additional user opinions. The rating matrix is very sparse: 99.23% of its elements are missing. There is also a testing set containing 15,729 records of 439 users on 1,915 movies to be predicted.

⁴ http://www.dai-labor.de/camra2010/
⁵ http://www.filmtipset.se/

4.2 Evaluation Metrics

Since our task is to predict the ratings of active users on unrated movies, for each active user we measure the recommendation quality on the basis of the number of hits. The precision is defined as

$$\mathrm{Prec}_a = \frac{\#\text{ of hits}}{\#\text{ of movies to be predicted for user } a},$$

and the Mean Average Precision (MAP) is defined as

$$\mathrm{MAP} = \frac{\sum_a \mathrm{Prec}_a}{N_a},$$

with $N_a$ the total number of active users. One limitation of the MAP measure is that it is indifferent to the distance between the predicted rating and the actual rating. This limitation is addressed by the Mean Absolute Error (MAE), which penalizes each predicted rating by its distance to the actual rating:

$$\mathrm{MAE} = \frac{\sum_{i=1}^{n} |\hat{r}_{i,j} - r_{i,j}|}{n},$$

where $n$ is the total number of ratings over all users, $\hat{r}_{i,j}$ is the predicted rating of user $i$ on item $j$, and $r_{i,j}$ is the actual rating. We also adopt the Root Mean Squared Error (RMSE), a metric that emphasizes large errors compared to MAE:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i,j} (\hat{r}_{i,j} - r_{i,j})^2}.$$

The smaller the values of MAE and RMSE, the more precise the recommendation.

4.3 Experimental Results

The first algorithm we implemented, denoted as pureCF, was the standard user based CF [3], penalizing the correlation values based on the number of items that two users have co-rated [5]. We then implemented the local trust network based recommendation, denoted as LTN in Table 2. The MAP, MAE and RMSE of the standard CF and local trust network based recommendation methods are shown in Table 2. Compared to standard CF, the LTN based method achieves a 20.70% improvement in MAP, a 29.60% reduction in MAE, and a 24.61% reduction in RMSE.

Table 2: Experimental Results

        pureCF       LTN          improvement
MAP     0.3452182    0.4166795    20.70%
MAE     0.73107      0.5146545    29.60%
RMSE    1.031052     0.7772841    24.61%

5. CONCLUSIONS

This paper deals with how to incorporate social network and user opinion information into collaborative filtering movie recommendation systems. We propose a local trust network based method that detects local trust networks on friend and actor-consistent user graphs, reliably identifies users with comparable ratings, and makes predictions by reference to the local trust network. Experiments on real data show that our LTN based algorithm leads to a 20.70% increase in MAP, a 29.60% reduction in MAE, and a 24.61% reduction in RMSE.

6. REFERENCES

[1] G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Trans. on Knowl. and Data Eng., 17(6):734–749, 2005.
[2] R. Andersen, C. Borgs, J. Chayes, U. Feige, A. Flaxman, A. Kalai, V. Mirrokni, and M. Tennenholtz. Trust-based recommendation systems: an axiomatic approach. In WWW '08, pages 199–208, 2008.
[3] J. S. Breese, D. Heckerman, and C. Kadie. Empirical analysis of predictive algorithms for collaborative filtering. pages 43–52. Morgan Kaufmann, 1998.
[4] M. Deshpande and G. Karypis. Item based top-n recommendation algorithms. ACM Transactions on Information Systems, 22:143–177, 2004.
[5] J. L. Herlocker, J. A. Konstan, A. Borchers, and J. Riedl. An algorithmic framework for performing collaborative filtering. In SIGIR '99, pages 230–237, 1999.
[6] M. Jamali and M. Ester. Trustwalker: a random walk model for combining trust-based and item-based recommendation. In KDD '09, pages 397–406, 2009.
[7] I. Konstas, V. Stathopoulos, and J. M. Jose. On social networks and collaborative recommendation. In SIGIR '09, pages 195–202, 2009.
[8] D. H. Lee and P. Brusilovsky. Social networks and interest similarity: the case of citeulike. In HT '10, pages 151–156, 2010.
[9] P. Massa and P. Avesani. Trust-aware recommender systems. In RecSys '07, pages 17–24, 2007.
[10] J. O'Donovan and B. Smyth. Trust in recommender systems. In IUI '05, pages 167–174. ACM Press, 2005.
[11] P. Resnick and H. R. Varian. Recommender systems. Commun. ACM, 40(3):56–58, 1997.
[12] A. Said, S. Berkovsky, and E. W. De Luca. Putting things in context: Challenge on context-aware movie recommendation. In CAMRa2010, 2010.
[13] R. Sinha and K. Swearingen. Comparing recommendations made by online systems and friends. In Proceedings of the DELOS-NSF Workshop on Personalization and Recommender Systems in Digital Libraries, 2001.