A Graph-Based Method for Combining Collaborative ... - Springer Link

0 downloads 0 Views 208KB Size Report
A Graph-Based Method for Combining Collaborative and Content-Based Filtering. Nguyen Duy Phuong, Le Quang Thang, and Tu Minh Phuong. Faculty of ...
A Graph-Based Method for Combining Collaborative and Content-Based Filtering Nguyen Duy Phuong, Le Quang Thang, and Tu Minh Phuong Faculty of Information Technology, Posts and Telecommunications Institute of Technology, Hanoi, Vietnam [email protected], [email protected], [email protected]

Abstract. Collaborative filtering and content-based filtering are two main approaches to make recommendations in recommender systems. While each approach has its own strengths and weaknesses, combining the two approaches can improve recommendation accuracy. In this paper, we present a graph-based method that allows combining content information and rating information in a natural way. The proposed method uses user ratings and content descriptions to infer user-content links, and then provides recommendations by exploiting these new links in combination with user-item links. We present experimental results showing that the proposed method performs better than a pure collaborative filtering, a pure content-based filtering, and a hybrid method. Keywords: Collaborative filtering, content-based filtering, hybrid recommender systems, graph-based model.

1 Introduction Recommender systems help users search for favorite products (such as movies, books, news stories etc.) by providing personalized suggestions in form of list of products that are likely to interest the users. These systems have played an important role in Ecommerce and information filtering with a number of commercial systems deployed, examples include Amazon, Netflix, IMDB. Collaborative filtering (CF) and content-based filtering (CBF) are the two main techniques used in recommender systems. CF systems work by first collecting user preferences for items in a given domain. The systems then use the collected data to find users with similar profiles and use their ratings to predict items that might interest a specific user [18,19]. CBF is an alternative technique that originates from the field of information retrieval. Content-based systems rely on the content descriptions of items (such as title, author, text description) to find items similar to items that interest the user. CBF has been mainly used in domains where content descriptions are available [3,14,1]. Both CF and CBF have their strengths and weaknesses. A key advantage of CF over CBF is that the former can perform in domains where it is difficult to get descriptions of items’ content, for example where items to recommend are ideas, opinions etc. This makes CF the most successfully recommendation method in various T.-B. Ho and Z.-H. Zhou (Eds.): PRICAI 2008, LNAI 5351, pp. 859–869, 2008. © Springer-Verlag Berlin Heidelberg 2008

860

N.D. Phuong, L.Q. Thang, and T.M. Phuong

domains. On the other hand, CF relies only on user ratings to produce recommendations. In practice, most users rate very few items and the user-item rating data are typically very sparse. Therefore it is difficult to reliably compare the profiles of two users. This problem is widely known as the data sparsity problem and presents a major challenge to CF algorithms. Another difficulty is that CF methods cannot handle an item if no user has rated it before. This problem, known as the first rater problem, applies to new and obscure items. Such problems are easily solved by CBF, which can make recommendations by comparing the descriptions of item content. In this paper, we propose to combine content information with rating information by using a unified graph representation for the two types of information. From the combined graph model representing user-item and item-content interactions, our method first determines content features that have significant impact on the behavior of each user. A network-propagation algorithm is then used to compute the association between user node and item nodes by exploring both user-content and user-item links of the graph. We apply the method in the domain of movie recommendation and show that our method gives promising results. Related Work The potential benefits of combining collaborative and content-based filtering have been studied in a number of works. The most simple hybrid approach is to implement content-based and collaborative methods separately and then combine their predictions [7]. In another approach, content information and rating information are first combined to produce data that serve as mixed input for predictors. Pazzani [16] proposed to represent each user-profile by a vector of weighted words selected from content descriptions using the Winnow algorithm. The matrix of user-profiles is then used as input for collaborative filtering instead of the user-item matrix. The Fab system [3] employs content analysis to generate user profiles from relevance feedbacks, which are then used to create personal filters. Melville and Mooney [13] used a pure content-based predictor to calculate so called pseudo-ratings. They used the predicted ratings to augment user ratings vector before applying CF techniques. Another family of approaches creates a general unified recommendation model and treats user rating prediction as a machine learning problem, in which predictors are learned from labeled examples [5,15]. Popescul et al. [17] proposed a unified probabilistic latent semantic analysis that combines collaborative and content-based characteristics. The approach of [4] uses kernel functions to combine user-user and item-item similarities in a unified kernel vector space and applies support vector learning to produce predictions. Crammer et al. [8] approaches the problem as learning a ranking on items set by incorporating additional item features. Due to intuitiveness of representation and availability of graph algorithms, graphbased models have been used in a number of recommendation methods. Aggarwal et al. [2] represented relationships among users as a directed graph in which a directed link connecting two users indicates that the behavior of the source user is highly predictive from the behavior of the target user. Huang et al. [11] introduced a graphbased model that includes both users and items. A weighted link between two item nodes represents the similarity between the two items, which is pre-computed based on the items’ content. This model allows capturing both content-based and rating information in a unified framework. In [12], the authors exploited transitive associations in a graph-based model to tackle the data sparsity problem.

A Graph-Based Method for Combining Collaborative and Content-Based Filtering

861

2 Graph Model and Recommendation Algorithm We first introduce notations to use in the paper. We denote by X = { x1, x2, …,x|X|} a set of items, and by U = {u1, u2, …,u|U|} a set of users. We denote user ratings over the items by matrix R = (rij) of size |U| × |X|, such that rij is the rating user i has given to item j. Each rating rij can take on a value from a finite set of possible ratings. Here we assume rij can be either +1 (like) or −1 (dislike). If user i has not rated j then rij = ∅. Furthermore, we use C = {c1, c2, …, c|C|} to denote a set of features that characterize the items’ content. We denote the item-content associations by a |X| × |C| matrix Y = (yij), where yij = 1 if item i has feature j and yij = 0 otherwise. For example, for xi being a movie, cj can be “genre = action” and yij = 1 means the movie belongs to genre “action”. The goal of a recommender system is to predict ratings an active user would give to unrated items and based on this provide a list of recommendations. The Graph Model. As shown in previous works [2,11], it is natural and convenient to solve the problem at hand by using a graph-based recommendation model. The basic idea is to build a graph model of the rating and content information, and then explore the associations among nodes to make predictions. Figure 1 shows an example graph. The top part of the graph shows item-content associations, where a node corresponds to either an item or a content feature. A link is drawn between an item node xi and a feature node cj if there is a non-zero association between xi and cj, i.e. if yij = 1 according to the notation above. Similarly, the bottom part of the graph represents user preferences over items. In this part, a link between a user node and an item node can have +1 or −1 weights indicating the user likes or dislikes the item. 2.1 Construction of User-Content Links Given the item-content association matrix, a simple way to compute the similarity between two items is to compare their content features. For example, Huang et al. [11] compute the similarity of two items by calculating the mutual information between the items’ descriptions and then draw weighted links between the item nodes to represent their similarity. However, such methods do not take into account user ratings when computing item-item similarities and thus cannot adjust item similarity for a specific user. To illustrate this, let us consider the example given in figure 1. c1

x1

c2

x2 +1 +1

x3 +1

u1

+1

x5

x4 +1

−1

u2

+1

−1

u3

Fig. 1. An example of graph representation for rating and content information

862

N.D. Phuong, L.Q. Thang, and T.M. Phuong

In this example, item x3 and x4 share a common feature c2. Looking only at the item-content part of the graph, a simple similarity computation will decide that x3 and x4 are similar because of this common feature. But this is not true for user u1, who has rated x3 as liked and x4 as disliked. This means, from u1’s point of view, x3 and x4 are not similar and c2 has no effect on u1’s ratings. At the same time, all the items which u1 has rated and which contain c1 (i.e. x2 and x3) get positive ratings from u1. Therefore, one should consider c1 as having an important role in the opinion of u1. This example shows the need to use a personalized measure of similarity when computing item-item similarities from content information. Each content feature should have its degree of importance in deciding how items are similar, and this degree of importance should be adapted for a specific user. We now present our approach that characterizes the importance of each content feature for each user. This is the first step to combine content-based and collaborative recommendations. Given a graph introduced above, for each user ui and each content feature ck, we say that ck is important for ui if the sum of the weights of all distinct paths connecting ck and ui divided by the number of the paths exceeds some threshold T (0< T T sik

(1)

otherwise

In this formula, wik/sik shows the dominance of positive ratings over negative ratings that user ui gives for items with feature ck ; min(sik, γ)/γ is the so called significance weighting factor [9] that devalues the importance degree based on few paths. Following [9] we used γ = 50 in our experiments. The threshold valued T is set to 0.3, which means the number of positive paths should be about two times bigger than the number of negative ones for a feature to be considered important. We illustrate the computation of vij through an example shown in figure 1: for user u1 and content feature c1, we have n11 = 2, w11 = 2, and v11 = 2/γ. For user u1 and c2, we have n12 = 2 and thus w12 = 0. For each pair (ui, ck) that has a non-zero vik, we draw a new link with weight vik. Figure 2 shows such an extended graph for the graph from figure 1. The dotted line is the new link just added to represent the correlation between u1 and c1. 2.2 Making Recommendation

We now describe the recommendation process as a graph search problem in the extended graph. We will use the example shown in figure 2 to illustrate our approach. Suppose the system needs to recommend items for an active user. Following [11, 12], we first determine the association between this user and each of items that have not been rated by the user. The items are then sorted according to the associations, and top K items are chosen for recommendation.

A Graph-Based Method for Combining Collaborative and Content-Based Filtering

c1

863

c2

v11

x1

x2

x3

+1 +1

+1

u1

+1

x5

x4 +1

−1

u2

+1

−1

u3

Fig. 2. Example of graph with links between user nodes and content feature nodes

In our model, the association between two nodes is determined by considering all paths connecting them. For the pair of a user node ui and an item node xj, we compute the association between them as the sum of weights of all distinct paths that connect ui and cj. In this computation we differentiate two types of paths – paths via content nodes and paths via item nodes. A path of the first type is one whose length is equal to 2 and goes through a content feature node. An example of such a path is u1-c1-x1 in figure 2, which has an intuitive interpretation – u1 likes items that contain feature c1 and therefore likes x1. Such paths correspond to associations via content information. By limiting the path length to be 2, we do not allow transitive associations for paths of this type. Since the weight of an item-content link is always 1, the weight of a path of this type is equal to the weight of the respective user-content link. The second type includes paths that go through item nodes and user nodes. Examples of such paths are u1-x3-u2-x1, u2-x2-u1-x4-u3-x5. Because we are interested in the association between a user node and an item node, the length of a path of this type must be an odd number. In addition, only paths whose lengths do not exceed a parameter M are considered. Such paths represent transitive associations that were first exploited by Huang et al. [12] to make recommendations. In contrast to the work of Huang et al., who applied collaborative filtering for purchasing data that contain only positive links, in our context, a user-item link can have either a positive or a negative weight. Thus when exploring paths that go through item nodes we consider three cases: -

-

All the intermediate links of a path are positive (here we define intermediate links of a path as all its links except the last one), for example paths u1-x3-u2x1, and u2-x2-u1-x4 in figure 2. We consider such a path important and compute its weight as the product of the link weights. A path has two intermediate links that end at the same item node but have different signs. This means the two corresponding users have rated this item differently and thus we ignore such a path. A path has two consecutive negative links that end at the same item, for example in path u1-x4-u3-x5, links u1-x4 and x4-u3 both have negative weights. This means the two users have similar ratings for x4 and this path may indicate the similarity between the two users. However, our experiments show that including such paths in computation makes results unstable and hence we ignore such paths in our implementation.

864

N.D. Phuong, L.Q. Thang, and T.M. Phuong

Association computation. An important step in the algorithm above is computing the association via different paths. This can be solved by a variety of algorithms known as network propagation algorithms, an example of which is the Google PageRank algorithm. Huang et al. [12] applied several network propagation algorithms to CF. Here, to compute user-item associations, we use the following algorithm which is a modification of the algorithm by Weston et al. [22]. Let ua be the active user, i.e. the user for which the system needs to make recommendations. Let N denote the set of nodes that can form paths from ua to items nodes. For paths of the first type, N=C∪X∪ ua. For paths of the second type, N=X∪U. For ease of algorithm description we use ni to denote a node i ∈ N regardless this is a user node, content node or an item node. We denote by eij the weight of the link between node ni and nj, and by eaj the link weight between node ua and nj. The matrix (eij) is formed from matrices R, Y introduced in section 3 and matrix (vij) from equation (1). Furthermore, let ai(t) denote the association degree between ua and node ni ∈ N when considering paths of length t. The algorithm for computing association between ua and item nodes using only paths of the same type is shown in figure 3.

1. 2. 3. 4. 5. 6. 7.

Initialize ai(0) = 0 for all ni ∈ N, aa(0) = 1 for t = 1, 2, …, m or until convergence do for each node ni ∈ N do ai(t) ← eai for each node nj ∈ N do if eij > 0 or t = m then ai(t) ← ai(t) + αe ji a j(t − 1 )

8. 9. 10. 11

end for end for end for return ai (m). ai (m) is the association between ua and ni via paths of length m Fig. 3. Algorithm for computing association degrees

Here, α ∈[0,1] is a parameter that down-weights longer paths. Our experiments use α =1 for paths of the first type and α = 0.5 for paths of the second type.

The algorithm needs to be run separately for N = C ∪ X ∪ ua, and N = X ∪ U to compute associations via content nodes and via item nodes respectively. Parameter m is equal to 2 (path length = 2) for the former case and is an odd number less than or equal to M for the later case. If no limit is set, the algorithm runs until convergence, that is when ai(t) become stable. In our implementation, we use M = 10. In the initialization stage, the algorithm activates the active user node by setting its activation level to 1 and the activation levels of the remaining nodes to 0. In each iteration t, the activation level is pumped from the active user node to the remaining nodes of N via paths of length t. Step 6 ensures that a path is counted only if it does not contain a negative link(s) unless this is the last link. Clearly, this step is needed only for paths of the second type.

A Graph-Based Method for Combining Collaborative and Content-Based Filtering

865

The most time consuming part of this algorithm is from line 3 to line 9 which requires O(|N|2) computations over all eji. Fortunately, matrix (eij) is very sparse with most elements equal to zero. This allows us to use sparse-matrix representation for (eij), which reduces the complexity to O(|N|S), where S is averaged number of nonzero elements for each row of matrix (eij). Assume the algorithm returns aic and air for N = C ∪ X, and N = X ∪ U respectively, we compute the overall association aio between user ua and item xi as:

aio = βaic + (1 − β )air

(2)

Where β∈[0,1] is a parameter that controls the contribution of each type of association. β = 1 means only association via content is considered while β = 0 corresponds to pure collaborative filtering. The unrated items are then sorted based on their associations and top K items with strongest associations are recommended for the active user.

3 Experimental evaluation Experimental setup. We evaluated the proposed algorithm on the MovieLens data set (http://www.grouplens.org). The data set contains 100000 ratings from 943 users for 1682 movies. Ratings are in five-point scale (1, 2, 3, 4, 5) and each user has rated at least 20 movies. We transformed the two highest scores (4 and 5) into +1 (like) and the rest into −1 (dislike). 80 percent of the users were randomly selected to form the training set and the rest were used as the test users. Users in the test set were used to measure to recommendation accuracy. From each test user, 25 percent of the ratings were withheld. The rest of ratings were used as input for recommendations. We used movie genre provided with the MovieLens data set and retrieved other information about the movies from IMDB (http://www.imdb.com) to form content features. In our experiments we used only genre and director as content information. In general, other content features can be exploited. Following experimental procedures reported in the literature [10, 12] we used precision, recall, and F-measure to measure the effectiveness of recommendation methods. The metrics are defined as follows:

precision =

recall =

# of recommended items that get actual positive ratings # of all recommended items

# of recommended items that get actual positive ratings # of all items that get actual positive ratings F=

2 × precision × recall precision + recall

(3)

(4)

(5)

866

N.D. Phuong, L.Q. Thang, and T.M. Phuong

We compared the proposed method (denoted by CombinedGraph) to the following methods: -

-

User-Based k-nearest neighbor using Pearson correlation. This method uses the Pearson correlation as the measure of similarity between two users, and makes recommendations based on the ratings of users that are highly similar with the active user. Content-Based using graph search. This method counts the number of paths of length 3 that go from the active user node via item and content nodes to an unrated item node and recommends items with the largest number of paths. 3-Hop collaborative filtering using graph search. This method makes recommendations based only on the association via item nodes as described in the previous section with the maximal path length M = 3. Simple Hybrid method that combines Content-Based and 3-Hop by merging top recommended items returned by each method.

Results. We varied parameter β to control the contribution of the content component and the collaborative component of the method. The best results were obtained for β between 0.7 and 0.8, which emphasizes the contribution of the collaborative filtering component. In what follows we report only results when using β = 0.8. The recalls, precisions, and F-measure values for top 10, 20, and 50 items are summarized in table 1. The results show that our method performs better than the other methods on all three metrics. On average, among 50 items recommended by the CombinedGraph, eight will receive positive ratings from the users. The table also shows that simple 3-Hop significantly outperforms both User-Based and Content-Based methods. This result supports recent findings that user-based kNN approach gives poor results in terms of precision and recalls while performs well in term of the mean absolute error (MAE) metric - defined as the average absolute difference between predicted ratings and actual ratings [10]. Table 1. Recalls, precisions, and F-measure values of experimented methods Algorithm

Metrics

User-Based

Recall Precision F-measure

Number of recommended items 10 20 50 0.007 0.021 0.069 0.015 0.025 0.034 0.009 0.023 0.045

Content-Based

Recall Precision F-measure

0.009 0.022 0.013

0.017 0.020 0.018

0.037 0.018 0.024

3-Hop

Recall Precision F-measure

0.155 0.284 0.200

0.222 0.225 0.223

0.377 0.164 0.228

Simple Hybrid

Recall Precision F-measure

0.117 0.186 0.144

0.162 0.148 0.155

0.279 0.118 0.166

CombinedGraph

Recall Precision F-measure

0.165 0.292 0.211

0.234 0.240 0.237

0.381 0.175 0.240

A Graph-Based Method for Combining Collaborative and Content-Based Filtering

867

0.35 User_Based 0.3

Content-Based CombinedGraph

F-measure

0.25 0.2 0.15 0.1 0.05 0 95

96

97

98

99

100

% Sparsity

Fig. 4. F-measure values at different sparsity levels

One challenge for CF algorithms is that the recommendation accuracy suffers when the user-item matrix is sparse. Our method can alleviate this problem by exploiting association via content feature nodes even when most paths via item nodes are not present. Thus the sparsity of rating data has a smaller effect on CombinedGraph than on a pure collaborative algorithm. To verify this hypothesis, we conducted the following experiments. We used the first 400 users to form the training set and the next 100 user to form the test set. For each test user, 25% ratings were withheld for prediction. We randomly deleted elements from the user-item matrix to increase the sparsity and measured the accuracy of recommendations at different sparsity levels. The F-measure values for top 50 recommendations are shown in figure 4. As can be seen, CombinedGraph gives more stable results than User-Based and 3-Hop when the sparsity increases. This confirms our hypothesis that CombineGraph is less sensitive to data sparsity than pure CF.

4 Conclusion We have presented a natural and effective way to combine content-based and collaborative filtering for achieving more accurate recommendations. Our method uses a graph-based model to represent both content and rating information. This representation allows exploiting user ratings to select important content features that connect users with items of interest. The graph model also provides a convenient way to compute the association between users and items using available network propagation algorithms. We have shown how our method performs better than the user-based k-NN collaborative filtering method, a content-based method and a hybrid recommendation method. Acknowledgments. This research was supported by Ministry of Science and Technology of Vietnam under a grant for fundamental research.

868

N.D. Phuong, L.Q. Thang, and T.M. Phuong

References 1. Adomavicius, G., Tuzhilin, A.: Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions. IEEE Trans. Knowl. and Data Eng. 17(6) (2005) 2. Aggarwal, C.C., Wolf, J.L., Wu, K.L., Yu, P.S.: Horting Hatches an Egg: A New GraphTheoretic Approach to Collaborative Filtering. In: Proc. of ACM SIGKDD 1999 (1999) 3. Balabanovic, M., Shoham, Y.: FAB: Content-based, collaborative recommendation. Communication of the ACM 40(3), 66–72 (1997) 4. Balisico, J., Hofman, T.: Unifying collaborative and content-based filtering. In: Proceedings. of Int. Conf. on Machine learning (ICML 2004) (2004) 5. Basu, C., Hirsh, H., Cohen, W.: Recommendation as classification: Using social and content-based information in recommendation. In: Proc. Nat. Conf. on AI, pp. 714–720 (1998) 6. Breese, J.S., Heckerman, D., Kadie, C.: Empirical analysis of predictive algorithms for collaborative filtering. In: Proc. of 14th Conf. on Uncertainty in AI, pp. 43–52 (1998) 7. Claypool, M., Gokhale, A., Miranda, T., Murnikov, P., Netes, D., Sartin, M.: Combining contentbased and collaborative fillters in an online newspaper. In: Proc. of ACM SIGIR Workshop on Recommender Systems (1999) 8. Crammer, K., Singer, Y.: Pranking with ranking. In: Advances in Neural Information Processing Systems, vol. 14, pp. 641–647 (2002) 9. Herlocker, J., Konstan, J., Borchers, A., Riedl, J.: An algorithmic framework for performing collaborative filtering. In: SIGIR 1999: Proc. of the 22nd Inter. Conf. on Research and Development in Information Retrieval (SIGIR), pp. 230–237 (1999) 10. Herlocker, J., Konstan, K., Terveen, L.G., Riedl, J.: Evaluating Collaborative Filtering Recommender Systems. ACM Trans. on Inform. Syst. 22(1), 5–53 (2004) 11. Huang, Z., Chung, W., Ong, T.-H., Chen, H.: A graph-based recommender system for digital library. In: Proc. of 2nd ACM/IEEE-CS Joint Conf. on Digital Libraries, pp. 65–73 (2002) 12. Huang, Z., Chen, H., Zeng, D.: Applying Associative Retrieval Techniques to Alleviate the Sparsity Problem in Collaborative Filtering. ACM Trans. Inf. Syst. 22(1), 116–142 (2004) 13. Melville, P., Mooney, R.J., Nagarajan, R.: Content-boosted collaborative filtering for improved recommendations. In: Proc. of 18th Nat. Conf. on AI, pp. 187–192 (2002) 14. Mooney, R.J., Roy, L.: Content-based book recommending using learning for text categorization. In: Proc. of the 5th ACM Conference on Digital Libraries, pp. 195–204 (2000) 15. Phuong, N.D., Phuong, T.M.: Collaborative filtering by multitask learning. In: Proc. IEEE Int. Conf. on Research, Innovation and Vision for the Future, pp. 227–232 (2008) 16. Pazzani, M.J.: A framework for collaborative, content-based and demographic filtering. Artificial Intelligence Review 13(5-6), 393–408 (1999) 17. Popescul, A., Ungar, L.H., Pennock, D.M., Lawrence, S.: Probabilistic Models for Unified Collaborative and Content-Based Recommendation in Sparse-Data Environments. In: Proc. 17th Conf. Uncertainty in Artificial Intelligence (2001) 18. Resnick, P., Varian, H.R.: Recommender systems. Special issue of Communications of the ACM, 56–58 (1997) 19. Shardanand, U., Maes, P.: Social information filltering: Algorithms for automating word of mouth. In: Human Factors in Computing Systems ACM CHI, pp. 210–217 (1995) 20. Wang, J., de Vries, A.P., Reinders, M.J.T.: Unifying user-based and item-based collaborative filtering approaches by similarity fusion. In: Proc. of SIGIR 2006, Seatle, USA (2006)

A Graph-Based Method for Combining Collaborative and Content-Based Filtering

869

21. Yu, K., Schwaighofer, A., Tresp, V., Ma, W.-Y., Zhang, H.: Collaborative ensemble learning: Combining collaborative and content-based information filtering via hierarchical bayes. In: Proc. of the 19th Conference on Uncertainty in Artificial Intelligence, UAI (2003) 22. Weston, J., Elisseeff, A., Zhou, D., Leslie, C.S., Noble, W.S.: Protein ranking: From local to global structure in the protein similarity network. Proc. of National Academy of Science 101(17), 6559–6563 (2004)