A Hybrid Multigroup Coclustering Recommendation Framework Based on Information Fusion

SHANSHAN HUANG, JUN MA, and PEIZHE CHENG, Shandong University
SHUAIQIANG WANG, University of Jyväskylä

Collaborative Filtering (CF) is one of the most successful algorithms in recommender systems. However, it suffers from data sparsity and scalability problems. Although many clustering techniques have been incorporated to alleviate these two problems, most of them fail to achieve further significant improvement in recommendation accuracy. First of all, most of them assume each user or item belongs to a single cluster. Since usually users can hold multiple interests and items may belong to multiple categories, it is more reasonable to assume that users and items can join multiple clusters (groups), where each cluster is a subset of like-minded users and items they prefer. Furthermore, most of the clustering-based CF models only utilize historical rating information in the clustering procedure but ignore other data resources in recommender systems such as the social connections of users and the correlations between items. In this article, we propose HMCoC, a Hybrid Multigroup CoClustering recommendation framework, which can cluster users and items into multiple groups simultaneously with different information resources. In our framework, we first integrate information of user–item rating records, user social networks, and item features extracted from the DBpedia knowledge base. We then use an optimization method to mine meaningful user–item groups with all the information. Finally, we apply the conventional CF method in each cluster to make predictions. By merging the predictions from each cluster, we generate the top-n recommendations to the target users for return. Extensive experimental results demonstrate the superior performance of our approach in top-n recommendation in terms of MAP, NDCG, and F1 compared with other clustering-based CF models. 
Categories and Subject Descriptors: H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval—Information filtering

General Terms: Algorithms, Performance, Experimentation

Additional Key Words and Phrases: Recommender systems, collaborative filtering, coclustering, information fusion, data sparsity

ACM Reference Format:
Shanshan Huang, Jun Ma, Peizhe Cheng, and Shuaiqiang Wang. 2015. A hybrid multigroup coclustering recommendation framework based on information fusion. ACM Trans. Intell. Syst. Technol. 6, 2, Article 27 (March 2015), 22 pages.
DOI: http://dx.doi.org/10.1145/2700465

This work was supported by the Natural Science Foundation of China (61272240, 60970047, 61103151, 71402083), the Doctoral Fund of Ministry of Education of China (20110131110028), the Natural Science Foundation of Shandong province (ZR2012FM037, BS2012DX012), the Humanity and Social Science Foundation of Ministry of Education of China (12YJC630211), and the Microsoft Research Fund (FY14-RESTHEME-25).
Authors' addresses: S. Huang, J. Ma, and P. Cheng, School of Computer Science and Technology, Shandong University, 1500 Shunhua Road, Jinan 250101, China; emails: [email protected], [email protected], [email protected]; S. Wang, Department of Computer Science and Information Systems, University of Jyväskylä, Agora, 5. krs., Mattilanniemi 2, 40100 Jyväskylä, Finland; email: [email protected].
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or [email protected].
© 2015 ACM 2157-6904/2015/03-ART27 $15.00
DOI: http://dx.doi.org/10.1145/2700465

ACM Transactions on Intelligent Systems and Technology, Vol. 6, No. 2, Article 27, Publication date: March 2015.


1. INTRODUCTION

In the age of information overload, Recommender Systems (RSs) have become indispensable tools in helping people find items of potential interest and filter out uninteresting ones [Adomavicius and Tuzhilin 2005]. They can be used to discover relevant items or information and make personalized recommendations based on users' past behaviors. RSs not only benefit users by saving time but also help online shops satisfy customers and make more profits.

Collaborative Filtering (CF) [Huang et al. 2007; Deshpande and Karypis 2004] is one of the most popular techniques for building recommender systems. The underlying assumption of CF algorithms is that if users had similar tastes in the past, they are likely to have similar preferences for items in the future. An important advantage of CF is the ability to make recommendations without any domain knowledge. However, CF-based recommendation algorithms also suffer from several drawbacks that limit their performance [Sarwar et al. 1998]. The first is the data sparsity problem, which is very common in real-world applications: when users have not rated many items in common, CF methods are incapable of finding accurate neighbors. The second is the scalability problem, which is caused by the ever-increasing number of users and items.

Some research has been conducted to address these two problems with various clustering techniques [Sarwar et al. 2002; Gong 2010a; George and Merugu 2005]. Clustering is usually performed as an intermediate step in CF-based recommendation algorithms and can be used to cluster users or items based on the rating information. Most clustering methods used in recommender systems assume that a user or an item falls into a single cluster. However, this is not a reasonable assumption in reality.
In most situations, users and items can belong to several clusters; for example, one user may be interested in different kinds of movies such as horror, comedy, and drama, and one movie could have multiple genres. In addition, to cope with the data sparsity problem, many hybrid recommendation algorithms have been proposed that incorporate other information resources such as social networks [Massa and Avesani 2007; Jamali and Ester 2010; Jiang et al. 2012] or item attributes [Middleton et al. 2004; Di Noia et al. 2012b; Ostuni et al. 2013]. However, this information has not been used for clustering in CF models.

Xu et al. [2012] proposed a multiclass coclustering (MCoC) model by assuming each user and item belongs to multiple clusters. However, MCoC clusters users and items based only on rating information, which is usually very sparse. The sparse relationships between users and items may not be sufficient to find meaningful clusters; especially when a user has rated only a very small number of items, or an item has been rated by only a few users, the clustering result will not be accurate for this user (item). Fortunately, in some recommender systems, people not only interact with items but also have social relations with each other. Besides, items usually have metadata, such as descriptions, categories, and so forth. Based on this additional information, we can cluster strongly connected users and highly correlated items in the same groups.

With all the aforementioned concerns, in this article, we propose HMCoC, a Hybrid Multigroup CoClustering recommendation framework, by extending conventional CF-based recommendation algorithms with a novel clustering method. In HMCoC, we assume that each user and item can belong to multiple groups (clusters). To alleviate the data sparsity problem, besides rating records, we utilize additional information resources, such as user social networks and item correlations.
In order to calculate item correlations, we extracted items’ category information from DBpedia. To utilize multiple information sources together in the clustering model, we first fuse rating records, social relations, and item correlations into a unified graph model. We then

formulate a novel clustering problem to cocluster both users and items into multiple groups at the same time with all the information we have. Finally, we combine these groups with existing CF methods to generate top-n recommendation. In the top-n recommendation process, we partition the rating matrix into several submatrices according to the clustering result and choose one CF method to make recommendations independently with each submatrix. The recommendation results from all groups are then merged together to generate the top-n recommendation. An advantage of our framework is its generality: it does not rely on any specific CF algorithm. For a concrete problem, we can choose one suitable rating-based CF method to be integrated into this framework without any modification.

The main contributions of this work can be summarized as follows:

—To the best of our knowledge, this is the first work that utilizes heterogeneous information to cluster users and items in recommender systems.
—We propose a novel recommendation framework, HMCoC, to cope with the limitations of conventional CF methods via coclustering users and items into multiple groups at the same time.
—We embed social networks and knowledge bases as complementary data resources in the clustering process to satisfy connectivity coherency and topical consistency.
—We formulate our Hybrid Multigroup Coclustering method as an optimization problem that combines one-sided and two-sided clustering techniques. Furthermore, we provide an effective approximate solution to the problem for finding meaningful user–item groups.

The remainder of this article is organized as follows. We discuss the related work in Section 2. We introduce our new recommendation framework, HMCoC, in Section 3. Section 4 presents the experimental settings. We report experimental results and discussion in Section 5. Finally, Section 6 gives the conclusions and future work.

2. RELATED WORK

2.1. Collaborative Filtering

In recommender systems, CF algorithms can be mainly classified into two kinds of approaches: memory-based algorithms [Wang et al. 2006; Huang et al. 2007; Deshpande and Karypis 2004] and model-based algorithms [Lee and Seung 2000; Hofmann 2004; Mnih and Salakhutdinov 2007].

In memory-based CF algorithms, the entire user–item rating matrix is directly used to predict unknown ratings for each user. User-based [Desrosiers and Karypis 2011] and item-based [Sarwar et al. 2001; Deshpande and Karypis 2004] CF algorithms are the two best-known methods in this category. User-based CF methods first find several nearest neighbors with high similarities for each user and then make predictions based on the weighted average ratings of his or her neighbors. The neighbors can be determined by various similarity measures [Wang et al. 2006], such as the Pearson correlation coefficient and cosine similarity in rating space. Similarly, item-based CF methods find the nearest neighbors for each item. The similarity computation process is computationally expensive for large datasets, and neighbors cannot be found accurately in highly sparse data [Adomavicius and Tuzhilin 2005].

In model-based CF algorithms, a predictive model is trained from observed ratings in advance. In this category, latent factor models (LFMs) [Hofmann and Puzicha 1999; Hofmann 2004; Koren 2008] are very competitive and widely adopted to build recommender systems. They assume that only a few latent factors influence user rating behaviors. Latent factor models seek to factorize the user–item rating matrix into two

low-rank user-specific and item-specific matrices and then utilize the factorized matrices to make further predictions. Latent factor models can reduce data sparsity through dimensionality reduction and usually generate more accurate recommendations than memory-based CF algorithms. PureSVD [Sarwar et al. 2000], Matrix Factorization (MF) [Mnih and Salakhutdinov 2007], and Nonnegative Matrix Factorization (NMF) [Lee and Seung 2000] are commonly used LFM methods.

When the numbers of users and items grow tremendously, traditional CF algorithms, either memory based or model based, suffer a serious scalability problem: the computational cost goes beyond practical or acceptable levels. Clustering CF models address the scalability problem by making recommendations within smaller clusters instead of the entire database, demonstrating a promising tradeoff between scalability and recommendation accuracy.

2.2. Clustering CF Models

Various clustering techniques have been investigated in an attempt to address the problems of sparsity and scalability in recommender systems [Sarwar et al. 2002; George and Merugu 2005; Xu et al. 2012]. In clustering CF models, clustering is often an intermediate process, and the clustering results are further used by CF recommendation algorithms. User clustering [Sarwar et al. 2002; Xue et al. 2005] and item clustering methods [Gong 2010b] (also called one-sided clustering) cluster users or items according to their rating vectors, and then the prediction is calculated separately in each cluster. Some other clustering CF models cluster users and items at the same time (also called two-sided clustering). In George and Merugu [2005], the key idea is to simultaneously obtain user and item neighborhoods via coclustering; the proposed method generates predictions based on the average ratings of the coclusters while taking into account the individual biases of users and items. Leung et al. [2011] propose a Collaborative Location Recommendation (CLR) framework that employs a dynamic clustering algorithm to cluster trajectory data into groups of similar users, similar activities, and similar locations. The advantage of applying clustering techniques in CF is that they can improve scalability and alleviate the sparsity problem by partitioning the whole rating space into smaller and denser subspaces. However, all the aforementioned approaches assume that users and items belong to a single cluster, which is not a reasonable assumption in real-world applications.

Another clustering technique that is most related to our model is the MCoC method [Xu et al. 2012]. It assumes each user and item can appear in multiple groups and clusters users and items by limited rating records. However, the rating matrix is usually very sparse because most users only rate a small fraction of items.
The sparse relationships between users and items may not be sufficient to find meaningful clusters. In this article, different from previous clustering techniques used in recommender systems, we try to find meaningful user–item groups by integrating multisource information. To the best of our knowledge, there has been no attempt to incorporate heterogeneous information from various resources for clustering in clustering CF models.

2.3. Recommendation Using Additional Information

Due to the lack of sufficient rating records, many research works have exploited additional information to enhance recommendation performance, such as metadata [Ahn and Shi 2009], tags [Peng et al. 2010; Zhang et al. 2010], social relations [Massa and Avesani 2007; Jamali and Ester 2010], and other social media information [Bu et al. 2010].


Fig. 1. Overview of the recommendation framework.

Ahn and Shi [2009] use five types of cultural metadata (user comments, plot outline, synopsis, plot keywords, and genres provided by IMDB) to calculate similarities between movies. Zhang et al. [2010] propose a recommendation algorithm by integrating diffusion on user–tag–item tripartite graphs. Symeonidis et al. [2008] represent the relationships between users, items, and tags by a tensor and then decompose the full folksonomy tensor using Higher-Order Singular Value Decomposition (HOSVD).

Social relations, such as trust and friendship relations, have been regarded as potentially valuable information in recommender systems because of the homophily and selection effects. The common rationale behind this is that a user's taste is influenced by his or her trusted friends in social networks. SocialMF [Jamali and Ester 2010] incorporates trust propagation into probabilistic matrix factorization and achieves better recommendation accuracy. Bu et al. [2010] use a unified hypergraph to model the high-order relations in social media, together with music acoustic-based content, to make recommendations. This approach is an application of ranking on graph data and requires learning a ranking function. In their hypergraph model, they need to store the vertex–hyperedge incidence matrix, which demands much larger memory space, and the time complexity of computing the inverse of the matrix is relatively high. Besides, in their work, different kinds of relations contribute equally to the recommendation.

As far as we know, no research has studied whether such additional information contributes to clustering users and items in recommender systems.

3. HYBRID MULTIGROUP COCLUSTERING RECOMMENDATION FRAMEWORK

3.1. Overview

As illustrated in Figure 1, our framework is composed of three main modules: information fusion, hybrid multigroup coclustering, and the top-n recommendation module.

(1) Information Fusion: In this module, we integrate information from different sources. Besides the rating matrix, we also use the user's social network and the item's topic information. In recommender systems, it is usually time-consuming to obtain topics of items from large amounts of plain text with natural language processing techniques. Instead, we utilize a publicly available knowledge base (e.g., DBpedia) to help us extract accurate and specific properties of items. Lastly, we use a uniform graph model to represent the integrated information.

(2) Hybrid Multigroup Coclustering: With the information from the first module, we cocluster users and items into multiple groups simultaneously. We assume that users and the items to which they have given high rating scores should belong to the same one or more groups. To satisfy connectivity coherency and topical consistency, users who have tight social relationships are likely to appear in the same groups,


Fig. 2. A fraction of attributes extracted from DBpedia related to the movie Avatar.

and items that have strong implicit correlations are also likely to appear in the same groups. We combine the one-sided and two-sided clustering techniques and present a fuzzy c-means-based clustering method to discover user–item clusters with different information sources.

(3) Top-n Recommendation: In this module, we choose a suitable CF recommendation algorithm and implement it independently on the user–item submatrices derived from the clustering results. By merging the predictions from each cluster, we finally make top-n recommendations to the target users.

3.2. Information Fusion Module

Conventional CF algorithms make recommendations based on a set of users U = {u_1, u_2, ..., u_m}, a set of items I = {i_1, i_2, ..., i_n}, and the rating matrix R ∈ R^{m×n}, where each element R_{ij} denotes the rating score that user u_i gives to item i_j. With the increasing popularity of social networks, many people maintain their social relations online, such as friendships on Facebook (https://www.facebook.com) or Last.fm (http://www.last.fm) and trust relations in Epinions (http://www.epinions.com). Social relations have been regarded as potentially valuable information in recommender systems because they can be usefully applied to find users' like-minded neighbors and reduce the data sparsity problem. Many researchers have successfully exploited social relations to improve the performance of online recommender systems [Massa and Avesani 2007; Jamali and Ester 2010]. In this article, we investigate whether social relations can contribute to clustering users.

Apart from social relations, other relations such as item–category could also be incorporated into recommender systems to make up for the lack of rating information [Zhang et al. 2013]. However, in many recommender systems, the category information about items is absent or too general, for example, Action or Drama in the movie domain. In recent years, thanks to the advancement of the Web of Data, we have access to abundant knowledge bases such as Freebase (http://www.freebase.com/) and DBpedia (http://dbpedia.org/). These knowledge bases typically contain a set of concepts, instances, and relations, where the information about instances is structured, specific, and comprehensive. In Figure 2, we show partial information about the movie Avatar extracted from DBpedia. The category information in this figure can represent the topics expressed by the movie. Several research works have improved recommendation by exploiting a knowledge base [Di Noia et al. 2012a, 2012b]. In our study, for each item


in recommender systems, we try to map it to an instance in DBpedia and extract its associated properties to build its profile. Since the category property contains most of the information about an item, we only use the category information of items in this article. In DBpedia, the categories are modeled as a hierarchical structure, which allows us to catch implicit relations and expand information. In HMCoC, we extract not only the items' categories but also the categories' parent categories from DBpedia in one step.

In order to calculate the implicit correlations between each pair of items, we adopt the vector-based method used in Di Noia et al. [2012a], where each item i_j is represented as a vector ω_j = (ω_{ja_1}, ω_{ja_2}, ..., ω_{ja_t}). The nonbinary weights in the vector are TF-IDF weights of category terms. More precisely, they are computed as

$$\omega_{j a_i} = tf_{j a_i} \times \log \frac{n}{n_{a_i}}, \quad (1)$$

where n is the number of items and n_{a_i} is the number of items that belong to category a_i; tf_{j a_i} = 1 if item i_j belongs to category a_i, otherwise 0. Thus, more general categories, such as English language films in the movie domain, will have lower weights. As is often done in the classical vector space model, we evaluate the implicit correlation between items i_j and i_k by the cosine similarity between their vectors:

$$sim(\omega_j, \omega_k) = \frac{\sum_{i=1}^{t} \omega_{j a_i} \times \omega_{k a_i}}{\sqrt{\sum_{i=1}^{t} \omega_{j a_i}^2} \cdot \sqrt{\sum_{i=1}^{t} \omega_{k a_i}^2}}. \quad (2)$$

So far, we have three different types of relations in our framework, including rating behavior R, user social relations F, and item implicit correlations S. In fact, we can fuse this information into a heterogeneous graph model G = (V, E), where V = U ∪ I and E = R ∪ F ∪ S. In this graph, we have two different kinds of vertices, namely users and items, and three different types of edges. R ⊆ U × I represents the rating behaviors, and the weights of these edges are the rating scores.
F ⊆ U × U represents the social relations between users; F_{ij} = 1 if users u_i and u_j are friends or u_i trusts u_j, and otherwise F_{ij} = 0. Similarly, S ⊆ I × I is the implicit correlation between items, and the weight of each edge is the cosine similarity computed by Equation (2).

3.3. Hybrid Multigroup Coclustering Module

In this module, the goal is to cocluster users and items into multiple groups simultaneously. What differentiates our work from prior methods is that any user or item can belong to more than one group in different degrees, and furthermore, we utilize not only the rating information in the clustering procedure but also users' social relations and items' implicit correlations. Important notations used in the rest of the article are listed in Table I. We use subscripts i and j to index the ith and jth rows of a matrix, and ij to index the cell in the ith row and jth column of a matrix. We first define the concept of group and then present the formulation of the hybrid multigroup coclustering problem. Lastly, we propose an approximate solution to the optimization problem.

Definition 3.1 (Group). Let G(V, E) be a graph where V = U ∪ I is the vertex set of G, and U and I are the user set and item set, respectively. For a given positive integer L and a fuzzy clustering method on V, if V can be partitioned into L nonempty subsets V_k = U_k ∪ I_k such that ∪_{k=1}^{L} U_k = U and ∪_{k=1}^{L} I_k = I (1 ≤ k ≤ L), we call V_k a group.
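Definition 3.1 allows groups to overlap while still covering every user and item. A toy illustration (user and item names are invented, not from the paper):

```python
# Toy illustration of Definition 3.1: L = 2 overlapping user-item groups.
users = {"u1", "u2", "u3"}
items = {"i1", "i2"}

# A user or item may appear in more than one group (here u2 and i1 do).
groups = [
    {"users": {"u1", "u2"}, "items": {"i1"}},
    {"users": {"u2", "u3"}, "items": {"i1", "i2"}},
]

# Every group V_k = U_k ∪ I_k is nonempty, and the groups jointly
# cover all users and all items, as the definition requires.
assert all(g["users"] | g["items"] for g in groups)
assert set.union(*(g["users"] for g in groups)) == users
assert set.union(*(g["items"] for g in groups)) == items
```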

Table I. Notations

Notation   Description
U, I       the user set and item set
R          the rating matrix
F          the social relationship matrix
S          the item correlation matrix
m, n       the number of users and items
V_k        the kth group
Y          the membership matrix of clustering results
P, Q       the membership matrices of users and items
D          the diagonal degree matrix
L          the number of groups
α, β       the tradeoff parameters

We use membership matrix Y ∈ [0, 1]^{(m+n)×L} to represent the clustering results, where each element Y_{ik} is the relative weight of entry i belonging to group V_k. We can fix the number of groups each user or item can belong to, say K groups (1 ≤ K ≤ L). Then we have exactly K nonnegative weights in each row, and the remaining entries are set to zero. Specifically, matrix Y can be written as

$$Y = \begin{bmatrix} P \\ Q \end{bmatrix}, \quad (3)$$

where P ∈ [0, 1]^{m×L} is the membership matrix for users and Q ∈ [0, 1]^{n×L} for items. Different numbers of groups may have different effects on the recommendation performance, and we conduct experiments to investigate this in Section 5.1.

We aim to group users and items simultaneously and allow each user and item to belong to multiple groups. Intuitively, if a user gave a high rating score to an item, this user and item are likely to belong to the same one or more groups. Furthermore, if two users are connected in social networks, they probably appear in one or more groups together, and if two items have strong implicit correlations, they might belong to one or more of the same groups. Note that unlike spectral clustering [Von Luxburg 2007] and bipartite spectral graph partitioning [Dhillon 2001] algorithms, we have two distinct types of vertices and three different kinds of adjacency matrices in graph model G. These algorithms cannot be directly used here without adaptation.

The pairwise relationships between users and items are represented in our undirected weighted graph G. Considering the differences in the scale of weights and structure, we need to model the inter- and intrarelationships between users and items separately. In our clustering method, we assume that if users and items are strongly associated, their group indicator vectors P_i and Q_j should be as close as possible. The local variation between two connected objects is the difference between their group indicator vectors.
However, before computing the local variation, we need to split the object's indicator vector among adjacent objects to make it balanced. We use L(P, Q) to denote our loss function, which comprises three different terms. The first term indicates that if a user has rated an item, their group indicator vectors (P_i and Q_j) should be close. The second term implies that if two users are friends, they should have similar interests and their group indicator vectors (P_i and P_j) should be close. The last term states that if two items are correlated with each other, their group indicator vectors (Q_i and Q_j) should be close. Each term is proportional to the relationship weights between users and/or items, which are R_{ij}, F_{ij}, and S_{ij}. In order to group strongly associated users and items, inspired by Zhang et al. [2012] and


Von Luxburg [2007], we propose our clustering problem as follows:

$$L(P, Q) = \sum_{i=1}^{m}\sum_{j=1}^{n} \left\| \frac{P_i}{\sqrt{D^{row}_{ii}}} - \frac{Q_j}{\sqrt{D^{col}_{jj}}} \right\|^2 R_{ij} + \alpha \sum_{i=1}^{m}\sum_{j=1}^{m} \left\| \frac{P_i}{\sqrt{D^{F}_{ii}}} - \frac{P_j}{\sqrt{D^{F}_{jj}}} \right\|^2 F_{ij} + \beta \sum_{i=1}^{n}\sum_{j=1}^{n} \left\| \frac{Q_i}{\sqrt{D^{S}_{ii}}} - \frac{Q_j}{\sqrt{D^{S}_{jj}}} \right\|^2 S_{ij}, \quad (4)$$
where D^{row} ∈ R^{m×m} and D^{col} ∈ R^{n×n} are two diagonal degree matrices of users and items, respectively, with D^{row}_{ii} = Σ_{j=1}^{n} R_{ij} and D^{col}_{jj} = Σ_{i=1}^{m} R_{ij}. Usually the user social relations and item implicit correlations are symmetric, so we use D^F and D^S to denote the diagonal degree matrices of F and S. Parameter α ≥ 0 controls the social-relation-constrained user-side clustering, and β ≥ 0 controls the implicit-correlation-constrained item-side clustering. The joint objective function concerns not only the user–item preferences but also the connectivity coherency between users and the topical consistency between items. After some algebraic derivations, Equation (4) can be rewritten in matrix form as follows:

$$\begin{aligned}
L(P, Q) &= \sum_{i=1}^{m} \|P_i\|^2 + \sum_{j=1}^{n} \|Q_j\|^2 - \sum_{i=1}^{m}\sum_{j=1}^{n} \frac{2 P_i Q_j^T R_{ij}}{\sqrt{D^{row}_{ii} D^{col}_{jj}}} \\
&\quad + \alpha \left( \sum_{i=1}^{m} \|P_i\|^2 + \sum_{j=1}^{m} \|P_j\|^2 - \sum_{i=1}^{m}\sum_{j=1}^{m} \frac{2 P_i P_j^T F_{ij}}{\sqrt{D^{F}_{ii} D^{F}_{jj}}} \right) \\
&\quad + \beta \left( \sum_{i=1}^{n} \|Q_i\|^2 + \sum_{j=1}^{n} \|Q_j\|^2 - \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{2 Q_i Q_j^T S_{ij}}{\sqrt{D^{S}_{ii} D^{S}_{jj}}} \right) \\
&= \mathrm{Tr}\left( P^T P + Q^T Q - 2 P^T A Q + \alpha (P^T P + P^T P - 2 P^T B P) + \beta (Q^T Q + Q^T Q - 2 Q^T C Q) \right) \\
&= \mathrm{Tr}\left( [P^T\; Q^T] \begin{bmatrix} I_m & -A \\ -A^T & I_n \end{bmatrix} \begin{bmatrix} P \\ Q \end{bmatrix} \right) + \mathrm{Tr}\left( [P^T\; Q^T] \begin{bmatrix} 2\alpha(I_m - B) & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} P \\ Q \end{bmatrix} \right) + \mathrm{Tr}\left( [P^T\; Q^T] \begin{bmatrix} 0 & 0 \\ 0 & 2\beta(I_n - C) \end{bmatrix} \begin{bmatrix} P \\ Q \end{bmatrix} \right) \\
&= \mathrm{Tr}\left( Y^T \begin{bmatrix} I_m + 2\alpha(I_m - B) & -A \\ -A^T & I_n + 2\beta(I_n - C) \end{bmatrix} Y \right) \\
&= \mathrm{Tr}(Y^T M Y). \quad (5)
\end{aligned}$$

In Equation (5), we have

$$A = (D^{row})^{-\frac{1}{2}} R\, (D^{col})^{-\frac{1}{2}}, \qquad B = (D^{F})^{-\frac{1}{2}} F\, (D^{F})^{-\frac{1}{2}}, \qquad C = (D^{S})^{-\frac{1}{2}} S\, (D^{S})^{-\frac{1}{2}}, \quad (6)$$

and

$$M = \begin{bmatrix} I_m + 2\alpha(I_m - B) & -A \\ -A^T & I_n + 2\beta(I_n - C) \end{bmatrix}. \quad (7)$$
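Equations (6) and (7) translate directly into code. A minimal numpy sketch of assembling M from R, F, and S (the toy matrices below are illustrative and assume every user and item has at least one nonzero degree, so the normalizations are well defined):

```python
import numpy as np

def norm_adj(W, d_row, d_col):
    # D^{-1/2} W D^{-1/2}, written elementwise with the given degree vectors.
    return W / np.sqrt(np.outer(d_row, d_col))

def build_M(R, F, S, alpha, beta):
    """Build the block matrix M of Eq. (7) from ratings R (m x n),
    social relations F (m x m), and item correlations S (n x n)."""
    m, n = R.shape
    A = norm_adj(R, R.sum(axis=1), R.sum(axis=0))  # Eq. (6): user-item
    B = norm_adj(F, F.sum(axis=1), F.sum(axis=1))  # Eq. (6): user-user
    C = norm_adj(S, S.sum(axis=1), S.sum(axis=1))  # Eq. (6): item-item
    I_m, I_n = np.eye(m), np.eye(n)
    return np.block([
        [I_m + 2 * alpha * (I_m - B), -A],
        [-A.T,                        I_n + 2 * beta * (I_n - C)],
    ])

# Toy data: 3 users, 2 items (values invented for illustration).
R = np.array([[5., 0.], [3., 4.], [0., 2.]])
F = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
S = np.array([[0., 0.8], [0.8, 0.]])
M = build_M(R, F, S, alpha=0.1, beta=0.1)

# Theorem 3.1 predicts M is positive semidefinite: all eigenvalues >= 0.
eigvals = np.linalg.eigvalsh(M)
```

Checking `eigvals.min() >= 0` (up to floating-point tolerance) on such toy data is a quick sanity check of the derivation in Theorem 3.1.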

Matrices A, B, and C are the normalized Laplacian matrices of R, F, and S, respectively. Finally, with the loss function in Equation (5), we define the hybrid multigroup coclustering problem as in Definition 3.2. Definition 3.2. The hybrid multigroup coclustering problem is defined as min Y

s.t.

T r(Y  MY ) Y ∈ [0, 1](m+n)×L, Y 1 L = 1m+n, |Yi | = K, i = 1, . . . , (m + n),

(8)

where L is the number of clusters and K is the biggest number of groups each user or item can belong to. Matrix M is given by Equation (7). Notation |Yi | means the number of nonzero elements in each row of matrix Y . Before we discuss the solution to our problem (Equation (3.2)), let us first prove a property of matrix M given in Theorem 3.1. THEOREM 3.1. Matrix M given by Equation (7) is positive semidefinite. PROOF. In linear algebra, an n × n real matrix M is said to be positive semidefinite if z Mz is nonnegative for every nonzero column vector z of n real numbers. From Equation (5), we see that M is positive semidefinite if we can prove the three following matrices are positive semidefinite: 

$$
\begin{bmatrix} I_m & -A \\ -A^T & I_n \end{bmatrix}, \qquad
\begin{bmatrix} 2\alpha(I_m - B) & 0 \\ 0 & 0 \end{bmatrix}, \qquad
\begin{bmatrix} 0 & 0 \\ 0 & 2\beta(I_n - C) \end{bmatrix}.
\tag{9}
$$

For any vectors x = [x_1, ..., x_m]^T and y = [y_1, ..., y_n]^T,

$$
\begin{aligned}
[x^T\; y^T] \begin{bmatrix} I_m & -A \\ -A^T & I_n \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
={}& x^T x - y^T A^T x - x^T A y + y^T y \\
={}& x^T x + y^T y - 2 \sum_{i=1}^{m}\sum_{j=1}^{n} \sqrt{\frac{R_{ij}}{D^{row}_{ii}}} \sqrt{\frac{R_{ij}}{D^{col}_{jj}}}\, x_i y_j \\
& + \sum_{i=1}^{m}\sum_{j=1}^{n} \frac{R_{ij}}{D^{row}_{ii}}\, x_i^2
  + \sum_{i=1}^{m}\sum_{j=1}^{n} \frac{R_{ij}}{D^{col}_{jj}}\, y_j^2
  - x^T x - y^T y \\
={}& \sum_{i=1}^{m}\sum_{j=1}^{n} \left( \sqrt{\frac{R_{ij}}{D^{row}_{ii}}}\, x_i - \sqrt{\frac{R_{ij}}{D^{col}_{jj}}}\, y_j \right)^2 \geq 0
\end{aligned}
\tag{10}
$$


and 

$$
\begin{aligned}
[x^T\; y^T] \begin{bmatrix} 2\alpha(I_m - B) & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
={}& \alpha \left( 2 x^T x - 2 x^T B x \right) \\
={}& \alpha \left( 2 \sum_{i=1}^{m} x_i^2 - 2 \sum_{i=1}^{m}\sum_{j=1}^{m} \sqrt{\frac{F_{ij}}{D^F_{ii}}} \sqrt{\frac{F_{ij}}{D^F_{jj}}}\, x_i x_j \right) \\
={}& \alpha \sum_{i=1}^{m}\sum_{j=1}^{m} \left( \sqrt{\frac{F_{ij}}{D^F_{ii}}}\, x_i - \sqrt{\frac{F_{ij}}{D^F_{jj}}}\, x_j \right)^2 \geq 0.
\end{aligned}
\tag{11}
$$

The proof for the third matrix is similar to the proof of Equation (11); the detailed steps are left to the interested reader. Since the sum of positive semidefinite matrices is still positive semidefinite, matrix M is positive semidefinite.

However, the optimization problem in Equation (8) is not easy to solve directly, because it is nonconvex and discontinuous. To solve it efficiently, we relax Equation (8) following the spectral clustering method given in Von Luxburg [2007]. First, we map all the users and items into a common low-dimensional subspace, and then we cluster them simultaneously in this subspace. Let Z ∈ R^{(m+n)×r} be the matrix whose rows are the low-dimensional representations of the users and items in the r-dimensional subspace. The optimal Z* is obtained by solving the following problem:

$$
\min_{Z} \ \mathrm{Tr}(Z^T M Z) \quad \text{s.t.} \quad Z \in \mathbb{R}^{(m+n) \times r}, \quad Z^T Z = I.
\tag{12}
$$
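Under the relaxation in Equation (12), the optimizer is given by an eigendecomposition of M. A minimal NumPy sketch (for a large sparse M, `scipy.sparse.linalg.eigsh` would be the natural replacement):

```python
import numpy as np

def spectral_embedding(M, r):
    """Return Z*, the eigenvectors of M associated with its r smallest
    eigenvalues (the Rayleigh-Ritz solution of problem (12))."""
    eigvals, eigvecs = np.linalg.eigh(M)  # ascending eigenvalue order
    return eigvecs[:, :r]                 # columns z_1, ..., z_r
```

The rows of the returned matrix are the r-dimensional representations of the m users and n items, ready to be fed to the fuzzy c-means step described next.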

Since matrix M is positive semidefinite, according to the Rayleigh-Ritz theorem [MacDonald 1933], the optimal solution Z* is given by the solution of the eigenvalue problem MZ = λZ. Z* = [z_1, ..., z_r] can be used as the approximate solution to our hybrid multigroup coclustering problem, where z_1, ..., z_r are the eigenvectors of matrix M corresponding to its r smallest eigenvalues. Once we obtain the unified representation Z of users and items, each row of Z can be used as the feature vector of the corresponding user or item. We utilize fuzzy c-means [Lkeski 2003] to cluster users and items into L groups with Z. After the clustering procedure, for each row of the membership matrix Y, only the K largest entries are retained and then normalized to sum to 1. The pseudocode of the Hybrid Multigroup Coclustering method is shown in Algorithm 1.

3.4. Top-n Recommendation Module

Now we describe how to combine the groups obtained in the previous section with conventional collaborative filtering methods. Intuitively, for each group V_k, we can extract a submatrix from the original user–item matrix R containing only the users and items appearing in that group. Let R_k ∈ R^{m_k×n_k} denote the rating matrix for group V_k, where k = 1, ..., L, and m_k and n_k are the numbers of users and items in that group, respectively. For a traditional CF method such as user-based CF or NMF, the input is the user–item rating matrix and the output is the set of predicted scores for the missing values in that matrix. We can apply any rating-based CF method to each submatrix independently and merge the prediction results together from all the


Fig. 3. Illustration by example of recommendation procedure in HMCoC.

ALGORITHM 1: Hybrid Multigroup Coclustering Algorithm
Input: Rating matrix R ∈ R^{m×n}, user social relations F ∈ R^{m×m}, item implicit correlations S ∈ R^{n×n}, the number of groups L, and the number of feature vectors r.
Output: Group membership matrix Y.
1 Compute the normalized Laplacians A, B, and C according to Equation (6);
2 Construct matrix M from A, B, and C according to Equation (7);
3 Compute the first r smallest eigenvectors z_1, ..., z_r of M;
4 Let Z ∈ R^{(m+n)×r} be the matrix containing the vectors z_1, ..., z_r as columns;
5 For i = 1, ..., m + n, let y_i be the vector corresponding to the ith row of Z;
6 Cluster the points {y_i} (i = 1, ..., m + n) with fuzzy c-means into L groups, resulting in the membership matrix Y ∈ [0, 1]^{(m+n)×L};
7 For each row of Y, retain the top-K largest entries, set the others to zero, and then normalize the row.
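Step 7 of Algorithm 1 (truncating each fuzzy membership row to its K largest entries and renormalizing) can be sketched as follows; the function name is ours:

```python
import numpy as np

def truncate_memberships(Y, K):
    """Keep the K largest entries of each row of the fuzzy membership
    matrix Y, zero out the rest, and renormalize each row to sum to 1."""
    Y = np.asarray(Y, dtype=float)
    out = np.zeros_like(Y)
    for i, row in enumerate(Y):
        top = np.argsort(row)[-K:]        # indices of the K largest entries
        out[i, top] = row[top]
    return out / out.sum(axis=1, keepdims=True)
```

For example, with K = 2 the membership row [0.5, 0.3, 0.2] becomes [0.625, 0.375, 0].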

groups at last. The dimensions of the submatrices are much smaller than those of the original matrix, and the CF model can be executed in parallel; therefore, many CF models can be applied online with very large datasets. To be more specific, we show an example in Figure 3. Since each user and item can belong to multiple (K ≥ 1) groups, we need to merge the prediction results generated from these groups. As in Xu et al. [2012], we define the final prediction score of user u_i for item i_j as

$$
\hat{R}_{ij} =
\begin{cases}
\displaystyle\sum_{k} \tilde{r}(u_i, i_j, k) \cdot \omega_{ik} & \text{if } u_i \text{ and } i_j \text{ belong to one or more same groups}, \\
0 & \text{otherwise},
\end{cases}
\tag{13}
$$

where r̃(u_i, i_j, k) is the prediction score of u_i for item i_j in the kth group given by the chosen CF algorithm, and ω_ik is the weight. ω_ik can be set to the relative weight of user u_i belonging to group k. For simplicity, we set ω_ik = 1 if Y_ik is the maximum entry satisfying Y_ik ≠ 0 and Y_jk ≠ 0 (1 ≤ k ≤ L), and ω_ik = 0 otherwise. With the recommendation framework described earlier, for each user we sort the prediction scores in decreasing order and recommend the top-n items to the user. In fact, HMCoC can filter out many items for a user if these items do not appear in any of the groups the user belongs to. The pseudocode of the top-n recommendation process is shown in Algorithm 2.

ALGORITHM 2: Top-n Recommendation Algorithm
Input: Rating matrix R ∈ R^{m×n}, all the groups {V_1, ..., V_L}, a chosen CF method, and the number of items in the recommendation list N.
Output: Recommendation list for each user.
1 for k ← 1 to L do
2     Extract submatrix R_k from rating matrix R with the users and items belonging to group V_k;
3     Apply the CF recommendation method with R_k as input and predict the missing scores r̃(u_i, i_j, k).
4 end
5 for i ← 1 to m do
6     for j ← 1 to n do
7         if R_ij is missing then
8             Find the group index k = argmax_k {Y_ik | Y_ik ≠ 0 and Y_jk ≠ 0};
9             if k is null then
10                Set R̂_ij = 0;
11            else
12                Set R̂_ij = r̃(u_i, i_j, k);
13            end
14        end
15    end
16    Generate the top-n recommendation list for user u_i according to the decreasing order of the predicted scores.
17 end
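Equation (13) with the simple weight scheme, together with the ranking loop of Algorithm 2, can be sketched as follows (a toy dense illustration; all names are ours):

```python
import numpy as np

def merge_and_rank(pred, Yu, Yi, n=10):
    """Merge per-group CF predictions (Eq. (13), simple weights) and
    return the top-n item indices per user.

    pred -- pred[k][(i, j)] is the score predicted in group k
            for user i and item j
    Yu   -- (m x L) user membership matrix
    Yi   -- (n_items x L) item membership matrix
    """
    m, L = Yu.shape
    n_items = Yi.shape[0]
    R_hat = np.zeros((m, n_items))
    for i in range(m):
        for j in range(n_items):
            shared = [k for k in range(L) if Yu[i, k] > 0 and Yi[j, k] > 0]
            if shared:
                # weight scheme: take the shared group maximizing Y_ik
                k = max(shared, key=lambda g: Yu[i, g])
                R_hat[i, j] = pred[k].get((i, j), 0.0)
    # rank items per user by decreasing predicted score
    return np.argsort(-R_hat, axis=1)[:, :n]
```

Items outside every group of a user keep a zero score, which is how HMCoC filters them out of the ranking.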

4. EXPERIMENTAL SETTINGS

4.1. Datasets

Our experiments are carried out on two real datasets, Movielens-1M6 (ML1M) and Last.fm7 (LF), and we use the mappings of items to DBpedia instances published by Di Noia et al. [2012b] and Ostuni et al. [2013].8 The ML1M dataset contains user rating scores for different movies on a 1 to 5 star scale. This dataset does not have user social networks. The second dataset comes from the Last.fm online music system. Last.fm is an implicit feedback dataset, in which each user has a list of most-listened-to music artists, and the weight indicates the listening frequency of a user for an artist. Its users are interconnected in a social network generated from Last.fm bidirectional friend relations. We delete the artists that have been listened to only once. The basic statistics of these two datasets are shown in Table II. In our experiments, the datasets were partitioned into five parts for fivefold cross-validation, where four parts were used for training and the remaining part for testing, and the averaged performances were reported.

4.2. Evaluation Metrics

In reality, recommender systems care more about personalized rankings of items than about absolute rating predictions [Cremonesi et al. 2010; Deshpande and Karypis 2004]. To be consistent with other top-n recommendation literature, three classical measures are selected to evaluate the accuracy of the ranked list: F1-measure, MAP

6 http://www.grouplens.org/node/73.
7 http://ir.ii.uam.es/hetrec2011/datasets.html.
8 http://sisinflab.poliba.it/semanticweb/lod/recsys/datasets/.

Table II. Statistics of Datasets

                              Movielens-1M    Last.fm
# of users                        6,040         1,885
# of items                        3,952         6,953
# of items found in DBpedia       3,148         5,209
# of categories                   9,042        18,134
# of ratings                  1,000,209        82,155
# of relations                        -        25,334
# of ratings per user            165.60         43.58
# of ratings per item            253.09         11.82
# of friends per user                 -         13.44
# of categories per item          49.66         29.19

(Mean Average Precision), and NDCG (Normalized Discounted Cumulative Gain). For each item in the recommendation list, if a user had a rating in the test data, we assume that he or she was interested in this item. To compute the F1-measure, let precision and recall be the user-oriented averaging precision and recall for the ranked list:

$$
F_1 = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}.
\tag{14}
$$

For each user u, given a ranked list with n items, we denote prec(j) as the precision at rank position j, and pref(j) as the preference indicator of the item at position j. If the item at position j is rated by user u in the test set, pref(j) = 1, and otherwise 0. Average Precision (AP) is computed as the average of the precisions computed at each position in the ranked list, and MAP is the mean of AP for all users:

$$
\mathrm{AP}(u) = \frac{\sum_{j=1}^{n} \mathrm{prec}(j) \times \mathrm{pref}(j)}{n}, \qquad
\mathrm{MAP} = \frac{1}{|U|} \sum_{u \in U} \mathrm{AP}(u).
\tag{15}
$$
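The per-user computations in Equations (14) and (15) reduce to a few cumulative sums. A minimal sketch (function names ours):

```python
import numpy as np

def average_precision(pref):
    """AP of one ranked list per Eq. (15); pref is the list of 0/1
    preference indicators pref(j), j = 1..n."""
    pref = np.asarray(pref, dtype=float)
    n = len(pref)
    prec = np.cumsum(pref) / np.arange(1, n + 1)  # prec(j) at each rank
    return float(np.sum(prec * pref) / n)

def f1_measure(precision, recall):
    """F1-measure per Eq. (14)."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)
```

MAP is then simply the mean of `average_precision` over all users.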

In addition, when evaluating lists of recommended items, the position of a relevant item in the ranked list is also important. NDCG gives more weight to items whose positions are near the front:

$$
\mathrm{NDCG} = \frac{1}{\mathrm{IDCG}} \times \sum_{j=1}^{n} \frac{2^{\mathrm{pref}(j)} - 1}{\log_2(j + 1)},
\tag{16}
$$

where IDCG is the DCG value produced by a perfect ranking algorithm. Higher F1, MAP, and NDCG imply better recommendation performance. In the experiments, we recommend the top-20 items for each user.

4.3. Comparisons

Here we chose four popular CF models as the basic CF algorithms, including one memory-based recommendation method, User-based CF (UserCF) [Huang et al. 2007], and three model-based recommendation methods, PureSVD [Sarwar et al. 2000], NMF [Lee and Seung 2000], and SLIM [Ning and Karypis 2011]. For user-based CF, we used the Pearson correlation to measure user–user similarities. We set the dimension of the latent features to be six in PureSVD and NMF. For SLIM, we set the regularization parameters λ = 0.01 and the number of neighbors k = 30. To investigate the effect of clustering models in CF recommendation, we used several

Table III. Performance Comparisons on LF in Terms of MAP, NDCG, and F1 with Group = 10 and 20

                          10 Groups                       20 Groups
Methods            MAP      NDCG@10   F1@10       MAP      NDCG@10   F1@10
UserCF            0.2084    0.2076    0.0800     0.2084    0.2076    0.0800
Single+UserCF     0.2253    0.2217    0.0760     0.1359    0.1547    0.0385
MCoC+UserCF       0.2381    0.2327    0.0816     0.2386    0.2393    0.0809
HMCoC+UserCF      0.2614**  0.2532*   0.0989*    0.2742**  0.2511*   0.1026*
PureSVD           0.2091    0.2419    0.0998     0.2091    0.2419    0.0998
Single+PureSVD    0.2695    0.2772    0.1124     0.1800    0.1763    0.0609
MCoC+PureSVD      0.3147    0.2794    0.1326     0.3208    0.2799    0.1172
HMCoC+PureSVD     0.3308**  0.2845    0.1347     0.3488**  0.2942*   0.1328**
NMF               0.3189    0.2697    0.1408     0.3189    0.2697    0.1408
Single+NMF        0.2699    0.2585    0.1204     0.1968    0.1893    0.0713
MCoC+NMF          0.3287    0.2674    0.1501     0.3302    0.2713    0.1515
HMCoC+NMF         0.3420**  0.2906**  0.1522     0.3514**  0.2941*   0.1507
SLIM              0.3119    0.2619    0.1307     0.3119    0.2619    0.1307
Single+SLIM       0.2696    0.2627    0.1142     0.2082    0.1877    0.0729
MCoC+SLIM         0.3279    0.2665    0.1396     0.3273    0.2618    0.1422
HMCoC+SLIM        0.3380*   0.2842**  0.1405     0.3401**  0.2857**  0.1436

Bold typeset indicates the best performance. ** indicates statistical significance at p < 0.001. * indicates statistical significance at p < 0.01 compared to the second best.

variant clustering models in combination with UserCF, PureSVD, NMF, and SLIM. The clustering models are as follows:

—Single: In the single-cluster model, we use k-means to cluster users and items with the eigenvectors Z computed according to Equation (12). Each user and item can belong to only one cluster.
—MCoC: This model uses only rating information in clustering, which is the same as Xu et al. [2012]. We use this model to investigate whether or not the assumption that users and items belong to multiple groups is more reasonable.
—HMCoC: In this model, besides rating information, we also use user social network information and item implicit correlations when clustering.

5. EXPERIMENTAL RESULTS AND DISCUSSION

Performance on Last.fm. In LF, the user–item ratings span a very large range: for a user, some artists are listened to just once while others are listened to more than 10,000 times. We therefore rescaled the ratings by R̃_ij = log2(R_ij) to alleviate the large variance. In this experimental setting, we set α = 0.3, β = 0.3, and K = log2 L. Table III shows the experimental results on the Last.fm dataset with different combinations of CF models and clustering models. From Table III, we observe that MCoC and HMCoC yield better performance under most of the evaluation conditions. This verifies the assumption that recommendation performance can be improved if we consider that users and items can belong to multiple groups. When the number of groups changes from 10 to 20, we can see that the Single+CF models perform better when the number of groups L is small (L = 10) and drop quickly when L increases (L = 20). This result is consistent with some previous clustering CF models [George and Merugu 2005; Sarwar et al. 2002], because a small number of clusters can help filter out irrelevant items or users and denoise. Usually, when L becomes large, the number of items in each cluster becomes too small to be recommended to each user. However, in our model, the performance is more stable and even improves as the number of groups L gets bigger. We believe that this is because the fuzzy weights may become more accurate as L increases


Fig. 4. Performance comparisons on different values of N (top-N) with 20 groups.

Table IV. Performance Comparisons on ML1M in Terms of MAP, NDCG, and F1 with Group = 10 and 20

                          10 Groups                       20 Groups
Methods            MAP      NDCG@10   F1@10       MAP      NDCG@10   F1@10
UserCF            0.2918    0.2872    0.0768     0.2918    0.2872    0.0768
Single+UserCF     0.2208    0.3015    0.0470     0.1372    0.2562    0.0228
MCoC+UserCF       0.2919    0.2838    0.0701     0.2790    0.2690    0.0637
HMCoC+UserCF      0.3109*   0.2989*   0.0717     0.2890*   0.2784*   0.0712
PureSVD           0.3870    0.3649    0.1042     0.3870    0.3649    0.1042
Single+PureSVD    0.2655    0.3657    0.0688     0.1731    0.3128    0.0320
MCoC+PureSVD      0.4151    0.3731    0.1199     0.4142    0.3760    0.1256
HMCoC+PureSVD     0.4264*   0.3838*   0.1283     0.4306**  0.3872*   0.1301
NMF               0.4043    0.3811    0.1203     0.4043    0.3811    0.1203
Single+NMF        0.2753    0.3573    0.0729     0.1987    0.2976    0.0377
MCoC+NMF          0.4121    0.3701    0.1330     0.4137    0.3742    0.1364
HMCoC+NMF         0.4261*   0.3800*   0.1376     0.4256**  0.3839**  0.1403
SLIM              0.4348    0.3767    0.1433     0.4348    0.3767    0.1433
Single+SLIM       0.2812    0.3470    0.0724     0.2017    0.2863    0.0366
MCoC+SLIM         0.4312    0.3664    0.1406     0.4329    0.3747    0.1440
HMCoC+SLIM        0.4335    0.3696    0.1428     0.4346    0.3824    0.1447

Bold typeset indicates the best performance. ** indicates statistical significance at p < 0.001. * indicates statistical significance at p < 0.01 compared to the second best.

and meanwhile the number of items in each group will not drop drastically, since each item can appear in multiple groups. We also observe that our HMCoC model yields the best performance in most cases. This verifies that, besides ratings, user social networks and item implicit correlations are both helpful in finding more accurate groups for CF. In Figure 4, we plot the precision values of different methods when the length of the recommendation list varies (1 to 10). We choose a memory-based CF method (UserCF) and a model-based CF method (SVD) as examples. Similar to the results in Table III, the single-cluster model performs worse than the baseline CF methods when the number of groups L = 20. As can be seen, our model consistently outperforms the other comparison partners.

Performance on Movielens-1M. In Movielens, there is no user social relation information, so we set α = 0. The experimental results are shown in Table IV. From Table II, we can see that Movielens-1M is a much denser dataset than Last.fm, so the basic CF models can achieve fairly good performance. We found the results to be similar to those on Last.fm; our recommendation framework still performs the best under most situations. The single-cluster model has very poor performance when the number of


Fig. 5. Impact of parameters L and K on Last.fm (a) and Movielens-1M (b).

groups is large. Surprisingly, MCoC and our model have a negative effect on SLIM. The reason may be that in the ML dataset the number of items is much smaller than the number of users, and SLIM needs a feature selection procedure to learn its parameters.

5.1. Parameter Selection

In this section, we conduct various experiments to investigate how the parameters in our HMCoC model affect the recommendation accuracy. We use PureSVD as our basic CF model; UserCF and NMF have similar results and are omitted here.

5.1.1. Impact of L and K. In our HMCoC model, L is the number of groups and K is the biggest number of groups a user or an item can belong to (1 ≤ K ≤ L). We conduct experiments on both LF and ML with L varying from 2 to 20. The impacts of these two parameters on recommendation performance are plotted in Figure 5. We can observe that our model achieves better performance when K is small. This result corresponds with our expectation, because people have diverse but also limited interests. In addition, we find that when L gets bigger, K needs to be bigger to reach a higher MAP. This is because when we divide users and items into more clusters, users and items need to belong to more clusters to keep the clusters big enough to make recommendations. Based on the previous analysis, we set K = log2 L.


Fig. 6. Impact of parameter r on Last.fm (a) and Movielens-1M (b).

Fig. 7. Impact of parameters α and β on Last.fm (a) and Movielens-1M (b).

5.1.2. Impact of r. Parameter r is the number of eigenvectors computed in Equation (12) and also the dimension of the feature vectors in fuzzy c-means clustering. We conduct experiments on both the LF and ML datasets with 10 and 20 groups and plot the MAP results in Figure 6. From the figure, we can see that our recommendation performance is competitive when we use just a few eigenvectors in the fuzzy c-means clustering process. So in our experiments, we select r = 4.

5.1.3. Impact of α and β. Two other important parameters, α and β, control the social-network-constrained user-side clustering and the implicit-correlation-constrained item-side clustering. Figure 7 shows how α and β affect the performance of HMCoC on the Last.fm and Movielens-1M datasets. They have similar trends as α and β increase. When α and β are small, they have little effect on the performance, because the information of the user social network and item implicit correlations is largely ignored. When they increase continuously (>1), the user social network and item implicit information overwhelm the rating information and cause performance to decline. From Figure 7(a), we can also see that social relations among users contribute more to the recommendation performance than item implicit correlations. When α and β are around 0.5, we obtain the best MAP measure.

5.2. Discussion

Sparsity. In order to show how our model can alleviate the sparsity problem in CF recommendation, we record the sparsity (the percentage of zero elements in a matrix) of the original rating matrix and also the average sparsity of the groups in Table V. In the table, "Random" means each user or item is assigned to multiple groups randomly,

Table V. Sparsity Comparisons on ML1M and LF

                   ML1M                      LF
            10 Groups  20 Groups      10 Groups  20 Groups
Original     0.9581     0.9581         0.9937     0.9937
Random       0.9785     0.9781         0.9816     0.9814
HMCoC        0.9512     0.9507         0.9536     0.9458

and the number of groups each user and item can belong to is the same as in HMCoC (K = log2 L). From the table, we can observe that the sparsity is largely reduced by using our HMCoC model. Furthermore, by comparing the average sparsity values in the last two rows of Table V, we can see that our clustering strategy is more effective than a random clustering strategy in alleviating sparsity.

Scalability. Our recommendation framework includes three main processes, that is, the information fusion process, the hybrid multigroup coclustering process, and the top-n recommendation process. Information fusion and clustering can be done offline, so the running time of top-n recommendation is not increased. In our clustering process, the time-consuming parts are the eigenvector computation and the fuzzy c-means clustering. However, our matrix M is highly sparse and positive semidefinite, and we require only a few eigenvectors, so the eigenvector computation and fuzzy clustering processes are relatively fast. It takes O((m + n)^2) time [Leordeanu and Hebert 2005] to compute the eigenvectors and O((m + n)dc) time [Kolen and Hutcheson 2002] to execute fuzzy c-means clustering, where m and n are the numbers of users and items, respectively, d is the dimension of the features, and c is the number of clusters. Furthermore, there are many existing software packages supporting parallel eigenvector computation and fuzzy clustering for large datasets. In the top-n recommendation process, the original user–item matrix R can be partitioned into much smaller submatrices according to the clustering results. Thus, the CF model can be executed independently by multiprocessing systems; therefore, many CF models can be applied online with very large datasets. The presented experimental results suggest that it is more reasonable to assume that users and items can belong to multiple groups.
Furthermore, integrating additional information resources, such as user social networks and item categories, can help generate better groups for recommendation. However, one problem of our framework is that the groups we obtain may be unbalanced; in extreme cases, there may exist groups with only a few items in them. In this situation, one solution is to put forward several popular items for recommendation. Our experiments are conducted on datasets whose items are in the same domain, such as movies (Movielens) and music (Last.fm). The distinctions between items may be hard to capture using only the categories in DBpedia. Other domain-specific knowledge bases with descriptions of movie plots or the moods of music could be further investigated.

6. CONCLUSION

In this article, we proposed a Hybrid Multigroup CoClustering recommendation framework, denoted as HMCoC, extending conventional CF-based recommendation algorithms with a novel clustering method. This framework allows users and items to be clustered into multiple groups. To generate the groups, we employed information from different sources, for example, the rating matrix, user social networks, and a knowledge base, and we represented this information with a unified graph model. In our top-n recommendation process, many traditional rating-based CF models can be used directly without any modification. The experimental results showed that our framework can reduce the sparsity problem and is effective in top-n recommendation on


Movielens-1M and Last.fm datasets in terms of MAP, NDCG, and F1. The experimental results on Last.fm also showed that user social relations contribute more to the performance improvement of our recommendation framework. In the future, we would like to test our framework on other multidomain datasets such as Epinions9 and Douban.10 In addition, we will investigate other clustering methods, such as community topic mining, and find better ways to combine groups and CF algorithms.

ACKNOWLEDGMENTS
The authors would like to thank the anonymous reviewers for their valuable comments and suggestions to improve the quality of the article.

REFERENCES Gediminas Adomavicius and Alexander Tuzhilin. 2005. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering 17, 6 (2005), 734–749. Shinhyun Ahn and Chung-Kon Shi. 2009. Exploring movie recommendation system using cultural metadata. In Transactions on Edutainment II. Springer, Berlin, 119–134. Jiajun Bu, Shulong Tan, Chun Chen, Can Wang, Hao Wu, Lijun Zhang, and Xiaofei He. 2010. Music recommendation by unified hypergraph: Combining social media information and music content. In Proceedings of the 18th International Conference on Multimedia. ACM, 391–400. Paolo Cremonesi, Yehuda Koren, and Roberto Turrin. 2010. Performance of recommender algorithms on topn recommendation tasks. In Proceedings of the 4th International Conference on Recommender Systems. ACM, 39–46. Mukund Deshpande and George Karypis. 2004. Item-based top-n recommendation algorithms. ACM Transactions on Information Systems 22, 1 (2004), 143–177. Christian Desrosiers and George Karypis. 2011. A comprehensive survey of neighborhood-based recommendation methods. In Recommender Systems Handbook. Springer US, 107–144. Inderjit S. Dhillon. 2001. Co-clustering documents and words using bipartite spectral graph partitioning. In Proceedings of the 7th International Conference on Knowledge Discovery and Data Mining. ACM, 269–274. Tommaso Di Noia, Roberto Mirizzi, Vito Claudio Ostuni, and Davide Romito. 2012a. Exploiting the web of data in model-based recommender systems. In Proceedings of the 6th International Conference on Recommender Systems. ACM, 253–256. Tommaso Di Noia, Roberto Mirizzi, Vito Claudio Ostuni, Davide Romito, and Markus Zanker. 2012b. Linked open data to support content-based recommender systems. In Proceedings of the 8th International Conference on Semantic Systems. ACM, 1–8. Thomas George and Srujana Merugu. 2005. 
A scalable collaborative filtering framework based on coclustering. In 5th IEEE International Conference on Data Mining. IEEE, 625–628. Songjie Gong. 2010a. A collaborative filtering recommendation algorithm based on user clustering and item clustering. Journal of Software 5, 7 (2010), 745–752. Songjie Gong. 2010b. An efficient collaborative recommendation algorithm based on item clustering. In Advances in Wireless Networks and Information Systems. Springer, 381–387. Thomas Hofmann. 2004. Latent semantic models for collaborative filtering. ACM Transactions on Information Systems 22, 1 (2004), 89–115. Thomas Hofmann and Jan Puzicha. 1999. Latent class models for collaborative filtering. In Proceedings of the 16th International Joint Conference on Artificial Intelligence. ACM, 688–693. Zan Huang, Daniel Zeng, and Hsinchun Chen. 2007. A comparison of collaborative-filtering recommendation algorithms for e-commerce. IEEE Intelligent Systems 22, 5 (2007), 68–78. Mohsen Jamali and Martin Ester. 2010. A matrix factorization technique with trust propagation for recommendation in social networks. In Proceedings of the 4th International Conference on Recommender Systems. ACM, 135–142. 9 http://www.epinions.com/. 10 http://www.douban.com/.


Meng Jiang, Peng Cui, Rui Liu, Qiang Yang, Fei Wang, Wenwu Zhu, and Shiqiang Yang. 2012. Social contextual recommendation. In Proceedings of the 21st International Conference on Information and Knowledge Management. ACM, 45–54. John F. Kolen and Tim Hutcheson. 2002. Reducing the time complexity of the fuzzy c-means algorithm. IEEE Transactions on Fuzzy Systems 10, 2 (2002), 263–267. Yehuda Koren. 2008. Factorization meets the neighborhood: A multifaceted collaborative filtering model. In Proceedings of the 14th International Conference on Knowledge Discovery and Data Mining. ACM, 426–434. Marius Leordeanu and Martial Hebert. 2005. A spectral technique for correspondence problems using pairwise constraints. In Proceedings of 10th IEEE International Conference on Computer Vision, Vol. 2. IEEE, 1482–1489. Kenneth Wai-Ting Leung, Dik Lun Lee, and Wang-Chien Lee. 2011. CLR: A collaborative location recommendation framework based on co-clustering. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 305–314. Jacek Lkeski. 2003. Towards a robust fuzzy clustering. Fuzzy Sets and Systems 137, 2 (2003), 215–233. J. K. L. MacDonald. 1933. Successive approximations by the Rayleigh-Ritz variation method. Physical Review 43, 10 (1933), 830–833. Paolo Massa and Paolo Avesani. 2007. Trust-aware recommender systems. In Proceedings of the 1st International Conference on Recommender Systems. ACM, 17–24. Stuart E. Middleton, David De Roure, and Nigel R. Shadbolt. 2004. Ontology-based recommender systems. In Handbook on Ontologies. Springer, Berlin, 477–498. Andriy Mnih and Ruslan Salakhutdinov. 2007. Probabilistic matrix factorization. In Advances in Neural Information Processing Systems. MIT Press, 1257–1264. Xia Ning and George Karypis. 2011. SLIM: Sparse linear methods for top-n recommender systems.
In 11th IEEE International Conference on Data Mining. IEEE, 497–506. Vito Claudio Ostuni, Tommaso Di Noia, Eugenio Di Sciascio, and Roberto Mirizzi. 2013. Top-n recommendations from implicit feedback leveraging linked open data. In Proceedings of the 7th ACM Conference on Recommender Systems. ACM, 85–92. Jing Peng, Daniel Dajun Zeng, Huimin Zhao, and Fei-yue Wang. 2010. Collaborative filtering in social tagging systems based on joint item-tag recommendations. In Proceedings of the 19th ACM Iinternational Conference on Information and Knowledge Management. ACM, 809–818. Badrul M. Sarwar, George Karypis, Joseph Konstan, and John Riedl. 2000. Application of dimensionality reduction in recommender system-a case study. In Proceedings of the ACM WebKDD Web Mining for E-Commerce Workshop. Badrul M. Sarwar, George Karypis, Joseph Konstan, and John Riedl. 2001. Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th International Conference on World Wide Web. ACM, 285–295. Badrul M. Sarwar, George Karypis, Joseph Konstan, and John Riedl. 2002. Recommender systems for large-scale e-commerce: Scalable neighborhood formation using clustering. In Proceedings of the 5th International Conference on Computer and Information Technology, Vol. 1. Badrul M. Sarwar, Joseph A. Konstan, Al Borchers, Jon Herlocker, Brad Miller, and John Riedl. 1998. Using filtering agents to improve prediction quality in the groupLens research collaborative filtering system. In Proceedings of the 12th ACM Conference on Computer Supported Cooperative Work. ACM, 345–354. Daniel D. Lee and H. Sebastian Seung. 2000. Algorithms for non-negative matrix factorization. In Advances in Neural Information Processing Systems. MIT Press, 556–562. Panagiotis Symeonidis, Alexandros Nanopoulos, and Yannis Manolopoulos. 2008. Tag recommendations based on tensor dimensionality reduction. In Proceedings of the 2008 ACM Conference on Recommender Systems. ACM, 43–50. Ulrike Von Luxburg. 2007. 
A tutorial on spectral clustering. Statistics and Computing 17, 4 (2007), 395–416. Jun Wang, Arjen P. De Vries, and Marcel J. T. Reinders. 2006. Unifying user-based and item-based collaborative filtering approaches by similarity fusion. In Proceedings of the 29th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 501–508. Bin Xu, Jiajun Bu, Chun Chen, and Deng Cai. 2012. An exploration of improving collaborative recommender systems via user-item subgroups. In Proceedings of the 21st International Conference on World Wide Web. ACM, 21–30. Gui-Rong Xue, Chenxi Lin, Qiang Yang, WenSi Xi, Hua-Jun Zeng, Yong Yu, and Zheng Chen. 2005. Scalable collaborative filtering using cluster-based smoothing. In Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 114–121.


Lijun Zhang, Chun Chen, Jiajun Bu, Zhengguang Chen, Deng Cai, and Jiawei Han. 2012. Locally discriminative coclustering. IEEE Transactions on Knowledge and Data Engineering 24, 6 (2012), 1025–1035. Xi Zhang, Jian Cheng, Ting Yuan, Biao Niu, and Hanqing Lu. 2013. TopRec: Domain-specific recommendation through community topic mining in social network. In Proceedings of the 22nd International Conference on World Wide Web. 1501–1510. Zi-Ke Zhang, Tao Zhou, and Yi-Cheng Zhang. 2010. Personalized recommendation via integrated diffusion on user–item–tag tripartite graphs. Physica A: Statistical Mechanics and its Applications 389, 1 (2010), 179–186. Received November 2013; revised July 2014; accepted September 2014
