A Personalized Recommendation System on Scholarly Publications

Maria Soledad Pera and Yiu-Kai Ng
Computer Science Department, Brigham Young University, Provo, Utah 84602, U.S.A.
[email protected]
ABSTRACT
Researchers, as well as ordinary users who seek information in diverse academic fields, turn to the web to search for publications of interest. Even though scholarly publication recommenders have been developed to facilitate the task of discovering literature pertinent to their users, they (i) are not personalized enough to meet users' expectations, since they provide the same suggestions to users sharing similar profiles/preferences, (ii) generate recommendations pertaining to each user's general interests as opposed to the specific need of the user, and (iii) fail to take full advantage of valuable user-generated data at social websites that could enhance their performance. To address these problems, we propose PubRec, a recommender that suggests closely-related references to a particular publication P tailored to a specific user U, which minimizes the time and effort U spends browsing through general recommended publications. Empirical studies conducted using data extracted from CiteULike (i) verify the effectiveness of the recommendation and ranking strategies adopted by PubRec and (ii) show that PubRec significantly outperforms other baseline recommenders.
Categories and Subject Descriptors H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval—Information Filtering
General Terms Algorithms
1. INTRODUCTION
Researchers, as well as ordinary users who seek materials in diverse academic fields, can turn to customized search engines, such as Google Scholar or ACM Portal, to locate publications of interest. These tools, however, are designed to retrieve articles matching the information need specified by a user in a keyword query and are inadequate for performing
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. CIKM’11, October 24–28, 2011, Glasgow, Scotland, UK. Copyright 2011 ACM 978-1-4503-0717-8/11/10 ...$10.00.
personalized and contextual searches [11]. Scholarly publication recommenders [4] are context-dependent and thus infer users' interests to suggest articles that potentially match the preferences of a user. Although these recommenders have been thoroughly studied over the past decade, a shortcoming of their design methodology is the "one-size-fits-all" premise. Due to their assumption on group preference, which leads to suggesting the same publications to users sharing similar profiles regardless of the individual interests of the users, their recommendations are not personalized enough [7]. In addition, these recommenders provide suggestions pertaining to the users' general interests, which are unlikely to match the immediate need of an individual user and thus yield extraneous recommendations that the user is required to browse through. With the increasing popularity of social websites, such as Delicious(.com) and CiteULike(.org), which archive user-generated data (i.e., social connections and tags), a new era of research on advancing the design of recommender systems is emerging [10]. Even though recently-developed recommenders consider the rich resource of information available on social websites, to the best of our knowledge, none of them generate personalized recommendations on scholarly publications. For this reason, we introduce PubRec, a personalized recommender that addresses the shortcomings of traditional recommenders and takes advantage of users' data archived on CiteULike to enhance the quality of its recommendations on scholarly publications. The proposed recommender facilitates the process of discovering academic citations related to a particular publication so as to provide relevant references to its users, a common inquiry conducted frequently within the academic setting.
While existing recommenders suggest publications for a user U based on the (contents of) publications included in U's profile/personal library, PubRec recommends articles tailored to the specific information need of U captured in a particular publication P. Based on the premise that U values recommendations made by people with whom (s)he has an explicit connection, PubRec recommends articles among the publications bookmarked by U's connections on CiteULike that are relevant (in content) to P to a certain degree.
2. OUR PROPOSED RECOMMENDER
PubRec extracts data from CiteULike to generate personalized recommendations of scientific references, i.e., scholarly publications or simply publications, for CiteULike users. Given a CiteULike user Cusr and a publication P (that Cusr is interested in), PubRec first identifies Cusr's connections. Thereafter, using word-correlation factors, PubRec determines the set of publications, denoted CandidateP, among the ones included in the personal libraries of Cusr's connections that are similar in content to P to a certain degree. PubRec then computes a ranking score for each publication in CandidateP and recommends the top-10 publications with the highest ranking scores to Cusr. The process of PubRec is illustrated in Figure 1.

Figure 1: The architecture of PubRec

2.1 CiteULike
CiteULike, one of the leading social web systems developed for managing and sharing bibliographic references, includes 5,237,158 indexed articles (as of May 24, 2011) and allows its users to search, organize, and share publications. A CiteULike user's personal library contains the bibliographic references that the user has bookmarked. Besides bookmarking publications and maintaining their metadata, abstracts, and links to the publishers' websites, CiteULike users can add personal comments and tags to publications in their personal libraries [4]. Each publication P indexed in CiteULike is associated with a list of tags provided by the CiteULike users who have bookmarked P. PubRec uses this list to infer the tag cloud of P, i.e., a global visual representation of the tags assigned to P and their frequencies. CiteULike users can also assign a reading priority to a publication using a 5-star rating scale, in which one star (five stars, respectively) is associated with the label "I don't really want to read it" ("Top priority!", respectively), indicating that the user is not eager to read (very interested in reading, respectively) the corresponding publication. Serving as a social website, CiteULike offers its users an infrastructure to establish explicit communication channels with other CiteULike users. Explicitly-connected users, called connections in CiteULike, can exchange private messages and share bibliographic references of interest with one another. (For all the social features offered by CiteULike, see wiki.citeulike.org/index.php/Social_Features.)

2.2 Word Correlation Factors
PubRec relies on the pre-computed word-correlation factors in the word-correlation matrix [8] to determine the similarity between any two tags assigned to their respective publications, which capture and represent the publications' contents. Each correlation factor, which was calculated using a set of approximately 880,000 Wikipedia(.org) documents, indicates the degree of similarity of the two corresponding words1 based on their (i) frequency of co-occurrence and (ii) relative distances in each Wikipedia document.

1 Words in the Wikipedia documents were stemmed after all the stopwords, such as articles and prepositions, which do not play a significant role in representing the content of a document, were removed. From now on, unless stated otherwise, (key)words/tags refer to non-stop, stemmed (key)words/tags.

2.3 Selecting Candidate Publications
As the number of publications bookmarked by Cusr's connections can be large, it is inefficient to compare each of them with P, the publication of interest to Cusr, to identify the ones similar enough to P to be recommended, since these comparisons significantly prolong the processing time of PubRec. To minimize the number of comparisons, and thus reduce the processing time required to generate recommendations, PubRec applies a filtering strategy to the articles included in the personal libraries of Cusr's connections, producing a subset of articles, denoted CandidateP, to be considered for recommendation. Each publication in CandidateP contains at least one tag that exactly matches, or is highly similar to, one of the tags of P assigned by Cusr. Since publications in CandidateP and P share the same (or analogous) tags, PubRec expects them to be similar (to a degree) in content and to address the same or similar topics. To identify highly similar tags, PubRec employs a reduced version of the word-correlation matrix (introduced in Section 2.2) that retains the 13% most frequently occurring words (based on their frequencies of occurrence in the Wikipedia documents); for the remaining 87% of less frequently occurring words, only the exact-match correlation factor, i.e., 1.0, is used.

2.4 Ranking of Scholarly Publications
PubRec ranks each publication CP in CandidateP to prioritize them for recommendation using (i) the degree of similarity between P and CP, (ii) the number of connections who include CP in their personal libraries, and (iii) the adjusted rating (or simply rate) score given to CP by each connection of Cusr who has bookmarked CP.

2.4.1 Content Similarity of P and CP
To determine the degree of similarity between P and CP, PubRec sums the word-correlation factors between each pair of tags drawn from the tag clouds of P and CP in CiteULike, respectively. We consider the tags in the tag cloud of P (CP, respectively), since these tags provide a more comprehensive description of (the content of) P (CP, respectively) than the personal tags assigned to P (CP, respectively), which only reflect the personal opinion of, and vocabulary used by, Cusr (one of Cusr's connections, respectively) in describing (the content of) P (CP, respectively).

2.4.2 Popularity of Scholarly Publications
Besides computing the (content) similarity between P and CP, PubRec considers the popularity of CP, i.e., the number of Cusr's connections who include CP in their personal libraries. Publications that have attracted the attention of a number of Cusr's connections, i.e., that are frequently bookmarked in their personal libraries, may also be of interest to Cusr, since Cusr and his/her connections share common interests to a certain degree, and PubRec weights CP accordingly. While solely relying on the popularity of an item in performing the recommendation task (which does not apply to PubRec) can lead to less diverse, i.e., less personalized, recommendations, Adomavicius and Kwon [1] claim that the accuracy of recommendations can be enhanced by considering the popularity of an item during the recommendation process.
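To make the filtering step of Section 2.3 concrete, the sketch below selects candidate publications whose tags exactly match, or are highly correlated with, the tags Cusr assigned to P. The tiny `CORRELATION` dictionary and the 0.7 threshold are stand-ins of our own; PubRec uses the pre-computed word-correlation matrix of [8].

```python
# Illustrative stand-in for the word-correlation matrix of [8];
# values and the similarity threshold below are assumptions.
CORRELATION = {
    ("recommender", "recommendation"): 0.82,
    ("tagging", "tag"): 0.75,
}

def correlation(t1, t2):
    """Exact matches score 1.0; otherwise look up the (symmetric) matrix."""
    if t1 == t2:
        return 1.0
    return CORRELATION.get((t1, t2)) or CORRELATION.get((t2, t1)) or 0.0

def candidate_publications(p_tags, libraries, threshold=0.7):
    """Keep a bookmarked publication if at least one of its tags exactly
    matches, or is highly correlated with, a tag the user assigned to P."""
    candidates = set()
    for pub_id, pub_tags in libraries.items():
        if any(correlation(t, u) >= threshold for t in pub_tags for u in p_tags):
            candidates.add(pub_id)
    return candidates

libs = {
    "paper-a": {"recommendation", "social"},   # correlated with "recommender"
    "paper-b": {"databases"},                  # unrelated to P's tags
}
print(candidate_publications({"recommender", "tag"}, libs))  # {'paper-a'}
```

Only the surviving candidates are compared against P in full, which is what keeps the ranking stage tractable.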
2.4.3 Rate of Publications
PubRec considers the number of stars given by Cusr's connections to each publication in CandidateP and assigns higher weights to publications given high ratings by the connections, since users tend to favor highly-rated items over the ones assigned lower ratings [5]. A connection who includes in his/her library a significant number of articles on (or similar to) the topic T of P is interested in T, which can be interpreted as the connection being more knowledgeable on, or familiar with, T and thus more reliable in rating articles on T. Based on this premise, if the rating R of a candidate publication CP is assigned by a connection who is familiar with the topic of P, which is the same as or similar to the topic of CP, then R should be perceived as more trustworthy than a rating assigned to CP by a connection who is less interested in, or familiar with, the topic of P. PubRec adjusts the rating provided by each of Cusr's connections, denoted Ccon, on CP based on the reliability of Ccon in assigning ratings to publications on topics related to the one addressed in P.
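The paper does not spell out the adjustment formula, so the following is only one plausible reading: scale a connection's 1-5 star rating by a topical-reliability weight, taken here to be the fraction of the connection's bookmarks that share a tag with P's topic. Both `topic_reliability` and the multiplicative form are our assumptions, not PubRec's exact computation.

```python
def topic_reliability(connection_library_tags, p_topic_tags):
    """Hypothetical reliability weight: the fraction of a connection's
    bookmarked articles that share at least one tag with P's topic."""
    if not connection_library_tags:
        return 0.0
    on_topic = sum(1 for tags in connection_library_tags if tags & p_topic_tags)
    return on_topic / len(connection_library_tags)

def adjusted_rating(stars, connection_library_tags, p_topic_tags):
    """Scale the 1-5 star rating by the connection's topical reliability."""
    return stars * topic_reliability(connection_library_tags, p_topic_tags)

# A connection with 3 of 4 bookmarks on P's topic rates CP with 4 stars:
lib = [{"ir", "ranking"}, {"ir"}, {"ranking"}, {"biology"}]
print(adjusted_rating(4, lib, {"ir", "ranking"}))  # 3.0
```

Under this reading, a 5-star rating from a connection with no articles on P's topic contributes nothing, while a modest rating from a topic expert carries nearly its full weight.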
2.4.4 Rank Aggregation
Having determined (i) the degree of similarity between P and each CP in CandidateP, (ii) the popularity score of CP, and (iii) the adjusted rating score assigned to CP by each of Cusr's connections, PubRec computes the ranking score of CP by adopting CombMNZ [9], a popular linear combination strategy. CombMNZ combines multiple existing ranked lists of scores on an item into a joint ranking, a task known as rank aggregation or data fusion [9]. The rank aggregation strategy adopted by PubRec accounts for the fact that not every publication is assigned a non-zero score in each (input) ranked list of publications. By adopting CombMNZ, PubRec considers the strength of each piece of evidence, i.e., the scores in each of the ranked lists, as opposed to simply positioning higher in the ranking those publications with a non-zero score in all of the ranked lists, regardless of the values of the corresponding scores in each list. As a result, PubRec can rank a candidate publication CP that is closely related (in content) to P and highly popular among Cusr's connections, but has no rating score, above a publication that is less similar to P, less popular, and poorly rated.
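CombMNZ itself is standard [9]: sum an item's normalized scores across the input lists, then multiply by the number of lists that score it above zero. A minimal sketch follows; the max-normalization step and the toy score lists are our assumptions, not PubRec's exact inputs.

```python
def comb_mnz(score_lists):
    """CombMNZ [9]: sum each item's (max-normalized) scores across the
    input lists, then multiply by the number of lists in which the item
    has a non-zero score."""
    combined = {}
    for scores in score_lists:
        m = max(scores.values()) or 1.0   # guard against an all-zero list
        for item, s in scores.items():
            total, hits = combined.get(item, (0.0, 0))
            combined[item] = (total + s / m, hits + (1 if s > 0 else 0))
    return {item: total * hits for item, (total, hits) in combined.items()}

# Three evidence lists for two candidate publications:
similarity = {"cp1": 0.9, "cp2": 0.6}
popularity = {"cp1": 3.0, "cp2": 5.0}
rating     = {"cp1": 0.0, "cp2": 4.5}   # cp1 was never rated
print(comb_mnz([similarity, popularity, rating]))
```

Note how cp1's missing rating does not disqualify it; it merely earns one fewer "hit" in the multiplier, which is exactly the behavior the paragraph above describes.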
3. EXPERIMENTAL RESULTS
In this section, we introduce the dataset, evaluation protocol, and metrics (in Sections 3.1, 3.2, and 3.3, respectively) used for assessing the performance of PubRec. We then detail the empirical study conducted to evaluate the effectiveness of PubRec and compare its performance with existing baseline recommenders (in Section 3.4).
3.1 Dataset
We constructed a dataset with data extracted from CiteULike. We first identified users who had recently bookmarked articles on CiteULike and randomly selected a set of fifty of them2, denoted "active users." For each active user U, we extracted U's connections. Thereafter, we retrieved the personal tags and ratings of each article in U's personal library (and in the personal libraries of U's connections, respectively). The dataset also includes the set of tags (along with their frequencies of occurrence) in the (inferred) tag cloud of each scholarly publication bookmarked by either an active user or one of his/her connections. The resultant dataset includes 183 distinct users (fifty of whom are active users; the remaining ones are their connections), 103,723 distinct scholarly publications, and 35,034 distinct tags. Since, as previously stated, PubRec generates personalized recommendations for an active user on a particular publication, we evaluate PubRec based on the recommendations generated for each of the 21,867 user-publication pairs, called Test_Pairs, in the constructed dataset.

2 CiteULike users lacking explicit connections were excluded.
3.2 Evaluation Protocol
We measure the overall performance of PubRec, using the metrics introduced in Section 3.3, on the recommendations generated for each user(U)-publication(P) pair in Test_Pairs. As the ground truth, i.e., the (non-)relevance of a recommendation R generated by PubRec for a U-P pair, we rely on the publications bookmarked by U on CiteULike: R is relevant if it is included in U's personal library (excluding P) and non-relevant otherwise, a commonly-employed protocol for assessing recommendation systems [3, 4].
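The labeling rule above can be sketched in a few lines; the identifiers are hypothetical placeholders for CiteULike article IDs.

```python
def label_recommendations(recommended, user_library, p):
    """Ground-truth labeling of Section 3.2: a recommendation is relevant
    iff the user already bookmarked it, with the query publication P
    itself excluded from the ground truth."""
    ground_truth = user_library - {p}
    return [(r, r in ground_truth) for r in recommended]

labels = label_recommendations(["x", "p", "y"], {"p", "y", "z"}, "p")
print(labels)  # [('x', False), ('p', False), ('y', True)]
```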
3.3 Metrics
We treat PubRec as a content retrieval system that recommends to its users a list of ten publications relevant to a submitted query, a publication in our case, which is a commonly-adopted evaluation strategy [3, 4], and apply Precision@K (P@K), Mean Reciprocal Rank (MRR), and Normalized Discounted Cumulative Gain (NDCG) [6] to evaluate the recommendation accuracy of PubRec, as well as its ranking strategy.
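For reference, a common formulation of the three metrics under binary relevance (as in the protocol of Section 3.2) is sketched below; [6] gives the standard definitions. MRR is the mean of the per-query reciprocal ranks.

```python
import math

def precision_at_k(ranked, relevant, k):
    """P@K: fraction of the top-k recommendations that are relevant."""
    return sum(1 for r in ranked[:k] if r in relevant) / k

def reciprocal_rank(ranked, relevant):
    """1/position of the first relevant recommendation (0 if none);
    MRR averages this value over all test pairs."""
    for i, r in enumerate(ranked, start=1):
        if r in relevant:
            return 1.0 / i
    return 0.0

def ndcg(ranked, relevant, k=10):
    """Binary-relevance NDCG@k: DCG of the list over the ideal DCG."""
    gains = [1.0 if r in relevant else 0.0 for r in ranked[:k]]
    dcg = sum(g / math.log2(i + 1) for i, g in enumerate(gains, start=1))
    ideal = sorted(gains, reverse=True)
    idcg = sum(g / math.log2(i + 1) for i, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0

print(precision_at_k(["a", "b", "c"], {"b"}, 2))   # 0.5
print(reciprocal_rank(["a", "b", "c"], {"b"}))     # 0.5
```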
3.4 Performance Evaluation
To demonstrate and verify the effectiveness of our proposed recommender, we compare the performance of PubRec with two well-known, widely-adopted baseline recommender systems: SocialRecommender (SR) [3] and TagVectorSimilarity (TVS) [2]. While the former adopts a collaborative filtering strategy, the latter is a content-based recommender. Given that PubRec and SR draw their recommendations from the publications included in the personal libraries of active users' connections, we restricted the publications to be considered for recommendation by TVS to those bookmarked by active users' connections, and thus conduct a fair, i.e., comparable, assessment among the recommenders. Prior to comparing the aforementioned recommenders, we determined the relevance of each publication recommended by SR (TVS, respectively) for each user-publication pair in Test_Pairs according to the evaluation protocol detailed in Section 3.2 and the metrics introduced in Section 3.3. Note that since only publications existing in a user's library are considered relevant, it is not possible to account for potentially relevant publications that the user has not bookmarked. Thus, the computed precision scores are underestimated, a well-known limitation of the protocol introduced in Section 3.2. As this limitation affects PubRec, SR, and TVS alike, the precision values are consistent for comparative purposes [3].

Figure 2: The P@1, P@10, MRR, and NDCG scores of SR, TVS, and PubRec, respectively

As shown in Figure 2, the P@1 scores achieved by SR and TVS are 0.15 and 0.23, respectively, which are at least 36% lower than the P@1 score of PubRec. Also shown in Figure 2 are the P@10 scores of SR, TVS, and PubRec, which are 0.11, 0.18, and 0.45, respectively. The P@1 scores indicate that more than 1/2 (close to 1/7 and 1/4, respectively) of the time, the first publication recommended by PubRec (SR and TVS, respectively) is relevant. In addition, the P@10 values show that close to half of the scholarly publications recommended by PubRec are relevant, as opposed to approximately one-tenth (one-fifth, respectively) recommended by SR (TVS, respectively). Figure 2 also shows the MRR scores of SR, TVS, and PubRec. These scores reflect that, while on average PubRec users are required to browse through fewer than two (1/0.69 ≈ 1.44 < 2) recommended publications before locating one related to a scholarly publication they are interested in, users relying on SR and TVS are required to scan through at least four (1/0.24 ≈ 4.16) and three (1/0.33 ≈ 3.03) recommended publications, respectively, before locating one of interest. The NDCG scores calculated for the evaluated recommenders are also shown in Figure 2. The NDCG score of PubRec, which is 0.72, is over 40% higher than the NDCG scores computed for SR and TVS, which are 0.29 and 0.30, respectively. A significantly higher NDCG value indicates that PubRec is notably more effective than SR and TVS in ranking relevant scholarly publications higher in the list of recommendations.
4. CONCLUSIONS
We have introduced a personalized publication recommender, denoted PubRec, which is simple and requires neither supervision nor domain-specific information to generate recommendations. PubRec is based on the premise that a user U values recommendations made by people with whom U has an explicit connection, a design methodology that differs from existing recommenders, which suggest items inferred from users unknown to U. Recommendations made by PubRec are personalized, since PubRec considers the personal preferences of U, instead of providing the same recommendations to users who share the same or similar profile information/common interests. Furthermore, while existing user-centric recommenders provide recommendations pertaining to U's general interests, PubRec recommends articles relevant to a particular publication P given by U. The unique design methodologies employed by PubRec facilitate the process of identifying academic publications relevant to an article A of interest to an individual user who conducts a search of references for A, a task performed regularly in the academic setting. To assess the performance of PubRec and existing baseline recommenders, we have conducted an empirical study using data extracted from CiteULike. The study has demonstrated (i) the effectiveness of the recommendation and ranking strategies adopted by PubRec and (ii) the superiority of PubRec over the baseline recommenders. The results of the conducted experiments verify that by considering (i) collaborative annotations, i.e., tags and ratings, extracted from a social website, along with (ii) connections established among the users in a social environment, we can enhance the quality of recommendations on publications. While PubRec is currently applied to recommending scholarly publications, we intend to further enhance our proposed recommender so that it can also suggest multimedia items, e.g., songs or movies, provided that collaborative data describing items of interest and explicit connections among users can be extracted from a social networking environment.
5. REFERENCES
[1] G. Adomavicius and Y. Kwon. Improving Aggregate Recommendation Diversity Using Ranking-Based Techniques. IEEE TKDE, In Press, 2011.
[2] C. Basu, H. Hirsh, W. Cohen, and C. Nevill-Manning. Technical Paper Recommendation: A Study in Combining Multiple Information Sources. JAIR, 1:231–252, 2001.
[3] A. Bellogin, I. Cantador, and P. Castells. A Study of Heterogeneity in Recommendations for a Social Music Service. In ACM HetRec, pages 1–8, 2010.
[4] T. Bogers and A. van den Bosch. Recommending Scientific Articles Using CiteULike. In ACM RecSys, pages 287–290, 2008.
[5] P. Cremonesi, Y. Koren, and R. Turrin. Performance of Recommender Algorithms on Top-N Recommendation Tasks. In ACM RecSys, pages 39–46, 2010.
[6] W. Croft, D. Metzler, and T. Strohman. Search Engines: Information Retrieval in Practice. Addison Wesley, 2010.
[7] J. Jung, K. Kim, H. Lee, and S. Park. Are You Satisfied with Your Recommendation Service?: Discovering Social Networks for Personalized Mobile Services. In KES-AMSTA, pages 567–573, 2008.
[8] J. Koberstein and Y.-K. Ng. Using Word Clusters to Detect Similar Web Documents. In KSEM, pages 215–228, 2006.
[9] J. Lee. Analyses of Multiple Evidence Combination. In ACM SIGIR, pages 267–276, 1997.
[10] H. Ma, D. Zhou, C. Liu, M. Lyu, and I. King. Recommender Systems with Social Regularization. In ACM WSDM, pages 287–296, 2011.
[11] R. White, P. Bailey, and L. Chen. Predicting User Interests from Contextual Information. In ACM SIGIR, pages 363–370, 2009.