
Social Computing

Improving Social Recommender Systems

Ofer Arazy, University of Alberta
Nanda Kumar, City University of New York
Bracha Shapira, Deutsche Telekom Laboratories at Ben-Gurion University

Recommender systems play a significant role in reducing information overload for people visiting online sites. The accuracy of recommender systems could be improved by using data from online social networks and electronic communication tools.

Recommender systems are a key component of successful online stores such as Amazon.com, Epinions.com, and Netflix as they help users sort through a site and find relevant information (we discuss the approach behind each of these examples in the “Commercial Social Recommender Systems” sidebar). Since the emergence of social (or collaborative) filtering techniques in the mid-1990s, the industry has adopted a wide variety of collaborative filtering (CF) designs to generate recommendations. Typically, CF works by identifying recommendation sources with preferences similar to the user, identifying items that these sources like (but which the user hasn’t purchased yet), predicting the relevance of these items (based on ratings and the source’s similarity to the user), and recommending the most relevant items.
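The four CF steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Pearson correlation is one common similarity measure (the article does not prescribe one), and the `ratings` data, the function names, and the parameters `n_sources` and `top` are hypothetical.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "ann": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 5, "b": 2, "c": 4, "d": 5, "e": 2},
    "cat": {"a": 1, "b": 5, "c": 2, "d": 1},
    "dan": {"a": 4, "b": 3, "c": 5, "e": 4},
}

def pearson(a, b):
    """Pearson correlation over co-rated items; 0.0 when it is undefined."""
    common = set(a) & set(b)
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * sqrt(
        sum((b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0

def recommend(ratings, user, n_sources=2, top=3):
    """Recommend items the user has not rated, using the most similar sources."""
    mine = ratings[user]
    # Step 1: find sources whose preferences correlate positively with the user's.
    sims = {k: pearson(mine, r) for k, r in ratings.items() if k != user}
    ranked = sorted(sims, key=sims.get, reverse=True)[:n_sources]
    sources = [k for k in ranked if sims[k] > 0]
    # Step 2: collect items these sources rated that the user has not.
    candidates = {i for k in sources for i in ratings[k]} - set(mine)
    # Step 3: predict relevance as a similarity-weighted average of source ratings.
    scores = {}
    for item in candidates:
        raters = [k for k in sources if item in ratings[k]]
        scores[item] = (sum(sims[k] * ratings[k][item] for k in raters)
                        / sum(sims[k] for k in raters))
    # Step 4: recommend the highest-scoring items.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top]

print(recommend(ratings, "ann"))  # ['d', 'e']
```

In this toy data, "bob" and "dan" are the two sources most similar to "ann", so their ratings drive the ranking of the items she has not yet rated.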


IT Pro May/June 2009

Published by the IEEE Computer Society

1520-9202/09/$25.00 © 2009 IEEE

computer.org/ITPro

Today, online communities, with their strong ties and built-in relationships, present an opportunity for enhancing the design of social recommender systems and increasing system prediction accuracy. We can use the various relationships captured in these communities (phrased as “trust” on Epinions and “reputation” on eBay) in new ways, by incorporating better indicators of relationship information. The potential impact of these social recommender systems is not restricted to the public domain: the recent advent of Enterprise 2.0 (the application of Web 2.0 approaches in enterprises) is expected to bring social recommendation techniques to corporate settings.

In this article, we present a framework for social recommender systems that is intended to enhance recommendation accuracy. We model our approach after Arazy and Woo,1 who proposed that the design of systems should be grounded in theoretical foundations. In the context of recommender systems, we believe that designers should consider behavioral theories of persuasion and advice taking when they design social recommender systems. Although the design of existing collaborative filtering (CF) systems assumes that similarities in preferences (as captured in users’ consumption profiles) determine recommendation quality, behavioral theory suggests that other characteristics, such as the source’s trustworthiness and reputation, determine the recipient’s perception of the recommendation.

Commercial Social Recommender Systems

Since the introduction of collaborative filtering (CF) algorithms in the mid-1990s, social-based recommendation techniques have played a significant role in shaping consumer Web-based recommendation applications. The first large-scale implementation of CF is attributed to Amazon.com, which launched its book recommendation application in 1995. It later extended recommendations to additional products, such as music CDs and consumer products. Amazon has been a leader in adopting social approaches to recommendations, and it provided user reviews for its products at an early stage. Recently, Amazon upgraded its review system to incorporate user ratings of reviews and a reputation system that establishes reviewer credibility.

Netflix, a Web-based movie rental service, relies heavily on its CF system to recommend movies to users. The company has been extremely effective at matching users with movies and using these recommendations to push items on the long-tail portion of its inventory. In addition to CF, Netflix lets users define a social network of friends, allowing them to view each other’s preferences. However, this social network data isn’t incorporated into Netflix’s CF algorithm yet.

Epinions.com is a successful product recommendation site launched in 1999 to let users rate products; its CF algorithm then uses these ratings to make product recommendations. Additionally, users can associate themselves with others whose opinions they trust. Epinions then forms a “web of trust,” propagating this trust information across a network and incorporating it into its CF algorithm. Thus, Epinions is a pioneer in developing a social recommender system that incorporates two types of social relations: shared preferences and trust.

Should I Take Your Advice?

Online relationships are useful for a variety of purposes, including social (such as those in MySpace), job searching (LinkedIn), information access (Slashdot.org), and commerce (eBay). Although these online ties weren’t established for the purpose of advice taking, recommender systems could use them to link a user with relevant sources. Using previous research in marketing, applied psychology, and organization, we identified four salient constructs that impact a recipient’s advice-taking decision: homophily, tie strength, trust, and social capital. We argue that these constructs are relevant for the design of recommender systems.

Homophily refers to the similarity between source and recipient, and marketing research has investigated it for word-of-mouth recommendations. Homophily, particularly similarity in knowledge and preferences, is a key determinant of whether a recipient would accept a source’s advice,2 specifically in domains such as movie and book recommendations. From a system design perspective, we could estimate similarity in preferences between various users by recording their consumption patterns and comparing these patterns. Early recommender systems adopted this CF approach,3 which quickly became the industry standard (an example is Amazon’s recommender system). This approach works well for large user communities where sufficient information is available about each user. Recent research in collaborative filtering provides enhancements along various dimensions, such as automatically eliciting accurate user feedback, employing algorithms to measure users’ similarities, and improving prediction methods.4 The main advantage of this approach is that it requires little effort from users: they might need to rate the items they’ve consumed, but they aren’t required to explicitly define their relationships to other users. Its limitation is that in cases where little information is available about users and items (referred to as a cold start), prediction accuracy suffers.

Behavioral researchers have studied tie strength (the intensity of the relationship between the recipient and source) and identified it as a key determinant in a recipient’s likelihood to take advice.5 Tie strength has several facets, including the relationship’s duration, interaction frequency, and feeling of closeness. Empirical findings suggest that frequency and closeness can impact a recipient’s advice-taking decision.6 In the design of recommender systems, we can easily calculate the frequency of users’ electronic communications (such as email or text messaging) by installing a tracking utility on their computers or electronic devices (with their permission). Assuming that consumption data is available for these users, communication frequency data can link users to sources, thus potentially improving prediction accuracy. Although this approach would require little effort from users, it could pose a risk to user privacy. To the best of our knowledge, no commercial application has implemented this approach yet.

A recipient’s trust in a recommendation source is yet another important indicator of his or her likelihood of accepting a recommendation.5,7 The construct of trust includes both cognitive and affective dimensions, and both dimensions play an important role in advice taking.5 Researchers have investigated the impact of trust primarily in the context of organizational advice networks. Online social networks provide ample evidence of trust relationships. If we can harvest this relational information and incorporate it into a recommender system, we could obtain a more accurate representation of recipient-source relationships. Alternatively, instead of harvesting data from online communities, the system might ask users to explicitly define the extent to which they trust other users. The first CF system, introduced in 1992,8 employed this approach and required users to define explicit trust relations. At that time, the explicit trust approach failed to gain acceptance. However, this approach is now gaining momentum, and recent studies9 demonstrate its potential.

The trust-based approach’s main limitation, besides requiring users to spend time explicitly defining their online relationships, is that users often have only a few links, resulting in insufficient data for improving recommendation quality. We could potentially alleviate this limitation by propagating trust across relationships: for example, if user A trusts B, and B trusts C, then we could assume that A trusts C (at least to some extent). Researchers have explored various trust propagation algorithms,10 and existing trust-based recommender systems often employ some variation of trust propagation. Again, the big drawback of this approach is the potential risk to users’ privacy.

Finally, a source’s social capital (that is, the source’s reputation or opinion leadership) has been shown to affect the recipient’s decision-making process. A person’s social capital represents his or her ability to influence others, and stems from that person’s structural positioning in the social network. Sociology and management research have investigated this construct.2 In designing online recommender systems, we can use two approaches to calculate social capital (or reputation). The first is based on a system that records user ratings on others’ recommendations and accumulates these ratings to calculate recommenders’ reputation scores.10 Online commerce sites (such as eBay) were among the first to adopt this reputation system approach, and it has now been adopted by many other sites (such as Amazon.com). The alternative approach for estimating a user’s reputation is based on the structural analysis of online social networks. This technique, referred to as social network analysis (SNA), assigns various centrality measures to each user, based on his or her position in the network. We can apply SNA in a variety of situations, including management consulting, analyzing the Web structure, and evaluating citations. To date, no commercial recommender system has capitalized on these possibilities to incorporate social capital information.

Table 1. Key social recommendations research studies.

| Social dimensions | Implementation approach | Task domain |
|---|---|---|
| Trust (ref. 8) | Established a social trust network. | - |
| Shared preferences (ref. 3) | Original collaborative filtering (CF) work. | - |
| Shared preferences, trust (local trust), and reputation (global trust) (ref. 10) | In addition to standard CF, used trust from Epinions’s web-of-trust data; propagation of trust; reputation based on a user’s average trust scores. | Product recommendations |
| Shared preferences and trust (ref. 11) | In addition to standard CF, used trust (which is extracted automatically based on the accuracy of the user’s past predictions); used MovieLens data. | Movie recommendation |
| Shared preferences and trust (ref. 9) | In addition to standard CF, established a social trust network; propagation of trust. | Movie recommendation |
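The trust propagation idea discussed above (if A trusts B and B trusts C, then A trusts C to some extent) can be illustrated with a small sketch. The article does not specify a propagation algorithm, so everything here is an assumption made for illustration: trust values lie in [0, 1], trust along a path is the product of its edge values, each hop beyond the first is discounted by a hypothetical `decay` factor, and the strongest path wins. The function and parameter names are also hypothetical.

```python
def propagate_trust(direct, source, decay=0.5, max_hops=3):
    """Infer trust from `source` to users reachable within `max_hops` links.

    direct: dict mapping user -> {trusted user: trust value in [0, 1]}.
    Returns a dict of user -> strongest inferred trust value.
    """
    best = {}
    frontier = {source: 1.0}
    for hop in range(max_hops):
        next_frontier = {}
        for user, trust_so_far in frontier.items():
            for friend, value in direct.get(user, {}).items():
                if friend == source:
                    continue  # never infer trust in oneself
                # Direct links keep their stated value; indirect links are discounted.
                inferred = trust_so_far * value * (decay if hop > 0 else 1.0)
                if inferred > best.get(friend, 0.0):
                    best[friend] = inferred
                    next_frontier[friend] = inferred
        frontier = next_frontier
    return best

# Hypothetical web of trust: A trusts B, B trusts C, C trusts D.
web = {"A": {"B": 0.8}, "B": {"C": 0.5}, "C": {"D": 1.0}}
print(propagate_trust(web, "A"))
```

Limiting the number of hops and discounting each one reflects the "at least to some extent" qualifier: a user with only a few direct links still gains additional sources, but distant ones count for less.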

Figure 1. Conceptual recommender system design based on our proposed framework. Rectangles represent input (red) or output (blue) information, trimmed rectangles (orange) represent system processes, and the green rectangle is the final output.

[Figure 1 labels. Inputs: receiver’s and source’s consumption history, social network, ratings of recommendation, online communications. Processes: calculate profile similarity, trust propagation, social network analysis (SNA), reputation mechanisms, calculate interaction frequency, system’s source qualification component, system’s prediction component. Outputs: shared preferences, trust, source’s reputation, tie strength, source’s qualifications. Final output: system’s prediction (recommendation).]

Table 1 summarizes some relevant research projects that explore the use of social approaches to design recommender systems. Research on social recommendation systems is in its early phases, and most current attempts to incorporate relationship information into recommender systems employ only a subset of the available indicators. Furthermore, it seems that the design choices in these works are somewhat ad hoc and are often not informed by current knowledge and theories of human behavior.

Our Proposed Framework

We propose a social recommendation framework that borrows from advice-taking theory by integrating the aforementioned relationship indicators between users and recommendation sources (homophily, tie strength, trust, and reputation). A social recommender system based on this framework would employ various mechanisms for capturing relationship information:

• track user consumption patterns, construct user profiles, and compare profiles (to detect homophily, as in CF systems);
• establish social networks and propagate links to form indirect links (to establish users’ trust in each other);
• record user communication patterns and interaction frequency (as evidence for tie strength); and
• establish reputation mechanisms based on either ratings of recommendations or on the analysis of the social network’s structure.

Figure 1 presents a conceptual design of a recommender system based on our proposed framework. As Figure 1 shows, once the system records the various relationship indicators, the system source qualification component calculates a weighted average of the indicators to arrive at a single qualification score for each source. We expect that the task domain (that is, leisure versus work-related tasks) will affect the relative importance (weights) of the various source qualification indicators. For example, based on results from behavioral studies, we expect that for movie recommendation tasks, users will deem shared preferences as more important than interaction frequency.

Next, the system prediction component takes sources’ qualifications and their history of ratings as input to predict an item’s relevancy to the recipient and produces a recommendation. We present an algorithm for a possible implementation of this framework in the sidebar “Algorithm for Implementing Our Framework.”
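The two components just described can be sketched in Python. This is a hedged illustration, not a reference implementation: it assumes all indicator scores are normalized to [0, 1], it divides by the weight total so the result is a true weighted average, and it assumes the prediction step adds a qualification-weighted average of each source's deviation from its own mean rating to the recipient's mean rating. The weight values, names, and data are hypothetical.

```python
def qualification(indicators, weights):
    """Weighted average of one source's relationship indicators."""
    total = sum(weights.values())
    return sum(w * indicators.get(name, 0.0) for name, w in weights.items()) / total

def predict(user_mean, sources, n=2):
    """Predict an item's rating from the n most qualified sources.

    sources: list of (qualification, source's rating of the item,
    source's mean rating) tuples, one per source that rated the item.
    """
    top = sorted(sources, key=lambda s: s[0], reverse=True)[:n]
    norm = sum(q for q, _, _ in top)
    if norm == 0:
        return user_mean  # no qualified source: fall back to the user's own mean
    return user_mean + sum(q * (r - mean) for q, r, mean in top) / norm

# Hypothetical weights for a movie domain, where shared preferences dominate.
weights = {"homophily": 0.5, "trust": 0.2, "tie_strength": 0.1, "reputation": 0.2}
q1 = qualification(
    {"homophily": 0.9, "trust": 0.5, "tie_strength": 0.2, "reputation": 0.6}, weights)
q2 = qualification(
    {"homophily": 0.4, "trust": 0.8, "tie_strength": 0.9, "reputation": 0.5}, weights)
print(q1, q2)
print(predict(3.0, [(q1, 5, 4.0), (q2, 2, 3.0)]))
```

Changing the task domain would only mean swapping the `weights` dictionary, which mirrors the expectation that domain affects the relative importance of the indicators rather than the structure of the computation.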

Using Social Relationship Data to Alleviate the Cold-Start Problem

Because research on social recommender systems is still in its infancy, both industry and academia have experiments under way on how to incorporate various indicators of social relationships into recommender systems. We grounded our proposed framework on behavioral theory, utilizing a series of relationship indicators that we can extract in online settings. We expect this framework to provide accuracy enhancements beyond traditional CF, especially in cold-start situations. The cold-start problem is critical in commercial recommender systems4,12 because in the early phases of CF system deployment, relatively little information on user tastes is available, making it difficult to provide accurate recommendations.13 For example, two of the most popular commercial CF applications, GroupLens and Epinions, suffer from the cold-start problem.10,13 Advice-taking literature suggests that relationship indicators such as trust and tie strength are highly correlated with homophily. It makes sense, then, that data extracted from a social network could serve as a proxy for preference similarities in cold-start situations and ensure that the system associates a recipient with appropriate sources.

Algorithm for Implementing Our Framework

We calculate the qualification of source k for user u, $Q_{u,k}$, as a weighted average of various indicators. A simple formula is

$$Q_{u,k} = W_H \times H_{u,k} + W_T \times T_{u,k} + W_{TS} \times TS_{u,k} + W_R \times R_{u,k},$$

where $H_{u,k}$ is the homophily (shared preferences) score for users u and k, $T_{u,k}$ is the trust score, $TS_{u,k}$ is the tie strength (interaction frequency) score, and $R_{u,k}$ is the reputation score. W represents the relative weight assigned to each indicator: $W_H$ for homophily, $W_T$ for trust, $W_{TS}$ for tie strength, and $W_R$ for reputation. Alternative formulas, such as harmonic mean, are also possible.

The system prediction component’s output is a prediction of item relevancy to users. The system computes it as an aggregation of the recommendations of the n most qualified sources, where the effect of each of the n sources on the final recommendation is relative to their qualification. The recommendation function of an item i to a user u could be based on various algorithms, the gold standard1 in CF systems being

$$P_{u,i} = \bar{r}_u + \frac{\sum_{k=1}^{n} Q_{u,k} \times (r_{k,i} - \bar{r}_k)}{\sum_{k=1}^{n} Q_{u,k}},$$

where $P_{u,i}$ is the prediction score of item i to user u, $\bar{r}_u$ is the average of all past ratings provided by user u, $r_{k,i}$ is the rating assigned to item i by user k, and $\bar{r}_k$ is the average of all past ratings provided by user k.

Reference
1. J. Herlocker et al., “Evaluating Collaborative Filtering Recommender Systems,” ACM Trans. Information Systems, vol. 22, no. 1, 2004, pp. 5–53.

Effort and Privacy

Our proposed framework is somewhat generic in the sense that it includes all available relationship indicators. However, any implementation of this framework is likely to use a subset of indicators. We can choose which indicators to use based on the domain in which we deploy the system and based on efficiency considerations. The impact of the various indicators on system efficiency is independent of task domain. Efficiency depends on three key factors:

• effort required by users,
• effort required by system administrators, and
• privacy concerns.

Table 2 summarizes these considerations for the various relationship indicators.

Table 2. Effort and privacy considerations for extracting relationship indicators.

| Evidence | User effort | System administration effort | Privacy concerns |
|---|---|---|---|
| Shared preferences | Low (if based on purchase history) or medium (when ratings of items are required) | Low (existing CF available) | Low (only rating of items) |
| Communication frequency | Low (automatic) | Low (monitoring electronic communication) | Medium (social relations) |
| Social network: direct relations | High (establishing a social network) | Low (social network) | Medium (social relations) |
| Social network: indirect relations | High (establishing a social network) | Medium (social network and trust propagation) | Medium (social relations) |
| Social network: social network analysis (SNA) | High (establishing a social network) | Medium (social network and SNA calculations) | Medium (social relations) |
| Reputation system | Medium (rating of others’ recommendations) | High (reputation mechanism and fraud control) | Low (rating of others’ recommendations) |

The effort required from users plays a large role in determining system adoption. To keep user effort down to a minimum, the system can calculate shared preferences based on users’ consumption records. It can also capture and calculate communication frequency automatically. Establishing a social network (whether to calculate trust or indicate reputation) requires users to invite and accept invitations from other users, whereas a reputation system requires them to rate the quality of the recommendations they received.

The effort required from system administrators, too, might play a part in decisions about which relationship indicators to use. Calculating shared preferences requires recording user profiles and matching them. Although calculating direct trust relationships from a given social network is straightforward, propagating trust to indirect relationships requires additional calculations. We can calculate reputation scores from a social network using SNA, but implementing a reputation mechanism requires setting up technical and social controls to combat fraud and assure normative user behavior.

Privacy is a major issue for both users and system administrators. Users are reluctant to provide personal details for fear of misuse, and system administrators are concerned about the legal issues associated with protecting user privacy. Calculating shared preferences requires tracking consumption data, and data about consumed item ratings (whenever collected). Tracking communication frequency, as well as collecting social network data, might pose a larger threat to privacy because users might consider their social relations with others to be confidential information. However, the information that reputation systems use (ratings of recommendations and reputation scores) is often considered public knowledge.

The analysis we mention here highlights the advantages of the shared-preferences approach in light of user effort and privacy concerns. Nevertheless, the use of additional indicators for social relationships has potential benefits. First, incorporating additional information sources will tackle the cold-start problem and increase prediction reliability. Second, even in cases where shared preference scores are reliable, we need to incorporate additional indicators of social relationships because behavioral theory suggests that shared preferences are just one of several factors that determine a recipient’s likelihood of accepting advice. Moreover, extracting relationship indicators might not require much effort from users, especially if we can harvest this information from existing online social networks.
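One way additional indicators might help in a cold start is sketched below. The article does not describe such a mechanism, so this is purely an assumption for illustration: when an indicator cannot be computed (for example, homophily for a new user with no consumption history), it is marked `None`, and the remaining weights are renormalized so the social indicators stand in for it. All names and values are hypothetical.

```python
def cold_start_qualification(indicators, weights):
    """Weighted average over only the indicators that are available.

    indicators: dict of name -> score in [0, 1], or None when the indicator
    cannot be computed for this recipient-source pair.
    """
    available = {name: score for name, score in indicators.items()
                 if score is not None and name in weights}
    total = sum(weights[name] for name in available)
    if total == 0:
        return 0.0  # nothing is known about this source
    return sum(weights[name] * score for name, score in available.items()) / total

weights = {"homophily": 0.5, "trust": 0.2, "tie_strength": 0.1, "reputation": 0.2}
# A new user: no consumption profile yet, but social data is available.
new_user_view = {"homophily": None, "trust": 0.8, "tie_strength": 0.6, "reputation": 0.5}
print(cold_start_qualification(new_user_view, weights))
```

The design choice here is to ignore a missing indicator rather than treat it as zero, so a source is not penalized merely because the recipient has no rating history yet.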

For more than a decade now, the ad hoc standard in recommender systems has been based on users’ shared preferences. Recent advances in academia and industry suggest that we can employ alternative sources of relationship information to enhance recommender system performance. By considering these different approaches and grounding our analysis in behavioral theory, we proposed a conceptual design for a social recommender system that has the potential to alleviate the cold-start problem and improve recommendation accuracy. We hope that others will investigate similar approaches to employ social relationship information in the design of recommender systems. Notwithstanding the potential benefits, our approach has some limitations associated with administration costs, usability, and user privacy. In implementing social recommender systems and choosing which types of relationship indicators to employ, system designers should consider the risks associated with each indicator.

References

1. O. Arazy and C. Woo, “Enhancing Information Retrieval through Statistical Natural Language Processing: A Study of Collocation Indexing,” Management Information Systems Quarterly, vol. 31, no. 3, 2007, pp. 525–546.
2. M. Gilly, J. Graham, M. Wolfinbarger, and L. Yale, “A Dyadic Study of Interpersonal Information Search,” J. Academy of Marketing Science, vol. 26, no. 2, 1998, pp. 83–100.
3. U. Shardanand and P. Maes, “Social Information Filtering: Algorithms for Automating Word of Mouth,” Proc. Conf. Human Factors in Computing Systems, ACM Press, 1995, pp. 210–217.
4. J. Herlocker et al., “Evaluating Collaborative Filtering Recommender Systems,” ACM Trans. Information Systems, vol. 22, no. 1, 2004, pp. 5–53.
5. D. Levin and R. Cross, “The Strength of Weak Ties You Can Trust: The Mediating Role of Trust in Effective Knowledge Transfer,” Management Science, vol. 50, no. 11, 2004, pp. 1477–1490.
6. P. Marsden and K. Campbell, “Measuring Tie Strength,” Social Forces, vol. 63, no. 2, 1984, pp. 482–501.
7. D. Smith, S. Menon, and K. Sivakumar, “Online Peer and Editorial Recommendations, Trust, and Choice in Virtual Markets,” J. Interactive Marketing, vol. 19, no. 3, 2005, pp. 15–37.
8. D. Goldberg et al., “Using Collaborative Filtering to Weave an Information Tapestry,” Comm. ACM, vol. 35, no. 12, 1992, pp. 61–70.
9. J. Golbeck and J. Hendler, “Filmtrust: Movie Recommendations Using Trust in Web-Based Social Networks,” Proc. Consumer Comm. and Networking Conf., IEEE CS Press, 2006, pp. 282–286.
10. P. Massa and P. Avesani, “Trust-Aware Collaborative Filtering for Recommender Systems,” Lecture Notes in Computer Science, vol. 3290, 2004, pp. 492–508.
11. J. O’Donovan and B. Smyth, “Trust in Recommender Systems,” Proc. 10th Int’l Conf. Intelligent User Interfaces, ACM Press, 2005, pp. 167–174.
12. K. Goldberg et al., “Eigentaste: A Constant Time Collaborative Filtering Algorithm,” Information Retrieval, vol. 4, no. 2, 2001, pp. 133–151.
13. D.A. Maltz and K. Ehrlich, “Pointing the Way: Active Collaborative Filtering,” Proc. Computer–Human Interaction, ACM Press, 1995, pp. 202–209.

Ofer Arazy is an assistant professor in the School of Business at the University of Alberta. His research interests are in knowledge management and social computing. Arazy has a PhD in management information systems from the University of British Columbia. Contact him at ofer. [email protected].

Nanda Kumar is an associate professor in the Computer Information Systems Department at Baruch College, City University of New York. His research interests include human–computer interaction, digital government, and the impact of IT on the organization of work and leisure. Kumar has a PhD in management information systems from the University of British Columbia. Contact him at nanda. [email protected].

Bracha Shapira is a project manager in the Deutsche Telekom Laboratories at Ben-Gurion University, where she leads a project that deals with personalized content on mobile devices. She’s also a senior lecturer in the Department of Information Systems Engineering at Ben-Gurion, where she leads the Information Retrieval Laboratory. Shapira’s research interests include information retrieval and filtering, especially for user modeling, profiling, and personalization. She has a PhD in information systems from Ben-Gurion University. Contact her at [email protected].