Reinforcing Recommendation Using Implicit Negative Feedback

Danielle H. Lee and Peter Brusilovsky

School of Information Sciences, University of Pittsburgh
135 N. Bellefield Ave., Pittsburgh, PA 15260 USA
[email protected], [email protected]
Abstract. Recommender systems have explored a range of implicit feedback approaches to capture users' current interests and preferences without interrupting their work. However, implicit feedback rarely yields negative evidence, because users mainly target information they want. As a result, few studies have tested how effective negative implicit feedback is for personalizing information. In this paper, we assess whether implicit negative feedback can be used to improve recommendation quality.

Keywords: Negative preference, implicit feedback, recommendation.
1 Introduction
Modern recommender systems rely on user feedback to provide high-quality recommendations. User feedback communicates information about user interests, and it can be provided either explicitly through ratings or implicitly through various user activities such as browsing, reading, or bookmarking. While explicit feedback is sometimes considered more reliable, implicit feedback requires less intervention from users, captures short-term interests, and continuously updates user preferences. Moreover, modern approaches to implicit feedback analysis make the quality of recommendations based on implicit feedback comparable to that of recommendations based on explicit feedback.

One aspect, however, where implicit feedback still differs from explicit feedback is its predominantly positive focus. It has been argued that negative preferences are hard to acquire through the implicit channel (Gauch et al., 2007) because users mainly pursue information they consider interesting. Therefore, little work has been done to test implicit negative preferences as a means to personalize information.

This paper introduces a relatively simple mechanism to infer negative feedback and tests its feasibility as a way to represent what users do not want, in the context of a job recommender system. To assess the effectiveness of implicit negative feedback, we collected not only the negative feedback but also two kinds of positive feedback (users' saved jobs and search options). The collected information was compared with a list of recommended jobs evaluated by users, which served as the ground truth. The quality of feedback was measured in several settings: positive preference only, negative preference only, and compound preference.
2 Implicit Feedback as a Way to Indicate User's Preferences
Implicit feedback as a source of information for recommendation has been explored in a range of projects. Sugiyama et al. (2004) and Kim and Chan (2003) utilized browsing history and the key terms in each visited website as implicit feedback to retrieve personalized search results. Joachims et al. (2005) studied clickthrough as implicit feedback. Morita and Shinoda (1994) and Claypool et al. (2002) examined the time spent on information items and found a strong positive correlation between implicit feedback and time spent.

Implicit feedback techniques were overviewed by Kelly and Teevan (2003) and Gauch et al. (2007). Both papers showed that almost all personalized systems and approaches rely on positive implicit feedback. In Kelly and Teevan's study, only 4 of the 27 techniques could be used to represent negative feedback, for instance, deleting, skipping forward, editing existing content, or rating negatively. Moreover, none of the studies introduced in that paper utilized negative preferences as implicit feedback. The same is true for the classification by Gauch et al. (2007).

The limited discriminative power of positive-only implicit feedback is certainly a concern. Morita and Shinoda's study indicated that the quality of positive-only implicit feedback may decrease with an increased flow of positive judgments: when subjects read a series of related papers, reading time-based feedback became weaker at discriminating positive from negative preferences (Morita and Shinoda, 1994). An empirical study by Golbeck (2008) also suggested that if a recommender system includes even a single item that a user dislikes, the system becomes untrustworthy, even if it provides a set of favorable items as well.

Holland et al. (2003) examined personalization using both positive and negative feedback. Based on users' web logs, user preferences were expressed as comparisons, such as "A is better than B." This approach is effective for calculating categorical or numerical preferences. Nevertheless, the personalization required an adequate amount of log data, and detecting negative preferences rested on the assumption that users know all possible values, so that any values they did not select could be treated as negative. Chao et al. (2005) and Pampalk et al. (2005) actively collected negative implicit feedback such as skipping or blocking a song. In the former study, only the songs users did not want to hear were taken into account when recommending background music for a group of users. In the latter study, the researchers accounted for both positive and negative preferences, and users' skipping behavior decreased.
3 Job Recommendation Mechanism in Proactive
The study presented in this paper was performed in the recommender system Proactive (Lee & Brusilovsky, 2007), which helps its users find information technology jobs. Users can explore jobs available in the system through a comprehensive list of jobs (which can be sorted by several properties) or by searching with keywords and desired job properties (Figure 1). Either approach provides a summary of each job case. Once a job looks promising, users can click the job title to see detailed information in a separate window. If the job is really interesting, they can save (bookmark) the job into their "my jobs" list with one of three levels of ratings (relevant, good, very good). Issuing a query with specific parameters and saving a job case send different kinds of positive feedback to the system. If the job turns out not to be worth saving, users simply close the window. We considered this action implicit negative feedback and used the properties of ignored job cases to elicit negative preferences.

Fig. 1. Advanced Search

To generate recommendations, Proactive uses job properties, which are encoded using several taxonomies. Job category and company information are defined by the Yahoo! HotJobs taxonomy. The geographic information of jobs is gathered from Google Maps to compute neighboring areas. Educational level, experience, position type, and salary are defined as taxonomies as well. Every property in these taxonomies is indexed by a weight value representing its semantic position in the taxonomy, which makes properties comparable. When jobs are crawled, the properties of each job case are automatically classified and assigned the corresponding weight values. To recommend jobs, Proactive calculates the differences in weight values between newly added jobs and the user preference. User preferences are defined over all possible properties of job cases and accordingly have weight values. The smaller the weight value difference, the closer a new job is to the user preference (Lee & Brusilovsky, 2007). The equation to calculate the weight distance is as follows.
\[
\mathrm{WeightDistance} \;=\; \sum_{j=1}^{J} \sum_{i=1}^{I} \frac{\left| s_{i} - w_{ij} \right|}{r_{i}}
\qquad \text{eq. (1)}
\]
For instance, consider users' saved jobs as their explicit positive preference. Here, s_i is the weight value of a property of saved job i, and w_ij is the weight value of the corresponding property of candidate job j (a recently added job). I is the total number of a user's saved jobs, and J is the total number of candidate jobs. r_i is the rating of saved job i, which acts as an additional weighting factor. These calculations are executed for all job properties (job category, company industry, job location, educational level, experience level, salary, and position type). As the equation shows, if two job cases are semantically close, the weight distance between them is small. In addition, the higher the rating of a saved job, the smaller the overall distance. Comparison between search options and candidate jobs was also based on this equation.
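To make the computation concrete, the following is a minimal Python sketch of eq. (1). Everything in it is hypothetical: the paper does not describe Proactive at the code level, so the data layout (one taxonomy weight per property, a numeric rating per saved job) is our assumption.

# Hypothetical data layout: each job is a dict holding one taxonomy
# weight value per property, plus a numeric rating for saved jobs.
PROPERTIES = ["job_category", "company_industry", "location",
              "education", "experience", "salary", "position_type"]

def weight_distance(saved_jobs, candidate):
    """Eq. (1): sum of property-wise weight differences between all
    saved jobs and one candidate job, discounted by each saved job's
    rating so that higher-rated jobs pull the distance down."""
    total = 0.0
    for job in saved_jobs:
        for prop in PROPERTIES:
            total += abs(job["weights"][prop]
                         - candidate["weights"][prop]) / job["rating"]
    return total

Candidate jobs would then be ranked by ascending distance, e.g. sorted(candidates, key=lambda c: weight_distance(saved, c)).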
4 The Study
The goal of the reported study was to assess the value of implicit negative feedback. In the study, we employed users' saved jobs and search options as positive job preferences, and jobs that were opened but not saved as negative job preferences. A list of recommended jobs was manually evaluated by users, and this evaluation served as the ground truth. To judge which source of feedback represents user preferences best, we compared the weight distances between all properties in the ground truth and four kinds of user preference: one positive source, two positive feedback sources, negative feedback, and compound feedback.

The subjects in this study were 17 Information Science students (10 males, 7 females) at the University of Pittsburgh who expressed interest in looking for information technology-related jobs. The structure of a study session was simple: after a brief introduction to the system, each subject was asked to use the system to find jobs of interest. It was explained that they had to click a job title in either the comprehensive job list or a search results list to see job details, and that to receive good recommendations, they had to save only interesting jobs. When participants had explored the system thoroughly and saved several jobs, the system produced a list of recommended jobs by calculating job similarity based only on the saved jobs. Participants then rated the recommended jobs on a three-level scale (1 = bad, 3 = neutral, 5 = good). In total, 237 job cases were rated by the 17 subjects.
4.1 Positive Preferences
As a first result, explicit positive feedback (users' saved jobs) was tested to measure how well it represents user preferences. We grouped recommended jobs according to the three levels of user evaluation (good, neutral, bad). Then the distances between each group of recommended jobs and the users' saved jobs were compared. If explicit positive preferences work well, the distances should be significantly smaller for jobs rated as 'good' than for jobs rated as 'bad'. Unfortunately, there was no significant difference in weight distance between recommended jobs and saved jobs among the three levels of ratings, Kruskal-Wallis H = .595, df = 2, p = .537. The result suggests that one kind of positive feedback, even explicit, may not be sufficient to reliably distinguish good and bad jobs.

Table 1. Mean values of weight distance and mean difference tests

Profile                               Good    Neutral   Bad     Sig.
User's saved jobs                     47.70   60.67     47.55   .537
User's saved jobs + search options    40.01   42.79     57.66   .477
Uninteresting job-based profile       34.09   97.08     78.21   .003*
Compound profile                      31.44   94.04     59.40   .005*
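The group comparisons here and below use the Kruskal-Wallis test. A minimal sketch of that analysis in Python using SciPy; the distance values below are made up for illustration and are not the study data behind Table 1.

from scipy.stats import kruskal

# Per-job weight distances grouped by the user's rating of the
# recommended job (illustrative values only).
good = [30.1, 35.4, 28.9, 41.2, 33.0]
neutral = [88.0, 95.3, 102.7, 90.1]
bad = [70.5, 81.2, 76.9, 84.3]

H, p = kruskal(good, neutral, bad)  # df = number of groups - 1 = 2
print(f"Kruskal-Wallis H = {H:.3f}, p = {p:.3f}")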
To represent users' interests more clearly, we considered search options as an additional, implicit source of positive feedback. The advanced search interface of Proactive (shown in Figure 1) provides all properties of existing jobs as drop-down menus, and these properties come from the taxonomies. Accordingly, they have weight values that make them computable, as mentioned above. Once a user specified search options, it could be assumed that the user was interested in jobs with those properties. Both saved jobs and search options were compared with recommended jobs using equation (1), and the resulting distances were summed. While the ability of the recommender engine to discriminate good and bad jobs visibly improved, there was still no significant difference in weight distance among the levels of ratings, Kruskal-Wallis H = .743, df = 2, p = .477. Adding another source of positive feedback was not enough to distinguish good jobs reliably.
4.2 Negative Preferences
As explained above, when participants click a job title in a summarized list, they see the job details in a separate window. If they find the job interesting, they can rate and save it to generate recommendations. Otherwise, they just close the window without saving. It can be surmised that some of the properties of such a job did not match their interests and that the user does not want to receive recommendations related to it. Therefore, the properties of opened-but-not-saved jobs were elicited and added to a negative user profile. Since there is no way to determine which specific property made the job uninteresting to the user, we counted the job properties as a whole. Using equation (1), the distances between uninteresting jobs and recommended jobs were computed. Since uninteresting jobs had no explicit ratings, their ratings were simply set to 'bad.' The result shows significant differences in weight distance among the three levels of rating, Kruskal-Wallis H = 11.951, df = 2, p = .003. Although the weight distances of jobs evaluated as 'neutral' have the highest values (M = 97.08), even higher than those of 'bad' jobs (M = 78.21), the weight distances of 'good' jobs are much smaller (M = 34.09). Implicit negative preference thus makes it possible to reliably distinguish 'good' jobs.
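In code, this step can be sketched by reusing the hypothetical weight_distance function from Section 3. The numeric stand-in for the 'bad' rating is our assumption, since the paper only states that ignored jobs were marked as 'bad'.

NEGATIVE_RATING = 1  # assumed numeric stand-in for a 'bad' rating

def negative_distance(ignored_jobs, candidate):
    """Distance between a candidate job and the negative profile built
    from opened-but-not-saved jobs: eq. (1) with a fixed rating."""
    profile = [dict(job, rating=NEGATIVE_RATING) for job in ignored_jobs]
    return weight_distance(profile, candidate)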
4.3 Compound Preferences
As Teevan et al. (2005) pointed out, better recommendations require methods that capture multiple aspects of users' preferences. Hence, the three previous sources (saved jobs, search options, and uninteresting job cases) were merged to test whether a compound profile outperformed the individual methods, as sketched after this paragraph. There was a significant difference in weight distance among the three levels of ratings using the compound profile, Kruskal-Wallis H = 10.52, df = 2, p = .005. The results are similar to those for the profile using only uninteresting job cases: the weight distances of jobs evaluated as 'neutral' have the highest values (M = 94.04), the weight distances of 'bad' jobs are the second highest (M = 59.40), and the weight distances of 'good' jobs are the lowest (M = 31.44). Compared with the profile using only uninteresting job cases, the mean weight distance of 'good' jobs is smaller, although the mean weight distance of 'bad' jobs also decreased. We therefore conclude that modeling what users dislike through implicit negative preferences is an effective way to determine users' preferences.
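Under the same assumed data layout as the earlier sketches, the compound profile amounts to summing the three distance sources. Representing the selected search options as a single pseudo-job with a default rating is our guess at how equation (1) was applied to them.

def compound_distance(candidate, saved_jobs, search_profile, ignored_jobs):
    """Sum the distances from both positive sources and the negative
    profile. search_profile: one pseudo-job built from the taxonomy
    weight values of the user's selected search options; it must carry
    a rating for eq. (1), assumed here to default to 1."""
    return (weight_distance(saved_jobs, candidate)
            + weight_distance([search_profile], candidate)
            + negative_distance(ignored_jobs, candidate))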
5 Conclusion
This study examined how implicit negative feedback affects recommendation quality. Compared with using positive preferences alone, the distinction between good and bad jobs was significantly clearer when negative preferences were used. Search options, as an additional way to reinforce positive preferences, did not benefit personalization. The study shows that negative preferences, as inferred here, can increase recommendation quality. In particular, the implicit feedback used in this study can be applied to make recommendations on the fly, without first accumulating a large amount of information. Moreover, thanks to the semantic infrastructure of the recommender system, it is possible to build richer user profiles than in systems without such a structure.
References

1. Chao, D. L., Balthrop, J. & Forrest, S.: Adaptive Radio: Achieving Consensus Using Negative Preferences. In: Proc. of the 2005 International ACM SIGGROUP Conference on Supporting Group Work (Sanibel Island, Florida, USA) (2005)
2. Claypool, M., Le, P., Waseda, M. & Brown, D.: Implicit Interest Indicators. In: Proc. of the 6th International Conference on Intelligent User Interfaces, pp. 33-40 (2002)
3. Gauch, S., Speretta, M., Chandramouli, A. & Micarelli, A.: User Profiles for Personalized Information Access. In: Brusilovsky, P., Kobsa, A. & Nejdl, W. (eds.) The Adaptive Web, pp. 54-89. Springer, Berlin (2007)
4. Golbeck, J.: Trust and Nuanced Profile Similarity in Online Social Networks. ACM Transactions on the Web (to appear)
5. Holland, S., Ester, M. & Kießling, W.: Preference Mining: A Novel Approach on Mining User Preferences for Personalized Applications. In: Proc. of the 7th European Conference on Principles and Practice of Knowledge Discovery in Databases (Dubrovnik, Croatia) (2003)
6. Joachims, T., Granka, L., Pan, B., Hembrooke, H. & Gay, G.: Accurately Interpreting Clickthrough Data as Implicit Feedback. In: Proc. of the 28th Annual International ACM SIGIR Conference (Salvador, Brazil), pp. 154-161 (2005)
7. Kelly, D. & Teevan, J.: Implicit Feedback for Inferring User Preference: A Bibliography. ACM SIGIR Forum 37(2), 18-28 (2003)
8. Kim, H. & Chan, P.: Learning Implicit User Interest Hierarchy for Context in Personalization. In: Proc. of the International Conference on Intelligent User Interfaces (Miami, Florida, USA), pp. 101-108 (2003)
9. Lee, D. H. & Brusilovsky, P.: Fighting Information Overflow with Personalized Comprehensive Information Access: A Proactive Job Recommender. In: Proc. of the 3rd International Conference on Autonomic and Autonomous Systems (Athens, Greece), pp. 21-26 (2007)
10. Morita, M. & Shinoda, Y.: Information Filtering Based on User Behavior Analysis and Best Match Text Retrieval. In: Proc. of the 17th ACM SIGIR Conference (Dublin, Ireland), pp. 272-281 (1994)
11. Pampalk, E., Pohle, T. & Widmer, G.: Dynamic Playlist Generation Based on Skipping Behavior. In: Proc. of the 6th International Conference on Music Information Retrieval (London, UK) (2005)
12. Sugiyama, K., Hatano, K., Yoshikawa, M. & Uemura, S.: User-Oriented Adaptive Web Information Retrieval Based on Implicit Observations. In: Proc. of the 6th Asia-Pacific Web Conference on Advanced Web Technologies and Applications (Hangzhou, China), pp. 636-643 (2004)
13. Teevan, J., Dumais, S. & Horvitz, E.: Personalized Search via Automated Analysis of Interests and Activities. In: Proc. of the 28th ACM SIGIR Conference (Salvador, Brazil), pp. 449-456 (2005)