
A Hybrid Collaborative Filtering System for Contextual Recommendations in Social Networks

Jorge Gonzalo-Alonso1, Paloma de Juan1, Elena García-Hortelano1, and Carlos Á. Iglesias2

1 Departamento de Ingeniería de Sistemas Telemáticos, Universidad Politécnica de Madrid
{jgonzalo,paloko,elenagh}@dit.upm.es
2 Germinus XXI, Grupo Gesfor
[email protected]

Abstract. Recommender systems are based mainly on collaborative filtering algorithms, which only use the ratings given by the users to the products. When context is taken into account, there might be difficulties when it comes to making recommendations to users who are placed in a context other than the usual one, since their preferences will not correlate with the preferences of those in the new context. In this paper, a hybrid collaborative filtering model is proposed, which provides recommendations based on the context of the travelling users. A combination of a user-based collaborative filtering method and a semantic-based one has been used. Contextual recommendation may be applied in multiple social networks that are spreading world-wide. The resulting system has been tested over 11870.com, a good example of a social network where context is a primary concern.

1 Introduction

This article addresses contextual recommendation, which is a new research area in the field of recommender systems [1]. Our definition of context is based on the representational view proposed by Dourish [2]. According to this definition, the context is presented as a series of attributes representing the features of a user's situation. In our case, these attributes were modelled using an ontology. In particular, this work is devoted to the geographical contextualization of recommendations, although our system has been built so that it can be easily adapted to any other definition of context by just adding attributes to the ontology. For example, supposing that a user has only rated restaurants in her city and wants to find a restaurant in a city she is visiting, the purpose of our work is to suggest an item (the restaurant) based on its compatibility with her profile, but restricting the results to the new context (the city that she is going to visit).

A recommender system based on user-based or item-based collaborative filtering [3, 4] only uses the ratings given by the users to the products, making recommendations from the evaluation of the similarity between the profiles of different users, i.e. people who tend to rate the same items are assumed to have similar tastes. Therefore, when contextual components are added to the items, it becomes very difficult to find similar user profiles in different contexts. Using the previous example of the user who only rates restaurants in her city, if that user wanted to travel abroad, her profile in a user-based collaborative filtering system would be uncorrelated with the profiles of users in the city she intends to visit.

Recommendation based on geographical context may be applied to multiple social networks, such as the one chosen to test the results: 11870.com, a supervised social network where users store and review services they like and share them with the rest of the network community. Contextualization is vital to this community, since these networks provide users with recommendations based on geography, meaning that items rated by users are actual companies or services in their respective cities.

The rest of the article is organized as follows. In Sect. 2, all the profiles and concepts used throughout the paper are introduced. In Sect. 3, the core of the proposed solution is explained. In Sect. 4, the solution is analysed, providing results over a controlled and a real-world scenario. Finally, related work and conclusions are presented in Sects. 5 and 6.

This research project is partly funded by the Spanish Government under the R&D projects CONTENIDOS A LA CARTA (Plan AVANZA I+D TSI-020501-2008-114) and AdmiTI2 (Plan AVANZA I+D TSI-020100-2009-527).

J. Gama et al. (Eds.): DS 2009, LNAI 5808, pp. 393–400, 2009. © Springer-Verlag Berlin Heidelberg 2009

2 Context-Aware Recommendation Framework

In order to develop the proposed recommender system, the User Profile and the Decontextualized User Profile are introduced.

User Profile

– Let $U = \{u_0, u_1, u_2, \dots, u_{|U|}\}$ be the set of users within the application, where $|U|$ is the total number of users within the application.
– Let $P = \{p_0, p_1, p_2, \dots, p_{|P|}\}$ be the set of products, where $|P|$ is the total number of products of the application.
– Let $R$ be the set of ratings.
– Let $UP$ be the User Profile, a function $UP : U \times P \longrightarrow R$.

A matrix will be built in the plane $UP$, where the position $(i, j)$ holds the rating $r \in R$ given by the user $u_i \in U$ to the product $p_j \in P$.

Decontextualized User Profile. Another profile is created using a formal conceptualization of the domain in which the products are framed. Every product in $P$ will be classified so that a new profile can be built by laying a semantic layer over the preferences of every user; e.g. instead of knowing that a user likes a specific comic book shop, the aim is to know that she likes the category of Comic Book Shops. In this study, a taxonomy is used to classify the products and the profile is built over its categories.


– Let $C = \{c_0, c_1, c_2, \dots, c_{|C|}\}$ be the set of all the categories defined in the taxonomy, where $|C|$ is the total number of categories in the taxonomy.
– Let $S$ be the set of scores given to one category. This score will be obtained for one user using the ratings given to the products in $P$.
– Let $DUP$ be the Decontextualized User Profile, a function $DUP : U \times C \longrightarrow S$.

The decontextualization needed in the system is related to the semantic layer that will be laid upon the set of items over which the recommendation is made. The UP will be dependent on the context, whereas the DUP will be context free as long as the taxonomy used is correctly designed. A final contextualization is then needed to adapt the final recommendations to the current context of the user.
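The two profiles defined in this section can be illustrated with a short Python sketch. It is only a toy example under stated assumptions: the users, products, taxonomy and the `decay` parameter are hypothetical, and the way ratings are propagated towards the root is a deliberately simplified aggregation, not the score distribution scheme used in the paper.

```python
from collections import defaultdict

# Hypothetical toy UP, stored sparsely: ratings[user][product] = rating.
ratings = {
    "ana": {"comic_shop_madrid": 5.0, "tapas_bar_madrid": 3.0},
    "li": {"comic_shop_beijing": 4.0},
}

# Hypothetical taxonomy: each category maps to its parent (root maps to None).
parent = {
    "Comic Book Shops": "Shops",
    "Tapas Bars": "Restaurants",
    "Shops": "Services",
    "Restaurants": "Services",
    "Services": None,
}

# Hypothetical classification of every product into one leaf category.
category_of = {
    "comic_shop_madrid": "Comic Book Shops",
    "comic_shop_beijing": "Comic Book Shops",
    "tapas_bar_madrid": "Tapas Bars",
}


def ancestors(category):
    """Return the category followed by all its ancestors up to the root."""
    path = []
    while category is not None:
        path.append(category)
        category = parent[category]
    return path


def build_dup(user_ratings, decay=0.5):
    """Turn one user's product ratings into category scores.

    Simplifying assumption: a rating counts fully for the product's own
    category and is multiplied by `decay` at each step towards the root.
    """
    scores = defaultdict(float)
    for product, rating in user_ratings.items():
        weight = 1.0
        for category in ancestors(category_of[product]):
            scores[category] += rating * weight
            weight *= decay
    return dict(scores)


# DUP: one category-score vector per user.
dup = {user: build_dup(user_ratings) for user, user_ratings in ratings.items()}
```

In this toy data the two users share no product in the UP, yet both obtain a score for Comic Book Shops in the DUP, which is the property the decontextualized profile is meant to provide.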

3 Context-Aware Recommendation Process

In this section the core of the proposed solution is explained. The recommendation process is depicted in Fig. 1. The system comprises the following blocks:

Fig. 1. Overview of the proposed solution

Collaborative Filtering Recommender System (block 1). This system computes a recommendation list based on the UP. Collaborative filtering (CF) is the method chosen for recommendations in most web applications. The broadness and diversity of the products treated by any application make it very difficult to use typical content-based methods [5]. Recommendations are provided by studying the correlation between users' ratings. Therefore, the results are content agnostic and independent of the domain [3, 4]. A basic user-based CF algorithm with Pearson's correlation [6] is used to compute the recommendations in this branch of the system.

Semantic Recommender System (block 2). This system computes a second recommendation list using the DUP. The semantic structure used is a taxonomy which categorizes all the products in the system. The user's taste vector is now the rating for each category, based on the ratings of the items included in each category, i.e. the DUP. It consists of scores given to categories in the taxonomy C, using Ziegler's method to distribute the score [7]. Once the DUP is generated, similarities are computed using the cosine of the angle between the vectors [6]. Then, the users whose profiles are most closely correlated to the given user are taken into consideration to compute recommendations using the relevance formula in [7].

Recommendation List Builder (block 3). This module merges the recommendation lists provided by both recommender systems. This block is tuned depending on the accuracy given by each of the branches of the hybrid system. The accuracy depends not only on the actual system but also on the quality of the structures used, i.e. the taxonomy. Ziegler [7] tested the taxonomy-based CF algorithm using Amazon's taxonomy and showed that it performed better than the user-based CF one; in that case, the recommendations given by the taxonomy-based branch should be prioritized. However, other studies show the opposite behavior [8], and that is also the case for the dataset that we will be using, 11870.com. Our tests showed that recall [9] is between 2 and 3 times higher with the user-based CF, and that its precision [9] is three times that of the taxonomy-based CF algorithm. This is the case in which the hybrid system is really useful, because it allows us to take advantage of the diversity introduced by the user-based CF system and also to rely on the taxonomy-based CF algorithm to overcome the sparsity problems [10] [8] or the lack of correlation between the profiles of users in different contexts.

Context Handler (block 4). This module gathers all the contextual information concerning both the user and each of the recommended items and processes it according to a context ontology, using a reasoner that also processes the data contained in a knowledge base according to a set of rules that allow the generation of context-related entailments. Every time a user asks for a recommendation, instances representing both her profile and her context are generated, according to the corresponding descriptions available in the ontology. These instances are put into the knowledge base. Once a decontextualized recommendation has been generated, instances representing the context of each recommended item are also put into the knowledge base. The reasoner can then build a new list with all the recommended items that match the user's situation and satisfy the imposed rules.

Note that in the aforementioned example of the traveler, most of the products recommended by block 1 will be filtered out when contextualization is performed. In this case, block 2 will be the one providing the output to the user.
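The interplay between the list builder and the context handler can be pictured with the following Python sketch. It rests on strong assumptions: the ontology and reasoner are replaced by a plain equality check on context attributes, the merge simply places the user-based list first (the branch reported as more accurate on 11870.com), and all function names and items are hypothetical.

```python
def merge_lists(primary, secondary):
    """Concatenate two ranked lists, primary first, dropping duplicates."""
    merged, seen = [], set()
    for item in list(primary) + list(secondary):
        if item not in seen:
            seen.add(item)
            merged.append(item)
    return merged


def contextualize(items, item_context, user_context):
    """Keep the items whose context matches every attribute of the user's context.

    Assumption: attribute equality stands in for the rule-based reasoner.
    """
    return [
        item
        for item in items
        if all(
            item_context.get(item, {}).get(key) == value
            for key, value in user_context.items()
        )
    ]


def recommend(cf_items, taxonomy_items, item_context, user_context):
    """Merge both branches (block 3), then filter by context (block 4)."""
    merged = merge_lists(cf_items, taxonomy_items)
    return contextualize(merged, item_context, user_context)


# Hypothetical example: a user from Madrid who is visiting Beijing.
item_context = {
    "bar_madrid": {"city": "Madrid"},
    "museum_madrid": {"city": "Madrid"},
    "bar_beijing": {"city": "Beijing"},
    "tea_house_beijing": {"city": "Beijing"},
}
cf_list = ["bar_madrid", "museum_madrid"]
taxonomy_list = ["bar_beijing", "bar_madrid", "tea_house_beijing"]

result = recommend(cf_list, taxonomy_list, item_context, {"city": "Beijing"})
print(result)  # ['bar_beijing', 'tea_house_beijing']
```

In this example every item coming from the user-based branch is filtered out, so the output is supplied entirely by the taxonomy-based branch, as in the traveler scenario described above.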

4 Impact of the Contextual Hybrid System

Two different experiments have been carried out to evaluate the performance of the proposed hybrid recommender system. The first experiment was done using a restricted set of users, which is shown in Fig. 2. This experiment was performed to show that the hybrid system solves the contextual problems explained throughout the paper. After that, a second experiment was carried out using a real social network to test how the hybrid system would perform in a real-world scenario. In both experiments, only the geographical features of the context have been considered.

Instead of using precision and recall as metrics for our experiments, we count the number of possible recommendations that each branch can provide. Precision and recall are measured over a test set extracted from the products rated by the user [9], and that set would form the perfect recommendation list. But in the example that we are using, based on geographical contextualization, the aim is to show that we are able to provide recommendations anywhere, especially in a geographical context where the user has never rated a product. Therefore the recommendations we are looking for will never appear in any test set chosen.

4.1 Results over a Restricted Dataset

The dataset used for this experiment consists of 10 users in 3 different countries, rating 17 services that are classified according to a small taxonomy (left box of Fig. 2). As we can see in this example, 3 users in Beijing and 4 users in Madrid are uncorrelated because they do not have any products in common. This experiment aims to show that a hybrid system can take advantage of the basic CF and, when users are uncorrelated in context-based recommendations, is still able to provide an accurate recommendation that satisfies the user's criteria. We used a similarity threshold in order to limit the users who could be in the Top-M of most similar users, given that the DUP could make all users similar because they all have at least one category (the root one) in common. In this experiment, the similarity threshold was set very high, to 0.5. Even with this threshold, the main point was demonstrated.
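The role of the similarity threshold can be illustrated with a small Python sketch. The profiles below are hypothetical and are not the data of Fig. 2; Pearson's correlation is computed here over co-rated products only, which is one common variant, and the DUP vectors are given directly rather than derived from a taxonomy.

```python
import math


def pearson(ratings_a, ratings_b):
    """Pearson correlation over co-rated products; None when undefined."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return None  # no basis for correlation in the UP plane
    xs = [ratings_a[p] for p in common]
    ys = [ratings_b[p] for p in common]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


def cosine(vec_a, vec_b):
    """Cosine of the angle between two sparse category-score vectors."""
    dot = sum(vec_a[c] * vec_b[c] for c in set(vec_a) & set(vec_b))
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_m(target, profiles, m, threshold=0.5):
    """Return up to m users whose cosine similarity to target reaches the threshold."""
    scored = []
    for user, profile in profiles.items():
        if user == target:
            continue
        similarity = cosine(profiles[target], profile)
        if similarity >= threshold:
            scored.append((similarity, user))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [user for _, user in scored[:m]]


# Hypothetical UP rows: users in different cities share no product.
up = {
    "ana": {"bar_madrid": 5.0, "cafe_madrid": 2.0},
    "li": {"bar_beijing": 4.0, "tea_beijing": 1.0},
}

# Hypothetical DUP rows: every user has at least the root category "Services".
dup = {
    "ana": {"Bars": 5.0, "Cafes": 2.0, "Services": 3.5},
    "li": {"Bars": 4.0, "Tea Houses": 1.0, "Services": 2.5},
    "bo": {"Museums": 5.0, "Services": 0.5},
}

print(pearson(up["ana"], up["li"]))  # None
print(top_m("ana", dup, m=2))        # ['li']
```

Here "bo" overlaps with "ana" only through the root category, so its cosine similarity is positive but small and the 0.5 threshold keeps it out of the Top-M, while "li" is retained even though the two users are uncorrelated in the UP.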

Fig. 2. Left box: Restricted dataset, UP matrix, taxonomy and classification of the products. Right box: Recommendations obtained for both systems


The recommendations obtained for every user in the dataset are shown in the right box of Fig. 2. In the UP plane, the user Xiaomei is uncorrelated not only with the users in a different country but also with the users in the same geographical context. In this extreme situation, the basic CF algorithm is not able to provide recommendations, whereas the taxonomy branch is. We can also observe that the basic CF algorithm provides diversity in the first recommended items (see the user Li). On the other hand, the taxonomy-based algorithm is more focused on the semantically related products, especially at the top of the recommendation list.

4.2 Results over a Social Web: 11870.com

The social network 11870.com has three versions, depending on the language setting, but is situated mainly in Spain. Therefore, it is important to consider that 16.12% of the site's services are located outside Spain and 3.9% of users can be considered to originate from other countries because they do not have any reviews in Spain. These users outside Spain review an average of 2.61 products or services. This analysis shows that the scenario is not as international as would be desirable to fully exercise the hybrid system, but it is a good scenario for testing performance with authentic, location-dependent data.

Two different experiments compare the basic CF algorithm and the taxonomy-based one. The first studies users within Spain travelling to other cities in Spain where they do not have any service stored. The second does the same but uses data from cities outside Spain. For these experiments, the similarity threshold used in the taxonomy branch is 0.5. The charts in Fig. 3 show the number of possible recommendations given by each branch against the number of users taken into account when obtaining the most similar users to the one getting the recommendation (the Top-M).

Fig. 3. Number of recommendations shown (user-based vs. taxonomy-based CF)

The results show that the number of possible recommendations grows as the Top-M grows, with the taxonomy-based branch providing more recommendations (Fig. 3). This is especially remarkable in the case of cities within Spain. The reason is that users in the Top-M of the user-based CF system may have travelled to the city we are considering, but there may also be many similar users in the Top-M of the taxonomy-based CF system who were born in that city. When a user wants a recommendation while travelling abroad, it might be more interesting to provide recommendations using the taxonomy branch because they will fit better with her usual interests. Both experiments were performed using averages over the most populous cities, first in Spain, then worldwide. In a broader dataset these results are expected to be amplified, especially if the cities chosen are less important, which would make it more difficult to find correlations in the UP plane.
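The metric used in this section can be sketched in Python as follows. This is one possible reading of "number of possible recommendations", namely the distinct products located in the target city that were rated by the Top-M neighbours and not yet rated by the target user; the function names, users and products are hypothetical, and the neighbour rankings are given by hand instead of being computed by either branch.

```python
def possible_recommendations(target, neighbours, ratings, city_of, target_city):
    """Count distinct unseen products in target_city rated by the neighbours."""
    already_rated = set(ratings.get(target, {}))
    candidates = set()
    for neighbour in neighbours:
        for product in ratings.get(neighbour, {}):
            if city_of.get(product) == target_city and product not in already_rated:
                candidates.add(product)
    return len(candidates)


def counts_by_top_m(target, ranked_neighbours, ratings, city_of, target_city, max_m):
    """Return the count for M = 1..max_m, using the M best-ranked neighbours."""
    return [
        possible_recommendations(
            target, ranked_neighbours[:m], ratings, city_of, target_city
        )
        for m in range(1, max_m + 1)
    ]


# Hypothetical data: "ana" lives in Madrid and travels to Barcelona.
ratings = {
    "ana": {"bar_madrid": 5.0},
    "n1": {"bar_madrid": 4.0, "cafe_madrid": 3.0},
    "n2": {"bar_bcn": 5.0, "cafe_bcn": 4.0},
    "n3": {"bar_bcn": 4.0, "museum_bcn": 5.0},
}
city_of = {
    "bar_madrid": "Madrid",
    "cafe_madrid": "Madrid",
    "bar_bcn": "Barcelona",
    "cafe_bcn": "Barcelona",
    "museum_bcn": "Barcelona",
}

# Hand-written neighbour rankings for each branch (assumed, not computed).
user_based_neighbours = ["n1"]
taxonomy_neighbours = ["n2", "n3", "n1"]

print(counts_by_top_m("ana", user_based_neighbours, ratings, city_of, "Barcelona", 3))
# [0, 0, 0]
print(counts_by_top_m("ana", taxonomy_neighbours, ratings, city_of, "Barcelona", 3))
# [2, 3, 3]
```

In this toy case the only user-based neighbour has rated products in Madrid alone, so that branch offers nothing in Barcelona, whereas the taxonomy-based neighbours include residents of the target city and the count grows with M.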

5 Related Work

Hybrid approaches using content-based and CF recommendations have been used to solve the sparsity and cold-start problems. Examples of these hybrid systems are those presented in [10], [8] and [11], but none of these have been used to contextualize the recommended products. In contextualizing recommendations, interesting work is being done assuming that the preferences of one user may vary over time [1] or that the context is not a fixed set of attributes [12], in line with Dourish's interactional view [2]. In [1] and [13], a reduction-based approach is presented. Both works propose that the contextualization should be carried out on the dataset before any recommendation is generated, in order to achieve a dimensional reduction of the original space of products. Finally, [14] and [15] propose a different solution to contextualization, based on the use of user feedback to produce content-based recommendations from her interaction with the system.

6 Conclusions

Recommender systems have proven to be a key element of new web applications. But the internationalization of any social network brings with it new problems for these systems. A very serious problem appears when the items treated are context dependent. In this paper, and as a result of the work done over the social network 11870.com, a novel approach to contextual recommendations is proposed. In this case, a hybrid two-branch system is introduced. The first branch utilizes a user-based CF algorithm; the second uses this same algorithm but does so over a semantic structure, e.g. a taxonomy. Finally, a semantic context handler has been designed and implemented, in order to perform context-aware recommendations. The concepts of decontextualization and contextualization have been discussed in this paper. The techniques explained above have been proposed as a novel solution for the social network 11870.com to tackle the problem of non-correlation among the tastes of users from disparate geographical areas.


References

1. Adomavicius, G., Sankaranarayanan, R., Sen, S., Tuzhilin, A.: Incorporating contextual information in recommender systems using a multidimensional approach. ACM Trans. Inf. Syst. 23(1), 103–145 (2005)
2. Dourish, P.: What we talk about when we talk about context. Personal Ubiquitous Comput. 8(1), 19–30 (2004)
3. Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., Riedl, J.: GroupLens: An Open Architecture for Collaborative Filtering of Netnews. In: Proc. of ACM 1994 Conference on Computer Supported Cooperative Work, pp. 175–186. Chapel Hill, North Carolina (1994)
4. Breese, J.S., Heckerman, D., Kadie, C.: Empirical analysis of predictive algorithms for collaborative filtering. In: Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, pp. 43–52 (1998)
5. Adomavicius, G., Tuzhilin, A.: Towards the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering 17(6), 734–749 (2005)
6. Herlocker, J.L., Konstan, J.A., Borchers, A., Riedl, J.: An algorithmic framework for performing collaborative filtering. In: SIGIR 1999: Proc. of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, pp. 230–237. ACM Press, New York (1999)
7. Ziegler, C.-N., Lausen, G., Schmidt-Thieme, L.: Taxonomy-driven computation of product recommendations. In: Proc. of the 2004 ACM CIKM Conference on Information and Knowledge Management, pp. 406–415. ACM Press, Washington (2004)
8. Weng, L.-T., Xu, Y., Li, Y., Nayak, R.: Exploiting item taxonomy for solving cold-start problem in recommendation making. In: ICTAI 2008: Proceedings of the 2008 20th IEEE International Conference on Tools with Artificial Intelligence, pp. 113–120. IEEE Computer Society, Washington (2008)
9. Herlocker, J.L., Konstan, J.A., Terveen, L.G., Riedl, J.T.: Evaluating collaborative filtering recommender systems. ACM Trans. Inf. Syst. 22(1), 5–53 (2004)
10. Weng, L.-T., Xu, Y., Li, Y., Nayak, R.: Improving recommendation novelty based on topic taxonomy. In: WI-IATW 2007: Proc. of the 2007 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Workshops, pp. 115–118. IEEE Computer Society, Washington (2007)
11. Cho, Y.H., Kim, J.K.: Application of web usage mining and product taxonomy to collaborative recommendations in e-commerce. Expert Systems with Applications 26(2), 233–246 (2004)
12. Anand, S., Mobasher, B.: Contextual recommendation, pp. 142–160 (2007)
13. Woerndl, W., Schlichter, J.: Introducing context into recommender systems. In: Proc. AAAI 2007 Workshop on Recommender Systems in e-Commerce, Vancouver, Canada (2007)
14. Yap, G.-E., Tan, A.-H., Pang, H.-H.: Dynamically-optimized context in recommender systems. In: MDM 2005: Proceedings of the 6th international conference on Mobile data management, pp. 265–272. ACM, New York (2005)
15. Kim, S., Kwon, J.: Effective context-aware recommendation on the semantic web. International Journal of Computer Science and Network Security 7(8), 154–159 (2007)
