Contextual Information based Recommender System using Singular Value Decomposition
Rahul Gupta, Arpit Jain, Satakshi Rana and Sanjay Singh
Department of Information & Communication Technology, Manipal Institute of Technology, Manipal University, Manipal-576104, India
[email protected]
Abstract—The web contains a very large collection of data, and this is where the need for a recommender system arises: a recommender system helps the user reach a decision quickly. In a conventional recommendation system, only the reviewers' ratings are taken into consideration. However, contextual information pertaining to each user should also be incorporated, making the recommendations personalized. Since some features can enhance the performance of a recommendation system while irrelevant features can degrade it, feature selection is an essential aspect of a context aware recommendation system. In this paper we devise a novel approach that first selects the relevant contextual variables, based on the contextual information of the reviewers and their ratings for a class of entities, using a naive Bayes classifier. Once the relevant contextual variables are extracted, Singular Value Decomposition (SVD) is applied to extract the most significant features corresponding to each entity. The recommendation system uses this information to analyze the contextual information of a user and to recommend entities that are of interest to him. The proposed method also determines the best contextual variables and feature space for each entity, which makes the context aware recommendation system more efficient and personalized. Moreover, with the proposed method an overall increase in F-score of 30% was obtained, thereby improving the reliability of the recommender system.
Keywords—Context Aware Recommender System, Contextual Information, Singular Value Decomposition, Naive Bayes Classifier
I. INTRODUCTION

A recommendation system aims to make the user's search more efficient and less time consuming by suggesting the most appropriate and relevant content. It is used extensively in information retrieval, e-commerce, social network analysis, and big data analytics to improve the quality of search results. Another aspect of a recommendation system is that it allows users to provide ratings and comments for entities; the feedback given by reviewers is used to recommend highly rated items to prospective users. Recommendations are typically made using collaborative filtering, content-based filtering, or a combination of the two [1][2][3]. Collaborative filtering approaches suggest items by analyzing the activities of the user, such as browsing history and previous rating patterns, as well as the rating patterns of other users. Content-based filtering approaches, on the other hand, recommend items whose properties and characteristics are similar to those of items the user has searched for previously.
The conventional recommendation system suffers from several major drawbacks that degrade its performance. The recommended results are not personalized, and because the data is huge, personal interests play a crucial role in selecting the appropriate content or items. In recommending an item, the system considers only the ratings; it does not consider other information specific to that item or the characteristics of the users who rated it. For example, in a movie recommendation system, a top-rated movie in an unfamiliar language should not be recommended to a person who does not know that language. The movie may be top rated, but it is irrelevant to that user, and the conventional recommendation system fails to handle this case. The conventional system also requires sufficient information about the user's past activity (browsing history, ratings, likes), and in the absence of such information it fails to provide satisfactory recommendations. Therefore, a recommendation system should utilize and analyze the contextual information of the reviewers who rate entities, and then examine the contextual information of the current user when making recommendations. The likes and dislikes of a person are greatly influenced by factors such as gender, age group, occupation, and marital status; these factors are treated as contextual variables. A context aware recommender system incorporates the contextual information of a user when recommending items of interest to him [4], which makes the system more personalized and efficient. The main challenge in a context aware recommender system is to select the relevant contextual variables from the large amount of contextual information available. There are numerous features associated with an entity, and only the pertinent ones should be taken into consideration; irrelevant features may act as noise and degrade the recommendation accuracy. For example, "Travel With" (with whom the user is traveling) plays a significant role in deciding which tourist destination a user is interested in, whereas in other circumstances this feature may be totally irrelevant. Therefore, it is essential to determine the relevance of a contextual variable with respect to a specific goal by examining the association between the contextual variable and the rating.
The tendency of the linear relationship between two random variables can be estimated from the sign of their covariance: if the variables exhibit similar behavior the covariance is positive, and if they exhibit opposite behavior the covariance is negative. Since contextual variables are not always numeric but often categorical, the variability of a contextual variable cannot be computed simply using the covariance,

η(v) = Cov(v, u)    (1)

To counter the problem of categorical data, Kader and Perry [5] developed the coefficient of unalikeability, which measures the variability among categorical variables by focusing on "how often the observations differ" rather than "by how much". Other methods for estimating the variability of contextual variables include computing the variability as the entropy of the observations or as a sample variance [6]. Golbandi et al. [7] proposed the use of decision trees to relate different contextual variables.
In this paper, we gather the previous ratings and the contextual information of each reviewer for a set of entities belonging to the same class. Based on the analysis of this data with a naive Bayes classifier [8], we determine the set of contextual variables that play a significant role in predicting the rating of a class. Singular Value Decomposition (SVD) is then applied to the selected relevant contextual variables and the ratings of each entity to extract, for each entity, the features associated with a high rating [9][10]. This information is integrated into the recommendation system, which analyzes the contextual information of a user and recommends entities that he is likely to rate highly.
The rest of the paper is organized as follows. Section II describes the proposed methodology, Section III discusses the results obtained, and Section IV concludes the paper.

II. PROPOSED METHOD

The contextual information of a user plays a significant role in predicting his likes and dislikes. We utilize the contextual information of the reviewers and their ratings to obtain the relevant contextual variables, which are categorical in nature. The categories of these relevant contextual variables are then examined using SVD to determine the features that are associated with high ratings of an entity. This information is integrated into the recommendation system: when a user is searching for an entity, the system analyzes the contextual information of the user and, based on this information and the class of entities he is looking for, makes recommendations that are likely to satisfy him.
In this paper, we consider a tourist destination recommender system as the test scenario. In this system, a set of reviewers have rated particular tourist destinations after their visits, and the contextual information of each reviewer is incorporated to make the recommendations more personalized for prospective users. Fifteen tourist destinations have been considered.
All tourist destinations with similar characteristics are put into one class (pilgrimage, beach, amusement park). The contextual information about a reviewer is categorical in nature (age, gender, continent, etc.), and only a few contextual variables and their features are selected to obtain better results.

A. Terminologies used

The terminology used in the paper is summarized here. A tourist destination is called an entity, and a set of entities of the same type is called a class. Three classes are considered, each having five entities, as shown in Table I.

TABLE I
CLASSES AND ENTITIES

Class          | Entities
Pilgrimage     | Bahai Garden [11], Mecca Black Stone [12], Ganga River, Golden Temple, Sanctuary of Our Lady of Lourdes [13]
Amusement Park | Europa Park [14], Hopi Hari [15], Magic Kingdom [16], Disneyland Paris [17], Disneyland Tokyo [18]
Beach          | Bora Bora [19], Porto da Barra [20], Baga Beach [21], Seychelles [22], Hawaii [23]
Each entity is rated by twenty reviewers. The contextual information of each reviewer considered in our approach is summarized in Table II.

TABLE II
CONTEXTUAL VARIABLES AND THEIR FEATURES

Contextual Variable | Features                                                                   | # Features
Continent           | Asia, Africa, Antarctica, Australia, Europe, North America, South America | 7
Age                 | 18-24, 25-34, 35-49, 50-64, 65+                                            | 5
Gender              | Male, Female                                                               | 2
Travel With         | Alone, Friends, Family, Both                                               | 4
Motive              | Fun, Work, Work and Fun                                                    | 3
Frequent Traveler   | Yes, No                                                                    | 2
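To make the subsequent steps concrete, the following minimal sketch shows one possible in-memory representation of the review data described in Tables I and II. The record layout and field names (continent, travel_with, rating, and so on) are illustrative assumptions, not part of the original system.

```python
# Hypothetical representation of the review data from Tables I and II.
# Each record pairs a reviewer's contextual information with the rating
# ("excellent" or "average") given to one entity of a class.

CONTEXTUAL_VARIABLES = {
    "continent": ["Asia", "Africa", "Antarctica", "Australia",
                  "Europe", "North America", "South America"],
    "age": ["18-24", "25-34", "35-49", "50-64", "65+"],
    "gender": ["Male", "Female"],
    "travel_with": ["Alone", "Friends", "Family", "Both"],
    "motive": ["Fun", "Work", "Work and Fun"],
    "frequent_traveler": ["Yes", "No"],
}

# Two example reviews of the entity "Hawaii" in the class "Beach"
# (values invented for illustration only).
reviews = [
    {"entity": "Hawaii", "class": "Beach", "rating": "excellent",
     "continent": "North America", "age": "25-34", "gender": "Female",
     "travel_with": "Both", "motive": "Fun", "frequent_traveler": "Yes"},
    {"entity": "Hawaii", "class": "Beach", "rating": "average",
     "continent": "Europe", "age": "50-64", "gender": "Male",
     "travel_with": "Alone", "motive": "Work", "frequent_traveler": "No"},
]
```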
Selecting the most relevant contextual variables from the data set is crucial for pruning redundant and irrelevant information, thereby improving the performance of a classifier. Wrapper and filter methods are two popular approaches to feature selection [24], [25]. In our approach the wrapper method is used: a subset evaluator creates candidate subsets of the contextual variables and a classifier estimates their merit. A naive Bayes classifier is used to determine the subset of contextual variables that gives the best result, evaluated in terms of classification accuracy, so as to minimize the error rate. To search the space of subsets, the evaluator uses the best-first search technique.
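As a rough illustration of the wrapper idea, the sketch below (building on the hypothetical record format shown earlier) enumerates all subsets of the six contextual variables exhaustively, which is feasible here since six variables give only 63 non-empty subsets, and scores each subset with the cross-validated accuracy of scikit-learn's CategoricalNB. The exhaustive enumeration in place of best-first search and the particular library calls are our assumptions, not the authors' implementation.

```python
# Sketch of wrapper-style feature-subset selection scored by naive Bayes.
from itertools import combinations

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import CategoricalNB
from sklearn.preprocessing import OrdinalEncoder

VARIABLES = ["continent", "age", "gender", "travel_with", "motive",
             "frequent_traveler"]

def best_subset(reviews):
    """Return the subset of contextual variables with the best CV accuracy."""
    y = np.array([r["rating"] for r in reviews])
    best = (None, -1.0)
    for k in range(1, len(VARIABLES) + 1):
        for subset in combinations(VARIABLES, k):
            # Encode the categorical columns of this subset as integers.
            X = [[r[v] for v in subset] for r in reviews]
            X = OrdinalEncoder().fit_transform(X).astype(int)
            # min_categories avoids errors when a CV test fold contains a
            # category value that is absent from the training fold.
            clf = CategoricalNB(min_categories=7)
            score = cross_val_score(clf, X, y, cv=5).mean()
            if score > best[1]:
                best = (subset, score)
    return best  # e.g. (("continent", "travel_with"), <score>) for "Beach"
```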
B. Extracting Relevant Contextual Variables

In our methodology, the naive Bayes classifier is trained using a supervised learning technique [26]. For training, correctly identified observations are used to estimate the contextual variables that are necessary for predicting the ratings of a class. Each training example consists of a vector of the contextual information of a reviewer and the rating given by that reviewer. The naive Bayes classifier assumes that the features contribute independently to the probability of the class variable; in other words, given the class variable, the presence or absence of one feature has no influence on the presence or absence of any other feature. Under this assumption, the naive Bayes probability model is expressed as [8]

p(C | F_1, ..., F_n) = (1/Z) p(C) ∏_{i=1}^{n} p(F_i | C)    (2)

where C is the class, F_1, ..., F_n are the feature variables, Z is a scaling factor, and n is the number of features. This probability model is combined with a decision rule; the corresponding naive Bayes classifier is the function

classify(f_1, ..., f_n) = argmax_c  p(C = c) ∏_{i=1}^{n} p(F_i = f_i | C = c)    (3)
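A minimal, self-contained sketch of the classifier defined by Eqs. (2) and (3) is given below. It estimates p(C) and p(F_i | C) from counts with Laplace smoothing (the smoothing constant is our own addition) and returns the class that maximizes the product in Eq. (3).

```python
# Minimal categorical naive Bayes corresponding to Eqs. (2) and (3).
from collections import Counter, defaultdict

def train_nb(X, y, alpha=1.0):
    """X: list of feature tuples, y: list of class labels."""
    classes = Counter(y)                                  # counts for p(C)
    # counts[i][c][v] = number of training rows of class c with F_i = v
    counts = [defaultdict(Counter) for _ in range(len(X[0]))]
    values = [set() for _ in range(len(X[0]))]
    for row, c in zip(X, y):
        for i, v in enumerate(row):
            counts[i][c][v] += 1
            values[i].add(v)
    return classes, counts, values, alpha

def classify(model, row):
    classes, counts, values, alpha = model
    n = sum(classes.values())
    best_c, best_score = None, 0.0
    for c, nc in classes.items():
        score = nc / n                                    # p(C = c)
        for i, v in enumerate(row):                       # product of p(F_i = v | C = c)
            score *= (counts[i][c][v] + alpha) / (nc + alpha * len(values[i]))
        if score > best_score:
            best_c, best_score = c, score
    return best_c                                         # argmax of Eq. (3)

# Toy example: contextual vectors (Continent, Travel With) -> rating.
X = [("North America", "Both"), ("Europe", "Alone"), ("North America", "Family")]
y = ["excellent", "average", "excellent"]
model = train_nb(X, y)
print(classify(model, ("North America", "Both")))         # -> "excellent"
```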
For the purpose of demonstration, we show the selection of relevant contextual variables for the class "Beach"; the same procedure was followed for the other two classes, Amusement Park and Pilgrimage, and the results are shown in Table III. The statistics of the training data used for the class "Beach" are shown in Fig. 1-Fig. 6, where criss-cross bars represent reviewers who rated a place as excellent and horizontally hatched bars represent reviewers who rated it as average.
[Fig. 1-Fig. 6: distribution of the reviewers of the class "Beach" by continent, age, gender, travel companion (Travel With), travel motive, and frequent-traveler status.]
From the data set of the class "Beach", the subset of contextual variables that gives the minimum error rate, as determined by the naive Bayes classifier, consists of "Continent" and "Travel With", which reduces the contextual information significantly. The performance of the system in predicting the ratings for Beach using only these relevant contextual variables is summarized in the contingency matrix M, in which the rows correspond to the instances in the data set that were rated excellent or average, while the columns correspond to the instances that the system predicted to be excellent or average:

M:
            Excellent  Average
Excellent      74         1
Average        10        15

From the matrix M it can be inferred that, of the 100 training instances given to the system, 75 were excellent and 25 were average; 74 of the 75 excellent instances and 15 of the 25 average instances were identified correctly. In the test case of the class Beach, the overall F-score was 0.68 when no contextual variables were removed; after the naive Bayes classifier discarded the irrelevant contextual variables and only the relevant ones were retained, it increased to 0.89, as shown in Table III, which tabulates the F-scores before and after feature reduction. The corresponding precision (P), recall (R) and F-score (F) of the system are given in Table IV [27].
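As a quick check, the entries of Table IV follow directly from the contingency matrix M; the short sketch below recomputes them.

```python
# Recompute per-class precision, recall and F-score from matrix M.
M = {"excellent": {"excellent": 74, "average": 1},   # actual -> predicted counts
     "average":   {"excellent": 10, "average": 15}}

for positive in ("excellent", "average"):
    tp = M[positive][positive]
    fp = sum(M[c][positive] for c in M if c != positive)
    fn = sum(M[positive][c] for c in M if c != positive)
    p, r = tp / (tp + fp), tp / (tp + fn)
    f = 2 * p * r / (p + r)
    print(f"{positive}: P={p:.3f} R={r:.3f} F={f:.3f}")
# excellent: P=0.881 R=0.987 F=0.931; average: P=0.938 R=0.600 F=0.732
# These agree with Table IV up to rounding; the overall accuracy is
# (74 + 15) / 100 = 0.89, matching the "Overall" column.
```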
TABLE III
CLASSES AND THEIR RELEVANT CONTEXTUAL VARIABLES

Class          | Relevant Contextual Variables | Old F-Score | Improved F-Score | # Features Reduced | Improvement Percentage
Beaches        | Continent, Travel With        | 0.68        | 0.89             | 4                  | 30.88 %
Pilgrimage     | Age, Motive, Travel With      | 0.62        | 0.83             | 3                  | 33.87 %
Amusement Park | Continent, Age, Travel With   | 0.65        | 0.88             | 3                  | 34.29 %
TABLE IV
ANALYSIS USING REDUCED VARIABLES

Performance Measure        | Overall | Excellent | Average
Precision (P) = TP/(TP+FP) | 0.89    | 0.88      | 0.94
Recall (R) = TP/(TP+FN)    | 0.89    | 0.98      | 0.6
F-Score (F) = 2PR/(P+R)    | 0.89    | 0.93      | 0.73

[Fig. 7: distribution of the reviewers of the class "Beach" on the basis of class ratings.]

C. Relevant Feature Extraction

The relevant contextual variables determined using the naive Bayes classifier are categorical in nature, and each contextual variable has many features associated with it. The challenge is to perform dimensionality reduction to further reduce the feature space.
As shown in Section II-B, using only the relevant features improves the performance of the recommender system. Singular Value Decomposition (SVD) is a technique, widely used in signal processing, for factorizing a real or complex matrix. In this paper, the relevant contextual variables of each class are first selected using the naive Bayes classifier, and the important features of these contextual variables for each entity are then selected using SVD. The SVD of an m × n matrix M is given by [28]

M = U Σ V^T    (4)

where U is an m × m orthogonal matrix of left singular vectors of M, Σ is an m × n diagonal matrix of singular values of M, and V is the n × n orthogonal matrix of right singular vectors of M.
Using the naive Bayes classifier, Continent and Travel With were found to be the most relevant contextual variables of the class "Beaches". These variables have many features, such as {North America, Family and Friends} and {Europe, Alone}. Among these, the features of each entity in the class "Beach" having a high cosine similarity [27] with the "excellent" rating were determined. In the following, the cosine similarity between the rating "excellent" for the entity Hawaii and one of its features, {North America, Family and Friends}, is calculated using SVD. A matrix C of dimension 20 × 2, shown in Eq. (5), is constructed from the 20 reviewers and the 2 features; all calculations use the data set of the entity Hawaii in the class Beach. Each row of C corresponds to one training instance for Hawaii. An element of the first column is set to 1 when the corresponding reviewer has North America as Continent and Friends and Family as the choice of Travel With, and an element of the second column is set to 1 for instances rated "excellent" and 0 for instances rated "average". SVD is performed on C, which yields the matrices U, Σ and V shown in Eqs. (6), (7) and (8), respectively. For the low-rank approximation, only the largest singular value is retained and all the other singular values of Σ are set to zero, which results in the new matrix C_new shown in Eq. (9).
C =
| 0  1 |
| 0  1 |
| 1  1 |
| ⋮  ⋮ |
| 1  0 |        (5)

After applying singular value decomposition:

U =
| -0.1500   0.3037  ···  -0.0519 |
| -0.2716  -0.0768  ···  -0.3955 |
|    ⋮         ⋮      ⋱      ⋮    |
| -0.1206  -0.3805  ···   0.8462 |        (6)

Σ =
| 5.1750      0   |
|   0      2.0541 |
|   0         0   |
|   ⋮         ⋮   |
|   0         0   |        (7)

V =
| -0.6239  -0.7815 |
| -0.7815   0.6239 |        (8)

C_new =
| 0.4876  0.6108 |
| 0.8768  1.0984 |
|   ⋮        ⋮    |
| 0.3892  0.4876 |        (9)

The cosine similarity of the feature {North America, Friends and Family} with the rating "excellent" is calculated from C_new as

Cosine Similarity = 0.4876 × 0.6108 + ... + 0.3892 × 0.4876 = 13.0573        (10)

The features of an entity having a high cosine similarity with its "excellent" rating are extracted as the significant features of that entity.
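The computation in Eqs. (5)-(10) can be reproduced with a few lines of NumPy. The sketch below builds the indicator matrix C for one entity, keeps only the largest singular value, and scores the feature by the inner product of the two columns of C_new, as in Eq. (10). The indicator data in the example is invented, so its score will differ from the 13.0573 obtained for Hawaii.

```python
# Rank-1 SVD score of one feature against the "excellent" rating (Eqs. 5-10).
import numpy as np

def feature_score(has_feature, is_excellent):
    """has_feature, is_excellent: 0/1 indicator lists over an entity's reviews."""
    C = np.column_stack([has_feature, is_excellent]).astype(float)   # Eq. (5)
    U, s, Vt = np.linalg.svd(C, full_matrices=False)                 # Eqs. (6)-(8)
    C_new = s[0] * np.outer(U[:, 0], Vt[0, :])                       # Eq. (9): keep sigma_1 only
    return float(C_new[:, 0] @ C_new[:, 1])                          # Eq. (10)

# Toy example: 20 reviewers of one entity (invented indicators).
rng = np.random.default_rng(0)
has_feature = rng.integers(0, 2, 20)      # 1 if {North America, Friends and Family}
is_excellent = rng.integers(0, 2, 20)     # 1 if rated "excellent"
print(feature_score(has_feature, is_excellent))
```

Note that this score equals σ₁² v₁₁ v₁₂; with the values of Eqs. (7) and (8) this is 5.1750² × 0.6239 × 0.7815 ≈ 13.06, consistent with Eq. (10).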
III. SIMULATED RESULTS

The cosine similarities of the features of the selected contextual variables for each entity of the class Beach are shown in Table V; only the non-zero values are listed, which allows all the features of each entity to be compared. The entities of the class Beach, namely Bora Bora, Hawaii, Porto da Barra, Baga and Anse Lazio, and their reviews for the contextual variables Continent and Travel With are considered. For these contextual variables, only the best features, as given by SVD and the cosine similarity, are finally taken. The higher the cosine similarity of a feature, the higher its rank in the recommendation system; a low cosine similarity indicates that the feature is less important. The table can be generalized for each entity to find the best features from the selected contextual variables. Reducing the contextual variables in the first step and then further reducing the feature space therefore yields concise contextual information that can be directly incorporated into the recommender system.

TABLE V
COSINE SIMILARITY AFTER APPLYING SVD

Tourist Destination    | Continent     | Travel With | Cosine Similarity
Bora Bora, France      | Europe        | Family      | 2.2362
                       | Europe        | Friends     | 1.1364
                       | Europe        | Both        | 1.1364
                       | Asia          | Both        | 1.0617
                       | Australia     | Alone       | 1.0617
                       | North America | Family      | 4.7735
                       | North America | Both        | 5.4356
Hawaii, USA            | Europe        | Family      | 1.0522
                       | Europe        | Both        | 1.0522
                       | Asia          | Family      | 1.0522
                       | Australia     | Family      | 1.0522
                       | North America | Family      | 3.4332
                       | North America | Both        | 13.0573
Porto da Barra, Brazil | Europe        | Family      | 2.5230
                       | Europe        | Friends     | 1.0814
                       | Europe        | Both        | 2.5230
                       | Europe        | Alone       | 1.0814
                       | North America | Both        | 1.0814
                       | South America | Friends     | 1.1864
                       | South America | Both        | 4.8284
Baga, India            | Europe        | Family      | 5.1491
                       | Europe        | Friends     | 2.2999
                       | Europe        | Alone       | 1.1864
                       | Asia          | Family      | 1.1864
                       | Asia          | Friends     | 2.5200
                       | Asia          | Both        | 3.9000
Anse Lazio, Seychelles | Europe        | Family      | 5.4358
                       | Europe        | Both        | 6.3891
                       | Asia          | Family      | 1.0617
                       | Australia     | Family      | 1.0617
                       | Africa        | Both        | 2.3969
                       | North America | Family      | 1.0617
                       | North America | Both        | 2.23
A. Incorporating the proposed method in the recommendation system

Three users searching for beaches are considered. As demonstrated above, Continent and Travel With are the two relevant contextual variables for the class "Beaches"; the corresponding contextual information of the three users, as used by the recommendation system, is summarized in Table VI.

TABLE VI
USERS AND THEIR INFORMATION

User   | Continent     | Travel With
User-1 | North America | Family
User-2 | Europe        | Both
User-3 | North America | Both
For each user, the cosine similarity with each entity of the class "Beaches" is calculated; the results are shown in Fig. 8.
[Fig. 8: cosine similarities of the beach entities for each user.]
From the figure it can be seen that Bora Bora is of more interest to User-3 than to User-1 or User-2, and similarly that Baga is of more interest to User-2 than to User-1 or User-3. The destination of maximum interest to each user is given in Table VII. Table VII clearly shows that the users have different preferences, and it is therefore essential to incorporate the contextual information of the users when recommending entities to them.

TABLE VII
TOP RESULT FOR EACH USER

User   | Recommended Destination | Cosine Similarity
User-1 | Bora Bora               | 4.7735
User-2 | Anse Lazio              | 6.3891
User-3 | Hawaii                  | 13.0573
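Operationally, the recommendation step behind Table VII amounts to looking up, for each entity, the precomputed cosine similarity of the feature matching the user's (Continent, Travel With) pair and ranking the entities by that score. A minimal sketch using a few of the Table V values is given below (combinations missing from Table V default to zero); the dictionary layout is an assumption made for illustration.

```python
# Rank beach entities for a user from the precomputed Table V scores.
# scores[entity][(continent, travel_with)] = cosine similarity (subset of Table V).
scores = {
    "Bora Bora":  {("North America", "Family"): 4.7735, ("North America", "Both"): 5.4356},
    "Hawaii":     {("North America", "Family"): 3.4332, ("North America", "Both"): 13.0573},
    "Anse Lazio": {("Europe", "Both"): 6.3891, ("North America", "Both"): 2.23},
}

def recommend(continent, travel_with, top_k=1):
    key = (continent, travel_with)
    ranked = sorted(scores, key=lambda e: scores[e].get(key, 0.0), reverse=True)
    return ranked[:top_k]

print(recommend("North America", "Family"))   # User-1 -> ['Bora Bora']
print(recommend("Europe", "Both"))            # User-2 -> ['Anse Lazio']
print(recommend("North America", "Both"))     # User-3 -> ['Hawaii']
```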
IV. CONCLUSION

In this paper, we have incorporated the contextual information of the reviewers into the recommendation system to make the recommendations more personalized and pertinent to the interests of the user. From the available contextual information, the most relevant contextual variables for each class are determined through feature-subset evaluation with a naive Bayes classifier. These contextual variables are then used, together with Singular Value Decomposition, to extract for each entity the features that have a high cosine similarity with good ratings. This information is utilized by the recommender system to analyze the contextual information of a user and to recommend entities that are of interest to him. The proposed mechanism gives an improvement of about 30% in the F-score, thereby increasing the accuracy and making the recommender system more personalized.
REFERENCES

[1] J. L. Herlocker and J. A. Konstan, "Evaluating collaborative filtering recommender systems," ACM Transactions on Information Systems, vol. 22, pp. 5-53, January 2004.
[2] G. Adomavicius and A. Tuzhilin, "Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions," IEEE Transactions on Knowledge and Data Engineering, vol. 17, pp. 734-749, June 2005.
[3] A. Gunawardana and C. Meek, "A unified approach to building hybrid recommender systems," in Proceedings of the Third ACM Conference on Recommender Systems, New York, USA, 2009, pp. 117-124.
[4] G. Adomavicius and R. Sankaranarayanan, "Incorporating contextual information in recommender systems using a multidimensional approach," ACM Transactions on Information Systems, vol. 23, pp. 103-145, January 2005.
[5] G. D. Kader and M. Perry, "Variability for categorical variables," Journal of Statistics Education, vol. 15, pp. 1-16, April 2007.
[6] M. E. Young and E. A. Wasserman, "Entropy detection by pigeons: Response to mixed visual displays after same-different discrimination training," Journal of Experimental Psychology: Animal Behavior Processes, vol. 23, p. 157, March 1997.
[7] N. Golbandi, Y. Koren, and R. Lempel, "Adaptive bootstrapping of recommender systems using decision trees," in Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, New York, USA, 2011, pp. 595-604.
[8] Wikipedia. (2013) Naive Bayes classifier. [Online]. Available: https://en.wikipedia.org/wiki/Naive_Bayes_classifier
[9] G. H. Golub and C. Reinsch, "Singular value decomposition and least squares solutions," Numerische Mathematik, vol. 14, pp. 403-420, March 1970.
[10] L. De Lathauwer and B. De Moor, "A multilinear singular value decomposition," SIAM Journal on Matrix Analysis and Applications, vol. 21, pp. 1253-1278, May 2000.
[11] Wikipedia. (2013) Bahai gardens. [Online]. Available: http://en.wikipedia.org/wiki/Bah%C3%A1%27%C3%AD_gardens
[12] ——. (2013) Mecca Black Stone. [Online]. Available: https://www.google.co.in/search?q=Mecca+Black+Stone&ie=utf8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a
[13] ——. (2013) Sanctuary of Our Lady of Lourdes. [Online]. Available: http://en.wikipedia.org/wiki/Sanctuary_of_Our_Lady_of_Lourdes
[14] (2013) Europa Park. [Online]. Available: http://www.europapark.de/langen/Home/c1174.html?langchange=true
[15] Wikipedia. (2013) Hopi Hari. [Online]. Available: http://en.wikipedia.org/wiki/Hopi_Hari
[16] (2013) Magic Kingdom. [Online]. Available: https://disneyworld.disney.go.com/destinations/magic-kingdom/
[17] (2013) Disneyland Paris. [Online]. Available: http://www.disneylandparis.com/
[18] (2013) Disneyland Tokyo. [Online]. Available: http://www.tokyodisneyresort.co.jp/en/
[19] Wikipedia. (2013) Bora Bora. [Online]. Available: http://en.wikipedia.org/wiki/Bora_Bora
[20] ——. (2013) Porto da Barra Beach. [Online]. Available: http://en.wikipedia.org/wiki/Porto_da_Barra_Beach
[21] ——. (2013) Baga. [Online]. Available: http://en.wikipedia.org/wiki/Baga,_Goa
[22] ——. (2013) Seychelles. [Online]. Available: http://en.wikipedia.org/wiki/Seychelles
[23] ——. (2013) Hawaii. [Online]. Available: http://en.wikipedia.org/wiki/Hawaii
[24] R. Kohavi and G. H. John, "Wrappers for feature subset selection," Artificial Intelligence, vol. 97, pp. 273-324, March 1997.
[25] H. Liu and R. Setiono, "A probabilistic approach to feature selection - a filter solution," in Proceedings of the International Conference on Machine Learning, 1996, pp. 319-327.
[26] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed. Upper Saddle River, NJ, USA: Prentice Hall PTR, 1998.
[27] C. Manning, P. Raghavan, and H. Schütze, Introduction to Information Retrieval. Cambridge University Press, 2008. [Online]. Available: http://books.google.co.in/books?id=t1PoSh4uwVcC
[28] G. Strang, Linear Algebra and Its Applications. Thomson Brooks/Cole Cengage Learning, 2006. [Online]. Available: http://books.google.co.in/books?id=q9CaAAAACAAJ