An Efficient and Optimized Recommendation System Using Social Network Knowledge Base

Md Zeeshan Ashraf1, Dheeraj Kumar Chouwdhary1, Rohan Lal Das1, Prasun Ghosal1,2

1 Dept of IT, Bengal Engineering and Science University, Shibpur, Howrah 711103, WB, India
2 Dept of CSE, University of North Texas, Denton, TX 76201, USA

Abstract— With the advent of e-commerce, a large number of companies (e.g. Flipkart, eBay, Infibeam, Amazon) offer a huge range of products to users on a single platform. These products are recommended to users based on certain user-related parameters. Moreover, to refine their recommendations, these e-commerce websites have started using social networking sites to access information about the user. In this paper we present a novel recommendation system for e-commerce websites using a social network knowledge base. It uses parameters provided by users, viz. age group, gender, location, etc., and based on these criteria the best recommendation is produced by our proposed method using the Analytical Hierarchy Process, Merge-and-Sort, and Sort-and-Count algorithms within a wrapper for optimization. User preferences are taken from Facebook, whereas Flipkart is chosen as the e-commerce website to illustrate the proposed method.

Keywords— Analytical Hierarchy Process, Merge-and-Sort algorithm, Sort-and-Count algorithm, social networks, recommendation system, e-commerce websites.

Abbreviations and Acronyms used: AHP - Analytical Hierarchy Process, FB - Facebook, P.V - Potential Value, N - integer constant giving the top N nodes out of all friend nodes (in Phase II), ŵ - row average matrix, A - comparison matrix (in Phase II).

I. INTRODUCTION

Today, with the rise of social networking sites, web applications have become more dynamic than ever. This has also led to the active participation of more and more users in these social network websites. These websites store personal data related to the user and thus work as one of the largest repositories of such information on the web. Moreover, rapid growth in the e-commerce market has also led to the production of a huge amount of user-related information. Because of this Big Data, it becomes extremely difficult for these websites to suggest the most relevant product to the user. To overcome the complexity of processing this enormous amount of information, we propose a novel recommendation tool that gives the user a customized recommendation of the most relevant products on an e-commerce website according to their personal preferences, using their social network profile information and the purchase history of the user and their friends on that particular e-commerce website.

II. BACKGROUND AND MOTIVATION

Recommendation systems are widely known for their use on e-commerce websites, where they use a customer's interests as inputs to generate a list of recommended items. They generally focus on the items a particular user purchases on that particular website and carry out their recommendation process based on this set of data. However, nowadays many recommendation systems also take into account the user's interests on a social networking website, process them, and link their recommendation to the data about that user from the social networking website. E-commerce recommendation systems often operate in several challenging environments, as follows.

• Optimized processing for a large retailer with a huge amount of data: tens of millions of customers and millions of distinct catalog items.
• Returning the recommendation result set in real time, in no more than half a second, while still producing high-quality recommendations.
• Serving new customers, who typically have extremely limited information based on only a few purchases or product ratings.
• Serving older customers, who can have a glut of information based on thousands of purchases and ratings.

Therefore, designing a novel recommendation system is challenging not only in designing an optimized decision-making model, but also in designing a computationally efficient system that can process an enormous amount of Big Data in real time.

III. NOVEL CONTRIBUTIONS OF THIS WORK

In the present work, we extract the social network (e.g. Facebook) data of the user and his friends, as well as the purchase/rating history of each user from the e-commerce website (since our recommendation tool will be functional on an e-commerce website, the purchase history of users will be available). Our approach is to collect the user information from the social network website (Facebook) and then perform the Analytical Hierarchy Process (AHP) on the collected data to find the set of friends closest to the user (based on certain parameters, described later). From this filtered set of friends we select the top 100 friends or the top 50% of friends (whichever is less), and on this set we apply the Inversion method technique followed by Memory-based Collaborative Filtering for high-quality recommendation in real time.
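The "top 100 or 50%, whichever is less" rule can be written down in a few lines. This is only a sketch: the function name `select_top_friends`, its parameters, and the choice to round the 50% share down are assumptions made for illustration, not details given in the paper.

```python
def select_top_friends(ranked_friends, cap=100, fraction=0.5):
    """Keep the closest friends from a list already sorted closest-first.

    The cut-off is the smaller of `cap` friends or `fraction` of the list.
    Rounding the fraction down is an assumption; the text does not specify it.
    """
    share = int(len(ranked_friends) * fraction)  # 50% of the friends, rounded down
    n = min(cap, share)                          # whichever is less
    return ranked_friends[:n]
```

For a user with 30 ranked friends this keeps 15 of them; for a user with 500 ranked friends it keeps 100.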

IV. PROPOSED APPROACH

The entire work may be divided into three phases, as follows.

Phase I: Data Extraction
In this phase we retrieve the user's information as well as the information of the user's friends from the social networking website. This enables us to set up the social network knowledge database.

Phase II: Optimization and Sorting of Extracted Data
In this phase, the retrieved data of the neighbor nodes (friend circle in the social network) of the user is used. Using AHP (Analytical Hierarchy Process), the closest friends are ranked in terms of similarity (with the user in behavior) according to the user's current browsing category.

Phase III: Recommendation of Product
In this phase, the top 'N' closest friends are selected from the set of friends generated above, and then the Inversion method technique is used to find the inversion count with the help of the Sort and Count algorithm followed by the Merge and Sort algorithm. A product from the browsing category is recommended to the user by the Memory-Based CF recommendation algorithm, using the purchase history, product ratings and liked items of the user and of these N closest friends.

A. Phase I: Data Extraction
As the user logs into an e-commerce website through a social network id and password, relevant information is retrieved about the user and his friend circle. In order to carry out the retrieval, the following steps are executed.
1. Setting up a FB application in FB developer mode.
2. Specifying the data fields that need to be retrieved for a particular user.
3. Integrating the FB app with the e-commerce website.
4. After logging on, the user is asked for his permission to use his data.
5. Extraction of data.
The data is stored in the form of a file that consists of all the information pertaining to a particular user. This data is now sent for optimization in Phase II. Thus, this integration gives us the desired information about the user and his friend circle.

Fig 1: AHP overview

B. Phase II: Optimization and Sorting of Extracted Data (Using AHP method)
First, information on different parameters (language, age, sex, location, etc.) is collected for each node from the social network website. An e-commerce website has different categories of products. For each category, these parameters are evaluated and manually assigned a relative potential value, from extremely important (10) to equal importance (1). All categories available on the e-commerce website should be evaluated in this way and put into a table called the parameter/criteria/alternative evaluation table. Similarly, all the N friends are assigned a value from high to low with respect to closeness to the user for each parameter. To assign these values an efficient algorithm is used. These values of each node (friend) for each category are also put into a table, called the alternatives evaluation table. These two tables are used in the AHP process to find the potential value with respect to the user, and all friends (nodes) are ranked according to potential value. Figure 1 represents the overview of an AHP flow. The following parameters are extracted and assigned for further processing in our proposed system. Detailed algorithms for extraction and processing are omitted due to paucity of space.
1. Language: Each language is converted to ISO 639-3 codes for ease of processing, with error handling and evaluation.
2. Age: Ages 0–100 are divided into 20 groups, numbered 1 to 20. A neighbor node whose age falls in the same group as the user is assumed to be nearest to the user's behavior. As the gap increases, behavior differs almost linearly, so evaluation is done by the difference of age groups.
3. Location: Evaluation according to location (living and home location).
4. Sex
5. Religion
6. Likes
7. Political Views
8. Relationship Status
The system may be extended to any number of other parameters of interest in the same manner.

C. Phase III: Recommendation of Product
After we are done with AHP and the sorting of data as per the parameters, we proceed to Phase III for the recommendation of the product. Here, first the concept of Inversion Count is applied, where we compare the user's ranking of a set of products with the rankings, on the same set of products, of the friends in the new list generated by Phase II. This count gives us the compatibility between the user and his set of friends, i.e., it gives a measure ranging from complete agreement (when there are no inversions) to complete disagreement (if every pair forms an inversion).

To find the Inversion Count, let us consider comparing the user's ranking and his friend's ranking of the same set of n books. A natural method would be to label the books from 1 to n according to the user's ranking, then order these labels according to the friend's ranking and see how many pairs are "out of order". More concretely, we consider the following problem. We are given a sequence of n numbers a1, a2, …, an; we assume that all the numbers are distinct. We want to define a measure that tells us how far this list is from being in ascending order; the value of the measure should be 0 if a1 < a2 < … < an, and should increase as the numbers become more scrambled. A natural way to quantify this notion is by counting the number of inversions. We say that two indices i < j form an inversion if ai > aj, that is, if the two elements ai and aj are "out of order". We seek to determine the number of inversions in the sequence a1, …, an.

Fig 2: Geometric visualization of Inversion Count

Based on the above theory, let us consider an example in which the sequence is 2, 4, 1, 3, 5. There are three inversions in this sequence: (2, 1), (4, 1), and (4, 3). There is also a geometric way to visualize the inversions, as shown in Figure 2. We draw the sequence of input numbers in the order they are provided above and the same numbers in ascending order below. We then draw a line segment between each number in the top list and its copy in the lower list. Each crossing pair of line segments corresponds to one pair that is in the opposite order in the two lists; in other words, an inversion. Figure 2 shows a natural way to calculate inversions, but in our approach this is realized using the Sort and Count algorithm followed by the Merge and Sort algorithm (called Merge-and-Count in the pseudo code). Pseudo codes of those algorithms are as follows.

Sort and Count Algorithm:
Sort-and-Count (L)
    If the list has one element then
        There are no inversions
    Else
        Divide the list into two halves:
            A contains the first n/2 elements
            B contains the remaining n/2 elements
        (rA, A) = Sort-and-Count (A)
        (rB, B) = Sort-and-Count (B)
        (r, L) = Merge-and-Count (A, B)
    Endif
    Return r = rA + rB + r, and the sorted list L
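The Sort-and-Count recursion and the Merge-and-Count step it calls can be sketched in Python as follows. This is a minimal illustration, not the authors' Java implementation; the helper `ranking_inversions`, which labels items by the user's ranking before counting, is a hypothetical name added here to show how two rankings would be compared.

```python
def merge_and_count(a, b):
    """Merge two sorted lists and count pairs (x in a, y in b) with x > y."""
    merged, count = [], 0
    i = j = 0
    while i < len(a) and j < len(b):
        if b[j] < a[i]:
            # b[j] is smaller than every element still remaining in a
            count += len(a) - i
            merged.append(b[j])
            j += 1
        else:
            merged.append(a[i])
            i += 1
    merged.extend(a[i:])  # one of these two tails is empty
    merged.extend(b[j:])
    return count, merged


def sort_and_count(seq):
    """Return (number of inversions, sorted copy) of a list of distinct numbers."""
    if len(seq) <= 1:
        return 0, list(seq)
    mid = len(seq) // 2
    r_a, a = sort_and_count(seq[:mid])
    r_b, b = sort_and_count(seq[mid:])
    r, merged = merge_and_count(a, b)
    return r_a + r_b + r, merged


def ranking_inversions(user_ranking, friend_ranking):
    """Label items by the user's ranking, reorder by the friend's, count inversions."""
    label = {item: pos for pos, item in enumerate(user_ranking)}
    return sort_and_count([label[item] for item in friend_ranking])[0]


inversions, _ = sort_and_count([2, 4, 1, 3, 5])  # 3 inversions: (2, 1), (4, 1), (4, 3)
```

Counting while merging is what keeps the running time at O(n log n), instead of the O(n²) cost of checking every pair.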

Merge and Count Algorithm:
Merge-and-Count (A, B)
    Maintain a Current pointer into each list, initialized to point to the front elements
    Maintain a variable Count for the number of inversions, initialized to 0
    While both lists are nonempty:
        Let ai and bj be the elements pointed to by the Current pointer
        Append the smaller of these two to the output list
        If bj is the smaller element then
            Increment Count by the number of elements remaining in A
        End if
        Advance the Current pointer in the list from which the smaller element was selected
    End while
    Once one list is empty, append the remainder of the other list to the output
    Return Count and the merged list

After obtaining the number of inversions between the users, the Memory-Based CF algorithm is applied for efficient and high-quality recommendation for the user.

Let
V_{i,j} = vote of user i on item j
I_i = items for which user i has voted

The mean vote for i is

$$\bar{V}_i = \frac{1}{|I_i|} \sum_{j \in I_i} V_{i,j} \qquad (1)$$

The predicted vote P(a,j) for "active user" a is a weighted sum:

$$P(a,j) = \bar{V}_a + k \sum_{i=1}^{n} W(a,i)\,\bigl(V_{i,j} - \bar{V}_i\bigr) \qquad (2)$$

where W(a,i) = weights of the n similar users, j = product recommended by the friends to user a, and k = normalizer = 1 / | ∑_i W(a,i) |.

The weights may be chosen as K-nearest neighbor,

$$W(a,i) = \begin{cases} 1 & \text{if } i \in \text{neighbors}(a) \\ 0 & \text{otherwise} \end{cases}$$

and as the Pearson Correlation Coefficient (Resnick '94, Grouplens),

$$W(a,i) = \frac{\sum_{j} (V_{a,j} - \bar{V}_a)(V_{i,j} - \bar{V}_i)}{\sqrt{\sum_{j} (V_{a,j} - \bar{V}_a)^2 \, \sum_{j} (V_{i,j} - \bar{V}_i)^2}} \qquad (3)$$

The predicted vote P(a,j) will give the recommendation for the user.

V. IMPLEMENTATION DETAILS

Our work is implemented in three phases, as mentioned earlier. In Phase I, ASP.NET is used for extraction of data from the social network website; for the implementation of Phases II and III, Java is used.

A. Experimental Framework:

Phase I Implementation: Integration with Facebook Connect:
The steps of the implementation for this stage may be summarized as follows.
Step 1: Creating a Web Site for Intended Integration. A website is designed using the ASP.NET platform to integrate with Facebook.
Step 2: Creating a Cross-Domain Communication Channel File. It is used to establish communication between third-party Web pages and Facebook pages and services inside a browser. To reference the library, we needed to create a cross-domain communications channel file.
Step 3: Adding Facebook XML Namespace to HTML Tag.
Step 4: Adding a Reference to the Facebook JavaScript Feature Loader.
Step 5: Creating the Facebook Connect Login Button.
Final Step: Initializing Facebook Connect.
Using Facebook Developer Toolkit: Next, Facebook Developer Toolkit is used to collect information from the user's basic profile.

Fig 3: An example social network

Phase II Implementation: We have collected the following information for each node from the social network website.
1. Language
2. Age
3. Location
4. Likes
5. Sex
6. Religion
7. Political views
8. Relationship status

We use these eight parameters as criteria for selection of optimal nodes through AHP.
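How criteria like these could feed an AHP ranking is sketched below in Python. Several pieces are assumptions made for illustration only, since the paper omits its detailed algorithms: the comparison matrix A is built from ratios of the manually assigned potential values, the criteria weights are the row averages of the column-normalized matrix (a common AHP approximation), per-parameter closeness scores are taken to lie in [0, 1], and age closeness falls off linearly with the age-group difference. All function names are hypothetical.

```python
def priority_vector(potentials):
    """Criteria weights from potential values (1..10) via a comparison matrix."""
    n = len(potentials)
    # Comparison matrix A: A[i][j] = importance of criterion i relative to j (assumed ratio).
    A = [[potentials[i] / potentials[j] for j in range(n)] for i in range(n)]
    col_sums = [sum(A[i][j] for i in range(n)) for j in range(n)]
    # Row average of the column-normalized matrix gives one weight per criterion.
    return [sum(A[i][j] / col_sums[j] for j in range(n)) / n for i in range(n)]


def age_group(age):
    """Map an age in 0..100 to one of 20 groups numbered 1..20 (5-year bands assumed)."""
    return min(age // 5, 19) + 1


def age_group_score(user_age, friend_age):
    """Closeness in [0, 1]: 1 for the same group, decreasing linearly with the gap."""
    return 1 - abs(age_group(user_age) - age_group(friend_age)) / 19


def rank_friends(potentials, closeness):
    """Rank friends by weighted closeness; `closeness` maps friend -> per-criterion scores."""
    weights = priority_vector(potentials)
    scores = {
        friend: sum(w * s for w, s in zip(weights, values))
        for friend, values in closeness.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

For example, with potentials `[6, 3, 1]` for three criteria, the weights come out close to 0.6, 0.3 and 0.1, and a friend who scores high on the first criterion outranks one who scores high only on the third.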

Recommendation Systems have thus emerged as a common tool to carry out the process of recommendation.

Phase III Implementation: The Phase III algorithm is applied to the data obtained above, using Java, to get the recommendation. This is explained in the next section through an illustration.
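The memory-based CF step of Phase III can be sketched in Python as follows (the paper's own implementation is in Java and is not shown). The function names and the dictionary layout of the votes are assumptions. Two further choices are assumptions as well: the Pearson weight is computed over the items both users have rated, and the prediction uses each user's mean over all of their votes, with the normalizer taken as 1 / |sum of weights| as described in the text.

```python
from math import sqrt


def mean_vote(votes):
    """Mean of all votes cast by one user; `votes` maps item -> rating."""
    return sum(votes.values()) / len(votes)


def pearson_weight(votes_a, votes_i):
    """Pearson correlation of two users over the items both have rated."""
    common = sorted(votes_a.keys() & votes_i.keys())
    mean_a = sum(votes_a[j] for j in common) / len(common)
    mean_i = sum(votes_i[j] for j in common) / len(common)
    num = sum((votes_a[j] - mean_a) * (votes_i[j] - mean_i) for j in common)
    den = sqrt(sum((votes_a[j] - mean_a) ** 2 for j in common)
               * sum((votes_i[j] - mean_i) ** 2 for j in common))
    return num / den if den else 0.0  # no correlation defined for constant votes


def predict_vote(votes, active, item, weights):
    """Weighted-sum prediction of the active user's vote on `item`.

    `votes` maps user -> {item: rating}; `weights` maps friend -> weight.
    Raises ZeroDivisionError if the weights of the raters sum to zero.
    """
    raters = [i for i in weights if item in votes[i]]
    k = 1 / abs(sum(weights[i] for i in raters))  # normalizer
    deviation = sum(weights[i] * (votes[i][item] - mean_vote(votes[i])) for i in raters)
    return mean_vote(votes[active]) + k * deviation


# Illustrative ratings of five books by two users with opposite tastes.
a = {"b1": 1, "b2": 2, "b3": 3, "b4": 4, "b5": 5}
d = {"b1": 5, "b2": 3, "b3": 4, "b4": 1, "b5": 2}
w_ad = pearson_weight(a, d)  # approximately -0.8
```

A strongly negative weight such as `w_ad` means that a high rating from that friend pulls the predicted vote down rather than up.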

B. Illustration and result:
Let us consider the social network shown in Figure 3 and a case assuming only three parameters (age, language and sex) for each node of the network (user + friends), with the user currently browsing for books on the e-commerce website. In Phase I, the social network data of the user and his friends is retrieved using the method described earlier. Now the Phase II procedure is applied step by step. First the potential of each parameter is evaluated; then the potential of each friend node is calculated and the ranking of closeness is determined. All parameters are assigned a potential value based on the importance of the parameter towards book selection, and we get the ranking as per the proposed method. Finally, in Phase III a product is recommended by applying the algorithms proposed in the previous section.

Moving to the main part of the recommendation, with V_{i,j} = vote of user i on item j and I_i = items for which user i has voted, the mean votes for users A, D, and C are given by equation (1):

$$\bar{V}_a = \tfrac{1}{5}(1 + 2 + 3 + 4 + 5) = 3, \qquad \bar{V}_d = 3, \qquad \bar{V}_c = 3$$

Now, calculating the Pearson correlation coefficient using equation (3), we get

$$W(a,d) = \frac{(1-3)(5-3) + (2-3)(3-3) + (3-3)(4-3) + (4-3)(1-3) + (5-3)(2-3)}{\sqrt{(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2}\;\sqrt{(5-3)^2 + (3-3)^2 + (4-3)^2 + (1-3)^2 + (2-3)^2}} = \frac{-8}{10} = -0.80$$

Similarly, W(a,c) = 0.10.

Now, let us say that for a product 'T', D gives a rating of 4 and C gives a rating of 2. The normalizer is k = 1 / |W(a,d) + W(a,c)| = 1 / |-0.8 + 0.1| = 1/0.7. Hence, the predicted vote for active user a (according to equation (2)) is

$$P(a,t) = 3 + \frac{1}{0.7}\left(-0.8\left(4 - \tfrac{19}{6}\right) + 0.1\left(2 - \tfrac{17}{6}\right)\right) = 3 + \tfrac{10}{7}\cdot\left(-\tfrac{3}{4}\right) = 3 - \tfrac{15}{14} \approx 1.93$$

For any other product 'U', let D give a rating of 2 and C give a rating of 4. So,

$$P(a,u) = 3 + \frac{1}{0.7}\left(-0.8\left(2 - \tfrac{17}{6}\right) + 0.1\left(4 - \tfrac{19}{6}\right)\right) = 3 + \tfrac{10}{7}\cdot\tfrac{3}{4} = 3 + \tfrac{15}{14} \approx 4.07$$

Hence, we see that P(a,u) > P(a,t), as the inversion of A with D was more. Therefore, the product that D recommends gets lower priority when compared to C's ratings.

VI. CONCLUSIONS AND FUTURE WORKS

This paper gives an insight into the concept of a Recommendation System by implementing AHP along with Inversion and CF algorithm techniques. This approach has wide applications in the field of e-commerce websites. We conclude from our research and analysis that the Recommendation System is here to stay. Social networking is the best available means to gauge user behavior, and thus a recommendation system using social networking will go a long way in helping users find what they are looking for in a less cluttered manner.

For future work we will focus on decreasing the complexity of the Inversion algorithm from O(n log n) to O(n √(log n)), thereby increasing the speed of execution considerably. Moreover, since a user's information on Facebook keeps changing over time, our retrieval code must be dynamic so as to keep up with the changes. Furthermore, the entire process of data mining and recommendation may be extended to Level-2 friends (friends of friends) to make the proposed recommendation system more robust and efficient.

