Data Mining and Knowledge Engineering
Design of Quality-based Recommender System for Bundle Purchases
R. MohanKumar 1, D. Saravanan 2
Final Year M.E (CSE) 1, Asst. Professor 2
Department of Computer Science & Engineering
Pavendar Bharathidasan College of Engineering & Technology
Tiruchirappalli, Tamilnadu, India
([email protected] 1, [email protected] 2)
Abstract - In online shopping, recommender systems play a vital role in recommending products to users based on their interests, product ratings, availability, and so on. However, most recommender systems use algorithms that are designed to recommend one product at a time and that focus only on accuracy. This individual-diversity characteristic of current systems leads to monopoly in recommendation. The proposed system combines multiple Collaborative Filtering (CF) techniques, namely the item-based technique, the user-based technique and matrix factorization, to predict the most appropriate rating for unrated items. A number of item ranking techniques are proposed to provide more diverse recommendations for both single-product and bundle purchases while maintaining the trade-off between recommendation accuracy and diversity. An improved indexing technique is also proposed to speed up the processing of user requests with the help of a well-defined semantic library. Both record-level and word-level inverted indexing techniques help to retrieve only the most relevant products as the result set.
Keywords - Recommender systems, ranking functions, products' ranking criteria, bundle purchase types, collaborative filtering, recommendation monopoly, recommendation diversity.

1. Introduction
Due to the enormous amount of information available, it is increasingly hard to search for and find content on the Internet that is relevant to what the user actually needs. Over the last 10 years, recommender systems have been introduced to help customers deal with these huge amounts of information, and they have been widely used in research as well as in e-commerce applications, such as those used by Flipkart and BestBuy [6]. The most common function of a recommender system relies on the notion of ratings, i.e., the system estimates ratings of products that are yet to be consumed by users, based on the ratings of items that have already been consumed and on user preferences [11]. Recommender systems typically try to predict the ratings of unknown items for each user, often using other users' ratings or some rating criteria, and recommend the top-N items with the highest predicted ratings to the customers [14]. Accordingly, there have been many studies on developing new algorithms that can improve the predictive accuracy of recommendations while maintaining better diversity in recommendations [6] [7].

Still, the quality of recommendations can be evaluated along a number of dimensions, and relying on the accuracy of recommendations alone may not be enough to find the most relevant items for each user. In particular, the importance of diverse recommendations has been emphasized in several studies [11]. These studies argue that one of the goals of recommender systems is to provide a user with highly personalized items, and that more diverse recommendations give users more opportunities to be recommended such items. There is an increasing awareness of the importance of aggregate diversity in recommender systems. Higher diversity (both individual and aggregate), however, can come at the expense of accuracy [6]. There is a trade-off between accuracy and diversity because high accuracy may often be obtained by safely recommending the most popular items to users, which can reduce diversity, i.e., lead to less personalized recommendations. On the other hand, higher diversity can be achieved by recommending highly personalized items to each user; such items often have less data and are more difficult to predict, and thus may lead to a decrease in recommendation accuracy [11].

The Internet has dramatically reduced buyer search costs by providing easy information retrieval with the help of search engines and recommender systems. On the other hand, researchers have found significant price variation on the Internet even for identical commodities such as books, movies and CDs [2]. This variation and the large number of vendors have made it difficult for a user to find the best price for a product.

A number of comparison-shopping search engines, widely known as "shopbots", have become popular on online shopping websites. At these websites a customer can enter the product name and specification, and the shopbot will search a large number of vendors and return the prices offered by retailers, as well as other information such as shipping cost, availability, customers' comments and discount details [2]. Most current shopbots are geared towards
one-product-at-a-time search. Thus, using these shopbots, a shopper who wants to find the best price for a group (bundle) of products has to initiate a search for each individual item and then combine the results on her own. A few shopbots allow a customer to compare a shopping list of multiple items as a whole by displaying the total purchase price of these items from a single vendor and/or from multiple vendors. However, none of these shopbots can incorporate the variety of bundling and pricing alternatives that are frequently offered by online retailers.

The use of bundle pricing and promotions has been a common marketing practice for a long time. Recently, they have been used more frequently in online retailing to improve sales [2]. Another reason for the increased use of bundle pricing and promotions could be that retailers are trying to avoid the direct price competition pressure caused by shopbots. Further, online shoppers are increasingly purchasing multiple items in a single order because of factors such as the convenience of online shopping, site search, non-linear shipping costs, and recommendations of related products from various recommendation systems.

The rest of the paper is organized as follows. Section 2 presents a detailed literature study of online shopping websites, the recommendation techniques involved and their mode of operation. Section 3 gives an overview of the proposed system, the algorithms used and the techniques involved. Section 4 presents the conclusion. Finally, the research papers referred to in this paper are listed.
2. Related Work
This section presents a survey focused mainly on the working strategy of recommender systems and their ranking functions, and on the bundling strategies followed by online shopping websites today. Finally, the inverted indexing technique and its advantages are discussed. Most real-time recommender systems implement the neighborhood-based rating technique to predict the rating of an unrated item in order to recommend it to the user. The standard ranking approach uses this rating to reorder the products before recommending them [6] [9]. Some systems use a model-based technique, which uses users' preferences to rate the product but requires a well-designed machine learning predictive model. Both kinds of system have the problem of predicting the initial rating value of a product when it is first included in the product base. All recent work on recommender systems follows one of the strategies described above. The objective is to satisfy user needs with a single search and to maximize the likelihood of relevant cases appearing high up in the result list, hence the priority given to similarity. By prioritizing similarity during retrieval, conventional recommenders implicitly ignore the importance of result diversity, and this may reduce the quality of the final recommendation.

2.1 Traditional CF Technique
The task of the traditional collaborative filtering (CF) recommendation algorithm is to predict the target user's rating for a target item that the user has not yet rated, based on the users' ratings of observed items. The user-item rating database is central to this task [12]. Each user is represented by item-rating pairs, which can be summarized in a user-item table containing the ratings $R_{ij}$ provided by the i-th user for the j-th item [6] [9]. The task of the recommender system consists of the following steps [11]:
1. Measuring the rating similarity.
2. Selecting neighbors. The neighbors who will serve as recommenders are selected. Two techniques have been employed in collaborative filtering recommender systems: threshold-based selection, in which users whose similarity exceeds a certain threshold value are considered neighbors of the target user, and the top-n technique, in which the n best neighbors are selected, with n given in advance.
3. Producing the prediction. Since the users' membership details are in the database, a weighted average of the neighbors' ratings can be computed, weighted by their similarity to the target user. The rating of the target user u for the target item x is:

\[ P_{u,x} = A_u + \frac{\sum_{u'=1}^{n} (P_{u',x} - A_{u'}) \, sim(u,u')}{\sum_{u'=1}^{n} sim(u,u')} \]

where $A_u$ is the average rating of the target user u over the items, $P_{u',x}$ is the rating of the neighbor user u' for the target item x, $A_{u'}$ is the average rating of the neighbor user u' over the items, $sim(u,u')$ is the similarity of the target user u and the neighbor user u', and n is the number of neighbors.
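The neighbor selection and prediction steps of the traditional CF technique can be sketched as follows. This is a minimal illustration rather than the authors' implementation; the function names and the tuple layout of `neighbors` are assumptions made for the example.

```python
def top_n_neighbors(similarities, n):
    """Top-n selection: keep the n users most similar to the target user.

    similarities: dict mapping neighbor id -> sim(u, u') (hypothetical layout).
    """
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    return [user for user, _ in ranked[:n]]


def threshold_neighbors(similarities, threshold):
    """Threshold-based selection: keep users whose similarity exceeds the threshold."""
    return [user for user, sim in similarities.items() if sim > threshold]


def predict_rating(target_avg, neighbors):
    """Weighted-average prediction P_{u,x} for the target user.

    target_avg: A_u, the target user's average rating.
    neighbors: list of (P_{u',x}, A_{u'}, sim(u,u')) tuples, one per neighbor.
    """
    numerator = sum((rating - avg) * sim for rating, avg, sim in neighbors)
    denominator = sum(sim for _, _, sim in neighbors)
    if denominator == 0:
        # No usable neighbors: fall back to the user's own average (assumption).
        return target_avg
    return target_avg + numerator / denominator
```

For instance, `predict_rating(3.5, [(5.0, 4.0, 0.8), (3.0, 3.5, 0.2)])` shifts the user's average of 3.5 upward, because the more similar neighbor rated the item above their own average.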
A collaborative filtering (CF) system works by building a database of user preferences for items: it collects user feedback in the form of ratings for items in a given domain and exploits similarities in rating behavior among several users to determine how to recommend an item. CF methods can be further subdivided into neighborhood-based and model-based approaches.

2.2 Limitations of Collaborative Filtering System
Three essentials are needed to support CF: many people must participate (increasing the likelihood that any one person will find other users with similar preferences), there must be an easy way to represent user interests in the system, and the algorithms must be able to match people with similar interests [8] [10]. These three elements are not easy to develop, and they give rise to the main shortcomings of CF systems. Some limitations of collaborative systems are the cold start problem, the sparsity problem, the grey sheep problem and the scalability problem [8] [11].

2.3 Standard Ranking Technique
Most recommender systems predict unknown ratings based on known ratings, using a traditional recommendation technique such as neighborhood-based or matrix factorization CF. The predicted ratings are then used to support the user's decision-making process. Each user u is recommended a list of top-N items, $L_N(u)$, selected according to some ranking criterion. More formally, item $i_x$ is ranked ahead of item $i_y$ (i.e., $i_x \prec i_y$) if $rank(i_x) < rank(i_y)$, where $rank: I \to \mathbb{R}$ is a function representing the ranking criterion. The majority of recommender systems use the predicted rating value as the ranking criterion [6]:

\[ rank_{std}(i) = R^*(u,i)^{-1} \]

The power of -1 in the above equation indicates that the items with the highest predicted ratings $R^*(u,i)$ are the ones recommended to the user. Recommending the most highly predicted items, as selected by the standard ranking approach, is designed to improve recommendation accuracy but not recommendation diversity. New ranking criteria are therefore needed in order to improve diversity in recommendation.

2.4 Accuracy-Diversity Trade-Off
In content-based recommender systems, the normal approach to measuring the similarity between an information item x and a target query q is to use a weighted-sum metric. Selecting the k most similar items usually results in a characteristic similarity profile in which the average similarity of the result set decreases gradually for increasing values of k [4]:

\[ Similarity(q,x) = \frac{\sum_{i=1}^{n} w_i \, sim_i(q_i, x_i)}{\sum_{i=1}^{n} w_i} \]

The diversity of a set of items $c_1, \ldots, c_n$ is the average dissimilarity between all pairs of items in the result set:

\[ Diversity(c_1, \ldots, c_n) = \frac{\sum_{i=1}^{n} \sum_{j=i}^{n} (1 - Similarity(c_i, c_j))}{\frac{n}{2} (n-1)} \]

Standard content-based recommenders often display a characteristic diversity profile, with diversity increasing for larger result sets. Thus the trade-off between similarity and diversity in recommendation is simple: for low values of k, similarity tends to be high while diversity tends to be very low, highlighting the fundamental problem that exists with case-based recommender systems.

In practice, improving the diversity of a fixed-size recommendation set means sacrificing similarity. The goal is to develop a strategy that optimizes the accuracy-diversity trade-off, delivering product recommendations that are diverse without compromising their similarity to the target query. We describe strategies for retrieving k items from a database, given a target query q, each focusing on a different way of increasing the diversity of the result set. Table 1 illustrates an example of the accuracy-diversity trade-off in two extreme cases, where only popular items or only long-tail items are recommended to users. In this example, a popular recommendation technique, neighborhood-based collaborative filtering (CF), is used to predict unknown ratings. Then, as candidate recommendations for each user, only the items predicted above the predefined rating threshold are considered, to ensure an acceptable level of accuracy [7].

TABLE 1 Accuracy-Diversity Tradeoff: Example
Top-1 recommendation of:                                           | Quality metric: Accuracy | Quality metric: Diversity
Popular item (item with the largest number of known ratings)       | 82 %                     | 49 distinct items
"Long-tail" item (item with the smallest number of known ratings)  | 68 %                     | 695 distinct items
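The weighted-sum similarity and the average pairwise dissimilarity used in this section can be sketched in a few lines. This is only an illustrative sketch: the function names, the per-feature similarity callables and the example data are assumptions, not part of the original paper.

```python
from itertools import combinations


def weighted_similarity(q, x, weights, feature_sims):
    """Similarity(q, x): weighted sum of per-feature similarities, normalized by the weights.

    q, x: sequences of feature values for the query and the item.
    weights: w_i for each feature.
    feature_sims: one callable sim_i(q_i, x_i) per feature (hypothetical helpers).
    """
    numerator = sum(w * sim(qi, xi)
                    for w, sim, qi, xi in zip(weights, feature_sims, q, x))
    return numerator / sum(weights)


def diversity(items, similarity):
    """Average dissimilarity over all pairs of items in the result set.

    The pairs (i, i) contribute 1 - Similarity(c_i, c_i) = 0 when an item is
    fully similar to itself, so only distinct pairs are summed here.
    """
    n = len(items)
    if n < 2:
        return 0.0
    total = sum(1.0 - similarity(a, b) for a, b in combinations(items, 2))
    return total / ((n / 2) * (n - 1))
```

A result set of three mutually dissimilar items (pairwise similarity 0) has diversity 1.0, while a set whose items all have pairwise similarity 0.5 has diversity 0.5.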
As illustrated by Table 1, if the system recommends the most popular item to each user, it is much more likely that many users get the same recommendation (e.g., the best-selling item). The accuracy measured by the precision-in-top-1 metric (i.e., the percentage of truly "high" ratings among those that were predicted to be "high" by the recommender system) is 82 percent, but only 49 popular items out of approximately 2,000 available distinct items are recommended. The system can improve the diversity of recommendations from 49 up to 695 by recommending the long-tail item to each user (i.e., the least popular item among the highly predicted items for each user) instead of the popular item. However, high diversity in this case is obtained at a significant expense of accuracy, i.e., a drop from 82 to 68 percent. The above example shows that it is possible to obtain higher diversity simply by recommending less popular items; however, the loss of recommendation accuracy in this case can be substantial. In this paper, we propose a new recommendation approach that can increase the diversity of recommendations with only a minimal (negligible) accuracy loss by using different recommendation ranking techniques. In particular, traditional systems typically rank the relevant items in descending order of their predicted ratings and then recommend the top-N items, resulting in high accuracy. In contrast, the proposed approaches consider additional factors, such as item popularity, when ranking the recommendation list, to substantially increase recommendation diversity while maintaining comparable levels of accuracy [7].

2.5 Product Bundling Strategies
The use of bundle pricing and promotions has been a common marketing practice for a long time. They have been used extensively and more frequently in online retailing to improve sales and to avoid the direct price competition pressure caused by shopbots. Shopbots claim that they can retrieve, process and present product-related information at low cost and therefore greatly affect the efficiency and behavior of sales campaigns [2]. None of the recommender algorithms considers the possible savings that can be obtained from bundle pricing and promotions when making recommendations to users. The GRAB algorithm takes advantage of the fact that there are prices for each of the individual items, a condition that is not true in general set covering problems. In terms of objective function value, the performance of GRAB is consistent: it is able to identify the freebies, and its objective value is approximately 1% greater than that of an optimal solver such as CPLEX [2].

Product bundles are classified into three modes of bundling structure, namely pure bundling, mixed bundling, and component selling (pure unbundling). In pure bundling, individual items are not offered. Mixed bundling is a combination of pure bundling and component selling. On the other hand, pure bundling is not a concept of interest, since one can simply consider "components" to be minimal sets of goods that can be purchased individually. The common bundling strategies that have been implemented by retailers are listed as follows [2]:
1. Deterministic bundling: Exactly one set of predetermined items is included in the bundle.
2. Non-deterministic bundling: This includes, for instance, the following:
(i) Tie-in bundling: The buyer is required to buy one major product (e.g., a digital camera or mp3 player) to qualify for discounted prices on other products (e.g., three software products for $48). Usually there are comprehensive lists of both the major and the tied-in products.
(ii) Add-on bundling: The buyer is required to buy one product (e.g., a wireless router) to get a free product (e.g., a wireless card).
(iii) Cross promotion: The buyer is required to purchase one product to qualify for a discount on another product. However, the buyer has the option of not purchasing the additional product.
(iv) Total value discount: If the total amount of an order is above a certain threshold, the order gets an extra discount.
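To make two of these strategies concrete, the following sketch prices a cart under add-on bundling (ii) and a total value discount (iv). All names, prices, thresholds and discount rates are hypothetical and chosen only for illustration; real retailers define these rules in their own ways.

```python
def add_on_price(cart, prices, required_item, free_item):
    """Add-on bundling: free_item costs nothing when required_item is in the cart."""
    total = 0.0
    for item in cart:
        if item == free_item and required_item in cart:
            continue  # the add-on is free because the required product was bought
        total += prices[item]
    return total


def total_value_discount(order_total, threshold, discount_rate):
    """Total value discount: orders above the threshold get an extra discount."""
    if order_total > threshold:
        return order_total * (1.0 - discount_rate)
    return order_total
```

With hypothetical prices of 60 for a wireless router and 25 for a wireless card, `add_on_price(["router", "card"], {"router": 60.0, "card": 25.0}, "router", "card")` charges only for the router, while buying the card alone is charged at its full price.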
3. PROPOSED SYSTEM
3.1 Problem Definition
In current recommender systems, existing rating techniques are used individually according to the system's requirements, such as the neighborhood-based CF techniques (user-based CF and item-based CF), matrix factorization, Bayesian classification and so on. These techniques are effective, but relying on individual diversity still leads the system to recommend products to users in multiple fashions, so users may be confused when choosing the products of interest from the recommendation list. The standard ranking approach is easy to implement because it needs no information other than the product rating values. But ignoring important aspects of commercial applications, such as user location profile, availability, user likeability and neighbors' preferences, may lead the system to recommend products with less diversity.

The collaborative filtering technique used in most recommender systems focuses only on recommendation accuracy, in order to recommend products according to the user's search preferences and to increase user attention towards the commercial portal. If such systems aim for more diversity in recommendation, the result is a greater loss of precision. Current systems offer no trade-off between recommendation accuracy and diversity. Current recommender systems are also deployed to recommend one product at a time. If a bundle offer is available on the portal, the CF technique is unable to handle it, and the offer will not be recommended to the user who wants it.

Most search algorithms use an indexing technique in order to find the required result as quickly as possible. Many indexing techniques, such as forward indexing, reverse indexing and inverted indexing, have been proposed, and their effectiveness depends on the system implementation. Finally, most search engines, including Google and Yahoo, ignore some word preprocessing steps such as stemming and removal of stop words. Stop words are words like "a", "the", "of" and "to", which are so common that nearly every document contains them. A stop word list contains the words to ignore when indexing the document collection.

3.2 Aggregated Rating Functions
In order to predict the most appropriate rating for unrated products, three rating functions from the categories of heuristic-based and model-based rating functions are used [4]. The item-based and user-based CF techniques from the heuristic category and the matrix factorization technique from the model-based category are proposed to determine the products' ratings. A rating function uses either product-related information or user preferences to predict the unknown rating values [4] [5]. Finally, the average of the rating values calculated above is computed; this average is expected to be more appropriate than any individual result.

3.2.1 Neighborhood-Based CF Technique
There exist multiple variations of neighborhood-based CF techniques. In this paper, to estimate the rating that user u would give to item i, we first compute the similarity between user u and other users u' using a cosine similarity metric [7]:

\[ sim(u,u') = \frac{\sum_{i \in I(u,u')} R(u,i) \, R(u',i)}{\sqrt{\sum_{i \in I(u,u')} R(u,i)^2} \, \sqrt{\sum_{i \in I(u,u')} R(u',i)^2}} \]

\[ R_U(u,i) = \bar{R}(u) + \frac{\sum_{u' \in N(u)} sim(u,u') \, (R(u',i) - \bar{R}(u'))}{\sum_{u' \in N(u)} |sim(u,u')|} \]

where $I(u,u')$ represents the set of all items rated by both user u and user u'. Based on the similarity calculation, the set $N(u)$ of nearest neighbors of user u is obtained. The size of the set $N(u)$ can range anywhere from 1 to $|U| - 1$, i.e., all other users in the data set. Then $R_U(u,i)$ is calculated as the adjusted weighted sum of all known ratings $R(u',i)$, where $u' \in N(u)$, and $\bar{R}(u)$ represents the average rating of user u. A neighborhood-based CF technique can be user-based or item-based, depending on whether the similarity is calculated between users or between items. The formulas above represent the user-based approach, but they can be straightforwardly rewritten for the item-based approach because of the symmetry between users and items in all neighborhood-based CF calculations:

\[ sim(i,i') = \frac{\sum_{u \in U(i,i')} R(u,i) \, R(u,i')}{\sqrt{\sum_{u \in U(i,i')} R(u,i)^2} \, \sqrt{\sum_{u \in U(i,i')} R(u,i')^2}} \]

\[ R_I(u,i) = \bar{R}(i) + \frac{\sum_{i' \in N(i)} sim(i,i') \, (R(u,i') - \bar{R}(i'))}{\sum_{i' \in N(i)} |sim(i,i')|} \]

where, by the same symmetry, $U(i,i')$ is the set of users who rated both item i and item i', $N(i)$ is the set of nearest neighbors of item i, and $\bar{R}(i)$ is the average rating of item i. In this paper, both user-based and item-based approaches are proposed for rating estimation [3].

3.2.2 Matrix Factorization CF Technique
Matrix factorization techniques have recently gained popularity in recommender system applications because of their effectiveness in improving recommendation accuracy [7] [13]. Many variations of matrix factorization techniques have been developed to address the problems of data sparsity, overfitting and convergence speed. We propose the basic version of this technique, with the assumption that a user's rating for an item is composed of a sum of preferences about the various features of that item [7]. Using K features, user u is associated with a user-factor vector $p_u$ (the user's preferences for the K features), and item i is associated with an item-factor vector $w_i$ (the item's weights for the K features) [10]. The preference of how much user u likes item i, denoted by $R_M(u,i)$, is predicted by taking the inner product of the two vectors:

\[ R_M(u,i) = p_u^T w_i \]
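The rating functions of this section can be illustrated with a short sketch covering the cosine similarity over co-rated items, the matrix factorization inner product, and the averaging of the three estimates. The function names and the dictionary-based rating layout are assumptions made for the example, and the factor vectors are taken as given rather than learned.

```python
import math


def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity restricted to the items rated by both users, I(u, u').

    ratings_a, ratings_b: dicts mapping item id -> rating (hypothetical layout).
    """
    common = ratings_a.keys() & ratings_b.keys()
    if not common:
        return 0.0  # no co-rated items: treat as no similarity (assumption)
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = math.sqrt(sum(ratings_a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(ratings_b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mf_predict(user_factors, item_factors):
    """Matrix factorization estimate: inner product of p_u and w_i."""
    return sum(p * w for p, w in zip(user_factors, item_factors))


def aggregate_rating(r_user_based, r_item_based, r_mf):
    """Average of the user-based, item-based and matrix factorization estimates."""
    return (r_user_based + r_item_based + r_mf) / 3.0
```

The same `cosine_similarity` function serves the item-based approach if each dictionary maps user ids to the ratings one item received, which reflects the user-item symmetry noted above.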
With the matrix factorization estimate $R_M(u,i) = p_u^T w_i$, the final predicted average rating value is

\[ R^*(u,i) = \frac{Round(R_U(u,i) + R_I(u,i) + R_M(u,i))}{3} \]

3.3 Proposed Approach: Popularity-Based Ranking
The item-popularity-based ranking approach ranks items directly based on their popularity, from lowest to highest, where popularity is represented by the number of known ratings that each item has. More formally, the item-popularity-based ranking function can be written as follows [7]:

\[ rank(i) = |U(i)|, \quad \text{where} \quad U(i) = \{ u \in U \mid \exists R(u,i) \} \]

The item-popularity-based ranking approach, as well as the other ranking approaches proposed in this paper, is parameterized with a "ranking threshold" $T_R \in [T_H, T_{max}]$ (where $T_{max}$ is the largest possible rating on the rating scale, e.g., $T_{max} = 5$) to give the user the ability to choose a certain level of recommendation accuracy. In particular, given any ranking function $rank_X(i)$, the ranking threshold $T_R$ is used to create the parameterized version of this ranking function, $rank_X(i, T_R)$, which is formally defined as [6]:

\[ rank_X(i, T_R) = \begin{cases} rank_X(i), & \text{if } R^*(u,i) \in [T_R, T_{max}] \\ \alpha_u + rank_{std}(i), & \text{if } R^*(u,i) \in [T_H, T_R) \end{cases} \]

where

\[ I_u^*(T_R) = \{ i \in I \mid R^*(u,i) \geq T_R \}, \qquad \alpha_u = \max_{i \in I_u^*(T_R)} rank_X(i) \]

Simply put, items that are predicted above the ranking threshold $T_R$ are ranked according to $rank_X(i)$, while items that are below $T_R$ are ranked according to the standard ranking approach $rank_{std}(i)$. In addition, all items that are above $T_R$ are ranked ahead of all items that are below $T_R$ (as ensured by $\alpha_u$ in the above formal definition). Therefore, choosing different $T_R$ values between $T_H$ and $T_{max}$ allows the user to set the desired balance between accuracy and diversity.

3.4 Additional Ranking Approaches
Here we introduce some additional ranking approaches that can be used as alternatives to $rank_{std}$ to improve recommendation diversity, alongside the standard and item-popularity-based ranking approaches. We also consistently observed that popular items, on average, are likely to have higher predicted ratings than less popular items, across different traditional recommendation techniques [6]. Therefore, recommending items that are not as highly predicted (but still predicted to be above $T_H$) means recommending, on average, less popular items, potentially leading to diversity improvements; following this observation, we propose to use the predicted rating value itself as an item ranking criterion. Based on similar empirical observations, we propose a number of alternative ranking approaches, which are based on average rating, absolute likeability and relative likeability [6] [7].

3.5 Impact of Proposed Ranking Approaches on the Distribution of Recommended Items
Since we measure recommendation diversity as the total number of distinct items recommended across all users, one could argue that, while diversity can easily be improved by recommending a few new items to some users, it may not be clear whether the proposed ranking approaches would be able to shift the overall distribution of recommended items towards more idiosyncratic, long-tail recommendations. Therefore, in this subsection we explore how the proposed ranking approaches change the actual distribution of recommended items in terms of their popularity. Following the popular "80-20 rule", or Pareto principle, we define the top 20% of the most frequently rated items in the training data set as "bestsellers" and the remaining 80% of items as "long-tail" items. For example, with the standard ranking approach, long-tail items make up only 16 percent of total recommendations (i.e., 84 percent of recommendations were of bestsellers) when recommending the top-5 items to each user using the item-based CF technique on the data set, confirming findings in prior literature that recommender systems often gravitate toward recommending bestsellers and not long-tail items. However, the proposed ranking approaches are able to recommend significantly more long-tail items with a small level of accuracy loss, and this distribution becomes even more skewed toward long-tail items if more accuracy loss can be tolerated. For example, with 1 percent precision loss, the percentage of recommended long-tail items can be increased from 16 to 32 percent with the item-popularity and item-absolute-likeability-based approaches. With 2.5 or 5 percent precision loss, the proportion of long-tail items can grow to 43 and 58 percent, respectively (e.g., using item-popularity ranking).
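The parameterized item-popularity ranking can be sketched as below. This is a minimal illustration under stated assumptions: the function name, argument names and example data are hypothetical, predicted ratings are assumed to be positive, and lower rank values are recommended first.

```python
def parameterized_popularity_ranking(predicted, popularity, t_r, t_h, top_n):
    """Rank one user's candidate items with a ranking threshold T_R.

    predicted: dict item -> predicted rating R*(u, i).
    popularity: dict item -> number of known ratings |U(i)|.
    t_r: ranking threshold T_R; t_h: minimum acceptable predicted rating T_H.
    """
    # Only items predicted at or above T_H are candidates at all.
    candidates = {i: r for i, r in predicted.items() if r >= t_h}
    above = [i for i, r in candidates.items() if r >= t_r]
    # alpha_u: the largest popularity rank among items above T_R, so that
    # every item below T_R is placed after all of them.
    alpha = max((popularity[i] for i in above), default=0)

    def rank(item):
        rating = candidates[item]
        if rating >= t_r:
            return popularity[item]       # rank_X(i): least popular first
        return alpha + 1.0 / rating       # alpha_u + rank_std(i)

    return sorted(candidates, key=rank)[:top_n]
```

Setting `t_r` to the maximum rating reduces the result to the standard ranking, while setting it to `t_h` ranks every candidate purely by popularity; values in between trade accuracy against diversity.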
This expected result provides further support for the claim that the proposed ranking approaches increase not just the number of distinct items recommended but also the proportion of recommended long-tail items, confirming that the proposed techniques truly contribute toward more diverse and idiosyncratic recommendations across all users.

3.6 Bundled Products Search: GRAB Algorithm
For consumer purchases that involve only a few items, an optimal solution of CB can easily be found by generic integer programming or specialized set covering algorithms. However, if we want to use the same model to help organizations decide on an optimal purchase plan for a large number of items, even specialized algorithms might take a long time to reach optimality. On the other hand, even if the number of items is small, as in the case of individual consumers, the huge number of simultaneous requests that could be received by the shopbots can still pose a challenge [2]. Even if only a small number of shoppers send in bundle search requests at any point in time, it would take at least several seconds to solve all the requests to optimality. Usability research shows that a delay of more than 10 seconds results in loss of user attention, so a waiting time of even a few seconds could lead to user attrition. A better algorithm that can solve the problem quickly is therefore useful if the number of items to be purchased is large and/or there are potential benefits from shaving off valuable seconds in responding to a large number of shopper requests. We therefore propose the "Greedy Addition of Bundles" (GRAB) algorithm [2].

Note that at each step the algorithm chooses a bundle that includes at least one desired item not purchased at previous steps. The chosen bundle has the lowest ratio of its cost to the individual costs of the desired items in the bundle [2]. GRAB takes advantage of the fact that there are prices for each of the individual items, a condition that is not true in general set covering problems.
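The greedy selection rule described in this section can be sketched as follows. This is an illustrative reading of GRAB rather than the reference implementation from [2]; the data layout, the names and the example prices are assumptions, single items are modeled as one-item bundles, and individual item prices are assumed to be positive.

```python
def grab(desired_items, bundles, item_prices):
    """Greedy Addition of Bundles: cover the desired items with cheap bundles.

    desired_items: iterable of items the shopper wants.
    bundles: dict bundle name -> (bundle price, set of items in the bundle).
    item_prices: dict item -> individual price of the item.
    """
    remaining = set(desired_items)   # desired items not yet purchased
    available = dict(bundles)        # bundles not yet chosen
    chosen = []
    while remaining:
        best_name, best_ratio = None, float("inf")
        for name, (price, items) in available.items():
            covered = items & remaining
            if not covered:
                continue  # bundle adds no desired item that is still missing
            # Ratio of bundle cost to the individual costs of the items it covers.
            ratio = price / sum(item_prices[i] for i in covered)
            if ratio < best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            raise ValueError("some desired items are not offered by any bundle")
        chosen.append(best_name)
        remaining -= available.pop(best_name)[1]
    return chosen
```

In a hypothetical catalog where a camera kit at 320 contains a camera (300), a memory card (40) and a bag (30), a shopper who wants all three items is pointed to the kit, whereas a shopper who wants only the camera is pointed to the single-item offer.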
3.7 Quality-Based Products Search
Once the user selects a product from the recommendation list to view its details, the Quality-Based Search (QBSE) engine begins processing the item details and fetches the relevant products that best match the currently viewed product according to predefined quality parameter metrics, which vary from domain to domain in online sales. The engine matches the product's attributes against those of similar products and their feedback to find the most relevant products of more or less the same quality. This approach uses an inverted index table to fetch the product IDs for all stem words, matches the other quality attributes, and produces the quality-based recommendation result set for the user. Quality-based recommendation is performed only within the same category; extending the quality-based search to all categories is left as further work. The inverted indexing technique is used to index the product details to improve product search. In the proposed system, both record-level and word-level inverted indexing are used to find the products and to sort the search results before showing them to the user. Word-level indexing is used to search for the best-matching words in the product base, and record-level indexing is used to order the product recommendation list. The best-matched product is placed at the top of the recommendation list.
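A minimal sketch of the two index types is given below. It is a simplification under stated assumptions: the function names and the sample product texts are hypothetical, stemming and the semantic library are omitted, and the search simply counts matching query words instead of applying the quality parameter metrics.

```python
from collections import defaultdict

# Stop word examples taken from the paper; a real list would be longer.
STOP_WORDS = {"a", "the", "of", "to"}


def tokenize(text):
    """Lowercase, split on whitespace and drop stop words (no stemming here)."""
    return [word for word in text.lower().split() if word not in STOP_WORDS]


def build_indexes(products):
    """Build a record-level and a word-level inverted index.

    products: dict product id -> description text.
    Record-level index: word -> set of product ids containing the word.
    Word-level index: word -> {product id: [positions of the word]}.
    """
    record_index = defaultdict(set)
    word_index = defaultdict(lambda: defaultdict(list))
    for product_id, text in products.items():
        for position, word in enumerate(tokenize(text)):
            record_index[word].add(product_id)
            word_index[word][product_id].append(position)
    return record_index, word_index


def search(query, record_index):
    """Return product ids ordered by the number of matching query words."""
    scores = defaultdict(int)
    for word in tokenize(query):
        for product_id in record_index.get(word, ()):
            scores[product_id] += 1
    return sorted(scores, key=lambda pid: (-scores[pid], pid))
```

Positions in the word-level index are counted after stop word removal, which is a design choice of this sketch; a production index would typically also store the original offsets.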
Algorithm: GReedy Additions of Bundles (GRAB)
Let t index the iterations, $M_t$ be the items in $S_0$ not yet purchased at iteration t, and $N_t$ be the bundles of B that have not been chosen at iteration t. Let $f_j$ be the cost of bundle j, $S_j$ the set of items in bundle j, and $c_i$ the individual cost of item i.
Initialization Step: Set $M_1 = S_0$, $N_1 = B$, and $t = 1$.
Bundle Addition Step: Select a bundle $j_t \in N_t$ that minimizes $f_j / \sum_{i \in S_j \cap M_t} c_i$.
Iteration Step: Set $N_{t+1} = N_t \setminus \{j_t\}$, $M_{t+1} = M_t \setminus S_{j_t}$, and $t = t + 1$. If $M_t \neq \emptyset$, go to the Bundle Addition Step.
Bundle Generation Step: The solution is $x_j = 1$ for the chosen bundles (those no longer in $N_t$) and $x_j = 0$ otherwise.

4. CONCLUSION
In this paper, the proposed quality-based recommender system addresses the problem of one-product-at-a-time recommendation by processing and searching for bundle offers. The GRAB method is used to search for bundle product recommendations. Aggregate-diversity-preserving techniques are used to produce product recommendations while preserving the accuracy-diversity trade-off. All the rating and ranking techniques used have already been shown to produce high-quality results, and a combination of more than one rating technique is expected to produce a better result set than any technique used individually. A re-ranking approach is used to refine the calculated rank in order to produce diversity-preserving recommendations. To improve the product retrieval process, a semantic library and an optimized inverted index are used. Furthermore, this work can be extended by considering sequential purchases and promotional purchases, and by implementing all the additional ranking techniques mentioned in Section 3.4 to further improve the recommendation results.
5. REFERENCES
[1] Ajit Kumar Mahapatra, Sitanath Biswas (July 2011), "Inverted indexes: Types and techniques", IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 4, No 1, pp 384-392.
[2] Arvind Tripathi, Fang Yin, Ram Gopal & Robert Garfinkel (July 2006), "Design of a shopbot and recommender system for bundle purchases", Decision Support Systems 42, pp 1974-1986.
[3] Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl (May 2001), "Item-Based Collaborative Filtering Recommendation Algorithms", GroupLens Research Group/Army Research Center, ACM 1581133480/01/0005.
[4] Barry Smyth & Keith Bradley (July 2007), "Improving Recommendation Diversity".
[5] Cai-Nicolas Ziegler & Sean M. McNee (May 2005), "Improving Recommendation Lists Through Topic Diversification", ACM 1595930469/05/0005.
[6] Gediminas Adomavicius & Youngok Kwon (2009), "Toward More Diverse Recommendations: Item Re-Ranking Methods For Recommender Systems", 19th Workshop on Information Technologies and Systems, Phoenix 2009.
[7] Gediminas Adomavicius & Youngok Kwon (May 2012), "Improving Aggregate Recommendation Diversity Using Ranking-Based Techniques", IEEE TKDE, Vol. 24, No. 5, pp 896-911.
[8] Jon Herlocker & Mark O'Connor (August 2009), "Clustering Items for Collaborative Filtering".
[9] Leo Iaquinta, Marco de Gemmis & Pasquale Lops (January 2002), "Preference Learning in Recommender Systems".
[10] Mingrui Wu (August 2007), "Collaborative Filtering via Ensembles of Matrix Factorizations", KDDCup.07, San Jose, California, USA.
[11] Prem Melville & Vikas Sindhwani (March 2008), "Recommender Systems".
[12] SongJie Gong (July 2010), "A Collaborative Filtering Recommendation Algorithm Based on User Clustering and Item Clustering", Journal of Software, Vol. 5, No. 7, July 2010.
[13] Yehuda Koren, Robert Bell and Chris Volinsky (2009), "Matrix Factorization Techniques For Recommender Systems", Published by the IEEE Computer Society 0018-9162/09.
[14] Zan Huang (January 2004), "Selectively Acquiring Ratings for Product Recommendation".