2013 IEEE 2nd Global Conference on Consumer Electronics (GCCE)
A Rating Prediction Method for E-Commerce application Using Ordinal Regression based on LDA with Multi-modal Features
Takayuki Kawashima, Takahiro Ogawa and Miki Haseyama
Graduate School of Information Science and Technology, Hokkaido University
N-14, W-9, Kita-ku, Sapporo, Hokkaido, 060-0814, Japan
Email: {kawashima, ogawa}@lmd.ist.hokudai.ac.jp, [email protected]

Abstract—This paper presents a new method for rating prediction in e-commerce that uses ordinal regression based on linear discriminant analysis (LDA) with multi-modal features. To realize accurate recommendation in e-commerce, the proposed method estimates each user’s rating for target items, where the rating is defined as “the degree of preference for each item by a user.” To estimate the target user’s preference for each item from the past ratings of other items, the proposed method trains an LDA-based ordinal regression on pairs of item ratings and their feature vectors. Furthermore, new features are obtained by applying canonical correlation analysis (CCA) to textual and visual features extracted from review texts and from images on the Web, respectively. As a result, the proposed method achieves higher rating prediction performance than is obtained with a single kind of feature. Experimental results obtained by applying the proposed method to an actual movie data set provided by SNAP show its effectiveness.
I. INTRODUCTION
Recently, due to the explosive growth of e-commerce sites such as Amazon¹ and eBay², we are surrounded by a large amount of product information. Effective recommendation systems have therefore become necessary for helping each user find desired items. Traditionally, items are recommended based on the past ratings that users have given to other items. Specifically, Collaborative Filtering (CF) [1] is a traditional method for realizing this scheme. Although CF recommends items based on the favorite items of similar users, its performance tends to degrade as the number of similar users decreases. Content-based Filtering (CB) [1] has therefore been proposed to model each user’s favorite items directly; that is, trends of features in the favorite items are estimated and used to rate target items. However, conventional CB-based rating prediction methods use only a single kind of feature, which makes accurate prediction difficult.

In this paper, to solve this problem of the conventional methods, we propose a new rating prediction method using ordinal regression based on linear discriminant analysis (LDA) [2] with multi-modal features. The proposed method extracts textual and visual features for the reviews that each user has rated and merges them into latent features based on canonical correlation analysis (CCA) [3]. Then, to estimate the degree of the user’s preference, the proposed method trains an LDA-based ordinal regression using these features and the rating information of each user. Consequently, the ratings of target items can be predicted with the trained LDA-based ordinal regression, and higher performance than that of conventional methods becomes feasible. This paper also shows the effectiveness of the proposed method through experimental results obtained from an actual movie dataset of SNAP [4]³.

¹ http://www.amazon.com/
² http://www.ebay.com/

978-1-4799-0892-9/13/$31.00 ©2013 IEEE

Fig. 1. An overview of the proposed method. [Figure: block diagram with the labels Feature Representation (From reviews, Textual features, Image features, PCA, CCA) and Rating Prediction (Multi-modal features, Ordinal Regression).]

II. PROPOSED RATING PREDICTION METHOD
This section presents the rating prediction method using LDA-based ordinal regression with multi-modal features. In the proposed method, we apply tf-idf weighting [5] to the reviews of each item to obtain textual features, and we extract Histograms of Oriented Gradients (HOG) [6] from images related to these items to obtain visual features. Furthermore, the proposed method applies CCA to these features and performs ordinal regression using the canonical variates as features. In this way, we realize rating prediction using ordinal regression based on LDA with multi-modal features. An overview of the proposed method is shown in Fig. 1.

A. Feature Representation

In this subsection, we explain the features used in the proposed method. First, we collect the keywords that appear in the $N$ reviews of each user. Then, by applying the tf-idf method to these keywords in each review, the weights corresponding to the keywords are calculated as textual features. By aligning the obtained weights, the feature vectors $\mathbf{w}_i$ ($i = 1, 2, \cdots, N$) are obtained. Furthermore, we apply principal component analysis (PCA) to $\mathbf{w}_i$, and a new vector $\mathbf{t}_i$ is obtained as follows:

$$\mathbf{t}_i = \mathbf{P}(\mathbf{w}_i - \bar{\mathbf{w}}) \quad (i = 1, 2, \cdots, N), \tag{1}$$

where $\mathbf{P}$ is an eigenvector matrix obtained by applying PCA to $\mathbf{w}_i$, and $\bar{\mathbf{w}}$ is the average vector of $\mathbf{w}_i$. In our method, the obtained vectors $\mathbf{t}_i$ are defined as the textual feature vectors.

Next, to extract visual features, we collect images that represent the contents of the items from Google Image Search⁴ by providing the name of each item as the search query. When images are collected using Google Image Search, the first five images are traditionally used to reduce the influence of noise [7], and we follow this scheme. We then extract HOG features, which represent the appearance and shape of target objects, from the images, and the feature vectors $\mathbf{h}_i$ ($i = 1, 2, \cdots, N$) are obtained. Then, by applying PCA to $\mathbf{h}_i$, new vectors $\mathbf{v}_i$ are obtained as follows:

$$\mathbf{v}_i = \mathbf{Q}(\mathbf{h}_i - \bar{\mathbf{h}}) \quad (i = 1, 2, \cdots, N), \tag{2}$$

where $\mathbf{Q}$ is an eigenvector matrix obtained by applying PCA to $\mathbf{h}_i$, and $\bar{\mathbf{h}}$ is the average vector of $\mathbf{h}_i$. The vectors $\mathbf{v}_i$ are defined as the visual feature vectors.

Finally, to merge the textual and visual features given by Eqs. (1) and (2), CCA is applied to them. We thus obtain vectors containing $d$-dimensional canonical variates as follows:

$$\mathbf{u}_i = \mathbf{A}\mathbf{t}_i \quad (i = 1, 2, \cdots, N), \tag{3}$$

$$\mathbf{s}_i = \mathbf{B}\mathbf{v}_i \quad (i = 1, 2, \cdots, N), \tag{4}$$

where $d$ is the dimension of the latent spaces, and $\mathbf{A}$ and $\mathbf{B}$ are coefficient matrices obtained by CCA. By applying CCA, latent features can therefore be calculated from each of the two types of features, i.e., the textual and visual features.

B. Rating Prediction for Item Recommendation

In this subsection, we explain the rating prediction method using ordinal regression based on LDA. Let $y_i \in \{1, 2, \cdots, K\}$ ($K$ being the number of classes) be the class label representing the rating of the $i$-th item, and let $\{\mathbf{x}_i \mid i = 1, 2, \cdots, N\}$ denote the feature vectors, which include the canonical variates in Eqs. (3) and (4). Specifically, we obtain the optimal direction $\hat{\mathbf{w}}$ of $\mathbf{w}$ by solving the following problem:

$$\min J(\mathbf{w}, \rho) = \mathbf{w}^{\top}\mathbf{S}_w\mathbf{w} - C\rho, \quad \text{s.t.}\ \mathbf{w}^{\top}(\mathbf{m}_{k+1} - \mathbf{m}_k) \geq \rho \quad (k = 1, 2, \cdots, K - 1), \tag{5}$$

where $C$ is a penalty coefficient, $\mathbf{S}_w$ is the within-class scatter matrix of $\mathbf{x}_i$, and $\mathbf{m}_k$ is the mean vector of the training samples in class $k$, following [2]. Furthermore, our method predicts the label $y^{\mathrm{predict}}$ of an unseen input $\mathbf{x}_{\mathrm{new}}$ by the following decision rule:

$$y^{\mathrm{predict}} = \min_{k \in \{1, \cdots, K\}} \{k : \hat{\mathbf{w}}^{\top}\mathbf{x}_{\mathrm{new}} - b_k < 0\}, \tag{6}$$

where $b_k$ is defined as $b_k = \hat{\mathbf{w}}^{\top}(N_{k+1}\mathbf{m}_{k+1} + N_k\mathbf{m}_k)/(N_{k+1} + N_k)$, with $N_k$ here denoting the number of training samples in class $k$. The rating of the new input is then given by $y^{\mathrm{predict}}$.

³ http://snap.stanford.edu/data/
⁴ http://images.google.com/

TABLE I. EXPERIMENT RESULT

Features      Accuracy
multi-modal   0.428
visual        0.293
textual       0.381
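As an illustration of the textual features of Section II-A, the following is a minimal sketch of tf-idf weighting followed by the mean subtraction that appears in Eq. (1). The paper does not state which tf-idf variant or tokenizer is used, so the whitespace tokenization, the formula tf = count / review length, idf = log(N / df), and the function names `tfidf_vectors` and `center` are all assumptions made for this example. The projection by the PCA eigenvector matrix P is omitted.

```python
import math
from collections import Counter


def tfidf_vectors(reviews):
    """Return (vocabulary, vectors): one tf-idf weight vector per review.

    Assumed variant (not specified in the paper):
    tf = term count / review length, idf = log(N / document frequency).
    """
    tokenized = [review.lower().split() for review in reviews]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    n_reviews = len(tokenized)
    # Document frequency: number of reviews that contain each keyword.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        length = len(tokens) or 1  # guard against empty reviews
        vectors.append([
            (counts[word] / length) * math.log(n_reviews / doc_freq[word])
            for word in vocabulary
        ])
    return vocabulary, vectors


def center(vectors):
    """Subtract the average vector, i.e. the (w_i - w_bar) term of Eq. (1)."""
    n = len(vectors)
    mean = [sum(column) / n for column in zip(*vectors)]
    return [[x - m for x, m in zip(vec, mean)] for vec in vectors]
```

A keyword that occurs in every review receives weight zero under this variant, which is why common words contribute nothing to the textual feature vector.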
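The decision rule of Eq. (6) and the thresholds b_k of Section II-B can be sketched as follows. This is an illustrative example only: it assumes the direction vector (called `w_hat` here) has already been obtained by solving Eq. (5), which the sketch does not do, and it assumes that an input exceeding every threshold is assigned the highest class K (equivalent to treating b_K as +infinity). The function names are hypothetical.

```python
def class_thresholds(w_hat, class_means, class_sizes):
    """Compute b_k = w_hat . (N_{k+1} m_{k+1} + N_k m_k) / (N_{k+1} + N_k).

    class_means[k] and class_sizes[k] hold the mean vector and the number of
    training samples of the (k+1)-th rating class, ordered from lowest rating.
    Returns the K-1 thresholds b_1, ..., b_{K-1}.
    """
    thresholds = []
    for k in range(len(class_means) - 1):
        n_k, n_next = class_sizes[k], class_sizes[k + 1]
        # Size-weighted midpoint between the means of adjacent classes.
        midpoint = [
            (n_next * m_next + n_k * m_cur) / (n_next + n_k)
            for m_cur, m_next in zip(class_means[k], class_means[k + 1])
        ]
        thresholds.append(sum(w * m for w, m in zip(w_hat, midpoint)))
    return thresholds


def predict_rating(w_hat, thresholds, x_new):
    """Return the smallest k with w_hat . x_new - b_k < 0.

    Assumption: if no threshold satisfies the condition, the highest
    class K = len(thresholds) + 1 is returned.
    """
    score = sum(w * x for w, x in zip(w_hat, x_new))
    for k, b_k in enumerate(thresholds, start=1):
        if score - b_k < 0:
            return k
    return len(thresholds) + 1
```

For example, with three classes whose projected means are 0, 2, and 4 and equal class sizes, the thresholds fall at 1 and 3, so a projected score of 2 is assigned rating 2.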
III. EXPERIMENTAL RESULTS
In this section, we verify the effectiveness of the proposed method. In this experiment, we focused on users’ reviews provided in the Movies dataset and used 420 reviews. The HOG descriptor has 1296 dimensions, and the total number of keywords that appear in the users’ reviews is 6995. After applying PCA, the dimensions of the textual features and image features became 99 and 124, respectively. Applying CCA yields two kinds of canonical variates, one from text and one from images; in this experiment, we used the canonical variates from images. In the ordinal regression, if the predicted label $y^{\mathrm{predict}}$ of an unseen input $\mathbf{x}_{\mathrm{new}}$ is the same as the correct label $y^{\mathrm{true}}$, we regard the rating prediction as correct. To verify the effectiveness of the proposed method, accuracy is therefore evaluated as follows:

$$\mathrm{Accuracy} = \frac{\sum_{k=1}^{K} N_k}{N}, \tag{7}$$

where $k$ is the class label, $N$ is the total size of the test data, and $N_k$ is the number of test data correctly predicted as class $k$. The results are shown in TABLE I, where we compare the proposed method with methods using only image features or only textual features. As shown in TABLE I, the multi-modal scheme improves the rating prediction performance.

IV. CONCLUSION
This paper has proposed a new method for rating prediction, which uses ordinal regression based on LDA with multi-modal features. From the experimental results obtained by applying the proposed method to the actual data set, we can see the performance improvement of the rating prediction by using multi-modal features. However, the problem of determining the optimal number of canonical variates remains unsolved. This is a subject of future work.

ACKNOWLEDGMENT

This work was partly supported by Grant-in-Aid for Scientific Research (B) 25280036.

REFERENCES

[1] G. Adomavicius and A. Tuzhilin, “Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions,” IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 734-749, 2005.
[2] B. Y. Sun, J. Li, D. D. Wu, X. M. Zhang, and W. B. Li, “Kernel Discriminant Learning for Ordinal Regression,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 6, pp. 906-910, 2010.
[3] T. W. Anderson, An Introduction to Multivariate Statistical Analysis. Second edition, John Wiley & Sons, 1984.
[4] J. McAuley and J. Leskovec, “From amateurs to connoisseurs: modeling the evolution of user expertise through online reviews,” WWW, 2013.
[5] F. Sebastiani, “Machine learning in automated text categorization,” ACM Computing Surveys, vol. 34, no. 1, pp. 1-47, 2002.
[6] N. Dalal and B. Triggs, “Histograms of Oriented Gradients for Human Detection,” Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 886-893, 2005.
[7] R. Fergus, L. Fei-Fei, P. Perona, and A. Zisserman, “Learning object categories from Google’s image search,” Proceedings of the Tenth IEEE International Conference on Computer Vision, vol. 2, pp. 1816-1823, 2005.