Supervised Learning Approaches for Rating Customer Reviews by Kiran Sarvabhotla, Prasad Pingali, Vasudeva Varma

Published in the Journal of Intelligent Systems, Vol. 19(1), 2010: 79-94

Report No: IIIT/TR/2009/250

Centre for Search and Information Extraction Lab
International Institute of Information Technology
Hyderabad - 500 032, INDIA
December 2009

Supervised Learning Approaches for Rating Customer Reviews Kiran Sarvabhotla, Prasad Pingali, Vasudeva Varma Search and Information Extraction Lab (SIEL). International Institute of Information Technology. Hyderabad, India. [email protected], [email protected], [email protected]

Abstract

Social media has become so popular in recent years that people express their views and thoughts about any product or movie through reviews. Reviews have a great influence on people and the decisions they make. This has led researchers and market analysts to analyze the opinions of users in reviews and to model their preferences. Sometimes reviews are also scored by the customer in terms of a satisfaction score on a product or movie (ratings). These ratings usually vary on a scale from one to five (stars), or from very bad to excellent. In this paper we address the problem of attributing a numerical score (one to five stars) to a review. We view it as a multi-class classification (supervised learning) problem and present two approaches, using Naïve Bayes (NB) and Logistic Regression (LR). We focus on the feature representations of reviews that are widely used, the problems associated with them, and solutions which address those problems.

Key Words: Customer Reviews, Classification, Naïve Bayes, Support Vector Machines, Logistic Regression.

1. INTRODUCTION

With the increase in popularity of online reviews, people have been expressing their opinions or sentiments on products, movies, books, etc. in them. Their sentiments have a great influence on others' decision making [1]. This has led researchers and market analysts to focus on analyzing the sentiments of users, which is helpful in areas like recommendation systems and business applications.


Online reviews may also be attributed with a satisfaction score (rating). Usually these scores are on a scale from one to five, or from very poor to excellent, or thumbs up and thumbs down. For example, a person willing to buy a mobile device will go through the reviews (preferably the ratings) given by others on the same device and arrive at a conclusion; the probability of a user buying the device increases with more positive reviews on it. But it is not necessary for each review to have a score associated with it. Some may have only a textual expression of sentiment with no score. In such a case it is very difficult for users to read the text of each review to make a decision, and a satisfaction score derived for each review by analyzing its textual information is of immense help. Hence tools which do this and make a review more informative to a user (through a rating) are gaining popularity. We address the problem of attributing a satisfaction score (rating) to a review by analyzing the text in it. Our work focuses on rating a review on a discrete scale (1-5) [2] rather than on a binary scale (positive or negative).

The main challenge in rating reviews is to identify the features which are subjective in nature (sentiments) and which lead a user to give a satisfaction score to a review. For example, features like weird, bad and rude are largely found in reviews with ratings of one or two, whereas features like excellent, awesome and beautiful are found in reviews with four or five stars. To derive these subjective features of reviews, researchers have followed lexicon-based approaches, where they either build a lexicon of sentiment words themselves (seed lists) or use lexicons prepared by linguists for sentiment analysis, such as SentiWordNet¹. They predict the orientation of a review to be either positive or negative based on the presence of sentiment words, largely adjectives, in each review [5]: more positive words in a review means the review is tagged as positive, and vice versa. Later researchers focused on giving better representations to reviews by extracting phrases and using those as features; Turney [4] followed an unsupervised approach which calculates the mutual information between subjective phrases and seed words, where the subjective phrases are extracted using Brill's tagger². Pang and Lee [3] showed that standard machine learning techniques outperform lexicon- and rule-based approaches; they used bag-of-words (BOW), part-of-speech (POS) information and sentence position as features for analyzing reviews. The focus then shifted to representing reviews as feature vectors for a learning device, usually Naïve Bayes or SVM. But these feature extraction methods are still dependent on tools like a POS tagger [4]. Moreover, Stefano and Andrea [2] have emphasized that a simple BOW representation of a review is not sufficient for a learning device to accurately predict the sentiments of users, and that better features with larger text units are needed. They used a POS tagger, a sentiment lexicon and a predefined set of rules for extracting patterns containing larger text units which are subjective in nature [2]. The problem with these approaches is that they are not scalable and cannot be extended to multiple languages and domains, because building such sentiment lexicons requires a lot of human effort; also, the notion of sentiment changes from context to context and from domain to domain.

This motivated us to explore better features for reviews without using any lexicon or POS tagger.

¹ http://sentiwordnet.isti.cnr.it/
² http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/nlp/parsing/taggers/brill/0.html


We propose two methods, word association (WA) and term co-occurrence (CO), for extracting features that represent reviews in a better way. We conducted experiments on features obtained using both the WA and CO methods along with BOW. Both WA and CO are based on corpus statistics alone; using them we were able to achieve reasonable performance in predicting the ratings of reviews, and to the best of our knowledge these two feature extraction methods have not previously been explored for predicting ratings of reviews. We view this problem as a multi-class classification problem where each review is mapped to a label on a discrete scale from one to five, rather than a binary one (positive or negative). We conducted experiments using our feature extraction methods with two supervised machine learning approaches: Naïve Bayes (NB), which is widely used for text categorization, and linear classification with Logistic Regression (LR). LR is used in applications where human preferences play a major role, and also in marketing applications like predicting user propensity to buy a product. Our process briefly includes these three steps: (1) extract features using the BOW, WA and CO methods and represent each review as a feature vector using these three patterns; (2) select the top 'n' features in each pattern using feature selection methods; (3) give these features as an input vector to a learning device for training, where the device builds a model which is later used to predict ratings on unseen examples. We present detailed observations on each feature extraction method and its effect on scoring reviews, with the baseline being BOW features with no feature selection.

The paper is organized as follows. Section 2 describes the work related to customer reviews, rating customer reviews and sentiment analysis. Section 3 gives a brief overview of the learning methods we have used to train the classifier. Section 4 describes the methods we have used to extract features for reviews. Section 5 describes feature selection. In Section 6 we describe our experiments, the dataset used to conduct them and the evaluation metrics. In Section 7 we discuss the results of experiments on the dataset using different features and classification methods. Finally, we conclude our work in Section 8.

2. RELATED WORK

In this section we discuss the work related to sentiment analysis and rating prediction and compare our approaches with it. Nam et al. [10] used statistical means to predict the orientation of sentiment at the phrase level. They determine the score of each word based on term frequencies, along with the sentiment score associated with each sentence, which is distributed on a scale of 1-10 with 10 being most positive; words whose scores lie in 6-10 are automatically tagged as positive and those in 1-4 as negative. They used CRF and HMM as classification models on n-grams. They classify each phrase as positive or negative, which is simpler than classifying on a scale of 1-5; our work focuses on multi-class rather than binary sentiment classification.


Chan and King [11] investigated the importance of feature-opinion association in sentiment analysis for building lexicons. They defined features as attributes and components of products, and opinion words as adjectives and adverbs. They used a combination of nearest opinion word (DIST), co-occurrence ratio (COR) and likelihood-ratio test (LHR) to determine the relation between opinion word and feature, using a POS tagger to extract the adjectives and adverbs. We instead associate words based on corpus statistics rather than using a POS tagger.

Most of the work in sentiment analysis uses either a lexicon or a classification approach on human-annotated documents to accurately predict the orientation of sentiments in a review. Melville, Gryc and Lawrence [12] presented a unified framework which uses both lexicon and classification approaches, applying it to sentiment analysis of blogs across different domains. They used a Naïve Bayes classifier, preprocessed the training set using a sentiment lexicon, and pooled the two approaches to show that they work better in combination than in isolation. We did no preprocessing on reviews; we presented them to the classifier in the same form they appear in the dataset. Also, they classify each blog as positive or negative, which is not the case with our problem. A shallow dependency parsing method is employed in [13] for review mining; the parsing technique is used to extract phrases and to identify the dependencies between subjective (opinion) words and objects (features), with experiments on product reviews. In [14], Zhang employed two regression algorithms, one using ε-SVR and the other linear regression using the WEKA toolkit, for scoring the utility of product reviews. We follow the same linear classification approach, but we use logistic regression, which has not been explored for this task. Our work is very similar to [2], but they used a tagger to extract phrases which contribute to subjective patterns and the General Inquirer (GI) lexicon to extract simple and enriched sentiment expressions, whereas we did not use any sentiment lexicons and instead tried to extract phrases which are sources of subjectivity based on corpus statistics. They used ε-SVR as the classification method and cast the problem as one of ordinal regression.

3. CLASSIFICATION

We view the problem of attributing a numerical score to reviews as a multi-class classification problem. We train the classifier on some samples from the dataset; the classifier builds a model based on these and predicts the label or rating of unseen samples. We have followed two supervised learning methods for classification: Naïve Bayes (NB) and Logistic Regression (LR).

Naïve Bayes is a widely used, standard method in classification. It is a probabilistic classifier based on Bayes' theorem which assumes feature independence. It estimates the posterior probability of a document (review) belonging to a class C based on the prior probability of class C and the likelihood, or conditional probability, of the document given class C. Each document D is viewed as a vector of words which are independent:

$$P(C \mid D) \propto P(C)\, P(D \mid C) \quad (1)$$

$$P(D \mid C) = \prod_{i=1}^{n} P(w_i \mid C) \quad (2)$$

where $n$ denotes the number of words in document $D$.
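To make the decision rule in equations (1)-(2) concrete, the following is a minimal, self-contained sketch of a multinomial Naïve Bayes rating classifier. The paper itself uses the rainbow toolkit for NB; the helper names (train_nb, predict_nb), the add-one smoothing, and the toy training reviews below are our own illustrative assumptions.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Estimate the prior P(C) and per-class word counts for equation (2)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)      # class -> word -> count
    vocab = set()
    for doc, c in zip(docs, labels):
        for w in doc.split():
            word_counts[c][w] += 1
            vocab.add(w)
    n_docs = sum(class_counts.values())
    priors = {c: n / n_docs for c, n in class_counts.items()}
    return priors, word_counts, vocab

def predict_nb(doc, priors, word_counts, vocab):
    """Return argmax_C [log P(C) + sum_i log P(w_i|C)], i.e. equations (1)-(2),
    with add-one smoothing so unseen words do not zero out a class."""
    best_label, best_score = None, float("-inf")
    for c, prior in priors.items():
        denom = sum(word_counts[c].values()) + len(vocab)
        score = math.log(prior)
        for w in doc.split():
            score += math.log((word_counts[c][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = c, score
    return best_label

# Toy usage with hypothetical two-review training data (star ratings 5 and 1).
priors, counts, vocab = train_nb(["great hotel loved it", "dirty room rude staff"], [5, 1])
print(predict_nb("rude and dirty", priors, counts, vocab))   # -> 1
```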

The second method is based on linear classification and uses logistic regression as the discriminating function, which is more sophisticated than a simple linear model. Like NB, logistic regression (LR) is a probabilistic classification method: it also estimates posterior probabilities.

Figure 1: Logit curve

LR is used in applications where human preferences play an important role in the outcome of an event; to model such preferences in the field of information retrieval, LR is used. It is generally used for binary classification but can be extended to multi-class classification. The logistic model performs a kind of discriminant analysis: parameter estimation changes from estimating the means and variances of distributions to maximizing the likelihood (MLE). It calculates the posterior probability of a class C directly rather than estimating the classes' individual density functions. LR, like other forms of regression, uses several predictor variables, which can be numerical or categorical. The function used for estimating the posterior probability on an input vector X is referred to as the 'logit'. The logit can take any value in $(-\infty, \infty)$, while the output probabilities lie in the range $(0, 1)$ (see Figure 1). Each class has an associated parameter vector of the form $(\alpha_k, \beta_k)$, where $\alpha_k$ denotes the intercept associated with class $k$ and $\beta_k$ is the regression coefficient vector associated with class $k$, which needs to be estimated.

The logit function is given by

$$\mathrm{logit}(p) = \ln\frac{p}{1-p} = \alpha + \beta_1 x_1 + \cdots + \beta_n x_n$$

where $x = (x_1, \ldots, x_n)$ is an input vector and $p$ is the posterior probability estimated on vector $x$. The logit is a linear combination of the intercept and the regression coefficients over the $n$ predictor variables which define the outcome of an event. In our case the logit function is modified to include the class variable. Each review is represented as a feature vector, and training instances are of the form $(x_i, y_i)$, where $y_i$ denotes the label attached to $x_i$. The conditional probability $P(y = k \mid x)$ is modeled as in (3), where the weight vectors have to be estimated from the training instances:

$$P(y = k \mid x) = \frac{\exp(\alpha_k + \beta_k \cdot x)}{\sum_{j=1}^{K} \exp(\alpha_j + \beta_j \cdot x)} \quad (3)$$

As a learning device for NB we have used the rainbow [6] text classifier, an open-source tool which has several built-in classification methods, NB being one of them. For LR we have used the liblinear tool [7], in which logistic regression is implemented; the LR method is L2-norm regularized and default parameter values are used in our case. We focus more on the feature extraction part than on the learning devices in this work, though LR approaches have not previously been explored for the review scoring problem. Like all learning devices, these two tools need both a training and a testing set, where each instance is a (label, feature vector) pair; they build a model from the former and predict labels on the latter. The LR method in liblinear solves equation (4):

$$\min_{w} \;\; \frac{1}{2} w^{T} w + C \sum_{i=1}^{l} \log\left(1 + e^{-y_i w^{T} x_i}\right) \quad (4)$$
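As an illustration of this setup, the sketch below trains an L2-regularized LR rating classifier. It is a minimal sketch assuming scikit-learn, whose "liblinear" solver wraps the same LIBLINEAR library (note it handles multi-class one-vs-rest rather than the multinomial form of (3)); the toy reviews and parameter values are our own, not the paper's configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical data: review texts with 1-5 star labels.
train_texts = ["hotel was very nice", "rude staff and dirty room"]
train_labels = [5, 1]

# tf-idf weighted feature vectors, as in Section 6.2.
vec = TfidfVectorizer()
X_train = vec.fit_transform(train_texts)

# L2-regularized logistic regression solved by LIBLINEAR (equation (4)).
clf = LogisticRegression(penalty="l2", C=1.0, solver="liblinear")
clf.fit(X_train, train_labels)

print(clf.predict(vec.transform(["decent location but noisy"])))
```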

4. FEATURE EXTRACTION

Each review has to be represented as a feature vector for a classifier. The classifier then builds a model based on these feature vectors (training samples) and predicts labels on unseen samples (testing) by means of a classifier function maximizing accuracy. To derive feature vector representations for reviews we follow three methods. BOW, with stop words removed, was considered the baseline feature representation in our experiments. WA and CO are the other two feature extraction methods, and they helped us derive better features for reviews in the form of phrases. WA is a simple variant of the bi-gram representation based on corpus statistics. We did not use data-mining association rules to extract larger text units; instead we depend on other measurements for associating two words or for extracting phrases of length greater than two.


4.1 Extracting Important Features for Association

To derive better feature representations for each review with larger text units, we have to extract the patterns which are the main sources of subjectivity. These patterns are usually "JJ NN" and "RB NN" sequences, as well as NPs and VPs. Extracting them directly requires a POS tagger, which is not a scalable solution. So we present an approach which extracts the most important features in each class, based on document frequency, for association:

$$\mathrm{ADF}_c = \frac{1}{N_c}\sum_{t \in V_c} \mathrm{df}_c(t) \quad (5)$$

where $\mathrm{df}_c(t)$ denotes the number of documents in class $c$ that contain term $t$, and $N_c$ denotes the number of unique terms in the class.

Each term whose document frequency is greater than its class ADF is considered important. We take the lists of such terms from all classes, remove duplicates, and obtain a final list of terms which is used for extracting phrases.

4.2 Word Association

Word association (WA) is our first step towards deriving better feature representations for reviews. We associate two words in a sliding window of length two based on (5), so a feature is a phrase of length two rather than a single word. It is a variant of the bi-gram method; the bi-gram method associates all adjacent words in a document, but we do not associate words blindly: two words are associated only if both satisfy (5). For example, consider text units like 'had a great time this weekend', 'decent location' and 'hotel was very nice'. Our WA method produces [great time], [time weekend], [decent location] and [very nice] as output features.

4.3 Term Co-occurrence

Term co-occurrence (CO) is an extension of the WA method. In this method, we select phrases of length greater than two and less than five. We select all phrases in a sliding window of size five (five is an intuitive choice, not empirically determined) if each word in the phrase satisfies (5). The motivation for these methods is to extract dependent terms as features: the notion of sentiment is largely affected by the presence of adjectives or adverbs occurring in proximity to a noun or a verb, and better representations with larger text units include both adjectives/adverbs and nouns/verbs. We do not use any tools like a POS tagger or any rules; we simply tokenize each document and associate tokens with one another when they satisfy (5). Hence it is a scalable solution for deriving better features for reviews and can be extended across several domains and many languages.
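The following is a small sketch of this pipeline under our reading of Sections 4.1-4.3. The function names are hypothetical, and the windowing is an interpretation: unimportant tokens are dropped before forming n-grams, which is what the WA example ('had a great time this weekend' yielding [great time] and [time weekend]) suggests.

```python
from collections import Counter

def important_terms(docs_by_class):
    """Union over classes of terms whose document frequency exceeds the
    class ADF (equation (5)); duplicates disappear in the set union."""
    keep = set()
    for c, docs in docs_by_class.items():
        df = Counter()
        for doc in docs:
            df.update(set(doc.split()))      # document frequency per term
        adf = sum(df.values()) / len(df)     # average document frequency
        keep.update(t for t, f in df.items() if f > adf)
    return keep

def extract_phrases(doc, keep, min_len=2, max_len=4):
    """WA (length 2) and CO (lengths 3-4) phrases over the important tokens.

    Assumption: unimportant tokens are filtered out first, so adjacency is
    measured among the surviving important tokens."""
    tokens = [w for w in doc.split() if w in keep]
    feats = []
    for n in range(min_len, max_len + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats

# Toy usage with a hypothetical two-class corpus.
corpus = {5: ["had a great time", "great time at the hotel"],
          1: ["rude staff", "rude and dirty"]}
keep = important_terms(corpus)
print(extract_phrases("had a great time this weekend", keep))   # -> ['great time']
```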


5. FEATURE SELECTION

Feature selection is a technique used in machine learning to select the most discriminative features across labels from the feature set. This process selects only a subset of features to make the learning process efficient and fast. Our feature set has features from all three methods, BOW, WA and CO, so the entire feature set is huge; a feature selection step is therefore inevitable, both to select the most appropriate features without affecting classification accuracy much and for fast learning. We conducted experiments with two feature selection approaches, along with the average document frequency (ADF) approach which we used to extract important features for association. There are many standard feature selection methods for text categorization [8] and linear classification [9]; we used Mutual Information (MI), Fisher Score (FS) and Average Document Frequency (ADF).

5.1 Mutual Information

Mutual information between two random variables is a quantity which measures their mutual dependence. In our case the mutual information is between a class label and a feature:

$$MI(t, c) = \log \frac{P(t, c)}{P(t)\,P(c)} \quad (6)$$

The higher the MI value, the more discriminative power the feature has.

5.2 Fisher Score

The Fisher criterion for determining the discriminative power of a feature is used for SVMs. It estimates the discriminative power of a feature based on the variance of the group means and the mean of the within-group variances. The larger the Fisher score, the more discriminative power the feature has. The Fisher score is simple, independent of the classifier, and generally effective. Its extension to the multi-class case is (7):

$$F(t) = \frac{\sum_{c=1}^{K} (\mu_c - \mu)^2}{\sum_{c=1}^{K} \sigma_c^2} \quad (7)$$

where $K$ is the number of classes, $\mu_c$ and $\sigma_c^2$ are the mean and variance of the feature in each class, and $\mu$ is the overall mean.

We selected the top 10% of features from the entire vocabulary for both the MI and FS feature selection methods; for ADF, we set the average document frequency as the threshold. Both MI and FS are standard feature selection methods; we followed the state of the art rather than devising new methods for the feature selection part. We present the increase in classifier performance obtained with these feature selection methods in detail in the results section.
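To make the two criteria concrete, here is a small sketch under our reading of equations (6) and (7). The document-count estimate for MI, the unweighted Fisher ratio, and the use of NumPy are assumptions for illustration, not the paper's exact implementation.

```python
import math
import numpy as np

def mutual_information(df_tc, df_t, n_c, n):
    """Pointwise MI between feature t and class c (equation (6)),
    MI(t, c) = log( P(t, c) / (P(t) P(c)) ), estimated from document counts:
    df_tc = docs in class c containing t, df_t = docs containing t,
    n_c = docs in class c, n = total docs."""
    if df_tc == 0:
        return float("-inf")
    return math.log((df_tc / n) / ((df_t / n) * (n_c / n)))

def fisher_score(values_by_class):
    """Multi-class Fisher score (equation (7)): scatter of the feature's
    per-class means over the summed within-class variances."""
    means = np.array([np.mean(v) for v in values_by_class])
    variances = np.array([np.var(v) for v in values_by_class])
    return np.sum((means - means.mean()) ** 2) / np.sum(variances)

# Toy usage: MI from document counts; FS from a feature's values in two classes.
print(mutual_information(df_tc=40, df_t=50, n_c=200, n=1000))
print(fisher_score([np.array([0.9, 0.8, 1.0]), np.array([0.1, 0.2, 0.0])]))
```

Selecting the top 10% then amounts to sorting the vocabulary by score and keeping the highest-scoring tenth.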


6. OUR EXPERIMENTS

In this section we describe the dataset used to conduct our experiments, the experimental setup and the evaluation methodology.

6.1 Dataset

The dataset³ we used for our experiments is customer review data on hotels, downloaded from [2]. It consists of reviews given by people on hotels in the towns of Pisa and Rome. The dataset has about 15,000 reviews, and each review is rated on several parameters (facets) of a hotel, such as cleanliness and business service. There is also a global rating attached to each review, on a discrete scale from 1 to 5. We independently and randomly divided the dataset into training and testing sets, with 75% of the reviews placed in the training set and the remaining 25% used for testing the classifier; there are about 11,000 reviews in the training set and 4,000 in the test set. We conducted experiments on the global ratings of reviews and evaluated them. The dataset is highly imbalanced, dominated by reviews with ratings 4 and 5; reviews with ratings 1 and 2 are very few in number. Table 1 gives complete statistics of the dataset used in our experiments.

³ http://patty.isti.cnr.it/~baccianella/reviewdata/

Table 1: Reviews in dataset

Rating    # of reviews
1         670
2         1130
3         1576
4         5473
5         6912

6.2 Experiments

The feature set has features extracted using the BOW, WA and CO methods. We conducted experiments on the hotel review dataset with a subset of features (top 10% for MI and FS, and the document frequency threshold for ADF) selected using the feature selection methods. We used two classification methods, NB and LR. As a baseline we used BOW features, and we incrementally added the new features extracted using WA and CO respectively. Evaluation is based on a standard classification measure (mean absolute error). We report our performance for both the feature extraction and the feature selection methods used in classification and compare it with the baseline. In the NB method each feature is weighted equally, whereas in LR each feature is weighted using 'tf-idf' weighting:


$$\mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \times \log\frac{N}{\mathrm{df}(t)} \quad (8)$$

where $\mathrm{tf}(t, d)$ is the frequency of feature $t$ in a document (review) $d$, $N$ is the total number of documents (reviews) in the dataset, and $\mathrm{df}(t)$ is the number of documents (reviews) in which feature $t$ occurs.

6.3 Evaluation Metrics

We evaluated our experiments using standard classification metrics. As the evaluation measure we used mean absolute error, which measures the average deviation between the predicted and true labels. We report performance using both the micro-averaged and macro-averaged versions of mean absolute error (MAE):

$$MAE^{\mu}(\hat{y}, Te) = \frac{1}{|Te|} \sum_{x \in Te} |\hat{y}(x) - y(x)| \quad (9)$$

$$MAE^{M}(\hat{y}, Te) = \frac{1}{K} \sum_{c=1}^{K} \frac{1}{|Te_c|} \sum_{x \in Te_c} |\hat{y}(x) - y(x)| \quad (10)$$

where $Te$ denotes the test document set, $K$ denotes the number of classes, $Te_c$ denotes the test documents in class $c$, $\hat{y}(x)$ denotes the predicted label of instance $x$, and $y(x)$ denotes its true label. In $MAE^{\mu}$ all instances of the test set are treated the same, whereas $MAE^{M}$ computes the average deviation independently across all labels; it is suited to highly imbalanced datasets like ours.
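For clarity, a small sketch computing both measures from true and predicted star ratings follows; the helper name mae_scores and the toy labels are illustrative only.

```python
from collections import defaultdict

def mae_scores(true_labels, predicted_labels):
    """Micro-averaged MAE (equation (9)) and macro-averaged MAE (equation (10)).

    The macro version averages the per-class mean deviations; this equals
    the 1/K form of equation (10) when every class appears in the test set."""
    errors = [abs(p - t) for t, p in zip(true_labels, predicted_labels)]
    mae_micro = sum(errors) / len(errors)

    per_class = defaultdict(list)
    for t, e in zip(true_labels, errors):
        per_class[t].append(e)
    mae_macro = sum(sum(es) / len(es) for es in per_class.values()) / len(per_class)
    return mae_micro, mae_macro

# Toy usage: five test reviews with true and predicted star ratings.
print(mae_scores([5, 4, 4, 2, 1], [5, 4, 3, 3, 2]))
```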

7. RESULTS AND OBSERVATIONS

We report and discuss the results of our experiments on the dataset. We did two-fold cross validation to test the statistical significance of our results. Table 2 shows the baseline scores with BOW as features on both classification methods, with no feature selection.

Table 2: Baseline results with BOW representation

Classifier                  MAEµ     MAEM
Naïve Bayes (NB)            0.521    0.804
Logistic Regression (LR)    0.580    0.807

Table 3 shows the results of our experiments with the NB classifier on the different feature representations described earlier, with mutual information (MI) as the feature selection method.

Table 3: Results of NB classifier with different features

Feature representation(s)   MAEµ     MAEM
BOW                         0.496    0.524
BOW+WA                      0.439    0.503
BOW+WA+CO                   0.440    0.444

Table 4 shows the results of our feature representations with the Logistic Regression (LR) method, under the two feature selection methods (ADF and FS) we employed for selecting the most discriminative features.

Table 4: Results for LR approach with different representations

                                  ADF                FS
Feature representation(s)   MAEµ     MAEM      MAEµ     MAEM
BOW                         0.585    0.758     0.610    0.843
BOW+WA                      0.531    0.776     0.512    0.872
BOW+WA+CO                   0.532    0.705     0.568    0.799

From the tables above, the NB method performs better than LR in many cases. Our values show that with both classifiers, using feature selection, BOW+WA performs better than BOW or BOW+WA+CO. There is only a marginal difference in the values when the CO method is added, which indicates that good performance can be obtained with phrases containing only two words, like JJ NN and RB NN patterns, as features. Our MAEM values are good compared to what [2] obtained on the same dataset, which shows that we are able to classify, with good accuracy, the labels that are under-represented in the dataset. The dataset is highly dominated by reviews with ratings of 4 and 5 (highly imbalanced); the MAEM metric is intended for imbalanced datasets, and we were able to produce good scores in this regard. There is, however, not much improvement in the MAEµ values: an 11.5% relative improvement from BOW+MI to BOW+WA+MI for the NB classifier and a 16.1% relative improvement with LR, whereas [2] reported an increase of 33.1% when their feature set was extended from BOW to BOW+Expr. There was a large improvement in the MAEM values of the NB classifier from the baseline BOW approach to BOW+MI: the improvement is 35%, which is very good and shows that MI as a feature selection technique produces good results across multiple labels.

8. CONCLUSION

We have described the problem of rating customer reviews, surveyed the approaches researchers are using, the challenges involved and the problems with current approaches, and proposed solutions. We proposed two feature extraction methods for reviews which need no external tool and depend solely on corpus statistics. We viewed the problem as a multi-class classification problem and conducted experiments with our feature extraction and feature selection methods, using Naïve Bayes and Logistic Regression, on a hotel review dataset which was manually annotated by customers. We also employed three feature selection techniques: MI, which is widely used for text categorization; ADF, which is based on the simple average document frequency of the label; and the Fisher score (FS) criterion, which is based on variance. We presented a detailed analysis of our experiments in the results section. Feature extraction in sentiment analysis using the statistics of the data alone has not been explored to date, so we believe our experiments are a good starting point for researchers to explore better methods for extracting subjective phrases in sentiment analysis using corpus statistics alone. Our experiments report an accuracy of 65%, which can be improved.

The rating of customer reviews is a fairly new application and much remains unexplored in this area; the literature on scoring reviews is very sparse. Most of the work on sentiment analysis has used either lexicon-based or binary classification methods to predict the orientation of a review as positive or negative, and multi-class classification is a new area of research in sentiment analysis. The feature extraction methods proposed in this work give reasonable improvements, though not as significant as those obtained using language tools. This work can be considered a building block for rating reviews and can be extended with more feature extraction methods and better feature selection and classification methods.

Acknowledgements

We thank Stefano and Andrea for providing the link in [2] from which we downloaded the dataset of reviews on hotels in Rome and Pisa.

References

[1] Gretzel, U., Yoo, K.Y. 2008. Use and impact of online travel reviews. In Proceedings of the International Conference on Information and Communication Technology, Springer Vienna, 35-46.

[2] Baccianella, S., Esuli, A., Sebastiani, F. 2009. Multi-facet rating of product reviews. In Proceedings of the 31st European Conference on Information Retrieval (ECIR 2009), France, Lecture Notes in Computer Science 5478, 461-472.

[3] Pang, B., Lee, L., Vaithyanathan, S. 2002. Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Morristown, NJ, USA, 79-86.

[4] Turney, P.D. 2002. Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Morristown, NJ, USA, 417-424.

[5] Hu, M., Liu, B. 2004. Mining and summarizing customer reviews. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Association for Computing Machinery, Seattle, WA, USA, 168-177.

[6] McCallum, A.K. 1996. Bow: A toolkit for statistical language modeling, text retrieval, classification and clustering. http://www.cs.cmu.edu/~mccallum/bow.

[7] Fan, R.E., Chang, K.W., Hsieh, C.J., Wang, X.R., Lin, C.J. 2008. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research 9, 1871-1874.

[8] Yang, Y., Pedersen, J.O. 1997. A comparative study on feature selection in text categorization. In Proceedings of the 14th International Conference on Machine Learning, Morgan Kaufmann Publishers Inc., 412-420.

[9] Weston, J., Mukherjee, S., Chapelle, O., Pontil, M., Poggio, T., Vapnik, V. 2001. Feature selection for SVMs. In Advances in Neural Information Processing Systems, MIT Press, 668-674.

[10] Nam, S.-H., Na, S.-H., Kim, J., Lee, Y., Lee, J.-H. 2009. Partially supervised phrase-level sentiment classification. In Proceedings of the International Conference on Computer Processing of Oriental Languages, Lecture Notes in Artificial Intelligence 5459, Springer-Verlag, 225-235.

[11] Chan, K.T., King, I. 2009. Finding the right couple for feature-opinion association in sentiment analysis. In Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, Lecture Notes in Artificial Intelligence 5476, Springer-Verlag, 741-748.

[12] Melville, P., Gryc, W., Lawrence, R.D. 2009. Sentiment analysis of blogs by combining lexical knowledge with text classification. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Association for Computing Machinery, 1275-1284.

[13] Zhang, Q., Wu, Y., Li, T., Ogihara, M., Johnson, J., Huang, X. 2009. Mining product reviews based on shallow dependency parsing. In Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Association for Computing Machinery, 726-727.

[14] Zhang, Z., Varadarajan, B. 2006. Utility scoring of product reviews. In Proceedings of the 15th ACM International Conference on Information and Knowledge Management (CIKM 2006), Association for Computing Machinery, 51-57.
