Good News vs. Bad News: What are they talking about?

Olga Kanishcheva
Intelligent Computer Systems Department, National Technical University “KhPI”
[email protected]

Victoria Bobicev
Department of Informatics and Systems Engineering, Technical University of Moldova
[email protected]

Abstract

Today’s massive news streams demand automated analysis, which is provided by various online news explorers. However, most of them do not provide sentiment analysis. The main problem of sentiment analysis of news is the difference between the writers' and the readers' attitudes to the news text: news can be good or bad, but it has to be delivered in neutral words, as pure facts. Although there are applications for sentiment analysis of news, the task of news analysis remains a pressing problem because the latest news impacts people’s lives daily.

In this paper, we explored the problem of sentiment analysis for Ukrainian and Russian news, developed a corpus of Ukrainian and Russian news, and annotated each text with one of three categories: positive, negative or neutral. Each text was marked by at least three independent annotators via a web interface; the inter-annotator agreement was analyzed and the final label for each text was computed. These texts were used in the machine learning experiments. Further, we investigated which named entities, such as Locations, Organizations and Persons, are perceived as good or bad by the readers, and which of them caused ambiguity in the text annotation.

1 Introduction

In news, sentiments are conveyed in a subtle manner, without the use of explicit sentiment-bearing words, and their detection requires contextual knowledge. Most research in the field of sentiment analysis has been done for the English language (Zhang and Skiena (2010), Liu and Zhang (2012)), some more for European Union languages, whereas far less research has been completed for East European languages, including Ukrainian and Russian. Sentiment analysis of user-generated content has been the focus of many studies; however, mass media news articles deserve attention as well. Sentiment analysis of the news helps make their image more transparent, as possible biases in different news sources can be uncovered. The lack of research in this field was the motivation for the current work. In comparison with other domains, like product reviews, sentiment-polarized words are used less frequently in news, and sentiments are conveyed by complex structures and contextual knowledge, as a way for journalists to seem more objective than they actually are. Many newspapers at least want to give an impression of objectivity, and journalists refrain from using obviously positive or negative vocabulary. This makes the sentiment classification task very challenging, as we need to find domain-specific methods to handle this complexity.

The paper is organized as follows: in the next section we describe related work. In Section 3 we describe the corpus, its annotation and the inter-rater agreement. In Section 4 we present the sentiment analysis experiments. Section 5 interprets the results and contains conclusions and future work.

2 Related Work

In recent years, sentiment analysis has been developing rapidly; this is connected with the growth of online texts and social networks. Surveys of this area were presented in Zhang and Skiena (2010) and Liu and Zhang (2012).

Shriniwas Doddi et al. (2014) proposed a platform for serving good news and creating a positive environment; they used SVM for sentiment analysis of news in English. Zhai et al. (2009) developed Java data processing code and used the Stanford Classifier to quickly analyze financial news articles from The New York Times and predict sentiment in the articles; they used a Naïve Bayes classifier. In (Godbole et al., 2007) the authors proposed a system that assigns scores indicating positive or negative opinion to each distinct entity in the text corpus. Their system consists of a sentiment identification part, which associates expressed opinions with each relevant entity, and a sentiment aggregation and scoring phase, which scores each entity relative to the others in the same class. Azar (2009) analyzed the numerical information found in financial markets and the verbal information reported in the financial news to classify the news into two classes, positive and negative; the author showed that the performance of Support Vector Machines was comparable to human performance. In (Kalyani, Bharathi and Jyothi, 2016) the authors compared several algorithms: Random Forest accuracy varied from 88% to 92%, SVM accuracy was around 86%, and Naive Bayes performance was around 83%.

Recently, deep learning has become a popular method for sentiment analysis. Rojas-Barahona (2016) and Tang et al. (2015) provide overviews of deep learning for sentiment analysis that place these approaches in context. Dos Santos and Gatti (2014) used deep convolutional neural networks on the Stanford Sentiment Treebank, reaching an accuracy of 86.4%.

In (Koltsova, Alexeeva and Koltcov, 2016) the authors described the development of a system for Russian language sentiment analysis. This system included a publicly available sentiment lexicon, a publicly available test collection with sentiment markup, and a crowdsourcing website for such markup. The sentiment lexicon was aimed at detecting sentiment in user-generated content (blogs, social media) related to social and political issues. Loukachevitch and Levchik (2016) presented the Russian sentiment lexicon RuSentiLex. The current size of the lexicon is more than ten thousand words and phrases. The lexicon entries were classified according to four sentiment categories (positive, negative, neutral, or positive/negative) and three sources of sentiment (opinion, emotion, or fact).

Sentiment analysis of news is different from that of other text types. Balahur et al. (2013) emphasized that there are several specific problems related to this type of text, one of them being the separation of positive or negative opinion from good or bad news. This problem becomes more evident in the process of manual news annotation (Balahur and Steinberger, 2009): annotators tend to misinterpret the author's intention and mark their own interpretation as the true sentiment of the text. In the news quotation annotation experiments of Balahur et al. (2009), annotator agreement was relatively low even for these short pieces of text, below 50%. Only after two rounds of re-annotation of the same quotations, supported by detailed annotation guidelines with multiple examples and explanations, did they manage to reach 80% agreement. Bakken et al. (2016) adapted the (Balahur et al., 2009) methodology in their annotation experiment, and the annotation made by the first two authors of the paper reached an agreement of 76%. Ellis et al. (2014) also stated that in news video transcripts or news articles, the sentiment attached to a statement can be much less obvious. An example presented in the paper demonstrates this: “take the statement that has been relevant in the news in the past year, ’Russian troops have entered into Crimea’. This statement by itself is not polarizing as positive or negative and is in fact quite neutral. However, if it was stated by a U.S. politician it would probably have very negative connotations and if stated by a Russian politician it could have a very positive sentiment associated with it." Their annotation was made using Amazon Mechanical Turk, and each annotation unit obtained three independent annotations. The paper did not report annotation agreement, although it mentioned that around 6% of the examples were discarded due to annotator disagreement.

In papers describing various types of manual annotation (Balahur et al., 2009; Navarro et al., 2005; Melzi et al., 2014), several reasons for poor inter-annotator agreement were listed. The ambiguity of the annotation units and the subjectivity of the annotators were among the main problems encountered in this process. A really high inter-annotator agreement can be reached only after several rounds of annotation by the same annotators with iteratively improved annotation instructions. In (Strapparava and Mihalcea, 2010) each news headline in the corpus was annotated by six annotators who had not received any additional instructions and hence annotated their own sentiments triggered by the text; the average annotation of each headline was then calculated. In (Melzi et al., 2014) master students annotated health forum messages with six basic emotions, and the calculated Fleiss Kappa (Artstein and Poesio, 2008) was relatively low: 0.26. They explained this disagreement by the variability between people and the specifics of the corpus texts. Nevertheless, they continued with automatic sentiment classification experiments, which gave reasonably good results (best F-measure = 0.65).

3 Experiments

3.1 The Data for Annotation

We used two data sets of Ukrainian (https://tsn.ua/) and Russian (http://censor.net.ua/) news. The Ukrainian corpus contains 5,817 news texts and the Russian corpus contains 10,194 news texts; the statistics are shown in Table I. The initial set of files was quite large, but a great part of these files was neutral: as these were news items, not users' comments, they were delivered as comparatively neutral texts. Even if the author's attitude was present in a text, it was expressed implicitly, mostly without sentiment-bearing words.

TABLE I. STATISTICS ABOUT THE DATA SETS

Topic     | Ukrainian texts | Russian texts
Society   | 1,171           | 2,100
Politics  | 1,235           | 2,502
Incidents | 1,156           | 1,393
Sport     | 1,153           | 2,100
Economics | 1,102           | 2,099
Total     | 5,817           | 10,194

Nevertheless, we decided to use sentiment lexicons in order to select the most interesting files. The sentiment lexicons we used for Russian (Bobicev et al., 2010) and Ukrainian (Kanishcheva et al., 2017) were created on the basis of WordNet-Affect (Strapparava and Valitutti, 2004). WordNet-Affect, in its turn, was created starting from WordNet Domains (Magnini and Cavaglia, 2002). WordNet-Affect provides an additional hierarchy of affective domain labels, independent from the domain hierarchy, for the synsets that represent affective concepts. Only the part of the WordNet-Affect synsets provided as a resource for the SemEval-2007 “Affective Text” task by Strapparava and Mihalcea (2008) was used for the Russian and Ukrainian lexicon creation. The terms were grouped into six subsets in accordance with the six basic emotions (Ekman, 1992): anger, joy, surprise, sadness, disgust and fear. An example of a synset is presented below:

a#01943022 awed awestruck awestricken in_awe_of
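For illustration, a synset line of this form can be split mechanically into its part-of-speech tag, synset offset and member words. The following minimal Python sketch is ours, not part of the original resource, and assumes the first field has the "<pos>#<offset>" layout shown in the example above:

```python
# Minimal sketch: parsing a WordNet-Affect synset line of the form shown
# above. The "<pos>#<offset>" layout of the first field is taken from the
# example line; adapt if the resource uses another format.
def parse_synset(line):
    head, *words = line.split()
    pos, offset = head.split("#", 1)
    return pos, offset, words

print(parse_synset("a#01943022 awed awestruck awestricken in_awe_of"))
# -> ('a', '01943022', ['awed', 'awestruck', 'awestricken', 'in_awe_of'])
```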

The created lexicons are comparatively small: the total number of translated synsets is 248, and the number of Russian and Ukrainian words is around 2,000. Sentiment words from the lexicons were compared with the words of the texts. For most texts, not a single word matched the lexicon terms; these texts were discarded as neutral. If a text contained words that matched lexicon terms, we kept it for annotation as possibly sentiment-bearing. After the filtering described above, 2,018 Russian and 2,133 Ukrainian news texts were left for the manual annotation. Only the texts themselves were presented for annotation, without any additional meta-information such as date, author, etc.
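The filtering step itself reduces to a membership test of text tokens against the lexicon. Below is a minimal sketch, assuming the lexicon is stored as a plain-text file with one lower-cased entry per line (the file name is hypothetical); a production version would also need lemmatization, since Russian and Ukrainian are highly inflected:

```python
import re

def load_lexicon(path):
    # One lower-cased word or phrase per line; hypothetical file layout.
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}

def has_sentiment_words(text, lexicon):
    # \w matches Unicode word characters in Python 3, so Cyrillic is covered.
    return any(tok in lexicon for tok in re.findall(r"\w+", text.lower()))

lexicon = load_lexicon("ukrainian_wordnet_affect.txt")  # hypothetical name
texts = ["..."]  # the raw news texts go here
# Texts without a single lexicon hit are discarded as neutral;
# the remaining ones are kept for manual annotation.
candidates = [t for t in texts if has_sentiment_words(t, lexicon)]
```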

3.2 The Annotation and Inter-Annotator Agreement

The annotation was performed by Kharkiv Polytechnic Institute students via an online interface (http://lilu.fcim.utm.md/annot_ru/annot_ru.html, http://lilu.fcim.utm.md/annot_ukr/annot_ukr.html). Around 40 students participated in the annotation of each part, Russian and Ukrainian, yielding 7,248 annotations for the Russian texts and 6,733 annotations for the Ukrainian ones. Before starting the annotation, students selected the topic they were willing to work with, and the interface randomly selected texts from the corresponding folder and offered them to the students. The question they answered was: what sentiments does this text evoke? They could also add comments if they considered it necessary. Finally, each text was annotated by 2 to 5 students using three labels: positive, negative, neutral. The average number of annotators per text was 3.59 for the Russian and 3.2 for the Ukrainian texts. The inter-annotator agreement calculated on these texts was extremely low: Fleiss Kappa = 0.14 for the Ukrainian texts and 0.24 for the Russian ones.
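For reference, Fleiss Kappa can be computed directly from per-text vote counts. The sketch below is our illustration, not the authors' code; it assumes every text in the batch has the same number of ratings (texts with different numbers of annotators would be grouped and measured separately):

```python
def fleiss_kappa(rows):
    """rows: one list per text with vote counts per category,
    e.g. [3, 0, 0] = 3 positive, 0 neutral, 0 negative votes.
    All rows must sum to the same number of ratings n."""
    n = sum(rows[0])                     # ratings per text
    N = len(rows)                        # number of texts
    k = len(rows[0])                     # number of categories
    p_j = [sum(r[j] for r in rows) / (N * n) for j in range(k)]
    P_i = [(sum(c * c for c in r) - n) / (n * (n - 1)) for r in rows]
    P_bar = sum(P_i) / N                 # observed agreement
    P_e = sum(p * p for p in p_j)        # chance agreement
    return (P_bar - P_e) / (1 - P_e)

# Three texts, three annotators each, categories positive/neutral/negative.
print(fleiss_kappa([[3, 0, 0], [1, 1, 1], [2, 1, 0]]))
```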

For two thirds of the Ukrainian texts, two annotators selected one label and one annotator selected another. Even if in such cases we may use the label selected by two annotators as the right one, this distribution of votes is practically identical to a distribution obtained by chance, and Fleiss Kappa for such annotation is equal to 0. Nevertheless, we selected the final label by majority voting: for each text we selected the label attached by the majority of the annotators. In the cases when a text received three annotations and one label was selected by two annotators, this label was selected as the final one. The statistics of the documents selected for annotation are presented in Tables II and III.

TABLE II. STATISTICS ON THE RUSSIAN NEWS FILES SELECTED FOR ANNOTATION AND FOR THE EXPERIMENTS

Category  | For annotation | For experiments | Positive | Neutral | Negative
Economics | 298            | 202             | 35       | 87      | 80
Incidents | 251            | 209             | 14       | 93      | 102
Politics  | 526            | 445             | 84       | 259     | 102
Society   | 434            | 322             | 56       | 153     | 113
Sport     | 509            | 321             | 80       | 134     | 107
Total     | 2,018          | 1,499           | 269      | 726     | 504
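The majority-voting rule described above is straightforward to implement; a small sketch of ours, in which a fully split vote yields no majority label:

```python
from collections import Counter

def final_label(votes):
    """votes: e.g. ['positive', 'negative', 'negative'] -> 'negative';
    a fully split vote (one of each label) yields 'ambiguous'."""
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "ambiguous"            # tie: no majority label
    return ranked[0][0]

print(final_label(["positive", "negative", "negative"]))  # negative
print(final_label(["positive", "neutral", "negative"]))   # ambiguous
```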

Figures 1 and 2 show the final statistics about the agreement on our data sets: the number of news texts is presented for each emotion category (positive, negative and neutral).

Figure 1: The number of texts with total agreement by emotion categories (Ukrainian news).

Figure 2: The number of texts with total agreement by emotion categories (Russian news).

TABLE III. STATISTICS ON THE UKRAINIAN NEWS FILES SELECTED FOR ANNOTATION AND FOR THE EXPERIMENTS

Category  | For annotation | For experiments | Positive | Neutral | Negative
Economics | 254            | 150             | 60       | 65      | 25
Incidents | 493            | 428             | 30       | 137     | 261
Politics  | 496            | 383             | 82       | 183     | 118
Society   | 449            | 373             | 52       | 189     | 132
Sport     | 441            | 368             | 153      | 147     | 68
Total     | 2,133          | 1,702           | 377      | 721     | 604

3.3 Ambiguous Texts

In this section, we try to analyze the reasons for such ambiguous estimation of the news. We explored the news texts which received three different labels (negative, neutral, positive) from the three annotators. The statistics on these texts are shown in Table IV.

For each category (Society, Politics, Incidents, Sport, Economics) we extracted named entities (Location, Person, Organization), as we hypothesized that these elements influence the readers' opinion about the texts. News should be neutral and not impose any opinion. However, the person reading the news has his or her own subjective opinion on this or that object, event or person, and this opinion influences the evaluation of the news. This is the reason why some of the news texts have been marked quite ambiguously (Table IV).

TABLE IV. STATISTICS ABOUT AMBIGUOUS DATA SETS (AMBIGUOUS/UNAMBIGUOUS TEXTS)

Category  | Ukrainian texts | Russian texts
Society   | 63/373          | 62/322
Politics  | 103/383         | 65/445
Incidents | 58/428          | 30/209
Sport     | 58/368          | 92/321
Economics | 68/150          | 43/202
Total     | 350/1,702       | 292/1,499

We think that the main cause of the news ambiguity in our texts was politics. For example, for the Economy and Society categories we detected the named entities presented in Tables V and VI. In Table V we present entities from the completely ambiguous texts and compare them with the entities from the unambiguous news. The extracted entities can be divided into several groups. The first group of named entities, such as the US Congress, the US Senate, the German Foreign Ministry etc., was met only in the ambiguous news. The second group, such as the Verkhovna Rada, Arseniy Yatsenyuk etc., was met in two categories, positive and negative; possibly that is why these entities also appeared in the ambiguous texts. The third group of entities (the International Monetary Fund (IMF), Ukraine) was met quite often in all categories, because these entities are very common for the Economics category and for Ukraine in general. Next, we analyzed the frequency of the term IMF and saw that it was met more frequently in negative news contexts. An entity such as Arseniy Yatsenyuk appeared in both the positive and the negative category because this politician was on the political arena of Ukraine in 2016 (our data set was created during 2016), and the people who analyzed the news already had their own, apparently controversial, opinions about him.
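Given the final text labels and a list of named entities per text from any NER tool (entity extraction itself is outside the scope of this sketch), tables like Tables V and VI below can be tallied as follows; the example data is illustrative, not taken from the corpus:

```python
from collections import defaultdict

def entity_tonality(labeled_texts):
    """labeled_texts: iterable of (label, entities) pairs,
    label being 'positive', 'neutral' or 'negative'."""
    table = defaultdict(lambda: {"positive": 0, "neutral": 0, "negative": 0})
    for label, entities in labeled_texts:
        for entity in set(entities):   # count each entity once per text
            table[entity][label] += 1
    return table

toy = [("negative", ["МВФ", "Украина"]),      # illustrative toy examples
       ("positive", ["Верховная Рада"])]
for entity, counts in entity_tonality(toy).items():
    print(entity, counts)
```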

TABLE V. THE MOST FREQUENT ENTITIES FROM RUSSIAN NEWS (CATEGORY – ECONOMY)

Name of Entity (Russian/English) | Positive | Neutral | Negative
Organization
Верховная Рада / The Verkhovna Rada | + | - | +
Кабинет Министров / Cabinet of Ministers | + | - | -
Международный валютный фонд (МВФ) / The International Monetary Fund (IMF) | + | + | + (many times)
Конгресс США / US Congress | - | - | -
Сенат США / The US Senate | - | - | -
МИД Германии / The German Foreign Ministry | - | - | -
Евросоюз / European Union | - | - | -
Антимонопольный комитет / Antimonopoly Committee | - | - | -
Национальный банк / National Bank | - | - | -
Кремль / Kremlin | - | - | -
Location
Российская Федерация / Russian Federation | - | - | -
Украина / Ukraine | + | - | + (often)
Person
Президент / President | - | + | +
Айварас Абромавичус / Aivaras Abromavichus | - | - | -
Кристин Лагард / Christine Lagarde | - | + | -
Оксана Сыроид / Oksana Syroid | - | - | -
Арсений Яценюк / Arseniy Yatsenyuk | + | - | +
Владимир Гройсман / Vladimir Groisman | - | - | -
Валерия Гонтарева / Valeria Gontareva | - | - | +
Павел Жебривский / Pavel Zhebrivsky | - | - | -
Дмитрий Песков / Dmitry Peskov | - | - | -
Александр Лукашенко / Alexander Lukashenko | - | - | -

TABLE VI. THE MOST FREQUENT ENTITIES FROM RUSSIAN NEWS (CATEGORY – SOCIETY)

Name of Entity (Russian/English) | Positive | Neutral | Negative
Organization
Минсоцполитики / Ministry of Social Policy | - | + | -
СБУ / SSU | - | + | +
Генпрокуратура / Prosecutor General's Office | - | - | -
Кабинет Украины / Cabinet of Ukraine | - | - | -
Верховная Рада / The Verkhovna Rada | + | + | +
Госпогранслужба Украины / State Border Service of Ukraine | - | - | -
Луганская Народная Республика / Luhansk People's Republic | - | - | -
Парламент Канады / Parliament of Canada | - | - | -
Location
Майдан / Maidan |  |  |
АТО / ATO | + | + | + (often)
Евросоюз / European Union | - | - | -
Дебальцево / Debaltsevo | - | - | -
Мариуполь / Mariupol | - | - | +
Чернобыльская АЭС / Chernobyl nuclear power plant | - | - | -
Person
Путин / Putin | - | - | -
Арсен Аваков / Arsen Avakov | + | + | -
Петр Порошенко / Petro Poroshenko | + (often) |  | + (often)
Михеил Саакашвили / Mikheil Saakashvili | - | - | +
Хатия Деканоидзе / Hatiya Dekanoidze | + | - | -
Вера Савченко / Vera Savchenko | - | - | +

Table VI shows that the Security Service of Ukraine (SSU) was met in negative news contexts, while the persons Arsen Avakov and Hatiya Dekanoidze appeared in positive contexts. This positive context may be explained by people's positive attitude to the reforms they had been carrying out over the previous two years. The term Antiterrorist Operation (ATO) was met in all categories of texts, but more often in the negative news; this is connected with the war in the East of Ukraine, and the annotators used negative labels for texts related to the ATO. We observed an interesting fact about the entity Petro Poroshenko: this person was mentioned relatively often both in the positive and in the negative category. This indicates that the president of Ukraine does not have unequivocal support among Ukrainians; this result matches the results of official surveys (http://ukraine-elections.com.ua/socopros/parlamentskie_vybory, https://ru.slovoidilo.ua/2017/02/01/infografika/politika/kak-menyalis-elektoralnye-simpatii-ukraincev-v-2016-m).

The situation is the same for the news in Ukrainian. The extracted entities show how ambiguously society perceives the same news, especially when it comes to politics. Many categories, even non-political ones, contain entities that are related to politics, and thereby the user's opinion of these entities overlaps the evaluation of the news as a whole.

4 Automated Classification Experiments

In order to evaluate the annotation, we performed machine learning experiments using several algorithms: Bayesian classifiers (Naive Bayes Multinomial) and SVM (Support Vector Machines). The feature set was obtained using the bag-of-words approach, and the best set of attributes was selected by two selection methods: Information Gain feature evaluation, which evaluates the worth of a feature by measuring its information gain, and Correlation-based Feature Subset Selection (Hall, 1999). We also experimented with lexicon-based features (https://sites.google.com/site/datascienceslab/projects/multilingualsentiment), but bag-of-words features gave better results and we report only them. We performed 10-fold cross-validation for each data set. The best obtained F-measures, the feature sets, the methods of their selection, and the machine learning algorithms used to obtain these results for all our categories of texts are presented in Table VII.

We obtained comparatively good results for our texts. The best F-measure, 0.776, was obtained for the Sport category of the Russian texts. The third column of the table presents information about the feature set used to obtain each result: in this case, 91 features selected from the initial bag-of-words feature set using Correlation-based Feature Subset Selection were used by the Naïve Bayes Multinomial algorithm. Interestingly, the worst result, F = 0.676, was also obtained for the Sport category, this time for the Ukrainian news. This may be explained by the fact that comparatively neutral reports about the scores of sports matches evoke quite different sentiments in the fans of competing teams, and there is no text element which indicates these sentiments.

TABLE VII. THE BEST RESULTS OBTAINED FOR EACH CATEGORY

Category  | Best F-measure | Feature set for this result | Classification algorithm
Russian news
Economics | 0.728 | 73, Info Gain   | NB Multinomial
Incidents | 0.741 | 71, Info Gain   | SVM
Politics  | 0.729 | 160, Info Gain  | NB Multinomial
Society   | 0.695 | 57, Best subset | NB Multinomial
Sport     | 0.776 | 91, Best subset | NB Multinomial
Ukrainian news
Economics | 0.720 | 29, Info Gain   | Naive Bayes
Incidents | 0.762 | 72, Best subset | SVM
Politics  | 0.746 | 77, Info Gain   | NB Multinomial
Society   | 0.763 | 163, Info Gain  | NB Multinomial
Sport     | 0.676 | 79, Best subset | SVM

Tables VIII and IX show the number of features in all the data sets which we used in our experiments. The obtained results are not as good as those in (Bobicev et al., 2017) on a similar data set, although our corpora are larger. This could be explained by the annotation vagueness. We decided on text polarity by simple majority voting: if a text was labeled by three annotators, once with the positive label and twice with the negative one, we considered this text negative. However, in such cases there was not enough evidence that the text was really negative. A possible solution might be additional annotation of these texts, but we already have a number of texts annotated by 4, 5, 6 and even 7 annotators which are still ambiguous, obtaining, for example, 3 positive votes and 2 negative from 5 annotators, or 3 positive, 2 negative and 2 neutral from 7 annotators. The possible solutions in this case may be (1) the creation of an additional category for ambiguous texts, or (2) the removal of these texts from the training corpus.

TABLE VIII. THE NUMBER OF FEATURES FOR RUSSIAN NEWS

Category  | BOW   | Best Subset | Info Gain
Economics | 3,483 | 42          | 73
Incidents | 3,564 | 40          | 71
Politics  | 6,431 | 78          | 160
Society   | 6,107 | 57          | 128
Sport     | 3,482 | 91          | 130

TABLE IX. THE NUMBER OF FEATURES FOR UKRAINIAN NEWS

Category  | BOW   | Best Subset | Info Gain
Economics | 2,878 | 25          | 29
Incidents | 6,110 | 72          | 112
Politics  | 5,868 | 68          | 77
Society   | 6,088 | 92          | 163
Sport     | 4,531 | 79          | 166
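For concreteness, the pipeline described in this section can be approximated with scikit-learn as below. This is a sketch under assumptions, not the authors' exact setup: mutual information stands in for the Information Gain ranking, the feature count k is illustrative, and the corpus has to be plugged in:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

texts, labels = [], []  # plug in the news bodies and their final labels here

for name, clf in [("NB Multinomial", MultinomialNB()), ("SVM", LinearSVC())]:
    pipe = Pipeline([
        ("bow", CountVectorizer()),                           # bag of words
        ("select", SelectKBest(mutual_info_classif, k=100)),  # top-k features
        ("clf", clf),
    ])
    # 10-fold cross-validation with macro-averaged F-measure, as in the paper.
    scores = cross_val_score(pipe, texts, labels, cv=10, scoring="f1_macro")
    print(name, scores.mean())
```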

5 Conclusion and Future Work

In this paper, we have presented work on sentiment analysis for news in Ukrainian and Russian. The news texts were annotated with one of three categories (positive, negative and neutral) by independent annotators; the inter-annotator agreement was calculated and the final label was attached to each annotated text. The texts that remained ambiguous were not used in the experiments. We performed experiments with the created annotated news corpus using Bayesian (Naive Bayes Multinomial) and SVM (Support Vector Machine) classifiers. The bag-of-words approach with feature selection gave the best results, which demonstrated that even simple learning methods, such as Naïve Bayes, are able to achieve good results (an average F1-score of 0.73) with the right features. We explored what kinds of named entities were met in the positive and negative contexts of the news and examined the possibility of using the standard sentiment analysis approach for forecasting politicians' ratings and reactions to some events, such as sports matches, festivals, TV shows etc.

There are several open issues in our study. First, the quality of the annotation needs to be improved. In order to solve this problem we plan to undertake the following steps: (1) to carry out a more detailed analysis of the most ambiguous texts to detect the common causes of their ambiguity; (2) to introduce one more annotation category, ‘ambiguous’, for the texts the annotators are not certain about; (3) to experiment with various modifications of the instructions for the annotators. Second, we plan to search for better sentiment lexicons for Russian and Ukrainian in order to use them in automated sentiment recognition, which currently still needs considerable improvement. Third, the final aim of our work is aspect-based sentiment analysis: mining and summarizing opinions from texts about specific entities and their aspects; thus we plan to continue in the direction of automatic named entity recognition in close connection with sentiment analysis.

Acknowledgments

We gratefully acknowledge the feedback and comments of the anonymous RANLP reviewers, which considerably helped to improve the paper. We would also like to thank the students of the National Technical University “KhPI” for the creation and annotation of the news collection.

References

Ron Artstein and Massimo Poesio. 2008. Inter-coder agreement for computational linguistics. Computational Linguistics, 34(4):555-596. http://dx.doi.org/10.1162/coli.07-034-R2

Pablo Daniel Azar. 2009. Sentiment Analysis in Financial News. Ph.D. thesis, Harvard University.

Patrik F. Bakken, Terje A. Bratlie, Cristina Marco, and Jon Atle Gulla. 2016. Political News Sentiment Analysis for Under-resourced Languages. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 2989-2996.

Alexandra Balahur and Ralf Steinberger. 2009. Rethinking Opinion Mining in News: from Theory to Practice and Back. In Proceedings of the 1st Workshop on Opinion Mining and Sentiment Analysis, Satellite to CAEPIA 2009. http://publications.jrc.ec.europa.eu/repository/handle/JRC55018

Alexandra Balahur, Ralf Steinberger, Erik van der Goot, Bruno Pouliquen, and Mijail Kabadjov. 2009. Opinion Mining on Newspaper Quotations. In Proceedings of the workshop 'Intelligent Analysis and Processing of Web News Content' (IAPWNC). http://dx.doi.org/10.1109/WI-IAT.2009.340

Alexandra Balahur, Ralf Steinberger, Mijail A. Kabadjov, Vanni Zavarella, Erik Van der Goot, Matina Halkia, Bruno Pouliquen, and Jenya Belyaeva. 2013. Sentiment Analysis in the News. In Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), pages 2216-2220.

Victoria Bobicev, Olga Kanishcheva, and Olga Cherednichenko. 2017. Sentiment Analysis in the Ukrainian and Russian News. In Proceedings of the First Ukraine Conference on Electrical and Computer Engineering (UKRCON 2017), pages 1050-1055.

Victoria Bobicev, Victoria Maxim, Tatiana Prodan, Natalia Burciu, and Victoria Angheluş. 2010. Emotions in words: developing a multilingual WordNet-Affect. In Proceedings of the International Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2010), pages 375-384. http://dx.doi.org/10.1007/978-3-642-12116-6_31

Paul Ekman. 1992. An argument for basic emotions. Cognition and Emotion, 6(3/4):169-200.

Joseph G. Ellis, Brendan Jou, and Shih-Fu Chang. 2014. Why We Watch the News: A Dataset for Exploring Sentiment in Broadcast Video News. In Proceedings of the 16th International Conference on Multimodal Interaction (ICMI 2014), pages 104-111. http://dx.doi.org/10.1145/2663204.2663237

Namrata Godbole, Manja Srinivasaiah, and Steven Skiena. 2007. Large-Scale Sentiment Analysis for News and Blogs. In Proceedings of the International Conference on Weblogs and Social Media.

Mark A. Hall. 1999. Correlation-based Feature Selection for Machine Learning. Ph.D. thesis, University of Waikato.

Joshi Kalyani, H. N. Bharathi, and Jyothi Rao. 2016. Stock trend prediction using news sentiment analysis. International Journal of Computer Science & Information Technology (IJCSIT), 8(3):67-76.

Olga Kanishcheva, Catharine Klymenkova, and Catharine Yurieva. 2017. Development of WordNet-Affect dictionary for the Ukrainian language. In Proceedings of the 3rd International Academic Conference “Human. Computer. Communication”.

Olessia Koltsova, Svetlana Alexeeva, and Sergei Koltcov. 2016. An Opinion Word Lexicon and a Training Dataset for Russian Sentiment Analysis of Social Media. In Proceedings of the International Conference “Dialogue 2016”, pages 277-287.

Bing Liu and Lei Zhang. 2012. A survey of opinion mining and sentiment analysis. Mining Text Data, pages 415-463.

Natalia Loukachevitch and Anatoly Levchik. 2016. Creating a General Russian Sentiment Lexicon. In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), pages 1171-1176.

Bernardo Magnini and Gabriela Cavaglia. 2002. Integrating subject field codes into WordNet. In Proceedings of the Second International Conference on Language Resources and Evaluation (LREC 2002), pages 1413-1418.

Soumia Melzi, Amine Abdaoui, Jerome Azé, Sandra Bringay, Pascal Poncelet, and Florence Galtier. 2014. Patient's rationale: Patient Knowledge retrieval from health forums. In Proceedings of the Sixth International Conference on eHealth, Telemedicine, and Social Medicine, pages 140-145.

Borja Navarro, Raquel Marcos, and Patricia Abad. 2005. Semantic Annotation and Inter-Annotators Agreement in Cast3LB Corpus. In Proceedings of the Fourth Workshop on Treebanks and Linguistic Theories.

Cícero Nogueira dos Santos and Maíra Gatti. 2014. Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pages 69-78.

Lina Maria Rojas-Barahona. 2016. Deep learning for sentiment analysis. Language and Linguistics Compass, 10(12):701-719. http://dx.doi.org/10.1111/lnc3.12228

Kiran Shriniwas Doddi, Y. V. Haribhakta, and Parag Kulkarni. 2014. Sentiment Classification of News Articles. International Journal of Computer Science and Information Technologies, 5(3):4621-4623.

Carlo Strapparava and Rada Mihalcea. 2010. Annotating and Identifying Emotions in Text. Intelligent Information Access, Studies in Computational Intelligence, pages 21-38.

Carlo Strapparava and Rada Mihalcea. 2008. Learning to identify emotions in text. In Proceedings of the 2008 ACM Symposium on Applied Computing, pages 1556-1560. http://dx.doi.org/10.1145/1363686.1364052

Carlo Strapparava and Alessandro Valitutti. 2004. WordNet-Affect: an affective extension of WordNet. In Proceedings of the 4th International Conference on Language Resources and Evaluation, pages 1083-1086.

Duyu Tang, Bing Qin, and Ting Liu. 2015. Deep learning for sentiment analysis: successful approaches and future challenges. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 5(6):292-303. http://dx.doi.org/10.1002/widm.1171

Jinjian Zhai, Nicholas Cohen, and Anand Treya. 2009. CS224N Final Project: Sentiment analysis of news articles for financial signal prediction.

Wenbin Zhang and Steven Skiena. 2010. Trading Strategies to Exploit Blog and News Sentiment. In Proceedings of the Fourth International AAAI Conference on Weblogs and Social Media, pages 375-378.