KAIS manuscript No. (will be inserted by the editor)
Mood Sensing from Social Media Texts and Its Applications
Thin Nguyen · Dinh Phung · Brett Adams · Svetha Venkatesh
Received: May 16, 2012 / Revised: Jan 30, 2013 / Accepted: Feb 08, 2013
Abstract
We present a large-scale mood analysis of social media texts. The paper is organised in three parts: 1) we address the problem of feature selection and classification of mood in the blogosphere; 2) we extract global mood patterns at different levels of aggregation from a large-scale dataset of approximately 18 million documents; and 3) we extract the mood trajectory for an egocentric user and study how it can be used to detect subtle emotion signals in a user-centric manner, supporting the discovery of hyper-groups of communities based on sentiment information. For mood classification, two feature sets proposed in psychology are used, showing that these features are efficient, do not require a training phase and yield classification results comparable to state-of-the-art, supervised feature-selection schemes; on mood patterns, empirical results for mood organisation in the blogosphere are provided, analogous to the structure of human emotion proposed independently in the psychology literature; and on community structure discovery, a sentiment-based approach can yield useful insights into community formation.
T. Nguyen, D. Phung, S. Venkatesh
School of Information Technology, Deakin University, Geelong Waurn Ponds Campus, Australia. E-mail: {thin.nguyen,dinh.phung,svetha.venkatesh}@deakin.edu.au
B. Adams
Department of Computing, Curtin University, Perth, Australia. E-mail: [email protected]

1 Introduction

So much of the web today is two-way; that is, users are able not only to read commercially owned content but also to respond to it. The facility to comment on what was previously only broadcast media, such as news articles, has led to the creation of an unprecedented amount of associated sentiment-laden text. This adoption of user contribution by the on-line arms of traditional media and industry seems to have been driven by the emergence of social media (YouTube, Flickr, Facebook, Twitter and blogs) through which whole communities contribute media of various types and transact via on-line communication channels. Social media's uniquely `egocentric' nature¹ means that these communities and the artifacts they produce constitute sentiment-laden corpora in their own right.²
Hence, user-generated content is usually opinionated and/or sentiment-bearing, bringing opportunity, for example, for a company to gain insight into consumer opinions about its products and those of its competitors. Thus the ability to identify opinion sources on the Web and monitor them is a growing research field, termed opinion mining. This line of research mainly focuses on subjectivity detection, sentiment classification, joint topic-sentiment analysis, and opinion summarisation. In this paper, we focus on a popular form of sentiment: mood. Mood is a strong form of sentiment expression, conveying a state of the mind such as being happy, sad or angry. Social media texts are rich in sentiment and this paper discusses various fundamental issues related to mood sensing from these texts and novel applications of this information.

Text-based mood classification and clustering, as a sub-problem of opinion and sentiment mining, have many potential applications, as identified in [28], such as automated recommendation for product websites, as a sub-component of web technology in business and government intelligence, or for the collection of empirical evidence for studies in psychological and behavioural sciences. Specifically, in the blogosphere, mood classification can be used to filter search results, to ascertain the mental health of communities, or to gain detailed insight into patterns of how bloggers behave and relate to one another.

However, text-based mood analysis poses additional challenges beyond standard text categorisation and clustering. The complex cognitive processes of mood formulation make it dependent on the specific social context of the user, their idiosyncratic associations of mood and vocabulary, their syntax and style, which reflect on language usage (for example, the order of linguistic components), and the specific genre of the text. In the case of the weblogs studied in this paper, these challenges are reflected in the diverse styles of expression of the bloggers, the relatively short text length and the use of informal language, such as jargon, abbreviations and non-standard grammar. Feature-selection methods available in machine learning are often computationally expensive, relying on labelled data to learn discriminative features. However, the blogosphere is vast (reaching almost 130 million users³) and is continuing to grow, making it desirable to construct a feature set that works without requiring supervised feature training to classify mood. To this end, it is necessary to look to the results of studies that intersect Psychology and Linguistics. Doing so reveals two potentially useful systems: (1) the sentiment-bearing lexicon known as ANEW [6] and (2) psycholinguistic features drawn from the LIWC [31]. It is proposed that these systems are suitable for use in mood classification.

Our contribution is three-fold. First, we conduct a comparative study of machine learning-based text feature selection for the specific problem of mood classification, elucidating insights into what can be transferred from a generic text-categorisation problem to mood classification. Second, we formulate a novel use of two psychology-inspired sets of features for mood classification that do not require supervised feature learning and are thus useful for large-scale mood classification, and use them to obtain empirical results for mood organization of a large dataset drawn from the blogosphere, which contains the largest set of mood ground truth available to date. To our knowledge, we are the first to consider the problem of data-driven mood pattern discovery at this scale. Third, we examine the potential for mood to reveal hyper-communities, or inter-community boundaries, not apparent from topical analysis. Our study includes results for community clustering using topical, mood-based, and psycho-linguistic-based features, in a comparative analysis that is the first of its kind.

Our results have significance for all of the application domains noted above. Our comparison of mood estimators can serve as a basis for deciding which feature to use in a particular application setting when trading off performance for speed. The cheap estimators we have experimented with can be used at scale wherever mood is a useful facet of analysis, from monitoring of discussions in participatory democracy to surveillance of online forums for malicious intent or recruitment campaigns, and even generic search; our hyper-community formulation has direct application to any social media community applications with a textual component, and could serve as a useful feature in a domain whose lifeblood is product differentiation and rapid innovation.

This paper represents extensions of our previous work. For the mood classification problem, in addition to [25], cheap and effective features inspired by psychology research are included. Also, experiments on a range of features are conducted on a larger dataset containing million blog posts. For the hyper-community problem, continuing from previous work [26], we introduce a novel view of online communities through the linguistic styles their members express in journal diaries. Furthermore, instead of running experiments on data without Live Journal category annotations, we explore new data with the labels, allowing a more objective clustering measurement in comparison to [26]. Finally, this paper brings these previous works together into a coherent and unified view of sentiment analysis in social media.

The remainder of this paper is structured as follows. The following section provides a discussion of related work. Section 3 examines mood classification in a supervised framework, compares a number of feature selection methods, and introduces cheap features based on psychology-sourced sets. Section 4 presents an unsupervised approach to examine the correlation between real-world use of a mood vocabulary and an efficient feature set introduced in the previous section, using a large dataset of blog posts with associated ground-truth mood. Section 5 focuses on particular online communities, and examines the use of various community signatures, including topic, mood, and psycholinguistic features, for the task of clustering them into hyper-communities.

¹ For example, blog text was found to have a higher occurrence of the 1st person singular than conversations [31].
² For example, one of the main reasons for writing, cited by bloggers, is to speak their minds: www.intac.net/breakdown-of-the-blogosphere/, accessed August 2011.
³ From the state of the blogosphere 2008 at http://technorati.com.
2 Related Work

2.1 Feature selection for mood classification

For generic text categorization, a wide range of feature selection methods in machine learning has been studied [5, 9]. Most notably, Yang and Pedersen [40] conduct a comparative study of different feature selection schemes, including information gain (IG), mutual information (MI), and the χ² statistic (CHI). Their study concludes that IG and CHI are most effective at dimensionality reduction for text categorisation without compromising classification accuracy, while MI has inferior performance by comparison. An alternative to a term-class interactions approach to selecting features is to consider term statistics. Thresholds for term frequency (TF) or document frequency (DF) are commonly used in feature reduction in data mining. The joint term frequency-inverse document frequency (TF.IDF) scheme, popular for retrieval settings, is also used in text mining and often outperforms TF and DF. Some approaches reduce the feature space from the entire vocabulary by using linguistic representations, such as parts of speech (POS), including adjectives, verbs, adverbs, and other subcategorizations.

All of these practices are examined in this study to provide a comparative study of machine learning-based text feature selection for the specific problem of mood classification, elucidating insights into what can be transferred from a generic text categorisation problem to mood classification. In addition, due to the vast scale of social media, we run experiments on feature sets that work without requiring a supervised feature selection stage to classify mood. We extend previous work on mood classification in a supervised framework [25] to include more efficient features based on psychology-sourced sets and provide additional experiments on a larger dataset.
2.2 Mood analysis

Mood analysis can be viewed as a subset of sentiment analysis and opinion mining [28] where subjective information is identified, extracted, or utilised in real-world applications. The sentiment information conveyed in social media data has been used in viral marketing, as in Fan and Chang [8], where the authors introduced a sentiment-oriented approach to placing advertisements in a specific blog. Also, sentiment information conveyed in user feedback can be helpful in reviews of products and their features [17]. Furthermore, Feng et al. [10] propose a method to group blogs into sentiment clusters, employing a Chinese sentiment lexicon for sentiment-bearing representation, facilitating public opinion monitoring for governments and business organizations. Mood classification in weblogs has been conducted by Mishne [20], who classifies blog text according to the mood tagged by its author at the time of writing, and by Leshed and Kaye [16], who predict the user's state of mind for an incoming blog post.

Mood estimation has a wide scope of application, for example in business and government intelligence, product research, ethnographic study of the Internet, community or media recommendation, search with a mood facet and pervasive healthcare. However, mood estimation from text poses challenges beyond those encountered by typical text categorisation and clustering. Leaving aside the complexity of the underlying cognitive processes, the manifestation of mood is coloured by a person's idiosyncratic vocabulary and style, with messages often reflecting a social context, including community norms, history and shared understanding. Consequently, text is often short, informal and punctuated by jargon, abbreviations and frequent grammatical errors.
In addition to classification, clustering mood into patterns is also an important task as it provides clues about human emotion structures, with implications for sentiment-aware applications such as sentiment-sensitive text retrieval. The structure of mood organisation has been investigated from a psychological perspective for some time. For example, Russell [32, 33] proposes the circumplex model of affect to represent affect states. Using this model, emotion names can be placed around the perimeter of a circle in two-dimensional space. The dimensions defining the space are pleasantness (or valence) and activation (or arousal). However, the structure of mood formation has not been investigated from a data-driven and computational point of view. This paper aims to discover intrinsic patterns in mood structure using unsupervised learning approaches. Using a large, ground-truthed dataset of approximately 18 million posts introduced in [16], it seeks empirical evidence to answer various questions that have often been posed in psychology. For example, does mood follow a continuum in its transition from `pleasure' to `displeasure' or from `activation' to `deactivation'? Is `excited' closer to `aroused' or `happy'? Does `depressed' transit to `calm' before reaching `happy'? These are interesting and important research questions that have long been the focus of conjecture in different fields, but have not been extensively investigated from a data-driven perspective. The work of Russell [32], for example, includes only 36 participants, and is thus far smaller in scale than the dataset used here.

Mood clustering has also been studied in [14], in which the authors performed clustering on music genres, artists and usages in terms of the moods used, and in [16], in which the authors grouped blog posts to find mood synonymy. The idiosyncratic nature of mood attributions for the domain of music leads Hu and Downie [14] to cluster, rather than classify, music genres, artists and usages into data-derived `mood spaces'. Some approaches employ a clustering component for mood classification. For example, Sood and Vasserman [36] use K-means clustering to obtain a much reduced set of mood classes from the 132 moods predefined by the popular blog site Livejournal. Mood classification of blog posts is reduced to a ternary classification from among happy, sad and angry, and is achieved using a number of generic text features plus some that are mood specific (for example, emoticons and Internet slang), with an average F-measure of 0.66. Their mood classifier is used as part of a mood-aware search interface.
2.3 Mood and Inference about Communities and Users

The above sections discuss work aimed at classifying, or grouping, texts by manifest mood. If the focus of analysis is shifted from texts to their authors, different questions arise: Can users be characterized by the mood of the messages they author? Can communities be so characterized? Are some more alike than others in terms of the mood of their discussions? For example, Mishne and Glance [21] are not alone in using social media commentary as a kind of sensor of how a product, company, political party, or person is being perceived. They use Nigam and Hurst's polar sentiment classifier [27] to analyze blog posts referring to new movies together with the movie's box office performance. They find some evidence of a positive correlation between use of positive sentiment prior to a movie's release and the movie's subsequent financial success. Cambria et al. [7] take a Semantic Web approach to opinion mining of patients, relatives and healthcare professionals for the purpose of obtaining a crowd validation of the UK National Health Service.

The problem of community structure discovery in complex networks has been investigated in numerous studies. In particular, link structure, which can be explicitly declared as friendship or membership in subscribers' profiles, has been exploited. For example, Kumar et al. [15] use friendship links and other information in user profiles, annotated with timestamps, to learn the network structure and its evolution using Flickr, Yahoo! 360 and blogosphere data. Backstrom et al. [2] use co-authorship and publication information in the Digital Bibliography and Library Project, and friendship and community membership in Livejournal, to learn group formation in the two networks. A disadvantage of link-based approaches is that they make strong assumptions about the availability of link structure and the stability of communities, which do not always hold in practice.

Another approach to learning community structure is to use the content created by users. For example, McCallum et al. [18] present the Author-Recipient-Topic model to exploit the content of emails sent among Enron employees. The model can discover relevant topics, predict people's roles and give lower perplexity on previously unseen messages. They also introduce the Group-Topic model to detect groups in streams using relationships between entities and the properties of the relationships [19]. In [35], the Content-Time-Relation model is applied to the Enron email corpus to discover roles, predict the senders, infer the receivers and describe the content of information exchanged between people in the company.

In addition to the content, tags can also be exploited for the task of community detection. For instance, Negoescu et al. [24] use the tags and membership information of Flickr users in a bag-of-words representation to learn Flickr hyper-groups. Berendt and Hanser [3] show that tags could potentially enrich information related to the posts and the bloggers. The authors reveal that tags complement content, reflect differences between annotators and provide additional information. Hayes and Avesani [13] use tag information to find topic-relevant blogs. A joint link-content approach has also been used in studying social networks. An example of this practice is that of Nallapati and Cohen [23], who introduce LinkPLSA-LDA to model the relationships between the citing/linking and the cited/linked documents on specific topics.

In this paper, community detection is performed using topic-based and mood-based features in a comparative analysis, as proposed in [26], that, to our knowledge, is the first of its kind. In addition, this paper introduces a novel representation of communities based on psycho-linguistic features. These features offer a wide scope of classification, including topical, linguistic, stylistic and mood categories, and they are cheap to obtain. They are found to be effective in capturing and identifying the style of authors in personal blogs [22].
3 Mood Classification

We begin by framing the problem of mood estimation from text in a supervised classification setting. The results of this approach will serve as a baseline for comparison when we subsequently look for lighter-weight alternatives to computationally expensive feature selection and classifiers.

Denote by $\mathcal{B}$ the corpus of all blog posts and by $\mathcal{M} = \{\text{sad}, \text{happy}, \ldots\}$ the set of all mood categories. In a standard feature-selection setting, each blog post $d \in \mathcal{B}$ is also labelled with a mood category $c_d \in \mathcal{M}$, and the objective is to extract from $d$ a feature vector $x^{(d)}$ that is as discriminative as possible for $d$ to be classified as $c_d$. For example, if we further denote by $V = \{v_1, \ldots, v_{|V|}\}$ the set of all terms, then the feature vector $x^{(d)} = [\ldots, x_i^{(d)}, \ldots]$ might take a simple counting form, with its $i$-th component $x_i^{(d)}$ representing the number of times the term $v_i$ appears in document $d$, a scheme widely known as the bag-of-words representation.

The problem of generic text document classification has been investigated extensively in the text-mining domain. It is generally agreed among researchers that, despite the strong independence assumption among features conditional on the class, the simple naive Bayes (NB) classifier remains the state-of-the-art for this task: a fact that is verified by the experimental results for the problem of mood classification presented in this paper. However, different feature-selection methods are found to influence classification performance greatly [40]. Taking the view that the works of Yang and Pedersen [40] and Sebastiani [34] represent state-of-the-art results in feature selection for generic text-categorisation tasks, we question whether the findings in those works hold for our mood classification problem. We shall briefly describe commonly used feature-selection schemes, including those in [40, 34].
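As an illustration of the bag-of-words representation defined above, the following is a minimal sketch in Python. The tokeniser, the function names and the two toy posts are assumptions made for the example; the paper does not specify its tokenisation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep runs of letters/apostrophes (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(posts):
    """Return V as a sorted list of every term seen in the corpus."""
    return sorted({term for post in posts for term in tokenize(post)})


def bag_of_words(post, vocabulary):
    """Return x^(d): entry i counts how often vocabulary[i] occurs in the post."""
    counts = Counter(tokenize(post))
    return [counts[term] for term in vocabulary]


# Two hypothetical blog posts standing in for the corpus B.
posts = ["so happy happy today", "feeling sad today"]
V = build_vocabulary(posts)
X = [bag_of_words(p, V) for p in posts]
```

Here `V` plays the role of the term set and each row of `X` is one vector x^(d); at blog scale a sparse representation would be preferable to dense lists.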
3.1 Feature selection methods
Term-based selection

These are features derived with respect to a term $v$. Two common features are term and document frequencies, where term frequency $TF(v, d)$ represents the number of times the term $v$ appears in document $d$, whereas document frequency $DF(v)$ is the number of blog posts containing the term $v$. It is also well known in text mining that $TF.IDF(v, d)$ weighting can improve discriminative power, where $TF.IDF(v, d) = TF(v, d) \times IDF(v)$ and $IDF(v) = |D| / DF(v)$ is the inverse document frequency, with $|D|$ the number of documents. In this work, a term $v$ will be selected if it has a high $DF(v)$ value, or high average values of $TF(v, d)$ or $TF.IDF(v, d)$ across all documents $d$, over a threshold.
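The term-based scores above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy corpus and the thresholds are assumptions, and IDF is computed as |D|/DF(v) exactly as written in the text (many systems use a logarithm instead).

```python
def term_frequency(term, doc_tokens):
    """TF(v, d): number of times the term occurs in one document."""
    return doc_tokens.count(term)


def document_frequency(term, corpus):
    """DF(v): number of documents that contain the term."""
    return sum(1 for doc in corpus if term in doc)


def tf_idf(term, doc_tokens, corpus):
    """TF.IDF(v, d) = TF(v, d) * |D| / DF(v), following the definition in the text."""
    df = document_frequency(term, corpus)
    if df == 0:
        return 0.0
    return term_frequency(term, doc_tokens) * len(corpus) / df


def mean_tf_idf(term, corpus):
    """Average TF.IDF of a term across all documents."""
    return sum(tf_idf(term, doc, corpus) for doc in corpus) / len(corpus)


def select_terms(corpus, score, threshold):
    """Keep every vocabulary term whose score exceeds the threshold."""
    vocabulary = {term for doc in corpus for term in doc}
    return sorted(term for term in vocabulary if score(term) > threshold)


# Hypothetical tokenised corpus of three posts.
corpus = [["happy", "happy", "day"], ["sad", "day"], ["day", "rain"]]
by_df = select_terms(corpus, lambda v: document_frequency(v, corpus), 2)
by_tfidf = select_terms(corpus, lambda v: mean_tf_idf(v, corpus), 1.5)
```

With this toy corpus, thresholding on DF keeps only the ubiquitous term, whereas thresholding on average TF.IDF keeps the term that is frequent within a single post, which shows why the two criteria can select quite different vocabularies.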
Term-class interaction-based selection

The essence of these methods is to capture the dependence between terms and corresponding class labels during the feature selection process, and to capture the relevance of terms. Three common selection methods in this category are information gain $IG(v)$, mutual information $MI(v, l)$ and the $\chi^2$-statistic $CHI(v, l)$ [40]. $IG(v)$ captures the information gain (measured in bits) when a term $v$ is present or absent; $MI(v, l)$ measures the mutual information between a term $v$ and a class label $l$; and lastly $CHI(v, l)$ measures the dependence between a term and a class label by comparing against the $\chi^2$ distribution with one degree of freedom.
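The three term-class scores can be sketched from document counts as below. The estimators follow the count-based forms commonly attributed to Yang and Pedersen [40]; it is an assumption that the paper uses exactly these forms, and the function names and toy labelled documents are hypothetical.

```python
import math
from collections import Counter


def contingency(docs, term, label):
    """docs: list of (set_of_terms, class_label). Returns the counts A, B, C, D."""
    a = sum(1 for terms, l in docs if term in terms and l == label)      # term, class
    b = sum(1 for terms, l in docs if term in terms and l != label)      # term, other class
    c = sum(1 for terms, l in docs if term not in terms and l == label)  # no term, class
    d = len(docs) - a - b - c                                            # neither
    return a, b, c, d


def chi_square(docs, term, label):
    """CHI(v, l) = N (AD - CB)^2 / ((A+C)(B+D)(A+B)(C+D))."""
    a, b, c, d = contingency(docs, term, label)
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0


def mutual_information(docs, term, label):
    """MI(v, l) estimated as log2(A N / ((A+C)(A+B)))."""
    a, b, c, d = contingency(docs, term, label)
    if a == 0:
        return float("-inf")
    return math.log2(a * (a + b + c + d) / ((a + c) * (a + b)))


def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values()) if n else 0.0


def information_gain(docs, term):
    """IG(v): reduction in class entropy from knowing whether the term is present."""
    present = [l for terms, l in docs if term in terms]
    absent = [l for terms, l in docs if term not in terms]
    n = len(docs)
    return (entropy([l for _, l in docs])
            - len(present) / n * entropy(present)
            - len(absent) / n * entropy(absent))


# Hypothetical labelled posts, each reduced to its set of terms.
docs = [({"glad", "day"}, "happy"), ({"glad", "sun"}, "happy"),
        ({"cry", "day"}, "sad"), ({"cry", "rain"}, "sad")]
```

In this toy corpus `glad` perfectly separates the two moods and scores highly on all three measures, while `day` occurs equally in both and scores zero on IG and CHI. A CHI value can be compared against a tabulated critical value of the one-degree-of-freedom distribution (3.841 at the 0.05 level).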
It has been shown that linguistic components, such as use of adverbs, adjectives or verbs, can be a strong indicator of mood [28]. Therefore we also apply the above feature selection methods to subsets of the raw unigram text classed by part-of-speech. We use the SS-Tagger [39], ported to the Antelope NLP framework,⁴ to pre-process blog post text, and tag verbs, adjectives and adverbs.
⁴ www.proxem.com

Category            Code      Example   %

Linguistic processes
Word count          wc                  227
Words/sentence      wps                 13.3
Dictionary words    dic                 83.4
Words>6 letters     sixltr              12.4
Function words      funct     Since     53.4
Total pronouns      pronoun   Them      17.2
Personal pron.      ppron     I         11.9
1st pers singular   i         Me        7.9
1st pers plural     we        We        0.7
2nd person          you       You       1.6
3rd pers singular   shehe     She       1.2
3rd pers plural     they      They      0.4
Impers. pron.       ipron     It        5.3
Articles            article   The       4.5
Common verbs        verb      Walk      15.8
Auxiliary verbs     auxverb   Am        9.3
Past tense          past      Went      4.0
Present tense       present   Is        9.8
Future tense        future    Will      1.0
Adverbs             adverb    Very      5.8
Prepositions        prep      To        10.6
Conjunctions        conj      And       6.3
Negations           negate    No        1.9
Quantifiers         quant     Few       2.5
Numbers             number    Once      0.7
Swear words         swear     Damn      0.7

Psychological processes
Social              social    Mate      8.4
Family              family    Son       0.4
Friends             friend    Buddy     0.3
Humans              human     Adult     0.8
Affective           affect    Happy     7.3
Positive            posemo    Love      4.6
Negative            negemo    Hurt      2.7
Anxiety             anx       Nervous   0.3
Anger               anger     Hate      1.2
Sadness             sad       Crying    0.5
Cognitive           cogmech   Cause     15.3
Insight             insight   Think     1.9
Causation           cause     Hence     1.3
Discrepancy         discrep   Should    1.5
Tentative           tentat    Maybe     2.5
Certainty           certain   Always    1.3
Inhibition          inhib     Block     0.4
Inclusive           incl      And       4.5
Exclusive           excl      But       2.8
Perceptual          percept   Heard     2.3
See                 see       View      0.9
Hear                hear      Listen    0.6
Feel                feel      Touch     0.7
Biological          bio       Eat       2.6
Body                body      Cheek     0.9
Health              health    Flu       0.6
Sexual              sexual    Horny     0.8
Ingestion           ingest    Dish      0.4
Relativity          relativ   Bend      13.7
Motion              motion    Car       2.1
Space               space     Down      4.9
Time                time      Until     6.3

Personal concerns
Work                work      Job       1.6
Achieve             achieve   Hero      1.2
Leisure             leisure   Chat      1.5
Home                home      Family    0.5
Money               money     Cash      0.5
Religion            relig     God       0.4
Death               death     Bury      0.2

Spoken
Assent              assent    OK        1.2
Nonfluency          nonflu    Hm        0.5
Fillers             filler    Blah      0.1

Table 1: Language groups categorised by LIWC over the corpus of approximately 18 million blog posts studied in this paper. The numbers are the means of the percentages of the features' words used in a blog post, except for wc (the mean of the number of words in a post) and wps (the mean of the number of words per sentence).

Affective Norms for English Words (ANEW) lexicon

A text representation well suited to inferring mood comes by way of mood-valued lexicons, such as Affective Norms for English Words (ANEW) [6]. ANEW is a set of 1034 sentiment-conveying words created by the National Institute of Mental Health of the United States to serve as a standard for studies in cognition and emotion. Words are rated in terms of normalized scores using a well-known mood model consisting of the triple: valence, arousal, and dominance. The valence values in ANEW range from 1.25 (suicide) to 8.82 (triumphant); arousal ranges between 2.39 (relaxed) and 8.17 (rage); and dominance ranges between 2.27 (helpless) and 7.88 (leader). The median values for these dimensions are 5.29 (patient), 5.06 (sunrise), and 4.12 (knife), respectively. For each blog post, we construct a feature vector that contains counts for each ANEW word. Because only a fraction of the 1034 ANEW words appears in any given blog post (typically between 5 and 20 words), the resulting feature vectors are sparse.
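A minimal sketch of the sparse ANEW count vector described above is given below. The nine-word lexicon contains only the ANEW words quoted in the text (the real lexicon has 1034 entries), and the tokeniser, function names and example post are assumptions for illustration.

```python
import re
from collections import Counter

# Nine ANEW words quoted in the text; a stand-in for the full 1034-word lexicon.
ANEW_SAMPLE = ["helpless", "knife", "leader", "patient", "rage",
               "relaxed", "suicide", "sunrise", "triumphant"]


def anew_counts(post, lexicon=ANEW_SAMPLE):
    """Sparse feature vector: {index into lexicon: count}, nonzero entries only."""
    index = {word: i for i, word in enumerate(lexicon)}
    tokens = re.findall(r"[a-z]+", post.lower())
    counts = Counter(t for t in tokens if t in index)
    return {index[word]: n for word, n in counts.items()}


def to_dense(sparse, size):
    """Expand the sparse vector to a dense list of the given length."""
    return [sparse.get(i, 0) for i in range(size)]


# Hypothetical blog post.
post = "Woke at sunrise feeling relaxed, so relaxed."
sparse = anew_counts(post)
dense = to_dense(sparse, len(ANEW_SAMPLE))
```

Storing only the nonzero entries keeps the per-post cost proportional to the handful of lexicon words a post actually contains, and no labelled data is needed to build the feature set.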
Psycholinguistic features (LIWC)

Another powerful set of features used in this paper is psycholinguistic, drawn from the LIWC package [31]. The LIWC package assigns English words to one of four high-level categories: linguistic processes, psychological processes, personal concerns and spoken categories, which are further sub-divided into a three-level hierarchy.⁵ The taxonomy ranges across topic (for example, religion and health), mood (for example, positive emotion) and processes not captured by either, such as cognition (for example, causation and discrepancy) and tense. Tausczik and Pennebaker [37] survey the use of the LIWC package, based on the social and psychological meaning of words, across different research areas in sociology and psychology, including status, dominance and social hierarchy, honesty and deception, thinking styles and individual differences.

According to [30], weblogs are particularly suitable for sentiment analysis since they contain a high rate of words in affective processes. For the dataset of approximately 18 million blog posts used in this paper we compute the mean for each of the LIWC groups and present them in Table 1. As can be seen, the percentage of words in affective processes in the corpus (7.3 per cent) is larger than that of any class of text reported in [30], even the emotional writing class (6.02 per cent). This enables sentiment analysis for social media-derived corpora.
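The per-post LIWC percentages summarised in Table 1 can be sketched as follows. The three-category dictionary is a toy mapping built from example words in Table 1; it is not the actual LIWC dictionary, which is distributed with the LIWC package, and the tokeniser and function name are assumptions.

```python
import re

# Toy category dictionary (hypothetical); the real LIWC dictionary is far larger.
TOY_LIWC = {
    "posemo": {"love"},
    "negemo": {"hurt"},
    "anger": {"hate"},
}


def liwc_features(post, dictionary=TOY_LIWC):
    """Return the word count plus, per category, the percentage of words matched."""
    tokens = re.findall(r"[a-z']+", post.lower())
    wc = len(tokens)
    features = {"wc": wc}
    for category, words in dictionary.items():
        hits = sum(1 for t in tokens if t in words)
        features[category] = 100.0 * hits / wc if wc else 0.0
    return features


# Hypothetical blog post of eight words.
features = liwc_features("I love this song but my feet hurt")
```

Averaging such per-post percentages over a corpus gives figures of the kind reported in Table 1; like the ANEW counts, they require no supervised feature-selection stage.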
3.2 Mood Classification

We use the IR05 and WSM09 datasets for the task of mood classification. Both consist of data crawled from Livejournal, a weblog hosting site. Livejournal allows people to tag their mood when they are blogging, thus providing an excellent source of ground-truth data for sentiment analysis. The host provides a comprehensive set of 132 moods for users to specify their current emotion at the time of blogging. The provided moods range widely across the emotion spectrum, for example, cheerful and grateful for happiness, or discontent and uncomfortable for sadness.

The IR05 dataset

The IR05 dataset, introduced by Mishne [20], contains 815,494 blog posts from Livejournal. This dataset can be considered the first Livejournal corpus created for the purpose of mood classification. Mishne [20] performs emotion analysis on this blog post corpus. He uses a number of feature sets such as frequency counts, lengths, sentiment orientations, emphasized words and special symbols as input to an SVM classifier. The classification accuracy is modest, being slightly above baseline. Of the total, 535,844 posts are tagged with predefined moods. We disregard the posts annotated with non-predefined moods.
The WSM09 dataset

The WSM09 dataset was provided by Spinn3r (spinn3r.com) as the benchmark dataset for the ICWSM 2009 conference.⁶ It contains 44 million blog posts crawled between August and October 2008. A subset of this dataset was extracted for this paper, consisting of only blog posts from Livejournal, which include the mood ground truth entered by the user when the post was composed. Again, only the moods predefined by Livejournal are considered and all others discarded, resulting in approximately 600,000 blog posts.⁷

The WSM09 dataset is used in [36] for a two-step data-mining application. At first, all moods are categorised into three classes: happy, sad and angry, using K-means clustering. Blog posts in these three groups are then subjected to a Naive Bayes classifier. The feature set considered in this task consists of unigrams, bigrams, stems, emotion, emoticons and slang. The highest recall, precision and F-measure are 67.1%, 65% and 66.1% respectively. In order to compare our results with those of Sood and Vasserman [36], we restrict classification for this experiment to three popular moods {sad, happy, angry}. The complete set of 132 moods employed by Livejournal will be considered in the following section.

⁵ http://www.liwc.net/descriptiontable1.php accessed July 2011.
⁶ http://www.icwsm.org/2009/data/, retrieved November 2011.
⁷ Consistent with what is reported in [36].

Order  Selection method  Linguistic subset  Accuracy  F-score  Classifier
1      IG                Unigram            0.779     0.772    SVM
2      TF                Unigram            0.764     0.757    SVM
3      DF                Unigram            0.763     0.756    SVM
4      IG                AdjVbAdv           0.752     0.742    SVM
5      TF                AdjVbAdv           0.743     0.732    SVM
6      TF.IDF            AdjVbAdv           0.743     0.732    SVM
7      DF                AdjVbAdv           0.742     0.731    SVM
8      LIWC              -                  0.744     0.730    SVM
9      TF.IDF            Unigram            0.726     0.709    SVM
10     ANEW              -                  0.704     0.681    SVM
11     IG                Verb               0.698     0.678    SVM
12     TF.IDF            Verb               0.696     0.676    SVM
13     DF                Verb               0.696     0.676    SVM
14     TF                Verb               0.694     0.674    SVM
15     TF.IDF            Adjective          0.687     0.658    SVM
16     IG                Adjective          0.685     0.655    SVM
17     TF                Adjective          0.683     0.653    SVM
18     DF                Adjective          0.682     0.653    SVM
19     MI                AdjVbAdv           0.624     0.575    SVM
20     MI                Adjective          0.570     0.570    Naive Bayes
21     TF.IDF            Adverb             0.607     0.569    Naive Bayes
22     IG                Adverb             0.606     0.568    Naive Bayes
23     DF                Adverb             0.606     0.568    Naive Bayes
24     TF                Adverb             0.606     0.568    Naive Bayes
25     MI                Adverb             0.605     0.567    Naive Bayes
26     MI                Verb               0.617     0.560    SVM
27     CHI               AdjVbAdv           0.601     0.555    Naive Bayes
28     CHI               Adjective          0.601     0.540    Naive Bayes
29     CHI               Verb               0.589     0.533    Naive Bayes
30     CHI               Unigram            0.580     0.515    Naive Bayes
31     MI                Unigram            0.522     0.509    Naive Bayes
32     CHI               Adverb             0.561     0.445    Naive Bayes

Table 2: Mood classification results for different feature selection schemes and for different feature subsets in the WSM09 dataset, sorted in descending order of F-score. Both SVM and Naive Bayes classifiers are run on 32 sub-corpora but only the better results are reported.
Order  Selection method  Linguistic subset  Accuracy  F-score  Classifier
1      TF                Unigram            0.771     0.761    SVM
2      DF                Unigram            0.771     0.761    SVM
3      TF                AdjVbAdv           0.752     0.738    SVM
4      DF                AdjVbAdv           0.750     0.736    SVM
5      TF.IDF            AdjVbAdv           0.749     0.734    SVM
6      TF.IDF            Unigram            0.740     0.721    SVM
7      ANEW              -                  0.727     0.699    SVM
8      LIWC              -                  0.732     0.691    SVM
9      TF                Verb               0.709     0.680    SVM
10     DF                Verb               0.707     0.680    SVM
11     TF.IDF            Verb               0.707     0.678    SVM
12     TF                Adjective          0.711     0.673    SVM
13     TF.IDF            Adjective          0.692     0.673    Naive Bayes
14     DF                Adjective          0.692     0.673    Naive Bayes
15     CHI               Unigram            0.669     0.597    Naive Bayes
16     IG                Unigram            0.669     0.597    Naive Bayes
17     MI                Unigram            0.669     0.597    Naive Bayes
18     CHI               Adjective          0.667     0.594    Naive Bayes
19     IG                Adjective          0.667     0.594    Naive Bayes
20     MI                Adjective          0.667     0.594    Naive Bayes
21     CHI               AdjVbAdv           0.657     0.591    Naive Bayes
22     IG                AdjVbAdv           0.657     0.591    Naive Bayes
23     MI                AdjVbAdv           0.657     0.591    Naive Bayes
24     DF                Adverb             0.638     0.586    Naive Bayes
25     TF                Adverb             0.637     0.584    Naive Bayes
26     TF.IDF            Adverb             0.637     0.584    Naive Bayes
27     CHI               Adverb             0.633     0.563    Naive Bayes
28     IG                Adverb             0.633     0.563    Naive Bayes
29     MI                Adverb             0.633     0.563    Naive Bayes
30     CHI               Verb               0.637     0.555    Naive Bayes
31     IG                Verb               0.637     0.555    Naive Bayes
32     MI                Verb               0.637     0.555    Naive Bayes

Table 3: Mood classification results for different feature selection schemes and for different feature subsets in the IR05 dataset, sorted in descending order of F-score. Both SVM and Naive Bayes classifiers are run on 32 sub-corpora but only the better results are reported.
For the classification method we experimented with two popular classifiers implemented in the Weka package [12]: Support Vector Machine (SVM) and Naive Bayes classifiers. For each run we use ten-fold cross-validation, repeat 10 runs, and report the average result. To evaluate the results we report two commonly used measures: accuracy and F-score.
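As an illustration of the evaluation protocol just described, the following is a minimal Python sketch of repeated ten-fold splitting together with the two reported measures. It is not the Weka pipeline used in the experiments: the function names are hypothetical, and the macro-averaged form of the F-score is an assumption, since the averaging scheme is not specified here.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of documents whose predicted mood equals the tagged mood."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f_score(y_true, y_pred):
    """Unweighted mean of per-class F1 (the averaging scheme is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def repeated_kfold(n_items, k=10, repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs for `repeats` shuffled k-fold splits."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_items))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]  # k disjoint folds
        for i in range(k):
            test = folds[i]
            train = [j for f in folds[:i] + folds[i + 1:] for j in f]
            yield train, test
```

A classifier would be trained on each `train` index set and scored on the matching `test` set, with the 100 resulting scores averaged.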
Effect of feature selection schemes and linguistic components. Three term weighting-based (TF, DF, TF.IDF) and three term-class interaction-based (IG, MI, CHI) selection methods are used in this experiment. Feature selection is applied using all terms (unigrams), and using subsets of terms performing a particular part-of-speech (adjectives, verbs, adverbs, or in combination). This leads to 30 parameter combinations: (term-based(3) + term-class interaction(3)) × part of speech(5). In addition, we use the 1,034 ANEW words and 68 categories returned from the LIWC
Fig. 1: Performances of binary (presence) versus counting (frequency) features for mood classification for the WSM09 dataset: (a) Naive Bayes classifier; (b) SVM classifier. The performance is measured in F-score (%) and sorted in descending order of the difference between the two performances.
package as separate feature vectors. For comparison, the number of features in all other cases equals 1,034, which is the number of ANEW words. We experiment with all possible combinations of feature selection methods and different linguistic subsets and report the top results in Table 3. With respect to feature-selection schemes, IG is observed to be the best selection scheme. Other term-class interaction-based methods do not perform well; noticeably, mutual information (MI) does not appear in any of the top ten results. These observations are consistent with results for text categorization in [40]. In contrast to their
Fig. 2: Performances of binary (presence) versus counting (frequency) features for mood classification for the IR05 dataset: (a) Naive Bayes classifier; (b) SVM classifier. The performance is measured in F-score (%) and sorted in descending order of the difference between the two performances.
findings, we find CHI does not perform well for the mood classification task, absent as it is from either of the top ten result lists. Surprisingly, both TF and DF perform better than TF.IDF in unigram cases. The opposite is true for generic text mining, where TF.IDF is often superior, albeit more computationally expensive. Thus TF or DF are recommended, in conjunction with IG, for good performance in the trade-off with computational cost. With respect to the effect of linguistic components (which are not tested in [16] or [36]), a combination of adjectives, verbs and adverbs (AdjVbAdv) dominates the top results and gives performance very close to that achieved when using all terms. Using verbs or adjectives alone also produces a good performance. The performance of the selected feature-selection schemes is also acceptable across the two datasets, as can be seen in Tables 2 and 3 for the WSM09 and IR05 datasets respectively. The best results stand at 76.1% F-score for IR05 and 77.2% F-score for WSM09. Although the Naive Bayes classifier outperforms SVM in a majority of feature-selection schemes for the IR05 dataset, all top results in both datasets are returned from SVM.
Performance of LIWC. In comparison with more than 30 combinations of selection methods and feature spaces, though without the need for a supervised feature-selection stage, the result of LIWC features is found to be very encouraging, appearing among the top results across the two datasets. This result reveals differences in the use of the psycholinguistic features among posts tagged with different moods.
Performance of ANEW. The results of classification based on the ANEW lexicon alone are encouraging, appearing in the top ten results for both datasets. Performance is consistent across datasets at approximately 70% F-score.
Term presence vs. term frequency. It is well known in text mining that the bag-of-words counting representation (that is, count the number of appearances of a term) is an effective feature. However, in sentiment analysis, it has been found that a simple binary representation (that is, use 1 if the term appears in the document and 0 otherwise) is more effective at movie review classification [29]. Further, a binary feature representation can be better compressed, making it suitable for dealing with large datasets. Therefore, we are motivated to investigate whether a binary representation is effective for mood classification. Figures 1 and 2 show the results for the WSM09 and IR05 datasets respectively, when the classification was performed on the binary and counting features. For each dataset, the classification F-score is plotted for both types of representation, displaying the top results in an increasing order of performance. The results reveal that binary representation outperforms its counterpart for the four top performances in both datasets. This result again confirms the superiority of binary representation over counting for mood classification, suggesting that recording the appearance of a term in a document is sufficient and recommended for mood classification tasks.
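The difference between the two representations compared above can be made concrete with a small Python sketch. The helper names and the toy vocabulary are hypothetical and serve only to illustrate the two encodings.

```python
from collections import Counter


def count_features(tokens, vocabulary):
    """Counting representation: number of appearances of each vocabulary term."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def presence_features(tokens, vocabulary):
    """Binary representation: 1 if the term appears in the document, else 0."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]


vocabulary = ["happy", "sad", "love"]      # toy stand-in for a selected feature set
tokens = "so happy happy love".split()
counts = count_features(tokens, vocabulary)
presence = presence_features(tokens, vocabulary)
```

The binary vector discards how often a term occurs and keeps only whether it occurs, which is the information the experiments above suggest is sufficient for mood classification.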
4 Exploratory Mood Patterns

Much of the existing work aimed at inferring mood from text has been framed as supervised classification, but the computational costs of these approaches are prohibitive. The Livejournal corpus introduced by [16] is a case in point: it contains 18 million posts, many of which are groundtruthed with one of 132 predefined mood labels.⁸ A cloud visualisation of moods tagged in this dataset is shown in Figure 4. Leshed and Kaye [16] perform emotion classification on fifty of the most frequent moods appearing in the corpus. They use the TF.IDF feature-selection method to select only the first 5,000 features. Blog entries are represented in the 'bag-of-words' model of information retrieval, and subjected to an SVM classifier. The average accuracy of the system is reported to be 78% [16]. To apply the feature selection schemes presented in section 3.1 to this corpus would be expensive; for example, computing $MI(v, l)$ for each pair (term, mood) has a computational complexity of $O(|M| \times |V|)$, where $|M| = 132$ and $|V|$ is the number of unique terms, which could be on the order of hundreds of thousands. And this corpus is a fraction of the existing social media corpus, which continues to grow at an accelerating speed.

8 For the full list of pre-defined moods, visit http://www.livejournal.com

Fig. 3: Valence and arousal values of 1,034 ANEW words on the affect circle. (Horizontal axis: valence, from displeasure to pleasure; vertical axis: arousal, from deactivation to activation.)

Fig. 4: Visualization of the distribution of 132 predefined mood labels from approximately 18 million LiveJournal blog posts.

Fig. 5: Examples of the use of ANEW words in happy and sad blog posts. (a) ANEW words used in a post tagged with mood happy; the average valence for this post is 7.75. (b) ANEW words used in a post tagged with mood sad; the average valence for this post is 3.12. The colour reflects the valence and arousal values an ANEW word conveys, as shown in Figure 3.

We investigate the possibility of unearthing intrinsic patterns of mood using unsupervised approaches. We take as a promising base for this analysis the ANEW feature vectors used in section 3.2, which gave classification performance comparable with those employing the expensive feature selection schemes. Figure 5 depicts the relationship we hope to exploit between blog posts with groundtruth mood that also use words in the ANEW lexicon in their textual content.

For the Livejournal corpus we make the initial observation that blog posts tagged with moods in the same emotion pattern have similar proportions of use of words in the ANEW lexicon. For example, see Figure 6, which plots a sample of ANEW words having arousal in the range of 7.2 to 8.2 against proportion of ANEW words in blog posts tagged with Livejournal moods of similar sentiment, one of either happy/cheerful or angry/p*ssed off. We can see clearly that the words anger, enraged, and rage are most likely to be found in posts labelled angry or p*ssed off, and least likely in those
Cluster  Exemplar     Members
1        CHEERFUL     ecstatic, jubilant, giddy, happy, excited, energetic, bouncy, chipper
2        PENSIVE      determined, contemplative, thoughtful
3        REJUVENATED  optimistic, relieved, refreshed, hopeful, peaceful
4        QUIXOTIC     surprised, enthralled, devious, geeky, creative, recumbent, artistic, impressed, amused, complacent, curious, weird
5        CRAZY        horny, giggly, high, flirty, hyper, drunk, naughty, dorky, ditzy, silly
6        MELLOW       pleased, satisfied, relaxed, content, anxious, good, full, calm, okay
7        GRATEFUL     loved, thankful, touched
8        AGGRAVATED   irritated, bitchy, annoyed, frustrated, cynical
9        ANGRY        p*ssed off, infuriated, irate, enraged
10       GLOOMY       jealous, envious, rejected, confused, worried, lonely, guilty, scared, pessimistic, discontent, distressed, indescribable, crushed, depressed, melancholy, numb, morose, sad, sympathetic
11       PRODUCTIVE   accomplished, working, nervous, busy, rushed
12       TIRED        sore, lazy, sleepy, awake, groggy, exhausted, lethargic, drained
13       NAUSEATED    sick
14       MOODY        disappointed, grumpy, cranky, stressed, uncomfortable, crappy
15       THIRSTY      nerdy, mischievous, hungry, dirty, hot, cold, bored, blah
16       EXANIMATE    intimidated, predatory, embarrassed, restless, nostalgic, indifferent, listless, apathetic, blank, shocked

Table 4: Livejournal moods clustered by similarity of ANEW word use.

Fig. 6: ANEW usage proportion in the posts tagged with happy/cheerful and angry/p*ssed off. (Horizontal axis: ANEW words; vertical axis: proportion; one series for each of angry, p*ssed off, cheerful and happy.)

Fig. 7: Mood patterns on the affect circle.
labelled happy or cheerful. In contrast, the words romantic and surprised are not often used in posts labelled angry or p*ssed off, but are used in posts labelled happy or cheerful. This congruence between the use of words in the ANEW lexicon and the attached mood groundtruth supports the idea of using the ANEW feature to discover the implicit structure of mood in textual social media. Following on from this observation, we next cluster Livejournal moods based on how users who tag blog posts with a given mood use words from the ANEW lexicon.
Let us denote by $B$ the corpus of all blog posts, and by $M = \{\textit{sad}, \textit{happy}, \ldots\}$ the predefined set of moods ($|M| = 132$). Each blog post $b \in B$ in the corpus is labeled with a mood $l_b \in M$. Denote by $n$ the number of ANEW words ($n = 1034$). Let $x^m = [x^m_1, \ldots, x^m_i, \ldots, x^m_n]$ be the vector representing the usage of ANEW words by mood $m$. Thus, $x^m_i = \sum_{b \in B,\, l_b = m} c_{ib}$, where $c_{ib}$ is the count of the $i$-th ANEW word in blog post $b$ tagged with mood $m$. The usage vector is normalized so that $\sum_{i=1}^{n} x^m_i = 1$ for all $m \in M$. The vectors $x^m$ will serve as the basis for the similarity measure upon which to cluster.

Clustering is performed using the non-parametric algorithm termed Affinity Propagation (AP) [11]. AP has the desirable properties of automatically discovering the number of clusters, and cluster exemplars (instances that best represent the given cluster). AP simply requires the pairwise similarities between moods, which we compute using Euclidean distance. After running the AP algorithm, 16 clusters were detected. Table 4 lists the discovered clusters, with exemplar moods in caps, and the remaining members in lower case. The discovered clusters are intuitively coherent. Clusters 1 to 7 typically contain moods of high valence or pleasure; clusters 8 to 16 contain moods of low valence or displeasure. Figure 7 depicts the same clusters as distributed in valence-arousal space. Clusters are plotted at the average (ANEW) arousal and valence of their member moods. For those moods that do not appear in the ANEW lexicon (95 of the pre-defined 132), arousal and valence are taken from the nearest parent in Livejournal's mood hierarchy that appears in the ANEW lexicon.⁹
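The construction of the per-mood usage vectors and the pairwise similarities supplied to AP can be sketched in Python as follows. This is a minimal illustration rather than the code used for the experiments: the function names are hypothetical, posts are assumed to arrive as (mood, token list) pairs, and negating the Euclidean distance so that larger values mean more similar moods is an assumption about how distance is turned into a similarity. AP itself is not implemented here.

```python
import math
from collections import defaultdict


def mood_usage_vectors(posts, anew_words):
    """Build normalized ANEW usage vectors, one per mood.

    posts: iterable of (mood, tokens) pairs; anew_words: list of lexicon words.
    """
    index = {word: i for i, word in enumerate(anew_words)}
    totals = defaultdict(lambda: [0.0] * len(anew_words))
    for mood, tokens in posts:
        vec = totals[mood]
        for tok in tokens:
            i = index.get(tok)
            if i is not None:          # only ANEW words are counted
                vec[i] += 1.0
    usage = {}
    for mood, vec in totals.items():
        total = sum(vec)
        if total > 0:                  # moods with no ANEW words are skipped
            usage[mood] = [v / total for v in vec]
    return usage


def similarity_matrix(usage):
    """Pairwise similarity as negative Euclidean distance (an assumption)."""
    moods = sorted(usage)
    return {(a, b): -math.dist(usage[a], usage[b]) for a in moods for b in moods}
```

A matrix of this form could then be handed to any AP implementation that accepts precomputed similarities.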
The above data-driven affect analysis is, to the authors' knowledge, the first of its kind. The coherence of the discovered implicit mood structures validates the use of ANEW-based features for the purpose of inferring mood in unlabelled text. Such a mood sensor is relatively inexpensive to apply, and has a potential role in many applications including text surveillance, mood-related search facets, characterization and recommendation of media and communities, and ethnographic study, to name a few.

9 http://www.livejournal.com/moodlist.bml accessed July 2011.

Fig. 8: Profiles of six Livejournal communities: Cooking and Bentolunch consider similar topics; as do Bookish and 50bookchallenge; and Pokemon and Pkmncollectors.
5 Mood as an Index for Users and Communities

Having in the previous section obtained a robust and effective sensor for mood from text, we now seek to apply it to finer-grained problems. In particular, we will investigate the use of mood to detect hyper-communities, and to characterize users, discussions, and communities.
5.1 Community Representation

While there has been extensive work on characterizing collections of text by topic, including blog sub-communities [1] and tagged media [24], the same has not been attempted with mood. But mood is often an integral feature of a text, particularly for social media forums; two communities might discuss precisely the same topics, yet within an entirely different atmosphere. For example, one forum might host conversations about politics in a cerebral, serious-minded, and friendly fashion, while another will discuss
Fig. 9: Topic proportions of 100 communities. (Axes: topics versus communities.)

Category             Communities
Advice-Support       add_a_writer, addme25_and_up, baristas, boys_and_girls, i_am_thankful, iworkatborders, thenicestthings, todayirealized, walmart_employe, weddingplans
Creative-Expression  __quotexwhore, 20sknitters, adayinmylife, aesthetes, amateur_artists, behind_the_lens, charloft, color_theory, naturesbeauty, sew_hip
Entertainment-Music  beatlepics, bjorkish, broadway, just_good_music, news_jpop, patd, relaxmusic, rilokiley, theater_icons, thecure
Fandom               chuunin, house_cameron, miracle______, ncisficfind, patdslashseek, rpattz_kstew, sgagenrefinders, sgastoryfinders, sheldon_penny, time_and_chips
Fashion-Style        beauty101, corsetmakers, curlyhair, dyed_hair, egl, egl_comm_sales, madradstalkers, ourbedrooms, ru_glamour, vintagehair
Food-Travel          bentolunch, davis_square, eurotravel, filmsg, newyorkers, ofmornings, picturing_food, seattle, trashy_eats, world_tourist
Gaming-Technology    computer_help, computerhelp, gamers, htmlhelp, ipod, macintosh, thesims2, webdesign, worldofwarcraft, wow_ladies
Parenting-Pets       altparent, baby_names, breastfeeding, cat_lovers, clucky, dog_lovers, dogsintraining, naturalbirth, note_to_cat, parenting101
Politics-Culture     birls, blackfolk, classics, ftm, nonfluffypagans, ontd_political, poor_skills, prolife, queer_rage, transnews
Television           _we_are_lost, battlestar_blog, calmallamadown, doctorwho, glee_tv, gleeclub, gundam00, lword, theoffice_us, topmodel

Table 5: Communities from ten Livejournal directory categories used for experiments.
the same issues adversarially, with zest and tolerance of profanity. Such mood-related distinctions are important for many kinds of analysis and application, not the least of which is community recommendation. Online communities come in many shapes and sizes, and are affected by many factors, including the demographics of their members, reason for existence, and facilities afforded by the hosting application. The Livejournal blog site referred to in the previous sections includes a community feature. Each community is defined by the scope of topics
Fig. 10: Above: Topic proportions of 10 communities (dog_lovers, dogsintraining, cat_lovers, note_to_cat, computer_help, computerhelp, ipod, macintosh, htmlhelp, webdesign). Below: Example topics (9, 16, 20, 23 and 29) and most likely words sized by p(word | topic).
it aims to host, and comprises among other things members, posts and comments made in response to posts. Figure 8 shows example profiles for six Livejournal communities.
Hyper-community detection aims to group communities that are somehow related. We investigate the usefulness of including mood in this clustering task.
Fig. 11: Communities and the mood usage. (a) The mood usage proportions of 100 communities used in hyper-community detection. (b) An illustration of mood usage proportions in two groups of communities: {computer_help, computerhelp, htmlhelp, ipod, webdesign} and {ncisficfind, sgagenrefinders, sgastoryfinders}.
Topic-based Representation of Communities. To represent what community members talk about, we apply Latent Dirichlet Allocation (LDA) [4], a Bayesian probabilistic topic model, to the blog post corpus. All posts for each community are aggregated to form the corpus input to LDA, wherein each post is considered as one document. LDA learns the probabilities $p(\text{vocabulary} \mid \text{topic})$, which are used to describe a topic, and assigns a topic to each word in every document. Each post can then be represented as a mixture of topics using the probability $p(\text{topic} \mid \text{document})$. Intuitively, we expect similar communities to discuss a similar mix of topics, and hence have similar mixtures of $p(\text{topic} \mid \text{document})$ aggregated from their posts. Formally, supposing we have $J$ communities, denote by $x_j = \{x_{1j}, x_{2j}, \ldots, x_{n_j j}\}$ the set of blog posts in the $j$-th community, where $n_j$ is the total number of posts by this community. Thus the corpus to be modelled consists of $N = \sum_{j=1}^{J} n_j$ documents aggregated from all communities, $D = \cup_{j=1}^{J} x_j$. Finally, if $\theta_{ij}$ denotes the topic mixture for blog post $x_{ij}$, the $j$-th community can be represented by $\theta_j = (1/n_j) \sum_{i=1}^{n_j} \theta_{ij}$. $\theta_j$ is a $K$-dimensional vector, where $K$ is the number of topics used by LDA, and the $k$-th element represents the mixture proportion of topic $k$ for community $j$. Figure 9 contains example topic mixtures for a number of communities. These mixtures are used to perform topic-based community clustering.

The topic distributions are well separated among some groups of communities. As can be seen in Figure 10, {dog_lovers, dogsintraining} could be inferred as a group of communities mainly talking about dogs (topic 9); similarly, {cat_lovers, note_to_cat} about cats (topic 20); {macintosh, computer_help, computerhelp, ipod} on computer/ipod (topic 23); and {webdesign, htmlhelp} on web design (topic 29).
LDA requires that the number of topics be specified in advance, which is difficult to determine in real-world applications. To avoid this we explore the use of the hierarchical Dirichlet process (HDP) [38], a hierarchical Bayesian nonparametric topic model. This approach automatically infers the number of latent topics. Again, clustering is performed using AP [11], as the number of clusters is not known in advance, and to obtain cluster exemplars during clustering. Similarity between communities $j$ and $l$ is calculated as the Kullback-Leibler divergence between $\theta_j$ and $\theta_l$ (as each $\theta_j$ is a proper probability mass function over topics).
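The community representation and the divergence between two communities can be sketched in Python as below. This is only an illustration under stated assumptions: the per-post topic mixtures are taken as given (no topic model is fitted), the function names are hypothetical, the small `eps` smoothing that guards against zero entries is an assumption, and negating the divergence to obtain a similarity for AP is likewise an assumption.

```python
import math


def community_topic_mixture(post_mixtures):
    """Average the per-post topic mixtures theta_ij into the community vector theta_j."""
    n_posts = len(post_mixtures)
    n_topics = len(post_mixtures[0])
    return [sum(theta[k] for theta in post_mixtures) / n_posts
            for k in range(n_topics)]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; eps avoids division by zero (an assumption)."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def topic_similarity(theta_j, theta_l):
    """Similarity for clustering: negated KL divergence (an assumption)."""
    return -kl_divergence(theta_j, theta_l)
```

Note that the Kullback-Leibler divergence is not symmetric, so `topic_similarity(a, b)` and `topic_similarity(b, a)` generally differ; AP accepts asymmetric similarities.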
Mood-based Representation of Communities. Recall that Livejournal offers 132 moods for users to tag their posts. We assume that there exists a difference in tagging moods among communities, supporting the intuition that such communities can be grouped by mood. Let $M = \{\textit{sad}, \textit{happy}, \ldots\}$ be the predefined set of moods; $|M| = 132$ is the total number of moods provided by Livejournal. Using the notation in Section 4, each blog post $x_{ij}$ in the $j$-th community is further tagged with a mood $m_{ij} \in M$. For each $j$-th community, a 132-dimension mood usage vector $m_j$ is constructed whose $k$-th element is the number of times the $k$-th mood in $M$ was tagged within this community. Each $m_j$ is then normalized to unity. Figure 11a shows the mood proportions for the 100 communities. These proportions are used to perform mood-based community clustering. Figure 11b shows a plot of the mood usage by eight different communities in Livejournal. It can be seen that the mood usage in one group of communities (computer_help, computerhelp, htmlhelp, ipod, webdesign) is rather well separated from another group (ncisficfind, sgagenrefinders, sgastoryfinders). The first group favors using moods having low valence (such as p*ssed off, worried, and confused) while the second prefers high valence moods (for instance, hopeful), empirically suggesting that it is sensible to study grouping behaviour based on mood.
Representation  No. Clusters  Purity  NMI
Topic (LDA)     20            70%     62%
Topic (HDP)     17            75%     68%
Mood            9             46%     43%
ANEW            15            63%     59%
LIWC            12            54%     51%

Table 6: Cluster purity and NMI based on different community representations.
When mood groundtruth is not available (because a mood feature is not implemented by the social media application, or it is present but not used), mood-based hyper-community detection can be performed using ANEW vectors (as described in Section 3.1). Again, vector similarity is calculated with the Bhattacharyya coefficient, and AP is used to cluster communities.
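For reference, the Bhattacharyya coefficient between two normalized usage vectors $p$ and $q$ is $\sum_i \sqrt{p_i q_i}$, which equals 1 for identical distributions and 0 for distributions with disjoint support. A minimal Python sketch follows; the function name is hypothetical and the inputs are assumed to be already normalized to sum to one.

```python
import math


def bhattacharyya_coefficient(p, q):
    """Sum over dimensions of sqrt(p_i * q_i) for two discrete distributions."""
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))


same = bhattacharyya_coefficient([0.5, 0.5], [0.5, 0.5])      # identical vectors
disjoint = bhattacharyya_coefficient([1.0, 0.0], [0.0, 1.0])  # no shared support
partial = bhattacharyya_coefficient([0.5, 0.5], [1.0, 0.0])   # partial overlap
```

Because the coefficient is already a similarity (larger means closer), it can be passed to AP without negation.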
Psycho-linguistic-based Representation of Communities. As a final point of comparison that bridges pure topical and mood-based representation of communities, we use psycho-linguistic features as classified by the Linguistic Inquiry and Word Count (LIWC) taxonomy [31]. LIWC assigns terms to one of four high level categories: linguistic processes, psychological processes, personal concerns, and spoken categories, which are further sub-divided into a three level hierarchy. The taxonomy ranges across topic (e.g., religion and health), mood (e.g., positive emotion), and processes not captured by either, such as cognition (e.g., causation and discrepancy) and tense. In all, 68 LIWC features are used to build a vector to provide a psycho-linguistic representation of each community.¹⁰
5.2 Hyper-community Detection

For experimentation, we crawled the communities listed in the Livejournal directory.¹¹ These communities are categorized into 13 groups: Advice-Support, Creative-Expression, Entertainment-Music, Everything Else, Fandom, Fashion-Style, Food-Travel, Gaming-Technology, Parenting-Pets, Politics-Culture, Sports-Fitness, Television, and Thread-based RP. From the 579 communities obtained (consisting of 1,090,408 posts and 10,081,215 comments by 182,197 members), we extracted a subset consisting of the top 100 communities, having the most members, across ten categories, resulting in a dataset of 211,740 posts by 59,496 users. Table 5 lists the communities used for the remaining experiments.

Overall clustering performance for the different community representations (topic, mood, mood-proxy, and psycho-linguistic) is shown in Table 6. We report cluster purity and Normalized Mutual Information (NMI) against the Livejournal community classification. Because this is a topical classification, these metrics are expected to be highest for the topic-based community representation (75% purity and 68% NMI) compared with the result for pure mood (46% purity and 43% NMI). We are chiefly concerned with new knowledge discovered through the use of mood-related representations, which will be analyzed in more detail below for each type of representation.
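The two reported measures can be sketched in Python as follows, treating the Livejournal directory category of each community as its class label. The function names are hypothetical. Several normalizations of mutual information are in use; dividing by the arithmetic mean of the two entropies is an assumption here, as is the convention of returning 1.0 in the degenerate single-cluster, single-class case.

```python
import math
from collections import Counter


def purity(clusters, classes):
    """Share of items that belong to the majority class of their cluster."""
    joint = Counter(zip(clusters, classes))
    best = Counter()
    for (cluster, _), count in joint.items():
        best[cluster] = max(best[cluster], count)
    return sum(best.values()) / len(clusters)


def entropy(labels):
    """Shannon entropy (nats) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(clusters, classes):
    """Mutual information divided by the mean of the two entropies (an assumption)."""
    n = len(clusters)
    joint = Counter(zip(clusters, classes))
    n_cluster = Counter(clusters)
    n_class = Counter(classes)
    mi = sum((count / n) * math.log(count * n / (n_cluster[c] * n_class[k]))
             for (c, k), count in joint.items())
    denom = (entropy(clusters) + entropy(classes)) / 2
    return mi / denom if denom > 0 else 1.0  # degenerate case: convention
```

Here `clusters` and `classes` are parallel sequences giving, for each community, its assigned hyper-community and its directory category.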
10 http://www.liwc.net/descriptiontable1.php accessed July 2011.
11 http://www.livejournal.com/browse/ accessed July 2011.
No.    Members
I      20sknitters, corsetmakers, sew_hip
II     addme25_and_up, add_a_writer
III    beauty101, curlyhair, dyed_hair, vintagehair
IV     bjorkish, patd, rilokiley, thecure
V      blackfolk, classics, nonfluffypagans, ontd_political, prolife, queer_rage, transnews
VI     calmallamadown, battlestar_blog, news_jpop, rpattz_kstew, theoffice_us
VII    cat_lovers, note_to_cat
VIII   charloft, _we_are_lost, baby_names, birls, chuunin, doctorwho, gamers, gundam00, house_cameron, i_am_thankful, just_good_music, lword, miracle______, patdslashseek, relaxmusic, sheldon_penny, thesims2, time_and_chips, weddingplans, worldofwarcraft, wow_ladies
IX     color_theory, adayinmylife, aesthetes, amateur_artists, beatlepics, behind_the_lens, filmsg, madradstalkers, naturesbeauty, ourbedrooms, ru_glamour, topmodel, world_tourist
X      dog_lovers, dogsintraining
XI     egl, egl_comm_sales
XII    glee_tv, broadway, gleeclub, theater_icons
XIII   macintosh, computer_help, computerhelp, ipod
XIV    newyorkers, davis_square, eurotravel, poor_skills, seattle
XV     ofmornings, bentolunch, picturing_food, trashy_eats
XVI    parenting101, altparent, breastfeeding, clucky, ftm, naturalbirth
XVII   sgagenrefinders, ncisficfind, sgastoryfinders
XVIII  todayirealized, __quotexwhore, boys_and_girls, thenicestthings
XIX    walmart_employe, baristas, iworkatborders
XX     webdesign, htmlhelp

Table 7: LDA-topic-based hyper-communities (exemplar listed first for each).
Fig. 12: LDA-topic-based hyper-communities with Livejournal category; multi-coloured clusters are less pure (best seen in colour). [Stacked bar chart: one bar per hyper-community, HC_I to HC_XX, scaled from 0% to 100%; segments are coloured by Livejournal category: Advice-Support, Creative-Expression, Entertainment-Music, Fandom, Fashion-Style, Food-Travel, Gaming-Technology, Parenting-Pets, Politics-Culture, Television.]
No. | Members
I | altparent, breastfeeding, clucky, ftm, naturalbirth, parenting101
II | baby_names, news_jpop
III | beatlepics, ru_glamour
IV | classics, nonuypagans
V | corsetmakers, 20sknitters, beauty101, curlyhair, dyed_hair, egl, egl_comm_sales, madradstalkers, sew_hip, vintagehair
VI | davis_square, charloft, eurotravel, newyorkers, poor_skills, seattle
VII | ipod, computer_help, computerhelp, htmlhelp, macintosh, webdesign
VIII | iworkatborders, baristas, walmart_employe
IX | note_to_cat, cat_lovers, dog_lovers, dogsintraining
X | ofmornings, bentolunch, picturing_food, trashy_eats
XI | ourbedrooms, adayinmylife, aesthetes, amateur_artists, behind_the_lens, birls, color_theory, lmsg, naturesbeauty, topmodel, world_tourist
XII | rpattz_kstew, house_cameron, miracle______, sheldon_penny, time_and_chips
XIII | thecure, bjorkish, blackfolk, just_good_music, lword, relaxmusic, rilokiley
XIV | theoce_us, _we_are_lost, battlestar_blog, broadway, calmallamadown, doctorwho, glee_tv, gleeclub, nciscnd, patd, patdslashseek, sgagenrenders, sgastorynders, theater_icons
XV | todayirealized, __quotexwhore, add_a_writer, addme25_and_up, boys_and_girls, chuunin, i_am_thankful, thenicestthings, thesims2, weddingplans
XVI | transnews, ontd_political, prolife, queer_rage
XVII | worldofwarcraft, gamers, gundam00, wow_ladies

Table 8: HDP-topic-based hyper-communities (exemplar listed first for each).
Fig. 13: HDP-topic-based hyper-communities with Livejournal category; multi-coloured clusters are less pure (best seen in colour). [Stacked bar chart: one bar per hyper-community, HC_I to HC_XVII, scaled from 0% to 100%; segments are coloured by Livejournal category: Advice-Support, Creative-Expression, Entertainment-Music, Fandom, Fashion-Style, Food-Travel, Gaming-Technology, Parenting-Pets, Politics-Culture, Television.]
Topic-based hyper-communities

Using LDA with 50 topics yielded 20 hyper-communities, listed in Table 7. Figure 12 shows community assignments to clusters together with Livejournal category. Clustering appears to have gathered topically similar communities together in a number of cases (e.g., {ofmornings, bentolunch, picturing_food, trashy_eats}), but also elucidated finer distinctions (e.g., {cat_lovers, note_to_cat}, {dog_lovers, dogsintraining}). Hyper-community VIII has the lowest purity. On further inspection, a number of its communities have a significant romance or relationships component. E.g., in addition to those communities with obvious topics, three are about particular fictional relationships: house_cameron, sheldon_penny, and time_and_chips.

Noticeably, the clustering result is better when HDP is employed to learn latent topics in the data: 75% purity and 68% NMI, versus 70% purity and 62% NMI for LDA. Specifically, running HDP on the same dataset, the model discovers 52 topics discussed by bloggers in their online diaries. Based on the similarity of the communities' preferences for these topics, 17 hyper-communities are formed, as shown in Table 8. As can be seen from the community assignments in reference to the Livejournal categories in Figure 13, there are more 100%-purity hyper-communities in HDP-topic-based clustering than in LDA-topic-based clustering. These 100%-purity hyper-communities, with their top topics, are shown in Table 9.

No. | Members | Category | Top topic (representative words)
IV | classics, nonuypagans | Politics-Culture | pagan, troy, ancient, greek, history, wicca, gods, book, community, war, paris, achilles, religion, witch, roman, fluffy, magic, paganism, goddess, film
VII | ipod, computer_help, computerhelp, htmlhelp, macintosh, webdesign | Gaming-Technology | computer, ipod, windows, problem, page, using, files, link, file, drive, download, website
VIII | iworkatborders, baristas, walmart_employe | Advice-Support | store, starbucks, cafe, department, mart, company, customer, wal, customers, coffee, stores, service, borders, job, shift, manager
IX | note_to_cat, cat_lovers, dog_lovers, dogsintraining | Parenting-Pets | dog, kitty, cat, kitten, dogs, leash, cats, puppy, training, vet, crate, pet, treats
X | ofmornings, bentolunch, picturing_food, trashy_eats | Food-Travel | food, lunch, egg, chocolate, cake, eat, breakfast, sauce, butter, rice, chicken, cream, tea, drink, bento, cheese, dinner
XII | rpattz_kstew, house_cameron, miracle______, sheldon_penny, time_and_chips | Fandom | gibbs, ryan, stories, story, spoilers, icons, fic, episode, rodney, tony, fics, glee, doctor, brendon
XVI | transnews, ontd_political, prolife, queer_rage | Politics-Culture | children, law, woman, abortion, pro, community, trans, male, identity, sexual, national, women, gay, sex, rights, female, transgender, gender, health, support

Table 9: Pure HDP-topic-based clusters and their top topics.

Only one 100%-purity hyper-community is found in both clusterings: {ofmornings, bentolunch, picturing_food, trashy_eats}, those communities in the Food-Travel Livejournal category. It is interesting to see from the table that, while separated in LDA-topic-based clustering, {cat_lovers, note_to_cat} and {dog_lovers, dogsintraining} fall in the same hyper-community in HDP-topic-based clustering, as do {webdesign, htmlhelp} and {macintosh, computer_help, computerhelp, ipod}. In contrast, we also see a finer distinction for the Politics-Culture Livejournal category: while {classics, nonuypagans} mentioned classical topics such as ancient, troy or Greek history, {transnews, ontd_political, prolife, queer_rage} were interested in contemporary issues such as transgender, abortion or gay.
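The notion of a 100%-purity hyper-community, and the per-category breakdown plotted in Figures 12 and 13, can be sketched as follows. The memberships and categories of clusters IV and IX are transcribed from Table 9; cluster "H" and its two communities are hypothetical, added only to show a mixed cluster. The function names are likewise illustrative and not taken from the original work.

```python
from collections import Counter

# Clusters IV and IX follow Table 9; cluster "H" is a made-up mixed cluster.
ASSIGNMENTS = {
    "classics": "IV", "nonuypagans": "IV",
    "note_to_cat": "IX", "cat_lovers": "IX",
    "dog_lovers": "IX", "dogsintraining": "IX",
    "example_food_blog": "H", "example_fan_blog": "H",  # hypothetical
}
CATEGORIES = {
    "classics": "Politics-Culture", "nonuypagans": "Politics-Culture",
    "note_to_cat": "Parenting-Pets", "cat_lovers": "Parenting-Pets",
    "dog_lovers": "Parenting-Pets", "dogsintraining": "Parenting-Pets",
    "example_food_blog": "Food-Travel", "example_fan_blog": "Fandom",
}


def cluster_composition(assignments, categories):
    """Map each cluster id to a Counter of its members' categories."""
    composition = {}
    for community, cluster in assignments.items():
        composition.setdefault(cluster, Counter())[categories[community]] += 1
    return composition


def category_shares(counter):
    """Convert category counts into fractions (the heights of one stacked bar)."""
    total = sum(counter.values())
    return {category: n / total for category, n in counter.items()}


def pure_clusters(composition):
    """Clusters whose members all share a single category."""
    return sorted(k for k, counts in composition.items() if len(counts) == 1)


composition = cluster_composition(ASSIGNMENTS, CATEGORIES)
pure = pure_clusters(composition)
shares_h = category_shares(composition["H"])
```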
Mood-based hyper-communities Clustering based on explicit mood labels yielded 9 hyper-communities, which are recorded in Table 10. In contrast to the topic-based clustering, only two hyper-communities
Members | Categories
altparent, boys_and_girls, breastfeeding, cat_lovers, clucky, dog_lovers, dogsintraining, macintosh, naturalbirth, parenting101, todayirealized | Advice-Support, Gaming-Technology, Parenting-Pets
baristas, blackfolk, iworkatborders, nonuypagans, note_to_cat, ontd_political, prolife, queer_rage, thesims2, transnews, walmart_employe, worldofwarcraft | Advice-Support, Gaming-Technology, Parenting-Pets, Politics-Culture
beatlepics, __quotexwhore, addme25_and_up, birls, charloft, egl_comm_sales, gundam00, house_cameron, miracle______, news_jpop, patdslashseek, rpattz_kstew, sheldon_penny, thenicestthings, time_and_chips | Advice-Support, Creative-Expression, Entertainment-Music, Fandom, Fashion-Style, Politics-Culture, Television
behind_the_lens, add_a_writer, aesthetes, amateur_artists, color_theory, lmsg, just_good_music, naturesbeauty, ofmornings, ourbedrooms, relaxmusic | Advice-Support, Creative-Expression, Entertainment-Music, Fashion-Style, Food-Travel
bentolunch, adayinmylife, i_am_thankful, picturing_food, trashy_eats, world_tourist | Advice-Support, Creative-Expression, Food-Travel
broadway, _we_are_lost, baby_names, beauty101, bjorkish, classics, curlyhair, davis_square, doctorwho, egl, eurotravel, ftm, gamers, lword, madradstalkers, newyorkers, poor_skills, rilokiley, seattle, thecure, theoce_us, topmodel, weddingplans, wow_ladies | Advice-Support, Entertainment-Music, Fashion-Style, Food-Travel, Gaming-Technology, Parenting-Pets, Politics-Culture, Television
chuunin, 20sknitters, battlestar_blog, calmallamadown, corsetmakers, dyed_hair, glee_tv, gleeclub, patd, ru_glamour, sew_hip, theater_icons, vintagehair | Creative-Expression, Entertainment-Music, Fandom, Fashion-Style, Television
htmlhelp, computer_help, computerhelp, ipod, webdesign | Gaming-Technology
nciscnd, sgagenrenders, sgastorynders | Fandom

[Column "Hyper-community Mood Cloud": one word-cloud image of mood labels per hyper-community.]

Table 10: Mood-based hyper-communities.
Members | Categories
addme25_and_up, add_a_writer, htmlhelp, nonuypagans | Advice-Support, Gaming-Technology, Politics-Culture
altparent, baby_names, breastfeeding, clucky, ftm, naturalbirth, parenting101, prolife | Parenting-Pets, Politics-Culture
beatlepics, amateur_artists, birls, calmallamadown, chuunin, classics, gundam00, news_jpop, ourbedrooms, patd, theater_icons, thecure, theoce_us | Creative-Expression, Entertainment-Music, Fandom, Fashion-Style, Politics-Culture, Television
behind_the_lens, aesthetes, color_theory, lmsg, naturesbeauty, ru_glamour, world_tourist | Creative-Expression, Fashion-Style, Food-Travel
bentolunch, ofmornings, picturing_food, trashy_eats | Food-Travel
curlyhair, beauty101, dyed_hair, vintagehair | Fashion-Style
dog_lovers, cat_lovers, dogsintraining, note_to_cat | Parenting-Pets
egl, 20sknitters, corsetmakers, egl_comm_sales, madradstalkers, sew_hip, weddingplans | Advice-Support, Creative-Expression, Fashion-Style
lword, _we_are_lost, battlestar_blog, bjorkish, blackfolk, broadway, glee_tv, gleeclub, rilokiley, topmodel | Entertainment-Music, Politics-Culture, Television
macintosh, computer_help, computerhelp, ipod, webdesign | Gaming-Technology
miracle______, charloft, house_cameron, just_good_music, nciscnd, patdslashseek, relaxmusic, rpattz_kstew, sgastorynders, sheldon_penny, thesims2 | Creative-Expression, Entertainment-Music, Fandom, Gaming-Technology
newyorkers, adayinmylife, baristas, davis_square, eurotravel, iworkatborders, ontd_political, poor_skills, seattle, transnews, walmart_employe | Advice-Support, Creative-Expression, Food-Travel, Politics-Culture
time_and_chips, doctorwho | Fandom, Television
todayirealized, __quotexwhore, boys_and_girls, i_am_thankful, queer_rage, sgagenrenders, thenicestthings | Advice-Support, Creative-Expression, Fandom, Politics-Culture
worldofwarcraft, gamers, wow_ladies | Gaming-Technology

[Column "Hyper-community ANEW Words": one word-cloud image of ANEW words per hyper-community.]

Table 11: ANEW-based hyper-communities.
Livejournal categories | Communities
Food-Travel | trashy_eats, bentolunch, ofmornings, picturing_food
Creative-Expression, Fashion-Style | color_theory, naturesbeauty, ru_glamour
Fashion-Style | vintagehair, beauty101, curlyhair, dyed_hair
Creative-Expression, Entertainment-Music, Politics-Culture, Television | just_good_music, battlestar_blog, charloft, ontd_political, relaxmusic
Parenting-Pets | altparent, baby_names, breastfeeding, clucky, naturalbirth, note_to_cat, parenting101
Advice-Support, Parenting-Pets | dog_lovers, boys_and_girls, cat_lovers, dogsintraining, thenicestthings
Creative-Expression, Fandom | sheldon_penny, adayinmylife, house_cameron, miracle______, rpattz_kstew, time_and_chips
Fandom, Fashion-Style, Food-Travel, Politics-Culture | newyorkers, davis_square, egl_comm_sales, eurotravel, madradstalkers, ourbedrooms, poor_skills, seattle, sgagenrenders, world_tourist
Advice-Support, Creative-Expression, Food-Travel, Gaming-Technology, Television | aesthetes, amateur_artists, behind_the_lens, doctorwho, lmsg, i_am_thankful, wow_ladies
Advice-Support, Entertainment-Music, Fandom, Fashion-Style, Gaming-Technology, Politics-Culture, Television | calmallamadown, addme25_and_up, baristas, beatlepics, birls, bjorkish, broadway, chuunin, egl, gamers, glee_tv, gleeclub, gundam00, lword, nciscnd, news_jpop, patd, patdslashseek, rilokiley, theater_icons, thecure, theoce_us, topmodel, weddingplans, worldofwarcraft
Advice-Support, Creative-Expression, Fashion-Style, Gaming-Technology, Politics-Culture | macintosh, 20sknitters, add_a_writer, classics, computer_help, computerhelp, corsetmakers, ftm, htmlhelp, ipod, iworkatborders, sew_hip, webdesign
Advice-Support, Creative-Expression, Fandom, Gaming-Technology, Politics-Culture, Television | blackfolk, __quotexwhore, _we_are_lost, nonuypagans, prolife, queer_rage, sgastorynders, thesims2, todayirealized, transnews, walmart_employe

[Pie-chart column: one chart per hyper-community, labelled with the LIWC features used above average.]

Table 12: Psycho-linguistic-based (LIWC) hyper-communities, showing pie charts of LIWC features used above average per hyper-community; Livejournal categories; and grouped communities.
have 100% purity with respect to the topical groundtruth, one of which is the only group characterized by negative mood, consisting of htmlhelp, computer_help, computerhelp, ipod, and webdesign.

Mood-based clustering reveals distinctions not apparent in the topic-based representation. E.g., the group including behind_the_lens, while having significant overlap with Group IX (see Table 7) in the topic-based clustering, has some illuminating differences: gone are the communities beatlepics, madradstalkers, ru_glamour, topmodel, and world_tourist; replacing them are add_a_writer, just_good_music, and ofmornings. From an appraisal of the content of these communities, we find the distinctions to be nuanced. The topic-based hyper-community is loosely united by pictures and people, whereas the mood-based hyper-community is united by the desire to create and its outcomes; these differences are best explained by prevailing mood and intent. Indeed, these distinctions are captured by the predominant moods of the different hyper-communities, respectively curious, cheerful, or happy vs. calm, accomplished and creative.
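A comparison of this kind reduces to set operations on cluster memberships. The sketch below uses the memberships of Group IX as listed in Table 7 and of the behind_the_lens group as listed in Table 10; the Jaccard index is an added summary measure for illustration and is not a statistic reported in the original analysis.

```python
# Memberships transcribed from Table 7 (Group IX) and Table 10
# (the mood-based group that contains behind_the_lens).
topic_group_ix = {
    "color_theory", "adayinmylife", "aesthetes", "amateur_artists",
    "beatlepics", "behind_the_lens", "lmsg", "madradstalkers",
    "naturesbeauty", "ourbedrooms", "ru_glamour", "topmodel", "world_tourist",
}
mood_group = {
    "behind_the_lens", "add_a_writer", "aesthetes", "amateur_artists",
    "color_theory", "lmsg", "just_good_music", "naturesbeauty",
    "ofmornings", "ourbedrooms", "relaxmusic",
}

shared = topic_group_ix & mood_group   # communities in both groups
dropped = topic_group_ix - mood_group  # only in the topic-based group
added = mood_group - topic_group_ix    # only in the mood-based group

# Jaccard index: size of the overlap relative to the union (assumed measure).
jaccard = len(shared) / len(topic_group_ix | mood_group)
```

Printing `sorted(dropped)` and `sorted(added)` gives the communities that distinguish the two groupings and can be checked against the lists in the paragraph above.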
ANEW-based hyper-communities

Clustering on ANEW features as proxy mood yielded 15 hyper-communities, listed in Table 11. Of these, five consisted of communities with matching Livejournal category (e.g., curlyhair, beauty101, dyed_hair, and vintagehair are all classified as Fashion-Style). Two hyper-communities are examples of the sub-category distinctions returned by the topic-based clustering: {macintosh, computer_help, computerhelp, ipod, webdesign} and {worldofwarcraft, gamers, wow_ladies} are both from Livejournal's Gaming-Technology category.
Psycho-linguistic-based hyper-communities

Clustering based on psycho-linguistic features yielded 12 hyper-communities, shown in Table 12. Three hyper-communities contain communities with the same Livejournal category, and appear to have been associated topically. The top three LIWC categories for these hyper-communities are illuminating: for Fashion-Style, feel, body, and percept (i.e., perceptual processes); for Food-Travel, ingest, bio (i.e., biological processes), and percept; and for Parenting-Pets, family, health, and humans (e.g., adult, baby, boy).

Other hyper-communities appear to exhibit a characteristic mixture of topic and style of discussion, which is in part captured by the linguistic processes of LIWC. E.g., one hyper-community aggregates all of the communities in the dataset about fictional relationships (plus one community about documenting a day in one's life). These communities are a kind of meta-genre not easily captured by topical features alone; linguistic features such as post length (i.e., wc) and extensive use of the 3rd person singular (i.e., shehe) appear to help associate these communities.

Last, there is at least one hyper-community for which mood appears to be the discriminating feature, that which includes walmart_employe. This hyper-community has above average use of the LIWC categories sad, swear, death, sexual, health, humans, anger, relig, and they (i.e., 3rd person plural). These communities could be described as some combination of angst-ridden, gritty, adversarial (note the above average use of 3rd person plural), or forthright, and contain much negative emotion and introversion.
Fig. 14: Aggregated mood distribution for two users, plotted by valence (pleasure) on the x-axis and arousal (activation) on the y-axis. Larger mood labels indicate more frequent occurrence. (a) High mood variance (user from 24_7_posting). (b) Low mood variance (user from devotional365). [Each panel places mood labels on a 0 to 9 grid whose horizontal axis runs from DISPLEASURE to PLEASURE and whose vertical axis runs from DEACTIVATION to ACTIVATION.]
Contrast this hyper-community with that which includes the community aesthetes: it discusses similar topics (e.g., health, death, relig) but does so with above average posemo (i.e., positive emotion).
5.3 Discussion

It is not surprising that the different community representations lead to hyper-communities that reflect these varying emphases. Topic-based representation is the method of choice for recovering hierarchy within, and associations across, Livejournal's canonical topic categories. Likewise, the results for the mood-based representation indicate an ability to recover non-topical features of a community such as prevailing intent and atmosphere of discussion. However, contrary to expectations, ANEW does not appear to be a well-suited and cheap alternative to mood-based representation for the task of hyper-community detection.

The clustering results for LIWC's psycho-linguistic representation are worthy of follow-up. LIWC offers a wide scope of classification, including as it does topical, linguistic, stylistic, and mood categories, yet is cheap to obtain. Some of the distinctions captured by the hyper-communities arising from LIWC representation are a kind of topic + atmosphere that seems relevant to the Web 2.0 denizen, who is faced with a surfeit of choice, and whose decision as to which community they will invest in may turn on the presence of more than one characteristic of the content. Hence psycho-linguistic analysis would make a useful facet of community recommendation (and analysis).

Of course, there are more dimensions than community along which to break down and re-factor the Livejournal corpus; e.g., user and discussion. Figure 14 contains plots of mood aggregated from the posts of two different users. Figure 14a represents a user with high mood variance: this user's moods are distributed across three quadrants of the mood circle, with a few instances in the fourth. Figure 14b represents a user with low mood variance: this user's moods tend to be restricted to positive valence with average arousal. User profiling of this kind is interesting in and of itself, but when correlated with communities offers additional insight into the user and/or community. E.g., we have found users who appear to project different personas conditional on the community of posting, which is captured by valence; valence over time for a particular user can indicate traumatic events; and joint analysis of such events with the mood of responding users potentially offers an even finer-grained tool of analysis.
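The contrast between the two users in Figure 14 can be summarised numerically by the spread of their mood labels on the valence and arousal axes. The sketch below is only illustrative: the (valence, arousal) scores are hypothetical placeholders on a 1 to 9 scale, not actual ANEW ratings, and the two label sequences are invented rather than taken from the users plotted in the figure.

```python
from statistics import mean, pvariance

# Hypothetical (valence, arousal) scores; NOT the published ANEW values.
MOOD_SCORES = {
    "happy": (8.2, 6.5),
    "calm": (6.9, 2.8),
    "sad": (1.6, 4.1),
    "aggravated": (2.7, 7.0),
    "content": (7.5, 4.0),
    "thankful": (7.8, 4.5),
}


def mood_profile(moods):
    """Mean and population variance of valence and arousal over scored labels."""
    points = [MOOD_SCORES[m] for m in moods if m in MOOD_SCORES]
    if not points:
        raise ValueError("no scored mood labels")
    valence = [v for v, _ in points]
    arousal = [a for _, a in points]
    return {
        "mean_valence": mean(valence),
        "var_valence": pvariance(valence),
        "mean_arousal": mean(arousal),
        "var_arousal": pvariance(arousal),
    }


# Invented label sequences standing in for two users' post-level mood tags.
high_variance_user = mood_profile(["happy", "sad", "aggravated", "calm"])
low_variance_user = mood_profile(["content", "thankful", "calm", "content"])
```

Computing the same profile separately per community, or over a sliding window of posts, would be one way to look for the persona shifts and valence drops mentioned above.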
6 Conclusion

In seeking to create tools for analysing content in social media under the impact of users' moods, we have investigated the problem of mood classification in weblogs. While the problem of machine learning-based feature selection for text categorisation has been intensively explored, little work has been done on text-based mood classification, which is often more challenging. This paper's contribution is a comprehensive comparison of different feature-selection schemes in combination with a range of linguistic subsets as feature spaces across two large datasets, elucidating insights into what can be transferred from a generic text-categorisation problem to mood classification. In addition, a novel use of a set of psychology-inspired features (ANEW) and psycholinguistic features (LIWC) is proposed; these do not require a supervised selection phase and can therefore be applied for mood analysis at a much larger scale. The results support similar findings in previous research, but have also brought to light discoveries particular to the problem of mood classification. The newly proposed feature sets have also performed comparably well at a fraction of the computational cost of supervised schemes.

Our analysis of global mood structures reveals interpretable and interesting patterns in the organisation, transition and continuum of moods, offering valuable empirical evidence about the structure of human emotion. The patterns contain mood synonyms, which can be used interchangeably, for instance, in terms of the sentiment score. The results have additional potential applications, such as mood-sensitive indexing and retrieval.

We have discovered hyper-groups of communities by using topics, sentiment information and psycholinguistic properties of the posts of members. The problem of sentiment-based clustering for community structure discovery is rich with many interesting open aspects to be explored. The meta-communities grouped based on sentiment information can be a social indicator, with potential applications in, for example, mental health (by targeting support or surveillance to communities with negative mood) or marketing (by targeting customer communities having the same sentiment on similar topics). On the other hand, the psycholinguistic hyper-groups detected provide insight into the language styles of people in specific categories, whereas topical meta-communities are a good source for users to find suitable communities based on their interests.
References

1. B. Adams, D. Phung, and S. Venkatesh. Discovery of latent subcommunities in a blog's readership. ACM Transactions on the Web, 4(3):1-30, 2010.
2. L. Backstrom, D. Huttenlocher, J. Kleinberg, and X. Lan. Group formation in large social networks: Membership, growth, and evolution. In Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD), pages 44-54, 2006.
3. B. Berendt and C. Hanser. Tags are not metadata, but `just more content' - to some people. In Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM), 2007.
4. D.M. Blei, A.Y. Ng, and M.I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993-1022, 2003.
5. V. Bolón-Canedo, N. Sánchez-Maroño, and A. Alonso-Betanzos. A review of feature selection methods on synthetic data. Knowledge and Information Systems, pages 1-37, 2012.
6. M.M. Bradley and P.J. Lang. Affective norms for English words (ANEW): Instruction manual and affective ratings. University of Florida, 1999.
7. E. Cambria, A. Hussain, C. Havasi, C. Eckl, and J. Munro. Towards crowd validation of the UK national health service. In Proceedings of the Web Science Conference (WebSci), 2010.
8. T.K. Fan and C.H. Chang. Sentiment-oriented contextual advertising. Knowledge and Information Systems, 23:321-344, 2010.
9. A.K. Farahat, A. Ghodsi, and M.S. Kamel. Efficient greedy feature selection for unsupervised learning. Knowledge and Information Systems, pages 1-26, 2012.
10. S. Feng, D. Wang, G. Yu, W. Gao, and K.F. Wong. Extracting common emotions from blogs based on fine-grained sentiment clustering. Knowledge and Information Systems, 27:281-302, 2011.
11. B.J. Frey and D. Dueck. Clustering by passing messages between data points. Science, 315:972-976, 2007.
12. M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I.H. Witten. The Weka data mining software: An update. ACM SIGKDD Explorations Newsletter, 11(1):10-18, 2009.
13. C. Hayes and P. Avesani. Using tags and clustering to identify topic-relevant blogs. In Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM), 2007.
14. X. Hu and J.S. Downie. Exploring mood metadata: Relationships with genre, artist and usage metadata. In Proceedings of the International Conference on Music Information Retrieval, 2007.
15. R. Kumar, J. Novak, and A. Tomkins. Structure and evolution of online social networks. In Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD), page 617, 2006.
16. G. Leshed and J.J. Kaye. Understanding how bloggers feel: Recognizing affect in blog posts. In Proceedings of the ACM Conference on Human Factors in Computing Systems (SIGCHI), page 1024, 2006.
17. C. Long, J. Zhang, M. Huang, X. Zhu, M. Li, and B. Ma. Estimating feature ratings through an effective review selection approach. Knowledge and Information Systems, 2012.
18. A. McCallum, X. Wang, and A. Corrada-Emmanuel. Topic and role discovery in social networks with experiments on Enron and academic email. Journal of Artificial Intelligence Research, 30:249-272, 2007.
19. A. McCallum, X. Wang, and N. Mohanty. Joint group and topic discovery from relations and text. Lecture Notes in Computer Science, 4503:28, 2007.
20. G. Mishne. Experiments with mood classification in blog posts. In Proceedings of ACM Workshop on Stylistic Analysis of Text for Information Access, 2005.
21. G. Mishne and N. Glance. Predicting movie sales from blogger sentiment. In Proceedings of the AAAI Spring Symposium on Computational Approaches to Analysing Weblogs, 2006.
22. H. Mohtasseb and A. Ahmed. Two-layered blogger identification model integrating profile and instance-based methods. Knowledge and Information Systems, 31(1):1-21, 2012.
23. R. Nallapati and W. Cohen. Link-PLSA-LDA: A new unsupervised model for topics and influence of blogs. In Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM), 2008.
24. R.A. Negoescu, B. Adams, D. Phung, S. Venkatesh, and D. Gatica-Perez. Flickr hypergroups. In Proceedings of the ACM International Conference on Multimedia, pages 813-816, 2009.
25. T. Nguyen, D. Phung, B. Adams, T. Tran, and S. Venkatesh. Classification and pattern discovery of mood in weblogs. Advances in Knowledge Discovery and Data Mining, pages 283-290, 2010.
26. T. Nguyen, D. Phung, B. Adams, T. Tran, and S. Venkatesh. Hyper-community detection in the blogosphere. In Proc. of ACM Workshop on Social Media, in conjunction with ACM Int. Conf. on Multimedia (ACM-MM), Firenze, Italy, 2010. ACM.
27. K. Nigam and M. Hurst. Towards a robust metric of opinion. In AAAI Spring Symposium on Exploring Attitude and Affect in Text, pages 598-603, 2004.
28. B. Pang and L. Lee. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1-2):1-135, 2008.
29. B. Pang, L. Lee, and S. Vaithyanathan. Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of the ACL Conference on Empirical Methods in Natural Language Processing, pages 79-86, 2002.
30. J.W. Pennebaker, C.K. Chung, M. Ireland, A. Gonzales, and R.J. Booth. The development and psychometric properties of LIWC2007. Austin, Texas: LIWC Inc, 2007.
31. J.W. Pennebaker, M.E. Francis, and R.J. Booth. Linguistic inquiry and word count (LIWC) [computer software]. Austin, Texas: LIWC Inc, 2007.
32. J.A. Russell. A circumplex model of affect. Journal of Personality and Social Psychology, 39(6):1161-1178, 1980.
33. J.A. Russell. Core affect and the psychological construction of emotion. Psychological Review, 110(1):145, 2003.
34. F. Sebastiani. Machine learning in automated text categorization. ACM Computing Surveys, 34(1):1-47, 2002.
35. X. Song, C.Y. Lin, B.L. Tseng, and M.T. Sun. Modeling and predicting personal information dissemination behavior. In Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD), pages 479-488, 2005.
36. S.O. Sood and L. Vasserman. ESSE: Exploring mood on the web. In Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM), 2009.
37. Y.R. Tausczik and J.W. Pennebaker. The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 29(1):24, 2010.
38. Y.W. Teh and M.I. Jordan. Hierarchical Bayesian nonparametric models with applications. In N. Hjort, C. Holmes, P. Müller, and S. Walker, editors, Bayesian Nonparametrics: Principles and Practice. Cambridge University Press, 2010.
39. Y. Tsuruoka and J. Tsujii. Bidirectional inference with the easiest-first strategy for tagging sequence data. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 467-474, 2005.
40. Y. Yang and J.O. Pedersen. A comparative study on feature selection in text categorization. In Proceedings of the International Conference on Machine Learning (ICML), pages 412-420, 1997.