CommuniMents: A Framework for Detecting ...

CommuniMents: A Framework for Detecting Community based Sentiments for Events

Muhammad Aslam Jarwar1,8, Rabeeh Ayaz Abbasi2,1, Mubashar Mushtaq3,1, Onaiza Maqbool1, Naif R. Aljohani2, Ali Daud2,4, Jalal S. Alowibdi5, J.-R. Cano6, S. García7, Ilyoung Chong8

1 Department of Computer Sciences, Quaid-i-Azam University, Islamabad, Pakistan [email protected],[email protected]

2 Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia {rabbasi,nraljohani}@kau.edu.sa

3 Department of Computer Science, Forman Christian College (A Chartered University), Lahore, Pakistan [email protected]

4 Department of Computer Science and Software Engineering, International Islamic University, Islamabad, Pakistan [email protected]

5 Faculty of Computing and Information Technology, University of Jeddah, Jeddah, Saudi Arabia [email protected]

6 Dept. of Computer Science, University of Jaén, EPS of Linares, Avenida de la Universidad S/N, Linares 23700, Jaén, Spain [email protected]

7 Department of Computer Science and Artificial Intelligence, University of Granada, 18071, Granada, Spain [email protected]

8 Department of Information and Communications Engineering, Hankuk University of Foreign Studies (HUFS), Korea [email protected]

ABSTRACT

Social media has revolutionized human communication and styles of interaction. Due to its effectiveness and ease, people have started using it increasingly to share and exchange information, carry out discussions on various events, and express their opinions. Various communities may have diverse sentiments about events and it is an interesting research problem to understand the sentiments of a particular community for a specific event. In this article, we propose a framework, CommuniMents, which enables us to identify the members of a community and measure the sentiments of the community for a particular event. CommuniMents uses automated snowball sampling to identify the members of a community, then fetches their published contents (specifically tweets), pre-processes the contents and measures the sentiments of the community. We perform qualitative and quantitative evaluation for a variety of real world events to validate the effectiveness of the proposed framework.

Keywords: Social Media, Twitter, Sentiment Analysis, Communities, Events

INTRODUCTION

Social media applications provide easy and effective ways to communicate, share opinions, and exchange information. These applications enable people to communicate with a large and diverse set of people for different purposes. For example, people may communicate and share their problems directly with their representatives in government and parliament. They may also give their opinions and show their sentiments on social problems, events, political movements, and government policies. The active participation of a large number of users results in an abundance of information, most of which is unstructured and unmanageable. The huge amount of information in social media leads to the problem of “social media information overload” [Bright et al., 2015]. This overload and the diversity of information create difficulties and challenges in information processing, presentation and analysis [Batrinca and Treleaven, 2015, Schuller et al., 2015].

The information that is created, shared and exchanged in social media is important for the public, news agencies, governments, oppositions and political parties because it contains public opinion and sentiments. News agencies these days often select the subjects of talk shows and the trends of news according to the opinions and sentiments of the public in social media. Governments may also benefit from social media while making policies and taking decisions about the country and the general public, as users on social media discuss and express their opinions about government policies, decisions and governance with their friends, colleagues, and community. Through the effective monitoring and analysis of social media posts, governments may make their policies and take decisions in a more informed way [WeGov, 2016].

Nowadays many communities, e.g. lawyers, politicians, journalists, doctors, and researchers, are aware of the importance of social media and use social media services to express their opinions on various issues in their daily lives [Manaman et al., 2016]. Among these communities, the journalist community actively participates in discussions on social media such as Twitter, and expresses its opinions about the events occurring in its surroundings. Journalists and media also have an influential role on government policies and they affect the mindset of the public, which in turn affects election results [Takahashi et al., 2015, Bekafigo and McBride, 2013]. Journalists are increasingly using social media services [Zubiaga et al., 2013] to gather news about major events.

Due to the important role of communities in society and social media, in our study we propose a framework, CommuniMents, for identifying targeted communities and analyzing their event based sentiments. It is a challenging task to identify a community which contains members from all the demographic locations of a country and not certain selected members only. We test our framework by identifying the Pakistani journalist community and finding its event based sentiments. Our framework has three components. The first component identifies members of a community. The second component gathers publicly available tweets of community members and filters event specific tweets. The third component measures collective sentiments of the community for a particular event. To evaluate our framework, we use real data related to important events within Pakistan.

Figure 1: Example of a hashtag (#BBSaid) and a mention (@SaeedGhani1) in a tweet.

BACKGROUND AND RELATED WORK

Twitter was launched officially on 13th July, 2006 [Kumar et al., 2014]. It allows its users to communicate in real time and to create, send, receive and read posts known as “Tweets”. The length of a tweet is limited to 140 characters [Mollett et al., 2011] and averages eleven words per tweet [O’Connor et al., 2010]. Twitter is popular with academic researchers [Mollett et al., 2011], because most tweets are publicly available and are accessible through the Twitter API [Makice, 2009]. Twitter users perform different activities: they post a tweet publicly or address it to a specific user by mentioning that user as “@userid”, read a tweet, and forward a tweet, which is known as a “Retweet”. The retweet mechanism of Twitter enables users to spread a tweet to many users who are not followers of the user who originally created the tweet. Due to its specific structure and features, Twitter has emerged as a new medium of communication and a channel for rapidly spreading information [Khan Minhas et al., 2015, Kwak et al., 2010, Honey and Herring, 2009].

Twitter users use the hash symbol “#” followed by a word, called a “hashtag” (Figure 1), in their posts to categorize posts or follow posts related to a specific topic. Sometimes users overlap topics by using a hashtag outside the context of the topic [Bastos et al., 2013]. Hashtags further help users in searching posts. Hashtags, simple words, and phrases are used by many users in their tweets and are also tracked by Twitter for detecting trending topics.

In Twitter, users may create new lists or subscribe to existing lists. A list has the ids of users who are mostly related to the theme of the list. By using lists, users can see a tailored stream of tweets of the users present in the list. A list can be private or public. The public lists of a user @rabeeh are shown in Figure 2.

Figure 2: Public lists of the user @rabeeh

Snowball sampling or “chain-referral sampling” is a non-probabilistic sampling technique where existing study subjects recruit future subjects from their acquaintances. This technique is used for identifying specific communities in Twitter, e.g., celebrities, media, organizations, and blogs [Wu et al., 2011]. It starts with seed users U0 belonging to a specific community. The seed users U0 are mostly famous personalities within a community. All lists having the users U0 are retrieved. The retrieved lists for each user are filtered on the basis of manually chosen keywords. The filtered lists L0 contain only those lists whose names match the manually chosen keywords. In a recent study [Khan et al., 2014], the researchers have used snowball sampling for spell checking in English tweets.

An event represents an action occurring in our surroundings. Due to the simplicity, popularity, and ad hoc usage of Twitter, many people report events on Twitter. In recent years, the demand for event based analysis of tweets has increased the interest of researchers in exploring event extraction mechanisms for Twitter. An extensive survey of event detection methods on Twitter is presented in [Atefeh and Khreich, 2015]. It discusses the event detection methodologies for both specified (known) and unspecified (unknown) events. [Abdelhaq et al., 2016] extract events from a real-time stream of tweets. The extracted events are described using keywords, time and location. [Zhou et al., 2015] use a Bayesian modelling approach to extract event-related keywords from tweets without supervised learning. [Lin et al., 2016] compare the behavior of users on two different micro-blogging platforms, Twitter and Weibo.

Tweets contain metadata and unstructured text, which includes URLs, user ids, special characters, abbreviations, hashtags, and non-stem words, all of which adversely affect the accuracy of sentiment analyzers [Pang and Lee, 2008]. In recent studies [Khan et al., 2014, Gupta and Sharma, 2016] WordNet is used for the identification of abbreviations and spelling correction.

Before discussing the recent literature on sentiment analyzers, let us discuss what a sentiment analyzer is. The phrase “sentiment analysis” was first used in [Nasukawa and Yi, 2003]. In textual natural language processing [Khan et al., 2016], “sentiment analysis” is the process of finding the opinion of the writer. [Medhat et al., 2014] describe “sentiment analysis” or “opinion mining” as the study of people's opinions, attitudes and emotions towards objects [Balazs and Velsquez, 2016, Bravo-Marquez et al., 2016, Khan et al., 2017, Liu, 2012, Muhammad et al., 2016, Wang et al., 2016]. In sentiment analysis, the first step is the selection of text features. These features include the frequency of terms present, use of adjectives (parts of speech), phrases, and negations [Liu and Zhang, 2012]. The features are selected using two methods: lexicon-based and statistical-measures-based. The lexicon-based method is a manual method, in which a human annotator annotates the features manually. The statistical method is fully automatic and widely used [Medhat et al., 2014]. However, the chances of novice features in this case are much greater than with the lexicon-based method. Most sentiment analysis methods work on whole documents, with the exception of a few methods that work at sentence level [Appel et al., 2016].

The CoreNLP sentiment annotator uses a supervised methodology to train the classifier. For accurate results the classifier uses the sentiment treebank, which includes 215,154 phrases labeled with fine-grained sentiment labels in 11,855 parse tree sentences. A Recursive Neural Tensor Network [Hammer, 2002] is used to reduce the complexity in sentiment composition. This sentiment annotator classifies a sentence into positive/negative polarity with accuracy ranging from 80% to 85.4% and in fine-grained classification up to 80.7% [Socher et al., 2013].
Another sentiment analysis tool, SyneSketch, uses a word lexicon based on WordNet, a lexicon of emoticons, common slang words and a set of heuristic rules for extracting the fine-grained Ekman emotion classes (i.e. happiness, sadness, anger, fear, disgust, and surprise), and classifies a sentence as positive with value (+1) or negative with value (-1) polarity [Krcadinac et al., 2013]. Sentiments related to events on Twitter are discussed by [Thelwall et al., 2011, Gaspar et al., 2016, Rill et al., 2014]. Researchers have also used machine learning algorithms to detect deception [Alowibdi et al., 2015] and sentiments [Katz et al., 2015] on Twitter.

Researchers have studied communities in Twitter to solve various problems. In [Takahashi et al., 2015], the authors investigate how different types of communities use Twitter in a situation of disaster and emergency. They particularly focus on typhoon Haiyan in the Philippines. The social media usage patterns were also investigated among those who were directly affected by the typhoon, and those who were coordinating the relief efforts or disseminating information. The results of this study show that different communities use social media for the promulgation of second-hand information in mobilizing relief efforts. The authors manually found different types of communities (i.e. ordinary citizens, journalists, NGOs, government officials, and celebrities) by analyzing a random sample of 1000 tweets, whereas our proposed framework CommuniMents enables identification of community members using the snowball sampling technique.

PROPOSED FRAMEWORK: COMMUNIMENTS

We propose a framework, CommuniMents, to detect the sentiments of a specific community for a particular event. The framework has the following three components (Figure 3):

1. Identifying Community Users: This component receives a list of seed community members and, based on these seed members, gathers more members using the snowball sampling described in the previous section.

2. Collecting and Filtering Tweets: This component receives a collection of community members from the first component as well as keywords related to an event. It then collects all the publicly available tweets of the community members and filters the tweets that are related to the event.

3. Sentiment Analysis: This component takes the tweets of a community for an event from the second component and analyzes the sentiments of these tweets after pre-processing them.

Figure 3: The CommuniMents Framework

We detail the working of the components in the following sections.

Identifying Community Users

The first component of the proposed framework CommuniMents identifies members (Uc) of a particular community c. It does so by starting with a set of seed users Us. All lists having one of the users in Us are retrieved. The rationale behind getting these lists is that they will contain similar types of users as in the set Us. These lists are then filtered and only those lists are retained which contain words from a set of predefined keywords Kl in their names. For example, if Kl contains the keywords { “journalist”, “journalists”, “analysts” }, and the fetched lists (L0) are { “Journalists of Pakistan”, “Political Analysts”, “Pakistani Sportsmen” }, then only the first and second lists in L0 are considered for further steps.
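The list-name filter in the example above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `filter_lists` and the whole-word, case-insensitive matching rule are assumptions, since the text only says that list names must contain words from Kl.

```python
import re


def filter_lists(list_names, keywords):
    """Keep the list names that contain at least one keyword as a whole word.

    Matching is case-insensitive; this rule is an assumption of the sketch.
    """
    wanted = {k.lower() for k in keywords}
    kept = []
    for name in list_names:
        words = set(re.findall(r"[a-z0-9]+", name.lower()))
        if words & wanted:
            kept.append(name)
    return kept


K_l = {"journalist", "journalists", "analysts"}
L0 = ["Journalists of Pakistan", "Political Analysts", "Pakistani Sportsmen"]
print(filter_lists(L0, K_l))  # ['Journalists of Pakistan', 'Political Analysts']
```

Whole-word matching is the stricter choice; a substring test would also retain names such as “Photojournalists”, which may or may not be desirable for a given community.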

Once the lists are filtered, the users present in these lists are fetched. The users are filtered (in a way similar to how lists are filtered), such that only those users are kept which have one of the keywords from the set Ku in their profile description. This process is repeated until no new users are found or a maximum number (maxc) of users belonging to a community are fetched. The process taking place in this component is explained in Algorithm 1. The function GetLists calls a Twitter API¹ to get all the lists in which the users Us are present. Similarly, GetUsers calls a Twitter API² to get all the users present in the lists L. The FilterLists and FilterUsers functions are used to filter lists and users based on the keywords Kl and Ku respectively. In comparison to other sampling methods (like stratified or random sampling) and to other ways of finding community members, such as identifying users of a community using hashtags [Lin et al., 2016], looking at the top n users related to a topic [Sligh et al., 2016], or clustering users in a community [Guo et al., 2016], snowball sampling makes it possible to start with a limited set of items and grow the number of items in the sample. CommuniMents uses snowball sampling in a semi-automated way, in which the members of a community do not all need to be known beforehand.

Algorithm 1: Algorithm for getting members belonging to a community
function GetCommunityMembers(Us, Kl, Ku, maxc)
    ▷ Us is the set of seed users
    ▷ Kl is the set of keywords for filtering lists
    ▷ Ku is the set of keywords for filtering users
    ▷ maxc is the maximum number of community members
    L ← GetLists(Us)
    L0 ← FilterLists(L, Kl)
    U ← GetUsers(L0)
    U0 ← FilterUsers(U, Ku)
    i ← 0
    repeat
        i ← i + 1
        L ← GetLists(Ui−1)
        Li ← FilterLists(L, Kl)
        U ← GetUsers(Li)
        Ui ← Ui−1 ∪ FilterUsers(U, Ku)
    until |Ui| = |Ui−1| or |Ui| ≥ maxc
    return Ui    ▷ Uc, the set of users belonging to community c
end function
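As a rough illustration of Algorithm 1, the following Python sketch runs the same snowball loop over in-memory toy data. Everything in it is an assumption made for the example: the function `snowball`, the injected callables (stand-ins for the Twitter list-membership and list-member calls), and the dictionaries `MEMBERSHIPS`, `LIST_MEMBERS` and `BIOS`. No real Twitter API is called.

```python
def snowball(seed_users, get_lists, get_members, keep_list, keep_user, max_c):
    """Grow a community from seed users through shared list memberships.

    get_lists(user)   -> lists the user appears in (hypothetical stand-in)
    get_members(lst)  -> users on a list (hypothetical stand-in)
    keep_list / keep_user -> keyword predicates playing the role of Kl and Ku
    """
    community = set()
    frontier = set(seed_users)
    seen_lists = set()
    while frontier and len(community) < max_c:
        # Lists of the current frontier that pass the list-name filter.
        new_lists = {
            lst
            for user in frontier
            for lst in get_lists(user)
            if lst not in seen_lists and keep_list(lst)
        }
        seen_lists |= new_lists
        # Members of those lists that pass the profile filter.
        candidates = {
            user
            for lst in new_lists
            for user in get_members(lst)
            if keep_user(user)
        }
        frontier = candidates - community  # only genuinely new users
        community |= frontier
    return community


# Toy data, invented for illustration only.
MEMBERSHIPS = {
    "alice": ["Journalists PK"],
    "bob": ["Journalists PK", "Political Analysts"],
    "carol": ["Political Analysts"],
    "dave": ["Political Analysts", "Cricket Fans"],
}
LIST_MEMBERS = {
    "Journalists PK": ["alice", "bob"],
    "Political Analysts": ["bob", "carol", "dave"],
    "Cricket Fans": ["dave", "erin"],
}
BIOS = {
    "alice": "Journalist at a daily",
    "bob": "Senior journalist",
    "carol": "Political analyst",
    "dave": "Cricketer",
    "erin": "Sports fan",
}
KEYWORDS = ("journalist", "analyst")

members = snowball(
    ["alice"],
    get_lists=lambda u: MEMBERSHIPS.get(u, []),
    get_members=lambda lst: LIST_MEMBERS[lst],
    keep_list=lambda lst: any(k in lst.lower() for k in KEYWORDS),
    keep_user=lambda u: any(k in BIOS[u].lower() for k in KEYWORDS),
    max_c=100,
)
print(sorted(members))  # ['alice', 'bob', 'carol']
```

The loop stops when a round adds no new users or when the community reaches `max_c`, mirroring the two stopping conditions described in the text. Tracking `seen_lists` is a design choice of this sketch that avoids re-fetching lists already expanded.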

¹ https://dev.twitter.com/rest/reference/get/lists/memberships
² https://dev.twitter.com/rest/reference/get/lists/members

Collecting and Filtering Tweets

The second part of the proposed framework acquires the tweets (Tc) of the targeted community (c) and filters these tweets based on an event (e). Algorithm 2 is used for getting the tweets. The GetTweets function gets all the publicly available³ tweets of the user u using the Twitter API⁴.

Algorithm 2: Algorithm for getting tweets of community members Uc
function GetCommunityTweets(Uc)
    Tc ← ∅
    for all u ∈ Uc do
        Tc ← Tc ∪ GetTweets(u)
    end for
    return Tc
end function

For identifying the tweets (Tce) that belong to a particular event, CommuniMents uses a set of keywords Ke that represent the event e. Currently, this list is maintained semi-automatically, in such a way that seed keywords representing the event are used to fetch a set of tweets belonging to the event [Abdelhaq et al., 2016]. The list of seed keywords is extended by adding the most frequent keywords from the tweets to the list Ke, and the process of adding frequent keywords is repeated until no more frequent and relevant words are found. The process is described in Algorithm 3.

Algorithm 3: Algorithm for filtering tweets Tc related to an event e based on the set of keywords Ke
function FilterCommunityTweets(Tc, Ke)
    Tce ← ∅
    L ← ∅    ▷ L is the list of frequent words
    repeat
        KTempe ← Ke
        for all t ∈ Tc do
            if t contains a keyword from Ke and t not in Tce then
                Tce ← Tce ∪ {t}
                for all words w in t do
                    if L contains w then
                        freq(w) ← freq(w) + 1
                    else
                        freq(w) ← 1
                        L ← L ∪ {w}
                    end if
                end for
            end if
        end for
        Add most frequent relevant words to Ke
    until |KTempe| = |Ke|
    return Tce
end function

³ Currently the Twitter API returns the last 3200 publicly available tweets of a user.
⁴ https://dev.twitter.com/rest/reference/get/statuses/user_timeline
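The keyword-expansion loop of Algorithm 3 can be sketched as follows. This is an illustrative approximation under stated assumptions: the names `tokenize` and `filter_event_tweets`, the `top_n` cut-off, and the `is_relevant` callback (which stands in for the manual relevance judgement the text describes as semi-automatic) are all hypothetical, and the sample tweets are invented.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case word tokens; the '#' of a hashtag is dropped."""
    return re.findall(r"\w+", text.lower())


def filter_event_tweets(tweets, seed_keywords, is_relevant, top_n=20):
    """Select event tweets and grow the keyword set until it stops changing.

    is_relevant(word) stands in for the human decision on whether a
    frequent word really belongs to the event.
    """
    keywords = {k.lower() for k in seed_keywords}
    while True:
        selected = [t for t in tweets if keywords & set(tokenize(t))]
        counts = Counter(w for t in selected for w in tokenize(t))
        new_words = {
            w
            for w, _ in counts.most_common(top_n)
            if w not in keywords and is_relevant(w)
        }
        if not new_words:  # no keyword was added in this round
            return selected, keywords
        keywords |= new_words


tweets = [
    "#ZarbEAzb operation begins in Waziristan",
    "IDPs from Waziristan reach Bannu",
    "Hockey final tonight",
    "Relief camp for IDPs in Bannu",
]
relevant_words = {"waziristan", "idps"}  # pretend manual judgement
event_tweets, final_keywords = filter_event_tweets(
    tweets, {"zarbeazb"}, lambda w: w in relevant_words
)
print(len(event_tweets), sorted(final_keywords))
```

Because keywords are only ever added and the vocabulary is finite, the loop terminates. The irrelevant tweets reported later for this kind of filter arise exactly here: a keyword such as a place name also matches tweets about unrelated topics.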

Sentiment Analysis

The third component of CommuniMents pre-processes the tweets filtered by Algorithm 3 (Tce) and finds the sentiment score of each tweet. The sentiment scores of individual tweets are aggregated to obtain an overall sentiment score (SSce), which indicates the sentiments of the community c for the event e. Before finding sentiment scores for individual tweets, it is important to pre-process the tweets, because tweets have limited length (140 characters) and this limited length forces users to use slang words and abbreviations. Twitter users also use mentions, URLs, and hashtags. It is also quite possible that users make intentional (to keep the tweet short) or unintentional spelling mistakes. These Twitter specific features affect the accuracy of sentiment analysis algorithms, since most sentiment analysis tools are not specially built for Twitter. Therefore, it may be better to extract Twitter specific features before extracting features for sentiment analysis. Addressing these issues helps in achieving higher accuracy [Bao et al., 2014]. The effectiveness of these pre-processing steps for sentiment analysis has already been reported in [Khan et al., 2014]. Algorithm 4 shows the pre-processing steps.

Algorithm 4: Algorithm for pre-processing tweets Tce
function PreProcessTweets(Tce)
    TPce ← ∅
    for all t ∈ Tce do
        tTemp ← RemoveURLs(t)
        tTemp ← RemoveMentions(tTemp)
        tTemp ← RemoveHashSymbols(tTemp)
        tTemp ← ExpandAbbreviations(tTemp)
        tTemp ← SpellCheck(tTemp)
        tTemp ← Lemmatize(tTemp)
        TPce ← TPce ∪ {tTemp}
    end for
    return TPce
end function

Each function used in Algorithm 4 is described below:

• RemoveURLs: Removes all the strings matching the following regular expression (representing a URL): \b(((https?)\:\/\/)|(www\.))\S+\b

• RemoveMentions: Removes the user ids (user screen names preceded by an @ symbol) in a tweet by using the following regular expression: \@\S+\b

• RemoveHashSymbols: Removes the hash symbols (#) in a tweet. The hashtag text (the word preceded by the # symbol) is kept. Without removing hash symbols, the sentiment analyzer cannot identify the polarity of the hashtag.

• ExpandAbbreviations: Identifies abbreviations and slang words by looking up each word of the tweet in WordNet. If a word is not found in WordNet, it is considered to be slang or an abbreviation. The function then looks up such a word in a customized NetLingo⁵ acronyms dictionary. The words found in the custom dictionary are replaced by their expanded form.

• SpellCheck: Corrects the spelling of the words not found in WordNet and the customized NetLingo dictionary. The Microsoft COM API⁶ is used for correcting the spelling mistakes.

• Lemmatize: Replaces the words with their roots. For the current implementation of CommuniMents, we use the LemmaSharp library⁷.
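The first three steps of Algorithm 4 rely only on the two regular expressions quoted above, so they can be tried directly in Python. This is a sketch under assumptions: the function name `strip_twitter_markup` and the final whitespace clean-up are additions for the example, and the later steps (abbreviation expansion, spell checking, lemmatization) are omitted because they depend on external dictionaries and libraries.

```python
import re

# The two patterns are copied from the descriptions of RemoveURLs and
# RemoveMentions; the rest of this sketch is illustrative.
URL_RE = re.compile(r"\b(((https?)\:\/\/)|(www\.))\S+\b")
MENTION_RE = re.compile(r"\@\S+\b")


def strip_twitter_markup(tweet):
    """Remove URLs, mentions and '#' symbols, keeping the hashtag word."""
    text = URL_RE.sub("", tweet)
    text = MENTION_RE.sub("", text)
    text = text.replace("#", "")
    return " ".join(text.split())  # collapse leftover whitespace (an addition)


example = "Thanks @user for the update #ZarbeAzb http://t.co/abc123"
print(strip_twitter_markup(example))  # Thanks for the update ZarbeAzb
```

Note that the mention pattern stops at the last word boundary, so in a retweet prefix such as `@name:` the trailing colon is left in the text.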

After pre-processing the tweets, CommuniMents assigns a sentiment score to each tweet. These scores are aggregated to measure the sentiment polarity of the community c for the event e. Algorithm 5 shows the process of finding sentiments. GetSentimentScore uses a sentiment analysis library for measuring the sentiment score of an individual tweet t. mins is the minimum score returned by the GetSentimentScore function, representing the most negative sentiment, and maxs is the maximum score returned by the GetSentimentScore function, representing the most positive sentiment. As there are many sentiment analysis tools and libraries available, we evaluated two popular libraries, CoreNLP and SyneSketch. Based on empirical analysis, we found that the results of CoreNLP are better than those of SyneSketch; therefore, the current implementation of CommuniMents uses CoreNLP for finding sentiments. The Stanford CoreNLP toolkit [Manning et al., 2014] classifies a phrase into five integer values ranging from 0 to 4 that describe the fine-grained sentiment classes, i.e. “Very Negative” (0), “Negative” (1), “Neutral” (2), “Positive” (3), and “Very Positive” (4). The values are re-scaled between -2 and 2 for aggregation, such that 0 becomes -2 and 4 becomes 2. In the case of CoreNLP, mins = 0 and maxs = 4.

Algorithm 5: Algorithm for finding sentiments of tweets TPce
function FindSentiments(TPce)
    TSce ← ∅
    SS ← 0
    for all t ∈ TPce do
        sTemp ← GetSentimentScore(t)
        s ← ((sTemp − mins) / (maxs − mins)) × 4 − 2
        SS ← SS + s
        TSce ← TSce ∪ {(t, s)}
    end for
    SSce ← SS / |TPce|
    return (TSce, SSce)
end function
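The re-scaling and averaging in Algorithm 5 are simple enough to check by hand. The sketch below assumes raw scores are already available as integers between mins and maxs (0 to 4 for CoreNLP, as stated above); the function names `rescale` and `community_sentiment` and the sample scores are made up for the illustration, and no sentiment library is called.

```python
def rescale(score, min_s=0, max_s=4):
    """Map a raw score in [min_s, max_s] linearly onto [-2, 2]."""
    return (score - min_s) / (max_s - min_s) * 4 - 2


def community_sentiment(raw_scores, min_s=0, max_s=4):
    """Average of the re-scaled scores, i.e. SSce in Algorithm 5."""
    if not raw_scores:
        raise ValueError("at least one tweet score is required")
    scaled = [rescale(s, min_s, max_s) for s in raw_scores]
    return sum(scaled) / len(scaled)


# Five invented tweets: one very negative, two negative, one neutral,
# one positive.
raw = [0, 1, 1, 2, 3]
print([rescale(s) for s in raw])  # [-2.0, -1.0, -1.0, 0.0, 1.0]
print(community_sentiment(raw))   # -0.6
```

With this scale, a community score near 0 means sentiments roughly balance out, while values towards -2 or 2 indicate consistently negative or positive tweets.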

⁵ http://www.netlingo.com/acronyms.php
⁶ https://msdn.microsoft.com/library?url=/library/en-us/off2000/html/woobjproofreadingerrors.asp
⁷ http://lemmagen.ijs.si/

RESULTS

For testing CommuniMents, we chose the Pakistani journalist community, because this community is active on social media and has great impact on Pakistani society and government. By using the first part of the framework, we identified 969 user ids of Pakistani journalists. For getting event based community sentiments, we downloaded 2,107,374 tweets with metadata belonging to these users. The tweets were then filtered for three distinct events: “Zarb-e-Azab”, “Azadi and Inqlab March” and “Hockey Champions Trophy for Men 2014”.

Zarb-e-Azab is a joint military operation conducted by the Pakistan armed forces against Tehrik-i-Taliban Pakistan (TTP) and other militant groups in the North Waziristan area. This operation started on 15 June 2014, and was in progress when we started to retrieve tweets in 2015. The Azadi (freedom) and Inqilab (revolution) marches were launched by two political parties, Pakistan Tehrik-e-Insaf (PTI) and Pakistan Awami Tehrik (PAT), from Lahore to Islamabad to press the government to meet their demands. The Azadi march continued from 14th August 2014 to 17th December 2014 and the Inqilab march continued from 14th August 2014 to 21st October 2014. The Hockey Champions Trophy for Men 2014 sports event was held from 6th to 14th December 2014 in Bhubaneswar, India. In the semi-finals Pakistan won against India by 4 goals to 3 and Germany won against Australia by 3 goals to 2. The final was won by Germany, which defeated Pakistan by 2 goals to 0.

Table 1 shows the number of event wise tweets and journalists’ participation. It also shows the polarity of sentiments of the whole community for the three events as computed by Algorithm 5 (SSce). The overall sentiments of the journalist community are negative for all the events.

Table 1: Polarity of Sentiments of Journalists Community Participating in three Events

Event | Tweets | Participating Journalists | Polarity
Zarb-e-Azab | 11007 | 605 (65.55%) | -0.71
Azadi and Inqlab March | 144845 | 796 (86.24%) | -0.62
Hockey Champions Trophy for Men 2014 | 13222 | 597 (64.68%) | -0.52

As Algorithms 1 and 3 filter users and tweets on the basis of keywords, there is a chance that irrelevant users and tweets are included during the process. To count the exact number of tweets that are irrelevant, we would need to label the complete dataset, which is not possible due to the large size of the dataset. Instead we created a random sample of 30 event-related (“Zarb-e-Azab”) tweets. These tweets are shown in Table 2 along with their user ids. Analysis of Table 2 shows that 23 of the 25 distinct users (92%) are either journalists or often participate in discussions with journalists. The users at serial numbers 10 and 23 are not actually journalists, but news related accounts. Irrelevant accounts are included because many users keep journalists and news related accounts in the same Twitter list, which affects the process of snowball sampling and causes irrelevant users to be included in the targeted community. When we look at the number of relevant tweets, we find that 27 out of 30 (90%) tweets are relevant to the event. The reason behind the inclusion of irrelevant tweets is the different context of the keywords that journalists use. For example, the tweets at serial numbers 4, 10, and 17 in Table 2 are not related to the event “Zarb-e-Azab” but are included in the dataset because they contain related keywords.

Table 2: Random sample of tweets for the event “Zarb-e-Azab”

S | Twitter id | Tweet
1 | Aak0 | #ZarbEAzb The other side of peace: scared residents flee the war zone
2 | Adnanrandhawa | Wondering when CIA senate committee members disappear and reports “leaked” to media they have gone to N. Waziristan wilfully.
3 | adilshahzeb | See how good he sounds here MT “@MurtazaGeoNews: TuQ says he’ll dispatch 14 truckloads of food/medicine today4 IDPs in Bannu,more2 follow”
4 | ajmaljami | —-RT @ReesEdward: The British in N. Waziristan. Sometime in early 20th century.
5 | alisalmanalvi | Dear COAS, the killers of these innocent school children are not restricted to North Waziristan only. They are everywhere in #Pakistan. need to something fast
6 | AmirMateen2 | What started as a mass exodus of locals is now humanitarian crisis” http://t.co/rokEZDExKz Waziristan
7 | madihariaz | @DalrympleWill Samaa ran a report mocking the cricket team about the amount of their donation for IDPs. Then asked a moulvi’s opinion who deplored it too
8 | MahsudFarooq | Effect of Terrorism on music in South waziristan . @AnserAbbas @alex_gilchrist @FATANews @IftikharFirdous @pirroshan
9 | MahsudFarooq | Clash b/w security forces and millitants, 5 Millitants killed in sarvakai area #south #waziristan agency, security sources
10 | ApnaWaziristan | New Delhi: Kashmir Bharat Ka Atoot Hissa Tha, Hissa Hein Aur Rahay Ga, Pakistan Ko Sirf Apni Fikar Karni Chahiye:#BJP
11 | ApnaWaziristan | *Peshawar: Kpk Hakumat Ny IDPs Ky Liye Shikayat Cell Qaim Kr Diya, IDPs Peer Sy Hafta Subha 9 Bajay Tak Shikayat Darj Kara Saktay Hyn:
12 | AtikaRehman | RT @TahaSSiddiqui: Since we’ve SO MUCH of aid coming in for #Waziristan IDPs, why not waste some thru poor logistics and arrangements! http
13 | BenazirMirSamad | RT @PTIofficial: Khyber Pakhtunkhwa Govt making adequate arrangements for IDPs. Instructions passed to all relevant departments.
14 | SaeedShah | RT @asadmunir38: Suicide attack kills four soldiers in North Waziristan http://t.co/eGRQNMpXBT
15 | DawarSafdar | Afghan gov will teach IDPs children,and here KP Gov ordered to vacate the schools
16 | DawarSafdar | In thousands IDPs in government schools going to IDPs again.
17 | FarooqHKhan | RT @Shahidmasooddr: We pay taxes to Govt for IDPs/flood victims etc.We dont pay taxes to opposition! And its better to beg than to Rob jana
18 | FauziaKasuri | RT @imran_sidra: Ma’am @FauziaKasuri & team while doing clothes shopping for IDPs sisters in Bannu KP. #HelpIDPs #Donate #IKF #PTI http://t
19 | FauziaKasuri | @ArshadSidiqi Thank u for thinking of the IDPs..Allah bless you all.
20 | taahir_khan | @Jan_Achakzai JUI-F should start protest to force the gov’t, military to send back 1 million North Waziristan people instead political Jirga
21 | MinaSohail | RT @asadmunir38: #PakArmy soldier #ZarbeAzb
22 | P_Musharraf | I vehemently condemn the suicide attack on our troops in North Waziristan. The ultimate sacrifice offered by our... http://t.co/nM4MXtXcwt
23 | PakMilitaryNews | RT @AsimBajwaISPR: #ZarbeAzb:A pic taken in #IDP Camp Bannu today. Let us join hands to bring back their smiles#helpidps
24 | PakMilitaryNews | Pakistan plans military operation in North Waziristan, targeting extremist groups
25 | KlasraRauf | RT @Dr_Afaq: @KlasraRauf doesn’t talk of utopia in his Urdu column.He talks about real life solution to #IDPs problem.
26 | penpricker | For me those nameless innocent kids of Waziristan, who die in drone strikes r no less than Malala. All of ’em are victim of a war not ours
27 | muniraqazi | @dunyanetwork @BBhuttoZardari Good one! The PPP’s support for #ZarbEAzb is vital for a stable & secure #Pakistan. #PPP
28 | MudassarGEO | RT @NazranaYusufzai: What would be the meeting point of #drones and #Fazluallah - would he go to Waziristan or drone would come to swat.
29 | nadia_a_mirza | In #Bannu everything trickles down to army to serve n manage #IDPs, No Federal neither Provincial Govt presence, only photo sessions.
30 | QuatrinaHosain | RT @Khawar69: Media shud rather demonise the ideology of TTP and Jandullah that preach killing of humans for political goal. #zarbeazb will

Figure 4: Distribution of tweets by the sentiment classifier: (a) for the event “Zarb-e-Azab”, (b) for the event “Azadi and Inqlab March” and (c) for the event “Hockey Champions Trophy for Men 2014”

Event based sentiments classification

Sentiment analysis has certain limitations. For example, a large share of negative tweets about an event does not necessarily mean that people oppose the event; sometimes they speak in favor of the event while criticizing the opposing side or view. To address this problem, in this section we discuss the sentiment results of each of the three events individually.

Figure 4a shows that 76% of the tweets for the event “Zarb-e-Azab” are classified as negative. The operation “Zarb-e-Azab” was against the militants. The majority of negative tweets suggests that the targeted community (Pakistani journalists) was against the operation “Zarb-e-Azab”. In their tweets, these journalists expressed their opinions regarding the issues in the operation, the militants, and the problems faced by IDPs (internally displaced persons). The tweets at serial no. 1, 3, 6, 7, 14, 16, and 17 in Table 3 show that the majority of the journalists are not in favour of the operation. The tweets in Table 3 are randomly selected from all the negative tweets of this event. A further 5% of the tweets are classified as positive. In the positive tweets, journalists praise the operation and the bravery of the law enforcement agencies; for example, the tweets at serial no. 1 and 7 in Table 4 support our argument. The tweets in Table 4 are randomly selected from the positive set.

Figure 4b shows the sentiment analysis of the event “Azadi and Inqlab March”. As discussed in previous sections, this movement by two political parties, PTI and PAT, was against the government. As shown in the pie chart, 70% of the tweets are classified as negative, 22% as neutral, and only 8% as positive. We randomly chose 20 negative tweets, shown in Table 5, to study why the majority of the tweets are classified as negative. In the tweets at serial no. 1, 6 and 20, journalists express negative sentiments about the event itself, and in the tweet at serial no. 10 the march is compared with previous movements. In other tweets, people express their support for the march while criticizing the government and others. Thus, a majority of negative tweets does not mean that the majority of journalists were against the “Azadi and Inqlab March”. A random sample of tweets with positive sentiments is shown in Table 6. In these tweets, journalists express their opinion directly in favor of the event; examples are the tweets at serial no. 2, 3, 4, 8 and 9 in Table 6.
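The two steps used in this analysis, computing the share of each polarity class and drawing a random set of tweets for manual reading, can be sketched as follows. This is a minimal illustration rather than the framework's actual code: the function names `polarity_shares` and `sample_for_review`, the dictionary layout of a tweet, and the toy data (which merely mimic the shares reported for Figure 4b) are all assumptions.

```python
import random
from collections import Counter


def polarity_shares(labels):
    """Return the percentage of tweets carrying each polarity label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


def sample_for_review(tweets, polarity, k=20, seed=0):
    """Draw up to k tweets of one polarity at random for qualitative reading."""
    pool = [t for t in tweets if t["polarity"] == polarity]
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return rng.sample(pool, min(k, len(pool)))


# Toy data (hypothetical, not the paper's dataset).
tweets = (
    [{"text": f"neg {i}", "polarity": "negative"} for i in range(70)]
    + [{"text": f"neu {i}", "polarity": "neutral"} for i in range(22)]
    + [{"text": f"pos {i}", "polarity": "positive"} for i in range(8)]
)

shares = polarity_shares(t["polarity"] for t in tweets)
review = sample_for_review(tweets, "negative", k=20)
```

The shares alone say nothing about whom the negativity is aimed at, which is why the sampled tweets are then read by hand.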

Figure 4c shows the tweet classification for the event “Hockey Champions Trophy for Men 2014”. The tweets of this event are classified as 65% negative, 13% positive and 25% neutral. The negative sentiments predominate because of criticism of the International Hockey Federation (FIH), which banned two Pakistani hockey players for a violation of discipline. For further analysis, we randomly chose 20 of the negatively classified tweets, shown in Table 7. In the tweets at serial no. 5, 6, 8, 15, 18 and 19, people express their disgust at the decision of the FIH, the Indian media, and others. These are the major reasons that the majority of the sentiments for this event are classified as negative.

Table 3: Tweets with Negative Sentiments from the Event “Zarb-e-Azab”

S | Twitter id | Tweet | Polarity
1 | AtifSal | #Pakistan rulers claims of #ZarbeAzb n #Waziristan exposed wth #PeshawarAttack.Fighting paid US War of Terror is bringing mayhem inside Pak. | Negative
2 | KhSaad Rafique | Attended High level meeting chaired by Prime Minister. Current political situation , Operation , IDPs were discussed | Negative
3 | MariumCh | RT @YusraSAskari: ’So far 572,529 people, belonging to 44,633 families have been registered as Internally Displaced Persons’ #ZarbEAzb #Pak | Negative
4 | arsched Gen | #PMSharif refused to authorise operation against militants when #GenSharif sought authorisation in Feb & March #ZarbEAzb | Negative
5 | DawarSafdar | RT @washingtonpost: Pakistan military advances against Taliban, kills 27 militants in North Waziristan http://t.co/3gMo14Rnbn | Negative
6 | TahaSSiddiqui | #PakArmy commences op in #NorthWaziristan by name of #ZarbeAzb (Prophet’s sword name). Will it b a failure like Swat & South Waziristan ops? | Negative
7 | Mustafa Qadri | Pakistan authorities must ensure mil operations in N Waziristan respect laws of war, no collective punishment & provide for IDPs | Negative
8 | wasi78 | Terrorists’ network worth Rs 2 billion 49 crore 80 lakh destroyed as Zarb-eAzb continues —... http://t.co/bYaDYGgk65 | Negative
9 | taahir_khan | Blast at marketplace in North Waziristan’s headquarters Miranshah killed two people on Tuesday, tribesmen said. | Negative
10 | PATofficialPK | #ZarbEAzb #ZarbeHaq rally by #PAT in Lahore to show solidarity with #PakArmy operation #Pakistan http://t.co/qgpa5i5SQk | Negative
11 | MahsudFarooq | 2 soldiers wer killed,4 injd wen n IED planted by Terrorists, exploded on roadside in area vill Jatarai barwand n South Waziristan Agency, | Negative
12 | asmashirazi | RT @fareedraees:TTP’s Hafiz Gul Bahadur instructions to the people of #Waziristan. He has advised people to migrate before 10th June. http | Negative
13 | ZaaraAbbas Khar | RT @AsimBajwaISPR: Army #Chief visited CMH,met injured students.Students said,we are in high spirits,consider us soldiers of ZarbeAzb,don’t | Negative
14 | shaistaAziz | How many people people remain in north Waziristan and which groups is the army targeting? http://t.co/AIiBjqUYCk via #Pakistan #ZarbeAzb | Negative
15 | aliarqam | RT @mjdawar: Waziristan has been Razed to the ground by PAF Jets and none of the terrorists killed. #StateFranchisedTerror | Negative
16 | DawarSafdar | Till curfew in waziristan,I think and worry that with curfew uplifting the war will start among military and militancy . | Negative
17 | SanaTGulzar | RT @iramabbasi: Tribal customs making it difficult for some women IDPs to get access to all the help the need:My Video Story @BBCUrdu http: | Negative
18 | arsched | RT @javerias: Chief of Army Staff General Raheel Sharif at MiranShah . #ZarbEAzb #MilataryOperation #NorthWazirstan #TTP #Pakistan http://t | Negative
19 | Rabail26 | Extremist religious outfits have access to #IDPs in Bannu to provide ”relief”: by @TahaSSiddiqui. http://t.co/SbAGoPBYwu #ZarbeAzb #Pakistan | Negative
20 | kazmiwajahat | BREAKING: #PakArmy troops are deployed in all the major cities of #Pakistan including #Karachi, #Lahore & #Islamabad. #ZarbeAzb #TTP | Negative

Table 4: Tweets with Positive Sentiments from the Event “Zarb-e-Azab”

S | Twitter id | Tweet | Polarity
1 | khushnood2020 | RT @amarbail1: I am sure helpin IDPs in holy month of #Ramzan will bring peace and satisfaction to ur heart. #HelpIDPs @ErumManzoor http:// | Positive
2 | MishalHusainBBC | Happy Christmas, wherever you are RT @Razarumi: Church in South Waziristan celebrates http://t.co/wK07ymz2 | Positive
3 | FauziaKasuri | RT @syedsuhaibshah: Mrs. @FauziaKasuri distributing relief goods among the IDPs of North Waziristan with team@GVPakistan. @rameez_mumtaz ht | Positive
4 | sharmeenochinoy | Now would be a good time to address the nation! #PM #ZarbEAzb | Positive
5 | NadiaaQasim | Plz Allah protect our soldiers who r in #WaziristanOperation as they always protect us.Allah Bless U and may you come home safe & sound,, | Positive
6 | AtifSal | How many $$$?”@AsimBajwaISPR: #ZarbeAzb:Whole of nation approach will help us succeed vs terrorism,extremism in st http://t.co/VEQWUrZEHG” | Positive
7 | taahir_khan | Air strikes in Waziristan 'effective and successful': Sartaj Aziz http://t.co/9e81NV21el | Positive
8 | TheHaroonRashid | A selfie with lovely Waziristan orphans at Sweet Home. Watch report now on @BBCurdu on Aaj TV http://t.co/QOzuwQnpB6 | Positive
9 | omar_quraishi | RT @Majid_Agha: Dear @AsimBajwaISPR #ZarbEAzb is hope of the nation.#BringBackTaseerAndGillani http://t.co/kdUJ8cW2k9 | Positive
10 | praveenswami | Jibran Ahmad has a great piece on refugees fleeing Pakistans war-torn North Waziristan — http://t.co/VyfkEskSIt | Positive

Table 5: Tweets with Negative Sentiments from the Event “Azadi and Inqlab March”

S | Twitter id | Tweet | Polarity
1 | NasimZehra | Maulana Fazlur Rehman makes terrible/incorrect accusation of fa’hashi against PTI’s protests.Stick to pol, Constitutional issues,Maulana sb | Negative
2 | Mahamali05 | Will the reforms include making one joint electorate for all? Will you ask for this reform Imran Khan? | Negative
3 | ZaidZamanHamid | That is why Altaf the toad and beaten up, corrupt politicians are supporting Dr. TUQ. They want their share in the National govt ! Got it ? | Negative
4 | MominaKhawar | Life. ”@fasi_zaka: A day after meeting his hero, Imran Khan’s biggest fan passes away http://t.co/qdPnLuaNyX” | Negative
5 | shabbeerwahgra | RT @saleemiss: Imran Khan releasing his workers from the Police Station, making a video & uploading it from his official page too https://t | Negative
6 | ImaanZHazir | RT @mazdaki: After calling off his dharna Dr.Tahir-ul-Qadri walks straight into the dustbin of history; Aabpara will retrieve him if & when | Negative
7 | mushtaqminhas | RT @Maria95PTI: @Asad_Umar @ImranKhanPTI Almost many many pti girl wing ISB are upset ...We will not come in azadi march if its in allian | Negative
8 | FaisalJavedKhan | VIDEO: Imran Khan’s speech on 41st day of the protest at #AzadiSquare 23rd Sep, 2014 http://t.co/28coSjml5r | Negative
9 | SikanderBalouch | @AnsarAAbbasi or Ulma Counsal ny wese bhe pehle bewi sy ijzat walee shart bhee khatm kardee,,ab tu Naya Pakistan with New Wife #AzadiSquare | Negative
10 | tariqbutt_ | Particularly sacrifices of interior Sindh ppl in MRD movement were matchless Then many revolutionaries of 2day weren’t even born in politics | Negative
11 | KlasraRauf | RT @aslammuz: @Uzma_Views @KlasraRauf @arsched media is also responsible for this, promoting a criminal as a hero like @ImranKhanAnchor | Negative
12 | AzazSyed | Met two doctors both support @ImranKhanPTI and both admit he has lack of vision. | Negative
13 | sanabucha | Resignations? PTI in an effort to prove ’we mean business’ by going ’out of business’! Only ’business as usual’ in KPK. Vah! | Negative
14 | SaeedShah | RT @MurtazaGeoNews: Leading female reporters @Fereeha and @asmashirazi threaten to boycott #PTI coverage if attacks on journalists by #PTI | Negative
15 | mohsinrz | RT @KamranShafi46: 9,000 in one DHARNA and 11,000 in another ain’t makin’ Nawaz/Shahbaz Sharif to ’GO’ and the assemblies to be dissolved.B | Negative
16 | ShahidMursaleen | RT @RaheeqAbbasi: One of the reason for #DrQadri to go abroad: Govt denied right of treatment in Pakistan for #DrQadri #LongLiveDrQadri htt | Negative
17 | Khalil_a_hassan | RT @TahirulQadri: We condemn the death of a protesting #PTI worker in Faisalabad and the state brutality towards them. #PAT | Negative
18 | wajih_sani | RT @HamidMirGEO: Imran Khan mentioned missing persons after a long time good to hear that from him | Negative
19 | DrAwab | RT @ArsalanGhumman: Dharna has ended but uprising movement has stated ! #RespectForIK | Negative
20 | kdastgirkhan | MT @AnsarAAbbasi:Breaking DI Khan & Bannu Jails was terrorism. Removing prisoners forcibly from police van was political activity? | Negative

Table 6: Tweets with Positive Sentiments from the Event “Azadi and Inqlab March”

S | Twitter id | Tweet | Polarity
1 | NazBalochPTI | Musarrat Misbah joins PTI. Great to have a leading woman in the field of social welfare become part of #PTIFamily. http://t.co/QE5aNGOgSo | Positive
2 | Fereeha | Just finished a very interesting meeting with @TahirulQadri at his home. He categorically denied rumours of a deal. http://t.co/Hwii2BJubK | Positive
3 | mosharrafzaidi | This is the finest piece you will read on Imran Khan’s Plan C. Even Insafians may like it (if you read to the end). http://t.co/ruybDf9lf1 | Positive
4 | jasmeenmanzoor | RT @AmnaKhanPTI: And the hilarious moment when IK cleans his sweating with his kameez....Baqio k tou tissues hi nahi khatam hote | Positive
5 | kazmiwajahat | An attempt to get celebrities like Shahid Afridi & Wasim Akram & in return a good number of crowd. #ShameOnIK #PTI http://t.co/7gaItCQ59G | Positive
6 | NadeemMalikLive | @ImranKhanPTI talking with @nadeemmalik , right now @SAMAATV http://t.co/9Cfg9tgsFK | Positive
7 | Mubashirlucman | @AsimBajwaISPR sir when will you gives us good news of Nawaz Sharif arrest? #AzadimarchPTI #InqilabmarchwithDrQadri #DrQadri | Positive
8 | FarahnazZahidi | Thank you Sargodha! Massive! And well done #PTI for such an organzied jalsa. | Positive
9 | arsched | The rich man stops laughing when the revolution comes. Quote #RevolutionMarch | Positive
10 | NazBalochPTI | RT @syedarr: @RNYousuf @NazBalochPTI nice to meet the enthusiastic PTI couple .. Like My brother and sister .. #GoNawazGo http://t.co/zp | Positive

Table 7: Tweets with Negative Sentiments from the Event “Hockey Champions Trophy for Men 2014”

S | Twitter id | Tweet | Polarity
1 | JavedAzizKhan | Indian hockey chief announces ending ties with #Pakistan : TV reports ... WTH ? .. #Hockey #ChampionsTrophy | Negative
2 | SaadiaAfzaal | RT @usmanmanzoor: When will Malik Riaz announce plots and cash for the poor Hockey players ??? #Waiting | Negative
3 | alisalmanalvi | Some asses are set on fire... Sore losers. https://t.co/sh9w6CAkXC #ChampionsTrophy2014 #PakvInd #Hockey | Negative
4 | shakirhusain | RT @faizanlakhani: BLAST FROM THE PAST: This is India’s Prabhjot Singh during WC2010, after Indian’s loss to Argentina. cc: @FIH_Hockey htt | Negative
5 | AsmatullahNiazi | congratulations #Indian #Media for success in getting ban from #FIH is it not a biased decisions #fihockey ??????????? | Negative
6 | ApaAlii | RT @SalaamHockey: It was only 5 members of the 7k crowd who’d said derogatory things & unfortunately got the better of our boys, spoiling t | Negative
7 | IffatHasanRizvi | RT @Anujmanocha: @iffathasanrizvi @imvkohli cricket ka badla hockey me ! wah!! . see u in the world cup 2015 | Negative
8 | MuhamadAfzalECP | RT @AQpk: After years of #Indian abuse and pettiness, finally someone from #Pakistan pays them in kind. #PakistanHockeyTeam #TitForTat #We | Negative
9 | yasmeen_9 | RT @faizanlakhani: Nadeem Omar, the businessman who helped Pakistan Hockey team financially, announces gold medals for the players of Pakis | Negative
10 | ArifAlvi | Sorry! I opted out of @arsched ARY show because did not want to give up a double whammy show of cricket and hockey @KlasraRauf | Negative
11 | SikanderRJ | Congrats Pakistan Hockey team on reaching the final of the #CT2014 beating India 4-3 | Negative
12 | khalidkhan787 | pics: Muhammad Tousiq, left and Ammad Shakeel Butt performing Sajda after beating Netherlands in Hockey Quarterfinals http://t.co/WYttkzw4vg | Negative
13 | AnsarAAbbasi | Report- Hockey India calls off bilateral series with Pakistan. #RoIndiaRo #RoIndiaRo #RoIndiaRo #RoIndiaRo #RoIndiaRo #RoIndiaRo | Negative
14 | AdilNajam | Congrats #Pakistan for #Silver in #Hockey #ChampionsTrophy.But uneeded controversy bad for #SouthAsia + for #Hockey. http://t.co/0zHxzYQ6Wq | Negative
15 | AqilSajjad | India’s behavior in hockey after yesterday’s game & how its been behaving in cricket for the last few years. totally shameful | Negative
16 | khawajaNNInews | Report Decision PAK Player #22 Ali Amjad https://t.co/7HToKfnPUu #CT2014 #Bhubaneswar #fihockey” | Negative
17 | faizanlakhani | Nadeem Omar, the businessman who helped Pakistan Hockey team financially, announces gold medals for the players of Pakistan team. #CT2014 | Negative
18 | asadrana74 | RT @Khan_Arsalan: That awkward moment when World’s Largest Democracy cries over a Hockey Match Defeat..#RoIndiaRo #CT2014 http://t.co/uQLj | Negative
19 | ApaAlii | Why has @FIH_Hockey not made Youtube live streaming available in England!? | Negative
20 | AQpk | #Pakistan remember: Our hockey team defeated in semifinals Champions Trophy becuz #India lobbied @FIH_Hockey to wrongfully ban 2 key players | Negative

To measure the effectiveness of the proposed framework, we computed its precision. Although it would be desirable to measure precision on the complete dataset, no labeled dataset is available, so we computed precision on random samples. We calculate two types of precision: Pt for the tweets retrieved by the framework and Pu for the users retrieved by the framework. Pt = 1 if all the tweets retrieved by the framework are relevant to the event, and Pu = 1 if all the users retrieved by the framework belong to the community. To compute each precision, we take 3 random samples of 50 tweets each and average the precision over the samples. We compute Pt and Pu using the following equations:

$$P_t = \frac{\text{number of retrieved tweets relevant to the event}}{\text{total number of retrieved tweets}} \quad (1)$$

$$P_u = \frac{\text{number of retrieved users belonging to the community}}{\text{total number of retrieved users}} \quad (2)$$

Table 8 shows the average precision for the tweets and the users. The high precision values indicate the effectiveness of CommuniMents. Even for totally different types of events, at least 76% of the tweets fetched by the framework are related to the event. Moreover, the framework performs even better in terms of user precision (Pu), where the lowest precision is 81%, for the event “Azadi and Inqlab March”.

Table 8: Precision of framework in terms of relevant tweets (Pt) and relevant users (Pu)

Event | Pt | Pu
Zarb-e-Azab | 0.79 | 0.93
Azadi and Inqlab March | 0.77 | 0.81
Hockey Champions Trophy for Men 2014 | 0.76 | 0.83
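The sample-based evaluation described above (three random samples of 50 items each, judged by hand, with the precision averaged over the samples) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the relevance judgements are made-up values, not the paper's annotations.

```python
from statistics import mean


def sample_precision(judgements):
    """Precision of one sample: fraction of items judged relevant (True)."""
    return sum(judgements) / len(judgements)


def mean_sample_precision(samples):
    """Average the per-sample precision over all samples."""
    return mean(sample_precision(s) for s in samples)


# Hypothetical manual judgements for 3 samples of 50 retrieved tweets each:
# True means the tweet was judged relevant to the event.
samples = [
    [True] * 40 + [False] * 10,  # 40/50 = 0.80
    [True] * 39 + [False] * 11,  # 39/50 = 0.78
    [True] * 38 + [False] * 12,  # 38/50 = 0.76
]

p_t = mean_sample_precision(samples)
print(f"Pt = {p_t:.2f}")
```

The same two functions apply to user precision if each judgement records whether a retrieved user belongs to the community.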

DISCUSSION

To learn the opinion of the targeted community about the events under study, we identified the members of the community using the snowball sampling technique. During this process, irrelevant users may be fetched along with actual members. To reduce this risk, an effective filtering mechanism is necessary; to filter irrelevant users, we carried out a semi-manual process. To learn the opinions of journalists about the events under study, we performed event based sentiment analysis of their tweets. A popular sentiment analyzer (Stanford CoreNLP) classified the tweets into very positive, positive, neutral, negative, and very negative. It is difficult to rely completely on sentiments when forming an opinion about an event, because the meaning of positive and negative depends on the context. For example, in the event “Azadi and Inqlab March”, journalists condemned the federal government and the Election Commission of Pakistan for the rigging in the 2013 general election, and criticized the way the federal government handled the protest and sit-in (dharna) by using paramilitary forces. In the same way, journalists also criticized the Punjab government over the Model Town incident in Lahore. Because of this condemnation and criticism, a large part of the event-related tweets are classified as negative. Upon analysis of these negative tweets, we concluded that the targeted community was actually supporting these protests in their tweets.
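The analyzer reports five classes, while the results in Figure 4 are given as negative, neutral, and positive. One plausible way to get from the former to the latter is to merge each "very" class into its base polarity, as sketched below. The mapping and the names `FIVE_TO_THREE` and `collapse` are assumptions made for illustration; the text does not state how the five classes were reduced.

```python
from collections import Counter

# Assumed mapping: fold the two extreme classes into their base polarity.
FIVE_TO_THREE = {
    "Very positive": "Positive",
    "Positive": "Positive",
    "Neutral": "Neutral",
    "Negative": "Negative",
    "Very negative": "Negative",
}


def collapse(label):
    """Map a five-class sentiment label to a three-class polarity."""
    key = label.strip().capitalize()  # tolerate case and whitespace variants
    return FIVE_TO_THREE[key]


# Hypothetical five-class outputs for six tweets.
five_class = [
    "Very negative", "Negative", "Neutral",
    "Positive", "Very positive", "negative",
]
three_class = Counter(collapse(label) for label in five_class)
```

Merging the classes keeps the polarity but discards the intensity, which is one more reason the polarity counts alone give only a first impression of a community's opinion.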

On the other hand, the tweets related to the event “Zarb-e-Azab” that were classified as negative really were negative. The community under study was reluctant about the operation, taking the view that the war was not their own, and was worried about the displacement of the ordinary residents of the area where the operation was started. In the event “Hockey Champions Trophy for Men 2014”, a large part of the relevant tweets were also classified as negative. The Pakistani journalist community expressed their opinion in support of the Pakistani players and team while condemning and criticizing the ban decision by the International Hockey Federation (FIH). The experiments and evaluation show that sentiment analysis, i.e., the polarity of tweets, is helpful only for forming a preliminary and rapid opinion about a specific event. To know the complete opinion of a community about a specific event, qualitative analysis of the tweets is necessary.

CONCLUSIONS AND FUTURE WORK

In this paper, we discussed the significance and role of social media among various communities and highlighted the influential role of communities, which is powered by social media. Given the importance of communities, we proposed a generic framework to identify a specific community and to find the event based sentiments of that community. CommuniMents consists of three main parts: identifying the members of the targeted community; acquiring, filtering, and cleansing tweets for the required events; and finding the sentiments. To test the framework, we chose the Pakistani journalist community as the targeted community and three different real events in the Pakistani context. The precision and recall of the obtained tweets and their sentiments demonstrate the effectiveness of the proposed framework. Future research directions include a fully automatic filtering algorithm that separates the relevant members of a targeted community from irrelevant members.
Also, in this study the keyword lists are prepared manually and extended semi-automatically; in the future, the process of creating keyword lists can be automated. For the current evaluation, tweets in languages other than English are excluded, because the sentiment analyzers do not understand the local Pakistani languages, especially Urdu and Roman Urdu. Future work may therefore be directed at providing sentiment analysis of Urdu and Roman Urdu tweets in the proposed framework.
