IST-Africa Template

Extrapolation of Aspects of Fake News on Social Networks

Potsane Mohale and Wai Sze Leung
University of Johannesburg, Johannesburg, South Africa
[email protected], [email protected]

Abstract

Social networks provide us with an ideal platform for keeping up to date with the latest information from across the world. However, social networks such as Twitter and Facebook are also proving to be fertile ground for spreading fake news, which reaches a large number of people in a short span of time. The spread of fake news on social networks, especially in times of disaster or on matters involving national security, has unwanted effects on the lives of individuals and societies. The main aim of this research paper is to determine the most common features of fake news on social networks. The results of this study will later be used in the eventual task of building a model for the automatic detection of fake news. We studied over 40 research papers on fake news on social networks and analysed a dataset containing over a million public tweets labelled as spam and non-spam to extrapolate common aspects. We classified the aspects of fake news on social networks into three categories, listing the most prominent aspects in each category.

Keywords: Fake news, Social networks, Fake news detection, Classification algorithms, Twitter, Facebook

1 Introduction

The use of social networks has rapidly changed the way people communicate and access news and information online. Social networks have drawn a large number of users distributed across different regions all over the world, with Facebook alone having more than 1.9 billion registered users (Khalajzadeh, Yuan, Grundy, & Yang, 2016; Efstathiades, Antoniades, Pallis, Dikaiakos, Szlávik & Sips, 2016).
Social networks like Twitter and Facebook provide an environment which enables people to influence each other’s decisions on important issues such as health, shopping, travelling, and professional concerns (Louni & Subbalakshmi, 2014). While social networks are mostly used for communicating, they are also being used to share news on current affairs and important information. Now more than ever, people rely on social networks as a source of news. For example, at any point in time, 85% of the trending topics on Twitter are related to news on current affairs (Vosoughi, Mohsenvand, & Roy, 2017; Shao, Ciampaglia, Flammini, & Menczer, 2016). Unfortunately, there are people who misuse social networks by spreading fake news and rumours and propagating misinformation. Fake news is news whose content has been intentionally changed and manipulated in order to influence readers’ opinions about the facts of events taking place in the real world. It is important that people get only verified news on social networks, as what they read on these platforms can influence their decisions and the actions they take in the real world (Efstathiades et al, 2016). For instance, during natural disasters, there is often a spread of misinformation which leads to unnecessary panic and inappropriate practices (Oyeyemi, Gabarron, & Wynn, 2014; Jin, Wang, Zhao, Dougherty, Cao, Lu, & Ramakrishnan, 2014; Kalyanam, Velupillai, Doan, Conway, & Lanckriet, 2015; Starbird, Maddock, Orand, Achterman, & Mason, 2014; Reuter, Kaufhold, & Steinfort, 2017; Hughes & Palen, 2009). Recent elections in Kenya and the United States, for example, have been the target of copious reports of fake news published on social media. The misinformation spread in both elections is believed to have influenced their outcomes (Starbird et al, 2014; Reuter et al, 2017; Allcott & Gentzkow, 2017; Bessi & Ferrara, 2016).
Due to the high volumes of social media posts made every second (7,901 tweets are posted every second on Twitter (Internet Live Stats, 2018), while Facebook users post approximately 510,000 comments every minute (Zephoria, 2018)), social networking companies have the unenviable task of monitoring and policing their networks in a manner that will not be regarded by the community as heavy-handed, biased, or infringing on freedom of speech (Stewman, 2018). Automating the monitoring of social networks is, therefore, one potential solution that could overcome the high post volumes while addressing issues of potential bias. To achieve this, a model that considers the attribute categories of user attributes, post features, and propagation features will need to be developed for implementation in a machine learning environment.

The first step is to address the following question: What are the most prominent features of fake news which need to be considered in order to verify the validity of news on social networks? Although much research has been conducted on detecting misinformation on social media by looking at different aspects of news on these networks, there appears to be no agreed-upon set of common features to consider when attempting to classify news as either fake or real. This paper aims to produce a ranking of the most prominent features of fake news on social networks. This ranking is intended to guide a solution for detecting and counteracting fake news propagated on social networks. We also provide a brief, high-level explanation of the model we intend to build, which will be guided by the results of this study.

Research on fake news detection on social networks is still in its early stages, and many issues still need standardization and formalization, such as the categorization and characterization of the features of fake news on social networks. Currently, social networks do not have mature tools which automatically flag content as potentially fake or deceptive. Most of these platforms rely on manual ways of detecting fake news.
Manual detection is highly ineffective, given the large volumes of data generated daily on some of the most popular social networks (Liu, Xu, & Tourassi, 2016). It allows fake news to spread rapidly, resulting in more social harm and damage (Hughes & Palen, 2009; Castillo, Mendoza, & Poblete, 2011). In this paper, we address this problem by defining the categories of features of news on social networks, which will guide future research and intelligent implementations of the automatic detection of fake news on social networks.

The rest of this research paper is organized as follows: In Section 2, the methodology used to carry out the research is detailed, describing the datasets used and the literature reviewed. Sections 3 and 4 present the literature studied and the datasets we analysed, respectively. Section 5 discusses the results of the analysis of the datasets and the literature review. In Section 6, the model we will build based on the results of this research is outlined, with Sections 7 and 8 concluding the paper with a discussion of future work that will follow from the findings in this paper.

2 Methodology

A mixed-methods approach is followed in this research, in which an extensive systematic literature review is conducted qualitatively to extrapolate the common aspects considered when evaluating whether a particular piece of news is fake or real. A systematic literature review is helpful for synthesizing literature in order to derive knowledge which is not covered in the existing literature (Crossan & Apaydin, 2010). It is a reliable method which maximizes the audit trail of evidence for our study and limits bias in the conclusions this study will reach. In addition, we analyse three annotated datasets that collect spam and fake news posts in order to identify common aspects present in these known spam or fake posts.
The results of both are then used to develop the set of aspects deemed suitable indicators for incorporation into our model for automated fake news detection.

3 Literature review

A literature review was conducted by first identifying existing research papers on the topic. This identification was achieved by searching for research articles in the ACM, IEEE, ScienceDirect, Emerald, Springer, and Wiley databases. This choice of databases ensured that we only obtained peer-reviewed research papers in Computer Science and Informatics. Our search included keywords such as “fake news on social networks”, “detecting fake news on social networks”, “spam users on social networks” and “news verification on social networks”. From this search, we obtained a total of 258 research papers, of which only 40 were considered relevant for our purpose. From the initial literature review, features of fake news on social networks were extracted and categorized into user attributes, post features, and propagation features.

3.1 User attributes

User attributes refer to the personal features of the user account which posted the news in question. These attributes include whether or not the account has been verified, the age of the user, their gender, and the user account’s description (Vosoughi, Mohsenvand, & Roy, 2017; Castillo, Mendoza, & Poblete, 2011). Another important user attribute is the user’s influence, which is measured by the number of followers the user has. User influence is closely related to the user role, which is the ratio of the number of users the current user follows to the number of users who follow the current user. The period an account has been in existence is another aspect which can tell a lot about the validity of the news posted: accounts which spread fake news on social networks normally have a lot of followers and short periods of existence (Vosoughi, Mohsenvand, & Roy, 2017; Shao et al, 2016; Castillo et al, 2011; Giasemidis, Singleton, Agrafiotis, Nurse, Pilgrim, Willis, & Greetham, 2016). It is, however, important to note that user attributes alone cannot reveal much about the posts on social networks. As such, assessing the truthfulness of a post requires considering the other aspects relating to the post as well.

3.2 Post features

Accounts propagating fake news on social networks use a different writing style when posting their messages.
A post containing fake news on social networks often contains external links to the original source of the news (Khalajzadeh et al, 2016; Louni & Subbalakshmi, 2014; Jin et al, 2014). These links are often shortened using URL-shortening services like bit.ly. Another characteristic of a post containing fake news is the number of hashtags: users posting malicious content on social networks often include many hashtags and mention many users in their posts in order to gain more coverage (Vosoughi, Mohsenvand, & Roy, 2017). The formality of the text and the use of vulgar words and abbreviations in a post can indicate whether the post is malicious or not (Giasemidis et al, 2016). Special characters like exclamation marks and question marks, and the length of the words and sentences used in a post, can help define the validity of the post. The time and the geographical location from which the post was sent can also be verified when establishing the validity of content on social networks.

3.3 Propagation-based features

Propagation features of content on social networks define its diffusion dynamics on the network. Propagation features such as the number of comments on the post, the origin of the post, and the number of times it has been shared by other users are considered to be important (Castillo et al, 2011). Propagation features define a propagation tree which results from the sharing of, and engagement with, the post in question. The significance of these features lies in the fact that people place more trust in posts shared with them by others than in original posts posted directly to them by someone (Louni & Subbalakshmi, 2014; Zhao, Resnick, & Mei, 2015). Table 1 details the results of existing research identified as potential sources on how to recognise fake news based on the categories of features discussed above. Some of the research material detects fake news based on a single category, while other works consider all the categories we identified.
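As a minimal illustration of how the user and propagation features above could be computed, consider the following sketch. The record layout (field names like `followers` and `following`, and the list of share edges) is an assumption made for illustration, not a real social-network API schema.

```python
# Sketch of deriving the user and propagation features described in Sections
# 3.1 and 3.3 from one post record. Field names (followers, following, share
# edges) are illustrative assumptions, not a real social-network API schema.

def user_features(account):
    """User attributes (Section 3.1), including the 'user role' ratio."""
    followers = account["followers"]
    following = account["following"]
    return {
        "verified": account["verified"],
        "account_age_days": account["account_age_days"],
        "followers": followers,
        # User role: accounts followed relative to accounts following back.
        "user_role": following / followers if followers else float("inf"),
    }

def propagation_depth(root, shares):
    """Maximum level of the propagation tree (Section 3.3), where `shares`
    is a list of (parent, child) share edges rooted at the original post."""
    children = {}
    for parent, child in shares:
        children.setdefault(parent, []).append(child)

    def depth(node):
        return 1 + max((depth(c) for c in children.get(node, [])), default=0)

    return depth(root) - 1  # levels of resharing below the original post

account = {"verified": False, "account_age_days": 30,
           "followers": 50, "following": 2000}
print(user_features(account)["user_role"])                           # 40.0
print(propagation_depth("p", [("p", "a"), ("a", "b"), ("p", "c")]))  # 2
```

A user role well above 1 (here 40.0: following 2,000 accounts but followed by only 50) combined with a young account is exactly the pattern the literature above associates with spreaders of fake news.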

Table 1: Literature on the detection of fake news, showing which of the three feature categories (user attributes, post features, propagation-based features) each article considers. The articles surveyed are: Louni & Subbalakshmi (2014); Vosoughi, Mohsenvand & Roy (2017); Reuter, Kaufhold & Steinfort (2017); Hughes & Palen (2009); Castillo, Mendoza & Poblete (2011); Giasemidis, Singleton & Agrafiotis (2016); Yang & Yu (2012); Zhao, Resnick & Mei (2015); Horne & Adali (2017); Ratkiewicz, Conover & Meiss (2010); Keller, Schoch, Stier & Yang (2017); Hamidian & Diab (2016); Biyani, Tsioutsiouliklis & Blackmer (2016); Liu, Xu & Tourassi (2016); Ferrara, Varol, Menczer & Flammini (2016); Conroy, Rubin & Chen (2015); Saez-Trumper (2014); Chen, Zhang, Chen, Xiang & Zhou (2015); Sedhai & Sun (2017); Vosoughi & Roy (2016); Yang, Zhong, Ha & Oh (2016); Rubin (2017); Goel, Watts & Goldstein (2012); Deng, Fu & Yang (2015); Kwon, Cha, Jung, Chen & Wang (2013); Alonso, Carson, Gerster, Ji & Nabar (2010); Agichtein, Castillo & Donato (2008); Vosoughi (2015); Zubiaga, Liakata, Procter, Bontcheva & Tolmie (2015); Nurse, Creese, Goldsmith & Rahman (2013); Nurse, Agrafiotis, Goldsmith, Creese & Lamberts (2014); Castillo, Mendoza & Poblete (2013); Kwon, Cha, Jung, Chen & Wang (2013); Zubiaga, Liakata, Procter, Bontcheva & Tolmie (2013); Hamidian & Diab (2015); Jin, Cao, Zhang & Luo (2016); Varol, Ferrara, Davis, Menczer & Flammini (2017); Vijayaraghavan, Vosoughi & Roy (2017).

4 Dataset analysis

We studied three annotated datasets: the Twitter spam dataset, BuzzFeed’s United States of America presidential elections dataset, and the LIAR dataset. These three datasets fit this research well because they all contain clearly annotated social network content. The Twitter spam dataset consists of 12 light-weight features, which we analysed to rank the most important ones. The BuzzFeed dataset contains Facebook posts labelled as fake and real; we analysed these posts and their features and deduced the most prominent features of fake news on Facebook. The LIAR dataset has six human-curated truthfulness labels based on the content of each entry.

4.1 Twitter spam dataset

We used an annotated dataset of spam tweets published by Chen, Zhang, Chen, Xiang, & Zhou (2015), who used the Twitter Streaming API to collect a set of about 600 million tweets containing URLs. They then ran every URL through Trend Micro’s Web Reputation Service to identify tweets containing malicious URLs. Trend Micro’s Web Reputation Service collects and analyses URLs and reports on their legitimacy (Yang et al, 2012). 1% of the tweets in this dataset are labelled as spam. The dataset is further broken down into more features; in Table 2 we summarize and categorize the IDs of these features.

Table 2: Features of the Twitter spam dataset

User features: account_age, no_follower, no_following
Post features: no_userfavourites, no_hashtag, no_usermention, no_urls, no_char, no_digits
Propagation-based features: no_tweets, no_retweets
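One way to rank light-weight features like those in Table 2 by how well they separate spam from non-spam tweets is a standardized mean difference per feature. The sketch below uses a small synthetic stand-in generated to mimic the patterns reported later in Section 5; it is not the Chen et al. (2015) dataset itself, and the chosen distributions are assumptions.

```python
import numpy as np

# Rank features by how well each separates spam from non-spam tweets,
# using the absolute standardized mean difference (Cohen's d).
# The data below is a synthetic stand-in, NOT the real Twitter spam dataset.
rng = np.random.default_rng(0)
n = 1000
features = ["account_age", "no_follower", "no_hashtag", "no_urls"]
spam = np.column_stack([
    rng.normal(60, 20, n),    # young accounts
    rng.normal(80, 30, n),    # few followers
    rng.normal(5, 2, n),      # many hashtags
    rng.normal(1.5, 0.5, n),  # URLs in most tweets
])
ham = np.column_stack([
    rng.normal(900, 300, n),
    rng.normal(400, 150, n),
    rng.normal(1, 1, n),
    rng.normal(0.3, 0.4, n),
])

def separability(a, b):
    """Absolute standardized mean difference per feature column."""
    pooled = np.sqrt((a.var(axis=0) + b.var(axis=0)) / 2)
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / pooled

scores = separability(spam, ham)
for name, s in sorted(zip(features, scores), key=lambda t: -t[1]):
    print(f"{name:12s} {s:.2f}")
```

On this synthetic data, account age dominates the ranking, echoing the short lifespan of spam accounts noted in the literature; on the real dataset, the ranking would of course follow the actual class distributions.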

4.2 BuzzFeed United States of America 2016 presidential elections dataset

We also used a labelled dataset published by Horne & Adali (2017), who collected news stories from BuzzFeed’s collection of fake election stories on Facebook, published in the months before the 2016 United States of America presidential elections. The original BuzzFeed collection is made up of the real and fake stories which generated the highest user engagement on Facebook. This dataset contains a collection of Facebook posts marked as fake news and another collection marked as legitimate. The dataset is annotated with a set of features relating to the user involved, the content posted, and propagation dynamics. The following attributes stand out as the difference between posts marked as fake and posts marked as legitimate:

1. Account identity
2. Post identity
3. URLs included
4. Date the post was made
5. Multimedia (images or video)
6. Share count
7. Reaction count
8. Comment count

4.3 LIAR dataset

We inspected this dataset, which contains statements from tweets and Facebook posts covering a diverse set of topics ranging from the economy and education to immigration and election-related issues. The dataset contains more than 12,000 labelled statements collected from the fact-checking website PolitiFact, accessed through its API (Shu, Wang, & Liu, 2017). For this dataset, the content features were considered and annotated with six labels: pants-fire, false, barely-true, half-true, mostly-true and true. These labels are human-curated and are based on the statements themselves.
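Because LIAR’s six labels form an ordinal truthfulness scale, they can be collapsed to a numeric score, or to a binary fake/real label, before training a classifier. The 0–1 mapping and threshold below are our own assumptions for illustration, not part of the dataset.

```python
# Sketch: collapsing LIAR's six ordinal labels to a numeric truthfulness
# score and a binary fake/real label. The 0-1 mapping is an assumption.
LIAR_SCORE = {
    "pants-fire": 0.0,
    "false": 0.2,
    "barely-true": 0.4,
    "half-true": 0.6,
    "mostly-true": 0.8,
    "true": 1.0,
}

def is_fake(label, threshold=0.5):
    """Treat statements below the truthfulness threshold as fake."""
    return LIAR_SCORE[label] < threshold

print(is_fake("barely-true"))  # True: treated as fake
print(is_fake("mostly-true"))  # False: treated as real
```

Keeping the full six-way score rather than the binary label preserves more signal for a model that, like ours, outputs a likelihood rather than a hard decision.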

In Table 3 we review the datasets based on which features of fake news can be extracted from each. We note that no single dataset provides all the features, and each dataset has limitations of its own, which makes it necessary for us to consider a number of datasets as well as the literature review. The Twitter spam dataset only considers tweets labelled as spam; however, fake news on social networks can take different forms apart from spam. The BuzzFeed dataset is limited in that it only contains news relating to the US presidential elections of 2016, while the LIAR dataset contains short statements instead of entire posts, and it is not free of human error.

Table 3: Feature categories covered by each dataset

Dataset | User features | Post features | Propagation-based features
BuzzFeed dataset | ✔ | ✔ | ✔
Twitter spam dataset | ✔ | ✔ | ✔
LIAR dataset | | ✔ |



5 Results

After studying the literature and the datasets we collected, we further investigated the most relevant attributes of fake news on social networks in each of the three categories we identified. We begin by presenting the results of the literature review of the 40 research papers we studied. Going through the literature, we noted all the features each author utilized in their respective models and built a table of the best features in the process. We have summarized the top 25 features that are most relevant to the effective detection of fake news on social networks. Table 4 summarizes these features according to the category to which they belong; we noted 10 features each for the user-based and content-based categories, and 5 propagation-based features.

Table 4: Features of fake news from different categories

Content-based features: URLs in the post; number of hashtags; word complexity; sentence complexity; emoticons used; special characters; multimedia content; abbreviations; location; time and date.

User-based features: originality; engagement; number of followers; number of users following; credibility; number of posts created; verification status; length of membership period; influence; roles.

Propagation-based features: number of post shares; number of comments; number of shares with external URLs; propagation maximum sub-tree; propagation maximum level.
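Several of the content-based features in Table 4 can be computed directly from the raw text of a post. The sketch below shows one plausible way to do so; the tokenization and the choice of "word complexity" as average word length are our own simplifying assumptions.

```python
import re

# Sketch: computing a few content-based features from Table 4 out of the
# raw post text. Tokenization and the word-complexity proxy (average word
# length) are illustrative assumptions, not the paper's exact definitions.
def content_features(text):
    words = re.findall(r"[A-Za-z']+", text)
    return {
        "urls": len(re.findall(r"https?://\S+", text)),
        "hashtags": text.count("#"),
        "mentions": text.count("@"),
        "exclamations": text.count("!"),
        "questions": text.count("?"),
        # Word complexity proxied by average word length.
        "avg_word_len": sum(map(len, words)) / len(words) if words else 0.0,
        "all_caps_words": sum(w.isupper() and len(w) > 1 for w in words),
    }

post = "BREAKING!!! You won't BELIEVE this #shock #news @everyone http://bit.ly/x"
f = content_features(post)
print(f["hashtags"], f["exclamations"], f["urls"])  # 2 3 1
```

The example post fires several of the indicators at once (shortened URL, multiple hashtags, repeated exclamation marks, all-caps words), matching the writing style the literature attributes to malicious posters.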

We further investigated how the different features identified in the literature review affect the classification of a social network account. We achieved this by analysing the Twitter spam dataset in WEKA 3.8, plotting the attributes of each account to obtain a clear distinction between spammer and non-spammer accounts. In Figure 1a, we note that accounts classified as spammers include many more hashtags in their posts. Figure 1b presents the number of posts sent out by non-spammer and spammer accounts, and we can see that spammer accounts send out more tweets. When it comes to users being followed, the plot in Figure 1c shows that accounts which post fake news are likely to follow a large number of users in comparison to non-spammer accounts. We also found that accounts which generate fake tweets have relatively fewer followers than clean accounts, as shown in Figure 1d. The average number of characters in a post was found to be almost the same for the two classes of users (Figure 1e); however, this may be due to the fact that the dataset was extracted from Twitter, which limits each post to 140 characters. Finally, accounts which spread fake news on social networks do not last long before they are removed: Figure 1f shows that spreaders of fake news have far fewer days of existence than clean accounts.

Figure 1: Plots of the two classes of user for each feature: (a) hashtags used, (b) posts created, (c) users followed, (d) followers, (e) average characters in a post, (f) number of days an account has existed. Spammers are plotted on top and non-spammers on the bottom in (a)–(c); the order is reversed in (d)–(f).
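A text-only analogue of the Figure 1 comparison is a simple per-class average of each feature. The records below are a tiny synthetic stand-in, not the actual Twitter spam dataset; the field names follow Table 2.

```python
# Sketch: comparing per-class feature averages for spammer vs. non-spammer
# accounts, as a text-only analogue of the Figure 1 plots.
# These records are a synthetic stand-in, not the real dataset.
accounts = [
    {"spam": True,  "no_hashtag": 6, "no_tweets": 900, "no_follower": 40},
    {"spam": True,  "no_hashtag": 4, "no_tweets": 700, "no_follower": 60},
    {"spam": False, "no_hashtag": 1, "no_tweets": 120, "no_follower": 500},
    {"spam": False, "no_hashtag": 0, "no_tweets": 80,  "no_follower": 300},
]

def class_means(records, feature):
    """Mean of `feature` for each class (True = spammer)."""
    by_class = {True: [], False: []}
    for r in records:
        by_class[r["spam"]].append(r[feature])
    return {c: sum(v) / len(v) for c, v in by_class.items()}

for feat in ("no_hashtag", "no_tweets", "no_follower"):
    m = class_means(accounts, feat)
    print(f"{feat:12s} spam={m[True]:.1f} clean={m[False]:.1f}")
```

The synthetic means reproduce the direction of the effects observed above: spammers use more hashtags, tweet more, and have fewer followers.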

From the attributes studied in the literature review and the analysis of the datasets, we went on to identify the most prominent features from each of the categories. Table 5 shows the final features of fake news which we found to be the most widely considered for the purpose of detecting fake news on social networks.

Table 5: Most used features of fake news on social networks

Feature — Category
Originality — User-based
Number of hashtags — Content-based
Number of post shares — Propagation-based
Number of comments — Propagation-based
Number of shares with external URLs — Propagation-based
Engagement — User-based
Number of followers — User-based
Number of users following — User-based
Location — Content-based
Time and date — Content-based
Verification status — User-based

6 Model construction

Having identified the most suitable features of fake news on social networks, we now define a high-level view of the model we can build for the automatic detection of fake news. Since fake news is often a combination of valid statements and false claims, it is reasonable to consider a model which predicts the likelihood of content being fake rather than outputting a binary value. The model involves extracting these features, according to the categories in which they are defined, and incorporating them into a set of supervised classifiers. This ensemble solution combines different supervised classifiers in order to learn a stronger, more effective classifier. Supervised classification methods generally provide more accurate results than semi-supervised or unsupervised learning (Shu, Sliva, Wang, Tang, & Liu, 2017). An ensemble classification method is beneficial because combining the results of each classifier leads to stronger and more effective detection of fake news.

7 Discussion

Given the far-reaching implications of fake news running rampant on social media, it is imperative that mechanisms exist and are in place to reduce the negative impact it may have on society. As seen in the various examples already cited, fake news could potentially be regarded as a form of cyber warfare that attempts to destabilise a nation. The features identified in this work will enable researchers to further develop the ability to automate the process of detecting fake news on social networks. This, in turn, will give users, media companies, and social networks the ability to discern real news from fake. Organisations and governments will be able to improve their defences against posts aimed at creating instability. There will be greater confidence in social networks, and users will be able to make better-informed decisions in their lives, sharing valid information about health, security, and political matters, as well as helpful intelligence during disasters (Shao et al, 2016; Castillo, Mendoza & Poblete, 2011; Yang et al, 2012).
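Returning briefly to the model outlined in Section 6, a minimal sketch of such a soft-voting ensemble is shown below. The two member "classifiers" are hand-written toy scorers over the feature names used in this paper (real members would be trained supervised models); their thresholds are purely illustrative assumptions.

```python
# Sketch of the Section 6 ensemble: several supervised classifiers whose
# per-instance probabilities are averaged into a fakeness likelihood,
# rather than a hard fake/real decision. The member "classifiers" here are
# toy hand-written scorers with illustrative thresholds.
def user_score(f):
    """Toy member over user features: young, follower-poor accounts."""
    s = 0.0
    s += 0.5 if f["account_age_days"] < 90 else 0.0
    s += 0.5 if f["followers"] < f["following"] else 0.0
    return s

def post_score(f):
    """Toy member over post features: hashtag- and URL-heavy posts."""
    s = 0.0
    s += 0.5 if f["no_hashtag"] >= 3 else 0.0
    s += 0.5 if f["no_urls"] >= 1 else 0.0
    return s

def fakeness(features, classifiers=(user_score, post_score)):
    """Soft vote: mean of the member probabilities, in [0, 1]."""
    return sum(c(features) for c in classifiers) / len(classifiers)

post = {"account_age_days": 20, "followers": 50, "following": 2000,
        "no_hashtag": 5, "no_urls": 1}
print(fakeness(post))  # 1.0 - every indicator fires
```

Averaging member probabilities rather than taking a majority vote is what lets the ensemble output a graded likelihood, which matters when a post mixes valid statements with false claims.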
Furthermore, emergency services in times of disaster will be better able to deal with false alarms, preventing fake news from propagating further (Starbird et al, 2014; Reuter, Kaufhold & Steinfort, 2017; Alonso, Carson, Gerster, Ji, & Nabar, 2010).

8 Conclusions

In this research project, we examined how people around the world make use of social networks and how the information found on social networks affects the decisions and actions of individuals in the real, physical world. We described fake news and identified the repercussions of spreading fake news on social networks. The most prominent features of fake news on social networks were extracted by studying three datasets and conducting a literature review of prior research on the topic of fake news on social networks. Whereas most of the literature we studied considered only a single category of features relating to fake news on social networks, this research paper stands out in that it combines the different features and aggregates the most important ones.

This research can be improved in two ways. Firstly, the datasets we considered came from only two social networks, Facebook and Twitter; more features of fake news may be discovered by considering feeds from a number of different social networks. Secondly, the datasets and literature we studied focused on the English language; more broadly applicable results could be achieved by building datasets based on other languages, allowing fake news published in other languages to be detected with greater accuracy.

In future, we will build on this paper by conducting a study to evaluate the performance of different classifiers that classify social network posts based on the features identified in this paper. Having found the best classifiers and the best features to work with, we will be in a position to build a solution which detects fake news on social networks with a much higher level of accuracy.
References

Agichtein, E., Castillo, C., Donato, D., Gionis, A. & Mishne, G. (2008). Finding high-quality content in social media. In Proceedings of the 2008 International Conference on Web Search and Data Mining, 183-194. ACM.

Allcott, H., & Gentzkow, M. (2017). Social media and fake news in the 2016 election. Journal of Economic Perspectives, 31(2), 211-236.

Alonso, O., Carson, C., Gerster, D., Ji, X. & Nabar, S.U. (2010). Detecting uninteresting content in text streams. In SIGIR Crowdsourcing for Search Evaluation Workshop.

Bessi, A., & Ferrara, E. (2016). Social bots distort the 2016 US Presidential election online discussion.

Biyani, P., Tsioutsiouliklis, K., & Blackmer, J. (2016). "8 Amazing Secrets for Getting More Clicks": Detecting clickbaits in news streams using article informality. In AAAI, 94-100.

Castillo, C., Mendoza, M., & Poblete, B. (2013). Predicting information credibility in time-sensitive social media. Internet Research, 23(5), 560-588.

Chen, C., Zhang, J., Chen, X., Xiang, Y., & Zhou, W. (2015). 6 million spam tweets: A large ground truth for timely Twitter spam detection. In Communications (ICC), 2015 IEEE International Conference, 7065-7070. IEEE.

Conroy, N.J., Rubin, V.L., & Chen, Y. (2015). Automatic deception detection: Methods for finding fake news. Proceedings of the Association for Information Science and Technology, 52(1), 1-4.

Crossan, M. M., & Apaydin, M. (2010). A multi-dimensional framework of organizational innovation: A systematic review of the literature. Journal of Management Studies, 47(6), 1154-1191.

Deng, J., Fu, L., & Yang, Y. (2015). ZLOC: Detection of zombie users in online social networks using location information. In Proc. of the Third IARIA International Conference on Building and Exploring Web Based Environments (WEB 2015), 24-28.

Efstathiades, H., Antoniades, D., Pallis, G., Dikaiakos, M. D., Szlávik, Z., & Sips, R. J. (2016). Online social network evolution: Revisiting the Twitter graph. In Big Data (Big Data), 2016 IEEE International Conference on, 626-635. IEEE.

Ferrara, E., Varol, O., Menczer, F., & Flammini, A. (2016). Detection of promoted social media campaigns. In ICWSM, 563-566.
Giasemidis, G., Singleton, C., Agrafiotis, I., Nurse, J.R., Pilgrim, A., Willis, C., & Greetham, D.V. (2016). Determining the veracity of rumours on Twitter. In International Conference on Social Informatics, 185-205. Springer International Publishing.

Goel, S., Watts, D. J., & Goldstein, D. G. (2012). The structure of online diffusion networks. In Proceedings of the 13th ACM Conference on Electronic Commerce, 623-638. ACM.

Hamidian, S., & Diab, M. (2015). Rumor detection and classification for Twitter data. In Proceedings of the Fifth International Conference on Social Media Technologies, Communication, and Informatics (SOTICS), 71-77.

Hamidian, S., & Diab, M. (2016). Rumor identification and belief investigation on Twitter. In Proceedings of the 7th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, 3-8.

Horne, B. D., & Adali, S. (2017). This just in: Fake news packs a lot in title, uses simpler, repetitive content in text body, more similar to satire than real news. arXiv preprint arXiv:1703.09398.

Hughes, A.L., & Palen, L. (2009). Twitter adoption and use in mass convergence and emergency events. International Journal of Emergency Management, 6(4), 248-260.

Internet Live Stats (2018, January). Retrieved from: http://www.internetlivestats.com/one-second/#tweets-band

Jin, F., Wang, W., Zhao, L., Dougherty, E., Cao, Y., Lu, C. T., & Ramakrishnan, N. (2014). Misinformation propagation in the age of Twitter. Computer, 47(12), 90-94.

Jin, Z., Cao, J., Zhang, Y., & Luo, J. (2016). News verification by exploiting conflicting social viewpoints in microblogs. In AAAI, 2972-2978.

Kalyanam, J., Velupillai, S., Doan, S., Conway, M., & Lanckriet, G. (2015). Facts and fabrications about Ebola: A Twitter based study. arXiv preprint arXiv:1508.02079.

Keller, F. B., Schoch, D., Stier, S., & Yang, J. (2017). How to manipulate social media: Analyzing political astroturfing using ground truth data from South Korea. In ICWSM, 564-567.

Khalajzadeh, H., Yuan, D., Grundy, J., & Yang, Y. (2016). Improving cloud-based online social network data placement and replication. In Cloud Computing (CLOUD), 2016 IEEE 9th International Conference on, 678-685. IEEE.

Kwon, S., Cha, M., Jung, K., Chen, W., & Wang, Y. (2013). Aspects of rumor spreading on a microblog network. In International Conference on Social Informatics, 299-308. Springer, Cham.

Kwon, S., Cha, M., Jung, K., Chen, W., & Wang, Y. (2013). Prominent features of rumor propagation in online social media. In Data Mining (ICDM), 2013 IEEE 13th International Conference on, 1103-1108. IEEE.

Liu, Y., & Xu, S. (2016). Detecting rumors through modeling information propagation networks in a social media environment. IEEE Transactions on Computational Social Systems, 3(2), 46-62.

Louni, A., & Subbalakshmi, K. P. (2014). Diffusion of information in social networks. In Social Networking, 1-22. Springer.

Nurse, J.R., Agrafiotis, I., Goldsmith, M., Creese, S., & Lamberts, K. (2014). Two sides of the coin: Measuring and communicating the trustworthiness of online information. Journal of Trust Management, 1(1), 5.

Nurse, J. R., Creese, S., Goldsmith, M., & Rahman, S. S. (2013). Supporting human decision-making online using information-trustworthiness metrics. In International Conference on Human Aspects of Information Security, Privacy, and Trust, 316-325. Springer.

Oyeyemi, S. O., Gabarron, E., & Wynn, R. (2014). Ebola, Twitter, and misinformation: A dangerous combination? BMJ, 349, g6178.

Ratkiewicz, J., Conover, M., Meiss, M., Gonçalves, B., Patil, S., Flammini, A., & Menczer, F. (2010). Detecting and tracking the spread of astroturf memes in microblog streams. arXiv preprint arXiv:1011.3768.

Reuter, C., Kaufhold, M. A., & Steinfort, R. Rumors, fake news and social bots in conflicts and emergencies: Towards a model for believability in social media.

Rubin, V. L., Conroy, N. J., & Chen, Y. (2015). Towards news verification: Deception detection methods for news discourse. In Hawaii International Conference on System Sciences.

Rubin, V. L. (2017). Deception detection and rumor debunking for social media. The SAGE Handbook of Social Media Research Methods, 342.

Saez-Trumper, D. (2014). Fake tweet buster: A web tool to identify users promoting fake news on Twitter. In Proceedings of the 25th ACM Conference on Hypertext and Social Media, 316-317. ACM.

Sambuli, N. (2017, December). Retrieved from: http://www.aljazeera.com/indepth/opinion/2017/08/kenya-latest-victim-fake-news170816121455181.html

Sedhai, S., & Sun, A. (2017). An analysis of 14 million tweets on hashtag-oriented spamming. Journal of the Association for Information Science and Technology, 68(7), 1638-1651.

Shao, C., Ciampaglia, G. L., Flammini, A., & Menczer, F. (2016, April). Hoaxy: A platform for tracking online misinformation. In Proceedings of the 25th international conference companion on world wide web (pp. 745-750). International World Wide Web Conferences Steering Committee. Shu, K., Sliva, A., Wang, S., Tang, J., & Liu, H. (2017). Fake news detection on social media: A data mining perspective. ACM SIGKDD Explorations Newsletter, 19(1), 22-36. Shu, K., Wang, S., & Liu, H. (2017). Exploiting Tri-Relationship for Fake News Detection. arXiv preprint arXiv:1712.07709. Starbird, K., Maddock, J., Orand, M., Achterman, P., & Mason, R. M. (2014). Rumors, false flags, and digital vigilantes: Misinformation on twitter after the 2013 Boston marathon bombing. iConference 2014 Proceedings. Stewman, R. (2018). Facebook Police: There’s No Such Thing As Free Speech On Social Media. Retrieved from: https://www.huffingtonpost.com/entry/facebook-police-theres-no-such-thing-asfree-speech_us_5a045e95e4b055de8d096ae2 Vanhoenshoven, F., Nápoles, G., Falcon, R., Vanhoof, K., & Köppen, M. (2016, December). Detecting malicious URLs using machine learning techniques. In Computational Intelligence (SSCI), 2016 IEEE Symposium Series on (pp. 1-8). IEEE. Varol, O., Ferrara, E., Davis, C. A., Menczer, F., & Flammini, A. (2017). Online human-bot interactions: Detection, estimation, and characterization. arXiv preprint arXiv:1703.03107. Vijayaraghavan, P., Vosoughi, S., & Roy, D. (2017). Twitter Demographic Classification Using Deep Multi-modal Multi-task Learning. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (Vol. 2, pp. 478-483). Vosoughi, S. (2015). Automatic detection and verification of rumors on Twitter (Doctoral dissertation, Massachusetts Institute of Technology). Vosoughi, S., & Roy, D. (2016, May). A Semi-Automatic Method for Efficient Detection of Stories on Social Media. In ICWSM (pp. 707-710). 
Vosoughi, S., Mohsenvand, M.N. & Roy, D. (2017). Rumor gauge: predicting the veracity of rumors on Twitter. ACM Transactions on Knowledge Discovery from Data (TKDD), 11(4), 50. ACM Yang, H., Zhong, J., Ha, D., & Oh, H. (2016, August). Rumor Propagation Detection System in Social Network Services. In International Conference on Computational Social Networks(pp. 8698). Springer. Castillo, C., Mendoza, M., & Poblete, B. (2011, March). Information credibility on Twitter. In Proceedings of the 20th international conference on World wide web (pp. 675-684). ACM. Zephora (2018). The Top 20 Valuable Facebook Statistics. Retrieved from:

https://zephoria.com/top-15-valuable-facebook-statistics/ Zhao, Z., Resnick, P., & Mei, Q. (2015, May). Enquiring minds: Early detection of rumors in social media from inquiry posts. In Proceedings of the 24th International Conference on World Wide Web (pp. 1395-1405). International World Wide Web Conferences Steering Committee. Zubiaga, A., Liakata, M., Procter, R., Bontcheva, K., and Tolmie, P. (2015). Towards Detecting Rumours in Social Media. In AAAI Workshop: AI for Cities.