Aug 16, 2012 - Twitter as a graph consisting of users and posts as nodes and retweet relations .... first baseline, denoted as #RT, is obviously based on ab-.
Finding Interesting Posts in Twitter Based on Retweet Graph Analysis †
‡
Min-Chul Yang† , Jung-Tae Lee‡ , Seung-Wook Lee† , and Hae-Chang Rim†
Dept. of Computer & Radio Communications Engineering, Korea University, Seoul, Korea Research Institute of Computer Information & Communication, Korea University, Seoul, Korea
{mcyang, jtlee, swlee, rim}@nlp.korea.ac.kr ABSTRACT
guishing interesting tweets with more than 80% accuracy. This simple rule, however, may incorrectly recognize many interesting tweets as not interesting, simply because they do not contain links. In this paper, we follow the definition of interesting tweets provided in [1] and focus on automatic methods for finding tweets that may be of potential interest to not only the authors and their followers but a wider audience. Specifically, we model Twitter as a graph of user and tweet nodes connected by retweet links (edges) and present a variant of the HITS algorithm [4] based on the retweet graph for producing a static ranking of tweets. Our premise is that an interesting tweet attracts users not only in but beyond the author’s own network and impels them to retweet as a sign of recommendation; thus, such a tweet can be found by analyzing the retweet relations in the graph. We conduct experiments on real world tweet data and demonstrate that our method achieves better performance than the simple retweet count approach and a similar recent work [2] that uses supervised learning with a broad spectrum of features.
Millions of posts are being generated in real-time by users in social networking services, such as Twitter. However, a considerable number of those posts are mundane posts that are of interest to the authors and possibly their friends only. This paper investigates the problem of automatically discovering valuable posts that may be of potential interest to a wider audience. Specifically, we model the structure of Twitter as a graph consisting of users and posts as nodes and retweet relations between the nodes as edges. We propose a variant of the HITS algorithm for producing a static ranking of posts. Experimental results on real world data demonstrate that our method can achieve better performance than several baseline methods. Categories and Subject Descriptors: H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval General Terms: Algorithms, Design, Experimentation Keywords: Twitter, tweet ranking, HITS, social network
1.
2. PROPOSED METHOD We treat the problem of finding interesting tweets as a ranking problem where the goal is to derive a scoring function which gives higher scores to interesting tweets than to uninteresting ones in a given set of tweets. Instead of running the HITS algorithm on the tweet side of the network right away, we run it on the user side first. Our intuition is that the authority of a user is an important prior information to infer the score of the tweets that the user published. Graph modeling. Formally, we model the Twitter structure as directed graph G = (N, E) with nodes N and directional edges E. We consider both users U = {u1 , . . . , unu } and tweets T = {t1 , . . . , tnt } as nodes and the retweet relations between these nodes as directional edges. For instance, if tweet ta , created by user ua , retweets tb , written by user ub , we create a retweet edge eta ,tb from ta to tb and another retweet edge eua ,ub from ua to ub . On user-level. We first run HITS on the user nodes in G. For all user nodes, the scores are updated iteratively as: |{uk ∈ U : euj ,uk ∈ E}| × H(uj ) (1) A(ui ) = |{k : euj ,uk ∈ E}| ∀j:e ∈E
INTRODUCTION
Twitter is a social networking service that enables its users to share their status updates, news, observations, and findings in real-time by creating short-length posts known as tweets. The service rapidly gained worldwide popularity as a communication tool, with millions of users generating millions of tweets per day. Although many of those tweets contain valuable information that is of interest to many people, many others are mundane tweets that are of interest only to the authors and their followers (i.e., subscribers). Finding tweets that are of potential interest to a wide audience from large volume of tweets being accumulated in real-time is a crucial but challenging task. One straightforward way is to use the numbers of times tweets have been retweeted (i.e., re-posted). [3] proposes to regard retweet count as a measure of popularity and presents classifiers for predicting whether and how often new posts will be retweeted in the future. However, mundane tweets by very popular users, such as celebrities, with huge numbers of followers can record high retweet counts. [1] uses crowdsourcing to categorize a set of tweets as “only interesting to author and friends” and “possibly interesting to others” and reports that the presence of a URL link is a single, highly effective feature for distin-
uj ,ui
H(uj ) =
∀i:euj ,ui ∈E
Copyright is held by the author/owner(s). SIGIR’12, August 12–16, 2012, Portland, Oregon, USA. ACM 978-1-4503-1472-5/12/08.
|{uk ∈ U : euk ,ui ∈ E}| × A(ui ) |{k : euk ,ui ∈ E}|
(2)
Note that unlike the standard HITS, the authority and hub scores are influenced by the retweet behaviors of users; the
1073
idea here is to dampen the influence of users in case they devote most of their retweet activity toward a very few other users like celebrities. After each iteration, the authority and hub scores are normalized by dividing each of them by the square root of the sum of the squares of all authority values and all hub values respectively. This user-level procedure ends after iterating mU times. After this stage, the algorithm outputs a function SUA : U → [0, 1], which represents the user’s final authority score, and another function SUH : U → [0, 1], which represents the user’s final hub score. On tweet-level. When the user-level stage is ended, we compute the authority and hub scores of the tweet nodes. In each iteration, we start out with each tweet node initially inheriting the authority and hub scores of the user who published the tweet. If C is a function that returns the creator of a given tweet, then the scores are updated iteratively as: A(ti ) = SUA (C(ti )) + α F (etj ,ti ) × H(tj ) (3)
Table 1: Performance of individual Method P@10 P@20 R-Prec #RT 0.294 0.313 0.311 HITSorig 0.203 0.387 0.478 MLrt 0.658 0.566 0.474 MLall 0.819 0.795 0.698 HITSprop 0.852 0.807 0.735 HITSu 0.823 0.732 0.648 HITSt 0.545 0.576 0.536
∀j:etj ,ti ∈E
H(tj ) = SUH (C(tj )) + α
F (etj ,ti ) × A(ti )
(4)
∀i:etj ,ti ∈E
where α > 1 is a weight for hops in retweet chain. If a tweet retweets another tweet which has already been retweeted, it is more likely that the tweet is useful. Thus, tweets are accumulated bonus score, as much as α, for each retweet hop. F is a function that returns β > 1 if the creator of tweet tj is not a follower of the creator of tweet ti and 1 otherwise. It is intuitive that if the users retweet other users’ tweets even if they are not friends, then it is more likely that the such tweets are interesting. After each iteration, the authority and hub scores are normalized as done in the user-level. After the iterations are run mT times, the algorithm finally outputs a scoring function ST : T → [0, 1], which represents the tweet node’s final authority score. We use this function ST to rank a given set of tweets based on interestingness.
3.
EXPERIMENTS
We use real world tweets collected during 31 days of October 2011, containing 64,107,169 tweets and 2,824,365 users. For evaluation, we generated 31 initially ranked lists of tweets, each consisting of top 100 tweets created on a specific date of the month with highest retweet counts accumulated during the next 7 days. Two annotators were asked to categorize each tweet as interesting or not by inspecting its content, as done in [1]. In case of disagreement (about 15% of all cases), a final judgment was made by consensus between the two. We observe that the ratio of tweets judged to be interesting is about 36%. The goal of this evaluation is to demonstrate that our approach is able to produce better ranked lists of tweets by re-ranking interesting tweets highly. Table 1 reports the ranking performance of various methods in terms of Precisions at 10 and 20, R-Precision, and MAP. We compare our method to several baselines. The first baseline, denoted as #RT, is obviously based on absolute retweet counts; tweets with higher retweet counts is ranked higher than those with lower retweet counts. The second baseline, HITSorig , is the standard HITS algorithm in which an authority value of a node is computed as the sum of the hub values that point to it, and a hub value is the sum of the authority values of the nodes it points to. No other influential factors are considered in the calcula-
1074
methods MAP 0.355 0.465 0.551 0.763 0.778 0.732 0.549
tions. Lastly, we choose one recent work [2] that addresses a related problem, which aims at learning to classify tweets as credible or not. Although interestingness and credibility are two distinct concepts, the work presents a wide range of features that may be applied for assessing interestingness of tweets. For re-implementation, we train a SVM classifier using features proposed in [2], which include message-, user-, and propagation-based features1 ; then, we use the probability estimates of the learned classifier for re-ranking. We use leave-one-out cross validation for this learning approach, denoted as MLall . MLrt is a variant that is learned only with propagation-based features observed from retweet trees [2]. We observe that #RT alone is not a sufficient measure for finding interesting tweets. HITSorig performs better than #RT across most metrics but does not perform well in general. MLrt , which uses more features induced from retweet information, demonstrates significantly better results. The results of MLall show that more reasonable performance can be achieved when message- and user-based features are combined with propagation-based features. The proposed method, HITSprop (α = 2, β = 3), shows significantly better performance than #RT and HITSorig and also comparable performance to that of MLall . This result is significant, since our method is based on unsupervised method and does not require complex training as supervised learning approaches do. We lastly compare the contribution of the user-level procedure, denoted as HITSu against the tweet-level procedure, HITSt . We observe that the user-level procedure is much more important than the tweet-level procedure. This confirms our intuition described earlier that user authority is a valuable prior information to infer interestingness of tweets.
4. ACKNOWLEDGMENTS This work was supported by the South Korea Ministry of Knowledge Economy, under the title “Development of social web issue detection-monitoring and prediction technology for big data analytic listening platform of web intelligence” (No. 10039158).
5. REFERENCES [1] O. Alonso, C. Carson, D. Gerster, X. Ji, and S. U. Nabar. Detecting uninteresting content in text streams. In SIGIR CSE Workshop, 2010. [2] C. Castillo, M. Mendoza, and B. Poblete. Information credibility on twitter. In WWW, 2011. [3] L. Hong, O. Dan, and B. D. Davison. Predicting popular messages in twitter. In WWW, 2011. [4] J. M. Kleinberg. Authoritative sources in a hyperlinked environment. J. ACM, 46(5), 1999. 1 We do not use some topic-based features in [2] since such information was not available in our case.