A Novel Method for Worm Containment on Dynamic Social Networks

The 2010 Military Communications Conference - Unclassified Program - Cyber Security and Network Management

Nam P. Nguyen, Ying Xuan, My T. Thai
CISE Department, University of Florida, USA
{nanguyen, yxuan, mythai}@cise.ufl.edu

Abstract: With the introduction of the World Wide Web and online social networks, people have sought ways to socialize and make new friends online over greater distances. Popular social network sites such as Facebook, Twitter and Bebo have witnessed rapid increases in size and in the number of online users over a short period of time. However, alongside this fast expansion comes the threat of malicious software, such as viruses and worms, and of false information propagation. In this paper, we propose a novel adaptive method for containing worm propagation on dynamic social networks. Our approach first takes into account the network community structure and adaptively keeps it updated as the social network evolves, and then contains worm propagation by distributing patches to the most influential users selected from the network communities. To evaluate the performance of our approach, we test it on a Facebook network dataset [17] and compare the infection rates in several cases with those of the recent social-based method introduced in [21]. Experimental results show that our approach not only runs faster but also achieves lower infection rates than the social-based method on dynamic social networks.

I. INTRODUCTION

Online social networks have become increasingly popular. Since their introduction, popular social network sites such as Facebook, Twitter, Bebo and MySpace have attracted millions of users worldwide, many of whom have integrated those sites into their everyday lives. On the bright side, online social networks are ideal places for people to keep in touch with friends and colleagues, to share common interests, to hold discussions in forums or simply to socialize online. On the other side, however, social networks are also fertile ground for the rapid propagation of malicious software (such as viruses or worms) and false information.

Facebook, one of the most famous online social networks, experienced a wide propagation of a trojan worm in late 2008. “Koobface” (popularly known as the “Facebook virus”) was the name of the worm that made its way through not only Facebook but also the Bebo, MySpace and Friendster social networks [7], [9]. Once a user is infected with Koobface, the worm scans the user’s profile and sends fake messages or wall posts to everyone in the user’s friend list, with titles or comments that appeal to people’s curiosity. If one of the user’s friends, attracted by the comments and suspecting nothing, clicks on the link and installs the fake “flash player”, he will be infected, and Koobface’s life cycle will then repeat on the newly infected user’s machine. Since people are now able to access social network sites via cell phones, the worm’s targets are not only computers but also mobile devices.

978-1-4244-8179-8/10/$26.00 ©2010 IEEE

Even though the behavior and propagation techniques of the Koobface worm are not new, not to say classical, they raise a major problem on social networks: since people tend to trust messages or wall posts from their friends, they may unsuspectingly follow malicious links and easily get infected, and as worms are able to spread at an exponential rate [5], the whole social network might be infected in just a short period of time. An attacker will try to select a set of the most vulnerable users such that worm propagation seeded with this set infects a large portion of the network. A network defender, from a similar point of view, also selects a set of potentially vulnerable users, probably different from the attacker’s set, in such a way that worm propagation will mostly be contained once worm patches are sent to those users. One naive solution for the network defender is simply to send patches to all users in the network. However, doing so takes a lot of time and effort [6], and in the meantime the worm may spread to a larger population. Therefore, patching a smaller set of users that still efficiently prevents worms from spreading widely is desirable.

The problem of worm containment becomes more complicated on a dynamic social network, as this kind of network evolves and changes rapidly over time. On online social networks such as Facebook, Twitter or Bebo, changes are constantly introduced by users joining or withdrawing from the network, by the introduction of new social connections or by the removal of existing relationships. The dynamics of social networks thus give worms more chances to spread faster and wider, as they can flexibly switch between new and existing users in order to propagate.
Therefore, the problem of containing worm propagation on social networks is extremely challenging, in the sense that a good solution at the previous time step might not be effective at the next time step. Although one can recompute a new solution each time the network changes, doing so would incur heavy computational cost and time, during which worms would spread even wider. A better solution should quickly and adaptively update the current worm containment strategy based on changes in the network topology, thereby avoiding recomputation from scratch whenever the network evolves.

Recent studies on social networks reveal that they exhibit a very common and important property: the property of containing community structure [8], [15]. Roughly speaking,


community structure is the natural division of the network into groups of vertices with denser connections within each group and fewer connections between groups, where vertices and connections represent network users and their social relationships, respectively. Members of each community usually share some common interests and thus tend to socialize with other members more frequently than with people from outside communities. Moreover, because there are fewer interactions between communities, a user in one community will be suspicious when he receives strange messages from users in different communities. He may contact the sender for confirmation or simply disregard the messages. This implies that, on a social network, worms propagate much faster within a single community than between communities.

Many methods have been proposed for worm containment on computer networks, either by using a multiresolution approach to enhance the power of threshold-based detection [16], by using a simplification of the Threshold Random Walk scan detector [19], by measuring the velocity of the number of new connections and infected hosts [5], or by using fast and efficient worm signature generation [10], [12]. Several methods have also been proposed for cellular and mobile networks [18], [3], [2]. However, all of the above approaches fail to take into account the community structure as well as the dynamics of social networks, and thus might not be appropriate for our problem. A recent work [21] proposed a social-based patching scheme for worm containment on cellular networks. However, this method encounters the following limitations on a real social network: (1) its clustered partitioning does not necessarily reflect the natural network community structure, (2) it requires the number of clusters k (which is generally unknown for social networks) to be specified beforehand, and (3) it exposes weaknesses when dealing with the dynamics of the network.
To overcome these limitations, we propose a novel method for worm containment on a dynamic social network. Our approach first identifies the community structure of the social network and then adaptively keeps this structure updated as the network evolves, without the need for recomputation. Once the network community structure is identified, our patch distribution procedure selects the most influential users from different communities to receive patches. These users, as soon as they receive patches, apply them to disinfect the worm and then redistribute them to all friends in their communities. These actions restrict worm propagation to only some communities of the social network and prevent it from spreading to a larger population.

The rest of this paper is organized as follows: Section II describes in detail our adaptive community updating algorithms and patch distribution method. Section III presents experimental results of our approach in comparison with the one proposed in [21]. Finally, we conclude our paper in Section IV.

II. METHOD DESCRIPTION

Our method for containing viruses and worms on dynamic social networks consists of (1) adaptive algorithms for quickly updating the network community structure from its previous snapshots and (2) a procedure for selecting the influential users to whom patches will be distributed.

A. Graph model

We represent a social network by an undirected unweighted graph $G = (V, E)$ with $|V| = N$ vertices and $|E| = M$ links, where vertices in $V$ and links in $E$ represent network users and their social relationships, respectively. Let $\mathcal{S} = \{S_1, S_2, .., S_p\}$ be a partitioning, i.e., a collection of $p$ disjoint communities of $G$, where $S_i \in \mathcal{S}$ is a community of $G$. For each vertex $a$, denote by $d_a$ and $S_a$ the degree of $a$ and the community $a$ belongs to. Moreover, for any subset $S \subseteq V$, let $m_S$ and $d_S$ denote the number of links within $S$ and the total degree of vertices in $S$, respectively.

Let $G_t = (V_t, E_t)$ be a time dependent network recorded at time $t$. Let $\Delta V_t$ and $\Delta E_t$ be the set of vertices and the set of links to be added (or deleted) at time $t$. Denote by $\Delta G_t = (\Delta V_t, \Delta E_t)$ the change in terms of the whole network. The network at the next time step, $G_{t+1}$, is defined as $G_{t+1} = G_t \cup \Delta G_t$. A dynamic social network $\mathcal{G}$ is defined as a sequence of evolving networks over time: $\mathcal{G} = (G_0, G_1, .., G_h)$.

B. Updating network community structure

Let us first discuss how challenging the problem of updating network community structure is. Let $\mathcal{S} = \{S_1, S_2, .., S_p\}$ be the current community structure of a social network $G$. In each community $S_i$, the number of connections within $S_i$ is much larger than the number of connections linking $S_i$ to its neighbors, i.e., vertices in $S_i$ are much more densely connected inside than outside.
Intuitively, one may think that adding intra links (links whose two endpoints belong to the same community) or removing inter links (links whose endpoints belong to different communities) will strengthen the communities of G and make its structure clearer, and vice versa. However, this need not be the case when two communities cause little distraction to each other. In this scenario, inserting or removing edges will make them look more attractive to each other and thus leaves open the possibility that these communities will be combined. Therefore, the process of updating community structure in a social network is extremely challenging, since even a seemingly insignificant modification of the network topology could cause an unexpected transformation of its community structure.

Changes introduced to a social network are reflected in its underlying graph by either inserting or removing a vertex or a set of vertices, or by either inserting or removing a link or a set of links. We can further decompose these changes into a sequence of single events, in which one vertex or link is inserted (or removed) at a time. Thus, network changes at any time point can be treated as a collection of single events, where a single event is one of the following: {newUser, newLink, removeUser, removeLink}.
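The decomposition of a batch of changes into single events can be sketched in Python as follows. This is only an illustrative sketch, not the paper's implementation: the adjacency-set representation, the helper names `single_events` and `apply_event`, and the ordering of events inside a batch (users added before links, links removed before users) are assumptions made here.

```python
def single_events(add_users=(), add_links=(), del_users=(), del_links=()):
    """Flatten one batch of changes into a list of (event_name, payload) pairs.

    The ordering inside the batch is an assumption of this sketch.
    """
    events = [("newUser", u) for u in add_users]
    events += [("newLink", e) for e in add_links]
    events += [("removeLink", e) for e in del_links]
    events += [("removeUser", u) for u in del_users]
    return events


def apply_event(adj, event):
    """Apply one single event, in place, to a graph stored as {vertex: set(neighbors)}."""
    name, payload = event
    if name == "newUser":
        adj.setdefault(payload, set())
    elif name == "newLink":
        a, b = payload
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    elif name == "removeLink":
        a, b = payload
        adj[a].discard(b)
        adj[b].discard(a)
    elif name == "removeUser":
        # Removing a user also removes every link incident to that user.
        for v in adj.pop(payload, set()):
            adj[v].discard(payload)
    else:
        raise ValueError(f"unknown event: {name}")


# Usage: user 3 joins and befriends user 1, while the link (1, 2) disappears.
graph = {1: {2}, 2: {1}}
for ev in single_events(add_users=[3], add_links=[(3, 1)], del_links=[(1, 2)]):
    apply_event(graph, ev)
```

Processing one event at a time is what lets a community structure be updated incrementally instead of being recomputed on each snapshot.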


Fig. 1. (a) The removal of a social connection within a densely connected community (b) A community is divided into two smaller communities when a social connection is removed.

A closer look at the above single events reveals that some of them can be further decomposed into a sequence of other events. For instance, when a new user joins a network, he usually starts building social relationships by making connections to other users who are already in the network. Thus, the corresponding event newUser can be represented as a series of newLink events. Similarly, removeUser can be represented as a collection of removeLink events. These observations also imply that efficient algorithms for newLink and removeLink are sufficient for handling the dynamics of a social network.

In order to quantify the quality of the network community structure, we use the widely accepted measurement Modularity [14], denoted by $Q$, which is defined as:

$$Q = \sum_{S \in \mathcal{S}} \left( \frac{m_S}{M} - \frac{d_S^2}{4M^2} \right)$$
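As a minimal sketch of the modularity definition above, the following Python function computes Q for a graph stored as adjacency sets. The function name `modularity` and the `community_of` mapping are hypothetical names chosen for this illustration.

```python
def modularity(adj, community_of):
    """Compute Q = sum over communities S of (m_S / M - d_S^2 / (4 M^2)).

    adj: {vertex: set(neighbors)} for an undirected, unweighted graph.
    community_of: {vertex: community label}, one community per vertex.
    """
    total_links = sum(len(nbrs) for nbrs in adj.values()) // 2  # M
    if total_links == 0:
        raise ValueError("modularity is undefined for a graph with no links")

    twice_inner = {}   # 2 * m_S: each intra link is seen from both endpoints
    degree_sum = {}    # d_S
    for u, nbrs in adj.items():
        c = community_of[u]
        degree_sum[c] = degree_sum.get(c, 0) + len(nbrs)
        twice_inner[c] = twice_inner.get(c, 0) + sum(
            1 for v in nbrs if community_of[v] == c
        )

    return sum(
        twice_inner[c] / (2 * total_links)
        - degree_sum[c] ** 2 / (4 * total_links ** 2)
        for c in degree_sum
    )


# Usage: two triangles {0, 1, 2} and {3, 4, 5} joined by the single link (2, 3).
two_triangles = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
labels = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
q = modularity(two_triangles, labels)  # 2 * (3/7 - 49/196) = 5/14
```

Placing all six vertices in a single community gives Q = 0 on this graph, so the two-triangle partition is the better structure under this measure.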

Modularity measures the difference between the number of links within a community and the expected number of such links; the higher the modularity, the better the network community structure. Maximizing $Q$ helps in finding the natural community structure of the network without the need for a predefined number of communities. Therefore, our objective is to find a community assignment for each node in the network such that $Q$ is maximized.

The community updating procedure first requires a basic network community structure $\mathcal{S}$ to start from. Since the input social network is an undirected unweighted graph, this initial structure can easily be obtained by applying one of the available community detection algorithms such as [13], [4], [1]. In this paper, we choose the approach suggested by Blondel et al. [1], since it returns a high modularity community structure in a timely manner [11]. We also assume that each node in the network belongs to a single community.

1) Introducing a new social connection:

When a new link $l$ joining two vertices $a$, $b$ is introduced to the network, we consider the following two cases: (1) $l$ is an intra link and (2) $l$ is an inter link connecting different communities. If $l$ connects two vertices within a community $S$, its presence will help strengthen the inner structure of $S$ (Lemma 1) and will not separate $S$ into smaller modules (Theorem 1). In other words, when two members form a new social relationship inside a community, this relationship strengthens the inner structure of that community. Therefore, we leave the current network community structure intact in this case.
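The intra-link case can be checked numerically with a small Python sketch. It evaluates the per-community term of the modularity formula, $m_S/M - d_S^2/(4M^2)$, before and after one intra link is added, which raises $m_S$ by 1, $d_S$ by 2 and $M$ by 1. The function names and the sample numbers are assumptions chosen for illustration only.

```python
def contribution(m_s, d_s, total_links):
    """Modularity contribution of one community: m_S / M - d_S^2 / (4 M^2)."""
    return m_s / total_links - d_s ** 2 / (4 * total_links ** 2)


def gain_from_intra_link(m_s, d_s, total_links):
    """Change in the community's contribution after adding one intra link.

    One new intra link adds 1 to m_S, 2 to d_S and 1 to M.
    """
    before = contribution(m_s, d_s, total_links)
    after = contribution(m_s + 1, d_s + 2, total_links + 1)
    return after - before


# Usage with hypothetical values: M = 10 links overall, a community with
# m_S = 3 inner links and total degree d_S = 8 (so d_S < M holds).
gain = gain_from_intra_link(3, 8, 10)  # positive in this example
```

A positive gain for a community with $d_S < M$ is what Lemma 1 states, and it is why the community structure is left untouched when an intra link arrives.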

The problem becomes more complicated when $l$ is an inter link connecting communities $S_a$ and $S_b$, since this new link may attract a member of $S_a$ to join $S_b$ and vice versa. If this member has more connections inside $S_a$, he may advertise his new membership to all his friends, and some of them may consequently want to leave $S_a$ and join $S_b$, which might result in the separation of $S_a$ into smaller communities. However, when communities $S_a$ and $S_b$ are less attractive to each other, adding a new link between them might not affect their inner structures and would thus leave them unchanged.

To handle the second case effectively, we first let $a$ and $b$ determine the best communities to join. Intuitively, if $a$ (or $b$) decides to change its current membership (i.e., leaves its current community and joins another one), some of its neighbors would probably change their minds as well. Therefore, we then perform membership testing on all of $a$'s neighbors and let them join new communities if necessary. This updating process repeats until no vertex changes its membership.

In order to determine the membership of a user $u$, we compute two forces, $F_{in}^{S}(u)$ (to keep this user inside community $S$) and $F_{out}^{S}(u)$ (to bring a user to community $S$), as in [20]:

$$F_{in}^{S}(u) = e_u^{in\,S} - \frac{d_u (d_S - d_u)}{2M}, \qquad F_{out}^{S}(u) = \max_{S \in NC(u)} \left\{ e_u^{to\,S} - \frac{d_u\, d_{out\,S}}{2M} \right\},$$

where $NC(u)$ is the set of communities $u$ connects to and $d_{out\,S}$ has the opposite meaning of $d_S$ (see II-A for variable definitions). The detailed procedure is presented in Algorithm 1.

Lemma 1: If $d_S < M$, then inserting a link within community $S$ will increase its modularity contribution.

Theorem 1: Let $S$ be a community in the current structure of $G$. Then $S$ will not be divided into smaller communities when an intra link is added to it.

Proof: Suppose the contrary. Let $T_1, T_2, .., T_k$ be disjoint modules of $S$. Denote by $d_i$, $m_i$ and $l_{ij}$ the total degree of $T_i$, the total number of links within $T_i$ and the number of links going from $T_i$ to $T_j$, respectively. W.L.O.G., assume that when an intra link is added to $S$, it is added to $T_1$. We will prove that

$$\sum_{i<j} l_{ij} < \sum_{i<j} \frac{d_i d_j}{2M}$$

cannot happen because

$$\frac{m_S}{M} - \frac{d_S^2}{4M^2} > \sum_{i=1}^{k} \left( \frac{m_i}{M} - \frac{d_i^2}{4M^2} \right).$$

Now, since $T_1, T_2, .., T_k$ are disjoint subsets of $S$, it implies $d_S = \sum_{i=1}^{k} d_i$ and $m_S = \sum_{i=1}^{k} m_i + \sum_{i<j} l_{ij}$. The above inequality can be written as

$$\frac{m_S}{M} - \sum_{i=1}^{k} \frac{m_i}{M} > \frac{d_S^2}{4M^2} - \sum_{i=1}^{k} \frac{d_i^2}{4M^2} \quad \text{or} \quad \sum_{i<j} l_{ij} > \sum_{i<j} \frac{d_i d_j}{2M}.$$

Now, when an intra link is added to $S$ and $S$ is divided into $T_1, T_2, .., T_k$, we have $q_S < q_{T_1} + q_{T_2} + .. + q_{T_k}$

$$\iff \frac{\sum_{i=1}^{k} m_i + \sum_{i<j} l_{ij} + 1}{M+1} - \frac{\left( \sum_{i=1}^{k} d_i + 2 \right)^2}{4(M+1)^2}$$