Multi-Objective Optimization to Identify Key Players in Social Networks

R. Chulaka Gunasekara, Kishan Mehrotra, Chilukuri K. Mohan
Department of Electrical Engineering and Computer Science
Syracuse University, NY 13244-4100
{rgunasek, mehrotra, mohan}@syr.edu

Abstract: Identification of a set of key players in a given social network is of interest in many disciplines such as sociology, politics, finance, and economics. Although many algorithms have been proposed to identify a set of key players, each emphasizes a single objective of interest. Consequently, the prevailing deficiency of each of these methods is that it performs well only when its objective of interest is the only characteristic the set of key players should have. But in complicated real-life applications, we need a set of key players that can perform well with respect to multiple objectives of interest. In this paper, we propose a new perspective for key player identification, based on optimizing multiple objectives of interest. This method allows us to compare well-established methods of key player identification and to show that the sets of key players identified by this method are better when multiple objectives must be addressed. In addition, we propose an algorithm to select the most suitable sets of key players when multiple choices are available. We apply these algorithms to the Eventual Influence Limitation (EIL) problem and show that our multi-objective perspective outperforms previous approaches.

Keywords: Key Player Identification; Social Network Analysis; Influential Users; Multi-Objective Optimization; Genetic Algorithms

I. INTRODUCTION

Selecting a set of key players from a social network is an important research problem in many disciplines, such as:

• In viral marketing, it is important to identify and target the ‘right’ set of key players in a population to spread information efficiently and effectively.
• In management, it is critical to identify and strategically place the key players to improve the productivity of colleagues.
• In politics, it is necessary to identify key players to gain advantage in political campaigns.

Many node centrality measures have been proposed to capture the different behaviors a node can have in a social setting and thereby identify key players [1, 2], such as:

• Degree centrality, which focuses on the number of peers to which a node is connected [3].
• Betweenness centrality, which considers the number of shortest paths in the network that pass through a certain node [4].
• Closeness centrality, which measures the distance from a certain node to all other nodes [5].
• Eigenvector centrality and PageRank, which focus on the importance of the nodes a certain node is connected to [6, 7].
• Bonacich Power Centrality, which is closely related to Eigenvector centrality except that it introduces a parameter β which determines the radius of the power [8].
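To make two of the measures listed above concrete, the following is a minimal sketch of Degree and Closeness centrality on a small undirected graph. It is an illustration only, not the implementation used in this paper: the adjacency-dictionary representation, the function names, the normalization by n − 1, and the three-node example graph are all assumptions made here, and the closeness function assumes a connected graph.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def degree_centrality(adj):
    """Number of neighbours, normalized by the maximum possible (n - 1)."""
    n = len(adj)
    return {u: len(adj[u]) / (n - 1) for u in adj}

def closeness_centrality(adj):
    """(n - 1) divided by the sum of distances to all other nodes.

    Assumes the graph is connected; a node with no reachable peers scores 0.
    """
    n = len(adj)
    scores = {}
    for u in adj:
        total = sum(bfs_distances(adj, u).values())
        scores[u] = (n - 1) / total if total > 0 else 0.0
    return scores

# Hypothetical example: the path graph 0 - 1 - 2.
path = {0: {1}, 1: {0, 2}, 2: {1}}
degree = degree_centrality(path)
closeness = closeness_centrality(path)
```

In this toy graph the middle node scores highest on both measures, which is the expected behavior for a node that is adjacent to everyone else.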

In other work, Borgatti [9] described how to find a set of key players considering two different aspects. He defined the set of nodes maximally connected to all other nodes as KPP-Pos, and the set of nodes whose removal would result in a residual network with the least possible cohesion as KPP-Neg. Separate key player sets are obtained from both perspectives using combinatorial optimization. In related work, Ortiz-Arroyo and Hussain [10] proposed information-theory-based measures to find KPP-Pos and KPP-Neg sets.

Researchers present different arguments in identifying sets of key players. Kitsak et al. [11] argued that the most efficient information spreaders in a social network are located within the core of the network, as opposed to the nodes which show high local properties, and proposed the k-shell decomposition method to find such nodes. Ilyas and Radha [12] rectified a deficiency of the Eigenvector centrality approach to detecting key players by proposing Principal Component centrality. Other methods to identify sets of key players are proposed by Janssen and Monsuur [13] and Huang et al. [14]. A review of current research on key player identification is found in [15].

One common feature of the aforementioned methods is that only a single property of interest is taken into consideration to identify key players. As an example, in Degree centrality only the number of connections a node has is taken into consideration. Although many new approaches to tackle the key player identification problem have been proposed, no clear argument exists that can justify the characteristics of the selected key players. Ideally, the set of key players identified should have the capability to excel in multiple objectives of interest. Hence we propose a new method of key player identification, which is based on optimizing multiple objectives of interest.

The rest of the paper is organized as follows. In Section II, we revisit the single objective approach to identifying key players and discuss its deficiencies. In Section III, we describe how the multi-objective optimization approach can be used to identify sets of key players; we also demonstrate how it can be used to evaluate single objective key player identification methods. In Section IV, we propose an algorithm which can be used to select the best key player set(s) identified by the multi-objective optimization approach. In Section V, we compare our algorithm against previously proposed methods with regard to the EIL problem and show that the multi-objective approach outperforms previous methods. Finally, Section VI presents conclusions.


II. PROBLEMS WITH APPROACHES THAT OPTIMIZE A SINGLE OBJECTIVE

As mentioned above, current approaches define “key players” with an appropriate objective of interest, and find sets of key players which optimize the identified objective. Let’s consider Eigenvector centrality as an example.


Eigenvector centrality gives priority to nodes that are connected to other important nodes. One deficiency of the Eigenvector centrality approach is that it tends to find key players that are all within the same region of a network [12]. This becomes a critical issue when dealing with massive social networks with multiple communities, as the key players identified by Eigenvector centrality might not represent some of the communities in the network. The strongest argument in favor of searching for key players in different parts of a given network was given by Granovetter [16]. In his paper on “The strength of weak ties”, Granovetter argues that members of one community have much to gain from acquaintances (nodes belonging to another community) for fresh ideas. So, for a set of key players to have diverse ideas and to represent the ideas of the whole population, they should represent all parts of the population; Eigenvector centrality fails to accomplish that.

The Dolphin social network [17] represents the frequent associations between dolphins in a community living off Doubtful Sound, New Zealand. Figure 1 shows the network structure and the communities detected using [18]. The sizes of the nodes are proportional to their Eigenvector centrality, and different communities are denoted by different colors. We applied Eigenvector centrality and Principal Component centrality to identify key players in the Dolphin network. According to Eigenvector centrality, the set of 5 key players is [Grin, SN4, Topless, Scabs, TR99], and all 5 key players come from only 2 communities in the network. Similarly, the set of 5 key players [Grin, Topless, SN4, Scabs, Trigger] identified by Principal Component centrality also belongs to only 2 communities. Eigenvector centrality manages to identify a set of key players who are connected to important nodes in the network, but ignores the importance of the distribution of the key players.
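The clustering effect described above can be reproduced on a toy network. The sketch below is a hypothetical illustration, not the Dolphin data: it builds a 5-clique and a 4-clique joined by one bridge edge, computes Eigenvector centrality by power iteration, and checks how many communities the top three nodes cover. The graph, the community labels, the function names, and the use of the shifted matrix A + I (which has the same eigenvectors as A and avoids oscillation) are assumptions made for this example.

```python
import math
from itertools import combinations

def eigenvector_centrality(adj, iterations=200):
    """Power iteration on A + I; returns a unit-norm principal eigenvector."""
    x = {u: 1.0 for u in adj}
    for _ in range(iterations):
        # One multiplication by (A + I): own score plus neighbours' scores.
        nxt = {u: x[u] + sum(x[v] for v in adj[u]) for u in adj}
        norm = math.sqrt(sum(val * val for val in nxt.values()))
        x = {u: val / norm for u, val in nxt.items()}
    return x

# Hypothetical network: community "A" is a 5-clique (nodes 0-4), community
# "B" is a 4-clique (nodes 5-8), and the single edge (4, 5) bridges them.
edges = list(combinations(range(0, 5), 2))
edges += list(combinations(range(5, 9), 2))
edges.append((4, 5))

adj = {u: set() for u in range(9)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)
community = {u: "A" if u < 5 else "B" for u in adj}

scores = eigenvector_centrality(adj)
# Top three key players by Eigenvector centrality (ties broken by node id).
top3 = sorted(adj, key=lambda u: (-scores[u], u))[:3]
covered = {community[u] for u in top3}
```

In this toy case all three selected nodes come from the larger clique, so the smaller community has no representative among the key players, which mirrors the behavior reported for the Dolphin network.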
A deficiency of this sort is not unique to the Eigenvector centrality approach; any key player identification algorithm that focuses on single objective optimization can suffer from it. The next section presents the multi-objective approach to key player identification that we propose to rectify this major deficiency.

III. IDENTIFYING KEY PLAYERS WITH MULTIPLE OBJECTIVES

In this section, we first briefly discuss the concept of multi-objective optimization. Next, we discuss its implementation to address the key player identification problem. The objective of this approach is to evaluate and understand the characteristics of the multiple approaches that have been proposed to address the key player identification problem discussed in Section I. In addition, the multi-objective approach is also beneficial in addressing the aforementioned concerns related to single objective approaches.

Fig. 1: Dolphin Network

A. Multi-Objective Optimization

The process of optimizing a collection of objective functions is called multi-objective optimization. The general multi-objective optimization problem is:

$$\operatorname*{Optimize}_{x} \; F(x) = [F_1(x), F_2(x), \ldots, F_\ell(x)]^T$$

$$\text{subject to} \quad g_j(x) \le 0, \; j = 1, 2, \ldots, m; \qquad h_l(x) = 0, \; l = 1, 2, \ldots, e.$$

Here each objective function $F_i(x)$ can be either maximized or minimized, $\ell$ is the number of objective functions, $m$ is the number of inequality constraints, $e$ is the number of equality constraints, and $x$ is a vector of $n$ variables. $F(x)$ is an $\ell$-dimensional vector of objective functions. In other words, the goal of multi-objective optimization is to simultaneously obtain the optimum value of all $\ell$ objective functions, subject to the equality and inequality constraints. Typically, there is no single global solution, and it is often necessary to determine a set of points that all fit a predetermined definition of an optimum. In multi-objective optimization problems, we expect to find a set of “Pareto-optimal” solutions, each of which is “non-dominated”¹ by any other solution². All Pareto-optimal points lie on the boundary of the feasible criterion space.

¹ A solution X is non-dominated if every solution that is better than X with respect to any objective function is worse than X with respect to another objective function.
² In a car purchase scenario, for example, one Pareto-optimal solution may have low cost, whereas another has better engine performance; neither is strictly “better” than the other according to both the cost and performance criteria.

The nondominated sorting genetic algorithm II (NSGA-II) is one of the algorithms proposed to solve multi-objective optimization problems [19]. The key player problem can be addressed via a genetic algorithm by a simple transformation. Each individual in the population is a bit string of 0s and 1s. A 1 indicates that the particular node is selected as a key player and a 0 indicates otherwise. As an example, if nodes 4, 6, 10, 14 and 19 were selected as key players from a social network of 20 nodes, this is represented as a bit string of length 20 with indices 4, 6, 10, 14 and 19 set to 1 and the rest set to 0. Initially, in each individual of the population, $k$ bits ($k < n$, where $n$ is the number of nodes in the network) are randomly assigned 1 to represent the selected key players, and the remaining $n - k$ bits are assigned 0. Finally, the $k$ key players are found using the NSGA-II algorithm.

One-point crossover is applied to a fraction $P_c$ of the selected individuals to generate offspring, and mutation is performed with probability $P_m$ to change some elements in selected individuals by flipping the bit value. Since we are interested in identifying exactly $k$ key players, it is important to keep the number of indices set to 1 fixed at $k$. To achieve this, for each offspring with $p > k$ indices set to 1, $p - k$ randomly selected indices are changed back from 1 to 0, and for each offspring with $p < k$ indices set to 1, $k - p$ randomly selected indices are changed from 0 to 1. An alternative method to maintain exactly $k$ 1-bits in each offspring is to employ knowledge about the bits from previous generations of the genetic algorithm.

B. Addressing the deficiency of Eigenvector Centrality with Multi-Objective Optimization

Different objectives can be selected for optimization according to the requirements of the application. Let’s consider the aforementioned problem of Eigenvector centrality.
The issue was that the identified key players were too close to each other. To rectify this problem, in addition to maximizing the average Eigenvector centrality of the set of key players, a second objective of maximizing the distance between the key players was introduced into the problem. The reasoning behind distance maximization is to spread out the set of identified key players. The intention is to find solutions (sets of key players) that maximize both objectives. Recall that the points on the Pareto front are non-dominated with respect to each other.

Figure 2a illustrates the Pareto-optimal front for the Dolphin network and Figure 2b illustrates the same for the Prisoners network [20], where
i. the x-axis represents the average Eigenvector centrality of the selected set of five key players, and
ii. the y-axis represents the total distance between the selected set of key players.

As mentioned earlier, all the points on the Pareto front are non-dominated with respect to each other; thus, depending on the importance given to each chosen objective function, every point on the Pareto front provides a set of $k$ key players. As an example, point A on the Pareto front in Figure 2a refers to the set of key players [Topless, TR99, TSN83, Wave, Zig], and they belong to 4 different communities in the network.

More concretely, consider Figure 2b, the Pareto-optimal front for the Prisoners network. In this figure, point B refers to the set of five key players [0, 18, 21, 36, 51], which belong to 4 different communities in the network. Let’s consider two additional points on the Pareto front: B1 and B2. Although these 3 solutions are non-dominated with respect to each other and are intended to optimize both objectives, they assign different weights to the two objectives considered here. The point B1, which refers to the key player set [0, 10, 18, 37, 51], has a high distance between the selected key players, but its average Eigenvector centrality is low compared to B and B2. This indicates that the solution B1 is appropriate if higher weight should be given to the distance between the key players than to the average Eigenvector centrality. On the other hand, the point B2, which refers to the key player set [7, 36, 40, 51, 54], has a high average Eigenvector centrality but a low distance between the key players compared to B1 and B. If one intends to find key players with high average Eigenvector centrality rather than a wide distribution of the key players, B2 is a better choice than B and B1. The key player set identified by point B gives equal weight to average Eigenvector centrality and the distribution of the key players. So, the selection of the ideal key player set from the suggested points on the Pareto front is application oriented.

The Pareto front not only allows us to identify sets of key players that optimize both objectives, but also allows us to evaluate other key player identification algorithms with regard to the selected objectives. Towards this goal, the sets of key players identified by previously proposed methods were compared against the Pareto front³. The set of 5 key players found by each of these methods gives a certain value for the average Eigenvector centrality and another value for the distance between the key players. These two values were taken as the X and Y coordinates, respectively, of the corresponding key player identification method.
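The building blocks used in this section can be sketched in a few lines. The example below is a hedged illustration, not NSGA-II itself and not the code used for the experiments: it shows the bit-string encoding, the repair step that restores exactly k selected nodes, a non-dominated filter for two maximized objectives, and the Euclidean distance from a method's (X, Y) point to the closest point on the front. All function names and the candidate points are hypothetical; only the encoding example (nodes 4, 6, 10, 14 and 19 out of 20) comes from the text.

```python
import math
import random

def decode(bits):
    """Indices whose bit is 1, i.e. the selected key players."""
    return [i for i, b in enumerate(bits) if b == 1]

def repair(bits, k, rng):
    """Return a copy with exactly k ones, changing randomly chosen bits."""
    bits = list(bits)
    ones = [i for i, b in enumerate(bits) if b == 1]
    zeros = [i for i, b in enumerate(bits) if b == 0]
    if len(ones) > k:            # p > k: clear (p - k) randomly chosen 1s
        for i in rng.sample(ones, len(ones) - k):
            bits[i] = 0
    elif len(ones) < k:          # p < k: set (k - p) randomly chosen 0s
        for i in rng.sample(zeros, k - len(ones)):
            bits[i] = 1
    return bits

def dominates(a, b):
    """a dominates b if it is no worse everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Points not dominated by any other point (all objectives maximized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def distance_to_front(point, front):
    """Euclidean distance from a point to the closest front point."""
    return min(math.dist(point, f) for f in front)

rng = random.Random(0)
parent = [0] * 20
for i in (4, 6, 10, 14, 19):     # encoding example from the text
    parent[i] = 1
child = repair([1] * 8 + [0] * 12, 5, rng)    # offspring with p = 8 > k = 5
sparse = repair([0] * 18 + [1, 1], 5, rng)    # offspring with p = 2 < k = 5

# Made-up (average centrality, total distance) scores for candidate sets.
candidates = [(0.1, 9.0), (0.3, 7.0), (0.5, 2.0), (0.2, 6.0), (0.3, 5.0)]
front = pareto_front(candidates)
gap = distance_to_front((0.3, 6.0), front)    # point of a hypothetical method
```

The quadratic non-dominated filter is adequate for fronts with a few dozen points, as here; NSGA-II uses a more elaborate sorting procedure and a crowding measure that this sketch does not attempt to reproduce.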
The sets of key players identified by previously proposed algorithms fall on or below the Pareto front in both Figure 2a and Figure 2b. According to Figure 2a, the sets of key players identified by all the methods mentioned above fall below the Pareto front, except for the Eigenvector centrality method, which owns the rightmost point on the Pareto front (high average Eigenvector centrality but low distance among the key players). The key player set found by Principal Component centrality falls quite close to the Pareto front (Euclidean distance of 0.44 to the closest point on the Pareto front), while the key player set found by Betweenness centrality is far from the Pareto front (Euclidean distance of 1.04 to the closest point on the Pareto front). This indicates that Principal Component centrality identifies key player sets that are better suited to optimizing both average Eigenvector centrality and distance between key players than those found by Betweenness centrality.

³ The key player identification methods compared here are Degree centrality, Betweenness centrality [4], Eigenvector centrality [6], PageRank [7], Borgatti’s KPP Positive [9], Borgatti’s KPP Negative [9], Principal Component centrality [12], KPP Positive using Information theory [10], KPP Negative using Information theory [10], and K-shell [11].

The same principle can be used to optimize other objectives as well. Let’s consider Borgatti’s positive and negative KPP as two objectives. The idea is to identify a set of key players who are optimally connected to the rest of the network and will maximally disconnect the network upon deletion. Figure 3a shows the Pareto front generated for the Dolphin network for this case. As an example, point C on the Pareto front refers to the set of key players [Beescratch, DN63, Jet, SN63, Trigger]. In this case, the two information theory based key player measures identify sets of key players which are closer to the Pareto front than those of the other measures. Figure 3b shows the Pareto front generated for the Prisoners network considering the same two objectives. As an example, point D on the Pareto front refers to the set of key players [7, 15, 46, 51, 60].

IV. ON SELECTION OF KEY PLAYER SETS

As discussed above, multi-objective optimization identifies multiple sets of key players which fall on the Pareto front. For example, when we use the objectives ‘the average Eigenvector centrality’ and ‘distance between the key players’, the Dolphin network and the Prisoners network have 56 and 44 points on the Pareto front, respectively. Since there are many solutions on the Pareto front, we provide a guideline so that the user can select one set of key players from the set of Pareto front solutions.

In devising the algorithm to select the best $k$-key player sets, we assume that the selected set of key players needs to perform well “collectively” (as a single unit). As an example, if a set of $k$ key players were picked to initiate a marketing campaign, these $k$ key players should perform well collectively to spread the intended news.

Let $A = \{a_1, a_2, \ldots, a_N\}$ be the complete set of qualities that can be used to identify a set of $k$ key players. The items in the set $A$ are measures such as the average Degree centrality of the key players, the average Eigenvector centrality of the key players, Borgatti’s KPP Positive measure, Borgatti’s KPP Negative measure, etc., as discussed in Section I. Let $M = \{m_1, m_2, \ldots, m_n\} \subseteq A$ be the set of qualities key players should have, as determined by the application of interest, such as the marketing campaign. The subset of measures used to identify key players in a political campaign could be different from the subset of measures picked to identify key players for a marketing campaign.

As discussed in Section III, two qualities are chosen from $M$ to draw the Pareto front, that is, to identify sets of non-dominated $k$-key players, and the remaining qualities in $M$ are used to reduce the number of candidate sets, as described below. Since we assume that the set of $k$ key nodes should “perform” well as a single unit, we represent the set of key nodes as a single super-node in the given network.
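The super-node construction can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the network is an undirected adjacency dictionary of sets, the function names and the label "v_new" are hypothetical, and edges between two key players are simply dropped so that the super-node is linked only to nodes outside the key player set.

```python
def contract_to_super_node(adj, key_players, super_node="v_new"):
    """Return a new adjacency dict with key_players merged into one node."""
    keys = set(key_players)
    # Neighbours of any key player that are not key players themselves.
    outside = {v for j in keys for v in adj[j]} - keys
    # Copy the network without the key players and their incident edges.
    new_adj = {u: {v for v in nbrs if v not in keys}
               for u, nbrs in adj.items() if u not in keys}
    # Add the super-node and connect it to every outside neighbour.
    new_adj[super_node] = set(outside)
    for v in outside:
        new_adj[v].add(super_node)
    return new_adj

def degree_centrality(adj, node):
    """Degree of one node, normalized by the maximum possible (n - 1)."""
    return len(adj[node]) / (len(adj) - 1)

# Hypothetical example: the path 0 - 1 - 2 - 3 - 4 with key players {1, 3}.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
merged = contract_to_super_node(path, [1, 3])
super_degree = degree_centrality(merged, "v_new")
```

Any node-level measure in M can then be evaluated on the super-node of the contracted network in the same way as the degree measure shown here; the original adjacency dictionary is left unchanged, so each candidate set can be contracted independently.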
Then, we evaluate the “performance” of this super-node in terms of the measures in $M$. Finally, we apply non-dominance to the resulting vectors of measures to find the desired set(s) of key players. Algorithm 1 describes the approach.

The input to Algorithm 1 consists of the set $M = \{m_1, m_2, \ldots, m_n\}$ of the $n$ qualities a set of key players should have, the collection $S$ of $k$-key player sets found on the Pareto front using two selected objectives from $M$, and the network $G(V, E)$. A super-node is constructed to represent each set of key players as a single node; the objectives (in $M$) of the super-node are evaluated, and the set(s) of non-dominated key players is selected.

Algorithm 1: Reducing the number of Key Player sets
Input: Sets of key players found on the Pareto front (S), network G = (V, E), M = {m_1, m_2, ..., m_n}
Output: Sets of key players (T, where |T| ≤ |S|)
  for all sets of key players s ∈ S do
    m_1(s) ← 0, m_2(s) ← 0, m_3(s) ← 0, ..., m_n(s) ← 0
  end for
  for all sets of key players s ∈ S do
    V' = V and E' = E
    N_s = {}
    for all nodes j ∈ s do
      N_j = {u | u ∈ V' and (j, u) ∈ E'}
      N_s = N_s ∪ N_j
      E' = E' \ {(j, v') ∈ E' | v' ∈ N_j}
      V' = V' \ {j}
    end for
    V' = V' ∪ {v_new}
    for all nodes j ∈ N_s do
      E' = E' ∪ {(v_new, j)}
    end for
    for i = 1 to n do
      m_i(s) ← m_i(v_new)
    end for
  end for
  T ← Find_non_dominated_sets(m_1, m_2, m_3, ..., m_n)
  return T

Application of this algorithm is illustrated for the Dolphin and Prisoners networks. The set $M$ consists of the measures mentioned in Table I, and each set of five selected key players is required to do well with respect to all four capabilities.

TABLE I: Performance Criteria and Measures for sets of Key Players

Performance Criteria                                           Measured By
Directly connected to as many nodes as possible                Degree centrality
Should be able to mediate communication between communities    Betweenness centrality
Should be able to communicate quickly with all the nodes       Closeness centrality
Should be connected to important nodes                         Eigenvector centrality

A fraction of the sets of key players suggested by the Pareto front is presented in Table II for the Dolphin network and in Table III for the Prisoners network. The set of players identified by Algorithm 1 are depicted in bold in both tables. The sets selected by Algorithm 1 are non-dominated with respect to each other, and the users can select any set depending on their requirements. Both examples illustrate that the algorithm significantly reduces the number of candidate key player sets (from 56 to 2 for the Dolphin network and from 44 to 2 for the Prisoners network).

TABLE II: Set of Key Players found for Dolphin Network from Pareto front and respective centrality values

Set ID   Set of Key Players                     DC     BC     CC     EC
1        Cross, Grin, Wave, Whitetip, Zig       0.28   0.30   0.50   0.36
2        Cross, SMN5, TSN83, Wave, Zig          0.12   0.17   0.41   0.11
3        Cross, SN63, Topless, Wave, Zig        0.39   0.37   0.54   0.45
4        Cross, Topless, TSN83, Wave, Zig       0.28   0.27   0.49   0.34
...      ...                                    ...    ...    ...    ...
22       Grin, Patchback, SN4, SN63, Topless    0.53   0.62   0.66   0.54
...      ...                                    ...    ...    ...    ...
30       Grin, SN4, SN63, Topless, Zig          0.54   0.59   0.67   0.54
...      ...                                    ...    ...    ...    ...

Note: DC, BC, CC and EC stand for Degree centrality, Betweenness centrality, Closeness centrality and Eigenvector centrality, respectively.

TABLE III: Set of Key Players found for Prisoners Network from Pareto front and respective centrality values

Set ID   Set of Key Players     DC     BC     CC     EC
1        0, 10, 12, 18, 34      0.13   0.11   0.39   0.14
2        0, 10, 13, 18, 34      0.13   0.12   0.38   0.09
3        0, 10, 13, 18, 51      0.26   0.31   0.49   0.44
4        0, 10, 18, 21, 37      0.13   0.10   0.36   0.13
...      ...                    ...    ...    ...    ...
31       7, 36, 40, 51, 54      0.47   0.70   0.60   0.59
32       7, 36, 40, 51, 55      0.48   0.66   0.60   0.59
...      ...                    ...    ...    ...    ...

Note: DC, BC, CC and EC stand for Degree centrality, Betweenness centrality, Closeness centrality and Eigenvector centrality, respectively.

When all the objectives in $M$ are considered in a single step to identify the sets of key players using multi-objective optimization, the number of non-dominated solutions identified is very high. As an example, when all the objectives were considered in a single step, the NSGA-II based genetic algorithm identified 549 sets of non-dominated key players for the Dolphin network and 249 sets of non-dominated key players for the Prisoners network. Recall that the two-step process described above identified only 2 sets of key players for each network. Since users would be more interested in identifying a small number of sets of key players for their applications, the two-step process is more useful.

It can be the case that none of the key player sets are non-dominated with respect to all the properties in the subset of $M$ that we desire. In this scenario, we can prioritize the properties in the set $M$ and select the sets of key players that are non-dominated for the properties of high priority.

V. EVALUATION OF KEY PLAYER SETS

This section evaluates the credibility of the key player sets identified by the multi-objective approach discussed in Sections III and IV. Using counter cascades to limit the propagation of gossip has been studied in recent research. Budak et al. [21] proposed the Multi-Campaign Independent Cascade Model (MCICM) to capture the dynamics of competing information cascades. In this model, the first campaign is considered to be the gossip, and the second campaign (known as the limiting campaign) starts after some delay ($r$) and tries to send “good” information to the network. Each node has two different probabilities of passing “good” information and “bad” information to each of its neighbors. The problem addressed here is, given a budget $b$, to select a set of nodes $A_L$ as seeds for initial activation for the limiting campaign $L$, such that the number of nodes that adopt campaign $L$ when the model stabilizes, $\pi(A_L)$, is maximized. This problem is also known as the Eventual Influence Limitation (EIL) problem.

As proved in [21], EIL is an NP-hard problem, so finding the optimal set $A_L$ in polynomial time is not expected to be possible. Different algorithms for selecting key players in social networks have been used to select the set $A_L$, and the results showed that in many cases the performance of heuristics such as degree centrality in solving EIL is comparable to that of computationally costly algorithms. We have evaluated the performance of the different key player identification algorithms discussed in Section I, and of the multi-objective approach discussed earlier, with regard to the EIL problem. A set of 5 key players ($b = 5$) selected by each key player identification algorithm was used as the seed for initial activation for $L$, and the $\pi(A_L)$ values generated by each set of key players were compared against each other. The sum of Degree centrality and the sum of Betweenness centrality were taken as the two objectives for the multi-objective key player identification method (discussed in Sections III and IV).

Figure 4 shows the variation of $\pi(A_L)$ with the delay $r$ for the different key player identification algorithms on the Dolphin network. Since the aim is to “save” as many nodes as possible from receiving the gossip, the idea is to achieve high $\pi(A_L)$ values. As the delay increases, the number of nodes recruited by the gossip campaign grows, so, as expected, the $\pi(A_L)$ of each key player set decreases. According to Figure 4, the $\pi(A_L)$ value of the key player set identified by the multi-objective approach is higher than that of all the other algorithms compared. Figure 5 shows the same plot for the Prisoners network, and this figure also shows that the key player set identified by the multi-objective approach achieves a high $\pi(A_L)$ compared to alternative key player identification methods.

Fig. 4: Dolphin Network - Delay vs Number of nodes recruited by the Limiting Campaign
(x-axis: Delay; y-axis: Number of nodes recruited by the Limiting Campaign; series: Deg_Btwn Multi Obj, Degree, Betweenness, Eigenvector, PageRank, PC Centrality, k-Shell, Borgatti KPP_Neg, Borgatti KPP_Pos, Info Theory KPP_Neg, Info Theory KPP_Pos)

VI. CONCLUSION

Although many algorithms have been proposed to identify a set of key players in social networks, each of those approaches focuses on a single objective of interest. But for many applications, a set of key players which can perform well with respect to multiple objectives of interest is needed. The main focus of this paper is to propose a new perspective for key player identification, based on optimizing multiple objectives of interest.

We have discussed the deficiencies of single objective key player identification approaches and, to solve such problems, proposed the multi-objective optimization approach for the key player problem. The idea here is to identify sets of key players who can be useful with respect to multiple objectives of interest. An NSGA-II based genetic algorithm was used to identify sets of key players for the multi-objective case. To reduce the number of key player sets identified, we propose an algorithm which uses general network centrality measures to capture the performance of each set and find non-dominated sets of key players. The main hypothesis here is that the set of key players should perform well as a unit.

According to the aforementioned results, our algorithm succeeded in finding multiple sets of non-dominated solutions, and the sets of key players identified by this method are better than those of recent key player identification methods when multiple objectives must be addressed. This method can also be used as a tool to evaluate the performance of previously proposed key player identification algorithms. We were able to reduce the possible solution space by more than 94% for both test cases using the algorithm discussed in Section IV. The results in solving the EIL problem suggest that the multi-objective approach outperforms previously proposed methods for identifying sets of key players in social networks.

REFERENCES

[1] S. P. Borgatti and M. G. Everett, “A graph-theoretic perspective on centrality,” Social networks, vol. 28, no. 4, pp. 466–484, 2006.
[2] P. V. Marsden, “Egocentric and sociocentric measures of network centrality,” Social networks, vol. 24, no. 4, pp. 407–422, 2002.
[3] L. C. Freeman, “Centrality in social networks conceptual clarification,” Social networks, vol. 1, no. 3, pp. 215–239, 1979.
[4] M. Barthelemy, “Betweenness centrality in large complex networks,” The European Physical Journal B-Condensed Matter and Complex Systems, vol. 38, no. 2, pp. 163–168, 2004.
[5] K. Okamoto, W. Chen, and X.-Y. Li, “Ranking of closeness centrality for large-scale social networks,” in Frontiers in Algorithmics, pp. 186–195, Springer, 2008.
[6] P. Bonacich, “Factoring and weighting approaches to status scores and clique identification,” Journal of Mathematical Sociology, vol. 2, no. 1, pp. 113–120, 1972.
[7] L. Page, S. Brin, R. Motwani, and T. Winograd, “The pagerank citation ranking: bringing order to the web,” 1999.
[8] P. Bonacich, “Power and centrality: A family of measures,” American journal of sociology, pp. 1170–1182, 1987.
[9] S. P. Borgatti, “Identifying sets of key players in a social network,” Computational & Mathematical Organization Theory, vol. 12, no. 1, pp. 21–34, 2006.
[10] D. Ortiz-Arroyo and D. A. Hussain, “An information theory approach to identify sets of key players,” in Intelligence and Security Informatics, pp. 15–26, Springer, 2008.
[11] M. Kitsak, L. K. Gallos, S. Havlin, F. Liljeros, L. Muchnik, H. E. Stanley, and H. A. Makse, “Identification of influential spreaders in complex networks,” Nature Physics, vol. 6, no. 11, pp. 888–893, 2010.
[12] M. U. Ilyas and H. Radha, “Identifying influential nodes in online social networks using principal component centrality,” in Communications (ICC), 2011 IEEE International Conference on, pp. 1–5, IEEE, 2011.
[13] R. Janssen and H. Monsuur, “Identifying stable network structures and sets of key players using a w-covering perspective,” Mathematical Social Sciences, vol. 66, no. 3, pp. 245–253, 2013.
[14] S. Huang, H. Cui, and Y. Ding, “Evaluation of node importance in complex networks,” arXiv preprint arXiv:1402.5743, 2014.
[15] F. Probst, D.-K. L. Grosswiele, and D.-K. R. Pfleger, “Who will lead and who will follow: Identifying influential users in online social networks,” Business & Information Systems Engineering, vol. 5, no. 3, pp. 179–193, 2013.
[16] M. S. Granovetter, “The strength of weak ties,” American journal of sociology, pp. 1360–1380, 1973.
[17] D. Lusseau and M. E. Newman, “Identifying the role that animals play in their social networks,” Proceedings of the Royal Society of London. Series B: Biological Sciences, vol. 271, no. Suppl 6, pp. S477–S481, 2004.
[18] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, “Fast unfolding of communities in large networks,” Journal of Statistical Mechanics: Theory and Experiment, vol. 2008, no. 10, p. P10008, 2008.
[19] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: NSGA-II,” Evolutionary Computation, IEEE Transactions on, vol. 6, no. 2, pp. 182–197, 2002.
[20] D. MacRae, “Direct factor analysis of sociometric data,” Sociometry, pp. 360–371, 1960.
[21] C. Budak, D. Agrawal, and A. El Abbadi, “Limiting the spread of misinformation in social networks,” in Proceedings of the 20th international conference on World wide web, pp. 665–674, ACM, 2011.

Fig. 5: Prison Network - Delay vs Number of nodes recruited by the Limiting Campaign
(x-axis: Delay; y-axis: Number of nodes recruited by the Limiting Campaign; series: Deg_Btwn Multi Obj, Degree, Betweenness, Eigenvector, PageRank, PC Centrality, k-Shell, Borgatti KPP_Neg, Borgatti KPP_Pos, Info Theory KPP_Neg, Info Theory KPP_Pos)

(a) Dolphin Network, Objectives - Average Eigenvector centrality and total distance between key players

(b) Prisoners Network, Objectives - Average Eigenvector centrality and total distance between key players

Fig. 2: Pareto Fronts : Objectives - Average Eigenvector centrality and Total distance between key players

(a) Dolphin Network, Objectives - Borgatti’s KPP POS and Borgatti’s KPP NEG

(b) Prisoners Network, Objectives - Borgatti’s KPP POS and Borgatti’s KPP NEG

Fig. 3: Pareto Fronts : Objectives - Borgatti’s KPP positive and negative
