SUBMITTED TO IEEE
1
Decentralized algorithms for evaluating centrality in complex networks Katharina A. Lehmann, Michael Kaufmann
Abstract— Centrality indices are often used to analyze the functionality of nodes in a communication network. Up to date most analyses are done on static networks where some entity has global knowledge of the networks properties. To expand the scope of these analyzing methods to decentral networks we will propose here a general framework for decentral algorithms that calculate centrality indices. We will describe variants of the general algorithm to calculate four different centralities, with emphasis on the algorithm of the betweenness centrality. The betweenness centrality is the most complex measure and best suited for describing network communication based on shortest paths and predicting the congestion sensitivity of a network. The communication complexity of this latter algorithm is asymptotically optimal and the time complexity scales with the diameter of the network. The calculated centrality index can be used to adapt the communication network to given constraints and changing demands such that the relevant properties like the diameter of the network or uniform distribution of energy consumption is optimized. Index Terms— Centrality Indices, distributed algorithm, Selfdiagnostic networks.
I. I NTRODUCTION INCE the first days of telecommunications the world has changed dramatically. After a long period where a letter would take days to be delivered the invention of cable and telephone made instant communication possible and achievable for many people. To grant this service a widespread network of cables had to be built that is being remodeled periodically. However, due to the tremendous costs it had never been possible to rebuild the whole hardwired system despite the fact that the demands on communication have changed drastically in the last decades. In contrast to this wireless communication networks and fast changing peer-to-peer networks give the ability to choose the appropriate network architecture. Power efficient and stable networks are accomplishable if this architecture is dynamically changed, according to the demands of the user. But up to date most protocols for peer-to-peer-networks just randomly choose the set of peers with which data is shared directly. Due to this, the resulting communication network has random features that hamper e.g. efficient searching algorithms. A first attempt to overcome this random character of the network has been proposed by Ng and Sia [3]: every
S
This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. The authors are with Parallel Computing, Wilhelm-Schickard-Institut f¨ur Informatik, Eberhard-Karls-Universit¨at T¨ubingen, Sand 14, 72076 T¨ubingen, Germany. Email: lehmannk/
[email protected]
participant of the peer-to-peer network holds a list of the kind of data packages he is interested in. A new participant chooses some entry point and then walks randomly through the existing network to find persons with similar interests. Then the participant chooses some of them for building up so called quality links and additionally some random links to guarantee the connectivity of the whole network. By this procedure every participant will be clustered into a subnet of persons with similar interests. A fast searching algorithm is now able to use the long random links in areas that are not likely to contain the required data and to switch to a broadcasting protocol in subclusters of persons with similar interests to the query. The appropriate network architecture will also be extremely important in so called wireless multi-hop ad hoc communication networks. In classic mobile networks communication packets are routed over a centralized backbone system of basestations while in multi-hop networks the mobile devices have the routing functionalities. Each device is able to communicate directly with a part of the adjacent neighbors and provides routing services for communication between non-adjacent devices. There are two strict constraints that have to be considered: One is the limitation of resources such as energy or bandwidth, the second is the lack of any global information. To generate stable and efficient network architectures special algorithms have to be developed that are working decentrally and allow self-organization of the system. In [4] the following local organisation rule is proposed: Every new node will send a so called hello messages to its direct neighbors within a specified transmission radius to get connected to them. If the number of neighbors is too small within this radius it enhances its transmission power until a given minimal number k of neighbors is achieved. Every node receiving a hello message from a new node is forced to answer with a hello-reply message even if it already has the minimal number of neighbors. We will refer to this model in the following as k-next-neighbors model. The resulting network will be connected for reasonable high with high probability and most devices will not have many more than k neighbors. Both introduced decentral approaches generate quite functional networks but the main problem is that they tend to disfavour some nodes in a severe manner. In the first model commonly known entry points will have a higher probability of gaining too many links and in the k-nearest-neighbors model the special geometry of the network is disregarded. Since the network results in a grid like architecture routing on approximately shortest paths will affect inner nodes much more than nodes on the border of the network. That is, energy
0000–0000/00$00.00 c 2002 IEEE
SUBMITTED TO IEEE
consumption is not uniformly distributed but allows higher demands in some regions while relieving others unproportionally (s. Fig. 6(a)). To overcome this situation the evaluation of the current network architecture is needed, followed by some appropriate adaption if necessary. A common way to determine the ability of a network to provide efficient communication is the calculation of centrality indices, first introduced in the social network analysis [5], [6]. One of the most prominent of such indices is the betweenness-centrality index that has become an important analysis tool in fields as diverse as epidemiology, design of router networks, load of communication networks and co-citation networks [7], [8], [9], [10], [11]. It measures which proportion of communications on shortest paths would be afflicted by a removal of a given node in the graph and was first proposed by Freeman [2]. It has been shown that this static feature of a network has a non-trivial inter-dependency with the load of the communicating nodes [10]. As a static feature of the network it gives a good approximation of the network’s sensitivity to congestion. As such it can be used for the routing protocols to avoid paths with high congestion probability or to reconstruct the connections from individual nodes to optimize the network structure. The improvement of communication networks by evaluating the centrality and appropriate adaptation has been first proposed in [9] but the author did not provide any decentral algorithm with which the nodes could do the evaluation themself. We want to propose here a general framework that can be used for the calculation of different centrality indices that are based on shortest paths and distances in the network. The emphasis of this paper is on the calculation of the betweenness-centrality index which gives most information on the structure of the network. We will show that the proposed algorithm for this latter centrality index is asymptotically optimal in its communication complexity and that its time complexity scales with the diameter of the network. We will further introduce the concept of evolving networks. By a periodically evaluating and subsequently adaptation the network is globally optimized. We will provide a proof of concept that illustrates the flexibility of evolving networks. The model is applicable to many networks first of all wireless multi-hop ad-hoc communication networks, but also router networks, peer-to-peer networks, upcoming e-government systems, and even biological networks. The remainder of the paper is organized as follows: in Section 2 we define the setting of a wireless multi-hop system and describe a general framework for all centrality calculating algorithms. Section 3 gives the details of the algorithms for four classic centrality indices. We will further analyze their respective complexity and the decentrality, scalability and reliability of the algorithms. In Section 4 we will discuss how the evaluation of centrality can be used to optimize the global network architecture.
2
II. G ENERAL F RAMEWORK A. The Model: Definitions and Settings 1) General terms: A wireless multi-hop ad hoc communication network consists of a set of communication devices that are connected by communication protocols that enable direct communication between devices within a given distance. In the following we will describe such a network in the following terms: Let be a graph with undirected edges, without multi-edges. Every node represents one device, every edge represents a direct communication between nodes and . The number n of nodes is defined as the cardinality of , the number m of edges is defined as the cardinality of . A node is a neighbor of node if both are directly communicating with each other. We assume that every node maintains a list of its neighbors and that the graph is always connected. The degree of node is defined as the number of its neighbors. A path from a source node to a target node is defined as a set of nodes such that and all are directly connected. The hop length of a path denotes the number of edges on it, that is . The distance between two nodes and is defined as the minimum hop length of any path between and . Every path with minimum hop distance length will be called shortest path between and . A node is called predecessor of node with respect to some source node if and are neighbors and there is at least one shortest path between and running over . Vice versa, is called successor of node . Neighbors that are neither predecessors nor successors with respect to a given source node are called neutral. denotes the set of all predecessors of node with respect to source node . The number of shortest paths between and will be denoted , with . The number of shortest paths between by and that run over node , that is , is denoted by . The probability of lying on a randomly . picked shortest path from to is given by The diameter of the graph is defined as the length of the longest shortest path between any two nodes of the graph. The transmission radius of a node is defined as the furthest geographical distance to any of its neighbors. Every node has a unique identification string . 2) Synchronization: We assume that the system can be synchronized. The basic idea is that each communication device has a maximal possible transmission radius . Let be the time needed for any message to travel . Such, any message sent at time will be received by all neighbors at the latest at time . We further assume that any computation at any node can be done within a fixed time span . This assumption is based on the fact that the number of messages received at one time unit is restricted by the number of nodes in the system. In summary, we assume that the receiving and computation of all current informations in the system is finished after . With the introduction of a global some time span
' 1&, 23# "6/7# ' #
!" # $% '&( *)",+-+.+- "/10 *4 *4-5&3
67!" # 8 :9JC @ K LNMPO@M%QOR.Q SUT VXW # Y
Z1V[
\ ] \ #9 ] # ]_^ ] 9 ]^
SUBMITTED TO IEEE
3
#
clock it is now possible to synchronize the system. At a given starting time all nodes will send information to all neighbors. Every node will wait time interval for messages from its neighbors and after that start computation on the received messages. The newly generated informations will be to be sure that every sent to all neighbors in time node has received all relevant messages and has finished every necessary computation. To simplify things the notion ’received (computed) in time ’ will be a short cut for ’received (computed) in the time interval from time to time ’. 3) Analysis of algorithms: The communication complexity of an algorithm is defined as the number of elementary messages that are sent throughout the algorithm (adapted from [15]). The time complexity of an algorithm is defined as the number of time units required until the algorithm terminates [15]. The package complexity of an algorithm is defined as the number of packages that have to be sent throughout the algorithm. A package is defined as some set of elemantary messages that can be sent together in one message.
]
# 9]9]^
#4 # 9 9JC
The Betweenness Centrality is the most complex measure. It sums over all probabilities . Such, it measures the expected routing service demands of node if every node was communicating with every other simultaneously. This can be interpreted as an approximative congestion sensitivity of . C. Two phase model The general framework for algorithms calculating these centralities consists of two phases. In the first phase, called Count Phase, the distance or number of shortest paths is calculated, respectively. In the second phase, called Report Phase, the according value is reported back to all other nodes. We will first describe the two phases and then show that both phases can be interlaced with each other. The general framework is sketched in Fig. II-C. In the following, messages that carry new and relevant data will be called relevant messages. A message that contains data a node already has received earlier is called a backfiring message. 1) Count Phase: To describe the calculation of the number of shortest paths between any source node and target node we first have to state the following Lemmata. Lemma 1: A node is on the shortest path from a source node to a target node if and only if . This follows directly from the definitions of shortest paths and distance. Lemma 2: The number of shortest paths from to is the sum of shortest paths from to all predecessors of :
#
# '# 8 B !' 9 #
A>JC?
Graph Centrality [13]
(2)
Stress Centrality [14]
> A >JC
Q> C
(3)
Betweenness Centrality [2]
> >JC @ K
Q> C
(4)
! "
Centrality indices are designed such that the highest value indicates the most central node. An example graph with these centrality indices can be seen in Fig. 1 The interpretation of the meaning of this centrality indices is based on two assumptions. First, most routing protocols try to establish communications on shortest paths. Second, the hop distance is a (scaled) approximation for the real length of the path between two communicating nodes. Under this assumptions, the Closeness Centrality gives an indication how long it would take the node to communicate in a serial manner to all other nodes in the network. The Graph Centrality indicates how long the parallel communication to all other nodes would take at most. The Stress Centrality is measuring on how many shortest paths a node lies.
# %$
COUR T AB> #
A >JC
#
##
(5)
Lemma 2 follows from Lemma 1. We use the synchronization of the network as assumed above. All nodes start with the counting at the same global time . At this starting time an initial number of shortest paths (NOSP) message is generated and sent to all neighbors. Every NOSP message contains the of the source node and an integer number that denotes the number of shortest paths from to the currently sending node. This number will be called the weight of the message. The weight of the initial message will be ’one’ since the number of shortest paths from a node to itself is defined to be one. Based on Lemma 2, every node will sum the weight of all messages from direct predecessors with respect to . This sum is the number of shortest paths from to and will be stored such that it can be retrieved by the of the source node. This newly computed number of shortest paths will be broadcasted with to all its neighbors in the next time unit. This includes also the broadcasting to the predecessors. Thereby, the message will be a backfiring message for some of the neighbors. To decide whether a message came from a predecessor and therefore is relevant or a backfiring message every node checks for all messages whether the already has an entry in their number of shortest paths table. If so, it ignores the according message and will not broadcast it further. By definition of the synchronization it is clear that in time all nodes with distance will receive their first message
Z1V[P
Z'V P
A >S
Z'V[!P
#
Z1V[P
#/
SUBMITTED TO IEEE
4
Graph Centrality
Closeness Centrality
CG(a)=1/2
CC(a)=1/6
a
a CC(d)=1/5
CC(b)=1/7
b
e
d
CC(c)=1/6
CG(b)=1/3
CBC(e)=1/8
b
CG(d)=1/2
CG(c)=1/2
c
e
d
c
Stress Centrality
Betweenness Centrality
CS(a)=4
CB(b)=2
a
b CS(d)=10
CS(b)=2
b
Fig. 1.
CG(e)=1/3
CS(e)=0
CB(a)=1
e
a
d CS(c)=4
CB(c)=2
c
c
CB(d)=7
CB(e)=0
d
e
The figure shows the four different centrality indices that are based on shortest paths and therefore are important for communication networks
)
,1 (a
(a ,1
)
t0
Decentralized Centrality Calculation a e
t1 (a ,b
e t2
)
)/2 )/2
(a ,d
)/1
a
(a ,c
)
,2 (a
(a ,2
c
t3
)
(a,2)
e
(a,e)/1
e
)/2
d
,c (a
d c
c
b
a b
e
a
(a ,1
)
,1 (a
t2
(a,1)
(a,d)/1
b
(a ,c
(a ,1
d
)/1
)
,1 (a
b
d
,b (a
)
b
a
c
)/1
c t1
a
,d (a
)/1
d
(a (a ,c) ,e /2 )/1
b
d
e
)/2 ,c /1 (a ,e) (a
)/2
,c (a
C
(a,c)/2 (ae)/1
Count Phase Report Phase Fig. 2. The figure shows the general algorithm for calculating centralities in a decentralized manner at the example of calculating the number of shortest paths from node to all other nodes. On the left side of the figure the Count Phase is shown, on the right the Report Phase. Red messages indicate backfiring node is sending its NOSP message to its neighbors and . broadcasts it to (ignored) messages that will be ignored by the receiving node. In time and in . Additionally sends a report message to and because it knows its number of shortest paths to . This report message will be ignored by since it is not yet in the Report Phase regarding source node . Also node will broadcast the NOSP message to (ignored), and . Additionally, it sends the report message to all its neighbors. In time every node knows its number of shortest paths to and nodes and start reporting it back. The last report message will reach node at time and the algorithms terminates.
SUBMITTED TO IEEE
5
from source node and are able to calculate their number of shortest paths to . With this observation the hop distance between any two nodes can be determined by storing the time at which the first message with has been received. It follows that no weight is needed in a NOSP message if only the distance is calculated. It is important to note that the NOSP messages from different source nodes do not interfere with each other. Thus, all number of shortest paths and distances can be calculated simultaneously. Moreover, every node can collect all relevant NOSP messages and send them to all its neighbors in packages. The elementary NOSP messages might then be coded as pairs of ,weight). 2) Report Phase: As observed above every node with distance to any other source node will know its distance and number of shortest paths in time . To report the according value back every node will create a report message with an -pair and as a weight the distance or number of shortest paths , respectively. This report message is sent to all neighbors of in time . A node receiving a report message in time will perform the appropriate computation to calculate its centrality and forward the message without changing it to all neighbors in time . To avoid the forwarding of backfiring messages each node has to build up an array of size . In this is stored whether the report message with a certain -pair has been received already. Every distance or number of shortest paths is uniquely identified by the source and target node and will only be reported once. It follows, that every relevant report message will have a different -pair. Since all pairs of nodes will report some value back an array of size is necessary and sufficient. The detailed computations for the four different centralities introduced above will be described in more detail in the next section. 3) Interlacing of the Phases: All NOSP messages are independent from each other as are the report messages from each other. Of course, a report message can only be sent if the according distance or number of shortest paths is already calculated. But beside this dependency a node can be in different phases with respect to different source nodes. That is, a node may at the same time be waiting for some NOSP message from some node , sending the report message for some node and having finished the Report Phase for another node two time units ago. 4) Analysis: We will first analyze the time complexity of the general algorithm and then determine the communication and package complexity. Let denote the furthest distance of any node to . Node will receive its last NOSP message at time . It will send its last report message at time and it will receive its last report message at time . It is important to observe that until this time the node has received at least one report message every two time units. The reason for this is that there is some node in every distance from to and that the time until the according report message gets back to is twice the distance. With this observation, every node is
#4
Z1V[P
@Z1V[!%
B!" # Z1V
@Z1V[# 2 Z1V[P #
# 4-5&
#/
A >JC #
# 4 # /U5I&
) 1Z V
Z'V
V[
#
)
P^ ^ P^
# # R.SUT # 5 & R . U S T # ) .R SUT 5 & ;
able to determine the end of the calculation and afterwards react to the calculated value. In a global view the algorithm will stop after time units. That is, the time complexity T of any algorithm in this . framework is It is very important to state that the diameter of a graph is more depending on the geographical area that the networks spans than on the number of nodes in it. We assume that every node has an average transmission radius that depends very little on the number of nodes in the system. In this case, the expected diameter of the graph will be the diameter of the geographical area divided by the average transmission radius. Such, the time complexity T is not scaling with the number of nodes but approximately scaling with the root of the spanned geographical area. It is obvious that the time complexity cannot further be diminished. To calculate the communication complexity we make the following observation. In the Count Phase every node sends one NOSP message over every edge. In the Report Phase every node sends report messages back to the source.
Thus, the communication complexity is . This is optimal since every node needs to know the distance or number of shortest paths to every other node to calculate the centrality index it is interested in. To calculate the package complexity we observe that one node cannot receive and send more than different elementary messages in one time unit. This implies that the number of packages sent over one edge per time unit is bounded by a constant. The package complexity P is therefore given by
.
VW9;
V W
VXW
;
)
)9
@V W
III. D ECENTRALIZED C ALCULATION
OF
C ENTRALITY
In the following we will sketch the algorithms for the four different centralities and analyse their respective complexity. In the following, a report message that contributes to the centrality index of a node is called a contributing message.
A. Closeness Centrality The distance between all nodes is calculated in the Count Phase as described above. A report message is contributing to the Closeness Centrality of node if it contains its . Every node will compute its Closeness Centrality by adding up all weights from contributing messages. Contributing message will not be broadcasted since no other node beside is interested in the weight of the message. After having received the last contributing message the Closeness Centrality is determined by taking the inverse of the sum.
Z'V
B. Graph Centrality
The distance between all nodes is calculated in the Count Phase as described above. Every node will store the currently maximal weight of all received contributing messages. After it has received its last contributing message the stored value is inversed and hence the Graph Centrality of the node.
SUBMITTED TO IEEE
6
C. Stress Centrality To calculate Stress Centrality distance and number of shortest paths have to be computed in the Count Phase. The report messages hold as weight the distance from pair . Before describing the Stress Centrality algorithm a further lemma is needed. Lemma 3: The number of shortest path from a source node to a target node running over is given by . Proof: The proof is given by an inductional argument. For a direct successor Lemma 3 and Lemma 2 coincide regarding that the number of shortest paths between and is one if and are neighbors. Assume that the lemma is true for all nodes up to distance from . Let be a node in distance . Accordingly, the number of shortest path from any source node to is given by
Z1V
@Z1V[P 2 Z1V[#
A >JC @ K A > S A S C
#
A >JC
#
#
9;
#
AB>JC2@ K
#
CUO R T AB> @ K
COUR T A > S A S A> S C A S OR T A S C O R C T A S # Z'V @ Z1V[P U Z'V #
#
# %$
# %$
(6)
#
#
(8)
Equ. (6) is given by lemma 1, equ.(7) uses the assumption. # is by lemma 1 the number of # %$ Regarding that shortest paths from to lemma 3 is proved. To calculate its Stress Centrality a node has to decide whether a given report message with -pair is contributing to its centrality or not. According to lemma 1 this can be easily determined by adding up the distances of to and to and compare it to the weight of the message. If the weight equals the sum the message is contributing otherwise the message will just be broadcasted to all neighbors. Following lemma 3, a node will calculate for all contributing messages with -pair the product of and sum them up. This is possible because the number of shortest paths is symmetric. That is, and these values are stored in the NOPS table of . After a node has received the last report message the correct value of its Stress Centrality is obtained.
#
A >S A SC
@Z1V[P U Z'V #
A > S A S >
D. Betweenness Centrality The Betweenness Centrality is the most interesting centrality index for a communication network. It is superior to the Stress Centrality because it indicates the congestion sensitivity of a node where the Stress Centrality just absolutely counts the number of shortest paths the node lies on. The algorithm is nearly the same as the one for Stress Centrality but it needs additionally the report of the number of shortest paths between any node pair (referred to as NOSP weight). The definition of contributing is the same as in Stress Centrality algorithm. For every contributing report message with -pair it will calculate the and divide it by the NOSP appropriate product of weight of the message. The sum of all these fractions gives the Betweenness Centrality. Fig. 3 and Fig. 4 describes the pseudocode for this algorithm.
'#
Z1V
A @Z1> S V[AP US C Z'V #
Fig. 3. This pseudocode describes the Count Phase of a Betweenness Centrality calculating algorithm.
(7)
# %$
Z1V
Count Phase for calculating Betweenness Centrality at node v initialize first message with ownID and initial numberOfShortestPaths 1 new message( , 1); send message to all neighbors in time while any message has been received in time do forall messages do if if of received message has been stored already in time do ignore else do
store and sum numberOfShortestPaths over all messages with same
store time and numberOfShortestPaths together with
new message( , numberOfShortestPaths)
add new message to listOfMessagesToSent od od sent listOfMessagesToSent to all neighbors at time od
Report Phase for calculating Betweenness Centrality at node v forall between v and s known at time generate report message with ( , ) and weight send report messages to all neighbors in time while any report message has been received in time do forall messages do if ID-pair ( , ) has already been registered do ignore else do
multiplicate with , divide by weight of message and add value to betweenness-centrality memory cell
add report message to listOfMessagesToSent od od sent listOfMessagesToSent to all neighbors at time od Fig. 4.
Pseudocode for the preparing and handling of report messages
E. Decentrality, Scalability and Reliability In the following we want to discuss the decentrality, scalability and reliability features of the general framework. 1) Decentrality It can be easily seen by the above given pseudocodes that all operations only require the list of neighbors which with direct communication is possible. Since these lists can be provided by algorithms with local decision rules like in Glauche et al. [4], the algorithm is totally decentral. 2) Scalability As mentioned above the proposed algorithm scales in its time complexity with the diameter of the graph. As outlined before the diameter of a graph is normally depending on the area the network spans and not as much on the number of nodes in this area. It follows that the time needed for calculation might be constant if the area is restricted. The demands on memory scale with . This can be reduced to a linear dependency without enhancing the communication complexity by more than
)
SUBMITTED TO IEEE
7
a factor. 1 3) Reliability under failure Centrality indices are measures for static graphs. If the computation time is so long that a big part of the network has meanwhile changed the results will not be very meaningful. However, we want to show here that the algorithms always terminates and will give meaningful results if the change of the network has not been too dramatically. The termination of any of the algorithms is given by the fact that every node will forward every NOSP and report message just once. The quality of centrality indices after some change of the network is depending on the architecture of the network. The architecture of a wireless ad-hoc network is grid like due to the restricted transmission radius. By this the number of shortest paths between any nodes is very high and the failure of some portion of the graph likely will not change the distance between them. This implies that the calculated values for Closeness Centrality and the Graph Centrality of the remaining nodes will resemble the values in the changed network. This result can be further improved if failing nodes are able to broadcast a failure notice with all known distances to other nodes. With this every other node is able to update its centrality index by simply subtracting the according values. The situation is more complicated with Stress and Betweenness Centrality. With every failure the number of shortest paths between any node pair is changing in a complex way. This makes any updating of the current calculated values very complicated. Still, all remaining nodes will calculate the correct centrality with respect to the moment were the Count Phase began. It is reasonable to assume that the Betweenness Centrality depends mainly on the geographical position a node has in the spanned network area. If it is on a border of that area the likeliness is high that the index is low. If it is an inner node the index will be high. With this observation it follows that the absolute centrality index may vary due to minor changes in the network but still the calculated value will reflect the position of a node in the network and an adaptation according to the current value is reasonable even in cases of failure. IV. D ISCUSSION We have shown in this paper that the decentral calculation of diverse centrality indices is possible. The general framework has been shown to work optimally with respect to all analyzed complexity measurements. We want to discuss here how the proposed algorithms give rise to optimization of the network. We introduce here a model of evolution of the networks that consists of evaluating and adaptation phases. We assume that some lower level protocol will generate the network architecture whenever a node is lost or a new node is attached to the network. This architecture might not be optimal. As described above the 1 We
will be happy to give the details of this improvement on request.
algorithms for a peer-to-peer network in [3] and wireless multi-hop ad hoc networks in [4] disfavour some of the nodes. The latter will produce a mesh or grid like architecture where inner nodes are more probable to route a communication than are nodes on the border of the network. This can be seen in Fig. 6(a). A periodically evaluation of for example the Betweenness Centrality will reveal these differences. With an appropriate local rule each node decides whether and how it will change its list of neighbors. This gives rise to an emergent behaviour that optimizes the global network architecture. The evolution of a network as a cyclic process of evaluation and adaptation is sketched in Fig. 5. As a proof
Evolution of a Network Loss & addition of nodes General Architecture Protocol Adjusts network by local rules Appropriate Adaptation, e.g. re-attachment
generates networks dynamically
Evaluation of Centrality Reveals bad substructures
Fig. 5. The diagram shows an evolving network in which a basic architecture algorithm generates networks. The network is changing whenever nodes are lost or added to it. All nodes periodically evaluate one or more centrality indices. After evaluation they adapt their direct neighborhood according to some local rule. With an appropriate adaptation rule nodes with an too high load will be relieved, the diameter of the network decreased and the communication stabilized.
of concept we made some simulations on a network with 1000 nodes. The nodes were identically indepently distributed in a (10,000 x 10,000) 2D-area and connected according to the k-next-neighbors rule with 10 nearest neighbors. One example can be seen in Fig. 6(a). We conducted 10 evaluation steps with the following adaption rule. Adaptation rule If the Betweenness Centrality of a node is smaller than a given minimum it connects to the next furthest node. Thus it increases its transmission radius. If the Betweenness Centrality of a node is higher than a given maximum it will remove the connection to the neighbor with the highest centrality index. The minimum threshold was set to 1000, the maximum to 50 000.
SUBMITTED TO IEEE
8
747
747
285
947
285
947 632
398
632
398
190
880
190
880
213 647
213
520
385
647
520
385
206
206
712
712
600
600 360
360 590
590
904 706
904 706
417
773
417
773
649
649 442
442
727
519
589
727
519
589
777
777
35
35
350
350 329
599
329
599
699 910
699 910
548
548
670
670
735
735
953
953
454
454 479
479
352
352 740
644
740
644
951
951 915
915 81
81
507
708
507
708
897
897
253
253 508
508 898
898
793
793 125
125
478
680
478
99
646
588
489
680
787
99
646
588
489
787
41
41
682
682 909
909
990 985
990 985
689
324
529
689
324
529 950
950 698
698 355
48
759
977
434 991
637
822
822
139 799
991
637
659
143
359
77
143
359
799
77
686
686
539
307
539
307
309
542
759
977
434 659
139
355
48
309
542 968
625
968
625
713
713 198
198
55 988
55 988
452
452
287
287 938 89
938 89
811
221
811
221
214 34
620
251
403
214 34
620
231
251
403
231
72 924
956
123
72
742
924
937
956
123
742
937 751
751 376
376 247
247
936
936
273 332
455
702 467 192
467 192
226
806 687
124
254
279 226
624
806 687
492 153
629
157
675
183
300 158
304
421
279
163
332
455
702 157
675 254
492 669
273
124
183
300 158
304
421
153
624
629
163
669
111
111 238 115
238 115
622
318
622
318
284
284
863
831
82
558
863
831
82
558
846
846
176
176
78
78 996
100
272
278
104
546
546
23
145
358
996
100
272
278
104 23
145
358
393
393 792
792
530
530
569
549
181
569
369
763
322
549
181
582
582 656 568
970
180
615
809
235
888
550
180
615
809
833
833
690
690
364
672
655
970
966
235
888
550
369
763
322
656 568 966
140
364
672
655
140
544 368
976
532
602 182
544
856
182
368
370
374
784
856
784
348 972
972 306
496
635
306
496
635
654
812
812
594
581 555
338
976
532
602
370
374 348
654 594
581 555
338
779
779
420
420
194
194 586 908 323
663
556
586 908 323
963
663
556
928
963
928
245 810
457
245
174
810
457
498
531
482
174 498
531
482 739
739
842
463
842
463
202
652
843
758
758
493
469
168
202
652
843
493
469
168
828
335
828
335
873
764
873
764
339
339 610
610
484
484 864
516
567 845 282
845 408
159
282
408 308
729
574
525
345 159
850
229
729
715
113
256
308
850
864
516
567
345
715
113
256
229 574
525
3
3
1
270
1
879
270
879
42 431
42
22
431
912
958
22
912
958 940
940 585
585 547
776
547
776
852
741
86 129
816
895
514
15
852
741
86 129
816
895
518
514
15
518
552
552 858
858 607
607 835
835
460
406
460
406 108
633
108 633
732 372
732 372
411
411
730
778
730
778
982
982
396 560
396 560
233 220
233
379
26
330
577
220
379 533
326
665
215
884
26
330
577 533
326 147
147
665
215
884 334
334 678
366
678
366
772
772
283
283
664
391 75 172 701 44
545
701 728
189
44
545
448
728
189
106 500
500 51
133
491 905
900
575
818
887
887
844
444
51
133
913
491 905
818
804
343
80
658
303
106
913
575
448
343
80
658
303
900
664
391
75 172
844
444
804
12
12 102
102
232
232 63
964
443
63
964
243
697
443
243
697 453
453
685
685
813
262
17 919
813
27
997
93
93
43
29
43
29 823
67
320
262
17 919
27
997
752
823
67
320
752
252
333
252
333
228
228 266
617
57
774
849
381
266
617
57
774
849
381
660 38
660 38
293
6
293
6
76
76
962
962 766
316
138
112
237
298
766
316
138
112
237
651
704
298
651
704
37 187
37
327
187
668
327
668
573
573 614
614 559
536
559
536
310
310
691
691
336
336
705
319 114
705
319 114
760
142
760
142
13
13
79
79
134 914
881
340
914
881
340
584
130
717 584
999 662
989
999 662
605
429
989
429
661 137
134
130
717
605
661
317
395
137
317
395
869
869
18
18 540
676
540
676
5
290
5
290
471
471 466
466
45
413
45
413 767
767
782
782
750
886
750
886 627
794 579
423
579
862 432 164
983
32
384
837
967
692
641
692
935
19
515
694
354
384
667
347
694
893
893
230
230
923 127
896
933
296
815
923 127
896
933
693
475
36
415
723
641
354
935
19
515
299
885 983
32
667
347
801
468 561
862 432 164
837
942
726 162
415
723
967
627
794 423
468 561
299
885
801
942
726 162
693
475
36
296
815
932
932 24 805
24
388
557
805
945
388
557
945
363
363
894
222
894
892
734
218
785
222
892
734
218
785
796
796
265
265 402
992
261
797
878
402
992
261
797
878
87
87
606
606
66
66 195
195
872
116
872
116 592
592
640
640
674
674
200
200 167
167
248
440
955
248
440
955
921
666
921
666
502
280
502
280
244
244 721
817
721
817 534
534 239
754
239
754
883
883 74
430
110
551
74
430
110
551
480
866
480
866 566
240
825
33
673
639
566
240
825
33
673 745
639
745
73
73
720
720
631 807 49
631 807
105 49
90
511
105
90
511
152
765
598
152
765
598
714
509
714
509
286
465
286
465
907
907 613
613
648
141
890
109
258
109
325 526
497
930
10
826
580
524
205
824
313
930
356
512
10 165
392
576
526 356
512 337
762 671
185 593 459
325 392
576
83
60
554
236 611
258
497 826 205
648
762 671
185 593 459
580 824
141
890
554
236 611
242
83
60
524
165
337
313
242
738
738 925
925
96
96
9
9 169
859
169
122
859
122
700
748
700
748
636
193
636
193
7
7
40
971
40
971
178
178 931
458
830
931
458
830
481
851
481
851
571
571
965
965 865
865
435 445
435 445
857
857
62
62
733
733
450
450
136
414 362
645
136
414 362
645
946
761
946
761
382
382
854
854 436 117 487
436
757 117
25 877
926
487
456
196
477
860
344
877
267
267
175
177
719 564
944
250 118
250 118
959 841
800
473
959 841
451 473
634
638 281
802
315
501 173
795
486
638 281
802
315
173
210
819
795
486
819 409
409
927
927
154
154
126 688
126 688
595
0
474
331
975
175
177
719 564
944
451
210
595
917
0
474
331
975
906
917
906 722
276
54
101
722
276
54
101 441
441 798
798 257
257
68
259
483
838
994
472
346
166
553
537
746
166
746
219 349
84
84
875
14
14
389
389
224
224 227
227 783
783
737
476 91
314
148
737
476
847
612
987
377
995 146
612
889
570
390
405
199
986
780
949
397
889
570
390 377
506
786
780
949
397
952
847
504
987 506
786
148
834
427
504
91
314
834
405 995
346
439 769
537 219
472
209
553
439 769
349
68
259
483
838
994 209
875
427
477
401
446 803 503
634 501
860
344
121 781
803 503
800
757
25
456
978
922
401
446
121
926
196
978
922 781
199
986
146
952
46
46 623
179
623
179
513
603
513
603
353
11
353
11
394
979
394
979
425
425
736
736
911
961
911
961
65
65 424
424
861
711
255
217 874
201
596
743
743 103
103 297
297 867
274
39
867
274
39 461
416
461 416
707
707
419
419 731
630
731 630
98
98
208
876
903
4 771
288 902
902
681
681 643
643
70
70 107
289
289
499
69
373
47
47
28
621 974
216
128
31
52
216
517
517
653
149
28
621 974
128
31
941
184
241
52
373
791
184
241
499
69
941
791
208
876
827
903
4 771
107
151
653
149
583
151
583
562 407
268
510 407
572 92 642 993
587
207
16
207
20
790
616 312
609
20
790 292
616 312
609
246 58
53
918 400
53
918 204
400
871
840
650
204
604
399
94
277 449
449
677
870
871
428
399
94
277
150
58
749
21
840
650
428 604
16 998
246 749
21
291
263
934 703
132
679
61
8
703
993
587 957
998
263
934
572 92 642
291
132
679
61
8
292
562
268
510
957
861
711
255
217 874
201
596
827 288
527
677
870
527
808
626
948
295
839
71
591
85
855
150
948
295
839
71
591
85
855
808
626 426
464
375
271
901
64
426
464
212
375
271
901
64
212
294 683
294 683
755
770
755
770 341 203
788
341 203
788
365 943
404
718
365 943
404
718
973
973
969
969 848 485
891
848 485
170
160
170
160 891
916
628
916
628
156 775
156 775
521
695
521
695
367
367
351 188
929
351 188
899
899
657
161 357
186 543
657
161
56
438
357
186
249
832
929
543
56
438 249
832
495
495
684
684 2
920
2
920
302
305
302
305
43730
386
43730
386
301 387
422
301 387
422
95
95 528
528 565
565
269
269 144
981
538
155
814
211
144
981
311
960
538
155
814
211 311
960
371
371 789
789
724
378
696
724
378
696 710
618
494
710
618
494 275
135
725
954
954
608 836
608 836
361
361
984
984 505
131
131
756
756 120
829
619
275
135
725
505
321
264
120
829
619
853 470
383
321
980 462
191 433
342
535
264
853 470
980 462
191
383
433
342
535
418
418 709
821
59 223
709
821
59 223
380
380
447
744
447
744
488
488
50
50 490
410 234
490
410
563
225
234
939
563
225
939
328
328
88
88
601
171
820
601
412
171
820 412
578
578
197
197 541
541 119
119 597
597
522
522 523
523
260 716
260
882
768
716
882
768
753 97
868
(a) Network before evolution
753 97
868
(c) Network after evolution
Fig. 6. The figure shows on the left a communication network of 1000 nodes. The nodes are identically independently distributed on a (10,000x10,000) grid. Each node is connected to its next 10 neighbors, symbolized by an edge between the nodes. The network on the right has evolved from the left network after 10 simulation steps. In each simulation step the Betweenness Centrality of all nodes has been calculated. Subsequently, every node with a Betweenness Centrality index less than 1500 enhanced its transmission radius to connect to the next furthest. Every node with a Betweenness Centrality index of more than 50 000 cut the connection to the neighbor with the highest Betweenness Centrality. The result is a network in which nodes on the border spread their connections deep into the network. The inner part of the network is partly sparser than before. The evolved network has a lower diameter and lower average distance. Whereas in the left network some nodes have Betweenness Centralities of up to 105 000, in the right network the maximal index is 49 000.
After 10 simulation steps the diameter of the graph is reduced from 26 to 22, the average hop distance from 10.66 to 10.07. The maximal transmission radius has increased from 1262 to 1776, the average transmission radius from 654 to 814. The maximal Betweenness Centrality is more than halved from 105 000 to 49 000. Of course, every graph will reduce its diameter if the average transmission radius is enhanced. The important observation here is that the nodes with high load in routing will maintain or decrease their transmission radius. The global decrease of the diameter is therefore conducted by increasing the transmission radius of the less loaded nodes. This single simulation shown here is exemplary. Still, finding the correct minimal and maximal Betweenness Centralities according to the number of nodes is not trivial as is neither determining an appropriate adaptation rule. We want to emphasize that the main goal of our paper is to show that a decentral computation of centralities is possible in an efficient way. The above given example is therefore just an illustration of what we think can be done with these centralities. Nonetheless, the given adaptation rule might be a good starting point for real applications: It encourages the nodes to enhance their transmission radius if they are not likely to suffer high routing demands from others. Simultaneously, highly recommended router points are relieved. Thereby, every node contributes similar amounts of energy to the global system, either by maintaining a higher transmission radius or
by an expanded routing service. R EFERENCES [1] U. B RANDES , ”Faster Evaluation of Shortest-Path Based Centrality Indices,” J. Math. Soc., vol. 25(2), pp. 163-177, 2001 [2] L.C. F REEMAN , ”A set of measures of centrality based on betweenness,” Sociometry, vol 40, pp. 35-41, 1977 [3] C.H. N G , K.C. S IA , C.H. C HAN , I. K ING , ”Peer Clustering and Firework Query Model in the Peer-to-Peer Network,” Dep. Comp. Science & Eng., Chinese University of HongKong, HongKong, China, Tech. Rep.,2002 [4] I. G LAUCHE , W. K RAUSE , R. S OLLACHER , M. G REINER Continuum percolation of wireless ad hoc communication networks Physica A, 325 (3-4), 577-600 (2003) [5] S. WASSERMANN , K. FAUST Social Network Analysis: Methods and Applications Cambridge University Press, 1994 [6] J. S COTT Social Network Analysis: A handbook Sage Publications, London, 2003 (2nd ed.) [7] M. E. J. N EWMAN , M. G IRVAN Finding and evaluating community structure in networks preprint cond-mat/0308217 [8] M.E.J. N EWMAN Who is the best connected scientist? A Study of scientific coauthorship networks Phys. Rev. E 64, (2001) [9] V. K REBS The social Life of Routers The Internet Protocol Journal 3(4),14-25 (2000) [10] P. H OLME Congestion and centrality in traffic flow on complex networks Advances in Complex Systems 6, 163-176 (2003) [11] K LOVDAHL , A.S., G RAVISS , E.A., YAGANEHDOOST, A., R OSS , M.W., WANGER , A., A DAMS , G. AND M USSER , J.M. Networks and Tuberculosis: An Undetected Community Outbreak Involving Public Places Social Science and Medicine, 52(5), 681-694, 2001. [12] G. S ABIDUSSI The centrality index of a graph Psychometrika 31, 581603 (1966) [13] P. H AGE , F. H ARARY Eccentricity and Centrality in Networks Social Networks 17, 57-63 (1995)
SUBMITTED TO IEEE
[14] A. S HIMBEL Structural parameters of communication networks Bull. Math. Biophys.15, 501-507 (1953) [15] R OBERT G. G ALLAGER Distributed Minimum Hop Algorithms Technical Report LIDS -P- 1175, MIT, (1982)
9