Highly-Clustered Networks with Preferential Attachment to Close Nodes

Matteo Dell'Amico∗
Dipartimento di Informatica e Scienze dell'Informazione
Università di Genova
[email protected]

August 3, 2006

Abstract

We analyse the properties of networks formed using a variant of the preferential attachment algorithm introduced by Barabási and Albert, where a new node i connects to an old node j with a probability that is proportional to its degree $k_j$. In our model, nodes are assigned a random position on a ring, and the connection probability is proportional to $k_j^\alpha / d_{ij}^\sigma$, where $d_{ij}$ is the distance between i and j, and α and σ are positive parameters. When γ = σ/α is fixed and α grows to infinity, an even simpler model is produced, which selects the m nodes having the highest value of $k_j / d_{ij}^\gamma$. The resulting family of networks shows various properties that are commonly found in real-world networks, most interestingly scale-free, small-world, and hierarchical structure.

1 Introduction

Complex networks have attracted great interest in recent years, with the discovery of recurrent features that the random graph model of Erdős and Rényi could not explain. Watts and Strogatz [12] highlighted that many networks are characterised by small diameter¹ and high clustering (small-world behaviour); Barabási and Albert [2] observed that the distribution of degrees often follows a power law (scale-free behaviour). These two seminal papers, moreover, showed how these characteristics could be obtained in synthetic graphs, by proposing models for creating random networks exhibiting those properties. The model of Watts and Strogatz obtains small worlds by introducing some long-range connections in highly clustered lattices; Barabási and Albert obtained scale-free networks by introducing preferential attachment: nodes having a high degree are more likely to obtain new links.

This work extends the Barabási-Albert algorithm in order to obtain networks that are both scale-free and small-world. In our model for graph construction, nodes are distributed randomly on a given topology, and they are more likely to link both to closer nodes and to higher-degree ones. This idea combines preferential attachment for degree, which produces a power-law degree distribution, and preferential attachment for close nodes, which yields networks with localised connectivity, and thus a high clustering value.

∗ Financially supported by the Italian MIUR under the framework of the FIRB 2001 action, WEBMiNDS Project. http://web-minds.consorzio-cini.it/
¹ Throughout this paper, the term diameter refers to the average length of geodesic paths between all pairs of vertices in a graph.


The rationale for preferentially attaching to close nodes is that, in many networks, a concept of distance exists. For instance, in social networks, two people are more likely to know each other if they live close to each other, or if they share some interests or hobbies. The space modelling this concept of distance could thus reflect both physical positions and concepts such as semantic closeness or similarity in behaviour.

An example of our kind of preferential attachment can be seen on the WWW. We may put a new link on our web page because we saw its target as a web search result. We are more likely to search for something close to our area of interest (i.e., at low distance); moreover, search engines use algorithms that reward being linked (high degree), ranking well-linked pages higher among the search results and thus increasing our probability of reading and linking to them.

Throughout this paper, we will assume that nodes are randomly distributed on a ring. This choice is arbitrary; it can be justified by its simplicity and symmetry. In modelling particular kinds of networks, different topologies may be better suited to the case.

2 Related Work

Other network models have been proposed in order to account for high clustering and power-law degree distribution; in [9, 3, 5], high clustering is achieved by forming triangles, connecting node pairs at distance 2. This phenomenon can happen, for instance, when an old friend introduces us to a friend of hers. While this mechanism is realistic, it is reasonable to think that it coexists with the one we are describing. For example, we may get to know a person because he lives near us, even if we do not have friends in common.

Other works proved that high clustering can also be a consequence of optimisation of diameter versus wiring cost [8], and of preferential attachment when nodes, at some point in time, get old and can no longer be linked by new nodes [6].

Our approach, in which preferential attachment depends both on degree and on Euclidean distance, has been tackled in other works. [7] focused on link length distribution, and [13] studied degree distribution and diameter. Both papers analysed the case where the α parameter of Equation 1 is equal to 1. [14] uses a model of this kind for modelling the structure of the network of Internet routers. To the knowledge of the author, however, no work so far has studied either clustering in such networks or the structure of networks when α takes large values and grows towards infinity.

3 Graph Construction

The graph construction algorithm depends on four parameters: n, the number of nodes in the network; m, the number of new edges added when a new node joins; and α and σ, which control, respectively, the relevance of degree and of (Euclidean) distance for preferential attachment. When a new node i joins the network, it selects its m neighbours according to the set of probabilities

$$\Pi(k_j, d_{ij}) \sim \frac{k_j^\alpha}{d_{ij}^\sigma}, \qquad \alpha \ge 0,\ \sigma \ge 0 \tag{1}$$

for each node j ∈ V, where $k_j$ is the degree of node j and $d_{ij}$ is the distance from i to j. This algorithm reduces to the original Barabási-Albert algorithm when α = 1 and σ = 0.
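The construction rule of Equation 1 can be sketched in Python as follows. This is a minimal illustration and not the author's implementation: the seed network (a clique on the first m + 1 nodes), the way m distinct neighbours are drawn (one at a time, without replacement), the ring of circumference 1, and the names `ring_distance` and `build_graph` are all assumptions introduced here, since the text does not specify them.

```python
import random


def ring_distance(a, b):
    """Shortest distance between two positions on a ring of circumference 1."""
    d = abs(a - b)
    return min(d, 1.0 - d)


def build_graph(n, m, alpha, sigma, seed=None):
    """Grow a graph of n > m nodes; each new node links to m old nodes.

    An old node j is drawn with weight k_j**alpha / d_ij**sigma (Equation 1).
    Returns (positions, adjacency), where adjacency is a list of neighbour sets.
    """
    rng = random.Random(seed)
    positions = [rng.random() for _ in range(n)]
    adjacency = [set() for _ in range(n)]

    # Assumed seed network: a clique on the first m + 1 nodes, so that every
    # old node has nonzero degree before preferential attachment starts.
    for i in range(m + 1):
        for j in range(i):
            adjacency[i].add(j)
            adjacency[j].add(i)

    for i in range(m + 1, n):
        candidates = list(range(i))
        # Distances are clamped away from zero to avoid dividing by zero.
        weights = [
            len(adjacency[j]) ** alpha
            / max(ring_distance(positions[i], positions[j]), 1e-12) ** sigma
            for j in candidates
        ]
        # Assumed sampling scheme: draw m distinct neighbours one at a time,
        # removing each chosen node from the candidate pool.
        for _ in range(m):
            pick = rng.choices(range(len(candidates)), weights=weights)[0]
            j = candidates.pop(pick)
            weights.pop(pick)
            adjacency[i].add(j)
            adjacency[j].add(i)
    return positions, adjacency
```

For example, `build_graph(10000, 5, 1.0, 2.0)` would correspond to the parameters n = 10000, m = 5, α = 1, σ = 2 used in the experiments. The sketch recomputes all weights for every new node, so it runs in quadratic time, and it is only meant for moderate values of α and σ.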


[Figure 1: plot of cumulative frequency (vertical axis, logarithmic, 1e-04 to 1) against degree (horizontal axis, logarithmic, 1 to 10000). Series: α=1.0, σ=2.0, γ=2.0; α=∞, σ=∞, γ=2.0; α=∞, σ=∞, γ=1.0; α=2.0, σ=1.0, γ=0.5; α=∞, σ=∞, γ=0.5.]

Figure 1: Impact of γ on the degree distribution. For each degree d, the plotted value represents the frequency of nodes that have degree equal to or greater than d. Sampled networks have n = 10000 and m = 5.

We define γ = σ/α; when α ≠ 0, the definition in Equation 1 is equivalent to

$$\Pi(k_j, d_{ij}) \sim \left(\frac{k_j}{d_{ij}^\gamma}\right)^\alpha. \tag{2}$$
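The equivalence between Equations 1 and 2 can be checked by substituting σ = αγ into Equation 1. The numeric values below are arbitrary and chosen only for illustration.

```latex
\frac{k_j^{\alpha}}{d_{ij}^{\sigma}}
  = \frac{k_j^{\alpha}}{d_{ij}^{\alpha\gamma}}
  = \left(\frac{k_j}{d_{ij}^{\gamma}}\right)^{\alpha}
% Example: \alpha = 2, \sigma = 1 (so \gamma = 0.5), k_j = 9, d_{ij} = 4:
%   k_j^\alpha / d_{ij}^\sigma = 81 / 4 = 20.25
%   (k_j / d_{ij}^\gamma)^\alpha = (9 / 2)^2 = 20.25
```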

An interesting case arises when the limit for α → ∞ is studied. When α grows, the highest probability in Equation 2 grows to 1; the nodes chosen for linking thus become the m ones that maximise

$$\frac{k_j}{d_{ij}^\gamma}. \tag{3}$$

In this case, the edge selection becomes deterministic, and the only random component remaining is the choice of values in the random position vector p. In Section 4, it will be seen that networks produced in this way show properties that are widely present in real-world networks. With an abuse of notation, this case will be denoted in the following as having α = σ = ∞.
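The deterministic limit can be sketched in Python as a top-m selection on the score of Equation 3. As before, this is an illustration under stated assumptions rather than the author's code: the seed clique on the first m + 1 nodes, the ring of circumference 1, and the names `ring_distance` and `build_graph_limit` are hypothetical, and ties between equal scores are broken by whatever order `heapq.nlargest` returns.

```python
import heapq
import random


def ring_distance(a, b):
    """Shortest distance between two positions on a ring of circumference 1."""
    d = abs(a - b)
    return min(d, 1.0 - d)


def build_graph_limit(n, m, gamma, seed=None):
    """Grow a graph where each new node links to the m old nodes that
    maximise k_j / d_ij**gamma (the case denoted alpha = sigma = infinity)."""
    rng = random.Random(seed)
    # The random positions are the only source of randomness.
    positions = [rng.random() for _ in range(n)]
    adjacency = [set() for _ in range(n)]

    # Assumed seed network: a clique on the first m + 1 nodes.
    for i in range(m + 1):
        for j in range(i):
            adjacency[i].add(j)
            adjacency[j].add(i)

    for i in range(m + 1, n):
        def score(j):
            d = max(ring_distance(positions[i], positions[j]), 1e-12)
            return len(adjacency[j]) / d ** gamma

        # nlargest returns the full list of winners before any edge is added,
        # so all scores are computed on the degrees seen by the new node.
        for j in heapq.nlargest(m, range(i), key=score):
            adjacency[i].add(j)
            adjacency[j].add(i)
    return positions, adjacency
```

Because no random draw is made after the positions are fixed, two runs with the same positions produce the same graph.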

4 Experimental Results

We refer to the full version of this paper [4] for a more detailed discussion of the experimental results.

Degree Distribution. According to our simulation results for cases with α > 1, the value of γ = σ/α appears fundamental for the topology of the network. Networks having γ < 1 tend towards gelation: most nodes are connected to, and only to, the same m hubs, which thus have a degree close to n. When γ = 1, the degree distribution follows a power law with the same exponent, 3, as the Barabási-Albert model. Figure 1 shows experimental results on degree distributions with various parameters.

Diameter. The diameter is close to 2 for large gelated networks, since any two random nodes are, with high probability, both connected to at least one large hub. A logarithmic growth is present in all other kinds of networks.
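The two quantities discussed above can be measured with a short Python sketch. It assumes the graph is stored as a list of neighbour sets (a representation chosen here for illustration), and the function names are hypothetical. The first function computes the cumulative frequency plotted in Figure 1; the second computes the diameter in the sense used in this paper, that is, the average geodesic length, using one breadth-first search per node and ignoring unreachable pairs.

```python
from collections import Counter, deque


def cumulative_degree_frequency(adjacency):
    """Map each observed degree d to the fraction of nodes with degree >= d."""
    n = len(adjacency)
    counts = Counter(len(neigh) for neigh in adjacency)
    result = {}
    remaining = n  # number of nodes whose degree is >= the current d
    for d in sorted(counts):
        result[d] = remaining / n
        remaining -= counts[d]
    return result


def average_path_length(adjacency):
    """Mean geodesic length over all connected ordered pairs of nodes."""
    total = 0
    pairs = 0
    for source in range(len(adjacency)):
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs
```

On a star with one hub and four leaves, `[{1, 2, 3, 4}, {0}, {0}, {0}, {0}]`, the average path length is 1.6, and it approaches 2 as leaves are added, which is the same effect described for gelated networks.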

[Figure 2: two panels plotting clustering (C0 and C1) against graph size, from 0 to 10000 nodes.]

Figure 2: Clustering with respect to network evolution. Analysed networks have m = 5, α = 1, σ = 2 (left) and m = 5, α = σ = ∞, γ = 1 (right). These results suggest that clustering approaches a nonzero limit when the network size grows towards infinity.

Clustering. Two popular measures of clustering [10, equations 3.3 and 3.5] will be used to evaluate our experimental results. Given a graph G with n nodes, we define:

$$C_0(G) = \frac{3 \cdot \text{number of triangles in } G}{\text{number of connected triples in } G}; \tag{4}$$

$$c_1(v) = \frac{\text{number of triangles connected to } v}{\text{number of triples centered in } v}; \qquad C_1(G) = \frac{1}{n} \sum_{i \in V} c_1(i). \tag{5}$$
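Both measures can be computed directly from their definitions, as in the following Python sketch. The graph representation (a list of neighbour sets) and the function names are assumptions made for illustration, and so is the convention of setting c1(v) = 0 for nodes with fewer than two neighbours, which the definitions above leave open.

```python
from itertools import combinations


def linked_neighbour_pairs(adjacency, v):
    """Number of pairs of neighbours of v that are themselves linked."""
    return sum(1 for a, b in combinations(adjacency[v], 2) if b in adjacency[a])


def local_clustering(adjacency, v):
    """c1(v): triangles through v divided by triples centred on v."""
    k = len(adjacency[v])
    if k < 2:
        return 0.0  # assumed convention: no triples are centred on v
    return linked_neighbour_pairs(adjacency, v) / (k * (k - 1) / 2)


def clustering(adjacency):
    """Return (C0, C1) following Equations 4 and 5."""
    n = len(adjacency)
    # Each triangle is counted once per corner, which gives the factor 3
    # in the numerator of Equation 4.
    closed = sum(linked_neighbour_pairs(adjacency, v) for v in range(n))
    triples = sum(len(a) * (len(a) - 1) // 2 for a in adjacency)
    c0 = closed / triples if triples else 0.0
    c1 = sum(local_clustering(adjacency, v) for v in range(n)) / n
    return c0, c1
```

As a small check, take a hub 0 linked to nodes 1 to 4, with the additional edges 1-2 and 3-4: `[{1, 2, 3, 4}, {0, 2}, {0, 1}, {0, 4}, {0, 3}]`. The graph has 2 triangles and 10 connected triples, so C0 = 0.6, while C1 = (1/3 + 4)/5 ≈ 0.867, because the high-degree hub weighs more in C0 than in C1.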

While C1 accounts equally for all vertices, C0 gives a higher weight to higher-degree nodes, since those nodes are part of a higher number of connected triples. In general, we observe that C1 > C0 (when m = 1, C0 = C1 = 0 because no triangles can be formed). This is due to the fact that nodes having a large degree attract more long-range connections, and contribute less to the clustering. Since C0 assigns more weight to high-degree nodes, the resulting total clustering is lower.

A higher value of σ increases both clustering and network diameter. However, while clustering rapidly approaches high values, the network diameter remains small, and logarithmic with respect to the network size. Thus, when m > 1 and σ > 1, the result is a small-world network.

In many real systems, clustering values are independent of the size of the network itself. In other words, clustering tends to a nonzero value when n → ∞ [1, III.F]. In our experiment, we calculated the clustering values for networks during network formation. The results, shown in Figure 2, suggest that the clustering values either converge to a finite value or go to zero very slowly.

According to the results seen so far, the case of γ = 1 seems particularly interesting, since the properties of high clustering, scale-free degree distribution, and low diameter appear together. In Figure 3, the effect of varying α = σ on clustering and diameter is shown. The graphs show that these quantities are not heavily influenced by the variation of α = σ when this value gets larger than approximately 5, and converge to the values obtained for α = σ = ∞.

A hierarchical structure [11] is present when small groups of nodes are organised hierarchically in larger and larger groups. This can be measured by the fact that clustering on a node (the c1(v) value defined in Equation 5) is inversely dependent on the degree of that node. The networks we generate have a hierarchical structure, according to the observations of Figure 4.
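The hierarchy test described above amounts to averaging c1(v) over nodes of equal degree and checking how the average falls as the degree grows. A minimal Python sketch follows; the list-of-neighbour-sets representation and the name `clustering_by_degree` are assumptions made for illustration, and nodes with fewer than two neighbours are skipped because no triple is centred on them.

```python
from collections import defaultdict
from itertools import combinations


def clustering_by_degree(adjacency):
    """Map each degree k >= 2 to the mean c1(v) over nodes of that degree."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for neigh in adjacency:
        k = len(neigh)
        if k < 2:
            continue
        # Pairs of neighbours that are linked, i.e. triangles through the node.
        links = sum(1 for a, b in combinations(neigh, 2) if b in adjacency[a])
        sums[k] += links / (k * (k - 1) / 2)
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}
```

Plotting the returned values against k on logarithmic axes gives the kind of curve that can be compared with a line proportional to 1/k.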

[Figure 3: two panels with α = σ on the horizontal axis (0 to 10): clustering C0 and C1 (left) and diameter (right).]

Figure 3: Impact of α = σ on clustering (left) and diameter (right) when γ = 1. Networks analysed in the above graphs have n = 10000, m = 5, γ = 1. Results obtained when α = σ = ∞ are C0 = 0.176, C1 = 0.770, diameter 4.27. Diameter is not heavily influenced by the variation of α and σ, but the graph clearly shows a minimum diameter when α = σ = 1.

[Figure 4: plot of clustering (vertical axis, logarithmic, 0.01 to 1) against degree (horizontal axis, logarithmic, 1 to 1000). Series: reference line $8k^{-1}$; α=∞, σ=∞, γ=1; α=∞, σ=∞, γ=0.5; α=2, σ=2, γ=1; α=1, σ=2, γ=0.5; α=1, σ=1, γ=1.]

Figure 4: Clustering on nodes with respect to degree. The test graphs have n = 10000 and m = 5. When σ grows, the distribution of c1 on nodes gets closer to proportionality with $k^{-1}$ (the continuous line in the graph). Different values of α seem to have little effect on the points in the graph, influencing only the presence or absence of nodes having a particular degree.


5 Conclusions and Further Work

This paper presented a family of models that extend Barabási and Albert's algorithm, and can generate networks having high clustering, low diameter, scale-free degree distribution and hierarchical structure. These characteristics are commonly observed in many different kinds of complex networks, and thus justify the usefulness of these models. Our results were obtained by simulation, and a possible direction for future research is an analytical demonstration of the observed properties. Moreover, further work can be devoted to studying how node placement in a space with a different topology or a non-uniform distribution would affect the resulting network characteristics, for instance using a different number of dimensions, or even a non-integer fractal dimensionality, as done in [14].

References

[1] R. Albert and A. Barabási. Statistical mechanics of complex networks. Rev. Mod. Phys., 74:47, 2002.
[2] A. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509-512, Oct. 1999.
[3] P. Blanchard, T. Krueger, and A. Ruschhaupt. Small world graphs by iterated local edge formation. Phys. Rev. E, 71(046139), 2005.
[4] M. Dell'Amico. Highly Clustered Networks with Preferential Attachment to Close Nodes. Technical Report DISI-TR-06-06, DISI, 6 Apr. 2006.
[5] P. Holme and B. Kim. Growing scale-free networks with tunable clustering. Phys. Rev. E, 65(026107), 2002.
[6] K. Klemm and V. Eguíluz. Highly clustered scale-free networks. Phys. Rev. E, 65(036123), 2002.
[7] S. Manna and P. Sen. Modulated scale-free network in Euclidean space. Phys. Rev. E, 66(066114), 2002.
[8] N. Mathias and V. Gopal. Small worlds: How and why. Phys. Rev. E, 63, 2001.
[9] M. Newman. Clustering and preferential attachment in growing networks. Phys. Rev. E, 64(025102), 2001.
[10] M. Newman. The structure and function of complex networks. SIAM Rev., 45(2):167-256, 2003.
[11] E. Ravasz and A. Barabási. Hierarchical organization in complex networks. Phys. Rev. E, 67(026112), 2003.
[12] D. Watts and S. Strogatz. Collective dynamics of `small-world' networks. Nature, 393(6684):440-442, June 1998.
[13] R. Xulvi-Brunet and I. M. Sokolov. Evolving networks with disadvantaged long-range connections. Phys. Rev. E, 66(026118), 2002.
[14] S. Yook, H. Jeong, and A. Barabási. Modeling the Internet's large-scale topology. Proc. of the National Academy of Sciences of USA, 99(21):13382-13386, 2002.
