
Automatic Induction of Inter-Domain Hierarchy in Randomly Generated Network Topologies

Marc-Antoine Weisser and Joanna Tomasik
Computer Science Department, Supélec
3, rue Joliot-Curie, 91192 Gif-sur-Yvette cedex, France
{Marc-Antoine.Weisser, Joanna.Tomasik}@supelec.fr

Keywords: Inter-domain, Autonomous Systems, Hierarchical Topology Generator

Abstract
Links between the domains that make up the Internet are of different types. The relationships introduced by these link types impose a hierarchical structure on the global network. This structure may influence the functioning of mechanisms deployed in the network, notably inter-domain routing protocols. In order to validate a new inter-domain protocol, we need a model of a hierarchical network. To the best of our knowledge, no random topology generator is adapted to represent the Internet hierarchy. We try to induce this hierarchy in random topologies generated by BRITE. The results show that the hierarchy induced in topologies generated with the extended model of Barabási and Albert (BA2) is very close to the real one. The implementation of the proposed solution, the SHIIP program, is available on the Internet under a public-domain license.

1. INTRODUCTION
The Internet, considered as a network of domains, is a hierarchical network. The hierarchy between domains arises from the different types of relationships connecting them. This hierarchy may have an impact on the functioning of inter-domain routing protocols, and for this reason performance studies of inter-domain protocols require hierarchical topologies typical of the Internet.

We can consider two approaches to obtaining a network topology. The first is oriented towards the reconstruction of the real-life Internet from data stored in actual BGP tables. It is a tedious and time-consuming task. The size of the real Internet may also become a blocking factor for efficient simulation runs testing the behavior of a new protocol. The second approach focuses on the random generation of inter-domain network topologies. It gives a modeler the possibility of quickly generating a range of random topologies of a given size. We want to furnish a modeler with a tool that provides series of different topologies which conform to the Internet characteristics but are smaller in size. In our opinion, network topologies that are representative of real inter-domain networks and small enough to be treated numerically by simulation contain several hundred domains. To obtain them, we generate random topologies of domains representative of the Internet, then we induce hierarchical relationships in them.

The existing generators, which create random topologies, are divided into two classes:
• degree-based generators, with Inet [1], the models of Aiello et al. [2], Waxman [3], Barabási and Albert [4, 5], and the generalized linear preference model [6];
• structural generators, with Tiers [7] and Transit-Stub [8].

The generators of the first group tend to reproduce local properties of a network such as domain connectivity and clustering. The generators of the second group aim to represent global network properties such as its hierarchical structure. There are not many studies comparing these alternative approaches. Tangmunarunkit et al. [9] claim that degree-based network generators represent the large-scale structure of the Internet better than the structural generators do.

The analysis of the BGP (Border Gateway Protocol) routing tables [10, 11] shows that there are five levels in the Internet structure. It is not clear how to scale the number of levels for topologies of only several hundred nodes. The structural topology generators create graphs with two (Transit-Stub) or three (Tiers) arbitrarily imposed hierarchy levels and do not introduce the power-law degree distribution found in the Internet. We therefore consider the structural generators described above inadequate to represent the Internet hierarchy. Degree-based generators fit the power-law degree distribution but create random topologies without any related hierarchy. In the topologies which they generate, the inter-domain relationships have to be induced.

We cannot adapt the algorithms described in [10, 11, 12], which are used to infer the hierarchy of the real-world network, because generated topologies do not come with any routing tables. We intend to induce the hierarchy by analyzing a generated topology with graph theory methods. The generator of our choice is BRITE [13], which offers four generation models. We have selected this tool because it is largely accepted by the network community and is widely used in Internet studies. We carry out experiments testing our algorithm on the different topology generation models in order to select the one most suitable for the hierarchy inducing procedure.

In the following section we describe the global inter-domain characteristics measured by other authors and we state the working hypotheses about the hierarchy we want to induce [14]. Next, we present our algorithm in detail. In Section 4 we discuss the results obtained for the different models offered by BRITE and we suggest the one which is, in our opinion, the most appropriate. The link to the page of the public-domain implementation of the proposed algorithm is given at the end of the paper.

2. INTERNET HIERARCHY
2.1 Measured characteristics of Internet
We consider two principal relationships between domains: provider to customer (P2C) and peer to peer (P2P). A P2C relationship links a provider, which sells connectivity, to its client. A P2P relationship links two domains which share their connectivity. These relationships create a hierarchy of domains in the Internet.

Several researchers [10, 11, 12, 15, 16] have extracted the characteristics of the Internet and its hierarchy from BGP routing tables. They have found that the Internet is currently composed of more than 10,000 domains divided into five layers: Tier-1, ..., Tier-5. Usually, the domains in layer T-i are providers of domains in layer T-j with j > i, and conversely the domains in layer T-j are clients of domains in layer T-i. Domains in different layers T-i and T-j are linked together with P2C relationships. Domains in the same layer can be linked together with P2P or P2C relationships. Tier-1, or the core, is composed of provider domains linked with P2P relationships. Usually the graph formed by Tier-1 is a clique whose size is between 10 and 20 nodes. The average percentage of domains in each layer increases strictly (starting from Tier-1). The average degree of a domain in a layer decreases strictly (starting from Tier-1).

2.2 Characteristics of network model
Our goal is to introduce into flat randomly generated topologies the hierarchical characteristics which can be found in the real-world Internet. Other properties, such as degree distribution, clustering coefficient, and characteristic path length, are determined by the models used to generate the flat degree-based topologies. The hierarchy which we propose to induce in topologies has to fulfill the three following conditions:
• Tier-1 is a clique (clique-criterion);
• the average size of the layer increases strictly starting from Tier-1 (size-criterion);
• the average number of domains connected to a domain in a layer decreases strictly starting from Tier-1 (degree-criterion).

3. INDUCING A HIERARCHY
The proposed algorithm is composed of the following steps:
1. generate a random network topology;
2. find the core;
3. induce a layer number for each node;
4. optimize the layer number of the nodes;
5. induce relationships.

3.1 Finding the core
We attempt to find a natural clique composed of the nodes with the largest degree. This choice is justified by [11, 16], whose authors state that the domains in Tier-1 have the largest degree. We sort the list of nodes in decreasing order of degree and look for the largest index i such that the first i nodes in the sorted list form a clique.
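The search for the natural clique can be illustrated with a minimal Python sketch. It is not taken from SHIIP: the function name find_core, the adjacency-set representation, and the tie-breaking by node identifier are assumptions made for the example (the text does not say how nodes of equal degree are ordered). Because every prefix of a clique is itself a clique, the largest valid index i can be found by extending the prefix greedily.

```python
def find_core(adj):
    """Return the 'natural clique': the longest prefix of the degree-sorted
    node list whose members are pairwise adjacent.

    adj maps each node to the set of its neighbors (undirected graph).
    """
    # Sort by decreasing degree; ties are broken by node id, which is an
    # assumption of this sketch (the text does not specify tie-breaking).
    order = sorted(adj, key=lambda v: (-len(adj[v]), v))
    core = []
    for v in order:
        # The prefix stays a clique only if v is adjacent to every node kept so far.
        if all(u in adj[v] for u in core):
            core.append(v)
        else:
            break
    return core


# Toy topology: nodes 0, 1, 2 form a triangle; 3, 4, 5 are leaves.
toy = {
    0: {1, 2, 3},
    1: {0, 2, 4},
    2: {0, 1, 5},
    3: {0},
    4: {1},
    5: {2},
}
core = find_core(toy)
print(core)  # [0, 1, 2]
```

On this toy graph the three nodes of degree 3 form the core; node 3 ends the prefix because it is not adjacent to node 1.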

3.2 Inducing node levels
Inducing the node levels is a major difficulty of our algorithm. We have at our disposal the percentages of nodes in the five layers found in the real Internet [11, 15]. Our goal is to introduce layers into any network topology regardless of its size. We do not have, however, any information about its possible number of layers or about the composition of the layers. Our algorithm explores generated random topologies and tries to reproduce the Internet structure, taking into account the conditions mentioned in Section 2.2.

Our heuristic algorithm is as follows: the level of each node is set to the minimum distance between this node and the core, increased by one. We can perform this step with a breadth-first search. Applying this algorithm to the different generation models proposed by BRITE, we observe a concentration of nodes in the second and third layers. This concentration is caused by the large number of nodes directly connected to the core. In order to avoid it, we look for the lowest possible position for these nodes. The optimization procedure described below is charged with the task of finding the lowest place for a node.
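The initial level assignment (before any optimization) can be sketched as a multi-source breadth-first search started from all core nodes at once. The function name assign_levels and the dictionary-based graph are assumptions of this example, and the sketch assumes a connected topology; nodes unreachable from the core would simply receive no level.

```python
from collections import deque


def assign_levels(adj, core):
    """Set level(v) = 1 + minimum hop distance from v to the core."""
    level = {v: 1 for v in core}  # core nodes are at distance 0, hence level 1
    queue = deque(core)
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in level:  # first visit gives the shortest distance
                level[u] = level[v] + 1
                queue.append(u)
    return level


# Toy topology: triangle core 0-1-2, leaves 3, 4, 5, and node 6 behind node 3.
toy = {
    0: {1, 2, 3},
    1: {0, 2, 4},
    2: {0, 1, 5},
    3: {0, 6},
    4: {1},
    5: {2},
    6: {3},
}
levels = assign_levels(toy, [0, 1, 2])
print(levels[3], levels[6])  # 2 3
```

Starting the search from every core node simultaneously yields the minimum distance to the core as a whole in a single pass over the graph.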

3.3 Optimizing node levels
We attempt to limit the number of nodes on a layer by lowering the level of nodes with a smaller degree. We can lower a node v by one layer, from layer i to layer i + 1, if the following two conditions are satisfied simultaneously:
• v has a neighbor on layer i;
• all the customers of v on layer i + 1 have a connection to a node other than v on layer i.
These two conditions guarantee that lowering the level of v does not affect the level of the other nodes (Figures 1 and 2).

Figure 1. Node d on layer i has a neighbor e on the same level. All the nodes on layer i + 1 in the neighborhood of d (f and g) have a link to a node of level i which is not d: c and e. We can lower the level of d.

Figure 2. Node d is now on level i + 1 and has a provider on level i. All its former customers also have providers on layer i.

We start by lowering the level of the nodes with the smallest degrees on the second layer, until no node satisfies the two conditions noted above. Next, we apply the optimization to the other layers, top down.

3.4 Inducing relationship types
In order to convert the undirected edges between nodes of generated topology graphs into P2P or P2C relationships, we define two simple rules:
• an edge joining two nodes of the same layer is converted into a P2P relationship;
• an edge joining two nodes of two different layers is converted into a P2C relationship with the provider in the upper layer.
A reader may notice that these rules are a simplification because they forbid P2C relationships between nodes of the same level.

4. RESULTS
While testing the algorithm, we focused our attention on the properties it introduces:
• size of the natural clique;
• average size of the layers;
• average degree of nodes in the layers;
• number of P2P and P2C relationships induced.
We induce the hierarchy in topologies of variable size (from 40 to 2000 nodes) generated with the four models offered by BRITE: Waxman [3], Barabási and Albert (BA) [4], the extended model of Barabási and Albert (BA2) [5], and generalized linear preference (GLP) [6]. We generate the topologies with the default parameters proposed by BRITE. We present here results estimated with a precision of 5 percent at significance level 0.05.

4.1 Core size

Figure 3. Average core size for four generation models (number of nodes in the core versus size of topologies; curves for BA2, BA, GLP, and Waxman).

Figure 3 represents the average size of the core for the different generation models and varied topology sizes. For the Waxman and BA models, the average size of the core is small (about 1.55 for the Waxman model and 2.17 for the BA model) and does not depend on the size of the topologies. This behavior results from the strict construction method of the core used by our algorithm (a natural clique consisting of maximum-degree nodes) and from the generation models, which do not allow adding or rewiring links. These mechanisms are introduced in the BA2 and GLP models and allow them to produce topologies with more links between nodes of larger degree. For this reason the average sizes of the core for topologies generated with the BA2 and GLP models are larger. For the BA2 model, the size of the core decreases as the number of nodes in the topologies grows. On the contrary, for the GLP model, the core size increases with the size of the topologies. This phenomenon can be explained by the greater clustering coefficient of the GLP model compared with the BA2 model [6].

Figure 4. Average size of layers for the Waxman model (percentage of nodes in layers versus size of topologies; layers L1 to L6).
Figure 5. Average size of layers for the GLP model (percentage of nodes in layers versus size of topologies; layers L1 to L4).
Figure 6. Average size of layers for the BA model (percentage of nodes in layers versus size of topologies; layers L1 to L5).
Figure 7. Average size of layers for the BA2 model (percentage of nodes in layers versus size of topologies; layers L1 to L4).

Figure 8. Average degree of nodes in layers for the Waxman model (horizontal axis: size of topologies; layers L1 to L6).
Figure 9. Average degree of nodes in layers for the GLP model (horizontal axis: size of topologies; layers L1 to L4).
Figure 10. Average degree of nodes in layers for the BA model (horizontal axis: size of topologies; layers L1 to L5).

4.2 Size of layers
Figures 4 to 7 represent the average size of layers depending on the size of topologies. The Waxman model (Fig. 4) produces topologies with many non-negligible layers (four layers for 40-node topologies, six layers for topologies with more than 400 nodes). A seventh layer containing one percent of nodes appears for topologies with more than 2000 nodes. We observe that the number of layers in topologies of 2000 nodes is greater than the number of layers in the real Internet of 10,000 nodes [10, 11], which is equal to five.

The GLP model (Fig. 5) produces topologies with three principal layers: the core (small in size but essential for a hierarchy), the second layer, and the third layer. The fourth layer is outlying and small in comparison with the second and third layers. For topologies with more than 400 nodes the three important layers respect the size-criterion.

The number of layers composing topologies created with the BA model (Fig. 6) is five. This model fulfills the size-criterion, with the exception of the fifth layer, for topologies whose size is greater than 200. The fifth layer is of negligible size for small topologies but grows slowly and almost reaches the size of the second layer for 2000-node topologies (8.90 percent for the second layer and 7.79 percent for the fifth). These observations lead us to the conclusion that the BA model is not suited to topologies of the sizes we investigate, although studies on larger topologies may reveal a faster growth of the fifth layer.

In the majority of topologies generated with the BA2 model (Fig. 7) the number of layers is four. A fifth layer sometimes appears for topologies larger than 400 nodes, but its size does not exceed 0.01 percent of nodes. We observe that the average layer size is stable and the size-criterion is always satisfied. This model generates networks whose hierarchy layers show the features we expected.
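The layer sizes discussed in Section 4.2 and the size-criterion of Section 2.2 can be computed from a level assignment with a few lines of Python. The sketch below is illustrative only: the function names and the example labeling are hypothetical, and it checks a single topology, whereas the criterion in the text is stated for averages over many generated topologies.

```python
from collections import Counter


def layer_percentages(level):
    """Percentage of nodes on each layer, keyed by layer number."""
    counts = Counter(level.values())
    total = len(level)
    return {k: 100.0 * counts[k] / total for k in sorted(counts)}


def satisfies_size_criterion(percentages):
    """True if the layer size increases strictly from layer 1 downwards."""
    sizes = [percentages[k] for k in sorted(percentages)]
    return all(a < b for a, b in zip(sizes, sizes[1:]))


# Hypothetical labeling of 10 nodes: 2 on layer 1, 3 on layer 2, 5 on layer 3.
level = {0: 1, 1: 1, 2: 2, 3: 2, 4: 2, 5: 3, 6: 3, 7: 3, 8: 3, 9: 3}
shares = layer_percentages(level)
print(shares)                            # {1: 20.0, 2: 30.0, 3: 50.0}
print(satisfies_size_criterion(shares))  # True
```

A profile in which the last layer is smaller than the one above it, as reported for the fifth layer of the BA model, makes satisfies_size_criterion return False.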

4.3 Average degree
Figures 8 to 11 represent the average degree of nodes in layers. The degree-criterion is respected for the Waxman and BA models (Fig. 8 and 10), but nodes in layers two to six have close average degrees. This feature is undesirable because the measurements taken in the Internet [11] show that the average degree of nodes varies significantly between layers.

The degree-criterion is not respected for the GLP model (Fig. 9). The average degree of nodes in the fourth layer is greater than the average degree of nodes in the third one. A possible interpretation of this inversion between the third and fourth layers is that too many nodes of the third layer have been forced to descend by the optimization step. We test this assumption by inducing a hierarchy in topologies generated with the GLP model without the optimization step. In that case the average size of the layers is close to the size for topologies with optimization (Fig. 12), and the degree inversion between layers three and four disappears. With or without optimization for this model, our algorithm provides topologies with gaps between the average degrees of nodes in layers. These results are more realistic than the ones produced from the Waxman or BA models.

Figure 11. Average degree of nodes in layers for the BA2 model (horizontal axis: size of topologies; layers L1 to L4).

For the BA2 model (Fig. 11), the degree-criterion is respected. The average degrees depend clearly on the layers and are stable. Once again, topologies generated with the BA2 model are the best for our modeling purposes.
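The comparison made above, with and without the optimization step of Section 3.3, can be reproduced on any labeled topology with a sketch like the following. It is one possible reading of the two lowering conditions, not the SHIIP code: the function names, the tie-breaking by node identifier, and the choice to reconsider nodes that arrive on a layer when that layer is processed are all assumptions of this example. The toy graph is hypothetical and loosely modeled on Figure 1.

```python
def can_lower(adj, level, v):
    """Check the two conditions for moving v from layer i to layer i + 1."""
    i = level[v]
    # Condition 1: v has a neighbor on layer i (its future provider).
    if not any(level[u] == i for u in adj[v]):
        return False
    # Condition 2: every customer of v on layer i + 1 is linked to another
    # node of layer i, so its own level is not affected.
    for c in adj[v]:
        if level[c] == i + 1 and not any(
            w != v and level[w] == i for w in adj[c]
        ):
            return False
    return True


def optimize_levels(adj, level):
    """Lower nodes layer by layer, top down, smallest degree first."""
    level = dict(level)  # work on a copy
    i = 2  # the core (layer 1) is never modified
    while i <= max(level.values()):
        moved = True
        while moved:
            moved = False
            layer = sorted((v for v in level if level[v] == i),
                           key=lambda v: (len(adj[v]), v))
            for v in layer:
                if can_lower(adj, level, v):
                    level[v] = i + 1
                    moved = True
                    break  # re-evaluate the layer after each move
        i += 1
    return level


def average_degree_by_layer(adj, level):
    """Average node degree on each layer, keyed by layer number."""
    total, count = {}, {}
    for v, k in level.items():
        total[k] = total.get(k, 0) + len(adj[v])
        count[k] = count.get(k, 0) + 1
    return {k: total[k] / count[k] for k in sorted(total)}


# Hypothetical toy graph: core a; c, d, e one hop away; f, g, h two hops away.
adj = {
    "a": {"c", "d", "e"},
    "c": {"a", "f"},
    "d": {"a", "e", "f", "g"},
    "e": {"a", "d", "g", "h"},
    "f": {"c", "d"},
    "g": {"d", "e"},
    "h": {"e"},
}
level = {"a": 1, "c": 2, "d": 2, "e": 2, "f": 3, "g": 3, "h": 3}
optimized = optimize_levels(adj, level)
print(average_degree_by_layer(adj, level))
print(average_degree_by_layer(adj, optimized))
```

In this toy case node d is lowered first; e cannot follow because h depends on it alone, and under the reading chosen here f and g then descend below d. Comparing the two printed dictionaries is the single-topology analogue of comparing the curves with and without optimization.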

Figure 12. Size of the layers for the GLP model, with and without the optimization step (percentage of nodes in layers versus size of topologies; curves for L1 and for L2, L3, and L4 with and without optimization). The optimization does not affect the size of the core.

4.4 Percentage of P2P relationships
The last factor of model conformity which we treat is the average percentage of P2P relationships in a generated network. Measurements of this parameter in the Internet are studied in [10, 11, 15]. In those papers the authors give the measurement methodology they applied and explain the problems they encountered. The major difficulties in measuring the P2P percentage are caused by the choice of BGP vantage points and the interpretation of BGP table contents. We noticed that the reported P2P percentage values vary significantly depending on the chosen methodology.

The results obtained with all four generation models are summarized in Figure 13. For all the models, the average percentage of P2P links is larger than the values extracted from the studies of BGP tables presented in the papers cited above. This is not surprising given the strict rules used by our algorithm to induce the relationships, which convert every link between nodes in the same layer into P2P. This problem is caused by the lack of a dedicated rule for inducing P2P relationships. In the Internet, the BGP tables depend on the commercial relationships between the domain operators and can be used to roughly guess the type of the relationships. To the best of our knowledge, there are no theoretical results about the emergence of a P2P relationship between two domains.

Figure 13. Percentage of the P2P relationships for the four models (percentage of P2P links versus size of topologies; curves for BA2, BA, GLP, and Waxman).

We may consider two approaches to cope with the large percentage of P2P relationships:
• we do not modify the generated topologies, leaving them as natural as possible even if we consider the percentage of P2P relationships too large;
• we allow ourselves to modify the topologies to limit the percentage of P2P relationships by arbitrarily transforming some P2P relationships (the P2C relationships are self-sufficient to guarantee the connectivity).

5. CONCLUSION

Studies of proposed new routing protocols for inter-domain networks require simulation results in order to evaluate their performance. These studies should be made for a wide range of network topologies. In this context the use of the real network may become a blocking factor, and it is recommended to use random topology generators. On the one hand, existing structural random generators do not produce hierarchical topologies representative of either local or global Internet properties. On the other hand, the degree-based generators produce flat topologies representative of the local Internet properties but without any hierarchy.

We propose an algorithm to induce a hierarchy in flat topologies generated with the degree-based generators. The method allows us to create hierarchical topologies that retain the local properties of the degree-based model, such as degree distribution and clustering. We test our algorithm on the four degree-based models proposed by BRITE: Waxman, GLP, BA, and BA2. The study of different metrics (average core size, layer size, degree of nodes, and percentage of P2P relationships) shows that the BA2 model produces topologies with properties representative of the Internet hierarchy and that it is the best generation model tested with our algorithm. The average size of layers and the average degree of nodes are stable. Studies of topologies with 4000 nodes confirm this tendency.

There are four principal perspectives for this work:
• improving the method used to find the core, making it less restrictive;
• finding a criterion to limit the number of P2P relationships between nodes in the same layers;
• studying topologies of larger size;
• confronting our algorithm with the real topology extracted from the BGP tables.

The implementation of our algorithm inducing the domain hierarchy, SHIIP (Supélec Hierarchy Inter-domain Inducting Program), is available for all users under GNU license at http://wwwsi.supelec.fr/~weisser/.
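As a point of reference for the perspective on limiting P2P relationships listed above, the two rules of Section 3.4 and the P2P percentage of Section 4.4 can be sketched in a few lines of Python. This is an illustration written for this text, not an excerpt from SHIIP; the function names, the tuple encoding of relationships, and the toy edge list are assumptions of the example.

```python
def induce_relationships(edges, level):
    """Label each undirected edge as ('P2P', u, v) or ('P2C', provider, customer)."""
    labeled = []
    for u, v in edges:
        if level[u] == level[v]:
            # Same layer: peers.
            labeled.append(("P2P", u, v))
        elif level[u] < level[v]:
            # u sits on the upper layer (smaller number), so u is the provider.
            labeled.append(("P2C", u, v))
        else:
            labeled.append(("P2C", v, u))
    return labeled


def p2p_percentage(labeled):
    """Share of P2P relationships among all labeled edges, in percent."""
    p2p = sum(1 for kind, _, _ in labeled if kind == "P2P")
    return 100.0 * p2p / len(labeled)


# Toy topology: nodes 0, 1, 2 on layer 1; nodes 3, 4, 5 on layer 2.
level = {0: 1, 1: 1, 2: 1, 3: 2, 4: 2, 5: 2}
edges = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 4), (2, 5), (3, 4)]
labeled = induce_relationships(edges, level)
print(labeled[3])                         # ('P2C', 0, 3)
print(round(p2p_percentage(labeled), 2))  # 57.14
```

Because every same-layer edge becomes P2P, four of the seven toy edges are peers; a finer criterion of the kind called for above would replace the first branch of induce_relationships.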

REFERENCES
[1] C. Jin, Q. Chen, and S. Jamin, “Inet: Internet topology generator,” Tech. Rep. CSE-TR-443-00, Department of EECS, University of Michigan, 2000.
[2] William Aiello, Fan Chung, and Linyuan Lu, “A random graph model for massive graphs,” in Proc. of STOC’00, Portland, USA, 2000, pp. 171–180, ACM Press.
[3] Bernard M. Waxman, “Routing of multipoint connections,” Selected Areas in Communications, IEEE Journal on, vol. 6, no. 9, pp. 1617–1622, 1988.
[4] Albert-Laszlo Barabási and Reka Albert, “Emergence of scaling in random networks,” Science, vol. 286, no. 5439, pp. 509–512, 1999.
[5] Reka Albert and Albert-Laszlo Barabási, “Topology of evolving networks: local events and universality,” Physical Review Letters, vol. 85, pp. 5234, 2000.
[6] T. Bu and D. Towsley, “On distinguishing between Internet power law topology generators,” in Proc. of INFOCOM’02, New York, USA, June 2002, vol. 2, pp. 638–647.
[7] M. Doar, “A better model for generating test networks,” in Proc. of Globecom’96, London, Great Britain, 1996, pp. 86–93.
[8] Ellen W. Zegura, Kenneth L. Calvert, and Samrat Bhattacharjee, “How to model an internetwork,” in Proc. of IEEE Infocom, San Francisco, USA, March 1996, vol. 2, pp. 594–602.
[9] Hongsuda Tangmunarunkit, Ramesh Govindan, Sugih Jamin, Scott Shenker, and Walter Willinger, “Network topology generators: degree-based vs. structural,” in Proc. of SIGCOMM’02, Pittsburgh, USA, 2002, vol. 32, pp. 147–159, ACM Press.
[10] Zihui Ge, Daniel R. Figueiredo, Sharad Jaiswal, and Lixin Gao, “The hierarchical structure of the logical Internet graph,” in Proc. of SPIE ITCOM’01, Colorado, USA, July 2001, vol. 4526, pp. 208–222.
[11] Lakshminarayanan Subramanian, Sharad Agarwal, Jennifer Rexford, and Randy H. Katz, “Characterizing the Internet hierarchy from multiple vantage points,” in Proc. of IEEE INFOCOM’02, New York, USA, June 2002, vol. 2, pp. 618–627.
[12] Lixin Gao, “On inferring autonomous system relationships in the Internet,” IEEE/ACM Trans. Netw., vol. 9, no. 6, pp. 733–745, 2001.
[13] Alberto Medina, Anukool Lakhina, Ibrahim Matta, and John Byers, “BRITE: An approach to universal topology generation,” in Proc. of MASCOTS’01, Cincinnati, Ohio, USA, August 2001.
[14] Marc-Antoine Weisser and Joanna Tomasik, “Inferring inter-domain relationships in randomly generated network models,” in Proc. of EuroNGI Workshop, Torino, Italy, June 2006.
[15] H. Chang, R. Govindan, S. Jamin, S. Shenker, and W. Willinger, “Towards capturing representative AS-level Internet topologies,” Computer Networks Journal, vol. 44, no. 6, pp. 737–755, April 2004.
[16] Hongsuda Tangmunarunkit, John Doyle, Ramesh Govindan, Walter Willinger, Sugih Jamin, and Scott Shenker, “Does AS size determine degree in AS topology?,” SIGCOMM Comput. Commun. Rev., vol. 31, no. 5, pp. 7–8, 2001.
