A Realistic Model to Evaluate Routing Algorithms in the Internet

C. Casetti, R. Lo Cigno, M. Mellia, M. Munafò
Dipartimento di Elettronica, Politecnico di Torino, Italy

Z. Zsóka
Technical University of Budapest, Hungary

Abstract: This paper addresses the problem of evaluating routing algorithms via simulation in packet-switched networks when elastic traffic is involved. It highlights some deficiencies of classic approaches that fail to capture both the complex interactions of connections traversing multiple bottlenecks and common user behaviors. The paper describes an approach devised to overcome these limitations, which is particularly suited for the evaluation of routing algorithms in the presence of best-effort traffic. The simulation results presented offer a deeper insight into well-known routing algorithms. The analysis shows that both the quantitative and the qualitative behavior of dynamic routing algorithms based on traffic measurements may differ considerably depending on the nature of the traffic loading the network, and on its interactions with the network parameters and behavior.

I. MOTIVATION & INTRODUCTION

Routing has traditionally been an active research field in both circuit- and packet-switched telecommunication networks. Recently, the need for differentiating services, and most of all the provision of Quality of Service (QoS) in the Internet, spawned a burst of work concerning new, QoS-based, dynamic routing algorithms suitable for implementation on IP networks. Results obtained from this research effort can be found in [1], [2], [3], [4], [5], [6], though these are only some examples.

Usually, performance evaluation of novel routing algorithms is done using simulation and, in general, it is performed at the connection level, i.e., the transmission of data is not explicitly simulated. Connections are generated following any suitable arrival process, and their lifetime is pre-determined, typically as a random variable drawn either from a distribution or from a set of measured durations. Connections may be rejected if not enough resources are available.

This modeling process and simulation technique is perfectly suitable for circuit-switched, telephone-like networks: connections are set up, requiring a fixed bandwidth, and consume a fixed amount of resources for a limited period of time. The traffic model of the Internet, however, is based on a best-effort service model, where no bandwidth guarantees are provisioned: thus the duration of a connection depends on the amount of data to be transferred and on the (dynamically changing) congestion level of the network. The circuit-switched approach is a Time-Based one, while the Internet model is Data-Based.

However, performance evaluation of complex networks under Data-Based traffic is not straightforward. A possible solution is packet-level, instead of connection-level, simulation, but this results in a much more complex model, whose simulation is generally too expensive to obtain results for any reasonably realistic network.
In this paper we propose a novel, more realistic method for the evaluation of routing algorithms for elastic traffic. It is based on the computation of the connection duration starting from the amount of data to be transferred: the connection duration depends on the amount of resources available in the network during the lifetime of the connection. Thus, the closing instant of the connection is not computed a priori when the connection is admitted into the network, but it is dynamically evaluated depending on the actual availability of transmission resources, computed by applying a max-min fairness criterion.

(This work was partially supported by the Italian Ministry for University and Scientific Research (MURST) through the PLANET-IP Project.)

In addition, we introduce the concept of connection starvation, another typical feature of the Internet, especially related to Web browsing. It is well known that there are no connection admission control mechanisms in current TCP/IP networks. Therefore, the number of connections using the network at the same time is virtually unlimited. As a result, TCP packets might be lost due to congestion, leading to frequent retransmissions and consequently longer transfer completion times; similarly, UDP packets belonging to a streaming multimedia connection could be dropped, leading to unacceptably poor playback quality. In either case, users will abort the connection, maybe immediately starting a new one, as happens, for instance, when hitting the ‘reload’ button in Web browsers. This behavior can seriously affect the network performance, since the network spends effort to transfer information which might turn out to be useless. Furthermore, resources devoted to aborted connections are unnecessarily taken away from other connections. To emulate the sudden closure of starved connections, we define a connection as starved when the amount of bandwidth resources it receives falls below a given threshold.

The elastic traffic model, together with the run-time starvation detection, shows that the evaluation of routing strategies based on inadequate (or too simplistic) models, such as the Time-Based one, is prone to gross approximations and even mistakes. We stress that the focus of the paper is not on routing algorithms (indeed, we do not propose any new routing algorithm), but on traffic modeling and evaluation procedures. The routing evaluation method we propose here can be applied to any routing algorithm on any network topology, provided that the dominant traffic has an elastic behavior, as is presently the case in the Internet.

The remaining part of the paper is organized as follows.
Section II describes the two traffic models we consider, with particular attention to the method we use to emulate elastic traffic. Section III describes the tool and the routing algorithms we use to show the differences between the traditional evaluation method and the novel one proposed here. Section IV presents the results and discusses the reasons why traditional models fail to represent the actual behavior. Section V discusses the findings and concludes the paper.

II. MODELING ELASTIC TRAFFIC CONNECTIONS

A. The Traditional Time-Based Model

The most common traffic model used in high-level simulation of communication networks is Time-Based. In this model, connections are described by their duration (holding time) and by their bandwidth requirements. For example, in telephone networks, connections require a CBR (Constant Bit Rate) service; this piece of information, complemented by the connection interarrival time and the required bandwidth, is sufficient both to determine the traffic intensity produced by the connection generator and to completely characterize the connection from the network point of view. The network can then allocate enough resources to possibly guarantee the required QoS. A similar approach is valid for variable rate connections. Any given CAC (Connection Admission Control) scheme can be easily implemented in simulation to control the network load and to satisfy the QoS requirements.

B. The Novel Data-Based Model

The Time-Based model fails when we try to apply it to the typical data exchange in the Internet, such as data downloads from the Web or FTP transfers. In all these situations the objects of communication are files, and the communication ends when the last bit of the file has been acknowledged. Even if we declare a maximum communication rate (that emulates the limitations introduced by the application, the transport protocol, or the access network), it is of little help, because it only sets the lower bound for the connection duration. Indeed, the actual time required to successfully end the data transfer depends on many factors, and mainly on the bandwidth available while the connection is active.

In our Data-Based traffic model, the connection lasts until all the data associated with the connection is transmitted. Since in the connection-level simulation approach packet transmission is not simulated, we need to dynamically estimate the time at which the transmission ends. Moreover, best-effort connections generally use all their share of available bandwidth (up to their maximum transmission rate); the connection duration can then be determined by monitoring the instantaneous bandwidth allocated to the connection. Under the simplifying assumption that the share of bandwidth available on a path changes only when connections that share a link of that path start or end, we need to update the ending time only upon these occurrences. Thus, we need an estimate of the current available bandwidth each connection can use. This can be obtained by running a max-min fair share algorithm, which requires a full recalculation of the bit rate of all connections currently routed in the network. This provides an upper bound to the performance, since the max-min fair share represents an ideal working situation for any congestion control protocol that aims at equally dividing resources among users, as the TCP protocol tries to achieve.
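As an illustration of the rate recalculation described above, the following is a minimal sketch of a max-min fair share computation by progressive filling, followed by the re-estimation of a connection's ending time. It is not the ANCLES implementation: the function names, the dictionary-based data layout, and the modeling of the maximum rate as a private virtual link are assumptions made only for this example.

```python
def max_min_fair_share(capacity, paths, max_rate):
    """Progressive-filling max-min fair allocation (illustrative sketch).

    capacity: dict mapping link name -> capacity (e.g., Mbit/s)
    paths:    dict mapping connection id -> list of link names it crosses
    max_rate: per-connection rate cap (the maximum rate in the text)
    Returns a dict mapping connection id -> allocated rate.
    """
    residual = dict(capacity)
    routes = {}
    for conn, links in paths.items():
        # Assumption: the rate cap is modeled as a private virtual link.
        cap_link = ("cap", conn)
        residual[cap_link] = max_rate
        routes[conn] = list(links) + [cap_link]

    rates = {}
    active = set(routes)
    while active:
        # Group the connections that are not yet frozen by link.
        users = {}
        for conn in active:
            for link in routes[conn]:
                users.setdefault(link, []).append(conn)
        # The tightest link offers the smallest equal share.
        bottleneck = min(users, key=lambda l: residual[l] / len(users[l]))
        share = residual[bottleneck] / len(users[bottleneck])
        # Freeze its connections at that share and release the bandwidth
        # they consume on every link of their route.
        for conn in users[bottleneck]:
            rates[conn] = share
            active.discard(conn)
            for link in routes[conn]:
                residual[link] -= share
    return rates


def projected_end_time(now, remaining_bits, rate_bps):
    """Ending time if the current rate held until the transfer completes."""
    return now + remaining_bits / rate_bps


if __name__ == "__main__":
    example = max_min_fair_share(
        {"a": 1.5, "b": 10.0},
        {"c1": ["a"], "c2": ["a", "b"], "c3": ["b"]},
        max_rate=1.0,
    )
    for conn, rate in sorted(example.items()):
        print(conn, rate)
```

In a connection-level simulator, a recalculation of this kind would be triggered at every connection start or end; each affected connection would first have its remaining data reduced by the amount sent at the old rate, and its ending event would then be rescheduled with `projected_end_time`.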
Moreover, to model the starvation effect, i.e., users that abort the data transfer due to poor performance, we introduce a starvation threshold Bt that is used to identify starved connections: if the current per-connection bit rate estimate on a bottleneck link drops below Bt, then a connection is randomly picked on that bottleneck and terminated. This is repeated until the sending rate of the starved connections rises above the threshold. This allows us to define the starvation probability Ps as the ratio between the connections that are prematurely aborted and the total number of connections that entered the network.

III. ROUTING ALGORITHMS & SIMULATION TOOL

To compare the two methodologies, we extended ANCLES [7], a connection-level simulator that was previously developed at the Politecnico di Torino. Originally conceived for ATM networks, ANCLES gradually evolved into a generic connection-level simulator, where traffic sources request connections and the network performs all the actions required to manage them. The reader interested in the simulation tool is referred to [8].

As regards routing algorithms, besides the static, hop-count based algorithms, which are unable to cope with the variation of available bandwidth in the network, ANCLES implements several dynamic, traffic-driven routing algorithms, like those proposed in [1], [2], [9]. By considering the available bandwidth of each link in the path lookup procedure, these algorithms can offer a better choice for the routing of connections with QoS requirements. Here, we briefly describe the algorithms that will be used in Section IV to assess the difference between the traditional evaluation method and the one proposed in this paper.

• Shortest-Path (SP): for each source-destination pair, the algorithm determines the path with the minimum hop count and routes flows over that path. This is the routing algorithm commonly used in the current Internet.
• Widest-Shortest (WS): for each source-destination pair, the algorithm determines the path with the minimum hop count; if more than one such path exists, it breaks the tie by choosing the one with the largest available bandwidth [1].
• Minimum-Distance (MD): for each source-destination pair, the path P is chosen which minimizes the quantity $D(P) = \sum_{l \in P} \frac{1}{b_l}$, where $b_l$ is the max-min fair bandwidth that is available to a new connection over link l belonging to path P [2].

The advertising of updated QoS parameters, such as the currently available bandwidth, among network routers is assumed to occur every s seconds. However, in the simulations presented here we chose an “ideal” instantaneous update, in order to avoid complicating the interpretation of results with the problem of stale routing information [5].
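The three selection rules above can be sketched as follows. This is only an illustrative sketch, not the ANCLES code: it assumes that the candidate paths between a source and a destination are already enumerated as tuples of link names, that `avail` maps each link to a strictly positive available bandwidth, and that the "largest available bandwidth" of a path means its bottleneck (minimum) link bandwidth. All names and numbers are hypothetical.

```python
def shortest_path(paths):
    """SP: the candidate path with the minimum hop count."""
    return min(paths, key=len)


def widest_shortest(paths, avail):
    """WS: among the minimum-hop paths, the one with the widest bottleneck."""
    min_hops = min(len(p) for p in paths)
    shortest = [p for p in paths if len(p) == min_hops]
    # Assumption: path width = smallest available bandwidth along the path.
    return max(shortest, key=lambda p: min(avail[link] for link in p))


def minimum_distance(paths, avail):
    """MD: the path minimizing D(P) = sum over links of 1 / b_l."""
    return min(paths, key=lambda p: sum(1.0 / avail[link] for link in p))


if __name__ == "__main__":
    # Hypothetical available bandwidth per link, in Mbit/s.
    avail = {"s-a": 2.0, "a-d": 8.0,
             "s-b": 5.0, "b-d": 5.0,
             "s-c": 10.0, "c-e": 10.0, "e-d": 10.0}
    candidates = [("s-a", "a-d"), ("s-b", "b-d"), ("s-c", "c-e", "e-d")]
    print("SP:", shortest_path(candidates))
    print("WS:", widest_shortest(candidates, avail))
    print("MD:", minimum_distance(candidates, avail))
```

With these hypothetical values, SP keeps the first two-hop path, WS prefers the two-hop path with the wider bottleneck, and MD selects the three-hop path because its lightly loaded links give a smaller D(P). Choosing a longer path in this way is what may waste bandwidth when the network is overloaded.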

IV. SIMULATION RESULTS AND METHOD COMPARISON

In this Section we report the simulation results obtained running ANCLES. In order to decouple performance comparisons from a particular topology, the network topology we selected was randomly generated using the GT-ITM software [10]. The resulting topology comprises 32 nodes, with an average connectivity degree of 4. Every link has the same capacity C = 10 Mbit/s, and there is a best-effort traffic source generator connected to each node, trying to set up connections with a maximum bandwidth BM of 1 Mbit/s; each connection requires a bulk data transfer whose size SD is randomly chosen from an exponential distribution with average 20 kbytes. A uniform traffic pattern is simulated. The holding time Ht of Time-Based connections is computed from the information amount and the maximum required bandwidth, $H_t = (8 \cdot S_D) / B_M$. For both the Data-Based and Time-Based models the simulator computed the average throughput obtained by connections during their lifetime, which is used as a means of comparison between different routing algorithms.

The load offered to the network is expressed as calls/s per generator, hence the global load offered to the network can be obtained by multiplying this number by 32. For instance, an offered load of 10 calls/s corresponds to a nominal network load of $10 \cdot 32 \cdot 8 \cdot 20\ \text{kbit/s} = 51.2\ \text{Mbit/s}$; the actual throughput carried by the network depends on the routing algorithm and the source model.

In the Data-Based model, we present simulations with the starvation threshold Bt set to 100 kbit/s, 50 kbit/s or 0 kbit/s. In the Time-Based model, instead, we use the Bt parameter to run a CAC algorithm: if the available bit rate on the selected path is larger than the threshold Bt, the connection is accepted; otherwise, the connection is refused.
This connection admission control determines a blocking probability Pb, which is defined as the ratio between the connections that are refused and the number of requested connections. As performance indices, we report results for the starvation probability Ps, the blocking probability Pb, but most of all the average bandwidth per connection Bw, defined as the bandwidth that connections obtain during their lifetime, averaged among all the source/destination pairs. Only connections that successfully end are taken into account. In the Data-Based scenario, we also report results for the dilatation factor Df, i.e., the ratio between the average completion time of a connection and its minimum completion time (computed using the maximum bandwidth theoretically available to the source, i.e., on its access link). All results are reported versus the average network offered load, measured as the number of connections per second generated by each source. To get accurate results, each simulation was ended when the performance indices were such that the 95% confidence interval was within 5% of the point estimate.
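The starvation rule of Section II-B and the indices defined above can be illustrated with a small sketch. It makes strong simplifying assumptions that are not part of the model in the paper: a single bottleneck link, and an equal split of its capacity among the connections crossing it instead of the full max-min computation. The function names and the numbers in the example are hypothetical.

```python
import random


def abort_starved(connections, capacity, bt, rng):
    """Terminate randomly picked connections on one bottleneck link
    until the equal per-connection share is no longer below Bt.

    Simplification (assumption): every connection on the link receives
    capacity / n; the paper uses a max-min fair estimate instead.
    Returns (surviving, aborted).
    """
    surviving = list(connections)
    aborted = []
    while surviving and capacity / len(surviving) < bt:
        victim = rng.choice(surviving)
        surviving.remove(victim)
        aborted.append(victim)
    return surviving, aborted


def starvation_probability(n_aborted, n_entered):
    """Ps: aborted connections over connections that entered the network."""
    return n_aborted / n_entered if n_entered else 0.0


def max_dilatation_factor(bm, bt):
    """Upper bound on Df for a connection that is never starved:
    the slowest tolerated rate is Bt, the fastest possible rate is BM."""
    return bm / bt


if __name__ == "__main__":
    rng = random.Random(1)
    # 120 connections on a 10 Mbit/s link, Bt = 100 kbit/s (rates in kbit/s).
    alive, dead = abort_starved(range(120), 10_000, 100, rng)
    print(len(alive), len(dead))
    print(starvation_probability(len(dead), 120))
    print(max_dilatation_factor(1_000, 100))
```

With Bt set to 0 the loop never aborts a connection, which matches the case without starvation detection. The bound returned by `max_dilatation_factor` for BM = 1 Mbit/s and Bt = 100 kbit/s is the value of 10 mentioned later in the discussion of the dilatation factor.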

Fig. 1. Shortest Path: average bandwidth per connection with different starvation/blocking thresholds Bt for the Data-Based and Time-Based models. [Plot: Bw [kb/s] versus Offered Load [call/s]; Time-Based and Data-Based curves for Bt = 0, 50 kb/s and 100 kb/s.]

Fig. 2. Average throughput gain of MD and WS with respect to SP for the Time- and Data-Based models. [Plot: throughput gain η versus Offered Load [call/s]; Time-Based and Data-Based panels, each with MD and WS curves.]

A. Shortest-Path performance evaluation

Figure 1 presents a comparison of the source average bandwidth Bw obtained by modeling best-effort connections using the classic Time-Based approach (dotted lines) and the results obtained with the novel Data-Based connection model (solid lines), when the SP algorithm is selected. The difference in the performance results of the two approaches is striking. While both approaches show Bw starting from 1 Mb/s when the offered load is low, i.e., when no congestion is registered in the network, they differ drastically when the offered load starts increasing. Indeed, the new Data-Based model shows a drastic decrease in Bw as soon as the offered load increases. The performance tends to the threshold Bt, which acts as a lower bound on this performance index. On the contrary, the Time-Based approach shows a smoother decrease of the average bandwidth, and a smaller dependency on the threshold Bt.

This is due to two factors. First, in the Data-Based approach, when congestion arises in the network, connections are throttled. This stretches out the duration of the connections, which require more time to be successfully ended, thus spreading congestion over time. A sort of snowball effect occurs, as the number of connections suddenly increases on a bottleneck link, reducing the fair share of bandwidth. Second, when the available bandwidth eventually goes below the starvation threshold, connections begin to be aborted. This results in a waste of bandwidth. On the contrary, in the Time-Based model, a connection is blocked a priori, and thus no waste of bandwidth occurs. Moreover, since the connection holding time is defined a priori, no snowball effect on congestion is possible.

The behavior difference highlighted in Fig. 1 is so striking that one may question the validity of the results. Indeed, we must consider that the traditional evaluation method used, to the best of our knowledge, in all routing studies completely disregards the adaptivity properties of elastic traffic. In the Internet, elastic traffic is mostly carried over TCP. Adaptivity is embedded in the congestion control algorithm of the TCP protocol, which is a closed-loop protocol with implicit feedback from the network. The modeling technique proposed in this paper does not address the details of the TCP protocol, but tries to partially capture its behavior. For instance, it captures the reaction to network congestion, which spreads the connection over time and results in congestion periods that last longer than what can be estimated with a model devoid of any “feedback” feature. The aim of TCP congestion control is sharing network resources following max-min fairness, and this is exactly the “adaptivity” criterion we use. Indeed, some authors [11] have recently suggested that feedback phenomena, making the load offered to the network dependent on the network status and parameters, might be responsible for the LRD (long range dependent) behavior of Internet traffic. If we look at Fig. 1 keeping in mind that the upper curves are obtained with a model where the network load is independent of the network condition, while the lower ones stem from a model where the offered load depends on the network feedback, then the differences are probably somewhat justified.

B. Routing algorithm performance comparison

In this Subsection, we present a set of results that aims to show the differences in the performance obtained with the two connection models when QoS routing algorithms are adopted in the network. Generally, when comparing routing algorithm performance, we are interested in the relative merit with respect to well-established algorithms, such as the Shortest-Path (SP). Thus, as performance index, we selected the relative throughput gain η obtained using a QoS-aware routing algorithm with respect to SP, i.e., $\eta(\mathrm{algo}) = B_w(\mathrm{algo}) / B_w(\mathrm{SP})$, where $B_w(\mathrm{algo})$ is the average throughput measured using the algo routing.

Figure 2 presents the plot of the above defined routing gain obtained with the classical Time-Based model at the top, and with the new Data-Based model at the bottom. We report results for a scenario where the threshold Bt is set to 100 kbit/s. Throughput results are comparable to previous studies (e.g., [2], [9]): the MD and WS algorithms outperform SP routing on networks with relatively low load, as they manage to exploit the spare bandwidth that is present on lightly loaded links; instead, they provide a worse performance when the network becomes overloaded, because of the waste of bandwidth that occurs when a longer path is selected.

However, it must be noted that the performance in the two approaches is quite different. On the one hand, the classical approach shows a much wider range of offered load where the dynamic routing algorithms outperform the SP algorithm, although the gain is never larger than 15%. On the other hand, the novel and more realistic approach shows that the load range where the dynamic algorithms perform better than the SP is much smaller. Moreover, in this range, the maximum obtained gain is much larger (about 50%), but the transition to the overloaded region where the SP performs better is much sharper.
This behavior, too, can be ascribed to the closed-loop nature implicit in elastic traffic. As already noted, when the network is congested, connections are throttled; they thus remain in the network for a longer time, spreading the congestion in time and creating a positive, destabilizing feedback that explains the sharper transition.

To provide a deeper insight, Figure 3 plots the Starvation Probability and the Blocking Probability. Although the two performance indices are completely different and not comparable, they suggest that the waste of bandwidth caused by the starved connections can be rather high, as the starvation probability grows to large values as soon as the offered load is larger than 3 connections per second. Two comments are in order here. First, it seems completely inappropriate to approximate the starvation phenomenon with a blocking probability. Second, correlating Figs. 2 and 3, it seems clear that the peak throughput gain of MD and WS with respect to SP coincides with the network load where connections begin to starve if SP is used, but still receive a satisfactory service if MD or WS is used. We can see that the starvation probability is smaller when either the WS or the MD algorithm is used. This suggests that the network bears a larger number of simultaneous connections, each one obtaining a smaller throughput, but still higher than its starvation limit.

Fig. 3. Starvation probability for the Data-Based model, and Blocking Probability for the Time-Based approach. [Plot: starvation/blocking probability versus Offered Load [call/s]; Data-Based and Time-Based curves for SP, MD and WS.]

Figure 4 plots the average dilatation factor for the different routing algorithms, which is a performance figure that can be defined only in the Data-Based model. It can be noted that the performance of the WS algorithm degrades very quickly as soon as the offered load is higher than about 4 connections per second, while the MD algorithm performs better than the SP algorithm up to 8 connections per second. This last plot shows that a routing algorithm like the WS exhibits a sudden transition from the normal operation region, where connections end in reasonable time, to the congestion region, identified by connection durations nearly ten times larger (the dilatation factor cannot be larger than 10, given the starvation threshold we used).

V. DISCUSSION & CONCLUSIONS

The evaluation of the potential gain offered by new routing algorithms is, in the vast majority of cases, based on call-level simulations, because analytical tools are not available, packet-based simulations are too complex and costly, and testbeds are often not available.
Mainly due to the legacy of circuit-oriented networks, the synthetic workload fed to the network is modeled by connections whose holding time is defined (typically with a random choice) when the connection is generated. This approach fails to account for most characteristics that make the Internet different from traditional networks. This paper addressed three characteristics that we deem to have a deep impact on the performance of routing algorithms:

• the holding time of Internet connections is typically based on the amount of data to be transferred and hence depends on network conditions;
• the elastic nature of the Internet traffic implies a closed loop between network and sources: the connection completion time depends on the instantaneous load of the network and, hence, cannot be computed a priori when the connection is generated;
• the Internet does not provide CAC functions, but connections are generally closed when the perceived quality is too poor.

Fig. 4. Data-Based model: dilatation factor versus offered load. [Plot: Df versus Offered Load [call/s]; curves for SP, MD and WS.]

A novel model for the generation of synthetic load that takes into account the three points above was introduced and discussed. The new model allows the evaluation of networks where connections have to transfer a given amount of data and dynamically adapt their sending rate to network congestion, as is normally the case on the Internet, where users frequently abort a transfer on perceiving too poor a performance.

The presented results show that the new performance evaluation methodology can help in understanding different QoS routing algorithms. Besides, the comparison with traditional evaluation methods (where connections are time-limited) highlights significant differences, hinting that traditional models might fail to capture the relevant characteristics of current data networks.

This work is a starting point to better understand the relationship between dynamic routing algorithms and the congestion control algorithms (typically TCP), which aim to distribute resources to connections following a max-min criterion. The striking difference in the performance of different routing algorithms as a function of the modeling and evaluation technique is a clear warning that connection-level models that better represent the Internet characteristics are needed to study QoS routing algorithms and the protocols that support them.

REFERENCES

[1] Z. Wang, J. Crowcroft, “QoS Routing for Supporting Resource Reservation”, IEEE JSAC 14(7):1288–1294, Sept. 1996.
[2] Q. Ma, P. Steenkiste, “Routing Traffic with Quality-of-Service Guarantees in Integrated Services Networks”, 8th IEEE/ACM International Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV’98), England, July 1998.
[3] Q. Ma, P. Steenkiste, H. Zhang, “Routing High-Bandwidth Traffic in Max-Min Fair Share Networks”, in Proceedings of the ACM SIGCOMM’96, pages 206–217, Stanford, CA, USA, Aug. 1996.
[4] D. Cavendish, M. Gerla, “Internet QoS Routing Using the Bellman-Ford Algorithm”, in Proceedings of the Conference on High Performance Networking (HPN98), IFIP, Vienna, Austria, 1998.
[5] G. Apostolopoulos, R. Guérin, S. Kamat, S. K. Tripathi, “Quality of Service Based Routing: A Performance Perspective”, ACM SIGCOMM’98, Vancouver, Canada, Sept. 1998.
[6] G. Apostolopoulos, D. Williams, S. Kamat, R. Guerin, A. Orda, T. Przygienda, “QoS Routing Mechanisms and OSPF Extensions”, RFC 2676, IETF, Aug. 1999.
[7] M. Ajmone Marsan, A. Bianco, C. Casetti, C. F. Chiasserini, A. Francini, R. Lo Cigno, M. Munafò, “An Integrated Simulation Environment for the Analysis of ATM Networks at Multiple Time Scales”, Computer Networks and ISDN Systems, Special Issue on Modeling of Wired and Wireless ATM Networks, Vol. 29, No. 17-18, Feb. 1998, pp. 2165–2185.
[8] ANCLES: A Network Call-Level Simulator. URL: http://www.tlc-networks.polito.it/ancles
[9] C. Casetti, G. Favalessa, M. Mellia, M. Munafò, “An Adaptive Routing Algorithm for Best-effort Traffic in Integrated-Services Networks”, IEE ITC-16, Edinburgh, UK, June 1999.
[10] GT-ITM: Georgia Tech Internetwork Topology Models. URL: http://www.cc.gatech.edu/project/gtitm
[11] A. Arvidsson, P. Karlsson, “On the Traffic Models for TCP/IP”, 16th International Teletraffic Congress (ITC-16), Edinburgh, UK, June 1999.
