An Investigation into the effects of non-dynamic routing algorithms and the resultant congestion on the Internet

Kevin Curran, Derek Woods, Nadene McDermot
Internet Technologies Research Group, School of Computing and Intelligent Systems
University of Ulster, Magee Campus, Northland Road, Northern Ireland, BT48 7JL, UK
Email: [email protected]

Phone: +44 (028) 7137 5565

Fax: +44 (028) 7137 5470

Abstract: The Internet is an interconnected mesh of participating IP devices forwarding information worldwide. The speed and volume with which this information travels can vary due to limitations in channel capacity, the type and size of media, and device capacity. The primary devices that handle routing decisions on the Internet backbone are gateways. The efficiency of gateways at forwarding traffic can be observed through the use of trace packets. These trace packets are sent towards a destination and returned to the sender with the entire round-trip time recorded; the times of arrival of these packets at each intervening gateway along the route are also recorded. From an exhaustive series of trace packets to a diverse set of destinations, our research has found that specific routers are the cause of bottlenecks in the Internet, and that packets took the same route each time towards their destination. Over periods as long as seven days, these routers continued to cause bottlenecks with no re-routing of packets to alleviate congestion. This research raises the question of why these bottlenecks occur at the same places, and for so long a period, and also queries the extent to which dynamic routing algorithms have been implemented.

Keywords: Internet Congestion, Dynamic Routing Algorithms, Quality of Service

1. Introduction

Internet traffic is essentially the load on the network: the number of packets of data traversing the network. When the Internet was first developed, it was never envisaged as a public service, so growth in network traffic was no real cause for concern. However, Internet traffic has since grown at an alarming rate: "When seen over the decade of the 1990s, traffic appears to be doubling about once each year" [Odlyzko00]. Indeed, this growth rate has posed severe problems for measuring traffic, which is essential to the understanding of Internet congestion. A common complaint about traffic measurement studies is that they do not sustain relevance in an environment where traffic, technology, and topology change faster than we can measure them. Future proposals on Internet congestion can therefore only estimate the actual load on the network at any particular time. In fact, no single organisation is truly measuring global Internet behaviour, because the global Internet is simply not instrumented to allow such measurement [Murray01]. Internet traffic in general is said to be "bursty": bursts of traffic are transmitted rather than a steady flow.

The most serious effects of congestion are seen in the form of congestion collapse. This is a condition that occurs when "a sudden load on the net can cause the round-trip time to rise faster than the sending hosts' measurements of round-trip time can be updated" [Nagle84]. Under these conditions the network can come to a complete standstill. "Informally, congestion collapse occurs when an increase in the network load results in a decrease in the useful work done by the network" [Floyd00]. Due to this increase in network traffic and the pressure it puts on the available bandwidth, the fairness of the Internet is also at risk. Fairness means that no user is penalised compared to others that share the same bottleneck links. This type of fairness has long been a measure of a congestion control mechanism's worth to the Internet; as a public service, all users should be treated equally. Unfairness in the Internet happens when congestion control techniques are not implemented, such as the AIMD method of control, which will be discussed later in this paper. Flows that do not "back off" when faced with congestion can greedily utilise all of the available bandwidth, leaving other flows fighting for their share.
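One common way to quantify this notion of fairness, not used in the original study but standard in the congestion control literature, is Jain's fairness index. The short Python sketch below (with made-up throughput figures) shows how a single flow that never backs off drags the index down.

def jains_fairness_index(throughputs):
    """Jain's index: (sum x)^2 / (n * sum x^2). Equals 1.0 when all
    flows get an equal share and tends towards 1/n as one flow
    monopolises the bottleneck."""
    n = len(throughputs)
    total = sum(throughputs)
    return total * total / (n * sum(x * x for x in throughputs))

# Hypothetical throughputs (Mbps) for four flows sharing a bottleneck.
print(jains_fairness_index([10, 10, 10, 10]))  # 1.0  - all flows back off fairly
print(jains_fairness_index([37, 1, 1, 1]))     # ~0.29 - one flow never backs off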

1.1. Router Congestion Control

Routers have already been identified in this paper as one of the key pieces of Internet architecture, so it is important to consider the controls in place to manage congestion at this point in the transfer of packets across the Internet. [Lefelhocz96] proposed a paradigm for maintaining a fair routing system for Internet usage. The proposed design incorporated four controls for congestion management, which are widely implemented in the Internet today: scheduling algorithms, buffer management, feedback and end adjustment. The two most popular scheduling algorithms are FIFO (First In First Out), which forwards packets according to their place in the queue, and WFQ (Weighted Fair Queuing), which attempts to allocate the available bandwidth fairly, thus protecting flows from unfairness on the part of others (see the sketch below). The authors feel that some form of scheduling algorithm must be implemented to prevent bandwidth being "swallowed up" by greedy users; however, this alone will not prevent packets from being dropped, so other measures must be implemented simultaneously. Buffering is required at a switch whenever packets arrive faster than they can be sent out, and should take one of two forms. The shared buffer pool method does not protect flows from one another, but forwards packets on a first-come, first-served basis. The per-flow allocation method, on the other hand, is fairer: each flow is forwarded on merit, and "well behaved" flows are serviced first.

Routers manage traffic flow using one of two methods: static routing or dynamic routing. Static routing is best suited to small networks, where the topology and state of the network are known to be relatively stable. The routes taken by packets of data are hard-coded manually by the network engineer and do not change unless the engineer becomes aware of changes within the network, such as a change to the network topology. Dynamic routing, on the other hand, is used where the topology of the network is unknown and the route taken by packets of data is uncertain at the time of sending. Dynamic protocols calculate the best path for the packets to take, considering many parameters, and are used to discover the best paths for Internet traffic on a daily basis.
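As a rough illustration (not from the original paper), the following Python sketch contrasts a FIFO queue with a very simplified weighted round-robin scheduler standing in for WFQ. The flow names, weights and packet labels are invented for the example.

from collections import deque

# Two flows sharing one outgoing link; the "greedy" flow has queued
# far more packets than the "polite" one (labels invented).
flows = {"greedy": deque(f"g{i}" for i in range(6)),
         "polite": deque(["p0", "p1"])}

def fifo_order(arrivals):
    """FIFO: one shared queue served strictly in arrival order, so a
    burst from the greedy flow delays everything behind it."""
    queue = deque(arrivals)
    return [queue.popleft() for _ in range(len(queue))]

def weighted_round_robin(flows, weights):
    """A crude stand-in for WFQ: serve each flow in proportion to its
    weight, so no single flow can monopolise the link."""
    order = []
    while any(flows.values()):
        for flow_id, weight in weights.items():
            for _ in range(weight):
                if flows[flow_id]:
                    order.append(flows[flow_id].popleft())
    return order

print(fifo_order(["g0", "g1", "g2", "g3", "p0", "p1"]))
# ['g0', 'g1', 'g2', 'g3', 'p0', 'p1'] - the polite flow waits out the burst
print(weighted_round_robin(flows, {"greedy": 1, "polite": 1}))
# ['g0', 'p0', 'g1', 'p1', 'g2', 'g3', 'g4', 'g5'] - the polite flow is protected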

2. Gateway Protocols

Once dynamic routing has been chosen as the preferred method of path selection, the actual protocol that will implement the metrics and perform the calculations must be considered. There are two main classes of gateway protocols: interior gateway protocols and exterior gateway protocols. Interior gateway protocols are used within autonomous systems, that is, sets of routers under a single technical administration [Rekhter95]. Exterior gateway protocols are used to communicate between autonomous systems, providing the links between different networks so that they can work together as a single unit. BGP v4 is the main routing protocol used in the Internet today. It is a border gateway protocol designed for networks that implement TCP/IP. BGP is an exterior gateway protocol (EGP), meaning that it performs routing between multiple autonomous systems or domains and exchanges routing and reachability information with other BGP systems. The protocol exchanges this information through BGP speakers communicating with their directly neighbouring routers; the update messages can contain information such as withdrawn routes and network reachability information. The BGP v4 protocol does not usually take the load on the network into account when choosing the best path: the load changes constantly as users go online and offline, and tracking it would generate so many routing table updates that almost as much traffic could be created just to maintain the routing tables. The recent growth of the Internet has adversely affected the stability of routing in times of congestion. It has been shown that when a network is congested and packets are dropped, some of those packets may themselves carry the updated routing information for the network: congestion can hinder the propagation of routing information or peering refresh requests if the routing protocol messages are not isolated from data traffic [Shaikh00]. It would seem, then, that the very messages that provide routers with the information needed to stem the build-up of congestion are themselves at risk (with both OSPF and BGP displaying signs of unreliability under congested conditions). This has significant implications for routing and therefore for Internet stability.
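To make the point about load concrete, here is a minimal, hypothetical sketch of a BGP-style decision step; all route attributes (AS numbers, next hops, load figures) are invented for the example. The shorter AS path wins regardless of how loaded it is.

# Candidate routes to one prefix, each with an AS path and a measured
# first-hop load (attributes invented for the example). Simplified
# BGP-style selection prefers the shorter AS path and ignores load.
routes = [
    {"as_path": [65001, 65002, 65003], "next_hop": "10.0.0.1", "load": 0.95},
    {"as_path": [65001, 65004, 65005, 65006], "next_hop": "10.0.0.2", "load": 0.10},
]

def best_path(routes):
    """Shortest AS path wins, even when that path is congested."""
    return min(routes, key=lambda route: len(route["as_path"]))

print(best_path(routes)["next_hop"])  # 10.0.0.1 - the heavily loaded path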

3. Proposals for alleviating congestion

Various suggestions have recently been made to improve upon the current TCP congestion control methods. These include the Limited Transmit method, which it is hoped will reduce unnecessary retransmit timeouts, and a SACK-based mechanism for detecting and responding to unnecessary Fast Retransmits or Retransmit Timeouts [Bonald00]. Another proposed improvement to the TCP congestion control technique is "General AIMD Congestion Control" [Yang00], a window adjustment strategy in which the window increase and decrease values are used as parameters, depending on the ACKs received or the packets lost. [Mo98] proposes the widespread use of an updated version of TCP Vegas, claiming that its bandwidth estimation scheme makes it fairer with regard to packet loss than its counterparts, TCP Reno and Tahoe. Some of the most controversial proposals for alleviating Internet congestion involve pricing. As with any other public service, many believe that pricing the Internet would make end-users think twice about the amount of information they download, which in turn would help to decrease congestion [Odlyzko98].
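The general AIMD idea can be stated in a few lines. The sketch below is a simplification, with alpha and beta as the tunable increase and decrease parameters described above; it shows a congestion window growing additively while ACKs arrive and halving on loss.

def aimd_update(cwnd, loss, alpha=1.0, beta=0.5):
    """General AIMD: grow the window additively by `alpha` segments
    per round trip while ACKs arrive; cut it multiplicatively by
    `beta` on loss. alpha=1, beta=0.5 gives classic TCP behaviour."""
    return max(1.0, cwnd * beta) if loss else cwnd + alpha

cwnd = 1.0
for rtt, loss in enumerate([False] * 5 + [True] + [False] * 3):
    cwnd = aimd_update(cwnd, loss)
    print(f"RTT {rtt}: cwnd = {cwnd:.1f} segments{'  <- loss, back off' if loss else ''}")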

4. Experimental Set-Up

Internet congestion is best measured by observing the round-trip times and packet loss that occur when packets of data are sent to a chosen destination. In order to fully observe and analyse this congestion, it was necessary to trace the route of each packet on its journey. One of the simplest methods of tracing packets is a trace route utility. The utility chosen for this study is a "Ping" (Packet Internet Groper) program, which sends an echo request to the specified site and records the length of time the site takes to respond. For the purpose of this study, 12 web sites were chosen to provide an in-depth account of the congestion occurring. These sites were selected for their diversity of location and purpose on the web. The sites were:

Region             Websites
Northern Ireland   www.belfasttelegraph.com
Rep. of Ireland    www.aerlingus.com
United Kingdom     www.ucl.ac.uk, www.scotland.org, www.wales.gov.uk
United States      www.bankofamerica.com, www.newyorktimes.com
Asia               www.yahoo.co.jp
South Africa       www.southafrica.co.za
Australia          www.unisa.edu.au, www.csiro.au
Iceland            www.hi.is

Table 1 : Websites used for evaluation of Internet Congestion

The chosen period was five days, from 10am until 7pm. To get an accurate picture of the flow of traffic, it was essential that the sample sets were identical in every respect: the packet size was always 56 bytes, each sample set comprised 10 packets, and the results were recorded every hour in order to show the trend in the traffic flow. Due to space restrictions, we examine only the Newyorktimes.com trends.
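The hourly sampling described above could be automated along the following lines. This is a sketch, not the study's actual tooling: the site list is a subset of Table 1, the log file name is invented, and the ping flags shown are the common Unix (iputils) ones, so they would need adjusting per platform (Windows, for instance, uses -n and -l).

import csv
import datetime
import subprocess

SITES = ["www.newyorktimes.com", "www.hi.is"]  # subset of Table 1

def sample(site, count=10, size=56):
    """One sample set as in the study: 10 echo requests of 56 bytes.
    -c/-s are the common Unix iputils flags (Windows: -n/-l)."""
    result = subprocess.run(
        ["ping", "-c", str(count), "-s", str(size), site],
        capture_output=True, text=True, timeout=120)
    return result.stdout

with open("rtt_log.csv", "a", newline="") as log:
    writer = csv.writer(log)
    for site in SITES:
        writer.writerow([datetime.datetime.now().isoformat(), site, sample(site)])
# The study repeated this hourly from 10am to 7pm (e.g. via cron),
# keeping packet size and sample count identical across runs.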

4.1. Newyorktimes.com Evaluation

The trip to Newyorktimes.com took 17 hops along the same path each day, as illustrated in Figure 1. The average round-trip time greatly increased midweek, rising from 92.2ms on Day 1 to 333ms on Day 4. This difference suggests heavier traffic midweek, with more users connected. Hops 4 and 9 consistently caused spikes in the flow of traffic throughout the study. The owner of Hop 4 is unknown; JA.net owns the router at Hop 9. Several routers belonging to JA.net caused congestion elsewhere in this study; in fact, the very same router was highlighted as congested on the path to www.bankofamerica.com. There was a significant number of timeouts midweek, when traffic was at its heaviest; here the destination site was the most frequent offender, as Hop 17 showed timeouts on Days 4 and 5. The spikes at time spots 12pm and 2pm suggest that the traffic flow was heaviest at these times on Days 2 and 4 of the study, as illustrated in Figure 2.


Hop   DNS Name
1     --------------
2     --------------
3     cisj.icbb.ulst.ac.uk
4     --------------
5     belfast-bar.ja.net
6     pos14-1.lond-scr.ja.net
7     london-bar2.ja.net
8     us-gw2.ja.net
9     ny-pop.ja.net
10    if-8-2.core1.NewYork.Teleglobe.net
11    if-1-0-4.bb7.NewYork.Teleglobe.net
12    --------------
13    pos3-2-155M.cr2.JFK.gblx.net
14    pos1-0-622M.cr1.NYC2.gblx.net
15    pos0-0-2488M.hr8.LGA2.gblx.net
16    --------------
17    www.newyorktimes.com

Figure 1 : Days 1-5 (NewYorkTimes.com)

Day   Spikes            Timeouts             Time Spots
1     Hops 4, 9         -                    11am, 6pm
2     Hops 2, 6, 9      -                    12pm, 2pm
3     Hops 2, 4, 6, 9   -                    1pm, 4pm
4     Hops 4, 9, 12     12, 13, 14, 16, 17   12pm, 2pm
5     Hops 4, 9, 12     6, 7, 9, 11, 17      2pm, 3pm

Figure 2 : Identified Trends
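A table like Figure 2 can be derived mechanically from the hourly samples. The following sketch uses invented per-hop averages rather than the study's data, and an arbitrary threshold factor, to flag hops whose RTT increment stands well above the route's typical increment.

import statistics

def flag_spikes(hop_times_ms, factor=2.0):
    """Flag hops whose RTT increment over the previous hop is well
    above the route's median increment; `factor` is an arbitrary
    threshold chosen for illustration."""
    deltas = [max(0.0, b - a) for a, b in zip(hop_times_ms, hop_times_ms[1:])]
    typical = statistics.median(deltas) or 1.0
    return [i + 2 for i, d in enumerate(deltas) if d > factor * typical]

# Invented Day 1 per-hop averages (ms) for hops 1-17: big jumps at hops 4 and 9.
day1 = [1, 2, 3, 40, 42, 44, 46, 48, 95, 96, 97, 98, 99, 100, 101, 102, 103]
print(flag_spikes(day1))  # [4, 9]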

5. Evaluation of Results

As with previous studies [Bonald00], the location of the site greatly affected the number of hops. The furthest site, in Japan (Yahoo.co.jp), showed between 28 and 30 hops daily, whereas the nearest site to the Magee campus, in Belfast, showed an average of 12 hops. This was to be expected. It might then be assumed that a site in South Africa would show a great number of hops, when in fact the exact opposite was found: the average number of hops to southafrica.co.za was only 14. On closer inspection, however, the hop that transports the packet from the UK to South Africa increased the average round-trip time by 200 milliseconds, showing that one large hop can take as long to traverse as many smaller hops. Another surprising result was the behaviour of the chosen route when congestion was encountered: even when a particular router was displaying signs of congestion, the packets were not re-routed. The importance of the day-by-day analysis is also clearly evident from the results, as the flow of traffic midweek was more sporadic, with more timeouts and more packet loss. This too was expected, since Internet usage is higher midweek, and more connected users imply more traffic and therefore more congestion.

[Line chart omitted: time in milliseconds (0-140) against hop number (1-17) for Newyorktimes.com, Day 1, with Ave, Min and Max series.]

Figure 3 : Average, Min and Max Hop Times for NewYorkTimes.com

There is no evidence to suggest that the congestion highlighted by this study is a TCP problem. TCP assumes that packet loss is caused by router queue overflow, which supports the view that, while the TCP congestion controls are working efficiently, the underlying problems in this case lie at the routers. The most surprising element of the research results was the lack of route changes, even under congested conditions. Most of the sites analysed showed no route change at all from day to day, with re-routing of packets happening only twice in the five days of the study. The behaviour of the associated packets of data was therefore not dynamic, which leads to the assumption that dynamic routing techniques are not being implemented on a wide scale. This could have serious implications for the success of the Internet, as its inherent capacity could be significantly underused. It has already been noted in the literature review that the current Internet protocol of choice is BGP v4 and that, at present, this protocol does not usually consider link-load when selecting the best path for packets of data. However, link-load could be used as a parameter if it were viable to do so: although there are associated overheads (such as the escalation of congestion due to routing updates), these overheads could be controlled. The main reason this is not considered an option by the owners of these routers (i.e. the ISPs) is that there is no financial incentive to do so; to re-route traffic is to pass business on to the competition. This is graphically displayed in Figure 3 and Figure 4, where we can see that at each time from 10am to 7pm, routers 8 and 9 cause the highest delay with no apparent effort to re-route. From discussions with network engineers within large ISPs, we have come to believe that re-routing is not in the ISPs' interest: an ISP keeps traffic on a set route along its own routers, for which it can charge more. The authors feel that this could be a major reason for the amount of congestion occurring at present.
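If link-load were folded into path selection, the metric might look something like the hypothetical sketch below; the utilisation figures and the weighting are invented for illustration, and dampening the load term (or adding hysteresis before re-advertising) is one way the update-overhead concern above could be controlled.

def path_cost(hops, load_weight=2.0):
    """Hypothetical load-aware metric: hop count plus a weighted
    penalty for link utilisation. The weight would need tuning to
    keep routing-update overhead under control."""
    return len(hops) + load_weight * sum(hop["load"] for hop in hops)

# Invented utilisation figures: a short congested path vs a longer quiet one.
direct = [{"load": 0.90}, {"load": 0.95}]
detour = [{"load": 0.20}, {"load": 0.10}, {"load": 0.15}]
best = min((direct, detour), key=path_cost)
print("hops on chosen path:", len(best))  # 3 - the lightly loaded detour wins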

[Line chart omitted: "Newyorktimes.com : Day 1 - Behaviour at each time slot", time in milliseconds (0-140) for Hop1 through Hop17 at each time slot from 10am to 7pm.]

Figure 4 : Behaviour of routers at each time slot

Another observation from the research results is that, over the course of the five days of the study, routers under some individual DNS names consistently caused delays. For example, the ISP JA.net is responsible for some of the worst delays observed throughout our study. It could be that the owners of such routers have over-allocated their available bandwidth to their users. While this makes effective use of the bandwidth at off-peak times, at peak times, such as midweek lunchtimes (12pm-1pm), it could cause significant problems for the flow of traffic: with too many users competing for a finite resource, bottlenecks easily occur at these routers. It seems, then, that delays are caused by hardware limitations, and the responsibility for this must inevitably rest with the ISPs.
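Over-allocation is easy to quantify. The back-of-the-envelope sketch below uses entirely hypothetical figures (link capacity, subscriber count, per-user rate) to show how a link that is comfortable off-peak congests at a lunchtime peak.

# Entirely hypothetical figures: a 155 Mbps backbone link sold to
# 2000 subscribers at 512 kbps each.
link_capacity_mbps = 155
subscribers = 2000
sold_per_user_mbps = 0.512

ratio = subscribers * sold_per_user_mbps / link_capacity_mbps
print(f"Contention ratio: {ratio:.1f}:1")  # ~6.6:1

# Comfortable off-peak, congested at a midweek lunchtime peak.
for active_fraction in (0.10, 0.50):
    demand = subscribers * sold_per_user_mbps * active_fraction
    state = "OK" if demand <= link_capacity_mbps else "congested"
    print(f"{active_fraction:.0%} active -> {demand:.0f} Mbps ({state})")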

6. Conclusion

Through the research connected to this study, it has become apparent that the majority of Internet congestion occurs at the routers that make up the backbone of the Internet. Throughout this study, delays systematically occurred at the same routers. It would appear that nothing more than a simple look-up table strategy is being used to make routing decisions in the nodes observed. This has serious implications for all Internet users: if dynamic routing is not being widely implemented, then the Internet's inherent capacity is probably significantly underused, and many users may be experiencing congestion and delays that are unnecessary. It is also arguable that the minority of nodes currently carrying the bulk of Internet traffic are bearing a disproportionate expenditure on hardware and storage capacity.


It should also be noted that, in the absence of dynamic routing, congestion is inevitable and leads to packet loss at the routers. Since routers operate at the network layer, this loss of data causes whole messages to be retransmitted by the higher layers (i.e. the transport layer), creating further congestion. It is our belief that the role of ISPs should be further researched, and that ISPs should be held accountable for congestion where it is obvious that preventative measures have not been taken. Since the Internet now has the status of a public service, any abuse of such a system needs to be addressed and controlled. Over-allocation of bandwidth is a very serious cause for concern, which adds to the notion of unfairness on the Internet. In conclusion, Internet traffic is delayed at gateways and not by any limitation in the capacity of data communications media. Delays are caused not by software but by hardware storage limitations, limited processing power, or the inadequate amount of bandwidth provided by ISPs, thus causing bottlenecks. These delays are the direct cause of Internet congestion. The organisations responsible for the backbone gateways identified as bottlenecks in this research must assume responsibility for implementing dynamic congestion algorithms to alleviate congestion.

7. References

[Allman99] Allman, M., et al., "TCP Congestion Control", Request for Comments: 2581, Network Working Group, April 1999.
[Bonald00] Bonald, T. and Massoulié, L., "Impact of Fairness on Internet Performance".
[Bradley00] Bradley, C., "An Analytical Study of Internet Congestion", MSc Thesis, Computer Science Department, UCL, 2000.
[Floyd00] Floyd, S., "Congestion Control Principles", Request for Comments: 2914, Network Working Group, 2000.
[Floyd01] Floyd, S., "A Report on Some Recent Developments in TCP Congestion Control", IEEE Communications Magazine, April 2001.
[Key99] Key, P., et al., "Congestion Pricing for Congestion Avoidance", Caltech Technical Report, February 1999.
[Lefelhocz96] Lefelhocz, C., et al., "Congestion Control for Best-Effort Service: Why We Need a New Paradigm", IEEE Network, Volume 10, Number 1, January/February 1996.
[Maurer01] Maurer, S.M. and Huberman, B.A., "Restart Strategies and Internet Congestion", Journal of Economic Dynamics and Control, 2001.
[Mo98] Mo, J., et al., "Analysis and Comparison of TCP Reno and Vegas", IEEE Networks, July 1998.
[Nagle84] Nagle, J., "Congestion Control in IP/TCP Internetworks", Request for Comments: 896, Network Working Group, January 1984.
[Odlyzko98] Odlyzko, A., "Paris Metro Pricing for the Internet", AT&T Labs – Research Internal Report AT82323, 1998.
[Odlyzko00] Odlyzko, A., "Internet Growth: Myth and Reality, Use and Abuse", Information Impacts Magazine, available online from www.cisp.org, November 2000.
[Rekhter95] Rekhter, Y., "A Border Gateway Protocol 4 (BGP-4)", Request for Comments: 1771, Network Working Group, March 1995.
[Shaikh00] Shaikh, A., et al., "Routing Stability in Congested Networks: Experimentation and Analysis", SIGCOMM 2000: 163-17.
