The 3rd International Conference on Communications and Information Technology (ICCIT-2013): Networks and Internet Technologies, Beirut
A TCP delay-based mechanism for detecting congestion in the Internet

Khaled Dassouki*, Herve Debar*, Haidar Safa† and Abbas Hijazi‡
*Department of Networks and Telecommunication Services, Telecom SudParis, France. Email: kaled.el [email protected] and [email protected]
†Department of Computer Science, American University of Beirut, Lebanon. Email: [email protected]
‡Department of Physics, Lebanese University, Lebanon. Email: [email protected]
Abstract—Existing solutions to Internet congestion, such as active queue management (AQM) algorithms, have many shortcomings, mainly in the detection phase. These algorithms depend on routers' buffer statistics to detect congestion, and their performance is highly sensitive to the environment and to the parameters used. In this paper we propose a mechanism that detects congestion by passively monitoring an aggregation link. The proposed mechanism needs no parameterization, since all of its parameters are deduced from public real Internet traces using statistical approaches, and it uses TCP delays as the detection metric. It is dynamic, since detection speed is proportional to the severity of the congestion. Experimental results show that the proposed mechanism detects congestion rapidly and does not raise false alarms.

Keywords—Internet, Congestion Detection, TCP, Active Queue Management, throughput.
I. INTRODUCTION
Traditional services such as voice and video are switching entirely to IP solutions. These changes oblige Internet Service Providers (ISPs) to deliver fast and reliable Internet connectivity to their clients. In this context, congestion is one of the many challenges facing today's ISPs. Internet congestion occurs when the demand on a resource (e.g., link bandwidth) exceeds its capacity. Congestion delays the delivery of packets (which affects delay-sensitive services such as VoIP) and wastes valuable Internet resources, since undelivered packets must be retransmitted. Internet congestion is controlled by two mechanisms: TCP congestion control [18] and Active Queue Management (AQM) [1] [2] [3] [4]. TCP congestion control manages the sender's transmission rate. TCP is a reliable protocol: every transmitted segment must be acknowledged. The amount of data a sender may transmit before waiting for an acknowledgement is the minimum of the congestion window, cwnd, and the advertised window. The advertised window is the maximum number of bytes the receiver can accept, while the congestion window is used to control the transmission rate. When a session is established, cwnd is set to one segment and is increased by one for every received acknowledgment. This exponential increase stops when cwnd exceeds the slow start threshold. When the sender notices a lost segment or duplicate acknowledgements, TCP considers that congestion is occurring and the sender must slow down its
978-1-4673-5307-6/13/$31.00 ©2013 IEEE
transmission rate. The second mechanism, Active Queue Management, passively monitors router queues to detect congestion. Once the number of packets waiting in the queue exceeds a specified threshold, the algorithm considers that congestion is occurring and manages the router queue to avoid and control it. Random Early Detection (RED) [1] is the best-known AQM algorithm: when queue occupancy reaches a certain threshold, RED drops TCP packets according to a probabilistic relation. When the parties using the TCP congestion control mechanism notice dropped packets, they conclude that congestion is occurring and slow down their transmission rates. Although AQM is one of the main solutions deployed nowadays to avoid congestion, it has many shortcomings, mainly in the detection phase. First, most AQM algorithms use router buffer statistics to detect congestion; however, many router architectures include multiple distributed buffer stages that make such metrics difficult to obtain [6]. It is therefore essential to introduce a mechanism capable of detecting congestion passively and independently of network components and their architecture. Second, an end-to-end communication may traverse routers under different administrations deploying heterogeneous AQMs. Many studies have shown that, under certain circumstances, interaction between different AQMs may lead to oscillation and instability in the network [6]. Hence, it is important to overcome this interoperability problem by proposing distributed homogeneous or centralized congestion detection and control algorithms. Third, AQM performance depends on parameterization. For example, studies of RED have shown that it is highly dependent on the environment and on the way its parameters are tuned [7]. Unfortunately, AQM proposals do not suggest systematic rules for specifying these parameters [6].
Thus, it is essential to propose a congestion detection mechanism with clear guidelines for its parameterization. To overcome these drawbacks, we propose in this paper a mechanism that detects congestion by passively monitoring an aggregation link. The remainder of this paper is organized as follows. Section II describes the proposed algorithm. Section III presents the environment, the implementation and the result analysis. Section IV presents the related work, and we conclude in Section V.
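As an illustration of the RED behavior discussed above, the classic drop-probability curve can be sketched in a few lines of Python. This is only an illustrative sketch: the function name and the threshold values in the example are ours, not prescribed by RED or by this paper.

```python
def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Classic RED: the drop probability grows linearly with the average
    queue occupancy between the two thresholds (an illustration of the
    AQM behavior discussed above, not the proposed mechanism)."""
    if avg_queue < min_th:
        return 0.0          # below min_th: never drop
    if avg_queue >= max_th:
        return 1.0          # above max_th: drop everything
    return max_p * (avg_queue - min_th) / (max_th - min_th)

# Example: queue averaging 15 packets, thresholds 5/20, max_p = 0.1
p = red_drop_probability(15, 5, 20, 0.1)
```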
II. THE PROPOSED MECHANISM

The proposed mechanism needs no parameterization, since all of its parameters are deduced from public real Internet traces using statistical approaches, and it uses TCP delays as the detection metric. It is dynamic, since detection speed is proportional to the severity of the congestion: the more severe the congestion, the quicker the detection.

Fig. 1. Different delay phases experienced by a TCP session

Fig. 2. (a) Session level analyzer and (b) aggregation level analyzer
A. Notations, definitions and parameters

Before describing the proposed approach, we first define the following notations and parameters:

1) TCP delays: The TCP Round Trip Time (RTT) is the time needed for a packet to reach its recipient plus the time for the acknowledgement to return to the sender. Our studied traces were collected at a gateway located between different TCP endpoints; in this scenario, studying TCP delays based on RTT estimation is not trivial. Therefore, for every TCP session we studied the delay between its packets. Our study was based on statistical interpretations applied to real public Internet traces, as detailed in Section III. We observed that the delay experienced by a TCP endpoint may be classified into three phases, as shown in Fig. 1:
a) Normal delay: a delay after which most of the TCP sessions experiencing it are resumed.
b) Abnormal delay: a delay after which most of the TCP sessions experiencing it are not resumed. When this phase is reached, most operating systems consider that something is wrong and drop the session.
c) Probably abnormal delay: when the delay experienced by a TCP endpoint is no longer normal but has not yet reached the abnormal phase, it is considered probably abnormal. During this phase we are not sure whether the session will be resumed or will enter the abnormal phase.
2) ThPrAb: the probably abnormal delay threshold after which the session is considered probably abnormal.
3) ThAb: the abnormal delay threshold after which the session is considered abnormal.
4) ∆: the difference in seconds between ThAb and ThPrAb.
5) t: the time accumulated by the session delay after reaching the probably abnormal threshold ThPrAb.
6) x: the number of sessions that reach the probably abnormal phase during a period P.
7) Max: the maximum acceptable number of sessions reaching the abnormal phase on a monitored aggregation link during a specified period of time. If the number of abnormal sessions exceeds Max, congestion is detected.
8) y: the number of sessions that have reached the abnormal phase at a given time. Congestion is detected when y becomes greater than Max.
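The three-phase classification above can be sketched directly. This is a minimal sketch of the definitions; the function name is ours, and the threshold values are the ones derived later in Section III.

```python
# Threshold values as derived from the MAWI traces in Section III.
TH_PRAB = 6.0   # probably abnormal threshold (seconds)
TH_AB = 10.0    # abnormal threshold (seconds)

def classify_delay(delay_s):
    """Map a session's current inter-packet delay (seconds) to one of
    the three phases of Fig. 1."""
    if delay_s < TH_PRAB:
        return "normal"
    if delay_s < TH_AB:
        return "probably abnormal"
    return "abnormal"
```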
B. Basic Concepts

We aim to design a congestion detection mechanism based on TCP delays; we therefore suggest an adaptive detection threshold that is proportional to the severity of the congestion. We take into consideration the rate of occurrence of abnormal events, then decide whether or not to decrease the time needed to detect congestion. Indeed, according to our delay classification, a session is considered abnormal when its delay reaches ThAb, the abnormal threshold. Thus, to minimize the detection time, we propose that a session transit from the probably abnormal to the abnormal phase when t reaches E, given as:

E = ∆ − ∆/Max,      for x = 1         (1)
E = ∆ − ∆·x/Max,    for 1 < x < Max   (2)
E = ∆,              for x = Max       (3)
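The adaptive threshold can be sketched as follows. We implement the linear form of Eq. (2), which also reproduces Eq. (1) at x = 1; the function and parameter names are ours, and the sample values in the comment assume ∆ = 4 s and Max = 10.

```python
def adaptive_threshold(delta, x, max_sessions):
    """Time E (seconds) that a probably-abnormal session waits before
    being declared abnormal. As x grows, E shrinks linearly, so
    detection accelerates with the severity of the congestion."""
    return delta - delta * x / max_sessions

# With delta = 4 s and Max = 10: one waiting session  -> E = 3.6 s,
#                                five waiting sessions -> E = 2.0 s.
```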
This equation is made of two constants, ∆ and Max, and one variable, x. When x, the number of sessions that have reached the probably abnormal phase, increases, ∆ − ∆·x/Max decreases, which in turn reduces the time needed for a session to reach the abnormal phase and, accordingly, the time needed to reach Max. The proposed approach passively monitors all the active TCP sessions communicating on an aggregation link. An active session is a session that has not been terminated by a FIN, an RST or a long
specified inactivity period. Every monitored session may increase or decrease the values of x and y. Algorithm 1 illustrates the proposed mechanism, and Fig. 2 describes its analysis further. The algorithm has two levels of analysis: a session level analysis and an aggregation link analysis. A session level analyzer is created for every active TCP session, as shown in Fig. 2(a). There is only one aggregation level analyzer, which is influenced by the different session analyzers, as shown in Fig. 2(b). When the delay experienced by an active TCP session becomes greater than ThPrAb, the session level analyzer created for this session transitions from the normal state to the probably abnormal state. At this stage, the session level analyzer informs the aggregation level analyzer to increase the value of x by 1. If the TCP session receives a packet, the session analyzer goes back to the normal state and x is decreased by 1. If the session remains in the probably abnormal state and the delay persists until t becomes greater than ∆ − ∆·x/Max, the session is classified as abnormal and the counter y of the aggregation level analyzer is increased by 1. When y becomes greater than Max, congestion is detected.

Initialize Max;
Initialize ∆ = ThAb - ThPrAb;
for every TCP session do
    Request is sent at time t0;    // line 6
    while (delay = tcurrent - t0) < ThPrAb do
        // Normal state
    end
    // delay is greater than ThPrAb: go to probably abnormal state
    t1 = tcurrent;
    x = x + 1;
    E = ∆ - ∆*x/Max;
    while (t = tcurrent - t1) < E && !(Response to request) do
        // Probably abnormal state
        Wait;
    end
    if Response to request then
        // go back to normal state
        x = x - 1;
        go to line 6;
    else
        // t > E: transition to abnormal state
        y = y + 1;
        if y > Max then
            Alarm Congestion;
        end
        while !(Response to request) do
            Wait;
        end
        y = y - 1;
        x = x - 1;
        go to line 6;
    end
end
Algorithm 1: Proposed mechanism
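The two-level analysis of Fig. 2 can be sketched as a small simulation. This is a hedged approximation, not the paper's implementation: timestamps are plain floats, sessions are integer ids, and a session's probably-abnormal entry time is taken at the periodic pass that first observes it.

```python
class CongestionDetector:
    """Sketch of the two-level analysis: one state per TCP session plus
    link-wide counters x (probably abnormal) and y (abnormal)."""

    def __init__(self, th_prab=6.0, th_ab=10.0, max_sessions=5):
        self.th_prab = th_prab
        self.delta = th_ab - th_prab
        self.max = max_sessions
        self.last_seen = {}      # session id -> time of last packet
        self.prab_since = {}     # session id -> time it entered prob. abnormal
        self.abnormal = set()
        self.congested = False

    def packet(self, sid, now):
        """A packet on session sid returns it to the normal state."""
        self.prab_since.pop(sid, None)
        self.abnormal.discard(sid)
        self.last_seen[sid] = now

    def tick(self, now):
        """Periodic pass over all sessions (the aggregation level analysis)."""
        for sid, t0 in self.last_seen.items():
            if now - t0 >= self.th_prab:
                self.prab_since.setdefault(sid, now)
        x = len(self.prab_since)                    # probably abnormal count
        e = self.delta - self.delta * x / self.max if x < self.max else 0.0
        for sid, t1 in self.prab_since.items():
            if now - t1 >= e:                       # t reached E: abnormal
                self.abnormal.add(sid)
        if len(self.abnormal) > self.max:           # y > Max: alarm
            self.congested = True
```

For example, with Max = 2, four sessions going silent simultaneously drive x to 4, collapse E to zero and raise the alarm on the next pass, while a single silent session that resumes simply returns to the normal state.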
III. IMPLEMENTATION AND RESULT ANALYSIS

In this section, we describe our implementation environment and experiments, then analyze the obtained results. After implementing the proposed algorithm and before deploying it, we must specify ThPrAb, ThAb and Max. This section also shows how the values of these parameters were specified.

A. Implementation using the Bro IDS

We used the Bro IDS [10] to implement our detection mechanism. Algorithm 2 presents the pseudocode of our policy script; due to space restrictions, it is written with high-level functions. The script monitors and analyzes the active sessions every second. During this period, the script computes the number of sessions that have experienced a delay greater than ThPrAb seconds (i.e., x). Then, for every session experiencing such a delay, the script compares the value of t (i.e., the time elapsed since the session reached a delay of ThPrAb seconds) with ∆ − ∆·x/Max. If t is greater than ∆ − ∆·x/Max, then y (i.e., the number of sessions that have reached ThAb) is incremented. Finally, the script compares y with Max; if y is greater than Max, congestion is detected.

for every 1 second do
    Compute the number of active TCP sessions that were delayed
        for more than ThPrAb seconds and save it in x;
    for every session whose delay is > ThPrAb do
        E = ∆ - ∆*x/Max;
        t = time elapsed since reaching ThPrAb;
        if t > E then
            y = y + 1;
        end
        if y > Max then
            Alarm congestion;
        end
    end
end
Algorithm 2: Bro script pseudocode

B. ThAb, ThPrAb and Max specification

To specify the values of ThPrAb and ThAb, we first study TCP delay behavior on the Internet. We chose nine public traffic traces provided by the MAWI [8] dataset, for two reasons. First, the MAWI dataset provides a large amount of Internet traces collected in a real scenario from a transpacific aggregation link connecting Japan to the US; every trace file consists of 15 minutes of Internet traffic. Second, the attacks present in the MAWI dataset were labeled by the MAWILab project [9]. These attack labels were used to clean the MAWI traces of all malicious sessions. To study the TCP delay behavior we applied the following methodology:

1) Cleaning the traces: For every chosen trace file, we deleted all the sessions containing IP addresses labeled as attacks by the MAWILab dataset. This step minimizes the influence of malicious behavior on our statistics.
2) Extracting different delay proportions: For every 15-minute trace file, we studied the delay between messages sent by every TCP session. Over the 15 minutes, we computed the percentage of delays greater than 1 second, then repeated this procedure for delays greater than 2, 3, 4, ..., 15 seconds.
Fig. 3 presents the computed values for every chosen trace file, and Fig. 4 presents the mean over all the traces and the standard deviation.

3) ThPrAb and ThAb specification: Both Fig. 3 and Fig. 4 indicate that the delay experienced by a TCP session can be classified into three phases. The first phase is between 1 and 6 seconds; the graph shows a major decrease in the amount of delayed packets during this phase. Based on Fig. 4, which shows the delay mean of the nine studied trace files, 97.2% of the TCP sessions did not experience a delay of more than 6 seconds. We consider this the normal phase and set ThPrAb to 6 seconds. The second phase is between 6 and 10 seconds. During this phase, there is no major change between the percentage of delays at 6 seconds (2.8%) and at 10 seconds (2.64%). The percentage decreases between 11 and 12 seconds, after which the graph remains constant. From these results we deduce that between 6 and 10 seconds it is uncertain whether a delayed session will be resumed; this is the probably abnormal phase, and we set ThAb to 10 seconds. Fig. 4 shows that the third phase starts after a delay of 10 seconds.
Fig. 3. Delay percentage for every studied trace file
4) ∆ specification: Based on these statistics, ∆ = ThAb − ThPrAb = 10 − 6 = 4 seconds.
Fig. 4. Mean and standard deviation of all the studied traces
5) Max specification: As defined before, Max is the maximum acceptable number of sessions reaching the abnormal phase under normal circumstances during a specified period of time. This means that we have to specify the maximum expected number of sessions on an aggregation link delayed for more than 10 seconds (i.e., the value of ThAb) during a specified period of time. To achieve this goal we use the central limit theorem, which is given as:

X̄ ∼ N(µ_X, σ_X/√n)

where X̄ is the mean of a random sample X1, X2, ..., Xn of size n from a distribution with a finite mean µ_X and a finite standard deviation σ_X. The central limit theorem states that if the sample size is sufficiently large (30 and above), then the mean of a random sample from a population has a sampling distribution that is approximately normal, regardless of the shape of the distribution of the population. By computing the mean of the sample means, we can deduce the population mean. The following steps were made to apply the central limit theorem to our traces:
a) Delays: For every trace file, at the beginning of every second, we compute the percentage of active sessions delayed for more than 10 seconds with respect to the number of packets exchanged by these sessions.
b) Sampling: We choose the highest computed value in every consecutive 30 seconds, so that every 15-minute trace file yields 30 samples.
c) Mean: For every 15-minute trace file, we compute the mean of the 30 samples.
d) Mean of all means: We compute the mean of all the means computed in c). The calculated mean is the expected value of Max. Our results show that Max equals 0.19% of the packets exchanged by the active sessions.
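The sampling procedure in steps a)-d) can be sketched as follows. The function name and the synthetic input are ours; the real computation runs over the per-second percentages extracted from the nine MAWI traces.

```python
import statistics

def estimate_max(per_second_pcts, window=30):
    """Apply steps a)-d): for each trace (a list of per-second percentages
    of sessions delayed beyond ThAb), keep the largest value in every
    30-second window as one sample, average the samples per trace, then
    average the trace means. The result is the expected value of Max."""
    trace_means = []
    for trace in per_second_pcts:
        samples = [max(trace[i:i + window])
                   for i in range(0, len(trace), window)]
        trace_means.append(statistics.mean(samples))
    return statistics.mean(trace_means)
```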
C. Statistical interpretation

Our statistical studies are validated theoretically by the RFC on computing TCP's retransmission timer [19]. The RFC states that when a session starts with a SYN message, the RTO (Retransmission Time Out) is set to 3 seconds. If a SYN-ACK is not received within this period, the SYN is retransmitted and a new RTO of 6 seconds is set; if a SYN-ACK is still not received, the SYN is retransmitted again with an RTO of 12 seconds. Once a session is established, the RTO is computed from the RTT estimation; if the RTT is small, the RTO decreases only slightly, which is the case for short TCP sessions. The minimum RTO should not be set to less than 1 second. If the first RTO is slightly greater than 1 second, the second RTO is set to 3 seconds, the third to 6 seconds, and so on. Based on these theoretical facts, the graphs in Fig. 1 and Fig. 4 show that most sessions are resumed during the first and second RTO (3 plus 6 seconds, which is the value expected by the protocol designers), while some delayed sessions persist until the beginning of the third RTO. During the third RTO, there is only a small probability that sessions are resumed. This explains the behavior of most operating systems, which consider sessions inactive after reaching the third RTO.

D. Experiments and discussion

Our proposed algorithm detects congestion by passively monitoring the TCP sessions communicated on an aggregation link. It could be deployed on a link connected to a gateway router, to a server or even to a client. To demonstrate the efficiency of our detection mechanism, we based our experiments on real public Internet traces collected from different locations and under different circumstances. We chose five traces grouped into two categories: congested and uncongested traces.
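The SYN retransmission backoff invoked in the statistical interpretation above (an RTO of 3 s, then 6 s, then 12 s) follows a simple doubling rule, sketched here; the function name is ours.

```python
def syn_retransmission_times(retries=3, initial_rto=3.0):
    """Cumulative times (seconds since the first SYN) at which an
    unanswered SYN is retransmitted, with the RTO doubling each time
    as per RFC 2988: retransmissions at 3 s, 3+6 = 9 s, 9+12 = 21 s."""
    rto, elapsed, times = initial_rto, 0.0, []
    for _ in range(retries):
        elapsed += rto          # wait out the current RTO
        times.append(elapsed)   # ...then retransmit
        rto *= 2                # exponential backoff
    return times
```

The first two entries (3 s and 9 s) bracket the window in which, per Fig. 4, most delayed sessions are resumed, consistent with the ThPrAb = 6 s and ThAb = 10 s choices.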
TABLE I. SIMPLE TABLE

Characteristics                        | Proposed algorithm | Nettimer | MultiQ
---------------------------------------|--------------------|----------|-------
Passive                                | Yes                | Yes      | Yes
Applied on live network                | Yes                | No       | No
Applied on previously collected traces | Yes                | Yes      | Yes
Fast                                   | Yes                | No       | No
Deployed on an aggregated link         | Yes                | No       | No
Deployed on servers                    | Yes                | Yes      | Yes
1) Experiment 1, MAWI congested traces: Between 2003/04 and 2003/10, the MAWI link suffered from severe congestion [17]. We applied our detection mechanism to two traces chosen randomly from this period. According to [17], both files suffer from continuous congestion throughout the capture period (15 minutes). Our detection mechanism detected congestion 7 seconds after the beginning of each trace file and kept signaling congestion until its end. This shows that our proposed mechanism can be deployed on an ISP aggregation link and passively detect congestion occurring on that link.

2) Experiment 2, CAIDA DDoS attack traces: From the CAIDA [16] dataset, we used traffic traces containing approximately one hour of a DDoS attack targeting a server on August 4, 2007. Only attack traffic to the victim and responses from the victim are included in the traces. Our algorithm detected congestion in around 10 seconds; this is because the attack was not intense at the beginning.

3) Experiment 3, uncongested traces: We chose two MAWI traces collected during 2010 that do not suffer from congestion. Applying our detection mechanism to these two traces, no congestion was detected. This experiment shows the efficiency of our algorithm in terms of false alarm rate: no false alarms were signaled while monitoring an uncongested link.

IV. RELATED WORK
An active research field that deals with congestion is bottleneck detection. The main objective of the proposed solutions [11] [12] [13] [14] is to detect the presence of a bottleneck on the monitored network that may lead to congestion. Two techniques exist: active and passive bottleneck detection. Passive detection uses pre-captured network traces to estimate the presence of a bottleneck on the network. Many passive capacity estimation tools have been proposed. MultiQ [14] uses the equally spaced mode gaps technique, while Nettimer [13] deduces link capacity from the location of the modes in the packet inter-arrival distribution. Neither tool can be deployed on live operational networks; this was one of the major motivations for our work. Table I compares the features of our algorithm with those of Nettimer and MultiQ.
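As a toy illustration of the mode-based idea behind these tools (not Nettimer's or MultiQ's actual algorithms), a capacity estimate can be sketched by finding the dominant packet inter-arrival gap and dividing the packet size by it. All names and the numbers in the example are ours.

```python
from collections import Counter

def capacity_from_gaps(gaps_us, packet_size_bits=12000):
    """Estimate link capacity (bit/s) from the mode of the packet
    inter-arrival distribution: back-to-back 1500-byte (12000-bit)
    packets on a 100 Mb/s link arrive about 120 microseconds apart."""
    mode_us = Counter(gaps_us).most_common(1)[0][0]  # dominant gap
    return packet_size_bits / (mode_us * 1e-6)
```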
V. CONCLUSION

In this paper we have proposed a passive congestion detection algorithm that is capable of efficiently detecting congestion by monitoring an aggregation link. Our proposed mechanism can be deployed near a gateway router, a server or a client. All the parameters used in the proposed mechanism are specified by applying statistical methods to real Internet traces. We implemented the proposed mechanism using the Bro IDS and used MAWI traces to estimate the values of some parameters. We performed several experiments using traces collected from real Internet traffic; our mechanism was able to detect congestion rapidly. We also showed that our algorithm does not raise false alarms when the network is not congested.

ACKNOWLEDGMENT

This work was supported in part by a grant from the Lebanese National Council For Scientific Research (no. 01-0711, LNCSR-3435/S).

REFERENCES

[1] S. Floyd and V. Jacobson, "Random Early Detection Gateways for Congestion Avoidance," IEEE/ACM Transactions on Networking, vol. 1, no. 4, Aug. 1993.
[2] S. Floyd, R. Gummadi, and S. Shenker, "Adaptive RED: An Algorithm for Increasing the Robustness of RED's Active Queue Management," preprint, available at http://www.icir.org/floyd/papers.html, August 2001.
[3] W. Feng, D. Kandlur, D. Saha, and K. Shin, "BLUE: A New Class of Active Queue Management Algorithms."
[4] T. Ott, T. Lakshman, and L. Wong, "SRED: Stabilized RED," in Proc. IEEE INFOCOM, New York City, NY, March 1999.
[5] S. Ahmad, A. Mustafa, B. Ahmad, A. Bano, and A. Hosam, "Comparative Study of Congestion Control Techniques in High Speed Networks," International Journal of Computer Science and Information Security (IJCSIS), vol. 6, no. 2, 2009.
[6] D. Papadimitriou, Ed., M. Welzl, M. Scharf, and B. Briscoe, "Open Research Issues in Internet Congestion Control," IRTF, RFC 6077, February 2011.
[7] D. Lin and R. Morris, "Dynamics of Random Early Detection," in Proc. ACM SIGCOMM '97.
[8] MAWI (Measurement and Analysis on the WIDE Internet), http://tracer.csl.sony.co.jp/mawi
[9] R. Fontugne, P. Borgnat, P. Abry, and K. Fukuda, "MAWILab: Combining Diverse Anomaly Detectors for Automated Anomaly Labeling and Performance Benchmarking," in Proc. CoNEXT 2010.
[10] Bro Intrusion Detection System, www.bro-ids.org
[11] C. Dovrolis, P. Ramanathan, and D. Moore, "Packet-dispersion techniques and a capacity-estimation methodology," IEEE/ACM Transactions on Networking, vol. 12, no. 6, pp. 963-977, 2004.
[12] R. Kapoor, L. Chen, L. Lao, M. Gerla, and M. Sanadidi, "CapProbe: A Simple and Accurate Capacity Estimation Technique," in Proc. ACM SIGCOMM, 2004.
[13] K. Lai and M. Baker, "Nettimer: A Tool for Measuring Bottleneck Link Bandwidth," in Proc. USENIX, 1999.
[14] S. Katti, D. Katabi, C. Blake, E. Kohler, and J. Strauss, "MultiQ: Automated Detection of Multiple Bottleneck Capacities Along a Path," in Proc. IMC '04, pp. 245-250, 2004.
[15] T. En-Najjary, "Passive Capacity Estimation: Comparison of Existing Tools," Performance Evaluation of Computer and Telecommunication Systems, 2008.
[16] The CAIDA "DDoS Attack 2007" Dataset, P. Hick, E. Aben, kc claffy, and J. Polterock, http://www.caida.org/data/passive/ddos20070804_dataset.xml
[17] P. Borgnat, G. Dewaele, K. Fukuda, P. Abry, and K. Cho, "Seven Years and One Day: Sketching the Evolution of Internet Traffic," in Proc. IEEE INFOCOM 2009, pp. 711-719, April 2009.
[18] M. Allman, V. Paxson, and E. Blanton, "TCP Congestion Control," RFC 5681, September 2009.
[19] V. Paxson and M. Allman, "Computing TCP's Retransmission Timer," RFC 2988, November 2000.