China Communications

An Early Stage Detecting Method against SYN Flooding Attacks

Sun Qibo, Wang Shangguang, Yan Danfeng, Yang Fangchun

State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China

Abstract: Existing detection methods against SYN flooding attacks are effective only at the later stages, when attack signatures are obvious. This paper proposes an early stage detecting method (ESDM), a simple but effective way to detect SYN flooding attacks at an early stage. In the ESDM, the SYN traffic is forecast by an autoregressive integrated moving average (ARIMA) model, and a non-parametric cumulative sum algorithm is applied to the forecast traffic to find SYN flooding attacks. Trace-driven simulations show that the ESDM detects SYN flooding attacks accurately and efficiently.

Key words: denial-of-service attacks; autoregressive integrated moving average model; non-parametric cumulative sum algorithm

I. Introduction

A denial-of-service (DoS) attack or distributed denial-of-service (DDoS) attack is an attempt to make a computer resource unavailable to its intended users; typical targets include banks, credit card payment gateways, and even root name servers. DDoS attacks are large-scale cooperative attacks launched from a large number of compromised computers.


2009.11

They remain a significant problem in today’s Internet. Despite the widespread deployment of perimeter security devices, such as firewalls and intrusion detection systems (IDS)[1-5], denial of service targets the heart of today’s information economy, connectivity, by preventing legitimate users from accessing services. This may be achieved in a number of ways. However, 90% of DoS attacks use TCP to achieve their aims[1]. The SYN flooding attack is the most commonly used attack[1], and targets the three-way handshake mechanism of TCP connection establishment[2]. In a SYN flooding attack, the requesting addresses are usually spoofed IP addresses. The server sends SYN-ACK packets to the spoofed IP addresses and never receives the ACK packets back. Once all resources set aside for half-open connections are reserved, new connections cannot be made, resulting in denial of service[3]. Furthermore, other system resources such as CPU and network bandwidth are occupied and overloaded[4].

Many methods or systems have been proposed to detect SYN flooding attacks. The authors of [1] detected SYN flooding attacks at leaf routers that connect end hosts to the Internet, using the normalized difference between

Broadband Network 宽带网络

the number of SYN packets and the number of FIN (RST) packets in a time interval. If a non-parametric cumulative sum algorithm finds that the rate of SYN packets is much higher than that of FIN (RST) packets, the router recognizes that some attacking traffic is mixed into the current traffic. A similar approach was used in [5], which also uses a non-parametric cumulative sum algorithm; however, the authors apply it to the number of SYN packets only, and use an exponential weighted moving average to obtain a recent estimate of the mean rate of SYN packets after the change. In [6], the authors built a standard model of the server's activity from observations of the relationship between SYN packets and the SYN+ACK response packets from the server. The authors of [7] proposed a method to detect flooding agents under all possible kinds of IP spoofing, based on the SYN/SYN-ACK protocol pair and packet header information. A Counting Bloom Filter classifies all incoming SYN-ACK packets to the sub-network into two streams, and a non-parametric cumulative sum algorithm makes the detection decision from two normalized differences: one between the number of SYN packets and the number of first SYN-ACK packets, and another between the number of first SYN-ACK packets and the number of retransmitted SYN-ACK packets. There are also other related studies, such as SYN cookies, SYN cache, D-SAT[8] and DiDDeM[9], and more related work is in [10, 11, 12, 13].

However, these existing methods and defense mechanisms against SYN flooding attacks are effective only at the later stages, when attack signatures are obvious[12]. This has three disadvantages:

(1) The aggregation of numerous malicious packets on the victim server makes it difficult to launch an effective counterattack.

(2) By that time, the SYN flooding attack has already done great damage to the target server, and a lot of resources have been wasted.

(3) The SYN flooding attack may be over, or nearly over, at a later stage, which makes it difficult to trace the flooding source.

Hence, it is important to detect SYN flooding attacks at an early stage, before the protected server maintains a large number of half-open connections. Early detection allows sufficient time for defense responses such as filtering, pushback and tracing the flooding source[1, 13].

Based on the above analysis, an early stage detecting method (ESDM) is proposed to detect SYN flooding attacks with a shorter detection time. The ESDM does not monitor all the flows on the network. It only counts the number of received TCP SYN packets, which avoids the need to maintain all incoming TCP sessions and occupy large storage space. What is more, unlike the existing methods, the ESDM monitors not only the number of SYN packets but also the predicted number of SYN packets, which is obtained by an autoregressive integrated moving average model. A non-parametric cumulative sum algorithm is applied to the predicted SYN traffic to find SYN flooding attacks. If a SYN flooding attack happens, the SYN traffic increases; accordingly, the increasing trend of the traffic can be predicted. The ESDM then finds the attack from the predicted traffic, which shows an upward trend at the early stage. Trace-driven simulations show that our method detects SYN flooding attacks efficiently and has a shorter detection time. Moreover, because it stores only the number of SYN packets, it reduces session management overhead.

The remainder of the paper is organized as follows. Section II describes the ESDM, including how to predict SYN traffic and how to detect SYN flooding attacks. Section III evaluates the performance of the ESDM. Section IV discusses issues and concludes the paper.


II. ESDM

The two key elements of the ESDM are traffic prediction and the detection of SYN flooding attacks. In the ESDM, we use an autoregressive integrated moving average model to predict the traffic of SYN packets from the past traffic; a non-parametric cumulative sum algorithm is then used to find the sign of SYN flooding attacks.

2.1 Traffic prediction

Many classical predictive models are used to predict network traffic, such as the autoregressive model (AR(p)), the moving average model (MA(q)), the autoregressive moving average model (ARMA(p, q)) and the autoregressive integrated moving average model (ARIMA(p, d, q))[14]. The ARIMA model is often used to predict Internet traffic[14, 15]; therefore we also use it to predict the SYN traffic. It originates from the AR(p), MA(q) and ARMA(p, q) models[16]. The ARIMA(p, d, q) model is usually written as:

$$\phi(B)(1 - B)^d X_t = \theta(B) Z_t, \quad Z_t \sim N(0, \sigma^2) \tag{1}$$

where p and q are the orders of the AR(p) and MA(q) models, and d is the number of times the series is differenced[21]; p, d and q are all integers. If d = 0, the model reduces to a pure ARMA(p, q) model; if d = 0 and p = 0, it reduces to a pure MA(q) model; and if d = 0 and q = 0, it reduces to a pure AR(p) model. The polynomials $\phi(B)$ and $\theta(B)$ are defined as:

$$\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$$

$$\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q$$

where $\phi(\cdot)$ and $\theta(\cdot)$ are polynomials of degree p and q, respectively, and have no common factors[1]. B is the backward shift operator:

$$B^j X_t = X_{t-j}, \quad B^j Z_t = Z_{t-j}, \quad j = 0, \pm 1, \ldots \tag{2}$$

Formulating an ARIMA model is a complicated process. In the ESDM, we use the Box-Jenkins methodology to fit the ARIMA model. It involves the following steps[16]:

1. Identify the order of the ARIMA(p, d, q) model with the autocorrelation function (ACF) and the partial autocorrelation function (PACF).
2. Estimate the parameters of the model. The model parameters are estimated by Maximum Likelihood Estimation (MLE), and the best-fitting parameters are selected using Akaike’s Information Criterion (AIC)[16, 17].
3. Forecast the future data based on the historical data[18].

With the fitted ARIMA model, all the 6 parameters estimated from the historical time series, and the last value of that series, are known. For example, if $X_t$ is the time series, its m-step-ahead prediction is $X_{t+m}$. When m = 1, it is called the one-step-ahead prediction. Because each ARIMA forecast comes with a forecast range, the upper and lower probability limits are provided to decide the forecast value. Once the actual traffic value becomes available, the ARIMA model updates the historical number of SYN packets and re-estimates the parameters.

2.2 Non-parametric cumulative sum algorithm

After predicting the SYN traffic with ARIMA, we apply a non-parametric cumulative sum algorithm to detect SYN flooding attacks. Detailed information about the algorithm can be found in [1, 5]. Let $\mu_0$ and $\mu_1$ be the mean traffic before and after the change. Assuming the traffic samples $\{y_i\}$ are independent and Gaussian with known variance $\sigma^2$, the cumulative sum algorithm with threshold parameter h raises an alarm as follows: if $Y_n \geq h$, then alarm, where

$$Y_n = \left[ Y_{n-1} + \frac{\mu_1 - \mu_0}{\sigma^2} \left( y_n - \frac{\mu_1 + \mu_0}{2} \right) \right]^+ \tag{3}$$

and $[z]^+ = \max(z, 0)$.
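The recursion in (3) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the starting value Y_0 = 0 is an assumption, since the text does not state it.

```python
def cusum_statistics(samples, mu0, mu1, sigma2, y0=0.0):
    """Return [Y_1, ..., Y_N] from Eq. (3); y0 = 0 is an assumed start value."""
    gain = (mu1 - mu0) / sigma2        # (mu_1 - mu_0) / sigma^2
    midpoint = (mu1 + mu0) / 2.0       # (mu_1 + mu_0) / 2
    stats = []
    y_prev = y0
    for y in samples:
        # [.]^+ keeps the statistic from drifting below zero
        y_prev = max(0.0, y_prev + gain * (y - midpoint))
        stats.append(y_prev)
    return stats


def first_alarm(stats, h):
    """Return the 1-based index of the first Y_n >= h, or None if none."""
    for n, value in enumerate(stats, start=1):
        if value >= h:
            return n
    return None
```

While the samples stay near the pre-change mean, each increment is negative and the statistic is clamped at zero; after a shift towards the post-change mean the increments turn positive and accumulate until they cross h.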

We apply the non-parametric cumulative sum algorithm to $\tilde{x}_n$, with $\tilde{x}_n = x_n - \mu_{n-1}$, where $x_n$ is the number of SYN packets in the n-th time interval and $\mu_n$ is the mean at the n-th interval, computed with an exponential weighted moving average (EWMA):

$$\mu_n = \lambda \mu_{n-1} + (1 - \lambda) x_n \quad (0 < \lambda < 1), \qquad \mu_0 = x_1 \tag{4}$$

where $\lambda$ is the EWMA factor. The mean value of $\tilde{x}_n$ is zero before a change, hence the pre-change mean in (3) is $\mu_0 = 0$ (this is the $\mu_0$ of (3), not the EWMA starting value in (4)). The value $\mu_1$ is the mean traffic after the change and cannot easily be predicted, but it can be approximated as $\alpha\mu_n$[5], where $\alpha$ is the amplitude percentage parameter, which corresponds to the most probable percentage of increase of the mean rate after a change. Substituting $y_n = \tilde{x}_n$, $\mu_0 = 0$ and $\mu_1 = \alpha\mu_{n-1}$ (the latest available estimate of the mean) into (3), the non-parametric cumulative sum algorithm can be written as:

$$Y_n = \left[ Y_{n-1} + \frac{\alpha\mu_{n-1}}{\sigma^2} \left( x_n - \mu_{n-1} - \frac{\alpha\mu_{n-1}}{2} \right) \right]^+ \tag{5}$$
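Equations (4) and (5) can be run together over a list of per-period SYN counts, as in the following Python sketch. It is illustrative only: the function name is hypothetical, Y_0 = 0 is assumed, and the variance `sigma2` must be supplied by the caller because the paper does not report the value it used. The default `lam=0.98` assumes that the value reported as β in Section III is the EWMA factor λ.

```python
def ewma_cusum(counts, sigma2, alpha=0.5, lam=0.98):
    """Run Eqs. (4) and (5) over per-period SYN counts.

    Returns a list of (mu_n, Y_n) tuples, one per count.
    """
    if not counts:
        return []
    mu_prev = float(counts[0])   # Eq. (4) start value: mu_0 = x_1
    y_prev = 0.0                 # assumption: Y_0 = 0
    out = []
    for x in counts:
        drift = alpha * mu_prev  # approximated post-change mean, alpha * mu_{n-1}
        # Eq. (5): uses mu_{n-1}, so Y_n is updated before the mean
        y_prev = max(0.0, y_prev + (drift / sigma2) * (x - mu_prev - drift / 2.0))
        # Eq. (4): fold the new count into the running mean
        mu_prev = lam * mu_prev + (1.0 - lam) * x
        out.append((mu_prev, y_prev))
    return out
```

Updating Y before the mean keeps the code aligned with the indices in (5), where the n-th statistic depends only on the mean estimated up to interval n-1.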

2.3 Detection

The detection algorithm performs time series analysis of the SYN traffic to detect an attack. The signal value is the m-step-ahead predicted number of SYN packets. Let $\{x_t, t = 1, 2, \ldots\}$ be the number of SYN packets collected within one sampling period, where t is the discrete time index. We use Box-Jenkins modeling to get the best ARIMA(p, d, q) model, and apply the model to get the predicted number of SYN packets, $\{x_{t+m}, m = 1, 2, \ldots\}$. From (4), we get $\mu_t$, and then

$$Y_t = \left[ Y_{t-1} + \frac{\alpha\mu_{t-1}}{\sigma^2} \left( x_t - \mu_{t-1} - \frac{\alpha\mu_{t-1}}{2} \right) \right]^+ .$$

If $Y_t \geq h$, then alarm. Similarly, we get $\mu_{t+m}$ from $\mu_{t+m} = \lambda\mu_{t+m-1} + (1 - \lambda) x_{t+m}$, and then

$$Y_{t+m} = \left[ Y_{t+m-1} + \frac{\alpha\mu_{t+m-1}}{\sigma^2} \left( x_{t+m} - \mu_{t+m-1} - \frac{\alpha\mu_{t+m-1}}{2} \right) \right]^+ .$$

If $Y_{t+m} \geq h$, then alarm at time t. In this way, the predicted number of SYN packets lets us raise the alarm ahead of time.
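The detection rule of this section can be sketched as follows. The sketch assumes the forecasts are produced elsewhere (for example by a fitted ARIMA model) and passed in as a plain list; the function names, the starting value Y_0 = 0 and the `sigma2` argument are assumptions for illustration, not details taken from the paper.

```python
def step(mu_prev, y_prev, x, alpha, lam, sigma2):
    """One update of Eq. (5) followed by Eq. (4); returns (mu, Y)."""
    drift = alpha * mu_prev
    y = max(0.0, y_prev + (drift / sigma2) * (x - mu_prev - drift / 2.0))
    mu = lam * mu_prev + (1.0 - lam) * x
    return mu, y


def esdm_check(observed, predicted, sigma2, h=5.0, alpha=0.5, lam=0.98):
    """Return (alarm_on_observed, lead) at the current time t.

    observed  -- x_1..x_t, actual SYN counts per sampling period
    predicted -- x_{t+1}..x_{t+m}, forecasts supplied by the caller
    lead      -- smallest m' with Y_{t+m'} >= h, or None if no early alarm
    """
    mu, y = float(observed[0]), 0.0     # mu_0 = x_1; Y_0 = 0 is assumed
    for x in observed:
        mu, y = step(mu, y, x, alpha, lam, sigma2)
    alarm_now = y >= h                  # Y_t >= h on the actual traffic
    lead = None
    for m, x in enumerate(predicted, start=1):
        mu, y = step(mu, y, x, alpha, lam, sigma2)   # continue on forecasts
        if y >= h:
            lead = m                    # alarm raised at time t, m steps early
            break
    return alarm_now, lead
```

A call such as `esdm_check(counts, forecasts, sigma2=2500.0)` returns a non-`None` lead when the forecasts already push the statistic over the threshold, which is the early alarm the section describes.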

III. Performance evaluations

The ESDM was evaluated and validated in the following steps:

(1) We carried out trace-driven simulations. The background traffic used in our study was collected at the egress point of School Building III of Beijing University of Posts and Telecommunications (about 2500 people work in it), which is a GE (1 Gbps) link that connects to the rest of the world. Only SYN packets destined for port 80 were collected, using the tcpdump tool[19]. There are three traces, and the dynamics of their SYN packets are illustrated in Figure 1. The first trace (SBIII-1) was obtained on May 7, 2009, the second trace (SBIII-2) on May 10, 2009, and the third trace (SBIII-3) also on May 7, 2009.

(2) We believe that there are only a few malicious packets in School Building III’s traces, because its network is protected by a high performance IDS (Intrusion Detection System) and firewall. In our study, we simulated attack traffic with a traffic generator tool, as shown in Figure 4 (a). Attack packets were synthetically generated for SYN flooding attacks, as shown in Figure 4 (b), which allowed us to control the characteristics of the attacks and to investigate the performance of the ESDM for different attack types.

(3) To predict the SYN traffic with ARIMA, the first step is to use the ACF and PACF to test the stationarity of the traces and to determine the differencing order d. After that, we use the ACF and PACF again to determine the orders p and q. Then we use MLE to estimate the values of the model's parameters and build the traffic model. We use the traffic of some observation periods to estimate the model’s parameters and test the model’s prediction accuracy. After obtaining the most accurate model, we apply it to predict the 3-step-ahead traffic, especially an increasing SYN traffic, where “step” means the observation period; see, for example, Figure 2. Once the actual traffic value becomes available, the ARIMA model updates the historical number of SYN packets. At this point, we re-estimate the parameters and apply the new model to predict the SYN traffic.

(4) After obtaining the SYN traffic prediction, the non-parametric cumulative sum algorithm is used to monitor the actual traffic and the predicted traffic (3-step-ahead traffic), and to detect attacks in both kinds of traffic, as shown in Figures 3 and 4.
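The refit-and-forecast loop of step (3) can be illustrated with a deliberately simplified stand-in. The Python sketch below is not the Box-Jenkins procedure used in the paper: it fixes the order at ARIMA(1, 1, 0) with no constant term, estimates the single AR coefficient by conditional least squares rather than MLE, performs no AIC-based order selection, and gives point forecasts without probability limits. All function names and the `warmup` parameter are hypothetical.

```python
def fit_ar1_on_differences(series):
    """Least-squares AR(1) coefficient for the first-differenced series (d = 1)."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    num = sum(cur * prev for prev, cur in zip(diffs, diffs[1:]))
    den = sum(prev * prev for prev in diffs[:-1])
    phi = num / den if den else 0.0   # fall back to phi = 0 for flat series
    return phi, diffs


def forecast(series, steps=3):
    """Return `steps` point forecasts from the simplified ARIMA(1,1,0) model."""
    phi, diffs = fit_ar1_on_differences(series)
    level = series[-1]
    delta = diffs[-1] if diffs else 0.0
    out = []
    for _ in range(steps):
        delta = phi * delta           # forecast the next difference
        level = level + delta         # integrate back to the count scale
        out.append(level)
    return out


def rolling_forecasts(series, warmup=10, steps=3):
    """Refit after every new observation and forecast `steps` periods ahead."""
    return [forecast(series[:t], steps) for t in range(warmup, len(series) + 1)]
```

With `steps=3` and a 10-second observation period, each inner list covers the next 30 seconds of SYN traffic, matching the 3-step-ahead horizon described in step (3). A full ARIMA fit from a statistics library would replace `forecast` in practice.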


Fig.1 The normal SYN background traffic

We choose α as 0.5, h as 5, β as 0.98 and the observation period as 10 seconds, with a certain false alarm ratio tolerance, according to the parameters of the non-parametric cumulative sum algorithm in [1, 5].

3.1 Normal traffic test

In the experiment, we first apply the ESDM to all three SYN background traffic traces without adding attack traffic. The ARIMA model is used to analyze the actual traffic of SYN packets and predict the future traffic. The observation period is 10 seconds, which means that the 3-step-ahead prediction predicts the SYN traffic 30 seconds into the future. Figure 2 shows parts of the 3-step-ahead predictions of all the traces, such as at the 25th, 100th, 200th, 275th and 350th observation periods. The test statistics for all prediction traces are plotted in Figure 3. For all the traces tested, the test statistics are mostly zero and always much smaller than the threshold. Hence, no false alarms are reported.

Fig.2 The prediction of SYN packets

Fig.3 Test statistics under normal SYN background traffic

3.2 SYN flooding detection

In the attack experiment, the SBIII-2 trace was used as the normal background traffic (the other traces give similar results). Figure 4 (a) shows the simulated attack traffic only, and Figure 4 (b) shows the resultant SYN traffic containing the attack traffic. The SYN flooding attacks occurred at the 25th, 100th, 200th, 275th and 350th observation periods. All attacks lasted for 30 observation periods. The first two attacks are high intensity attacks, whose mean amplitude is more than twice the amplitude of the background traffic. The last three attacks are low intensity ones, whose mean amplitude is about one time larger than that of the background traffic.

Fig.4 Traffic trace with attacks

The aim of SYN flooding detection is to validate our method, i.e. the ESDM. The experiment in this section is divided into two parts: low intensity attack detection and high intensity attack detection. In addition, to further verify the validity of our method, we compare its detection time and false ratio with those of the method of [5]. All the experiments were run on the same software and hardware: a Pentium 2.0 GHz processor, 1.0 GB of RAM, and Windows XP Pro. The same SYN traffic is used throughout. In the experiments, the capital letter “A” represents our method, and the capital letter “B” represents the method using the non-parametric cumulative sum algorithm of [5].

3.2.1 Low intensity attacks detection

In Figure 5 we compare the detection time of “A” and “B” under low intensity flooding attacks, using the traffic trace of Figure 4. From Figure 5, the detection time of “A” is clearly lower than that of “B” on average over all three attacks. The computation time of “A” is 68.3% shorter than that of “B” on average under low intensity attacks. For example, when the attack occurs at the 200th observation period, the detection time of “A” is 5 observation periods, while that of “B” is 9 observation periods. The reason why “A” is superior to “B” is that the increasing trend of SYN flooding attacks can be predicted by the ARIMA model and found by the

Fig.5 The experiment comparison results with low intensity attacks

non-parametric cumulative sum algorithm before the aggregation of numerous malicious packets on the victim server. Therefore, “A” finds the sign of attacks at an early stage.

3.2.2 High intensity attacks detection

In Figure 6 we compare the detection time of “A” and “B” under high intensity flooding attacks, using the traffic trace of Figure 4.

Fig.6 The experiment comparison results with high intensity attacks

From Figure 6, the detection time of “A” is also lower than that of “B” on average over the two attacks. The computation time of “A” is 50% shorter than that of “B” on average under high intensity attacks. For example, when the attack occurs at the 25th observation period, the detection time of “A” is 1 observation period, while that of “B” is 2 observation periods.

3.3 Results analysis

In Figure 7 we compare the alarm ratio of “A” and “B” for all attacks, using the traffic trace of Figure 4. Figure 7 shows that both methods find the attacks accurately in our experiments: they both yield an alarm ratio of 100%. However, Figure 5 and Figure 6 show that the ESDM has a shorter detection time for all SYN flooding attacks, especially for low intensity attacks. For example, the ESDM takes an average of 4.3 observation periods to detect attacks, while the method of [5] takes an average of 10 observation periods. The results show that the


Fig.7 The alarm ratio of “A” and “B”

ESDM has more advantages in detection time than the method of [5], and it will help get more time to launch an effective counterattack to SYN flooding attacks. In addition, the simulation results show that the performance of our method is better for low intensity attacks than high intensity attacks. The reason is that the increasing trend of SYN flooding attacks can be predicted by the ARIMA model and found by the non-parametric cumulative sum algorithm before the aggregation of numerous malicious packets on the victim server. Therefore, “A” finds the sign of attacks at an early stage.
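As a rough worked conversion of the average detection times quoted above, assuming the 10-second observation period given in Section III applies to these averages:

$$4.3 \times 10\,\mathrm{s} = 43\,\mathrm{s}, \qquad 10 \times 10\,\mathrm{s} = 100\,\mathrm{s}, \qquad \frac{10 - 4.3}{10} = 57\%.$$

Under that assumption, the ESDM raises its alarm about 57 seconds earlier on average than the method of [5].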

IV. Conclusions

The ESDM is proposed to find SYN flooding attacks at an early stage. Some existing detection methods are effective only at the later stages, when attack signatures are obvious. However, with the aggregation of malicious traffic on the victim server at the later stages, it is difficult to launch an effective counterattack, which results in great damage and wastes a lot of resources. Furthermore, it is difficult to trace the flooding source at the later stage. By contrast, our method finds the attack with a shorter detection time. The method only counts the number of received SYN packets and stores them, which avoids the need to maintain all incoming TCP sessions and occupy large storage space. What is more, our method also manages the forecast SYN traffic, which is obtained by the ARIMA model, and uses the non-parametric cumulative sum algorithm to find SYN flooding attacks from the traffic prediction. Trace-driven simulations show that our method detects SYN flooding attacks accurately and efficiently. It achieves a shorter detection time and a small storage space. Ongoing work focuses on applying the system to an actual production network for early detection of SYN flooding attacks. It would be helpful to develop a more effective SYN flooding mitigation scheme that relies on other schemes, like SynDefender[20], to mitigate SYN flooding in the future.

Acknowledgments

The work presented in this paper is supported by the National High-Tech Research and Development Plan of China under Grant No. 2006AA01Z448 (863); the Key Science and Technology Research project of Ministry of Education of China under Grant No. 108013; the Foundation for Innovative Research Groups of the National Natural Science Foundation of China under Grant No. 60821001; the National Information Security Plan of China under Grant No. 2007A14 (242).


References

[1] H. Wang, D. Zhang, and K. G. Shin. Detecting SYN flooding attacks. INFOCOM, 2002.
[2] Qiu, Xiaofeng, Hao, Jihong, Chen, Ming. A mechanism to defend SYN flooding attack based on network measurement system. ITRE 2004 - 2nd International Conference on Information Technology: Research and Education.
[3] Yuan, Dongqing, Zhong, Jiling. A lab implementation of SYN flood attack and defense. SIGITE’08: Proceedings of the 9th ACM SIG-Information Technology Education Conference, 2008.
[4] Sun, Changhua, Fan, Jindou, Liu, Bin. A robust method to detect SYN flooding attacks. Proceedings of the Second International Conference on Communications and Networking in China, ChinaCom 2007.
[5] V. A. Siris, Fotini P. Application of Anomaly Detect Algorithms for Detecting SYN Flooding Attack. Elsevier Computer Communications, 29: 1433-1442, 2006.
[6] Nakashima, Takuo, Oshima, Shunsuke. A detective method for SYN flood attacks. First International Conference on Innovative Computing, Information and Control 2006.
[7] Dalia Nashat, Xiaohong Jiang, Susumu Horiguchi. Detecting SYN Flooding Agents under Any Type of IP Spoofing. ICEBE: Proceedings of the 2008 IEEE International Conference on e-Business Engineering.
[8] Seung-won Shin, Ki-young Kim, Jong-soo Jang. D-SAT: detecting SYN flooding attack by two-stage statistical approach. Applications and the Internet, 2005, Page(s): 430-436.
[9] Haggerty, J., Berry, T., Shi, Q., Merabti, M. DiDDeM: a system for early detection of TCP SYN flood attacks. GLOBECOM 2004.
[10] Tao Peng, Christopher Leckie, Kotagiri Ramamohanarao. Survey of Network-Based Defense Mechanisms Countering the DoS and DDoS Problems. ACM Computing Surveys Vol. 39, Issue 1. 2007.
[11] Jelena Mirkovic and Peter Reiher. Taxonomy of DDoS Attack and DDoS Defense Mechanisms. ACM SIGCOMM 2004.
[12] Guiyi Wei, Ye Gu, Yun Ling. An Early Stage Detecting Method against SYN Flooding Attack. Computer Science and its Applications, 2008.
[13] Bin Xiao, Wei Chen, Yanxiang He, Sha, E.H.-M. An active detecting method against SYN flooding attack. Parallel and Distributed Systems, 2005.
[14] Papagiannaki, K., Taft, N., Zhang, Z.-L., Diot, C. Long-term forecasting of Internet backbone traffic: observations and initial models. INFOCOM 2003.
[15] Sang A, Li S Q. A Predictability Analysis of Network Traffic [C]. Proceedings of IEEE INFOCOM’2000, Tel-Aviv, Israel, 2000-03.
[16] P. J. Brockwell, R. A. Davis. Introduction to Time Series and Forecasting. 2nd ed, Springer New York, 2002.
[17] Zare Moayedi, H., Masnadi-Shirazi, M.A. Arima model for network traffic prediction and anomaly detection. Information Technology, 2008.
[18] Zhou, Bo; He, Dan; Sun, Zhili. Traffic predictability based on ARIMA/GARCH model. 2006 2nd Conference on Next Generation Internet Design and Engineering, NGI 2006, p 200-207, 2006.
[19] tcpdump/libpcap, http://www.tcpdump.org
[20] Check Point Software Technologies Ltd. SynDefender: http://www.checkpoint.com/products/firewall-1.

Biographies

Sun Qibo received the Ph.D. degree in Communication and Electronic System from the Beijing University of Posts and Telecommunications, in 2002. He is currently an Associate Professor at the Beijing University of Posts and Telecommunications in China. He is a member of the CCF. His current research interests include network security, network intelligence and services.

Wang Shangguang is a Ph.D. candidate from Beijing University of Posts and Telecommunications in China. His major is Computer Science and Technology. His current research interests are network security.

Yan Danfeng received the Ph.D. degree in computer application from the Beijing University of Posts and Telecommunications, in 2007. He is currently an Associate Professor at the Beijing University of Posts and Telecommunications in China. His current research interests include network security and mobile internet service.

Yang Fangchun received the B.S. degree in computer communication, the M.S. degree in computer application, and the Ph.D. degree in Communication and Electronic System from the Beijing University of Posts and Telecommunications, in 1982, 1987 and 1990, respectively. He is currently a Professor at the Beijing University of Posts and Telecommunications in China. He is a council member of the CCF. He has published 6 books and more than 80 papers. His current research interests include network intelligence and services, communications software, soft switching technology and network security.
