On Estimating Strength of a DDoS Attack Using Polynomial Regression Model

B.B. Gupta1,2, P.K. Agrawal3, A. Mishra1, and M.K. Pattanshetti1

1 Department of Computer Science, Graphic Era University, Dehradun, India
[email protected]
2 Department of Electronics and Computer Engineering, Indian Institute of Technology Roorkee, Roorkee, India
3 Department of Computer Science, NSIT, New Delhi, India
Abstract. This paper presents a novel scheme to estimate the strength of a DDoS attack using a polynomial regression model. To estimate the strength of an attack, a relationship is established between attack strength and the observed deviation in sample entropy. Various statistical performance measures are used to evaluate the performance of the polynomial regression model. The NS-2 network simulator on a Linux platform is used as the simulation test bed for launching DDoS attacks with varied attack strength. The simulation results are promising: the fitted model estimates attack strength with a coefficient of determination of 0.96.
1 Introduction

DDoS attacks compromise the availability of an information system through various means [1,2]. One of the major challenges in defending against DDoS attacks is to accurately detect their occurrence in the first place. Anomaly based DDoS detection systems construct a profile of the traffic normally seen in the network and identify anomalies whenever traffic deviates from the normal profile beyond a threshold [3,4]. The extent of this deviation is normally not utilized. We use a polynomial regression [5,6] based approach that utilizes the extent of deviation from the detection threshold to estimate the strength of a DDoS attack. To measure the performance of the proposed approach, we have calculated various statistical performance measures, i.e., R2, CC, SSE, MSE, RMSE, NMSE, η, MAE and residual error [12]. Internet type topologies used for simulation are generated using the Transit-Stub model of the GT-ITM topology generator [7]. The NS-2 network simulator [8] on a Linux platform is used as the simulation test bed for launching DDoS attacks with varied attack strength.

The remainder of the paper is organized as follows. Section 2 contains an overview of the polynomial regression model. The detection scheme is described in Section 3. Section 4 describes the experimental setup and performance analysis in detail. Model development is presented in Section 5. Section 6 contains simulation results and discussion. Finally, Section 7 concludes the paper.

A. Abraham et al. (Eds.): ACC 2011, Part IV, CCIS 193, pp. 244–249, 2011. © Springer-Verlag Berlin Heidelberg 2011
2 Polynomial Regression Model

In its simplest form, regression analysis [9,10] involves finding the best straight-line relationship to explain how the variation in an outcome variable, Y, depends on the variation in a predictor variable, X. When there is only one explanatory variable the regression model is called simple regression, whereas if there is more than one explanatory variable it is called multiple regression. Polynomial regression [5,6] is a form of regression in which the relationship between the independent variable X and the dependent variable Y is modeled as an nth order polynomial. The general form of this regression model is as follows:
\[
Y_i = \hat{Y}_i + \varepsilon_i, \qquad
\hat{Y}_i = \beta_0 + \beta_1 X + \beta_2 X^2 + \cdots + \beta_n X^n \tag{1}
\]
Input and Output: In the polynomial regression model, a relationship is developed between the strength of a DDoS attack Y (output) and the observed deviation in sample entropy X (input). Here X is equal to (Hc - Hn), the difference between the entropy at the time of attack detection and the entropy of the normal profile. Our proposed regression based approach utilizes this deviation in sample entropy X to estimate the strength of a DDoS attack.
3 Detection of Attacks

An entropy [11] based DDoS detection scheme is used to construct a profile of the traffic normally seen in the network and to identify anomalies whenever traffic goes out of profile. Sample entropy is a metric that captures the degree of dispersal or concentration of a distribution. Sample entropy H(X) is

\[
H(X) = -\sum_{i=1}^{N} p_i \log_2(p_i) \tag{2}
\]

where $p_i = n_i / S$. Here $n_i$ represents the total number of bytes arriving for flow i in {t − Δ, t} and $S = \sum_{i=1}^{N} n_i$, i = 1, 2, ..., N. The value of sample entropy lies in the range 0 to log2 N. To detect an attack, the value of Hc(X) is calculated continuously in time window Δ; whenever there is an appreciable deviation from Hn(X), various types of DDoS attacks are detected. Hc(X) and Hn(X) give the entropy at the time of detection of the attack and the entropy value for the normal profile, respectively.
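The entropy computation in Eq. (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the use of an absolute deviation, and the threshold argument are hypothetical and not taken from the paper, which does not give its threshold value here.

```python
import math

def sample_entropy(byte_counts):
    """Return H(X) = -sum(p_i * log2(p_i)) with p_i = n_i / S.

    byte_counts holds n_i, the bytes seen per flow in one window.
    Flows with zero bytes contribute nothing to the sum.
    """
    total = sum(byte_counts)  # S in the text
    if total <= 0:
        return 0.0
    entropy = 0.0
    for n_i in byte_counts:
        if n_i > 0:
            p_i = n_i / total
            entropy -= p_i * math.log2(p_i)
    return entropy

def is_attack(h_current, h_normal, threshold):
    """Flag a window when |Hc - Hn| exceeds a threshold.

    The threshold is a placeholder parameter for this sketch.
    """
    return abs(h_current - h_normal) > threshold

# Four equally loaded flows reach the upper bound log2(N) = 2 bits.
uniform = sample_entropy([100, 100, 100, 100])
# All bytes in a single flow give the lower bound of 0 bits.
single = sample_entropy([400, 0, 0, 0])
```

The two sample calls illustrate the range stated above: entropy is 0 when traffic is concentrated in one flow and log2 N when it is spread evenly over N flows.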
4 Experimental Setup and Performance Analysis

Real-world Internet type topologies generated using the Transit-Stub model of the GT-ITM topology generator [7] are used to test our proposed scheme, where transit domains are treated as different Internet Service Provider (ISP) networks, i.e., Autonomous
Systems (AS). For simulations, we use an ISP level topology, which contains four transit domains, each containing twelve transit nodes, i.e., transit routers. Each of the four transit domains has two peer links at transit nodes with adjacent transit domains. The remaining ten transit nodes are connected to ten stub domains, one stub domain per transit node. Stub domains are used to connect transit domains with customer domains, as each stub domain contains a customer domain with ten legitimate client machines. So a total of four hundred legitimate client machines are used to generate background traffic. The legitimate clients are TCP agents that request files of size 1 Mbps with request inter-arrival times drawn from a Poisson distribution. The attackers are modeled by UDP agents. A UDP connection is used instead of a TCP one because in a practical attack flow the attacker would normally not follow the basic rules of TCP, such as waiting for ACK packets before the next window of outstanding packets can be sent. In our experiments, the monitoring time window was set to 200 ms. With this window size, the total number of false positive alarms is minimal while the detection rate remains high.
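To make the role of the 200 ms monitoring window concrete, the following sketch groups packet records into fixed windows and accumulates the per-flow byte counts that the entropy calculation needs. The record layout (timestamp in milliseconds, flow identifier, size in bytes) and all names are hypothetical, chosen only for illustration; the paper does not describe its trace format.

```python
from collections import defaultdict

WINDOW_MS = 200  # monitoring window from the text, in milliseconds

def bytes_per_flow_by_window(packets, window_ms=WINDOW_MS):
    """Group (timestamp_ms, flow_id, size_bytes) records into windows.

    Returns {window_index: {flow_id: total_bytes}}. Integer millisecond
    timestamps are assumed so that window boundaries are exact.
    """
    windows = defaultdict(lambda: defaultdict(int))
    for timestamp_ms, flow_id, size_bytes in packets:
        index = timestamp_ms // window_ms
        windows[index][flow_id] += size_bytes
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {index: dict(flows) for index, flows in windows.items()}

# Hypothetical records: three packets in window 0, one in window 1.
packets = [
    (50, "a", 500),
    (100, "b", 300),
    (150, "a", 200),
    (300, "a", 400),
]
counts = bytes_per_flow_by_window(packets)
```

Each inner dictionary then supplies the byte counts for one window, from which the per-flow proportions can be derived.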
5 Model Development

In order to estimate the strength of a DDoS attack (Ŷ) from the deviation (Hc - Hn) in entropy value, simulation experiments are done at attack strengths varying from 10 Mbps to 100 Mbps and at a fixed total number of zombies, i.e., 100. Table 1 lists the deviation in entropy against the actual strength of the DDoS attack. The polynomial regression model is developed by fitting the regression equation to the strength of attack (Y) and the deviation (Hc - Hn) in entropy value listed in Table 1. Figure 1 shows the regression equation and coefficient of determination for the polynomial regression model.

Table 1. Deviation in entropy with actual strength of DDoS attack

Actual strength of DDoS attack (Y), Mbps    Deviation in entropy (X)
10                                          0.149
15                                          0.169
20                                          0.184
25                                          0.192
30                                          0.199
35                                          0.197
40                                          0.195
45                                          0.195
50                                          0.208
55                                          0.212
60                                          0.233
65                                          0.241
70                                          0.244
75                                          0.253
80                                          0.279
85                                          0.280
90                                          0.299
95                                          0.296
100                                         0.319
[Figure 1: strength of attack (Mbps) plotted against deviation in entropy (X), with the fitted polynomial regression curve y = -1284.9x^2 + 1176.4x - 144, R2 = 0.9603]
Fig. 1. Regression equation and coefficient of determination for polynomial regression model
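A second-order fit of this kind can be reproduced with an ordinary least-squares solve. The sketch below uses only the Python standard library and the values listed in Table 1; the helper names are hypothetical, and solving the normal equations by Gaussian elimination is one possible method, not necessarily the one used by the authors. The fitted coefficients should land near those reported in Fig. 1, although rounding in the tabulated data may shift them slightly.

```python
# Table 1: deviation in entropy (X) and actual attack strength in Mbps (Y).
X = [0.149, 0.169, 0.184, 0.192, 0.199, 0.197, 0.195, 0.195, 0.208, 0.212,
     0.233, 0.241, 0.244, 0.253, 0.279, 0.280, 0.299, 0.296, 0.319]
Y = [float(v) for v in range(10, 101, 5)]

def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_polynomial(xs, ys, degree=2):
    """Least-squares fit via the normal equations.

    Returns [b0, b1, ..., b_degree] as in Eq. (1).
    """
    size = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(size)]
         for i in range(size)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    return solve_linear_system(a, b)

def predict(coeffs, x):
    """Evaluate b0 + b1*x + b2*x^2 + ... at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

coeffs = fit_polynomial(X, Y, degree=2)
r2 = r_squared(Y, [predict(coeffs, x) for x in X])
```

Once fitted, `predict(coeffs, x)` gives an estimated attack strength in Mbps for an observed entropy deviation x within the range covered by Table 1.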
6 Results and Discussion

We have developed the polynomial regression model as discussed in Section 5. Various performance measures are used to check the accuracy of this model.
[Figure 2: actual and predicted DDoS attack strength plotted against deviation in entropy]
Fig. 2. Comparison between actual strength of a DDoS attack and predicted strength of a DDoS attack using polynomial regression model M2
The predicted strength of attack can be computed using the proposed regression model and compared with the actual strength of attack. This comparison is depicted in Fig. 2. Table 2 contains the values of various statistical measures for the polynomial regression model. It can be inferred from Table 2 that, for the polynomial regression model, the values of R2, CC, SSE, MSE, RMSE, NMSE, η and MAE are 0.96, 0.98, 566.31, 29.81, 5.46, 1.06, 0.96 and 0.81, respectively. Hence the strength of a DDoS attack estimated using the polynomial model is close to the actual strength of the attack.

Table 2. Values of various performance measures

R2      CC      SSE       MSE      RMSE    NMSE    η       MAE
0.96    0.98    566.31    29.81    5.46    1.06    0.96    0.81
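Several of these measures can be recomputed from Table 1 and the regression equation reported in Fig. 1. The sketch below assumes the textbook definitions of SSE, MSE, RMSE, MAE and R2; the paper defers its exact definitions to [12], so values need not match Table 2 wherever those definitions differ. CC, NMSE and η are omitted because their definitions are not given in this paper. All function names are hypothetical.

```python
import math

# Table 1: deviation in entropy (X) and actual attack strength in Mbps (Y).
X = [0.149, 0.169, 0.184, 0.192, 0.199, 0.197, 0.195, 0.195, 0.208, 0.212,
     0.233, 0.241, 0.244, 0.253, 0.279, 0.280, 0.299, 0.296, 0.319]
Y = [float(v) for v in range(10, 101, 5)]

def published_model(x):
    """Regression equation reported in Fig. 1."""
    return -1284.9 * x ** 2 + 1176.4 * x - 144.0

def performance_measures(actual, predicted):
    """Compute standard error measures (assumed textbook definitions)."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    sse = sum(r * r for r in residuals)          # sum of squared errors
    mse = sse / n                                # mean squared error
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "SSE": sse,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(r) for r in residuals) / n,  # mean absolute error
        "R2": 1.0 - sse / ss_tot,
    }

measures = performance_measures(Y, [published_model(x) for x in X])
```

Dividing SSE by the 19 observations is itself an assumption; it is consistent with the reported ratio of SSE to MSE in Table 2.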
7 Conclusion and Future Work

This paper investigates how a polynomial regression model can be used to estimate the strength of a DDoS attack from the deviation in sample entropy. For this, a model is developed and various statistical performance measures are calculated. The results show that the strength of a DDoS attack estimated using the polynomial regression model is close to the actual strength (R2 = 0.96, RMSE = 5.46). Hence, the polynomial regression model is a useful method for estimating the strength of an attack.
References

1. Gupta, B.B., Misra, M., Joshi, R.C.: An ISP level Solution to Combat DDoS attacks using Combined Statistical Based Approach. International Journal of Information Assurance and Security (JIAS) 3(2), 102–110 (2008)
2. Gupta, B.B., Joshi, R.C., Misra, M.: Defending against Distributed Denial of Service Attacks: Issues and Challenges. Information Security Journal: A Global Perspective 18(5), 224–247 (2009)
3. Gupta, B.B., Joshi, R.C., Misra, M.: Dynamic and Auto Responsive Solution for Distributed Denial-of-Service Attacks Detection in ISP Network. International Journal of Computer Theory and Engineering (IJCTE) 1(1), 71–80 (2009)
4. Mirkovic, J., Reiher, P.: A Taxonomy of DDoS Attack and DDoS defense Mechanisms. ACM SIGCOMM Computer Communications Review 34(2), 39–53 (2004)
5. Stigler, S.M.: Optimal Experimental Design for Polynomial Regression. Journal of American Statistical Association 66(334), 311–318 (1971)
6. Anderson, T.W.: The Choice of the Degree of a Polynomial Regression as a Multiple Decision Problem. The Annals of Mathematical Statistics 33(1), 255–265 (1962)
7. GT-ITM Traffic Generator Documentation and tool, http://www.cc.gatech.edu/fac/EllenLegura/graphs.html
8. NS Documentation, http://www.isi.edu/nsnam/ns
9. Lindley, D.V.: Regression and correlation analysis. New Palgrave: A Dictionary of Economics 4, 120–123 (1987)
10. Freedman, D.A.: Statistical Models: Theory and Practice. Cambridge University Press, Cambridge (2005)
11. Shannon, C.E.: A mathematical theory of communication. ACM SIGMOBILE Mobile Computing and Communication Review 5(1), 3–55 (2001)
12. Gupta, B.B., Joshi, R.C., Misra, M.: ANN Based Scheme to Predict Number of Zombies in DDoS Attack. International Journal of Network Security 13(3), 216–225 (2011)