2011 European Intelligence and Security Informatics Conference
SVM Based Scheme for Predicting Number of Zombies in a DDoS Attack
P. K. Agrawal
B. B. Gupta
Satbir Jain
Department of Computer Engineering, Netaji Subhas Institute of Technology, Delhi University, New Delhi, India
Department of Computer Science, Graphic Era University, Dehradun, India
[email protected]
Department of Computer Engineering, Netaji Subhas Institute of Technology, Delhi University, New Delhi, India
Abstract: In recent times, Internet and network services have gained popularity due to rapid growth in information and telecommunication technologies. They have become the means for finance management, education, and global information services such as news, advertisements and many others. Denial of service attacks, and in particular distributed denial of service (DDoS) attacks, are the most common and harmful threats to Internet or network services. To design and develop reliable and secure network services, rapid detection of and quick response to these attacks are major concerns. In practice, no scheme completely detects or prevents DDoS attacks. Predicting the number of zombies in a DDoS attack helps to suppress the effect of the attack by filtering and rate limiting the most suspicious attack sources, or to improve the DDoS response system. In this paper, we present a machine learning approach based on support vector machine for regression to predict the number of zombies in a DDoS attack. A MATLAB implementation of support vector machine for regression and datasets generated using the NS-2 network simulator running on a Linux platform are used for training and testing. SVM for regression with various kernel functions and other parameters is compared for prediction performance using mean square error (MSE). Results show that the SVM based scheme has promising prediction performance for a small dataset.
Keywords: DDoS Attack, Support Vector Machine, Kernel Function, Intrusion Detection, Mean Square Error.

1. INTRODUCTION
Today, distributed denial of service (DDoS) attacks are the most common and harmful threats to Internet or network services. The many-to-one nature of a DDoS attack makes it more powerful and difficult to prevent. A DDoS attacker mostly targets the network bandwidth or connectivity, preventing the legitimate use of a service by producing an excessive surge of traffic toward a victim. A victim of a DDoS attack can suffer damage such as file corruption and system shutdown [1]. The impact of a DDoS attack can range from a minor inconvenience to the users of a website to financial losses for companies that run e-commerce websites. The simple logic structure and low resource requirement of a DDoS attack make it difficult to detect and prevent. At the same time, GUI based toolkits are available to attackers, which makes such attacks easier to perform [1, 2]. Security of network services or Internet resources is a major issue for system managers and researchers. System managers and researchers continually modify their approach to handle new attacks, and attackers in turn modify their attack tools to bypass the security systems that researchers and system managers develop. To design and develop reliable and secure network services, rapid detection of and quick response to these attacks are major concerns. In practice, no scheme can completely detect or prevent a DDoS attack. Predicting the number of zombies in a DDoS attack helps to suppress the effect of the attack by filtering and rate limiting the most suspicious attack sources, or to improve the DDoS response system. Figure 1 shows SVM based prediction of the number of zombies in a DDoS attack.

Figure 1: SVM based prediction of the number of zombies in a DDoS attack.

Good prediction performance with a small training dataset (less prior knowledge) helps in rapid detection of and quick response to these attacks.
In this paper, we present a machine learning approach based on support vector machine for regression to predict the number of zombies in a DDoS attack. Datasets generated using the NS-2 network simulator running on a Linux platform are used for training and testing. SVM for regression with various kernel functions and other parameters is compared for prediction performance using mean square error (MSE). Results show that the SVM based scheme has promising prediction performance for a small dataset. The remainder of this paper is organized as follows: Section 2 contains related work. Section 3 provides an overview of support vector machine (SVM). In Section 4, a detailed discussion of our approach and experimental setup is given. Section 5 contains results and discussion. Finally, Section 6 concludes the paper.

978-0-7695-4406-9/11 $26.00 © 2011 IEEE DOI 10.1109/EISIC.2011.19
2. RELATED WORK
In this paper, we present a machine learning approach based on support vector machine for regression to predict the number of zombies in a DDoS attack. Predicting the number of zombies in a DDoS attack helps to improve the DDoS response system or to suppress the effect of the attack by filtering and rate limiting the most suspicious attack sources. In the literature, there are many applications in which SVM based schemes are used effectively. Shin et al. [3] have used support vector machine in bankruptcy prediction. In this approach, the radial basis function is used as the kernel function of the SVM. The SVM approach is used for detecting the corporate failure data pattern from research data provided by a Korean credit firm, and the results are compared with a neural network based approach. Trafalis and Ince [4] have used support vector machine for regression to forecast stock prices from financial data. Advanced techniques such as machine learning play a vital role in financial forecasting. To predict daily stock prices, data from IBM, Yahoo and America Online are used. In [5], Bao et al. have proposed a network intrusion detection system based on support vector machine (SVM). In this approach, anomaly intrusion detection is combined with misuse intrusion detection based on the support vector machine. The support vector machine helps the intrusion detection system to achieve higher detection accuracy when limited prior knowledge (a small sample) is available. In [6], Zhang et al. have proposed a CH-SVM method that constructs a reduced training dataset based on the convex hull of the original huge dataset, reducing the time and space cost without loss of accuracy. In [7], Ivanciuc has utilized support vector machine for cancer diagnosis from the blood concentration of Zn, Ba, Se, Mg, Ca, and Cu. The generalization property of the support vector machine makes it more useful for medical diagnosis applications: it has the ability to make correct predictions for patterns that were not used in training.
In [8], Gupta et al. have used an ANN based scheme to predict the number of zombies in a DDoS attack. In the ANN based scheme, a feed forward neural network is used to predict the number of zombies. Neural networks have several limitations. First, it is difficult to find an appropriate neural network model that reflects the problem characteristics, due to the large number of controlling parameters and the number of processing elements in a layer [3]. Second, a neural network is based on the empirical risk minimization (ERM) principle [9]. ERM minimizes the error on the training data and suffers from a lack of good generalization performance. Stability is also an issue in neural networks. The support vector machine has several advantages over the neural network. First, the support vector machine has only two free parameters, the upper bound and the kernel function [3]. Second, the support vector machine employs the structural risk minimization (SRM) principle [9], which is known for better generalization performance. The support vector machine is capable of extracting an optimal global solution from a small training dataset because it captures the geometric picture of the feature space corresponding to the kernel function without deriving the weights of networks from the training data [3]. The above literature provides good motivation for using a machine learning approach based on support vector machine for regression to predict the number of zombies in a DDoS attack.
3. SUPPORT VECTOR MACHINE
Support vector machine (SVM) has become popular due to its strong performance in regression estimation [3], intrusion detection, medical diagnosis, pattern recognition and many other applications. The support vector machine was founded on statistical learning theory by Vapnik [9]. It employs the structural risk minimization (SRM) principle [4]. In classification [3], the objective is to find an optimal hyperplane that separates two classes. In regression [3], the aim is to find a hyperplane that covers as many data points as possible.

3.1 Support Vector Regression
Case 1: Linearly separable case
Suppose we have the training data $\{(p_1,q_1),(p_2,q_2),\ldots,(p_n,q_n)\} \subset P \times \mathbb{R}$, where $P$ denotes the space of the input patterns, for instance $\mathbb{R}^d$. In ε-insensitive support vector regression, the main objective is to find a function $f(p)$ that deviates by at most ε from the actually obtained targets $q_i$ for all training data and at the same time is as flat as possible [4]. Suppose $f(p)$ takes the following form:

$$f(p) = \langle w, p \rangle + b, \quad w \in P,\ b \in \mathbb{R}. \tag{1}$$

In Eq. (1), flatness means that one seeks a small value of $w$. The problem can be represented as a convex optimization problem:

$$\text{Minimize } \tfrac{1}{2}\|w\|^2 \quad \text{subject to} \quad \begin{cases} q_i - \langle w, p_i \rangle - b \le \varepsilon \\ \langle w, p_i \rangle + b - q_i \le \varepsilon \end{cases} \tag{2}$$

Here we assume that a function $f$ exists that approximates all pairs $(p_i, q_i)$ with precision ε. If this is infeasible, we allow some errors. Slack variables $\xi_i, \xi_i^*$ are introduced, and this is called the soft margin formulation [4]. It is described as follows:

$$\text{Minimize } \tfrac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}(\xi_i + \xi_i^*) \quad \text{subject to} \quad \begin{cases} q_i - \langle w, p_i \rangle - b \le \varepsilon + \xi_i \\ \langle w, p_i \rangle + b - q_i \le \varepsilon + \xi_i^* \\ \xi_i, \xi_i^* \ge 0 \end{cases} \tag{3}$$

where $C > 0$ is a constant. It determines the trade-off between the flatness of $f(p)$ and the amount up to which deviations larger than ε are tolerated. Eq. (3) can be solved more easily in its dual formulation. A Lagrangian function is constructed from the objective function and the corresponding constraints by introducing a dual set of variables [11]:

$$L = \tfrac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}(\xi_i + \xi_i^*) - \sum_{i=1}^{n}\alpha_i\,(\varepsilon + \xi_i - q_i + \langle w, p_i \rangle + b) - \sum_{i=1}^{n}\alpha_i^*\,(\varepsilon + \xi_i^* + q_i - \langle w, p_i \rangle - b) - \sum_{i=1}^{n}(\eta_i \xi_i + \eta_i^* \xi_i^*) \tag{4}$$

where $\alpha_i, \alpha_i^*, \eta_i, \eta_i^* \ge 0$. For optimality, the partial derivatives of $L$ with respect to $(w, b, \xi_i, \xi_i^*)$ have to vanish:

$$\begin{aligned} \partial L/\partial b &= \sum_{i=1}^{n}(\alpha_i^* - \alpha_i) = 0 \\ \partial L/\partial w &= w - \sum_{i=1}^{n}(\alpha_i - \alpha_i^*)\,p_i = 0 \\ \partial L/\partial \xi_i &= C - \alpha_i - \eta_i = 0 \\ \partial L/\partial \xi_i^* &= C - \alpha_i^* - \eta_i^* = 0 \end{aligned} \tag{5}$$

By substituting Eq. (5) in Eq. (4), the dual optimization problem can be represented as follows:

$$\text{Maximize } -\tfrac{1}{2}\sum_{i,j=1}^{n}(\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\langle p_i, p_j \rangle - \varepsilon\sum_{i=1}^{n}(\alpha_i + \alpha_i^*) + \sum_{i=1}^{n} q_i(\alpha_i - \alpha_i^*) \quad \text{subject to} \quad \sum_{i=1}^{n}(\alpha_i - \alpha_i^*) = 0,\ \ \alpha_i, \alpha_i^* \in [0, C] \tag{6}$$

Now Eq. (6) is solved for $w$ as follows: $w = \sum_{i=1}^{n}(\alpha_i - \alpha_i^*)\,p_i$, and by substituting $w$ in Eq. (1):

$$f(p) = \sum_{i=1}^{n}(\alpha_i - \alpha_i^*)\langle p_i, p \rangle + b \tag{7}$$

Case 2: Non-linearly separable case
We need a mapping from the input space to a feature space and try to find a hyperplane in the feature space [4]. With the use of a kernel function, the QP problem can be solved. For more details, readers can refer to [11]. For the optimal solution, we have

$$f(p) = \sum_{i=1}^{n}(\alpha_i - \alpha_i^*)\,K(p_i, p) + b \tag{8}$$

where $K(\cdot,\cdot)$ represents the kernel function. Two common types of kernel function are discussed below:
A. Radial basis function (RBF)

$$K(p_i, p_j) = \exp\!\left(-\frac{1}{2\sigma^2}\|p_i - p_j\|^2\right) \tag{9}$$

where $\sigma$ is the bandwidth of the radial basis function kernel [3].
B. Polynomial function

$$K(p_i, p_j) = (\langle p_i, p_j \rangle + 1)^r \tag{10}$$

where $r$ is the degree of the polynomial function.

4. OUR APPROACH AND EXPERIMENTAL SETUP
The prediction performance of SVM is highly dependent on the training data. Accurate training data improves the performance of the trained system. Training data collection is a critical task. Training data can be obtained in three ways: from simulated traffic, from real traffic, or from sanitized traffic [10]. In this paper, we used simulated traffic data.

4.1 Simulation Environment
To evaluate the performance of our scheme, simulated traffic data is used. Simulations are carried out using the NS-2 network simulator on a Linux platform. The transit-stub model of the GT-ITM topology generator is used to generate an Internet-type topology for simulation. For generating attack traffic, the total number of zombie machines ranges from 10 to 100. The total attack traffic rate is fixed at 25 Mbps. This means that the attack rate of each zombie varies from 0.25 Mbps (100 zombies) to 2.5 Mbps (10 zombies). In this experiment, the monitoring time window size is set to 200 ms. The simulations are repeated, and various attack scenarios are created by varying the total number of zombies at a fixed attack strength.

4.2 Data set Description
The dataset contains the deviation in entropy (Hc-Hn) and the actual number of zombies in a DDoS attack. To train and test [4] the support vector machine for regression, the obtained dataset is divided into two parts. The first part, the training data, contains 78.95% of the total data values and is used for training the support vector machine for regression. It is shown in Table 1.

Table 1: Training data - deviation in entropy with actual number of zombies in DDoS attack.

Actual Number of Zombies (Y)    Deviation in Entropy (X)
10                              0.045
15                              0.046
25                              0.050
30                              0.068
35                              0.087
40                              0.099
45                              0.111
55                              0.130
60                              0.139
65                              0.148
75                              0.163
80                              0.170
85                              0.176
90                              0.182
100                             0.192
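The figures quoted in Sections 4.1 and 4.2 can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the authors' MATLAB/NS-2 setup: the helper names are hypothetical, the data are copied from Table 1, and the count of four test samples is taken from Table 2. It assumes that the fixed 25 Mbps rate is shared equally among the zombies, which is what the quoted range of 0.25 Mbps to 2.5 Mbps implies.

```python
# Illustrative sketch only: numbers are taken from Section 4.1 and Table 1;
# the helper names are hypothetical and not from the authors' implementation.

TOTAL_ATTACK_RATE_MBPS = 25.0  # fixed aggregate attack rate (Section 4.1)

# Table 1: actual number of zombies (Y) -> deviation in entropy (X).
TRAINING_DATA = {
    10: 0.045, 15: 0.046, 25: 0.050, 30: 0.068, 35: 0.087,
    40: 0.099, 45: 0.111, 55: 0.130, 60: 0.139, 65: 0.148,
    75: 0.163, 80: 0.170, 85: 0.176, 90: 0.182, 100: 0.192,
}

# Actual numbers of zombies in the four test samples of Table 2.
TEST_ZOMBIES = [20, 50, 70, 95]


def per_zombie_rate(total_rate_mbps, n_zombies):
    """Attack rate of one zombie when the total rate is shared equally."""
    return total_rate_mbps / n_zombies


def training_share_percent(n_train, n_test):
    """Percentage of all samples that is used for training."""
    return 100.0 * n_train / (n_train + n_test)


if __name__ == "__main__":
    print(per_zombie_rate(TOTAL_ATTACK_RATE_MBPS, 10))    # fewest zombies
    print(per_zombie_rate(TOTAL_ATTACK_RATE_MBPS, 100))   # most zombies
    print(round(training_share_percent(len(TRAINING_DATA), len(TEST_ZOMBIES)), 2))
```

With 15 training samples and 4 test samples, the training share is 15/19, which rounds to the 78.95% stated in the text. In Table 1 the deviation in entropy also grows monotonically with the number of zombies, which is what makes it usable as a regression input.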
The second part, the testing data, contains randomly chosen data values from the original dataset. The test data samples used for testing the support vector machine for regression are shown in Table 2.

Table 2: Testing data - deviation in entropy with actual number of zombies in DDoS attack.

Actual Number of Zombies (Y)    Deviation in Entropy (X)
20                              0.018
50                              0.121
70                              0.157
95                              0.189

5. RESULTS AND DISCUSSION
5.1 SVM Training
In the training phase, the support vector machine is trained by giving the deviation in entropy (X) as input and the corresponding actual number of zombies in the DDoS attack (Y) as target value, using the training data shown in Table 1. A MATLAB implementation of support vector machine for regression is used. It offers several kernel functions such as polynomial, radial basis function (RBF), Gaussian, exponential etc. The polynomial and radial basis function (RBF) kernels are the most commonly used. One cannot say that one kernel function is better than the others, and there is no direct method to choose the best one. Once the kernel function is chosen, the other parameters of the kernel function must also be decided. We have used the radial basis function (RBF) and polynomial kernel functions with different values of the other parameters. The support vector machine employs the structural risk minimization (SRM) principle [4], which is known for better generalization performance with a small dataset (less prior knowledge).

5.2 SVM Testing
Once the training phase is completed, the trained support vector machine for regression (SVR) is ready for testing. From the testing data shown in Table 2, the deviation in entropy (X) is fed as input to the trained SVM. We conduct the experiment with various kernel functions (i.e. RBF, polynomial) and different sets of other parameters (i.e. upper bound C and ε) to compare the prediction performance. In Tables 3 to 6, the columns headed 20, 50, 70 and 95 give the predicted number of zombies for the test sample with that actual number of zombies.
Test results for the RBF kernel function with different values of C (500, 1000, 1500) and ε = 0.1 are shown in Table 3. From Table 3, we can see that as the value of C increases, the MSE decreases gradually, which reflects better prediction performance.
Test results for the polynomial kernel function with different values of C (500, 1000, 1500) and ε = 0.1 are shown in Table 4. From Table 4, we can see that as the value of C increases, the MSE again decreases gradually, which reflects better prediction performance.

Table 3: Testing results for RBF kernel function with ε = 0.1

Values of C (↓)    20       50       70       95       MSE in Testing
500                24.51    49.99    71.69    94.25    5.95
1000               21.47    50.09    71.52    95.06    1.12
1500               18.91    50.18    71.24    95.58    0.77

Table 4: Testing results for polynomial kernel function with ε = 0.1

Values of C (↓)    20       50       70       95       MSE in Testing
500                21.62    50.08    71.21    95.83    1.21
1000               21.05    50.14    71.16    95.90    0.8248
1500               20.67    50.15    71.09    95.97    0.6528

From Table 3 and Table 4, we can observe that SVM with the polynomial kernel function has better prediction performance than SVM with the RBF kernel function. Test results for the RBF kernel function with different values of C (500, 1000, 1500) and ε = 0.01 are shown in Table 5. From Table 5, we notice the same trend as with ε = 0.1.

Table 5: Testing results for RBF kernel function with ε = 0.01

Values of C (↓)    20       50       70       95       MSE in Testing
500                24.62    50.03    71.79    94.30    6.26
1000               21.68    50.08    71.43    94.98    1.23
1500               18.80    50.26    71.28    95.36    0.81

From Tables 3 and 5, we can see that a lower ε gives a higher MSE, i.e. lower prediction performance. Test results for the polynomial kernel function with different values of C (500, 1000, 1500) and ε = 0.01 are shown in Table 6. From Table 6, we can see that as the value of C increases, the MSE decreases gradually, which means better prediction performance.
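For reference, the two kernels compared in these experiments, Eqs. (9) and (10), and the kernel form of the regression function, Eq. (8), can be written out as a minimal Python sketch. This is not the MATLAB implementation used in the paper. The function names, the bandwidth name `sigma`, and the argument `coeffs` (standing for the differences of the dual variables in Eq. (8)) are assumptions made for illustration, and the sketch does not solve the dual problem of Eq. (6); it only evaluates the kernels and the resulting predictor for given coefficients.

```python
import math


def rbf_kernel(p_i, p_j, sigma):
    """RBF kernel, Eq. (9): exp(-||p_i - p_j||^2 / (2 * sigma^2)).

    `sigma` is the bandwidth (symbol name assumed here).
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(p_i, p_j))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def polynomial_kernel(p_i, p_j, r):
    """Polynomial kernel, Eq. (10): (<p_i, p_j> + 1)^r with degree r."""
    dot = sum(a * b for a, b in zip(p_i, p_j))
    return (dot + 1.0) ** r


def svr_predict(p, support_vectors, coeffs, b, kernel):
    """Kernel regression function, Eq. (8).

    coeffs[i] stands for (alpha_i - alpha_i*), which a real SVR solver
    would obtain from the dual problem; here it is simply an input.
    """
    return sum(c * kernel(s, p) for s, c in zip(support_vectors, coeffs)) + b
```

In this setting each input pattern is one-dimensional (the deviation in entropy), so a pattern would be passed as a one-element tuple, for example `rbf_kernel((0.045,), (0.046,), sigma)`. The values of `sigma` and `r` used in the paper's experiments are not reported in the text, so none are suggested here.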
Table 6: Testing results for polynomial kernel function with ε = 0.01

Values of C (↓)    20       50       70       95       MSE in Testing
500                21.60    50.19    71.25    95.68    1.16
1000               21.04    50.25    71.21    95.82    0.82
1500               20.67    50.21    71.10    96.01    0.68
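The MSE column of the result tables can be recomputed from the predicted values. The sketch below is an illustration in Python rather than the authors' MATLAB code. It assumes that the MSE is the plain mean of squared errors over the four test samples, which is not stated explicitly in the text but agrees with the reported values up to rounding of the tabulated predictions. The predictions are copied from Tables 3 and 4 (ε = 0.1).

```python
# Actual numbers of zombies in the four test samples (Table 2).
ACTUAL = [20, 50, 70, 95]

# Predicted numbers of zombies for eps = 0.1, keyed by the value of C.
TABLE3_RBF = {
    500: [24.51, 49.99, 71.69, 94.25],
    1000: [21.47, 50.09, 71.52, 95.06],
    1500: [18.91, 50.18, 71.24, 95.58],
}
TABLE4_POLY = {
    500: [21.62, 50.08, 71.21, 95.83],
    1000: [21.05, 50.14, 71.16, 95.90],
    1500: [20.67, 50.15, 71.09, 95.97],
}


def mse(actual, predicted):
    """Mean square error between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    for name, table in (("RBF", TABLE3_RBF), ("polynomial", TABLE4_POLY)):
        for c, predicted in sorted(table.items()):
            print(name, c, round(mse(ACTUAL, predicted), 4))
```

Recomputed this way, the values fall within about 0.02 of the reported MSE in every row, and the polynomial kernel gives the lower error at each value of C, matching the comparison drawn in the text.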
[Figure 2: bar chart of testing MSE (vertical axis 0 to 7) for C = 500, C = 1000 and C = 1500, grouped as: RBF with ε = 0.1, polynomial with ε = 0.1, RBF with ε = 0.01, polynomial with ε = 0.01]
Figure 2: Comparison of prediction performance for RBF and polynomial kernel functions with different sets of other parameter values.

Figure 2 shows a comparison of prediction performance for the RBF and polynomial kernel functions with different sets of other parameter values. From Figure 2, the best prediction performance is obtained with the polynomial kernel function with C = 1500 and ε = 0.1. Our results show that the predicted number of zombies is very close to the actual number of zombies and that the results are stable. The overall prediction performance of SVM is good for a small training dataset (less prior knowledge). Because only a small training dataset is needed, the scheme can also improve the performance of a DDoS response system by rapidly predicting the number of zombies in a DDoS attack with good prediction performance.

6. CONCLUSION
Security of the Internet infrastructure is always a challenging task. The DDoS attack is the most harmful threat to the Internet these days. In this paper, we have proposed an SVM based scheme for predicting the number of zombies in a DDoS attack. The SVM based scheme provides promising prediction performance with a small dataset (less prior knowledge). The accuracy and stability of the SVM based scheme improve the performance of a DDoS response system by rapidly predicting the number of zombies in a DDoS attack with good prediction performance.

REFERENCES
[1] C. Douligeris, A. Mitrokotsa, “DDoS attacks and defense mechanisms: classification and state of the art,” Computer Networks, volume 44, pp. 643-666, 2004.
[2] B. B. Gupta, R. C. Joshi, M. Misra, “Defending against Distributed Denial of Service Attacks: Issues and Challenges,” Information Security Journal: A Global Perspective, volume 18(5), pp. 224-247, 2009.
[3] K. Shin, T. Lee, H. Kim, “An application of support vector machines in bankruptcy prediction model,” Expert Systems with Applications, volume 28, pp. 127-135, 2005.
[4] T. B. Trafalis, H. Ince, “Support vector machine for regression and application to financial forecast,” In Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks, pp. 348-353, 2000.
[5] X. Bao, T. Xu, H. Hou, “Network Intrusion Detection Based on Support Vector Machine,” In Proc. of 2009 IEEE International Conference, pp. 1-4, Wuhan, 2009, DOI: 10.1109/ICMSS.2009.5304051.
[6] X. Zhang, C. Gu, “CH-SVM based network anomaly detection,” In Proc. of IEEE Sixth International Conference on Machine Learning and Cybernetics, Hong Kong, 2007.
[7] O. Ivanciuc, “Support vector machines for cancer diagnosis from the blood concentration of Zn, Ba, Mg, Ca, Cu, and Se,” Internet Electron. J. Mol. Des., pp. 418-427, 2002. http://www.biochempress.com.
[8] B. B. Gupta, R. C. Joshi, M. Misra, “ANN Based Scheme to Predict Number of Zombies in DDoS Attack,” International Journal of Network Security, vol. 13, no. 3, pp. 216-225, 2011.
[9] S. R. Gunn, “Support vector machine for classification and regression,” Technical Report, 10 May 1998.
[10] I. Ahmad, A. B. Abdullah, A. S. Alghamdi, “Application of artificial neural network in detection of DoS attacks,” In Proc. of ACM conference SIN’09, October 6-10, 2009, North Cyprus, Turkey.
[11] A. J. Smola, B. Schölkopf, “A tutorial on support vector regression,” NeuroCOLT2 technical report series, October 1998.