ISBN 978-952-5726-06-0 Proceedings of the 2009 International Workshop on Information Security and Application (IWISA 2009) Qingdao, China, November 21-22, 2009
Network Intrusion Detection by Support Vectors and Ant Colony 1
Qinglei Zhang 1 and Wenying Feng 2 Department of Computing and Software, Faculty of Engineering, McMaster University, Hamilton, ON, Canada, L8S 4L8 2 Departments of Computing & Information Systems Department of Mathematics Trent University, Peterborough, ON K9J 7B8 Email: {qingleizhang,
[email protected]}
Abstract—This paper presents a framework for a new approach in intrusion detection by combining two existing machine learning methods (i.e. SVM and CSOACN). The IDS based on the new algorithm can be applied as pure SVM, pure CSOACN or their combination by constructing the detection classifier under three different training modes respectively. The initial experiments indicate that performance of their combination is better than pure SVM in terms of higher average detection rate as well as lower rates of both negative and positive false and is better than pure CSOACN in terms of less training time with comparable detection rate and false alarm rates.
different suitable actions will be further taken for different types of intrusions. The main issue in misuse detection systems is the pattern recognition and signature depiction of the pertinent attack, which should encompass all possible variations but should not match non-intrusive activities. Until now, a great deal of time and effect has been invested in IDSs. Various approaches to model normal or attacks behaviors have been applied. For example, statistical approaches, predictive pattern generation, neural networks, expert systems, keystroke monitoring, model based intrusion detection, NSM (Network Security Monitor), autonomous agents, fuzzy logic network and data mining. However, it seems that neither the anomaly detection nor the misuse detection can detect all kinds of intrusion attempts on their own. The future research trend is converging towards the model which is a hybrid of the anomaly and misuse detection models.
Index Terms — Network security, network attack, Intrusion Detection Systems (IDS), Support Vector Machine (SVM), Ant Colony Network.
I. INTRODUCTION Intrusion detection [1, 2] is the detecting of actions that attempt to compromise the integrity, confidentiality or availability of a resource in the network. In case of an intrusion, an IDS (Intrusion Detection System) detects it as soon as possible and takes appropriate action. The combination of facts, such as the rapid growth of the Internet, the vast financial possibilities opening up in electronic trade, and the lack of truly secure systems, makes IDSs an important front edge research orientation of network security.
SVM (Support Vector Machine, see [3]) and CSOACN (Clustering based on Self-Organized Ant Colony Network, see [4]) are approaches in data mining to generate classifiers. Both methods have been applied in intrusion detection to separate the normal and abnormal network connecting records. From the machine learning point of view, the process of SVM is in supervising learning [5] whereas the CSOACN algorithm is unsupervised learning [6]. In this paper, we present a framework that aims to combine the two approaches in intrusion detection and therefore, to reach a better system performance. In Section 2, we briefly introduce SVM and CSOACN. The new algorithm is presented in Section 3. Some conclusions and future work are discussed in Section 4.
Although many different IDSs have been developed, their detection schemes generally fall into one of two categories: anomaly detection or misuse detection. Anomaly detectors look for behavior that deviates from normal use of system whereas misuse detectors look for behavior that matches a known attack scenario. The normal behavior is based on lots of innocuous factors and highly variable. Therefore, the selection of features to monitor is the main issue in anomaly detection. The approach of misuse detection is to model abnormal system behaviors at first and define any other behaviors as normal behavior. Namely, known intrusion attacks are represented in the form of pattern or signature, activities that match those attack scenarios can be detected and
© 2009 ACADEMY PUBLISHER AP-PROC-CS-09CN004
II. SUPPORT VECTOR MACHINES AND ANT COLONY A. Intrusion Detection by Support Vectors SVMs are a set of related supervised learning methods originally for pattern recognition and regression [3]. Every network connecting record is described by several
639
features of the connection and a record is represented as a data point. In the case of support vector machines, the data points are written as high dimensional vectors. Using some training vectors (collected history data), the SVMs separate the data points by generating a hyperplane between the normal and abnormal data points. Since there might have more then one hyperplanes separate the two classes of data, the question for SVMs is how to select the one that have the maximum margin classifier (see Figure 1). Therefore, SVMs are actually a quadratic optimization problem. Figure 2. SVM Active Learning Process
The traditional SVM algorithm is operated over the entire training data set. The number of training data points determines the dimension of the matrix for computing the kernel function, which influences the time of solving the QP problems. However, SVMs have the property that the points that do not lie on the margin are not necessary to be involved in the computation. Same decision function is found if some of the training data points, excepting the support vectors, are removed. Hence, for SVM, the number of training data points can be reduced without loosing accuracy. In order to reduce the number of training data, an active learning into SVM is introduced in [7]. Initially, a SVM classifier was trained by using only small amount of data points from the whole training data set. The SVM classifier was then gradually modified by adding new data points for SVM training. After each training process, the output classifier is used to separate the entire data. The recurrence of training a new SVM classifier can stop when a required correct classification rate is obtained. Figure 2 shows the framework of SVM active learning. Assume U is the data set of the entire candidate points for training and T is the data set of points for each SVM training. There are two independent parts of the SVM active learning machine: f and q. f is the SVM classifying function trained by T, q is a searching function to decide which new data points are selected from U into T for the next SVM training process.
a population of ant-like agents move objects on the 2-D grid to cluster similar objects into same regions. Object and ant are the two basic entries in the program. As an object is described by several of its features, each object can be denoted by an vector Oi and each feature of the object can be denoted by vij . All objects on the ant colony network for clustering thus can be denoted as ddimensional vectors as follows: ^O1 , O2 , O3 ,..., O N `, ® ¯Oi {vi1 , vi 2 , vi 3 ,..., vid }, i 1,2,..., N .
where, N is the number of objects and d is the dimension of features. Network connecting records described by several features can be viewed as objects in CSOACN. These objects belong to different classes (i.e., normal and different kinds of intrusions). As the profiles of both abnormal and normal data are defined as different clusters, intrusion detection classifiers with both anomaly and misuse detection pattern can be constructed by applying clustering. Figure 3 shows the process of ant colony based intrusion detection. As an ant colony network possesses properties such as flexibility, robustness, decentralization and self organization, it can suggest very interesting heuristics. Optimization and control algorithms based on swarm intelligence, including Ant Colony Optimization and Ant Colony Routing, are well known [8]. In the area of data mining, particularly for clustering purposes, there are also many studies using the metaphor of ant colonies [9].
B. Intrusion Detection by Ant Colony In the real world of self-organized ant colony network,
III. COMBINATION OF SVM AND ANT COLONY To take the advantages of both active SVM and ant colony, we propose the combination of the two algorithms to develop a new intrusion detection system. The idea is to use the SVM to find the support vectors and generate a hyperplane that separates normal and abnormal data while CSOACN is used to find data added to active SVM training set and finally generate models for normal data as well as for each class of abnormal data. To this end, the traditional SVM algorithm has to be modified. Figure 1. Linear Classifiers in Two-Dimensional Space
640
Ant clustering phase: The clustering process is divided into several clustering periods by clustering around certain objects each time. Those certain objects for each clustering period are support vectors (viewed as objects in this phase) marked by the last SVM training phase. After this phase, all objects within the neighborhoods of the support vectors will be selected for the next SVM training phase. Constructing classifiers: A binary classifier (i.e. SVM classifier) for intrusion detection can be constructed by the result generated in SVM training phase. Namely, all data can be separated into two cases. Although CSOACN focuses on the boundary clustering, the other data points meanwhile are also clustered and the classifier for each class can be generated. For the model of normal data, we only compute the centre models of the clusters which include the support vectors, thus the computing time for the model of normal data are decreased. Further, for classifications of the abnormal data, when the amount of training data from each abnormal class is small, the cluster centre models of each class can be computed without using too much time. If the amount of training data from one of those abnormal classes is large, more SVM hyperplanes can be generated to reduce the amount of that type of abnormal training data. In addition, as data have already be separated into two sides by SVM, the cluster centre model of each class can be calculated without including the data points that are in the other sides. Namely, for those classes within one side, we use SVM to minimize the reduction of detection rate which may be caused by the particular clustering process.
Figure 3. Intrusion Detection Based on CSOACN
The active learning SVM is a process of repeating SVM training based on different training data sets. The data set for each SVM training step is only part of the entire data set and the main problem of active learning SVM is how to select the training data points for each SVM training step. Since the separating hyperplane generated by SVM is dependent on the support vectors, the hyperplane can be modified gradually by adding points between the margins after each SVM training. The aim of the selection strategy for active learning SVM is trying to find out the support vectors among the entire training data points. As the support vectors are data points at the margin of each class, support vectors that are found after each SVM training process, and the points around those support vectors are more likely to be the support vectors of the entire data set. Therefore, after each SVM training process, clustering around the support vectors may be applied as the selection strategy for active learning SVM. Namely, the new training data set for the next SVM training consists of the support vectors and some of the points around them. Figure 4 shows the process of the new CSVAC (Combining Support Vectors with Ant Colony) algorithm. SVM and CSOACN are two interactive phases that are taken multiple times. The individual phases are described as the following.
Classifier modification: The classifiers can be modified gradually by repeating the three steps: SVM training phase, ant clustering phase and constructing the classifier. The repeating loops will terminate when both of the two classifiers (i.e. SVM classifier and CSOACN classifier) obtain a satisfying detection rate (denoted as RR in Figure 4) upon the entire training data. Furthermore, as it is time consuming to construct the classifiers and test the entire training data, it is more reasonable to repeat the SVM training phase and the ant clustering phase for a certain number of loops (denoted as N in Figure 4) before constructing and modifying the classifiers. The value of N is determined by experiments. The best value of N is the smallest number of loops that are needed to construct and test the classifiers one time before obtaining the satisfying detection rate.
SVM training phase: The training data for each SVM training are objects (viewed as data points in this phase) selected by the last ant clustering phase or initially selected randomly. In order to let the first generated hyperplane correctly separate normal data and different types of abnormal data, initially one data point is randomly selected from each type of data. After this phase, the support vectors among the selected data points will be found and marked for the next ant clustering phase.
Intrusion detection based on the final classifiers: Two final classifiers (i.e., SVM classifier and CSOACN classifier) are formed after the whole training process is completed. In order to classify a new data item, there are three methods: 1) using the single SVM classifier only; 2) using the CSOACN classifier only; 3) constructing a combination detection classifier based on both of the two classifiers generated by CSVAC. In the first case, the new data is classified as normal or abnormal by the hyperplane generated by SVM. In the second case, data
641
The initial experiments indicate that performance of the CSVAC is better than pure SVM in terms of higher average detection rate as well as lower rates of both negative and positive false and is better than pure CSOACN in terms of less training time with comparable detection rate and false alarm rates. It is also noticed that the effectiveness of the algorithm is comparable to the KDD99 [10] winner, which was used as the standard of the performance comparison in the experiments. The algorithm may be enhanced in some aspects. For example, the training and testing speeds may be improved by applying the dimension reduction on the input data. More experiments on performance evaluation are also expected. REFERENCES [1] S. Axelsson, “Research in intrusion detection systems: a
Figure 4. Flow Graph of the CSVAC Algorithm
survey technical report”, TR98 í 17, Chalmers University of Technology, Goteborg, Sweden, 1999.
is classified as normal or one of the intrusions by comparing the distance between the new data with the average centre models of each class on the feature space. In the third case, since two profiles are obtained for “normal” data by SVM and CSOACN respectively, in order to reduce the false negative of intrusion detection, a new data item is confirmed as normal data only when it is classified as “normal” by both of the two classifies (i.e. SVM and CSOACN). Otherwise, the new data item should be classified into another class which may depend on different applications.
[2] H. Debar and M. Dacier, A.Wespi, “A revised taxonomy of Intrusion Detection Systems”, Ann.Telecommun 55 (7, 8), PP. 361-378, 2000.
[3] J. C. Burges and Christopher, “A tutorial on support vector machines for pattern recognition”, DataMining and Knowledge Discovery 2, PP. 121-167, 1998.
[4] Y. Feng, J. Zhong, C. Ye, Z. Xiong and Z. Wu, “Intrusion detection classifier based on self-organizing ant colony networks clustering”, Information Assurance and Security 4, PP. 247~256, 2006.
[5] S. Kotsiantis, “Supervised machine learning: A review of classification techniques”, Informatics Journal 31 , PP. 249-268, 2007.
IV. CONCLUSIONS AND FUTURE WORK
[6] R. O. Duda, P. E. Hart and D. G. Stork, Unsupervised Learning and Clustering (2nd edition). Wiley, New York, ISBN 0-471-05669-3, P. 571, 2001.
In this paper, a framework for a new algorithm of intrusion detection is proposed by combining two existing machine learning methods (i.e. SVM and CSOACN). Based on the new algorithm, an IDS can be developed. The system can be used in different cases by construct the detection classifier under three different training modes (i.e. SVM mode, CSOACN mode, CSVAC mode). SVM training mode is suitable for the time intense case that only one binary classifier is required by training upon a small amount of labeled data. Namely, it only needs to distinguish the data of normal from abnormal. On the other hand, the CSOACN training mode is suitable for the preciseness intensive case and can solve multiclass problems upon both label and unlabeled data. The CSVAC mode, which is based on the combination of SVM and CSOACN, can be used to balance the performance of IDS in terms of efficiency and accuracy.
[7] D. Duan, S. Chen and W. Yang, “Intrusion detection system based on support vector machine active learning”, Computer: Engineering, 1000—3428 (2007) 01—0153— 03, January, 2007.
[8] M. Dorigo, E. Bonabeau and G. Theralulaz, “Ant algorithms and stigmergetic”, Future Generation Computer Systems, Volume 16, Issue 9, PP. 851-871, 2000.
[9] V. Ramos and A. Abraham, “Evolving a stigmergic selforganized data-mining”, Proc. of the Fourth International Conference on Intelligent Systems Design and Applications, Budapest, Hungary, PP. 725~730, 2004.
[10] X. Xu, “Adaptive intrusion detection based on machine learning: feature extraction, classifier construction and sequential pattern prediction”, 122 Journal of information Assurance and Security 4, PP. 237-246, December, 2006.
642