2012 ASE International Conference on Social Informatics (SocialInformatics 2012) / 2012 ASE International Conference on Cyber Security (CyberSecurity 2012) / 2012 ASE International Conference on BioMedical Computing
A Contextual Anomaly Detection Approach to Discover Zero-Day Attacks
Ahmed AlEroud, George Karabatis
Department of Information Systems, University of Maryland, Baltimore County (UMBC), Baltimore, USA
{Ahmed21, GeorgeK}@umbc.edu
connections with known attack context profiles created using conditional entropy. Known attack patterns detected in connections that fully match one or more attack context profiles are sent as alerts to security administrators. For connections that only partially match known attack context profiles, a profile similarity score is calculated and assigned to them, because a small variation (i.e. high similarity) between the feature values of an incoming connection and the attack context profile features may indicate a new attack pattern. We also utilize an anomaly detection approach based on the one class nearest neighbor anomaly detection algorithm (1-NN) [8]. New connections are compared to sampled normal profiles stored in an anomaly detection module. Based on efficient local density computations, the system generates and assigns anomaly scores to suspicious connections. The prototype executes in two phases. The first phase is static and occurs off-line; during this phase we create the misuse and anomaly detection models that are used in the second phase. The second phase occurs at run-time and utilizes the previously created models to detect known and unknown attacks. The contributions of this work are as follows. First, we utilize both attack profile similarity and anomaly detection to discover zero-day attack patterns. Second, using an implemented prototype system, we evaluate its accuracy in detecting new attack types through a series of experiments, which show that the proposed approach detects a large share of new attack types while reducing the rate of false positives. The rest of the paper is organized as follows. Section II provides background on misuse detection, anomaly detection, context-based intrusion detection, and dimensionality reduction approaches. Section III introduces our approach in detail. Section IV illustrates our experiments with the NSL-KDD intrusion detection dataset. Section V concludes our research and discusses the scope of future work.
Abstract—There is a considerable interest in developing techniques to detect zero-day (unknown) cyber-attacks, and considering context is a promising approach. This paper describes a contextual misuse approach combined with an anomaly detection technique to detect zero-day cyber attacks. The contextual misuse detection utilizes similarity with attack context profiles, and the anomaly detection technique identifies new types of attacks using the One Class Nearest Neighbor (1NN) algorithm. Experimental results on the NSL-KDD intrusion detection dataset have shown that the proposed approach is quite effective in detecting zero-day attacks. Index Terms—cyber security, zero-day attack, misuse detection, contextual anomaly, one class nearest neighbor.
I. INTRODUCTION
Intrusion Detection Systems (IDSs) play a significant role in monitoring connections between computer networks. Such systems are responsible for determining whether network activities are normal or attacks. IDSs are categorized as misuse detection or anomaly detection systems. In misuse detection systems, the IDS analyzes network connections and compares them to large databases of known attack signatures. In anomaly detection systems, the IDS defines the baseline, or normal state, of the network, then monitors network segments, compares their state to the normal baseline, and looks for anomalies. One of the major problems in current IDSs is the significant number of false alerts they generate. In addition, IDSs rely on human intervention to create new attack signatures [1]. If such signatures do not properly describe new attack patterns, simple modifications of those patterns allow the attack to succeed. Zero-day attacks are considered the ultimate challenge in the cyber security domain [2]. Few studies have addressed the zero-day attack detection problem, and most of them utilized unsupervised anomaly detection techniques to discover new types of attacks [3,4,5,6]. Although such techniques are capable of detecting zero-day attack patterns, they produce large and unmanageable numbers of false alerts [7]. To address the above problems, we introduce a contextual misuse and anomaly detection approach that can handle both known and unknown types of attacks. An implemented prototype system matches the features of real network
II. BACKGROUND AND RELATED WORK
Several works have addressed the problem of detecting known attacks using misuse detection approaches. These approaches focus on simple correlation and signature matching [9, 10, 11]. The common problem with most of these approaches is their inability to detect zero-day attacks whose signatures do not already exist in the system. Context has been utilized in different computing areas where it is vital for the system to be aware of the current situation. The ultimate goal of creating context-aware modules in IDSs is to minimize the dependency on humans who decide what is considered "in context" or "out of context", and how to react accordingly. In [12] the authors studied the effect of correlating IDS attack signatures with static and dynamic network information to derive network context. However, this approach did not include any component to detect zero-day attack types. Context correlation was used in [13] to detect events that have the same attack context. Contextual anomaly detection refers to the problem of identifying patterns in data that do not conform to expected behavior in a particular context [14]. Song et al. [15] assumed that object properties can be partitioned into contextual and behavioral attributes. The authors of [16] focus on identifying contextual anomalies in IDS data using behavioral attributes. The zero-day or unseen attack detection problem has also been addressed using machine learning approaches such as Support Vector Machines (SVM) [1, 2] and clustering [5, 17]. Zhichun et al. [18] proposed a model to detect zero-day worms by analyzing the invariant content of polymorphic worms. In [19] the authors introduced an approach that can detect zero-day attacks from IDS alerts.
uncertainty about the possibility of particular attacks. Fourth, the Anomaly Detection module stores a set of normal profiles sampled from the audit network connections that are labeled as normal. The normal profiles are used by the one class nearest neighbor anomaly detection algorithm (1-NN) [8] to detect deviation from normal activities by calculating the anomaly score of a suspicious connection record. In addition, the Singular Value Decomposition (SVD) dimensionality reduction technique is utilized to reduce the number of dimensions (i.e. features) of the normal profiles and the incoming connections.
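The SVD step described above can be sketched as follows. This is a minimal illustration, not the prototype's implementation (which used Matlab and Weka): the function names `svd_reduce` and `project` are hypothetical, NumPy is assumed, and no centering or scaling is applied because the text does not say whether any was used.

```python
import numpy as np


def svd_reduce(normal_profiles, k):
    """Reduce a normal-profile matrix A (rows = profiles) to k dimensions.

    Decomposes A = U * diag(s) * Vt and projects the rows of A onto the
    k right singular vectors with the largest singular values.
    Returns the reduced profiles and the projection basis.
    """
    A = np.asarray(normal_profiles, dtype=float)
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    basis = Vt[:k].T          # shape: (n_features, k)
    return A @ basis, basis


def project(connection, basis):
    """Map an incoming connection record into the same reduced space."""
    return np.asarray(connection, dtype=float) @ basis


# Toy data: 5 normal profiles with 4 numerical features each (made-up values).
A = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 1.0, 0.0, 1.0],
              [0.0, 1.0, 5.0, 2.0],
              [3.0, 3.0, 1.0, 0.0],
              [1.0, 0.0, 2.0, 2.0]])
reduced, basis = svd_reduce(A, k=2)
print(reduced.shape)                           # (5, 2)
print(project([1.0, 1.0, 1.0, 1.0], basis).shape)  # (2,)
```

Reusing the same `basis` for the normal profiles and for each incoming connection keeps distances in the reduced space comparable.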
III. THE APPROACH
In this section, we provide a short overview of our misuse and anomaly attack detection approach and a description of the prototype system. Fig. 1 illustrates the major modules and processes in the system. The system consists of several static components and processes. First, the Audit Network Data Repository (i.e. training data) is a database that stores events collected by network sensors along with their characteristics, such as connection protocol, connection duration, services requested, and packet flow statistics. We populate the Audit Network Data Repository with information obtained from the publicly available intrusion detection dataset NSL-KDD [20]. The events in the dataset are presented in a high-level format called connection records. Each connection record consists of 41 features. The connection records are labeled as normal or attacks, where the attacks fall into one of the following four categories: user to root (U2R), remote to local (R2L), denial of service (DoS), and probe. Second, the Data Preprocessing module converts numerical features of the connection records into bins; we carried out binning to facilitate the process of creating known attack context profiles from the connection record features. Third, the Misuse Detection module consists of several known attack context profiles formed by utilizing the conditional entropy of attacks based on the features of previous connections; hence, they serve as context quantifiers, meaning that the occurrence of particular events (i.e. feature values) during the connection may increase or decrease the
Fig. 1. System components and processes (outputs: known attacks; unknown attacks / normal)
There are also several processes that are performed at run time to measure the detection capabilities of the system. The incoming connection records used to test the system in real time are stored in the Evaluation Connections repository. We populate the Evaluation Connections repository with new attacks that do not exist in the Audit Network Data repository and have no context profiles. During run-time, a connection record under evaluation is passed through the known attack context profiles for a possible match. The connection record features are matched with the features of the attack context profiles. If a full match is detected, that is, if the connection record features match all features of an attack context profile, an alert is raised about that attack and the process stops at this point. However, if the connection record only partially matches a context profile, we need to investigate further for a possible zero-day attack, and the connection record is processed in one of the following two ways. Α. We generate the connection record profile similarity scores {S1… Sn} by calculating the similarity between the connection record features and the attack context profiles. The maximum similarity score Si is compared with a user-defined threshold w, which is used to decide whether the connection record is a new attack pattern or a normal activity.
A small variation (i.e. high similarity) between the feature values of the incoming connection and attack context profile features may indicate a new attack pattern. Β. We calculate the connection record anomaly score S’ by passing the connection record to the anomaly detection module. The one class nearest neighbor algorithm is used to compare the connection record features with normal profiles. The anomaly score S’ is compared with a user defined threshold to decide whether the connection record is a new attack pattern or a normal behavior. At the end of the process the prototype may raise an alert about the possible attacks based on one of these two scores or both of them. Later we will also utilize some scenarios where both scores are applied in conjunction. Next, we describe the misuse detection and the anomaly detection modules.
connection C, there is at least one matched value in the corresponding context profile.

TABLE I. A SAMPLE OF ATTACK CONTEXT PROFILES USING CONDITIONAL ENTROPY FOR SOME R2L ATTACKS

Attack / Feature   Service        Root shell   # of root   Flags         # Compromised   # Accessed files
Guess Password     telnet ∨ ftp   0            0           RSTO ∨ RSTR   0               0
Imap               Imap           0            0           S0 ∨ S1       0               0
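The full-match rule stated above (every feature value of the connection must appear among the values allowed by the profile entry for that feature) can be illustrated with a short sketch. The dictionary layout, the function name `fully_matches`, and the feature keys are assumptions made for illustration; the profile values are the Guess Password entries as far as they can be read from Table I.

```python
def fully_matches(connection, profile):
    """Return True when every profile feature admits the connection's value.

    connection: dict mapping feature name -> observed value
    profile:    dict mapping feature name -> set of admissible values
    """
    return all(connection.get(feature) in allowed
               for feature, allowed in profile.items())


# Guess Password profile; the feature keys are illustrative names.
GUESS_PASSWORD = {
    "service": {"telnet", "ftp"},
    "flag": {"RSTO", "RSTR"},
    "root_shell": {0},
    "num_root": {0},
    "num_compromised": {0},
    "num_access_files": {0},
}

conn = {"service": "telnet", "flag": "RSTO", "root_shell": 0,
        "num_root": 0, "num_compromised": 0, "num_access_files": 0}

print(fully_matches(conn, GUESS_PASSWORD))                    # True
print(fully_matches({**conn, "flag": "SF"}, GUESS_PASSWORD))  # False
```

A connection that fails this test for every profile is not discarded; it moves on to the partial-match scoring.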
If no context profile fully matches the connection record features, the connection record is partially matched with attack context profiles. This is accomplished by calculating a profile similarity score S as similarity between the connection record features and context profile features. The resulting similarity scores {S1… Sn} are used to decide whether such connection records are new attacks or normal activities.
A. Misuse Detection Module & Known Attack Context Profiles
The main purpose of the misuse detection module is to detect attacks that are relevant and specific to a particular context and, at the same time, to identify activities that are normal, thus reducing the number of false positive alerts. The misuse detection module consists of several known attack context profiles created using conditional entropy [21]. In the context of cyber security, conditional entropy can be defined as the amount of information needed to infer the degree of uncertainty about one event based on the occurrence of another. We use conditional entropy to create known attack context profiles using patterns from historical data. The conditional entropy is described by [21] as the entropy of a variable A given another variable F, and it is calculated using the sum of entropies for all values of variable A given values of variable F. It is denoted by
B. Profile Similarity Score Calculation
Connection records whose features partially match the features of some known attack context profiles may reveal suspicious patterns. If the similarity between the features of an attack context profile and a connection record C is high, this may indicate a new attack pattern in that connection. The new attack pattern may be similar but not identical to the attack described by that profile. We utilize the simple matching similarity coefficient [22] to calculate the similarity values {S1… Sn} between n context profiles and each connection record that partially matches such context profiles (i.e. similarity ≠ 1). The coefficient compares the features of an attack's context profile, represented as a vector, with the corresponding features of the connection record C, represented as a second vector. The profile similarity Si between the two vectors is calculated from the number of matches and mismatches between them. The maximum similarity score from {S1… Sn} is selected to decide whether the connection record is a zero-day attack. If the score is very low, this may indicate that the connection record is indeed normal activity. If the score is relatively high, it may indicate an unknown attack. A user-defined similarity threshold w is used to decide whether the connection record is a new attack pattern or normal behavior. We further implemented an anomaly detection module to check for anomalous behavior in the connection record features.
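The score calculation above can be sketched as follows, assuming the simple matching coefficient is taken as matches divided by matches plus mismatches over the features of a profile, and that a feature counts as a match when the connection's value is among the values the profile entry allows. The function names, the dictionary layout, and the toy profiles are hypothetical.

```python
def simple_matching(profile, connection):
    """Simple matching coefficient: matches / (matches + mismatches)."""
    matches = sum(1 for feature, allowed in profile.items()
                  if connection.get(feature) in allowed)
    return matches / len(profile)


def max_similarity(profiles, connection):
    """Return (profile name, score) for the best-matching context profile."""
    scores = {name: simple_matching(p, connection)
              for name, p in profiles.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Two toy context profiles with four features each (illustrative values).
profiles = {
    "guess_password": {"service": {"telnet", "ftp"}, "flag": {"RSTO", "RSTR"},
                       "root_shell": {0}, "num_root": {0}},
    "imap": {"service": {"imap"}, "flag": {"S0", "S1"},
             "root_shell": {0}, "num_root": {0}},
}
conn = {"service": "telnet", "flag": "SF", "root_shell": 0, "num_root": 0}

name, score = max_similarity(profiles, conn)
print(name, score)  # guess_password 0.75
```

The returned maximum would then be compared with the similarity threshold w to label the connection.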
$H(A \mid F) = -\sum_{f \in F} \sum_{a \in A} p(a, f) \log p(a \mid f)$ (1)
where $p(a, f)$ is the joint probability of variables A and F taking the values a and f, respectively, and $p(a \mid f)$ is the conditional probability of a given f. Conditional entropy is a useful measure for context quantization, as it measures the average remaining uncertainty about variable A when a particular value of feature F is known to have occurred. We calculate the conditional entropy of a particular attack a conditioned on the occurrence of value f of feature F. We created 15 unique context profiles. Table I shows the context profiles for two attacks in the R2L attack category. The features used in the context profile of each attack are the ones that give the lowest conditional entropy values. Each feature generates a single context profile entry; however, some profile entries may contain more than one value of such a feature (e.g. service = telnet ∨ ftp). The first row in Table I is the context profile for the Guess Password attack. The Guess Password profile will fully match the features of connection C only if, for each feature value in
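As an aside on Eq. (1), the following sketch estimates the conditional entropy H(A | F) from labeled observations using relative frequencies. It is only an illustration: the function name is hypothetical, base-2 logarithms are an assumption, and the paper's per-attack, per-feature-value computation may differ in detail from this averaged form.

```python
from collections import Counter
from math import log2


def conditional_entropy(pairs):
    """Estimate H(A | F) from an iterable of (attack_label, feature_value)."""
    pairs = list(pairs)
    n = len(pairs)
    joint = Counter(pairs)                      # counts of (a, f)
    feature_counts = Counter(f for _, f in pairs)  # counts of f
    h = 0.0
    for (a, f), count in joint.items():
        p_joint = count / n                     # p(a, f)
        p_cond = count / feature_counts[f]      # p(a | f)
        h -= p_joint * log2(p_cond)
    return h


# The feature value fully determines the label: no remaining uncertainty.
print(conditional_entropy([("attack", "telnet"), ("attack", "telnet"),
                           ("normal", "smtp"), ("normal", "smtp")]))  # 0.0

# The feature value tells nothing about the label: one full bit remains.
print(conditional_entropy([("attack", "telnet"), ("normal", "telnet"),
                           ("attack", "smtp"), ("normal", "smtp")]))  # 1.0
```

Features with low conditional entropy for an attack are the informative ones, which is why the profiles keep the features with the lowest values.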
C. Anomaly Detection Module
We utilize the One Class Nearest Neighbor algorithm (1-NN) [8] to detect anomalies that may reveal zero-day attacks. Nearest neighbor anomaly detection techniques require distance or similarity functions to be defined between instance features (i.e. connection record features). Several distance measures can be utilized; we chose Euclidean distance since it is one of the most widely used measures in anomaly
detection, and it has shown very good results in this area. The 1-NN method is applied as follows. First, a model for the class corresponding to the context of normal behavior is created. We created such a model as a set of normal connection profiles sampled from the Audit Network Data repository. Once such a model is created, a connection record C that partially matches attack context profiles can be passed to the anomaly detection module. The anomaly detection module accepts such a connection as a zero-day attack (i.e. rejects it as normal activity) when its local density is less than or equal to the local density of its nearest neighbor selected from the normal connection profiles. Usually the first nearest neighbor is used for local density estimation. The following acceptance function calculates the anomaly score S' of connection record C, which is then compared to a user-defined threshold; the connection is accepted as an attack when the score exceeds the threshold.

$S'(C) = \dfrac{\lVert C - NN^{NP}(C) \rVert}{\lVert NN^{NP}(C) - NN^{NP}(NN^{NP}(C)) \rVert}$ (2)

repository, and to create the known attack context profiles. We did not create any context profiles for the 14 new types of attacks. The selected connection records were distributed as follows: 262,178 connections are labeled as attacks and distributed among all 4 categories (DoS, R2L, U2R, and probe), and 237,822 are labeled as normal. We also used 77,289 connection records from the NSL-KDD testing dataset. The testing dataset includes several unknown types of attacks that do not exist in the training dataset. The testing data consists of 29,378 attack connections and 47,911 normal connections. We used the testing data to populate our Evaluation Connections repository. Table II shows some sample sequences of network connection records in the dataset. The last column indicates whether the record is part of a normal activity or an attack. This column is also used to verify the prediction accuracy of our system.
TABLE II. SAMPLE SEQUENCES OF NETWORK CONNECTION RECORDS
The equation above compares the distance from a connection record C to its nearest neighbor $NN^{NP}(C)$, selected from the set of normal profiles (NPs); let us call this distance d1. We also need the distance from this nearest neighbor $NN^{NP}(C)$ to its own nearest neighbor $NN^{NP}(NN^{NP}(C))$; let us call this distance d2. We detect and label the connection record as a zero-day attack based on the ratio between d1 and d2. We implemented several modifications of 1-NN to improve its computational time, such as sampling, pre-calculating the nearest neighbors of the normal samples, and dimensionality reduction using Singular Value Decomposition. We used SVD to reduce the number of numerical dimensions (i.e. features) of both connection C and the normal profiles. Finally, both the profile similarity score and the anomaly score of a connection record are used together in order to predict zero-day attacks. Two major reasons led us in this direction. First, we want to measure the effect of incorporating both the profile similarity and the anomaly scores on the accuracy of our system's predictions. Second, we aim to minimize the dependency on the anomaly detection module: due to the large number of connections, 1-NN may take considerable time to calculate the anomaly scores.
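The d1/d2 ratio described above can be sketched as follows. This is a minimal brute-force illustration under stated assumptions, not the prototype's code: NumPy is assumed, the function names are hypothetical, the normal profiles are assumed to contain no exact duplicates, and the optimizations mentioned in the text (sampling, pre-computed neighbors, SVD) are left out.

```python
import numpy as np


def nearest(point, data, exclude=None):
    """Index of and Euclidean distance to the nearest row of `data`."""
    dist = np.linalg.norm(data - point, axis=1)
    if exclude is not None:
        dist[exclude] = np.inf   # a profile is not its own neighbor
    i = int(np.argmin(dist))
    return i, float(dist[i])


def anomaly_score(connection, normal_profiles):
    """Ratio d1 / d2 for a connection against the normal profiles."""
    X = np.asarray(normal_profiles, dtype=float)
    c = np.asarray(connection, dtype=float)
    i, d1 = nearest(c, X)                 # C to its nearest normal profile
    _, d2 = nearest(X[i], X, exclude=i)   # that profile to its own neighbor
    # Guard for duplicate profiles (d2 == 0); treated as maximally anomalous.
    return d1 / d2 if d2 > 0 else float("inf")


normal = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(anomaly_score([0.1, 0.0], normal))  # close to normal data: small ratio
print(anomaly_score([5.0, 0.0], normal))  # far from normal data: large ratio
```

Because d2 depends only on the normal profiles, it can be computed once per profile ahead of time, which corresponds to the pre-calculation of nearest neighbors mentioned above.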
Protocol   Service   Source Bytes   Destination Bytes   # of Failed Logins   Attack / Normal Label
TCP        SMTP      3170           329                 0                    Normal
TCP        Telnet    295            753                 1                    Guess Password
TCP        Telnet    281            1301                0                    Load Module
Our experiments were performed on a server with an Intel Pentium D Dual Core 3.4 GHz CPU and 8 GB RAM running 64-bit Windows. We performed several types of experiments to measure our system's accuracy in detecting known attack types, which have context profiles, and new types of attacks, which have no context profiles. Although the main focus is on the detection rate of new attack types, we also achieved competitive results in detecting known attack types. For more details about our system's performance in detecting known attack types, refer to our previous work in [29]. In our experiments, we measure the performance of the system modules in detecting new attack types using:
• Profile similarity score without anomaly score
• Anomaly detection score without profile similarity score
• Anomaly detection score to refine profile similarity score predictions
We used three evaluation metrics in our experiments: the true positive rate (TP), the false positive rate (FP), and the detection accuracy (AC). The true positive rate (TP) is the proportion of new zero-day attacks that were correctly predicted as attacks. The false positive rate (FP) is the proportion of normal connections that were incorrectly predicted as attacks. We also utilized accuracy (AC) to find the best combination of TP and FP. The accuracy is the proportion of all predictions that were correct. Table III explains the metrics calculation. As shown in the table, a refers to the number of normal connections predicted as normal by our system, b is the number of normal connections predicted as attacks, c is the number of attack connections predicted as normal, and d is the number of attack connections predicted as attacks.
IV. EXPERIMENTS AND EVALUATION
We performed our experiments using the NSL-KDD dataset. This dataset addresses the problems of the original KDD 1999 intrusion detection dataset [24], whose drawbacks were discussed in [25, 26]. We developed our system using an Oracle database. Additionally, we used the Weka [27] and Matlab [28] software packages to perform data preprocessing and dimensionality reduction tasks, and to create the known attack context profiles. The dataset was divided into two parts: the training part and the testing part. The training part consists of 22 types of attacks as well as normal connections. The testing part contains 14 zero-day (new types of) attacks that do not exist in the training part. These new attacks are used to test whether the prototype can uncover and recognize them as attacks. We selected 500,000 connection records from the NSL-KDD training dataset to populate the audit network data
10% of the normal connections from the Audit Network Data repository. Fig. 3 shows the results of the second experiment using only the anomaly score S’. Both TP and FP rates are high at low values of the anomaly threshold. At higher threshold values, the FP rate decreases. The best combination of TP and FP rates is achieved at an anomaly threshold of 1.6, with values of 0.8 and 0.17, respectively. The accuracy achieved at a threshold of 1.6 is about 0.86. Based on the results of experiments I and II, the anomaly detection score achieves a better detection rate than the profile similarity score. A likely explanation is that the anomaly detection component measures the deviation of the new attack types from the normal profiles directly, whereas simple profile matching (i.e. profile similarity) only measures their closeness to known attacks.
TABLE III. SYSTEM EVALUATION METRICS CALCULATION

                 Predicted Normal   Predicted Attack
Actual Normal    a                  b
Actual Attack    c                  d
TP, FP, and AC rates are calculated as:
$TP = \dfrac{d}{c + d}$ (8)
$FP = \dfrac{b}{a + b}$ (9)
$AC = \dfrac{a + d}{a + b + c + d}$ (10)
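The three metrics follow directly from the four confusion-matrix counts defined in Table III. The sketch below uses made-up counts purely for illustration; they are not results from the experiments, and the function name is hypothetical.

```python
def rates(a, b, c, d):
    """TP, FP and AC from confusion-matrix counts.

    a: normal predicted as normal    b: normal predicted as attack
    c: attack predicted as normal    d: attack predicted as attack
    """
    tp = d / (c + d)                  # share of attacks that were caught
    fp = b / (a + b)                  # share of normal traffic flagged
    ac = (a + d) / (a + b + c + d)    # share of all predictions correct
    return tp, fp, ac


# Hypothetical counts: 100 normal and 100 attack connections.
print(rates(a=80, b=20, c=10, d=90))  # (0.9, 0.2, 0.85)
```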
Experiment I - Profile Similarity Score Detection Rate: The purpose of this experiment is to measure the detection rate of zero-day attacks using only the profile similarity score S of the connection records under evaluation. Such connection records do not fully match any attack context profile; hence they can be normal or new attack types. This experiment is carried out by varying the profile similarity score threshold w from 0.1 to 0.9. For any connection C, if its profile similarity score is greater than w, it is labeled as an attack. Otherwise the connection is labeled as normal.
Fig. 3. TP, FP rate of zero-day attack detection using anomaly score S’ (x-axis: anomaly score threshold, 0.5 to 2; y-axis: attacks detected)
Experiment III - Using Anomaly Detection Score to Refine Profile Similarity Predictions: In this experiment we utilize anomaly detection to refine the results generated by the profile similarity score. The purpose of this experiment is twofold: first, to reduce the computational time that would be required if all connection records were passed to the anomaly detection module; second, to reduce the false positive rate of the profile similarity score. The experiment is conducted as follows: we select the best-performing profile similarity threshold value w, which was 0.8 in experiment I.
Fig. 2. TP, FP rate of zero-day attack detection using profile similarity score S (x-axis: profile similarity threshold, 0.1 to 0.9; y-axis: attacks detected)
Fig. 2 shows the results of the first experiment. The TP rate is very high when the value of the threshold w is very small. At w = 0.1, the true positive rate is about 0.98. The false positive rate is also high at low values of w. At higher values of w, the false positive rate decreases, though we start missing some true positives (i.e. attacks) as well. To find the best threshold w, we measure the accuracy at each threshold value. Our system achieves its best trade-off between TP and FP rates at w = 0.8, where the TP rate is almost 0.67 and the FP rate is about 0.22. The overall accuracy at w = 0.8 is about 0.71. Thus, using only the profile similarity score, the system is able to detect about 0.67 of the new attack types with a false positive rate of about 0.22.
Fig. 4. TP, FP rate of zero-day attack detection using the anomaly detection score S’ to refine the profile similarity score S predictions (x-axis: anomaly score threshold, 0.5 to 2; y-axis: attacks detected)
Each connection record whose profile similarity score is less than 0.8 is labeled as normal by the misuse detection module. The connection records whose profile similarity scores are greater than or equal to 0.8 are passed to the anomaly detection module, which calculates their anomaly scores. The final decision about the category (i.e. the label) of such connection records is determined according to the value of the anomaly score S’. Fig. 4 shows the results of this experiment. At an anomaly threshold of 1.6, the FP rate is reduced to approximately 0.08, compared to 0.22 when profile similarity is used without anomaly detection. The TP rate achieved in this experiment is about 0.62.
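The two-stage decision of Experiment III can be sketched as below. The thresholds 0.8 and 1.6 are the values reported in the text; everything else (the function name, passing the anomaly score as a callable so that it is only computed when needed) is an illustrative assumption rather than the prototype's design.

```python
def classify(similarity, anomaly_score_fn,
             sim_threshold=0.8, anomaly_threshold=1.6):
    """Label a partially matched connection as 'attack' or 'normal'.

    similarity:       maximum profile similarity score S
    anomaly_score_fn: zero-argument callable returning anomaly score S'
    """
    if similarity < sim_threshold:
        # Stage 1: low similarity, the costly 1-NN step is skipped.
        return "normal"
    # Stage 2: the anomaly score makes the final decision.
    return "attack" if anomaly_score_fn() > anomaly_threshold else "normal"


calls = []

def expensive_score():
    calls.append(1)   # record that the anomaly module was invoked
    return 2.3        # made-up anomaly score for the example

print(classify(0.5, expensive_score))   # normal (anomaly module not called)
print(classify(0.9, expensive_score))   # attack
print(classify(0.9, lambda: 1.1))       # normal
```

Deferring the anomaly computation reflects the first goal of the experiment: only connections that pass the similarity filter incur the 1-NN cost.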
Experiment II - Anomaly Score Detection Rate: In this experiment we utilize only the anomaly scores to detect new types of attacks. Thus, the connection records that partially match context profiles are passed to the anomaly detection module. We experimented with several values of the anomaly threshold in the range from 0.5 to 2. We ran this experiment using the first two eigenvectors (i.e. dimensions), corresponding to the two largest eigenvalues of the decomposed normal profiles matrix. Additionally, we sampled
V. CONCLUSIONS
We proposed and implemented a contextual misuse and anomaly detection prototype to detect zero-day cyber-attacks. The approach first utilizes the similarity between the features of new incoming connections and the context profiles of known cyber-attacks to discover new attack types. We also utilized a one class nearest neighbor (1-NN) anomaly detection technique in a reduced dimensional space to detect anomalies and thus discover new attack types. Both profile similarity and anomaly scores are used individually and in conjunction to evaluate the attack detection rate of our system. The experimental results have shown that the proposed technique is effective in detecting zero-day attack patterns with a low false positive rate (about 0.08 when both scores are combined).

ACKNOWLEDGMENT
This work was partially supported by a grant from Northrop Grumman Corporation, USA.

REFERENCES
[1] T. Shon and J. Moon, “A hybrid machine learning approach to network anomaly detection,” Inf. Sci., vol. 177, no. 18, pp. 3799-3821, 2007.
[2] S. Jungsuk, H. Takakura, and K. Yongjin, “A generalized feature extraction scheme to detect 0-day attacks via IDS alerts,” In: Int’l Symp. on Applications and the Internet (SAINT’08), 2008, pp. 55-61.
[3] K. Leung and C. Leckie, “Unsupervised anomaly detection in network intrusion detection using clusters,” In: Proc. of the Twenty-Eighth Australasian Conf. on Computer Science, Newcastle, Australia, 2005, pp. 333-342.
[4] E. Eskin, A. Arnold, M. Prerau et al., “A geometric framework for unsupervised anomaly detection: Detecting intrusions in unlabeled data,” Applications of Data Mining in Computer Security, vol. 6, pp. 77-102, 2002.
[5] L. Portnoy, E. Eskin, and S. Stolfo, “Intrusion detection with unlabeled data using clustering,” In: Proc. of ACM CSS Workshop on Data Mining Applied to Security (DMSA’01), Philadelphia, PA, 2001, pp. 5-8.
[6] Y. Guan, A. A. Ghorbani, and N. Belacel, “Y-means: a clustering method for intrusion detection,” In: IEEE Canadian Conf. on Electrical and Computer Engineering (CCECE’03), 2003, pp. 1083-1086.
[7] G. Vigna, W. Robertson, and D. Balzarotti, “Testing network-based intrusion detection signatures using mutant exploits,” In: Proc. of the 11th ACM Conf. on Computer and Communications Security, Washington DC, USA, 2004, pp. 21-30.
[8] D. Tax and R. Duin, “Data description in subspaces,” In: Proc. of 15th Int’l Conf. on Pattern Recognition, 2000, pp. 672-675, vol. 2.
[9] P. Ning, Y. Cui, D. S. Reeves et al., “Techniques and tools for analyzing intrusion alerts,” ACM Trans. Inf. Syst. Secur., vol. 7, no. 2, pp. 274-318, 2004.
[10] C. Sheng-Hui, C. Erh-Hsien, Y. Chih-Yung et al., “Attack subplan based attack scenario correlation,” In: Proc. Int’l Conf. on Machine Learning and Cybernetics, 2007, pp. 1881-1887.
[11] W. Li and S. Tian, “An ontology-based intrusion alerts correlation system,” Expert Syst. Appl., vol. 37, no. 10, pp. 7138-7146, 2010.
[12] M. C. a. Y. L. Frédéric Massicotte, “Context-Based Intrusion Detection Using Snort, Nessus and Bugtraq Databases,” In: Proc. of Third Ann. Conf. on Privacy, Security and Trust, New Brunswick, Canada, 2005.
[13] J. Zhou, M. Heckman, B. Reynolds et al., “Modeling network intrusion detection alerts for correlation,” ACM Trans. Inf. Syst. Secur., vol. 10, no. 1, pp. 4, 2007.
[14] V. Chandola, A. Banerjee, and V. Kumar, “Anomaly detection: A survey,” ACM Comput. Surv., vol. 41, no. 3, pp. 1-58, 2009.
[15] X. Song, M. Wu, C. Jermaine et al., “Conditional Anomaly Detection,” IEEE Trans. on Knowl. and Data Eng., vol. 19, no. 5, pp. 631-645, 2007.
[16] S. Staniford, J. A. Hoagland, and J. M. McAlerney, “Practical automated detection of stealthy portscans,” J. Comput. Secur., vol. 10, no. 1-2, pp. 105-136, 2002.
[17] G. Hendry and S. Yang, “Intrusion signature creation via clustering anomalies,” In: Proc. of SPIE Security and Defense Symp. on Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security, Orlando, FL, 2008, pp. 69 730C–69 730C12.
[18] L. Zhichun, S. Manan, C. Yan et al., “Hamsa: fast signature generation for zero-day polymorphic worms with provable attack resilience,” In: IEEE Symp. on Security and Privacy, 2006, pp. 15 pp-47.
[19] J. Song, H. Ohba, H. Takakura et al., “A comprehensive approach to detect unknown attacks via intrusion detection alerts,” In: Proc. of the 12th Asian Computing Science Conf. on Advances in Computer Science: Computer and Network Security, Doha, Qatar, 2007, pp. 247-253.
[20] M. Tavallaee, “NSL-KDD dataset,” 2012; http://www.iscx.ca/NSL-KDD/.
[21] C. E. Shannon, The Mathematical Theory of Communication: Univ. Illinois Press, 1971.
[22] D. M. Lewis and V. P. Janeja, “An empirical evaluation of similarity coefficients for binary valued data,” IGI Global, 2011, pp. 44-66.
[23] M. Wu and C. Jermaine, “Outlier detection by sampling with accuracy guarantees,” In: Proc. of the 12th ACM SIGKDD Int’l Conf. on Knowledge Discovery and Data Mining, Philadelphia, PA, USA, 2006, pp. 767-772.
[24] M. L. Labs, “KDD Cup 1999 intrusion detection Data,” http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.htm
[25] J. McHugh, “Testing Intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory,” ACM Trans. Inf. Syst. Secur., vol. 3, no. 4, pp. 262-294, 2000.
[26] G. Kayacık and N. Zincir, “Analysis of three intrusion detection system benchmark datasets using machine learning algorithms,” In: Proc. of the 2005 IEEE Int. Conf. on Intelligence and Security Informatics, Atlanta, GA, 2005, pp. 362-367.
[27] “Weka Data mining and machine learning software,” http://www.cs.waikato.ac.nz/ml/weka/
[28] “Matlab,” http://www.mathworks.com/./matlab/
[29] A. AlEroud and G. Karabatis, “A system for cyber attack detection using contextual semantics,” In: Proc. Seventh Int’l Conf. Knowledge Management, Services and Cloud Computing (KMO'12), Salamanca, Spain, 2012, pp. 431-442.