Predictive Security Model using Data Mining

Sathishkumar P. Alampalayam and Anup Kumar

Computer Engineering and Computer Science Department
123 J.B. Speed Building, University of Louisville, Louisville, KY 40292
Ph: (502) 852-0471 Fax: (502) 852-4713
Email: {spalam01, ak}@louisville.edu

Abstract: In this paper, we propose a practical and predictive security model for intrusion detection in a computer networking environment using data mining. This model uses classification and regression techniques for data mining. The goal of the proposed model is to identify significant variables that measure network intrusion from the wealth of raw network data and to perform an efficient vulnerability evaluation based on those variables. We also present a methodology of classification and regression analysis for intrusion detection in a system. Analysis of experimental results using the DARPA benchmark dataset shows that the CART approach performs better than related models like random projection and principal component analysis. The results also indicate that the model performs better as the dimension of the input data decreases, without compromising the prediction success rate.

Keywords: Intrusion Detection, Predictive Security, Computer Network, Classification and Regression Trees (CART), Data Mining.

1. Introduction: In spite of the successful invention of technologies in the field of security and cryptography to secure computer network systems, malicious users still succeed in attacking the network with devastating effects. The challenge of network security has attracted several researchers with the aim of securing computer networks. The Intrusion Detection Approach (IDA) is an active research area in the field of computer network security [1-9]. Intrusion detection is the process of detecting and responding to malicious activity that is aimed at compromising network security. Network firewalls often implement security policies at the front-end to protect computers from malicious attacks.
Due to the sophistication of hackers and attacks from the insiders, regular intrusion prevention techniques like firewalls, encryption, digital signature and authentication fail to detect various attacks. The objective of the IDA is to provide another layer of defense for sophisticated and internal attacks. An overview of the existing IDA techniques can be found in [2, 4]. IDA can be classified into misuse detection and anomaly detection approaches. There are several weaknesses in the current IDAs. To address these security related issues, this paper presents an efficient practical framework that is adaptable, scalable and could predict the security and privacy related attacks at a node or at a system level. The goal of the proposed model is to use data mining to identify the significant

variables that measure network intrusion from the wealth of raw network data and perform an efficient vulnerability evaluation based on those variables. We also present the results of using the CART technique [16] as an IDA. This paper is organized into five sections. The next section gives the motivation and an overview of existing approaches. It also presents the rationale of our approach and the limitations of the existing schemes. Section three presents our proposed predictive security model, with its general architecture and the methodology of using CART as an efficient intrusion detection algorithm. Section four explains the simulation of the prediction aspects of the model with experiments and results. Section five presents the conclusions from our experiments.

2. Background and Motivation:

2.1 Existing Approaches: Misuse detection systems use patterns of known attacks to match and identify those intrusions [5]. Although they can accurately and efficiently detect instances of known attacks, they lack the ability to adapt to new types of attacks. Anomaly detection systems, on the other hand, detect intrusions by finding deviations from established user profiles. Anomaly detection can detect new types of intrusions, but it may have a higher false positive rate [6]. Traditionally, IDAs are developed using expert knowledge of the system and attack methods [7]. Due to the complexity of modern network systems and the sophistication of attackers, expert knowledge engineering is often very limited and unreliable [2]. One of the problems in existing IDAs is the computational overhead, which can sometimes be unacceptably excessive. Analyzing system logs requires large memory and processor resources. Usually an IDA is trained over this huge amount of system audit information, which increases the complexity of intrusion detection algorithms dramatically [11].
Long-term training and testing is not suitable for the requirements of real-time detection and response, which may further limit the practical use of an IDA. The problem gets worse when the input audit information has a high dimensionality. Most of the existing algorithms assume a low dimensionality of the data. Another issue is the noise tolerance of an IDA. Some IDA schemes are very sensitive to the data representation. For instance, these schemes may fail to generalize to unseen data if the representation contains irrelevant information. In some instances, it has been observed that training an IDA requires noise-free data (the data that is labelled ‘normal’) [1]. It has recently been observed that Denial of Service (DoS) attacks are targeted even against the IDA [2]. Thus, IDAs themselves need to be protected. An IDA should also be able to distinguish an attack from an internal system fault. The detection of intrusions and adaptive response still represent a challenging issue. To summarize, the existing IDAs suffer from one or more of the following limitations:
• They are computationally intensive and inefficient.
• The detection rules are subjective due to limited and unreliable expert knowledge.
• They primarily focus on authentication and privacy related issues. Most of the existing schemes do not provide continuous monitoring, detection and appropriate predictive protection against different active and passive attacks such as DoS, probing, packet mistreatment and routing attacks.
• They lack the ability to detect new threats and attacks and also have a higher false positive rate.
• They need longer training/testing time and perform poorly with high dimensional input data.

Thus, the current schemes have practical problems in intrusion detection and adaptive real-time response. They are also limited in the dimension of the input variables and in selecting qualitative and quantitative variables that can be used to predict the intrusion efficiently and accurately. The proposed predictive security model addresses these limitations. Our model continuously monitors the online network data and efficiently detects the attacks. Our model provides a tool for intrusion detection based on statistics and machine learning concepts. In addition, our model performs the variable selection and the intrusion detection efficiently, with a low false positive rate and with less processing, training and testing time. It also has the ability to detect new attacks and control the attacks in an adaptive manner.

2.2 Rationale for the Proposed Security Model: To explain the rationale of the proposed predictive security model, let us consider a possible DoS attack by malicious nodes: flooding of the host by other nodes, resulting in a DoS attack. In this kind of attack, the agents on malicious nodes flood the host with requests, which can cause resource depletion. This leads to DoS, which affects the requests from agents on genuine nodes. Some of the critical network parameters that are affected by this kind of DoS attack are: the rate of packets lost (which is the difference between the number of destination bytes and source bytes), the number of packets ignored and the error rate.

Packet loss rate: Due to DoS attacks, host nodes are generally not in a position to serve agent nodes. This results in packet loss and hence a significant increase in the measurement of the packet loss rate for nodes within the distributed system.

Source and Destination bytes: DoS attacks are characterized by the flooding of packets by the intruder or malicious nodes. Hence, a significant increase in the measurement of the source bytes and a decrease in the measurement of destination bytes may indicate a DoS attack.

Error rate: DoS attacks are characterized by the flooding of erroneous packets by the intruder or malicious nodes. Hence, a significant increase in the measurement of the error rate for a group of malicious nodes in a distributed system may indicate a DoS attack.

The challenge here is the identification of the critical indicators that would measure this kind of DoS attack and predict the attack/intrusion efficiently and practically in computer networks. Our philosophy is that, by identifying critical system parameters that are affected by various types of attacks from the raw network data, we can measure the relative change in these parameter values and detect the type of attack accurately, without compromising the system efficiency. This makes the proposed model suitable for on-line real-time detection. Once an attack is detected, a proper level of response measures can be applied, so that malicious nodes can be isolated from accessing the system or network [12].

3. Data Mining based IDA Security Model Methodology: In this paper and in the following simulation experiments we focus on using the DARPA intrusion detection benchmark dataset obtained from a raw military network, simulated with security related attacks. Our model simulation and experiments serve two purposes: one is to evaluate an efficient intrusion detection model by variable dimension reduction techniques for computer networks, and the other is to identify significant network parameters for DoS, masquerade and unauthorized access attacks and to validate the significant parameters for intrusion attacks like DoS attacks that we identified earlier through elaborate GloMoSim simulation experiments [12]. This section explains the theoretical foundation of CART as a data mining tool, which is the basis of step 2 and step 3 of our model.

Classification Trees: Let Y1, Y2, …, Yn, O be random variables where Yi has domain Dom(Yi) and O has domain Dom(O) = {1, …, J}. Here Y1, …, Yn are the predictor attributes, n is the number of predictor attributes and O is the class label. A classifier C is a function C: Dom(Y1) × … × Dom(Yn) → Dom(O). Let S = Dom(Y1) × … × Dom(Yn) × Dom(O) be the set of events. For a given classifier C and a given probability measure µ over S, the misclassification rate is given by MCµ(C) = µ[C(Y1, …, Yn) ≠ O]. Thus, for a given training dataset D of N independent identically distributed samples from S, sampled according to the probability distribution µ, a classifier C that minimizes the misclassification rate MCµ(C) needs to be constructed [16].

A decision tree is a special type of classifier. It is a directed acyclic graph T in the form of a tree. The root of the tree, Root(T), does not have any incoming edges. Every other node has exactly one incoming edge and may have zero or more outgoing edges. A node Ŧ without outgoing edges is a leaf node, while a node with outgoing edges is called an internal node. Each leaf node is labeled with one class label, while each internal node Ŧ is labeled with one predictor attribute YŦ, where YŦ ∈ {Y1, …, Yn} is called the split attribute. Let the label of the node Ŧ be denoted by Label(Ŧ). Each edge (Ŧ, Ŧ’) from an internal node Ŧ to one of its children Ŧ’ has a predicate q(Ŧ, Ŧ’) associated with it, where q(Ŧ, Ŧ’) involves only the splitting attribute YŦ of node Ŧ. A set of predicates Q is non-overlapping if the conjunction of any two predicates in Q evaluates to false. A set of predicates Q is exhaustive if the disjunction of all predicates in Q evaluates to true. The splitting predicates of Ŧ are the set of predicates QŦ on the outgoing edges of an internal node Ŧ. The splitting criterion of Ŧ, denoted by crit(Ŧ), is the combined information of the splitting attribute and the splitting predicates. For a given decision tree T, we can define the associated classifier in the following recursive manner:

C(y1, …, yn, Ŧ) = Label(Ŧ), if Ŧ is a leaf node;
C(y1, …, yn, Ŧ) = C(y1, …, yn, Ŧj), if Ŧ is an internal node, Ŧj is a child node of Ŧ, Yi is the label of Ŧ, and q(Ŧ, Ŧj)(yi) = true;
DT(y1, …, yn) = C(y1, …, yn, Root(T)).

As per the above definitions, if the tree T is a well-formed decision tree, then the function DT() is also a well-defined classifier, which can be called a decision tree classifier or classification tree. Thus, for a given dataset D = {ω1, …, ωN}, where the ωi are independent identically distributed random samples from a probability distribution µ over S, a classification tree T that minimizes the misclassification rate MCµ(DT) needs to be constructed.

Figure 1. Classification Tree Induction Schema Algorithm

A classification tree is usually constructed in two phases. In the first phase, the growth phase, an overly large classification tree is constructed from the training data. Most classification tree construction algorithms grow the tree top-down in the greedy way shown in Figure 1, which takes the split selection method as an argument. The most popular split selection method for classification tree construction is the Gini index rule. The Gini rule is used as a measure of how well a splitting rule separates the classes in the parent node. The Gini index, originally proposed by Breiman et al. [10], is given by the following formula:

$$\mathrm{GINI}(t) = 1 - \sum_{j=1}^{n} [p(j/t)]^2$$

where p(j/t) is the relative frequency of class j at node t. This index measures the impurity of a node. It has its maximum value when records are equally distributed among all classes, which implies the least interesting information. The minimum value (0.0) indicates that all records belong to one class, implying the most interesting information. The following formula computes the quality of a split when a node p is split into k partitions (children):

$$\mathrm{GINI}_{split} = \sum_{i=1}^{k} \frac{n_i}{n}\,\mathrm{GINI}(i)$$

where k is the number of children nodes, ni is the number of records at child i and n is the number of records at node p.

In the second phase, pruning is used to generate a sequence of simpler and simpler trees, each of which is a candidate for the appropriately-fit final tree. In the pruning phase, the final size of the tree T is determined with the goal of minimizing an approximation of the misclassification rate MCµ(DT), where DT is a decision tree classifier or classification tree, µ is a probability distribution and MCµ is calculated as defined earlier.
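The two Gini formulas can be illustrated with a minimal Python sketch. The function names `gini` and `gini_split` and the toy class labels are assumptions made for illustration only; they are not part of the CART package used in the experiments.

```python
from collections import Counter


def gini(labels):
    """Gini index of one node: 1 - sum_j p(j/t)^2 over the classes present."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(children):
    """Quality of a split: sum_i (n_i / n) * GINI(i) over the child nodes."""
    n = sum(len(child) for child in children)
    if n == 0:
        return 0.0
    return sum(len(child) / n * gini(child) for child in children)


# Hypothetical class labels at a node (illustrative data only).
pure = ["normal"] * 4
mixed = ["normal", "attack", "normal", "attack"]

print(gini(pure))    # 0.0: all records belong to one class
print(gini(mixed))   # 0.5: two classes equally distributed
print(gini_split([["normal", "normal"], ["attack", "attack"]]))  # 0.0
```

A split whose children are all pure scores 0.0, so a split selection method based on this rule prefers the candidate split with the lowest weighted value.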

Algorithm: Classification Tree Induction Schema
Input: node Ŧ, dataset D, split selection method V.
Output: classification tree T for D rooted at Ŧ.

BuildTree(node Ŧ, dataset D, split selection method V)
1. Apply V to D to find the split attribute X for node Ŧ.
2. Let n be the number of children of Ŧ.
3. if (Ŧ splits)
       Partition D into D1, …, Dn and label node Ŧ with split attribute X
       Create children nodes Ŧ1, …, Ŧn of Ŧ and label the edge (Ŧ, Ŧi) with predicate q(Ŧ, Ŧi)
       for each i ∈ {1, …, n}
           BuildTree(Ŧi, Di, V)
       end for each
   else
       Label Ŧ with the majority class label of D
   end if

Regression Trees: Let Y1, Y2, …, Yn be random variables as defined in the previous section. Let X be the predicted attribute or output random variable, with the real line as its domain. A regressor R is a function R: Dom(Y1) × … × Dom(Yn) → Dom(X). Let S = Dom(Y1) × … × Dom(Yn) × Dom(X) be the set of events. For a given regressor R, loss function L with L(a, x) = ║a – x║², and a given probability measure µ over S, the regressor error of R is given by Rµ(R) = Eµ[L(X, R(Y1, …, Yn))], where Eµ is the expectation with respect to the probability measure µ. Hence, for a given training dataset D of N independent identically distributed samples from S, sampled according to the probability distribution µ, a regressor R that minimizes the value of Rµ(R) needs to be constructed. Regression trees are a particular type of regressor, which are the natural generalization of decision trees for regression (continuous valued prediction) problems. Instead of associating a class label with every node, a real value or a functional dependency of some of the inputs is used. Regression trees, introduced by Breiman et al. [10], have a constant numerical value in the leaves and use the variance as a measure of impurity. The use of variance as the impurity measure is justified by the fact that the best constant predictor in a node is the average of the value of the predicted variable on the test examples that correspond to the node; the variance is thus the mean square error of the average used as a predictor. Similar to classification trees, a prediction is made by navigating the tree, following branches with true predicates until a leaf is reached. The numerical value associated with the leaf is the prediction of the model. The top-down induction schema algorithm shown in Figure 1 can be used to build regression trees. Pruning methods for classification trees can be adapted for regression trees.
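The top-down induction schema of Figure 1 can be sketched in Python as follows. This is only an illustrative sketch under several assumptions that the paper does not state: splits are binary thresholds on numeric attributes, the split selection method V is a weighted Gini criterion, a node stops splitting when no candidate split lowers the impurity, and pruning is omitted. The names `best_split`, `build_tree`, `predict` and the toy records are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of the class labels at one node."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Split selection method V (assumed): pick the (attribute, threshold)
    pair with the lowest weighted Gini, or None if no split reduces impurity."""
    best, best_score = None, gini(labels)
    for attr in range(len(rows[0])):
        for threshold in sorted({row[attr] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[attr] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[attr] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (attr, threshold), score
    return best


def build_tree(rows, labels, select=best_split):
    """BuildTree: apply V, partition D, recurse on the children, else label a leaf."""
    split = select(rows, labels)
    if split is None:
        # Leaf node: label with the majority class of the records that reached it.
        return {"label": Counter(labels).most_common(1)[0][0]}
    attr, threshold = split
    left = [i for i, row in enumerate(rows) if row[attr] <= threshold]
    right = [i for i, row in enumerate(rows) if row[attr] > threshold]
    return {
        "attr": attr,
        "threshold": threshold,
        "left": build_tree([rows[i] for i in left], [labels[i] for i in left], select),
        "right": build_tree([rows[i] for i in right], [labels[i] for i in right], select),
    }


def predict(tree, row):
    """Follow the edges whose predicates are true until a leaf is reached."""
    while "label" not in tree:
        tree = tree["left"] if row[tree["attr"]] <= tree["threshold"] else tree["right"]
    return tree["label"]


# Hypothetical records: (source bytes, error rate), with made-up class labels.
rows = [(200, 0.0), (250, 0.1), (9000, 0.8), (8500, 0.9)]
labels = ["normal", "normal", "dos", "dos"]
tree = build_tree(rows, labels)
print(predict(tree, (220, 0.05)))   # normal
print(predict(tree, (8800, 0.70)))  # dos
```

For a regression tree, the same skeleton would apply with the variance of the output values in place of `gini` and the mean of the outputs in place of the majority label.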

4. Simulation and Experimentations:

4.1 Input Data Description: The baseline input data we used in our simulation experiment is obtained from the DARPA intrusion detection evaluation program [15]. This data file has the information pertaining to various intrusions simulated in a military network environment. We used a total of 60000 data instances, each of which is a 41-dimensional vector. Each dimension represents either a qualitative or a quantitative variable. Each variable represents a feature extracted from the raw network data, such as the number of wrong fragments, the number of source bytes sent, the number of destination bytes received, etc. Overall, this data file represents 24 simulated attack types that fall in the DoS, probing and unauthorized access intrusion categories.

4.2 Experimentation Setup: The simulation of the log analysis and vulnerability evaluation framework is carried out using the freeware package CART 5.0 [14]. The Gini splitting rule, explained in section 3, is used for the classification trees. The least squares method is used for the regression trees. Penalties were not issued for any variables. The minimum cost tree regardless of the size is used for the standard error rule. We repeated the experiments with d = {2, 5, 10, 15, 20, 25, 30, 35, 41}, where d represents the dimension of the dataset and d=41 corresponds to the original dimension of the raw network data. All experiments are conducted on a Dell Pentium 3 machine with a 1 GHz processor and 512 MB RAM.

4.3 Performance Evaluation: This section evaluates the performance of the CART data mining technique against similar models like random projection and principal component analysis (PCA). The PCA technique is chosen for the comparison because it is the most popular dimensionality reduction technique used in practice and is the best in the mean square sense. Random projection is chosen since experiments have been conducted to demonstrate that it performs intrusion detection better than PCA [11]. They are also evaluated for their performance in the reduced dimensions compared to the original dimension.

4.3.1 Metrics for Performance Evaluation: The following four metrics are chosen for comparing the performance of the CART approach with the PCA and random projection techniques.

Prediction success rate: It is defined as the percentage of the whole data that is correctly predicted. This is chosen since accuracy is one of the most important characteristics of an IDA. A high prediction success rate is desirable for a good IDA.

Misclassification rate: It is defined as the percentage of normal data instances that have been falsely classified as vulnerable or intrusive. This parameter represents sensitivity to noisy training data. A good IDA should adapt well even to unseen data, even if the data representation has some irrelevant information.

Total Processing Time: It is defined as the total time the system takes to analyze a variable and detect the intrusion. This is an important metric since effective intrusion detection should occur in real time and a response should be taken before significant damage occurs to the network.

Total Training Time: It is defined as the total time the system takes to train from the input data. This is an important metric since a good intrusion detection system should be scalable and hence must handle high dimensional situations with a large amount of data.

4.3.2 Experimental Results: The noise tolerance feature of the IDA is shown in Figure 2. It shows the results of CART in various reduced dimensions compared with the results of random projection and PCA for the misclassification rate metric.

[Figure 2 plot: Misclassification Rate (%) versus Dimension of Input Dataset; series: Classification Tree, Random Projection, PCA]

Figure 2. Comparison of misclassification rate

It is evident that the misclassification rate is smallest in the original data dimension and that, as the dimension of the input variables decreases, the value of this metric varies marginally. In the lowest reduced dimension, d=2, the misclassification rate is maximum at 7.54%, which is lower than the 14.48% of the random projection method and the 10.2% of the PCA model. Thus the value of the misclassification rate is better compared to the other two models at all dimensions. It can also be shown that the value remains constant from d=41 through d=5. This demonstrates that we can do a vulnerability evaluation or detection at the reduced dimension of d=5 instead of d=41.
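The two accuracy metrics of Section 4.3.1 can be computed from actual and predicted labels as in the following minimal sketch. It assumes that every record carries a class label and that normal traffic is labelled "normal"; the function names and the toy labels are illustrative assumptions, and the misclassification rate is read here as the share of normal instances flagged as intrusive, following the definition given in the text.

```python
def prediction_success_rate(actual, predicted):
    """Percentage of all instances whose predicted label matches the actual one."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return 100.0 * correct / len(actual)


def misclassification_rate(actual, predicted, normal_label="normal"):
    """Percentage of normal instances that were falsely classified as intrusive."""
    normal_predictions = [p for a, p in zip(actual, predicted) if a == normal_label]
    if not normal_predictions:
        return 0.0
    false_alarms = sum(p != normal_label for p in normal_predictions)
    return 100.0 * false_alarms / len(normal_predictions)


# Hypothetical labels for eight connections (illustrative data only).
actual = ["normal", "normal", "normal", "normal", "dos", "dos", "probe", "dos"]
predicted = ["normal", "dos", "normal", "normal", "dos", "dos", "probe", "normal"]

print(prediction_success_rate(actual, predicted))  # 75.0
print(misclassification_rate(actual, predicted))   # 25.0
```

Note that the two metrics use different denominators: the first is taken over all instances, the second only over the instances whose actual label is normal.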

The accuracy feature of the IDA is shown in Figure 3. It shows the results of CART in various reduced dimensions compared with the results of random projection and PCA for the prediction success rate metric.

[Figure 3 plot: Prediction Success Rate (%) versus Dimension of Input Dataset; series: Classification Tree, Random Projection, PCA]

Figure 3. Comparison of prediction success rate

It can be seen that the prediction success rate is higher for a higher dimension dataset and remains relatively constant until d=5. However, at the lowest reduced dimension, d=2, the prediction success rate drops to 90.44% from 94.32% at d=5. Also, the prediction success rate of random projection is slightly better than that of PCA and the CART approach, although the CART approach performs better at the lowest reduced dimension, d=2.

Figure 4 shows the real-time detection feature of the CART approach compared to the other models. It shows the results of the CART model at various reduced dimensions compared with the results of random projection and PCA for the total processing time metric.

[Figure 4 plot: Processing Time (s) versus Dimension of Input Dataset; series: Classification Tree, Random Projection, PCA]

Figure 4. Comparison of total processing time

The results indicate that CART has a shorter processing time compared to the PCA and random projection approaches. Thus it is more suitable for on-line real-time detection in resource-constrained networks. Figure 5 shows the time complexity of the CART approach compared to the other models in different reduced dimensions. It shows the results of the CART model in various dimensions compared with the results of random projection and PCA for the total training time metric.

[Figure 5 plot: Training Time versus Dimension of Input Dataset; series: Classification Trees, Random Projection, PCA]

Figure 5. Comparison of total training time

The results indicate that the CART model requires a shorter training time compared to PCA and random projection. This indicates that the CART approach is scalable and hence would handle high dimensional data better than random projection and PCA.

Table 1: Identification of significant network parameters

Network Parameter | DoS | Masquerade | Unauthorized Access
1. Service (type of network service, e.g., http, telnet, etc.) | Y | Y | Y
2. Protocol_type (type of protocol, e.g., tcp, udp, etc.) | Y | Y | N
3. Src_bytes (no. of data bytes sent from source to destination) | Y | Y | Y
4. Dst_bytes (no. of data bytes received at destination from source) | Y | N | Y
5. Count (no. of connections to the ‘same host’ in the past two seconds) | Y | N | N
6. Dst_Host Count (no. of connections to the ‘same host’ using a window of 100 connections instead of a time window) | N | Y | Y
7. Dst_Host Same Source Port Rate (no. of connections to the same source port of host) | N | Y | Y
8. Dst_Host Service Count (no. of host connections providing the ‘same service’ for the current connection in the past two seconds) | N | N | Y
9. Dst_Host_Srv_Diff_Host Rate (% of connections to different hosts) | N | Y | N
10. Dst_Host_Srv_Rerror_Rate (% of host connections that have ‘REJ’ errors for providing the ‘same service’) | Y | N | N
11. Dst_Host_Serror Rate (% of host connections that have ‘SYN’ errors) | Y | N | N
12. Dst_Host_Rerror Rate (% of host connections that have ‘REJ’ errors) | Y | N | N

4.4 Significant Network Parameters Identification Experimentation:

In order to identify significant network parameters that would measure the vulnerability of the network due to a particular attack, we performed data mining experiments using the high dimensional benchmark DARPA dataset and the CART tool as described in section 4.1. Our main idea in this experiment is to identify significant parameters for detecting DoS, masquerading and unauthorized access attacks, which can then be used in steps 3 and 4 of our model. The results from the variable dimension reduction simulation model for DoS attacks demonstrate that most of the significant variables identified by the CART classifier tool from the benchmark data, like the number of source bytes, number of destination bytes, destination host count and destination host service count, match the significant variables, like the number of packets lost (which is the difference between the number of destination bytes and source bytes), the number of packets ignored and the number of collisions, that we identified earlier using our simulation with GloMoSim [12, 13]. Table 1 shows the significant network parameters that are identified for specific attacks like DoS, masquerade and unauthorized access attacks. The significant network parameters are identified using the variable importance table generated by CART. Variable importance is based on the contribution of predictors during the construction of the tree. The importance of a variable is determined by the role it plays in the tree construction, either as a main splitter or as a surrogate. A Y in a column of Table 1 indicates that the parameter does affect the attack and hence needs to be used to measure the vulnerability of the system. An N in a column of Table 1 indicates that the parameter does not represent/indicate the attack. Unlike DoS and masquerading attacks, there appear to be no sequential patterns for unauthorized access attacks. This is because DoS and masquerading attacks involve many connections to some host(s) in a very short period of time, whereas the unauthorized access attack patterns are embedded in the data portions of packets and normally involve only a single connection. Hence it is unlikely that they have any unique frequent traffic patterns. So, we used domain knowledge to construct a set of “content” features that indicate whether the connection contents suggest suspicious behavior.

In Table 1, all the parameters are of continuous type except the categorical parameters, protocol and service type. The “same host” network parameters examine only the connections in the past two seconds that have the same destination host as the current connection. The “same service” network parameters examine only the destination host connections in the past two seconds that provide the same service as the current connection. The “same host” and “same service” network parameters are time-based continuous traffic features. A ‘SYN’ error represents the error due to a TCP SYN flood attack. A TCP SYN flood sends erroneous TCP requests to the target system, which cannot complete the connection request. A ‘REJ’ error indicates that the packets have not been received correctly by the destination node.
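The time-based “same host” feature described above can be sketched as a sliding-window count. This is a minimal illustration, not the procedure used to build the benchmark dataset: the record layout (timestamp in seconds, destination host), the function name `same_host_counts`, the choice to include the current connection in its own count, and the toy trace are all assumptions.

```python
from collections import deque


def same_host_counts(connections, window=2.0):
    """For each connection, count the connections to the same destination host
    within the preceding `window` seconds (current connection included).

    `connections` is a time-ordered list of (timestamp_seconds, dst_host).
    """
    recent = deque()   # connections still inside the time window
    per_host = {}      # live count per destination host
    counts = []
    for ts, host in connections:
        # Drop connections that have fallen out of the window.
        while recent and recent[0][0] < ts - window:
            _, old_host = recent.popleft()
            per_host[old_host] -= 1
        recent.append((ts, host))
        per_host[host] = per_host.get(host, 0) + 1
        counts.append(per_host[host])
    return counts


# Hypothetical trace: host "A" receives a short burst of connections.
trace = [(0.0, "A"), (0.5, "A"), (1.0, "B"), (1.9, "A"), (2.6, "A"), (5.0, "A")]
print(same_host_counts(trace))  # [1, 2, 1, 3, 2, 1]
```

A flood of connections to one host in a short period drives this count up, which is consistent with the observation that such features are informative for DoS attacks but not for single-connection unauthorized access attacks.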

5. Conclusion: In this paper, we have proposed a practical and efficient intrusion and vulnerability detection model using data mining technology. Experimental results indicate that the proposed predictive security model using the CART methodology can be used to identify the significant parameters, which can then be used to evaluate vulnerability and detect intrusions from raw network data efficiently. Analysis of the results shows that the CART data mining IDA performs better than related models like random projection and PCA. Also, the performance of the model gets better as the dimension of the input data decreases, without compromising detection accuracy, a feature essential for on-line real-time detection in resource-constrained networks.

REFERENCES:
[1] W. Lee and W. Fan, “Mining System Audit Data: Opportunities and Challenges”, SIGMOD Record, 30(4), December 2001.
[2] S.J. Stolfo et al., “Data Mining based Intrusion Detectors: An overview of the Columbia IDS project”, SIGMOD Record, 30(4), December 2001.
[3] P. Dokas, L. Ertoz, V. Kumar, A. Lazarevic, J. Srivastava and P. Tan, “Data Mining for Network Intrusion Detection”, Proc. NSF Workshop on Next Generation Data Mining, Baltimore, MD, November 2002.
[4] C. Warrender, S. Forrest and B. Pearlmutter, “Detecting intrusions using system calls”, Proceedings of the 1999 IEEE Symposium on Security and Privacy, pp. 133-145, 1999.
[5] S. Kumar and E.H. Spafford, “A software architecture to support misuse intrusion detection”, Proceedings of the 1995 National Information Security Conference, pp. 194-204.
[6] E. Eskin, “Anomaly Detection over noisy data using learned probability distributions”, Proceedings of the International Conference on Machine Learning, pp. 255-262, 2000.
[7] K. Ilgun, R.A. Kemmerer and P.A. Porras, “State transition analysis: A rule based intrusion detection approach”, IEEE Transactions on Software Engineering, Vol 21, No 3, pp. 181-199, 1995.
[8] W. Lee and S. Stolfo, “A Framework for Constructing Features and Models for Intrusion Detection Systems”, ACM Transactions on Information and System Security, pp. 227-261, 2000.
[9] Y. Zhang and W. Lee, “Intrusion Detection in Wireless Ad-Hoc Networks”, Proceedings of the Sixth International Conference on Mobile Computing and Networking, Aug 2000.
[10] L. Breiman, J. Friedman, R. Olshen and C. Stone, Classification and Regression Trees, Pacific Grove: Wadsworth, 1984.
[11] Hongmei Deng, Qing-An Zeng and Dharma P. Agrawal, “Network Intrusion Detection System using Random Projection Technique”, Proceedings of the 2003 International Conference on Security and Management (SAM'03), Las Vegas, Nevada.
[12] Sathishkumar P. Alampalayam and Anup Kumar, “Adaptive Security Model for mobile agents in wireless networks”, IEEE GlobeCom 2003.
[13] GloMoSim, http://pcl.cs.ucla.edu/projects/glomosim/
[14] CART software, http://www.salford-systems.com/
[15] http://kdd.ics.uci.edu/databases/kddcup99
[16] Alin Dobra and Johannes Gehrke, “Bias correction in classification tree construction”, Proceedings of the Eighteenth International Conference on Machine Learning, pp. 90-97, 2001.