
Fusion of Multiple Classifiers for Intrusion Detection in Computer Networks

Giorgio Giacinto, Fabio Roli, and Luca Didaci
Department of Electrical and Electronic Engineering, University of Cagliari, Italy
Piazza D’Armi, 09123 Cagliari, Italy
{giacinto,roli,luca.didaci}@diee.unica.it

Abstract

The security of computer networks plays a strategic role in modern computer systems. In order to enforce high protection levels against threats, a number of software tools are currently being developed. Intrusion Detection Systems aim at detecting intruders who elude “first line” protection. In this paper, a pattern recognition approach to network intrusion detection based on the fusion of multiple classifiers is proposed. Five decision fusion methods are assessed by experiments and their performances compared. The potential of classifier fusion for the development of effective intrusion detection systems is evaluated and discussed.

Keywords: Intrusion Detection in computer networks, pattern classification, multiple classifier systems, decision fusion.

1. Introduction

Computer networks are usually protected against attacks by a number of access restriction policies. Despite the effort devoted to carefully designing such filtering, network security is very difficult to guarantee, since attacks exploit unknown weaknesses or bugs, which are always present in system and application software (McHugh et al., 2000; Proctor, 2001). Intrusion Detection Systems (IDS) act as the “second line of defence” placed inside a protected network, looking for known or potential threats in network traffic and/or audit data recorded by hosts.

Two approaches to intrusion detection are currently used. One is called misuse detection, and is based on attack signatures, i.e., on a detailed description of the sequence of actions performed by the attacker. This approach allows the detection of intrusions that perfectly match the signatures, so its effectiveness is strictly related to the extent to which IDSs are updated with the signatures of the latest attacks. In principle, this problem could be solved by designing general signatures that capture the “root cause” of an attack, thus allowing the detection of all the attack variants designed to exploit the same weakness. Unfortunately, general signatures designed by security experts usually generate high volumes of “false alarms” (Proctor, 2001), i.e., normal traffic events matching an attack signature. This is currently a challenge for the development of effective IDSs, since even very small false alarm rates can translate into a number of false alarms greater than the number of true alarms, as the volume of normal traffic is some orders of magnitude greater than that related to attacks (Axelsson, 2000).

The second approach to intrusion detection is based on statistical knowledge about the normal activity of the computer system, i.e., statistical profiles of what constitutes legitimate traffic in the network. In this case, intrusions correspond to anomalous network activity, i.e., to traffic whose statistical profile deviates significantly from the normal one (McHugh et al., 2000; Proctor, 2001). This IDS model was the first to be developed, on account of its theoretical ability to detect intrusions regardless of the system type, the environment, the system vulnerabilities, and the type of intrusions. Unfortunately, acquiring profiles of “normal” activity is not an easy task, due to the high variability of network traffic over time.

The above discussion points out that the trade-off between the ability to detect new attacks and the ability to generate a low rate of false alarms is the key point in developing an effective IDS. Therefore, the misuse (signature-based) detection model is currently the most widely used, due to its ability to produce very low false alarm rates, at the price of a very limited ability to detect new attacks.
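The base-rate argument made above (Axelsson, 2000) can be illustrated with a short calculation. The sketch below uses purely hypothetical numbers (one million normal connections, 100 attacks, a 1% false alarm rate, and a 90% detection rate); none of them are taken from this paper, and the function name is chosen only for illustration.

```python
def alarm_counts(n_normal, n_attack, false_alarm_rate, detection_rate):
    """Return the expected (false_alarms, true_alarms) for a traffic mix."""
    false_alarms = n_normal * false_alarm_rate
    true_alarms = n_attack * detection_rate
    return false_alarms, true_alarms


# Hypothetical traffic mix: attacks are four orders of magnitude rarer
# than normal connections (illustrative values, not from the paper).
false_alarms, true_alarms = alarm_counts(
    n_normal=1_000_000,
    n_attack=100,
    false_alarm_rate=0.01,
    detection_rate=0.9,
)

# Fraction of all raised alarms that correspond to real attacks.
precision = true_alarms / (false_alarms + true_alarms)
print(false_alarms, true_alarms, precision)
```

Under these assumed figures, false alarms outnumber true alarms by roughly two orders of magnitude even though the false alarm rate is only 1%.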

The difficulties in detecting novel attacks have led researchers to apply statistical pattern recognition approaches using learning-by-example paradigms (Duda et al., 2001). The main motivation for using pattern recognition approaches to develop advanced IDSs is their generalization ability, which may support the recognition of previously unseen intrusions that have no previously described patterns. In particular, pattern recognition approaches should allow the detection of the so-called attack “variants”. Research on IDSs based on learning-by-example paradigms is at an early stage, so a number of issues must be solved before such systems can be used in operational environments (Allen et al., 2000; Bonifacio et al., 1998; Cannady, 2000; Debar et al., 1992; Elkan, 2000; Ghosh and Schwartzbard, 1999; Lee and Heinbuch, 2001; Ryan et al., 1998). For network managers, one of the main drawbacks of such systems appears to be the high false alarm rate they often produce. A short review of the current state of the art on IDSs based on pattern recognition paradigms is given in Section 2.

In this paper, an approach to intrusion detection in computer networks based on the fusion of multiple classifiers is proposed (Roli and Kittler, 2002). Each member of the classifier ensemble is trained on a distinct feature representation of patterns, then the individual results are combined using a number of “fixed” and “trainable” fusion rules (Roli and Kittler, 2002). This approach is motivated by the observation that human experts try to design signatures that “combine” different attack characteristics in order to attain low false alarm rates and high attack detection rates. Unfortunately, the manual development of such signatures is very difficult and tedious. The fusion of multiple classifiers, using learning by example, can automate such an approach, thus providing IDSs with the ability to detect new attacks without producing high false alarm rates.
A formulation of the intrusion detection problem as a pattern recognition task with distinct feature representations is presented in Section 3. The approach based on the multiple classifier paradigm is illustrated in Section 4, and results on a data set available to the public are reported in Section 5. Conclusions are drawn in Section 6.

2. Related Work on Pattern Recognition Approaches to Intrusion Detection

A recent report on current Intrusion Detection (ID) technology provides a discussion of the challenges in developing effective IDSs (Allen et al., 2000). In particular, pattern recognition and learning-by-example approaches are expected to provide the following benefits:

- the ability to generalise from a representative set of examples, thus allowing the detection of new types of intrusion;
- the possibility of automatically extracting attack “signatures” from labelled traffic data, thus overcoming the subjectivity of the human interpretation of intrusive behaviour.

These issues have been addressed since the early years of IDS development. In particular, the application of neural networks to intrusion detection has been investigated by a number of researchers. Neural networks provide a solution to the problem of modelling the users' behaviour in anomaly detection, because they do not require any explicit user model (Debar et al., 1992; Lee and Heinbuch, 2001; Ryan et al., 1998). A neural network model designed to perform anomaly and misuse detection has been proposed in (Ghosh and Schwartzbard, 1999). The training set is made up of strings of events captured by the Basic Security Module (BSM) that is part of many operating systems. Training sets made up of traffic data instead of audit records have also been used for misuse detection (Bonifacio et al., 1998; Cannady, 2000).

The above overview of related work points out that pattern recognition techniques are apt to provide a solution to some open issues in IDS development. In addition, the extensive evaluation of pattern classification techniques carried out on a sample data set of network traffic during the KDD'99 conference pointed out the feasibility of the pattern recognition approach to intrusion detection (Elkan, 2000). However, it should be remarked that, for the deployment of IDSs using pattern recognition algorithms in operational environments, one of the main drawbacks appears to be the high false alarm rate they often produce.

Except for the neural network hierarchy proposed in (Lee and Heinbuch, 2001), where features at different abstraction levels were processed by distinct networks, the classification is usually performed in the feature space made up of all the features needed to detect the considered attack classes. Classifiers working in such a “monolithic” feature space can suffer from the so-called “curse of dimensionality”, due to the large number of features necessary for effective intrusion detection and to the limited amount of training data that can be collected, especially for attack classes. In addition, features with very different meanings, related to different characteristics of network traffic, are used in intrusion detection (Section 3). It is well known that it can be very difficult for an individual classifier to effectively process features that have very different semantic meanings. For such cases, the fusion of multiple classifiers using distinct feature representations of patterns can be more effective than any approach based on individual feature representations or on a high-dimensional feature vector made up of all the available features (Kittler et al., 1998; Sharkey, 1999). In addition, as the authors showed in another application field, the fusion of multiple classifiers can improve the trade-off between false alarm and attack detection rates (Giacinto et al., 2000).

3. Problem Formulation

Intrusions in computer networks basically exploit weaknesses of the network transmission protocol, as well as weaknesses and bugs exhibited by system and application software. The solution proposed hereafter is a network-based IDS (NIDS), as it allows the detection of both types of intrusions by processing the network traffic flow (Proctor, 2001). From the pattern recognition point of view, the network intrusion detection problem can be formulated as follows (see Figure 1): given the information about network connections between pairs of hosts, assign each connection to one out of N data classes representing normal traffic or different categories of intrusions (e.g., Denial of Service, access to root privileges, etc.). It is worth noting that various definitions of data classes are possible (McHugh et al., 2000; Proctor, 2001).

[Figure 1 diagram. Block labels: TCP/IP packets; Connection records; Feature Extraction; Network Features (Intrinsic Features, Traffic Features); Content Features; Classification; Normal or Attack classes.]

Figure 1: Intrusion Detection formulated as a Pattern Recognition problem.

The term “connection” refers to a sequence of data packets related to a particular service, e.g., the transfer of a web page via the http protocol. As the aim of a network intrusion detector is to detect connections related to malicious activities, each network connection can be defined as a “pattern” to be classified. This formulation is in agreement with the related work presented in Section 2.

Extraction of suitable features representing network connections is based on expert knowledge about the characteristics that distinguish attacks from normal connections. These features can be subdivided into two groups: features related to the data portion of packets (called payload) and features related to the network characteristics of the connection, extracted from the TCP/IP headers of packets (Northcutt and Novak, 2001). The latter group of features can be further subdivided into two groups: intrinsic features, i.e., characteristics related to the current connection, and traffic features, related to a number of similar connections. Therefore, the following three feature sets can be used to classify each connection (Lee and Stolfo, 2000):

• content features, i.e., features containing information about the data content of packets (“payload”) that could be relevant to discover an intrusion, e.g., errors reported by the operating system, root access attempts, etc.;
• network-related features:
  - intrinsic features, i.e., general information related to the connection. They include the duration, type, protocol, flag, etc. of the connection;
  - traffic features, i.e., statistics related to past connections similar to the current one, e.g., the number of connections with the same destination host, or of connections related to the same service, in a given time window or within a predefined number of past connections.

This feature categorisation is general enough to accommodate the large number of features that can be used to describe network traffic, while the specific features depend on the services, software, etc. used in the protected network.
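As a rough illustration of the three feature sets, the sketch below splits one connection record into separate intrinsic, content, and traffic feature vectors, one per classifier group. The feature names follow the style of the KDD Cup 99 records but are only an assumed, illustrative subset; the exact features used in this paper are not listed here, and `split_features` is a hypothetical helper.

```python
# Illustrative subsets only: the names are assumptions in the style of the
# KDD Cup 99 connection records, not the feature sets used in the paper.
FEATURE_SETS = {
    "intrinsic": ["duration", "flag", "src_bytes", "dst_bytes"],
    "content": ["hot", "num_failed_logins", "logged_in", "root_shell"],
    "traffic": ["count", "srv_count", "serror_rate", "same_srv_rate"],
}


def split_features(record):
    """Map one connection record (a dict) to one vector per feature set."""
    return {
        name: [record[field] for field in fields]
        for name, fields in FEATURE_SETS.items()
    }


# A made-up connection record used only to exercise the helper.
connection = {
    "duration": 12, "flag": 1, "src_bytes": 240, "dst_bytes": 1024,
    "hot": 0, "num_failed_logins": 3, "logged_in": 0, "root_shell": 0,
    "count": 8, "srv_count": 8, "serror_rate": 0.0, "same_srv_rate": 1.0,
}
views = split_features(connection)
print(views["content"])
```

Each entry of `views` would then be passed to the classifiers trained on that feature representation.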

4. A multiple classifier system approach

In the previous section, we pointed out that three types of features can be extracted from network traffic data. Each feature category provides information that can be used to discriminate between attacks and normal traffic. In particular, when an attack is performed against a computer network, a “signature” related to that attack may be found in each feature category. For each attack type, network analysts try to design effective attack “signatures” by selecting the most effective subsets of features according to their experience and intuition (Allen et al., 2000; Northcutt et al., 2001). On the other hand, pattern recognition tools have been designed to process the entire available feature set to extract more effective signatures than the ones hand-coded by network analysts (Section 2).

A pattern recognition approach based on the multiple classifiers paradigm can further exploit the above experimental observation that attack evidence can be collected separately in different feature subspaces. First, each feature subspace is used independently to perform attack detection. Then the evidence is combined to produce the final decision (see Figure 2). This process reflects the behaviour of network security experts, who usually look at different traffic statistics in order to produce reliable attack signatures, i.e., signatures providing effective attack detection and a very low false alarm rate. In addition, the generalization capabilities of pattern recognition algorithms can allow for the detection of novel attacks that signatures designed by human experts usually do not detect.

The effectiveness of a multiple classifier approach also depends on the choice of the decision fusion function. As a first “user’s guide” to choosing the decision function, the expected degree of diversity among classifiers should be taken into account (Roli and Kittler, 2002). It can easily be seen that the three feature sets presented in the previous section are associated with unrelated connection characteristics. For example, a given set of values of the intrinsic features (e.g., the number of bytes transmitted) bears no relationship to the values assumed by the content features (e.g., an attempt to log in to the system as user “root”). According to the current achievements of multiple classifier theory (Roli and Kittler, 2002), it can be expected that classifiers trained on such different feature types provide outputs that are, to a certain degree, uncorrelated.

This observation allowed us to experiment with simple, “fixed” fusion rules, such as the majority voting rule and the average of classifier outputs, which are based on the assumption that different classifiers make uncorrelated errors. (It should be remarked that such fixed rules also assume that classifiers exhibit similar accuracies and pair-wise output correlations (Xu et al., 1992).)
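The two fixed rules mentioned above can be sketched in a few lines. This is a minimal illustration, assuming each classifier outputs one score per class; the scores, the tie-breaking behaviour (the first class to reach the top vote count wins), and the function names are assumptions made for the example, not details taken from the paper.

```python
from collections import Counter

CLASSES = ["normal", "U2R", "R2L", "probing", "DoS"]


def majority_vote(outputs):
    """Each classifier votes for its top-scoring class; most votes wins."""
    votes = [max(range(len(o)), key=o.__getitem__) for o in outputs]
    # Ties are broken in favour of the class voted first (an assumption).
    winner, _ = Counter(votes).most_common(1)[0]
    return CLASSES[winner]


def average_rule(outputs):
    """Average the per-class scores over classifiers; largest mean wins."""
    n = len(outputs)
    means = [sum(o[k] for o in outputs) / n for k in range(len(CLASSES))]
    return CLASSES[max(range(len(means)), key=means.__getitem__)]


# Made-up outputs of three classifiers (intrinsic, traffic, content).
outputs = [
    [0.55, 0.0, 0.45, 0.0, 0.0],
    [0.55, 0.0, 0.45, 0.0, 0.0],
    [0.05, 0.0, 0.95, 0.0, 0.0],
]
print(majority_vote(outputs), average_rule(outputs))
```

In this made-up case two classifiers weakly prefer "normal" while the third strongly prefers "R2L", so the two rules disagree: the vote ignores how confident each classifier is, whereas the average does not.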

[Figure 2 diagram. Block labels: Intrinsic Features (Classifier 1 to Classifier K1); Traffic Features (Classifier 1 to Classifier K2); Content Features (Classifier 1 to Classifier K3); Decision Fusion function; Normal or Attack Class.]

Figure 2: Multiple Classifier System for Intrusion Detection

However, as “fixed” fusion rules are not able to handle effectively the different accuracies and pair-wise correlations that can be exhibited by the classifier ensemble (Roli and Kittler, 2002), we also considered the Naive-Bayes “trainable” fusion rule (Xu et al., 1992), in which the decision of each classifier is weighted according to its confusion matrix on the training set. In addition, it should be remarked that a lack of correlation among features does not always guarantee uncorrelated classifier outputs (Giacinto et al., 2000; Roli and Kittler, 2002). In some cases classifiers may exhibit strong degrees of positive and negative correlation. For example, some attacks can be detected effectively only by one classifier of the ensemble, or by the combination of a subset of classifiers. To handle these cases, we considered two additional “trainable” fusion techniques: the “decision template” fusion rule (Kuncheva et al., 2001), based on the set of outputs of the classifier ensemble on the training set, and the “dynamic classifier selection” (DCS) algorithm proposed by the authors (Giacinto et al., 2000), designed to approximate an ideal oracle which, for each new pattern, selects the classifier, if any, that provides the correct data class. These two fusion methods also make it possible to handle effectively the different accuracies and pair-wise correlations exhibited by individual classifiers.
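To illustrate how a confusion matrix can weight each classifier's decision, the sketch below follows the general idea of the Naive-Bayes combination: for each class, multiply the estimated probabilities that the class is the true one given what each classifier said. It is a simplified two-class illustration with made-up confusion matrices, not the implementation used in the experiments, and `naive_bayes_fusion` is a hypothetical name.

```python
def naive_bayes_fusion(confusions, decisions):
    """Combine crisp decisions using per-classifier confusion matrices.

    confusions[k][i][j]: number of training patterns of true class i
    that classifier k assigned to class j.
    decisions[k]: class index output by classifier k for the new pattern.
    """
    n_classes = len(confusions[0])
    beliefs = [1.0] * n_classes
    for cm, j in zip(confusions, decisions):
        column_total = sum(cm[i][j] for i in range(n_classes))
        for i in range(n_classes):
            # Estimate P(true class = i | classifier said j) from column j.
            beliefs[i] *= cm[i][j] / column_total if column_total else 0.0
    total = sum(beliefs)
    if total:
        beliefs = [b / total for b in beliefs]
    return max(range(n_classes), key=beliefs.__getitem__), beliefs


# Made-up example with classes 0 = normal and 1 = attack.
reliable = [[90, 10], [5, 95]]
weak = [[60, 40], [45, 55]]
label, beliefs = naive_bayes_fusion([reliable, weak, weak], [1, 0, 0])
print(label, beliefs)
```

In this made-up case the single reliable classifier outweighs the two weak ones, so the fused decision is "attack" even though a majority vote would have returned "normal".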

5. Experimental results

Experiments were carried out on a subset of the database created by DARPA in the framework of the 1998 Intrusion Detection Evaluation Program (http://www.ll.mit.edu/IST/ideval). We used the subset that was pre-processed by Columbia University and distributed as part of the UCI KDD Archive (http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html). The available database is made up of a large number of network connections related to normal and malicious traffic. Each connection is represented by a 41-dimensional feature vector according to the set of features illustrated in Section 3. Connections are also labelled as belonging to one out of five classes, i.e., normal traffic, Denial of Service (DoS) attacks, Remote to Local (R2L) attacks, User to Root (U2R) attacks, and Probing attacks. Each attack class is made up of different attack variants, i.e., attacks designed to attain the same effect by exploiting different vulnerabilities of the computer network.

In order to test our pattern recognition approach thoroughly, we restricted our investigation to connections related to the ftp service. These connections can be represented by a feature set containing 30 out of the 41 available features. It was possible to discard 11 features out of 41 because they exhibited a constant value over all ftp connections (the discarded features were related to other services). The 30 features were subdivided into the three categories described in Section 3, so that 4 features belong to the “intrinsic” category, 19 to the “traffic” category, and 7 to the “content” category. For more details about the feature extraction process the reader is referred to (Lee and Stolfo, 2000). Features were linearly normalized so that they took values within the range [0,1].
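The linear normalization to [0,1] mentioned above is commonly done as min-max scaling. The sketch below assumes the per-feature minimum and maximum are estimated on the training data; the paper does not state how the bounds were obtained or how out-of-range test values were handled, so those choices, like the function names, are assumptions.

```python
def fit_min_max(rows):
    """Return per-feature (min, max) pairs computed over the given rows."""
    columns = list(zip(*rows))
    return [(min(c), max(c)) for c in columns]


def normalize(row, bounds):
    """Linearly map each feature to [0, 1]; constant features map to 0."""
    return [
        (x - lo) / (hi - lo) if hi > lo else 0.0
        for x, (lo, hi) in zip(row, bounds)
    ]


# Made-up training rows with three features; the third one is constant.
training_rows = [[0, 10, 5], [50, 30, 5], [100, 20, 5]]
bounds = fit_min_max(training_rows)
normalized = [normalize(row, bounds) for row in training_rows]
print(normalized)
```

The guard for constant features only avoids a division by zero in the sketch; in the experiments such features were discarded before classification.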
A training set made up of 122 normal samples, 6 U2R attacks, 539 R2L attacks, 1 probing, and 57 DoS attacks, corresponding to a total of 725 ftp connections, and a test set made up of 7436 connections, were extracted from the available dataset. In particular, the test set was made up of 5128 connections related to normal traffic and 2308 connections related to attacks, each of the latter belonging to one of the four attack classes. It is worth noting that 125 attack connections were related to attack types not included in the training set, so that the classification results related to these patterns made it possible to test the ability of the pattern recognition approach to detect novel attack types.

For the time being, we addressed the problem of imbalanced classes in the training set by carrying out a number of preliminary classification experiments in order to identify the connections that needed more samples to produce reliable performances on the test set. Consequently, we populated the training set with a number of copies of these connections, so that the training set used in the experiments was made up of 157 normal samples, 46 U2R attacks, 598 R2L attacks, 25 probing activities, and 57 DoS attacks corresponding to a total of 833 ftp connections. This heuristic approach agrees with the practical observation that if the same attack type is carried out a number of times, its connections exhibit very similar feature values (i.e., these feature values correspond to the signature of that attack). Therefore, in the context of intrusion detection applications, the simple replication of patterns is a reasonable solution to the problem of imbalanced data classes. More sophisticated techniques can obviously be investigated.

TABLE 1
Overall performances on the test set of the considered neural networks. Three networks were trained on distinct feature sets. One network was trained on the entire feature set.

Classifier Type               % error   Average cost   % false alarms
MLP - 4 intrinsic features      1.51       0.031            3.19
MLP - 7 content features        1.20       0.024            2.25
MLP - 19 traffic features       9.83       0.200           23.94
MLP - 30 features               1.55       0.029            3.57

Table 1 shows the performances on the test set of three neural networks trained using distinct feature representations, i.e., the 4 intrinsic features, the 7 content features, and the 19 traffic features. In addition, the performances of a neural network trained using the entire 30-dimensional feature vector are reported for the sake of comparison. These networks are fully-connected multi-layer perceptrons (MLPs) with three layers of neurons. Each network has 5 output neurons, one per data class, and a number of inputs equal to the number of features. A hidden layer made up of 5 neurons was used for the networks trained on distinct feature representations. Fifteen hidden neurons were used for the network using all the 30 available features as inputs. Different neural network architectures were trained using the backprop algorithm, with different learning rates and random starting weights. The reported results represent the best performances attained on the test set. Classification results are reported in terms of the overall classification error, the average classification cost computed according to the cost matrix shown in Table 2, and the false alarm rate. (Other researchers used the cost matrix shown in Table 2 to weight errors according to their severity (Elkan, 2000).)

TABLE 2
Cost matrix used to evaluate the confusion matrix related to each classifier

                      Assigned class
True class   Normal   U2R   R2L   Probing   DoS
Normal          0      2     2       1       2
U2R             3      0     2       2       2
R2L             4      2     0       2       2
Probing         1      2     2       0       2
DoS             2      2     2       1       0
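As an illustration of how the overall error and the average classification cost can be obtained from a confusion matrix and the cost matrix of Table 2, consider the sketch below. The confusion matrix is made up for the example and does not come from the experiments; `evaluate` is a hypothetical helper, and the average cost is assumed to be the total cost divided by the number of classified connections.

```python
CLASSES = ["Normal", "U2R", "R2L", "Probing", "DoS"]

# Cost matrix of Table 2, indexed as COST[true class][assigned class].
COST = [
    [0, 2, 2, 1, 2],
    [3, 0, 2, 2, 2],
    [4, 2, 0, 2, 2],
    [1, 2, 2, 0, 2],
    [2, 2, 2, 1, 0],
]


def evaluate(confusion):
    """Return (error in %, average cost) for confusion[true][assigned]."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    errors = total - sum(confusion[i][i] for i in range(n))
    cost = sum(confusion[i][j] * COST[i][j] for i in range(n) for j in range(n))
    return 100.0 * errors / total, cost / total


# Made-up confusion matrix over 200 connections (not from the paper).
confusion = [
    [95, 0, 3, 2, 0],
    [1, 4, 0, 0, 0],
    [2, 0, 48, 0, 0],
    [0, 0, 0, 10, 0],
    [0, 0, 0, 1, 34],
]
error_pct, average_cost = evaluate(confusion)
print(error_pct, average_cost)
```

Note how the two R2L connections classified as normal contribute a cost of 8, more than the five errors on normal traffic combined, which is how the cost matrix weights errors by severity.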

The overall performances of neural networks, except for the network trained on traffic features, are quite similar to each other; the network trained on the content features provided the best results. This result indicates that the content feature set is well suited for the type of traffic at hand, while the reverse is true with respect to the traffic feature set.

TABLE 3
Attack detection performances on the test set of the considered neural networks. Three networks were trained on distinct feature sets. One network was trained on the entire feature set.

                                  Known attacks            Unknown attacks
Classifier Type               % error   Average cost   % error   Average cost
MLP - 4 intrinsic features      0.73       0.017        16.00       0.320
MLP - 7 content features        0.82       0.017        14.40       0.280
MLP - 19 traffic features       2.11       0.047        31.20       0.880
MLP - 30 features               0.64       0.011        12.00       0.192

Table 3 reports the performances of the four neural networks on known attacks, i.e., attack types included in the training set, and unknown attacks, i.e., the 125 attack samples related to attack types not included in the training set (each unknown attack type belongs to one of the four attack classes). The network trained on the traffic features exhibits the worst performances, with high error rates and high costs. The two networks trained on the intrinsic and content features exhibit quite similar behaviour. In particular, they exhibit good generalization ability with respect to unknown attacks. However, as can be expected, the best results in terms of detection of known and unknown attacks are provided by the network trained on the overall feature set.

TABLE 4
Overall performances on the test set of the five fusion rules used to combine the three neural networks trained on three distinct feature sets.

Fusion Rule            % error   Average cost   % false alarms
Majority                 0.89       0.018            1.29
Average                  0.87       0.017            1.33
Naive Bayes              0.82       0.014            0.87
Decision Templates       0.81       0.016            1.41
A Posteriori DCS         0.59       0.011            0.65
MLP - 30 features        1.55       0.029            3.57
Oracle                   0.40       0.007            0.17

TABLE 5
Attack detection performances on the test set of the five fusion rules used to combine the three neural networks trained on three distinct feature sets.

                           Known attacks            Unknown attacks
Fusion Rule            % error   Average cost   % error   Average cost
Majority                 0.64       0.014        17.60       0.344
Average                  0.60       0.013        16.80       0.296
Naive Bayes              0.87       0.015        17.60       0.248
Decision Templates       0.50       0.011        12.80       0.216
A Posteriori DCS         0.50       0.011        14.40       0.256
MLP - 30 features        0.64       0.011        12.00       0.192
Oracle                   0.50       0.009        12.00       0.184

The performances of the multiple classifier systems made up of the three classifiers trained on the three distinct feature sets are shown in Tables 4 and 5. For comparison purposes the performances of the ideal oracle are also shown. All the considered fusion rules attained improved overall performances with respect to the individual classifiers, as well as with respect to the neural network trained on the 30-dimensional feature vector. In particular, “trainable” fusion rules provided better performances than “fixed” rules, as the performances of the classifier trained on traffic features are much worse than those of the other two classifiers. (We remark that fixed rules usually suffer from accuracy imbalance (Roli and Kittler, 2002).) The best performances were attained by the “A Posteriori” DCS technique (Giacinto et al., 2000), thus pointing out that DCS can perform well for classifiers exhibiting imbalanced accuracies and pair-wise correlations.

The above results are also confirmed in the detection of known attacks, the best performances being provided by decision templates and DCS. In particular, the accuracy of these fusion rules is equal to that of the oracle. The only exception is the Naive-Bayes fusion rule, which did not provide better results. It is worth noting, however, that the improvements are slight, as good performances are already attained by the individual networks. On the contrary, the analysis of the performances of fusion rules on unknown attacks shows that no improvement over the best result of the individual networks is attained, except for the decision templates. In addition, no fusion rule improves on the neural network trained on the overall feature set, which attains the same error rate as the oracle. It is worth noting that decision templates and DCS provide the best performances among the considered fusion rules.

However, it should be pointed out that the main goal of an effective IDS is to provide high rates of attack detection with very small rates of false alarms (Axelsson, 2000; Proctor, 2001). As the number of normal connections is some orders of magnitude higher than that of the attacks, small rates of false alarms can translate into a number of alarms unacceptable for operational environments. (An IDS with a 100% attack detection rate can be useless even if its false alarm rate is small.) The proposed fusion of multiple classifiers provides a very effective trade-off between false alarm rate and attack detection accuracy. As an example, for the considered data set, the DCS algorithm produced 33 false alarms against the 183 of the individual neural network trained with the whole set of 30 features. At the same time, 29 attacks were missed both by DCS and by this neural network, the neural network being slightly more accurate in detecting unknown attacks. Therefore, we can conclude that the fusion of multiple classifiers satisfies the constraints of operational environments better than the previously proposed pattern recognition methods based on individual classifiers.
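The false alarm counts quoted above can be related to the rates of Table 4 with a quick check. The sketch assumes the false alarm rate is the number of false alarms divided by the 5128 normal connections of the test set; that definition is an assumption, as the paper does not spell it out.

```python
N_NORMAL_TEST = 5128  # normal connections in the test set


def false_alarm_rate(false_alarms, n_normal=N_NORMAL_TEST):
    """False alarm rate in percent, relative to normal test connections."""
    return 100.0 * false_alarms / n_normal


dcs_rate = false_alarm_rate(33)   # A Posteriori DCS
mlp_rate = false_alarm_rate(183)  # MLP trained on all 30 features
print(round(dcs_rate, 2), round(mlp_rate, 2), round(183 / 33, 1))
```

Under this assumption the MLP figure reproduces the 3.57% reported in Table 4, and the DCS figure comes out at about 0.64%, close to the reported 0.65%; in absolute terms DCS raises roughly five and a half times fewer false alarms.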

6. Conclusions

In this paper, we have proposed a multiple classifier approach based on distinct feature representations, and experimented with five different fusion rules. The reported results showed that the MCS approach provides a better trade-off between generalization abilities and false alarm generation than that provided by an individual classifier trained on the overall feature set. Among the fusion rules evaluated, the dynamic classifier selection technique provided the best results, as it can exploit not only the diversity of independent classifiers, but also that of negatively and positively correlated classifiers. As one of the main criticisms raised by network security experts against previously proposed pattern recognition methods in intrusion detection was related to the high false alarm rates that such methods usually produce (Allen et al., 2000), we believe that our work can contribute to designing future pattern-recognition-based IDSs that satisfy the operational requirements. In particular, the results reported here should increase the degree of acceptance of the pattern recognition approach to intrusion detection among network security experts. This main conclusion of our work is supported by the reported results, which show that the fusion of multiple classifiers achieves a better trade-off between generalization abilities and false alarm generation than that provided by individual classifiers.

References

Allen J., Christie A., Fithen W., McHugh J., Pickel J., Storner E., 2000. State of the Practice of Intrusion Detection Technologies. Tech. Rep. CMU/SEI-99-TR-028, Software Engineering Institute, Carnegie Mellon University.
Axelsson S., 2000. The Base-Rate Fallacy and the Difficulty of Intrusion Detection. ACM Trans. on Information and System Security 3(3), 186-205.
Bonifacio J.M., Cansian A.M., de Carvalho A.C.P.L.F., Moreira E.S., 1998. Neural Networks applied in intrusion detection systems. Proc. of the IEEE World congress on Comp. Intell. (WCCI ’98).
Cannady J., 2000. An adaptive neural network approach to intrusion detection and response. PhD Thesis, School of Comp. and Inf. Sci., Nova Southeastern University.
Debar H., Becker M., Siboni D., 1992. A Neural Network Component for an Intrusion Detection System. Proc. of the IEEE Symp. on Research in Security and Privacy, Oakland, CA, USA, 240-250.
Duda R., Hart P., Stork D.G., 2001. Pattern Classification. John Wiley & Sons.
Elkan C., 2000. Results of the KDD’99 Classifier Learning. ACM SIGKDD Explorations 1, 63-64.
Ghosh A.K., Schwartzbard A., 1999. A Study in Using Neural Networks for Anomaly and Misuse Detection. Proc. of the USENIX Security Symposium, August 23-26, 1999, Washington, USA.
Giacinto G., Roli F., Fumera G., 2000. Selection of Image Classifiers. Electronic Letters 36(5), 420-422.
Giacinto G., Roli F., Bruzzone L., 2001. Combination of Neural and Statistical Algorithms for Supervised Classification of Remote-Sensing Images. Pattern Recognition Letters, 21(5), 385-397.
Kittler J., Hatef M., Duin R.P.W., Matas J., 1998. On Combining Classifiers. IEEE Trans. on Pattern Analysis and Machine Intelligence 20(3), 226-229.
Kuncheva L.I., Bezdek J.C., Duin R.P.W., 2001. Decision Templates for Multiple Classifier Fusion. Pattern Recognition 34(2), 299-314.
Lee S.C., Heinbuch D.V., 2001. Training a Neural-Network Based Intrusion Detector to Recognize Novel Attacks. IEEE Trans. on Systems, Man, and Cybernetics Part A 31, 294-299.
Lee W., Stolfo S.J., 2000. A framework for constructing features and models for intrusion detection systems. ACM Trans. on Inform. and System Security 3(4), 227-261.
McHugh J., Christie A., Allen J., 2000. Defending Yourself: The Role of Intrusion Detection Systems. IEEE Software, Sept./Oct. 2000, 42-51.
Northcutt S., Cooper M., Fearnow M., Frederick K., 2001. Intrusion Signatures and Analysis. New Riders Pub.
Northcutt S., Novak J., 2001. Network Intrusion Detection (2nd ed). New Riders Pub.
Proctor P.E., 2001. The Practical Intrusion Detection Handbook. Prentice Hall.
Roli F., Kittler J. (Eds.), 2002. Multiple Classifier Systems. Springer-Verlag, Lecture Notes in Computer Science, vol. 2364.
Ryan J., Lin M.J., Miikkulainen R., 1998. Intrusion Detection with Neural Networks. In: Advances in Neural Information Processing Systems 10, M. Jordan et al., Eds., Cambridge, MA: MIT Press, 943-949.
Sharkey A.J.C., 1999. Combining Artificial Neural Nets. Springer.
Xu L., Krzyzak A., Suen C.Y., 1992. Methods for combining multiple classifiers and their applications to handwriting recognition. IEEE Trans. Systems, Man and Cybernetics 22, 418-435.
