Attack Type Prediction Using Hybrid Classifier

Sobia Shafiq, Wasi Haider Butt, and Usman Qamar
Department of Computer Engineering, College of Electrical & Mechanical Engineering (CEME), National University of Sciences and Technology (NUST), Pakistan
[email protected], {wasi,usmanq}@ceme.nust.edu.pk

Abstract. Due to the rapid increase in terrorist activities throughout the world, serious attention is required to deal with such activities. There must be a mechanism that can predict what kinds of "attack types" may happen in the future, so that appropriate measures can be taken accordingly. In this paper, a hybrid classifier is proposed which combines several existing classifiers: K Nearest Neighbor, Naïve Bayes, Decision Tree, Averaged One-Dependence Estimators and BIFReader. The proposed technique is implemented in Rapid Miner 5.3 and achieves a satisfactory level of accuracy. Results reveal an improvement in accuracy for the proposed technique compared to the individual classifiers used.

Keywords: Classification, Prediction, K-NN, Naive Bayes, Decision Tree, AODE, BIFReader.

1 Introduction

Terrorism has existed for as long as one can remember, but over the past decade there has been a huge increase in terrorist activities. Terrorists perform activities such as hijacking, murder, kidnapping and bombing in order to achieve an agenda. These activities are not limited to one region or country but occur throughout the world. As these incidents have increased in the past few years, the phenomenon of terrorism has become an important issue for government authorities [1]. There is a serious need for security agencies and government authorities to detect terrorist activities before they happen and to take precautionary measures accordingly. This research is based on the prediction of "attack types" using data mining techniques.

The rest of this paper is organized as follows. Section 2 discusses the related work. Section 3 illustrates the proposed framework. Section 4 provides the experimentation and implementation of the proposed technique. Section 5 analyses the experimental results. Section 6 concludes this paper and points out some future work for the proposed framework.

2 Related Work

Data Mining is a process in which data analysis is done. In data mining, different discovery algorithms are applied to the data to produce particular patterns or models over the data [2].

X. Luo, J.X. Yu, and Z. Li (Eds.): ADMA 2014, LNAI 8933, pp. 488–498, 2014. © Springer International Publishing Switzerland 2014

In data mining, different types of techniques and algorithms are used, such as classification, clustering, artificial intelligence, association rules and decision trees, for knowledge discovery from large databases. Of these techniques, classification is one of the most popular and most studied, and it can be used for prediction: the value of an attribute can be predicted based on the values of other attributes. The attribute to be predicted is known as the "class" [3]. Classification and prediction are two forms of data analysis which describe classes or indicate future trends by building models. This section briefly describes different classification techniques in data mining: K Nearest Neighbor (K-NN), Naive Bayes, Decision Tree, Averaged One-Dependence Estimators (AODE) and BIFReader classification. It also gives an overview of ensemble or hybrid classifiers.

K Nearest Neighbor (K-NN) is a classification algorithm that was introduced in the early 1950s. N. Suguna and K. Thanushkodi [4] state that K-NN is an important pattern recognition algorithm; in K-NN, classification rules are produced using only the training dataset instead of any additional data. Xingjiang Xiao and Huafeng Ding [5] present that in a K-NN classifier, classification rules are generated from the training dataset. When test data is given to a K-NN classifier, it predicts its category based on the training samples that are nearest neighbors to the test data. The algorithm first calculates the distance between the new query and the already known samples to find the K nearest neighbors. Once the K nearest neighbors are gathered, the majority class among them is taken as the prediction for the query instance. Pratiksha Y. Pawar and S. H. Gawande [6] present in their research that K-NN classification is outstanding because of its simplicity.
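The K-NN prediction step just described can be sketched in a few lines. The toy training set, feature tuples and choice of k = 3 below are illustrative assumptions, not values from the paper:

```python
from collections import Counter

def knn_predict(train, query, k=3):
    # Euclidean distance between two feature tuples
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    # Sort training samples by distance to the query and keep the k nearest
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    # The majority class among the k nearest neighbors is the prediction
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

# Toy training set: (feature tuple, class label)
train = [((1, 1), "A"), ((1, 2), "A"), ((8, 8), "B"), ((9, 8), "B")]
print(knn_predict(train, (2, 1)))  # the 3 nearest samples are mostly "A"
```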
This technique is widely used in text classification, as well as in classification tasks with multi-category documents, because of its simplicity and easy implementation. One drawback of this algorithm is that it takes more time if the training dataset is large. Liu Yu and Chen Gui-Sheng [7] present that K-NN is the most widely used lazy learning method. It is one of the most powerful classification algorithms and can deal with complex problems easily. K-NN provides well classified samples when the training dataset is large.

Pat Langley and Stephanie Sage [8] state in their research that Naive Bayes is an important technique used for probabilistic induction. This classification technique represents each class with a single probabilistic summary. Jaideep Vaidya et al. [9] present that Naive Bayes is a very useful Bayesian learning technique that is well suited to high-dimensional tasks. A Naive Bayes classifier takes an arbitrary number of continuous or categorical variables and classifies an instance into one of several classes. It is based on Bayes' theorem with a strong independence assumption: the classifier considers that the absence or presence of a particular feature or attribute is independent of the absence or presence of any other feature. The advantage of this classifier is that it requires only a small amount of training data to calculate the means and variances of the variables required for classification.

Amany Abdelhalim and Issa Traore [10] present that decision trees are models that direct the decision making process. Decision trees can be created from a dataset (data-based decision trees) as well as from proposed rules. A decision tree is an efficient technique for guiding a decision process as long as no changes occur in the
dataset used to create the decision tree. A decision tree is a flowchart-like tree structure that works on numerical as well as categorical data, and decision trees are based on recursive partitioning. A decision tree is a directed tree structure with three types of node: the root node, which has no incoming edges; internal (or test) nodes, which have outgoing edges; and leaves. Each internal node represents a test on an attribute, each branch or edge represents an outcome of that test, and each leaf denotes a class label. Decision trees are popular in the field of classification because they do not require any domain knowledge, and thus they are widely used for classification purposes [11].

Liangxiao Jiang and Harry Zhang [12] present in their research that Naive Bayes is a classification model based on probabilities and on an attribute independence assumption. In real world data this assumption of attribute independence is violated. Researchers have investigated this issue and tried to improve Naive Bayes's accuracy by relaxing its attribute independence assumption. Averaged One-Dependence Estimators (AODE) was proposed in this regard: it weakens the attribute independence assumption of Naive Bayes by averaging over all models from a restricted class of one-dependence classifiers. AODE is a probabilistic classification method that deals with the attribute independence assumption of Naive Bayes by averaging all of the one-dependence estimators [13]. BIFReader constructs a description of a Bayes Net classifier stored in XML BIF 0.3 format; it accepts a training dataset as input and outputs a model.

Nicholas Stepenosky et al. state in their research that combinations of classifiers are now more popular than individual classifiers because of their better performance and superiority over individual classifier systems.
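The tree structure described earlier — internal nodes testing attributes, leaves carrying class labels — can be illustrated with a small hand-built tree. The attributes and class names below are hypothetical stand-ins for a tree a learner would induce from data:

```python
def tree_predict(incident):
    # Root node: test on the weapon attribute
    if incident["weapon"] == "Explosives":
        return "Bombing/Explosion"            # leaf: class label
    # Internal node: test on the ransom attribute
    if incident["ransom"]:
        return "Hostage Taking (Kidnapping)"  # leaf: class label
    return "Armed Assault"                    # default leaf

print(tree_predict({"weapon": "Firearms", "ransom": True}))
```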
Ensemble techniques for classifiers include bagging, boosting and voting; these techniques are quite effective in many applications. The main idea behind combining classifiers is that individual classifiers are diverse and their individual chances of error are higher, while combining classifiers can reduce errors and result in better performance through averaging [14]. Lior Rokach in his research states that the idea of ensemble classifiers is to build a model by combining different classifiers for better performance. Different types of techniques can be used for combining classifiers, one of which is majority voting: the classification of an unlabeled instance is carried out according to the class that obtains the highest number of votes [15].
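Majority voting as just described can be sketched directly; the five example votes below are hypothetical:

```python
from collections import Counter

def majority_vote(predictions):
    # The class receiving the most votes wins; on a tie, Counter keeps
    # first-seen order, so the earliest tied class is returned.
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical predictions from five base classifiers for one instance
votes = ["Bombing", "Bombing", "Armed Assault", "Bombing", "Assassination"]
print(majority_vote(votes))  # "Bombing"
```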

3 Proposed Technique

The proposed framework consists of a hybrid classifier, which is shown in Figure 1, and has the following steps:

1. Data Gathering
2. Data Pre-processing
3. Data Mining
4. Data Deployment and Testing


Fig. 1. Proposed Framework for Prediction of Attack Types Using Global Terrorism Database

3.1 Data Gathering

The first phase of the data mining process is data gathering. This step is concerned with how and from where data is collected [20]. For the proposed framework, the Global Terrorism Database (GTD) is used as an open source database. The database has been obtained from the National Consortium for the Study of Terrorism and Responses to Terrorism (START) initiative at the University of Maryland, from their online interface at
http://www.start.umd.edu/gtd/. GTD includes information about terrorist activities that occurred throughout the world from 1970 to 2012. There are more than 113,000 cases recorded in GTD, including information on more than 14,400 assassinations, 52,000 bombings, and 5,600 kidnappings since 1970. Complete information about each incident is given in GTD, for example the date of the event, its location, the weapon used, the number of casualties, the attack type and the group responsible for the event. It is the most comprehensive database on terrorist activities throughout the world [19].

3.2 Data Pre-Processing

Data gathered from any source is dirty and unclean [20]. It may have the following three problems:

1. Incomplete data (data with some missing attribute values)
2. Inconsistent data (data containing discrepancies in codes or names)
3. Noisy data (data containing errors or outliers) [19]

Data pre-processing is an important step in the knowledge discovery process. The data gathered from any source is dirty and is of no use without pre-processing. Data pre-processing is a process through which source data is transformed into a different format, so that the following goals can be achieved [20]:

1. It ensures the easy application of data mining algorithms.
2. The performance and effectiveness of mining algorithms can be improved.
3. Data obtained after pre-processing is easily understandable by both humans and machines.
4. It ensures faster data retrieval from databases.

To deal with unclean data, the following tasks can be applied to make it suitable for further use:

1. Data cleaning (concerned with filling in missing values and smoothing noisy data)
2. Data integration (deals with combining several databases or files)
3. Data transformation (concerned with normalization)
4. Data reduction (reduces the amount of data without disturbing the analytical results)
5. Data discretization (part of data reduction, of specific importance for numerical data)

3.2.1 Data Pre-Processing of the Global Terrorism Database (GTD)

By going through the above mentioned pre-processing steps, the dataset (GTD) is transformed into a form that is suitable for data mining algorithms. Missing values are
removed from GTD and the required attributes are selected (data reduction). There are in total 134 attributes and 113,114 records in GTD, with a lot of missing values. For the proposed framework, the data in GTD is reduced by selecting 8 attributes and 45,221 records which have a large impact on the prediction of attack type. The data obtained after reduction is in both numerical and textual format. The attributes selected for the prediction of attack types are listed below:

1. Country (the country or location where the incident happened)
2. Region (a categorical variable representing the region in which the incident occurred)
3. Attack type (a categorical variable showing which kind of attack occurred, e.g. assassination, bombing or kidnapping; there are in total 9 kinds of attack type recorded in GTD)
4. Target type (the target category)
5. Group name (the group responsible for the attack)
6. Weapon type (the type of weapon used in the attack)
7. Property (whether any property damage occurred during the incident)
8. Ransom (whether a ransom was demanded for the incident)

Through the proposed framework, attack types will be predicted based on the other selected attributes, which have the most impact on attack types.

3.3 Data Mining and Classification

Classification is an important predictive data mining method, used to make predictions from known data. K-NN, Naïve Bayes, Decision Tree, AODE and BIFReader are applied to GTD, and a hybrid classifier is proposed by combining these individual classifiers. The implementation of these classification algorithms is discussed in Section 4.

3.4 Data Deployment and Testing

Models are created from the dataset using classification algorithms, and these models can be used for prediction. In data mining, the act of applying a model to a dataset is known as deployment. For testing of the proposed framework, the dataset (GTD) was split into two parts, one for training and the other for testing. Models are created by applying the classification techniques to the training dataset, and testing is performed using those models.
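The train/test split described here can be sketched as a shuffled cut. The 70/30 fraction and the fixed seed below are assumptions for the sketch; the paper does not state its split ratio:

```python
import random

def split(rows, train_frac=0.7, seed=42):
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # deterministic shuffle for the sketch
    cut = int(len(rows) * train_frac)
    return rows[:cut], rows[cut:]      # (training set, testing set)

train, test = split(range(10))
print(len(train), len(test))  # 7 3
```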

4 Experimentation and Implementation

This section elaborates the implementation of the proposed framework. Rapid Miner 5.3 is used for the implementation. Rapid Miner is an environment
used for experimentation with machine learning and data mining techniques. For the prediction of attack types, GTD is split into two parts: a training dataset and a testing dataset. K-NN, Naive Bayes, Decision Tree, AODE and BIFReader are applied to the training dataset to build models. The implementation of the proposed technique includes classification using the individual classifiers as well as the hybrid classifier.

4.1 K-NN Classifier

A model is created using the K-NN classifier in Rapid Miner. For this purpose, the dataset (GTD) is retrieved using the Retrieve Data operator. After retrieving the dataset, the Split operator is applied to divide the data into two parts, for training and testing. The K-NN classifier is applied to the training dataset and a model is created. This model is then applied to the testing dataset using the Apply Model operator, after which the accuracy and classification error of the model are measured using the Performance operator. The accuracy of the K-NN classifier is 80.50%, and its classification error is 19.50%.

4.2 Naïve Bayes Classifier

A model is created using the Naïve Bayes classifier. For this purpose, the dataset (GTD) is retrieved. After retrieving the dataset, the data is split for training and testing and a model is created using the training dataset. This model is then applied to the testing dataset, and the accuracy and classification error are calculated. The accuracy of the Naïve Bayes classifier is 81.54%, and its classification error is 18.46%.

4.3 Decision Tree Classifier

A model is created using the decision tree classifier. For creation of the model, the dataset (GTD) is retrieved in Rapid Miner and then split into two parts, for training and testing. The decision tree classifier is applied to the training dataset and a model is created. That model is then applied to the testing dataset. The accuracy calculated for the decision tree classifier is 82.90%, and its classification error is 17.10%.

4.4 AODE Classifier

A model is created using the AODE classifier. For this purpose, the dataset (GTD) is retrieved. After retrieving the dataset, the data is split into two parts, for training and testing. The AODE classifier is applied to the training dataset and a model is created. This model is then applied to the testing dataset, after which the accuracy of the model is measured: 84.07%, with a classification error of 15.93%.
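The accuracy and classification-error figures reported for each classifier are complementary quantities. A sketch of what RapidMiner's Performance operator measures, with made-up labels:

```python
def accuracy(y_true, y_pred):
    # Fraction of test examples whose predicted class matches the truth;
    # classification error is simply 1 - accuracy.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = ["Bombing", "Kidnapping", "Bombing", "Assault"]
y_pred = ["Bombing", "Kidnapping", "Assault", "Assault"]
acc = accuracy(y_true, y_pred)
print(acc, 1 - acc)  # 0.75 0.25
```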

4.5 BIFReader Classifier

A model is created using the BIFReader classifier. For this purpose, the dataset (GTD) is retrieved and, after retrieving, the data is split into two parts, for training and testing. The BIFReader classifier is then applied to the training dataset and a model is created. This model is then applied to the testing dataset, and the accuracy and classification error of the model are measured. The accuracy of the BIFReader classifier is 83.30%, and its classification error is 16.70%.

4.6 Hybrid Classifier

The proposed hybrid classifier is built by combining the K-NN, Naïve Bayes, Decision Tree, AODE and BIFReader classifiers for better performance and results. The hybrid classifier is made using the Vote operator in Rapid Miner. The Vote operator has a subprocess which must contain at least two learners, called base learners. The operator builds a classification or regression model depending upon the example set and the learners, and uses a majority vote over the predictions of the base learners in its subprocess. During classification, all the operators in the subprocess of the Vote operator accept the given training dataset and generate a classification model. For the prediction of an unknown example set, the Vote operator applies all the classification models included in its subprocess and assigns the predicted class with the maximum votes to the unknown example.

For this purpose, the dataset (GTD) is retrieved and split into two parts, one for training and one for testing. The Vote operator is applied to the training dataset, with the K-NN, Naive Bayes, Decision Tree, AODE and BIFReader classifiers as its subprocesses. Each of these classifiers receives the dataset and generates a classification model; the Vote operator then applies all the classification models from its subprocesses and assigns the predicted class with the maximum votes to the unknown example. This model is then applied to the testing dataset, and its accuracy and classification error are calculated. The accuracy of the hybrid classifier is 85.10%, and its classification error is 14.90%.
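The Vote-operator behaviour described above — apply every base model, then take the class with the most votes — can be sketched as follows. The three stand-in base models are trivial placeholders for trained classifiers:

```python
from collections import Counter

def hybrid_predict(models, example):
    # Each base model classifies the example; the majority class wins.
    votes = [m(example) for m in models]
    return Counter(votes).most_common(1)[0][0]

# Stand-in base models (a real hybrid would use K-NN, Naive Bayes, etc.)
models = [lambda e: "Bombing",
          lambda e: "Bombing",
          lambda e: "Kidnapping"]
print(hybrid_predict(models, {"region": "South Asia"}))  # "Bombing"
```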

5 Analysis of Results

This section presents the results of the proposed hybrid classifier and the individual classifiers in graphical form. The accuracy of the K-NN, Naive Bayes, Decision Tree, AODE and BIFReader classifiers, as well as the proposed hybrid classifier, is shown in Figure 2, where the classifiers are taken along the x axis and accuracy is plotted along the y axis. The graphical representation of the results shows that K-NN has the least accuracy, whereas the proposed hybrid classifier has the maximum accuracy.

Fig. 2. Graphical Representation of Accuracy of Individual and Hybrid Classifiers

The classification error of the K-NN, Naive Bayes, Decision Tree, AODE and BIFReader classifiers, as well as the proposed hybrid classifier, is shown in Figure 3. In the figure, classifiers are taken along the x axis, whereas classification error is plotted along the y axis. The graphical representation of the results shows that the proposed hybrid classifier has the least classification error, whereas the K-NN classifier has the maximum classification error.

Fig. 3. Graphical Representation of Classification Error of Individual and Hybrid Classifiers

6 Conclusion and Future Work

In this research, a hybrid classifier is proposed for predicting terrorist activities using data mining techniques. The hybrid classifier combines five classification techniques: K-NN, Naive Bayes, Decision Tree, AODE and BIFReader. The five individual classifiers are also used for comparison. The accuracy and classification error of the individual classifiers as well as the hybrid classifier show that the hybrid classifier gives the best result for predicting future "attack types". In future, this research work can be extended to different classification algorithms and different ensemble techniques.

References

[1] Jenkins, B.M.: The Study of Terrorism: Definitional Problems. The Rand Corporation, Santa Monica (1980)
[2] Ozer, P.: Data Mining Algorithms for Classification. Radboud University Nijmegen (January 2008)
[3] Bhardwaj, B.K., Pal, S.: Data Mining: A prediction for performance improvement using classification. International Journal of Computer Science and Information Security (IJCSIS) 9(4) (April 2011)
[4] Suguna, N., Thanushkodi, K.: An Improved k-Nearest Neighbor Classification Using Genetic Algorithm. IJCSI International Journal of Computer Science Issues 4(2) (July 2010)
[5] Xiao, X., Ding, H.: Enhancement of K-nearest Neighbor Algorithm Based on Weighted Entropy of Attribute Value. In: 2012 5th International Conference on BioMedical Engineering and Informatics (BMEI 2012), pp. 1261–1264 (2012)
[6] Pawar, P.Y., Gawande, S.H.: A Comparative Study on Different Types of Approaches to Text Categorization. International Journal of Machine Learning and Computing 2(4), 423–426 (2012)
[7] Yu, L., Chen, G.-S.: KNN Algorithm Improving Based on Cloud Model. In: 2010 2nd International Conference on Advanced Computer Control (ICACC), March 27-29, vol. 2, pp. 63–66 (2010)
[8] Langley, P., Sage, S.: Induction of Selective Bayesian Classifiers. In: Proceedings of the 10th Conference on Uncertainty in Artificial Intelligence, Seattle, WA, pp. 399–406. Morgan Kaufmann, San Mateo (1994)
[9] Vaidya, J., Basu, A., Shafiq, B., Hong, Y.: Differentially Private Naive Bayes Classification. In: 2013 IEEE/WIC/ACM International Conferences on Web Intelligence (WI) and Intelligent Agent Technology (IAT), pp. 571–576 (2013)
[10] Abdelhalim, A., Traore, I.: A New Method for Learning Decision Trees from Rules. In: 2009 International Conference on Machine Learning and Applications, pp. 693–698 (2009)
[11] Han, J., Kamber, M.: Data Mining: Concepts and Techniques, 2nd edn.
[12] Jiang, L., Zhang, H.: Weightily Averaged One-Dependence Estimators. In: Proceedings of the 9th Pacific Rim International Conference on Artificial Intelligence, Guilin, China, August 07-11 (2006)
[13] Wu, J., Cai, Z.: Learning Averaged One-dependence Estimators by Attribute Weighting. Journal of Information & Computational Science 8(7), 1063–1073 (2011)
[14] Stepenosky, N., Green, D., Kounios, J., Clark, C.M., Polikar, R.: Majority vote and decision template based ensemble classifiers trained on event related potentials for early diagnosis of Alzheimer's disease. In: Proceedings of the IEEE Int. Conf. on Acoustics, Speech and Signal Processing, pp. 901–904 (2006)
[15] Rokach, L.: Ensemble-based classifiers. Artif. Intell. Rev. 33, 1–39 (2010), doi:10.1007/s10462-009-9124-7
[16] http://www.cecs.louisville.edu/datamining/PDF/0471228524.pdf
[17] http://www.start.umd.edu
[18] http://www.mimuw.edu.pl/~son/datamining/DM/4-preprocess.pdf
[19] http://www.cs.ccsu.edu/~markov/ccsu_courses/DataMining-3.html
[20] Jain, A., Nandakumar, K., Ross, A.: Score normalization in multimodal biometric systems. Pattern Recognition 38(12), 2270–2285 (2005)
