The International Congress for global Science and Technology

ICGST International Journal on Computer Network and Internet Research (CNIR) Special Issue on Network Security Techniques December, 2015 www.icgst.com www.icgst-amc.com www.icgst-ees.com © ICGST LLC, Delaware, USA, 2015

Table of Contents

Papers (pages)

Dynamic (2, 1, 8) Convolutional Coding Algorithm for Data Correction and Security Enhancement in an Intelligent Urban Traffic Management System
Lim Juleen and Tiong Sieh Kiong (1--6)

Vulnerability Analysis of Extensible Authentication Protocol (EAP) DoS Attack over Wireless Networks
Mina Malekzadeh, Abdul Azim Abdul Ghani, Jalil Desa and Shamala Subramaniam (7--14)

A Professional Comparison of C4.5, MLP, SVM for Network Intrusion Detection based Feature Analysis
Alaa F. Sheta and Amneh Alamleh (15--29)

A DOS Attack Intrusion Detection and Inhibition Technique for Wireless Computer Networks
Mofreh Salem, Amany Sarhan and Mostafa Abu-Bakr (31--38)

Identification of Information Systems Threats Sources: An Analytical Study
Ahmad Ali Al-Zubi (39--45)

IDSUDA: An Intrusion Detection System Using Distributed Agents
Ahmed Shaaban Abdel Alim, Imane Aly Saroit Ismail and S.H.Ahmed (47--57)

Performance Evaluation of VHDL Implementation of Proposed SAFER+ Security Algorithm and Pipelined AES Security Algorithm for Bluetooth Security Systems
D.Sharmila and R. Neelaveni (59--66)

AIML-CNIR Journal ISSN Print 1687-4846 ISSN Online 1687-4854 ISSN CD-ROM 1687-4862 © ICGST LLC, Delaware, USA, 2015

ICGST International Journal on Computer Network and Internet Research (CNIR) A publication of the International Congress for global Science and Technology (ICGST) Guest Editor: Prof. Dr. Alaa Sheta ICGST Editor in Chief: Ashraf Aboshosha www.icgst.com, www.icgst-amc.com, www.icgst-ees.com [email protected]

Preface

Cyber threats have become one of the most serious problems for both the economy and national security in the 21st century. Focused research is therefore needed to develop efficient techniques, technologies and tools to deal with this challenging problem. The growing volume and complexity of spatiotemporal data, generated daily from a variety of sources and distributed over all types of networks, make it a challenge to protect these data from theft or damage. Cyber security is the science concerned with protecting such big data from disruption or misuse. This special issue explores the complexity of this problem and presents a number of possible solutions.

L. Juleen and T. Kiong present a method to enhance the security of data transmitted over a Wireless Local Area Network. They provide a dynamic data security algorithm that automatically changes the configuration of both encoder and decoder based on a few bits of the initial input data to the encoder. Their new algorithm strengthens the overall security of the data transmitted over the wireless links.

A vulnerability analysis of Extensible Authentication Protocol (EAP) DoS attacks over wireless networks is presented by Malekzadeh et al. The authors present an experimental framework to demonstrate and quantify possible flooding attacks that use unprotected EAP frames against wireless communications. Their results show that such attacks are easy to launch and cause serious service disruption that compromises network availability.

A professional comparison of decision trees, artificial neural networks and support vector machines for network intrusion detection is presented by A. Sheta and A. Alamleh. Intrusion Detection Systems (IDSs) are one of the main solutions for computer and network security. An IDS is needed to identify unauthorized access attempts that compromise the confidentiality, integrity or availability of a computer or computer network. In this research, the authors provide new models for the intrusion detection (ID) problem using various data mining techniques. The proposed models reduce complexity while keeping acceptable detection accuracy.

Denial of Service (DOS) attacks are among the most serious network attacks. M. Salem et al. propose a new security technique that detects DOS attacks in WLANs and prevents the detected attackers from accessing the network in the future. They measure the Probability of Denied Service (PDS) with respect to the number of attacks and the maximum number of connections that the access point allows. These results show the effectiveness of the proposed technique in securing the WLAN against DOS attacks.

An analytical view of possible techniques for identifying the sources of information systems threats is presented by A. Al-Zubi. He proposes a new approach for identifying the source of threats and the actions to be taken against them. A framework called Intrusion Detection System Using Distributed Agents (IDSUDA) was built by A. Alim et al. This framework is extensible and can be enhanced to meet future challenges. A comparison of novel architectures for the VHDL implementation of the SAFER+ encryption algorithm and the pipelined AES algorithm is also presented in this special issue by D. Sharmila and R. Neelaveni. The proposed SAFER+ architecture was found to have better data throughput and frequency than the pipelined AES algorithm.

Alaa Sheta
Guest Editor
Computers and Systems Department
Electronics Research Institute (ERI)
Cairo, Egypt
© ICGST LLC, Delaware, USA 12-12-2015

Guest Editor, Biography: Prof. Dr. Alaa F. Sheta is currently a Professor at the Computers and Systems Department, Electronics Research Institute (ERI), Egypt. He received his PhD degree from the Computer Science Department, George Mason University, Fairfax, VA, USA in 1997. He received his B.E., M.Sc. degrees in Electronics and Communication Engineering from the Faculty of Engineering, Cairo University in 1988 and 1994, respectively. His main research area is in Evolutionary Computation, with a focus on Genetic Algorithms, Genetic Programming and applications. He is also interested in Particle Swarm Optimization, Differential Evolutions, Cuckoo Search, etc. Alaa Sheta authored/coauthored more than 100 publications in peer reviewed international journals, proceedings of the international conferences and book chapters. He is co-author of two books in the field of Landmine Detection and Classification and Image Reconstruction of a Manufacturing Process. He is the co-editor of the book: Business Intelligence and Performance Management - Theory, Systems and Industrial Applications by Springer Verlag, United Kingdom, published in March 2013.

Contact: Prof. Alaa F. Sheta, Ph.D.
Computers and Systems Department
Electronics Research Institute (ERI)
El-Tahrir Street, Dokky
Giza, Egypt

Research URLs:
http://www.icgst-amc.com/institute/Community.aspx?aid=184
https://scholar.google.com/citations?user=x7zJsNoAAAAJ
https://www.researchgate.net/profile/Alaa_Sheta

Special Issue on Network Security Techniques, CNIR Journal, ICGST LLC, Delaware, USA, December 2015

Dynamic (2, 1, 8) Convolutional Coding Algorithm for Data Correction and Security Enhancement in an Intelligent Urban Traffic Management System
Lim Juleen1 and Tiong Sieh Kiong2
1 Freescale Semiconductor Malaysia Sdn. Bhd., No.2, Jalan SS8/2, Free Industrial Zone Sungei Way, 47300 Petaling Jaya, Selangor, Malaysia. [email protected]
2 College of Engineering, Universiti Tenaga Nasional, Km 7, Jalan Kajang-Puchong, 43009 Kajang, Selangor, Malaysia. [email protected]

Abstract
This paper presents an application of a dynamic (2, 1, 8) convolutional encoding and Viterbi decoding for data correction and security enhancement in an Intelligent Urban Traffic Management System. The 5.8GHz Wireless Local Area Network (WLAN) system is employed, where the traffic data are transmitted among adjacent traffic intersections and a traffic control centre. Long distance and environmental factors such as bad weather and noise can critically influence the performance of the wireless link, especially in the aspect of Bit Error Rate (BER). A dynamic non-systematic (2, 1, 8) convolutional coding with unique configuration is developed as an error-correction mechanism to reduce the BER. The design of a convolutional code can be customized to any suitable configuration to enhance the security of the transmitted data. The dynamic design has the ability to automatically change the configuration of both encoder and decoder based on a few bits of initial input data to the encoder. This new dynamic design is unlike the existing fixed-configuration convolutional encoder and Viterbi decoder. Thus, it strengthens the overall security of the transmitted data over the wireless link. Viterbi decoding is used as it yields the maximum likelihood decoding of the convolutional code.

Keywords: Dynamic convolution encoding, Viterbi Algorithm, and 5.8GHz Wireless Local Area Network.

1. Introduction
Cities around the world are experiencing high traffic growth rates. An Intelligent Urban Traffic Management System is needed to provide the best strategy to disperse the congested traffic efficiently [1]. An outdoor Wireless Local Area Network (WLAN) system is applied as the communication system because it offers mobility, is easier and faster to install, and is economical, as there is no need for road digging and wiring. Wireless data communication integration and improvement is essential to optimize the transmission of the traffic data.

The 5.8 GHz band is chosen because there are minimal interference sources near this frequency at the traffic intersection site and because of its wider bandwidth. Wireless access integrated antennas are used as the communication device in this system. Environmental factors like the weather may change from time to time, and this may contribute to wireless signal interference. Bad weather such as rain, fog, mist, or extreme temperatures may hinder the performance of the WLAN [2]. In general, interference can be caused by radio devices operating in the same bands or by thermal noise, or both. Thermal noise, which can cause random and burst errors, is the only source of interference for a single AP (Access Point). However, with multiple cells, there is also interference from adjacent channels and co-channels. Thus, the overall impact of interference depends on the number of available frequency channels and cell deployment. Sensible cell deployment and management of the number of available channels can ease its effects [2]. A bit error rate (BER) of better than $10^{-5}$ is considered acceptable in WLAN applications [3].

Therefore, an error detection and correction mechanism based on coding is needed to ensure that the BER is better than $10^{-5}$. Coding that also offers security to the encoded data is ideal. There are various types of error-correction coding, which are classified as block codes or convolutional codes. Every coding system has its own features and is applied according to its suitability and merits. Block codes can detect and correct any single bit error occurring within the block.


Convolutional codes offer better protection against burst errors than block codes [4].

Convolutional codes are either systematic or nonsystematic. In a systematic code the information bits pass directly into the code stream, whereas in a nonsystematic code they do not. Although systematic encoding is simpler, a nonsystematic code performs better than a systematic one when Viterbi decoding is used [3].
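The difference between the two code types can be seen in a minimal sketch. It uses a small constraint length of 3 and tap masks chosen purely for illustration (they are assumptions, not the configurations used in this paper). In the systematic case the X output taps only the newest register bit, so X reproduces the input; in the nonsystematic case both outputs are parities over several register bits.

```python
def conv_encode(bits, taps, k=3):
    """Rate-1/2 convolutional encoder: returns one (X, Y) pair per input bit.

    taps holds two k-bit masks (illustrative assumptions); each output is the
    parity of the register bits selected by its mask.
    """
    mask = (1 << k) - 1
    reg = 0  # shift register, newest bit in the least significant position
    out = []
    for b in bits:
        reg = ((reg << 1) | b) & mask
        out.append(tuple(bin(reg & t).count("1") % 2 for t in taps))
    return out


data = [1, 0, 1, 1, 0, 0]
systematic = conv_encode(data, taps=(0b001, 0b111))     # X is the input bit itself
nonsystematic = conv_encode(data, taps=(0b101, 0b111))  # X and Y are both parities

print([x for x, _ in systematic])      # same as data
print([x for x, _ in nonsystematic])   # differs from data
```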

Andrew Viterbi introduced the Viterbi Algorithm in 1967 [5], which is now the most widely used technique for the decoding of convolutional codes. This error–correction algorithm was later proven to provide a ‘maximum likelihood’ decoding of convolutional code [5].

The 8-bit register convolutional encoder and Viterbi Algorithm are developed in Visual C++® to perform the encoding and decoding of a rate-1/2 convolutional code. The Viterbi decoder algorithm uses the trellis diagram and path metrics to perform the error correction. The design has constraint length L = 8 bits and rate R = b/V = 1/2, where b is 1 information bit and V is 2 output bits.

The convolutional encoder is used in the transmitter while the Viterbi decoder is found in the receiver, as shown in Figure 1. Both devices must be synchronized. At the transmitter, a stream of binary input data to the convolutional encoder produces a stream of convolutional code as the binary output of the encoder. An interleaver may be applied before the input to the digital modulator. The digital modulator converts the digital signal to an analogue signal for transmission via air. The air is naturally a noisy wireless channel that may cause errors in the transmitted signals.

At the receiver, the analogue signal is converted back into a digital signal. If an interleaver is used at the transmitter, the digital signal must pass through the matching de-interleaver before input to the Viterbi decoder. After decoding the received signal, the output of the decoder should be the same as the input signal to the encoder. In this case, the error rate is 0%.

[Figure 1 blocks. Transmitter: Input Data, Convolution Encoding, Interleaver (either with or without), Digital Modulator. Noisy Channel via air. Receiver: Digital Demodulator, De-Interleaver (either with or without), Viterbi Decoding, Output Data.]

Figure 1. Block diagram of the data transmission testbed with convolutional encoder and Viterbi decoder.

Figure 2. Convolutional encoding of a non systematic (2, 1, 8) convolutional encoder at the transmitter.

2. Convolutional Decoder

A Viterbi decoder is a device that decodes V binary inputs to produce b binary outputs at each bit time t, where V > b and both are integers. The Viterbi decoder is also known as the Viterbi Algorithm. It is presently the most widely used technique for decoding convolutional codes [5].

The decoding can be clearly illustrated using the traceback technique with a trellis diagram, as shown in the example in Figure 3. The trellis diagram of a simple K = 3-bit Viterbi decoder consists of $2^{K-1} = 2^2 = 4$ states and $2^K = 2^3 = 8$ paths. The trellis diagram of an 8-bit Viterbi decoder consists of $2^7 = 128$ states and $2^8 = 256$ paths, which is a massive diagram. The accumulated difference between the received bits (Rx signal, XY) and the X and Y bits of a path is known as the path metric. The path with the smallest path metric leads to the 'maximum likelihood' decoding. The X and Y bits of the selected paths (undotted lines) match the bits actually transmitted by the encoder without errors (Tx output, XY). The input of the selected paths (undotted lines) matches the input bits of the encoder (the actual correct data, illustrated in the first line of Figure 3). Thus, the decoder is able to perform error correction.

[Figure 3 legend: trellis diagram of the K = 3 example over bit times t, with states (a) 00, (b) 01, (c) 10 and (d) 11, branches labelled XY/INPUT, and the surviving path marked with the smallest path metric.]

path 0, transition from state a to a
path 1, transition from state c to a
path 2, transition from state a to b
path 3, transition from state c to b
path 4, transition from state b to c
path 5, transition from state d to c
path 6, transition from state b to d
path 7, transition from state d to d

Note:
1. The dotted lines indicate the possible transitions of a convolutional encoder.
2. The undotted lines indicate the transitions of the convolutional encoder according to the input bits.
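The trellis search described in this section can be sketched in a few lines. The following is a minimal hard-decision Viterbi decoder for a rate-1/2 code, shown for the small K = 3 case with four states. The tap masks (101 for X, 111 for Y) and the received sequence are assumptions made for illustration, and the function names are hypothetical; this is not the paper's Visual C++ implementation or its (2, 1, 8) configuration.

```python
def parity(x):
    return bin(x).count("1") % 2


def step(state, bit, taps, k):
    """One trellis branch: returns (next_state, (X, Y)) for a given input bit."""
    reg = (state << 1) | bit                 # k-bit register, newest bit in the LSB
    out = tuple(parity(reg & t) for t in taps)
    return reg & ((1 << (k - 1)) - 1), out   # next state keeps the newest k-1 bits


def encode(bits, taps, k):
    state, out = 0, []
    for b in bits:
        state, xy = step(state, b, taps, k)
        out.append(xy)
    return out


def viterbi_decode(received, taps, k):
    """Return (decoded bits, path metric) of the smallest-Hamming-distance path."""
    n_states = 1 << (k - 1)
    inf = float("inf")
    metric = [0] + [inf] * (n_states - 1)    # the encoder starts in state 0
    paths = [[] for _ in range(n_states)]
    for rx in received:
        new_metric = [inf] * n_states
        new_paths = [None] * n_states
        for s in range(n_states):
            if metric[s] == inf:
                continue                     # state not reachable yet
            for bit in (0, 1):
                nxt, out = step(s, bit, taps, k)
                m = metric[s] + sum(a != b for a, b in zip(out, rx))
                if m < new_metric[nxt]:      # keep only the survivor per state
                    new_metric[nxt] = m
                    new_paths[nxt] = paths[s] + [bit]
        metric, paths = new_metric, new_paths
    best = min(range(n_states), key=metric.__getitem__)
    return paths[best], metric[best]


TAPS, K = (0b101, 0b111), 3                  # assumed example generators
data = [1, 0, 1, 0, 0, 0]
tx = encode(data, TAPS, K)
rx = list(tx)
rx[2] = (1, 0)                               # one bit error in the third pair
rx[4] = (1, 0)                               # one bit error in the fifth pair
decoded, path_metric = viterbi_decode(rx, TAPS, K)
print(decoded, path_metric)
```

With these assumed taps the two channel errors are corrected and the surviving path has a path metric of 2. The same functions accept k=8 with 8-bit tap masks, in which case the search runs over the 128 states mentioned above.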

The convolutional encoder is defined as a device which encodes b binary input to produce V binary output, where b < V.

Figure 9. Text file after decoding with the right pattern configuration at the destination computer.

If an unmatched code configuration is used to decode the encoded data, the original data cannot be obtained correctly. Thus, if an unauthorized user tries to decode the encoded data with the wrong configuration, the result will be meaningless, as illustrated in the example in Figure 10.

Figure 10. Unreadable text caused by decoding with wrong configuration code at the destination computer.

Security Performance: The performance of the security feature is analysed by studying the cracking of the codes. The time taken and the number of trials needed to crack the code are used to measure the security performance. Constraint lengths K of 3 to 10 are used for the analysis with several possible code configuration patterns. This study assumes that the intruder knows the following:
i. It is a convolutional code.
ii. The rate R is b/V = 1/2.
iii. The interleaver design is 10x10.
iv. The number of redundant bits used as the code configuration pattern is equivalent to the constraint length, K.
v. The typical duration to decode the encoded data of an average traffic data file of 10kB is about 0.1 second, and a new traffic data file is generated within the same duration.

Five cases for security programming with constraint length K = 3 to 10 bits:
1. Non-dynamic convolutional code: just find the pattern (0 to 7 pattern) from 0000000 00 to 11111111 11. Only one pattern needs to be cracked.
2. Sequential dynamic convolutional code with pattern changing from 0 to 7 (0>1>2>3>4>5>6>7) for every file. All 8 patterns need to be cracked.
3. Sequential dynamic convolutional code with pattern changing from 0 to 3 (0>1>2>3) for every file. All 4 patterns need to be cracked.
4. Random dynamic code with pattern randomly changing among patterns 0, 1, 2, 3, 4, 5, 6, and 7 for every file. All 8 patterns need to be cracked.
5. Random dynamic convolutional code with pattern randomly changing among patterns 0, 1, 2, and 3 for every file. All 4 patterns need to be cracked.

[Figure 11 plot: "Average Duration for Cracking Convolutional Codes versus Constraint Length". Vertical axis: Time (s), 0.0E+00 to 2.5E+06. Horizontal axis: Constraint Length, K, 3 to 10. Series: Non-Dynamic Codes; Sequential Dynamic Codes with 8 patterns; Sequential Dynamic Codes with 4 patterns; Random Dynamic Codes with 8 patterns; Random Dynamic Codes with 4 patterns.]

Figure 11: Average duration for cracking convolutional codes with 5 different cases versus constraint length, K.

Referring to Figure 11, the time taken to crack the non-dynamic convolutional code is the least compared to the other four cases of dynamic convolutional codes. The non-dynamic convolutional code is the benchmark for this comparison as it is the existing standard convolutional code. This shows that the dynamic convolutional code has higher security than the non-dynamic convolutional code. The time taken to crack the sequential dynamic convolutional code is lower than for the random dynamic convolutional codes because the next pattern is predictable and is easily found by following the sequence. Dynamic convolutional codes with 8 patterns need a longer time to be cracked than codes with 4 patterns because the probability of finding a correct pattern among 8 patterns is lower than among 4 patterns. Thus, the more patterns used in the design, the better the security of the dynamic convolutional code.

For every increment of the constraint length, the duration required to crack the code increases drastically and non-linearly. This is because the more redundant bits are used as the security code configuration pattern, the larger the number of candidate patterns an intruder must search. To further increase the security of the convolutional codes, the number of security redundancy bits used to represent the code configuration pattern can be larger than the constraint length of the convolutional code. Increasing the number of patterns to the maximum of $2^{2r}$, where r is the number of security redundancy bits, also improves the security of the dynamic codes. A security redundancy of 10 bits can have a maximum of $2^{20} = 1048576$ patterns. Thus, random dynamic convolutional codes with the highest number of patterns and the highest number of security redundancy bits have the best security level. The security level can be easily designed according to an application requirement, considering the amount of affordable redundancy.
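The pattern counts quoted in the security discussion can be checked with a short calculation. The sketch below computes the maximum number of patterns for a given number of security redundancy bits and a rough worst-case exhaustive search time. The search-time model (one trial decode of about 0.1 second per candidate pattern) is an assumption made here for illustration; it is not the measurement procedure behind Figure 11, and the function names are hypothetical.

```python
def max_patterns(redundancy_bits: int) -> int:
    """Maximum number of configuration patterns, 2**(2*r), for r redundancy bits."""
    return 2 ** (2 * redundancy_bits)


def worst_case_search_seconds(patterns: int, seconds_per_trial: float = 0.1) -> float:
    """Rough upper bound: every candidate pattern is tried once (assumed model)."""
    return patterns * seconds_per_trial


for r in (3, 8, 10):
    n = max_patterns(r)
    print(f"r = {r:2d} bits: {n:8d} patterns, "
          f"about {worst_case_search_seconds(n):.1f} s to try them all")
```

For r = 10 this reproduces the 1048576 patterns stated in the text. The growth is exponential in r, which is consistent with the steep rise in cracking time as more redundancy bits are used.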



5. Conclusion
A new dynamic (2, 1, 8) convolutional encoder and Viterbi Algorithm were successfully developed using Visual C++®. Both programs (with and without interleaver) demonstrate the convolutional encoding and Viterbi decoding of multiple-configuration convolutional codes. This error-correction mechanism effectively enhances the security level and improves the bit error rate (BER) of the data transmitted via the Wireless LAN of the Intelligent Urban Traffic Management System.

The objectives 'to improve the performance of the Wireless LAN communication system by BER (bit error rate) reduction' and 'to establish a secure data transmission link for the Wireless LAN communication system with enhanced security feature' are shown to be fulfilled.

6. Future Work
The Visual C++® software for this dynamic non-systematic (2, 1, 8) convolutional encoder and Viterbi decoder could be expanded from 8 configurations to 65025 configurations as future work. The error-correction performance of each configuration could be further evaluated. However, this may be fairly complicated and challenging to perform.

One proposal is to employ Visual C++® to develop educational design software for convolutional encoding and Viterbi decoding. This software would enable selection of various combinations of design parameters such as the constraint length (K) and rate (R), and of all possible non-systematic and systematic configurations. It would thus help users to further evaluate the performance of this error-correction mechanism based on different designs. The dynamic (2, 1, 8) convolutional encoder and Viterbi decoder can also be implemented in hardware. The C++ source code can be translated to machine code and implemented in microcontrollers, microprocessors, or specially designed IC (Integrated Circuit) chips. IC chips can further be developed using VLSI (very-large-scale integration) technology. This will create a new generation of dynamic encoders and decoders that improve BER and also have stronger security features.

Programming a microprocessor to function as the dynamic (2, 1, 8) convolutional encoder and Viterbi decoder is a potential research area for hardware development. The Motorola 68000, which is a 32-bit CISC (Complex Instruction Set Computer) microprocessor, is one of the suitable components that can be used to implement this dynamic encoder and decoder.

7. Acknowledgements
This work was supported in part by the MOSTI (Ministry of Science Technology and Innovation Malaysia) under PR IRPA (Prioritized Research Intensification of Research in Priority Areas) Grant 04-99-03-0001PR0059/09-09.

8. References
[1] R. A. Rahmat, K. Jumari, A. Hassan, H. Basri. SKLIP - Intelligent Urban Traffic Control. Proceeding of 10th World Congress on Intelligent Transport System, Ertico, Madrid, Spain, 2003.
[2] M. J. Ho, M. S. Rawles, M. Vrikorte, L. Fei. RF Challenges for 2.4 and 5 GHz WLAN Deployment and Design. Wireless Communications and Networking Conference, March 2002, vol. 2, pp. 783-788.
[3] P. Sweeney. Error Control Coding: from Theory to Practice. John Wiley & Sons Ltd., 2002, pp. 1-65.
[4] R. J. Tocci. Digital Systems, Principles and Applications. Prentice-Hall International, fifth edition, 1991.
[5] A. J. Viterbi. Error Bounds for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm. IEEE Transactions on Information Theory, IT-13, pp. 260-269, 1967.
[6] A. M. Michelson, A. H. Levesque. Error-Control Techniques for Digital Communications. John Wiley and Sons, pp. 270-336, 1995.

Lim Juleen ([email protected]) is currently a Radio Frequency (RF) Product Engineer at Freescale Semiconductor (a manufacturing company). She received the B. Eng (Hons) in Communication and Electronic Engineering from the University of Northumbria, Newcastle, United Kingdom, in 1999. She is pursuing a Master of Electrical Engineering by Research at Universiti Tenaga Nasional, Malaysia, with research entitled "Performance of Data Transmission Over Wireless Channel in an Intelligent Urban Traffic Management System". Her research interests include error-correction coding and wireless communication. She has been a Corporate Member of The Institution of Engineers, Malaysia (MIEM) since July 2005 and a Professional Engineer in Electronics with the Board of Engineers Malaysia (BEM) since December 2005.

Tiong Sieh Kiong ([email protected]) is currently a lecturer with the Department of Electrical, Electronics Engineering, Universiti Tenaga Nasional in Malaysia. He received the B.Sc. degree in Electrical and Electronics Engineering from Universiti Kebangsaan Malaysia in 1997, the M.Sc. degree in Communication Engineering from Universiti Kebangsaan Malaysia in 1987, and the Ph.D. from Universiti Kebangsaan Malaysia in 2006. His research interests include mobile communication, wireless networking and artificial intelligence. He is a committee member of the IEEE Communication Society, Malaysia, holding the post of treasurer.


Vulnerability Analysis of Extensible Authentication Protocol (EAP) DoS Attack over Wireless Networks
Mina Malekzadeh1, Abdul Azim Abdul Ghani2, Jalil Desa3, and Shamala Subramaniam4
Department of Communication Technology and Networks, Faculty of Computer Science and Information Technology, University of Putra Malaysia
[email protected], [email protected], [email protected], [email protected]

an Extensible Authentication Protocol (EAP) for authentication. EAP protocol defines a number of methods for authentication such as MD5, LEAP, TLS, TTLS, PEAP which are commonly used [10, 11]. The IEEE 802.1x also defines EAP over LANs (EAPOL) to encapsulate EAP messages between the supplicant and the authenticator over a wired or wireless LAN. Basically 802.1X authentication process is exchanging EAP frames. Some of these EAP frames are protected by using one of the mentioned authentication methods but some of these EAP frames are transmitted unencrypted which result in serious vulnerabilities to even a highly secured wireless network. These unprotected EAP frames typically are exchanged between wireless client and access point. An adversary may try to modify or spoof EAP packets to start different types of DoS attacks over WLANs which consume useful resources and disrupt services. There are a few papers that consider these types of attacks. In [2] they propose a central manager to dynamically manage access points and clients for mitigating the attacks. The proposed central manager is a back-end server that takes the place of the authentication server defined by 802.1x. They discusses the proposed model not only takes the responsibilities of the authentication server, but also tracks clients in the authentication process to avoid the DoS attacks. In [3] they consider either to eliminate some of these vulnerable EAP frames or to replace with other types of management frames such as deauthentication [1]. In [48] they all discuss possibility of the EAP attacks over the wireless networks. In [12] they consider wireless DoS attacks in different layers and discuss current security mechanisms in WLANs. They proposed an intrusion detection system to enhance security of the 802.1x. Through simulation they proved that their proposed model can decrease probability of DoS attacks over WLANs.

Abstract IEEE 802.11 supports 802.1x to provide strong authentication mechanism for Wireless networks. 802.1x utilizes Extensible Authentication Protocol (EAP) as a framework for authentication, allowing for a number of authentication methods to be used. Unfortunately, 802.1x includes some unprotected EAP packets during authentication process which can be easily exploited by an attacker to start different types of Denial of Service (DoS) attacks over wireless networks. In this paper we developed an experimental framework to demonstrate and quantify possible flooding attacks using unprotected EAP frames against wireless communications. First we setup a testbed wireless network in order to demonstrate how EAP flooding attacks take very little effort to bring a protected wireless network to a complete halt. Then via measurements and analyses we evaluate the impact and consequence of these attacks against performance of wireless network in terms of network drop rate and throughput. Results show that such attacks can easily launch, and cause serious service disruption to compromise network availability. Keywords: EAP flooding attack, wireless network, DoS attacks, WLAN security, WLAN throughput, WLAN drop rate.

1. Introduction and Related Works IEEE 802.1x is used for wired and wireless LAN authentication. This standard for wireless LANs involves three main components to complete an authentication process: the supplicant (client pc), the authenticator (access point), and the authentication server (usually Remote Authentication Dial-In User Service Server or RADIUS). Sometimes it is possible to combine authenticator and RADIUS in one device. 802.1X specification requires using

7


type supported by the authentication server. A RADIUSAccess-Challenge frame is transmitted over the RADIUS protocol back to the authenticator. 6) The authenticator forwards the challenge back to the client with the authentication type requested by the authentication server using EAPOL-RequestAuthentication frame. 7) The client examines the challenge and determines if it can support the requested EAP authentication protocol. If it cannot support the authentication type requested by the authentication server, the client will issue a NAK request and try to negotiate an alternative authentication method. If the client can support the authentication type requested by the authentication server, it responds with its credential information using EAPOL-Response-Authentication frame. 8) The authenticator relays the client’s credentials to the authentication server using the RADIUS protocol as Radius-Access-Request frame. 9) If the client’s credentials are valid, the authentication server authenticates and authorizes the client. Otherwise, the client is rejected and the appropriate RADIUS AccessAccept or Access-Reject frame is sent back to the authenticator using the RADIUS protocol. 10) The authenticator receives the RADIUS AccessAccept or Access-Reject frame and configures the network access accordingly, and then it sends an EAPOL-Success or EAPOL-failure frame to the client respectively. After this process, according to the agreed authentication protocol the other required frames are exchanged. 11) When the whole of authentication process is done, the client obtains required permission to access to the network and finally they start to transmit data over the wireless network. Whenever the client desires to terminate its connection, it sends an EAPOL-Logoff frame to the access point [9]. From the above process, EAPOL carry EAP protocol packets between supplicants and authenticators over LANs. 
An EAPOL frame with type of EAP-Packet carries an EAP packet in its packet body field. Figure 2 shows the EAPOL and EAP frame formats.

To the best of our knowledge, no work has been done on investigating the effect of EAP frames attacks on performance of the 802.11 wireless networks in term of network drop rate and throughput, which has been presented in this paper. In this paper we discuss vulnerabilities of EAP frames which can be used by attackers. We setup a wireless testbed environment to experiment possible DoS attacks against wireless network using unprotected EAP frames. We show the influence of these attacks on throughput and drop rate of the wireless networks to present the damage that these attacks can impose to the WLANs. The remainder of the paper is organized as follows. Section (2) provides a general introduction to 802.1x along EAP frames format. Section (3) describes EAP practical attacks in wireless networks. Section (4) depicts the testbed environment to implement the experiments and the performance metrics to evaluate impact of the attacks on the performance of the wireless network. Section (5) presents the results of the experiments and discusses them. Conclusions that can be drawn from this paper are stated in section 6.

2. Overview of 802.1x and EAP Frames

The authentication process of 802.1x involves exchanging some EAP packets. A general EAP frame flow is depicted in Figure 1.

Figure 1. A Typical 802.1x Frames Flow

1) The client starts the authentication process by sending an EAPOL-Start frame to the access point.
2) The access point responds to the client by asking for its identity using an EAPOL-Request-Identity frame.
3) The client responds to the authenticator with its identity information using an EAPOL-Response-Identity frame.
4) The authenticator forwards the client’s identity information to the authentication server over the RADIUS protocol using a RADIUS-Access-Request frame.
5) The authentication server responds with a challenge to the authenticator and will specify the EAP authentication

Figure 2. EAP and EAPOL Frame Format

The EAP packets that are exchanged between access point and RADIUS are protected by the underlying



network component for the experiments is listed as follows:
1) Hardware details:
• One 802.11b/g wireless card, model Intel Pro 2200 with an Intel chipset, as our target client.
• One 802.11b/g wireless card, model Netgear WG511T with an Atheros chipset, as the attacker client.
• One 802.11b/g wireless card, model Netgear WG311T with an Atheros chipset, to monitor the target wireless network.
• Linksys WRT54G as the target access point.
2) Software details:
• Ubuntu Linux operating system with kernel 2.6.22 (kernel version 2, major revision 6, minor revision 22). Both the attacker system and the monitoring system use this configuration.
• Wireshark network analyzer version 1.0.3 to capture in monitor mode and analyze the traffic transmitted over the target wireless network. It can show both legal and illegal traffic. We use Wireshark to collect statistics for our research metrics in order to evaluate the results of our experiments.
• Libpcap and the Madwifi-ng driver, revision svn r3367, for Atheros-based cards.
• Khexedit to build our forged EAPOL frames and to make the required modifications to them.
• File2air to inject our forged frames into the target.
• Windows XP SP2 as the operating system for the target client.

authentication method. Of the EAP packets transmitted between client and access point, some are protected by the underlying authentication protocol, but others are transmitted without any security method and consequently make the wireless network vulnerable to DoS attacks. These unprotected EAP frames are EAPOL-Start, EAPOL-Logoff, EAPOL-Success, and EAPOL-Failure. In this paper we therefore investigate, via testbed experiments, the impact of DoS attacks that use these insecure EAP packets on the throughput and drop rate of the wireless network.

3. EAP Denial of Service (DoS) Attacks Description

In this section we discuss how an attacker can launch EAP flooding attacks over the wireless network using insecure EAP frames.
1) Wireless flooding attack by EAPOL-Start frames: When a wireless client sends an EAPOL-Start frame to the access point, the access point responds with an EAPOL-Request-Identity frame and also allocates some internal resources. Attackers exploit this behavior by sending a large number of EAPOL-Start frames to the target access point, forcing it to allocate more and more resources and thereby bringing it down.
2) Wireless flooding attack by EAPOL-Success frames: An adversary can easily forge EAPOL-Success frames to maliciously overflow the wireless channel. The attacker overwhelms the wireless network by sending forged EAPOL-Success frames to exhaust resources until the access point can no longer process legitimate requests.
3) Wireless flooding attack by EAPOL-Logoff frames: When a station wishes to leave a WLAN, it can send an EAPOL-Logoff frame to the access point to end its authenticated session. An attacker can therefore spoof the MAC address of an authenticated station and send an EAPOL-Logoff frame to the target access point. This causes the access point to believe that the legitimate station has ended its session. The legitimate station will not be aware that its session has ended until it attempts to transmit data. At this point it will attempt to reestablish the association, but the session will be short-lived because the attacker sends the EAPOL-Logoff frame continuously.
4) Wireless flooding attack by EAPOL-Failure frames: An attacker uses this frame to disconnect the client from the network. The attacker continually sends forged EAPOL-Failure frames to effectively shut down the wireless network.
Since the access point has limited resources to allocate, once the attacker exhausts them the access point can no longer provide the required services to its legal clients. This disrupts communication between legitimate clients.

We perform our experiments in two different situations: targeting the access point as the victim and targeting the client as the victim. To keep our research comprehensive, we use the four unprotected EAP frames to carry out eight different types of attacks. The eight attacks and their descriptions are listed in Table 1.

| Attack | Description |
| --- | --- |
| EAP-ST-F-AP | Flooding Attack over Target Access Point using EAPOL-Start |
| EAP-LO-F-AP | Flooding Attack over Target Access Point using EAPOL-Logoff |
| EAP-SU-F-AP | Flooding Attack over Target Access Point using EAPOL-Success |
| EAP-FA-F-AP | Flooding Attack over Target Access Point using EAPOL-Failure |
| EAP-ST-F-C | Flooding Attack over Target Client using EAPOL-Start |
| EAP-LO-F-C | Flooding Attack over Target Client using EAPOL-Logoff |
| EAP-SU-F-C | Flooding Attack over Target Client using EAPOL-Success |
| EAP-FA-F-C | Flooding Attack over Target Client using EAPOL-Failure |

Table 1. EAP DoS Attacks for Experiment

4. Testbed Architecture for Experiments

As the first step to get our measurements, we set up an appropriate testbed and identify the required tools. Therefore to fulfill our purpose, the configuration of each



An overview of our testbed is shown in Figure 3.

Figure 4. WLAN Traffic during EAP-ST-F-AP Attack

To determine the effectiveness of the attack, we ran normal HTTP and FTP traffic on the network for about 29 seconds without any attack. The throughput measured during this time was about 184539.050 B/s. We then flooded the target access point with our forged EAPOL-Start packets for 25 seconds. Soon after the attack started, all clients were disconnected from the network. The statistics collected during the attack show a throughput of only about 5.010 B/s. We also repeated the experiment to obtain the drop rate of the network during the attack. We sent 45 ICMP request packets and received 33 ICMP response packets, so the network drop rate was about 26% during the attack. We attribute this to the access point allocating significant resources on receipt of our forged frames; after excessive flooding it denies service and consequently knocks all wireless clients off the network.

Figure 3. Environment of Testbed WLAN

4.1 Performance Metrics

We present an empirical analysis of a set of metrics over the data collected through our experiments. To understand the effect of the above attacks on the performance of the wireless network, we measure the network drop rate and throughput before and during the attacks. In all our experiments we define these metrics as follows:
• The total throughput, T, is measured as the total transmitted bytes, b, over the total transmission time, t, in seconds: T = b/t.
• The average drop rate, ADR, is measured using the ICMP Ping program to determine the number of packets lost during the attack time: ADR = (number of frames transmitted to the destination - number of frames received by the destination) / number of frames transmitted to the destination * 100.
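The two metrics can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and the drop rate is read here as the lost fraction of ICMP requests, (sent - received) / sent, which is the reading that reproduces the percentages reported in Section 5.

```python
def throughput(total_bytes: float, seconds: float) -> float:
    """Total throughput T = b / t in bytes per second."""
    return total_bytes / seconds


def average_drop_rate(sent: int, received: int) -> float:
    """Percentage of ICMP requests that received no response."""
    return (sent - received) / sent * 100.0


# Figures quoted for the EAPOL-Start flood on the access point:
# 45 ICMP requests sent, 33 responses received.
print(f"{average_drop_rate(45, 33):.1f}%")  # 26.7%
```

The text reports this case as "about 26%", which suggests the percentages were truncated rather than rounded.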

5.2 Objective 2: DoS Attack Using EAP-Logoff Frame over the Target Access Point

For this experiment we built our EAP-Logoff packet using Khexedit. We flooded the target access point for about 26 seconds at an attack rate of about 25 packets per second. The result of this attack is shown in Figure 5.

5. Experimental Results and Discussions

We attempt to provide similar conditions for all experiments so that the results can be compared fairly. Each attack therefore runs for 26 seconds, which is the attack duration. The attack rate in all tests is 25 packets per second, which is enough to flood a wireless channel when the duration field is set to its maximum value of 32767 µs. In all experiments, the experiment time is divided into three parts: before, during, and after the attack. We first test the wireless network under normal conditions with no attack for some seconds, then run the desired attack for 26 seconds to observe the wireless network performance in terms of our metrics, and finally let the wireless network return to normal transmission for some seconds.
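The claim that 25 packets per second suffices can be checked with a short calculation. The sketch below assumes that every forged frame carries the maximum duration value and that all stations honor it by deferring for that long; both are simplifying assumptions, and the function name is hypothetical.

```python
ATTACK_RATE_FPS = 25      # forged frames per second, as stated in the text
MAX_DURATION_US = 32767   # maximum duration field value in microseconds


def reserved_airtime_fraction(rate_fps: int, duration_us: int) -> float:
    """Fraction of each second for which the medium is reserved."""
    return rate_fps * duration_us / 1_000_000


frac = reserved_airtime_fraction(ATTACK_RATE_FPS, MAX_DURATION_US)
print(f"{frac:.3f}")  # 0.819
```

Under these assumptions roughly 82% of each second is reserved by the forged frames, which leaves little airtime for legitimate traffic.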

Figure 5. WLAN Traffic during EAP-LO-F-AP Attack

As the graph shows, the attacker completely succeeded in overwhelming the network with forged packets. By continually sending EAP-Logoff frames to the target access point, the attacker effectively blocked all legal connections. Before the attack, the average throughput of the network was about 147227.700 B/s during 31 seconds of normal transmission. When the attack started, the throughput decreased considerably to 12.231 B/s. To test ping blockage during the attack, we sent 43 ICMP request packets and received 33 ICMP response packets, which shows that the network drop rate was about 23%.

5.1 Objective 1: DoS Attack Using EAP-Start Frame over the Target Access Point

We inject our forged EAP-Start frames into the target access point in the middle of its normal transmission, at an attack rate of 25 forged packets per second. The result of this experiment over the 26-second attack duration is shown in Figure 4.



immediately was affected and throughput decreased to about 5.201 B/s. We repeated the test to measure network performance during the attack in terms of drop rate. We sent 46 ICMP request packets and received 34 packets, which indicates a 26% drop rate for the wireless network. These results indicate that the attack was effective against the access point and forced all legal clients to disconnect from the network. By sending a large number of forged packets, the attacker saturates the link and exhausts the buffer of the access point so that it denies service to its associated clients.

The results show that the attack brings the network to its knees and causes a loss of productivity. The target access point was not resistant to this attack and was blocked for the duration of the attack. The unwanted forged EAP-Logoff frames that flooded the buffer of the target access point caused this problem.

5.3 Objective 3: Test Flooding Attack Using EAP-Success Frame over the Target Access Point

In this experiment a flood of 25 forged EAP-Success packets per second was injected into the target access point for 26 seconds. The result of this attack is shown in Figure 6.

Figure 7. WLAN Traffic during EAP-FA-F-AP Attack

Figure 6. WLAN Traffic during EAP-SU-F-AP Attack

5.5 Objective 5: Test Flooding Attack Using EAP-Start Frame over the Target Client

We created congestion by continually transmitting forged EAP-Start packets to the target client. The forged frames were transmitted at 25 packets per second during the 26-second attack time. The result of this attack is shown in Figure 8.

Normal traffic was exchanged between the clients and the access point for about 28 seconds before the attack. The throughput measured during this time was about 207015.177 B/s. We then injected our forged EAP-Success packets into the target access point. The network throughput measured during the attack period decreased to about 1977.308 B/s. We repeated the experiment and sent 53 ICMP request packets to the network during the attack. We received 33 ICMP response packets, which shows that the attack caused a 37% drop rate on the wireless network. Because the access point was fully occupied with handling the forged frames, it could not respond to other legal transmissions, including ICMP request packets, and dropped them, which increased the network drop rate. The buffer of the access point was consumed by useless packets, which caused the access point to disconnect all legal clients; as a result, all legal communications failed.

Figure 8. WLAN Traffic during EAP-ST-F-C Attack

We monitored the network to measure the throughput of the target client during 27 seconds of normal communication; it was about 207151.743 B/s. When the attacker started to transmit forged EAP-Start frames, the measured average throughput was 0 B/s, which shows that the network was completely shut down for the target client and it was unable to communicate during the attack. To test the impact of the attack on the drop rate of the target client, we then sent 44 ICMP request packets. We received 33 ICMP response packets, which shows that the drop rate of the client was about 25%. This is because all resources of the target client are consumed by the forged EAP-Start packets transmitted directly to the client. The attack causes the client to drop its network connection, and it no longer has access to the access

5.4 Objective 4: Test Flooding Attack Using EAP-Failure Frame over the Target Access Point

We ran this experiment on the target access point by transmitting 25 forged EAP-Failure packets per second for 26 seconds. The result of this attack is shown in Figure 7. In this experiment, normal transmission was allowed for about 27 seconds. During this normal communication we collected the statistics required to compute the average network throughput, which was about 210129.170 B/s. After that we flooded the network bandwidth by injecting our forged EAP-Failure packets. We observed that the target access point



under this situation. From the collected statistics, the average throughput was about 206657.871 B/s. While the attacker was transmitting a continuous stream of forged EAP-Success frames, we collected data to compute the throughput of the target client during the attack. The average throughput was 0 B/s, which shows that the attacker forced the target client to disconnect from the network. To investigate the packet drop rate of the target client under attack, we used the ping command. Out of 44 ICMP request packets we received 33 ICMP response packets, so the packet drop rate of the target under attack was about 25%. We conclude from this result that the wireless network is vulnerable to this attack. The attacker kept the wireless client completely busy with a large number of unwanted frames and consequently made it unable to maintain normal communication.

point until the end of the attack period, when the attacker stops and the client's resources are freed for communication.

5.6 Objective 6: Test Flooding Attack Using EAP-Logoff Frame over the Target Client

For this experiment the attacker transmits 25 forged EAP-Logoff packets per second to the target client and attempts to block any communication by the client during the 26-second attack duration. The result of this attack is shown in Figure 9.

5.8 Objective 8: Test Flooding Attack Using EAP-Failure Frame over the Target Client

In this test we target the wireless client by flooding it with 25 forged EAP-Failure packets per second for 26 seconds. The result of this attack is shown in Figure 11.

Figure 9. WLAN Traffic during EAP-LO-F-C Attack

We collected the information needed to measure the throughput of the target client during 29 seconds without any attack. The average throughput of the client was about 203405.194 B/s during normal transmission. We continued collecting throughput information for the target client once the attacker started to transmit a heavy flow of forged EAP-Logoff packets. Our measurements show that the client throughput during the attack dropped to an average of about 1076.538 B/s. We performed the test once again to calculate the number of packets lost during the attack. We sent 43 ICMP request packets and received 33 ICMP response packets, which gives a 23% drop rate for the target client. The attack overflows the buffer of the client's network interface card; it thereby drains the client's resources, substantially reduces the available channel capacity, and increases the number of packets dropped by the target client.

Figure 11. WLAN Traffic during EAP-FA-F-C Attack

5.7 Objective 7: Test Flooding Attack Using EAP-Success Frame over the Target Client

In this test the attacker adds significant overhead on the target client by injecting 25 forged EAP-Success packets per second for 26 seconds. The result of this attack is shown in Figure 10.

Table 2. Summary of the Attacks Results
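Because the measurements are spread across Sections 5.1 to 5.8, the following sketch collects the values quoted in the text and derives the throughput loss and drop rate for each attack. It is a reader's aid, not a reproduction of the authors' Table 2: the dictionary layout and the `summarize` helper are hypothetical, and the drop rate is computed as (sent - received) / sent.

```python
# attack: (throughput before [B/s], throughput during [B/s], ICMP sent, ICMP received)
# All numbers are transcribed from the text of Section 5.
RESULTS = {
    "EAP-ST-F-AP": (184539.050, 5.010, 45, 33),
    "EAP-LO-F-AP": (147227.700, 12.231, 43, 33),
    "EAP-SU-F-AP": (207015.177, 1977.308, 53, 33),
    "EAP-FA-F-AP": (210129.170, 5.201, 46, 34),
    "EAP-ST-F-C": (207151.743, 0.0, 44, 33),
    "EAP-LO-F-C": (203405.194, 1076.538, 43, 33),
    "EAP-SU-F-C": (206657.871, 0.0, 44, 33),
    "EAP-FA-F-C": (185959.101, 945.769, 46, 33),
}


def summarize(results):
    """Return (attack, before, during, throughput loss %, drop rate %) rows."""
    rows = []
    for attack, (before, during, sent, received) in results.items():
        loss_pct = (before - during) / before * 100
        drop_pct = (sent - received) / sent * 100
        rows.append((attack, before, during, loss_pct, drop_pct))
    return rows


for attack, before, during, loss_pct, drop_pct in summarize(RESULTS):
    print(f"{attack:<12} {before:>12.3f} {during:>10.3f} {loss_pct:>8.2f} {drop_pct:>6.1f}")
```

Computed this way, every attack removes more than 99% of the baseline throughput, while the ICMP drop rates stay between roughly 23% and 38%.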

Before the network came under attack, we calculated the average throughput of the wireless target client, which was about 185959.101 B/s during 27 seconds of normal transmission. We then did the same to obtain the throughput of the target client under attack, which was about 945.769 B/s. We ran the experiment again to test

Figure 10. WLAN Traffic during EAP-SU-F-C Attack

We performed the test to keep track of normal connection exchanges for about 28 seconds to measure the throughput



the number of packets lost during the attack. We sent 46 ICMP request packets and received 33 packets; therefore the target client drop rate was about 28%. Because the attacker causes congestion on the target client by generating an excessive amount of EAP-Failure traffic, the target client is kept busy responding to these frames and consequently has insufficient resources to respond to legitimate traffic on the network. A summary of the comparison of all results is given in Table 2.

6. Conclusion

In this paper we presented a highly effective denial of service attack that can be mounted against IEEE 802.11 WLANs using unprotected EAP messages. We experimented with eight different types of attacks against both the target access point and the target wireless client to show the impact of these attacks on the performance of the wireless network in terms of throughput and packet drop rate. The results of the experiments show that all of these attacks are a serious threat to wireless networks. We observed that the attacks can degrade the wireless network throughput to close to zero and increase the packet drop rate considerably. We observed a severe reduction of throughput as the attack rate increases, which shows that the attacks are extremely effective on wireless networks. As the number of forged frames sent by the attacker increases, it imposes a heavy load and congestion on the network, which increases the drop rate during the attacks.

7. References

[1] Malekzadeh M., Azim A., Jalil D., and Shamala S. 2008. An Experimental Evaluation of DoS Attack and Its Impact on Throughput of IEEE 802.11 Wireless Networks. IJCSNS International Journal of Computer Science and Network Security, Vol.8 No.8.
[2] Ding P., Holliday J., and Celik A. 2004. Improving the Security of Wireless LANs by Managing 802.1x Disassociation. Proceedings of the IEEE Consumer Communications and Networking Conference, Las Vegas, NV, pages 53-58.
[3] He C. and Mitchell J. C. 2005. Security Analysis and Improvements for IEEE 802.11i. The 12th Annual Network and Distributed System Security Symposium (NDSS'05), pages 90-110.
[4] Adrangi F., Lortz V., Bari F., and Eronen P. 2006. Identity selection hints for Extensible Authentication Protocol (EAP).
[5] Wang Y. P., Liu Y. W., and Chen J. C. 2005. Design and Implementation of WIRE1x. In Proc. of Taiwan Area Network Conference (Taipei, Taiwan).
[6] Blunk L. and Vollbrecht J. 2002. The One Time Password (OTP) and Generic Token Card Authentication Protocols. Microsoft.
[7] Saleh M. A. 2005. Weaknesses of Authentication and Encryption Methods Used in IEEE 802.11b/g Wireless Networks. Alexandria University, Egypt.
[8] Palekar A., Simon D., Salowey J., Zhou H., Zorn G., and Josefsson S. 2004. Protected EAP Protocol (PEAP) Version 2.
[9] Kwan P. 2003. 802.1x authentication & extensible authentication protocol.
[10] Sotillo S. 2007. Extensible Authentication Protocol (EAP) Security Issues.
[11] Bansal D. and llar S. 2006. Authentication in Wireless Networks.
[12] Mofreh Salem M., Sarhan A., AbuBakr M. 2007. A DOS Attack Intrusion Detection and Inhibition Technique for Wireless Computer Networks. ICGST-CNIR, Volume 7, Issue I.



Biographies

Mina Malekzadeh received the M.S. in Software Engineering at University Putra Malaysia in 2007. She is now a Ph.D. student in security in computing at UPM.

Dr. Jalil Md. Desa is the head of Network Transmission and Security and an Associate Principal Researcher at Telekom Research and Development Malaysia. He received his Bachelor of Electronic and Electrical Engineering (Hons) from University of Portsmouth (UK) and his PhD from University of Strathclyde (UK). His major research fields are computer networks, Internet protocols (IPv4/IPv6), routing, security, network management systems, and router design.

Dr. Abdul Azim Abd Ghani received the B.Sc in Mathematics Computer Science from Indiana State University in 1984 and the M.Sc in Computer Science from University of Miami in 1985. He received the Ph.D. in Software Engineering from University of Strathclyde in 1993. He is an Associate Professor and the Dean of the Faculty of Computer Science and Information Technology, Universiti Putra Malaysia. His research interests are software engineering, software measurement, software quality, and security in computing.

Dr. Shamala K. Subramaniam received the B.S. degree in Computer Science from University Putra Malaysia (UPM) in 1996, the M.S. (UPM) in 1999, and the PhD. (UPM) in 2002. Her research interests are Computer Networks, Simulation and Modeling, Scheduling, and Real Time Systems.



A Professional Comparison of C4.5, MLP, SVM for Network Intrusion Detection based Feature Analysis

Alaa F. Sheta (1), Amneh Alamleh (2)
(1) Computers and Systems Department, Electronics Research Institute, Giza, Egypt
(2) Computer Science Department, Zarqa University, Zarqa, Jordan
[email protected], [email protected]

Abstract

The volume of targeted network attacks is increasing continuously over time. This causes great financial loss. Intrusion Detection Systems (IDSs) are one of the main solutions for computer and network security. We need IDSs to identify unauthorized access attempts that try to compromise the confidentiality, integrity or availability of a computer or computer network. In this paper, we attempt to provide new models for the intrusion detection (ID) problem using the Decision Tree (DT) based C4.5 algorithm, Multi-Layer Perceptron (MLP) and Support Vector Machine (SVM). A number of attacks were classified using the three methods. Training and testing data proposed by DARPA are used to develop and evaluate the proposed models. To enhance the performance of the proposed models and speed up the detection process, a set of features is selected using the Best First Search (BFS) and the Genetic Search (GS). A comparison between the models developed in each case is provided. The proposed models were capable of reducing the complexity while keeping acceptable detection accuracy.

Keywords: Network Security, Intrusion Detection, Classification, C4.5, Artificial Neural Networks, Support Vector Machine, KDD, NSL-KDD

1 Introduction

Computer security is defined as the protection of computing systems against threats to confidentiality, integrity, and availability [1]. Information confidentiality implies that information is revealed only to authorized people with pre-defined rights. Information integrity implies protecting information from being destroyed or corrupted under any condition. Information availability means that the system is capable of providing its services at any given time. IDSs have played a very important role in systems security for a long time. They help protect our computers and network systems by detecting any new attempt at system abuse or attack. IDSs can be classified in multiple dimensions based on detection method, architecture and post-detection action. While there are multiple types of intruder attacks, traditional IDSs require a huge amount of human effort to maintain and improve their performance [2]. Two types of ID systems are defined in the literature: misuse detection and anomaly detection. Detecting intrusions that match pre-defined patterns is termed misuse detection, while identifying deviations from normal network behavior is called anomaly detection [3]. The best IDS is expected to discover new types of attacks in minimum time and trigger the required action. It is almost impossible to reach one hundred percent IDS accuracy; research effort is focusing on raising IDS accuracy as much as possible. The classification of IDSs presented in [4] is shown in Figure 1. In this research, we are interested in network anomaly detection methods. In the present study, an off-line intrusion detection system is implemented using three algorithms: the Decision Tree based C4.5 algorithm, Multi-Layer Perceptron, and Support Vector Machine. Three intrusion detection models are implemented using the NSL-KDD data set. The proposed models have 41 features (i.e. inputs) and six classes (i.e. outputs). Due to the complexity of the models, we propose a number of order reduction techniques using the Best First Search and the Genetic Search. These techniques are used to reduce the number of features used in the learning process. This paper is organized as follows. In Section 2, we provide a literature review which covers various techniques used to solve the ID problem. Section 3 briefly explains the nature of the data set used in our experiments and discusses the KDD, the NSL-KDD and the modified data adopted in this study. The three proposed algorithms, DT, ANN and SVM, are described in Sections 4, 5 and 6. Feature selection methods based on BFS and GS are presented in Section 7. The methods of evaluating the developed models are presented in Section 8. Experimental results are



presented in Section 9. Finally, we present the conclusions of this work.

Figure 1: Classification of IDSs Systems [4].

2 Related Research

Soft-Computing (SC) techniques have different characteristics from conventional (i.e. hard) computing in that they can handle fuzziness, uncertainty, incomplete certainty, and approximation. Most soft computing techniques are inspired by the way the human mind works and by natural biological systems. These advantages make soft computing techniques useful in solving the intrusion detection problem. Furthermore, soft computing may be considered a foundation element for the growing field of conceptual intelligence. Some well-known branches of SC are Fuzzy Logic (FL), Artificial Neural Networks (ANNs), Evolutionary Computation (EC), Machine Learning (ML) and Probabilistic Reasoning (PR).

2.1 Applications of MLP

ANNs were used to solve the intrusion detection problem in [5]; the proposed model was able to identify three classes: Normal and two attack types. The developed ANN model achieved high accuracy. The authors suggested including more attack scenarios in the data set; they also suggested reducing the number of records in an attempt to minimize the complexity of the system. Another ANN model was proposed in [6]. The authors defined the output of the ANN to be either 1 or 0 depending on whether or not the packet is infected with an intrusion. They explored reducing the feature set using rough set theory, applied to just one type of attack. The authors claimed that their model was 20.5 times faster than previous ones. They suggested applying their method to other classes of attack as future work. In [7], the authors presented four different algorithms to solve the intrusion detection problem: the Multilayer Perceptron (MLP), Radial Basis Function (RBF), Logistic Regression (LR) and Voted Perceptron (VP), using NSL-KDD data. All these algorithms were implemented in Weka [8], a software package for data mining, to evaluate their performance. To enhance the results, feature reduction techniques were applied. The results showed that the MLP neural network algorithm provided more accurate results than the other algorithms.

2.2 Applications of ML

Various aspects of anomaly-based intrusion detection in computer security using Machine Learning (ML) were explored in [9]. A review of intrusion detection solutions using machine learning was presented in [10]. This work reviewed 55 related research studies between 2000 and 2007, focusing on developing single, hybrid, and ensemble classifiers. Recently, ten ML approaches, including Decision Tree J48, Bayesian Belief Network, Hybrid Naïve Bayes with Decision Tree, Rotation Forest, Hybrid J48 with Lazy Locally weighted learning, Discriminative multinomial Naïve Bayes, Random Forest combined with Naïve Bayes, and an ensemble of classifiers using J48 and NB with AdaBoost (AB), were used to detect network intrusions using the NSL-KDD data set [11]. Intrusion detection on mobile ad hoc networks (MANETs) is a challenging process because of their dynamic nature and their highly resource-constrained nodes. In [12], the author explored the use of Evolutionary Computation (EC) techniques, specifically Genetic Programming (GP) and Grammatical Evolution (GE), to evolve intrusion detection programs.

2.3 Applications of DT

Classification-based unsupervised and supervised ML techniques for detecting intrusions using network audit trails were presented in [13]. The authors investigated well-known techniques such as Frequent Pattern Tree mining (FP-tree), classification and regression trees (CART), multivariate regression splines (MARS) and TreeNet for solving the ID problem. Classification accuracy based on ROC curve analysis was used to measure the performance of each developed model. The results show that classification accuracies are better in the cases of SVM and ANN. Farid et al. [14] proposed a new learning algorithm for anomaly-based IDS using DT. Their method modified the splitting weights of the data set by changing the weights relative to posterior probabilities. The results of their work show better performance than the traditional DT algorithm. An ensemble neural decision tree was used in [15] for feature selection and model reduction. The proposed model was compared to 6 types of decision trees. They used specificity and sensitivity as evaluation metrics. The results showed that the proposed model performed better than other



methods. In [2], three types of decision trees: ID3, C4.5, and BFS were tested on NSL-KDD network intrusion data set. Feature selection was performed using Consistency Subset Evaluator (CSE). NSL-KDD data set and 10-fold cross validation test mode were used to train and test the three DT algorithms. The analysis of the results concluded that C4.5 performs better than BFS and ID3 in terms of prediction accuracy. Also, they used the ROC curve as evaluation criteria.
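The 10-fold cross validation test mode mentioned above can be illustrated with a minimal sketch of how the records are partitioned. This is not the implementation used in the cited work: the function name, the fixed seed, and the round-robin fold assignment are assumptions made for illustration, and the sketch does not stratify by class as evaluation tools often do.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]     # k nearly equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Each record is used for testing exactly once and for training k - 1 times.
for train, test in k_fold_indices(25, k=10):
    print(len(train), len(test))
```

A classifier would be trained on each `train` list and scored on the matching `test` list, with the reported accuracy being the average over the ten folds.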

computing facility or memory supply and make it too busy or too full to handle legitimate requests, thus denying users access to a machine.
• Surveillance and Other Probing (Probing): Probing is a type of attack where an attacker scans a network to identify vulnerabilities. The attacker can then use the gathered information to look for exploits.
• Unauthorized Access from a Remote Machine (R2L): A remote to user (R2L) attack is a type of attack where an attacker sends a packet to a machine over a network and then exploits the machine’s weakness to unlawfully gain access as a regular user.

2.4 Applications of SVM

Yao et. al. [16] proposed an enhanced SVM model for intrusion detection, they used rough set theory to reduce the number of features by removing the less weighted ones. They evaluated the proposed model using KDD99 and UMN data sets against precision, recall, false positives, and false negatives criteria. The results showed that their model was more accurate and needs less time to perform. Chen et al. [17] proposed a model for IDS using SVM based system on a Rough Set Theory (RST). RST was used to reduce the number of features from 41 to 29. The authors compared RST based SVM with that of a full features and Entropy. Their proposed RST-SVM model resulted in a better accuracy compared to the other two mothods. An integrated model of SVM model and DT model for multiclass classification proposed in [18]. First they separated the classes by binary tree structure, then each class were fed to a number of SVMs as the number of the classes. The authors supposed that by combining the two models the results will be more accurate, and the classification process will be faster than individual models. But they didn’t prove or simulate their model. A comparison between three types of Support Vector Machine (SVM) kernel functions: Gaussian Kernel (Radial Basis Function-RBF), polynomial kernel, and Sigmoid kernel was implemented in [19]. Using cross validation classifier and proper data set pre-processing showed that RBF kernel function can achieve better performance that the two other kernel functions.

3

• Unauthorized Access to Local Super User (U2R): User to root are a type of attacks where an attacker access to network as a regular user then exploit the network susceptibility to get root access. Many ML and pattern classification algorithms were used to solve the intrusion detection problem based the KDD data set and failed to identify most of the user-to-root and remote-to-local attacks. In [22], authors introduced the deficiencies and limitations of the KDD data set to argue that this data set should not be used to train pattern recognition or ML algorithms for misuse detection for these two attack categories.

3.2

It was reported that the KDD data set is has many problems. For example: it contains a very huge number of redundant records, and the difficulty level of the different records was not inversely proportional to the percentage of records in the original KDD data set. These deficits results in a very poor evaluation of different ID proposed techniques. NSL-KDD data set was suggested to solve some of the inherent problems of the KDD Cup 1999 data set. The proposed new data set consists of selected records of the complete KDD data set and it recovering these problems [21]. Table 1 shows the NSL-KDD data variables [23]. In Table 2, we show the distribution of attack records per attack category. The following are some of the NSL-KDD advantages over the original KDD data set as presented in [21]:

Data Set

3.1

NSL-KDD Data Set

KDD Data Set

KDD Cup 1999 is the most widely used data set in the ID research. The data is accessible from [20]. It contains about 4,900,000 connection records. Each record consists of 41 features. A statistical analysis on this data set was presented in [21]. There are four major categories of attacks in the KDD data set. They are:

1. Redundant records are excluded in the training set. Thus, biased towards more frequent records is eliminated. 2. In the original KDD data set the number of selected records from each group level is inversely proportional to the percentage of records.

• Denial of Service (DoS): Denial of Service is a type of attack where an attacker access the

17


3. Provided that the number of records in the training and testing portion are sound, experiments on the whole set can be economically tested without the necessity for a random sample at a reduced scale.

Table 1: NSL-KDD Data Set Features [23]

Variable No.   Description                    Type
1              duration                       continuous
2              protocol_type                  symbolic
3              service                        symbolic
4              flag                           symbolic
5              src_bytes                      continuous
6              dst_bytes                      continuous
7              land                           symbolic
8              wrong_fragment                 continuous
9              urgent                         continuous
10             hot                            continuous
11             num_failed_logins              continuous
12             logged_in                      symbolic
13             num_compromised                continuous
14             root_shell                     continuous
15             su_attempted                   continuous
16             num_root                       continuous
17             num_file_creations             continuous
18             num_shells                     continuous
19             num_access_files               continuous
20             num_outbound_cmds              continuous
21             is_host_login                  symbolic
22             is_guest_login                 symbolic
23             count                          continuous
24             srv_count                      continuous
25             serror_rate                    continuous
26             srv_serror_rate                continuous
27             rerror_rate                    continuous
28             srv_rerror_rate                continuous
29             same_srv_rate                  continuous
30             diff_srv_rate                  continuous
31             srv_diff_host_rate             continuous
32             dst_host_count                 continuous
33             dst_host_srv_count             continuous
34             dst_host_same_srv_rate         continuous
35             dst_host_diff_srv_rate         continuous
36             dst_host_same_src_port_rate    continuous
37             dst_host_srv_diff_host_rate    continuous
38             dst_host_serror_rate           continuous
39             dst_host_srv_serror_rate       continuous
40             dst_host_rerror_rate           continuous
41             dst_host_srv_rerror_rate       continuous

Table 2: Distribution of attack records per attack category of the NSL-KDD

Attack Category   Attack Name       Number of Records
DoS               Back              956
DoS               Land              18
DoS               Neptune           41214
DoS               Pod               201
DoS               Smurf             2646
DoS               teardrop          892
DoS               Total             45927
Probe             Satan             3633
Probe             Ipsweep           3599
Probe             Nmap              1493
Probe             Portsweep         2931
Probe             Total             11656
R2L               Guess_Password    53
R2L               Ftp_write         8
R2L               Imap              11
R2L               Phf               4
R2L               Multihop          7
R2L               Warezmaster       20
R2L               Warezclient       890
R2L               Spy               2
R2L               Total             995
U2R               Buffer_overflow   30
U2R               Loadmodule        9
U2R               Rootkit           10
U2R               Perl              3
U2R               Total             52
Normal                              67343
Total                               125973

4

Decision Tree

The decision tree is one of the most well-known and widely used classification algorithms. The decision tree algorithm known as ID3 has been known since 1970. The C4.5 algorithm was presented later by Quinlan [24]. C4.5 became a benchmark to which newer supervised learning algorithms are often compared. Classification and Regression Trees (CART), which generate binary decision trees, were presented in [25]. ID3, C4.5, and CART adopt a greedy approach in which decision trees are constructed in a top-down recursive divide-and-conquer way [24]. Unlike ID3, C4.5 can deal with continuous attributes and handles missing values, but it is a little slower than the other DT algorithms [2].

4.1 How to develop a DT?

A decision tree is a directed tree whose structure is formed by recursively separating the set of observations. It consists of a root with no incoming edges, internal or test nodes with exactly one incoming edge each, and leaves, which represent the decision nodes and have no outgoing edges [26]. The decision tree development algorithm is a greedy algorithm which is top-down, recursive, and divide-and-conquer in nature. The algorithm can be summarized as follows:

Algorithm 1: Basic steps of DT generation
  Create a node N
  If samples are all of the same class C then
      return N as a leaf node labeled with class C;
  If attribute-list is empty then
      return N as a leaf node labeled with the most common class in samples;
  Select test-attribute, the attribute among attribute-list with the highest information gain based on the entropy;
  Label node N with test-attribute;
  for each known value a_i of test-attribute do
      Let s_i be the set of samples for which test-attribute = a_i;
      If s_i is empty then
          Attach a leaf labeled with the most common class in samples;
      else
          attach the node returned by generate decision tree
      end if
  end for

To reduce tree complexity, pruning algorithms were presented. Pruning is a general technique to counter overfitting. It has a large effect on the tree size and only a slight effect on the accuracy, and it was reported to result in better accuracy [27]. Using a decision tree, network connections can be classified as normal, anomaly, or other predefined types of attack.

4.2 How to Select Tree Root?

We want to determine which attribute can serve as the root of a tree given a set of training feature vectors. Information gain (IG) defines how important a certain attribute of the feature vectors is. IG helps in deciding the ordering of attributes in the nodes of a decision tree. Equations 1 and 2 show how entropy and information gain are calculated [24].

\[ \mathrm{Entropy} = \sum_{i} -p_i \log_2 p_i \tag{1} \]

\[ IG = E(\mathrm{Parent}) - AE(\mathrm{Children}) \tag{2} \]

$E$ and $AE$ are the entropy and the average entropy, respectively, and $p_i$ is the probability of class $i$. Entropy comes from information theory. The higher the entropy, the more the information content. For example, consider the training data set in Table 4.2. The table has three features $f_1$, $f_2$ and $f_3$ and the two classes A and B. Assuming that $f_1$ is the best split attribute, this node would be further split.

f1   f2   f3   Class
1    1    1    A
1    1    0    A
0    0    1    B
1    0    0    B

Thus, the entropy of the children and the gain can be computed as in Equations 3 and 4.

\[ E_{parent} = 1, \qquad E_{child1} = -\tfrac{1}{3}\log_2\left(\tfrac{1}{3}\right) - \tfrac{2}{3}\log_2\left(\tfrac{2}{3}\right) = 0.5284 + 0.39 = 0.9184, \qquad E_{child2} = 0 \tag{3} \]

\[ IG = 1 - \tfrac{3}{4} \times (0.9184) - \tfrac{1}{4} \times (0) = 0.3112 \tag{4} \]

If we split using the feature $f_2$, we get Equations 5 and 6.

\[ E_{parent} = 1, \qquad E_{child1} = 0, \qquad E_{child2} = 0 \tag{5} \]

\[ IG = 1 - \tfrac{1}{2} \times (0) - \tfrac{1}{2} \times (0) = 1 \tag{6} \]

Splitting using feature $f_2$ produces the best gain. The developed tree structure in this case is presented in Figure 2. This tree was developed using the Weka software [28].

Figure 2: Simple Tree Structure

5 Artificial Neural Network

Classification is one of the most active research and application areas of neural networks. A classification problem arises when an object needs to be allocated to a predefined group or class based on a number of observed attributes associated with that object. ANNs were successfully used to handle multi-class pattern classification problems [29, 30], medical diagnosis [31], bankruptcy prediction [32], handwritten character recognition [33, 34], and speech recognition [35]. An ANN usually consists of many hundreds of simple processing units which are connected together in a complex communication network. Each unit or node is a simplified model of a real neuron which fires (sends off a new signal) if it receives a sufficiently strong input signal from the other nodes to which it is connected. The strength of these connections may be varied in order for the network to perform different tasks corresponding to different patterns of node firing activity. An ANN model consists of a set of synapses, each of which is characterized by a weight or strength of its own.
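The behaviour of a single unit, and of a small network built from such units, can be sketched in a few lines of Python. This is an illustration only and not the Weka model used in the experiments: the weights are made-up values, the layer sizes are arbitrary, and `sigmoid`, `neuron_output`, `mlp_forward`, and `mse` are hypothetical helper names. The sketch computes a weighted sum passed through a sigmoid transfer function, chains such units into one hidden layer and one output layer, and measures the mean squared error against a target value.

```python
import math


def sigmoid(x):
    """Sigmoid transfer function f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the sigmoid."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(total)


def mlp_forward(inputs, hidden_layer, output_layer):
    """Forward pass through one hidden layer and one output layer.

    Each layer is a list of (weights, bias) pairs, one pair per neuron.
    """
    hidden = [neuron_output(inputs, w, b) for w, b in hidden_layer]
    return [neuron_output(hidden, w, b) for w, b in output_layer]


def mse(targets, outputs):
    """Mean squared error between target values and network outputs."""
    n = len(targets)
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / n


# Made-up weights for illustration: 3 inputs, 2 hidden neurons, 1 output.
hidden_layer = [([0.5, -0.4, 0.1], 0.0), ([-0.3, 0.8, 0.2], 0.1)]
output_layer = [([1.0, -1.0], 0.0)]

prediction = mlp_forward([1.0, 0.0, 1.0], hidden_layer, output_layer)
error = mse([1.0], prediction)
print(prediction, error)
```

Training with backpropagation would repeatedly adjust the weights and biases to reduce `error`; that step is deliberately left out of this sketch.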



5.1 Perceptron

The neuron is the basic processing unit in an ANN. Each neuron has a number of inputs and a single output. Each input has an assigned factor or parameter called the weight. A neuron works as follows: each input signal is multiplied by the corresponding weight, the products are summed, and the sum is passed through a transfer function. This transfer function is most likely to be a sigmoid function (see Equation 7). The simplest neural network unit is called the "Perceptron". If the result of the summation is over a certain threshold, the neuron output is activated; otherwise it is not.

\[ f(x) = \frac{1}{1 + e^{-x}} \tag{7} \]

For example, given a set of inputs $x_j$ and a set of corresponding weights $w_j$ between the input and hidden neurons, the outputs of all neurons in the hidden layer are calculated by the summation function (see Equation 8).

\[ y_i = f\left( \sum_{j=1}^{n} w_j x_j + w_0 \right) \tag{8} \]

5.2 Multilayer Perceptron

An ANN consists of three layers: the input layer, the hidden layer, and the output layer. Neurons are most likely fully connected. Each connection is signified by a weight. This weight is computed by what is called a learning algorithm. Neurons are grouped together to form a layer. The MLP is a fully connected network because all inputs/units in one layer are connected to all units in the following layer. The input layer gets the initial data, and the hidden layer calculates several interim values which are used to calculate the output values in the output layer. The MLP can be represented mathematically as given in Equation 9 [36, 37].

\[ \hat{y}_i = g_i[\Phi, \theta] = F_i\left[ \sum_{j=1}^{n_h} W_{i,j}\, f_j\left( \sum_{l=1}^{n_\Phi} w_{j,l}\, \Phi_l + w_{j,0} \right) + W_{i,0} \right] \tag{9} \]

where $\hat{y}_i$ is the output signal, $g_i$ is the function realized by the neural network, and $\theta$ is the parameter vector, which contains all the adjustable parameters of the network (weights $w_{j,l}$ and $W_{i,j}$, and biases $w_{j,0}$ and $W_{i,0}$). The MLP is trained using the Backpropagation (BP) learning algorithm. Training means adjusting the network weights such that the objective criterion is minimized (i.e. minimizing the error between the network output $\hat{y}$ and the desired output $y$). The ANN achieves a good match when the Mean Square Error (MSE) is minimized (see Equation 10). Figure 3 shows the architecture of the MLP with 41 inputs, which are the features of NSL-KDD, and six outputs, one per class (five attack types and the normal type). We used the MLP to detect the types of attacks available in our data samples.

\[ MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \tag{10} \]

6 Support Vector Machines

Support Vector Machines (SVMs) are one of the more recent developments in supervised machine learning [38]. A survey of SVMs can be found in [39, 40]. Although SVMs have been known since the late seventies [41, 42], they started to receive attention in the late nineties [41]. They were applied mainly to pattern recognition and to pattern classification problems such as image recognition, text recognition, and face detection [43]. Much research has also applied SVMs to the intrusion detection problem, such as [44]–[47]. SVMs work mainly by deriving a hyperplane that maximizes the separating margin between two classes [48]. The feature vectors that lie on the boundary of the separating margin are called support vectors [48]. SVMs are attractive because they are very resilient to overfitting [27].

6.1

How SVM works

To see how an SVM works, assume we have a set of training examples in pair format $(x_i, y_i)$, $i = 1, \ldots, N$, where $x_i \in R^n$ and $y \in \{1, -1\}^N$. Our objective is to learn a classifier:

\[ f(x) = w^T \phi(x) + b \tag{11} \]

The classifier's output for a new $x$ is $\mathrm{sign}(f(x))$. If the training data are linearly separable in the feature space of $\phi(x)$ (see Figure 4), the two classes of training examples are sufficiently well separated in the feature space that one can draw a hyperplane between them. We need to maximize the margin (i.e. the distance from the hyperplane to the closest data point in either class) so that the margin of error is as large as possible.

Figure 3: Proposed MLP architecture for the NSL-KDD data classification

Figure 4: Optimal hyperplane in Support Vector Machine [49]

Many data sets might not be linearly separable. This means there will be no solution that satisfies all the constraints. One way to handle this problem is to relax some of the constraints by introducing slack variables. Slack variables permit certain constraints to be violated, which means that certain training points may lie within the margin. The SVM maps the training vectors $x_i$ into a higher-dimensional space using the function $\phi$ and finds a linear separating hyperplane with the maximum margin in that space. $C > 0$ is the penalty coefficient for the error term, and $\zeta_i$ are the slack variables. Our objective is to minimize the number of points within the margin as much as possible. In this case, the SVM [50, 51] requires the solution of the following optimization problem:

\[ \min_{w, b, \zeta} \; \frac{1}{2} w^T w + C \sum_{i=1}^{N} \zeta_i \quad \text{subject to} \quad y_i (w^T \phi(x_i) + b) \ge 1 - \zeta_i, \quad \zeta_i \ge 0 \;\; \forall i \tag{12} \]

$K(x_i, x_j) \equiv \phi(x_i)^T \phi(x_j)$ is called the kernel function. Nowadays, many kernels have been proposed for the SVM. Some are listed below:

• linear: $K(x_i, x_j) = x_i^T x_j$
• polynomial: $K(x_i, x_j) = (\gamma x_i^T x_j + r)^d$, $\gamma > 0$
• sigmoid: $K(x_i, x_j) = \tanh(\gamma x_i^T x_j + r)$

where $\gamma$, $r$, and $d$ are kernel parameters. The characteristics of the slack variables for various values are shown in Figure 5.

Figure 5: Optimal hyperplane with slack variables [49]

7 Feature Selection

Feature selection is defined as the process of selecting a subset of the originally defined features based on pre-defined evaluation criteria. Feature selection has been used successfully to enhance the modeling of input-output systems. In many modeling cases, various attributes are gathered during the data collection process although they might not be significant. Irrelevant data might increase the model complexity and the convergence time to the best model structure. Feature selection has a number of advantages. They include:

• Feature selection is frequently used for model dimension reduction.

• Feature selection helps reduce the feature domain and removes redundant features. This helps speed up the learning/modeling process [24, 52].

The relevance between the 41 features and the attack types was studied in [23]. The authors concluded that not all of the 41 features are needed to classify the types of attacks. They recommended further studies based on machine learning algorithms.



In [53], a performance analysis of different feature selection methods in intrusion detection was presented. A number of feature selection algorithms were compared, including the SVM-wrapper, Markov-blanket, and Classification And Regression Trees (CART) algorithms and the generic-feature-selection (GeFS) method. Experimental results on the KDD CUP'99 data set show that the GeFS method for intrusion detection outperforms the existing approaches by removing more than 30% of redundant features from the original data set while keeping a better classification accuracy. A summary of feature selection methods [54] is presented in Figure 6.

Figure 6: Summary of feature selection methods [55]

7.1 Process of Feature Selection

A feature selection process comprises four simple steps. Different methods for attribute search and evaluation were analyzed in [55, 56]. A typical feature selection method was presented in [53]–[55]. The four steps are:

1) A generation procedure to generate the next candidate subset.
2) An evaluation function to evaluate the subset.
3) A stopping criterion to decide when to stop.
4) A validation procedure to check whether the subset is valid.

7.2 Best First Search

The best first search strategy allows backtracking along the search path. It moves through the search space by making local changes to the current feature subset. If the path being explored begins to look less promising, best first search can backtrack to a more promising previous subset and continue the search from there. The best first search algorithm works as given in Algorithm 2.

Algorithm 2: Best first search algorithm [52]
1. Begin with the OPEN list containing the start state, the CLOSED list empty, and BEST ← start state.
2. Let s = arg max e(x) (get the state from OPEN with the highest evaluation).
3. Remove s from OPEN and add to CLOSED.
4. If e(s) ≥ e(BEST), then BEST ← s.
5. For each child t of s that is not in the OPEN or CLOSED list, evaluate and add to OPEN.
6. If BEST changed in the last set of expansions, goto 2.
7. Return BEST.

7.3 Genetic Search

Genetic Algorithms (GA) are search algorithms that adopt the principle of natural selection [52], [57]. Using GA, robust and adaptable systems can be developed [57], [58]. A GA works on individuals called chromosomes. The initial population is a set of randomly created chromosomes. Each chromosome represents a possible solution to the problem [59], [57]. The generated solutions evolve over time to produce an optimal solution in an iterative process. In the feature selection problem, a solution is usually a fixed-length binary string representing a feature subset. Each position in the string represents the presence or absence of a particular feature [52]. The initial subset is selected randomly from the set of all features. Successive generations are produced using genetic operators called crossover and mutation, applied to the currently selected subset. The newly generated subset members are evaluated using a fitness function according to defined fitness criteria. The better subsets have a stronger chance of being selected to form a new subset. In this way, newer evolved



subsets potentially have higher quality. Generally, the genetic search strategy works as given in Algorithm 3.

Algorithm 3: Genetic search algorithm [52]
1. Begin by randomly generating an initial population P.
2. Calculate e(x) for each member x ∈ P.
3. Define a probability distribution p over the members of P where p(x) ∝ e(x).
4. Select two population members x and y with respect to p.
5. Apply crossover to x and y to produce new population members x' and y'.
6. Apply mutation to x' and y'.
7. Insert x' and y' into P' (the next generation).
8. If |P'| < |P|, goto 4.
9. Let P ← P'.
10. If there are more generations, goto 2.
11. Return x ∈ P for which e(x) is highest.

8 Model Evaluation

In order to check the performance of the developed models, we explored a set of performance evaluation functions: Correctly Classified Instances (CCI), Incorrectly Classified Instances (ICI), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Relative Absolute Error (RAE). These functions measure how close the intrusion types predicted by the learned models are to the actual intrusion types. They are computed as follows:

\[ CCI = \frac{TP + TN}{TP + TN + FP + FN} \tag{13} \]

\[ ICI = \frac{FP + FN}{TP + TN + FP + FN} \tag{14} \]

where $TP$ is the proportion of positive instances that were correctly classified as positives, $TN$ the proportion of negative instances that were correctly classified as negatives, $FP$ the proportion of negative instances that were incorrectly classified as positives, and $FN$ the proportion of positive instances that were incorrectly classified as negatives. The confusion matrix shown in Table 3 is used to evaluate the performance of the classification system.

Table 3: Confusion matrix.

                   Predicted Positive   Predicted Negative
Actual Positive    TP                   FN
Actual Negative    FP                   TN

\[ MAE = \frac{1}{n} \sum_{i=1}^{n} |y - \hat{y}| \tag{15} \]

\[ RMSE = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} (y - \hat{y})^2 } \tag{16} \]

\[ RAE = \frac{ \sum_{i=1}^{n} |y - \hat{y}| }{ \sum_{i=1}^{n} |y - \bar{y}| } \tag{17} \]

where $y$ is the actual class of the connection, $\hat{y}$ is the predicted class, and $\bar{y}$ is the mean of $y$ over the $n$ instances.

9 Experimental Results

In our experiments, we randomly selected 6000 records from the NSL-KDD data. The selected set contains 5 types of attack and the normal type, with 1000 records for each type. Table 4 shows the type of data used and the number of samples for each type.

Table 4: Experimental data

Attack type   No. of records
Normal        1000
ipsweep       1000
neptune       1000
nmap          1000
smurf         1000
satan         1000
Sum           6000

Figure 7: Weka Setup for the C4.5 Classification Model

In the following sections, we show our experiments on three intrusion detection models developed based on the MLP, C4.5, and SVM classifiers. The data mining software tool Waikato Environment for Knowledge Analysis (WEKA) [28] was used to develop our results. The sample of the NSL-KDD data set shown in Table 4 was adopted. For all experiments we used the 10-fold cross validation test mode since it reduces the variance of the estimate. Block diagrams of the experimental setup in the Weka environment for the three proposed models are shown in Figure 7, Figure 8 and Figure 9. The experimental results are explained in detail in the following subsections.
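The evaluation functions of Equations 13 to 17 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Weka computation behind the reported results: class labels are encoded as the numbers 0 and 1 so that absolute and squared errors are defined, the sample labels are made up, and `evaluation_measures` is a hypothetical helper name. How Weka derives its error measures internally may differ from this direct reading of the equations.

```python
import math


def evaluation_measures(actual, predicted):
    """Compute CCI, ICI, MAE, RMSE and RAE for numeric class labels.

    Assumes `actual` contains at least two different values, otherwise
    the RAE denominator would be zero.
    """
    n = len(actual)
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    abs_errors = [abs(a - p) for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    return {
        "CCI": correct / n,                       # (TP + TN) / all instances
        "ICI": (n - correct) / n,                 # (FP + FN) / all instances
        "MAE": sum(abs_errors) / n,
        "RMSE": math.sqrt(sum(e ** 2 for e in abs_errors) / n),
        "RAE": sum(abs_errors) / sum(abs(a - mean_actual) for a in actual),
    }


# Made-up labels: 1 = attack, 0 = normal.
actual = [1, 1, 0, 0, 1, 0, 1, 0]
predicted = [1, 0, 0, 0, 1, 1, 1, 0]
measures = evaluation_measures(actual, predicted)
print(measures)
```

In this example six of the eight labels match, so CCI and ICI sum to one, and RAE compares the total absolute error with that of always predicting the mean label.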



used. This number is computed as the number of inputs plus the number of classes to be identified, divided by 2. This number is 60 in our case.

Table 6: The Setting of ANN

Maximum number of epochs             500
Number of hidden layers              1
Number of neurons in hidden layer    60
Learning rate                        0.3
Momentum                             0.2

Figure 8: Weka Setup for the MLP Classification Model

The structure of the proposed MLP was presented in Figure 3. The confusion matrix developed based on the MLP model is given in Table 7.

Table 7: Confusion matrix for the MLP model

Pred. Actual ipsweep neptune nmap normal satan smurf

Figure 9: Weka Setup for the SVM Classification Model

9.1

9.3

C4.5 Model

neptune%

99.20 0.00 0.70 0.60 0.10 0.00 Average of

nmap %

normal %

satan %

0.00 0.30 0.30 0.20 99.8 0.00 0.20 0.00 0.00 99.0 0.20 0.10 0.10 0.60 97.40 1.30 0.10 0.00 0.90 98.9 0.00 0.00 0.00 0.00 correctly classified instances = 99.05 %

normal %

satan %


smurf % 0.00 0.00 0.00 0.10 0.20 99.80

SVM Model

(18)

Table 8: Confusion matrix for the SVM model


smurf % 0.00 0.00 0.00 0.00 0.00 100.0

10 9.2

98.70 0.00 1.60 0.60 0.30 0.20 Average of

nmap %

\[ K(x_i, x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^2\right), \quad \gamma > 0 \]

Table 5: Confusion matrix for the C4.5 model ipsweep%

neptune%

In this section, we provide the results of the developed SVM classification model. We explored various types of kernels: the Gaussian kernel (RBF), the polynomial kernel, and the sigmoid kernel. We found that the SVM with the RBF kernel achieves the best accuracy rate among these kernels. The RBF kernel is given in Equation 18. The confusion matrix developed based on the SVM model is given in Table 8.
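The kernels explored here, together with the linear kernel, can be written down directly from their definitions. The sketch below is illustrative only: the default values chosen for gamma, r and d are assumptions rather than the settings used in the experiments, and `gram_matrix` is a hypothetical helper that evaluates a kernel on every pair of sample vectors.

```python
import math


def dot(u, v):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(xi, xj):
    return dot(xi, xj)


def polynomial_kernel(xi, xj, gamma=1.0, r=0.0, d=3):
    return (gamma * dot(xi, xj) + r) ** d


def sigmoid_kernel(xi, xj, gamma=1.0, r=0.0):
    return math.tanh(gamma * dot(xi, xj) + r)


def rbf_kernel(xi, xj, gamma=1.0):
    """Gaussian (RBF) kernel exp(-gamma * ||xi - xj||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * squared_distance)


def gram_matrix(samples, kernel):
    """Kernel value for every pair of samples."""
    return [[kernel(a, b) for b in samples] for a in samples]


# Made-up two-dimensional samples.
samples = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
K = gram_matrix(samples, rbf_kernel)
print(K)
```

With the RBF kernel every sample has similarity 1 with itself, and the similarity decays towards 0 as two samples move apart, at a rate controlled by gamma.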

C4.5/J48 is a very popular machine learning algorithm. It is a newer variant of the ID3 algorithm. The output of this classification algorithm is an understandable tree. To keep the tree as small as possible, information gain is used while building the tree. Pruning can also be used to get a smaller tree. Without pruning, we get a tree of 456 nodes and 400 leaves, with a classification accuracy of 99%. With pruning, we get a tree of 229 nodes and 188 leaves, with 99.05% classification accuracy. A confidence factor of 0.25 was used. The confusion matrix developed based on the C4.5 model is given in Table 5.
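How information gain drives the choice of the split attribute can be sketched on the four-record example with features f1, f2 and f3 from Section 4.2. The code below is a simplified ID3-style illustration, not J48 itself: it handles only discrete attributes and omits C4.5's treatment of continuous attributes, missing values and pruning. The function names `entropy`, `information_gain` and `build_tree` are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy = sum_i -p_i * log2(p_i) over the class proportions."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """IG = E(parent) - weighted average entropy of the children."""
    children = {}
    for row, label in zip(rows, labels):
        children.setdefault(row[attr], []).append(label)
    average_child = sum(len(ch) / len(labels) * entropy(ch)
                        for ch in children.values())
    return entropy(labels) - average_child


def build_tree(rows, labels, attrs):
    """Greedy top-down construction; returns a class label or a nested dict."""
    if len(set(labels)) == 1:            # all samples share one class: leaf
        return labels[0]
    if not attrs:                        # no attributes left: majority leaf
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    node = {"attribute": best, "branches": {}}
    remaining = [a for a in attrs if a != best]
    for value in sorted({row[best] for row in rows}):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        node["branches"][value] = build_tree([r for r, _ in pairs],
                                             [l for _, l in pairs],
                                             remaining)
    return node


# The four training records of the worked example.
rows = [
    {"f1": 1, "f2": 1, "f3": 1},
    {"f1": 1, "f2": 1, "f3": 0},
    {"f1": 0, "f2": 0, "f3": 1},
    {"f1": 1, "f2": 0, "f3": 0},
]
labels = ["A", "A", "B", "B"]

gains = {a: information_gain(rows, labels, a) for a in ("f1", "f2", "f3")}
tree = build_tree(rows, labels, ["f1", "f2", "f3"])
print(gains)
print(tree)
```

On this data the gain of f1 is about 0.31 and the gain of f2 is 1, so f2 becomes the root and both of its branches are pure leaves, in line with Equations 3 to 6.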


ipsweep%

ipsweep%

neptune%

84.30 0.10 25.60 0.50 0.40 0.00 Average of

nmap %

normal %

satan %


smurf % 0.00 0.00 0.00 0.10 0.00 95.60

Best Attribute Selected

Feature selection was implemented using the BFS and GS algorithms for attribute subset selection. Correlation based Feature Selection (CFS) was used as the evaluator for the selected attributes. The selected feature subset was then used to develop a new set of models based on C4.5, MLP and SVM.

ANN Model

An MLP was used as a classification model to solve the intrusion detection problem. The setup for the developed MLP model is given in Table 6. In Weka, the default number of neurons in the hidden layer was



The proposed flow diagram in this case is shown in Figure 10. In Tables 9 and 10, we show the best features selected based on the BFS and GS algorithms, respectively.

Table 9: BFS Selected Features

No.   Description                    Type
3     service                        symbolic
5     src_bytes                      continuous
6     dst_bytes                      continuous
23    count                          continuous
30    diff_srv_rate                  continuous
37    dst_host_srv_diff_host_rate    continuous
38    dst_host_serror_rate           continuous
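A best first search over feature subsets, in the spirit of Algorithm 2, can be sketched as follows. Several parts are assumptions made for illustration: the search moves forward by adding one feature at a time, it stops after `max_stale` expansions without improvement, and `toy_merit` with its made-up relevance scores merely stands in for a real subset evaluator such as CFS. The feature names are taken from Table 1, but the scores attached to them are not experimental values.

```python
def best_first_search(features, evaluate, max_stale=5):
    """Forward best first search over feature subsets.

    `evaluate` maps a frozenset of features to a score (higher is better).
    The search ends when OPEN is empty or BEST has not improved for
    `max_stale` consecutive expansions (an assumed stopping rule).
    """
    start = frozenset()
    open_list = {start: evaluate(start)}   # state -> evaluation
    closed = set()
    best, best_score = start, open_list[start]
    stale = 0
    while open_list and stale < max_stale:
        # Take the state in OPEN with the highest evaluation.
        s = max(open_list, key=open_list.get)
        s_score = open_list.pop(s)
        closed.add(s)
        # BEST is replaced only on strict improvement, so ties count as stale.
        if s_score > best_score:
            best, best_score = s, s_score
            stale = 0
        else:
            stale += 1
        # Children differ from s by one added feature.
        for f in features:
            child = s | {f}
            if child != s and child not in open_list and child not in closed:
                open_list[child] = evaluate(child)
    return best, best_score


# Made-up relevance scores, for illustration only.
relevance = {"service": 0.6, "src_bytes": 0.5, "count": 0.4, "land": 0.0}


def toy_merit(subset):
    """Reward relevant features and penalize large subsets."""
    return sum(relevance[f] for f in subset) - 0.1 * len(subset) ** 2


best_subset, best_score = best_first_search(list(relevance), toy_merit)
print(sorted(best_subset), best_score)
```

Because evaluated but unexpanded subsets stay in OPEN, the search can return to an earlier, more promising subset when the current path stops improving, which is the backtracking behaviour of best first search.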

Figure 10: Feature Selection Process Flow Diagram

Table 10: GS Selected Features

No.   Description                    Type
2     protocol_type                  symbolic
3     service                        symbolic
5     src_bytes                      continuous
6     dst_bytes                      continuous
23    count                          continuous
24    srv_count                      continuous
25    serror_rate                    continuous
30    diff_srv_rate                  continuous
36    dst_host_same_src_port_rate    continuous
37    dst_host_srv_diff_host_rate    continuous
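A genetic search over feature subsets along the lines of Algorithm 3 can be sketched as below. The sketch rests on assumptions: each chromosome is a list of 0/1 flags marking absent or present features, the population size, number of generations, crossover rate and mutation rate are illustrative defaults, and `toy_fitness` is a made-up stand-in for a real subset evaluator. Fitness values must be positive because parents are drawn with probability proportional to fitness.

```python
import random


def genetic_search(n_features, fitness, pop_size=20, generations=30,
                   crossover_rate=0.6, mutation_rate=0.033, seed=0):
    """Evolve bit strings that mark which features are selected."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(individual) for individual in population]
        next_generation = []
        while len(next_generation) < pop_size:
            # Fitness-proportional selection of two parents (copied).
            x, y = (p[:] for p in rng.choices(population, weights=scores, k=2))
            # Single-point crossover.
            if rng.random() < crossover_rate:
                point = rng.randint(1, n_features - 1)
                x, y = x[:point] + y[point:], y[:point] + x[point:]
            # Bit-flip mutation, then insert into the next generation.
            for child in (x, y):
                for i in range(n_features):
                    if rng.random() < mutation_rate:
                        child[i] = 1 - child[i]
                next_generation.append(child)
        population = next_generation[:pop_size]
    # Return the member of the final population with the highest fitness.
    return max(population, key=fitness)


# Made-up target: features 0, 1, 3 and 6 are treated as the useful ones.
TARGET = [1, 1, 0, 1, 0, 0, 1, 0]


def toy_fitness(individual):
    """One plus the number of positions that agree with the made-up target."""
    return 1 + sum(1 for bit, want in zip(individual, TARGET) if bit == want)


best = genetic_search(len(TARGET), toy_fitness, seed=1)
print(best, toy_fitness(best))
```

The fixed seed makes a run repeatable. Since the sketch keeps no elite individuals between generations, the returned subset is the best of the final population rather than the best ever visited.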

10.1 Some Observations

The performance of each of the three models built using C4.5, MLP, and SVM was tested, and the obtained results are shown in Table 11. Figure 11 shows that C4.5 achieved the highest classification accuracy compared to the other techniques.

• Results showed that C4.5 was the best method in terms of detection accuracy and minimum training time. It achieved the top accuracy rate (99.05%).

• After applying feature selection using the Best First and Genetic Search methods, C4.5 still has the top accuracy.

• C4.5 and MLP perform better with the Genetic Search attribute selection method than with Best First. SVM is the only algorithm which performs better with the Best First feature selection method.

We conclude that feature selection can reduce the model complexity by minimizing the number of attributes and the model building time.

Figure 11: Correctly Classified Instances for C4.5, MLP and SVM (vertical axis: 0 to 100; groups: Original Data, BF Search, Genetic Search)

Table 11: Performance evaluation for C4.5, MLP and SVM models

Algorithm             CCI      ICI      MAE      RMSE     RAE      Time Taken (s)
C4.5 (J48)            99.05%   0.95%    0.0039   0.0534   1.39%    0.47
C4.5+BestFirst        97.35%   2.65%    0.0122   0.0903   4.41%    0.06
C4.5+Genetic Search   98.80%   1.20%    0.005    0.0573   1.80%    0.11
MLP                   98.72%   1.28%    0.0061   0.0619   2.18%    485.68
MLP+BestFirst         93.05%   6.95%    0.0302   0.1299   10.86%   218.2
MLP+Genetic Search    94.77%   5.23%    0.0218   0.1151   7.83%    235.3
SVM                   89.58%   10.42%   0.0347   0.1863   12.50%   14.66
SVM+Best First        93.80%   6.20%    0.0207   0.1438   7.44%    4.08
SVM+Genetic Search    86.77%   13.23%   0.0441   0.21     15.88%   5.32

11 Conclusions

In this paper, we developed three models to solve the intrusion detection problem using the decision tree based C4.5 algorithm, the Multi-Layer Perceptron, and the Support Vector Machine. A number of attacks were classified using the three methods. To enhance the performance of the proposed models and speed up the detection process, a set of features was selected using the Best First Search and the Genetic Search methods. A comparison between the developed models before and after feature selection was provided. The developed models were capable of reducing the complexity while keeping acceptable detection accuracy. The decision tree based C4.5 algorithm achieved the highest classification accuracy compared to the other techniques explored in this work.

References

[1] R. C. Summers. Secure computing: Threats and safeguards. McGraw Hill, New York, 2010.

[2] Shih Yin Ooi, Yew Meng Leong, Meng Foh Lim, Hong Kuan Tiew, and Ying Han Pang. Network intrusion data analysis via consistency subset evaluator with ID3, C4.5 and best-first trees. IJCSNS, 13(2):7, 2013.

[3] M. M. Pillai, Jan H. P. Eloff, and H. S. Venter. An approach to implement a network intrusion detection system using genetic algorithms. In Proceedings of the 2004 Annual Research Conference of the South African Institute of Computer Scientists and Information Technologists on IT Research in Developing Countries, pages 221–221, Republic of South Africa, 2004. South African Institute for Computer Scientists and Information Technologists.

[4] Al-Sakib Khan Pathan, editor. The State of the Art in Intrusion Prevention and Detection. CRC press, 2014.

[5] Sammany Mohammed, Sharaw Marwa, Elbeltagy Mohammed, and Saroit Imane. Artificial neural networks architecture for intrusion detection systems and classification of attacks. In Faculty of Computers and Information Cairo University, 2007.

[6] Dilip Kumar Barman and Guruprasad Khataniar. Design of intrusion detection system based on artificial neural network and application of rough set. International Journal of Computer Science and Communication Networks, pages 548–552, 2012.

[7] Singh Sahilpreet and Bansal Meenakshi. Improvement of intrusion detection system in data mining using neural network. International Journal of Advanced Research in Computer Science and Software Engineering, 2013.

[8] Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, and Ian H. Witten. The weka data mining software: An update. ACM SIGKDD Exploration Newsletter, 11(1):10–18, November 2009.

[9] Yihua Liao. Machine Learning in Intrusion Detection. PhD thesis, University of California at Davis, Davis, CA, USA, 2005.

[10] Chih-Fong Tsai, Yu-Feng Hsu, Chia-Ying Lin, and Wei-Yang Lin. Intrusion detection by machine learning: A review. Expert Systems Applications, 36(10):11994–12000, December 2009.

[11] Mrutyunjaya Panda, Ajith Abraham, Swagatam Das, and Manas Ranjan Patra. Network intrusion detection system: A machine learning approach. Int. Dec. Tech., 5(4):347–356, October 2011.

[12] Sevil Sen and John A. Clark. Evolutionary computation techniques for intrusion detection in mobile ad hoc networks. Comput. Netw., 55(15):3441–3457, October 2011.

[13] Srinivas Mukkamala, Dennis Xu, and Andrew H. Sung. Intrusion detection based on behavior mining and machine learning techniques. In Proceedings of the 19th International Conference on Advances in Applied Artificial Intelligence: Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE’06, pages 619–628, Berlin, Heidelberg, 2006. Springer-Verlag.

[14] Dewan Md Farid, Nouria Harbi, Emna Bahri, Mohammad Zahidur Rahman, and Chowdhury Mofizur Rahman. Attacks classification in adaptive intrusion detection using decision tree. In International Conference on Computer Science (ICCS’10), Rio De Janeiro, Brazil, March 2010.

[15] Siva S. Sivatha Sindhu, S. Geetha, and A. Kannan. Decision tree based light weight intrusion detection using a wrapper approach. Expert Syst. Appl., 39(1):129–141, 2012.

[16] JingTao Yao, Songlun Zhao, and Lisa Fan. An enhanced support vector machine model for intrusion detection. In Proceedings of the First International Conference on Rough Sets and Knowledge Technology, pages 538–543. Springer-Verlag, 2006.

[17] Rung-Ching Chen, Kai-Fan Cheng, Ying-Hao Chen, and Chia-Fen Hsieh. Using rough set and support vector machine for network intrusion detection system. In Intelligent Information and Database Systems, 2009. ACIIDS 2009. First Asian Conference on, pages 465–470, April 2009.

[18] Snehal A. Mulay, P.R. Devale, and G.V. Garje. Intrusion detection system using support vector machine and decision tree. International Journal of Computer Applications, 3(3):40–43, 6 2010. Published By Foundation of Computer Science.

[19] Yogita B Bhavsar and Kalyani C Waghmare. Intrusion detection system using data mining technique: Support vector machine. International Journal of Emerging Technology and Advanced Engineering, 3(3):581–586, 2013.

[28] Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, and Ian H. Witten. The weka data mining software: an update. ACM SIGKDD Exploration Newsletter, 11:10–18, 2009.

[29] G. P. Zhang. Neural networks for classification: A survey. Trans. Sys. Man Cyber Part C, 30(4):451–462, November 2000.

[30] Guobin Ou and Yi Lu Murphey. Multi-class pattern classification using neural networks. Pattern Recogn., 40(1):4–18, January 2007.

[31] Rüdiger W. Brause. Medical analysis and diagnosis by neural networks. In Proceedings of the Second International Symposium on Medical Data Analysis, ISMDA ’01, pages 1–13, London, UK, 2001. Springer-Verlag.

[20] KDD Cup 1999 Data. http://kdd.ics.uci. edu/databases/kddcup99/kddcup99.html. Accessed: 2015-04-24.

[32] Philippe du Jardin. Predicting bankruptcy using neural networks and other classification methods: The influence of variable selection techniques on model accuracy. Neurocomputing, 73(10-12):2047–2060, June 2010.

[21] Mahbod Tavallaee, Ebrahim Bagheri, Wei Lu, and Ali A. Ghorbani. A detailed analysis of the kdd cup 99 data set. In Proceedings of the Second IEEE International Conference on Computational Intelligence for Security and Defense Applications, CISDA’09, pages 53–58, Piscataway, NJ, USA, 2009. IEEE Press.

[33] Dayashankar Singh, Maitreyee Dutta, and Sarvpal H. Singh. Neural network based handwritten hindi character recognition system. In Proceedings of the 2Nd Bangalore Annual Compute Conference, New York, NY, USA, 2009. ACM.

[22] Maheshkumar Sabhnani and Gursel Serpen. Why machine learning algorithms fail in misuse detection on KDD intrusion detection data set. Intelligent Data Analysis, 8(4):403–415, September 2004.

[34] Soni Chaturvedi, Rutika N. Titre, and Neha Sondhiya. Review of handwritten pattern recognition of digits and special characters using feed forward neural network and izhikevich neural model. In Proceedings of the 2014 International Conference on Electronic Systems, Signal Processing and Computing Technologies, pages 425– 428, Washington, DC, USA, 2014. IEEE Computer Society.

[23] N. Kayacik and M. Heywood. Selecting Features for Intrusion Detection: A Feature Relevance Analysis on KDD 99 Intrusion Detection Datasets. In The 3rd Annual Conference on Privacy, Security and Trust (PST), 2005.

[35] Dariusz Król and Boguslaw Szlachetko. Automatic image and speech recognition based on neural network. Journal of Information Technology Research (JITR), 3(2):1–17, April 2010.

[24] Jiawei Han, Micheline Kamber, and Jian Pei. Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 3rd edition, 2012.

[36] M. Norgaard, O. Ravn, Poulsen, and L. K. Hansen. Neural Networks for Modelling and Control of Dynamic Systems. Springer, London, 2000.

[25] L. Breiman, J. Friedman, C.J. Stone, and R.A. Olshen. Classification and Regression Trees. The Wadsworth and Brooks-Cole statisticsprobability series. Taylor & Francis, 1984.

[37] Heba Al-Hiary, Alaa Sheta, and Aladdin Ayesh. Identification of a chemical process reactor using soft computing techniques. In Proceedings of the 2008 International Conference on Fuzzy Systems (FUZZ2008) within the 2008 IEEE World Congress on Computational Intelligence (WCCI2008), Hong Kong, 1-6 June, pages 845– 653, 2008.

[26] Oded Maimon and Lior Rokach, editors. Data Mining and Knowledge Discovery Handbook, 2nd ed. Springer, 2010. [27] Ian H. Witten, Eibe Frank, and Mark A. Hall. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann Publishers Inc., 3rd edition, 2011.

[38] Andrew Ng. Cs229 lecture notes, Autumn 2014.

27


[39] Christopher Burges. A tutorial on support vector machines for pattern recognition. Data Min. Knowl. Discov., 2(2), 1998.

[49] Alaa Sheta, Sara Elsir M. Ahmed, and Hossam Faris. A comparison between regression, artificial neural networks and support vector machines for predicting stock market index. International Journal of Advanced Research in Artificial Intelligence (IJARAI), 4(4):55–63, 2015.

[40] Nello Cristianini and John Shawe-Taylor. An Introduction to Support Vector Machines: And Other Kernel-based Learning Methods. Cambridge University Press, New York, NY, USA, 2000.

[50] Bernhard E. Boser and et al. A training algorithm for optimal margin classifiers. In In Proceedings of the 5 th Annual ACM Workshop on Computational Learning Theory, pages 144–152. ACM Press, 1992.

[41] Christopher J. C. Burges. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2):121–167, June 1998.

[51] Corinna Cortes and Vladimir Vapnik. Supportvector networks. Mach. Learn., 20(3):273–297, September 1995.

[42] Vladimir Vapnik. Estimation of Dependences Based on Empirical Data: Springer Series in Statistics (Springer Series in Statistics). Springer-Verlag New York, Inc., 1982.

[52] Mark Hall. Correlation-based Feature Selection for Machine Learning. PhD thesis, University of Waikato, 1999.

[43] Ashis Pradhan. Support vector machines - a survey. International Journal of Emerging Technology and Advanced Engineering, 2(8), 2012.

[53] Hai Thanh Nguyen, Slobodan Petrović, and Katrin Franke. A comparison of feature-selection methods for intrusion detection. In Proceedings of the 5th International Conference on Mathematical Methods, Models and Architectures for Computer Network Security, MMM-ACNS’10, pages 242–255, Berlin, Heidelberg, 2010. SpringerVerlag.

[44] Latifur Khan, Mamoun Awad, and Bhavani Thuraisingham. A new intrusion detection system using support vector machines and hierarchical clustering. The VLDB Journal, 16(4):507–521, 2007.

[54] M. Dash and H. Liu. Feature selection for classification. Intelligent Data Analysis, 1:131–156, 1997.

[45] Jiaqi Jiang, Ru Li, Tianhong Zheng, Feiqin Su, and Haicheng Li. A new intrusion detection system using class and sample weighted c-support vector machine. In Proceedings of the 2011 Third International Conference on Communications and Mobile Computing, CMC ’11, pages 51–54, Washington, DC, USA, 2011. IEEE Computer Society.

[55] Aggarwal Megha and Amrita. Performance analysis of different feature selection methods in intrusion detection. International Journal of Scientific and Technology Research, 2(6), 2013. [56] Huan Liu and Lei Yu. Toward integrating feature selection algorithms for classification and clustering. IEEE Transactions on Knowledge and Data Engineering, 17:491–502, 2005.

[46] P. Kola Sujatha, C. Suba Priya, and A. Kannan. Network intrusion detection system using genetic network programming with support vector machine. In Proceedings of the International Conference on Advances in Computing, Communications and Informatics, ICACCI ’12, pages 645– 649, New York, NY, USA, 2012. ACM.

[57] Swati Sharma, Santosh Kumar, and Mandeep Kaur. Recent trend in intrusion detection using fuzzy-genetic algorithm. International Journal of Advanced Research in Computer and Communication Engineering, 3(5), 2014.

[47] Jayshree Jha and Leena Ragha. Intrusion detection system using support vector machine. IJAIS Proceedings on International Conference and workshop on Advanced Computing 2013, ICWAC(3):25–30, June 2013. Published by Foundation of Computer Science, New York, USA.

[58] Kuldeep Kumar and Ramkala Punia. Improving the performance of ids using genetic algorithm. International Journal of Computer Science and Communication, 4(2), 2013. [59] Mohammad Sazzadul Hoque, Md Abdul Mukit, and Md. Bikas. An implementation of intrusion detection system using genetic algorithm. International Journal of Network Security & Its Applications, 4(2), 2012.

[48] Wenjie Hu, Yihua Liao, and Rao Vemuri. Robust anomaly detection using support vector machines. In In Proceedings of the International Conference on Machine Learning. Morgan Kaufmann Publishers Inc, 2003.

28


Biographies

Alaa F. Sheta is currently a Professor at the Computers and Systems Department, Electronics Research Institute (ERI), Egypt. He received his PhD degree from the Computer Science Department, George Mason University, Fairfax, VA, USA in 1997. He received his B.E. and M.Sc. degrees in Electronics and Communication Engineering from the Faculty of Engineering, Cairo University in 1988 and 1994, respectively. His main research area is Evolutionary Computation, with a focus on Genetic Algorithms, Genetic Programming and their applications. He is also interested in Particle Swarm Optimization, Differential Evolution, Cuckoo Search, etc. Alaa Sheta has authored or co-authored over 100 publications in peer-reviewed international journals, proceedings of international conferences and book chapters. He is co-author of two books in the fields of Landmine Detection and Classification and Image Reconstruction of a Manufacturing Process. He is the co-editor of the book Business Intelligence and Performance Management - Theory, Systems and Industrial Applications, published by Springer Verlag, United Kingdom, in March 2013.

Amneh Alamleh is currently a Laboratory Assistant at Zarqa University, Jordan. She received her B.Sc. and M.Sc. degrees in Computer Science from Zarqa University in 2003 and 2015, respectively. Amneh is an M.Sc. candidate with the Computer Science Department, College of Computer Science and Information Technology, Zarqa University, Jordan. Her current research interests include Network Security, Artificial Neural Networks, Evolutionary Computation, Classification, Genetic Algorithms, Data Mining, and Software Engineering.



A DOS Attack Intrusion Detection and Inhibition Technique for Wireless Computer Networks

Mofreh Salem*, Amany Sarhan**, Mostafa Abu-Bakr*
*Computers and System Dept., Faculty of Engineering, Mansoura Univ., Egypt
**Computers and Automatic Control Dept., Faculty of Engineering, Tanta Univ., Egypt
[email protected]

Abstract
Security is one of the most important issues to be considered in Wireless Local Area Networks (WLANs). WLANs have many security weaknesses due to their nature. Many security techniques have been introduced to fix the known security bugs. However, many problems remain unsolved, such as Denial Of Service (DOS) attacks. In this paper, a new security technique is proposed that aims to detect DOS attacks in WLANs and to prevent the detected attackers from accessing the network in the future. The proposed technique uses an intruders' database (IDB), which it creates and updates each time an intruder is detected. The technique uses this database to inhibit intruders from bringing the network down by a DOS attack. The simulation results of the proposed technique measure the Probability of Denied Service (PDS) with respect to the number of attacks and the maximum number of connections that the access point allows. These results show the effectiveness of this technique in securing the WLAN against DOS attacks.

Keywords: DOS attack, WLAN, Security, Intrusion detection, IEEE 802.11i.

1. Introduction
Wireless networking technologies are increasingly penetrating everyday life [1, 2]. A wireless LAN (Local Area Network) is a type of local-area network that uses high-frequency radio waves rather than wires to communicate between nodes. Today, wireless LANs introduce the concept of complete mobility; communication is no longer limited to the infrastructure of wires. This provides new opportunities and new challenges, such as security. Security techniques for wireless computer networks are increasingly needed. There are many basic risks associated with WLANs, such as insertion attacks, interception and unauthorized monitoring of wireless traffic, jamming, and denial of service [3]. Many security techniques already exist, such as WEP, Virtual Private Networking (VPN), 802.1X, the Extensible Authentication Protocol (EAP) and Remote Authentication Dial In User Service (RADIUS). These techniques tried to solve the security bugs in WLANs. However, some security problems remain unsolved [4, 5, 6, 7]. In this work, we concentrate on one of these problems, the Denial Of Service (DOS) attack. Potential DOS attacks are a significant risk for any application where loss of wireless LAN access affects life, profits or reputation. For example, when the network is offered to the public in hot spots for surfing the Internet, a DOS attack decreases the profit of the firm that provides the service. DOS attacks can be single or distributed [8, 9].

The IEEE 802.11i protocol is the most likely candidate to become widely prevalent in corporate environments. Its low cost of entry is what makes it so attractive. However, inexpensive equipment also makes it easier for attackers to mount an attack. There has been a great deal of both research and commercial activity in wireless network security. The vulnerabilities associated with the IEEE 802.11i standards are now widely known and widely exploited. In addition to the security problems listed above, the protocol is vulnerable to a special type of DOS attack [4, 10, 11, 12]. This attack can occur if an attacker submits two packets with wrong MIC (Message Integrity Code) tags each second to the same access point. The access point then shuts down for a minute or more, assuming it is under attack. When it restarts, the attacker can repeat the attack, and so on, thus shutting down the access point for as long as the attacker wants. To mount such an attack, the attacker only needs a laptop or handheld computer with an 802.11b card and a little software.

In this paper, we examine integrating an intrusion detection and inhibition process with the current authentication level of IEEE 802.11i. The proposed technique tries to improve the security of the WLAN by introducing a means to detect and inhibit DOS attacks based on forged MIC tags. Like security and management systems for the wired network infrastructure, it attempts to monitor the traffic accessing the wireless network, detect intrusions, block attackers, and maintain a level of service to authorized clients. The main contribution of this paper is a solution to a special kind of DOS attack caused by sending two wrong-key packets to the access point. However, this technique can be combined with other techniques to defend against other types of DOS attacks, such as distributed or flooding DOS attacks.

The rest of this paper is organized as follows. Section 2 reviews the basics of wireless networks. Section 3 discusses wireless network security, including the problem of denial of service and the existing security techniques. Section 4 introduces the proposed intrusion detection and inhibition technique. The simulation results and analysis are presented in Section 5. Finally, Section 6 gives the conclusion of the proposed work.

2. Wireless Networks
Wireless LANs offer a quick and effective extension of a wired network or standard LAN. By simply installing access points on the wired network, personal computers and laptops equipped with wireless LAN cards can connect to the wired network [1, 2]. Over the past few years, the world has become more mobile. Traditional ways of networking have changed to accommodate new lifestyles and ways of working. Wireless networks offer several advantages over fixed (or wired) networks, with mobility, flexibility, ease and speed of deployment, and low cost at the top of the list. Large productivity gains are possible when developers, students, and professionals are able to access data on the move. Wireless networks are typically very flexible, which can translate into rapid deployment. Once the infrastructure is in place, adding new users is very simple [1]. Figure 1 shows an example of a wireless network.

Figure 1. A Wireless Network Configuration

Wireless mobile devices are in nearly universal use (at home, at work, on the road) to access the Internet and to provide data services in general. This leads to a growing expectation among users to be able to access information anytime and anywhere. Wireless networks have thus become increasingly important to satisfy the end-user demand for mobile and persistent computing. The benefits of WLANs, briefly, are ease of installation, ease of modification, mobility, portability, interconnectability, expandability, easy segmentation, and a broad range of coverage. Businesses are quickly deploying new networks without the cost and time of wiring offices and workstations. So, wireless access is a real business tool [2, 9]. Wireless networking is a means of providing ever-present computing, that is, computing at any time and in any place. Successful technologies, such as the IEEE 802.11i family of Wireless Local Area Network (WLAN) protocols and Bluetooth for Personal Area Networks (WPANs), are intended to provide the flexibility of forming networks in an ad hoc manner, without access to any infrastructure, or by extending a pre-existing network infrastructure, e.g. to provide access to a wired campus LAN and/or the Internet. Recent developments also foresee WLAN technologies as a complement to third-generation mobile communication networks. The use of a WLAN as a radio access network in hot spot areas is seen as an approach to increase network capacity and handle a larger number of users.

In this work, however, we are interested in the IEEE 802.11i family of WLANs. The family started in 1997 with the legacy IEEE 802.11, which operated at 2.4-2.5 GHz and had an outdoor range of 75 meters and a maximum data rate of 2 Mbit/s. It was followed in 1999 by IEEE 802.11a, which used a higher frequency range of 5.15-5.825 GHz and had a maximum data rate of 54 Mbit/s. IEEE 802.11b, also released in 1999, is an enhancement of 802.11 that supports 5.5 and 11 Mbit/s. These versions were followed by a series of 802.11x standards such as 802.11c, 802.11d, 802.11f, 802.11g, 802.11h, 802.11i, 802.11j, and so on. The last version of this series is 802.11y, which is expected to appear in March 2008 in the USA.

3. Wireless Network Security
Wireless networks have become one of the most interesting targets for hackers today. Organizations are deploying wireless technologies at a rapid rate, often without considering all security aspects. Because it is easy to tap into the wireless medium, eavesdropping and impersonation can be achieved relatively easily. Thus, privacy and anonymity (security) are always considered among the important features of communication networks. In fact, supporting those two features is still considered a main challenge of mobile and wireless networks.

3.1 Security threats in wireless network
Unlike a wired network, a WLAN sends data over the air and may be accessible outside the physical boundary of an organization. A WLAN is very susceptible to many security threats. The first is the interception and unauthorized monitoring of wireless traffic. When WLAN data is not encrypted, the packets can be viewed by anyone within radio frequency range. For example, a person with a Linux laptop, a WLAN adapter, and a packet-capture program such as tcpdump can receive, view, and store all packets circulating on a given WLAN. Another common problem is jamming the transmitter, which can make communications impossible. Consistently hammering an access point with access requests, whether successful or not, will eventually exhaust its available radio frequency spectrum and knock it off the network, denying its services to users. A denial of service (DOS) can also be caused by flooding other wireless clients with bogus packets, which denies service to these clients. In addition, duplicate IP (Internet Protocol) or MAC (Media Access Control) addresses, both intentional and accidental, can cause disruption on the network [3, 4, 5, 8, 13, 14].

The security impact of ad hoc WLANs is significant. Many wireless cards, including some shipped as a default item by PC manufacturers, support ad hoc mode. When adapters use ad hoc mode, any hacker with an adapter configured for ad hoc mode and using the same settings as the other adapters may gain unauthorized access to clients.

3.2 Denial of Service in wired and wireless networks
A Denial Of Service (DOS) attack is an attack whereby one computer or a group of loosely networked computers attempts to send too much information to a remote computer or server, such as a web server. In the context of a network of computers, DOS occurs when a particular system resource (e.g., application, operating system or routing services, communications or processing bandwidth, memory, queue position) is not available to legitimate users. A DOS attack floods the remote computer with so much traffic that it cannot handle normal, valid requests made by others. The attacker either consumes the resources, as in TCP SYN (Transmission Control Protocol synchronize) floods and ICMP (Internet Control Message Protocol) echo floods, or consumes the bandwidth, as in UDP (User Datagram Protocol) floods and ICMP floods. DOS attacks are hard to detect, as the remote computer cannot easily distinguish requests and traffic sent from the attacking machines from those sent by valid means. DOS may occur because of high legitimate demand or a non-malicious fault, but it mostly occurs because of hostile actions taken on the network itself, that is, a DOS attack [2, 13, 15].

DOS attacks are carried out in several basic ways. First, the attacker can send a single communication that the victim system cannot deal with, thus causing it to hang or crash (e.g., ping of death). Second, attackers can manipulate information critical to the operation of the network. Third, an attacker can send a continuous stream of communications that use up the victim system's resources, making them unavailable to legitimate users (flooding). Flooding attacks can be further classified into brute force attacks, which make an entire site unavailable because of congestion, and stateful resource attacks, which make a service unavailable by consuming resources needed by the service (e.g., memory, file descriptors, queue allocations).

Although it is a high priority, protecting against DOS attacks is an extremely difficult task.

3.3 DOS Attack in WLAN
In a WLAN, a DOS attack can also be caused by sending very strong radio signals using a very powerful transmitter at relatively close range. However, the use of a very strong radio signal is risky for the attacker, because the WLAN owners can locate the attacker with the AirMagnet tool. DOS attacks target many different layers of the wireless network [2, 9]:
•Application layer (~OSI layer 7): sending a large number of legitimate requests to an application.
•Transport layer (~OSI layer 4): sending many TCP connection requests to a host.
•Network layer (~OSI layer 3): sending a large amount of IP data to a network.
•Data link layer (~OSI layer 2): data link attacks are launched to disable the ability of hosts to access the local network. They target either a host or a network.
•Physical layer (~OSI layer 1): the wired example is the "backhoe attack", in which a heavy-equipment operator accidentally cuts a communication cable and potentially takes down services. In a WLAN, creating a device that produces a lot of noise at the 2.4 GHz frequency is both easy and cheap.

In addition to the common types of DOS attacks found in wired networks, a WLAN that uses the IEEE 802.11i protocol and its various versions is exposed to a new type of DOS attack. This new type can be caused by an attacker who simply submits two packets with a wrong key each second to the same access point. The access point considers this an attempt to intrude into the network and shuts down for a minute or more. When it comes back up, the attacker can repeat the attack as many times as he wishes, thus shutting down this access point for a long time and denying its service to legitimate users.

3.4 Existing Wireless Security Techniques
A number of security techniques exist for wireless computer networks [1, 3, 16, 17]. In order to provide data confidentiality equivalent to that of a wired network, the IEEE 802.11 Standard [17] originally defined Wired Equivalent Privacy (WEP). WEP is an encryption mechanism, included in the 802.11 standard, that is designed to secure data transmitted wirelessly. It uses the RC4 encryption algorithm, which is a stream cipher. It is used to protect wireless communication from eavesdropping and to prevent unauthorized access to a wireless network. The main weakness of WEP is that it uses a single, static shared key [15, 17]. Numerous studies have shown that none of data confidentiality, integrity, and authentication can be achieved through the WEP mechanism. First, the 40-bit shared key is too short to resist brute-force attacks. Though some vendors support a longer key (104 bits), it is still possible for an adversary to recover the plaintext traffic, because the static shared key results in a high probability of key stream reuse. This can also lead to IP redirection, reaction attacks, and inductive chosen plaintext attacks. Additionally, WEP does not implement any mechanism to prevent replay attacks.

Other solutions have been proposed to overcome the WEP problems. Virtual Private Network (VPN) technology has been used to secure communications among remote locations via the Internet [6]. When a WLAN client uses a VPN tunnel, communications data remains encrypted until it reaches the VPN gateway, which sits behind the wireless AP. Thus, intruders are effectively blocked from intercepting all network communications. The disadvantage of VPN security technology is that it is not self-managing; user credentials and VPN software must be distributed to each client.

The Wi-Fi Alliance proposed an interim solution, called Wi-Fi Protected Access (WPA), to ameliorate the vulnerabilities while reusing the legacy hardware. WPA is a standards-based security mechanism that eliminates most 802.11 security issues [7, 18]. WPA adopts the Temporal Key Integrity Protocol (TKIP) for data confidentiality, which still uses RC4 for data encryption but includes a key mixing function and an extended IV space to construct unrelated and fresh per-packet keys. WPA also introduces the Michael algorithm, a weak keyed Message Integrity Code (MIC), for improved data integrity under the limited computation power available in the devices. Furthermore, in order to detect replayed packets, WPA implements a packet sequencing mechanism by binding a monotonically increasing sequence number to each packet. In addition, WPA provides two improved authentication mechanisms. In one mechanism, the possession of a Pre-Shared Key (PSK) authenticates the peers; furthermore, a 128-bit encryption key and another distinct 64-bit MIC key can be derived from the PSK. WPA hinders attackers by periodically generating a unique encryption key for each client. WPA fixes all known problems with WEP, except denial-of-service (DOS) attacks [9, 15].

Alternatively, IEEE 802.1X and the Extensible Authentication Protocol (EAP) [6, 15] can be adopted to provide stronger authentication for each association and to generate a fresh common secret as part of the authentication process; all required keys can be derived from this shared secret afterwards. 802.1X represents a different approach to wireless security. It is known as port-based network access control. It uses the Extensible Authentication Protocol (EAP) and RADIUS to authenticate clients and distribute keys. However, if an attacker collects two ciphertexts that are encrypted with the same key stream, he can perform statistical attacks to recover the plaintext (Figure 2) [5, 17].
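The key stream reuse weakness mentioned above can be illustrated with a short sketch. It is not taken from the paper: the plaintexts are hypothetical, and a random byte string stands in for the RC4 key stream, since only the XOR structure of a stream cipher matters here. The sketch shows that two ciphertexts produced with the same key stream cancel to the XOR of the two plaintexts, so that knowing or guessing one plaintext reveals the other.

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical plaintexts of equal length, chosen only for illustration.
p1 = b"GET /index.html"
p2 = b"POST /login.php"

# Assumption: random bytes stand in for the stream cipher output.
# The same key stream is (wrongly) reused for both packets.
keystream = os.urandom(len(p1))

c1 = xor_bytes(p1, keystream)
c2 = xor_bytes(p2, keystream)

# The key stream cancels out: c1 XOR c2 equals p1 XOR p2.
assert xor_bytes(c1, c2) == xor_bytes(p1, p2)

# An attacker who knows (or guesses) p1 recovers p2 without the key.
recovered_p2 = xor_bytes(xor_bytes(c1, c2), p1)
assert recovered_p2 == p2
```

In practice the attacker does not know either plaintext in full, which is why the text speaks of statistical attacks; the sketch only shows why a reused key stream leaks information at all.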

Access Point

security administrators. The 802.11i standard for advanced security on all 802.11 networks would not prevent the DOS attacks. 802.11i is a solution for authentication of users and encryption of data. The 802.11i specification defines two classes of security algorithms: Robust Security Network Association (RSNA), and Pre-RSNA. Pre-RSNA security consists of Wired Equivalent Privacy (WEP) and 802.11 entity authentication. RSNA provides two data confidentiality protocols, called the Temporal Key Integrity Protocol (TKIP) and the Counter-mode/CBC-MAC Protocol (CCMP), and the RSNA establishment procedure, including 802.1X authentication and key management protocols [17]. The main problem of the WLAN that uses IEEE 802.11i is the Denial Of Service (DOS) attack [4, 17]. Most research solutions call for the encryption techniques and the various types of authentications. Others rely on a technique to detect the intruders. The focus of this paper is to examine how cooperative intrusion response technique can increase wireless security. A new technique is introduced that is capable of detecting and inhibiting the DOS attack threat in the WLAN.

4. The Proposed Intrusion Detection and Inhibition Technique The past researches have greatly focused on the data confidentiality, integrity, and mutual authentication for wireless security. However, the availability of the network resources has not been considered seriously although it is a major issue. It is always thought that the denial of service of the network is unsolved problem due to the numerous ways to bring the resources down. Therefore, if each of the problems is solved separately, then the whole solutions combined together, a real and effective DOS attack solution can be produced. In this work we focuse on the DOS attack problem in the WLAN that uses IEEE 802.11i protocol and its various versions. We aim to make the IEEE 802.1X authentication process stronger authentication by adding an additional step to it. First let us study the problem in a closer look. An intruder can bring the AP down if he sends two packets with wrong keys in the open authentication process. The IEEE protocol assumes that the network is under attack and automatically shuts down the AP and its connections. To detect such type of DOS attack, we build an intruders database in which we store all the detected intruders. In the following access sessions, this database is used to identify the intruders and thus inhibits them from gaining access to the network. The proposed technique can be considered as an additional step in the overall security system of the WLAN that precedes the authentication step (802.1x/EAP) as shown in Figure 3. The configuration of the WLAN under consideration is as shown in Figure 4. It consists of N remote computers and an access point. This configuration is called the infrastructure of Wi-Fi network. The main problem of this WLAN is that when using the IEEE 802.11i protocol, this protocol considers any wireless client that tries to submit two packets with the wrong key in one second as an intruder client and thus the WLAN is suspected to be

Figure 2. A complete 802.1x authentication session showing the EAP and RADIUS messages (supplicant side: EAP Req/Id, EAP Resp/Id, EAP Req 1 and EAP Resp 1 through EAP Resp N, EAP Succ/Failure, optional EAPOL Key; RADIUS side: RAD Acc Req, RAD Acc Chal, RAD Accept or RAD Reject)

The new IEEE 802.11i standard is much better: it provides both authentication and privacy. Authentication with 802.11i is built around the 802.1X protocol, used in conjunction with EAP (Extensible Authentication Protocol) and implemented using RADIUS authentication servers, which have been proven over many years of managing secure dial-up connectivity. 802.11i's privacy services are built on top of AES, a strong encryption standard that passes muster with even the most paranoid



under attack. Therefore, IEEE 802.11i shuts down all connections with all wireless clients for one minute in order to ensure that all connections are authorized.
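The shutdown behaviour described above can be illustrated with a minimal Python sketch. It models only what this paper states (two wrong-key packets from one client within one second trigger a one-minute shutdown); the class name, method names, and constants are assumptions made for illustration and are not taken from the standard or from the authors' simulator.

```python
from collections import defaultdict, deque

WINDOW_S = 1.0    # wrong-key window, as described in the text
LOCKOUT_S = 60.0  # shutdown duration, as described in the text


class CountermeasureModel:
    """Hypothetical model of the AP shutdown rule described in the paper."""

    def __init__(self):
        self.failures = defaultdict(deque)  # client -> wrong-key timestamps
        self.locked_until = 0.0

    def is_down(self, now):
        """True while the AP has all connections shut down."""
        return now < self.locked_until

    def wrong_key_packet(self, client, now):
        """Record a wrong-key packet; return True if it triggers a shutdown."""
        if self.is_down(now):
            return False
        recent = self.failures[client]
        recent.append(now)
        # Forget failures that fall outside the one-second window.
        while recent and now - recent[0] > WINDOW_S:
            recent.popleft()
        if len(recent) >= 2:
            self.locked_until = now + LOCKOUT_S
            recent.clear()
            return True
        return False


ap = CountermeasureModel()
ap.wrong_key_packet("intruder", now=0.0)              # first failure, no effect
triggered = ap.wrong_key_packet("intruder", now=0.5)  # second failure within 1 s
```

In this model a single client is enough to deny service to every other client for a minute, which is the weakness the proposed technique targets.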

of once. The two entries are first compared to ensure that they are identical, which reduces the chance that a mistyped entry is mistaken for an intrusion attempt. If the entries are identical, they are checked for authorization as discussed before. The steps of the proposed intrusion detection and inhibition technique are summarized as follows:
1- A client requests access to the WLAN.
2- The AP checks whether the client is in the IDB.
3- If the client exists in the IDB, the authentication process is ended and the request is rejected; otherwise, go to the next step.
4- The AP requires the authentication data twice from the client.
5- The AP verifies that the twice-entered authentication data are identical. If they are not, it allows the client to re-enter the authentication data.
6- If the twice-entered data are identical, the AP checks them against the data of the clients authorized to access the AP.
7- If the client is authorized, the AP completes the authentication process successfully.
8- If the client is not authorized, the AP exits the authentication process by rejecting the client and adds him to the IDB in order to prevent him from requesting access to the WLAN in the future.
Figure 5 shows the steps of the proposed intrusion detection and inhibition technique.
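The eight steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `AccessPoint` class, the `client_id` key (the paper checks the client IP against the IDB), and the returned status strings are all hypothetical.

```python
class AccessPoint:
    """Hypothetical AP that applies the IDB check before authentication."""

    def __init__(self, authorized):
        self.authorized = set(authorized)  # authentication data allowed to join
        self.idb = set()                   # intruders database (IDB)

    def request_access(self, client_id, first_entry, second_entry):
        # Steps 2-3: reject at once if the client is already in the IDB.
        if client_id in self.idb:
            return "rejected: in IDB"
        # Steps 4-5: the two entries must match, otherwise allow a retry.
        if first_entry != second_entry:
            return "retry: entries differ"
        # Steps 6-7: identical entries are checked for authorization.
        if first_entry in self.authorized:
            return "accepted"
        # Step 8: unauthorized client is rejected and remembered.
        self.idb.add(client_id)
        return "rejected: added to IDB"


ap = AccessPoint(authorized={"correct-key"})
outcomes = [
    ap.request_access("10.0.0.5", "correct-key", "correct-key"),
    ap.request_access("10.0.0.6", "corect-key", "correct-key"),
    ap.request_access("10.0.0.7", "guess", "guess"),
    ap.request_access("10.0.0.7", "correct-key", "correct-key"),
]
```

Note that a mistyped entry (second request) only leads to a retry, whereas a consistent but wrong entry (third request) places the client in the IDB, so its later request is refused even with correct data.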

Figure 3. The overall security system including DDI (stages shown: DDI technique, 802.1x/EAP, TKIP, MIC)

Based on that fact, we propose a technique that solves this problem, called the “intrusion detection and inhibition technique”. The technique has two phases: the inhibition phase and the detection phase. The inhibition phase uses an intruders’ database (IDB), which is created during the detection phase. In the inhibition phase, when a client submits a packet requesting access to the network, the client IP is first checked against the IDB. If the client is found in the IDB, the access point prevents him from sending another wrong-key packet within one second, and so prevents this intruding node from bringing the WLAN down and causing a denial of service.

Figure 4. WLAN Configuration

The idea of intrusion detection is simple and is based on the authentication step. When a wireless client requests access to the WLAN, the Access Point (AP), which has a database of the authorized clients, requires the authentication data from this client. When the wireless client provides the required authentication data, the access point checks their correctness. If the authentication data are not correct (i.e., not available in the authorized clients database), the wireless client is considered an intruder, is inhibited from gaining access to the WLAN, and is added to the intruder client database (IDB). Followed this way, the technique raises a problem: what if the wireless client has the right authentication data but mistakenly typed them wrong? This client would be denied access although he is a legitimate user. To solve this problem, we added another level of authentication by requiring a client who is not in the IDB to enter the authentication data twice instead

Figure 5. Intrusion Detection and Inhibition Technique (flowchart nodes: user requiring access; check in IDB; user enters authentication data twice; identical?; verification of the authentication data; authorized?; store in intruders database (IDB); reject the request; finish the authentication process successfully)



The following steps are performed to create a network simulation:
I- Creating the event scheduler.
II- Creating the network: This involves declaring nodes and defining links (either simplex or duplex) between these nodes. It also involves defining queue lengths and queue types, which include the Drop Tail queue, the RED queue, the CBQ, etc.
III- Creating connections: In this step either a TCP (Transmission Control Protocol) or a UDP (User Datagram Protocol) connection is created. Data sources and data sinks are defined along with the connections between the respective source-sink pairs. Sources are the nodes that generate the data and sinks are the nodes that receive it.
IV- Creating traffic: If a TCP connection is being simulated, either FTP (File Transfer Protocol) or Telnet traffic can be set up. If data traffic is being created over a UDP connection, either Constant Bit Rate (CBR) or exponential traffic can be generated.
V- Tracing: A command can be used to trace packets on all links or on specific links.
The visualization tools provided by the NS2 software suite are Xgraph and NAM (Network Animator) [19, 20]. The program that runs NS is written in the OTcl scripting language. The basic script initiates an event scheduler, sets up a network topology using the Network Component Objects, plumbs the network together using the functions in the Plumbing library, and coordinates the timing of the traffic sources (Figure 6).
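The event scheduler created in step I is the core of any discrete-event simulator. The following Python sketch illustrates the general idea only; it is not NS2 code (NS2 scripts are written in OTcl), and the `EventScheduler` class and its methods are hypothetical names chosen for this example.

```python
import heapq
import itertools


class EventScheduler:
    """Minimal discrete-event scheduler: runs callbacks in time order."""

    def __init__(self):
        self._queue = []
        self._order = itertools.count()  # tie-breaker for equal timestamps
        self.now = 0.0

    def schedule(self, time, action):
        """Register `action(time)` to fire at simulated time `time`."""
        heapq.heappush(self._queue, (time, next(self._order), action))

    def run(self):
        """Fire all events, earliest first, advancing the simulated clock."""
        while self._queue:
            self.now, _, action = heapq.heappop(self._queue)
            action(self.now)


log = []
sched = EventScheduler()
sched.schedule(2.0, lambda t: log.append((t, "packet B arrives")))
sched.schedule(1.0, lambda t: log.append((t, "packet A arrives")))
sched.run()
```

Events fire in timestamp order regardless of the order in which they were scheduled, which is what lets a traffic source be timed independently of the topology setup.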

5. Simulation Results and Analysis

In order to evaluate the proposed system, we must determine when the DOS attack will happen and the parameters that control it. The DOS attack happens in two cases. The first is when a client sends at least two packets using a wrong key in one second, as defined in WPA and in 802.11i. The second is when the maximum number of connections that the access point allows is not high enough to serve all connections, so that the extra clients are denied service. The most important parameter for assessing the efficiency of the security algorithm is the Probability of Denied Service (PDS), which measures the probability of denying service due to an external attack. PDS is a function of the attack rate, which is the number of attack requests per second (requests/sec), and of the total amount of resources, which is the maximum number of connections of an access point. In the simulation, the PDS is measured with respect to each parameter while the other parameters are kept constant. The proposed system should find the optimal parameter settings in order to make the probability of denied service as low as possible. There are many network simulation programs that can be reprogrammed to simulate the desired network, such as the OPNet Modeler simulator, INSANE, x-Sim, Comnet, and the NS simulator. MATLAB can also be used to simulate the wireless network by using the RF toolbox, which extends the MATLAB technical computing environment with functions and a graphical user interface (GUI) for working with, analyzing, and visualizing the behavior of RF components [19, 20]. In our simulation, we use the NS simulator (NS2) to simulate the wireless computer network. NS was chosen because it offers a rich infrastructure for developing simulations and is widely used in research projects. However, it is relatively difficult to set up and to develop with.
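The measurement procedure described above (vary one parameter, hold the others constant, record the PDS) can be sketched in Python. The paper does not give a formula for PDS, so this sketch assumes it is estimated as the fraction of denied requests; `placeholder_experiment` is a hypothetical stand-in for a simulator run and does not reproduce the results reported later in Figure 7.

```python
def probability_of_denied_service(denied, total):
    """Fraction of requests that were denied service (0.0 if no requests)."""
    return denied / total if total else 0.0


def sweep_attack_rate(run_experiment, attack_rates, max_connections):
    """Measure PDS for each attack rate while max_connections stays fixed.

    run_experiment(rate, max_connections) must return (denied, total).
    """
    return {
        rate: probability_of_denied_service(*run_experiment(rate, max_connections))
        for rate in attack_rates
    }


def placeholder_experiment(rate, max_connections):
    # Hypothetical stand-in for one simulator run: requests beyond the
    # connection limit are denied. Replace with real simulation output.
    return max(0, rate - max_connections), rate


results = sweep_attack_rate(placeholder_experiment, [20, 50, 100], max_connections=50)
```

Keeping the experiment as a pluggable function separates the sweep logic from the simulator, so the same loop could be reused to sweep the maximum number of connections instead.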


5.1 Building the NS Simulator

The NS simulator is a software package used to simulate wired and wireless networks. It supports simulations of wired networks (P2P links, LAN), wireless networks (ad hoc, cellular, GPRS, UMTS, WLAN, Bluetooth), and satellite networks. It is open software that implements various protocols at the packet level. It also has emulation and tracing capabilities. The simulator is written in C++ and uses OTcl (Object-oriented Tool Command Language) as a command and configuration interface. Figure 6 shows the duality of OTcl and C++. The simulator has two needs: it must perform detailed and efficient simulations of protocols, and it must let the user vary parameters or configurations and quickly see the change in the simulation output. The first need calls for a systems programming language such as C++, which handles bytes and packet headers and implements algorithms efficiently. For the second, iteration time is more important than run time, and this is what a scripting language such as Tcl provides [19, 20].

Figure 6. A User View of NS (components shown: OTcl script simulation program; OTcl interpreter with OO extension; NS simulator library with event scheduler objects, network component objects, and network setup helping (plumbing) modules; simulation results; NAM network animator)

The event scheduler is used by the network components that handle packets to issue an event for a packet. Essentially, a packet can be viewed as an event that triggers at a particular time. Simulation results are stored in files called 'trace files', which can be used for analysis of the simulated data. A graphical tool known as the Network Animator (NAM) uses these files automatically and gives a topological view of the network. Graphs of data throughput, data delays, data loss, jitter, and other parameters associated with the quality of the network can be produced from these data, for which the useful data needs to be sieved out from the large


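Figure 7. PDS as a Function of Attack Rate (panels 7-a, 7-b, and 7-c plot the probability of denied service against the attack rate in attacks/second, for maximum connections of 30, 50, and 70)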

The results of the proposed intrusion detection and inhibition technique show that the PDS is governed by the attack rate and by the total amount of resources (i.e., the maximum number of allowed connections). For attack rates under 40 attacks/sec, the proposed technique was capable of detecting and inhibiting all attacks on the system at all values of the maximum number of AP connections. As the attack rate increases, the PDS increases. However, when the maximum number of connections increases (to 50 and 70 connections), the technique is able to defeat attacks at higher rates. With a maximum of 70 connections, the PDS stays near zero up to 80 attacks per second and remains below 0.4 even if all clients are malicious. As the results show, the proposed technique enhances the security of the WLAN against DOS attacks. It is capable of detecting most (and in many cases all) of the attackers and preventing them from bringing the system down. To decrease the PDS at higher attack rates, we suggest increasing the maximum number of connections allowed by the access point.

6. Conclusion

In this paper, we introduce a security technique to enhance wireless computer security. It handles one of the problems that the WLAN suffers from: the Denial of Service (DOS) attack. It is based on creating an intruder database (IDB) that contains all intruder clients, in order to prevent them from sending two wrong-key packets in one second and thereby bringing down the WLAN and causing a DOS attack. To evaluate our approach, we measured the Probability of Denied Service (PDS) with respect to two factors that affect it: the number of attacks and the maximum number of connections that the access point allows. The results show that the PDS is decreased at different attack rates after using the proposed technique. In addition, increasing the maximum number of connections that the access point allows decreases the PDS for higher attack rates.


Trace File. This sieving can be done using C++ tools or scripting languages such as AWK [19, 20]. The wireless model essentially consists of the MobileNode at the core, with additional supporting features that allow simulations of multi-hop ad-hoc networks, wireless LANs, etc. The MobileNode object is a split object. The C++ class MobileNode is derived from the parent class Node. A MobileNode is thus the basic Node object with the added functionality of a wireless and mobile node, such as the ability to move within a given topology and to receive and transmit signals to and from a wireless channel. A major difference between them, though, is that a MobileNode is not connected by means of Links to other nodes or mobile nodes.

5.2 Results and Analysis

In order to analyze and evaluate the proposed technique, we developed a simulator using NS2. The goals of our simulation are to build the wireless computer network and to determine the performance of our security approach. We assume that the mobile nodes are fixed (i.e., the nodes do not move randomly) in order to measure the security approach. The number of attacks and the maximum number of connections that the access point allows are fundamental factors in the evaluation of the proposed technique. A number of experiments were conducted. In each experiment, the maximum number of connections of the access point is kept constant and the system is attacked at a varying attack rate (1, 2, 20, 100 requests/sec). The PDS is measured for each experiment, as shown in Figure 7.
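The trace-file sieving mentioned in Section 5.1 could also be done in Python. The sketch below assumes the classic 12-field NS2 wired trace layout (event, time, from node, to node, packet type, size, flags, flow id, source, destination, sequence number, packet id); wireless traces use a different layout, and the sample lines are hypothetical, not taken from the authors' experiments.

```python
from collections import Counter


def count_events(lines):
    """Count trace records by event type ('+', '-', 'r', 'd', ...)."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 12:  # skip blank or malformed records
            continue
        counts[fields[0]] += 1
    return counts


# Hypothetical records: '+' enqueue, '-' dequeue, 'r' receive, 'd' drop.
sample_trace = [
    "+ 0.10 0 1 cbr 210 ------- 0 0.0 1.0 0 0",
    "- 0.10 0 1 cbr 210 ------- 0 0.0 1.0 0 0",
    "r 0.12 0 1 cbr 210 ------- 0 0.0 1.0 0 0",
    "+ 0.20 0 1 cbr 210 ------- 0 0.0 1.0 1 1",
    "d 0.20 0 1 cbr 210 ------- 0 0.0 1.0 1 1",
]

counts = count_events(sample_trace)
drop_ratio = counts["d"] / counts["+"]  # dropped packets per enqueued packet
```

A ratio of this kind is one way the useful figures might be extracted from a large trace before plotting.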


7. References


[1] Min-ho Shin, Justin Ma, Arunesh Mishra and William Arbaugh. Wireless Network Security and Interworking. In proceedings of the IEEE, 2005.
[2] Sandeep K. Singhal. The Seven Deadly Sins of Wireless LANS. White paper, ReefEdge Inc., Dec 2001.
[3] Tiantong You, Chi-Hsiang Yeh, Hossam Hassanein. BROADEN: An Efficient Collision-Free MAC Protocols for Ad Hoc Wireless Networks. In proceedings of the 3rd International Workshop on Wireless Local Networks, LCN 2003, October 2003.
[4] William A. Arbaugh, Narendar Shankar, and Y.C. Justin Wan. Your 802.11 Wireless Network has No Clothes. IEEE Wireless Communications Magazine, Volume(9): 44 - 51, December 2002.
[5] Cisco SAFE. Wireless LAN Security in Depth. Cisco Systems, 2003.
[6] R. Housley and W. A. Arbaugh. WLAN Problems and Solutions. Communications of the ACM, Volume(46): 31 - 34, May 2003.
[7] Vesa Kärpijoki. Security in Ad Hoc Networks. Technical Report, Helsinki University of Technology, Telecommunications Software and Multimedia Laboratory, 2000.
[8] Marco Domenico, Aime Giorgio and Antonio Lioy. A Wireless Distributed Intrusion Detection System and a New Attack Model. In proceedings of the 11th IEEE Symposium on Computers and Communications, pages 35-40, Cagliari, Italy, June 2006.
[9] Chris Wullems, Kevin Tham, Jason Smith, and Mark Looi. Technical Summary of Denial of Service Attack against IEEE 802.11 DSSS based Wireless LAN's. Technical Report, Information Security Research Center, Queensland University of Technology, Brisbane, Australia, May 2004.
[10] Matthew S. Gast. 802.11 Wireless Networks: The Definitive Guide. O'Reilly & Associates, 2003.
[11] R. Gill, J. Smith, M. Looi and A. J. Clark. Passive Techniques for Detecting Session Hijacking Attacks in IEEE 802.11 Wireless Networks. In proceedings of Asia Pacific Information Technology Security Conference (AusCert 2005), pages 26-38. QUT Publications, May 2005.
[12] Changhua He, Mukund Sundararajan, Anupam Datta, Ante Derek, and John C. Mitchell. A Modular Correctness Proof of IEEE 802.11i and TLS. In proceedings of the 12th ACM Conference on Computer and Communications Security (CCS'05), November 2005.
[13] Wenyuan Xu, Wade Trappe, Yanyong Zhang and Timothy Wood. The Feasibility of Launching and Detecting Jamming Attacks in Wireless Networks. In proceedings of the 6th ACM international symposium on Mobile Ad-hoc Networking and Computing, Urbana-Champaign, IL, USA, pages 46 - 57, May 2005.
[14] Sheng Zhong, Jiang Chen, Yang Richard Yang. Sprite: A Simple, Cheat-Proof, Credit-Based System for Mobile Ad-Hoc. In proceedings of the IEEE INFOCOM, San Francisco, CA, April 2003.
[15] Wi-Fi Alliance. Securing Wi-Fi Wireless Networks with Today's Technologies. White paper, February 2003.
[16] Donna M. Gregg, William J. Blackert, David V. Heinbuch, and Donna C. Furnanage. Analyzing Denial of Service Attacks using Theory and Modeling and Simulation. In proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, June 2001.
[17] Arunesh Mishra, Nick L. Petroni, and William A. Arbaugh. Security Issues in IEEE 802.11 Wireless Local-Area Networks: A Survey. Wireless Communications and Mobile Computing Journal, vol. 4, no. 8, pp. 821-833, 2004.
[18] Yih-Chun Hu, Adrian Perrig, David B. Johnson. Packet Leashes: A Defense Against Wormhole Attacks in Wireless Networks. In proceedings of the IEEE INFOCOM, 2003.
[19] Kevin Fall, Kannan Varadhan, VINT Project. The NS Manual, 2003.
[20] Marc Greis, VINT group. NS Tutorial.

Biographies

Amany Sarhan received the B.Sc. degree in Electronics Engineering and the M.Sc. degree in Computer Science and Automatic Control from the Faculty of Engineering, Mansoura University, in 1990 and 1997, respectively. She was awarded the Ph.D. degree as joint research between Tanta University, Egypt, and the University of Connecticut, USA. She is now working as an Assistant Professor at the Computers and Automatic Control Department, Tanta University, Egypt. Her interests are in the areas of network security, software restructuring, object-oriented databases, fragmentation and allocation of databases, and distributed systems and computations.

Mofreh Salem is a full-time Professor at the Computer and Control Department, Mansoura University. He received the B.Sc. degree in electrical engineering from the Faculty of Engineering, Mansoura University, Egypt. He received his M.Sc. and Ph.D. degrees in computer science and control from the Departments of Electronics and Electrical Engineering, University of Glasgow, Glasgow, UK. He has many publications in computer networks, software engineering, AI, and distributed systems. His interests are in the areas of network security, mobile agents, pattern recognition, databases, and performance analysis.

Mostafa Abu-Bakr is currently working at Mansoura Power Station, Information Department. He graduated from Mansoura University, Faculty of Engineering, Computers and Control Department, in 2001. He received his M.Sc. degree in Computer Science from the Faculty of Engineering, Mansoura University, in 2005. His interests are in network security, WLAN management, and grid computing.



Identification of Information Systems Threats Sources: An Analytical Study
Ahmad Ali Al-Zubi
Computer Science Department, RCC, King Saud University, P.O. Box: 28095 – 11437 Riyadh-Saudi Arabia
[email protected]
http://faculty.ksu.edu.sa/drzubi

Abstract

Threats to information systems and cyber-based critical infrastructures are evolving and growing. These threats can be unintentional or intentional, targeted or non-targeted, and can come from a variety of sources, such as information warfare, criminals, hackers, virus writers, and disgruntled employees and contractors working within an organization. Moreover, these groups and individuals have a variety of attack techniques at their disposal, and cyber exploitation activity has grown more sophisticated, more targeted, and more serious. As government, private sector, and personal activities continue to move to networked operations, as digital systems add ever more capabilities, as wireless systems become more ubiquitous, and as the design, manufacture, and service of information technology move overseas, the threat will continue to grow. In the absence of robust security programs, agencies have experienced a wide range of incidents involving data loss or theft, computer intrusions, and privacy breaches, underscoring the need for improved security practices. These developments have led government officials to become increasingly concerned about the potential for a cyber attack. This article proposes a new approach for identifying the sources of threats so that action can be taken against them and information can be protected against loss or theft.

Keywords: Information Security, Threat Source, Vulnerability, Attack, Protection, Control, Information System.

1. Introduction

As information technology has advanced, companies and organizations have become more dependent on computerized information systems to perform their operations and to process, maintain, and report essential information. Virtually all operations in these organizations are supported by automated systems and electronic data, and organizations would find it difficult, if not impossible, to carry out their missions, deliver services to the public, and account for their resources without these information assets. Information security is therefore very important for organizations to ensure the confidentiality, integrity, and availability of their information and information systems [1, 2]. Before establishing a system for information security, it is necessary to assess which threats are most relevant; threat assessment is therefore an essential component of an information security risk evaluation. In order to prioritize vulnerabilities for remediation and to evaluate existing controls, a thorough understanding of potential threat sources is required; this activity is a prerequisite for a comprehensive information security system and a stated regulatory requirement. This article explores key issues related to threat assessment, including essential elements, methodologies, and common pitfalls in information security. People have become used to the idea that solid information security is complicated by nature. However, the organization of information security should not only be complex; it should also be based on a deep analysis of possible negative effects. It is important not to miss any significant aspect. More and more ineffective methods of information protection are becoming obsolete and are no longer used. Today a comprehensive information security system is not just a set of tools but also a set of measures aimed at preventing the loss of information. Companies no longer want to spend their resources for nothing; they want to buy only what they really need to build a reliable system for protecting their valuable information at the lowest cost. To achieve this, they need to be aware of the nature of potential dangers. As the saying goes, "finding the cause of evil is almost equal to finding a medicine against it". For example, is it necessary to load the Personal Computer (PC) of an organization's manager with tools that protect against Unauthorized Access (UA) if it is in a separate, locked room surrounded by guards?

On the other hand, we can also consider how to study the danger. We could, for example, deploy new scanners, firewalls, and VPNs for every new threat. The result would of course be reliable, proven protection with which the business can sleep peacefully, provided that after all this it still has resources left to continue its economic activity. One can learn from others' mistakes. But, firstly, people are not used to learning from others' mistakes, and secondly, who will tell about his own mistakes? What remains is first to examine all possible options and then to select those most applicable to the specific case. Here again there is an alternative: either use an accumulated data bank that contains information about threats that have already occurred (without being sure that all threat options are included), or try to create a methodological tool for determining possible forms of threats, based on studying all influencing factors and taking into consideration even the most unlikely ones. We can recall, for example, that in the United States armored cockpit doors were installed in civilian airplanes only after the tragic events of September 11, 2001 [1, 2].

The remainder of the paper is organized as follows: section (2) focuses on previous work in this field, section (3) presents the analysis of the problem, section (4) emphasizes the proposed approach for analysis and evaluation of information security, section (5) contains the results of this paper, section (6) presents the conclusion of this work, and finally the paper ends with the list of references.

2. Previous Works

However, we do not claim that no methodologies for risk analysis exist. For example, our university has a special center whose major focus is information security, and we have carefully studied the experience of our colleagues there regarding this issue. But all existing techniques to date provide only a qualitative assessment. Examples include the Guide to BS 7799 Risk Assessment and Risk Management (DISC, PD 3002, 1998) and the Guide to BS 7799 Auditing (DISC, PD 3004, 1998), which are based on the BS 7799 and ISO/IEC 17799-00 standards. In these techniques, information security assessment is usually conducted at 10 key control points, which are either mandatory (required by law) or considered the main structural elements of information security (for example, learning the rules of safety) [1, 2, 3]. These control points are applicable to all organizations and include:
• A document on information security policy;
• Allocation of responsibilities for information security;
• Educating and training staff to maintain information security;
• Notification of cases of protection violation;
• Protection against viruses;
• Planning for the business continuity of the organization;
• Copyright control of software copying;
• Protection of the assets and documentation of the organization;
• Data protection;
• Compliance monitoring of the security policy.
The procedure for controlling the security of an Information System (IS) includes checking the availability of these key points, evaluating the completeness and correctness of their implementation, and analyzing their adequacy to the existing risks. This approach can give an answer only at the level of "this is good or this is bad"; it cannot answer questions such as "How good?", "How bad?", or "How critical?". There is therefore a pressing need today for a technique that gives the administration a quantitative result and a complete picture of the situation, backing with figures the recommendations of the specialists responsible for information security assurance in the organization. Let us see what forms the basis of this technique [4].

3. Problem Analysis

Selfishness rules the world, and therefore all activities of organizations regarding information security assurance are aimed only at preventing the losses that result from the loss of confidential information [3, 5]. This already implies the existence of valuable information whose loss may cause the organization significant damage. By simple logical inference, we obtain the following chain:

Threat Source - Factor (Vulnerability) - Threat (Action) - Implications (Attack)

Today threats to information security are usually identified either by the nature (type, method) of their destabilizing impact on information or by the consequences (results) of such exposure. However, practice shows that complex terms of this kind can have a large number of interpretations. Another possible approach to identifying threats to information security is based on the concept of "threat" itself. A "threat" is the intention to cause physical, material, or other damage to public or private interests; it is a possible danger. In other words, the concept of threat is tightly linked to the legal category of "harm": the actual expenses incurred by an entity as a result of a violation of its rights (for example, disclosure or misuse of confidential information by a violator), damage to or loss of property, and the expenses that must be incurred to restore the violated rights and the value of damaged or lost items. Analysis of the negative impacts of threats requires mandatory identification of the possible sources of threats, the vulnerabilities that contribute to their occurrence, and the methods of implementation. The chain then grows into the scheme presented in Figure 1.

Figure 1: Threats Implementation Model of Information Security

During analysis we should make sure that all possible sources of threats and vulnerabilities are identified and matched against each other, and that all identified sources of threats and vulnerabilities (factors) are matched against the methods of implementation. It is important to be able, if necessary, to introduce new sources of threats, methods of implementation, and vulnerabilities without changing the methodological tools, as they become known through the development of knowledge in this field.

3.1. Threats Categorization

Threats are categorized according to the possibility of damage they cause to the subject of relations with the violation of security purposes. Damage can be caused by any entity (offense, fault or negligence), as well as become a consequence, independent of the expression entity. Threats are not so many: x While assuring the confidentiality of information: o Theft (copying) of information and its processing tools. o Loss (unintended loss, leakage) of information. x While assuring the integrity of the information: o Modification (distortion) of information. o Denial of the authenticity of the information. o Imposition of false information. x While assuring accessibility of information:

41


o o

3.2.

the threat. In case of such mismatch, attack is considered as a phase of preparing for implementing the threat, i.e., as "preparing to perform” an illegal act. The result of the attack is the consequences, which are considered as the implementation of the threat and / or contribute to such realization.

Blocking information . Destruction of information and its processing tools.

Classification of Threats Sources

All sources of threats can be divided into classes, depending on the type of media, and classes also divided into groups depending on their location as shown Figure 2.

Figure 4: Classification Structure of Implementation Methods

Figure 2: Classification Structure of Threats Source

3.3. Classes of Vulnerabilities

Vulnerabilities can also be divided into classes depending on their belonging to which source of vulnerabilities, and classes also divided into groups and subgroups depending on manifestations as presented in Figure 3.

4. Suggested Approach

The approach itself for analysis and evaluation of information security based on the calculation of weights of risk coefficients for threats sources and vulnerabilities, comparing these coefficients with predefined criteria and progressive reduction of the list of possible sources of threats and vulnerabilities to the minimum, that is acceptable for a given object. Initial data for the assessment and analysis shown in Figure 5 is the result of questionnaire of subjects, aimed for clarification of the focus of its activities, alleged priorities of security goals, tasks undertaken by the Computerized System (CS) and the conditions of the location and exploitation of the object. Through this approach may: x Set priorities for security purposes relations subject; x Define a list of actual sources of threats; x Determine the list of actual vulnerabilities; x Evaluate the relationship between threats, sources of threats and vulnerabilities; x Determine the list of possible attacks on the object; x Describe the possible consequences of threats.

Figure 3: Classification Structure of Vulnerabilities

3.4. Implementation Methods

Methods of implementation can be divided into groups by the means of implementation (see Figure 4). Note that the concept of a "method" itself applies only to the implementation of threats by anthropogenic sources. For non-anthropogenic sources (for example, natural ones), the concept is transformed into that of a "prerequisite". The classification of the feasibility of threats (attacks) is the set of possible actions of a threat source that, using certain methods of implementation and a vulnerability, lead to the realization of the objectives of the attack. The target of the attack may not coincide with the target of the threat implementation; it may instead be aimed at obtaining an intermediate result that is necessary for the further implementation of



Figure 5: Threats Evaluation Algorithm

4.1. Method of Threat Analysis

Table 1: Normalized weight coefficients of threats

No.  Threats to Information Security                                         (CW)k
1    Theft (copying) of information and its processing tools                 0.19
2    Destruction of information and its processing tools                     0.17
3    Modification (distortion) of information                                0.14
4    Creating conditions for implementation of information security threats  0.14
5    Locking information                                                     0.12
6    Loss (leakage) of information and its processing tools                  0.10
7    Denial of the authenticity of information                               0.08
8    Imposition of false information                                         0.06

4.2. Selecting target priorities of Information Security Assurance (ISA)

The initial data for threat analysis are obtained by surveying the subjects concerned with this issue and by analysing the problems to be solved in the CS.

- Privacy
- Integrity
- Accessibility

Priorities for the objectives show the main directions of information security and are used in the further analysis and evaluation of threat relevance.

4.3. Threats Ranking

The coefficient of actuality of each threat, (CA)k, is calculated from the selected coefficients of priority (CP)r and the normalized threat weights (CW)k (Table 1). In the case of equal target priorities,

(CP)P = (CP)I = (CP)A,

we can confirm that

(CW)k = (CR)k

objective circumstances and cannot be excluded under any circumstances:

- Side radiation from elements of technical tools.
- Radiation from cables and hardware.
- Pickup of electromagnetic radiation on lines and conductors.
- Signal leakage through power supply and grounding circuits.
- Acoustic and vibro-acoustic radiation.
- Technology failures and faults.

4.4. Ranking of threats sources

The initial data for ranking the sources of threats are the results of an expert-analytical assessment of the indirect indicators (kn)i:

- The possibility of a threat source, (k1)i.
- The readiness of a threat source, (k2)i.
- The fatality of a threat source, (k3)i.

Ranking the sources of threats enables us to assess their danger, which is afterwards used to analyze and identify the actual sources of threats.
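A ranking based on the three indirect indicators could look like the following Python sketch. The paper does not give the combination rule in this passage, so the rule used here is an assumption made purely for illustration: each indicator is scored by experts on a 1 to 5 scale and the danger of a source is the product of the three scores normalized to the range (0, 1]. The class name, the scale and the example scores are hypothetical.

```python
from dataclasses import dataclass

MAX_SCORE = 5  # assumed upper bound of each indicator's expert scale


@dataclass(frozen=True)
class ThreatSource:
    name: str
    k1: int  # possibility of the threat source
    k2: int  # readiness of the threat source
    k3: int  # fatality of the threat source

    def danger(self) -> float:
        """Assumed rule: product of the indicators, normalized to (0, 1]."""
        return (self.k1 * self.k2 * self.k3) / MAX_SCORE ** 3


def rank_sources(sources):
    """Return the sources ordered from most to least dangerous."""
    return sorted(sources, key=lambda s: s.danger(), reverse=True)


# Hypothetical expert scores, for illustration only.
candidates = [
    ThreatSource("insider", k1=4, k2=3, k3=5),
    ThreatSource("external intruder", k1=5, k2=4, k3=4),
    ThreatSource("natural disaster", k1=1, k2=1, k3=5),
]

ranking = rank_sources(candidates)
for source in ranking:
    print(f"{source.name}: {source.danger():.2f}")
```

A multiplicative rule means that a source with a very low score on any one indicator ends up near the bottom of the ranking; an additive rule would behave differently, so the choice should follow the method actually adopted.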

4.5. Actual sources of threats

Sources with risk coefficient (CR)i

4.8. Description of Attacks Consequences

The description of the attack effects gives a picture of the state of information security and can be written as follows: "The object can be attacked by ... [list of possible threats], performed by ... [list of actual sources of threats], using ... [list of possible methods to perform the threats], through ... [list of actual vulnerabilities], which leads to ... [description of the potential consequences]."

2000$ >10000$ --- Cost

Autonomy of IDSUDA agents is one of the important features resulting from the architecture, which gives each agent all the functionality required to work as an independent intrusion detection system. Besides autonomy, IDSUDA has a number of other important features:

- There is no single point of failure, unlike systems that depend on a centralized manager.
- IDSUDA imposes less overhead on the network because it does not need the sophisticated control protocols required to control and coordinate agents of different types and levels (IDSUDA agents are homogeneous, i.e., of the same type).
- Security policies can be applied anywhere on the network.
- IDSUDA enables administrators to specify security policies according to the needs and nature of a specific network segment or host.

Since misuse detection alone is not enough to detect ever-changing attacks, IDSUDA employs both intrusion detection techniques, misuse detection and anomaly detection. In this way IDSUDA is capable of detecting both known and unknown attacks. This also provides the flexibility required to define and apply the organization's security requirements.
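The idea of a self-contained agent that combines misuse and anomaly detection can be illustrated with a small Python sketch. It is not the IDSUDA implementation: the class, the substring-based signature matching, the choice of a traffic rate as the monitored feature and the z-score limit are all assumptions introduced for illustration.

```python
from statistics import mean, pstdev


class Agent:
    """Illustrative homogeneous agent; not the IDSUDA implementation."""

    def __init__(self, signatures, baseline, z_limit=3.0):
        self.signatures = list(signatures)    # misuse rules: known bad patterns
        self.mu = mean(baseline)              # anomaly model: baseline mean
        self.sigma = pstdev(baseline) or 1.0  # fall back to 1.0 if no variation
        self.z_limit = z_limit                # assumed alert threshold

    def misuse_alert(self, payload):
        """Misuse detection: match the payload against known signatures."""
        return any(sig in payload for sig in self.signatures)

    def anomaly_alert(self, rate):
        """Anomaly detection: flag rates far from the learned baseline."""
        return abs(rate - self.mu) / self.sigma > self.z_limit

    def inspect(self, payload, rate):
        """Run both detectors locally and return the alerts raised."""
        alerts = []
        if self.misuse_alert(payload):
            alerts.append("misuse")
        if self.anomaly_alert(rate):
            alerts.append("anomaly")
        return alerts


# Hypothetical signature and baseline request rates, for illustration only.
agent = Agent(signatures=["/etc/passwd"], baseline=[10, 12, 11, 9, 10, 11])

print(agent.inspect("GET /index.html", 11))        # normal traffic
print(agent.inspect("GET /../../etc/passwd", 10))  # known attack pattern
print(agent.inspect("GET /index.html", 50))        # unusual request rate
```

Because every agent carries both detectors and its own rules, any number of identical agents can be deployed without a central manager, which is the property the feature list above emphasizes.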

Table 4: Comparison between IDSUDA and two Commercial IDSs

Both IDSUDA and Centrax employ anomaly detection in addition to misuse detection, which helps to detect unknown attacks, while RealSecure depends only on misuse detection. The remaining features, such as data encryption, user authentication, and response types, exist in all three compared IDSs.

The problems and limitations encountered in intrusion detection systems indicate that the best place to collect data about the behavior of a program or system is the point where the data is generated or used. Hence, the monitors built into IDSUDA agents are designed to monitor a number of important system resources at run time.

Finally, the IDSUDA implementation is promising and offers competitive features in addition to the common features found in several commercial products.

Despite the overhead imposed by IDSUDA agents on their hosts, the advantages of having IDSUDA agents protect networks and hosts are substantial. The overhead could be decreased in two ways. First, a compromise between the number of rules and their generality will yield a relatively reasonable overhead and good performance. Second, system administrators may assign some agents to monitor the network and other agents to monitor hosts. This flexibility is one of the important features of IDSUDA.

6.4 IDSUDA disadvantages

The following are the shortcomings of IDSUDA:

- Lack of network session analysis: some remote attacks cannot be detected by inspecting single packets but could be detected by analyzing full network sessions.
- Performance: IDSUDA agents may impose overhead on their hosts as the number of rules increases and as data is transmitted over the network at high rates.
- Intrusion detection rules can be added through the user interface only; there is no ability to add scripted rules.

Finally, IDSUDA provides a promising solution for the intrusion detection field. It is built as a framework whose capabilities are extensible and could be enhanced to meet future challenges.

7. Conclusion

The primary goal of the framework presented in this paper is to provide a solution for the intrusion detection

As future work, we will try to address the IDSUDA shortcomings described in Section 6.4.

