
Evaluating Machine Learning Algorithms for Detection of Interest Flooding Attack in Named Data Networking

Naveen Kumar

CSED, MNNIT Allahabad Allahabad - 211004, U. P., India [email protected]

Ashutosh Kumar Singh

CSED, MNNIT Allahabad Allahabad - 211004, U. P., India [email protected]

ABSTRACT Named Data Networking (NDN) is one of the most promising data-centric network architectures. NDN is resilient to most of the attacks that are possible in the TCP/IP stack. However, because its architecture differs from that of TCP/IP, NDN is prone to new types of attack, such as the Interest Flooding Attack (IFA), Cache Privacy Attack, Cache Pollution Attack, and Content Poisoning Attack. In this paper, we discuss the detection of IFA. First, we model IFA on a linear topology using the ndnSIM and CCNx code bases. We select the most promising features among all considered features and then apply different machine learning techniques to detect the attack. We show that the attack detection results for simulation and implementation are almost the same. We also model IFA on a DFN topology and compare the results of different machine learning approaches.

CCS CONCEPTS • Security and privacy → Denial-of-service attacks;

KEYWORDS Named Data Network, NDN, Information Centric Networking, ICN, Security in NDN, Interest Flooding Attack, IFA

ACM Reference Format: Naveen Kumar, Ashutosh Kumar Singh, and Shashank Srivastava. 2017. Evaluating Machine Learning Algorithms for Detection of Interest Flooding Attack in Named Data Networking. In Proceedings of SIN ’17, Jaipur, IN, India, October 13–15, 2017, 4 pages. https://doi.org/10.1145/3136825.3136864

1 INTRODUCTION

Present solutions such as Secure Sockets Layer (SSL) try to secure the communicating endpoints (source and destination), but the data itself is not secured. The main goal of NDN is “security by design”. NDN requires each data item to be signed by its producer, thus ensuring the integrity and provenance of the data. However, NDN has no defense against DDoS attacks. A DDoS attack in NDN occurs when an attacker sends a large number of Interest packets for non-existent content. These requests are stored in the Pending Interest Table (PIT) of the intermediate NDN routers, where the entries remain until they time out. Due to IFA, legitimate requests from normal users do not get space in the PITs of NDN routers. Since this attack occurs when the PITs of routers become full, the term Interest Flooding Attack was coined to refer to this type of attack.

We modeled IFA on a linear topology using the CCNx code base and ndnSIM [2]. We selected six prominent features out of nine for the detection of IFA. We then modeled IFA on a DFN topology and performed attack detection on data collected from the simulator. This paper is organized as follows. Section 2 describes the related work on IFA. Section 3 describes IFA in NDN. Section 4 presents the experimental setup, attack detection, and analysis of the results. Section 5 concludes with the future scope of this work.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s). SIN ’17, October 13–15, 2017, Jaipur, IN, India © 2017 Copyright held by the owner/author(s). ACM ISBN 978-1-4503-5303-8/17/10. https://doi.org/10.1145/3136825.3136864

Shashank Srivastava

CSED, MNNIT Allahabad Allahabad - 211004, U. P., India [email protected]

2 RELATED WORK

Compagno et al. [4] proposed Poseidon, a framework for the detection and mitigation of IFA. In this approach, a filter is applied to each interface, so legitimate incoming requests are affected as well. Afanasyev et al. [1] apply a probability-based filter on the malicious interface, so legitimate Interests may also suffer. Dai et al. [7] use PIT size as a metric to detect the attack: when it exceeds a predefined threshold, the router applies a traceback algorithm. This scheme does not work for attacks that use a random prefix. Karami et al. proposed a multi-objective RBF-PSO method for the detection of IFA in NDN, which uses 12 features; however, they did not perform a feature analysis of the detection parameters. Most of the above approaches were evaluated in a simulation environment and apply to either IFA using a random prefix or IFA using an existing prefix. We implemented IFA using the CCNx code base and compared the detection results with those of the simulation environment. We then modeled IFA on a DFN topology and performed attack detection on data collected from the simulator.

3 INTEREST FLOODING ATTACK IN NDN

In an IFA, one or more attackers request non-existent content. The resulting Interests are stored in the PIT until they time out, and as a result legitimate users' requests are dropped by the router. We classify IFA into two categories based on the Interest packets used for the attack: IFA using an existing prefix and IFA using a non-existent prefix. In the first type of attack, the attacker concatenates an existing prefix with a random string. Since these Interests carry an existing prefix, they are forwarded to the server that owns the prefix, following the path between the attacker and that server. Entries are thus created in the PITs of the intermediate routers. Since there is no data packet corresponding to these Interests, the server drops them. In the second type of attack, the attacker simply uses a random string as the Interest name. Receiving routers broadcast these Interests, so this attack is more severe than the first type: it affects the network as a whole rather than a particular server.

Figure 1: Effect of attack on different detection parameters. [Panels: InData, OutInterests, InInterests, InSatisfiedInterests, SatisfiedInterests, OutData, PITsize, TimedOutInterests, OutSatisfiedInterests, each plotted against time in ms with legend entries 2x, 4x, 8x; one panel is plotted against simulation time in seconds with legend entries 200-500, 200-600, 200-700.]
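As a rough illustration of how the attack described in this section exhausts the PIT, the sketch below models a single router with a bounded table. It is not the ndnSIM or CCNx implementation: the class `ToyPIT`, the function `run`, the capacity of 50 entries, and the one-second ticks are all hypothetical choices made for the example; only the 4 s lifetime echoes the setup in Section 4.1.

```python
from collections import OrderedDict


class ToyPIT:
    """Bounded Pending Interest Table: name -> expiry time in seconds."""

    def __init__(self, capacity, lifetime):
        self.capacity = capacity      # maximum number of pending entries (assumed)
        self.lifetime = lifetime      # Interest lifetime in seconds
        self.entries = OrderedDict()

    def expire(self, now):
        # Remove entries whose lifetime has elapsed.
        for name in [n for n, t in self.entries.items() if t <= now]:
            del self.entries[name]

    def on_interest(self, name, now):
        """Return True if the Interest is accepted, False if it is dropped."""
        self.expire(now)
        if name in self.entries:
            return True               # aggregated with the pending entry
        if len(self.entries) >= self.capacity:
            return False              # PIT is full: the Interest is dropped
        self.entries[name] = now + self.lifetime
        return True

    def on_data(self, name):
        # A returning Data packet consumes the pending entry.
        self.entries.pop(name, None)


def run(attack_rate, steps=20, capacity=50, lifetime=4.0):
    """Count legitimate Interests dropped over `steps` one-second ticks."""
    pit = ToyPIT(capacity, lifetime)
    dropped = 0
    for t in range(steps):
        now = float(t)
        # Attacker: unique names under an existing prefix; no Data ever returns,
        # so every accepted entry stays until it times out.
        for i in range(attack_rate):
            pit.on_interest(f"/prefix/fake-{t}-{i}", now)
        # Legitimate consumer: one Interest per tick, satisfied immediately.
        name = f"/prefix/item-{t}"
        if pit.on_interest(name, now):
            pit.on_data(name)
        else:
            dropped += 1
    return dropped


if __name__ == "__main__":
    for rate in (0, 5, 20):
        print(rate, run(rate))
```

With these toy parameters, a low attack rate never fills the table, while a rate above capacity divided by lifetime keeps it full, and the consumer's Interests are dropped for as long as the attack lasts.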

4 RESULT AND EXPERIMENTAL SETUP

4.1 Attack Detection on linear topology using ndnSIM

We used the ns-3 based ndnSIM simulator to model the NDN network, perform IFA, and collect data. The relative positions of attackers, publishers, and normal consumers are given in Figure 2. We configured the PIT size to 120 KB and the Interest expiration time to 4 s. For every link, we set the queue length and delay to 400 and 10 ms, respectively. The replacement policy for the PIT is LRU. The capacity of the CS of each node is 1000 contents, and the replacement policy for the CS is also LRU. The other system settings are given in Table 1. We ran the simulation for 600 seconds of simulation time and collected the data corresponding to the traffic it generated. This data is preprocessed so that it can be used for off-line detection.

Figure 2: Linear Topology

4.1.1 Feature Selection. We analysed nine different parameters for the detection of IFA in NDN. We performed one simulation for each of the nine combinations of consumer frequency (200, 300, 400) and attacker frequency (500, 600, 700). Keeping the consumer frequency fixed at 200 and varying the attacker frequency from 500 to 700, we plotted each parameter as shown in Figure 1. The attacker is active between 160 and 375 seconds of simulation time, and the effect of the attack is visible on each parameter. Out of these nine features, we chose six prominent features, as shown in Table 2.

Table 1: Network parameters considered for modeling linear topology

Node       Distribution  Pattern  Frequency  Run time
Consumer1  Randomize     Uniform  200        0-600
Consumer2  Randomize     Uniform  200        50-550
Attacker1  Randomize     Uniform  700        160-375
Attacker2  Randomize     Uniform  700        160-375

4.1.2 Attack Detection. We applied several detection approaches to the collected dataset. The first is a Multilayer Perceptron with Back propagation (MLP with BP) [8]. The second is a Radial Basis Function (RBF) [3] network, in which the centroids are computed using the k-means clustering algorithm and the spread of the RBF function of each hidden node is taken as the mean Euclidean distance of all data points from the centroid. The weights between the hidden and output layers are optimised using a single-objective optimisation algorithm: Particle Swarm Optimisation (PSO) [9], JAYA [10], or Teaching Learning Based Optimisation (TLBO) [11]. We also applied a Linear Support Vector Machine (SVM) [5] and the Fine k-nearest neighbors (KNN) [6] algorithm. We then compared these detection approaches using the classification metrics Accuracy, Precision, Sensitivity (Recall), and Specificity, evaluated with ten-fold cross-validation.
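The construction of the RBF hidden layer described above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the code used in the experiments: the function names are hypothetical, k-means is initialised deterministically from the first k points, and a Gaussian basis function with denominator 2σ² is assumed, since the text does not specify the basis function. Fitting the output weights with PSO, JAYA, or TLBO is left out.

```python
import math


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=20):
    """Plain k-means; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def spreads(points, centroids):
    """Spread of each hidden node: mean distance of all data points from its centroid."""
    return [sum(dist(p, c) for p in points) / len(points) for c in centroids]


def hidden_layer(x, centroids, sigmas):
    """Gaussian RBF activations of one input vector (basis function is assumed)."""
    return [math.exp(-dist(x, c) ** 2 / (2 * s ** 2))
            for c, s in zip(centroids, sigmas)]


if __name__ == "__main__":
    # Toy two-feature samples; real inputs would be the six features of Table 2.
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    centers = kmeans(data, k=2)
    sigmas = spreads(data, centers)
    print(centers, sigmas, hidden_layer(data[0], centers, sigmas))
```

The activations returned by `hidden_layer` form the input to the linear output layer, whose weights are the quantities the optimisation algorithms search over.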

4.2 Attack detection using CCN Implementation

We installed the CCNx code base on eight Core i7 computers with 16 GB of RAM running Ubuntu 16.04 and created a linear topology of 8 nodes, as given in Figure 2. Consumer traffic is generated using five threads per consumer, and attacker traffic using twenty-five threads. Data is collected in trace files using Python scripts and then preprocessed for attack detection. We used the same six features for attack detection as in the simulation above.

4.3 Result Analysis of IFA on linear topology

The results for the implementation and the simulation are given in Table 3 and Table 4, respectively. They show that attack detection performs almost similarly in the implementation and in the simulation. MLP with BP performs better than the other machine learning algorithms in both simulation and implementation.
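For reference, the four metrics used to compare the classifiers, and a ten-fold split of the kind used for evaluation, can be computed as in the sketch below. The function names are hypothetical, the attack class is assumed to be labelled 1, and the folds are interleaved without shuffling, which may differ from the tooling used for the reported results.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, sensitivity (recall), and specificity as fractions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
    }


def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k interleaved folds over n samples."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


if __name__ == "__main__":
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    guess = [1, 1, 0, 0, 0, 1, 0, 1]
    print(classification_metrics(truth, guess))
    print(sum(1 for _ in k_fold_indices(100)))
```

Multiplying each value by 100 gives percentages of the kind reported in the result tables.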

Table 1 (continued): Producer and goal of each node in the linear topology

Node       Producer     Goal
Consumer1  Producer2    Normal
Consumer2  Producer1    Normal
Attacker1  No Producer  IFA for non existing data
Attacker2  Producer1    IFA for existing data

Table 2: Parameters used for detection of IFA

Parameter           Meaning
InData              Number of Data packets arriving at a router
InInterests         Number of Interest packets arriving at a router
OutInterests        Number of Interest packets sent from a router
OutData             Number of Data packets sent from a router
SatisfiedInterests  Total number of satisfied Interests at a router
PITsize             Number of PIT entries in a router

Table 3: Result of attack detection in case of linear topology in CCNx code base based implementation

Algorithm      Accuracy  Precision  Sensitivity  Specificity
MLP with BP    97.37     97.451     96.99        96.99
RBF with PSO   88.79     87.82      94.96        78.88
RBF with JAYA  88.23     87.68      94.11        78.79
RBF with TLBO  89.42     88.19      95.63        79.46
SVM Linear     93.03     93.59      92.91        93.16
Fine KNN       97.91     97.18      97.18        98.37

Table 4: Result of attack detection in case of linear topology in ndnSIM

Algorithm      Accuracy  Precision  Sensitivity  Specificity
MLP with BP    100       100        100          100
RBF with PSO   100       100        100          100
RBF with JAYA  100       100        100          100
RBF with TLBO  100       100        100          100
SVM Linear     99.83     99.73      100          99.53
Fine KNN       99.64     99.73      99.70        99.54

4.4 Attack detection in DFN topology

We use a Deutsches Forschungsnetz (DFN)-like topology to simulate IFA. The relative positions of consumers, attackers, and publishers are given in Figure 3. The configuration of the PIT and CS is the same as for the linear topology. The other system settings are given in Table 5. We ran the simulation for 600 seconds of simulation time and collected the data corresponding to the traffic it generated. This data is preprocessed so that it can be used for off-line detection. The effect of the attack on a normal consumer is shown in Figure 4. The satisfaction ratio of consumers (the ratio of incoming Interest and outgoing Data packets) drops during the attack periods (105-240 and 330-465).
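The drop in satisfaction ratio during the attack periods can be quantified along the following lines. This sketch reads the satisfaction ratio as Data packets received per Interest sent by a consumer, which is an assumption, since the text defines the ratio only loosely. The sample tuples are invented for illustration and are not measurements from the simulation; only the two attack windows come from the text.

```python
ATTACK_WINDOWS = [(105, 240), (330, 465)]  # attack periods in seconds, from the text


def satisfaction_ratio(interests_sent, data_received):
    """Fraction of sent Interests answered with Data (assumed definition)."""
    return data_received / interests_sent if interests_sent else 0.0


def in_any_window(t, windows):
    """True if time t falls inside one of the (start, end) windows."""
    return any(start <= t <= end for start, end in windows)


def mean_ratio(samples, windows, during_attack):
    """Mean satisfaction ratio over samples inside (or outside) the windows.

    Each sample is a (time_s, interests_sent, data_received) tuple.
    """
    ratios = [satisfaction_ratio(i, d)
              for t, i, d in samples
              if in_any_window(t, windows) == during_attack]
    return sum(ratios) / len(ratios) if ratios else 0.0


if __name__ == "__main__":
    # Hypothetical per-interval counters for one consumer.
    samples = [(100, 300, 300), (150, 300, 90), (300, 300, 300), (400, 300, 60)]
    print(mean_ratio(samples, ATTACK_WINDOWS, during_attack=False))
    print(mean_ratio(samples, ATTACK_WINDOWS, during_attack=True))
```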

4.5 Result Analysis of IFA on DFN topology

The result of attack detection for the DFN topology simulated using ndnSIM is given in Table 6. The results show that MLP with BP, RBF with PSO, RBF with TLBO, and RBF with JAYA can detect the attack with 100% accuracy. We conclude that neural network based learning approaches are a good solution for IFA detection in NDN.

Table 5: Network parameters considered for modeling DFN topology

Node  Distribution                  Pattern      Frequency  Runtime           Producer     Goal
C1    Randomize                     Uniform      300        0-600             P1           Normal
C2    Randomize                     Exponential  300        30-600            P2           Normal
C3    Zipf-Mandelbrot α=[0.5, 0.9]  Exponential  300        15-600            P3           Normal
C4    Randomize                     Uniform      300        60-600            P6           Normal
C5    Zipf-Mandelbrot α=[0.5, 0.9]  Exponential  300        45-600            P2, P3       Normal
C6    Randomize                     Uniform      300        75-600            P3           Normal
C7    Randomize                     Uniform      300        105-240, 330-465  P6, P4       Normal
C8    Randomize                     Exponential  300        120-270, 375-600  P1           Normal
A1    Randomize                     Uniform      3000       105-240           P1           IFA for existing prefix
A2    Zipf-Mandelbrot α=[0.5, 0.9]  Uniform      3000       330-465           No Producer  IFA for existing prefix
A3    Randomize                     Uniform      3000       105-240           P5           IFA for existing prefix
A4    Randomize                     Exponential  3000       330-465           P6           IFA for existing prefix

Table 6: Result of attack detection in case of DFN topology in ndnSIM

Algorithm        Accuracy  Precision  Sensitivity  Specificity
MLP with BP      100       100        100          100
RBF with PSO     100       100        100          100
RBF with JAYA    100       100        100          100
RBF with TLBO    100       100        100          100
SVM Linear       99.83     99.73      100          99.53
Fine KNN Cosine  93.03     93.59      92.91        93.16

Figure 3: DFN Topology [Nodes shown: consumers C1-C8, attackers A1-A4, producers P1-P6, and routers R1-R11.]

Figure 4: Effect of attack on consumer

5 CONCLUSION AND FUTURE WORK

The results show that attack detection gives better results in simulation than in the implementation. Neural network based approaches such as MLP with BP, RBF with PSO, RBF with JAYA, and RBF with TLBO perform equally well and better than other machine learning approaches such as KNN and SVM. In future work, we plan to perform online detection of IFA in NDN by deploying our machine learning based detector in ndnSIM. This will help us measure the actual performance of these machine learning approaches.

REFERENCES

[1] Alexander Afanasyev, Priya Mahadevan, Ilya Moiseenko, Ersin Uzun, and Lixia Zhang. 2013. Interest flooding attack and countermeasures in Named Data Networking. In IFIP Networking Conference, 2013. IEEE, 1–9.
[2] Alexander Afanasyev, Ilya Moiseenko, Lixia Zhang, et al. 2012. ndnSIM: NDN simulator for NS-3. University of California, Los Angeles, Tech. Rep (2012).
[3] David S Broomhead and David Lowe. 1988. Radial basis functions, multi-variable functional interpolation and adaptive networks. Technical Report. DTIC Document.
[4] Alberto Compagno, Mauro Conti, Paolo Gasti, and Gene Tsudik. 2012. NDN interest flooding attacks and countermeasures. In Annual Computer Security Applications Conference.
[5] Corinna Cortes and Vladimir Vapnik. 1995. Support-vector networks. Machine learning 20, 3 (1995), 273–297.
[6] Thomas Cover and Peter Hart. 1967. Nearest neighbor pattern classification. IEEE transactions on information theory 13, 1 (1967), 21–27.
[7] Huichen Dai, Yi Wang, Jindou Fan, and Bin Liu. 2013. Mitigate ddos attacks in ndn by interest traceback. In Computer Communications Workshops (INFOCOM WKSHPS), 2013 IEEE Conference on. IEEE, 381–386.
[8] Howard B Demuth, Mark H Beale, Orlando De Jesús, and Martin T Hagan. 2014. Neural network design. Martin Hagan.
[9] Riccardo Poli, James Kennedy, and Tim Blackwell. 2007. Particle swarm optimization. Swarm intelligence 1, 1 (2007), 33–57.
[10] R Rao. 2016. Jaya: A simple and new optimization algorithm for solving constrained and unconstrained optimization problems. International Journal of Industrial Engineering Computations 7, 1 (2016), 19–34.
[11] Manohar Singh, BK Panigrahi, and AR Abhyankar. 2013. Optimal coordination of directional over-current relays using Teaching Learning-Based Optimization (TLBO) algorithm. International Journal of Electrical Power & Energy Systems 50 (2013), 33–41.