Bio-Inspired Machine Learning Based Wireless Sensor Network Security

Sushmita Jha, Assistant Professor, IIT Rajasthan, sushmitajha@iitj.ac.in

Heena Rathore, ICT Research Scholar, IIT Rajasthan, heena7sept@iitj.ac.in

Abstract-Exploring the symbiotic nature of biological systems can yield valuable knowledge for computer networks. Biologically inspired approaches to network security are interesting to evaluate because of the analogies between network security and the survival of the human body under pathogenic attacks. A Wireless Sensor Network (WSN) is a network of multiple low-cost, low-energy sensor nodes connected to physical signals. The network is made up of sensor nodes and gateways: the sensor nodes acquire physical-world data, while the gateway forwards the data to the end user. While the spread of viruses in wired systems has been studied in depth, applying trust to wireless sensor network nodes is an emerging area. This paper first uses machine learning techniques to differentiate between fraudulent and good nodes in the system. It then derives inspiration from the human immune system to present the idea of virtual antibodies that disable the fraudulent nodes.

Index Terms-Biologically Inspired, Machine Learning, WSN, Human Immune System, Security.

I. INTRODUCTION

The study of nature means exploring, analysing and investigating the physical world around us. It encompasses the study of living bodies, which grow, respire, need energy and evolve. Detailed study in this area shows how everything in nature is structured to be hierarchical, adaptive and synchronized in space and time. For example, plants perform photosynthesis, bees search for nectar, birds flock together in a synchronized fashion, and the sun rises and sets in a specific way. Researchers can learn a great deal from nature and use it as a source of inspiration for solving many of the challenges in man-made systems. A biologically inspired system draws a strong relationship between a computer system and biology: it tries to solve a specific problem in the computer domain with a biological solution that follows a similar procedure or has similar capabilities. At this point, it is worth answering a very basic question: why should we look at biology as a source of inspiration? The answer lies in many characteristics of these systems [1], such as:

• Biological systems adapt to their environment, which ensures their survival in the harshest conditions.
• They have a proven capacity to heal, remain strong, and be resilient against failures caused by many factors.
• They are able to accomplish very intricate tasks using a limited set of basic rules.
• They are efficient in learning, resolving and regenerating themselves when exposed to new conditions.

978-1-4799-1415-9/13/$31.00 ©2013 IEEE

Over recent years, there has been a paradigm shift in the development of computer networks, from monolithic, centralised systems to independent, distributed, self-organised systems. It is therefore imperative for these distributed systems to be able to adapt and organise in a changing world. In addition, they have to address numerous other challenges [2], some of which are listed below:

• Today's networks are highly dense due to strong interconnectivity, so the size of the network is a major challenge. Since the system is open, any number of nodes can be added to it. The network should be scalable, so that it can grow to a large scale while still performing its normal system functions.
• In early communication systems, with only a single receiver, a single transmitter and a communication channel, the system was static. Such static networks did not have to deal with varying system dynamics. Today's dynamic networks, however, have to deal with varying behaviour resulting from traffic, bandwidth, channel and network conditions.
• Resources need to be used and managed effectively, so that the network is cost-effective.
• Today's networks must have the capabilities of self-organization, self-evolution and survivability.

Comparing the characteristics of biological systems with the challenges faced by distributed network systems, it is evident that bio-inspired techniques can be applied to solve these challenges [3]. The objective of this paper is to present the design of a security system for Wireless Sensor Networks (WSNs) using the human immune system as inspiration. Section II describes the human immune system and explains the concept of T-cells and B-cells in our system. Section III describes the security issues in WSNs [4] and how some immune system concepts can be used to protect against such threats. Section IV shows initial results from using machine learning algorithms based on K-means and the Support Vector Machine (SVM) to distinguish between fraudulent and good nodes. Section V summarizes the paper and presents the scope for future work.

II. HUMAN IMMUNE SYSTEM

Biological immune systems have intelligent capabilities for detecting antigens (foreign bodies) in the body. As shown in Figure 1, the immune system can be classified into two types, innate and adaptive.

Fig. 1. Human Immune System [diagram: Immune System divided into Innate Immune System and Adaptive Immune System; innate: Physical Barriers (Skin, Mucus, Saliva, Tears) and Blood Borne (Macrophages, Neutrophils); adaptive: T-Cells (Cell-Mediated Immunity) and B-Cells (Humoral Immunity)]

Innate immunity is the first line of defence against pathogens. It is non-specific, in that it is not designed for any particular pathogen, and it is meant for rapid detection and elimination: its defence mechanisms come into play within hours of an antigen's appearance in the body. It can be further classified into physical barriers and blood-borne defences. Physical barriers, such as skin, tears, saliva and mucus, stop pathogens from entering the body [5]. If pathogens manage to get past the physical barriers, blood-borne cells come into the picture. Their response is also non-specific. This process, called phagocytosis, is carried out by a number of different phagocytes, the most common types being neutrophils and macrophages. For example, neutrophils have protein molecules on their cell surfaces that help them identify foreign particles. Once a foreign particle is identified, the neutrophil attaches to the pathogen's surface, engulfs it, and encloses the pathogen in a vacuole. The pathogen-containing vacuole then fuses with a lysosome, which contains digestive enzymes. Macrophages perform the same task outside blood vessels, so that pathogens can be removed from tissue.

If the innate immune system cannot remove the pathogen, the adaptive immune system takes over. The adaptive immune system is made up of a network of cells, tissues, and organs that work together to protect the body. The cells involved are white blood cells, or leukocytes, which come in two basic types, phagocytes and lymphocytes. This classification is depicted in Figure 2. Phagocytes have already been discussed under the innate immune system. Lymphocytes are of two types, namely T-cells and B-cells. Leukocytes develop from undifferentiated stem cells in the bone marrow. Lymphocytes start out in the bone marrow and either stay there and mature into B-cells, or leave for the thymus gland, where they mature into T-cells. They are called T-cells because the later stages of their development occur in the thymus. The spleen, bone marrow and thymus are also called lymphoid tissues. Lymph nodes are specialized tissues harbouring cells of the innate immune system, called leukocytes and macrophages, in addition to the specialized cells of the adaptive immune system, the T- and B-cells. These nodes are connected by the lymphatic circulation of the body and help the two arms of our immune system fight a pathogenic attack in a coordinated way.

Fig. 2. Classification of Leukocytes [diagram: Leukocytes (White Blood Cells) divided into Phagocytes (Macrophages, Neutrophils) and Lymphocytes; lymphocytes: T-Cells (Cell-Mediated Immunity), comprising Helper T-Cells and Killer T-Cells, and B-Cells (Humoral Immunity)]

2013 World Congress on Nature and Biologically Inspired Computing (NaBIC)

The adaptive immune system consists of two complementary systems, namely the cellular immune system and the humoral immune system. The humoral immune system is aimed at bacterial infections and extracellular viruses, but can also respond to individual foreign proteins. This system contains soluble proteins called antibodies, which bind bacteria, viruses, or large molecules identified as foreign and target them for destruction. Antibodies are produced by B-cells. Antigens are secreted by pathogens and cause the immune system to respond; B-cells produce and secrete antibodies after they encounter antigens. The cellular immune system destroys host cells infected by viruses and also destroys some parasites. The agents at the heart of this system are a class of T-cells. B-cells are like the body's military intelligence system, seeking out their targets and sending defences to lock onto them [6]. T-cells are like the soldiers, destroying the invaders that the intelligence system has identified. T-cells are broadly of two types, namely helper T-cells and killer T-cells. Killer T-cells interact with infected host cells through receptors on the T-cell surface. Helper T-cells interact with macrophages and secrete cytokines that stimulate killer T-cells, helper T-cells, and B-cells to proliferate and produce antibodies specific to the pathogen.

Mathematical Model

In 1977, Dibrov et al. devised a model to study the rate of change of antibodies and antigen. The Dibrov model consists of three coupled equations for the antibody quantity a, the antigen quantity g, and the small B-cell population x [7]. Since x is generally considered constant, its rate of change is zero and the third equation is ignored. Now consider the set of equations describing antigen-antibody interactions:

$$\frac{dg}{dt} = Kg - Qag \tag{1}$$

$$\frac{da}{dt} = A\,H(t - T)\,g(t - T) - Rag - Ea \tag{2}$$

where Equations 1 and 2 give the rates of change of antigen and antibody respectively, and K, Q, A, R and E are rate constants. K is the overall growth rate of the antigen. H(t) in Equation 2 is the Heaviside step function, whose value is zero for a negative argument and one for a positive argument:

$$H(t) = \begin{cases} 0, & t < 0 \\ 1, & t > 0 \end{cases}$$
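To make the behaviour of Equations 1 and 2 concrete, the following is a minimal sketch that integrates them with a simple forward-Euler scheme. It is not the authors' LabVIEW implementation. It assumes that T is a fixed time delay, that the antigen history before t = 0 does not contribute (which the Heaviside factor enforces), and the function name, step size and all numeric rate constants are hypothetical values chosen only for illustration.

```python
def simulate_dibrov(K, Q, A, R, E, T, g0, a0, dt=0.001, t_end=10.0):
    """Forward-Euler integration of Equations 1 and 2.

    Returns two lists, g (antigen) and a (antibody), sampled every dt
    from t = 0 to t = t_end.
    """
    steps = int(round(t_end / dt))
    delay_steps = int(round(T / dt))  # the delay T expressed in time steps
    g = [g0]
    a = [a0]
    for n in range(steps):
        # H(t - T) * g(t - T): zero until t exceeds the delay T
        m = n - delay_steps
        delayed_g = g[m] if m > 0 else 0.0
        dg = K * g[n] - Q * a[n] * g[n]                  # Equation 1
        da = A * delayed_g - R * a[n] * g[n] - E * a[n]  # Equation 2
        g.append(g[n] + dt * dg)
        a.append(a[n] + dt * da)
    return g, a


# Hypothetical rate constants, chosen only to exercise the code
g, a = simulate_dibrov(K=1.0, Q=0.5, A=2.0, R=0.1, E=0.2, T=0.5, g0=1.0, a0=0.0)
print(f"antigen at t=10: {g[-1]:.4f}, antibody at t=10: {a[-1]:.4f}")
```

With the antibody terms switched off (Q = A = 0), the sketch reduces to pure exponential growth of the antigen at rate K, which is a convenient sanity check on the integration step.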


2) Supervised Learning: Classification of data comes under this category. It is used when we are given labelled data and need to describe a pattern and create a decision boundary. The support vector machine (SVM) is a popular tool for classifying data, and here it creates a decision boundary that distinguishes between fraudulent and good data. The data that the k-means algorithm grouped into clusters is given to the support vector machine, which creates the decision boundary shown in Figure 8. In the figure, red marks the fraudulent data and green the good data. When new data is received, it is considered fraudulent if it lies in the red region and good otherwise. All the simulations were done in the LabVIEW 2012 environment.
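The two-stage idea (cluster first, then learn a decision boundary from the cluster labels) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the LabVIEW code used for the paper: the k-means step is the standard Lloyd iteration with a deterministic farthest-point initialisation, the SVM is a linear one trained by subgradient descent on the regularised hinge loss, and the sample readings, the hyperparameters and the rule for deciding which cluster is fraudulent are all hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k=2, iters=20):
    """Lloyd's algorithm; returns (centroids, cluster index per point)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Farthest-point initialisation keeps the sketch deterministic
        centroids.append(max(points, key=lambda pt: min(sq_dist(pt, c) for c in centroids)))
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point
        assign = [min(range(k), key=lambda c: sq_dist(pt, centroids[c])) for pt in points]
        # Update step: each centroid moves to the mean of its members
        for c in range(k):
            members = [pt for pt, lab in zip(points, assign) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, assign


def train_linear_svm(points, labels, lam=0.01, lr=0.005, epochs=1000):
    """Linear SVM via subgradient descent on the hinge loss (labels are +1/-1)."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1.0 - lr * lam) for wi in w]   # regulariser shrinks w
            if margin < 1.0:                          # point violates the margin
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def classify(x, w, b):
    """Side of the decision boundary on which a new reading falls."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "fraudulent" if score > 0 else "good"


# Hypothetical two-feature readings: one group near (1, 1), one near (6, 6)
readings = [(1.0, 1.2), (6.0, 5.5), (1.5, 0.8), (5.5, 6.2),
            (0.8, 1.0), (6.3, 6.0), (1.2, 1.6), (5.8, 5.0)]
centroids, assign = kmeans(readings)
fraud_cluster = assign[1]  # assumption: readings[1] is known to be fraudulent
labels = [1 if c == fraud_cluster else -1 for c in assign]
w, b = train_linear_svm(readings, labels)
print(classify((6.5, 5.5), w, b), classify((0.5, 1.0), w, b))
```

A linear boundary is enough for this toy data; a kernel SVM would be the natural substitute if the clusters were not linearly separable.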

Fig. 9. [plot; caption not recoverable]

3) Anomaly Detection Algorithm: The SVM creates the decision boundary; however, data that lies on the boundary needs to be evaluated further for better accuracy and precision. Anomaly detection thus helps in making more accurate decisions: data on the boundary of the benevolent region can be given more time for analysis. The algorithm works in the following manner.

Given a new example x, we compute p(x):

$$p(x) = \prod_{j=1}^{n} p(x_j, \mu_j, \sigma_j^2) = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_j} \exp\left(-\frac{(x_j - \mu_j)^2}{2\sigma_j^2}\right) \tag{8}$$

The probability distribution functions of x and y are shown in Figure 10.

Fig. 10. Probability Distribution Function [panels: PDF of X, PDF of Y]

The combined probability distribution function of x and y is

$$p(x, y) = p(x, \mu_x, \sigma_x^2)\, p(y, \mu_y, \sigma_y^2) \tag{9}$$

as shown in Figure 11.
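The density test of Equations 8 and 9 can be sketched as follows. This assumes that the per-feature mean and variance are estimated from known-good training data with the usual sample formulas, which is an assumption of this sketch rather than something stated here. The function names, the sample readings and the threshold value are hypothetical.

```python
import math


def fit_gaussian(samples):
    """Per-feature mean and variance estimated from known-good readings."""
    n = len(samples)
    dims = len(samples[0])
    mu = [sum(s[j] for s in samples) / n for j in range(dims)]
    var = [sum((s[j] - mu[j]) ** 2 for s in samples) / n for j in range(dims)]
    return mu, var


def density(x, mu, var):
    """p(x) as a product of independent univariate Gaussian densities.

    Assumes every variance is strictly positive.
    """
    p = 1.0
    for xj, m, v in zip(x, mu, var):
        p *= math.exp(-((xj - m) ** 2) / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
    return p


def is_anomaly(x, mu, var, epsilon):
    """Flag x when its density falls below the threshold epsilon."""
    return density(x, mu, var) < epsilon


# Hypothetical good readings with two features (x, y)
good = [(4.0, 5.0), (5.0, 4.0), (6.0, 5.0), (5.0, 6.0)]
mu, var = fit_gaussian(good)
epsilon = 0.01  # hypothetical threshold
for reading in [(5.0, 5.0), (9.0, 9.0)]:
    print(reading, "anomaly" if is_anomaly(reading, mu, var, epsilon) else "normal")
```

In practice the threshold would be tuned on labelled validation data; a value that is too high flags good boundary readings, while one that is too low lets fraudulent ones through.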

A threshold ε is maintained: if the probability of a new data point is less than ε, the point is considered an anomaly; otherwise it is not. This paper aims to detect anomalies in wireless sensor networks with the aid of machine learning classification, and thereby to develop a novel intrusion detection system. The flowchart of the machine-learning-based, biologically inspired intrusion detection system is shown in Figure 12 [19]. Packets are captured and checked, using SVM and the K-means algorithm, to determine whether they lie in the good region or the fraudulent region. For better accuracy and precision, data on the boundary between the good and fraudulent regions is then passed through the anomaly detection algorithm. If the data is found to be anomalous, the trust development module comes into play and virtual antibodies are produced with the help of differential Equations 1 and 2. Finally, the gateway turns off the signal of the fraudulent node.

Fig. 11. Combined PDF

Fig. 12. Intrusion Detection System [flowchart starting from Captured Packets]

V. CONCLUSION

This paper described the human immune system, focussing specifically on the adaptive immune system, which consists of T-cells and B-cells. The aim was to derive inspiration from these cells to design a security system for next-generation wireless sensor networks (WSNs). The objective was to define the combination of cluster heads and gateway as the lymph nodes of the human body. Cluster heads will calculate the virtual antibodies for all the sensor nodes connected to them and transmit these ratings to the gateway. The gateway will then initiate action to disable the radio on a node by producing trust ratings of either 0 or 1. Our aim is to build a system that implements the mathematical model and to evaluate its performance under real-world conditions. The scope of future work is the development of virtual antibodies with the help of the Dibrov differential equations and the study of their significance for the trust ratings.

ACKNOWLEDGEMENTS

This work was carried out under the National Instruments PhD Summer Internship Program under the supervision of Abhay Samant. The authors would like to thank Mr. Abhay Samant and Mr. Chinmay Misra for their help with the simulations performed using National Instruments LabVIEW 2012.

REFERENCES

[1] Michael Meisel, Vasileios Pappas, and Lixia Zhang, "A taxonomy of biologically inspired research in computer networking", Elsevier Computer Networks Journal, vol. 54, no. 6, pp. 901-916, 2009.
[2] Falko Dressler, Ozgur B. Akan, "A survey on bio-inspired networking", Elsevier Computer Networks Journal, vol. 54, no. 6, pp. 881-900, 2010.
[3] Félix Gómez Mármol and Gregorio Martínez Pérez, "Providing trust in wireless sensor networks using a bio-inspired technique", Telecommunication Systems, vol. 46, no. 2, pp. 163-180, 2011.
[4] A. Boukerch, L. Xu, K. EL-Khatib, "Trust-based security for wireless ad hoc and sensor networks", Elsevier Computer Communications Journal, vol. 30, no. 11, pp. 2413-2427, 2007.
[5] Julie Greensmith, Amanda Whitbrook and Uwe Aickelin, "Artificial Immune Systems", Book on Handbook of Metaheuristics, 2010.
[6] Heena Rathore, Abhay Samant, "A system for building immunity in social networks", in Proc. Fourth World Congress on Nature and Biologically Inspired Computing (NaBIC), no. 4, pp. 20-24, 2012.
[7] A. C. Fowler, "Approximate Solution of a Model of Biological Immune Responses Incorporating Delay", Journal of Mathematical Biology, vol. 13, pp. 23-45, 1981.
[8] Mohammad Momani and Subhash Challa, "Probabilistic modelling and recursive bayesian estimation of trust in wireless sensor networks", Book on Bayesian Network, 2007.
[9] Javier Lopez, Rodrigo Roman, Isaac Agudo, Carmen Fernandez-Gago, "Trust management systems for wireless sensor networks: Best practices", Elsevier Computer Communications Journal, vol. 33, no. 9, pp. 1086-1093, 2010.
[10] Haiguang Chen, Huafeng Wu, Xi Zhou, Chuanshan Gao, "Agent-based Trust Model in Wireless Sensor Networks", ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, no. 8, pp. 119-124, 2007.
[11] Yenumula B. Reddy, "Trust-Based Approach in Wireless Sensor Networks using an Agent to each Cluster", International Journal of Security, Privacy and Trust Management, vol. 1, no. 1, pp. 19-36, 2012.
[12] Mohammad Momani, Subhash Challa, "Survey of Trust Models in Different Network Domains", International Journal of Ad Hoc, Sensor and Ubiquitous Computing, vol. 1, no. 3, pp. 1-19, 2010.
[13] Murad A. Rassam, M.A. Maarof and Anazida Zainal, "A Survey of Intrusion Detection Schemes in Wireless Sensor Networks", American Journal of Applied Sciences, vol. 9, no. 2, pp. 69-83, 2012.
[14] Daniel-Ioan Curiac, Constantin Volosencu, Alex Doboli, Octavian Dranga, Tomasz Bednarz, "Discovery of Malicious Nodes in Wireless Sensor Networks Using Neural Predictors", WSEAS Transactions on Computer Research, vol. 2, no. 1, pp. 38-43, 2007.
[15] Félix Gómez Mármol, Gregorio Martínez Pérez, "Providing trust in wireless sensor networks using a bio-inspired technique", Telecommunication Systems, vol. 46, no. 2, pp. 163-180, 2011.
[16] Idris M. Atakli, Hongbing Hu, Yu Chen, Wei Shinn Ku, "Malicious Node Detection in Wireless Sensor Networks using Weighted Trust Evaluation", in Proc. Spring Simulation Multiconference, pp. 836-843, 2008.
[17] Mitchell, T., Book on Machine Learning, McGraw Hill, pp. 2, 1977.
[18] MacKay, David, "Information Theory, Inference and Learning Algorithms", Chapter 20. An Example Inference Task: Clustering, Cambridge University Press, pp. 284-292, 2003.
[19] Hichem Sedjelmaci and Mohamed Feham, "Novel Hybrid Intrusion Detection System for clustered wireless sensor network", International Journal of Network Security and Its Applications, vol. 3, no. 4, pp. 1-14, 2011.
