Malicious Objects Trafficking in the Network

Dinesh Kumar Saini1, Sanad Al Maskari2 and Hemraj Saini3

1,2 Faculty of Computing and Information Technology, Sohar University, P.O. Box: 44, P.C. 311, Sohar, Sultanate of Oman. Tel: +968-26720101 Ext: 251, Fax: +968-26720102, Mobile: +968-95784762. E-mail:
[email protected],
[email protected]

Abstract—Traffic control and management is a crucial factor for the smooth running and management of a network. A packet travelling in cyberspace moves according to the routing protocols, yet traffic problems still arise, and this paper attempts to address them with mathematical modeling based on queuing theory. Internet traffic contains different types of packets such as IP, TCP, UDP, SMTP, FTP, ARP, etc. These packets are vulnerable to cyber attacks, which lead to various problems and costly damage in the network. Cyber attacks such as DDoS, which consume excessive bandwidth, can be detected by using a queuing model. In this paper we propose a queuing model for the incoming traffic. Our aim is to filter malicious information in the early stage of the attack and to reduce unnecessary false positives. We assume a Poisson distribution for traffic arrivals and model the system as an M/M/1/∞/∞ queue to detect the malicious traffic. We also incorporate the concept of self-similarity to detect self-similar patterns in the traffic (which are a major cause of cyber attacks) and the abnormal behavior of the Internet traffic.

Keywords— Mathematical Modeling, M/M/1/∞/∞ Queuing model, Poisson distribution, Cyber attack, Self-similarity Pattern

I. INTRODUCTION
In the present time, security is a serious concern whenever we talk about cyberspace and internetworks. Almost all computers and communication systems become susceptible, or even vulnerable, to cyber attacks once they come online on the Internet. The attacks or infections are caused by malicious objects such as viruses, worms, crawlers, phishers, and bots, among many others. There are good models that represent malicious object behavior, such as [1, 2, 3], but they do not address scalable attacks. The current Internet infrastructure is highly susceptible to scalable attacks such as distributed denial-of-service (DDoS) attacks [4, 5, 6], because DDoS attacks do not rely on particular network protocols or system weaknesses. During an attack, the whole traffic converges on the highly congested point in the Intranet and temporarily degrades or jams the services of the system. These attacks cause heavy losses in the network infrastructure and damage to the system. A scalable attack can be carried out over various protocols; some are listed in Table-1. The attack situation can occur with different protocols that follow request-response processing. As soon as the request is sent, it may be trapped by the attacker, who then floods a huge amount of traffic in the form of responses to the request. This traffic consumes a large share of the bandwidth and degrades performance. Sometimes the situation becomes so bad that all the resources of the host are blocked by the unwanted traffic.

Protocol   Packet sent              Response sent
HTTP       GET/PUT                  ACK
TCP        TCP SYN                  TCP ACK
DNS        REQUEST/AUTHORITATIVE    RESPONSE
SMTP       HELO/DATA                NICE TO MEET YOU/354 "MESSAGE"
FTP        MGET/MPUT/USER           ACK
ICMP       ICMP ECHO                ICMP REPLY
ARP        REQUEST                  REPLY

Table-1: Protocols susceptible to scalable attacks
The significance of scalable attacks can be seen from the fact that approximately 10,000 such attacks occur every day. Not only the number of attacks but also their size is large enough to consume the whole bandwidth: the biggest DDoS attack in 2005 was 3.5 Gbps, in 2006 it was 10 Gbps, and the size keeps increasing every year [7]. Scalable attacks are of three types: Targeted, Consumptive, and Exploited. Attackers are smart enough to make their attacks powerful and dynamic, so that they strike at different layers of the OSI model. Attacks cannot be detected at one centralized point; several detection points are needed, such as ISP routers, border routers, different network fragments (switches/routers), the host IP stack, or ports/services [8].

Routers can filter bad packets from incoming and outgoing traffic on the basis of network traffic analysis, as mentioned in [9, 10]. But they are not a perfect solution for scalable attacks, for the following reasons:
(i) Many routers cannot process high-volume traffic because of limited processing capability.
(ii) Routers generally do not support access control lists of more than, say, 100 lines, so it is difficult to check hundreds of thousands of IP lines.
(iii) Because attacks use spoofing, it is difficult to detect the actual source of an attack, and hence it cannot be blacklisted.
(iv) Routers use the traffic limiting principle, but under extremely heavy traffic the processing unit fails. In addition, this approach limits both legitimate and non-legitimate traffic, which is not desirable.

The firewall is the next solution, and it is generally implemented at layer-5. A firewall works in a trusted environment and blocks all traffic from unauthorized sources. It works well against known attacks or known malicious URLs, but it cannot restrict services that are available to the public, and hence it cannot stop scalable attacks.

The third solution is the intrusion detection system (IDS). IDSs are passive solutions and are good at detecting attacks after checking each packet in the traffic. In the case of scalable attacks, however, it is difficult to check all packets because the traffic is extremely heavy.

To improve the overall efficacy of scalable attack detection, one should consider features such as high capacity and scalability, monitoring and filtering at all OSI layers, adaptability to new attacks, and alternate/multiple routing. In this paper we present a model to improve the monitoring and filtering of scalable attacks at the border gateway. The model shows how queuing parameters such as mean waiting time, service time, server utilization, and queue length help to detect a scalable attack scenario. We also incorporate the self-similarity principle to improve the queuing performance. The paper is organized as follows:
(i) Theoretical model
(ii) Model analysis
(iii) Improving queuing performance using self-similar property of traffic
(iv) Summary and discussion
II. THEORETICAL MODEL

For each Intranet there is an entry point for the incoming external Internet traffic, generally known as the border gateway. At the border gateway the traffic arrives in the form of a queue. A filtering/categorizing mechanism filters the traffic on the basis of its behavior: before incoming traffic enters the Intranet, the mechanism checks its reliability in FCFS manner. Let the traffic arrival process be Poisson with traffic arrival rate $\lambda$, and let the filtering/categorizing mechanism have an exponential checking time distribution with rate $\mu$. Packets wait in the queue and then enter the filtering/categorizing mechanism. Figure-1 shows the network traffic queuing system. A queue builds up if more than one packet arrives at a time. We assume that every incoming packet is checked by the firewall and that there is no overflow.

The offered load is defined by $\lambda/\mu$. If $\lambda \ge \mu$, the packet arrival rate is greater than or equal to the maximum checking rate of the filtering/categorizing mechanism when it is busy. The system then cannot handle the load put on it, and therefore it has no statistical equilibrium. If $\lambda > \mu$, the waiting line grows in length at rate $(\lambda - \mu)$ packets per unit time, on average: packets enter the system at rate $\lambda$ per unit time and leave it at a maximum rate of $\mu$ per unit time. For the M/M/1 queue to have statistical equilibrium, the offered load must satisfy $\lambda/\mu < 1$, in which case $\lambda/\mu = \rho$. The steady state parameters are given by equations (1) to (9).

Most of the measures of performance can be expressed fairly simply in terms of $P_0$, the probability that the system is empty, or $\sum_{n=1}^{\infty} P_n$, the probability that the filtering/categorizing mechanism is busy, denoted by $P(L(\infty) \ge 1)$, where $L(\infty)$ is a random variable representing the number in system in statistical equilibrium (after a long time). Thus,

$$P(L(\infty) = n) = P_n, \quad n = 0, 1, 2, 3, \ldots$$

The value of $P_0$ or $\rho$ is necessary for computing all the measures of effectiveness and is given by equation (1).
Figure-1: Traffic Queuing System

While framing the model we have taken care of the following points:
(i) Cyber attack models generally assume that higher network traffic is a signal of a cyber attack. We do not consider this a sound assumption, because higher network traffic may also occur in normal network conditions, so the assumption leads to false positives. We modify it by including system behavior: when higher traffic is detected, the system behavior is predicted and the base profile is changed automatically. This helps to reduce the false positives.
(ii) We take the length of the queue as the parameter that reflects the arrival rate of network traffic. If the queue length exceeds some threshold value within a fixed amount of time, this is an indication of a cyber attack. In parallel, the threshold values are changed according to the system behavior.
(iii) Our aim in the paper is to optimize the utilization of the filtering/categorizing mechanism ($\rho$), so that exact threshold values can be set and false positives can be reduced. If $\rho$ is too small or too large, the false positives increase.
(iv) We also assume that, at the time of attack detection, the results are not bypassed, because attackers could use bypassed results to reduce the traffic rate for the time being.
(v) Cyber attacks can be launched by internal users or by external users, so we need to check the traffic from both sides.
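Points (i) and (ii) can be illustrated with a small sketch. The paper does not specify how the base profile or the threshold is updated, so the rule below is purely an assumption made for illustration: the threshold is a fixed multiple (`margin`) of a baseline, and the baseline is an exponentially weighted moving average (weight `alpha`) of the mean queue length over a sliding window. The class name and all parameter names are hypothetical.

```python
from collections import deque


class AdaptiveQueueThreshold:
    """Flags a window whose mean queue length exceeds an adaptive threshold.

    Assumed rule (not taken from the paper): threshold = margin * baseline,
    where baseline is a moving average of window means seen in normal operation.
    """

    def __init__(self, window: int, initial_baseline: float,
                 margin: float = 3.0, alpha: float = 0.1) -> None:
        self.samples = deque(maxlen=window)  # most recent queue lengths
        self.baseline = initial_baseline     # current base profile
        self.margin = margin
        self.alpha = alpha

    @property
    def threshold(self) -> float:
        return self.margin * self.baseline

    def observe(self, queue_length: float) -> bool:
        """Record one queue-length sample; return True if an attack is suspected."""
        self.samples.append(queue_length)
        if len(self.samples) < self.samples.maxlen:
            return False  # not enough data for a full window yet
        window_mean = sum(self.samples) / len(self.samples)
        if window_mean > self.threshold:
            return True   # suspected attack: leave the base profile untouched
        # Normal behavior: let the base profile follow the observed traffic.
        self.baseline += self.alpha * (window_mean - self.baseline)
        return False
```

With this rule a slow, legitimate rise in traffic moves the baseline upward and raises no alarm, whereas a sudden jump in queue length within one window is flagged. Other update rules would serve the same purpose.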
III. MODEL FORMULATION

3.1. Nomenclature

$\lambda$ : Traffic arrival rate in packets per second
$\mu$ : Packets per second checked by the firewall or by some filtering/categorizing mechanism
$P_0$ : Probability of having no traffic in the system
$\rho$ : Filtering method utilization or firewall utilization
$P_n$ : Probability of having n packets in the system
$L(\infty)$ : Random variable representing the number in system in statistical equilibrium (after a long time)
$P(L(\infty) \ge 1)$ : Probability that the firewall/filtering mechanism is busy
$w$ : Long-run average time spent in system per packet
$w_Q$ : Long-run average time spent in queue per packet
$L$ : Long-run time-average number of packets in system
$L_Q$ : Long-run time-average number of packets in queue

3.2. Steady State parameters for the model

$$\rho = \frac{\lambda}{\mu} \tag{1}$$

$$P_0 = \left\{ 1 + \left(\frac{\lambda}{\mu}\right)\left(\frac{\mu}{\mu - \lambda}\right) \right\}^{-1} = \left\{ 1 + \rho \left(\frac{1}{1 - \rho}\right) \right\}^{-1} = 1 - \rho \tag{2}$$

$$P(L(\infty) \ge 1) = \frac{\rho P_0}{1 - \rho} = \rho \tag{3}$$

$$P_n = \left(\frac{\lambda}{\mu}\right)^{n} P_0 = \left(1 - \frac{\lambda}{\mu}\right)\left(\frac{\lambda}{\mu}\right)^{n} = (1 - \rho)\rho^{n} \tag{4}$$

$$w_Q = w - \frac{1}{\mu} = \frac{\lambda}{\mu(\mu - \lambda)} = \frac{\rho}{\mu(1 - \rho)} \tag{5}$$

$$L = \frac{\lambda}{\mu - \lambda} = \frac{\rho}{1 - \rho} \tag{6}$$

$$w = \frac{L}{\lambda} = \frac{1}{\mu - \lambda} = \frac{1}{\mu(1 - \rho)} \tag{7}$$

$$L_Q = \lambda w_Q = \frac{\lambda^{2}}{\mu(\mu - \lambda)} = \frac{\rho^{2}}{1 - \rho} \tag{8}$$

$$L - L_Q = \frac{\lambda}{\mu} = \rho \tag{9}$$

IV. ATTACK DETECTION

In the system, traffic normally arrives at a normal speed and the queue length remains within the threshold value. After a certain amount of time, known as the ideal period, traffic arrives at a high speed and the queue length exceeds the threshold value; this is considered an attack. The duration for which the queue length remains higher than the threshold value is known as the attack duration, as shown in Figure-2.
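The steady-state measures of Section III are straightforward to evaluate numerically. The following is a minimal sketch using the standard M/M/1 closed forms; the function and field names are illustrative, and the rates in the example (80 and 100 packets per second) are assumed values, not measurements from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MM1Metrics:
    rho: float  # utilization of the filtering/categorizing mechanism
    p0: float   # probability that the system is empty
    L: float    # long-run mean number of packets in the system
    LQ: float   # long-run mean number of packets in the queue
    w: float    # long-run mean time in the system per packet
    wQ: float   # long-run mean time in the queue per packet


def mm1_metrics(lam: float, mu: float) -> MM1Metrics:
    """Steady-state M/M/1 measures for arrival rate lam and checking rate mu."""
    if lam <= 0 or mu <= 0:
        raise ValueError("rates must be positive")
    if lam >= mu:
        # The queue grows without bound, so no steady state exists.
        raise ValueError("no statistical equilibrium: need lam < mu")
    rho = lam / mu
    return MM1Metrics(
        rho=rho,
        p0=1.0 - rho,
        L=rho / (1.0 - rho),
        LQ=rho ** 2 / (1.0 - rho),
        w=1.0 / (mu - lam),
        wQ=lam / (mu * (mu - lam)),
    )


def p_n(lam: float, mu: float, n: int) -> float:
    """Probability of having exactly n packets in the system."""
    rho = lam / mu
    return (1.0 - rho) * rho ** n


# Assumed example rates: 80 packets/s arriving, 100 packets/s checked.
example = mm1_metrics(lam=80.0, mu=100.0)
print(example)
```

For these assumed rates the utilization is 0.8, about four packets are in the system on average, and each packet waits about 0.04 seconds in the queue. Comparing `LQ` or `wQ` against a chosen threshold is then a single comparison.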
Figure-2: An attack

This situation occurs in protocols that follow request-response processing. As soon as the request is sent, it may be trapped by the attacker, who then floods a huge amount of traffic in the form of responses to the request. This traffic consumes bandwidth as well as other resources of the host and degrades performance. Sometimes the situation becomes so bad that all the resources of the host are blocked by the unwanted traffic, which totally jams the processing.

V. MODEL ANALYSIS

5.1. Probability distribution function

As the incoming traffic follows a Poisson distribution, the probabilities that the queue is empty, or holds a very large number of packets, at any moment t are given by

$$P(L(t) = 0) = \frac{\mu}{\lambda + \mu} + a_0 e^{-(\lambda + \mu)t}, \quad t > 0 \tag{10}$$

$$P(L(t) = \infty) = \frac{\lambda}{\lambda + \mu} + a_\infty e^{-(\lambda + \mu)t}, \quad t > 0 \tag{11}$$

where the $a_n$ are constants, i.e. they do not depend on time but do depend on the initial conditions. Setting $t = 0$ in (10) and (11) gives

$$a_0 = P_0(0) - \frac{\mu}{\lambda + \mu}, \qquad a_\infty = P_\infty(0) - \frac{\lambda}{\lambda + \mu}$$

We assume that initially there is no packet, therefore $P_0(0) = 1$ and $P_\infty(0) = 0$. As $t \to \infty$, $e^{-(\lambda + \mu)t} \to 0$ since $\lambda > 0$ and $\mu > 0$, and

$$P_0(t) \to \frac{\mu}{\mu + \lambda}, \quad \text{and} \quad P_\infty(t) \to \frac{\lambda}{\lambda + \mu} \tag{12}$$

From equations (10) and (11) it can be seen that $P_0(t) + P_\infty(t) = 1$. These are transient probabilities, and their nature is shown in Figure-3.

Figure-3: Probabilities $P_0(t)$ and $P_\infty(t)$ (… $^{-1}$ = 2 seconds), assuming the system is empty initially

In Figure-3, the curve $P_\infty(t)$ shows the probability that there is a large number of packets, which means an attack may be occurring. The curve $P_0(t)$ shows the probability that the queue is going to be empty and there is no attack.

5.2. Queue length Vs Utilization
Figure-4: Queue length Vs filtering/categorizing mechanism utilization, P0=0.5

Figure-4 shows that the filtering/categorizing mechanism utilization increases as the mean number of packets in the queue increases. On the basis of system behavior there is a threshold value for the mean number of packets in the queue, $L_{Threshold}$. If an alarm is raised by the security system, we check $L_Q$ against $L_{Threshold}$. If $L_Q > L_{Threshold}$, we say the alarm is right; otherwise we say it is a false positive.

Figure-5 shows the waiting time in the queue with respect to the filtering/categorizing mechanism utilization. Here we obtain a point termed the high risk point, $W_Q$. If $W_Q$ goes beyond the threshold value $W_{Threshold}$, it is considered an attack alarm.

5.3. Queue length Vs traffic arrival rate

Figure-5 shows the behavior of the queue length with the arrival rate of the packets. As the arrival rate increases, the queue length increases more and more steeply, and it grows without bound as $\lambda$ approaches $\mu$, since $L_Q = \lambda^{2} / (\mu(\mu - \lambda))$. On the basis of system behavior there is a threshold value for the mean number of packets in the queue, $L_{Threshold}$. If an alarm is raised by the security system, we check $L_Q$ against $L_{Threshold}$. If $L_Q > L_{Threshold}$, we say the alarm is right; otherwise we say it is a false positive.

Figure-5: Queue length Vs packet arrival rate

VI. IMPROVING QUEUING PERFORMANCE USING SELF-SIMILAR PROPERTY OF TRAFFIC

Self-similarity is an important property of a flow: a particular structure is repeated at different, random scales. This property helps to detect self-similar data patterns in the network traffic, because in the case of scalable attacks the traffic patterns of the attack flow remain almost the same. A measure of self-similarity was given by H. E. Hurst [11] and is known as the Hurst parameter. It can be defined as follows.

Given that $X = \{X(i), i = 0, 1, 2, \ldots\}$ is a covariance-stationary stochastic process with mean $\mu$, variance $\sigma^{2}$ and autocorrelation function $r(k)$, define:

$$X_m(i) = \frac{1}{m} \sum_{j=i}^{i+m-1} X(j), \quad i = 0, 1, 2, 3, \ldots \tag{13}$$

For every $m$, $X_m(i)$ is a covariance-stationary process, and $r_m(k)$ is the autocorrelation function of $X_m$. If the autocorrelation function of $X$ satisfies, for all $m$:

$$r(k) \sim k^{-\beta}, \quad (0 < \beta < 1) \tag{14}$$

then $X(i)$ is an exactly second-order self-similar process with parameter $H = 1 - \beta/2$. If the autocorrelation function of $X$ satisfies, for all $m$:

$$r_m(k) \sim r(k), \quad m \to \infty \tag{15}$$

then $X(i)$ is an asymptotically second-order self-similar process with parameter $H = 1 - \beta/2$. When $k \to \infty$, the behavior of the autocorrelation function $r(k)$ of an exactly or asymptotically second-order self-similar process resembles a power law whose exponent is determined by $H$. The parameter $H$, called the Hurst parameter, describes the character of the self-similarity, and $H \in (0.5, 1)$. As the value of $H$ increases, the value of $r(k)$ decays more and more slowly when $k \to \infty$. When $0.5 < H < 1$, then:

$$\sum_{k=-\infty}^{\infty} r(k) = \infty \tag{16}$$
Equation (16) shows that the autocorrelation function is not summable; this is known as long-range dependence (LRD) behavior. Under LRD the number of arriving packets is very high, and therefore the computational time needed to generate the mean values for each sample of the occupancy process is also very high. It is therefore better to divide the Poisson arrival process into sub Poisson arrival processes. We checked that, to produce identical processes, the sample values of the random numbers must be similar, and this lets us find similar patterns in the traffic. If there are many self-similar processes in the traffic, we say that it is a scalable attack and hence restrict the malicious traffic. We propose an algorithm to detect an attack; by including self-similarity it increases the efficacy of the queuing-based detection. The algorithmic steps to detect an attack can be represented as follows:
Algorithm to detect an attack

LThreshold: Maximum threshold to queue length.
Flag AttackFound = 0
Flag MoreQueueLength = 0
For each incoming packet do
    Step-1: Add packet to the current queue.
    Step-2: Compute current queue length (L).
    Step-3: If L > LThreshold then set MoreQueueLength = 1
    Step-4: Compute H by equations (13) and (14) // Measure the self-similarity
    Step-5: If H > 0.5 then set AttackFound = 1
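The steps above can be sketched in Python as follows. The paper does not say how H is obtained numerically from (13) and (14), so this sketch assumes the aggregated-variance method: the variance of block means of size m scales as m^(-beta), and H = 1 - beta/2. It uses non-overlapping blocks, a common variant of the averaging in (13). The names `hurst_aggregated_variance`, `AttackDetector`, `traffic_series` and `min_samples` are hypothetical, and `traffic_series` is assumed to be a list of per-interval packet counts supplied by the caller.

```python
import math
from collections import deque
from statistics import fmean, pvariance


def hurst_aggregated_variance(series, block_sizes=(1, 2, 4, 8, 16, 32, 64)):
    """Estimate H from the slope of log Var(X_m) against log m (assumed method)."""
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        if n_blocks < 2:
            continue
        # Non-overlapping block means of size m.
        blocks = [fmean(series[i * m:(i + 1) * m]) for i in range(n_blocks)]
        var = pvariance(blocks)
        if var > 0:
            log_m.append(math.log(m))
            log_var.append(math.log(var))
    if len(log_m) < 2:
        raise ValueError("series too short or constant")
    # Ordinary least-squares slope; Var(X_m) ~ m**(-beta), so beta = -slope.
    mean_x, mean_y = fmean(log_m), fmean(log_var)
    sxx = sum((x - mean_x) ** 2 for x in log_m)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_m, log_var))
    beta = -sxy / sxx
    return 1.0 - beta / 2.0


class AttackDetector:
    """Per-packet loop of the algorithm; flags mirror the listing."""

    def __init__(self, l_threshold, min_samples=256):
        self.queue = deque()
        self.l_threshold = l_threshold
        self.min_samples = min_samples  # assumed minimum series length for H
        self.more_queue_length = False
        self.attack_found = False

    def on_packet(self, packet, traffic_series):
        self.queue.append(packet)                     # Step-1
        length = len(self.queue)                      # Step-2
        if length > self.l_threshold:                 # Step-3
            self.more_queue_length = True
        if len(traffic_series) >= self.min_samples:   # Step-4
            h = hurst_aggregated_variance(traffic_series)
            if h > 0.5:                               # Step-5
                self.attack_found = True
        return self.attack_found
```

As in the listing, the two flags are set independently. Because an estimate of H for ordinary, uncorrelated traffic fluctuates around 0.5, a caller may prefer to act only when both flags are set or to require a margin above 0.5; either choice is a design decision beyond what the listing states.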
VII. SUMMARY AND DISCUSSION

This paper proposes an approach to the detection of malicious scalable attacks in cyberspace using an M/M/1 queuing system. We have shown how queuing parameters such as mean waiting time, service time, server utilization, and queue length help to detect a scalable attack scenario. The network traffic arrival process is assumed to be a Poisson process. The threshold values are to be set according to the system behavior. To increase the performance of the queuing system we include the self-similarity principle. Future work is oriented towards diagnosing the behavior of network traffic in the presence of large self-similar patterns, and towards its simulation.

REFERENCES

[1] Bimal K Mishra, Dinesh Kumar Saini, SEIRS Epidemic Model with delay for Transmission of malicious Objects in Computer Network, International Journal of Applied Mathematics and Computation, doi: 10.1016/j.amc.2006.11.012.
[2] Hemraj Saini, Dinesh Saini, "Cyber Defense: mathematical Modeling and Simulation," National conference on mathematical analysis and its real time applications, 16-17 September, 2006, University of Berhampur (Berhempur)-Udisa, pp. 106-111.
[3] Bimal K Mishra, Dinesh Kumar Saini, Mathematical Models on Computer Viruses, doi: 10.1016/j.amc.2006.09.062.
[4] Jelena Mirkovic, Peter Reiher, A Taxonomy of DDoS Attack and DDoS Defense Mechanisms, ACM SIGCOMM Computer Communications Review, Volume 34, Number 2, April 2004, pp. 39-54.
[5] Jian Kang, Yuan Zhang, Jiu-Bin Ju, Classifying DDoS Attacks by Hierarchical Clustering based on Similarity, Proceedings of the Fifth International Conference on Machine Learning and Cybernetics, Dalian, 13-16 August 2006, pp. 2712-2717.
[6] Chen R., Park J., Marchany R., A Divide-and-Conquer Strategy for Thwarting Distributed Denial-of-Service Attacks, IEEE Transactions on Parallel and Distributed Systems, Volume PP, Issue 99, 2007, Page(s): 1 – 14.
[7] White paper by Prolexic Technologies, Distributed Denial of Service Attacks: Protect Your Site from this Growing Threat, Retrieved on February, 2007. Available at: http://www.prolexic.com/news/Prolexic_NewWhitePaper.pdf
[8] Hemraj Saini, Dinesh Saini, "Cyber Defense Architecture in Campus Wide Network," 3rd International Conference on Quality, Reliability and INFOCOM Technology (Trends and Future Directions), 2-4 December, 2006, Indian National Sciences and Academics, New Delhi (India).
[9] Nevil Brownlee, University of Auckland and kc claffy, Cooperative Association for Internet Data Analysis, "Internet Measurement", IEEE Internet Computing, 2004, pp. 30-34.
[10] M. Mahoney, Network Traffic Anomaly Detection Based on Packet Bytes, Proceedings of the 2003 ACM Symposium on Applied Computing, Melbourne, March 2003.
[11] Hurst, H. E. 1951. Long-term storage capacity of reservoirs. Trans. Am. Soc. Civil Engineers 116: 770–799.