A MONITORING SYSTEM FOR MITIGATING FAST PROPAGATING WORMS IN THE NETWORK INFRASTRUCTURE

Miguel Vargas Martin
University of Ontario Institute of Technology
email: [email protected]

Abstract

Typically, intrusion detection systems deal with detection and response to a computer worm itself, but not with the collateral damage caused by the worm’s propagation. We present a monitoring system that classifies outbound packets within a router. This classification scheme results in a dynamic bandwidth share for packets where those that repeat disruptively are put into busy queues, whereas the rest are put into emptier queues. One of the major advantages of this approach is that the diagnosis of worm activity is less relevant since any disruptive traffic (worm or otherwise) will get limited bandwidth, consequently throttling some polymorphic worms, encrypted worms, denial-of-service (DoS) and distributed DoS attacks, abusive use of network services, and congestion due to flash crowds. There are some limitations to this system, all of which are acceptable in many applications.

Keywords: Network security; worms; denial-of-service; flash crowds.

1. Introduction

Fast propagating worms pose a significant threat to the stability of the Internet, primarily because they consume a disproportionate share of network bandwidth. Most worms also damage infected systems themselves by consuming resources, disrupting legitimate users, compromising confidential information, or corrupting code and data. While such damage is important to the users of a given system, it is of no consequence to users of other systems. Conscientious users and organizations can generally patch and configure their systems so that they will be immune to most worms; nobody, however, is immune to the effects of worms infecting more vulnerable hosts. While a given worm may infect only thousands of machines, it can often disrupt the communications of millions of other hosts. This network flooding can be so extreme that virtually all other traffic is disrupted along some connections.

While intrusion detection and prevention systems have the potential to effectively handle many kinds of worms, the performance of such systems is limited by the fact that worms can propagate using packets that are arbitrarily similar to legitimate network traffic. Indeed, as the “slashdot effect” has demonstrated, perfectly legitimate network traffic can be just as damaging as a propagating worm. Thus, while systems that distinguish between legitimate and malicious network traffic are becoming essential to maintaining network stability and security, they will need to be complemented by others that can effectively manage the collateral damage of network floods, whether such traffic is malicious in origin or not.

Our contribution is a mechanism that would allow routers to classify and prioritize traffic such that disruptive network flows are not allowed to consume bandwidth needed by other users and services. Our system keeps track of the number of times each packet is repeated based on a packet-subset. To count packet-subsets on-line we use Bloom filters with counters [1, 2]. The idea is that, upon congestion, packets are classified into equivalence classes. The classification of a packet is based on the number of times its packet-subset has been forwarded. If we assume that worm and other disruptive network traffic is classified into only a few equivalence classes, and that “non-disruptive” traffic is instead partitioned into many classes, then most dropped packets will belong to the disruptive flows and most non-disruptive traffic will be forwarded while experiencing less congestion.

The rest of this document is organized as follows: In §2 we review related work on packet classification techniques, worm mitigation, and traffic shaping. In §3 we describe the architecture and the classification of our system. We conclude in §4 with an analysis of the system with respect to polymorphic and encrypted worms, denial-of-service (DoS), distributed DoS (DDoS), abusive use of network services, and flash crowds.

2. Related Work

Disproportionate transmission of packets disrupts network resources, causing congestion. One approach against disruptive traffic consists of mitigating congestion. Classification of packets into queues has been extensively proposed to alleviate network congestion problems. In this section we give an overview of some approaches, including packet classification, router throttling, and traffic shaping techniques.

The current Linux implementation of traffic shaping [3, 4] is mainly based on the differentiated service (DS) field of the IP header. The Linux implementation also supports traffic shaping based on packet arrival rate, known as “metering”. This implementation is not enough to cope with high volumes of traffic in core routers.

Different router-based packet classification techniques with different levels of granularity have been proposed. Nagle [5, 6] introduces a fair queuing algorithm with packet classification granularity based on network flows (i.e., connections between hosts). Nagle’s approach consists of keeping a different queue for each source host of the network. Demers et al. [7] modify Nagle’s algorithm in such a way that classification is performed according to a combined criterion involving source host and packet sizes. Floyd et al. [8] present a random early detection (RED) system based on probabilistic packet dropping in routers, in such a way that all connections encounter the same loss rate. Lin et al. [9] notice that RED may be unfair to low-bandwidth TCP packets in certain situations and propose a flow random early drop (FRED) system. FRED favours flows which have fewer queued packets than the average number of queued packets over all flows. In addition, FRED favours flows which slow down after one of their packets is dropped. Trying to overcome the network overheads imposed by FRED, Ertemalp et al. [10] present a router-based dynamic buffer limiting system, which classifies packets into flows.

Yau et al. [11] present mechanisms for rate reduction of the same nature as TCP congestion control, whereas Mahajan et al. [12] present a similar system based on aggregates. Mahajan et al. introduce a “pushback” mechanism against DoS attacks. This mechanism consists of analyzing dropped packets in order to find signatures of aggregates responsible for congestion (without distinguishing malicious from non-malicious packets) and subsequently filtering these signatures into a delay queue. Further, routers share signatures with upstream routers, which in turn filter traffic matching these signatures.

The pushback mechanism is implemented by Ioannidis et al. [13] as a router-based defence against DDoS attacks. Their system identifies disruptive traffic by analyzing the destination addresses of packets dropped due to congestion. The destination address (or a prefix of it) is then considered as the signature of a disruptive packet. Pushback may be costly and complex to deploy. The approach requires that all routers in the path between the attacker and the victim implement pushback so that the rate of suspicious packets can be moderated upstream. Also, in pushback, communication among routers would take place in a disrupted environment, which may make the communication hard, if not impossible.

Hussain et al. [14] propose a framework for classifying DoS attacks as single-source or multiple-source by looking for hints in packet header fields such as the fragment identification field (ID) and the time-to-live field (TTL), by monitoring for sudden increases in packet transmission rates, and, in a more involved approach, by spectral content analysis. Wang et al. [15] propose a protection mechanism to counter DDoS attacks. Their system classifies packets into data and control segments, taking into account the DS fields in the IP header. Their system uses a three-level classification mechanism to accommodate packets into different queues.

Tanachaiwiwat et al. [16] present a router-based packet classification scheme based on source IP address reputation. Wang et al. [17] propose a network IDS based on n-gram analysis (an n-gram is a consecutive sequence of n characters or symbols in a text document). The idea is to train the IDS with traffic containing no malicious packets (they report on 1-gram analysis only). After the training period, the IDS statistically compares the n-gram frequency distribution of each arriving packet with the previously computed n-gram frequency distribution. If the result of this comparison is above a threshold, the packet is considered malicious. In terms of network overhead, the number of tasks involved in their n-gram analysis makes its practical application questionable. Our monitoring system imposes less network overhead, which makes its implementation in core routers more feasible.

The problem of detecting disruptive flows can be mapped to the problem of detecting flash crowds (i.e., a sudden, massive number of requests to the same web site). Jung et al. [18] find characteristics of flash crowds. They use these characteristics to design a dynamic load-balancing algorithm for web caches. Chen et al. [19] present a flash crowd mitigation system based on request regulation for web servers. The system is based on the observation that high-bandwidth applications (i.e., applications with request rates above a pre-defined threshold) are more sensitive to flash crowds than low-bandwidth ones. The idea is to monitor the response rate of high-bandwidth applications and the rate of request arrivals, and to compare these two against the corresponding long-term averages. A flash crowd signal is sent to a request regulator (which, in turn, throttles the request arrivals) if the response rate of the fast connections decreases (below a pre-defined threshold compared to the long-term average) and the arrival rate of requests increases (above a pre-defined threshold compared to the long-term average). Details on the policies for the request regulator are not provided. The choice of parameters, as well as the mechanisms of the algorithms, are not studied.

3. The Monitoring System

The monitoring system is based on the typical components of traffic shaping and Bloom filters with counters (BFWC). The main idea of our monitoring system consists of classifying packets dynamically, based on the number of times packets are forwarded. Packets found to consume disruptive amounts of bandwidth within a short time period will be classified into busier queues. This classification does not stop worms from propagating but limits their speed of propagation and their bandwidth consumption, thus reducing the degradation of network resources. In this sense, countermeasures consist of merely delaying disruptive traffic up to the point that all the applications make a more equitable use of bandwidth.

3.1. Architecture

Our system uses the three typical components of traffic shaping: classification, queuing, and scheduling. Classification consists of identifying and categorizing packets into different classes. Different classes of traffic are put into different queues (some queues may accommodate more than one class). The scheduler decides what queue will be served next. In our approach, classes change dynamically according to significant traffic fluctuations. An important aspect of the approach is the definition of appropriate classes, and evolving this definition dynamically. The architecture is depicted in Fig. 1. The idea is to have an in-line BFWC which counts packet-subset repetitions and defines classes based on this information (see §3.3).

Fig. 1. Architecture of the monitoring system. (Figure labels: incoming traffic, classifier, BFWC, queues, scheduler, shaped traffic.)

3.2. Bloom Filters with Counters

To achieve packet content granularity, the classification method we propose uses Bloom filters with counters. BFWC are based on conventional binary Bloom filters [20], which have been used in several different contexts since they were introduced (see for example [2, 21, 22, 23, 24, 25]). A (conventional) Bloom filter is a hash-based method for testing membership of a series of items in a large given set of items, with allowable errors. The idea is to insert a given set of items into a bit-array, or Bloom-table, as follows. Each item is hashed by k independent hash functions hi, for i = 1,..., k. Each resulting hash is interpreted as an index pointing to an entry of the Bloom-table. Then the corresponding entry of the Bloom-table is set to 1 (initially, each entry is set to 0). To test membership of an item p, p is hashed and, if the Bloom-table entry at index hi(p) is 1 for all i = 1,..., k, it is inferred that p is already in the Bloom-table. Fan et al. [1] introduced BFWC applied to web cache techniques. In BFWC the Bloom-table has one counter (instead of one bit) per entry. We use BFWC in routers as part of the classification criteria. Fig. 2 depicts a BFWC.

Fig. 2. Bloom filter with counters (BFWC). (Figure labels: a packet-subset p is hashed to b-bit indexes h1(p), h2(p), ..., hk(p) into a Bloom-table of m = 2^b entries, each holding a c-bit counter; counter[h1(p)], counter[h2(p)], ..., counter[hk(p)] are incremented.)

To be able to detect packet repetition we must look only at those portions of the packet that do not change. Our selection criterion for extracting a packet-subset p from a packet P is to retain those portions of P which we expect to be identical in a propagating worm (see discussion in §4). For example, the destination IP address would vary; therefore this field should be excluded. For most worms, however, the destination port (dst_port) is expected to be constant. At the same time, we want the packet-subset to contain enough information to avoid non-identical packets having the same packet-subset. Thus, we propose assembling packet-subsets of the form [dst_port, payload] (see discussion in [2]).

The selection of appropriate parameters for the BFWC (e.g., nature and number of hash functions, amount of memory required (mc), etc.) has been extensively studied (see Vargas Martin et al. [2] and references therein).
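The BFWC described in this section can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation evaluated in [2]: the class name `BFWC`, the helper `packet_subset`, the use of salted BLAKE2b to derive the k hash functions, the saturating counters, and the default parameter values are all hypothetical choices made for the example.

```python
import hashlib


class BFWC:
    """Bloom filter with counters: m = 2**b entries, each a c-bit counter."""

    def __init__(self, b: int = 16, c: int = 8, k: int = 3):
        self.m = 1 << b          # number of Bloom-table entries, m = 2^b
        self.t = (1 << c) - 1    # largest value a c-bit counter can hold
        self.k = k               # number of hash functions
        self.table = [0] * self.m

    def _indexes(self, item: bytes):
        # Assumption: the k hash functions are BLAKE2b with k different salts.
        for i in range(self.k):
            digest = hashlib.blake2b(
                item, digest_size=8, salt=i.to_bytes(2, "big")
            ).digest()
            yield int.from_bytes(digest, "big") % self.m

    def add(self, item: bytes) -> int:
        """Increment the counters of item; return the minimum of them."""
        # A set avoids incrementing one entry twice if two hashes coincide.
        idx = set(self._indexes(item))
        for j in idx:
            if self.table[j] < self.t:   # assumption: counters saturate at t
                self.table[j] += 1
        return min(self.table[j] for j in idx)

    def count(self, item: bytes) -> int:
        """Inferred number of repetitions: minimum of the k counters."""
        return min(self.table[j] for j in self._indexes(item))


def packet_subset(dst_port: int, payload: bytes) -> bytes:
    """Assemble a packet-subset of the form [dst_port, payload]."""
    return dst_port.to_bytes(2, "big") + payload
```

For instance, calling `bf.add(packet_subset(80, payload))` once for every forwarded packet keeps the table up to date, and `bf.count(...)` then yields the inferred repetition count used for classification. Because different items may share counters, this count can overestimate the true number of repetitions but, as long as counters have not saturated, it never underestimates it.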

3.3. Packet Classification

Packets are classified into a number of queues, q. The router’s administrator sets the number of queues depending on the characteristics of the network and the fields used in the packet-subset. Packet classification takes place only upon congestion. Congestion can be detected by monitoring the rate of dropped packets. A threshold of dropped packets can be used to set on (or off) a congestion flag when congestion actually occurs. The packet-subset p of every incoming packet P is processed by the BFWC. If the congestion flag is on, P will be classified according to the following rules:

If 1 ≤ t0 ≤ z, P is put into queue 1;
If z+1 ≤ t0 ≤ 2z, P is put into queue 2;
If 2z+1 ≤ t0 ≤ 3z, P is put into queue 3;
…
If (q-1)z+1 ≤ t0 ≤ t, P is put into queue q;

where z = t/q (for t > q); t is the maximum possible value of every counter of the Bloom-table (t = 2^c - 1, cf. Fig. 2); and t0 is the minimum value of the k corresponding counters of p in the Bloom-table (i.e., the inferred number of repetitions of p).
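The classification rules above amount to a single integer division, which the following Python sketch illustrates. The function name `queue_for` is hypothetical, and the sketch assumes that z is taken as the integer part of t/q, with any remainder absorbed by the last queue; the text itself only states z = t/q.

```python
def queue_for(t0: int, t: int, q: int) -> int:
    """Map the inferred repetition count t0 (1 <= t0 <= t) to a queue in 1..q."""
    if not 1 <= t0 <= t:
        raise ValueError("t0 must satisfy 1 <= t0 <= t")
    if q < 1 or q > t:
        raise ValueError("the number of queues must satisfy 1 <= q <= t")
    z = t // q  # assumption: integer class width (the text defines z = t/q)
    # Counts 1..z map to queue 1, z+1..2z to queue 2, and so on;
    # the last queue also absorbs the remainder up to t.
    return min((t0 - 1) // z + 1, q)
```

With 8-bit counters (t = 255) and q = 4 queues, z = 63, so packet-subsets seen up to 63 times go to queue 1, while those seen 190 times or more go to queue 4. How the scheduler then serves each queue is not fixed by these rules and is left open here.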

4. Analysis and Concluding Remarks

The effectiveness of our monitoring system against malicious code depends substantially on the parameters of the BFWC. Our system may be effective against polymorphic worms (i.e., those changing their representation on each new infection; see, e.g., Nachenberg [26]) and encrypted worms (i.e., those which encrypt the bulk of their payloads using a different key per infection), if we are able to capture the unchanging parts of the packets. One would think that the destination port would not change; however, it is known that some worms may even vary their destination port (e.g., the Witty worm [27] includes a random padding and variable destination port in each packet). To deal with polymorphic worms which change only part of their payload, we could implement a fingerprinting technique (see [28, 29]). Identifying these unchanged portions is a problem that deserves to be addressed in future work. Similarly, our monitoring system can throttle DoS and DDoS attacks if the packet-subset is composed of the unchanging portions of the packets. Our system may effectively throttle abusive use of network services and flash crowds, e.g., by using the destination port as the only part of the packet-subset. Future experimentation will provide a better understanding of the effectiveness of our monitoring system.

References

[1] L. Fan, P. Cao, J. Almeida, and A.Z. Broder, “Summary cache: A scalable wide-area Web cache sharing protocol,” IEEE/ACM Trans. on Networking, vol. 8, no. 3, pp. 281–293, June 2000.
[2] M. Vargas Martin, J.-M. Robert, and P.C. Van Oorschot, “A monitoring system for detecting repeated packets with applications to computer worms,” Technical Report TR-04-02, School of Computer Science, Carleton Univ., June 2004. URL: http://www.scs.carleton.ca/research/tech_reports/2004/TR-04-02.1.pdf [Accessed: January 4, 2005].
[3] W. Almesberger, “Linux traffic control – implementation overview,” Technical Report SSC/1998/037, EPFL, November 1998. URL: ftp://lrcftp.epfl.ch/pub/people/_almesber/pub/tcio-current.ps.gz [Accessed: February 12, 2004].
[4] W. Almesberger, “Linux traffic control – next generation,” Proc. 9th Int’l Linux System Technology Conference (Linux-Kongress 2002), Ottawa, Canada, September 2002.
[5] J. Nagle, “On packet switches with infinite storage,” IETF RFC 970, 1985.
[6] J. Nagle, “On packet switches with infinite storage,” IEEE Trans. on Communications, vol. 35, pp. 435–438, 1987.
[7] A. Demers, S. Keshav, and S. Shenker, “Analysis and simulation of a fair queuing algorithm,” Proc. Special Interest Group on Data Communication (SIGCOMM), Austin, USA, September 19–22, 1989.
[8] S. Floyd and V. Jacobson, “Random early detection gateways for congestion avoidance,” IEEE/ACM Trans. on Networking, vol. 1, no. 4, pp. 397–413, 1993.
[9] D. Lin and R. Morris, “Dynamics of Random Early Detection,” Proc. Special Interest Group on Data Communication (SIGCOMM), pp. 127–137, Cannes, France, September 14–18, 1997.
[10] F. Ertemalp, D.R. Cheriton, and A. Bechtolsheim, “Using dynamic buffer limiting to protect against belligerent flows in high-speed networks,” Int’l Conf. on Network Protocols (ICNP), pp. 230–240, Riverside, USA, November 11–14, 2001.
[11] D.K.Y. Yau, J.C.S. Lui, and F. Liang, “Defending against distributed denial-of-service attacks with max-min fair server-centric router throttles,” Proc. of the IEEE Int’l Workshop on Quality of Service, Miami Beach, USA, May 2002.
[12] R. Mahajan, S.M. Bellovin, S. Floyd, J. Ioannidis, V. Paxson, and S. Shenker, “Controlling high-bandwidth aggregates in the network,” ACM SIGCOMM Computer Communication Review, vol. 32, no. 3, pp. 62–73, 2002.
[13] J. Ioannidis and S.M. Bellovin, “Router-based defense against DDoS attacks,” Proc. of Network and Distributed System Security Symp. (NDSS), San Diego, USA, February 2002.
[14] A. Hussain, J. Heidemann, and C. Papadopoulos, “A framework for classifying denial of service attacks,” Proc. of the Special Interest Group on Data Communication (SIGCOMM), pp. 99–110, Karlsruhe, Germany, August 25–29, 2003.
[15] H. Wang and K.G. Shin, “Transport-aware IP routers: A built-in protection mechanism to counter DDoS attacks,” IEEE Trans. on Parallel and Distributed Systems, vol. 14, no. 9, pp. 873–884, September 2003.
[16] S. Tanachaiwiwat and K. Hwang, “Differential packet filtering against DDoS flood attacks,” May 19, 2003. URL: http://gridsec.usc.edu/papers/ACMSecurity509pdf.pdf [Accessed: March 1, 2004].
[17] K. Wang and S.J. Stolfo, “Anomalous payload-based network intrusion detection,” Proc. of the 7th Int’l Symp. on Recent Advances in Intrusion Detection (RAID 2004), Sophia Antipolis, France, September 15–17, 2004.
[18] J. Jung, B. Krishnamurthy, and M. Rabinovich, “Flash crowds and denial of service attacks: Characterization and implications for CDNs and web sites,” Proc. of the 11th Int’l World Wide Web Conf., pp. 293–304, Honolulu, USA, May 7–11, 2002.
[19] X. Chen and J. Heidemann, “Flash crowd mitigation via adaptive admission control based on application-level measurement,” Technical Report ISI-TR-557, Univ. Southern California, May 2002. URL: http://www.isi.edu/~johnh/PAPERS/Chen02a.html [Accessed: April 1, 2004].
[20] B.H. Bloom, “Space/time trade-offs in hash coding with allowable errors,” Comm. of the ACM, vol. 13, no. 7, pp. 422–426, July 1970.
[21] S. Dharmapurikar, P. Krishnamurthy, and D. Taylor, “Longest prefix matching using Bloom filters,” Proc. of the Special Interest Group on Data Communication (SIGCOMM), pp. 201–212, Karlsruhe, Germany, August 25–29, 2003.
[22] S. Dharmapurikar, P. Krishnamurthy, T. Sproull, and J. Lockwood, “Deep packet inspection using parallel Bloom filters,” Symp. on High Performance Interconnects (HotI), pp. 44–51, Stanford, USA, August 2003.
[23] E.-J. Goh, “Secure indexes,” Cryptology ePrint Archive, Report 2003/216, 2003. URL: http://eprint.iacr.org/2003/216/ [Accessed: January 7, 2004].
[24] A. Kumar, J. Xu, L. Li, and J. Wang, “Space-code Bloom filter for efficient traffic flow measurement,” Proc. of IMC, Miami Beach, USA, October 27–29, 2003.
[25] A.C. Snoeren, C. Partridge, L.A. Sanchez, C.E. Jones, F. Tchakountio, S.T. Kent, and W.T. Strayer, “Hash-based IP traceback,” Proc. of the Special Interest Group on Data Communication (SIGCOMM), San Diego, USA, August 27–31, 2001.
[26] C. Nachenberg, “Computer virus-antivirus coevolution,” Comm. of the ACM, vol. 40, no. 1, pp. 46–51, January 1997.
[27] C. Shannon and D. Moore, “The spread of the Witty worm,” 2004. URL: http://www.caida.org/analysis/security/witty/ [Accessed: June 18, 2004].
[28] S. Singh, C. Estan, G. Varghese, and S. Savage, “Automated Worm Fingerprinting,” Proc. of the 6th Symp. on Operating Systems Design and Implementation (OSDI 2004), San Francisco, USA, December 6–8, 2004.
[29] H.-A. Kim and B. Karp, “Autograph: Toward automated, distributed worm signature detection,” Proc. of the 13th USENIX Security Symposium (Security 2004), San Diego, USA, August 9–13, 2004.
