Exhaust: Optimizing Wu-Manber Pattern Matching for Intrusion Detection using Bloom Filters

Monther Aldwairi (a, b), Koloud Al-Khamaiseh (c)

(a) College of Technological Innovation, Zayed University, Abu Dhabi, P.O. Box 144534, U.A.E.
(b) Department of Network Engineering and Security, Jordan University of Science and Technology, Irbid 22110, Jordan
(c) Department of Electrical Engineering, Tafila Technical University, Tafila 66110, Jordan
Abstract: Intrusion detection systems are widely accepted as one of the main tools for monitoring and analyzing host and network traffic to protect data from illegal access or modification. Almost all types of signature-based intrusion detection systems must employ a pattern matching algorithm to inspect packets for malicious signatures. Unfortunately, pattern matching algorithms dominate the execution time and have become the bottleneck. To remedy that, we introduce a new software-based pattern matching algorithm that modifies the Wu-Manber pattern matching algorithm using Bloom filters. The Bloom filter acts as an exclusion filter to reduce the number of searches of the large HASH table. The HASH table is accessed only if there is a probable match, represented by a shift value of zero. On average the HASH table search is skipped 10.6% of the time, with a worst-case average running time speedup over Wu-Manber of 33%. The maximum overhead incurred on preprocessing time is 1.1% and the worst-case increase in memory usage is limited to 0.33%.

Keywords: Bloom Filters; Intrusion Detection Systems; Pattern Matching; Network Security; Wu-Manber
978-1-4799-8172-4/15/$31.00 ©2015 IEEE

I. INTRODUCTION

The information technology revolution has put all kinds of sensitive data within the reach of a growing number of Internet users. Unsurprisingly, an increasing number of emerging attacks affect networks and personal computers. Attack spreading rates are directly related to network speeds and Internet traffic, which is doubling every six months [1]. Intrusion detection systems (IDSs) have been widely used to monitor and analyze network traffic in an effort to detect and prevent malicious content. IDSs are classified by approach as anomaly-based or signature-based. Anomaly detection uses machine learning techniques to form a profile of normal system behavior. This profile is expressed by using rule-based languages to specify behavior patterns or by using statistical analysis on system calls. More specifically, traffic features such as flags, protocol types, error rate and quality of service are used to construct normal behavior profiles. On the other hand, signature-based intrusion detection compares incoming and outgoing packets against predefined attack signatures to detect previously known attacks. It often relies on exact pattern matching algorithms and is therefore considerably faster and more accurate than anomaly detection. Its main disadvantages are its dependence on the accuracy of the signatures and the fact that signatures are manually crafted, which leaves systems vulnerable to new attacks until a signature has been drafted. Most of today's IDSs are software signature-based systems, deployed offline on general-purpose computers, and are efficient at low speeds of about 100 Mbps [2]. Though anomaly-based IDSs can detect unknown attacks, they suffer from high false positive rates and slow speeds, which leaves signature-based IDSs the most widely deployed systems [3].

Snort is a popular open source signature-based intrusion detection system, commonly used both in practice and among researchers [4]. Similar to most signature-based IDSs, Snort scans packet header and payload fields searching for a set of predefined signatures. Snort is typically software running on general-purpose machines and cannot match the increasing network speeds. To speed up Snort, countless hardware solutions have been suggested, but they suffer from high cost and low configurability. Software-based IDS remains the favorite choice for host-based IDS and many small networks [5]. The main component that dictates IDS performance is the pattern matching algorithm. It has been found that Snort spends at least 70% of its processing time doing pattern matching [6]. In addition, the number of signatures is on the rise with the increasing attack rates. 87% of the 2003 Snort rules contain signatures to match against [7]. Hence, accelerating pattern matching for intrusion detection is still an open problem.

In this paper we propose a new modified Wu-Manber algorithm that employs Bloom filters in a novel manner to exclude unnecessary HASH table searches. To evaluate the algorithm, a complete theoretical and experimental evaluation is brought forward. Section II reviews the necessary background to better understand the problem. It explains Snort rules, pattern matching algorithms and Bloom filter theory. Section III surveys the related work and points out advantages and drawbacks of existing approaches. Section IV describes the detailed structure of the new algorithm and Section V delivers a full set of theoretical and experimental results to evaluate the proposed algorithm. Finally, Section VI concludes the paper.
II. BACKGROUND

First we examine Snort and pattern matching for IDS. Next we explain Wu-Manber in detail, step by step. Finally, the theory of Bloom filters is briefly introduced.
A. Snort

Snort describes attacks using a rule-driven language, where each rule consists of a header and several option fields. The header field specifies the protocol type, the packet source and destination IP addresses, and the port numbers. The option fields contain the attack signatures in addition to other qualifiers such as nocase for case-insensitive matching, sid for the signature identification number and rev for the rule revision number [8]. A simplified Snort rule from the v2.8 ddos.rules file is shown in Fig 1. The rule reads as follows: fire an alert if a TCP packet from the external network is destined to any machine in the local network on port 27665 and contains the pattern "gOrave". This rule is designed to catch the activation password of Trin00, which was one of the first DDoS programs [9]. IDSs spend most of their time trying to locate the signature (e.g., "gOrave") in the packet payload. Snort uses several algorithms to perform pattern matching, such as Boyer-Moore (BM) variants, Aho-Corasick (AC) or Wu-Manber (WM).

B. Pattern Matching for IDS

Pattern matching algorithms are classified into single and multiple pattern matching algorithms. Single pattern matching algorithms, such as the BM algorithm, are simpler and search for one pattern at a time. The BM algorithm aligns the pattern with the beginning of the searched packet and shifts the pattern along the packet until a match or the end of the packet is found. In BM, the search time increases linearly with the size of the packet and of the patterns database [10]. Multiple pattern algorithms search for all patterns at one time and generally require preprocessing and search phases. The preprocessing phase digests the patterns into an easy-to-search representation. Aho-Corasick [11] and Wu-Manber are two different examples of such algorithms. The Wu-Manber algorithm is hash table based and on average performs better than AC [12].

C. Wu-Manber Algorithm

Wu-Manber is a high-performance multiple pattern matching algorithm that extends the ideas of the BM algorithm by operating on a block of B characters. When WM finds a mismatch, it shifts the window by a safe distance, without missing any occurrences. Wu-Manber has two phases: preprocessing and scanning.

1) Preprocessing. In this phase, the algorithm computes the minimum pattern length m. Then WM considers only the first m characters of each pattern to build three tables: SHIFT, HASH and PREFIX. The SHIFT table contains the safe shift distance. WM finds each possible substring of size B characters, call it X, and hashes it to produce an integer that indexes the SHIFT table. The SHIFT table holds the number of characters by which the window must be shifted in the packet and is computed as follows. Considering each pattern Pi, there are two possibilities when calculating the safe shift for each substring X.

i. X does not appear in any pattern. It is possible to shift the search window on the packet by (m - B + 1) characters to the right. This is the default shift value for the table.

ii. X appears in some of the patterns. In this case, the rightmost occurrence, q, of X in any pattern is located, and the shift value is (m - q).

The HASH table uses the same hash function as the SHIFT table and maps the last B characters of all patterns with a zero shift value. This table is quite sparse because each entry holds only the patterns that share the same last character block. The value of HASH[i] is a pointer, p, into the PREFIX table, which leads to a list of pointers to the patterns whose last B characters hash into i. The PREFIX table is used to filter the patterns that have the same suffixes and different prefixes, because patterns that share both prefixes and suffixes are rare. Experimentally, a block size of two or three appeared to be a favourable choice [12].

2) Searching. This phase traverses the packet using a sliding window of size B or more. WM computes the hash value for the current block from the packet and uses it to index the SHIFT table. If the SHIFT value is greater than zero, WM shifts the sliding window over the packet, computes the hash value for the new block and repeats the process. Otherwise, if the SHIFT value is zero, the text to the left of the search position might be one of the pattern strings. Therefore, the HASH and PREFIX tables are checked to match the full pattern against the packet. WM search time does not increase with increasing pattern size, which makes it outperform other algorithms for longer strings. Wu-Manber achieves very good average-case performance compared to the other algorithms. The current Snort implementation uses a slightly modified version of Wu-Manber (MWM) [13], which is discussed later in the related work section.

D. Theory of Bloom Filters

Bloom filters are easy to program and reprogram. They use little space and are fast to query for membership. They can tell us with certainty that a string is not a member of a string set. The Bloom filter preprocesses the string set and computes multiple hash functions on each string. Then it sets the bits that correspond to the hash values in a bitmap of size m (here m denotes the bitmap size, not the minimum pattern length of Section II.C) [14]. Afterwards, the filter can be queried for the membership of a new string. This is done by computing the same hash functions for the new string and checking the corresponding bits in the bitmap. If any of the corresponding bits is not set, then the new string does not belong to the programmed set with 100% confidence. That is, Bloom filters have no false negatives. Conversely, if all the bits are set then the new string is a possible match, that is, possibly one of the programmed signatures. A pattern matching algorithm is required to verify whether there is an actual match. Therefore, Bloom filters have a probability of false positives.
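The programming and query steps described above can be illustrated with a minimal Python sketch. It is not the implementation used in the paper: the class name `BloomFilter`, the bitmap size of 1024 bits, the use of three hash functions and the choice of salted SHA-256 as the hash family are all assumptions made only for illustration.

```python
import hashlib


class BloomFilter:
    """Bit-vector Bloom filter with k hash functions (illustrative sketch)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Assumption: derive k bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        # Programming: set every bit selected by the hash functions.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        # False means "definitely not programmed" (no false negatives);
        # True means "possibly programmed" (false positives can occur).
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


# Hypothetical signature set used only for this example.
bf = BloomFilter(num_bits=1024, num_hashes=3)
for signature in (b"gOrave", b"attack", b"exploit"):
    bf.add(signature)
```

A query that returns False can be trusted outright, while a query that returns True still has to be confirmed by an exact comparison, which is the role the pattern matching algorithm plays in the text above.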
alert tcp $EXTERNAL_NET any -> $HOME_NET 27665 (msg:"DDOS Trin00 Attacker to Master default password"; content:"gOrave"; sid:234)

Fig. 1. Sample Snort rule
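As a rough illustration of how a signature can be pulled out of a rule such as the one in Fig. 1, the Python sketch below extracts the strings of the content and uricontent options and expands hexadecimal runs written between pipe characters. The helper names `decode_content` and `extract_signatures` and both regular expressions are hypothetical; the sketch ignores escaped quotes and the other rule options, and it is not the extraction tool used by the authors.

```python
import re

# Assumption: option values are double-quoted and contain no escaped quotes.
CONTENT_RE = re.compile(r'\b(?:uri)?content\s*:\s*!?\s*"([^"]*)"')
HEX_RE = re.compile(r"\|([0-9A-Fa-f\s]*)\|")


def decode_content(raw: str) -> bytes:
    """Turn a content string into bytes, expanding |41 42| style hex runs."""
    out = bytearray()
    pos = 0
    for match in HEX_RE.finditer(raw):
        out += raw[pos:match.start()].encode("latin-1")
        out += bytes.fromhex("".join(match.group(1).split()))
        pos = match.end()
    out += raw[pos:].encode("latin-1")
    return bytes(out)


def extract_signatures(rule: str) -> list:
    """Return the decoded content and uricontent strings of one rule."""
    return [decode_content(raw) for raw in CONTENT_RE.findall(rule)]


RULE = ('alert tcp $EXTERNAL_NET any -> $HOME_NET 27665 '
        '(msg:"DDOS Trin00 Attacker to Master default password"; '
        'content:"gOrave"; sid:234)')
signatures = extract_signatures(RULE)
```

For the rule of Fig. 1 the only extracted signature is the byte string for "gOrave", which is what the pattern matching algorithm has to locate in the payload.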
III. RELATED WORK

There has been a flux of hardware and software approaches to speed up pattern matching for IDS. Dharmapurikar et al. [15] proposed parallel Bloom filters in hardware to exclude packets that do not contain signatures. Because a Bloom filter works only on strings of the same length, they were forced to use a separate Bloom filter for each signature length. Given that pattern lengths range from 1 to over 250 characters, that translates into a large number of filters. In this paper Bloom filters are used in a different context: the filter is injected within Wu-Manber to reduce the number of expensive searches of the sparse HASH table. Because only B-character substrings are programmed, we need only one Bloom filter.

QWM [16] paired the idea of the Quick Search (QS) algorithm [17] with the mismatch information gathered during the matching phase in order to reach the maximum shift distance. QWM uses the idea of the QS algorithm to determine whether the current text is the prefix of any pattern and, if so, uses Wu-Manber. To do that, a fourth table, HEAD, is added to WM to hold the first two characters of the patterns. The HEAD table is used to decide whether the first two characters in the matching window are the prefix of a pattern. QWM performs better than Wu-Manber only in the case of large alphabets such as English and Chinese texts. Additionally, the memory usage of QWM is considerably larger than that of Wu-Manber, due to the addition of the HEAD table.

The current version of Snort implements a slightly modified version of Wu-Manber. Because WM is conservative, in that the maximum possible shift is m - B + 1, which depends on the minimum string length, Snort uses a modified WM (MWM) [8]. The algorithm is designed to change the default shift value by examining the suffixes of the block.
The MWM can achieve a shift equal to the block size if the suffixes do not exist in any pattern.

IV. EXHAUST: WU-MANBER WITH BLOOM FILTERS

The proposed algorithm is called Exhaust, short for exclude hash table unnecessary searches time. Because most network traffic is benign, using Bloom filters to exclude unnecessary HASH table searches reduces the search time. Exhaust benefits from the fact that Bloom filters can exclude strings with one hundred percent certainty to reduce the number of times the HASH table has to be searched. This is achieved by inserting a Bloom filter after the SHIFT table and before accessing the sparse HASH table. The Bloom vector is programmed with the B-character blocks of the patterns contained in the HASH table, that is, the last B characters of each pattern's first m characters. The key idea is to reduce the number of times the HASH table is searched due to false alarms. Any time we encounter a block with a zero shift value, the HASH and PREFIX tables must be searched. But a zero shift value does not necessarily result in a match, and most of the time it is a false alarm. Because the block size is generally small, the search window often ends with the same characters as the suffix of a signature.

Exhaust preprocessing is similar to that of WM, except that the Bloom filter is also programmed with the last B characters of each pattern in the HASH table. The algorithm begins by finding the minimum pattern length m. Next the SHIFT table is initialized with the default shift value of m - B + 1. Remember that the SHIFT and HASH tables are indexed by calculating the same hash function on the character block. The next step is to calculate the shift values for all substrings (x) of size B of each pattern, as explained in the previous section. If the shift value is zero, the corresponding HASH and PREFIX entries are filled and the Bloom filter is programmed with the substring of size B.

To search incoming packets for attack signatures, Exhaust slides a window (w) over the packet and calculates the hash function on the B-character suffix to produce the index i for the SHIFT table. If the SHIFT[i] value is not zero, the window slides by the shift amount. Otherwise there is a probable match, so the Bloom filter is queried to verify whether the substring exists in the HASH table. If the Bloom filter does not return a match, then there is no need to search the HASH table; the window is shifted by B characters and the process repeats. If the Bloom filter returns a probable match, then the HASH and PREFIX tables are searched to find the exact match.

V. RESULTS AND ANALYSIS

Exhaust performance is evaluated through a comprehensive set of simulations performed on actual traces representing worst-case workloads. Subsection A explains the test environments and defines the metrics used to evaluate the algorithm.
Subsection B explains how the attack signatures are generated and the process of extracting signatures. Subsection C presents a detailed analysis of the traffic traces used in testing. Subsection D evaluates the performance improvements of Exhaust in terms of runtime and the number of times the HASH table search is skipped. Subsection E measures the overhead on preprocessing runtime and memory usage. Finally, Subsection F examines the false positive probability that results from adding the Bloom filter.

A. Test Environments and Evaluation Metrics

To evaluate the algorithm performance we use Snort rules extracted from the Snort 2.8.4.1 database released in July 2009 [18]. The simulation reads the packets and signatures from local files. Each experiment is repeated five times and the average is reported. To quantify the performance improvement we measure the speedup in terms of runtime. In addition, we measure the number of times the HASH table search is skipped courtesy of the Bloom filter. For that purpose we define the HAC and HSC metrics, where HAC is the HASH table access count and HSC is the HASH table skip count. An access means that the Bloom filter gives a probable match and the HASH table has to be searched. A skip takes place when the Bloom filter confirms non-membership and the HASH table search is skipped. The higher the HSC the better, because the time to query the filter is very small compared to that of searching the HASH table. In addition, to better understand the percentage of savings we define the HASH table access ratio (HAR) and the HASH table skip ratio (HSR). The ratios are calculated according to (1) and (2). Finally, to evaluate the overhead introduced by incorporating the Bloom filter within WM we measure the added preprocessing time and memory. In addition, we analyze the false positives resulting from adding the Bloom filter.

$$\mathrm{HAR} = \frac{\mathrm{HAC}}{\mathrm{HAC} + \mathrm{HSC}} \tag{1}$$

$$\mathrm{HSR} = \frac{\mathrm{HSC}}{\mathrm{HAC} + \mathrm{HSC}} \tag{2}$$
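To make the search procedure of Section IV and the metrics of (1) and (2) concrete, the following Python sketch builds the SHIFT and HASH tables, programs a small Bloom filter with the zero-shift blocks, and counts accesses (HAC) and skips (HSC) while scanning a packet. It is a toy model under stated assumptions, not the authors' code: `block_hash`, `TinyBloom`, the table size of 64, the block size B = 2 and the sample patterns are hypothetical, the PREFIX table is replaced by a direct comparison against each candidate pattern, and after a skip the window advances by one character (a conservative choice for the sketch, whereas the text above states a shift of B characters).

```python
import hashlib
from collections import defaultdict

B = 2            # block size (assumed small for the toy example)
TABLE_SIZE = 64  # tiny on purpose so that SHIFT-table collisions occur


def block_hash(block: bytes) -> int:
    """Hypothetical hash shared by the SHIFT and HASH tables."""
    h = 0
    for byte in block:
        h = (h * 31 + byte) % TABLE_SIZE
    return h


class TinyBloom:
    """Minimal Bloom filter over an integer bitmap (illustrative only)."""

    def __init__(self, num_bits: int = 512, num_hashes: int = 2) -> None:
        self.num_bits, self.num_hashes, self.bitmap = num_bits, num_hashes, 0

    def _positions(self, item: bytes):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bitmap |= 1 << pos

    def might_contain(self, item: bytes) -> bool:
        return all((self.bitmap >> pos) & 1 for pos in self._positions(item))


def preprocess(patterns):
    m = min(len(p) for p in patterns)
    shift = [m - B + 1] * TABLE_SIZE      # default shift value
    hash_table = defaultdict(list)        # bucket -> candidate patterns
    bloom = TinyBloom()
    for p in patterns:
        for q in range(B, m + 1):         # q = end position of block in p[:m]
            block = p[q - B:q]
            h = block_hash(block)
            shift[h] = min(shift[h], m - q)
            if q == m:                    # zero shift: fill HASH, program filter
                hash_table[h].append(p)
                bloom.add(block)
    return m, shift, hash_table, bloom


def exhaust_search(packet, patterns):
    m, shift, hash_table, bloom = preprocess(patterns)
    matches, hac, hsc = [], 0, 0
    pos = m                               # window is packet[pos - m:pos]
    while pos <= len(packet):
        block = packet[pos - B:pos]
        h = block_hash(block)
        if shift[h] > 0:
            pos += shift[h]
            continue
        if bloom.might_contain(block):    # probable match: access HASH table
            hac += 1
            start = pos - m
            matches += [(start, p) for p in hash_table[h]
                        if packet.startswith(p, start)]
        else:                             # excluded: skip HASH table search
            hsc += 1
        pos += 1                          # conservative one-character advance
    return matches, hac, hsc


def ratios(hac, hsc):
    """HAR and HSR as defined in (1) and (2); (0, 0) if nothing was counted."""
    total = hac + hsc
    return (hac / total, hsc / total) if total else (0.0, 0.0)


PATTERNS = [b"gOrave", b"attack", b"trinoo"]
PACKET = b"xx gOrave yy attack zz trinoo gOrav"
matches, hac, hsc = exhaust_search(PACKET, PATTERNS)
har, hsr = ratios(hac, hsc)
```

Because a Bloom filter has no false negatives, skipping the HASH table never hides a real match in this sketch; the filter only removes zero-shift hits that were caused by collisions in the small SHIFT table.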
B. Signatures Extraction

This work is only concerned with accelerating the pattern matching part of IDS, so we always assume that the packet headers match the rules. We therefore extract the content part of the Snort signatures and match it against the extracted packet payloads. The attack signatures programmed into Exhaust are extracted from the Snort 2.8.4.1 database released in July 2009 [18]. Exhaust supports all ASCII characters, including the non-printable characters, as well as the hexadecimal strings included in most Snort rules. Attack signatures are extracted from the content and uricontent keywords in the Snort rule set and used to program Exhaust. The uricontent keyword is similar to the content keyword except that it specifies strings in the URI part of an HTTP packet. Searching only the URI is more efficient than searching the whole packet. In the case of a pattern that is shorter than the block size (B=3), Exhaust concatenates that pattern with the previous pattern from the same rule, with one space character as a delimiter.

C. Traffic Analysis

We use the DEFCON17 Capture the Flag (CTF) game packet traces released in August 2009 [19]. DEFCON is the largest annual hacker conference. Table I shows, for each selected trace, the total number of packets, the number of packets with payload, and the percentage of packets with payload. We study the 78 CTF traces and find that, on average over all traces, 51.62% of the packets have payload. The percentage of packets with payload is considered an indication of the maliciousness of the trace. We pick the ten most malicious traces to represent the actual worst-case evaluation. In those, the percentage of packets with payload, taken as the indicator of malicious content, averages 57%.
TABLE I. MOST MALICIOUS TRAFFIC TRACES

Trace No   Number of Packets   Packets with Payload   Percentage
8          671143              383233                 57%
13         683770              398615                 58%
14         676657              389705                 58%
46         494466              280123                 57%
49         331508              188722                 57%
50         326101              190173                 58%
51         299746              168660                 56%
52         277840              159299                 57%
53         275483              155846                 57%
54         311546              178480                 57%
Average                                               57%
D. Speedup

We simulate Exhaust for different traces and an increasing number of signatures to evaluate the number of times Exhaust skips the HASH table search. Fig 2 shows the HASH table access and skip counts for a varying number of signatures for the highly malicious trace number 8. Naturally, as the number of signatures increases, the number of times the HASH table needs to be accessed increases, and so do the number of skips and the savings due to the filter. The savings increase slightly with an increasing number of signatures. Fig 3 shows the HASH table access and skip ratios for a varying number of signatures for trace 8. The HASH table is skipped 10.6% of the time on average. The savings range between 2.6% and 13.7%. The HSR is not correlated to the number of signatures as much as it is related to the actual contents of the signatures. For example, the HSR for 1500 signatures was 12% and for 2000 signatures was 10.9%, because the additional signatures resulted in more matches and hits than skips, therefore reducing the ratio.

Fig 4 plots the HAR and HSR for different traces using 3500 signatures. It can be clearly seen that the HSR depends on the content of the trace. Trace 52 has the highest HSR of 39.1% while trace 46 has the lowest HSR of 0.6%. This variation is attributed to the fact that some traces, such as trace 46, have many actual matches, whereas others, such as trace 52, include content that is close to being malicious but is not a full match. That is, content that happens to share the suffixes of many of the signatures, which results in a shift value of zero and is caught and skipped by the filter. Now we focus on traces 46 and 52, which represent the lowest and highest savings possible. Fig 5 shows the HAC and HSC for a varying number of characters for trace 52 when using all signatures. It can be clearly seen that the HAC and HSC counts increase as the number of characters increases, which confirms the earlier finding reported in Fig 2.
The scanning or search time is the most important metric for measuring the speedup. The overhead in preprocessing time is insignificant and is incurred only when programming the Bloom filter. We compare the running time of WM and Exhaust for all traces using all signatures. The average running time for WM is 8.912 s, while the average running time for Exhaust is 5.972 s, which amounts to a 33% speedup, measured as the relative reduction in running time: (8.912 - 5.972)/8.912 ≈ 0.33.
E. Preprocessing Overhead

The main overhead incurred from adding the Bloom filter to Wu-Manber takes place in the preprocessing phase, where Exhaust programs the filter. To measure the preprocessing overhead on runtime we compare the preprocessing running time of WM and Exhaust for a varying number of patterns. Fig 6 shows a small increase in the preprocessing runtime for Exhaust over WM. The largest overhead is 62 ms, which accounts for a 1.08% increase in preprocessing time. The average overhead is 50 ms, which is equivalent to a 0.8% increase. The algorithm scales very well with an increasing number of signatures, as demonstrated by the minimal increase in overhead time as the number of signatures grows.
Fig. 3. HAR and HSR for varying no. of signatures for trace 08
Fig 7 plots the memory usage in MB for both WM and Exhaust against a varying number of signatures. The chart shows linear scaling for both algorithms; that is, the memory usage increases linearly with the increasing number of signatures. The worst-case increase, measured at 4000 signatures, is 1,308 B, which is equivalent to a 0.33% increase. The average increase is 1,285 B, which is equivalent to a 0.32% increase in memory usage. This is a very small price to pay for the speedup presented earlier.

F. Reducing False Positives

In this section we examine reducing the false positive probability that results from adding the Bloom filter. The filter provides 100% true negatives, which save execution time, but it cannot provide 100% true positives. False positives (FP) result in the penalty of searching the HASH table. Fig 8 shows the FP probability versus the number of signatures used to program the Bloom filter. For such a large vector the FP probability is insignificant with a maximum of 1-5.
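For readers who want to relate the false positive probability to the filter parameters, the sketch below evaluates the standard Bloom filter approximation p ≈ (1 - e^(-kn/m))^k, where m is the bitmap size in bits, k the number of hash functions and n the number of programmed items. The paper does not state the vector size or the number of hash functions in this section, so the values used here (a 2^20-bit vector and k = 3) and the function name `false_positive_probability` are purely hypothetical and are not meant to reproduce Fig 8.

```python
import math


def false_positive_probability(num_bits: int, num_hashes: int,
                               num_items: int) -> float:
    """Standard approximation (1 - exp(-k*n/m))**k for a Bloom filter."""
    fill = 1.0 - math.exp(-num_hashes * num_items / num_bits)
    return fill ** num_hashes


# Hypothetical parameters: not taken from the paper.
NUM_BITS = 1 << 20
NUM_HASHES = 3
fp_by_signature_count = {
    n: false_positive_probability(NUM_BITS, NUM_HASHES, n)
    for n in (1000, 2000, 3000, 4000)
}
```

The approximation shows the qualitative trend discussed above: for a fixed vector, the probability grows with the number of programmed signatures, and a larger vector drives it down.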
Fig. 4. HAR and HSR for different traces using 3500 signatures
VI. CONCLUSIONS

We introduced Exhaust, a new fast pattern matching algorithm for intrusion detection systems to meet the increasing demand for higher performance. The new algorithm modifies and optimizes the Wu-Manber algorithm to avoid expensive searches of the HASH table. The HASH table can grow extremely large as the number of patterns grows. In Exhaust we employed a Bloom filter in a novel manner to exclude these unnecessary HASH table searches.
Fig. 2. HAC and HSC for varying no. of signatures for trace 08
Fig. 5. HAC and HSC versus number of characters for trace 52
Fig. 6. Preprocessing time for varying no. of signatures
Fig. 7. Memory usage for varying no. of signatures

Fig. 8. Memory usage for varying no. of signatures

REFERENCES

[1] L. Roberts, Internet growth trends. IEEE Computer Magazine, Internet Watch column, 2000.
[2] V.T. Lam, M. Mitzenmacher, G. Varghese, Carousel: scalable logging for intrusion prevention systems. The 7th USENIX Conference on Networked Systems Design and Implementation (NSDI'10), 24–39, USENIX Association, Berkeley, CA, USA, 2010.
[3] M. Aldwairi, Hardware-efficient pattern matching algorithm and architectures for fast intrusion detection. Available from NCSU Theses and Dissertations Institutional Repository (id 1840.16/3558), 2006.
[4] M. Roesch, Snort – lightweight intrusion detection for networks. The 13th USENIX Systems Administration Conference (LISA '99), Seattle, WA, 1999.
[5] M. Jamshed, J. Lee, S. Moon, I. Yun, D. Kim, S. Lee, Y. Yi, K. Park, Kargus: a highly-scalable software-based intrusion detection system. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2012.
[6] S. Antonatos, K. Anagnostakis, E. Markatos, Generating realistic workloads for network intrusion detection systems. SIGSOFT Software Engineering Notes 2004; 29(1):207–215.
[7] M. Aldwairi, T. Conte, P. Franzon, Configurable string matching hardware for speeding up intrusion detection. SIGARCH Computer Architecture News 2005; 33(1):99–107.
[8] J. Beale, A. Baker, J. Esler, S. Northcutt, Snort: IDS and IPS toolkit. Burlington, MA: Syngress Publishing, Elsevier; 2007.
[9] D. Dittrich, The DoS Project's trinoo distributed denial of service attack tool analysis, University of Washington. http://staff.washington.edu/dittrich/misc/trinoo.analysis. [Date accessed 15.06.2013].
[10] R.S. Boyer, J.S. Moore, A fast string searching algorithm. Communications of the ACM 1977; 20(10):762–772.
[11] A. Aho, M. Corasick, Efficient string matching: an aid to bibliographic search. Communications of the ACM 1975; 18:333–340.
[12] S. Wu, U. Manber, Fast algorithm for multi-pattern searching. Technical Report TR94-17, University of Arizona at Tucson, 1994. http://webglimpse.net/pubs/TR94-17.pdf. [Date accessed 15.06.2013].
[13] R. Baeza-Yates, B. Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology behind Search. 2nd ed. Addison-Wesley; 2011.
[14] B.H. Bloom, Space/time trade-offs in hash coding with allowable errors. Communications of the ACM 1970; 13(7):422–426.
[15] S. Dharmapurikar, P. Krishnamurthy, T.S. Sproull, J.W. Lockwood, Deep packet inspection using parallel Bloom filters. IEEE Micro 2004; 24(1):52–61.
[16] D. Sunday, A very fast substring search algorithm. Communications of the ACM 1990; 33(8):132–142.
[17] D. Yang, K. Xu, Y. Cui, An improved Wu-Manber multiple patterns matching algorithm. The 25th IEEE International Performance, Computing, and Communications Conference (IPCCC), 680–686, 2006.
[18] Snort.org. Snort rules. http://www.snort.org/. [Date accessed 6.08.2010].
[19] Defcon Organization. http://www.defcon.org/. [Date accessed 15.06.2013].