NFA-based Pattern Matching for Deep Packet Inspection
Yan Sun, Victor C. Valgenti, and Min Sik Kim
School of Electrical and Computer Engineering, Washington State University, Pullman, Washington, U.S.A.
Email: {ysun,vvalgent,msk}@eecs.wsu.edu

Abstract—Many network security applications in today’s networks are based on deep packet inspection, which checks not only the header but also the payload of a packet. For example, traffic monitoring, layer-7 filtering, and network intrusion detection all require an accurate analysis of packet content in search of predefined patterns that identify specific classes of applications, viruses, attack signatures, etc. Pattern matching is a major task in deep packet inspection. The two most common implementations of pattern matching are based on Non-deterministic Finite Automata (NFAs) and Deterministic Finite Automata (DFAs), which take the payload of a packet as an input string. In this paper, we propose an efficient NFA-based pattern matching scheme in Binary Content Addressable Memory (BCAM), which uses data search words consisting of 1s and 0s. Our approach can process multiple characters at a time using a limited number of BCAM entries, which makes it scale well. We evaluate our algorithm using patterns provided by Snort, a popular open-source intrusion detection system. The simulation results show that our approach outperforms existing CAM-based and software-based approaches.
I. INTRODUCTION
Many network security applications in today’s networks are based on deep packet inspection, which checks not only the header but also the payload of a packet. Traffic monitoring, layer-7 filtering, and network intrusion detection all require an accurate analysis of packet content in search of predefined patterns that identify specific classes of applications, viruses, etc. Those patterns are strings representing signatures, which are compared against packet contents using exact matching algorithms. More and more patterns are used to describe a wide variety of payload signatures. For example, Snort [1], an open-source network intrusion detection system (NIDS), has thousands of patterns, and the ClamAV virus signature database had about 27,000 patterns in 2010 [2].
The most popular method to implement pattern matching is to use finite automata [3]–[6]. In this method, a finite automaton is built from the given patterns and is run with the packet payload as input. The finite automaton is either deterministic or non-deterministic, depending on the underlying technologies and available resources. In the worst case, a non-deterministic finite automaton (NFA) requires as many state transitions per payload character as it has states. However, it is very efficient in terms of space usage compared to its deterministic counterpart. These properties make NFAs more suitable for ASIC (application-specific integrated circuit)
or FPGA (field-programmable gate array) implementations, which can provide wide bandwidth but only a small amount of on-chip memory. On the other hand, a deterministic finite automaton (DFA) requires only one transition per character, but it needs a much larger amount of memory. Therefore, DFAs are more suitable for general-purpose processors and network processors. Our approach is based on an NFA because an NFA requires the fewest states and transitions and can be updated faster than a DFA.
Ternary Content Addressable Memories (TCAMs) have been widely adopted in network applications: routers use TCAMs to speed up longest prefix matching, and deep packet inspection systems use TCAMs to perform pattern matching. TCAMs allow a “don’t care” state to be stored in each memory cell in addition to the binary states 0 and 1. A memory cell in the “don’t care” state matches both 0 and 1 in the corresponding input bit. A TCAM-based routing table is extremely fast because the input key is compared with all the prefixes stored in the TCAM simultaneously and the result is retrieved in a single clock cycle. While TCAM-based search is very fast, TCAMs have two well-studied disadvantages: high cost and high power consumption. Both result from the circuit complexity of a TCAM cell, which consists of data storage, mask storage, and a comparator circuit. The high power consumption also affects the total cost and performance of TCAM-based approaches, not only because it increases the power supply and cooling costs but also because it reduces the port density, since more space is needed between ports for cooling. Therefore, using TCAMs efficiently has become a critical issue, and many methods have been proposed to reduce the number of TCAM entries used. Compared with TCAMs, Binary CAMs (BCAMs) require many fewer transistors and less power because no mask is needed and the comparison circuit is simpler.
However, they can store only 0 and 1; they do not have the “don’t care” state. In this paper, we use BCAMs instead of TCAMs to reduce cost and increase the clock speed. The architecture of CAM usage is shown in Fig. 1. A key is stored in the input register, and each prefix or pattern is stored in a single entry. The key is compared with all the entries in parallel and the results are stored in the match vector, where a 1 indicates that the corresponding entry matches
978-1-4577-0638-7 /11/$26.00 ©2011 IEEE
Fig. 1. The architecture of CAM usage. (Diagram: an input register holding the key; entries 0 to N-1, each producing a 1-bit result in the match vector; a priority encoder with a log N-bit output.)
the key, and the priority encoder chooses the matched entry with the highest priority. Finally, the output signal is used to find the corresponding result.
In this paper, our contributions are as follows. First, we analyze the problems we face in regular expression matching based on our experiments. Second, we propose an efficient hybrid algorithm for regular expression matching to implement deep packet inspection on multi-core architecture. Last, we analyze our algorithm based on the important factors of multi-core architecture.
The remainder of the paper is organized as follows. In Section II, related work in pattern matching is presented. Section III provides a discussion on using finite automata for pattern matching, and Section IV describes our algorithm and its implementation. Then, the proposed algorithm is evaluated in Section V. Finally, we conclude in Section VI.
II. RELATED WORK
Pattern matching is the core of NIDS/NIPS in commercial products such as 3Com TippingPoint X505 [7] and Cisco IPS [8]. Pattern matchers are typically implemented using finite automata, either NFA or DFA. We divide the approaches into the following four categories:
a) Software-based: The software-based approaches are also called general-purpose approaches, and they are based on general-purpose processors or network processors [4], [9]–[13]. DFAs are more popular in software-based approaches because they need only one state transition per input character, which causes at most one memory access for each input character. Therefore, they are often desirable at high network link rates. However, as we mentioned earlier, the practical use of DFAs is limited by their excessive memory usage. To mitigate this issue, many methods have been proposed [10]–[14]. They develop several compression techniques for DFAs, focusing on reducing the number of transitions between states, and in some cases 99% of the transitions can be eliminated.
Although this can reduce the memory consumption significantly, unfortunately it is hard to reduce the number of states in DFAs with complex regular expressions.
b) ASIC-based: Several commercial network equipment vendors, including 3Com [7] and Cisco [8], have supplied their own NIDS, and a number of smaller players have introduced pattern matching ASICs that go inside these NIDS. Developing ASICs for NIDS, however, has several disadvantages: it requires a large investment and a long development cycle, and the result is hard to upgrade.
c) FPGA-based: There is a body of literature advocating FPGA-based pattern matching [15]–[19]. It can provide not only a fast matching cycle but also parallel matching operations. NFAs are well suited to FPGA-based matching because of their wide bandwidth requirement and low memory consumption. This type of approach, however, does not have enough resources for a large number of patterns.
d) TCAM-based: Several approaches based on TCAMs have been proposed recently [2], [20], [21]. Both [20] and [21] used TCAMs to store the transition rule table with transition compression approaches, but they still consumed substantial TCAM and SRAM resources. The paper [2] stored states instead of transitions in TCAMs to reduce the usage of both TCAM and SRAM, and its simulation results showed a significant improvement because there are usually many fewer states than transitions in a DFA, but the computational complexity of the approach is very high. Our work falls into this category, and we also store states instead of transitions to reduce the hardware consumption. The high speed of TCAMs is usually offset by their slow clock rate, their high cost, and the long delay of memory access, so we use BCAMs instead of TCAMs to both increase the clock speed and reduce the cost. More importantly, our approach eliminates the usage of SRAM.
III. PATTERN MATCHING AND FINITE AUTOMATA
A network intrusion detection system (NIDS) classifies packets using a predefined rule set to determine whether packets are malicious or not by searching packet payloads for any signature in the rule set.
Because of the increasing amount of network traffic and threats, intrusion detection systems have become very resource-intensive. For instance, open-source NIDSs such as Bro [22] and Snort [1] exhaust all the resources, both CPU time and memory, and halt immediately when they are deployed in a high-speed network environment [23]. Therefore, achieving high throughput in pattern matching and reducing memory access frequency are crucial for overall intrusion detection performance.
A. Types of Regular Expression
Pattern matching is a part of regular expression matching. To better understand the pattern matching algorithms used in deep packet inspection, we first need to study the types and characteristics of regular expression components. Although the regular language itself is well-defined and well-understood, there are many variants that adopt additional notations to make the language more human-friendly. In this section, we consider some of the main types of regular expression components frequently found in Snort rule sets. In
the following, we present them in increasing order of complexity.
1) Exact-match strings are the simplest kind of regular expressions. They are fixed-size patterns, and thus the number of states in a finite automaton (DFA or NFA) can be kept less than the number of characters in the regular expression (string). The Aho-Corasick algorithm [24] or the Boyer-Moore algorithm [25] can be used without modification, and hashing can be used for optimal performance. This type of regular expression is not very expressive and cannot detect malicious packets if an attacker inserts padding in them. However, its advantage is that it is easy to implement and can achieve much faster matching speed than other types of regular expressions.
2) Character sets and single wildcards such as “[c_i-c_j c_k]” and “.”. For this type of regular expression, exact-match algorithms such as the Aho-Corasick and Boyer-Moore algorithms or hashing schemes cannot be used directly. Instead, exhaustive enumeration of exact-match strings should be used. These regular expressions are more expressive than exact-match strings, but require larger finite automata with more transitions.
3) Simple character repetitions such as “c?”, “c*”, and “c+”. For this type of regular expression, exhaustive enumeration of exact-match strings is inapplicable because the length of a matched string may be unbounded. However, it can be efficiently implemented as a loop transition in a finite automaton.
4) Bounded repetitions such as “c{n,m}”, “[^c_i-c_j]{n,}”. For this kind of regular expression, the number of states of a finite automaton grows fast as the counting constraints increase. However, we can introduce counters as an augmentation to finite automata to alleviate this problem.
5) Character sets and wildcards repetitions such as “[^c_i-c_j]*”, “.*”, “[^\r\n]*”.
If multiple such regular expressions are implemented as a single DFA, the size of the DFA can grow exponentially.
In practice, most regular expressions in NIDS combine more than one of the kinds of regular expression patterns mentioned above. In this paper, we focus on the first kind of regular expression.
IV. PROPOSED APPROACH
In a Binary CAM entry, the content is made up of binary bits, each of which is either 0 or 1. In a Ternary CAM cell, however, a third “don’t care” state can be used as a bit value. An entry of a Ternary CAM stores content as a (value, mask) pair, where value and mask are W-bit numbers, requiring W storage cells for the value and an additional W storage cells for the mask. Moreover, the matching circuitry is more complicated than that of a Binary CAM. A typical TCAM cell requires six transistors to store the value bit, as in an SRAM cell. The same number of transistors is required to store the mask bit, and four more transistors implement the match logic. Thus, each TCAM
cell requires 16 transistors, which is about 2.7 times as many as a typical SRAM cell. However, different techniques used by CAM manufacturers result in different numbers [26]. For the evaluation of our scheme in this paper, we assume that the number of transistors and the power consumption of a TCAM cell are twice those of a BCAM cell on average.
We have three goals in the design of pattern matching. First, we need to reduce the complexity of each storage cell, so we use BCAM instead of TCAM. Second, we need to reduce the usage of BCAM entries, so we use an NFA instead of a DFA. Last, we need to increase the overall throughput of pattern matching, so we propose a scalable parallel processing architecture. Overall, we want to design a high-performance and low-cost pattern matching engine.
For a single pattern, the input string should match every character in the pattern, so the current state is based on the previous state and the current matching result. To compare with the latest approach, we use the same example as in [2], with the pattern set CF, BCD, BBA, BA, EBBC, and EBC. The corresponding NFA and DFA are shown in Figure 2. The NFA is shown on the left in Figure 2; the ε pointing to the beginning state represents the empty string, and we call it the epsilon transition. The DFA is shown on the right in Figure 2, with most of its transitions omitted for clarity. We can see that both the NFA and the DFA have 14 states, but the DFA has many more transitions than the NFA. Furthermore, every state in the NFA except the beginning state has only one incoming transition, and we take advantage of this property to design our algorithm.
The basic logic of the proposed approach is shown in Figure 3. We assume there are n patterns and the total length of all the patterns is N. Each BCAM entry stores an active state and the corresponding character in the patterns.
For the first BCAM entry, the active state always matches because of the epsilon transition in the NFA. From the second BCAM entry to the fourth, the active state of each entry depends on the matching result of the previous entry, and the matching result of the last of these entries is the final matching result of the first pattern. All other patterns work similarly.
The real architecture of the proposed approach is shown in Figure 4. There are still n patterns and the total length of all the patterns is N. Each BCAM entry stores one character of the patterns, in sequence, so all the BCAM entries are 8 bits wide. The registers record the current matching result for use in the next clock cycle, and the “AND” gates decide whether both the previous match and the current match happen. For example, suppose the incoming packet matches the first pattern r1. C0 matches in the first clock cycle; C0 matches and the output of the first register is logic “1” in the second clock cycle, so the output of the first “AND” gate is “1”; similarly, C1 matches and the outputs of both the second register and the second “AND” gate are logic “1” in the third clock cycle; C2 matches and the outputs of both the third register and the third “AND” gate are logic “1” in the fourth clock cycle, at which point we obtain the match
Fig. 2. NFA and DFA for the patterns {EBC, EBBC, BA, BBA, BCD, CF}. Failure and restartable transitions are omitted for clarity in the DFA.
Fig. 3. The logic of the proposed approach based on BCAMs.
result of r1.
To increase the throughput, we designed our approach to process multiple characters at a time. In this paper we process four characters at a time as an example, and the design is easily extended to process more. The architecture of an example is shown in Figure 5. In this example, we consider the pattern “ABCD” and the input string “CABCDFE”. To process four characters in a single clock cycle, we need to consider four starting points in the incoming string, so we need four copies of the pattern “ABCD” stored in the BCAM entries; the dashed lines show which incoming characters are compared with which BCAM entries. The four “AND” gates mean that the incoming string must match all characters in the pattern in a single clock
Fig. 4. The main architecture of the proposed approach based on BCAMs.
cycle, and the “OR” gate means that the pattern can appear anywhere in the incoming string. The incoming string shifts by four characters every cycle, but seven characters of the incoming string need to be checked every cycle. In the example, the second set of BCAM entries storing the pattern “ABCD” matches the incoming string. The four BCAM entries in the same column in Figure 5 can be combined to save hardware logic because they store the same content; the combined architecture is shown in Figure 6. The four BCAM entries with the same content share the data storage, and each uses only its own comparator logic.
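The four-characters-per-cycle scheme can be illustrated with a small software model. The sketch below is not the hardware design itself; it assumes a single pattern that is compared in full within one cycle, as in the “ABCD” example, and the function names `match_window` and `scan` are hypothetical names chosen for illustration.

```python
def match_window(pattern: str, window: str, k: int = 4) -> list[bool]:
    """Return one match bit per start offset 0..k-1 inside the window."""
    bits = []
    for offset in range(k):
        segment = window[offset:offset + len(pattern)]
        # "AND" over the per-character comparators of this copy of the pattern.
        bits.append(len(segment) == len(pattern)
                    and all(p == c for p, c in zip(pattern, segment)))
    return bits


def scan(pattern: str, text: str, k: int = 4) -> list[int]:
    """Shift the input by k characters per cycle and return match positions."""
    hits = []
    span = k + len(pattern) - 1  # characters inspected per cycle (7 for k=4, length 4)
    for base in range(0, len(text), k):  # one iteration models one clock cycle
        bits = match_window(pattern, text[base:base + span], k)
        # "OR" over the k copies: any set bit means the pattern occurred.
        hits.extend(base + i for i, hit in enumerate(bits) if hit)
    return hits


print(match_window("ABCD", "CABCDFE"))  # [False, True, False, False]
print(scan("ABCD", "CABCDFE"))          # [1]
```

With the input “CABCDFE”, only the second copy (offset 1) reports a match, which agrees with the example in the text. Because every copy compares against the same stored characters, a hardware version can keep one copy of the data and replicate only the comparators.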
Fig. 5. Proposed parallel processing unit for pattern matching.
Fig. 6. States comparison between NFA and DFA. (Diagram labels: storage; input key feeding Comparator1 to Comparator4 with outputs M1 to M4; match; input shift.)
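For comparison with the parallel design, the one-character-per-cycle chain of registers and “AND” gates described earlier in this section can be modeled in a few lines. This is a simplified sketch under stated assumptions: each pattern gets its own chain (no prefix merging), patterns are non-empty, and the names `run_chain` and `run_all` are hypothetical.

```python
def run_chain(pattern: str, payload: str) -> list[int]:
    """Simulate one chain of BCAM entries; return positions where a match ends."""
    regs = [False] * len(pattern)  # regs[i]: entry i matched in the previous cycle
    ends = []
    for pos, ch in enumerate(payload):  # one payload character per clock cycle
        nxt = []
        for i, stored in enumerate(pattern):
            # The first entry is always active (epsilon transition); the others
            # are active only if the previous entry matched one cycle earlier.
            active = True if i == 0 else regs[i - 1]
            nxt.append(active and ch == stored)  # the "AND" gate
        regs = nxt  # registers latch the new results
        if regs[-1]:
            ends.append(pos)
    return ends


def run_all(patterns: list[str], payload: str) -> dict[str, list[int]]:
    """Run every pattern chain on the payload and keep the ones that matched."""
    results = {}
    for p in patterns:
        ends = run_chain(p, payload)
        if ends:
            results[p] = ends
    return results


print(run_all(["CF", "BCD", "BBA", "BA", "EBBC", "EBC"], "EBBCF"))
# {'CF': [4], 'EBBC': [3]}
```

Several registers can be set at once, which is how the model keeps the NFA's multiple active states without any backtracking.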
V. EVALUATION
To evaluate the proposed approach, we collect pattern sets from Snort [1] and ClamAV [27], the same sets used in [2], for comparison. We build a single NFA using only prefix merging, because we need to make sure that each state in the NFA except the beginning state has only one incoming transition. We implement it on NetFPGA [28], a network hardware accelerator that augments the network functions of a standard computer. The NetFPGA carries a Xilinx Virtex-II Pro FPGA, on which we implement our algorithm for simulation. Furthermore, the NetFPGA card has four Gigabit Ethernet ports and on-board SRAM and DRAM chips, and it communicates with the host PC through a Peripheral Component Interconnect (PCI) bus. All the BCAM entries are 8 bits wide and each character is stored in one BCAM entry. The number of BCAM entries used is therefore the number of states in the NFA, which is also the number of characters in the patterns when common prefixes are counted only once. We compare our approach with what is, to the best of our knowledge, the latest and most efficient TCAM-based pattern matching approach [2]. The rule sets collected from Snort and ClamAV are shown in Table I, and simulation results are shown in Table II.
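The entry count under prefix merging can be checked on the six-pattern example from Section IV. The sketch below assumes that each distinct non-empty prefix corresponds to one NFA state and hence one BCAM entry; `count_entries` is a hypothetical helper name, not part of the implementation evaluated here.

```python
def count_entries(patterns: list[str]) -> int:
    """Count stored characters when common prefixes are counted only once."""
    prefixes = set()
    for p in patterns:
        for end in range(1, len(p) + 1):
            # Each distinct non-empty prefix is one non-initial NFA state,
            # reached by exactly one incoming transition from its parent prefix.
            prefixes.add(p[:end])
    return len(prefixes)


example = ["CF", "BCD", "BBA", "BA", "EBBC", "EBC"]
print(count_entries(example))      # 13 entries
print(count_entries(example) + 1)  # 14 states including the beginning state
```

The six patterns contain 18 characters in total, but only 13 entries are needed after merging the shared prefixes “B”, “E”, and “EB”. Adding the beginning state gives the 14 states of the example NFA.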
TABLE I
STATISTICS OF THE PATTERNS COLLECTED FROM SNORT AND CLAMAV

Pattern Set   Approach       Patterns   States in NFA   States in DFA
Snort         In [2]         6,423      -               75,256
Snort         Our approach   6,423      75,256          -
ClamAV        In [2]         26,987     -               1,565,874
ClamAV        Our approach   26,987     1,565,874       -
TABLE II
COMPARISON WITH APPROACH PROPOSED IN [2]

Item                   Pattern set   Approach in [2]        Our approach
CAM type               Both          TCAM                   BCAM
CAM length             Both          36-bit                 8-bit
CAM size (MB)          Snort         0.36                   0.08
CAM size (MB)          ClamAV        8.18                   1.57
SRAM size (MB)         Snort         0.32                   0.00
SRAM size (MB)         ClamAV        7.37                   0.00
Need memory access     Both          Yes                    No
Construction speed     Both          Slow                   Fast
Pattern update speed   Both          Slow                   Fast
Platform               Both          Commercial TCAM chip   FPGA
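As a rough consistency check, the BCAM sizes reported for our approach can be recomputed from the state counts. The sketch assumes one 8-bit entry per NFA state and decimal megabytes (10^6 bytes); both are assumptions made for this illustration, and `cam_size_mb` is a hypothetical helper name.

```python
def cam_size_mb(states: int, entry_bits: int = 8) -> float:
    """CAM size in decimal megabytes, assuming one entry per state."""
    return states * entry_bits / 8 / 1_000_000


print(round(cam_size_mb(75_256), 2))     # 0.08 (Snort)
print(round(cam_size_mb(1_565_874), 2))  # 1.57 (ClamAV)
```

Under these assumptions the results agree with the 0.08 MB and 1.57 MB figures listed for the BCAM-based approach.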
From the simulation results we can see that our algorithm outperforms the algorithm proposed in [2] in almost all aspects. The disadvantage of our approach is that it needs to be implemented on a configurable hardware platform such as an FPGA; fortunately, it is small enough to fit into FPGAs. Checking four characters in a single clock cycle, our approach easily achieves a throughput of 16 Gbps. Figure 7 shows how CAM consumption grows with the number of parallel processing units in other approaches and in our approach. We can see that our approach scales better because some resources are shared by the parallel processing units.
VI. CONCLUSION AND FUTURE WORK
In this paper, we studied techniques used in deep packet inspection, in particular pattern matching with a growing number of patterns. We built an efficient NFA generator based on BCAMs, which consume fewer transistors and have shorter latency. Unlike other TCAM-based approaches, our approach does not
Fig. 7. CAM consumption increases with increasing number of parallel processing units between different approaches. (Plot: x-axis, Parallel Sets, 1 to 16; y-axis, CAM Consumption, 1 to 16; series: Existing approach, Our approach.)
need memory access after every TCAM matching process, so it not only eliminates the usage of large RAM resources but also increases the matching speed for every incoming character. Furthermore, our approach can process multiple characters at a time using a limited number of BCAM entries, which makes it scale well. The evaluation results show that our approach outperforms existing approaches.
In the future, we would like to explore the usage of our approach with other kinds of regular expressions. As regular expressions grow more complex, existing approaches become unsuitable for them [9]. A good approach is to use exact-match strings extracted from the regular expressions as a preprocessor to eliminate most unnecessary regular expression matching. This is based on the observation that most Internet traffic does not match any regular expression, or even the exact-match strings within them. There are therefore two requirements in such applications: first, high-performance, lightweight pattern matching as a preprocessor; second, an efficient regular expression matching approach that works on the preprocessor’s results. Our pattern matching algorithm fulfills the first requirement, and we will work on a regular expression matching approach to fulfill the second.
REFERENCES
[1] Snort User Manual 2.8.6, The Snort Project, Apr. 2010, http://www.snort.org/assets/140/snort_manual_2_8_6.pdf.
[2] A. Bremler-Barr, D. Hay, and Y. Koral, “CompactDFA: Generic state machine compression for scalable pattern matching,” in Proceedings of the 29th conference on Information, INFOCOM’10, 2010, pp. 659–667.
[3] M. Paolieri, I. Bonesana, and M. D. Santambrogio, “ReCPU: a parallel and pipelined architecture for regular expression matching,” in Proceedings of 15th Annual IFIP International Conference on Very Large Scale Integration, Oct. 2007, pp. 19–24.
[4] M. Becchi and P. Crowley, “A hybrid finite automaton for practical deep packet inspection,” in Proceedings of ACM CoNEXT, Dec. 2007.
[5] S. Kumar, B. Chandrasekaran, J. Turner, and G. Varghese, “Curing regular expressions matching algorithms from insomnia, amnesia, and acalculia,” in Proceedings of the 2007 ACM/IEEE Symposium on Architecture for Networking and Communications Systems, Dec. 2007, pp. 155–164.
[6] S. Kumar, S. Dharmapurikar, F. Yu, P. Crowley, and J. Turner, “Algorithms to accelerate multiple regular expressions matching for deep packet inspection,” in Proceedings of the 2006 conference on Applications, technologies, architectures, and protocols for computer communications, Dec. 2006, pp. 339–350.
[7] “TippingPoint x505,” http://www.tippingpoint.com/products_ips.html.
[8] “Cisco IOS IPS signature deployment guide,” http://www.cisco.com/.
[9] Y. Sun, H. Liu, V. Valgenti, and M. S. Kim, “Hybrid regular expression matching for deep packet inspection on multi-core architecture,” in Proceedings of the 19th International Conference on Computer Communications and Networks, ICCCN’10, Aug. 2010, pp. 1–7.
[10] D. Ficara, S. Giordano, G. Procissi, F. Vitucci, G. Antichi, and A. Di Pietro, “An improved DFA for fast regular expression matching,” Computer Communication Review, vol. 38, no. 5, pp. 29–40, 2008.
[11] M. Becchi and P. Crowley, “An improved algorithm to accelerate regular expression evaluation,” in Proceedings of the 3rd ACM/IEEE Symposium on Architecture for networking and communications systems, 2007, pp. 145–154.
[12] S. Kumar, J. Turner, and J. Williams, “Advanced algorithms for fast and scalable deep packet inspection,” in Proceedings of the 2006 ACM/IEEE symposium on Architecture for networking and communications systems, 2006, pp. 81–92.
[13] M. Becchi and S. Cadambi, “Memory-efficient regular expression search using state merging,” in Proceedings of the 26th IEEE International Conference on Computer Communications, May 2007, pp. 29–40.
[14] M. Becchi, C. Wiseman, and P. Crowley, “Evaluating regular expression matching engines on network and general purpose processors,” in Proceedings of the 2009 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS), Oct. 2009.
[15] W. Lin and B. Liu, “Pipelined parallel AC-based approach for multi-string matching,” in Proceedings of the 14th IEEE International Conference on Parallel and Distributed Systems, ICPADS’08, Dec. 2008, pp. 665–672.
[16] I. Bonesana, M. Paolieri, and M. Santambrogio, “An adaptable FPGA-based system for regular expression matching,” in Proceedings of the conference on Design, Automation and Test in Europe, Mar. 2008, pp. 1262–1267.
[17] N. Yamagaki, R. Sidhu, and S. Kamiya, “High-speed regular expression matching engine using multi-character NFA,” in International Conference on Field Programmable Logic and Applications, Sep. 2008, pp. 131–136.
[18] C. R. Clark and D. E. Schimmel, “Efficient reconfigurable logic circuit for matching complex network intrusion detection patterns,” in Conference on field-programmable logic and applications, Sep. 2003, pp. 956–959.
[19] A. Mitra, W. Najjar, and L. Bhuyan, “Compiling PCRE to FPGA for accelerating SNORT IDS,” in Proceedings of the 3rd ACM/IEEE Symposium on Architecture for networking and communications systems, 2007, pp. 127–136.
[20] M. Alicherry, M. Muthuprasanna, and V. Kumar, “High speed pattern matching for network IDS/IPS,” in Proceedings of the 14th IEEE International Conference on Network Protocols, ICNP’06, Oct. 2006, pp. 187–196.
[21] F. Yu, H. R. Katz, and T. V. Lakshman, “Gigabit rate packet pattern-matching using TCAM,” in Proceedings of the 12th IEEE International Conference on Network Protocols, ICNP’04, Oct. 2004, pp. 174–183.
[22] V. Paxson, “Bro: A system for detecting network intruders in real-time,” Computer Networks, vol. 31, no. 23–24, pp. 2435–2463, Dec. 1999.
[23] H. Dreger, A. Feldmann, V. Paxson, and R. Sommer, “Operational experiences with high-volume network intrusion detection,” in Proceedings of the 11th ACM Conference on Computer and Comm. Security, Oct. 2004.
[24] A. V. Aho and M. J. Corasick, “Efficient string matching: An aid to bibliographic search,” Communications of the ACM, vol. 18, no. 6, pp. 333–340, Jun. 1975.
[25] R. S. Boyer and J. S. Moore, “A fast string searching algorithm,” Communications of the ACM, vol. 20, no. 10, pp. 762–772, Oct. 1977.
[26] K. Pagiamtzis and A. Sheikholeslami, “Content-addressable memory (CAM) circuits and architectures: a tutorial and survey,” IEEE Journal of Solid-State Circuits, vol. 41, no. 3, pp. 712–727, Mar. 2006.
[27] ClamAV, Clam AntiVirus, http://www.clamav.net/lang/en/.
[28] G. Gibb, J. W. Lockwood, J. Naous, P. Hartke, and N. McKeown, “NetFPGA: an open platform for teaching how to build gigabit-rate network switches and routers,” IEEE Transactions on Education, vol. 51, no. 3, pp. 364–369, Aug. 2008.