IEEE GLOBECOM 2011 - Communications QoS, Reliability, and Modeling Symposium
High-Performance Traffic Workload Architecture for Testing DPI Systems
Alysson Santos, Stenio Fernandes, Rafael Antonello, Petrônio Lopes, Djamel Sadok
Federal University of Pernambuco, Recife, Brazil
Géza Szabó
Ericsson Research, Stockholm, Sweden
Abstract: Traffic identification and classification are essential tasks performed by Internet Service Provider (ISP) administrators. Deep Packet Inspection (DPI) currently plays a key role in traffic identification and classification due to its increased expressive power. To allow a fair comparison among different DPI techniques and systems, workload generators should have the following characteristics: (i) synthetic packets with meaningful payloads; (ii) TCP and UDP traffic generation; (iii) a configurable network traffic profile; and (iv) a high-speed sending rate. Filling this niche, this paper proposes a workload generator framework that has all of the above characteristics. Performance evaluation shows that our flexible workload generator achieves very high sending rates over a 10 Gbps network using a commodity Linux machine. Additionally, we configured and tested our workload generator following a real application traffic profile. We then analyzed its output with a DPI system, which confirmed its accuracy and efficiency.
Index Terms: Traffic Generation, Deep Packet Inspection, Signature Generation, Traffic Analysis, Computer Networks
I. INTRODUCTION
Control over backbone links is fundamental for Internet Service Providers (ISPs) and network administrators. Precise network management can be achieved by combining passive or active measurements with traffic analysis. Such analysis now incorporates traffic identification techniques to describe user profiles at the application level. Port-based traffic identification is no longer accurate because many applications use well-known ports assigned to other services [11]. For instance, some Peer-to-Peer (P2P) applications exchange traffic through ports assigned to DNS (port 53), web (port 80), and the like. Payload-based identification methods instead scan packet payloads to find out which networked application generated them. This is commonly known as Deep Packet Inspection (DPI): the entire packet payload is examined for known patterns, i.e., packet contents are compared against a set of signatures (patterns) that potentially identify a given application. Although DPI techniques improve accuracy in traffic identification, they also impose high CPU and memory utilization as a side effect. As the speed of backbone links exceeds 10 Gbps, research on DPI mechanisms focuses on developing high-performance DPI engines.
Although the performance of DPI systems can be verified, it is difficult to simultaneously assess their accuracy with conventional tools (i.e., existing traffic generators). In other words, the main challenge in performing realistic stress tests on DPI systems is to generate traffic at very high speed and, at the same time, to create packet payloads with meaningful patterns (i.e., application signatures). Existing synthetic traffic generators focus on generating a traffic workload that simulates application flows by mimicking their behavior at a high level (e.g., packet size, inter-packet departure time, and the like) [12]. Usually, they do not inject meaningful patterns into the packet payload. In some cases, traffic generators that send real packet payloads do not focus on performance. For example, replaying captured packet traces with TCPReplay¹ has the advantage of sending meaningful packets to a DPI system, but it does not reach wire speed (1 Gbps and beyond) and limits the ability to configure synthetic traffic with a flexible application-specific profile.
In summary, although there is a large number of synthetic traffic generators, all of them have at least one of the following shortcomings: (i) there is no concept of flow, i.e., they only create packets and flush them into the network; (ii) packets are created with no application signatures, i.e., they cannot generate the payload patterns present in real networked applications; (iii) they cannot generate traffic at line rate on high-speed links (1 Gbps and beyond); (iv) no network traffic profile is generated, i.e., it is not possible to generate traffic with a pre-configured profile (e.g., 30% P2P, 40% Streaming, 10% Mail, and 20% Web). Using existing tools to assess the efficacy and efficiency of DPI systems can lead to a biased analysis of results [12]. Filling this niche, this work proposes, implements, and evaluates a framework for synthetic traffic workload generation.
The contributions of this work are threefold. First, we developed an algorithm for automatic signature generation. Based on the studies in [1], [2], and [4], our signature generator extracts common patterns that appear in a set of pre-captured flows. This makes it easy to update the system's signature database. Second, we created a mechanism to precisely mimic application behavior: our system extracts application patterns from the signature database and inserts them into packet payloads. Additional application behavior includes the signature's payload offset, the packet size, the position within the flow of the packet that carries the signature, and the like. Third, we developed a kernel-level system component capable of generating traffic at wire speed. With these three components, we designed an open-loop traffic workload generator called New Network traffic workload Generator (N-NGen). Given a pre-configured traffic profile, our system selects application signatures, inserts them into packet payloads, creates real network traffic profiles, and sends flows at wire speed for testing DPI systems.
The remainder of the paper is organized as follows. Section II describes the main components of our architecture. Section III shows results from performance tests of our architecture. Section IV presents related work, whereas Section V concludes the paper and provides directions for future research.
¹ http://tcpreplay.synfin.net/trac
II. TRAFFIC WORKLOAD GENERATOR ARCHITECTURE
N-NGen is composed of the following subsystems:
• Signature Generator (SigGen): automatically identifies application signatures from packet trace files;
• Pattern Generator (PatternGen): extracts additional information about application signatures and behavior, such as the signature's offset within the payload, the packet size distribution, and the most common position in a flow of the packet that contains the application signature;
• Network Traffic Generator (TrafficGen): generates real network traffic at a controllable speed, following a configurable traffic profile. It also inserts real application signatures (from SigGen and PatternGen) into synthetic packets.
The next subsections describe the architecture components and the system dynamics.
A. SigGen
The Signature Generator (SigGen) builds on three earlier tools, namely AutoSig [1], LASER [2], and Autograph [4]. Like [1], SigGen aims to find application signatures, unlike [2] and [4], whose focus is worm signatures. For this reason, SigGen uses the algorithm implemented by [1], with some enhancements to reduce the number of false positives and the memory consumption. Its main task is to identify the most relevant substrings in a given set of sampled flows, to rank them in order of relevance, and to combine them if necessary. Such substrings uniquely represent a networked application. SigGen also adds a blacklist feature, as in [4], and a flow partitioning scheme, as in [2]. In addition, SigGen uses a sibling tree data structure to create signatures, as opposed to AutoSig, which uses a regular tree. A sibling tree keeps all sibling nodes on the same level. With this approach, the component creates signatures faster and with higher precision.
B. PatternGen
The PatternGen component computes packet size distributions (per application), histograms of the signature offset within the packet payload, and histograms of the packet order (i.e., the packet sequence number within a flow at which a given signature appears most frequently). The user can specify the number of bins and their corresponding ranges for any histogram. It is also possible to generate separate histograms for UDP and TCP packets. PatternGen helps the user evaluate the relevance of a given application signature from a packet trace file by looking at flow volume and duration.
C. TrafficGen
TrafficGen generates streams of packets representing different application flows. This makes N-NGen well suited to testing DPI systems for both classification and scalability purposes. It generates traffic by following configurable profiles parameterized by the SigGen and PatternGen subsystems. As TrafficGen is the most important component of N-NGen, this section describes it in more detail. Figure 1 shows the UML component diagram of TrafficGen.
FIGURE 1 - TRAFFICGEN ARCHITECTURE
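To make the roles of SigGen and PatternGen (Sections II.A and II.B) more concrete, the following Python sketch shows one simple way to find substrings shared by all sampled payloads and to build a histogram of the offsets at which a signature appears. It is only an illustration under stated assumptions: the function names `common_substrings` and `offset_histogram` are hypothetical, the brute-force search stands in for (and is not) the AutoSig-based algorithm or the sibling tree used by SigGen, and the sample payloads are toy data.

```python
from collections import Counter


def common_substrings(payloads, min_len=4):
    """Return substrings of length >= min_len found in every payload,
    longest first. Naive search for illustration only; this is NOT the
    AutoSig/LASER/Autograph algorithm used by SigGen."""
    if not payloads:
        return []
    shortest = min(payloads, key=len)
    found = set()
    for length in range(len(shortest), min_len - 1, -1):
        for start in range(len(shortest) - length + 1):
            cand = shortest[start:start + length]
            # Skip candidates already covered by a longer common substring.
            if any(cand in longer for longer in found):
                continue
            if all(cand in p for p in payloads):
                found.add(cand)
    return sorted(found, key=len, reverse=True)


def offset_histogram(payloads, signature, bin_edges):
    """Count, per bin, the offset of the first occurrence of `signature`.
    bin_edges must start at 0; [0, 1, 4] means bins [0,1), [1,4), [4,inf)."""
    hist = Counter()
    for p in payloads:
        off = p.find(signature)
        if off < 0:
            continue  # signature absent from this payload
        # Index of the last bin edge that is <= off.
        idx = max(i for i, edge in enumerate(bin_edges) if edge <= off)
        hist[idx] += 1
    return [hist[i] for i in range(len(bin_edges))]


# Toy payloads (hypothetical sample data, not taken from the paper's traces).
SAMPLES = [
    b"\x13BitTorrent protocol" + b"\x00" * 8,
    b"xx\x13BitTorrent protocolyy",
    b"\x13BitTorrent protocol!!",
]
SIGNATURES = common_substrings(SAMPLES)
OFFSETS = offset_histogram(SAMPLES, SIGNATURES[0], [0, 1, 4])
```

The same histogram routine could be applied to the index of the packet that carries the signature within its flow, which corresponds to the packet-order histogram described for PatternGen.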
MainControllerC controls the entire TrafficGen execution. This component is responsible for allocating global variables and initializing sockets. In addition, MainControllerC sets the priority of the TrafficGen process (to guarantee stability and prevent unnecessary context-switching overhead), starts the thread responsible for parsing the XML file that holds the traffic profile to be simulated, creates flow entities and their respective packets, and sends the traffic. TrafficGen was implemented as a multi-threaded application to take advantage of the true parallelism offered by the latest multi-core processors. Every thread has its own instance of each component and its own socket. This arrangement avoids complex synchronization and, consequently, the multi-threaded performance pitfalls caused by critical-section locks.
FlowGeneratorC is responsible for creating application flows. Before packet transmission, TrafficGen creates a pool of flows, which avoids spending time on memory allocation and reallocation once memory becomes fragmented. This architectural design greatly increases the overall sending rate. Next, TrafficGen creates flows for each application, considering its transport protocol (i.e., TCP or UDP). FlowGeneratorC uses the XMLParserC component to parse the XML configuration file and store it in hash tables. After parsing, FlowGeneratorC creates the packets for each flow using the PacketCreatorC component, following the XML traffic profile. It is worth stressing that all packets belonging to a flow have the same source and destination IP addresses as well as the same source and destination port numbers.
To create fully controlled traffic (i.e., traffic that follows a packet size distribution, a payload format, and a general profile), users must configure an XML file that is parsed and interpreted by TrafficGen. For instance, one can define that the traffic sent must be distributed among application classes as follows: 45% P2P, 35% Web, and 20% video streaming. It is also possible to specify that the 45% of P2P traffic must be distributed equally among BitTorrent, Kazaa, etc. In other words, the user has full control of the generated traffic profile, with two additional advantages. First, TrafficGen creates packet payloads with real signatures in the right packets of a given application flow. Second, it positions signatures at the correct offset within the payload.
After the flow creation process, PacketForwarderC sends packets through the network interface. PacketForwarderC uses the Linux PF_PACKET socket [13], which copies packets directly into the kernel. This solution reduces processing overhead and consequently improves the generator's performance. In addition, TrafficGen allows the user to change any value in the IP header as well as the source and destination MAC addresses. This gives the flexibility to reroute traffic to specific DPI systems or through different networks. TrafficGen also allows users to control the packet sending rate, which can be used to test DPI performance and scalability at different stress levels.
D. N-NGEN Traffic Generation Process
Figure 2 shows an overview of the N-NGen traffic generation process.
The process has two phases, namely the creation of the desired traffic profile and the composition of the application flows that will be flushed through the network interface. Initially, the user must collect traffic from different applications separately, so that their signatures can be extracted more precisely. It is then necessary to profile the application traffic and to extract the properties related to the application signatures. The SigGen and PatternGen tools play an important role in these steps. By repeating this process with a number of application-specific trace files, N-NGen builds an application signature database that can be easily extended. From the traffic profile configuration file, TrafficGen generates synthetic traffic following its specification and creates several flows for each application. For example, a traffic profile may define that 80% of the traffic sent must come from P2P file sharing applications and 20% from Web flows. As mentioned earlier, each application class may have a different distribution among specific applications.
Details of the attributes of the XML configuration file are omitted due to lack of space.
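Since the attributes of the configuration file are not given here, the following Python sketch only illustrates the idea of a two-level profile (application classes, then applications within a class). The XML element and attribute names (`profile`, `class`, `app`, `share`) and the application names other than BitTorrent and Kazaa are assumptions made for this example and are not the actual N-NGen schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical profile: 45% P2P (split equally), 35% Web, 20% streaming.
PROFILE_XML = """
<profile>
  <class name="P2P" share="45">
    <app name="BitTorrent" share="50"/>
    <app name="Kazaa" share="50"/>
  </class>
  <class name="Web" share="35">
    <app name="HTTP" share="100"/>
  </class>
  <class name="Streaming" share="20">
    <app name="VideoStream" share="100"/>
  </class>
</profile>
"""


def parse_profile(xml_text):
    """Return {application name: fraction of total traffic}.
    Each application's fraction is class share x share within the class."""
    root = ET.fromstring(xml_text.strip())
    shares = {}
    for cls in root.findall("class"):
        class_share = float(cls.get("share")) / 100.0
        for app in cls.findall("app"):
            app_share = float(app.get("share")) / 100.0
            shares[app.get("name")] = class_share * app_share
    total = sum(shares.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"profile shares sum to {total}, expected 1.0")
    return shares


APP_SHARES = parse_profile(PROFILE_XML)
```

A flat dictionary of per-application fractions is convenient because it can be used directly as the weights for selecting which application each generated flow belongs to.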
FIGURE 2 - N-NGEN TRAFFIC GENERATION PROCESS
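The second phase, composing packets that carry a signature at a given payload offset and handing them to a packet socket, can be sketched as follows. This is a simplified Python illustration, not the N-NGen implementation: the function names are hypothetical, only UDP over IPv4 is built, padding bytes are zeros, and `send_frames` (which needs Linux and root privileges) is defined but not called. In Python the PF_PACKET family is exposed as `socket.AF_PACKET`.

```python
import socket
import struct


def ipv4_checksum(header):
    """One's-complement sum over the 16-bit words of an IPv4 header."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_udp_frame(src_mac, dst_mac, src_ip, dst_ip, sport, dport,
                    signature, offset, payload_len):
    """Build an Ethernet/IPv4/UDP frame whose payload carries `signature`
    at byte `offset`; the rest of the payload is zero padding."""
    if offset + len(signature) > payload_len:
        raise ValueError("signature does not fit in payload")
    payload = bytearray(payload_len)
    payload[offset:offset + len(signature)] = signature
    udp_len = 8 + payload_len
    # A UDP checksum of 0 means "not computed", which IPv4 permits.
    udp = struct.pack("!HHHH", sport, dport, udp_len, 0)
    ip_no_csum = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + udp_len,      # version/IHL, TOS, total length
        0, 0,                       # identification, flags/fragment offset
        64, socket.IPPROTO_UDP, 0,  # TTL, protocol, checksum placeholder
        socket.inet_aton(src_ip), socket.inet_aton(dst_ip))
    csum = ipv4_checksum(ip_no_csum)
    ip = ip_no_csum[:10] + struct.pack("!H", csum) + ip_no_csum[12:]
    eth = dst_mac + src_mac + struct.pack("!H", 0x0800)  # EtherType IPv4
    return eth + ip + udp + bytes(payload)


def send_frames(ifname, frames):
    """Send prebuilt frames through a Linux packet socket (requires root)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as s:
        s.bind((ifname, 0))
        for frame in frames:
            s.send(frame)


# Frames are built before sending, in the spirit of the pre-created flow pool.
FRAME = build_udp_frame(b"\x02\x00\x00\x00\x00\x01", b"\x02\x00\x00\x00\x00\x02",
                        "10.0.0.1", "10.0.0.2", 1234, 5678,
                        signature=b"SIG", offset=5, payload_len=32)
```

Building all frames ahead of time keeps the sending loop free of allocation and header computation, which is one plausible reading of why a pre-created pool helps the sending rate.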
III. PERFORMANCE EVALUATION
A. Methodology
This subsection details the methodology used to analyze the performance of N-NGen. We generated synthetic traffic with N-NGen (and with other traffic generators) through a 10 Gbps network interface card (NIC) and then collected a set of metrics. We conducted four tests:
I. Expected rate comparison. We generated traffic at 500 kpps and compared the expected throughput with the throughput measured for N-NGen and D-ITG. The expected throughput is calculated as: Expected rate = 500,000 × packet size (in bytes) × 8, in bits per second. Throughput is reported in Gbps.
II. Scalability test. We measured N-NGen's maximum throughput while varying the performance factor "number of applications". This factor ranged from 10 to 40 applications, i.e., we varied the number of applications that generate different traffic profiles. We ran this test for packet sizes of 64 and 1500 bytes. The metrics were throughput in Gbps and packets per second (kpps).
III. Traffic generator comparison. This test compares N-NGen's maximum throughput with that of four well-known traffic generators, namely (1) Rude, (2) MGEN, (3) D-ITG, and (4) Brute. We generated traffic with 64-byte and 1500-byte packets and measured the maximum throughput in Gbps and kpps.
IV. Traffic profile accuracy. We evaluated whether the traffic generated by N-NGen followed the specified traffic profile.
For tests I, II, and III we generated traffic and measured the output rate with the IFSTAT application to calculate the total throughput in Gbps and kpps. In these tests we configured N-NGen to generate traffic with the minimum possible number of applications, to make a fair comparison with the other generators, except for test II, where we varied this factor. Therefore, only one application is simulated at a time. Notice that only N-NGen is able to generate traffic profiles for different applications.
In the traffic profile accuracy test (test IV) we sent traffic at a controlled rate, using a commodity machine (cf. Table I), into our 10 Gbps testbed network. We placed an online DPI system [14] running in promiscuous mode inside this network and compared its output (traffic profile) with the N-NGen output. The results were compared in terms of the flows that should have been generated according to the configured profile and that were precisely classified by the DPI system.
TABLE I - TRAFFIC GENERATOR MACHINE CONFIGURATION
Processor               Intel Xeon E5430, dual quad core
RAM                     4GB DDR
NIC                     Onboard Gigabit
Traffic Generator NIC   Offboard, Intel 82598EB 10 Gigabit
HD                      3x 500GB SATA HDs
OS                      Linux, 2.6
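The expected-rate formula used in test I is simple enough to check numerically. The short Python sketch below evaluates it at the fixed rate of 500 kpps; the function name `expected_gbps` and the intermediate packet size of 512 bytes are chosen here for illustration only.

```python
def expected_gbps(packets_per_second, packet_size_bytes):
    """Expected throughput in Gbps: packets/s x bytes/packet x 8 bits/byte."""
    return packets_per_second * packet_size_bytes * 8 / 1e9


# Expected throughput at 500 kpps for a few packet sizes.
EXPECTED = {size: expected_gbps(500_000, size) for size in (64, 512, 1500)}
for size, gbps in EXPECTED.items():
    print(f"{size:>5} bytes -> {gbps:.3f} Gbps")
```

For 1500-byte packets the formula gives 6.0 Gbps, which is the reference value against which a measured throughput at 500 kpps is compared.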
Notice that, to circumvent possible instabilities at the generation machine and to assess statistical significance, we repeated each of the aforementioned measurements 10 times, each with 2 minutes of traffic generation. For each measurement, the average throughput was calculated and reported.
B. Performance Results
Figure 3 compares the measured throughput of N-NGen and D-ITG with the expected throughput. Packets are generated at 500 kpps, so the expected throughput is 500,000 × packet size × 8. One can note that N-NGen generates traffic at the expected rate for almost all packet sizes. D-ITG's throughput, on the other hand, degrades significantly as the packet size increases: for packets larger than 512 bytes, it is almost half the expected value. N-NGen reaches 5.8 Gbps (at 500 kpps), making it much more efficient than D-ITG, a state-of-the-art traffic generator [12].
FIGURE 3 – EXPECTED THROUGHPUT FOR 500 KPPS
Next, Figure 4 shows the scalability test results in Gbps as the number of simulated applications varies. The left Y axis represents the average sending rate for 1500-byte packets and the right Y axis for 64-byte packets. N-NGen is able to reach high sending rates even with a large number of applications. For instance, when simulating 10 applications with IP packets of 1500 bytes, it reaches sending rates of about 4.2 Gbps. Its maximum throughput decreases slightly as the number of applications increases. The lowest sending rate is about 3.95 Gbps (with 40 applications), which is only a 6.0% decrease compared to the maximum throughput (with only 10 applications). For 64-byte packets, the maximum throughput is significantly lower (around 321 kbps). The minimum throughput (with 40 applications) was about 274 kbps, 14.6% smaller than the maximum. This decrease is larger than in the same test with 1500-byte packets due to two main factors: (i) a higher CPU processing demand when simulating applications and (ii) a higher number of kernel interrupts (to send the packets).
FIGURE 4 – N-NGEN MAXIMUM THROUGHPUT IN GBPS (PACKET SIZE OF 1500 AND 64 BYTES)
Figure 5 shows the N-NGen throughput test results measured in packets per second. The packet sending rates in kpps for 64-byte and 1500-byte IP packets behave almost the same as the throughput in Gbps. The maximum sending rate decreased by about 6% for 1500 bytes and by about 14.6% for 64 bytes. As we can see, the system scales well as the number of applications increases.
FIGURE 5 – N-NGEN MAXIMUM THROUGHPUT IN KPPS (PACKET SIZE OF 1500 AND 64 BYTES)
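The relative decreases quoted for the scalability test follow directly from the reported maximum (10 applications) and minimum (40 applications) values. As a worked check, using only the rounded figures given in the text (so the results are approximate):

```latex
\[
\text{decrease}_{1500\,\mathrm{B}} = \frac{4.2 - 3.95}{4.2} \approx 0.0595 \approx 6.0\%,
\qquad
\text{decrease}_{64\,\mathrm{B}} = \frac{321 - 274}{321} \approx 0.1464 \approx 14.6\%.
\]
```

Because each ratio is dimensionless, the percentages do not depend on the unit in which the two throughput values are expressed.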
Results from the accuracy analysis of the traffic profile generated by N-NGen are shown in Table II. N-NGen was configured to generate traffic following a given traffic profile, expressed as the share of flows in the output traffic that belong to each application. The following results were obtained after 30 minutes of traffic generation. The traffic consisted of 23% Web traffic flows, 35% IPTV, 15% P2P, and 27% multimedia streaming. The traffic received on the machine running the DPI system follows the configured profile, with minor errors. These errors are mainly due to the non-deterministic approach we used to create traffic flows. In other words, when we define a profile with 35% of P2P traffic, we set a probability of 35% of creating a packet with payload from a P2P application.
TABLE II - TRAFFIC PROFILE ANALYSIS (# OF FLOWS)
            Expected   Received
WEB         23%        22%
IPTV        35%        36%
P2P         15%        14%
Streaming   27%        28%
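The small deviations in Table II are what one would expect from a probabilistic selection. The Python sketch below draws the application of each flow at random with the configured probabilities and reports the observed shares. It is a toy simulation under assumptions: one independent draw per flow, a hypothetical function name `sample_flow_shares`, and an arbitrary flow count and seed; it does not reproduce the measured values.

```python
import random
from collections import Counter

# Configured profile from Table II (expected share of flows per application).
PROFILE = {"WEB": 0.23, "IPTV": 0.35, "P2P": 0.15, "Streaming": 0.27}


def sample_flow_shares(profile, n_flows, seed=0):
    """Pick an application for each flow with the configured probabilities
    and return the observed share of flows per application."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    apps = list(profile)
    weights = [profile[a] for a in apps]
    counts = Counter(rng.choices(apps, weights=weights, k=n_flows))
    return {a: counts[a] / n_flows for a in apps}


OBSERVED = sample_flow_shares(PROFILE, n_flows=100_000)
for app, share in OBSERVED.items():
    print(f"{app:<10} expected {PROFILE[app]:.2%}  observed {share:.2%}")
```

With random selection the observed shares converge to the configured ones as the number of flows grows, but for any finite run they differ slightly, which is consistent with the minor errors reported above.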
Figure 6 and Figure 7 show the performance comparison between N-NGen and four other traffic generators. Fig. 6 shows the throughput results in Gbps and Fig. 7 depicts the packet sending rate in kpps. N-NGen performs very well, reaching throughput rates of 5.8 Gbps and up to 514 kpps. Comparing Fig. 6 with Fig. 4, we can see that even when simulating 40 applications N-NGen performs similarly to all other generators (up to 4 Gbps for packets of 1500 bytes). It is worth emphasizing that no other traffic generator has all of N-NGen's features (such as several application profiles). In other words, N-NGen presented better performance even in an unfair comparison scenario.
FIGURE 6 – TRAFFIC GENERATORS AVERAGE THROUGHPUT IN GBPS
FIGURE 7 – TRAFFIC GENERATORS AVERAGE THROUGHPUT IN KPPS
IV. RELATED WORK
Many traffic generators have been proposed in the literature for workload generation. We refer the interested reader to [12] for a review of software-based traffic generators. In addition, this section presents some techniques for automatic signature generation.
RUDE² is a traffic generator that can only generate UDP packets and cannot work at high rates [7]. MGEN³ provides both a command line and a Graphical User Interface (GUI) for traffic generation in user space. Its performance is comparable to that of RUDE. KUTE [6] is a UDP traffic generator designed to achieve high performance on Gigabit Ethernet. It is a Linux 2.6 kernel module that operates directly on the network device driver. Its UDP packets have no meaningful message within the payload. The Internet Traffic Generator (ITG) [8] focuses on reproducing TCP and UDP traffic and on replicating samples of stochastic processes for inter-departure time and packet size. This tool is able to generate traffic according to well-known distributions. ITG has a successor called D-ITG [9]. However, neither places real application messages in the packet payload. BRUTE [7] is a traffic generator that takes advantage of the capabilities of Linux kernels 2.4 to 2.6 to generate traffic at high rates. Sword [10] is a scalable, flexible, and extensible workload generator. It can generate a variety of application types, such as HTTP and VoIP, and also reaches a good sending rate. Another traffic generator commonly used in academia is TCPReplay⁴. This tool replays previously captured packets at a configurable speed. In addition, there are many other traffic generators implemented as hardware platforms, such as the Agilent⁵ traffic generators. Agilent generators are very powerful and precise, but they are very expensive and hardware-bound, and thus not suited to evaluations on commodity platforms.
Automatic signature generation tools have been studied recently in the worm detection area [3][4][5]. The Autograph system [4] generates worm-like signatures by selecting suspicious TCP flows from an online monitor and storing them remotely on disk. Besides the worm signature generators, some works aim at application signature generation. Byung-Chul et al. [2] proposed the LASER algorithm, which tries to find the longest common subsequence among samples, without any prior knowledge of the application protocol, and chooses it as the application signature. Mingjiang Ye et al. [1] built a tool called AutoSig, which generates application signatures by combining the techniques used in the worm signature generation systems presented in [4] and [5].
Of all the workload generators presented in this section, none offers the following features simultaneously: (1) synthetic packets with detectable payloads; (2) TCP and UDP traffic generation; (3) generation of traffic with a configurable application profile; and (4) generation at wire speed. Our proposal aims to address all of the above issues, resulting in a high-performance, fully configurable workload generator framework.
² http://rude.sourceforge.net
³ http://mgen.pf.itd.nrl.navy.mil/
⁴ http://tcpreplay.synfin.net/trac/
⁵ http://www.home.agilent.com/
V. CONCLUDING REMARKS AND FUTURE WORK
In the previous sections we described a robust framework for realistic and configurable workload generation. N-NGen was able to generate traffic at high speed, with flows carrying packets whose detectable payloads are correctly filled with real application signatures. Additionally, N-NGen generated traffic following the application packet size distribution and a user-defined traffic profile. The user may also configure the traffic profile in terms of the application classes to be used. Our tests have shown that N-NGen can reach up to 5.8 Gbps with packets of 1500 bytes. We have also presented a novel architecture that enables users to generate traffic that is as close as possible to real-world traffic. Additionally, we do not know of any traffic generator in the literature that combines all of the N-NGen features described here.
Several improvements to N-NGen are left as future work. First, the memory consumption of N-NGen, which is caused by the creation of the pool of flows per application, must be reduced. Second, the traffic profile must also be defined at flow level, instead of only at volume level. Third, the traffic generation speed when dealing with small packets must be improved.
VI. ACKNOWLEDGMENT
The authors would like to thank Ericsson Research for supporting this work, which is part of the project UFP.27 (Broadband Traffic Measurements and Analysis – BTMA, Phase 3).
REFERENCES
[1] Mingjiang Ye, Jianping Wu, Ke Xu, "AutoSig-Automatically Generating Signatures for Applications", Proc. of IEEE 9th International Conference on Computer and Information Technology (ICTI), Xiamen, China, October 11-14.
[2] Byung-Chul Park, Young J. Won, Myung-Sup Kim, and James Won-Ki Hong, "Towards Automated Application Signature Generation for Traffic Identification", Proc. of the IEEE/IFIP Network Operations and Management Symposium (NOMS 2008), Salvador, Brazil, April 2008, pp. 160-167.
[3] Newsome, J., Karp, B., and Song, D., "Polygraph: Automatically Generating Signatures for Polymorphic Worms", in Proceedings of the 2005 IEEE Symposium on Security and Privacy (May 08-11, 2005), IEEE Computer Society, Washington, DC, pp. 226-241.
[4] Kim, H., Karp, B., "Autograph: Toward automatic distributed worm signature detection", in Proc. of the USENIX Security Symp., San Diego, 2004, pp. 271-286.
[5] Singh, S., Estan, C., Varghese, G., and Savage, S., "Automated worm fingerprinting", in Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation, Volume 6 (San Francisco, CA, December 06-08, 2004), USENIX Association, Berkeley, CA, 4-4.
[6] S. Zander, D. Kennedy, and G. Armitage, "KUTE – A High Performance Kernel-based UDP Traffic Engine", Technical Report 050118A, January 2005, Swinburne University of Technology, Melbourne, Australia.
[7] N. Bonelli, S. Giordano, G. Procissi, R. Secchi, "Brute: A high performance and extensible traffic generator", in Proceedings of SPECTS05, July 24-28, 2005, Philadelphia, USA.
[8] Stefano Avallone, Antonio Pescapè, and Giorgio Ventre, "Analysis and experimentation of Internet Traffic Generator", New2an 2004, International Conference on Next Generation Teletraffic and Wired/Wireless Advanced Networking, St. Petersburg (Russia), February 2004.
[9] S. Avallone, S. Guadagno, D. Emma, A. Pescapè, G. Ventre, "D-ITG Distributed Internet Traffic Generator", The Quantitative Evaluation of Systems, First International Conference on (QEST'04), 2004, pp. 316-317.
[10] Anderson, K. S., Bigus, J. P., Bouillet, E., Dube, P., Halim, N., Liu, Z., and Pendarakis, D., "SWORD: scalable and flexible workload generator for distributed data processing systems", in Proceedings of the 38th Conference on Winter Simulation (Monterey, California, December 03-06, 2006).
[11] Arthur Callado, Carlos Kamienski, Balazs Gero, Geza Szabo, Stenio Fernandes, Djamel Sadok, "A Survey on Internet Traffic Identification and Classification", IEEE Communications Surveys and Tutorials, Vol. 11, Issue 3, pp. 37-52, 3rd Quarter 2009, DOI: 10.1109/SURV.2009.090304.
[12] Botta, A., Dainotti, A., Pescapé, A., "Do you trust your software-based traffic generator?", IEEE Communications Magazine, Sept. 2010, Volume 48, Issue 9, DOI: 10.1109/MCOM.2010.5560600.
[13] Benvenuti, Christian, "Understanding Linux Network Internals", U.S.A.: O'Reilly, December 2005.
[14] Fernandes, S. F. L., Antonello, R. T., Lacerda, T. B., Santos, A. F., Sadok, D. F. H., Westholm, T., "Slimming Down Deep Packet Inspection Systems", in 12th IEEE Global Internet Symposium 2009, Rio de Janeiro, INFOCOM Workshops 2009, IEEE, 2009, pp. 1-6.