An Open and Scalable Emulation Infrastructure for Large-Scale Real-Time Network Simulations

Jason Liu, Scott Mann, Nathanael Van Vorst, and Keith Hellman
Department of Mathematical and Computer Sciences, Colorado School of Mines, Golden, Colorado 80401
Emails: {xliu,scmann,nvanvors,khellman}@mines.edu
Abstract—We present a software infrastructure that embeds physical hosts in a simulated network. Aiming to create a large-scale real-time virtual network testbed, our real-time interactive simulation approach combines the advantages of both simulation and emulation by maintaining the flexibility of the simulation models and increasing fidelity as real systems are included in the simulation. In our approach, real-world distributed applications and network services can run together with the real-time simulator; real packets are injected into the simulation and subject to the simulated network conditions computed as a result of both real and virtual traffic competing for network resources. A prototype of the proposed emulation infrastructure has been implemented based on a Virtual Private Network (VPN). One distinct advantage of our approach is that it does not require special hardware. Furthermore, it is flexible, secure, and scalable—attributes inherited directly from the VPN implementation. We conducted a set of preliminary experiments to assess the performance limitations of our emulation infrastructure. We also present an interesting case study to demonstrate the capability of our approach.
I. INTRODUCTION

Over the years, we have seen a major shift in global network traffic in response to emerging “killer apps” of the Internet. The landscape of computer networks changes rapidly, which poses a considerable challenge to researchers designing the next-generation high-performance networking and software infrastructures. Traditionally, networking research has relied on a variety of tools, ranging from physical testbeds (e.g., [1], [2]), through analytical models (e.g., [3]), to simulation and emulation (such as [4]–[6]). There have also been efforts to augment network simulation with emulation capabilities, which we call real-time interactive network simulation [7]. With the emulation extension, a network simulator operating in real time can run together with real-world distributed applications and network services. To a certain extent, real-time network simulation not only alleviates the burden of developing separate models for applications in simulation, but also increases the confidence level of network simulation, as real systems are included in the network model.

Large-scale real-time network simulation must characterize the behavior of a very large network—potentially with millions of network entities loaded with realistic traffic. In addition, the simulation must be scalable and run in real time. To this end, parallel discrete-event simulation [8] and multiscale modeling [9]–[11] have been applied to allow real-time simulation of very large networks with realistic traffic flows.
Advanced emulation techniques can now provide full-scale emulation capabilities for large-scale networks [5], [12]–[14]. These approaches require that the network applications and the simulator be tightly coupled and statically configured. Bradford et al. provided a survey of techniques for importing and exporting real network packets to and from network simulators [15]. For example, in MaSSF [16], applications can run unmodified together with the simulator by intercepting system calls. These techniques, however, do not address issues such as maintainability, scalability, and load balancing, which are essential to large-scale real-time simulations.

We propose an emulation infrastructure that uses a Virtual Private Network (VPN) server farm as the simulation gateway for distributed applications to dynamically connect to the real-time network simulator. The simulation gateway manages the VPN connections as well as the traffic between the VPN clients and designated simulation processes. Applications running on computers configured as VPN clients can automatically forward network packets to the simulation gateway, which redirects them to the simulator, where the network packets are translated into simulation events. Aside from conducting emulated traffic, the network simulator can model other types of traffic, thereby providing a more realistic yet controllable networking environment. Packet arrivals at an emulated host in the virtual network are translated into real network packets and forwarded to the corresponding VPN clients via the simulation gateway. The VPN approach is flexible, scalable, and secure, which are important attributes inherited directly from the open-source implementation. This effort is part of the PRIME (Parallel Real-time Immersive network Modeling Environment) project, designed to provide a self-sustained large-scale virtual network environment for researchers of distributed systems and networks [17].

The remainder of this paper is organized as follows. In Section II we present the details of our emulation infrastructure. The experiments are described in Section III. Section IV presents a case study using the emulation infrastructure to interact with a real commercial network device. We conclude this paper with a discussion of future work in Section V.

II. THE EMULATION INFRASTRUCTURE

As illustrated in Fig. 1, the emulation system consists of three major components:
Fig. 1. The emulation infrastructure.
1) The I/O threads are collocated with the real-time network simulator running on parallel computers. The threads are connected to the simulation gateways, which are configured to tunnel network packets to and from the applications running on client machines.
2) Simulation gateways, each running a VPN server and a daemon process called ssfgwd, are responsible for shunting IP packets between the virtual network and the client applications.
3) Client machines run VPN clients configured to connect to one or more simulation gateways. Each client machine’s IP routing table is automatically set up by the VPN to forward traffic to the simulated network.

In the remainder of this section, we describe the details of these three components, as well as a mechanism to emulate a router with multiple network interfaces.

A. The Real-Time Network Simulator and the I/O Threads

A virtual network is partitioned among parallel processors to be simulated in parallel. Each processor runs a simulation process, an I/O agent, a reader thread, and a writer thread (Fig. 2). The I/O agent exchanges packets with the reader and writer threads. The writer thread takes packets exported from the simulation process and forwards them to a simulation gateway, which then delivers the packets to the corresponding client machine. The reader thread accepts packets from the client applications through a simulation gateway and inserts the packets into the simulator. The I/O agent uses established simulator functions designed specifically to support emulation (see [18] for more details). Each emulated host in the simulation contains an emulation protocol session where packets can be intercepted at the IP layer. When a packet destined for the client application is received by the virtual host, it is forwarded to the I/O agent, which exports the event. The event is then translated into a real packet before it is forwarded by the writer thread to the simulation gateway. On the reverse path, when the reader thread receives an IP packet, the packet is translated into a simulation event before being presented to the I/O agent within the simulation. The I/O agent forwards the event to the corresponding virtual host’s emulation session, where the packet is pushed down the simulated protocol stack. A sketch of this reader/writer pattern is given below.
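To make the division of labor concrete, the following is a minimal sketch of the writer and reader thread loops. Python is used purely for illustration (PRIME itself is not a Python system); the 4-byte length-prefix framing, the export_queue, and the inject callback are our assumptions standing in for the I/O agent’s actual interfaces.

```python
import socket
import struct
import queue

export_queue = queue.Queue()   # packets exported by the simulation process

def writer_thread(gateway_addr):
    """Forward packets exported from the simulator to a simulation gateway."""
    sock = socket.create_connection(gateway_addr)
    while True:
        pkt = export_queue.get()                    # raw IP packet bytes
        sock.sendall(struct.pack("!I", len(pkt)) + pkt)

def reader_thread(gateway_addr, inject):
    """Receive packets from the gateway and inject them into the simulator."""
    sock = socket.create_connection(gateway_addr)
    buf = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break                                   # gateway closed the connection
        buf += chunk
        while len(buf) >= 4:
            (n,) = struct.unpack("!I", buf[:4])
            if len(buf) < 4 + n:
                break                               # wait for a complete frame
            pkt, buf = buf[4:4 + n], buf[4 + n:]
            inject(pkt)           # translate into a simulation event here
```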
Fig. 2. Connecting the simulator and the simulation gateway.
B. The Simulation Gateway

The simulation gateway is a critical component of the emulation infrastructure, residing between the network simulator and the client applications running on distributed machines. There can be multiple simulation gateways for balancing the traffic load or for differentiated services (e.g., to satisfy different bandwidth and latency requirements). Each simulation gateway runs a VPN server, which manages incoming connections from the client machines and the traffic to and from these client machines. We defer the details of the VPN setup to the next section.

In addition to running the VPN server, each simulation gateway runs another process (called ssfgwd) which shunts IP packets between the simulator’s I/O threads and the VPN server. Separate TCP connections are maintained with each I/O thread on the parallel machines running the simulation. The TCP connections between a simulation gateway and the reader or writer threads are distinguished by a 1-byte identifier sent after connection setup. A reader thread additionally sends the list of virtual host IP addresses for which it is responsible. The ssfgwd process creates a mapping from these IP addresses to the corresponding reader thread. This mapping is later used by ssfgwd to forward traffic from the client machines to the corresponding simulation processes.

The VPN server was modified to spawn the ssfgwd process directly and communicate with it over Unix pipes. The modified VPN server prepends each packet with the associated client IP address before forwarding the packet to ssfgwd. Using the prepended IP address and the previously created mapping, ssfgwd delivers the packet to the corresponding reader thread. The reader thread then forwards the packet to the I/O agent, which injects it into the protocol stack of the corresponding virtual host. In this way, the packet appears as if it were generated directly by the virtual host. In the other direction, an IP packet emanating from a virtual host is sent to the simulation gateway through the TCP connection established by the writer thread. Again, before forwarding the packet to the VPN server, the ssfgwd process prepends the packet with an IP address identifying the virtual host. Using this prepended IP address, the VPN server forwards the packet to the client with the corresponding IP address. The client machine receives the packet as if it had originated from a physical network. The sketch below illustrates the handshake and the dispatch logic.
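The following sketch shows how ssfgwd might build and use the IP-to-reader mapping. Python is used for illustration only; the identifier values, the length-prefixed framing, and the recv_exact helper are our assumptions, not the actual ssfgwd wire protocol.

```python
import socket
import struct

READER, WRITER = 0, 1        # assumed values of the 1-byte identifier

ip_to_reader = {}            # virtual host IP -> reader-thread connection

def recv_exact(conn, n):
    """Read exactly n bytes from a TCP connection."""
    data = b""
    while len(data) < n:
        chunk = conn.recv(n - len(data))
        if not chunk:
            raise ConnectionError("I/O thread disconnected")
        data += chunk
    return data

def handle_io_thread(conn):
    """Handshake with a newly connected reader or writer thread."""
    kind = recv_exact(conn, 1)[0]            # 1-byte connection identifier
    if kind == READER:
        # A reader announces the virtual host IPs it is responsible for.
        (count,) = struct.unpack("!I", recv_exact(conn, 4))
        for _ in range(count):
            ip = socket.inet_ntoa(recv_exact(conn, 4))
            ip_to_reader[ip] = conn
    return kind

def dispatch_from_vpn(prefixed_pkt):
    """Forward a packet handed over by the VPN server to the right reader.

    The modified VPN server prepends the 4-byte client IP to each packet."""
    client_ip = socket.inet_ntoa(prefixed_pkt[:4])
    pkt = prefixed_pkt[4:]
    conn = ip_to_reader[client_ip]
    conn.sendall(struct.pack("!I", len(pkt)) + pkt)
```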
the traffic load that would otherwise be placed on a single gateway, forming a bottleneck. The VPN system allows us to implement sophisticated load-balancing schemes with little or no modification. In our implementation, we adopted a simple load-balancing scheme in which a client randomly chooses from a set of simulation gateways at the start of the connection. In any case, the I/O threads at the simulator make separate TCP connections to all simulation gateways. We implemented mechanisms to allow a client to connect to different gateways during one simulation. Traffic from the simulator is then routed to the client machine through the gateway with which the client is currently engaged.

C. The VPN Connections

OpenVPN [19] was chosen to allow applications to dynamically connect to the simulation gateway and to emulate network interfaces on the client machines. Each simulation gateway runs a modified OpenVPN server, which spawns the ssfgwd process (described in the previous section) and communicates with it using standard Unix pipes. Each client machine runs an unmodified OpenVPN client to connect to a chosen simulation gateway.

We wrote a utility program to generate the configuration files needed by the OpenVPN servers and clients. Specifically, based on the virtual network specification, the program generates a VPN server configuration for each simulation gateway and a VPN client configuration for each emulated network interface. A VPN server configuration includes the private key of the server, the public keys of all clients managed by the server, and a mapping between public keys and IP addresses. The mapping is used by the server to statically assign IP addresses to incoming client connections. A VPN client configuration includes the private key of the client and the public keys of all servers it can connect to. Each client connects to one server and, upon connection, is automatically assigned an IP address on the virtual private network interface that matches the IP address of the emulated network interface in the simulation. The client machine can thus assume the proper identity, and applications running on the client machine can transparently forward traffic to the simulated network. A sketch of such a configuration generator is shown below.
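As an illustration, the fragment below generates a client-side configuration of the kind described. The OpenVPN directives used (client, dev, proto, remote, remote-random, ca, cert, key) are standard OpenVPN options, but the file layout, key paths, and generator interface are hypothetical; the actual utility program may look quite different. Listing several remote entries together with remote-random would give the random gateway selection described above, and on the server side the static client-to-address mapping could be realized with OpenVPN’s client-config-dir and ifconfig-push options.

```python
def client_config(iface_name, gateways):
    """Emit an OpenVPN client configuration for one emulated interface.

    `gateways` is a list of (host, port) pairs; key/cert paths are invented."""
    lines = [
        "client",
        "dev tun",                      # a routed IP tunnel per interface
        "proto tcp",
        "remote-random",                # pick one listed gateway at random
    ]
    for host, port in gateways:
        lines.append(f"remote {host} {port}")
    lines += [
        "ca keys/ca.crt",               # server public keys (via the CA)
        f"cert keys/{iface_name}.crt",  # this client's public key
        f"key keys/{iface_name}.key",   # this client's private key
    ]
    return "\n".join(lines) + "\n"

# Example: one emulated interface that may use either of two gateways
# (hostnames are hypothetical).
print(client_config("host5-eth0", [("gateway1.example.net", 1194),
                                   ("gateway2.example.net", 1194)]))
```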
D. IP Forwarding

We modified the OpenVPN implementation to allow the simulation gateway to correctly forward IP packets to and from a client machine emulating a router (i.e., one with multiple network interfaces). Each VPN connection at the client machine corresponds to an emulated network interface inside the simulation; a multi-homed host therefore needs multiple VPN connections. Clearly, in this case, one cannot simply use the source or the destination IP address to indicate the network interface of the router through which a packet is received or is to be forwarded. To emulate a router that supports IP forwarding, we precede each packet sent from the network simulator to the simulation gateway with the IP address of the network interface at which the packet was received in the simulation. The VPN server is modified to use this leading address to forward the packet to the corresponding VPN client machine. The client machine therefore receives the packet from the virtual private network interface exactly as in the simulated network. Conversely, a packet sent from the client machine to the simulation gateway is prepended, at the server, with the address of the sender’s virtual private interface before it is given to the ssfgwd process and subsequently to the simulator. The reader thread at the simulator uses this leading address to dispatch the packet to the corresponding virtual host; the simulated packet is then sent out from the corresponding network interface in the simulator. The sketch below summarizes this addressing convention.
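To make the convention concrete, the fragment below shows how the leading interface address might be attached and stripped. Python and the helper names are ours; the actual logic lives inside the modified OpenVPN server and the simulator’s I/O threads.

```python
import socket

def wrap(iface_ip, pkt):
    """Prepend the emulated interface address to a raw IP packet."""
    return socket.inet_aton(iface_ip) + pkt

def unwrap(frame):
    """Split a frame back into (interface address, raw IP packet)."""
    return socket.inet_ntoa(frame[:4]), frame[4:]

# Simulator -> gateway: tag the packet with the receiving interface so the
# VPN server can pick the right tunnel to the router's client machine.
frame = wrap("10.0.3.1", b"...raw IP packet...")   # hypothetical address

# Gateway -> simulator: the reader thread dispatches on the same tag.
iface_ip, pkt = unwrap(frame)
```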
III. EXPERIMENTS

We ran a set of preliminary tests to gauge the performance limitations of our emulation infrastructure. In particular, we assessed the impact on the accuracy of the emulation, since it inevitably introduces overhead, manifested as packet delays and losses, when network traffic is sent between applications running on the client machines and the real-time simulator through the simulation gateway.

In our experiment setup, the machines running the VPN clients and the simulation gateway are both AMD Athlon64 machines (each with a 2.2 GHz CPU and 2 GB of memory) connected via a gigabit switch. We chose two sites to run the network simulator. Site 1 is a 23-node Linux cluster located on campus, where each node has a Pentium IV 3.2 GHz CPU and 512 MB of memory. The campus network is 100 Mb/s Ethernet. Site 2 is an IBM terascale supercomputer located at the San Diego Supercomputer Center, which consists of 272 8-way p655+ and 7 32-way p690 compute nodes [20].

The simulated network is shaped like a dumbbell, a structure commonly used to evaluate TCP congestion control algorithms. The dumbbell has two routers (R1 and R2) and N nodes on either side. The two routers are connected by a link with a bandwidth of 100×N Kb/s and a delay of 300 milliseconds. We let the nodes connected to R1 be servers (S1, S2, ..., SN) and the nodes connected to R2 be clients (C1, C2, ..., CN). The delay of each link connecting a router with one of its adjacent client or server nodes is set to 100 milliseconds, and its bandwidth is 100 Kb/s. A server node on one side of the dumbbell (Sx) sends traffic via TCP to the corresponding client node (Cx) on the other side.

In our experiments, we emulated router R1, which is located on the server side of the dumbbell network. On the client machine, we simultaneously ran several VPN clients, each corresponding to a network interface of router R1. We also enabled IP forwarding on the client machine and set up the kernel routing table to forward packets to and from the VPN tunnel devices. This dumbbell network was designed to expose the overhead of the emulation infrastructure: by changing the number of TCP client/server pairs (N), we could vary the amount of network traffic traversing the emulation infrastructure.

Before the TCP transfers began, we first ran ping on the machine running the simulator and on the machine running the VPN clients, both targeting the simulation gateway, to collect latency statistics (measured in milliseconds). The baseline measurements reported below were collected during a period of relative network quiescence. The results are averages of 10 separate runs:

• x, the round-trip time between the simulation host and the gateway. Site 1: 0.210±0.064; Site 2: 46.674±3.383.
• y, the round-trip time between the gateway and the client machine. Site 1: 0.042±0.007; Site 2: 0.045±0.007.
• z, the round-trip time between the client and a simulated server node. Site 1: 200.794±0.889; Site 2: 247.197±2.827.
• z′, the simulated round-trip time between R1 and S1. Site 1: 200.013; Site 2: 200.013.
• ε = |z − z′|/z′, the emulation error. Site 1: 0.39%; Site 2: 23.59%.
• δ = z − z′ − x − y, the difference in the round-trip times. Site 1: 0.529; Site 2: 0.464.

The difference in the round-trip times (δ) reflects the overhead in addition to the network latencies of the two segments—including, for example, the time it takes to import and export simulation events and to transport packets between ssfgwd and the VPN server. In both cases, δ amounts to only about half a millisecond. These figures can be reproduced directly from the measurements, as the quick check below shows.
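This is pure arithmetic on the numbers quoted above; any small discrepancy with the reported values is rounding.

```python
# Recompute the emulation error eps = |z - z'|/z' and the overhead
# delta = z - z' - x - y from the round-trip times (ms) reported above.
z_prime = 200.013                      # simulated RTT between R1 and S1
for site, x, y, z in (("Site 1", 0.210, 0.042, 200.794),
                      ("Site 2", 46.674, 0.045, 247.197)):
    eps = abs(z - z_prime) / z_prime
    delta = z - z_prime - x - y
    print(f"{site}: eps = {eps:.2%}, delta = {delta:.3f} ms")
# -> Site 1: eps = 0.39%, delta = 0.529 ms
# -> Site 2: eps = 23.59%, delta = 0.465 ms  (reported as 0.464; rounding)
```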
Next we had all server nodes each send a 300 KB file simultaneously over TCP to their client counterparts. We varied the number of client/server pairs (N) during the experiment. Figures 3 and 4 show the amount of data transferred over all TCP connections when we ran the network simulator at Site 1 and Site 2, respectively. Scaled up to 64 TCP connections (an aggregate throughput of 32 Mb/s), the emulation on the local cluster (Site 1) produced results closely matching the simulated behavior. When we increased N to 128, the traffic demand went beyond the capacity of the campus network; significant and unpredictable packet losses occurred, causing the TCP congestion control to reduce the send rates accordingly—a phenomenon resulting in emulation errors.

At Site 2, even with only one connection, the significant round-trip time between the client host and the simulator caused a 22% reduction in the amount of data transferred. Additional TCP connections further exacerbated this problem. Unlike in the previous scenario, latency—rather than packet loss—played the major role in introducing the emulation errors. For 32 TCP connections, the throughput was reduced to 57%, placing the outcome of the emulation markedly far from the expected behavior.

The experiments show that the losses and delays experienced by packets as they travel through the simulation gateway and the client machines can have a profound impact on the accuracy of the emulation. More importantly, our experiments confirm that the throughput of the emulation infrastructure can scale all the way up to the network’s physical limit.

Fig. 3. Amount of data transferred during the first 40 seconds at Site 1 (with each of the 10 runs for N=128 shown).
Fig. 4. Amount of data transferred during the first 25 seconds at Site 2.
IV. A CASE STUDY

In this section we present a study that uses the emulation infrastructure to test a network device code-named CAPE (Content Aware Policy Engine), being developed at Northrop Grumman. CAPE allows one to use a simple language to examine, search, log, and modify the content of the packets that pass through the device. In this study we used a software version of CAPE for content routing and distribution.

We set up a simulated network that consists of two subnetworks with a total of 1008 hosts and routers. This network is a scaled-down version of the baseline network model from the DARPA NMS program, which is commonly used for large-scale simulation studies. Each subnetwork (shown in Fig. 5) has a server cluster (in net 1) that acts as the traffic source. Also, in net 2 and net 3, there are a total of 12 local area network clouds, each with 42 client hosts acting as traffic sinks. To populate the network with sufficient background traffic, at each client network cloud we designated 320 fluid TCP flows, 2 packet TCP flows, and 2 packet UDP flows, each downloading data from a randomly selected server from either campus (see [11] for more detail on the fluid TCP model).

The application under test was streaming video, simulated as a sequence of 1 KB UDP datagrams sent at a constant rate (a source of this kind is sketched below). We randomly chose N clients in net 2 of one campus network and had each of them request a 100 KB/s video stream from a server randomly chosen from the other campus network. We placed the CAPE device between the routers labeled 0:0 and 4 so that it could intercept the streaming traffic. We measured the overall average packet loss rate across all clients as an indication of the received video quality.
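For concreteness, here is a minimal sketch of such a constant-rate UDP source. In the study this traffic was generated by the simulated/emulated application rather than by a script like this; the destination address, the sequence-number header (useful for counting losses at the receiver), and the convention 1 KB = 1000 bytes are our assumptions.

```python
import socket
import time

def stream_video(dst, payload=1000, rate=100_000, duration=10.0):
    """Send `payload`-byte UDP datagrams at `rate` bytes/s for `duration` s."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    interval = payload / rate              # 10 ms between datagrams
    end = time.time() + duration
    seq = 0
    while time.time() < end:
        header = seq.to_bytes(4, "big")    # sequence number for loss counting
        sock.sendto(header + b"\x00" * (payload - 4), dst)
        seq += 1
        time.sleep(interval)

# e.g. stream_video(("172.16.2.10", 5004))   # hypothetical client address
```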
Fig. 5. The DARPA NMS campus network instrumented with CAPE.
We ran three tests. As the baseline, we took out the CAPE device and simply ran the simulation. For comparison, in the second test we emulated the CAPE device but programmed it to forward all the traffic without inspecting the content. In the third test we configured the CAPE device to allow only one outstanding request from a client to reach the server; the rest were cached inside the CAPE device. When the server streamed the video back to the client, CAPE replicated the video stream to all other clients who had also requested the video download. In this way, the network path between the CAPE device and the server was burdened with only one video stream, and better performance was expected.

We ran the experiments on the same AMD machines with a gigabit connection as in the previous section. We varied the number of clients N from 5 to 160, resulting in aggregate traffic of 4 to 128 Mb/s. We found that the Python-based software CAPE device was unable to process packets at a rate higher than 4K packets/s without significantly dropping them. Therefore, for experiments with more than 40 clients, the speed of the real-time simulator was throttled down to accommodate the slow processing of the CAPE device.

Fig. 6. Using CAPE content distribution to reduce packet loss.

Figure 6 shows that, as expected, the packet loss rate (averaged over 10 runs) increases with the number of clients when content distribution is disabled. The results from simulation and emulation are statistically indistinguishable. When replication is activated in CAPE, however, the loss rate remains stable, implying that the packet losses occur primarily on the network segment between the CAPE device and the streaming video server.

V. CONCLUSIONS AND FUTURE WORK

An important aspect of our emulation infrastructure is the use of an existing VPN framework, which allows us to inherit its important properties and derive a flexible, portable, scalable, and secure implementation of a simulation gateway between the network applications and the real-time simulator. We plan to conduct further scalability tests in the near future. We also plan to investigate other criteria that can be used by client machines to select among multiple simulation gateways. For example, client connections can be differentiated by their bandwidth and latency requirements: a client running applications with more stringent quality-of-service requirements should be assigned to a simulation gateway with higher bandwidth and lower latency. Furthermore, we plan to investigate dynamic traffic load-balancing schemes and fault-tolerance measures.

ACKNOWLEDGMENTS

This research is supported in part by National Science Foundation CAREER grant CNS-0546712. The authors would like to thank the San Diego Supercomputer Center (SDSC) for providing access to its computing resources and Northrop Grumman Corporation for the CAPE device prototype. Thanks to Phil Romig and Mike Colagrosso, who helped set up the physical testbed. We would also like to thank the anonymous reviewers for their constructive suggestions.

REFERENCES
[1] P. Barford et al., “Bench-style network research in an Internet instance laboratory,” SIGCOMM Computer Communication Review, vol. 33, no. 3, pp. 21–26, 2003.
[2] L. Peterson et al., “A blueprint for introducing disruptive technology into the Internet,” in HotNets-I, October 2002.
[3] Y. Liu et al., “Scalable fluid models and simulations for large-scale IP networks,” TOMACS, vol. 14, no. 3, pp. 305–324, July 2004.
[4] L. Breslau et al., “Advances in network simulation,” IEEE Computer, vol. 33, no. 5, pp. 59–67, May 2000.
[5] A. Vahdat et al., “Scalability and accuracy in a large scale network emulator,” in OSDI’02, December 2002.
[6] J. Zhou et al., “TWINE: A hybrid emulation testbed for wireless networks and applications,” in INFOCOM’06, April 2006.
[7] D. M. Nicol et al., “Advanced concepts in large-scale network simulation,” in WSC’05, December 2005.
[8] R. Simmonds and B. Unger, “Towards scalable network emulation,” Computer Communications, vol. 26, no. 3, pp. 264–277, February 2003.
[9] D. Nicol and G. Yan, “Discrete event fluid modeling of background TCP traffic,” TOMACS, vol. 14, no. 3, pp. 211–250, July 2004.
[10] J. Zhou et al., “MAYA: integrating hybrid network modeling to the physical world,” TOMACS, vol. 14, no. 2, pp. 149–169, April 2004.
[11] J. Liu, “Packet-level integration of fluid TCP models in real-time network simulation,” in WSC’06, December 2006.
[12] P. Zheng and L. Ni, “EMPOWER: a network emulator for wireline and wireless networks,” in INFOCOM’03, March/April 2003.
[13] B. White et al., “An integrated experimental environment for distributed systems and networks,” in OSDI’02, December 2002, pp. 255–270.
[14] S. Y. Wang et al., “The design and implementation of the NCTUns 1.0 network simulator,” Computer Networks, vol. 42, no. 2, pp. 175–197, 2003.
[15] R. Bradford et al., “Packet reading for network emulation,” in MASCOTS’01, August 2001, pp. 150–157.
[16] X. Liu et al., “Network emulation tools for modeling grid behavior,” in CCGrid’03, May 2003.
[17] “The PRIME Research.” [Online]. Available: http://prime.mines.edu/
[18] M. Liljenstam et al., “RINSE: the real-time interactive network simulation environment for network security exercises,” in PADS’05, June 2005, pp. 119–128.
[19] J. Yonan, “OpenVPN – an open source SSL VPN solution.” [Online]. Available: http://www.openvpn.net/
[20] San Diego Supercomputer Center at UCSD, “SDSC DataStar user guide.” [Online]. Available: http://www.sdsc.edu/user_services/datastar/