Model-Driven Network Emulation with Virtual Time Machine - CiteSeerX

4 downloads 151583 Views 230KB Size Report
and new services directly on Internet itself [30]. ... 1. Accuracy: A network testbed shall maintain a desir- able degree of accuracy in representing the target net-.
Florida International University Technical Report TR-2009-03-01

Model-Driven Network Emulation with Virtual Time Machine

Jason Liu and Raju Rangaswami Florida International University

[email protected]

[email protected]

1

Model-Driven Network Emulation with Virtual Time Machine Jason Liu and Raju Rangaswami School of Computing and Information Sciences Florida International University Miami, Florida 33199 Emails: {liux,raju}@cis.fiu.edu

Abstract We present our initial work on VENICE, a project that aims at developing a high-fidelity, high-performance, and highly-controllable experimental platform on commodity computing infrastructure to facilitate innovation in existing and futuristic network systems. VENICE employs a novel model-driven network emulation approach that combines simulation of large-scale network models and virtualmachine-based emulation of real distributed applications. To accurately emulate the target system and meet the computation and communication requirements of its individual elements, VENICE adopts a holistic machine and network virtualization technique, called virtual time machine, in which the time advancement of simulated and emulated components are regulated in complete transparency to the test applications.

utility of these network testbeds: 1. Accuracy: A network testbed shall maintain a desirable degree of accuracy in representing the target network. Accuracy also implies realism: the testbed shall be able to faithfully capture detailed network transactions. Experiments shall include real implementations of network applications with representative network topologies under realistic traffic conditions. 2. Performance: Given that users typically have access to only limited physical infrastructure, a network testbed shall utilize resources efficiently. Meanwhile, the testbed shall be scalable to facilitate understanding the complex behaviors of large-scale networks. 3. Controllability: A network testbed shall provide mechanisms for flexible configuration and visualization of network experiments. Users shall be able to explore design alternatives and the parameter space with ease and efficiency.

1. A Harmless Necessary Cat1 The success of Internet depends on the continued innovations of the networking research community, and on the successful transition of laboratory research projects into real products and services. Yet, the community faces a problem. Internet is an operational environment rather than an experimental testbed, and it has a distributed ownership. These two factors combined together create a hostile environment to effectively prototype, deploy, and evaluate new designs and new services directly on Internet itself [30]. Technological changes are slow to be adopted by the Internet operators, partially due to insufficient experimentation and evaluation. As a result, the current Internet fails to realize the full potential of emerging network technologies. We can generally classify available experimental networking research testbeds into physical, emulation, and simulation testbeds. Three important factors determine the 1 From

Shakespeare’s play “the Merchant of Venice”.

Emulation testbeds (such as EmuLab [9], PlanetLab [31], and the current GENI effort [13]) combine physical testbeds with emulation and provide the ability to conduct extensive live experiments. Although emulation allows flexible network experiments (e.g., VINI [1]), emulated network conditions are inherently dependent upon the available physical infrastructure (e.g., nodal processing speed, link bandwidths and latencies) and the target traffic scenario. These factors limit the flexibility and the scalability of the testbeds. Network simulation tools, in contrast, allow examining large-scale phenomena more expediently. However, simulation fairs poorly in other aspects, particularly in its operational realism. Simulation model development is both labor-intensive and error-prone; reproducing realistic network topology, representative network traffic, and diverse operational conditions in simulation is known to be a substantial undertaking [11]. Despite these problems, simulation is effective at capturing high-level design issues, an-

tions to the simulation engine, resource-intensive I/O operations are reduced so that network experiments can scale to accommodate substantially larger networks. To provide a high-performance and scalable interface between the virtual machines (running as emulated hosts) and the simulation engine, we are investigating various communication optimizations. It is conceivable VENICE be extended to run on supercomputers (with today’s near petaflop performance [36]) or computational grids to allow “superscale network emulation” as a genuine part of large science applications. Consequently, networking research can truly benefit from the advances in cyber infrastructure research. Controllability. VENICE aims to provide a controllable experimental infrastructure, allowing researchers to slow down or speed up experiments, inject faults and failures, and provide a simple and flexible interface for configuring, monitoring, and controlling the testbed both before and during the experiment. In this paper, we present our initial work on VENICE, focusing only on the first two goals, namely accuracy and performance.

swering what-if questions, and providing preliminary analysis of complex system behaviors. In this paper, we present our initial experiences with the VENICE project (Virtual Emulation Network Infrastructure for Commodity Environments), which aims to combine the advantages of simulation and emulation techniques to provide a high-fidelity, high-performance, and highlycontrollable experimental platform for large-scale network experiments. We propose model-driven network emulation, a holistic machine and network virtualization scheme that allows execution of network simulation models and emulation of real network protocols and distributed applications. Traffic generated by the real network applications is conducted on a virtual simulated network with packet delays and losses calculated according to the simulated network conditions. VENICE uses a unified machine and network virtualization scheme, which we call the virtual time machine, to accurately represent the the computation and communication environments of the target system as well as to regulate the time advancement of the virtual-machine resident emulated elements and the simulated network. The virtual time machine builds upon recent work on time dilation [16] which allows time, as experienced by the applications running on individual virtual machines, to be transparently dilated or contracted relative to real time. VENICE coordinates time progression in various parts of the system, including the simulation engine and associated emulated elements, so that accuracy (in terms of time-ordered packet delivery) and performance (in terms of maximizing resource utilization) are both improved. VENICE aims to comprehensively address three important factors that affect the utility of experimental network platforms—accuracy, performance, and controllability. Accuracy. Model-driven network emulation, as proposed in VENICE, allows real implementations of network applications to run unmodified on virtual machines configured to represent the target execution environment. Meanwhile, communications—especially, packet forwarding operations—are carried out via simulation, which involves only changes to a few state variables, far more efficient than the expensive I/O operations required by emulation. One can thus incorporate detailed models (e.g., sophisticated network queuing management schemes) for an accurate network representation. Moreover, simulation can easily incorporate mathematical models with different modeling resolutions (such as the fluid traffic models [25, 27]), allowing necessary trade-off between accuracy and performance. Performance. VENICE targets existing commodity computing infrastructure, readily available to researchers with sufficient computation and communication power for day-to-day networking research. Consequently, scalability is an important design goal. By offloading communica-

2. Speak Me Fair In Death In this section, we discuss existing research in network testbeds, machine virtualization, and virtual time management schemes, based on which we build VENICE. Note that due to limited space this is by no means comprehensive. Emulation testbeds provide an iconic view of the Internet. Their popularity is evident by the wide variety of network experiments they currently support. For example, PlanetLab consists of machines distributed across the Internet and shared by researchers simultaneously conducting experiments directly on the Internet. EmuLab is also an experimentation facility consisting of many networked computers integrated and coordinated via emulation mechanisms to present a diverse virtual network environment. Emulation testbeds can be built on a variety of computing infrastructures, including dedicated high-performance computing clusters (such as EmuLab and ModelNet [38]), distributed platforms (such as PlanetLab, VINI, X-Bone [37], and VIOLIN [19]), and special programmable hardware (such as ONL [6] and the CMU Wireless Emulator [20]). A major distinction between simulation and emulation is that simulation contains only software modules representing network protocols and network entities (such as routers and links), and mimicking network transactions as pure logic operations, where emulation involves artificially delaying and dropping real packets [34]. Representative network simulators include ns-2 [4] (with the ongoing effort in ns-3 [28]), SSFNet [5], GTNetS [15], and QualNet [33]. Real-time simulation allows network simulation to interact with real implementations of network applications 3

Virtual Network

for improved accuracy. Most real-time simulators (such as NSE [10], IP-TNE [35], MaSSF [24], Maya [44], and PRIME [32]) are based on existing network simulators augmented with emulation capabilities. EmuLab also includes real-time simulation based on NSE. Since real applications operate in real time, real-time network simulation must meet real-time requirements. To speed up simulation, one can apply parallel simulation techniques to harness the computing resources of parallel computers; in conjunction, multi-resolution modeling techniques (such as lowresolution fluid models [23]) can be employed to reduce the computational demand. Recently, researchers have proposed using machine virtualization for building network emulation testbeds [1, 3, 17, 26]). Virtual machine solutions come in variations. Full-scale machine virtualization (such as VMWare workstation [39] and User-Mode Linux [7]) offer complete transparency to the guest operating system, but in doing so incur a large performance overhead. Light-weigh machine virtualization (such as Xen [8], Vserver [22], VMWare ESX server [40], and Denali [41]) requires modification to the guest OS to bargain for greater efficiency. In addition to virtualizing the entire operating system instance, one can virtualize only network stacks [18, 21, 29, 42]. Applications running on the same OS can be presented with multiple independent network stacks, managed individually with distinct network devices. Virtual time management has been the central theme of parallel discrete-event simulation [12]. By exploiting the concurrency inside a simulation model, parallel simulation can harness the collective power of parallel computers to accelerate the execution of large-scale network models. Virtual time has been proposed for emulating network and system configurations which are superior to those provided by the available physical infrastructure. A principal technique employed is time dilation [16], which enables controlling a system’s notion of how time progresses; one can control the system clock so that applications experience a slower passage of time and are consequently provided with seemingly upgraded system resources (including both CPU speed and network processing rate). Bergstrom et al. [2] proposed a modified time dilation mechanism that includes the ability to warp time to handle periods of inactivity and thereby improve resource utilization during the period of resource under-subscription. Grau et al. [14] proposed a low-overhead conservative synchronization mechanism to regulate the time dilation factors across virtual machines.

H3

H2

Compute Cluster

H1

VM0

Scaling Unit

H1 Guest OS

VM1 H2

VM2

Network Applications Running on H1

Network Applications Running on H2

Guest OS

Guest OS

Virtual Machine Monitor Physical Hardware other simulator instances

Figure 1. VENICE infrastructure. conduct high-fidelity experiments of existing and futuristic network systems on commodity infrastructure. VENICE architecture. A VENICE network model may consist of a set of network elements running network protocols and distributed applications. The virtual network is simulated in parallel on the underlying compute cluster (illustrated in Figure 1). A user of VENICE specifies a model of the target network that he wishes to instantiate. A subset of network hosts and routers (as specified by the user) are emulated, where real implementations of network applications are executed directly on virtual machines instantiated on the physical compute nodes. Each physical node of the compute cluster is treated as a scaling unit, which also includes a simulation instance responsible for a part of the virtual network. The emulated elements are assigned to independent virtual machines so they can run unmodified network applications. Traffic generated by these applications is intercepted at the virtual machines and redirected to the network simulator to be “carried” over the virtual network. Emulation traffic is processed together with simulation traffic inside the simulation engine both represented as simulation events. Packet delays and losses are calculated based on competition for virtual network resources. The simulation instance at each scaling unit communicates with other simulation instances (located on other physical nodes) through a parallel discrete-event simulation executive. We now illustrate a specific instantiation of a user’s experiment using the example depicted in Figure 1. In this example, H1 and H2 are emulated nodes in the virtual network as specified by the user. VENICE instantiates two virtual machines correspondingly, VM1 and VM2 , which provide the target execution environment for the applications under

3. Of Wisdom, Gravity, Profound Conceit In this section, we elaborate on the architecture as well as the key design and implementation aspects of VENICE. The goal of VENICE is to allow networking researchers to 4

simulator appears to be. The passage of time within an emulated element, however, may be dilated or contracted relative to real time in case the emulated resource properties differ from available physical resources. For example, if we need to emulate a gigabit network service on a system that can only offer 100 Mb/s connections between VMs, the time on these VMs may need to be dilated (i.e., slowed down) by at least a factor of 10. It is important to note that a gigabit network service does not mean one has to send data at 1 Gb/s all the time. Due to nonuniform system properties and dynamic load characteristics of the target network, using a single time dilation factor for all emulation elements during the entire experiment can lead to arbitrary underutilization of available physical resources. One can extend the time dilation technique, proposed by Gupta et al. [16], to allow dynamic control over emulated network performance with specific link bandwidths and latencies. The time dilation factor can adapt to the changes in bandwidth utilization. Changing the time dilation factor, however, also changes the perceived CPU speed. To allow independent CPU and network scaling, we are currently investigating techniques to combine the time dilation technique with VM resource allocation schemes and dynamic scheduling policies. VENICE must be able to maintain a consistent time frame across the distributed emulation infrastructure where different elements may advance time independently. That is, since individual emulation elements maintain their own timelines, various elements of the emulated system must be synchronized to ensure causality and timeliness. We are currently investigating the effect of dynamic time dilation on emulation accuracy so that we can derive an optimal synchronization strategy for adjusting the time dilation factor accordingly within a tolerable threshold. We can extend existing synchronization methods (such as the relativistic time [2], or the epoch-based synchronization scheme [14]) to coordinate the time advancement of both simulation instances (measured in discrete time) and virtual machines (measured in continuous time).

test (e.g, a new content distribution service). These applications may even include full-scale software routers.2 Realizing a hybrid simulation-emulation system. VENICE simulates the target communication system. VENICE is being developed based on our previously developed real-time network simulator [reference blinded], which supports a rich collection of network entities (such as routers and queues) and common network protocols (such as TCP, UDP, and HTTP). The network simulator conducts both simulation and emulation traffic, the latter of which is generated by the real applications running on the virtual machines. To illustrate, suppose a web client running on VM1 sends an HTTP request to a web server running on VM2 . The associated packet will be captured at the virtual network device and sent to the simulation instance running on another virtual machine at the same scaling unit (VM0 in the example). Subsequently, a simulation packet will be sent from virtual node H1 to H2 on the virtual network, with a delay and possible loss calculated subject to the conditions of the virtual network. Once the packet “arrives” at virtual node H2 , the simulator will send a real packet to VM2 . The underlying conduit is concealed from the applications running on VM1 and VM2 , which experience the same behavior in terms of throughput and latency as if on the virtual network. Optimizing network communication. VENICE must be able to provide a high-performance and scalable interaction interface supporting emulation traffic flows between potentially a large number of virtual machine instances running real applications and the corresponding simulation instances. We have been developing a variety of solutions to this problem. One is to use TUN/TAP devices, i.e., through IP tunneling. Similar solutions have been adopted by VINI [1] and Trellis [3]. We built the conduit by customizing OpenVPN and making it function as a gateway for bridging traffic between the virtual machines and the simulated network [reference blinded]. Alternative solutions adopt inter-VM communication mechanisms. Currently we are investigating the use of network bridging as well as XenSocket [43] for high-throughput low-latency communications between Xen domains and will report results in another paper.

Incorporating traffic models. VENICE provides a multiscale multi-modal network traffic emulation framework for accurate representation of traffic on the target infrastructure. It is undoubtedly a great challenge to accurately represent the global traffic behavior in a repeatable and controllable fashion for network experiments. We propose a multi-pronged approach by making a distinction between foreground traffic, which is generated by emulated network elements and shall be dealt with high fidelity, and background traffic, which represents the bulk of the network traffic. Although it is of secondary interest and may not require significant accuracy, background traffic interferes with the foreground traffic as they both compete for network resources, and thus determines the behavior of the network applications we intend to study. VENICE can integrate hy-

Employing time dilation. The simulation engines in VENICE uses simulation time. The speed with which simulation advances its simulation time is dictated by the simulator’s event processing speed. For network simulation, this largely depends on the traffic load. Higher level of network traffic results in more simulation events that are needed to be processed at a given time, and therefore the slower the 2 In a preliminary study [reference blinded], the emulation infrastructure has demonstrated capable of supporting large-scale routing experiments, involving hundreds of XORP routers per scaling unit.

5

cation scenarios) to live experimentation and deployment (e.g., on the GENI platform).

brid traffic models, such as epidemic and fluid models with packet models to represent the aggregate background traffic behavior [two references blinded].

References 4. Let It Serve for Table Talk [1] A. Bavier, N. Feamster, M. Huang, L. Peterson, and J. Rexford. In VINI veritas: realistic and controlled network experimentation. SIGCOMM’06. [2] C. Bergstrom, S. Varadarajan, and G. Back. The distributed open network emulator: Using relativistic time for distributed scalable simulation. PADS’06. [3] S. Bhatia, M. Motiwala, W. Muhlbauer, V. Valancius, A. Bavier, N. Feamster, L. Peterson, and J. Rexford. Hosting virtual networks on commodity hardware. Technical Report GT-CS-07-10, Georgia Tech, 2008. [4] L. Breslau, D. Estrin, K. Fall, S. Floyd, J. Heidemann, A. Helmy, P. Huang, S. McCanne, K. Varadhan, Y. Xu, and H. Yu. Advances in network simulation. IEEE Computer, 33(5), 2000. [5] J. Cowie, D. Nicol, and A. Ogielski. Modeling the global Internet. Computing in Science and Engineering, 1(1):42– 50, 1999. [6] J. DeHart, F. Kuhns, J. Parwatikar, J. Turner, C. Wiseman, and K. Wong. The open network laboratory. ACM SIGCSE Bulletin, 38(1):107–111, 2006. [7] J. Dike. A user-mode port of the Linux kernel. 4th Annual Linux Showcase & Conference, 2000. [8] B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, I. Pratt, A. Warfield, P. Barham, and R. Neugebauer. Xen and the art of virtualization. SOSP’03. [9] EmuLab. http://www.emulab.net/. [10] K. Fall. Network emulation in the Vint/NS simulator. ISCC’99. [11] S. Floyd and V. Paxson. Difficulties in simulating the Internet. TON, 9(4):392–403, 2001. [12] R. M. Fujimoto. Parallel and distributed simulation systems. John Wiley & Sons, 2000. [13] GENI. http://www.geni.net/. [14] A. Grau, S. Maier, K. Herrmann, and K. Rothermel. Time jails: A hybrid approach to scalable network emulation. PADS’08. [15] GTNetS. http://www.ece.gatech.edu/ research/labs/MANIACS/GTNetS/. [16] D. Gupta, K. Yocum, M. McNett, A. Snoeren, A. Vahdat, and G. Voelker. To infinity and beyond: time-warped network emulation. NSDI’06. [17] M. Hibler, R. Ricci, L. Stoller, J. Duerig, S. Guruprasad, T. Stack, K. Webb, and J. Lepreau. Large-scale virtualization in the emulab network testbed. USENIX’08. [18] X. W. Huang, R. Sharma, and S. Keshav. The ENTRAPID protocol development environment. INFOCOM’99. [19] X. Jiang and D. Xu. VIOLIN: Virtual internetworking on overlay infrastructure. ISPA’04. [20] G. Judd and P. Steenkiste. Repeatable and realistic wireless experimentation through physical emulation. SIGCOMM CCR, 34(1):63–68, 2004. [21] Linux VRF. http://sourceforge.net/ projects/linux-vrf/.

We conclude this paper with a discussion of several important issues that fundamentally differentiate the VENICE approach with existing network emulation testbeds. Live vs. synthetic traffic. The ability to perform live experiments is a major attraction for systems like PlanetLab and GENI, and is unparalleled by other methods. Conducting tests under real network traffic can reveal both design and operational problems before the ultimate deployment. However, the ability of live testing sacrifices controllability. After all, one cannot control live traffic to reveal every potential problem that the target system would encounter under all kinds of traffic conditions. VENICE uses network simulation to model network transactions and therefore can easily incorporate synthetic traffic models. Synthetic traffic cannot replace live traffic, but it provides a controllable diversity. For example, synthetic traffic can be used to represent various congestion levels of the network queues. It is possible to run VENICE on distributed platforms (such as an overlay network) so that it can incorporate live traffic situations. It is important, however, to accurately integrate live traffic with emulation traffic from time-dilated systems. Common vs. special devices. Simulated packet forwarding is faster than explicit I/O operations, but is slower in terms of processing packets than on specialized hardware architectures, such as FPGAs and network processors (e.g., see [6]). VENICE uses time dilation to proportionally increase the perceived networking speed to compensate for this deficiency and to accommodate future network capabilities. It is possible to integrate model-driven network emulation with other emulation testbeds supported by special devices for sake of efficiency. Commodity vs. shared platforms. The VENICE’s approach to increasing availability of the network testbed is to adopt commodity platforms (so that everybody would be able to run the software on existing computing systems). This is in contrast to the philosophy behind PlanetLab, EmuLab, and GENI, where resource sharing is of critical importance. Virtualization techniques are used in the latter case primarily to reduce interference between experiments sharing the same hardware platform. In VENICE, virtualization techniques are used primarily to represent the target system independent of the underlying execution environment, and to incorporate virtual time management according to resource utilization and time synchronization. VENICE complements and strengthens existing network testbed approaches; it will help smooth the transition from controlled studies (with diverse execution and communi6

[22] Linux VServer. http://linux-vserver.org/. [23] J. Liu. Packet-level integration of fluid TCP models in realtime network simulation. WSC’06. [24] X. Liu, H. Xia, and A. A. Chien. Network emulation tools for modeling grid behavior. CCGrid’03. [25] Y. Liu, F. Presti, V. Misra, D. Towsley, and Y. Gu. Scalable fluid models and simulations for large-scale IP networks. TOMACS, 14(3):305–324, July 2004. [26] S. Maier, D. Herrscher, and K. Rothermel. Experiences with node virtualization for scalable network emulation. Computer Communications, 30(5):943–956, 2007. [27] D. M. Nicol and G. Yan. Discrete event fluid modeling of background TCP traffic. TOMACS, 14(3):211–250, 2004. [28] The ns-3 Project. http://www.nsnam.org/. [29] OpenVZ. http://openvz.org/. [30] L. Peterson, T. Anderson, D. Culler, and T. Roscoe. A blueprint for introducing disruptive technology into the Internet. HotNets-I, 2002. [31] PlanetLab. http://www.planet-lab.org/. [32] PRIME. http://www.cis.fiu.edu/prime/. [33] Scalable Network Technologies. http:// scalable-networks.com/. [34] L. Rizzo. Dummynet: a simple approach to the evaulation of network protocols. SIGCOMM CCR, 27(1):31–41, 1997. [35] R. Simmonds and B. W. Unger. Towards scalable network emulation. Computer Communications, 26(3):264– 277, February 2003. [36] TOP 500. http://top500.org/. [37] J. Touch. Dynamic Internet overlay deployment and management using the X-Bone. ICNP’00. [38] A. Vahdat, K. Yocum, K. Walsh, P. Mahadevan, D. Kostic, J. Chase, and D. Becker. Scalability and accuracy in a large scale network emulator. OSDI’02. [39] VMWare Workstation. http://www.vmware.com/ products/desktop/workstation.html. [40] VMWare ESX Server. http://www.vmware.com/ products/vi/esx/. [41] A. Whitaker, M. Shaw, and S. Gribble. Denali: Lightweight virtual machines for distributed and networked applications. USENIX’02. [42] M. Zec. Implementing a clonable network stack in the FreeBSD kernel. USENIX’03. [43] X. Zhang, S. McIntosh, and P. Rohatgi. XenSocket: A highthroughput inter-domain transport for VMs. IBM T.J. Watson Research Report, 2007. [44] J. Zhou, Z. Ji, M. Takai, and R. Bagrodia. MAYA: integrating hybrid network modeling to the physical world. TOMACS, 14(2):149–169, 2004.

7

Suggest Documents