2011 IEEE 4th International Conference on Cloud Computing

Scaling Non-elastic Applications Using Virtual Machines

Thomas Knauth, Christof Fetzer
TU Dresden
{firstname.lastname}@tu-dresden.de

Abstract—Hardware virtualization is a cost-effective means to reduce the number of physical machines (PMs) required to handle computational tasks. Virtualization also guarantees high levels of isolation (performance- and security-wise) between virtual machines running on the same physical hardware. Besides enabling consolidation of workloads, virtual machine (VM) technology also offers an application-independent way of shifting workloads between physical machines. Live migration, i.e., shifting workloads without explicitly stopping the virtual machine, is particularly attractive because of its minimal impact on virtual machine and hence service availability. We explore the use of live migration to scale non-elastic (i.e., static runtime configuration) applications dynamically. Virtual machines thus provide an application-agnostic path to dynamic scalability and open new avenues for minimizing physical resource usage in a data center. We show that virtualization technology, in connection with the live migration capabilities of modern hypervisors, can be used to scale non-elastic applications in a generic way. We also highlight some problems still present in current virtualization techniques with respect to live migration.

Keywords: platform virtualization, elasticity, scalability

I. INTRODUCTION

Green computing has become a major factor driving innovation in the technology sector in general and the computer industry in particular. Computation is on its way to becoming a utility, much like electricity at the end of the 19th century. Companies specialize in the provisioning of computing resources and offer access to them to other businesses. This is especially attractive for small to medium-sized companies, which no longer need to invest in an in-house IT department. The mainstream media refers to this outsourcing of IT infrastructure as "moving into the cloud" or "cloud computing".

One tenet of cloud computing is the advent of virtualization technologies. With modern virtualization technology, data center operators today have more opportunities than ever for consolidating hardware resources. Virtual machines provide the illusion of dedicated hardware to multiple users on a single physical machine. Security concerns aside, virtual machines enable deployments that previously required separate physical machines.

Running multiple virtual machines, possibly from different customers, on a single physical host will lead to resource contention. Customers' application demand for CPU, RAM, disk, and network bandwidth will vary over time. Dynamically re-balancing virtual machines is thus of particular concern. Modern hypervisors provide a feature called "live" migration, where virtual machines can be re-located between physical hosts. The benefit of live migration, compared to stop-copy-restart migration, is the minimal downtime incurred. Ideally, users of services running inside the migrated VM will only see a minimal (sub-second) increase in latency during live migration.

Using VM technology, multiple operating system instances and application stacks can run in parallel on a single physical machine. We, however, investigate how virtual machines can aid in scaling statically provisioned, distributed applications. Even if the application itself does not allow adding new instances at runtime, virtual machines help minimize physical resource usage. If load is light, all virtual machines can run on a single physical machine. As load increases, individual virtual machine instances are migrated to spare physical machines. This way, the available compute resources can be smoothly adapted on demand. We are not interested in handling sudden load spikes (i.e., bursty workloads), but rather gradual, forecastable load changes. For example, online shopping sites usually show positive peak activity at noon and negative peak activity at midnight, with smooth transitions in between.

II. PROBLEM

Modern data center operators stay competitive by multiplexing multiple operating system instances on a single physical machine. Individual users have the illusion of a dedicated operating system instance, without caring too much about what other workloads might be running concurrently on the same physical host.

The scenario investigated in this paper assumes an environment that is not shared by multiple users. Instead, we assume there is a single user who wants to run a single program. The program can be run in a distributed fashion, i.e., across multiple physical nodes. This allows harnessing more than just the compute resources of a single machine. The number of nodes (physical or virtual) for a specific configuration is static. It is not possible to start the application on a machine that was not part of the initial setup and add this machine to the existing deployment.

Applications with a static configuration have the benefit of being easier to develop. Adding compute resources at runtime requires mechanisms to re-partition the application state. Such mechanisms cost time to develop and, depending on the application, may be non-trivial to implement. Our proposed scheme for dynamically scaling non-elastic applications does not require any application support. In fact, our scheme obviates any application-level state-migration features.

With support for dynamic scaling, an application can readily adapt to varying request rates. We do not concern ourselves with sudden load spikes, but rather with gradual changes in load, e.g., over the course of a business day. Whether our scheme can be extended to handle load spikes depends on the specific characteristics of the spikes as well as the duration and impact of virtual machine migration.

Deploying a static application for maximum load directly on physical machines wastes a lot of energy. The system will handle peak load only for a small fraction of its total uptime. The rest of the time, machines will be idle. Idle machines still consume significant amounts of energy (in our case 130 W) while doing no useful work.

Even though the application itself does not allow dynamic scaling at runtime, we propose to use virtualization along with live migration to reap most of the benefits of a dynamically scalable application. The basic idea is to start each application instance inside a virtual machine. The total number of virtual machines depends on the maximum load the system should be able to handle. At full scale-out, each physical machine hosts a single virtual machine. This defines the maximum load the system can handle. When load is minimal, on the other hand, all virtual machines run on a minimal set of physical machines. In between maximum and minimum load, the allocation of virtual to physical machines varies. The following questions require clarification in order to make the proposed scheme work in practice.

A. Virtualization overhead

There are different technologies available for setting up a virtualized compute environment: commercial, closed-source implementations like VMware, as well as freely available open-source ones like kvm and Xen. Besides the well-known and widely used virtualization technologies, there exist research prototypes such as Nova [1]; we do not consider them further here. The virtualization products differ in their architecture and implementation, which in turn dictates the maximum achievable performance. No matter which virtualization technique is used, there will always be a performance penalty compared to execution without the virtualization layer. The virtualization overhead dictates the maximum number of physical machines required to handle the expected peak load. For example, if 10 machines are enough to handle peak load without virtualization, 12 machines might be needed with virtualization.

B. Virtual machine interaction

Related to the problem of virtualization overhead is the interaction of multiple virtual machines running on the same physical host. The load that can be stably handled by a single VM running on a physical host is likely to be higher than that of a physical machine running, e.g., four VMs. Multiplexing resources between virtual machines in itself creates overhead, in addition to the overhead created by the virtualization technology. Virtual machine interaction dictates the maximum number of virtual machines running simultaneously on a single physical host. Running too many virtual machines on a single host will have a decidedly negative effect on the load the machine can handle. Even for minimal load, the number of required physical nodes may therefore be greater than one. The ideal number of virtual machines per physical machine also affects the maximum achievable energy savings: the fewer physical nodes needed for minimum load, the higher the energy savings.

C. Migration strategy

When load on the minimally deployed resources reaches a critical threshold, virtual machines must be migrated in order to handle further increases in incoming requests. This opens the question of migration strategy. How many virtual machines should be migrated at any one time? How many physical machines should be booted up to accommodate the increased load? One strategy is to migrate individual virtual machines from their origin host to newly booted machines. Only the origin host will ever host more than one virtual machine. This strategy minimizes the total number of migrations needed to reach full scale-out. However, it does not necessarily ensure the most effective physical resource utilization. Optimizing the migration strategy is out of the scope of this paper. When demonstrating the viability of our approach, we use the simple strategy just described.
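To make the simple strategy concrete, the following Python sketch captures its decision logic under stated assumptions: the Host class, the live_migrate stub, and the host names are hypothetical illustrations rather than part of StreamMine or the Xen toolstack, and a real deployment would call the hypervisor's live migration command instead of the stub.

```python
"""Sketch of the simple scale-out strategy described above."""
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Host:
    name: str
    powered_on: bool = False
    vms: List[str] = field(default_factory=list)


def live_migrate(vm: str, src: Host, dst: Host) -> None:
    # Stand-in for the hypervisor's live migration command.
    src.vms.remove(vm)
    dst.vms.append(vm)


def scale_out_step(origin: Host, spares: List[Host]) -> Optional[Host]:
    """Move one VM from the origin host onto a freshly booted spare.

    Only the origin host ever runs more than one VM, which minimizes the
    number of migrations needed to reach full scale-out.
    """
    if len(origin.vms) <= 1 or not spares:
        return None                   # already at full scale-out
    target = spares.pop(0)
    target.powered_on = True          # boot exactly one spare machine
    live_migrate(origin.vms[-1], origin, target)
    return target


if __name__ == "__main__":
    origin = Host("pm0", powered_on=True, vms=[f"vm{i}" for i in range(8)])
    spares = [Host(f"pm{i}") for i in range(1, 8)]
    while scale_out_step(origin, spares):
        pass                          # seven steps to reach full scale-out
    assert len(origin.vms) == 1
```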

III. METHODOLOGY

A. Workload

As an example application which cannot (at the time of writing) be scaled dynamically, we chose StreamMine [2]. StreamMine is a stream processing engine developed in our research group with the goal of enabling high-throughput, low-latency data stream processing. Stream processing aims to avoid touching the disk at all costs. Input events and the data structures required for processing are all kept in volatile memory. This is beneficial in a virtualized environment with respect to live migration. Hard disks in a live-migration-enabled environment usually mean network-attached storage, because migration only transfers the virtual machine's in-memory state. In our case, virtual machines have access to a small (1 GB) virtual hard disk for temporary data. The virtual hard disks are shared via NFS among all participating nodes.

Figure 1 shows a graphical representation of a typical stream processing setup. Stream processing applications have a staged architecture. Events flow from one stage to the next. A stage comprises one or more nodes performing identical tasks. Events are simple key-value pairs. The key is used to route events between stages: events with the same key will always be routed to the same node in a stage. Processing of events with distinct keys can happen in parallel, which is the enabling factor for scaling out.

Figure 1. Basic stream processing architecture showing three stages with multiple processing nodes each. Events flow from left to right.

Our setup is basic in that we only have three stages: a source, a processing, and a sink stage. Potentially, a stream processing pipeline can involve more than three stages. A source emits events, which are processed by the processing stage, which in turn generates output events that are sent to the sink. The key factor in scaling this system is the processing stage. Hence, in our experiments only processing nodes run inside virtual machines.

In our example, events represent credit card transaction records. Each record contains a credit card number, transaction location, and date. The processing nodes run a fraud detection algorithm, which determines whether two transactions might represent a credit card skimming attack. If an attack is suspected, an event is sent to the sink stage. We use the terms event and request synonymously in the remainder of the text.
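Since the key-to-node mapping is central to the staged design, a minimal sketch of such a routing function follows. It is an illustration only: StreamMine's actual partitioning scheme is not described here, and the node names are made up. The sketch merely demonstrates the property the text relies on, namely that events with the same key (e.g., the same credit card number) always reach the same processing node, while distinct keys can be handled in parallel.

```python
import hashlib
from typing import List

def route(event_key: str, stage_nodes: List[str]) -> str:
    """Deterministically map an event key to one node of the next stage."""
    digest = hashlib.sha1(event_key.encode("utf-8")).digest()
    return stage_nodes[int.from_bytes(digest[:4], "big") % len(stage_nodes)]

# Both transactions of a suspected skimming attack carry the same card
# number and therefore end up on the same (hypothetical) processing node.
nodes = ["proc-0", "proc-1", "proc-2", "proc-3"]
assert route("4929-1234-5678-9012", nodes) == route("4929-1234-5678-9012", nodes)
```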


B. Hardware/software setup

Each physical machine has two four-core Intel Xeon E5405 CPUs and 8 GB of RAM. Machines are connected via Gigabit Ethernet. Hard disks are attached via SATA2. The operating system is Debian Linux 6.0 with kernel 2.6.32. The clock on each machine is synchronized using a central NTP server. Time differences due to clock drift are in the single-digit millisecond range and pose no threat to the validity of our timing-related measurements.

As our virtualization technology, we used Xen version 4.0 as provided by the Debian repositories. All virtual machines were configured with 512 MB of RAM and 8 virtual CPUs (VCPUs). A 1 GB hard disk is sufficient, because stream processing avoids writing data to disk; as such, the disk only holds application binaries and temporary data. For maximum scalability, each VM has the same number of VCPUs as the host system has CPUs.
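For concreteness, a Xen 4.0 xm domain configuration matching this sizing might look roughly like the following (xm configuration files use Python syntax). Kernel paths, image locations, and the domain name are illustrative assumptions and not taken from our setup; only the memory, vcpus, and disk sizing reflect the values given above.

```python
# Hypothetical /etc/xen/streammine-proc-01.cfg -- illustrative values only.
kernel  = "/boot/vmlinuz-2.6.32-5-xen-amd64"
ramdisk = "/boot/initrd.img-2.6.32-5-xen-amd64"
name    = "streammine-proc-01"
memory  = 512                                    # MB of RAM per VM
vcpus   = 8                                      # one VCPU per host core
vif     = ["bridge=eth0"]                        # bridged Gigabit Ethernet
disk    = ["file:/srv/xen/proc-01.img,xvda1,w"]  # 1 GB image for binaries and temporary data
root    = "/dev/xvda1 ro"
```

A domain defined this way can then be moved between hosts with Xen's live migration command, e.g., xm migrate --live streammine-proc-01 <destination-host>.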



IV. EVALUATION

In order to evaluate the effectiveness of using live migration to dynamically scale applications, we conducted a series of experiments. This section reports on the results.

A. Virtualization overhead

Switching from native execution to a virtualized environment does incur overhead. The concrete impact on application performance varies, depending on application characteristics, i.e., whether the application is compute-, memory-, or I/O-intensive. We performed a series of experiments in order to determine the maximum stable throughput of our system. Stable in our case means that throughput does not vary by more than 10 percent over the course of an experiment. Knowing the processing limits of the system is important: migration must happen well before the system, in its current deployment, hits its processing limit. This ensures a smooth transition between two deployment setups, with minimal perceived impact on application performance.

Table I summarizes our findings. Contrary to our initial assumption, virtualization does not have any perceptible effect on the maximum stable throughput. For a single physical machine, the stable throughput does not change when switching from native execution to a virtualized environment. Neither does running multiple VMs on a single node show any detrimental effect on throughput. We attribute this to the fact that the system operates well below its theoretically possible throughput. Because the workload is network bound, a maximum of 1.9 million events per second could be received and processed per node over a single Gigabit Ethernet link. At 1.1 million events per second, the network link operates at about 60% of its maximum capacity. This is a utilization level where small request rate fluctuations bear little risk of affecting throughput and latency noticeably [3]. Virtualization does affect latency though, as we will show in Section IV-C.
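As a quick sanity check (our own arithmetic, not reported in the paper), the stated utilization and the implied average wire size per event follow directly from the link capacity:

\[
\frac{1.1\times10^{6}\ \text{events/s}}{1.9\times10^{6}\ \text{events/s}} \approx 0.58 \approx 60\%,
\qquad
\frac{10^{9}\ \text{bit/s}}{1.9\times10^{6}\ \text{events/s}} \approx 526\ \text{bit} \approx 66\ \text{bytes per event}.
\]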

Table I. MAXIMUM STABLE THROUGHPUT (IN MILLION EVENTS PER SECOND) FOR VARYING NUMBERS OF PHYSICAL AND VIRTUAL MACHINES. N/A IMPLIES AN IMPOSSIBLE CONFIGURATION, E.G., ONE VM SPANNING TWO PMS.

                       Physical machines
Virtual machines      1      2      4      8
0 (native)          1.1    2.0    4.0    8.0
1                   1.1    N/A    N/A    N/A
2                   1.1    2.0    N/A    N/A
4                   1.1    2.0    4.0    N/A
8                   1.1    2.0    4.0    8.0

B. Performance impact

In order to use live migration to dynamically scale applications, the most important insight is how the migration process affects the application's operation. Besides the question of whether migration works at all (e.g., TCP connections may break), the impact on throughput and latency is among the top concerns.

Migration time depends on two key factors: the available bandwidth for state transfer and the rate at which the applications running inside the VM update volatile memory. Migration times for idle VMs are only limited by the available transfer bandwidth. In our case, the Gigabit Ethernet link transfers 512 MB of state in less than six seconds. This is a lower bound on migration time. Starting from this lower bound, Figure 3 shows the impact of varying loads on migration time. We express load as a fraction of the maximum stable load (cf. Table I). Up to a load factor of 0.4, migration time increases with the load factor. However, migration at a load factor of 0.8 only takes 14 s, compared with 17 s for a 0.4 load factor. Because the page dirtying rate increases with the load factor, the hypervisor stops the VM earlier during the migration process. Consequently, the state transfer completes more quickly.

The relationship between load and migration time is important as it affects the decision when to migrate a VM. Postponing VM migration as long as possible results in higher hardware utilization and greater energy savings. On the other hand, migrating a VM under high load takes longer to complete (up to a certain point). Also, because migration affects the services running inside the VM, a prolonged migration process translates into longer intervals with degraded service quality.

Figure 2 shows the impact on throughput and latency when migrating a single VM. We trigger the migration 20 s into the experiment. Latency increases and throughput drops slightly. Xen stops the VM eight seconds into the migration process to transfer the remaining state to the destination host. Execution resumes at t = 36 s. Throughput and latency immediately return to pre-migration levels. In total, the migration took 16 s.

Figure 2. Impact of live migration on throughput and latency.

Figure 3. Migration time as a function of VM load.
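The sub-six-second figure for idle VMs quoted above is consistent with a back-of-the-envelope bound (our arithmetic, assuming the full Gigabit link is available for state transfer and ignoring pre-copy rounds and protocol overhead):

\[
t_{\min} \approx \frac{\text{VM memory}}{\text{link bandwidth}}
        = \frac{512\ \text{MB}}{\approx 125\ \text{MB/s}} \approx 4\ \text{s}.
\]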

C. Dynamic scaling

In order to evaluate the general usefulness of our approach, we evaluated dynamic scaling using the following setup: initially, we start eight virtual machines on a single physical machine. We determined earlier the maximum load such an allocation can handle. Before the current setup reaches peak load, we trigger a migration. We migrate a single VM each time the system's current setup is nearing its maximum load. The migration points are static in our setup, as we have full control over the request frequency. In a real-world scenario, an automated mechanism would trigger migrations, e.g., by monitoring the rate of incoming requests or other relevant parameters [4], [5]. At full scale-out, the PM-to-VM ratio is 1:1.
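As an illustration of the kind of automated trigger mentioned above, the following hedged sketch polls a request-rate gauge and scales out by one machine whenever the per-host load approaches the stable limit from Table I. The 80% headroom and the function names (read_request_rate, trigger_migration) are hypothetical assumptions; our experiments instead triggered migrations at fixed points in time.

```python
MAX_STABLE_LOAD = 1_100_000   # events/s one PM handles stably (cf. Table I)
HEADROOM = 0.8                # migrate well before the processing limit

def check_scale_out(current_rate: float, hosts_in_use: int, total_hosts: int,
                    trigger_migration) -> int:
    """Trigger one live migration when per-host load nears its limit."""
    per_host_rate = current_rate / hosts_in_use
    if per_host_rate > HEADROOM * MAX_STABLE_LOAD and hosts_in_use < total_hosts:
        trigger_migration()   # move a single VM to a freshly booted PM
        return hosts_in_use + 1
    return hosts_in_use

# Hypothetical polling loop:
#   while True:
#       hosts = check_scale_out(read_request_rate(), hosts, 8, trigger_migration)
#       time.sleep(1)
```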

Figure 4 shows the latency for the experiment over time. Latency measures the time an event takes to travel from the source to the sink node. To reduce measurement overhead, we measure latency only once every 10k events. The first migration is triggered 30 s into the experiment, and every 60 s one additional VM is migrated. This is evident in the seven approximately 10 s gaps in the latency graph. One noteworthy effect is the stepwise decrease in the upper-bound latency from around 1 s (initially) to around 250 ms (after seven migrations). Virtualization overhead does not impact throughput (cf. Table I) as much as it affects latency.

Figure 4. Impact of multiple consecutive migrations on application end-to-end latency. Less is better.

Figure 5 shows the throughput of our system over the course of the experiment. The throughput increases every minute and simulates the gradual increase in load we expect to see for our target applications. We see seven drops in throughput at the times when migrations take place. A drawback of our simple migration strategy becomes apparent here: the scaling factor is far from perfect. Using twice the physical resources, e.g., going from one to two physical machines, does not enable the system to process twice as many requests. Load is unevenly distributed between the physical machines. After one migration, in our example, one PM hosts 7 VMs and thus receives 7/8ths of the requests. This is why throughput only increases from initially 900 × 10^3 requests per second to 1.1 million after one migration.

Note that the drop of throughput to zero during live migration is a property of our sample application and not inherent to live migration or our proposed approach. Throughput should drop only by the number of events processed by the virtual machine being migrated. Source nodes send requests synchronously. Thus the sources block once they try to send a request to a migrating virtual machine. This is an important observation for system designers: not only does the migrating VM stop, but every other participant who interacts synchronously with it stops too. After seven migrations, the system reaches maximum scale-out. It is then able to process the previously determined maximum of 8 million events per second.

Figure 5. Impact of multiple consecutive migrations on application throughput. Higher is better. NB: this is one measurement, split into two parts to better show scalability.

The main motivation for this work was to use virtualization in order to save energy. Figure 6 shows the energy consumption for the scenario just discussed. The figure compares energy consumption for a dynamic setup (using migrations) as well as a static setup involving a maximum of 8 physical machines. The scenario with static resource provisioning has a base energy consumption of around 1 kW; that is, 8 machines each drawing 130 W when idle. Compare this to the initial 170 W the dynamic setup requires. When handling minimal load, the dynamic setup requires less than 1/5th of the energy of the static setup. We accounted for a 30 s boot time per machine, i.e., we added the machine's power consumption to the total starting 30 s prior to the migration. If machines must transition between "off" and "on" more quickly, suspend-to-RAM is an alternative to a complete shutdown. In that case, the baseline power draw would be slightly higher in the dynamic case, depending on how much power each machine draws on standby. Assuming 1/20th of the idle power consumption during standby [6], a minimal dynamic deployment still only requires 22% of the energy of a static deployment.

Due to virtualization overhead, the dynamic setup actually consumes more energy at full scale-out (i.e., when each PM hosts a single VM) than the static setup. As we expect the system to run at full scale-out only for short periods, this is acceptable. The increased power consumption during periods of full scale-out is offset by considerable savings during off-peak periods.

Figure 6. Comparison of power consumption for static and dynamic hardware provisioning. Less is better.
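The reported savings can be reproduced approximately from the numbers given above (our own arithmetic; the paper's exact accounting, e.g., the draw of the one loaded machine, may differ slightly):

\[
P_{\text{static}} \approx 8 \times 130\ \text{W} = 1040\ \text{W},
\qquad
\frac{170\ \text{W}}{1040\ \text{W}} \approx 16\% < \tfrac{1}{5},
\]
\[
P_{\text{dynamic, standby}} \approx 170\ \text{W} + 7 \times \frac{130\ \text{W}}{20} \approx 216\ \text{W},
\qquad
\frac{216\ \text{W}}{1040\ \text{W}} \approx 21\%,
\]

which is in line with the reported figures of less than one fifth and roughly 22%, respectively.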

V. DISCUSSION

This section discusses some of the issues we ran into while conducting the experiments.

A. Perceived VM downtime

Although the literature about virtual machine (live) migration reports downtimes of less than a second, this is not true in general. Initially, we found it impossible to achieve sub-second downtimes without interfering with the migration process at all. Observed downtimes (i.e., periods with no processed events) ranged between 20 and 50 seconds. We suspected that the routers connecting the cluster machines were taking this long to update their ARP caches. In order to remedy this, we started a process on each virtual machine that continuously sends gratuitous ARP replies. Although this reduces the pause to around 10 seconds, it is still an order of magnitude away from the sub-second mark.

Rather, the downtime is inherent to the migration process. The page dirtying rate of our stream processing application is too high to perform a "real" live migration. In order to complete the migration, the virtual machine has to be stopped to copy the remaining dirty pages. This process takes around 10 seconds in our case, resulting in corresponding downtimes. Figure 7 contains evidence of this behavior. Outgoing traffic at the migration source jumps to 100 MB/s once the migration process is started. For a while, the system's throughput only marginally decreases. At the 30 s mark, the virtual machine is stopped, as indicated by the drop in throughput to zero. Outgoing network traffic stays at 100 MB/s for eight more seconds. Once the outgoing network traffic drops back to zero, the application continues normally and throughput increases again to pre-migration levels.

This is an important observation that system designers have to be aware of when deciding to apply VM migration: real live migration may not actually be possible.

Figure 7. Correlation between throughput and migration-induced network traffic.
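The gratuitous-ARP workaround described above can be approximated with a small helper of the following shape. This is a hedged sketch rather than the exact process we ran: it assumes iputils' arping is installed inside the guest and that eth0 and the guest IP shown are the correct interface and address.

```python
import subprocess
import time

def announce_address(ip: str, interface: str = "eth0", interval: float = 1.0) -> None:
    """Continuously send gratuitous ARP replies so switches and neighbours
    learn the VM's new location quickly after a live migration.

    Uses iputils' arping: '-U' sends unsolicited (gratuitous) ARP replies,
    '-c 1' sends one packet per round. Must run as root inside the guest.
    """
    while True:
        subprocess.run(["arping", "-U", "-c", "1", "-I", interface, ip],
                       check=False)
        time.sleep(interval)

# announce_address("192.0.2.17")   # hypothetical guest IP
```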



VI. RELATED WORK

The most well-known virtualization technologies today are Xen [7], kvm [8], and VMware ESXi [9]. Differences in architecture and design principles are their major differentiating factors. Xen provides a hypervisor running directly on the bare hardware. A modified operating system, called Dom0, runs on top of the hypervisor. Besides the hypervisor and the Dom0 operating system, several guest operating systems can be installed alongside. kvm, on the other hand, does not require a separate hypervisor running below the operating system. Instead, the Linux kernel was enhanced with capabilities for hosting guest operating systems alongside the first operating system. Using kvm, virtual machines appear to the system as processes like any other program. VMware ESXi takes an approach similar to Xen, where a small hypervisor runs directly on the hardware. The ESXi hypervisor is responsible for multiplexing system resources and hosting guest operating systems.

Besides architectural questions, another important direction of research centered around virtualization concerns itself with reducing the overhead of virtualization. [1] has shown that close to 100% of bare-metal performance is achievable. Besides reducing the computational overhead of virtualization, improving I/O performance is another important research direction [10]. Distributed applications must communicate frequently over the network and write data persistently to disk. Coordinating efficient access to shared disks and network interface cards is thus of utmost importance.

Virtualization technology is only the enabler of modern data center management. It does not solve the question of how best to allocate available resources to virtual machines. Depending on the objectives, different allocation strategies must be used. In [4] the authors developed a scheme to detect hot spots by monitoring CPU, memory, and network utilization. Moving or swapping VMs between physical machines alleviates identified hot spots. [5] builds on [4] by incorporating migration overhead into the decision of when and which virtual machines to migrate. Hot spot avoidance is further complicated by virtual machine interaction. In [11] the authors propose a system which takes VM interaction and performance isolation into account when computing a virtual-to-physical machine mapping. However, each of these works understates the significant downtime "live" migration can incur. As we have demonstrated, downtime, and hence request latency, is well above the one-second mark, even in our comparatively simple deployment scenario. In order for live migration to be useful, it is important to know exactly which applications are running inside the virtual environment. While high-level approaches consider CPU, network, disk, and RAM as their optimization targets, we are more interested in how live migration affects individual applications. Being able to mitigate hot spots is important, but how individual applications are affected by migration needs more careful study.

The possibility to migrate live OS instances was first demonstrated in [12], [13]. [12] reports client-observable delays of less than one second for different applications. Besides migrating OSes within the same subnet, live migration has also been successfully demonstrated for wide area networks (WANs) [14]–[16]. While migration across WANs opens further possibilities to shift computations, we do not consider it here. Our focus is solely on migrations within the same data center for the purpose of minimizing the number of powered-up physical machines.

Another direction of research associated with the management of virtualized environments is cloud infrastructure management software. Prominent examples that sprang out of research efforts are Eucalyptus [17] and OpenNebula [18]. The latter already has support for intelligent scheduling of virtual machines: allocation policies can be specified for optimizing energy consumption, service level conformance, or other user-defined objectives. [19] describes an approach for quantifying energy consumption on a per-VM basis. Traditionally, power metering is only supported for physical machines. Extending existing power metering concepts to virtual machines is not easy, because utilization-based energy consumption profiles for individual components (e.g., disk and RAM) are hard to come by.


VII. CONCLUSION

Virtual machine technology has paved the way for consolidating data center infrastructure. Especially the live migration feature, where virtual machines can be re-located between physical hosts with minimal downtime, is appealing for dynamic, load-directed reshuffling of compute resources. We have shown that live migration, despite best efforts, is not instantaneous and delays processing significantly (ten seconds in our scenario). It is important to know the virtual machine's page dirtying rate in order to estimate the expected downtime when migrating. We plan to profile popular application stacks with respect to their amenability to seamless live migration.

Despite the downtime problem, we achieved our goal of demonstrating how virtualization enables non-elastic applications to scale dynamically. Under minimal load, we were able to achieve energy savings of at least 80% compared to static resource provisioning. Scaling seamlessly from minimal to peak load posed no problems.

ACKNOWLEDGMENT

This research was funded as part of the SRT-15 project supported by the European Commission under the Seventh Framework Program (FP7) with grant agreement number 257843.

REFERENCES

[1] U. Steinberg and B. Kauer, "Nova: a microhypervisor-based secure virtualization architecture," in Proceedings of the 5th European Conference on Computer Systems, ser. EuroSys '10. New York, NY, USA: ACM, 2010, pp. 209–222. [Online]. Available: http://doi.acm.org/10.1145/1755913.1755935

[2] A. Martin, T. Knauth, S. Creutz, D. Becker, S. Weigert, A. Brito, and C. Fetzer, "Low-overhead fault tolerance for high-throughput data processing systems," in Proceedings of the 31st International Conference on Distributed Computing Systems, ser. ICDCS '11. IEEE, 2011.

[3] C. Millsap, "Thinking Clearly About Performance," Queue, vol. 8, no. 9, pp. 10–20, 2010.

[4] T. Wood, P. Shenoy, A. Venkataramani, and M. Yousif, "Black-box and gray-box strategies for virtual machine migration," in Proceedings of the 4th Symposium on Networked Systems Design & Implementation, 2007.

[5] F. Hermenier, X. Lorca, J. Menaud, G. Muller, and J. Lawall, "Entropy: a consolidation manager for clusters," in Proceedings of the 2009 ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments. ACM, 2009, pp. 41–50.

[6] Y. Agarwal, S. Savage, and R. Gupta, "SleepServer: A Software-Only Approach for Reducing the Energy Consumption of PCs within Enterprise Environments," in Proceedings of the 2010 USENIX Annual Technical Conference. USENIX Association, 2010.

[7] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, "Xen and the art of virtualization," in Proceedings of the 19th ACM Symposium on Operating Systems Principles. ACM, 2003, pp. 164–177.

[8] A. Kivity, Y. Kamay, D. Laor, U. Lublin, and A. Liguori, "kvm: the Linux virtual machine monitor," in Proceedings of the Linux Symposium, vol. 1, 2007, pp. 225–230.

[9] "VMware ESXi." [Online]. Available: http://www.vmware.com/products/vi/esx/

[10] M. Kesavan, A. Gavrilovska, and K. Schwan, "Differential virtual time (DVT): rethinking I/O service differentiation for virtual machines," in Proceedings of the 1st ACM Symposium on Cloud Computing. ACM, 2010, pp. 27–38.

[11] G. Somani and S. Chaudhary, "Application Performance Isolation in Virtualization," in Cloud Computing, 2009. CLOUD '09. IEEE International Conference on. IEEE, 2009, pp. 41–48.

[12] C. Clark, K. Fraser, S. Hand, J. G. Hansen, E. Jul, C. Limpach, I. Pratt, and A. Warfield, "Live migration of virtual machines," in Proceedings of the 2nd Symposium on Networked Systems Design & Implementation - Volume 2, ser. NSDI '05. Berkeley, CA, USA: USENIX Association, 2005, pp. 273–286. [Online]. Available: http://portal.acm.org/citation.cfm?id=1251203.1251223

[13] M. Nelson, B. Lim, and G. Hutchins, "Fast transparent migration for virtual machines," in Proceedings of the USENIX Annual Technical Conference, 2005, pp. 391–394.

[14] P. Ruth, J. Rhee, D. Xu, R. Kennell, and S. Goasguen, "Autonomic live adaptation of virtual computational environments in a multi-domain infrastructure," in Autonomic Computing, 2006. ICAC '06. IEEE International Conference on. IEEE, 2006, pp. 5–14.

[15] F. Travostino, P. Daspit, L. Gommans, C. Jog, C. De Laat, J. Mambretti, I. Monga, B. Van Oudenaarde, and S. Raghunath, "Seamless live migration of virtual machines over the MAN/WAN," Future Generation Computer Systems, vol. 22, no. 8, pp. 901–907, 2006.

[16] R. Bradford, E. Kotsovinos, A. Feldmann, and H. Schiöberg, "Live wide-area migration of virtual machines including local persistent state," in Proceedings of the 3rd International Conference on Virtual Execution Environments. ACM, 2007, pp. 169–179.

[17] D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff, and D. Zagorodnov, "The Eucalyptus open-source cloud-computing system," in Proceedings of the 9th IEEE/ACM International Symposium on Cluster Computing and the Grid. IEEE Computer Society, 2009, pp. 124–131.

[18] B. Sotomayor, R. Montero, I. Llorente, I. Foster et al., "Virtual infrastructure management in private and hybrid clouds," IEEE Internet Computing, vol. 13, no. 5, pp. 14–22, 2009.

[19] A. Kansal, F. Zhao, J. Liu, N. Kothari, and A. Bhattacharya, "Virtual machine power metering and provisioning," in Proceedings of the 1st ACM Symposium on Cloud Computing. ACM, 2010, pp. 39–50.