INTERNATIONAL JOURNAL OF NETWORK MANAGEMENT Int. J. Network Mgmt (2016) Published online in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/nem.1948
PAVM: a framework for policy-aware virtual machine management
Muhammad Nouman Durrani,1 Feroz Zahid2,3 and Jawwad A. Shamsi1
1 Systems Research Laboratory, National University of Computer and Emerging Sciences, Karachi, Pakistan
2 Simula Research Laboratory, Fornebu, Norway
3 University of Oslo, Norway
SUMMARY The problem of efficient placement of virtual machines (VMs) in cloud computing infrastructure is well studied in the literature. VM placement decision involves selecting a physical machine in the data center to host a specific VM. This decision could play a pivotal role in yielding high efficiency for both the cloud and its users. Also, reallocation of VMs could be performed through migrations to achieve goals like higher server consolidation or power saving. VM placement and reallocation decisions may consider affinities such as memory sharing, CPU processing, disk sharing, and network bandwidth requirements between VMs defined in multiple dimensions. Considering the NP-hard complexity associated with computing an optimal solution for this VM placement decision problem, existing research employs heuristic-based techniques to compute an efficient solution. However, most of these approaches are restricted to only a single attribute at a time. That is, a given technique of using heuristics to compute VM placement considers only a single attribute, while completely ignoring the impact of other dimensions of placing VMs. While this approach may improve the efficiency with respect to the affinity attribute in consideration, it may yield degraded performance with respect to other affinities. In addition, the criteria for determining VM-placement efficiency may vary for different applications. Hence, the overall goal of achieving VM placement efficiency becomes difficult and challenging. We are motivated by this challenging problem of efficient VM placement and propose policy-aware virtual machine management (PAVM), a generic framework that can be used for efficient VM management in a cloud computing platform based on the service provider-defined policies to achieve the desired system-wide goals. This involves efficient means to profile different VM affinities and to use profiled information effectively by intelligent and efficient VM migrations at run time considering multiple attributes at a time. By conducting extensive evaluation through simulation and real experiments that involve VM affinities on the basis of network and memory, we confirmed that the PAVM architecture is capable of improving the efficiency of a cloud system. We elaborate the architecture of a PAVM system, describe its implementation, and present details of our experiments. Copyright © 2016 John Wiley & Sons, Ltd. Received 3 April 2015; Revised 9 June 2016; Accepted 10 August 2016
1. INTRODUCTION

Virtual machines (VMs) are an integral part of data centers. In an infrastructural cloud setup, they offer abstraction of a computing system and provide resource provisioning, availability, isolation, and fault tolerance [1]. The decision of VM creation or VM placement involves selecting a physical machine in the data center to host VMs. This decision could play a crucial role in yielding high efficiency for both the cloud and its users. For instance, consider a scenario in which two VMs are communicating with each other via the network. Placing these two VMs on two different racks in a data center could lead to longer latency. In comparison, latency could be reduced if the two VMs are either created on a single rack or placed on the same physical machine. Consequently, improved user experience and effective utilization of network bandwidth could also be observed. In principle, many VMs may communicate with each other, and placing all of them with minimal network distance may seem desirable.
Correspondence to: Nouman M. Durrani, Department of Computer Science, FAST National University of Computer and Emerging Sciences, Karachi, Pakistan. E-mail:
[email protected]
The domain of VM placement is not specific to network affinity. Many other attributes also exist for which efficient placement of VMs could lead to enhanced performance. These include memory sharing [2,3], disk sharing [4], CPU usage [5], server consolidation, and power utilization [6]. Cloud service providers employ overprovisioning [7] in order to maximize profits; efficient utilization of resources is therefore critical in achieving this goal.

The problem of determining an efficient location for a VM, such that the VM provides improved efficiency, is challenging for multiple reasons. As discussed, multiple dimensions exist for which the efficiency of VM management could be computed and improved. Therefore, computing VM placement efficiency with respect to one attribute does not necessarily ensure enhanced performance in other dimensions. Even for a specific attribute, the mapping between any two VMs may be non-injective and non-surjective [8]. In addition, the massive size of data centers further aggravates the problem, as computing affinities becomes severely challenging with an increasing number of VMs. The aforementioned considerations make finding an optimal physical machine for a specific VM an NP-hard problem. Considering the NP-hard complexity, existing research employs heuristic-based techniques to compute an efficient solution. However, most of these approaches are restricted to only a single attribute at a time [3,9,10]. That is, a given technique of using heuristics to compute VM placement considers only a single attribute, while completely ignoring the impact of other attributes on the placement of VMs.

In this paper, we are motivated by the massive impact that VM placement could have on the efficiency and scalability of data centers hosting big data applications. The focus of this paper is to propose an efficient framework for computing VM affinity and determining appropriate placement decisions. For big data systems, network communication overhead can be largely reduced if VMs are either colocated or located within a small proximity. Our work is novel and has multiple contributions. Unlike existing techniques [3,9], which are restricted to a single attribute, we propose a generic framework that computes efficient VM placement with respect to multiple attributes at a time. To address the problem of supporting multiple attributes at a time, we propose policy-aware virtual machine management (PAVM). In PAVM, VM placement decisions are computed through a system-wide policy, which is defined in accordance with the needs and requirements of the cloud service provider. We demonstrate the effectiveness of our framework by considering network and memory affinities among VMs. Simulation and real-world experiments confirm that by utilizing the two affinities together, the performance of a data center can be greatly enhanced. Our work is valuable in enhancing the user experience, increasing system performance, and reducing system cost for cloud providers and their users. The main contributions of this paper are as follows: (i) A generalized mechanism that can be used for efficient VM management in a cloud computing environment based on customizable resource utilization parameters reflecting service provider-defined policies to achieve the desired system-wide goals. (ii) A heuristic algorithm for the approximation of the combined resource optimization problem.
(iii) A comprehensive evaluation of the proposed algorithms through simulation and real-world experiments.

The remainder of this paper is structured as follows: Section 2 discusses the background and related work, and Section 3 describes the problem formally. Section 4 explains our approach, presents the system architecture of the implementation, and describes the proposed PAVM algorithm. Section 5 details the experimental setup used to evaluate PAVM, and Section 6 presents the evaluation. Section 7 concludes the paper and mentions open research directions.
2. BACKGROUND AND RELATED WORK

The purpose of this section is to provide details about the complexity of VM management in data centers. The location of a VM can be defined along two dimensions: the host physical machine and its location in the network topology. In this paper, we limit our scope to the first, that is, the physical machine on which a VM is located.
There are two types of system-wide policies that determine the location of a VM in a data center over a period of time: allocation and migration policies. A VM allocation policy governs the initial placement of VMs in a data center; it determines how host physical machines are chosen at initial placement and how migration is managed to increase efficiency and maximize fault tolerance [10]. A VM migration policy is used to optimize current VM allocations. For example, a simple server consolidation migration policy can migrate VMs away from lightly loaded physical hosts so that those hosts can be shut down to save power and other resources. Considering the diversity of this research problem, affinity-aware VM management can be classified into the following main categories:
(i) Policy-aware VM migration: This scheme follows policy requirements and reduces network-wide communication cost in data center networks. The communication cost is defined with respect to policies associated with each VM at different checkpoints (load balancing, firewalls, and intrusion prevention systems). Using this policy, each migration can reduce communication cost by 39.06% [11]. Effective placement of VMs can also yield efficiency in MapReduce job scheduling when jobs are scheduled jointly with server assignment, as discussed in [12]. Mishra and Sahoo [13] provided the grounds for a theoretical model. The model considers five attributes: shape, remaining capacity, remaining capacity of individual dimensions, combined utilization, and utilization of individual dimensions. Further, the authors proposed a vector arithmetic-based approach for choosing the target physical machine for a VM while considering the aforementioned attributes.
(ii) Network communication-aware VM management: This deals with profiling the network activities of VMs and calculating affinities between VMs depending on their network communication. Network communication between VMs can often lead to high bandwidth consumption when the VMs are located far apart in the data center. In this regard, Diakhate et al. [14] implemented the design of a virtual message-passing device that can be added as a simple Message Passing Interface to the guest operating system in order to conserve bandwidth for inter-VM communication on the same host. This virtual device acts as an interface between the guest operating system and the hypervisor. It uses either a shared memory buffer message-passing method or privileged access to the hardware memory in order to pass messages between co-resident VMs. These choices are made depending on the message size and channel parameters. A better and transparent approach for applications and guest operating systems has been proposed as XenLoop [15]. It inspects outgoing network packets at the hypervisor network layer and uses high-speed inter-VM shared memory to transfer packets between co-resident VMs. The techniques discussed earlier exploit co-residency of VMs for faster and more efficient communication between them. However, they do not implement profiling of VMs resident on different hosts for communication affinity. Starling [9] proposed a scheme that minimizes communication overhead in a cloud computing platform by using decentralized network communication-aware VM migrations.
The basic approach taken for the implementation is to monitor the communication patterns between pairs of VMs and dynamically place communicating VMs close to each other in the network hierarchy to save network resources and bandwidth. Starling uses a black-box approach for monitoring communication patterns between VMs. In their Xen-based implementation, fingerprints of VM communication are obtained by filtering streams from the tcpdump utility running in Dom0. Traffic information is presented in the form of tuples <sourceID, destID, volume>. The communication fingerprint of a VM is defined as a vector of its traffic tuples to all other VMs. Data obtained from monitoring are used by a bartering agent on each physical machine to infer the potential benefit of relocating a VM to another physical machine. The authors mapped the optimization problem of minimizing communication overhead to an instance of the graph partitioning problem, which is known to be NP-complete. To minimize overheads, a distributed bartering algorithm has been proposed in which the bartering agent on each physical machine negotiates with the other bartering agents for VM migrations by considering network topology information. There are concerns regarding the scalability of the proposed solution, as the evaluation is performed on a very small cloud setup.
This is the most advanced work on communication-aware VM management at the time of writing because, in addition to implementing a profiling mechanism, it also proposes a distributed algorithm for bartering of VM migrations.
(iii) Storage-aware VM management: This deals with profiling storage affinities, that is, storage content similarities between VMs, to exploit sharing potential for the efficient utilization of storage resources. Exploiting storage sharing potential reduces data redundancy and saves storage resources through deduplication. A storage affinity between a pair of VMs could be disk based or memory based. Disk content on a VM is often very large, and applying data deduplication techniques to such large contents is computationally expensive. However, using disk to save other resources such as bandwidth is interesting and more beneficial. To the best of our understanding, little work has been carried out on disk-based affinity profiling, as the large volumes of the disks often require huge computational resources to infer such affinities. However, it has been covered in the literature that software packages can be shared among VMs hosted on a single physical machine. For example, a package management tool for VMs named Stork [4] can securely download packages to a physical machine and share these packages among VMs. Stork also manages package updates and provides security mechanisms in which users can delegate package permissions to others. Stork is an example of a live disk-based sharing system, as it has been in use on PlanetLab [16] for more than 4 years. In the literature, much work on content-based sharing among VMs has been carried out for memory. The foundation of memory-based, storage-aware VM management was laid by Difference Engine [2]. The authors proposed a combination of three memory conservation techniques for co-resident VMs: page sharing, page patching, and compression. Page sharing deals with sharing complete memory pages when identical pages exist among co-resident VMs. Page patching goes further by patching similar, but not completely identical, memory pages into smaller subpages, which may be shared. Compression applies compression methods to less frequently accessed pages to save even more memory. Later, Memory Buddies [3] extended this line of work to VMs holding identical pages but residing on different physical hosts. Memory Buddies includes the implementation of a memory sharing-aware placement and migration system for VMs. It implements a memory fingerprinting system based on Difference Engine to infer page sharing potential between VMs at both initial placement and workload change. Nocentino and Ruth [17] discussed a slightly different approach to a dependency-aware live VM migration scheme for memory. They proposed that, if dependency information about VMs is available at the time of live migration, better consolidation and more efficient migration can be achieved. Their Xen implementation uses a memory tainting mechanism that was originally developed for an intrusion detection system. Kim et al. [18] proposed a novel memory de-duplication technique that provides isolation support for groups on a physical machine. Groups were created on the basis of the VM owner in the cloud computing platform.
Using the isolation mechanism for memory de-duplication, quality of service and security issues related to memory sharing between VMs have been addressed. The authors extended Linux KSM to include group support. This research points out issues related to security in a cloud computing platform arising from the creation of VMs of different groups on one physical machine when memory de-duplication or any other resource sharing is used among VMs [19].
(iv) CPU-aware VM management: Sudevalayam and Kulkarni [5] model CPU usage for different workloads to emphasize the importance of network-aware VM provisioning in a data center environment. They developed a correlation between resource usage and CPU usage and, based on information obtained from the execution of various micro-benchmarks, argue that the prediction of CPU resource consumption is possible. Recently, research has been conducted to bridge the semantic gap in CPU management between the hypervisor and guest operating systems [11].
(v) Power-aware VM management: Because of the increasing energy demand of large data centers, power-saving VM management has become an important area of VM management research. A particularly interesting example is GreenCloud [20]. It proposes a novel cloud data center architecture to save power, including methods for VM placement optimization and subsequent migrations to reduce power consumption. Beloglazov and Buyya [6] proposed dynamic adaptive heuristics to improve energy efficiency through consolidation in a distributed environment while maintaining strong adherence to service-level agreement requirements. Valancius et al. [21] proposed a new distributed computing platform termed nano data centers, which uses ISP-controlled services to save energy for a greener computing infrastructure.
(vi) Hybrid approaches for VM management: Several hybrid approaches have also been discussed in the literature. However, almost all of the work on hybrid techniques to optimize multiple resource utilization in a virtualized environment is restricted to outlining the problem. Virtual Putty [22] argues that, in a cloud computing platform, operational costs can be significantly lowered by accounting for and exploiting affinities and by avoiding conflicts between co-placed VMs. The most important factor that determines the provisioning requirement of a VM is its physical footprint, that is, the amount of resources it consumes, such as CPU, memory, disk, and bandwidth. The physical footprint of a VM is not location independent; it depends on the physical machine on which the VM is placed. Moreover, this physical footprint can be reshaped, by accounting for affinities and conflicts between hosts and other VMs, to lower the operational costs in a data center. Resource sharing potential can be exploited by placing VMs with affinities, such as many similar memory pages or heavy intercommunication, on the same physical host. However, because the affinities between VMs are multidimensional, the task is challenging, as optimizing for one dimension could affect another. The authors of Virtual Putty [22] worked only on network-aware VM placement in their prototype implementation and demonstrated how to use virtual footprints of VMs to determine affinities between them. They also listed several principles that can be taken as guidelines for minimizing individual VM footprints in network-aware VM provisioning. These principles include reducing data dependencies, placing VMs close to their data, and exploiting statistical multiplexing techniques. Choi et al. [23] conducted experiments on finding optimum threshold values for scarce resources in a cloud computing platform. The authors proposed a machine learning-based self-adaptive mechanism for resource provisioning, implemented as a distributed algorithm in which each VM is an autonomous, self-learning agent. Campegiani and Lo Presti [24] modeled resource allocation in a multi-tier distributed environment, mapped the problem to the classical 0/1 multiple knapsack problem, and hence provided a near-optimal solution. However, they confined themselves to resource allocation with a predefined profit for each resource parameter; affinities or conflicts between VMs are not considered, and such an approach can only optimize one dimension at a time, such as memory or bandwidth, using standard optimization techniques.
However, optimizing the exploitation of resource sharing potentials among VMs in the distributed environment of a data center is an NP-hard problem, which is discussed in the next section.

3. PROBLEM FORMULATION

Practically, finding the optimum VM placement in a cloud data center considering VM affinities in multiple dimensions is NP-hard. In order to formulate this optimization problem, we first define optimum VM placement as an arrangement of VMs on the physical hosts such that optimum resource utilization is achieved for multiple evaluation parameters. For instance, we have considered optimizing against network bandwidth consumption, power rating, and total memory usage for a set of hosts in the data center. We denote each physical host P_i as a set of hosted VMs:

P_i = \{VM_{i1}, VM_{i2}, VM_{i3}, \ldots, VM_{in}\}   (1)
The utilization parameter U over a period of time t is given by

U(t) = U = \{u_{cpu}, u_{memory}, u_{bandwidth}, u_{power}, \ldots, u_{paramN}\}   (2)

The resource utilization of each host P_i (also called the host physical footprint) can be expressed as in equation (3):

U_{P_i} = \{u_{P_i\,memory}, u_{P_i\,bandwidth}, u_{P_i\,power}, \ldots, u_{P_i\,paramN}\}   (3)

Similarly, for each VM, VM_i, we define an n-tuple of the resource utilization parameters, termed the virtual footprint of the VM:

U_{VM_i} = \{u_{VM_i\,memory}, u_{VM_i\,bandwidth}, u_{VM_i\,power}, \ldots, u_{VM_i\,paramN}\}   (4)

A subtle point to note here is that the physical footprint U_{P_i} may not equal the sum of the virtual footprints of all VMs U_{VM_j} hosted on P_i. This inequality exists because of unstructured VM allocations (conflicts):

U_{P_i} \neq \sum_{j=1}^{n} U_{VM_j}   (5)
For our problem, we optimize the total physical footprint of a data center and formulate the optimization problem without considering VM conflicts, initial placement, and other overheads incurred because of racks, routing topology, and the network. In this case, the data center utilization U_{Datacenter} can be expressed as

U_{Datacenter} = \sum_{i=1}^{n} U_{P_i} + Overheads   (6)

For m knapsacks as physical machines with capacities C_j, any VM allocation (P_i \leftarrow VM_i) that results in greater utilization than UThreshold_{P_i} is not considered:

\sum_{i=1}^{n} U_{VM_i} + overheads \leq UThreshold_{P_i} < C_{P_i}   (7)
The dynamic gain G_i associated with each VM_i migration can be further expressed in equations (8) and (9):

G_i = (U_{P_{from}} - U'_{P_{from}}) - (U_{P_{to}} - U'_{P_{to}})   (8)

where U_{P_{from}}, U'_{P_{from}}, U_{P_{to}}, and U'_{P_{to}} are the n-tuple utilizations of the source and destination physical machines before and after the move. U_{P_{from}} - U'_{P_{from}} can also be defined in terms of an objective function denoting the sum of the gains in the individual resource utilization parameters:

ObjectiveFunction = U_{P_{from}} - U'_{P_{from}} = W_1(\delta u_{P_{from}\,memory}) + W_2(\delta u_{P_{from}\,bandwidth}) + \ldots + W_n(\delta u_{P_{from}\,paramN}) \pm Overheads   (9)

Here, migration and other parameter costs are also considered in the form of overheads. It has been proved that multiple knapsack problems are strongly NP-hard [11,25,26], and computational complexity increases exponentially with the size of such problems [27]. Techniques for finding an optimal solution for thousands of VMs mounted over a large number of physical machines are thus practically infeasible.
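To make the gain computation of equations (8) and (9) concrete, the following Python sketch evaluates the weighted gain of a candidate migration. It is an illustrative sketch only: the utilization dictionaries, weight values, and overhead estimate are assumed example inputs, not values or code from the PAVM implementation.

```python
# Sketch of the weighted migration gain of equations (8)-(9).
# All numbers, weights, and the overhead value below are illustrative
# assumptions; utilizations are modeled as dicts of per-parameter values.

WEIGHTS = {"memory": 1.0, "bandwidth": 2.0, "power": 0.5}  # W1 ... Wn

def weighted_delta(before, after, weights=WEIGHTS):
    """Weighted sum of per-parameter utilization changes (before - after)."""
    return sum(w * (before[p] - after[p]) for p, w in weights.items())

def migration_gain(u_from, u_from_new, u_to, u_to_new, overheads=0.0):
    """G_i = (U_Pfrom - U'_Pfrom) - (U_Pto - U'_Pto), less migration overheads."""
    return (weighted_delta(u_from, u_from_new)
            - weighted_delta(u_to, u_to_new)
            - overheads)

# Example: the move frees resources on the source host and adds a smaller
# (sharing-reduced) load on the destination host.
u_from, u_from_new = ({"memory": 0.80, "bandwidth": 0.60, "power": 0.70},
                      {"memory": 0.55, "bandwidth": 0.35, "power": 0.60})
u_to, u_to_new = ({"memory": 0.30, "bandwidth": 0.20, "power": 0.40},
                  {"memory": 0.40, "bandwidth": 0.25, "power": 0.45})
print(migration_gain(u_from, u_from_new, u_to, u_to_new, overheads=0.05))
```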
Various approximation schemes consider the profit of each item [28–30]. In our case, however, the affinities that determine the profit are defined among items: the profit of each VM depends entirely on the location of the VM. In the next section, we describe our approach and the system architecture of an implementation of PAVM, the proposed policy-aware management framework, to find an approximate solution to the problem.
4. POLICY-AWARE VIRTUAL MACHINE MANAGEMENT

In this section, we discuss our proposed framework, PAVM.
4.1. Our approach

The arrangement of VMs in a data center is dynamic. In this regard, identifying the sharing potential of individual resources could increase the profit of data centers. In this section, we present the PAVM framework, which can be used to approximate the problem described in Section 3. A PAVM system consists of a PAVM profiler, a PAVM controller, and the hosted VMs. These components are discussed in detail as follows:
(i) PAVM profiler: A PAVM profiler runs on each physical machine P_i and periodically sends profiling information to the PAVM controller running on a separate machine. In our Xen-based implementation, shown in Figure 1, the PAVM profiler runs on Dom0 and gathers resource utilization information from each VM. Currently, the system has been implemented with two resource utilization parameters in consideration, that is, memory pages and network communication.
(a) Memory page profiling: The PAVM profiler running on Dom0 uses the VM introspection library LibVMI [31,32] to read memory pages from all VMs at the hypervisor level. It generates, stores, and transfers hashes of each referenced memory page to the PAVM controller. Experimentally, it has been found that VMs have a large number of similar memory pages in heterogeneous desktop-type workloads. In an experimental setup of 50 dumps (each of Linux (Ubuntu 10.04) and Windows XP with 512 MB memory), it has been found that about 40% of the memory pages could be shared among VMs. Percentages of shareable memory pages are shown in Figure 2.
Figure 1. Policy-aware virtual machine management (PAVM) system architecture
Figure 2. Percentage of shareable memory pages among the profiled VM memory dumps
Figure 3. Hash computation speed test for 4K memory pages
We also found (Figure 3) that the murmur family of hashing algorithms is the fastest for hashing memory page contents.
(b) Network communication profiling: Network communication between VMs can lead to high bandwidth consumption when the VMs are far apart in the data center. As shown in Figure 4, we have profiled the communication between VMs using LibPCAP [33] on Dom0. Only traffic within the IP range of the data center has been considered for profiling.
(ii) PAVM controller: A PAVM controller runs on a separate machine and receives profiling information from the individual PAVM profilers hosted on Dom0 of all the PAVM hosts. Eventually, migrations are triggered based on the heuristic PAVM algorithm, discussed in the next section.
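To illustrate the kind of memory-affinity computation the profiler enables, the following is a minimal Python sketch that hashes 4 KB pages and estimates the fraction of shared pages between two VMs. Hashing in-memory byte strings (rather than pages read through LibVMI on Dom0), the use of SHA-1 instead of a murmur hash, and the min-based affinity definition are simplifying assumptions, not the profiler's actual implementation.

```python
import hashlib

PAGE_SIZE = 4096  # 4 KB pages, as profiled in the paper

def page_hashes(memory_image):
    """Return the set of page-content hashes for one VM memory image (bytes)."""
    return {hashlib.sha1(memory_image[i:i + PAGE_SIZE]).digest()
            for i in range(0, len(memory_image), PAGE_SIZE)}

def memory_affinity(hashes_a, hashes_b):
    """Fraction of the smaller VM's pages that also occur in the other VM."""
    if not hashes_a or not hashes_b:
        return 0.0
    return len(hashes_a & hashes_b) / min(len(hashes_a), len(hashes_b))

# Synthetic stand-ins for two VM memory dumps: 6 pages each, 3 of them shared.
shared = [bytes([n]) * PAGE_SIZE for n in (1, 2, 3)]
vm1 = b"".join(shared + [bytes([n]) * PAGE_SIZE for n in (10, 11, 12)])
vm2 = b"".join(shared + [bytes([n]) * PAGE_SIZE for n in (20, 21, 22)])
print("memory affinity:", memory_affinity(page_hashes(vm1), page_hashes(vm2)))  # 0.5
```

In the real profiler, such hash sets would be built from pages read through LibVMI and shipped to the PAVM controller, which performs the pairwise comparison.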
4.2. PAVM algorithm

The PAVM algorithm optimizes the VM distribution in a data center by exploiting potentially profitable migrations, in correspondence with the objective function derived in equation (9). We have identified four important phases to enable a combined network and memory-aware migration algorithm; as more policy parameters are added, the algorithm can be extended accordingly.
Figure 4. Network profiling on Xen Dom0 using LibPCAP. PAVM, policy-aware virtual machine management
The purpose of the phase-based migration algorithm is to optimize the VM allocation heuristically, without exploring all possible combinations of VM allocations, which is an NP-complete problem as formulated in Section 3. Note that the PAVM algorithm works regardless of the VM allocation policy implemented in the data center. However, understandably, the algorithm recommends a higher number of profitable migrations when the allocation policy in effect is not optimized, for example, a random VM allocation policy. The objective function of the PAVM algorithm, called the PAVM migration policy, is used to specify various policy parameters that affect the PAVM algorithm results. As seen in equation (9), the simplest objective function can be obtained by assigning weights W_1, W_2, and W_3 to each resource in consideration. Some important parameters in the PAVM migration policy, along with their functions, are listed in Table 1. Figure 5 illustrates the phases of a sample implementation of the PAVM algorithm. After each phase, the PAVM controller calculates the list of potential migrations; the migrations are then actually carried out using the remote libvirt management library.
(i) Phase I—server consolidation:
Rationale: The purpose of the server consolidation phase is to reduce the total number of physical machines being used to host VMs in a data center.
Description: As VMs are provisioned and de-provisioned continuously in a dynamic cloud data center, VM fragmentation occurs over time, where some of the physical machines have only a few VM instances running. Those VMs can be migrated to other available slots on more heavily loaded servers. The simplest heuristic in the first phase is to start with hosts with the fewest VMs and/or the most available resources and to find profitable migrations of their VMs to other available hosts. Mathematically, let V be the set of VMs and H be the set of physical hosts in the data center; then the server consolidation phase selects the set of VMs V_{consolidation} satisfying the condition stated in equation (10):

profit_V < profit_{threshold}, \quad \forall V \in V_{consolidation}   (10)
Table 1. Tunable parameters
Profit per memory unit: Assigns a numerical value to each memory unit saved by the algorithm.
Profit per bandwidth unit: Assigns a numerical value to each bandwidth unit saved by the algorithm.
Overselling: The percentage of the shared resources on each host that will be allocated to new virtual machines.
Maximum migrations: Maximum migrations allowed for each of the phases.
Threshold profit: Profit after which a virtual machine will not be considered for migration.
Swap profit: Minimum profit for which a swap can be considered (a swap equals three migrations).
Network affinity threshold: Minimum network affinity for which a migration or swap will be considered in the external affinity consideration phase.
Maximum free hosts: Maximum number of free hosts to consider in the greedy profit gain phase.
Maximum virtual machine pairs: Maximum virtual machine pairs to be considered for swap in the random selection phase.
Figure 5. Phases of the policy-aware virtual machine management algorithm
In equation (10), profit_{threshold} is defined as the value of profit above which a VM will not be considered for migration, as expressed in equation (7). Descriptively, the algorithm works as follows:
1. Identify hosts with the following characteristics (weighted function based on the service provider policy): a. minimum number of VMs and b. maximum available resource content. A weighted function for the selection of hosts (host_i \in H, where host_i contains the set of VMs host_{iV}), which scores potential hosts in ascending order, is defined in equation (11):
Score(host_i) = (|host_{iV}| + 1) / (Memory_{Available\%} + Bandwidth_{Available\%})   (11)
2. From the selected hosts, select the VMs with the lowest profits based on the policy objective function. Do not consider VMs whose profit exceeds or equals the threshold profit defined in the policy.
3. Use a first-fit bin packing algorithm for packing the selected VMs onto other hosts (free hosts are not considered; only hosts with VMs already running on them are selected). The time complexity of packing the selected VMs onto other physical machines is O(n log n).
4. To avoid looping back, VMs from the destination hosts of a potential migration are not considered.
5. Perform migrations subject to the maximum allowed migrations in the consolidation phase as defined in the provider policy.
(ii) Phase II—external affinity consideration:
Rationale: Phase II of our algorithm determines the affinity of a VM with respect to external attributes. These may include various factors such as network and memory.
Description: After the server consolidation in phase I, in phase II the affinity of a VM with other VMs is computed with respect to its resource utilization parameters. The idea is to colocate VMs with related affinity in order to improve performance; VMs colocated on the same machine use shared memory for their communication. For example, each unordered pair of VMs \{v_i, v_j\}, \forall v \in V, v_i \neq v_j, is assigned a memory affinity based on the percentage of similar memory pages between them. This saves network bandwidth in the data center and improves memory utilization. The VM pairs are sorted in decreasing order of affinity so that the pairs with the highest affinities are selected first. Mathematically, let V be the set of VMs and H be the set of physical hosts in a data center; then the function AFFINITY_{network}(v_i, v_j) is defined for each unordered pair \{v_i, v_j\}, \forall v \in V, v_i \neq v_j. It assigns a numerical value x \geq 0 indicating the proportion of network traffic exchanged between v_i and v_j in time t. The external affinity consideration phase selects the set of VM pairs V_{external\,affinity} satisfying the conditions stated in equations (12), (13), and (14):

AFFINITY_{network}(v_i, v_j) > THRESHOLD_{affinity}, \quad \forall v \in V   (12)

profit_v < profit_{threshold}, \quad \forall v \in V_{external\,affinity}   (13)

profit_{initial(host_i)} + profit_{initial(host_j)} + profit_{networkthreshold} < profit_{final(host_i)} + profit_{final(host_j)}, \quad \forall v_i, v_j \in V_{external\,affinity}   (14)
In equation (12), THRESHOLD_{affinity} is the minimum network affinity value for consideration in the network affinity phase. Pairs of VMs having the maximum external network affinity are identified. A VM is considered for migration only if its current profit profit_v is below the threshold profit_{threshold}, as expressed in equation (13). Profit_{networkthreshold} in equation (14) is the minimum profit gain for which a swap is recorded, as defined in the policy, where the initial and final profits indicate the profit before and after the swap. Descriptively:
1. Identify the pair of VMs that have the maximum external network affinity. VM pairs on the same host are not considered.
2. To colocate these VMs, find a VM having the least current profit as defined by the policy objective function.
3. If the profit exceeds the threshold profit, the swap is not considered.
4. If the network affinity is less than the network affinity threshold defined for the external affinity consideration phase, do not consider the swap.
5. Calculate the potential profit gain of this swap. A memory page comparison is carried out for this move to calculate the potential profit gain.
6. If there is a gain, record the potential swap operation; otherwise, drop it.
7. To avoid looping back, once a host is selected in a swap, do not consider it again for this run.
Table 2. A virtual machine (VM) swap recorded as three migrations
Migration 1: VM_i migrates from Host_i to Host_in-migration
Migration 2: VM_j migrates from Host_j to Host_i
Migration 3: VM_i migrates from Host_in-migration to Host_j
8. Using the allocation policy, find a host that can host one of the swappable VMs; use this host as the in-migration host for swapping.
9. Record each swap as three consecutive migrations. Table 2 lists the three migrations recorded for one swap.
10. Perform swapping subject to the maximum migration constraint in the external affinity phase as defined in the provider migration policy.
11. If a swap fails, reset the VMs to their original hosts.
(iii) Phase III—greedy profit gain:
Rationale: The purpose of this phase is to determine unused target physical machines to host low-performing VMs. This consolidation allows improved profit gain.
Description: In this phase, VMs with the lowest profits are identified and, greedily, a profit gain is achieved by placing them on unused hosts in the data center. How many unused hosts can be used for this greedy profit gain is defined in the provider policy. This phase works as follows:
1. Identify VMs that have the minimum current profit, not exceeding the threshold profit defined in the migration policy.
2. Use a first-fit bin packing algorithm to pack these VMs onto new hosts.
3. Calculate the profit gain as defined by the objective function.
4. If no gain can be achieved, drop the consideration.
5. If a gain greater than the threshold gain defined in the policy can be achieved with a migration, record it in the potential migration list.
6. Perform migrations.
In addition, let x be the number of migrations required to gain a certain profit in an experiment and y be the number of full VM hash comparisons required for this profit gain; a cost function cost(x, y) can then be defined, for simplicity, as an additive function of x and y:

cost(x, y) = x + y   (15)
This simple additive function works well in determining the relationship between cost and profit because the cost of migrations as well as the cost of hash comparisons is linear.
(iv) Phase IV—random selection swaps:
Rationale: The rationale for this phase is to determine target physical machines to host low-performing VMs. However, unlike phase III, where unused target machines are utilized, in this phase pairs of VMs on different hosts are chosen randomly. This allows us to compute additional profit gains.
Description: In this phase, VM pairs on different hosts are chosen according to a uniformly distributed random variable x = U(0, |V|) that indexes the set of VMs V in the data center. This phase works as follows:
1. Randomly choose n pairs of swapping VMs, as defined in the policy.
2. Calculate the profit gains after calculating the memory and network affinities for each pair.
3. If profitable, record the swap operation; otherwise, drop the consideration.
4. Perform swaps.
5. If a swap fails, reset the VMs to their original hosts and record it as a penalty for future random selection swaps.
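Putting the four phases together, the controller's round can be sketched as follows. This is only a structural sketch under simplifying assumptions: hosts are plain dictionaries, the affinity, greedy, and random phases are left as stubs, and migrations are merely printed rather than issued through libvirt.

```python
def score_host(host):
    """Equation (11): hosts with few VMs and plenty of free capacity score low."""
    return (len(host["vms"]) + 1) / (host["free_mem_pct"] + host["free_bw_pct"])

def phase_consolidation(hosts, policy):
    """Phase I stub: pick the lowest-scoring hosts as candidates to drain."""
    drained = sorted(hosts, key=score_host)[:policy["max_free_hosts"]]
    return [("consolidate", h["name"]) for h in drained if h["vms"]]

def phase_external_affinity(hosts, policy):
    return []  # would plan swaps for high network-affinity VM pairs (3 migrations each)

def phase_greedy_gain(hosts, policy):
    return []  # would move the lowest-profit VMs onto unused hosts

def phase_random_swaps(hosts, policy):
    return []  # would sample random VM pairs on different hosts

def run_pavm_round(hosts, policy, migrate=print):
    plan = []
    for phase in (phase_consolidation, phase_external_affinity,
                  phase_greedy_gain, phase_random_swaps):
        # Each phase is capped by the policy's per-phase migration limit.
        plan += phase(hosts, policy)[:policy["max_migrations"]]
    for action in plan:
        migrate(action)  # in the real system, issued via the remote libvirt library

hosts = [{"name": "h1", "vms": ["vm1"], "free_mem_pct": 70, "free_bw_pct": 80},
         {"name": "h2", "vms": ["vm2", "vm3", "vm4"], "free_mem_pct": 20, "free_bw_pct": 30}]
run_pavm_round(hosts, {"max_free_hosts": 1, "max_migrations": 5})
```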
More specifically, an optimum solution is expressed in terms of a matrix in which each row represents one VM and each of the m columns represents a host; an entry x_{ij} = 1 indicates that VM i is placed on host j. For each generated combination, a profit is calculated if the resource constraints of every physical machine are satisfied:

profit = \begin{bmatrix} x_{11} & \cdots & x_{1m} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nm} \end{bmatrix}   (16)

For example, equation (17) indicates that VM1 and VM2 are on Host 2, VM3 is on Host 1, VM4 is on Host 4, and Host 3 is free:

profit = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}   (17)

In the next section, we describe the complete experimental setup for the evaluation of the PAVM algorithm.

5. EXPERIMENTAL SETUP

We have evaluated the PAVM algorithm in both real-world and simulation setups. In order to facilitate large-scale simulations, a Java-based simulator was designed.
5.1. Large-scale simulations

The simulations presented in this work are based on a platform inspired by the CloudSim toolkit [34–36]. CloudSim is a cloud simulation framework that offers better modeling of virtualized environments compared with other available options such as SimGrid [37]. The simulated data center is composed of four types of physical machines with different CPU and RAM configurations, listed in Table 3. The VM instance types supported are based on the offerings from Amazon EC2 [38], as shown in Table 4. We further assume that there is no firewall policy restricting communication between VMs.
Table 3. VM instance types
Machine type | CPU (MHz) | Cores | RAM (MB) | Bandwidth (Gbps)
Type 1 | 180 | 2 | 1024 | 0.5
Type 2 | 1860 | 2 | 2048 | 1.0
Type 3 | 2660 | 2 | 4096 | 1.0
Type 4 | 2660 | 2 | 8192 | 1.0
Table 4. Memory trace repository statistics
Operating system | Workload type | Number of trace files | Trace hashes size (GB)
Linux | Web/email server | 999 | 4.27
Linux | Desktop | 50 | 515
Mac OS | Desktop | 581 | 1.31
Windows XP | Desktop | 50 | 524
5.2. Random workload generation

The allocation policy determines the physical hosts on which VMs are created. The following outlines the process and techniques used for random workload data generation in a PAVM simulation.
(i) Allocation of physical machines: The type of each physical machine in the data center is chosen according to a uniformly distributed random variable x = U(0, |T|) that indexes the set T of physical machine types.
(ii) Allocation of VMs: As with the physical machines, the instance type of each VM is chosen according to a uniformly distributed random variable x = U(0, |T|) that indexes the set T of VM types. In real-world data centers, workloads for VMs are allocated dynamically and destroyed when they are no longer needed. This cycle can be termed the VM life cycle. In the simulation, this life cycle has been accommodated as an allocation percentage. The allocation percentage determines the probability that a VM survives until the completion of one simulation cycle, after which the PAVM algorithm runs. It works on the selection of a uniformly distributed random variable x = U(0, 100) using a function f(x), which in turn determines whether a VM will survive (f(x) = 1) or not (f(x) = 0), as shown in equation (18):
f(x) = \begin{cases} 1, & x < \text{allocation percentage} \\ 0, & x \geq \text{allocation percentage} \end{cases}   (18)
(iii) Assignment of VM affinities: In our simulation-based experiments, we have confined the scope of our work to two types of affinities, that is, memory and network affinities. Other affinities, such as storage, power, and processing affinities, can easily be incorporated into the generic PAVM algorithm. The assignment of these affinities is discussed as follows:
i. Assignment of memory affinities: Each unordered pair of VMs \{v_i, v_j\}, \forall v \in V, v_i \neq v_j, where V is the set of VMs, is assigned a memory affinity based on the percentage of similar memory pages between them. This similarity is based on a repository of memory traces in the form of hashes. A major part of the memory trace repository is taken from the University of Massachusetts Amherst Trace Repository [39], contributed by Wood et al. [3], and the remaining traces have been gathered from the desktop-load VMs of our Xen-based PAVM implementation. Statistics of the memory trace repository used in the simulation are shown in Table 4.
ii. Assignment of network affinities: Network affinities between VMs are assigned based on experimental results. In these experiments, a subset of traces from 50 random servers was gathered to summarize the level of network affinity present in a diverse set of machines. We examined the network traces over 5 h for 50 randomly chosen Linux and Windows servers used for web hosting and email services. Figure 6 shows the number of servers that communicated with no (zero), one, two, three, or four other servers in the collection of 50 servers. Domain name system communication was neglected, as the chosen servers were part of a single domain name system cluster.
Figure 6. Fifty-four percent of the servers communicated with at least one other server
Results showed that 54% of the server machines in the collection communicated with at least one other server, while 22% of the machines communicated with at least two other servers. Experiments presented in the literature have also shown that a substantial benefit can be gained by placing communicating VMs on a single host [9]. In order to add a randomization element to the generated network affinities between VMs, communication patterns are selected according to a uniformly distributed random variable x = U(0, 100) using the following function f(x), which determines the number of communications among VMs:

f(x) = \begin{cases} 0, & 0 \leq x < 46 \\ 1, & 46 \leq x < 78 \\ 2, & 78 \leq x < 92 \\ 3, & 92 \leq x < 98 \\ n, & 98 \leq x \leq 100 \end{cases}   (19)

where equation (19) shows a simple communication pattern process over x. The percentage of network bandwidth used between two communicating VMs is assigned probabilistically using the results presented by Starling [9]. In the next section, we continue with the PAVM experimental setup and evaluate its performance.

6. RESULTS AND EVALUATION

This section outlines the results obtained from large-scale simulations and real-world experiments based on the experimental setup described in Section 5.

6.1. Simulation results

Using the workload data described in the previous section, various experiments were performed to evaluate the proposed algorithm. This section explains the PAVM simulation results; all experiments are performed with 500 hosts and 1625 VMs (in the VM life cycle) if not mentioned otherwise.
1. Comparison with single-affinity consideration algorithms: Fifty iterations of a 500-host simulation setup with a first-fit bin packing allocation policy are performed with the same data for memory-aware only, network-aware only, and PAVM on each iteration. The results in Figure 7 show that considering both memory and network affinities gives better profit gains for every iteration, and the cumulative data center-wide profit increases with each iteration of the PAVM algorithm. Furthermore, Figure 8 shows that PAVM eventually saves both more memory and more bandwidth over time.
Figure 7. Considering both memory and network affinities gives better profit gains. PAVM, policy-aware virtual machine management
Figure 8. Policy-aware virtual machine management (PAVM) eventually saves both more memory and more bandwidth
Figure 9. Policy-aware virtual machine management (PAVM) profit gain in comparison with optimal profit gain
Figure 10. In the optimal solution, the profit gain is linear and the cost is exponential
Figure 11. The number of migrations triggered is proportional to the profit gain. PAVM, policy-aware virtual machine management

Table 5. Profit from PAVM
VM | Profit gain (PAVM) | Profit gain (Optimal) | Migrations (PAVM) | Migrations (Optimal) | VM hash comparisons (PAVM) | VM hash comparisons (Optimal)
3 | 0.0 | 12.76 | 0 | 3 | 2 | 6
5 | 23.13 | 23.13 | 2 | 4 | 6 | 120
6 | 39.15 | 46.55 | 4 | 10 | 7 | 720
7 | 55.01 | 156.12 | 5 | 14 | 7 | 5040
9 | 70.21 | 220.11 | 8 | 25 | 9 | 362,880
PAVM, policy-aware virtual machine management; VM, virtual machine.
2. Comparison with the optimal solution: In a series of experiments shown in Figures 9–11, PAVM has been compared with the optimum solution using the cost function cost(x, y) = x + y, and we found that the cost of the optimal solution is excessive relative to its gain. As shown in Table 5, the PAVM algorithm achieves good gains at a low cost compared with the optimal solution.
3. Algorithm performance: In order to evaluate the performance of the algorithm, several experiments have been performed against various parameters. For example, in repeated experiments shown in Figure 11, it has been found that the number of migrations triggered by the PAVM algorithm is proportional to the profit gain, because the policy objective function rewards saved resources according to their assigned weights. In the experiment shown in Figure 12, the algorithm saved more memory and more bandwidth when high weights were assigned to the memory and network affinity parameters, respectively; a high network affinity weight saves bandwidth at the expense of memory-aware migrations. Although the algorithm works independently of the allocation policy implemented in the data center, the profit gains can differ for different types of allocation policies. For example, as shown in Figure 13, the server consolidation phase yields better profits if the allocation policy is not randomized. After evaluating the algorithm, it has been found that most of the profit is gained in phases II and III, as seen in Figure 14. However, the results presented in Figure 15 show no correlation between hash comparisons and successful migrations, which means that a substantial cost could be incurred without profit in worst-case scenarios. An adaptive heuristic with history profiling could create a direct relationship between these two metrics.

6.2. Comparison with optimal solution

An optimal solution is calculated (as described in Section 4.2) and compared with the PAVM results. Although the optimal solution yields higher gains, its cost is far greater; PAVM achieves comparable gains, as shown in
Figures 10, 11, and 13 and Table 5, at a much lower cost than the optimal solution.

6.3. Real-world experiment setup and results

Real-world experiments have been performed by creating a Xen-based private cloud platform on four different physical machines.
Figure 12. Quantity of resource saved depends on the weight of the resource assigned in the objective function (n × mem + m × net)
Figure 13. Effect of allocation policy on number of hosts freed in consolidation phase
Figure 14. Policy-aware virtual machine management, most of the profit is gained in phases II and III
Figure 15. No correlation found between hash comparisons and successful migrations, an adaptive heuristic with history profiling can create direct relationship
Figure 16. Time to traverse memory for hash calculation on Xen Dom0, one virtual machine, at 3.0 GHz
VMs with different operating system combinations (Windows XP, Ubuntu 10.04, and CentOS 5) have been created in the cloud. The processing power, number of cores, memory, number of traces, trace hash sizes, workload type, and bandwidth of each VM have been listed in Tables 3 and 4. The results of the real-world experiments are discussed below. When VMs were created on Xen, as shown in Figure 16, the memory traversal time was 11.8 ms/MB; the time to traverse memory for hash calculation increases with the VM's memory size. In our experiments, the PAVM algorithm placed VMs with similar memory pages on the same host because of the network affinity and profit profiles shown in Figure 17.
Figure 17. Initial and final setup when three types of virtual machines (VMs) are running on three hosts
7. CONCLUSIONS

The VM placement decision involves selecting a physical machine in the data center to host a specific VM. This decision can play a pivotal role in yielding high efficiency for both the cloud and its users. Also, reallocation of VMs can be performed through migrations to achieve goals like higher server consolidation or power saving. VM placement and reallocation decisions may consider affinities between VMs defined in multiple dimensions, such as memory and network affinity. Considering the NP-hard complexity associated with computing an optimal solution for this VM placement decision problem, existing research employs heuristic-based techniques to compute an efficient solution. In this paper, we have shown that obtaining the optimal solution of the affinity exploitation problem in more than one dimension is a combinatorial problem. Moreover, given the usual size of modern data centers with hundreds of thousands of VMs in place, it is not computationally feasible to calculate the optimal solution. Hence, an efficient heuristic algorithm like PAVM is required. We have shown that PAVM is an effective framework for VM management in a cloud computing platform. Unlike existing work, it considers more than one resource attribute for inferring affinities between VMs and hence gives better results than considering only one dimension. We have particularly shown that better system-wide profits can be gained by considering both memory and network affinities between VMs. PAVM can be useful for both cloud service providers and users. By using an adaptive objective function for the migration policy, a cloud service provider can adjust the algorithm based on the scarcity or abundance of a resource in the data center. Moreover, real-world cloud computing platforms [40] like Eucalyptus [41] or OpenStack [42] could also achieve more efficient affinity-based VM migrations by adopting the PAVM algorithm.
REFERENCES
1. Armbrust M, et al. Above the clouds: a Berkeley view of cloud computing. Technical Report UCB/EECS-2009-28, University of California, Berkeley, USA, February 2009.
2. Gupta D, Lee S, Vrable M, Savage S, Snoeren AC, Varghese G, Voelker GM, Vahdat A. Difference engine: harnessing memory redundancy in virtual machines. Communications of the ACM 2010; 53(10): 85–93.
3. Wood T, Tarasuk-Levin G, Shenoy P, Desnoyers P, Cecchet E, Corner MD. Memory buddies: exploiting page sharing for smart colocation in virtualized data centers. SIGOPS Operating Systems Review 2009; 43(3): 27–36.
4. Cappos J, Baker S, Plichta J, Nguyen D, Hardies J, Borgard M, Johnston J, Hartman JH, et al. Stork: package management for distributed VM environments. In Proceedings of the 21st Conference on Large Installation System Administration (LISA'07), Anderson P (ed.). USENIX Association: Berkeley, CA, USA, 2007.
5. Sudevalayam S, Kulkarni P. Affinity-aware modeling of CPU usage for provisioning virtualized applications. In Proceedings of the 2011 IEEE 4th International Conference on Cloud Computing (CLOUD '11): IEEE Computer Society, Washington, DC, USA, 2011; 139–146.
6. Beloglazov A, Buyya R. Optimal online deterministic algorithms and adaptive heuristics for energy and performance efficient dynamic consolidation of virtual machines in cloud data centers. Concurrency and Computation: Practice and Experience. Wiley Press: New York, USA, 2011.
7. Buyya R, Garg SK, Calheiros RN. SLA-oriented resource provisioning for cloud computing: challenges, architecture, and solutions. In International Conference on Cloud and Service Computing 2011, Hong Kong, 2011; 1–10.
8. Lin C, et al. Policy-aware virtual machine management in data center networks. In 2015 IEEE 35th International Conference on Distributed Computing Systems (ICDCS). IEEE, Columbus, OH, 2015; 730–731.
9. Sonnek J, Greensky J, Reutiman R, Chandra A. Starling: minimizing communication overhead in virtualized computing platforms using decentralized affinity-aware migration. In Proceedings of the 2010 39th International Conference on Parallel Processing (ICPP '10), IEEE Computer Society, Washington, DC, USA, 2010; 228–237.
10. Wood T, Shenoy P, Venkataramani A, Yousif M. Black-box and gray-box strategies for virtual machine migration. In Proceedings of the 4th USENIX Conference on Networked Systems Design & Implementation (NSDI'07), USENIX Association, Berkeley, CA, USA, 2007; 17–17.
11. Fukunaga AS, Korf RE. Bin completion algorithms for multicontainer packing, knapsack, and covering problems. Journal of Artificial Intelligence Research 2007: 393–429.
12. Ling X, Yuan Y, Wang D, Liu J, Yang J. Joint scheduling of MapReduce jobs with servers: performance bounds and experiments. Journal of Parallel and Distributed Computing 2016; 90–91: 52–66.
13. Mishra M, Sahoo A. On theory of VM placement: anomalies in existing methodologies and their mitigation using a novel vector based approach. In IEEE 4th International Conference on Cloud Computing, 2011; 275–282.
14. Diakhate F, Perache M, Namyst R, Jourdren H. Efficient shared memory message passing for inter-VM communications. In Euro-Par 2008 Workshops - Parallel Processing. Springer, Berlin Heidelberg, 2009; 53–62.
15. Wang J, Wright K-L, Gopalan K. XenLoop: a transparent high performance inter-VM network loopback. In Proceedings of the 17th International Symposium on High Performance Distributed Computing (HPDC '08): ACM, New York, NY, USA, 2008; 109–118.
16. Chun B, et al. PlanetLab: an overlay testbed for broad-coverage services. SIGCOMM Computer Communication Review 2003; 33(3): 3–12.
17. Nocentino A, Ruth PM. Toward dependency-aware live virtual machine migration. In Proceedings of the 3rd International Workshop on Virtualization Technologies in Distributed Computing (VTDC '09): ACM, New York, NY, USA, 2009; 59–66.
18. Kim S, Kim H, Lee J. Group-based memory deduplication for virtualized clouds, 2011.
19. Hu F, Qiu M, Li J, Grant T, Tylor D, Mccaleb S, Butler L, Hamner R. A review on cloud computing: design challenges in architecture and security. Journal of Computing and Information Technology 2011; 19(1): 25–55.
20. Liu L, Wang H, Liu X, Jin X, He WB, Wang QB, Chen Y. GreenCloud: a new architecture for green data center.
In Proceedings of the 6th International Conference Industry Session on Autonomic Computing and Communications Industry Session (ICAC-INDST ’09): ACM, New York, NY, USA, 2009; 29–38. DOI=10.1145/1555312.1555319. 21. VytautasValancius, NikolaosLaoutaris, Massouli L, ChristopheDiot, Rodriguez P. Greening the Internet with Nano Data Centers. In Proceedings of the 5th International Conference on Emerging Networking Experiments and Technologies (CoNEXT ’09): ACM, New York, NY, USA, 2009; 37–48. 22. Sonnek J, Chandra A. Virtual putty. In Proceedings of the 2009 Conference on Hot Topics in Cloud Computing (HotCloud’09): USENIX Association, Berkeley, CA, USA, 2009. 23. Choi HW, Kwak H, Sohn A, Chung K. Autonomous learning for efficient resource utilization of dynamic VM migration. In Proceedings of the 22nd Annual International Conference on Supercomputing (ICS ’08): ACM, New York, NY, USA, 2008; 185–194. 24. Campegiani P, Lo Presti F. A general model for virtual machines resources allocation in multi-tier distributed systems. In Proceedings of the 2009 Fifth International Conference on Autonomic and Autonomous Systems (ICAS ’09): IEEE Computer Society, Washington, DC, USA, 2009; 162–167. DOI=10.1109/ICAS.2009. 25. Kulik A, Shachnai H. There is no EPTAS for two-dimensional knapsack. Information Processing Letters 2010; 110(16): 707–710. 26. Hiremath CS. New heuristic and metaheuristic approaches applied to the multiple-choice multidimensional knapsack problem. Dissertation Abstracts International 2008; 69(01): 252. suppl. B. 27. Han B, Leblet J, Simon G. Hard multidimensional multiple choice knapsack problems, an empirical study. Computers & Operations Research 2010; 37(1): 172–181. 28. Fleszar K, Hindi KS. Fast, effective heuristics for the 0–1 multi-dimensional knapsack problem. Computers & Operations Research 2009; 36(5): 1602–1607. 29. Ji J, Huang Z, Liu C, Liu X, Zhong N. An ant colony optimization algorithm for solving the multidimensional knapsack problems. In Proceedings of the 2007 IEEE/WIC/ACM International Conference on Intelligent Agent Technology (IAT ’07): IEEE Computer Society, Washington, DC, USA, 2007; 10–16. 30. The Virtualization API. Available from: http://libvirt.org/ [Accessed on 28 August 2016]. 31. Virtual Machine Introspection Tools. Available from: http://libvmi.com/ [Accessed on 28 August 2016]. 32. Weisstein EW. From MathWorld—‘Surjection http:// mathworld.wolfram.com/ Surjection.html Injection’. From MathWorld Available from: http://mathworld.wolfram.com/Injection.html [Accessed on 28 August 2016]. 33. TCPDump. LibPCap Public Repository. Available from : http://www.tcpdump.org/ [Accessed on 28 August 2016].
34. Calheiros RN, Ranjan R, Beloglazov A, De Rose CAF, Buyya R. CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms. Software: Practice and Experience (SPE) 2011; 41(1): 23–50. Wiley Press: New York, USA.
35. Wickremasinghe B, Calheiros RN, Buyya R. CloudAnalyst: a CloudSim-based visual modeller for analysing cloud computing environments and applications. In Proceedings of the 24th International Conference on Advanced Information Networking and Applications (AINA 2010): Perth, Australia, 2010; 446–452.
36. Buyya R, Ranjan R, Calheiros RN. Modeling and simulation of scalable cloud computing environments and the CloudSim toolkit: challenges and opportunities. In Proceedings of the 7th High Performance Computing and Simulation Conference (HPCS 2009): IEEE Press, New York, USA; Leipzig, Germany, 2009; 1–11. ISBN: 978-1-4244-4907-1.
37. Casanova H, Legrand A, Quinson M. SimGrid: a generic framework for large-scale distributed experimentations. In Proceedings of the 10th IEEE International Conference on Computer Modelling and Simulation (UKSIM/EUROSIM ’08), 2008.
38. Amazon Elastic Compute Cloud (EC2) Instance Types. Available from: http://aws.amazon.com/ec2/#instance [Accessed on 28 August 2016].
39. University of Massachusetts Amherst Trace Repository. Available from: http://traces.cs.umass.edu/ [Accessed on 28 August 2016].
40. Cordeiro TD, Damalio DB, Pereira NCVN, Endo PT, de Almeida Palhares AV, Goncalves GE, Sadok DFH, Kelner J, Melander B, Souza V, Mangs JE, et al. Open source cloud computing platforms. In Proceedings of the 2010 Ninth International Conference on Grid and Cloud Computing (GCC ’10): IEEE Computer Society, Washington, DC, USA, 2010; 366–371.
41. Nurmi D, Wolski R, Grzegorczyk C, Obertelli G, Soman S, Youseff L, Zagorodnov D. The Eucalyptus open-source cloud-computing system. In Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID ’09): IEEE Computer Society, Washington, DC, USA, 2009; 124–131.
42. OpenStack, Open Source Cloud Computing Software. Available from: http://openstack.org/.
AUTHORS’ BIOGRAPHIES
Nouman M. Durrani is a PhD candidate at the Department of Computer Science, FAST National University of Computer and Emerging Sciences, Karachi. His research interests include Heterogeneous Devices Volunteer Computing Systems, Human Computation, Cloud Computing, Distributed Systems, WSNs, and Big Data Analytics.
Feroz Zahid is a doctoral candidate at the University of Oslo. He is also affiliated with Simula Research Laboratory and is working on the RCN-funded research project ERAC (Efficient and Robust Architecture for the Big Data Cloud). His research interests include Interconnection Networks, Distributed Systems, Cloud Computing, Energy Efficient Systems, Network Security, Machine Learning, and Big Data. He holds an MS (Computer Science) degree from the FAST National University of Computer and Emerging Sciences, Karachi, and a BS (Computer Science) degree from the University of Karachi.
Jawwad A. Shamsi is the Head of the Department of Computer Science at FAST National University of Computer and Emerging Sciences, Karachi. He completed his PhD at Wayne State University in August 2009 under the supervision of Monica Brockmeyer. His research interests lie in cloud computing, distributed systems, high-performance computing, and network security.