A Virtual Machine Placement Taxonomy

Fabio López Pires
Itaipu Technological Park (PTI), National University of Asunción (UNA), Paraguay

Benjamín Barán
National University of Asunción (UNA), East National University (UNE), Paraguay
Abstract: Cloud computing datacenters dynamically provide millions of virtual machines (VMs) in current cloud markets. In this context, Virtual Machine Placement (VMP) is one of the most challenging problems in cloud infrastructure management, considering the large number of possible optimization criteria and different formulations that could be studied. VMP literature includes relevant research topics such as energy efficiency, Service Level Agreement (SLA), Quality of Service (QoS), cloud service pricing schemes and carbon dioxide emissions, all of them with high economical and ecological impact. This work classifies an extensive up-to-date survey of the most relevant VMP literature, proposing a novel taxonomy in order to identify research opportunities and define a general vision of this research area.

Index Terms: Taxonomy, Placement, Scheduling, Consolidation, Cloud Computing, Optimization
I. INTRODUCTION

Cloud computing datacenters deliver infrastructure (IaaS), platform (PaaS) and software (SaaS) as services available to end users on a pay-as-you-go basis [36]. In this context, a significant number of research challenges for delivering computational resources as utilities have been identified [8]. The present work focuses on one of the most studied problems for efficient management of physical and virtual resources in cloud computing datacenters: the process of selecting which virtual machines (VMs) should be hosted at each physical machine (PM) of a datacenter, commonly known as the Virtual Machine Placement (VMP) problem.

Research works demonstrated that considering the VMP problem for efficient management of resources could result in significant improvements in energy efficiency, Quality of Service (QoS) and carbon dioxide emissions, all of them with high economical and ecological impact [2], [5], [7].

Beloglazov and Buyya proposed in [6] four sub-problems for efficient management of physical and virtual resources in cloud infrastructures: (1) determining when a PM may be considered overloaded, requiring migration of one or more VMs from this PM; (2) determining when a PM may be considered underloaded, leading to a decision to migrate all VMs from this PM and switch the PM to sleep mode; (3) selecting the VMs that should be migrated from an overloaded PM; and (4) finding a new placement for the VMs selected for migration, considering the overloaded and underloaded PMs. All these sub-problems should be optimized online considering the dynamic workload of applications [6].
A conceptual cloud computing architecture considering the most studied problems in cloud computing infrastructure management is presented in [18]. According to the proposed architecture, VMP (or scheduling) is one of the main problems in this context, where placement of admitted services could be solved considering different criteria and requirements. Other relevant problems to address towards an efficient management of cloud computing resources are admission control and proactive elasticity [18]. Additionally, [40] remarks that finding the best placement of VMs into PMs is one of the most challenging problems for cloud management systems.

A. Background and Motivation

The VMP problem has been extensively studied in cloud computing literature and several surveys have already been presented. Existing surveys focus on specific issues such as: (1) energy-efficient techniques applied to the problem [5], [41], (2) particular architectures where the VMP problem is applied, such as federated clouds [21], and (3) methods for comparing performance of placement algorithms in large on-demand clouds [37].

Beloglazov et al. presented in [5] a survey of energy-aware resource allocation policies and scheduling algorithms considering QoS.
The following open challenges were identified considering energy-aware management of cloud computing datacenters: (1) development of fast energy-efficient algorithms for the VMP, considering multiple resources for large-scale systems, with the ability to predict workload peaks to prevent performance degradation, (2) energy-aware optimization of virtual network topologies among VMs for optimal placement, in order to reduce network traffic and thus the energy consumed by the network infrastructure, (3) development of new thermal management algorithms to appropriately control temperature and energy consumption, (4) development of workload-aware resource allocation algorithms, considering that current approaches assume a uniform workload, and (5) decentralized and distributed approaches to provide scalability and fault tolerance to the resolution of the VMP problem.

Salimian et al. presented a review of different selection and placement algorithms for energy-efficient management of cloud computing datacenters [41]. Approaches for virtual and physical resource modeling, applied techniques and future work were identified for each studied article. The most relevant future work includes: (1) VMP for multi-core architectures considering multiple resources, (2) consideration of dynamic thresholds for QoS and (3) development of intelligent schemes according to workload and considering live migration.

Gahlawat et al. proposed in [21] a brief survey of the main cloud federation architectures and approaches considered for VMP problem formulation. Recall that cloud federation is the practice of voluntarily interconnecting cloud infrastructures of different Cloud Service Providers (CSPs), mostly with the aim of responding to workload peaks. Also, Mills et al. presented in [37] methods to compare the performance of placement algorithms in large on-demand clouds, where 18 different algorithms for VMP were compared considering 39 variables such as reallocation rate, user request rate, allocation rate and disk space utilization, among others.

The above-mentioned surveys and research articles focused on specific issues related to the VMP problem. To the best of the authors’ knowledge, there is no published research work presenting a general and extensive study of a large part of the VMP literature. Consequently, this work presents an extensive up-to-date survey of the most relevant VMP literature and proposes a novel taxonomy in order to identify research opportunities, defining a general vision of this promising research area.

The remainder of this paper is organized as follows: Section II details the proposed taxonomy considering the most relevant VMP literature and its classification criteria. Section III introduces optimization approaches proposed in the studied articles, while Section IV presents a classification of the studied articles considering optimization objective functions. Next, in Section V, the solution techniques of the studied articles are described. Section VI presents additional classification criteria identified for the characterization of VMP problems.
Finally, conclusions and future directions are left to Section VII.

II. VIRTUAL MACHINE PLACEMENT TAXONOMY

A large part of the VMP literature was studied for the definition of the taxonomy presented in this section. A selection process of the most relevant current research work was performed, resulting in 84 research articles that studied the VMP problem considering different possible formulations. A detailed description of the selection process can be found in [33].

In the proposed taxonomy, research articles may be classified by: (1) optimization approaches, (2) objective functions and (3) solution techniques (see Figure 1). First, the VMP problem could be formulated considering one of the following optimization approaches: (1) mono-objective, (2) multi-objective solved as mono-objective or (3) pure multi-objective. Once the optimization approach is defined, research articles may also be classified by the objective function(s) they studied, both in minimization and maximization contexts. These objective functions could be optimized separately or simultaneously, depending on the selected optimization approach. Finally, solution techniques proposed for solving the VMP problem are used as a third classification criterion.
Figure 1. Main classification criteria for the proposed VMP taxonomy (Optimization Approach, Objective Function, Solution Technique).
The following sections present a detailed description of the studied articles considering the 3 classification criteria defined above for the proposed taxonomy of the VMP problem (summarized in Figure 1). Additional identified classification criteria are presented in Section VI.

III. OPTIMIZATION APPROACHES

This section presents the optimization approaches proposed in the studied articles. The identified optimization approaches may be classified as: (1) mono-objective (MOP), (2) multi-objective solved as mono-objective (MAM) and (3) pure multi-objective (PMO). These optimization approaches are detailed in the following sub-sections and summarized in Figure 2.

A. Mono-Objective Approach

A mono-objective approach considers the optimization of only one objective function, or the individual optimization of more than one objective function, one at a time.

Research on the VMP problem has been mainly guided by the mono-objective optimization approach: 64% of the studied articles proposed a mono-objective approach (MOP) for solving the VMP problem (see Figure 2). Across the 54 articles studying the problem with a mono-objective approach, almost 40 different objective functions were proposed. It is worth noting that the same objective function could be proposed considering different modeling approaches (e.g. economical revenue maximization could be achieved by minimizing the total economical penalties for SLA violations [15], by minimizing operational costs [26], [27] or even by maximizing the total profit for leasing resources [43]).

Taking into account the large number of proposed objective functions and possible approaches for objective function modeling, multi-objective optimization [13] could result in more realistic formulations of the VMP problem, optimizing more than just one objective function at a time (e.g.
achieve economical revenue maximization by simultaneously minimizing the total economical penalties for SLA violations, minimizing operational costs and maximizing the profit for leasing resources).

B. Multi-Objective solved as Mono-Objective Approach

In this work, the optimization of multiple objective functions combined into one objective function is considered a multi-objective approach solved as mono-objective (MAM). A disadvantage of this hybrid approach is that it requires a deep knowledge of the problem domain to allow a correct combination of the objective functions, which in most cases is not possible [3].

According to Figure 2, 32% of the studied articles proposed a multi-objective approach but finally solved the VMP as a mono-objective optimization problem. In fact, in recent years, a growing number of articles have proposed formulations of the VMP problem with this hybrid approach [33].

The most widely used method for solving a multi-objective formulation of the VMP problem as mono-objective is the weighted sum method [33]. In this method, one objective function is formulated as a linear combination of multiple objectives, mapping several objectives into only one objective function. Solving a multi-objective optimization problem as mono-objective while considering the other objective functions as constraints of the problem is also a studied method in the VMP literature [20], [42]. Kord et al. proposed a two-objective approach and made a trade-off between these two goals using a fuzzy Analytic Hierarchy Process (AHP) [29]. Other research articles employed fuzzy logic to provide an efficient way of combining conflicting objectives and expert knowledge. For multiple objectives, fuzzy logic allows mapping of different objectives into linguistic values characterizing levels of satisfaction [25], [45], [49].

C. Pure Multi-Objective Approach

A general pure multi-objective optimization problem (PMO) includes a set of p decision variables, q objective functions, and r constraints. Objective functions and constraints are functions of the decision variables. In a PMO formulation, x represents the decision vector, while y represents the objective vector. The decision space is denoted by X and the objective space by Y. These can be expressed as [13]:

Optimize:

$$ y = f(x) = [f_1(x), f_2(x), \ldots, f_q(x)] \tag{1} $$

subject to:

$$ e(x) = [e_1(x), e_2(x), \ldots, e_r(x)] \geq 0 \tag{2} $$

where:

$$ x = [x_1, x_2, \ldots, x_p] \in X \tag{3} $$

$$ y = [y_1, y_2, \ldots, y_q] \in Y \tag{4} $$
It is important to remark that optimizing, in a particular problem context, can mean maximizing or minimizing. The set of constraints e(x) ≥ 0 defines the set of feasible solutions X_f ⊂ X and its corresponding set of feasible objective vectors Y_f ⊂ Y.

Figure 2. Percentage of articles considering each optimization approach in the studied universe of 84 papers (MOP: 64%, MAM: 32%, PMO: 4%).

To compare two solutions in a multi-objective context, the concept of Pareto dominance is used. Given two feasible solutions u, v ∈ X, u dominates v, denoted as u ≻ v, if f(u) is better or equal to f(v) in every objective function and
strictly better in at least one objective function. If neither u dominates v, nor v dominates u, u and v are said to be non-comparable (denoted as u ∼ v). A decision vector x is non-dominated with respect to a set U if there is no member of U that dominates x. The set of non-dominated solutions of the whole set of feasible solutions X_f is known as the optimal Pareto set P*. The corresponding set of objective vectors constitutes the optimal Pareto front PF*.

In [39], Osyczka stated that a multi-objective optimization problem can be defined (in words) as the problem of finding a vector of decision variables which satisfies constraints and optimizes a vector function whose elements represent the objective functions. These functions form a mathematical description of performance criteria which are usually in conflict with each other. It should be noted that the term optimize means finding a solution which would give acceptable values of all the objective functions for the decision maker. Decision makers may base their choice of solution on non-modeled human preferences [13].

In the considered literature, only 4% of the articles proposed a pure multi-objective approach, using Pareto dominance for comparing conflicting objective functions [22], [32], [34]. To the best of the authors’ knowledge, there are no many-objective optimization formulations proposed for the VMP problem in the specialized literature (i.e., a multi-objective optimization problem with at least four conflicting objective functions [47]).

IV. OBJECTIVE FUNCTIONS

In cloud computing datacenters with a considerable number of PMs and VMs, there are several criteria that can be considered when selecting a possible solution for the VMP problem, depending on management policies and optimization objectives. These criteria can even change from one period of time to another, which implies a variety of possible formulations of the problem and different objectives to be optimized.
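The contrast drawn in Section III between the weighted sum method (MAM) and Pareto dominance (PMO) can be illustrated with a minimal sketch. It assumes that all objectives are minimized; the candidate objective vectors (energy, network traffic), the weights and the function names are hypothetical and are not taken from any of the studied articles.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def dominates(fu: Sequence[float], fv: Sequence[float]) -> bool:
    """True if objective vector fu Pareto-dominates fv (all objectives minimized)."""
    no_worse = all(a <= b for a, b in zip(fu, fv))
    strictly_better = any(a < b for a, b in zip(fu, fv))
    return no_worse and strictly_better


def non_dominated(vectors: List[Vector]) -> List[Vector]:
    """Keep the vectors that no other vector in the list dominates."""
    return [v for v in vectors
            if not any(dominates(u, v) for u in vectors if u is not v)]


def weighted_sum(fv: Sequence[float], weights: Sequence[float]) -> float:
    """Scalarize an objective vector as a linear combination of its objectives."""
    return sum(w * f for w, f in zip(weights, fv))


# Hypothetical (energy, network traffic) values of four candidate placements.
candidates = [(10, 4), (8, 7), (12, 6), (9, 5)]

# PMO view: a set of mutually non-comparable trade-off solutions.
front = non_dominated(candidates)

# MAM view: a single solution, which depends on the chosen weights.
best = min(candidates, key=lambda v: weighted_sum(v, (0.7, 0.3)))
```

In this toy instance the Pareto approach returns three trade-off solutions, while the weighted sum returns only one of them, and a different choice of weights could select another; this is one way to read the remark that a correct combination of objective functions requires deep knowledge of the problem domain.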
According to the studied articles, VMP literature mainly focuses on the optimization of objective functions that specifically concern CSPs. In this work, this scenario is considered as a provider-oriented VMP (see Figure 3). Considering the large number of current CSPs and the different prices and features of VMs, the VMP problem could also be studied as the process of selecting which VMs should be hosted at each CSP [23]. In this context, objective functions could also be formulated considering the requirements of Cloud Service Tenants (CSTs) for the allocation of a particular service, often composed of more than one VM. In this work, this scenario is considered as a broker-oriented VMP (see Figure 4).

Finally, it should be mentioned that the studied articles proposed nearly 60 different objective functions for the three optimization approaches presented in Section III [33]. Considering the large number of proposed objective functions, this work classifies objective functions with similar characteristics and goals into 5 objective function groups that are presented in the following sub-sections and summarized in Table I.

A. Energy Consumption Minimization

Energy consumption management is an important studied issue for CSPs, with high impact in operational costs and
carbon dioxide emissions for datacenter operations. According to [4], most of the time, servers operate in their lowest energy-efficiency region (i.e. between 10 and 50% of resource utilization), even though energy efficiency is a very important issue to address, considering its economical and ecological impact in modern datacenters.

Of the studied articles, 51.2% proposed energy consumption minimization for the VMP problem, considering different energy modeling approaches which are summarized below.

Achieving energy consumption minimization by consolidating VMs on the minimum number of PMs is a widely studied approach for the VMP problem [33]. This approach is mostly based on the assumption that “the use of fewer PMs will bring less energy consumption” [16], but real world cloud computing datacenters are not homogeneous in terms of performance, management capabilities, and energy efficiency of PMs [17]. Algorithms that consolidate VMs on the minimum number of PMs without considering the power consumption of each PM could result in sub-optimal solutions with energy-inefficient PMs.

Energy consumption could also be modeled as a linear relationship between power consumption and Central Processing Unit (CPU) utilization. Articles considering this approach proposed formulations similar to the one presented in [5]:

$$ P(U_{cpu}) = U_{idle} \times P_{max} + (1 - U_{idle}) \times P_{max} \times U_{cpu} \tag{5} $$

where:
$P(U_{cpu})$: Power consumption of a PM
$U_{idle}$: Fraction of power consumed by an idle PM
$P_{max}$: Maximum power consumption of a PM
$U_{cpu}$: CPU utilization rate

Figure 3. Provider-oriented VMP problem selects which VMs should be hosted at each PM of a datacenter.
The CPU utilization may change over time due to dynamic workload, so CPU utilization is a function of time and it is represented as $U_{cpu}(t)$. Considering Eq. (5), total energy consumption E of a PM is given by Eq. (6):

$$ E = \int_{t_0}^{t_1} P(U_{cpu}(t)) \, dt \tag{6} $$

Figure 4. Broker-oriented VMP problem selects which VMs should be hosted at each CSP.
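As a worked illustration of the linear power model of Eq. (5) and the energy integral of Eq. (6), the following minimal sketch evaluates both for sampled CPU utilization. The trapezoidal approximation of the integral, the function names and the numeric values (a 250 W PM whose idle fraction is 0.6) are assumptions made for illustration only.

```python
from typing import Sequence


def power(u_cpu: float, u_idle: float, p_max: float) -> float:
    """Eq. (5): power drawn by a PM at CPU utilization u_cpu in [0, 1]."""
    return u_idle * p_max + (1.0 - u_idle) * p_max * u_cpu


def energy(times: Sequence[float], utilizations: Sequence[float],
           u_idle: float, p_max: float) -> float:
    """Eq. (6), approximated with the trapezoidal rule over sampled utilization.

    times are in seconds and p_max in watts, so the result is in joules.
    """
    total = 0.0
    for i in range(1, len(times)):
        p_prev = power(utilizations[i - 1], u_idle, p_max)
        p_curr = power(utilizations[i], u_idle, p_max)
        total += 0.5 * (p_prev + p_curr) * (times[i] - times[i - 1])
    return total


# Hypothetical PM: 250 W at full load, 60% of that when idle.
P_MAX, U_IDLE = 250.0, 0.6

idle_watts = power(0.0, U_IDLE, P_MAX)   # idle PM
half_watts = power(0.5, U_IDLE, P_MAX)   # PM at 50% CPU utilization
# One hour at a constant 50% utilization, sampled at both ends.
one_hour_joules = energy([0.0, 3600.0], [0.5, 0.5], U_IDLE, P_MAX)
```

Under these assumed values, the idle PM already draws 60% of its peak power, which is consistent with the observation above that consolidation on fewer PMs only saves energy when the power profile of each PM is taken into account.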
Another studied model is a quadratic polynomial function presented in [12] for modeling power consumption according to the workload of each datacenter in a geo-distributed cloud context for electricity cost minimization.

Considering that network communication equipment consumes between 10 and 20% of the total datacenter energy [19], reducing the energy consumption of this equipment is an important issue to address. Some articles proposed formulations of the VMP for energy consumption minimization of network communication equipment. VMPlanner is proposed in [19] for the optimization of both placement and traffic flow routing, with the objective of minimizing the amount of active network communication equipment for energy saving.
In [14], the optimization of energy consumption of PMs and network switches is proposed, where power modeling is based on the utilization of each physical resource. A holistic approach for power consumption minimization in large-scale distributed cloud datacenters is proposed in [28], where services are provisioned by Internet demands over an IP-over-Wavelength Division Multiplexing (WDM) network considering: (1) power consumption at each datacenter, (2) power consumption at the IP layer and (3) power consumption at the WDM layer.

B. Network Traffic Minimization

As proposed in [5], network traffic is an important objective function for optimization with open challenges in cloud computing datacenters. From the studied articles, 30.9% proposed network traffic minimization for the VMP problem (see Table I).

As presented in Section IV-A, network traffic optimization can be studied jointly with energy consumption. Additionally, other approaches are also presented in VMP literature, mainly based on the optimization of: (1) network communication costs, (2) live migration overhead, (3) network metrics such as delay, data access and data transfer time, link congestion, network performance, service response time and average latency, and (4) Wide Area Network (WAN) communication in geo-distributed clouds [33].

A widely studied approach for network traffic minimization is the placement of VMs with a high communication rate in the same PM (or at least in the same rack) to avoid the utilization of network resources (or at least of core network equipment). This approach includes workload characterization and clustering techniques applied to the VMP problem to minimize the inter-VM network traffic [33].

Modeling and quantification of live migration network overhead is an important open challenge in a VMP network traffic optimization context.
The live migration network overhead is quantified in [2] considering that all virtual resources of a VM affect the total migration overhead, but with different levels of impact (i.e. storage migration has more impact than CPU migration). For this, the total live migration network overhead is modeled in [2] as a weighted sum of resources. The weight w assigned to each resource represents the impact of that resource on the total migration overhead, considering that network (w = 0.8) and storage (w = 0.6) resources have more impact than Random Access Memory (RAM) (w = 0.4) or CPU (w = 0.1). Another approach for live migration network overhead optimization in VMP is presented in [9], where live migrations are limited to a maximum number, modeled as a constraint of the problem.

Considering that VMs are dynamically created and destroyed in cloud computing environments, a consolidation process could require a high level of flexibility, where traditional routing protocols present limitations to adjust flow paths. In [48], the authors proposed network traffic load balancing to improve QoS in a VMP context considering Software Defined Networking (SDN) [35], where flow paths are determined based on network status metrics such as low delay, low packet loss or high security.
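The weighted sum model of live migration overhead described above for [2] can be sketched as follows. The four weights are the ones quoted in the text; everything else is an assumption for illustration: the dictionary keys, the function name, the example VMs, and in particular the choice of multiplying each weight by a normalized resource usage in [0, 1], since the text does not state which per-resource quantity is weighted.

```python
from typing import Dict

# Weights quoted in the text for the model of [2].
MIGRATION_WEIGHTS: Dict[str, float] = {
    "network": 0.8,
    "storage": 0.6,
    "ram": 0.4,
    "cpu": 0.1,
}


def migration_overhead(vm_usage: Dict[str, float],
                       weights: Dict[str, float] = MIGRATION_WEIGHTS) -> float:
    """Weighted sum of a VM's resource usage (assumed normalized to [0, 1])."""
    return sum(w * vm_usage.get(resource, 0.0) for resource, w in weights.items())


# Two hypothetical VMs: vm_a is storage-heavy, vm_b is CPU-heavy.
vms = {
    "vm_a": {"network": 0.2, "storage": 0.9, "ram": 0.5, "cpu": 0.7},
    "vm_b": {"network": 0.1, "storage": 0.1, "ram": 0.3, "cpu": 0.9},
}

# Pick the VM whose migration is estimated to be the cheapest.
cheapest = min(vms, key=lambda name: migration_overhead(vms[name]))
```

With these assumed inputs, the CPU-heavy VM is the cheaper one to migrate, because CPU carries the smallest weight in the model.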
C. Economical Costs Optimization

From the studied articles, 22.6% proposed economical costs optimization for the VMP problem (see Table I). Economical revenue maximization is a key issue to be addressed by CSPs and could be achieved by reducing operational costs. These operational costs are mainly related to energy consumption, but other formulations could also be studied, such as thermal dissipation costs [49]. Reducing penalty costs of SLA violations is another studied approach in the VMP literature in order to maximize the economical revenue of a CSP [33]. Finally, CSPs can maximize their economical revenue by leasing all of their available resources, or at least as many as possible [33]. In this context, VMP could be studied jointly with the admission control problem [18], and two possible scenarios could be identified: if demand for resources exceeds the currently available resources, overbooking techniques or cloud federation [18] can help CSPs to meet the requirements of the CSTs; otherwise, idle resources can be offered in an auction-based scheme such as Amazon’s Spot Instances [1]. Both scenarios represent open challenges for the VMP as well as emerging cloud computing markets.

On the other hand, CSTs look for CSPs that meet the specific requirements of a particular service, preferably with minimal costs for the required virtual infrastructure. In this context, economical cost minimization is studied, and the trend towards dynamic pricing schemes for VMs introduces many open challenges in this research area [33].

The most studied pricing scheme in the considered VMP literature is fixed price. However, with the recent trend of dynamic pricing of cloud resources, where the price of resources can vary depending on the free capacity and load of the provider, a few articles have recently proposed formulations of the VMP problem considering dynamic pricing schemes [31].

D. Performance Maximization

In this work, objective functions related to performance were proposed in 16.7% of the studied articles (see Table I). These articles include the optimization of: (1) security metrics, (2) resource interference, (3) QoS, (4) high availability, (5) total job completion time, (6) shared last level cache (SLLC) contention and (7) deployment plan time [33]. Most of these performance metrics may be considered by CSTs in order to select an appropriate CSP to host their services.

A few articles focused on the VMP problem for High Performance Computing (HPC) applications, where the proposed approaches were based on avoiding the placement of CPU-intensive VMs in the same PM [24].

E. Resource Utilization Maximization

Cloud infrastructures are commonly composed of multiple physical and virtual resources such as CPU, RAM, storage, network bandwidth and Graphics Processing Units (GPUs). In this context, an efficient and balanced utilization of these resources is an important issue to address. Taking into account the
surveyed universe of 84 papers, 15.5% of the studied articles proposed the optimization of resource utilization (see Table I). The main approaches included the maximization of resource utilization, but an important issue to consider is the balanced utilization of each resource [33].

Li et al. studied the concept of elasticity, referring to how well a datacenter can satisfy the growth of the resource demands of the input VMs under the limitations of PM and network link capacities [30]. Considering the importance of the efficient utilization of resources, an interesting analysis of the anomalies and drawbacks in some existing strategies for efficient resource utilization is presented in [38], proposing a novel vector-based technique that solves the studied anomalies.

V. SOLUTION TECHNIQUES

In the considered universe, different techniques were proposed for solving the VMP problem. The main solution techniques include (1) deterministic algorithms, (2) heuristics, (3) meta-heuristics, and (4) approximation algorithms. These four solution techniques are detailed in the following sub-sections and summarized in Table II.

A. Deterministic Algorithms

Classical deterministic techniques are proposed for the VMP problem, including Constraint Programming (CP), Linear Programming (LP), Integer Linear Programming (ILP), Mixed Integer Linear Programming (MILP), Pseudo-Boolean Optimization (PBO) and Dynamic Programming (DP), representing 17.9% of the studied articles (see Table II).

Most of these approaches are proposed as novel mathematical formalizations of the VMP problem, without any practical intention, considering that obtaining the optimal solution implies a search over a universe of N possible solutions [34], given by:

$$ N = (n + 1)^m \tag{7} $$

where:
$N$: Size of the search universe
$n$: Number of physical machines
$m$: Number of virtual machines
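To give a feeling for how quickly the search universe of Eq. (7) grows, the following minimal sketch computes N and enumerates every candidate placement of a toy instance. It assumes that the "+1" term corresponds to the option of leaving a VM unplaced, encoded here as 0; this reading, as well as the function names, is an assumption made for illustration.

```python
from itertools import product
from typing import Iterator, Tuple


def search_space_size(n_pms: int, n_vms: int) -> int:
    """Eq. (7): number of candidate placements for n PMs and m VMs."""
    return (n_pms + 1) ** n_vms


def enumerate_placements(n_pms: int, n_vms: int) -> Iterator[Tuple[int, ...]]:
    """Yield every candidate placement as a tuple with one entry per VM.

    Entry k is the index (1..n) of the PM hosting VM k, or 0 if the VM is
    left unplaced (assumed meaning of the "+1" in Eq. (7)).
    """
    return product(range(n_pms + 1), repeat=n_vms)


# Toy instance: 2 PMs and 3 VMs can still be enumerated exhaustively.
toy_size = search_space_size(2, 3)
toy_placements = list(enumerate_placements(2, 3))

# A still modest instance of 10 PMs and 20 VMs is already out of reach.
modest_size = search_space_size(10, 20)
```

Exhaustive enumeration is only viable for instances of roughly the size of the toy example; for 10 PMs and 20 VMs the universe already exceeds 10^20 candidates.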
Table II. Solution techniques: more than one could be applied.

Solution Technique          % of studied papers
Deterministic Algorithms    17.9%
Heuristics                  66.7%
Meta-Heuristics             14.3%
Approximation Algorithms    2.4%

B. Heuristics

Considering that VMP is a combinatorial NP-complete problem [46], it is impracticable to exactly solve instances of the problem for a large number of PMs and VMs. Commonly,
CSPs are composed of thousands to millions of PMs and VMs, a scenario where finding optimal solutions with exhaustive search algorithms can be extremely expensive. Therefore, a trade-off between quality of solutions and computational cost has to be considered for real world cloud management systems.

Heuristics have already been extensively studied in the literature for NP-complete problems. In this work, 66.7% of the studied articles proposed heuristic-based solution techniques for the VMP problem (see Table II).

Most of the studied articles proposed heuristics based on well-studied algorithms such as First-Fit, First-Fit Decreasing, Best-Fit, Best-Fit Decreasing, Worst-Fit and Heaviest-Fit [33]. Other greedy algorithms were also proposed, in addition to novel heuristics for the resolution of the VMP problem [33].

According to the proposed taxonomy, heuristics are widely studied in mono-objective (MOP) as well as multi-objective solved as mono-objective (MAM) optimization approaches, with the 5 objective function groups presented in Table I.

C. Meta-Heuristics

As mentioned before, approximations to optimal solutions are sufficient in most cloud infrastructure environments. Meta-heuristics are also very useful in order to obtain good solutions in practical time. From the studied articles, 14.3% proposed the resolution of the VMP problem with meta-heuristics (see Table II), including Memetic Algorithms (MA), Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO), Genetic Algorithm (GA), Neighborhood Search (NS), Cut-and-Search, Simulated Annealing (SA) and Tabu Search (TS) [33].

According to the proposed taxonomy (see Table III), meta-heuristics are mainly studied in multi-objective optimization approaches for solving the VMP problem.

Table I. Objective functions: a paper may consider just one or several different objective functions.

Objective Function                   % of studied papers
Energy Consumption Minimization      51.2%
Network Traffic Minimization         30.9%
Economical Revenue Maximization      22.6%
Performance Maximization             16.7%
Resource Utilization Maximization    15.5%

D. Approximation Algorithms

Heuristics and meta-heuristics provide good quality solutions, but the quality of the expected solutions is hardly measurable. In a p-approximation algorithm, the value of a solution will not be more (or less) than a factor p times the optimum solution. Only 2.4% of the studied articles proposed approximation algorithms for solving the VMP problem (see Table II). It is worth noting that, for cloud infrastructures, solutions obtained using heuristic or meta-heuristic techniques are sufficiently good for most practical cases.
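To make the greedy heuristics named in Section V-B concrete, the following is a minimal sketch of First-Fit Decreasing applied to VM consolidation. It is a deliberately simplified instance: a single resource dimension, identical PM capacities and integer demands are all assumptions made for illustration, and the function and VM names are hypothetical rather than taken from any studied article.

```python
from typing import Dict, List


def first_fit_decreasing(vm_demands: Dict[str, int],
                         pm_capacity: int) -> List[List[str]]:
    """Place VMs on identical PMs with First-Fit Decreasing.

    VMs are sorted by decreasing demand; each one goes to the first PM that
    still has enough free capacity, and a new PM is opened when none fits.
    Returns one list of VM names per PM that was opened.
    """
    for vm, demand in vm_demands.items():
        if demand > pm_capacity:
            raise ValueError(f"{vm} does not fit in an empty PM")

    pms: List[dict] = []  # each PM tracks its free capacity and hosted VMs
    ordered = sorted(vm_demands.items(), key=lambda kv: kv[1], reverse=True)
    for vm, demand in ordered:
        for pm in pms:
            if pm["free"] >= demand:
                pm["free"] -= demand
                pm["vms"].append(vm)
                break
        else:
            # No open PM can host this VM, so a new PM is switched on.
            pms.append({"free": pm_capacity - demand, "vms": [vm]})
    return [pm["vms"] for pm in pms]


# Hypothetical CPU demands (in cores) and PMs offering 10 cores each.
demands = {"vm1": 4, "vm2": 8, "vm3": 1, "vm4": 4, "vm5": 2, "vm6": 1}
placement = first_fit_decreasing(demands, pm_capacity=10)
```

In this toy instance the total demand is 20 cores, so the two PMs opened by the heuristic happen to be the minimum possible; in general a greedy heuristic of this kind offers no such guarantee unless a bound is proven for it, which is the distinction that approximation algorithms make explicit.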
Table III. Virtual Machine Placement taxonomy. The elements of the table represent the number of articles in the studied universe.

Rows: solution technique (Deterministic Algorithms, Heuristics, Meta-Heuristics, Approximation Algorithms), each divided by optimization approach (MOP, MAM, PMO).
Columns: objective functions, with entries listed in row order.

Objective Function      Entries
Energy Consumption      3 (3.6%), 3 (3.6%), 11 (13.1%), 14 (16.7%), 3 (3.6%), 4 (4.8%), 3 (3.6%), 2 (2.4%), -
Network Traffic         3 (3.6%), 3 (3.6%), 3 (3.6%), 8 (9.5%), 2 (2.4%), 2 (2.4%), -
Economical Revenue      2 (2.4%), 6 (7.1%), 6 (7.1%), 1 (1.2%), 2 (2.4%), 2 (2.4%), -
Performance             1 (1.2%), 8 (9.5%), 3 (3.6%), 1 (1.2%), 1 (1.2%), -
Resource Utilization    1 (1.2%), 2 (2.4%), 5 (6.0%), 2 (2.4%), 2 (2.4%), 1 (1.2%), -

Figure 5. Additional classification criteria to eventually expand proposed taxonomy: formulation (online, offline), cloud architecture (single-cloud, multi-cloud, federated), orientation (provider, broker) and experimental environment (simulation, system, size, workload).
VI. ADDITIONAL CLASSIFICATION CRITERIA

In addition to the classification criteria detailed in Sections II to V, this work identified other relevant criteria for the characterization of VMP problems. The following sub-sections present a detailed description of those additional criteria.

A. Formulation: Offline or Online

According to the studied articles, optimization of the VMP problem could be studied considering offline or online formulations. Offline formulations (also known as initial or static placement) consider the placement of VMs into PMs for a static cloud datacenter deployment. This scenario does not consider possible re-locations of the VMs; therefore, there is no need for migration techniques to be applied, simplifying the problem formulation and complexity. According to this work, 22.6% of the studied articles proposed an offline formulation of the VMP problem.

Considering the on-demand model of cloud computing with dynamic resource provisioning, an offline formulation of the VMP problem can result in sub-optimal solutions after a short period of time. Live migration allows VMs to be dynamically consolidated on the necessary PMs according to dynamically updated resource requirements. A formulation that dynamically consolidates resources according to the requirements of the VMP problem is considered an online formulation, and 77.4% of the studied articles proposed this type of formulation.

For online formulations of the VMP problem, many articles proposed prediction techniques in order to estimate in advance the migrations of VMs required by the consolidation process, to reduce resource under-provisioning [33]. The most studied scenario for online formulations of the VMP problem considers that VMs are dynamically created and destroyed. At the time of this writing, the authors are also working on deeper research into possible dynamic parameters in order to propose holistic and more realistic scenarios of the VMP problem considering different architectures.

B. Cloud Architectures
Different scenarios for the VMP problem can be studied depending on the number of cloud datacenters associated with the problem instance. In scenarios considering only one datacenter, a cloud service provider can formulate the VMP problem subject to commonly studied constraints such as capacity of resources or SLA compliance. In this work, a one-datacenter scenario is considered a single-cloud scenario; it is the most studied scenario, representing 88.1% of the studied articles [33]. According to [18], cloud architectures considering more than one datacenter could be classified as: (1) bursted private clouds (i.e. a CST having a private cloud infrastructure with the possibility to expand using external CSPs), (2) federated clouds (CSPs using partners to ensure the capacity needed to serve the CSTs that are their customers), and (3) multi-clouds (CSTs working directly with multiple external CSPs). For CSPs like Amazon or Rackspace Cloud, the single-cloud scenario is extended to many geo-distributed cloud datacenters, where the formulation of the VMP problem includes additional constraints, considering provision of differentiated services to world-wide tenants, such as redundancies of services (in different datacenters) or quality and performance
of services (datacenters closer to end users), increasing the complexity of the formulation. Tenants can even require VMs to be deployed in many geo-distributed cloud datacenters of different cloud providers for disaster recovery reasons. In this work, this scenario is also considered a multi-cloud scenario. Another studied scenario for the VMP problem includes different CSPs in a federated cloud environment [33]. In federated clouds, CSPs with excess capacity lease resources to providers in need of temporary additional resources; consequently, particular constraints associated with this scenario have to be studied [21].
C. Orientation: Provider-oriented or Broker-oriented
Cloud infrastructure optimization is a main concern for CSPs, considering the objective functions described in Section IV. In this context, the VMP problem is commonly formulated from a provider-oriented perspective [32] (see Figure 3). Since placement decisions only take into account the constraints of CSTs, commonly represented as SLAs, tenants cannot decide which PMs will host their VMs. However, the number of CSPs has rapidly increased, and nowadays there are different pricing schemes, virtual machine offers and features. In general, it is difficult for users to search cloud prices and decide where to host their resources. A scenario of the VMP problem that optimizes the placement of users' virtual infrastructure across available public cloud offers can also be studied. This novel scenario can be described as a broker-oriented approach [11] (see Figure 4).
D. Experimental Environment
The studied articles proposed novel formulations of the VMP problem, considering different objective functions and applying several solution techniques. In this context, experiments could include simulations and implementations in cloud operating systems (e.g. OpenStack).
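A simulation-based experiment of this kind can be very small. The following is a minimal sketch, not the setup of any studied article: it replays random VM creation and destruction events (the online scenario of Section VI-A) against a fixed pool of PMs. The `PM` class, the `first_fit` and `simulate` helpers, the single CPU dimension, the 40% destruction probability and the uniform demand in [1, 8] are all assumptions chosen for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class PM:
    """A physical machine with one CPU capacity (illustrative units)."""
    capacity: int
    vms: dict = field(default_factory=dict)  # vm_id -> CPU demand

    def free(self) -> int:
        return self.capacity - sum(self.vms.values())


def first_fit(pms, vm_id, demand):
    """Place the VM on the first PM with enough free capacity.

    Returns the index of the chosen PM, or None if no PM can host the VM.
    """
    for i, pm in enumerate(pms):
        if pm.free() >= demand:
            pm.vms[vm_id] = demand
            return i
    return None


def simulate(num_pms=4, capacity=16, steps=200, seed=0):
    """Replay random VM creation/destruction events and place VMs online."""
    rng = random.Random(seed)
    pms = [PM(capacity) for _ in range(num_pms)]
    placed, rejected, next_id = {}, 0, 0  # placed: vm_id -> PM index
    for _ in range(steps):
        if placed and rng.random() < 0.4:
            # Destroy a randomly chosen running VM and release its resources.
            vm_id = rng.choice(sorted(placed))
            del pms[placed.pop(vm_id)].vms[vm_id]
        else:
            # Create a VM with a synthetic, uniformly distributed demand.
            demand = rng.randint(1, 8)
            host = first_fit(pms, next_id, demand)
            if host is None:
                rejected += 1
            else:
                placed[next_id] = host
            next_id += 1
    active = sum(1 for pm in pms if pm.vms)
    return pms, active, rejected
```

Calling `simulate()` returns the final PM states, the number of PMs still hosting at least one VM, and the number of rejected requests. Swapping `first_fit` for another policy, or the uniform demand for a trace, changes the experiment without touching the event loop.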
Simulations are widely used in the studied articles, and some works even proposed a hybrid environment that complements simulations with implementations in cloud operating systems to validate the feasibility of the proposal in real-world scenarios [33]. Simulations can include experimental tests with different numbers of VMs and PMs, different types of workloads (e.g. CPU-intensive, network-intensive) and different workload distributions (e.g. a Gaussian, or normal, distribution) [33]. The most widely used cloud simulator is CloudSim [10]. The studied articles mostly consider between 100 and 1000 VMs for the experiments. Only a few experiments with more than 1000 VMs have been reported [33]. It is important to remark that actual cloud markets are mostly composed of thousands to millions of VMs, so it is important to validate the experiments with large numbers of VMs and PMs. Most of the studied articles consider synthetic and uniform workloads. Only a few experiments consider real workloads to validate their work [33]. The types of workload used in
experimental tests should be diverse, considering the heterogeneity of applications running in cloud computing datacenters. Most of the studied articles assume homogeneous resource configurations for PMs [33], but cloud computing datacenters should also be modeled considering heterogeneous resource configurations of PMs [7]. Particular optimization objective functions, such as network traffic optimization, should be validated according to different network topologies. In the studied articles, topologies such as Tree, Fat-Tree, VL2 and BCube have been proposed for experimental tests. Novel algorithms for the resolution of the VMP problem should be compared to known methods and metrics [37], and their performance should be compared with state-of-the-art algorithms such as (1) First Fit, (2) Best Fit or (3) Best Fit Decreasing, to cite just a few [33].
VII. CONCLUSIONS AND FUTURE DIRECTIONS
Based on a universe of more than 80 carefully chosen publications [33], this work presented a general up-to-date taxonomy of the VMP problem considering optimization approaches, objective functions and solution techniques as the three main dimensions of the proposed taxonomy. Studied articles were analyzed according to each classification criterion (taxonomy dimension) in order to identify research opportunities and define a general vision on this important and promising research area. The proposed taxonomy showed that research on the VMP problem has been mainly guided by the mono-objective optimization (MOP) approach (see Figure 2), but a growing number of articles have proposed formulations with a multi-objective solved as mono-objective optimization (MAM) approach in recent years. Nearly 60 different objective functions exist, and several approaches for modeling them were proposed. Consequently, pure multi-objective optimization (PMO) [13] could result in more realistic formulations of the VMP problem, optimizing more than just one objective function at a time.
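The difference between the MAM and PMO approaches can be illustrated with a short sketch. It is only an illustration under stated assumptions: the weighted sum is one common way of solving a multi-objective problem as a mono-objective one (the studied articles may use other scalarizations), both objectives are assumed to be minimized, and the candidate placements with their (energy, network traffic) scores are made-up values.

```python
def weighted_sum(objectives, weights):
    """MAM-style scalarization: combine several objectives into one value."""
    return sum(w * f for w, f in zip(weights, objectives))


def dominates(a, b):
    """True if a Pareto-dominates b (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(solutions):
    """PMO-style result: keep every solution that no other one dominates."""
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions)]


# Hypothetical placements scored as (energy, network traffic), both minimized.
candidates = [(10.0, 9.0), (12.0, 4.0), (15.0, 3.0), (14.0, 8.0), (11.0, 9.5)]

# MAM: a single "best" placement, which depends on the chosen weights.
best_mam = min(candidates, key=lambda s: weighted_sum(s, (0.5, 0.5)))

# PMO: the whole set of non-dominated trade-offs is returned.
front = pareto_front(candidates)
```

With equal weights the MAM approach returns the single placement `(12.0, 4.0)`, whereas the PMO approach returns three non-dominated placements, `(10.0, 9.0)`, `(12.0, 4.0)` and `(15.0, 3.0)`, and leaves the final trade-off to the decision maker.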
To the best of the authors’ knowledge, no many-objective optimization formulations (i.e. multi-objective optimization problems with more than three objective functions [47]) have been proposed for the VMP problem in the specialized literature. At the time of this writing, the authors are working on a many-objective mathematical framework for the VMP problem. From the studied universe of 84 papers, energy consumption minimization is the most studied objective function, with several modeling approaches proposed. A widely studied approach is consolidating VMs on the minimum number of PMs, based on the assumption that “the use of fewer PMs will bring less energy consumption” [16]. In general, algorithms that consolidate VMs on the minimum number of PMs without considering the power consumption of each PM could result in sub-optimal solutions with energy-inefficient PMs. Many works considered that energy consumption could be modeled as a linear relationship between power consumption and CPU utilization, but research should definitely advance
in order to propose holistic energy consumption models for cloud computing datacenters. Network traffic minimization is also a widely studied objective function, where modeling and quantifying the network overhead of live migration represents a very challenging issue [33]. Economical cost optimization is also an important objective function, where two possible scenarios could be identified as future research directions for the VMP problem: (1) if demand for resources exceeds the currently available resources, overbooking techniques or cloud federation [18] can help CSPs to meet the requirements of the CSTs; on the other hand, (2) idle resources can be offered in an auction-based scheme such as Amazon’s Spot Instances [1]. Both scenarios represent open challenges for VMP research and emerging cloud computing markets. According to the proposed taxonomy (see Table III), deterministic algorithms are studied mostly in a MOP approach and only briefly in a MAM approach. Future work can be proposed considering unstudied objective functions in MAM and PMO approaches for deterministic algorithms, noting that no study has been proposed in a PMO approach using deterministic techniques. Heuristics are a widely studied solution technique in MOP and MAM approaches, with solutions based on well-known heuristics or on novel heuristics for the resolution of the VMP problem. Future work can be proposed for heuristics in a PMO approach, considering that none of the studied articles proposed a heuristic-based solution technique with a PMO approach. Meta-heuristics are mainly studied in PMO approaches for solving the VMP problem, and future work can be proposed considering unstudied objective functions in a MOP approach. For example, future work may consider network traffic and resource utilization simultaneously as objective functions. Performance is also an unstudied objective function for meta-heuristics in a PMO approach.
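The linear energy model mentioned earlier in this section, and the caveat about consolidating VMs on the minimum number of PMs, can be made concrete with a small sketch. The helper names `linear_power` and `total_power` are hypothetical, all wattage figures are made-up assumptions, and idle PMs are assumed to be switched off and to draw no power.

```python
def linear_power(utilization, p_idle, p_max):
    """Power draw (W) of an active PM, linear in CPU utilization in [0, 1]."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    return p_idle + (p_max - p_idle) * utilization


def total_power(placement):
    """Sum the power of all active PMs; unloaded PMs are assumed off."""
    return sum(linear_power(u, p_idle, p_max)
               for u, p_idle, p_max in placement if u > 0)


# Each tuple is (utilization, idle watts, peak watts); figures are made up.
# Placement A: one large, power-hungry PM hosts all VMs (fewest PMs).
one_big_pm = [(0.5, 300.0, 500.0)]
# Placement B: the same VMs fill two small, more efficient PMs.
two_small_pms = [(1.0, 60.0, 120.0), (1.0, 60.0, 120.0)]
```

Under these assumed figures, placement A draws 400 W and placement B draws 240 W, so the placement that uses fewer PMs is the less energy-efficient one. This is the situation described above for heterogeneous PMs.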
Approximation algorithms can also be studied as future work, considering that this solution technique was only studied for energy consumption minimization in a MOP approach. At the same time, different meta-heuristics, methods and algorithms should still be developed and applied to the VMP problem before a truly good tool is ready for massive use in cloud computing datacenters. Techniques for avoiding an excessive number of possible placement combinations are also an open challenge for the VMP problem [44]. Additionally, the VMP problem can be formulated as an offline problem, but considering the on-demand model of cloud computing with dynamic resource provisioning, an offline formulation of the VMP problem can result in sub-optimal solutions after a short period of time. Clearly, the VMP problem should be optimized online to efficiently serve the typical dynamic workloads of modern applications. According to this work, the VMP problem could be formulated as a provider-oriented VMP (i.e. the process of selecting which VMs should be hosted at each PM of a datacenter), see Figure 3, or as a broker-oriented VMP (i.e. the process of selecting which VMs should be hosted at each CSP),
see Figure 4. Considering the growing number of CSPs, the different prices and configurations offered by the CSPs and the different requirements for particular cloud services, a deeper study of the broker-oriented VMP is proposed as a future direction with clear practical impact for the final user, specifically in a cloud computing environment. The most studied scenario for online formulations of the VMP problem considers that VMs are dynamically created and destroyed. At the time of this writing, the authors are also working on deeper research into possible dynamic parameters in cloud computing scenarios in order to propose holistic and more realistic scenarios of the VMP problem in different architectures. Different scenarios for the VMP problem can be studied depending on the number of cloud datacenters associated with the problem instance. The most studied scenario is the single-cloud scenario. Other trending cloud architectures such as bursted clouds, multi-clouds and federated clouds should also be studied in greater depth. Finally, it is important to remark that actual cloud markets are mostly composed of thousands to millions of VMs which are dynamically created and destroyed, so experimental tests for the VMP problem should consider: (1) large numbers of VMs and PMs, (2) heterogeneity in PM and VM configurations, (3) diverse workload types and distributions, and (4) trending dynamic parameters. Considering the studied articles, there are no existing test problems for the VMP that could be used as a widely accepted benchmark. In this context, a test problem generator would be useful for experimental testing.
REFERENCES
[1] O. Agmon Ben-Yehuda, M. Ben-Yehuda, A. Schuster, and D. Tsafrir, “Deconstructing Amazon EC2 spot instance pricing,” ACM Transactions on Economics and Computation, vol. 1, no. 3, p. 16, 2013.
[2] A. Anand, J. Lakshmi, and S.
Nandy, “Virtual machine placement optimization supporting performance SLAs,” in Cloud Computing Technology and Science (CloudCom), 2013 IEEE 5th International Conference on, vol. 1. IEEE, 2013, pp. 298–305.
[3] B. Barán, C. von Lücken, and A. Sotelo, “Multi-objective pump scheduling optimisation using evolutionary strategies,” Advances in Engineering Software, vol. 36, no. 1, pp. 39–47, 2005.
[4] L. A. Barroso and U. Hölzle, “The case for energy-proportional computing,” IEEE Computer, vol. 40, no. 12, pp. 33–37, 2007.
[5] A. Beloglazov, J. Abawajy, and R. Buyya, “Energy-aware resource allocation heuristics for efficient management of data centers for cloud computing,” Future Generation Computer Systems, vol. 28, no. 5, pp. 755–768, 2012.
[6] A. Beloglazov and R. Buyya, “Optimal online deterministic algorithms and adaptive heuristics for energy and performance efficient dynamic consolidation of virtual machines in cloud data centers,” Concurrency and Computation: Practice and Experience, vol. 24, no. 13, pp. 1397–1420, 2012.
[7] R. Buyya, C. S. Yeo, and S. Venugopal, “Market-oriented cloud computing: Vision, hype, and reality for delivering IT services as computing utilities,” in High Performance Computing and Communications, 2008. HPCC’08. 10th IEEE International Conference on. IEEE, 2008, pp. 5–13.
[8] R. Buyya, C. S. Yeo, S. Venugopal, J. Broberg, and I. Brandic, “Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility,” Future Generation Computer Systems, vol. 25, no. 6, pp. 599–616, 2009.
[9] N. M. Calcavecchia, O. Biran, E. Hadad, and Y. Moatti, “VM placement strategies for cloud scenarios,” in Cloud Computing (CLOUD), 2012 IEEE 5th International Conference on. IEEE, 2012, pp. 852–859.
[10] R. N. Calheiros, R. Ranjan, A. Beloglazov, C. A. De Rose, and R. Buyya, “CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms,” Software: Practice and Experience, vol. 41, no. 1, pp. 23–50, 2011.
[11] S. Chaisiri, B.-S. Lee, and D. Niyato, “Optimal virtual machine placement across multiple cloud providers,” in Services Computing Conference, 2009. APSCC 2009. IEEE Asia-Pacific. IEEE, 2009, pp. 103–110.
[12] K.-y. Chen, Y. Xu, K. Xi, and H. J. Chao, “Intelligent virtual machine placement for cost efficiency in geo-distributed cloud systems,” in Communications (ICC), 2013 IEEE International Conference on. IEEE, 2013, pp. 3498–3503.
[13] C. C. Coello, G. B. Lamont, and D. A. Van Veldhuizen, Evolutionary algorithms for solving multi-objective problems. Springer, 2007.
[14] A. Dalvandi, M. Gurusamy, and K. C. Chua, “Time-aware VM-placement and routing with bandwidth guarantees in green cloud data centers,” in Cloud Computing Technology and Science (CloudCom), 2013 IEEE 5th International Conference on, vol. 1. IEEE, 2013, pp. 212–217.
[15] H. T. Dang and F. Hermenier, “Higher SLA satisfaction in datacenters with continuous VM placement constraints,” in Proceedings of the 9th Workshop on Hot Topics in Dependable Systems. ACM, 2013, p. 1.
[16] J. Dong, X. Jin, H. Wang, Y. Li, P. Zhang, and S. Cheng, “Energy-saving virtual machine placement in cloud data centers,” in Cluster, Cloud and Grid Computing (CCGrid), 2013 13th IEEE/ACM International Symposium on. IEEE, 2013, pp. 618–624.
[17] C. Dupont, G. Giuliani, F. Hermenier, T. Schulze, and A. Somov, “An energy aware framework for virtual machine placement in cloud federated data centres,” in Future Energy Systems: Where Energy, Computing and Communication Meet (e-Energy), 2012 Third International Conference on. IEEE, 2012, pp. 1–10.
[18] E. Elmroth, J. Tordsson, F. Hernández, A. Ali-Eldin, P. Svärd, M. Sedaghat, and W. Li, “Self-management challenges for multi-cloud architectures,” in Towards a Service-Based Internet. Springer, 2011, pp. 38–49.
[19] W. Fang, X. Liang, S. Li, L. Chiaraviglio, and N. Xiong, “VMPlanner: Optimizing virtual machine placement and traffic flow routing to reduce network power costs in cloud data centers,” Computer Networks, vol. 57, no. 1, pp. 179–196, 2013.
[20] T. C. Ferreto, M. A. Netto, R. N. Calheiros, and C. A. De Rose, “Server consolidation with migration control for virtualized data centers,” Future Generation Computer Systems, vol. 27, no. 8, pp. 1027–1034, 2011.
[21] M. Gahlawat and P. Sharma, “Survey of virtual machine placement in federated clouds,” in Advance Computing Conference (IACC), 2014 IEEE International. IEEE, 2014, pp. 735–738.
[22] Y. Gao, H. Guan, Z. Qi, Y. Hou, and L. Liu, “A multi-objective ant colony system algorithm for virtual machine placement in cloud computing,” Journal of Computer and System Sciences, vol. 79, no. 8, pp. 1230–1242, 2013.
[23] N. Grozev and R. Buyya, “Inter-cloud architectures and application brokering: taxonomy and survey,” Software: Practice and Experience, 2012.
[24] A. Gupta, D. Milojicic, and L. V. Kalé, “Optimizing VM placement for HPC in the cloud,” in Proceedings of the 2012 Workshop on Cloud Services, Federation, and the 8th Open Cirrus Summit. ACM, 2012, pp. 1–6.
[25] D. Huang, D. Yang, H. Zhang, and L. Wu, “Energy-aware virtual machine placement in data centers,” in Global Communications Conference (GLOBECOM), 2012 IEEE. IEEE, 2012, pp. 3243–3249.
[26] Z. Huang and D. H. Tsang, “SLA guaranteed virtual machine consolidation for computing clouds,” in Communications (ICC), 2012 IEEE International Conference on. IEEE, 2012, pp. 1314–1319.
[27] Z. Huang, D. H. Tsang, and J. She, “A virtual machine consolidation framework for MapReduce enabled computing clouds,” in Proceedings of the 24th International Teletraffic Congress. International Teletraffic Congress, 2012, p. 26.
[28] B. Kantarci, L. Foschini, A. Corradi, and H. T. Mouftah, “Inter-and-intra data center VM-placement for energy-efficient large-scale cloud systems,” in Globecom Workshops (GC Wkshps), 2012 IEEE. IEEE, 2012, pp. 708–713.
[29] N. Kord and H. Haghighi, “An energy-efficient approach for virtual machine placement in cloud based data centers,” in Information and Knowledge Technology (IKT), 2013 5th Conference on. IEEE, 2013, pp. 44–49.
[30] K. Li, J. Wu, and A. Blaisse, “Elasticity-aware virtual machine placement for cloud datacenters,” in Cloud Networking (CloudNet), 2013 IEEE 2nd International Conference on. IEEE, 2013, pp. 99–107.
[31] W. Li, P. Svärd, J. Tordsson, and E. Elmroth, “Cost-optimal cloud service placement under dynamic pricing schemes,” in Proceedings of the 2013 IEEE/ACM 6th International Conference on Utility and Cloud Computing. IEEE Computer Society, 2013, pp. 187–194.
[32] F. López Pires and B. Barán, “Multi-objective virtual machine placement with service level agreement: A memetic algorithm approach,” in Proceedings of the 2013 IEEE/ACM 6th International Conference on Utility and Cloud Computing. IEEE Computer Society, 2013, pp. 203–210.
[33] F. López Pires and B. Barán, “Virtual machine placement literature review,” Polytechnic School, National University of Asunción, Tech. Rep., June 2014. [Online]. Available: https://sites.google.com/site/flopezpires/
[34] F. López Pires, E. Melgarejo, and B. Barán, “Virtual machine placement. A multi-objective approach,” in Computing Conference (CLEI), 2013 XXXIX Latin American. IEEE, 2013, pp. 1–8.
[35] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner, “OpenFlow: enabling innovation in campus networks,” ACM SIGCOMM Computer Communication Review, vol. 38, no. 2, pp. 69–74, 2008.
[36] P. Mell and T. Grance, “The NIST definition of cloud computing,” National Institute of Standards and Technology, vol. 53, no. 6, p. 50, 2009.
[37] K. Mills, J. Filliben, and C. Dabrowski, “Comparing VM-placement algorithms for on-demand clouds,” in Cloud Computing Technology and Science (CloudCom), 2011 IEEE Third International Conference on. IEEE, 2011, pp. 91–98.
[38] M. Mishra and A. Sahoo, “On theory of VM placement: Anomalies in existing methodologies and their mitigation using a novel vector based approach,” in Cloud Computing (CLOUD), 2011 IEEE International Conference on. IEEE, 2011, pp. 275–282.
[39] A. Osyczka, “Multicriteria optimization for engineering design,” Design optimization, vol. 1, pp. 193–227, 1985.
[40] B. Rochwerger, D. Breitgand, E. Levy, A. Galis, K. Nagin, I. M. Llorente, R. Montero, Y. Wolfsthal, E. Elmroth, J. Caceres et al., “The reservoir model and architecture for open federated cloud computing,” IBM Journal of Research and Development, vol. 53, no. 4, pp. 4–1, 2009.
[41] L. Salimian and F. Safi, “Survey of energy efficient data centers in cloud computing,” in Proceedings of the 2013 IEEE/ACM 6th International Conference on Utility and Cloud Computing. IEEE Computer Society, 2013, pp. 369–374.
[42] K. Sato, M. Samejima, and N. Komoda, “Dynamic optimization of virtual machine placement by resource usage prediction,” in Industrial Informatics (INDIN), 2013 11th IEEE International Conference on. IEEE, 2013, pp. 86–91.
[43] W. Shi and B. Hong, “Towards profitable virtual machine placement in the data center,” in Utility and Cloud Computing (UCC), 2011 Fourth IEEE International Conference on. IEEE, 2011, pp. 138–145.
[44] S. Shigeta, H. Yamashima, T. Doi, T. Kawai, and K. Fukui, “Design and implementation of a multi-objective optimization mechanism for virtual machine placement in cloud computing data center,” in Cloud Computing. Springer, 2013, pp. 21–31.
[45] F. Song, D. Huang, H. Zhou, H. Zhang, and I. You, “An optimization-based scheme for efficient virtual machine placement,” International Journal of Parallel Programming, vol. 42, no. 5, pp. 853–872, 2014.
[46] M. Sun, W. Gu, X. Zhang, H. Shi, and W. Zhang, “A matrix transformation algorithm for virtual machine placement in cloud,” in Trust, Security and Privacy in Computing and Communications (TrustCom), 2013 12th IEEE International Conference on. IEEE, 2013, pp. 1778–1783.
[47] C. von Lücken, B. Barán, and C. Brizuela, “A survey on multi-objective evolutionary algorithms for many-objective problems,” Computational Optimization and Applications, pp. 1–50, 2014.
[48] S.-H. Wang, P. P.-W. Huang, C. H.-P. Wen, and L.-C. Wang, “EQVMP: Energy-efficient and QoS-aware virtual machine placement for software defined datacenter networks,” in Information Networking (ICOIN), 2014 International Conference on. IEEE, 2014, pp. 220–225.
[49] J. Xu and J. A. Fortes, “Multi-objective virtual machine placement in virtualized data center environments,” in Green Computing and Communications (GreenCom), 2010 IEEE/ACM Int’l Conference on & Int’l Conference on Cyber, Physical and Social Computing (CPSCom). IEEE, 2010, pp. 179–188.