Particle Swarm Optimization for Energy-Aware Virtual Machine Placement Optimization in Virtualized Data Centers

Shangguang Wang¹, Zhipiao Liu¹, Zibin Zheng², Qibo Sun¹, Fangchun Yang¹

¹State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
²Shenzhen Research Institute, The Chinese University of Hong Kong, Hong Kong

¹{sgwang; lpliu; qbsun; fcyang}@bupt.edu.cn; ²[email protected]
Abstract—A critical research issue is to lower the energy consumption of a virtualized data center by means of virtual machine placement optimization while satisfying the resource requirements of the cloud services. In this paper, we review existing schemes and focus on the energy-aware virtual machine placement optimization problem of a heterogeneous virtualized data center. We attempt to explore a better alternative approach to minimizing the energy consumption, and we observe that particle swarm optimization (PSO) has considerable potential. However, the PSO must be improved before it can solve this optimization problem. The improvement includes redefining the parameters and operators of the PSO, adopting an energy-aware local fitness first strategy, and designing a novel coding scheme. Using the improved PSO, an optimal virtual machine placement scheme with the lowest energy consumption can be found. Experimental results indicate that our approach significantly outperforms other approaches and can reduce energy consumption by 13%-23% in the context of this paper.

Keywords—virtualized data center; energy consumption; virtual machine placement; particle swarm optimization

I. INTRODUCTION

With the increasing popularity of cloud computing, the number and size of data centers are increasing. As a result, energy consumption imposes a significant operational cost on data centers that host cloud services and applications and also contributes to the carbon footprint on the environment. It has been reported that the worldwide spending on enterprise data center power and cooling topped $30 billion in 2008 and is likely to surpass spending on new server hardware [1,2]. Currently, data centers that power Internet-scale applications consume approximately 1.3% of the worldwide electricity supply, and this fraction is expected to grow to 8% by 2020 [3,4]. Data center carbon emissions were 0.6% of the global total, nearly equal to those of the Netherlands, and this fraction is expected to reach 2.6% by 2020, exceeding the carbon emissions of Germany [4]. Why are these numbers so high? We find that the inefficient usage of servers is the main source of high energy consumption in a data center. This finding is supported by the following evidence: 1) server utilization varies between 11% and 50%, on average; and 2) an idle server usually consumes more than 60% of the power drawn

when the machine is fully utilized [5-7]. How can we accomplish our goals? To achieve energy savings and emissions reduction, server consolidation technology based on virtualization has been introduced. This technology consolidates multiple applications on the same physical machine, with each application typically running on its own virtual machine. In turn, these virtual machines are mapped to physical machines. In the context of virtualized data centers, it is a critical concern to design energy-efficient virtual machine placement approaches that reduce energy consumption while satisfying the resource requirements of cloud services/applications. Building on this technology, a large amount of work has been devoted to optimizing the energy efficiency of virtualized data centers. For example, C. Tang et al. [8] investigated the application workload placement optimization problem in the context of an enterprise data center. They presented an online application placement approach that minimizes the number of application starts and stops, maximizes the total satisfied application demand and balances the load across machines. However, they did not consider the problem of reducing energy consumption. In [9], the energy optimization of virtualized data centers was modeled as a mixed integer programming problem, and its exact solution was obtained by the CPLEX solver (an optimization software package). Although this approach obtains a good solution, it does not apply to large-scale virtual machine placement optimization problems because of its long computation time. In [10], the energy consumption optimization of a virtualized data center is modeled as a sequential optimization problem, and an energy optimization algorithm based on control theory is proposed. Because the algorithm requires model training that tends to incur a high time cost, it is suitable only for small-scale virtualized data center scenarios. Although there are many other studies, unfortunately, most of them assume that the physical servers of a virtualized data center are homogeneous. At first glance, this assumption appears reasonable, but it does not hold in practice. Why? The reason is that new and different servers are usually added to a virtualized data center to run new services or applications or to satisfy increasing

requirements in the process of data center operations, thus forming a heterogeneous virtualized data center environment. In a heterogeneous virtualized data center, server configurations are often very different (the hardware can differ in terms of the CPU core count, memory, hard disk, and other components), which leads to different server energy consumption characteristics. This fact means that the minimum number of active servers might not consume the least amount of energy. Thus, the approaches proposed by most researchers, which focus on turning on a minimal number of servers to minimize the energy consumption, might not achieve the best energy-saving effect. Therefore, these approaches are not applicable to real virtualized data center environments that contain a large number of heterogeneous servers. Are there any other approaches? In this paper, we provide an alternative approach that heterogeneous virtualized data centers can use. In contrast to the above work, our study removes the assumption of server homogeneity and considers the virtual machine placement optimization problem in a heterogeneous virtualized data center. We first establish the energy consumption model of a heterogeneous virtualized data center, and then we present an energy-aware virtual machine placement optimization approach that is based on particle swarm optimization (PSO). To effectively solve the virtual machine placement optimization problem, we improve the PSO by redefining its parameters and operators. Then, an energy-aware local fitness first strategy is adopted to update the particle position and improve the problem-solving efficiency. Moreover, a novel two-dimensional particle encoding scheme is designed. Finally, we use the improved PSO to find the optimal virtual machine placement with the lowest energy consumption. We evaluate the proposed approach based on a simulated virtualized data center¹ that is composed of 1000 clusters, each containing 350 heterogeneous servers. Experimental results indicate that our approach significantly improves server utilization and saves more energy for virtualized data centers.

¹ You can get the code from http://sguangwang.com.

The remainder of this paper is organized as follows. Section 2 establishes an energy consumption model of a heterogeneous virtualized data center and presents a problem statement. Our approach for energy-aware virtual machine placement optimization is proposed in Section 3. Experimental evaluations comparing our solution against existing solutions are presented in Section 4. Finally, Section 5 presents the conclusions and an outlook on possible continuations of our work.

II. ENERGY CONSUMPTION MODEL

The energy consumption of a server depends on the combined utilization of its CPU, memory, disk and network card. It is well known that, among these components, the CPU is the most important contributor to energy consumption. Therefore, the resource utilization of a server is usually represented by its CPU utilization [11,12]. Because the CPU utilization can be modeled as a function of time

according to the workload variability, the server energy model can also be established based on the CPU utilization. How is it possible to establish the server energy model? We first introduce two important conclusions: 1) the relationship between CPU utilization and energy consumption is approximately linear; and 2) an idle server consumes approximately 66% of the peak load electricity [12,13]. Based on these conclusions, the relationship between the CPU utilization of a server and its energy consumption can be quantified, and an energy model of heterogeneous servers can be obtained from extensive training. Moreover, because an idle server consumes a large proportion of the peak load energy merely to maintain its basic operation, consolidating virtual machines and switching off idle servers can save substantial energy, which lays a theoretical foundation for designing an efficient virtual machine placement approach to reduce the energy consumption of virtualized data centers. Then, based on the two conclusions [7], we introduce the energy consumption model of a server as follows:

E = \int_{t_1}^{t_2} P(u(t)) \, dt,  (1)

with

P(u(t)) = c \cdot P_{max} + (1 - c) \cdot P_{max} \cdot u(t),

where E is the total energy consumption of the server in an optimization period [t_1, t_2]; P(u(t)) is the power consumption of the server at time t; P_{max} is the maximum power consumed by the server when it is fully utilized; c is the fraction of power consumed by the idle server; and u(t) (u(t) \in [0,1]) is the time-varying CPU utilization. (Because of the different workload of each cloud service that runs on the servers, the CPU utilization of a server usually fluctuates continuously during any given period.)

A. Energy Consumption Optimization
In this paper, we consider a heterogeneous virtualized data center that is composed of n servers hosting a set of m virtual machines. A cloud service/application is often implemented as a virtual machine that is deployed to a server while satisfying its specified resource (i.e., CPU and memory) constraints. Each virtual machine runs one cloud service with a time-varying workload, and one cloud service runs on only one virtual machine. The performance of a cloud service is usually associated with resource provisioning and is typically expressed as resource guarantees; in other words, if the maximum resources requested by the virtual machine are allocated, then the cloud service can run on this virtual machine with good performance [13]. The optimization objective of the virtual machine placement is to minimize the total energy consumption in an optimization period while satisfying the resource requirements. Hence, the overall energy consumption (OEC) of a heterogeneous virtualized data center can be obtained as follows:

OEC = \min \sum_{j=1}^{n} \sum_{i=1}^{m} E_j x_{ij},  (2)


with

\sum_{i=1}^{m} r_i^{cpu} x_{ij} < c_j^{cpu} \quad \text{and} \quad \sum_{i=1}^{m} r_i^{mem} x_{ij} < c_j^{mem},  (3)

\sum_{j=1}^{n} x_{ij} = 1, \quad i = 1, 2, ..., m,  (4)

where n is the number of servers in the virtualized data center and m is the number of virtual machines; E_j is the total energy consumption of the j-th server in an optimization period; r_i^{cpu} and r_i^{mem} are the maximum CPU and memory requirements of the i-th virtual machine in an optimization period, respectively; and c_j^{cpu} and c_j^{mem} are the CPU and memory resource capacities of the j-th server. Eq. (3) states that the sum of the resource requirements of the virtual machines placed on a server must be less than that server's resource capacity, and Eq. (4) states that a virtual machine can be placed on only one server, such that x_{ij} = 1 if the i-th virtual machine runs on the j-th server and x_{ij} = 0 otherwise. Please note that, due to the heterogeneity of the virtualized data center, the c_j^{cpu} of the j-th server is in general not equal to the c_k^{cpu} of the k-th server, and the c_j^{mem} of the j-th server is likewise not equal to the c_k^{mem} of the k-th server.

B. Problem Statement
According to Eqs. (2)-(4), the energy-aware virtual machine placement is an NP-hard problem. The

problem of finding the best virtual machine placement is an optimization problem in which the overall energy consumption must be minimized while satisfying all of the constraints (Eqs. (3)-(4)). Formally, the optimization problem that we are addressing can be stated as follows: 1) the overall energy consumption OEC is minimized,

and 2) the CPU and memory requirements of each virtual machine satisfy Eqs. (3) and (4).
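To make the optimization model concrete, the sketch below shows one way Eqs. (1)-(4) could be evaluated for a candidate placement. It is an illustrative Java sketch, not the authors' implementation; the class, field and method names (PlacementModel, serverEnergy, feasible, overallEnergy) are assumptions, and the integral of Eq. (1) is approximated by summing over per-server utilization samples taken every dt.

// Illustrative sketch of Eqs. (1)-(4); names and data layout are assumptions.
final class PlacementModel {
    // Per-server data of the heterogeneous data center.
    double[] cpuCap, memCap;   // c_j^cpu, c_j^mem
    double[] pMax;             // P_max of server j
    double c = 0.6;            // fraction of power consumed by an idle server

    // Per-virtual-machine maximum resource requests r_i^cpu, r_i^mem.
    double[] vmCpu, vmMem;

    // Eq. (1): energy of server j over [t1, t2], given utilization samples u taken every dt.
    double serverEnergy(int j, double[] u, double dt) {
        double energy = 0.0;
        for (double util : u) {
            double power = c * pMax[j] + (1 - c) * pMax[j] * util;  // linear model P(u(t))
            energy += power * dt;
        }
        return energy;
    }

    // Eqs. (3)-(4): placement[i] is the single server hosting VM i; capacities must not be exceeded.
    boolean feasible(int[] placement) {
        double[] cpuLoad = new double[cpuCap.length];
        double[] memLoad = new double[memCap.length];
        for (int i = 0; i < placement.length; i++) {
            cpuLoad[placement[i]] += vmCpu[i];
            memLoad[placement[i]] += vmMem[i];
        }
        for (int j = 0; j < cpuCap.length; j++) {
            if (cpuLoad[j] >= cpuCap[j] || memLoad[j] >= memCap[j]) return false;
        }
        return true;
    }

    // Eq. (2): overall energy consumption, summing E_j over the servers that host at least one VM.
    double overallEnergy(int[] placement, double[][] serverUtil, double dt) {
        boolean[] active = new boolean[cpuCap.length];
        for (int server : placement) active[server] = true;
        double oec = 0.0;
        for (int j = 0; j < cpuCap.length; j++) {
            if (active[j]) oec += serverEnergy(j, serverUtil[j], dt);
        }
        return oec;
    }
}

The approach described in Section 3 searches over such placements for the feasible one with the smallest OEC value.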

Then, we present an energy-aware virtual machine placement approach that is based on an improved particle swarm optimization to solve this optimization problem and find the best virtual machine placement. Why do we use particle swarm optimization (PSO) to solve the energy-aware virtual machine placement problem?

III. ENERGY-AWARE VIRTUAL MACHINE PLACEMENT APPROACH

PSO is a random search algorithm that is based on swarm intelligence and was first introduced by Kennedy and Eberhart in 1995 [14]. PSO shares many similarities with evolutionary computation techniques such as the genetic algorithm (GA), an adaptive heuristic search algorithm premised on the evolutionary ideas of natural selection and genetics [15]. Compared to GAs, the advantages of PSO are that it is easy to

implement and there are few parameters to adjust. Moreover, in terms of computational efficiency, the superiority of the PSO over the GA has been statistically proven with a 99% confidence level [16]. Compared with other similar optimization algorithms, PSO also offers faster execution and higher problem-solving efficiency. In most cases, PSO outperforms the branch and bound algorithm, a general algorithm for finding optimal solutions of various optimization problems, especially in discrete and combinatorial optimization [17]. Thus, PSO has been successfully applied in many areas, such as function optimization, artificial neural network training and fuzzy system control. From the above introduction, we find that PSO is a good algorithm for solving optimization problems in many areas. Thus, we attempt to use it to solve the energy-aware virtual machine placement optimization problem. To apply PSO to our study, several improvements must be made. The reasons are as follows: 1) a traditional PSO is suitable only for solving continuous optimization problems and cannot directly solve a discrete optimization problem, which means that the parameters and operators of the PSO must be redefined; and 2) to apply the PSO to the virtual machine placement problem, the position update strategy and the coding scheme must be adjusted. Thus, in this paper, the improved PSO is the key to our approach for solving the energy-aware virtual machine placement optimization problem. How can we improve the PSO, and how do we use the improved PSO?

A. Particle Swarm Optimization
PSO is based on groups and fitness. Every individual of the swarm is called a particle and represents a feasible solution of the problem. Every particle has two parameters, i.e., velocity and position. Every particle position is associated with a fitness value, which is used to evaluate the quality of the solution. The PSO begins by initializing a group of random particles, and then it finds the optimal solution by performing iterations; it imitates the interactive foraging behavior of a flock of birds. Every particle flies in the multi-dimensional search space at a specified velocity while referring to the local best position X_{lbest_i} and the global best position X_{gbest}, and it updates its velocity and position by the following:

V_i^{t+1} = \omega V_i^t + c_1 r_1 (X_{lbest_i}(t) - X_i^t) + c_2 r_2 (X_{gbest}(t) - X_i^t),  (7)

X_i^{t+1} = X_i^t + V_i^{t+1},  (8)

where V_i^t and V_i^{t+1} are the velocity before and after the update, respectively, and X_i^t and X_i^{t+1} are the position before and after the update, respectively. Here, \omega is called the inertia weight coefficient, which represents the inheritance of the current velocity of the particle and balances the local and global search capability of the particles; c_1 and c_2 are called learning factors, which give the individual the ability to learn; and r_1 and r_2 are random numbers between 0 and 1.
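As a point of reference before the discrete redefinition below, the following short Java sketch shows the standard continuous update of Eqs. (7)-(8); the class and parameter names are illustrative and are not taken from the paper's code.

// Standard continuous PSO update of Eqs. (7)-(8), applied dimension by dimension.
import java.util.Random;

final class StandardPso {
    static void update(double[] x, double[] v, double[] lbest, double[] gbest,
                       double omega, double c1, double c2, Random rnd) {
        for (int d = 0; d < x.length; d++) {
            double r1 = rnd.nextDouble();  // r1, r2 ~ U(0, 1)
            double r2 = rnd.nextDouble();
            // Eq. (7): inertia + cognitive (local best) + social (global best) terms.
            v[d] = omega * v[d] + c1 * r1 * (lbest[d] - x[d]) + c2 * r2 * (gbest[d] - x[d]);
            // Eq. (8): move the particle along its new velocity.
            x[d] = x[d] + v[d];
        }
    }
}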

B. Improved PSO
Our improvement of the PSO is simple but effective. It contains the following two aspects: 1) redefining the parameters and operators of the PSO to solve the discrete optimization problem (energy-aware virtual machine placement optimization); and 2) adopting an energy-aware local fitness first strategy to update the particle position.

1) Redefining the Parameters and Operators of the PSO
The traditional PSO is suitable only for solving continuous optimization problems and cannot directly solve the energy-aware virtual machine placement optimization. As a result, we must redefine the parameters and operators of the PSO to solve a discrete optimization problem. Combined with the specific characteristics of the virtual machine placement optimization problem, the parameters and operators of the PSO are redefined by the following five definitions.

Definition 1 (particle position). The particle position X_i^t = (x_{i1}^t, x_{i2}^t, ..., x_{in}^t) is redefined as an n-bit vector and represents a feasible virtual machine placement solution, where n is the length of the particle code and is equal to the number of servers in the virtualized data center.

Definition 2 (particle velocity). The particle velocity V_i^t = (v_{i1}^t, v_{i2}^t, ..., v_{in}^t) is redefined as an n-bit vector and represents the adjustment decision of the virtual machine placement. V_i^t guides the particle position update operation so that the virtual machine placement is adjusted toward the optimal solution. The value of every bit in the vector V_i^t is 0 or 1: the value is 0 if the corresponding server and its virtual machines must be re-evaluated and adjusted, and the value is 1 otherwise.

Definition 3 (subtraction operator). The subtraction operator is redefined to calculate the difference between two virtual machine placement solutions. We call \Theta the subtraction operator. For X_i^t \Theta X_k^t, if the value of the corresponding bit of the solution X_i^t is equal to that of solution X_k^t, then the value of the corresponding bit in the result is 1; otherwise, the value is 0. For example, (1, 0, 1) \Theta (1, 1, 0) = (1, 0, 0).

Definition 4 (addition operator). The addition operator is redefined to represent the particle velocity update operation caused by the particle's own velocity inertia, its local best position and the global best position in the process of particle updating. We use \oplus to represent the addition operator. Then, P_1 V_1^t \oplus P_2 V_2^t \oplus \cdots \oplus P_n V_n^t denotes that a particle updates its velocity by using V_1^t with probability P_1, ..., and V_n^t with probability P_n. We call the probability value P_i (\sum_{i=1}^{n} P_i = 1) the inertia weight coefficient. For example, 0.2(1, 0, 1, 0) \oplus 0.8(1, 1, 0, 0) = (1, #, #, 0). The probability that the value of the second bit is equal to 0 is 0.2, and the probability that it is equal to 1 is 0.8. Here, the value of # is uncertain; this bit value is called the uncertain bit value. The uncertain bit value then influences the update of the particle velocity.

In the improved PSO, there are three inertia weight coefficients, P_{1i}, P_{2i} and P_{3i}. These coefficients can be obtained by the following:

P_{1i} = \frac{1/f(X_i^t)}{1/f(X_i^t) + 1/f(X_{lbest_i}(t)) + 1/f(X_{gbest}(t))},  (9)

P_{2i} = \frac{1/f(X_{lbest_i}(t))}{1/f(X_i^t) + 1/f(X_{lbest_i}(t)) + 1/f(X_{gbest}(t))},  (10)

P_{3i} = \frac{1/f(X_{gbest}(t))}{1/f(X_i^t) + 1/f(X_{lbest_i}(t)) + 1/f(X_{gbest}(t))},  (11)

where f(X_i^t) denotes the fitness of the virtual machine placement solution X_i^t. The fitness is defined as the total energy consumption of the feasible solution in an optimization period; the higher the energy consumption of a feasible solution, the greater its fitness value. Hence, when calculating the inertia weight coefficients of a feasible solution, the reciprocal of the total energy consumption is adopted so that a solution with a higher total energy consumption is selected with a lower probability. Moreover, the uncertain bit value can be obtained by the following:

\text{the uncertain bit value} = \begin{cases} q_1, & \text{if } rand \le P_{1i} \\ q_2, & \text{if } P_{1i} < rand \le P_{2i} \\ q_3, & \text{if } P_{2i} < rand \le P_{3i} \end{cases}  (12)

where q_1 is the corresponding bit value of the particle's velocity vector before updating; q_2 is the corresponding bit value of the local best particle's velocity vector; q_3 is the corresponding bit value of the global best particle's velocity vector; and rand is a random number between 0 and 1.

Definition 5 (multiplication operator). The multiplication operator is redefined to update the particle position. We call \otimes the multiplication operator. X_i^t \otimes V_k^{t+1} represents the position update operation of the current particle position vector X_i^t based on the velocity vector V_k^{t+1}. The computation rule of \otimes is as follows: 1) if the corresponding bit value of the velocity vector is 1, then the corresponding bit of the position vector is not adjusted; 2) if the corresponding bit value of the velocity vector is 0, then it is adjusted. For example, consider (1, 0, 1, 0) \otimes (1, 0, 0, 1), where (1, 0, 1, 0) is the position vector and (1, 0, 0, 1) is the velocity vector. The second and third bit values of the velocity vector are equal to 0, which indicates that the status of the second and third servers in the corresponding virtual machine placement solution should be updated.

Finally, based on the above five definitions, we improve the PSO by transforming Eqs. (7)-(8) into Eqs. (13)-(14), as follows:

V_i^{t+1} = P_1 V_i^t \oplus P_2 (X_{lbest_i}(t) \Theta X_i^t) \oplus P_3 (X_{gbest}(t) \Theta X_i^t),  (13)

X_i^{t+1} = X_i^t \otimes V_i^{t+1}.  (14)
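The following Java sketch illustrates one possible realization of the redefined operators and of Eqs. (9)-(13). The class and method names are assumptions, and the uncertain-bit resolution of Eq. (12) is implemented here by sampling a uniform random number against the cumulative weights, which is one reading of that equation.

// Sketch of Definitions 3-5 and Eqs. (9)-(13); bit vectors are 0/1 int arrays,
// and fitness values are the total energy consumptions of the corresponding solutions.
import java.util.Random;

final class ImprovedPsoOperators {
    // Definition 3: subtraction; a result bit is 1 where the two placements agree.
    static int[] subtract(int[] a, int[] b) {
        int[] r = new int[a.length];
        for (int i = 0; i < a.length; i++) r[i] = (a[i] == b[i]) ? 1 : 0;
        return r;
    }

    // Eqs. (9)-(11): reciprocal-of-energy weights, so low-energy solutions are preferred.
    static double[] inertiaWeights(double fX, double fLbest, double fGbest) {
        double s = 1.0 / fX + 1.0 / fLbest + 1.0 / fGbest;
        return new double[] { (1.0 / fX) / s, (1.0 / fLbest) / s, (1.0 / fGbest) / s };
    }

    // Definition 4 + Eq. (12): addition picks every bit from one of the three candidate
    // vectors with probabilities p[0], p[1], p[2]; uncertain bits are resolved by sampling.
    static int[] add(int[] v1, int[] v2, int[] v3, double[] p, Random rnd) {
        int[] r = new int[v1.length];
        for (int i = 0; i < v1.length; i++) {
            double u = rnd.nextDouble();
            if (u <= p[0]) r[i] = v1[i];
            else if (u <= p[0] + p[1]) r[i] = v2[i];
            else r[i] = v3[i];
        }
        return r;
    }

    // Definition 5: a position bit is kept where the velocity bit is 1; servers whose
    // velocity bit is 0 are marked for re-adjustment (done with the energy-aware
    // local fitness first strategy described below).
    static boolean[] serversToAdjust(int[] velocity) {
        boolean[] adjust = new boolean[velocity.length];
        for (int i = 0; i < velocity.length; i++) adjust[i] = (velocity[i] == 0);
        return adjust;
    }

    // Eq. (13): V_i^{t+1} = P1*V_i^t (+) P2*(X_lbest - X_i^t) (+) P3*(X_gbest - X_i^t).
    static int[] newVelocity(int[] v, int[] x, int[] lbest, int[] gbest,
                             double fX, double fLbest, double fGbest, Random rnd) {
        double[] p = inertiaWeights(fX, fLbest, fGbest);
        return add(v, subtract(lbest, x), subtract(gbest, x), p, rnd);
    }
}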

2) Energy-aware Local Fitness First Strategy
The particle position update usually adopts a random selection strategy. However, a random selection strategy affects the overall convergence of the PSO, which reduces the effectiveness of the energy consumption optimization. Hence, to overcome this problem and enhance the quality of the solution, we propose an energy-aware local fitness first strategy to update the particle position. For ease of presentation, every bit in the first dimension of a particle is called a local position, and the time-averaged CPU utilization of all of the virtual machines that run on the corresponding server in an optimization period [t_1, t_2] is called the local fitness, which is denoted by f_{local} and can be determined by the following:

f_{local-i} = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} \left( \sum_{j=1}^{k} u_{ij}(t) \right) dt,  (15)

where u_{ij}(t) is the CPU utilization of the j-th virtual machine running on the i-th server, and k is the total number of virtual machines that run on the i-th server. Under the energy-aware local fitness first strategy, when the PSO needs to update a certain local position, the virtual machine on the server with the maximum local fitness is selected to fill the local position with a larger probability. The local fitness represents the CPU utilization of the server, and the CPU utilization is related to the energy consumption of the server.

3) Encoding Scheme
To improve the efficiency of the solutions, as shown in Fig. 1, we devise a two-dimensional encoding scheme that is based on the character of the energy-aware virtual machine placement optimization problem (a one-to-many mapping relationship between a server and its virtual machines).

Fig. 1. Two-dimensional encoding scheme.

As shown in Fig. 1, the first dimension of a particle is an n-bit binary vector. Every bit in the vector is associated with a server in the heterogeneous virtualized data center. Here, "1" denotes that the corresponding server is active in the current virtual machine placement solution, and "0" denotes otherwise. The second dimension of a particle is a set of subsets comprising the virtual machines to be placed; each virtual machine subset is associated with an active server. For example, if the third bit value of the first dimension of a particle is equal to 1, the third server in the virtualized data center should be turned on, and the third, seventh and j-th virtual machines should be placed onto the third server. Compared with traditional one-dimensional particle encoding, our two-dimensional encoding scheme not only effectively shortens the particle encoding length to reduce the search time but also reflects the character of the virtual machine static placement optimization problem. Hence, the encoding scheme is conducive to maintaining the current feasible solution and improving the convergence speed of the PSO.
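A particle under this scheme could be stored as sketched below; this is a hypothetical Java structure, not the authors' code, with a 0/1 array for the first dimension and one list of virtual machine indices per server for the second dimension.

// Possible in-memory form of the two-dimensional encoding of Fig. 1 (illustrative names).
import java.util.ArrayList;
import java.util.List;

final class ParticleEncoding {
    final int[] activeServers;             // first dimension: 1 = server is on, 0 = off
    final List<List<Integer>> vmsOnServer; // second dimension: VM indices placed on each server

    ParticleEncoding(int serverCount) {
        activeServers = new int[serverCount];
        vmsOnServer = new ArrayList<>();
        for (int j = 0; j < serverCount; j++) vmsOnServer.add(new ArrayList<>());
    }

    // Placing a VM turns the hosting server on; clearing a server turns it off.
    void place(int vm, int server) {
        vmsOnServer.get(server).add(vm);
        activeServers[server] = 1;
    }

    void clearServer(int server) {
        vmsOnServer.get(server).clear();
        activeServers[server] = 0;
    }
}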

C. Improved PSO for Virtual Machine Placement Optimization
Based on the improved PSO, we can obtain the best virtual machine placement by the following nine steps:

Step 1: Periodically collect the continuously arriving virtual machine requests as the input of our proposed approach.

Step 2: Initialization.
Step 2.1: The particle population size is set to L, and the maximum iteration number is set to MaxNum.
Step 2.2: Generate the initial population. According to the multi-dimensional virtual machine requests and the various CPU and memory resource constraints of the servers, a first fit strategy [18] is adopted to generate L feasible virtual machine placement solutions. Every solution is a particle, and every particle is encoded with the two-dimensional encoding scheme. These particles constitute the initial population. The particle's initial position is determined by the initial population, and the particle's initial velocity is determined by the status information of the particle's first dimension.

Step 3: By calculating the fitness of all of the particles in the initial population, we obtain the local best position of every particle and further obtain the global best position of the population (i.e., the global optimal solution so far).

Step 4: Updating the velocity. Update the particle velocity according to Eq. (13).

Step 5: Updating the position.

Step 5.1: Update the particle position according to Eq. (14). Note that this local position update operation can lose virtual machines; the lost virtual machines will be backfilled in the following steps.
Step 5.2: Removing virtual machines. The above position update operation can lead to a virtual machine being placed on two servers. Therefore, the duplicated virtual machines must be removed to ensure the feasibility of the solution. The removal method is to set the corresponding local position of the two-dimensional encoding to 0. However, this operation has a side effect: because the removal operation is server-oriented, the non-duplicated virtual machines on this server can be removed together with the duplicated ones. Again, these accidentally removed virtual machines will be backfilled in the next step.
Step 5.3: Backfilling the virtual machines. To obtain an updated feasible solution, the lost or accidentally removed virtual machines must be reinserted into the servers. This reinsertion operation is called virtual machine backfilling. The backfilling operation still adopts the first fit strategy, but it first considers the currently active servers as host servers; only when none of the active servers can host the virtual machine is a new server turned on.

Step 6: Update the local best and global best particle position information based on the updated new population.

Step 7: Go to Step 3 if the current iteration number is less than the specified maximum iteration number MaxNum; otherwise, go to Step 8.

Step 8: Output the global best position and its fitness, which gives the optimal solution of the energy-aware virtual machine placement problem.

Step 9: All of the virtual machine requests are placed on the current server cluster, and the approach ends.
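Putting the nine steps together, the skeleton below sketches the overall control flow in Java. Every method is a placeholder standing for the corresponding step above, and the names are assumptions rather than the authors' implementation.

// Skeleton of Steps 1-9; method bodies are placeholders for the operations described above.
import java.util.List;

final class ImprovedPsoPlacement {
    // Minimal stand-in for the two-dimensional particle encoding of Fig. 1.
    static final class Particle {
        int[] activeServers;              // first dimension
        List<List<Integer>> vmsOnServer;  // second dimension
        double fitness;                   // total energy consumption (OEC) of the solution
    }

    Particle[] swarm;   // L particles (Step 2.1)
    int maxNum;         // maximum iteration number MaxNum (Step 2.1)

    Particle run(int[] vmRequests) {                        // Step 1: collected VM requests
        initializeWithFirstFit(vmRequests);                 // Step 2.2: first-fit initial population
        evaluateFitnessAndBests();                          // Step 3: local and global bests
        for (int iter = 0; iter < maxNum; iter++) {         // Step 7: iterate up to MaxNum times
            for (Particle p : swarm) {
                updateVelocity(p);                          // Step 4: Eq. (13)
                updatePosition(p);                          // Step 5.1: Eq. (14) + local fitness first
                removeDuplicatedVms(p);                     // Step 5.2
                backfillVms(p);                             // Step 5.3: first fit, active servers first
            }
            evaluateFitnessAndBests();                      // Step 6
        }
        return globalBest();                                // Steps 8-9: best placement found
    }

    private void initializeWithFirstFit(int[] vmRequests) { /* ... */ }
    private void evaluateFitnessAndBests() { /* ... */ }
    private void updateVelocity(Particle p) { /* ... */ }
    private void updatePosition(Particle p) { /* ... */ }
    private void removeDuplicatedVms(Particle p) { /* ... */ }
    private void backfillVms(Particle p) { /* ... */ }
    private Particle globalBest() { return swarm[0]; /* placeholder */ }
}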

IV. EXPERIMENT EVALUATION

To evaluate the performance of our proposed approach in this paper, we compare it with other approaches in terms of the number of active servers, the server utilization and the energy consumption. Moreover, we also evaluate the scalability of our proposed approach.

A. Experiment Setup
We simulate a heterogeneous virtualized data center that comprises 1000 clusters, each containing 350 heterogeneous servers. To reflect the heterogeneity of the virtualized data center, as shown in Table I, these servers are divided into two categories with different configurations and energy consumption characteristics [19]. Each virtual machine runs a cloud service/application.

TABLE I. SERVER CONFIGURATION

Server           CPU         Memory   Peak energy consumption
HP ProLiant G4   3720 MIPS   4 GB     117 Watts
HP ProLiant G5   5320 MIPS   4 GB     135 Watts

TABLE II. RESOURCE REQUEST PARAMETERS

Virtual machine type   CPU         Memory
Micro Instance         500 MIPS    613 MB
Small Instance         1000 MIPS   1700 MB

Moreover, to better reflect actual virtual machine requests, as shown in Table II, we introduce two types of resource request parameters of Amazon EC2 instances, i.e., the Micro Instance and the Small Instance. We compare our approach with the approach in [7] (called MBFD), the First-Fit algorithm (FF) and the Best-Fit algorithm (BF); FF and BF are two straightforward greedy approximation algorithms. All of the experiments are conducted on the same computer, with an Intel Core2 Duo 2.1 GHz processor, 2.0 GB of RAM, Windows XP Professional SP3 and MATLAB 7.11.0. The simulation program is written in Java, based on eclipse-java-indigo-SR2-win32, and the runtime environment is JDK 1.6.0_25. Sufficient repeated tests were executed to set the following parameters: the server energy model parameter c is set to 0.6, the initial population size of the PSO is set to 20, the maximum iteration number is set to 30 and the optimization period is set to 10 seconds.

B. Comparison Results on the Number of Active Servers
In this paper, the number of active servers is the total number of servers that must be active to run the virtual machines that host the cloud services. In this experiment, the number of virtual machine requests ranges from 100 to 1000, and HP ProLiant G4 or HP ProLiant G5 servers are activated to respond to these requests. Table III gives the comparison results with the other approaches. As shown in Table III, with an increasing number of virtual machine requests, our proposed approach always activates the minimum number of servers, whereas the FF algorithm always activates the maximum number of servers. The number of servers activated by MBFD is less than that of BF and FF but greater than that of our approach.

TABLE III. NUMBER OF ACTIVE SERVERS

        Number of active servers, Total (G4; G5)
N.ᵃ     FF              BF              MBFD            Our approach
100     33 (18;15)      32 (18;14)      31 (20;11)      29 (25;4)
200     64 (33;31)      63 (35;28)      60 (37;23)      53 (38;15)
300     100 (46;54)     96 (45;51)      92 (49;43)      78 (58;20)
400     135 (66;69)     128 (63;65)     122 (62;60)     103 (71;32)
500     166 (79;87)     162 (76;86)     156 (78;78)     130 (86;44)
600     200 (95;105)    199 (93;106)    182 (89;93)     155 (105;50)
700     233 (112;121)   229 (110;119)   223 (113;110)   179 (119;60)
800     270 (131;139)   263 (126;137)   244 (118;126)   203 (136;67)
900     310 (154;156)   301 (150;151)   289 (151;138)   233 (145;88)
1000    338 (172;166)   329 (163;166)   316 (166;150)   256 (156;100)

a. The number of virtual machine requests.

Why do we obtain this outcome? The fundamental reason is that FF, BF and MBFD are all based on a simple greedy idea, which tends to fall into a local optimum and miss potentially optimal solutions. In contrast, our approach devises a novel two-dimensional encoding scheme according to the essential character of the virtual machine placement problem and designs an energy-aware local fitness first strategy, which greatly improves the quality of the problem-solving. As a result, our approach activates the smallest possible number of servers.

C. Comparison Results on Server Utilization
In this paper, the server utilization is the average of the CPU utilization rates of all of the active servers, which can be determined by the following:

UT = \frac{1}{n_{active}} \sum_{i=1}^{n_{active}} f_{local-i},  (16)

where UT is the server utilization, f_{local-i} is the local fitness of the i-th active server, and n_{active} is the number of active servers. As shown in Fig. 2, our approach achieves a higher server utilization (approximately 61%-68% for 100-1000 virtual machine requests), whereas the server utilizations of FF, BF and MBFD are between 52% and 60%. With the growth in the number of virtual machine requests, the server utilization of the three baseline algorithms varies, but the server utilization of our approach is always higher than that of FF, BF and MBFD. For the same set of virtual machine requests, if the server utilization is higher in a virtualized data center, then fewer servers must be activated to host the cloud service workload. Hence, the energy consumption of the virtualized data center is lower than that of the others.
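For reference, the sketch below computes the local fitness of Eq. (15) from sampled utilization traces (approximating the integral with the trapezoidal rule mentioned in the next subsection) and the server utilization UT of Eq. (16). The class and method names are illustrative assumptions.

// Sketch of Eqs. (15)-(16); utilization traces are assumed to be sampled every dt over [t1, t2].
final class UtilizationMetrics {
    // Eq. (15): util[j][k] is the utilization of VM j on this server at sample k.
    static double localFitness(double[][] util, double dt, double t1, double t2) {
        int samples = util[0].length;
        double[] total = new double[samples];
        for (double[] vm : util) {
            for (int k = 0; k < samples; k++) total[k] += vm[k];  // sum over the VMs on the server
        }
        double integral = 0.0;
        for (int k = 1; k < samples; k++) {
            integral += 0.5 * (total[k - 1] + total[k]) * dt;     // trapezoidal rule
        }
        return integral / (t2 - t1);                              // time average over the period
    }

    // Eq. (16): mean local fitness over the n_active servers that are switched on.
    static double serverUtilization(double[] localFitness, boolean[] active) {
        double sum = 0.0;
        int nActive = 0;
        for (int i = 0; i < localFitness.length; i++) {
            if (active[i]) { sum += localFitness[i]; nActive++; }
        }
        return nActive == 0 ? 0.0 : sum / nActive;
    }
}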

Fig. 2. Server utilization.

D. Comparison Results on Energy Consumption
In this paper, the energy consumption is the total energy consumption of all of the active servers. Because the CPU utilization is modeled as a time-varying continuous function in Eq. (1), the energy consumption must be calculated using a definite integral. In our approach, the trapezoidal rule (a numerical method that approximates the value of a definite integral) is used to calculate the definite integral and obtain the total energy consumption. Fig. 3 gives the comparison results and indicates that our proposed approach enables the data center operators to save more energy than the other approaches, regardless of the scale of the virtual machine requests. Compared with the other three approaches, our approach saves approximately 13% to 23% on the energy bill. Why is our approach better than the other approaches? Because FF, BF and MBFD lack global information (i.e., the energy consumption characteristics of the heterogeneous servers in the virtualized data center), they account only for the multi-dimensional resource constraints and do not consider the energy differences among servers in the problem-solving process. In contrast, our approach introduces an effective particle velocity and position update mechanism, which enables it to find a better virtual machine placement solution and enhances the convergence of the algorithm, improving the quality of the solution. As a result, our approach activates the smallest possible number of servers and reduces the total energy consumption of the virtualized data center.

Fig. 3. Total energy consumption.

E. Discussion of the Scalability of Our Approach
The scalability of our approach is affected by the time complexity of the applied approach. Two factors determine the size of the energy-aware virtual machine placement optimization problem: the number of virtual machine requests m and the number of servers n. Because the problem can be modeled as a Variable-Sized Multidimensional Packing problem, which is known to be NP-hard, the time complexity of any exact solution is expected to be exponential. In our approach, we use the improved PSO to solve the optimization problem. For this problem, the solution space (SS) is mn. For the improved PSO, the representation space of a particle is mn, and the representation space of the swarm (RSS) is Lmn, where L is the particle population size. Then, the representation space can cover the solution space, and we call

RSS/SS the cover degree. The time complexity of the cover degree is O(Lm/n). Due to m