J. Cent. South Univ. (2015) 22: 974−983 DOI: 10.1007/s11771-015-2608-5

A novel virtual machine deployment algorithm with energy efficiency in cloud computing

ZHOU Zhou(周舟), HU Zhi-gang(胡志刚), SONG Tie(宋铁), YU Jun-yang(于俊洋)

School of Software, Central South University, Changsha 410004, China

© Central South University Press and Springer-Verlag Berlin Heidelberg 2015

Abstract: In order to improve the energy efficiency of large-scale data centers, a virtual machine (VM) deployment algorithm called the three-threshold energy saving algorithm (TESA) is proposed, which is based on the linear relation between energy consumption and (processor) resource utilization. In TESA, hosts in data centers are divided by load into four classes: hosts with light load, hosts with proper load, hosts with middle load and hosts with heavy load. Under TESA, VMs on a lightly loaded or heavily loaded host are migrated to another host with proper load, while VMs on a properly loaded or middle-loaded host are kept where they are. Then, based on TESA, five VM selection policies are presented: minimization of migrations policy based on TESA (MIMT), maximization of migrations policy based on TESA (MAMT), highest potential growth policy based on TESA (HPGT), lowest potential growth policy based on TESA (LPGT) and random choice policy based on TESA (RCT). MIMT is chosen as the representative policy through experimental comparison. Finally, five research directions are put forward on future energy management. The simulation results indicate that, compared with the single threshold (ST) algorithm and the minimization of migrations (MM) algorithm, MIMT significantly improves the energy efficiency of data centers.

Key words: cloud computing; energy efficiency; three-threshold; virtual machine (VM) selection policy; energy management

1 Introduction
Cloud computing offers a pay-as-you-use model to users worldwide. Under this model, a large number of computing and storage resources are hosted in the cloud, which poses a great challenge for effective management [1−2]. According to statistics, data centers in the US consumed about 61 billion kW·h of energy in 2006, leading to 4.5 billion dollars in electricity costs [3]. International Data Corporation (IDC) research suggests that companies worldwide would spend 40 billion US dollars per year on energy consumption [4]. The high power consumption of data centers not only wastes energy and causes system instability, but also has adverse effects on the environment.

The high energy consumption problem arises at two levels: the processor level and the data center level. With the rapid development of processor manufacturing techniques, the number of transistors in the Intel Itanium 2 has reached 1 billion [5]. Processors gain high computational power, but at the same time this gives rise to high energy consumption. Hardware optimization technology can solve this problem to a certain degree. At the data center level, on one hand, the increasing number of physical servers and their advanced processing capacity bring about more energy consumption; on the other hand, servers running at low utilization waste a huge amount of electricity. BARROSO and HOLZLE [6] found that more than 5000 servers run at only 10%−50% of their total capacity most of the time. BOHRER et al [7] explored the energy consumption of data centers and drew the same conclusion.

Government institutions, social organizations and academic groups have already paid attention to the energy consumption problem of data centers. In 2007, a global organization called Green Grid was set up. Its goal is to improve the energy efficiency of data centers and decrease their environmental impact. Recently, high performance has no longer been the sole concern in data center deployments, and there is a trend from optimizing resource management purely for performance to optimizing it for energy efficiency while meeting the QoS (Quality of Service) requirement.

Virtualization technology provides a new way to manage the power consumption of data centers. By using virtual machine (VM) migration technology, VMs on different hosts could be considered according to certain conditions. Idle hosts could be switched to sleep mode or shut down in order to save energy. However, some problems must be solved during the process of VM migration. First, when should VMs be migrated, and when should they not be migrated? Second, which VMs should be migrated? Third, where should the migrated VMs be placed? For these reasons, improving the energy efficiency of data centers is a much thornier task.

Foundation item: Project(61272148) supported by the National Natural Science Foundation of China; Project(20120162110061) supported by the Doctoral Programs of Ministry of Education of China; Project(CX2014B066) supported by the Hunan Provincial Innovation Foundation for Postgraduate, China; Project(2014zzts044) supported by the Fundamental Research Funds for the Central Universities, China
Received date: 2013−12−26; Accepted date: 2014−04−21
Corresponding author: HU Zhi-gang, Professor, PhD; Tel: +86−15274975378; E-mail: [email protected]

2 Related works
The power efficiency of data centers has become a hot topic that receives increasing attention because of both economic and environmental concerns. Many studies have focused on modeling the VM allocation problem and have proposed different solutions to decrease data center power consumption. SONG et al [8] proposed a global resource flowing algorithm in a multi-tiered resource scheduling scheme. This algorithm preferentially assures the performance of critical services by defining application priorities when resource competition arises. However, the method requires machine learning to obtain the utility function, and it does not optimize the allocation of VMs. STRACK [9] presented a resource management framework built around a utility-based dynamic VM deployment manager. The problems of minimizing power consumption and service-level agreement (SLA) violations are modeled as constraint satisfaction problems, and different trade-offs between performance and energy consumption are handled in the case of contention. Unfortunately, the total power consumption of the test bed was not presented. The dynamic voltage and frequency scaling (DVFS) policy [10] can save energy to a certain degree by adjusting the CPU voltage and frequency. However, the energy saving effect of this strategy is limited. In cloud computing, overestimating or underestimating the task execution time is not a good solution for reducing power and/or satisfying deadline constraints. With overestimation, the slack leads to energy waste. With underestimation, the increased time may cause the application to miss its deadline. To overcome these problems, a novel dynamic scheduling algorithm [11] was proposed that reallocates the slack to future tasks in order to reduce energy and/or satisfy deadline constraints. BUYYA et al [12] explored the problem of VM migration and provided a method named single threshold (ST). It is based on the idea of setting an upper utilization threshold and keeping the total CPU utilization of every host below this threshold. The ST policy reserves idle resources to avoid the SLA violations that come from increased demand for virtual resources during VM consolidation.


At each time step, all VMs are allocated using the modification of the best fit decreasing (MBFD) algorithm, while ensuring that the upper utilization threshold is not violated. The experimental results, obtained in CloudSim [13], showed that the ST policy saves more energy than the DVFS policy. BELOGLAZOV and BUYYA [14] presented an energy-efficient resource management system. This system partly reduces operational costs and provides the required QoS. Energy savings are realized by continuous consolidation of VMs according to the current utilization of resources. In order to deal with the energy−performance trade-off, BELOGLAZOV and BUYYA [15] proposed a policy for dynamic consolidation of VMs based on adaptive utilization thresholds, which can meet the QoS requirements defined in the SLA. The experimental results support the proposed technique with different kinds of workloads. However, the energy saving achieved by the proposed approach is limited. A green cloud computing architecture was developed in Ref. [16], which can minimize operational costs as well as reduce the environmental impact. Based on this architecture, a minimization of migrations (MM) algorithm was proposed to improve the energy efficiency of the data center without violating the negotiated SLA. Additionally, a number of open research challenges are addressed in that work.

3 Power model, cost of VM migration and SLA violations metric
3.1 Power model
Generally speaking, the energy consumed by a server in a data center consists of static power and dynamic power. Static power is often considered constant as long as the machine is turned on. Dynamic power depends on the virtualized resource utilization in the data center. Recent studies [17] show that server energy consumption is linear in CPU utilization, even when DVFS is applied. Moreover, on average an idle server consumes approximately 70% of the power it consumes when fully utilized. Therefore, the power consumption as a function of CPU utilization, P(u), is defined as follows:

$$P(u) = P_{\mathrm{static}} + P_{\mathrm{dynamic}} = 0.7P_{\max} + 0.3P_{\max} \cdot u = P_{\max}(0.7 + 0.3u) \qquad (1)$$

where Pmax can be obtained by statistical methods (for our experiment, Pmax is set to 250 W) and u is the CPU utilization of a server. As the CPU utilization may change over time, it is a function of time and is represented as u(t). Therefore, the total energy consumption (E) of a server can be defined as follows:


$$E = \int_{t_0}^{t_1} P(u(t))\,dt \qquad (2)$$
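As an illustration of Eqs. (1) and (2), the following Python sketch evaluates the linear power model and approximates the energy integral from sampled utilization values. The function names, the trapezoidal rule and the sampling in seconds are assumptions made for this example; only the 70%/30% split and Pmax = 250 W come from the text.

```python
P_MAX_W = 250.0  # Pmax used in the experiments described in the text


def power(u, p_max=P_MAX_W):
    """Eq. (1): P(u) = Pmax * (0.7 + 0.3 * u) for utilization u in [0, 1]."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("utilization must lie in [0, 1]")
    return p_max * (0.7 + 0.3 * u)


def energy(times, utils, p_max=P_MAX_W):
    """Approximate Eq. (2) with the trapezoidal rule (an assumed discretization).

    times: sample instants in seconds, strictly increasing
    utils: CPU utilization at each instant
    Returns energy in joules (watts * seconds).
    """
    if len(times) != len(utils) or len(times) < 2:
        raise ValueError("need at least two matching samples")
    total = 0.0
    for k in range(len(times) - 1):
        p_left = power(utils[k], p_max)
        p_right = power(utils[k + 1], p_max)
        total += 0.5 * (p_left + p_right) * (times[k + 1] - times[k])
    return total
```

Under this model an idle server (u = 0) draws 175 W and a fully loaded one 250 W, which is why consolidating VMs onto fewer hosts and switching off idle hosts can save energy.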

3.2 Cost of VM migration
VM live migration transfers VMs between servers without suspension and with only a short downtime. Its advantages include energy saving and meeting consumer requirements. However, excessive live migration has a negative impact on the performance of applications running in a VM. VOORSLUYS et al [18] explored this impact and found that the performance degradation and downtime depend on the application behavior, namely on how many memory pages are updated during the migration. For the class of web applications, the average performance degradation can be estimated as approximately 10% of the CPU utilization. This means that each migration can result in some SLA violations, so it is essential to minimize the number of VM migrations. The migration time depends on the available network bandwidth and the total amount of memory used by a VM. Therefore, the performance degradation caused by migrating the j-th VM is defined as

$$U_{d,j} = 0.1 \cdot \int_{t_0}^{T_{m,j}} u_j(t)\,dt \qquad (3)$$

$$T_{m,j} = \frac{M_j}{B_j} \qquad (4)$$

where Ud,j is the total performance degradation caused by the j-th VM, uj(t) is the CPU utilization of the j-th VM, t0 is the starting time of the migration, Tm,j is the migration completion time, Mj is the total memory used by the j-th VM, and Bj is the available network bandwidth.

3.3 SLA violations metric
Realizing energy savings while meeting the QoS requirements is extremely important in a cloud data center. In fact, QoS requirements are commonly formalized in the form of an SLA [15]. The SLA violation is defined as follows:

$$S = \frac{I_{\mathrm{request}} - I_{\mathrm{allocate}}}{I_{\mathrm{request}}} \qquad (5)$$
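The migration cost of Eqs. (3) and (4) and the SLA metric of Eq. (5) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: Eq. (4) is read as giving the migration duration, so the integral is taken over a window of that length starting at t0; memory is taken in MB and bandwidth in MB/s; the midpoint rule is used for the integral; and Irequest and Iallocate are read as requested and allocated CPU capacity, since their definitions do not appear in this excerpt. All function names are hypothetical.

```python
def migration_time(memory_mb, bandwidth_mb_per_s):
    """Eq. (4): T = M / B (units of MB and MB/s are assumed here)."""
    if bandwidth_mb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return memory_mb / bandwidth_mb_per_s


def migration_degradation(util_fn, t0, memory_mb, bandwidth_mb_per_s, steps=1000):
    """Eq. (3): 0.1 times the integral of the VM's utilization during migration.

    util_fn maps a time in seconds to the VM's CPU utilization.
    The integral is approximated with the midpoint rule over `steps` slices.
    """
    duration = migration_time(memory_mb, bandwidth_mb_per_s)
    dt = duration / steps
    integral = sum(util_fn(t0 + (k + 0.5) * dt) for k in range(steps)) * dt
    return 0.1 * integral


def sla_violation(requested, allocated):
    """Eq. (5): fraction of the requested capacity that was not allocated."""
    if requested <= 0:
        raise ValueError("requested capacity must be positive")
    return (requested - allocated) / requested
```

For example, a VM with 128 MB of memory migrated over a hypothetical 64 MB/s link takes 2 s; at a constant 50% utilization the degradation term is 0.1 × 0.5 × 2 = 0.1.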

(1) When should VMs be migrated, and when should they not be migrated?
(2) There are tens of thousands of VMs in a data center. Which VMs should be migrated in order to improve the energy efficiency?
(3) Where should the VMs be placed?
To solve these problems, TESA and VM selection policies are presented.

4.1 TESA
TESA sets three thresholds of a, b, c (0≤a ...

... hUtil−c) then
9 t ← vm.getUtil( )−hUtil+c;
10 if (t < bestUtil) then
11 bestUtil ← t;
12 bestVm ← vm;
13 end
14 else if bestUtil = Max then
15 bestVm ← vm;
16 break;
17 end
18 end
19 end
20 hUtil ← hUtil−bestVm.getUtil( );
21 migrationList.add(bestVm);
22 vmlist.remove(bestVm);
23 end
24 if (hUtil > b) then
25 continue;
26 else if (hUtil ≤ a) then
27 migrationList.add(host.getVmList( ));
28 vmlist.remove(host.getVmList( ));
29 end
30 end
31 end
32 return migrationList;

host.getVmList( ) returns the VMs on the host, and host.getUtil( ) returns the CPU utilization of the host. Line 6−Line 23 choose the best VM to migrate from a host with heavy load (CPU utilization greater than c). The best VM should satisfy the following two conditions. Firstly, the VM should have a utilization greater than the difference between the host's overall utilization and the c threshold. Secondly, if the VM is migrated from the host, the difference between the c threshold and the new utilization is the minimum across the values provided by all the VMs. If there is no such VM, MIMT chooses the VM with the highest utilization. Line 6−Line 25 mean that the VMs on a host with CPU utilization in the b−c interval are kept constant. Line 26−Line 29 mean that all VMs on a host with CPU utilization less than or equal to a need to be migrated. Line 24−Line 30 also mean that the VMs on a host with CPU utilization in the a−b interval are kept constant.

4.3.2 Maximization of migrations policy based on TESA (MAMT)
For a host with heavy load (CPU utilization greater than c), MAMT chooses the maximum number of VMs that must be migrated from the host in order to lower its CPU utilization below the c threshold. Supposing that Vj is the set of VMs currently allocated to host j, and φ(Vj) is the power set of Vj, the MAMT policy chooses a set R ∈ φ(Vj), formalized as follows:



 

$$R = \begin{cases} \left\{ S \mid S \in \varphi(V_j),\ u_j - \sum_{v \in S} u_a(v) < c,\ |S| \to \max \right\}, & \text{if } u_j > c \\ \varnothing, & \text{if } b < u_j \le c \\ \varnothing, & \text{if } a < u_j \le b \\ V_j, & \text{if } u_j \le a \end{cases} \qquad (7)$$

where uj is the current CPU utilization of the j-th host, ua(v) is the fraction of the CPU utilization allocated to the v-th VM, and parameters a, b and c are the three thresholds of TESA. As MAMT is similar to the MIMT algorithm, the pseudo-code is not provided here.

4.3.3 Highest potential growth policy based on TESA (HPGT)
For a host with heavy load (CPU utilization greater than c), HPGT migrates the VMs that have the lowest CPU usage relative to the total CPU capacity of the VM. Supposing that Vj is the set of VMs currently allocated to the j-th host, and φ(Vj) is the power set of Vj, the HPGT policy chooses a set R ∈ φ(Vj), formalized as follows:



 

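$$R = \begin{cases} \left\{ S \mid S \in \varphi(V_j),\ u_j - \sum_{v \in S} u_a(v) < c,\ \sum_{v \in S} \frac{u_a(v)}{u_r(v)} \to \min \right\}, & \text{if } u_j > c \\ \varnothing, & \text{if } b < u_j \le c \\ \varnothing, & \text{if } a < u_j \le b \\ V_j, & \text{if } u_j \le a \end{cases} \qquad (8)$$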


where uj is the current CPU utilization of the j-th host, ua(v) is the fraction of the CPU utilization allocated to the v-th VM, ur(v) is the total CPU capacity of the v-th VM and defined as the VM's parameter, and parameters a, b and c are the three thresholds of TESA. As HPGT is similar to the MIMT algorithm, the pseudo-code is not provided here.

4.3.4 Lowest potential growth policy based on TESA (LPGT)
For a host with heavy load (CPU utilization greater than c), LPGT migrates the VMs that have the highest CPU usage relative to the total CPU capacity of the VM. Supposing that Vj is the set of VMs currently allocated to the j-th host, and φ(Vj) is the power set of Vj, the LPGT policy chooses a set R ∈ φ(Vj), formalized as follows:



 

$$R = \begin{cases} \left\{ S \mid S \in \varphi(V_j),\ u_j - \sum_{v \in S} u_a(v) < c,\ \sum_{v \in S} \frac{u_a(v)}{u_r(v)} \to \max \right\}, & \text{if } u_j > c \\ \varnothing, & \text{if } b < u_j \le c \\ \varnothing, & \text{if } a < u_j \le b \\ V_j, & \text{if } u_j \le a \end{cases} \qquad (9)$$

where uj is the current CPU utilization of the j-th host, ua(v) is the fraction of the CPU utilization allocated to the v-th VM, ur(v) is the total CPU capacity of the v-th VM and defined as the VM's parameter, and parameters a, b and c are the three thresholds of TESA. As the LPGT is similar to MIMT algorithm, the pseudo-code is not provided here.

4.3.5 Random choice policy based on TESA (RCT)
If there is a host with heavy load (CPU utilization is greater than c), RCT rests on a random selection of a series of VMs that have to be migrated from the host to decrease the host's CPU utilization below c threshold. Supposing that Vj is a set of VMs currently allocated to the j-th host, and φ(Vj) is the power set of Vj, the RCT policy chooses a set R ∈ φ(Vj), formalized as follows:

$$R = \begin{cases} \left\{ S \mid S \in \varphi(V_j),\ u_j - \sum_{v \in S} u_a(v) < c,\ Y \sim U(0, |\varphi(V_j)| - 1) \right\}, & \text{if } u_j > c \\ \varnothing, & \text{if } b < u_j \le c \\ \varnothing, & \text{if } a < u_j \le b \\ V_j, & \text{if } u_j \le a \end{cases} \qquad (10)$$

where uj is the current CPU utilization of the j-th host, ua(v) is the fraction of the CPU utilization allocated to the v-th VM, Y is a uniformly distributed discrete random variable that is used to choose a subset of Vj, and parameters a, b and c are the three thresholds of TESA. As the RCT is similar to MIMT algorithm, the pseudo-code is not provided here.

5 Experiments and performance evaluation
5.1 Setting of experiments
The CloudSim toolkit [13] is chosen as a simulation platform due to its enormous advantage. For example, the CloudSim toolkit supports both system and behavior modeling of cloud system components such as data centers, VMs and resource provisioning policies. It implements generic application provisioning techniques that can be extended with ease and limited effort. For our experiments, the experimental parameters are given in Table 1.

Table 1 Experimental parameters

Parameter                           Value
Number of physical node             200
CPU capacity of physical node       {1000, 2000, 3000}
Memory capacity of physical node    1000
VM number                           600
CPU capacity of VM                  {250, 500, 750, 1000}
Memory capacity of VM               128
Task number                         600

The data center consists of 200 heterogeneous physical nodes. Each node has performance equivalent to 1×10^9, 2×10^9 or 3×10^9 s−1, 1000 MB of RAM and one CPU core. Each VM has performance equivalent to 2.5×10^8, 5×10^8, 7.5×10^8 or 1×10^9 s−1, 128 MB of RAM and one CPU core. Users send requests to 600 heterogeneous VMs that belong to the simulated data center. Each VM can run a web application or any kind of application with variable workload, which is modeled to generate the CPU utilization according to a uniformly distributed random variable. Each application has a length of 1.5×10^11 instructions, which equals 10 min of execution on a 2.5×10^8 s−1 CPU at 100% utilization. To ensure the accuracy of the data, each experiment was run 10 times.
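The workload figures above can be checked with a few lines of Python. This sketch assumes that the s−1 capacities are instructions per second and that the application length is an instruction count; the constant and function names are hypothetical.

```python
APP_LENGTH = 1.5e11                         # application length (assumed: instructions)
VM_CAPACITIES = (2.5e8, 5e8, 7.5e8, 1e9)    # VM CPU capacities (assumed: instructions/s)


def runtime_seconds(length, capacity, utilization=1.0):
    """Execution time of a job of `length` on a CPU of `capacity` at a given utilization."""
    if capacity <= 0 or not 0.0 < utilization <= 1.0:
        raise ValueError("capacity must be positive and utilization in (0, 1]")
    return length / (capacity * utilization)


# Runtime in minutes on each VM type at 100% utilization.
minutes_per_vm_type = [runtime_seconds(APP_LENGTH, cap) / 60.0 for cap in VM_CAPACITIES]
```

On the slowest VM type the runtime is 600 s, which matches the 10 min stated in the text; the faster types finish proportionally sooner.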

5.2 Optimal interval among three thresholds
Since energy efficiency covers both energy consumption and SLA violations, it is extremely important to determine the optimal intervals among the three thresholds a, b and c. MIMT is chosen to conduct a series of experiments, and a statistics-based approach to determining the optimal intervals among the three thresholds is proposed.
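Because MIMT drives these threshold experiments, its selection step from Section 4.3.1 can be sketched in Python as follows. This is an illustrative reading of the surviving pseudo-code and its explanation, not the authors' implementation: the Vm class, the function names and the mapping of the class names "proper" and "middle" to the a−b and b−c intervals are assumptions, and VM utilization is expressed as a share of the host's CPU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vm:
    name: str
    util: float  # share of the host CPU used by this VM (assumed representation)


def classify(host_util, a, b, c):
    """Map a host utilization to one of the four TESA load classes (names assumed)."""
    if host_util <= a:
        return "light"
    if host_util <= b:
        return "proper"
    if host_util <= c:
        return "middle"
    return "heavy"


def mimt_select(vms, a, b, c):
    """Return the VMs to migrate from one host under the MIMT policy."""
    host_util = sum(vm.util for vm in vms)
    remaining = sorted(vms, key=lambda vm: vm.util, reverse=True)
    migrate = []
    # Heavy host: remove VMs until the utilization is no longer above c.
    while host_util > c and remaining:
        excess = host_util - c
        # VMs whose removal alone brings the host below c.
        candidates = [vm for vm in remaining if vm.util > excess]
        if candidates:
            # Pick the one that leaves the host closest to c.
            best = min(candidates, key=lambda vm: vm.util - excess)
        else:
            # No single VM suffices: take the most utilized one and repeat.
            best = remaining[0]
        migrate.append(best)
        remaining.remove(best)
        host_util -= best.util
    # Light host: migrate everything so the host can be switched off.
    if host_util <= a:
        return list(vms)
    # Hosts between a and c (after any removals) keep their remaining VMs.
    return migrate
```

For a host running VMs at 0.4, 0.3 and 0.2 with thresholds a = 0.3, b = 0.6 and c = 0.8, only the 0.2 VM is selected, since removing it alone brings the host just below c.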


Suppose that the difference between a and b (b−a) is an integer multiple of 0.1 (10%), and that the difference between b and c (c−b) is also an integer multiple of 0.1 (10%). Assuming that a={0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0}, because 0≤a
