Globecom 2013 Workshop - Cloud Computing Systems, Networks, and Applications

SLA-driven Dynamic Resource Provisioning for Service Provider in Cloud Computing

Yongyi Ran∗, Jian Yang†, Shuben Zhang∗, Hongsheng Xi†
Department of Automation, University of Science and Technology of China, Hefei, China 230027
∗{yyran, zsb1986}@mail.ustc.edu.cn, †{jianyang, xihs}@ustc.edu.cn

Abstract—Cloud computing provides a convenient means of remote and pay-per-use access to computing resources in the form of Virtual Machines (VMs). In particular, with cloud computing, service providers no longer need to maintain a large number of expensive physical machines, which can significantly reduce their cost. However, due to the fluctuation and uncertainty of future demands, it remains a challenge for service providers to dynamically determine the optimal resource provisioning that saves cost while guaranteeing the Service Level Agreement (SLA). Overload may make the service unavailable for later service requests, while over-provisioning naturally increases the cost. To address this problem, by defining the unavailability probability of the service as a metric of the SLA, we propose an SLA-driven dynamic VM provisioning strategy based on the large deviation principle, which proactively calculates the optimal number of VMs for the upcoming demands subject to keeping the unavailability probability below a desired threshold. Finally, experiments are performed on real workload traces to show the attainable performance of the proposed resource provisioning strategy and to verify that it makes a good tradeoff between saving cost and guaranteeing the SLA.

Keywords—Resource provisioning, cloud computing, service level agreement, cost saving.

I. INTRODUCTION

Cloud computing is an emerging commercial infrastructure and computing paradigm, which provides various computing, storage and other services to cloud consumers via the Internet [1]. Generally, it offers Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). On IaaS cloud platforms, virtualization technology [2] can be used to package the resources into virtual machines (VMs), and cloud consumers (e.g., service providers and scientific research institutions) may purchase these VMs to deploy their applications. Specifically, by employing cloud computing as the service provisioning environment, service providers are no longer required to make large capital outlays on hardware infrastructure to deploy their services. Furthermore, by adjusting the number of VMs, service providers can flexibly design and optimize their resource provisioning strategies according to their own workload characteristics, achieving cost-effective resource provisioning while guaranteeing the Service Level Agreements (SLAs). The SLA is a contract between a service provider and its customers, which facilitates transactions between them by providing an approach for consumers to indicate their required service level or Quality of

978-1-4799-2851-4/13/$31.00 © 2013 IEEE


Service (QoS). An SLA usually specifies a common understanding of responsibilities, guarantees, warranties, and performance levels in terms of availability, response time, etc. In cloud computing in particular, the cloud service provider is responsible for the efficient utilization of the physical resources and for guaranteeing their availability to cloud consumers. The cloud consumers (e.g., service providers) are responsible for the efficient utilization of their allocated resources in order to satisfy the SLA of their own customers (end-users) and achieve their business goals. The availability of the service is the prerequisite for all other performance guarantees. Therefore, in this paper, we consider the unavailability probability of the service as the metric of the SLA between the service provider and end-users, where the unavailability probability is the probability that the workload exceeds the serving capacity. In addition, IaaS cloud platforms (e.g., Amazon EC2 [3] and GoGrid [4]) generally offer on-demand VMs (in Amazon EC2, a VM is called an instance), which allow cloud consumers to pay for compute capacity by the hour; we therefore use on-demand VMs for service provisioning. In the cloud computing environment, the future demands of a service provider fluctuate and are hard to predict accurately. If excessive VMs are purchased from the cloud data center, resources may be wasted for lack of service demand; such over-provisioning naturally results in high cost and a low resource utilization rate. If instead the purchased VMs are insufficient, overload may occur, making the service unavailable for later service requests and resulting in a high rejection rate or a high waiting rate. Such under-provisioning does yield a low resource cost, but availability and the other performance metrics cannot be guaranteed.
Therefore, how to optimally and dynamically plan resource purchasing to save cost while guaranteeing the SLA remains a challenge for service providers. In this paper, to address this problem, by defining the unavailability probability of the service as a metric of the SLA, we propose an SLA-driven dynamic VM provisioning strategy based on the large deviation principle, which proactively calculates the optimal number of VMs for the upcoming demands subject to keeping the unavailability probability below a desired threshold. Specifically, the main contributions of this paper are:

• The dynamic resource provisioning problem for service providers is formulated as minimizing the number of purchased VMs subject to an SLA requirement in terms of the unavailability probability.

• An online unavailability probability estimation model based on the large deviation principle [5] [6] is proposed, which requires no prior knowledge of the workload and responds promptly to variations of the instantaneous workload. The estimated unavailability probability is then used to drive the proactive adjustment of the optimal number of VMs for the future workload.

• An event-based implementation procedure is presented to demonstrate how to apply the proposed dynamic resource provisioning algorithm to achieve cost saving and SLA guarantees in practice.

• Finally, experiments are carried out on two real workload traces to verify the performance of the proposed algorithm.

The rest of this paper is organized as follows. Related works are reviewed in Section II. Section III describes the system model for resource provisioning in cloud computing. In Section IV, the dynamic resource provisioning strategy based on the large deviation principle is designed to reduce the cost while guaranteeing the SLA. Section V presents the simulation results to show the performance of the proposed strategy. Finally, Section VI concludes this paper.

II. RELATED WORK

Recently, many works have illustrated how to meet SLA requirements through proper resource provisioning. Zhao et al. [7] presented an end-to-end framework that facilitates adaptive and dynamic provisioning of the database tier of software applications based on consumer-centric policies for satisfying their own SLA performance requirements and controlling the monetary cost. Feng et al. [8] addressed how to maximize providers' revenues based on a performance-aware pricing model in SLAs through proper resource allocation among customers by the method of Lagrange multipliers. Kouki et al. [9] presented an SLA-driven approach for improving capacity planning for cloud computing following a queueing network proposal. Other related works focusing on provisioning strategies were addressed in [10] [11]. Hong et al. [10] developed the ShrinkWrap algorithm to train an offline lookup table giving the optimum number of instances for a specific workload; furthermore, a ShrinkWrap-opt algorithm based on dynamic programming was proposed to optimally expend a tolerance budget to achieve the maximum marginal cost saving. Chaisiri et al. [11] proposed an optimal cloud resource provisioning (OCRP) algorithm to allocate resources offered by multiple cloud providers, where the optimal solution is obtained by formulating and solving a stochastic integer program with multistage recourse. Other works [12] [13] also studied resource provisioning strategies in different cloud environments. In contrast to the previous works, we focus on dynamic resource provisioning for service providers with cost saving and SLA guarantees, where the SLA mainly considers the unavailability probability of the service as a metric for all end-users. An online unavailability probability estimation model based on the large deviation principle [5] [6] is proposed, and the estimated unavailability probability is then used to drive the proactive adjustment of the optimal number of VMs for the future workload.

III. SYSTEM MODEL

A. System Architecture

[Fig. 1. System Architecture for Dynamic Resource Provisioning: on the service provider side, demand enters the Task Management Queue (TMQ); the Task Scheduler Unit (TSU), VMs Dynamic Management Unit (VDMU), Monitoring Management Unit (MMU) and SLA estimation module (E-SLA) manage the VMs purchased from the cloud provider, and the estimated SLA is compared with the negotiated SLA (N-SLA) to drive VM adjustment.]

The system architecture for dynamic resource provisioning is depicted in Fig. 1. For the service provider, the major modules are the Task Management Queue (TMQ), the Task Scheduler Unit (TSU), the VMs Dynamic Management Unit (VDMU), the Monitoring Management Unit (MMU) and an SLA estimation module (E-SLA). The TMQ manages the tasks, and the E-SLA estimates the actual unavailability probability according to the arrival and departure of tasks in the TMQ. The TSU executes the VM allocation strategy for the service requests. The VDMU decides the optimal number of VMs for cost saving subject to the SLA according to the estimated unavailability probability, while the MMU collects state information of the purchased VMs. The main work of this paper is a dynamic resource provisioning strategy implemented in the VDMU. For dynamic resource provisioning, the VDMU uses the E-SLA to periodically calculate an estimate of the actual unavailability probability of the service, and then compares the estimated value with the negotiated SLA (N-SLA) to decide whether to increase or decrease the VMs.

B. VM Type & Price

As mentioned before, in order to meet cloud consumers' different resource demands, cloud providers supply many types of VMs with different resource configurations. Generally, IaaS vendors such as Amazon EC2 offer Standard VMs for general purposes, High-Memory VMs for high-throughput applications, High-CPU VMs for compute-intensive applications, High I/O VMs for many high-performance database workloads, etc. Moreover, each type may have several subtypes with different quantities of resources, such as Small, Medium, Large and Extra Large VMs. TABLE I shows the detailed configurations and prices of High-CPU on-demand VMs in US East (N. Virginia), taken from Amazon EC2 on May 10th, 2013. Note that the types and prices may vary due to adjustments by Amazon EC2.
TABLE I. EC2 HIGH-CPU ON-DEMAND VMS

VM Subtype      | Memory (GB) | Compute Unit | Disk (GB) | Usage Fee (per hour)
Medium VM       | 1.7         | 5            | 350       | $0.145
Extra Large VM  | 7           | 20           | 1690      | $0.580

In this paper, we mainly concentrate on determining the optimum resource provisioning for the upcoming workload and disregard how to select an appropriate VM type. Without loss of generality, the default VM subtype used in this paper is the Medium VM. The resource and cost associated with an on-demand VM are defined as ⟨ro, po⟩, where po is the hourly fee of a purchased on-demand VM and ro specifies the available resource. The available resource can indicate the number of CPUs for compute-intensive applications, bandwidth capacity for High I/O applications (like VoD), etc. For simplicity, here we mainly consider compute-intensive applications, and the available resource specifies the number of idle CPUs.

C. SLA-Based Mathematical Model

According to the data format of real workload traces [14], we use Ti_request = ⟨ai, ei, ci⟩ to describe a historical task i, where ai specifies the task arrival time, ei specifies the task end time, and ci is the computation complexity of task i in terms of CPU number. To facilitate statistics, we divide time into slots of equal duration m minutes, and define R_n as the computation resource demand in terms of CPU number at the beginning of the nth slot. The dynamic resource demand in the nth slot can then be described as follows:

R_{n+1} = R_n + A_n − F_n,  (1)

where A_n ∈ {0, 1, ..., sA} specifies the CPUs required by the tasks arriving during the nth slot and F_n ∈ {0, 1, ..., sF} specifies the CPUs released by the tasks finished during the nth slot. Here we assume that the processes A_n and F_n are both i.i.d. sequences. Let T0 denote the start time; then the interval of the nth slot can be expressed as I_n = [T0 + (n−1)m, T0 + nm]. Thus A_n and F_n can be defined as:

A_n = sum_{i=1}^{∞} c_i · 1_{I_n}(a_i),  F_n = sum_{i=1}^{∞} c_i · 1_{I_n}(e_i),  (2)

where the indicator function 1_{I_n}(t) equals 1 if t ∈ I_n and 0 otherwise.

As mentioned before, the availability of the service is the prerequisite for the other performance guarantees. When the workload exceeds the serving capacity provided by the purchased VMs, the service becomes unavailable for later service requests. Therefore, in this paper, we consider the unavailability probability of the service as the metric of the SLA between the service provider and end-users. We then derive the optimum number of VMs to purchase from the cloud, driven by the unavailability probability, which indicates the degree of mismatch between the workload and the serving capacity provided by the VMs. The unavailability probability is defined as follows:

P^n_un = P(R_n > C_n),  (3)

where R_n is a random variable specifying the resource demand from the service requests, and C_n specifies the serving capacity provided by the purchased VMs during the nth slot.

Let us define k^n as the number of purchased VMs, and C(k^n) = k^n · ro as the total capacity of k^n VMs in terms of CPU number during the nth slot. Based on the unavailability probability defined in (3), the dynamic resource provisioning problem for cost saving with SLA guarantees can be defined as:

min  Z(k^n)
s.t. P(R_n > C(k^n)) < ε,  (4)

where ε is a pre-negotiated SLA in the form of an unavailability probability and Z(k^n) = k^n · po specifies the total hourly cost of k^n VMs. The optimization problem (4) is to dynamically find the optimum number of VMs to purchase from the cloud such that the unavailability probability stays below ε. In the next section, the large deviation principle is employed to estimate the unavailability probability.
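The slotted demand model of (1)–(3) can be illustrated with a short sketch. The helper below (hypothetical code, not part of the paper; the function names are illustrative) aggregates a task trace ⟨ai, ei, ci⟩ into per-slot arrivals A_n, releases F_n and demand R_n, and then measures the empirical frequency with which the demand exceeds a given capacity, i.e. an empirical counterpart of (3).

```python
# Sketch (assumption): aggregate tasks <a_i, e_i, c_i> into the slot series
# A_n, F_n of eq. (2), roll the demand R_n forward via eq. (1), and measure
# the empirical unavailability frequency P(R_n > C) corresponding to eq. (3).
def slot_series(tasks, t0, m, num_slots):
    """tasks: list of (a_i, e_i, c_i); t0: start time; m: slot duration."""
    A = [0] * num_slots
    F = [0] * num_slots
    for a_i, e_i, c_i in tasks:
        ka = int((a_i - t0) // m)          # slot index of the arrival
        ke = int((e_i - t0) // m)          # slot index of the departure
        if 0 <= ka < num_slots:
            A[ka] += c_i                   # CPUs required by arriving tasks
        if 0 <= ke < num_slots:
            F[ke] += c_i                   # CPUs released by finished tasks
    R = [0] * (num_slots + 1)
    for n in range(num_slots):
        R[n + 1] = R[n] + A[n] - F[n]      # eq. (1)
    return A, F, R

def empirical_unavailability(R, capacity):
    """Fraction of slots whose demand exceeds the capacity (cf. eq. (3))."""
    return sum(1 for r in R if r > capacity) / len(R)
```

For example, three tasks (0.5, 2.5, 2), (1.2, 3.8, 1), (2.1, 6.0, 3) with t0 = 0, m = 1 and six slots yield A = [2, 1, 3, 0, 0, 0] and F = [0, 0, 2, 1, 0, 0], so the demand peaks at R_3 = 4.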

IV. SLA-DRIVEN DYNAMIC RESOURCE PROVISIONING STRATEGY

In order to reduce the total cost while guaranteeing the SLA, an SLA-driven Dynamic Resource Provisioning strategy (SDRP) based on the large deviation principle is derived in this section.

A. Unavailability Probability Estimation Model

According to the description in Section III-C, the unavailability probability P^n_un = P(R_n > C(k^n)) is required to decide whether to increase or decrease the purchased VMs at the beginning of the nth slot. Specifically, if the unavailability probability is larger than ε, more VMs need to be purchased. By contrast, if the unavailability probability is less than ε, the number of VMs can be decreased to reduce the cost.

Since it is a challenge to calculate P^n_un directly because the probability distribution of R_n is unknown, we apply an online measurement-based method that predicts the unavailability probability at the (n+N)th slot using the historical observations available at the nth slot, where N is the prediction interval. Assuming that the k^n VMs are kept until the (n+N)th slot, the unavailability probability during the (n+N)th slot can be defined as:

P^{n+N}_un = P(R_{n+N} > C(k^n)),  (5)

where R_{n+N} is the workload at the (n+N)th slot and C(k^n) is the total capacity of the k^n VMs. Let ∆_i = A_{n+i} − F_{n+i} (i = 0, 1, ..., N−1) denote the variation of the computation resource requirement during the (n+i)th slot; the value space of ∆_i is {−sF, ..., 0, 1, ..., sA}. ∆_i > 0 indicates that the resource demand increases, and ∆_i < 0 indicates that it decreases. According to formula (1), R_{n+N} can be rewritten as:

R_{n+N} = R_n + sum_{i=0}^{N−1} (A_{n+i} − F_{n+i}) = R_n + sum_{i=0}^{N−1} ∆_i.  (6)

Thus, according to (5) and (6), we obtain:

P^{n+N}_un = P(R_n + sum_{i=0}^{N−1} ∆_i > C(k^n)) = P((1/N) sum_{i=0}^{N−1} ∆_i > a0),

where a0 = (C(k^n) − R_n)/N specifies the acceptable average growth of the resource demand and (1/N) sum_{i=0}^{N−1} ∆_i indicates the average variation of the workload in each of the future N slots. Owing to the assumption that A_n and F_n (n = 1, 2, ...) are both i.i.d., the random variables ∆_i (i = 0, 1, ..., N−1) are also i.i.d. with a finite moment generating function

M(θ) = E{e^{θ∆_i}}.  (7)

By applying Cramér's Theorem in the context of the large deviation approximation [5] [6] to estimate the unavailability probability, we obtain

lim_{N→∞} (1/N) log P((1/N) sum_{i=0}^{N−1} ∆_i > a0) = −l(a0),  (8)

where

l(a0) = sup_{θ>0} {θ·a0 − log M(θ)}  (9)

is called the rate function. If the probability distribution of ∆_i is

∆_i ~ ( −sF ... 0 1 ... sA ; π_{−sF} ... π_0 π_1 ... π_{sA} ),  (10)

then log M(θ) can be expressed as

log M(θ) = log{ sum_{j=−sF}^{sA} π_j e^{jθ} }.  (11)

Thus, according to (8), for sufficiently large N, the unavailability probability can be approximated as [6]:

P̂^{n+N}_un ≈ e^{−N·l(a0)}.  (12)

Based on this estimate, we can determine whether the current capacity of the k^n purchased VMs can serve the forthcoming workload while guaranteeing the pre-negotiated SLA ε by using the following criterion:

P̂^{n+N}_un < ε.  (13)

Obviously, the value of M(θ) cannot be derived directly because the probability distribution of ∆_i is unknown. An online method for estimating π_j is therefore proposed in Section IV-B.

B. Online Estimation of Unavailability Probability

The key step is to estimate π_j (j ∈ {−sF, ..., 0, 1, ..., sA}) using a sliding window that contains the most recent Ls samples, updating the parameters when new observations become available. Assume that the index of the current slot is n; then we can use the Ls observations S_n = {∆_n, ∆_{n−1}, ..., ∆_{n−Ls+1}} to estimate π_j(n). Let L_j denote the number of occurrences of ∆_k = j, where ∆_k ∈ S_n and j ∈ {−sF, ..., 0, 1, ..., sA}. Then the distribution π_j can be estimated as π̂_j(n) = L_j/Ls. Using the estimated distribution π̂_j(n), the unavailability probability can finally be calculated.

C. Dynamic Resource Provisioning Strategy

In this section, we solve the problem described in (4) by using the unavailability probability estimation model. Suppose that the current slot of the system is n, and let k^n denote the number of currently purchased VMs with total capacity C(k^n) in terms of CPU number. We apply criterion (13) to check whether the current number k^n of purchased VMs satisfies the SLA parameter ε. The two scenarios of the strategy are as follows:

If P̂^{n+N}_un ≥ ε, the current k^n VMs cannot provide enough computation resource to keep the unavailability probability below ε for the forthcoming workload. Thus more VMs should be purchased to increase the computation capacity. In each iteration, we increase the capacity by δC, and the iteration ends only when criterion (13) is satisfied. Assuming the iteration stops after m steps, the capacity of the system becomes

C(k^{n+N}) = C(k^n) + m·δC,  (14)

where m·δC is the added computation resource in terms of CPU number. The corresponding number of newly purchased VMs is δk_n = m·δC/ro, so k^{n+N} = k^n + δk_n.

If P̂^{n+N}_un < ε, keeping the current k^n VMs satisfies the negotiated SLA ε in the (n+N)th slot. However, in this situation, k^n is not necessarily the optimum number of VMs that can provide the desired SLA for the forthcoming workload. Consequently, we decrease the capacity by δC repeatedly until criterion (13) is no longer satisfied. Assuming the iteration stops after m steps, the capacity of the system becomes:

C(k^{n+N}) = C(k^n) − (m−1)·δC,  (15)

where (m−1)·δC is the removed computation resource. The corresponding number of released VMs is δk_n = (m−1)·δC/ro, so we can release δk_n VMs to reduce the cost. By applying the proposed strategy, we can dynamically configure the optimum number of VMs to guarantee the desired SLA ε for the forthcoming workload while reducing the cost.

V. PERFORMANCE EVALUATION

In this section, the performance of the proposed strategy is illustrated based on two real workload traces.

A. Experiment Setup

Two real workloads with different characteristics are used to evaluate the performance of the proposed method. The first is extracted from the LLNL Thunder workload log [14], covering the period from Jan. through Jun. 2007. The second is extracted from the Blue Gene/P (Intrepid) log at Argonne National Laboratory (ANL) [14], containing 8 months' workload from Jan. 2009 to Sept. 2009. The key parameters for the experiments were set as follows: the slot size d = 1/6 minute, the prediction interval N = 60d, the length of the sliding window Ls = 240d, the SLA parameter ε = 0.01, and the computation capacity increment δC = 4 CPUs.
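Under a setup of this kind, one SDRP decision step of Section IV can be sketched as follows. This is a minimal, hypothetical implementation (the function names and the coarse grid search over θ are our own assumptions, not the paper's code): π̂_j is the sliding-window frequency of Section IV-B, the rate function follows (9) and (11), the unavailability estimate follows (12), and capacity is adjusted in δC steps per (14)–(15).

```python
import math
from collections import Counter

# Sketch (assumption): one SDRP decision step. pi_hat follows Section IV-B;
# rate_function grid-searches l(a0) = sup_{theta>0} {theta*a0 - log M(theta)}
# (eqs. 9 and 11); the estimate is exp(-N * l(a0)) (eq. 12); capacity moves
# in steps of delta_c per eqs. (14)-(15). All names are illustrative only.
def pi_hat(window):
    """Empirical distribution of the Delta_i samples in the sliding window."""
    counts = Counter(window)
    return {j: c / len(window) for j, c in counts.items()}

def rate_function(a0, pi, thetas):
    """Grid-searched rate function l(a0) over candidate theta > 0 values."""
    best = 0.0  # theta -> 0+ gives the value 0, so the supremum is >= 0
    for theta in thetas:
        log_m = math.log(sum(p * math.exp(j * theta) for j, p in pi.items()))
        best = max(best, theta * a0 - log_m)
    return best

def estimate_unavailability(window, capacity, demand, horizon):
    """Large-deviation estimate of P(R_{n+N} > capacity), cf. eq. (12)."""
    a0 = (capacity - demand) / horizon      # acceptable average growth
    if a0 <= 0:
        return 1.0                          # demand already at/above capacity
    pi = pi_hat(window)
    thetas = [0.01 * k for k in range(1, 1001)]
    return math.exp(-horizon * rate_function(a0, pi, thetas))

def sdrp_step(window, capacity, demand, horizon, eps, delta_c):
    """Adjust capacity until criterion (13) is just satisfied."""
    while estimate_unavailability(window, capacity, demand, horizon) >= eps:
        capacity += delta_c                 # eq. (14): add delta_c per step
    while (capacity - delta_c > demand and
           estimate_unavailability(window, capacity - delta_c,
                                   demand, horizon) < eps):
        capacity -= delta_c                 # eq. (15): release spare capacity
    return capacity
```

The returned capacity divided by ro would give k^{n+N}; in a full implementation the window of ∆ samples slides forward each slot before the next decision.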

Fig. 2. Performance comparison of the Reactive Strategy, the ARMA-based Strategy and SDRP on the LLNL Thunder and ANL Intrepid traces:

(a) Total Cost ($)        | LLNL Thunder | ANL Intrepid
Reactive                  | 33457        | 36832
ARMA                      | 36484        | 42379
SDRP                      | 36260        | 42784

(b) Average Utilization   | LLNL Thunder | ANL Intrepid
Reactive                  | 90.90%       | 84.72%
ARMA                      | 83.63%       | 74.91%
SDRP                      | 86.02%       | 70.38%

(c) Job Waiting Rate      | LLNL Thunder | ANL Intrepid
Reactive                  | 0.1720       | 0.3370
ARMA                      | 0.1069       | 0.2082
SDRP                      | 0.0288       | 0.1130

Fig. 3. Impact of different SLA values, ε: (a) the total cost, (b) the average utilization rate, and (c) the job waiting rate, each plotted against ε for the LLNL Thunder and ANL Intrepid traces.

B. Performance Metrics

In order to evaluate the performance of the proposed strategy, we define three performance metrics as follows:

1) Total cost (Z): This metric indicates the total cost of the purchased VMs during runtime. Assuming that the total running time is N0 slots, the total cost is defined as:

Z(N0) = sum_{n=1}^{N0} k^n · po.  (16)

2) Average utilization rate (φ): The utilization of the purchased VMs at time t is φ(t) = K(t)/Ko(t), where K(t) and Ko(t) denote the number of used VMs and the total number of VMs at time t, respectively. The average utilization rate over an interval of length ∆ is:

φ = (1/∆) · ∫_{T0}^{T0+∆} φ(t) dt.  (17)

3) Job waiting rate (ψ): Generally, a higher unavailability probability leads to more waiting jobs. Therefore, we use the job waiting rate ψ = Nw/Nt to evaluate the proposed method, where Nw is the number of waiting jobs and Nt is the total number of submitted jobs.

C. Experimental Results

1) Performance Comparison: In order to demonstrate the performance improvement of the proposed SDRP, we compared SDRP against the ARMA-based strategy [10] and the Reactive Strategy, which dynamically adjusts the number of VMs according only to the current workload at the time of adjustment. Fig. 2 gives the simulation results in terms of total cost, average utilization rate and job waiting rate. The results in Fig. 2(a) and Fig. 2(b) show that the total cost of the proposed SDRP is a little higher (about 8.38% for LLNL Thunder) than that of the Reactive Strategy, and that the Reactive Strategy also achieves a better average utilization rate. However, Fig. 2(c) shows that the job waiting rate is largely reduced by applying SDRP (by about 83.26% for LLNL Thunder). This indicates that the prediction-based SDRP adaptively reserves a little extra capacity to absorb fluctuations of the future workload, while the Reactive Strategy uses only the current workload information to adjust the VM number and therefore, lacking prediction, cannot guarantee optimum capacity (it may be excessive or insufficient) for the future workload. From Fig. 2, it can also be observed that the proposed SDRP and the ARMA-based Strategy achieve almost the same total cost and average utilization rate, but the job waiting rate of SDRP is much lower than that of the ARMA-based Strategy. This means that SDRP better predicts the future workload and reserves a more adaptive amount of extra VMs for the upcoming workload. Therefore, the proposed SDRP is effective and makes a better trade-off between the cost and the SLA.

2) Sensitivity to SLA Value ε: The experiment in this section investigates the effect of the unavailability probability threshold ε in criterion (13) on the proposed SDRP strategy. The values of ε cover {10^−5, 10^−4, 10^−3, 10^−2, 10^−1}, and the setting of the other

Fig. 4. Performance of SDRP for different slot sizes (d) on the LLNL Thunder and ANL Intrepid traces: (a) the total cost, (b) the average utilization rate, and (c) the job waiting rate, each plotted against the slot size (1–50 seconds).

parameters was the same as before. The corresponding simulation results are given in Fig. 3. Fig. 3(a) shows that the total cost decreases as ε increases. The reason is that, as ε increases, the number of VMs reserved for future workload growth is reduced according to the large deviation principle: the looser the SLA (i.e., the larger the allowed unavailability probability), the fewer extra VMs are needed, which in turn decreases the total cost. As shown in Fig. 3(b) and 3(c), the average utilization rate and the job waiting rate both increase with ε. This is because increasing ε reduces the number of extra VMs kept to meet the future workload; for the same resource requirement, fewer VMs lead to a higher average utilization rate and a higher job waiting rate. These results also show that different SLAs can be achieved by controlling the unavailability probability threshold ε while keeping the cost as low as possible.

3) Sensitivity to Slot Size d: In this section, we carried out an experiment to investigate the effect of the slot size d by setting d to 1, 5, 10, 20 and 50 seconds. The other parameters and simulation settings were the same as before, and the simulation results are given in Fig. 4. From Fig. 4(a), it can be observed that the total cost increases with d. The reason is that the slot size d is the measurement period of the workload; a larger d leads to sparser measurements which cannot characterize the workload variations on a small time scale, so the accuracy of calculating the minimum number of needed VMs degrades and more VMs are purchased. For the same workload, the more purchased VMs, the lower the average utilization rate and the job waiting rate. Therefore, the average utilization rate and the job waiting rate decrease as d increases, as depicted in Fig. 4(b) and Fig. 4(c).

VI. CONCLUSION

In this paper, the problem of dynamic resource provisioning for cost saving with an SLA guarantee in a cloud environment was studied. By defining the unavailability probability of the service as a metric of the SLA, we proposed an SLA-driven dynamic VM provisioning strategy based on the large deviation principle, which proactively estimates the unavailability probability of the service and adjusts the resources (VMs) according to the comparison between the estimated unavailability probability and the pre-negotiated one. Experiments using two real workload traces verified the performance of the proposed strategy, and the results illustrated that it can adaptively provide resources for uncertain service demands with a good trade-off between cost saving and the SLA guarantee.

ACKNOWLEDGMENT

This work was supported by the National Natural Science Foundation of China (No. 61174062), the State Key Program of National Natural Science of China (No. 61233003) and the Fundamental Research Funds for the Central Universities (WK2100100021).

REFERENCES

[1] A. Fox, R. Griffith et al., "Above the clouds: A Berkeley view of cloud computing," Dept. Electrical Eng. and Comput. Sciences, University of California, Berkeley, Tech. Rep. UCB/EECS, vol. 28, 2009.
[2] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, "Xen and the art of virtualization," SIGOPS Oper. Syst. Rev., vol. 37, no. 5, pp. 164–177, Oct. 2003.
[3] Amazon EC2. [Online]. Available: http://aws.amazon.com/ec2/
[4] GoGrid. [Online]. Available: http://www.gogrid.com/
[5] J. Yang, K. Zeng, H. Hu, and H. Xi, "Dynamic cluster reconfiguration for energy conservation in computation intensive service," IEEE Trans. Comput., vol. 61, no. 10, pp. 1401–1416, Oct. 2012.
[6] A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications. Springer, 2009, vol. 38.
[7] L. Zhao, S. Sakr, and A. Liu, "A framework for consumer-centric SLA management of cloud-hosted databases," IEEE Trans. Services Comput., vol. PP, no. 99, pp. 1–1, 2013.
[8] G. Feng, S. Garg, R. Buyya, and W. Li, "Revenue maximization using adaptive resource provisioning in cloud computing environments," in Proc. 2012 13th ACM/IEEE Int'l Conf. Grid Computing (GRID'12), pp. 192–200, 2012.
[9] Y. Kouki and T. Ledoux, "SLA-driven capacity planning for cloud applications," in Proc. 2012 4th IEEE Int'l Conf. Cloud Computing Technology and Science (CloudCom), pp. 135–140, 2012.
[10] Y.-J. Hong, J. Xue, and M. Thottethodi, "Selective commitment and selective margin: Techniques to minimize cost in an IaaS cloud," in Proc. 2012 IEEE Int'l Symp. Performance Analysis of Systems and Software (ISPASS), pp. 99–109, Apr. 2012.
[11] S. Chaisiri, B.-S. Lee, and D. Niyato, "Optimization of resource provisioning cost in cloud computing," IEEE Trans. Services Comput., vol. 5, no. 2, pp. 164–177, Apr. 2012.
[12] G. Singer, I. Livenson, M. Dumas, S. Srirama, and U. Norbisrath, "Towards a model for cloud computing cost estimation with reserved resources," in Proc. 2nd Int'l ICST Conf. Cloud Computing, 2010.
[13] Z. Xiao, W. Song, and Q. Chen, "Dynamic resource allocation using virtual machines for cloud computing environment," IEEE Trans. Parallel Distrib. Syst., vol. 99, no. PrePrints, p. 1, 2012.
[14] Parallel Workloads Archive. [Online]. Available: http://www.cs.huji.ac.il/labs/parallel/workload/logs.html