Provisioning for Large Scale Cloud Computing Services

Yue Tan
The Ohio State University

Yingdong Lu
IBM Thomas J. Watson Research Center

Cathy H. Xia
The Ohio State University

ABSTRACT

Resource provisioning, the task of planning sufficient amounts of resources to meet service level agreements, has become an important management task in emerging cloud computing services. In this paper, we present a stochastic modeling approach to guide the resource provisioning task for future service clouds as the demand grows large. We focus on on-demand services and consider service availability as the key quality of service constraint. A specific scenario under consideration is when resources can be measured in base instances. We develop an asymptotic provisioning methodology that utilizes tight performance bounds for the Erlang loss system to determine the minimum capacity levels that meet the service availability requirements. We show that our provisioning solutions are not only asymptotically exact but also provide better QoS guarantees at all load conditions.

Categories and Subject Descriptors: H.4 [Information Systems Applications]: Miscellaneous; D.2.8 [Software Engineering]: Metrics–complexity measures, performance measures

General Terms: Theory

1. INTRODUCTION

Cloud computing is rapidly gaining momentum as a new paradigm for offering computing as services via the Internet. From service providers, customers can lease infrastructure resources, such as CPU cycles, memory and disk storage, based on a “pay as you go” billing model, while providers are able to present the illusion of an infinite amount of resources to the clients via virtualization [1]. An important value proposition of cloud computing is the flexibility for customers to increase or decrease capacity on the fly to meet demand. This is especially attractive as it allows the clients to trade the high fixed costs (of owning infrastructure) for variable usage-based costs and to dynamically offload unforeseen excess demand.

Cloud service offerings are typically specified in a menu of service instances or virtual machine configurations. The instances are composed of various amounts of distinct resources such as CPU, memory and storage. Clients may choose to purchase any of these instances. For example, a basic instance at Amazon EC2 [2] comprises 1.7 GB of memory, 1 virtual core with 1 EC2 Compute Unit, and 160 GB of instance storage. EC2 provides eight different types of virtual machine instances tailored for different user preferences. Another important component of the service offerings is the service level agreement (SLA), which specifies the desired targets on various performance metrics that the service provider should meet. Violation of the SLAs typically results in a significant penalty, with the potential hazard of losing customer loyalty.

Resource provisioning, the task of planning sufficient amounts of resources to meet the SLAs for all cloud users, has become an important management task in modern virtualization-based service clouds. Most existing provisioning solutions plan capacity individually for each user based on its peak usage, which often results in overprovisioning of resources. With the increasing cost of energy consumption in today’s data centers, such approaches are clearly undesirable.

In this paper, we focus on service availability, a key performance metric specified in today’s cloud service offerings, defined as the percentage of time during which new service requests can be admitted into the system with their desired amount of resources fulfilled. The objective of resource provisioning is then to plan sufficient capacity of all resource types so as to guarantee the service availability defined in the SLA. For this purpose, we present novel mathematical models that capture the key features of cloud computing: 1) services composed of multiple underlying resources, 2) customers’ flexibility to upgrade or downgrade services on demand, 3) the multiplexing gains in a stochastic environment with a large number of customers, and 4) the probabilistic nature of SLA violation.

2. PROBABILISTIC MODEL & ANALYSIS

Suppose that there are $R$ classes of customers, each arriving according to an independent Poisson process with rate $\lambda_r$, $r = 1, \ldots, R$. To capture the flexibility for customers to increase or decrease resource requirements on the fly, we consider the following probabilistic model. After holding service template $r$ for a random amount of time, a class $r$ customer upgrades or downgrades to class $r'$ with probability $p_{rr'}$ and terminates the service with probability $1 - \sum_{r'=1}^{R} p_{rr'}$. The holding times for class $r$ customers are i.i.d. random variables with mean $1/\mu_r$. Suppose there are $J$ types of resources, and each base instance consists of $A_j$ units of type $j$ resource, $j = 1, \ldots, J$. The service template of class $r$ contains $b_r$ units of such base instances. Suppose that for class $r$ there is an SLA requiring that the service be available with probability $1 - \epsilon_r$. If the system has an uncapacitated resource pool, it can be shown that the number of class $r$ customers in steady state, denoted by $Q_\infty^{(r)}$ and called the offered load, is Poisson distributed and independent of the other classes.

1. Existing Approach: Normal Approximation

This problem can be mapped to the multi-class bandwidth model in [4]. Denote the mean offered loads by $\nu = (\nu_1, \ldots, \nu_R)$. The total offered load in terms of the number of base instances, denoted by $Y$, is given by $Y = \sum_{r=1}^{R} b_r Q_\infty^{(r)}$, with mean $\mu = \sum_{r=1}^{R} b_r \nu_r$ and variance $\sigma^2 = \sum_{r=1}^{R} b_r^2 \nu_r$. The existing approach in [4] is based on a Normal approximation and gives the following Asymptotic Rule of Thumb:

$$C \ge \mu + \sigma \max_r \psi(\epsilon_r \sigma / b_r), \tag{1}$$

where $\psi = (\phi/\Phi)^{-1}$ is the inverse function of $\phi/\Phi$, and $\phi$, $\Phi$ are the pdf and cdf of the standard normal distribution, respectively. We show later that this provisioning solution provides poor SLA guarantees, with QoS violation possible at almost all load conditions.

2. New Approach: Poisson Approximation

By applying a Poisson approximation [3] to the distribution of $Y$, we provide the following provisioning solution, which is not only asymptotically exact but also provides better QoS guarantees than the asymptotic rule of thumb.

Proposition 1. Under large offered load $\nu$, in order to provide service availability $1 - \epsilon_r$ for all $r$, it suffices to set the total capacity $C$ such that

$$C^* = \min\left\{ C : U(k\mu, kC) \le \frac{1}{k} \min_r \frac{\epsilon_r}{b_r} \right\},$$

where $k = \mu/\sigma^2$ and $U(k\mu, kC)$ is the upper bound for the Erlang-B formula given by [5]:

$$U(\nu, C) := \left[ \sqrt{C}\, \frac{\Phi(\alpha)}{\phi(\alpha)} + \frac{2}{3} \right]^{-1},$$

with $\alpha = \operatorname{sgn}(1 - \rho) \sqrt{-2C(1 - \rho + \ln \rho)}$ and $\rho = \nu/C$. The provisioning solution $C^*$ provides better QoS guarantees than the asymptotic rule of thumb and also becomes asymptotically exact as $\nu$ grows large.

3. NUMERICAL RESULTS

We present two numerical examples comparing the provisioning solutions derived using the two approximations.

Achieved QoS Comparison. Suppose a service provider offers only a single class of service: each customer requires 1 unit of the base instance upon entrance, holds it for an exponentially distributed amount of time with mean 1, then leaves. We use this example to demonstrate the difference in the achieved QoS when provisioning using (1), based on the Normal approximation, versus using our Poisson approximation. We denote the resulting provisioning solutions by $C_n$ and $C_p$, respectively. In Figure 1 (resp. Figure 2), we plot (dashed line) the resulting blocking probabilities as a function of $\nu$ when the capacity $C$ is provisioned using the Normal approximation (resp. the Poisson approximation) under a fixed target blocking probability $\epsilon = 0.05\%$ (shown as the horizontal line). As shown in Figure 1, when using the Normal approximation, the achieved blocking probability approaches the target as the offered load grows, yet service level violation always happens. On the other hand, when provisioning with the upper bound approach under the Poisson approximation, as shown in Figure 2, all the achieved blocking probabilities meet the desired QoS target.

Figure 1: Blocking probability versus offered load using Normal approximation. (Plot of achieved blocking probability against offered load, 0 to 2000, with the target QoS of 0.05% marked.)

Figure 2: Blocking probability versus offered load using Poisson approximation. (Plot of achieved blocking probability against offered load, 0 to 2000, with the target QoS of 0.05% marked.)

Asymptotic Performance Comparison. Consider a cloud service provider that offers three different types of templates: basic, medium and large, defined as $b_1$, $b_2$, $b_3$ units of basic instances, respectively. Assume there is a sequence of offered loads scaled by a factor $N$, with $b^N \equiv (1000, 40, 8)$, $\nu^N = (3.7241N, 3.1586N, 3.1172N)$ and $\epsilon^N \equiv (1\%, 0.5\%, 0.1\%)$. Figure 3 plots the relative percentage difference between the resulting $C_n(N)$ and $C_p(N)$. Observe that the percentage difference diminishes as the offered load grows. This is consistent with the theory, since the two approximations are asymptotically close to each other as the offered load grows. Also observe that $C_p$ always dominates $C_n$; therefore, when using the Poisson approximation, a large amount of penalty cost can be avoided.

Figure 3: % difference in capacity between Normal and Poisson approximations. (Plot of % difference against scaling factor N, 0 to 500.)

4. CONCLUSION AND EXTENSIONS

We present solutions for provisioning at the instance level, which generally yields better SLA guarantees. This setting is certainly restrictive, as all demand must be specified as multiples of the base instance. For the general scenario, we can map the provisioning problem to a link capacity planning problem in a stochastic loss network model with constraints on the blocking probabilities. It is also possible to extend the offered load process to a nonstationary Poisson process and carry out our asymptotic provisioning methodology.

5. REFERENCES

[1] M. Armbrust, A. Fox, R. Griffith, A. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia. A view of cloud computing, April 2010.
[2] Amazon elastic compute cloud. http://amazon.com/ec2.
[3] M. Fay and E. Feuer. Confidence intervals for directly standardized rates: a method based on the gamma distribution. Statistics in Medicine, 16:791–801, 1997.
[4] R. C. Hampshire, W. A. Massey, D. Mitra, and Q. Wang. Provisioning of bandwidth sharing and exchange. Telecommunications Network Design and Management, 207–226, 2003.
[5] A. Janssen, J. V. Leeuwaarden, and B. Zwart. Gaussian expansions and bounds for the Poisson distribution applied to the Erlang B formula. Advances in Applied Probability, 40:122–143, 2008.

Copyright is held by the author/owner(s). SIGMETRICS’12, June 11–15, 2012, London, England, UK. ACM 978-1-4503-1097-0/12/06.
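As an illustration of the two provisioning rules in Section 2, the following Python sketch computes a capacity from the asymptotic rule of thumb (1) and from Proposition 1, then evaluates the exact Erlang-B blocking each capacity achieves in the single-class example. It is a minimal sketch under stated assumptions, not the authors' implementation. All function names (`erlang_b`, `upper_bound`, `psi`, `load_moments`, `provision_normal`, `provision_poisson`) are hypothetical. The inverse ψ is found numerically by bisection, capacities are rounded up to integers, and C∗ is found by a plain linear scan starting at the mean load μ. The bound U and the 1/k factor in Proposition 1 are coded as read from the source formulas and should be checked against [5]. The load value 1000 is an arbitrary point in the plotted range.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # phi = pdf, Phi = cdf of the standard normal


def erlang_b(load, servers):
    """Exact Erlang-B blocking probability (standard stable recursion)."""
    blocking = 1.0
    for n in range(1, servers + 1):
        blocking = load * blocking / (n + load * blocking)
    return blocking


def upper_bound(load, capacity):
    """U(nu, C) = 1 / (sqrt(C) * Phi(alpha) / phi(alpha) + 2/3), rho = nu / C.

    Assumption: the bound as read from the source text; verify against [5].
    """
    rho = load / capacity
    # 1 - rho + ln(rho) <= 0 for every rho > 0, so the radicand is nonnegative.
    radicand = max(0.0, -2.0 * capacity * (1.0 - rho + math.log(rho)))
    alpha = math.copysign(math.sqrt(radicand), 1.0 - rho)
    ratio = STD_NORMAL.cdf(alpha) / STD_NORMAL.pdf(alpha)
    return 1.0 / (math.sqrt(capacity) * ratio + 2.0 / 3.0)


def psi(y):
    """Numerical inverse of phi/Phi by bisection on [0, 40].

    phi/Phi is strictly decreasing and is about 0.798 at 0, so this sketch
    only handles 0 < y < 0.79 (small blocking targets).
    """
    if not 0.0 < y < 0.79:
        raise ValueError("psi sketch expects 0 < y < 0.79")
    lo, hi = 0.0, 40.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if STD_NORMAL.pdf(mid) / STD_NORMAL.cdf(mid) > y:
            lo = mid  # ratio still too large: the root lies to the right
        else:
            hi = mid
    return hi


def load_moments(b, nu):
    """Mean and variance of Y = sum_r b_r * Q_r with Q_r ~ Poisson(nu_r)."""
    mu = sum(bi * ni for bi, ni in zip(b, nu))
    var = sum(bi * bi * ni for bi, ni in zip(b, nu))
    return mu, var


def provision_normal(b, nu, eps):
    """Rule of thumb (1): C >= mu + sigma * max_r psi(eps_r * sigma / b_r)."""
    mu, var = load_moments(b, nu)
    sigma = math.sqrt(var)
    return math.ceil(mu + sigma * max(psi(e * sigma / bi) for e, bi in zip(eps, b)))


def provision_poisson(b, nu, eps):
    """Proposition 1: smallest C with U(k*mu, k*C) <= (1/k) * min_r eps_r / b_r."""
    mu, var = load_moments(b, nu)
    k = mu / var
    target = min(e / bi for e, bi in zip(eps, b)) / k
    capacity = math.ceil(mu)  # linear scan upward from the mean load
    while upper_bound(k * mu, k * capacity) > target:
        capacity += 1
    return capacity


if __name__ == "__main__":
    # Single-class example: b = 1, target blocking 0.05%, offered load 1000.
    b, nu, eps = [1], [1000.0], [5e-4]
    c_n = provision_normal(b, nu, eps)
    c_p = provision_poisson(b, nu, eps)
    print("Normal rule: ", c_n, "achieved blocking", erlang_b(nu[0], c_n))
    print("Poisson rule:", c_p, "achieved blocking", erlang_b(nu[0], c_p))
```

In the single-class case k = 1, so the Poisson rule reduces to finding the smallest C with U(ν, C) ≤ ε. Because U bounds the Erlang-B formula from above, the exact blocking at that capacity cannot exceed the target. For multi-class inputs with very large loads, the linear scan could be replaced by a bisection on C.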
