2012 IEEE International Conference on Cluster Computing Workshops

Virtual Machine Proactive Scaling in Cloud Systems

Ahmed Sallam 1,2, Kenli Li 1
1 The National Supercomputing Center in Changsha & College of Information Science and Engineering, Hunan University, Changsha, P.R. China 410082
2 Faculty of Computers and Informatics, Suez Canal University, Ismailia, A.R. Egypt
Email: [email protected]

therefore reduce the amount of hardware in use and improve resource utilization, improve fault and performance isolation between applications sharing the same resources, make it relatively easy to move VMs from one physical host to another using live or off-line migration, and support hardware and software heterogeneity [5]. Generally, the applications exploiting the power of cloud systems are characterized by dynamic behavior due to factors such as workload, dependencies, and the operative data-center region. This dynamic fluctuation degrades resource utilization; moreover, large-scale computing data centers consume enormous amounts of electrical power despite improvements in hardware. Most techniques used to handle application fluctuations are reactive models; these models depend on a configured set of heuristics, with thresholds predefined at Service Level Agreement (SLA) time, to control the scaling reactions to dynamic changes in application behavior. Amazon, for example, provides strategies for load balancing replicated VMs via its Elastic Load Balancing capabilities [6]. These reactions (adaptations) can take different forms, such as Dynamic Voltage and Frequency Scaling (DVFS) [7], adding new server replicas and balancers to distribute load among all available replicas, or changing the resources assigned to an already running instance, for instance granting more physical CPU to a running virtual machine. Such models help to cope with dynamic application change; however, where the application phase behavior is very dynamic (e.g. social networks), reactive systems can result in poor performance and in infrequent peak loads that drive the average utilization of resources down.
Fortunately, behavior prediction models are a potential solution to this problem: they predict the global patterns of such applications by tracing recently observed patterns, or statistics of the observed application features, which can then guide dynamic management decisions. Likewise, these models can be used to predict the behavior of VMs in the Cloud and to produce acceptable management decisions for adapting VMs, as will be presented in the following sections.

Abstract—Although investment in Cloud Computing has grown incredibly in the last few years, the technologies offered for dynamic scaling in Cloud systems satisfy neither today's fluky applications (e.g. social networks, web hosting, content delivery) that exploit the power of the Cloud, nor the energy challenges posed by its data centers. In this work we propose a proactive model, based on an application-behavior prediction technique, to predict the future workload behavior of the virtual machines (VMs) executed on Cloud hosts. The predicted information helps VMs to be adapted dynamically and proactively to satisfy the provider's demands, in terms of increasing utilization and decreasing power consumption, and to enhance the services, in terms of improving performance with respect to Quality of Service (QoS) requirements and dynamically changing demands. We have tested the proposed model using the CloudSim simulator, and the experiments show that our model is able to avoid undesirable situations caused by dynamic changes (such as peak loads and low utilization) and can decrease the losses from energy consumption, overheating, and resource wastage by up to 45% on average.
Keywords—Cloud Computing; Virtual Machine; Performance Prediction Models; SMM; CloudSim

I. INTRODUCTION
Even though there are many rough challenges (e.g. security and reliability) that threaten migration to the Cloud, the utility solutions of Cloud Computing have become a fact in the IT industry and cannot be ignored; at the same time, dozens of research efforts are under way to address these challenges. One notable definition of the Cloud is "An information processing model in which centrally administered computing capabilities are delivered as services, on an as-needed basis, across the network to a variety of user-facing devices". These services are offered in the form of Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Software as a Service (SaaS) [1], and recently some research discusses Network as a Service (NaaS) [2]. In fact, Cloud Computing overruns an existing technology called Grid Computing, which offers access to many heterogeneous resources; however, a user typically needs a very specific environment, customized to support specific requirements or legacy applications, while resource providers clearly cannot support the diversity of all required environments and users are often unable to use what is available [3, 4]. Virtualization has been recognized as the potential solution to these problems and as an important reason for Cloud Computing surpassing Grid Computing. In addition, virtualization techniques allow one to create several virtual machines (VMs) on a physical server and,

978-0-7695-4844-9/12 $26.00 © 2012 IEEE DOI 10.1109/ClusterW.2012.17

II. EXISTING TECHNOLOGIES AND RELATED WORKS
To the best of our knowledge, all the active technologies in the Cloud market and the running research activities perform

the scaling operation based on heuristic rules (thresholds). Current technology depends on two techniques to handle dynamic-change issues: Dynamic Scaling and Auto-Scaling. Dynamic Scaling is the ability to add and remove resources in the Cloud infrastructure, while Auto-Scaling is the ability, given Cloud infrastructure management tools configured with a set of heuristic rules, to add and remove resources based on actual usage, with no human intervention necessary. Even though the term auto-scaling sounds brilliant, most market Clouds cannot respond fast enough to perform scaling operations. Amazon Cloud [6], for example, has a component called CloudWatch, a web service that provides monitoring for AWS Cloud resources, starting with Amazon EC2. It provides visibility into resource utilization, operational performance, and overall demand patterns, including metrics such as CPU utilization, disk reads and writes, and network traffic [8]. Thus, the CloudWatch web service triggers the auto-scaling operations when the monitored data matches a pre-set of heuristic rules (Figure 1).
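For illustration, a threshold-driven reactive rule of this kind can be sketched in a few lines. The thresholds, step size, and function name below are ours, not AWS defaults:

```python
# Minimal sketch of a reactive (threshold-based) auto-scaling rule of the
# kind triggered by monitoring alarms; thresholds and step are illustrative.
def reactive_scale(cpu_util, n_instances, high=0.80, low=0.40, step=1):
    """Return the new instance count after applying the heuristic rules."""
    if cpu_util > high:                      # overload: scale out
        return n_instances + step
    if cpu_util < low and n_instances > 1:   # underload: scale in
        return n_instances - step
    return n_instances                       # within the band: no action
```

Note that such a rule only reacts after the threshold has already been crossed, which is exactly the limitation the proactive model targets.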

the services using an external provider without notice by the users or effect on the service workload. By contrast, a few research efforts adopt prediction techniques for scaling in Cloud systems, but for different purposes. In [12] the authors designed VMM monitoring tools and measured the workload of each domain, predicting the CPU usage of each domain in order to dynamically increase and decrease its CPU resources and satisfy quality of service (QoS) for multimedia streams; however, they depend on feedback to reserve the proper CPU slice, which is automation rather than prediction and obviously suffers from delay, contradicting their proposed contribution. In another model, [13], the authors used Kernel Canonical Correlation Analysis (KCCA) to predict the execution time of MapReduce jobs and the query performance in parallel databases. In [14] the authors introduced a model based on a "Global Phase History Table Predictor"; this technique simply stores the changes of certain behaviors of an application in a history table where each row represents a pattern of the changes. Every time a new pattern is observed, the model tries to find a match in the table to predict the next phase, or stores the new pattern if no match occurs. The problems with these techniques are: I) they are unable to predict global long-range patterns; II) they are unable to model patterns of varying length.
The rest of this paper is organized as follows. In Section III we summarize the motivation. In Section IV we explain the SMM model in detail. Section V contains the implementation procedure of the proposed model. In Section VI we explain how load balancing is done in the virtualized environment represented by CloudSim. Our experiments and results analysis are presented in Section VII, followed by a discussion of an important factor for prediction performance in Section VIII; finally, in Section IX we conclude and outline the future work plan.
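A fixed-length phase-history-table predictor in the spirit of [14] can be sketched as follows; the function names and dict layout are ours, and the back-off to the last value on a miss follows the description above:

```python
# Sketch of a fixed-length phase history table predictor: record which
# phase followed each observed pattern of n phases; on a miss, back off
# to the last observed value. Note the fixed n: this is exactly why such
# tables cannot model patterns of varying length.
def pht_update(history, table, n=3):
    """Record the pattern of n phases that preceded the newest phase."""
    if len(history) > n:
        table[tuple(history[-n-1:-1])] = history[-1]

def pht_predict(history, table, n=3):
    """Predict the next phase; back off to the last value on a miss."""
    return table.get(tuple(history[-n:]), history[-1])
```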

Unfortunately, EC2 instances can take up to 10 minutes to launch: 10 minutes between when the cloud infrastructure management tool detects the need for extra capacity and the time that capacity is actually available [9]. In the same direction, the Windows Azure platform supports the concept of elastic scale through a pricing model based on hourly compute increments. By changing the service configuration, either by editing it on the portal or through the Management API, customers can adjust on the fly the amount of capacity they are running; however, this ability is not limitless, and only storage capacity can be added and removed dramatically. Academically, many recently proposed scaling models are also based on thresholds. Among them, in [10] the authors introduce a scaling strategy for web applications in Cloud systems using a front-end load balancer to dynamically route user requests to back-end web servers that host the web application; the number of web servers scales automatically according to a threshold on the number of active sessions in each web server instance, so as to maintain service quality requirements. Another technique, proposed in [11], deploys a generic virtualization layer between the service and the physical infrastructure to give additional capacity to

III. MOTIVATION
Our goal is to build a proactive adaptation model that observes workload behaviors, such as request arrival pattern, service time distributions, and memory usage, records the dynamic changes of varying-length execution patterns, and then analyzes the recorded patterns using an adapted version of Statistical Metric Modeling (SMM) [15]. This helps to take the right decision early, based on the analysis results, to adapt the VMs' resources. Referring to [2], VM resource adaptation can take the form of horizontal scaling (i.e. adding new server replicas and load balancers to distribute load among all available replicas) or vertical scaling (on-the-fly changing of the resources assigned to an already running instance). Practically, most common operating systems do not support the on-the-fly (without rebooting) changes to the available CPU or memory that such "vertical scaling" needs. The DVFS technique can achieve vertical scaling within limits; however, it requires special, non-cheap server equipment and technologies, which contradicts the trend of building the Cloud from commodity hardware. Thus, application-behavior prediction techniques can be a better alternative. We can list our contributions as follows:

[Figure 1: example heuristic scaling rules, e.g. "Scale out: if M > 80, by 10%"]

Algorithm 1 Update history
1) UpdateHistory(buffer, history, bins, samples)
2) //split buffer into pattern and next
3) If buffer.length > 1 then
4)   pattern = buffer[first:last-1]
5)   next = buffer[last]
6) Else
7)   next = buffer[first]
8) Endif
9) //find "pattern+next" in history
10) While pattern.length > 0 do
11)   If history(pattern+next) then
12)     history.update(pattern+next, frequency)
13)     Exit While //a match was found
14)   Else
15)     history.insert(pattern+next, 1)
16)     pattern = pattern[2:last] //drop the first sample
17)   Endif
18) EndWhile
19) If bins(next) then
20)   bins.update(next, frequency)
21) Else
22)   bins.insert(next, 1)
23) Endif
24) samples++
25) EndUpdateHistory

utilization U_CPU, and network utilization U_net. To combine the three criteria we use the load volume notation introduced in [19], formulated as
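The UpdateHistory pseudocode of Algorithm 1 can be sketched in Python as follows, with plain dicts standing in for the paper's SQL-backed history and bins tables (names and data layout are ours):

```python
# Sketch of Algorithm 1 (UpdateHistory): split the buffer into a pattern
# and the sample that followed it, bump the frequency of a known
# (pattern, next) record or insert new ones while shortening the pattern,
# and count the frequency of each quantized bin.
def update_history(buffer, history, bins, samples):
    """buffer: last n quantized samples; history: (pattern, next) -> freq;
    bins: sample -> freq; samples: total logged samples so far."""
    if len(buffer) > 1:
        pattern, nxt = tuple(buffer[:-1]), buffer[-1]
    else:
        pattern, nxt = (), buffer[-1]
    while len(pattern) > 0:
        key = (pattern, nxt)
        if key in history:
            history[key] += 1        # known pattern: bump frequency, stop
            break
        history[key] = 1             # new pattern: record with frequency 1
        pattern = pattern[1:]        # then retry with the first sample dropped
    bins[nxt] = bins.get(nxt, 0) + 1 # per-bin frequency
    return samples + 1
```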

Similar to language modeling, the value resulting from Equation 5 can be zero for any sequence s^i_{i-n+1} unseen in the history. The solution to such situations in language modeling is to use smoothing models [14, 18]; however, in contrast to language modeling, where the history estimations are static and fully dependent on the collection of text or speech in the corpus, the History in the SMM model is dynamic and updated instantly with every newly logged sample, as will be described later in Section V. This means the SMM model only has this problem at the first occurrence of an unseen logged sample during the training period, and it disappears automatically with further logged samples. However, we still have the problem of predicting the next value for such new unseen samples; thus, smoothing cannot help in this case. In the history-table prediction method [14], the predictor backs off to the Last Value when meeting such situations; however, as mentioned before, the logged data are numerical values, so we can improve on this step by predicting the next value for each unseen sample with Equation 6:

s_i = s_{i-1} + ( Σ_{k=i-n}^{i-1} (s_k − s_{k-1}) ) / n    (6)

where n is the pattern length; that is, the next value equals the last seen value adjusted by the average of the differences between the n samples preceding the current sample.
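The Equation 6 fallback is a one-liner in code; this sketch (our naming) takes the most recent samples oldest-first:

```python
# Equation 6 fallback for unseen patterns: predict the last value plus
# the mean of the first differences of the n preceding samples.
def fallback_predict(recent, n):
    """recent: at least n+1 most recent samples, oldest first."""
    diffs = [recent[k] - recent[k - 1]
             for k in range(len(recent) - n, len(recent))]
    return recent[-1] + sum(diffs) / n
```

Since the sum of differences telescopes, this continues the average trend of the window: a steadily rising window keeps rising, a flat window stays flat.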

LV = 1/(1 − U_mem) · 1/(1 − U_CPU) · 1/(1 − U_net)    (7)
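Equation 7 translates directly to code; note how the load volume grows without bound as any one utilization approaches 1:

```python
# Load volume (Equation 7): a combined scalar measure of memory, CPU and
# network utilization, each in [0, 1).
def load_volume(u_mem, u_cpu, u_net):
    return 1.0 / ((1 - u_mem) * (1 - u_cpu) * (1 - u_net))
```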

V. IMPLEMENTATION
Most real-world Cloud application services, such as social networking, gaming portals, business applications (e.g., SalesForce.com), media content delivery, and scientific workflows, show dynamic workload changes closely related to the usage behaviors of users during a day, a week, or a season. Although the workload fluctuates rapidly, these changes form repeated sequences which can be logged and handled much as sentences are in language modeling. Unlike language modeling, the samples in the SMM model are real values; thus, we need to quantize these values first in order to apply the SMM model. In this work we consider 50 quantized bins to represent the percentage increases of load volume. The SMM model history is kept in three structured placements: the first structure records a (Pattern, Next, Frequency) format to capture patterns of different lengths, the second records the unique samples (quantized bins), and the third counts the logged samples; see Figure 3. Whenever a new sample is logged, the model performs two steps: first, update the history with the last logged value following Algorithm 1, and then predict the next value following Algorithm 2. In Algorithm 1 the update is carried out by splitting the input "buffer", which holds the last n recorded samples, into two placements: a "pattern" of n − 1 samples and a "next" containing the last sample of the buffer. Next, the algorithm tries to find a record with "pattern" and "next" as entries in the input history, which has
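The quantization step might look like the following sketch. The paper only states that 50 bins are used; the uniform 2% bin width and function name here are our assumptions:

```python
# Illustrative quantization: map a load-volume increase percentage in
# [0, 100] onto one of 50 equal-width bins (assumed 2% each).
def quantize(increase_pct, n_bins=50):
    """Return a bin index in [0, n_bins)."""
    return min(int(increase_pct / (100.0 / n_bins)), n_bins - 1)
```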

B. Model criteria
To apply the SMM model in a Cloud environment we need criteria to be logged that represent the behavior of the workload; in practice, three criteria directly affect the data-center workload: memory utilization U_mem, CPU

solutions are available for allocating virtual machines during their operation time to optimize the actual server workload (e.g. VMware DRS, VirtualIron LiveCapacity). This operation is represented in CloudSim as follows: the DataCenterController [20] uses a VmLoadBalancer to determine which VM should be assigned the next request for processing. The most common VmLoadBalancers are the active-monitoring load-balancing algorithms mentioned before in Section I. The ActiveVmLoadBalancer maintains information about each VM and the number of requests currently allocated to it. When a request to allocate a new VM arrives, it identifies the least loaded VM; if there is more than one, the first identified is selected. The ActiveVmLoadBalancer returns the VM id to the DataCenterController, which sends the request to the VM identified by that id and then notifies the ActiveVmLoadBalancer of the new allocation. This process becomes complex with frequent workload changes: when the ActiveVmLoadBalancer cannot take the right decision, based on the available monitored information, to dispatch incoming requests to the proper VMs, peak loads occur. In this situation the DataCenterController tries to handle matters by selecting one or more target VMs and scaling the available resources to absorb the coming workloads. Scaling the available resources can take the form of increasing the shared resource (CPU, memory) slices on the same machine, or of migrating to another machine with more power. Hereafter we call this method Active Monitoring Scaling, to distinguish it from scaling based on our proactive model in the following section.
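The least-loaded selection rule described above can be sketched as follows; the mapping from VM ids to request counts is our own stand-in for the balancer's internal state:

```python
# Sketch of the ActiveVmLoadBalancer selection rule: pick the VM with the
# fewest currently allocated requests; on a tie, the first one found wins.
def select_vm(allocations):
    """allocations: dict of VM id -> number of allocated requests."""
    best_id, best_load = None, None
    for vm_id, load in allocations.items():
        if best_load is None or load < best_load:  # strictly fewer keeps the first tie
            best_id, best_load = vm_id, load
    return best_id
```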


VII. EXPERIMENTS AND RESULTS EVALUATION
We performed extensive simulation experiments using the CloudSim toolkit [20]. The reasons behind choosing to work

Figure 3: Editing process in SMM History

been exhibited in Figure 3. If a match is found, the algorithm increases the frequency assigned to that record; if not, the algorithm saves "pattern" and "next" as a new record with frequency 1, then truncates the first sample from "pattern" and repeats the process until a match is found. Algorithm 2 predicts the next value for the input "pattern", containing the last n − 1 recorded samples, by finding a record with "pattern" as an entry. If a single match is found, the algorithm returns the "next" entry of that record, while if more than one match is found, it returns the "next" entry of the record with the greatest probability. On the other hand, if no match is found, Equation 6 is used to calculate an average value based on the direction of change of the last n samples. We express the search process in both algorithms as greedy procedures for simplicity; however, the real implementation is performed with SQL queries, so the complexity of the search and insert operations actually depends on the SQL engine and is mostly less than O(log n). It is worth mentioning that we adopt n = 6, as will be shown later in Section VII-A.

Algorithm 2 Next value prediction
1) Prediction(pattern, history)
2) lsample = pattern[last] //last sample
3) next = null
4) While pattern.length > 0 do
5)   //find records matching "pattern" in history
6)   records = find(pattern) in history
7)   If records.number >= 1 then
8)     next = records.maxByFrequency().next //biggest probability
9)     Exit While
10)  Else
11)    pattern = pattern[2:last] //drop the first sample
12)  Endif
13) EndWhile
14) If next = null then
15)   next = Equation6(lsample) //no suffix matched
16) Endif
17) Return next
18) EndPrediction
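The prediction step can be sketched in Python over the same (pattern, next) → frequency dict layout used for the history; the scan over the dict stands in for the paper's SQL query, and `fallback` represents the Equation 6 value computed by the caller:

```python
# Sketch of Algorithm 2: return the most frequent follower of the given
# pattern, shortening the pattern on a miss; if no suffix ever matched,
# return the caller-supplied fallback (the Equation 6 estimate).
def predict_next(pattern, history, fallback=None):
    """history: dict of (pattern, next) -> frequency."""
    pattern = tuple(pattern)
    while len(pattern) > 0:
        matches = {key[1]: freq for key, freq in history.items()
                   if key[0] == pattern}
        if matches:
            return max(matches, key=matches.get)  # biggest frequency wins
        pattern = pattern[1:]                     # retry with a shorter pattern
    return fallback                               # unseen pattern
```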
