2015 IEEE World Congress on Services

Performance Evaluation of Virtual Machines Instantiation in a Private Cloud

Eliomar Campos*, Rubens Matos*, Paulo Maciel*, Igor Costa*, Francisco Airton Silva*, Francisco Souza†
*Federal University of Pernambuco, Center of Informatics, Recife, Brazil
†Federal University of Piauí, Center of Informatics, Teresina, Brazil
Email: *{egc2, rsmj, prmm, ioc, faps}@cin.ufpe.br, †[email protected]

978-1-4673-7275-6/15 $31.00 © 2015 IEEE. DOI 10.1109/SERVICES.2015.55

Abstract—Elasticity is an outstanding concept of cloud computing, usually deployed through mechanisms such as auto scaling and load balancing. Cloud-based applications are able to adapt dynamically to the workload behavior thanks to such mechanisms. The efficient instantiation of Virtual Machines (VMs) is one requirement for the elastic behavior of cloud-based applications. This study characterizes the performance of VM instantiation in a private cloud platform, considering distinct factors such as VM type, VM image size, and VM caching. We employed a full factorial design of experiments (DoE) to compute the effect and relevance of the factors as well as their interactions. Our experimental results show that the cache factor has an impact of 45.07 % on the total instantiation time, whereas the machine image (MI) size has an impact of 26.45 % and the VM type of only 1.05 %. The results of these experiments are also used as input parameters in a Markov chain model for sensitivity analysis. The model evaluation showed that, for 5 GB and 8 GB MIs, the probability of finding the MI in the cache must be at least 40 % and 60 %, respectively, to achieve an average instantiation time of 300 seconds. For an MI of size 2 GB, such time is not exceeded even with the cache disabled. This analysis allows checking the impact of every parameter on the system response time and pointing out effective ways to improve performance. Such conclusions may be used as decision support for systems which often instantiate new VMs, including those using elasticity features such as auto scaling.

Keywords—Cloud computing; Eucalyptus platform; Virtual machine instantiation; Auto scaling; Performance evaluation; Analytical modeling.

I. INTRODUCTION

Cloud computing provides on-demand access to shared computing resources and services, such as network infrastructure, storage, operating systems, and applications. Such resources and mechanisms can be easily acquired and released with minimal management effort [1]. These features enable administrators to focus only on the business model, without worrying about infrastructure details [1], [2]. The experience of acquiring cloud services is often compared to the consumption of public utilities, as mentioned in [3] through the following question: “How many individuals or companies prefer generating all its electricity instead of buying it from a commercial supplier of electricity?”

Despite the benefits of acquiring services from third-party providers, some companies prefer investing in private clouds to get the advantages of flexible and efficient usage of physical resources while keeping control over their data. Such a solution also allows a smoother transition to the cloud paradigm before a complete migration to a public cloud, as well as the setup of a hybrid cloud, where both infrastructures – private and public – are used simultaneously [4].

Auto scaling and elastic load balancing are important mechanisms to allow flexible allocation of resources and to enforce the fulfillment of service level agreements (SLAs) in environments with highly varying workloads [5], [6]. Applications running on cloud environments, and using auto scaling and load balancing features, are designed to instantiate or terminate VM instances according to the current workload level. Such behavior avoids the waste of idle resources (e.g., memory, CPU, disk space, power) in periods of low load, while enabling a fast increase of computational power when facing a burst of high load [5], [7]. The efficient instantiation of VMs is one requirement for the elastic behavior of cloud-based applications. Therefore, this study characterizes the performance of VM instantiation in an Eucalyptus private cloud [4], [7]. Our analysis considers distinct scenarios, varying factors such as VM type, VM image size, and VM caching on physical nodes. The impact of those factors (and their interactions) on the total instantiation time is analyzed through a full factorial design of experiments [8], [9]. In the second step of our performance evaluation, we build a Continuous Time Markov Chain (CTMC) model [10] and use it for sensitivity analysis. One specific study we chose is to check the behavior of the instantiation time for intermediate levels of VM caching, considering different sizes of the virtual machine image (MI). These techniques help in finding effective points for improvement of performance measures such as system response time. Such conclusions may be used as decision support for mechanisms of auto scaling and load balancing in cloud platforms.

The remainder of this paper is organized as follows. Section II describes related work in the area of performance evaluation of cloud computing. Section III highlights the main concepts of performance evaluation and modeling formalisms. Section IV presents the methodology employed for the analyses. Section V describes the analyses and discusses the results obtained. Section VI summarizes the conclusions and possible future works.

II. RELATED WORK

Performance evaluation of cloud computing systems has been the topic of many recent works; however, a significant portion of this research focuses on the operating performance of VMs with specific workloads [2], [11]. In [2], Sousa et al. evaluated the performance of distinct VM types in the Eucalyptus platform, using well-known benchmarks. Their analysis makes it possible to assess quality of service and prevent performance degradation related to workload fluctuations in private clouds. Iosup et al. [11] analyzed the performance of cloud computing services for the execution of scientific computing workloads. They concluded that the cloud services analyzed at that time needed an order of magnitude of performance improvement to be useful to the scientific community.

Figure 1: Conceptual representation of the Eucalyptus platform.

Related articles on evaluating the performance of scalability and elasticity mechanisms have also been published, such as [6], [12], [13]. In [6], Gusev et al. characterize the behavior of CPU utilization as the number of CPU cores scales. Ghosh et al. [12] developed a CTMC model for stochastic analysis of the performance of an Infrastructure-as-a-Service (IaaS) cloud. Suleiman et al. [13] propose an analytical model that enables the study of performance under different elasticity rules.

Our work contributes to the field by evaluating VM instantiation performance, a study not covered in the papers mentioned above. The results of this work are useful to tune private cloud systems and speed up the time required to get a new VM instance running. The data and modeling presented here might also contribute to other, broader analyses.

III. BACKGROUND

This section presents the main concepts of the Eucalyptus cloud computing platform [14], performance evaluation with the DoE technique, and analytical modeling using CTMC.

A. Eucalyptus Cloud

Eucalyptus is an acronym for Elastic Utility Computing Architecture Linking Your Programs to Useful Systems. It is an open-source platform for private and hybrid clouds, developed at the University of California, Santa Barbara, enabling businesses and research centers to create cloud environments tailored to their specific needs [14], [15]. Eucalyptus is interface-compatible with Amazon Web Services (AWS) [16], which is important to build hybrid clouds, offering several ways to implement, manage, and maintain collections of virtual resources (machines, network, and storage) [15]. It is compatible with various Linux distributions, including Ubuntu, Red Hat Enterprise, OpenSuse, Debian, Fedora, and CentOS [14], [17].

Figure 1, adapted from [17], shows the main concepts of a Eucalyptus cloud. There are five high-level architectural components, each with its own web services interface and application programming interface (API), used for intercommunication between components and also to receive end-user commands [17]. The Cloud Controller (CLC) is the front-end controller of the entire cloud infrastructure, responsible for presenting and managing a unified view of virtualized resources (servers, storage, and network). The Cluster Controller (CC) manages the services of a single cluster, collects data about the resources of each physical node, and schedules the instantiation of VMs. The Node Controller (NC) is a component installed on each physical machine intended to run VM instances. The NC communicates with a hypervisor (e.g., KVM, Xen, VMware) to control the execution of VMs. The Storage Controller (SC) provides a block-level storage service (similar to Amazon Elastic Block Storage – EBS) [18], by means of remote volumes that may be attached to the VMs. The Walrus consists of a file-level storage system that extends across the cloud and is similar to the Amazon Simple Storage Service (S3) [19] in terms of functionality [14], [15], [17].

B. Performance Evaluation of Systems

The performance evaluation of a system usually aims to verify the behavior of the system according to established performance metrics, identify possible bottlenecks, and propose improvements. The selection of appropriate evaluation techniques and metrics is fundamental to achieve such goals [20].

There are three major techniques for performance evaluation: analytical modeling, simulation, and measurement [20]. This paper employs measurements based on DoE and analytical modeling through CTMC.

1) Design of Experiments: The DoE technique allows obtaining a maximum of information regarding many factors with a reasonable number of experiments and effort [20], [9]. A set of experiment executions planned through DoE can be analyzed to determine whether the factors have significant effects, or whether the differences in the observed effects are due to variations caused by measurement errors and uncontrolled parameters [8], [20], [9]. This technique has been widely adopted for sensitivity analysis [21].

This study adopts the full factorial design, which uses all possible combinations of levels for all factors, i.e., there are no limits on the number of factors or the number of levels. This type of DoE allows every configuration to be examined, so we can find the effects of all factors and their interactions, which is an advantage; the disadvantage is that the cost of analysis can be very high if the number of factors and levels is too large. However, considering that each of these experiments may have to be repeated several times, it is possible to reduce the number of experiments by reducing the number of factors and/or the number of levels for each factor, or by using a fractional factorial design instead [20].
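As a concrete illustration, the size of a full factorial design can be enumerated directly; a minimal Python sketch using the factors and levels of this study (the dictionary keys are our own naming):

```python
from itertools import product

# Factors and levels of the full factorial design (Table I).
factors = {
    "cache": ["yes", "no"],
    "vm_type": ["m1.large", "m3.xlarge", "cc1.4xlarge"],
    "emi_size_gb": [2, 5, 8],
}

# A full factorial design runs every combination of levels.
scenarios = list(product(*factors.values()))
print(len(scenarios))       # 2 x 3 x 3 = 18 scenarios

# With 50 replicas per scenario (Section IV-D), the experiment totals:
print(len(scenarios) * 50)  # 900 runs
```

A fractional factorial design would instead run only a structured subset of `scenarios`, trading completeness of interaction effects for a smaller experiment.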

2) Continuous Time Markov Chains: Stochastic models are usually adopted to characterize systems whose behavior is intrinsically probabilistic. Markov chains are outstanding stochastic models, used to analyze a variety of systems [22], and have been used extensively in dependability, performance, and performability modeling [23], [10]. A Markov chain is a discrete (countable) state space model associated with a Markov process [24]. The main characteristic of a Markov process is that the probability distribution for its future development depends only on the current state, and not on the way the process arrived in this state [23], [22]. This means that, at the time of a transition, all past history is summarized by the current state; this property is known as absence of memory [22]. Markov chains are usually divided into two classes: Discrete-Time Markov Chains (DTMC) and Continuous-Time Markov Chains (CTMC) [25]. In DTMCs, the transitions between states can only take place at discrete intervals, that is, step by step. If state transitions may occur at instants of time on a continuous domain, the Markov chain is a CTMC. The Markov property implies that the time of transitions is driven by a memoryless distribution [10], so the exponential distribution is used for CTMCs.

IV. EVALUATION METHODOLOGY

The experimental performance evaluation carried out in this work was conducted in six phases, as shown in Figure 2. The following subsections describe each of the remaining phases.

Figure 2: Diagram of the performance evaluation methodology adopted.

A. Problem Identification

This first phase of the methodology describes the main phases of the VM instantiation process. The identification of these phases defines the performance elements (i.e., metrics and factors) involved in such process. In the Eucalyptus cloud platform, as depicted in Figure 3, when a user – or the auto scaling mechanism – requests the creation of a new VM instance, the Cloud Controller (CLC) checks the existence of available resources for the creation of such a VM. This is performed by means of queries to the Cluster Controller (CC), which stores the information about its nodes. If there are resources, the CLC reserves a unique identification number for the instance, the CC assigns the node where the VM will be instantiated, and the Node Controller (NC) starts the copy of the Eucalyptus Machine Image (EMI), Eucalyptus Kernel Image (EKI), and Eucalyptus Ramdisk Image (ERI) required for that VM instance. These images may be downloaded from the Walrus or copied from the local cache kept by the NC [7].

The cache will not be used if it is not enabled in the system setup, or if the EMI does not exist in the cache yet. In the latter case, the CLC downloads the EMI from the Walrus to the cache and to the instance directory of the NC. Notice that the EKI and ERI are also downloaded if they are not already in the NC cache. When the cache is not enabled, the EMI is downloaded directly to the directory of instances on the NC and is not stored in the cache for further use. When the cache is working and the node already has a copy of the EMI, the CLC does not download it from Walrus, but only copies the EMI from the cache to the NC instance directory [7].

Figure 3: Representation of the virtual machine instantiation process of Eucalyptus.

After copying the EMI, EKI, and ERI to the instance directory, the NC interacts with the hypervisor (KVM, Xen, or VMware) to prepare the disk space needed by the VM instance according to its type, and effectively runs it. This preparation usually requires creating, partitioning, and formatting virtual block devices. The hypervisor then starts the actual VM, completing the instantiation process [15], [17], [7].

B. Metrics and Factors

First, we identified the metrics of interest from the study of the instantiation process. Those measures are associated with the relevant phases of the creation of a VM instance (presented in the previous subsection). We adopted four metrics: 1) the time for the CC to reserve the instance; 2) the time to copy (or download) the EMI, EKI, and ERI (shortened here as “copy time of EMI”) to the instance directory of the NC, from Walrus or from the cache depending on the scenario; 3) the time the hypervisor takes to prepare and start the VM; 4) the total time of instantiation, which is the sum of the previous three measures.

After the definition of metrics, we carefully chose factors and their levels to assure that the system would be evaluated in a relevant way [8], [20]. Table I shows the factors and their levels.

Table I: Factors and parameters chosen as relevant.

Factors        | Levels
Cache          | yes, no
VM type        | m1.large, m3.xlarge, cc1.4xlarge
EMI size (GB)  | 2, 5, 8

We consider the cache as a factor to be analyzed, since the instantiation time is expected to be higher when the cache is not used, due to the difference between remote copy throughput (network-bound) and local copy throughput (hard-disk-bound). Such an issue was also observed in preliminary tests, confirming the need to carefully analyze this factor.

Another factor considered was the type of VM instance, which is related to the time of preparation of the VM by the hypervisor and therefore might have some impact on this instantiation stage. The choice of three VM types (levels) intends to encompass very different requirements of CPU, RAM, and disk resources. The type m1.large requires 2 CPU cores, 10 GB of disk space, and 512 MB of RAM. The type m3.xlarge requires 4 CPU cores, 15 GB of disk space, and 2048 MB of RAM. The type cc1.4xlarge requires 8 CPU cores, 60 GB of disk space, and 3072 MB of RAM. The factor EMI size was chosen because it is associated with the download of the EMI from Walrus to the NC cache, and with the copy to the directory of instances on the NC, as explained previously. The sizes considered in this study are 2 GB, 5 GB, and 8 GB, based on our experience of the most commonly used operating system images and sizes of installed applications. These values were also chosen considering that they would not exceed the disk size of the smallest instance type (m1.large), which is 10 GB.

C. Environment Configuration

The environment configuration comprises the setup and preliminary verification of the testbed system used in the experiments and measurements. At this stage we remove any sort of external or internal interference that may influence the measurements. Examples of such actions are: finishing unnecessary processes, disabling operating system automatic updates, and assuring that the network is isolated from computers not involved. The environment was assembled with two machines of the same hardware configuration: Intel(R) Core(TM) i7-3770 3.4 GHz CPU, 4 GB of DDR3 RAM, 500 GB SATA HD. It is important to highlight that a cluster with many machines is not necessary for the purposes of this study, since VM instantiation is a process involving only the front-end and the specific node where the VM is allocated. Therefore, the usage of only these two machines enabled accurately monitoring every stage of the instantiation process. One machine is configured as the front-end, running the CLC, CC, SC, and Walrus. The other machine runs the node controller (NC). Both execute the Linux CentOS 6 operating system with ext4 filesystem, and Eucalyptus platform 3.4.0.1. Preliminary tests of VM instantiation were conducted to verify whether the environment was working as expected, i.e., the requests of VM instantiation were properly executed, the cache of VM images was populated by the node, etc. The VMs run the Linux Ubuntu Server 14.04.01 LTS operating system, with ext4 filesystem.

The next step was the creation and selection of workloads and tools to enable us to measure what we had planned so far. We created a software script to instantiate the VMs repeatedly and collect the times of each phase of the instantiation, i.e., each chosen metric. This script was written in shell script language (Bash – Bourne-again shell). The workload was one VM at a time: after the creation and execution of an instance was completed, the VM was terminated, and the script waited a fixed amount of time before requesting a new instantiation. At the end of this phase, we consider the environment controlled and ready to carry out the experiments.

D. Experimentation

We adopted a full factorial DoE to obtain the desired measures and to study the impact of each factor on those measures. The experiment was performed in May 2014. According to the design adopted, we have 2 levels of the cache factor, 3 levels of the VM type factor, and 3 levels of the EMI size factor, leading to 18 scenarios or combinations. In order to obtain results at an acceptable confidence level, we decided to run 50 replicas for each scenario, yielding a total of 900 experiments. This amount of experiments might be considered large, but most of the actions (e.g., workload generation and data collection) are automatically executed, reducing the required effort and time for running the experiments.

Table II provides the average time in milliseconds measured for each phase of the instantiation process – instance reservation, copy of the EMI, and VM preparation by the hypervisor – and also the total time. The standard deviation and coefficient of variation for the total time are also presented. For the sake of conciseness, in the scenario column, the cache factor is referenced as Y (working) or N (not working). The VM factor is denoted as 10 (m1.large), 15 (m3.xlarge), or 60 (cc1.4xlarge), where the numbers represent the size of the disk allocated for that VM type. The EMI size is simply represented by the number of gigabytes: 2, 5, or 8.

Table II: Results of each scenario of the experiment.

Scenario       | Instance      | EMI         | VM            | Total  | Std.      | Coef. of
(Cache VM EMI) | Reserved (ms) | Copied (ms) | Prepared (ms) | (ms)   | Deviation | Variation
Y 10 2         | 280           | 7624        | 10603         | 18506  | 10002     | 0.5405
Y 10 5         | 279           | 7152        | 11416         | 18848  | 10020     | 0.5316
Y 10 8         | 271           | 7251        | 10216         | 17739  | 9844      | 0.5549
Y 15 2         | 306           | 7221        | 13266         | 20793  | 10107     | 0.4861
Y 15 5         | 306           | 7257        | 13235         | 20797  | 10085     | 0.4849
Y 15 8         | 318           | 7543        | 14115         | 21976  | 9874      | 0.4493
Y 60 2         | 329           | 7169        | 22120         | 29618  | 118       | 0.004
Y 60 5         | 314           | 7329        | 16764         | 24407  | 8918      | 0.3654
Y 60 8         | 307           | 7297        | 14813         | 22417  | 9736      | 0.4343
N 10 2         | 342           | 194790      | 14538         | 209670 | 118       | 0.0006
N 10 5         | 359           | 472928      | 16253         | 489540 | 1667      | 0.1026
N 10 8         | 357           | 750593      | 14182         | 765132 | 8326      | 0.0109
N 15 2         | 390           | 196716      | 12546         | 209651 | 109       | 0.0005
N 15 5         | 354           | 472524      | 16635         | 489513 | 85.5      | 0.0002
N 15 8         | 363           | 753203      | 13958         | 767525 | 7282      | 0.0095
N 60 2         | 405           | 206374      | 14017         | 220796 | 9996      | 0.0453
N 60 5         | 401           | 487273      | 13273         | 500947 | 9830      | 0.0196
N 60 8         | 409           | 767982      | 13949         | 782340 | 7477      | 0.536

Observing Table II, note that the total time to instantiate a VM is at least 10 times higher when the cache is not used, compared to the scenarios which use the instance cache. Such a difference indicates that this factor is of utmost importance for system performance. Focusing on the scenarios where the cache is used (denoted by Y), those with VM type 10 are the three best configurations, but their standard deviations indicate that the difference with respect to VM types 15 and 60 might not be significant. For the scenarios without cache, the EMI size plays an important role, since the instantiation time increases by more than 100 % when the size increases from 2 GB to 5 GB, and by more than 50 % when the size increases from 5 GB to 8 GB, while the other factors are kept unchanged. These results and analyses are not intended to point out an overall best scenario, since such a conclusion might depend upon other specific conditions and requirements of the system being used. Nevertheless, these results are helpful to guide a cloud administrator in adjusting the parameters of the system while being aware of the possible impacts on application performance, mainly when using mechanisms such as auto scaling. Additional analyses of the experiment results are presented in subsection V-A.

E. Analytical Modeling

Figure 4 presents the CTMC that we propose to represent the instantiation process of a VM in a Eucalyptus private cloud. This model is composed of the states RI, CI, DI(1), DI(2), DI(3), DI(4), PV, and VR, corresponding respectively to: RI = reserving the instance in the Cluster Controller; CI = copying the EMI from the cache to the directory of instances in the Node Controller; DI(1), DI(2), DI(3), and DI(4) = the four stages of downloading the EMI from Walrus to the Node Controller cache and directory of instances; PV = formatting of the virtual block device and VM configuration by the hypervisor; VR = VM running. The process of instantiation begins in the RI state. If the EMI is already in the node's cache, the model goes to the CI state with rate p_cache × (1/t_RI). If the node needs to download the EMI from Walrus, the model goes to the DI(1)

state with rate (1 − p_cache) × (1/t_RI). In the CI state, a transition to the PV state occurs with rate 1/t_CI. The download of the VM image occurs in four stages (DI(1), DI(2), DI(3), and DI(4)) in order to fit an Erlang distribution¹. The transition from each stage to the next occurs with rate 1/(t_DI/4). In the last stage of the download, the model goes from the DI(4) state to the PV state. The last step of the instantiation process occurs when the model goes from the PV state to the VR state, with rate 1/t_PV, indicating that the VM is running. We have used average results from the experimental testbed, already presented in Table II, as input for the VM instantiation model. Statistical analysis of the experimental data suggests that all time values represented in our model are well fitted by exponential distributions, except for the EMI download time t_DI. For this reason, we refined our model to represent t_DI by means of polyexponential distributions, using a moment matching method described in [26], splitting the DI state into 4 phases with rate 1/(t_DI/4).

Figure 4: CTMC model for the VM instantiation performance.

Table III: Parameter values for the CTMC model of VM instantiation.

Parameter | Description                              | Value
p_Cache   | Probability that EMI is already in cache | 1 (100 %)
t_RI      | Mean instance reservation time           | 0.280 s
t_CI      | Mean EMI local copy time                 | 7.624 s
t_DI      | Mean EMI download time                   | 194.79 s
t_PV      | Mean VM preparation time                 | 10.603 s

Table III presents the description of the parameters and their values, which were obtained from testbed experiments and used to verify the results provided by the VM instantiation model. The Mercury tool was used for all analyses [22], [27]. We chose the first scenario (Y 10 2): using cache, VM type 10, and EMI size 2 GB. The value for t_DI was obtained from scenario N 10 2, because the time to download the EMI is only measurable in the absence of cache. The value of p_cache was initially set to 1 to enable the comparison of results provided by the model with those measured in the experiment for scenario Y 10 2. The mean time to absorption [28], i.e., the mean instantiation time computed from the CTMC with those parameters, is 18.507 s. The time obtained in the experiments is 18.506 s, with a confidence interval of (15.663 s; 21.349 s) at a confidence level of 95 %, as seen in Table II. When we change the parameters in the CTMC to match the configuration of scenario N 10 2 (absence of cache), the computed mean time to absorption is 209.670 s, the same value measured for the total instantiation time in the corresponding experiment scenario, whose confidence interval is (209.636 s; 209.703 s) at a confidence level of 95 %. Therefore, our model provides results consistent with those observed in the experimental testbed. The next step in our performance evaluation study is using the CTMC model to perform additional analyses (presented in subsection V-B).

V. ANALYSIS OF RESULTS

This section presents the sixth and last phase of the performance evaluation methodology adopted. It comprises the analyses of the results obtained using the full factorial DoE technique, and analytical modeling through CTMC for sensitivity analysis.
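Since the chain of Figure 4 is acyclic, its mean time to absorption can be reproduced by first-step analysis without a dedicated tool. A minimal sketch of ours (not the Mercury tool used in the paper), with times in seconds:

```python
def mean_instantiation_time(p_cache, t_ri, t_ci, t_di, t_pv):
    """Mean time to absorption (reaching VR) of the CTMC in Figure 4.

    Expected holding times are the inverses of the transition rates;
    the four Erlang download phases, each with rate 1/(t_di/4),
    contribute t_di in total.
    """
    tau_pv = t_pv            # PV -> VR
    tau_ci = t_ci + tau_pv   # CI -> PV -> VR
    tau_di = t_di + tau_pv   # DI(1..4) -> PV -> VR
    return t_ri + p_cache * tau_ci + (1.0 - p_cache) * tau_di

# Scenario Y 10 2, parameters from Table III (p_Cache = 1):
print(mean_instantiation_time(1.0, 0.280, 7.624, 194.79, 10.603))  # ~18.507 s

# Scenario N 10 2 (no cache), with t_RI and t_PV as measured in that scenario:
print(mean_instantiation_time(0.0, 0.342, 7.624, 194.79, 14.538))  # ~209.670 s
```

Both values match the mean times to absorption reported for the model verification.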

A. Sensitivity Analysis – DoE The effect and relevance of each factor and their interactions were computed based on results of total instantiation time shown in Table II. Table IV, and Figure 5 show the results of the sensitivity analysis [21] just for total instantiation time. The evaluation for the times of isolated instantiation phases (i.e., instance reservation, EMI copy,

¹The analysis of experimental data indicated that an Erlang distribution with four phases fits the time to download the EMI better than the exponential distribution does.


Figure 5: Charts for main and interaction effects of the factors. (a) Main effects plot. (b) Interaction effects plot.

and VM preparation) is not presented here for the sake of conciseness.

Table IV: Estimated effects and relevances for the total time of instantiation.

Factor   | Effect | T      | Relevance  | P-value
Constant |        | 603.09 |            | 0.000
A        | 472415 | 551.55 | 45.0720 %  | 0.000
B        | 11031  | 12.88  | 1.0525 %   | 0.000
C        | 277260 | 323.71 | 26.4532 %  | 0.000
AB       | 3136   | 3.66   | 0.2991 %   | 0.000
AC       | 281244 | 328.36 | 26.8332 %  | 0.000
BC       | -88    | -0.1   | -0.0082 %  | 0.918
ABC      | 3129   | 3.65   | 0.2983 %   | 0.000

In Table IV each factor corresponds to a letter: Cache = A, VM type = B, and EMI size = C. The results show that the factors with the greatest impact on performance are A (cache), generating an effect with a relevance of 45.072 %, followed by C (EMI size) with 26.453 %, and the interaction AC with around 26.833 %. The factor B (VM type), with 1.052 %, and its interactions (AB, BC, and ABC) obtained no significant relevance: AB obtained around 0.299 %, BC -0.008 %, and ABC 0.298 %. Note also that the p-value of the interaction BC was 0.918, above 0.05 (set as the threshold in this study), i.e., there is not enough evidence that this interaction has a significant effect on performance [20].

The main effects plot, Figure 5(a), shows the isolated factors and the effects of their distinct levels. The significant difference between levels highlights the cache and the EMI size as high-impact factors, as denoted by both slopes. On the other hand, the VM type in isolation has a much smaller effect, denoted by the almost horizontal line. Figure 5(b) shows an evaluation of the effects of factors interacting with each other. An interaction with non-parallel lines suggests a significant impact on the measure of interest, i.e., the total instantiation time. This is the case of the upper right and the lower left cells in the grid, which represent the interaction between the cache and EMI size factors. Both plots show that when the cache is not used, the effect of varying the EMI size is significant, but when the cache is working, the different levels of EMI size cause negligible impact on the instantiation time. Therefore, the employment of network equipment and configuration to provide high bandwidth and throughput is especially valuable in no-cache scenarios. Moreover, efforts such as the customization of EMIs to use little disk space, or compressing the EMI for transmission through the network, might be considered worthwhile in environments which do not use cache, but likely not when the EMI can already be in the cache of all nodes. However, a specific study is needed to evaluate the viability of the EMI compression strategy, due to the computational costs of compression and decompression.

It is also important to stress that all interactions with the factor VM type have parallel lines in Figure 5(b), confirming the analysis shown in the other plots, i.e., they are interactions without major influence on the results.

B. Sensitivity Analysis – CTMC

One specific study we chose is to check the behavior of the instantiation time for intermediate levels of the parameter p_Cache. This is an aspect which might deserve characterization in an infrastructure with multiple nodes and multiple EMIs to instantiate. Even when the cache is enabled, some nodes may not always have the required EMI available in their cache, due to disk space restrictions or simply the lack of a previous instantiation. Figure 6 depicts the sensitivity analysis [21] of the instantiation time with respect to p_Cache, for the three levels of EMI size (2 GB, 5 GB, and 8 GB). The plot shows a linear relationship between p_Cache and the instantiation time, which can be explored, for example, by system administrators searching for a compromise between system performance and the effort to pre-load EMIs in the cache of cloud nodes. For example, assuming that a given application requires an average instantiation time smaller than 300 seconds, from Figure 6 we find that such a requirement is achievable for a 2 GB EMI even without enabling the cache, but for a 5 GB EMI and an 8 GB EMI, the probability of using the cache shall be higher than 40 % and 60 %, respectively.
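Because the relationship is linear, the smallest acceptable p_Cache can also be obtained in closed form rather than read off the plot. A sketch of ours, assuming the per-size times follow the measured VM type 10 scenarios of Table II (converted to seconds); the function name is illustrative:

```python
def min_cache_probability(target, t_ri, t_ci, t_di, t_pv):
    """Smallest p_Cache for which the mean instantiation time meets target.

    From the CTMC, mean(p) = t_ri + t_pv + p * t_ci + (1 - p) * t_di,
    which is linear in p; solving mean(p) <= target for p gives the bound.
    """
    p = (t_ri + t_pv + t_di - target) / (t_di - t_ci)
    return max(0.0, p)  # 0 means the target is met even without cache hits

# Target of 300 s, times (s) taken from the VM type 10 rows of Table II:
print(min_cache_probability(300, 0.280, 7.624, 194.790, 10.603))  # 2 GB: 0.0
print(min_cache_probability(300, 0.279, 7.152, 472.928, 11.416))  # 5 GB: ~0.40
print(min_cache_probability(300, 0.271, 7.251, 750.593, 10.216))  # 8 GB: ~0.62
```

These bounds agree with the 40 % and 60 % thresholds read from Figure 6.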

Next, we examined the behavior of the instantiation time for different values of the other parameters: t_RI, t_CI, t_DI, and t_PV. This analysis may allow system administrators to better prioritize efforts to reduce the time of each phase. Figure 7 depicts the sensitivity analysis of the instantiation time with respect to t_RI, t_CI, t_DI, and t_PV, considering three levels of p_Cache (25 %, 50 %, and 75 %). The plots show a linear relationship between each of these parameters and the instantiation time.













VI. CONCLUSION



This paper investigated the process of VM instantiation, an important activity in cloud computing systems that is intensively used by elasticity mechanisms such as the Eucalyptus auto scaling feature. For this purpose, we proposed a full factorial design of experiments, followed by measurements and analytical modeling. We analyzed the effects and relevance of three factors – cache, VM type, and EMI size – on the total instantiation time. The times for completion of the intermediate phases of the instantiation were also measured and analyzed.





















Figure 6: Sensitivity analysis of instantiation time with respect to p_Cache.

The experimental results pointed out that the most influential factors are the cache and the EMI size, including the interaction between these two factors. The use of the cache causes a significant decrease in the instantiation time. The EMI size also has a large effect, mainly when the cache is not used. The experimental results provided input data for the CTMC model proposed in this paper. The results computed from the CTMC model show a linear relationship between the instantiation time and the probability of using the cache.

For this analysis, we used the values found in Table III and varied one parameter at a time, holding the values of the other parameters unchanged. In Figure 7(a), we varied the parameter t_RI around its mean, over a range from 0.1 s to 1 s in steps of 0.1 s. Since the instance reservation time is already small, reducing it would have little effect on the total time for all p_Cache levels.

The presented analysis leads to the suggestion of some good practices for cloud infrastructure administrators. Due to the high impact of the EMI size, the customization of virtual machine images to produce small EMIs is recommended, as well as the employment of network equipment and configuration that provide high bandwidth and throughput. Preloading EMIs on a fraction of the available nodes might also significantly improve the instantiation time, since it may increase the probability of using the cache. The approach for sensitivity analysis presented here provides guidance for prioritizing efforts and might be applied to similar models of cloud computing infrastructures and their applications.
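As a rough illustration of the preloading suggestion: if the EMI is preloaded on k of n otherwise identical nodes and the scheduler picks a node uniformly at random (an assumption for illustration, not the paper's scheduler model), then p_Cache is simply k/n, which can be plugged into a mean-time expression built from assumed phase times:

```python
def mean_time(p_cache, t_ri=0.5, t_ci=10.0, t_di=480.0, t_pv=10.0):
    """Mean instantiation time; phase-time defaults are assumed
    baselines for a large EMI, not measured values."""
    return t_ri + p_cache * t_ci + (1.0 - p_cache) * t_di + t_pv

def nodes_to_preload(n, target, **phase_times):
    """Smallest number of preloaded nodes (out of n) meeting a mean
    instantiation-time target under uniform node selection."""
    for k in range(n + 1):
        if mean_time(k / n, **phase_times) <= target:
            return k
    return None  # target unreachable even with every node preloaded

print(nodes_to_preload(10, 300.0))
```

Under these assumptions, preloading only half of a 10-node pool is already enough to bring the mean time under 300 seconds, which is why preloading a fraction of the nodes can pay off.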

In Figure 7(b), we varied the parameter t_CI around its mean, over a range from 5 s to 14 s in steps of 1 s. Note that the instantiation time is only slightly affected by the EMI local copy time. In particular, for p_Cache = 25 %, changes in t_CI have an almost negligible impact on the total instantiation time, whereas this impact is larger for p_Cache = 75 %. Thus, system administrators should only focus efforts on reducing t_CI when there is a high probability of finding the VM image in the node's cache. In Figure 7(c), we varied the parameter t_DI around its mean, over a range from 150 s to 240 s in steps of 10 s. The interval of variation here is larger (in absolute value) than for the other parameters because the EMI download time is impacted by the EMI size, as mentioned in subsection V-A. Figure 7(c) shows that varying t_DI changes the instantiation time significantly for all levels of p_Cache. Moreover, the smaller the value of p_Cache, the higher the impact of t_DI on the total time. Therefore, efforts to decrease t_DI produce more benefits when there is a low probability of finding the VM image in the node's cache.

Future work may verify the viability of EMI compression before transmission to the node, followed by decompression there. Other studies might compare the instantiation process of other private cloud environments, such as OpenStack and OpenNebula. The proposed CTMC model might also be used and extended to evaluate the auto scaling mechanism in private clouds, and thus to observe the impact of the analyzed factors on high-level applications running on cloud infrastructures under bursty workloads.

REFERENCES

Finally, in Figure 7(d), we varied the parameter t_PV around its mean, over a range from 5 s to 14 s in steps of 1 s. The plot shows that the variation of t_PV has only a slight impact on the instantiation time, and the degree of such impact is almost the same for all levels of p_Cache, as expected, since the VM preparation occurs only after the EMI copy or download.
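The four one-at-a-time sweeps are consistent with the partial derivatives of the mean instantiation time: the sensitivities to t_RI and t_PV equal 1 regardless of p_Cache, while the sensitivity to t_CI is p_Cache and to t_DI is 1 − p_Cache. A sketch of the sweep procedure, using assumed baseline phase times rather than those of Table III:

```python
def mean_time(p_cache, t_ri=0.5, t_ci=10.0, t_di=200.0, t_pv=10.0):
    """Mean instantiation time; defaults are assumed baselines."""
    return t_ri + p_cache * t_ci + (1.0 - p_cache) * t_di + t_pv

def sweep(param, values, p_cache):
    """Vary one phase time while holding the others at their
    baselines, mirroring the one-at-a-time analysis of Figure 7."""
    return [mean_time(p_cache, **{param: v}) for v in values]

t_di_range = [150.0 + 10.0 * i for i in range(10)]  # 150 s .. 240 s
for p in (0.25, 0.50, 0.75):
    times = sweep("t_di", t_di_range, p)
    slope = (times[-1] - times[0]) / (t_di_range[-1] - t_di_range[0])
    print(p, round(slope, 2))  # slope equals 1 - p_cache
```

The printed slopes make the observations above concrete: the lower p_Cache is, the steeper the response to t_DI, while sweeping t_PV shifts all three p_Cache curves by the same amount.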

[1] NIST, "NIST cloud computing standards roadmap," 2013.

[2] E. Sousa, P. Maciel, E. Medeiros, D. Souza, F. Lins, and E. Tavares, "Evaluating eucalyptus virtual machine instance types: a study considering distinct workload demand," in CLOUD COMPUTING 2012, The Third International Conference on Cloud Computing, GRIDs, and Virtualization, 2012, pp. 130–135.

[3] E. Bauer and R. Adams, Reliability and Availability of Cloud Computing. Wiley-IEEE Press, 2012.

It is worth stressing that the CTMC model presented in this paper can be extended and used for other analyses not shown here. This CTMC also allows composition with other models to evaluate higher-level applications which make intensive use of VM instantiation, such as applications relying on auto scaling features to deal with bursts of incoming requests.

[4] Eucalyptus, AWS and Eucalyptus Compatibility, 2013, Available on http://www.eucalyptus.com/aws-compatibility.

[5] E. Caron, L. Rodero-Merino, F. Desprez, A. Muresan et al., "Auto-scaling, load balancing and monitoring in commercial and open-source clouds," 2012, INRIA. Research report N. 7857. Available on http://hal.inria.fr/docs/00/66/87/13/PDF/RR-7857.pdf.



(a) Variation of t_RI (instance reservation time).

(b) Variation of t_CI (EMI local copy time).


(c) Variation of t_DI (EMI download time).

(d) Variation of t_PV (VM preparation time).

Figure 7: Sensitivity analysis of instantiation time with respect to parameters t_RI, t_CI, t_DI, and t_PV.

[6] M. Gusev, S. Ristov, M. Simjanoska, and G. Velkoski, "CPU utilization while scaling resources in the cloud," in CLOUD COMPUTING 2013, The Fourth International Conference on Cloud Computing, GRIDs, and Virtualization, 2013, pp. 131–137.

[7] Eucalyptus, Eucalyptus 3.4.0 User Guide, 2013.

[8] A. P. Guimarães, P. R. Maciel, and R. Matias Jr, "An analytical modeling framework to evaluate converged networks through business-oriented metrics," Reliability Engineering & System Safety, 2013.

[9] D. C. Montgomery, Design and Analysis of Experiments, 8th ed. New York: John Wiley and Sons, 2012.

[10] K. S. Trivedi, Probability and Statistics with Reliability, Queuing, and Computer Science Applications. New York: John Wiley and Sons, 2001.

[11] A. Iosup, S. Ostermann, M. N. Yigitbasi, R. Prodan, T. Fahringer, and D. H. Epema, "Performance analysis of cloud computing services for many-tasks scientific computing," IEEE Transactions on Parallel and Distributed Systems, vol. 22, no. 6, pp. 931–945, 2011.

[12] R. Ghosh, F. Longo, V. K. Naik, and K. S. Trivedi, "Modeling and performance analysis of large scale IaaS clouds," Future Generation Computer Systems, vol. 29, no. 5, pp. 1216–1234, 2013, Special section: Hybrid Cloud Computing.

[13] B. Suleiman and S. Venugopal, "Modeling performance of elasticity rules for cloud-based applications," in Enterprise Distributed Object Computing Conference (EDOC), 2013 17th IEEE International. IEEE, 2013, pp. 201–206.

[14] Eucalyptus, Eucalyptus Open-Source Cloud Computing Infrastructure – An Overview. Eucalyptus Systems, 2009.

[15] ——, Eucalyptus 3.4.0 Administration Guide, 2013.

[16] Amazon, "What is cloud computing?" March 2014, Available on http://aws.amazon.com/what-is-cloud-computing/?nc1=h_l2_cc/.

[17] Eucalyptus, Eucalyptus 3.4.0 Installation Guide, 2013.

[18] Amazon, "Amazon Elastic Block Store (EBS)," December 2013, Available on http://aws.amazon.com/ebs/.

[19] ——, "Amazon Simple Storage Service (S3)," December 2013, Available on http://aws.amazon.com/s3/.

[20] R. Jain, The Art of Computer Systems Performance Analysis: Techniques for Experimental Measurement, Simulation and Modeling. Wiley India Pvt. Ltd., 2008.

[21] R. S. Matos, P. R. M. Maciel, F. Machida, D. S. Kim, and K. S. Trivedi, "Sensitivity analysis of server virtualized system availability," IEEE Transactions on Reliability, vol. 61, no. 4, pp. 994–1006, 2012.

[22] B. Silva, G. Callou, E. Tavares, P. Maciel, J. Figueiredo, E. Sousa, C. Araujo, F. Magnani, and F. Neves, "Astro: An integrated environment for dependability and sustainability evaluation," Sustainable Computing: Informatics and Systems, vol. 3, no. 1, pp. 1–17, 2013.

[23] P. Maciel, K. S. Trivedi, R. Matias, and D. S. Kim, Performance and Dependability in Service Computing: Concepts, Techniques and Research Directions. IGI Global, 2011, ch. Dependability Modeling.

[24] B. R. Haverkort, Lectures on Formal Methods and Performance Analysis. New York, NY, USA: Springer-Verlag New York, Inc., 2002, ch. Markovian models for performance and dependability evaluation, pp. 38–83.

[25] L. Kleinrock, Queueing Systems. New York: Wiley, 1975, vol. 1.

[26] J. F. Watson and A. Desrochers, "Applying generalized stochastic petri nets to manufacturing systems containing nonexponential transition functions," IEEE Transactions on Systems, Man and Cybernetics, vol. 21, no. 5, pp. 1008–1017, Sep 1991.

[27] G. Callou, P. Maciel, D. Tutsch, J. Ferreira, J. Araújo, and R. Souza, "Estimating sustainability impact of high dependable data centers: a comparative study between Brazilian and US energy mixes," Computing, pp. 1–34, 2013.

[28] J. Kohlas, "Numerical computation of mean passage times and absorption probabilities in Markov and semi-Markov models," Zeitschrift für Operations Research, vol. 30, no. 5, pp. A197–A207, 1986.
