
2011 IEEE Ninth International Conference on Dependable, Autonomic and Secure Computing

A Science Cloud Resource Provisioning Model using Statistical Analysis of Job History

Seoyoung Kim, Jung-in Koh, Yoonhee Kim*
Dept. of Computer Science, Sookmyung Women’s University, Seoul, South Korea
{sssyyy77, jungin, yulan}@sookmyung.ac.kr
*Corresponding Author

Chongam Kim
Dept. of Mechanical and Aerospace Engineering, Seoul National University, Seoul, South Korea
[email protected]

Abstract: Cloud computing allows scientists to extend their research environments from supercomputers to on-demand, dynamically scalable resources, and the science cloud has become a trend in various scientific domains. However, it is difficult to provide an optimal job execution environment rapidly and dynamically in response to user demands. It is therefore important to predict a user’s requirements and to prepare the execution environment in advance. Scheduling mechanisms for virtual machines (VMs) are also needed to guarantee some level of performance for a user application. In this paper, we propose a cloud resource provisioning model that uses statistical analysis of job history. The model uses job history generated from many application executions and identifies the characteristics of an application by applying statistical analysis. We use a statistical technique, PCA (Principal Component Analysis), to analyze the execution history of applications and to extract the factors that contribute most to execution time. These factors are used to select a reference job profile, and a VM is then deployed on the node selected according to that profile. The application is executed on the chosen node, and its performance result is incorporated into the job history in order to evaluate the profile’s credit. As a result, this model can provide efficient management of cloud resources for a service provider and reduce management overhead in the cloud.

Keywords: Science Cloud; Principal Component Analysis; Job history; Resource provisioning

978-0-7695-4612-4/11 $26.00 © 2011 IEEE. DOI 10.1109/DASC.2011.134

I. INTRODUCTION

Cloud computing can offer on-demand capacity as a utility and makes it possible to use limited resources effectively. Selecting proper resources and managing them efficiently are therefore critical to execution performance. That is, we first need to identify application characteristics and then provide optimal resource provisioning in order to achieve good application execution on the cloud. In this paper, we propose a cloud resource provisioning model that uses statistical analysis of job history. The model uses job history generated from many application executions and identifies the characteristics of an application by applying statistical analysis. We use a statistical technique, PCA (Principal Component Analysis) [1], to analyze the execution history of applications and to extract the factors that contribute most to execution time. These factors are used to select a reference job profile, and a VM is then deployed on the selected node based on the reference profile. The application is executed on the chosen node, and its performance result is incorporated into the job history for the purpose of evaluating the profile’s credit.

The rest of this paper is organized as follows. Section II discusses related work. Section III describes the proposed science cloud resource provisioning model, which is based on a PCA approach. Finally, Section IV presents conclusions and discusses future research.

II. RELATED WORK

One of the challenges in cloud management is resource provisioning: determining which execution environment can guarantee some level of performance for a user application. Various prediction algorithms have been used to improve the performance of job execution on large-scale computing resources. Statistical models or intelligent techniques are applied in these prediction algorithms for cloud resource management. As an example of statistical models, the prediction models in AppLeS [2] are used to evaluate and decide among schedule candidates. On the other hand, an artificial intelligence algorithm [3] proposed by Li et al. predicts application runtime and queue wait time.

Figure 1. A graph representation of PCA results

In this paper, a statistical model is used to evaluate previous execution history in order to improve job execution. Whenever a user runs an application, the result is stored in that user’s job history. To analyze recent executions in the job history, the principal component analysis (PCA) [1] technique is chosen. The output conveys two key points: first, it shows how the input factors affect the jobs, and second, it identifies the most representative jobs typically run by a user. Specifically, as shown in Fig. 1, the output shows that stability is the most influential factor and that, from the viewpoint of jobs, job 10 and job 1 are the most commonly used and typical jobs for a user who performs experiments over a certain period of time. Therefore, stability and jobs 10 and 1 are stored in the profile repository as a reference job profile for future experiments.

III. RESOURCE PROVISIONING MODEL

Figure 2. A flowchart of evaluation

A. Principal Component Analysis

Principal component analysis (PCA) [1] is a well-known statistical technique for analyzing multivariate data in which observations are described by several inter-correlated quantitative dependent variables. By sorting dimensions in order of importance, we can extract the important information, called principal components, and discard dimensions of low significance. In the proposed provisioning model, PCA is used to capture the important associations between the different kinds of factors and execution time in job history profiles. The model chooses as the important factor the one with the maximal value in the first principal component (PC1). Each job is represented as a vector of d parameters, one per factor, and a matrix with n rows and d columns is constructed for the analysis.
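The factor-selection step can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that the d columns of the job matrix hold factor values only, that each column is standardized before PCA, and that "maximal value in PC1" means the largest absolute loading on the first principal component. The function name `principal_factor` and its parameters are hypothetical.

```python
import numpy as np

def principal_factor(history, factor_names):
    """Return (name of the factor with the largest |PC1 loading|, PC1 loadings).

    history      : n x d array-like, one row per job, one column per factor
    factor_names : list of d factor names, in column order
    """
    X = np.asarray(history, dtype=float)

    # Assumption: standardize each column so that factors measured in
    # different units (e.g. MHz vs. Mbit/s) are comparable.
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    Z = (X - X.mean(axis=0)) / std

    # The right singular vectors of the standardized matrix are the
    # principal axes; the first row of vt is PC1 (unit length).
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    pc1 = vt[0]

    # The sign of a principal axis is arbitrary, so compare magnitudes.
    idx = int(np.argmax(np.abs(pc1)))
    return factor_names[idx], pc1
```

Under these assumptions, calling `principal_factor(job_matrix, ["iterations", "cpu_speed", "bandwidth", "stability"])` on an n x 4 job matrix would return the name of the factor that dominates PC1 together with all four loadings, which could then be stored as the principal factor for later profile selection.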

B. Cloud Resource Provisioning Algorithm

Algorithm 1 shows the proposed cloud resource provisioning model. As mentioned earlier, we analyze the job history and find the principal factors. Using these factors, we select candidate profiles, choose the profile with the minimal execution time, and return its resource information (line 5). A VM is deployed on the chosen profile’s resource, and the job is submitted to the VM. After the job completes, the job profile is evaluated by comparing its execution time to that of the referenced profile. As shown in Fig. 2, the result of the evaluation is applied to the credit of the related factors. If the credit decreases, PCA is carried out again with the updated job history.

Algorithm 1 Select cloud resource for a VM using PCA
Given job history Jrecent = {j1, j2, …, jn}, where each ji is a d-dimensional vector,
and factors F = {f1, f2, …, fd}, fprin
1. fprin = null, rselected = null
2. if fprin not exists then
3.     fprin = PCA(F, Jrecent)
4. endif
5. rselected = JobHistory(fprin, jnew)
6. Deploy VM(x) on rselected
7. Dispatch jnew on VM(x)

C. Example

We define the main factors in the execution time of a science application and analyze which factor affects execution performance. The number of iterations is defined as the application factor, and CPU speed, network bandwidth, and node stability are the computing resource factors. To evaluate the proposed approach, we used a workload from a fluid analysis study on PRAGMA [4]. As depicted in Fig. 1, ‘Stability’ is selected as the principal factor. A VM is scheduled on nodes that are chosen according to ‘Stability’.

IV. CONCLUSION AND FUTURE WORK

We proposed a science cloud resource provisioning model using statistical analysis of job execution history. To analyze the history, we applied the PCA technique to select the effective factors. The factors are used to choose nodes, and VMs are deployed on the chosen nodes in advance. In the future, we plan to evaluate this model for efficient execution performance in the science cloud.

ACKNOWLEDGMENT

This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. 2010-0027719).

REFERENCES

[1] I. T. Jolliffe, Principal Component Analysis. Springer-Verlag, 2002.
[2] F. Berman et al., "Adaptive computing on the grid using AppLeS," IEEE Transactions on Parallel and Distributed Systems, vol. 14, no. 4, pp. 369-382, 2003.
[3] H. Li, D. Groep, L. Wolters, "Efficient response time predictions by exploiting application and resource state similarities," in Proceedings of the 6th IEEE/ACM International Workshop on Grid Computing. IEEE Computer Society, 2005, pp. 234-241.
[4] PRAGMA Grid and Cloud Monitoring home page, http://pragmagoc.rocksclusters.org