
2013 19th IEEE International Conference on Parallel and Distributed Systems

A Dynamic Hybrid Resource Provisioning Approach for Running Large-scale Computational Applications on Cloud Spot and On-demand Instances

Sifei Lu, Xiaorong Li, Long Wang, Henry Kasim, Henry Palit, Terence Hung, Erika Fille Tupas Legara, Gary Lee
Institute of High Performance Computing, Singapore
{lus, lixr, wangl, kasimh, henry, terence, legaraeft, leekk}@ihpc.a-star.edu.sg

Abstract—Testing and executing large-scale computational applications in public clouds is becoming prevalent due to cost savings, elasticity, and scalability. However, increasing the reliability and reducing the cost of running large-scale applications in public clouds remains a big challenge. In this paper, we analyze the pricing schemes of Amazon Elastic Compute Cloud (EC2) and find a disturbance effect: the price of spot instances can be heavily affected when a large number of spot instances is requested. We propose a dynamic approach that schedules and runs large-scale computational applications on a dynamic pool of cloud computational instances. We use hybrid instances, comprising on-demand instances for high-priority tasks and backup, and spot instances for normal computational tasks, so as to further reduce the cost without significantly increasing the completion time. Our proposed method takes the dynamic pricing of cloud instances into consideration; it reduces the cost of running large-scale applications in public clouds and tolerates failures. We conducted experimental tests in which an agent-based Scalable complex System modeling for Sustainable city (S3) application is used to evaluate scalability, reliability and cost savings. The results show that our proposed method is robust and highly flexible, allowing researchers and users to further reduce cost in practice.

Keywords: Cloud computing; resource provisioning; cost analysis; fault tolerance; spot instance; spot price; reliability; workflow scheduling

I. INTRODUCTION

Cloud computing provides an attractive alternative for testing and executing complex computational applications on a large pool of computational resources. Compared with the traditional private cloud environment, public clouds allow researchers and users to migrate spike workloads to additional resources in the public clouds. The provisioning of elastic on-demand resources in clouds [1] improves scalability, reduces turnaround time, and handles the load bursting problem.

Nowadays, various cloud service providers offer virtual machines with a variety of OS types under different pricing schemes, together with storage and network resources. Amazon Elastic Compute Cloud (Amazon EC2) [1] provides three types of instances, namely on-demand instances, reserved instances and spot instances. Users pay a fixed rate per hour for on-demand instances and have the flexibility to start and terminate them at any time. If the user applications need to occupy the resources for a long time, reserved instances are suitable because a one-time payment secures a discount on the hourly charge. For spot instances, users need to bid a price. Although the availability of a spot instance cannot be guaranteed, a spot instance is usually cheaper than an on-demand instance.

A complex large-scale computational application usually involves a large number of tasks, which can run in parallel or sequentially with certain dependencies, and it may require a large amount of computational resources. Hence, it is crucial to reduce the cost, improve the reliability, and ensure the execution speed when running large-scale computational applications in clouds.

In this paper, we focus on running large-scale computational applications on public clouds, especially on the on-demand instances and spot instances offered by Amazon EC2. After analyzing the characteristics of the spot price and the disturbance effect of spot instances, we propose a dynamic approach for running applications on on-demand instances and spot instances so as to reduce cost, increase reliability and reduce the complexity of fault tolerance without affecting the overall performance and scalability. As a case study, we use a workflow-enabled Scalable complex System modeling for Sustainable city (S3) [2] application to evaluate and verify the results in Amazon EC2 with 1000 VM instances. Our approach is robust and adaptable for large-scale computational applications. In practice, researchers and users can reduce cost and increase reliability by using our proposed approach in Amazon EC2.

The rest of the paper is organized as follows. We review related work in Section 2, elaborate the methodology and solution in Section 3, present the case study on S3 in Section 4, evaluate and verify the results in Section 5, and state the conclusions and future work in Section 6.

1521-9097/13 $31.00 © 2013 IEEE. DOI 10.1109/ICPADS.2013.117

II. RELATED WORK

In practice, users need to consider whether migrating an application from an enterprise system to a public cloud is economically feasible. Tak et al. [3] use an application's workload intensity, growth rate, network traffic, storage, and software licenses to estimate the overall migration cost. They found that only a small number of business applications are suitable for complete migration to a public cloud, while component-based applications are relatively expensive to migrate due to the high cost of data transfer. In recent years, researchers have investigated a variety of cloud resource provisioning methods and studied how to meet the requirements of cost savings, deadlines, and service level agreements (SLAs). The Just Satisfactory resource provisioning approach [4] dynamically adjusts resource allocation based on monitoring data, so as to avoid the common problems of over-provisioning and under-provisioning. A robust cloud resource provisioning algorithm was proposed in [5] to minimize the total resource provisioning cost under uncertainty. The optimal cloud resource provisioning algorithms in [6] provision reserved instances and on-demand instances to minimize the total cost using a stochastic programming model. A dynamic HBS resource provisioning algorithm was proposed in [7] to provide computation resources from different cloud providers while being less sensitive to inaccurate input data.

A mixture-of-Gaussians model [24] with 3 or 4 components was proposed to study the characteristics of the spot price and the inter-price time. Most Amazon data centers show their maximum spot price on Tuesday and their lowest on Saturday. Although Amazon does not disclose how the spot price is determined, Agmon Ben-Yehuda et al. [25] found that high prices may reflect market changes, while low prices are set via a dynamic hidden reserve price. Using a discrete-time stochastic dynamic programming formulation, an optimal bidding strategy [26] was proposed to minimize the average computation cost under a deadline constraint. As it is hard to predict spot instance prices, Zhao et al. [27] proposed a Stochastic Resource Rental Planning (SRRP) model to minimize the expected resource rental cost.

A variety of methods have taken both computational and data transfer costs in hybrid clouds into consideration. Van den Bossche et al. [8] proposed a method to maximize the utilization of internal resources and minimize the usage of public cloud resources under a deadline constraint. A fault tolerant strategy [9] was proposed to recover from virtual node failures during the resource provisioning phase. To handle scientific workflow applications in hybrid clouds, workflow engines [10][11] were developed to manage workflow scheduling, fault tolerance and data movement for scientific applications. In [12], algorithms were proposed to schedule resources with consideration of the workflow structure and of budget and deadline constraints. In [13], resource failures were taken into account in dynamic resource provisioning for workflow applications, in order to avoid deadline violations. CometCloud [14][15] was proposed as a core engine that provides an autonomic computational management framework for running scientific workflow applications on hybrid clouds.

Given the high probability of spot instance failures, fault tolerance is becoming more important to applications in the cloud. Zheng [28] proposed a heuristic approach that schedules backups upon failure for fast recovery, improving MapReduce fault tolerance in the cloud. By using appropriate checkpointing [29][30] for fault tolerance, users can significantly reduce both the monetary cost of computations and the task completion time, while keeping reliability at a high level. In addition to checkpointing, work migration [30] can reduce execution time at a slightly higher cost. A multifaceted resource provisioning policy [31] was proposed to reliably manage a pool of spot instances for compute-intensive tasks; migration-based fault tolerance mechanisms were employed to ensure that applications run successfully. Unlike the methods mentioned above, we set up backup instances so that a new instance can be started as soon as possible when a failure happens.

In December 2009, Amazon EC2 introduced Spot Instances, with which researchers may further reduce the cost of running applications. In [16], a maximum bid price resource provisioning policy was proposed to extend the capacity of a local cluster by migrating peak loads onto spot instances, which can effectively reduce the cost by 50%. The use of spot instances for MapReduce was investigated in [17] to attain a performance gain at low monetary cost. However, the application may suffer from long completion times and increased cost due to the forced termination of spot instances. A bidding scheme and server allocation policy was designed to use spot instances for web services [18] so as to optimize the average revenue with performance and availability guarantees. A probabilistic model was used to determine the bid price [19] that meets SLA requirements at a low cost. There is a trade-off: a low bid price may result in a longer execution time, while a high bid price may shorten the execution time but raise the cost.

From the perspective of Infrastructure as a Service (IaaS) providers, researchers have also proposed solutions to maximize revenue while maintaining reliability and customer satisfaction. Using a utility model, Chen et al. [20] proposed dynamic scheduling algorithms for service providers to optimize their profits or customer satisfaction. In [21], a dynamic method was proposed to adjust the capacity of each VM to best match the demand, so as to maximize the provider's revenue and the customers' satisfaction. By adapting the spot price and inter-price time on the fly [22], an online algorithm was proposed to achieve a good profit-reliability tradeoff in accordance with a variety of market requirements. Dynamic auctions [23] were proposed to adapt quickly to market changes so as to maximize the revenue of cloud service providers with guaranteed services.

In this paper, our focus is on running large-scale computational applications in clouds. We investigated the characteristics of the dynamic pricing of spot instances for large-scale applications. We found that requesting a large number of spot instances has a disturbance effect on the spot price. Our simulation used real spot prices rather than historical spot prices. We propose a dynamic hybrid resource provisioning method that schedules on-demand and spot instances with consideration of dynamic pricing as well as the failure of spot instances. The management module schedules backup instances on on-demand instances to tolerate the failure of spot instances. Our approach can further reduce the cost and improve the reliability of running large-scale computation-intensive applications in clouds.

III. METHODOLOGY AND SOLUTION

In this section, we propose a solution that comprises an analysis of the characteristics of the dynamic pricing of spot instances, a fault tolerance mechanism, and dynamic resource provisioning.

A. Characteristics of spot instances

Amazon spot instances price history. In order to utilize idle computational resources in its data centers, Amazon EC2 provides spot instances, for which users bid a price. The spot price is determined by supply and demand.


IV. CASE STUDY ON SCALABLE COMPLEX SYSTEM MODELING FOR SUSTAINABLE CITY

Scalability. Scaling of the “create agents” and “define attributes” phases is achieved by dividing the workload, with each process handling a group of agents. For example, in a simulation with 7 million commuters running on an infrastructure of 1000 instances, the creation of commuter agents is split among the virtual machine instances in such a way that each instance creates 7 thousand agents. We rely on the cloud-enabled workflow management system to manage the large, distributed data of the agent-based simulation workflow and to allocate resources dynamically for adaptive services with fault tolerance.
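The even split described above can be sketched as follows. This is a minimal illustration, not the S3 implementation: the function name `partition_agents` and the contiguous ID-block scheme are assumptions made for the example; only the figures of 7 million commuters and 1000 instances come from the text.

```python
from typing import List


def partition_agents(total_agents: int, num_instances: int) -> List[range]:
    """Split agent IDs 0..total_agents-1 into contiguous, near-equal blocks.

    Hypothetical helper: returns one block of agent IDs per VM instance.
    """
    if total_agents < 0 or num_instances <= 0:
        raise ValueError("need total_agents >= 0 and num_instances > 0")
    base, extra = divmod(total_agents, num_instances)
    blocks: List[range] = []
    start = 0
    for i in range(num_instances):
        # The first `extra` instances take one additional agent each.
        size = base + (1 if i < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


blocks = partition_agents(7_000_000, 1000)
print(len(blocks), len(blocks[0]))  # 1000 7000
```

When the agent count does not divide evenly, the remainder is spread over the first instances so that block sizes differ by at most one.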

A case study of the Scalable complex System modeling for Sustainable city (S3) application is used to evaluate and verify the dynamic hybrid resource provisioning mechanism and the application-aware fault tolerance approach. In this section, we describe the application and analyze its requirements in terms of program modeling and scalability.

A. Application

Objectives and motivations. For a city-state such as Singapore, accommodating economic and population growth is vitally important to sustainable development. Since the 1960s, Singapore’s population has grown from 1.6 million to 5.3 million, while the land area has increased from 581 km² to 716 km² in 2012. In 2013, the government projected that the total population would reach about 6.9 million by 2030. As a result, there is increasing strain on space and service infrastructure. The planning agencies need to understand the urban fabric and how it can adapt to social, economic, and environmental changes. This motivated us to adopt a data-driven approach to understanding the dynamics of the transport system in Singapore.

V. RESULTS AND DISCUSSION

Using resources in the Amazon EC2 US East (Northern Virginia) Region, we conducted experimental tests to examine the performance of our proposed dynamic hybrid resource provisioning mechanism and application-aware fault tolerance approach in terms of cost savings and application completion time. We first built and tested the S3 application in a local private cloud environment to verify and update the logic and implementation of the algorithms. Then, we migrated it to Amazon EC2 to test scalability, cost savings and reliability. Using the standard Ubuntu 12.04 Amazon Machine Image (AMI), we built customized AMIs and configured the network settings.

The Scalable complex System modeling for Sustainable city (S3) application was developed to study how the city will behave under different planning scenarios.

B. Scalable complex System modelling for Sustainable city

The S3 application is composed of three phases: pre-processing, data analysis and agent-based simulation.

ETL or pre-processing. The synthetic data set has a time granularity of one second over a duration of 7 days, with approximately 3 million journeys per day in Singapore. Data is extracted and transformed into travel durations for each origin-station to destination-station pair (O-D pair) of the 90x90 station matrix, for three different route choices.

There are three types of agents in the S3 application: agents for 6.9 million commuters, 90 stations, and 3000 trains, respectively. Each agent has its own attributes, adaptive agent process, decision-making process and agent interactions.

There are three types of instances used for the S3 application:

m1.xlarge Linux instance (4 cores with 8 ECU and 15 GiB memory) for workflow management control, dynamic resource provisioning, resource pool monitoring and fault tolerance.
m3.xlarge Linux instance (4 cores with 13 ECU and 15 GiB memory) for the MySQL database server that saves status and intermediate results.
m1.small Linux instance (1 core with 1 ECU and 1.7 GiB memory) for computation.
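The three instance roles listed above can be captured in a small configuration structure. The sketch below is only an illustration: the class `InstanceType`, the tuple `S3_INSTANCE_TYPES` and the helper `pool_ecu` are hypothetical names, while the core, ECU and memory figures are the ones given in the list.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class InstanceType:
    name: str
    cores: int
    ecu: int            # EC2 Compute Units, as quoted in the text
    memory_gib: float
    role: str


# Specifications and roles as listed in the text.
S3_INSTANCE_TYPES: Tuple[InstanceType, ...] = (
    InstanceType("m1.xlarge", 4, 8, 15.0,
                 "workflow control, provisioning, monitoring, fault tolerance"),
    InstanceType("m3.xlarge", 4, 13, 15.0, "MySQL database server"),
    InstanceType("m1.small", 1, 1, 1.7, "computation"),
)


def pool_ecu(instance: InstanceType, count: int) -> int:
    """Aggregate ECU of `count` identical instances (hypothetical helper)."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return instance.ecu * count


compute = S3_INSTANCE_TYPES[2]
print(compute.name, pool_ecu(compute, 1000))  # m1.small 1000
```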

A. Spot Instance Scalability Result

Spot instances will be terminated if the user's bid price is lower than or equal to the current spot price. The system termination rate is defined as the number of system-terminated instances divided by the total number of requested instances.
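The termination rule and the system termination rate defined above can be written down directly. This is a minimal sketch: the function names are hypothetical, the bid of 0.045 USD is the one used in the experiments, and the spot prices and instance counts in the usage lines are made-up illustration values, not measurements.

```python
def is_terminated(bid_price: float, spot_price: float) -> bool:
    """Rule as stated in the text: a bid at or below the spot price is terminated."""
    return bid_price <= spot_price


def system_termination_rate(num_terminated: int, num_requested: int) -> float:
    """Fraction of requested instances that the system terminated."""
    if num_requested <= 0:
        raise ValueError("num_requested must be positive")
    if not 0 <= num_terminated <= num_requested:
        raise ValueError("num_terminated must lie in [0, num_requested]")
    return num_terminated / num_requested


BID = 0.045  # USD per hour per instance, from the experiments
for spot in (0.040, 0.045, 0.050):  # hypothetical spot prices
    print(spot, is_terminated(BID, spot))

# Hypothetical counts: 30 of 1000 requested instances terminated by the system.
print(f"{system_termination_rate(30, 1000):.1%}")
```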

Data analysis. Our approach creates or improves the synthetic journey function for all possible O-D pairs, the possible routes for each O-D pair, and the temporal travel demand.

Agent-based simulation. Agent-based simulation consists of agent granularity, adaptive agent processes, decision-making heuristics and agent interactions. It is used to simulate the actions and interactions of autonomous agents.

Fig 3 shows the system termination rate of m1.small spot instances in the Amazon EC2 US East (Northern Virginia) Region at a bid price of 0.045 USD per hour per instance. From Fig 3, we observe a disturbance effect: the termination rate increases as the number of requested spot instances increases. The system termination rate increases significantly if more than 1000 spot instances are launched, and Amazon EC2 distributes such a large number of spot instances across several Zones in the same Region. Hence, in our experiments, we set the backup level to the dynamic termination rate that corresponds to the number of instances. If we bid a high price for a very large number of spot instances, the spot price will approach the on-demand instance price. Conversely, if we bid a lower price for a very large number of spot instances, the system termination rate will be higher. Finally, we chose a medium bid price to trade off the spot price against the system termination rate.
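One possible reading of the backup policy above is that the number of on-demand backup instances equals the expected number of terminated spot instances for the requested pool size. The sketch below follows that interpretation and should be taken as an assumption, not as the authors' exact rule. The pool sizes 50, 200, 500 and 1000 are those on the horizontal axis of Fig 3; the rates attached to them are placeholder values, not the measured ones, and the names `HYPOTHETICAL_RATES`, `rate_for_pool_size` and `backup_count` are invented for the example.

```python
import bisect
import math
from typing import List, Tuple

# (pool size, termination rate). Rates are PLACEHOLDERS, not data from Fig 3.
HYPOTHETICAL_RATES: List[Tuple[int, float]] = [
    (50, 0.01), (200, 0.02), (500, 0.03), (1000, 0.05),
]


def rate_for_pool_size(num_spot: int,
                       table: List[Tuple[int, float]] = HYPOTHETICAL_RATES) -> float:
    """Look up the rate of the smallest tabulated pool size >= num_spot.

    Pool sizes beyond the table reuse the last (largest) entry.
    """
    sizes = [size for size, _ in table]
    idx = min(bisect.bisect_left(sizes, num_spot), len(table) - 1)
    return table[idx][1]


def backup_count(num_spot: int, termination_rate: float) -> int:
    """On-demand backups to hold: expected terminations, rounded up."""
    if num_spot < 0 or not 0.0 <= termination_rate <= 1.0:
        raise ValueError("invalid pool size or termination rate")
    # round() guards against floating-point noise before taking the ceiling.
    return math.ceil(round(num_spot * termination_rate, 9))


for n in (50, 200, 500, 1000):
    print(n, backup_count(n, rate_for_pool_size(n)))
```

Rounding up keeps at least one backup available whenever any termination is expected, at the price of a slightly larger on-demand bill.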

Program modeling. The S3 application architecture comprises an adaptive cloud workflow management system, an ETL or pre-processing algorithm, a data analysis algorithm and an agent-based simulation. It applies event-based simulation modeling to handle agent interactions such as the boarding of commuters, the alighting of commuters, train arrival at a station, and train departure from a station. On the back end, the workload is distributed by a method similar to that of the other phases (each process handles a group of agents).
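Event-based simulation of the four interactions named above is commonly built around a time-ordered priority queue. The following is a generic sketch of that technique, not the S3 code: the class names, the rule that an arrival triggers alighting, boarding and departure, and the timing constants are all assumptions introduced for illustration.

```python
import heapq
from dataclasses import dataclass, field
from typing import List

# Hypothetical timings in seconds; the paper does not give these values.
ALIGHT_TIME = 10.0
DWELL_TIME = 30.0


@dataclass(order=True)
class Event:
    time: float
    seq: int  # tie-breaker so equal-time events keep insertion order
    kind: str = field(compare=False)
    train: int = field(compare=False)
    station: int = field(compare=False)


class EventQueue:
    def __init__(self) -> None:
        self._heap: List[Event] = []
        self._seq = 0

    def schedule(self, time: float, kind: str, train: int, station: int) -> None:
        heapq.heappush(self._heap, Event(time, self._seq, kind, train, station))
        self._seq += 1

    def run(self) -> List[Event]:
        """Process events in time order and return them in processing order."""
        processed: List[Event] = []
        while self._heap:
            ev = heapq.heappop(self._heap)
            processed.append(ev)
            if ev.kind == "train_arrival":
                # An arrival triggers alighting, then boarding, then departure.
                self.schedule(ev.time, "alighting", ev.train, ev.station)
                self.schedule(ev.time + ALIGHT_TIME, "boarding", ev.train, ev.station)
                self.schedule(ev.time + DWELL_TIME, "train_departure", ev.train, ev.station)
        return processed


queue = EventQueue()
queue.schedule(10.0, "train_arrival", train=1, station=5)
queue.schedule(5.0, "train_arrival", train=2, station=3)
for ev in queue.run():
    print(ev.time, ev.kind, ev.train, ev.station)
```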


Figure 3. System termination rate of m1.small spot instances (y-axis: termination rate in %; x-axis: number of requested m1.small spot instances, from 50 to 1000).

Figure 5. Completion time (y-axis: completion time in seconds, by provisioning method).

B. Cost saving and completion time result

We examine the performance using two key indicators: completion time and cost savings. We compare our proposed DHSI approach, which runs on hybrid instances (both on-demand and spot instances), with the method in [30] using on-demand instances and the method in [31] using spot instances for dynamic resource provisioning.

VI. CONCLUSIONS AND FUTURE WORK

Migrating a large-scale computational application to a public cloud may help to handle peak workloads and improve the turnaround time. However, it raises concerns about reliability and cost. After analyzing the characteristics of Amazon EC2 spot instance prices, we showed that there are disturbance effects when a large number of spot instances is requested. Hence, an effective solution is required to handle both the cost and the reliability of running large-scale computational applications in public clouds.

We proposed a dynamic resource provisioning solution for running large-scale applications on hybrid instances, which include both spot and on-demand instances.

We configure a number of backup instances using on-demand instances and bid for spot instances to run large-scale computational applications in public clouds. A case study of the Scalable complex System modeling for Sustainable city application was used to evaluate the proposed approach: dynamic hybrid resource provisioning, on-demand instances for backup, and application-aware fault tolerance.

We conducted experimental tests using 1000 Amazon EC2 instances. The results show that our proposed hybrid resource provisioning method achieved 23.3% cost savings compared with the on-demand-only method, while the total completion time increased by only 5.3% under the backup fault tolerance policy. Our proposed methods are highly practical: researchers and users can combine on-demand and spot instances to further reduce cost under a completion time constraint.

Figure 4. The total cost of public instances (y-axis: total EC2 instance cost in USD; bars: On-demand, Hybrid, Spot).

Fig 4 shows the total cost of running the S3 application. Compared with the on-demand method, our hybrid approach achieves 23.3% cost savings when 1000 Amazon EC2 instances are requested. The method using spot instances [31] achieves a slightly higher cost saving of about 25.4%, but has a much longer completion time than DHSI, as shown in Fig 5.
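The cost-saving percentages quoted above relate a method's cost to the on-demand baseline. The sketch below shows that arithmetic under stated assumptions: the helper names are hypothetical, and the baseline of 100.0 is an arbitrary normalised value chosen for illustration, not the measured on-demand cost; only the 23.3% and 25.4% savings come from the text.

```python
def saving_pct(baseline_cost: float, cost: float) -> float:
    """Cost saving of `cost` relative to `baseline_cost`, in percent."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return (baseline_cost - cost) / baseline_cost * 100.0


def cost_after_saving(baseline_cost: float, saving_percent: float) -> float:
    """Cost implied by a given percentage saving over the baseline."""
    return baseline_cost * (1.0 - saving_percent / 100.0)


ON_DEMAND = 100.0  # hypothetical normalised baseline, not a measured cost
hybrid = cost_after_saving(ON_DEMAND, 23.3)  # saving reported for the hybrid method
spot = cost_after_saving(ON_DEMAND, 25.4)    # saving reported for the spot method [31]
print(f"hybrid={hybrid:.1f} spot={spot:.1f}")
print(f"hybrid saving: {saving_pct(ON_DEMAND, hybrid):.1f}%")
```

On this normalised scale the hybrid and spot-only costs differ by about two units, which is the small cost gap that the hybrid method trades for its shorter completion time.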

Fig 5 shows the total completion time of running the S3 application. With our proposed DHSI method, the total completion time increased by only 5.3%, which is close to that of the on-demand method and 12.0% less than that of the spot method.

Our future work is to extend these results to applications with long execution times, for further cost savings, as well as to applications with light data movement.

ACKNOWLEDGMENT

Thanks to Vicknesh Selvam, Christopher Monterola and Vasundhara Jayaraman for developing the Scalable complex System modeling for Sustainable city.


REFERENCES

[1] Amazon EC2, http://aws.amazon.com/ec2/
[2] Henry Kasim, Terence Hung, Erika Fille Tupas Legara, Kee Khong Lee, Xiaorong Li, Bu-sung Lee, Vikknesh Selvam, Sifei Lu, Long Wang, Christopher Monterola, and Vasundhara Jayaraman, "Scalable complex system modelling for a sustainable city", The Sixth IEEE International Scalable Computing Challenge (SCALE 2013), in conjunction with The 13th International Symposium on Cluster, Cloud and Grid (CCGrid 2013), Delft, Netherlands, May 13-15, 2013.
[3] Byung Chul Tak, B. Urgaonkar, A. Sivasubramaniam, "Cloudy with a Chance of Cost Savings", IEEE Transactions on Parallel and Distributed Systems, Volume 24, Issue 6, 2013, pp 1223-1233.
[4] Chen Wang, Junliang Chen, Bing Bing Zhou, and Albert Y. Zomaya, "Just Satisfactory Resource Provisioning for Parallel Applications in the Cloud", 2012 IEEE Eighth World Congress on Services, 2012, pp 285-292.
[5] Sivadon Chaisiri, Bu-Sung Lee, and Dusit Niyato, "Robust Cloud Resource Provisioning for Cloud Computing Environments", in Proc. of the IEEE International Conference on Service-Oriented Computing and Applications (SOCA'10), Perth, Australia, December 14, 2010.
[6] Sivadon Chaisiri, Bu-Sung Lee, and Dusit Niyato, "Optimization of Resource Provisioning Cost in Cloud Computing", IEEE Transactions on Services Computing, Volume 5, Issue 2, pp 164-177.
[7] Ta Nguyen Binh Duong, Xiaorong Li, Rick Siow Mong Goh, "A framework for dynamic resource provisioning and adaptation in IaaS clouds", in Proc. of the Third IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2011), Athens, Greece, 29 November-1 December 2011, pp 312-319.
[8] Ruben Van den Bossche, Kurt Vanmechelen, Jan Broeckhove, "Cost-Efficient Scheduling Heuristics for Deadline Constrained Workloads on Hybrid Clouds", in Proc. of the Third IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2011), Athens, Greece, 29 November-1 December 2011, pp 320-327.
[9] Min Lu, Huiqun Yu, "A Fault Tolerant Strategy in Hybrid Cloud Based on QPN Performance Model", in Proc. of the International Conference on Information Science and Applications (ICISA 2013), Suwon, Korea, 24-26 June 2013.
[10] Suraj Pandey, Dileban Karunamoorthy and Rajkumar Buyya, "Workflow Engine for Clouds", Chapter 12, pp 321-344, Cloud Computing: Principles and Paradigms, R. Buyya, J. Broberg, A. Goscinski (eds), ISBN-13: 978-0470887998, Wiley Press, New York, USA, February 2011.
[11] Mustarfizur Rahman, Xiaorong Li, Henry Palit, "Hybrid Heuristic for Scheduling Data Analytics Workflow Applications in Hybrid Cloud Environment", in Proc. of the High-Performance Grid and Cloud Computing Workshop 2011, in conjunction with the International Parallel and Distributed Processing Symposium (IPDPS 2011), 2011.
[12] M. Malawski, G. Juve, E. Deelman, J. Nabrzyski, "Cost- and deadline-constrained provisioning for scientific workflow ensembles in IaaS clouds", in Proc. of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2012), Salt Lake City, USA, 10-16 Nov. 2012.
[13] Bahman Javadi, Jemal Abawajy, Richard O. Sinnott, "Hybrid Cloud Resource Provisioning Policy in the Presence of Resource Failures", in Proc. of the 4th IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2012), Taipei, 3-6 Dec. 2012.
[14] Hyunjoo Kim, Yaakoub el-Khamra, Ivan Rodero, Shantenu Jha, Manish Parashar, "Autonomic management of application workflows on hybrid computing infrastructure", Scientific Computing 19(2-3):75-89, 2011, IOS Press.
[15] Hyunjoo Kim, Manish Parashar, "CometCloud: an autonomic Cloud engine", Chapter 10, pp 275-297, Cloud Computing: Principles and Paradigms, R. Buyya, J. Broberg, A. Goscinski (eds), ISBN-13: 978-0470887998, Wiley Press, New York, USA, February 2011.
[16] Michael Mattess, Christian Vecchiola, and Rajkumar Buyya, "Managing Peak Loads by Leasing Cloud Infrastructure Services from a Spot Market", in Proc. of the 12th IEEE International Conference on High Performance Computing and Communications (HPCC2010), Melbourne, Australia, 1-3 September 2010, pp 180-188.
[17] Navraj Chohan, Claris Castillo, Mike Spreitzer, Malgorzata Steinder, Asser Tantawi, Chandra Krintz, "See Spot Run: Using Spot Instances for MapReduce Workflows", in Proc. of the 2nd USENIX Conference on Hot Topics in Cloud Computing (HotCloud 2010).
[18] Michele Mazzucco and Marlon Dumas, "Achieving performance and availability guarantees with spot instances", in Proc. of the 13th International Conferences on High Performance Computing and Communications (HPCC2011), Los Alamitos, USA, 2011, pp 291-303.
[19] Artur Andrzejak, Derrick Kondo, Sangho Yi, "Decision Model for Cloud Computing under SLA Constraints", The 18th IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS2010), Miami Beach, USA, 17-19 Aug 2010, pp 257-266.
[20] Junliang Chen, Chen Wang, Bing Bing Zhou, Lei Sun, Young Choon Lee, Albert Y. Zomaya, "Tradeoffs Between Profit and Customer Satisfaction for Service Provisioning in the Cloud", in Proc. of the 20th International Symposium on High Performance Distributed Computing (HPDC 2011), San Jose, USA, June 8-11, 2011, pp 229-238.
[21] Qi Zhang, Quanyan Zhu, R. Boutaba, "Dynamic Resource Allocation for Spot Markets in Cloud Computing Environments", in Proc. of the Fourth IEEE International Conference on Utility and Cloud Computing (UCC2011), Melbourne, Australia, December 5-7, 2011, pp 178-185.
[22] Kai Song, Yuan Yao, Leana Golubchik, "Exploring the Profit-Reliability Trade-off in Amazon’s Spot Instance Market: a Better Pricing Mechanism", in Proc. of the IEEE/ACM 21st International Symposium on Quality of Service (IWQoS 2013), Montreal, Canada, 3-4 June 2013.
[23] Wei Wang, Ben Liang, Baochun Li, "Revenue Maximization with Dynamic Auctions in IaaS Cloud Markets", in Proc. of the IEEE/ACM 21st International Symposium on Quality of Service (IWQoS 2013), Montreal, Canada, 3-4 June 2013.
[24] Bahman Javadi, Ruppa K. Thulasiram, Rajkumar Buyya, "Statistical Modeling of Spot Instance Prices in Public Cloud Environments", in Proc. of the Fourth IEEE International Conference on Utility and Cloud Computing (UCC 2011), Victoria, NSW, 5-8 Dec. 2011, pp 219-228.
[25] Orna Agmon Ben-Yehuda, Muli Ben-Yehuda, Assaf Schuster, Dan Tsafrir, "Deconstructing Amazon EC2 Spot Instance Pricing", in Proc. of the Third IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2011), Athens, Greece, 29 November-1 December 2011, pp 304-311.
[26] Murtaza Zafer, Yang Song, Kang-Won Lee, "Optimal Bids for Spot VMs in a Cloud for Deadline Constrained Jobs", in Proc. of the Fifth IEEE International Conference on Cloud Computing (Cloud 2012), Honolulu, USA, 24-29 June 2012, pp 75-82.
[27] Han Zhao, Miao Pan, Xinxin Liu, Xiaolin Li, Yuguang Fang, "Optimal Resource Rental Planning for Elastic Applications in Cloud Market", in Proc. of the 26th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2012), Shanghai, China, 21-25 May 2012, pp 808-819.
[28] Qin Zheng, "Improving MapReduce Fault Tolerance in the Cloud", the 15th IEEE Workshop on Dependable Parallel, Distributed and Network-Centric Systems (DPDNS), in conjunction with IEEE IPDPS, Atlanta, USA, Apr. 2010.
[29] Sangho Yi, Derrick Kondo, Artur Andrzejak, "Reducing Costs of Spot Instances via Checkpointing in the Amazon Elastic Compute Cloud", in Proc. of the 3rd IEEE International Conference on Cloud Computing (Cloud 2010), Miami, USA, 5-10 July 2010, pp 236-243.
[30] Sangho Yi, Artur Andrzejak, Derrick Kondo, "Monetary Cost-Aware Checkpointing and Migration on Amazon Cloud Spot Instances", IEEE Transactions on Services Computing, Volume 5, Issue 4, Oct.-Dec. 2012, pp 512-524.
[31] William Voorsluys, Rajkumar Buyya, "Reliable Provisioning of Spot Instances for Compute-intensive Applications", in Proc. of the 6th IEEE International Conference on Advanced Information Networking and Applications (AINA2012), Fukuoka, Japan, 26-29 March 2012, pp 542-549.
