DYNAMIC CLOUD PROVISIONING FOR SCIENTIFIC GRID WORKFLOWS
Simon Ostermann, Radu Prodan and Thomas Fahringer
Institute of Computer Science, University of Innsbruck
Technikerstrasse 21a, Innsbruck, Austria
[email protected]
OVERVIEW
• Introduction
• Optimized Cloud Provisioning
  • Cloud Start
  • Instance Size
  • Grid Scheduling
  • Cloud Stop
• Evaluation using 3 scientific workflows
  • Wien2k
  • Invmod
  • Meteoag
• Conclusion
INTRODUCTION
• Infrastructure as a Service, a branch of Cloud computing
• On-demand resources, e.g. Amazon EC2, GoGrid, ...
• Other common Cloud computing areas not covered:
  • Platform as a Service
  • Software as a Service
  • Specialized solutions for Storage, Web hosting, ...
CLOUD COMPUTING FOR SCIENTIFIC COMPUTING?
• Rent resources instead of buying own hardware
• Eliminates permanent operation, maintenance, and depreciation costs
• Scale an infrastructure up or down based on temporary immediate needs
• Significantly reduced over-provisioning
• Virtualised resources enable scalable deployment and provisioning of application software
• Reliability through business SLA relationships that bind actors to offering higher QoS guarantees
[Figure: Cloud instance life cycle with the states Unallocated, Requested, Starting, Running, Accessible, Shutting down, Terminated, Unallocated]
CLOUD MODELS
• Cloud computing mostly available on an hourly basis
• Some research papers assume finer granularity
• Interesting
[Figure: diagram text not recoverable; one legible label reads "Generate failure"]
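The difference between hourly billing and a finer granularity can be illustrated with a small calculation. The following is a minimal sketch, not taken from the slides: the function name `billed_cost`, the price of 0.10 per hour and the runtime of 4000 seconds are all hypothetical values chosen for illustration.

```python
import math


def billed_cost(runtime_seconds, price_per_hour, granularity_seconds=3600):
    """Cost of one instance when usage is rounded up to the billing granularity.

    All parameters are illustrative assumptions, not values from the slides.
    """
    if runtime_seconds <= 0:
        return 0.0
    # Every started billing unit is charged in full.
    units = math.ceil(runtime_seconds / granularity_seconds)
    return units * granularity_seconds / 3600 * price_per_hour


# Hypothetical example: an instance used for 4000 seconds at 0.10 per hour.
hourly = billed_cost(4000, 0.10)                            # two full hours are charged
per_second = billed_cost(4000, 0.10, granularity_seconds=1)  # only the used time is charged
print(hourly, round(per_second, 4))
```

Under these assumptions the hourly model charges for two full hours although little more than one was used, which is why the moment an instance is stopped matters for cost.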
OPTIMIZED CLOUD PROVISIONING
• Analysis of regular executions and the resulting costs
• Analysis resulted in multiple parts needing optimization
• Choices have to be made about: start and stop of resources and the number of instances requested
• Four optimizations found, defined as algorithms (in the paper) and exploited in the evaluation
CLOUD START
• Parallel regions with more tasks than available cores
• Depending on Cloud and Grid speed, Serialization and Imbalance overheads are analyzed
• When minimization of the runtime of the parallel section is possible, Cloud resources are started
[Figure: two task schedules over Grid core 1, Grid core 2, Grid core 3 and Cloud core 1, with tasks labelled 120 and axis values 250 and 300]
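The Cloud start decision can be sketched as a comparison of the parallel section's runtime with and without additional Cloud cores. This is a simplified illustration under stated assumptions, not the algorithm from the paper: tasks are assumed to be equal in length, scheduling is greedy, and the names `makespan`, `should_start_cloud` and `cloud_speed` are hypothetical. The task length of 120 is borrowed from the figure labels; the task and core counts are made up.

```python
def makespan(num_tasks, task_time, core_speeds):
    """Runtime of a parallel section under a simple greedy assignment.

    Each task goes to the core that would finish it earliest. A core with
    speed s needs task_time / s per task (assumption for illustration).
    """
    if not core_speeds:
        raise ValueError("at least one core is required")
    finish = [0.0] * len(core_speeds)
    for _ in range(num_tasks):
        best = min(
            range(len(core_speeds)),
            key=lambda i: finish[i] + task_time / core_speeds[i],
        )
        finish[best] += task_time / core_speeds[best]
    return max(finish)


def should_start_cloud(num_tasks, task_time, grid_cores, cloud_cores, cloud_speed=1.0):
    """Return (start?, grid-only runtime, combined runtime)."""
    grid_only = makespan(num_tasks, task_time, [1.0] * grid_cores)
    combined = makespan(
        num_tasks, task_time, [1.0] * grid_cores + [cloud_speed] * cloud_cores
    )
    # Cloud resources are only worth starting if the section gets shorter.
    return combined < grid_only, grid_only, combined


# Four tasks on three Grid cores: one extra Cloud core removes the second round.
print(should_start_cloud(4, 120, grid_cores=3, cloud_cores=1))
# Five tasks: the extra core does not shorten the section (imbalance remains).
print(should_start_cloud(5, 120, grid_cores=3, cloud_cores=1))
```

In this sketch an extra core only pays off when it removes a whole round of tasks, which is one way to read the imbalance overhead mentioned above.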