Deadline-Constrained Algorithms for Scheduling of Bag-of-Tasks and Workflows in Cloud Computing Environments

Ashish Kumar Maurya
Anil Kumar Tripathi
Department of Computer Science & Engineering, Indian Institute of Technology (BHU), Varanasi, India
Department of Computer Science & Engineering, Indian Institute of Technology (BHU), Varanasi, India
[email protected]
[email protected]
ABSTRACT

Cloud computing is an emerging distributed computing paradigm that supports immense scientific applications by distributing computing resources over the Internet. These applications may contain a huge number of tasks, which can inflate their execution costs if not scheduled appropriately. Scheduling of tasks is thus one of the key challenges in cloud computing environments. The scheduling problem for Bag-of-Tasks (BoT) and workflow applications has been broadly studied, and many algorithms exist for it in cloud computing. In this paper, we evaluate and compare the performance of four deadline-constrained scheduling algorithms for cloud computing environments, of which two are heuristic algorithms, IC-PCP and SCS, and two are meta-heuristic algorithms, PSO and CSO. The algorithms aim to minimize the makespan and execution cost of BoT and workflow applications while meeting deadline constraints. For performance estimation and comparison of the algorithms, we used three categories of BoT (small, medium, and large) and two real-world workflow applications, Montage and CyberShake. The results show that the CSO algorithm performs better than the other algorithms for both BoT and workflow applications.
CCS Concepts • Computing methodologies → Distributed computing methodologies; Scheduling Algorithms;
Keywords Scheduling Algorithms, Resource Provisioning, Bag-of-Tasks, Scientific Workflows, Cloud Computing
1. INTRODUCTION

Over the last few decades, distributed environments have grown from domain-sharing platforms to service-provisioning models, the newest of which is cloud computing [1]. Cloud computing environments facilitate the utilization of different resources over
the Internet and follow a pay-as-you-go model in which users pay according to their resource usage [2]. Today, many cloud vendors offer various products and services to their users; according to Forbes¹, the world's ten most powerful and most influential cloud computing vendors in 2017 were Microsoft, Amazon, IBM, Salesforce.com, SAP, Oracle, Google, ServiceNow, Workday, and VMware. Based on the service model used, cloud offerings can be broadly categorized into Platform as a Service (PaaS), Software as a Service (SaaS), and Infrastructure as a Service (IaaS) [2]. This work concentrates on IaaS clouds, which provide the customer a virtually unlimited number of heterogeneous virtual resources that can be used on demand. Furthermore, customers are given the flexibility of acquiring and releasing resources in different configurations to meet the needs of an application. This gives users more power and additional control over the resources, which motivates the search for novel scheduling methods that utilize the distributed resources well.

BoT and workflow applications are frequently used in large and complex systems. A BoT application can be represented as a set of independent tasks that do not interact with each other during execution; each task in a BoT can therefore be scheduled independently and may have different computational requirements. Workflows in large scientific and complex applications require high computing power and hence need a high-performance computing environment to complete the application in an acceptable time. A workflow application can be represented as a group of dependent tasks interconnected by computational or data dependencies that communicate with each other during execution. Figure 1 shows examples of a BoT application with independent tasks and a workflow application with dependent tasks.

Scheduling these tasks onto distributed environments such as clusters and grids has been broadly studied for many years, with scheduling algorithms attempting to reduce the makespan of BoTs and workflows. In newer distributed computing paradigms like cloud computing, however, parameters other than makespan, such as economic cost, must also be optimized. Researchers are therefore developing novel task scheduling methods for cloud computing to deal with these challenges.

In this work, we perform a comparative study of four deadline-constrained algorithms for scheduling Bags-of-Tasks and workflows in cloud computing environments. The algorithms are IC-PCP (IaaS Cloud Partial Critical Path) [3], SCS (Scaling Consolidation Scheduling) [4], PSO (Particle Swarm Optimization) [5], and CSO (Cat Swarm Optimization) [6][7].
¹ https://www.forbes.com/sites/bobevans1/
The algorithms aim to minimize the makespan and execution cost of Bag-of-Tasks (BoT) and workflow applications while meeting deadline constraints. For performance estimation and comparison of the algorithms, we used three categories of BoT (small, medium, and large) and two real-world workflow applications, namely Montage and CyberShake.
Figure 1. Examples of (a) a BoT application with independent tasks and (b) a workflow application with dependent tasks.

The rest of the paper is organized as follows: Section 2 describes the algorithms used in this work. Performance evaluation and result analysis are presented in Section 3. Section 4 summarizes the findings and outlines the scope for future work.
2. A DESCRIPTION OF ALGORITHMS USED

In this section, we summarize the four deadline-constrained scheduling algorithms: two heuristic algorithms, IC-PCP and SCS, and two meta-heuristic algorithms, PSO and CSO.
2.1 IC-PCP Algorithm
The IC-PCP algorithm [3] is derived from the PCP (Partial Critical Path) scheduling algorithm for IaaS clouds, which contains two phases: deadline distribution and planning. The first phase distributes the deadline of the workflow among its tasks. To do this, the Critical Path (CP) of the workflow is determined and a path-assignment algorithm is called to distribute the deadline among the tasks of the CP. The tasks' individual deadlines can then be used to determine the deadlines of their predecessors in the workflow. Afterwards, the planning algorithm schedules the workflow by allocating each task to the cheapest resource that fulfils its individual deadline. IC-PCP itself is a single-phase algorithm that schedules the tasks of a path on the same Virtual Machine (VM); the path is preferably allocated to an already-leased instance that can fulfil the Latest Completion Time (LCT) constraints of its tasks. If this is not possible, the cheapest instance that can complete the tasks before their LCT is leased, and the path is allocated to it. The algorithm then determines a CP for each unassigned task on the allocated path, and the procedure is repeated until every task has been assigned. At the end of this procedure, all tasks have been allocated to VMs and have start and finish times linked to them. Moreover, each VM maintains a start time, computed from the time it begins executing its first assigned task, and an end time, computed from the completion time of its last assigned task.
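To make the path-forming step concrete, the following is a minimal Python sketch of the "critical parent" chase that yields a partial critical path. The data structures (parents, runtime, transfer) and function names are our illustrative assumptions, not code from [3]:

```python
# Minimal sketch of forming a Partial Critical Path (PCP); illustrative
# structures, not the implementation of [3].

def earliest_finish_times(parents, runtime, transfer):
    """Earliest finish time of each task, by a recursive topological sweep."""
    eft = {}
    def visit(t):
        if t not in eft:
            ready = max((visit(p) + transfer.get((p, t), 0)
                         for p in parents[t]), default=0)
            eft[t] = ready + runtime[t]
        return eft[t]
    for t in parents:
        visit(t)
    return eft

def partial_critical_path(exit_task, parents, transfer, assigned, eft):
    """Walk backwards from exit_task, each time following the unassigned
    'critical parent' whose data would arrive last."""
    path, t = [exit_task], exit_task
    while True:
        candidates = [p for p in parents[t] if p not in assigned]
        if not candidates:
            break
        t = max(candidates, key=lambda p: eft[p] + transfer.get((p, t), 0))
        path.append(t)
    return list(reversed(path))

parents = {"t1": [], "t2": ["t1"], "t3": ["t1"], "t4": ["t2", "t3"]}
runtime = {"t1": 4, "t2": 6, "t3": 3, "t4": 2}
transfer = {("t1", "t2"): 1, ("t1", "t3"): 1, ("t2", "t4"): 2, ("t3", "t4"): 1}
eft = earliest_finish_times(parents, runtime, transfer)
print(partial_critical_path("t4", parents, transfer, set(), eft))
# ['t1', 't2', 't4'] — the path IC-PCP would schedule on one VM first
```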
2.2 SCS Algorithm
The SCS algorithm [4] is a dynamic algorithm devised to schedule multiple workflows; it can also be used to schedule a single workflow instance without any alteration. SCS first identifies and bundles tasks with dependency constraints into a single task, which reduces the time spent in data transfer. The workflow's deadline is then distributed among the tasks so that each task receives a portion of the deadline based on the VM type that is most economical for it; this deadline assignment follows the method of [8]. Next, a load vector is defined for each VM type, specifying how many VMs are required at a given moment for the tasks to finish by their allotted deadlines; its value is determined from the estimated execution time of a task on the particular instance type and the computation interval obtained from the deadline assignment phase. Subsequently, SCS consolidates partial instance hours by merging tasks running on different instances onto one instance whenever a VM has idle time and can finish the additional task by its original deadline. Finally, earliest-deadline-first is used to assign tasks onto active VMs, so that the task with the earliest deadline is assigned as soon as a matching instance type becomes available.
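The load-vector idea can be illustrated with a small sketch; the task tuples, numbers, and function name below are our assumptions, and the actual bookkeeping in [4] is more involved:

```python
import math
from collections import defaultdict

# Simplified illustration of SCS's load vector (after [4]); names are ours.
# Each task: (preferred_vm_type, est_runtime_s, time_left_until_subdeadline_s)
tasks = [
    ("m1.small", 1200, 1800),
    ("m1.small", 900, 1800),
    ("m1.large", 3000, 3600),
]

def load_vector(tasks):
    """For each VM type, estimate how many concurrent instances are needed
    so every task can still finish by its allotted sub-deadline."""
    demand = defaultdict(float)
    for vm_type, runtime, time_left in tasks:
        # A task needing 1200 s of work with 1800 s left contributes 2/3
        # of one machine; fractional demands from many tasks add up.
        demand[vm_type] += runtime / max(time_left, 1e-9)
    return {vm: math.ceil(d) for vm, d in demand.items()}

print(load_vector(tasks))  # {'m1.small': 2, 'm1.large': 1}
```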
2.3 PSO Algorithm
The PSO algorithm [5] is an evolutionary computational method inspired by the simulation of social behavior. It is a stochastic optimization method built around the idea of a particle, which represents an individual such as a bird in a flock or a fish in a school. Each particle moves through the search space with the aim of finding a candidate solution. A particle's movement at any instant is defined by its velocity and its position; the velocity has direction and magnitude and is hence represented as a vector. A particle's velocity is computed from the best position it has found so far and the best position found by any particle so far. Particles record two positions: (1) pbest, their personal best position, and (2) gbest, the global best position. Both values are evaluated according to the fitness function. At each step, the algorithm updates the particles' velocities toward the pbest and gbest positions; the movement toward these positions is subject to randomly generated acceleration coefficients. The algorithm iterates until a stopping condition is fulfilled, which may be a predefined fitness value or a given maximum number of iterations. The fitness function varies from problem to problem, depending on the requirements and constraints.
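A minimal, generic PSO loop may help fix the ideas; the inertia and acceleration coefficients and the sphere fitness function below are illustrative textbook stand-ins, not the scheduling-specific encoding of [5]:

```python
import random

# Generic PSO sketch: velocity = inertia + cognitive + social terms.
def pso(fitness, dim, n_particles=20, iters=100,
        w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0):
    pos = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = fitness(pos[i])
            if val < pbest_val[i]:              # update personal best
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:             # update global best
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Minimize the sphere function as a stand-in for a scheduling fitness.
best, best_val = pso(lambda x: sum(v * v for v in x), dim=3)
print(best_val)
```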
2.4 CSO Algorithm
The CSO algorithm [6] is a meta-heuristic algorithm based on the social behavior of cats, devised by observing their seeking and tracing behaviors. In seeking mode, cats only look for their next best positions without moving; in tracing mode, cats move with some velocity toward their next best positions, modeling how cats chase their targets. The algorithm first initializes the swarm of cats and assigns them random velocities. Using a mixture ratio (MR) of tracing-mode to seeking-mode cats, it randomly assigns MR percent of the cats to tracing mode and the remainder to seeking mode. It then evaluates the fitness of the cats, stores the best cat's position in memory, and updates the positions of all cats. These steps are repeated until a stopping criterion is satisfied. The authors of [7] proposed a customized CSO algorithm that assigns resources to tasks so as to reduce the total cost incurred in computing all tasks in a cloud environment. This algorithm provides fair load balancing on the available resources and decreases the energy wasted in random movement. Each cat corresponds to a task-resource mapping that is updated according to the mode the cat is in; finding the mapping that gives minimum cost depends on evaluating the fitness values of the cats. At each step, a new group of cats is selected for tracing mode. Finally, the best position among the cats provides the mapping with the minimum cost among all mappings.
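The two modes can be sketched compactly; the parameters (mixture ratio, seeking memory pool SMP, seeking range SRD) follow the usual CSO formulation [6], while the fitness function here is a toy stand-in for the task-resource mapping cost of [7]:

```python
import random

# Compact sketch of CSO's seeking and tracing modes; constants illustrative.
def cso(fitness, dim, n_cats=20, iters=100, mr=0.3,
        smp=5, srd=0.2, c=2.0, lo=-5.0, hi=5.0):
    cats = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_cats)]
    vel = [[0.0] * dim for _ in range(n_cats)]
    best = min(cats, key=fitness)[:]

    for _ in range(iters):
        # MR percent of the cats trace; the rest seek.
        tracing = set(random.sample(range(n_cats), int(mr * n_cats)))
        for i in range(n_cats):
            if i in tracing:
                # tracing mode: chase the best cat with some velocity
                for d in range(dim):
                    vel[i][d] += c * random.random() * (best[d] - cats[i][d])
                    cats[i][d] += vel[i][d]
            else:
                # seeking mode: make SMP perturbed copies, keep the best one
                copies = [[x * (1 + random.uniform(-srd, srd)) for x in cats[i]]
                          for _ in range(smp)]
                cats[i] = min(copies, key=fitness)
        cand = min(cats, key=fitness)
        if fitness(cand) < fitness(best):   # remember the best cat so far
            best = cand[:]
    return best, fitness(best)

best, val = cso(lambda x: sum(v * v for v in x), dim=3)
print(val)
```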
3. PERFORMANCE EVALUATIONS AND ANALYSIS

This section evaluates the performance of the algorithms and analyzes the results for BoT and workflow applications in a cloud computing environment. We used the CloudSim simulator [9] to provide the cloud computing environment.
Figure 2. Flow diagram of Montage and CyberShake workflows.
3.1 Experimental BoT and Workflows
For the experiments, Bags-of-Tasks are generated randomly, with three ranges (small, medium, and large) considered for the number of tasks in a BoT. Each BoT category contains tasks whose computational sizes, in millions of floating-point operations, are randomly generated over several ranges. In the experiments, we also used two types of workflows from real-world applications: the Montage and CyberShake workflows. The Montage workflow [10] is an astronomical application used to generate custom mosaics of the sky from input images; its major tasks are I/O-intensive and need very little processing power. The CyberShake workflow [10] is used to characterize earthquake hazards by producing synthetic seismograms and can be categorized as a data-intensive workflow with large memory and CPU requirements. Figure 2 shows the flow diagrams of the Montage and CyberShake workflows. In our experiments, we used a Montage workflow with 25 tasks and a CyberShake workflow with 29 tasks. Complete information about these two workflows is given in [11].
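For illustration, BoT instances of the three categories could be generated along these lines; the task-count ranges and MFLOP bounds below are assumptions, since the paper does not report the exact values used:

```python
import random

# Illustrative BoT generator; ranges and bounds are our assumptions.
SIZE_RANGES = {"small": (10, 50), "medium": (51, 200), "large": (201, 1000)}

def generate_bot(category, min_mflop=1_000, max_mflop=100_000, seed=None):
    """Return a list of independent task sizes (in MFLOP) for one BoT."""
    rng = random.Random(seed)
    n_tasks = rng.randint(*SIZE_RANGES[category])
    return [rng.randint(min_mflop, max_mflop) for _ in range(n_tasks)]

bot = generate_bot("small", seed=42)
print(len(bot), min(bot), max(bot))
```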
Figure 3. Total Execution Time of algorithms for small, medium and large ranges of BoT applications.
Table 1: VM Types used in the Experiments

    Name              EC2 Units    Processing Capacity (MFLOPS)
    m1.small          1            4400
    m1.medium         2            8800
    m1.large          4            17600
    m1.xLarge         8            35200
    m3.xLarge         13           57200
    m3.doubleXLarge   26           114400

3.2 Experimental Setup
In this work, we assumed six types of heterogeneous VMs with different configurations and processing capacities, similar to the setup used in [5]; these VMs are based on Amazon EC2 cloud offerings. The configurations of the VMs and their pricing model are taken from [5] and given in Table 1. The processing capacity of a VM in MFLOPS depends on the EC2 compute units it provides and can be estimated using the approach of [12]. Each workflow application is evaluated at three sizes, small, medium, and large, with on average 50, 100, and 1,000 tasks respectively. The billing interval of a VM is set to one hour. Based on the results of [4] for the Amazon EC2 cloud, we set a VM's boot time to 97 seconds; we performed each experiment 5 times and took the average of the results obtained for large workflows. We assumed that the computation times of tasks are known in advance, and performance variation is modeled according to [13].
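As a sanity check on Table 1, the processing capacities follow a simple linear rule of roughly 4,400 MFLOPS per EC2 compute unit, consistent with the estimation approach of [12]:

```python
# Processing capacity grows linearly with EC2 compute units (ECUs):
# Table 1's figures correspond to 4,400 MFLOPS per ECU.
ECU_PER_VM = {"m1.small": 1, "m1.medium": 2, "m1.large": 4,
              "m1.xLarge": 8, "m3.xLarge": 13, "m3.doubleXLarge": 26}
MFLOPS_PER_ECU = 4400

capacity = {vm: ecu * MFLOPS_PER_ECU for vm, ecu in ECU_PER_VM.items()}
print(capacity["m3.xLarge"])  # 57200, matching Table 1
```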
Figure 4. Percentage of Deadlines met for Montage and CyberShake Workflows.

The experiments are performed with a deadline chosen so that it lies between the minimum and maximum execution times. The minimum execution time is obtained by running the experiment on the fastest VM and the maximum on the slowest VM; half of the difference between the two gives an interval size, and adding this interval size to the minimum execution time yields the deadline.
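In other words, the deadline sits halfway between the two extremes:

$D = T_{\min} + \frac{T_{\max} - T_{\min}}{2}$

where $T_{\min}$ and $T_{\max}$ are the execution times on the fastest and slowest VM respectively. For illustration, made-up values of $T_{\min} = 400$ s and $T_{\max} = 1000$ s give $D = 700$ s.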
3.3 Performance Metrics
We used execution time, makespan, and execution cost as performance metrics to evaluate the algorithms. The makespan [14] is the total elapsed time needed to finish the execution of the whole BoT or workflow [15]; equivalently, it is the completion time of the final task in the task set or workflow [16]. Furthermore, the makespan should not exceed the deadline. The cost of running a task $t_i$ on a VM of type $VM_v$ is calculated as

$C_{i,v} = PT_{i,v} \times C_v$

where $PT_{i,v}$ is the estimated processing time of task $t_i$ on $VM_v$ and $C_v$ is the cost of $VM_v$ per time interval [17].
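Below is a minimal sketch of how these two metrics can be computed from a completed schedule, assuming per-started-hour billing of leased VMs as in Section 3.2; the task times and the price are made-up values:

```python
import math

BILLING_INTERVAL = 3600.0  # seconds, matching the 1-hour billing interval

def makespan(finish_times):
    """Completion time of the last task in the BoT or workflow."""
    return max(finish_times.values())

def execution_cost(vm_usage, price_per_interval):
    """Each VM is billed for every started interval it is leased."""
    cost = 0.0
    for vm, (start, end) in vm_usage.items():
        intervals = math.ceil((end - start) / BILLING_INTERVAL)
        cost += intervals * price_per_interval[vm]
    return cost

finish = {"t1": 1200.0, "t2": 3400.0, "t3": 5000.0}
usage = {"vm1": (0.0, 5000.0)}   # leased 0 s .. 5000 s -> 2 billed hours
price = {"vm1": 0.06}            # illustrative $/hour
print(makespan(finish), execution_cost(usage, price))  # 5000.0 0.12
```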
3.4 Results and Analysis

Figure 3 shows the total execution time of the algorithms for the small, medium, and large ranges of BoT applications. For all BoT ranges, IC-PCP takes more time than the other algorithms, as it schedules without considering the machines' MFLOPS, which increases waiting time. PSO takes less time than IC-PCP, as it selects a VM from each cluster and assigns the task to the first-best machine. The results make it clear that CSO is the best algorithm in terms of execution time.

Figure 4 shows the percentage of deadlines met for the Montage and CyberShake workflows for all algorithms. The figure shows that the SCS, PSO, and CSO algorithms fulfill the deadline criteria. Figures 5 and 6 show the average makespan of the algorithms for the Montage and CyberShake workflows; the horizontal solid line in the figures marks the workflow deadline. For the Montage workflow, the IC-PCP algorithm falls above the deadline and therefore fails to fulfill the deadline criteria, while the other algorithms lie below the deadline with different makespans. For the CyberShake workflow, all algorithms lie below the deadline with varying makespans. The makespan varies because of the dynamic nature of CPU speed, resource utilization, and similar factors. For both workflows, the CSO algorithm gives better results than the other compared algorithms in terms of average makespan.

Figure 5. Average Makespan of Montage Workflow.

Figure 6. Average Makespan of CyberShake Workflow.

For cost optimization, we used the Amazon EC2 instance pricing model, i.e., prices in dollars. Figure 7 shows the total execution cost of the algorithms for the Montage and CyberShake workflows. The results show that the IC-PCP algorithm incurs the highest execution cost for both workflows. All other algorithms execute the workflows within the given deadline interval and incur similar execution costs, with the CSO algorithm giving a slightly lower execution cost than the SCS and PSO algorithms.

Figure 7. Total Execution Cost of Montage and CyberShake Workflows.

4. CONCLUSIONS
In this paper, we discussed and compared the performance of two heuristic and two meta-heuristic deadline-constrained scheduling algorithms for cloud computing environments: SCS and IC-PCP as heuristic algorithms, and PSO and CSO as meta-heuristic algorithms. We performed experiments in which the algorithms aimed to minimize makespan and execution cost while meeting deadline constraints, on BoT applications and on two real-world workflow applications, Montage and CyberShake. For BoT applications, CSO takes the least execution time while IC-PCP takes the most. For the Montage workflow, IC-PCP fails to meet the deadline criteria and performs worst in terms of average makespan. For both the Montage and CyberShake workflows, CSO provides a better average makespan than the other algorithms. In terms of cost optimization, SCS, PSO, and CSO perform nearly equally. Overall, CSO performs better than the other algorithms for both BoT and workflow applications. In future work, the performance of deadline-constrained scheduling algorithms can be evaluated and compared on real-world application graphs in real cloud computing environments.
5. ACKNOWLEDGMENTS

The authors gratefully acknowledge Mr. Bharat Tewari, an alumnus of the 1993 batch of the Department of Computer Science & Engineering, Indian Institute of Technology (BHU), Varanasi, India, for providing travel support.

6. REFERENCES

[1] Rajkumar Buyya, Chee Shin Yeo, Srikumar Venugopal, James Broberg, and Ivona Brandic. 2009. Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility. Future Generation Computer Systems 25, 6 (2009), 599–616.
[2] Peter Mell, Tim Grance, et al. 2011. The NIST definition of cloud computing. (2011).
[3] Saeid Abrishami, Mahmoud Naghibzadeh, and Dick H.J. Epema. 2013. Deadline-constrained workflow scheduling algorithms for infrastructure as a service clouds. Future Generation Computer Systems 29, 1 (2013), 158–169.
[4] Ming Mao and Marty Humphrey. 2011. Auto-scaling to minimize cost and meet application deadlines in cloud workflows. In Proceedings of the 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC). IEEE, 1–12.
[5] Maria Alejandra Rodriguez and Rajkumar Buyya. 2014. Deadline based resource provisioning and scheduling algorithm for scientific workflows on clouds. IEEE Transactions on Cloud Computing 2, 2 (2014), 222–235.
[6] Shu-Chuan Chu, Pei-Wei Tsai, and Jeng-Shyang Pan. 2006. Cat swarm optimization. In Pacific Rim International Conference on Artificial Intelligence. Springer, 854–858.
[7] Saurabh Bilgaiyan, Santwana Sagnika, and Madhabananda Das. 2014. Workflow scheduling in cloud computing environment using cat swarm optimization. In 2014 IEEE International Advance Computing Conference (IACC). IEEE, 680–685.
[8] Jia Yu, Rajkumar Buyya, and Chen Khong Tham. 2005. Cost-based scheduling of scientific workflow applications on utility grids. In First International Conference on e-Science and Grid Computing. IEEE, 8 pp.
[9] Rodrigo N. Calheiros, Rajiv Ranjan, Anton Beloglazov, Cesar A.F. De Rose, and Rajkumar Buyya. 2011. CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms. Software: Practice and Experience 41, 1 (2011), 23–50.
[10] Ewa Deelman, Gurmeet Singh, Mei-Hui Su, James Blythe, Yolanda Gil, Carl Kesselman, Gaurang Mehta, Karan Vahi, G. Bruce Berriman, John Good, et al. 2005. Pegasus: A framework for mapping complex scientific workflows onto distributed systems. Scientific Programming 13, 3 (2005), 219–237.
[11] Gideon Juve, Ann Chervenak, Ewa Deelman, Shishir Bharathi, Gaurang Mehta, and Karan Vahi. 2013. Characterizing and profiling scientific workflows. Future Generation Computer Systems 29, 3 (2013), 682–692.
[12] Simon Ostermann, Alexandru Iosup, Nezih Yigitbasi, Radu Prodan, Thomas Fahringer, and Dick Epema. 2009. A performance analysis of EC2 cloud computing services for scientific computing. In International Conference on Cloud Computing. Springer, 115–131.
[13] Jörg Schad, Jens Dittrich, and Jorge-Arnulfo Quiané-Ruiz. 2010. Runtime measurements in the cloud: observing, analyzing, and reducing variance. Proceedings of the VLDB Endowment 3, 1-2 (2010), 460–471.
[14] Ashish Kumar Maurya and Anil Kumar Tripathi. 2018. On benchmarking task scheduling algorithms for heterogeneous computing systems. The Journal of Supercomputing (2018).
[15] Ashish Kumar Maurya and Anil Kumar Tripathi. 2018. An edge priority-based clustering algorithm for multiprocessor environments. Concurrency and Computation: Practice and Experience (2018).
[16] Ashish Kumar Maurya and Anil Kumar Tripathi. 2017. Performance comparison of HEFT, Lookahead, CEFT and PEFT scheduling algorithms for heterogeneous computing systems. In Proceedings of the Seventh International Conference on Computer and Communication Technology (ICCCT'17). ACM, 128–132.
[17] Jyoti Sahni and Deo Vidyarthi. 2015. A cost-effective deadline-constrained dynamic scheduling algorithm for scientific workflows in a cloud environment. IEEE Transactions on Cloud Computing (2015).