The 4th International Conference on Computer Science and Computational Mathematics (ICCSCM 2015)
Bin Packing Multi-Constraints Job Scheduling Heuristic for Heterogeneous Volunteer Grid Resources Saddaf Rubab1, Mohd Fadzil Hassan2, Ahmad Kamil Mahmood3, Nasir Mehmood Shah4 1,2,3,4
Department of Computer and Information Sciences, University Teknologi PETRONAS, Bandar Seri Iskandar, 31750, Tronoh, Perak, Malaysia
[email protected],
[email protected],
[email protected],
[email protected]
Abstract: In volunteer grid, resources contributed are mostly heterogeneous .i.e., the characteristics of resources, time availability etc. The jobs submitted to volunteer grid, may require multiple heterogeneous resources and one job can be divided into number of tasks to fulfill job requirements which demands multiresource scheduling policy to consider different constraints of resource and job before scheduling. Whereas, traditional scheduling policies; contemplate mostly single resource and one optimization criteria for resource usage. Therefore, it is required to have a scheduling policy that fulfills multiple resource constraints to optimize resource usage and schedule jobs efficiently. The paper discusses the proposed heuristic for multi-constraint job scheduling in volunteer grid environment. Bin-packing problem is employed to reorder the job queues to improve the scheduling. To evaluate the performance of proposed heuristic, waiting time, turnaround time, slowdown time and job failure rate are calculated and validated the performance improvement in comparison to other scheduling algorithms practiced in grid. Keywords: Volunteer grid computing, volunteer resources, binpacking, multi-constraints, back-filling
1. Introduction Volunteer grid computing environment has resources which are heterogeneous and distributed in nature, volunteered to help serve scientific research and high computations [1, 2]. These volunteered resources are unpredictable and cannot guarantee the availability till completion of running job. Managing and efficient utilization of these resources is one of the major research issues. This can be attempted to solve by scheduling submitted jobs in a way that maximum resources are utilized and jobs are completed within the deadline specified. There are many traditional job scheduling algorithms and policies which can be incorporated in volunteer grid environment such that First Come First Serve (FCFS) Shortest Process Next (SPN), Longest Job First (LJF), Round Robin (RR) [3-5] etc., but these all lacking the ability to consider multiple constraints. For example, if the submitted jobs are scheduled using FCFS, the job arriving first will be scheduled first and will continue scheduling jobs in order of arrival time until all jobs are submitted or no more available resources. But if a large job arrived first and
640
demands multiple resources, it will block most of the resources and delays the next arriving jobs. This will affect the resource usage ratio and most of resources will be underutilized. The solution to such a scheduling problem is back-filling [6], which takes out the job from running queue and submits to waiting queue. Back-filling can be used if a large job is assigned to multiple resources and it is blocking resources or delaying the smaller jobs arriving later; which are potential to make maximum use of unassigned resources. While studying the multiple constraints of a resource, it is possible to have one over-loaded resource whereas the remaining available resources are starving for jobs even though back-filling mechanism is employed in job scheduling. The reason of blocking here is the nature of traditional scheduling algorithms and policies that most of them are not considering multiple constraints of resource and only focus on single constraints like CPU cycles required, time of availability. Again taking FCFS algorithm as an illustration, it only selects jobs based on arrival times and not considering multiple resource constraints. If multiple resource constraints are considered before assignment of a job, it is possible to find a better match of job and resource which in turn will affect overall performance. If a job submitted to a system like volunteer grid, requests for a matching resource by providing multiple resource requirements required to complete submitted job, the resources can be more efficiently used. Thus, the resources requested are selected based on multiple resource constraints can improve resource utilization and job completion efficiently. This gives a research motivation to implement multi-constraint resource scheduling in volunteer grid where resources are heterogeneous and unpredictable. The paper proposed a multi-constraint job scheduling heuristic in volunteer grid environment; with the ability to assign a resource which matches multiple constraints required to be fulfilled for a job completion. A sub-queue mechanism is implemented within a resource to reorder the jobs in order to improve the scheduling. The outline of this paper is as follows. The overview of related literature is explained in section 2, which is followed
The 4th International Conference on Computer Science and Computational Mathematics (ICCSCM 2015)
by section 3 discussing bin-packing problem. The proposed heuristic is discussed in section 4. The simulation results and comparisons are presented in section 5 and scope of work is concluded in section 6.
2. Related Literature This section will review the brief literature on single and multi-constraint scheduling along with the bin-packing policies. Job scheduling in grid computing has been extensively researched in past. Although, the past research mainly focused on single resource constraint scheduling. The work reported in [7], introduce a multiconstraint preemptive scheduling algorithm which optimizes multiple metrics like turnaround time and slowdown time. The algorithm is based on theoretical analysis of job trigger conditions. The algorithm is proposed for parallel jobs. Branch and bound approach was used in a shared memory system [8], to select jobs which fits in the available memory and later will allocate jobs to those CPUs. It also considers the load of the system. The aim was to utilize minimum resources to complete a job, whereas later the findings proved to authors that if more resources are utilized the performance is better. In [9], grid resource selection heuristics are proposed that are deterministic and probabilistic in nature. The authors reported to reduce the turnaround time of jobs. Scheduling jobs in advance and rescheduling are also methods to reduce the job waiting time and increasing the system optimization. Few of the scheduling heuristics and models are discussed briefly in [10]. An advance scheduling technique is proposed in [11], that reserve the resources for jobs based on the network. It presented a technique to use idle resource cycles using red-black trees which considers network as first level of resource. Job scheduling algorithms have also used binpacking policies. An algorithm proposed Multi-Capacity Bin Packing (MCBP) by [12] is used in [13] to schedule jobs for single homogeneous cluster. In [13], the resource load is shared among virtual machines and find heuristics to optimize the scheduling metrics. First Fit Decreasing (FFD), is an extension of FF, and claimed better execution times if incorporated in job scheduling [14]. FFD is a bin-centric method; pack the items that maximize the product of remaining all capacities.
3. Bin Packing Problem The packing of items in bin which should not exceed some maximum value set like total weight, volume etc. and also to minimize the number of bins used to pack all the items, is known as bin-packing problem [15, 16]. Before proceeding to the details of proposed heuristic supported by bin-packing; the concept of bin-packing problem need to be highlighted, along with the job scheduling perspective.
641
Suppose there are ‘݊’ bins, ܤ representing ‘݊’ number of available resources ‘ܴ’ in volunteer grid. ܤൌ ሼܤଵ ǡ ܤଶ ǡ ǥ ǡ ܤ ሽ ܴ ൌ ሼܴଵ ǡ ܴଶ ǡ ǥ ǡ ܴ ሽ Each of the bin is also having a volume ‘ܸ’ associated to it that is allowed to be filled ܸ ൌ ሼܸଵ ǡ ܸଶ ǡ ǥ ǡ ܸ ሽ
Where ‘ܸ’ represents volume of each bin available and from job scheduling perspective it represents the CPU cycles available at each resource. The ‘݊’ jobs submitted to volunteer grid are represented by ‘J’. ܬൌ ሼܬଵ ǡ ܬଶ ǡ ǥ ǡ ܬ ሽ A job ‘ܬ ’can only be packed to bin ‘ܤ ’, if and only if, its volume ‘ܸ ’ is greater than or equal to the sum of ‘ܬ ’ and already packed jobs in ‘ܤ ’. ܤ ܬ ܸ
In the interpretation of job scheduling, the job ‘ܬ ’ can only be submitted to resource ‘ܴ ’, if and only if, it can complete the job ‘ܬ ’ and already submitted jobs to ‘ܴ ’. The total of all the jobs submitted to resource ‘ܴ ’must be equal or less than the total CPU time available ‘ܸ ’at that resource. ܴ ܬ ܸ
There are a few practiced algorithms like [17, 18] Next Fit (NF), First Fit (FF) and Best Fit (BF) to solve the bin packing problem; have already been incorporated in job scheduling. In NF, a job will be submitted to current resource available if it fits and if the resource not available, NF will try to submit the job to next resource. In FF, all the jobs are arranged in increasing order of required CPU time and assign each job to the resource job fits first. Whereas in BF, will select the best of all the available resources where a submitted job fits best.
4. Bin Packing Multi-Constraint Scheduling Heuristic In bin-packing problem, it is supposed that all items to be packed in bins should arrive before selection of bins and packing of items start. In proposed heuristic resources are representing bins and jobs as items. There is a supposition that all jobs must arrive before assigning jobs to resources. In bin-packing, the goal is to use minimum number of bins, where as in proposed heuristic it is aimed to use less number of cycles for assigning jobs to matching resources. In the basic bin-packing policy i.e., FF, NF or BF, the resource constraints are not considered like available CPU time, resource availability time etc. Therefore, giving attention to job requirements and ignoring resource constraints will lead to single criteria of scheduling job. Hence, in proposed heuristic not only the job requirements but also the resource constraints are studied before scheduling a job which leads to multi-constraint job
The 4th International Conference on Computer Science and Computational Mathematics (ICCSCM 2015)
ܴ ൌ ሼܴଵ ǡ ܴଶ ǡ ܴଷ ǡ ܴସ ǡ ܴହ ǡ ܴ ሽ ܸ ൌ ሼܸଵ ǡ ܸǡ ܸଷ ǡ ܸସ ǡ ܸହ ǡ ܸ ሽ
of ܸ.
The set of resources ܴ is arranged in decreasing order ܸଵ ൏ ܸଶ ൏ ܸଷ ൏ ܸସ ൏ ܸହ ൏ ܸ
ܴଵ ൏ ܴଶ ൏ ܴଷ ൏ ܴସ ൏ ܴହ ൏ ܴ
There are five (05) jobs submitted to volunteer grid that can be assigned to matching resources from set ܴ. ܬൌ ሼܬଵ ǡ ܬଶ ǡ ܬଷ ǡ ܬସ ǡ ܬହ ሽ The jobs are arranged in increasing order of their required CPU time and the arrival time. Suppose ܬଵ ܬଶ ܬଷ ܬସ ܬହ
A job ܬ will be assigned to the resource that can allocate the required CPU time such that the total of all the jobs submitted to resource ‘ܴ ’must be equal or less than the total CPU time available ‘ܸ ’at that resource. ܴ ܬ ܸ
If there are multiple resources satisfying this condition, job ܬ will be submitted to the resource will less remaining CPU time available. This will reduce scheduling completion and number of cycles for scheduling which in turn will decrease the overhead and slowdown time. There is a possibility that multiple jobs are assigned to one resource. In the second stage of proposed heuristic, the jobs within a resource will be organized in increasing order of the CPU time required by each job assigned to resource ܴ to process the smallest job first but keeping in view which job starting time. Let’s supposeܬଵ , ܬଶ and ܬଷ are assigned to ܴଵ where: ܬଵ ܬଷ ܬଶ according to CPU time required and ܬଶ ܬଵ ܬଷ according to start time The jobs in ܴଵ will then be processed starting with ܬଶ followed by ܬଵ and ܬଷ at end.
5. Results
The simulation of proposed heuristic is performed in volunteer grid environment to evaluate the job scheduling by comparing the scheduling performance with baseline job
642
scheduling heuristics in practice including First Come First Serve (FCFS) Shortest Process Next (SPN), Longest Job First (LJF), Round Robin (RR) [3-5]. The volunteer grid environment available at High Performance Computing Center (HPCC) at Universiti Teknologi PETRONAS, Malaysia has been used to have already volunteered resources set-up for experimental purposes. These resources are heterogeneous and their availability time is also varying. The test run limits to ten (10) resources for performance evaluation in a real volunteer grid environment. Jobs dataset has been used to evaluate and compare the performance of proposed heuristic. The job dataset of 500 jobs is synthesized using Monte Carlo method, comprising the CPU time required, start time and deadline of job whereas arrival time of all jobs is supposed be the same. To study the simulation of proposed heuristic in volunteer grid environment, following job scheduling performance metrics are calculated: · Average waiting time: the average waiting time of a job before its execution.
· · ·
Average turnaround time: the average time between a job submission and completion time. Average slowdown time: the average ratio of response time to run time of a job Jobs failure rate: the amount of jobs not completed within the deadline
The proposed heuristic outperforms in all the performance metrics selected. The average waiting time of baseline scheduling algorithms and proposed heuristic is presented in Figure 1. The average waiting time of proposed and SPN is very close whereas it is comparable to the other three baseline algorithms. 3000 2500 Time (sec)
scheduling and optimizing resource utilization in volunteer grid environment. If a resource ܴ has least available volume/available CPU time that can be utilized, search a job ܬ which fits inܴ . This will reduce the scheduling cycles and will allow the next job assignment to other resource and will help skip ܴ for any resource selection until the current batch of jobs finished. All the resources can have different volumes/available CPU time. Resources will firstly arrange in decreasing order of their available CPU timeܸ. Suppose there are six (06) volunteer resources and their respective CPU time:
2000 1500 1000 500 0
Figure 1: Average Waiting Time The performance of proposed heuristic is improved in terms of average turnaround time as well Figure 2. The reason for this can be the less waiting time for starting the job execution.
The 4th International Conference on Computer Science and Computational Mathematics (ICCSCM 2015)
The optimized solution of resource utilization is also possible by implementing multi-constraint job scheduling heuristic. The proposed heuristic attempts to use the maximum of available resources in volunteer grid environment. The re-ordering of jobs in a resource emphasized more on reducing the average waiting and turnaround time of jobs.
1000 Time (sec)
800 600 400 200
number of jobs failed
0
Figure 2: Average Turnaround Time The jobs start without much delay in the starting time and get quick response using proposed heuristic. This helps to decrease the slowdown time of jobs. The slowdown time of LJF is increased as longer jobs are entertained first. The slowdown time of proposed heuristic is comparable to RR as the response time and average waiting time of RR is also less.
Figure 4: Job Failure Rate By extensive simulation on volunteer resources using synthetic job dataset, authors have verified that multiconstraint scheduling address the improvement in scheduling performance metrics. The proposed heuristic is also proved to complete a maximum of jobs assigned in comparison to the baseline scheduling algorithms. The proposed work will be incorporated to test the real workload job datasets in a real volunteer grid environment comprising of large number of resources.
500 400 Time (sec)
30 25 20 15 10 5 0
300 200 100 0
Figure 3 : Average Slowdown Time In the job dataset, deadline of job is also included to check whether jobs are completing within the deadline specified and not lagging. The comparison of proposed heuristic and baseline scheduling algorithms verified that the proposed heuristic completes maximum number of jobs within the deadline Figure 4. The less number of jobs are delayed using proposed heuristic whereas the more jobs are delayed is scheduling jobs using LJF. The delaying jobs or not completing jobs within the deadline can occur because of the unpredictable resources in volunteer grid environment and not having enough free resources to complete the jobs.
Acknowledgement Authors appreciatively acknowledge the High Performance Computing Center (HPCC) at Universiti Teknologi PETRONAS (UTP), Malaysia for providing the volunteer grid computing environment.
References [1]
[2]
[3]
6. Conclusions The traditional scheduling algorithms are based on single constraint to select a job for scheduling. The scheduling algorithm if consider multi-constraints can improve the job scheduling. The paper proposed a multi-constraint job scheduling heuristic for volunteer grid resources, which are unpredictable and distributed in nature.
643
[4]
[5] [6]
M. Nouman Durrani and J. A. Shamsi, "Volunteer computing: requirements, challenges, and solutions," Journal of Network and Computer Applications, vol. 39, pp. 369-380, 2014. K. Watanabe, M. Fukushi, and M. Kameyama, "Adaptive group-based job scheduling for high performance and reliable volunteer computing," Journal of Information Processing, vol. 19, pp. 3951, 2011. M. Milenkovič, Operating Systems: Concepts and Design: McGraw-Hill, 1987. H. Li and R. Buyya, "Model-driven simulation of grid scheduling strategies," in e-Science and Grid Computing, IEEE International Conference on, 2007, pp. 287-294. W. Stallings, Operating Systems: Prentice Hall, 1995. S. Srinivasan, R. Kettimuthu, V. Subramani, and P. Sadayappan, "Selective reservation strategies for
The 4th International Conference on Computer Science and Computational Mathematics (ICCSCM 2015)
[7]
[8]
[9]
[10]
[11]
[12]
backfill job scheduling," in Job Scheduling Strategies for Parallel Processing, 2002, pp. 55-71. L. Hongbing, W. Linping, and W. Wei, "A multiconstraint preemption algorithm for parallel job scheduling," in Parallel and Distributed Computing, Applications and Technologies (PDCAT), 2011 12th International Conference on, 2011, pp. 73-78. E. W. Parsons and K. C. Sevcik, "Coordinated allocation of memory and processors in multiprocessors," in ACM SIGMETRICS Performance Evaluation Review, 1996, pp. 57-67. C.-M. Wang, H.-M. Chen, C.-C. Hsu, and J. Lee, "Dynamic resource selection heuristics for a nonreserved bidding-based Grid environment," Future Generation Computer Systems, vol. 26, pp. 183197, 2// 2010. F. Xhafa and A. Abraham, "Computational models and heuristic methods for Grid scheduling problems," Future generation computer systems, vol. 26, pp. 608-621, 2010. L. Tomás, A. Caminero, B. Caminero, and C. Carrión, "Using network information to perform meta-scheduling in advance in Grids," in Euro-Par 2010-Parallel Processing, ed: Springer, 2010, pp. 431-443. W. Leinberger, G. Karypis, and V. Kumar, "Multicapacity bin packing algorithms with applications to
644
[13]
[14]
[15]
[16]
[17]
[18]
job scheduling under multiple constraints," in Parallel Processing, 1999. Proceedings. 1999 International Conference on, 1999, pp. 404-412. M. Stillwell, D. Schanzenbach, F. Vivien, and H. Casanova, "Resource allocation algorithms for virtualized service hosting platforms," Journal of Parallel and Distributed Computing, vol. 70, pp. 962-974, 2010. R. Panigrahy, K. Talwar, L. Uyeda, and U. Wieder, "Heuristics for vector bin packing," research. microsoft. com, 2011. J. Coffman, Edward G, M. R. Garey, and D. S. Johnson, "Dynamic bin packing," SIAM Journal on Computing, vol. 12, pp. 227-258, 1983. W. Leinberger, G. Karypis, and V. Kumar, "Job scheduling in the presence of multiple resource requirements," in Proceedings of the 1999 ACM/IEEE conference on Supercomputing, 1999, p. 47. E. G. Coffman Jr, M. R. Garey, and D. S. Johnson, "Approximation algorithms for bin packing: a survey," in Approximation algorithms for NP-hard problems, 1996, pp. 46-93. A. Cervin, J. Eker, B. Bernhardsson, and K.-E. Årzén, "Feedback–feedforward scheduling of control tasks," Real-Time Systems, vol. 23, pp. 2553, 2002.