A Method for Scheduling Heterogeneous Multi-Installment Systems

Amin Shokripour, Mohamed Othman, Hamidah Ibrahim, and Shamala Subramaniam
Department of Communication Technology and Network, Universiti Putra Malaysia, 43400 UPM, Serdang, Selangor D.E., Malaysia
[email protected],
[email protected]
Abstract. During the last decade, the use of parallel and distributed systems has become more common. Dividing data is one of the challenges in this type of system. Divisible Load Theory (DLT) is one of the proposed methods for scheduling data distribution in parallel or distributed systems. Much research has been done in this field, but scheduling a multi-installment heterogeneous system in which the communication mode is blocking has not been addressed. In this paper, we present closed-form formulas for the different steps of scheduling jobs in this type of system. The results of our experiments show that the proposed method gives better performance than Hsu et al.'s method.
1 Introduction
Different models have been investigated in DLT research [1,2], each of which is built on certain assumptions. One-installment systems with blocking and non-blocking communication modes [3], multi-installment systems with non-blocking communication mode [4], systems with different processor available times (SDPAT) [5], non-dedicated systems [6], and others are some examples of the investigated models. However, no closed-form formula has yet been presented for a multi-installment system with blocking-mode communication. In this paper, we schedule jobs in a multi-installment system with high communication speed. The proposed method covers all four steps of scheduling a multi-installment system. The remainder of this paper is organized as follows. Section 2 presents related works. Our model and notations are introduced in the third section. In Section 4, we present the proposed method, which includes closed-form formulas for finding the proper number of processors, finding the proper number of installments, scheduling the internal installments, and scheduling the last installment. The results of the experiments and their analysis appear in Section 5. The last section presents the conclusion.
The author is also an associate researcher at the Lab of Computational Science and Informatics, Institute of Mathematical Research (INSPEM), Universiti Putra Malaysia.
N.T. Nguyen, C.-G. Kim, and A. Janiak (Eds.): ACIIDS 2011, LNAI 6592, pp. 31–41, 2011. c Springer-Verlag Berlin Heidelberg 2011
2 Related Works
One of the first studies to present a closed-form formula for scheduling multi-installment systems was done by Yang et al. [7]. In that research, the sizes of the installments were not equal, and non-blocking communication mode was used. The authors presented closed-form formulas for scheduling jobs in homogeneous and heterogeneous multi-installment systems. After this research, several articles were published that improved this method with different techniques, such as parallel communication [8], resizing the chunk size of each installment with respect to the negative effect of performance prediction errors at the end of execution [7], and attending to other system parameters [9]. Hsu et al. presented a new idea for multi-installment job scheduling in a heterogeneous environment in [10]. Their article includes two algorithms, SCR and ESCR; ESCR is an extension of SCR, which has some problems in scheduling the last installment. They did not account for communication and computation overheads, and the size of each installment is independent of the job size. Therefore, for large job sizes, the increased number of installments causes problems for systems with overhead.
3 Preliminaries
Throughout this paper, the notations defined in Table 1 are used. In this research, we use a client-server network topology in which all processors are connected to a root called P0. The root does not do any computation; it only schedules tasks and distributes chunks among the workers. The communication type is blocking mode, which means communication and computation cannot be overlapped. The model consists of a heterogeneous environment that includes communication and computation overheads.

Table 1. Notations

Notation  Description
W         Total size of data
V         Size of each installment
n         Number of installments
m         Number of processors
αi        The size of the fraction allocated to processor Pi in each internal installment
βi        The size of the fraction allocated to processor Pi in the last installment
wi        Ratio of the time taken by processor Pi to compute a given load to the time taken by a standard processor to compute the same load
zi        Ratio of the time taken by link li to communicate a given load to the time taken by a standard link to communicate the same load
si        Computation overhead for processor Pi
oi        Communication overhead for processor Pi
T(αi)     The required time for transferring and computing data in processor Pi
4 Proposed Method
In this research, we assume that all installments have equal size, which we call V. We claim that the calculated proper number of installments is optimal; hence we cannot change the size of the installments. If we increase the installment size, the idle time of each processor between installments grows: the total time for transferring data to the preceding processors increases, while this processor's computation time stays fixed and finishes before the preceding processors' transfers do. If we decrease the installment size, the total transfer time to the preceding processors shrinks, and when the root should send data to the current processor, that processor is still busy, because the computation time of the previous installment is larger than the total communication time of the other processors.

4.1 Internal Installments Scheduling
In each internal installment, we want the total time taken for communication and computation to be equal for all processors (Fig. 1). In other words, we determine the size of the task for each processor so that the total communication and computation time of each processor equals the total communication time of all the other processors. For example, when P1 finishes its computation, the root is ready to send it the data of the next installment. As a result,

α1 V (z1 + w1) + o1 + s1 = α2 V (z2 + w2) + o2 + s2 = ... = αm V (zm + wm) + om + sm    (1)

α2 = [α1 V (z1 + w1) + o1 + s1 − o2 − s2] / [V (z2 + w2)]    (2)

Fig. 1. Time Diagram for a Multi-Installment System
We know that α1 + α2 + ... + αm = 1. Hence, we have

α1 + [α1 V (z1 + w1) + o1 + s1 − o2 − s2] / [V (z2 + w2)] + [α1 V (z1 + w1) + o1 + s1 − o3 − s3] / [V (z3 + w3)] + ... + [α1 V (z1 + w1) + o1 + s1 − om − sm] / [V (zm + wm)] = 1    (3)

Two new variables are defined, Δi = (z1 + w1)/(zi + wi) and Φi = (o1 + s1 − oi − si)/(zi + wi). Now Eq.(3) can be rewritten as

α1 (1 + Σ_{i=2}^{m} Δi) + (1/V) Σ_{i=2}^{m} Φi = 1    (4)

Finally, we find these closed-form formulas:

α1 = [1 − (1/V) Σ_{i=2}^{m} Φi] / [1 + Σ_{i=2}^{m} Δi]
αi = α1 Δi + (1/V) Φi,  i = 2, ..., m    (5)
4.2 Last Installment Scheduling
The last installment in Fig. 1 shows the time diagram for a blocking-mode communication system in which all the processors finish their tasks at the same time. Hence, we have

T(β1) = o1 + z1 β1 V + s1 + w1 β1 V    (6)

T(βi) = Σ_{j=1}^{i−1} (oj + zj βj V) + oi + zi βi V + si + wi βi V    (7)

The processing time is the same for all processors, T(βi) = T(βi+1); hence

si + wi βi V = oi+1 + zi+1 βi+1 V + si+1 + wi+1 βi+1 V    (8)

βi+1 = [wi / (wi+1 + zi+1)] βi + (1/V) [si − (si+1 + oi+1)] / (wi+1 + zi+1)    (9)

Again, we define two new symbols, δi+1 = [si − (si+1 + oi+1)]/(wi+1 + zi+1) and εi+1 = wi/(wi+1 + zi+1). Eq.(9) is rewritten as

βi+1 = (1/V) δi+1 + εi+1 βi    (10)

We assume δi ≥ 0. This assumption is true when the job is large enough that all the processors participate in it. After solving Eq.(10), we have Ei = Π_{j=2}^{i} εj and Γi = Σ_{j=2}^{i} (δj Π_{k=j+1}^{i} εk). We know that Σ_{i=1}^{m} βi = 1. Therefore,

βi = Ei β1 + (1/V) Γi,  i = 2, ..., m
β1 = [1 − (1/V) Σ_{i=2}^{m} Γi] / [1 + Σ_{i=2}^{m} Ei]    (11)
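The recursion in Eq.(10) and the normalization in Eq.(11) can likewise be sketched in Python (names are illustrative; the products Ei and sums Γi are accumulated incrementally rather than recomputed):

```python
def last_fractions(w, z, s, o, V):
    """Sketch of Eq.(10)-(11): fractions beta_1..beta_m for the last installment.

    w, z, s, o are length-m lists of the Table 1 parameters, with processor
    P1 at index 0; V is the size of the last installment.
    """
    m = len(w)
    E, G = [], []        # E_i and Gamma_i for i = 2..m
    Ei, Gi = 1.0, 0.0    # E_1 = 1, Gamma_1 = 0
    for i in range(1, m):
        eps = w[i - 1] / (w[i] + z[i])                    # epsilon_{i+1}
        dlt = (s[i - 1] - (s[i] + o[i])) / (w[i] + z[i])  # delta_{i+1}
        Ei *= eps              # running product of the epsilons
        Gi = Gi * eps + dlt    # folds each delta into the running product
        E.append(Ei)
        G.append(Gi)
    b1 = (1 - sum(G) / V) / (1 + sum(E))
    return [b1] + [e * b1 + g / V for e, g in zip(E, G)]
```

The returned fractions sum to one, and consecutive processors satisfy the equal-finish condition of Eq.(8).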
By using Eq.(11), we can easily schedule a task in a heterogeneous environment with blocking-mode communication that includes communication and computation overheads.

4.3 The Proper Number of Processors
The number of processors is important for a good schedule. If the number of processors increases, each processor's waiting time before receiving a task increases. As mentioned, we should decrease the idle time of each processor. We should therefore try to make the time for computing a task on one processor equal to the total time taken for transferring data to the others. Since we have a computation-based system, we should not delay computation, but we can tolerate waiting time before transferring data. Therefore, we have

α1 V w1 + s1 ≥ Σ_{j=2}^{m} (αj V zj + oj)    (12)

This means that the computation time of the first processor should be larger than or equal to the sum of all the other processors' communication times in each installment. The best state is when they are equal, because then the system does not have any idle time due to computation. In this inequality, the unknown parameter is m, the proper number of processors. With Eq.(5), we rewrite Eq.(12) as

α1 V w1 + s1 ≥ Σ_{j=2}^{m} [(α1 Δj + (1/V) Φj) V zj + oj]    (13)
Rearranging gives

α1 V (w1 − Σ_{j=2}^{m} Δj zj) ≥ Σ_{j=2}^{m} (Φj zj + oj) − s1    (14)

We replace α1 with its expression in Eq.(5):

V (w1 − Σ_{j=2}^{m} Δj zj) ≥ (1 + Σ_{i=2}^{m} Δi) [Σ_{j=2}^{m} (Φj zj + oj) − s1] + Σ_{i=2}^{m} Φi (w1 − Σ_{j=2}^{m} Δj zj)    (15)

The proper number of installments is called n + 1. Since we have stated that all installments have the same size, the task size for each internal installment is V = W/(n + 1). Thus,

[W/(n + 1)] (w1 − Σ_{j=2}^{m} Δj zj) ≥ (1 + Σ_{i=2}^{m} Δi) [Σ_{j=2}^{m} (Φj zj + oj) − s1] + Σ_{i=2}^{m} Φi (w1 − Σ_{j=2}^{m} Δj zj)    (16)

W ≥ [ (1 + Σ_{i=2}^{m} Δi) (Σ_{j=2}^{m} (Φj zj + oj) − s1) / (w1 − Σ_{j=2}^{m} Δj zj) + Σ_{i=2}^{m} Φi ] (n + 1)    (17)

From Eq.(17), we find that the proper number of processors is related to the number of installments. Therefore, before solving Eq.(17), we should calculate the proper number of installments. In the remainder of this paper, we use m to denote the proper number of processors.
4.4 The Proper Number of Installments
Finding the proper number of installments is one of the main parts of scheduling heterogeneous multi-installment systems. With reference to Fig. 1, using Eq.(5) and Eq.(11), we arrive at Eq.(18) for the response time of the job, T(W):

T(W) = n (α1 V (w1 + z1) + o1 + s1) + β1 V (w1 + z1) + o1 + s1    (18)

We replace α1 and β1 with their values from Eq.(5) and Eq.(11), respectively:

T(W) = n [(V − Σ_{i=2}^{m} Φi) / (1 + Σ_{i=2}^{m} Δi)] (z1 + w1) + n (o1 + s1) + [(V − Σ_{i=2}^{m} Γi) / (1 + Σ_{i=2}^{m} Ei)] (z1 + w1) + (o1 + s1)    (19)

Therefore, with reference to V = W/(n + 1), we rewrite Eq.(19) as

T(W) = n [(W/(n + 1) − Σ_{i=2}^{m} Φi) / (1 + Σ_{i=2}^{m} Δi)] (z1 + w1) + n (o1 + s1) + [(W/(n + 1) − Σ_{i=2}^{m} Γi) / (1 + Σ_{i=2}^{m} Ei)] (z1 + w1) + (o1 + s1)    (20)

One of the known methods for finding the minimum or maximum of an equation is differentiation. We calculate the derivative of Eq.(20) with respect to n:

dT(W)/dn = −(z1 + w1) Σ_{i=2}^{m} Φi / (1 + Σ_{i=2}^{m} Δi) − W (z1 + w1) / [(n + 1)^2 (1 + Σ_{i=2}^{m} Ei)] + (o1 + s1) + [W (z1 + w1)(n + 1)(1 + Σ_{i=2}^{m} Δi) − n W (1 + Σ_{i=2}^{m} Δi)(z1 + w1)] / [(n + 1)^2 (1 + Σ_{i=2}^{m} Δi)^2]    (21)

Now we set the derivative equal to zero and solve for n:

(n + 1)^2 = W (z1 + w1) (Σ_{i=2}^{m} Ei − Σ_{i=2}^{m} Δi) / { (1 + Σ_{i=2}^{m} Ei) [(z1 + w1) Σ_{i=2}^{m} Φi − (1 + Σ_{i=2}^{m} Δi)(o1 + s1)] }    (22)

Using Eq.(22), the proper number of installments can easily be found; n is the number of internal installments and the additional 1 refers to the last installment. Combining Eq.(17) and Eq.(22), the proper number of processors can be found from

W ≥ [ (1 + Σ_{i=2}^{m} Δi) (Σ_{j=2}^{m} (Φj zj + oj) − s1) / (w1 − Σ_{j=2}^{m} Δj zj) + Σ_{i=2}^{m} Φi ] × √( W (z1 + w1) (Σ_{i=2}^{m} Ei − Σ_{i=2}^{m} Δi) / { (1 + Σ_{i=2}^{m} Ei) [(z1 + w1) Σ_{i=2}^{m} Φi − (1 + Σ_{i=2}^{m} Δi)(o1 + s1)] } )    (23)
To find the proper number of processors, m, we should calculate the right side of this inequality for increasing m, from the first to the last available processor. The first time the right side becomes larger than the total size of data (the job size), we select m − 1 as the proper number of processors and call it m.
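For a fixed set of m processors, the number of internal installments follows from Eq.(22). The sketch below uses our own naming; rounding the square root to the nearest integer is our assumption, since the paper does not state how a non-integer root is handled:

```python
import math

def proper_installments(W, w, z, s, o):
    """Sketch of Eq.(22): the number of internal installments n.

    W is the job size; w, z, s, o are length-m parameter lists with
    processor P1 at index 0.
    """
    m = len(w)
    # Delta_i, Phi_i as in Eq.(4); E_i as in Eq.(11)
    delta = [(z[0] + w[0]) / (z[i] + w[i]) for i in range(1, m)]
    phi = [(o[0] + s[0] - o[i] - s[i]) / (z[i] + w[i]) for i in range(1, m)]
    E, Ei = [], 1.0
    for i in range(1, m):
        Ei *= w[i - 1] / (w[i] + z[i])   # E_i = product of the epsilons
        E.append(Ei)
    num = W * (z[0] + w[0]) * (sum(E) - sum(delta))
    den = (1 + sum(E)) * ((z[0] + w[0]) * sum(phi)
                          - (1 + sum(delta)) * (o[0] + s[0]))
    # Eq.(22): (n + 1)^2 = num/den; round the root and subtract the last installment
    return max(round(math.sqrt(num / den)) - 1, 0)
```

The same helper quantities (Δi, Φi, Ei) can be reused to evaluate the right side of Eq.(23) for m = 2, 3, ... when selecting the proper number of processors.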
5 Experiments
A set of 50 processors with randomly generated attributes was used. For all processors, the value of w is 20 times as large as the value of z, which means that communication speed is much faster than computation speed.

5.1 Order of Processors
In heterogeneous multi-installment systems, the order of data distribution is very important, because a bad ordering leads to more General Idle Time, which increases the response time. Therefore, we repeated the experiments with different ordering algorithms (increasing zi, increasing wi, and Hsu's ordering algorithm, which is introduced in the next section). The results of these experiments can be seen in Fig. 2. Although, at the scale of the graph, the response times for ordering by zi and by wi look the same, there are some differences, and ordering by zi is better than ordering by wi.

5.2 Evaluating the Proposed Method
We attempted to verify that the proposed formula for finding the proper number of installments is correct. Therefore, we manually increased and decreased the proper number of installments. The results of this experiment for four different job sizes can be seen in Fig. 3a. In this experiment, we both increased and decreased the proper number of installments by one to four units for each job size. Zero marks the proper number of installments calculated by our formula. It is clear that, for each job, the response time at this zero point is smaller than at the others. Thus, our formula for calculating the proper number of installments is correct.
Fig. 2. Response Time vs Job Size for Different Ordering Methods
Fig. 3. Response Time vs Difference from the Proper Number of Installments and Processors for Different Job Sizes
We did the same experiment to show that the proposed formula for finding the proper number of processors is correct. The results of this experiment, presented in Fig. 3b, confirm that it is.

5.3 The Proposed Method vs Hsu's Method
We did some experiments to compare the proposed method with the method presented by Hsu et al. [10]. Hsu's method does not have any mechanism for restricting the number of processors used, so it uses all available processors. Therefore, we did two different experiments: one with all available processors (50 processors) and one with only the first 16 processors,
Fig. 4. Response Time vs Job Size for The Proposed Method and Hsu’s Method
the proper number of processors calculated by Eq.(23), after sorting all processors with Hsu's ordering mechanism. Hsu's method uses a specific order of processors: zi/(zi + wi) is calculated for every processor, and the processors are ordered by increasing value of this parameter. In Fig. 4, a comparison between the proposed method and Hsu's method can be seen. It is clear that the proposed method's response time is much better than Hsu's. One of the problems of Hsu's method is that it does not ensure that the tasks on all processors finish simultaneously. Another problem is its unsuitable order of processors.
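Hsu's ordering criterion described above is a simple key-based sort; as a sketch, with processors represented as (z, w) pairs (our own convention, not from either paper):

```python
def hsu_order(workers):
    """Sort (z, w) pairs by increasing z/(z + w), Hsu et al.'s ordering criterion."""
    return sorted(workers, key=lambda p: p[0] / (p[0] + p[1]))
```

A processor with a fast link relative to its computation speed (small z/(z + w)) is served first under this criterion.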
6 Conclusion
Scheduling a job with the multi-installment method in a heterogeneous system that includes overheads is complex. In this paper, we presented a method for scheduling jobs in a multi-installment heterogeneous system with overheads and blocking-mode communication. The method consists of four closed-form formulas for calculating the proper number of processors, the proper number of installments, the chunk sizes for the internal installments, and the chunk sizes for the last installment. We showed that ordering processors by increasing zi, that is, by decreasing communication speed, gives the best performance. Comparing the proposed method with Hsu's method showed that the response time of the proposed method is smaller.
Acknowledgment The research was partially supported by the Malaysian Ministry of Higher Education, FRGS No: 01-11-09-734FR.
References

1. Robertazzi, T.: Ten reasons to use divisible load theory. Computer 36(5), 63–68 (2003)
2. Shokripour, A., Othman, M.: Categorizing DLT researches and its applications. European Journal of Scientific Research 37(3), 496–515 (2009)
3. Mingsheng, S.: Optimal algorithm for scheduling large divisible workload on heterogeneous system. Appl. Math. Model 32, 1682–1695 (2008)
4. Mingsheng, S., Shixin, S.: Optimal multi-installments algorithm for divisible load scheduling. In: Eighth International Conference on High-Performance Computing in Asia-Pacific Region, pp. 45–54 (2005)
5. Shokripour, A., Othman, M., Ibrahim, H.: A new algorithm for divisible load scheduling with different processor available times. In: Nguyen, N.T., Le, M.T., Świątek, J. (eds.) ACIIDS 2010. LNCS, vol. 5990, pp. 221–230. Springer, Heidelberg (2010)
6. Shokripour, A., Othman, M., Ibrahim, H., Subramaniam, S.: A new method for job scheduling in a non-dedicated heterogeneous system. Accepted in Procedia Computer Science (2010)
7. Yang, Y., Casanova, H.: UMR: A multi-round algorithm for scheduling divisible workloads. In: 17th International Symposium on Parallel and Distributed Processing (2003)
8. Yamamoto, H., Tsuru, M., Oie, Y.: A parallel transferable uniform multi-round algorithm in heterogeneous distributed computing environment. In: Gerndt, M., Kranzlmüller, D. (eds.) HPCC 2006. LNCS, vol. 4208, pp. 51–60. Springer, Heidelberg (2006)
9. Lee, D., Ramakrishna, R.S.: Inter-round scheduling for divisible workload applications. In: Hobbs, M., Goscinski, A.M., Zhou, W. (eds.) ICA3PP 2005. LNCS, vol. 3719, pp. 225–231. Springer, Heidelberg (2005)
10. Hsu, C.H., Chen, T.L., Park, J.H.: On improving resource utilization and system throughput of master slave job scheduling in heterogeneous systems. J. Supercomput. 45(1), 129–150 (2008)