Execution Time Estimation for Workflow Scheduling

Artem M. Chirkin¹, Adam S. Z. Belloum², Sergey V. Kovalchuk³, Marc X. Makkes⁴, Mikhail A. Melnik²,³, Alexander A. Visheratin³, Denis A. Nasonov³
¹ ETH Zurich, Switzerland
² University of Amsterdam, Netherlands
³ ITMO University, Russia
⁴ TNO Information and Communication Technology, Netherlands

Abstract. Estimation of the execution time is an important part of the workflow scheduling problem. The aim of this paper is to highlight common problems in estimating the workflow execution time and to propose a solution that takes into account the complexity and the stochastic aspects of the workflow components as well as their runtime. The solution proposed in this paper addresses the problem at different levels, from a single task to the whole workflow, including the error measurement and the theory behind the estimation algorithm. The proposed makespan estimation algorithm can easily be integrated into a wide class of schedulers as a separate module. We use a dual stochastic representation, characteristic/distribution functions, in order to combine tasks' estimates into the overall workflow makespan. Additionally, we propose workflow reductions - operations on a workflow graph that do not decrease the accuracy of the estimates, but simplify the graph structure, hence increasing the performance of the algorithm. Another important feature of our work is that we integrate the described estimation schema into the earlier developed GAHEFT scheduling algorithm and experimentally evaluate the performance of the enhanced solution in a real environment using the CLAVIRE platform.

Keywords. workflow; scheduling; time estimation; cloud computing; genetic algorithm

1. Introduction

The execution time estimation is used mainly to support workflow scheduling. In this work we use "execution time", "runtime" and "makespan" as synonyms. In order to obtain the workflow runtime estimation, one needs to combine the execution time estimations of the tasks forming the workflow. Makespan estimation is an essential part of the scheduling optimization process because it greatly affects the quality of the generated solutions no matter what optimization criteria are used.

The runtime estimation problem is not as well developed as the scheduling problem, because several straightforward techniques exist that provide acceptable performance in many scheduling cases. However, these techniques may fail in some cases, especially in a deadline-constrained setting. For example, consider a very basic approach where we take the average time of previous executions as the only value representing the runtime of a task. If we apply it to a real case with many tasks running in parallel, it will underestimate the total execution time, because it is very likely that at least one task takes more than the average time to execute. In addition, when estimating the workflow execution time, we should take into account the dependencies between the tasks and the heterogeneity of the tasks and computing resources. Another important feature of scientific workflows, from the estimation perspective, is that many components of the runtime are stochastic in their nature (data transfer time, overheads, etc.). These facets are not considered together in modern research; therefore the problem requires a diligent investigation. Besides scheduling, the runtime estimation is used to give the expected price of the workflow execution when leasing Cloud computing resources [1]. Given the estimated time bounds, one can use a model described in [2] to provide a user with pre-billing information.

The central idea of this work is to allow a scheduler to use the estimates as random variables instead of keeping them constant, and to provide the complete information about them (e.g. estimated distribution functions) at maximum precision. We compare our results to the only alternative known to us that provides a distribution estimation - a "fully probabilistic" algorithm based on combining quantiles of the tasks' makespans. When calculating the workflow execution time, we assume that the tasks' runtime models are adjusted to the distinct computing nodes.

This allows us to separate the task-level problem (estimating the runtime on a given machine) from the workflow-level problem (aggregating the tasks' estimates into the overall makespan). We also combine the proposed approach with the previously developed GAHEFT algorithm [26], and evaluate the performance of this schema in comparison with plain GAHEFT and the widely used heuristic algorithm HEFT [27]. The part of the paper devoted to makespan estimation is based on the Master's thesis by A. Chirkin [3]; see the thesis for the proofs of the formulae and the implementation details.

2. Related works

Workflow scheduling is known to be a complex problem; it requires considering various aspects of the application's execution (e.g. resource model, workflow model, scheduling criteria, etc.) [4]. Therefore, many authors describing the architecture of their scheduling systems do not focus on the runtime estimation. We separate the estimation from the scheduling, so that the proposed approach (the workflow makespan estimation) can be used in a variety of scheduling algorithms, e.g. HEFT and GAHEFT. Additionally, in order to monitor the workflow execution process or reschedule the remaining tasks, one can use the estimation system to calculate the remaining execution time while the workflow is being executed. Some examples of scheduling systems that can use any implementation of the runtime estimation module are CLAVIRE [5], [6], a scheduling framework proposed by Ling Yuan et al. [7], and CLOUDRB by Somasundaram and Govindarajan [8].

Table 1 – Usage of the runtime estimation in scheduling

| scheduling \ time | Unspecified | Random | Fixed | Ordinal |
| task | – | task ranking and makespans [9], [10] | NP parallelizing models [15], [16], [17] | Round-Robin, greedy, etc. [21], [20] |
| workflow | architecture [11], [12], [7], [8] | Chebyshev inequalities [13], composing quantiles [14] | optimization algorithms [9], [18], [19], [20] | – |

There are no papers known to us that are dedicated specifically to the workflow runtime estimation problem. Instead, there is a variety of scheduling papers that describe runtime models in the context of their scheduling algorithms. We decided to classify the schedulers by the way they represent the execution time. Table 1 shows a classification of workflow scheduling algorithms; the discussion below covers the classes in detail. Several papers are difficult to classify for two reasons: first, some approaches sit between two classes (e.g. in [9], [15], [16], [17], [10] the execution time is calculated as the mean of a random variable, but the variance of the time is not used); second, some researchers focus more on the architecture of scheduling software than on the algorithms, so different estimate representations can be used [12], [8], [7].

Ordinal time: The first class of schedulers uses task-level scheduling heuristics (which do not take into account the execution time of the workflow). An algorithm of such a scheduler uses a list of tasks ordered by their execution time; this ordering is the only information used to map the tasks onto the resources. First of all, this class is based on various greedy algorithms; two of them are described in [20]. Compared to the workflow-level algorithms, they are very fast, but have been shown to be less effective [20]. A number of scheduling heuristics of this class are described in a paper by Gutierrez-Garcia and Sim [21].

Fixed time: The second class of schedulers considers the task runtime as a constant value. On the one hand, this assumption simplifies the estimation of the workflow execution time, because in this case there is no need to take into account the possibility that a task with a longer expected runtime finishes its execution faster than a task with a shorter expected time. In particular, this approach simplifies the calculation of the workflow execution time in the case of parallel branches (Figure 1.b): if the execution time of the branches is constant, then the expected execution time of the whole workflow is equal to the maximum of the branches' execution times. On the other hand, this assumption may lead to significant errors in the estimations and, consequently, to inefficient scheduling. In the case of a large number of parallel tasks, even a small uncertainty in the task execution time makes the expected makespan of the workflow longer. However, in some cases the random part of the program execution time is small enough, thus this approach is often used.

One example of using fixed time is the work of M. Malawski et al. [22], who introduced a cost optimization model of workflows in IaaS (Infrastructure as a Service) clouds in a deadline-constrained setting. Other examples of schedulers that do not exploit the stochastic nature of the runtime can be found in [9], [18], [19], [20]. They calculate the workflow makespan; this means that they have to calculate the execution time of parallel branches, but the stochastic and fixed approaches to estimating it are different. The main problem, again, is that the average of the maximum of multiple random variables is usually larger than the maximum of their averages.

Stochastic time: The last class of schedulers makes assumptions that have the potential to give the most accurate runtime predictions. Knowledge of the expected value and the variance allows measuring the uncertainty of the workflow runtime, giving rough bounds on the time using Chebyshev's inequality¹. For instance, Afzal et al. [13] used the Vysochanski-Petunin inequality² to estimate the runtime of a program. If provided with the quantiles or a distribution approximation, one can construct rather good confidence intervals on the execution time that may be used in a deadline-constrained setting. N. Trebon presented several scheduling policies to support urgent computing [14]. To find out whether some configuration of a workflow allows a given deadline to be met, they used quantiles of the tasks' execution time. For example, it can easily be shown that if two independent concurrent tasks (T2 and T3 in Figure 1.c) finish in 20 and 30 minutes with probabilities 0.9 and 0.95 respectively, then one can set an upper bound on the total execution time (max(T2, T3)) of max(20, 30) = 30 minutes with probability 0.9 · 0.95 = 0.855. If the tasks are connected sequentially (Figure 1.a), the total execution time is 20 + 30 = 50, again with probability 0.9 · 0.95 = 0.855. This approach is quite straightforward and very useful for calculating bounds on the quantiles, but it does not allow evaluating the expected value. Additionally, it is highly inaccurate for a large number n of tasks: say one has a quantile p for each task, then the total probability for any workflow topology is at least p^n. For example, in Figure 1 workflows (a) and (b) give p^3, workflow (c) gives p^4 (if max(T2, T3) is calculated first) or p^5 (if the tasks are calculated sequentially), and workflow (d) gives p^7 or more depending on the way the steps are composed.


Figure 1 – Primitive workflow types: (a) sequential workflow - the workflow makespan is the sum of variables; (b) parallel workflow - the makespan is the maximum of variables; (c) complex workflow that can be split into parallel/sequential parts; (d) complex irreducible workflow

The main drawback of using random execution time is that one needs much more information about the algorithm of the task than just a mean runtime; in order to compute the quantiles one has to obtain full information about the distribution of a random variable. Some authors do not take into account the problem of obtaining such information, assuming it is given as an input of the algorithm. However, in real applications the execution data may be inconsistent with the assumptions of the distribution estimation method: if a program takes several arguments, and these arguments change from run to run, it might be difficult to create a proper regression model. Additionally, even if one has created good estimations of the mean and the standard deviation of a distribution as functions of the program arguments, these two statistics do not catch changes in the higher-order moments, such as the asymmetry. As a result, under some circumstances (for some program arguments), the distribution of the runtime may have a larger "tail" than expected by the constructed empirical distribution function, hence the program has a larger chance of missing the deadline.

¹ Chebyshev's inequality uses the variance and the mean of a random variable to bound the probability of deviating from the expected value: $P(|X - E[X]| \ge k\sqrt{Var(X)}) \le \frac{1}{k^2}$, $k > 0$. We also refer to other inequalities that use the mean and the variance to bound the probability as Chebyshev-like inequalities.
² The Vysochanski-Petunin inequality is one of the Chebyshev-like inequalities; it is constrained to unimodal distributions, but improves the bound: $P(|X - E[X]| \ge k\sqrt{Var(X)}) \le \frac{4}{9k^2}$, $k > \sqrt{8/3}$.

3. Execution time model

Section 2 describes the ways scheduling systems use runtime estimations. Several examples show that the stochastic representation gives the most accurate results [20], [13]. The most significant advantage of the stochastic approach is the ability to measure the error of the estimation in a simple way. Additionally, it allows representing the effect of the components that are stochastic in their nature, e.g. the data transfer time or the execution of randomized algorithms. However, this approach has its own problems, specific to the time representation. The first problem is the creation of the task execution time models: in order to efficiently perform the regression analysis one needs a sufficient amount of execution statistics, and estimating the full probability distribution as a function of the task input is usually a complicated task. The second problem is related to combining the random variables (tasks' execution times) into a workflow makespan because of the complex dependencies in the workflow. In its simplest form, the proposed solution to the problem is straightforward: create the task runtime models, collect the data and fit the models to it, and combine the task models into the workflow runtime estimate. Under this approach, we use the following features of the problem model (a toy example of the graph representation and path enumeration is sketched after this list):

• The workflow is represented as a directed acyclic graph (DAG) (Figure 1).
• The nodes of the workflow are the tasks, and their makespans are represented as random variables T_i > 0. These variables are mutually independent, and we assume the variables T_i to be continuous, based on experimental evidence.
• The workflow execution time (runtime, makespan) is the time needed to finish all tasks in the workflow. An equivalent definition is the maximum execution time among all paths throughout the workflow.
• A path in the workflow is a sequence of dependent tasks which starts at any head task (a task that has no predecessors) and finishes at any of the tail tasks (tasks that are not followed by any other tasks). The number of paths increases with each fork in the workflow (Figure 1).
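As referenced above, here is a minimal Python sketch of a workflow DAG and the enumeration of its head-to-tail paths; the adjacency-list representation and the task names are illustrative choices of ours, not part of the paper:

```python
from typing import Dict, List

# Adjacency list of a DAG: task -> list of direct successors.
# This is the diamond workflow of Figure 1.c: T1 feeds T2 and T3, which both feed T4.
workflow: Dict[str, List[str]] = {
    "T1": ["T2", "T3"],
    "T2": ["T4"],
    "T3": ["T4"],
    "T4": [],
}

def heads(wf: Dict[str, List[str]]) -> List[str]:
    """Head tasks: tasks without predecessors."""
    successors = {v for vs in wf.values() for v in vs}
    return [n for n in wf if n not in successors]

def all_paths(wf: Dict[str, List[str]]) -> List[List[str]]:
    """Enumerate every head-to-tail path; their number r grows with each fork."""
    paths: List[List[str]] = []

    def walk(node: str, prefix: List[str]) -> None:
        prefix = prefix + [node]
        if not wf[node]:            # tail task: no successors
            paths.append(prefix)
        for nxt in wf[node]:
            walk(nxt, prefix)

    for h in heads(wf):
        walk(h, [])
    return paths

print(all_paths(workflow))          # [['T1', 'T2', 'T4'], ['T1', 'T3', 'T4']]
```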

3.1 Task runtime description

Besides the calculation time of the applications, the workflow execution time includes additional times for the data transfer, resource allocation, and others. Therefore, a single task processing can be considered as the call of a computing service, and its execution time can be represented in the following way:

$T = T_R + T_Q + T_D + T_E + T_O.$  (1)

Here, T_R is the resource preparation time, T_Q the queuing time, T_D the data transfer time, T_E the program execution time, and T_O the system overhead time (analyzing the task structure, selecting the resource, etc.). For a detailed description of the component model (1) see [6]. At the learning stage, each component is modeled as a stochastic function of the environment parameters (e.g. machine characteristics, workload, data size, etc.). At this stage one may apply any known techniques to create good models. When the scheduler asks for an estimate, it provides all the parameters (the unknown parameters are estimated as well), thus the components turn into random variables. If all components are independent (at least conditionally on the environment parameters), one can easily find the mean and the variance of T as the sum of the components. Here and later in the paper, for the sake of simplicity, we assume all components to be included in the overall task execution time T, which is a complex function at the learning stage and a random variable at the estimation stage (after the parameter substitution).

The major problem of the stochastic approach is related to the way one represents the execution time when fitting it to the observed data. We use a regression model for this purpose:

$T = \mu(X, \beta) + \sigma(X, \beta)\,\xi(X, \beta).$  (2)

Here X denotes the task execution arguments, represented by a vector in an arbitrary argument space; X is controlled by an application executor, but can be treated as a random variable from the regression analysis point of view. T ∈ R is the dependent variable (i.e. the execution time); β are the unknown regression parameters; μ = E[T|X], σ = √Var(T|X); ξ(X, β) is a random variable. The regression problem consists of estimating the unknown parameters β, thus estimating the mean, the variance, and the conditional distribution of the runtime. Some authors used only the mean μ (e.g. [16], [20]) and therefore could not create confidence intervals. Others computed the mean and the variance, but did not estimate ξ(X, β) (e.g. [13]); they used Chebyshev-like inequalities, which are based on knowledge of the mean and the variance only. Trebon [14] used the full information about the distribution, but assumed it to be obtained by an external module. As we explained in Section 2 (Stochastic time), it is seemingly not possible to estimate the conditional distribution of ξ for an arbitrary input X, therefore one needs to simplify the problem. As an example, one may assume that ξ and X are independent. However, in the general case ξ depends on X - this means that the distribution of T not only shifts or scales with X, but also changes its shape. Consequently, for some values of X, the time T may have a significantly larger right "tail" than for others, hence this effect should be taken into account when modelling the task runtime.

3.2 Task runtime estimation

A task runtime model always needs to be consistent with the execution data coming from the task execution. In order to achieve this we use an evaluation/validation approach that allows models to be constructed and updated over time.

1) Evaluation: The evaluation procedure creates a model of a task runtime, i.e. it basically solves the regression problem (2); thus, the output of this procedure is a tuple of the mean and standard deviation functions μ(X) and σ(X). The basis for the evaluation is the execution data from the provenance (an arguments-runtime dataset). If a parametric model of the task execution time is known, we adjust the parameters using the Maximum Likelihood Estimation (MLE)³ method. If no prior information about the structure of the time dependency on the arguments is known, we use the Random Forest⁴ machine learning method to create the models based entirely on the provided data.

2) Validation: Once the execution time models μ(X) and σ(X) for a task are created, they must be validated from time to time: there might be changes in the environment (or the program represented by the task may be updated) that make an earlier created model inconsistent with the new data. The validation procedure solves this problem by checking the task models at a pre-defined time interval or each time new execution data comes from the provenance (e.g. a logging system adds a record to a database with the task executions). The model is validated by performing statistical hypothesis tests on the normalized execution time data. We use simple heuristics (e.g. a fixed number of points, a fixed time interval) to select the active sample; using this sample, the tests' statistics are calculated and compared to the current values. Various tests are able to reject the null hypothesis (i.e. the assumption that the model is still valid) with respect to different alternatives. We use a set of tests to check whether the mean, variance, or shape of a sample has changed.

³ The MLE is a basic statistical method to estimate parameters of a distribution; it consists of creating a likelihood function (which measures "how likely" it is that the data come from a given distribution) and maximizing it in a parameter space.
⁴ The Random Forest is a non-parametric machine learning method for creating functions out of a sample of argument-target tuples, based on combining decision trees [23].
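As an illustration of this evaluation/validation loop, the following hedged Python sketch fits non-parametric μ(X) and σ(X) models with Random Forests and checks new data with a two-sample Kolmogorov-Smirnov test. The synthetic data, the squared-residual model for the variance, and the specific test are our own assumptions, not prescriptions of the paper:

```python
import numpy as np
from scipy import stats
from sklearn.ensemble import RandomForestRegressor

# Hypothetical provenance data: X holds task arguments, T the observed runtimes.
rng = np.random.default_rng(0)
X = rng.uniform(1, 10, size=(500, 2))
T = 2.0 * X[:, 0] + rng.gamma(shape=2.0, scale=0.3 * X[:, 1], size=500)

# Evaluation: non-parametric models for mu(X) and sigma(X) via Random Forests.
mu_model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, T)
resid2 = (T - mu_model.predict(X)) ** 2
var_model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, resid2)

def mu(x):
    return mu_model.predict(np.atleast_2d(x))

def sigma(x):
    return np.sqrt(np.maximum(var_model.predict(np.atleast_2d(x)), 1e-12))

# Validation: compare normalized residuals of the active sample against new data;
# rejection of the null hypothesis means the model is stale and must be refitted.
def is_model_valid(X_new, T_new, alpha=0.05):
    old = (T - mu(X)) / sigma(X)
    new = (T_new - mu(X_new)) / sigma(X_new)
    return stats.ks_2samp(old, new).pvalue > alpha
```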

4. Workflow runtime estimation

In this section we provide the theoretical basis for the method and the algorithms. In order to proceed with the calculations, we use the following notation. Denote by s the total number of nodes in the graph. T_k, k ∈ 1..s, are the independent random variables representing the execution time of each node; one can safely assume they are non-negative, continuous, and that at least their first two moments are finite, E[T_k^l] < ∞, l ∈ {1, 2}. F_k is the cumulative distribution function (CDF) of the variable T_k. Let r be the total number of different paths throughout the workflow; s^j is the number of steps in the path j ∈ 1..r. Then k_i^j, i ∈ 1..s^j, j ∈ 1..r, k_i^j < k_{i+1}^j, is the index of a node on the j-th path of the workflow; each path contains a number of nodes which is less than or equal to s^j ≤ s. Our agreement on indexing is to use superscripts for the path numbers and subscripts for the task numbers. F_path^j(t) is the CDF of the variable T_path^j; F_wf(t) is the CDF of the workflow runtime. The characteristic function (CF) is denoted as φ(ω) and uses the same subscripts/superscripts as the CDF F(t). The execution time of a path throughout a workflow is denoted as follows:

$T^j_{path} = \sum_{i=1}^{s^j} T_{k^j_i}.$  (3)

The execution time (makespan, runtime) of a workflow is the time required to execute all paths of the workflow in parallel:

$T_{wf} = \max_{j \in 1..r}\left(T^j_{path}\right) = \max_{j \in 1..r}\left(\sum_{i=1}^{s^j} T_{k^j_i}\right).$  (4)

Estimating moments: The mean and the standard deviation of a random variable are the easiest characteristics to estimate. However, since estimation of the workflow makespan requires taking the maximum of random variables, we cannot calculate its moments directly. Since the tasks (as random variables) are assumed to be mutually independent, the moments of a path's runtime can be computed directly:

$E\left[T^j_{path}\right] = \sum_{i=1}^{s^j} E\left[T_{k^j_i}\right], \qquad Var\left(T^j_{path}\right) = \sum_{i=1}^{s^j} Var\left(T_{k^j_i}\right).$

Although the moments for a workflow cannot be calculated directly, it is possible to use the simple maximums as initial approximations for the moments, to be used where precision is not mandatory:

$E[T_{wf}] \approx \max_{j \in 1..r}\left(E\left[T^j_{path}\right]\right), \qquad Var(T_{wf}) \approx \max_{j \in 1..r}\left(Var\left(T^j_{path}\right)\right).$  (5)

Estimating distributions: If the CDFs are given for each task, then it is clear how to compute the overall probability. The sum of random variables turns into the convolution of two CDFs, and the maximum of random variables turns into the multiplication of two CDFs. The only problem is to address the correlation of two variables: for example, in Figure 1.d the random variable T1 + T3 partially depends on max(T1, T2) + T4. Define Z_sum = T1 + T2. Both random variables T1, T2 are positive and have finite mean and variance. Then the CDF is given by the following expression:

$F_{Z_{sum}}(z) = \int_0^z F_{T_2}(z - x_1 \mid T_1 = x_1)\, dF_{T_1}(x_1).$

Similarly for the maximum: Z_max = max(T1, T2), then:

$F_{Z_{max}}(z) = \int_0^z F_{T_2}(z \mid T_1 = x_1)\, dF_{T_1}(x_1).$

Measuring the dependence (i.e. the joint CDF) of two variables, as well as the integration, are complex procedures, hence there is a need to find a workaround. On the one hand, it is known that in the case of independent variables X and Y, the CDF of their maximum can be found simply as the multiplication of their CDFs:

$F_{\max(X,Y)}(t) = F_X(t)\,F_Y(t).$  (6)

On the other hand, one can use a CF to calculate the sum, because the CF of the sum of independent random variables equals the multiplication of their CFs:

$\phi_{X+Y}(\omega) = \phi_X(\omega)\,\phi_Y(\omega).$  (7)

Since the tasks (as random variables) in one path throughout a workflow are considered to be independent (i.e. a path forms a sum of independent random variables), the easiest way to compute the distribution of the runtime of one path is to use the CF. Thus, one can use (6) for the maximum operation and (7) for the sum operation. However, this raises the question of the conversion between the CDF and the CF, which is addressed in Section 4.2.
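A quick Monte Carlo sanity check of (6) and (7) for two independent task runtimes; the lognormal parameters, the deadline and the frequency point below are arbitrary, purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
m = 200_000
T1 = rng.lognormal(mean=3.0, sigma=0.4, size=m)   # two independent task runtimes
T2 = rng.lognormal(mean=3.2, sigma=0.3, size=m)

t = 30.0                                          # a candidate deadline
# Eq. (6): CDF of the maximum = product of the CDFs.
lhs = np.mean(np.maximum(T1, T2) <= t)
rhs = np.mean(T1 <= t) * np.mean(T2 <= t)
print(lhs, rhs)                                   # agree up to Monte Carlo noise

# Eq. (7): CF of the sum = product of the CFs (checked at one frequency point).
w = 0.05
cf_sum = np.mean(np.exp(1j * w * (T1 + T2)))
cf_prod = np.mean(np.exp(1j * w * T1)) * np.mean(np.exp(1j * w * T2))
print(cf_sum, cf_prod)                            # agree up to Monte Carlo noise
```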

4.1 Workflow makespan CDF

Schedulers usually have to satisfy a constraint on the maximum workflow makespan (the "deadline") when mapping tasks onto computing nodes. Thus, a common request from the scheduler to an estimator is to get the probability that a workflow in a given configuration finishes no later than the defined deadline. We call this the probability of meeting deadline t. Note that this definition is equal to the definition of the CDF - the probability that a random variable is lower than a fixed value t (the argument of the CDF). Assume the CDFs of all paths in the workflow are already estimated. As mentioned, the paths as random variables may correlate, therefore it is difficult and computationally expensive to compute the workflow CDF directly. Our approach, instead, proposes to construct a lower bound on the CDF. This allows computing a lower bound on the confidence level of meeting the deadline - the most important application of the runtime estimation. The formula for the probability of executing within a deadline t comes directly from (4) and can be written as follows:

$F_{wf}(t) = P(T_{wf} \le t) = P\left(\max_{j \in 1..r}\left(\sum_{i=1}^{s^j} T_{k^j_i}\right) \le t\right).$  (8)

Note that some indices k_i^j overlap along different paths j ∈ 1..r. The main problem is that, due to the occurrence of the same random variables in different paths, the probability of the maximum cannot be decomposed into the multiplication of the probabilities directly. What one can do instead is to compute the distribution of each path T_path^j separately and then bound the total distribution of T_wf:

$P(T_{wf} \le t) \ge \prod_{j=1}^{r} P\left(\sum_{i=1}^{s^j} T_{k^j_i} \le t\right).$  (9)

The inequality above allows us to say that the probability of meeting a deadline is higher than the calculated value, i.e. to ensure that the workflow will execute no later than required.
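The bound (9) can be illustrated with a small Monte Carlo experiment on the diamond workflow of Figure 1.c, where the two paths share tasks T1 and T4; the gamma runtimes and the deadline value are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)
m = 200_000
# Diamond workflow of Figure 1.c: T1 -> (T2 || T3) -> T4; both paths share T1 and T4.
T = {k: rng.gamma(shape=4.0, scale=5.0, size=m) for k in (1, 2, 3, 4)}
path_a = T[1] + T[2] + T[4]
path_b = T[1] + T[3] + T[4]
wf = np.maximum(path_a, path_b)                        # workflow makespan, Eq. (4)

t = 80.0                                               # deadline
exact = np.mean(wf <= t)                               # left-hand side of (9)
bound = np.mean(path_a <= t) * np.mean(path_b <= t)    # right-hand side of (9)
print(exact, bound)   # bound <= exact (up to Monte Carlo noise): a safe, pessimistic guarantee
```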

4.2 Transform between CDF and CF

Section 4.1 explains how to bound the CDF of a workflow given the CDFs of all paths throughout it. However, it is easier to use the CF than the CDF for the paths, because it does not require the convolution (integration) for the sum of the random variables (tasks). Since the model of a task runtime is non-parametric, the only way to represent the CDF or the CF is to store their values in a vector (a grid, i.e. a set of points). The equations (6) and (7) are linear in terms of the number of operations performed on the grid, O(n), where n is the size of the grid, whereas the convolution requires O(n^2) operations (or, possibly, O(n log n) if using some sophisticated algorithms). If one stores both the CDF and the CF, then the sum and the maximum operations can be computed in linear time; the CDF and the CF have the important property that they can be converted one into another using the Fourier transform, and conversion of the functions using the Fast Fourier Transform (FFT) requires O(n log n) operations, resulting in an overall complexity of O(n log n). The FFT is required only once per path, hence this also significantly reduces the number of "intensive" O(n log n) operations. As a result, the overall complexity of the method is O(r n log n). In this section we derive the conversion formulae for these two representations of the random variable distribution.

1) Estimating CDF and CF: Assume a random variable X with CDF F_X(t) and CF φ_X(ω). Here the parameter t denotes the time domain, and ω denotes the frequency domain. According to the definitions, φ_X(ω) = E[e^{iωX}], F_X(t) = P(X ≤ t). These functions can be estimated using an i.i.d. sample X_1, ..., X_m:

$\hat\phi_m(\omega) = \frac{1}{m}\sum_{j=1}^{m} e^{i\omega X_j}, \qquad \hat F_m(t) = \frac{1}{m}\sum_{j=1}^{m} \mathbf{1}\{X_j \le t\}.$
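For example, with NumPy the two empirical estimators can be evaluated at arbitrary time and frequency points in a couple of lines; the sample and the evaluation points are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
sample = rng.gamma(shape=3.0, scale=10.0, size=1000)    # observed runtimes of one task

t = np.linspace(0.0, 120.0, 7)                          # time points for the CDF
w = np.linspace(0.0, 0.2, 5)                            # frequency points for the CF

cdf_hat = (sample[None, :] <= t[:, None]).mean(axis=1)  # F_m(t): mean of indicators
cf_hat = np.exp(1j * np.outer(w, sample)).mean(axis=1)  # phi_m(w): mean of exp(i*w*X_j)
print(cdf_hat)
print(cf_hat)
```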

Let us say one would like to calculate the distribution on a grid containing n points. Since the sample is finite, one can define an interval (a, b) according to the sample such that a < min_{j∈1..m} X_j and b > max_{j∈1..m} X_j. Then F̂_m(a) = 0, F̂_m(b) = 1. The parameters a, b, and n completely define the grids that are used to store the values of the CDF and the CF:

$t_k = \frac{k}{n}(b - a) + a, \quad k = 0, 1, \ldots, n-1; \qquad \omega_j = \frac{2\pi j}{b - a}, \quad j = 0, 1, \ldots, \frac{n}{2}.$  (10)

Note that the ω-grid in (10) is almost half-sized (n/2 + 1 elements instead of n). The CF is a Hermitian function, $\phi_X(-\omega) = \overline{\phi_X(\omega)}$ (and $\hat\phi_m(-\omega) = \overline{\hat\phi_m(\omega)}$), hence one can restore the values of the CF for negative arguments to get a full-sized grid.

2) Fourier Transform: After calculating the estimations of the CF and the CDF on the grids as shown in Section 4.2.1, one can transform between the representations using the FFT in the following way:

$\hat\phi^*_m(\omega_j) = \begin{cases} \dfrac{i(a-b)\omega_j}{n}\, e^{i\omega_j a}\, \overline{\sum_{k=1}^{n-1}\left(\hat F_m(t_k) - \frac{k}{n}\right) e^{-\frac{2\pi i k j}{n}}}, & j \ne 0,\\[2mm] \dfrac{n}{a-b}\left(\dfrac{b+a}{2} - E[X]\right), & j = 0, \end{cases}$  (11)

$\psi_j = \begin{cases} 1, & j = 0,\\[1mm] \dfrac{i\,n\,e^{i\omega_j a}}{(a-b)\,\omega_j}\, \overline{\hat\phi_m(\omega_j)}, & j \in \left\{1, \ldots, \frac{n}{2} - 1\right\},\\[2mm] \dfrac{i\,n\,e^{i\omega_{n-j} a}}{(a-b)\,\omega_{n-j}}\, \hat\phi_m(\omega_{n-j}), & j \in \left\{\frac{n}{2}, \ldots, n - 1\right\}, \end{cases}$  (12)

$\hat F^*_m(t_k) = \frac{1}{n}\sum_{j=0}^{n-1} \psi_j\, e^{\frac{2\pi i j k}{n}} + \frac{k}{n}.$  (13)

The sums in (11) and (13) have exactly the form of the forward and backward Discrete Fourier Transforms (DFTs), respectively. Therefore, the FFT, which has complexity O(n log n), can be applied directly to these equations. Going from the continuous Fourier transform to the discrete transform introduces a bias that is proportional to the approximation error of the finite sum of the corresponding Fourier series. If one assumes that the CDF is an analytic function, then the bias decays exponentially with the grid size n (and can be controlled easily by varying n), hence the significant part of the error is the variance of the initial estimations (the statistical estimates of the CF and the CDF are random functions). The overall error of the algorithm is expressed as follows:

$\hat F^*_{wf}(t) - F_{wf}(t) = O\left(r\,s\,b^*_{n/2}\right) + O\left(\sqrt{\frac{s}{m^*}}\right)\xi.$  (14)

Here b*_{n/2} denotes the asymptotic of the Fourier series error; the convergence may be polynomial or exponential, depending on the type of the CDFs. m* may be considered as the size of the smallest sample (among the tasks within a workflow). That is, the variance of the estimate depends on the sample size and the number of nodes. The bias also depends on the number of paths, but it can easily be controlled by varying the grid size; in addition, it depends on the shape of the CDF (and the interval b − a) indirectly via the term b*_{n/2}.

3) Combining shifted t-grids: Using one grid to compute the estimations of all tasks gives a poor quality of the result. This is clear, because the execution time of a workflow is usually far longer than a single task's execution time. For instance, if a workflow consists of ten similar tasks in a sequence, then each of the tasks has an average time ten times smaller than the overall time; but if they use the same grid for the CDF, then most of the grid's points are wasted - filled with ones or zeros. It is easy to show that, in general, the average time grows linearly with the number of nodes, whilst the standard deviation grows as the square root of the number of tasks. Therefore, in order to improve the overall accuracy of the method, we shift the center of the grid with the average execution time of the workflow's component (task or path). We use formulae (5) to obtain the initial rough estimations of the moments: the size of the grid is always the same, proportional to the standard deviation of the workflow makespan. If one fixes the size of the t-interval (a, b) using the standard deviation of the overall workflow makespan and shifts the center of this interval with the average time of the calculated components, then the total error reduces significantly, because the length of the interval in this case is not proportional to the overall time and depends on the variance only. Additionally, in this case the ω-grid always stays the same, thus the calculation procedure does not become too complicated.

4.3 Algorithm

The formulae of Section 4 dictate the constraints on a possible algorithm for obtaining the workflow runtime estimate. First, one needs to get the estimations of the moments for tasks and paths using (5). Using these estimations, one should then calculate the CFs on a constructed grid (the ω-grid stays the same during the computations). Finally, using (11), (12) and (13) one can convert the CFs into the CDFs and combine them on the shifted grids into the overall estimation. Next we provide the algorithm for estimating the workflow runtime. The input of the algorithm is a workflow graph produced by the scheduler and the normalized samples (which are used to compute the runtime distributions of the workflow's tasks). The task runtime distribution may be in some other form (e.g. a parametric distribution together with its parameters instead of a data sample); the only requirement is the possibility to estimate the CF or the CDF.

1 - Calculate moments. The mean and the variance of the tasks are taken directly from the task models (2). Using formulae (5) one gets the estimations of the mean μ_wf and the standard deviation σ_wf of the workflow execution time.

2 - Create calculation grids. The t-grids must satisfy several conditions: first, the values of the CDF should be close to zero and one at the sides of a grid; second, the grid should not be too large in order to keep a good accuracy (see Section 4.2.2 for details); third, all grids should have the same length, so that the ω-grids become the same; last, the t-grids of all variables should be shifted in such a way that if they are continued to infinity, they all coincide (in other words, if the grids' step is Δt, then the shift can only be cΔt, c ∈ Z). Grids that satisfy these conditions may be constructed in the following way:

$t_k = \Delta t\left(k - \frac{n}{2} + \left\lfloor\frac{\mu}{\Delta t}\right\rfloor\right), \quad k = 0, \ldots, n-1, \qquad \Delta t = \frac{2 q \sigma_{wf}}{n - 1};$
$\omega_j = \frac{2\pi j}{n \Delta t}, \quad j = -\frac{n}{2}, \ldots, -1, 0, 1, \ldots, \frac{n}{2} - 1.$

Here μ is the mean of the variable for which the grid is computed (task, path, or workflow); σ_wf is the estimated standard deviation of the workflow runtime; q is a constant which affects the size of the grid interval (i.e. creates a q-sigma interval); n is the size of the grid, which is recommended to be a power of 2 (for faster work of the FFT).

3 - Calculate CFs of the paths throughout the workflow.

In this step one should use the provided tasks' data to estimate the CFs and sum them up into all possible paths throughout the workflow.

4 - Convert CFs into CDFs. This step consists of applying the FFT formulae (12), (13) to each path throughout the workflow.

5 - Bound the workflow CDF. The final step is to bound the overall execution time CDF. Here one should directly apply (9). If the required value is not the CDF but the CF, the result is then converted back using (11).

The algorithm passes through all paths in a workflow once; the most expensive operation in terms of the grid size n is the FFT, which is performed for each path and in the sub-workflows. Therefore the complexity of the presented algorithm is not more than O((r + s) n log n).
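To make the five steps concrete, here is a deliberately simplified Python sketch of the pipeline. It discretizes runtimes onto a single shared grid starting at zero and uses FFT-based convolution for the per-path sums (i.e. multiplication of the discrete CFs), instead of the paper's shifted grids and exact transforms (11)-(13). The task samples, grid parameters and function names are illustrative:

```python
import numpy as np

def pmf_on_grid(samples, dt, n):
    """Discretize a runtime sample onto the grid t_k = k*dt, k = 0..n-1."""
    idx = np.clip(np.round(np.asarray(samples) / dt).astype(int), 0, n - 1)
    pmf = np.bincount(idx, minlength=n).astype(float)
    return pmf / pmf.sum()

def path_cdf(task_pmfs, n):
    """CDF of the sum of independent tasks along one path: the discrete CF (FFT)
    of the sum is the product of the tasks' CFs, cf. Eq. (7)."""
    size = 1
    while size < n * len(task_pmfs) + 1:     # zero-pad so circular convolution is linear
        size *= 2
    cf = np.ones(size, dtype=complex)
    for pmf in task_pmfs:
        cf *= np.fft.fft(pmf, size)
    pmf_sum = np.real(np.fft.ifft(cf))[:n]   # probability mass beyond the grid is dropped
    return np.cumsum(pmf_sum)                # values inside the grid stay exact

def workflow_cdf_lower_bound(paths, task_samples, dt, n):
    """Lower bound on the workflow CDF: product of the path CDFs, cf. Eq. (9)."""
    pmfs = {k: pmf_on_grid(s, dt, n) for k, s in task_samples.items()}
    bound = np.ones(n)
    for path in paths:
        bound *= path_cdf([pmfs[k] for k in path], n)
    return bound

# Diamond workflow of Figure 1.c with synthetic task runtime samples.
rng = np.random.default_rng(4)
samples = {k: rng.gamma(shape=4.0, scale=5.0, size=5000) for k in (1, 2, 3, 4)}
paths = [[1, 2, 4], [1, 3, 4]]
dt, n = 0.5, 1024
cdf_bound = workflow_cdf_lower_bound(paths, samples, dt, n)
deadline = 80.0
print("P(T_wf <= %.0f) >= %.3f" % (deadline, cdf_bound[int(deadline / dt)]))
```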

4.4 Workflow reduction

Before performing the main algorithm one may simplify the workflow graph in order to reduce the complexity of the algorithm. It is clear that the complexity of an estimation algorithm cannot be lower than linear in the number of tasks s; hence we investigate ways to decrease the number of paths r (which in the worst case may be exponential in the number of nodes). Figure 2 presents a workflow which contains parts that can be simplified. One can count the number of paths r throughout it; it equals 11. Obviously, some of the paths in this graph are guaranteed to have a lower value (execution time) than others. For example, path 1-4-9-10 is never longer than 1-4-5-6-9-10, because the former sequence is a subsequence of the latter; it therefore does not contribute to the overall runtime distribution, it only wastes extra computing time. Therefore one should modify the graph so that it does not contain the shorter paths. Another possible improvement is to compute the execution time of nodes 7 and 8 as a single random variable, which reduces the number of paths throughout the workflow by 3. The question now is how to formalize these simplifications, implement them algorithmically, and prove that they help. The following three types of reductions partially answer these questions (other types of reductions may also be introduced in the future); a sketch of the RS reduction is given below, after Figure 2:

• Delete Edge (DE) - delete all edges which introduce paths fully contained in some other paths ("shorter" paths); e.g. edge 4-9 in Figure 2;
• Reduce Sequence (RS) - combine all fully sequential parts into single meta-tasks; if the workflow contains sequences of tasks like 5-6 in Figure 2, these sequences are reduced into single tasks;
• Reduce Fork (RF) - combine all parallel parts (fork-joins) which start from the same set of nodes and join in the same set of nodes (i.e. tasks 7, 8, 9 in Figure 2 are reduced into a single task after deletion of edge 4-9).

Figure 2 presents an example of performing the workflow reductions on one workflow. The initial workflow in this figure contains r = 11 paths and s = 10 nodes. The final workflow is irreducible w.r.t. the discussed patterns (DE, RS, RF). As shown, the effect of the reductions is impressive for this particular example: the final number of paths is almost four times lower than the initial one (from 11 to 3). Unfortunately, it is difficult to give an analytical estimate of the performance of the reductions. The number of paths may grow exponentially with the number of nodes, and it is clear that any of the presented reductions alone cannot decrease the number of paths to polynomial. However, one cannot claim the same for applying all three reductions together. Moreover, real-world workflows are usually well-structured - they have a polynomial number of paths and are easily reducible [24].


Figure 2 – Example of performing reduction on a workflow (from (a) to (e))
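As mentioned above, here is a minimal sketch of the Reduce Sequence (RS) step alone; DE and RF are omitted, the adjacency-list representation is the same illustrative one used earlier, and a full implementation would also merge the runtime models of the fused tasks:

```python
from typing import Dict, List

def reduce_sequences(wf: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Repeatedly merge an edge u -> v when v is u's only successor and u is v's
    only predecessor; the pair becomes one meta-task (kept under u's name)."""
    wf = {node: list(succ) for node, succ in wf.items()}
    preds: Dict[str, List[str]] = {node: [] for node in wf}
    for u, succ in wf.items():
        for v in succ:
            preds[v].append(u)

    merged = True
    while merged:
        merged = False
        for u in list(wf):
            succ = wf[u]
            if len(succ) != 1:
                continue
            v = succ[0]
            if len(preds[v]) != 1:
                continue
            wf[u] = wf[v]                      # u inherits v's successors
            for w in wf[v]:
                preds[w] = [u if p == v else p for p in preds[w]]
            del wf[v], preds[v]
            merged = True
            break                              # restart the scan after each merge
    return wf

# A three-task chain collapses into one meta-task (like the 5-6 sequence in Figure 2).
print(reduce_sequences({"A": ["B"], "B": ["C"], "C": []}))   # {'A': []}
```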

5. Workflow scheduling

Workflow scheduling is known to be an NP-complete problem [28], and nowadays there is a large number of algorithms developed for solving it. The two most widespread classes of algorithms are list-based heuristics and metaheuristics, both of which have their strengths and weaknesses. The main advantage of the first class is a short execution time while providing solutions of suitable quality. One of the most popular algorithms of this group is the heterogeneous earliest finish time (HEFT) algorithm, which in many cases provides very efficient solutions. Other representatives of this class are the Greedy Randomized Adaptive Search Procedure [29] and the Dynamic Critical Paths algorithm [30]. The second class of algorithms often requires much more time for execution, but can provide better solutions than list-based heuristics, because metaheuristics can search through the whole solution space instead of using some predefined algorithm. The most popular algorithms of this group are the Genetic Algorithm (GA), Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO) [31].

5.1 GAHEFT algorithm

Taking into account the best parts of the described approaches, we proposed a hybrid algorithm based on the HEFT and GA algorithms [26]. The main idea behind it is that we generate the initial schedule using HEFT and then use it as a part of the initial population of the GA. Experimental results show that this schema greatly improves the convergence of the algorithm, although slightly reducing its searching ability. An example of the experimental evaluation of using a HEFT initial population is presented in Figure 3. The plot shows the results of GA and GAHEFT execution for the synthetic workflow CyberShake (30 tasks) [24], which represents a real composite application developed at the Southern California Earthquake Center to characterize earthquake hazards with Probabilistic Seismic Hazard Analysis. For statistical significance, the experiments were repeated 100 times. We can clearly see that in about 150 generations the GA with 1% of initial HEFT chromosomes converges to a solution which is not outperformed by the plain GA even after 500 generations.

Figure 3 – Experimental results for different numbers of initial HEFT chromosomes
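A hedged sketch of the seeding idea follows; the chromosome encoding as a plain task-to-resource mapping and the 1% share are illustrative assumptions, while the real GAHEFT encoding also covers the task ordering:

```python
import random
from typing import Dict, List

def initial_population(heft_schedule: Dict[str, str],
                       tasks: List[str],
                       resources: List[str],
                       size: int = 200,
                       heft_share: float = 0.01) -> List[Dict[str, str]]:
    """Seed a GA population: a small share of chromosomes copy the HEFT schedule,
    the rest are random task -> resource mappings."""
    n_heft = max(1, int(size * heft_share))
    population = [dict(heft_schedule) for _ in range(n_heft)]
    while len(population) < size:
        population.append({t: random.choice(resources) for t in tasks})
    return population
```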

We combined GAHEFT with the schema for workflow execution time evaluation described in Section 4. Instead of using the average task execution time at the fitness calculation step, we apply the algorithm to the execution statistics. There are several important aspects that need to be considered when applying this complex approach. The first problem is that the execution time evaluation algorithm does not take into account situations where the number of available computational resources is smaller than the number of tasks which can run in parallel. Consider the following example - we are trying to schedule the Montage workflow [24] containing 25 tasks onto a pool of five resources. The workflow schema is presented in Figure 4a. If we apply the logic of our makespan evaluation algorithm in a straightforward way, it will consider tasks 5-13 to run simultaneously, but with only five available resources this is impossible. That is why, after creating the schedule which will be evaluated by the fitness function, we add logical connections between tasks of the same level that run on the same resource. The modified schema of the workflow is presented in Figure 4b; the logical connections are represented with red arrows. This allows the makespan evaluation algorithm to comply with the task execution order and correctly calculate the execution time.

Figure 4 – Schema of the Montage workflow with 25 tasks: (a) original, (b) modified for five resources

The second aspect is that the execution time evaluation algorithm generates the distribution of the workflow makespan, whereas for scheduling an exact number is required. Taking this into account, we need a proper way to obtain this value from the resulting distribution. For this purpose we use the 3-sigma rule [32], which allows covering more than 99% of possible runtime values. In Figure 5 three possible output makespan distributions are presented. If we used the default method for the makespan evaluation, we would choose the first distribution, since it has the minimal mean value. But if we apply the 3-sigma rule and compare its output (dashed vertical lines of the corresponding colors), we can clearly see that the second distribution is the best option and the third is very close to it, despite the fact that it has the maximal mean value.
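A one-function sketch of this selection rule, assuming the estimator returns a sample or grid of makespan values (the exact inputs in GAHEFT may differ):

```python
import numpy as np

def fitness_3sigma(makespan_values) -> float:
    """Score a candidate schedule by mean + 3 * std of its estimated makespan
    distribution (the 3-sigma rule) instead of by the mean alone; lower is better."""
    m = np.asarray(makespan_values, dtype=float)
    return float(m.mean() + 3.0 * m.std())
```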

Figure 5 – Example of makespan distributions for the scheduling fitness function

All these issues were taken into account when the execution time evaluation algorithm was combined with the GAHEFT algorithm, in order to provide the best scheduling quality when working with real workflows, whose tasks are highly stochastic in their nature.

6. Scheduling experimental study

6.1 Experimental workflow

To evaluate the efficiency of the proposed combined scheduling algorithm, we considered the scenario of a complex workflow designed for flood simulations in Saint Petersburg. The general purpose and theoretical basis of cyclone generation for the modelling of floods are given in [33]. The workflow presented in Figure 6 was designed to generate forecasts of possible floods in the Saint Petersburg area with high precision. The module "Cyclone generator" is responsible for the generation of pressure and wind fields, based on which the level of water will be predicted. SWAN is a third-generation wave model, developed at Delft University of Technology, that computes random, short-crested wind-generated waves in coastal regions and inland waters. The module "Analyzer" is responsible for processing the outputs of the BSM model, comparing the resulting flood characteristics with the target ones, and adjusting the cyclone generator parameters to reproduce the flood more precisely. After the parameter adjustment the generation process starts again and results in a very precise reconstruction of the user-specified flood.

Figure 6 – Experimental workflow structure

Statistical data for the execution time of each package were gathered over a certain period of time, about a year. These data include computation times for the packages, data transfer times between them, and other overheads. Kernel density estimations of the tasks' calculation time and data transfer volume are shown in Figures 8 and 9, respectively.

Figure 8 – Workflow tasks calculation time distribution

From these figures it can be seen that the computation time varies in the range from 200 up to 3500 seconds, while data transfer is carried out in a few seconds with a bandwidth of 100 Mb/sec.

Figure 9 – Data transfer distribution between packages

6.2 Experimental environment

All experimental studies were conducted on the cloud computational platform CLAVIRE [5]. It is based on three main modules: EasyFlow, EasyPackage and EasyResource. EasyFlow allows interpreting complex tasks in a formalized workflow representation based on an acyclic graph. EasyPackage provides a simple and clear mechanism to unify package integration, supporting performance models, while EasyResource includes heterogeneous resources and the functionality to operate task execution. The core of EasyResource is the execution module, which has integrated scheduling mechanisms; it also allows manually mapping tasks onto resources by adding notations at each executed step in EasyFlow (see Figure 10).

Figure 10 – Manual resource allocation with the "Resource" notation

6.3 Experimental study

Our experimental study consists of two parts: simulation and real experiments. During the simulation part, we conducted experiments on the composite application from Section 6.1 in the simulated computational environment, combining two scheduling algorithms (the default MinMin and GAHEFT) with two performance models. The first performance model (algorithms MinMin and GAHEFT), which we used previously, operates only with average values of the computing and data transfer times for each package, while the second performance model (algorithms eMinMin and eGAHEFT), which is presented in this work, estimates the EDF for the whole workflow. It should be noted that the fitness function for evolving solutions in eGAHEFT is calculated as the 99% quantile of the EDF.

Table 2 – Results of simulated experiments

| | Average estimation | EDF estimation (99% quantile) |
| MinMin (eMinMin) | 24126.0 s | 24367.2 s |
| GAHEFT (eGAHEFT) | 15241.4 s | 13286.9 s |
| Profit | 36.8% | 45.5% |

According to the results of the experiments presented in Table 2, the schedules obtained by the GAHEFT scheduler show a 36.8% improvement in comparison to the default MinMin scheduler when using the old performance model with average values, and a 45.5% improvement in the case of the performance model with EDF estimation.

Figure 11 – Result schedules estimated by using different estimators

Further, all obtained schedules were estimated with two estimators (EDF estimation and the quantile method). Thus, the schedules provided by the EDF estimation (eGAHEFT, eMinMin) were estimated using the quantile method, and the schedules obtained using mean values (GAHEFT, MinMin) were estimated using both the EDF estimation and the quantile method. The comparison of these density functions is presented in Figure 11 (only one MinMin solution is shown, since they were equal). The left plot presents the solutions estimated by the EDF estimation method; the solutions estimated by the quantile method are shown in the right plot. From this figure we can see that the eGAHEFT schedule is considered worse than the GAHEFT schedule by 30% when both solutions are estimated using the quantile method. However, if both the GAHEFT and eGAHEFT schedules are evaluated using the EDF estimation, the eGAHEFT schedule shows an 8% profit in comparison with the GAHEFT solution. This means that the efficiency of scheduling algorithms (especially metaheuristics) highly depends on the estimation method for evolving solutions. Fitness estimation with poor-quality performance models may give wrong solutions, which leads to a loss of efficiency during the real execution of the workflow. Therefore, the main objective of these experiments is to compare the results of simulated and real executions of the composite application. To provide this comparison, all obtained schedules were executed on the CLAVIRE platform, which was described in the previous section. The results of these experiments are presented in Figure 12.

Figure 12 – Experimental scheduling

The solid red line presents the estimated EDF for the MinMin schedule, and the dotted red line corresponds to the real results of the MinMin algorithm. The blue solid and dotted lines show, respectively, the EDF estimated using the quantile method and the real data for the solution provided by the GAHEFT algorithm. The last two lines represent the estimated EDF for the solution obtained by the eGAHEFT algorithm and its real execution results.

Table 3 – Results of real experiments

| Algorithm | Simulated time, s | Profit, % | Real time, s | Profit, % |
| MinMin | 24126.0 | – | 24598.8 | – |
| GAHEFT | 15241.4 | 36.8 | 14391.9 | 41.5 |
| eGAHEFT | 13286.9 | 45.5 | 12776.2 | 48.1 |

As can be seen from Figure 12, the EDF estimation method provides a much better approximation to the real data in comparison to the quantile method, as was shown earlier in this section. Moreover, the results of the experiments with real workflow execution (Table 3) show that the eGAHEFT solution is better than the GAHEFT solution by 6.4% when these schedules are applied in the real cloud environment. Thus, by using a higher-quality estimation method for evolving solutions during the metaheuristic execution, the makespan profit in comparison to the default MinMin algorithm has grown from 41.5 up to 48.1 percent.

7. Conclusion

The main purpose of this research work was to experimentally determine how the efficiency of metaheuristic algorithms for workflow scheduling depends on the performance model chosen to estimate the obtained solutions. The advanced method for the workflow makespan estimation was described in Section 4.3. The experimental study in Section 6.3 showed that the proposed estimation method approximates the real workflow makespan better than the quantile method.

Further, the GAHEFT scheduling algorithm used in our experimental study was described in Section 5. This algorithm was compared with the default MinMin algorithm. Both simulated and real experiments were conducted for these algorithms with two methods for the makespan estimation (EDF estimation and the quantile method) in Section 6. The experimental study revealed that the proposed GAHEFT and eGAHEFT algorithms outperform the MinMin algorithm by 30-48% in all cases. All obtained schedules were evaluated by both estimation methods. According to the quantile estimation, eGAHEFT showed worse performance than GAHEFT. However, the EDF estimation method and the real experiments showed the opposite: the eGAHEFT solution turned out to be better than the GAHEFT solution. Thus, the efficiency of the metaheuristic algorithm increased by 6.4% just by using a better method for the makespan estimation, without changes in the metaheuristic algorithm itself.

To conclude, it should be said that the efficiency of metaheuristic algorithms is based not only on the evolving strategy, but also on the evaluation method for the evolving solutions. Using a high-quality estimation method allows the algorithm to properly assess the evolving solutions and thus find genuinely better solutions, while a poor-quality method might reduce the effectiveness of the algorithm or even lead to wrong solutions.

Acknowledgement. This work was partially supported by the Government of the Russian Federation, Grant 074-U01, project "Big data management for computationally intensive applications" (project #14613), and the Dutch national research program COMMIT.

References

[1] C. Weinhardt, A. Anandasivam, B. Blau, N. Borissov, T. Meinl, W. Michalk, and J. Stößer, "Cloud Computing - A Classification, Business Models, and Research Directions," Business & Information Systems Engineering, vol. 1, no. 5, pp. 391-399, Sep. 2009.
[2] B. Sharma, R. K. Thulasiram, P. Thulasiraman, S. K. Garg, and R. Buyya, "Pricing Cloud Compute Commodities: A Novel Financial Economic Model," in 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2012). IEEE, May 2012, pp. 451-457.
[3] A. Chirkin, "Execution Time Estimation in the Workflow Scheduling Problem," M.S. thesis, University of Amsterdam, the Netherlands, 2014. http://dare.uva.nl/en/scriptie/496303
[4] M. Wieczorek, A. Hoheisel, and R. Prodan, "Towards a general model of the multi-criteria workflow scheduling on the grid," Future Generation Computer Systems, vol. 25, no. 3, pp. 237-256, Mar. 2009.
[5] K. V. Knyazkov, S. V. Kovalchuk, T. N. Tchurov, S. V. Maryin, and A. V. Boukhanovsky, "CLAVIRE: e-Science infrastructure for data-driven computing," Journal of Computational Science, vol. 3, no. 6, pp. 504-510, 2012.
[6] S. V. Kovalchuk, P. A. Smirnov, S. V. Maryin, T. N. Tchurov, and V. A. Karbovskiy, "Deadline-driven Resource Management within Urgent Computing Cyberinfrastructure," Procedia Computer Science, vol. 18, pp. 2203-2212, 2013.
[7] L. Yuan, K. He, and P. Fan, "A framework for grid workflow scheduling with resource constraints," in 2011 International Conference on Consumer Electronics, Communications and Networks (CECNet). IEEE, Apr. 2011, pp. 962-965.
[8] T. S. Somasundaram and K. Govindarajan, "CLOUDRB: A framework for scheduling and managing High-Performance Computing (HPC) applications in science cloud," Future Generation Computer Systems, vol. 34, pp. 47-65, May 2014.
[9] D. I. G. Amalarethinam and F. K. M. Selvi, "A Minimum Makespan Grid Workflow Scheduling algorithm," in 2012 International Conference on Computer Communication and Informatics. IEEE, Jan. 2012, pp. 1-6.
[10] D. Kyriazis, K. Tserpes, A. Menychtas, A. Litke, and T. Varvarigou, "An innovative workflow mapping mechanism for Grids in the frame of Quality of Service," Future Generation Computer Systems, vol. 24, no. 6, pp. 498-511, Jun. 2008.
[11] G. Falzon and M. Li, "Evaluating Heuristics for Grid Workflow Scheduling," in 2009 Fifth International Conference on Natural Computation, vol. 4. IEEE, 2009, pp. 227-231.
[12] E. Juhnke, T. Dornemann, D. Bock, and B. Freisleben, "Multi-objective Scheduling of BPEL Workflows in Geographically Distributed Clouds," in 2011 IEEE 4th International Conference on Cloud Computing. IEEE, Jul. 2011, pp. 412-419.
[13] A. Afzal, J. Darlington, and A. McGough, "Stochastic Workflow Scheduling with QoS Guarantees in Grid Computing Environments," in 2006 Fifth International Conference on Grid and Cooperative Computing (GCC'06). IEEE, 2006, pp. 185-194.
[14] N. Trebon and I. Foster, "Enabling urgent computing within the existing distributed computing infrastructure," Ph.D. dissertation, University of Chicago, Jan. 2011.
[15] S. Ichikawa and S. Takagi, "Estimating the Optimal Configuration of a Multi-Core Cluster: A Preliminary Study," in 2009 International Conference on Complex, Intelligent and Software Intensive Systems. IEEE, Mar. 2009, pp. 1245-1251.
[16] S. Ichikawa, S. Takahashi, and Y. Kawai, "Optimizing process allocation of parallel programs for heterogeneous clusters," Concurrency and Computation: Practice and Experience, vol. 21, no. 4, pp. 475-507, Mar. 2009.
[17] Y. Kishimoto and S. Ichikawa, "Optimizing the configuration of a heterogeneous cluster with multiprocessing and execution-time estimation," Parallel Computing, vol. 31, no. 7, pp. 691-710, 2005.
[18] L. F. Bittencourt and E. R. M. Madeira, "HCOC: a cost optimization algorithm for workflow scheduling in hybrid clouds," Journal of Internet Services and Applications, vol. 2, no. 3, pp. 207-227, Aug. 2011.
[19] W.-N. Chen and J. Zhang, "An Ant Colony Optimization Approach to a Grid Workflow Scheduling Problem With Various QoS Requirements," IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 39, no. 1, pp. 29-43, Jan. 2009.
[20] V. Korkhov, "Hierarchical Resource Management in Grid Computing," Ph.D. dissertation, Universiteit van Amsterdam, 2009.
[21] J. O. Gutierrez-Garcia and K. M. Sim, "A Family of Heuristics for Agent-Based Cloud Bag-of-Tasks Scheduling," in 2011 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery. IEEE, Oct. 2011, pp. 416-423.
[22] M. Malawski, K. Figiela, M. Bubak, E. Deelman, and J. Nabrzyski, "Cost optimization of execution of multi-level deadline-constrained scientific workflows on clouds," in Parallel Processing and Applied Mathematics, ser. Lecture Notes in Computer Science, R. Wyrzykowski, J. Dongarra, K. Karczewski, and J. Waniewski, Eds. Springer Berlin Heidelberg, 2014, pp. 251-260. http://dx.doi.org/10.1007/978-3-642-55224-3_24
[23] L. Breiman, "Random forests," Tech. Rep. 567, Department of Statistics, UC Berkeley, 1999.
[24] "Pegasus workflow management system," http://pegasus.isi.edu/, accessed: 2014-05-19.
[25] V. V. Krzhizhanovskaya, N. Melnikova, A. Chirkin, S. V. Ivanov, A. Boukhanovsky, and P. M. Sloot, "Distributed simulation of city inundation by coupled surface and subsurface porous flow for urban flood decision support system," Procedia Computer Science, vol. 18, pp. 1046-1056, 2013.
[26] D. Nasonov, N. Butakov, M. Balakhontseva, K. Knyazkov, and A. V. Boukhanovsky, "Hybrid Evolutionary Workflow Scheduling Algorithm for Dynamic Heterogeneous Distributed Computational Environment," in International Joint Conference SOCO'14-CISIS'14-ICEUTE'14. Springer International Publishing, 2014, pp. 83-92.
[27] H. Topcuoglu, S. Hariri, and M. Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260-274, 2002.
[28] F. Zhang et al., "Multi-objective scheduling of many tasks in cloud platforms," Future Generation Computer Systems, vol. 37, pp. 309-320, 2014.
[29] M. G. C. Resende and C. C. Ribeiro, "Greedy randomized adaptive search procedures: Advances, hybridizations, and applications," in Handbook of Metaheuristics. Springer US, 2010, pp. 283-319.
[30] M. Rahman, S. Venugopal, and R. Buyya, "A dynamic critical path algorithm for scheduling scientific workflow applications on global grids," in IEEE International Conference on e-Science and Grid Computing. IEEE, 2007.
[31] M. Rahman et al., "Adaptive workflow scheduling for dynamic grid and cloud computing environment," Concurrency and Computation: Practice and Experience, vol. 25, no. 13, pp. 1816-1842, 2013.
[32] F. Pukelsheim, "The three sigma rule," The American Statistician, vol. 48, no. 2, pp. 88-91, 1994.
[33] A. V. Kalyuzhnaya et al., "Synthetic storms reconstruction for coastal floods risks assessment," Journal of Computational Science, vol. 9, pp. 112-117, 2015.
