CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 2012; 24:1478–1496 Published online 20 December 2011 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/cpe.1899

SPECIAL ISSUE PAPER

P2PScheMe: a P2P scheduling mechanism for workflows in grid computing

João Marcelo U. de Alencar 1,5,*,†, Rossana M. C. Andrade 1,4, Windson Viana 1,3 and Bruno Schulze 2

1 GREat - Group of Computer Networks, Software Engineering and Systems, Federal University of Ceará (GREat-UFC), Fortaleza, CE, Brazil
2 LNCC - National Laboratory for Scientific Computing, Petrópolis, RJ, Brazil
3 UFC Virtual Institute, Fortaleza, CE, Brazil
4 Department of Computer Science, UFC, Fortaleza, CE, Brazil
5 Quixadá Campus Academic Department, UFC, Quixadá, CE, Brazil

SUMMARY

Complex scientific experiments have a growing demand for computational resources, which are expensive to acquire and maintain. Grid computing has emerged as the mainstream technology to address this issue. Grids are also adequate for the execution of scientific workflows because they allow the use of heterogeneous and distributed resources. In spite of the progress in grid technology, challenges remain in workflow scheduling. For instance, centralized scheduling solutions may lead to performance degradation and scalability problems. Some scheduling approaches are partially distributed but keep a few centralized components that may become bottlenecks. Other distributed solutions lack flexibility in the definition of workflows, permitting only tasks, not high-level services, as workflow steps. In this work, we present P2PScheMe, a scheduling mechanism for peer-to-peer execution of workflows based on the invocation of grid services. The proposal considers information regarding the grid execution environment in order to allow workflow scheduling adaptation. This adaptation is performed according to user requirements of quality of service (QoS). In this paper, we describe how P2PScheMe works and provide a comparative analysis with existing solutions. Copyright © 2011 John Wiley & Sons, Ltd.

Received 16 March 2011; Revised 5 October 2011; Accepted 5 October 2011

KEY WORDS: grid computing; peer-to-peer; workflows

1. INTRODUCTION

Scientists collaborate with colleagues who are geographically dispersed, which requires the sharing of data, applications and laboratory equipment. Grid technologies have appeared as the common solution for providing distributed access to these resources [1]. In fact, computational grids are infrastructures that easily integrate distributed and heterogeneous resources [2]. Many distributed scientific experiments can be structured as workflows in which the nodes are individual tasks or services (e.g. access a protein database, execute a 3-D modelling method), and the edges between them are data flows or execution dependencies [1]. Scientific workflows have thus emerged as a paradigm to describe, manage and share complex scientific analyses. They represent the steps that need to be performed in a complex scientific application as well as data

*Correspondence to: João Marcelo U. de Alencar, GREat - Group of Computer Networks, Software Engineering and Systems, Federal University of Ceará, Av. Mister Hull, s/n - Campus do Pici - Block Number 942A - CEP: 60455760 - Fortaleza - CE - Brazil.
† E-mail: [email protected]


dependencies between these steps in a declarative way. Workflows also facilitate the sharing of information on scientific processes because their development can be carried out collaboratively by several researchers. Recent research has shown that workflow steps can be mapped to computational services [3] following the service-oriented architecture (SOA) concept. Current grid technologies provide SOA capabilities that allow the representation of scientific workflows as a service orchestration. This representation provides the user with a general overview of his/her application, comprising control flow and data exchange between the services [4]. Despite the advantages of using workflows in grids, challenges persist and must be overcome to achieve the full potential of system resource abstraction [5, 6]. For example, a relevant issue to be addressed is the automatic mapping of the workflow model created by the user into a runnable instance, because it is necessary to define, for each workflow step, which concrete service (i.e. a real application) is to be invoked. In some worldwide scientific grids, numerous versions of the same service may be available, so a workflow can be performed in several ways. A grid middleware should then provide a scheduling mechanism in order to choose an optimal combination of the services. In this case, the efficient scheduling of the workflow is crucial for system maintenance and user satisfaction, as it has a direct impact on the overall system performance. The scheduling mechanism should also take into account the user-defined QoS metrics, given that in some situations the user may find it helpful to loosen a requirement in favour of another. For example, slower execution is acceptable if there is no budget available for paid, more powerful resources.
In a centralized workflow scheduling mechanism, a single process gathers information about the grid resources and decides which machine will be responsible for each workflow step. This approach is straightforward and efficient, although it lacks scalability. In order to provide robust implementations of workflow schedulers, researchers have proposed decentralized architectures [7]. A challenge that remains is to provide a decentralized scheduler with the same efficiency and functionality found in centralized solutions. On the other hand, peer-to-peer (P2P) networks are distributed infrastructures characterized by the decentralization of responsibilities, in which each host performs both the server and the client functions [8]. The P2P execution of workflows eliminates the need for a central scheduler, distributes decision making and avoids bottleneck issues. In a P2P network, every host requests services from or provides services to other peers. There is also the possibility of implementing mechanisms for load distribution among the hosts on the network. In this paper, we propose P2PScheMe, a solution for the peer-to-peer execution of workflows based on the invocation of grid services. The proposal allows the definition of priorities between QoS metrics and considers both information regarding the context of the grid execution environment and the QoS requirements provided by the user. We aim to provide adaptation of the workflow scheduling: because the scheduler has access to the latest available information concerning grid resources, it can optimize the workflow execution in order to meet the QoS requirements. Two experiments in a testbed are performed in order to evaluate our approach. First, the system is compared with an existing decentralized solution that does not consider QoS metrics. Second, the system is compared with the state of the art in centralized scheduling.
The objective of this second experiment is to evaluate the negative impact of decentralization on the execution of workflows, given that a centralized scheduler has a global view of the grid resources. The remainder of this paper is organized as follows: (i) Section 2 discusses existing solutions for the P2P execution of workflows; (ii) Section 3 presents the proposed mechanism in detail; (iii) Section 4 discusses a comparative analysis with existing solutions; and (iv) Section 5 summarizes conclusions and suggestions for future work.

2. RELATED WORKS

The challenge of decentralized scheduling of workflows is to decrease the likelihood of bottlenecks without sacrificing execution efficiency. Existing approaches use hybrid or P2P architectures to tackle the occurrence of bottlenecks. Table I summarizes the principal information concerning the main related works.


Table I. Comparison among related works.

Environment    Architecture       User QoS   Number of   Resource      Application
               decentralization              metrics     reservation   paradigm
SwinDeW        Peer-to-peer       No         N/A         No            Jobs and services
Global grids   Peer-to-peer       Yes        Static      Yes           Jobs
Triana         Hybrid             No         N/A         No            Jobs and services
ASKALON        Hybrid             Yes        Static      Yes           Jobs and services

SwinDeW [7] is a workflow management system for grids in which the machines that compose the infrastructure are organized in a P2P network. In SwinDeW, peers are grouped according to their functionalities (services), and the execution of a workflow instance begins by allocating tasks to peers. The result of this allocation process is a network of interconnected peers able to run workflows. The first step in creating such a network is the detection of an external event by a peer. An event can be provided manually by the user or by another system. A detected event triggers the process of network topology establishment. The peer that receives such an event is called the current instantiation peer. This peer searches for other peers capable of performing the next tasks of the workflow. With the search result, the peer stores, in its repository, references to the peers that provide the next tasks. If, in turn, the next tasks also have successor tasks, the selected peers begin to act as instantiation peers, and the process begins again until the final tasks (i.e. those without successors) are reached. The main advantage of SwinDeW is its focus on a decentralized architecture, which effectively enables load balancing. On the downside, we note that SwinDeW does not consider user QoS requirements during the workflow execution. Buyya presents the Global Grids solution, which uses a distributed index based on a distributed hash table for the exchange of messages and coordination among the peers of a computational grid [9]. The index serves as a distributed shared memory space in which each peer may report requests coming from its users or publish reports about available resources. This memory represents a vector space of several dimensions. Each dimension represents an aspect of the requested resource, such as a specific processor type, the minimal free memory required or the desired bandwidth.
The information space is partitioned, and each peer is responsible for maintaining a partition, which is given to applications for publishing reports or requests for resources. When a report of resources published by a peer matches a request issued by another peer, the first peer becomes responsible for fulfilling the request. For the execution of a workflow, the user submits the instance with tasks annotated with the resources needed for the execution. A local agent in each peer publishes the requirements of each task and waits for other peers to answer the requests. At that point, the task is dispatched to the chosen resource. The distributed hash table provides a strong formal basis for message exchange, which is an advantage. However, a user can only describe requirements by defining characteristics in terms of resource configuration parameters. In this process, Global Grids does not take into account the functionality of the services. Triana is a workflow development environment that allows steps of a workflow to be submitted for execution in several distributed environments, including P2P networks [10]. The visual programming interface is represented by Triana units. Applications are developed by connecting units from a palette in a work area. In addition to a vast library of units, Triana also allows the creation of custom units through the implementation of a programming interface. Control structures, such as loops, are provided to allow the execution of control flows. These units of abstraction permit developers to create complex execution flows by using a small set of simple units. Triana can make use of distributed systems such as grids and P2P networks for executing the steps of the workflow. However, execution control is provided by a central entity. Although Triana has a hybrid architecture, it is still vulnerable to bottlenecks.
The ASKALON project [11] has an execution environment based on events that uses agents distributed among grid resources to gather updated information about the state of the resources. This information is used in the scheduling process. In ASKALON, the initial decision concerning scheduling is produced by a restricted set of services, which are implemented following a centralized


architecture. However, the services and the events that trigger the event-handling system are distributed across the grid in various hosts, and each grid domain has a replica of those services. The main goal of this hybrid architecture is not the reduction of system bottlenecks. Given the geographical distribution of the information resources, a good option is to use an event-driven system for exchanging information. The advantage is the possibility of always using the most recent state of the grid, providing an adaptive system. However, the centralized components of ASKALON remain susceptible to system bottlenecks.

Table I summarizes the characteristics of the aforementioned research works. With regard to architecture decentralization, the SwinDeW and Global Grids solutions have a larger degree of distribution because they completely adopt the P2P paradigm. In contrast, Triana and ASKALON have a centralized decision point at some instant of their scheduling mechanism (hybrid approach). User QoS requirements are only considered by the Global Grids and ASKALON solutions. However, these solutions take into account a restricted set of QoS requirements, such as cost and execution time. This set is static, that is, it cannot be extended with new metrics. Global Grids and ASKALON also need to carry out a resource reservation process in order to function properly. Except for Global Grids, all the aforementioned works allow service orchestration as an option for application definition. One of our goals is to provide a decentralized scheduling mechanism for the execution of service workflows. SwinDeW, or its newer version SwinDeW-G, offers the initial framework for this kind of scheduling approach because it follows a decentralized architecture and is based on service orchestration. Our approach extends SwinDeW in order to consider QoS requirements.

3. PROPOSED MECHANISM

3.1. Adopted grid model
Before beginning to discuss our scheduling mechanism, the adopted grid model must be outlined. The main assumption is that a service can be provided by several hosts in a worldwide grid, such as grids for complex scientific applications. Our grid architecture is based on the concept of grid federation [9], in which a host or a set of hosts acts as an access point to the resources of an administrative domain. The adopted grid model uses service invocation instead of job submission. An access point hosts several services. Each service has a specific functionality that can be used by grid users in their workflow designs. In the context of a P2P network deployed on a grid, a domain access point also takes on the role of a peer.

The P2P network is an unstructured network [12]. Every peer has a list that contains information about hosted services. For each hosted service in the list, there is a vector with references to other peers that also host the same service. Therefore, peers can exchange messages among themselves through flooding techniques such as those described in [12]. The reason for adopting an unstructured network is that we consider it important that the user be capable of searching the content of service descriptions. Structured networks, although more efficient, only allow searches by key, not by content, and were therefore not adopted.

With regard to QoS grid metrics, there are basically two approaches for taking them into account in workflow executions [13]. The first option is to consider the QoS constraints for each service of the workflow. The second approach is to associate the restrictions with the whole workflow. The second approach requires a process to monitor the execution, ensuring that the invocation of each service satisfies the global constraints of the workflow instance; hence, it requires a centralized architecture.
As one of the goals of this paper is to provide a solution for decentralized execution, the first option is the QoS approach adopted [13].

3.2. Mechanism overview

Our scheduling mechanism provides a decentralized technique for scheduling workflows that takes QoS parameters into account. An overview of the proposed mechanism is presented in Figure 1.


Figure 1. Mechanism overview.

Stages 1–3 constitute the workflow design phase. The grid user searches for candidate services to use in creating their workflows. Peers hosting services that can meet the functionality required by the user return their references. With these references, the user can elaborate the final workflow for submission.

The last phase is workflow execution, which is composed of stages 4 and 5. The workflow created in the previous phase is submitted to a grid peer in order to be executed. This peer becomes the instantiation peer. The first activity performed by the instantiation peer is to search for peers that host the service required by the first invocation of the workflow. Within the set of peers found, a negotiation process is conducted to decide which peer will become responsible for the invocation. The negotiation process considers the informed QoS priorities to decide which peer is the most fitted. Once a peer has been chosen, the workflow instance is transferred to it. The invocation is then executed, and the negotiation process is initiated once again to choose the peer for the next workflow step. These steps are repeated until the workflow is concluded. The interleaving of negotiation and execution is the main difference between our mechanism and SwinDeW, in which all the peers that will perform workflow steps are chosen at the beginning of the execution. Our mechanism makes the information used in the negotiation more accurate because it evaluates the grid state at each step.

3.3. Negotiation

In the negotiation process, the peer responsible for an invocation is chosen among a group of peers that host the same service. Figure 2 shows an overview of the peer selection process; the negotiation proceeds from left to right. The negotiation is initially conducted by the first instantiation peer or by a peer that has finished the execution of a workflow step.
The process uses information about the status of a group of peer candidates and then applies the QoS-based selection to choose the next peer. The initial phase of the selection is to discover which peers have the appropriate services, thus forming the initial group of candidates (Phase I in Figure 2). The second and the third phases apply a similar selection process. In the second phase, the QoS metrics used are those related to the administrative requirements of the grid (e.g. load balancing).
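The interleaved cycle of stages 4 and 5 can be sketched in a few lines. The snippet below is only an illustration of the control flow; all class and function names are ours, not the published P2PScheMe API, and the negotiation phases are collapsed into a single call.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Peer:
    name: str
    services: set = field(default_factory=set)

    def hosts(self, service):
        return service in self.services

    def invoke(self, service):
        # Stand-in for a real grid service invocation.
        return f"{service}@{self.name}"

def find_hosting_peers(peers, service):
    # Phase I: discover the candidates (a flooding search in the real system).
    return [p for p in peers if p.hosts(service)]

def negotiate(candidates, rng):
    # Phases II-IV collapsed into one call: the full mechanism first narrows
    # the candidate set by QoS scoring and only then picks randomly.
    return rng.choice(candidates)

def execute_workflow(peers, services, rng):
    trace = []
    for service in services:          # one negotiation per workflow step
        chosen = negotiate(find_hosting_peers(peers, service), rng)
        trace.append(chosen.invoke(service))
        # `chosen` now acts as the next instantiation peer, so the grid
        # state is re-evaluated before every step (unlike SwinDeW).
    return trace

peers = [Peer("p1", {"SPStarter", "SPWorker"}),
         Peer("p2", {"SPWorker", "SPReport"}),
         Peer("p3", {"SPWorker"})]
print(execute_workflow(peers, ["SPStarter", "SPWorker", "SPReport"],
                       random.Random(0)))
```

The essential point captured here is that peer selection happens again at every step, rather than once at submission time.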

Figure 2. Phases of negotiation.


The third phase uses the metrics relevant to the end user (e.g. the execution cost). For each metric, the process computes a numeric value that quantifies the degree of fulfilment of the metric by the peer. For metrics such as the cost of computation or the execution time, the calculation is immediate. However, there are cases where the calculation is not straightforward, such as the security metric [14]. Whatever the chosen function, it needs to be adopted by all institutions with peers in the grid. Phases II and III apply a selection process that takes as input (from the user or the administrator) a tuple in the form of Equation 1, where q is the number of metrics evaluated. Each value p_i carries the priority of metric i with regard to the other metrics. For example, considering only cost and execution time, a vector of the form [0.75, 0.25] means that the priority of the cost metric is three times higher than the priority of execution time, that is, the grid user prefers a slower execution to one that leads to a higher financial cost. The value M is a threshold that indicates the percentage of peers that proceed to the next selection phase. In the next paragraphs, we explain the role of this threshold.

\langle [p_1, p_2, p_3, \ldots, p_q], M \rangle, \quad \text{where} \quad \sum_{i=1}^{q} p_i = 1 \ \text{and} \ p_i \geq 0 \qquad (1)

[p'_1, p'_2, p'_3] = [0.50, 0.25, 0.25], \quad \text{weights of the three QoS metrics} \qquad (8)
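The constraints attached to Equation 1 can be checked mechanically. A minimal sketch follows; the function name is ours, not part of P2PScheMe.

```python
def validate_priority_tuple(priorities, m_threshold, tol=1e-9):
    """Check the constraints of Equation 1: every priority p_i must be
    non-negative, the priorities must sum to 1, and the threshold M
    (fraction of peers kept for the next phase) must lie in (0, 1]."""
    if any(p < 0 for p in priorities):
        raise ValueError("each priority p_i must be non-negative")
    if abs(sum(priorities) - 1.0) > tol:
        raise ValueError("the priorities p_i must sum to 1")
    if not 0.0 < m_threshold <= 1.0:
        raise ValueError("the threshold M must lie in (0, 1]")
    return True

# The cost/time example from the text: cost has three times the
# priority of execution time, and M keeps 40% of the candidates.
validate_priority_tuple([0.75, 0.25], 0.4)
```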

Equations 9 and 10 show the calculation of k'_j for the third phase. We adopted a matrix multiplication to facilitate the computation. One matrix contains a single line with the weights, and the other matrix holds the values computed and normalized for each metric. Each column of the second matrix represents a candidate peer.

\begin{pmatrix} p'_1 & p'_2 & p'_3 \end{pmatrix}
\cdot
\begin{pmatrix}
m_1^1 & m_1^3 & m_1^4 & m_1^6 & m_1^{10} \\
m_2^1 & m_2^3 & m_2^4 & m_2^6 & m_2^{10} \\
m_3^1 & m_3^3 & m_3^4 & m_3^6 & m_3^{10}
\end{pmatrix}
=
\begin{pmatrix} k'_1 & k'_3 & k'_4 & k'_6 & k'_{10} \end{pmatrix}
\qquad (9)

Table V. Peer metric values computed during the third phase of negotiation.

Peer    m_1^j    m_2^j    m_3^j
1       0.00     0.90     0.00
3       0.67     0.08     0.60
4       0.23     0.49     0.80
6       0.89     0.03     0.80
10      0.00     1.00     0.00


\begin{pmatrix} 0.50 & 0.25 & 0.25 \end{pmatrix}
\cdot
\begin{pmatrix}
0.00 & 0.67 & 0.23 & 0.89 & 0.00 \\
0.90 & 0.08 & 0.49 & 0.03 & 1.00 \\
1.00 & 0.40 & 0.20 & 0.20 & 1.00
\end{pmatrix}
=
\begin{pmatrix} 0.475 & 0.455 & 0.287 & 0.502 & 0.500 \end{pmatrix}
\qquad (10)

The final application of the equations leads to Table VI. The number of candidates approved for the next phase is defined by M_user × n; with M_user = 0.4 and n = 5, two peers are approved. They are the peers with the lowest values of the index k'_j in the table, namely peers 3 and 4. During the fourth phase, a random choice is made between these two peers to select the peer responsible for the invocation (the workflow step).

A discussion is needed to justify the use of random selection as the last phase. The possible choice of peer 3, which has a higher cost, instead of peer 4 is a consequence of the ratio between the informed weights. In the vector [0.5, 0.25, 0.25] of Equation 8, the priority of cost is twice the priorities of execution time and availability. However, the execution cost of peer 3 is almost twice that of peer 4, whereas the execution time of peer 4 is five times greater than that of peer 3 (according to Table II). Hence, the difference between the weights (0.5 and 0.25) is not enough to justify the direct choice of peer 4, as the proportion between the execution times is greater than the proportion between the execution costs.

Suppose that, instead of the vector [0.5, 0.25, 0.25], the user informs the vector [0.8, 0.1, 0.1]. The cost priority is now eight times bigger than the priority of execution time. The calculation of k'_j for the third phase with the new vector is shown in Equation 11, and the new values of k'_j are listed in Table VII.

\begin{pmatrix} 0.80 & 0.10 & 0.10 \end{pmatrix}
\cdot
\begin{pmatrix}
0.00 & 0.67 & 0.23 & 0.89 & 0.00 \\
0.90 & 0.08 & 0.49 & 0.03 & 1.00 \\
1.00 & 0.40 & 0.20 & 0.20 & 1.00
\end{pmatrix}
=
\begin{pmatrix} 0.190 & 0.584 & 0.253 & 0.735 & 0.200 \end{pmatrix}
\qquad (11)

As we increase the cost priority, the remaining candidates are peers 1 and 10 instead of peers 3 and 4. Both peers 1 and 10 have the smallest costs, but their values for the execution time are among the worst. Thus, the user can influence the execution by defining the weight vector.

Table VI. Values of k'_j defined in the third phase for the priority vector [0.5, 0.25, 0.25].

k'_j     Value
k'_1     0.475
k'_3     0.455
k'_4     0.287
k'_6     0.502
k'_10    0.500

Table VII. Values of k'_j defined in the third phase for the priority vector [0.8, 0.1, 0.1].

k'_j     Value
k'_1     0.190
k'_3     0.584
k'_4     0.253
k'_6     0.735
k'_10    0.200
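The third-phase computation of Equations 9–11 is an ordinary weighted sum, so the values behind Tables VI and VII can be reproduced in a few lines of Python. The sketch below (all names ours) also applies the threshold M_user and the final random choice of Phase IV:

```python
import random

# Normalized metric values: one row per QoS metric, one column per
# candidate peer (1, 3, 4, 6 and 10), as in Equations 10 and 11.
METRICS = [[0.00, 0.67, 0.23, 0.89, 0.00],
           [0.90, 0.08, 0.49, 0.03, 1.00],
           [1.00, 0.40, 0.20, 0.20, 1.00]]
PEERS = [1, 3, 4, 6, 10]

def third_phase(weights, m_user, rng):
    # Equation 9: k'_j is the weighted sum of the metric values of peer j.
    k = [sum(w * row[j] for w, row in zip(weights, METRICS))
         for j in range(len(PEERS))]
    n_keep = int(m_user * len(PEERS))           # e.g. 0.4 * 5 = 2 peers
    best = sorted(range(len(PEERS)), key=k.__getitem__)[:n_keep]
    survivors = [PEERS[j] for j in best]        # lowest k'_j proceed
    return k, survivors, rng.choice(survivors)  # Phase IV: random pick

k, survivors, chosen = third_phase([0.50, 0.25, 0.25], 0.4, random.Random(0))
print(sorted(survivors))   # peers 3 and 4, as in Table VI

k2, survivors2, _ = third_phase([0.80, 0.10, 0.10], 0.4, random.Random(0))
print(sorted(survivors2))  # peers 1 and 10, as in Table VII
```

Changing only the weight vector swaps the surviving candidates, which is exactly the effect discussed in the text.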


The higher the proportion between the weights, the greater the significance given to the metric with the highest priority. The proposed strategy does not guarantee the choice of the best possible solution, because this problem belongs to the NP-complete class [15]. The mechanism reduces the number of possible candidate solutions according to user and grid requirements. In this respect, the last phase of the proposal (Phase IV) has similarities with optimization heuristics: although it makes a random choice, the sample space has already been narrowed to the needs of the problem, so the negotiation process leads to a better compliance with the administrative and user QoS requirements of the grid.

4. COMPARATIVE ANALYSIS

In order to allow a comparative analysis, we developed a P2P workflow execution environment. The environment was implemented using the Globus Toolkit and the Web Services Resource Framework (WSRF) [16]. This environment allowed the execution of our scheduling approach and of other scheduling proposals. The main idea of our experiment was to submit applications and a workload on a testbed. In each round of execution, only one algorithm was performed and analysed. To make the tests equitable, we defined a workload as a set of predefined workflows; the same workload was submitted for execution every time an algorithm was evaluated in the testbed. The workload definition is presented in Section 4.2. First, we compared our work with a P2P approach. We chose SwinDeW because it provides a P2P architecture with support for scheduling service workflows. We did not use the existing implementation of SwinDeW; instead, we implemented the scheduling strategy of SwinDeW in our environment. However, SwinDeW only considers load balancing as a metric. We then compared the proposed mechanism with SwinDeW in terms of load distribution. During our comparison, we also took into account other QoS requirements.
The goal was to provide evidence that, in addition to supplying better load balancing, P2PScheMe would also guide the execution to meet the informed requirements. It is important to note that the implementation of SwinDeW developed by its creators was not used in the comparison: a new environment for the execution of workflows was developed on the Globus Toolkit, and both the proposed mechanism and the techniques used by SwinDeW were completely implemented in this new environment.

Moreover, the proposal was compared with a centralized scheduler based on the execution of genetic algorithms. We chose genetic algorithms because they are a search heuristic that has shown good results in the scheduling of workflows in computational grids [17]. An important feature of this heuristic is multicriteria optimization, which is necessary for solving problems with several metrics. The load distribution was again analysed. In addition, we evaluated the behaviour of the centralized and decentralized techniques when they try to meet the QoS requirements. Details concerning the execution environment, the workload generation and the computation of the metric values, as well as the experiment results, are described in the next subsections.

4.1. Testbed description

In order to perform the analysis, we deployed our environment on a testbed grid. A set of servers was configured to represent the resources available in the grid. Each server was also a peer of the grid and represented a separate administrative domain. The configuration of the machines is described in Table VIII. A total of ten machines was configured with the Globus Toolkit version 4.2.1 installed. The environment was implemented using the Globus platform and the Java programming language version 1.6.0.20 provided by Sun. The operating system was CentOS 5.4 with kernel version 2.6.32.23. The interconnection network used fast Ethernet technology, and all machines were in the same local network.
Although composed of individual machines, the testbed was dimensioned according to the grid-federation paradigm [18]. In this model, the grid is formed by several organizations, each represented by a grid federation agent that is responsible for controlling access to the resources of


Table VIII. Resources available at the grid testbed.

Processor (GHz)            Cache (MB)   Cores   Memory (GB)   Quantity
Pentium(R) 4 3.00          2            1       1.5           1
Opteron(R) 248 2.20        1            2       2.0           4
Xeon(R) X5355 2.66         4            8       16.0          3
Core(R)2 Duo E4600 2.40    2            2       2.0           2

the organization, scheduling tasks and enforcing security policies. Thus, each machine used in the experiment represents a grid federation agent, resulting in a grid formed by ten organizations. This size is not far from the reality of many projects that support grid computing [19]. The use of only one machine to represent a domain is not inappropriate when the metrics considered are the execution cost, the execution time and the load distribution. The execution cost is determined by a number of credits, and each organization is free to decide the value charged; this is a political decision regardless of the number of machines. The execution time can be influenced by the quantity of machines, but the effect of this difference can be achieved through the use of machines with different configurations, as was done in the experiments. The load of a larger group of machines can be represented in the same way (as a real number) by using a tool such as Ganglia to collect information about the grid.

4.2. Workload

We also developed a workflow generator. It was responsible for submitting several workflow instances to the grid, following a submission time interval. The goal was to reproduce the workload of a grid in full operation. The workload had a set of ten workflows that were periodically submitted to the grid. This workload was created with two applications, both of which adopted the fork-join model of workflows [18]. This model consists of levels of low computational complexity, responsible for setting parameters and user interaction, interspersed with levels of high computational intensity. The first application was a Monte Carlo simulation that calculates an approximation of π. This application consisted of three separate services: (i) MonteCarloPiStarter; (ii) MonteCarloPiWorker; and (iii) MonteCarloPiSink.
The second application was an adaptation of a fluid mechanics application taken from a benchmark for high-performance computing provided by NASA (National Aeronautics and Space Administration) [20]. We adapted this application by transforming threads from the benchmark into services. Three services were defined: (i) SPStarter; (ii) SPWorker; and (iii) SPReport. A workload was set up with these applications and submitted to the testbed environment. The submitted workflows were classified under the categories listed in Table IX. A low-load workflow is an instance in which all composing invocations complete their tasks in 5 min on average (with a variation range of 3 min); the same reasoning applies to the other classes. The average time intervals were defined according to initial tests made with the applications chosen for the analysis. The size of each workflow varies in the range [3, Nworker + 2], where Nworker is the number of peers hosting the services corresponding to the intermediate invocations of the application's fork-join workflow. The total size of the workflow is decided randomly.
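The fork-join shape and the size range [3, Nworker + 2] can be illustrated with a small builder. This is only a sketch of the model; the function name is ours, not part of the published generator.

```python
import random

def build_fork_join(starter, worker, sink, n_worker_peers, rng):
    """Build one fork-join workflow instance: a starter invocation, a
    random number of parallel worker invocations and a sink invocation,
    so that the total size falls in [3, n_worker_peers + 2]."""
    n_workers = rng.randint(1, n_worker_peers)
    return [[starter], [worker] * n_workers, [sink]]

# A Monte Carlo instance with at most 8 worker peers available.
wf = build_fork_join("MonteCarloPiStarter", "MonteCarloPiWorker",
                     "MonteCarloPiSink", n_worker_peers=8,
                     rng=random.Random(42))
size = sum(len(level) for level in wf)
assert 3 <= size <= 8 + 2
```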

Table IX. Workflow classification according to the workload.

Class         Execution average interval (min)   Variation (min)   Quantity
Low load      5                                  ±3                5
Medium load   15                                 ±5                2
High load     30                                 ±7                3
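The generation of workload instances described above can be sketched as follows. The class parameters come from Table IX; the function and dictionary names are illustrative, not part of the actual generator:

```python
import random

# (average interval in minutes, variation in minutes, instances), from Table IX
WORKLOAD_CLASSES = {
    "low":    (5,  3, 5),
    "medium": (15, 5, 2),
    "high":   (30, 7, 3),
}

def sample_workflow(cls, n_worker):
    """Draw one fork-join workflow instance: a random size in [3, n_worker + 2]
    and an average invocation duration within the class variation range."""
    avg, var, _ = WORKLOAD_CLASSES[cls]
    size = random.randint(3, n_worker + 2)           # starter + workers + sink
    duration = random.uniform(avg - var, avg + var)  # minutes per invocation
    return size, duration

size, duration = sample_workflow("high", 8)
```

Each submission draws a fresh size and duration, which is what makes the load vary even though the number of running workflows stays constant.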

Copyright © 2011 John Wiley & Sons, Ltd.

Concurrency Computat.: Pract. Exper. 2012; 24:1478–1496 DOI: 10.1002/cpe

A P2P SCHEDULING MECHANISM FOR WORKFLOWS IN GRID COMPUTING


We performed a first adjustment round in the grid environment described in Section 4.1 in order to define the total set of workflows. The scheduling algorithm was configured to select the peers randomly. In this adjustment round of the simulation, ten workflows were submitted. The workflow type quantity followed the distribution described in the last column of Table IX. When a running workflow finished, a new instance from the same workflow class was loaded and submitted to the grid. Thus, at any time, the number of running workflows in the grid remained the same. However, the load changed because the submitted workflows were not identical. The system load ranged from a situation in which only workflows of size three were executing to another in which every workflow had the size Nworker + 2 corresponding to the application it represented. This variation was important because it depicted a realistic situation of a grid. It also ensured that the environment had a minimal load at all times. The simulation was observed for a period of 5 h. Three queues were defined, one for each class of workflow. When a workflow reached the end of its execution, its details (e.g. load and execution time) were stored in the queue of its class. The history of insertions in these queues defines the total set of workflows that make up the workload. The reason to divide the workload into workflows of different loads was to replicate the evolutionary process of development and use of workflows. The process of scientific discovery has an exploratory nature, and naturally, several attempts are necessary to calibrate experiments in order to achieve the best possible results. Recent studies point out that, in practice, even when already knowing the structure of their experiments, researchers often submit workflow instances of short duration when they would like to explore new alternatives for the initial parameters [21].
It is only after positive initial results that instances of long duration are submitted. For this reason, the proportion shown in Table IX was adopted.

4.3. Computation of the metric values

In both evaluations, comparisons were performed taking load distribution as the administrative metric. The cost and execution time are the user metrics. These requirements are used as metrics by most similar works in the grid domain [15, 22]. Other metrics, such as security, could also be considered [23, 24]; however, it is very hard to calculate such a metric in real environments. For the analysis of load distribution in the grid, the workload was submitted to each scheduling mechanism studied. During the execution of the workload, information about the state of each peer was retrieved by invoking a monitoring service hosted on all peers. Among the information received was the load of each peer. The load of a peer is defined as the average size of the queue of processes ready for execution in the operating system, divided by the number of processor cores. A comparative study in [25] states that the queue length is a better metric for load balancing than the utilization rate. In the testbed, each peer represented a domain consisting of only one machine. However, if each peer were the entry point to a cluster, the Ganglia environment [26] would allow this metric to be calculated with precision for a group of machines. With regard to the user QoS requirements, the data required for analysis were obtained by submitting new workflows in parallel with the workload of another set of workflows. For example, in the test with the SwinDeW scheduling mechanism, ten workflows of each class were submitted in sequence. Their submission started after a fixed interval from the beginning of the submission of the workload. The peer that received the submission request was chosen randomly.
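The load index above (run-queue length divided by the number of cores) can be read directly from the operating system. A minimal sketch using Python's standard library on a POSIX system:

```python
import os

def peer_load():
    """Load of a peer: average run-queue length (1-minute load average)
    divided by the number of processor cores."""
    run_queue_avg = os.getloadavg()[0]   # processes ready for execution
    cores = os.cpu_count() or 1
    return run_queue_avg / cores
```

A value near 1.0 means the peer's cores are, on average, fully occupied; values well above 1.0 indicate queued work.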
For each peer hosting a service, a monitoring service provided the data required to calculate the value of the execution time metric. This monitoring functionality was implemented as a WS-Resource [16], and it stored the average execution time for each service of a peer. The execution cost metric value was calculated at the time of the peer activation. The execution cost was computed according to Equation 12. C(i, j) corresponded to the cost per minute of invoking the service i in peer j. Ncores_j represented the number of processor cores of the peer. The frequency of each processing core was given by Clock_j in gigahertz. M_j is the amount of main memory in gigabytes. P_i is the base price of the service i, defined in the agreement on formation of the grid. Table X gives the prices for the grid testbed, defined according to the complexity of each service. Every service had its base price associated with it. The final value of the invocation of a


J. M. U. DE ALENCAR ET AL.

Table X. Service prices.

Service               Price (monetary unit/minute)
MonteCarloPiStarter   1
MonteCarloPiWorker    2
MonteCarloPiSink      1
SPLaunch              5
SPWorker              10
SPReport              3

service on a peer was calculated taking into account this initial value and the computational power of the peer.

C(i, j) = (Ncores_j × Clock_j + M_j) × P_i    (12)

As notifications about invocations arrived at the client process that submitted the workflow, it retrieved information about the price of the basic service and the configuration of the peer that performed the invocation. The total cost was given by Equation 13, where T(i, j) was the time in minutes that was required for the execution of the invocation of service i in peer j.

Final cost = Σ_{i=0}^{n} C(i, j) × T(i, j), with (i, j) as a workflow step.    (13)
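Equations 12 and 13 translate directly into code. A minimal sketch in which the step tuples and prices are illustrative, not taken from the testbed:

```python
def cost_per_minute(n_cores, clock_ghz, mem_gb, base_price):
    """Equation 12: per-minute cost of a service on a peer,
    scaled by the peer's computational power."""
    return (n_cores * clock_ghz + mem_gb) * base_price

def final_cost(steps):
    """Equation 13: sum the cost of every workflow step (i, j),
    where each step ran for t_minutes on its peer."""
    return sum(cost_per_minute(*peer, base_price) * t_minutes
               for peer, base_price, t_minutes in steps)

# two hypothetical steps: (cores, clock GHz, memory GB), base price, minutes
steps = [((4, 2.0, 8.0), 2, 10.0),   # a MonteCarloPiWorker-like step
         ((2, 3.0, 4.0), 1, 5.0)]    # a MonteCarloPiSink-like step
total = final_cost(steps)
```

Note that a more powerful peer charges more per minute but typically finishes a step in fewer minutes, which is exactly the trade-off the user weights (cost versus time) control.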

4.4. Results

In all analyses presented in this section, the term M1 refers to the negotiation threshold applied in the first selection, based on administrative metrics. The term M2 represents the same threshold but related to the second selection, based on metrics defined by the user. ptime is the weight of the runtime, whereas pcost is the weight of the cost.

4.4.1. Comparison with SwinDeW

In the comparison between the proposal and SwinDeW, the first analysis is presented in Figure 3. The figure shows the behaviour of the proposed solution with parameter M1 ranging during submission of the workload. For each parameter value, the workload was submitted by using the technique of the proposal. The same workload was also submitted by using the mechanism of SwinDeW. As described in Section 3.3, parameter M1 controls the relevance of the administrative requirements. If the only requirement is load distribution, the value of M1 determines how strongly the system should prioritize load balancing. At first, the other parameters are configured so as not to affect the scheduling mechanism. The objective is to analyze the impact of our scheduling mechanism (Section 3.2), in which the invocation and negotiation processes are interleaved. In contrast, SwinDeW executes the negotiation of all workflow steps before the first invocation. In Figure 3, observing only the results of the proposed scheduling mechanism, one can perceive that the threshold parameter value has a direct impact on the variance of the load among the peers (i.e. one peer can have a low load and another a high load, which is not desired). When the threshold parameter assumes more restrictive values, such as M1 = 0.2, the difference between the average loads on the peers does not exceed 1.0. The curves present an orderly behaviour. However, when the parameter is relaxed to M1 = 0.8, the variance is always over 1.0, and the curve shows steep slopes.
The load distribution of our mechanism is more balanced than that of SwinDeW when the value of M1 is approximately less than or equal to 0.5. Figure 3 also shows the average load for the proposed solution and SwinDeW. Some values of parameter M1 were removed in order to maintain the readability of the graphic. Even with extreme values of M1, the average load is restricted to values in the range [0.6, 1.1], and SwinDeW also has values in that range. The use of either technique does not alter the average load on the system, so the overhead of using any one of them is similar. Nevertheless, our approach exhibits a lower load variance when small M1 values are used.
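To make the role of the M1 threshold concrete, the following sketch reflects our reading of the first selection: peers whose load offer is within M1 of the best offer remain candidates. The data structures and the exact filtering rule are simplified assumptions, not the actual negotiation protocol:

```python
def admissible_peers(peers, m1):
    """First selection (sketch): keep peers whose normalized load does not
    exceed the best offer by more than the negotiation threshold M1."""
    best = min(load for _, load in peers)
    return [name for name, load in peers if load - best <= m1]

# hypothetical (peer, normalized load) offers gathered during negotiation
peers = [("p1", 0.3), ("p2", 0.6), ("p3", 1.4)]
```

Under this reading, a restrictive M1 = 0.2 admits only the least loaded peer, while a relaxed M1 = 0.5 also admits the second one, which is consistent with the variance behaviour observed in Figure 3.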


Figure 3. Variance and average load of peers with a variable M1. Other parameters are fixed with M2 = 1.0 and ptime = pcost. (Top: load variance over 300 min for M1 = 0.2, 0.4, 0.6, 0.8 and SwinDeW; bottom: load average for M1 = 0.2, M1 = 0.8 and SwinDeW.)

Figure 4 presents the behaviour of the approaches when the metric of execution time is taken into account. For simplicity, only the high load workflows are presented as results. In the other workflow classes, we observed the same behaviour depicted in Figure 4. Workflow size ranges from three to ten. Again, a workflow of size ten has an initial invocation (e.g. MonteCarloPiStarter or SPLaunch), eight intermediate invocations (e.g. MonteCarloPiWorker or SPWorker) plus a final invocation (e.g. MonteCarloPiSink or SPReport). The M1 parameter was set to 0.5 in order to provide a load balancing similar to SwinDeW. The weights for the metrics were set to pcost = 0.2

Figure 4. Comparison with SwinDeW: high load workflows. (Top: execution cost versus workflow size, with M1 = 0.5, cost weight = 0.8 and time weight = 0.2; bottom: execution time in minutes versus workflow size, with M1 = 0.5, cost weight = 0.2 and time weight = 0.8. Each panel plots M2 = 0.2, 0.5, 0.8 and SwinDeW for workflow sizes 3 to 10.)


and ptime = 0.8. We have given more emphasis to time than to cost. The M2 threshold varies in order to enable the impact analysis of this parameter. For each workflow size, no more than ten samples were submitted, and each sample had the same invocation configurations. In this sample set, the confidence interval was calculated at a 95% confidence level. Using our scheduling mechanism, high load workflows exhibit curves with regular behaviour, in which execution intervals increase as the workflow size grows. In this configuration, the workflow execution time decreases as the parameter M2 value narrows. Thus, we conclude that the proposed mechanism allows the prioritization of QoS metrics in accordance with user requirements. The behaviour of SwinDeW is irregular because it does not consider user metrics. Figure 4 also presents the behaviour in relation to the execution cost metric. The confidence intervals are larger in this case. This difference is caused by the heterogeneity of the grid testbed. As the price of a service on a peer is defined by its hardware configuration (Equation 12), there are large variations in cost because of the range of machine configurations. This confidence interval does not invalidate the experiment. Despite representing an unpredictability factor in the results, the position of the curves shows that the user and administrator QoS parameters have a major impact on the grid workflow execution, as we targeted. Figure 5 assesses the impact of user QoS requirements on the scheduling of high load workflows. All administrative parameters were fixed at intermediate values. We varied the proportion among the weights of the user metrics. The technique used by SwinDeW is not evaluated because it does not take user QoS parameters into account. In this experiment, we also varied the weight values of the execution cost and time metrics, keeping the sum of the weights always equal to 1.
The influence of the weights on the scheduler decisions is noticeable. The graphics in Figure 5 demonstrate the effectiveness of our technique.

4.4.2. Comparison with genetic algorithms

We compare our approach to a centralized solution in order to measure the drawbacks of using our approach. As aforementioned, centralized solutions have a

Figure 5. Metrics evaluation: execution cost and time for high load workflows with M1 = 0.5 and M2 = 0.5. (Top: execution cost versus workflow size for cost weights 0.2, 0.5 and 0.8; bottom: execution time in minutes versus workflow size for time weights 0.2, 0.5 and 0.8.)


potential to be more effective in the scheduling process because they can have a global view of the state of all grid resources. However, P2P solutions decrease the chances of system bottlenecks. We chose a centralized scheduler that uses a genetic algorithm. We implemented it following the data structures and functions described in [15]. The parameters used are shown in Table XI and were defined according to the experimental studies found in [15]. In addition to the initial settings, a modification was made in the workflow execution environment to allow the reservation of resources. This change was required to make a realistic comparison with existing techniques, as they work with resource reservation. The genetic scheduler was implemented using the same Java virtual machine configuration used in the other experiments. We used the Java Genetic Algorithms Package (JGAP) framework to develop the centralized solution based on genetic algorithms [27]. Unlike the decentralized experiments, in which the scheduler chooses a random peer for the submission of a workflow, a single machine was chosen to submit and monitor the execution. In this case, the machine chosen was the Pentium 4 of Table VIII.

The first analysis is presented in Figure 6. Normally, centralized solutions based on genetic algorithms do not take load distribution into account as a requirement. However, it is important to

Table XI. Initial parameters for the scheduler with genetic algorithms.

Parameter              Value/type
Population size        10
Maximum generation     100
Crossover probability  0.9
Mutation probability   0.5
Scheme selection       Category elitist
Initial individuals    Random generation
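For reference, the kind of scheduler described in [15] can be sketched as a toy genetic algorithm using the parameters of Table XI: a chromosome assigns a peer to each workflow step, and the fitness is the total cost of the assignment. This is a simplified stand-in written for illustration, not the JGAP-based implementation used in the experiments, and the peer costs are hypothetical:

```python
import random

def evolve(n_steps, peer_costs, pop_size=10, generations=100,
           p_cross=0.9, p_mut=0.5):
    """Toy genetic scheduler: chromosome = peer index per workflow step,
    fitness = total cost of the assignment (lower is better)."""
    n_peers = len(peer_costs)
    fitness = lambda c: sum(peer_costs[p] for p in c)
    pop = [[random.randrange(n_peers) for _ in range(n_steps)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        elite = pop[: pop_size // 2]            # elitist selection
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = random.sample(elite, 2)
            cut = random.randrange(1, n_steps) if random.random() < p_cross else 0
            child = a[:cut] + b[cut:]           # one-point crossover
            if random.random() < p_mut:         # mutation: reassign one step
                child[random.randrange(n_steps)] = random.randrange(n_peers)
            children.append(child)
        pop = elite + children
    return min(pop, key=fitness)

best = evolve(5, peer_costs=[3.0, 1.0, 2.0])
```

Such a scheduler needs a global view of every peer's cost, which is precisely the centralization that the P2P mechanism avoids.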

Figure 6. Comparison with genetic algorithms: variance and load average of peers with M1 = 0.5, M2 = 0.5 and ptime = 2 × pcost. (Top: load variance over 300 min; bottom: load average. Each panel compares the proposal (M1 = 0.5, M2 = 0.5) with the genetic scheduler under a deadline restriction.)


Figure 7. Comparison with genetic algorithms: execution cost and time for high load workflows with M1 = 0.5 and M2 = 0.5. (Top: execution cost versus workflow size with cost weight = 0.8; bottom: execution time in minutes versus workflow size with time weight = 0.8. Each panel compares the proposal with the genetic algorithm for workflow sizes 3 to 10.)

analyze it because our proposed mechanism includes this QoS requirement. We can observe in Figure 6 that the use of genetic algorithms to submit the workload leads to a load imbalance on the grid. With regard to the average load values, our approach and the centralized one present quite similar behaviour. Figure 7 shows the analysis of the user QoS metrics. With regard to both metrics, P2PScheMe was able to provide results close to the solutions found by the centralized scheduler. Thus, our architecture offers comparable results while maintaining a decentralized environment, reducing the occurrence of bottlenecks without compromising performance.

5. CONCLUSIONS AND FUTURE WORK

This paper presented a grid environment and a scheduling mechanism that allow the establishment of P2P grids able to execute applications based on the paradigm of service workflows. The P2PScheMe mechanism provides a high level of decentralization, decreasing the occurrence of bottlenecks and improving scalability. In addition, users and grid administrators can affect the scheduling decisions by informing priorities between QoS metrics. These priorities have a direct impact on the workflow execution, and they are computed using the latest available information about the resources. The improvements mentioned earlier contribute greatly to the development of robust grid computing environments. The comparative analysis was performed using a workload with a strong resemblance to real grid environments. Two applications based on mathematical methods widely applied in scientific computing were implemented as workflows. The workflows were submitted to a testbed composed of several machines with heterogeneous configurations. The resulting experiments showed that P2PScheMe presents a more refined load distribution than the other two techniques considered.
Concerning user QoS metrics, our mechanism oriented the execution according to the informed QoS parameters, whereas SwinDeW does not possess this functionality. In the case of a central scheduler based on genetic algorithms, our mechanism was able to reach close performance values without the disadvantage of limited scalability. As a first future work, we plan to refine the definition of user QoS parameters and allow grid information to be translated more easily into administrative parameters. We have already started


the study of semantic web technology to represent metadata in the grid. We would like to provide semantic descriptions of QoS requirements and permit the execution of services annotated with semantic information. We believe that the autonomy level of the grid will be improved, which will facilitate the use of the grid by non-specialist users. Another future work will be the replacement of the underlying grid architecture by a cloud computing environment [28]. In the current mechanism, the instantiation peer searches for other peers offering the service required by the next workflow step. If there are no peers with the desired service, the workflow execution is stopped, and the user receives intermediate results. With a cloud computing environment supporting virtualization, we plan to provide an alternative to stopping the execution. In the absence of suitable services hosted in the peers, the instantiation peer will start a new negotiation process to decide in which peer the virtual machine containing the needed service will be placed. This new negotiation process will also take into account QoS requirements related to cloud computing and virtualization. Thus, cloud computing will add even more robustness to the current mechanism. It will also improve the experience for the end user, since there will be greater assurance of complete workflow execution.

ACKNOWLEDGEMENTS

The authors thank the Brazilian National Research Council (CNPq) (131018/2008-6) for the financial support, and the National High-Performance Processing Center at UFC (Cenapad-UFC) as well as the National Laboratory for Scientific Computing (LNCC) for providing the technical resources to improve this research.

REFERENCES

1. Deelman E, Blythe J, et al. Pegasus: mapping scientific workflows onto the Grid. In Grid Computing, Vol. 3165, Lecture Notes in Computer Science. Springer: Berlin, 2004; 11–20.
2. Foster I, Kesselman C. The Grid: Blueprint for a New Computing Infrastructure, Chapter 2. Morgan Kaufmann: San Francisco, 1999.
3. Ludascher B, Altintas I, et al. Compiling abstract scientific workflows into web service workflows. Proceedings of the 15th IEEE Conference on Scientific and Statistical Database Management, Cambridge, MA, USA, 2003; 251–254.
4. Gil Y. From data to knowledge to discoveries: scientific workflows and artificial intelligence. Scientific Programming 2008; 16(4):231–246.
5. Gil Y, Ratnakar V, et al. Wings for Pegasus: creating large-scale scientific applications using semantic representations of computational workflows. Proceedings of the Nineteenth Conference on Innovative Applications of AI (IAAI-07), Vancouver, British Columbia, Canada, 2007; 22–26.
6. Oinn T, Greenwood M, et al. Taverna: lessons in creating a workflow environment for the life sciences. Concurrency and Computation: Practice and Experience 2005; 18(10):1067–1100.
7. Yan J, Yang Y, et al. SwinDeW - a P2P-based decentralized workflow management system. IEEE Transactions on Systems, Man, and Cybernetics 2006; 36(5):922–935.
8. Tanenbaum A, Van Steen M. Distributed Systems: Principles and Paradigms, 2nd ed. Prentice Hall: USA, 2007; 33–34.
9. Ranjan R, Rahman M, et al. A decentralized and cooperative workflow scheduling algorithm. Proceedings of the 8th IEEE International Symposium on Cluster Computing and the Grid, Lyon, France, 2008; 1–8.
10. Taylor I, Shields M, et al. The Triana workflow environment: architecture and applications. In Workflows for e-Science. Springer: London, 2007; 320–339.
11. Fahringer T, Prodan R, et al. ASKALON: a Grid application development and computing environment. GRID '05: Proceedings of the 6th IEEE/ACM International Workshop on Grid Computing, Washington, DC, USA, 2005; 122–131.
12. Keong Lua E, Crowcroft J, et al. A survey and comparison of peer-to-peer overlay network schemes. IEEE Communications Surveys and Tutorials 2005; 7:72–93.
13. Yu J, Buyya R. A taxonomy of workflow management systems for grid computing. Journal of Grid Computing 2006; 34(3):44–49.
14. Martins F, Maia M, et al. A grid computing diagnosis model for tolerating manipulation attacks. International Transactions on Systems Science and Applications 2006; 2(2):135–146.
15. Yu J, Buyya R. Scheduling scientific workflow applications with deadline and budget constraints using genetic algorithms. Scientific Programming 2006; 14(3-4):217–230.
16. Foster I. Globus Toolkit version 4: software for service-oriented systems. IFIP International Conference on Network and Parallel Computing, Beijing, China, 2005; 2–13.


17. Khaled Ahsan Talukder AKM, Kirley M, et al. Multiobjective differential evolution for scheduling workflow applications on global grids. Concurrency and Computation: Practice and Experience 2009; 21(13):1742–1756.
18. Rahman M, Ranjan R, et al. Cooperative and decentralized workflow scheduling in global grids. Future Generation Computer Systems 2010; 26(5):753–768.
19. SINAPAD - Sistema Nacional de Processamento de Alto Desempenho. Available from: http://www.lncc.br/sinapad, last accessed 08/2011.
20. Van der Wijngaart R, Frumkin M. NAS Grid Benchmarks version 1. NAS Technical Report NAS-02-005, NASA Ames Research Center, Moffett Field, CA, USA, 2002.
21. National Science Foundation Cyberinfrastructure Council, Office of Cyberinfrastructure. Cyberinfrastructure vision for 21st century discovery. Available from: http://www.nsf.gov/pubs/2007/nsf0728, last accessed 01/2011.
22. Prodan R, Fahringer T. Dynamic scheduling of scientific workflow applications on the grid: a case study. SAC '05: Proceedings of the 2005 ACM Symposium on Applied Computing, New York, NY, USA, 2005; 687–694.
23. Peixoto M, Santana M, et al. A P2P hierarchical metascheduler to obtain QoS in a grid economy services. IEEE International Conference on Computational Science and Engineering, Washington, DC, USA, 2009; 292–297.
24. Anselmi J, Ardagna D, et al. A QoS-based selection approach of autonomic grid services. SOCP '07: Proceedings of the 2007 Workshop on Service-Oriented Computing Performance: Aspects, Issues, and Approaches, Monterey, California, USA, 2007; 1–8.
25. Ferrari D, Zhou S. An empirical investigation of load indices for load balancing applications. Performance '87: Proceedings of the 12th International Symposium on Computer Performance Modeling, Measurement, and Evaluation, Amsterdam, The Netherlands, 1988; 515–528.
26. Massie M, Chun B, et al. The Ganglia distributed monitoring system: design, implementation and experience. Parallel Computing 2004; 30(7):817–840.
27. Meffert K, Meseguer J, et al. JGAP - Java Genetic Algorithms Package. Available from: http://jgap.sourceforge.net/, last accessed 06/2010.
28. Mc Evoy G, Schulze B, Garcia E. Performance and deployment evaluation of a parallel application on a private cloud. Concurrency and Computation: Practice and Experience 2011; 23(7):2048–2062.
