J Grid Computing (2012) 10:601–630 DOI 10.1007/s10723-012-9240-5
WS-PGRADE/gUSE Generic DCI Gateway Framework for a Large Variety of User Communities Peter Kacsuk · Zoltan Farkas · Miklos Kozlovszky · Gabor Hermann · Akos Balasko · Krisztian Karoczkai · Istvan Marton
Received: 26 December 2011 / Accepted: 24 September 2012 / Published online: 7 November 2012 © Springer Science+Business Media Dordrecht 2012
Abstract The WS-PGRADE/gUSE generic DCI gateway framework has been developed to support a large variety of user communities. It provides a general-purpose, workflow-oriented graphical user interface to create and run workflows on various DCIs including clusters, Grids, desktop Grids and clouds. The framework can be used by NGIs to support small user communities who cannot afford to develop their own customized science gateway. The WS-PGRADE/gUSE framework
also provides two API interfaces (Application Specific Module API and Remote API) to create application-specific science gateways according to the needs of different user communities. The paper describes in detail the workflow concept of WS-PGRADE, the DCI Bridge service that enables access to most of the popular European DCIs, and the Application Specific Module and Remote API concepts used to generate application-specific science gateways. Keywords Science gateway · Customized interface · Workflow · Distributed computing infrastructures
P. Kacsuk · Z. Farkas (B) · M. Kozlovszky · G. Hermann · A. Balasko · K. Karoczkai · I. Marton
MTA SZTAKI, P.O. Box 63, Budapest, 1518, Hungary
Z. Farkas e-mail: [email protected]
P. Kacsuk e-mail: [email protected]
M. Kozlovszky e-mail: [email protected]
G. Hermann e-mail: [email protected]
A. Balasko e-mail: [email protected]
K. Karoczkai e-mail: [email protected]
I. Marton e-mail: [email protected]
1 Introduction

Many scientific user communities can take advantage of using distributed computing infrastructures (DCIs) because they have big computational needs, very large data sets to process or other computer-related requirements. For example, high-energy physics, drug research (molecule docking) and geology are fields that can easily benefit from using large computer systems to solve their problems. Different types of Grid or cluster middleware offer well-defined interfaces for accessing such computer systems. For example, if an institute sets up a cluster, the nodes of the cluster are not
accessed individually; rather, they are connected together with the help of a local resource manager, which offers a unified interface for submitting application instances (jobs) to the cluster with the help of a job description and using middleware-specific tools or APIs. Once a job is submitted, it is the local resource manager's task to find an execution resource for it. Examples of cluster middleware are PBS [36], LSF [33] and Condor [41]. Grid systems provide an additional level of abstraction by collecting different types of cluster resources within a geographically distributed infrastructure. Grid middleware is more complex than cluster middleware, as it usually offers a large number of heterogeneous services, for example resource brokers, data storage or information system services. Just like cluster middleware, Grid middleware offers a user interface to access its services through command-line tools or APIs. Examples of Grid middleware are gLite [17], ARC [10] and UNICORE [11]. From a scientist's point of view, cluster and Grid middleware pose several problems. First, each middleware uses its own user interface that can be very complex, requiring notable effort and learning before the scientist can make use of it. Second, middleware interfaces are rarely compatible with each other, so moving to another Grid middleware requires an additional learning period. Fortunately, there are a number of efforts, projects and software tools that aim at solving interoperability among different Grid systems, for example OGSA-BES [35], SAGA [27], EDGI [26], EDGeS [3] and 3G Bridge [14]. Third, Grid middleware usually does not offer a way to create and manage a flow of work for complex scientific problems, so scientists have to create scripts or applications managing such scenarios, usually with the help of computer scientists. The steps of a very basic workflow scenario are input data collection, data processing and result evaluation. Workflow systems and Grid gateway technologies have evolved to overcome these difficulties of using Grid middleware. The task of Grid gateways is to hide the details of the underlying Grid or cluster middleware from their users through an easy-to-use user interface, whereas workflow systems enable creating and managing workflows consisting of a number of applications. Although
both approaches are aimed at facilitating the use of distributed computing infrastructure, workflow systems and gateway technologies are usually decoupled: most Grid gateways do not offer a way to create workflows, providing only a simple job submission mechanism [5, 43]. Similarly, most workflow systems are created without any gateway support, can be used in a desktop environment and typically support a single DCI. Examples of such workflow systems are Taverna [21], Triana [40], Kepler [2, 34], Pegasus [7] and ASKALON [12]. Only the OGCE portal [1], P-GRADE Portal [13] and gUSE/WS-PGRADE [25] were designed to natively support a workflow system. Gateways can be divided into two main categories. Generic DCI gateway frameworks are not specialized for a certain science area, and hence scientists from many different areas can use them. NGIs (National Grid Initiatives) are good candidates to set up such gateways to support their very heterogeneous user communities. For example, the UK NGS, Grid Ireland, the Malaysian KnowledgeGrid, etc. set up a P-GRADE portal for this purpose. Typical gateways belonging to this category are GridPort [5], P-GRADE [13], Vine Toolkit [43], WS-PGRADE/gUSE [25], etc. The problem with these gateways is that they expose a large set of services to their users. Thus, in order to exploit their full power, scientists need a relatively long learning period to use all the available features efficiently. Such complexity deters many end-user scientists from using the Grid even through these Grid gateways. Therefore, the other class of gateways targets a well-defined set of scientists typically working in the same field of science. These are the so-called application-specific science gateways. They provide a very simplified user interface that is highly tailored to the needs of the given scientific community. As a result, on the one hand the scientists do not have to learn much to use the services provided by the gateway; on the other hand these services are very limited, and hence if a scientist needs a more complex service, for example a new workflow, it cannot be created and managed by these gateways. A typical example is the rendering portal developed at Laurea University [37] in Finland.
In order to create such application-specific gateways there are two options. One option is to write the gateway from scratch. Since the required services are limited and there are good portal frameworks like Liferay, it is a relatively easy task to develop such application-specific gateways. However, such simplified gateways typically support the usage of only one particular DCI and do not support workflow execution at all. Typical examples of this approach are the gateways developed by the WeNMR project, for example the NRG-CING portal [9]. Communities selecting this option assume that building a science gateway is an easy and fast activity. They usually underestimate the manpower and time required to produce a really robust gateway that can be provided as a production 24/7 service for the large number of members of the community. While building such gateways, the different communities usually solve the same technical issues again and again, independently from each other. This redundancy of gateway development effort is a huge waste of money and manpower. Many times the gateways do not reach the required production level by the time the project is over, and then all the effort spent building the gateway becomes useless. Sometimes the communities manage to produce the gateway in time, but when the project is over they no longer have the financial support and manpower required to maintain the gateway in line with the progress of the underlying software stack. This aspect of maintaining gateways and making them sustainable is very often underestimated. Related to this issue is scalability, both from the point of view of functionality and usability. Once a gateway is provided as a production service, the user community would like to have new functionalities and access to more DCIs of different kinds, and these needs require substantial further development of the gateway. Another serious problem arises when a large number of users start to use the gateway and it turns out that it was not designed to serve a large number of users in a scalable way. The other option is to customize an existing versatile generic DCI gateway framework according to the needs of a certain user community. In this case the full power of the underlying portal framework can be exploited, for example by developing comprehensive and sophisticated
workflows for the community and hiding these complex workflows behind a simplified application-specific user interface. This is the approach that was followed by the SystemX project in Switzerland when they developed their proteomics science gateway [32] based on P-GRADE and recently on WS-PGRADE. A similar approach was followed in the UK ProSim project, where a gateway [29] was created for biologists to model carbohydrate recognition based on the WS-PGRADE/gUSE portal framework. The advantage of this approach is that the DCI access services are solved and provided in a robust way in such a generic DCI gateway framework, and hence the user communities can concentrate on producing the application-specific layers of the science gateway. In this way the redundancy of developing the same DCI access mechanisms by many different communities can be avoided. For the same reason the development time of application-specific gateways can be significantly reduced, and there is a good chance that within the lifetime of the given project the science gateway can be built and provided as a production service. Obviously, the cost of producing such a gateway is substantially lower than in the case of the first approach. Since the gateway is a customization of an existing robust and scalable system, the resulting production gateway service will likely also be robust and scalable. The sustainability of such a customized gateway is much easier to achieve than in the case of the first method, provided that the sustainability of the generic DCI gateway framework is solved: in this case the community of the science gateway has to maintain only a narrow set of user-specific services, while the rest is maintained by the developer community of the generic DCI gateway framework. This latter should be an open source community that maintains the code as a community effort. In this paper we introduce WS-PGRADE/gUSE, a generic DCI gateway framework and backend service stack that can be easily customized to build application-specific science gateways. The most distinguishing feature of WS-PGRADE/gUSE compared to other generic DCI gateway frameworks is that it is workflow-centric, i.e., it provides services to build workflow applications that can be executed on various DCIs. The provided execution mechanism enables the simultaneous
execution of workflow nodes placed on parallel workflow branches. These nodes can be executed in parallel in different DCIs, enabling the organization of very efficient parallel, multi-DCI workflow applications. WS-PGRADE/gUSE provides all the services that are needed to create, execute and monitor these workflows. Therefore the three most important features of WS-PGRADE/gUSE are as follows:

1. Workflow support
2. Enabling multi-DCI workflow execution
3. Enabling the customization of the framework towards application-specific science gateways

In the current paper we focus on these most important features of WS-PGRADE/gUSE: we show how it allows creating complex workflow scenarios and enables running them on a diverse set of DCIs. We show two different customization methodologies of WS-PGRADE/gUSE that will be used within the EU FP7 SCI-BUS project [42] to create customized science gateways. The SCI-BUS project will produce 17 different application-specific gateways for various user communities by customizing the WS-PGRADE/gUSE portal framework according to the needs of these user communities, including seismology, astrophysics, heliophysics, chemistry, biology, medical science, etc. There are many other important features of WS-PGRADE/gUSE, but due to the length restrictions of this paper we do not cover them here. Interested readers can find details of those features in the WS-PGRADE/gUSE User Manual [20]. The rest of the paper is organised as follows: in Section 2 we present the concept of the WS-PGRADE gateway and its workflow-oriented back-end, the gUSE middleware. Next, in Section 3 we describe the workflow concept of WS-PGRADE and gUSE. In Section 4 we show how the various DCIs can be transparently accessed via the DCI Bridge service of gUSE. Afterward, in Section 5 we present two ways offered by gUSE/WS-PGRADE to create customized science gateways. Finally, in Sections 6 and 7 we present related work on Grid gateways and conclude the paper.
2 Concept of WS-PGRADE and gUSE

The WS-PGRADE portal, based on the gUSE (Grid User Support Environment) service set, is the second-generation P-GRADE portal, introducing many advanced features both at the workflow and the architecture level compared to the first-generation P-GRADE portal [24]. The major lesson we learnt from the usage of the P-GRADE portal was that there are different kinds of user communities and they require different kinds of user interfaces to reach various types of DCIs. Based on this lesson we have redesigned the whole concept of the P-GRADE portal and created a multi-tier service architecture that can support various user types and various DCIs. The Architectural tier enables access to many different kinds of DCIs through the DCI Bridge job submission service, as shown in Fig. 1. This tier is described in detail in Section 4. The Middle tier contains those high-level gUSE services that enable the management, storage and execution of workflows, as explained in Section 3. Finally, the Presentation tier provides the graphical WS-PGRADE user interface of the generic DCI gateway framework. This layer can be easily customized and extended (see Section 5) according to the needs of the science gateway to be derived from WS-PGRADE/gUSE. Figure 2 shows how the different kinds of users can utilize this generic DCI gateway framework. The Type A user is the workflow developer, who develops workflows for the end-user scientists. This user understands the usage of the underlying DCI and is able to develop complex workflows. This activity requires editing, configuring and running workflows in the underlying DCIs, as well as monitoring and testing their execution. In order to support the work of these users, WS-PGRADE (the graphical user interface service) provides a Workflow Developer UI through which all the required workflow development activities are supported. When a workflow is developed for the scientists, it should be uploaded to a repository from which the scientists can download and execute it. In order to support this interaction between workflow developers and end-users, the gUSE service set provides an Application (Workflow) Repository service in the gUSE tier, and the WS-PGRADE Workflow Developer
Fig. 1 Multi-tier architecture of WS-PGRADE/gUSE
UI enables workflow developers to upload and publish their workflows for end-users via this repository. The Type B user is an end-user scientist (e.g. a biologist or chemist) who is aware neither of the features of the underlying DCI nor of the structure of the workflows that realize the type of applications he or she has to run in the DCI(s). For these users WS-PGRADE provides a simplified End-User UI where the reachable functionalities are very limited. Typically, end-user scientists can download workflows from the Application Repository, parameterize (configure) the workflows and execute them on the DCI(s). They can also monitor the progress of the running workflows via a simplified monitoring view. A user can log in to the portal either as a workflow developer or as an end-user, and according to this login he or she sees either the developer view or the end-user view of WS-PGRADE. In many cases even this simplified view is too complex for the scientists, or sometimes they need a special visualization tool or other application-specific portlets to make the usage of the portal more customized for their work. Therefore, there is a need to customize the portal according to these application-specific requirements. In order to support the development of such application-specific UIs, we provide the ASM (Application Specific Module) API by which such customization can easily and quickly be done. Once this has happened, Type C scientists who require such customization can run their workflow applications on various DCIs via the Application Specific UI developed by means of the ASM API. Notice that in this case the WS-PGRADE End-User UI is replaced with the customized Application Specific UI, and this new UI can directly access the gUSE services via the ASM API. It can also happen that a certain user community already has a favorite Application Specific UI and insists on using this existing UI, but would like to access as many DCIs as possible through it. These communities are shown as Type D users in Fig. 2. For the benefit of these users we have moved the DCI Bridge service from the gUSE tier to the Architectural tier and made this service directly accessible via the standard OGF BES job submission interface [35]. Finally, there are users who prefer to access the gUSE services via a direct API without any user interface and run WS-PGRADE workflows directly via this API. For this type of user (denoted with E in Fig. 2) we provide the gUSE Remote API. Further lessons learnt from the usage of P-GRADE showed that the simple DAG-based workflow concept of P-GRADE is not enough for many applications. Therefore, the WS-PGRADE/gUSE system extends the DAG-based workflow
Fig. 2 User access modes of the WS-PGRADE/gUSE DCI gateway framework
concept with advanced parameter study features through special workflow entities (generator and collector jobs, parametric files), condition-dependent workflow execution and workflow embedding support. All these features are explained in detail in Section 3. Another problem in P-GRADE was that abstract workflows were not handled. The new portal extends the concrete workflow concept of P-GRADE with new concepts and objects like graph (abstract workflow), workflow instance, template, application and project. Among the WS-PGRADE/gUSE design concepts, another important requirement was to enable the simultaneous handling of a very large number of jobs, even in the range of millions, without compromising the response time at the user interface. In order to achieve this level of concurrency in job handling, the workflow management back-end of the WS-PGRADE portal is implemented according to the Service Oriented Architecture (SOA) concept and is supported by gUSE and the DCI Bridge service, as shown in Fig. 1. gUSE is a set of services with well-defined interface protocols that realize the workflow management back-end of the WS-PGRADE portal. Since the workflow concept of WS-PGRADE is much more sophisticated than that of P-GRADE, the Condor DAGMan workflow engine has been replaced with a newly developed workflow engine, called Zen, which can manage the
very large number of job executions. The WS-PGRADE/gUSE gateway framework was successfully tested with up to one million simultaneous jobs. In order to prevent malicious users from flooding the portal (and DCIs) with an enormous number of jobs, the portal system administrator can set an upper limit on the number of simultaneous jobs per user. Since the WS-PGRADE/gUSE generic DCI gateway framework is implemented as a SOA, there are many options for installing the system. The simplest solution is to place all the services of all the tiers on a single machine. More advanced installations can lead to a more efficient system with shorter response times for the users. For example, the recommended installation configuration would contain a front-end machine with WS-PGRADE on it and a back-end machine with the gUSE and DCI Bridge services. Due to the lack of space we show here only the single-machine installation principles, but the interested reader can consult the WS-PGRADE/gUSE Installation Manual [18] concerning other installation options. The high-level architectural view of a WS-PGRADE/gUSE system installed on a single machine is shown in Fig. 3. As can be seen, Apache Tomcat is used as the servlet container to host the gUSE services and the Liferay portal technology. Of course it is possible to deploy
Fig. 3 Overview of a WS-PGRADE/gUSE single-node installation
the gUSE services onto a different servlet container, so that the user interface (WS-PGRADE) can be operated separately from the backend (gUSE) services, as mentioned above. The user interface of WS-PGRADE consists of a number of portlets conforming to the JSR-286 portlet specification, deployed in Liferay. The different WS-PGRADE portlets communicate with the gUSE services as needed.
3 Workflow Concept

WS-PGRADE uses its own XML-based workflow language with a number of features: advanced parameter study support through special workflow entities (generator and collector jobs, parametric files), diverse distributed computing infrastructure (DCI) support, condition-dependent workflow execution and workflow embedding support. The structure of a WS-PGRADE workflow is represented by a directed acyclic graph. An example of a workflow graph is shown in Fig. 4. Big boxes represent nodes of the workflow, whereas the smaller boxes attached to them represent the input and output file connectors (ports) of the given node. Directed edges of the graph represent data dependency (and the corresponding file transfer) among the workflow nodes.
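As a minimal illustration (our own sketch, not WS-PGRADE's actual internal representation or XML schema), such a graph can be captured with nodes, named ports and edges expressing data dependencies:

```python
# Illustrative data model for a workflow graph: nodes with input/output
# ports, and directed edges carrying files between ports.
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    in_ports: list = field(default_factory=list)    # e.g. ["0", "1"]
    out_ports: list = field(default_factory=list)

@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)       # (src, out_port, dst, in_port)

    def add(self, node):
        self.nodes[node.name] = node

    def connect(self, src, out_port, dst, in_port):
        # a directed edge: data dependency and corresponding file transfer
        self.edges.append((src, out_port, dst, in_port))

g = Graph()
g.add(Node("A", out_ports=["1"]))
g.add(Node("B", in_ports=["0"]))
g.connect("A", "1", "B", "0")   # B consumes the file produced on A's port 1
```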
In WS-PGRADE two types of workflows are distinguished: abstract workflows and concrete workflows. An abstract workflow represents only the structure (graph) of a workflow; the semantics of the nodes are not yet defined. This abstract workflow can be used to generate various concrete workflows. Concrete workflows are derived from abstract workflows by specifying the semantics of the workflow nodes and the DCIs where the various nodes should be executed. A workflow node can be a job to be submitted into a DCI, a service (e.g. a Web Service) to be invoked, or another workflow to be executed. The concrete workflows generated from a certain graph can differ in the semantics of the workflow nodes, the input and output files associated with the ports, the specification of the DCIs where the various nodes should be executed, etc. The life-cycle of WS-PGRADE workflows is the following: the user first has to create the graph (or structure) of the workflow. This abstract workflow can be used in the second step to generate various concrete workflows by configuring the detailed properties (first of all the executable, the input/output files where needed and the target DCI) of the nodes representing the atomic execution units of the workflow. After all the properties of the workflow have been set, it can be submitted, resulting in an instance of the workflow. A concrete workflow can be submitted several times
Fig. 4 Example workflow graph
(for example, in the case of performance measurements), and every submission results in a new instance of the same concrete workflow. The execution of a workflow instance is data-driven, governed by the graph structure: a node is activated (the associated job submitted or the associated service called) when the required input data elements (usually a file, or set of files) become available at each input port of the node. This node execution is represented as an instance of the created job or service call. One node can be activated with several input sets (for example, in the case of a parametric node) and each activation results in a new job or service call instance. The job or service call instances also contain status information and, in case of successful termination, the results of the calculation are represented in the form of data entities associated to the output ports of the corresponding node. The user may suspend or abort the execution of a running instance. A previously suspended workflow instance can be resumed at any time by the user. This way the user can observe the progress of workflow elaboration on the fly and can perform job/service call instance level checkpointing when needed.
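The activation rule can be summarized in a few lines; the following is our own minimal sketch of the assumed semantics, not gUSE's actual engine code:

```python
# A node fires once every one of its input ports has received data.
def ready_nodes(node_ports, arrived):
    """node_ports: {node: [input port names]};
    arrived: set of (node, port) pairs for which data is available."""
    return [node for node, ports in node_ports.items()
            if all((node, port) in arrived for port in ports)]

node_ports = {"B": ["0"], "C": ["0", "1"]}
arrived = {("B", "0"), ("C", "0")}       # C is still waiting on its port 1
print(ready_nodes(node_ports, arrived))  # ['B']
```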
However, the rather strict policy of data-driven workflow interpretation has been relaxed in the case of runtime condition-dependent execution of a given job/service call instance. In such cases substantial optimization can be achieved by cutting off unneeded branches of the graph. Another important feature of WS-PGRADE workflows is the support for embedding workflows. This enables the usage of the "subroutine paradigm", i.e., a tested workflow B can be used inside a parent workflow A by specifying that the executable of a certain node in A should be workflow B.

3.1 Advanced Parameter Sweep Features

Parameter sweep applications are typically simulations where the same simulation application should be executed with many different input sets. DCIs are ideal for parameter sweep executions, and therefore their most frequent usage scenario is running such parameter sweep applications.
This implies that any good science gateway should provide an easy-to-use way to construct and execute parameter sweep applications. The typical scenario for such applications is that first the parameter set is generated, then the parametric application is executed, and finally the results of the parametric applications are collected and processed (for example, creating statistics based on the results of the different executions).

3.1.1 Generator Nodes

Any node can be defined as a Generator node. A workflow node has the generator property if the execution of the associated job or service call instance may produce more than one output data element associated to given distinguished output port(s). This kind of distinguished output port is called a Generator output port, and must be specified as such during the configuration of the node when the concrete workflows are defined. The code is expected to produce files following the naming convention <name>_<index>, where the indices must be consecutive integers starting from zero. The user defines the <name> part during the configuration of the Generator output port, enabling the workflow manager to gather the created files.
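For illustration, a generator's executable might emit its output set as follows (our own sketch; the prefix "slice" stands in for whatever name the user configured on the port):

```python
# The generator writes <name>_0, <name>_1, ... with consecutive,
# zero-based indices so the workflow manager can collect them.
def write_generator_outputs(prefix, items):
    for index, item in enumerate(items):
        with open(f"{prefix}_{index}", "w") as f:
            f.write(item)

write_generator_outputs("slice", ["param set A", "param set B", "param set C"])
# creates slice_0, slice_1 and slice_2 on the Generator output port
```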
3.1.2 Collector Nodes

A node has the Collector property if it has at least one distinguished Collector port, i.e., at least one of its input ports is specified as a Collector port during workflow configuration time when the concrete workflows are defined. Collector nodes are typically used to collect several files and then process them as a single input. Therefore Collector ports force delayed job execution until the last file of the input file set to be collected has arrived on the Collector port. The workflow engine computes the expected number of input files on a Collector port at run time. When all the expected inputs have arrived on the Collector port, the node becomes executable and a single job instance is created and started to process all the incoming input files as a single input set. It is the responsibility of the executing code associated to the node to find and process all received input files following the naming convention <name>_<index>, where <name> has been defined by the user during the configuration of the Collector port.

3.1.3 Parametric Input Ports and Parametric Nodes

A node may have single and parametric input ports, and this should be specified at configuration time when the concrete workflow is defined based on an abstract workflow. If a node has only single input ports, it is executed only once, as a single instance processing the single inputs of all of its input ports. These nodes are called Normal nodes. If a node has at least one parametric input port, it is called a Parametric node. If a Parametric node has one parametric input port, it is executed in as many instances as files arrive on the parametric input port. If the output port of a Generator node is connected to the parametric input port of a Parametric node, this Parametric node is executed for every file generated by the Generator node. If a Parametric node has several parametric input ports, we have to specify how the input files from these ports are combined together. The way this is organized is described in the next subsection. A node may also have a free parametric input port. A free input port is a port that is not associated to any output port of the workflow graph. Such ports must be associated to existing input data, i.e., data that are not produced by the workflow and were available before the workflow execution started. If the input data is not a single file but a set of files, then the associated port must be distinguished as a Parametric input port. Parametric input ports force a sequence of job submissions (job/service call instance creations) of the associated node in the same way as if the node were connected to the Generator output port of a Generator node.

3.1.4 Input Port Ordering

A node can have several parametric input ports. In such cases input port ordering is a feature to
associate Parametric input ports together. The question is how many times such a node should be executed when input data sets arrive on the different Parametric input ports. In order to answer this question a new term, the Cross Product Set (CPS), is introduced. All input ports must be subdivided into disjoint CPSs. Within a CPS, all input files of the participating ports are combined as a Cartesian (or cross) product. Among the CPSs the dot product relationship is applied.

Example 1 Let us assume that the node has just two input ports, $P_1$ and $P_2$; the first has 2 and the second 3 data elements: $P_1 = \{v_{11}, v_{12}\}$ and $P_2 = \{v_{21}, v_{22}, v_{23}\}$. In this case the number of combinations in the Cartesian product is 6, i.e., 6 job (or service call) instances will be generated by the WS-PGRADE workflow engine with the following input pairs:

$$CPS_1 = \{[v_{11}, v_{21}], [v_{11}, v_{22}], [v_{11}, v_{23}], [v_{12}, v_{21}], [v_{12}, v_{22}], [v_{12}, v_{23}]\}$$

Example 2 Let us assume that there are two further ports of the node, $P_3 = \{v_{31}, v_{32}\}$ and $P_4 = \{v_{41}\}$, and these two compose a separate CPS:

$$CPS_2 = \{[v_{31}, v_{41}], [v_{32}, v_{41}]\}$$

According to the input ordering convention of the workflow execution engine, the existing CPSs of a node must be "paired" as a common dot product group, where association is performed according to the indices of the elements of the CPSs. If the lengths of the participating CPSs are different, the longest determines the number of job submissions, and the elements belonging to the missing indices are replaced by the first element. In the case of our job submission Example 2 the job instances would receive the following 6 quadruples:

$$CPS_1 * CPS_2 = \{[v_{11}, v_{21}, v_{31}, v_{41}], [v_{11}, v_{22}, v_{32}, v_{41}], [v_{11}, v_{23}, v_{31}, v_{41}],$$ ← exhausted $CPS_2$
$$[v_{12}, v_{21}, v_{31}, v_{41}], [v_{12}, v_{22}, v_{31}, v_{41}], [v_{12}, v_{23}, v_{31}, v_{41}]\}$$
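The two rules can be stated compactly in code. The following is our own sketch of the assumed combination semantics (cross product inside a CPS, index-wise dot product across CPSs, with a shorter CPS padded by its first element); it reproduces the six quadruples above:

```python
from itertools import product

def cross(*ports):
    """One CPS: Cartesian product of all files on the participating ports."""
    return [list(combo) for combo in product(*ports)]

def dot(*cps_list):
    """Pair CPSs index-wise; a shorter CPS is padded with its first element."""
    length = max(len(cps) for cps in cps_list)
    rows = []
    for i in range(length):
        row = []
        for cps in cps_list:
            row.extend(cps[i] if i < len(cps) else cps[0])
        rows.append(row)
    return rows

P1, P2 = ["v11", "v12"], ["v21", "v22", "v23"]
P3, P4 = ["v31", "v32"], ["v41"]
CPS1, CPS2 = cross(P1, P2), cross(P3, P4)   # 6 and 2 combinations
for combo in dot(CPS1, CPS2):               # 6 quadruples, as in Example 2
    print(combo)
```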
Figure 5 demonstrates the power and possibilities of the data-driven job generation controlled by the advanced parameter sweep features of the WS-PGRADE workflow manager. It is assumed that node C-1 receives parameter sweep datasets on each of its Parametric input ports, and there is a cross product relation among the input ports, which means that each element of a given input set must be combined with the members of the other sets participating in the cross product relation. If the production system receives i files on port 1 and j files on port 0, then i ∗ j combinations must be calculated, i.e., i ∗ j jobs will be generated from node C-1, and each of the independently running i ∗ j job instances will create a single output file on the output port, so overall i ∗ j output files will finally be created. Notice that node C-1 works as a parameter sweep node that runs as many times as there are parameter combinations on its Parametric input ports. The advantage of the WS-PGRADE workflow concept is that such Parametric nodes can be placed anywhere in the WS-PGRADE workflow without any restrictions. Node C-2 is configured as a Generator node. This means that a single run of the node may produce more than one output file on one (or more) distinguished Generator output port(s) of the node. The number of outputs does not need to be fixed. In our example, this job receives a parameter sweep dataset containing k input files. The k inputs trigger k job executions (k independent job instances) and an unpredictable number of output files (denoted by the sum s in the figure), where each job execution may produce a different number of outputs. Node C-3 receives aggregated inputs from the preceding nodes on its two Parametric input ports, which are organized as cross product ports. u = r ∗ s jobs will be created and executed because of the cross product relation of its Parametric input ports. Since there is no Generator output port defined, each job execution will create one output file on the output port, so overall r ∗ s output files will be created. The Parametric input ports of node D-4 are in dot product relation. Dot product in our terminology means the pairing of inputs according to the common index of the enumerated members of
Fig. 5 Example workflow presenting construction capabilities
the constituent input datasets. If the size of one constituent dataset is less than the size of the biggest set involved in the relation, then the missing part of the pair is replaced by the member having the lowest index. It follows that the example job will be executed max(r, s) times, i.e., either r = i ∗ j or s times, whichever is greater. Let us denote max(r, s) by t. Node DCo-5 is a Collector node, since one of its input ports is a Collector port. The execution of the job will not start until each expected input file of port "0" (the Collector port) has arrived, and a single job instance would execute the whole set if there were no other input port. However, the example node has another Parametric input port where t input files will arrive. Therefore, it will be executed t times, because its Parametric input ports are in dot product relation and the files arriving on input port 1 force the creation of t job instances. Each job instance gets the whole u-sized set of files at the Collector port 0 and the next member of the t-sized collection arriving at port 1. The Parametric input ports of node D-6 are in dot product relation. The job will be executed t times, pairing the inputs of equal sizes, resulting in t output files altogether. Notice that all the nodes of the example workflow are executed as parametric nodes, but their execution numbers differ according to the number of input files, the types of input ports and the ordering of input ports.
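As a concrete illustration, with purely hypothetical numbers: suppose i = 2 and j = 3, so node C-1 runs r = i ∗ j = 6 times, and suppose the Generator node C-2 happens to emit s = 4 output files in total. Then

$$u = r \cdot s = 6 \cdot 4 = 24, \qquad t = \max(r, s) = \max(6, 4) = 6,$$

so node C-3 runs 24 times, while nodes D-4 and D-6 run 6 times each; for the dot product pairing, the 4-element set contributes its members at indices 0 to 3 and its lowest-indexed member again at indices 4 and 5.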
Consequences of Input Port Ordering on the Internal Interpretation

The length limitations of the current paper do not allow us to explain in detail the internal workflow interpretation methods of the parameter sweep language definitions explained above. However, a short illustration on a simple example can help the reader understand the main ideas of the interpretation. The example below shows a three-node workflow (see the upper part of Fig. 6, "Configuration") which consists of a cascade of Generator nodes (Gen1, Gen2) and a subsequent node DotPrInp. These example nodes handle input/output data files containing just one integer number. Gen1 and Gen2 each produce, in one job step, as many output files as the integer value N found in their input file, and the values of these output files are the integers 1, 2, ..., N in a random order. There is a dot product (pairing) relation between the input ports of the node DotPrInp. The associated job executes a multiplication operation on the received input arguments, and the result is written to the output port. The "Execution" part of Fig. 6 shows the principles of the internal interpretation. The created job instances of a given node receive "pid" numbers (parametric identifiers: the index of a given node instance within a parametric job execution; see the circles). The files arriving at and leaving the job instances are indicated by blocks, where the central number indicates the
Fig. 6 Example 1 of internal workflow interpretation and generation of output data
value of the file, and the small number in the upper right corner indicates the identifier of the file. This identifier is used in the "pairing" operation. The input port ordering required by the job DotPrInp pinpoints a nontrivial restriction that must be enforced: in order to get deterministic results (by recalculating the ordering indices of output files on the basis of pid numbers), each instance of Gen2 (pid = 0, pid = 1, pid = 2) must be terminated before any instance of DotPrInp (pid = 0, pid = 1, pid = 2, pid = 3, pid = 4, pid = 5) can be started. This restriction would not be enforced in the case of a cross product relation among the input ports of node DotPrInp. In that case all files produced by Gen2 could be combined immediately with the set of files available on port(1) of DotPrInp, even in a scenario when, for example, the third instance (pid = 2) of Gen2 has terminated but the second instance (pid = 1) has not.
3.1.5 Condition-Dependent Node Execution

The user has the ability to make the execution of a workflow node dependent on the content of any of its input files. The possible operations for testing are equality, inequality and containment. The possible entities to check against are user-defined values or another input file of the job. For example, the user may restrict the execution of a node to the case where its input port's file contains the value 10, or is the same as the file on one of its other input ports. A minimal sketch of such a check is shown below.
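The following is our own illustration of the assumed condition semantics (the operation names and file paths are hypothetical, not WS-PGRADE's actual configuration keys):

```python
# Evaluate a node's firing condition against the content of an input file.
def check_condition(path, operation, operand):
    with open(path) as f:
        content = f.read().strip()
    if operation == "equals":
        return content == operand
    if operation == "not_equals":
        return content != operand
    if operation == "contains":
        return operand in content
    raise ValueError(f"unknown operation: {operation}")

# The comparison value may be a user-defined constant or the content of
# another input file of the same job:
def file_operand(path):
    with open(path) as f:
        return f.read().strip()

# e.g. run the node only if the file on input port 0 contains "10":
# if check_condition("input_0", "contains", "10"): activate_node(...)
```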
Fig. 7 Error propagation
Both the application of condition-dependent node execution and an eventual error during the execution of a job instance may cause a situation where the originally expected number of files does not arrive on a Collector port. The question arises how to ensure that the job associated to the Collector port will be able to fire at all, rather than waiting for files which will never arrive. Figure 7 demonstrates such a situation: let us assume that the job instance pid = 2 associated to the node Job1 will not be executed. To ensure the execution of the collector job associated to node Coll, its collector port 0 must be notified that the third expected file, from the pid = 2 instance of Job2, will not arrive. The solution is the directed graph routing based error propagation algorithm applied in the workflow interpreter: the workflow execution engine puts all the directly and indirectly non-executable job instances into a special "repercussion" state. When testing the fire condition of a collector port, the workflow engine calculates the number of invalid job instances attached to the source side of the arc belonging to the collector port. With this information it can be decided whether all living job instances have terminated, i.e., whether the expected number of input files has already been generated. An important question arises: why use this error propagation and insulation method in the case of a failed job instance, which may invalidate the
overall calculation of the workflow? The answer is the policy of "greedy calculation". The workflow calculation may be a long and very expensive process. When a part of a big parameter study cannot be executed because of a run-time error in a single isolated calculation, it does not mean that the rest of the result set is worthless. In this situation the workflow interpreter indicates the error, but does not kill the overall calculation. Of course, gUSE tries to resubmit failed job instances in case of an error of the middleware. For example, in the case of gLite jobs, job resubmission is performed at most three times, given that a middleware error is detected.

3.1.6 Embedded Workflow Support

Another powerful feature of WS-PGRADE workflows is the possibility to embed workflows into workflow nodes. Thus, instead of running, for example, an executable within a workflow node,
another WS-PGRADE workflow may run inside the parent workflow node. The embedded workflow feature is shown in Fig. 8. The main workflow has two normal jobs, denoted with B and C, whereas node A contains an embedded workflow. Thus, if gUSE is about to run node A, it actually runs the embedded workflow, passing the input files of the A node to the jobs of the embedded workflow. The assignment of inputs to the embedded workflow's nodes can be defined by the user. Once the embedded workflow has finished, its user-specified output files are assigned to the outputs of the parent A node. This feature of gUSE is heavily used in the SHIWA project [30], where WS-PGRADE/gUSE is used as the core technology to solve the coarse-grained interoperability (CGI) of different workflow systems through the SHIWA Simulation Platform. The main idea of the CGI approach in SHIWA is that not only WS-PGRADE workflows can be embedded into a node but also many other types of workflows (ASKALON, Galaxy, Kepler, MOTEUR, Taverna, Triana, etc.), using the built-in GEMLCA service [8] of WS-PGRADE/gUSE.
3.2 WS-PGRADE Workflow Developer UI

As described in Section 2, there are two distinct WS-PGRADE user roles: the workflow developer and the end-user. Users of the first group are able to exploit all the features of WS-PGRADE, whereas end-users can only see a limited interface of WS-PGRADE strictly tied to configuring and running existing workflows. Since the workflow developer user interface is a superset of the end-user interface, within this section
Fig. 8 Embedded workflow example
we introduce all the user interface components available for workflow developers. WS-PGRADE portlets are gathered in the following groups: workflow-related portlets, storage-related portlets, security-related portlets, information-related portlets and some additional portlets (for example, the user's manual and notification settings). Within this subsection we give a short overview of those WS-PGRADE portlets that are related to workflow creation, configuration, execution and monitoring. Due to space limitations, other important portlets like the security-, storage- and information-related portlets are not discussed in this paper. Interested users can find detailed information on those portlets in the WS-PGRADE User Manual [20]. The group of workflow-related portlets enables users to create, configure, execute and monitor their workflows. The following portlets are offered:

Graph: this portlet gives the user an overview of the existing workflow graphs, and offers a way to start the Graph Editor.

Create concrete: this portlet allows users to create concrete workflows based on graphs, templates or other concrete workflows.

Concrete: this is the most important portlet of WS-PGRADE, enabling users to set all the properties of their workflows for execution. Figure 9c shows a screenshot with the list of a user's workflows. Here users may configure, submit, abort or view details of workflows. An example screenshot of a workflow's configuration is shown in Fig. 9a. An example of a workflow's details is shown in Fig. 9b. The screenshot presents a workflow where two instances of the workflow have already finished,
Fig. 9 Concrete workflow views: (a) concrete workflow configuration example; (b) concrete workflow details while running; (c) concrete workflow list
and one workflow instance is in progress (that is, it is running). Some jobs of this workflow instance are shown below the instance list. As can be seen, a number of job instances have finished, some are running, and many of them are waiting or are in the init status. Here the user has the possibility to get details of the different statuses. For example, in the case of a finished job the user may take a look at the standard output and error, or may download the results of the different job instances.

Import: this portlet allows users to fetch existing workflows from gUSE's internal Application
Repository, thus implementing basic collaboration possibilities between different users.

Upload: this portlet offers a way to upload workflows stored on the user's filesystem into the Application Repository, thus allowing workflow developers to publish ready-to-use workflow applications for the end-users.

Export: as part of the Concrete workflow portlet, users may export their workflows into a repository so that other users may use the Import portlet to start using them.

Download: as part of the storage portlet, users may download their workflows onto their
machines, so publishing workflows among portal instances is solved by using the Download and Upload functionalities.
4 DCI Bridge

The DCI Bridge is a recent development in the WS-PGRADE/gUSE portal framework, intended to provide flexible and versatile access to all the important DCIs applied in Europe. In previous versions of gUSE, as many submitters had to be developed as there were different DCIs we wanted to support. This required more and more effort as we extended the scope of supported DCIs. In order to reduce the development work, and also to enable the use of all these DCIs by other science gateways, we have developed a new service called the DCI Bridge, as shown in Fig. 10. The DCI Bridge is a web service based application that provides standard access to various distributed computing infrastructures (DCIs) such as Grids, desktop Grids, clusters, clouds and service-based computational resources (it connects through its DCI plug-ins to the external DCI resources). The main advantage of using the DCI Bridge as a web application component of workflow management systems is that it enables workflow management systems to access various DCIs using the same well-defined communication interface. When a user submits a workflow, its job components can be submitted transparently into the various DCI systems using the OGSA Basic
Fig. 10 Schematic overview of the use of DCI Bridge
Execution Service 1.0 (BES) interface. As a result, the access protocol and all the technical details of the various DCI systems are completely hidden behind the BES interface. The standardized job description language of BES is JSDL. Additionally, the DCI Bridge grants access to a MetaBroker service called GMBS [28]. This service acts as a broker among different types of DCIs: upon user request it selects an adequate DCI (and, depending on the DCI, an execution resource as well) for executing the user's job. Just like the DCI Bridge, GMBS accepts JSDL job descriptions, and makes use of the DCI Bridge service to actually run the job on the selected DCI. As a consequence, nodes of a complex workflow can run simultaneously in different DCIs. For example, a parameter sweep node that should be executed several thousand times can run in a BOINC-based desktop Grid, while other nodes of the workflow that run only several hundred times can run in a gLite, ARC, Globus or UNICORE type of Grid. If these resources are not enough, the execution can be continued on academic and commercial clouds.

4.1 Architecture of the DCI Bridge

The DCI Bridge is based on four main components: the Resource Registry, the Application Management, the Runtime System and the Monitor component. All components of the DCI Bridge can run within a Tomcat-based web container. The Resource Registry subsystem provides an online configuration interface to configure the
accessible DCIs. It also provides information about the configured resources to other external software components. The main components of this subsystem are:

– Online configuration interface
– ResourceConfiguration service
A wide range of different types of DCIs is supported by the DCI Bridge, and the number of supported DCIs is constantly growing. So far the following DCIs have been supported: gLite, GT2, GT4, ARC, UNICORE, PBS, LSF, web services, BOINC, Google App Engine, GEMLCA and local resources. The authentication mechanism of the common portal container is used by all online graphical user interfaces, so the Online configuration interface is also accessible through the same user database. To visualize the available resources, the ResourceConfiguration service provides both http and https communication channels for the users. Based on these services, the Resource portlet gives details on the configuration of the different Grid middleware supported by the connected DCI Bridge service. Figure 11 shows a screenshot of the Resource portlet displaying information about a DCI Bridge service connected to a number of DCIs. The Application Management subsystem is the implementation of the BES-Management port-type defined in Section 5 of the OGSA Basic Execution Service 1.0 specification, which makes it possible to supervise software-based access to the BES Factory service. Please note that in
Fig. 11 Resource portlet
this context, Application refers to the Application term of OGSA BES. The Runtime System is responsible for running the incoming jobs on the selected DCIs. The subsystem can be called via the BES WSDL, and it implements the operations defined by the OGSA Basic Execution Service 1.0 specification on different Grid/cloud/service based middleware. The separate DCIs are handled by corresponding plug-ins, and their number can be increased without any restriction. The main components of this subsystem are shown in Fig. 12. The Runtime System accepts standardized JSDL job description documents from the Workflow Interpreter (WFI) service. These documents are based on a well-defined XML schema containing information about the job inputs, binaries, runtime settings and output locations. The core JSDL itself is not powerful enough to fit all needs, but fortunately it has a number of extensions. For example, the DCI Bridge makes use of the JSDL-POSIX extension. Beside the JSDL-POSIX extension, the DCI Bridge makes use of two legacy extensions: one for defining execution resources and one for proxy service and callback service access. The execution resource extension is needed both by the core DCI Bridge, in order to define specific execution resource needs, and by the Metabroker service. The proxy service extension is needed for jobs targeted at DCIs which rely on X.509 proxy certificates for job submission. The callback service extension is required if status change callback functionality is needed: the DCI Bridge will initiate a call to the service specified in the extension upon every job status change.
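For illustration, a minimal JSDL document with the JSDL-POSIX extension can be assembled as below (a sketch using the standard JSDL 1.0 namespaces; the DCI Bridge's legacy extensions are omitted, since their element names are not given in the paper):

```python
import xml.etree.ElementTree as ET

JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"

job = ET.Element(f"{{{JSDL}}}JobDefinition")
desc = ET.SubElement(job, f"{{{JSDL}}}JobDescription")
app = ET.SubElement(desc, f"{{{JSDL}}}Application")
posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
ET.SubElement(posix, f"{{{POSIX}}}Executable").text = "/bin/hostname"
ET.SubElement(posix, f"{{{POSIX}}}Output").text = "stdout.txt"

# The serialized document is what a BES CreateActivity call would carry.
print(ET.tostring(job, encoding="unicode"))
```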
Fig. 12 Internal architecture of the DCI Bridge Runtime System
User credential handling of the DCI Bridge is based on a content-based approach instead of a channel-based one. This means that user credentials (proxies or SAML assertions) are not handled by the communication channel, but are rather specified in the JSDL extension mentioned earlier. This approach allows the DCI Bridge to implement various DCI-dependent credential handling options within the different DCI plug-ins instead of relying on the capabilities of the servlet container running the DCI Bridge service. Of course, it is still possible to run the DCI Bridge as a secured service (for example, as a service accessible through authenticated HTTP, HTTPS or even HTTPG), but credentials used for establishing the connection to the service are not passed to the destination plug-in; it solely makes use of the credentials described in the JSDL extension. Even though the different DCIs use different middleware and access mechanisms that are not
compatible with each other, they provide similar services for the users. To overcome this incompatibility issue, the DCI Plugin Manager (a plug-in based framework with a common interface for all the different DCIs) was defined and implemented. Every DCI plug-in runs in a separate thread. In order to increase the utilization of a DCI, multiple plug-in threads can be launched and assigned to the same DCI entity. These threads feed the same DCI at the same time in a concurrent manner, as sketched below. The final component of the DCI Bridge is the Monitor subsystem, which handles and visualizes the logs and messages of the DCI Bridge, the plug-ins and the running jobs.
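A minimal sketch of this threading arrangement (our own illustration of the assumed model, not the DCI Bridge's actual code):

```python
import queue
import threading

def plugin_worker(dci_name, jobs):
    # one DCI plug-in thread: drain the DCI's queue and submit each job
    while True:
        job_id = jobs.get()
        if job_id is None:            # sentinel: stop this worker
            break
        print(f"[{dci_name}] submitting {job_id}")
        jobs.task_done()

glite_jobs = queue.Queue()
workers = [threading.Thread(target=plugin_worker, args=("gLite", glite_jobs))
           for _ in range(3)]         # three plug-in threads feed one DCI
for w in workers:
    w.start()
for job_id in ("job-1", "job-2", "job-3", "job-4"):
    glite_jobs.put(job_id)
glite_jobs.join()                     # wait until all jobs are processed
for _ in workers:
    glite_jobs.put(None)              # shut the workers down
```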
4.2 Job Submission Through the DCI Bridge

When a workflow is submitted through the WS-PGRADE GUI, the workflow nodes are parsed and submitted by the Workflow Interpreter (WFI) to the DCI Bridge as individual jobs, one by one. The major steps executed by the DCI Bridge components are explained below. The reader is advised to use Fig. 12 to follow the steps. During the explanation we assume that the user configured the workflow node to run on a local resource. This assumption simplifies the explanation without missing any relevant step.

– External job submission: The WFI (inside gUSE) initiates the job submission (see Fig. 12). The job waits in the WFI's submit pool and, when processed, a job description XML file (JSDL) is generated according to its job configuration. The job is submitted through the BES Factory service, i.e. the WFI calls the BES Factory service of the DCI Bridge with the generated JSDL.
– Job registration: The BES Factory service receives the job and sends it into the Job Registry as an object for storage. The Job Registry creates references to the job object and returns the reference and a job ID to the BES Factory service.
– Prepare job for submission: The BES Factory service inserts the job reference into the Input Queue and sends the job ID and some status information back to the WFI. The job references in the Input Queue wait for processing. When job processing starts, an internal job directory is created. The executable(s) and all the inputs of the job are downloaded from the gUSE storage into the newly created directory. All job directories are stored locally under the same temporary directory path. The naming convention is simple: the full job ID is the directory name. In parallel to the download process, certificate assignment (using proxy certificates) and all other authentication/authorization related tasks take place during this step. Furthermore, the Metabroker service can be utilized if a DCI decision is required; this is an optional step in the procedure (currently not used).
– Internal job submission: If the job is ready to run, the job ID is forwarded to the Plug-in Manager. The Plug-in Manager uses its own queue to store all the pending job IDs. Each DCI can be utilized by multiple DCI plug-in threads. Each thread tries to process the assigned queues and submit jobs into the appropriate DCI. In our example the job uses local resources. In local submission the submit returns and the status query starts automatically. The job can finish with successful/failed status.
– Collect results from DCI: When the job has finished, its outputs are downloaded from the DCI. (In local submission the output is just copied between different directories.)
– Forward result to Upload Manager: The job reference is transferred into the queue of the Upload Manager. The job references in the Output Queue wait for processing. When job processing starts, the job outputs are uploaded to the storage.
– Feedback to WFI: The job status (e.g. finished/failed) is forwarded back through the Job Registry to the WFI.
– Internal clean-up: The remaining entities of the previous job processing (directory, generated files, outputs) are cleaned up; the temporary job directory and the job references in the Job Registry are deleted.
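From the WFI's point of view, the whole exchange reduces to a submit-and-poll loop over the BES interface. The sketch below is hypothetical (the bes_client object and its method names are illustrative; the real calls are the SOAP operations CreateActivity and GetActivityStatuses defined by OGSA-BES):

```python
import time

def submit_and_wait(bes_client, jsdl_document, poll_seconds=30):
    # CreateActivity hands over the JSDL and returns an activity (job) ID
    job_id = bes_client.create_activity(jsdl_document)
    while True:
        # GetActivityStatuses reports one of the OGSA-BES basic states:
        # Pending, Running, Cancelled, Failed or Finished
        state = bes_client.get_activity_status(job_id)
        if state in ("Finished", "Failed", "Cancelled"):
            return job_id, state
        time.sleep(poll_seconds)
```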
4.3 DCI Bridge in a Multi-Grid Multi-Node Installation Scenario

Submitting a job to a Grid system usually requires that the submission engine is placed on a user interface machine of the given Grid; for example, gLite requires a gLite UI service to submit a job. Unfortunately, in many cases these UI services cannot be operated on the same physical machine, and hence it is not trivial to submit jobs from a single portal to several different Grids in parallel. The DCI Bridge provides an elegant solution to this problem. As can be seen in Fig. 13, one WS-PGRADE installation can make use of a number of DCI Bridge services. The different DCI Bridge deployments run on different Grid user interface machines connected to the different Grid infrastructures. In such a setup, the users of a WS-PGRADE installation may use very different DCIs through a unified user interface, without the need to use different services to access the different middlewares.
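A minimal sketch of this routing idea follows. The class name and endpoint URLs are purely illustrative assumptions; the only element grounded in the text is that every DCI Bridge deployment exposes the same BES submission interface, so a job can be directed to the bridge installed on the UI machine of the target middleware.

    import java.util.Map;

    // Illustrative sketch: route each job to the DCI Bridge deployed on the
    // user-interface machine of the target middleware. The endpoint URLs are
    // hypothetical; the common element is the shared BES interface.
    public class BridgeRouterSketch {

        private final Map<String, String> bridgeEndpoints = Map.of(
                "glite",   "https://glite-ui.example.org/dcibridge",
                "arc",     "https://arc-ui.example.org/dcibridge",
                "unicore", "https://unicore-ui.example.org/dcibridge");

        /** Pick the DCI Bridge endpoint for the middleware named in the job. */
        public String endpointFor(String middleware) {
            String url = bridgeEndpoints.get(middleware);
            if (url == null) {
                throw new IllegalArgumentException("no DCI Bridge for " + middleware);
            }
            return url; // the caller submits its JSDL to this BES endpoint
        }
    }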
Fig. 13 Multi-node installation of the DCI Bridge service
4.4 DCI Bridge in a Load-Balancing Scenario

The usage of several DCI Bridge services can also provide load-balancing between multiple DCIs. In this scenario load-balancing is implemented through a central DCI Bridge that distributes the incoming jobs in a cascade arrangement to other DCI Bridges. The outline of the scenario is shown in Fig. 14.
Fig. 14 DCI Bridge load-balancing scenario
In this scenario the core gUSE services are aware of a single DCI Bridge installation. Although this DCI Bridge service is not connected to any DCI, it can forward the jobs it receives to other DCI Bridge deployments, since they all use the same submission interface and job description language. In this way the central DCI Bridge service can distribute the incoming jobs among the other services it is aware of. After the jobs are
distributed, they can report their job status back to the central gUSE services using the callback JSDL extension described earlier. Typically, one may set up a number of DCI Bridge services to submit jobs to different middlewares, or to the same middleware with different entry points. In this case the central DCI Bridge, based on the responses from the other DCI Bridge services, may send individual jobs to the other DCI Bridge services so as to achieve load balancing.
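The cascade arrangement can be sketched as follows. The sketch is an assumption of how such a forwarder could look, not the DCI Bridge implementation; only the facts that the bridges share one BES-style submission interface and that the central bridge decides based on responses from the others come from the text.

    import java.util.Comparator;
    import java.util.List;

    // Illustrative sketch (not DCI Bridge source code): a central bridge that
    // owns no DCI itself and forwards each incoming JSDL document to the
    // least-loaded downstream DCI Bridge over the shared BES interface.
    public class CascadingBridgeSketch {

        /** Hypothetical client stub for a downstream DCI Bridge's BES endpoint. */
        interface BesEndpoint {
            int pendingJobs();                  // load reported by the bridge
            String createActivity(String jsdl); // submission call, returns a job ID
        }

        private final List<BesEndpoint> downstreamBridges;

        public CascadingBridgeSketch(List<BesEndpoint> downstreamBridges) {
            this.downstreamBridges = downstreamBridges;
        }

        /** Forward a job to whichever bridge currently reports the lightest load. */
        public String forward(String jsdl) {
            BesEndpoint target = downstreamBridges.stream()
                    .min(Comparator.comparingInt(BesEndpoint::pendingJobs))
                    .orElseThrow(() -> new IllegalStateException("no downstream bridge"));
            return target.createActivity(jsdl);
        }
    }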
5 Creating User Specific Gateways from the WS-PGRADE/gUSE Portal Framework

Concerning the usage of portals, three different user groups can be identified. The first and largest group comprises the end-user scientists, who on the one hand would like to use rich, distributed computational infrastructures for their scientific research, but on the other hand are not interested in acquiring technical knowledge about the usage of these distributed systems. Therefore all the complexity of the underlying run-time systems must be hidden from them as much as possible, and a convenient user interface should be provided that is in line with their own research area. These users are denoted with C in Fig. 2; in fact, they are the consumers of ASM-based customized gateway systems. The second, smaller group of users is called Workflow Developers. They are qualified to adapt scientific applications to DCIs and to manage such systems. These users are denoted with A in Fig. 2, and they are supposed to use the workflow development facilities of the WS-PGRADE/gUSE portal framework. The last and smallest group is the set of User Interface Developers. Typically they are web programmers who are able to develop user-friendly web applications customized for a specific scientific application created by the Workflow Developers. The User Interface Developers are supported by the ASM in developing the application-specific, user-friendly web interface, or by the Remote API in using and adapting their existing user environments. The WS-PGRADE/gUSE portal framework was
designed to serve all these types of users and to provide them with services that are in line with their expertise. The two application-specific access possibilities of gUSE are detailed in this section.

5.1 Application Specific Module

5.1.1 Usage Scenario

According to Fig. 15, the Workflow Developer first adapts a specific scientific application to the WS-PGRADE/gUSE portal framework by creating a workflow structure. The workflow must then be configured by uploading executables for the jobs, setting input and output naming conventions, and selecting the computing and storage resources to be used. When it is configured and tested correctly, the Workflow Developer exports the ready-to-use workflow application to the local Application Repository of gUSE. In the second step the User Interface Developer develops an easy-to-use interface according to the needs of the end-users (typically the necessary information is mediated by the Workflow Developer). As the end-users will execute applications via this interface, the features and services of the core gUSE system must be made accessible to it; the Application Specific Module is an API for establishing this connection in a convenient and easy way. Finally, an end-user can create a new clone of the prefabricated scientific application by downloading it from the Application Repository into his/her own user space, adjusting its settings, executing it and getting the results, as shown in Fig. 15.

5.1.2 Provided Functions

The main advantage of ASM is that it hides the complexity of the inner abstraction levels and the internal calls of the different core services of gUSE. Without this component, one or more complicated web-service calls would have to be constructed each time a customized portlet needs to get or pass information from/to the portal. In order to avoid this complexity, ASM covers all of these internal information accesses by a simple call of a well-parameterized function.
Fig. 15 User scenario of ASM-based portlet
Moreover, ASM requires only the truly necessary parameters, such as the name of the user or the id of the workflow; in the usual cases this information can easily be retrieved from the portal container or by ASM itself. The functionality provided by ASM can be separated into three subsets: (i) methods covering application management issues, (ii) methods that can be used for input/output manipulation, and (iii) methods handling user activities during execution, such as aborting or rescuing applications. Within the set of application management methods, shown in Table 1, several methods are available for getting information about the workflows stored in the local repository: getting the list of application developers, getting the list of applications of a specified developer, importing an application into the local user space, and getting the list of applications that have already been imported. The input/output manipulation set covers the methods shown in Table 2 for handling different input cases, such as uploading a file to a specified port, selecting a file that already exists on the portal server, or setting command-line parameters for a job; some methods of this set also fetch the outputs of the calculations. The set of execution methods, detailed in Table 3, contains methods not only for workflow submission but also for obtaining the workflow execution status in simple or detailed format, and for aborting or rescuing a workflow.
Table 1 Application management methods of ASM API

getWorkflowDevelopers
  Parameters: type: enumeration, possible values are "Application", "Workflow", "Graph"
  Returns: Array of Strings
  Description: Gets the list of application developers that have exported at least one application of the given type to the gUSE Application Repository.

getASMWorkflows
  Parameters: String userId
  Returns: List of ASMWorkflow objects
  Description: Returns the list of applications that have already been imported by the user identified by userId.

ImportWorkflow
  Parameters: String userId, String newWorkflowName, String developerId, String workflowType, String workflowName
  Returns: Void
  Description: Imports the application identified by workflowName and exported by developerId from the gUSE Application Repository into the space of userId; the application will be named newWorkflowName.

DeleteWorkflow
  Parameters: String userId, String workflowId
  Returns: Void
  Description: Removes the workflow called workflowId from the space of userId.
Table 2 Settings manipulation methods of ASM API

getFiletoPortalServer
  Parameters: String userId, String workflowId, String jobId, String portId
  Returns: Void
  Description: Downloads the file associated with the portId of jobId contained by workflowId to the portal server.

uploadFiletoPortalServer
  Parameters: FileItem file, String userId, String fileName
  Returns: Void
  Description: Uploads file to a temporary space of userId and names it fileName.

placeUploadedFile
  Parameters: String userId, File fileOnPortalServer, String workflowId, String jobId, String portId
  Returns: Void
  Description: Associates a file that has already been uploaded to the portal server with the portId of jobId contained by workflowId.

getRemoteInputPath
  Parameters: String userId, String workflowId, String jobId, String portId
  Returns: String
  Description: Returns the current remote input path associated with the specified job and port.

setRemoteInputPath
  Parameters: String userId, String workflowId, String jobId, String portId, String newRemotePath
  Returns: Void
  Description: Sets the remote input path to newRemotePath for portId of jobId contained by workflowId.

getRemoteOutputPath
  Parameters: String userId, String workflowId, String jobId, String portId
  Returns: String
  Description: Returns the current remote output path associated with the specified job and port.

setRemoteOutputPath
  Parameters: String userId, String workflowId, String jobId, String portId, String newRemotePath
  Returns: Void
  Description: Sets the remote output path according to the string passed as parameter.

getCommandLineArg
  Parameters: String userId, String workflowId, String jobId
  Returns: String
  Description: Returns the command line argument currently set for jobId of workflowId.

setCommandLineArg
  Parameters: String userId, String workflowId, String jobId, String commandline
  Returns: Void
  Description: Sets the command line argument for jobId of workflowId.

getResource
  Parameters: String userId, String workflowId, String jobId
  Returns: ASMResourceBean
  Description: Returns the resource to which the job is currently being submitted.

setResource
  Parameters: String userId, String workflowId, String jobId, String type, String Grid, String resource, String queue
  Returns: Void
  Description: Sets the resource specified by Grid, resource and queue.

getNodeNumber
  Parameters: String userId, String workflowId, String jobId
  Returns: String
  Description: Returns the required number of nodes if the given job is an MPI job.

setNodenumber
  Parameters: String userId, String workflowId, String jobId, int nodenumber
  Returns: Void
  Description: Sets the number of required nodes to nodenumber for jobId.

Table 3 Execution management methods of ASM API

getDetails
  Parameters: String userId, String workflowId
  Returns: WorkflowInstanceBean
  Description: Returns the detailed status of each job of the workflowId being executed.

Submit
  Parameters: String userId, String workflowId
  Returns: Void
  Description: Submits the application named workflowId.

Rescue
  Parameters: String userId, String workflowId
  Returns: ASMService
  Description: Rescues the application named workflowId.

Abort
  Parameters: String userId, String workflowId
  Returns: ASMService
  Description: Aborts the application named workflowId.
As portlets are generally deployed in portlet containers that supervise the most common user activities and manage user sessions, ASM does not have to provide any security features, with the exception of issues related to the underlying infrastructures and complex systems. For these, ASM uses the inner-level solutions of gUSE, which require proxy certificates. These certificates are created and used with the help of dedicated Certificate portlets by the end-users, who, similarly to the members of the other user groups, have accounts on the portal.
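To illustrate how these methods combine in practice, the following minimal sketch shows the typical call sequence of an ASM-based portlet action (import, configure, submit, poll). The facade interface and the simplified signatures are assumptions derived from the method names in Tables 1, 2 and 3, and the workflow, developer and job names are hypothetical; the real ASM API is richer than shown here.

    // Minimal sketch of an ASM-based portlet action, assuming a facade object
    // exposing the methods listed in Tables 1-3 with simplified signatures.
    public class AsmPortletSketch {

        /** Hypothetical facade over the ASM API (signatures are assumptions). */
        interface Asm {
            void importWorkflow(String userId, String newName, String developerId,
                                String workflowType, String workflowName);
            void setCommandLineArg(String userId, String workflowId, String jobId,
                                   String commandLine);
            void submit(String userId, String workflowId);
            Object getDetails(String userId, String workflowId); // WorkflowInstanceBean in ASM
        }

        public void runPrefabricatedApp(Asm asm, String userId) {
            // 1. Clone a prefabricated application from the gUSE Application Repository
            asm.importWorkflow(userId, "myDocking", "devUser", "Workflow", "AutodockFlow");
            // 2. Adjust its settings, e.g. the command line of one job
            asm.setCommandLineArg(userId, "myDocking", "dockJob", "--ligand input.pdbqt");
            // 3. Submit it and poll its status on behalf of the end-user
            asm.submit(userId, "myDocking");
            Object status = asm.getDetails(userId, "myDocking");
            System.out.println("status: " + status);
        }
    }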
5.2 Remote API

5.2.1 Concept

This API follows a different concept from ASM. Here the users do not have to register to the WS-PGRADE portal, but they should have a valid, well-parameterized WS-PGRADE workflow (technically an XML file that describes the structure of the workflow, together with all local input files and binaries, compressed into one zip file). In order to submit a workflow through the Remote API, a valid proxy certificate is required. Since the users are not registered to the portal, the API call creates a new temporary user before every execution, and the workflow is submitted on behalf of this temporary user. The API provides methods for checking the workflow's status and for downloading the outputs of the execution. After the outputs have been downloaded, the workflow and the temporary user created for it are removed.
The Remote API is implemented as a simple servlet that can be installed as one of the gUSE services. Servlets can be called from anywhere: web interfaces (JavaScript), stand-alone applications, web services or anything else that supports the HTTP or HTTPS protocol. As a result, the Remote API is a powerful tool for exploiting gUSE capabilities independently of the WS-PGRADE portal interface (see Fig. 2).

5.2.2 Provided Functions

The Remote API covers the whole life-cycle of workflow management by providing an interface for workflow submission, for obtaining the status during execution in short or detailed format, for stopping, suspending or rescuing the workflow, and for downloading the outputs. Table 4 gives an overview of the functions with their names, required parameters, return values and descriptions. Security issues are solved in two ways. First, the Remote API has to comprise a solution for defending the portal server against illegal access or other attacks. Second, the API, as a gateway, has to contain a security mechanism defending the distributed systems against illegal access. To guarantee the security of the portal server, the Remote API can be set to use a password for authentication. This password can be added as a property of the Remote API and guarantees that only requests containing the password will be served. To defend the distributed systems, the Remote API leaves the authentication methods to the underlying gUSE system. In this respect the component is totally transparent: all security mechanisms supported by WS-PGRADE are supported here, too. All security-sensitive functions of the API (workflow submission and rescue) require the upload of a compressed file which contains the files used for authentication (such as X.509 proxy certificates in gLite's case, or the public part of SSH keys in the case of various clusters), following a defined naming convention. For instance, a file that authenticates the user in the GILDA Virtual Organization as a proxy certificate must be named x509up.GILDA. Compression does not just save I/O transfer cost between the client and gUSE: since gUSE supports the execution of multi-DCI workflows, the jobs of a workflow may be authenticated differently, which means that several authentication files may have to be transferred in one submission call, and handling one compressed file is more elegant and easier than handling the files one by one.
Table 4 Functions provided by the Remote API

Submit
  Parameters: certs.zip: zip file that contains the authentication files; pass: password to authenticate to gUSE; gusewf: the gUSE workflow; URL: url of the Remote API servlet
  Returns: String: an Id that identifies the workflow
  Description: Submits the workflow to gUSE.

Info
  Parameters: Id: the workflow runtime Id returned by Submit; pass: password to authenticate to gUSE
  Returns: Overall workflow status: submitted/running/finished/error/suspended/invalid
  Description: Accesses the status of the workflow.

Detailsinfo
  Parameters: Id: the workflow runtime Id returned by Submit; pass: password to authenticate to gUSE
  Returns: Workflow and job status
  Description: Accesses the status of the workflow and its jobs.

Stop
  Parameters: Id: the workflow runtime Id returned by Submit; pass: password to authenticate to gUSE
  Returns: True if the abort was successful, False otherwise
  Description: Aborts and deletes the workflow.

Download
  Parameters: Id: the workflow runtime Id returned by Submit; pass: password to authenticate to gUSE
  Returns: A compressed file that contains the outputs and logs
  Description: Downloads the produced results of the workflow.
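Since the Remote API is a plain servlet, a status query can be issued from any HTTP-capable client. The sketch below is an assumption of how such a call could look: the servlet URL and the URL layout (path and parameter encoding) are hypothetical, and only the parameter names (Id, pass) and the expected status values come from Table 4.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    // Illustrative sketch of calling the Remote API servlet over HTTPS.
    public class RemoteApiClientSketch {
        public static void main(String[] args) throws Exception {
            String servlet = "https://portal.example.org/remoteapi"; // hypothetical URL
            String workflowId = args[0]; // runtime Id returned by a previous Submit call
            String pass = args[1];       // password configured for the Remote API

            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(servlet + "/info?Id=" + workflowId + "&pass=" + pass))
                    .GET()
                    .build();

            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            // Expected body: submitted/running/finished/error/suspended/invalid
            System.out.println("workflow status: " + response.body());
        }
    }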
5.3 Comparison of APIs

As Fig. 2 shows, both the ASM and the Remote API support access to the gUSE services, but they do it in different ways. While ASM focuses on providing an API for developing portlets or other graphical user interfaces, the Remote API supports direct access to the gUSE services without the WS-PGRADE graphical user interface. ASM follows a concept where a prefabricated application is stored in, and accessible from, the local repository. In contrast, in the case of the Remote API the prefabricated workflow
can be taken from anywhere without any restriction. Concerning the semantic correctness of the input workflow, the Remote API is less strict than the ASM API: conceptually, any syntactically correct gUSE application can be uploaded to gUSE and executed via the Remote API, even if it is semantically incorrect. Using ASM, on the other hand, results in an interface that accesses applications from the Application Repository, and these applications are by definition assumed to be correct. As a result, ASM provides an API for creating user interfaces that are guaranteed to access semantically correct workflows (applications). Another difference between the two APIs concerns the storage space their users need. The Remote API creates a temporary user for each workflow submission and stores the output only until its first download; at that point all information related to the execution of this workflow is removed from the gUSE storage space. Therefore, the Remote API approach does not require extensive user storage space. ASM requires that the user be logged into the portal, and hence gUSE manages all workflow-related information belonging to the workflows initiated by this user as long as his/her quota is not exceeded. From the point of view of portal usage ASM is preferable, because in the case of ASM the quota is applied per user, while in
the case of the Remote API the quota is applied per workflow submission only. Finally, there is a difference concerning the goal of using these APIs. ASM is designed to be a convenient Java-based API that hides all the complexity of gUSE and provides functionality for powerful portlet programming. The Remote API is tailored to be a general interface through which either users or software tools can access the gUSE services.

5.4 Use Case of WS-PGRADE/gUSE

In this section we present an example of an application-specific gateway in the field of biology that is based on WS-PGRADE/gUSE and the ASM technology.

5.4.1 Proteomics Portal

Proteomics is a typical scientific area that uses and develops applications operating on large datasets and therefore needs large computational power. Although several portals provide access to different analysis tools working on proteomics data, most of them serve only specific use cases.
Fig. 16 Proteomics portal view
The Swiss Grid Proteomics portal, developed as a cooperation between ETH Zurich, SystemsX and MTA SZTAKI and based on the P-GRADE portal and its ASM, goes beyond this tight domain of use cases and lets the community of power users (Workflow Developers) create simple or complex workflows based on existing tools. In fact, three different workflows were created to serve the needs of the scientists: Spectra Quality Control (SQC), Trans-Proteomic Pipeline (TPP) and Label Free Quantification (LFQ); all of them are described in detail in [32]. The portal enables the execution of these workflows not just on local, powerful clusters but also on the whole national Grid system of Switzerland. The User Interface Developers also developed an application-specific user interface based on ASM and OpenBIS, a commonly used tool for storing and providing data and metadata. The new interface helps the biologists focus on input data selection and output data visualization in a convenient way, as shown in Fig. 16. Although the Swiss Proteomics portal is currently based on the P-GRADE portal technology,
the developers at ETH and SystemsX are working to adapt these solutions to the WS-PGRADE/gUSE portal framework and then upgrade the official production portal to the new version.
6 Related Work

The authors of [16] give an excellent overview of the portal framework technologies available for the life sciences; in fact, the portal framework technologies described in that paper are valid for other scientific areas, too. The authors claim that basically the following portal frameworks are available for production use: OGCE [1], GridPort [5], the Vine Toolkit [43] and the P-GRADE Portal [13]. Most of these portals were originally built on the GridSphere portlet container, but their recent versions have been ported to Liferay. However, [16] did not deal with the other type of portals, namely the application-specific science gateways. Recently these have become more and more popular, since they provide a really easy-to-use interface serving the direct and specific needs of a certain set of scientists. Since the number of such gateways has significantly increased, a short section on related research like this one cannot cover all of them; instead we show some recent gateways that represent typical approaches. First we present gateways that are customized from existing portal frameworks, and then we introduce gateways that were developed independently of existing portal frameworks. The DECIDE and INDICATE science gateways are customized based on the Genius portal framework. The science gateway of the DECIDE project [6] helps the early diagnosis of and research on brain diseases. The gateway is custom-built on Liferay, granting access to gLite-based infrastructures with a number of supporting services such as LDAP, Shibboleth and robot certificates for authentication, authorization and Grid access. The INDICATE project's e-Culture Science Gateway [22] offers a simple interface for accessing a number of digital cultural heritage repositories. This gateway is also based on Liferay, enabling predefined applications on gLite and EMI-based middlewares through LDAP authorization
and Shibboleth authentication, like the DECIDE Science Gateway. Job submission to the Grid infrastructures is solved by making use of robot certificates. As can be seen, the above new science gateway examples aim to offer a portal for one specific problem, and both of them employ new technologies like SSO through Shibboleth or robot certificates in order to provide easy Grid access for inexperienced Grid users. These technologies are essential for e-Science portals and will soon be available in the new releases of WS-PGRADE, too. The CancerGrid gateway, the ProSim gateway and the MosGrid gateway are customized from the WS-PGRADE/gUSE portal framework. The CancerGrid gateway [23] was developed to support access to a large molecule database and to enable the execution of three different molecule processing workflows in order to facilitate anti-cancer drug design; this was the first gateway where BOINC desktop Grids were accessible via the job submission mechanism of the portal. The ProSim gateway [29] was developed to support bio-scientists in the framework of the UK ProSim project. Bio-scientists can run a complex parameter sweep workflow containing AutoDock and GROMACS nodes in order to model carbohydrate recognition. The Swiss Grid Proteomics gateway was developed as a cooperation between ETH Zurich, SystemsX and MTA SZTAKI, and its recent version is based on WS-PGRADE/gUSE and its ASM concept. It enables the community of power users (Workflow Developers) to create complex workflows based on existing tools. Three different workflows have already been put into production to serve the needs of the scientists: Spectra Quality Control (SQC), Trans-Proteomic Pipeline (TPP) and Label Free Quantification (LFQ) [32]. The gateway enables the execution of these workflows not just on local, powerful clusters but also on the whole national Grid system of Switzerland. The MosGrid gateway [15] was developed for the German MoSGrid community in order to provide a chemistry-oriented workflow execution gateway for the UNICORE-based German Grid. One of the major customizations of WS-PGRADE was the modification of the security system in
order to support SAML (Security Assertion Markup Language), together with the extension of the available job submitter plug-ins with a UNICORE plug-in. The other important class of application-specific gateways contains gateways that were developed independently of existing portal frameworks. A typical example is the VisIVOWeb gateway [4], developed for astrophysics in order to enable the visualization of astrophysical objects on gLite-based Grid systems. Another example is the AMC biomedical gateway [38], which supports staff members of the AMC (Amsterdam Medical Centre) in processing large medical data sets on various types of Grid systems.
7 Conclusions

There are many gateways that support individual job submission to one or only a few DCIs. However, the WS-PGRADE/gUSE generic DCI gateway framework is the only one that provides a comprehensive workflow-oriented framework enabling the development, execution and monitoring of scientific workflows whose nodes can access a large variety of different DCIs, including clusters, Grids, desktop Grids and clouds. Another important feature of the WS-PGRADE/gUSE framework is that it can easily be adapted and customized to the special needs of various user communities in order to develop application-specific science gateways for these communities. The WS-PGRADE/gUSE framework has already been customized by several user communities for their needs, as described in the Introduction and in the Section on Related Work. These communities selected it because of its very flexible workflow system and its capability of submitting and managing workflows on a large variety of different DCIs. Recent developments of the WS-PGRADE/gUSE framework have further improved the workflow editing features and extended the list of accessible DCIs with new types (UNICORE, ARC, BOINC, PBS and LSF clusters, and commercial and academic cloud systems). A particularly important new
development in WS-PGRADE/gUSE is that the job submission system has been standardized, based on the BES interface defined by OGF, and separated from gUSE as an independent service called the DCI Bridge. As a result, if the developers of other science gateways implement a BES job submission client, they will be able to access all the DCIs supported by the DCI Bridge, and can thus save a lot of effort in providing access to the various DCIs. WS-PGRADE/gUSE is open source software distributed under the Apache license and can be downloaded from SourceForge [19]. Since its publication on SourceForge in February 2011, it has been downloaded more than 3000 times from 46 countries. There is an active user forum, whose messages have pointed out several drawbacks and restrictions of the current gateway. For example, an important difficulty came from the many possibilities of installing the framework; to solve this problem, an installation and configuration wizard has been developed. One important restriction appears in the WS-PGRADE workflow concept: there is no control-loop construct, which could be a problem for certain applications. There are other workflow systems where this facility is available; for example, MOTEUR and Askalon provide control loops, and hence users who need them can build their workflows in those systems. On the other hand, using the results of the SHIWA project, such workflows containing control loops can be executed in the WS-PGRADE/gUSE framework, too. WS-PGRADE/gUSE is the basis of the SHIWA Simulation Platform [39], which enables combining many different kinds of workflows (Taverna, Askalon, MOTEUR, GWES, Galaxy, WS-PGRADE, Kepler, Triana, Pegasus, etc.) into a single so-called meta-workflow via the SHIWA portal. The meta-workflow is a WS-PGRADE workflow in which any node can contain a workflow written in any workflow language supported by SHIWA. Typically, these workflows are downloaded from the SHIWA Workflow Repository, and the WS-PGRADE/gUSE framework provides a portlet and some other mechanisms to access this repository directly. In this way SHIWA can be considered as the generalization of the
WS-PGRADE workflow embedding concept and of the WS-PGRADE/gUSE Application Repository concept. The WS-PGRADE/gUSE framework will be further developed within the EU FP7 project SCI-BUS (SCIence gateway Based User Support) [42] in order to facilitate its customization towards application-specific science gateways; in SCI-BUS the ASM-based customization methodology and technology will therefore be further elaborated. The project will create and set up 27 different science gateways by customizing the WS-PGRADE/gUSE framework according to the special needs of the various user communities. In this way the WS-PGRADE/gUSE framework will become a major platform in Europe for building generic-purpose and application-specific gateways for all the major DCIs of Europe. This will be further strengthened by the integration of WS-PGRADE/gUSE with the CloudBroker Platform [31]. The integration is available from version 3.5.0 and enables access to a large variety of commercial and academic clouds. As a result, scientific workflows are executable not only on academic Grids but also on academic clouds, and if these resources are not enough for a large scientific simulation, even commercial clouds are usable. The detailed description of the integration of WS-PGRADE/gUSE with the CloudBroker Platform will be the subject of a forthcoming paper.

Acknowledgements The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no 283481 (SCI-BUS) and under grant agreement no 261585 (SHIWA).
References

1. Alameda, J., Christie, M., Fox, G., Futrelle, J., Gannon, D., Hategan, M., Kandaswamy, G., von Laszewski, G., Nacar, M.A., Pierce, M., Roberts, E., Severance, C., Thomas, M.: The Open Grid Computing Environments collaboration: portlets and services for science gateways: research articles. Concurr. Comput.: Pract. Exper. 19, 921–942 (2007)
2. Altintas, I., Berkley, C., Jaeger, E., Jones, M., Ludascher, B., Mock, S.: Kepler: an extensible system for design and execution of scientific workflows. In: 16th International Conference on Scientific and Statistical Database Management, Proceedings, pp. 423–424 (2004)
3. Balaton, Z., Farkas, Z., Gombas, G., Kacsuk, P., Lovas, R., Marosi, A.C., Terstyanszky, G., Kiss, T., Lodygensky, O., Fedak, G., Emmen, A., Kelley, I., Taylor, I., Cardenas-Montes, M., Araujo, F.: EDGeS: the common boundary between service and desktop Grids. In: Grid Computing, pp. 37–48. Springer US (2008)
4. Costa, A., Becciani, U., Massimino, P., Krokos, M., Caniglia, G., Gheller, C., Grillo, A., Vitello, F.: VisIVOWeb: a WWW environment for large-scale astrophysical visualization. Publ. Astron. Soc. Pac. 123, 503–512 (2011)
5. Dahan, M., Boisseau, J.R.: The GridPort toolkit: a system for building Grid portals. In: Proceedings of the 10th IEEE International Symposium on High Performance Distributed Computing, pp. 216–227. IEEE Computer Society, Washington, DC (2001)
6. DECIDE science gateway: http://applications.eu-decide.eu/ (2011). Accessed 28 Oct 2012
7. Deelman, E., Singh, G., Su, M.-H., Blythe, J., Gil, A., Kesselman, C., Mehta, G., Vahi, K., Berriman, G.B., Good, J., Laity, A., Jacob, J.C., Katz, D.S.: Pegasus: a framework for mapping complex scientific workflows onto distributed systems. Sci. Program. J. 13, 219–237 (2005)
8. Delaitre, T., Kiss, T., Goyeneche, A., Terstyanszky, G., Winter, S., Kacsuk, P.: GEMLCA: running legacy code applications as Grid services. J. Grid Computing 3, 75–90 (2005)
9. Doreleijers, J.F., Vranken, W.F., Schulte, C., Markley, J.L., Ulrich, E.L., Vriend, G., Vuister, G.W.: NRG-CING: integrated validation reports of remediated experimental biomolecular NMR data and coordinates in wwPDB. Nucleic Acids Res. 40(D1), D519–D524 (2012)
10. Ellert, M., Grønager, M., Konstantinov, A., Kónya, B., Lindemann, J., Livenson, I., Nielsen, J., Niinimäki, M., Smirnova, O., Wäänänen, A.: Advanced Resource Connector middleware for lightweight computational Grids. Future Gener. Comput. Syst. 23(2), 219–240 (2007)
11. Erwin, D.W., Snelling, D.F.: UNICORE: a Grid computing environment. In: Euro-Par 2001 Parallel Processing. Lecture Notes in Computer Science, pp. 825–834. Springer, Berlin (2001)
12. Fahringer, T., Prodan, R., Duan, R., Hofer, J., Nadeem, F., Nerieri, F., Podlipnig, S., Qin, J., Siddiqui, M., Truong, H.-L., Villazon, A., Wieczorek, M.: ASKALON: a development and Grid computing environment for scientific workflows. In: Taylor, I.J., Deelman, E., Gannon, D.B., Shields, M. (eds.) Workflows for e-Science, pp. 450–471. Springer, London (2007)
13. Farkas, Z., Kacsuk, P.: P-GRADE Portal: a generic workflow system to support user communities. Future Gener. Comput. Syst. 27(5), 454–465 (2011)
14. Farkas, Z., Kacsuk, P., Balaton, Z., Gombás, G.: Interoperability of BOINC and EGEE. Future Gener. Comput. Syst. 26(8), 1092–1103 (2010)
15. Gesing, S., Grunzke, R., Balasko, A., Birkenheuer, G., Blunk, D., Breuers, S., Brinkmann, A., Fels, G., Herres-Pawlis, S., Kacsuk, P., Kozlovszky, M., Krüer, J., Packschies, L., Schäfer, P., Schuller, B., Schuster, J., Steinke, T., Zikszay Fabri, A., Wewior, M., Müller-Pfefferkorn, R., Kohlbacher, O.: Granular security for a science gateway in structural bioinformatics. In: Proc. IWSG-Life 2011 (2011)
16. Gesing, S., van Hemert, J., Kacsuk, P., Kohlbacher, O.: Special issue: portals for life sciences - providing intuitive access to bioinformatic tools. Concurr. Comput.: Pract. Exper. 23(3), 223–234 (2011)
17. gLite: http://glite.web.cern.ch/glite/ (2010). Accessed 28 Oct 2012
18. gUSE install manual: http://sourceforge.net/projects/guse/files/3.4.4/docs/Portal_Installation_Manual_v3.4.4.pdf/download (2012). Accessed 28 Oct 2012
19. gUSE SourceForge webpage: http://guse.sf.net/ (2011). Accessed 28 Oct 2012
20. gUSE user manual: https://sourceforge.net/projects/guse/files/3.4.4/docs/Portal_User_Manual_v3.4.4.pdf/download (2012). Accessed 28 Oct 2012
21. Hull, D., Wolstencroft, K., Stevens, R., Goble, C., Pocock, M.R., Li, P., Oinn, T.: Taverna: a tool for building and running workflows of services. Nucleic Acids Res. 34, 729–732 (2006)
22. INDICATE e-Culture Science Gateway: http://indicate-gw.consorzio-cometa.it/ (2011). Accessed 28 Oct 2012
23. Kovács, P.K.J., Lomaka, A.: Using dedicated desktop Grid system for accelerating drug discovery. Future Gener. Comput. Syst. 27, 657–666 (2011)
24. Kacsuk, P.: P-GRADE portal family for Grid infrastructures. Concurr. Comput.: Pract. Exper. 23(3), 235–245 (2011)
25. Kacsuk, P., Karoczkai, K., Hermann, G., Sipos, G., Kovacs, J.: WS-PGRADE: supporting parameter sweep applications in workflows. In: Third Workshop on Workflows in Support of Large-Scale Science, WORKS 2008, pp. 1–10 (2008)
26. Kacsuk, P., Kovács, J., Farkas, Z., Marosi, A.C., Balaton, Z.: Towards a powerful European DCI based on desktop Grids. J. Grid Computing 9(2), 219–239 (2011)
27. Kaiser, H., Merzky, A., Hirmer, S., Allen, G., Seidel, E.: The SAGA C++ reference implementation: a milestone toward new high-level Grid applications. In: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, SC '06. ACM, New York (2006)
28. Kertész, A., Kacsuk, P.: GMBS: a new middleware service for making Grids interoperable. Future Gener. Comput. Syst. 26, 542–553 (2010)
29. Kiss, T., Greenwell, P., Heindl, H., Terstyanszky, G., Weingarten, N.: Parameter sweep workflows for modelling carbohydrate recognition. J. Grid Computing 8, 587–601 (2010). doi:10.1007/s10723-010-9166-8
30. Korkhov, V., Krefting, D., Kukla, T., Terstyanszky, G., Caan, M., Olabarriaga, S.: Exploring workflow interoperability tools for neuroimaging data analysis. In: 6th Workshop on Workflows in Support of Large-Scale Science (WORKS'11) (2011)
31. Kunszt, P., Malmström, L., Fantini, N., Sudholt, W., Lautenschlager, M., Reifler, R., Ruckstuhl, S.: Accelerating 3D protein modeling using cloud computing. In: Workshop on Computing Advances in Life Science, 7th IEEE International Conference on e-Science, Stockholm (2011)
32. Kunszt, P., Pernas, L.E., Quandt, A., Schmid, E., Hunt, E., Malmström, L.: The Swiss Grid proteomics portal. In: Proceedings of the Second International Conference on Parallel, Distributed, Grid and Cloud Computing for Engineering, pp. 1–21 (2011)
33. LSF: Platform LSF. http://www.platform.com/workload-management/high-performance-computing/lp (2011). Accessed 28 Oct 2012
34. Ludäscher, B., Altintas, I., Berkley, C., Higgins, D., Jaeger, E., Jones, M., Lee, E.A., Tao, J., Zhao, Y.: Scientific workflow management and the Kepler system: research articles. Concurr. Comput.: Pract. Exper. 18(10), 1039–1065 (2006)
35. OGSA Basic Execution Services: http://www.ogf.org/documents/GFD.108.pdf (2011). Accessed 28 Oct 2012
36. PBS: PBS Professional. http://www.pbsworks.com/Product.aspx?id=1 (2011). Accessed 28 Oct 2012
37. Renderfarm.fi: http://www.renderfarm.fi/ (2011). Accessed 28 Oct 2012
38. Shahand, S., Santcroos, M., Mohammed, Y., Korkhov, V., Luyf, A., van Kampen, A., Olabarriaga, S.: Front-ends to biomedical data analysis on Grids. In: Proceedings of HealthGrid 2011 (2011)
39. SHIWA Simulation Platform: http://ssp.shiwa-workflow.eu/ (2012). Accessed 28 Oct 2012
40. Taylor, I., Shields, M., Wang, I., Harrison, A.: The Triana workflow environment: architecture and applications. In: Workflows for e-Science, Scientific Workflows for Grids, pp. 320–339. Springer, London (2007)
41. Thain, D., Tannenbaum, T., Livny, M.: Distributed computing in practice: the Condor experience. Concurr. Comput.: Pract. Exper. 17(2–4), 323–356 (2005)
42. The SCI-BUS project: http://www.sci-bus.eu/ (2011). Accessed 28 Oct 2012
43. Vine Toolkit: http://vinetoolkit.org/ (2011). Accessed 28 Oct 2012