Vega: a Service-Oriented Grid Workflow Management System

(Draft) Lecture Notes in Computer Science (LNCS). 2007, vol. 4804, p. 1516-1523. ISSN 0302-9743.

Vega: A Service-Oriented Grid Workflow Management System ⋆

R. Tolosana-Calasanz, J.A. Bañares, P. Álvarez, and J. Ezpeleta

Instituto de Investigación en Ingeniería de Aragón (I3A)
Department of Computer Science and Systems Engineering, University of Zaragoza
María de Luna 1, E-50018 Zaragoza (Spain)
[email protected]

Abstract. Because of the nature of the Grid, Grid application systems built on traditional software development techniques can only interoperate with Grid services in an ad hoc manner that requires substantial human intervention. In this paper, we introduce Vega, a pure service-oriented Grid workflow system which consists of a set of loosely coupled services co-operating with each other to solve problems. In Vega, the execution flow of its services is isolated from their interactions, and these interactions are explicitly modelled and can be dynamically interpreted at run-time.

Keywords: Grid workflow, Service-oriented computing, Grid protocols

1 Introduction

Current Grid research aims to develop techniques for building more flexible, autonomous and adaptive Grid systems [1]. For this purpose, all the participant services are required to interoperate in a highly flexible and dynamic way. Firstly, Grid services should agree in advance on the transport protocols and on the message structure and format. Secondly, service providers should be able to specify their particular interaction protocols, that is, the expected sequence of messages for providing their services, and service consumers should be able to obtain those specifications and to interpret them dynamically. The agreement on the transport protocols was proposed in [2] and, based on these foundations, the Open Grid Service Architecture (OGSA) [3] was developed. More recently, the Web Service Resource Framework (WSRF) [4] was proposed for the management of and access to the state of Grid services. Nonetheless, despite the homogeneous mechanisms defined by WSRF, each WSRF-compliant provider assumes its own exclusive interaction requirements, and this heterogeneity in the interactions among Grid services may introduce a barrier to interoperability.

In general terms, application systems and, in particular, Grid workflow systems should be designed to overcome this barrier. When a candidate Grid service has been chosen and the user-specified task is due to be executed, service consumers should be provided with a way of interacting with the candidate service without having to recompile their software. This could be accomplished by explicitly separating the process flow of services from their interactions and by dynamically interpreting the interactions [5, 6]. Nevertheless, current Grid workflow management projects [7], which have had great success in different application scenarios and which offer different approaches for building and executing workflows on Grids, do not successfully overcome this barrier to interoperability.

In this paper, we introduce Vega, a pure service-oriented Grid workflow system specially designed to overcome these Grid challenges. Vega was modelled and implemented in the DENEB operating environment [8, 9], which is based on Reference nets [10]. DENEB supports Web standards such as WSRF and SOAP, which facilitates the interoperability between services. Additionally, other features of DENEB were also exploited, namely its service-oriented principles and the isolation of the business logic of its services from the logic of their interaction protocols. Applying these principles to the design of Grid workflow systems is Vega’s main contribution.

The remainder of this paper is organised as follows. In Section 2, a general overview of DENEB and its main underlying concepts is given. In Section 3, Vega’s modelling of workflows and its architecture are described. Finally, the conclusions are presented.

⋆ This work has been supported by the research project PIP086/2005, granted by the Government of Aragón, and the project TIN2006-13301, granted by the Spanish Ministry of Education and Science.

Fig. 1. DENEB in execution

2 The DENEB operating environment

DENEB is an operating environment for the development and execution of Web processes. In DENEB, the business logic of Web processes, their coordination protocols and even the implementation of the platform are based on Reference nets (a special type of Nets-within-Nets [11]). Nets-within-Nets belong to the family of object-oriented Petri net formalisms. They have a static part (the environment, also known as the system net) and a dynamic part, composed of object net instances which move inside the system net. Reference nets can be modelled and executed with Renew [12], a tool developed in Java which offers an easy integration of Reference nets and the Java programming language. Renew is the tool chosen for implementing DENEB.

In DENEB’s model, the business logic of services is isolated from the interaction logic. This separation provides high flexibility, as services are allowed to dynamically determine the participant services of a desired interaction.

One of the most important components of DENEB’s architecture is the workspace component. It comprises a service management mechanism, which primarily starts the execution of processes, and a workflow interpreter for the execution of the business logic of processes, which are described and modelled by means of object nets. Workflows may invoke internal services or may interact with external services. The interactions among processes are not limited to independent and simple operation invocations; they can also involve complex negotiation processes. In any case, the sequences of exchanged messages form conversations. For a given interaction, the set of all valid sequences of messages represents a coordination protocol. In [9], it is shown how DENEB manages the interaction protocols and how object nets were also chosen to describe them. The conversation space is the component responsible for executing the parts of the conversation that a process plays (known as roles) when interacting with other processes.
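The idea that a coordination protocol is the set of all valid message sequences, kept apart from the business logic and interpreted at run-time, can be illustrated with a small sketch. DENEB describes protocols with object nets; the Python below swaps in a plain finite-state machine held as data, which is only an analogy. The `Protocol` class, the role names and the message types (`request`, `counter_offer`, `response`) are all assumptions made for illustration and do not come from DENEB.

```python
from dataclasses import dataclass, field


@dataclass
class Protocol:
    """A coordination protocol held as data: a finite-state machine over messages."""
    initial: str
    final: set
    # (current state, sender role, message type) -> next state
    transitions: dict = field(default_factory=dict)

    def accepts(self, conversation):
        """Return True if the (role, message type) sequence is a valid conversation."""
        state = self.initial
        for role, message_type in conversation:
            key = (state, role, message_type)
            if key not in self.transitions:
                return False  # message not allowed at this point of the protocol
            state = self.transitions[key]
        return state in self.final


# Hypothetical request/reply protocol with an optional negotiation round.
request_reply = Protocol(
    initial="start",
    final={"done"},
    transitions={
        ("start", "consumer", "request"): "requested",
        ("requested", "provider", "counter_offer"): "negotiating",
        ("negotiating", "consumer", "request"): "requested",
        ("requested", "provider", "response"): "done",
    },
)

simple = [("consumer", "request"), ("provider", "response")]
negotiated = [
    ("consumer", "request"),
    ("provider", "counter_offer"),
    ("consumer", "request"),
    ("provider", "response"),
]
out_of_order = [("provider", "response")]

print(request_reply.accepts(simple))
print(request_reply.accepts(negotiated))
print(request_reply.accepts(out_of_order))
```

Because the protocol is a value rather than code, a consumer could load a different `transitions` table for each provider without being recompiled, which is the property the separation of business logic and interaction logic is meant to give.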
The objective of the message broker component is to establish an explicit separation between the logic of the interchange of messages (that is to say, the conversations) and the communication mechanisms and/or the specific coding formats in which the communication is performed. Internally, this component is composed of a message repository, in which received or pending-to-be-delivered messages are temporarily stored, and a set of binding components. These components support the communication with external processes through different transport protocols (SOAP, HTTP, TCP/IP, etc.), isolating the platform from any technological aspect of communication and from information exchange formats. In addition to storing, sending and/or receiving messages, the message broker component has to be able to block the execution of a process until it receives a specific message. These features typically appear in an asynchronous message passing system. For this reason, the coordination language Linda was used as the intermediate language for modelling the conversations among processes, and an implementation of Linda in Renew, RLinda [13], was developed as the message space. More recently, DENEB was extended to support the execution of WSRF-compliant services.

The model in Figure 1 represents a simplified view of DENEB in execution. Transition t1 is responsible for managing the life cycle of services. In the example, there are three services in execution in the workspace, one of which is a WSRF service instance. Transition t2 allowed the WSRF service instance to start a new conversation to interact with a service consumer, according to a specific communication protocol. As can be seen, a conversation has been initiated in the conversation space. This conversation is going to retrieve the request, coded as a tuple, from the message space (transition t3). The request will be processed by the WSRF service instance and the result will be provided to the conversation as a response tuple. The conversation will store this tuple in the message space (transition t3). At this point, the binding components are responsible for taking the tuple (transition t4), adapting its content to the required format and sending it back to the service consumer with the required transport protocol.
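The behaviour required of the message space (store tuples, and block a process until a specific message arrives) can be sketched with a minimal Linda-style tuple space. This is not RLinda, which is implemented with Reference nets in Renew; it is a Python analogue under stated assumptions. The class `TupleSpace`, its methods `out` and `take` (Linda's `in` is a reserved word in Python), the use of `None` as a wildcard and the tuple layout `(kind, conversation id, payload)` are all hypothetical.

```python
import threading


class TupleSpace:
    """Minimal Linda-like space: out() writes a tuple, take() removes a match."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, tup):
        """Write a tuple and wake up any process blocked in take()."""
        with self._cond:
            self._tuples.append(tuple(tup))
            self._cond.notify_all()

    @staticmethod
    def _matches(pattern, tup):
        # None in the pattern acts as a wildcard field.
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup)
        )

    def _find(self, pattern):
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def take(self, pattern, timeout=None):
        """Block until a matching tuple exists, then remove and return it.

        Returns None if the timeout expires first.
        """
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(pattern) is not None, timeout=timeout
            )
            if not found:
                return None
            tup = self._find(pattern)
            self._tuples.remove(tup)
            return tup


space = TupleSpace()


def provider():
    # Blocks until a request tuple is available, then writes the response tuple.
    _, conv_id, payload = space.take(("request", None, None))
    space.out(("response", conv_id, payload.upper()))


worker = threading.Thread(target=provider)
worker.start()
space.out(("request", "conv-1", "getParents"))
reply = space.take(("response", "conv-1", None), timeout=5)
worker.join()
print(reply)
```

The consumer and the provider never call each other directly; they only agree on the shape of the tuples, which mirrors how conversations in DENEB are decoupled from transport and coding formats.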

3 Vega: a service-oriented Grid workflow system

DENEB was used to implement Vega, and certain features of DENEB were exploited. First, the use of Web standards such as WSRF and SOAP facilitates the interoperability between services. Second, DENEB applies service-oriented techniques and isolates the business logic of its services from the logic of their interaction protocols. Thus, Vega can operate across different application scenarios in a flexible and scalable way, without being constrained by Grid service interactions.

a) Abstract task getParents (arc inscriptions: id)
   t1: this:getID(id); this:beginTask(id,["getParents",args],["Scheduler","ExecTask","convResquest"])
   state: getParents
   t2: this:endTask(id,result)

b) Concrete task getParents (arc inscriptions: id)
   t1: this:getID(id); this:beginTask(id,["getParents",args],[URI,gridServiceProtocolRole])
   state: getParents
   t2: this:endTask(id,result)

Fig. 2. a) Abstract Task getParents b) Concrete Task getParents in Vega
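The two nets of Fig. 2 share the same shape (transition t1, one state, transition t2) and differ only in the last argument of beginTask: the abstract task addresses the meta-scheduler, the concrete task addresses a Grid resource by URI and protocol role. The following Python sketch mimics that shape only; it is not Renew inscription syntax, and the class `TaskNet`, the recording function `begin_task`, the example URI and the argument values are assumptions made for illustration.

```python
class TaskNet:
    """Toy stand-in for a task net: transitions t1 and t2 around one state."""

    def __init__(self, name, target):
        self.name = name
        self.target = target   # plays the role of beginTask's last argument
        self.pending = set()   # ids held in the state between t1 and t2
        self.results = {}
        self._next_id = 0

    def t1(self, args, begin_task):
        """Start the interaction and mark the task id as pending."""
        task_id = self._next_id
        self._next_id += 1
        begin_task(task_id, [self.name, args], self.target)
        self.pending.add(task_id)
        return task_id

    def t2(self, task_id, result):
        """Collect the result; only enabled for an id that t1 produced."""
        if task_id not in self.pending:
            raise ValueError(f"t2 is not enabled for id {task_id}")
        self.pending.remove(task_id)
        self.results[task_id] = result


sent = []  # records what would be handed over to start a conversation


def begin_task(task_id, request, target):
    sent.append((task_id, request, target))


# Abstract variant: the target is the meta-scheduler, as in Fig. 2a.
abstract = TaskNet("getParents", ["Scheduler", "ExecTask", "convResquest"])
# Concrete variant: the target is a resource URI plus a role (hypothetical URI).
concrete = TaskNet(
    "getParents", ["http://example.org/grid/service", "gridServiceProtocolRole"]
)

a_id = abstract.t1(["some-gene-term"], begin_task)
c_id = concrete.t1(["some-gene-term"], begin_task)
abstract.t2(a_id, ["parent-term"])
```

Keeping the target as data is what lets the same two-transition structure serve both task types.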

The interactions with Grid resources can range from a simple request/reply message exchange to the complex interaction protocols required for dealing with the life cycle of a WSRF-compliant service. Vega tackles this heterogeneity of interactions through its capability of dynamically interpreting the interaction protocols of the involved services at run-time without any previous adjustment.

In order to allow users to describe their workflow tasks in Vega, Grid workflows are modelled as particular object nets which move inside DENEB’s system net. These object nets, being Petri nets, provide adequate and well-known formalisms for expressing sequence, parallelism, choice and iteration, allowing users to connect their tasks properly. The tasks involved can be of two different types: abstract tasks, that is, tasks that are not mapped to any Grid resource, and concrete tasks, that is, tasks connected to specific Grid resources on which they are due to be executed. Thus, Vega supports abstract workflows, in which all the tasks are abstract; concrete workflows, in which all the tasks are concrete; and hybrid workflows, in which the tasks are a mixture of abstract and concrete. The resulting model is interpreted by DENEB and the tasks are executed according to the user-specified process flow. However, DENEB itself does not execute the tasks directly, but allows them to establish interactions with other services which will eventually perform them.

Figure 2a) shows an example of an abstract task called getParents. It is composed of two transitions and a state. Transition t1 starts an interaction with Vega’s meta-scheduler, a service responsible for mapping abstract tasks to actual Grid resources in order to execute the task. Transition t2 gets the result of the interaction. On the other hand, Figure 2b) depicts the concrete version of task getParents, which also consists of two transitions and a state. Transition t1 starts an interaction with a Grid resource able to perform the task, and the result is then obtained in transition t2.

In this early version of Vega, some service components have been designed which, in some cases, do not provide as many features as equivalent components from other Grid workflow systems. However, Vega was not designed with the aim of creating new components from scratch, but with the aim of integrating existing service components. Indeed, ongoing efforts will exploit Vega’s interaction features in order to adapt some of the most important Grid middleware platforms.

Fig. 3. Vega’s architecture and its relationship with other systems

Figure 3 shows Vega’s architecture and its relationship with other systems. It should be noted that the dot-lined components are still under development. Users can specify their Grid workflows in Renew’s Reference net GUI editor, which acts as a build-time component, whereas the rest of the components are responsible for enacting the user-defined workflows and are known as run-time components. Vega has a set of loosely coupled services that interact and co-operate with each other by asynchronously exchanging sequences of messages in accordance with a defined interaction protocol.

Depending on the nature of the task, abstract or concrete, there exist two possible scenarios when enacting a workflow specification. In the first case, a series of interactions among several services occurs, as Figure 4 shows:

1. The task initiates an interaction with Vega’s meta-scheduling service.
2. The meta-scheduler, in turn, interacts with an available cataloguing service in order to retrieve candidate resources available for executing the task.
3. The Grid scheduler chooses one of the candidate resources according to a scheduling policy, provided by a Scheduling Policy Service.
4. Once the choice has been made, the scheduler dispatches the task onto the resource by initiating an interaction with it. The result (in case a result is produced) is sent back to the client task in the workflow.

Fig. 4. Vega’s Services Interaction Diagram

In the alternative case, the scenario of a concrete task, the concrete task itself initiates a direct interaction with the Grid resource which is going to perform the task, so neither the meta-scheduler service nor the catalogue service participates in it.

In this paper, data movement in DENEB was designed to be automatic and centralised: data is passed through the message space. Nevertheless, other data movement alternatives can be easily implemented with additional services and/or interaction protocols.

Figure 5 shows a snapshot of a Grid workflow enactment in Vega. It must be noted that, because Vega uses DENEB for its execution, some elements of DENEB’s operating environment are present. In the workspace, there are two services in execution, the user-defined workflow service and the meta-scheduler service. As mentioned previously, the abstract tasks of the workflow interact with the meta-scheduler in order to achieve their execution, and the meta-scheduler, in turn, may need to interact with other services such as a resource catalogue service or a resource capable of executing the task. Thus, in the conversation space, there are two conversations being interpreted. The one on the left has been initiated by the workflow service and the one on the right by the meta-scheduler service. The communication between them is accomplished by writing messages to and taking messages from the message space (Linda), according to an interaction protocol.
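The two enactment scenarios described above (steps 1 to 4 for an abstract task, a direct interaction for a concrete task) can be sketched as follows. This is a simplified illustration under stated assumptions: Vega's services exchange messages asynchronously through the message space, whereas the sketch uses plain synchronous function calls, and the names `Task`, `MetaScheduler`, `classify`, `enact`, `pick_first` and the resource names `node-a` and `node-b` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str
    args: tuple
    resource: Optional[str] = None  # None marks an abstract task


def classify(tasks):
    """Label a workflow as abstract, concrete or hybrid from its tasks."""
    if not tasks:
        raise ValueError("a workflow needs at least one task")
    kinds = {"abstract" if t.resource is None else "concrete" for t in tasks}
    return kinds.pop() if len(kinds) == 1 else "hybrid"


def pick_first(candidates):
    """Stand-in scheduling policy: take the first candidate."""
    return candidates[0]


class MetaScheduler:
    def __init__(self, catalogue, policy, resources):
        self.catalogue = catalogue  # task name -> candidate resource names
        self.policy = policy        # chooses one name among the candidates
        self.resources = resources  # resource name -> callable doing the work

    def exec_task(self, task):
        candidates = self.catalogue.get(task.name, [])      # step 2: catalogue lookup
        if not candidates:
            raise LookupError(f"no resource offers {task.name}")
        chosen = self.policy(candidates)                    # step 3: apply the policy
        return chosen, self.resources[chosen](*task.args)   # step 4: dispatch


def enact(task, scheduler, resources):
    """Run one task and return (resource used, result)."""
    if task.resource is None:
        return scheduler.exec_task(task)                    # step 1: ask the meta-scheduler
    # Concrete task: direct interaction, no meta-scheduler or catalogue involved.
    return task.resource, resources[task.resource](*task.args)


resources = {
    "node-a": lambda term: f"{term} via node-a",
    "node-b": lambda term: f"{term} via node-b",
}
scheduler = MetaScheduler({"getParents": ["node-a", "node-b"]}, pick_first, resources)

abstract_task = Task("getParents", ("term",))
concrete_task = Task("getParents", ("term",), resource="node-b")

print(classify([abstract_task, concrete_task]))
print(enact(abstract_task, scheduler, resources))
print(enact(concrete_task, scheduler, resources))
```

Passing the policy in as a function keeps the scheduler independent of how a resource is chosen, in the same spirit as delegating that decision to a separate Scheduling Policy Service.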

Fig. 5. Vega in execution

Vega was tested by reproducing some problems from the biological domain which the Grid workflow system Taverna [14] solved. One of these problems is modelled in the Gene Ontology context workflow¹, which builds up a subgraph of the Gene Ontology for a supplied gene term.

4 Conclusions

In this paper, we introduced Vega, a pure service-oriented Grid workflow management system, modelled and enacted in DENEB, an operating environment for Web processes. Vega uses standard protocols and deals with the heterogeneous interaction requirements of Grid service providers by explicitly separating the execution flow of services from their actual interactions. In fact, the interactions are explicitly modelled and can be dynamically interpreted, allowing the system to be configured late in the execution process and to adapt itself to the particular circumstances of specific environments.

¹ http://workflows.mygrid.org.uk/repository/myGrid/TomOinn/

References

1. Huhns, M.N., Singh, M.P.: Service-Oriented Computing: Key Concepts and Principles. IEEE Internet Computing 9 (2005) 75–81
2. Foster, I., Kesselman, C., Tuecke, S.: The Anatomy of the Grid: Enabling Scalable Virtual Organizations. Int. J. Supercomputer Applications 15 (2001)
3. Foster, I., Kesselman, C., Nick, J., Tuecke, S.: The Physiology of the Grid: an Open Grid Services Architecture for Distributed Systems Integration. Technical report, Open Grid Service Infrastructure WG, GGF (2002)
4. Czajkowski, K., Ferguson, D.F., Foster, I., Frey, J., Graham, S., Sedukhin, I., Snelling, D., Tuecke, S., Vambenepe, W.: The WS-Resource Framework. Technical report, IBM DeveloperWorks library (2004)
5. Ardissono, L., Cardinio, D., Petrone, G., Segnan, M.: A Framework for the Server-side Management of Conversations with Web Services. In: Proc. of the 13th Int. World Wide Web Conf. on Alternate track papers & posters, New York, NY, USA, ACM Press (2004) 124–133
6. Biornstad, B., Pautasso, C., Alonso, G.: Enforcing Web Services Business Protocols at Run-time: a Process-Driven Approach. Int. J. of Web Engineering and Technologies 2 (2006) 396–407
7. Yu, J., Buyya, R.: A Taxonomy of Workflow Management Systems for Grid Computing. Journal of Grid Computing, Springer Science+Business Media B.V., 3 (2005) 171–200
8. Álvarez, P., Bañares, J.A., Ezpeleta, J.: Approaching Web Service Coordination and Composition by Means of Petri Nets. The Case of the Nets-Within-Nets Paradigm. In: 3rd Int. Conf. on Service Oriented Computing. Number 3826 in LNCS. Springer Verlag (2005) 185–197
9. Fabra, J., Álvarez, P., Bañares, J.A., Ezpeleta, J.: A Framework for the Development and Execution of Horizontal Protocols in Open BPM Systems. In Dustdar, S., Fiadeiro, J.L., Sheth, A.P., eds.: Business Process Management. Volume 4102 of LNCS. Springer (2006) 209–224
10. Kummer, O.: Introduction to Petri Nets and Reference Nets. Sozionik Aktuell 1 (2001) 1–9. ISSN 1617-2477
11. Valk, R.: Petri Nets as Token Objects - An Introduction to Elementary Object Nets. In: 19th Int. Conf. on Application and Theory of Petri Nets, Lisbon, Portugal. Volume 1420 of LNCS. (1998) 1–25
12. Kummer, O., Wienberg, F.: Renew - the Reference Net Workshop. In: Tool Demonstrations, 21st Int. Conf. on Application and Theory of Petri Nets, Computer Science Department, Aarhus University, Aarhus, Denmark (2000) 87–89
13. Fabra, J., Álvarez, P., Bañares, J.A., Ezpeleta, J.: RLinda: A Petri Net Based Implementation of the Linda Coordination Paradigm for Web Services Interactions. In: EC-Web. (2006) 183–192
14. Oinn, T., Greenwood, M., Addis, M., Alpdemir, M.N., Ferris, J., Glover, K., Goble, C., Goderis, A., Hull, D., Marvin, D., Li, P., Lord, P., Pocock, M.R., Senger, M., Stevens, R., Wipat, A., Wroe, C.: Taverna: Lessons in Creating a Workflow Environment for the Life Sciences: Research Articles. Concurr. Comput.: Pract. Exper. 18 (2006) 1067–1100
