Transaction Management for Grid Workflow Applications - IEEE Xplore

Transaction Management for Grid Workflow Applications

Feilong Tang1, Minglu Li1, Minyi Guo1, Zhengwei Qi2, Yi Wang1, Yanqin Yang1, Daqiang Zhang1
1 Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
2 School of Software, Shanghai Jiao Tong University, Shanghai 200240, China
{tang-fl, li-ml, guo-my}@cs.sjtu.edu.cn

Abstract
As Grid technology expands from scientific computing to business applications, transactional workflow management emerges as one of the most important services for Grids. The ShanghaiGrid project, launched in Shanghai, China, implemented a basic workflow service without reliability support. In this paper¹, we propose a transactional Grid workflow service (GridTW) that provides reliable and automatic workflow management for the ShanghaiGrid as well as other Grid workflow applications. This paper focuses on how to manage Grid transactions and how to combine transaction management with the workflow service. We present an automatic compensation-based coordination algorithm with which the GridTW guarantees the reliability of Grid workflows. An important feature of our algorithm is that it adapts to the dynamics of Grid applications through an event- and condition-driven mechanism, and allows users to select execution results from committed subtransactions.

¹ This work was supported by 863 Program of China (Grant Nos. 2006AA01Z172 and 2006AA01Z199), National Natural Science Foundation of China (Grant Nos. 60533040, 90612018 and 60473092), and Natural Science Foundation of Shanghai Municipality of China (05ZR14081).

The Sixth International Conference on Grid and Cooperative Computing (GCC 2007) 0-7695-2871-6/07 $25.00 © 2007

1. Introduction
Grid computing has been identified as one of the paradigms revolutionizing the discipline of distributed computing, especially for large-scale resource sharing and commercial application cooperation [1-3]. As Grid technology expands from scientific computing to business applications, a transactional Grid workflow service responsible for the reliable and automatic execution of business processes will play an important role. A business process, in general, is a set of activities with a common goal; it comprises a set of service invocations that are executed in a specified order with reliable semantics. Workflows have generally been accepted as a means to model and support complex business processes, be they interactive or completely automated [4,5]. Furthermore, the use of workflows for core business processes has led to requirements for clear process semantics and robustness, both in regular process execution and under exception or error conditions [6].

Just like the workflow concept from the Workflow Management Coalition, a Grid workflow can be defined as the automation of a Grid-based business process, in whole or in part, during which data and control information are passed from one Grid service to others for action, according to a set of predefined rules. During the execution of a Grid workflow, failures may occur more often than in traditional distributed systems. Therefore, the Grid domain needs not only a more general usage of workflow technology but also a stronger reliability guarantee.

Grid workflow management has been widely researched and many results have been reported [7,8]. However, current Grid workflow models do not consider failure control and recovery, which therefore depend on manual intervention by administrators. Many transactional workflow models for traditional distributed systems have been proposed [9-11]; however, they do not work well in Grid environments because of the following challenges: (1) Grid services may join or exit the Grid system dynamically during the execution of a business process, and a transactional Grid workflow service has to hide this dynamism from users. (2) It is difficult, even impracticable, for application programmers to develop compensating transactions for Grid applications. Existing transaction models require application programmers to provide all compensating transactions. However, the Grid services that execute a business application are dynamically discovered, and service providers may set up special compensating rules based on their own business models. For example, some service providers allow users to cancel a ticket order without other actions, while others may require users to pay a compensation fee for cancelling the order.

In this paper, we propose a transactional Grid workflow service (GridTW), aiming at providing a reliable workflow service for ShanghaiGrid as well as other Grid applications. The paper also presents a compensation-based coordination algorithm to manage workflow-based complex applications.

The remainder of this paper is organized as follows. In the next section, we review related work. In Section 3, we briefly introduce the ShanghaiGrid. The GridTW and the coordination algorithm for transactional Grid workflows are presented in Section 4. The implementation and discussion are reported in Section 5. Finally, Section 6 concludes the paper with a discussion of our future work.

2. Related work
Transactional Grid workflow and Grid transactions have not been well researched. We review related efforts on Grid workflow as well as well-known transaction models for traditional distributed systems.

In 2002, Krishnan et al. [12] presented a workflow framework for Grid services (GSFL) to manage workflows within the Open Grid Services Architecture (OGSA) framework. GSFL focuses on the definition of an XML-based workflow language, but also includes a description of an implementation of a workflow engine. An important feature distinguishing it from Web services workflow models is that GSFL proposes a Notification Model to solve the problem of the workflow engine mediating at every step of an activity. Nichols et al. [13] proposed a model for autonomic workflow management in the Grid. The model integrated dynamic fault tolerance mechanism selection into a grid-based workflow management system (WMS) architecture, providing awareness of and resilience to failures for automatic Grid computing. In [14], Deelman described how to map and execute complex workflows based on full-ahead planning, where a workflow can be generated from a metadata description of the desired data product using AI-based planning technologies. GridFlow is a workflow management system for Grid computing. It includes services for both global Grid workflow management and local Grid subworkflow scheduling. Simulation, execution and monitoring functionalities are provided at the global grid level [15]. McRunjob is a Grid workflow manager used to manage the generation of large numbers of production processing jobs in high energy physics. It converts core metadata into jobs submitted in a variety of environments [16].

Sagas, proposed by Garcia-Molina and Salem [17], is a classical long-lived transaction model that has been extended into many other transaction models [18,19]. In Sagas, a transaction consists of a set of subtransactions T={T1, T2, ..., Tn} with ACID (atomicity, consistency, isolation, durability) properties, and a set of associated compensating transactions C={C1, C2, ..., Cn}, where each subtransaction Ti is associated with a compensating transaction Ci that can semantically undo the effect caused by the commit of Ti. In Sagas, all the committed subtransactions must be undone if a subsequent subtransaction fails, which wastes a lot of valuable work already finished.

Chen et al. [20,21] analyzed grid workflow verification and validation, presented a taxonomy, and proposed a checkpoint selection strategy that can adaptively select not only necessary but also sufficient checkpoints based on the dynamics and uncertainty of run-time activity completion duration.

The above models do not consider combining workflow management with Grid transaction management. Our work concentrates on how to provide reliability support for Grid workflows, with the following characteristic: it shields the dynamicity of the Grid, i.e., a transactional Grid workflow is not aborted if some Grid services involved in the workflow dynamically exit the Grid system.
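The Sagas behaviour reviewed in this section (run T1..Tn in order, and undo every committed Ti through its Ci when a later subtransaction fails) can be illustrated with a minimal Python sketch. The function name `run_saga`, the representation of a step as an (action, compensation) pair, and the reverse order of compensation are illustrative assumptions, not code from the cited work or from GridTW.

```python
from typing import Callable, List, Tuple

# Each step pairs a subtransaction T_i with its compensating transaction C_i.
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> bool:
    """Run the subtransactions in order.

    If one raises an exception, run the compensations of all previously
    committed subtransactions (most recent first) and return False.
    Return True if every subtransaction commits.
    """
    compensations: List[Callable[[], None]] = []
    for action, compensation in steps:
        try:
            action()  # T_i is treated as committed if no exception is raised
        except Exception:
            for undo in reversed(compensations):
                undo()  # C_j semantically undoes the committed T_j
            return False
        compensations.append(compensation)
    return True
```

In this sketch a failure of the third step discards the work of the first two, which is exactly the waste that the section attributes to Sagas.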

3. Background
3.1. Background of ShanghaiGrid
ShanghaiGrid is a long-term research plan sponsored by the Science and Technology Commission of Shanghai Municipality (STCSM). It is one of the five top grand Grid projects in China. As the most important part of the digital city and city Grid plan, ShanghaiGrid concentrates on constructing a metropolitan-area information Grid infrastructure, establishing an open standard, and developing a set of system software for widespread upper-layer applications from both research communities and official departments, especially intelligent traffic services for citizens and the Shanghai government.

To achieve this goal, the project consists of four interdependent sub-projects: research and investigation on the requirements, protocols and standards of the information Grid infrastructure; development of system software and establishment of major Grid nodes; development of a decentralized virtual research platform; and research on metropolis Grid applications [23].

ShanghaiGrid has more than 1400 Gflops of peak computing capability and 16TB of disk capacity, obtained by interconnecting supercomputers distributed across the following organizations: Shanghai Supercomputing Center, Shanghai Jiao Tong University, Shanghai University, Tongji University, Shanghai Urban Traffic Information Center and others. By means of flexible, secure and open standards, data and services among virtual organizations, ShanghaiGrid will be built as an information Grid testbed for Shanghai to support typical Grid-based applications, in particular traffic management (traffic guidance, traffic-congestion control, decision support, etc.). The first group of typical applications will be put into practice soon, covering road status reports (free or jammed), best-path prediction (to a specified destination), and estimation of the remaining time until the next bus arrives at a stop.

3.2. Problem statement
The ShanghaiGrid software infrastructure comprises many services dedicated to different functions, of which workflow is one of the core services. By means of the workflow service, Grid service providers can compose existing services into a new value-added service easily and dynamically, and users can perform multiple activities with only one request. However, the current ShanghaiGrid workflow service cannot support the reliable semantics necessary for mission-critical applications. It is therefore important to combine transaction management with the workflow service to establish a transactional workflow service for ShanghaiGrid. This paper studies how to provide reliability support for the ShanghaiGrid workflow service.

Consider a typical scenario of a transactional workflow application: a journey arrangement in which a group of tourists plan to visit Shanghai. They typically have to (1) query a weather service that predicts the weather during the expected period, (2) reserve rooms in a hotel and hire a car if the weather is good enough, and (3) query the route service that advises the best route to a specified destination.

[Figure nodes: S, Query weather, Reserve rooms, Hire car, Query route, E]
Fig. 1. A process for a journey arrangement.

The execution flow of the above process is illustrated in Fig. 1, with a starting point S and an ending point E. This process consists of four activities: query weather, reserve rooms, hire a car and query route. These activities are partially ordered, and their automatic execution is desired. The Query weather and Query route activities are non-transactional, while the others are transactional. A transactional activity is one whose commit has to be undone if the workflow fails, whereas it is unnecessary to undo non-transactional activities. We define a transactional workflow as a process that includes at least one transactional activity.

[Figure labels: Workflow Planning (Workflow Design, Workflow Repository); Workflow Run-Time (Service Discovery, Workflow Engine, Workflow Admin, Workflow Monitoring); Grid Transaction Manager (Scheduler, Coordinator/Participant, CTG, Log Service); Grid Services]
Fig.2. Architecture of transactional Grid workflow service (GridTW).

4. Transactional Grid workflow service
Transactional Grid workflow service is a high-level Grid service to provide reliable workflow management for business applications.

4.1. Layered architecture
The architecture of the transactional Grid workflow service is shown in Fig.2. The top two layers show the main components of the Grid workflow service implemented in the ShanghaiGrid, which supports process definition, workflow enactment, and administration and monitoring of workflow processes. The Workflow Design is a set of tools that provide users with a graphical design environment to model business processes as workflows, which are then stored in the Workflow Repository. The Workflow Engine is at the center of the Grid workflow service. When a workflow is instantiated, the engine creates the process data, binds and invokes Grid services, and executes the workflow. The Workflow Admin and Workflow Monitoring are used, respectively, to manage the workflow service and to monitor real-time information about workflows, such as the number of processes and the number of users. Workflow management is a core service of the ShanghaiGrid. It employs an event-condition-action (ECA) mechanism to improve flexibility and adaptability to dynamic Grid environments.

The Grid Transaction Manager layer is the main focus of this paper. It is in charge of coordinating the Grid services involved in a Grid workflow and of recovering systems from potential failures. It has the following main components.
Coordinator and Participant. They execute the coordination algorithm to manage a transactional workflow. In Fig.2, the dashed line indicates that they are created dynamically and live only until the end of a workflow.
Scheduler. It creates a transaction context and a Coordinator on the application side or a Participant on the service side, and propagates the transaction context.
Compensating Transaction Generator (CTG). The CTG automatically generates compensating transactions based on predefined compensating rules.
Log Service. This component records the coordination process and the state information.
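To make the role of the Compensating Transaction Generator more concrete, the following Python sketch shows one way a rule lookup could turn a committed operation into a compensating transaction. The paper does not give the rule format, so `CompensatingRule`, `CompensatingTransaction`, `generate_compensation`, and the operation names are all hypothetical; the fee field only mirrors the ticket-cancellation example given in the Introduction.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class CompensatingRule:
    """Hypothetical provider-defined rule: which operation undoes a commit, and at what fee."""
    undo_operation: str
    fee: float = 0.0


@dataclass(frozen=True)
class CompensatingTransaction:
    """Hypothetical description of the transaction that undoes one committed operation."""
    service: str
    operation: str
    target_id: str
    fee: float


def generate_compensation(rule_base: Dict[str, CompensatingRule],
                          service: str,
                          committed_id: str) -> CompensatingTransaction:
    """Build the compensating transaction for one committed operation.

    Raises KeyError if the provider has published no rule for the service.
    """
    rule = rule_base[service]
    return CompensatingTransaction(service, rule.undo_operation, committed_id, rule.fee)
```

Keeping the rules on the provider side, as assumed here, means the application programmer never writes compensating logic, which is the motivation the paper gives for automatic generation.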

[Figure labels: Workflow application, Coordinator, Scheduler, Log Service, Coordination Algorithm; Shanghai Grid (Workflow Design, Workflow Repository, Workflow Engine, Workflow Monitoring, Workflow Admin); Grid service 1 ... Grid service n; Participant, CTG, Rule Base, Grid Service]
Fig.3. Transactional Grid workflow processing.

4.2. Transactional Grid workflow processing
Workflows can be classified into two categories: transactional workflows, with at least one transactional activity, and non-transactional workflows, without any transactional activity. Committed transactional activities need to be compensated if the workflow fails, whereas compensation is unnecessary for non-transactional activities. For example, the journey arrangement mentioned above covers four activities, where the Reserve rooms and Hire car services are transactional while the Query weather and Query route services are non-transactional. If a user cancels a finished journey arrangement, the hotel reservation and car hiring have to be compensated. A weather query, however, does not need to be compensated, no matter whether the journey arrangement succeeds or is aborted.

Fig.3 illustrates the processing of a transactional Grid workflow, where transactional services (e.g., Grid service 1) need the support of our Grid transaction manager while non-transactional services (e.g., Grid service n) are just original Grid services. The Workflow Engine of our GridTW is able to identify the two types of workflows (transactional and non-transactional) through the workflow description. For transactional workflows, the Workflow Engine initiates transaction requests to the Grid Transaction Manager. The latter then orchestrates these workflows with our coordination algorithm, which is described next. Non-transactional workflows, on the other hand, are dispatched by the Workflow Engine directly to remote Grid services.
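The classification and routing just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Activity` type, the `transactional` flag, and the two route labels are invented for the example, and the paper does not specify how the workflow description marks an activity as transactional or in which order compensations run.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Activity:
    name: str
    transactional: bool  # assumed to be readable from the workflow description


def is_transactional(workflow: List[Activity]) -> bool:
    """A workflow is transactional if it has at least one transactional activity."""
    return any(a.transactional for a in workflow)


def route(workflow: List[Activity]) -> str:
    """Return a hypothetical label for where the engine sends the workflow."""
    return "grid-transaction-manager" if is_transactional(workflow) else "direct-dispatch"


def activities_to_compensate(workflow: List[Activity],
                             committed: Iterable[str]) -> List[str]:
    """Names of committed activities that need compensation (listed in workflow order)."""
    done = set(committed)
    return [a.name for a in workflow if a.transactional and a.name in done]
```

Applied to the journey arrangement, the workflow is routed to the transaction manager, and only Reserve rooms and Hire car appear in the compensation list when a finished arrangement is cancelled.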

The coordination algorithm for transactional Grid workflows, abbreviated as TWCA, consists of two parts (see Fig.4): the Algorithm for Coordinator and the Algorithm for Participant, executed by a Coordinator and a Participant respectively. Here t is the system time; CC is a coordination context; Ti is a subtransaction; and Tvalid is the valid time before which a Coordinator must send Confirm or Cancel messages to committed subtransactions and Participants must report their states. If a Coordinator does not confirm or cancel a subtransaction before Tvalid, the corresponding Participant automatically undoes the committed subtransaction by executing the compensating transaction. Conversely, a Coordinator presumes that a subtransaction has failed if it does not return the commit result before Tvalid. The CC message contains the information necessary for coordinating a transactional workflow, including the transaction type, transaction identifier, coordinator address, expiration time, etc.
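The timeout rules around Tvalid can be summarised in a small Python sketch. It is not the TWCA algorithm itself: the class and function names are hypothetical, the field names merely mirror the CC contents listed above, a Confirm or Cancel decision is assumed to have arrived before Tvalid whenever it is present, and treating t equal to Tvalid as expired is a boundary choice made here for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


@dataclass(frozen=True)
class CoordinationContext:
    """Fields named in the text; the Python field names are illustrative."""
    transaction_type: str
    transaction_id: str
    coordinator_address: str
    t_valid: float  # expiration time, on the same clock as the system time t


class State(Enum):
    COMMITTED = "committed"      # committed, still waiting for Confirm or Cancel
    CONFIRMED = "confirmed"
    COMPENSATED = "compensated"  # undone by the compensating transaction


def participant_state(cc: CoordinationContext, t: float,
                      decision: Optional[str]) -> State:
    """State of a committed subtransaction at time t.

    decision is "confirm", "cancel", or None if no message has arrived yet.
    """
    if decision == "confirm":
        return State.CONFIRMED
    if decision == "cancel" or t >= cc.t_valid:
        return State.COMPENSATED  # explicit cancel, or automatic undo on timeout
    return State.COMMITTED


def coordinator_view(cc: CoordinationContext, t: float,
                     commit_reported: bool) -> str:
    """How the Coordinator regards a subtransaction at time t."""
    if commit_reported:
        return "committed"
    return "failed" if t >= cc.t_valid else "pending"
```

Both sides use the same deadline, so neither has to wait indefinitely for a peer that has left the Grid.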

Algorithm for Coordinator
Input: references of all participants, timeout;
Output: workflow results or failure;
{
  Scheduler creates a Coordinator;
  completed=false;
  while (t
