Methods for Enabling Recovery Actions in Ws-BPEL
Stefano Modafferi and Eugenio Conforti
Politecnico di Milano, Dipartimento di Elettronica e Informazione, P.zza L. da Vinci 32, 20133 Milano, Italy
Abstract. Self-healing is an emerging requirement for Information Systems, where processes grow more complex every day and many autonomous actors are involved. Roughly, self-healing mechanisms can be viewed as a set of automatic recovery actions fired at run time according to the detected fault. These actions can be performed at infrastructure level, i.e. transparently to the process, or they can be defined in the workflow model and executed by the workflow engine. In the Service Oriented Computing world Ws-BPEL is the most widely used language for web-service orchestration, but the standard recovery mechanisms it provides are not enough to implement many useful recovery actions with reasonable effort. This paper presents an approach in which a designer defines a Ws-BPEL process annotated with information about recovery actions; a preprocessing phase then translates this “annotated” Ws-BPEL into a “standard” Ws-BPEL, that is, a file that a standard Ws-BPEL engine can execute. The advantage of this approach is that it requires no change to the engine: the recovery actions are realized with the engine's standard capabilities and, in the end, are still a set of Ws-BPEL basic and structured activities.
1 Introduction
In Service Oriented Architectures, languages and frameworks for managing processes are becoming mature and quite stable. The de facto standard for defining Web-Service based processes is Ws-BPEL [7]. Several engines [1,2,3] are available for executing Ws-BPEL processes. The next step for research in this field is advanced fault management, in order to realize Self-Healing Information Systems able to choose and implement appropriate recovery actions. A Self-Healing Information System will therefore implement a monitoring part to capture faults, a diagnosis part to detect what and where the source of the problem is and to decide which recovery action has to be fired, and a recovery part that actually implements the recovery action. These actions can be performed at infrastructure level, i.e. transparently to the process (e.g. the approach of [17]), or they can be defined in the workflow model and executed by the workflow engine. This paper focuses on the recovery part of a Self-Healing Information System, following the latter approach and considering recovery mechanisms that can be defined during the process design phase.

A designer who wants to use a language like Ws-BPEL as is has very few, and very basic, mechanisms for implementing recovery actions. More complex recovery behavior can be realized, but only with excessive effort. Our approach extends the Ws-BPEL strategies by suggesting some composite constructs that mix handlers and activities (standard and structured). These constructs guide the designer in his work and enable standard, and therefore general, mechanisms that perform repair actions at run time. During the design phase, the designer has to decide which mechanism must be used and to specialize it according to the given schema, for instance by inserting the actual recovery actions into the general mechanism.

The proposed mechanisms cover a wide range of possibilities: i) the ability to modify the value of process variables by means of external messages, ii) the specification of a time deadline associated with a task, iii) the ability to redo a single task or an entire scope, iv) the possibility of specifying alternative paths to be followed in the continuation of the execution after the reception of an enabling message, v) the possibility of going back in the process to a point defined as safe, either to redo the same set of tasks or to perform an alternative path.

The paper is structured as follows: Section 2 analyzes the existing Ws-BPEL constructs for recovery actions; Section 3 defines our general approach to improving the expressive power of Ws-BPEL without modifying existing engines; the mechanisms are presented in Section 4; Section 5 compares our work with the related literature; finally, Section 6 draws our conclusions.

R. Meersman, Z. Tari et al. (Eds.): OTM 2006, LNCS 4275, pp. 219–236, 2006. © Springer-Verlag Berlin Heidelberg 2006
2 Ws-BPEL Recovery Mechanisms
In the Service Oriented Computing world Ws-BPEL [7] is the most widely used language for specifying Web-Service orchestration. It is a language for designing workflow processes by specifying the process tasks and their interaction with the external world. Ws-BPEL has standard activities (invoke, receive, assign, wait, reply, terminate, etc.), called simply tasks, and structured activities (loop, pick, sequence, etc.) that can be combined to define the process. An important concept is the Scope, which delimits a portion of the workflow upon which variables and handlers can be defined. Each Scope has a primary activity that defines its normal behavior. The primary activity can be a complex structured activity, with nested activities of arbitrary depth. The Scope is shared by all the nested activities. Each Scope has unique entry and exit points.

Ws-BPEL provides standard mechanisms for managing exceptions. Specific handlers (fault, compensation and event) are associated with a single action or with a Scope, that is, a set of actions. The major problem with handlers is their lack of flexibility. They are enabled at different times during execution: Fault and Event Handlers are enabled only while the Task or Scope is running; the Compensation Handler is enabled when the status is “completed”, and therefore when the execution point is already beyond the Scope. Fault and Compensation Handlers can be defined at both Task and Scope level, whereas the Event Handler exists only at Scope level. A brief description of each handler follows:

– Fault Handler. Its aim is to undo the partial and unsuccessful work of a Scope. It first terminates all the activities contained within the Scope, then it performs the tasks defined for it at design time. Sources of faults can be: i) an invoke activity responding with a fault WSDL message, ii) a programmatic throw activity, iii) a standard fault that pertains to the engine, iv) “CatchAll” for anything else. If a fault is not consumed in the current Scope, it is recursively forwarded to the enclosing ones. Unless termination of the instance has been invoked, after the exception is consumed the normal flow restarts at the end of the Scope associated with the fault.
– Compensation Handler. This is basically a wrapper for compensation activities. A Compensation Handler is available only after the related Scope has completed correctly. It represents a compensation process for already executed activities. Its activities see a snapshot of all the variables of the Scope the Compensation Handler is attached to, and they cannot update live data. It can be invoked from within Fault Handlers or Compensation Handlers. When compensation is invoked on a Scope that has no Compensation Handler, the default handler is executed: the Compensation Handlers of the immediately enclosed scopes are invoked in reverse order.
– Event Handler. The whole process, as well as each Scope, can be associated with a set of Event Handlers that are invoked concurrently when the corresponding event occurs. The actions taken within an Event Handler can be any type of activity, such as sequence or flow. An Event Handler is considered part of the normal processing of the Scope, i.e., active Event Handlers are concurrent activities within the Scope. Events can be: i) incoming messages; ii) temporal alarms.

Using these three handlers as they are, only very basic “recovery” mechanisms can be realized. Ws-BPEL provides the three handlers as standard mechanisms and leaves to the designer any other specification of the tasks actually executed when a handler is fired. More powerful and flexible instruments could therefore be built, but this effort is currently left entirely to the designer. For instance, no native method is available for redoing an action; the designer could use the Compensation Handler for this purpose, but it is not easy, and how to implement a redo action in Ws-BPEL is discussed in Section 4.3. Moreover, in developing advanced recovery mechanisms, the designer has to consider the side effects of each specific handler. For instance, when a Fault Handler is invoked, it terminates all the activities in the Scope upon which it is defined. Therefore, if the designer wants to provide a repair action on the process without killing part (or all) of it, he needs to catch the fault with an Event Handler, which is executed concurrently with the normal flow. The Event Handler does not kill any activity but, at the same time, it does not stop the process flow.
3 Enhancing Design Capability
To overcome the limitations of Ws-BPEL in supporting recovery actions, three different approaches can be followed: defining a totally new workflow language and workflow engine; defining an extended Ws-BPEL and the corresponding extended engine; or using annotation and preprocessing to enhance Ws-BPEL at design time without modifying the workflow engine. Since Ws-BPEL is currently the de facto standard for orchestrating Web-Service based workflows, changing the model would be counterproductive, because it would severely limit the adoption of the new model. The direction followed is therefore to give the designer several advanced recovery mechanisms while asking for very few changes to his current instruments. For this reason the presented approach follows the third way.

The price paid is the need to interpret an annotated Ws-BPEL, together with some limitations (related to the nature of the language) that no mechanism can overcome. The advantage is that no change to the engine is required: any commercial engine compliant with the Ws-BPEL 1.1 standard is supported, no choice is imposed on the company interested in these mechanisms, and there is no interference with the autonomous developments carried out by the owner of each engine.

Our approach exploits Ws-BPEL annotation, based on the XML nature of Ws-BPEL: new tags are added to the XML during the design phase and removed during the preprocessing phase. Each Ws-BPEL process can be invoked like a Web Service with a WSDL interface. The final WSDL, output of the preprocessing phase, is modified according to the new (recovery) operations that the process supports. To distinguish the operations related to the Business from those related to Management, our approach follows the direction proposed in WSDL-S [15] to specify the semantics associated with the recovery operations. This language is an extension of WSDL and supports the use of extension attributes, namely modelReference, that specify the association between a WSDL entity and a concept in a semantic model.

As stated before, the enhancement of design capability is not tied to a specific process schema. It is provided in the same way as the basic Ws-BPEL handlers, and thus each mechanism requires a design phase devoted to deciding where the mechanism is active and which actual recovery actions have to be performed once the mechanism has been fired.
4 Proposed Mechanisms
In this section the five mechanisms enabling specific recovery actions are described. They are the focus of the paper and cover a wide range of possibilities: i) the ability to modify the value of process variables by means of external messages, ii) the specification of a time deadline associated with a task, iii) the ability to redo a single Task or an entire Scope, iv) the possibility of specifying alternative paths to be followed in the continuation of the execution after the reception of an enabling message, v) the possibility of going back in the process to a point defined as safe, either to redo the same set of tasks or to perform an alternative path. Unless otherwise specified, all the mechanisms can be applied both at Task level and at Scope level.

4.1 External Variable Setting
A common class of errors in Business Process execution is related to data. In this field, recovery actions are often performed outside the process by human intervention. Even when it is performed outside the process, this kind of recovery usually requires an update of several process variables. This mechanism allows the designer simply to identify which variables can be set by an incoming message, leaving to the preprocessor the task of producing the corresponding set of Event Handlers. The basic idea behind this mechanism is to use several Event Handlers, each containing a simple activity that modifies the corresponding variable.

Abstract Extended Model. Even though the “annotated” Ws-BPEL is not executable on standard workflow engines, it can still be viewed as a workflow description language. Step by step, we will show what has to be added to the original model to enable advanced management of recovery actions. In the following the term “attribute” has a general meaning and does not imply that the corresponding XML structure is actually an attribute; in the XML it could also be an element. For this feature the model also has to consider an External Modifiable attribute ExtVar, optionally associated with each Ws-BPEL variable definition.

The Transformation Algorithm. This is the algorithm performed by the preprocessor to generate the “standard” Ws-BPEL:
1. Scan the “annotated” Ws-BPEL, generating an Event Handler for each variable whose attribute ExtVar is true.
2. Put an “assign” operation that modifies the corresponding variable in each Event Handler generated in the previous step.
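The two steps above can be sketched as a small XML tree transformation. This is a minimal illustration in Python, not the authors' preprocessor: the `extVar` attribute spelling, the `recovery` partner link, the `set_`/`msg_` naming scheme and the omission of XML namespaces and message-variable declarations are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated process: extVar="true" marks externally settable variables.
ANNOTATED = """
<process name="order">
  <variables>
    <variable name="address" extVar="true"/>
    <variable name="total"/>
  </variables>
  <sequence/>
</process>
"""


def add_ext_var_handlers(process):
    """Generate one onMessage Event Handler per variable flagged extVar="true"."""
    handlers = process.find("eventHandlers")
    # Materialize the list first: the tree is modified inside the loop.
    for var in list(process.iter("variable")):
        # pop() also strips the annotation, leaving plain Ws-BPEL behind.
        if var.attrib.pop("extVar", "false") != "true":
            continue
        if handlers is None:
            handlers = ET.Element("eventHandlers")
            # Assumption: the primary activity is the last child of <process>.
            process.insert(len(process) - 1, handlers)
        name = var.get("name")
        # Step 1: one Event Handler per externally modifiable variable.
        on_msg = ET.SubElement(handlers, "onMessage", partnerLink="recovery",
                               operation="set_" + name, variable="msg_" + name)
        # Step 2: an assign that copies the incoming value into the variable.
        copy = ET.SubElement(ET.SubElement(on_msg, "assign"), "copy")
        ET.SubElement(copy, "from", variable="msg_" + name)
        ET.SubElement(copy, "to", variable=name)
    return process


process = add_ext_var_handlers(ET.fromstring(ANNOTATED))
print(ET.tostring(process, encoding="unicode"))
```

A real preprocessor would also have to declare the message variables and add the new operations to the WSDL interface; the sketch only shows the handler generation.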
4.2 Timeout
In the communication between two web services, a problem is how long one actor can wait for a message to arrive. We need a mechanism to manage communication timeouts at process level. In Ws-BPEL the possibility of using a timeout is associated with the Event Handler and with the Pick activity. This section presents a simple way to extend this possibility by associating a timeout with a Receive activity. The designer specifies a timeout for each selected Receive activity together with the corresponding recovery actions, that is, the set of activities performed if the timeout expires.
With a simple extension involving an Event Handler, it is possible to associate a timeout with activities other than Receive. This extension exploits the possibility of using a timeout to fire an Event Handler. A Fault Handler is used to terminate the current Activity/Scope.

Abstract Extended Model. For the Timeout feature the model has to consider:
– A Timeout attribute Time, optionally associated with a “Receive” activity definition.
– A set of basic and structured Ws-BPEL activities RecActions. With respect to the default execution flow, RecActions identifies a semantically different behavior because it is performed as a recovery action, but syntactically it is composed of standard Ws-BPEL code.

The Transformation Algorithm. As shown in Fig. 1 and Fig. 2, to enable this feature each Receive activity associated with a timeout is transformed into a Pick activity with two branches, one associated with the original message and the other associated with an alarm, which is the standard mechanism provided by Ws-BPEL for managing timeouts.

Annotated receive:

    <receive>
      ......
      <Timeout = “duration-expression”>?
        RecActivity
      </Timeout>
    </receive>

Resulting pick:

    <pick>
      <onMessage>
        <Empty/>
      </onMessage>
      <onAlarm (for = Timeout)>
        RecActivity
      </onAlarm>
    </pick>

Fig. 1. Transformation of a receive with timeout in a pick
Fig. 2. UML representation of transformation
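The receive-to-pick rewriting of Fig. 1 can be sketched as a tree transformation. The following Python fragment is only an illustration under stated assumptions: the nested `timeout` element with its `for` attribute is a hypothetical spelling of the annotation, the partner link and operation names are invented, and XML namespaces are omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated fragment: <timeout> carries the recovery activities.
ANNOTATED = """
<sequence>
  <receive partnerLink="supplier" operation="quote" variable="quoteMsg">
    <timeout for="'PT30S'">
      <invoke partnerLink="backup" operation="requestQuote"/>
    </timeout>
  </receive>
  <reply partnerLink="client" operation="order"/>
</sequence>
"""


def receive_to_pick(root):
    """Replace every receive that has a timeout with a two-branch pick."""
    # Collect (parent, receive) pairs first so the tree is not mutated while iterating.
    pairs = [(p, c) for p in root.iter() for c in list(p) if c.tag == "receive"]
    for parent, receive in pairs:
        timeout = receive.find("timeout")
        if timeout is None:
            continue  # plain receive: leave it untouched
        pick = ET.Element("pick")
        # Branch 1: same message as the original receive, standard behavior.
        on_message = ET.SubElement(pick, "onMessage", dict(receive.attrib))
        ET.SubElement(on_message, "empty")
        # Branch 2: alarm after the timeout, carrying the recovery actions.
        on_alarm = ET.SubElement(pick, "onAlarm", {"for": timeout.get("for")})
        on_alarm.extend(list(timeout))
        # Put the pick exactly where the receive was.
        index = list(parent).index(receive)
        parent.remove(receive)
        parent.insert(index, pick)
    return root


out = receive_to_pick(ET.fromstring(ANNOTATED))
print(ET.tostring(out, encoding="unicode"))
```

Receives without the annotation pass through unchanged, so the transformation can be applied to the whole process in one scan.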
This is the algorithm performed by the preprocessor to generate the “standard” Ws-BPEL:
1. Scan the whole “annotated” Ws-BPEL, transforming every receive action associated with a timeout into a pick with two branches:
   – A branch characterized by the onMessage property and associated with the same message originally devoted to the corresponding receive.
   – A branch characterized by the onAlarm property and associated with the corresponding defined timeout.
2. Associate the “standard behavior” with the onMessage branch.
3. Associate the “recovery behavior” with the onAlarm branch.

4.3 Redo Mechanism
Redoing an activity is often useful when repairing a process. Ws-BPEL offers no mechanism for managing this type of situation. Redo means executing again a Task or a Scope (a set of activities) that has already completed. The Redo action is not related to the concept of rolling back a process or part of it. Following [14], we assume that at any time, concurrently with the running process, the system can ask to redo a Task or a Scope, regardless of the current point in the execution flow. The Compensation Handler is enabled only when the related Task/Scope has completed. Event Handler and Fault Handler are enabled only during the running phase of the Activity/Scope. The mechanism for redoing an activity is therefore based on the Compensation Handler.
    <CompensationHandler>?
      <switch>
        <case condition = “bool-expr”>+
          Redo activity
        </case>
        <otherwise>?
          Compensate activity
        </otherwise>
      </switch>
    </CompensationHandler>

Fig. 3. New structure of Compensation Handler
In this mechanism redo is viewed as a compensation activity. The basic idea is to define a behavior that is syntactically inserted in a Compensation Handler but is actually the repetition of the “normal” behavior. An Event Handler fired by the specific redo message is used to activate this compensation. The same construct can also be used at single task level, because the Invoke activity has an internal Compensation Handler. It is important to remark that only one Compensation Handler can be defined for each Task/Scope and that it cannot be invoked by an external message. For this reason the message raising a redo action will syntactically be an event message. The aim of the Event Handler is to set the variable that drives the choice between redo and compensation and then to call the Compensation Handler.
Fig. 4. Overview of Redo mechanism at run-time
With this approach it is possible to have two different messages for raising a compensation or a redo action. The default behavior performed in the Compensation Handler is the compensation path. The designer only has to specify the Tasks/Scopes that can be redone.

Abstract Extended Model. For this feature the model only has to consider the Redo attribute Redo, optionally associated with an Invoke activity definition or with a Scope. From the enhanced model perspective no other information is necessary. Fig. 3 shows how the information can be provided.

The Transformation Algorithm. The following transformation algorithm is performed for each Task/Scope upon which the redo attribute is defined:
1. A global variable $v_i$ is defined by the preprocessor. This variable will drive the choice between compensation and redo action.
2. A new event message and the corresponding handler are created.
3. In the Event Handler, an operation that sets $v_i$ to the “redo action” value is defined.
4. In the Event Handler, an operation that calls the Compensation Handler associated with the activity is inserted.
5. A Switch construct is put at the top of the Compensation Handler by the preprocessor.
6. The preprocessor copies the Task or the Scope into one branch of the switch in the handler and the normal compensation behavior into the other branch.
7. This switch is driven by the variable $v_i$, which indicates whether the wanted behavior is compensate or redo.
8. An operation that sets the value of $v_i$ to “Compensation”, the default value, is inserted.
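The eight steps above can be sketched as follows. This Python fragment is a rough illustration, not the authors' implementation: the `redo` attribute spelling, the `v_`/`redo_` naming scheme, the un-namespaced elements and the choice to reset the variable right after the compensate call (the text does not say where step 8 is placed) are all assumptions. It also assumes that the primary activity is the last child of the Scope.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical annotated process: redo="true" marks a Scope that can be redone.
ANNOTATED = """
<process name="order">
  <variables/>
  <eventHandlers/>
  <scope name="payment" redo="true">
    <compensationHandler>
      <invoke operation="refund"/>
    </compensationHandler>
    <invoke operation="charge"/>
  </scope>
</process>
"""


def set_var(parent, var, value):
    """Append an <assign> that copies a literal string into `var`."""
    cp = ET.SubElement(ET.SubElement(parent, "assign"), "copy")
    ET.SubElement(cp, "from", expression="'%s'" % value)
    ET.SubElement(cp, "to", variable=var)


def enable_redo(process):
    # pop() strips the annotation from every scope while selecting the flagged ones.
    flagged = [s for s in process.iter("scope") if s.attrib.pop("redo", "") == "true"]
    for scope in flagged:
        name = scope.get("name")
        var = "v_" + name
        # Step 1: variable driving the choice between compensation and redo.
        ET.SubElement(process.find("variables"), "variable", name=var)
        # Steps 2-4 and 8: the Event Handler sets the variable, calls the
        # Compensation Handler, then restores the default value.
        on_msg = ET.SubElement(process.find("eventHandlers"), "onMessage",
                               operation="redo_" + name)
        seq = ET.SubElement(on_msg, "sequence")
        set_var(seq, var, "redo")
        ET.SubElement(seq, "compensate", scope=name)
        set_var(seq, var, "compensation")
        # Steps 5-7: switch at the top of the Compensation Handler.
        handler = scope.find("compensationHandler")
        if handler is None:
            handler = ET.Element("compensationHandler")
            scope.insert(0, handler)
        original = list(handler)
        for child in original:
            handler.remove(child)
        primary = scope[-1]  # assumption: primary activity is the last child
        switch = ET.SubElement(handler, "switch")
        case = ET.SubElement(switch, "case",
                             condition="bpws:getVariableData('%s') = 'redo'" % var)
        case.append(copy.deepcopy(primary))
        otherwise = ET.SubElement(switch, "otherwise")
        otherwise.extend(original or [ET.Element("empty")])
    return process


result = enable_redo(ET.fromstring(ANNOTATED))
print(ET.tostring(result, encoding="unicode"))
```

When no compensation was defined for the Scope, the `otherwise` branch holds only an `empty` activity.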
Fig. 4 depicts a basic functional schema. Notice that the compensation branch may reduce to a shortcut if no compensation is defined for the corresponding Task/Scope.

4.4 Future Alternative Behavior
This mechanism provides a way of specifying that, as a consequence of an event, the process will from then on follow an alternative behavior instead of the default one in specific sections. The typical example is a given (and not vital) service that becomes unavailable during the process execution, with an incoming message carrying this information. Each operation related to this service will be skipped until the situation changes. The designer has to define the following items:
1. The scopes of the workflow that are sensitive to one (or more) possible alternative behaviors. To do this he defines a Scope exactly covering every region that could be substituted.
2. The set of “Alternative Behaviors” associated with the workflow, that is, the set of scopes available as alternative paths.
3. The relationship between default scopes and alternative scopes. Each region can be associated with more than one Alternative Behavior.

High-level Overview. The idea behind this mechanism is to have some alternative behaviors available along the process and related to several portions of it. The preprocessing phase uses a Switch activity to store all the possible alternative behaviors in the corresponding places; each such Switch is driven by its own variable. Each alternative behavior is fired (or killed) by a specific message. To manage this situation the Ws-BPEL Event Handler is used. The incoming message carries a boolean value (Alternative/Default). When a message arrives the Event Handler reacts, and all the variables driving the Switch activities that contain the alternative behavior indicated in the message are set to the indicated value. This approach assumes that default and alternative behaviors are mutually exclusive and that the incoming message fires the associated behavior throughout the process. If more than one alternative behavior has been defined for a single Scope, the last incoming message decides the actual behavior. This is quite reasonable because the semantics associated with the message can be summarized as: “From now on, perform the associated alternative behavior wherever it is defined”. Figure 5 shows a high-level picture of this mechanism. It is worth noticing that the message content can specify whether the alternative pattern has to be activated or the default behavior has to be restored.

Even if not yet implemented, an important evolution of this mechanism is the possibility of specifying the alternative behavior not as an absolute set of tasks with a topology predefined at design time, but as a set of rules defined upon the default behavior (e.g. after each operation related to partner a, perform an Invoke operation to partner b informing it about the content of the previous communication).
Fig. 5. Overview of future alternative behavior mechanism
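The run-time behavior of this mechanism can be mimicked with a few lines of Python. This is a toy simulation, not Ws-BPEL and not the authors' code: the class and function names, the `notifier_down` behavior and the string results are hypothetical. Each object plays the role of a Switch with its driving variable, and the function plays the role of the Event Handler body.

```python
class ConditionalScope:
    """A region whose Switch holds the default branch plus its alternatives."""

    def __init__(self, name, default, alternatives):
        self.name = name
        self.branches = {"default": default, **alternatives}
        self.selector = "default"  # plays the role of the switch-driving variable

    def run(self):
        # Switch activity: execute exactly one branch, chosen by the selector.
        return self.branches[self.selector]()


def on_behavior_message(scopes, behavior, activate):
    """Event Handler body: one assignment per scope that defines `behavior`."""
    for scope in scopes:
        if behavior in scope.branches:
            # The boolean in the message selects Alternative or Default.
            scope.selector = behavior if activate else "default"


notify = ConditionalScope(
    "notify",
    default=lambda: "invoke notifier",
    alternatives={"notifier_down": lambda: "skip notification"},
)
billing = ConditionalScope("billing", default=lambda: "invoke billing", alternatives={})
scopes = [notify, billing]

before = [s.run() for s in scopes]
on_behavior_message(scopes, "notifier_down", activate=True)
during = [s.run() for s in scopes]
on_behavior_message(scopes, "notifier_down", activate=False)
after = [s.run() for s in scopes]
print(before, during, after, sep="\n")
```

Because every message simply overwrites the selectors of the scopes it concerns, the last message received decides the behavior, as described above.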
Abstract Extended Model. The extended model for this feature also has to consider:
– A Conditional Scope $CS_i$, which is a Scope with its alternative behaviors:
  • Starting Point $CS_iP_s$ of the Scope.
  • Ending Point $CS_iP_e$ of the Scope.
  • Set of Alternative Arcs $AA_{CS_i}$ linking the Scope with all the corresponding available Alternative Scopes. Each arc is defined as $AA_{CS_i-AltScope_j}$.
– The set of Alternative Scopes $AltScope$ available for the workflow. Each $AltScope_i$ is defined as a set of basic and structured Ws-BPEL activities. $AltScope_{CS_i}$ identifies the set of alternative behaviors associated with a Conditional Scope $CS_i$. With respect to the default execution flow, $AltScope_i$ identifies a semantically different behavior because it is performed as a recovery action, but syntactically it is composed of standard Ws-BPEL code.

The Transformation Algorithm. The preprocessing algorithm is defined as follows:
1. For each Conditional Scope $CS_i$:
   – A Switch activity starting at $CS_iP_s$ and ending at $CS_iP_e$ is placed in correspondence with the Conditional Scope. A variable $r_{CS_i}$ drives the switch construct.
   – The associated $AltScope_{CS_i}$ is identified. It represents the set of alternative behaviors available for $CS_i$.
   – The Switch structure has $|AltScope_{CS_i}| + 1$ branches. One branch is filled with the default behavior and each of the others with an available Alternative Scope.
2. For each Alternative Scope $AltScope_i$:
   – An Event Handler $EH_{AltScope_i}$, associated with the message $MSG_{AltScope_i}$ that enables the Alternative Scope $AltScope_i$, is defined.
   – All the Scopes (and the set of corresponding $r_{CS_i}$) where the $AltScope_i$ behavior is defined are identified.
   – The body of $EH_{AltScope_i}$ is filled with a set of Assign activities that switch each involved $r_{CS_i}$ from its current value to $AltScope_i$ or to the original $CS_i$, according to the message content.

4.5 Rollback and Conditional Re-execution of the Flow
A common recovery action in workflow exception management is to compensate a part of the executed process, rolling back the state to a “safe point”. A Scope can be compensated using the simple Compensation Handler provided by standard Ws-BPEL, but the execution flow can only proceed forward, and no “jump” or “go to” constructs are provided. The only way to go back is to use a loop in a proper way. In the following we will refer to a more general mechanism that allows the process to be rolled back to a safe point and then to execute the same or a possibly different behavior. The concept of safe points is derived from [11], and their identification is left to the designer. In the following they will be referred to as “migration points”, because each point can be the “starting point” for migrating to an alternative behavior. The designer has to specify:
– The points considered safe, that is, the points suitable for ending a rollback process and starting a migration process. The migration process can also be a “self-migration”, in which the starting and ending configurations/behaviors are the same.
– The compensation process associated with the rollback action.
– The arcs linking a “starting migration point” with the related “ending migration point” of different behaviors, and the optional transformation process associated with each arc.
The use of both “Starting” and “Ending” Migration Points allows general migrations without enforcing a symmetry among different behaviors.

High Level Overview. The solution is derived from the one presented in [16], where the authors propose a way of specifying a run-time change of configuration as a reaction to a context change. It uses the property of the Fault Handler, which terminates all the activities inside a Scope and restarts the flow after the end of that Scope. This Scope is inserted in a loop, and the Fault Handler modifies the variable driving the loop so that another iteration is performed. Two different switch constructs drive the choice of the configuration to be performed after the rollback phase and the restart from the correct point in that configuration. In this way, by combining the Fault Handler properties, specific code performed in the Fault Handler and an appropriate main flow structure, it is possible to obtain a behavior that realizes the rollback using the Ws-BPEL language.

Abstract Extended Model. The extended model will consider:
– A Context-sensitive region (CSR), that is, a workflow subprocess that supports rollback and may have several configurations exhibiting different behavior according to specific conditions. A CSR is defined exactly upon a Ws-BPEL Scope. A region is composed of alternative configurations linked by particular arcs called migration arcs. A migration arc is associated with instructions for migrating a workflow instance from one configuration to another.
– A Configuration, which is also a workflow subprocess, but with different characteristics with respect to a CSR. A Configuration $Conf_i$ is composed of: i) an entry condition EC; ii) basic and structured activities; iii) a set of Starting Migration Points MPs; iv) a set of Ending Migration Points MPe; v) a set of directed Configuration Arcs FC. The entry condition EC is an expression used to define when the configuration has to be entered. In a typical workflow execution the default behavior satisfies the EC, but the EC may have been changed in the past to enable a specific behavior.

The Transformation Algorithm. The transformation algorithm for translating an annotated Ws-BPEL (see Fig. 6) into a standard one (see Fig. 7) is the following:
1. For each configuration (each Context-Sensitive Region has some alternative configurations) the preprocessor builds n + 1 sub-configurations (where n is the number of migration points; the additional sub-configuration is for the default behavior), each one starting from a different migration point (or from the start of the configuration) and ending at the end of the configuration.
2. All these sub-configurations become branches of a switch construct. This switch is driven by the Ending_mig_point variable.
3. At this point there is a Switch construct for each configuration. Each configuration becomes a branch of another Switch construct. This latter switch is driven by the behavior variable.
4. The Switch construct is put inside the Scope where the exception is managed. This ensures that at the end of exception management the flow is just outside the switch construct.
5. The Scope, that is, the Switch, is put inside a Loop. This Loop is driven by the pass_through variable: if no exception has been raised the flow leaves the loop, otherwise the flow goes back to the start of the loop.
6. In the place of each starting migration point, an activity called Update_MPs is put; it is necessary to keep the last_MPs variable up to date.
7. As the first task of the Loop, before the beginning of the Scope, an activity called Update_pass_through is put. The aim of this activity is to set pass_through = true, so that the flow can leave the loop at the end of the current iteration even if an exception was raised in a previous one.

Fig. 6. Example of designer specification for a Scope supporting rollback execution

To better understand the run-time behavior, let us suppose that an exception is raised. The handler starts and performs the following actions:
1. Compensate back to the last migration point.
2. Set pass_through = false.
3. Kill the current Scope and restart the flow at its end.

Fig. 7. Standard Ws-BPEL implementing specification of Fig. 6

Now let us suppose that during the execution of Task 3 (see Fig. 7) a recovery action that requires rollback has to be performed. This information is carried in a fault message; when it arrives, all the activities in the Scope are terminated and the Fault Handler is called. The Fault Handler sets the variable pass_through to false, then performs the recovery actions associated with the rollback process, and determines the new configuration that will be followed (again “Default” or “Alternative”) and the Ending Migration Point corresponding to the last Starting Migration Point (that is, the Safe Point) that the flow has passed. Finally, if needed, it performs the actions associated with the migration process. The main flow restarts immediately after the end of the Scope when the Fault Handler ends. The variable that drives the loop construct has been set so that another iteration is performed, and the main flow therefore goes back to the start of the loop. The first activity in the loop (Update_pass_through) is necessary to allow the flow to exit the loop. Then the first switch drives the choice of the right configuration, and the second, driven by the Ending_mig_point variable, determines the point at which the flow has to start in the new configuration. If the new configuration is the same as the previous one, the system is performing a traditional rollback operation.
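The control structure described above (a loop around a scope, two nested switches and a fault handler that resets the loop variable) can be simulated in a few lines of Python. This is a simplified sketch under stated assumptions, not Ws-BPEL and not the authors' code: configurations are flat task lists, entries starting with `MP` are migration points, a starting migration point maps to the ending migration point of the same name, and a Python exception stands in for the fault message.

```python
class RollbackRequested(Exception):
    """Stands in for the fault message that asks for a rollback."""

    def __init__(self, new_behavior):
        super().__init__(new_behavior)
        self.new_behavior = new_behavior


def run_region(configs, fault_at):
    """Run a context-sensitive region; fault_at maps a task to the behavior to migrate to."""
    behavior, ending_mp, last_mp = "default", None, None
    pending = dict(fault_at)  # each fault fires only once
    log = []
    pass_through = False
    while not pass_through:                 # Loop driven by pass_through
        pass_through = True                 # Update_pass_through activity
        tasks = configs[behavior]           # first switch: choose the configuration
        # Second switch: choose the sub-configuration starting at the migration point.
        start = 0 if ending_mp is None else tasks.index(ending_mp)
        try:                                # Scope carrying the Fault Handler
            for task in tasks[start:]:
                if task.startswith("MP"):
                    last_mp = task          # Update_MPs activity
                    continue
                if task in pending:
                    raise RollbackRequested(pending.pop(task))
                log.append(task)
        except RollbackRequested as fault:  # Fault Handler
            log.append("compensate back to %s" % (last_mp or "start"))
            pass_through = False            # forces another iteration of the loop
            behavior = fault.new_behavior   # configuration followed next
            ending_mp = last_mp             # point where the flow will restart
    return log


CONFIGS = {
    "default": ["t1", "MP1", "t2", "t3"],
    "alternative": ["a1", "MP1", "a2", "a3"],
}
migrated = run_region(CONFIGS, {"t3": "alternative"})
redone = run_region(CONFIGS, {"t3": "default"})
print(migrated)
print(redone)
```

The second call requests the same configuration again, which corresponds to the traditional rollback case: the tasks after the safe point are simply executed a second time.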
Harmonizations of Proposed Mechanisms
The semantic changes that our mechanisms apply to many constructs are very useful, but attention must be paid to possible conflicts among mechanisms. In fact, the final Ws-BPEL code is much more complicated than the original one. Some basic policies have been followed in defining the mechanisms:
– Each mechanism is triggered by a different message, to avoid non-determinism.
– Each mechanism is independent and self-contained, that is, different mechanisms do not communicate. Therefore, given that each mechanism terminates, deadlocks introduced by cyclic and reciprocal calls among different mechanisms are prevented.
– The region upon which a mechanism is defined coincides with a Ws-BPEL Scope. This ensures that partial overlaps among regions are not allowed.
These three policies, together with general considerations about the behavior of the mechanisms, lead us to conclude that conflicts can be avoided. Moreover, we assume that the engine supports only one recovery mechanism at a time. This choice improves robustness and can be realized by blocking, and if necessary buffering, incoming messages using a dedicated firewall.
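The assumption that only one recovery mechanism runs at a time can be illustrated with a small simulation. The following Python sketch is not part of the proposed approach and is not Ws-BPEL: the class `RecoveryGate` and its methods are hypothetical names, standing in for a firewall that blocks and buffers incoming recovery messages while a mechanism is active.

```python
from collections import deque


class RecoveryGate:
    """Hypothetical gate: admits one recovery message at a time, buffers the rest."""

    def __init__(self):
        self.active = None      # mechanism currently running, if any
        self.buffer = deque()   # recovery messages waiting for their turn
        self.log = []           # order in which mechanisms were actually started

    def receive(self, mechanism):
        """A recovery message arrives at the gate."""
        if self.active is None:
            self._start(mechanism)
        else:
            # Another mechanism is running: block and buffer the message.
            self.buffer.append(mechanism)

    def finish(self):
        """The active mechanism has terminated; release the next buffered one."""
        self.active = None
        if self.buffer:
            self._start(self.buffer.popleft())

    def _start(self, mechanism):
        self.active = mechanism
        self.log.append(mechanism)


gate = RecoveryGate()
gate.receive("redo")      # started immediately
gate.receive("timeout")   # buffered: "redo" is still active
gate.receive("rollback")  # buffered behind "timeout"
gate.finish()             # "redo" ends, "timeout" starts
gate.finish()             # "timeout" ends, "rollback" starts
gate.finish()             # "rollback" ends, nothing left to start
```

In this sketch the mechanisms are served in arrival order; the text does not prescribe an ordering for buffered messages, so first-in-first-out is an assumption made only for illustration.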
5
Related Work
Recovery actions for workflow systems have been widely studied in the past. The works [6,8,18] present specific workflow models that widely support recovery actions; in [12] the authors focus on the analysis, prediction, and prevention of exceptions in order to reduce their occurrence. The model presented in [13] focuses on the handling of expected exceptions and the integration of exception handling in the execution environment, while in [5] the authors propose the use of “worklets”, a repertoire of self-contained subprocesses and associated selection and exception handling rules, to support the modelling, analysis and enactment of business processes. The work in [9] presents the requirements of a Web Service Management framework which also includes the typical functionalities addressed in self-healing systems. The authors analyze and compare multiple alternative architectures for the implementation of Web Service Management systems, proposing Web service substitution and complex service re-composition as repair actions. In addition, an extensive amount of work on flexible recovery in the context of advanced transaction models has been done, e.g., in [10,19]. These works show in particular how some of the concepts used in transaction management can be applied to workflow environments. A “minimal” approach to recovery can be built with BPML [4], which uses a Petri net based model focusing on flexibility. For this reason recovery actions are viewed as a transition from the actual (faulty) state to a new (correct) one, but the constraints on state transitions that guarantee the correctness of a recovery action have to be defined by the designer. High flexibility is ensured, but the effort required of the designer becomes cumbersome.
We compare our approach in detail with the systematic approach to recovery actions presented in [14]. In that work the authors consider a set of recovery policies defined both on tasks and on regions of a workflow. They use an extended Petri net approach to change the normal behaviour when an expected but unusual situation or failure occurs. As in our approach, the recovery policies are set at design time. Table 1 shows how the mechanisms presented in Hamadi’s work can be realized using Ws-BPEL. Several solutions use standard Ws-BPEL handlers; others are realized by exploiting the mechanisms proposed in this paper. When Hamadi uses the term “after”, he means the moment after the execution of a task/region has finished and before any subsequent task/region is initiated. The term region in his work corresponds to Scope in our approach.
Table 1. Comparison between recovery actions presented in [14] and the proposed mechanisms

Recovery Policy in [14]   Proposed solution (Task level)   Proposed solution (Scope level)
Redo                      Redo mechanism                   Redo policy
RedoAfter                 Redo mechanism                   Redo mechanism
Compensate                Compensation Handler             Compensation Handler
CompensateAfter           Compensation Handler             Compensation Handler
AlternativeActivity       Catch fault                      rollback + Fault Handler
AlternativeProvider       Dynamic Binding                  -
Skip                      Catch fault empty                Fault Handler
SkipTo                    Not supported                    Not supported
Timeout                   TimeOut mechanism                TimeOut mechanism
The meaning of the “redo” actions is analogous in our approach and in Hamadi’s. With the mechanism presented in Section 4.3 it is possible to obtain the same behavior. The “compensation” is realized in Ws-BPEL by the Compensation Handler.
According to Hamadi, the “Alternative recovery policy” allows another task/region T’ to be executed in place of a running task/region T in case the latter fails. The proposed mapping to Ws-BPEL distinguishes between task and region. In the first case the solution is simple: a Fault Handler suitably filled with the alternative behavior. In the second case an analogous behavior can be realized either by again using a Fault Handler filled with the alternative behavior, or, better, by using the mechanism of Rollback and conditional re-execution of the flow. This mechanism allows the designer to define more than one alternative behavior, linking the choice of the behavior to the point of the workflow execution at which the fault happens. “Alternative Provider” for single tasks can be realized in Ws-BPEL using the dynamic binding mechanisms provided by several Ws-BPEL engines; however, this is not supported by standard Ws-BPEL. It is still not possible to implement a behavior analogous to the Alternative Provider pattern at Scope level.
The “skip” is mapped in Ws-BPEL to an empty Fault Handler defined on the single Task or on the Scope. The “skip to” is not mapped to Ws-BPEL because the language does not support any way to realize a free goto operation. The “timeout” is implemented in Ws-BPEL using the corresponding method presented in Section 4.2. The other mechanisms presented in our work have no direct mapping; this is consistent with the idea of the authors of [14], who state that recovery patterns should be extended and improved to cope with more complex situations.
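The Scope-level mapping of the Alternative policy, that is, rollback followed by conditional re-execution, can be pictured with a short simulation. The Python sketch below is an illustration under stated assumptions and not the generated Ws-BPEL. The function `run_process`, its parameters and the task names are hypothetical. It also assumes that safe points sit at the same positions in every configuration, so that an index can stand in for the pair of Starting and Ending Migration Points, and that the Fault Handler always selects the configuration passed as `next_config`.

```python
class RollbackFault(Exception):
    """Stands in for the fault message that requests a rollback."""


def run_process(configs, faults, next_config):
    """Simulate the loop, the Scope and its Fault Handler.

    configs     -- hypothetical: configuration name -> list of (task, is_safe_point)
    faults      -- hypothetical: (configuration, task) pairs that fail once
    next_config -- hypothetical: configuration chosen by the Fault Handler
    """
    trace = []
    pending = set(faults)
    config, start = "Default", 0
    pass_through = False

    while not pass_through:
        pass_through = True          # first activity in the loop
        done = []                    # tasks completed since the last safe point
        safe_point = start
        try:                         # the Scope
            tasks = configs[config]              # first switch: configuration
            for i in range(start, len(tasks)):   # second switch: entry point
                name, is_safe = tasks[i]
                if (config, name) in pending:
                    pending.discard((config, name))
                    raise RollbackFault(name)
                trace.append(f"run {config}:{name}")
                done.append(name)
                if is_safe:
                    safe_point = i + 1   # resume after this task on rollback
                    done.clear()
        except RollbackFault:        # the Fault Handler
            for name in reversed(done):          # compensate back to the safe point
                trace.append(f"compensate {config}:{name}")
            pass_through = False                 # force another loop iteration
            config, start = next_config, safe_point
    return trace


configs = {
    "Default": [("T1", True), ("T2", False), ("T3", False)],
    "Alternative": [("T1", True), ("A2", False), ("A3", False)],
}
trace = run_process(configs, {("Default", "T3")}, "Alternative")
rollback_only = run_process(configs, {("Default", "T3")}, "Default")
```

With the fault raised at T3, `trace` records T1 and T2 running, T2 being compensated, and then A2 and A3 of the alternative configuration running from the safe point after T1. When the handler selects the same configuration again, `rollback_only` shows a traditional rollback: T2 is compensated and then T2 and T3 are re-executed.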
6
Conclusion and Future Work
In this paper some mechanisms for enabling recovery actions using the standard Ws-BPEL language and engine have been presented. Ws-BPEL is the de-facto standard language for Web-based process orchestration, and overcoming several of its limitations regarding recovery without modifying it is a real need, because Self-Healing is an emerging requirement for Information Systems where processes are becoming more complicated every day and where many autonomous actors are involved. The proposed mechanisms cover a wide range of possibilities: i) the specification of a time deadline associated with a task, ii) the ability to redo a single Task or an entire Scope, iii) the possibility of specifying alternative paths to follow in the continuation of the process after the reception of an enabling message, iv) the possibility of going back in the process to a point defined as safe, in order to redo the same set of tasks or to perform an alternative path.
Ongoing research proceeds in several directions: the improvement of the “Future alternative behavior” mechanism, to define a suitable set of rules that allows the alternative behavior to be designed in a parametric way with respect to the default behavior; the implementation of an efficient preprocessor; and the demonstration of the absence of conflicts among mechanisms. Future research will cover the development of a graphical tool for Ws-BPEL annotation and the study of new and more flexible mechanisms that should allow the freezing, and even the killing, of a Ws-BPEL instance in order to create a new one that inherits the state of the dead one. An orthogonal aspect of the future work will be the enrichment of the simple, prototypical ontology of faults and recovery actions used so far. Finally, an interesting use of the presented mechanisms can also be envisaged in the development of a Self-Healing workflow engine within an advanced Self-Healing Information System. In this scenario they can be leveraged in more complex recovery strategies, decided somewhere in the environment, and composed of a part under the control of the Ws-BPEL engine (i.e. these mechanisms) and of a part hidden from it and managed outside.
Acknowledgement. This work is partially funded by the EU Commission within the FET-STREP Project WS-Diamond. The authors are grateful to Prof. Fugini for the fruitful discussions.
References
1. http://www.alphaworks.ibm.com/tech/bpws4j, 2002.
2. http://www.oracle.com/technology/products/ias/bpel/index.html, 2003.
3. http://www.activebpel.org, 2004.
4. A. Arkin et al. Business process modeling language BPML 1.0, 2002.
5. M. Adams, A.H.M. ter Hofstede, D. Edmond, and W.M.P. van der Aalst. Facilitating flexibility and dynamic exception handling in workflows through worklets. In Short Paper Proceedings at (CAiSE), volume 161 of CEUR Workshop Proceedings, Porto, Portugal, 2005.
6. F. Casati, S. Ceri, B. Pernici, and G. Pozzi. Workflow evolution. Data Knowl. Eng., 24(3):211–238, 1998.
7. F. Curbera, Y. Goland, J. Klein, F. Leymann, D. Roller, S. Thatte, and S. Weerawarana. Business Process Execution Language for Web Services, version 1.0, 2002. http://www.ibm.com/developerworks/library/ws-bpel/.
8. J. Eder and W. Liebhart. Workflow recovery. In Proc. of IFCIS Int. Conf. on Cooperative Information Systems (CoopIS), pages 124–134, Brussels, Belgium, 1996. IEEE.
9. E. Esfandiari and V. Tosic. Towards a web service composition management framework. In Proc. of Int. Conf. on Web Services (ICWS), Orlando FL, USA, 2005.
10. D. Georgakopoulos, M.F. Hornick, and F. Manola. Customizing transaction models and mechanisms in a programmable environment supporting reliable workflow automation. IEEE Trans. Knowl. Data Eng., 8(4):630–649, 1996.
11. P. Grefen, B. Pernici, and G. Sanchez (Eds). Database Support for Workflow Management - The WIDE Project. Kluwer Academic Publishers, 1999.
12. D. Grigori, F. Casati, U. Dayal, and M.C. Shan. Improving business process quality through exception understanding, prediction, and prevention. In Proc. of Int. Conf. on Very Large Data Bases (VLDB), Roma, Italy, 2001.
13. C. Hagen and G. Alonso. Exception handling in workflow management systems. IEEE Trans. Software Eng., 26(10):943–958, 2000.
14. R. Hamadi and B. Benatallah. Recovery nets: Towards self-adaptive workflow systems. In Proc. of Int. Conf. on Web Information Systems Engineering (WISE), pages 439–453, Brisbane, Australia, 2004.
15. J. Miller, K. Verma, P. Rajasekaran, A. Sheth, R. Aggarwal, and K. Sivashanmugam. Adding semantics to wsdl. White paper, 2004. http://lsdis.cs.uga.edu/library/download/wsdl-s.pdf.
16. S. Modafferi, B. Benatallah, F. Casati, and B. Pernici. A methodology for designing and managing context-aware workflows. In Proc. of IFIP TC 8 Working Conference on Mobile Information Systems (MOBIS), Leeds, UK, 2005.
17. B. Pernici (Ed). Mobile Information Systems Infrastructure and Design for Adaptivity and Flexibility. Springer, 2006.
18. M. Reichert, S. Rinderle, U. Kreher, and P. Dadam. Adaptive process management with ADEPT2. In Proc. of Int. Conf. on Data Engineering ICDE, pages 1113–1114, Tokyo, Japan, 2005.
19. H. Wächter and A. Reuter. The ConTract model. In A.K. Elmagarmid, editor, Database Transaction Models for Advanced Applications, pages 219–263. Morgan Kaufmann, 1992.