2010 Fourth IEEE International Conference on Secure Software Integration and Reliability Improvement
YAWL2DVE: An Automated Translator For Workflow Verification
Fazle Rabbi Centre for Logic and Information St. Francis Xavier University Antigonish, Canada Email:
[email protected]
Hao Wang Centre for Logic and Information St. Francis Xavier University Antigonish, Canada Email:
[email protected]
task. Current WfMSs facilitate the enactment of workflows with some degree of fault tolerance, e.g., exception handling, but their capacity for automated formal verification is limited. Some workflow systems provide limited forms of verification. The YAWL (Yet Another Workflow Language) [6] project has identified several properties of processes that may need to be verified. YAWL can verify the soundness property of workflow nets (a restricted subclass of Petri nets), which guarantees the absence of live-locks, deadlocks, and other anomalies without domain knowledge [7]; examples are “the end of the workflow is reachable from any point in the workflow” and “all tasks can contribute to the completion of the workflow, i.e., there are no paths that are never traversed in any reachable state”. Formal verification, however, is commonly done using temporal logics as the specification language. The two main categories of temporal logics used for verification are linear time logics, such as Linear Temporal Logic (LTL), and branching time logics, such as Computation Tree Logic (CTL). The model checkers SPIN and DiVinE [8] verify LTL properties. The popular SMV model checker is capable of verifying both LTL and CTL properties. Research into the application of SPIN to workflow specification and verification can be found in [9], which focuses on translating workflow patterns, which are language independent, into Promela (the modeling language of SPIN) in order to formally verify workflow systems. Unfortunately, the well-known state explosion problem often limits the extent to which model checking can be applied to realistic models of systems, due to the huge resulting memory requirements [10]. Conventional sequential model checking tools are not suitable for handling real-world systems with high memory and computational power requirements.
DiVinE is a promising system which primarily focuses on verifying models with large state spaces, providing a distributed and parallel model checking environment that addresses these requirements. In [11], we manually translated a number of workflow patterns into DVE. These patterns, defined by Van der Aalst et al. [12], are well accepted as the basic building blocks for the design and development of workflow models. However, our small case study showed that manual translation is tedious and error-prone, which makes it infeasible for large and complex models. This paper presents an automated translator which
Abstract—Workflow management systems (WfMSs) have gained increasing attention recently as an important technology to improve information system development in dynamic and distributed organizations. However, the absence of verification facilities in most WfMSs puts implementations of large and complex workflow models at risk of undesirable runtime executions. The problem of design validation, i.e., ensuring the correctness of the design at the earliest stage possible, is a major challenge for any responsible system development process, and the activities intended for its solution occupy an ever increasing portion of the development cycle cost and time budgets. Model checking is a popular technique to systematically and automatically verify system properties, but it requires a substantial effort to convert the system design into a specific model checking program. In this paper, we present an automated translator (YAWL2DVE) which can convert a graphical workflow model into DVE, the input language of DiVinE. DiVinE is a distributed and parallel model checker, which can effectively handle the well-known “state explosion problem” of this domain. We show the effectiveness of this translator with a case study on a real-world health care workflow model.
Keywords-workflow management; formal verification; model checking; YAWL;
I. INTRODUCTION
Workflow Management Systems provide automated support for defining and controlling various activities (tasks) associated with business processes [1] [2]. The automated support reduces costs and flow times for business processes by improving the robustness of the process and increasing productivity and quality of service [3] [4]. Once a workflow is set up for a business process, the actors are expected to act according to the predefined set of rules. The workflow can be analyzed and modified at a later time depending on the data gathered from past experience. A survey of current verification and validation methods for software systems shows that only a small amount of research is based on the formal verification approach [5]. The focus has been on simulation and testing, but for safety-critical systems this approach can be expensive in terms of time, money or even lives. Therefore, there is a clear need for a verified software model, one that assures that the necessary specifications hold in every run, before the system is enacted in real time. Given the large number of rules and their frequency of change, manual verification is generally a time-consuming
978-0-7695-4086-3/10 $26.00 © 2010 IEEE DOI 10.1109/SSIRI.2010.31 10.1109/SSIRI.2010.32
Wendy MacCaull Centre for Logic and Information St. Francis Xavier University Antigonish, Canada Email:
[email protected]
can convert a YAWL model into DVE, the input language of DiVinE. In this way, the verification of workflows becomes easier and faster. The methodology can easily be extended to generate input for other model checkers such as SMV and SPIN, and we anticipate that it can be adapted to other graphical workflow modeling languages. Our case study shows that an automated translator for WfMSs is a promising approach to the verification of real-world workflow models. In addition, with the automated translator, verification tasks can be easily adapted to different settings, which matters because healthcare processes vary among hospitals. The remainder of this paper is organized as follows. Section II provides background information; Section III presents the automated translator; Section IV presents a case study; and Section V concludes the paper and offers some directions for future work.
C. DiVinE (High-Performance Model Checking)
DiVinE is a distributed explicit-state LTL model checker based on the automata-theoretic approach to LTL model checking [8]. Its algorithms search for accepting cycles in the product of the system model and a Büchi automaton for the negation of the formula; an accepting cycle, if found, yields a counter-example, which is useful for debugging. DVE, the input language of DiVinE, is sufficiently expressive to model general processes. The basic modelling unit in DVE is a system, which is composed of processes. Processes can go from one process state to another through transitions, which can be guarded by a condition; this condition, called a “guard”, determines whether the transition can be activated. The property to be verified can be written as an LTL formula, from which a corresponding property process can be generated automatically. The modeled system processes and the property process progress synchronously, so the latter can observe the system’s behavior step by step and catch errors.
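To make the notion of guarded transitions concrete, the following Python sketch mimics a single process with named states, guards and effects. It is an illustration only, not DVE syntax and not DiVinE code; the names `Transition`, `enabled` and `step` are hypothetical, and a real model checker would explore every enabled transition rather than just the first one.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical, simplified model of a DVE-style process: named states and
# guarded transitions over a dictionary of integer variables.
@dataclass
class Transition:
    source: str
    target: str
    guard: Callable[[Dict[str, int]], bool]    # must hold for the transition to fire
    effect: Callable[[Dict[str, int]], None]   # updates variables when it fires


def enabled(state: str, variables: Dict[str, int],
            transitions: List[Transition]) -> List[Transition]:
    """Return the transitions that can be activated from `state`."""
    return [t for t in transitions if t.source == state and t.guard(variables)]


def step(state: str, variables: Dict[str, int],
         transitions: List[Transition]) -> str:
    """Fire the first enabled transition, or stay in place if none is enabled."""
    candidates = enabled(state, variables, transitions)
    if not candidates:
        return state
    chosen = candidates[0]
    chosen.effect(variables)
    return chosen.target


# A process that waits until `signal` is positive, then works and resets it.
transitions = [
    Transition("waiting", "working", lambda v: v["signal"] > 0, lambda v: None),
    Transition("working", "waiting", lambda v: True,
               lambda v: v.update(signal=0)),
]
```

With `signal` equal to zero the guard blocks the process in `waiting`; once the variable becomes positive the transition to `working` is enabled.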
II. BACKGROUND
This section provides background information for the terminology and methods used in this work.
III. THE AUTOMATED TRANSLATOR (YAWL2DVE)
In [11], a number of established workflow patterns were translated into DVE. In this translation, processes, subprocesses and activities are mapped into DVE processes, and control flow paths are mapped into DVE channels. Messages between processes are represented, without loss of generality, by integers in DVE. This work was the foundation for the development of the automated translator. YAWL2DVE is developed in Java and can translate any YAWL workflow model. It takes a YAWL file, which is an XML file, as input and generates an mdve file as output. The mdve file can be combined with an LTL properties file, and the resulting file can then be used for verification. The following steps are used in the YAWL2DVE translation:
1) Parse XML and construct workflow components
2) Create links and channel numbers
3) Process multiple tasks
4) Process multiple choice
5) Generate DVE code from the root net decomposition
Fig. 1 shows the workflow components that YAWL2DVE creates after processing the XML file. A task can be Atomic or Multiple depending on the type of workflow task component. Each task holds references to its next and previous elements. For each atomic Task instance YAWL2DVE will create one process in DiVinE, and for each multiple task instance it will create multiple processes in DiVinE. Processes send and receive signals through unique channels, which are created using the task IDs and channel indices. During execution of the model checker, processes remain in the waiting state until they receive activation signals, after which they move to the working state. A process activates its subsequent processes after completing its work. Fig. 2 shows a registration workflow; part of its XML is provided below:
A. Workflow patterns
A workflow can be a depiction of a sequence of operations, declared as the work of a person, of a simple or complex mechanism, of a group of persons, of an organization of staff, or of a machine. A well-known collection of twenty workflow patterns was presented by Van der Aalst et al. [12]. This collection of patterns focuses on one specific aspect of process-oriented application development, namely the description of control flow dependencies between activities in a workflow. Since their release, these patterns have been widely used by practitioners, vendors and academics alike in the selection, design and development of workflow systems. A comprehensive review of pattern support in fourteen distinct process modeling tools, including workflow systems (Staffware, WebSphere MQ, COSA, iPlanet, SAP Workflow and FileNet), case handling systems (FLOWer), business process modelling languages (BPMN, UML 2.0 Activity Diagrams and EPCs) and business process execution languages (BPEL4WS, WebSphere BPEL, Oracle BPEL and XPDL), may be found in [13].
B. YAWL (Yet Another Workflow Language)
YAWL is a BPM (Business Process Management)/Workflow system, based on a concise and powerful graphical modeling language. YAWL handles complex data, transformations, integration with organizational resources and Web Service integration. YAWL uses a Petri net-based formalism that was extended with additional features to facilitate the modeling of complex workflows [6]. A workflow specification in YAWL is a set of extended workflow nets (EWF-nets) which may be nested. A YAWL net is made up of tasks, conditions and flow relations between them. Three kinds of split and three corresponding kinds of join (AND, XOR and OR) allow one to construct any complex workflow. Tasks can be atomic, multiple or composite.
Fig. 1. Class diagram of workflow components.
tion. In the DiVinE translation of this workflow, YAWL2DVE will create six processes named InputCondition_1, Registration, Create_Personal_Profile, Create_Home_Chart, Registration_Confirmation and OutputCondition_27. The InputCondition_1 process will send a sync signal through channel 'chan_InputCondition_1_0'. The Registration process will be activated after receiving that signal, and after completing its work it will send two sync signals through 'chan_Registration_28_0' and 'chan_Registration_28_1'. The processes will be executed according to the workflow order. Fig. 3 shows the attributes of the tasks. For each net decomposition of the workflow, a NetDecomposition instance will be created which contains one InputCondition, one EndCondition and one or more Task and Condition instances. Steps 3, 4 and 5 are described in the following subsections.
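The channel naming used in the Registration example can be sketched in a few lines of Python. This is a hedged illustration rather than the translator's Java code: the function name `channel_names` is hypothetical, and the underscore-separated format is an assumption inferred from the channel names quoted in the text and in the DVE listing for process F.

```python
from typing import List


def channel_names(task_id: str, out_degree: int) -> List[str]:
    """Build one channel name per outgoing flow of a task.

    Assumption: a channel is named "chan_<task id>_<index>", where the index
    counts the outgoing flows of the task starting from zero.
    """
    return [f"chan_{task_id}_{index}" for index in range(out_degree)]


# The Registration task (ID assumed to be "Registration_28") has two
# outgoing flows, so it gets two channels.
registration_channels = channel_names("Registration_28", 2)
```

Because task IDs are unique within the XML file, combining the ID with a per-task index is enough to keep channel names unique across the whole model.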
.... Registration ....
A. Process multiple tasks
Fig. 2. Registration workflow
Process multiple tasks with prior design time knowledge: Fig. 4 shows a multiple task M which has the property values maxInstance = 3 and type = static. YAWL2DVE will process M as in Fig. 5.
Process multiple tasks with run time knowledge: Fig. 6 shows how YAWL2DVE will simplify a multiple task M with minInstance = 2, maxInstance = 3 and type = dynamic. At runtime, at least 2 and at most 3 processes will be executed.
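The two ways of decomposing a multiple task described above can be sketched as follows. The exact structure of Figs. 5 and 6 is not reproduced here, so this is only an assumption-laden illustration: the function name `expand_multiple_task`, the instance naming scheme `M_1, M_2, ...` and the split into mandatory and optional instances are all hypothetical.

```python
from typing import List, Tuple


def expand_multiple_task(task_id: str, min_instance: int, max_instance: int,
                         kind: str) -> List[Tuple[str, bool]]:
    """Expand a multiple task into atomic instances.

    Returns (instance name, mandatory) pairs. Assumptions:
    - "static": every one of the max_instance copies always runs;
    - "dynamic": the first min_instance copies always run and the
      remaining ones may or may not run at runtime.
    """
    if kind not in ("static", "dynamic"):
        raise ValueError(f"unknown multiple task type: {kind}")
    mandatory = max_instance if kind == "static" else min_instance
    return [(f"{task_id}_{i}", i <= mandatory)
            for i in range(1, max_instance + 1)]


static_m = expand_multiple_task("M", 3, 3, "static")
dynamic_m = expand_multiple_task("M", 2, 3, "dynamic")
```

For the dynamic case the model checker would have to consider both the run with two instances and the run with three, which is why the optional instance is marked separately.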
The task IDs in the XML file are unique. An index and these IDs are used to generate channel numbers for the communica-
Fig. 5. Decomposition of multiple tasks with prior design time knowledge into atomic tasks
Fig. 6. Decomposition of multiple tasks with runtime knowledge into atomic tasks
Fig. 3. Instances for the Registration workflow
a unique orChoiceVariableName, through which the sync process learns the choice information.
Fig. 4. Multiple tasks with prior design time knowledge
Algorithm 2: ProcessMultipleChoice
    v_list ← φ
    for j ∈ orJoinTasks do
        s ← FindSplitTask(j)
        v ← MakeVariable(j)
        j.or_join ← v
        s.or_split ← v
        v_list.add(v)
    if v_list ≠ φ then
        CreateGlobalVariables(v_list)
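Algorithm 2 can be read as the following Python sketch. It is a hedged paraphrase, not the translator's implementation: the `Task` class is a stand-in, `find_split_task` is passed in as a parameter because the paper does not say how the matching split is located, and the variable naming `<task id>_OR` is an assumption based on the example variable F_9_OR.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Task:
    """Hypothetical stand-in for a workflow task component."""
    id: str
    or_join: Optional[str] = None    # variable this task synchronizes on
    or_split: Optional[str] = None   # variable this task updates when splitting


def make_variable(join_task: Task) -> str:
    # Assumed naming scheme, e.g. task "F_9" yields "F_9_OR".
    return f"{join_task.id}_OR"


def process_multiple_choice(
        or_join_tasks: List[Task],
        find_split_task: Callable[[Task], Task]) -> Dict[str, int]:
    """Pair each OR-join with its split task through a shared variable.

    Returns the global variables to declare, all initialized to zero.
    """
    v_list: List[str] = []
    for join_task in or_join_tasks:
        split_task = find_split_task(join_task)
        variable = make_variable(join_task)
        join_task.or_join = variable
        split_task.or_split = variable
        v_list.append(variable)
    return {variable: 0 for variable in v_list}
```

Returning a dictionary of zero-initialized names mirrors the statement that the multiple-choice variables are declared globally and start at zero.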
Process multiple tasks without synchronization: YAWL2DVE processes the “Multiple tasks without synchronization” pattern after the ‘Create links and channel numbers’ step. It removes the sync signals from the branches which do not need synchronization, and sets noSyncFlag to true for the affected tasks by using Algorithm 1:
In the translated DVE code, the multiple-choice variables (v_list) are declared globally and initialized to zero. Any multiple choice synchronizing process (OR-Join) stays in the idle state until its multiple-choice variable becomes positive. The corresponding split task increases the value of the multiple-choice variable whenever it sends an activation signal to a task from which the synchronizing process is reachable. Once the variable is positive, the synchronizing process changes its state from “idle” to “waiting”. It then waits for the sync signals and counts them. When the total number of sync signals equals the value of the multiple-choice variable, it starts working. Other processes between the split and the join can change the value of the multiple-choice variable, depending on further split or join operations. In the workflow of Fig. 7, the split task corresponding to OR-Join task F is A, as all the incoming branches of F originate from Task A. Let the multiple-choice variable be F_9_OR. Initially F_9_OR equals zero. Task A activates two branches and sets the value of F_9_OR to 2. Task F will wait for two sync signals. However, Task B activates another branch,
Algorithm 1: RemoveSyncSignals
    Input: Task t
    Result: Removes synchronization signals from branches
    if t.split_code = “AND” & t.or_split ≠ φ then
        for n ∈ nextTasks(t) do
            if n ∈ multipleTasks then
                RemoveSyncFromLast(n, t.or_split)
B. Process Multiple Choice
Handling multiple choice patterns requires additional information in order to synchronize: the synchronizing process needs to know the number of active choices, and it will start working only after receiving exactly that number of signals. YAWL2DVE handles the multiple choice pattern by Algorithm 2. This algorithm identifies the two corresponding OR-tasks (an OR-Join and its corresponding split task) and creates
The RemoveSyncFromLast procedure visits the task’s descendant elements and removes synchronization signals from the task that was synchronizing with an OR-Join task whose or_join equals t.or_split. Note that or_join is set in Algorithm 2.
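The counting behaviour of an OR-join synchronizing process, as described for the multiple choice pattern, can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the generated DVE code: the class name `OrJoinSync` and its methods are hypothetical, the shared multiple-choice variable is modelled as the attribute `expected`, and the sketch assumes that every branch activation is recorded before the last sync signal arrives.

```python
class OrJoinSync:
    """Hypothetical model of an OR-join process with states idle/waiting/working."""

    def __init__(self) -> None:
        self.expected = 0   # plays the role of the multiple-choice variable
        self.received = 0   # sync signals counted so far
        self.state = "idle"

    def branch_activated(self, count: int = 1) -> None:
        """A split task reports `count` newly activated branches."""
        self.expected += count
        if self.state == "idle" and self.expected > 0:
            self.state = "waiting"

    def sync_received(self) -> None:
        """One incoming branch has finished and sent its sync signal."""
        if self.state != "waiting":
            raise RuntimeError("sync signal received while not waiting")
        self.received += 1
        if self.received == self.expected:
            # All active branches have arrived: start working and reset counters.
            self.state = "working"
            self.expected = 0
            self.received = 0

    def finish(self) -> None:
        """Work is done; the join returns to idle until the next activation."""
        if self.state == "working":
            self.state = "idle"
```

In a scenario where one split activates two branches and a later split activates a third, the join starts working only after the third sync signal.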
Algorithm 3: GetDveForNet
    Input: theNet, srcChannel, destChannel
    Output: DVE Code
    if theNet.lock = true then
        // YAWL2DVE does not allow recursive composite references
        return
    theNet.lock ← true
    dve ← GetDveForTask(theNet.inputCondition, srcChannel, destChannel, null)
    if theNet.isRootNet = false then
        theNet.cloneCounter++
    theNet.lock ← false
    return dve
Fig. 7. Multiple choice
Algorithm 4: GetDveForTask
    Input: Task t, srcChnl, destChnl, or-contexts
    Output: DVE Code
    dve ← INITIALIZE()
    compDve ← φ
    dve.append(JoinStatements(t))
    if t ∈ compositeTasks then
        cNet ← t.compositeRef
        chnl1 ← t.id + “_CALL_” + cNet.cloneCounter
        chnl2 ← t.id + “_END_” + cNet.cloneCounter
        dve.append(CompositeCallStatements(chnl1, chnl2))
        compDve ← GetDveForNet(cNet, chnl1, chnl2)
    if t.or_split ≠ φ then
        or-context ← t.or_split
        or-contexts.add(or-context)
    dve.append(SplitStatements(t))
    if compDve ≠ φ then
        dve.append(compDve)
    for n ∈ nextTasks(t) do
        c ← GetDveForTask(n, srcChnl, destChnl, or-contexts)
        dve.append(c)
    return dve
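The interplay of the lock and the cloneCounter in Algorithms 3 and 4 can be illustrated with a reduced Python sketch. It is not the translator's code: join, split and OR-context handling are omitted, the emitted strings are placeholders rather than DVE, and the classes `Task` and `Net` are hypothetical stand-ins for the workflow components.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    id: str
    next_tasks: List["Task"] = field(default_factory=list)
    composite_ref: Optional["Net"] = None   # set only for composite tasks


@dataclass
class Net:
    input_condition: Task
    is_root: bool = False
    lock: bool = False
    clone_counter: int = 0


def get_dve_for_net(net: Net, src: Optional[str], dest: Optional[str]) -> List[str]:
    if net.lock:
        return []            # recursive composite reference: not expanded
    net.lock = True
    dve = get_dve_for_task(net.input_condition, src, dest)
    if not net.is_root:
        net.clone_counter += 1   # the next expansion of this net gets a fresh suffix
    net.lock = False
    return dve


def get_dve_for_task(t: Task, src: Optional[str], dest: Optional[str]) -> List[str]:
    dve = [f"process {t.id}"]
    if t.composite_ref is not None:
        c_net = t.composite_ref
        call = f"{t.id}_CALL_{c_net.clone_counter}"
        end = f"{t.id}_END_{c_net.clone_counter}"
        dve.append(f"channels {call}, {end}")
        dve.extend(get_dve_for_net(c_net, call, end))
    for n in t.next_tasks:   # depth-first traversal of the successors
        dve.extend(get_dve_for_task(n, src, dest))
    return dve
```

When two composite tasks decompose to the same net, the counter differs between the two expansions, so the generated call and end channel names stay distinct; the lock makes a self-referencing net terminate instead of recursing forever.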
so it will increase the value of F_9_OR; now its value will be 3. Task F will now have to receive exactly 3 sync signals to become active. The DVE code for the OR-Join process F is as follows:
    process F {
        int no_of_ac = 0;
        ......
        idle -> waiting {guard F_9_OR>0;},
        waiting -> idle {guard F_9_OR==0 && no_of_ac==0;},
        waiting -> use_chan {guard F_9_OR!=no_of_ac; sync chan_C_7_0?; effect no_of_ac=no_of_ac+1;},
        waiting -> use_chan {guard F_9_OR!=no_of_ac; sync chan_D_6_0?; effect no_of_ac=no_of_ac+1;},
        waiting -> use_chan {guard F_9_OR!=no_of_ac; sync chan_E_9_0?; effect no_of_ac=no_of_ac+1;},
        waiting -> working {guard F_9_OR>0 && F_9_OR==no_of_ac;},
        use_chan -> waiting {guard no_of_ac!=F_9_OR;},
        use_chan -> working {guard no_of_ac==F_9_OR; effect F_9_OR=0,no_of_ac=0;},
        working -> waiting {sync chan_G_10_0!;}
    }
C. Generate DVE code from root net decomposition
Algorithm 3 starts visiting the components from the root net decomposition’s input condition in DFS order. Algorithm 3 and Algorithm 4 are used to generate the DVE code for the components.
Process Composite tasks: A composite task decomposes to another workflow net. If the isComposite property is true for any task component, it means [...] net decomposition (compositeRef). YAWL2DVE handles this efficiently by using a cloneCounter. If two or more composite tasks are decomposed to the same NetDecomposition, then the cloneCounter will be used to generate unique process names for the tasks. The use of the cloneCounter is described in Algorithm 3 and Algorithm 4.
Parameter mappings and Task decomposition: Currently YAWL2DVE processes the basic data types (e.g., integer, byte, Boolean), and we intend to extend it with more complex data types in the future. It processes the task decomposition variables of the YAWL task components and populates the variableList of Tasks (Fig. 1). YAWL2DVE supports only basic mathematical operators (e.g., + - * / =); it will be extended with more XQuery expressions in the future. The parameter expressions are used to set values to the global variables of the DVE translation. These expressions are used in the effect statement of a transition.
YAWL2DVE processes flow predicates with basic mathematical operators (e.g., + - * / = > < >=