Verification Problems in Conceptual Workflow Specifications
A.H.M. ter Hofstede (1), M.E. Orlowska (#), J. Rajapakse (#)
Department of Computer Science, The University of Queensland, Brisbane, Qld 4072, Australia
e-mail: farthur,
[email protected]
#Distributed Systems Technology Centre The University of Queensland Brisbane, Qld 4072 Australia e-mail:
[email protected]
Abstract
Most of today's business requirements can only be accomplished through integration of various autonomous systems which were initially designed to serve the needs of particular applications. In the literature workflows are proposed to design these kinds of applications. The key tool for designing such applications is a powerful conceptual specification language. Such a language should be capable of capturing, among others, interactions and cooperation between component tasks of workflows. These include sequential execution, iteration, choice, parallelism and synchronisation. The central focus of this paper is the verification of such process control aspects in conceptual workflow specifications. As it is generally agreed that the later in the software development process an error is detected, the more it will cost to correct it, it is of vital importance to detect errors as early as possible in the systems development process. In this paper some typical verification problems in workflow specifications are identified and their complexity is addressed. It will be proven that some fundamental problems are not tractable and we will show what restriction is needed to allow termination problems to be recognized in polynomial time.
Keywords: Workflow, Verification, Computational Complexity
(1) Current address: School of Information Systems, Queensland University of Technology, GPO Box 2434, Brisbane Qld 4001, Australia. E-mail: [email protected].

1 Introduction
Information systems integration may be considered a key theme for the 1990s since it is a prerequisite for running businesses cost-effectively. Most of today's business requirements can only be accomplished through the integration of various autonomous systems which were initially designed to serve the needs of particular applications. In designing such integrated systems
we use pre-existing systems as components of the new system. As a result, we now see some complex applications which span several pre-existing systems. Workflows have been identified as a good candidate for designing such applications [2, 6, 8, 9, 10, 12, 13, 26, 25]. In a workflow, a set of operations performed to achieve some basic business process on one pre-existing system is usually described as a task. A set of tasks constitutes a workflow, where these tasks may be interrelated in some way reflecting business application needs. The tasks are either existing information processes in a pre-existing system or they may be implemented on request by the workflow designer. Workflow management systems (WFMSs) have been developed to implement workflows as a whole. Analogously to other systems, WFMSs have conceptual and physical levels. It is important to capture all the details of tasks and their interactions at a conceptual level in order to prevent premature implementation decisions which may lead to suboptimal solutions later on. As validation is of crucial importance, workflow specifications should be comprehensible, as this facilitates communication with domain experts. One way of achieving comprehensibility is by offering graphical representations of modelling concepts. In order to be capable of adequately capturing a workflow problem, the workflow specification language should have sufficient expressive power. In particular this means that constructs for various forms of process control should be offered. Finally, a workflow specification language should have a formal foundation in order to allow for formal reasoning and prevent interpretation ambiguities. The formal foundation should include both syntax and semantics. Summarizing: a conceptual workflow specification language should be comprehensible, have sufficient expressive power and have a formal foundation.
Workflow management research has received much attention in recent years.
In [12] an up-to-date high-level overview of the current workflow management methodologies and software products is provided. Research prototypes, such as those described in [21, 13, 18], concentrate on concurrency, correctness, and recovery of workflows, while commercial products pay more attention to user-friendly workflow specification tools, leaving aspects like recovery and correctness to designers. Some WFMSs have not given much attention to the conceptual level [26]. As identified in [12], workflow management needs a complete framework starting from the conceptual level. In [7] a good review of current work at the conceptual level is provided and a conceptual level language to describe workflows is presented. Its implementation model is based on active database technology. Necessary active rules are semi-automatically generated from the conceptual level language. In this reference it is claimed that the work described in [1, 6, 10] lacks expressive power concerning the possibility of specifying task interactions and the mapping from workflow specification to workflow execution, in particular with regard to exception handling. However, in [7] a formal semantics of the language presented is not given. Hence, it is not possible to reason about the correctness of the conceptual specifications.
Workflow modelling is similar to process modelling [12]. In a workflow context, tasks are basic work units that collectively achieve a certain goal. The collective nature shows various types of process dependencies. Therefore, whatever language is used to specify workflows, it should be sufficiently powerful to capture those dependencies: sequential order, parallelism, iteration, choice, synchronisation etc.
The central focus of this paper is the identification of verification issues in conceptual workflow specifications, as far as process control is concerned, and their complexity. Clearly, verification
at the conceptual level is crucial, as it is a well-known fact that the later in the development process an error is detected, the more expensive it is to correct it. As mentioned above, it is desirable to have sufficient expressive power in specification languages. However, there is an obvious trade-off between expressive power and complexity of verification. In this paper, we will prove that some verification problems in the context of workflow specifications are not tractable or even undecidable. In addition to that, we will focus on a restriction of expressive power which allows termination considerations to be performed in polynomial time.
The paper is organised as follows. In the next section, core concepts for specifying process dependencies are introduced and an example of a workflow specification is given using these concepts. In section 3, a number of verification problems are defined and their complexity is determined. In section 4, a necessary and sufficient condition for detecting termination in a restricted workflow specification language is given and its correctness is proved. Section 5 concludes the paper and identifies topics for further research.
2 Essential Workflow Concepts
The specification of workflows in general is known to be quite complex and many issues are involved. Workflow specifications should be capable of expressing at least:
- Properties of tasks, such as compensatability (i.e. can the result of the task be undone), pre- and postconditions (which might involve complex time aspects), redo-ability (can a task be redone) etc.
- Information flows between tasks and information of a more persistent nature (e.g. external databases).
- Execution dependencies between tasks (also referred to as control flow). These dependencies can be based on conditions (value based or failure/successful) or temporal, parallel/sequential etc.
- Capacities of tasks. Capacities might e.g. refer to storage, to throughput, or to numbers of active instances.
Generally speaking, workflow specifications need not pay attention to task functionality, as the focus is on the coordination of tasks. In this paper, the focus is solely on control flow in workflow specifications. Any conceptual workflow specification language should at least be capable of capturing moments of choice, sequential composition, parallel execution, and synchronization. In the next section, task structures are presented which are capable of modelling these task dependencies. Their informal explanation is based on [15]; for a definition of their formal semantics the reader is referred to [14]. It should be stressed here that task structures serve as a means to an end. They might be viewed as a kernel for workflow specification concepts and are used in this paper to study verification problems. The results extend to any language offering concepts for choice, sequential composition, parallel execution, and synchronization.
2.1 Informal Explanation of Task Structures
Task structures were introduced in [5] to describe and analyze problem solving processes. In [28, 29] they were extended and used as a meta-process modelling technique for describing the strategies used by experienced information engineers. In [14] they were extended again and a formal semantics in terms of Process Algebra was given [3]. In figure 1, the main concepts of task structures are graphically represented. They are discussed subsequently.

[Figure 1: Graphical representation of task structure concepts. Labelled elements: decomposition, task, initial item, trigger, non-terminating decision, synchroniser, terminating decision; tasks named A, B, C, E, F, G, H.]

The central notion in task structures is the notion of a task. In a workflow context, tasks are basic work units that collectively achieve a certain goal. A task can be defined in terms of other tasks, referred to as its subtasks. This decomposition may be performed repeatedly until a desired level of detail has been reached. Tasks with the same name have the same decomposition, e.g. the tasks named B in figure 1. Performing a task may involve choices between subtasks; decisions represent these moments of choice. Decisions coordinate the execution of tasks. Two kinds of decisions are distinguished, terminating and non-terminating decisions. A decision that is terminating may lead to termination of the execution path of that decision. If this execution path is the only active execution path of the supertask, the supertask terminates as well. Triggers, graphically represented as arrows, model sequential order. In figure 1 the task with name G can start after termination of the top task named B. Initial items are those tasks or decisions that have to be performed first as part of the execution of a task that has a decomposition. Due to iterative structures, it may not always be clear which task objects are initial. Therefore, this has to be indicated explicitly. Finally, synchronisers deal with explicit synchronisation. In figure 1 the task named H can only start when the tasks with names C and G have terminated. As we use task structures as a generic language for workflow specifications, in the rest of the paper we refer to them as workflow structures.
2.2 Example: Health Insurance

[Figure 2: Main task for health insurance. Tasks shown: Find Client, Enter Appeal Information, Update Information, Enter Information, Evaluate Health Status, Generate Rejection Letter, Create Policy, Generate Acceptance Letter, Formalize Reply, Receive Reply, Start Appeal Process, Archive Application.]

As an example consider a health insurance application processing workflow [17]. Figure 2 depicts the workflow specification of the health insurance application processing system. In processing a health insurance application, the first task is to find the client details. There could be three types of clients, namely new, old, and clients with appeals. Therefore, there is a decision after the task Find Client. Depending on the outcome of the decision the next appropriate task will be executed. The task named Evaluate Health Status is a supertask which needs further decomposition. The decomposition is shown in figure 3.

[Figure 3: Decomposition of Evaluation of Health Status. Tasks shown: Study Application, Request History, Request more Information, Response Received, Request Opinion from Medical Expert, Opinion Received.]

[Figure 4: Decomposition of Formalizing Replies. Tasks shown: Mail Documents, Update Files.]

Again, the outcome of this task has to be evaluated. If the health status of the client is acceptable then tasks Create Policy and Generate Acceptance Letter will be executed in parallel. If the health status is not acceptable then a rejection letter will be sent. Then the reply can be formalized (see figure 4), meaning that the involved documents can be mailed and the relevant files updated. Clients are allowed to appeal against the company's decision. Therefore, the client restarts the application process again as an appealing client. Finally, it should be mentioned that the dashed box does not have any special meaning and will be used in a later section.
2.3 Syntax of Workflow Structures
In this section the syntax of workflow structures is defined using elementary set theory. A workflow structure consists of the following components:
1. A set $X$ of workflow objects. $X$ is the (disjoint) union of a set of synchronisers $S$, a set of tasks $T$ and a set of decisions $D$. In $D$ we distinguish a subset $D_t$ consisting of the terminating decisions.
2. A relation $\mathrm{Trig} \subseteq X \times X$ of triggers.
3. A function $\mathrm{Name}: T \rightarrow V$ yielding the name of a task, where $V$ is a set of names.
4. A partial decomposition function $\mathrm{Sup}: X \rightharpoonup V$. If $\mathrm{Sup}(x) = v$, this means that workflow object $x$ is part of the decomposition of $v$. Names in the range of this function are called decomposition names. The set $Q = \mathrm{ran}(\mathrm{Sup})$ contains all these names. The complement $A = V \setminus Q$ is the set of atomic actions.
5. A partial function $\mathrm{Init} \subseteq \mathrm{Sup}$ yielding the initial items of a task.
Workflow structures are required to have a unique task at the top of the decomposition hierarchy. For this task the decomposition function should be undefined, i.e.
$$\exists!_{t \in T}\,[\mathrm{Sup}(t)\!\uparrow]$$
In the remainder, we call this unique task the main task and refer to it as $t_0$. Using the terminology of graph theory, this requirement stipulates that the decomposition hierarchy of task structures should be rooted. Contrary to [29], we do not require the decomposition hierarchy to be acyclic, but allow recursive decomposition structures. They allow for recursive specification of workflows. Decomposition allows for the modularization of workflow specifications. Typically, modules should be loosely connected and hence, triggers should not cross decomposition boundaries:
$$x_1\,\mathrm{Trig}\,x_2 \Rightarrow \mathrm{Sup}(x_1) = \mathrm{Sup}(x_2)$$
Finally, each workflow object should be reachable from an initial item. This implies that there should be a trigger path from such an initial item to the workflow object in question:
$$\forall_{x \in X \setminus \{t_0\}}\,\exists_{i \in X}\,[\mathrm{Init}(i) = \mathrm{Sup}(x) \wedge i\,\mathrm{Trig}^{*}\,x]$$
In this requirement, $\mathrm{Trig}^{*}$ is the reflexive transitive closure of $\mathrm{Trig}$.
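The three syntactic requirements above (unique main task, triggers that stay within one decomposition, reachability from an initial item) lend themselves to a mechanical check. The following is a minimal sketch only; the class name `WorkflowStructure`, its field names and the function `check_syntax` are assumptions introduced for illustration and do not appear in the paper.

```python
from dataclasses import dataclass


@dataclass
class WorkflowStructure:
    """Illustrative container for the components of section 2.3 (names are assumptions)."""
    tasks: set            # T
    decisions: set        # D
    term_decisions: set   # D_t, a subset of D
    synchronisers: set    # S
    trig: set             # Trig, a set of (source, target) pairs
    name: dict            # Name: T -> V
    sup: dict             # partial function Sup: X -> V
    init: dict            # partial function Init, a subset of Sup

    @property
    def objects(self):
        return self.tasks | self.decisions | self.synchronisers


def check_syntax(w):
    """Return a list of violated well-formedness requirements (empty if none)."""
    errors = []
    # Exactly one task, the main task, has no supertask.
    roots = [t for t in w.tasks if t not in w.sup]
    if len(roots) != 1:
        errors.append("main task is not unique")
    # Init must be a subset of Sup.
    if any(w.sup.get(i) != v for i, v in w.init.items()):
        errors.append("Init is not a subset of Sup")
    # Triggers must not cross decomposition boundaries.
    succ = {}
    for x1, x2 in w.trig:
        if w.sup.get(x1) != w.sup.get(x2):
            errors.append(f"trigger {x1}->{x2} crosses a decomposition boundary")
        else:
            succ.setdefault(x1, set()).add(x2)
    # Every object except the main task must be reachable, via Trig*,
    # from an initial item of its own decomposition.
    reached = set(w.init)
    frontier = list(w.init)
    while frontier:
        x = frontier.pop()
        for y in succ.get(x, ()):
            if y not in reached:
                reached.add(y)
                frontier.append(y)
    for x in w.objects - set(roots):
        if x not in reached:
            errors.append(f"{x} is not reachable from an initial item")
    return errors
```

Because only non-crossing triggers are followed, every object reached from an initial item lies in the same decomposition as that item, which is what the reachability requirement asks for.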
3 Some Verification Problems and Their Complexity
In this section the focus is on the definition of verification problems in workflow specifications and their complexity. These problems solely concern process control issues and assume that specifications may use the core workflow concepts as presented in the previous section. In a workflow specification it can be relevant to be able to determine whether a certain task can be invoked. This is particularly true if the task involved is critical, i.e. without its execution the workflow cannot be considered successful.
Definition 3.1 The initiation problem for a workflow object $x$ in a workflow structure $W$ is to determine whether there is a sequence of events leading to the execution of $x$. □
Theorem 3.1 The initiation problem is NP-complete (= NTIME(poly)-complete).
Proof: For this proof we describe a polynomial time transformation of SATISFIABILITY to the initiation problem. SATISFIABILITY, or SAT for short, is known to be NP-complete (see e.g. [11]) and formally corresponds to the question whether there is a satisfying truth assignment for a set $K$ of clauses over $U$, a set of boolean variables. Each clause is a set of literals, which are either boolean variables from $U$ or negations of boolean variables from $U$. The proof given here is inspired by the proof given in [16], where a comparable translation was used for proving the fact that determining liveness in a Free Choice Petri Net is an NP-complete problem.
Let $K$ be an instance of SAT. The workflow structure $W_K$ resulting from the transformation consists of a main task $t_0$ with, for each boolean variable $u \in U$, a decision $D_u$ as initial item. This set of decisions is to capture all possible truth assignments to the boolean variables in $U$. Hence, each decision $D_u$ has an output trigger to a task with name $u$ and an output trigger to a task with name $\sim u$ (to avoid confusion, the symbol $\sim$ is used for logical negation in this proof). Let $t$ be a task with name $x$, where $x$ is a literal. For each clause $C$ in $K$ for which $\sim x \in C$, a trigger from $t$ to a task with name $\langle x, C \rangle$ exists (where it is assumed that $\sim\sim x \equiv x$). Further, for each $C$ a synchroniser $S_C$ exists which has as input all tasks with names of the form $\langle x, C \rangle$. Finally, each synchroniser $S_C$ has an output trigger to a task with name $K$.
Note that the construction implies that a synchroniser $S_C$ will only be triggered if every one of its input tasks is executed, which means that each of the literals in $C$ evaluates to false in a particular truth assignment. The construction guarantees that the task with name $K$ can be initiated if and only if $K$ is not satisfiable. Suppose $K$ is not satisfiable, then in any truth assignment $T$ there is a clause $C = \{x_1, \ldots, x_n\}$ that evaluates to false under $T$. Hence, all literals $x_i$ evaluate to false under $T$. Synchroniser $S_C$ depends on the tasks with names $\langle \sim x_1, C \rangle, \ldots, \langle \sim x_n, C \rangle$. All these tasks will be started, as a task with name $\langle \sim x_i, C \rangle$ will be started by a task with name $\sim x_i$. As $x_i$ evaluates to false under $T$, decision $D_{u_i}$, where $u_i$ is the boolean variable in literal $x_i$, will trigger the task with name $\sim x_i$. Synchroniser $S_C$ will then trigger the task with name $K$. By analogous reasoning, if $K$ is satisfiable, then none of the synchronisers $S_C$ will be started. Clearly this construction is a polynomial time transformation.
Finally, to prove that the problem is in NP, it is sufficient to observe that the verification of an execution scenario (leading to the execution of a workflow object) can be done in polynomial time. Note that the translation will never result in loops and that decomposition is not used (except for the main task). □
[Figure 5: Workflow structure for $K = A \wedge B \wedge C = (p \vee q \vee r) \wedge (p \vee r) \wedge (q \vee r)$. Tasks shown: p, ~p, q, ~q, r, ~r, K.]

Example 3.1 The construction of the previous proof is illustrated in figure 5, where the result of the translation of the formula $K = A \wedge B \wedge C = (p \vee q \vee r) \wedge (p \vee r) \wedge (q \vee r)$ to a workflow structure is presented. □
A state of a workflow structure may be defined as a multiset of its workflow objects. If a workflow object occurs $n$ times in the multiset, $n$ instances are active. The formal semantics of a workflow structure might be defined as its set of associated states and possible transitions between these states (see also section 4.3). The initial state then corresponds to the multiset containing the initial items of the main task exactly once. A state is terminal if it solely consists of terminating decisions and tasks without output triggers. As infinite workflow specifications are not desirable, it is imperative to be able to detect them statically.
Definition 3.2 The termination problem is to determine whether a workflow structure can reach a terminal state. □
Unfortunately, any algorithm solving the termination problem will require at least an exponential amount of storage space.
Theorem 3.2 The termination problem is DSPACE(exp)-hard.
Proof: By a reduction of the reachability problem for Ordinary Petri Nets to the termination problem. In [16] it is proven that the reachability problem is DSPACE(exp)-hard. Ordinary Petri Nets are Petri Nets where the multiplicity of any place is limited to be less than or equal to one. In [23] it is shown that the reachability problem for Ordinary Petri Nets can be reduced to the reachability problem for Petri Nets.
The reachability problem has as input an Ordinary Petri Net $\mathcal{P}$ and markings $\mu$ and $\mu'$ of $\mathcal{P}$. A marking assigns a finite number of tokens to each place. The question is whether marking $\mu'$ can be reached from marking $\mu$ (see e.g. [23, 24]). An Ordinary Petri Net is a four-tuple $\langle P, T, I, O \rangle$, where $P$ is a set of places, $T$ is a set of transitions, $I: T \rightarrow \wp(P)$ is the input function, a mapping from transitions to sets of places, and $O: T \rightarrow \wp(P)$ is the output function, also a mapping from transitions to sets of places.
Let $\mathcal{P} = \langle P, T, I, O \rangle$ be an Ordinary Petri Net and $\mu$ and $\mu'$ markings of $\mathcal{P}$. The corresponding workflow structure $W_{\mathcal{P}(\mu,\mu')}$ has a task $T_p$ for each place $p$ and a synchroniser $S_t$ for each transition $t$. If $p$ is a place with exactly one arrow to a certain transition $t$, then there is a corresponding trigger between $T_p$ and $S_t$. If $p$ is a place with more output arrows, this means that there is a choice between these transitions. Hence, a decision $D_p$ is introduced and a trigger from $T_p$ to $D_p$ as well as, for each arrow from $p$ to a transition $t$, a trigger from $D_p$ to $S_t$. Finally, each arrow from a transition $t$ to a place $p$ results in a trigger from $S_t$ to $T_p$.
The initial marking determines the initial items of the workflow structure $W_{\mathcal{P}(\mu,\mu')}$. For each place $p$ with $n$ tokens in marking $\mu$ ($n > 0$), we create $n$ synchronisers, without input triggers, but with exactly one output trigger to task $T_p$. All such synchronisers are initial items of $W_{\mathcal{P}(\mu,\mu')}$. Note that because the trigger relation $\mathrm{Trig}$ is a set, we cannot create a single synchroniser for $p$ with $n$ arrows to task $T_p$. Marking $\mu'$ is reachable from marking $\mu$ iff a state containing each task $T_p$ exactly $\mu'(p)$ times (and no other workflow objects) is reachable in $W_{\mathcal{P}(\mu,\mu')}$. □
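The translation used in this proof is purely structural, so it can be sketched in a few lines. The sketch below is illustrative only: the function name `petri_to_workflow`, the tuple encoding of workflow objects such as `("T", p)`, and the dictionary-based representation of the input and output functions are assumptions made for the example and are not taken from the paper.

```python
def petri_to_workflow(places, transitions, inputs, outputs, marking):
    """Translate an Ordinary Petri Net into workflow objects and triggers.

    inputs/outputs map each transition to a set of places; marking maps
    places to token counts. Object encodings such as ("T", p) are illustrative.
    """
    tasks = {("T", p) for p in places}
    synchronisers = {("S", t) for t in transitions}
    decisions, trig, initial = set(), set(), set()

    for p in places:
        consumers = [t for t in transitions if p in inputs[t]]
        if len(consumers) == 1:
            # Exactly one outgoing arrow: a direct trigger suffices.
            trig.add((("T", p), ("S", consumers[0])))
        elif len(consumers) > 1:
            # Several outgoing arrows: a decision models the choice.
            decisions.add(("D", p))
            trig.add((("T", p), ("D", p)))
            for t in consumers:
                trig.add((("D", p), ("S", t)))

    # Each arrow from a transition to a place becomes a trigger.
    for t in transitions:
        for p in outputs[t]:
            trig.add((("S", t), ("T", p)))

    # One input-free synchroniser per token of the initial marking.
    for p, n in marking.items():
        for k in range(n):
            s = ("S0", p, k)
            synchronisers.add(s)
            initial.add(s)
            trig.add((s, ("T", p)))

    return tasks, decisions, synchronisers, trig, initial
```

The number of generated objects and triggers is linear in the size of the net plus the number of tokens in the initial marking, in line with the structural nature of the translation.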
Remark 3.1 In the previous proof it is necessary to use Ordinary Petri Nets instead of unrestricted Petri Nets, as the latter would require the trigger relation $\mathrm{Trig}$ to be a multiset. In unrestricted Petri Nets there may be several arrows from a place to the same transition, or several arrows from a transition to the same place. This latter situation is not problematic in terms of the current definition of workflow structures, as it can be simulated by a "misuse" of the synchroniser: if a workflow object $v$ should activate $n$ instances of a workflow object $w$, then $n$ intermediate synchronisers with one input trigger from $v$ and an output trigger to $w$ capture this behaviour. The former situation, however, cannot be captured. It would correspond to a situation where a synchroniser would have to await completion of a certain number of instantiations of the same workflow object. In the context of capacities (discussed later on in this section) it might be desirable to adapt the definition of workflow structures and to allow $\mathrm{Trig}$ to be a multiset. □

[Figure 6: Translating Petri Nets to Workflow Structures. Both the Petri Net and the workflow structure use the labels A, B, C, D, E, F.]

Example 3.2 In figure 6 an example of a Petri Net and its corresponding Workflow Structure is shown. □
Definition 3.3 A workflow structure is safe if and only if from every reachable state a terminal state can be reached. □

Corollary 3.1 Determining safeness is a DSPACE(exp)-hard problem.

As capacities may play an important role in workflows, it may be important to know how many active copies of a workflow object may come into existence at a certain point of time. For example, the execution of a certain workflow object may be the responsibility of a certain department with only $n$ members. In case of more than $n$ invocations of that workflow object, the department is overloaded.

Definition 3.4 A workflow object $w$ is $n$-bounded if and only if in every reachable state $w$ does not occur more than $n$ times. □

Theorem 3.3 Determining whether a workflow object $w$ is $n$-bounded is a DSPACE(poly)-complete problem.
Proof: Follows immediately from the translation of Ordinary Petri Nets to Workflow Structures and the fact that determining $n$-boundedness in a Petri Net is DSPACE(poly)-complete (see e.g. [16]). The reader is reminded of the inclusion NTIME(poly) $\subseteq$ DSPACE(poly). □
Often it is desirable to determine whether two specifications are equivalent, i.e. whether they express the same workflow. This might be relevant e.g. in the context of execution optimization. Formally, two workflow specifications are equivalent iff they can generate exactly the same set of traces. A trace corresponds to a list of atomic actions (see section 2.3) in an order as performed in a complete execution of a workflow specification. Atomic actions correspond to basic functionality in a workflow structure. For a formal definition of a trace refer to [14].
Theorem 3.4 The equivalence problem for workflow structures is undecidable.
Proof: For context-free grammars $G_1$ and $G_2$, determining whether $L(G_1) = L(G_2)$ is undecidable (see e.g. [27]). Formally, a context-free grammar $G$ is a tuple $\langle N, \Sigma, \Pi, S \rangle$, where $N$ is a finite set of nonterminal symbols, $\Sigma$ is a finite set of terminal symbols, $S \in N$ is the initial symbol and $\Pi$ is a set of production rules of the form $A \rightarrow \omega$ where $A \in N$ and $\omega \in (N \cup \Sigma)^{*}$ (see e.g. [11]).
A context-free grammar $G$ can be translated to a workflow structure $W_G$ such that $w \in L(G)$ if and only if $w$ is a trace of $W_G$. For every nonterminal $t$ there will be a decomposition with name $t$, having a decision $D_t$ as initial item. This decision is terminating iff $G$ contains a rule of the form $t \rightarrow \varepsilon$. Let $P$ be a production rule of the form $t \rightarrow s_1, \ldots, s_n$. For every $s_i$ ($1 \le i \le n$) there is a task $T_P^i$ with name $s_i$ in the decomposition of name $t$. Furthermore, there is a trigger from $D_t$ to the task with name $s_1$. Hence production rule $P$ results in a sequence of tasks in an order corresponding to the order of the nonterminals and terminals $s_i$ in its righthand side. As $P$ might be one of several production rules for $t$, decision $D_t$ is introduced, allowing a choice for $P$. The main task has a decision $D_S$ as initial item (recall that $S$ is the initial symbol of the grammar). This completes the construction of $W_G$.
If $w \in L(G)$, then the corresponding trace simply follows a ($\rightarrow$) derivation of $w$ by choosing, in each decision associated with each nonterminal, the production rule chosen in that derivation. The other way around is identical. Hence, we achieved a one-to-one correspondence between context-free grammars and workflow structures. Finally, note that this proof does not require the use of synchronisers, but makes substantial use of the fact that recursive decomposition structures are allowed. □
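The grammar translation in this proof can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `grammar_to_workflow` and the tuple encodings `("D", t)` and `("T", t, r, i)` are hypothetical, and the triggers between consecutive tasks of one production are an assumption read from the phrase "results in a sequence of tasks" (the proof itself only states the trigger from the decision to the first task).

```python
def grammar_to_workflow(productions):
    """Translate a context-free grammar into workflow objects.

    productions maps each nonterminal to a list of right-hand sides,
    each a list of symbols; an empty list stands for the empty string.
    """
    decisions, terminating, trig = set(), set(), set()
    name, sup = {}, {}
    for t, rules in productions.items():
        d = ("D", t)                 # decision D_t, initial item of decomposition t
        decisions.add(d)
        sup[d] = t
        for r, rhs in enumerate(rules):
            if not rhs:
                terminating.add(d)   # a rule t -> epsilon makes D_t terminating
                continue
            prev = d
            for i, s in enumerate(rhs):
                task = ("T", t, r, i)
                name[task] = s       # Name(task) = s_i
                sup[task] = t        # task belongs to the decomposition named t
                trig.add((prev, task))
                prev = task          # assumed: consecutive tasks are chained
    return decisions, terminating, name, sup, trig
```

Applied to a grammar with rules S → ABcD, S → ε, A → aA, A → af, B → bBA, B → c and D → d, the sketch yields four decisions, of which only the one for S is terminating.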
Corollary 3.2 Detecting whether a workflow specification is more generic than another workflow specification is undecidable.
Example 3.3 In figure 7, the workflow structure $W_G$ for a context-free grammar $G$ is given. This grammar is defined by the following production rules:
$$\begin{array}{ll}
S \rightarrow ABcD & S \rightarrow \varepsilon \\
A \rightarrow aA & A \rightarrow af \\
B \rightarrow bBA & B \rightarrow c \\
D \rightarrow d &
\end{array}$$
□

[Figure 7: Workflow structure for context-free grammar G]

Theorem 3.4 implies that we have a situation similar to query optimization. Equivalence of first order queries, for example, is not decidable, hence the focus is on the application of equivalence-preserving transformations. For workflow specifications, desirable transformations could also be defined, supporting the "optimization" of their execution.
4 Termination in Restricted Workflow Structures
In this section termination in Restricted Workflow Structures is studied and it is shown that safeness can be verified in polynomial time.
4.1 Restriction of Workflow Structures
When studying the proofs of the previous section it becomes clear that the expressive power of workflow structures is to a large extent due to the concept of synchroniser. Unrestricted use of this construct causes the non-polynomial complexity of termination verification. On the other hand, obviously, disallowing all forms of synchronisation leads to an undesired loss of expressive power. Hence, it is necessary to focus on a more controlled form of synchronisation. This controlled form should guarantee that determining local correctness of synchronisation is sufficient for global correctness.
In workflow structures without synchronisers it is still possible to express some forms of synchronisation. One may refer to this form of synchronisation as synchronisation through decomposition: workflow objects after a decomposed task may only start if all execution paths in that task have terminated. Consider for example the schema in figure 2. The task Receive Reply can only be initiated once both tasks in the decomposition of Formalize Reply (see figure 4) have terminated. This form of synchronisation allows the removal of the two synchronisers of the schema of figure 2 by simply introducing a supertask, as indicated by the dashed box. Both these tasks then become initial items of this supertask. Note, however, that the removal of synchronisers is not always possible by the introduction of a proper supertask (consider e.g. figure 5).
Another source of complexity is the fact that so far decomposition structures are allowed to be cyclic. This is heavily used in the proof of the undecidability of determining equivalence of
workflow specifications. Hence, in Restricted Workflow Structures cyclic decomposition is not allowed. Formally, Restricted Workflow Structures are workflow structures without synchronisers and without cyclic decomposition structures. The former requirement simply translates to $S = \emptyset$ in terms of the syntax presented in section 2.3. To formally capture the latter requirement a relation $\mathrm{Super} \subseteq V \times V$ has to be defined which captures decomposition relations between decomposition names:
$$x\,\mathrm{Super}\,y \equiv \exists_{t \in T}\,[\mathrm{Name}(t) = y \wedge \mathrm{Sup}(t) = x]$$
The requirement that decomposition structures should be acyclic can now be formally stated by requiring that the transitive closure $\mathrm{Super}^{+}$ of this relation is asymmetric:
$$x\,\mathrm{Super}^{+}\,y \Rightarrow \neg\, y\,\mathrm{Super}^{+}\,x$$
Restricted Workflow Structures were used in [28, 29] to describe and analyze the problem solving behaviour of experienced information engineers.
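The acyclicity requirement on decomposition can be checked with a standard depth-first search. The sketch below assumes the reading that a name x is related to a name y whenever the decomposition named x contains a task named y; the function names `super_relation` and `is_acyclic`, and the dictionary encoding of Name and Sup, are assumptions for illustration.

```python
def super_relation(name, sup):
    """Pairs (x, y) such that the decomposition named x contains a task named y.

    name maps tasks to their names; sup maps workflow objects to the name of
    the decomposition they belong to (the main task has no entry).
    """
    return {(sup[t], name[t]) for t in name if t in sup}


def is_acyclic(pairs):
    """True iff the relation given by pairs contains no cycle (self-loops included)."""
    succ = {}
    for x, y in pairs:
        succ.setdefault(x, set()).add(y)
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(x):
        colour[x] = GREY                      # x is on the current search path
        for y in succ.get(x, ()):
            c = colour.get(y, WHITE)
            if c == GREY:                     # back edge: a cycle exists
                return False
            if c == WHITE and not visit(y):
                return False
        colour[x] = BLACK                     # all descendants explored
        return True

    return all(colour.get(x, WHITE) != WHITE or visit(x) for x in list(succ))
```

The search visits every name and every pair at most once, so the check is linear in the size of the relation.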
4.2 Termination in Restricted Workflow Structures
Having defined a restriction of workflow structures, it is important to find a computationally tractable rule that guarantees safeness, i.e. each execution scenario will lead to successful termination. To provide some intuition, in figure 8 some workflow structures with termination problems are shown. The rightmost workflow structure represents a trivial example of deadlock: a "decision" with no outgoing triggers. The other two workflow structures are examples of livelocks: tasks are to be performed continuously, there is no execution path leading to termination.

[Figure 8: Livelock and deadlock in workflow structures. Workflow objects shown: A, B, C, D, E, F.]

The solution starts with the observation that, as decomposition is acyclic, it is sufficient to look at each decomposition individually. In the rest, we therefore focus on a single supertask $s$ and its decomposition. Sets such as $X$, $T$, $D$ etc. are from now on restricted to this decomposition.
A workflow object may be considered terminating if and only if after its execution it is possible to terminate (in zero or more steps). For non-terminating decisions this means that at least one of the possible subsequent workflow objects is terminating. For tasks this means that all subsequent workflow objects have to be terminating as well (as they will all be started in parallel). Formally, the notion of a terminating workflow object can be captured by the unary predicate $\mathrm{Term}$, which is defined via the following set of derivation rules:
[T1] $d \in D_t \vdash \mathrm{Term}(d)$
[T2] $d \in D \wedge d\,\mathrm{Trig}\,e \wedge \mathrm{Term}(e) \vdash \mathrm{Term}(d)$
[T3] $t \in T \wedge \forall_{x \in X}\,[t\,\mathrm{Trig}\,x \Rightarrow \mathrm{Term}(x)] \vdash \mathrm{Term}(t)$
As will be shown in section 4.3.3, supertask $s$ will always terminate successfully if and only if every workflow object in its decomposition satisfies the predicate $\mathrm{Term}$. Before these proofs can be given, however, it is necessary to define a formal semantics for Restricted Workflow Structures.
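Since the rules for Term are monotone, the set of terminating workflow objects of one decomposition can be computed as a least fixpoint. The following sketch shows one way to do so; it reads rule [T3] as requiring that all direct successors of a task are terminating, in line with the informal explanation above. The function name `terminating_objects` and the set-of-pairs encoding of Trig are assumptions for illustration.

```python
def terminating_objects(tasks, decisions, term_decisions, trig):
    """Least set of workflow objects satisfying the rules [T1]-[T3].

    All arguments refer to a single decomposition without synchronisers;
    trig is a set of (source, target) pairs.
    """
    objects = tasks | decisions
    succ = {x: set() for x in objects}
    for x, y in trig:
        succ[x].add(y)

    term = set(term_decisions)                       # [T1]
    changed = True
    while changed:
        changed = False
        for x in objects - term:
            if x in decisions:
                ok = any(y in term for y in succ[x])  # [T2]: some successor terminates
            else:
                ok = all(y in term for y in succ[x])  # [T3]: all successors terminate
            if ok:
                term.add(x)
                changed = True
    return term


def is_safe_decomposition(tasks, decisions, term_decisions, trig):
    """Check whether every workflow object of the decomposition satisfies Term."""
    return terminating_objects(tasks, decisions, term_decisions, trig) == tasks | decisions
```

Each pass of the outer loop either adds an object or stops, so the number of passes is bounded by the number of workflow objects and the computation is polynomial. A task without output triggers is terminating by [T3], while a non-terminating decision without output triggers is not, which matches the deadlock example of figure 8.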
4.3 Semantics of Restricted Workflow Structures

Restricted Workflow Structures are workflow structures without synchronisers or cyclic decomposition structures. As the Process Algebra translation presented in [14] would be overkill for the safeness proofs to be presented, we will assign a simple trace semantics to Restricted Workflow Structures. The basic observation is that each workflow structure state is completely determined by the workflow objects and their respective number of active instances. Formally, such a state can be seen as a multiset over $\mathcal{X}$.
4.3.1 Multisets

In this section the notation used for multisets, and some definitions as far as they are needed in the rest of this paper, are briefly introduced. Multisets [20], also known as multiple membership sets [19] or bags [22], differ from ordinary sets in that a multiset may contain an element more than once. A multiset can be denoted by an enumeration of its elements, e.g. $\{[a, a, b, c, d, c, a]\}$. If $X$ is a multiset, then $\#(a, X)$ denotes the number of occurrences of $a$ in $X$. The membership operator for multisets takes the occurrence frequency of elements in a multiset into account:

$a \in^{n} X \iff \#(a, X) = n$

In the remainder of this paper, $a \in X$ is used as a shorthand for $\#(a, X) > 0$. Bag comprehension is the bag-theoretic equivalent of set comprehension. Let $C(a, n)$ be a predicate such that for each $a$ exactly one $n$ exists such that $C(a, n)$. A multiset can then be denoted by means of the bag comprehension schema [4]:

$\{[a{\uparrow}n \mid C(a, n)]\}$

This is an intensional denotation of the multiset $X$ that is determined by:

$C(a, n) \iff a \in^{n} X$

The set of all finite multisets over a domain $X$ is denoted as $\mathcal{M}(X)$.
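For readers who prefer an executable view, the multiset notation above maps fairly directly onto Python's `collections.Counter`. The helper names `occurrences`, `member_n` and `bag_comprehension` below are hypothetical and only serve to illustrate the definitions.

```python
from collections import Counter
from typing import Callable, Hashable, Iterable

# The enumerated multiset {[a, a, b, c, d, c, a]} from the text.
X = Counter(["a", "a", "b", "c", "d", "c", "a"])


def occurrences(a: Hashable, bag: Counter) -> int:
    """#(a, X): the number of occurrences of a in the multiset."""
    return bag[a]


def member_n(a: Hashable, n: int, bag: Counter) -> bool:
    """Frequency-aware membership: a occurs exactly n times."""
    return bag[a] == n


def bag_comprehension(domain: Iterable[Hashable],
                      count: Callable[[Hashable], int]) -> Counter:
    """Build the multiset in which each a of the domain occurs count(a) times.

    count plays the role of the predicate C(a, n): it returns the unique n
    for each a. The unary plus drops elements whose count is zero.
    """
    return +Counter({a: count(a) for a in domain})


print(occurrences("a", X), member_n("c", 2, X), occurrences("e", X) > 0)
print(bag_comprehension("abc", lambda a: {"a": 3, "b": 1}.get(a, 0)))
```

Plain membership, written as a shorthand for a positive number of occurrences in the text, corresponds to `occurrences(a, X) > 0`.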
4.3.2 Trace Semantics for Workflow Structures

The semantics of a workflow structure is the set of possible traces, i.e. possible action sequences. These traces can be captured by the introduction of a one-step transition relation between states. This transition relation $\longrightarrow$ is a subset of $\mathcal{M}(\mathcal{X}) \times \mathcal{M}(\mathcal{X}) \times (A \cup \{\varepsilon\})$, where $A$ is the set of all task names of tasks occurring in the decomposition of supertask $s$ (such names correspond to atomic actions in this context, even though they may have an associated decomposition), and $\varepsilon$ represents the empty string. If $X \xrightarrow{a} Y$, then state $X$ may change to state $Y$ by performing a task with name $a$, or by performing a decision, in which case no action is performed and $a = \varepsilon$. Formally, this one-step transition relation may be defined by the following set of derivation rules:

[D1] $C \in \mathcal{M}(\mathcal{X}) \land d \in \mathcal{D}_t \cap C \vdash C \xrightarrow{\varepsilon} C \setminus \{[d]\}$
The above rule states that executing a terminating decision may lead to the removal of that decision from the state. The following rule states that executing a decision (either terminating or not) may lead to the replacement of that decision in the state by one of its successors.

[D2] $C \in \mathcal{M}(\mathcal{X}) \land d \in \mathcal{D} \cap C \land d\,\mathrm{Trig}\,e \vdash C \xrightarrow{\varepsilon} \{[e]\} \cup C \setminus \{[d]\}$
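Execution of a task leads to its replacement by its successor workflow objects:

[D3] $C \in \mathcal{M}(\mathcal{X}) \land t \in \mathcal{T} \cap C \vdash C \xrightarrow{\mathrm{Name}(t)} \{[x{\uparrow}1 \mid t\,\mathrm{Trig}\,x]\} \cup C \setminus \{[t]\}$

The reflexive transitive closure $\longrightarrow^{*}$ can now easily be defined by:

[M1] $\vdash C \xrightarrow{\varepsilon}{}^{*} C$
[M2] $C \xrightarrow{a} C' \vdash C \xrightarrow{a}{}^{*} C'$
[M3] $C \xrightarrow{\sigma}{}^{*} C' \land C' \xrightarrow{\tau}{}^{*} C'' \vdash C \xrightarrow{\sigma \cdot \tau}{}^{*} C''$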
Note that $\varepsilon$ plays the role of neutral element with respect to string concatenation ($\cdot$). Now we can define reachability of states. Formally, a state $m \in \mathcal{M}(\mathcal{X})$ is reachable, notation $\mathrm{reach}(m)$, if and only if it can be reached from the set of initial items by performing some sequence of actions:

$\exists_{\sigma \in A^{*}} \left[ \{[x{\uparrow}1 \mid \mathrm{Init}(x) = \mathrm{Name}(s)]\} \xrightarrow{\sigma}{}^{*} m \right]$

The semantics of supertask $s$ is the set of all possible traces leading to termination, i.e. the set of $\sigma \in A^{*}$ such that:

$\{[x{\uparrow}1 \mid \mathrm{Init}(x) = \mathrm{Name}(s)]\} \xrightarrow{\sigma}{}^{*} \emptyset$

As traces as such do not play an important role in the context of this paper, the notation $C \longrightarrow^{*} C'$ will be used as an abbreviation for $\exists_{\sigma \in A^{*}} \left[ C \xrightarrow{\sigma}{}^{*} C' \right]$.
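The transition rules D1 to D3 can be read as a successor function on states, and the trace semantics as a search for the empty state. The sketch below is an illustration under several assumptions: states are encoded as `Counter` objects, the empty string stands for the silent action of a decision, and the search is cut off after `max_steps` transitions because the state space may be infinite. All function and parameter names are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, Iterator, Set, Tuple


def steps(state: Counter, tasks: Set[str], decisions: Set[str],
          term_decisions: Set[str], trig: Dict[str, Set[str]],
          name: Callable[[str], str]) -> Iterator[Tuple[str, Counter]]:
    """Yield (action, successor state) pairs according to rules D1-D3."""
    for w in list(state):
        rest = state - Counter([w])  # remove one instance of w
        if w in term_decisions:      # [D1]: a terminating decision vanishes
            yield "", rest
        if w in decisions:           # [D2]: replace by one successor
            for e in trig.get(w, set()):
                yield "", rest + Counter([e])
        if w in tasks:               # [D3]: replace by all successors, once each
            yield name(w), rest + Counter(set(trig.get(w, set())))


def traces(initial, max_steps: int, **wf) -> Set[Tuple[str, ...]]:
    """Traces of at most max_steps transitions that end in the empty state."""
    found = set()
    frontier = [(Counter(initial), ())]
    for _ in range(max_steps + 1):
        next_frontier = []
        for state, trace in frontier:
            if not state:            # empty state: successful termination
                found.add(trace)
                continue
            for action, succ in steps(state, **wf):
                next_frontier.append((succ, trace + ((action,) if action else ())))
        frontier = next_frontier
    return found


# Hypothetical decomposition: task A starts task B and terminating decision d.
wf = dict(tasks={"A", "B"}, decisions={"d"}, term_decisions={"d"},
          trig={"A": {"B", "d"}}, name=lambda t: t)
print(traces(["A"], 3, **wf))
```

In this example both interleavings of B and d produce the same trace, because executing the decision contributes only the empty string.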
4.3.3 Proving Correctness

In this section it is proved that the condition $\forall_{x \in \mathcal{X}}[\mathrm{Term}(x)]$ is a necessary and sufficient condition for guaranteeing safeness. Although this might seem rather trivial at first, the proofs are not. The reason is that, although workflow objects might be terminating, their execution might increase the number of workflow objects in the state. The sufficiency proof deals with this problem by defining a partial order on states and showing that for each state there is a monotonically decreasing series of states with the empty state as final state. This proof technique might be of interest for other conceptual workflow specification considerations.
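Theorem 4.1 (Sufficiency) $\forall_{x \in \mathcal{X}}[\mathrm{Term}(x)] \Rightarrow \forall_{Y \in \mathcal{M}(\mathcal{X})}[Y \longrightarrow^{*} \emptyset]$

Proof:
Assume $\forall_{x \in \mathcal{X}}[\mathrm{Term}(x)]$. A partial order on $\mathcal{X}$ can be defined as follows: $x \prec y$ if and only if the number of derivation steps needed to prove $\mathrm{Term}(x)$ is less than the number of derivation steps needed to prove $\mathrm{Term}(y)$. From this partial order on workflow objects a partial order on states can be derived:

$\alpha \prec \beta$ iff $\#(x, \alpha) > \#(x, \beta) \Rightarrow \exists_{y \succ x}[\#(y, \alpha) < \#(y, \beta)]$

where $x$ ranges over $\mathcal{X}$. Informally one may think of this definition as: $\alpha \prec \beta$ if and only if $\alpha$ is closer to termination than $\beta$. Now suppose $Y \in \mathcal{M}(\mathcal{X})$. If $Y = \emptyset$, then we are done, as $\emptyset \longrightarrow^{*} \emptyset$. Hence assume $Y \neq \emptyset$. We prove that we can find a $Z_0 \in \mathcal{M}(\mathcal{X})$ such that $Z_0 \prec Y$ and $Y \longrightarrow^{*} Z_0$. As there can only be finitely many $Z_i$ such that $Y \succ Z_0 \succ Z_1 \succ \ldots \succ Z_m \succ \emptyset$, we have proven the result. Let $w \in Y$ be a maximal element in $Y$. The following three cases can be distinguished:

If $w \in \mathcal{D}_t$, then choose $Z_0 = Y \setminus \{[w]\}$. In that case $Y \longrightarrow^{*} Z_0$ and $Z_0 \prec Y$.

If $w \in \mathcal{D} \setminus \mathcal{D}_t$, then choose $v \in \mathcal{X}$ such that $v \prec w$ and $w\,\mathrm{Trig}\,v$ (such a $v$ exists!). Define $Z_0 = \{[v]\} \cup Y \setminus \{[w]\}$. In that case we also have $Y \longrightarrow^{*} Z_0$ and $Z_0 \prec Y$.

Finally, if $w \in \mathcal{T}$, then $\forall_{x \in \mathcal{X},\, w\,\mathrm{Trig}\,x}[x \prec w]$. Hence, by defining $Z_0 = \{[x{\uparrow}1 \mid w\,\mathrm{Trig}\,x]\} \cup Y \setminus \{[w]\}$, again $Y \longrightarrow^{*} Z_0$ and $Z_0 \prec Y$.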
□

Note that the above theorem is a bit stronger than we actually need. It states that from every possible state, not only the reachable ones, the empty state can be reached. The following theorem states that if not every workflow object in a decomposition is terminating, the workflow specification is not safe.
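Theorem 4.2 (Necessity) $\exists_{x \in \mathcal{X}}[\neg\mathrm{Term}(x)] \Rightarrow \exists_{Y \in \mathcal{M}(\mathcal{X}),\, \mathrm{reach}(Y)}[\neg(Y \longrightarrow^{*} \emptyset)]$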
Proof:
By proving that if $Z \longrightarrow^{*} Y$:

$\exists_{z \in Z}[\neg\mathrm{Term}(z)] \Rightarrow \exists_{y \in Y}[\neg\mathrm{Term}(y)]$,

which, informally speaking, captures the fact that it is not possible to get rid of non-terminating workflow objects in a state. Assume $Z \longrightarrow^{*} Y$ and $z \in Z$ such that $\neg\mathrm{Term}(z)$. The following cases can now be distinguished:

1. $z \in \mathcal{D} \setminus \mathcal{D}_t$, in which case $\forall_{x \in \mathcal{X},\, z\,\mathrm{Trig}\,x}[\neg\mathrm{Term}(x)]$.
2. $z \in \mathcal{T}$, in which case $\exists_{x \in \mathcal{X},\, z\,\mathrm{Trig}\,x}[\neg\mathrm{Term}(x)]$. Executing $z$ will then lead to the addition of such a workflow object $x$ to $Y$.

Note that it is essential for this proof that the non-terminating workflow object is reachable from one of the initial items.

□
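The constructive argument in the proof of Theorem 4.1 can be illustrated with a short sketch that builds a run to the empty state by always executing a maximal workflow object. This is an illustration only: as a stand-in for the number of derivation steps, it uses the fixpoint round in which $\mathrm{Term}$ is first derived for an object, which is an assumption that preserves the two properties the proof relies on (every successor of a task has a lower rank, and every non-terminating decision has some successor with a lower rank). All names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def ranks(tasks: Set[str], decisions: Set[str], term_decisions: Set[str],
          trig: Dict[str, Set[str]]) -> Dict[str, int]:
    """Map each terminating object to the round in which Term is derived."""
    rank = {d: 0 for d in term_decisions}  # [T1]
    r = 0
    while True:
        r += 1
        known = set(rank)
        new = {d for d in decisions - known
               if any(e in known for e in trig.get(d, set()))}   # [T2]
        new |= {t for t in tasks - known
                if all(x in known for x in trig.get(t, set()))}  # [T3]
        if not new:
            return rank
        for x in new:
            rank[x] = r


def terminating_run(initial, tasks, decisions, term_decisions, trig) -> List[Counter]:
    """Return a sequence of states from the initial state to the empty state."""
    rank = ranks(tasks, decisions, term_decisions, trig)
    if set(rank) != tasks | decisions:
        raise ValueError("some workflow object does not satisfy Term")
    state = Counter(initial)
    run = [state]
    while state:
        w = max(state, key=rank.__getitem__)   # a maximal element of the state
        rest = state - Counter([w])
        if w in term_decisions:                # case 1: drop the decision
            state = rest
        elif w in decisions:                   # case 2: pick a lower-ranked successor
            v = min(trig[w], key=rank.__getitem__)
            state = rest + Counter([v])
        else:                                  # case 3: start all successors
            state = rest + Counter(set(trig.get(w, set())))
        run.append(state)
    return run


# Hypothetical decomposition in which decision d may loop back to task A.
tasks, decisions, term_decisions = {"A", "B"}, {"d", "e"}, {"e"}
trig = {"A": {"B", "d"}, "d": {"A", "e"}}
run = terminating_run(["A"], tasks, decisions, term_decisions, trig)
print(len(run), run[-1])
```

Each step replaces a maximal object by objects of strictly lower rank, so the loop stops after finitely many iterations, mirroring the descending chain of states in the proof.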
5 Conclusions and Further Research

In this paper verification problems in workflow specifications were addressed. The focus was on control aspects only. Even then, however, it turns out that many interesting questions are not tractable. These results may serve as "facts of life" for workflow specialists and may prevent fruitless searches for efficiency. A restriction of the synchronisation concept and decomposition structures has been proposed that allows for termination verification in polynomial time. Although, as mentioned, the worst-case complexity of many verification problems is NP-complete or worse, this does not mean that nothing can be done. In particular, the focus of further research will be on heuristics and incremental algorithms to reduce the average-case complexity as much as possible. In this context, it should also be noted that verification time is not as critical as execution time. Finally, a tool is under development that visualises termination problems and as such may help their early detection as well as provide insight into their average time complexity in practice.
References

[1] M. Ansari, L. Ness, M. Rusinkiewicz, and A. Sheth. Using Flexible Transactions to Support Multi-System Telecommunication Applications. In Li-Yan Yuan, editor, Proceedings of the 18th VLDB Conference, pages 65-76, Vancouver, Canada, August 1992.
[2] P. Attie, P. Singh, A. Sheth, and M. Rusinkiewicz. Specifying and Enforcing Intertask Dependencies. In R. Agrawal, S. Baker, and D. Bell, editors, Proceedings of the 19th VLDB Conference, pages 134-145, Dublin, Ireland, August 1993.
[3] J.C.M. Baeten and W.P. Weijland. Process Algebra. Cambridge University Press, Cambridge, United Kingdom, 1990.
[4] E.A. Boiten. Views of Formal Program Development. PhD thesis, University of Nijmegen, Nijmegen, The Netherlands, 1992.
[5] P.W.G. Bots. An Environment to Support Problem Solving. PhD thesis, Delft University of Technology, Delft, The Netherlands, 1989.
[6] Y. Breibart, D. Georgakopoulos, and H. Schek. Merging Application-centric and Data-centric Approaches to Support Transaction-oriented Multi-system Workflows. SIGMOD Record, 22(3):23-30, September 1993.
[7] F. Casati, S. Ceri, B. Pernici, and G. Pozzi. Conceptual Modeling of Workflows. In M.P. Papazoglou, editor, Proceedings of the OOER'95, 14th International Object-Oriented and Entity-Relationship Modelling Conference, volume 1021 of Lecture Notes in Computer Science, pages 341-354. Springer-Verlag, December 1995.
[8] U. Dayal, M. Hsu, and R. Ladin. Organizing Long-running Activities with Triggers and Transactions. In H. Garcia-Molina and H.V. Jagadish, editors, Proceedings of the 1990 ACM SIGMOD Conference on the Management of Data, pages 204-214, Atlantic City, New Jersey, May 1990.
[9] A. Elmagarmid, Y. Leu, W. Litwin, and M. Rusinkiewicz. A Multidatabase Transaction Model for Interbase. In D. McLeod, R. Sacks-Davis, and H. Schek, editors, Proceedings of the 16th International Conference on VLDB, pages 507-518, Brisbane, Australia, August 1990.
[10] A. Forst, E. Kuhn, and O. Bukhres. General Purpose Workflow Languages. Distributed and Parallel Databases, 3(2):187-218, April 1995.
[11] M.R. Garey and D.S. Johnson. Computers and Intractability: A Guide to NP-Completeness. W.H. Freeman and Company, San Francisco, California, 1979.
[12] D. Georgakopoulos, M. Hornick, and A. Sheth. An Overview of Workflow Management: From Process Modelling to Workflow Automation Infrastructure. Distributed and Parallel Databases, 3(2):119-153, April 1995.
[13] Y. Halabi, M. Ansari, R. Batra, W. Jin, G. Karabatis, P. Krychniak, M. Rusinkiewicz, and L. Suardi. Narada: An Environment for Specification and Execution of Multi-system Applications. In Proceedings of the Second International Conference on Systems Integration, pages 680-690, Los Alamitos, California, 1992. IEEE Computer Society Press.
[14] A.H.M. ter Hofstede and E.R. Nieuwland. Task structure semantics through process algebra. Software Engineering Journal, 8(1):14-20, January 1993.
[15] A.H.M. ter Hofstede and T.F. Verhoef. Meta-CASE: Is the Game worth the Candle? Information Systems Journal, 6(1):41-68, 1996.
[16] N.D. Jones, L.H. Landweber, and Y.E. Lien. Complexity of Some Problems in Petri Nets. Theoretical Computer Science, 4:277-299, 1977.
[17] M. Kamath and K. Ramamritham. Modeling Correctness & Systems Issues in Supporting Advanced Database Applications Using Workflow Management Systems. Computer science technical report 95-50, University of Massachusetts, Massachusetts, June 1995.
[18] N. Krishnakumar and A. Sheth. Managing Heterogenous Multi-system Tasks to Support Enterprise-wide Operations. Distributed and Parallel Databases, 3(2):155-186, April 1995.
[19] A. Levy. Basic Set Theory. Springer-Verlag, Berlin, Germany, 1979.
[20] A. Lew. Computer Science: A Mathematical Introduction. Prentice-Hall, Englewood Cliffs, New Jersey, 1985.
[21] J.A. Miller, A. Sheth, K.J. Kochut, and X. Wang. CORBA-Based Run-Time Architectures for Workflow Management Systems. Journal of Database Management, 7(1):16-27, 1996.
[22] H. Partsch. Specification and Transformation of Programs - a Formal Approach to Software Development. Springer-Verlag, Berlin, Germany, 1990.
[23] J.L. Peterson. Petri Net Theory and the Modelling of Systems. Prentice-Hall, Englewood Cliffs, New Jersey, 1981.
[24] W. Reisig. Petri Nets: An Introduction. EATCS Monographs on Theoretical Computer Science. Springer-Verlag, Berlin, Germany, 1985.
[25] M. Rusinkiewicz, P. Krychniak, and A. Cichocki. Towards a Model for Multidatabase Transactions. International Journal of Intelligent and Cooperative Information Systems, 1(3 & 4):579-617, December 1992.
[26] M. Rusinkiewicz and A. Sheth. Specification and execution of transactional workflows. In W. Kim, editor, Modern Database Systems: The Object Model, Interoperability, and Beyond. ACM Press, Cambridge, Massachusetts, 1994.
[27] A. Salomaa. Formal Languages. ACM Monograph Series. Academic Press, New York, New York, 1973.
[28] G.M. Wijers and H. Heijes. Automated Support of the Modelling Process: A view based on experiments with expert information engineers. In B. Steinholz, A. Sølvberg, and L. Bergman, editors, Proceedings of the Second Nordic Conference CAiSE'90 on Advanced Information Systems Engineering, volume 436 of Lecture Notes in Computer Science, pages 88-108, Stockholm, Sweden, 1990. Springer-Verlag.
[29] G.M. Wijers, A.H.M. ter Hofstede, and N.E. van Oosterom. Representation of Information Modelling Knowledge. In V.-P. Tahvanainen and K. Lyytinen, editors, Next Generation CASE Tools, volume 3 of Studies in Computer and Communication Systems, pages 167-223. IOS Press, 1992.
Contents

1 Introduction
2 Essential Workflow Concepts
  2.1 Informal Explanation of Task Structures
  2.2 Example: Health Insurance
  2.3 Syntax of Workflow Structures
3 Some Verification Problems and Their Complexity
4 Termination in Restricted Workflow Structures
  4.1 Restriction of Workflow Structures
  4.2 Termination in Restricted Workflow Structures
  4.3 Semantics of Restricted Workflow Structures
    4.3.1 Multisets
    4.3.2 Trace Semantics for Workflow Structures
    4.3.3 Proving Correctness
5 Conclusions and Further Research