Linear Programming Deadlock Checking Using Partial Order Dependencies Victor Khomenko and Maciej Koutny Department of Computing Science, University of Newcastle Newcastle upon Tyne NE1 7RU, U.K.

fVictor.Khomenko, [email protected]

Abstract. Model checking based on the causal partial order semantics of Petri nets is an approach widely applied to cope with the state space explosion problem. One of the ways to exploit such a semantics is to consider ( nite pre xes of) net unfoldings | themselves a class of acyclic Petri nets | which contain enough information, albeit implicit, to reason about the reachable markings of the original Petri nets. In [15], a veri cation technique for net unfoldings was proposed in which deadlock detection was reduced to a mixed integer linear programming problem. In this paper, we present a further development of this approach. We adopt Contejean-Devie's algorithm for solving systems of linear constraints over the natural numbers domain and re ne it, by taking advantage of the speci c properties of systems of linear constraints to be solved. The essence of the proposed modi cations is to transfer the information about causality and con icts between the events involved in an unfolding, into a relationship between the corresponding integer variables in the system of linear constraints. Experimental results demonstrate that the new technique achieves signi cant speedups. Keywords: Model checking, integer linear programming, Petri nets, unfolding, causality and concurrency.

1 Introduction A distinctive characteristic of reactive concurrent systems is that their sets of local states have descriptions which are both short and manageable, and the complexity of their behaviour comes from highly complicated interactions with the external environment rather than from complicated data structures and manipulations thereon. One way of coping with this complexity problem is to use formal methods and, especially, computer aided veri cation tools implementing model checking [4,8] | a technique in which the veri cation of a system is carried out using a nite representation of its state space. The main drawback of model checking is that it suers from the state space explosion problem. That is, even a relatively small system speci cation can (and often does) yield a very large state space. To help in coping with this, a number of techniques have been proposed which can roughly be classi ed as aiming at an implicit compact representation of the full state space of a reactive concurrent system, or at an explicit representation of a reduced (though sucient for a given veri cation task) state space of the system. Techniques aimed at reduced representation of state spaces are typically based on the independence (commutativity) of some actions, often relying on the partial order view of concurrent computation. Such a view is the basis for algorithms employing McMillan's unfoldings ([11,12,14]), where the entire state space of a Petri Net is represented implicitly using an acyclic net to represent a system's actions and local states. The net unfolding technique presented in [14,15] reduces memory requirement, but the deadlock checking algorithms proposed are quite slow, even for medium-size unfoldings.

2

V.Khomenko and M.Koutny

In [15], the problem of deadlock checking a Petri net was reduced to a mixed integer linear programming problem. In this paper, we present a further development of this approach. We adopt the Contejean-Devie's algorithm ([1,2,5{7]) for eciently solving systems of linear constraints over the domain of natural numbers. We re ne this algorithm by employing unfolding-speci c properties of the systems of linear constraints to be solved, in model checking aimed at deadlock detection. The essence of the proposed modi cations is to transfer the information about causality and con icts between events involved in an unfolding into a relationship between the corresponding integer variables in the system of linear constraints. The results of initial experiments demonstrate that the new technique achieves signi cant speedups. The paper is organised as follows. In section 2 we provide basic de nitions concerning Petri nets and, in particular, net unfoldings. Section 3 brie y recalls the results presented in [15] where the deadlock checking problem has been reduced to the feasibility test of a system of linear constraints. Section 4 is based on the results developed in [1,2,5{7] and recalls the main aspects of the Contejean-Devie's Algorithm (CDA) for solving system of linear constraints over the natural numbers domain. The algorithm we propose in this paper is a variation of CDA, developed speci cally to exploit partial order dependencies between events in the unfolding of a Petri net. Our algorithm is described in section 5; we provide theoretical background, useful heuristics, and outline ways of reducing the number of variables and constraints in the original system presented in [15]. Section 6 contains results of experiments obtained for a number of benchmark examples. Section 7 brie y discusses the parallelisation aspects of the new algorithm, and in section 8, we discuss how the approach can be generalised to deal with other relevant veri cation problems, such as mutual exclusion, coverability and reachability analysis. Section 9 describes possible directions for future research.

2 Basic de nitions In this section, we rst present basic de nitions concerning Petri nets, and then recall (see [11]) notions related to net unfoldings.

Petri nets A net is a triple N = (S; T; F) such that S and T are disjoint sets of respectively places and transitions, and F (S T) [ (T S) is a ow relation. A marking of N is a multiset M of places, i.e. M : S ! N = f0; 1; 2; : ::g. We adopt the

standard rules about representing nets as directed graphs, viz. places are represented as circles, transitions as rectangles, the ow relation by arcs, and markings are shown by placing tokens within circles. As usual, we will denote z = fy j (y; z) 2 F g and z = fy j (z; y) 2 F g, for all z 2 S [ T. We will assume that t 6= ; = 6 t , for every t 2 T. A net system is a pair = (N; M0 ) comprising a nite net N = (S; T; F) and an (initial) marking M0 . A transition t 2 T is enabled at a marking M, denoted M[ti, if for every s 2 t, M(s) 1. Such a transition can be executed, leading to a marking M 0 de ned by M 0 = M ? t+t . We denote this by M[tiM 0 or M[iM 0. The set of reachable markings of is the smallest set [M0i containing M0 and such that if M 2 [M0i and M[iM 0 then M 0 2 [M0 i. For a nite sequence of transitions, = t1 : : :tk , we denote M0 [iM if there are markings M1 ; : : :; Mk such that Mk = M and Mi?1 [tiiMi , for i = 1; : : :; k. A marking is deadlocked if it does not enable any transitions. The net system is deadlock-free if no reachable marking is deadlocked; safe if for every reachable marking M, M(S) f0; 1g; and bounded if there is k 2 N such that M(S) f0; : : :; kg, for every reachable marking M.

LP Deadlock Checking Using Partial Order Dependencies

3

Marking equation Let = (N; M0 ) be a net system, and S = fs1; : : :; sm g and T = ft1 ; : : :; tn g be sets of its places and transitions, respectively. We will often identify a marking M of with a vector M = (1 ; : : :; m ) such that M(si ) = i, for all i m. The incidence matrix of is an m n matrix N = (Nij ) such that, for all i m and j n, 8 1 if s 2 t n t < i j j Nij = : ?1 if si 2 tj n tj

0 otherwise : The Parikh vector of a nite sequence of transitions is a vector x = (x1 ; : : :; xn) such that xi is the number of the occurrences of ti within , for every i n. One can show that if is an execution sequence such that M0[iM then M = M0 + N x . This provides a motivation for investigating the feasibility (or solvability) of the following system of equations: M = M + N x 0 M 2 Nm and x 2 Nn If we x the marking M, then the feasibility of the above system is a necessary condition for M to be reachable from M0 .

Branching processes Two nodes of a net N = (S; T; F), y and y0 , are in con ict, denoted by y#y0 , if there are distinct transitions t; t0 2 T such that t \ t0 = 6 ; and (t; y) and (t0 ; y0 ) are in the re exive transitive closure of the ow relation F, denoted by . A node y is in self-con ict if y#y. An occurrence net is a net ON = (B; E; G) where B is the set of conditions (places) and E is the set of events (transitions). It is assumed that: ON is acyclic (i.e. is a partial order); for every b 2 B, jbj 1; for every y 2 B [ E, :(y#y) and there are nitely many y0 such that y0 y, where denotes the irre exive transitive closure of G. Min (ON ) will denote the minimal elements of B [ E with respect to . The relation is the causality relation. Two nodes are co-related, denoted by y co y0 , if neither y#y0 nor y y0 nor y0 y. A homomorphism from and occurrence net ON to a net system is a mapping h : B [ E ! S [ T such that: h(B) S and h(E) T; for all e 2 E, the restriction of h to e is a bijection between e and h(e); the restriction of h to e is a bijection between e and h(e) ; the restriction of h to Min (ON ) is a bijection between Min (ON ) and M0 ; and for all e; f 2 E, if e = f and h(e) = h(f) then e = f. A branching process of [10] is a quadruple = (B; E; G; h) such that (B; E; G) is an occurrence net and h is a homomorphism from ON to . A branching process 0 = (B 0 ; E 0; G0; h0) of is a pre x of a branching process = (B; E; G; h), denoted by 0 v , if (B 0 ; E 0 ; G0) is a subnet of (B; E; G) such that: if e 2 E 0 and (b; e) 2 G or (e; b) 2 G then b 2 B 0 ; if b 2 B 0 and (e; b) 2 G then e 2 E 0; and h0 is the restriction of h to B 0 [ E 0 . For each there exists a unique (up to isomorphism) maximal (w.r.t. v) branching process, called the unfolding of .

Con gurations and cuts A con guration of an occurrence net ON is a set of events C such that for all e; f 2 C, :(e#f) and, for every e 2 C, f e implies f 2 C. A cut is a maximal w.r.t. set inclusion set of conditions B 0 such that b co b0, for all b; b0 2 B 0 . Every marking reachable from Min (ON ) is a cut. Let C be a nite con guration of a branching process . Then Cut (C) = (Min (ON ) [ C ) n C is a cut; moreover, the multiset of places h(Cut (C)) is a reachable marking of , denoted Mark (C). A marking M of is represented in if the latter contains

4

V.Khomenko and M.Koutny

a nite con guration C such that M = Mark (C). Every marking represented in is reachable, and every reachable marking is represented in the unfolding of . A branching process of is complete if for every reachable marking M of , there is a con guration C in such that Mark (C) = M, and for every transition t enabled by M, there is a con guration C [ feg such that e 62 C and h(e) = t. Although, in general, an unfolding is in nite, for every bounded net system one can construct a nite complete pre x Unf of the unfolding of . Moreover, there are so-called cut o events1 in Unf such that, for every reachable marking M of , there exists a con guration C in Unf such that M = Mark (C) and no event in C is a cut-o event.

3 Deadlock detection using linear programming In the rest of this paper, we will assume that Unf = (B; E; G; h) is a nite complete pre x of the unfolding of a bounded net system = (S; T; F; M0 ). We will denote by Min the canonical initial marking of Unf which places a single token in each of the minimal conditions and no token elsewhere. Furthermore, we will assume that b1 ; b2; : : :; bp and e1 ; e2; : : :; eq are respectively the conditions and events of Unf , and that C is the p q incidence matrix of Unf . The set of cut-o events of Unf will be denoted by Ecut . We now recall the main results from [15]. A nite and complete pre x Unf may be treated as an acyclic safe net system with the initial marking Min . Each reachable deadlocked marking in is represented by a deadlocked marking in Unf . However, some deadlocked markings of Unf lie beyond the cut-o events and may not correspond to deadlocks in . Such deadlocks can be excluded by prohibiting the cut-o events from occurring. Since for an acyclic Petri net the feasibility of the marking equation is a sucient condition for a marking to be reachable, the problem of deadlock checking can be reduced to the feasibility test of a system of linear constraints. Theorem 1. ([15]) is deadlock-free if and only if the following system has no solution (in M and x): 8M = M + C x > in > X > for all e 2 E < M(b) jej ? 1 b2 e (1) > > x(e) = 0 for all e 2 Ecut > > : M 2 Np and x 2 Nq where x(ei ) = xi, for every i q. In order to decrease the number of integer variables, M 0 can be treated as a rational vector since x 2 Nq and M = Min + C x 0 always imply that M 2 Np. Moreover, as an event can occur at most once in a given execution sequence of Unf from the initial marking Min , it is possible to require that x be a binary vector, x 2 f0; 1gq . To solve the mixed-integer LP-problem (1), [15] used the general-purpose LP-solver CPLEX [9], and demonstrated that there are signi cant performance gains if the number of cut-o events is relatively high since all variables in x corresponding to cut-o events are set to 0. 1 Intuitively, cut-o events are nodes at which the potentially in nite unfolding may be cut

without losing any essential information about the behaviour of ; see [10{12, 14, 15] for details.

LP Deadlock Checking Using Partial Order Dependencies

5

Remarks The characterisation of the deadlock checking problem given by (1) suggests

that, in general, one can consider the following speci c problems: { Pure feasibility test (to discover whether a net system is deadlock-free). { Checking feasibility and, in the case that there are deadlocks, to nd an execution sequence leading to a deadlock. { To facilitate debugging, nding a shortest path leading to a deadlock. This results in an optimisation problem, i.e. minimise L(x) = x(e1 ) + + x(eq ) under the constraints given by (1). We will show in section 5.1 that it is possible to reduce (1) to a pure integer LPproblem without increasing the total number of integer variables. Moreover, (1) has several problem-speci c internal dependencies between variables, and taking them into account may allow one to signi cantly reduce the number of calculations. Therefore it seems non-optimal to use general-purpose LP-solvers for this particular problem.

4 Solving systems of linear constraints In this paper, we will adapt the approach proposed in [1,2,5{7], in order to solve Petri net veri cation problems which can be reformulated as LP-problems. We start by recalling some basic results. The original Contejean and Devie's algorithm (CDA) [5{7] solves a system of linear homogeneous equations with arbitrary integer coecients

8a x + + a x = 0 > > < a1121x11 + + a12qq xqq = 0 . .. .. > . . > .. :

(2)

ap1x1 + + apq xq = 0

or A x = 0 where x 2 Nq and A = (aij ). For every 1 j q, let "j = (0; | :{z: :; 0}; 1; 0; : : :; 0) j ?1 times be the j-th vector in the canonical basis of Nq. Moreover, for every x 2 Nq, let a(x) 2 Np be a vector de ned by

0a x + + a x 1 11 1 1q q B a x + + a 21 1 2q xq C a(x) = B B C = x1 a("1) + + xq a("q ) ; . .. C @ .. . A

(3)

ap1 x1 + + apq xq

where a("j ) | the j-th column vector of the matrix A | is called the j -th basic default vector. The set S of all solutions of (2) can be represented by a nite basis B which is the minimal (w.r.t. set inclusion) subset of S such that every solution is a linear combination with non-negative integer coecients of the solutions in B. It can be shown that B comprises all solutions in S dierent from the trivial one, x = 0, which are minimal with respect to the ordering on Nq (x x0 if xi x0i, for all i q; moreover, x < x0 if x x0 and x 6= x0).

6

V.Khomenko and M.Koutny

The representation (3) suggests that any solution of (2) can be seen as a multiset of default vectors whose sum is 0. Choosing these vectors in an arbitrary order amounts to constructing a sequence of default vectors starting from, and returning to, the origin of Zp. CDA constructs such a sequence step by step: starting from the empty sequence, new default vectors are added until a solution is found, or no minimal solution can be obtained. However, dierent sequences of default vectors may correspond to the same solution (up to permutation of vectors). To eliminate some of the redundant sequences, a restriction for choosing the next default vector is used. A vector x 2 Nq (corresponding to a sequence of default vectors) such that a(x) 6= 0 can be incremented by 1 on its j-th component provided that a(x + "j ) = a(x) + a("j ) lies in the half-space containing 0 and delimited by the ane hyperplane perpendicular to the vector a(x) at its extremity when originating from 0 (see gure 1).

a(x)

0

a("j )

a(x + "j )

Fig. 1. Geometric interpretation of the branching condition in CDA This re ects a view that a(x) should not become too large, hence adding a("j ) to a(x) should yield a vector a(x + "j ) = a(x) + a("j ) `returning to the origin'. Formally, this restriction can be expressed by saying that given x = (x1; : : :; xq ), increment by 1 an xj satisfying a(x) a("j ) < 0 ; (4) where denotes the scalar product of two vectors. This reduces the search space without losing any minimal solution, since every sequence of default vectors which corresponds to a solution can be rearranged into a sequence satisfying (4).

Theorem 2. ([7]) The following hold for the CDA shown in gure 2: 1. Every minimal solution of the system (2) is computed. 2. Every solution computed by CDA is minimal. 3. The algorithm always terminates.

(completeness) (soundness) (termination)

Figure 3 illustrates the process of solving the homogeneous system of linear equations considered in [13]: ? x + x + 2x ? 3x = 0 1 2 3 4 ? x1 + 3x2 ? 2x3 ? x4 = 0

LP Deadlock Checking Using Partial Order Dependencies

{ { {

7

search breadth- rst a directed acyclic graph rooted at "1 ;:::;"q if a node y is equal to, or greater than, an already encountered solution of A x = 0 then y is a terminal node otherwise construct the sons of y by computing y + "j for each j q satisfying a(y) a("j ) < 0

Fig. 2. CDA (breath- rst version)

?1 1000 ?1

1 0100 3

2 ?2 0010

?3 0001 ?1

0 1100 2

3 0110 1

?2 0101 2

?1 0011 ?3

?1 2100 1

2 1110 0

?3 1101 1

0 0111 0

?1 2110

1

?1 1111 ?1

>x

0

2 2210 2

?2 2111 ?2

>x

0

1 3210 1

?1 2211 1

>x

0

0 4210 0

0 ?2 3211

>x

0

x = 00

= x

0

Fig. 3. Search graph constructed by CDA in gure 2; inside each box, the current value of a(x) is represented by a column on the left, and is followed by the current value of x; note that

x = (0; 1; 1; 1) and x = (4; 2; 1; 0) are two minimal solutions 0

00

8

V.Khomenko and M.Koutny

The example shows redundancies as some vectors were computed more than once. This can be remedied by using frozen components, de ned thus. Assume that there is a total ordering x on the sons of each node x of the search graph constructed by CDA. If x + "i and x + "j are two distinct sons of a node x such that x + "i x x + "j , then the i-th component is frozen in the sub-graph rooted at x + "j and cannot be incremented even if (4) is satis ed. The modi ed algorithm is still complete ([7]), and builds a forest which is a sub-graph of the original search graph. By de ning2 the ordering x as x + "i x x+ "j , i < j we obtain, for the system in the above example, the graph shown in gure 4 (see [13]).

?1 1000 ?1

1 0100 3

2 ?2 0010

?3 0001 ?1

0 1100 2

3 0110 1

?1 0011 ?3

?1 2100 1

?2 0101 2

2 1110 0

0 0111 0

= x

0

1

?1 2110 2 2210 2 1 3210 1 0 4210 0

= x

00

Fig. 4. Search graph constructed by the ordered version of CDA; frozen components are underlined, and the *s indicate nodes which cannot be developed according to condition (4) and frozen components rule The ordered version of CDA can easily handle bounds imposed on variables:

{ x0 x. Then, instead of starting with the vectors "1; : : :; "q , the algorithm starts

with x0 . The rest of the operation remains the same, but the minimal elements of

2 The ordering x may be de ned in other ways as well (see [7]).

LP Deadlock Checking Using Partial Order Dependencies

9

the set S 0 = fx j A x = 0 ^ x0 xg do not give all the solutions of A x = 0 x0 x However, any solution of the above system can be represented as a sum of a minimal element of S 0 and a non-negative integer linear combination of minimal solutions of the original system. { x x00 where x00 2 (N [ f1g)q . Then the algorithm works in the standard way except that the j-th component of a vector becomes frozen as soon as it reaches the j-th component of x00. { x0 x x00. Then a combination of two previous techniques is used. With the above extensions, CDA allows one to solve non-homogeneous diophantine systems 8a x + + a x = d > > < a1121x11 + + a12qq xqq = d12 (5) .. .. .. > > : ap1.x1 + + apq.xq = d.p By adding a new variable, x0 , we can transform (5) into a homogeneous system 8 ?d x + a x + + a x = 0 > > ?d12x00 + a1121x11 + + a12qq xqq = 0 < .. .. .. .. > > : ?d.px0 + ap1.x1 + + apq.xq = 0. Let Bk (k = 0; 1) be the set of all minimal solutions x = (x0; x1; : : :; xq ) of this system with x0 = k. Then any solution of (5) can be represented as x=y+

X

z2B0

cz z ;

where y 2 B1 and each cz belongs to N. Thus to solve (5) it suces to add just one variable which is frozen as soon as it reaches the value 1. The task of solving a system of linear inequalities 8a x + + a x d > > < a1121x11 + + a12qq xqq d12 (6) .. .. .. > > : ap1.x1 + + apq.xq d.p is more complicated.3 The standard linear programming approach is to reduce it to a system of equations 8a x + + a x + y = d1 > > < a2111x11 + + a21qq xqq + 1 y2 = d2 .. .. .. ... > . . . > : ap1x1 + + apq xq + yp = d p 3 Some solutions of (6) cannot be represented as non-negative linear combinations of minimal

solutions, and Ajili and Contejean use the notion of a non-decomposable solution to deal with this problem [2]. For our purposes, however, it is sucient to check only the minimal solutions.

10

V.Khomenko and M.Koutny

by adding slack variables yi 2 N, but this transformation increases the number of variables from q to q + p. And, as the computation time can grow exponentially in the number of variables, such an approach is not ecient. Moreover, the slack variables may assume arbitrary values in N, even if all the variables in the original problem are binary as in (1); as a result, the search space grows very rapidly. Another approach is to deal with inequalities directly. It has been developed in [1,2], where CDA was generalised to solve homogeneous systems of equations and inequalities. (A system of non-homogeneous inequalities can be reduced to a homogeneous one by adding just one binary variable, similarly as described above for (5); it is also possible to slightly change the algorithm to solve non-homogeneous systems directly.) For a system of linear constraints A x = 0 ^ A0 x 0, the branching condition (4) is modi ed by saying that given x = (x1 ; : : :; xq ), increment by 1 an xj for which there exist y1 ; : : :; yl such that the vector (x1; : : :; xq ; y1 ; : : :; yl ) can be incremented on its j-th component according to (4) applied to the system A x = 0 ^ A0 x + y = 0, where l is the number of rows in A0 and y = (y1 ; : : :; yl ). As shown in [1,2], this condition can be expressed as: (A x) (A "j ) +

Xl i=1

minf(A0i x)(A0i "j ); maxf0; A0i xg(A0i "j )g < 0

where A0i is the i-th row of A0. To ensure termination in the general case, [1,2] add one more condition, but if all the variables are bounded this is always guaranteed.

5 An algorithm for deadlock detection In this section, we introduce our deadlock checking algorithm, explain its theoretical background, and present some useful heuristics.

5.1 Reducing to a pure integer problem

The problem (1) can be reduced to a pure integer one, by substituting the expression for M given by the marking equation into other constraints and, at the same time, reducing the total number of constraints. Each equation in M = Min + C x has the form X X M(b) = Min (b) + x(f) ? x(f) for b 2 B: (7) f 2 b

f 2b

After substituting these into (1) we obtain the system

8 0 1 > X X X > @ x(f) ? x(f)A jej ? 1 ? X M (b) for all e 2 E > > f 2b b2 e < b2e f 2 b X X > M (b) + for all b 2 B x(f) ? x(f) 0 > b > f 2 f 2 b > : x 2 f0; 1gq and x(e) = 0 for all e 2 E in

in

(8)

cut

Usually, each inequality in (8) contains relatively few variables, and the size of memory required to store the entire system is linear (rather than quadratic) in the size of Unf . As (8) is a pure integer problem, CDA is directly applicable. However, since the number of variables can be large, it needs further re nement.

LP Deadlock Checking Using Partial Order Dependencies

11

5.2 Partial-order dependencies between variables In [15], a nite pre x of the unfolding is used only for building a system of constraints, and the latter is then passed to the LP-solver without any additional information. Yet, during the solving of the system, one may use dependencies between variables implied by the causal order on events, which can be easily derived from Unf . For example, if we set x(e) = 1 then each x(f) such that f is a predecessor (in causal order) of e must be equal to 1, and each x(g) such that g is in con ict with e, must be equal to 0. Similarly, if we set x(e) = 0 then no event f for which e is a cause can be executed in the same run, and so x(f) should be equal to 0. Our algorithm will use these observations to reduce the search space, and the experimental results indicate that taking into account causal dependencies, in combination with some useful heuristics, can lead to signi cant speedups. De nition 1. A vector x 2 f0; 1gq is compatible with Unf if for all distinct events e; f 2 E such that x(e) = 1, we have: f e ) x(f) = 1 and f#e ) x(f) = 0 : The motivation for considering compatible vectors follows from the next result. Theorem 3. A vector x 2 f0; 1gq is compatible with Unf if and only if there exists an execution sequence starting at Min whose Parikh vector is x. Proof. (() Let be an execution sequence starting from Min . For e 2 E to be executed, it is necessary for all f 2 E satisfying f e to occur before e. Moreover, if f#e then f cannot happen in the same execution sequence. Hence the Parikh vector of is compatible with Unf . ()) For each compatible vector x, it is possible to build an execution sequence whose Parikh vector is x. Indeed, as shown in [15], for acyclic nets the feasibility of marking equation is a sucient condition for a marking to be reachable. Moreover, the proof of this result presented in [15] implies that any solution of this equation corresponds to at least one execution sequence . Hence, if for a given compatible vector x, M = Min + C x 0 then M is reachable and there is an execution sequence leading to M whose Parikh vector is x. Therefore it is enough to show that for every compatible vector x, Min + C x 0, i.e. X X Min (b) + x(f) ? x(f) 0 for all b 2 B : (9) f 2 b

f 2b

X

0

Since j bj 1, for all b 2 B,

x(f) = x(f 0 ) f 2 b

if b = ; if b = ff 0 g

and there are two possible cases: Case 1:P b = ;. Then b is an initial condition and so Min (b) = 1. In this case (9) has the form f 2b x(f) 1 and it holds due to the fact that all the events in b are in con ict. P Case 2: b = ff 0 g for some f 0 2 E. Then Min (b) = 0, so (9) has the form f 2b x(f) x(f 0 ) and it also holds since all the events in b are in con ict, and f 0 is a predecessor for all of them. ut

12

V.Khomenko and M.Koutny

Corollary 1. For each reachable marking M of , there exists an execution sequence in Unf leading to a marking representing M , whose Parikh vector x is compatible with Unf and x(e) = 0, for every e 2 E . cut

Proof. Each reachable marking M of is represented in Unf by a marking M 0 which

can be reached from Min through an execution sequence without cut-o events. Theorem 3 implies that the Parikh vector of is compatible with Unf . ut In view of the last result, it is sucient for a deadlock detection algorithm to check only compatible vectors whose components corresponding to cut-o events are equal to zero. This can be done by building minimal compatible closure of a vector x (see the de nition below) in each step of CDA and freezing all x(e) such that e 2 Ecut . De nition 2. A vector x 2 f0; 1gq is a compatible closure of x 2 f0; 1gq if x x and x is compatible with Unf . Moreover, x is a minimal compatible closure if it is minimal with respect to among all possible compatible closures of x. Let us consider the causal ordering e1 e2 e3 , e2 e4 and e3 co e4 (see gure 5), and x = (1; 0; 1; 0). Then x0 = (1; 1; 1; 0) and x00 = (1; 1; 1; 1) are compatible closures of x, and x0 is the minimal one. e4 e1

e2 e3

Fig. 5. An occurrence net

Theorem 4. A vector x 2 f0; 1gq has a compatible closure if and only if for all e; f 2 E , x(e) = x(f) = 1 implies :(e#f). If x has a compatible closure then its minimal compatible closure exists and is unique. Moreover, in such a case if x has zero components for all cut-o events, then the same is true for its minimal compatible closure. Proof. Straightforward. We just point out that to build the minimal compatible closure

of x it is enough to set to 1 all the components x(f) for which there is e such that f e and x(e) = 1. ut It may happen that a vector x has a compatible closure according to theorem 4, but it cannot be computed because some of the zero components of x to be set to 1 have been frozen. In this case, the algorithm should behave as if such a closure cannot be built.

Branching condition Each step of CDA can be seen as moving from a point a(x) along a default vector a("j ) such that a(x) a("j ) < 0, which is interpreted as `returning to

the origin' (see gure 1). However, for an algorithm checking compatible vectors only, each step is moving along a vector which may be represented as a sum of several default vectors, and such a geometric interpretation is no longer valid. Indeed, let us consider the same ordering as in gure 5, and the equation a(x) = x1 + 5x2 ? 3x3 ? 3x4 = 0

LP Deadlock Checking Using Partial Order Dependencies

13

(which has a solution x = (1; 1; 1; 1)) with an initial constraint x1 = 1. Then we start the algorithm from the vector x = (1; 0; 0; 0), and the sequence of steps should begin from either "2 or "2 + "3 or "2 + "4 . But a(x) a("2 ) = 5 6< 0, a(x) a("2 + "3 ) = 2 6< 0, and a(x) a("2 + "4 ) = 2 6< 0, so we cannot choose a vector to make the rst step! A possible solution is to interpret each step "i1 + + "ik as a sequence of smaller steps "i1 ; : : :; "ik where we choose only the rst element "i1 for which a("i1 ) does return to the origin, and then add the remaining ones in order for x + "i1 + + "ik to be compatible, without worrying where they actually lead, as shown in gure 6. (If there is no compatible closure of x+"i1 then "i1 cannot be chosen.) This means that we check the condition a(x) a("i1 ) < 0 which coincides with the original CDA's branching condition, though we are moving along possibly dierent vector. As the following theorem shows, this technique is still complete.

a(vi ) a("i1 ) 0 vi+1 = vi + "i1 + + "ik

a("i2 )

a(vi+1 ) a("ik ) ...

Fig. 6. Geometric interpretation of the new branching condition

Theorem 5. Every non-trivial minimal compatible solution can be computed using the

above method.

Proof. The proof is adopted from [7]. We assume that the constraints are expressed as

a system of linear homogeneous equations A x = 0, i.e. (2). The argument can then be generalised to an arbitrary system of linear constraints using the approach described in [1,2]. Let z be a minimal non-trivial solution of A x = 0 compatible with Unf . The algorithm needs to construct a sequence of vectors z1; : : :; zl such that zi < zi+1 , zl = z, and each zi is compatible with Unf . For z1 , the minimal compatible closure of any "j such that "j z can be chosen (such an "j does exist and has a compatible closure since z is non-trivial and can be seen as | possibly not minimal | compatible closure of some "j ). Suppose that the algorithm has already constructed z1 ; : : :; zi and zi < z. Then there exists zi0 such that z = zi + zi0 and 0 = ka(z)k2 = k| a(z{zi )k2} + k| a(z{zi0 )k2} +2a(zi ) a(zi0 ) ; >0

>0

where k k denotes the Euclidean norm in Nq. Hence a(zi ) a(zi0 ) < 0. Consequently, there exists ji q such that "ji zi0 and a(zi ) a("ji ) < 0. Let us take as zi+1 the minimal compatible closure of zi +"ji (it exists due to zi +"ji z and z being compatible

14

V.Khomenko and M.Koutny

with Unf ). There are two possibilities: either we have constructed zi+1 = z, or zi+1 < z (note that zi+1 z for any compatible closure z of zi +"ji , and, in particular, for z = z). The process of generating new zi 's cannot continue inde nitely since zi < zi+1 z, so z will eventually be constructed. ut

Avoiding unnecessary constraints One can see that the inequalities in the middle of (8) are not essential for an algorithm checking only compatible vectors. Indeed, they are just the result of the substitution of M = Min + C x into the constraints M 0 and hold for any compatible vector x (see the proof of theorem 3). So these inequalities may be left out without losing any compatible solution. After eliminating these constraints, the reduced system

8 0 1 > X X X > < @ x(f) ? x(f)A jej ? 1 ? X M (b) for all e 2 E b2 e f 2 b f 2b b2 e > > x 2 f0; 1gq and x(e) = 0 for all e 2 E : in

(10)

cut

can have minimalsolutions which are not compatible with Unf . However, such solutions are not computed by the algorithm as it checks only compatible vectors.

Sketch of the algorithm In the discussion below we refer to the parameters appearing in the generic system (6), since (10) is of that format. The branching condition for (6) can be formulated as xj can be incremented by 1 if where each ri is given by ri = 0(a x ? d )(a " ) i i i j

p X i=1

ri < 0 ;

(11)

if ai x < di and ai "j < 0 otherwise ;

where ai = (ai1 ; : : :; aiq ). The algorithm in gure 7 starts from the tuple (0; : : :; 0) which is the root of the search tree and works in the way similar to the original CDA, but only checks vectors compatible with Unf . In the general case, the maximal depth of the search tree is q, so CDA needs to store q vectors of length q. In our algorithm, we use just two arrays, X and FIXED , of length q: { X : array[1::q] of f0; 1g To construct a solution. { FIXED : array[1::q] of integers To keep an information about the levels of xing the components of X. The interpretation of these arrays is as follows: { FIXED [i] = 0. Then X[i] must be 0 and this means that X[i] has not been considered yet, and may later be set to 1 or frozen. { FIXED [i] = k > 0 and X[i] = 0. Then X[i] has been frozen at some node on level k whose subtree the algorithm is developing. It cannot be unfrozen until the algorithm backtracks to the level k.

LP Deadlock Checking Using Partial Order Dependencies

15

{ FIXED [i] = k > 0 and X[i] = 1. Then X[i] has been set to 1 at some node on level k

whose subtree the algorithm is developing. This value is xed for the entire subtree. Notice that storing the levels of xing the elements of X allows one to undo changes during backtracking, without keeping all the intermediate values of X. We also use the following auxiliary variables and functions: { depth : integer The current depth in the search tree. { freeze(i : integer) Freezes all X[k] such that ei #ek . If there is X[k] = 1 to be frozen then freeze fails. The corresponding elements of FIXED are set to the current value of depth. { set(i : integer) Sets all X[j] such that ej ei to 1 and uses freeze to freeze all X[k] such that ei #ek . If there is a frozen X[j] to be set to 1, or X[k] = 1 to be frozen then set fails. The current value of depth is written in the elements of FIXED , corresponding to the components being xed.

Input

Cons | a system of constraints

Unf | a nite complete pre x of the unfolding

Output

A solution of Cons compatible with Unf if it exists

Initialisation

depth 1 X (0; :::; 0) for i 2 f1;:::; qg: FIXED [i]

1 if ei 2 Ecut 0 otherwise

Main procedure if X is a solution of Cons then return X for all i 2 fk j 1 k q ^ FIXED [k] = 0 ^ (11) holds g depth depth + 1 if set(i) has succeeded then apply the procedure to X recursively if solution is found then return X

undo changes in the elements X [j ] such that FIXED [j ] = depth depth depth ? 1 freeze(i) /* never fails here as X is compatible */ return solution not found

Fig. 7. Deadlock checking algorithm

Further optimisation We now introduce some simple yet useful heuristics. For example, if the algorithm has xed some variables and found out that some of the inequalities have become infeasible, then it may cut the current branch of the search graph. Moreover, we sometimes can determine the values of variables which have not yet been xed,

16

V.Khomenko and M.Koutny

or nd out that some inequalities have become redundant. Let us consider the inequality ai1x1 + + aiq xq di in the context of generating the search tree at the current node and calculate fix =

X

aij xj

xj is xed

max = xj

X

aij >0

aij

is not fixed

min = xj

X

aij di ? fix then (12), and so the whole system, is infeasible and the current subtree of the search tree may be pruned; if max di ? fix then (12) is redundant and may be ignored in the subtree rooted at the current node of the search tree; if min = di ? fix then (12) can only be satis ed if all its non- xed variables with negative coecients are equal to 1, and all its non- xed variables with positive coecients are equal to 0. After xing these variables (12) becomes redundant; if min + aik > di ? fix for a non- xed variable xk (in this case aik > 0), then (12) forces xk to be 0; if min ? aik > di ? fix for a non- xed variable xk (in this case aik < 0), then (12) forces xk to be 1. After xing the value of a variable, it is necessary to build a compatible closure for the current vector; new variables can become xed during this process, so the above tests are applied iteratively to all the inequalities in the system while it takes eect. If such a closure cannot be built, then the current subtree of the search tree does not contain a compatible solution and may be pruned. The tests may also be applied before the algorithm starts; in such a case we may reduce the system of constraints by excluding redundant inequalities and substituting the values of xed variables. As an example, let us consider the following system of inequalities: 8 ?x ? 2x + 2x + 2x + x 1 < 1 2 3 4 5 x + 2x + x ? x 2 1 2 4 5 : 2x1 + 3x2 + x3 ? 3x6 + 2x7 + x8 3 where the variables x1 and x2 are xed to 1 and x3 is xed to 0. Then, for the rst inequality fix = ?3, min = 0 and max = 3, so it is redundant due to max 1 ? fix. For the second inequality, fix = 3, min = ?1 and max = 1; due to min = 2 ? fix we can determine that x4 = 0 and x5 = 1. For the third inequality, fix = 5, min = ?3 and max = 3; it is possible to determine that x6 = 1 (due to min ? (?3) > 3 ? fix) and x7 = 0 (due to min + 2 > 3 ? fix). After xing this values the inequality becomes redundant. Suppose now that we added another constraint, x1 + x2 ? 2x3 + x9 1, for which fix = 2, min = 0 and max = 1. Then the system becomes infeasible due to min > 1 ? fix.

Additional dependencies between events The system of inequalities may contain additional information about causal dependencies which must hold in order to reach a

LP Deadlock Checking Using Partial Order Dependencies

17

deadlock. For example, the eect of xk ? xj 0 is the same as if ej was a predecessor of ek . Such an information can be incorporated directly into the unfolding before the algorithm starts. Let us formulate simple tests for determining such dependencies. We observe that (12) implies xk xj if and only if xk 6 xj implies that the inequality cannot be satis ed irrespective of the values of other variables. This happens if after xing xk = 1 and xj = 0, (12) becomes infeasible, i.e. min > di ? fix. Thus we can nd all such pairs (k; j) and then consider the following cases: { ej ek . Then we ignore this pair (the dependency is already represented in the unfolding). { ek ej . Then xk = xj for any execution sequence leading to a deadlock, so we can reduce the number of variables after the substitution xk = xj . { ek#ej . Then ek cannot occur in any execution sequence leading to a deadlock, and we may freeze xk and any other xi such that ek ei . { ek co ej . Then we may modify the unfolding by adding a new condition b such that b = fej g and b = fek g. Another kind of dependencies resembles that of the con ict relation. For example, the eect of xk + xj 1 is similar to a con ict between ek and ej in the unfolding. The inequality (12) implies xk + xj 1 if after xing xk = 1 and xj = 1, it becomes infeasible, i.e. min > di ? fix. Thus we can nd all such pairs (k; j) and then consider the following cases: { ek ej . Then ej cannot occur in any execution sequence leading to a deadlock, so we may freeze xj and any other component xi such that ej ei . { ej ek . Then, as before, we may freeze xk and any other component xi such that ek ei . { ek #ej . Then we ignore this pair (the dependency is already represented in the unfolding). { ek co ej . Then we may modify the unfolding by adding a new initial condition b such that b = fek ; ej g. The dependencies discussed above may also emerge during a run of the algorithm due to some variables becoming xed. In such a case, we do not modify the unfolding nor the system of constraints (otherwise we would have to undo changes during backtracking), but just x the values of some variables.

Shortest trail Finding a shortest path leading to a deadlock may facilitate debugging. In such a case, we need to solve an optimisation problem with the same system of constraints as before, and L(x) = x(e1 ) + + x(eq ) as a function to be minimised.

The algorithm can easily be adopted for this task. The only adjustment is not to stop after the rst solution has been found, but to keep the current optimal solution together with the value of the function L. As this function is non-decreasing, we may prune the branch of the search tree as soon as value of L becomes greater than, or equal to, the current optimal value. This also saves us from keeping all non-decomposable solutions found by the algorithm and used to prune the search tree as soon as the current node becomes greater than, or equal to, some non-decomposable solution (note that if x0 x00 then L(x0) L(x00)). It therefore makes sense to organise the search process in such a way that the rst solutions found give the value of L close to the optimal one (in order to reduce the search space). This can be done by choosing in each step of the algorithm the `most promising' branches. Since the ordering on the successors of a node in the ordered

18

V.Khomenko and M.Koutny

version of CDA (see section 4) is arbitrary, we may exploit the information about the value of L and check successors with smaller values rst. Such an algorithm can be seen as a version of the `branch and bound' method which only considers vectors compatible with unfolding and uses frozen components and branching condition to reduce the search space.

6 Experimental results For the experiments, we used the PEP tool [3] to generate nite complete pre xes for our partial order algorithm, and for deadlock checking based on McMillan's method ([14,15]) and the MIP algorithm4 ([15]). The results in table 1 have been measured on a PC with PentiumTM III/500MHz processor and 128M RAM (building unfoldings for RW(5), SEM(11), ELEVATOR(3), and STACK(10) were aborted after 20 hours). Problem

Deadlock- Original net Unfolding free jS j jT j jBj jE j p buf100 200 101 10101 5051 p mutual 49 41 887 479 p ab gesc 58 56 3326 1200 p sdl arg 163 96 644 199 sdl arg ddlk 157 92 657 223 p RW(2) 54 60 498 147 p RW(3) 72 120 4668 1281 p RW(4) 94 224 51040 13513 p SEM(2) 27 25 61 32 p SEM(3) 38 36 165 86 p SEM(4) 49 47 417 216 p SEM(5) 60 58 1013 522 p SEM(6) 71 69 2393 1228 p SEM(7) 82 80 5533 2830 p SEM(8) 93 91 12577 6416 p SEM(9) 104 102 28197 14354 p SEM(10) 115 113 62505 31764 p ELEVATOR(1) 59 72 518 287 p ELEVATOR(2) 83 130 29413 15366 STACK(3) 20 24 320 174 STACK(4) 24 30 968 525 STACK(5) 28 36 2912 1578 STACK(6) 32 42 8744 4737 STACK(7) 36 48 26240 14214 STACK(8) 40 54 78728 42645 STACK(9) 44 60 236192 127938 buf100 | mutual | | ab gesc sdl arg | sdl arg ddlk | RW(n) | SEM(n) | ELEVATOR(n) | STACK(n) | time | mem |

jEcut j

1 79 511 10 7 53 637 7841 5 17 49 129 321 769 1793 4097 9217 9 1796 26 80 242 728 2186 6560 19682

McM

Time [s] MIP

PO

0.01 24577 0.02 4.42 70 0.02 33.93 260 0.17 0.04 20 > b ? ? e 2 e 2 b b 2 h ( s ) b 2 h ( p ) > < ! X X X X > M (b) x(e) ? x(e) 1 ? > > b2h? (s0) e2b e2b b2h? (p0 ) > : x 2 f0; 1gq and x(e) = 0 for all e 2 E in

1

1

in

1

1

cut

and the algorithm proposed in this paper can be used for this purpose. Another interesting problem is marking reachability (and coverability). Let M 0 be a marking of . Then the system of constraints has the form 8M = M + C x > in > X > > < M(b) = ()M 0(s) for all s 2 S ?1

b2h (s) > > x(e) = 0 for all e 2 E > > : M 2 Np and x 2 Nq

cut

and, after eliminating the variables M and constraints M 0,6 we need to check the feasibility of the following system:

8 0 1 > X X X > < @ x(f) ? x(f)A = ()M 0(s) ? X M (b) for all s 2 S f 2b b2h? (s) f 2 b b2h? (s) > > : x 2 f0; 1gq and x(e) = 0 for all e 2 E in

1

1

cut

The algorithm can be easily adopted to solve this system, and if it has a compatible solution then the marking M 0 is reachable (coverable), and any compatible solution gives an execution sequence leading to it (respectively, to a marking which covers it), otherwise M 0 is unreachable (respectively, uncoverable).

9 Conclusions Experiments indicate that the algorithm we proposed in this paper can solve problems with thousands of variables. This overcomes the existing limitations, as MIP -problems 6 For the coverability analysis, the inequalities for which M (s) = 0 may also be excluded; they hold for any compatible vector x. 0

LP Deadlock Checking Using Partial Order Dependencies

21

with even a few hundreds of integer variables are usually a hard task for general purpose solvers. It is worth emphasising that the limitation was not the size of computer memory, but rather the time to solve an NP-complete problem. With our approach, the main limitation becomes the size of memory to store the unfolding. Our future research will aim at developing eective parallel algorithms (especially, for a distributed memory architecture) for constructing large unfoldings which cannot t in the memory of a single processing node, and modifying our algorithm to handle such `distributed' unfoldings. Finally, our experiments have indicated that the process of model checking a nite complete pre x is often faster than the process of constructing such a pre x. We therefore plan to investigate novel algorithms for fast generation of net unfoldings. Finally, it is worth pointing out that the approach presented in this paper is also valid for Petri Nets with weighted arcs since all the arcs in their unfoldings are unitary.

Acknowledgements

We would like to thank Paul Watson for his support and comments on an earlier version of this paper, and Christian Stehno for his help with using the PEP tool. The rst author was supported by an ORS Awards Scheme grant ORS/C20/4 and by an EPSRC grant GR/M99293.

References 1. F. Ajili and E. Contejean: Complete Solving of Linear Diophantine Equations and Inequations Without Adding Variables. Proc. of 1st International Conference on principles and practice of Constraint Programming, Cassis (1995) 1{17. 2. F. Ajili and E. Contejean: Avoiding Slack Variables in the Solving of Linear Diophantine Equations and Inequations. Theoretical Computer Science 173 (1997) 183{208. 3. E. Best and B. Grahlmann: PEP | more than a Petri Net Tool. Proc. of TACAS'96: Tools and Algorithms for the Construction and Analysis of Systems, Margaria T., Steen B. (Eds.). Springer-Verlag, Lecture Notes in Computer Science 1055 (1996) 397{401. 4. E. M. Clarke, E. A. Emerson and A.P. Sistla: Automatic Veri cation of Finite-state Concurrent Systems Using Temporal Logic Speci cations. ACM TOPLAS 8 (1986) 244{263. 5. E. Contejean: Solving Linear Diophantine Constraints Incrementally. Proc. of 10th International conference on Logic Programming, D. S. Warren (Ed.). MIT Press (1993) 532{549. 6. E. Contejean and H. Devie: Solving Systems of Linear Diophantine Equations. Proc. of 3rd Workshop on Uni cation, University of Keiserlautern (1989). 7. E. Contejean and H. Devie: An Ecient Incremental Algorithm for Solving Systems of Linear Diophantine Equations. Information and Computation 113 (1994) 143{172. 8. J. C.Corbett: Evaluating Deadlock Detection Methods. University of Hawaii at Manoa (1994). 9. CPLEX Corporation: CPLEX 3.0. Manual (1995). 10. J. Engelfriet: Branching processes of Petri Nets. Acta Informatica 28 (1991) 575{591. 11. J. Esparza, S. Romer and W. Vogler: An Improvement of McMillan's Unfolding Algorithm. Proc. of TACAS'96: Tools and Algorithms for the Construction and Analysis of Systems, Margaria T., Steen B. (Eds.). Springer-Verlag, Lecture Notes in Computer Science 1055 (1996) 87{106. 12. J. Esparza: Model Checking Based on Branching Processes. Science of Computer Programming 23 (1994) 151{195. 13. S. Krivoy: About Some Methods of Solving and Feasibility Criteria of Linear Diophantine Equations over Natural Numbers Domain (in Russian). Cybernetics and System Analysis 4 (1999) 12{36. 14. K. L. McMillan: Using Unfoldings to Avoid State Explosion Problem in the Veri cation of Asynchronous Circuits. Proc. of 4th Workshop on Computer Aided Veri cation, SpringerVerlag, Lecture Notes in Computer Science 663 (1992) 164{174.

22

V.Khomenko and M.Koutny

15. S. Melzer and S. Romer: Deadlock Checking Using Net Unfoldings. Proc. of Computer Aided Veri cation | CAV'97, O. Grumberg (Ed.). Springer-Verlag, Lecture Notes in Computer Science 1254 (1997) 352{363. 16. A. Pnueli: Applications of Temporal Logic to the Speci cation and Veri cation of Reactive Systems: A Survey of Current Trends. Proc. of Advanced School on Current Trends in Concurrency, Springer-Verlag, Lecture Notes in Computer Science 224 (1994) . 17. R. K. Ranjan, J. V. Sanghavi, R. K. Brayton, and A. Sangiovanni-Vincentelli: Binary Decision Diagrams on Network of Workstation. Proc. of International Conference on Computer Design (ICCD'96), Austin, Texas (1996) 358{364.

fVictor.Khomenko, [email protected]

Abstract. Model checking based on the causal partial order semantics of Petri nets is an approach widely applied to cope with the state space explosion problem. One of the ways to exploit such a semantics is to consider ( nite pre xes of) net unfoldings | themselves a class of acyclic Petri nets | which contain enough information, albeit implicit, to reason about the reachable markings of the original Petri nets. In [15], a veri cation technique for net unfoldings was proposed in which deadlock detection was reduced to a mixed integer linear programming problem. In this paper, we present a further development of this approach. We adopt Contejean-Devie's algorithm for solving systems of linear constraints over the natural numbers domain and re ne it, by taking advantage of the speci c properties of systems of linear constraints to be solved. The essence of the proposed modi cations is to transfer the information about causality and con icts between the events involved in an unfolding, into a relationship between the corresponding integer variables in the system of linear constraints. Experimental results demonstrate that the new technique achieves signi cant speedups. Keywords: Model checking, integer linear programming, Petri nets, unfolding, causality and concurrency.

1 Introduction A distinctive characteristic of reactive concurrent systems is that their sets of local states have descriptions which are both short and manageable, and the complexity of their behaviour comes from highly complicated interactions with the external environment rather than from complicated data structures and manipulations thereon. One way of coping with this complexity problem is to use formal methods and, especially, computer aided veri cation tools implementing model checking [4,8] | a technique in which the veri cation of a system is carried out using a nite representation of its state space. The main drawback of model checking is that it suers from the state space explosion problem. That is, even a relatively small system speci cation can (and often does) yield a very large state space. To help in coping with this, a number of techniques have been proposed which can roughly be classi ed as aiming at an implicit compact representation of the full state space of a reactive concurrent system, or at an explicit representation of a reduced (though sucient for a given veri cation task) state space of the system. Techniques aimed at reduced representation of state spaces are typically based on the independence (commutativity) of some actions, often relying on the partial order view of concurrent computation. Such a view is the basis for algorithms employing McMillan's unfoldings ([11,12,14]), where the entire state space of a Petri Net is represented implicitly using an acyclic net to represent a system's actions and local states. The net unfolding technique presented in [14,15] reduces memory requirement, but the deadlock checking algorithms proposed are quite slow, even for medium-size unfoldings.

2

V.Khomenko and M.Koutny

In [15], the problem of deadlock checking a Petri net was reduced to a mixed integer linear programming problem. In this paper, we present a further development of this approach. We adopt the Contejean-Devie's algorithm ([1,2,5{7]) for eciently solving systems of linear constraints over the domain of natural numbers. We re ne this algorithm by employing unfolding-speci c properties of the systems of linear constraints to be solved, in model checking aimed at deadlock detection. The essence of the proposed modi cations is to transfer the information about causality and con icts between events involved in an unfolding into a relationship between the corresponding integer variables in the system of linear constraints. The results of initial experiments demonstrate that the new technique achieves signi cant speedups. The paper is organised as follows. In section 2 we provide basic de nitions concerning Petri nets and, in particular, net unfoldings. Section 3 brie y recalls the results presented in [15] where the deadlock checking problem has been reduced to the feasibility test of a system of linear constraints. Section 4 is based on the results developed in [1,2,5{7] and recalls the main aspects of the Contejean-Devie's Algorithm (CDA) for solving system of linear constraints over the natural numbers domain. The algorithm we propose in this paper is a variation of CDA, developed speci cally to exploit partial order dependencies between events in the unfolding of a Petri net. Our algorithm is described in section 5; we provide theoretical background, useful heuristics, and outline ways of reducing the number of variables and constraints in the original system presented in [15]. Section 6 contains results of experiments obtained for a number of benchmark examples. Section 7 brie y discusses the parallelisation aspects of the new algorithm, and in section 8, we discuss how the approach can be generalised to deal with other relevant veri cation problems, such as mutual exclusion, coverability and reachability analysis. Section 9 describes possible directions for future research.

2 Basic de nitions In this section, we rst present basic de nitions concerning Petri nets, and then recall (see [11]) notions related to net unfoldings.

Petri nets A net is a triple N = (S; T; F) such that S and T are disjoint sets of respectively places and transitions, and F (S T) [ (T S) is a ow relation. A marking of N is a multiset M of places, i.e. M : S ! N = f0; 1; 2; : ::g. We adopt the

standard rules about representing nets as directed graphs, viz. places are represented as circles, transitions as rectangles, the ow relation by arcs, and markings are shown by placing tokens within circles. As usual, we will denote z = fy j (y; z) 2 F g and z = fy j (z; y) 2 F g, for all z 2 S [ T. We will assume that t 6= ; = 6 t , for every t 2 T. A net system is a pair = (N; M0 ) comprising a nite net N = (S; T; F) and an (initial) marking M0 . A transition t 2 T is enabled at a marking M, denoted M[ti, if for every s 2 t, M(s) 1. Such a transition can be executed, leading to a marking M 0 de ned by M 0 = M ? t+t . We denote this by M[tiM 0 or M[iM 0. The set of reachable markings of is the smallest set [M0i containing M0 and such that if M 2 [M0i and M[iM 0 then M 0 2 [M0 i. For a nite sequence of transitions, = t1 : : :tk , we denote M0 [iM if there are markings M1 ; : : :; Mk such that Mk = M and Mi?1 [tiiMi , for i = 1; : : :; k. A marking is deadlocked if it does not enable any transitions. The net system is deadlock-free if no reachable marking is deadlocked; safe if for every reachable marking M, M(S) f0; 1g; and bounded if there is k 2 N such that M(S) f0; : : :; kg, for every reachable marking M.

LP Deadlock Checking Using Partial Order Dependencies

3

Marking equation Let = (N; M0 ) be a net system, and S = fs1; : : :; sm g and T = ft1 ; : : :; tn g be sets of its places and transitions, respectively. We will often identify a marking M of with a vector M = (1 ; : : :; m ) such that M(si ) = i, for all i m. The incidence matrix of is an m n matrix N = (Nij ) such that, for all i m and j n, 8 1 if s 2 t n t < i j j Nij = : ?1 if si 2 tj n tj

0 otherwise : The Parikh vector of a nite sequence of transitions is a vector x = (x1 ; : : :; xn) such that xi is the number of the occurrences of ti within , for every i n. One can show that if is an execution sequence such that M0[iM then M = M0 + N x . This provides a motivation for investigating the feasibility (or solvability) of the following system of equations: M = M + N x 0 M 2 Nm and x 2 Nn If we x the marking M, then the feasibility of the above system is a necessary condition for M to be reachable from M0 .

Branching processes Two nodes of a net N = (S; T; F), y and y0 , are in con ict, denoted by y#y0 , if there are distinct transitions t; t0 2 T such that t \ t0 = 6 ; and (t; y) and (t0 ; y0 ) are in the re exive transitive closure of the ow relation F, denoted by . A node y is in self-con ict if y#y. An occurrence net is a net ON = (B; E; G) where B is the set of conditions (places) and E is the set of events (transitions). It is assumed that: ON is acyclic (i.e. is a partial order); for every b 2 B, jbj 1; for every y 2 B [ E, :(y#y) and there are nitely many y0 such that y0 y, where denotes the irre exive transitive closure of G. Min (ON ) will denote the minimal elements of B [ E with respect to . The relation is the causality relation. Two nodes are co-related, denoted by y co y0 , if neither y#y0 nor y y0 nor y0 y. A homomorphism from and occurrence net ON to a net system is a mapping h : B [ E ! S [ T such that: h(B) S and h(E) T; for all e 2 E, the restriction of h to e is a bijection between e and h(e); the restriction of h to e is a bijection between e and h(e) ; the restriction of h to Min (ON ) is a bijection between Min (ON ) and M0 ; and for all e; f 2 E, if e = f and h(e) = h(f) then e = f. A branching process of [10] is a quadruple = (B; E; G; h) such that (B; E; G) is an occurrence net and h is a homomorphism from ON to . A branching process 0 = (B 0 ; E 0; G0; h0) of is a pre x of a branching process = (B; E; G; h), denoted by 0 v , if (B 0 ; E 0 ; G0) is a subnet of (B; E; G) such that: if e 2 E 0 and (b; e) 2 G or (e; b) 2 G then b 2 B 0 ; if b 2 B 0 and (e; b) 2 G then e 2 E 0; and h0 is the restriction of h to B 0 [ E 0 . For each there exists a unique (up to isomorphism) maximal (w.r.t. v) branching process, called the unfolding of .

Con gurations and cuts A con guration of an occurrence net ON is a set of events C such that for all e; f 2 C, :(e#f) and, for every e 2 C, f e implies f 2 C. A cut is a maximal w.r.t. set inclusion set of conditions B 0 such that b co b0, for all b; b0 2 B 0 . Every marking reachable from Min (ON ) is a cut. Let C be a nite con guration of a branching process . Then Cut (C) = (Min (ON ) [ C ) n C is a cut; moreover, the multiset of places h(Cut (C)) is a reachable marking of , denoted Mark (C). A marking M of is represented in if the latter contains

4

V.Khomenko and M.Koutny

a nite con guration C such that M = Mark (C). Every marking represented in is reachable, and every reachable marking is represented in the unfolding of . A branching process of is complete if for every reachable marking M of , there is a con guration C in such that Mark (C) = M, and for every transition t enabled by M, there is a con guration C [ feg such that e 62 C and h(e) = t. Although, in general, an unfolding is in nite, for every bounded net system one can construct a nite complete pre x Unf of the unfolding of . Moreover, there are so-called cut o events1 in Unf such that, for every reachable marking M of , there exists a con guration C in Unf such that M = Mark (C) and no event in C is a cut-o event.

3 Deadlock detection using linear programming In the rest of this paper, we will assume that Unf = (B; E; G; h) is a nite complete pre x of the unfolding of a bounded net system = (S; T; F; M0 ). We will denote by Min the canonical initial marking of Unf which places a single token in each of the minimal conditions and no token elsewhere. Furthermore, we will assume that b1 ; b2; : : :; bp and e1 ; e2; : : :; eq are respectively the conditions and events of Unf , and that C is the p q incidence matrix of Unf . The set of cut-o events of Unf will be denoted by Ecut . We now recall the main results from [15]. A nite and complete pre x Unf may be treated as an acyclic safe net system with the initial marking Min . Each reachable deadlocked marking in is represented by a deadlocked marking in Unf . However, some deadlocked markings of Unf lie beyond the cut-o events and may not correspond to deadlocks in . Such deadlocks can be excluded by prohibiting the cut-o events from occurring. Since for an acyclic Petri net the feasibility of the marking equation is a sucient condition for a marking to be reachable, the problem of deadlock checking can be reduced to the feasibility test of a system of linear constraints. Theorem 1. ([15]) is deadlock-free if and only if the following system has no solution (in M and x): 8M = M + C x > in > X > for all e 2 E < M(b) jej ? 1 b2 e (1) > > x(e) = 0 for all e 2 Ecut > > : M 2 Np and x 2 Nq where x(ei ) = xi, for every i q. In order to decrease the number of integer variables, M 0 can be treated as a rational vector since x 2 Nq and M = Min + C x 0 always imply that M 2 Np. Moreover, as an event can occur at most once in a given execution sequence of Unf from the initial marking Min , it is possible to require that x be a binary vector, x 2 f0; 1gq . To solve the mixed-integer LP-problem (1), [15] used the general-purpose LP-solver CPLEX [9], and demonstrated that there are signi cant performance gains if the number of cut-o events is relatively high since all variables in x corresponding to cut-o events are set to 0. 1 Intuitively, cut-o events are nodes at which the potentially in nite unfolding may be cut

without losing any essential information about the behaviour of ; see [10{12, 14, 15] for details.

LP Deadlock Checking Using Partial Order Dependencies

5

Remarks The characterisation of the deadlock checking problem given by (1) suggests

that, in general, one can consider the following speci c problems: { Pure feasibility test (to discover whether a net system is deadlock-free). { Checking feasibility and, in the case that there are deadlocks, to nd an execution sequence leading to a deadlock. { To facilitate debugging, nding a shortest path leading to a deadlock. This results in an optimisation problem, i.e. minimise L(x) = x(e1 ) + + x(eq ) under the constraints given by (1). We will show in section 5.1 that it is possible to reduce (1) to a pure integer LPproblem without increasing the total number of integer variables. Moreover, (1) has several problem-speci c internal dependencies between variables, and taking them into account may allow one to signi cantly reduce the number of calculations. Therefore it seems non-optimal to use general-purpose LP-solvers for this particular problem.

4 Solving systems of linear constraints In this paper, we will adapt the approach proposed in [1,2,5{7], in order to solve Petri net veri cation problems which can be reformulated as LP-problems. We start by recalling some basic results. The original Contejean and Devie's algorithm (CDA) [5{7] solves a system of linear homogeneous equations with arbitrary integer coecients

8a x + + a x = 0 > > < a1121x11 + + a12qq xqq = 0 . .. .. > . . > .. :

(2)

ap1x1 + + apq xq = 0

or A x = 0 where x 2 Nq and A = (aij ). For every 1 j q, let "j = (0; | :{z: :; 0}; 1; 0; : : :; 0) j ?1 times be the j-th vector in the canonical basis of Nq. Moreover, for every x 2 Nq, let a(x) 2 Np be a vector de ned by

0a x + + a x 1 11 1 1q q B a x + + a 21 1 2q xq C a(x) = B B C = x1 a("1) + + xq a("q ) ; . .. C @ .. . A

(3)

ap1 x1 + + apq xq

where a("j ) | the j-th column vector of the matrix A | is called the j -th basic default vector. The set S of all solutions of (2) can be represented by a nite basis B which is the minimal (w.r.t. set inclusion) subset of S such that every solution is a linear combination with non-negative integer coecients of the solutions in B. It can be shown that B comprises all solutions in S dierent from the trivial one, x = 0, which are minimal with respect to the ordering on Nq (x x0 if xi x0i, for all i q; moreover, x < x0 if x x0 and x 6= x0).

6

V.Khomenko and M.Koutny

The representation (3) suggests that any solution of (2) can be seen as a multiset of default vectors whose sum is 0. Choosing these vectors in an arbitrary order amounts to constructing a sequence of default vectors starting from, and returning to, the origin of Zp. CDA constructs such a sequence step by step: starting from the empty sequence, new default vectors are added until a solution is found, or no minimal solution can be obtained. However, dierent sequences of default vectors may correspond to the same solution (up to permutation of vectors). To eliminate some of the redundant sequences, a restriction for choosing the next default vector is used. A vector x 2 Nq (corresponding to a sequence of default vectors) such that a(x) 6= 0 can be incremented by 1 on its j-th component provided that a(x + "j ) = a(x) + a("j ) lies in the half-space containing 0 and delimited by the ane hyperplane perpendicular to the vector a(x) at its extremity when originating from 0 (see gure 1).

a(x)

0

a("j )

a(x + "j )

Fig. 1. Geometric interpretation of the branching condition in CDA This re ects a view that a(x) should not become too large, hence adding a("j ) to a(x) should yield a vector a(x + "j ) = a(x) + a("j ) `returning to the origin'. Formally, this restriction can be expressed by saying that given x = (x1; : : :; xq ), increment by 1 an xj satisfying a(x) a("j ) < 0 ; (4) where denotes the scalar product of two vectors. This reduces the search space without losing any minimal solution, since every sequence of default vectors which corresponds to a solution can be rearranged into a sequence satisfying (4).

Theorem 2. ([7]) The following hold for the CDA shown in gure 2: 1. Every minimal solution of the system (2) is computed. 2. Every solution computed by CDA is minimal. 3. The algorithm always terminates.

(completeness) (soundness) (termination)

Figure 3 illustrates the process of solving the homogeneous system of linear equations considered in [13]: ? x + x + 2x ? 3x = 0 1 2 3 4 ? x1 + 3x2 ? 2x3 ? x4 = 0

LP Deadlock Checking Using Partial Order Dependencies

{ { {

7

search breadth- rst a directed acyclic graph rooted at "1 ;:::;"q if a node y is equal to, or greater than, an already encountered solution of A x = 0 then y is a terminal node otherwise construct the sons of y by computing y + "j for each j q satisfying a(y) a("j ) < 0

Fig. 2. CDA (breath- rst version)

?1 1000 ?1

1 0100 3

2 ?2 0010

?3 0001 ?1

0 1100 2

3 0110 1

?2 0101 2

?1 0011 ?3

?1 2100 1

2 1110 0

?3 1101 1

0 0111 0

?1 2110

1

?1 1111 ?1

>x

0

2 2210 2

?2 2111 ?2

>x

0

1 3210 1

?1 2211 1

>x

0

0 4210 0

0 ?2 3211

>x

0

x = 00

= x

0

Fig. 3. Search graph constructed by CDA in gure 2; inside each box, the current value of a(x) is represented by a column on the left, and is followed by the current value of x; note that

x = (0; 1; 1; 1) and x = (4; 2; 1; 0) are two minimal solutions 0

00

8

V.Khomenko and M.Koutny

The example shows redundancies as some vectors were computed more than once. This can be remedied by using frozen components, de ned thus. Assume that there is a total ordering x on the sons of each node x of the search graph constructed by CDA. If x + "i and x + "j are two distinct sons of a node x such that x + "i x x + "j , then the i-th component is frozen in the sub-graph rooted at x + "j and cannot be incremented even if (4) is satis ed. The modi ed algorithm is still complete ([7]), and builds a forest which is a sub-graph of the original search graph. By de ning2 the ordering x as x + "i x x+ "j , i < j we obtain, for the system in the above example, the graph shown in gure 4 (see [13]).

?1 1000 ?1

1 0100 3

2 ?2 0010

?3 0001 ?1

0 1100 2

3 0110 1

?1 0011 ?3

?1 2100 1

?2 0101 2

2 1110 0

0 0111 0

= x

0

1

?1 2110 2 2210 2 1 3210 1 0 4210 0

= x

00

Fig. 4. Search graph constructed by the ordered version of CDA; frozen components are underlined, and the *s indicate nodes which cannot be developed according to condition (4) and frozen components rule The ordered version of CDA can easily handle bounds imposed on variables:

{ x0 x. Then, instead of starting with the vectors "1; : : :; "q , the algorithm starts

with x0 . The rest of the operation remains the same, but the minimal elements of

2 The ordering x may be de ned in other ways as well (see [7]).

LP Deadlock Checking Using Partial Order Dependencies

9

the set S 0 = fx j A x = 0 ^ x0 xg do not give all the solutions of A x = 0 x0 x However, any solution of the above system can be represented as a sum of a minimal element of S 0 and a non-negative integer linear combination of minimal solutions of the original system. { x x00 where x00 2 (N [ f1g)q . Then the algorithm works in the standard way except that the j-th component of a vector becomes frozen as soon as it reaches the j-th component of x00. { x0 x x00. Then a combination of two previous techniques is used. With the above extensions, CDA allows one to solve non-homogeneous diophantine systems 8a x + + a x = d > > < a1121x11 + + a12qq xqq = d12 (5) .. .. .. > > : ap1.x1 + + apq.xq = d.p By adding a new variable, x0 , we can transform (5) into a homogeneous system 8 ?d x + a x + + a x = 0 > > ?d12x00 + a1121x11 + + a12qq xqq = 0 < .. .. .. .. > > : ?d.px0 + ap1.x1 + + apq.xq = 0. Let Bk (k = 0; 1) be the set of all minimal solutions x = (x0; x1; : : :; xq ) of this system with x0 = k. Then any solution of (5) can be represented as x=y+

X

z2B0

cz z ;

where y 2 B1 and each cz belongs to N. Thus to solve (5) it suces to add just one variable which is frozen as soon as it reaches the value 1. The task of solving a system of linear inequalities 8a x + + a x d > > < a1121x11 + + a12qq xqq d12 (6) .. .. .. > > : ap1.x1 + + apq.xq d.p is more complicated.3 The standard linear programming approach is to reduce it to a system of equations 8a x + + a x + y = d1 > > < a2111x11 + + a21qq xqq + 1 y2 = d2 .. .. .. ... > . . . > : ap1x1 + + apq xq + yp = d p 3 Some solutions of (6) cannot be represented as non-negative linear combinations of minimal

solutions, and Ajili and Contejean use the notion of a non-decomposable solution to deal with this problem [2]. For our purposes, however, it is sucient to check only the minimal solutions.

10

V.Khomenko and M.Koutny

by adding slack variables yi 2 N, but this transformation increases the number of variables from q to q + p. And, as the computation time can grow exponentially in the number of variables, such an approach is not ecient. Moreover, the slack variables may assume arbitrary values in N, even if all the variables in the original problem are binary as in (1); as a result, the search space grows very rapidly. Another approach is to deal with inequalities directly. It has been developed in [1,2], where CDA was generalised to solve homogeneous systems of equations and inequalities. (A system of non-homogeneous inequalities can be reduced to a homogeneous one by adding just one binary variable, similarly as described above for (5); it is also possible to slightly change the algorithm to solve non-homogeneous systems directly.) For a system of linear constraints A x = 0 ^ A0 x 0, the branching condition (4) is modi ed by saying that given x = (x1 ; : : :; xq ), increment by 1 an xj for which there exist y1 ; : : :; yl such that the vector (x1; : : :; xq ; y1 ; : : :; yl ) can be incremented on its j-th component according to (4) applied to the system A x = 0 ^ A0 x + y = 0, where l is the number of rows in A0 and y = (y1 ; : : :; yl ). As shown in [1,2], this condition can be expressed as: (A x) (A "j ) +

Xl i=1

minf(A0i x)(A0i "j ); maxf0; A0i xg(A0i "j )g < 0

where A0i is the i-th row of A0. To ensure termination in the general case, [1,2] add one more condition, but if all the variables are bounded this is always guaranteed.

5 An algorithm for deadlock detection In this section, we introduce our deadlock checking algorithm, explain its theoretical background, and present some useful heuristics.

5.1 Reducing to a pure integer problem

The problem (1) can be reduced to a pure integer one, by substituting the expression for M given by the marking equation into other constraints and, at the same time, reducing the total number of constraints. Each equation in M = Min + C x has the form X X M(b) = Min (b) + x(f) ? x(f) for b 2 B: (7) f 2 b

f 2b

After substituting these into (1) we obtain the system

8 0 1 > X X X > @ x(f) ? x(f)A jej ? 1 ? X M (b) for all e 2 E > > f 2b b2 e < b2e f 2 b X X > M (b) + for all b 2 B x(f) ? x(f) 0 > b > f 2 f 2 b > : x 2 f0; 1gq and x(e) = 0 for all e 2 E in

in

(8)

cut

Usually, each inequality in (8) contains relatively few variables, and the size of memory required to store the entire system is linear (rather than quadratic) in the size of Unf . As (8) is a pure integer problem, CDA is directly applicable. However, since the number of variables can be large, it needs further re nement.

LP Deadlock Checking Using Partial Order Dependencies

11

5.2 Partial-order dependencies between variables In [15], a nite pre x of the unfolding is used only for building a system of constraints, and the latter is then passed to the LP-solver without any additional information. Yet, during the solving of the system, one may use dependencies between variables implied by the causal order on events, which can be easily derived from Unf . For example, if we set x(e) = 1 then each x(f) such that f is a predecessor (in causal order) of e must be equal to 1, and each x(g) such that g is in con ict with e, must be equal to 0. Similarly, if we set x(e) = 0 then no event f for which e is a cause can be executed in the same run, and so x(f) should be equal to 0. Our algorithm will use these observations to reduce the search space, and the experimental results indicate that taking into account causal dependencies, in combination with some useful heuristics, can lead to signi cant speedups. De nition 1. A vector x 2 f0; 1gq is compatible with Unf if for all distinct events e; f 2 E such that x(e) = 1, we have: f e ) x(f) = 1 and f#e ) x(f) = 0 : The motivation for considering compatible vectors follows from the next result. Theorem 3. A vector x 2 f0; 1gq is compatible with Unf if and only if there exists an execution sequence starting at Min whose Parikh vector is x. Proof. (() Let be an execution sequence starting from Min . For e 2 E to be executed, it is necessary for all f 2 E satisfying f e to occur before e. Moreover, if f#e then f cannot happen in the same execution sequence. Hence the Parikh vector of is compatible with Unf . ()) For each compatible vector x, it is possible to build an execution sequence whose Parikh vector is x. Indeed, as shown in [15], for acyclic nets the feasibility of marking equation is a sucient condition for a marking to be reachable. Moreover, the proof of this result presented in [15] implies that any solution of this equation corresponds to at least one execution sequence . Hence, if for a given compatible vector x, M = Min + C x 0 then M is reachable and there is an execution sequence leading to M whose Parikh vector is x. Therefore it is enough to show that for every compatible vector x, Min + C x 0, i.e. X X Min (b) + x(f) ? x(f) 0 for all b 2 B : (9) f 2 b

f 2b

X

0

Since j bj 1, for all b 2 B,

x(f) = x(f 0 ) f 2 b

if b = ; if b = ff 0 g

and there are two possible cases: Case 1:P b = ;. Then b is an initial condition and so Min (b) = 1. In this case (9) has the form f 2b x(f) 1 and it holds due to the fact that all the events in b are in con ict. P Case 2: b = ff 0 g for some f 0 2 E. Then Min (b) = 0, so (9) has the form f 2b x(f) x(f 0 ) and it also holds since all the events in b are in con ict, and f 0 is a predecessor for all of them. ut

12

V.Khomenko and M.Koutny

Corollary 1. For each reachable marking M of , there exists an execution sequence in Unf leading to a marking representing M , whose Parikh vector x is compatible with Unf and x(e) = 0, for every e 2 E . cut

Proof. Each reachable marking M of is represented in Unf by a marking M 0 which

can be reached from Min through an execution sequence without cut-o events. Theorem 3 implies that the Parikh vector of is compatible with Unf . ut In view of the last result, it is sucient for a deadlock detection algorithm to check only compatible vectors whose components corresponding to cut-o events are equal to zero. This can be done by building minimal compatible closure of a vector x (see the de nition below) in each step of CDA and freezing all x(e) such that e 2 Ecut . De nition 2. A vector x 2 f0; 1gq is a compatible closure of x 2 f0; 1gq if x x and x is compatible with Unf . Moreover, x is a minimal compatible closure if it is minimal with respect to among all possible compatible closures of x. Let us consider the causal ordering e1 e2 e3 , e2 e4 and e3 co e4 (see gure 5), and x = (1; 0; 1; 0). Then x0 = (1; 1; 1; 0) and x00 = (1; 1; 1; 1) are compatible closures of x, and x0 is the minimal one. e4 e1

e2 e3

Fig. 5. An occurrence net

Theorem 4. A vector x 2 f0; 1gq has a compatible closure if and only if for all e; f 2 E , x(e) = x(f) = 1 implies :(e#f). If x has a compatible closure then its minimal compatible closure exists and is unique. Moreover, in such a case if x has zero components for all cut-o events, then the same is true for its minimal compatible closure. Proof. Straightforward. We just point out that to build the minimal compatible closure

of x it is enough to set to 1 all the components x(f) for which there is e such that f e and x(e) = 1. ut It may happen that a vector x has a compatible closure according to theorem 4, but it cannot be computed because some of the zero components of x to be set to 1 have been frozen. In this case, the algorithm should behave as if such a closure cannot be built.

Branching condition Each step of CDA can be seen as moving from a point a(x) along a default vector a("j ) such that a(x) a("j ) < 0, which is interpreted as `returning to

the origin' (see gure 1). However, for an algorithm checking compatible vectors only, each step is moving along a vector which may be represented as a sum of several default vectors, and such a geometric interpretation is no longer valid. Indeed, let us consider the same ordering as in gure 5, and the equation a(x) = x1 + 5x2 ? 3x3 ? 3x4 = 0

LP Deadlock Checking Using Partial Order Dependencies

13

(which has a solution x = (1; 1; 1; 1)) with an initial constraint x1 = 1. Then we start the algorithm from the vector x = (1; 0; 0; 0), and the sequence of steps should begin from either "2 or "2 + "3 or "2 + "4 . But a(x) a("2 ) = 5 6< 0, a(x) a("2 + "3 ) = 2 6< 0, and a(x) a("2 + "4 ) = 2 6< 0, so we cannot choose a vector to make the rst step! A possible solution is to interpret each step "i1 + + "ik as a sequence of smaller steps "i1 ; : : :; "ik where we choose only the rst element "i1 for which a("i1 ) does return to the origin, and then add the remaining ones in order for x + "i1 + + "ik to be compatible, without worrying where they actually lead, as shown in gure 6. (If there is no compatible closure of x+"i1 then "i1 cannot be chosen.) This means that we check the condition a(x) a("i1 ) < 0 which coincides with the original CDA's branching condition, though we are moving along possibly dierent vector. As the following theorem shows, this technique is still complete.

a(vi ) a("i1 ) 0 vi+1 = vi + "i1 + + "ik

a("i2 )

a(vi+1 ) a("ik ) ...

Fig. 6. Geometric interpretation of the new branching condition

Theorem 5. Every non-trivial minimal compatible solution can be computed using the

above method.

Proof. The proof is adopted from [7]. We assume that the constraints are expressed as

a system of linear homogeneous equations A x = 0, i.e. (2). The argument can then be generalised to an arbitrary system of linear constraints using the approach described in [1,2]. Let z be a minimal non-trivial solution of A x = 0 compatible with Unf . The algorithm needs to construct a sequence of vectors z1; : : :; zl such that zi < zi+1 , zl = z, and each zi is compatible with Unf . For z1 , the minimal compatible closure of any "j such that "j z can be chosen (such an "j does exist and has a compatible closure since z is non-trivial and can be seen as | possibly not minimal | compatible closure of some "j ). Suppose that the algorithm has already constructed z1 ; : : :; zi and zi < z. Then there exists zi0 such that z = zi + zi0 and 0 = ka(z)k2 = k| a(z{zi )k2} + k| a(z{zi0 )k2} +2a(zi ) a(zi0 ) ; >0

>0

where k k denotes the Euclidean norm in Nq. Hence a(zi ) a(zi0 ) < 0. Consequently, there exists ji q such that "ji zi0 and a(zi ) a("ji ) < 0. Let us take as zi+1 the minimal compatible closure of zi +"ji (it exists due to zi +"ji z and z being compatible

14

V.Khomenko and M.Koutny

with Unf ). There are two possibilities: either we have constructed zi+1 = z, or zi+1 < z (note that zi+1 z for any compatible closure z of zi +"ji , and, in particular, for z = z). The process of generating new zi 's cannot continue inde nitely since zi < zi+1 z, so z will eventually be constructed. ut

Avoiding unnecessary constraints One can see that the inequalities in the middle of (8) are not essential for an algorithm checking only compatible vectors. Indeed, they are just the result of the substitution of M = Min + C x into the constraints M 0 and hold for any compatible vector x (see the proof of theorem 3). So these inequalities may be left out without losing any compatible solution. After eliminating these constraints, the reduced system

8 0 1 > X X X > < @ x(f) ? x(f)A jej ? 1 ? X M (b) for all e 2 E b2 e f 2 b f 2b b2 e > > x 2 f0; 1gq and x(e) = 0 for all e 2 E : in

(10)

cut

can have minimalsolutions which are not compatible with Unf . However, such solutions are not computed by the algorithm as it checks only compatible vectors.

Sketch of the algorithm In the discussion below we refer to the parameters appearing in the generic system (6), since (10) is of that format. The branching condition for (6) can be formulated as xj can be incremented by 1 if where each ri is given by ri = 0(a x ? d )(a " ) i i i j

p X i=1

ri < 0 ;

(11)

if ai x < di and ai "j < 0 otherwise ;

where ai = (ai1 ; : : :; aiq ). The algorithm in gure 7 starts from the tuple (0; : : :; 0) which is the root of the search tree and works in the way similar to the original CDA, but only checks vectors compatible with Unf . In the general case, the maximal depth of the search tree is q, so CDA needs to store q vectors of length q. In our algorithm, we use just two arrays, X and FIXED , of length q: { X : array[1::q] of f0; 1g To construct a solution. { FIXED : array[1::q] of integers To keep an information about the levels of xing the components of X. The interpretation of these arrays is as follows: { FIXED [i] = 0. Then X[i] must be 0 and this means that X[i] has not been considered yet, and may later be set to 1 or frozen. { FIXED [i] = k > 0 and X[i] = 0. Then X[i] has been frozen at some node on level k whose subtree the algorithm is developing. It cannot be unfrozen until the algorithm backtracks to the level k.

LP Deadlock Checking Using Partial Order Dependencies

15

{ FIXED [i] = k > 0 and X[i] = 1. Then X[i] has been set to 1 at some node on level k

whose subtree the algorithm is developing. This value is xed for the entire subtree. Notice that storing the levels of xing the elements of X allows one to undo changes during backtracking, without keeping all the intermediate values of X. We also use the following auxiliary variables and functions: { depth : integer The current depth in the search tree. { freeze(i : integer) Freezes all X[k] such that ei #ek . If there is X[k] = 1 to be frozen then freeze fails. The corresponding elements of FIXED are set to the current value of depth. { set(i : integer) Sets all X[j] such that ej ei to 1 and uses freeze to freeze all X[k] such that ei #ek . If there is a frozen X[j] to be set to 1, or X[k] = 1 to be frozen then set fails. The current value of depth is written in the elements of FIXED , corresponding to the components being xed.

Input

Cons | a system of constraints

Unf | a nite complete pre x of the unfolding

Output

A solution of Cons compatible with Unf if it exists

Initialisation

depth 1 X (0; :::; 0) for i 2 f1;:::; qg: FIXED [i]

1 if ei 2 Ecut 0 otherwise

Main procedure if X is a solution of Cons then return X for all i 2 fk j 1 k q ^ FIXED [k] = 0 ^ (11) holds g depth depth + 1 if set(i) has succeeded then apply the procedure to X recursively if solution is found then return X

undo changes in the elements X [j ] such that FIXED [j ] = depth depth depth ? 1 freeze(i) /* never fails here as X is compatible */ return solution not found

Fig. 7. Deadlock checking algorithm

Further optimisation We now introduce some simple yet useful heuristics. For example, if the algorithm has xed some variables and found out that some of the inequalities have become infeasible, then it may cut the current branch of the search graph. Moreover, we sometimes can determine the values of variables which have not yet been xed,

16

V.Khomenko and M.Koutny

or nd out that some inequalities have become redundant. Let us consider the inequality ai1x1 + + aiq xq di in the context of generating the search tree at the current node and calculate fix =

X

aij xj

xj is xed

max = xj

X

aij >0

aij

is not fixed

min = xj

X

aij di ? fix then (12), and so the whole system, is infeasible and the current subtree of the search tree may be pruned; if max di ? fix then (12) is redundant and may be ignored in the subtree rooted at the current node of the search tree; if min = di ? fix then (12) can only be satis ed if all its non- xed variables with negative coecients are equal to 1, and all its non- xed variables with positive coecients are equal to 0. After xing these variables (12) becomes redundant; if min + aik > di ? fix for a non- xed variable xk (in this case aik > 0), then (12) forces xk to be 0; if min ? aik > di ? fix for a non- xed variable xk (in this case aik < 0), then (12) forces xk to be 1. After xing the value of a variable, it is necessary to build a compatible closure for the current vector; new variables can become xed during this process, so the above tests are applied iteratively to all the inequalities in the system while it takes eect. If such a closure cannot be built, then the current subtree of the search tree does not contain a compatible solution and may be pruned. The tests may also be applied before the algorithm starts; in such a case we may reduce the system of constraints by excluding redundant inequalities and substituting the values of xed variables. As an example, let us consider the following system of inequalities: 8 ?x ? 2x + 2x + 2x + x 1 < 1 2 3 4 5 x + 2x + x ? x 2 1 2 4 5 : 2x1 + 3x2 + x3 ? 3x6 + 2x7 + x8 3 where the variables x1 and x2 are xed to 1 and x3 is xed to 0. Then, for the rst inequality fix = ?3, min = 0 and max = 3, so it is redundant due to max 1 ? fix. For the second inequality, fix = 3, min = ?1 and max = 1; due to min = 2 ? fix we can determine that x4 = 0 and x5 = 1. For the third inequality, fix = 5, min = ?3 and max = 3; it is possible to determine that x6 = 1 (due to min ? (?3) > 3 ? fix) and x7 = 0 (due to min + 2 > 3 ? fix). After xing this values the inequality becomes redundant. Suppose now that we added another constraint, x1 + x2 ? 2x3 + x9 1, for which fix = 2, min = 0 and max = 1. Then the system becomes infeasible due to min > 1 ? fix.

Additional dependencies between events The system of inequalities may contain additional information about causal dependencies which must hold in order to reach a

LP Deadlock Checking Using Partial Order Dependencies

17

deadlock. For example, the eect of xk ? xj 0 is the same as if ej was a predecessor of ek . Such an information can be incorporated directly into the unfolding before the algorithm starts. Let us formulate simple tests for determining such dependencies. We observe that (12) implies xk xj if and only if xk 6 xj implies that the inequality cannot be satis ed irrespective of the values of other variables. This happens if after xing xk = 1 and xj = 0, (12) becomes infeasible, i.e. min > di ? fix. Thus we can nd all such pairs (k; j) and then consider the following cases: { ej ek . Then we ignore this pair (the dependency is already represented in the unfolding). { ek ej . Then xk = xj for any execution sequence leading to a deadlock, so we can reduce the number of variables after the substitution xk = xj . { ek#ej . Then ek cannot occur in any execution sequence leading to a deadlock, and we may freeze xk and any other xi such that ek ei . { ek co ej . Then we may modify the unfolding by adding a new condition b such that b = fej g and b = fek g. Another kind of dependencies resembles that of the con ict relation. For example, the eect of xk + xj 1 is similar to a con ict between ek and ej in the unfolding. The inequality (12) implies xk + xj 1 if after xing xk = 1 and xj = 1, it becomes infeasible, i.e. min > di ? fix. Thus we can nd all such pairs (k; j) and then consider the following cases: { ek ej . Then ej cannot occur in any execution sequence leading to a deadlock, so we may freeze xj and any other component xi such that ej ei . { ej ek . Then, as before, we may freeze xk and any other component xi such that ek ei . { ek #ej . Then we ignore this pair (the dependency is already represented in the unfolding). { ek co ej . Then we may modify the unfolding by adding a new initial condition b such that b = fek ; ej g. The dependencies discussed above may also emerge during a run of the algorithm due to some variables becoming xed. In such a case, we do not modify the unfolding nor the system of constraints (otherwise we would have to undo changes during backtracking), but just x the values of some variables.

Shortest trail Finding a shortest path leading to a deadlock may facilitate debugging. In such a case, we need to solve an optimisation problem with the same system of constraints as before, and L(x) = x(e1 ) + + x(eq ) as a function to be minimised.

The algorithm can easily be adopted for this task. The only adjustment is not to stop after the rst solution has been found, but to keep the current optimal solution together with the value of the function L. As this function is non-decreasing, we may prune the branch of the search tree as soon as value of L becomes greater than, or equal to, the current optimal value. This also saves us from keeping all non-decomposable solutions found by the algorithm and used to prune the search tree as soon as the current node becomes greater than, or equal to, some non-decomposable solution (note that if x0 x00 then L(x0) L(x00)). It therefore makes sense to organise the search process in such a way that the rst solutions found give the value of L close to the optimal one (in order to reduce the search space). This can be done by choosing in each step of the algorithm the `most promising' branches. Since the ordering on the successors of a node in the ordered

18

V.Khomenko and M.Koutny

version of CDA (see section 4) is arbitrary, we may exploit the information about the value of L and check successors with smaller values rst. Such an algorithm can be seen as a version of the `branch and bound' method which only considers vectors compatible with unfolding and uses frozen components and branching condition to reduce the search space.

6 Experimental results For the experiments, we used the PEP tool [3] to generate nite complete pre xes for our partial order algorithm, and for deadlock checking based on McMillan's method ([14,15]) and the MIP algorithm4 ([15]). The results in table 1 have been measured on a PC with PentiumTM III/500MHz processor and 128M RAM (building unfoldings for RW(5), SEM(11), ELEVATOR(3), and STACK(10) were aborted after 20 hours). Problem

Deadlock- Original net Unfolding free jS j jT j jBj jE j p buf100 200 101 10101 5051 p mutual 49 41 887 479 p ab gesc 58 56 3326 1200 p sdl arg 163 96 644 199 sdl arg ddlk 157 92 657 223 p RW(2) 54 60 498 147 p RW(3) 72 120 4668 1281 p RW(4) 94 224 51040 13513 p SEM(2) 27 25 61 32 p SEM(3) 38 36 165 86 p SEM(4) 49 47 417 216 p SEM(5) 60 58 1013 522 p SEM(6) 71 69 2393 1228 p SEM(7) 82 80 5533 2830 p SEM(8) 93 91 12577 6416 p SEM(9) 104 102 28197 14354 p SEM(10) 115 113 62505 31764 p ELEVATOR(1) 59 72 518 287 p ELEVATOR(2) 83 130 29413 15366 STACK(3) 20 24 320 174 STACK(4) 24 30 968 525 STACK(5) 28 36 2912 1578 STACK(6) 32 42 8744 4737 STACK(7) 36 48 26240 14214 STACK(8) 40 54 78728 42645 STACK(9) 44 60 236192 127938 buf100 | mutual | | ab gesc sdl arg | sdl arg ddlk | RW(n) | SEM(n) | ELEVATOR(n) | STACK(n) | time | mem |

jEcut j

1 79 511 10 7 53 637 7841 5 17 49 129 321 769 1793 4097 9217 9 1796 26 80 242 728 2186 6560 19682

McM

Time [s] MIP

PO

0.01 24577 0.02 4.42 70 0.02 33.93 260 0.17 0.04 20 > b ? ? e 2 e 2 b b 2 h ( s ) b 2 h ( p ) > < ! X X X X > M (b) x(e) ? x(e) 1 ? > > b2h? (s0) e2b e2b b2h? (p0 ) > : x 2 f0; 1gq and x(e) = 0 for all e 2 E in

1

1

in

1

1

cut

and the algorithm proposed in this paper can be used for this purpose. Another interesting problem is marking reachability (and coverability). Let M 0 be a marking of . Then the system of constraints has the form 8M = M + C x > in > X > > < M(b) = ()M 0(s) for all s 2 S ?1

b2h (s) > > x(e) = 0 for all e 2 E > > : M 2 Np and x 2 Nq

cut

and, after eliminating the variables M and constraints M 0,6 we need to check the feasibility of the following system:

8 0 1 > X X X > < @ x(f) ? x(f)A = ()M 0(s) ? X M (b) for all s 2 S f 2b b2h? (s) f 2 b b2h? (s) > > : x 2 f0; 1gq and x(e) = 0 for all e 2 E in

1

1

cut

The algorithm can be easily adopted to solve this system, and if it has a compatible solution then the marking M 0 is reachable (coverable), and any compatible solution gives an execution sequence leading to it (respectively, to a marking which covers it), otherwise M 0 is unreachable (respectively, uncoverable).

9 Conclusions Experiments indicate that the algorithm we proposed in this paper can solve problems with thousands of variables. This overcomes the existing limitations, as MIP -problems 6 For the coverability analysis, the inequalities for which M (s) = 0 may also be excluded; they hold for any compatible vector x. 0

LP Deadlock Checking Using Partial Order Dependencies

21

with even a few hundreds of integer variables are usually a hard task for general purpose solvers. It is worth emphasising that the limitation was not the size of computer memory, but rather the time to solve an NP-complete problem. With our approach, the main limitation becomes the size of memory to store the unfolding. Our future research will aim at developing eective parallel algorithms (especially, for a distributed memory architecture) for constructing large unfoldings which cannot t in the memory of a single processing node, and modifying our algorithm to handle such `distributed' unfoldings. Finally, our experiments have indicated that the process of model checking a nite complete pre x is often faster than the process of constructing such a pre x. We therefore plan to investigate novel algorithms for fast generation of net unfoldings. Finally, it is worth pointing out that the approach presented in this paper is also valid for Petri Nets with weighted arcs since all the arcs in their unfoldings are unitary.

Acknowledgements

We would like to thank Paul Watson for his support and comments on an earlier version of this paper, and Christian Stehno for his help with using the PEP tool. The rst author was supported by an ORS Awards Scheme grant ORS/C20/4 and by an EPSRC grant GR/M99293.

References 1. F. Ajili and E. Contejean: Complete Solving of Linear Diophantine Equations and Inequations Without Adding Variables. Proc. of 1st International Conference on principles and practice of Constraint Programming, Cassis (1995) 1{17. 2. F. Ajili and E. Contejean: Avoiding Slack Variables in the Solving of Linear Diophantine Equations and Inequations. Theoretical Computer Science 173 (1997) 183{208. 3. E. Best and B. Grahlmann: PEP | more than a Petri Net Tool. Proc. of TACAS'96: Tools and Algorithms for the Construction and Analysis of Systems, Margaria T., Steen B. (Eds.). Springer-Verlag, Lecture Notes in Computer Science 1055 (1996) 397{401. 4. E. M. Clarke, E. A. Emerson and A.P. Sistla: Automatic Veri cation of Finite-state Concurrent Systems Using Temporal Logic Speci cations. ACM TOPLAS 8 (1986) 244{263. 5. E. Contejean: Solving Linear Diophantine Constraints Incrementally. Proc. of 10th International conference on Logic Programming, D. S. Warren (Ed.). MIT Press (1993) 532{549. 6. E. Contejean and H. Devie: Solving Systems of Linear Diophantine Equations. Proc. of 3rd Workshop on Uni cation, University of Keiserlautern (1989). 7. E. Contejean and H. Devie: An Ecient Incremental Algorithm for Solving Systems of Linear Diophantine Equations. Information and Computation 113 (1994) 143{172. 8. J. C.Corbett: Evaluating Deadlock Detection Methods. University of Hawaii at Manoa (1994). 9. CPLEX Corporation: CPLEX 3.0. Manual (1995). 10. J. Engelfriet: Branching processes of Petri Nets. Acta Informatica 28 (1991) 575{591. 11. J. Esparza, S. Romer and W. Vogler: An Improvement of McMillan's Unfolding Algorithm. Proc. of TACAS'96: Tools and Algorithms for the Construction and Analysis of Systems, Margaria T., Steen B. (Eds.). Springer-Verlag, Lecture Notes in Computer Science 1055 (1996) 87{106. 12. J. Esparza: Model Checking Based on Branching Processes. Science of Computer Programming 23 (1994) 151{195. 13. S. Krivoy: About Some Methods of Solving and Feasibility Criteria of Linear Diophantine Equations over Natural Numbers Domain (in Russian). Cybernetics and System Analysis 4 (1999) 12{36. 14. K. L. McMillan: Using Unfoldings to Avoid State Explosion Problem in the Veri cation of Asynchronous Circuits. Proc. of 4th Workshop on Computer Aided Veri cation, SpringerVerlag, Lecture Notes in Computer Science 663 (1992) 164{174.

22

V.Khomenko and M.Koutny

15. S. Melzer and S. Romer: Deadlock Checking Using Net Unfoldings. Proc. of Computer Aided Veri cation | CAV'97, O. Grumberg (Ed.). Springer-Verlag, Lecture Notes in Computer Science 1254 (1997) 352{363. 16. A. Pnueli: Applications of Temporal Logic to the Speci cation and Veri cation of Reactive Systems: A Survey of Current Trends. Proc. of Advanced School on Current Trends in Concurrency, Springer-Verlag, Lecture Notes in Computer Science 224 (1994) . 17. R. K. Ranjan, J. V. Sanghavi, R. K. Brayton, and A. Sangiovanni-Vincentelli: Binary Decision Diagrams on Network of Workstation. Proc. of International Conference on Computer Design (ICCD'96), Austin, Texas (1996) 358{364.