ACI1 constraints

Agostino Dovier

Università di Verona, Italy

Carla Piazza

Università di Udine, Italy

Enrico Pontelli

New Mexico State University, USA

Gianfranco Rossi

Università di Parma, Italy

Abstract

Disunification is the problem of deciding satisfiability of a system of equations and disequations w.r.t. a given equational theory. In this paper we study the disunification problem in the context of ACI1 equational theories. We provide a characterization of the interpretation structures suitable to model the axioms in ACI1 theories. The satisfiability problem is solved using known techniques for the equality constraints and novel methodologies to transform disequation constraints into solved forms. We propose three solved forms, offering an increasingly precise characterization of the set of solutions. Two of them can be computed and tested in polynomial time. The novel results achieved open new possibilities in the practical and efficient manipulation of ACI1 constraints.

1 Introduction

Equational theories are first-order theories whose axioms are universally quantified equations between first-order terms [21]. A (non-empty) equational theory $E$ forces certain classes of syntactically different terms to be interpreted as the same object in any model of $E$. For example, if $E$ contains the axiom $X + Y = Y + X$, then the terms $a + b$ and $b + a$ will be interpreted in the same way in any model of $E$. However, an equational theory is generally not strong enough to state when two terms must be distinguished. As a matter of fact, a 1-element structure $\mathbf{1}$ is a model of any equational theory, and in $\mathbf{1}$ any constraint of the form $s \neq t$ is unsatisfiable. If a "wider" structure is chosen, then the satisfiability problem for a set of positive (equations) and negative (disequations) constraints, a.k.a. the disunification problem, becomes meaningful and complex.

In this paper we tackle this problem in the context of equational theories describing the associative (A), commutative (C), and idempotent (I) nature of a function symbol. Constraints in the context of ACI theories (or similar theories for set-like structures) have been shown to be very important from the theoretical as well as the practical point of view [18, 14, 13, 7]. The ultimate goal of our effort is to develop a framework for handling ACI constraints which can be used in Constraint Logic Programming (CLP).

The problem of handling positive and negative constraints under equational theories has been explored in the literature. In [8] a general solution to the disunification problem is presented; unfortunately, this solution is valid only for compact equational theories, and ACI (as discussed later) does not meet this requirement. The general problem of solving disequations w.r.t. a given equational theory is also addressed in [4]. There, the technique employed is to transform disequations into unification problems whose solution sets (which, for finitary theories, can be finitely represented) are exactly the complement of the solution set of the starting problem. The answer to the satisfiability problem is represented as a pair $\langle \sigma, \Psi \rangle$, called a substitution with exceptions: $\theta$ is a solution of the initial constraints if and only if $\theta$ is an instance of the substitution $\sigma$ and $\theta$ is not an instance of any of the substitutions in the set $\Psi$. This test, as well as the test to verify whether $\langle \sigma, \Psi \rangle$ is non-empty, can be non-trivial. Moreover, substitutions with exceptions correspond to solved form constraints containing universally quantified variables, which makes them unsuitable for use in a CLP system. Recently, Baader and Schulz [2] developed a general technique capable of combining satisfiability algorithms for disunification in disjoint equational theories. The approach is very general and can be adapted to work for ACI theories on general signatures. However, the method leads to an exponential explosion of alternatives, and there seems to be no practical way to obtain "partial" efficient solutions from such a scheme.

In this paper we present constraint solving techniques to handle equation and disequation constraints under ACI1 (footnote 1) in CLP languages. The presentation starts with a characterization of the structures which are suitable to model the axioms in ACI1: the join-semilattices with bottom. This is used to explore the issue of satisfiability of positive (Sect. 3) and negative (Sect. 4) constraints with respect to the different possible signatures of the language. This analysis captures the relationship between the satisfiability of negative constraints and the "shape" of the interpretation structure.

In the context of a CLP system, uninterpreted function symbols are typically manipulated as finite trees [19]. We develop a first-order theory which extends (general) ACI1 and corresponds to $T(\Sigma)/{=_{ACI1}}$ (i.e., the Herbrand Universe modulo the congruence relation imposed by ACI1) on the class of conjunctions of positive and negative constraints. This allows us to focus on the canonical domain of Herbrand terms. In this context we present three solved forms for disunification problems, as well as the algorithms which allow arbitrary ACI1 constraints to be transformed into any of these three forms (Sect. 5). These solved forms meet the general requirements for solved form constraints: e.g., deciding their satisfiability is trivial and efficient. Furthermore, all these solved forms are adequate to be used efficiently in the context of a CLP system, a property which was missing from some of the solved forms proposed by other researchers [2, 4]. Two of these solved forms (called implicit and intermediate) can be obtained in polynomial time from any conjunction of disequations. Finally, we show how the results have a direct application to solving set-based constraints, taking advantage of the polynomial nature of the implicit solved form proposed.

The results achieved open new possibilities in the practical manipulation of ACI1 constraints, thus overcoming limitations and inefficiencies present in the existing CLP languages over set structures [10]. Sections 6 and 7 summarize our results and relate them to analogous problems in the literature (footnote 2).

Footnote 1: ACI1 is ACI with an additional axiom requiring the existence of an identity element for the ACI operator.

2 Preliminaries

Throughout the paper we assume the standard notions and notation used in first-order logic, unification theory, and constraint logic programming (e.g., [12, 21, 16]). In particular, let $\Sigma$ be a first-order signature with arity function $ar$, $\Pi$ be a collection of predicate symbols, and $\mathcal{V}$ a denumerable collection of variables. $T(\Sigma, \mathcal{V})$ ($T(\Sigma)$) denotes the set of first-order terms (resp. ground terms) built from $\Sigma$ and $\mathcal{V}$ (resp. $\Sigma$). Moreover, we will call admissible constraints ($Adm$) a given set of first-order formulae over $\langle \Sigma, \Pi, \mathcal{V} \rangle$. Given a first-order theory $T$ over $\langle \Sigma, \Pi, \mathcal{V} \rangle$ and a model $\mathcal{A}$ of $T$, $T$ and $\mathcal{A}$ correspond on $Adm$ [16] if, for each $c \in Adm$, we have that $T \models \vec{\exists}(c)$ iff $\mathcal{A} \models \vec{\exists}(c)$. A valuation of a constraint $c$ is an assignment of values from $A$ to the free variables of $c$, where $A$ is the domain of $\mathcal{A}$.

If $s, t \in T(\Sigma, \mathcal{V})$, then $s = t$ is a $\Sigma$-equation and $s \neq t$ is a $\Sigma$-disequation. An equational theory is a first-order theory whose axioms are universally quantified $\Sigma$-equations. Given an equational theory $E$, we can define the concept of $E$-equality ($=_E$) as the least congruence relation over $T(\Sigma, \mathcal{V})$ which contains $E$ and which is closed under substitution [3]. The relation $=_E$ induces a partition of $T(\Sigma)$ into congruence classes. The set of these classes will be denoted by $T(\Sigma)/{=_E}$. $T(\Sigma)/{=_E}$, together with a mapping (the interpretation function) which assigns the equivalence class $[t]$ to each term $t$, is a model of the theory $E$.

Given a conjunction $C \equiv (s_1 = t_1 \wedge \dots \wedge s_n = t_n)$ of $\Sigma$-equations, the (decision) $E$-unification problem is the problem of deciding whether $E \models \vec{\exists} C$. If $E$ is an equational theory, Birkhoff's completeness theorem [21] ensures that $E \models \vec{\exists} C$ if and only if $T(\Sigma)/{=_E} \models \vec{\exists} C$. From a constraint point of view, $\Sigma$-equations can be chosen as admissible constraints. The theory $E$ and the model identified by $T(\Sigma)/{=_E}$ correspond on the class containing all possible conjunctions of equations [16].

Given a conjunction $C \equiv (s_1 \neq t_1 \wedge \dots \wedge s_n \neq t_n)$ of $\Sigma$-disequations (i.e., a disequation constraint), we call (decision) $E$-disequation problem the problem of establishing whether $E \models \vec{\exists} C$. For an equational theory $E$, this test always has a negative answer, since, for instance, the structure $\mathbf{1} = \langle \{\bot\}, (\cdot)^{\mathbf{1}} \rangle$, with $(\cdot)^{\mathbf{1}}$ the interpretation of all terms in the unique element $\bot$, is a model of any equational theory, and in $\mathbf{1}$ any constraint of the form $s \neq t$ is unsatisfiable. This problem originates from the fact that any (non-empty) equational theory $E$ forces certain distinct terms to be interpreted in the same way in any model of $E$; however, it is not strong enough to state when two terms must be distinguished in each model of $E$. As a consequence, the disequation problem is typically stated as the problem of verifying satisfiability of $\vec{\exists} C$ w.r.t. a given interpretation structure $\mathcal{A}$ (usually, $T(\Sigma)/{=_E}$), i.e., $\mathcal{A} \models \vec{\exists} C$. A related problem is that of finding the structures $\mathcal{A}$ fulfilling such a property.

Footnote 2: Complete proofs are available at www.cs.nmsu.edu/lldap/prj lp/papers/dppr99.html

Example 2.1 Let $E$ consist of the unique axiom $X = Y$ and let $Adm$ contain all $\Sigma$-equations and $\Sigma$-disequations. Then $E$ corresponds with $\mathbf{1}$; in particular, $E$ is the complete theory of $\mathbf{1}$.

If $E$ corresponds with $\mathbf{1}$, then $E$ is said to be trivial. In particular:

Proposition 2.2 Given a non-trivial equational theory $E$ such that $Adm$ contains all $\Sigma$-equations and $\Sigma$-disequations, there is no structure corresponding to $E$.

A theory $E$ is satisfaction complete [16] if for each admissible constraint $c$ either $E \models \vec{\exists} c$ or $E \models \neg\vec{\exists} c$. In terms of satisfaction completeness of non-trivial theories, Proposition 2.2 leads to the following result:

Corollary 2.3 Given a non-trivial equational theory $E$ such that $Adm$ contains all $\Sigma$-equations and $\Sigma$-disequations, $E$ is not satisfaction complete.

As mentioned in Sect. 1, our ultimate goal is to handle constraints composed of arbitrary conjunctions $C$ of $\Sigma$-equations and $\Sigma$-disequations. This class of problems has been typically referred to as $E$-disunification problems [8, 4]. An $E$-solution $\theta$ of $C$ in a structure $\mathcal{A}$, denoted $\mathcal{A} \models_\theta C$, is a valuation $\theta : \mathcal{V} \to A$, extended as usual to terms, such that $\theta(s) =^{\mathcal{A}} \theta(t)$ for all $s = t$ in $C$ and $\theta(s) \neq^{\mathcal{A}} \theta(t)$ for all $s \neq t$ in $C$. The technique for handling equations and disequations presented in this paper provides us with a methodology to tackle disunification problems. In fact, if $E$ is a finitary theory (i.e., every unification problem admits a finite complete collection of most general unifiers), then a disunification problem $C$ can be simply solved by computing a complete set of unifiers for the equations in $C$ and then verifying whether, given any of these unifiers $\sigma$, there is a solution for the disequations of $C\sigma$ [2].

Let $\Sigma \supseteq \{\emptyset, \cup\}$ be a signature containing the binary function symbol $\cup$ and the constant symbol $\emptyset$. Let us also define a one-to-one function $\# : \Sigma \cup \mathcal{V} \to \mathbb{N}$, which will be used to obtain an order over $T(\Sigma, \mathcal{V})$. Observe that $\emptyset$ and $\cup$ can be replaced by any other pair of function symbols fulfilling the same axioms. A signature $\Sigma$ is general if it contains at least one function symbol of arity $> 0$ and different from $\cup$.

Let us also recall some standard definitions from lattice theory [15]. A relation $\leq\ \subseteq L \times L$ is a partial order on $L$ if $\leq$ is reflexive, antisymmetric, and transitive. Let us denote with $\bot$ the bottom of the partial order, when it exists, i.e., $(\forall x \in L)(\bot \leq x)$. $\langle L, \leq \rangle$ is a join-semilattice if the element $x \vee y \in L$ exists for each $x, y \in L$, where $x \vee y$ (read "$x$ join $y$") is the unique element satisfying $x \leq x \vee y$, $y \leq x \vee y$, and for all $z \in L$ such that $x \leq z$ and $y \leq z$ it holds that $x \vee y \leq z$. If $\langle L, \leq \rangle$ is a partial order with bottom $\bot$, then $a \in L$ is an atom if $(\forall x \in L)((x \leq a \wedge x \neq a) \rightarrow x = \bot)$.

The following equations describe the theory ACI1:

(A) $(X \cup Y) \cup Z = X \cup (Y \cup Z)$
(C) $X \cup Y = Y \cup X$
(I) $X \cup X = X$
(1) $\emptyset \cup X = X$

Let us analyze the structures $\mathcal{A} = \langle A, (\cdot)^{\mathcal{A}} \rangle$ for $\Sigma = \{\emptyset, \cup\}$ that are models of ACI1. By definition of structure, the domain $A$ is not empty. We indicate $(\emptyset)^{\mathcal{A}} \in A$ as $\bot$. A relation $\leq$ is induced by $(\cup)^{\mathcal{A}}$ (simply denoted by $\cup^{\mathcal{A}}$) on $A$: $x \leq y \leftrightarrow x \cup^{\mathcal{A}} y = y$.
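The four axioms and the induced relation can be checked mechanically on a small finite structure. The following Python sketch is only an illustration and not part of the paper: the names `powerset`, `is_aci1_model` and `leq` are hypothetical, and the structure tested is the powerset of a two-element set with set union as the interpretation of the union symbol.

```python
from itertools import combinations, product

def powerset(items):
    """All subsets of `items`, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_aci1_model(domain, join, bottom):
    """Check axioms (A), (C), (I), (1) by brute force on a finite domain."""
    for x, y, z in product(domain, repeat=3):
        if join(join(x, y), z) != join(x, join(y, z)):    # (A)
            return False
    for x, y in product(domain, repeat=2):
        if join(x, y) != join(y, x):                      # (C)
            return False
    return all(join(x, x) == x and join(bottom, x) == x  # (I) and (1)
               for x in domain)

def leq(join, x, y):
    """Induced relation: x <= y  iff  (x join y) == y."""
    return join(x, y) == y

domain = powerset({"a", "b"})
union = lambda x, y: x | y
bottom = frozenset()
```

With these definitions, `is_aci1_model(domain, union, bottom)` holds, while replacing union by intersection breaks axiom (1), since the empty set is not an identity for intersection.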

Proposition 2.4 Let $\mathcal{A} = \langle A, (\cdot)^{\mathcal{A}} \rangle$ be a model of ACI. Then $\leq$ is a partial order on $A$. If $\mathcal{A}$ is a model of ACI1, then $\bot = (\emptyset)^{\mathcal{A}}$ is the bottom of $A$.

It is easy to observe that if $\mathcal{A}$ is not a model of ACI, then $\leq$ is not guaranteed to be a partial order. Given a structure $\mathcal{A}$ for $\{\emptyset, \cup\}$ that is a model of ACI1, we denote with $\langle A, \leq \rangle$ the partial order defined above [15].

Proposition 2.5 Let $\mathcal{A} = \langle A, (\cdot)^{\mathcal{A}} \rangle$ be a structure for $\{\emptyset, \cup\}$ with $\bot = (\emptyset)^{\mathcal{A}} \in A$. $\mathcal{A}$ is a model of ACI1 iff $\langle A, \leq \rangle$ is a join-semilattice with bottom $\bot$.

The ACI1 axioms allow us to design a normalization function for $T(\Sigma, \mathcal{V})$-terms, $\tau : T(\Sigma, \mathcal{V}) \to T(\Sigma, \mathcal{V})$. Intuitively, the effect of $\tau$ is to remove repeated elements and occurrences of $\emptyset$ from unions and to reorder the elements of a union according to the ordering induced by $\#$. For example, given a term $t \in T(\{\emptyset, \cup\}, \mathcal{V})$, $\tau(t)$ is always of the form $\emptyset$ or $X_1 \cup \dots \cup X_m$ where $\#(X_1) < \dots < \#(X_m)$. Observe that the result of $\tau(t)$ is not properly a term, but the associativity of $\cup$ allows us to use these entities as terms.

Theorem 2.6 Let $S = \Sigma \setminus \{\emptyset, \cup\}$. If $t \in T(\Sigma, \mathcal{V})$, $vars(t) = \bar{X}$, $|t| = n$, then $ACI1 \models \forall \bar{X}(\tau(t) = t)$ and (1) if $S$ is a set of constant symbols, then $\tau(t)$ can be performed in time $O(n \log n)$; (2) if $S$ contains a function symbol of arity greater than 0, then $\tau(t)$ can be performed in time $O(n^2)$.

Thus, $\tau(t)$ can be chosen as the canonical representative of the ACI1-congruence class $[t]$ in $T(\Sigma)$ or $T(\Sigma, \mathcal{V})$ to which $t$ belongs. Given a conjunction $C \equiv (s_1 \odot_1 t_1 \wedge \dots \wedge s_h \odot_h t_h)$ where $\odot_i \in \{=, \neq\}$, its canonical form is the formula $\tau(C) \equiv (\tau(s_1) \odot_1 \tau(t_1) \wedge \dots \wedge \tau(s_h) \odot_h \tau(t_h))$. The worst-case time complexity for the computation of the canonical form of $C$ is $O(n^2)$, where $n = |C|$. In particular, $\tau$ is an idempotent operation, thus $\tau(t) \equiv \tau(\tau(t))$ for all terms $t$.

Corollary 2.7 If $C \equiv (s_1 \odot_1 t_1 \wedge \dots \wedge s_h \odot_h t_h)$, where $\odot_i \in \{=, \neq\}$, then $ACI1 \models \vec{\forall}(C \leftrightarrow \tau(C))$.
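To make the normalization step concrete, here is a minimal Python sketch of a function in the spirit of the one described above. It is an illustration under several assumptions that are not in the paper: terms are encoded as strings (variables and constants) or tuples `(symbol, arg1, ...)`, the names `EMPTY`, `UNION` and `normalize` are hypothetical, and the ordering induced by # is replaced by sorting on Python's `repr`, which is just one arbitrary total order. No attempt is made to match the stated complexity bounds.

```python
EMPTY = "empty"   # stands for the constant symbol of the empty set
UNION = "union"   # stands for the union symbol

def normalize(t):
    """Return a canonical form of t modulo ACI1 (a sketch, not the paper's code)."""
    if isinstance(t, str):                 # variable or constant
        return t
    head, *args = t
    args = [normalize(a) for a in args]    # normalize subterms first
    if head != UNION:
        return (head, *args)
    elems = set()
    for a in args:
        if a == EMPTY:                     # axiom (1): drop the identity
            continue
        if isinstance(a, tuple) and a[0] == UNION:
            elems.update(a[1:])            # axiom (A): flatten nested unions
        else:
            elems.add(a)                   # axiom (I): a set keeps one copy
    ordered = sorted(elems, key=repr)      # axiom (C): fix one order
    if not ordered:
        return EMPTY
    if len(ordered) == 1:
        return ordered[0]
    return (UNION, *ordered)
```

Two terms are then equal modulo ACI1 exactly when their normal forms coincide under this encoding, so `normalize(s) == normalize(t)` can serve as the syntactic comparison used in the rest of the paper.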

3 ACI1 Equation Constraints - Unification

Given two terms $s, t$, we are interested in the decision problem $ACI1 \models \vec{\exists}(s = t)$ and in computing a complete set of ACI1 unifiers. An overview of the general decision and unification problems in the presence of ACI1 operators can be found in [11]. Thanks to Corollary 2.7, we can concentrate on the problem $\tau(s) = \tau(t)$. The problem falls into one of three classes, according to the form of the signature $\Sigma$:

Elementary Unification: $\Sigma = \{\emptyset, \cup\}$. $\tau(s) = \tau(t)$ is of the form (footnote 3):

$X_1 \cup \dots \cup X_m = Y_1 \cup \dots \cup Y_n$

The decision problem in this case always admits an affirmative answer, witnessed by the valuation $[V/\emptyset : V \in vars(s, t)]$. A unique most general solution always exists and it can be easily computed using the technique of [1].

Unification with Constants: $\Sigma = \{\cup, \emptyset, c_1, \dots, c_n\}$, where $\{c_1, \dots, c_n\}$ is a finite collection of constants distinct from $\emptyset$. $\tau(s) = \tau(t)$ is of the form:

$X_1 \cup \dots \cup X_k \cup b_1 \cup \dots \cup b_h = Y_1 \cup \dots \cup Y_q \cup d_1 \cup \dots \cup d_p$

The decision problem $ACI1 \models \vec{\exists}(\tau(s) = \tau(t))$ can be solved in time $O(n)$ where $n = |\tau(s)| + |\tau(t)|$. In [1] it is shown how to compute the (minimal) complete set of most general unifiers for this problem. Uniqueness of the most general unifier is lost, due to the presence of constants, but the problem remains finitary, i.e., it is possible to describe the complete set of solutions through a finite number of unifiers.

General Unification: $\Sigma = \{\cup, \emptyset, f_1, f_2, \dots\}$ (a general signature). The general unification problem has the following format:

$X_1 \cup \dots \cup X_h \cup s_1 \cup \dots \cup s_k = Y_1 \cup \dots \cup Y_p \cup t_1 \cup \dots \cup t_q$

where $s_i, t_j$ are terms whose main functor is different from $\cup$. The decision problem is NP-complete [17, 9]. Algorithms to compute complete collections of unifiers for this class of problems have been presented in the literature, either as a combination of simpler unification procedures [3] or as ad-hoc unification algorithms [11].

Footnote 3: When $m = 0$ the l.h.s. is simply $\emptyset$ and, similarly, if $n = 0$ then the r.h.s. is $\emptyset$.
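For the case of unification with constants, a linear-time decision test can be sketched as follows. This is our own reading of the problem, not the algorithm of [1]: interpreting terms as finite sets of constants, an equation is solvable iff every constant occurring on one side only can be absorbed by a variable on the other side. The function name and the convention that variables start with an uppercase letter are assumptions made for the example.

```python
def unifiable_with_constants(left, right):
    """Decide solvability of a normalized equation left = right.

    left, right: iterables of symbol names forming each union.
    Assumed convention: variables start with an uppercase letter,
    constants with a lowercase letter.
    """
    def split(side):
        variables = {x for x in side if x[0].isupper()}
        constants = {x for x in side if not x[0].isupper()}
        return variables, constants

    left_vars, left_consts = split(left)
    right_vars, right_consts = split(right)
    # A constant missing on one side must be absorbed by a variable there.
    return ((bool(left_vars) or right_consts <= left_consts) and
            (bool(right_vars) or left_consts <= right_consts))
```

For instance, `X ∪ a = a ∪ b` is accepted (take X to be b), whereas `a = Y ∪ a ∪ b` is rejected because nothing on the left can produce b.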

4 ACI1 Disequation Constraints - Disunification

In this section we will concentrate on the problem of handling conjunctions of disequations. The whole disunification problem (conjunctions of equations and disequations) can be solved in two stages. First, unification techniques from the previous section are used on the equations. Then, after applying the resulting substitutions, we can safely remove all equations and concentrate on the problems $\tau(s) \neq \tau(t)$. While in the unification case each equation is satisfiable in at least one model of ACI1, the same does not hold in the case of disequations: a negative constraint can be unsatisfiable in all models (e.g., $\emptyset \neq \emptyset$). Other constraints (e.g., $X_1 \neq X_2$) are satisfiable in some structures and unsatisfiable in others. Thus, the study of ACI1 disequations requires an analysis of the possible structures for the given theory. The case of elementary disequation constraints (i.e., when $\Sigma = \{\emptyset, \cup\}$) can be viewed just as a simpler subcase of ACI1 disequation constraints with constants. Thus we prefer to skip it here, due to space limitations.

4.1 ACI1 Disequation Constraints with Constants

Let $\Sigma = \{\emptyset \equiv c_0, \cup, c_1, \dots, c_m\}$. Let us analyze the structures over such $\Sigma$. All models $\mathcal{A}$ for elementary ACI1 (thus, all join-semilattices with bottom, by Prop. 2.5) are also models of ACI1 with constants, provided an interpretation for the constant symbols in $\Sigma$ is given. However, it is natural to focus on structures in which the $m$ constants are interpreted as distinct objects, each of them different from $\bot$. This can be forced by introducing an additional (non-equational) axiom in the theory ACI1:

$(F_2')$ $c_i \neq c_j$, for $i, j \in \{0, \dots, m\}$, $i \neq j$

Such structures are exactly all the join-semilattices with bottom with a domain of at least $m + 1$ objects. $(F_2')$ is actually an instance of the freeness axiom scheme $(F_2)$ of Clark's Equational Theory [6] (first introduced by Mal'cev in [20]). For example, if $m = 4$, all the structures below are models of such an extended theory.

[Figure: Hasse diagrams of three example join-semilattices with bottom $\bot$ over the constants $c_1, \dots, c_4$.]

Among the possible models, we are interested in those fulfilling the axiom

$(D_c)$ $(\forall I, J \subseteq \{1, \dots, m\})\left(I \neq J \rightarrow \bigcup_{i \in I} c_i \neq \bigcup_{j \in J} c_j\right)$

($D$ stands for Domain and $c$ for constants), where, if $A = \{a_1, \dots, a_n\} \subseteq \{1, \dots, m\}$, then $\bigcup_{i \in A} c_i$ represents the term $c_{a_1} \cup \dots \cup c_{a_n}$. For instance, when $m = 2$, $(D_c)$ becomes:

$\emptyset \neq c_1 \wedge \emptyset \neq c_2 \wedge \emptyset \neq c_1 \cup c_2 \wedge c_1 \neq c_2 \wedge c_1 \neq c_1 \cup c_2 \wedge c_2 \neq c_1 \cup c_2$

Among the structures satisfying these requirements, we can find the Boolean lattices isomorphic to $\langle \wp(\{c_1, \dots, c_m\}), \subseteq \rangle$, i.e., those having $\{c_1\}, \dots, \{c_m\}$ as atoms. Assuming $(D_c)$ we can also ignore $(F_2')$, since $(D_c)$ implies $(F_2')$. Let us assume that $\#(\emptyset) = 0$ and $\#(c_i) = i$ for $i = 1, \dots, m$; using $\tau$ we can focus on the terms of the form $\emptyset$ or

$X_1 \cup \dots \cup X_k \cup c_{i_1} \cup \dots \cup c_{i_h}$

where $h + k > 0$ and $i_j < i_{j+1}$ for $j = 1, \dots, h - 1$. The disequation $\tau(s) \neq \tau(t)$ gives rise to the following possible cases:

1. $r \neq r$
2. $c_{i_1} \cup \dots \cup c_{i_h} \neq c_{j_1} \cup \dots \cup c_{j_k}$ and $\{i_1, \dots, i_h\} \neq \{j_1, \dots, j_k\}$
3. $X_1 \cup \dots \cup X_m \cup c_{i_1} \cup \dots \cup c_{i_h} \neq Y_1 \cup \dots \cup Y_n \cup c_{j_1} \cup \dots \cup c_{j_k}$, $m > 0$.

Disequations of the first form are false in any model of ACI1. Disequations of the second kind are true (and therefore can be removed) in any model of ACI1+$(D_c)$. In particular, given any join-semilattice with bottom different from $\mathbf{1}$, it is possible to build a structure which is a model of ACI1 and in which a given disequation of type 2 is satisfied. Similarly, satisfiability of disequations of the third kind depends on the domain. In particular, a disequation of type 3 is:

• unsatisfiable in $\mathbf{1}$;
• satisfiable in any model of ACI1+$(D_c)$ if there is a constant which occurs on one side and not on the other, or if there is a constant in $\Sigma$ which does not occur in the disequation.

The following theorem holds independently of the presence of $(D_c)$:

Theorem 4.1 If $C$ is a disequation constraint in canonical form and it contains $r$ disequations (of type 2 or 3), then $C$ is satisfiable in any structure which contains a substructure isomorphic to $\langle \wp(\{a_1, \dots, a_r\}), \subseteq \rangle$.

Corollary 4.2 If the structure is wide enough, then the decision problem for a disequation constraint $C$ can be solved in $O(n \log n)$, where $n = |C|$. For a given fixed structure, the problem is NP-complete.
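The dependence of type 3 disequations on the width of the structure can be observed with an exhaustive search over a powerset lattice. The following Python sketch is illustrative only: `subsets`, `evaluate` and `satisfiable` are hypothetical names, each side of a disequation is given as a tuple of symbol names, variables are assumed to start with an uppercase letter, and every constant occurring in the constraint is assumed to be listed in `atoms`. The search is exponential in the number of variables, in line with the hardness stated for a fixed structure.

```python
from itertools import combinations, product

def subsets(atoms):
    """All subsets of `atoms`: the domain of the powerset lattice."""
    atoms = sorted(atoms)
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def evaluate(side, valuation):
    """Union of the values of the symbols in `side`.
    A variable is looked up in `valuation`; a constant c denotes {c}."""
    value = frozenset()
    for sym in side:
        value |= valuation[sym] if sym[0].isupper() else frozenset([sym])
    return value

def satisfiable(disequations, atoms):
    """Search for a valuation over subsets of `atoms` making every s != t true."""
    variables = sorted({sym for s, t in disequations
                        for sym in s + t if sym[0].isupper()})
    for values in product(subsets(atoms), repeat=len(variables)):
        valuation = dict(zip(variables, values))
        if all(evaluate(s, valuation) != evaluate(t, valuation)
               for s, t in disequations):
            return True
    return False
```

For example, `X ∪ a ≠ a` has no solution when `a` is the only atom, but becomes satisfiable as soon as a second atom `b` is available, matching the remark about a constant of the signature that does not occur in the disequation.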

4.2 General ACI1 Disequation Constraints

Let us assume that the signature $\Sigma$ can contain any constant and function symbols. From Sect. 2 we know that when $\leq$ is induced by the interpretation of $\cup$, the models of ACI1 are the join-semilattices with bottom. However, in the presence of a domain $A$ and an interpretation $\cup^{\mathcal{A}}$, the interpretation of the functions in $\Sigma$ introduces a variety of possibilities for building models. The most common interpretation of the constant and function symbols different from $\cup$ is the one induced by the structure $T(\Sigma)/{=_{ACI1}}$, denoted by $\mathcal{H}$. In this section we prove some results about this model. We also define a theory $T$ such that $\mathcal{H}$ and $T$ correspond on the class of formulae we are interested in, namely conjunctions of equations and disequations.

Example 4.3 The figure below shows a representation of $\mathcal{H} = T(\Sigma)/{=_{ACI1}}$ when $\Sigma = \{\emptyset, \cup, \{\cdot\}\}$, where $\{\cdot\}$ is a function symbol of arity 1. With a slight abuse of notation, we denote with $\{s, t\}$ the congruence class of $\{s\} \cup \{t\}$. This allows one to interpret $\mathcal{H}$ as the set of hereditarily finite and well-founded sets. Sets at level $i$ contain exactly $i$ elements.

[Figure: Hasse diagram of $\mathcal{H}$ arranged by levels, with $\emptyset$ at level 0, singletons such as $\{\emptyset\}$, $\{\{\emptyset\}\}$ and $\{\{\emptyset, \{\emptyset\}\}\}$ at level 1, and two-element sets such as $\{\emptyset, \{\emptyset\}\}$ and $\{\{\emptyset\}, \{\emptyset, \{\emptyset\}\}\}$ at level 2.]

Let us start by observing that the atoms of $\langle \mathcal{H}, \leq \rangle$ are all and only the congruence classes containing terms with a main functor different from $\cup$. This is stated by the following lemma:

Lemma 4.4 The atoms of $\langle \mathcal{H}, \leq \rangle$ are exactly the classes $[t]$, for some ground term $t \equiv f(t_1, \dots, t_n)$, $f \not\equiv \cup$.

Thus, with slight abuse of notation, from now on we will call atom any term whose main (outermost) function symbol is different from $\cup$. Observe that this property is not guaranteed to hold in other structures. For example, in any structure $\mathcal{A}$ in which $\mathcal{A} \models f(t_1, \dots, t_n) \cup a = f(t_1, \dots, t_n)$, for some $a \in \Sigma$, the term $f(t_1, \dots, t_n)$ is not an atom. $\mathcal{H}$ also properly models the extensionality principle for equality between sets denoted by ACI terms:

Lemma 4.5 Let $s, t$ be two terms and $\mathcal{H} \models \vec{\exists}(s \neq t)$. For all solutions $\theta$, if $s_1 \cup \dots \cup s_m \in \theta(s)$ and $t_1 \cup \dots \cup t_n \in \theta(t)$, where all the $s_i$, $t_j$ are atoms, then there are an atom $a$ and an index $i$ such that $s_i = a$ and $t_j \neq a$ for all $t_j$ or, vice versa, $t_i = a$ and $s_j \neq a$ for all $s_j$.

The structure $\mathcal{H}$ is also a model of the freeness equational axioms:

$(F_1)$ $f(X_1, \dots, X_n) = f(Y_1, \dots, Y_n) \rightarrow \bigwedge_{i=1}^{n} X_i = Y_i$, for $f \in \Sigma$, $f \not\equiv \cup$
$(F_2)$ $f(X_1, \dots, X_m) \neq g(Y_1, \dots, Y_n)$, for $f \not\equiv g$, $f, g \not\equiv \cup$
$(F_3)$ $X \neq t_1 \cup \dots \cup f(\cdots X \cdots) \cup \dots \cup t_n$, for $f \not\equiv \cup$

The freeness axioms [6] have been refined to capture the behavior of $\cup$. Similarly to what we did for $(D_c)$, we need to enforce the property that unions of distinct atoms return distinct objects of the domain. Instead of extending $(D_c)$, which would be quite cumbersome in this context, we introduce the following axiom scheme $(D_f)$: for all $f \in \Sigma$, $f \not\equiv \cup$, $ar(f) = n$:

$$f(X_1, \dots, X_n) \cup X = Y_1 \cup Y_2 \rightarrow \exists Z_1 Z_2 \left( \begin{array}{l} (Y_1 = f(X_1, \dots, X_n) \cup Z_1 \wedge X = Z_1 \cup Y_2) \ \vee \\ (Y_2 = f(X_1, \dots, X_n) \cup Z_2 \wedge X = Y_1 \cup Z_2) \ \vee \\ (Y_1 = f(X_1, \dots, X_n) \cup Z_1 \wedge Y_2 = f(X_1, \dots, X_n) \cup Z_2 \wedge X = Z_1 \cup Z_2) \end{array} \right)$$

Note that the $\leftarrow$ direction is a simple consequence of the ACI axioms. This axiom scheme captures the intuitive notion of atoms in the context of $\cup$, and subsumes $(D_c)$. Thus, axiom $(D_c)$ can be safely removed:

Lemma 4.6 ACI1+$(D_f)$+$(F_2)$ $\models (D_c)$ (footnote 4).

Hereafter we will use $T_{ACI}$ to denote ACI1+$(F_1)$+$(F_2)$+$(F_3)$+$(D_f)$.

Theorem 4.7 $\mathcal{H}$ and $T_{ACI}$ correspond on the class of all the conjunctions of equations and disequations.

The following theorem is fundamental for the satisfiability test of all the solved forms presented in this paper.

Theorem 4.8 Let $\Sigma$ be a general signature. Given a disequation constraint $C \equiv (s_1 \neq t_1 \wedge \dots \wedge s_n \neq t_n)$ such that $\tau(s_i) \not\equiv \tau(t_i)$ for all $i = 1, \dots, n$, then $C$ is satisfiable in $\mathcal{H}$, and in every model of $T_{ACI}$.

Corollary 4.9 If $\Sigma$ is a general signature, then the satisfiability of a disequation constraint in $\mathcal{H}$ can be decided in polynomial time.
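The test suggested by Theorem 4.8 and Corollary 4.9 amounts to comparing normal forms. A minimal Python sketch follows, under an encoding that is our own assumption and not the paper's: a term is a `frozenset` of elements, each element being a variable `("var", name)` or an atom `("fn", symbol, args)` whose arguments are again such frozensets. With this encoding associativity, commutativity, idempotence and the identity element are absorbed by the set operations, so syntactic identity of normal forms becomes plain `==`. The names `var`, `fn`, `union` and `satisfiable_in_H` are hypothetical.

```python
EMPTY = frozenset()          # the empty union

def var(name):
    """A variable, seen as a one-element union."""
    return frozenset([("var", name)])

def fn(symbol, *args):
    """An atom f(args); constants are atoms without arguments."""
    return frozenset([("fn", symbol, args)])

def union(*terms):
    """Union of terms; nesting, order and repetitions are irrelevant."""
    return frozenset().union(*terms)

def satisfiable_in_H(disequations):
    """For a general signature: satisfiable iff no disequation has
    two sides with the same normal form."""
    return all(s != t for s, t in disequations)
```

A disequation whose two sides coincide after normalization, such as `X ∪ a ≠ a ∪ X ∪ ∅`, makes the whole conjunction unsatisfiable; otherwise the theorem guarantees a solution in H, provided the signature is general.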

5 Solved forms

Most constraint systems rely on the availability of constraint simplifiers to transform constraints into equivalent "simpler" formulae. In particular, it is common to identify a class of formulae, called solved forms, which are the target of this simplification. As described in [8], solved forms should be:

• solvable: each solved form is either false or it admits at least one solution;
• simple: satisfiability of a solved form should be trivially decidable;
• complete: every constraint is equivalent to a finite (possibly empty) disjunction of solved forms.

We propose three solved forms for the ACI1-constraints considered in this paper. Each form can be computed from the previous ones. The first form (implicit) implicitly represents its set of solutions. Given a constraint $C$, a unique implicit solved form constraint can be computed from it in polynomial (quadratic) time with respect to the number $|C|$ of occurrences of symbols in $C$. The second solved form (intermediate) further simplifies a constraint in implicit solved form. In polynomial (cubic) time it is possible to compute a formula containing a collection of intermediate solved form constraints. Each component of the collection can be computed in polynomial time, but the number of components might be exponential (exponentiality arises due to the application of distributivity). The third solved form (explicit) represents its solutions explicitly. It can be computed from the first one on demand, and determining each disjunct requires exponential time.

All the solved forms are instances of the compact formulae defined in [8]. If the theory ACI1 were compact, the ability to reach a solved form would automatically ensure satisfiability of these constraints over $\mathcal{H}$ [8]. Unfortunately (see Sect. 6), ACI1 is not compact; thus, a direct proof of satisfiability of the different solved forms is required. In the following subsections we precisely characterize the three solved forms. However, we anticipate that all of them fulfill the hypothesis of Theorem 4.8.

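Footnote 4: The converse is not true. Consider $\Sigma = \{\emptyset, \cup, c\}$ and the lattice with $\bot < a_1$, $\bot < a_2$, $a_1 < c^{\mathcal{A}}$, $a_2 < c^{\mathcal{A}}$. It fulfills $(D_c)$ but not $(D_f)$ (consider $c \cup \bot = a_1 \cup a_2$, with $c$ in the role of $f(\dots)$, $\bot$ in the role of $X$, $a_1$ in the role of $Y_1$, and $a_2$ in the role of $Y_2$).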

Corollary 5.1 An implicit/intermediate/explicit solved form constraint $C$ different from false is satisfiable in $\mathcal{H}$, and hence $T_{ACI} \models \vec{\exists} C$.

5.1 Implicit solved form

A variable $X$ occurs nested in a term $t$ if $t$ is of the form $t_1 \cup \dots \cup f(\cdots X \cdots) \cup \dots \cup t_n$, with $f \not\equiv \cup$. A constraint $C$ is in implicit solved form if it is false, true, or $C \equiv (s_1 \neq t_1 \wedge \dots \wedge s_n \neq t_n)$ and for all $i = 1, \dots, n$:

• $vars(s_i) \cup vars(t_i) \neq \emptyset$, and
• $s_i \equiv \tau(s_i)$ and $t_i \equiv \tau(t_i)$, and
• $s_i \not\equiv t_i$, and
• if $t_i$ is a variable, then $s_i$ is also a variable, and
• if $s_i$ is a variable, then it does not occur nested in $t_i$.

Given a constraint $C \equiv (s_1 \neq t_1 \wedge \dots \wedge s_n \neq t_n)$, we can obtain an equivalent constraint in implicit solved form, starting from $\tau(C)$ and applying the function impl_simpl of Fig. 1. Occurrences of true and false can be easily handled in linear time at the end of the rewriting process; we assume that the function tfsimpl performs this task. Clearly, tfsimpl(impl_simpl($\tau(C)$)) is in implicit solved form.

function impl_simpl($\varphi$):
  while there is a disequation $c$ in $\varphi$ not in implicit s.f. do
    case $c$ of
    (1) $r \neq r$ ↦ false
    (2) $s \neq t$, with $s \not\equiv t$, $vars(s) = \emptyset \wedge vars(t) = \emptyset$ ↦ true
    (3) $t \neq X$, with $t$ not a variable ↦ $X \neq t$
    (4) $X \neq t_1 \cup \dots \cup f(\cdots X \cdots) \cup \dots \cup t_n$, with $f \not\equiv \cup$ ↦ true
    (5) $f(\cdots) \neq g(\cdots)$, with $f \not\equiv g$, $f, g \not\equiv \cup$ ↦ true

Figure 1: Rewriting procedure for implicit solved form

Proposition 5.2 Given a constraint $C$, tfsimpl(impl_simpl($\tau(C)$)) can be computed in time $O(n^2)$ where $n = |C|$. Moreover, if $\bar{X} = vars(C)$, then ACI1+$(F_1)$+$(F_2)$+$(F_3)$ $\models \forall \bar{X}(C \leftrightarrow \mathrm{impl\_simpl}(\tau(C)))$.
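The five rules of Fig. 1 can be sketched for a single disequation as follows. This Python fragment is an illustration under assumptions that are ours, not the paper's: terms are encoded as frozensets of elements `("var", name)` or `("fn", symbol, args)`, so that normalization is implicit; "occurs nested" is read as occurring at any depth inside an atom; and all function names are hypothetical. The result is `True`, `False`, or the (possibly reoriented) pair, leaving the final true/false clean-up to a separate step.

```python
def var(name):
    return frozenset([("var", name)])

def fn(symbol, *args):
    return frozenset([("fn", symbol, args)])

def union(*terms):
    return frozenset().union(*terms)

def only(term):
    """The single element of a one-element union."""
    return next(iter(term))

def variables(term):
    """Names of all variables occurring in `term`, at any depth."""
    names = set()
    for elem in term:
        if elem[0] == "var":
            names.add(elem[1])
        else:
            for arg in elem[2]:
                names |= variables(arg)
    return names

def is_var(term):
    return len(term) == 1 and only(term)[0] == "var"

def is_atom(term):
    return len(term) == 1 and only(term)[0] == "fn"

def occurs_nested(name, term):
    """True if variable `name` occurs inside some atom f(...) of the union."""
    return any(elem[0] == "fn" and any(name in variables(a) for a in elem[2])
               for elem in term)

def impl_simpl_one(s, t):
    """Apply rules (1)-(5) to s != t; return True, False, or a pair (s, t)."""
    if s == t:                                            # rule (1)
        return False
    if not variables(s) and not variables(t):             # rule (2)
        return True
    if is_var(t) and not is_var(s):                       # rule (3)
        s, t = t, s
    if is_var(s) and occurs_nested(only(s)[1], t):        # rule (4)
        return True
    if is_atom(s) and is_atom(t) and only(s)[1] != only(t)[1]:   # rule (5)
        return True
    return (s, t)
```

For example, `f(X) ≠ X` is first reoriented by rule (3) and then rewritten to true by rule (4), while `X ∪ a ≠ Y` is only reoriented into `Y ≠ X ∪ a`.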

5.2 Intermediate solved form

A constraint in implicit solved form $C$ is in intermediate solved form if it is false or $C \equiv (s_1 \neq t_1 \wedge \dots \wedge s_n \neq t_n)$ and for all $i = 1, \dots, n$:

• $s_i$ is a variable and $s_i$ does not occur nested in $t_i$, or
• $s_i \equiv r_1 \cup \dots \cup r_h$ and $t_i \equiv v_1 \cup \dots \cup v_k$, $h, k \geq 2$.

The procedure described in Fig. 2 produces a formula containing only disequations in intermediate solved form. Moreover, the formula is equivalent to the original constraint.

Proposition 5.3 Given a constraint $C$, the formula int_simpl($\tau(C)$) can be computed in time $O(n^3)$ ($n = |C|$). Moreover, $T_{ACI} \models \vec{\forall}(C \leftrightarrow \mathrm{int\_simpl}(\tau(C)))$.

function int_simpl($\varphi$):
  while there is a disequation $c$ in $\varphi$ not in intermediate s.f. do
    case $c$ of
    (1)-(5) as in impl_simpl
    (6) $s_1 \cup s_2 \neq f(t_1, \dots, t_n)$, with $f \not\equiv \cup$ ↦ $f(t_1, \dots, t_n) \neq s_1 \cup s_2$
    (7) $f(s_1, \dots, s_n) \neq f(t_1, \dots, t_n)$ ↦ $\bigvee_{i=1}^{n} s_i \neq t_i$
    (8) $f(s_1, \dots, s_n) \neq t_1 \cup t_2$ ↦ $\bigvee_{i=1}^{2} (f(s_1, \dots, s_n) \neq t_i \wedge t_i \neq \emptyset) \vee (f(s_1, \dots, s_n) \neq t_1 \wedge f(s_1, \dots, s_n) \neq t_2)$

Figure 2: Rewriting procedure for intermediate solved form

function expl_simpl($\varphi$):
  while there is a disequation $c$ in $\varphi$ not in explicit s.f. do
    if $c \equiv r \cup s \neq t \cup u$ then replace $c$ by $\Phi(c)$
    else replace $c$ by int_simpl($c$)

Figure 3: Rewriting procedure for explicit solved form

Nevertheless, a successive application of distributivity can generate an exponential number of disjuncts. This consideration holds in any general equational theory $E$. For example, given a binary function symbol $f$ and the constraint $f(X_1, Y_1) \neq f(V_1, Z_1) \wedge \dots \wedge f(X_n, Y_n) \neq f(V_n, Z_n)$, after the application of the simplification rules in Fig. 2, we obtain (in time $O(n)$): $(X_1 \neq V_1 \vee Y_1 \neq Z_1) \wedge \dots \wedge (X_n \neq V_n \vee Y_n \neq Z_n)$. The corresponding disjunctive normal form contains $2^n$ disjuncts. On the other hand, the presence of an exponential number of disjuncts generated by the int_simpl procedure implies that an explicit enumeration of the whole set of solutions requires an exponential amount of time:

Corollary 5.4 Given a constraint $C$, an intermediate solved form constraint $C'$ that implies $C$ can be computed in time $O(n^3)$, where $n = |C|$. Moreover, the disjunction of all the intermediate solved form constraints equivalent to $C$ can be computed in time $2^{O(n)}$.

By adopting a structure-sharing technique to implement int_simpl it should be possible to lower the complexity to $O(n^2)$ (instead of $O(n^3)$).
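The blow-up caused by distributivity in the example above can be reproduced in a few lines. The sketch below is illustrative: `decompose` mimics rule (7) on argument lists given as tuples of variable names, `dnf` distributes a conjunction of disjunctions, and both names (as well as the choice n = 4) are assumptions made for the example.

```python
from itertools import product

def decompose(lhs_args, rhs_args):
    """Rule (7): f(s1..sn) != f(t1..tn) becomes the disjunction of the
    disequations si != ti, returned as a list of pairs."""
    return [(s, t) for s, t in zip(lhs_args, rhs_args)]

def dnf(clauses):
    """Distribute a conjunction of disjunctions into a list of conjunctions,
    picking one disequation from every clause."""
    return [list(choice) for choice in product(*clauses)]

n = 4
# f(Xi, Yi) != f(Vi, Zi) for i = 1..n
clauses = [decompose((f"X{i}", f"Y{i}"), (f"V{i}", f"Z{i}"))
           for i in range(1, n + 1)]
disjuncts = dnf(clauses)
```

Each clause is produced in constant time, yet `disjuncts` contains one conjunction per way of choosing a disequation from every clause, i.e. 2 to the power n of them.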

5.3 Explicit solved form

In this section we show how a disequation r [ s 6= t [ u occurring in a constraint in intermediate solved form can be further simpli ed. An intermediate solved form constraint C is in explicit solved form if it is false or C  (s1 6= t1 ^  ^ sn 6= tn ) and, for all i = 1; : : : ; n, the following holds: if si is not a variable, then si 6= ti is of the form: X [ Y1 [    [ Yh 6= Y1 [    [ Yh [ r1 [    [ rk The rewriting of a constraint C in intermediate solved form, di erent from false, into an equivalent constraint in explicit form (see Fig. 3) is based on the recursive replacement of each conjunct of the form

$c \equiv \underbrace{X_1 \cup \dots \cup X_m \cup t_1 \cup \dots \cup t_h}_{\ell} \neq \underbrace{Y_1 \cup \dots \cup Y_n \cup s_1 \cup \dots \cup s_k}_{r}$

with the formula $\Phi(c) \equiv \varphi_\ell \vee \varphi_r$, where

$\varphi_\ell \equiv \bigvee_{i=1}^{h} \varphi^i_{term} \vee \bigvee_{i=1}^{m} \varphi^i_{var}$

$\varphi^i_{term} \equiv \bigwedge_{j=1}^{k} (t_i \neq s_j) \wedge \bigwedge_{j=1}^{n} (Y_j \neq Y_j \cup t_i)$

$\varphi^i_{var} \equiv \bigwedge_{J \subseteq \{1,\dots,k\}} (X_i \cup Y_1 \cup \dots \cup Y_n \neq Y_1 \cup \dots \cup Y_n \cup \bigcup_{r \in J} s_r)$

The definition of $\varphi_r$ is perfectly symmetrical. Intuitively, $\varphi^i_{term}$ asserts the fact that $t_i$ does not belong to the set described by the r.h.s., while $\varphi^i_{var}$ states that $X_i$ is not a subset of the r.h.s. The replacement of $c$ with $\Phi(c)$ can generate a number of new disequations (e.g., $s_1 \neq t_1$ with $s_1 \equiv f(t_1)$ and $s_1 \equiv g(t_2)$) that can be rewritten by the function int_simpl of Fig. 2. However, all the subproblems generated are of smaller size and the function expl_simpl will eventually terminate.

Proposition 5.5 For a canonical disequation $c$ of the form $X_1 \cup \dots \cup X_m \cup t_1 \cup \dots \cup t_h \neq Y_1 \cup \dots \cup Y_n \cup s_1 \cup \dots \cup s_k$ we have that $T_{ACI} \models \vec{\forall}(c \leftrightarrow \Phi(c))$.

Proposition 5.6 Given an implicit solved form constraint $C$, $T_{ACI} \models \vec{\forall}(C \leftrightarrow \text{expl\_simpl}(C))$.

Theorem 5.7 Given an implicit solved form constraint $C$, expl_simpl($C$) terminates.

Observe that every disequation of the form $t_i \neq s_j$ occurs twice in $\Phi(c)$. Adopting a structure-sharing technique, however, it is possible to keep the complexity of expl_simpl within $O(n^3)$. Moreover, $\Phi(c)$ generates an exponential number of disequations. So, even if the number of steps is polynomial, the real complexity of the algorithm is exponential: $O(n^3 2^n)$.

Example 5.8 Let us consider the constraint $f(X \cup a \cup g(Y), a) \neq f(X \cup Y \cup b, X)$ and assume that $\#(X) < \#(Y) < \#(a) < \#(b) < \#(g)$. The constraint is in implicit solved form; the corresponding disjunction of constraints in intermediate solved form is

$X \neq a \vee X \cup a \cup g(Y) \neq X \cup Y \cup b$

and the corresponding disjunction of constraints in explicit solved form is

$X \neq a$
$\vee\ (X \neq X \cup a \wedge Y \neq Y \cup a)$
$\vee\ (X \neq X \cup g(Y) \wedge Y \neq Y \cup g(Y))$
$\vee\ X \neq X \cup b$
$\vee\ (X \neq X \cup Y \wedge X \cup Y \neq X \cup a \wedge X \cup Y \neq X \cup g(Y) \wedge X \cup Y \neq X \cup a \cup g(Y))$

Observe that the components of the explicit solved form precisely identify the set of possible solutions. For instance, the second disjunct forces both $X$ and $Y$ not to be of the form $a \cup s$ for any term $s$.
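The replacement of a canonical disequation by its disjuncts can be sketched as follows, applied to the second disjunct of the intermediate solved form in Example 5.8. This is a minimal sketch under stated assumptions, not the expl_simpl procedure of Fig. 3: the function names `phi`, `phi_side` and `subsets` are hypothetical, variables and non-variable terms are opaque strings, each side of a disequation is a `frozenset` of its union members (so associativity, commutativity and idempotence hold at top level only), and the only simplification performed is dropping disjuncts that contain a syntactically false disequation $s \neq s$. Disequations between non-variable terms, such as $a \neq b$, are left in place; in the paper they would be handed back to int_simpl.

```python
from itertools import chain, combinations

def subsets(items):
    """All subsets J of items, including the empty one."""
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

def phi_side(vars_l, terms_l, vars_r, terms_r):
    """Disjuncts stating that the left side is not included in the right side.

    Each disjunct is a list (conjunction) of pairs (lhs, rhs) read as lhs != rhs.
    """
    ys = frozenset(vars_r)
    disjuncts = []
    for t in terms_l:
        # "term" case: t differs from every s_j and is not an element of any Y_j
        conj = [(frozenset([t]), frozenset([s])) for s in terms_r]
        conj += [(frozenset([y]), frozenset([y, t])) for y in vars_r]
        disjuncts.append(conj)
    for x in vars_l:
        # "var" case: x is not a subset of the right side, one disequation per J
        conj = [(ys | {x}, ys | frozenset(j)) for j in subsets(terms_r)]
        disjuncts.append(conj)
    return disjuncts

def phi(vars_l, terms_l, vars_r, terms_r):
    """Left-to-right and right-to-left disjuncts, minus the trivially false ones."""
    raw = phi_side(vars_l, terms_l, vars_r, terms_r)
    raw += phi_side(vars_r, terms_r, vars_l, terms_l)
    return [conj for conj in raw if all(lhs != rhs for lhs, rhs in conj)]

if __name__ == "__main__":
    # X u a u g(Y) != X u Y u b
    for conj in phi(["X"], ["a", "g(Y)"], ["X", "Y"], ["b"]):
        print(" and ".join(f"{sorted(l)} != {sorted(r)}" for l, r in conj))
```

On this input the two disjuncts generated for $X$ are discarded because they contain $X \cup Y \neq X \cup Y$ and $X \neq X$ respectively, and the four remaining disjuncts correspond to the last four disjuncts listed in Example 5.8. The disjunct generated for $Y$ has one disequation per subset $J$, which is where the exponential factor comes from.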

6 Discussion and Related Work

In [8] Comon studies the problem of determining adequate solved forms for disunification problems in the context of quotient algebras $T(\Sigma)/{=_E}$ for various classes of equational theories $E$. Comon identifies a class of formulae, called compact formulae. All three solved forms presented in this paper fulfill the requirement of being compact formulae. Comon also proves that compact equational theories, i.e., theories for which:

• $E$-unification is finitary and decidable
• each equation $s = t$ with $vars(s, t) = \{X\}$ and such that $s \neq_E t$ admits a finite number of solutions in $H$

guarantee that every compact formula distinct from false is satisfiable in $T(\Sigma)/{=_E}$ [8]. However, ACI1 is not compact, since equations of the type $X = X \cup a$ do not admit a finite set of solutions if $T(\Sigma)/{=_{ACI1}}$ is infinite, which is the case when $\Sigma$ is general. Nevertheless, we have demonstrated in the previous sections how to reduce an arbitrary disunification problem to a compact form, as well as the fact that the specific compact forms considered in our context are always satisfiable (Corollary 5.1). Thus, ACI1 represents a good example to indicate that compactness of the equational theory is a sufficient but not necessary condition for the satisfiability of formulae in compact form.

Bürckert [4] introduces a general scheme for solving disunification problems in the context of an arbitrary equational theory $E$. Solutions of disunification problems are described through the use of substitutions with exceptions, i.e., entities of the form $\langle \sigma, \Psi \rangle$ where $\sigma$ is a substitution and $\Psi$ a set of substitutions. An actual solution to the disunification problem is represented by any instantiation of $\sigma$ which is not an instantiation of any of the substitutions in $\Psi$. In the context of a theory $E$ which is finitary w.r.t. unification, the set of all solutions to a disunification problem can be represented using a finite set of substitutions with exceptions; additionally, the $\Psi$ component of each of them is guaranteed to be finite. Substitutions with exceptions can be obtained from the solutions of a set of unification problems. Nevertheless, this approach is not suitable to be used in a CLP language. Each substitution with exceptions is equivalent to a formula of the type:

$\exists W\, \forall Y\, (X_1 = t_1 \wedge \dots \wedge X_n = t_n \wedge W_1 \neq s_1 \wedge \dots \wedge W_m \neq s_m)$

where $vars(s_1, \dots, s_m) = Y$, $vars(t_1, \dots, t_n) \cap Y = \emptyset$ and $W_1, \dots, W_m \in W$. This leads to the generation of formulae with arbitrary quantifications, which are inadequate for a CLP framework. Moreover, to guarantee the existence of solutions it is necessary to verify whether $\sigma$ is an $E$-instance of any substitution in $\Psi$, as in the Inconsistency Lemma in [4]. This requires solving additional $E$-unification problems as well as having an explicit representation of $\Psi$.

In [2] Baader and Schulz develop a general technique capable of combining the satisfiability algorithms (based on substitutions with exceptions) for disjoint equational theories. The approach is general and can be applied to ACI1 as well. However, it provides a unique solved form that can be reached in exponential time. The solution of [2] introduces a great variety of new variables and opens a large number of alternatives. In particular, with this method one has to guess: the partition of the $m$ variables present in the problem ($m \leq n$) into equivalence classes, a linear ordering over the variables (among the possible $m!$), and a type information for each variable, specifying to which theory $E_0, E_1$ the variable belongs ($2^m$ possible choices). This leads to an overall complexity that, modulo the usual combinatorial approximations, is given by the product of these three counts. Thus, in a particular case such as the one presented in this paper, it is reasonable to improve on their constraint solver, developed for a universal framework. We do not introduce new variables to solve disequations and, using the implicit solved form, we do not introduce disjunctions. Starting from a constraint made of disequations, the complexity of our approach seems to be more promising (e.g., $O(n^2)$ for the implicit solved form) and practical.

In the context of CLP with sets, three major proposals have been presented in the literature. In [14], Gervet presents a language, called Conjunto, which incorporates a constraint solver over boolean lattices built from (flat) set intervals. The constraints can be more complex (e.g., boolean constraints) than those considered in this paper, but the domain is less general. In particular, the simulation of nested sets is not possible, which prevents the direct encoding of many interesting problems.

In CLPS [18] the authors use a solved form similar to the implicit one presented in this paper. On the other hand, their constraint solving mechanisms appear to be based on reducing the problem to standard forward-checking and lookahead techniques. The limited literature on the topic prevents us from a deeper comparison with the capabilities of CLPS.

{log} [9, 10] is a constraint logic programming language over hybrid and hereditarily finite sets. Sets in {log} are represented using a more restricted construction, based on the use of the constant $\emptyset$ and the binary function symbol with (interpreted as the set element insertion operation). The function symbol $\cup$, instead, is not available. The union operation, however, is provided in the extended version presented in [10] as a primitive constraint based on the ternary predicate symbol $\cup_3$. Unification in this context is still NP-complete and can be seen as an instance of the cases analyzed in this paper (see [11]). Disunification is relatively simpler: the constraint solvers developed for {log} [9] are capable of handling both equalities and disequalities, leading to a solved form containing only primitive constraints of the form $X = t$ and $Y \neq s$, where $X, Y$ are variables, $X$ occurs only once in the resulting constraint, and $Y$ does not appear in $s$.
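The element-insertion view of sets used by {log} can be illustrated with a short sketch. It only models ground sets with Python `frozenset` values; the names `EMPTY` and `with_` and the argument order of `with_` are assumptions made for this illustration and do not reproduce the {log} syntax or its constraint solver.

```python
EMPTY = frozenset()

def with_(s, x):
    """Element insertion: the set s extended with the element x."""
    return s | frozenset([x])

# The nested set {a, {b}} built only from EMPTY and with_
nested = with_(with_(EMPTY, "a"), with_(EMPTY, "b"))

if __name__ == "__main__":
    print(nested == frozenset(["a", frozenset(["b"])]))
    # Inserting twice, or in a different order, denotes the same set
    print(with_(with_(EMPTY, "a"), "a") == with_(EMPTY, "a"))
    print(with_(with_(EMPTY, "a"), "b") == with_(with_(EMPTY, "b"), "a"))
```

Nested sets are expressible because the inserted element may itself be a set, which is the kind of encoding that flat set intervals do not allow.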

7 Conclusions

In this paper we have studied the problem of verifying the satisfiability of conjunctions of equations and disequations w.r.t. an ACI1 theory. The ability to efficiently verify the satisfiability of this class of formulae is vital to the development of more general and effective CLP languages embedding sets. Existing results in the area of $E$-disunification (e.g., [8, 4, 2]) present general techniques which are either inadequate to the needs of a CLP framework (e.g., [4, 2]) or unsuitable to the characteristics of ACI1 equational theories (e.g., [8]). The contributions of this paper are:

• we have characterized the structures suitable to model ACI1-like theories
• we have provided complexity results for the problem of verifying satisfiability of elementary disunification and disunification with constants
• in the general disunification case, we have characterized the axiomatization which captures the desired properties, and which corresponds to the "standard" $T(\Sigma)/{=_{ACI1}}$ model
• we have proposed three solved forms, increasingly more precise in the characterization of the solution set, and developed algorithms to compute the equivalent solved form for arbitrary conjunctions of disequations. Each solved form can be trivially tested for satisfiability. Furthermore, two of the three solved forms can be computed and tested in polynomial time.

As future work, we will continue exploring the issue of solved forms for ACI1 constraints, e.g., to achieve an even more precise representation of the solution set. We will also explore the use of structure-sharing as a term representation technique to contain the size explosion during the unification phase of constraint solving.

Acknowledgments

The authors wish to thank R. Giacobazzi and D. Ranjan. A. Dovier and C. Piazza are partially supported by MURST. E. Pontelli is partially supported by NSF grants CDA9729848, EIA9810732, CCR9875279, and the US-Spain Research Program.

References

[1] F. Baader and W. Büttner. Unification in Commutative and Idempotent Monoids. TCS, 56:345–352, 1988.
[2] F. Baader and K. U. Schulz. Combination Techniques and Decision Problems for Disunification. TCS, 142:229–255, 1995.
[3] F. Baader and K. U. Schulz. Unification in the Union of Disjoint Equational Theories: Combining Decision Procedures. JSC, 21:211–243, 1996.
[4] H.-J. Bürckert. Solving Disequations in Equational Theories. In CADE 1988, vol. 310 of LNCS, pages 517–526. Springer-Verlag, 1988.
[5] D. Cantone, A. Ferro, and E. G. Omodeo. Computable Set Theory, Vol. 1. Int. Series of Monographs on Computer Science. Clarendon Press, Oxford, 1989.
[6] K. L. Clark. Negation as Failure. In H. Gallaire and J. Minker, eds., Logic and Databases, pages 293–321. Plenum Press, 1978.
[7] M. Codish and V. Lagoon. Type Dependencies for Logic Programs using ACI-unification. Israeli Symp. on Theory of Computing and Systems, IEEE, 1996.
[8] H. Comon. Disunification: a Survey. In Computational Logic: Essays in Honor of Alan Robinson. MIT Press, 1991.
[9] A. Dovier, E. G. Omodeo, E. Pontelli, and G. Rossi. {log}: A Language for Programming in Logic with Finite Sets. JLP, 28(1):1–44, 1996.
[10] A. Dovier, C. Piazza, E. Pontelli, and G. Rossi. On the Representation and Management of Finite Sets in CLP-languages. Proc. JICSLP, MIT Press, 1998.
[11] A. Dovier, E. Pontelli, and G. Rossi. Set Unification Revisited. NMSU-CSTR-9817, Dept. of Computer Science, New Mexico State Univ., USA, Oct. 1998.
[12] H. B. Enderton. A Mathematical Introduction to Logic. Academic Press, 1972.
[13] C. Fidge et al. A Set-theoretic Model for Real-Time Specification and Reasoning. In Mathematics of Program Construction, Springer-Verlag, 1998.
[14] C. Gervet. Interval Propagation to Reason about Sets: Definition and Implementation of a Practical Language. Constraints, 1:191–246, 1997.
[15] G. Grätzer. General Lattice Theory. Birkhäuser-Verlag, 1978.
[16] J. Jaffar, M. Maher, K. Marriott, and P. Stuckey. The Semantics of Constraint Logic Programs. JLP, 37:1–46, 1998.
[17] D. Kapur and P. Narendran. NP-Completeness of the Set Unification and Matching Problems. In J. H. Siekmann, ed., 8th CADE, Springer-Verlag, 1986.
[18] B. Legeard and E. Legros. Short Overview of the CLPS System. In Proc. PLILP, vol. 528 of LNCS, pages 431–433. Springer-Verlag, 1991.
[19] M. J. Maher. Complete Axiomatizations of the Algebras of Finite, Rational and Infinite Trees. In Proc. 3rd Symp. LICS, pages 349–357, 1988.
[20] A. Mal'cev. Axiomatizable Classes of Locally Free Algebras of Various Types. In The Metamathematics of Algebraic Systems, ch. 23. North Holland, 1971.
[21] J. H. Siekmann. Unification Theory. In C. Kirchner, editor, Unification. Academic Press, 1990.