The Correctness of Set-Sharing

University of Leeds

SCHOOL OF COMPUTER STUDIES RESEARCH REPORT SERIES Report 98.03

by

Patricia M. Hill, Roberto Bagnara¹ and Enea Zaffanella²

January 1998

¹ [email protected], Dipartimento di Matematica, Università degli Studi di Parma, Italy. Much of this work was supported by EPSRC grant GR/L19515.

² [email protected], Servizio IX Automazione, Università degli Studi di Modena, Italy.


It is important that practical data-flow analysers are backed by reliably proven theoretical results. Abstract interpretation provides a sound mathematical framework. Moreover, in logic programming, necessary generic properties for an abstract domain to be well-defined and sound with respect to the concrete semantics have been identified. Sharing is an abstract domain that is a standard choice for sharing analysis for both practical work and further theoretical study. In spite of this, we found that there are no satisfactory proofs for the key properties of commutativity and idempotence that are essential for Sharing to be well-defined, and that published statements of the safeness property assumed the occur-check. This paper provides a generalisation of the abstraction function for Sharing that can be applied to any language, with or without the occur-check. The results for safeness, idempotence and commutativity for abstract unification using this abstraction function are given.

Keywords: abstract interpretation, logic programming, occur-check, rational trees, set sharing.


1. Introduction

Today, talking about sharing analysis for logic programs is almost the same as talking about the set-sharing domain Sharing of Jacobs and Langen [9, 10]. Researchers are primarily concerned with extending the domain with linearity, freeness, depth-k abstract substitutions and so on [3, 4, 6, 13, 14, 18]. Key properties such as commutativity and soundness of this domain and its associated abstract operations are normally assumed to hold. The main reason for this is that [10] not only includes a proof of the soundness but also refers the reader to the thesis of Langen [15] for proofs of commutativity and idempotence.

In abstract interpretation, the concrete semantics of a program is approximated by an abstract semantics. In particular, the concrete domain is replaced by an abstract domain and each elementary operation on the concrete domain is replaced by a corresponding abstract operation on the abstract domain. Thus, assuming the global abstract procedure mimics the concrete execution procedure, each operation on elements in the abstract domain must produce an approximation of the corresponding operation on corresponding elements in the concrete domain.

The key operation in a logic programming derivation is unification, unify, and the corresponding operation for an abstract domain is aunify. An important step in standard unification algorithms is the occur-check, which avoids the creation of infinite data structures. However, in computational terms, it is expensive, and it is well known that Prolog implementations omit this check by default. Although standard unification algorithms that include the occur-check produce a substitution that is idempotent, the substitution that results when the occur-check is omitted may not be idempotent. In spite of this, most theoretical work on data-flow analysis of logic programs assumes that the result of unify is always idempotent.

In particular, both [10] and [15] assume in their proofs of soundness that the concrete substitutions are idempotent. Thus their results, as published, do not apply to the analysis of all Prolog programs.

If two terms in the concrete domain are unifiable, then unify computes a most general unifier (mgu). Up to renaming of variables, an mgu is unique. Moreover, a substitution is defined as a set of bindings or equations between variables and other terms. Thus, for the concrete domain, the order and multiplicity of elements are irrelevant in both the computation and the semantics of unify. It is therefore useful that the abstraction of the unification procedure should be unaffected by the order and multiplicity in which it abstracts the bindings that are present in the substitution. Furthermore, from a practical perspective, it is useful if the global abstract procedure can proceed in a different order to the concrete one without affecting the accuracy of the analysis results. Hence, it is extremely desirable that aunify is also commutative and idempotent. However, as explained later in this paper, only a weak form of idempotence has ever been proved (and an erroneous counter-example has been used to justify this), while the only previous proof of commutativity [15] is seriously flawed.

As sharing is normally combined with linearity and freeness domains, which are not idempotent or commutative, it may be asked why these properties are important for sharing. In answer to this, we observe that the order and multiplicity in which the bindings in a substitution are analysed affects the accuracy of the linearity and freeness domains. It is therefore a real advantage to be able to ignore these aspects as far as the sharing domain is concerned.

This paper provides a generalisation of the abstraction function for Sharing that can be applied to any language, with or without the occur-check. The results for safeness, idempotence and commutativity for abstract unification using this abstraction function are proven.

In the next section, we introduce the notation and definitions we need concerning equality and substitutions in the concrete domain. In Section 3, we introduce a new concept called variable-idempotence, which generalises idempotence to allow for rational trees. In Section 4, we recall the definition of the abstract domain Sharing and define its abstraction function, suitably generalised to allow for non-idempotent substitutions. In Sections 5 and 6, we present the proofs of the commutativity, idempotence and safeness results. We conclude in Section 7. This final section gives more detailed justification for the results in this paper.

2. Equations and Substitutions

2.1. Notation

For a set $S$, $\# S$ is the cardinality of $S$, $\wp(S)$ is the powerset of $S$, whereas $\wp_f(S)$ is the set of all the finite subsets of $S$. The symbol $\mathit{Vars}$ denotes a denumerable set of variables, whereas $T_{\mathit{Vars}}$ denotes the set of first-order terms over $\mathit{Vars}$ for some given set of function symbols. The set of variables occurring in a syntactic object $o$ is denoted by $\mathit{vars}(o)$.

2.2. Substitutions

If $x \in \mathit{Vars}$ and $s \in T_{\mathit{Vars}}$, then $x \mapsto s$ is called a binding. A substitution is a total function $\sigma\colon \mathit{Vars} \to T_{\mathit{Vars}}$ that is the identity almost everywhere; in other words, the domain of $\sigma$,

\[ \mathit{dom}(\sigma) \stackrel{\mathrm{def}}{=} \{\, x \in \mathit{Vars} \mid \sigma(x) \neq x \,\} \]

is finite. If $t \in T_{\mathit{Vars}}$, we write $t\sigma$ to denote $\sigma(t)$. Substitutions are denoted by the set of their bindings, thus $\sigma$ is identified with $\{\, x \mapsto \sigma(x) \mid x \in \mathit{dom}(\sigma) \,\}$. The composition of substitutions is defined in the usual way. Thus $\tau \circ \sigma$ is the substitution such that, for all terms $t$, $(\tau \circ \sigma)(t) = \tau(\sigma(t))$. A substitution $\sigma$ is idempotent if, for all $t \in T_{\mathit{Vars}}$, $t\sigma\sigma = t\sigma$. A substitution is circular if it has the form $\{x_1 \mapsto x_2, \ldots, x_{n-1} \mapsto x_n, x_n \mapsto x_1\}$. A substitution is in rational solved form if it has no circular subset. The set of all substitutions in rational solved form is denoted by $\mathit{Subst}$.
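The definitions of this subsection can be illustrated with a minimal Python sketch. The term encoding (a variable is a `str`, a constant or compound term is a tuple `(functor, arg1, ..., argn)`) and all function names are assumptions made for this illustration; they do not come from the report.

```python
# Hypothetical encoding (not from the report): a variable is a str, and a
# constant or compound term is a tuple (functor, arg1, ..., argn).

def apply(sigma, t):
    """Return the term t·sigma, where sigma maps variable names to terms."""
    if isinstance(t, str):
        return sigma.get(t, t)
    return (t[0],) + tuple(apply(sigma, a) for a in t[1:])

def dom(sigma):
    """Variables that sigma does not map to themselves."""
    return {x for x, s in sigma.items() if s != x}

def is_idempotent(sigma):
    """Check x·sigma·sigma == x·sigma on dom(sigma); this extends to all terms."""
    return all(apply(sigma, apply(sigma, x)) == apply(sigma, x)
               for x in dom(sigma))

def has_circular_subset(sigma):
    """True if some bindings form a cycle x1 -> x2 -> ... -> xn -> x1."""
    for start in dom(sigma):
        seen, x = set(), start
        while x not in seen and isinstance(sigma.get(x), str) and sigma[x] != x:
            seen.add(x)
            x = sigma[x]
            if x == start:
                return True
    return False

sigma = {"x": ("f", "y")}      # idempotent
rho = {"x": ("f", "x")}        # in rational solved form, but not idempotent
circ = {"x": "y", "y": "x"}    # circular, hence not in rational solved form
```

Here `rho` is the kind of substitution that arises when the occur-check is omitted: it has no circular subset, yet applying it twice differs from applying it once.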

2.3. Equations

An equation is of the form $s = t$ where $s, t \in T_{\mathit{Vars}}$. $\mathit{Eqs}$ denotes the set of all equations over $T_{\mathit{Vars}}$. We are concerned in this paper to keep the results on sharing as general as possible. In particular, we do not want to restrict ourselves to a specific equality theory. Thus we allow for any equality theory $T$ over $T_{\mathit{Vars}}$ that includes the basic axioms denoted by the following schemata.

\[
\begin{aligned}
& s = s && (2.1)\\
& s = t \iff t = s && (2.2)\\
& r = s \land s = t \implies r = t && (2.3)\\
& f(s_1, \ldots, s_n) = f(t_1, \ldots, t_n) \iff s_1 = t_1, \ldots, s_n = t_n && (2.4)\\
& \neg\bigl(f(s_1, \ldots, s_n) = g(t_1, \ldots, t_m)\bigr). && (2.5)
\end{aligned}
\]

Of course, $T$ can include other axioms. For example, it is usual in logic programming and most implementations of Prolog to assume an equality theory based on syntactic identity and characterised by the axiom schemata given by Clark [5]. This consists of the basic axioms together with the following occur-check axioms:

\[ \forall z \in \mathit{Vars} : \forall t \in (T_{\mathit{Vars}} \setminus \mathit{Vars}) : z \in \mathit{vars}(t) \implies \neg(z = t). \]

However, an alternative approach, used in some implementations of Prolog, does not require these occur-check axioms. This is based on the theory of rational trees [7, 8]. It assumes the basic axioms together with a set of uniqueness axioms [11, 12]. These state that each equation in rational solved form uniquely defines a set of trees. Thus, an equation $z = t$ where $z \in \mathit{vars}(t)$ and $t \in (T_{\mathit{Vars}} \setminus \mathit{Vars})$ denotes the axiom (expressed in terms of the usual first-order quantifiers [16]):

\[ \forall x \in \mathit{Vars} : \bigl( z = t \land x = t[z/x] \implies z = x \bigr). \]

The basic axioms defined by schemata 2.1, 2.2, 2.3, 2.4, and 2.5, which are all that are required for the results in this paper, are included in both these theories.

A substitution $\sigma$ may be regarded as a set of equations $\{\, x = t \mid x \mapsto t \in \sigma \,\}$. A set of equations $e \in \wp_f(\mathit{Eqs})$ is unifiable if there is $\sigma \in \mathit{Subst}$ such that $T \vdash (\sigma \implies e)$. $\sigma$ is called a unifier for $e$. $\sigma$ is said to be a relevant unifier of $e$ if $\mathit{vars}(\sigma) \subseteq \mathit{vars}(e)$. That is, $\sigma$ does not introduce any new variables. $\sigma$ is a most general unifier (mgu) for $e$ if, for every unifier $\sigma'$ of $e$, $T \vdash (\sigma' \implies \sigma)$. An mgu, if it exists, is unique up to the renaming of variables. In this paper, $\mathit{mgu}(e)$ always denotes a relevant unifier of $e$.

3. Variable-Idempotence

It is usual in papers on sharing analysis to assume that all the substitutions are idempotent. However, to allow for substitutions representing rational trees, we need to generalise idempotence to variable-idempotence.

Definition 1. A substitution $\sigma$ is variable-idempotent if

\[ \forall t \in T_{\mathit{Vars}} : \mathit{vars}(t\sigma\sigma) = \mathit{vars}(t\sigma). \]

The set of all variable-idempotent substitutions is denoted by $\mathit{VSubst}$. It is convenient to use the following alternative characterisation of variable-idempotence: a substitution $\sigma$ is variable-idempotent if and only if,

\[ \forall (x \mapsto t) \in \sigma : \mathit{vars}(t\sigma) = \mathit{vars}(t). \]

It follows from this that any substitution consisting of a single binding is variable-idempotent. We define the transformation $\stackrel{S}{\longmapsto} \subseteq \mathit{Subst} \times \mathit{Subst}$ as follows:

\[ \frac{(x \mapsto t), (y \mapsto s) \in \sigma \qquad x \neq y}{\sigma \stackrel{S}{\longmapsto} \bigl(\sigma \setminus \{y \mapsto s\}\bigr) \cup \{y \mapsto s[x/t]\}} \]

Any substitution $\sigma$ can be transformed to a variable-idempotent substitution $\sigma'$ for $\sigma$ by a finite sequence of S transformations. Furthermore, if the substitutions $\sigma$ and $\sigma'$ are regarded as equations, then they are equivalent with respect to any equality theory that includes the basic equality axioms. These two statements are direct consequences of Lemmas 3 and 2, respectively.

Lemma 2. Let $T$ be an equality theory that satisfies the basic equality axioms and $\sigma$ and $\sigma'$ be substitutions. Suppose that $(x \mapsto t), (y \mapsto s) \in \sigma$ where $x \neq y$ and $\sigma' = \bigl(\sigma \setminus \{y \mapsto s\}\bigr) \cup \{y \mapsto s[x/t]\}$. Then (regarding $\sigma$ and $\sigma'$ as sets of equations) $T \vdash (\sigma \iff \sigma')$.

Proof. We first show by induction on the depth of the term $s$ that

\[ x = t \implies s = s[x/t]. \]

Suppose $s$ has depth 1. If $s$ is $x$, then $s[x/t] = t$ and the result is trivial. If $s$ is a variable distinct from $x$ or a constant, then $s[x/t] = s$ and the result follows from equality axiom 2.1. Suppose now that $s = f(s_1, \ldots, s_n)$ and the result holds for all terms of depth less than that of $s$. Then, by the inductive hypothesis, for each $i = 1, \ldots, n$,

\[ x = t \implies s_i = s_i[x/t]. \]

Hence, by axiom 2.4,

\[ x = t \implies f(s_1, \ldots, s_n) = f(s_1[x/t], \ldots, s_n[x/t]) \]

and hence

\[ x = t \implies f(s_1, \ldots, s_n) = f(s_1, \ldots, s_n)[x/t]. \]

Thus, combining this result with axiom 2.3, we have

\[ \{x = t, y = s\} \implies \{x = t, y = s, s = s[x/t]\} \implies \{x = t, y = s[x/t]\}. \]

Similarly, combining this result with axioms 2.2 and 2.3,

\[ \{x = t, y = s[x/t]\} \implies \{x = t, y = s[x/t], s = s[x/t]\} \implies \{x = t, y = s\}. \]

Lemma 3. Suppose that, for each $j = 0, \ldots, n$:

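\[ \sigma_j = \{x_1 \mapsto t_{1,j}, \ldots, x_n \mapsto t_{n,j}\} \]

where $t_{j,j} = t_{j,j-1}$ and, if $j > 0$, for each $i = 1, \ldots, n$ where $i \neq j$, $t_{i,j} = t_{i,j-1}[x_j/t_{j,j-1}]$. Then, for each $j = 0, \ldots, n$,

\[ \mu_j = \{x_1 \mapsto t_{1,j}, \ldots, x_j \mapsto t_{j,j}\} \]

is variable-idempotent and, if $j > 0$, $\sigma_j$ can be obtained from $\sigma_{j-1}$ by a sequence of S transformations.

Proof. The proof is by induction on $j$. Since $\mu_0$ is empty, the base case when $j = 0$ is trivial. Suppose, therefore, that $1 \leq j \leq n$ and the hypothesis holds for $\mu_{j-1}$ and $\sigma_{j-1}$. By the definition of $\mu_j$, we have $\mu_j = \{x_j \mapsto t_{j,j-1}\} \circ \mu_{j-1}$. Consider an arbitrary $i$, $1 \leq i \leq j$. We will show that $\mathit{vars}(t_{i,j}\mu_j) = \mathit{vars}(t_{i,j})$.

Suppose first that $i = j$. Then, since $t_{j,j} = t_{j,j-1}$, $t_{j,j-1} = t_{j,0}\mu_{j-1}$ and, by the inductive hypothesis, $\mathit{vars}(t_{j,0}\mu_{j-1}\mu_{j-1}) = \mathit{vars}(t_{j,0}\mu_{j-1})$, we have

\[
\begin{aligned}
\mathit{vars}(t_{j,j}\mu_j) &= \mathit{vars}\bigl(t_{j,0}\mu_{j-1}\mu_{j-1}\{x_j \mapsto t_{j,j}\}\bigr)\\
&= \mathit{vars}\bigl(t_{j,0}\mu_{j-1}\{x_j \mapsto t_{j,j}\}\bigr)\\
&= \mathit{vars}\bigl(t_{j,j}\{x_j \mapsto t_{j,j}\}\bigr)\\
&= \mathit{vars}(t_{j,j}).
\end{aligned}
\]

Suppose now that $i \neq j$. Then,

\[ \mathit{vars}(t_{i,j}) = \mathit{vars}\bigl(t_{i,j-1}\{x_j \mapsto t_{j,j-1}\}\bigr) \]

and, by the inductive hypothesis, $\mathit{vars}(t_{i,j-1}\mu_{j-1}) = \mathit{vars}(t_{i,j-1})$. If $x_j \notin \mathit{vars}(t_{i,j-1})$, then

\[
\begin{aligned}
\mathit{vars}(t_{i,j}\mu_{j-1}) &= \mathit{vars}\bigl(t_{i,j-1}\{x_j \mapsto t_{j,j-1}\}\mu_{j-1}\bigr)\\
&= \mathit{vars}(t_{i,j-1}\mu_{j-1})\\
&= \mathit{vars}(t_{i,j}).
\end{aligned}
\]

On the other hand, if $x_j \in \mathit{vars}(t_{i,j-1})$, then

\[
\begin{aligned}
\mathit{vars}(t_{i,j}\mu_{j-1}) &= \mathit{vars}\bigl(t_{i,j-1}\{x_j \mapsto t_{j,j-1}\}\mu_{j-1}\bigr)\\
&= \bigl(\mathit{vars}(t_{i,j-1}\mu_{j-1}) \setminus \{x_j\}\bigr) \cup \mathit{vars}(t_{j,j-1}\mu_{j-1})\\
&= \bigl(\mathit{vars}(t_{i,j-1}) \setminus \{x_j\}\bigr) \cup \mathit{vars}(t_{j,j-1})\\
&= \mathit{vars}\bigl(t_{i,j-1}\{x_j \mapsto t_{j,j-1}\}\bigr)\\
&= \mathit{vars}(t_{i,j}).
\end{aligned}
\]

Thus, in both cases,

\[
\begin{aligned}
\mathit{vars}(t_{i,j}\mu_j) &= \mathit{vars}\bigl(t_{i,j}\mu_{j-1}\{x_j \mapsto t_{j,j-1}\}\bigr)\\
&= \mathit{vars}\bigl(t_{i,j}\{x_j \mapsto t_{j,j-1}\}\bigr)\\
&= \mathit{vars}\bigl(t_{i,j-1}\{x_j \mapsto t_{j,j-1}\}\{x_j \mapsto t_{j,j-1}\}\bigr).
\end{aligned}
\]

However, a substitution consisting of a single binding is variable-idempotent. Thus

\[ \mathit{vars}(t_{i,j}\mu_j) = \mathit{vars}\bigl(t_{i,j-1}\{x_j \mapsto t_{j,j-1}\}\bigr) = \mathit{vars}(t_{i,j}). \]

Therefore, for each $i = 1, \ldots, j$, $\mathit{vars}(t_{i,j}\mu_j) = \mathit{vars}(t_{i,j})$. It then follows (using the alternative characterisation of variable-idempotence) that $\mu_j$ is variable-idempotent.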

4. Set Sharing

4.1. The Sharing Domain

The Sharing domain is due to Jacobs and Langen [9]. However, we use the definition as presented in [1].

Definition 4. (The set-sharing lattice.) Let³

\[ SG \stackrel{\mathrm{def}}{=} \{\, S \in \wp_f(\mathit{Vars}) \mid S \neq \emptyset \,\} \]

and let $SH \stackrel{\mathrm{def}}{=} \wp(SG)$. The set-sharing lattice is given by the set

\[ SS \stackrel{\mathrm{def}}{=} \bigl\{\, (sh, U) \bigm| sh \in SH,\ U \in \wp_f(\mathit{Vars}),\ \forall S \in sh : S \subseteq U \,\bigr\} \cup \{\bot, \top\} \]

ordered by $\preceq_{SS}$ defined as follows, for each $d, (sh_1, U_1), (sh_2, U_2) \in SS$:

\[
\begin{aligned}
& \bot \preceq_{SS} d,\\
& d \preceq_{SS} \top,\\
& (sh_1, U_1) \preceq_{SS} (sh_2, U_2) \iff (U_1 = U_2) \land (sh_1 \subseteq sh_2).
\end{aligned}
\]

It is straightforward to see that every subset of $SS$ has a least upper bound with respect to $\preceq_{SS}$. Hence $SS$ is a complete lattice⁴.

An element $sh$ of $SH$ abstracts the property of sharing in a substitution $\sigma$. That is, if $\sigma$ is idempotent, two variables $x, y$ must be in the same set in $sh$ if some variable, say $v$, occurs in both $x\sigma$ and $y\sigma$. In fact, this is also true for variable-idempotent substitutions, although it is shown below that this needs to be generalised for substitutions that are not variable-idempotent. Thus, the definition of the abstraction function for sharing requires an ancillary definition for the notion of occurrence.

Definition 5. (Occurrence.) For each non-negative integer $n$, $\mathit{occ}_n\colon \mathit{Subst} \times \mathit{Vars} \to \wp_f(\mathit{Vars})$ is defined for each $\sigma \in \mathit{Subst}$ and each $v \in \mathit{Vars}$:

\[
\mathit{occ}_0(\sigma, v) \stackrel{\mathrm{def}}{=}
\begin{cases}
\{v\}, & \text{if } v\sigma = v;\\
\emptyset, & \text{if } v\sigma \neq v.
\end{cases}
\]

If $n > 0$,

\[ \mathit{occ}_n(\sigma, v) \stackrel{\mathrm{def}}{=} \{\, y \in \mathit{Vars} \mid x \in \mathit{vars}(y\sigma) \cap \mathit{occ}_{n-1}(\sigma, v) \,\}. \]

It can be seen that, for fixed values of $\sigma$ and $v$, $\mathit{occ}_n(\sigma, v)$ is monotonic and extensive with respect to the index $n$. Hence, as the range of $\mathit{occ}_n(\sigma, v)$ is restricted to the finite set of variables in $\sigma$, there is an $N$ such that $\mathit{occ}_N(\sigma, v) = \mathit{occ}_M(\sigma, v)$ for all $M \geq N$. Let $\mathit{occ}_\omega(\sigma, v) \stackrel{\mathrm{def}}{=} \mathit{occ}_N(\sigma, v)$. Note that if $\sigma$ is variable-idempotent, then $\mathit{occ}_\omega(\sigma, v) = \mathit{occ}_1(\sigma, v)$. Note also that if $v\sigma \neq v$, then $\mathit{occ}_\omega(\sigma, v) = \emptyset$.

³ The literature on Sharing is almost unanimous in defining sharing-sets so that they always contain the empty set. We deviate from this de facto standard: in our approach sharing-sets never contain the empty set. We do this because (1) there is no real need of having $\emptyset$ and $\subseteq$ as the bottom element and the ordering of the domain, respectively (the original motivation for including the empty set in each sharing-set); (2) the definitions turn out to be easier; (3) we describe the implementation (where the empty set never appears in sharing-sets) more faithfully.

⁴ Notice that the only reason we have $\top \in SS$ is in order to turn $SS$ into a lattice rather than a CPO.

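Previous definitions for an occurrence operator, such as that for $sg$ in [9], have all been for idempotent substitutions. However, when $\sigma$ is an idempotent substitution, $\mathit{occ}_\omega(\sigma, v)$ and $sg(\sigma, v)$ are the same for all $v \in \mathit{Vars}$. We base the definition of abstraction on the occurrence operator $\mathit{occ}_\omega$.

Definition 6. (Abstraction.) The concrete domain $\mathit{Subst}$ is related to $SS$ by means of the abstraction function $\alpha\colon \wp(\mathit{Subst}) \times \wp_f(\mathit{Vars}) \to SS$. For each $\Sigma \in \wp(\mathit{Subst})$ and each $U \in \wp_f(\mathit{Vars})$,

\[ \alpha(\Sigma, U) \stackrel{\mathrm{def}}{=} \bigsqcup_{\sigma \in \Sigma} \alpha(\sigma, U), \]

where $\alpha\colon \mathit{Subst} \times \wp_f(\mathit{Vars}) \to SS$ is defined, for each $\sigma \in \mathit{Subst}$ and each $U \in \wp_f(\mathit{Vars})$, by

\[ \alpha(\sigma, U) \stackrel{\mathrm{def}}{=} \Bigl( \bigl\{\, \mathit{occ}_\omega(\sigma, v) \cap U \bigm| v \in \mathit{Vars} \,\bigr\} \setminus \{\emptyset\},\ U \Bigr). \]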

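The following lemma shows that the abstraction for a substitution $\sigma$ is the same as the abstraction for a variable-idempotent substitution for $\sigma$.

Lemma 7. Let $\sigma$ be a substitution, $\sigma'$ a substitution obtained from $\sigma$ by a sequence of S transformations, $U$ a set of variables and $v \in \mathit{Vars}$. Then

\[
\begin{aligned}
& v\sigma = v \iff v = v\sigma',\\
& \mathit{occ}_\omega(\sigma, v) = \mathit{occ}_\omega(\sigma', v),\\
& \alpha(\sigma, U) = \alpha(\sigma', U).
\end{aligned}
\]

Proof. Suppose first that $\sigma'$ is obtained from $\sigma$ by a single S transformation. Thus we can assume that $x \mapsto t$ and $y \mapsto s$ are in $\sigma$, where $x \in \mathit{vars}(s)$, and that

\[ \sigma' = \bigl(\sigma \setminus \{y \mapsto s\}\bigr) \cup \{y \mapsto s[x/t]\}. \]

It follows that, as $\sigma$ is in rational solved form, $\sigma$ has no circular subset and hence $v\sigma = v \iff v = v\sigma'$. Thus, if $v\sigma \neq v$, then $v\sigma' \neq v$ and $\mathit{occ}_\omega(\sigma, v) = \mathit{occ}_\omega(\sigma', v) = \emptyset$.

We now assume that $v\sigma = v = v\sigma'$ and prove that, for all $m$,

\[ \mathit{occ}_m(\sigma, v) \subseteq \mathit{occ}_\omega(\sigma', v). \]

The proof is by induction on $m$. By Definition 5, $\mathit{occ}_0(\sigma, v) = \mathit{occ}_0(\sigma', v) = \{v\}$, so that the result holds for $m = 0$. Suppose then that $m > 0$ and that $v_m \in \mathit{occ}_m(\sigma, v)$. By Definition 5, there exists $v_{m-1} \in \mathit{vars}(v_m\sigma)$ where $v_{m-1} \in \mathit{occ}_{m-1}(\sigma, v)$. Hence, by the inductive hypothesis, $v_{m-1} \in \mathit{occ}_\omega(\sigma', v)$. If $v_{m-1} \in \mathit{vars}(v_m\sigma')$, then, by Definition 5, $v_m \in \mathit{occ}_\omega(\sigma', v)$. On the other hand, if $v_{m-1} \notin \mathit{vars}(v_m\sigma')$, then $v_m = y$, $v_{m-1} = x$, and $x \in \mathit{vars}(s)$ (so that $\mathit{vars}(t) \subseteq \mathit{vars}(s[x/t])$). However, by hypothesis, $v\sigma = v$, so that $x \neq v$ and $m > 1$. Thus, by Definition 5, there exists $v_{m-2} \in \mathit{vars}(t)$ with $v_{m-2} \in \mathit{occ}_{m-2}(\sigma, v)$. By the inductive hypothesis, $v_{m-2} \in \mathit{occ}_\omega(\sigma', v)$. Since $y \mapsto s[x/t] \in \sigma'$ and $v_{m-2} \in \mathit{vars}(s[x/t])$, we have $v_{m-2} \in \mathit{vars}(y\sigma')$. Thus, by Definition 5, $y \in \mathit{occ}_\omega(\sigma', v)$.

Conversely, we now prove that, for all $m$,

\[ \mathit{occ}_m(\sigma', v) \subseteq \mathit{occ}_\omega(\sigma, v). \]

The proof is again by induction on $m$. As in the previous case, $\mathit{occ}_0(\sigma', v) = \mathit{occ}_0(\sigma, v) = \{v\}$, so that the result holds for $m = 0$. Suppose then that $m > 0$ and that $v_m \in \mathit{occ}_m(\sigma', v)$. By Definition 5, there exists $v_{m-1} \in \mathit{vars}(v_m\sigma')$ where $v_{m-1} \in \mathit{occ}_{m-1}(\sigma', v)$. Hence, by the inductive hypothesis, $v_{m-1} \in \mathit{occ}_\omega(\sigma, v)$. If $v_{m-1} \in \mathit{vars}(v_m\sigma)$, then, by Definition 5, $v_m \in \mathit{occ}_\omega(\sigma, v)$. On the other hand, if $v_{m-1} \notin \mathit{vars}(v_m\sigma)$, then $v_m = y$, $v_{m-1} \in \mathit{vars}(t)$ and $x \in \mathit{vars}(s)$. Thus, as $y \mapsto s \in \sigma$, $x \in \mathit{vars}(y\sigma)$. However, since $x \mapsto t \in \sigma$, $v_{m-1} \in \mathit{vars}(x\sigma)$ so that, by Definition 5, $x \in \mathit{occ}_\omega(\sigma, v)$. Thus, again by Definition 5, $y \in \mathit{occ}_\omega(\sigma, v)$.

Thus, if $\sigma'$ is obtained from $\sigma$ by a single S transformation, we have $v\sigma = v \iff v = v\sigma'$, $\mathit{occ}_\omega(\sigma, v) = \mathit{occ}_\omega(\sigma', v)$ and $\alpha(\sigma, U) = \alpha(\sigma', U)$.

Suppose now that there is a sequence $\sigma = \sigma_1, \ldots, \sigma_n = \sigma'$ such that, for $i = 2, \ldots, n$, $\sigma_i$ is obtained from $\sigma_{i-1}$ by a single S-step. If $n = 1$, then $\sigma = \sigma'$. If $n > 1$, we have by the first part that, for each $i = 2, \ldots, n$,

\[
\begin{aligned}
& v = v\sigma_{i-1} \iff v = v\sigma_i,\\
& \mathit{occ}_\omega(\sigma_{i-1}, v) = \mathit{occ}_\omega(\sigma_i, v),\\
& \alpha(\sigma_{i-1}, U) = \alpha(\sigma_i, U),
\end{aligned}
\]

and hence the required results.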

4.2. Abstract Operations for Sharing Sets

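Before introducing the abstract operations over $SH$ we need some ancillary definitions.

Definition 8. (Auxiliary functions.) The closure under union function (also called star-union), $(\cdot)^\star\colon SH \to SH$, is given, for each $sh \in SH$, by

\[ sh^\star \stackrel{\mathrm{def}}{=} \{\, S \in SG \mid \exists n \geq 1 : \exists T_1, \ldots, T_n \in sh : S = T_1 \cup \cdots \cup T_n \,\}. \]

For each $sh \in SH$ and each $T \in \wp_f(\mathit{Vars})$, the extraction of the relevant component of $sh$ with respect to $T$ is encoded by the function $\mathrm{rel}\colon \wp_f(\mathit{Vars}) \times SH \to SH$, defined as

\[ \mathrm{rel}(T, sh) \stackrel{\mathrm{def}}{=} \{\, S \in sh \mid S \cap T \neq \emptyset \,\}. \]

For each $sh_1, sh_2 \in SH$, the function $\mathrm{bin}\colon SH \times SH \to SH$ is given by

\[ \mathrm{bin}(sh_1, sh_2) \stackrel{\mathrm{def}}{=} \{\, S_1 \cup S_2 \mid S_1 \in sh_1, S_2 \in sh_2 \,\}. \]

The function $\mathrm{proj}\colon SH \times \wp_f(\mathit{Vars}) \to SH$ projects an element of $SH$ onto a set of variables of interest: if $sh \in SH$ and $V \in \wp_f(\mathit{Vars})$, then

\[ \mathrm{proj}(sh, V) \stackrel{\mathrm{def}}{=} \{\, S \cap V \mid S \in sh, S \cap V \neq \emptyset \,\}. \]

Definition 9. (amgu.) The function amgu captures the effects of a binding $x \mapsto t$ on an $SH$ element. Let $x$ be a variable and $t$ a term. Let also $sh \in SH$ and

\[
\begin{aligned}
A &\stackrel{\mathrm{def}}{=} \mathrm{rel}\bigl(\{x\}, sh\bigr),\\
B &\stackrel{\mathrm{def}}{=} \mathrm{rel}\bigl(\mathit{vars}(t), sh\bigr).
\end{aligned}
\]

Then

\[ \mathrm{amgu}(sh, x \mapsto t) \stackrel{\mathrm{def}}{=} \bigl(sh \setminus (A \cup B)\bigr) \cup \mathrm{bin}(A^\star, B^\star). \]

In Section 6, the following soundness result for amgu is proved.

Lemma 10. Let $(sh, U) \in SS$ and $\{x \mapsto t\}, \sigma, \nu \in \mathit{Subst}$ such that $\nu$ is a relevant unifier of $\{x\sigma = t\sigma\}$ and $\mathit{vars}(x), \mathit{vars}(t), \mathit{vars}(\sigma) \subseteq U$. Then

\[ \alpha(\sigma, U) \preceq_{SS} (sh, U) \implies \alpha(\nu \circ \sigma, U) \preceq_{SS} \bigl(\mathrm{amgu}(sh, x \mapsto t), U\bigr). \]

The following lemmas, which are proved in [2], show that amgu is both idempotent and commutative.

Lemma 11. Let $sh \in SH$ and $\{x \mapsto r\} \in \mathit{Subst}$. Then

\[ \mathrm{amgu}(sh, x \mapsto r) = \mathrm{amgu}\bigl(\mathrm{amgu}(sh, x \mapsto r), x \mapsto r\bigr). \]

Lemma 12. Let $sh \in SH$ and $\{x \mapsto r\}, \{y \mapsto t\} \in \mathit{Subst}$. Then

\[ \mathrm{amgu}\bigl(\mathrm{amgu}(sh, x \mapsto r), y \mapsto t\bigr) = \mathrm{amgu}\bigl(\mathrm{amgu}(sh, y \mapsto t), x \mapsto r\bigr). \]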
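Definitions 8 and 9 translate almost literally into code. The sketch below is an illustration under stated assumptions: a sharing-set is a Python `set` of `frozenset`s of variable names, a binding `x ↦ t` is passed as the variable `x` together with the set `vars(t)`, and the names `star`, `rel`, `bin_` and `fs` are chosen for this example only.

```python
def fs(names):
    """Helper (hypothetical): build a sharing group from a string of names."""
    return frozenset(names)

def star(sh):
    """Closure under union: all unions of one or more groups of sh."""
    closure = set(sh)
    changed = True
    while changed:
        changed = False
        for a in list(closure):
            for b in list(closure):
                if a | b not in closure:
                    closure.add(a | b)
                    changed = True
    return closure

def rel(T, sh):
    """Groups of sh that share at least one variable with T."""
    return {S for S in sh if S & T}

def bin_(sh1, sh2):
    """Pairwise unions of a group of sh1 with a group of sh2."""
    return {a | b for a in sh1 for b in sh2}

def amgu(sh, x, t_vars):
    """Abstract effect of the binding x -> t, where t_vars = vars(t)."""
    A = rel({x}, sh)
    B = rel(t_vars, sh)
    return (sh - (A | B)) | bin_(star(A), star(B))

sh = {fs("x"), fs("y"), fs("z"), fs("yz")}
once = amgu(sh, "x", {"y"})          # binding x -> f(y)
twice = amgu(once, "x", {"y"})       # same binding again
```

On this input, `once` is `{{z}, {x, y}, {x, y, z}}` and `twice` equals `once`, in line with Lemma 11. Applying the bindings `x ↦ f(y)` and `y ↦ z` in either order yields `{{x, y, z}}`, in line with Lemma 12.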

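4.3. Abstract Operations for Sharing Domains

The definitions and results of Subsection 4.2 can be lifted to apply to sharing domains.

Definition 13. (Amgu.) The operation $\mathrm{Amgu}\colon SS \times \mathit{Subst} \to SS$ extends the $SS$ description it takes as an argument to the set of variables occurring in the binding it is given as the second argument. Then it applies amgu:

\[ \mathrm{Amgu}\bigl((sh, U), x \mapsto t\bigr) \stackrel{\mathrm{def}}{=} \Bigl( \mathrm{amgu}\bigl(sh \cup \bigl\{\, \{u\} \bigm| u \in \mathit{vars}(x \mapsto t) \setminus U \,\bigr\},\ x \mapsto t\bigr),\ U \cup \mathit{vars}(x \mapsto t) \Bigr). \]

The results for amgu can easily be extended to apply to Amgu, giving us the following corollaries.

Corollary 14. Let $(sh, U) \in SS$ and $\{x \mapsto t\}, \sigma, \nu \in \mathit{Subst}$ such that $\mathit{vars}(\sigma) \subseteq U$ and $\nu$ is a relevant unifier of $\{x\sigma = t\sigma\}$. Then

\[ \alpha(\sigma, U) \preceq_{SS} (sh, U) \implies \alpha\bigl(\nu \circ \sigma, U \cup \mathit{vars}(x \mapsto t)\bigr) \preceq_{SS} \mathrm{Amgu}\bigl((sh, U), x \mapsto t\bigr). \]

Corollary 15. Let $(sh, U) \in SS$ and $\{x \mapsto r\} \in \mathit{Subst}$. Then

\[ \mathrm{Amgu}\bigl((sh, U), x \mapsto r\bigr) = \mathrm{Amgu}\bigl(\mathrm{Amgu}\bigl((sh, U), x \mapsto r\bigr), x \mapsto r\bigr). \]

Corollary 16. Let $(sh, U) \in SS$ and $\{x \mapsto r\}, \{y \mapsto t\} \in \mathit{Subst}$. Then

\[ \mathrm{Amgu}\bigl(\mathrm{Amgu}\bigl((sh, U), x \mapsto r\bigr), y \mapsto t\bigr) = \mathrm{Amgu}\bigl(\mathrm{Amgu}\bigl((sh, U), y \mapsto t\bigr), x \mapsto r\bigr). \]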

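Definition 17. (aunify.) The function $\mathrm{aunify}\colon SS \times \mathit{Eqs} \to SS$ generalises Amgu to a set of equations $e$. If $(sh, U) \in SS$, $x$ is a variable, $r$ a term, $s = f(s_1, \ldots, s_n)$ and $t = f(t_1, \ldots, t_n)$ are non-variable terms, and $\bar{s} = \bar{t}$ denotes the set of equations $\{s_1 = t_1, \ldots, s_n = t_n\}$, then

\[ \mathrm{aunify}\bigl((sh, U), \emptyset\bigr) \stackrel{\mathrm{def}}{=} (sh, U); \]

if $e \in \wp_f(\mathit{Eqs})$ is unifiable,

\[
\begin{aligned}
\mathrm{aunify}\bigl((sh, U), e \cup \{x = r\}\bigr) &\stackrel{\mathrm{def}}{=} \mathrm{aunify}\bigl(\mathrm{Amgu}\bigl((sh, U), x \mapsto r\bigr), e \setminus \{x = r\}\bigr),\\
\mathrm{aunify}\bigl((sh, U), e \cup \{s = x\}\bigr) &\stackrel{\mathrm{def}}{=} \mathrm{aunify}\bigl((sh, U), (e \setminus \{s = x\}) \cup \{x = s\}\bigr),\\
\mathrm{aunify}\bigl((sh, U), e \cup \{s = t\}\bigr) &\stackrel{\mathrm{def}}{=} \mathrm{aunify}\bigl((sh, U), (e \setminus \{s = t\}) \cup (\bar{s} = \bar{t})\bigr);
\end{aligned}
\]

and, if $e$ is not unifiable,

\[ \mathrm{aunify}\bigl((sh, U), e\bigr) \stackrel{\mathrm{def}}{=} \bot. \]

For the distinguished elements $\bot$ and $\top$ of $SS$,

\[
\begin{aligned}
\mathrm{aunify}(\bot, e) &\stackrel{\mathrm{def}}{=} \bot,\\
\mathrm{aunify}(\top, e) &\stackrel{\mathrm{def}}{=} \top.
\end{aligned}
\]

As a result of Corollary 16, Amgu and aunify commute.

Lemma 18. Let $(sh, U) \in SS$, $e \in \wp_f(\mathit{Eqs})$ and $y \mapsto u \in \mathit{Subst}$. Then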

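\[ \mathrm{aunify}\bigl(\mathrm{Amgu}\bigl((sh, U), y \mapsto u\bigr), e\bigr) = \mathrm{Amgu}\bigl(\mathrm{aunify}\bigl((sh, U), e\bigr), y \mapsto u\bigr). \]

The proof of this is in Section 5. As a consequence of this and Corollaries 14, 15 and 16, we have the following soundness, idempotence and commutativity results required for aunify to be sound and well-defined. The proofs of these results are in Section 5.

Theorem 19. Let $(sh, U) \in SS$, $\sigma, \nu \in \mathit{Subst}$, and $e \in \wp_f(\mathit{Eqs})$ such that $\mathit{vars}(\sigma) \subseteq U$ and $\nu$ is a relevant unifier of $e\sigma$. Then

\[ \alpha(\sigma, U) \preceq_{SS} (sh, U) \implies \alpha\bigl(\nu \circ \sigma, U \cup \mathit{vars}(e)\bigr) \preceq_{SS} \mathrm{aunify}\bigl((sh, U), e\bigr). \]

Theorem 20. Let $(sh, U) \in SS$ and $e \in \wp_f(\mathit{Eqs})$. Then

\[ \mathrm{aunify}\bigl((sh, U), e\bigr) = \mathrm{aunify}\bigl(\mathrm{aunify}\bigl((sh, U), e\bigr), e\bigr). \]

Theorem 21. Let $(sh, U) \in SS$ and $e_1, e_2 \in \wp_f(\mathit{Eqs})$. Then

\[ \mathrm{aunify}\bigl(\mathrm{aunify}\bigl((sh, U), e_1\bigr), e_2\bigr) = \mathrm{aunify}\bigl(\mathrm{aunify}\bigl((sh, U), e_2\bigr), e_1\bigr). \]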
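The lifting from amgu to Amgu and aunify can be sketched as follows. This is an illustration under explicit assumptions, not the authors' implementation: terms are encoded as in a typical Python prototype (a variable is a `str`, other terms are tuples `(functor, args...)`), an element of $SS$ is a pair `(sh, U)`, the bottom element is modelled by `None`, and the top element is not modelled. The sketch also assumes the caller supplies a unifiable set of equations; only a functor or arity clash is detected and mapped to bottom.

```python
def fs(names):
    """Helper (hypothetical): build a sharing group from a string of names."""
    return frozenset(names)

def term_vars(t):
    if isinstance(t, str):
        return {t}
    return set().union(*(term_vars(a) for a in t[1:]))

def star(sh):
    closure, changed = set(sh), True
    while changed:
        changed = False
        for a in list(closure):
            for b in list(closure):
                if a | b not in closure:
                    closure.add(a | b)
                    changed = True
    return closure

def rel(T, sh):
    return {S for S in sh if S & T}

def amgu(sh, x, t_vars):
    A, B = rel({x}, sh), rel(t_vars, sh)
    return (sh - (A | B)) | {a | b for a in star(A) for b in star(B)}

def Amgu(d, x, t):
    """Add singletons for the new variables of x -> t, then apply amgu."""
    sh, U = d
    vs = {x} | term_vars(t)
    sh = sh | {frozenset({u}) for u in vs - U}
    return amgu(sh, x, term_vars(t)), U | vs

def aunify(d, eqs):
    """Process equations (lhs, rhs) one at a time, as in Definition 17."""
    if d is None:                      # aunify(bottom, e) = bottom
        return None
    eqs = list(eqs)
    while eqs:
        s, t = eqs.pop()
        if isinstance(s, str):         # x = r
            d = Amgu(d, s, t)
        elif isinstance(t, str):       # s = x: swap the two sides
            eqs.append((t, s))
        elif s[0] == t[0] and len(s) == len(t):
            eqs.extend(zip(s[1:], t[1:]))   # decompose f(...) = f(...)
        else:
            return None                # clash: not unifiable
    return d

d0 = ({fs("x"), fs("y"), fs("z"), fs("w")}, frozenset("xyzw"))
e = [(("f", "x", "y"), ("f", ("g", "z"), "w"))]   # f(x, y) = f(g(z), w)
```

For this input, `aunify(d0, e)` gives the sharing-set `{{x, z}, {y, w}}` over the same `U`, and a second application of `e` leaves the result unchanged, as Theorem 20 states. Since the equations are popped from a list, the processing order is arbitrary; Theorem 21 is what makes that choice immaterial.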

5. Proofs of results for Sharing Domains

We prove all the results in this section by induction on a measure of the size of a set of equations $e \in \wp_f(\mathit{Eqs})$, where the size of $e$ is defined here to be the number of variables on the left-hand sides of the equations in $e$ plus twice the number of non-variable symbols on the left-hand sides of the equations in $e$. For each result, the proof is obvious if $e$ is empty or does not unify. Thus, in the following proofs, we assume that $e$ unifies and is non-empty. For each proof, there are three cases, depending on whether $e$ contains an element of the form $x = r$, $s = x$, or $s = t$, where $x$ is a variable, $r$ is any term, and $s \stackrel{\mathrm{def}}{=} f(s_1, \ldots, s_n)$, $t \stackrel{\mathrm{def}}{=} f(t_1, \ldots, t_n)$. Let $\bar{s} = \bar{t}$ denote the set of equations $\{s_1 = t_1, \ldots, s_n = t_n\}$.

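Proof of Lemma 18.

Case 1: $x = r \in e$. Let $e_1 \stackrel{\mathrm{def}}{=} e \setminus \{x = r\}$.

\[
\begin{aligned}
\mathrm{aunify}\bigl(\mathrm{Amgu}\bigl((sh, U), y \mapsto u\bigr), e\bigr)
&= \mathrm{aunify}\bigl(\mathrm{Amgu}\bigl(\mathrm{Amgu}\bigl((sh, U), y \mapsto u\bigr), x \mapsto r\bigr), e_1\bigr) && \text{(Definition 17)}\\
&= \mathrm{aunify}\bigl(\mathrm{Amgu}\bigl(\mathrm{Amgu}\bigl((sh, U), x \mapsto r\bigr), y \mapsto u\bigr), e_1\bigr) && \text{(Corollary 16)}\\
&= \mathrm{Amgu}\bigl(\mathrm{aunify}\bigl(\mathrm{Amgu}\bigl((sh, U), x \mapsto r\bigr), e_1\bigr), y \mapsto u\bigr) && \text{(induction)}\\
&= \mathrm{Amgu}\bigl(\mathrm{aunify}\bigl((sh, U), e\bigr), y \mapsto u\bigr) && \text{(Definition 17).}
\end{aligned}
\]

Case 2: $s = x \in e$. Let $e_1 \stackrel{\mathrm{def}}{=} (e \setminus \{s = x\}) \cup \{x = s\}$.

\[
\begin{aligned}
\mathrm{aunify}\bigl(\mathrm{Amgu}\bigl((sh, U), y \mapsto u\bigr), e\bigr)
&= \mathrm{aunify}\bigl(\mathrm{Amgu}\bigl((sh, U), y \mapsto u\bigr), e_1\bigr) && \text{(Definition 17)}\\
&= \mathrm{Amgu}\bigl(\mathrm{aunify}\bigl((sh, U), e_1\bigr), y \mapsto u\bigr) && \text{(induction)}\\
&= \mathrm{Amgu}\bigl(\mathrm{aunify}\bigl((sh, U), e\bigr), y \mapsto u\bigr) && \text{(Definition 17).}
\end{aligned}
\]

Case 3: $s = t \in e$. Let $e_1 \stackrel{\mathrm{def}}{=} (e \setminus \{s = t\}) \cup (\bar{s} = \bar{t})$. Then the proof of this is the same as for Case 2.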

Proof of Theorem 19. def ? Case 1 x = r 2 e. Let e0 = e n fx = rg, def = mgu fx 0 = t 0 g and 0 be a 12

relevant uni er for e0 such that = 0 . (; U ) SS (sh ; U ) ? ? (induction) =) 0 ; U [ vars (e0 ) SS aunify (sh ; U ); e0 ? ? =) 0 ; U [ vars (e) SS Amgu aunify (sh ; U ); e0 ; x 7! r (Corollary 14) ? ? =) 0 ; U [ vars (e) SS aunify Amgu (sh ; U ); x 7! r ; e0 (Lemma 18) ? ? =) ; U [ vars (e) SS aunify (sh ; U ); e (De nition 17).

Case 2 s = x 2 e. Let e1 def = e n fs = xg [ fx = sg. Then mgu(e) = mgu(e1 ) and the result follows from Case 1. = e n fs = tg [ s = t. Then mgu(e) = mgu(e1 ). Case 3 s = t 2 e. Let e1 def (; U ) SS (sh ; U ) ? =) ( ; U [ vars (e1 )) SS aunify (sh ; U ); e1 (induction) ? ? =) ; U [ vars (e) SS aunify (sh ; U ); e (De nition 17). Proof of Theorem 20. def Case 1 x = r 2 e. Let e1 = e n fx = tg.

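\[
\begin{aligned}
\mathrm{aunify}\bigl(\mathrm{aunify}\bigl((sh, U), e\bigr), e\bigr)
&= \mathrm{aunify}\bigl(\mathrm{Amgu}\bigl(\mathrm{aunify}\bigl(\mathrm{Amgu}((sh, U), x \mapsto r), e_1\bigr), x \mapsto r\bigr), e_1\bigr) && \text{(Definition 17)}\\
&= \mathrm{aunify}\bigl(\mathrm{aunify}\bigl(\mathrm{Amgu}\bigl(\mathrm{Amgu}((sh, U), x \mapsto r), x \mapsto r\bigr), e_1\bigr), e_1\bigr) && \text{(Lemma 18)}\\
&= \mathrm{aunify}\bigl(\mathrm{Amgu}\bigl(\mathrm{Amgu}((sh, U), x \mapsto r), x \mapsto r\bigr), e_1\bigr) && \text{(induction)}\\
&= \mathrm{aunify}\bigl(\mathrm{Amgu}\bigl((sh, U), x \mapsto r\bigr), e_1\bigr) && \text{(Corollary 15)}\\
&= \mathrm{aunify}\bigl((sh, U), e\bigr) && \text{(Definition 17).}
\end{aligned}
\]

Case 2: $s = x \in e$. Let $e_1 \stackrel{\mathrm{def}}{=} (e \setminus \{s = x\}) \cup \{x = s\}$.

\[
\begin{aligned}
\mathrm{aunify}\bigl(\mathrm{aunify}\bigl((sh, U), e\bigr), e\bigr)
&= \mathrm{aunify}\bigl(\mathrm{aunify}\bigl((sh, U), e_1\bigr), e_1\bigr) && \text{(Definition 17)}\\
&= \mathrm{aunify}\bigl((sh, U), e_1\bigr) && \text{(induction)}\\
&= \mathrm{aunify}\bigl((sh, U), e\bigr) && \text{(Definition 17).}
\end{aligned}
\]

Case 3: $s = t \in e$. Let $e_1 \stackrel{\mathrm{def}}{=} (e \setminus \{s = t\}) \cup (\bar{s} = \bar{t})$. Then the rest of the proof is the same as for Case 2.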

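Proof of Theorem 21.

Case 1: $x = r \in e_2$. Let $e'_2 \stackrel{\mathrm{def}}{=} e_2 \setminus \{x = r\}$.

\[
\begin{aligned}
\mathrm{aunify}\bigl(\mathrm{aunify}\bigl((sh, U), e_1\bigr), e_2\bigr)
&= \mathrm{aunify}\bigl(\mathrm{aunify}\bigl(\mathrm{Amgu}\bigl((sh, U), x \mapsto r\bigr), e_1\bigr), e'_2\bigr) && \text{(Definition 17)}\\
&= \mathrm{aunify}\bigl(\mathrm{aunify}\bigl(\mathrm{Amgu}\bigl((sh, U), x \mapsto r\bigr), e'_2\bigr), e_1\bigr) && \text{(induction)}\\
&= \mathrm{aunify}\bigl(\mathrm{Amgu}\bigl(\mathrm{aunify}\bigl((sh, U), e'_2\bigr), x \mapsto r\bigr), e_1\bigr) && \text{(Lemma 18)}\\
&= \mathrm{aunify}\bigl(\mathrm{aunify}\bigl((sh, U), e_2\bigr), e_1\bigr) && \text{(Definition 17).}
\end{aligned}
\]

Case 2: $s = x \in e_2$. Let $e'_2 \stackrel{\mathrm{def}}{=} (e_2 \setminus \{s = x\}) \cup \{x = s\}$.

\[
\begin{aligned}
\mathrm{aunify}\bigl(\mathrm{aunify}\bigl((sh, U), e_1\bigr), e_2\bigr)
&= \mathrm{aunify}\bigl(\mathrm{aunify}\bigl((sh, U), e_1\bigr), e'_2\bigr) && \text{(Definition 17)}\\
&= \mathrm{aunify}\bigl(\mathrm{aunify}\bigl((sh, U), e'_2\bigr), e_1\bigr) && \text{(induction)}\\
&= \mathrm{aunify}\bigl(\mathrm{aunify}\bigl((sh, U), e_2\bigr), e_1\bigr) && \text{(Definition 17).}
\end{aligned}
\]

Case 3: $s = t \in e_2$. Let $e'_2 \stackrel{\mathrm{def}}{=} (e_2 \setminus \{s = t\}) \cup (\bar{s} = \bar{t})$. Then the proof of this is the same as for Case 2.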

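6. Proofs of results for Sharing Sets

In the proofs we use the fact that $(\cdot)^\star$ and $\mathrm{rel}$ are monotonic and idempotent, so that

\[
\begin{aligned}
sh_1 \subseteq sh_2 &\implies sh_1^\star \subseteq sh_2^\star, && (6.1)\\
sh_1 \subseteq sh_2 &\implies \mathrm{rel}(U, sh_1) \subseteq \mathrm{rel}(U, sh_2). && (6.2)
\end{aligned}
\]

Let $t_1, \ldots, t_n$ be terms. For the sake of brevity we will use the notation $v_{t_1 \cdots t_n}$ to denote $\bigcup_{i=1}^{n} \mathit{vars}(t_i)$. In particular, if $x$ and $y$ are (term) variables, and $r$ and $t$ are terms, we will use the following definitions:

\[
\begin{aligned}
v_x &\stackrel{\mathrm{def}}{=} \{x\}, & v_y &\stackrel{\mathrm{def}}{=} \{y\},\\
v_r &\stackrel{\mathrm{def}}{=} \mathit{vars}(r), & v_t &\stackrel{\mathrm{def}}{=} \mathit{vars}(t),\\
v_{xr} &\stackrel{\mathrm{def}}{=} v_x \cup v_r, & v_{yt} &\stackrel{\mathrm{def}}{=} v_y \cup v_t.
\end{aligned}
\]

Definition 22. ($\overline{\mathrm{rel}}$.) Suppose $V \in \wp_f(\mathit{Vars})$ and $sh \in SH$. Then

\[ \overline{\mathrm{rel}}(V, sh) \stackrel{\mathrm{def}}{=} sh \setminus \mathrm{rel}(V, sh). \]

Notice that if $S \in \overline{\mathrm{rel}}(V, sh)$ then $S \cap V = \emptyset$. Conversely, if $S \in sh$ and $S \cap V = \emptyset$ then $S \in \overline{\mathrm{rel}}(V, sh)$.

The following definition of amgu is clearly equivalent to the one given in Definition 9: for each variable $y$, each term $t$, and each $sh \in SH$,

\[ \mathrm{amgu}(sh, y \mapsto t) \stackrel{\mathrm{def}}{=} \overline{\mathrm{rel}}(v_{yt}, sh) \cup \mathrm{bin}\bigl(\mathrm{rel}(v_y, sh)^\star, \mathrm{rel}(v_t, sh)^\star\bigr). \qquad (6.3) \]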
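The claimed equivalence between (6.3) and the definition of amgu in Subsection 4.2 can be checked on small inputs with the following sketch. The representation (a sharing-set as a Python `set` of `frozenset`s, a binding as a variable plus the set of variables of its term) and the function names `amgu_def` and `amgu_63` are assumptions made for this illustration.

```python
def fs(names):
    """Helper (hypothetical): build a sharing group from a string of names."""
    return frozenset(names)

def star(sh):
    closure, changed = set(sh), True
    while changed:
        changed = False
        for a in list(closure):
            for b in list(closure):
                if a | b not in closure:
                    closure.add(a | b)
                    changed = True
    return closure

def rel(V, sh):
    """Groups of sh that intersect V."""
    return {S for S in sh if S & V}

def rel_bar(V, sh):
    """Complement of rel: groups of sh disjoint from V."""
    return sh - rel(V, sh)

def bin_(sh1, sh2):
    return {a | b for a in sh1 for b in sh2}

def amgu_def(sh, x, t_vars):
    """amgu written with A, B and set difference."""
    A, B = rel({x}, sh), rel(set(t_vars), sh)
    return (sh - (A | B)) | bin_(star(A), star(B))

def amgu_63(sh, y, t_vars):
    """amgu written as in equation (6.3)."""
    v_y, v_t = {y}, set(t_vars)
    return rel_bar(v_y | v_t, sh) | bin_(star(rel(v_y, sh)), star(rel(v_t, sh)))

sh = {fs("x"), fs("y"), fs("z"), fs("yz"), fs("xw")}
```

The two formulations agree because `rel(v_y | v_t, sh)` is exactly the union of `rel(v_y, sh)` and `rel(v_t, sh)`, so removing it from `sh` is the same as removing `A ∪ B`.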

Proof of Lemma 10. By Lemma 7, if is not variable-idempotent, it can be transformed to a variable-idempotent substitution 0 where (; U ) = (0 ; U ). We therefore can assume that is variable-idempotent so that, for any v 2 Vars , occ!(; v) = occ1 (; v). Thus, to simplify the notation in this proof, we write occ instead of occ!. We rst prove the result assuming that (; U ) = (sh ; U ). Let def = . Suppose (; U ) = (sh 0 ; U ) and S 2 sh 0 . Then, for some w 2 U , there exists v 2 vars (w) \ U such that S = occ(; v):

(6.4)

Suppose = S 0 def

[

occ(; y) y 2 occ(; v); occ(; y) \ U 6= ? :

(6.5)

We rst show that S = S 0 . We have z 2 occ(; v) \ U if and only if v 2 vars (z) and z 2 U . Since = , we have v 2 vars (z) if and only if there exists y 2 vars (z) such that v 2 vars (y ). This holds if and only if there exists y such that z 2 occ(; y) and y 2 occ(; v), which, as z 2 U , is equivalent to z 2 S 0 . Let [

occ(; y) y 2 occ(; v); x 2 occ(; y) ; = Sx def [ = occ(; y) y 2 occ(; v); occ(; y) \ vars (t) 6= ? ; St def = S n (Sx [ St ): S0 def

(6.6) (6.7) (6.8)

Note that, by (6.5), Sx [ St S . Moreover, by (6.6), if Sx 6= ? then x 2 Sx . Similarly, by (6.7), we have also S \ vars (t) St and thus, by (6.5),

S0 =

8 > [< > :

occ(; y)

9

occ( ) occ( occ( )

y2

> ; v); = : ; y \ U 6= ?; > ; ; y \ (fxg [ vars (t)) = ?

(6.9)

There are two cases to consider. Either S0 6= ? or S0 = ?. Suppose rst that S0 6= ?. Then, by (6.9), occ(; y0 ) S0 for some y0 2 occ(; v) and occ( ? ; y0 ) \ U 6= ?. Hence occ(; y0 ) 2 sh . It follows from (6.9) that occ(; y0 ) 2 rel fxg [ vars (t); sh , and ?

occ(; y0 ) \ fxg [ vars (t) = ?:

(6.10)

Hence y0 62 vars (x) [ vars (t). Therefore y0 = y0 . Since v 2 vars (y0 ) we have y0 = v and, consequently, v = v. Suppose occ(; y) S0 , where y 2 occ(; v). Then v 2 vars ? (y ). Hence y = v and occ(; y ) = occ(; v ). Thus S0 = occ(; v ) and S0 2 rel fxg [ vars (t); sh . We now show that Sx = St = ?. If y0 2 vars (x), 15

then x 2 occ(; y0 ). Similarly, if y0 2 vars (t), occ(; y0 ) \ vars (t) 6= ?. This establishes, by contradiction with (6.10), that Sx = St = ? and thus ?

S = S0 2 rel fxg [ vars (t); sh : Now suppose that S0 = ?. Then, since S 6= ?, we have Sx [ St 6= ?. Suppose that Sx 6= ?. Then, by (6.6), x 2 Sx . Hence x 2 S . However, by (6.4), S = occ(; v) so that v 2 vars (x). Since x = t, we have v 2 vars (t). Thus, by (6.4), we have S \ vars (t) 6= ?. Then, by (6.7), St \ vars (t) 6= ? and so St 6= ?. Therefore both Sx 6= ? and St 6= ?. Since x 2 U , x 2 occ(; y) implies occ(; ?y) \ U 6= ?. Thus x 2 occ(; y?) implies ? occ(; y ) 2 sh and hence occ(; y) 2 rel fxg; sh . So, by (6.6), S 2 rel f x g ; sh . Similarly, the de nition x ? ? of St ensures that St 2 rel vars (t); sh and thus, as S0 = ?, ? ? S = Sx [ St 2 bin rel fxg; sh ? ; rel vars(t); sh ? :

Combining the cases for S0 6= ? and S0 = ? we obtain the required result

S 2 amgu(sh ; x 7! t): Finally, since sh 1 sh 2 =) amgu(sh 1 ; x 7! t) amgu(sh 2 ; x 7! t);

we have the result when (; U ) SS sh . 2

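Proof of Lemma 11. Let

\[
\begin{aligned}
sh_{-} &\stackrel{\mathrm{def}}{=} \overline{\mathrm{rel}}(v_{xr}, sh),\\
sh_{xr} &\stackrel{\mathrm{def}}{=} \mathrm{bin}\bigl(\mathrm{rel}(v_x, sh)^\star, \mathrm{rel}(v_r, sh)^\star\bigr).
\end{aligned}
\]

Then,

\[
\begin{aligned}
sh_{xr}^\star &= sh_{xr}, & \mathrm{bin}(sh_{xr}, sh_{xr}) &= sh_{xr},\\
\mathrm{rel}(v_x, sh_{xr}) &= sh_{xr}, & \mathrm{rel}(v_x, sh_{-}) &= \emptyset,\\
\mathrm{rel}(v_r, sh_{xr}) &= sh_{xr}, & \mathrm{rel}(v_r, sh_{-}) &= \emptyset,\\
\overline{\mathrm{rel}}(v_{xr}, sh_{xr}) &= \emptyset, & \overline{\mathrm{rel}}(v_{xr}, sh_{-}) &= sh_{-}.
\end{aligned}
\]

Hence, we have

\[
\begin{aligned}
\mathrm{bin}\bigl(\mathrm{rel}(v_x, sh_{xr})^\star, \mathrm{rel}(v_r, sh_{xr})^\star\bigr) &= sh_{xr},\\
\mathrm{rel}(v_x, sh_{-} \cup sh_{xr}) &= sh_{xr},\\
\mathrm{rel}(v_r, sh_{-} \cup sh_{xr}) &= sh_{xr},\\
\overline{\mathrm{rel}}(v_{xr}, sh_{-} \cup sh_{xr}) &= sh_{-}.
\end{aligned}
\]

Now, by (6.3),

\[
\begin{aligned}
\mathrm{amgu}\bigl(\mathrm{amgu}(sh, x \mapsto r), x \mapsto r\bigr)
&= \overline{\mathrm{rel}}(v_{xr}, sh_{-} \cup sh_{xr}) \cup \mathrm{bin}\bigl(\mathrm{rel}(v_x, sh_{-} \cup sh_{xr})^\star, \mathrm{rel}(v_r, sh_{-} \cup sh_{xr})^\star\bigr)\\
&= sh_{-} \cup sh_{xr}\\
&= \mathrm{amgu}(sh, x \mapsto r).
\end{aligned}
\]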

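□

For the proof of commutativity, we require the following auxiliary results.

Lemma 23. For each $V \in \wp_f(\mathit{Vars})$ and $\mathit{sh} \in \mathit{SH}$ we have

$$\overline{\mathrm{rel}}(V, \mathit{sh}^\star) = \overline{\mathrm{rel}}(V, \mathit{sh})^\star.$$

Proof. Let $S \in \wp_f(\mathit{Vars})$. Then $S \in \overline{\mathrm{rel}}(V, \mathit{sh}^\star)$ means that $S \in \mathit{sh}^\star$ and $S \cap V = \emptyset$. In other words, there exist $S_1 \in \mathit{sh}, \ldots, S_n \in \mathit{sh}$ such that $S = \bigcup_{i=1}^{n} S_i$ and, for each $i = 1, \ldots, n$, we have $S_i \cap V = \emptyset$. This amounts to saying that there exist $S_1 \in \overline{\mathrm{rel}}(V, \mathit{sh}), \ldots, S_n \in \overline{\mathrm{rel}}(V, \mathit{sh})$ such that $S = \bigcup_{i=1}^{n} S_i$, which is equivalent to $S \in \overline{\mathrm{rel}}(V, \mathit{sh})^\star$. □

The auxiliary function $\mathrm{rel}$ enjoys a weaker property.

Lemma 24. For each $V \in \wp_f(\mathit{Vars})$ and $\mathit{sh} \in \mathit{SH}$ we have

$$\mathrm{rel}(V, \mathit{sh}^\star) \supseteq \mathrm{rel}(V, \mathit{sh})^\star.$$

Proof. Let $S \in \wp_f(\mathit{Vars})$. Then $S \in \mathrm{rel}(V, \mathit{sh})^\star$ means that there exist $S_1 \in \mathit{sh}, \ldots, S_n \in \mathit{sh}$ such that $S_i \cap V \neq \emptyset$, for each $i = 1, \ldots, n$, and $S = \bigcup_{i=1}^{n} S_i$. Thus $S \cap V \neq \emptyset$ and $S \in \mathrm{rel}(V, \mathit{sh}^\star)$. Hence, $\mathrm{rel}(V, \mathit{sh}^\star) \supseteq \mathrm{rel}(V, \mathit{sh})^\star$. □

Lemma 25. For each $V \in \wp_f(\mathit{Vars})$, $\mathit{sh}_1, \mathit{sh}_2 \in \mathit{SH}$, and $S \in \wp_f(\mathit{Vars})$ we have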

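$$S \in \mathrm{rel}(V, \mathit{sh}_1 \cup \mathit{sh}_2)^\star \cup \{\emptyset\} \iff \exists S_1 \in \mathrm{rel}(V, \mathit{sh}_1)^\star \cup \{\emptyset\} \mathrel{.} \exists S_2 \in \mathrm{rel}(V, \mathit{sh}_2)^\star \cup \{\emptyset\} \mathrel{.} S = S_1 \cup S_2.$$

Proof. If $S = \emptyset$ the statement is trivial. Suppose $S \in \mathrm{rel}(V, \mathit{sh}_1 \cup \mathit{sh}_2)^\star$. Then, for some $n \in \mathbb{N}$ and for each $i = 1, \ldots, n$, there exists $R_i \in (\mathit{sh}_1 \cup \mathit{sh}_2)$ such that $R_i \cap V \neq \emptyset$ and $S = \bigcup_{i=1}^{n} R_i$. Let $S_j = \bigcup \{\, R_i \mid R_i \in \mathit{sh}_j \,\}$ for $j = 1, 2$. Thus $S_1 \in \mathrm{rel}(V, \mathit{sh}_1)^\star \cup \{\emptyset\}$ and $S_2 \in \mathrm{rel}(V, \mathit{sh}_2)^\star \cup \{\emptyset\}$ and $S = S_1 \cup S_2$.

Suppose $\exists S_1 \in \mathrm{rel}(V, \mathit{sh}_1)^\star \cup \{\emptyset\} \mathrel{.} \exists S_2 \in \mathrm{rel}(V, \mathit{sh}_2)^\star \cup \{\emptyset\} \mathrel{.} S = S_1 \cup S_2$, with $S_1$ and $S_2$ not both empty. Then, for some $m, n \geq 0$, there exist $R_1, \ldots, R_m \in \mathrm{rel}(V, \mathit{sh}_1)$ and $T_1, \ldots, T_n \in \mathrm{rel}(V, \mathit{sh}_2)$ such that $S_1 = \bigcup_{i=1}^{m} R_i$ and $S_2 = \bigcup_{i=1}^{n} T_i$. Then $R_1, \ldots, R_m, T_1, \ldots, T_n \in \mathrm{rel}(V, \mathit{sh}_1 \cup \mathit{sh}_2)$ and

$$S = \bigcup_{i=1}^{m} R_i \cup \bigcup_{i=1}^{n} T_i;$$

thus $S \in \mathrm{rel}(V, \mathit{sh}_1 \cup \mathit{sh}_2)^\star$. □

Lemma 26. For each $V_1, V_2 \in \wp_f(\mathit{Vars})$ and $\mathit{sh} \in \mathit{SH}$ we have

$$\mathrm{rel}\bigl(V_1, \overline{\mathrm{rel}}(V_2, \mathit{sh})\bigr) = \overline{\mathrm{rel}}\bigl(V_2, \mathrm{rel}(V_1, \mathit{sh})\bigr).$$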

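Proof. $S \in \mathrm{rel}\bigl(V_1, \overline{\mathrm{rel}}(V_2, \mathit{sh})\bigr)$ means that $S \cap V_1 \neq \emptyset$ and $S \cap V_2 = \emptyset$. $S \in \overline{\mathrm{rel}}\bigl(V_2, \mathrm{rel}(V_1, \mathit{sh})\bigr)$ means that $S \cap V_2 = \emptyset$ and $S \cap V_1 \neq \emptyset$. □

Lemma 27. For each $\mathit{sh}_1, \mathit{sh}_2 \in \mathit{SH}$, we have

$$\mathrm{bin}(\mathit{sh}_1, \mathit{sh}_2)^\star = \mathrm{bin}(\mathit{sh}_1^\star, \mathit{sh}_2^\star).$$

Proof. $S \in \mathrm{bin}(\mathit{sh}_1, \mathit{sh}_2)^\star$ means that, for some $n$, there exist $R_1, \ldots, R_n \in \mathit{sh}_1$ and $T_1, \ldots, T_n \in \mathit{sh}_2$ such that $S = (R_1 \cup T_1) \cup \cdots \cup (R_n \cup T_n)$. $S \in \mathrm{bin}(\mathit{sh}_1^\star, \mathit{sh}_2^\star)$ means that, for some $n$, there exist $R_1, \ldots, R_n \in \mathit{sh}_1$ and $T_1, \ldots, T_n \in \mathit{sh}_2$ such that $S = (R_1 \cup \cdots \cup R_n) \cup (T_1 \cup \cdots \cup T_n)$. □

Proof of Lemma 12. We let $R$, $S$, $T$, and $U$ (possibly subscripted) denote elements of $\mathit{sh}^\star$. The subscripts for $R$, $S$, $T$, and $U$ will indicate the sets in $v_x$, $v_r$, $v_{xr}$, $v_y$, $v_t$, and $v_{yt}$ that have a common element with the subscripted set whenever it is non-empty.

Suppose that $S \in \mathrm{amgu}\bigl(\mathrm{amgu}(\mathit{sh}, x \mapsto r), y \mapsto t\bigr)$. Then we show that $S \in \mathrm{amgu}\bigl(\mathrm{amgu}(\mathit{sh}, y \mapsto t), x \mapsto r\bigr)$. There are two cases due to the two components of the definition of amgu in (6.3).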

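Case 1.

$$S \in \overline{\mathrm{rel}}\bigl(v_{yt}, \mathrm{amgu}(\mathit{sh}, x \mapsto r)\bigr).$$

Then $S \in \mathrm{amgu}(\mathit{sh}, x \mapsto r)$ and $S \cap v_{yt} = \emptyset$. Again there are two possibilities.

Subcase 1a. Suppose first that $S \in \overline{\mathrm{rel}}(v_{xr}, \mathit{sh})$. Thus $S \in \mathit{sh}$ and, since in this case we have $S \cap v_{yt} = \emptyset$,

$$S \in \overline{\mathrm{rel}}(v_{yt}, \mathit{sh}).$$

As (6.3) implies that $\overline{\mathrm{rel}}(v_{yt}, \mathit{sh}) \subseteq \mathrm{amgu}(\mathit{sh}, y \mapsto t)$, we have also

$$S \in \mathrm{amgu}(\mathit{sh}, y \mapsto t).$$

Now, since the hypothesis of this subcase implies $S \cap v_{xr} = \emptyset$, we obtain

$$S \in \overline{\mathrm{rel}}\bigl(v_{xr}, \mathrm{amgu}(\mathit{sh}, y \mapsto t)\bigr).$$

Hence, by the alternative definition of amgu, (6.3), we can conclude that

$$S \in \mathrm{amgu}\bigl(\mathrm{amgu}(\mathit{sh}, y \mapsto t), x \mapsto r\bigr).$$

Subcase 1b. Suppose now that

$$S \in \mathrm{bin}\bigl(\mathrm{rel}(v_x, \mathit{sh})^\star, \mathrm{rel}(v_r, \mathit{sh})^\star\bigr).$$

Then there exist $S_x, S_r \in \wp_f(\mathit{Vars}) \setminus \{\emptyset\}$ such that $S = S_x \cup S_r$, where

$$S_x \in \mathrm{rel}(v_x, \mathit{sh})^\star, \qquad S_r \in \mathrm{rel}(v_r, \mathit{sh})^\star.$$

By the hypothesis for this case we have $S \cap v_{yt} = \emptyset$ and thus $S_x \cap v_{yt} = \emptyset$ and $S_r \cap v_{yt} = \emptyset$. This allows us to state that

$$S_x \in \overline{\mathrm{rel}}\bigl(v_{yt}, \mathrm{rel}(v_x, \mathit{sh})^\star\bigr), \qquad S_r \in \overline{\mathrm{rel}}\bigl(v_{yt}, \mathrm{rel}(v_r, \mathit{sh})^\star\bigr),$$

and hence, by Lemma 23,

$$S_x \in \overline{\mathrm{rel}}\bigl(v_{yt}, \mathrm{rel}(v_x, \mathit{sh})\bigr)^\star, \qquad S_r \in \overline{\mathrm{rel}}\bigl(v_{yt}, \mathrm{rel}(v_r, \mathit{sh})\bigr)^\star.$$

Thus, by Lemma 26,

$$S_x \in \mathrm{rel}\bigl(v_x, \overline{\mathrm{rel}}(v_{yt}, \mathit{sh})\bigr)^\star, \qquad S_r \in \mathrm{rel}\bigl(v_r, \overline{\mathrm{rel}}(v_{yt}, \mathit{sh})\bigr)^\star,$$

so that, by (6.1, 6.2, 6.3),

$$S_x \in \mathrm{rel}\bigl(v_x, \mathrm{amgu}(\mathit{sh}, y \mapsto t)\bigr)^\star, \qquad S_r \in \mathrm{rel}\bigl(v_r, \mathrm{amgu}(\mathit{sh}, y \mapsto t)\bigr)^\star.$$

Hence $S = S_x \cup S_r \in \mathrm{amgu}\bigl(\mathrm{amgu}(\mathit{sh}, y \mapsto t), x \mapsto r\bigr)$.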

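Case 2.

$$S \in \mathrm{bin}\Bigl(\mathrm{rel}\bigl(v_y, \mathrm{amgu}(\mathit{sh}, x \mapsto r)\bigr)^\star, \mathrm{rel}\bigl(v_t, \mathrm{amgu}(\mathit{sh}, x \mapsto r)\bigr)^\star\Bigr).$$

Then there exist $S_y, S_t \in \wp_f(\mathit{Vars}) \setminus \{\emptyset\}$ such that $S = S_y \cup S_t$, where

$$S_y \in \mathrm{rel}\bigl(v_y, \mathrm{amgu}(\mathit{sh}, x \mapsto r)\bigr)^\star, \qquad S_t \in \mathrm{rel}\bigl(v_t, \mathrm{amgu}(\mathit{sh}, x \mapsto r)\bigr)^\star.$$

By (6.3) and Lemma 25, there exist $R_{-}, R_{xr}, T_{-}, T_{xr}$ such that

$$S_y = R_{-} \cup R_{xr}, \qquad S_t = T_{-} \cup T_{xr},$$

where

$$\begin{aligned}
R_{-} &\in \mathrm{rel}\bigl(v_y, \overline{\mathrm{rel}}(v_{xr}, \mathit{sh})\bigr)^\star \cup \{\emptyset\}, \\
R_{xr} &\in \mathrm{rel}\Bigl(v_y, \mathrm{bin}\bigl(\mathrm{rel}(v_x, \mathit{sh})^\star, \mathrm{rel}(v_r, \mathit{sh})^\star\bigr)\Bigr)^\star \cup \{\emptyset\}, \\
T_{-} &\in \mathrm{rel}\bigl(v_t, \overline{\mathrm{rel}}(v_{xr}, \mathit{sh})\bigr)^\star \cup \{\emptyset\}, \\
T_{xr} &\in \mathrm{rel}\Bigl(v_t, \mathrm{bin}\bigl(\mathrm{rel}(v_x, \mathit{sh})^\star, \mathrm{rel}(v_r, \mathit{sh})^\star\bigr)\Bigr)^\star \cup \{\emptyset\}.
\end{aligned} \tag{6.11}$$

Then, by Lemmas 26 and 23,

$$\begin{aligned}
R_{-} &\in \overline{\mathrm{rel}}\bigl(v_{xr}, \mathrm{rel}(v_y, \mathit{sh})^\star\bigr) \cup \{\emptyset\}, \\
T_{-} &\in \overline{\mathrm{rel}}\bigl(v_{xr}, \mathrm{rel}(v_t, \mathit{sh})^\star\bigr) \cup \{\emptyset\}.
\end{aligned} \tag{6.12}$$

Also, using Lemmas 24, 27, and then the idempotence of $(\cdot)^\star$,

$$\begin{aligned}
R_{xr} &\in \mathrm{rel}\Bigl(v_y, \mathrm{bin}\bigl(\mathrm{rel}(v_x, \mathit{sh})^\star, \mathrm{rel}(v_r, \mathit{sh})^\star\bigr)\Bigr) \cup \{\emptyset\}, \\
T_{xr} &\in \mathrm{rel}\Bigl(v_t, \mathrm{bin}\bigl(\mathrm{rel}(v_x, \mathit{sh})^\star, \mathrm{rel}(v_r, \mathit{sh})^\star\bigr)\Bigr) \cup \{\emptyset\}.
\end{aligned} \tag{6.13}$$

Subcase 2a. Suppose $R_{xr} = T_{xr} = \emptyset$. Then, as $S_y \neq \emptyset$ and $S_t \neq \emptyset$, we have $R_{-} \neq \emptyset$ and $T_{-} \neq \emptyset$, and hence, using (6.12),

$$S = R_{-} \cup T_{-} \in \mathrm{bin}\bigl(\mathrm{rel}(v_y, \mathit{sh})^\star, \mathrm{rel}(v_t, \mathit{sh})^\star\bigr),$$

so that, by (6.3),

$$S = R_{-} \cup T_{-} \in \mathrm{amgu}(\mathit{sh}, y \mapsto t).$$

Also, it follows from (6.12) that $R_{-} \cap v_{xr} = \emptyset$ and $T_{-} \cap v_{xr} = \emptyset$, so that

$$S \in \overline{\mathrm{rel}}\bigl(v_{xr}, \mathrm{amgu}(\mathit{sh}, y \mapsto t)\bigr).$$

Hence, by (6.3), $S \in \mathrm{amgu}\bigl(\mathrm{amgu}(\mathit{sh}, y \mapsto t), x \mapsto r\bigr)$.

Subcase 2b. Suppose $R_{xr} \cup T_{xr} \neq \emptyset$. Then, by (6.13),

$$(R_{xr} \cup T_{xr}) \cap v_{yt} \neq \emptyset. \tag{6.14}$$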

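The proof of this subcase is in two parts. In the first part we divide $R_{xr}$ and $T_{xr}$ into a number of subsets. First, by (6.13), there exist $R_x, R_r, T_x, T_r \in \wp_f(\mathit{Vars})$ such that

$$R_{xr} = R_x \cup R_r, \qquad T_{xr} = T_x \cup T_r,$$

where either $R_x = R_r = \emptyset$ or

$$R_x \in \mathrm{rel}(v_x, \mathit{sh})^\star, \qquad R_r \in \mathrm{rel}(v_r, \mathit{sh})^\star,$$

and either $T_x = T_r = \emptyset$ or

$$T_x \in \mathrm{rel}(v_x, \mathit{sh})^\star, \qquad T_r \in \mathrm{rel}(v_r, \mathit{sh})^\star.$$

Thus, by (6.14),

$$R_x \cup T_x \neq \emptyset, \qquad R_r \cup T_r \neq \emptyset. \tag{6.15}$$

We now subdivide the sets $R_x$, $T_x$, $R_r$, and $T_r$ further. First note that

$$\begin{aligned}
\mathit{sh} &= \overline{\mathrm{rel}}(v_{yt}, \mathit{sh}) \cup \mathrm{rel}(v_y, \mathit{sh}) \cup \overline{\mathrm{rel}}\bigl(v_y, \mathrm{rel}(v_t, \mathit{sh})\bigr), \\
\mathit{sh} &= \overline{\mathrm{rel}}(v_{yt}, \mathit{sh}) \cup \overline{\mathrm{rel}}\bigl(v_t, \mathrm{rel}(v_y, \mathit{sh})\bigr) \cup \mathrm{rel}(v_t, \mathit{sh}).
\end{aligned}$$

Hence, by Lemma 25, sets $R_{x-}$, $R_{xy}$, $R_{xt}$, $R_{r-}$, $R_{ry}$, $R_{rt}$, $T_{x-}$, $T_{xy}$, $T_{xt}$, $T_{r-}$, $T_{ry}$, $T_{rt} \in \wp_f(\mathit{Vars})$ exist such that

$$\begin{aligned}
R_x &= R_{x-} \cup R_{xy} \cup R_{xt}, & T_x &= T_{x-} \cup T_{xy} \cup T_{xt}, \\
R_r &= R_{r-} \cup R_{ry} \cup R_{rt}, & T_r &= T_{r-} \cup T_{ry} \cup T_{rt},
\end{aligned} \tag{6.16}$$

where

$$\begin{aligned}
R_{x-}, T_{x-} &\in \mathrm{rel}\bigl(v_x, \overline{\mathrm{rel}}(v_{yt}, \mathit{sh})\bigr)^\star \cup \{\emptyset\}, \\
R_{r-}, T_{r-} &\in \mathrm{rel}\bigl(v_r, \overline{\mathrm{rel}}(v_{yt}, \mathit{sh})\bigr)^\star \cup \{\emptyset\},
\end{aligned} \tag{6.17}$$

and

$$\begin{aligned}
R_{xy}, T_{xy} &\in \mathrm{rel}\bigl(v_x, \mathrm{rel}(v_y, \mathit{sh})\bigr)^\star \cup \{\emptyset\}, \\
R_{ry}, T_{ry} &\in \mathrm{rel}\bigl(v_r, \mathrm{rel}(v_y, \mathit{sh})\bigr)^\star \cup \{\emptyset\}, \\
R_{xt}, T_{xt} &\in \mathrm{rel}\bigl(v_x, \mathrm{rel}(v_t, \mathit{sh})\bigr)^\star \cup \{\emptyset\}, \\
R_{rt}, T_{rt} &\in \mathrm{rel}\bigl(v_r, \mathrm{rel}(v_t, \mathit{sh})\bigr)^\star \cup \{\emptyset\},
\end{aligned} \tag{6.18}$$

and also

$$\begin{aligned}
(R_x \setminus R_{xy}) \cap v_y &= \emptyset, & (T_x \setminus T_{xt}) \cap v_t &= \emptyset, \\
(R_r \setminus R_{ry}) \cap v_y &= \emptyset, & (T_r \setminus T_{rt}) \cap v_t &= \emptyset.
\end{aligned} \tag{6.19}$$

We note a few simple but useful consequences of these definitions. First, it follows from (6.17) using (6.1, 6.2, 6.3) that

$$\begin{aligned}
R_{x-}, T_{x-} &\in \mathrm{rel}\bigl(v_x, \mathrm{amgu}(\mathit{sh}, y \mapsto t)\bigr)^\star \cup \{\emptyset\}, \\
R_{r-}, T_{r-} &\in \mathrm{rel}\bigl(v_r, \mathrm{amgu}(\mathit{sh}, y \mapsto t)\bigr)^\star \cup \{\emptyset\}.
\end{aligned} \tag{6.20}$$

Secondly, using (6.17) with Lemma 24, we have

$$R_{x-}, T_{x-}, R_{r-}, T_{r-} \in \overline{\mathrm{rel}}(v_{yt}, \mathit{sh})^\star \cup \{\emptyset\}, \tag{6.21}$$

and then, using this with (6.14) and (6.16), it follows that

$$R_{xy} \cup T_{xy} \cup R_{ry} \cup T_{ry} \cup R_{xt} \cup T_{xt} \cup R_{rt} \cup T_{rt} \neq \emptyset. \tag{6.22}$$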

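In the second part of the proof for this subcase, the component subsets of $S$ are reassembled in an order that proves the required result. First, let

$$\begin{aligned}
U_y &\stackrel{\mathrm{def}}{=} R_{-} \cup R_{xy} \cup R_{ry} \cup T_{xy} \cup T_{ry}, \\
U_t &\stackrel{\mathrm{def}}{=} T_{-} \cup R_{xt} \cup R_{rt} \cup T_{xt} \cup T_{rt},
\end{aligned}$$

and

$$U \stackrel{\mathrm{def}}{=} U_y \cup U_t.$$

By (6.12) and (6.18) (with Lemma 24), each component set in the definition of $U_y$ is in $\mathrm{rel}(v_y, \mathit{sh})^\star \cup \{\emptyset\}$ and each component set in the definition of $U_t$ is in $\mathrm{rel}(v_t, \mathit{sh})^\star \cup \{\emptyset\}$. Thus, by the definition of $(\cdot)^\star$,

$$U_y \in \mathrm{rel}(v_y, \mathit{sh})^\star \cup \{\emptyset\}, \qquad U_t \in \mathrm{rel}(v_t, \mathit{sh})^\star \cup \{\emptyset\}.$$

By (6.19), $\bigl(R_{xr} \setminus (R_{xy} \cup R_{ry})\bigr) \cap v_y = \emptyset$ and hence $\bigl(S_y \setminus (R_{xy} \cup R_{ry} \cup R_{-})\bigr) \cap v_y = \emptyset$. By (6.11) and Lemma 24, $S_y \cap v_y \neq \emptyset$. Thus $R_{xy} \cup R_{ry} \cup R_{-} \neq \emptyset$ and, as a consequence, $U_y \neq \emptyset$. For similar reasons, $U_t \neq \emptyset$. Hence

$$U \in \mathrm{bin}\bigl(\mathrm{rel}(v_y, \mathit{sh})^\star, \mathrm{rel}(v_t, \mathit{sh})^\star\bigr),$$

and therefore, using (6.3), it follows that

$$U \in \mathrm{amgu}(\mathit{sh}, y \mapsto t). \tag{6.23}$$

Now, by (6.22), at least one of the following two inequalities holds:

$$R_{xy} \cup T_{xy} \cup R_{xt} \cup T_{xt} \neq \emptyset, \qquad R_{ry} \cup T_{ry} \cup R_{rt} \cup T_{rt} \neq \emptyset.$$

Suppose first that $R_{xy} \cup T_{xy} \cup R_{xt} \cup T_{xt} = \emptyset$ and $R_{ry} \cup T_{ry} \cup R_{rt} \cup T_{rt} \neq \emptyset$. Then, using (6.15) and (6.16) with the first of these, $R_{x-} \cup T_{x-} \neq \emptyset$. Also, using (6.18) with the second, we have $(R_{ry} \cup R_{rt} \cup T_{ry} \cup T_{rt}) \cap v_r \neq \emptyset$ and therefore $U \cap v_r \neq \emptyset$. Hence, by (6.20) and (6.23),

$$\begin{aligned}
R_{x-} \cup T_{x-} &\in \mathrm{rel}\bigl(v_x, \mathrm{amgu}(\mathit{sh}, y \mapsto t)\bigr)^\star, \\
U \cup R_{r-} \cup T_{r-} &\in \mathrm{rel}\bigl(v_r, \mathrm{amgu}(\mathit{sh}, y \mapsto t)\bigr)^\star.
\end{aligned}$$

Similarly, assuming $R_{xy} \cup T_{xy} \cup R_{xt} \cup T_{xt} \neq \emptyset$ and $R_{ry} \cup T_{ry} \cup R_{rt} \cup T_{rt} = \emptyset$, it follows that

$$\begin{aligned}
R_{r-} \cup T_{r-} &\in \mathrm{rel}\bigl(v_r, \mathrm{amgu}(\mathit{sh}, y \mapsto t)\bigr)^\star, \\
R_{x-} \cup T_{x-} \cup U &\in \mathrm{rel}\bigl(v_x, \mathrm{amgu}(\mathit{sh}, y \mapsto t)\bigr)^\star.
\end{aligned}$$

Finally, assuming $R_{xy} \cup T_{xy} \cup R_{xt} \cup T_{xt} \neq \emptyset$ and $R_{ry} \cup T_{ry} \cup R_{rt} \cup T_{rt} \neq \emptyset$, it follows from (6.18) that $U \cap v_x \neq \emptyset$ and $U \cap v_r \neq \emptyset$, and hence

$$\begin{aligned}
R_{x-} \cup T_{x-} \cup U &\in \mathrm{rel}\bigl(v_x, \mathrm{amgu}(\mathit{sh}, y \mapsto t)\bigr)^\star, \\
U \cup R_{r-} \cup T_{r-} &\in \mathrm{rel}\bigl(v_r, \mathrm{amgu}(\mathit{sh}, y \mapsto t)\bigr)^\star.
\end{aligned}$$

However, since $S = R_{x-} \cup T_{x-} \cup U \cup R_{r-} \cup T_{r-}$, we have

$$S \in \mathrm{bin}\Bigl(\mathrm{rel}\bigl(v_x, \mathrm{amgu}(\mathit{sh}, y \mapsto t)\bigr)^\star, \mathrm{rel}\bigl(v_r, \mathrm{amgu}(\mathit{sh}, y \mapsto t)\bigr)^\star\Bigr).$$

Hence, by (6.3),

$$S \in \mathrm{amgu}\bigl(\mathrm{amgu}(\mathit{sh}, y \mapsto t), x \mapsto r\bigr).$$

□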
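The auxiliary operators and the two algebraic properties proved above can be exercised mechanically on a small universe. The following Python sketch is an illustration only, not part of the proofs. It assumes that amgu is the union of the groups irrelevant to the binding with the binary union of the closures of the relevant ones, as the two components of (6.3) are used in the proofs. The encoding (sharing groups as frozensets of variable names, a term represented only by its set of variables) and all function names are choices made for this sketch.

```python
from itertools import combinations


def rel(v, sh):
    """Sharing groups of sh that share at least one variable with v."""
    return frozenset(s for s in sh if s & v)


def rel_bar(v, sh):
    """Sharing groups of sh that are disjoint from v (the overlined rel)."""
    return frozenset(s for s in sh if not s & v)


def star(sh):
    """Closure of sh under union of its (non-empty families of) groups."""
    closure = set(sh)
    changed = True
    while changed:
        changed = False
        for a in list(closure):
            for b in list(closure):
                if (a | b) not in closure:
                    closure.add(a | b)
                    changed = True
    return frozenset(closure)


def bin_union(sh1, sh2):
    """Binary union: every union of a group of sh1 with a group of sh2."""
    return frozenset(a | b for a in sh1 for b in sh2)


def amgu(sh, x, term_vars):
    """Abstract mgu for the binding x -> t, where term_vars = vars(t)."""
    vx, vt = frozenset([x]), frozenset(term_vars)
    return rel_bar(vx | vt, sh) | bin_union(star(rel(vx, sh)), star(rel(vt, sh)))


def all_sharing_sets(variables):
    """Every set of non-empty sharing groups over the given variables."""
    groups = [frozenset(c) for n in range(1, len(variables) + 1)
              for c in combinations(variables, n)]
    for n in range(len(groups) + 1):
        for chosen in combinations(groups, n):
            yield frozenset(chosen)


# Variable sets of the right-hand sides tried below: a ground term,
# an ordinary term, and two terms making the binding cyclic or entangled.
TERM_VARS = [frozenset(), frozenset({"z"}),
             frozenset({"x", "z"}), frozenset({"y", "z"})]


def lemmas_23_24_hold(variables=("x", "y", "z")):
    """Check Lemma 23 (equality) and Lemma 24 (inclusion) exhaustively."""
    for sh in all_sharing_sets(variables):
        for v in TERM_VARS:
            if rel_bar(v, star(sh)) != star(rel_bar(v, sh)):
                return False
            if not star(rel(v, sh)) <= rel(v, star(sh)):
                return False
    return True


def amgu_idempotent_and_commutative(variables=("x", "y", "z")):
    """Check idempotence and commutativity of amgu on all small inputs."""
    for sh in all_sharing_sets(variables):
        for vr in TERM_VARS:
            once = amgu(sh, "x", vr)
            if amgu(once, "x", vr) != once:
                return False
            for vt in TERM_VARS:
                if amgu(once, "y", vt) != amgu(amgu(sh, "y", vt), "x", vr):
                    return False
    return True
```

Calling `lemmas_23_24_hold()` and `amgu_idempotent_and_commutative()` enumerates all 128 sharing sets over three variables. Such an exhaustive check on a tiny universe is only a sanity test of the statements; it does not replace the proofs.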

7. Discussion

In this paper, we have provided a framework for analysing non-idempotent substitutions and presented results for the soundness, idempotence and commutativity of Sharing. The Sharing domain is well known, and most researchers concerned with analysing sharing and related properties assume that these properties hold. Why, then, are the results in this paper necessary? Let us consider each of the above properties one at a time.

7.1. Soundness

We have shown that, for any substitution $\sigma$ over a set of variables $U$, the abstraction $\alpha(\sigma, U) = (\mathit{sh}, U)$ is unique (Lemma 7) and that the aunify operation is sound (Theorem 19). Note that, in Theorem 19, there are no restrictions on $\sigma$; it can be non-idempotent, possibly including cyclic bindings (that is, bindings where the domain variable occurs in its co-domain). Thus this result is widely applicable.

Previous results on sharing have assumed that the substitutions are all idempotent. This is true if the equality is syntactic identity (defined by Clark's equality theory [5]) and the implementation uses a unification algorithm (such as the Martelli-Montanari algorithm [17]) that is based on that of Robinson [19] and includes the occur-check. With such algorithms, it is well known that, for a given set of equations, the resulting unifier is both unique and idempotent. However, this is not what is implemented by most Prolog systems. For instance, if, as in most commercial Prolog systems, the unification algorithm is based on the Martelli-Montanari algorithm but omits the occur-check step, then the resulting substitution may not be idempotent. Consider the following example. Suppose we are given as input the equation $p(z, f(x, y)) = p(f(z, y), z)$ with an initial substitution that is empty. We apply the steps of the Martelli-Montanari procedure, but without the occur-check:

$$\begin{array}{lll}
 & \text{set of equations} & \text{substitution} \\
1 & \{p(z, f(x, y)) = p(f(z, y), z)\} & \emptyset \\
2 & \{z = f(z, y),\ f(x, y) = z\} & \emptyset \\
3 & \{f(x, y) = f(z, y)\} & \{z \mapsto f(z, y)\} \\
4 & \{x = z,\ y = y\} & \{z \mapsto f(z, y)\} \\
5 & \{y = y\} & \{z \mapsto f(z, y),\ x \mapsto z\} \\
6 & \emptyset & \{z \mapsto f(z, y),\ x \mapsto z\}
\end{array}$$

Note that we have used three kinds of steps here. In lines 1 and 3, neither argument of the selected equation is a variable. In this case, the outer non-variable symbols (when, as in this example, they are the same) are removed and new equations are formed between the corresponding arguments. In lines 2 and 4, the selected equation has the form $v = t$, where $v$ is a variable and $t$ is not identical to $v$; every occurrence of $v$ is then replaced by $t$ in all the remaining equations and in the range of the substitution, and $v \mapsto t$ is added to the substitution. In line 5, the identity is removed.

Let $\sigma = \{z \mapsto f(z, y),\ x \mapsto z\}$ be the computed substitution. Then we have

$$\begin{aligned}
\mathit{vars}(x\sigma) &= \mathit{vars}(z) = \{z\}, \\
\mathit{vars}(x\sigma^2) &= \mathit{vars}\bigl(f(z, y)\bigr) = \{y, z\}.
\end{aligned}$$

Hence $\sigma$ is not variable-idempotent.

If the algorithm is as described in [12] and used in Prolog III [7], then the resulting unifier is in rational solved form. This algorithm does not generate idempotent or even variable-idempotent substitutions, even when the occur-check would never have succeeded. Since it has been shown that the substitution obtained in this way is unique (up to variable renaming), by Theorem 19, aunify is sound.
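The derivation for $p(z, f(x, y)) = p(f(z, y), z)$ can be replayed with a few lines of Python. This is a minimal sketch of the three kinds of steps described in this section, not a faithful implementation of the Martelli-Montanari algorithm: it selects equations in list order and makes no claim about termination on arbitrary inputs. The term encoding (variables as strings, compound terms as tuples whose first component is the functor) and all function names are assumptions of the sketch.

```python
def is_var(term):
    """Variables are plain strings; compound terms are tuples."""
    return isinstance(term, str)


def replace(term, v, t):
    """Replace every occurrence of variable v in term by t (one pass)."""
    if is_var(term):
        return t if term == v else term
    return (term[0],) + tuple(replace(arg, v, t) for arg in term[1:])


def unify_no_occur_check(lhs, rhs):
    """Solve lhs = rhs by the three kinds of steps, omitting the occur-check."""
    equations = [(lhs, rhs)]
    subst = {}
    while equations:
        s, t = equations.pop(0)
        if s == t:
            continue  # an identity is simply removed
        if not is_var(s) and is_var(t):
            s, t = t, s  # orient the equation as variable = term
        if is_var(s):
            # Bind s to t WITHOUT checking whether s occurs in t, replacing
            # s in the remaining equations and in the range of the substitution.
            equations = [(replace(a, s, t), replace(b, s, t)) for a, b in equations]
            subst = {v: replace(u, s, t) for v, u in subst.items()}
            subst[s] = t
        elif s[0] == t[0] and len(s) == len(t):
            # Same outer symbol: equate the corresponding arguments.
            equations = list(zip(s[1:], t[1:])) + equations
        else:
            return None  # symbol clash: no unifier
    return subst


def apply(subst, term):
    """Apply the substitution once (simultaneously) to a term."""
    if is_var(term):
        return subst.get(term, term)
    return (term[0],) + tuple(apply(subst, arg) for arg in term[1:])


def variables(term):
    """Set of variables occurring in a term."""
    if is_var(term):
        return {term}
    return set().union(*(variables(arg) for arg in term[1:]))


sigma = unify_no_occur_check(("p", "z", ("f", "x", "y")),
                             ("p", ("f", "z", "y"), "z"))
x_sigma = apply(sigma, "x")        # the term z
x_sigma2 = apply(sigma, x_sigma)   # the term f(z, y)
```

Here `sigma` binds `z` to `f(z, y)` and `x` to `z`, and the variable sets of `x_sigma` and `x_sigma2` differ, which is exactly the failure of variable-idempotence noted above.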

7.2. Idempotence

Theorem 20 states that aunify is idempotent. The only previous result concerning the idempotence of aunify is given in the thesis of Langen [15, Theorem 32]. However, Langen proves only a weak form of idempotence. Restating his result using our notation, he proved:

$$\mathrm{aunify}(\mathit{sh}, e) \preceq_{SS} \mathrm{aunify}\bigl(\mathrm{aunify}(\mathit{sh}, e), e\bigr).$$

Langen provides an example [15, page 56] to explain why the $\preceq_{SS}$ in the above statement cannot be replaced by $=$. However, we have shown that aunify is indeed fully idempotent, and there appears to be an error in the analysis of this example in [15].

7.3. Commutativity

In the thesis of Langen the "proof" of commutativity of amgu (see footnote 5) has a number of omissions and errors [15, Lemma 30]. We highlight here one of the errors. To make it easier to compare, we adapt our notation and define amge only in the case that $a$ is a variable:

$$\mathrm{amge}(a, b, \mathit{sh}) \stackrel{\mathrm{def}}{=} \mathrm{amgu}(\mathit{sh}, a \mapsto b).$$

To prove the lemma, it has to be shown that

$$\mathrm{amge}\bigl(a_2, b_2, \mathrm{amge}(a_1, b_1, \mathit{sh})\bigr) = \mathrm{amge}\bigl(a_1, b_1, \mathrm{amge}(a_2, b_2, \mathit{sh})\bigr)$$

holds when $a_1$ and $a_2$ are variables. This corresponds to "the second base case" of the proof.

5 In fact the commutativity of a related concept, amge, is proved in this lemma, but the main part concerns the commutativity of amgu and it is this part that is in error.

We use Langen's terminology: a set of variables $X$ is at a term $t$ iff $\mathit{var}(t) \cap X \neq \emptyset$. A set of variables $X$ is at $i$ iff $X$ is at $a_i$ or $b_i$. A union $X \cup_i Y$ is of Type $i$ iff $X$ is at $a_i$ and $Y$ is at $b_i$. Let

$$\begin{aligned}
\mathit{lhs} &\stackrel{\mathrm{def}}{=} \mathrm{amge}\bigl(a_2, b_2, \mathrm{amge}(a_1, b_1, \mathit{sh})\bigr), \\
\mathit{rhs} &\stackrel{\mathrm{def}}{=} \mathrm{amge}\bigl(a_1, b_1, \mathrm{amge}(a_2, b_2, \mathit{sh})\bigr).
\end{aligned}$$

Let $Z \in \mathit{lhs}$ and $T \stackrel{\mathrm{def}}{=} \mathrm{amge}(a_1, b_1, \mathit{sh})$. Consider the case when

$$Z = X \cup_2 Y \ \text{where}\ X \in \mathrm{rel}(a_2, T),\ Y \in \mathrm{rel}(b_2, T),$$

$$X = U \cup_1 V \ \text{where}\ U \in \mathrm{rel}(a_1, \mathit{sh}),\ V \in \mathrm{rel}(b_1, \mathit{sh}),$$

and $U \cap \bigl(\mathit{vars}(a_2) \cup \mathit{vars}(b_2)\bigr) = \emptyset$ (that is, $U$ is not at 2). Then the following quote [15, page 53, line 23] applies:

"In this case $(U \cup_1 V) \cup_2 Y = U \cup_1 (V \cup_2 Y)$. By the inductive assumption $V \cup_2 Y$ is in the rhs and therefore so is $Z$."

We give a counter-example to the statement "$V \cup_2 Y$ is in the rhs". Suppose $a_1, b_1, a_2, b_2$ are variables. We let each of $a_1, b_1, a_2, b_2$ denote both the actual variable and the singleton set containing that variable. Suppose

$$\mathit{sh} = \{a_1,\ b_1 a_2,\ b_2\}.$$

Then, from the definition of amge,

$$\mathit{lhs} = \{a_1 b_1 a_2 b_2\}, \qquad \mathit{rhs} = \{a_1 b_1 a_2 b_2\}, \qquad T = \{a_1 b_1 a_2,\ b_2\}.$$

Let

$$Z = a_1 b_1 a_2 b_2, \quad X = a_1 b_1 a_2, \quad Y = b_2, \quad U = a_1, \quad V = b_1 a_2.$$

It can be seen that these match all the above conditions. However, $V \cup_2 Y = b_1 a_2 b_2$ and this is not in $\{a_1 b_1 a_2 b_2\}$.
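The sets in this counter-example are small enough to recompute directly. The Python sketch below does so under the assumption that amge(a, b, sh) = amgu(sh, a ↦ b), with amgu built from the irrelevant groups and the binary union of the closures of the relevant ones, as in (6.3). The encoding of sharing groups as frozensets of variable names and every identifier in the code are illustrative choices, not notation from [15].

```python
def star(sh):
    """Closure of a set of sharing groups under union."""
    closure = set(sh)
    changed = True
    while changed:
        changed = False
        for a in list(closure):
            for b in list(closure):
                if (a | b) not in closure:
                    closure.add(a | b)
                    changed = True
    return closure


def amge(a, b, sh):
    """amge(a, b, sh) for variables a and b, i.e. amgu(sh, a -> b)."""
    irrelevant = {s for s in sh if a not in s and b not in s}
    at_a = star({s for s in sh if a in s})
    at_b = star({s for s in sh if b in s})
    return irrelevant | {s | t for s in at_a for t in at_b}


def group(*names):
    """A sharing group, written a1b1a2 in the text, as a frozenset."""
    return frozenset(names)


sh = {group("a1"), group("b1", "a2"), group("b2")}

T = amge("a1", "b1", sh)
lhs = amge("a2", "b2", T)
rhs = amge("a1", "b1", amge("a2", "b2", sh))

V = group("b1", "a2")
Y = group("b2")
v_union_y_in_rhs = (V | Y) in rhs
```

With these definitions `lhs` and `rhs` coincide, as commutativity requires, while `v_union_y_in_rhs` is false, so the step quoted from [15] does not go through even though the lemma's conclusion holds for this input.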

REFERENCES

1. R. Bagnara, P. M. Hill, and E. Zaffanella. Set-sharing is redundant for pair-sharing. In P. Van Hentenryck, editor, Static Analysis: Proceedings of the 4th International Symposium, volume 1302 of Lecture Notes in Computer Science, pages 53-67, Paris, France, 1997. Springer-Verlag, Berlin.
2. R. Bagnara, P. M. Hill, and E. Zaffanella. The correctness of set-sharing. Quaderno 172, Dipartimento di Matematica, Università di Parma, 1998.
3. M. Bruynooghe and M. Codish. Freeness, sharing, linearity and correctness - All at once. In P. Cousot, M. Falaschi, G. Filé, and A. Rauzy, editors, Static Analysis, Proceedings of the Third International Workshop, volume 724 of Lecture Notes in Computer Science, pages 153-164, Padova, Italy, 1993. Springer-Verlag, Berlin. An extended version is available as Technical Report CW 179, Department of Computer Science, K.U. Leuven, September 1993.
4. M. Bruynooghe, M. Codish, and A. Mulkers. Abstract unification for a composite domain deriving sharing and freeness properties of program variables. In F. S. de Boer and M. Gabbrielli, editors, Verification and Analysis of Logic Languages, Proceedings of the W2 Post-Conference Workshop, International Conference on Logic Programming, pages 213-230, Santa Margherita Ligure, Italy, 1994.
5. K. L. Clark. Negation as failure. In H. Gallaire and J. Minker, editors, Logic and Databases, pages 293-322, Toulouse, France, 1978. Plenum Press.
6. M. Codish, D. Dams, G. Filé, and M. Bruynooghe. Freeness analysis for logic programs - and correctness? In D. S. Warren, editor, Logic Programming: Proceedings of the Tenth International Conference on Logic Programming, MIT Press Series in Logic Programming, pages 116-131, Budapest, Hungary, 1993. The MIT Press. An extended version is available as Technical Report CW 161, Department of Computer Science, K.U. Leuven, December 1992.
7. A. Colmerauer. Prolog and infinite trees. In K. L. Clark and S.-Å. Tärnlund, editors, Logic Programming, APIC Studies in Data Processing, volume 16, pages 231-251. Academic Press, New York, 1982.
8. A. Colmerauer. Equations and inequations on finite and infinite trees. In Proceedings of the International Conference on Fifth Generation Computer Systems (FGCS'84), pages 85-99, Tokyo, Japan, 1984. ICOT.
9. D. Jacobs and A. Langen. Accurate and efficient approximation of variable aliasing in logic programs. In E. L. Lusk and R. A. Overbeek, editors, Logic Programming: Proceedings of the North American Conference, MIT Press Series in Logic Programming, pages 154-165, Cleveland, Ohio, USA, 1989. The MIT Press.
10. D. Jacobs and A. Langen. Static analysis of logic programs for independent AND parallelism. Journal of Logic Programming, 13(2&3):291-314, 1992.
11. J. Jaffar, J.-L. Lassez, and M. J. Maher. Prolog-II as an instance of the logic programming scheme. In M. Wirsing, editor, Formal Descriptions of Programming Concepts III, pages 275-299. North Holland, 1987.
12. T. Keisu. Tree Constraints. PhD thesis, The Royal Institute of Technology, Stockholm, Sweden, May 1994. Also available in the SICS Dissertation Series: SICS/D-16-SE.
13. A. King. A synergistic analysis for sharing and groundness which traces linearity. In D. Sannella, editor, Proceedings of the Fifth European Symposium on Programming, volume 788 of Lecture Notes in Computer Science, pages 363-378, Edinburgh, UK, 1994. Springer-Verlag, Berlin.
14. A. King and P. Soper. Depth-k sharing and freeness. In P. Van Hentenryck, editor, Logic Programming: Proceedings of the Eleventh International Conference on Logic Programming, MIT Press Series in Logic Programming, pages 553-568, Santa Margherita Ligure, Italy, 1994. The MIT Press.
15. A. Langen. Static Analysis for Independent And-Parallelism in Logic Programs. PhD thesis, Computer Science Department, University of Southern California, 1990. Printed as Report TR 91-05.
16. M. J. Maher. Complete axiomatizations of the algebra of finite, rational and infinite trees. In Proceedings of the Third Annual Symposium on Logic in Computer Science (LICS '88), pages 348-357, 1988.
17. A. Martelli and U. Montanari. An efficient unification algorithm. ACM Transactions on Programming Languages and Systems (TOPLAS), 4(2):258-282, 1982.
18. K. Muthukumar and M. Hermenegildo. Compile-time derivation of variable dependency using abstract interpretation. Journal of Logic Programming, 13(2&3):315-347, 1992.
19. J. A. Robinson. A machine-oriented logic based on the resolution principle. Journal of the ACM, 12(1):23-41, 1965.