On the Word, Subsumption, and Complement Problem for Recurrent Term Schematizations*

Miki Hermann¹ and Gernot Salzer²

¹ LORIA (CNRS), BP 239, 54506 Vandœuvre-lès-Nancy, France. [email protected]
² Technische Universität Wien, Karlsplatz 13, 1040 Wien, Austria. [email protected]

Abstract. We investigate the word and the subsumption problem for recurrent term schematizations, which are a special type of constraints based on iteration. By means of unification, we reduce these problems to a fragment of Presburger arithmetic. Our approach is applicable to all recurrent term schematizations having a finitary unification algorithm. Furthermore, we study a particular form of the complement problem. Given a finite set of terms, we ask whether its complement can be finitely represented by schematizations, using only the equality predicate without negation. The answer is negative as there are ground terms too complex to be represented by schematizations with limited resources.

1 Introduction

Infinite sets of first-order terms with structural similarities appear frequently in several branches of automated deduction, like logic programming, model building, term rewriting, equational unification, or clausal theorem proving. They are usually produced by saturation-based procedures, like equational completion or hyper-resolution. A usual requirement for effective use of such sets is the possibility to handle them by finite means. There exist several approaches to cope with this phenomenon, like lazy evaluation, set constraints, or term schematizations. Lazy evaluation usually does not combine well with unification or other operations. Set constraints allow one to describe regular sets of first-order terms, using the potential of regular tree grammars and tree automata, and having the good properties of regular tree languages. Schematizations exploit the recurring term structure in infinite sets, as produced by self-resolving clauses or by self-overlapping rewrite rules.

Several formalisms for recurrent term schematizations were introduced within the last years. They rely on the same principle, namely the iteration of first-order contexts, but differ in their expressive power. The main concern in this work is the decidability of unification and the construction of finite complete sets of unifiers. Formalisms satisfying these requirements are ρ-terms [CH95], I-terms [Com95], R-terms [Sal92], and primal grammars [HG97], all of them with a finitary unification algorithm. Set operations were studied in [AHL97].

* This work was done while the second author was visiting LORIA. His visit was funded by Université Henri Poincaré, Nancy 1.


Applications of recurrent schematizations are quite rare and mostly theoretical, like in model building [Pel97] or cycle unification [Sal94]. One reason is that there are still some open problems to be solved prior to a successful implementation. A sine qua non of automated deduction is redundancy elimination. The elementary tools in this respect are testing for equality and subsumption. In other words, we need to solve the word problem and the subsumption problem for recurrent term schematizations. Moreover, only positive set operations were studied in [AHL97] without considering the complement. Complement building is interesting from the algebraic and logic point of view, e.g., during construction of counter-examples or for quantifier elimination.

In the first part of the paper, we investigate the word and the subsumption problem for primal grammars. By means of unification, we reduce them to a problem in Presburger arithmetic. Our approach is applicable to all recurrent term schematizations having a finitary unification algorithm. In the second part, we study a particular form of the complement problem. Given a finite set of terms, we ask whether its complement can be represented finitely by schematizations, using only the equality predicate without negation. The answer is negative as there are ground first-order terms too complex to be represented by primal grammars with limited resources.

2 Term schematizations

2.1 Syntax

The language of primal terms is based on four kinds of symbols: first-order variables $V$, counter variables $C$, function symbols $F_p$ of arities $p \ge 0$, and defined symbols $D_{q,p}$ of counter arities $q \ge 1$ and first-order arities $p \ge 0$. Nullary function symbols are called constants. The set of all function and defined symbols is denoted by $F$ and $D$, respectively.

Let $\mathbb{N}$ be the set of natural numbers. The set of counter expressions $L$ is the set of linear expressions over $C$ with coefficients in $\mathbb{N}$. It can be defined inductively as the smallest set satisfying the following conditions.

– $\mathbb{N} \subseteq L$ and $C \subseteq L$
– $(c \cdot l) \in L$ if $c \in \mathbb{N}$ and $l \in L$
– $(l_1 + l_2) \in L$ if $l_1, l_2 \in L$

Two counter expressions are considered equal if they are equivalent with respect to the usual equalities of addition and multiplication. Furthermore, we drop parentheses where possible and do not distinguish between natural numbers and their symbolic representation.

The set of primal terms $P$ is defined inductively as the smallest set satisfying the following conditions.

– $V \subseteq P$
– $f(\vec t) \in P$ if $f \in F_p$ and $\vec t \in P^p$
– $\hat f(\vec l, \vec t) \in P$ if $\hat f \in D_{q,p}$, $\vec l \in L^q$, and $\vec t \in P^p$

The sets of counter variables and first-order variables of a primal term $t$ are denoted by $\mathit{CVar}(t)$ and $\mathit{Var}(t)$, respectively.

Example 1. Let $x \in V$, $a \in F_0$, $h \in F_2$, $m, n \in C$, $\hat f \in D_{1,1}$, and $\hat g \in D_{2,0}$. Then $h(\hat f(3m + 2, \hat f(5, h(a, x))), \hat g(m, m + n))$ is a primal term.
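To make the inductive definition concrete, the following Python sketch encodes the primal term of Example 1 and collects its first-order and counter variables. The tuple encoding, the tags `var`, `fun`, `def`, and the function names are assumptions of this sketch, not notation taken from the paper.

```python
# Hypothetical encoding (not from the paper):
#   counter expression: dict {counter variable name: coefficient}, with the
#                       key 1 holding the constant part, e.g. 3m + 2 -> {"m": 3, 1: 2}
#   ("var", name)                            first-order variable
#   ("fun", symbol, [subterms])              function symbol (constants: empty list)
#   ("def", symbol, [counters], [subterms])  defined symbol

def first_order_vars(t):
    """Collect the first-order variables of a primal term."""
    if t[0] == "var":
        return {t[1]}
    subterms = t[2] if t[0] == "fun" else t[3]
    return set().union(*(first_order_vars(u) for u in subterms))

def counter_vars(t):
    """Collect the counter variables of a primal term."""
    if t[0] == "var":
        return set()
    if t[0] == "fun":
        found, subterms = set(), t[2]
    else:
        # counter variables occurring in the counter arguments themselves
        found = {v for expr in t[2] for v in expr if v != 1}
        subterms = t[3]
    for u in subterms:
        found |= counter_vars(u)
    return found

# The term h(f^(3m + 2, f^(5, h(a, x))), g^(m, m + n)) of Example 1.
a = ("fun", "a", [])
x = ("var", "x")
inner = ("def", "f^", [{1: 5}], [("fun", "h", [a, x])])
outer = ("def", "f^", [{"m": 3, 1: 2}], [inner])
g = ("def", "g^", [{"m": 1}, {"m": 1, "n": 1}], [])
example = ("fun", "h", [outer, g])
```

Under this encoding, `first_order_vars(example)` and `counter_vars(example)` play the roles of $\mathit{Var}(t)$ and $\mathit{CVar}(t)$.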

2.2 Semantics

In the sequel, we assume that the reader is familiar with the basic notions of term rewriting. With each defined symbol $\hat f \in D_{q,p}$, we associate two rewrite rules

$\hat f(0, \vec n, \vec x) \to r_1^{\hat f}$ and $\hat f(m + 1, \vec n, \vec x) \to r_2^{\hat f}[\hat f(m, \vec n + \vec\delta, \vec x)]_A$,

where

– $m, \vec n$ and $\vec x$ are counter variables and first-order variables, respectively, i.e., $(m, \vec n) \in C^q$ and $\vec x \in V^p$
– $r_1^{\hat f}$ and $r_2^{\hat f}$ are primal terms, whose variables are among those of the left-hand sides of the rules, i.e., $r_1^{\hat f} \in P$, $\mathit{Var}(r_1^{\hat f}) \subseteq \vec x$, $\mathit{CVar}(r_1^{\hat f}) \subseteq \vec n$, and $r_2^{\hat f} \in P$, $\mathit{Var}(r_2^{\hat f}) \subseteq \vec x$, $\mathit{CVar}(r_2^{\hat f}) \subseteq \{m\} \cup \vec n$
– all defined symbols in $r_1^{\hat f}$ and $r_2^{\hat f}$ are smaller than $\hat f$ with respect to a given precedence relation on the defined symbols
– $A$ is a set of independent first-order positions of $r_2^{\hat f}$ without the root position
– $\vec\delta$ is either the null vector or a $k$-dimensional unit vector, i.e., all components of $\vec\delta$ are zero except one which may be zero or one.

The first-order positions are those not below a defined symbol. Formally, the set of first-order positions is defined recursively by the following equations.

– $\mathit{Pos}(x) = \{\varepsilon\}$ for $x \in V$,
– $\mathit{Pos}(\hat f(\cdots)) = \{\varepsilon\}$ for $\hat f \in D$, and
– $\mathit{Pos}(f(t_1, \ldots, t_p)) = \{\varepsilon\} \cup \bigcup_{i=1}^{p} \{i.a \mid a \in \mathit{Pos}(t_i)\}$ for $f \in F_p$.

Two positions are independent if none is a prefix of the other. Let $R$ be the set of all rewrite rules associated with the defined symbols. The rewrite relation $\to_R$ generated by $R$ is the smallest relation that contains $R$, and is closed under congruence and substitution. By $t{\downarrow}_R$ we denote the normal form of $t$ with respect to $R$. Note that $t{\downarrow}_R$ is a first-order term if $t$ contains no counter variables. The first-order terms represented by a primal term $t$ are defined as $L(t) = \{t\xi{\downarrow}_R \mid \xi\colon C \to \mathbb{N}\}$. Two primal terms $s$ and $t$ are weakly equivalent, if $L(s) = L(t)$. They are (strongly) equivalent, denoted by $s \doteq t$, if $s\xi{\downarrow}_R = t\xi{\downarrow}_R$ holds for all substitutions $\xi\colon C \to \mathbb{N}$. Obviously, equivalence implies weak equivalence.

Example 2. Let $x \in V$, $a \in F_0$, $f \in F_1$, $m, n \in C$, $\hat f \in D_{1,0}$, and $\hat g \in D_{1,1}$. Consider the primal terms $s = f(\hat f(n))$ and $t = \hat g(n+1, f(\hat g(n, a)))$, where

$\hat f(0) \to f(a)$, $\hat f(n+1) \to f(f(\hat f(n)))$, $\hat g(0, x) \to x$, $\hat g(n+1, x) \to f(\hat g(n, x))$.

The terms $s$ and $t$ are strongly equivalent. Moreover, the schematized sets $L(s)$ and $L(t)$ are equal: with the rules above, both terms normalize to $f^{2n+2}(a)$, hence $L(s) = L(t) = \{f^{2n+2}(a) \mid n \in \mathbb{N}\}$. On the other hand, the terms $f(\hat f(m))$ and $f(\hat f(n))$ are weakly but not strongly equivalent.
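The equivalence claimed in Example 2 can be spot-checked by instantiating the counter $n$ and normalizing both terms. The sketch below does this in Python; it assumes an encoding in which the ground term $f^k(a)$ is represented by the integer $k$, which suffices here because the signature of the example contains only the unary symbol $f$ and the constant $a$. A finite test of this kind is only a sanity check, not a decision procedure for the word problem.

```python
# Ground terms over the unary symbol f and the constant a are encoded by
# their height: the integer k stands for f^k(a). (Encoding assumed here.)

def f_hat(n):
    # f^(0) -> f(a);  f^(n+1) -> f(f(f^(n)))
    return 1 if n == 0 else 2 + f_hat(n - 1)

def g_hat(n, x):
    # g^(0, x) -> x;  g^(n+1, x) -> f(g^(n, x))
    return x if n == 0 else 1 + g_hat(n - 1, x)

def s(n):
    # s = f(f^(n))
    return 1 + f_hat(n)

def t(n):
    # t = g^(n+1, f(g^(n, a)))
    return g_hat(n + 1, 1 + g_hat(n, 0))

# Heights of the normal forms for n = 0, ..., 5.
heights_s = [s(n) for n in range(6)]
heights_t = [t(n) for n in range(6)]
```

Both lists agree, which is what strong equivalence requires for every value of $n$. For the pair $f(\hat f(m))$ and $f(\hat f(n))$, the two counters can be instantiated independently, so `s(0)` and `s(1)` already differ although both terms describe the same set.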

2.3 Unification

A substitution is a mapping $\sigma\colon (V \cup C) \to (P \cup L)$, which is well-typed and whose domain is finite, i.e., $\sigma(x) \in P$ for $x \in V$, $\sigma(n) \in L$ for $n \in C$, and $\mathit{dom}(\sigma) = \{v \in (V \cup C) \mid \sigma(v) \ne v\}$ is finite. As usual, we extend substitutions homomorphically to primal terms and counter expressions. The application of $\sigma$ to a term $t$ is written as $t\sigma$; the composition of two substitutions $\sigma, \tau$ is written as $\sigma\tau$ with the understanding that $t\sigma\tau = (t\sigma)\tau$ for all terms $t$. We denote $\sigma$ by the set $\{v \mapsto v\sigma \mid v \in \mathit{dom}(\sigma)\}$. Normalization is extended to substitutions in the natural way, i.e., $\sigma{\downarrow}_R = \{v \mapsto v\sigma{\downarrow}_R \mid v \in \mathit{dom}(\sigma)\}$.

A substitution $\sigma$ is a unifier of two primal terms $s$ and $t$ iff for all $\xi\colon C \to \mathbb{N}$ the first-order substitution $\sigma\xi{\downarrow}_R$ unifies the first-order terms $s\xi{\downarrow}_R$ and $t\xi{\downarrow}_R$. A set of unifiers $\Sigma$ is complete iff for every counter substitution $\xi$ there exists $\sigma \in \Sigma$, such that $\sigma\xi{\downarrow}_R$ is a most general unifier of $s\xi{\downarrow}_R$ and $t\xi{\downarrow}_R$. Note that $\sigma$ is a unifier of $s$ and $t$ iff $s\sigma \doteq t\sigma$, i.e., our notion of unifiability corresponds to the standard one in unification theory. This is not true for completeness: a unifier need not be an instance of any substitution in a given complete set of unifiers.

Unification of primal terms is decidable and finitary, i.e., for any pair of primal terms there exists a finite set of unifiers which is complete. Moreover, complete sets of unifiers can be effectively computed [HG97].

2.4 First-order formulas

In this paper, we use first-order formulas to define the word problem in a concise way and to compare different notions of subsumption. Quantified counter variables are interpreted over the domain of natural numbers, quantified first-order variables over the Herbrand universe with respect to the underlying set of function symbols. Free variables are treated as constants.

Additionally, we use vectors and notations from linear algebra as a compact representation of similar objects. For example, $\vec x \doteq \vec s(\vec k)$ stands for a set of equations of the form $x \doteq s(\vec k)$, where $x$ is a variable from $\vec x$ and $s \in \vec s$ is a term containing variables $k_1, k_2, \ldots$ from $\vec k$. Furthermore, $\{\vec n \mapsto C\vec k + \vec c\}$ represents the substitution replacing each variable in $\vec n$ by the corresponding row in the vector of linear expressions, which is obtained by multiplying the matrix $C$ of natural numbers by the vector $\vec k$ of counter variables and adding the vector $\vec c$.

Let $s$ and $t$ be primal terms containing the variables $\vec x = \mathit{Var}(s)$, $\vec y = \mathit{Var}(t)$, $\vec m = \mathit{CVar}(s)$ and $\vec n = \mathit{CVar}(t)$. A complete set of unifiers for $s$ and $t$ can be considered as a solved form of the equation $s \doteq t$ in the following way. A unifier $\sigma = \{\vec x \mapsto \vec s'(\vec k), \vec y \mapsto \vec t'(\vec k), \vec m \mapsto C\vec k + \vec c, \vec n \mapsto D\vec k + \vec d\}$, where $\vec k$ are auxiliary counter variables introduced during unification, corresponds to the formula

$\Phi_\sigma(\vec x, \vec y, \vec m, \vec n) = \exists \vec k\, \bigl(\vec x \doteq \vec s'(\vec k) \wedge \vec y \doteq \vec t'(\vec k) \wedge \vec m = C\vec k + \vec c \wedge \vec n = D\vec k + \vec d\bigr)$.

Note that unification does not introduce auxiliary first-order variables. However, $\vec s'$ and $\vec t'$ may contain variables from $\vec x$ and $\vec y$; in this case these variables do not occur in the domain of the substitution. The formula associated with a complete set of unifiers $\Sigma$ is the disjunction of the formulas corresponding to the single unifiers: $\Phi_\Sigma(\vec x, \vec y, \vec m, \vec n) = \bigvee_{\sigma \in \Sigma} \Phi_\sigma(\vec x, \vec y, \vec m, \vec n)$. Therefore the formulas $s \doteq t$ and $\Phi_\Sigma(\vec x, \vec y, \vec m, \vec n)$ are equivalent.

2.5 Miscellaneous notations

If $t$ is a primal term and $A \subseteq \mathit{Pos}(t)$ is a set of independent first-order positions, then $t[\diamond]_A$ is called a context. If $s$ is a context and $t$ is a context or primal term, then the concatenation of $s$ and $t$, denoted by $s \cdot t$, is the context or primal term $s\{\diamond \mapsto t\}$. Concatenation is associative, hence we drop parentheses where possible. The empty context $\diamond$ serves as unit element with respect to concatenation. Exponentiation is defined by $s^0 = \diamond$ and $s^{i+1} = s \cdot s^i$.

The depth of a primal term $t$, denoted by $\mathit{depth}(t)$, is recursively defined as $\mathit{depth}(t) = 0$ for $t \in (V \cup F_0)$, and $\mathit{depth}(f(\vec t)) = \mathit{depth}(\hat f(\vec l, \vec t)) = 1 + \mathit{depth}(\vec t)$ for $f \in F_p$ ($p > 0$) and $\hat f \in D$. The depth of a set or vector of terms $\vec t$ is defined as $\mathit{depth}(\vec t) = \max\{\mathit{depth}(t) \mid t \in \vec t\}$. The depth of the set of rewrite rules $R$ associated with $D$ is the depth of the set of all right-hand sides: $\mathit{depth}(R) = \mathit{depth}(\{r_1^{\hat f}, r_2^{\hat f}[\hat f(m, \vec n + \vec\delta, \vec x)]_A \mid \hat f \in D\})$.

3 Redundancy elimination

Recurrent term schematizations are of potential use in all areas concerned with first-order terms, mostly in automated deduction, like term rewriting with equational completion and proofs by consistency, or clausal theorem proving. A ubiquitous problem appearing there is the duplication of objects. Redundancy elimination therefore plays a vital role. In the simplest case, we need to maintain the set property, where no element (term, clause, literal) may occur twice. Another case of redundancy is the presence of two elements, where one is an instance of the other. In the first case we have to solve the word problem, i.e., to determine whether two terms $s$ and $t$ represent the same object in the underlying theory. The latter case is usually referred to as the subsumption problem.

Example 3. Consider the rewrite system $fgfx \to gfx$. Its completion produces the infinite set of rules $\{fg^n fx \to g^n fx \mid n \in \mathbb{N}\}$. This set can be presented by the primal term (as a rewrite rule) $f\hat g(n, fx) \to \hat g(n, fx)$, where $R = \{\hat g(0, x) \to gx,\ \hat g(k + 1, x) \to g(\hat g(k, x))\}$. The completion procedure continues to work with this new rewrite rule in the signature extended by the defined symbols and produces the rule $f\hat g(n, \hat g(n', fx)) \to \hat g(n, \hat g(n', fx))$. This rule is redundant, but we cannot determine it syntactically. To do so, we need to show that the following formula is valid:

$\forall n\, \forall n'\, \exists k\, \exists x\ \bigl(f\hat g(k, fy) \to \hat g(k, fy)\bigr) \doteq \bigl(f\hat g(n, \hat g(n', fx)) \to \hat g(n, \hat g(n', fx))\bigr)$.

This is just a subsumption test for the newly produced rewrite rule. One way to show the validity is to prove that the word problem $\forall n\, \forall n'\, (\hat g(n, \hat g(n', x)) \doteq \hat g(n + n', x))$ holds in the equational theory of $R$ and that $\forall n, n'\, \exists k\, (n + n' = k)$ holds.

Example 4. Another example for redundancy elimination is the check for instances of the identity axiom. Consider the rewrite system

$\{\hat f(0) \to f(a),\ \hat f(n + 1) \to f(f(\hat f(n))),\ \hat g(0, x) \to x,\ \hat g(n + 1, x) \to f(\hat g(n, x))\}$.

Suppose that we generate during a deduction process the equation $f(\hat f(n)) = \hat g(n + 1, f(\hat g(n, a)))$. To verify that it is an instance of the identity axiom, we need to solve the word problem $\forall n\, (f(\hat f(n)) \doteq \hat g(n + 1, f(\hat g(n, a))))$.

3.1 Word problem

Definition 5. The word problem for two primal terms $s$ and $t$ is the question whether the formula $\forall \vec n\, (s \doteq t)$ is valid in the equational theory generated by $R$, where $\vec n = \mathit{CVar}(s) \cup \mathit{CVar}(t)$.

One possibility to solve the word problem is to reduce $s$ and $t$ to unique normal forms, followed by a check whether the latter are syntactically equal. This approach is described for R-strings in [Sal91]. In this paper, we choose a different approach: we transform the word problem to a unification problem and a subsequent problem in Presburger arithmetic. The first method is efficient but works only if we can define a unique normal form. In general, there is no obvious way of defining the normal form of a primal term. Our approach does not depend on a specific syntactic representation for schematizations, but requires only the existence of a finitary and terminating unification algorithm. Therefore, our method is applicable to all known recurrent schematizations, i.e., to ρ-terms, I-terms, R-terms, and primal grammars. We proceed in three steps.

1. Elimination of first-order variables: replace all first-order variables by new constants. Observe that the formula $\forall \vec n\, (s \doteq t)$ is valid if and only if the corresponding formula $\forall \vec n\, (s^\ast \doteq t^\ast)$ is valid, where the terms $s^\ast, t^\ast$ are obtained from the terms $s, t$ by replacing each first-order variable $x$ by a new constant $c_x$.
2. Unification: solve the equation $s^\ast \doteq t^\ast$. We solve the equation $s^\ast \doteq t^\ast$ by means of unification. Note that a finitary and terminating unification algorithm exists for all four known recurrent schematizations. This means that the output of the unification algorithm is a finite disjunction of formulas $\exists \vec k\, (\vec n = N_i \vec k + \vec d_i)$, where $N_i$ and $\vec d_i$ are a matrix and a vector of non-negative integers, respectively, and $\vec k$ are new counter variables introduced during unification. The resulting formula $\Phi(\vec n) = \exists \vec k \bigvee_i (\vec n = N_i \vec k + \vec d_i)$ contains only counter variables, since there are no first-order variables in $s^\ast$ and $t^\ast$.
3. Validity check: check whether the formula $\forall \vec n\, \Phi(\vec n)$ is valid. The formula $\Phi(\vec n)$ represents a complete set of unifiers, one per disjunct, of the problem $s^\ast \doteq t^\ast$. To show that the universally quantified formula $\forall \vec n\, (s^\ast \doteq t^\ast)$ is valid, we need to prove that the unifiers from $\Phi(\vec n)$ cover the whole Cartesian product $\mathbb{N}^{|\vec n|}$. By correctness of the applied unification algorithm, the formulas $\forall \vec n\, (s^\ast \doteq t^\ast)$ and $\forall \vec n\, \Phi(\vec n)$ are equivalent. The latter expression is a $\Pi_2$-formula of Presburger arithmetic and can be solved by usual methods [Coo72]. For complexity issues see Section 3.3.
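The validity check in the last step can be illustrated on a deliberately restricted special case. The Python sketch below assumes a single counter variable $n$ and disjuncts of the form $\exists k\, (n = a_i k + d_i)$ with one auxiliary variable each; the function name and the pair encoding `(a, d)` are assumptions of this sketch. It exploits the fact that, beyond the largest offset $d_i$, membership in the union of the progressions is periodic with period $\operatorname{lcm}$ of the non-zero $a_i$, so a finite range of values suffices. The general multi-dimensional case requires a decision procedure for Presburger arithmetic and is not covered here.

```python
from math import gcd

def covers_all_naturals(disjuncts):
    """Decide  forall n exists k : OR_i (n = a_i * k + d_i)  over the naturals.

    disjuncts is a list of pairs (a, d) of non-negative integers, each
    standing for the disjunct  n = a * k + d  with k >= 0.
    """
    if not disjuncts:
        return False
    # least common multiple of the non-zero step widths
    period = 1
    for a, _ in disjuncts:
        if a > 0:
            period = period * a // gcd(period, a)
    # above max(d) coverage depends only on n modulo the period
    bound = max(d for _, d in disjuncts) + period

    def covered(n):
        return any(
            (n == d) if a == 0 else (n >= d and (n - d) % a == 0)
            for a, d in disjuncts
        )

    return all(covered(n) for n in range(bound + 1))
```

For instance, the disjuncts `[(2, 0), (2, 1)]` (even or odd) cover all natural numbers, whereas `[(2, 0), (3, 1)]` miss the value 3.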

3.2 Subsumption problem

In the first-order case, a term $s$ subsumes a term $t$ if there exists a substitution $\sigma$, such that $s\sigma = t$. In the free algebra, this is equivalent to $\exists \vec x\, (s = t)$, where $\vec x = \mathit{Var}(s)$. An alternative definition is that the formula $\forall \vec y\, \exists \vec x\, (s = t)$ is valid, where $\vec x = \mathit{Var}(s)$ and $\vec y = \mathit{Var}(t)$. These two definitions are equivalent, except for singular signatures, since in the empty theory (without axioms) validity in the equational theory is equivalent to validity in the inductive theory.

For schematizations, there are several possibilities to define subsumption. Let $s$ and $t$ be two primal terms from a schematization $G$, where $\vec m = \mathit{CVar}(s)$, $\vec n = \mathit{CVar}(t)$, $\vec x = \mathit{Var}(s)$, and $\vec y = \mathit{Var}(t)$. Recall that we check the validity of formulas in the equational theory of $R$, i.e., the free algebra generated by $R$. The possibilities to define that $s$ subsumes $t$ are:

1. The formula $\exists \vec m\, \exists \vec x\, (s \doteq t)$ is valid.
2. The formula $\forall \vec n\, \forall \vec y\, \exists \vec m\, \exists \vec x\, (s \doteq t)$ is valid.
3. The formula $\forall \vec n\, \exists \vec m\, (s \doteq t)$ is valid.
4. The formula $\forall \vec n\, \exists \vec m\, \exists \vec x\, (s \doteq t)$ is valid.

The first two approaches are straightforward extensions of the first-order concept. The second approach does not meet a natural requirement for subsumption, namely independence of the underlying signature. Subsumption should be a local test on two terms independent of other elements. There exist two terms $s$, $t$, such that $s$ subsumes $t$ (according to the second definition) over a signature $F$, but not over an extended signature $F' \supset F$ [AHL97, Example 14]. The same terms also show that the first two subsumption concepts are not equivalent, since there is no substitution $\sigma$, such that $s\sigma \doteq t$, as required by the first concept.

The problems with the second concept originate from quantification over first-order variables. One possibility to avoid them is to quantify only the counter variables, as in the third approach. This concept is not satisfactory either, since it does not capture usual first-order subsumption. When we extend the third concept with usual equational first-order subsumption, we get the fourth concept.

Hence, we have two suitable concepts for subsumption: the first and the last one. Intuitively, the first concept expresses that there is a uniform mapping $\sigma$, relating the terms $s$ and $t$ in the equational theory of the schematization. In particular, for the counter variable vectors $\vec m$ and $\vec n$, this means that $\vec m$ is a linear expression of $\vec n$. In contrast, the fourth concept requires this uniformity only on the first-order level; the vectors $\vec m$ and $\vec n$ need not be related by a linear function. Clearly, the first concept implies the fourth concept. The converse is not true, as the following example shows.

Example 6. Primal grammars can encode arbitrary linear expressions of the form

$c_0 + c_1 k_1 + \cdots + c_n k_n$. A monomial $ck$ can be represented by $\hat g_c(k, a)$, where the underlying rewrite system is

$\hat g_c(0, x) \to x$, $\hat g_c(k + 1, x) \to \underbrace{f(\cdots f}_{c \text{ times}}(\hat g_c(k, x)) \cdots)$.

Addition of monomials is encoded by nesting of defined symbols. Hence, $l_1 = 2m_1 + 3m_2$ is represented by $s = \hat g_2(m_1, \hat g_3(m_2, a))$ and $l_2 = n_1 + 2$ is encoded as $t = \hat g_1(n_1, f(f(a)))$. We show that $s$ subsumes $t$ according to the last concept but not according to the first one. Both problems reduce to purely Diophantine problems upon $l_1$ and $l_2$, following the previously mentioned encoding.

According to the last concept, $s$ subsumes $t$ iff $\forall \vec n\, \exists \vec m\, \exists \vec x\, (s \doteq t)$. This is equivalent to $\forall n_1\, \exists m_1, m_2\, (2m_1 + 3m_2 = n_1 + 2)$, since the problem contains no first-order variables. This formula is valid since $n_1 + 2$ covers all natural numbers greater than 1, and each number except 1 can be written in the form $2m_1 + 3m_2$. Hence, $s$ subsumes $t$.

According to the first concept, $s$ subsumes $t$ iff $\exists \vec m\, \exists \vec x\, (s \doteq t)$ holds. This is equivalent to $\exists m_1\, \exists m_2\, (2m_1 + 3m_2 = n_1 + 2)$. Now suppose that there is a substitution $\sigma = \{m_1 \mapsto q_1 n_1 + d_1,\ m_2 \mapsto q_2 n_1 + d_2\}$, where $q_1$ and $q_2$ are non-negative coefficients. By applying the substitution and regrouping, we obtain the equations $2q_1 + 3q_2 = 1$ and $2d_1 + 3d_2 = 2$. The first equation has no solution in non-negative integers. Hence, there is no such substitution $\sigma$ and the formula $\exists \vec m\, \exists \vec x\, (s \doteq t)$ is not valid.¹

The last subsumption concept encompasses the first one. Moreover, the last concept corresponds to the natural view that schematizations are just a finite representation of infinite sets of first-order terms: $s$ subsumes $t$ if every term represented by $t$ is subsumed by a term represented by $s$. Therefore we adopt the last concept of subsumption.
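The two Diophantine claims of Example 6 are easy to check by brute force. The Python sketch below tests, for a finite range of $n_1$, that $n_1 + 2$ can be written as $2m_1 + 3m_2$, and searches for non-negative coefficients with $2q_1 + 3q_2 = 1$. The function name and the tested range are choices of this sketch; the finite range only illustrates the first claim and does not prove it.

```python
def representable(v):
    """Is v = 2*m1 + 3*m2 for some natural numbers m1, m2?"""
    # try every admissible m2; the remainder must be a non-negative even number
    return any((v - 3 * m2) % 2 == 0 for m2 in range(v // 3 + 1))

# Fourth subsumption concept: every value n1 + 2 must be representable.
fourth_concept_ok = all(representable(n1 + 2) for n1 in range(1000))

# First subsumption concept: m1 and m2 would have to be linear in n1, which
# forces 2*q1 + 3*q2 = 1. Any q1 >= 1 or q2 >= 1 already makes the left-hand
# side at least 2, so searching q1, q2 in {0, 1} is exhaustive.
linear_witnesses = [
    (q1, q2) for q1 in range(2) for q2 in range(2) if 2 * q1 + 3 * q2 == 1
]
```

The list `linear_witnesses` stays empty, which mirrors the argument that no uniform substitution exists, while `representable(1)` is the only failure among the natural numbers.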

Definition 7. Let $s$ and $t$ be primal terms, where $\vec m = \mathit{CVar}(s)$, $\vec n = \mathit{CVar}(t)$, and $\vec x = \mathit{Var}(s)$. The term $s$ subsumes $t$ if the formula $\forall \vec n\, \exists \vec m\, \exists \vec x\, (s \doteq t)$ is valid. A set $S$ subsumes a set $T$ if for each term $t' \in T$ there exists a term $s' \in S$, such that $s'$ subsumes $t'$.

Lemma 8. A primal term s subsumes a primal term t if and only if the set L(s) subsumes the set L(t).

¹ We thank Eric Domenjoud for providing this example.

Similar to the word problem, we want to reduce subsumption to unification. In this way, the algorithm becomes independent of the chosen schematization formalism. We proceed in four steps: we replace certain first-order variables by new constants, apply the unification algorithm, simplify the resulting formula, and check its validity in Presburger arithmetic.

1. Elimination of first-order variables in $t$: replace all first-order variables in $t$ by new constants, producing the term $t^\ast$. The formula $\forall \vec n\, \exists \vec m\, \exists \vec x\, (s \doteq t)$ is valid iff $\forall \vec n\, \exists \vec m\, \exists \vec x\, (s \doteq t^\ast)$ holds, by the way we interpret free variables.
2. Unification: solve the equation $s \doteq t^\ast$ by means of a unification algorithm. Its output can be written as the finite formula

   $\Phi(\vec m, \vec n, \vec x) = \exists \vec k \bigvee_i \bigl(\vec x = \vec u_i(\vec k) \wedge \vec m = M_i \vec k + \vec c_i \wedge \vec n = N_i \vec k + \vec d_i\bigr)$,

   where $\vec k$ are the new counter variables introduced during unification, $M_i$, $N_i$ are matrices of non-negative integers, and $\vec c_i$, $\vec d_i$ are vectors of non-negative integers, for each $i$.
3. Simplification: remove the equations $\vec x = \vec u_i(\vec k)$ and $\vec m = M_i \vec k + \vec c_i$ from the formula $\Phi(\vec m, \vec n, \vec x)$, producing $\Phi'(\vec n)$. Note that $\exists \vec m\, \exists \vec x\, \Phi(\vec m, \vec n, \vec x)$ is equivalent to $\Phi'(\vec n)$, since the variables $\vec m$ and $\vec x$ are existentially quantified and appear only once and separated on the left-hand side of equations.
4. Validity check: check if $\forall \vec n\, \Phi'(\vec n)$ is valid. The result $\forall \vec n\, \exists \vec k \bigvee_i (\vec n = N_i \vec k + \vec d_i)$ belongs to the $\Pi_2$-fragment of Presburger arithmetic.

3.3 Complexity issues

Both the word problem and the subsumption problem reduce in the last step to a $\Pi_2$-formula in Presburger arithmetic. While the complexity of full Presburger arithmetic is at least doubly exponential and Cooper presents in [Coo72] an algorithm of triple exponential complexity, the $\Pi_2$-fragment is only coNP-complete, as was proved by Grädel [Gra88] and Schöning [Sch97]. Our formulas are quite simple and do not cover the whole $\Pi_2$-fragment: they are of the form $\forall \vec n\, \exists \vec k \bigvee_i (\vec n = N_i \vec k + \vec d_i)$, i.e., the formula is in disjunctive normal form and the variables $\vec n$ appear only once, separated on the left-hand side. Therefore we can ask whether our special problems are still coNP-complete. The lower bound reductions used by Grädel and Schöning require more complex formulas. However, following an idea in [Sch97], due to Grädel, we can prove the coNP-hardness of our problems by a reduction from simultaneous incongruences [GJ79]. This NP-complete problem is defined as follows.

SIMULTANEOUS INCONGRUENCES
Instance: Collection $\{(a_1, b_1), \ldots, (a_p, b_p)\}$ of ordered pairs of positive integers, with $a_i \le b_i$, for $1 \le i \le p$.
Question: Is there an integer $n$ such that, for $1 \le i \le p$, $n \not\equiv a_i \pmod{b_i}$?

We use the dual problem to show coNP-hardness. Encoding $n \equiv a_i \pmod{b_i}$ as $\exists k\, (n = b_i k + a_i)$, we obtain the disjunction $\exists k \bigvee_{i=1}^{p} (n = b_i k + a_i)$. The final formula is $\forall n\, \exists k \bigvee_i (n = b_i k + a_i)$, which is of the same type as the formulas obtained from word and subsumption problems.
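For small instances, simultaneous incongruences can be solved by exhaustive search over one full period. The Python sketch below is such a brute-force search; the function name is an assumption of this sketch, and the code tests the congruences $n \equiv a_i \pmod{b_i}$ directly rather than going through the Presburger encoding. Its running time is proportional to the least common multiple of the moduli, which can be exponential in the size of the input, so it says nothing about the complexity bounds discussed above.

```python
from functools import reduce
from math import gcd

def find_incongruent(pairs):
    """Return some n with n not congruent to a (mod b) for every pair (a, b),
    or None if every integer is congruent to some a_i modulo b_i."""
    # the pattern of congruences repeats with the lcm of all moduli
    period = reduce(lambda x, y: x * y // gcd(x, y), (b for _, b in pairs), 1)
    for n in range(period):
        if all(n % b != a % b for a, b in pairs):
            return n
    return None
```

The dual statement, that every $n$ satisfies at least one of the congruences, holds exactly when `find_incongruent` returns `None`.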

Note that in both cases only the problem solved in the last step is coNP-complete. The overall complexity of our algorithms is determined by the complexity of unification. In particular, the cardinality of a minimal complete set of unifiers can be at least exponential [Sal91]; and we have to compute all solutions to obtain the formula. Hence, the formula in the last step can be exponentially longer than the input of the original problem.

4 Complement problem

If $t$ is a first-order term, its Herbrand universe is $H(t) = \{t\sigma \mid \sigma\colon X \to T(F)\}$, the set of the ground instances of $t$ with respect to the underlying signature $F$. Similarly, if $T$ is a set of first-order terms, its Herbrand universe $H(T)$ is the union of the Herbrand universes $H(t)$ for each $t \in T$. For a primal term $t$, its Herbrand universe is the set $H(L(t))$, i.e., the Herbrand universe of the schematized set. Finally, the Herbrand universe of a set of primal terms $T$ is obtained as the union of the Herbrand universes $H(t)$ for each $t \in T$. Given a set of first-order or primal terms $T$, its complement is the set $T^c = T(F) \setminus H(T)$.

A class $C$ is a collection of sets of terms satisfying a common property. For a given class $C$, the complement problem is the question whether for each finite set of terms $T \in C$ there exists a finite set of terms $T' \in C$, such that $H(T') = T^c$ holds. The set $T'$ is called a finite complement representation.

For first-order terms, Lassez and Marriott proved that finite sets of linear terms always have a finite complement representation [LM87]. On the other hand, they showed that this is not true for arbitrary finite sets of first-order terms. Since schematizations were introduced to increase the expressive power of first-order terms, we might expect to be able to represent the complements of non-linear terms by a finite set of primal terms. However, as we show in the sequel, already the very simple non-linear term $f(x, x)$ has no finite complement representation by primal terms.

The potential of primal terms resides in the possibility to generate arbitrarily deep terms by iterating contexts. The expressive power of iteration is limited by the fact that the number of contexts must be finite. The maximal number of consecutive iterations during a reduction of a primal term is measured by the iteration depth. Each iteration terminates with the application of the base rule $\hat f(0, \ldots) \to r_1^{\hat f}$ for some defined symbol $\hat f$. Therefore we can determine the iteration depth by counting the occasions when a variable gets decremented to 0. The iteration depth of a primal term is then the maximum over all reductions. Inspection of the rewrite system $R$ reveals that there is a correspondence between the application of base rules and the number of counter positions present in the primal term: each iteration consumes a counter position.

Definition 9. The iteration depth of a primal term is the function $\iota$ defined recursively as follows:

– $\iota(x) = \iota(a) = 0$ for a first-order variable $x$ and a constant $a$,
– $\iota(f(t_1, \ldots, t_n)) = \max\{\iota(t_i) \mid i = 1, \ldots, n\}$ for an $n$-ary function symbol $f$,
– $\iota(\hat f(\vec c, t_1, \ldots, t_n)) = |\vec c| + \max\{\iota(t_i) \mid i = 1, \ldots, n\}$ for a defined symbol $\hat f$.
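The recursive definition of the iteration depth translates directly into code. The Python sketch below assumes a tuple encoding of primal terms with the tags `var`, `fun`, and `def`, and it reads $|\vec c|$ as the number of counter arguments of the defined symbol; both the encoding and this reading are assumptions of the sketch. The maximum over an empty list of subterms is taken to be 0.

```python
# Hypothetical encoding (not from the paper):
#   ("var", name)                            first-order variable
#   ("fun", symbol, [subterms])              function symbol (constants: empty list)
#   ("def", symbol, [counters], [subterms])  defined symbol with counter arguments

def iteration_depth(t):
    """Static iteration depth of a primal term, following the three cases
    of the recursive definition."""
    kind = t[0]
    if kind == "var":
        return 0
    if kind == "fun":
        return max((iteration_depth(u) for u in t[2]), default=0)
    # defined symbol: number of counter arguments plus the deepest subterm
    return len(t[2]) + max((iteration_depth(u) for u in t[3]), default=0)

# h(f^(3m + 2, f^(5, h(a, x))), g^(m, m + n)); counter expressions are kept
# as plain strings because only their number matters here.
a = ("fun", "a", [])
x = ("var", "x")
inner = ("def", "f^", ["5"], [("fun", "h", [a, x])])
outer = ("def", "f^", ["3m+2"], [inner])
g = ("def", "g^", ["m", "m+n"], [])
example = ("fun", "h", [outer, g])
```

For this term, the two nested occurrences of the one-counter symbol and the single occurrence of the two-counter symbol both contribute a depth of 2, so `iteration_depth(example)` evaluates to 2, while any first-order term has iteration depth 0.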

The iteration depth naturally extends to a set of primal terms $T$, defined by $\iota(T) = \max\{\iota(t) \mid t \in T\}$. This definition emphasizes the static aspect by looking at the primal term only. The operational aspect, namely counting the occasions when a variable is decremented to 0, is expressed by the equalities $\iota(\hat f(0, \ldots)\sigma) = 1 + \iota(r_1^{\hat f}\sigma)$ and $\iota(\hat f(n + 1, \ldots)\sigma) = \iota(r_2^{\hat f}\sigma)$ for each defined symbol $\hat f$ and substitution $\sigma$. Note that $\iota(t) \le |D| \cdot \mathit{depth}(t)$.

Iteration of contexts consumes resources of the primal term. On one hand, a single iteration can produce an arbitrarily deep term. On the other hand, there are ground first-order terms that require a certain iteration depth. We use two different contexts, $f(\diamond, a)$ and $f(a, \diamond)$, to force a consumption of resources. Consider the ground term $s = f(\diamond, a)^m \cdot a$. If the value of $m$ is sufficiently large, then a primal term $t$ representing $s$ must contain a defined symbol through which we iterate the context $f(\diamond, a)$, and the iteration depth of $t$ must be at least 1. If we simply concatenate two blocks of the same context, like in $f(\diamond, a)^m \cdot f(\diamond, a)^m \cdot a$, we do not necessarily need to increase the iteration depth of the primal term. However, if we insert the context $f(a, \diamond)$ between the two blocks, producing the term $s = f(\diamond, a)^m \cdot f(a, \diamond) \cdot f(\diamond, a)^m \cdot a$, we force a primal term $t$ representing $s$ to have an iteration depth of at least 2. Repeating the step, this idea leads to an upper bound on the number of context blocks $f(\diamond, a)^m \cdot f(a, \diamond)$ that can be represented by a given primal term $t$.

Lemma 10. Let $t$ be a primal term without first-order variables and let $s = w \cdot (f(\diamond, a)^m \cdot f(a, \diamond))^n \cdot a$ be a ground first-order term, where $w$ is a proper subcontext of $f(\diamond, a)^m \cdot f(a, \diamond)$. If $s \in L(t)$ and $m > \iota(t) \cdot \mathit{depth}(R) + \mathit{depth}(t)$ then $n \le \iota(t)$.

Proof. Let $B(t) = \iota(t) \cdot \mathit{depth}(R) + \mathit{depth}(t)$ be the lower bound on the value of $m$. Note that the context $w$ is either $f(\diamond, a)^i \cdot f(a, \diamond)$ for some $i < m$ or the empty context. We perform the proof by induction on the tuple $(s, \iota(t))$, where the first component is ordered by subterm ordering and the second by the usual ordering on natural numbers. Note that $\iota(t) = 0$ for all first-order terms $t$.

The base case is presented by $s = a$ and $n = 0$. The inequality $0 \le \iota(t)$ holds for each term $t$. For the induction step, we perform a case analysis on the structure of $t$. The primal term $t$ can begin with different prefixes of the term $s$.

Case 1: $t = f(\diamond, a)^i \cdot f(a, t')$ for a term $t'$. Then the term $t'$ must represent $s' = (f(\diamond, a)^m \cdot f(a, \diamond))^n \cdot a$, where $s = w \cdot s'$, i.e., $s'$ is a proper subterm of $s$. By induction hypothesis, we have that $n \le \iota(t')$. Since the iteration depths of $t$ and $t'$ are equal ($f(\diamond, a)^i$ is a first-order context), we obtain $n \le \iota(t)$.

Case 2: $t = f(\diamond, a)^j \cdot \hat f(c, \ldots)$ for some $j \le i$ and a defined symbol $\hat f$. To represent the term $s$, the counter expression $c$ must be instantiated. Let $\xi$ be a counter variable substitution, such that $t\xi{\downarrow}_R = s$. There are two subcases to analyze, one for $c\xi = 0$, the other for positive values of $c\xi$.

Case 2.1: c = 0. Then there is a reduction t ?!R t0 , where t0 = f(; a)i  r1f^

for a substitution . Both terms t and t0 represent the term s, but the inequality (t0)  (t) + 1 holds. Compared to t, the term t0 grew at least by the term r1f^, but the condition m > B(t0 ) still holds, since depth (t0)  depth (R) + depth (t) follows from the rewrite step. We have that B(t0 )  B(t) < m, therefore we can apply the induction hypothesis since the iteration depth decreases: (t) > (t0 ). From the induction hypothesis follows that n  (t0). Hence, n  (t) holds. Case 2.2: c > 0. We must perform a case analysis whether the context f(a; ) is present in r2f^ or not. Case 2.2.1: The context f(a; ) is absent from r2f^. Hence, the context r2f^ must + be of the form f(; a)k for some k and t ?! R f(; a)j +k(c)  t0 , where t0 is an ^ f 0 instance of the term r1 . The primal term t represents either the rst-order term s0 = f(; a)i?j ?k(c)  f(a; )  (f(; a)m  f(a; ))n  a or the rst-order term s0 = f(; a)m?j ?k(c)  f(a; )  (f(; a)m  f(a; ))n?1  a: In both cases, the term s0 is a subterm of s and B(t0 )  B(t) < m holds. Hence, by induction hypothesis, n ? 1  (t0 ) holds. Moreover, the iteration depth decreases ((t) > (t0) holds), therefore we have that n  (t). Case 2.2.2: The context f(a; ) is present in r2f^. Hence, the context r2f^ must be of the form f(; a)k  f(a; )  f(; a)l for some k and l, where the inequalities k + l < depth (R) < B(t) < m hold. Therefore c must be equal to 1, since m is + too large, and there exists the reduction t ?! R f(; a)j +k  f(a; )  f(; a)l  t0 . 0 The primal term t represents the rst-order term s0 = f(; a)m?l  f(a; )  (f(; a)m  f(a; ))n?1  a: Note that j +k = i < m holds. The term s0 is a subterm of s and the inequalities B(t0 )  B(t) < m hold. Hence, by induction hypothesis, n ? 1  (t0 ) holds. The iteration depth decreases ((t) > (t0 ) holds), therefore n  (t) holds. 
$\Box$

The lemma indicates that if we choose the value of $n$ in the term $s = (f(\Box, a)^m \cdot f(a, \Box))^n \cdot a$ larger than the iteration depth $\iota(t)$ of the primal term $t$, then we cannot represent $s$ by $t$ using iteration only. Therefore, the term $t$ must contain variables.

Corollary 11. If $s = (f(\Box, a)^m \cdot f(a, \Box))^n \cdot a$ is an instance of a primal term $t$ with $\iota(t) < n$ and $m > \iota(t) \cdot \mathrm{depth}(R) + \mathrm{depth}(t)$, then $t$ must end with a variable. More precisely, for each counter substitution $\sigma$ such that $s$ is an instance of $t\sigma{\downarrow}_R$, the term $t\sigma{\downarrow}_R$ is of the form $(f(\Box, \ast)^m \cdot f(\ast, \Box))^i \cdot f(\Box, \ast)^j \cdot x$, where the signs $\ast$ are either variables different from $x$ or the constant $a$.

Proof. If $t$ ends with a variable $x$ and represents $s$, then there exist a counter substitution $\sigma\colon C \to \mathbb{N}$ and a first-order substitution $\theta$ such that $t\sigma{\downarrow}_R\theta = s$. The substitution $\theta$ is of the form $\{x \mapsto s'\} \cup \{\vec{y} \mapsto a\}$, where $Var(t) = \{x\} \cup \vec{y}$. Note that $t\theta\sigma{\downarrow}_R = t\sigma{\downarrow}_R\theta$ holds. Now suppose that $t$ does not end with a variable. Then $t\theta$ is a primal term without first-order variables and is of the same size as $t$, i.e., the iteration depths agree, $\iota(t\theta) = \iota(t)$, and $\mathrm{depth}(t\theta) = \mathrm{depth}(t)$. The instance $t\theta$ represents $s$, therefore by Lemma 10 we have $n \le \iota(t\theta)$. This contradicts the condition $\iota(t\theta) = \iota(t) < n$. $\Box$
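To make the shape of the witness term used in Lemma 10 and Corollary 11 concrete, the following Python sketch builds $s = (f(\Box, a)^m \cdot f(a, \Box))^n \cdot a$ from its two contexts. The tuple encoding of terms, the helper names (`left_hole`, `right_hole`, `power`, `compose`, `build_s`, `depth`), and the convention that a constant has depth 0 are assumptions made for this illustration only; they are not notation from the paper.

```python
from typing import Callable, Tuple, Union

# Hypothetical encoding: the constant a is the string "a",
# and f(t1, t2) is the tuple ("f", t1, t2).
Term = Union[str, Tuple]
Context = Callable[[Term], Term]


def left_hole(t: Term) -> Term:
    """The context f(hole, a) applied to t."""
    return ("f", t, "a")


def right_hole(t: Term) -> Term:
    """The context f(a, hole) applied to t."""
    return ("f", "a", t)


def compose(*contexts: Context) -> Context:
    """Context concatenation c1 . c2 . ... . ck (the innermost context is ck)."""
    def composed(t: Term) -> Term:
        for ctx in reversed(contexts):
            t = ctx(t)
        return t
    return composed


def power(ctx: Context, k: int) -> Context:
    """The k-fold iteration ctx^k of a context."""
    def iterated(t: Term) -> Term:
        for _ in range(k):
            t = ctx(t)
        return t
    return iterated


def build_s(m: int, n: int) -> Term:
    """The term (f(hole, a)^m . f(a, hole))^n . a."""
    block = compose(power(left_hole, m), right_hole)
    return power(block, n)("a")


def depth(t: Term) -> int:
    """Depth of a term, counting a constant as depth 0 (an assumed convention)."""
    if isinstance(t, str):
        return 0
    return 1 + max(depth(arg) for arg in t[1:])


print(build_s(1, 1))         # ('f', ('f', 'a', 'a'), 'a')
print(depth(build_s(2, 3)))  # 9
```

Under the depth convention assumed here, every copy of the block $f(\Box, a)^m \cdot f(a, \Box)$ adds $m + 1$ levels, so the term has depth $n(m + 1)$. Both parameters can therefore be pushed beyond any fixed bound derived from a given primal term.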

The Herbrand universes of a set of terms $T$ and of a representation $T'$ of its complement must be disjoint. This leads to the following result.

Lemma 12. Let $T$ be a set of first-order terms and $T'$ a representation of its complement. Then for all $t \in T$ and $t' \in T'$, the terms $t$ and $t'$ are not unifiable.

Proof. Suppose that $t$ and $t'$ are unifiable with the unifier $\mu$. Hence, there exist a ground term $\bar{t}$ and a ground substitution $\theta$ such that $\bar{t} = t\mu\theta = t'\mu\theta$ holds. The ground term $\bar{t}$ belongs to both Herbrand universes $H(t)$ and $H(t')$, therefore the term $t'$ cannot be in the complement representation $T'$. Contradiction. $\Box$

We have now assembled the necessary tools to show that primal terms cannot finitely represent the complement of first-order terms. The proof is done by contradiction. We try to find a finite representation for the complement of the first-order term $f(x, x)$. The underlying idea is to choose a ground term $s = f(s_1, s_2)$ from the complement, such that both $s_1$ and $s_2$ are too complex to be produced by iteration alone, and $s_2$ is twice as deep as $s_1$. Therefore a term representing $s$ must be of the form $f(u, v)$, where $u$ and $v$ end with variables $y$ and $z$, respectively. If $y \ne z$, then the terms $f(u, v)$ and $f(x, x)$ are unifiable, which contradicts Lemma 12. If $y = z$, then there is no substitution $\theta$ such that $u\theta{\downarrow}_R = s_1$ and $v\theta{\downarrow}_R = s_2$ hold.

Theorem 13. The complement of a finite set of first-order terms cannot be represented in general by a finite set of primal terms.

Proof. We show that the term $f(x, x)$ has no finite complement representation even if we use schematizations. Assume that the finite set of primal terms $T$ represents the complement of $f(x, x)$. The set $T$ must contain a primal term $t$ representing the ground term
$s = f((f(\Box, a)^m \cdot f(a, \Box))^n \cdot a,\; (f(\Box, a)^m \cdot f(a, \Box))^{2n} \cdot a)$,
where $m$ and $n$ are parameters depending on the set $T$ and the schematization used. Let $n > \iota(T)$ and $m > \iota(T) \cdot \mathrm{depth}(R) + \mathrm{depth}(T)$, where $\iota$ denotes the iteration depth. There must be a substitution $\sigma$ such that $t\sigma{\downarrow}_R = s$ holds. We analyze the possibilities for $t$.

Without loss of generality, we assume that $t$ is a variable, or a constant, or begins with a functional constructor symbol. If $t$ begins with a defined symbol, i.e., $t = \hat{f}(n, \ldots)$ for a defined symbol $\hat{f}$, then either $n = 0$ or $n = n' + 1$. Hence, after one reduction step by $R$ we get $t \to_R t'$, where $t' = r_1^{\hat{f}}\tau$ or $t' = r_2^{\hat{f}}\tau$ for a substitution $\tau$. The context $r_2^{\hat{f}}$ must start with a functional constructor symbol. The rewrite step does not increase the iteration depth. For $t' = r_1^{\hat{f}}\tau$, we have $\iota(t) > \iota(t')$, therefore we can apply the induction hypothesis.

We perform a case analysis for $t$. The term $t$ starts with a constant, or a variable, or a functional symbol.

Case 1: $t = a$ for a constant $a$. The constant $a$ clearly cannot represent the term $s$, since the root symbol of $s$ is the functional symbol $f$.

Case 2: $t = y$ for a variable $y$. Then $t$ is unifiable with $f(x, x)$, hence by Lemma 12 it cannot be a term from a complement representation $T$. Contradiction.

Case 3: $t = f(u, v)$ for some terms $u$ and $v$. Clearly, from $n > \iota(T)$ it follows that $n > \iota(u)$ and $n > \iota(v)$. By Corollary 11, both terms $u$ and $v$ must end with a first-order variable. There must be a counter substitution $\sigma$ and a first-order substitution $\theta$ such that $t\sigma{\downarrow}_R\theta = s$. Let $\bar{u} = u\sigma{\downarrow}_R$ and $\bar{v} = v\sigma{\downarrow}_R$. From the structure of the term $s$ and the properties of first-order substitutions it follows that
$\bar{u} = (f(\Box, \ast)^m \cdot f(\ast, \Box))^{n'} \cdot f(\Box, \ast)^{m'} \cdot y$,
$\bar{v} = (f(\Box, \ast)^m \cdot f(\ast, \Box))^{n''} \cdot f(\Box, \ast)^{m''} \cdot z$,
where $\ast$ stands for either a variable (different from $y$ and $z$) or for the constant $a$. Both terms must end with a variable since both iteration depths $\iota(u)$ and $\iota(v)$ are smaller than $n$. We perform a case analysis on the variables $y$ and $z$.

Case 3.1: the variables are different: $y \ne z$. Then we can unify $f(\bar{u}, \bar{v})$ with $f(x, x)$ and therefore, by Lemma 12, $t$ cannot be in $T$. Contradiction.

Case 3.2: the variables are equal: $y = z$. Then there must be a first-order substitution $\theta$ such that $\bar{u}\theta = (f(\Box, a)^m \cdot f(a, \Box))^n \cdot a$ holds. From the structure of the term $\bar{u}$ it follows that the variable $y$ must be instantiated by $\theta$ to the ground term
$(f(\Box, a)^{m-m'} \cdot f(a, \Box)) \cdot (f(\Box, a)^m \cdot f(a, \Box))^{n-n'-1} \cdot a$.
Now, the instance $\bar{v}\theta$ is equal to the ground term
$(f(\Box, a)^m \cdot f(a, \Box))^{n''} \cdot f(\Box, a)^{m''} \cdot f(\Box, a)^{m-m'} \cdot f(a, \Box) \cdot (f(\Box, a)^m \cdot f(a, \Box))^{n-n'-1} \cdot a$.
The context $f(\Box, a)^{m''} \cdot f(\Box, a)^{m-m'}$ must be equal to the context $f(\Box, a)^m$, therefore we get $m' = m''$. Hence, the instance $\bar{v}\theta$ must be equal to the term $(f(\Box, a)^m \cdot f(a, \Box))^{n''+n-n'} \cdot a$. Now it is clear that $\bar{v}\theta$ cannot be equal to the required term $(f(\Box, a)^m \cdot f(a, \Box))^{2n} \cdot a$, since $n'' - n' \ne n$ holds because of the inequalities $n > \iota(u)$, $n > \iota(v)$, $n' \le \iota(u)$, and $n'' \le \iota(v)$. Contradiction. $\Box$
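The exponent bookkeeping at the end of Case 3.2 can be spelled out as follows. Here $\iota$ is written for the iteration depth, and $n', m'$ and $n'', m''$ for the exponents occurring in the normal forms of $u$ and $v$. This restates the argument of the proof under the assumption that $n'$ is a natural number; it is not an additional result.

```latex
\begin{align*}
  m'' + (m - m') = m \quad &\Longleftrightarrow \quad m' = m'', \\
  n'' + 1 + (n - n' - 1) \;&=\; n'' + n - n', \\
  n'' + n - n' = 2n \quad &\Longleftrightarrow \quad n'' - n' = n, \\
  n'' - n' \;\le\; n'' \;&\le\; \iota(v) \;<\; n .
\end{align*}
```

The second line counts the blocks $f(\Box, a)^m \cdot f(a, \Box)$ in the instance of $v$: $n''$ blocks from the iterated prefix, one block completed by the instance of the shared variable, and $n - n' - 1$ further blocks. The last line shows that $n'' - n' = n$ is impossible, so the second argument can never reach $2n$ blocks.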

5 Conclusion

We presented general algorithms for solving the word and the subsumption problem for primal terms that also work for $\rho$-terms, I-terms, and R-terms. The algorithms require a finitary unification algorithm for the schematization formalisms, as well as a solver for the 2-fragment of Presburger arithmetic. Still, there are some problems left, especially concerning efficiency. For the word problem, it would be interesting to have an algorithm that first computes a suitable normal form of primal terms, followed by a syntactic comparison. Algebraically, this amounts to axiomatizing the theory of primal terms.

We also showed that equations and primal terms are not sufficient for describing in general the complement of first-order terms. This result trivially extends to recurrent term schematizations, since first-order terms are just a special case. On the other hand, the complement problem is easily solvable if we extend the language by negation and quantification. Then the complement can be expressed by a formula in the first-order theory of term schematizations. In this context, we are interested in deciding the validity of formulas and in obtaining solved forms, e.g., by quantifier elimination. Peltier showed in [Pel97] that the first-order theory of R-terms is decidable by quantifier elimination. The decidability of the first-order theory of primal terms is still an open problem.
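As a toy illustration of how an inclusion question between schematized sets turns into a statement about counters, consider unary towers $g^{e(n)}(a)$ whose height is an arithmetic expression $e$ in one counter. Two such towers are equal exactly when their heights are equal, so "every term of the first family belongs to the second" reads "for all $n$ there is a $k$ with $e_s(n) = e_t(k)$". The sketch below only tests this up to a bound by brute force; it is not a decision procedure and not the algorithm of this paper. The function name `included_up_to`, the reading of subsumption as language inclusion, and the side condition $e_t(k) \ge k$ are assumptions of the example.

```python
from typing import Callable


def included_up_to(exp_s: Callable[[int], int],
                   exp_t: Callable[[int], int],
                   bound: int) -> bool:
    """Bounded test: for every n <= bound, is there a k with exp_s(n) == exp_t(k)?

    Assumes exp_t(k) >= k for all k, so only k <= exp_s(n) has to be tried.
    """
    for n in range(bound + 1):
        target = exp_s(n)
        if not any(exp_t(k) == target for k in range(target + 1)):
            return False
    return True


# g^(2n)(a) against g^k(a): every tower of even height is a tower.
print(included_up_to(lambda n: 2 * n, lambda k: k, 20))   # True

# g^n(a) against g^(2k)(a): the tower of height 1 has no counterpart.
print(included_up_to(lambda n: n, lambda k: 2 * k, 20))   # False
```

A solver for Presburger arithmetic would settle the quantified statement for all counter values at once, instead of sampling it up to a bound as done here.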

References

[AHL97] A. Amaniss, M. Hermann, and D. Lugiez. Set operations for recurrent term schematizations. In M. Bidoit and M. Dauchet, editors, Proc. 7th Int. Joint Conf. on Theory and Practice of Software Development (TAPSOFT'97), Lille (France), LNCS 1214, pages 333–344. Springer, 1997.

[CH95] H. Chen and J. Hsiang. Recurrence domains: Their unification and application to logic programming. Information and Computation, 122:45–69, 1995.

[Com95] H. Comon. On unification of terms with integer exponents. Mathematical Systems Theory, 28(1):67–88, 1995.

[Coo72] D.C. Cooper. Theorem proving in arithmetic without multiplication. In B. Meltzer and D. Michie, editors, Machine Intelligence, volume 7, pages 91–99. Edinburgh University Press, Edinburgh, UK, 1972.

[GJ79] M.R. Garey and D.S. Johnson. Computers and intractability: A guide to the theory of NP-completeness. W.H. Freeman and Co, 1979.

[Gra88] E. Grädel. Subclasses of Presburger arithmetic and the polynomial-time hierarchy. Theoretical Computer Science, 56(3):289–301, 1988.

[HG97] M. Hermann and R. Galbavý. Unification of infinite sets of terms schematized by primal grammars. Theoretical Computer Science, 176(1-2):111–158, 1997.

[LM87] J.-L. Lassez and K. Marriott. Explicit representation of terms defined by counter examples. J. Automated Reasoning, 3(3):301–317, 1987.

[Pel97] N. Peltier. Increasing model building capabilities by constraint solving on terms with integer exponents. J. Symbolic Computation, 24(1):59–101, 1997.

[Sal91] G. Salzer. Deductive generalization and meta-reasoning, or how to formalize Genesis. In Österreichische Tagung für Künstliche Intelligenz, Informatik-Fachberichte 287, pages 103–115. Springer, 1991.

[Sal92] G. Salzer. The unification of infinite sets of terms and its applications. In A. Voronkov, editor, Proc. 3rd Int. Conf. on Logic Programming and Automated Reasoning (LPAR'92), St. Petersburg (Russia), LNCS (LNAI) 624, pages 409–420. Springer, 1992.

[Sal94] G. Salzer. Primal grammars and unification modulo a binary clause. In A. Bundy, editor, Proc. 12th Int. Conf. on Automated Deduction (CADE'94), Nancy (France), LNCS (LNAI) 814, pages 282–295. Springer, 1994.

[Sch97] U. Schöning. Complexity of Presburger arithmetic with fixed quantifier dimension. Theory of Computing Systems, 30(4):423–428, 1997.
