New Uses of Linear Arithmetic in Automated Theorem Proving by Induction
Deepak Kapur and M. Subramaniam
Computer Science Department, State University of New York, Albany, NY 12222, U.S.A.
email: [email protected], [email protected]
Abstract. Zhang, Kapur and Krishnamoorthy introduced a cover set method for designing induction schemes for automating proofs by induction from specifications expressed as equations and conditional equations. This method has been implemented in the theorem prover Rewrite Rule Laboratory (RRL) and in Tecton, a proof management system built on top of RRL, and it has been used to prove many nontrivial theorems and to reason about sequential as well as parallel programs. The cover set method is based on the assumption that a function symbol is defined using a finite set of terminating (conditional or unconditional) rewrite rules. The termination ordering employed in orienting the rules is used to perform proofs by well-founded induction. The left sides of the rules are used to design the different cases of an induction scheme, and recursive calls to the function made in the right sides can be used to design appropriate instantiations for generating induction hypotheses. A weakness of this method is that it relies on syntactic unification for generating an induction scheme for a conjecture. This paper goes a step further by proposing semantic analysis for generating an induction scheme for a conjecture from a cover set. We discuss the use of a decision procedure for Presburger arithmetic (quantifier-free theory of numbers with the addition operation and relational predicates >, ...
² Since divides is not defined when its first argument is 0 and its second argument is non-zero, the above conjecture is not true if we drop the condition x > 0 from it. The completeness of the definition of divides is discussed in detail in section 4.
³ A position is a sequence of nonnegative integers used to refer to a subterm in a term. An equation will be considered as a term with = as the binary predicate; a conditional equation will be considered as a term with = as the binary predicate whose second argument is an if term, where if is considered as a binary function. In the above example, the position of divides(x, y) is 2.2.1, as the conjecture is viewed as an abbreviation for divides(x, y + y) = true if divides(x, y) ∧ x > 0.
arithjar.tex; 14/06/1995; 17:08; no v.; p.4
(I1): { ⟨⟨{x → u, y → 0}, {}, {2.2.1 ← divides(u, 0)}⟩, {}⟩,
        ⟨⟨{x → u, y → v}, {v < u, v ≠ 0}, {2.2.1 ← divides(u, v)}⟩, {}⟩,
        ⟨⟨{x → u, y → u + v}, {}, {2.2.1 ← divides(u, u + v)}⟩, {⟨{x → u, y → v}, {}, {2.2.1 ← divides(u, v)}⟩}⟩ }.
An induction scheme is a finite set of tuples, each tuple corresponding to an induction case or subgoal. The first component of the tuple is used to generate the conclusion of the induction subgoal; the second component is used to generate the induction hypotheses, if any. The first component of an induction case is a 3-tuple: its first component is a substitution to be made on the conjecture, its second component is a finite set of conditions under which the induction case is applicable, and its third component specifies how the subterm being used for generating the induction scheme must be replaced so that the rule of the definition from which the induction case is generated can be applied. The second component of the induction case is a finite set of 3-tuples for generating the induction hypotheses in the same way as the conclusion is generated. An induction case in which the second component is the empty set corresponds to a base case.
Note that the cover set C1 covers all the values of divides where the first argument is non-zero or both arguments are zero. Hence the induction scheme generated by the cover set can be used to prove properties under the condition that the first argument of divides is non-zero.
For the above conjecture, the first base case is obtained by using the first tuple in the induction scheme (it comes from the first rule), and the substitutions for the variables are x = u, y = 0:
    divides(u, 0 + 0) if divides(u, 0) ∧ u > 0.
This formula simplifies to true using the definition of + and the first rule. The second base case is obtained from the second tuple (it comes from the second rule), and the substitutions for the variables are x = u, y = v under the condition v < u and v ≠ 0:
    divides(u, v + v) if divides(u, v) ∧ u > 0 ∧ v < u ∧ v ≠ 0,
which trivially simplifies to true, since divides(u, v) in the condition simplifies to false under the condition v < u and v ≠ 0 using the second rule.
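The shape of an induction case can be made concrete with a small sketch. The Python below is illustrative only and is not RRL's implementation: terms are encoded as nested tuples, variables and constants as strings, and the helper names (`subterm_at`, `replace_at`, `substitute`, `subgoal`) are hypothetical. It encodes the conjecture as the term =(divides(x, y + y), if(true, ∧(divides(x, y), x > 0))), so that position 2.2.1 addresses divides(x, y), and then instantiates the third tuple of (I1).

```python
# Terms are nested tuples (operator, arg1, arg2, ...); variables and constants are strings.
# All names are hypothetical; this is a sketch, not RRL's data structures.

def subterm_at(term, pos):
    """Return the subterm at a position given as a tuple of 1-based argument indices."""
    for i in pos:
        term = term[i]  # index 0 holds the operator, so argument i sits at index i
    return term

def replace_at(term, pos, new):
    """Return a copy of term with the subterm at pos replaced by new."""
    if not pos:
        return new
    i = pos[0]
    return term[:i] + (replace_at(term[i], pos[1:], new),) + term[i + 1:]

def substitute(term, subst):
    """Apply a substitution (dict from variable name to term) to a term."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(substitute(a, subst) for a in term[1:])

def subgoal(conjecture, subst, conditions, replacements):
    """Instantiate the conjecture for one 3-tuple of an induction case."""
    goal = substitute(conjecture, subst)
    for pos, new in replacements.items():
        goal = replace_at(goal, pos, new)
    return goal, conditions

# The conjecture viewed as: divides(x, y + y) = (true if divides(x, y) and x > 0)
P1 = ("=",
      ("divides", "x", ("+", "y", "y")),
      ("if", "true", ("and", ("divides", "x", "y"), (">", "x", "0"))))

POS = (2, 2, 1)  # position 2.2.1
u_plus_v = ("+", "u", "v")

# Third tuple of (I1): the conclusion part and its single induction hypothesis.
conclusion = subgoal(P1, {"x": "u", "y": u_plus_v}, [],
                     {POS: ("divides", "u", u_plus_v)})
hypothesis = subgoal(P1, {"x": "u", "y": "v"}, [],
                     {POS: ("divides", "u", "v")})
```

In this encoding the replacement component is redundant for (I1), because the substitution already puts the subterm at 2.2.1 into the form required by the rule; it only starts to matter when the substituted subterm and the left side of the rule differ syntactically.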
The induction step comes from the third tuple in the induction scheme (it comes from the third rule); the substitutions for the variables in the conclusion are x = u, y = u + v:
    divides(u, (u + v) + (u + v)) if divides(u, u + v) ∧ u > 0,
with the substitutions for the variables in the induction hypothesis coming from the second component as x = u, y = v:
    divides(u, v + v) if divides(u, v) ∧ u > 0.
Using the associativity and commutativity properties of + and applying the third rule thrice, the conclusion above reduces to:
    divides(u, v + v) if divides(u, v) ∧ u > 0,
which is the induction hypothesis. So the conjecture is proved. The reader will have noticed how using the well-founded ordering suggested by the definition of divides led to an induction hypothesis which turned out to be useful in proving this conjecture.
Now consider a related conjecture:
    (P2): divides(2, x) = not(divides(2, s(x))),
where 2 is an abbreviation for s(s(0)).⁴ Using the cover set of divides, the following induction scheme for divides(2, x) at position 1 is generated:
(I2): { ⟨⟨{x → 0}, {}, {1 ← divides(2, 0)}⟩, {}⟩,
        ⟨⟨{x → v}, {v < 2, v ≠ 0}, {1 ← divides(2, v)}⟩, {}⟩,
        ⟨⟨{x → 2 + v}, {}, {1 ← divides(2, 2 + v)}⟩, {⟨{x → v}, {}, {1 ← divides(2, v)}⟩}⟩ }.
There are two base cases. The first is divides(2, 0) = not(divides(2, s(0))), which is proved using rules 1 and 2. The second is divides(2, v) = not(divides(2, s(v))) if v < 2 ∧ v ≠ 0, which simplifies to not(divides(2, s(1))) = false, since v ≠ 0 ∧ v < 2 implies v = 1 = s(0).
If we check for applicability of definitions and lemmas modulo the theory of linear arithmetic, rewriting can get very expensive. Since rewriting is a primitive operation in a rewrite-based theorem prover such as RRL, performing rewriting modulo a theory such as linear arithmetic can slow the theorem prover down considerably. It is thus necessary to use the linear arithmetic procedure in a judicious manner, so as to widen the scope of the cover set method while maintaining efficiency. In this paper, we do not rewrite modulo the linear arithmetic theory. After considering the induction step generated from the third case of the induction scheme, we discuss how the subgoal divides(2, s(1)) can be proved without rewriting modulo linear arithmetic.
For the induction step, the conclusion is:
    divides(2, 2 + v) = not(divides(2, s(2 + v))),
assuming the induction hypothesis: divides(2, v) = not(divides(2, s(v))).
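As a sanity check on the two conjectures (a finite test, not a proof), the three rules of divides can be executed directly. The sketch below rests on assumptions: the rules are read off the induction schemes above rather than quoted from the definition, Python integers stand in for natural numbers built from 0 and s, and the function names are hypothetical.

```python
def divides(u, v):
    """Evaluate divides(u, v) with the three rules read off the induction scheme."""
    if u == 0 and v != 0:
        # The definition is incomplete here: first argument 0, second non-zero.
        raise ValueError("divides(0, v) is undefined for v != 0")
    while True:
        if v == 0:      # rule 1: divides(u, 0) -> true
            return True
        if v < u:       # rule 2: divides(u, v) -> false if v < u and v != 0
            return False
        v = v - u       # rule 3: divides(u, u + v') -> divides(u, v')

def check_p1(bound):
    """First conjecture: divides(x, y + y) if divides(x, y) and x > 0, below bound."""
    return all(divides(x, y + y)
               for x in range(1, bound)
               for y in range(bound)
               if divides(x, y))

def check_p2(bound):
    """(P2): divides(2, x) = not(divides(2, s(x))) for x below bound."""
    return all(divides(2, x) == (not divides(2, x + 1)) for x in range(bound))
```

Because integers are used, this check silently identifies s(2 + v) with 2 + s(v); that identification is exactly the arithmetic step a purely syntactic rewriter does not get for free.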
⁴ This example is taken from the Nqthm Corpus. This conjecture is proved there with the help of an explicit induction hint.
Rule 3 is now applicable on the left side of the conclusion. But what about the right side? Using semantic information about natural numbers, s and +, it would be possible to see that s(2 + v) = 2 + s(v)⁵, so rule 3 is applicable to not(divides(2, s(2 + v))) to give not(divides(2, s(v))). As stated above, for reasons of efficiency, we do not wish to rewrite modulo the theory of linear arithmetic, so we achieve this by merging the induction schemes for the two different occurrences of divides in the conjecture. So an induction scheme for divides(2, s(x)) at position 2.1 is also generated using the cover set of divides. The induction schemes for divides(2, x) and divides(2, s(x)) are then merged to give the scheme:
(I3): { ⟨⟨{x → 0}, {}, {1 ← divides(2, 0), 2.1 ← divides(2, s(0))}⟩, {}⟩,
        ⟨⟨{x → 1}, {}, {1 ← divides(2, 1), 2.1 ← divides(2, 2 + 0)}⟩, {}⟩,
        ⟨⟨{x → 2 + v}, {}, {1 ← divides(2, 2 + v), 2.1 ← divides(2, 2 + (v + 1))}⟩, {⟨{x → v}, {}, {1 ← divides(2, v), 2.1 ← divides(2, v + 1)}⟩}⟩ }.
The first base case is proved as before. The second base case becomes
    divides(2, 1) = not(divides(2, 2 + 0)),
which is now established by rules 2 and 3 of the definition of divides. In the induction step, the conclusion is:
    divides(2, 2 + v) = not(divides(2, 2 + (v + 1))),
with the hypothesis divides(2, v) = not(divides(2, v + 1)). The conclusion, by the use of rule 3, is reduced to
    divides(2, v) = not(divides(2, v + 1)),
to which the hypothesis is now applicable.
The reason for keeping replacements for the subterms being used for generating induction schemes should be evident. In the above conclusion, the subterm at position 1 is already in the form divides(2, 2 + v), so that rule 3 in the definition of divides is directly applicable; a similar remark applies to the subterm divides(2, 2 + (v + 1)) at position 2.1.
As the reader must have observed, such semantic analysis can be performed in the case of natural numbers using the linear arithmetic decision procedure, since reasoning about 0, s, +, =, ... needs to be performed. In this paper, we show how Presburger arithmetic can be used in mechanizing induction to reconcile different representations of natural numbers: the one using successor and the other using +, ...