A Resolution Theorem Prover for Intuitionistic Logic
Tanel Tammet
Department of Computing Science, Göteborg University and Chalmers University of Technology, S-41296 Göteborg, Sweden
email:
[email protected]
Abstract
We use the general scheme of building resolution calculi (also called the inverse method) originating from S. Maslov and G. Mints to design and implement a resolution theorem prover for intuitionistic logic. A number of search strategies are introduced and proved complete. The resolution method is shown to be a decision procedure for a new syntactically described decidable class of intuitionistic logic. We compare the search strategies suitable for the resolution method with strategies suitable for the tableau method. The performance of our prover is compared with the performance of a tableau prover for intuitionistic logic presented in [17].
1 Introduction
Intuitionistic logic is interesting since intuitionistic proofs contain more information than the corresponding classical proofs: each intuitionistic proof corresponds to a computable program. Hence intuitionistic logic can be used for program synthesis. However, proof search in first-order intuitionistic logic is more difficult than in first-order classical logic: there are no convenient normal forms like conjunctive normal form or prenex form. Except for a few special cases, Skolemization cannot be applied to intuitionistic formulas either. Differently from classical logic, only a few interesting decidable classes are known for intuitionistic logic. For decades, most of the research in automated theorem proving has concentrated on classical logic. Relatively few papers are devoted to proof search in intuitionistic logic. The following is an incomplete list of such papers: [22], [7], [3], [11], [23], [15], [17], [18], [5], [13]. Despite the fact that several intuitionistic theorem provers have been implemented (see [5], [18], [3]), only very few published papers describe the actual implementation of an automated theorem prover and report the results of running the prover on some benchmarks: [22] and [17]. The prover described in [22] is limited to the propositional calculus. In recent years we observe a renewed interest in proof search for intuitionistic logic, originating mostly from research in intuitionistic type theories (see [1] and [2] for early research in automating type theories). Although type theories are essentially higher-order, there exist useful fragments of type theories which can be directly encoded in first-order
intuitionistic logic, with no additional axioms or axiom schemes required; see e.g. [20]. In fragments like these, the problems of proof search in type theory translate directly into problems of proof search in first-order intuitionistic logic. It is not realistic to expect that an automated theorem prover will be able to prove most of the complicated, but provable, intuitionistic formulas (or type theory formulas) we are interested in, all by itself and in an acceptable time frame. However, the automated methods hold serious promise as a powerful tool assisting humans in developing a proof. In the more developed area of classical logic, there are numerous examples of open problems being solved largely due to the substantial help received from an automated theorem prover. We hope that an efficient prover for intuitionistic logic can be of analogous help for program synthesis and type theory.
2 Terminology
For the basic terminology of resolution (term, atom, literal, clause, substitution, most general unifier (denoted mgu)) see e.g. [6, 4]. Let us fix the terminology concerning Gentzen-type systems (sequent calculus). In the sequent Γ ⊢ Δ the left-hand side Γ and the right-hand side Δ are sometimes called the antecedent and the succedent. In each inference rule
$$\frac{P_1 \quad P_2 \quad \cdots \quad P_n}{C}$$
the sequent C written under the line is the conclusion, and the sequents P1, …, Pn over the line are premisses. The formula constructed by the rule and shown explicitly in the conclusion, for example A ∨ B in
$$\frac{A,\Gamma \vdash D \qquad B,\Gamma \vdash D}{A \lor B,\Gamma \vdash D}\ \ \lor\vdash$$
is the main formula, the components of the main formula shown explicitly in the premisses (A and B in ∨⊢) are side formulas, and the remaining formulas (Γ and D in ∨⊢) are parametric formulas. Recall the definition of the sign of a subformula occurrence in a formula: positive subformulas occur within the scope of an even number (for example zero) of occurrences of negations and premises of implications. Non-positive occurrences are negative. In the following we will always assume that we are dealing with a cut-free Gentzen-type system (sequent calculus) as the basic formalism for representing the rules of logic. The objects derivable in logic are sequents, sometimes also called clauses. We will moreover assume that the derivation rules have a subformula property: all the formulas in the premisses are subformulas or substitution instances of the formulas in the conclusion. Both first-order classical and intuitionistic logic (as well as various modal logics, minimal logic and linear logic) enjoy a representation in a cut-free sequent calculus with a subformula property.
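The sign of a subformula occurrence is used repeatedly below (polarity optimisation, the Near-Monadic class). As a minimal illustrative sketch, not taken from the paper, the following Python code computes the sign of each subformula occurrence, flipping polarity under negation and in the premise of an implication; the formula encoding is our own assumption.

```python
# A minimal sketch (not from the paper): computing the sign of subformula
# occurrences.  Formulas are nested tuples such as
#   ('imp', A, B), ('and', A, B), ('or', A, B), ('not', A), ('atom', 'p'),
#   ('forall', 'x', A), ('exists', 'x', A).
# The sign flips under negation and in the premise (left argument) of an
# implication; all other positions preserve it.

def signed_subformulas(formula, sign=+1):
    """Yield (subformula, sign) pairs; sign is +1 (positive) or -1 (negative)."""
    yield formula, sign
    tag = formula[0]
    if tag == 'not':
        yield from signed_subformulas(formula[1], -sign)
    elif tag == 'imp':
        yield from signed_subformulas(formula[1], -sign)   # premise: flipped
        yield from signed_subformulas(formula[2], sign)    # conclusion: kept
    elif tag in ('and', 'or'):
        yield from signed_subformulas(formula[1], sign)
        yield from signed_subformulas(formula[2], sign)
    elif tag in ('forall', 'exists'):
        yield from signed_subformulas(formula[2], sign)

# Example: in A -> (B & ~C), the occurrences of A and C are negative, B positive.
f = ('imp', ('atom', 'A'), ('and', ('atom', 'B'), ('not', ('atom', 'C'))))
for sub, sgn in signed_subformulas(f):
    if sub[0] == 'atom':
        print(sub[1], '+' if sgn > 0 else '-')
```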
3 The Tableau Method
By the "tableau method" we mean the backward-chaining proof search method where the sequent calculus rules are applied bottom-up (observe that premisses are written above
the line and the conclusion below the line). Search starts with the formula to be proved and branches backwards using the sequent calculus rules in a bottom-up manner. Search is continued until all the branches are terminated by axiom rules or no rules can be backwards-applied to any of the branches. The subformula property of the logic guarantees that any backwards application of a rule produces only a finite number of or-branches originating from the node.
Example. Proving ⊢ A ⊃ (A & A) intuitionistically by the tableau method. Backwards-applying the rule
$$\frac{A,\Gamma \vdash B}{\Gamma \vdash A \supset B}\ \ \vdash\supset$$
gives a single subgoal A ⊢ A & A. Backwards-applying the rule
$$\frac{\Gamma \vdash A \qquad \Gamma \vdash B}{\Gamma \vdash A \,\&\, B}\ \ \vdash\&$$
gives two syntactically identical subgoals A ⊢ A, which are both axioms. The proof is finished. In addition to the rules ⊢⊃ and ⊢& we could have backwards-applied the structural rules of weakening and contraction, but since these applications would have been useless for our proof, we left this option out.
In addition, it is assumed that the quantifier rules are not applied as is: metavariables are used instead (see [18], [17]). The use of metavariables typically gives a much smaller search space and higher efficiency than using the standard quantifier rules directly. There exist various modifications of the tableau method, using different variants of the sequent calculus and different search strategies. The common feature of tableau methods is that, due to the use of metavariables, the choices made in the search tree have a global effect: a substitution instantiating a (meta)variable x in one branch also changes all the other branches where x occurs. Therefore we have to use backtracking through the proof tree when searching for the proof. For linear logic similar global effects occur already on the propositional level. Tableau methods are generally characterised as global methods. Depth-first search using tableaux is incomplete for any undecidable logic, like the intuitionistic predicate calculus and propositional linear logic. However, depth-first search enjoys several advantages over breadth-first search (a relatively small computational overhead). Thus several tableau provers implement iteratively deepening depth-first search, where each iteration has a certain bound on the length of branches. Experiments for both the intuitionistic tableau (see [17]) and the linear propositional tableau (see [19]) have shown a good bound to be the number of contractions in a branch.
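The following is a minimal, generic sketch (ours, not the actual provers above) of such an iteratively deepening depth-first search with a contraction bound. It deliberately ignores the metavariable bookkeeping and global backtracking just discussed; `is_axiom` and `expand` are assumed helper functions passed in by the caller.

```python
# Hedged sketch: iterative deepening over a contraction bound.
# `is_axiom(goal)` decides whether a sequent is closed by an axiom;
# `expand(goal)` yields, for each backwards-applicable rule, a pair
# (subgoals, contractions_used).  Both are assumptions, not the paper's code.

def prove(goal, is_axiom, expand, max_bound=20):
    for bound in range(max_bound + 1):            # iterative deepening
        if search(goal, bound, is_axiom, expand):
            return True
    return False

def search(goal, contractions_left, is_axiom, expand):
    if is_axiom(goal):
        return True
    for subgoals, contractions_used in expand(goal):
        if contractions_used > contractions_left:
            continue                              # branch exceeds current bound
        if all(search(g, contractions_left - contractions_used,
                      is_axiom, expand)
               for g in subgoals):
            return True                           # all and-branches closed
    return False                                  # backtrack over or-branches
```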
The chief gains of efficiency for the tableau provers are obtained by restricting the calculus in order to obtain a low branching factor in the "upwards" direction of the sequent deduction. This requires proving a number of lemmas showing that whenever a formula F is provable in the unmodified sequent calculus, it is also provable in the restricted calculus. In other words, it is shown that all the proofs have a certain normal form suitable for the tableau calculus. A modification typically suitable for the tableau method consists in moving the weakening applications up to the axioms (modifying the axiom rule to incorporate weakening) and prohibiting the separate weakening rule. Other suitable modifications involve employing the invertibility of some rules, treating certain nested applications of propositional connectives as a single application of an n-ary connective, etc. One of the main goals of modifying the rules is to restrict the occasions where the contraction rule may be applied: without contraction, intuitionistic logic would be decidable. The tableau prover presented in [17] uses the following system GI.
Logical axioms: ⊥, Γ ⊢ B and B, Γ ⊢ B for arbitrary formulas B.
Inference rules:
$$\frac{A,B,\Gamma \vdash D}{A\,\&\,B,\Gamma \vdash D}\ \&\vdash \qquad \frac{A\supset B,\Gamma \vdash A \qquad B,\Gamma \vdash D}{A\supset B,\Gamma \vdash D}\ \supset\vdash \qquad \frac{A,\Gamma \vdash D \qquad B,\Gamma \vdash D}{A\lor B,\Gamma \vdash D}\ \lor\vdash$$
$$\frac{\Gamma \vdash A \qquad \Gamma \vdash B}{\Gamma \vdash A\,\&\,B}\ \vdash\& \qquad \frac{A,\Gamma \vdash B}{\Gamma \vdash A\supset B}\ \vdash\supset \qquad \frac{\Gamma \vdash A}{\Gamma \vdash A\lor B}\ \vdash\lor \qquad \frac{\Gamma \vdash B}{\Gamma \vdash A\lor B}\ \vdash\lor$$
$$\frac{\forall y A[y],\, A[t],\Gamma \vdash D}{\forall y A[y],\Gamma \vdash D}\ \forall\vdash \qquad \frac{\Gamma \vdash A[t]}{\Gamma \vdash \exists x A[x]}\ \vdash\exists \qquad \frac{A[y],\Gamma \vdash D}{\exists x A[x],\Gamma \vdash D}\ \exists\vdash\,(*) \qquad \frac{\Gamma \vdash A[y]}{\Gamma \vdash \forall x A[x]}\ \vdash\forall\,(*)$$
where (*) denotes the eigenvariable condition: y does not have any free occurrences in Γ or D. The exchange rule is implicit.
Observe that a tableau prover using the system GI is a decision procedure for formulas not containing negative occurrences of ∀.
4 The Generic Resolution Method
The generic resolution method (also called the "inverse method"), originally developed by S. Maslov and G. Mints (see e.g. [9], which contains unification already, [8] and [11]), is a forward-chaining proof search method. The sequent calculus rules are applied top to bottom (again, observe that premisses are written above the line and the conclusion below the line). Search starts with the set of axioms and produces new sequents from the already derived ones by applying the sequent calculus rules, until the formula we want to prove is eventually derived. In the following we will refer to the method simply as the resolution method. Ordinarily the name "resolution method" (coined by J. Robinson) denotes the proof search method presented by J. Robinson in [16]. In our context the usage of "resolution" is justified by the fact that Robinson's resolution method can be seen as a special case of generic resolution. The forward-chaining method which systematically applies all the sequent rules in all possible ways is obviously complete: if there exists a derivation of a formula F, the method will eventually derive F. However, such an unrestricted method is hopelessly inefficient.
Definition 1. A resolution method for proving some sequent S in some sequent calculus enjoying a subformula property is a forward-chaining proof search method for this calculus with the additional restriction: any derived sequent must contain only subformulas of S.
Again, the resolution method is obviously complete for any sequent calculus enjoying a subformula property, such as the one for intuitionistic logic. In the following we will consider a number of strategies for the resolution method. The strategies prohibit certain application combinations: not all sequent calculus derivations will be reachable. Completeness proofs of strategies will demonstrate that whenever there is a proof of a formula F, then there is a proof without the prohibited combinations. The main device for proving the completeness of a developed resolution method M incorporating some restriction strategies is to show that there is a restricted form R for sequent calculus derivations such that:
- R is complete, i.e. all formulas derivable in the original calculus have a derivation of the form R;
- all the possible derivations satisfying R are reachable by M.
All the sequents derived during the forward-chaining search are independent of each other. Substituting into a variable x in a sequent Γ does not change the value of any variable in any other sequent. Indeed, we rename all the free variables in at least one of the premiss sequents prior to the application of a rule. The resolution method is generally characterised as a local method.
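As a hedged illustration (ours, not the actual prover), the forward-chaining loop of Definition 1 can be pictured as a saturation procedure that closes the axiom set under the rules while filtering out sequents that are not built from subformulas of the goal. The names `rules`, `within_subformulas` and `goal` are our assumptions.

```python
# Sketch of the generic resolution / inverse-method loop.  Sequents are
# assumed to be hashable values; `rules` is a list of (arity, fn) pairs where
# fn(premises_tuple) yields conclusions; `within_subformulas(seq)` enforces
# the restriction of Definition 1.

from itertools import product

def saturate(axioms, rules, within_subformulas, goal, max_rounds=1000):
    derived = set(axioms)
    for _ in range(max_rounds):
        new = set()
        for arity, rule in rules:
            for premises in product(derived, repeat=arity):
                for conclusion in rule(premises):
                    if within_subformulas(conclusion) and conclusion not in derived:
                        new.add(conclusion)
        if goal in derived or goal in new:
            return True                  # the goal sequent has been derived
        if not new:
            return False                 # saturation reached without the goal
        derived |= new
    return None                          # undecided within the round limit
```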
4.1 A Suitable Sequent Calculus
When using forward reasoning for proof search it is important to minimize the number of weakening applications and maximize the number of contraction applications: the former increase the size of a derived sequent, the latter decrease it. The intuitionistic sequent calculus GJ' from [13] avoids explicit applications of structural rules. We present a modification GJm of GJ':
- we allow renaming of the bound variables;
- the versions of &⊢, ∀⊢ and ⊃⊢ which become redundant in the presence of renaming have been dropped;
- we avoid the special constant ⊥ denoting absurdity: negation ¬A is treated as a separate connective, not as a mere notation for the formula A ⊃ ⊥. To compensate, an extra rule ⊢⊃′ is added for implication.
GJm follows. Compare GJm with the tableau-oriented system GI presented earlier.
Logical axioms: A ⊢ A for any atom A.
Inference rules:
$$\frac{A,\Gamma \vdash D}{(A\,\&\,B,\Gamma) \vdash D}\ \&\vdash \qquad \frac{B,\Gamma \vdash D}{(A\,\&\,B,\Gamma) \vdash D}\ \&\vdash \qquad \frac{\Gamma \vdash A \qquad \Delta \vdash B}{(\Gamma,\Delta) \vdash A\,\&\,B}\ \vdash\&$$
$$\frac{\Gamma \vdash A \qquad B,\Delta \vdash D}{(A\supset B,\Gamma,\Delta) \vdash D}\ \supset\vdash \qquad \frac{A,\Gamma \vdash}{\Gamma \vdash A\supset B}\ \vdash\supset' \qquad \frac{A,\Gamma \vdash B}{\Gamma \vdash A\supset B}\ \vdash\supset \qquad \frac{\Gamma \vdash B}{\Gamma \vdash A\supset B}\ \vdash\supset''$$
$$\frac{\Gamma \vdash A}{(\neg A,\Gamma) \vdash}\ \neg\vdash \qquad \frac{A,\Gamma \vdash}{\Gamma \vdash \neg A}\ \vdash\neg$$
$$\frac{A,\Gamma \vdash D \qquad B,\Delta \vdash D}{(A\lor B,\Gamma,\Delta) \vdash D}\ \lor\vdash \qquad \frac{\Gamma \vdash A}{\Gamma \vdash A\lor B}\ \vdash\lor \qquad \frac{\Gamma \vdash B}{\Gamma \vdash A\lor B}\ \vdash\lor$$
$$\frac{A[t],\Gamma \vdash D}{(\forall y A[y],\Gamma) \vdash D}\ \forall\vdash \qquad \frac{\Gamma \vdash A[t]}{\Gamma \vdash \exists x A[x]}\ \vdash\exists \qquad \frac{A[y],\Gamma \vdash D}{(\exists x A[x],\Gamma) \vdash D}\ \exists\vdash\,(*) \qquad \frac{\Gamma \vdash A[y]}{\Gamma \vdash \forall x A[x]}\ \vdash\forall\,(*)$$
where (*) denotes the eigenvariable condition: y does not have any free occurrences in Γ or D. D stands for a single atom or no atom at all. The exchange rule is implicit. If Γ and Δ are lists of formulas, then (Γ, Δ) is the result of concatenating them and contracting repetitions modulo names of bound variables: A, A′, Γ ⊢ B is replaced by A, Γ ⊢ B iff A′ is obtained from A by renaming bound variables.
In essence, we consider the left side of a sequent to be a set of formulas. We note that the rule ⊢⊃′ is necessary only in case F contains the negation connective ¬.
Theorem 1. A closed sequent ⊢ F′ is derivable in GJm iff the sequent ⊢ F is derivable in GJ', where F′ is obtained from F by replacing subformulas A ⊃ ⊥ with ¬A and renaming bound variables. F′ may not contain ⊥. The proof is easy.
4.2 Labelling
One of the main ideas of the general resolution framework for logics with a subformula property (e.g. classical, intuitionistic and linear logics) in [11, 12, 13] is to label subformulas of any investigated formula F with new atomic formulas, in order to reduce the depth of a formula. This labelling is encoded inside the given logic, using the implication connective (a connective ⊃ such that for any two formulas A and B one has ⊢ A ⊃ B iff A ⊢ B holds), assumed to be present (or encodable) in the logic. In general not all the subformulas have to be labelled. For example, since labelling atomic formulas cannot reduce the depth of the formula, atomic formulas are usually not labelled. A formula (A ⋆ (B ∘ C)), where ⋆ and ∘ are arbitrary binary connectives, can be labelled
as
$$\underbrace{(A \;\star\; \underbrace{(B \,\circ\, C)}_{L_1})}_{L_2}$$
and the derivability of the labelled formula can generally be encoded as
$$(B \circ C) \supset L_1,\ \ L_1 \supset (B \circ C),\ \ (A \star L_1) \supset L_2,\ \ L_2 \supset (A \star L_1)\ \vdash\ L_2$$
in the two-sided sequent calculus, provided that ⊃ is an implication. A label is thus treated as an atomic formula. For ordinary logics it is possible to keep only one of the defining implications (… ⊃ L) and (L ⊃ …) for each label L. Implication may be expressed in terms of other connectives in the logic (this is the case in classical and linear logics, but not in intuitionistic logic), possibly allowing a more convenient encoding and special optimisations. Furthermore, in classical logic it is possible to use equivalent transformations to convert any formula F to an equivalent sequent S where all the formulas have a subformula depth less than three (if we assume classical disjunction and conjunction to be n-ary connectives for any n) without introducing labels. It appears that in the case of the predicate calculus it is useful to label a subformula S with the atom formed by taking a new predicate (say, P) and giving it the set of free variables x1, …, xn of S as arguments: P(x1, …, xn). This enables us to encode the labelling of the predicate formula inside the logic itself.
Example. Consider the formula F1:
$$(\forall x\, P(x,b)) \,\&\, (\forall x \forall y\, (P(x,y) \supset B(x,y))) \supset \forall x \exists y\, (B(x,y) \lor P(y,x))$$
We label all the nonatomic subformulas of F1. The following is a labelled form of F1, with the labels attached to the leading connectives of subformulas for better readability:
$$(\forall^{L_1} x\, P(x,b)) \,\&^{L_7}\, (\forall^{L_3} x,y\, (P(x,y) \supset^{L_2(x,y)} B(x,y))) \supset^{L_8} \forall^{L_6} x\, \exists^{L_5(x)} y\, (B(x,y) \lor^{L_4(x,y)} P(y,x))$$

4.3 Instantiating Derivation Rules
We are now going to present the second main idea of the generic resolution method proposed in [9] and developed in [11], [13]: starting the search with maximally general axioms and building unification into the derivation rules. Unification is also the essential idea behind Robinson's resolution. Let F be a formula we are trying to prove. We are interested in finding an efficient implementation of the resolution restriction of the sequent calculus: any derived sequent must contain only subformulas of F. Using standard sequent rules and throwing away all the derived sequents which do not satisfy the restriction would be wasteful. Instead, we will create a new instantiated sequent calculus for F. Since each label in the labelled form of F corresponds to a subformula, and this subformula has a certain leading connective, we can create a set of instances of the ordinary sequent rules of GJm which allow us to derive the labels (not the subformulas themselves) directly. Every sequent derived in this instantiated calculus satisfies the resolution restriction. Each occurrence of a label has a mapping to the set of instances of the rules in GJm corresponding to the leading connective of the labelled subformula.
4.3.1 Rule Derivation Algorithm RR1
Consider a subformula S of F in the abstract form C(A1, …, An), where C is the leading connective and A1, …, An are the argument formulas. By label(X) we denote the label of an arbitrary subformula X. Instances of the sequent rules corresponding to label(S) are built from the sequent rules R1, …, Rm for the connective C in the following way. For each rule Ri, replace the main formula of the rule by label(S). Replace the modified conclusion Γ ⊢ B of the rule by (Γ ⊢ B)σ and add the side condition: σ is obtained by unifying all the side formulas in the modified rule with the corresponding labels from the set label(A1), …, label(An). The eigenvariable condition (y does not occur freely in the conclusion G) is translated as: yσ is a variable and yσ does not occur in the substitution instance Gσ of the modified conclusion G. The following polarity optimisation is obvious: remove those rules which introduce label(S) with a polarity different from the polarity of S in F.
4.3.2 Axiom Derivation Algorithm RA1
We are not satisfied with the form of the axioms in GJm: A ⊢ A for any atom A. Even if we restrict A to atoms led by a predicate occurring in F, the list of axioms for a predicate formula is still infinite, since there is generally an infinite number of instances of each non-ground atom in F. The problem is overcome by using the maximally general atoms for each predicate symbol in F as axioms. The set of possible axioms for the formula F is obtained by taking one axiom of the most general form P(x1, …, xn) ⊢ P(x1, …, xn) for each predicate symbol P in F which has both a negative and a positive occurrence in F. Completeness is preserved if we use axiom instantiation, forming the set of axioms by taking a set of instances instead: for every positive occurrence of an atom A and every negative occurrence of an atom B form the axiom (B ⊢ A)σ where σ = mgu(A, B). Our prover uses the following extended axiom set: for every negative occurrence of a subformula FL and every positive occurrence of a subformula FR form the axiom (L ⊢ R)σ where σ = mgu(L, R) and L, R are the labels of FL, FR.
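The following sketch (ours, not the prover's code) shows the basic axiom construction of RA1 together with a compact first-order unification routine; the extended axiom set described above would additionally run `unify` on the labels of every negative/positive subformula pair and keep the unified instances. The term encoding and all names are our assumptions.

```python
# Terms: a variable is a string starting with '?'; any other string is a
# constant; compound terms and atoms are tuples (functor, arg, ...).

def is_var(t):
    return isinstance(t, str) and t.startswith('?')

def walk(t, subst):
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    t = walk(t, subst)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, subst) for a in t[1:])

def unify(a, b, subst=None):
    """Return an mgu of two terms/atoms as a dict, or None if none exists."""
    subst = {} if subst is None else dict(subst)
    stack = [(a, b)]
    while stack:
        s, t = stack.pop()
        s, t = walk(s, subst), walk(t, subst)
        if s == t:
            continue
        if is_var(s):
            if occurs(s, t, subst):
                return None
            subst[s] = t
        elif is_var(t):
            if occurs(t, s, subst):
                return None
            subst[t] = s
        elif isinstance(s, tuple) and isinstance(t, tuple) \
                and s[0] == t[0] and len(s) == len(t):
            stack.extend(zip(s[1:], t[1:]))
        else:
            return None
    return subst

def general_axioms(predicates):
    """predicates: dict name -> (arity, occurs_positively, occurs_negatively).
       One maximally general axiom P(x1,...,xn) |- P(x1,...,xn) for every
       predicate that occurs with both polarities in F."""
    axioms = []
    for name, (arity, pos, neg) in predicates.items():
        if pos and neg:
            atom = (name,) + tuple('?x%d' % i for i in range(arity))
            axioms.append((atom, atom))          # pair: (left atom, right atom)
    return axioms

# For the running example F1 (assumed signature, ours):
print(general_axioms({'P': (2, True, True), 'B': (2, True, True)}))
```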
4.3.3 Proof Search Algorithm RD1
Proof search is carried out by applying the derivation rules obtained by the algorithm RR1 to both the axioms obtained by RA1 and the sequents derived earlier in the search. The proof is found if the sequent ⊢ label(F) has been derived, where label(F) is the label of the whole formula F. Before applying any derivation rule, all the variables in the premisses are renamed, so that the sets of variables in the premisses and the rule do not overlap. After a rule has been successfully applied, all the factors and factors of factors of the conclusion are derived using the factorization rule:
$$\frac{X, X', \Gamma \vdash Y}{(X, \Gamma \vdash Y)\sigma}\quad \sigma = mgu(X, X')$$
Finally, all the repetitions in the left side of the derived sequent are deleted.
Example. We continue the example. Recall the formula F1 and its labelling. The following is the set of labelled rule instances formed from the sequent rules of GJm. Recall the implicit exchange and contraction in GJm: the left side of a sequent is a set of atoms.
$$\forall_{L_1}\vdash:\ \frac{X,\Gamma \vdash Y}{(L_1,\Gamma \vdash Y)\sigma}\quad \sigma = mgu(X, P(x,b))$$
$$\supset_{L_2}\vdash:\ \frac{\Gamma \vdash X \qquad Y,\Delta \vdash Z}{(\Gamma,\Delta,L_2(x,y) \vdash Z)\sigma}\quad \sigma = mgu(X, P(x,y);\ Y, B(x,y))$$
$$\forall_{L_3}\vdash:\ \frac{X,\Gamma \vdash Y}{(L_3,\Gamma \vdash Y)\sigma}\quad \sigma = mgu(X, L_2(x,y))$$
$$\vdash\lor_{L_4}:\ \frac{\Gamma \vdash Y}{(\Gamma \vdash L_4(x,y))\sigma}\quad \sigma = mgu(Y, B(x,y)) \text{ or } \sigma = mgu(Y, P(y,x))$$
$$\vdash\exists_{L_5}:\ \frac{\Gamma \vdash Y}{(\Gamma \vdash L_5(x))\sigma}\quad \sigma = mgu(Y, L_4(x,y))$$
$$\vdash\forall_{L_6}:\ \frac{\Gamma \vdash Y}{(\Gamma \vdash L_6)\sigma}\quad \sigma = mgu(Y, L_5(x)),\ VAR(x\sigma),\ NOOCC(x\sigma, (\Gamma \vdash L_6)\sigma)$$
$$\&_{L_7}\vdash:\ \frac{X,\Gamma \vdash Y}{L_7,\Gamma \vdash Y}\quad X \equiv L_1 \text{ or } X \equiv L_3$$
$$\vdash\supset''_{L_8}:\ \frac{\Gamma \vdash Y}{\Gamma \vdash L_8}\quad Y \equiv L_6$$
$$\vdash\supset_{L_8}:\ \frac{X,\Gamma \vdash Y}{\Gamma \vdash L_8}\quad X \equiv L_7,\ Y \equiv L_6$$
The instance of the rule ⊢∀ corresponding to L6 contains the eigenvariable condition: xσ must be a variable and it must not occur in (Γ ⊢ L6)σ. The form of the rules &L7⊢ and ⊢⊃L8 has been simplified, since one of the literals to be resolved upon is a predicate symbol with no arguments and thus the substitution, if it exists, will always be empty. The set of possible axioms for the labelled F1 contains two sequents:
$$P(x,y) \vdash P(x,y) \qquad B(x,y) \vdash B(x,y)$$
Recall that all the variables in each premiss sequent must be different from the variables both in the other premiss sequents and in the rule. Variable renaming is used implicitly. The label-form derivation of F1:
1. P(x,y) ⊢ P(x,y) (axiom)
2. B(x′,y′) ⊢ B(x′,y′) (axiom)
3. P(x,y), L2(x,y) ⊢ B(x,y) by ⊃L2⊢ from 1 and 2
4. P(x,y), L3 ⊢ B(x,y) by ∀L3⊢ from 3
5. P(x,y), L7 ⊢ B(x,y) by &L7⊢ from 4
6. L1, L7 ⊢ B(x,b) by ∀L1⊢ from 5
7. L7 ⊢ B(x,b) by &L7⊢ from 6
8. L7 ⊢ L4(x,b) by ⊢∨L4 from 7
9. L7 ⊢ L5(x) by ⊢∃L5 from 8
10. L7 ⊢ L6 by ⊢∀L6 from 9
11. ⊢ L8 by ⊢⊃L8 from 10
Consider, for a change, an attempt to derive F1 using the right-side argument P(y,x) of the disjunction B(x,y) ∨ P(y,x):
1. P(x,y) ⊢ P(x,y) (axiom)
2. P(x,y) ⊢ L4(y,x) by ⊢∨L4 from 1
3. P(x,y) ⊢ L5(y) by ⊢∃L5 from 2
4. P(x,y) ⊢ L6 (attempted by ⊢∀L6 from 3)
The last step of the derivation is not allowed due to the eigenvariable condition in the rule ⊢∀L6: xσ = y, and y occurs in the conclusion P(x,y) ⊢ L6. An attempt to derive F1 using ∀L1⊢ higher in the derivation also fails due to the eigenvariable condition: xσ = b, and b is not a variable.
1. P(x,y) ⊢ P(x,y) (axiom)
2. L1 ⊢ P(x,b) by ∀L1⊢ from 1
3. L1 ⊢ L4(b,x) by ⊢∨L4 from 2
4. L1 ⊢ L5(b) by ⊢∃L5 from 3
5. L1 ⊢ L6 (attempted by ⊢∀L6 from 4, blocked since ¬VAR(b))
Theorem 2. The proof search algorithm RD1, using the axioms and the instances of sequent calculus rules generated by the algorithms RA1 and RR1 presented above, is sound and complete.
Proof. We refer to [11, 13] for the details of a proof of a similar system. Soundness is easy to prove. The principal idea of the completeness proof is the following: show completeness for the propositional case, which is straightforward, then use the lifting lemma standardly used for completeness proofs of Robinson's resolution (see e.g. [4, 16, 6]) in order to lift the proof to the predicate level. □
4.4 Clause Notation
It is convenient to formalize the axioms and the system of instances of GJm rules produced by the algorithms RA1 and RR1 using the clause notation familiar from Robinson's resolution. This enables us to encode the problem of finding a derivation of an intuitionistic formula in a manner which is very similar to the hyperresolution strategy of ordinary Robinson-style resolution for classical logic. Consequently, it is possible to implement an automated theorem prover reusing the data structures and procedures of existing classical theorem provers. Indeed, this is the case with our implementation, which shares most of the code with our implementation of a classical resolution theorem prover. The calculus RJm is obtained from RJp given in [13] by the following inessential modifications:
- the redundant clauses ⇒ p and ⇒ q1 are removed from the rules ∨⊢ and ∃⊢, respectively;
- two rules for the negation connective ¬ are added;
- the rule Res is split into a number of different cases: ⊢(∨, ∃, ⊃″), (&, ∀)⊢ and ⊢&. Such a modification was considered in the system RIp in [11];
- the notational difference: we write ¬L1, …, ¬Lm, R instead of L1, …, Lm ⇒ R.
Derivable objects of RJm are clauses which represent sequents. A clause is a set of literals: the negative literals (if any) in the clause represent the atoms on the left of the sign ⊢, the positive literal (if any) represents the atom on the right side of ⊢ in the corresponding sequent.
The left premiss in all of the following resolution rules except factorization is a rule clause, obtained by translating the corresponding rule in the instantiation of GJm. We will not give an exact algorithm for the translation, since RJm is a mere notational variant of instantiated GJm. The other premisses are derived clauses. The rule clauses are analogous to the nucleons and the derived clauses are analogous to the electrons of the hyperresolution strategy of ordinary classical resolution, see [4]. All the literals to the left of | in a rule clause have to be resolved upon. The literal to the right of | will go to the conclusion. The rule ⊢⊃ is different from the usual rules of classical resolution: two literals to the left of | are resolved upon with two literals in a single non-rule premiss clause. The rules ⊢∀ and ∃⊢ are also nonstandard, since they contain the eigenvariable condition. ¬Γ denotes a set of negative literals. mgu(p1, p1′; …; pm, pm′) denotes the result of unifying the terms f(p1, …, pm) and f(p1′, …, pm′). R denotes either a positive literal or no literal at all. L denotes the label introduced by the rule. Each label L corresponds to a subformula with a polarity and a leading connective, indicated to the left of the rule. VAR(t) denotes that t is a variable, NOOCC(v, t) denotes that v does not occur in t.
$$\vdash\supset:\ \frac{p, \neg q \mid L \qquad \neg\Gamma, \neg p', q'}{(\neg\Gamma, L)\sigma}\quad \sigma = mgu(p, p';\ q, q')$$
$$\vdash(\supset', \neg):\ \frac{p \mid L \qquad \neg\Gamma, \neg p'}{(\neg\Gamma, L)\sigma}\quad \sigma = mgu(p, p')$$
$$\neg\vdash:\ \frac{\neg p \mid \neg L \qquad \neg\Gamma, p'}{(\neg\Gamma, \neg L)\sigma}\quad \sigma = mgu(p, p')$$
$$\supset\vdash:\ \frac{p, \neg q \mid \neg L \qquad \neg\Gamma, q' \qquad \neg\Delta, \neg p', R}{(\neg\Gamma, \neg\Delta, \neg L, R)\sigma}\quad \sigma = mgu(p, p';\ q, q')$$
$$\lor\vdash:\ \frac{p, q \mid \neg L \qquad \neg\Gamma, \neg p', R \qquad \neg\Delta, \neg q', R'}{(\neg\Gamma, \neg\Delta, \neg L, R)\sigma}\quad \sigma = mgu(p, p';\ q, q';\ R, R')$$
$$\vdash\forall x:\ \frac{\neg p \mid L \qquad \neg\Gamma, p'}{(\neg\Gamma, L)\sigma}\quad \sigma = mgu(p, p'),\ VAR(x\sigma),\ NOOCC(x\sigma, (\neg\Gamma, L)\sigma)$$
$$\exists x\vdash:\ \frac{p \mid \neg L \qquad \neg\Gamma, \neg p', R}{(\neg\Gamma, \neg L, R)\sigma}\quad \sigma = mgu(p, p'),\ VAR(x\sigma),\ NOOCC(x\sigma, (\neg\Gamma, \neg L, R)\sigma)$$
$$\vdash(\lor, \exists, \supset''):\ \frac{\neg p \mid L \qquad \neg\Gamma, p'}{(\neg\Gamma, L)\sigma}\quad \sigma = mgu(p, p')$$
$$\vdash\&:\ \frac{\neg p, \neg q \mid L \qquad \neg\Gamma, p' \qquad \neg\Delta, q'}{(\neg\Gamma, \neg\Delta, L)\sigma}\quad \sigma = mgu(p, p';\ q, q')$$
$$(\&, \forall)\vdash:\ \frac{p \mid \neg L \qquad \neg\Gamma, \neg p', R}{(\neg\Gamma, \neg L, R)\sigma}\quad \sigma = mgu(p, p')$$
$$Fact:\ \frac{\neg\Gamma, \neg p, \neg p', R}{(\neg\Gamma, \neg p, R)\sigma}\quad \sigma = mgu(p, p')$$
Example. We continue the example. Recall the formula F1, the labelled form of F1 and the corresponding instantiation of GJm. 'conc' will denote the conclusion of applying the rule clause with the substitution σ. The following is the set of instantiated rules from GJm in clause form:
Axiom clauses: ¬P(x,y), P(x,y) and ¬B(x,y), B(x,y).
Rule clauses:
$$\forall_{L_1}\vdash:\ P(x,b) \mid \neg L_1$$
$$\supset_{L_2}\vdash:\ \neg P(x,y),\ B(x,y) \mid \neg L_2(x,y)$$
$$\forall_{L_3}\vdash:\ L_2(x,y) \mid \neg L_3$$
$$\vdash\lor_{L_4}:\ \neg B(x,y) \mid L_4(x,y)\ \text{ and }\ \neg P(y,x) \mid L_4(x,y)$$
$$\vdash\exists_{L_5}:\ \neg L_4(x,y) \mid L_5(x)$$
$$\vdash\forall_{x,L_6}:\ \neg L_5(x) \mid L_6 \qquad VAR(x\sigma),\ NOOCC(x\sigma, conc)$$
$$\&_{L_7}\vdash:\ L_1 \mid \neg L_7\ \text{ and }\ L_3 \mid \neg L_7$$
$$\vdash\supset''_{L_8}:\ \neg L_6 \mid L_8$$
$$\vdash\supset_{L_8}:\ L_7,\ \neg L_6 \mid L_8$$
Now consider the derivations presented in the example above. Each of these is trivially translated to a clause-notation derivation using the clause rules: replace the sequent notation Γ ⊢ R with the clause notation ¬Γ, R.
5 Strategies of Resolution
The general principle of modifying the calculus is the same for resolution as for the tableau calculus: diminishing the branching factor at search nodes. However, in the resolution case the search is carried out in a top-down manner and thus we want to diminish branching in the "down" direction. A number of modifications and lemmas are shared with the tableau method. For example, invertibility of rules can be employed by both. It is frequently (though not always) the case that a lemma justifying a strategy which is natural for one search procedure turns out to be usable also for the other procedure.
5.1 Axioms and Polarity Optimization
The algorithms RR1 and RA1 introduced two restriction strategies: polarity optimisation and axiom instantiation. Both are present already in [9]. The extension strategy of RA1, using an extended axiom set, is different from all the other strategies we present, in the sense that it does not diminish the branching factor at search nodes. On the contrary, the size of the set of axioms is increased. We have found in experiments that the extension strategy can sometimes give an enormous (in principle, super-exponential) improvement of efficiency. A trivial case: proving formulas F ⊃ F with a complex F. At the same time it has caused only a minor efficiency loss for the test formulas where the additional axioms, if created at all, were of no use. The extension strategy obviously does not destroy completeness. The soundness of this strategy follows from the fact that any sequent of the form F ⊢ F is intuitionistically derivable.
5.2 Subsumption
Definition 2. The clause Γ subsumes the clause Δ iff Γσ ⊆ Δ for some substitution σ.
Subsumption is a strategy well known from standard Robinson's resolution for classical logic. It works in exactly the same way for intuitionistic logic: every derived clause which is subsumed by some existing clause is immediately removed. Also, in case a newly derived clause subsumes some existing clauses Γ1, …, Γn, then all the latter are immediately removed from the search space. The following lemma is an old result, see [13]:
Lemma 1. The subsumption strategy preserves completeness of resolution.
Proof. The proof follows from the lifting lemma and the fact that GJm does not contain an explicit rule of weakening: all the necessary weakening steps are combined implicitly into non-structural rules, for example ⊢⊃′ and ⊢⊃″. Let Γ subsume Δ. Any derivation containing Δ can be transformed into a (possibly shorter) derivation containing Γ instead. If Γσ ⊂ Δ, then we know that the literals missing in Γσ are restored, wherever necessary, by the implicit weakening present in the rules of GJm. □
We note that the subsumption strategy loses completeness for calculi with an explicit weakening rule.
5.3 Inversion Strategy
A rule
$$\frac{P_1 \quad P_2 \quad \cdots \quad P_n}{C}$$
is called invertible iff its inversion is true: each Pi (1 ≤ i ≤ n) is derivable from C.
There exist sequent calculi for classical logic in which all the rules are invertible. This is impossible for intuitionistic logic, however. In a backward-chaining proof search procedure the invertibility of a rule R can be used in the following way: in case R is backwards-applicable to a sequent, apply R immediately and do not consider any other rule applications to that sequent. In case several invertible rules are applicable, choose one arbitrarily. Invertibility removes a lot of or-branching during the search. The following two rules in GJm are invertible: ⊢⊃ and ⊢¬. However, in GI also the rules ⊢&, &⊢, ∨⊢, ⊢⊃, ⊢∀, ∃⊢ and ∀⊢ are invertible. Invertibility of the rules in GI can be carried over to proof search using GJm or RJm.
Definition 3. A label is called invertible iff it corresponds to one of the following rules: ⊢¬, ⊢&, &⊢, ∨⊢, ⊢⊃, ⊢∀, ∃⊢ and ∀⊢.
Definition 4. The inversion strategy for RJm: it is prohibited to use a rule clause introducing a label L for the derivation of a clause (Γ, R, L) such that R is an invertible label and L is not an invertible label. The strong inversion strategy for RJm:
- introduce an arbitrary complete order ≺ on all the invertible labels,
- prohibit the use of a rule clause introducing a label L for the derivation of a clause (Γ, R, L) such that either R is an invertible label and L is not an invertible label, or R ≺ L.
Completeness of the inversion strategy for RJm is implied by the completeness of the strong inversion strategy for RJm.
Lemma 2. The strong inversion strategy preserves completeness for RJm.
Proof. An analogous lemma is proved in [13] for the rules ⊢&, &⊢, ∨⊢ and ⊢⊃. The proof is easily extendable to the strong inversion strategy and the rules ⊢¬, ⊢∀, ∃⊢ and ∀⊢. We will summarize the main points of the proof from [13] with the modifications for our case. Consider the sequent calculus GI from [17]: the rules ⊢&, &⊢, ∨⊢, ⊢⊃, ⊢∀, ∃⊢ and ∀⊢ are all invertible. For every derivation D in GI there is a derivation D′ in GI plus weakening, such that all the axioms in D′ have the form A ⊢ A or ⊥ ⊢ A and the weakening inferences are moved down to the inferences where the formulas introduced by weakening occur as side formulas. D′ can be transformed into a derivation in GJm by removing the explicit weakening inferences, while no permutations of non-structural rules are necessary. Thus the invertibility of rules in GI can be carried over to proof search using GJm. Now it suffices to observe that we can assume that in case a sequent is derivable in GJm, then there is a derivation satisfying the invertibility condition. The derivation in GJm is transformed to RJm without permuting rule applications. □
5.4 Reduction Strategy
For several kinds of derived clauses we can immediately see that it is possible to allow only a single rule to be applied to the clause and not to consider any other rule applications. A general scheme of reduction strategies for the resolution method is proposed in [21]. The reduction strategy for linear logic developed independently in [19] was of crucial importance for the efficiency of the linear resolution prover in [19].
Definition 5. We say that a clause Γ is reducible by a reduction strategy iff only one clause can be derived from Γ according to this strategy. Any such derivation figure (reduction) Γ ⇒ Δ consists of one or several ordinary single-premiss inference steps, called reduction steps, Γ ⇒ Γ1 ⇒ … ⇒ Γn ⇒ Δ, where Δ is the single clause derivable from Γ and all the intermediate clauses Γ, Γ1, …, Γn are discarded.
Definition 6. We say that some clause Γ is fully reduced to a clause Δ iff immediately after the derivation of Γ a chain of n reduction steps is applied to Γ, producing a reduction Γ ⇒ Δ where Δ is not reducible any more. The reduction strategy consists in converting every derived clause to a fully reduced form. A rule clause p1, …, pn | L in RJm has unique side formulas iff all the literals p1, …, pn are led either by labels or by predicates which have only a single occurrence with the given polarity in the formula to be proved.
Theorem 3. Completeness is preserved if the following rules with unique side formulas are, whenever applicable, used as reduction rules for clauses: ⊢∨, &⊢, ∀⊢, ⊢∃, ¬⊢, ∃⊢. The rule ⊢∀ with unique side formulas can be used as a reduction rule in case the formula to be proved does not have any negative occurrences of ∨.
Proof. By induction over the derivation tree in GJm: reduction preserves applicability of all the other rules. We transform a derivation which does not follow the reduction strategy into a derivation which does follow the reduction strategy. An inference of the form ⊢∨, &⊢, ∀⊢, ⊢∃, ¬⊢, ∃⊢ with unique side formulas is called a reduction inference. We will say that a reduction inference I with a side formula A occurring on the derivation branch S is a late reduction iff the same inference is applicable to A somewhere higher in S. Notice that all the reduction inferences except ⊢∀ and ∃⊢ are applicable immediately when the side formula A is derived. The induction parameter is the number of late reduction inferences in the derivation. The induction step consists in moving a late reduction inference upwards in the derivation tree, to the position where it is not late. Consider a derivation D containing a late reduction inference I with a side formula A and a main formula B:
$$\frac{\Gamma, A \vdash R}{\Gamma, B \vdash R} \qquad \text{or} \qquad \frac{\Gamma \vdash A}{\Gamma \vdash B}$$
Move the reduction I higher in the tree, to the place where it is not late. We call the resulting derivation D′ and we have to show that it is correct. Notice that if and only if D contains ∨⊢ inferences above I, then several copies of the reduction I may be moved to several different nodes in D′. Replacing a set of parametric formulas Γ in an inference J with a set of parametric formulas Γ′ may prohibit the inference in only three cases:
1. the empty right side of a premiss is replaced by a formula;
2. Γ′ contains more variables than Γ;
3. the inference J uses the rule ∨⊢, having the form
$$\frac{A, \Gamma \vdash R \qquad B, \Delta \vdash R}{A \lor B, \Gamma, \Delta \vdash R}$$
and the modified inference has the form
$$\frac{A, \Gamma' \vdash R' \qquad B, \Delta' \vdash R''}{A \lor B, \Gamma', \Delta' \vdash R'}$$
such that R′ is different from R″.
The main formula B in a copy of I cannot be a side formula of any inference between a copy of I in D′ and the original location of I, hence B is always among the parametric formulas in this affected part of the derivation. The rules ⊢⊃′ and ⊢¬ cannot be used for reduction inferences, thus case (1) above cannot occur. The copies of I in D′ do not introduce new variables, thus case (2) above cannot occur. Consider case (3). The formulas R′ and R″ may be different in D′ only in case R is the side formula of the reduction I and the reduction I is applicable in the derivation of one premiss of the rule ∨⊢ but not in the other premiss. This can happen only if I uses a right-hand rule with a side condition, and ⊢∀ is the only such rule. Since we have prohibited the use of ⊢∀ as a reduction rule in proof search for formulas containing ∨ negatively, case (3) cannot occur. Thus all the inferences between a copy of I in D′ and the original location of I remain correct. □
We note that the reduction strategy is not fully compatible with the inversion strategy. Our implementation gives preference to reduction, reducing a formula even if the reduction does not satisfy the invertibility condition. Thus the invertibility condition for the fully reduced clauses has to be weakened accordingly. We have chosen to treat all the labels introduced by reduction as being non-invertible.
5.5 Nesting Strategy
Definition 7. Let F be the formula we want to prove. A clause containing literals L and R is nested iff one of the following holds:
1. R is a label of a subformula FR in F such that FR does not occur in the scope of a negation or in the left side of an implication, and L is a label of a subformula FL of FR.
2. C is a subformula of F and C does not occur in the scope of a negation or in the left side of an implication. C has the form A ∨ B or A & B. R is a label of a subformula of A, L is a label of a subformula of B.
Nesting strategy: all the nested clauses are immediately discarded. A unique predicate is a predicate which has only a single occurrence with the given polarity in F . The nesting strategy can be strengthened by treating unique predicates in the same way as labels.
Theorem 4. The nesting strategy preserves completeness.
Proof. Let F be the formula to be proved. Let L, R and S be labels of subformulas FL, FR and FS, respectively. We show that nested sequents cannot occur in the GJm-derivation of F. Consider case (1) in the definition of nesting. The nested sequent N has the form Γ, FL ⊢ FR. Any sequent derivable from N has the form Δ ⊢ FS such that FR is a subformula of FS. Therefore it is impossible to derive a clause ⊢ G with an empty left side from N. Consider case (2). No formula containing the subformula C can occur in the left side of a sequent. It suffices to consider the form of the rules ⊢∨ and ⊢&: no subformulas of A and B can occur in the same sequent in the derivation of F. □
5.6 Horn Strategy and Hyperresolution
Consider a formula F which is convertible by equivalent transformations to a formula F′ = (H1 & … & Hn) ⊃ G such that each Hi (1 ≤ i ≤ n) has the form of a Horn pre-clause: Qx1 … Qxm.((A1 & … & Am) ⊃ B), where each Q is either ∃ or ∀ and B and each Ai (1 ≤ i ≤ m) is an atom. A Horn clause form of Hi is a clause ¬A′1, …, ¬A′m, B′, where B′ and each A′i is a Skolemized version of B or Ai, correspondingly. We will denote the Horn clause form of a clause H by HCF(H).
Definition 8. A Horn strategy of proving (H1 & … & Hn) ⊃ G, with each Hi being a Horn pre-clause, consists in searching for a proof of G with the additional axioms HCF(H1), …, HCF(Hn) and an additional rule, the cut rule:
$$\frac{\neg\Gamma, A \qquad \neg A', \neg\Delta, B}{(\neg\Gamma, \neg\Delta, B)\sigma}\quad \sigma = mgu(A, A')$$
We refer to [13] for the proof of the following lemma:
Lemma 3. The Horn strategy is sound and complete.
We note that the Horn strategy without the cut rule is incomplete. As a counterexample consider the formula (a & (a ⊃ b)) ⊃ b. Care has to be taken when using the Horn strategy for formulas containing the equality predicate. The Skolemization included in the conversion to Horn clause form is not sound for intuitionistic logic with the equality predicate in case the substitution axioms ∀x, y (x = y ⊃ f(x) = f(y)) of equality allow substitution into terms led by the Skolem functions. Soundness is preserved only if the equality substitution axioms for a particular formula are generated before Skolemization and thus do not allow substitution into terms led by the Skolem functions. It can be shown that this last restriction preserves completeness.
Definition 9. A hyperresolution strategy of proving F′ = (H1 & … & Hn) ⊃ G, with each Hi being a Horn pre-clause, consists in searching for a proof of G with the additional axioms HCF(H1), …, HCF(Hn) and an additional rule, hyperresolution:
$$\frac{\neg L_1, \ldots, \neg L_n, R \qquad L_1' \quad \cdots \quad L_n'}{R\sigma}\quad \sigma = mgu(L_1, L_1';\ \ldots;\ L_n, L_n')$$
where the leftmost premiss is restricted to be one of the axioms HCF(H1), …, HCF(Hn) and all the other premisses contain a single literal.
Theorem 5. The hyperresolution strategy is sound and complete for such formulas F where the formula G in the definition above does not contain implication and does not contain negative occurrences of disjunction.
Proof. Soundness follows from the soundness of the Horn strategy. The completeness proof is constructed using the previous lemma about the completeness of the Horn strategy and the lemma about the completeness of hyperresolution for classical logic, see [4] for the latter. □
We note that the hyperresolution strategy is incomplete in the general case. As a counterexample, consider deriving ⊢ a ⊃ c from the sequents a ⊢ b and b ⊢ c. In case the formula G in the definition of the strategy is atomic, the hyperresolution strategy reduces to the standard hyperresolution method for classical logic.
5.7 One-premiss Heuristics
Our prover uses the following heuristic: once a clause C is derived and kept, apply all the usable one-premiss rules to C in all the possible ways. The clauses derived during this step are treated in the same way. The effect of the heuristic is to push the applications of multiple-premiss rules as low in the derivation tree as possible. Since the multiple-premiss rules are typically capable of producing many more clauses than the one-premiss rules, the net effect is to push strongly branching nodes in the search as far away from the current search front as possible. It is easy to see that the tree of one-premiss rule applications is always finite, thus the heuristic preserves completeness. The one-premiss heuristic has given a major improvement in efficiency for most of the examples we have experimented with.
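A minimal sketch of this eager closure under one-premiss rules follows (ours, not the prover's code); `one_premiss_rules` and `keep` are assumed helpers supplied by the caller, and clauses are assumed to be hashable tuples of literals.

```python
# Hedged sketch of the one-premiss heuristic: whenever a clause is derived and
# kept, immediately close it under all applicable single-premiss rules and
# treat the results the same way.

def close_under_one_premiss(clause, one_premiss_rules, keep):
    """Return every kept clause reachable from `clause` by one-premiss rules."""
    kept, frontier, seen = [], [clause], set()
    while frontier:
        c = frontier.pop()
        if c in seen:
            continue
        seen.add(c)
        if not keep(c):                   # retention tests (subsumption etc.)
            continue
        kept.append(c)
        for rule in one_premiss_rules:
            frontier.extend(rule(c))      # the tree of one-premiss steps is finite
    return kept
```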
5.8 Implementing the Eigenvariable Condition
We win some efficiency by modifying the rules with the eigenvariable condition:
$$\vdash\forall x:\ \frac{\neg p \mid L \qquad \neg\Gamma, p'}{(\neg\Gamma, L)\sigma}\quad \sigma = mgu(p, p'),\ VAR(x\sigma),\ NOOCC(x\sigma, (\neg\Gamma, L)\sigma)$$
$$\exists x\vdash:\ \frac{p \mid \neg L \qquad \neg\Gamma, \neg p', R}{(\neg\Gamma, \neg L, R)\sigma}\quad \sigma = mgu(p, p'),\ VAR(x\sigma),\ NOOCC(x\sigma, (\neg\Gamma, \neg L, R)\sigma)$$
We take a new special constant c not occurring anywhere in the formula and replace each rule clause ¬p | L for ⊢∀x (respectively p | ¬L for ∃x⊢) with a clause ¬p_c | L (respectively p_c | ¬L), where p_c = p{c/x}. The rules ⊢∀x and ∃x⊢ are replaced by the following modified versions:
$$\vdash\forall' x:\ \frac{\neg p_c \mid L \qquad \neg\Gamma, p'}{(\neg\Gamma, L)\sigma}\quad \sigma = mgu(p_c, p'),\ NOOCC(c, (\neg\Gamma, L)\sigma)$$
$$\exists' x\vdash:\ \frac{p_c \mid \neg L \qquad \neg\Gamma, \neg p', R}{(\neg\Gamma, \neg L, R)\sigma}\quad \sigma = mgu(p_c, p'),\ NOOCC(c, (\neg\Gamma, \neg L, R)\sigma)$$
It is easy to see that the optimized versions of these two rules will be applicable to the same premisses and will always give the same result as the unoptimized versions.
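The runtime check that remains after this optimization is just a non-occurrence test for the reserved constant. A minimal sketch of that check (ours, with an assumed term encoding and an assumed name for the reserved constant):

```python
# Hedged sketch of the optimized eigenvariable test of Section 5.8: the
# eigenvariable has been replaced by a reserved constant EIGEN that occurs
# nowhere else, so after unification it suffices to check that EIGEN does not
# occur in the derived clause.

EIGEN = 'c*'                       # reserved constant, assumed fresh

def occurs_in(symbol, term):
    if term == symbol:
        return True
    return isinstance(term, tuple) and any(occurs_in(symbol, t) for t in term)

def eigenvariable_ok(conclusion_clause):
    """NOOCC(c, conclusion): the reserved constant must not survive."""
    return not any(occurs_in(EIGEN, lit) for lit in conclusion_clause)
```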
5.9 The Main Loop of the Prover
Our resolution prover uses the given-clause algorithm which is common for resolution provers for classical logic, e.g. OTTER [10]. Two main lists are maintained: sos (set of support) is the list of clauses yet to be considered, usable is the list of active clauses. All the axioms are initially put into the sos list, and the usable list is initially empty.
While (sos is not empty and no refutation has been found):
1. Let given_clause be the 'lightest' clause in sos.
2. Move given_clause from sos to usable.
3. Infer and process new clauses. Each new clause must either have the given_clause as one of its parents and members of usable as its other parents, or it must be derived from such a new clause by single-premiss rules. New clauses that pass the retention tests are kept, that is, appended to sos.
End of while loop.
The algorithm for choosing the 'lightest' clause in sos is also rather common for classical resolution provers: pick the smallest clause in sos N times, then pick the first clause in sos, then pick the smallest clause N times again, etc. The size of a clause is the count of occurrences of variables and constant, function and predicate symbols in the clause. The benchmarks presented in this paper have all been obtained with N = 4.
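For concreteness, a minimal sketch of the given-clause loop and the clause-picking heuristic just described (ours, not the actual prover); `infer`, `keep` and `is_goal` are assumed helpers, and clauses are lists or tuples of literal tuples.

```python
# Hedged sketch of the given-clause main loop with the "N lightest picks, then
# one oldest pick" heuristic.

def weight(clause):
    """Count of symbol and variable occurrences in the clause."""
    def count(term):
        if isinstance(term, tuple):
            return sum(count(t) for t in term)
        return 1
    return sum(count(lit) for lit in clause)

def given_clause_loop(axioms, infer, keep, is_goal, N=4):
    sos, usable, picks = list(axioms), [], 0
    while sos:
        if picks % (N + 1) < N:
            given = min(sos, key=weight)        # lightest clause, N times
        else:
            given = sos[0]                      # then the oldest clause once
        picks += 1
        sos.remove(given)
        usable.append(given)
        for new in infer(given, usable):        # includes one-premiss closure
            if is_goal(new):                    # e.g. the clause  label(F)
                return True
            if keep(new, sos, usable):          # retention tests
                sos.append(new)
    return False
```

Picking by weight alone can starve old clauses; interleaving an oldest-clause pick every N rounds keeps the search fair, which is why the paper reports all benchmarks with N = 4.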
6 A Decidable Class of Intuitionistic Logic
We first note that the resolution prover without special strategies is not a decision procedure for the class of formulas which do not contain ∀ negatively. However, we can make the resolution prover decide that class by limiting the depth of derivations to the number of occurrences of quantifiers and propositional connectives in the formula to be proved. We will now present a new decidable class of intuitionistic logic, called Near-Monadic. The decidability is proved by showing that the resolution method decides the class: a resolution prover can derive only a finite number of clauses from a near-monadic formula.
Definition 10. By Near-Monadic we denote the class of intuitionistic formulas without function symbols such that no negative occurrence of a subformula contains more than one free variable.
The Near-Monadic class is similar to the Monadic class (the function-free class where all the predicates have arity 1) and its extension, the Essentially Monadic class (the function-free class where all the atoms contain no more than one variable). It is known that although the Essentially Monadic class is decidable for classical logic, even the Monadic class is undecidable for intuitionistic logic. However, the Near-Monadic class does not contain the whole Monadic class. For example, the monadic formula ∀x∀y ¬(P(x) ∨ R(y)) is not near-monadic, since the negative subformula P(x) ∨ R(y) contains two free variables. Any essentially monadic formula F of classical logic can be converted to an equivalent formula F′ such that no occurrence of a subformula in F′ contains more than one free variable. The conversion consists in moving all the quantifiers inwards to the atoms. Such an equivalent conversion does not exist for monadic formulas in intuitionistic logic.
Theorem 6. The resolution method incorporating the subsumption strategy is a decision procedure for the Near-Monadic class.
We will first define the well-known notion of splitting (see e.g. [6]) and prove Lemma 4.
Definition 11. The splitting of a clause R is a set R1, …, Rn of subsets (called blocks) of R, such that:
- each literal in R occurs in a single block,
- no two blocks Ri and Rj (i ≠ j) share variables,
- no splitting of R contains more than n blocks.
For example, the splitting of a clause {P(x,y), R(x), S(y), G(z,u), M(u,u), L(v)} is the set {{P(x,y), R(x), S(y)}, {G(z,u), M(u,u)}, {L(v)}}. In the following lemma we assume a clause language without function symbols: atoms are built of predicate symbols, variables and constants.
Lemma 4. Assume that we have a finite set of predicate symbols P and a finite set of constant symbols C. Consider the set A of all atoms built using only variables and elements of P and C. Let A01 ⊆ A be such that no member of A01 contains more than one variable. Let A2+ = A − A01. Let S be a set of clauses built from the elements of A01 and A2+ so that no clause in S contains more than a single element of A2+, S contains all the factorizations of clauses in S, and no two clauses in S subsume each other. Then S cannot be infinite.
Proof. Let a clause R be built from the elements of A01 and A2+, so that it contains no more than one element of A2+. Build a splitting R1, …, Rn of R. No two blocks Ri and Rj (i ≠ j) share variables. Each block Ri contains either
0. no variables,
1. a single variable, or
2. more than one variable.
Due to the construction of R, the splitting R1, …, Rn can contain no more than one block of type (2). In case a block Ri of type (1) can be obtained from another block Rj of type (1) by renaming the variable occurring in Rj (Ri = Rj{x/y}, where x is the variable occurring in Ri and y is the variable occurring in Rj), then the factor R{x/y} of R subsumes R. In order to show that the clauses in S have a limited length, we have to show that among the splittings of clauses in S there is only a finite number of possible blocks modulo renaming of variables. This follows from the limitations on the construction of clauses in S: only elements of A01 and A2+ may be used, and no clause may contain more than one element of A2+. □
We are now in a position to prove Theorem 6.
Proof. Consider a near-monadic formula F. All the atoms N labelling negative subformulas in F contain no more than one variable. Since F is function-free, it is impossible for resolution to derive an instance of N (including input tautologies) such that the instance contains more than one variable. Since our logic is intuitionistic, no derived clause may contain more than one atom labelling a positive subformula of F. Thus all the derivable clauses satisfy the conditions of Lemma 4. Thus the number of derivable clauses is finite. □
For example, consider the following four formulas from [18]. Each of these is provable classically, but not intuitionistically. For each of these our resolution prover exhausts the search space in ca 0.016 seconds, keeping 0-2 of the derived clauses, and then stops, thus proving the unprovability of the formula. The tableau prover [17], however, never terminates the unsuccessful proof search for these four formulas.
S1: ∀x(p(x) ∨ q(c)) ⊃ (q(c) ∨ ∀x p(x))
S2: (a(c) ⊃ ∃x b(x)) ⊃ ∃x(a(c) ⊃ b(x))
S3: ¬∀x a(x) ⊃ ∃x ¬a(x)
S4: ((∀x a(x)) ⊃ b(c)) ⊃ ∃x(a(x) ⊃ b(c))
7 Experiments with Implementations
Only three implementations of tableau provers for full first-order intuitionistic logic are known to us, namely the tableau prover (written in C) described in [17], the tableau prover of R. Dyckhoff (written in Prolog) implementing the calculus from [5], and the tableau prover of N. Shankar (written in Lisp), implementing his dynamic Skolemization strategies from [18]. Unfortunately we do not have any benchmarks for the latter two provers; we only know (from personal communication) that for the following examples 1.1-1.8 with alternating quantifiers Shankar's prover is more than an order of magnitude faster than the prover from [17], mainly due to the dynamic Skolemization strategies for handling quantifiers. We have implemented a resolution prover for intuitionistic logic. No other resolution provers for the intuitionistic predicate calculus are known to us. Our prover is implemented in Scheme and compiled to C by the Scheme-to-C compiler Hobbit implemented by the author, using A. Jaffer's Scheme interpreter scm as a library. Our prover uses the calculus RJm along with all the strategies described in the current paper.
7.1 The Benchmark Suite from the SICS paper
We compare the performance of the tableau prover from [17] and our resolution prover on the set of all the examples (except the specific query-examples 8.1 and 8.2) provided in [17]. Both provers are compiled and run on a Sun SparcServer 10 (rated at 65), which is about the same speed as a 66-MHz PC clone with a Pentium, or double the speed of a 66-MHz PC clone with an 80486DX2 processor. When obtaining the timings presented in the following table, we chose not to use the Horn strategy and the hyperresolution strategy, for the reason that some of the preprocessing steps in these strategies could have been applied also by the tableau prover. We feel that the Horn strategy and the hyperresolution strategy would have made the comparison unfair. We will later analyse some separate benchmarks and demonstrate the effect of the Horn and hyperresolution strategies. Remark: our timings of the prover in [17] are better than the timings presented in [17] itself, due to using a faster machine. [17] systematizes the examples in the following way: group 1 - alternations of quantifiers, group 2 - append, group 3 - problems 39-43 from Pelletier's collection, group 4 - existence, group 5 - unification, group 6 - simple, group 7 - problematic. Group 6 is relatively uninteresting. It deserves mentioning that the prover in [17] uses an efficient nonstandard approach to unification in the proof search (see the results of group 5). The column "tabl t" refers to the time (in seconds) it takes to prove the formula with the number in the "nr" column using the tableau prover, "res t" refers to the time it takes to prove the same formula using the resolution prover, and "clnr" refers to the number of clauses kept during the search with the resolution prover (most of the derived clauses are redundant, for several possible reasons, and are not kept). All the timings are given in seconds. Shorter times in the table are relatively inaccurate.
nr   | tabl t | res t | clnr || nr   | tabl t | res t | clnr || nr   | tabl t | res t | clnr
1.1  | 0.02   | 0.04  | 14   || 1.2  | 0.36   | 0.06  | 24   || 1.3  | fail   | 0.09  | 30
1.4  | 0.01   | 0.05  | 8    || 1.5  | 1.30   | 0.07  | 24   || 1.6  | 14.00  | 0.10  | 38
1.7  | 0.11   | 0.02  | 7    || 1.8  | fail   | 0.03  | 13   || 2.1  | 1.30   | 0.03  | 12
2.2  | 2.74   | 0.03  | 13   || 2.3  | 4.49   | 0.03  | 18   || 2.4  | 54.68  | 0.04  | 20
3.1  | <0.01  | <0.01 | 10   || 3.2  | 0.06   | 0.03  | 24   || 3.3  | <0.01  | 0.22  | 98
3.4  | 0.01   | 0.03  | 21   || 3.5  | 1.93   | 0.06  | 27   || 4.1  | 8.23   | 0.03  | 17
4.2  | mem    | 0.03  | 18   || 4.3  | mem    | 0.04  | 22   || 5.1  | 0.01   | 0.02  | 3
5.2  | 0.02   | 0.09  | 3    || 5.3  | 0.54   | mem   | 3(?) || 6.1  | <0.01  | 0.03  | 12
6.2  | <0.01  | 0.02  | 6    || 6.3  | 0.01   | 0.03  | 20   || 6.4  | <0.01  | 0.04  | 25
6.5  | 0.01   | 0.07  | 35   || 6.6  | <0.01  | 0.02  | 13   || 6.7  | <0.01  | <0.01 | 6
6.8  | <0.01  | <0.01 | 5    || 6.9  | <0.01  | 0.02  | 6    || 6.10 | <0.01  | 0.04  | 31
6.11 | <0.01  | 0.02  | 10   || 6.12 | 0.04   | 0.03  | 17   || 6.13 | 0.05   | 0.18  | 71
6.14 | 0.01   | 0.05  | 13   || 6.15 | 0.03   | 0.03  | 17   || 7.1  | <0.01  | 0.02  | 11
7.2  | 0.04   | 0.05  | 18   || 7.3  | 3.30   | 0.07  | 20   || 7.4  | fail   | 0.09  | 22
Notes: "mem" in the resolution times column means that the resolution prover stopped due to memory management limitations (here this occurred as a result of unifying huge terms). "mem" in the tableau times column has an analogous meaning, except that on the machine of the authors of the prover those two formulas 4.2 and 4.3 were proved successfully in 43 and 49 seconds. "fail" means that several hours of proof search produced no result and we stopped the search. We present a selection of the formulas from the table above; for the rest see [17]. All the subformulas H ↔ G are replaced by (H ⊃ G) & (G ⊃ H) before the proof search is started.
7.1.1 Alternations of Quantifiers
The tableau prover performs poorly on the larger examples from this group. It is possible to improve the performance of a tableau prover on this group by using the dynamic Skolemization techniques from [18]. The tableau prover of Shankar implementing those techniques has proved the example 1.3 in ca 10 seconds.
1.1 ∀x∃y∀z(p(x) & q(y) & r(z)) ↔ ∀z∃y∀x(p(x) & q(y) & r(z))
Data: tableau time 0.02, resolution time 0.04, clauses kept: 14.
1.2 ∀x∃y∀z∃w(p(x) & q(y) & r(z) & s(w)) ↔ ∃w∀z∃y∀x(p(x) & q(y) & r(z) & s(w))
Data: tableau time 0.36, resolution time 0.06, clauses kept: 24.
1.3 ∀x∃y∀z∃w∀u(p(x) & q(y) & r(z) & s(w) & t(u)) ↔ ∀u∃w∀z∃y∀x(p(x) & q(y) & r(z) & s(w) & t(u))
Data: tableau time fail, resolution time 0.09, clauses kept: 30.
1.6 ∃z∀x∃y∀w∃u(p(x) & q(y) & r(z) & s(w) & t(u)) ↔ ∃u∀w∃y∀x∃z(p(x) & q(y) & r(z) & s(w) & t(u))
Data: tableau time 14.00, resolution time 0.1, clauses kept: 38.
1.8 ∃x1∀y1∃x2∀y2∃x3∀y3(p(x1,y1) & q(x2,y2) & r(x3,y3)) ⊃ ∀y3∃x3∀y2∃x2∀y1∃x1(p(x1,y1) & q(x2,y2) & r(x3,y3))
Data: tableau time fail, resolution time 0.03, clauses kept: 13.
7.1.2 Append

2.3 ∀x app(nil,x,x) & ∀x∀y∀z∀w(app(y,z,w) ⊃ app(cons(x,y),z,cons(x,w))) ⊃ ∃x app(cons(a1,cons(a2,cons(a3,cons(a4,cons(a5,cons(a6,cons(a7,nil))))))),nil,x)
Data: tableau time 4.49, resolution time 0.03, clauses kept: 18.

2.4 ∀x app(nil,x,x) & ∀x∀y∀z∀w(app(y,z,w) ⊃ app(cons(x,y),z,cons(x,w))) ⊃ ∃x app(cons(a1,cons(a2,cons(a3,cons(a4,cons(a5,cons(a6,cons(a7,cons(a8,nil)))))))),nil,x)
Data: tableau time 54.68, resolution time 0.04, clauses kept: 20.
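Read as clauses (a purely illustrative rendering of ours, not the prover's internal representation), the premises of 2.3 and 2.4 are the definite Horn clauses

    app(nil, x, x)
    app(cons(x,y), z, cons(x,w)) ← app(y, z, w)

and the conclusion is an atomic goal of the form app(cons(a1, ...), nil, x), so these problems essentially amount to running the usual append program; for Horn formulas of this kind classical and intuitionistic provability coincide.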
7.1.3 Pelletier's Collection

3.1 ¬∃x∀y(mem(y,x) ↔ ¬mem(x,x))
Data: tableau time < 0.01, resolution time < 0.01, clauses kept: 10.
7.1.4 Existence

4.1 ∀x(p(x) ⊃ p(h(x)) ∨ p(g(x))) & ∃x p(x) & ∀x ¬p(h(x)) ⊃ ∃x p(g(g(g(g(g(x))))))
Data: tableau time 8.23, resolution time 0.03, clauses kept: 17.

4.2 ∀x(p(x) ⊃ p(h(x)) ∨ p(g(x))) & ∃x p(x) & ∀x ¬p(h(x)) ⊃ ∃x p(g(g(g(g(g(g(x)))))))
Data: tableau time fail (memory), resolution time 0.03, clauses kept: 18.
7.1.5 Unification

The time required to prove the problems in this group depends entirely on the efficiency of the unification procedure for very large terms; the actual search space is very small. The unification procedure of the SICS tableau prover is better suited for large terms than the unification procedure of our resolution prover.

5.2 ∀x0∃x1∃x2∃x3∃x4∃x5∃x6∃x7∃x8∃x9∃x10 (p(x1,x2,x3,x4,x5,x6,x7,x8,x9,x10) ↔ p(f(x0,x0), f(x1,x1), f(x2,x2), f(x3,x3), f(x4,x4), f(x5,x5), f(x6,x6), f(x7,x7), f(x8,x8), f(x9,x9)))
Data: tableau time 0.02, resolution time 0.09, clauses kept: 3.

5.3 ∀x0∃x1∃x2∃x3∃x4∃x5∃x6∃x7∃x8∃x9∃x10∃x11∃x12∃x13∃x14∃x15 (p(x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15) ↔ p(f(x0,x0), f(x1,x1), f(x2,x2), f(x3,x3), f(x4,x4), f(x5,x5), f(x6,x6), f(x7,x7), f(x8,x8), f(x9,x9), f(x10,x10), f(x11,x11), f(x12,x12), f(x13,x13), f(x14,x14)))
Data: tableau time 0.54, resolution time fail (memory), clauses kept: 3(?). The resolution prover crashes while allocating memory during unification.
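The blowup has a simple arithmetic explanation. Unifying the two sides of 5.3 yields the bindings x1 = f(x0,x0), x2 = f(x1,x1), ..., x15 = f(x14,x14); if substitutions are applied eagerly to tree-shaped terms, with no structure sharing, the fully expanded value of xn contains 2^(n+1) - 1 symbols, i.e. 65535 symbols already for n = 15, and the intermediate bindings together need space exponential in n. The following minimal sketch (our own illustration; it is not the code of either prover, and the term representation is ours) reproduces this growth:

    # Illustrative only: mimics solving x1 = f(x0,x0), ..., xn = f(x_{n-1},x_{n-1})
    # by eager substitution into tree-shaped terms (no structure sharing),
    # as a naive unifier would do on problems 5.2 and 5.3.

    def size(term):
        """Number of symbols in a term given as a string or (functor, args)."""
        if isinstance(term, str):
            return 1
        _, args = term
        return 1 + sum(size(a) for a in args)

    def eager_solution(n):
        """Fully expanded value of xn as a tree without sharing."""
        value = "x0"
        for _ in range(n):
            value = ("f", [value, value])   # x_i := f(x_{i-1}, x_{i-1}), already expanded
        return value

    if __name__ == "__main__":
        for n in (5, 10, 15):
            # prints 63, 2047, 65535, i.e. 2**(n + 1) - 1
            print(n, size(eager_solution(n)))

With structure sharing (terms as DAGs) the same solution needs only space linear in n, which is one way a unification procedure can be "better suited for large terms".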
7.1.6 Simple

6.11 ¬∃x∀y(q(y) ⊃ r(x,y)) & ∃x∀y(s(y) ⊃ r(x,y)) ⊃ ¬∀x(q(x) ⊃ s(x))
Data: tableau time 0.01, resolution time 0.02, clauses kept: 10.

6.13 ∀x(p(x) ↔ q(x) ∨ r(x) ∨ ∃y s(x,y)) & ∃x∃y(s(y,x) ∨ g(x)) & ∀x(g(x) ↔ ∃y s(x,y) ∨ ∃z(r(z) ∨ q(z) ∨ s(a,z))) ⊃ ∃x q(x) ∨ ∃x r(x) ∨ ∃x∃y s(x,y)
Data: tableau time 0.05, resolution time 0.18, clauses kept: 71. This is one example from the benchmark set where the tableau prover performs significantly better than the resolution prover.
7.1.7 Problematic

The problems in this group fall into the class which is decidable by the tableau prover: no negative occurrences of ∀. Nevertheless, the huge number of possible substitution combinations into the existentially bound variables makes the formulas very hard to prove, unless suitable strategies are used. The success of the resolution prover on these examples relies on the strategy of an extended axiom set: the additional axioms for the formula F are all the clauses of the form (¬H, G)σ, where σ = mgu(H, G), H is a negative and G is a positive occurrence of a subformula in F.

7.2 q(a1,a2,a3,a1,a2,a3) ⊃ ∃x1∃x2∃x3∃y1∃y2∃y3((p(x1) & p(x2) & p(x3) ↔ p(y1) & p(y2) & p(y3)) & q(x1,x2,x3,y1,y2,y3))
Data: tableau time 0.04, resolution time 0.05, clauses kept: 18. If we do not use the extended axiom set, the resolution time goes up to 4.9 seconds, with 296 clauses kept.

7.3 q(a1,a2,a3,a4,a1,a2,a3,a4) ⊃ ∃x1∃x2∃x3∃x4∃y1∃y2∃y3∃y4((p(x1) & p(x2) & p(x3) & p(x4) ↔ p(y1) & p(y2) & p(y3) & p(y4)) & q(x1,x2,x3,x4,y1,y2,y3,y4))
Data: tableau time 3.30 seconds, resolution time 0.07 seconds, clauses kept: 20. If we do not use the extended axiom set, the resolution time goes up to ca. eight minutes, with 2680 clauses kept.

7.4 q(a1,a2,a3,a4,a5,a1,a2,a3,a4,a5) ⊃ ∃x1∃x2∃x3∃x4∃x5∃y1∃y2∃y3∃y4∃y5((p(x1) & p(x2) & p(x3) & p(x4) & p(x5) ↔ p(y1) & p(y2) & p(y3) & p(y4) & p(y5)) & q(x1,x2,x3,x4,x5,y1,y2,y3,y4,y5))
Data: tableau time fail, resolution time 0.09, clauses kept: 22. If we do not use the extended axiom set, we do not manage to find a resolution proof during ca. one day of search.
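To illustrate the definition above (this worked instance is our own reading, given only as an example): in 7.2 the atom H = q(a1,a2,a3,a1,a2,a3) occurs negatively, as the antecedent of the implication, and G = q(x1,x2,x3,y1,y2,y3) occurs positively, so σ = mgu(H, G) = {x1 ↦ a1, x2 ↦ a2, x3 ↦ a3, y1 ↦ a1, y2 ↦ a2, y3 ↦ a3}, and the corresponding extra axiom is the clause (¬q(a1,a2,a3,a1,a2,a3), q(a1,a2,a3,a1,a2,a3)). In this way the relevant instantiation of the existentially bound variables is made available to the search directly, instead of being found by enumerating substitution combinations.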
7.2 Examples from Chang and Lee
We will consider two group theory examples from [4]. Both can be converted to Horn clauses, so classical and intuitionistic provability coincide. The motivation for these examples is to demonstrate the effect of the hyperresolution strategy.

(∀x p(a,x,x) & ∀x p(x,a,x) & ∀x p(x,x,a) & p(b,c,d) & ∀x∀y∀u∀z∀v∀w(p(x,y,u) & p(y,z,v) & p(x,v,w) ⊃ p(u,z,w)) & ∀x∀y∀u∀z∀v∀w(p(x,y,u) & p(y,z,v) & p(u,z,w) ⊃ p(x,v,w))) ⊃ p(c,b,d)
Data: tableau time 10 seconds; resolution time without the Horn strategy or hyperresolution 0.87 seconds, clauses kept: 367; resolution time with hyperresolution 0.04 seconds, clauses kept: 4.

(∀x p(a,x,x) & ∀x p(i(x),x,a) & ∀x∀y∀u∀z∀v∀w(p(x,y,u) & p(y,z,v) & p(x,v,w) ⊃ p(u,z,w)) & ∀x∀y∀u∀z∀v∀w(p(x,y,u) & p(y,z,v) & p(u,z,w) ⊃ p(x,v,w))) ⊃ p(b,a,b)
Data: tableau time 0.35 seconds; resolution time without the Horn strategy or hyperresolution 0.36 seconds, clauses kept: 227; resolution time with hyperresolution 0.04 seconds, clauses kept: 10.
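Under a Horn clause reading (again a purely illustrative rendering of ours, not the prover's input syntax), the first example consists of the facts p(a,x,x), p(x,a,x), p(x,x,a) and p(b,c,d), the two rules

    p(u,z,w) ← p(x,y,u), p(y,z,v), p(x,v,w)
    p(x,v,w) ← p(x,y,u), p(y,z,v), p(u,z,w)

and the atomic goal p(c,b,d). On Horn clauses hyperresolution derives only positive unit facts, which is consistent with the very small numbers of clauses kept (4 and 10) reported above.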
7.3 Constructive Geometry of von Plato
In the paper [14] Jan von Plato presents a first order axiomatization of constructive elementary geometry and proves a number of theorems in geometry. We will consider the axiomatization of the geometry of apartness and convergence and prove several theorems from [14]. The prover is run on a Sun Sparcstation 5. The axiomatization consists of the following four groups of axioms.

Axioms 1:
(∀x ¬dipt(x,x))
(∀x ¬diln(x,x))
(∀x ¬con(x,x))
(∀xyz(dipt(x,y) ⊃ (dipt(x,z) ∨ dipt(y,z))))
(∀xyz(diln(x,y) ⊃ (diln(x,z) ∨ diln(y,z))))
(∀xyz(con(x,y) ⊃ (con(x,z) ∨ con(y,z))))

Axioms 2:
(∀xy(dipt(x,y) ⊃ ¬apt(x,ln(x,y))))
(∀xy(dipt(x,y) ⊃ ¬apt(y,ln(x,y))))
(∀xy(con(x,y) ⊃ ¬apt(pt(x,y),x)))
(∀xy(con(x,y) ⊃ ¬apt(pt(x,y),y)))

Axiom 3:
(∀xyuv((dipt(x,y) & diln(u,v)) ⊃ ((apt(x,u) ∨ apt(x,v)) ∨ (apt(y,u) ∨ apt(y,v)))))

Axioms 4:
(∀xyz(apt(x,y) ⊃ (dipt(x,z) ∨ apt(z,y))))
(∀xyz(apt(x,y) ⊃ (diln(y,z) ∨ apt(x,z))))
(∀xyz(con(x,y) ⊃ (diln(y,z) ∨ con(x,z))))

We will proceed to the theorems originally proved in [14]. Unless stated otherwise, the prover from [17] eventually stopped with a memory allocation or stack full error after ca. 10 seconds of work. The resolution prover proved the first theorem 3.1 (uniqueness of constructed lines) in 120 seconds (7242 clauses kept), using the full axiomatization (groups 1, 2, 3, 4) and no hints or additional lemmas:
(∀xyu((dipt(x,y) & (¬apt(x,u) & ¬apt(y,u))) ⊃ ¬diln(u,ln(x,y))))
The same theorem was proved in 1.8 seconds (712 clauses kept) using only axioms 2 and 3. Using the same limited axiomatization, the prover from [17] succeeded in 0.2 seconds.
The theorem 3.2 was proved in 0.25 seconds (170 clauses kept) from the full axiomatization:
(∀xy(con(x,y) ⊃ diln(x,y)))
The theorem 3.3 (uniqueness of constructed points) was proved in 138 seconds (7715 clauses kept) from the full axiomatization:
(∀xyz((con(x,y) & ¬apt(z,x) & ¬apt(z,y)) ⊃ ¬dipt(z,pt(x,y))))
The resolution prover was unable to prove the following theorem 4.1.i.r during ca. 10 hours of search. However, when we limited the axiomatization to contain only axiom groups 2 and 3, the proof was found in 0.5 seconds (364 clauses kept). Using the same limited axiomatization, the prover from [17] was able to prove this theorem in 71 seconds.
((dipt(a,b) & con(l,m) & diln(l,ln(a,b))) ⊃ (apt(a,l) ∨ apt(b,l)))
The resolution prover was also unable to prove the following theorem 4.1.i.l during ca. 10 hours of search. With the axiomatization limited to axiom groups 2 and 4, the proof was found in 0.4 seconds (215 clauses kept). Using the same limited axiomatization, the prover from [17] was able to prove this theorem in 0.2 seconds.
((dipt(a,b) & con(l,m) & (apt(a,l) ∨ apt(b,l))) ⊃ diln(l,ln(a,b)))
The provers exhibited analogous behaviour for the theorems 4.1.ii.r and 4.1.ii.l:
((dipt(a,b) & con(l,m) & diln(l,m) & dipt(a,pt(l,m))) ⊃ (apt(a,l) ∨ apt(a,m)))
((dipt(a,b) & con(l,m) & (apt(a,l) ∨ apt(a,m))) ⊃ dipt(a,pt(l,m)))
The theorem 4.2 was too hard for the resolution prover when no extra lemmas were given. We managed to find the proof in 36 seconds (4113 clauses kept), using the theorem 4.1.i and axioms from group 1 only.
((dipt(a,b) & dipt(c,d)) ⊃ ((apt(a,ln(c,d)) ∨ apt(b,ln(c,d))) ⊃ (apt(c,ln(a,b)) ∨ apt(d,ln(a,b)))))
The theorem 4.3.i was proved in 20 seconds (3460 clauses kept) from the full axiomatization and in 0.1 seconds (87 clauses kept) from groups 2 and 4 only. Using the same limited axiomatization, the prover from [17] was able to prove this theorem in 173 seconds.
((dipt(a,b) & apt(c,ln(a,b))) ⊃ (dipt(c,a) & dipt(c,b)))
The theorem 4.3.ii was too hard for the resolution prover when the full axiomatization was used. It was proved in 0.5 seconds (260 clauses kept) from the axiom groups 2 and 4, though. Using the same limited axiomatization, the prover from [17] was able to prove this theorem in 158 seconds.
((dipt(a,b) & apt(c,ln(a,b))) ⊃ (diln(ln(a,b),ln(c,a)) & diln(ln(a,b),ln(c,b))))
Due to lack of space we will not consider the experiments with the other theorems in [14].
7.4 Some Conclusions
As always, it is very hard to compare the relative performance of one search method to another, since minute changes in the representation of the problem or in the strategies often have a crucial effect on the search. However, in our experiments the resolution prover is a clear winner on almost all the harder benchmark problems presented in [17] or [14]. In particular, it takes less than a second for our resolution prover to prove any of the [17] benchmarks (with the special unification problem being an exception) for which the tableau prover in [17] fails. Considering the problems in constructive geometry, we note that the prover can be used as a practical tool in the hands of a human mathematician, using the machine to fill in gaps in schematic proof plans. However, without any human assistance or guidance the prover will be unable to find proofs of complex theorems.
References
[1] Andrews, P. B. Resolution in Type Theory. Journal of Symbolic Logic, 36, 414-432 (1971).
[2] Andrews, P. B., Miller, D. A., Longini Cohen, E., Pfenning, F. Automating Higher Order Logic. In Automated Theorem Proving: After 25 Years, pages 169-192, Contemporary Mathematics Series, vol. 29, American Mathematical Society, 1984.
[3] Beeson, M. J. Some applications of Gentzen's proof theory in automated deduction. Manuscript, 1988.
[4] Chang, C. L., Lee, R. C. T. Symbolic Logic and Mechanical Theorem Proving. Academic Press, 1973.
[5] Dyckhoff, R. Contraction-free sequent calculi for intuitionistic logic. Journal of Symbolic Logic, 57(3), 795-807 (1992).
[6] Fermüller, C., Leitsch, A., Tammet, T., Zamov, N. Resolution Methods for the Decision Problem. LNCS 679, Springer Verlag, 1993.
[7] Fitting, M. Resolution for Intuitionistic Logic. Paper presented at ISMIS '87, Charlotte, NC, 1987.
[8] Lifschitz, V. What Is the Inverse Method? Journal of Automated Reasoning, 5, 1-23 (1989).
[9] Maslov, S. Ju. An inverse method of establishing deducibility in the classical predicate calculus. Dokl. Akad. Nauk SSSR 159 (1964), 17-20 = Soviet Math. Dokl. 5 (1964), 1420, MR 30 #3005.
[10] McCune, W. OTTER 2.0 Users Guide. Tech. Report ANL-90/9, Argonne National Laboratory, Argonne, IL, March 1990.
[11] Mints, G. Gentzen-type Systems and Resolution Rules. Part I. Propositional Logic. In COLOG-88, pages 198-231, LNCS 417, Springer Verlag, 1990.
[12] Mints, G. Resolution Calculus for the First Order Linear Logic. Journal of Logic, Language and Information, 2, 58-93 (1993).
[13] Mints, G. Resolution Strategies for the Intuitionistic Logic. In Constraint Programming, NATO ASI Series F, vol. 131, pages 289-311, Springer Verlag, 1994.
[14] von Plato, J. The Axioms of Constructive Geometry. Annals of Pure and Applied Logic, 76(2), 169-200 (1995).
[15] Pym, D. J., Wallen, L. A. Investigations into proof-search in a system of first-order dependent function types. In CADE-10, pages 236-250, Springer Verlag, 1990.
[16] Robinson, J. A. A Machine-oriented Logic Based on the Resolution Principle. Journal of the ACM, 12, 23-41 (1965).
[17] Sahlin, D., Franzén, T., Haridi, S. An Intuitionistic Predicate Logic Theorem Prover. Journal of Logic and Computation, 2(5), 619-656 (1992).
[18] Shankar, N. Proof Search in the Intuitionistic Sequent Calculus. In CADE-11, pages 522-536, LNCS 607, Springer Verlag, 1992.
[19] Tammet, T. Proof Strategies in Linear Logic. Journal of Automated Reasoning, 12(3), 273-304 (1994).
[20] Tammet, T., Smith, J. Optimised Encodings of Fragments of Type Theory in First Order Logic. In Proceedings of the CADE-12 Workshop on Proof Search in Type-Theoretic Languages, Nancy, 1994, 87-93.
[21] Voronkov, A. Theorem proving in non-standard logics based on the inverse method. In CADE-11, pages 648-662, LNCS 607, Springer Verlag, 1992.
[22] Volozh, B., Matskin, M., Mints, G., Tyugu, E. PRIZ System and the Propositional Calculus. Kibernetika and Software, 6, 1983.
[23] Wallen, L. A. Automated Proof Search in Non-Classical Logics. MIT Press, 1990.