Improving Efficiency of a Theorem Prover by Eliminating Redundant Unifications Using Network Structures*

Shie-Jue Lee and Chih-Hung Wu
Department of Electrical Engineering
National Sun Yat-Sen University
Kaohsiung, Taiwan 80424

*Supported by National Science Council under grant NSC 81-04083-110-509.
0-8188-4212-2/93 $03.00 © 1993 IEEE

Abstract

In [6], a theorem proving method, called the hyper-linking strategy, was proposed to eliminate duplication of instances of clauses during the process of inference. A theorem prover which implements the strategy was also constructed. In this implementation, many literal unifications and partial unifications are performed repetitively from round to round, resulting in a large overhead when many rounds of hyper-linking are needed for hard problems. We propose a technique which maintains information across rounds by constructing shared network structures, so that redundant work on calculating literal unifications and partial unifications, and on duplicate instance checking in each hyper-linking round, is avoided. Experiments show that the overhead is reduced significantly when the required number of rounds is large.

1 Introduction

The hyper-linking strategy proposed in [6] eliminates duplication of instances of clauses during the process of inference. A theorem prover, called CLIN, which implements the strategy, was constructed [6]. CLIN has been shown to be competitive with, or even better than, some major theorem proving systems on a spectrum of typical test problems. However, the implementation in CLIN is not efficient in terms of speed. Literal unifications and partial unifications are calculated repetitively from round to round, resulting in a large overhead when many rounds of hyper-linking are needed for hard problems. We propose a technique which applies a clause network, similar to the rule network in the Rete algorithm [2], to keep information across rounds, so that redundant work on calculating literal unifications and partial unifications, and on duplicate instance checking in each hyper-linking round, is avoided. The overhead is reduced significantly when the required number of hyper-linking rounds is large.

In the sequel, we start with a brief introduction of the hyper-linking strategy. Then we show where the inefficiency of CLIN comes from. The network structures used for solving this inefficiency are then described, followed by a presentation of test results.

Given a set $S$ of clauses, a link in $S$ is defined to be a pair $(L, M)$ of literals such that both $L$ and $M$ appear in clauses in $S$ and such that $L$ and $\neg M$ are unifiable. If $C = \{L_1, \ldots, L_m\}$ is a clause in $S$, then a hyper-link of $C$ is a set $\{(L_1, M_1), \ldots, (L_m, M_m)\}$ of links such that there exists a substitution $\theta$ such that $L_i\theta$ and $M_i\theta$ are complementary for all $i$, $1 \le i \le m$. A most general such $\theta$ is called the substitution of the hyper-link, and $C\theta$ for this $\theta$ is called the instance of the hyper-link. We call $C$ the nucleus of the hyper-link and the $M_i$ the electrons of the hyper-link. This instance generation is called a hyper-link operation.

The hyper-linking strategy basically applies the hyper-link operation to generate new instances and periodically tests the unsatisfiability of the ground set of all the retained clauses. If the input set $S$ of clauses is unsatisfiable, then the hyper-linking strategy is guaranteed to find an unsatisfiable set of ground clauses.

The hyper-linking strategy is compatible with the set of support strategy. Three support sets are defined. Suppose we are given an input set $S$ of clauses. A forward support set is the set of all positive clauses in $S$. A backward support set is the set of all negative clauses in $S$. A user support set is a set of clauses provided by the user. If $T$ is the user support set, then it is assumed that $S - T$ is satisfiable. The clauses in the support set are called forward supported, backward supported, or user supported respectively.

As the hyper-link operation proceeds, the number of supported clauses increases. More and more clauses are supported according to the following criterion. Suppose $C$ is a nucleus of a hyper-link, $L_1, \ldots, L_m$ are the positive literals that are electrons for $C$, and $M_1, \ldots, M_n$ are the negative literals that are electrons for $C$. Thus $C$ has $m$ negative literals, linked by the $L_i$, and $n$ positive literals, linked by the $M_j$. Let $A_1, \ldots, A_m$ be the clauses containing $L_1, \ldots, L_m$ respectively, and let $B_1, \ldots, B_n$ be the clauses containing $M_1, \ldots, M_n$. Let $D$ be the instance of $C$ obtained by this hyper-link. Then $D$ is forward supported if $m = 0$ ($C$ is all positive) or if all $A_i$ are forward supported. $D$ is backward supported if $n = 0$ ($C$ is all negative) or if all $B_j$ are backward supported. $D$ is user supported if $C$ was user supported or if some $A_i$ or $B_j$ was user supported. By specifying that an instance be forward supported, backward supported, or user supported, the number of instances generated can be reduced.

Suppose we are given a set $S$ of clauses. CLIN generates instances of $S$ by a level saturation method described as follows. For each clause $C$ in $S$, all hyper-links of $C$ are computed, together with their substitutions $\theta$. Then the set of all such instances $C\theta$ is added to $S$, for all clauses $C$ in $S$ and all substitutions $\theta$ of their hyper-links. This process results in a new set $S'$ of clauses, and is called a round of hyper-linking. After a round of hyper-linking, the ground set $\mathrm{Gr}(S')$ is tested for propositional unsatisfiability. If $\mathrm{Gr}(S')$ is unsatisfiable, then $S$ is unsatisfiable and we are done. Otherwise another round of hyper-linking is done, and so on.

At the beginning of each round of hyper-linking, CLIN collects literals from all the retained clauses and divides them into different sets: positive, negative, and supported literal sets. Then one clause is considered after another. If all literals of the clause under consideration are complementarily unified with other literals from appropriate literal sets simultaneously, then we have an instance of the clause, and the instance is kept if it is different from other retained clauses. Exhaustive unification has to be done in order to generate every possible instance of a clause.
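The hyper-link operation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CLIN implementation (which is written in Prolog): terms are encoded as nested tuples, variables are strings starting with an uppercase letter, a literal is a pair `(sign, atom)`, and the names `unify`, `rename`, `apply_subst`, and `hyper_link_instances` are hypothetical helpers introduced here. Support strategies and duplicate checking are left out.

```python
def is_var(t):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    # Follow variable bindings in substitution s.
    while is_var(t) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    t = walk(t, s)
    return t == v or (isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:]))

def unify(a, b, s):
    """Return a most general extension of s that unifies a and b, or None."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return None if occurs(a, b, s) else {**s, a: b}
    if is_var(b):
        return unify(b, a, s)
    if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0] and len(a) == len(b):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def apply_subst(t, s):
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply_subst(x, s) for x in t[1:])
    return t

def rename(t, tag):
    # Standardize electron variables apart from the nucleus and from each other.
    if is_var(t):
        return f"{t}_{tag}"
    if isinstance(t, tuple):
        return (t[0],) + tuple(rename(x, tag) for x in t[1:])
    return t

def hyper_link_instances(nucleus, electrons):
    """Instances of the nucleus for which every literal finds a complementary electron."""
    instances = []

    def extend(i, s):
        if i == len(nucleus):
            instances.append([(sg, apply_subst(at, s)) for sg, at in nucleus])
            return
        sign, atom = nucleus[i]
        for k, (esign, eatom) in enumerate(electrons):
            if esign == sign:          # an electron must have the opposite sign
                continue
            s2 = unify(atom, rename(eatom, f"{i}_{k}"), s)
            if s2 is not None:         # bindings stay consistent across literals
                extend(i + 1, s2)

    extend(0, {})
    return instances

# Nucleus {not p(X), q(X)}; candidate electrons p(a), p(b), not q(a).
nucleus = [(False, ("p", "X")), (True, ("q", "X"))]
electrons = [(True, ("p", "a")), (True, ("p", "b")), (False, ("q", "a"))]
instances = hyper_link_instances(nucleus, electrons)
print(instances)
```

In this example the electron p(b) links with the first literal but cannot be extended to the second one, so only the instance {¬p(a), q(a)} is produced. Repeating such an exhaustive search over all literals in every round is the cost the rest of the paper aims to reduce.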
2 Motivation

The hyper-linking strategy generates an instance of a nucleus $C$ only if all of its literals find their electrons. Suppose $C = \{L_1, L_2, \ldots, L_m\}$. Then we require that all $L_i$, $1 \le i \le m$, have electrons $M_j$. Based on this, we have the following two definitions.

Definition. A literal unification is any pair $(L_i, M_i)$ such that $L_i$ and $M_i$ are complementarily unifiable.

Definition. A partial unification is any set of literal unifications of the first $n$ literals $L_1, \ldots, L_n$, $2 \le n \le m$, requiring that the variable bindings among these $n$ literal unifications be consistent. We say that a partial unification of the first $n$ literals has length $n$.

Thus, if a clause $C$ has $m$ literals, we have to do the following two things:

1. Find a literal unification for each literal of $C$.
2. Check if a partial unification of length 2 exists, a partial unification of length 3 exists, and so on, until a partial unification of length $m$ exists.

Suppose we are given a set $S$ of clauses. CLIN first collects distinct literals and divides them into several literal sets. These literals are called fact-literals, and they are used for generating new instances in the first hyper-linking round. If new literals have been generated, they are added to the appropriate literal sets and become fact-literals. Then all fact-literals are used for generating new instances in the second hyper-linking round, and so on. This process introduces a lot of unnecessary recomputation of literal unifications and partial unifications in each round after the first round of hyper-linking. The following example illustrates the problem. As the number of hyper-linking rounds increases, the redundant work performed also increases. We intend to eliminate this redundant work by applying the network structures described later.

Many theorem provers apply sophisticated techniques to reduce similar redundancy in their implementations. For example, Greenbaum uses a discrimination network to address this problem in a resolution implementation [5]. OTTER also applies a network technique to reduce redundancy [8].

3 The Clause Network

Unnecessary recomputation of literal unifications and partial unifications with old fact-literals can be eliminated by using a network structure similar to the rule network proposed by Forgy [2], which is used in many rule-based expert system interpreters, including OPS5 and CLIPS, for rules of the form

$\mathit{pattern}_1, \ldots, \mathit{pattern}_n \rightarrow \mathit{action}_1, \ldots, \mathit{action}_m$.

$\mathit{pattern}_1, \ldots, \mathit{pattern}_n$ are called the left hand side (LHS) of the rule, and $\mathit{action}_1, \ldots, \mathit{action}_m$ are called the right hand side (RHS) of the rule. A rule is fired when all the patterns in the LHS are matched with facts in the fact-list. We can benefit from similar data structures, since each clause $C = \{L_1, \ldots, L_k\}$ in a clause set can be rewritten as a rule

$\neg L_1, \ldots, \neg L_k \rightarrow$ assert an instance of $C$,

and fact-literals can be viewed as the facts in the fact-list or in the working memory. We call the resulting network the clause network. Instead of selecting one activated rule to fire in each cycle, we fire all activated rules in each round of hyper-linking.

Clauses are compiled into the clause network, which consists of the literal network and the join network. Fact-literals that unify with a literal in a rule are stored in the alpha memory node of the literal in the literal network. Consistent facts unified with a sequence of literals in a rule are stored in beta memory nodes in the join network. By maintaining the clause network, redundant computation of literal unifications and partial unifications with old fact-literals is avoided, making the theorem prover more efficient. For example, suppose we have a set $S$ of the following clauses:

1. $C_1 = \{S(f(X), X), \neg g(X, Y), S(Y, X)\}$
2. $C_2 = \{S(f(X), X), \neg g(Y, X)\}$

The literal network for $S$ is shown in the upper part of Figure 1, and the join network is shown in the lower part of Figure 1. Note that the literal node for $\neg g(X, Y)$ is shared by $C_1$ and $C_2$.

3.1 Operation of the Clause Network

All the information contained in the clause network is updated only if new fact-literals or new clauses are generated during hyper-linking. When a new fact-literal is generated, we throw it into the clause network from the root node. This fact-literal is tested for unification with literals and distributed to the appropriate alpha memory and beta memory nodes.

When a new clause is generated, we compile this clause into the clause network. If new alpha or beta memory nodes are created for this new clause, we perform an operation called facts rematching to add the appropriate fact-literals or partial unifications to these memory nodes. Note that a fact-literal is never tested through the same path in the clause network more than once. If all the literals in a clause unify with fact-literals, the rule is activated. Instead of selecting one activated rule to fire in each cycle, we fire all activated rules in each round of hyper-linking, and all instances of nuclei are added to the clause set.

4 Implementation

Suppose we have a set of clauses $S = \{C_1, C_2, \ldots, C_n\}$. Four sets, $L_p$, $L_n$, $L_b$, and $L_f$, are created for collecting positive literals, negative literals, backward supported literals, and forward supported literals respectively. These are fact-literals. $S$ is compiled into the clause network $G$ consisting of the literal network $P$ and the join network $J$. $P$ is divided into two subnetworks, $P_p$ and $P_n$, for positive literals and negative literals respectively. A literal node $N_p$ in $P$ contains the information about the literal to be unified, and a join node $N_j$ in $J$ contains the constraints to be satisfied about the variable bindings across literals in a clause.

At the beginning of the first hyper-linking round, if the backward support strategy is used, the elements in $L_b$ are fed into the clause network from the top node of $P_p$, followed by the elements in $L_p$ being fed into the clause network from the top node of $P_n$, to find literal unifications and partial unifications. If the forward support strategy is used, the elements in $L_f$ are fed into the clause network from the top node of $P_n$, followed by the elements in $L_n$ being fed into the clause network from the top node of $P_p$, to find literal unifications and partial unifications. The user support strategy is handled similarly. If no support strategy is specified, all of the elements in $L_p$ are fed into the clause network from the top node of $P_n$, and all of the elements in $L_n$ are fed into the clause network from the top node of $P_p$. In the subsequent hyper-linking rounds, only new fact-literals are fed into the clause network.

Each literal in the literal network unifies with the negation of incoming fact-literals. When a unification succeeds, a pair $(P_u, V_u)$, where $P_u$ is the resulting literal after unification and $V_u$ is a list of variable bindings in $P_u$, is stored in the alpha memory node associated with the literal and is passed to the join network. The first join node of each clause $C$ compares every
pair, $(P_L, V_{P_L})$ and $(P_R, V_{P_R})$, of its first two literals for consistent variable bindings. $(P_L, V_{P_L})$ comes from the alpha memory node of the first literal $L_1$ of $C$, and $(P_R, V_{P_R})$ comes from the alpha memory node of the second literal $L_2$ of $C$. If the variable bindings can be made consistent, $((P_L', P_R'), V')$ is a partial unification of length 2 and is stored in the beta memory node of the first join node. Note that $P_L'$ and $P_R'$ are new versions of $P_L$ and $P_R$ respectively, due to the necessary unifications between $V_{P_L}$ and $V_{P_R}$ caused by common variables shared by $P_L$ and $P_R$, and $V'$ is a list of bindings of those variables appearing in $P_L$ and $P_R$.

The second join node of $C$ then compares the content of the beta memory of the first join node with the content of the alpha memory of the third literal of $C$ for consistent variable bindings. If any comparison succeeds, the partial unification of length 3 resulting from this successful comparison, together with a list of bindings of those variables appearing in the first three literals of $C$, is stored in the beta memory of the second join node. The process repeats until the last join node of $C$.

Note that the computation of literal unifications and partial unifications described above is performed at the beginning of each hyper-linking round. The computation is only done under the following two conditions:

1. $L$ is a new fact-literal, that is, $L$ did not appear in the previous hyper-linking round. In this case, $L$ is fed into the clause network, flowing through nodes and reaching the appropriate alpha or beta memory nodes.

2. A new clause $C$ was generated in the previous hyper-linking round. In this case, $C$ is compiled and added to the clause network. There are two cases:

(a) If no new literal nodes, and hence no new alpha nodes, have been created for $C$, then we do not need to perform any literal unification. Partial unifications need to be done for the new join nodes, however.

(b) If new literal nodes have been created for $C$, then we need to do literal unifications for these new literal nodes by feeding all existing fact-literals from the top node of the clause network. Partial unifications need to be done for the new join nodes also.

Figure 1: The clause network for S
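The incremental behaviour described above can be illustrated with a small Python sketch. It is a simplified model and not the authors' Prolog implementation: it assumes that fact-literals are ground, so literal unification reduces to one-way matching and a partial unification is just a merged dictionary of variable bindings. The class `ClauseNetwork`, its methods, and the `attempts` counter are hypothetical names introduced for illustration. Here `beta[c][i]` holds the partial unifications of the first `i + 1` literals of clause `c`, so the first entry simply mirrors the alpha memory of the first literal.

```python
def is_var(t):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()

def match(pattern, ground, b):
    """One-way match of a pattern against a ground term; extended bindings or None."""
    if is_var(pattern):
        if pattern in b:
            return b if b[pattern] == ground else None
        return {**b, pattern: ground}
    if isinstance(pattern, tuple):
        if not (isinstance(ground, tuple) and len(ground) == len(pattern)
                and ground[0] == pattern[0]):
            return None
        for p, g in zip(pattern[1:], ground[1:]):
            b = match(p, g, b)
            if b is None:
                return None
        return b
    return b if pattern == ground else None

def merge(b1, b2):
    """Join test: merge two binding sets if they agree on shared variables."""
    for v, t in b2.items():
        if b1.get(v, t) != t:
            return None
    return {**b1, **b2}

class ClauseNetwork:
    def __init__(self):
        self.facts = []     # every fact-literal fed so far
        self.alpha = {}     # literal node -> alpha memory (shared between clauses)
        self.clauses = []   # compiled clauses, each a list of literal nodes
        self.beta = []      # beta[c][i]: partial unifications of length i + 1
        self.attempts = 0   # literal tests performed, kept only for illustration

    def _test(self, node, fact):
        self.attempts += 1
        (sign, atom), (fsign, fatom) = node, fact
        return match(atom, fatom, {}) if sign != fsign else None

    def add_clause(self, literals):
        """Compile a clause; new literal nodes are filled by facts rematching."""
        c = len(self.clauses)
        self.clauses.append(list(literals))
        self.beta.append([])
        partial = [{}]
        for node in literals:
            if node not in self.alpha:
                self.alpha[node] = [b for f in self.facts
                                    if (b := self._test(node, f)) is not None]
            partial = [m for p in partial for a in self.alpha[node]
                       if (m := merge(p, a)) is not None]
            self.beta[c].append(partial)
        return self.beta[c][-1]

    def add_fact(self, fact):
        """Feed one new fact-literal; return (clause index, bindings) of new full matches."""
        self.facts.append(fact)
        complete = []
        for node, memory in self.alpha.items():
            b = self._test(node, fact)
            if b is None:
                continue
            memory.append(b)
            for c, lits in enumerate(self.clauses):
                for i, lit in enumerate(lits):
                    if lit == node:
                        complete += [(c, m) for m in self._propagate(c, i, b)]
        return complete

    def _propagate(self, c, i, b):
        lits = self.clauses[c]
        left = [{}] if i == 0 else self.beta[c][i - 1]
        new = [m for p in left if (m := merge(p, b)) is not None]
        for j in range(i, len(lits)):
            if j > i:   # join only the new partial unifications with later alpha memories
                new = [m for p in new for a in self.alpha[lits[j]]
                       if (m := merge(p, a)) is not None]
            self.beta[c][j].extend(new)
        return new

net = ClauseNetwork()
net.add_clause([(False, ("p", "X")), (False, ("q", "X", "Y"))])
r1 = net.add_fact((True, ("p", "a")))
r2 = net.add_fact((True, ("q", "a", "b")))
r3 = net.add_fact((True, ("q", "c", "d")))
print(r1, r2, r3, net.attempts)
```

Each of the three facts is tested once against each of the two literal nodes, and the earlier match for p(a) is reused from memory when q(a, b) arrives, instead of being recomputed. Handling non-ground fact-literals would require full unification in place of `match` and `merge`.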
5 Test and Discussion
We have tested the clause network idea on some problems, and compared the performance of an implementation using the idea with that of CLIN. Both versions are written in Prolog. Some of the problems are from Stickel's paper [11] and some are from [9, 1, 10, 7, 3]. All the times were obtained by running C-Prolog (version 1.5) on a DEC5000/125 workstation with 16 MB of memory. We used the backward support strategy as the default. In the tables that follow, 'NET' refers to the version using the clause network idea and 'CLIN' refers to the previous version, CLIN. First, we define some terms used in the tables as follows.
• $C_i$: the number of input clauses in the problem.
• $C_a$: the number of clauses retained in the database so far.
• N (subscript illegible in the source): the current hyper-linking round number.
• N (subscript illegible in the source): total number of hyper-linking rounds needed to solve the problem.
• CPU: total elapsed CPU time in seconds.
• $P_a$: total number of tries for partial unification.
• $P_s$: total number of successful partial unifications.
• $P_f$: total number of failed partial unifications.
• $P_r$: defined as ($P_a$ in CLIN) / ($P_a$ in NET).
• $I_n$: total number of new instances generated in a round of hyper-linking.
• $I_d$: total number of duplicate instances generated in a round of hyper-linking.
• $I_a$: total number of instances generated in a round of hyper-linking, which is the sum of $I_n$ and $I_d$.
• $I_r$: defined as ($I_a$ in CLIN) / ($I_a$ in NET).
• $S_r$: defined as (CPU in CLIN) / (CPU in NET).

Table 1: Statistics about CPU time (table body not recoverable from the source; the surviving problem labels include rob1, ls103, ls105, ls111, and p40)

Table 2: Statistics about partial unifications (one row per problem, in the column order of the source; the problem names are not recoverable)

Problem   P_a NET   P_a CLIN   P_s NET   P_s CLIN   P_f NET   P_f CLIN     P_r
1              11        225        11         70         0        155   20.45
2              48        225        21        147        27         78    4.68
3             732      27222       345       3465       387      23757   37.18
4             349       9381       182       1809       167       7572   26.87
5             415      17854       212       2082       203      15772   43.02
6             576       9346       175       4203       401       5143   16.25
7             136       2974        67        782        69       2192   21.87
Table 1 shows the time comparison on some problems. For all the problems shown in this table, NET runs faster than CLIN. We compare the computation of partial unifications and instances in Table 2 and Table 3 respectively. It is clear that NET saves a lot of redundant partial unifications, indicated by the $P_r$ column in Table 2, and duplicate instances, indicated by the $I_r$ column in Table 3. It seems that NET should have run faster than shown in Table 1, since $P_r$ and $I_r$ are large as shown in Table 2 and Table 3 respectively. Part of the reason is that it takes time to construct the clause network, and that Prolog is not a good language for constructing network structures in which information is updated in part and pointers have to be maintained. We are considering using LISP to recode our theorem prover. The network version takes more memory than CLIN, which has resulted in memory paging for some problems. We are working on this problem.

Table 3: Statistics about instance generation (only one row survives in the source: a leading entry 2 whose column label is lost, $I_n$ = 19, $I_a$ = 67 for NET and 653 for CLIN, $I_d$ = 48 for NET and 634 for CLIN, $I_r$ = 9.75)
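As a quick arithmetic check on how the ratio column of Table 2 relates to the raw counts, the following Python sketch recomputes the ratio for the first two problems. It assumes that $P_r$ is the number of partial unification tries in CLIN divided by the number in NET, which is what the printed values suggest; the helper name `ratio` is introduced here for illustration only.

```python
def ratio(clin, net):
    """CLIN-to-NET ratio of a count (assumed meaning of the ratio columns)."""
    return clin / net

# (P_a in NET, P_a in CLIN) for the first two problems of Table 2
pa = [(11, 225), (48, 225)]
pr = [ratio(clin, net) for net, clin in pa]
print([f"{r:.4f}" for r in pr])
```

The computed ratios, 225/11 and 225/48, agree with the printed entries 20.45 and 4.68 once the extra digits are dropped.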
6 Conclusion

Our work was motivated by the observation that redundant calculation of literal unifications and partial unifications, and redundant duplicate instance checking, are significant causes of the inefficiency of the hyper-linking theorem prover. We have shown how to improve the performance of the hyper-linking theorem prover by applying the clause network technique. Information is recorded in the alpha and beta memory nodes in the clause network, so that redundant calculation of literal unifications and partial unifications, and redundant duplicate instance checking, are avoided. One problem associated with the clause network technique is that the network takes memory to record literal unifications and partial unifications in alpha and beta memory nodes. This problem also occurs in the rule network technique used in expert systems [2, 4]. Another problem is that there is an overhead for creating the clause network. For simple problems which can be solved in a small number of hyper-linking rounds, the clause network technique may hurt performance.

References

[1] W. W. Bledsoe. Non-resolution theorem proving. Artificial Intelligence, 9:1-35, 1977.
[2] C. L. Forgy. Rete: A fast algorithm for the many pattern/many object pattern match problem. Artificial Intelligence, 19:17-37, 1982.
[3] E. C. Freuder. Synthesizing constraint expressions. Communications of the Association for Computing Machinery, 21:958-966, 1978.
[4] J. Giarratano and G. Riley. Expert Systems: Principles and Programming. PWS-KENT, 1989.
[5] S. Greenbaum. Input Transformations and Resolution Implementation Techniques for Theorem Proving in First-Order Logic. PhD thesis, University of Illinois at Urbana-Champaign, 1986.
[6] S.-J. Lee and D. Plaisted. Eliminating duplication with the hyper-linking strategy. Journal of Automated Reasoning, pages 25-42, 1992.
[7] E. Lusk and R. Overbeek. Non-horn problems. Journal of Automated Reasoning, 1:103-114, 1985.
[8] W. W. McCune. Otter 1.0 Users' Guide. Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Illinois, January 1989.
[9] F. J. Pelletier. Seventy-five problems for testing automatic theorem provers. Journal of Automated Reasoning, 2:191-216, 1986.
[10] D. Plaisted. A simplified problem reduction format. Artificial Intelligence, 18:227-261, 1982.
[11] M. E. Stickel. A prolog technology theorem prover: Implementation by an extended prolog compiler. In Proceedings of the 8th International Conference on Automated Deduction, pages 573-587, 1986.