Inductively Speeding Up Logic Programs
M. Numao, T. Maruoka, and M. Shimura Department of Computer Science, Tokyo Institute of Technology, Tokyo, JAPAN
[email protected]
Abstract This paper presents a speed-up learning method for logic programs, which accelerates a program by composing macro clauses based on partial structures of proof trees. Many systems have been proposed for composing useful macros; e.g., some of them select macros that connect two peaks in a heuristic function, while another employs heuristics that select useful macros. Although they work well in some domains, such methods depend on domain-dependent heuristics that have to be supplied by their users. We propose a heuristic-independent mechanism based on detecting backtracking. The method uses a dead-end path as a negative explanation tree, compares it with a positive one, and finds the first different node in order to remove its corresponding rule by composing a macro. Repeated substructures in such a macro are then combined by applying the generalizing-number technique and by sharing common substructures. Experimental results in the STRIPS domain show that, by selecting an appropriate set of macros, 1) backtracking in solving training examples is suppressed, 2) problem-solving efficiency does not deteriorate even after learning a number of examples, and 3) after learning 30 training examples, no backtracking occurs in solving 100 test examples different from the training examples. In conclusion, the proposed method speeds up problem solving by a factor of 10 to 100.
1
Introduction
Explanation-Based Learning (EBL) has been used for speed-up learning in problem solving. Since there are many combinations of macros in each explanation, EBL systems need a mechanism for selectively learning macros. Some systems select macros that connect two peaks in a heuristic function (Iba 1985; Minton 1985). Another system employs heuristics that select useful macros (Yamada and Tsuji 1989). Although they work well in some domains, such methods depend on domain-dependent heuristics that have to be supplied by their users.
Fig. 1.1. Classes of Symbol-Level Learning (smaller size and speed-up; speed-up via macros, unification, backtracking, guards, meta-control, and others).
This paper presents a heuristic-independent mechanism based on detecting backtracking. The method uses a dead-end path as a negative explanation tree, compares it with a positive one, and finds the first different node in order to remove its corresponding rule by composing a macro. Experimental results in the STRIPS domain show that, after learning 30 training examples, no backtracking occurs in solving 100 test examples, and that problem solving is speeded up by a factor of 10 to 100.
2
Our Approach
Symbol-Level Learning (SLL) improves a rule set based on training examples. Figure 1.1 shows some classes of SLL. A rule set is better than the others if it works faster and occupies less memory space, i.e., i) its reasoning speed is higher, and ii) its size is smaller. Reasoning speed depends on the cost of unification and backtracking. Optimization by recent Prolog compilers removes most of the unification processes, but they do not optimize backtracking, which depends on the query and occurs dynamically. AI programs are usually declarative and nondeterministic, and thus cause backtracking; they have had to be rewritten by hand into efficient ones. Composing macros is a classical method of speed-up learning, which decreases the cost of both backtracking and unification. However, composing too large or too many macros worsens reasoning speed and causes the utility problem (Minton 1988). Another method controls reasoning by employing a meta-interpreter. It has been analyzed very well (Minton 1988), though such an interpreter usually slows down reasoning. A rather new technique is to attach guards to each clause and to select an appropriate clause (Cohen 1990; Zelle and Mooney 1992). This works well for suppressing backtracking, if the guards are kept simple and checked fast. On the other hand, the authors proposed a learning method based on
partial structures of explanations, which selectively learns macro operators (Numao and Shimura 1990, 1989). The method learns at the knowledge level in the fields of translation and logical-circuit design, and outputs term-rewriting rules for a production system, Nat. In this paper, we apply this method to the learning of Horn clauses, and investigate a method to suppress backtracking by regarding a failure as a negative instance. To reduce the size of the learned rules, which grow during the learning process, we will develop a method for folding a repetition in each macro and extracting a common subsequence among macros. The subsequence is shared to save memory space and the cost of unification.
3
Learning Algorithm
3.1 Generalization and Specialization

In Prolog, a computation progresses via goal reduction.¹ In each stage of the computation, there exists a resolvent, i.e., a conjunction of goals to be proved. To suppress backtracking, we construct a learning mechanism for avoiding failing paths.
Definition 1. (Instance) An instance is a pair of a query and a resolvent: (query, resolvent). A clause set derives an instance iff the query in the instance derives the resolvent. An instance is positive if its resolvent is reduced to be empty, and negative otherwise.

Example 2. Consider the following logic program:

p :- q(X), q(Y), r(X,Y).
q(b).
q(c).
r(a,b).
r(b,c).
A computation of a query `?- p.' progresses via the following goal reduction:

p
  => q(X),q(Y),r(X,Y)
  => q(Y),r(b,Y)
  => r(b,b)    (failure)
(backtracking)
  => r(b,c)
  => □         (success)
A pair of p and each resolvent above is an instance. The resolvents in the following instances are reduced to be empty:

(p, (q(X),q(Y),r(X,Y)))
(p, (q(Y),r(b,Y)))
(p, r(b,c))
(p, □)

¹See 1.8 (p.12) and 4.2 (p.72) in (Sterling and Shapiro 1986).
Thus, they are positive instances. The following instance is negative since its resolvent fails:

(p, r(b,b))    □
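The goal reduction above can be reproduced concretely. The following Python sketch (our illustration, not part of the original system; the function name solve_p is hypothetical) reduces the query ?- p. against the program of Example 2 in Prolog's left-to-right, depth-first order, and records which paths fail:

```python
# The program of Example 2:  p :- q(X), q(Y), r(X,Y).
# plus the facts below; solve_p is a hypothetical helper name.
q_facts = ['b', 'c']
r_facts = [('a', 'b'), ('b', 'c')]

def solve_p():
    """Reduce the query ?- p. in left-to-right, depth-first order,
    recording each r-goal tried and whether the path fails."""
    trace = []
    for x in q_facts:            # reduce q(X)
        for y in q_facts:        # reduce q(Y)
            goal = f'r({x},{y})'
            if (x, y) in r_facts:
                trace.append((goal, 'success'))
                return trace     # stop at the first solution
            trace.append((goal, 'failure'))  # dead end: backtrack
    return trace
```

The failing pair ('r(b,b)', 'failure') corresponds to the negative instance (p, r(b,b)); the trace ends at the first success, just as Prolog stops at the first solution.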
Definition 3. (Generalization and Specialization of a Clause Set) Let I₁ and I₂ be instance sets derived from clause sets S₁ and S₂, respectively. S₁ is a generalization of S₂, and S₂ is a specialization of S₁, denoted by S₁ ⊒ S₂, iff I₁ ⊇ I₂.

The following two theorems show that Explanation-Based Generalization (EBG) (Mitchell et al. 1986; Kedar-Cabelli and McCarty 1987) specializes a clause set:

Theorem 4. Let c be a macro generated by EBG from an explanation including clauses c₁, ..., cₙ. Then, {c₁, ..., cₙ} ⊒ {c}.

Proof. Let i be any instance derived from {c}. Since c is generated from c₁, ..., cₙ, i is also derived from {c₁, ..., cₙ}. Therefore, {c₁, ..., cₙ} ⊒ {c}. In general, {c₁, ..., cₙ} ≠ {c}, since some instances derived from c₁, ..., cₙ are not derived from c. □

Theorem 5. If S₁ ⊒ S₂ and S₃ ⊒ S₄, then S₁ ∪ S₃ ⊒ S₂ ∪ S₄.

Proof. Suppose S₂ ∪ S₄ derives i. Consider a clause in S₂ applied at a step in the derivation. Since S₁ ⊒ S₂, S₁ derives the same step. Similarly, when a clause in S₄ is applied, S₃ derives the same step. Since S₁ ∪ S₃ derives each step of the derivation, S₁ ∪ S₃ derives i, by which we conclude that (S₁ ∪ S₃) ⊒ (S₂ ∪ S₄). □

These theorems show that a clause set is specialized when c₁, ..., cₙ are replaced by c. We speed up a program by making a specialization that avoids backtracking.
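Theorem 4 can be checked concretely on the program of Example 2. The sketch below (our illustration under hypothetical names r_goals_original and r_goals_macro) compares the r-goals reachable from ?- p. under the original clauses with those reachable once the clauses are replaced by the EBG macro p :- q(Y), r(b,Y):

```python
from itertools import product

q_facts = ['b', 'c']  # the facts q(b). q(c). of Example 2

def r_goals_original():
    """r-goals reachable under p :- q(X), q(Y), r(X,Y):
    q(X) and q(Y) may bind X and Y to any q-fact."""
    return {('r', x, y) for x, y in product(q_facts, repeat=2)}

def r_goals_macro():
    """r-goals reachable under the EBG macro p :- q(Y), r(b,Y),
    where X is already bound to b by the explanation."""
    return {('r', 'b', y) for y in q_facts}
```

The macro set {r(b,b), r(b,c)} is a proper subset of the four goals reachable originally, witnessing {c₁, ..., cₙ} ⊒ {c} with strict inequality (e.g., the goal r(c,c) is no longer derivable).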
3.2 An Algorithm for Speed-Up Learning
Figure 1.2 shows an algorithm to suppress backtracking based on positive and negative instances. A proof tree is an ordered tree, each of whose nodes specifies a clause. The children of a node specify the clauses applied to its subgoals, in their order. The function clause(x) gives the clause applied in a node x. The algorithm traverses a proof tree, finds an inappropriate application of a clause, and replaces the clause by a macro to suppress backtracking. The algorithm needs only part of the instances given in Definition 1. As a positive instance Pᵢ, it is given a pair (instantiatedQuery, {}), where instantiatedQuery is a query with output variables instantiated correctly. In the fifth line, the algorithm checks whether a query with output variables
begin
  clauseSet := domain theory;
  EPᵢ := the proof tree of each positive instance Pᵢ (i = 1, ..., n);
  for i := 1 to n do
    while clauseSet derives any negative instance do begin
      N := the derived negative instance;
      EN := the proof tree of N;
      Compare each node in EN with its corresponding node in EPᵢ
        in depth-first order;
      h := the first different node;
      clauseSet := {};
      for i := 1 to n do
        for each p ∈ EPᵢ do
          if clause(p) = clause(h) then begin
            q := the parent or a child of p;    ... (*)
            Make a macro c based on clause(p) and clause(q);
            clauseSet := clauseSet ∪ {c}
          end
          else clauseSet := clauseSet ∪ {clause(p)}
    end
end.

Fig. 1.2. Learning Algorithm.
uninstantiated causes backtracking or not. If it does, the derived negative instance N in the sixth line is a pair (query, resolvents), where resolvents is a set of resolvents at the point of failure. EPᵢ and EN are the proof trees of Pᵢ and N, respectively. The algorithm finds the first incorrectly applied clause in EN by comparing both proof trees, and replaces it by a macro. Let us suppress backtracking in Example 2 by using this algorithm. The positive and negative instances P₁ and N are (p, □) and (p, r(b,b)), respectively. Their proof trees EP₁ and EN are shown in Figure 1.3. In each node, predicate_place specifies a clause, where predicate and place are
Fig. 1.3. Proof Trees of the Positive and Negative Instances (EP₁: p_1 with children q_1, q_2, r_2; EN: p_1 with children q_1, q_1).
a predicate and a place in its definition, respectively; e.g., q_2 is the second clause in the definition of q, i.e., the fact `q(c).'. The first different node h between EP₁ and EN is the shaded node q_1. While traversing EP₁ in the for loop, clause(p) = clause(h) when p = q_1. Thus, the algorithm makes a macro c based on clause(p_1) and clause(q_1), encircled with a dotted line in the figure, and generates the following deterministic clauseSet:

p :- q(Y), r(b,Y).
q(c).
r(b,c).
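The comparison step of Figure 1.2 can be sketched directly. In the following Python illustration (our own encoding, not the authors' code; trees are hypothetical (label, children) pairs with labels in the predicate_place form), a depth-first traversal of both trees returns the first node at which the negative tree diverges from the positive one:

```python
def first_different_node(pos, neg):
    """Depth-first compare of two proof trees, each node being
    (clause_label, [children]); returns the label of the first
    node where the negative tree differs from the positive one."""
    def nodes(tree):
        label, children = tree
        yield label
        for child in children:
            yield from nodes(child)
    for p_label, n_label in zip(nodes(pos), nodes(neg)):
        if p_label != n_label:
            return n_label
    return None  # the trees agree on their common prefix

# Proof trees of Figure 1.3 (labels are predicate_place):
EP1 = ('p_1', [('q_1', []), ('q_2', []), ('r_2', [])])
EN  = ('p_1', [('q_1', []), ('q_1', [])])  # fails before reaching r

h = first_different_node(EP1, EN)  # -> 'q_1', as in the text
```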
We need only one macro to remove the backtracking above. In general, the while loop in the fifth line invokes the iterative composition of larger macros while the clauseSet causes backtracking. A clause usually checks its applicability conditions in the leftmost part of its body, as follows:²

head :- conditions, ..., goals, ... .

We assume such conditions are operational and ignore backtracking there. EBG generates a macro c based on p and its parent or child q at (*) in the algorithm. Nodes p and q should be a node and its first child, e.g., p_1 and q_1 in Figure 1.3. A macro based on the node and its other children does not suppress backtracking, since the generated clause has conditions and goals interleaved.
3.3 Generalizing Number and Sharing a Common Sequence

The presented algorithm tends to generate a large macro, which occupies much memory space and slows down its unification. Since such a large macro usually contains a repeated sequence, we apply the generalizing-number technique (Shavlik and DeJong 1987a, 1987b) to the extracted macros, which integrates differing numbers of repetitions. Figure 1.4 shows an example of generalizing number. Such a generalized repetition is implemented as a group of clauses internally calling one another. After generalizing number, if two sequences have a common structure, such as the left trees in Figure 1.5, the sequence is shared as shown in the right tree. This transformation saves memory space and the cost of unification. Sharing a common sequence may cause over-generalization; i.e., the left trees do not include the combination of nil and subtree C, while the right tree does. In the experiment to be presented in the next section, this causes no problem since the selection between subtree B and subtree C depends on neither gothrudr nor nil, though in general we need more investigation.

²The planning program for the following experiment consists of such clauses.
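The folding of a repetition can be illustrated as follows. This Python sketch is a toy analogue of generalizing number (the helper fold_repetition is hypothetical, and it folds sequences of operator names rather than clause bodies): it finds the unit whose repetition covers the longest prefix of a sequence.

```python
def fold_repetition(ops):
    """Find the unit whose consecutive repetition covers the longest
    prefix of ops; return (unit, count).  With count = 1 and the whole
    sequence as unit, no fold was found."""
    best_unit, best_count = ops, 1  # trivial: whole sequence, once
    best_cover = 0
    for size in range(1, len(ops) // 2 + 1):
        unit = ops[:size]
        count = 0
        while ops[count * size:(count + 1) * size] == unit:
            count += 1          # unit repeats once more
        if count >= 2 and count * size > best_cover:
            best_unit, best_count, best_cover = unit, count, count * size
    return best_unit, best_count

unit, count = fold_repetition(['gotod', 'gothrudr'] * 3 + ['gotob'])
# unit == ['gotod', 'gothrudr'], count == 3
```

The folded pair plays the role of the generalized repetition in Figure 1.4; in the actual system the repetition is a group of clauses calling one another, not a counted list.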
Fig. 1.4. Generalizing Number.
Fig. 1.5. Sharing a Common Sequence.
Fig. 1.6. A Planning Example and Its Proof Tree (initial and goal states with the robot, the box, room1, room2, and door1, and the proof tree with numbered operator nodes).

4
Evaluation of the Method
4.1 STRIPS in Prolog
We evaluate the proposed method based on the STRIPS planning system (Fikes 1971, 1972). A system written in Prolog is needed to apply the proposed learning mechanism. The authors rewrote a simple planner in (Sterling and Shapiro 1986).³ It searches for a plan in a depth-first manner, prevented from entering an infinite loop by checking previously occurring models and by limiting the search depth. Consider forming a plan for achieving the goal state in Figure 1.6 from the initial state. The proof tree in the figure shows its planning process. The solid lines show the proof tree of the positive instance (EP₁). Each dotted line partially describes a proof tree of a negative instance (EN). The following is a query for the planning procedure:

?- transform(
     [type(room1,room), ..., pushable(box1),
      connects(door1,room1,room2)],              % fact
     [inroom(box1,room1), inroom(robot,room1),
      status(door1,closed)],                     % initial state
     [inroom(robot,room2), inroom(box1,room2)],  % goal state
     Plan).
This query invokes the following clause, represented by initialGoal in the proof tree:

transform(Fact,Model,Goal,Plan) :-
    diff(Model,Goal,Diff),
    transform(Fact,Model,Goal,Diff,Ac,[],Model,Actions,N,0),

³Program 14.11 (p.222).
reverse(Actions,Plan).
where diff(+Model,+Goal,-Diff) is an operational predicate that calculates the difference between Model and Goal. If no difference is detected, the following clause terminates the recursion:

transform(Fact,Model,G,[],Na,Vg,Vm,[],Model,_).
It is indicated by nil in the proof tree. Each of the other nodes indicates a planning clause selecting an operator. The system has the following eight operators:

gotob(BX)             Go to object BX.
gotod(DX)             Go to door DX.
pushb(BX,BY)          Push BX to object BY.
pushd(BX,DX)          Push BX to door DX.
gothrudr(DX,RX)       Go through door DX into room RX.
pushthrudr(BX,DX,RX)  Push BX through door DX into room RX.
open(DX)              Open door DX.
close(DX)             Close door DX.

As an example of the planning clauses, the one selecting the pushthrudr operator is as follows:

transform(Fact,Model,Goal,Diff,pushthrudr(BX,DX,RX),
          Vgoal,Vmodel,Plan,New_model,M) :-
    conditions(Fact,Model,Goal,pushthrudr(BX,DX,RX),Diff),
    ...,
    transform(Fact,Model,Sub_goal,Diff1,Action,[Effect|Vgoal],
              Vmodel,Actions,Nmodel,NM),
    ...,
    transform(Fact,Update_model,Goal,Diff2,Next_ac,
              [Effect|Vgoal],[Update_model|Vmodel],Acts,
              New_model,NM),
    append(Acts,[pushthrudr(BX,DX,RX,RY)|Actions],Plan).
where conditions(_,_,_,_,_) is an operational predicate that checks the conditions for applying the operator. The first subgoal transform(Fact, Model, ...) makes a subplan required before the application of the operator. The second subgoal transform(Fact, Update_model, ...) makes a subplan after the application. The differences between the initial state and the goal state in Figure 1.6 are the places of the robot and the box. If the planner selects the latter, the operator is pushthrudr(box1, door1, room1, room2) (node 5 in the figure), whose precondition is:

inroom(box1,room1), inroom(robot,room1),
nextto(robot,box1), nextto(box1,door1),
status(door1,open),
out of which a subplan has to reduce:
Fig. 1.7. A State (a domain of nine rooms, room1–room9, with the robot R).
nextto(robot,box1), nextto(box1,door1), status(door1,open).
If the planner determines to reduce status(door1,open), the operator is open(door1) (node 2). For its application, nextto(robot,door1) is reduced by the directly applicable operator gotod(door1) (node 1). After open(door1) is applied, the subtree whose root is node 4 reduces the other differences for the application of pushthrudr(box1, door1, room1, room2). The result is the inorder sequence of operators in the tree, indicated by the numbers in the figure, as follows:

gotod(door1)
open(door1)
gotob(box1)
pushd(box1,door1)
pushthrudr(box1,door1,room2)

Backtracking occurs when the planner determines to reduce other differences during the above process. If it selects the place of the robot instead of the box, the operator is gothrudr(door1, room2), as shown in a dotted line in the figure. Similarly, the operator becomes gotob(box1) or pushd(box1,door1) if the planner tries to reduce nextto(robot,box1) or nextto(box1,door1), respectively, instead of status(door1,open). To suppress backtracking, the algorithm in Figure 1.2 composes a macro based on nodes 3 and 4, and removes gothrudr, gotob, and pushd. Since the composed macro is applied instead of node 2, a larger macro, shaded in the figure, is composed. The operators pushthrudr, open, gotod, nil, and the last macro make the plan without backtracking.
4.2 Experiments
The authors made experiments on planning in a robot domain shown in Figure 1.7. They generated the initial and the goal states of planning by randomly choosing: the number of boxes (0-9), the locations of boxes,
whether each box contacts another box or not, whether each door is open or not, and the location of the robot. 1000 pairs of such states were stored, out of which training and test examples were sampled for each experiment. The lengths of the plans for the pairs range from 10 to 120 steps. By solving the training examples with learned clauses, the authors first verified that our method suppresses backtracking:

Experiment A: After learning 20 examples, compare the CPU time for the non-learned and the learned clauses to solve the same examples.

If the system extracts superfluous macros, problem-solving efficiency deteriorates after learning a number of examples. This phenomenon is checked by the following experiment:

Experiment B: Make the same experiment for 30 examples by adding 10 examples to Experiment A.

Figure 1.8 shows the results of Experiments A and B. The CPU time is for SICStus Prolog version 0.7 on a SPARCstation 2. No backtracking occurred after learning in Experiment A. The learning accelerated problem solving by a factor of 10 to 100, and reduced the number of searched nodes to between 1/50 and 1/1000 in both experiments. Although the learned sets of clauses in the two experiments are different, their efficiency is almost the same, as shown in Figure 1.8, exhibiting that correct macros are extracted. When the system learns more than 30 examples, it composes no more combinations as macros; i.e., the result is the same even if the system learns more than 30 examples. After learning 30 examples, the system has composed 5 or 6 macros. Each macro is constructed from several clauses, which implement generalizing number and sharing a common sequence. The number of clauses constructing the macros sums to 104, occupying rather much memory space, though their reasoning speed is high. To test the generality of the extracted macros, the system solved a different set of examples:

Experiment C: After learning 30 training examples, solve another 100 test examples.

Figure 1.9 shows the result of Experiment C.
The learned macros solved the 100 examples without any backtracking, and improved problem solving similarly to Experiment B. This result shows that the system learned general and useful macros. The problem in the current system is that it takes much time for learning, e.g., about 10 hours for learning 30 examples. Therefore, the method
Fig. 1.8. The Result of Experiments A and B (cumulative CPU time in seconds over problems solved, for the non-learned, 20-example, and 30-example clause sets).
is appropriate for speeding up frequently used programs, though it should be improved in a future implementation.
5
Related Work
Iba (1985) and Minton (1985) proposed peak-to-peak heuristics that compose a macro connecting two peaks in an evaluation function. However, it is usually hard to construct an appropriate heuristic function for evaluating goals in logic programs. Minton (1985) and Yamamura and Kobayashi (1991) presented methods to extract common substructures in explanations. This is useful for reducing the number of macros, though it may not accelerate problem solving. Our method concentrates on suppressing backtracking, after which it reduces
Fig. 1.9. The Result of Experiment C (cumulative CPU time in seconds over problems solved, for the non-learned and 30-example clause sets).
the number of macros by applying the generalizing-number technique and sharing common structures. Yamada (1989) proposed a heuristic called perfect causality for selecting appropriate macros. Although it works well in many problems, it may slow down problem solving in others. By suppressing backtracking, our method straightforwardly speeds up problem solving. Minton (1988) presented learning from failure. It is similar to our method in the sense that both methods are based on failures. The difference is in their representation of learned knowledge. Minton's method learns control heuristics for the Prodigy system, and thus needs a meta-interpreter in order to be applied to logic programs. Such an interpreter runs more slowly than compiled Prolog programs. In contrast, our method learns macros, which are efficiently executable by Prolog processors.
In the field of combining empirical and explanation-based learning, some researchers pursue methods to extract partial structures of explanations based on positive and negative examples (Numao 1990, 1989), and to search for a macro on a generalization hierarchy of common explanation structures (Kobayashi 1991). In contrast to our research, which focuses on learning at the symbol level, they evaluated macros at the knowledge level. Cohen (1990) and Zelle and Mooney (1992) proposed adding guards to each clause for controlling and speeding up a program. Such guards are synthesized by using the techniques of Inductive Logic Programming. Instead of introducing such additional guards, our method composes macros, which are decomposed into simpler clauses by employing generalizing number and sharing common structures.
6
Conclusion
We have presented a method for extracting macros based on backtracking, and evaluated it by using planning examples in STRIPS. This method is widely applicable to automatic improvement of logic programs and knowledge bases.
Acknowledgment
The authors would like to thank Dr. Seiji Yamada, who sent us the data of his experiment (Yamada 1989). Tatsuya Kawamoto assisted us in using SICStus Prolog.
Bibliography
1. Cohen, W. (1990). Learning approximate control rules of high utility. Proceedings of the Seventh International Conference on Machine Learning, 268–276, Morgan Kaufmann, San Mateo.
2. Fikes, R. E., Hart, P. E., and Nilsson, N. J. (1972). Learning and executing generalized robot plans. Artificial Intelligence, 3, 251–288.
3. Fikes, R. E. and Nilsson, N. J. (1971). STRIPS: a new approach to the application of theorem proving to problem solving. Artificial Intelligence, 2, 189–208.
4. Iba, G. A. (1985). Learning by discovering macros in puzzle solving. Proceedings of the Ninth International Joint Conference on Artificial Intelligence, 640–642, Morgan Kaufmann.
5. Kedar-Cabelli, S. T. and McCarty, L. T. (1987). Explanation-based generalization as resolution theorem proving. Proceedings of the Fourth International Workshop on Machine Learning, 383–389, Morgan Kaufmann.
6. Kobayashi, S., Shirai, Y., and Yamamura, M. (1991). Acquiring valid macro rules under the imperfect domain theory by top-down search
on a generalization hierarchy of common explanation structures (in Japanese). Journal of Japanese Society for Artificial Intelligence, 6(3), 416–425.
7. Minton, S. (1985). Selectively generalizing plans for problem-solving. Proceedings of the Ninth International Joint Conference on Artificial Intelligence, 596–599, Morgan Kaufmann.
8. Minton, S. (1988). Learning Search Control Knowledge. Kluwer Academic Publishers, Boston/Dordrecht/London.
9. Mitchell, T. M., Keller, R. M., and Kedar-Cabelli, S. T. (1986). Explanation-based generalization: a unifying view. Machine Learning, 1, 47–80.
10. Numao, M. and Shimura, M. (1989). A learning method based on partial structures of explanations (in Japanese). The Transactions of the Institute of Electronics, Information and Communication Engineers, J72D-II(2), 263–270.
11. Numao, M. and Shimura, M. (1990). A knowledge-level analysis of explanation-based learning. Proceedings of the Third International Conference on Industrial & Engineering Applications of Artificial Intelligence & Expert Systems, 959–967, ACM Press, New York.
12. Shavlik, J. W. and DeJong, G. F. (1987a). BAGGER: an EBL system that extends and generalizes explanations. Proceedings of the Sixth National Conference on Artificial Intelligence, Morgan Kaufmann.
13. Shavlik, J. W. and DeJong, G. F. (1987b). An explanation-based approach to generalizing number. Proceedings of the Tenth International Joint Conference on Artificial Intelligence, Morgan Kaufmann.
14. Sterling, L. and Shapiro, E. (1986). The Art of Prolog. The MIT Press, Cambridge.
15. Yamada, S. and Tsuji, S. (1989). Selective learning of macro-operators with perfect causality. IJCAI-89, 603–608, Morgan Kaufmann, San Mateo.
16. Yamamura, M. and Kobayashi, S. (1991). An augmented EBL and its application to the utility problem. IJCAI-91, 623–629, Morgan Kaufmann, San Mateo.
17. Zelle, J. M. and Mooney, R. J. (1992). Speeding-up logic programs by combining EBG and FOIL. Proceedings of the ML92 Workshop on Knowledge Compilation & Speedup Learning.