Inductively Speeding Up Logic Programs


M. Numao, T. Maruoka, and M. Shimura
Department of Computer Science, Tokyo Institute of Technology, Tokyo, JAPAN
[email protected]

Abstract. This paper presents a speed-up learning method for logic programs, which accelerates a program by composing macro clauses based on partial structures of proof trees. Many systems have been proposed for composing useful macros; for example, some select macros that connect two peaks in a heuristic function, while another employs heuristics that select useful macros. Although they work well in some domains, such methods depend on domain-dependent heuristics that have to be supplied by their users. We propose a heuristic-independent mechanism based on detecting backtracking. The method uses a dead-end path as a negative explanation tree, compares it with a positive one, and finds the first differing node in order to remove the corresponding rule by composing a macro. Repeated substructures in such a macro are then combined by applying the generalize-number technique and by sharing common substructures. Experimental results in the STRIPS domain show that, by selecting an appropriate set of macros, 1) backtracking in solving training examples is suppressed, 2) problem-solving efficiency does not deteriorate even after learning a number of examples, and 3) after learning 30 training examples, no backtracking occurs in solving 100 test examples different from the training examples. In conclusion, the proposed method speeds up problem solving by a factor of 10 to 100.

1 Introduction

Explanation-Based Learning (EBL) has been used for speed-up learning in problem solving. Since there are many combinations of macros in each explanation, EBL systems need a mechanism for selectively learning macros. Some systems select macros that connect two peaks in a heuristic function (Iba 1985; Minton 1985). Another system employs heuristics that select useful macros (Yamada and Tsuji 1989). Although they work well in some domains, such methods depend on domain-dependent heuristics that have to be supplied by their users.

[Fig. 1.1. Classes of Symbol-Level Learning: speed up (by reducing backtracking — handled via guards, macros, or meta-control — or unification cost, among others) versus smaller size.]

This paper presents a heuristic-independent mechanism based on detecting backtracking. The method uses a dead-end path as a negative explanation tree, compares it with a positive one, and finds the first differing node in order to remove the corresponding rule by composing a macro. Experimental results in the STRIPS domain show that, after learning 30 training examples, no backtracking occurs in solving 100 test examples, and that problem solving is sped up by a factor of 10 to 100.

2 Our Approach

Symbol-Level Learning (SLL) improves a rule set based on training examples. Figure 1.1 shows some classes of SLL. A rule set is better than another if it works faster and occupies less memory, i.e., i) its reasoning speed is higher, and ii) its size is smaller. Reasoning speed depends on the cost of unification and backtracking. Optimization by recent Prolog compilers removes most of the unification overhead, but it does not optimize backtracking, which depends on the query and occurs dynamically. AI programs are usually declarative and nondeterministic, and thus cause backtracking; they have had to be rewritten into efficient ones by hand. Composing macros is a classical method of speed-up learning, which decreases the cost of both backtracking and unification. However, composing macros that are too large or too numerous worsens reasoning speed and causes the utility problem (Minton 1988). Another method controls reasoning by employing a meta-interpreter. It has been analyzed very well (Minton 1988), though such an interpreter usually slows down reasoning. A rather new technique is to attach guards to each clause and to select an appropriate clause (Cohen 1990; Zelle and Mooney 1992). This works well for suppressing backtracking, provided the guards are kept simple and checked quickly. On the other hand, the authors proposed a learning method based on


partial structures of explanations, which selectively learns macro operators (Numao and Shimura 1989, 1990). The method learns at the knowledge level in the fields of translation and logic-circuit design, and outputs term-rewriting rules for a production system, Nat. In this paper, we apply this method to the learning of Horn clauses, and investigate a method to suppress backtracking by treating a failure as a negative instance. To reduce the size of the learned rules grown during the learning process, we develop a method for folding a repetition in each macro and extracting a common subsequence among macros. The subsequence is shared to save memory space and the cost of unification.

3 Learning Algorithm

3.1 Generalization and Specialization

In Prolog, a computation progresses via goal reduction.¹ At each stage of the computation, there exists a resolvent, i.e., a conjunction of goals to be proved. To suppress backtracking, we construct a learning mechanism for avoiding failing paths.

Definition 1 (Instance). An instance is a pair of a query and a resolvent: (query, resolvent). A clause set derives an instance iff the query in the instance derives the resolvent. An instance is positive if its resolvent can be reduced to the empty clause, and negative otherwise.

Example 2. Consider the following logic program:

p :- q(X), q(Y), r(X,Y).
q(b).
q(c).
r(a,b).
r(b,c).

A computation of the query `?- p.' progresses via the following goal reduction:

p
  > q(X),q(Y),r(X,Y)
  > q(Y),r(b,Y)
  > r(b,b)    (failure)
  (backtracking)
  > r(b,c)
  > □          (success)

A pair of p and each resolvent above is an instance. The resolvents in the following instances are reduced to the empty clause:

(p, q(X),q(Y),r(X,Y))
(p, q(Y),r(b,Y))
(p, r(b,c))
(p, □)

¹ See Sections 1.8 (p. 12) and 4.2 (p. 72) in (Sterling and Shapiro 1986).


Thus, they are positive instances. The following instance is negative, since its resolvent fails:

(p, r(b,b))    □
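The classification of instances in Example 2 can be illustrated by a small sketch — here in Python rather than the paper's Prolog, with hard-coded facts and hypothetical helper names (`reduce_p`, `is_positive`) that are ours, not the paper's:

```python
Q_FACTS = {"b", "c"}                  # q(b). q(c).
R_FACTS = {("a", "b"), ("b", "c")}    # r(a,b). r(b,c).

def reduce_p():
    """Enumerate the resolvents met while reducing the query `p`
    in Prolog's left-to-right, depth-first order (with backtracking)."""
    trace = ["q(X),q(Y),r(X,Y)"]
    for x in sorted(Q_FACTS):         # resolve q(X)
        trace.append(f"q(Y),r({x},Y)")
        for y in sorted(Q_FACTS):     # resolve q(Y)
            trace.append(f"r({x},{y})")
            if (x, y) in R_FACTS:     # resolve r(X,Y)
                trace.append("[]")    # empty resolvent: success
                return trace
    return trace

def is_positive(resolvent):
    """An instance (p, resolvent) is positive iff the resolvent
    can still be reduced to the empty clause."""
    if resolvent.startswith("r("):
        x, y = resolvent[2:-1].split(",")
        return (x, y) in R_FACTS
    return True                       # every other resolvent succeeds here
```

Running `reduce_p()` reproduces the trace above, with `r(b,b)` as the single dead-end resolvent, so `is_positive("r(b,b)")` is false while all other resolvents on the path are positive.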

Definition 3 (Generalization and Specialization of a Clause Set). Let I1 and I2 be the instance sets derived from clause sets S1 and S2, respectively. S1 is a generalization of S2, and S2 is a specialization of S1, denoted by S1 ⊒ S2, iff I1 ⊇ I2.

The following two theorems show that Explanation-Based Generalization (EBG) (Mitchell et al. 1986; Kedar-Cabelli and McCarty 1987) specializes a clause set:

Theorem 4. Let c be a macro generated by EBG from an explanation including clauses c1, ..., cn. Then {c1, ..., cn} ⊒ {c}.

Proof. Let i be any instance derived from {c}. Since c is generated from c1, ..., cn, i is also derived from {c1, ..., cn}. Therefore, {c1, ..., cn} ⊒ {c}. In general, {c1, ..., cn} ≠ {c}, since some instances derived from c1, ..., cn are not derived from c. □

Theorem 5. If S1 ⊒ S2 and S3 ⊒ S4, then S1 ∪ S3 ⊒ S2 ∪ S4.

Proof. Suppose S2 ∪ S4 derives i. Consider a step in the derivation at which a clause in S2 is applied. Since S1 ⊒ S2, S1 derives the same step. Similarly, when a clause in S4 is applied, S3 derives the same step. Since S1 ∪ S3 derives each step of the derivation, S1 ∪ S3 derives i, from which we conclude that S1 ∪ S3 ⊒ S2 ∪ S4. □

These theorems show that a clause set is specialized when c1, ..., cn are replaced by c. We speed up a program by constructing a specialization that avoids backtracking.

3.2 An Algorithm for Speed-Up Learning

Figure 1.2 shows an algorithm to suppress backtracking based on positive and negative instances. A proof tree is an ordered tree, each of whose nodes specifies a clause. The children of a node specify the clauses applied to its subgoals, in their order. The function clause(x) gives the clause applied at a node x. The algorithm traverses a proof tree, finds an inappropriate application of a clause, and replaces the clause by a macro to suppress backtracking. The algorithm needs only part of the instances given in Definition 1. As a positive instance Pi, it is given a pair (instantiatedQuery, {}), where instantiatedQuery is a query with its output variables correctly instantiated. In the fifth line, the algorithm checks whether a query with output variables


begin
  clauseSet := domain theory;
  EPi := the proof tree of each positive instance Pi (i = 1, ..., n);
  for i := 1 to n do
    while clauseSet derives any negative instance do begin
      N := the derived negative instance;
      EN := the proof tree of N;
      Compare each node in EN with its corresponding node in EPi
        in depth-first order;
      h := the first different node;
      clauseSet := {};
      for i = 1 to n do
        for each p ∈ EPi do
          if clause(p) = clause(h) then begin
            q := the parent or a child of p;   ... (*)
            Make a macro c based on clause(p) and clause(q);
            clauseSet := clauseSet ∪ {c}
          end
          else clauseSet := clauseSet ∪ {clause(p)}
    end
end.

Fig. 1.2. Learning Algorithm.

uninstantiated causes backtracking or not. If it does, the derived negative instance N in the sixth line is a pair (query, resolvents), where resolvents is a set of resolvents at the point of failure. EPi and EN are the proof trees of Pi and N, respectively. The algorithm finds the first incorrectly applied clause in EN by comparing the two proof trees, and replaces it by a macro. Let us suppress the backtracking in Example 2 by using this algorithm. The positive and negative instances P1 and N are (p, □) and (p, r(b,b)), respectively. Their proof trees EP1 and EN are shown in Figure 1.3. In each node, predicate_place specifies a clause, where predicate and place are

[Fig. 1.3. Proof Trees of the Positive and Negative Instances — EP1: root p_1 with children q_1, q_2, r_2; EN: root p_1 with children q_1, q_1, where the second q_1 (shadowed) leads to the failing goal r(b,b).]


a predicate and a place in its definition, respectively; e.g., q_2 is the second clause in the definition of q, i.e., the fact `q(c).'. The first different node h between EP1 and EN is the shadowed node q_1. While traversing EP1 in the for loop, clause(p) = clause(h) holds when p = q_1. Thus, the algorithm makes a macro c based on clause(p_1) and clause(q_1), encircled with a dotted line in the figure, and generates the following deterministic clauseSet:

p :- q(Y), r(b,Y).
q(c).
r(b,c).
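The effect of this specialization can be checked by counting the resolution attempts on r/2 for the query `?- p.` under both clause sets. The sketch below is an illustrative Python re-enactment (not the paper's code); the function names are ours:

```python
Q_FACTS = ["b", "c"]                       # q(b). q(c).
R_FACTS = {("a", "b"), ("b", "c")}         # r(a,b). r(b,c).

def steps_original():
    """Attempts on r/2 for `?- p.` with the original program:
    p :- q(X),q(Y),r(X,Y)."""
    steps = 0
    for x in Q_FACTS:                      # choice point for q(X)
        for y in Q_FACTS:                  # choice point for q(Y)
            steps += 1
            if (x, y) in R_FACTS:
                return steps
    return steps

def steps_learned():
    """Same query with the learned set:
    p :- q(Y),r(b,Y).  q(c).  r(b,c)."""
    steps = 0
    for y in ["c"]:                        # q/1 was specialized to q(c)
        steps += 1
        if ("b", y) in R_FACTS:
            return steps
    return steps
```

The original program attempts r(b,b) before succeeding on r(b,c) (two attempts), while the learned set succeeds on its first and only attempt, i.e., deterministically.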

We need only one macro to remove the backtracking above. In general, the while loop in the fifth line invokes iterative composition of larger macros while the clauseSet still causes backtracking. A clause usually checks its applicability conditions in the leftmost part of its body, as follows:²

head :- conditions, ..., goals, ... .

We assume such conditions are operational, and ignore backtracking inside them. EBG generates a macro c based on p and its parent or child q at (*) in the algorithm. Nodes p and q should be a node and its first child; e.g., p_1 and q_1 in Figure 1.3. A macro based on the node and its other children does not suppress backtracking, since the generated clause has conditions and goals aligned alternately.
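The core of the algorithm — the depth-first comparison of the two proof trees and the pairing of the offending clause with its parent — can be sketched as follows. This is an illustrative Python rendering (trees as nested tuples, function names ours), not the authors' implementation:

```python
# A proof tree is (clause_label, [children]), children in subgoal order.
EP1 = ("p_1", [("q_1", []), ("q_2", []), ("r_2", [])])   # positive tree
EN  = ("p_1", [("q_1", []), ("q_1", [])])                # negative tree

def first_different(pos, neg):
    """Walk both trees in lockstep, depth-first preorder; return the
    clause of the first node in the negative tree that disagrees."""
    if pos[0] != neg[0]:
        return neg[0]
    for pc, nc in zip(pos[1], neg[1]):
        h = first_different(pc, nc)
        if h is not None:
            return h
    return None

def macro_pair(tree, h, parent=None):
    """Find the node in the positive tree whose clause equals h and
    pair it with its parent, as the basis of the macro at (*)."""
    if tree[0] == h:
        return (parent, tree[0])
    for child in tree[1]:
        found = macro_pair(child, h, tree[0])
        if found:
            return found
    return None
```

On the trees of Figure 1.3, `first_different` returns q_1 (the second child of EN disagrees with q_2 in EP1), and `macro_pair` pairs it with its parent p_1 — exactly the two clauses from which the macro is composed.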

3.3 Generalizing Number and Sharing a Common Sequence

The presented algorithm tends to generate large macros, which occupy much memory space and are slow to unify. Since such a large macro usually contains a repeated sequence, we apply the generalizing-number technique (Shavlik and DeJong 1987a, 1987b) to the extracted macros, which folds differing numbers of repetitions into one form. Figure 1.4 shows an example of generalizing number. Such a generalized repetition is implemented as a group of clauses internally calling one another. After generalizing number, if two sequences have a common structure, such as the left trees in Figure 1.5, the sequence is shared as shown in the right tree. This transformation saves memory space and the cost of unification. Sharing a common sequence may cause over-generalization, i.e., the left trees do not include the combination of nil and subtree C, while the right tree does. In the experiment presented in the next section, this causes no problem, since the selection between subtree B and subtree C depends on neither gothrudr nor nil, though in general more investigation is needed.

² The planning program for the following experiment consists of such clauses.

[Fig. 1.4. Generalizing Number — macro bodies containing different numbers of repetitions of gothrudr/gotod subtrees (terminated by nil) are folded into a single generalized repetition.]

[Fig. 1.5. Sharing a common sequence — two macros beginning with the common sequence pushthrudr, gothrudr, gotob, one ending in subtree B (via nil) and the other in subtree C, are merged so that the common sequence is shared.]


[Fig. 1.6. A Planning Example and Its Proof Tree — initial state: robot and box in room1, door1 closed; goal state: robot and box in room2. The proof tree (rooted at initialGoal) of the successful plan is drawn with solid lines, applying gotod (node 1), open (node 2), gotob (node 3), pushd (node 4), and pushthrudr (node 5); dotted lines partially describe proof trees of negative instances such as gothrudr.]

4 Evaluation of the Method

4.1 STRIPS in Prolog

We evaluate the proposed method on the STRIPS planning system (Fikes and Nilsson 1971; Fikes et al. 1972). A system written in Prolog is needed to apply the proposed learning mechanism, so the authors rewrote a simple planner from (Sterling and Shapiro 1986, Program 14.11, p. 222). It searches for a plan in a depth-first manner, preventing infinite loops by checking previously visited models and by limiting the search depth. Consider forming a plan for achieving the goal state in Figure 1.6 from the initial state. The proof tree in the figure shows the planning process: the solid lines show the proof tree of the positive instance (EP1), and each dotted line partially describes the proof tree of a negative instance (EN). The following is a query for the planning procedure:

?- transform(
     [type(room1,room), ..., pushable(box1),
      connects(door1,room1,room2)],              % fact
     [inroom(box1,room1), inroom(robot,room1),
      status(door1,closed)],                     % initial state
     [inroom(robot,room2), inroom(box1,room2)],  % goal state
     Plan).

This query invokes the following clause, represented by initialGoal in the proof tree:

transform(Fact,Model,Goal,Plan) :-
    diff(Model,Goal,Diff),
    transform(Fact,Model,Goal,Diff,Ac,[],Model,Actions,N,0),


    reverse(Actions,Plan).

where diff(+Model,+Goal,-Diff) is an operational predicate that calculates the difference between Model and Goal. If no difference is detected, the following clause terminates the recursion:

transform(Fact,Model,G,[],Na,Vg,Vm,[],Model,_).

It is indicated by nil in the proof tree. Each of the other nodes indicates a planning clause selecting an operator. The system has the following eight operators:

- gotob(BX): go to object BX.
- gotod(DX): go to door DX.
- pushb(BX,BY): push BX to object BY.
- pushd(BX,DX): push BX to door DX.
- gothrudr(DX,RX): go through door DX into room RX.
- pushthrudr(BX,DX,RX): push BX through door DX into room RX.
- open(DX): open door DX.
- close(DX): close door DX.

As an example of the planning clauses, the one selecting the pushthrudr operator is as follows:

transform(Fact,Model,Goal,Diff,pushthrudr(BX,DX,RX),
          Vgoal,Vmodel,Plan,New_model,M) :-
    conditions(Fact,Model,Goal,pushthrudr(BX,DX,RX),Diff),
    ...,
    transform(Fact,Model,Sub_goal,Diff1,Action,[Effect|Vgoal],
              Vmodel,Actions,Nmodel,NM),
    ...,
    transform(Fact,Update_model,Goal,Diff2,Next_ac,
              [Effect|Vgoal],[Update_model|Vmodel],Acts,
              New_model,NM),
    append(Acts,[pushthrudr(BX,DX,RX,RY)|Actions],Plan).

where conditions(_,_,_,_,_) is an operational predicate that checks the conditions for applying the operator. The first subgoal transform(Fact, Model, ...) makes a subplan required before the application of the operator; the second subgoal transform(Fact, Update_model, ...) makes a subplan after the application. The differences between the initial state and the goal state in Figure 1.6 are the places of the robot and the box. If the planner selects the latter, the operator is pushthrudr(box1, door1, room1, room2) (node 5 in the figure), whose precondition is:

inroom(box1,room1), inroom(robot,room1),
nextto(robot,box1), nextto(box1,door1), status(door1,open),

out of which a subplan has to reduce:

nextto(robot,box1), nextto(box1,door1), status(door1,open).

[Fig. 1.7. A State — a robot world of nine rooms (room1–room9) connected by numbered doors, with the robot R.]

If the planner decides to reduce status(door1,open), the operator is open(door1) (node 2). For its application, nextto(robot,door1) is reduced by the directly applicable operator gotod(door1) (node 1). After open(door1) is applied, the subtree whose root is node 4 reduces the other differences for the application of pushthrudr(box1, door1, room1, room2). The result is the inorder sequence of operators in the tree, indicated by the numbers in the figure, as follows:

gotod(door1)
open(door1)
gotob(box1)
pushd(box1,door1)
pushthrudr(box1,door1,room2)

Backtracking occurs when the planner decides to reduce other differences during the above process. If it selects the place of the robot instead of the box, the operator is gothrudr(door1, room2), as shown by a dotted line in the figure. Likewise, the operator becomes gotob(box1) or pushd(box1,door1) if the planner tries to reduce nextto(robot,box1) or nextto(box1,door1), respectively, instead of status(door1,open). To suppress backtracking, the algorithm in Figure 1.2 composes a macro based on nodes 3 and 4, and removes gothrudr, gotob, and pushd. Since the composed macro is applied instead of node 2, a larger macro, shadowed in the figure, is composed. The operators pushthrudr, open, gotod, nil and the last macro make the plan without backtracking.
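The search strategy described above — depth-first, with a depth bound and a check of previously visited models — can be sketched compactly. The following is a toy Python rendering, not the paper's Prolog planner: the operator set is cut down to three single-door operators with hypothetical simplified arguments, and `plan` is our name:

```python
# Each operator: (name, preconditions, delete-list, add-list),
# over ground atoms represented as strings. A toy subset of the
# paper's eight operators, specialized to door1/room1/room2.
OPS = [
    ("gotod",    frozenset(), frozenset(),
                 frozenset({"nextto(robot,door1)"})),
    ("open",     frozenset({"nextto(robot,door1)"}),
                 frozenset({"closed(door1)"}),
                 frozenset({"open(door1)"})),
    ("gothrudr", frozenset({"nextto(robot,door1)", "open(door1)"}),
                 frozenset({"inroom(robot,room1)"}),
                 frozenset({"inroom(robot,room2)"})),
]

def plan(state, goal, depth, seen):
    """Depth-first search with a depth bound and a visited-model check,
    mirroring how the rewritten Sterling & Shapiro planner avoids loops.
    `state`/`goal` are frozensets of atoms; `seen` a frozenset of states."""
    if goal <= state:
        return []                          # all goal atoms hold
    if depth == 0 or state in seen:
        return None                        # bound reached or model revisited
    for name, pre, dele, add in OPS:
        if pre <= state:                   # operator applicable
            rest = plan((state - dele) | add, goal, depth - 1, seen | {state})
            if rest is not None:
                return [name] + rest
    return None
```

From the state {inroom(robot,room1), closed(door1)} toward the goal {inroom(robot,room2)}, the sketch finds gotod, open, gothrudr — the robot-moving fragment of the plan in Figure 1.6 — while the visited-model check prunes the repeated application of gotod.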

4.2 Experiments

The authors made experiments on planning in the robot domain shown in Figure 1.7. They generated the initial and goal states of planning by randomly choosing:

- the number of boxes (0-9),
- the locations of the boxes,


- whether each box contacts another box or not,
- whether each door is open or not, and
- the location of the robot.

1000 pairs of such states were stored, out of which training and test examples were sampled for each experiment. The lengths of the plans for the pairs range from 10 to 120 steps. By solving the training examples with the learned clauses, the authors first verified that the method suppresses backtracking:

Experiment A. After learning 20 examples, compare the CPU time of the non-learned and the learned clauses in solving the same examples.

If the system extracts superfluous macros, problem-solving efficiency deteriorates after learning a number of examples. This phenomenon is checked by the following experiment:

Experiment B. Make the same comparison for 30 examples, by adding 10 examples to Experiment A.

Figure 1.8 shows the results of Experiments A and B. The CPU time is for SICStus Prolog version 0.7 on a Sparc Station 2. No backtracking occurred after learning in Experiment A. The learning accelerated problem solving by a factor of 10 to 100, and reduced the number of searched nodes to between 1/50 and 1/1000 in both experiments. Although the learned sets of clauses in the two experiments are different, their efficiency is almost the same, as shown in Figure 1.8, exhibiting that correct macros are extracted. When the system learns more than 30 examples, it composes no further combinations as macros, i.e., the result is the same even if the system learns more than 30 examples. After learning 30 examples, the system has composed 5 or 6 macros. Each macro is constructed from several clauses, which implement generalizing number and sharing of a common sequence. The number of clauses constructing the macros sums to 104, occupying rather much memory space, though their reasoning speed is high. To test the generality of the extracted macros, the system solved a different set of examples:

Experiment C. After learning 30 training examples, solve another 100 test examples.

Figure 1.9 shows the result of Experiment C.
The learned macros solved the 100 examples without any backtracking, and improved problem solving similarly to Experiment B. This result shows that the system learned general and useful macros. The problem with the current system is that it takes much time for learning, e.g., about 10 hours for learning 30 examples. Therefore, the method


[Fig. 1.8. The Result of Experiments A and B — cumulative CPU time (seconds, log scale, roughly 2 to 5000) over 20 problems, for the non-learned, 20-learned, and 30-learned clause sets.]

is appropriate for speeding up frequently used programs, though it should be improved in a future implementation.

5 Related Work

Iba (1985) and Minton (1985) proposed peak-to-peak heuristics that compose a macro connecting two peaks of an evaluation function. However, it is usually hard to construct an appropriate heuristic function for evaluating goals in logic programs. Minton (1985) and Yamamura and Kobayashi (1991) presented methods to extract common substructures in explanations. This is useful for reducing the number of macros, though it may not accelerate problem solving. Our method concentrates on suppressing backtracking, after which it reduces


[Fig. 1.9. The Result of Experiment C — cumulative CPU time (seconds, log scale, roughly 2 to 100000) over 100 problems, for the non-learned and 30-learned clause sets.]

the number of macros by applying the generalizing-number technique and by sharing common structures. Yamada and Tsuji (1989) proposed a heuristic called perfect causality for selecting appropriate macros. Although it works well on many problems, it may slow down problem solving on others. By suppressing backtracking, our method straightforwardly speeds up problem solving. Minton (1988) presented learning from failure. It is similar to ours in the sense that both methods are based on failures. The difference is in the representation of the learned knowledge. Minton's method learns control heuristics for the Prodigy system, and thus needs a meta-interpreter to be applied to logic programs. Such an interpreter runs slower than compiled Prolog programs. In contrast, our method learns macros, which are efficiently executable by Prolog processors.


In the field of combining empirical and explanation-based learning, some researchers pursue methods to extract partial structures of explanations based on positive and negative examples (Numao and Shimura 1989, 1990), and to search for a macro on a generalization hierarchy of common explanation structures (Kobayashi et al. 1991). In contrast to our research, which focuses on learning at the symbol level, they evaluated macros at the knowledge level. Cohen (1990) and Zelle and Mooney (1992) proposed adding guards to each clause for controlling and speeding up a program. Such guards are synthesized using the techniques of Inductive Logic Programming. Instead of introducing such additional guards, our method composes macros, which are decomposed into simpler clauses by employing generalizing number and sharing common structures.

6 Conclusion

We have presented a method for extracting macros based on backtracking, and evaluated it by using planning examples in STRIPS. This method is widely applicable to automatic improvement of logic programs and knowledge bases.

Acknowledgment

The authors would like to thank Dr. Seiji Yamada, who sent us the data of his experiment (Yamada and Tsuji 1989). Tatsuya Kawamoto assisted us in using SICStus Prolog.

Bibliography

1. Cohen, W. (1990). Learning approximate control rules of high utility. Proceedings of the Seventh International Conference on Machine Learning, 268-276, Morgan Kaufmann, San Mateo.
2. Fikes, R. E., Hart, P. E., and Nilsson, N. J. (1972). Learning and executing generalized robot plans. Artificial Intelligence, 3, 251-288.
3. Fikes, R. E. and Nilsson, N. J. (1971). STRIPS: a new approach to the application of theorem proving to problem solving. Artificial Intelligence, 2, 189-208.
4. Iba, G. A. (1985). Learning by discovering macros in puzzle solving. Proceedings of the Ninth International Joint Conference on Artificial Intelligence, 640-642, Morgan Kaufmann.
5. Kedar-Cabelli, S. T. and McCarty, L. T. (1987). Explanation-based generalization as resolution theorem proving. Proceedings of the Fourth International Workshop on Machine Learning, 383-389, Morgan Kaufmann.
6. Kobayashi, S., Shirai, Y., and Yamamura, M. (1991). Acquiring valid macro rules under the imperfect domain theory by top-down search on a generalization hierarchy of common explanation structures (in Japanese). Journal of Japanese Society for Artificial Intelligence, 6(3), 416-425.
7. Minton, S. (1985). Selectively generalizing plans for problem-solving. Proceedings of the Ninth International Joint Conference on Artificial Intelligence, 596-599, Morgan Kaufmann.
8. Minton, S. (1988). Learning Search Control Knowledge. Kluwer Academic Publishers, Boston/Dordrecht/London.
9. Mitchell, T. M., Keller, R. M., and Kedar-Cabelli, S. T. (1986). Explanation-based generalization: a unifying view. Machine Learning, 1, 47-80.
10. Numao, M. and Shimura, M. (1989). A learning method based on partial structures of explanations (in Japanese). The Transactions of the Institute of Electronics, Information and Communication Engineers, J72-D-II(2), 263-270.
11. Numao, M. and Shimura, M. (1990). A knowledge-level analysis of explanation-based learning. Proceedings of the Third International Conference on Industrial & Engineering Applications of Artificial Intelligence & Expert Systems, 959-967, ACM Press, New York.
12. Shavlik, J. W. and DeJong, G. F. (1987a). BAGGER: an EBL system that extends and generalizes explanations. Proceedings of the Sixth National Conference on Artificial Intelligence, Morgan Kaufmann.
13. Shavlik, J. W. and DeJong, G. F. (1987b). An explanation-based approach to generalizing number. Proceedings of the Tenth International Joint Conference on Artificial Intelligence, Morgan Kaufmann.
14. Sterling, L. and Shapiro, E. (1986). The Art of Prolog. The MIT Press, Cambridge.
15. Yamada, S. and Tsuji, S. (1989). Selective learning of macro-operators with perfect causality. Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, 603-608, Morgan Kaufmann, San Mateo.
16. Yamamura, M. and Kobayashi, S. (1991). An augmented EBL and its application to the utility problem. Proceedings of the Twelfth International Joint Conference on Artificial Intelligence, 623-629, Morgan Kaufmann, San Mateo.
17. Zelle, J. M. and Mooney, R. J. (1992). Speeding-up logic programs by combining EBG and FOIL. Proceedings of the ML92 Workshop on Knowledge Compilation and Speedup Learning.