An intelligent search method using Inductive Logic Programming

Nobuhiro Inuzuka, Hirohisa Seki and Hidenori Itoh
Department of Intelligence and Computer Science, Nagoya Institute of Technology, Gokiso-cho, Showa-ku, Nagoya 466, Japan
Phone: +81-52-735-5475, Fax: +81-52-735-5477
E-mail: {inuzuka, seki,
[email protected]
Abstract
We propose a method that uses Inductive Logic Programming to provide heuristic functions for searching for goals to solve problems. The method takes solutions of a problem, or a history of search, together with a set of background knowledge on the problem. In a large class of problems, a problem is described as a set of states and a set of operators, and is solved by finding a series of operators. A solution, a series of operators that brings an initial state to a final state, is transformed into positive and negative examples of a relation "better-choice", which describes that an operator is better than the others in a state. We also give a way to use the "better-choice" relation as a heuristic function. The method can use any logic program as background knowledge to induce heuristics, and the induced heuristics have high readability. The paper examines the method by applying it to a puzzle.
1 Introduction
Interest in applications of Inductive Logic Programming (ILP) methods to a number of fields has increased recently. Several applications of ILP to diagnosis, design, or analysis, such as [Mizoguchi et al., 1996; Dzeroski et al., 1996; Muggleton et al., 1996], have been presented and have been successful. This paper proposes a method to generate heuristics for solving a problem using ILP techniques.

Applying learning methods to problem solving has been studied before. Lex [Mitchell et al., 1983] learns heuristic rules that tell in which situations operators should be applied; it learns from experience of solving problems using attributes of the problems. Explanation-based learning [Mitchell et al., 1986] shortens the problem-solving process by generating shortcut rules. Absolver [Prieditis, 1993] discovers admissible heuristics, to be used with the A* algorithm, by the Relaxed Problem method. ILP is expected to contribute to problem solving because of its flexibility.
Figure 1: Outline of the method and the system IS/ILP (solutions and background knowledge on the problem are given to the ILP system, which induces a comparison relation between operations; a heuristic function is constituted from that relation)
Attempts to use ILP techniques for problem solving include Dolphin [Zelle et al., 1993] and Scope [Estlin et al., 1996]. Dolphin learns conditions that make nondeterministic programs deterministic, in combination with the EBL method. Scope learns control rules for partial-order planning, also in combination with EBL. The learning procedure of ILP, however, does not directly fit problem solving. This paper gives a general framework for using the learning procedure of ILP for problem solving. The method inherits from ILP the advantage that it can use any logic program as background knowledge to induce heuristics, and the induced knowledge, which is the core of the heuristics, is readable.

The following section explains a method to generate positive and negative examples for ILP from a solution of a problem or a history of problem solving, and a way to use the result from ILP as a heuristic function for problem solving. Section 3 describes IS/ILP, an implemented system of the method. Experiments with the system on a puzzle are given in Section 4.
2 A search method using ILP
Figure 1 illustrates an outline of the method. The method uses solutions of a search problem to generate a heuristic function, which is then used to solve problems of the same type. To generate a heuristic function from solutions, an ILP system is used with background knowledge that describes various aspects of the problem.
2.1 Search problems and definitions
A search problem P is a 4-tuple (S, O, s_I, F), where

  S   : a finite set,
  O   : a finite set of mappings from S to S,                                (1)
  s_I ∈ S, and
  F   ⊆ S.

An element of S is called a state. A mapping o ∈ O is called an operator; it maps a state s ∈ S to a state o(s) ∈ S. s_I is a state in S called the initial state. F is a subset of S, and a state in F is called a final state or a goal state. For a state s and a sequence of operators ō = (o_1, o_2, ..., o_n), ō(s) denotes o_n(...(o_2(o_1(s)))...). To solve a problem P = (S, O, s_I, F) is to find a finite sequence of operators ō that satisfies ō(s_I) ∈ F; such an ō is called a solution of P. Problems with the same set of states, the same set of operators, and the same final states make a class of problems.
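To fix a concrete reading of this definition for the sketches that follow (all of which are our own illustrations, not code from the paper), a problem of this form can be written in Prolog with a handful of assumed predicates: operator/1 enumerating O, apply_op/3 computing o(s), and initial_state/1 and goal_state/1 for s_I and F. The toy instance below uses integers as states.

% A toy instance of the (S, O, s_I, F) formulation: the states are the
% integers, the operators are inc and dec, s_I = 0 and F = {3}.
operator(inc).
operator(dec).

apply_op(S, inc, S1) :- S1 is S + 1.   % o(s) for the operator inc
apply_op(S, dec, S1) :- S1 is S - 1.   % o(s) for the operator dec

initial_state(0).
goal_state(3).

A solution is then any operator sequence ō with ō(s_I) ∈ F, for instance (inc, inc, inc).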
2.2 Inducing knowledge from solutions
The method takes a set of solutions of problems in a class. From each of the solutions we can generate positive and negative examples. Let ō = (o_{i_1}, o_{i_2}, ..., o_{i_n}) be a solution of a problem P = (S, O, s_I, F), where O = {o_1, ..., o_k}. This solution tells us that the operator o_{i_j} is the best among the operators in O at the state s_j = o_{i_{j-1}}(...(o_{i_1}(s_I))...) (1 ≤ j ≤ n). Let us consider a relation best-choice(S, O, I) that is true for a state S, the set of operators O and the index I of the best operator. best-choice(s_j, O, i_j) is a positive example of the relation, and the other instances best-choice(s_j, O, i') (i' ≠ i_j), with the same state and different indexes, are negative examples. In principle the positive and negative examples of best-choice generated from solutions can be used as an input sample to an ILP system to induce a definition of best-choice. In practice, however, this is difficult, because the relation has three arguments in appearance but substantially k+2 arguments (a state, k operators, and an index), and the large number of arguments prevents an ILP system from inducing a definition.

The relation best-choice, which selects one operator from among many, can be expanded into a comparison relation better-choice, which compares two operators. A positive example best-choice(s_j, O, i_j) of best-choice is expanded into k−1 positive examples of better-choice,

  better-choice(s_j, o_{i_j}, o_1), ..., better-choice(s_j, o_{i_j}, o_{i_j−1}),
  better-choice(s_j, o_{i_j}, o_{i_j+1}), ..., better-choice(s_j, o_{i_j}, o_k),

and k−1 negative examples,

  better-choice(s_j, o_1, o_{i_j}), ..., better-choice(s_j, o_{i_j−1}, o_{i_j}),
  better-choice(s_j, o_{i_j+1}, o_{i_j}), ..., better-choice(s_j, o_k, o_{i_j}).
We thus have n(k−1) positive examples and n(k−1) negative ones for a solution consisting of n operators. An ILP system then induces a definition of better-choice using the positive and negative examples generated from solutions and background knowledge on the problem.
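As a concrete illustration of this example generation (a minimal sketch, not the authors' code), the predicate below walks along a solution and emits the positive and negative better_choice/3 facts for each visited state. The name solution_examples/4 is ours, and the sketch assumes the operator/1 and apply_op/3 interface of the toy sketch in Section 2.1.

:- use_module(library(lists)).   % append/3 (a library predicate in SICStus)

% solution_examples(State, Solution, Pos, Neg):
%   Solution is the list of operators applied from State; Pos and Neg
%   are the positive and negative better_choice/3 examples it yields.
solution_examples(_, [], [], []).
solution_examples(S, [Best|Rest], Pos, Neg) :-
    findall(better_choice(S, Best, O),
            (operator(O), O \== Best), Pos1),    % the chosen operator beats every other one
    findall(better_choice(S, O, Best),
            (operator(O), O \== Best), Neg1),    % no other operator beats the chosen one
    apply_op(S, Best, S1),
    solution_examples(S1, Rest, Pos2, Neg2),
    append(Pos1, Pos2, Pos),
    append(Neg1, Neg2, Neg).

On the toy problem, ?- solution_examples(0, [inc,inc,inc], Pos, Neg) yields three positive facts of the form better_choice(s, inc, dec) and three negative ones of the form better_choice(s, dec, inc), i.e. n(k−1) of each with n = 3 and k = 2.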
2.3 Using induced knowledge as heuristics
To use a definition of the relation better-choice in the search process that solves a problem, we have to give a way to produce a heuristic function from the relation. A heuristic function is obtained in two steps: in the first step an order relation is generated from better-choice for each state, and in the second a heuristic function, which gives an evaluation value for each state, is produced from the order.

The relation better-choice(s, o1, o2) is regarded as a binary relation better-choice_s(o1, o2) for a state s. Its transitive and reflexive closure better-choice*_s is defined by: better-choice*_s(o, o') iff there is a sequence of operators (o = o_1, ..., o_n = o') (n ≥ 1) such that better-choice_s(o_i, o_{i+1}) for i = 1, ..., n−1. better-choice*_s is a pseudo-order relation because it is trivially transitive and reflexive. Let us consider the following set inferior-set_s(o) for each operator o and each state s:

  inferior-set_s(o) = { o' ∈ O | better-choice*_s(o, o') }.                  (2)

Using the mapping inferior-set_s we can give a superiority of operators as a total order:

  superior_s(o1, o2)   iff   #inferior-set_s(o1) ≥ #inferior-set_s(o2),      (3)

where #A denotes the number of elements of A.
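The closure and the ordering of Eqs. (2) and (3) can be computed by a short Prolog program; the sketch below is our own illustration (assumed names throughout), where better_choice/3 stands for an induced definition of better-choice and operator/1 enumerates O as in the earlier sketches.

:- use_module(library(lists)).   % member/2

% reachable(S, O1, O2): O2 is in the transitive-reflexive closure
% better-choice*_S starting from O1 (cycle-safe via the Seen list).
reachable(_, O, O).
reachable(S, O1, O2) :- reach(S, O1, O2, [O1]).

reach(S, O1, O2, Seen) :-
    better_choice(S, O1, O3),
    \+ member(O3, Seen),
    (  O3 = O2
    ;  reach(S, O3, O2, [O3|Seen])
    ).

% inferior_set(S, O, Set): the set of Eq. (2).
inferior_set(S, O, Set) :-
    findall(O2, (operator(O2), reachable(S, O, O2)), L),
    sort(L, Set).                % removes duplicates from multiple proofs

% superior(S, O1, O2): the comparison of Eq. (3).
superior(S, O1, O2) :-
    inferior_set(S, O1, Set1), inferior_set(S, O2, Set2),
    length(Set1, N1), length(Set2, N2),
    N1 >= N2.

Ranking the operators applicable in a state then amounts to sorting them by the sizes of their inferior sets; the search sketch later in this section assumes such a ranking.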
The relation superior_s is a total order relation, and so we can select the best operator using it. Figure 2 illustrates a hill-climbing algorithm using this relation.

  Problem: P = (S, O, s_I, F)
  s := s_I
  Repeat
    Ops := all operators applicable to s
    Choose the maximum operator o from Ops under superior_s
    s := o(s)
  Until s ∈ F
  Stop the algorithm with success

Figure 2: Hill-climbing search using the superior_s relation

This algorithm, however, easily runs into an infinite search path, because the definition of better-choice induced by an ILP system is not always complete, and so superior_s does not always guide the algorithm appropriately. To avoid an infinite search path we give a safe algorithm using superior_s, based on best-first search. To use best-first search we need to evaluate every state, which means that we have to compare states globally. The superior_s relation only compares operators locally in a state s, that is, it compares the states succeeding s via operators. A global evaluation of states can nevertheless be given by the superior_s relation. Let us consider a sequence of operators ō = (o_{i_1}, ..., o_{i_n}). We assume that for every j (j = 1, ..., n) the operator o_{i_j} has a ranking r_j by superior_{s_j}, i.e. o_{i_j} is the r_j-th best operator among O, where s_j = o_{i_{j-1}}(...(o_{i_1}(s_I))...). Then an evaluation value of the state ō(s_I) is v_ō, defined by

  v_ō = f(r_1) · f(r_2) · ... · f(r_n),                                      (4)

where f is a function satisfying

  1 > f(1) ≥ f(2) ≥ ... ≥ f(i) ≥ ... ≥ 0.                                    (5)
We call the function f a confidence function. A best-first search algorithm using the evaluation values is illustrated in Figure 4. In the algorithm, the evaluation value of a state s is calculated by multiplying the evaluation value of its parent state by the value of the confidence function at the ranking of the operator that generates s.

Different confidence functions give different performance in the search algorithm. If the function

  f1(1) = α, f1(2) = f1(3) = ... = 0   (0 < α < 1)                           (6)

is used, the algorithm performs hill-climbing, the same as in Figure 2. Another function, defined by

  f2(1) = f2(2) = ... = α   (0 < α < 1),                                     (7)

gives the same evaluation value to every state generated by the same number of operators from the initial state; hence the algorithm performs breadth-first search.

  Problem: P = (S, O, s_I, F)
  s := s_I
  open := {(s, 1)}
  close := {}
  Repeat
    Choose a state s such that (s, v) ∈ open and s ∉ close with the maximum value v
    If s ∈ F then stop the algorithm with success
    open := open − {(s, v)}
    close := close ∪ {s}
    ops := all operators applicable to s
    Sort ops by the superior_s relation
    open := open ∪ {(o_i(s), v_i) | o_i ∈ ops, v_i = v · f(i) for the i-th operator in ops}
  Until open is empty
  Stop the algorithm with failure

Figure 4: Best-first search using the superior_s relation

If a confidence function f decreases its values for later rankings more rapidly than another function f' does, the search algorithm with f visits a node that the induced knowledge evaluates highly earlier than the algorithm with f' does. So we can say that f confides in the knowledge more than f'. For example, the function

  f3(i) = α^i   (0 < α < 1)                                                  (8)

confides in the knowledge less than the function

  f4(i) = α^{2i}   (0 < α < 1).                                              (9)

Figure 3: Various confidence functions and evaluation values: (a) f2(i) = α, (b) f3(i) = α^i, (c) f4(i) = α^{2i}

Figure 3 shows evaluation values in the search with the confidence functions f2, f3 and f4, where the search graphs are assumed to expand every node into two nodes, of which the left node has a higher ranking than the right. The search always finds a solution if the confidence function f satisfies f(i) > 0 for every i and a finite solution exists.
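The following is a sketch (our own code, not the authors' implementation) of the best-first search of Figure 4 together with the evaluation values of Eq. (4). It assumes goal_state/1 and apply_op/3 as in the earlier sketches, a predicate rank_ops(S, Ops) that lists the operators applicable in S sorted best first, e.g. obtained from the superior/3 sketch above, and a confidence function conf/2; the one defined below is f3(i) = α^i with α = 0.8, an arbitrary choice.

:- use_module(library(lists)).   % member/2, reverse/2

conf(I, F) :- F is 0.8 ** I.     % example confidence function: f3 with alpha = 0.8

best_first(S0, Plan) :-
    bf([node(1.0, S0, [])], [], RevPlan),
    reverse(RevPlan, Plan).

% The open list is kept sorted by decreasing evaluation value, so its
% head is always a pair with the maximum value v, as in Figure 4.
bf([node(_, S, Plan)|_], _, Plan) :-
    goal_state(S), !.                         % s in F: stop with success
bf([node(V, S, Plan)|Open], Closed, Result) :-
    \+ member(S, Closed), !,
    rank_ops(S, Ops),                         % operators sorted by superior_s
    expand(Ops, 1, V, S, Plan, Children),     % v_i = v * f(i) for the i-th operator
    insert_all(Children, Open, Open1),
    bf(Open1, [S|Closed], Result).
bf([_|Open], Closed, Result) :-               % state already in close: skip it
    bf(Open, Closed, Result).
% bf([], _, _) has no clause: open is empty, stop with failure.

expand([], _, _, _, _, []).
expand([Op|Ops], I, V, S, Plan, [node(Vi, Si, [Op|Plan])|Rest]) :-
    apply_op(S, Op, Si),
    conf(I, Fi),
    Vi is V * Fi,
    I1 is I + 1,
    expand(Ops, I1, V, S, Plan, Rest).

insert_all([], Open, Open).
insert_all([N|Ns], Open, Open2) :-
    insert_node(N, Open, Open1),
    insert_all(Ns, Open1, Open2).

insert_node(node(V, S, P), [node(V2, S2, P2)|Rest], [node(V2, S2, P2)|Open1]) :-
    V < V2, !,
    insert_node(node(V, S, P), Rest, Open1).
insert_node(N, Open, [N|Open]).

With the toy problem of Section 2.1 and, say, rank_ops(S, Ops) :- findall(O, (operator(O), apply_op(S, O, _)), Ops), the query ?- best_first(0, Plan) should return Plan = [inc, inc, inc].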
3 The system IS/ILP
The method is implemented as a system called IS/ILP (Intelligent Search system using ILP) on SICStus Prolog version 2.1. The system includes a FOIL-like top-down ILP system, FOIL-I [Inuzuka et al., 1996], with some changes. FOIL [Quinlan, 1990; Quinlan et al., 1993] is a well-known top-down ILP system, but it is not robust against a lack of examples.
bg(goal,            1, [-],     [board],                 []             ).
bg(move,            3, [+,+,-], [board,direction,board], []             ).
bg(opposite,        2, [+,+],   [direction,direction],   []             ).
bg(num_misplace,    2, [+,-],   [board,int],             []             ).
bg(dstnc_from_goal, 2, [+,-],   [board,int],             []             ).
bg(num_diff,        3, [+,+,-], [board,board,int],       [proper_sorted]).
bg(distance,        3, [+,+,-], [board,board,int],       [proper_sorted]).
bg(dstnc_empty,     3, [+,+,-], [board,board,int],       [proper_sorted]).
bg(num_psble_dir,   2, [+,-],   [board,int],             []             ).
bg(can_move,        2, [+,+],   [board,direction],       []             ).

Figure 5: Information on background knowledge to be given to the system
FOIL-I induces definitions from relatively small samples. The IS/ILP version of FOIL-I differs from the original version in that (1) it can take any Prolog program as an intensional definition of background knowledge, and (2) it can treat noise: a clause is allowed to cover negative examples up to 20% of the positive examples it covers, and an induced program is allowed to leave up to 10% of the given positive examples uncovered.

To use a Prolog program as background knowledge, information about it is given to the system. Figure 5 is an example of such information, which gives the system the names of predicates, their arities, modes, types and conditions. A condition on a predicate limits the use of that predicate in a clause; e.g. proper_sorted excludes literals with arguments whose indexes are not sorted properly. The other conditions include apart, which inhibits duplication of arguments, permutation_inhibited, which inhibits permutations of input-mode arguments, and already_appeared, which allows only arguments that have already appeared.
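To make the first of these points concrete, a background predicate such as num_misplace of Figure 5 would simply be supplied as an ordinary Prolog program alongside its declaration. The sketch below is an assumed encoding, not the authors' definition: a board is represented as a list of nine entries read row by row, with the atom e for the empty place, and the goal configuration (empty place last) is our assumption.

% Assumed board encoding (not from the paper): a list of nine entries,
% row by row, with the atom e for the empty place.  goal/1 and
% num_misplace/2 correspond to the declarations of Figure 5; the goal
% configuration below is an assumption.
goal([1,2,3,4,5,6,7,8,e]).

% num_misplace(Board, N): N labels of Board are not at their place in
% the goal configuration (the empty place is not counted).
num_misplace(Board, N) :-
    goal(Goal),
    misplaced(Board, Goal, 0, N).

misplaced([], [], N, N).
misplaced([T|Ts], [G|Gs], Acc, N) :-
    (  T \== e, T \== G          % a misplaced label
    -> Acc1 is Acc + 1
    ;  Acc1 = Acc
    ),
    misplaced(Ts, Gs, Acc1, N).

The corresponding header entry would be the line bg(num_misplace, 2, [+,-], [board,int], []) of Figure 5.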
4 Experiments with a puzzle
The method is tested with the implementation IS/ILP on a small puzzle called the eight puzzle. The eight puzzle consists of a board with 3 × 3 places, on which tiles labeled with the numbers 1, ..., 8 are placed, leaving one place empty (see Figure 7). A tile that is next to the empty place can slide into it. To solve an eight puzzle is to bring a configuration of tiles to the configuration shown in Figure 7 by sliding tiles. The puzzle is a search problem defined by

  S : all possible configurations of tiles on the board,
  O : left, right, up and down, which move a tile to the empty place in each direction,
  s_I ∈ S : a configuration, and
  F : the singleton set containing the configuration shown in Figure 7.

For an experiment, we generated a solution of an eight puzzle and used it to induce a heuristic function for the puzzle by IS/ILP.
%%
%% Results from eight2 by FOIL-I IsIlp version.
%%
better_choice(A,B,C) :-
    move(A,B,D), goal(D).
%% --- covers 3 pos. and 0 neg. examples.
better_choice(A,B,C) :-
    move(A,B,D), num_psble_dir(A,E),
    distance(A,D,E), opposite(B,C).
%% --- covers 2 pos. and 0 neg. examples.
better_choice(A,B,C) :-
    move(A,B,D), better_choice(D,B,C).
%% --- covers 6 pos. and 1 neg. examples.
better_choice(A,B,C) :-
    move(A,B,D), num_misplace(A,E),
    dstnc_from_goal(A,F), num_psble_dir(A,G),
    le(G,F), opposite(C,B),
    num_misplace(D,H), le(H,E).
%% --- covers 5 pos. and 0 neg. examples.
better_choice(A,B,C) :-
    move(A,B,D), num_misplace(A,E),
    num_psble_dir(A,F), le(F,E),
    num_misplace(D,G), le(G,E).
%% --- covers 15 pos. and 1 neg. examples.
%% 0 positive examples remain uncovered.

Figure 6: An example of knowledge induced from a solution
Figure 6 shows an example of the knowledge induced by the ILP part of the system. The predicates in the background knowledge given to the system and their meanings are shown in Table 1. The information shown in Figure 5 is also given to the system as a header of the background knowledge. In Table 1 and Figure 5, board is a type that represents a situation on the puzzle board, and direction is a type that represents one of the directions in which a tile can be moved. Thus board and direction correspond to a state and an operator in the general setting explained in Section 2.1.
  goal(board)                          : board is the goal state.
  move(board1, direction, board2)      : board2 is the result of moving the empty place of board1 in the direction direction.
  opposite(direction1, direction2)     : direction2 is the opposite direction to direction1.
  num_misplace(board, integer)         : integer is the number of labels that are not at their place in the goal.
  dstnc_from_goal(board, integer)      : integer is the sum of the distances of the labels in board from their places in the goal.
  num_diff(board1, board2, integer)    : integer is the number of labels of board1 that are not at the place of the same label in board2.
  distance(board1, board2, integer)    : integer is the sum of the distances of the labels in board1 from their places in board2.
  dstnc_empty(board1, board2, integer) : integer is the distance of the empty place in board1 from the empty place in board2.
  num_psble_dir(board, integer)        : integer is the number of directions in which a move is possible in board.
  can_move(board, direction)           : board can be moved in the direction direction.

Table 1: Background knowledge used with the puzzle.
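As an illustration of how the operators themselves could be written as background knowledge, the sketch below gives an assumed definition of move/3 and can_move/2 of Table 1, reusing the nine-element list representation of the sketch in Section 3. The paper does not show the authors' definitions, and the convention that the empty place moves in the given direction follows the wording of Table 1.

% delta(Dir, D): offset of the empty place for each direction on the
% 3 x 3 board (positions are numbered 0..8, row by row).
delta(left,  -1).
delta(right,  1).
delta(up,    -3).
delta(down,   3).

% move(Board, Dir, Board1): Board1 results from moving the empty place
% of Board one step in direction Dir (cf. move/3 in Table 1).
move(Board, Dir, Board1) :-
    delta(Dir, D),
    empty_pos(Board, 0, E),
    T is E + D,
    T >= 0, T =< 8,
    \+ (abs(D) =:= 1, E // 3 =\= T // 3),   % no wrap across a row edge
    tile_at(Board, T, Tile),
    put_at(Board, E, Tile, Tmp),            % the tile slides into the empty place
    put_at(Tmp, T, e, Board1).              % its old place becomes empty

% can_move(Board, Dir): the move is applicable (cf. Table 1).
can_move(Board, Dir) :- move(Board, Dir, _).

empty_pos([e|_], E, E) :- !.
empty_pos([_|Xs], I, E) :- I1 is I + 1, empty_pos(Xs, I1, E).

tile_at([X|_], 0, X) :- !.
tile_at([_|Xs], N, X) :- N > 0, N1 is N - 1, tile_at(Xs, N1, X).

put_at([_|Xs], 0, Y, [Y|Xs]) :- !.
put_at([X|Xs], N, Y, [X|Ys]) :- N > 0, N1 is N - 1, put_at(Xs, N1, Y, Ys).

With the assumed goal/1 of the previous sketch, ?- goal(G), move(G, up, B) gives B = [1,2,3,4,5,e,7,8,6], i.e. the empty place moves up and tile 6 slides down into its former position.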
Figure 7: Eight puzzle

Three kinds of puzzles, whose solutions are series of six, ten and fifteen operators, respectively, are used; they are called easy, medium and hard puzzles. To investigate the difference between heuristic functions induced from different solutions, we made the system induce heuristic functions from the solutions of an easy puzzle, a medium puzzle, two easy puzzles and two medium puzzles, respectively, and used them to solve puzzles. The results are shown in Figure 8, where the visit node rate of knowledge K is defined by

  visit node rate of K =
      (the number of nodes visited by the algorithm in Fig. 4 with K to find a goal)
      / (the number of nodes visited by breadth-first search to find a goal).                (10)
This value is a measure of how useful the knowledge is in problem solving. The visit node rates are widely scattered, but their average tends to decrease as the knowledge is induced by IS/ILP from more and harder puzzles.

Figure 8: The visit node rates of knowledge learned (knowledge learned from an easy puzzle, a medium puzzle, two easy puzzles or two medium puzzles; plotted in % for easy, medium and hard puzzles and their average)

The second experiment investigates the effects of confidence functions. Three different confidence functions are defined:

  Type 1: f(1) = 0.95, f(2) = 0.5, f(3) = 0.2,  f(4) = 0.05
  Type 2: f(1) = 0.95, f(2) = 0.5, f(3) = 0.3,  f(4) = 0.1
  Type 3: f(1) = 0.95, f(2) = 0.8, f(3) = 0.65, f(4) = 0.5
From the observation in Section 2.3, we can say that the Type 1 function confides in the knowledge more than Type 2, which in turn confides in it more than Type 3. Figure 9 compares the visit node rates of knowledge between the Type 1, 2 and 3 confidence functions. In the graph, a circle (○) (a cross (×)) is plotted at the visit node rate with the Type 2 confidence function on the X-axis and the rate with the Type 1 (Type 3, respectively) confidence function on the Y-axis. The results show a tendency that a puzzle with a low rate for Type 2 becomes lower for Type 3 and higher for Type 1, and a puzzle with a high rate for Type 2 becomes higher for Type 3 and lower for Type 1. This tendency is consistent with Type 1 confiding in the knowledge more than Type 2, and Type 2 more than Type 3.

Figure 9: The visit node rates with various confidence functions (X-axis: rate with the Type 2 confidence function; Y-axis: rate with the Type 1 or Type 3 confidence function; both in %)
5 Conclusions
The paper describes a general method that uses ILP techniques to acquire heuristic functions for problems expressed by states and operators. Features of the method include that (1) the induced knowledge (the heuristic functions) has high readability and can easily be modified by hand, (2) the method can use any Prolog program as background knowledge, provided it is prepared with information on modes, types, etc., (3) the search method is guaranteed to find a solution if a finite one exists, and (4) a confidence function can be selected as a parameter controlling how much the induced knowledge is confided in.

In the system, an ILP system induces a comparison relation among the local states to which a state is expanded. This lets the cost of an expansion be independent of the total number of states in open in the algorithm of Figure 4. Every expansion of a state needs n(n−1) comparisons of two states, where n is the number of states generated by the expansion. In the case of the eight puzzle, n is at most 4.
The system is an experimental implementation, and improvements to the system, especially to its ILP part, are necessary for practical applications.
References
[Dzeroski et al., 1996] S. Dzeroski, S. Schulze-Kremer, K. R. Heidtke, K. Siems and D. Wettschereck: "Applying ILP to diterpene structure elucidation from 13C NMR spectra", Proc. 6th Int'l Inductive Logic Programming Workshop, pp. 14-27 (1996).

[Estlin et al., 1996] T. A. Estlin and R. J. Mooney: "Multi-Strategy Learning of Search Control for Partial-Order Planning", Proc. 13th National Conf. on AI (AAAI-96), pp. 843-848 (1996).

[Inuzuka et al., 1996] N. Inuzuka, M. Kamo, N. Ishii, H. Seki and H. Itoh: "Top-down induction of logic programs from incomplete samples", Proc. 6th Int'l Inductive Logic Programming Workshop, pp. 119-136 (1996).

[Mitchell et al., 1983] T. M. Mitchell, P. E. Utgoff and R. Banerji: "Learning by Experimentation: Acquiring and refining problem solving heuristics", in R. S. Michalski (ed.), Machine Learning: An Artificial Intelligence Approach, Tioga Publishing, Palo Alto, CA (1983).

[Mitchell et al., 1986] T. M. Mitchell, R. M. Keller and S. T. Kedar-Cabelli: "Explanation-Based Generalization: A Unifying View", Machine Learning, 1, pp. 47-80 (1986).
[Mizoguchi et al., 1996] F. Mizoguchi, H. Ohwada, M. Daidoji and S. Shirato: "Learning rules that classify ocular fundus images for glaucoma diagnosis", Proc. 6th Int'l Inductive Logic Programming Workshop, pp. 191-204 (1996).

[Muggleton et al., 1996] S. Muggleton, C. D. Page and A. Srinivasan: "An initial experiment into stereochemistry-based drug design using ILP", Proc. 6th Int'l Inductive Logic Programming Workshop, pp. 245-261 (1996).

[Prieditis, 1993] A. E. Prieditis: "Machine discovery of effective admissible heuristics", Machine Learning, 12, pp. 117-141 (1993).

[Quinlan, 1990] J. R. Quinlan: "Learning logical definitions from relations", Machine Learning, 5, pp. 239-266 (1990).

[Quinlan et al., 1993] J. R. Quinlan and R. M. Cameron-Jones: "FOIL: A midterm report", in P. Brazdil (ed.), Proc. 6th European Conf. on Machine Learning, vol. 667 of LNAI, pp. 3-20, Springer-Verlag (1993).

[Zelle et al., 1993] J. M. Zelle and R. J. Mooney: "Combining FOIL and EBG to speed-up Logic Programs", Proc. 13th Int'l Joint Conf. on AI (IJCAI-93), pp. 1106-1111 (1993).