An efficient validation mechanism for Inductive Logic Programming using compositionality

Arnaud Lallouet and Lionel Martin
Université d'Orléans, LIFO, BP 6759, 45067 Orléans Cedex 2

November, 1995

Abstract

Inductive Logic Programming, which consists of learning clauses from examples, can be viewed as a conception/validation cycle leading to the acceptance of the induced program provided that it fulfills a certain criterion. We focus on the validation step in the context of empirical multi-predicate learning of normal clauses. Thanks to a compositional semantics, the classical validation step over the complete induced program can be replaced by the verification of local properties for a cut out into units, considerably limiting the usual combinatorial explosion. Moreover, we provide a semantics-preserving transformation which simplifies the program and provides a further refinement of the cut out.

Résumé

Inductive Logic Programming consists of learning clauses from examples and can be viewed as a conception/validation cycle leading to the acceptance of the induced program once it satisfies a certain criterion. We focus on the validation step in the context of empirical multi-predicate learning of normal clauses. Thanks to a compositional semantics, the classical validation of the complete program can be replaced by the verification of local properties for a certain decomposition into units, which considerably limits the combinatorial explosion. Moreover, we propose a semantics-preserving transformation which simplifies the program and yields a refinement of the decomposition.

Keywords: Machine learning | Validation | Inductive Logic Programming | Compositionality | Proof method of completeness | Semantics of normal logic programs.

1 Introduction

Inductive Logic Programming (or ILP) consists of building a logic program from an initial theory, which models basic and certain knowledge, and from a set of examples and counter-examples. The latter set is modeled by a set of literals we call the intended interpretation. Since we aim to model a learning process, we want the learned program to be complete with regards to its intended interpretation: we want it to deduce the examples and to refuse the counter-examples.

This approach bears resemblance to what we imagine of human learning: from a set of observations and its own knowledge, the mind builds itself a theory in order to minimize the contradictions with its knowledge. The theory built in this way is valid in the closed system corresponding to the initial set of observations. But, since the world is an open system, the addition of a new piece of information may induce a reconstruction of the theory. These two steps are related to two paradigms of ILP: empirical learning and the theory-revision approach. We are interested here in the first approach, in which the set of examples as well as the initial theory is supposed defined at the beginning.

We can model the learning process this way: suppose we have an induction function fi which, from a set of examples E, a knowledge base KB and an incomplete program P already supposed to model some examples, yields a new program P' better than P in a certain sense. Modifications of the program are done step by step by this induction function: addition or removal of clauses, addition or removal of literals in the body of clauses, folding, unfolding, θ-subsumption, etc. When a program P is induced, ie after an application of our induction function, it has to meet a validation criterion in order to be accepted. If not, the whole process has to be done another time until satisfaction of the validation requirements. This is the purpose of figure 1.
In the figure, we represent a complex induction function that may include internal validation steps modeled by an internal criterion.

[Figure 1 shows the cycle: from the examples E and the background knowledge BK, the induction function produces programs P0, P1, ..., Pi, Pi+1; each is checked against the validation criterion (possibly with an internal criterion) until an accepted program is obtained.]
Figure 1: revision/validation cycle

The criterion we choose here is the completeness of the induced program with regards to the intended interpretation formed by the set of examples and the knowledge base. We focus in this paper on a very general case, ie the learning of a normal program composed of multi-predicate definitions that are possibly mutually recursive. The validation method we present uses a compositional semantics for normal logic programs and allows us to cut out the learned program into small pieces. The main compositional validation result ensures that the knowledge of the completeness of every piece is sufficient to have the completeness of the whole program. The first part presents some methods used in ILP to build up and to validate a learned program. We expose in the second part the compositional semantics and we give the main validation results.

Then we show how these results can greatly improve current validation techniques by considerably reducing the size of the processed programs. This yields more tractable proofs, opening the door to the treatment of examples previously impossible to handle.
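The conception/validation cycle described above can be sketched as a simple loop. This is a minimal illustration only: `induce` and `complete_wrt` stand for an induction function and a completeness check, and all names here are illustrative, not from the paper.

```python
def learn(examples, knowledge_base, induce, complete_wrt, max_rounds=100):
    """Iterate the induction function until the validation criterion holds."""
    program = frozenset()                  # start from the empty program P0
    for _ in range(max_rounds):
        program = induce(program, examples, knowledge_base)
        if complete_wrt(program, examples, knowledge_base):
            return program                 # accepted program
    raise RuntimeError("validation criterion never satisfied")
```

The induction function and the criterion are deliberately left abstract: the paper's point is precisely that the criterion (completeness) can be checked more cheaply than by a global semantic computation.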

2 Inductive Logic Programming

When learning a logic program from a specification, two aspects have to be distinguished: methods for the proper construction of the program, and validation, which arises at different steps of the construction. We propose here a short survey of both aspects.

2.1 Construction of the program

Except for systems which use multiple representations, such as LINUS [18], and the system TRACY [2], which learns a set of clauses at the same time, most systems in ILP only make two kinds of changes to the program: either a new clause is added, or a clause is deleted. Differences between systems then come from the basic operations they use to produce a program. Usual operations on programs are the following:

- inverse resolution [29, 22]: from two clauses R and C1 of a theory, one finds a clause C2 such that R is the resolvent of C1 and C2 by SLD-resolution (absorption, identification); then by replacing R by C2, we obtain a new theory which allows us to produce more information. Inverse resolution has the original feature of being able to extend its description language by creating new predicates;

- the empirical system FOIL [26] learns a definition for the n-ary predicate p as follows: starting from the most general clause p(X1, ..., Xn) ←, it adds literals one by one, while some positive examples are covered and until no negative example is covered. The added literal is chosen by comparing the information gain of each possible literal, and literals with good gain are memorized for an eventual backtrack;

- the interactive system Clint [27] employs a series of concept-description languages L1, ..., Ln, ordered according to growing expressiveness. When learning a new definition for a predicate p in order to prove a new example e, Clint starts with clauses of L1 and tries to generalize these clauses while they are consistent with the intended interpretation and until the new example is proved. If e cannot be proved with a clause of L1, Clint shifts to L2, and so on;

- relative least general generalization (RLGG): GOLEM [23] builds clauses by generalizing the RLGG [25] of two ground atoms with respect to a theory expressed by a set of ground atoms.
Usually, the learning task is specified by a knowledge base KB defining a set of basic predicates, a set of positive examples E⁺ and a set of negative examples E⁻ specifying the expected truth value of ground atoms for the predicates to learn. Let M_KB = M⁺_KB ∪ ¬M⁻_KB be the semantics of the program KB; we write I = (E⁺ ∪ M⁺_KB) ∪ ¬(E⁻ ∪ M⁻_KB) for the intended interpretation. The goal of the learning task is to create a program P which is complete wrt the intended interpretation, ie each positive example is true for P and each negative one is false for P. Validation of the learned program consists in comparing the semantics of the program with the intended interpretation. In many cases, this semantical property is replaced by the syntactical "extensional covering":

Definition 1 (Covering, rejection)


- A ground atom e is extensionally covered by a clause C wrt the interpretation J iff there exists a ground instance e ← l1, ..., ln of C such that each li is true in J;

- A ground atom e is extensionally rejected by a clause C wrt the interpretation J iff for every ground instance e ← l1, ..., ln of C, there exists at least one literal li false in J;

- A ground atom is extensionally covered (resp. extensionally rejected) by a program if it is extensionally covered by a clause of the program (resp. if it is rejected by every clause of the program).

From a validation point of view, "P extensionally covers E" means the syntactical property described above, while "P intensionally covers E" means genuine completeness, ie E ⊆ SEM_P(KB). In the following, we will say that a program P satisfies the extensional constraints when P extensionally covers each example of E⁺ and extensionally rejects every negative example of E⁻ wrt the intended interpretation. However, a program which satisfies the extensional constraints is not necessarily interesting: the clause p(X) ← p(X) has this extensional property but does not give any information (this point has already been pointed out in [28] and [1]). For systems using such a method, the validation steps can prevent the creation of such uninteresting clauses.
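Definition 1 can be sketched directly on ground instances. This is an illustrative encoding, not the paper's: a ground clause is a pair (head, body), negative literals carry a "~" prefix, and the interpretation J is assumed total, so that a literal is false iff it is absent from J.

```python
def covered(e, ground_clauses, J):
    """e is extensionally covered: some ground instance with head e
    has every body literal true in (ie a member of) J."""
    return any(h == e and all(l in J for l in body)
               for h, body in ground_clauses)

def rejected(e, ground_clauses, J):
    """e is extensionally rejected: every ground instance with head e
    has at least one body literal false in (ie absent from) the total J."""
    return all(any(l not in J for l in body)
               for h, body in ground_clauses if h == e)
```

For instance, with J = {"r(a,b)", "~q(b)"}, the ground instance p(a) ← r(a,b) covers p(a), while p(b) ← q(b) rejects p(b).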

2.2 Validation of the learned program

Let us first notice that, for some systems, the sole validation step is an empirical one: the system learns the expected program for some classical problems. Systems which ensure that the semantics of the learned program contains the intended interpretation either use syntactical biases which reduce the search space, or compute (partially or totally) the semantics of the learned program. For this kind of validation, the computation of the semantics is either top-down or bottom-up.

Syntactic biases This technique is usually used in systems that try to learn a program which satisfies the extensional constraints [26], [1], [2]. The underlying idea is to reduce the search space to clauses which do not induce an infinite branch in the search tree of any ground atom. Then, if the extensional constraints are satisfied, the learned program is complete wrt the intended interpretation. This bias can be summarized as follows. The clause

p(X1, ..., Xn) ← l1, ..., lk, p(Y1, ..., Yn), ...

can be learned iff there exists li = s(Z1, ..., Zj), i ≤ k, such that

- s is a basic predicate;
- there exist 1 ≤ r ≤ n and 1 ≤ u, v ≤ j, u ≠ v, such that Zu = Xr and Zv = Yr;
- there exists a well-founded ordering < on the domain of Zu, Zv such that for all s(c1, ..., cj) ∈ M⁺_KB, cu < cv.

While this bias allows the production of complete programs, it induces strong restrictions. For example, the classical definition of the transitive closure r* of the relation r, { r*(X,Y) ← r(X,Y). , r*(X,Y) ← r(X,Z), r*(Z,Y). }, cannot be learned if the relation r contains cycles. The improvement of this bias proposed in [8] still remains a strong restriction.

Top-down evaluation of the semantics SLD-resolution is used by Clint for the validation of the learned program and for the validation of different steps of the construction. In order to ensure that a positive example is proved by the program, it tries to find an SLD-refutation for this example; to ensure that a negative example cannot be proved, it verifies that there exists a finitely failed SLD-tree for this example. The main problem here is due to the incompleteness of the operational SLD-refutation: when searching for an SLD-refutation for P ∪ {← G}, the procedure can explore an infinite branch of the search tree, thus giving no answer. In this case, the learning process falls into an infinite loop. To avoid this situation, Clint uses a depth-bounded search: it considers only the nodes of the search tree having at most depth h, for a given h. With this technique, the procedure always gives an answer in finite time, but it becomes incorrect, ie it may consider that P ∪ {← G} does not have any SLD-refutation even if G is a logical consequence of P.

Bottom-up evaluation of the semantics The system MULT-ICN [20], [21], [19] computes the semantics of the program, ie the least fixpoint of either the operator of the well-founded semantics or the operator of Fitting's semantics (they are defined in section 3.2). In order to reduce the complexity of such a computation, MULT-ICN computes the semantics of a simplified program Prec (the "program of the recursive calls") instead of the semantics of P ∪ KB. The semantics M_KB of KB is computed once and for all, and then Prec is defined from P with a set of transformations.

Definition 2 (Reduction of P modulo I [11, 10])

Let P be a normal logic program and I a set of ground literals. The reduction of P modulo I, denoted by P/I, is the program obtained from Inst(P) by deleting each clause containing in its body a literal false in I, and by deleting each literal true in I from the remaining clauses.
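Definition 2 can be sketched on a ground program, with the same illustrative encoding as before (clauses are (head, body) pairs, negative literals carry a "~" prefix):

```python
def opposite(lit):
    """Opposite of a literal: -a = ~a and -~a = a."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def reduce_modulo(ground_program, I):
    """P/I: delete each clause with a body literal false in I (ie whose
    opposite is in I), then erase the body literals true in I from the
    remaining clauses."""
    reduced = []
    for head, body in ground_program:
        if any(opposite(l) in I for l in body):
            continue                            # clause deleted
        reduced.append((head, [l for l in body if l not in I]))
    return reduced
```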

Prec is obtained as follows, where P is the learned program and E is the intended specification of the predicates to learn:

- a clause e ← l1, ..., ln of P/M_KB with e ∈ E⁺ belongs to Prec iff each li is true in E;
- for every clause e ← l1, ..., ln of P/M_KB with e ∈ E⁻, the clause e ← l_{i1}, ..., l_{ik} belongs to Prec, where l_{i1}, ..., l_{ik} are the literals among l1, ..., ln false in E.

The completeness of Prec wrt the intended interpretation gives a sufficient condition for the completeness of P:

Theorem 3

Let P be a logic program which satisfies the extensional constraints; if Prec is complete wrt E, then P ∪ KB is complete wrt E.

The proof of this theorem is given in [20]. This method is possible when the Herbrand universe is finite (which is often the case in ILP), but the computation of the semantics of Prec can still be complex. We propose in the following a compositional approach which allows us to compute the semantics of smaller parts of Prec, thus reducing the complexity of the validation.
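Under the same illustrative encoding, the two construction rules for Prec can be sketched as follows (`build_prec` and its argument names are assumptions, not the paper's):

```python
def opposite(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def build_prec(p_mod_mkb, e_pos, e_neg, E):
    """p_mod_mkb is the ground program P/M_KB; E is the intended
    specification of the predicates to learn, as a set of literals."""
    prec = []
    for head, body in p_mod_mkb:
        if head in e_pos and all(l in E for l in body):
            prec.append((head, list(body)))                 # first rule
        elif head in e_neg:
            false_lits = [l for l in body if opposite(l) in E]
            prec.append((head, false_lits))                 # second rule
    return prec
```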

3 Compositional validation

We present in this section compositional extensions of the well-founded semantics and of Fitting's semantics, as well as validation results for both of them [13], [15]. Since [17], many authors have been interested in giving a compositional semantics for logic programming, especially as a theoretical foundation for a module system. We can cite [24], [5], [7], [6], [4], but all these works are only concerned with definite programs. In the remainder, we use the following notations:

- The Herbrand base is denoted by HB;
- For an atom a ∈ HB, −a = ¬a, −¬a = a and |a| = |¬a| = a. −a is called the opposite of a. Let I ⊆ HB ∪ ¬HB be a set of literals and A a set of atoms;
- Ĩ is called the conjugate of I and is defined by Ĩ = {l ∈ HB ∪ ¬HB | −l ∉ I}. Equivalently, Ĩ = (HB − I⁻) ∪ ¬(HB − I⁺);
- I|A is the restriction of I to A, defined by I|A = {l ∈ I | |l| ∈ A};
- For every operator T mapping sets of literals to sets of atoms, we define the operator T_A, mapping sets of atoms to sets of atoms, by T_A(J) = T(J ∪ ¬A).
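These notations translate directly into small helpers (an illustrative encoding: atoms are strings, negative literals carry a "~" prefix):

```python
def opposite(lit):
    """-a = ~a, -~a = a."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def conjugate(I, HB):
    """The conjugate: { l over HB and ¬HB such that -l is not in I }."""
    all_lits = HB | {"~" + a for a in HB}
    return {l for l in all_lits if opposite(l) not in I}

def restrict(I, A):
    """I|A: the literals of I whose atom belongs to A."""
    return {l for l in I if l.lstrip("~") in A}
```

Note that for a total consistent I (every atom appears exactly once, positively or negatively), the conjugate coincides with I itself.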

3.1 Units

In order to prepare a program for future composition, we replace the usual notion of program by the more convenient notion of unit. A program is a closed and self-sufficient entity; negation is classically obtained by absence of positive information. Conversely, a unit is open to the outside world, because predicates used in the bodies of clauses may have a definition external to the unit.

Definition 4 (Unit)

A unit u is a pair (H, P) where H ⊆ HB is a set of ground atoms, and P is a logic program such that for each clause h ← B ∈ P, h ∈ H.

The underlying idea of this definition is to make a clear separation between what is produced by the unit, which is in H, and the remainder, which is in H̄ (the complement of H in HB). H can also be viewed as a Herbrand base local to the unit. In the context of a module system, one would say that H is associated with the export part of the unit and H̄ with the import part. A unit which imports nothing is identified with a program. Moreover, we denote by T_P^S the immediate consequence operator of P importing S; we define T_P^S(I) = T_P(S ∪ I).

Both the well-founded and Fitting's semantics are sets of literals. An abstract semantics SEM for a unit is a function 2^(H̄ ∪ ¬H̄) → 2^(H ∪ ¬H) defined by S ↦ SEM_u(S), in order to take into account an imported set of literals S. Units are made to combine with others; we define the notion of system of units:
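The operator T_P^S can be sketched on ground programs, with the illustrative encoding used earlier (a clause is (head, body), negative literals carry a "~" prefix):

```python
def tp(ground_program, lits):
    """One step of the immediate consequence operator T_P: the heads of
    the clauses whose body literals all belong to `lits`."""
    return {h for h, body in ground_program if all(l in lits for l in body)}

def tp_import(ground_program, S, I):
    """T_P^S(I) = T_P(S ∪ I): the same step under an imported set S."""
    return tp(ground_program, set(S) | set(I))
```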

Definition 5 (System of units) A system of units is a set U where every i ∈ U is a unit (Hi, Pi) and ∀i, j ∈ U, i ≠ j ⟹ Hi ∩ Hj = ∅.

The restriction we make on the sets Hi ensures that a literal is produced by at most one unit of the system. Partial correctness and completeness for a unit are defined with regards to a set of imported literals and a set of produced literals: the output of a unit is proved correct assuming the correctness of the input.

Definition 6 (Specification)

A specification for a unit u is a pair (S, S′), where S ⊆ H̄ ∪ ¬H̄ is the input part and S′ ⊆ H ∪ ¬H the output part.


Definition 7 (Partial correctness and completeness for a unit)

We say that a unit u is partially correct wrt (S, S′) if SEM_u(S) ⊆ S′. u is complete wrt (S, S′) if S′ ⊆ SEM_u(S).

The sum of a system is obtained as the union of the units of the system:

Definition 8 (Sum of a system)

We call the unit u = (H, P), with H = ∪_{i∈U} Hi and P = ∪_{i∈U} Pi, the sum unit of a system U.

Intuitively, a system of units is positively hierarchical if there exists no cyclic positive dependency (ie on atoms) between the units. It is hierarchical if there exists no cyclic dependency between units, neither positive nor negative.

Definition 9 ((Positively) hierarchical system of units) For i, j ∈ U, we define the dependency relation i ≺ j (i ≺⁺ j), read "i is (positively) used by j", iff there exists a clause h ← B ∈ Pj such that B|Hi ≠ ∅ (B⁺ ∩ Hi ≠ ∅). A system of units U is (positively) hierarchical if ≺ (≺⁺) is a well-founded relation.
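Since the sets Hi and the system U are finite here, ≺ is well-founded exactly when the dependency graph between units is acyclic. This can be checked as follows (a sketch under the illustrative encoding used above; `units` maps a unit name to a pair (Hi, Pi), and all names are assumptions):

```python
def dependency_edges(units):
    """Edges (i, j) meaning i ≺ j: a clause of P_j uses an atom of H_i."""
    edges = set()
    for j, (_, pj) in units.items():
        for _, body in pj:
            for lit in body:
                atom = lit.lstrip("~")
                for i, (hi, _) in units.items():
                    if i != j and atom in hi:
                        edges.add((i, j))
    return edges

def hierarchical(units):
    """True iff the dependency graph is acyclic (≺ is well-founded)."""
    edges = dependency_edges(units)
    remaining = set(units)
    while remaining:
        minimal = {j for j in remaining
                   if not any(i in remaining and jj == j for i, jj in edges)}
        if not minimal:
            return False    # every remaining unit depends on another: a cycle
        remaining -= minimal
    return True
```

Restricting the edge computation to positive literals gives the positively hierarchical case.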

3.2 Compositional semantics

We impose the four following requirements for an abstract semantics S ↦ SEM_u(S) to be compositional:

1. the semantics has to take its values in the local Herbrand base: SEM_u(S) ⊆ H ∪ ¬H;
2. the compositional semantics has to be a conservative extension of the semantics for programs on which it is based: if H = HB, then SEM_u(S) = SEM_P;
3. the semantics has to be monotone with regards to the set of imported literals: S ⊆ S′ ⟹ SEM_u(S) ⊆ SEM_u(S′);
4. the semantics of the sum of a system has to be a function of the semantics of the components of this system.

Let U be a system of units and u = (H, P) its sum. We want this system to import the set of literals S ⊆ H̄ ∪ ¬H̄. Each unit i ∈ U can individually import literals belonging to H̄i ∪ ¬H̄i. Since H̄ ⊆ H̄i, these literals either come from S, or are produced by another unit of the system. We consider the operator Ω : 2^(H ∪ ¬H) → 2^(H ∪ ¬H) which makes the union of the semantics of the units of the system importing S:

Ω : I ↦ ∪_{i∈U} SEM_i(S ∪ I|H̄i)

The operator Ω is monotone because the underlying semantics SEM_u is. We can now define a compositional semantics:

Definition 10 (Abstract compositional semantics) The semantics S ↦ SEM_u(S) is compositional if, in addition to the previous requirements, we have, for a system U of sum u: SEM_u(S) = lfp Ω.

We propose the following extensions of the classical operators of the well-founded and of Fitting's semantics.

Well-founded semantics:

The classical operator [3] is Φ_u(I) = lfp T_{u,I⁻} ∪ ¬(H − lfp T_{u,H−I⁺}). The well-founded semantics is obtained as the lfp of this operator. The following operator defines an extension which allows us to take into account a set of imported literals:

Φ_u^S(I) = lfp T_{u,I⁻}^S ∪ ¬(H − lfp T_{u,H−I⁺}^{S̃|H̄})

Theorem 11

The semantics SEM_u : S ↦ lfp Φ_u^S is compositional if the system of units is positively hierarchical.
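For a ground program without imports, the classical operator and its least fixpoint can be sketched as follows (an illustrative encoding where a clause is a triple of head, positive body atoms, and negated body atoms):

```python
def lfp_t(clauses, assumed_false):
    """lfp of T_{u,A}: bottom-up derivation, assuming the atoms of
    `assumed_false` to be false."""
    derived, changed = set(), True
    while changed:
        changed = False
        for h, pos, neg in clauses:
            if h not in derived and pos <= derived and neg <= assumed_false:
                derived.add(h)
                changed = True
    return derived

def well_founded(clauses, HB):
    """Iterate Phi(I) = lfp T_{I-} ∪ ¬(HB − lfp T_{HB−I+}) to its lfp,
    returning the sets of true and false atoms."""
    true, false = set(), set()
    while True:
        new_true = lfp_t(clauses, false)
        new_false = HB - lfp_t(clauses, HB - true)
        if (new_true, new_false) == (true, false):
            return true, false
        true, false = new_true, new_false
```

For example, for the program { p ← ¬q }, p is true and q is false; for { p ← ¬p }, p is undefined (neither true nor false).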

Fitting's semantics:

Fitting's semantics can be defined as the lfp of the following operator [12]: Ψ_u(I) = lfp T_{u,I⁻} ∪ ¬(H − gfp T_{u,H−I⁺}). It is extended by the following operator:

Ψ_u^S(I) = lfp T_{u,I⁻}^S ∪ ¬(H − gfp T_{u,H−I⁺}^{S̃|H̄})

Theorem 12

The semantics SEM_u : S ↦ lfp Ψ_u^S is compositional.

3.3 Validation results

If the semantics SEM is compositional, we have the following proof methods of partial correctness and of completeness, called proof methods by decomposition. Let U be a system of units and u its sum.

Theorem 13 (Partial correctness by decomposition)

u is partially correct with regards to (S, S′) if each unit i ∈ U is partially correct with regards to (S ∪ S′|H̄i, S′|Hi).

Theorem 14 (Completeness by decomposition) If the system U is hierarchical, the sum unit u is complete with regards to (S, S′) if each unit i ∈ U is complete with regards to (S ∪ S′|H̄i, S′|Hi).

Note that partial correctness only requires a compositional semantics, while completeness moreover needs a hierarchical system of units. Proofs of these theorems can be found in [14] and [15].

4 Compositional Validation in Inductive Logic Programming

Let us first define our goal intuitively. We want to induce definitions for a set of predicates {pi}_{i∈Learn} from the family of examples {Ei}. The acceptability criterion we choose is the intensional coverage of the set of examples Ei by the induced program, that is, the completeness of the program with regards to the specification defined by the set of examples and the initial knowledge base. We are not concerned here with the details of the induction function, although we assume there is one which allows the program to be revised during the learning process. Moreover, we assume that the induced program extensionally covers the set of examples and extensionally rejects the counter-examples. We propose the following mechanism: when a program is produced by the induction function, instead of computing the semantics of the whole program and proving global completeness, we cut out the program into carefully chosen units and, thanks to theorem 14, we prove local completeness for the units with import.

For the sake of simplicity, we explain our framework in three steps. In the first one, the decomposition is roughly made according to the predicate dependency graph, without particular hypotheses on the properties of the learned program. A first refinement is to consider the literal dependency graph, under the assumption that the learned program fulfills the extensional constraints. Then we present a further refinement which involves semantics-preserving transformations of the program.

4.1 Main validation scheme

Assume we have a knowledge base KB, viewed as a set of literals, and a set of examples and counter-examples E covering a family of predicates to be learned (pi)_{i∈Learn}. We propose the following process.

- First, creation of the units from the program. One main difficulty with multiple predicate learning comes from the possibility of inducing mutually recursive definitions. As we saw in the previous section, our proof method of completeness requires a hierarchy in the unit system. We cannot, as would seem natural, break up the program following each predicate symbol. Actually, the relevant system of units U is obtained from a decomposition of the induced program into the strongly connected components of the predicate dependency graph: mutually recursive predicates are grouped together in the same connected component. A stratification of this graph is then induced by the dependencies between the different units, giving a well-founded ordering which can be used to optimize the evaluation of the partial semantics. This optimization allows us to treat first the modules which import the least number of literals. Another optimization would be to treat first the modules which have the least number of clauses. This is done by a suitable topological sorting of the ordering.

1. We break up P, given by the induction function, into P1, ..., Pn following the predicate dependency graph and grouping together its strongly connected components. Let ≺ be the well-founded ordering obtained from the dependency relations between components;
2. We define the unit i = (Hi, Pi) corresponding to the program Pi, where Hi is the set of atoms whose predicate symbol is defined in Pi. We define the unit u as the sum of the system;
3. Let Ei = E|Hi;
4. We compute the partial semantics: SEM_i(KB ∪ ∪_{j≺i} Ej). The ordering of the computations follows a topological sorting of the ordering ≺;
5. If ∀i ∈ U, Ei ⊆ SEM_i(KB ∪ ∪_{j≺i} Ej), then, according to theorem 14, the program P is accepted. Otherwise, the induction function is called another time and the whole process is iterated.

Figure 2: Compositional validation scheme
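Step 1, grouping the strongly connected components of the dependency graph, can be done in linear time; a sketch using Kosaraju's algorithm (illustrative names; the nodes would be the predicate symbols and the edges the head-to-body dependencies):

```python
def sccs(nodes, edges):
    """Strongly connected components of a directed graph (Kosaraju)."""
    adj = {n: [] for n in nodes}
    radj = {n: [] for n in nodes}
    for a, b in edges:
        adj[a].append(b)
        radj[b].append(a)

    def dfs(start, graph, seen, out):
        # iterative depth-first search, appending nodes in finish order
        stack = [(start, iter(graph[start]))]
        seen.add(start)
        while stack:
            v, it = stack[-1]
            for w in it:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(graph[w])))
                    break
            else:
                stack.pop()
                out.append(v)

    order, seen = [], set()
    for n in nodes:
        if n not in seen:
            dfs(n, adj, seen, order)
    comps, seen = [], set()
    for n in reversed(order):          # second pass, on the reversed graph
        if n not in seen:
            comp = []
            dfs(n, radj, seen, comp)
            comps.append(frozenset(comp))
    return comps
```

On the example of section 4.2, male-ancestor and female-ancestor would fall into the same component, the other learned predicates each forming their own.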

- Then, the computation of the semantics of a unit i is done with regards to what is imported by the unit. It includes the knowledge base, of course, but also the examples of the intended interpretation which are produced by other units (E|H̄i). [15] shows that when the system is hierarchical, the only units needed are those which are smaller than i according to ≺.

- Finally, theorem 14 ensures that the whole program is complete provided that all the units are, ie we get E ⊆ SEM_P(KB) from the knowledge of ∀i ∈ U, E|Hi ⊆ SEM_i(KB ∪ ∪_{j≺i} Ej).

The size of the units being smaller than that of the initial program, these proofs are more likely to be done without exceeding machine capabilities. Evaluation of the semantics of the different units can be done according to the topological sorting of the ordering, and the validation process can be interrupted as soon as a unit does not fulfill its validation requirement. The algorithm given in figure 2 summarizes this process. When several predicates are concerned, such as when learning from a database, the reduction in complexity is important.

4.2 Example

This example of multiple predicate learning is borrowed from [28]. The knowledge base KB given in figure 3 contains information about the sex of a group of people and their direct family relationships:

male(prudent) male(willem) male(etienne) male(leon) male(rene) male(bart) male(luc) male(pieter) male(stijn)

female(laura) female(esther) female(rose) female(alice) female(yvonne) female(katleen) female(lieve) female(soetkin) female(an) female(lucy)

father(bart,pieter) father(bart,stijn) father(luc,soetkin) father(willem,lieve) father(willem,katleen) father(rene,willem) father(rene,lucy) father(leon,rose) father(etienne,luc) father(etienne,an) father(prudent,esther)

mother(katleen,stijn) mother(katleen,pieter) mother(lieve,soetkin) mother(esther,lieve) mother(esther,katleen) mother(yvonne,willem) mother(yvonne,lucy) mother(alice,rose) mother(rose,luc) mother(rose,an) mother(laura,esther)

Figure 3: Example of knowledge base

Let Learn be the set of predicate symbols to be learned: Learn = {ancestor, male-ancestor, female-ancestor, parent, grand-father}. We moreover assume that the set of examples given for the predicates to be learned is complete; that is, for every ground atom e built from a predicate of Learn and constants appearing in KB, either e ∈ E⁺ or e ∈ E⁻. We want to validate the program given in figure 4.

(1) parent(X,Y) ← father(X,Y).
(2) parent(X,Y) ← mother(X,Y).
(3) grand-father(X,Y) ← father(X,Z), parent(Z,Y).
(4) female-ancestor(X,Y) ← parent(X,Y), female(X).
(5) male-ancestor(X,Y) ← father(X,Y).
(6) female-ancestor(X,Y) ← mother(X,Z), female-ancestor(Z,Y).
(7) male-ancestor(X,Y) ← father(X,Z), female-ancestor(Z,Y).
(8) male-ancestor(X,Y) ← father(X,Z), male-ancestor(Z,Y).
(9) female-ancestor(X,Y) ← female-ancestor(X,Z), male-ancestor(Z,Y).
(10) ancestor(X,Y) ← female-ancestor(X,Y).
(11) ancestor(X,Y) ← male-ancestor(X,Y).

Figure 4: Learned program

The dependency graph between predicates is given in figure 5. Definitions of male-ancestor and female-ancestor are grouped in the same unit because they are mutually recursive. One can immediately see that the size of the units is much smaller than the size of the whole program, and so is the difficulty of their validation. Our optimization suggests beginning the checking with the lowest-level unit, ie parent, then going on with the other ones.

[Figure 5 shows the dependency graph: the units ancestor, grand-father, the pair male-ancestor/female-ancestor, and parent, above the basic predicates father, mother, male and female of BK.]

Figure 5: Graph of dependencies between predicates

The Herbrand universe of the program consists of 19 constants. Computation of the whole semantics of the program leads to handling 19² = 361 atoms per predicate of arity 2, ie 361 × 5 = 1805 ground atoms. With our first decomposition, the largest unit (male-ancestor and female-ancestor) includes only 2 × 19² = 722 atoms.

4.3 A first refinement: a cut out following literals

The first refinement we present is very natural. It consists of building the dependency graph of the program's ground instances instead of the clauses themselves; for instance, we can use Inst(P), P/M_KB or Prec. A cut out can then be made on literals, yielding smaller units. This refinement fits well with our theoretical framework, since the export parts of the units (Hi) are sets of ground atoms; it is therefore natural to build several units across the same predicate symbol. Note that this decomposition is a sub-decomposition of the former one, which follows predicate names.

Experimentation has been done on the Prec of the previous example. Decomposition of the largest unit, male-ancestor and female-ancestor, yields 9 units which include respectively 19, 19, 19, 19, 19, 12, 13, 15 and 303 atoms. 284 atoms are not part of this cut out (they do not belong to any strongly connected component).

We now take into account the hypothesis we make on the nature of the learned program, ie that the program fulfills the extensional constraints. Units that do not belong to any strongly connected component are of two kinds: either they do not depend on any strongly connected component, and therefore they are automatically validated by the extensional constraints of the program, or they depend on a strongly connected component and their validation is implied by the validation of this component. Hence we only check the strongly connected components. On our example, if completeness is proven for the 9 former units, then the intended interpretation for these 284 atoms is included in the whole program's semantics.

4.4 A second refinement: a semantics-preserving transformation

In order to get even smaller units, we want to iterate the decomposition process on each unit. But, since the program is the same, this would lead to the same result. The idea is then to apply program simplifications in order to allow smaller strongly connected components to emerge. These simplifications are obtained by the two following transformations, for each unit (Hi, Pi):

1. the first transformation is the reduction of Pi modulo its import (E ∪ M_KB)|H̄i, as in definition 2;
2. the second transformation consists of, whenever the fact "h ←" belongs to P, deleting all clauses whose head is h except "h ←" itself.

These transformations are semantics-preserving. For the first one, this is stated by the following theorem; for the second, it is obvious.
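The second transformation, for instance, can be sketched as follows (illustrative encoding as before, with a fact represented as a clause with an empty body):

```python
def drop_clauses_subsumed_by_facts(ground_program):
    """Whenever a fact 'h ←' is present, delete every other clause
    whose head is h; the fact itself is kept."""
    facts = {h for h, body in ground_program if not body}
    return [(h, body) for h, body in ground_program
            if not body or h not in facts]
```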

Theorem 15 (Semantics preservation of P/I) For a unit u = (H, P) and I ⊆ H̄ ∪ ¬H̄, SEM_u(I) = SEM_{P/I}(∅), where SEM is the compositional extension of the well-founded or of Fitting's semantics.

Proof of this theorem can be found in [16]. Given these transformations, we can obtain a further decomposition of the existing units. On our running example, all the little units of fewer than 20 atoms simply disappear (they do not induce any strongly connected component), and the large one is broken up into 13 smaller ones containing respectively 4, 13, 18, 16, 15, 10, 18, 10, 16, 18, 7, 7 and 122 atoms. It appears that all these atoms are counter-examples in the intended interpretation for both male-ancestor and female-ancestor.

[Figure 6 diagram: the whole program (1805 atoms) is decomposed following predicate names into units for parent, grand-father, a disappearing unit ancestor, and male-ancestor/female-ancestor (722 atoms); a decomposition following literals (1st iteration) leaves a largest unit of 303 atoms; the 2nd iteration yields a unit of 122 atoms, unchanged by the 3rd iteration (final state).]
Figure 6: Summary of decompositions

Another iteration yields the same units and does not generate any further cut out. Since the program is definite and all these atoms are counter-examples, each export set H_i of a unit is an unfounded set for P_i. Thus, all these counter-examples are false under the well-founded semantics, and undefined under Fitting's semantics. Hence we deduce that the program is complete wrt the well-founded semantics, but not wrt Fitting's semantics. For a normal program, since some dependencies may be negative, it would remain to prove that the last units are (or are not) unfounded sets and to compute the actual semantics.

More generally, in the special case of definite programs, if we obtain at the end of the process a set of units {(H_i, P_i)} that cannot be broken up any further, then the whole program is complete if, for each i, H_i ⊆ E^-. Figure 6 summarizes our framework on our running example. The gain of using a compositional decomposition to validate the program is clear and significant.
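This final check for the definite case can be mechanized; the sketch below reads "H is an unfounded set for P" as: every clause whose head is in H keeps at least one body atom in H, so no atom of H is derivable from outside. This encoding, the atom names and the program are our own illustration, not the paper's formal definitions.

```python
def is_unfounded(H, clauses):
    """True iff every clause with head in H keeps a body atom in H.

    For a definite ground program, all atoms of such an H are false
    under the well-founded semantics (and undefined under Fitting's).
    """
    return all(any(b in H for b in body)
               for head, body in clauses if head in H)

def whole_program_complete(units, counter_examples):
    """Definite case: the program is complete if every irreducible
    unit exports only counter-examples (H_i a subset of E^-)."""
    return all(H <= counter_examples for H, _ in units)

# Hypothetical final unit: two mutually recursive counter-examples.
H = {"ma(a,a)", "fa(a,a)"}
P = [("ma(a,a)", ["fa(a,a)"]),
     ("fa(a,a)", ["ma(a,a)"])]
ok = is_unfounded(H, P) and whole_program_complete([(H, P)], H)
```

Here the two atoms only support each other, so `H` is unfounded, and since both are counter-examples the completeness condition holds for this unit.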

5 Conclusion

We propose a novel method to validate a multi-predicate normal logic program learned from examples in Inductive Logic Programming. Instead of computing the semantics of the whole program to check its completeness with respect to the set of examples, we propose to cut the program out into small units and to check locally the completeness of each unit by using a compositional semantics. A result of compositional completeness allows us to deduce the completeness of the whole program from the completeness of each of its parts, each unit importing what is produced by the others.

In most cases, the Herbrand universe is large enough to prevent a complete computation of the semantics of the program without exceeding machine capabilities. Most usual validation techniques are either empirical, incomplete, or limited with respect to the size of the program. Conversely, with our strong theoretical framework, the door is open to handling programs of virtually any size, since the chance of obtaining a large strongly connected component is low. Moreover, since the decomposition always lowers the size of the program, our validation method is a genuine optimization of existing methods. The increase in complexity consists only in finding the strongly connected components of a graph, which takes linear time (see [9], pages 488-493).

An implementation is partially done as an extension of the system MULT-ICN [20]. This system moreover uses a complex induction function which includes partial validation steps. These internal computations are likely to be optimized in the same way.

References

[1] F. Bergadano and D. Gunetti. An interactive system to learn functional logic programs. In 13th International Joint Conference on Artificial Intelligence, pages 1044-1049, Chambéry, France, 1993. Morgan-Kaufman.

[2] F. Bergadano and D. Gunetti. Learning clauses by tracing derivations. In 4th International Workshop on Inductive Logic Programming, pages 11-29, Bonn, Germany, 1994.

[3] S. Bonnier, U. Nilsson, and T. Näslund. A simple fixed point characterization of three-valued stable model semantics. Information Processing Letters, 40:73-78, 1990.

[4] A. Brogi, E. Lamma, P. Mancarella, and P. Mello. Normal logic programs as open positive programs. In Joint International Conference and Symposium on Logic Programming, 1992.

[5] A. Brogi, E. Lamma, and P. Mello. Composing open logic programs. Journal of Logic and Computation, 3(4):417-439, August 1993.

[6] A. Brogi, P. Mancarella, D. Pedreschi, and F. Turini. Modular logic programming. ACM Transactions on Programming Languages, 16(4):1361-1398, 1994.

[7] M. Bugliesi, E. Lamma, and P. Mello. Modularity in logic programming. Journal of Logic Programming, Special Issue: Ten Years of Logic Programming:1-64, 1994.

[8] R. M. Cameron-Jones and J. R. Quinlan. Avoiding pitfalls when learning recursive theories. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, pages 1050-1055. Morgan Kaufmann, 1993.

[9] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. MIT Press, 1990.

[10] Agostino Cortesi and Gilberto Filé. Graph properties for normal logic programs. Theoretical Computer Science, 107:277-303, 1993.

[11] M. Davis and H. Putnam. A computing procedure for quantification theory. Journal of the ACM, 7:201-215, 1960.

[12] Gérard Ferrand. The notion of symptom and error in declarative diagnosis of logic programs. In Automated and Algorithmic Debugging, Linköping, 1993. Springer LNCS.

[13] Gérard Ferrand and Arnaud Lallouet. A compositional proof method of partial correctness for normal logic programs. In J. W. Lloyd, editor, International Logic Programming Symposium, Portland, Oregon, December 1995. MIT Press.

[14] Gérard Ferrand and Arnaud Lallouet. A compositional proof method of partial correctness for normal logic programs with an application to Gödel. Technical Report 95-14, Laboratoire d'Informatique Fondamentale d'Orléans, Université d'Orléans, 1995.

[15] Arnaud Lallouet. Modularité, validation et parallélisme de données en programmation logique. PhD thesis, Université d'Orléans, LIFO, BP 6759, F-45067 Orléans Cedex 2, France, December 1995. To appear.

[16] Arnaud Lallouet and Lionel Martin. An efficient validation mechanism for inductive logic programming using compositionality. Technical report, to appear, Laboratoire d'Informatique Fondamentale d'Orléans, Université d'Orléans, 1995.

[17] J.-L. Lassez and M. J. Maher. Closures and fairness in the semantics of programming logic. Theoretical Computer Science, 29:167-184, 1984.

[18] N. Lavrac, S. Dzeroski, and M. Grobelnik. Learning non-recursive definitions of relations with LINUS. In Yves Kodratoff, editor, Proceedings of the 5th European Working Session on Learning, volume 482 of Lecture Notes in Artificial Intelligence. Springer-Verlag, 1991.

[19] Lionel Martin. Apprentissage de programmes normaux en Programmation Logique Inductive. PhD thesis, Université d'Orléans - LIFO, BP 6759, F-45067 Orléans Cedex 2, France, December 1995. To appear.

[20] Lionel Martin and Christel Vrain. MULT-ICN: an empirical multiple predicate learner. In Luc De Raedt, editor, International Workshop on Inductive Logic Programming, pages 129-144. Katholieke Universiteit Leuven, Belgium, September 1995.

[21] Lionel Martin and Christel Vrain. A three-valued framework for the induction of general logic programs. In Luc De Raedt, editor, International Workshop on Inductive Logic Programming, pages 109-127. Katholieke Universiteit Leuven, Belgium, September 1995.

[22] S. Muggleton and W. Buntine. Machine invention of first-order predicates by inverting resolution. In Proceedings of the Fifth International Conference on Machine Learning, pages 339-352. Kaufmann, 1988.

[23] S. Muggleton and C. Feng. Efficient induction of logic programs. In Proceedings of the First Conference on Algorithmic Learning Theory, Tokyo, 1990. Ohmsha.

[24] R. O'Keefe. Toward an algebra for constructing logic programs. In Symposium on Logic Programming, pages 152-160, Boston, 1985.

[25] G. Plotkin. A further note on inductive generalization. In Machine Intelligence, volume 6. Edinburgh University Press, 1971.

[26] J. R. Quinlan. Determinate literals in inductive logic programming. In IJCAI-91: Proceedings of the Twelfth International Joint Conference on Artificial Intelligence, pages 746-750, San Mateo, CA, 1991. Morgan-Kaufmann.

[27] L. De Raedt. Interactive Theory Revision: an Inductive Logic Programming Approach. Academic Press, 1992.

[28] L. De Raedt, N. Lavrac, and S. Dzeroski. Multiple predicate learning. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, pages 1037-1043. Morgan Kaufmann, 1993.

[29] C. Sammut and R. B. Banerji. Learning concepts by asking questions. In R. Michalski, J. Carbonell, and T. Mitchell, editors, Machine Learning: An Artificial Intelligence Approach, Vol. 2, pages 167-192. Kaufmann, Los Altos, CA, 1986.
