Top-down Synthesis of Recursive Logic Procedures from First-order Logic Specifications

K.K. Lau
Department of Computer Science, University of Manchester Oxford Road, Manchester M13 9PL, England
[email protected]
S.D. Prestwich
European Computer-Industry Research Center, Arabellastrasse 17, D-8000 München 81, West Germany
Abstract

Derivation of logic programs from first-order logic specifications is nontrivial and tends to be done by hand. We describe a method for synthesising recursive logic procedures from their first-order logic specifications that is amenable to mechanisation. The method is strictly top-down and has been implemented as part of a user-guided synthesis system in Prolog.
1 Introduction

Recursion is fundamental to logic programming, as is evident in standard works such as [9] and [16]. The standard fold-unfold technique of Burstall & Darlington [1] and Manna & Waldinger [15] for recursion introduction has been applied to logic program derivation (in one form or another) by Clark & Darlington [2], Clark & Sickel [3], Hogger [8], Kanamori & Horiuchi [10], and Tamaki & Sato [17], among others. In general, where logic programs are derived from first-order logic specifications, the synthesis tends to be done by hand, whilst transformation of one program into another is more amenable to mechanisation.

In this paper, we present a method for synthesising recursive logic procedures from first-order logic specifications which can be mechanised. The method employs fold-unfold in a strictly top-down manner, and has been implemented as part of a user-guided synthesis system in Prolog.
2 First-order Logic Specification of Logic Procedures

The most important feature of a specification language is expressiveness. This makes first-order logic a natural choice. However, for mechanised derivation, it may be advantageous to have a specification language closer to the (Horn) clausal form. For example, Kowalski [12] proposes the extended Horn clause subset of logic as a suitable general-purpose candidate, and Dayantis [5] makes use of recursively-defined relations which are also closely related to Horn clause logic.

In our work, we are concerned with both expressiveness of the specification language and ease of mechanisation of derivation. Not surprisingly, our choice is a compromise. It consists of two kinds of first-order logic sentences, implications and definitions. An implication has a head and a body, and is written as
$$p \leftarrow q$$

where the head $p$ is a literal, and the body $q$ is a formula constructed from literals, the connectives $\wedge$, $\vee$ and variables. Its logical meaning is its universal closure $\forall (p \leftarrow q)$.$^1$ It is obvious that (the disjunctive normal form of) an implication can be immediately transformed into an equivalent set of (definite) clauses, possibly a procedure,$^2$ and vice versa.$^3$ Our motivation here is to allow procedures to be used in specifications; in particular, procedures for elementary predicates such as membership test. An implication allows a concise form for procedures with several clauses (and it is worth pointing out that the immediate result of our synthesis is a recursive implication).

A definition has the form

$$p \leftrightarrow q$$

where the head $p$ is a literal, and the body $q$ is any first-order logic formula. The logical meaning of the definition is its universal closure $\forall (p \leftrightarrow q)$. One motive for definitions is expressiveness, by allowing any first-order formulae in their bodies. We also make use of the if and the only-if parts of definitions separately for the purpose of fold-unfold. It is worth noting that definitions may themselves be recursive, and may therefore readily provide recursive implications/procedures.

For the purpose of synthesis, definitions are reduced to the following normal forms:

Conjunctive form: $p \leftrightarrow q_1 \wedge \dots \wedge q_n$
Disjunctive form: $p \leftrightarrow q_1 \vee \dots \vee q_n$
Existential form: $p \leftrightarrow \exists x_1 \dots \exists x_n .\, q$
Universal form: $p \leftrightarrow \forall x_1 \dots \forall x_n .\, q$

where $q_1, \dots, q_n$ ($n \geq 1$), and $q$ are literals. The required transformation is straightforward and can be done automatically. It may involve replacing subformulae with new predicates. Whenever this happens, a new definition is added to the specification, whose head is the new predicate and whose body is the replaced formula. The arguments of the predicate are the free variables of the formula.

1 Variables occurring in $q$ but not in $p$ are called the internal variables of the implication.
2 As in [16], a procedure is a set of clauses with the same positive predicate in the head.
3 By definition, we allow negated predicates as heads of implications (and clauses); in the bodies, we can either allow negated predicates, or remove them by making new definitions.

As an example of normal forms, consider the definition of subset over lists:

$$\mathit{subset}(a, b) \leftrightarrow \forall x .\, (\mathit{mem}(x, b) \leftarrow \mathit{mem}(x, a))$$

This is transformed to:

$$\mathit{subset}(a, b) \leftrightarrow \forall x .\, \mathit{aux}(x, a, b)$$
$$\mathit{aux}(x, a, b) \leftrightarrow \neg\mathit{mem}(x, a) \vee \mathit{mem}(x, b)$$

where $\mathit{aux}$ is a new predicate symbol.

Finally, a specification for a procedure with head $p$ will typically contain a definition with head $p$, together with definitions/implications for all other literals that appear in the body of this definition.
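The normal-form transformation of the subset definition can be checked on a small finite universe. The following Python sketch is only an illustration and not part of the synthesis system: the names `UNIVERSE`, `lists_up_to` and `agree` are assumptions introduced here, and lists are modelled as Python tuples.

```python
from itertools import product

UNIVERSE = (0, 1, 2)  # assumed finite universe of list elements


def mem(x, xs):
    """Membership test, standing in for the mem predicate."""
    return x in xs


def subset_spec(a, b):
    """subset(a, b) <-> forall x. (mem(x, b) <- mem(x, a))."""
    return all(mem(x, b) for x in UNIVERSE if mem(x, a))


def aux(x, a, b):
    """aux(x, a, b) <-> not mem(x, a) or mem(x, b)."""
    return (not mem(x, a)) or mem(x, b)


def subset_normal(a, b):
    """Universal normal form: subset(a, b) <-> forall x. aux(x, a, b)."""
    return all(aux(x, a, b) for x in UNIVERSE)


def lists_up_to(n):
    """All tuples over UNIVERSE of length 0..n."""
    for k in range(n + 1):
        yield from product(UNIVERSE, repeat=k)


def agree(max_len=3):
    """True if both formulations coincide on every pair of small lists."""
    lists = list(lists_up_to(max_len))
    return all(subset_spec(a, b) == subset_normal(a, b)
               for a in lists for b in lists)
```

Calling `agree()` compares the original definition with its normal form on every pair of lists of length at most three; this is an exhaustive check on a small universe, not a proof.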
3 Synthesis of Recursive Logic Procedures

Given a first-order logic specification (with definitions in normal form) of a procedure with head $p$, our method of synthesis will attempt to find a recursive implication from which this procedure can be derived, but it also requires the user to specify the following beforehand: the head of the required implication (which will be an instance of $p$); and the form of the required set of recursive calls (also instances of $p$). These constitute what we shall call a folding problem. They are necessary because in general there may be many implications which can be derived from the same specification, with different recursive calls. By specifying the head and recursive calls we restrict the search space for possible implications.$^4$

For example, the subset specification can lead to (among others) the following recursive procedures (excluding base cases):

(i) $\mathit{subset}(h.t, \ell) \leftarrow \mathit{mem}(h, \ell) \wedge \mathit{subset}(t, \ell)$
(ii) $\mathit{subset}(a \mathbin{\hat{}} b, c \mathbin{\hat{}} d) \leftarrow \mathit{subset}(a, c) \wedge \mathit{subset}(b, d)$

where $\hat{}$ represents list concatenation.$^5$ The recursive calls in these procedures can be specified by the following folding problems:

(i) fold $\mathit{subset}(h.t, \ell)$ to $\{\mathit{subset}(t, s)\}$;
(ii) fold $\mathit{subset}(a \mathbin{\hat{}} b, e)$ to $\{\mathit{subset}(a, s_1), \mathit{subset}(b, s_2)\}$.

4 Our approach here is similar to that taken by Feather [7] for pattern-directed fold-unfold transformations used for developing programs in recursion equations.
5 Note the use of $\hat{}$, and that in standard list notation, $h.t = [h] \mathbin{\hat{}} t$, where $h$ is a single element and $t$ is a list.

(Note that the "output variables" $s$, $s_1$, $s_2$ are just arbitrary names.) The actual recursive calls (as well as the rest of the bodies of the recursive procedures) are obtained by our folding strategies which solve these problems, and in the process bind $s$ to $\ell$ in (i), and $e$ to $c \mathbin{\hat{}} d$, $s_1$ to $c$, $s_2$ to $d$ in (ii). These strategies will be described in Section 3.2.
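As a sanity check on the two example procedures, the sketch below tests on small lists that whenever the body of clause (i) or (ii) holds, the head holds as well (partial correctness only). It is an illustrative assumption-laden check in Python: lists are modelled as tuples, and `UNIVERSE`, `lists_up_to`, `clause_i_sound` and `clause_ii_sound` are names introduced here, not taken from the paper.

```python
from itertools import product

UNIVERSE = (0, 1)  # assumed finite universe of list elements


def subset(a, b):
    """Specification: every element of a is a member of b."""
    return all(x in b for x in a)


def lists_up_to(n):
    """All tuples over UNIVERSE of length 0..n."""
    for k in range(n + 1):
        yield from product(UNIVERSE, repeat=k)


def clause_i_sound(max_len=3):
    """subset(h.t, l) <- mem(h, l) and subset(t, l)."""
    lists = list(lists_up_to(max_len))
    return all(subset((h,) + t, l)
               for h in UNIVERSE for t in lists for l in lists
               if h in l and subset(t, l))


def clause_ii_sound(max_len=2):
    """subset(a^b, c^d) <- subset(a, c) and subset(b, d)."""
    lists = list(lists_up_to(max_len))
    return all(subset(a + b, c + d)
               for a, b, c, d in product(lists, repeat=4)
               if subset(a, c) and subset(b, d))
```

Both checks only confirm that the clauses follow from the specification on the enumerated lists; they say nothing about completeness or base cases.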
3.1 The Folding Problem
A formal definition of a folding problem is as follows.$^6$ Suppose $p$ is the head of the specified procedure. Then given an instance $p\theta_0$ of $p$ and a set $\{p\theta_1, \dots, p\theta_m\}$ of desired recursive calls we want an implication

$$p\theta_0\sigma \leftarrow q$$

where $\sigma$ denotes bindings incurred while solving the problem, and $q$ contains among its literals the calls

$$p\theta_1\sigma, \dots, p\theta_m\sigma .$$

The body $q$ can be thought of as an unfolding of $p\theta_0$. We refer to the problem of finding such a procedure as a folding problem. Its solution is $p\theta_0\sigma \leftarrow q$.

We represent a folding problem by an expression of the form$^7$

$$\mathit{fold}(\mathit{Head}, \mathit{Body}, \mathit{Calls})$$

where $\mathit{Head}$ is the literal $p\theta_0$ which is to form the head of the required recursive procedure. $\mathit{Body}$ is the body of the implication, which is to be determined. Note that although the body is just some arbitrary variable here, it will be instantiated to logic formulae in the synthesis and is therefore a useful "place-holder". $\mathit{Calls}$ is the set of desired recursive calls $\{p\theta_1, \dots, p\theta_m\}$. These calls will be referred to as the required calls.

A general folding problem will thus be written as

$$\mathit{fold}(p\theta_0, z, \{p\theta_1, \dots, p\theta_m\}) .$$

For example, the folding problem (i) for subset above is written as

$$\mathit{fold}(\mathit{subset}(h.t, \ell), z, \{\mathit{subset}(t, s)\}) .$$

6 We shall use lower-case Greek letters to denote substitutions.
7 This can be regarded as a meta-level goal to be solved.
It should be noted that our definition of a folding problem admits the trivial solution

$$p\theta_0 \leftarrow p\theta_0 \wedge p\theta_1 \wedge \dots \wedge p\theta_m$$

or more generally

$$p\theta_0 \leftarrow q\theta_0 \wedge p\theta_1 \wedge \dots \wedge p\theta_m$$

for any given procedure $(p \leftarrow q)$. This is true by a weakening rule of natural deduction. Such solutions are of little or no use, and are avoided by our strategies, in which the body of a procedure is weakened whenever a recursive call is conjoined with it (thus giving a stronger result than the trivial solution).

3.2 Folding Strategies
To solve a folding problem we use a top-down problem-reduction approach, i.e. a folding problem is decomposed into subproblems which themselves may be further decomposed into subsubproblems, and so on, thus creating a tree of problems. Each node of the tree is a folding problem, the root of the tree being the initial problem, and the leaves terminal problems. Terminal problems are those which can be solved directly. Subsolutions are then systematically composed to give solutions to subproblems higher up the tree, eventually solving the initial problem.

A folding strategy either decomposes or directly solves a folding problem. Each decomposition strategy consists of a precise definition of how the subproblems are to be formed, and how the subsolutions (if and when they are found) are to be composed into a solution for the parent problem. Each direct strategy explicitly defines the solution of the problem.

In order to solve a folding problem, we need to derive the body $z$ of the required recursive implication (with the required recursive calls) from the given specification. We do so recursively by making use of folding strategies in the following manner. Initially $z$ is an uninstantiated variable. A folding strategy is then chosen, together with a sentence (a definition or an implication) in the specification whose head matches that of the required implication. The folding strategy then defines subproblems based on the body of this sentence, looking for the folds to produce the required calls in separate parts of the body. The final folded version of this body (to be found recursively) is $z$. Each decomposition thus corresponds to an unfolding of the head of the current folding problem. On the other hand, each composition of subsolutions corresponds to folding the parent problem using folds already found in the subproblems.
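The problem-reduction scheme described above (decompose, solve terminal problems directly, then compose subsolutions) can be sketched generically. The Python fragment below is a toy illustration under loud assumptions: problems are strings or nested `("and", left, right)` tuples, `FACTS`, `direct` and `split_and` are hypothetical stand-ins for a direct and a decomposition strategy, and nothing here models unification or folding itself.

```python
FACTS = {"a", "b", "c"}  # hypothetical terminal problems solvable directly


def direct(problem):
    """Direct strategy: solve an atomic problem outright, or decline."""
    if isinstance(problem, str) and problem in FACTS:
        return ("solved", [problem])
    return None


def split_and(problem):
    """Decomposition strategy: split ("and", left, right) into subproblems."""
    if isinstance(problem, tuple) and problem[0] == "and":
        def compose(subsolutions):
            # Compose the parent solution from the two subsolutions.
            return subsolutions[0] + subsolutions[1]
        return ("split", [problem[1], problem[2]], compose)
    return None


def solve(problem, strategies):
    """Return a solution for the problem, or None if no strategy succeeds."""
    for strategy in strategies:
        step = strategy(problem)
        if step is None:
            continue  # strategy not applicable, try the next one
        if step[0] == "solved":
            return step[1]
        _, subproblems, compose = step
        subsolutions = [solve(sub, strategies) for sub in subproblems]
        if all(s is not None for s in subsolutions):
            return compose(subsolutions)
    return None
```

For instance, `solve(("and", "a", ("and", "b", "c")), [direct, split_and])` builds a two-level problem tree and composes the leaf solutions bottom-up, while a problem containing an unknown atom yields `None`.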
In contrast to general fold-unfold where unfolding is done according to some eureka strategies in the hope that the results might contain possible folds, we only unfold when doing so would lead to a fold. This means that we always decompose a folding problem in such a way that if the subproblems can be solved then we can compose a solution to the problem.
In the following sections, we describe the folding strategies in detail, and in the Appendix we give an example of how these strategies are used to solve a (simple) complete folding problem.
3.2.1 The definition Strategy
This is the main decomposition strategy, and is usually the first one applied to the initial problem. The general approach here can be described as follows. Let the problem be

$$\mathit{fold}(p\theta_0, z, \{p\theta_1, \dots, p\theta_m\})$$

and suppose the definition of $p$ chosen to decompose the problem is$^8$

$$p \leftrightarrow q .$$

First we use the if part of the definition to unfold $p\theta_0$, i.e. we use

$$p\theta_0 \leftarrow q\theta_0$$

as the unfold formula. Then we define the subproblem

$$\mathit{fold}(q\theta_0, z, \{q\theta_1, \dots, q\theta_m\})$$

the solution of which will yield a procedure for $q\theta_0$ whose body contains the recursive calls $q\theta_1, \dots, q\theta_m$. To obtain a solution to the initial problem, we use the only-if part of the definition to fold $q\theta_i$ to $p\theta_i$, $i = 1, \dots, m$, i.e. we use

$$q\theta_i \leftarrow p\theta_i , \quad i = 1, \dots, m$$

as the set of fold formulae.

In general, we may have more than one such subproblem. The way these subproblems are constructed, and the way the subsolutions are composed into a solution, both depend upon the form of $q$. The definition strategy thus has four versions, one for each normal form of definition.
Conjunctive form
Suppose the chosen definition of $p$ is of the form

$$p \leftrightarrow q_1 \wedge \dots \wedge q_n .$$

Then the unfold formula is

$$p\theta_0 \leftarrow q_1\theta_0 \wedge \dots \wedge q_n\theta_0$$

and the initial problem can be decomposed into $1 \leq k \leq n$ subproblems

$$\mathit{fold}(q_i\theta_0, z_i, \{q_i\theta_1, \dots, q_i\theta_m\}) , \quad i = 1, \dots, k .$$

8 This choice may be non-deterministic, although it is usually unique.
Without loss of generality, suppose the first $k$ subproblems have been solved.$^9$ Then their solutions can be composed into a solution for the initial problem using the following set of fold formulae$^{10}$

$$q_1\theta_i \wedge \dots \wedge q_k\theta_i \leftarrow p\theta_i , \quad i = 1, \dots, m .$$

Suppose the solutions of the first $k$ subproblems are procedures of the form$^{11}$

$$q_i\theta_0\sigma \leftarrow z_i[q_i\theta_1\sigma, \dots, q_i\theta_m\sigma] , \quad i = 1, \dots, k ,$$

where $\sigma$ denotes the bindings that have been incurred. First unfold the $k$ solutions into the definition for $p$ to get

$$p\phi_0 \leftarrow z_1 \wedge \dots \wedge z_k \wedge q_{k+1}\phi_0 \wedge \dots \wedge q_n\phi_0$$

where $\phi_i = \theta_i\sigma$, $i = 0, \dots, m$. Then applying $\sigma$ to the fold formula set we get

$$q_1\phi_i \wedge \dots \wedge q_k\phi_i \leftarrow p\phi_i , \quad i = 1, \dots, m .$$

Now transform the body to DNF (disjunctive normal form), and introduce folds using the fold formulae as follows. If a disjunct contains the literals $q_1\phi_f, \dots, q_k\phi_f$ for some $1 \leq f \leq m$, i.e. it is of the form

$$q_1\phi_f \wedge \dots \wedge q_k\phi_f \wedge c$$

for some conjunction $c$, then we could introduce the fold $p\phi_f$ by using the corresponding fold formula to replace the disjunct by

$$p\phi_f \wedge c$$

(since $p\phi_f \rightarrow q_1\phi_f \wedge \dots \wedge q_k\phi_f$). However, we replace it instead by the equivalent$^{12}$

$$p\phi_f \wedge (c \wedge q_{k+1}\phi_f \wedge \dots \wedge q_n\phi_f) .$$

The motive for doing this is to make a new definition involving $q_{k+1}, \dots, q_n$ so as not to lose them altogether. Our experience shows that such definitions are often the key to the synthesis of useful auxiliary clauses. For example, clauses for the merge predicate for merge sort (as defined in [2] for instance) can be synthesised from such a definition (see [14]).

If the disjunct also contains

$$q_1\phi_{f'} \wedge \dots \wedge q_k\phi_{f'}$$

for some $f' \neq f$, then we introduce both folds:

$$p\phi_f \wedge p\phi_{f'} \wedge (c \wedge q_{k+1}\phi_f \wedge \dots \wedge q_n\phi_f \wedge q_{k+1}\phi_{f'} \wedge \dots \wedge q_n\phi_{f'})$$

9 We shall show later the consequences of solving or not solving any subproblem.
10 Here we make use of the fact that $(q_1 \wedge \dots \wedge q_n \leftarrow p) \vdash (q_1 \wedge \dots \wedge q_k \leftarrow p)$, for $k \leq n$.
11 We shall use $f[\ell_1, \dots, \ell_n]$ to denote a formula $f$ containing the literals $\ell_1, \dots, \ell_n$.
12 It is equivalent because $p\phi_f \wedge \neg q_i\phi_f \leftrightarrow q_1\phi_f \wedge \dots \wedge q_n\phi_f \wedge \neg q_i\phi_f \leftrightarrow \bot$, $i = 1, \dots, n$.
Similarly any number of folds can be dealt with in any one disjunct. A disjunct is left unchanged otherwise. When all the disjuncts have been processed, the final result is the required implication. A schematic overview of the conjunctive definition strategy for the simple case where $k = n$ is shown in Figure 1.

$$\begin{array}{rcl}
p\theta_0 & \leftarrow & q_1\theta_0 \wedge \dots \wedge q_n\theta_0 \\
 & & \;\, * \qquad\qquad\;\; * \\
p\theta_1 & \rightarrow & q_1\theta_1 \wedge \dots \wedge q_n\theta_1 \\
\vdots & & \vdots \\
p\theta_m & \rightarrow & q_1\theta_m \wedge \dots \wedge q_n\theta_m
\end{array}$$

The required head $p\theta_0$ is unfolded using the chosen definition; then $q_1\theta_0, \dots, q_k\theta_0$ are unfolded to procedures whose bodies contain instances of the $q_i\theta_j$; folds are then made in the body to introduce the required recursive calls $p\theta_1, \dots, p\theta_m$.

Figure 1: The conjunctive definition strategy for $k = n$.

From the above analysis, it can be seen that to produce all the required calls it suffices to solve just (any) one subproblem. However, if a subproblem is solved successfully, then its head is folded out, i.e. it does not appear in the solution to the parent problem, and moreover new bindings may reduce the number of internal variables in this solution. Thus, the more subproblems solved, the simpler the body of the resultant implication. In practice, we do not know which set of subsolutions will be optimum, so our policy is to attempt as many subproblems as possible.
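The propositional core of the conjunctive fold can be checked by brute force. The sketch below, a hedged illustration in Python, treats $p$, the $q_i$ and $c$ as plain truth values with $p \leftrightarrow q_1 \wedge \dots \wedge q_n$ and ignores substitutions entirely; the function name `check_conjunctive_fold` is an assumption introduced here.

```python
from itertools import product


def check_conjunctive_fold(n=3, k=2):
    """Truth-table check of the fold step for p <-> q1 & ... & qn.

    Verifies (a) that p & c is equivalent to p & (c & q_{k+1} & ... & qn),
    and (b) that the folded disjunct implies the original disjunct
    q1 & ... & qk & c, so the implication is only strengthened in its body.
    """
    for *qs, c in product([False, True], repeat=n + 1):
        p = all(qs)                       # definition of p
        original = all(qs[:k]) and c      # disjunct before folding
        naive = p and c                   # fold that drops q_{k+1..n}
        kept = p and (c and all(qs[k:]))  # fold that retains q_{k+1..n}
        if naive != kept:
            return False
        if kept and not original:
            return False
    return True
```

The check holds for any choice of $1 \leq k \leq n$, which matches the observation that solving any one subproblem is enough to introduce the fold.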
Disjunctive form
This proceeds along much the same lines as the conjunctive form. Suppose the chosen definition for $p$ is of the form:

$$p \leftrightarrow q_1 \vee \dots \vee q_n .$$

The unfold formula is

$$p\theta_0 \leftarrow q_1\theta_0 \vee \dots \vee q_n\theta_0 ,$$

the initial problem can be decomposed into $1 \leq k \leq n$ subproblems

$$\mathit{fold}(q_i\theta_0, z_i, \{q_i\theta_1, \dots, q_i\theta_m\}) , \quad i = 1, \dots, k ,$$

and the fold formula set is

$$q_1\theta_i \vee \dots \vee q_n\theta_i \leftarrow p\theta_i , \quad i = 1, \dots, m .$$

Note that here we have retained $q_i$, $i > k$. This is because the method of folding here is slightly different from the conjunctive form.

Assuming the first $k$ subproblems are solved, the subsolutions are composed in a similar manner to the conjunctive strategy (writing $\phi_i = \theta_i\sigma$, where $\sigma$ denotes the bindings incurred), except that here we transform the body to CNF (conjunctive normal form), and use conjuncts instead of disjuncts. If a conjunct contains a fold corresponding to $f$ for some $1 \leq f \leq m$, i.e. it is of the form

$$q_1\phi_f \vee \dots \vee q_k\phi_f \vee d$$

for some disjunction $d$, then the fold formula

$$q_1\phi_f \vee \dots \vee q_n\phi_f \leftarrow p\phi_f$$

can be transformed to

$$q_1\phi_f \vee \dots \vee q_k\phi_f \leftarrow p\phi_f \wedge \neg q_{k+1}\phi_f \wedge \dots \wedge \neg q_n\phi_f$$

which is used to fold, giving$^{13}$

$$(p\phi_f \wedge \neg q_{k+1}\phi_f \wedge \dots \wedge \neg q_n\phi_f) \vee d .$$

As in the conjunctive strategy, this can be generalised to any number of folds in a conjunct. Finally, as in the conjunctive form, solving any $k \leq n$ subproblems will produce all the required calls.
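The transformed fold formula for the disjunctive form can likewise be checked propositionally. This Python sketch is an illustration under the assumption that $p \leftrightarrow q_1 \vee \dots \vee q_n$ and that substitutions can be ignored; `check_disjunctive_fold` is a name introduced here.

```python
from itertools import product


def check_disjunctive_fold(n=3, k=2):
    """Truth-table check of the fold step for p <-> q1 | ... | qn.

    Verifies that the folded conjunct
        (p & ~q_{k+1} & ... & ~qn) | d
    implies the original conjunct
        q1 | ... | qk | d
    so replacing the latter by the former only strengthens the body.
    """
    for *qs, d in product([False, True], repeat=n + 1):
        p = any(qs)                               # definition of p
        original = any(qs[:k]) or d               # conjunct before folding
        folded = (p and not any(qs[k:])) or d     # conjunct after folding
        if folded and not original:
            return False
    return True
```

The negated literals $\neg q_{k+1}, \dots, \neg q_n$ are what make the step sound when only the first $k$ subproblems were solved; with $k = n$ the conjunct folds to $p \vee d$ directly.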
Existential form
Suppose the chosen definition for $p$ has the form:

$$p \leftrightarrow \exists x .\, q(x) .$$

(Note that here $x$ denotes a tuple of variables.) The unfold formula is$^{14}$

$$p\theta_0 \leftarrow q(x)\theta_0 ,$$

the single subproblem is

$$\mathit{fold}(q(x)\theta_0, z_1, \{q(x_1)\theta_1, \dots, q(x_m)\theta_m\}) ,$$

and the fold formula set is

$$\exists x_i .\, q(x_i)\theta_i \leftarrow p\theta_i , \quad i = 1, \dots, m ,$$

where $x_1, \dots, x_m$ are (tuples of) new variables.

13 Note that here $q_{k+1}, \dots, q_n$ have been retained as in the conjunctive form.
14 Note that $\theta_0$ will not contain $x$ since $x$ only occurs in $q$.

Although we have decomposed the initial problem to only one subproblem, the latter should be easier to solve than the former, and may itself be further decomposed into subproblems.

Due to the presence of the existential quantifier, extra care is needed here while solving the subproblem and composing a solution to the initial problem. In solving the subproblem, unification must be constrained so that no variable in $p\theta_0$ can become bound to a term containing any variable in $x_1, \dots, x_m$. Moreover, $x_1, \dots, x_m$ must be treated as constants. These restrictions will guarantee that the required $p$ folds can be found, provided the subproblem is solved.

We compose a solution to the initial problem as follows. Suppose the solution to the subproblem is

$$q(x)\phi_0 \leftarrow z_1[q(x_1)\phi_1, \dots, q(x_m)\phi_m]$$

where the $\phi_i$ are defined as before in terms of the new bindings $\sigma$ and the substitutions $\theta_i$. Because of the unification constraints, we can explicitly quantify $x_1, \dots, x_m$ and use the unfold formula to obtain

$$p\phi_0 \leftarrow \exists x_1, \dots, x_m .\, z_1 .$$

Now transform $z_1$ to DNF and distribute the quantifiers. Then we find the required folds as follows. (We shall suppose, without loss of generality, that $x_1, \dots, x_m$ are single variables (not tuples) for simplicity.) Consider a disjunct of $z_1$, and suppose that out of $q(x_1)\phi_1, \dots, q(x_m)\phi_m$ only $q(x_f)\phi_f$ occurs in this disjunct, i.e. the disjunct has the form

$$\exists x_1, \dots, x_m .\, (q(x_f)\phi_f \wedge r)$$

where $r$ is some conjunction of literals. If $x_f$ does not occur in $r$, then we can move the $x_f$ quantifier inward to get

$$\exists x_1, \dots, x_{f-1}, x_{f+1}, \dots, x_m .\, (\exists x_f .\, (q(x_f)\phi_f) \wedge r) .$$

Using the fold formula for $f$ this can be transformed to

$$\exists x_1, \dots, x_{f-1}, x_{f+1}, \dots, x_m .\, (p\phi_f \wedge r)$$

to introduce the fold $p\phi_f$.

On the other hand, if $x_f$ does occur in $r$, that is we have $r(x_f)$, then we cannot simply move the quantifier inward. However, if $x_f$ can be expressed as a (Skolem) function of the other variables of $q(x_f)\phi_f$ (call these $v_f$), say

$$x_f = \mathit{fun}(v_f)$$

then we can replace $r(x_f)$ by $r(\mathit{fun}(v_f))$. The result is

$$\exists x_1, \dots, x_{f-1}, x_{f+1}, \dots, x_m .\, (p\phi_f \wedge r(\mathit{fun}(v_f)) \wedge \mathit{funcons})$$

where $\mathit{funcons}$ is a predicate with the property

$$\mathit{funcons} \leftarrow \forall x_f, v_f .\, (x_f = \mathit{fun}(v_f) \leftarrow q(x_f)\phi_f) .$$

We must now construct the function $\mathit{fun}$ using $\mathit{funcons}$ as a specification. If we can somehow do this, $\mathit{funcons}$ becomes a tautology and can be deleted, giving

$$\exists x_1, \dots, x_{f-1}, x_{f+1}, \dots, x_m .\, (p\phi_f \wedge r(\mathit{fun}(v_f))) .$$

These two cases can easily be adapted to those where $x_1, \dots, x_m$ are not single variables, and where $q(x_f)\phi_f$ occurs in the disjunct for more than one value of $f$.
Universal form
Suppose the chosen definition for $p$ has the form:

$$p \leftrightarrow \forall x .\, q(x) .$$

(Note that here again $x$ denotes a tuple of variables.) The unfold formula is

$$p\theta_0 \leftarrow \forall x .\, q(x)\theta_0 ,$$

the single subproblem is

$$\mathit{fold}(q(x)\theta_0, z_1, \{q(x_1)\theta_1, \dots, q(x_m)\theta_m\}) ,$$

and the fold formula set is

$$q(x_i)\theta_i \leftarrow p\theta_i , \quad i = 1, \dots, m ,$$

where $x_1, \dots, x_m$ are (tuples of) new variables.

As in the existential form, extra care is needed to deal with the quantifier. While solving the subproblem, unification has to be constrained so that the variables in $x$ are treated as constants. Moreover, no variable occurring in $p\theta_0$ must become bound to a term containing any variable in $x$. This constraint preserves partial correctness. Another constraint is that no variable occurring in $p\theta_1, \dots, p\theta_m$ must be bound to a term containing any variable in $x$. This guarantees that folding is possible if the subproblem can be solved.

To compose a solution for the initial problem, suppose we have the subsolution

$$q(x)\phi_0 \leftarrow z_1[q(x_1)\phi_1, \dots, q(x_m)\phi_m]$$

where the new bindings are $\sigma$ and $\phi_i = \theta_i\sigma$, $i = 0, \dots, m$ as before. Using the unfold formula, we get

$$p\phi_0 \leftarrow \forall x\, \exists w .\, z_1[q(x_1)\phi_1, \dots, q(x_m)\phi_m] ,$$

where $w$ is the tuple of internal variables of $z_1$ (which must be explicitly quantified here because of the $\forall x$ quantifier). Then use the fold formula set to obtain

$$p\phi_0 \leftarrow \forall x\, \exists w .\, z_1[p\phi_1, \dots, p\phi_m] .$$

This is not yet an implication. To turn it into one, we must get rid of all the "$\forall x\, \exists w$" quantifiers which enclose $p\phi_f$. First, let $w = u \cup v$ where $u$ is the set of $w$'s which occur in some $p\phi_f$, and $v$ those which do not. Then we have

$$p\phi_0 \leftarrow \forall x\, \exists u\, \exists v .\, z_1[p\phi_1, \dots, p\phi_m]$$

Now strengthen the body to get

$$p\phi_0 \leftarrow \exists u\, \forall x\, \exists v .\, z_1[p\phi_1, \dots, p\phi_m] .$$

This is valid because "$\exists u\, \forall x$" is stronger than "$\forall x\, \exists u$". Next put $z_1$ into DNF and distribute the $\exists v$ quantifiers, then into CNF and distribute the $\forall x$ quantifiers as far as possible. We know that no $v$ or $x$ variables occur in any $p\phi_i$, $i = 1, \dots, m$ ($v$ by definition, $x$ because of the constraint on the bindings), so after distribution no $p\phi_i$ is inside any quantifier other than the $\exists u$ quantifiers. Therefore we can remove the $\forall x$ and $\exists v$ quantifiers by forming new definitions (which lead to subproblems) and replacing subformulae by the new atoms, without losing any of the $p\phi_i$. The formula is now in the form of an implication with the required recursive calls (the $\exists u$ quantifier can be dropped, leaving $u$ implicitly universally quantified).
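The quantifier swap used to strengthen the body relies on "$\exists u\, \forall x$" implying "$\forall x\, \exists u$" but not conversely. The Python sketch below checks this by enumerating every binary relation over a small finite domain; it is a finite-model illustration rather than a proof, and `check_quantifier_swap` and its return convention are assumptions made here.

```python
from itertools import product


def exists_forall(rel, dom):
    """exists u. forall x. rel(u, x)"""
    return any(all(rel[(u, x)] for x in dom) for u in dom)


def forall_exists(rel, dom):
    """forall x. exists u. rel(u, x)"""
    return all(any(rel[(u, x)] for u in dom) for x in dom)


def check_quantifier_swap(size=2):
    """Enumerate all binary relations on a domain of the given size.

    Returns (implication_holds, strictly_stronger): the first flag is True
    if exists-forall implies forall-exists for every relation, the second
    is True if some relation satisfies forall-exists but not exists-forall.
    """
    dom = range(size)
    pairs = list(product(dom, repeat=2))
    strictly_stronger = False
    for values in product([False, True], repeat=len(pairs)):
        rel = dict(zip(pairs, values))
        ef, fe = exists_forall(rel, dom), forall_exists(rel, dom)
        if ef and not fe:
            return (False, strictly_stronger)
        if fe and not ef:
            strictly_stronger = True
    return (True, strictly_stronger)
```

On a two-element domain the identity relation already separates the two forms, which is why the strengthened body may lose solutions even though it remains partially correct.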
3.2.2 The implication strategy
This is both a decomposition and a direct strategy. It exploits the structure of some already known recursive implication. Given a problem

$$\mathit{fold}(p\theta_0, z, \{p\theta_1, \dots, p\theta_m\})$$

suppose there is a known recursive implication whose head matches $p\theta_0$. Let the matching substitution between $p\theta_0$ and the head of the procedure be $\mu$, and the matched procedure be

$$p\theta_0\mu \leftarrow q[p\psi_1, \dots, p\psi_n]$$

where $n$ ($\geq 1$) may be greater than, less than or equal to $m$. The $p\psi_i$ will be referred to as the available calls, to distinguish them from the required calls $p\theta_i$.

Decomposition is done as follows. Associate each required call $p\theta_i$ with some (not necessarily unique) available call $p\psi_j$. Then a set of subproblems is generated in the following way. For each $p\psi_j$ form the set $s_j$ of $p\theta_i$ associated with it. Then for each nonempty $s_j$, form a subproblem

$$\mathit{fold}(p\psi_j, z_j, s_j) .$$

The conjunction of these subproblems must now be solved. To compose a solution for the initial problem, suppose the subsolutions are (for some $k$)

$$p\psi_j\sigma' \leftarrow z_j[s_j\sigma'] , \quad j = 1, \dots, k ,$$

where $\sigma'$ are the bindings incurred. Simply unfold these in $z$ to get

$$p\theta_0\mu\sigma' \leftarrow z[z_1, z_2, \dots, z_k] .$$

Since each $p\theta_i\sigma'$ is contained in some $z_j$ this is a solution to the original problem.
3.2.3 The match strategy
This is a direct strategy. A folding problem

$$\mathit{fold}(p\theta_0, z, \{p\theta_1, \dots, p\theta_m\})$$

can be solved directly (and trivially) if $p\theta_0$ can be unified with each of the $p\theta_j$, $j = 1, \dots, m$, and $z$.$^{15}$ This instantiates the folding problem to the form

$$\mathit{fold}(t, t, \{t\})$$

which has a trivial solution. The match strategy checks for such terminal problems and solves them.
3.2.4 The modus ponens strategy
This is another direct strategy. Given a problem

$$\mathit{fold}(p\theta_0, z, \{p\theta_1, \dots, p\theta_m\})$$

a direct solution can always be found immediately by defining a new predicate and using the Modus Ponens rule. This solution$^{16}$ is

$$p\theta_0 \leftarrow \mathit{aux}(y_1, \dots, y_n) \wedge p\theta_1 \wedge \dots \wedge p\theta_m$$

where $\{y_1, \dots, y_n\} = FV(p\theta_0, \dots, p\theta_m)$ and

$$\mathit{aux}(y_1, \dots, y_n) \leftrightarrow (p\theta_0 \leftarrow p\theta_1 \wedge \dots \wedge p\theta_m) .$$

15 Such a problem may arise if the problem is a subproblem and bindings previously introduced cause $p\theta_0$ to be already identical to each $p\theta_i$, for example.
16 Note that this is not a "trivial" solution in the sense defined in Section 3.1, because $p$ has been weakened to $\mathit{aux}(y_1, \dots, y_n)$.

The modus ponens strategy "solves" a folding problem in this way, making the given problem a terminal problem, but creating a new predicate $\mathit{aux}(y_1, \dots, y_n)$ specified by a definition. However, a procedure must now be synthesised for the new predicate. This strategy can always be applied, but it is not always useful because the new predicate may be just as difficult to deal with. Its main merit is its ability to produce new definitions not in the given specifications, thus opening up more search space than may initially be apparent.
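That the modus ponens solution is always valid can be seen from a truth table. The Python sketch below is a propositional illustration only: the head and the required calls are treated as truth values, `aux` is computed from its definition, and `check_modus_ponens` is a name assumed here.

```python
from itertools import product


def check_modus_ponens(m=2):
    """Check p0 <- aux & p1 & ... & pm, given aux <-> (p0 <- p1 & ... & pm)."""
    for p0, *calls in product([False, True], repeat=m + 1):
        body = all(calls)          # p1 & ... & pm
        aux = p0 or not body       # definition of the new predicate
        if aux and body and not p0:
            return False           # would be a counterexample
    return True
```

The check succeeds for every number of calls, which reflects why the strategy can always be applied; the remaining work is moved into synthesising a procedure for the new predicate.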
4 Conclusion

Our method of synthesis may be regarded as partial deduction ([11]) of a first-order logic specification with respect to a most general goal. The result of our synthesis is equivalent to what Deville & Burnay [6] call a logic description, from which they derive a logic program.

To recapitulate, in our top-down approach: folds are partially specified in advance; unfolding is only done when it contributes directly to a specified fold; and folding is then done automatically. (Most other transformations are postponed until after unfolding and folding.) The strategies define a search space which should be much smaller than the total space of all applicable inference rules, by exploiting design decisions given by the user and hence aiming for a partially known result. We do not claim, however, that procedures synthesised using these strategies are efficient.

Our primary objective has been to implement a practical system based on these strategies. To this end, we have developed a Prolog system which can be regarded as a meta-level logic program of the form:$^{17}$

$$\mathit{fold}(h, b, c) \leftarrow \neg\mathit{sl}(h) \wedge \mathit{match}(h, b, c)$$
$$\mathit{fold}(h, b, c) \leftarrow \neg\, h \in c \wedge \mathit{sl}(h) \wedge \mathit{autofold}(h, b, c)$$
$$\mathit{fold}(h, b, c) \leftarrow \neg\, h \in c \wedge \neg\mathit{sl}(h) \wedge \mathit{implication}(h, b, c)$$
$$\mathit{fold}(h, b, c) \leftarrow \neg\, h \in c \wedge \mathit{definition}(h, b, c)$$
$$\mathit{fold}(h, b, c) \leftarrow \neg\, h \in c \wedge \mathit{modus\ ponens}(h, b, c)$$

(with standard Prolog control, and all that this implies) together with meta-procedures for the different strategies (with user-guidance). The details in the Appendix have been obtained on this system.

17 $\mathit{sl}(h)$ means $h$ is a system-generated literal created while transforming definitions to normal forms. $\mathit{autofold}$ is a strategy which attempts to solve a problem without user-guidance.

Theoretical issues have not yet been addressed properly. Foremost among these are two related to completeness. Firstly, although partial correctness is guaranteed (a property of general fold-unfold), completeness of a derived procedure has not been investigated. Clearly, this depends on the "completeness" of the folding problem in question, but it would be helpful if the system could determine the completeness of any procedure that had been synthesised. We have also ignored base cases in our present system. Secondly, completeness of our top-down approach with respect to general fold-unfold has also not been proven. In practice, this has not been a problem. We have used the system successfully to synthesise procedures for a wide variety of algorithms, including a large family of sorting algorithms ([13, 14]), a family for generating convex hulls, and (with some extensions not discussed here) algorithms for solving systems of linear equations.

Functional (notation) and equality reasoning has been used for convenience whenever and wherever appropriate. Again, the theoretical (and practical) implications of such expedients have not been properly treated. Finally, the exclusive use of laws of logic for composing solutions from subsolutions may be too restrictive.
Acknowledgements
We thank Tim Clement and the referees for their helpful comments which have improved this paper considerably.
References

[1] R.M. Burstall, J. Darlington, A Transformation System for Developing Recursive Programs, J. ACM 24(1), January 1977, 44-67.
[2] K.L. Clark, J. Darlington, Algorithm Classification through Synthesis, The Computer Journal 23(1), 1980, 61-65.
[3] K.L. Clark, S. Sickel, Predicate Logic: A Calculus for the Derivation of Programs, Proc. IJCAI-77, 1977, 419-420.
[4] K.L. Clark, S.-A. Tärnlund, A First Order Theory of Data and Programs, in Information Processing 77, North-Holland 1977, 939-944.
[5] G. Dayantis, Logic Program Derivation for a Class of First Order Logic Relations, Proc. IJCAI-87, 1987, 9-14.
[6] Y. Deville, J. Burnay, Generalization and Program Schemata: A Step Towards Computer-Aided Construction of Logic Programs, Proc. NACLP 89, 1989, 409-425.
[7] M.S. Feather, A System for Assisting Program Transformation, ACM TOPLAS 4(1), 1982, 1-20.
[8] C.J. Hogger, Derivation of Logic Programs, J. ACM 28(2), 1981, 372-392.
[9] C.J. Hogger, Introduction to Logic Programming, Academic Press 1984.
[10] T. Kanamori, K. Horiuchi, Construction of Logic Programs based on Generalised Unfold/fold Rules, ICOT Technical Report TR-177, 1986.
[11] J. Komorowski, Towards Synthesis of Programs in the Framework of Partial Deduction, Proc. Workshop on Automating Software Design, 11th IJCAI, 1989.
[12] R. Kowalski, The Relation between Logic Programming and Logic Specification, in C.A.R. Hoare, J.C. Shepherdson (eds), Mathematical Logic and Programming Languages, Prentice-Hall 1985, 11-27.
[13] K.K. Lau, A Note on Synthesis and Classification of Sorting Algorithms, Acta Informatica 27, 1989, 73-80.
[14] K.K. Lau, S.D. Prestwich, Synthesis of Logic Programs for Recursive Sorting Algorithms, Technical Report UMCS-88-10-1, Department of Computer Science, University of Manchester, 1988.
[15] Z. Manna, R. Waldinger, Synthesis: Dreams ⇒ Programs, IEEE Trans. Soft. Eng. 5(4), July 1979, 294-328.
[16] L. Sterling, E. Shapiro, The Art of Prolog, MIT Press 1986.
[17] H. Tamaki, T. Sato, Unfold/fold Transformations for Logic Programs, Proc. 2nd Int. Conf. on Logic Programming, 1984, 127-138.
Appendix: A Complete Example

We give the complete synthesis of a procedure from the following specification for subset (over lists) in normal form:

$$\mathit{subset}(a, b) \leftrightarrow \forall x\; \mathit{aux}(x, b, a)$$
$$\mathit{aux}(x, b, a) \leftrightarrow \mathit{mem}(x, b) \vee \neg\mathit{mem}(x, a)$$
$$\mathit{mem}(x, a \mathbin{\hat{}} b) \leftrightarrow \mathit{mem}(x, a) \vee \mathit{mem}(x, b) .$$

Suppose we wish to find a recursive procedure for subset which constructs a subset by concatenating two sets which are recursively constructed. This "design decision" is represented by the folding problem

$$G_1 : \mathit{fold}(\mathit{subset}(a \mathbin{\hat{}} b, s), z_1, \{\mathit{subset}(a, s_1), \mathit{subset}(b, s_2)\}) .$$

To solve this problem, we first decompose it using the universal form of definition$^{18}$ with the subset definition in the specification, giving the subproblem

$$G_{1.1} : \mathit{fold}(\mathit{aux}(x, s, a \mathbin{\hat{}} b), z_{1.1}, \{\mathit{aux}(x_1, s_1, a), \mathit{aux}(x_2, s_2, b)\}) .$$

Here the unfold formula is

$$\mathit{subset}(a \mathbin{\hat{}} b, s) \leftarrow \forall x\; \mathit{aux}(x, s, a \mathbin{\hat{}} b)$$

and the fold formula set is

$$\left\{ \begin{array}{l} \mathit{aux}(x_1, s_1, a) \leftarrow \mathit{subset}(a, s_1) \\ \mathit{aux}(x_2, s_2, b) \leftarrow \mathit{subset}(b, s_2) \end{array} \right\} .$$

18 This introduces the following unification constraints: $x$ are constants and $a, b, s, s_1, s_2$ must not become bound to any terms containing $x$.
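Next apply the disjunctive form of definition with the $\mathit{aux}$ definition to decompose this into the following subproblems:

$$G_{1.1.1} : \mathit{fold}(\mathit{mem}(x, s), z_{1.1.1}, \{\mathit{mem}(x_1, s_1), \mathit{mem}(x_2, s_2)\}) ,$$
$$G_{1.1.2} : \mathit{fold}(\neg\mathit{mem}(x, a \mathbin{\hat{}} b), z_{1.1.2}, \{\neg\mathit{mem}(x_1, a), \neg\mathit{mem}(x_2, b)\}) .$$

Here the unfold formula is

$$\mathit{aux}(x, s, a \mathbin{\hat{}} b) \leftarrow \mathit{mem}(x, s) \vee \neg\mathit{mem}(x, a \mathbin{\hat{}} b)$$

and the fold formula set is

$$\left\{ \begin{array}{l} \mathit{mem}(x_1, s_1) \vee \neg\mathit{mem}(x_1, a) \leftarrow \mathit{aux}(x_1, s_1, a) \\ \mathit{mem}(x_2, s_2) \vee \neg\mathit{mem}(x_2, b) \leftarrow \mathit{aux}(x_2, s_2, b) \end{array} \right\} .$$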
$G_{1.1.1}$ can be solved by match. This creates new bindings, given by the substitution $[x_1 := x, x_2 := x, s_1 := s, s_2 := s]$. Under these bindings, the second subproblem becomes

$$G_{1.1.2} : \mathit{fold}(\neg\mathit{mem}(x, a \mathbin{\hat{}} b), z_{1.1.2}, \{\neg\mathit{mem}(x, a), \neg\mathit{mem}(x, b)\}) .$$

This can be solved directly by implication using the only-if half of the $\mathit{mem}$ definition:

$$\neg\mathit{mem}(x, a \mathbin{\hat{}} b) \leftarrow \neg\mathit{mem}(x, a) \wedge \neg\mathit{mem}(x, b)$$

which is the solution to this subproblem.

Using the solutions for $G_{1.1.1}$ and $G_{1.1.2}$ we compose a solution for $G_{1.1}$. First unfold these subsolutions in the unfold formula, giving

$$\mathit{aux}(x, s, a \mathbin{\hat{}} b) \leftarrow \mathit{mem}(x, s) \vee (\neg\mathit{mem}(x, a) \wedge \neg\mathit{mem}(x, b)) .$$

Next transform the body to CNF:

$$\mathit{aux}(x, s, a \mathbin{\hat{}} b) \leftarrow (\mathit{mem}(x, s) \vee \neg\mathit{mem}(x, a)) \wedge (\mathit{mem}(x, s) \vee \neg\mathit{mem}(x, b)) .$$

Then apply the fold formulae for $G_{1.1}$ (with the new bindings), giving

$$\mathit{aux}(x, s, a \mathbin{\hat{}} b) \leftarrow \mathit{aux}(x, s, a) \wedge \mathit{aux}(x, s, b)$$

which is the required solution for $G_{1.1}$.

To compose a solution for $G_1$, unfold this solution in the unfold formula for $G_1$, giving

$$\mathit{subset}(a \mathbin{\hat{}} b, s) \leftarrow \forall x\; (\mathit{aux}(x, s, a) \wedge \mathit{aux}(x, s, b)) .$$
There are no internal variables in the body, so we proceed by distributing the universal quantifier over the conjunction, which is already in CNF, to get

$$\mathit{subset}(a \mathbin{\hat{}} b, s) \leftarrow (\forall x\; \mathit{aux}(x, s, a)) \wedge (\forall x\; \mathit{aux}(x, s, b)) .$$

Applying the fold formulae gives

$$\mathit{subset}(a \mathbin{\hat{}} b, s) \leftarrow (\forall x\; \mathit{subset}(a, s)) \wedge (\forall x\; \mathit{subset}(b, s)) .$$

As guaranteed by the constraints, $x$ does not occur in either recursive call, so we can delete the quantifiers to get

$$\mathit{subset}(a \mathbin{\hat{}} b, s) \leftarrow \mathit{subset}(a, s) \wedge \mathit{subset}(b, s)$$

which is the required implication, already in the form of a procedure.
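The synthesised implication can be exercised as a divide-and-conquer procedure once base cases are supplied. The Python sketch below is an illustration only: the paper's system ignores base cases, so the two base cases (empty list and singleton list) and the choice of splitting the list in the middle are assumptions added here, as are the names `subset_rec`, `subset_spec` and `lists_up_to`.

```python
from itertools import product

UNIVERSE = (0, 1, 2)  # assumed finite universe of list elements


def subset_spec(a, s):
    """Specification: every element of a is a member of s."""
    return all(x in s for x in a)


def subset_rec(a, s):
    """Recursive clause: subset(a^b, s) <- subset(a, s) and subset(b, s).

    The base cases below are NOT synthesised by the method; they are
    assumptions added so that the recursion terminates.
    """
    if len(a) == 0:
        return True                # assumed base case: empty list
    if len(a) == 1:
        return a[0] in s           # assumed base case: singleton list
    mid = len(a) // 2              # split a as a1 ^ a2
    return subset_rec(a[:mid], s) and subset_rec(a[mid:], s)


def lists_up_to(n):
    """All tuples over UNIVERSE of length 0..n."""
    for k in range(n + 1):
        yield from product(UNIVERSE, repeat=k)
```

With these assumed base cases, `subset_rec` agrees with `subset_spec` on all enumerated lists; any other split of the first argument would satisfy the same clause.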