
Programming with Equations, Subsets, and Relations

Bharat Jayaraman Department of Computer Science State University of New York at Buffalo Buffalo, New York 14260

David A. Plaisted Department of Computer Science University of North Carolina at Chapel Hill Chapel Hill, NC 27514

Abstract

We discuss the declarative and computational issues in combining equational, subset, and relational assertions in a logic programming language. The novel feature in this work is the subset assertion, whose interactions with equational and relational assertions are discussed in this paper. The semantics of subset assertions incorporate a collect-all capability, which is expressed formally by the completion of the program. When used in conjunction with equational assertions, subset assertions serve to define set-valued functions, and the resulting paradigm is called subset-equational programming. We also present the class of stratified subset-equational programs for formalizing the class of closure functions, which are useful in defining various transitive-closure sets. The declarative and operational semantics of simple and stratified subset-equational programs are the main focus of this paper. The operational semantics of closures is based on memo-tables (or extension tables). When subset assertions are used in conjunction with relational assertions, the paradigm is called subset-relational programming. We present both simple and stratified subset-relational programming, and show how they provide a declarative way to define Prolog's setof construct.

1. Introduction

This work originated while investigating the use of non-confluent rewriting systems for defining sets [P86], and subsequently led to a functional programming language based on equations and subset assertions [JP87]. While the use of equations for functional programming is well-known [O85], subset assertions are relatively new, and have the form f(terms) ⊇ expression. Informally, the declarative meaning of a subset assertion is that, for all its ground instances, the function f operating on argument ground terms is a superset of the ground set denoted by the expression on the right-hand side. By providing subset assertions with a collect-all capability, the meaning of a set-valued function f operating on ground terms is equal to the union of the respective sets defined by the different subset assertions for f. The top-level query is of the form ? expression, where expression is a ground expression. The meaning of this query is the ground term t such that expression = t is a logical consequence of the completion of the program, i.e., the program obtained by augmenting all subset assertions defining some function with equality assertions that capture the collect-all capability of these subset assertions. We refer to the programming paradigm resulting from equations and subset assertions as subset-equational programming.

In our earlier papers, we described the execution model for subset-equational programs using innermost reduction and associative-commutative (a-c) matching [JP87], and also showed how a restricted form of a-c matching can be efficiently compiled into instructions similar to those of the Warren Abstract Machine (WAM) [JN88]. We are interested in a-c matching because of the presence of the union (∪) constructor (we explain later why idempotence and identity are not used in the matching process). This paper concentrates on the declarative semantics of subset-equational programs, and discusses two kinds of subset-equational programs:

1. Simple subset-equational programs. The focus here is on formalizing the collect-all capability in terms of the completion of the program, and also on the correctness of the operational semantics.

2. Stratified subset-equational programs. Whereas stratification in Horn-clause programs with negation by failure is used to avoid the problems of recursive negation, we introduce stratification here for formalizing the class of closure functions. These functions are useful in defining various transitive-closure sets, e.g., in dataflow analysis in compilers. In general, stratified subset-logic programs consist of (closure and non-closure) functions that are partitioned into several levels. We provide model-theoretic and operational semantics for stratified subset-equational programs.
We also explore the integration of subset and relational assertions in this paper, but our treatment of this topic is less detailed. We refer to this paradigm as subset-relational programming. A typical assertion in this paradigm is of the form f(term) ⊇ set :- B, where B is, in general, a conjunction of relational and equality goals. This paradigm can provide a declarative treatment of Prolog's setof construct [W82], which is known to be very useful in practice, but does not possess a simple declarative semantics. We show how subset assertions can be used to define the setof construct directly. The paradigm can also be used to define negation-by-failure, as well as the `grouping construct' in LDL1 [BNRST87]. As with subset-equational programs, we present the class of simple subset-relational programs and the class of stratified subset-relational programs. Simple subset-relational programs do not permit equality goals in B above; we introduce them to discuss the basic ideas of the paradigm, including the need for generalizing restricted a-c matching to restricted a-c unification. As with negation-by-failure, we introduce stratification in order to obtain a simple semantics when equality goals occur on the r.h.s. of subset-relational programs.

The rest of this paper is organized as follows: section 2 informally describes simple subset-equational programming, summarizing and clarifying the essential language ideas of our earlier papers [JP87, JN88]; sections 3 and 4 are devoted to the model-theoretic and operational semantics of simple and stratified subset-equational programs respectively; section 5 introduces subset-relational programs; finally, section 6 presents conclusions, further comments on related work, and areas of further work.

2. Subset-Equational Programs: An Informal Introduction

We first specify the syntactic structure of terms and expressions.

    term  ::= atom | variable | φ | {term} | term ∪ term | constructor(terms)
    terms ::= term | term, terms
    expr  ::= term | {expr} | expr ∪ expr | constructor(exprs) | function(exprs)
    exprs ::= expr | expr, exprs

We will refer to a term as a set if it has one of the set constructors, φ, { } or ∪, at its outermost level (these are the only set constructors); otherwise, we will refer to the term as an element. The constructor φ is the empty set, { } is the singleton-set constructor, and ∪ stands for set union. A ground term is a term without any variables in it. Informally, terms correspond to data objects, and we consider only finite terms in this paper. A program assertion is of the form f(terms) ⊇ expr or f(terms) = expr. Note that we distinguish a constructor from a function, and hence also a term from an expression. A constructor is not defined by any program assertions and is therefore irreducible, whereas a function is defined by one or more program assertions which are used to reduce a functional expression. We next discuss the two key notions in subset-equational programming: the completion, and restricted a-c matching.

2.1 Completion

We `complete' program assertions in order to derive equality assertions from subset assertions. The completion of a program incorporates two assumptions underlying the meaning of subset-equational programs:

(i) Collect-all Assumption. If a set-valued expression e is such that e ⊇ s1, ..., e ⊇ sn, and it is determined that there are no other known subsets for e according to the given program, then the collect-all assumption allows us to infer $e = \bigcup_{i=1}^{n} s_i$. Note that the result must be a finite set.

(ii) Emptiness-as-Failure Assumption. This assumption effectively allows us to discard all failing reductions when collecting the different subsets of a set. There are two aspects to the emptiness-as-failure assumption: (a) the value of {expr} is φ if expr reduces to an (irreducible) expression with non-constructors, and (b) applying a (non-constructor) set-valued function fs to terms that do not match any of the l.h.s. of assertions for fs yields φ.

We illustrate both aspects of the completion by a simple example. Note that our lexical convention in this paper is to begin atoms with an uppercase letter and variables with a lowercase letter.

    f(Bob) = Mark        m(Bob) = Mary
    f(Ann) = Mark        m(Ann) = Mary
    f(Mark) = Joe        m(Mark) = Jane

    p(x) ⊇ {f(x)}
    p(x) ⊇ {m(x)}

The collect-all assumption effectively supplements p by the assertion p(x) = {f(x)} ∪ {m(x)}. Thus, for example, p(Bob) = {f(Bob)} ∪ {m(Bob)} = {Mark} ∪ {Mary} = {Mark, Mary}. By the collect-all assumption and case (a) of the emptiness-as-failure assumption, we have, for example, p(Mary) = {f(Mary)} ∪ {m(Mary)} = φ ∪ φ = φ. Note that f(Mary) and m(Mary) are irreducible and f and m are not constructors. (We illustrate case (b) of the emptiness-as-failure assumption in section 2.3.)

2.2. Restricted A-C Matching

The associative-commutative matching problem may be stated as follows: Given two terms t1 (possibly non-ground) and t2 (ground), some constructors of which may be associative-commutative, is there a substitution θ such that t1θ =ac t2 (where =ac means `equality modulo the associative and commutative equations')? This problem was first posed by Plotkin [P72] and has since been studied quite extensively in the literature (see [BKN85] and references therein). For example, if t1 ≡ u ∪ v and t2 ≡ {Mark, Mary, Joe, Jane}, we will obtain 16 different substitutions for θ such that t1θ =ac t2, corresponding to the different ways of splitting the four-element set into two subsets; a typical match would be {u ← {Mary, Jane}, v ← {Mark, Joe}}. On the other hand, if t1 ≡ {x} ∪ t and t2 ≡ {Mark, Mary, Joe, Jane}, we will obtain 4 different substitutions, corresponding to the different ways of selecting one element from the set and its corresponding remainder; a typical match would be {x ← Joe, t ← {Mark, Mary, Jane}}.

We will use the notation {x | t} to refer to a non-empty set, one of whose elements is x and the remainder of which is t. That is, {x | t} is syntactic sugar for {x} ∪ t. The case when all set patterns are restricted to the form {term | term}, where term does not use ∪ explicitly, is of special interest, because it is amenable to a more efficient implementation. Henceforth in this paper, we disallow explicit use of the ∪ constructor in program assertions; we show in section 2.3 how set union can be defined using program assertions. Basically, this restriction permits iteration over the elements of a set, rather than iteration over the subsets of a set. While some expressive convenience is sacrificed by this restriction, most practical cases are unaffected. We refer to the associated matching operation as restricted a-c matching, which we discussed in greater detail in [JN88].

We note that the equality =ac is based only on the associative and commutative properties, but not the idempotent property. Thus, for example, matching {x | t} with {Mark, Mary} cannot yield the substitution {x ← Mark, t ← {Mark, Mary}}. The reason for disallowing the idempotent property during matching is to avoid a potential infinite loop in recursive definitions where t appears on the r.h.s. of the rule (see the perms definition at the end of this section). Furthermore, because a singleton set such as {Mark} is represented internally as {Mark} ∪ φ, it can match {x | t} yielding {x ← Mark, t ← φ}; thus the identity property is not explicitly required during matching.
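The two match counts quoted in this section can be checked with a small enumeration. The sketch below is an assumption-laden illustration rather than the compiled matching of [JN88]: the function names `match_head_rest` and `match_union` are hypothetical, ground sets are Python frozensets, and a match is returned as a pair of bindings.

```python
from itertools import combinations

def match_head_rest(s):
    """All matches of {x | t} against ground set s: x is one element, t the remainder."""
    s = frozenset(s)
    return [(x, s - {x}) for x in sorted(s)]

def match_union(s):
    """All matches of u ∪ v against s, splitting s into disjoint u and v.

    Only associativity and commutativity are used, so an element never
    appears in both u and v (no idempotence).
    """
    s = frozenset(s)
    items = sorted(s)
    splits = []
    for r in range(len(items) + 1):
        for u in combinations(items, r):
            splits.append((frozenset(u), s - frozenset(u)))
    return splits

people = {"Mark", "Mary", "Joe", "Jane"}
print(len(match_union(people)))      # 16
print(len(match_head_rest(people)))  # 4
```

The restricted form grows linearly with the size of the set, whereas the general form grows exponentially, which is the efficiency argument made above for iterating over elements rather than subsets.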

2.3 Examples

The multiple a-c matches arising from the use of patterns such as {x | t} provide a convenient and efficient way of iterating over the elements of a set. Continuing the example from section 2.1, we may define the set of ancestors of some individual as follows.

    anc(x) = allanc(p(x))
    allanc(s) ⊇ s
    allanc({x | t}) ⊇ anc(x)

For example, to find the ancestors of Bob, we evaluate anc(Bob), which reduces to allanc(p(Bob)). The evaluation of nested expressions occurs innermost-first; hence the above expression reduces to allanc({Mark, Mary}). Because allanc is defined by two subset assertions, both these assertions are considered in reducing allanc({Mark, Mary}). The result from the first assertion is {Mark, Mary}. When matching allanc({Mark, Mary}) with the left-hand side of the second assertion defining allanc, both a-c matches are considered, namely, {x ← Mark, t ← {Mary}} and {x ← Mary, t ← {Mark}}. The right-hand side of this assertion is then separately evaluated for each of these matches, and the union of these sets (along with that from the first assertion) is defined as the value for allanc({Mark, Mary}). Thus allanc({Mark, Mary}) effectively reduces to {Mark, Mary, Joe, Jane}. The following points should be noted in the above process:

(i) In general, duplicates must be eliminated when taking the above union, but we showed in [JN88] how this check can be deferred for a particular argument when the function distributes over union in this argument.

(ii) The case when the argument set to allanc is empty is correctly handled, i.e., allanc(φ) = φ. We clearly have allanc(φ) ⊇ φ, by the first assertion for allanc. We also have allanc(φ) ⊇ φ by the second assertion for allanc, because of case (b) of the emptiness-as-failure assumption.

(iii) With reference to the second rule for allanc, because the variable t is not used on the r.h.s., considerable space and time can be saved by not constructing the remainder-set for it. We provide the notation _ to refer to the "don't-care variable," as in Prolog, so that the programmer can indicate such cases explicitly.

We show a few more examples of programs in this paradigm to illustrate the succinctness provided by subset assertions and restricted a-c matching.

    member(h, {h | _}) = true
    crossproduct({x | _}, {y | _}) ⊇ {[x | y]}
    intersect({x | _}, {x | _}) ⊇ {x}
    union(s1, s2) ⊇ s1
    union(s1, s2) ⊇ s2
    perms(φ) = {[ ]}
    perms({x | t}) ⊇ distr(x, perms(t))
    distr(x, {h | _}) ⊇ {[x | h]}

Note that the first four operations shown above are all stated nonrecursively. It is possible to compile these definitions so that no recursive calls occur even during execution. The perms example illustrates recursion in conjunction with a-c matching; further details on this definition may be found in [JN88]. This example also illustrates why we prefer not to use the idempotence property during matching.

Before proceeding to the semantics of subset-equational programs, we briefly address the confluence requirement of subset-equational programs. Of special interest is the case where set terms of the form {t1 | t2} appear on the l.h.s. of assertions. Stated as a syntactic condition, we require that: (i) the left-hand side of each equality assertion not overlap with any other assertion, and (ii) when set terms occur in equality assertions, the result should be independent of which one of the potentially many a-c matches is selected. Other less restrictive conditions are possible, but we shall assume the above conditions, for the sake of specificity. Note that a subset assertion may overlap with other subset assertions.
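The anc and perms definitions of this section can be traced with a direct Python transliteration. This is a hedged sketch under several assumptions: the parent data is repeated here so that the block is self-contained, every `for` loop stands for the enumeration of the a-c matches of a pattern {x | t} or {x | _}, lists are modelled as tuples, and the recursion terminates only because the parent relation is acyclic. The Python names are chosen for illustration and are not part of the language described in the paper.

```python
F = {"Bob": "Mark", "Ann": "Mark", "Mark": "Joe"}    # f(...) = ...
M = {"Bob": "Mary", "Ann": "Mary", "Mark": "Jane"}   # m(...) = ...

def p(x):
    """p(x) ⊇ {f(x)} and p(x) ⊇ {m(x)}, with emptiness-as-failure."""
    return frozenset(t[x] for t in (F, M) if x in t)

def anc(x):
    """anc(x) = allanc(p(x)), evaluated innermost-first."""
    return allanc(p(x))

def allanc(s):
    result = set(s)            # allanc(s) ⊇ s
    for x in s:                # one a-c match of {x | _} per element of s
        result |= anc(x)       # allanc({x | _}) ⊇ anc(x)
    return frozenset(result)

def perms(s):
    s = frozenset(s)
    if not s:
        return {()}            # perms(φ) = {[ ]}
    out = set()
    for x in s:                # each a-c match {x | t}, with t = s - {x}
        for h in perms(s - {x}):
            out.add((x,) + h)  # distr(x, {h | _}) ⊇ {[x | h]}
    return out

print(sorted(anc("Bob")))      # ['Jane', 'Joe', 'Mark', 'Mary']
print(len(perms({1, 2, 3})))   # 6
```

In perms the remainder t never contains x. If idempotence were allowed in matching, t could equal the whole set again and the recursive call would not shrink its argument, which is the infinite loop mentioned in section 2.2.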

3. Semantics of Simple Subset-Equational Programs

We now give a more formal account of simple subset-equational programs. To keep the presentation brief, we omit proofs of propositions and theorems in this paper, but instead refer the reader to [J89].

3.1. Completion

Before defining the completion, we flatten all expressions so that the arguments of all function calls are terms. Temporary variables are introduced as necessary. For the parent-and-ancestor example of section 2, the flattened program would be as shown below.

    p(x) ⊇ {e} :- f(x) = e
    p(x) ⊇ {e} :- m(x) = e
    anc(x) = s2 :- p(x) = s1, allanc(s1) = s2
    allanc(s) ⊇ s
    allanc({x | t}) ⊇ s :- anc(x) = s.

The general flattened form of a program assertion is H :- B, where H may be either f(t) = u (where t and u are terms) or f(t) ⊇ u, and B is of the form E1, ..., En, where each Ei is fi(ti) = ui and fi is a user-defined function. Note: (i) B may be empty, in which case we have an unconditional assertion; (ii) without loss of generality, f's argument can be assumed to be a single term t rather than a sequence of terms, because the latter is subsumed by the former given a sequence constructor; and (iii) the order of equalities on the r.h.s. reflects the innermost reduction order for expressions. The general flattened form of a (ground) query expression will be a sequence of equalities of the form f1(t1) = x1, ..., fn(tn) = xn, where each ti is a term and each xi is a variable. We are now ready to define the completion of a flattened program P:

1. Collect-All Assumption. For each set-valued function fs defined by a collection of n subset assertions, fs(t1) ⊇ s1 :- B1, ..., fs(tn) ⊇ sn :- Bn, where Bi stands for the body of each assertion, we add the single equality assertion

    $f_s(v) = \bigcup_{i=1}^{n} \bigcup \{ s_i : (\exists y_i)\,[v =_{ac} t_i \wedge B_i] \}$

where v is a new variable not appearing in any of the subset assertions defining fs, yi are all the variables of the i-th assertion excluding those variables in si, and Bi can use any variable from the i-th assertion.

2. Emptiness-as-Failure Assumption. We introduce the device of an undefined element, ⊥, in order to denote the "value" of an irreducible element-valued expression that is not a term. There are two cases to the emptiness-as-failure assumption:

Case (i): For each element-valued function fe defined by a collection of n equality assertions, fe(t1) = u1 :- B1, ..., fe(tn) = un :- Bn, we add fe(x) = ⊥, for each x ∈ G − T, where G is the universe of ground terms and T is the set of ground terms obtained by instantiating each of the terms t1, ..., tn on the l.h.s. of all assertions for fe.

Case (ii): In the case of a set-valued function fs, we add fs(x) = φ, for each x ∈ G − T.

Finally, we have {⊥} = φ, fe(..., ⊥, ...) = ⊥, and fs(..., ⊥, ...) = φ. We shall refer to the completion of a program P as comp(P). For the parent-and-ancestor example, the collect-all assumption results in the following equality assertions.

    $p(v) = \bigcup \{ \{e\} : f(v) = e \} \cup \bigcup \{ \{e\} : m(v) = e \}$
    $allanc(v) = v \cup \bigcup \{ s : (\exists x, t)\,[\{x \mid t\} =_{ac} v \wedge anc(x) = s] \}$

And the emptiness-as-failure assumption results in the following assertions:

    p(⊥) = φ
    anc(⊥) = φ
    allanc(⊥) = φ

3.2. Model-theoretic Semantics

We define the model-theoretic semantics starting from the universe of terms $U_P$ (similar to the Herbrand universe), which is the set of ground terms of the program P augmented with the undefined element ⊥. By P, we refer to the flattened form of the source program. The base of interpretations $B_P$ for a program P is the set of unconditional ground equality and subset assertions derived from P. A model of a program P is an interpretation $I \subseteq B_P$ such that I satisfies every assertion in P, i.e., for all ground instances $H_g$ :- $E_{g1}, \ldots, E_{gn}$, whenever $\{E_{g1}, \ldots, E_{gn}\} \subseteq I$, we also have $H_g \in I$. In addition we assume that a model satisfies the following equality theory for constructors: (i) the associative, commutative, identity and idempotent properties of ∪, (ii) c(..., ⊥, ...) = ⊥ for any element-valued constructor c, and (iii) {⊥} = φ. Terms that are not equal by this equality theory are assumed to be equal only if they are identical.

Definition 1: An assertion A is said to be a logical consequence of a program P, denoted P ⊨ A, if every model of P is also a model of A.

Definition 2: The model-theoretic semantics of P is $M_P = \{ A : comp(P) \models A \}$, where A is an unconditional ground assertion.

Note that an interpretation I is a model of an assertion $f_s(g) = \bigcup_{i=1}^{n} \bigcup \{ s_i : (\exists y_i)\,[g =_{ac} t_i \wedge B_i] \}$, where g is a ground term, if whenever $f_s(g) = s \in I$, we have $s = \bigcup \{ s_i : I \text{ models } (\exists y_i)\,[g =_{ac} t_i \wedge B_i] \}$.

Definition 3: Given a program P and a query expression G, we say θ is a correct answer substitution, where θ binds all variables in G to ground terms, iff comp(P) ⊨ Gθ.

Proposition 1: comp(P) ⊨ P.

We note that if P is terminating, there is a unique model for comp(P). However, non-terminating subset-equational programs do not have unique models, as illustrated by the following program defining two set-valued functions f and g:

    f(x) ⊇ {1}
    f(x) ⊇ g(x)
    g(x) = g([x])

Note that the completion adds the following assertions:

    f(x) = {1} ∪ ∪{s : g(x) = s}
    f(⊥) = φ
    g(⊥) = φ

The above program has an infinite number of models of the form {f(⊥) = g(⊥) = φ, f(x) ⊇ {1}, f(x) = {1} ∪ s, g(x) = s}, where x ≠ ⊥ and each different model would exercise a different choice for the ground set s. However, because of the following proposition, there is a least model, which is, in this example, {f(⊥) = g(⊥) = φ, f(x) ⊇ {1}}, where x ≠ ⊥.

Proposition 2: The intersection of all models of comp(P) is a model.

Theorem 1: $M_P = \bigcap \{ M : M \text{ is a model for } comp(P) \}$.

We can give a fixed-point characterization of $M_P$ using an immediate-consequence operator. We omit its presentation in this paper to stay within the page limits; the interested reader is referred to [J89] for its details.

3.3. Operational Semantics

We define the operational semantics for the flattened program P supplemented with equality assertions for the emptiness-as-failure assumption. We define a rewriting relation → by considering the two mutually exclusive cases in rewriting an innermost expression.

Case 1: Given variants of subset assertions, f(t1) ⊇ s1 :- B1, ..., f(tn) ⊇ sn :- Bn, and a flattened query expression G ≡ g1, ..., gm, where g1 is f(t) = x and x is a variable, we define the rewriting relation → such that, if matching t with t1, ..., tn yields respectively the (finitely many) substitutions $\theta_{11}, \ldots, \theta_{1k_1}, \ldots, \theta_{n1}, \ldots, \theta_{nk_n}$, then

    $G \rightarrow (B_1\theta_{11}), \ldots, (B_1\theta_{1k_1}), \ldots, (B_n\theta_{n1}), \ldots, (B_n\theta_{nk_n}), (g_2, \ldots, g_m)\sigma$, where $\sigma \equiv \{ x \leftarrow \bigcup_{i,j} (s_i\theta_{ij}) \}$.

Case 2: Given variants of equality assertions, f(t1) = u1 :- B1, ..., f(tn) = un :- Bn, and a flattened query expression G ≡ g1, ..., gm, where g1 is f(t) = x and x is a variable, we define the rewriting relation → such that, if matching t with t1, ..., tn yields respectively the substitutions $\theta_{11}, \ldots, \theta_{1k_1}, \ldots, \theta_{n1}, \ldots, \theta_{nk_n}$, then

    $G \rightarrow (B_i\theta_{ij}), (g_2, \ldots, g_m)\sigma$ for some i and j, where $\sigma \equiv \{ x \leftarrow (u_i\theta_{ij}) \}$.

Definition 4: Given a program P and query G, we say that θ restricted to the variables in G is the computed answer of a derivation $G \equiv G_1 \rightarrow G_2 \rightarrow \ldots \rightarrow G_k \equiv []$ if $\theta \equiv \sigma_1 \ldots \sigma_k$, i.e., the composition of the substitutions $\sigma_1, \ldots, \sigma_k$ at each step.

The following theorems express the correctness of the operational semantics.

Theorem 2 (soundness): Given a program P and top-level goal sequence G, the computed answer θ is a correct answer.

Theorem 3 (completeness): Given a program P and top-level goal sequence G, if there exists a correct answer θ, then there is a →-derivation such that θ is the computed answer.

4. Stratified Subset-Equational Programs

4.1. Closure Functions

It turns out that, with a suitably enhanced operational procedure, one can compute more information than is defined by the least model, $M_P$. A trivial example is shown below.

    f(x) ⊇ {x}
    f(x) ⊇ f(x)

The above program is nonterminating (according to the operational semantics of section 3.3), and its least model is of the form {f(⊥) = φ, f(x) ⊇ {x}}, for x ≠ ⊥. That is, in the least model, we know that f(x) contains {x}, but we do not know which set f(x) is equal to. In this example, although there are an infinite number of models which contain f(x) = {x} ∪ s for some ground set s, the least model does not have any of these assertions. Note also that the nontermination here arises because of a recursive call with an argument identical to that of an outer call, e.g., f(x) = {x} ∪ f(x). Such calls can be detected with the use of a memo-table (or extension-table). We refer to sets defined cyclically in this manner as set closures, and functions defined using such sets as closure functions. All closure functions in this language are set-valued functions. A more useful example is the function reach below for finding the set of reachable nodes of a graph g, represented as a set of ordered pairs, starting from some given node v.

    reach(v, g) ⊇ {v}
    reach(v, g) ⊇ allreach(adjacent(v, g), g)
    allreach({x | _}, g) ⊇ reach(x, g)
    adjacent(v, {[v, w] | _}) ⊇ {w}

In general, closure functions are useful whenever one is interested in defining the smallest set satisfying some property, e.g., in dataflow analysis in compilers. We require the programmer to identify all closure functions, say through annotations; otherwise the overhead of memoizing function calls can become excessive. In order to permit the use of closure functions as arguments to non-closure functions and still maintain our ability to define a suitable declarative semantics, we impose three conditions:

(i) We stratify all assertions into n levels, where level 1 assertions define only non-closure functions, and each of the remaining levels is divided into two sets of assertions: assertions defining closure functions and assertions defining non-closure functions. In the above example, adjacent would be at level one, and the closure functions reach and allreach would be at level two.

(ii) We require that all closure functions be terminating (after memoization). We can relax this requirement, but we will assume it in this paper, for the sake of simplicity in presentation.

(iii) We require that all closure functions at a given level are defined in terms of one another using subset-monotonic functions, but may be defined in terms of any (closure or non-closure) function from a lower level. A set-valued function g is said to be subset-monotonic in a particular argument iff s1 ⊆ s2 implies g(..., s1, ...) ⊆ g(..., s2, ...). For example, set-difference, x − y, is not subset-monotonic in its second argument, but is subset-monotonic in its first argument.
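The effect of detecting a cyclic call can be illustrated on the reach function of this section. The Python sketch below is a simplification and not the memo-table procedure itself: it only tracks the calls currently in progress (the hypothetical parameter `in_progress`), lets a re-entered call contribute the empty set, and does not reuse the results of completed calls. The sample graph is invented for illustration.

```python
def adjacent(v, g):
    """adjacent(v, {[v, w] | _}) ⊇ {w}: collect w over every matching pair of g."""
    return frozenset(w for (u, w) in g if u == v)

def reach(v, g, in_progress=frozenset()):
    """reach(v, g) ⊇ {v} and reach(v, g) ⊇ allreach(adjacent(v, g), g)."""
    if v in in_progress:
        return frozenset()            # cyclic call: contributes φ
    table = in_progress | {v}         # record this call as in progress
    result = {v}
    for x in adjacent(v, g):          # allreach({x | _}, g) ⊇ reach(x, g)
        result |= reach(x, g, table)
    return frozenset(result)

# Hypothetical graph with the cycle a -> b -> c -> a.
g = frozenset({("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("e", "a")})
print(sorted(reach("a", g)))  # ['a', 'b', 'c', 'd']
```

Without the in-progress check, the call reach("a", g) would recurse forever around the cycle, which is the nontermination that closure functions are meant to avoid.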

4.2. Semantics of Stratified Subset-Equational Programs

Definition 5: Assuming $P_j$ are all the assertions at level j and $P_j^c$ are just the assertions defining closure functions at level j, $M_n$ defines the model-theoretic semantics of a stratified program with n levels, where for j > 1, $M_j = A_j \cup C_j$, where

    $A_j = \{ A : comp(P_j) \cup M_{j-1} \models A \}$,
    $C_j = \{ f_{ji}(t) = s : f_{ji} \text{ is a closure function at level } j \wedge (\exists M)\,[M \text{ models } P_j^c \wedge f_{ji}(t) = s \in M \wedge (\forall M_2)\,[M_2 \text{ models } P_j^c \wedge f_{ji}(t) = s_2 \in M_2 \text{ implies } s \subseteq s_2]] \}$,

and $M_1 = \{ A : comp(P_1) \models A \}$.

At level j, $A_j$ defines the semantics of non-closure functions, and, in the case of closure functions, it also defines those subset assertions that are logical consequences of the program. The equality assertions for a closure function $f_{ji}$ at level j are defined by $C_j$. For each $f_{ji}(t)$, the defined value s is the smallest set among those sets defined for $f_{ji}(t)$ in the different models of $P_j^c$.

To define an operational semantics, we define the → relation between pairs of the form ⟨G, T⟩, where G is a goal-sequence as before, and T is a memo-table, i.e., a set of assertions of the form f(t) = u, where f is a closure function, t is a ground term, but u may be non-ground. Initially T = φ. Given a pair ⟨G, T⟩, let the first goal in G be g1 ≡ f(t) = v. We define ⟨G, T⟩ → ⟨G', T'⟩ as follows.

(i) If f is a non-closure function, we define G' as in section 3.3. If σ is the computed substitution for v in deriving G', we define T' ≡ Tσ.

(ii) If f is a closure function and there is no assertion of the form f(u) = w in T for any u =ac t, we define G' as in section 3.3, and T' ≡ (T ∪ {f(t) = v})σ, where σ is the computed substitution for v in deriving G'.

(iii) If f is a closure function and f(u) = w is in T for some u =ac t, we define G' ≡ (G − [g1])σ, and T' ≡ Tσ, where (G − [g1]) represents the sequence G with g1 removed, and σ ≡ {v ← w0}, where w0 is the smallest solution to the equation v = w, and may be obtained by replacing all occurrences of v in w by φ.

Assuming that the correct answer and computed answer are re-stated relative to the new semantic definitions, we have the following correctness results.

Theorem 4 (soundness): For a stratified program P and goal G, the computed answer θ is a correct answer.

Theorem 5 (completeness): Given a stratified program P and goal G, if there exists a correct answer θ, there is a →-derivation such that θ is the computed answer.

For the example defining f and g in section 4.1, the top-level query has a successful →-derivation with the computed answer {v ← {1, [1]}}. The memo-table at the end of the derivation would be {f(1) = {1, [1]}, g([1]) = {1, [1]}}.

5. Subset-relational Programming

We now briefly discuss the combination of subset and relational assertions. We begin with simple subset-relational programming in which only relational goals may occur in the body of a subset assertion. We then consider both equality and relational goals in the body.

5.1. Simple Subset-relational Programming

The general form of a subset assertion in this paradigm is: f(t) ⊇ s :- p1(t1), ..., pn(tn), where t is any term, s is a set, each pi is a predicate defined through definite clauses, and each ti is a term; as before, we can assume a single term rather than a sequence of terms without loss of generality. Note that these definite clauses can contain set-terms, and therefore are a special case of the programs considered in [JLM84]. The declarative meaning of the above assertion is that, for all its ground instances, the function f operating on ground t contains a ground set s if the condition in the body, p1(t1), ..., pn(tn), is true. When a set-valued function fs is defined by a collection of n such subset assertions, fs(t1) ⊇ s1 :- B1, ..., fs(tn) ⊇ sn :- Bn, where Bi stands for the body of each assertion, as shown above, we define its completion similar to that in section 3.1, as follows.

    $f_s(v) = \bigcup_{i=1}^{n} \bigcup \{ s_i : (\exists y_i)\,[v =_{ac} t_i \wedge B_i] \}$

where v is a new variable not appearing in any of the subset assertions defining fs, yi are all the variables of the i-th assertion excluding those variables in si, and Bi may use any variable from the i-th clause. As before, we assume that resulting sets are finite.

The model-theoretic semantics of a simple subset-relational program P is defined by first taking the least model of the definite-clause part of P, and then using this least model to define the model for subset assertions and their completion.

The operational semantics uses restricted a-c matching with subset assertions and, because clauses may have set-terms, restricted a-c unification (defined in section 5.2) with relational assertions. We require that arguments to a function defined by subset assertions must be ground. Given a top-level goal f(u) = x, where u is a ground term and x is a variable, if the k-th subset assertion defining f is $f(t_k) \supseteq s_k$ :- $B_k$, and matching u and $t_k$ yields $p_k$ matching substitutions, $\theta_{k1}, \ldots, \theta_{kp_k}$, the body $B_k\theta_{ki}$ is then reduced by SLD-resolution (except that restricted a-c unification is used at each step). If the computed answers from each successful SLD-derivation of $(B_k\theta_{ki})$ are $\sigma_{ki1}, \ldots, \sigma_{kiq_i}$, then the solution for x is

    $x \leftarrow \bigcup_{k=1}^{n} \bigcup_{i=1}^{p_k} \bigcup_{j=1}^{q_i} (s_k\theta_{ki}\sigma_{kij})$

The following points should be noted: We require that all SLD-derivations from B be terminating; each derivation could be either successful or finitely-failed [L87]. We also require that there be only a finite number of derivations, because we require all sets to be finite. If all SLD-derivations are finitely-failed, considering each subset assertion and each a-c match, the computed set is φ.

5.1 Examples Negation-by-Failure. Negation-by-failure can be easily simulated in this paradigm. For example, a top-level goal

t

not p( )

where p is a predicate de ned by de nite clauses and t is a ground term, can be simulated by rst de ning the subset assertion all p(x)  f1g :- p(t), and then using the top-level goal all p(t) = . Note that arguments to a subset assertion must be ground and all reductions from p(t) must terminate|requirements that are also needed for the correctness of negationby-failure [L87]. Setof. Simple subset-relational programs can be used to simulate Prolog's setof feature in a more declarative manner. For example, assuming the usual append/3 relation, we can simulate the following Prolog-like clause for nding the partitions of a list, parts(list, answer) :- setof([x|y], append(x, y, list), answer)

by the following subset assertion:

parts(list) ⊇ {[x|y]} :- append(x, y, list).

The completion of the above assertion gives us the desired set of all possible partitions, as follows.

parts(list) = ∪ { {[x|y]} : append(x, y, list) }.

Note that variables not occurring in the head of a subset assertion are considered to be existential variables, e.g., the variable y in the following program,

prefixes(list) ⊇ {x} :- append(x, y, list).

Set-terms in Relations. The use of set terms in relations makes possible some interesting definitions. Consider, for example, the following definition of the permutations of the elements of a set.

set-to-list(∅, [ ])
set-to-list({x | s}, [x | t]) :- set-to-list(s, t)
permutations(set) ⊇ {list} :- set-to-list(set, list)

Because the argument to permutations is assumed to be ground, the first argument in the invocation of set-to-list in the body of permutations will also be ground. Because matching an n-element set against the second assertion for set-to-list yields n different matches, each of these matches is considered separately in recursively reducing the body of set-to-list. In this manner, all permutations for a top-level invocation of permutations are computed.
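The examples above can be imitated in an ordinary language to see which sets they denote. The Python sketch below is an illustration only, not the operational semantics of the paper: lists are modelled as tuples, the Prolog term [x|y] as the tuple `(x,) + y`, and the helper `match_set` assumes that matching {x | s} against a ground set produces one match per element, with s bound to the remaining elements.

```python
from typing import FrozenSet, Iterator, List, Tuple

def append_answers(lst: tuple) -> List[Tuple[tuple, tuple]]:
    """All (x, y) such that append(x, y, lst) holds."""
    return [(lst[:i], lst[i:]) for i in range(len(lst) + 1)]

def parts(lst: tuple) -> FrozenSet[tuple]:
    # parts(list) ⊇ {[x|y]} :- append(x, y, list).
    return frozenset((x,) + y for x, y in append_answers(lst))

def prefixes(lst: tuple) -> FrozenSet[tuple]:
    # prefixes(list) ⊇ {x} :- append(x, y, list).   (y is existential)
    return frozenset(x for x, _ in append_answers(lst))

def match_set(ground: frozenset) -> Iterator[Tuple[object, frozenset]]:
    """Assumed matches of the pattern {x | s} against a ground set."""
    for x in sorted(ground):
        yield x, ground - {x}

def set_to_list(ground: frozenset) -> Iterator[tuple]:
    # set-to-list(∅, [ ])
    if not ground:
        yield ()
        return
    # set-to-list({x | s}, [x | t]) :- set-to-list(s, t)
    for x, s in match_set(ground):
        for t in set_to_list(s):
            yield (x,) + t

def permutations(ground: frozenset) -> FrozenSet[tuple]:
    # permutations(set) ⊇ {list} :- set-to-list(set, list)
    return frozenset(set_to_list(ground))
```

For a three-element set, `match_set` produces three matches at the outer level, two at the next, and one at the last, so `permutations` collects six lists, in line with the n different matches mentioned in the text.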

5.2 Restricted A-C Unification

We briefly describe the essential cases of this algorithm. The set terms to be considered are $\emptyset$, $\{t\}$, and $\{t \mid s\}$. Note that $\emptyset$ is unifiable only with itself and a variable. The singleton $\{t\}$ can be viewed as $\{t \mid \emptyset\}$, and hence the two potentially unifiable cases are:

1. $\{t_1 \mid s_1\} = \{t_2 \mid s_2\}$. This equality can be reduced to either one of the following two sets of equalities, the solution of each set giving a different solution:
   (a) $t_1 = t_2$; $s_1 = s_2$
   (b) $s_1 = \{t_2 \mid z\}$; $s_2 = \{t_1 \mid z\}$ (z is a distinct variable)

2. $x = \{t_1 \mid s_1\}$. There are three cases to consider:
   (a) x does not occur in $\{t_1 \mid s_1\}$. The equality is solved with the unifying substitution $x \leftarrow \{t_1 \mid s_1\}$.
   (b) x occurs in $t_1$. The equality is unsolvable.
   (c) x does not occur in $t_1$ but occurs in $s_1$. The equality is solvable iff $s_1$ is either $x$ or $\{t_2 \mid x\}$ or $\{t_2 \mid \{t_3 \mid x\}\}$, etc., where x does not occur in any $t_i$. The unifying substitution is respectively either $x \leftarrow \{t_1 \mid z\}$, or $x \leftarrow \{t_1 \mid \{t_2 \mid z\}\}$, or $x \leftarrow \{t_1 \mid \{t_2 \mid \{t_3 \mid z\}\}\}$, etc., where z is a distinct variable.

With the provision of restricted a-c unification, we can freely permit nonground set terms as arguments to predicates, but we must still insist on ground terms as arguments to functions defined by subset assertions.
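Cases 1 and 2 above can be sketched as two small functions over a term representation. The Python fragment below is a hedged illustration, not the authors' algorithm in full: the classes `Var` and `Cons`, the constant `EMPTY`, and the helper names are assumptions made for this sketch, constants are plain strings, and only the decomposition step of case 1 is shown (the resulting equalities would still have to be solved recursively).

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, List, Optional, Tuple, Union

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Cons:
    """The set term {head | tail}."""
    head: "Term"
    tail: "Term"

EMPTY = "∅"  # the empty set; every other string is treated as a constant
Term = Union[str, Var, Cons]

_fresh = count()

def fresh_var() -> Var:
    """A distinct variable (the z of the text)."""
    return Var(f"_z{next(_fresh)}")

def occurs(x: Var, t: Term) -> bool:
    if isinstance(t, Cons):
        return occurs(x, t.head) or occurs(x, t.tail)
    return t == x

def decompose(a: Cons, b: Cons) -> List[List[Tuple[Term, Term]]]:
    """Case 1: the two alternative sets of equalities for a = b."""
    z = fresh_var()
    return [
        [(a.head, b.head), (a.tail, b.tail)],                    # (a)
        [(a.tail, Cons(b.head, z)), (b.tail, Cons(a.head, z))],  # (b)
    ]

def bind_var(x: Var, st: Cons) -> Optional[Dict[Var, Term]]:
    """Case 2: solve x = {t1 | s1}; None means unsolvable."""
    if not occurs(x, st):                      # (a)
        return {x: st}
    elems, tail = [], st
    while isinstance(tail, Cons):              # peel off t1, t2, ...
        elems.append(tail.head)
        tail = tail.tail
    if tail != x or any(occurs(x, t) for t in elems):
        return None                            # (b), or (c) not satisfied
    result: Term = fresh_var()                 # (c): rebuild over a fresh tail z
    for t in reversed(elems):
        result = Cons(t, result)
    return {x: result}
```

For instance, `bind_var(Var("x"), Cons("a", Cons("b", Var("x"))))` returns a binding of x to {a | {b | z}} for a fresh z, whereas `bind_var(Var("x"), Cons(Var("x"), EMPTY))` returns `None` because x occurs in the element position.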

5.3 Stratified Subset-relational Programming

Simple subset-relational programs do not provide equality goals on the r.h.s. of rules, and hence are restricted to a "one-level" collect capability. In other words, the result of forming a set cannot be used in any computation; it is simply reported to the user at the top-level. In order to define the more general case where the resulting set is used in other computations, we introduce the class of stratified subset-relational programs. The need for stratification should be clear, given that subset-relational programs can simulate negation-by-failure. Basically, we wish to avoid a problem analogous to recursive negation, so that a well-defined semantics can be given. Program assertions are therefore partitioned into n levels, so that equality goals appearing in any level may refer only to subset assertions defined at a lower level. With such a partition, models are constructed incrementally, starting at the lowest level.

Stratified subset-relational programs are closely related to the programs of the LDL1 language [BNRST87]. For example, an LDL1 assertion of the form p(X, <Y>) :- B, where <...> is their 'grouping construct', may be translated as p(X) ⊇ {Y} :- B. Despite LDL1's obvious benefits for defining sets in logic languages, we think the subset assertion plays a useful role because of its ability not only to collect multiple solutions to the goal in its r.h.s., but also to use the multiple matches on its l.h.s. The main difference from the LDL1 approach is that subset assertions are not intended as a database query language, and hence their operational semantics uses a conventional 'top-down' evaluation.
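The level partition described above can be checked mechanically once the references made through equality goals are known. The Python sketch below is a simplification under stated assumptions: the input `refs` is a hypothetical map from each set-valued function to the set-valued functions whose results its defining assertions use in equality goals (directly or through called predicates), and the function name `stratify` is likewise an assumption of this sketch.

```python
from typing import Dict, Set

def stratify(refs: Dict[str, Set[str]]) -> Dict[str, int]:
    """Assign each function a level so that everything it references through
    an equality goal sits at a strictly lower level. Raises ValueError when
    no such partition exists, i.e. when the references are cyclic."""
    level: Dict[str, int] = {}
    visiting: Set[str] = set()

    def visit(f: str) -> int:
        if f in level:
            return level[f]
        if f in visiting:
            raise ValueError(f"not stratifiable: cycle through {f}")
        visiting.add(f)
        # One level above the highest function that f refers to.
        lvl = 1 + max((visit(g) for g in refs.get(f, ())), default=0)
        visiting.discard(f)
        level[f] = lvl
        return lvl

    for f in sorted(refs):
        visit(f)
    return level
```

With `refs = {"a": set(), "b": {"a"}, "c": {"a", "b"}}` the functions land on levels 1, 2, and 3, so a model could be built bottom-up in that order; a program in which two functions refer to each other through equality goals is rejected, which mirrors the problem of recursive negation mentioned in the text.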

6. Conclusions

We mention the salient points about the language concepts and semantics we have presented, in comparison with those for conventional logic programs:

1. Unlike Horn clauses augmented with negation-by-failure, the completion of a subset-equational program does not lead to inconsistency; basically, we do not provide a ∉ or non-membership primitive in the language. The declarative (or model-theoretic) semantics of a subset-equational program P can be expressed in terms of the logical consequences of comp(P). Note that models for comp(P) have equality assertions and subset assertions, although our interest at the top-level is in the term that is equal to a given query expression. The operational semantics, based on a-c matching and innermost-first reduction, is correct (sound and complete) with respect to the declarative or fixed-point semantics.

2. Stratification in subset-equational programs is needed to make possible a fixed-point semantics for closure functions. Our semantics requires the auxiliary functions used in defining one closure function in terms of another (at the same level) to be subset-monotonic, and all closure functions to be terminating; the latter assumption can be relaxed, but we have not explored this issue in this paper. The operational semantics corresponding to the above definition essentially uses a memo-table in order to avoid nontermination from identical nested calls.

3. Subset-relational programs provide a way to deal with sets in a more rigorous manner than Prolog's setof construct. We briefly discussed the declarative and procedural semantics for such programs. The need for stratified subset-relational programs was also discussed, as well as the need for restricted a-c unification.

At this stage of our research, we have an implementation for simple subset-equational programs [JN88], but not for stratified subset-equational or subset-relational programs. An implementation of these latter two classes of programs is clearly needed to facilitate better understanding of their use in practice. We would also like to relax our restrictions on finite sets, eager reduction, and first-order terms, and examine the semantic and implementation issues of the resulting language.

There has been considerable interest in the use of sets in functional and logic programming [W82, T85, DFP86, JS86, BNRST87, JP87, R87, K88, SJ89]. It is easy to show that subset-equational programs can be used to directly encode the relative set construct of Miranda [T85], while subset-relational programs can express the absolute set construct of Darlington [DFP86] and Robinson [R87]. The paradigms discussed in this paper, however, do not include higher-order operations and lazy evaluation in their present development. An interesting use of lazy rewriting is discussed in [N89], where it is shown how one can prune (by lazy evaluation) the generation of elements of a set that is defined via nonconfluent rewrite rules. Quantifiers over sets are described in [K88], and, as discussed earlier, a 'collect all' capability is provided in [BNRST87] through the use of a grouping construct. In comparison with these approaches, the novel idea of subset-equational and subset-relational programming is that very concise, clear and efficient programs can be obtained when operations are formulated through subset assertions and much of the iteration over sets is moved into the matching process.

Acknowledgments

This research was supported by grants DCR-8603609 and CCR-8802282 from the National Science Foundation, and was carried out while Jayaraman was at the University of North Carolina. We thank the anonymous referees for their comments and suggestions. Thanks also to Deepak Kapur for discussions on A-C unification.

References

[BKN85] D. Benanav, D. Kapur, and P. Narendran, "On the complexity of matching problems," In Rewriting Techniques and Applications, pp. 417-429, Dijon, May 1985.
[BNRST87] C. Beeri, S. Naqvi, R. Ramakrishnan, O. Shmueli, S. Tsur, "Sets and Negation in a Logic Database Language (LDL1)," In 6th ACM PODS, pp. 21-37, 1987.
[DFP86] J. Darlington, A.J. Field, and H. Pull, "Unification of Functional and Logic Languages," In DeGroot and Lindstrom (eds.), Logic Programming, Relations, Functions and Equations, pp. 37-70, Prentice-Hall, 1986.
[JLM84] J. Jaffar, J-L. Lassez, M.J. Maher, "A theory of complete logic programs with equality," In J. of Logic Programming, pp. 211-223, 1984.
[JS86] B. Jayaraman and F.S.K. Silbermann, "Equations, Sets, and Reduction Semantics for Functional and Logic Programming," In 1986 ACM Conf. on LISP and Functional Programming, pp. 320-331, MIT, August 1986.
[JP87] B. Jayaraman and D.A. Plaisted, "Functional Programming with Sets," In 3rd Int'l Conf. on Functional Prog. Langs. and Comp. Arch., pp. 194-210, Portland, September 1987.
[JN88] B. Jayaraman and A. Nair, "Subset-logic Programming: Application and Implementation," In 5th Int'l Logic Prog. Conf., pp. 843-858, Seattle, August 1988.
[J89] B. Jayaraman, "Programming with Equations and Subset Assertions," Submitted for publication.
[K88] G.M. Kuper, "On the Expressive Power of Logic Programming Languages with Sets," In 7th ACM PODS, pp. 10-14, 1988.
[L87] J.W. Lloyd, "Foundations of Logic Programming," Springer-Verlag, 1987.
[N89] S. Narain, "Optimization by Nondeterministic, Lazy Rewriting," In Rewriting Techniques and Applications, Springer-Verlag LNCS 355, pp. 326-342, Chapel Hill, April 1989.
[O85] M.J. O'Donnell, "Equational Logic as a Programming Language," MIT Press, 1985.
[P72] G. Plotkin, "Building-in equational theories," In Machine Intelligence 7, pp. 73-90, 1972.
[P86] D.A. Plaisted, "Nondeterminism by Associative-Commutative Rewriting," Unpublished Report, Department of Computer Science, UNC-Chapel Hill, March 1986, 30 pages.
[R87] J.A. Robinson, "Beyond LOGLISP: Combining functional and relational programming in a reduction setting," In Machine Intelligence 11, 1987.
[SJ89] F.S.K. Silbermann and B. Jayaraman, "Set Abstraction in Functional and Logic Programming," In 4th Int'l Conf. Functional Prog. and Comp. Architecture, London, U.K., September 1989.
[T85] D. A. Turner, "Miranda: A non-strict functional language with polymorphic types," In Conf. on Functional Prog. Langs. and Comp. Arch., pp. 1-16, Nancy, France, September 1985.
[vEK76] M. H. van Emden and R. A. Kowalski, "The Semantics of Predicate Logic as a Programming Language," JACM 23, No. 4, pp. 733-743, 1976.
[W82] D.H.D. Warren, "Higher-order extensions to PROLOG: are they needed?" In Machine Intelligence 10, pp. 441-454, 1982.
