Preference Logic Programming

Kannan Govindarajan

Dissertation Proposal
Submitted to the Faculty of the Department of Computer Science of the
State University of New York at Buffalo
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy

April 1995

Dissertation Committee:
Dr. Bharadwaj Jayaraman (Chair)
Dr. Kenneth W. Regan
Dr. Alan L. Selman
Dr. Suryanarayana M. Mantha (Xerox Corporation)

Abstract

This research is concerned with the semantics, applications, and implementation of a declarative language for specifying constraint optimization and relaxation problems. These problems arise in diverse application areas such as document layout, scheduling, and parsing. Existing constraint languages are ill-equipped to express such problems, and provide ad hoc constructs whose semantics is given in an unsatisfactory manner. We propose a principled extension of constraint logic programming for solving these problems. Our proposed paradigm is called preference logic programming, and we have shown how to formulate constraint optimization problems as well as heuristics for hard combinatorial problems in this paradigm. We are presently investigating the kinds of constraint relaxation problems that can be naturally expressed in the paradigm. We also propose to explore the use of preferences in the grammatical and database contexts. We have developed an extension of logic grammars, called preference logic grammars, capable of expressing the criteria for disambiguating ambiguous parses. We introduce preference datalog programs as restricted preference logic programs with database applications in mind.

Contents

1 Research Problems and Significance
2 Background and Related Work
  2.1 Background
    2.1.1 Logic Programming
    2.1.2 Constraint Logic Programming
    2.1.3 Logic of Preference
  2.2 Related Work
3 Current Status of Work
  3.1 Preference Logic Programming
    3.1.1 Syntax of Preference Logic Programs
    3.1.2 Programming Heuristics with Preference
    3.1.3 Semantics of Preference Logic Programs
  3.2 Preference Logic Grammars
    3.2.1 Syntactic Ambiguity Resolution using Preference
    3.2.2 Semantic Ambiguity Resolution using Preference
  3.3 Preference Datalog Programs
4 Proposed Work
  4.1 Remaining Issues in the Paradigm
    4.1.1 Semantics
    4.1.2 Constraint Relaxation
    4.1.3 Generalized Aggregation and Constraints in Databases
  4.2 Implementation
5 Directions for Further Research
6 Summary and Conclusions

1 Research Problems and Significance

The concept of preference plays a fundamental role in the sciences. It naturally gives rise to the notion of extrema (minima and maxima), which is very important in the physical sciences. For example, in computer science, we are generally interested in minimizing computational resources (time, space, communication, etc.) to perform a task. In theoretical physics, minimization principles, such as least action and minimal coupling, are the cornerstones of classical mechanics and field theory. The concept of preference has also been studied in other fields including decision theory, economics, philosophical logic, artificial intelligence, operations research, and ethics [5, 6, 12, 21, 96]. In a very broad sense, preference is a binary relation over some appropriate domain. von Wright [96] suggests that the preference relation of primary importance is the one between states of affairs that are captured by possibly infinite sets of propositions. While our interest in preference is inspired by work in these various areas, our immediate goal in extending logic programming with preference is motivated by a host of practical applications, ranging from document processing to scheduling. In formulating these applications as logic programs, we require the ability to compare alternative solutions to a set of constraints and choose the best one. This kind of selection or optimization is a meta-level operation, and therefore falls outside the standard (constraint) logical framework. To appreciate our proposed approach, we note that in general there are three components to an optimization problem:

1. specification of the constraints of the problem;
2. specification of what is to be optimized; and
3. specification of the criteria for determining the optimal solution.
While several approaches to optimization have been proposed in the logic programming and deductive databases literature [20, 23, 32, 43, 48, 61, 67, 77, 84], they do not allow one to fully specify all three of the above components. Our thesis is that the concept of preference provides a natural, modular, declarative, and efficient way of expressing the three components of an optimization problem. We describe a principled extension of constraint logic programming, called preference logic programming, for specifying these problems. This approach also enables us to declaratively characterize certain kinds of relaxation, an operation that naturally arises when all the constraints specifying a problem cannot be satisfied, yet one wants the best possible solution to the constraints. In the framework of preference logic programming, the preferences can be viewed as "soft" constraints, which can be relaxed in a disciplined manner. We investigate the semantics of preference logic programs using concepts from modal logic. The use of modal logic should not be surprising, since optimization is a meta-level concept. Essentially, we propose to give a possible-worlds semantics for a preference logic program in which each world is a model for the constraints of the program, and an ordering over these worlds is enforced based on the truth of certain propositions, called preference criteria. We introduce the concept of preferential consequence to refer to truth in the optimal worlds, in contrast with logical consequence, which refers to truth in all worlds. An optimal answer to a query G can then be defined as a substitution θ such that Gθ is a preferential consequence of the preference logic program. We also propose to develop a computational scheme called PTSLD-derivation, which stands for Pruned Tree SLD-derivation,

for efficiently computing the optimal answers to queries. Because we are not carrying out general theorem proving in the logic of preference, we are able to devise an efficient derivation procedure. We propose to examine the use of preference in two important areas: grammars and databases. We introduce preference logic grammars (PLGs) as a means for formulating optimal parsing problems. Optimal parsing is an extension of (context-free grammar) parsing wherein costs are associated with the different, ambiguous parses of a string and the preferred parse is the one with least cost. Many problems can be viewed as optimal parsing problems, e.g., code generation and document layout. Motivated by potential applications in deductive databases, we introduce preference datalog programs (PDPs) as preference logic programs with no uninterpreted function symbols. Preference datalog programs can be thought of as adding to datalog [89] the capability of expressing criteria for choosing among alternative solutions to a query. This can be viewed as a generalization of extreme-value aggregate operations in deductive databases [91]. In both the grammar and database contexts, a bottom-up evaluation mechanism seems more natural, and we therefore propose to investigate extensions of bottom-up evaluation techniques for definite clause programs [72]. The rest of this proposal is organized as follows. Section 2 discusses related work in this area. Section 3 describes the current status of the work. Section 4 proposes topics that will be investigated in the dissertation. Section 5 discusses other directions for further research. Section 6 presents a summary, conclusions, and an outline of the dissertation.

2 Background and Related Work

The idea of explicitly using constraints for problem specification can be traced to the work of Sutherland [86], who was interested in designing graphical user interfaces. Constraint programming and logic programming developed independently until Steele [82] noted that there was a conceptual correspondence between them. Jaffar and Lassez [36] introduced constraint logic programming (CLP) by integrating logic programming with constraint programming. van Hentenryck [93] also showed how to incorporate constraint-solving ability into a logic programming language such as Prolog. We first briefly review relevant concepts from logic programming, constraint logic programming, and the logic of preference, and then survey closely related work.

2.1 Background

2.1.1 Logic Programming

Logic programming [59] is one of the main forms of declarative programming, which, compared with imperative programming, offers the advantages of simpler semantics, better program readability, and greater potential for efficient parallel execution. Logic programs are theories in some suitable logic; their declarative semantics is usually given in terms of logical consequence (⊨), and their operational semantics in terms of logical inference (⊢). Any logic that has a model theory, a proof theory, a soundness theorem, and, preferably, a completeness theorem is a candidate for being the basis of a logic programming language [60]. In practice, however, the logic is suitably restricted so that there exists an efficient proof procedure for its theories. For instance, while first-order logic is a good candidate for forming the basis of a programming language, the subset of the logic of interest, called definite clauses, has an efficient operational semantics:

    p(t).
    p(t) ← p1(t1) ∧ ... ∧ pn(tn).

Definite clauses have a unique minimal Herbrand model, and their logical consequences can be efficiently computed by SLD-resolution. Furthermore, definite clauses are computationally sufficient in that any partial recursive function can be suitably encoded [59]. Definite clauses, however, have serious limitations as a programming language. For example, negated goals are very useful for programming, but the syntax of definite clauses disallows them. Furthermore, the unification algorithm for first-order logic solves for syntactic equality. However, when reasoning with objects such as sets, it is desirable for the (set) terms to unify with terms that are not necessarily syntactically identical, as there are equational theories that the unification must take into account [37, 40]. Furthermore, some applications may need unification to be generalized to solving inequality constraints over some domain. Motivated by such considerations, Jaffar and Lassez [36] proposed the paradigm of constraint logic programming (CLP), which integrates constraint solving and logic programming.
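The minimal Herbrand model of a definite clause program can be made concrete with a small sketch (our own illustration, not from the proposal; the clauses are pre-grounded for simplicity): bottom-up evaluation repeatedly applies the clauses to the already-derived facts until nothing new is derivable, yielding the least model.

```python
# A minimal sketch of bottom-up evaluation of ground definite clauses.
# Each clause is (head, [body atoms]); facts have empty bodies.
def least_model(clauses):
    model = set()
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            # fire a clause when every body atom is already derived
            if head not in model and all(b in model for b in body):
                model.add(head)
                changed = True
    return model

# edge/path over a tiny graph, written out as ground clauses:
clauses = [
    ("edge(a,b)", []), ("edge(b,c)", []),
    ("path(a,b)", ["edge(a,b)"]), ("path(b,c)", ["edge(b,c)"]),
    ("path(a,c)", ["edge(a,b)", "path(b,c)"]),
]
```

Iterating to a fixpoint in this way is the bottom-up counterpart of SLD-resolution, and is the style of evaluation revisited for the grammar and database settings later in the proposal.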

2.1.2 Constraint Logic Programming

In essence, constraint logic programming extends traditional logic programming by incorporating constraints and methods for solving the constraints over a given domain. The main features of constraint logic programming languages are as follows:

1. Constraint solving replaces first-order unification.
2. Constraints can be used to specify queries as well as answers.
3. Constraints and variables are dynamically generated during execution.
4. Constraints are checked globally for satisfiability before the execution proceeds further.

Constraint logic programming schemes proposed by Jaffar and Lassez [36, 38] are parametrized by the domain of the constraints. For instance, in CLP(R), the constraints are arithmetic constraints over the reals. Programs in CLP languages are made up of clauses of the following kinds:

    p(t) ← c1(t1) ∧ ... ∧ ck(tk).
    p(t) ← c1(t1) ∧ ... ∧ ck(tk) ∧ p1(u1) ∧ ... ∧ pn(un).

Here the ci's are constraints over the domain by which the CLP system is parametrized. Constraint logic programming languages have found use in a wide variety of application areas, ranging from modelling complex problems such as electric circuit analysis and options trading to combinatorial search problems such as scheduling and DNA sequencing [38, 101]. In problems such as scheduling, the notion of preference arises naturally. For example, suppose one is scheduling a faculty meeting and each faculty member has certain "hard" constraints, e.g., times when they are teaching, and some "soft" constraints, such as preferring to meet in the afternoon rather than the morning. In this case, we prefer the schedule with the meeting in the afternoon over the schedule with the meeting in the morning, assuming that the hard constraints allow the meeting to be scheduled both in the morning and the afternoon. Many applications require constraint optimization, i.e., finding the best solution to the set of constraints. Specifying such preferences falls outside the purview of constraint logic programming. Most CLP systems have operators such as minimize and maximize, although the semantics of such meta-level operators is usually not provided. Furthermore, the criteria for ordering the solutions have no explicit status in the model theory, as the semantics of optimization (when it is discussed) is usually provided through negation [20, 23, 48, 65].
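The hard/soft distinction in the faculty-meeting example can be sketched in a few lines (our own illustration; the slot names and constraints are invented): hard constraints filter the candidate schedules, while the soft preference merely orders the feasible ones and is never allowed to override a hard constraint.

```python
# Sketch of hard vs. soft constraints for the meeting-scheduling example.
slots = ["mon_am", "mon_pm", "tue_am", "tue_pm"]
hard = [lambda s: s != "mon_am",            # hypothetical: faculty A teaches Monday morning
        lambda s: not s.startswith("tue")]  # hypothetical: faculty B is away Tuesday
prefer_pm = lambda s: s.endswith("pm")      # soft: afternoons are preferred

# hard constraints prune; the preference only selects among survivors
feasible = [s for s in slots if all(c(s) for c in hard)]
best = next((s for s in feasible if prefer_pm(s)),
            feasible[0] if feasible else None)
```

Here the preference is applied only after satisfiability of the hard constraints, which is exactly the discipline that preference logic programming makes explicit at the language level.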

2.1.3 Logic of Preference

The syntax of the logic of preference introduced by Mantha [63] extends the syntax of first-order logic with a modal operator Pf and the rule of formation: if F is a formula then so is Pf F. Following the model-theoretic tradition in modal logic enunciated by Carnap [15, 16] and Kripke [54, 55] (where the possible-worlds semantics for modal logic was introduced), Mantha [63] provides a possible-worlds semantics for preference logic. Essentially, the truth value of purely first-order formulae at any world is determined in the usual way, but a formula of the form Pf F is true at a world w if every world v where F is true is related to w by the relation ≺, i.e., w ≺ v. The intuitive meaning of the truth of Pf F at a world w is that the truth of F at any other world v is sufficient to make v better than w. F is said to be a preference criterion at w. We will be interested in frames where the relation ≺ has the property that if w1 ≺ w2 then there exists a formula F such that w1 ⊨ Pf F and w2 ⊨ F. Mantha [63] also introduced first-order preference theories and showed their usefulness in a variety of applications such as non-monotonic reasoning, deontic logic, and document processing. A preferential theory has two parts: (1) a first-order theory and (2) an arbiter. The arbiter is a conjunction of arbiter clauses of the form M1 ∧ ... ∧ Mk → Pf(L1 ∧ ... ∧ Li). The worlds in the possible-worlds semantics for a preferential theory are models for the first-order theory, and the relationship among these worlds is determined by the arbiter. Though preferential theories are powerful enough to express many optimization problems, they cannot be executed efficiently. Preference logic programs (discussed in Section 3) are restricted preferential theories which can be executed efficiently. The relationship between preference logic programs and general preferential theories is akin to the relationship between definite clauses and general first-order theories.
The semantics of preference logic programs is also given in terms of possible worlds. Each world is a model for the first-order theory of the program, and the relationship among the worlds (the ordering of the solutions) is determined by the arbiter. We are interested in preferential consequence, i.e., truth in the optimal worlds of the preference model, rather than logical consequence or validity, i.e., truth in all the models of the program. Unlike computing validity (which is known to be PSPACE-complete for the propositional version of preference logic [63]), computing the preferential consequences of a program is not very expensive.
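The truth condition for Pf over a finite frame can be sketched directly (an assumed finite encoding of worlds and of the relation ≺ on our part, not Mantha's formal definition): Pf F holds at w exactly when every world satisfying F sits above w in the preference relation.

```python
# Sketch of the possible-worlds truth condition for the modal operator Pf.
# Worlds are frozensets of atoms; better(w, v) encodes w ≺ v.
def holds_Pf(F, w, worlds, better):
    """Pf F is true at w iff every world v with F(v) satisfies w ≺ v."""
    return all(better(w, v) for v in worlds if F(v))

# Two toy worlds: one with a cost-25 solution, one with a cost-15 solution.
worlds = [frozenset({"cost25"}), frozenset({"cost15"})]
better = lambda w, v: "cost25" in w and "cost15" in v  # prefer the cheaper world
F = lambda v: "cost15" in v                            # the preference criterion
```

At the cost-25 world, F is a preference criterion (Pf F holds), so the cost-15 world is strictly better; at the cost-15 world, Pf F fails, matching the intuition that it is already optimal.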

2.2 Related Work

We briefly survey the various approaches to optimization that have been proposed in the logic programming and deductive databases literature.

Optimization Predicates in CLP languages: Implemented constraint logic programming systems (CLP(R) [39], CHIP [93], etc.) include meta-level operators such as minimize and maximize. As noted before, the semantics of constraint logic programs [36] does not provide a declarative account of such operators. Fages [20] proposed a generalization of such operators and provided logical and fixpoint semantics based on the 3-valued Kunen-Fitting [56] semantics for negation. Maher and Stuckey [61] discussed how to augment the kinds of queries that can be posed to a CLP system. To allow for optimization queries, they suggested that the solutions to any query be mapped to a partial order to determine the best solution. Marriott and Stuckey [65] provided a declarative semantics for minimization predicates based on a translation into subgoals with negation. They propose an operational semantics in which the current optimal solution is used to prune sub-optimal solutions.

Hierarchical Constraint Logic Programming: Borning et al. [10, 100] introduced the framework of Hierarchical Constraint Logic Programming (HCLP), which extends the paradigm of CLP [36, 38] by allowing the programmer to specify soft constraints in addition to the required or hard constraints. An HCLP scheme is parametrized by both the domain of the constraints and the comparator that is used to choose among alternative solutions to the soft constraints. In essence, in HCLP, the programmer can control the order in which constraints should be relaxed using the strengths of constraints and the comparator. HCLP is not suited for specifying optimization problems; in fact, it cannot accurately simulate the minimize operator of standard CLP systems. To do so, one needs prior knowledge of a lower bound for the minimum value, and the answer depends on the kind of comparator used. The HCLP system is equipped with a finite set of comparators, and it is not clear which comparator is useful in a given application. In short, HCLP is geared towards specifying relaxation problems, not towards specifying optimization problems.

Aggregation in Deductive Databases and Logic Programming: Aggregation can be regarded as a function from a multiset of values to a single value. Examples of aggregate functions include min, max, and sum. There has been a considerable amount of research in recent years on providing a satisfactory semantics for aggregate operations in logic programming and deductive databases. Ganguly et al. [23] considered minimum- and maximum-value aggregates. A program with just these aggregate operations has an equivalent first-order formulation using negation. They also showed that under certain monotonicity conditions, the first-order equivalent program has a total well-founded model [92] which can be computed using a greedy fixpoint procedure. Kemp and Stuckey [48] examined programs with recursion through aggregation. To give semantics to programs with aggregation, they extended two well-known semantics for programs with negation, namely well-founded models and stable models [24]. Ross and Sagiv [77] provide semantics for aggregation where the domain over which the aggregation is performed is a complete lattice and the program is monotonic. By Tarski's theorem [87], we are then guaranteed the existence of a least fixpoint for the aggregate operation.

Partial Order Programming: Parker [67] introduced partial order programming as a programming paradigm that unified declarative programming and mathematical programming. A program in this framework is a set of assertions of the form ui ⊒ fi(v), for i = 1, ..., n, where each fi is continuous (and therefore monotonic) in the partial order topology. The goal of the program is to compute the values of the variables that minimize some uj. Jayaraman et al. [43] extended this notion by using the partial order to define functions. The use of non-monotonic functions is allowed modulo stratification. They also introduced conditional partial order assertions of the forms:

    f(terms) ⊒ expression ← condition.
    f(terms) ⊑ expression ← condition.

Multiple partial order assertions may be used to define a function f, and the intuitive meaning of each assertion is that, for any ground instance of the assertion, the function f applied to its arguments is ⊒ (⊑) the ground expression if the conditions in the body hold. The semantics of f applied to ground arguments is the least upper bound (greatest lower bound) of all the expressions it is ⊒ (⊑) to. The operational semantics makes use of monotonically updatable memo-tables to compute the intended answers.

Critique of Related Work: The previous approaches suffer from one or more of the following drawbacks:

1. The notion of optimization is based on a fixed partial or total order, and one cannot program the ordering to suit the application. In applications such as ambiguity resolution in context-free grammars, there may be no explicit ordering among the possible solutions. Preference logic programming can express the criteria for disambiguating parses in a much more natural and declarative manner than traditional approaches.

2. The semantics of optimization is usually given in terms of the semantics of negation. In many cases, unlike the definite clause case, there may be multiple intended models (stable, well-founded, etc.). Uniqueness of models is guaranteed only if the program is restricted (for instance, locally stratified). Given a program, determining whether it satisfies these syntactic restrictions is Π¹₁-complete in the analytic hierarchy [9]. In our proposed framework, on the other hand, there is a unique intended model irrespective of whether the program is locally stratified or not.

3. The semantics proposed in the literature does not give due status to the ordering among the solutions that causes a solution to be optimal. Our semantics, on the other hand, does give due status to this ordering. This allows the user to program the order in which the optimality of a solution may be relaxed.

3 Current Status of Work

In this section, we briefly describe the paradigm of preference logic programming and provide motivating examples. We also briefly describe the formalism of preference logic grammars, which is useful for specifying optimal parsing problems. Finally, we introduce the class of preference datalog programs with database applications in mind.

3.1 Preference Logic Programming

We now introduce preference logic programs as restricted preferential theories and discuss their semantics and potential applications.

3.1.1 Syntax of Preference Logic Programs

A preference logic program (PLP) has two parts: a first-order theory and an arbiter.

The First-Order Theory: The first-order clauses of a preference logic program can have one of two forms:

1. H ← B1, ..., Bn (n ≥ 0), i.e., definite clauses. In general, some of the Bi's could be constraints as in [36].

2. H → C1, ..., Cl | B1, ..., Bm (l, m ≥ 0), i.e., optimization clauses. C1, ..., Cl are constraints as in [36] that must be satisfied for this clause to be applicable to a goal.

The predicate symbols appearing in a PLP can be partitioned into three disjoint sets, depending on the kinds of clauses used to define them:

1. C-predicates appear only in the heads of definite clauses, and the bodies of these clauses contain only other C-predicates (C stands for core). The C-predicates define the constraints to be satisfied by each solution.

2. O-predicates appear only in the heads of optimization clauses (O stands for optimization). For each ground instance of an optimization clause, the instance of the O-predicate at the head is a candidate for the optimal solution provided the corresponding instance of the body of the clause is true. The constraints that appear before the | in the body of an optimization clause are referred to as the guard and must be satisfied in order for the head H to be reduced.

3. D-predicates appear only in the heads of definite clauses, and at least one goal in the body of at least one such clause is either an O-predicate or a D-predicate (D stands for derived from O-predicates).

The Arbiter: The arbiter part of a preference logic program has clauses of the following form:

    p(t) ≺ p(u) ← L1, ..., Ln    (n ≥ 0)

where p is an O-predicate and each Li is an atom whose head is a C-predicate or a constraint (such as ≤, <, ≥, etc.) over some domain. In essence, this form of the arbiter states that p(t) is less preferred than p(u) if L1, ..., Ln hold. For example, assuming a set of facts for the predicate edge, the following program can be treated as a declarative specification of path:

    path(X,Y,C,[e(X,Y)]) ← edge(X,Y,C).
    path(X,Y,C,[e(X,Z)|L1]) ← edge(X,Z,C1), path(Z,Y,C2,L1), C = C1 + C2.

To specify what is to be optimized and the criteria for the optimal solution, we introduce two new kinds of clauses: optimization clauses and arbiter clauses. The optimization clause for this example is:

    n_sh_dist(X,Y,C,L) → path(X,Y,C,L).

This clause states that the set of solutions to the goal n_sh_dist(X,Y,C,L) is some subset of the set of solutions to the goal path(X,Y,C,L). However, we are interested in the optimal subset of solutions, and this requirement is specified by the following arbiter clause, which states that, given two paths between two nodes of the graph, we prefer the path of lesser cost:

    n_sh_dist(X,Y,C1,L) ≺ n_sh_dist(X,Y,C2,L1) ← C2 < C1.

In addition, we could write the following clause to specify that the shortest path between any two nodes is the path computed by n_sh_dist:

    n_sh_path(X,Y,L) ← n_sh_dist(X,Y,C,L).

In the example program above, the definite clauses defining path (and the facts for the edge predicate) make up the core program TC. Thus the C-predicates are path, edge, and =. The → clause and the last clause make up the optimization program TO. The only O-predicate is n_sh_dist, and the only D-predicate is n_sh_path. This formulation is "naive" because the operational semantics performs optimization by adopting a generate-and-select strategy: instances of n_sh_dist are generated using path, and the arbiter selects the optimal instances.
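The generate-and-select reading of this program can be sketched in ordinary code (an illustrative rendering on our part, not the PTSLD procedure): generate every solution to path/4 for the given goal, then let the arbiter discard any candidate for which a less costly one exists.

```python
# Sketch of the naive generate-and-select strategy for n_sh_dist.
edges = {("a", "b"): 5, ("b", "c"): 10, ("a", "c"): 25}

def paths(x, y, visited=()):
    """Enumerate solutions to path(x, y, C, L) as (cost, edge-list) pairs."""
    for (u, v), c in edges.items():
        if u == x and v not in visited:
            if v == y:
                yield (c, [(u, v)])
            else:
                for c2, rest in paths(v, y, visited + (v,)):
                    yield (c + c2, [(u, v)] + rest)

def n_sh_dist(x, y):
    # generate: all candidate solutions via path/4
    cands = list(paths(x, y))
    # select: arbiter drops (C1, L) whenever some (C2, L1) has C2 < C1
    return [(c, l) for (c, l) in cands
            if not any(c2 < c for (c2, _) in cands)]
```

On the three-edge graph above, the direct a-c path of cost 25 is generated but pruned by the arbiter in favour of the cost-15 path through b.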

3.1.2 Programming Heuristics with Preference

We now illustrate how the notion of preference can be used to program heuristics. Heuristics are important in fundamental areas of computer science such as algorithms, artificial intelligence, and operations research [75]. They are used to obtain acceptable solutions to intractable problems in a reasonable amount of time. For instance, many practical problems in combinatorial optimization are solved using heuristics, e.g., the traveling salesman problem, scheduling, the Hamiltonian cycle problem, etc. These problems usually involve searching through a search space to arrive at a solution. Typically, the search can take exponential time, as most of these problems are NP-complete (unless P = NP!). Heuristics are typically used to prune the search space, making it polynomial in size without completely compromising the quality of the solution. In many cases it is easy to get within a constant factor of the optimal solution, e.g., the greedy method for integer bin-packing, or the minimum spanning tree heuristic for the traveling salesman problem. In many cases, it may be possible to specify such heuristics by comparing partial solutions and preferring one partial solution over another. This comparison can be expressed naturally in the framework of preference logic programming. For instance, consider the nearest-neighbor heuristic for the traveling salesman problem [8, 57]. The following program is the formulation of the nearest-neighbor heuristic in our paradigm:

    tour(L,T,Cost) ← tour(L,[],T,0,Cost).

    tour([],T,T,Cost,Cost).
    tour([H|T],[],Tour,C0,C1) ← tour(T,[H],Tour,C0,C1).
    tour(List,[Lastcity|Tour],T,C0,C) ←
        closest(Lastcity,List,City,Rest,Cost),
        C1 = C0 + Cost,
        tour(Rest,[City,Lastcity|Tour],T,C1,C).

    closest(Lastcity,List,City,Rest,Cost) →
        member(City,List,Rest), edge(Lastcity,City,Cost).
    closest(A,_,B,_,C) ≺ closest(A,_,B,_,C1) ← C > C1.

Essentially, tour(List,Tour,Cost) computes a Tour passing through the cities in List with cost Cost. It uses closest, which computes the city that is closest to the last city on a partial tour. Many heuristics for the TSP have two distinct components: (i) tour construction and (ii) tour improvement. The nearest-neighbor heuristic above is a tour construction heuristic. Tour improvement procedures, on the other hand, start from the current tour, perform some local search, and pick the best tour close to the current tour. For instance, the Lin-Kernighan heuristic [58] defines the neighbours of a tour to be those tours that can be obtained from it by interchanging some of the tour edges with non-tour edges. Since many heuristics are local search heuristics, we anticipate that the notion of preference will be useful in programming such heuristics declaratively.
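For comparison, the same heuristic in conventional code (our own sketch, with an invented three-city distance table): closest/5 becomes a min over the remaining cities, which is exactly the selection that the arbiter clause with C > C1 enforces.

```python
# Sketch of the nearest-neighbor tour construction heuristic.
def nearest_neighbor_tour(start, cities, dist):
    """Greedily extend the tour with the closest unvisited city."""
    tour, cost = [start], 0
    remaining = set(cities) - {start}
    while remaining:
        last = tour[-1]
        # plays the role of closest/5: prefer the candidate of least cost
        city = min(remaining, key=lambda c: dist[last, c])
        cost += dist[last, city]
        tour.append(city)
        remaining.remove(city)
    return tour, cost

# hypothetical symmetric distances between three cities
dist = {("a", "b"): 1, ("b", "a"): 1, ("a", "c"): 4, ("c", "a"): 4,
        ("b", "c"): 2, ("c", "b"): 2}
```

The point of the preference-logic formulation is that this greedy selection is not buried in control code: it is stated once, declaratively, as an arbiter clause over partial solutions.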

3.1.3 Semantics of Preference Logic Programs

We briefly provide declarative and operational semantics for preference logic programs (full details are described in [30, 31]). We also describe an extension of SLD-derivations for definite clauses to compute the intended answers for preference logic programs.

Model-Theoretic Semantics: Following Mantha [63], we give a possible-worlds semantics for preference logic programs: essentially, each world is a Herbrand model for the constraints of the program, and an ordering over these worlds is enforced by the arbiter clauses of the program. An arbiter clause p(t) ≺ p(u) ← L1, ..., Ln is translated into a clause of the form p(t) → Pf(p(u) ∧ L1 ∧ ... ∧ Ln). We illustrate the model theory using the shortest-path example. Consider a directed graph with the following edges: {edge(a,b,5), edge(b,c,10), edge(a,c,25)}. The canonical model M for TC of interest to us here is the set:

    {edge(a,b,5), edge(b,c,10), edge(a,c,25)} ∪
    {path(a,b,5,[e(a,b)]), path(b,c,10,[e(b,c)]),
     path(a,c,25,[e(a,c)]), path(a,c,15,[e(b,c),e(a,b)])}.

Given a preference logic program, its preference model is constructed by first defining the worlds in the model. Each world in the preference model is obtained by extending the least model of TC so that it becomes a model for TC ∧ TO ∧ A. The instances of D-predicates at each world have to be supported, i.e., if p(t) is a ground instance of a D-predicate, then there is a clause in the program whose head unifies with p(t) and the instances of the goals in the body are present in the world. The following fragment of the intended preference model for the example program illustrates how the ordering among the worlds is enforced. For brevity, we have only considered worlds that contain instances of n_sh_dist(a,b,_,_) and n_sh_dist(b,c,_,_), since there is only one path between the associated vertices in the graph. These worlds (restricted to user-defined C-predicates, O-predicates, and D-predicates) are shown below:

1. M ∪ {n_sh_dist(a,b,5,[e(a,b)]), n_sh_dist(b,c,10,[e(b,c)]),
        n_sh_dist(a,c,25,[e(a,c)]), n_sh_dist(a,c,15,[e(b,c),e(a,b)])}
     ∪ {n_sh_path(a,b,[e(a,b)]), n_sh_path(b,c,[e(b,c)]),
        n_sh_path(a,c,[e(a,c)]), n_sh_path(a,c,[e(b,c),e(a,b)])}

2. M ∪ {n_sh_dist(a,b,5,[e(a,b)]), n_sh_dist(b,c,10,[e(b,c)]),
        n_sh_dist(a,c,25,[e(a,c)])}
     ∪ {n_sh_path(a,b,[e(a,b)]), n_sh_path(b,c,[e(b,c)]),
        n_sh_path(a,c,[e(a,c)])}

3. M ∪ {n_sh_dist(a,b,5,[e(a,b)]), n_sh_dist(b,c,10,[e(b,c)]),
        n_sh_dist(a,c,15,[e(b,c),e(a,b)])}
     ∪ {n_sh_path(a,b,[e(a,b)]), n_sh_path(b,c,[e(b,c)]),
        n_sh_path(a,c,[e(b,c),e(a,b)])}

The arbiter is satisfied by the following binary relation ≺ on the set of worlds: {1 ≺ 1, 1 ≺ 3, 2 ≺ 3}. Given the intended preference model for a preference logic program, we define the preferential consequences of the program to refer to truth in the optimal worlds (in contrast with logical consequence, which refers to truth in all worlds). An optimal world is one that is not ≺ any other world. For instance, in the example above, world 3 is the optimal world. Note that only n_sh_dist(a,c,15,_) is true in an optimal world, and no other n_sh_dist(a,c,X,_) for any X is true (world 3). Suppose there were a world w where n_sh_dist(a,c,X,_) is true for some X > 15; then the formula n_sh_dist(a,c,15,_) ∧ 15 < X becomes a preference criterion at w, making world 3 above better than w. Note that in an optimal world, n_sh_path(a,c,P) is true only for P = [e(b,c),e(a,b)], confirming that n_sh_path computes the actual shortest path. Adding the edge instance edge(a,c,5) changes the shortest path between a and c, i.e., optimization is a non-monotonic notion [14]. An optimal answer to the query G is a substitution θ such that Gθ is a preferential consequence of the preference logic program. In essence, we have provided a logical account of optimization by casting it as inference in a modal logic of preference. The use of modal logic for this purpose is not surprising: optimization is a non-monotonic notion, and modal logics have been used in the past to characterize non-monotonic inference [64, 14].
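The optimal-worlds reading can be made concrete with a small executable sketch. The following Python fragment is an illustration only, not the dissertation's implementation; `EDGES`, `all_paths`, and `optimal_answers` are names invented here. It enumerates the candidate solutions for the shortest-path query on the example graph and keeps exactly those that no other solution is preferred over, mirroring truth in the optimal worlds:

```python
# Candidate solutions correspond to paths; the arbiter prefers a
# solution with strictly smaller cost, so the optimal answers are the
# non-dominated (cost, path) pairs.

EDGES = {('a', 'b'): 5, ('b', 'c'): 10, ('a', 'c'): 25}

def all_paths(src, dst, visited=()):
    """Yield (cost, edge_list) for every simple path from src to dst."""
    if src == dst:
        yield 0, []
        return
    for (u, v), c in EDGES.items():
        if u == src and v not in visited:
            for cost, path in all_paths(v, dst, visited + (src,)):
                yield c + cost, [(u, v)] + path

def optimal_answers(src, dst):
    """Keep only the solutions that no other solution is preferred over."""
    sols = list(all_paths(src, dst))
    return [s for s in sols if not any(t[0] < s[0] for t in sols)]

print(optimal_answers('a', 'c'))   # -> [(15, [('a', 'b'), ('b', 'c')])]
```

Note that adding an edge such as edge(a,c,5) to `EDGES` changes which answer is optimal, which is the non-monotonicity observed above.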

Operational Semantics: We provide a derivation scheme called PTSLD-derivation, which stands for Pruned Tree SLD-derivation, for efficiently computing the optimal answers to queries. In essence, PTSLD-derivations construct the or-parallel tree for a goal and use the arbiter as advice to prune paths that compute non-optimal answers. We refer the reader to [30, 31] for the details of the operational semantics. Since we are computing preferential consequences as opposed to logical consequences, we do not incur the cost of theorem proving in a general modal logic [80]. PTSLD-derivations have interesting connections with SLDNF-derivations. Negation as Failure may be unsound for non-ground goals [59]. On the other hand, the soundness of PTSLD-derivations requires that a predicate appearing at the head of an optimization clause be invoked with unbound variables at certain argument positions; the values at these positions are determined by the body of the optimization clause and used by the arbiter to prefer one solution over another. The completeness of PTSLD-derivations requires that the search tree for every optimization goal is finite, because the optimal solution cannot in general be computed unless the search tree is exhaustively

explored. This requirement is similar to that of finitely-failed search trees in Negation as Failure. Even for preference logic programs with finite search trees, incompleteness arises when the optimization predicates are not locally stratified. This motivates our interest in the class of stratified and locally stratified preference logic programs with finite search trees.
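As a rough analogy for how the arbiter's advice prunes the or-tree, consider the following Python sketch. It is illustrative only: `shortest` and its parameters are names invented here, and the real PTSLD scheme operates on derivations rather than directly on graphs. A branch whose accumulated cost already equals or exceeds the best complete answer found so far cannot lead to an optimal answer, so it is abandoned:

```python
def shortest(src, dst, edges, bound=None, cost=0, seen=frozenset()):
    """Depth-first search of the or-tree of paths from src to dst.
    `bound` is the best (smallest) complete cost found so far; any
    branch whose cost reaches bound is pruned, as the arbiter would
    prune a PTSLD derivation path that can only yield non-optimal
    answers."""
    if bound is not None and cost >= bound:
        return bound                  # pruned: cannot improve on bound
    if src == dst:
        return cost                   # new best answer (cost < bound)
    for (u, v), c in edges.items():
        if u == src and v not in seen:
            bound = shortest(v, dst, edges, bound, cost + c, seen | {src})
    return bound

edges = {('a', 'b'): 5, ('b', 'c'): 10, ('a', 'c'): 25}
print(shortest('a', 'c', edges))      # -> 15
```

The pruning argument presumes the search tree is finite, matching the completeness condition stated above.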

3.2 Preference Logic Grammars The concept of preference also provides a general means for specifying the selection criteria for choosing among alternative solutions to a goal. A good use of this capability is ambiguity resolution in programming-language and natural-language grammars. Context-free grammars are a natural formalism for specifying the syntax of a language, while logic is generally the formalism of choice for specifying semantics (meaning). Logic grammars combine these two formalisms, and can be used for specifying syntactic and semantic constraints in a variety of parsing applications, from programming languages to natural languages [2]. Several forms of logic grammars have been researched over the last two decades. Basically, they extend context-free grammars in that they generally allow:

- the use of arguments with nonterminals to represent semantics, where the arguments may be terms of some logic;
- the use of unification to propagate semantic information;
- the use of logical tests to control the application of production rules.

An important problem in these applications is that of resolving ambiguity, both syntactic and semantic. An example of syntactic ambiguity is the "dangling else" problem in programming language syntax, while an example of semantic ambiguity arises in optimal parsing, where one computes the costs of parses and the intended parse of a string is the parse with the least cost.

3.2.1 Syntactic Ambiguity Resolution using Preference Traditionally, syntactic ambiguity is resolved in some ad hoc manner; for example, ambiguity in the "dangling else" problem is resolved during the semantic analysis phase of a compiler, by pairing up each "else" with the closest previous unpaired "then". To illustrate, consider the following fragment of an ambiguous grammar for a programming language:



<ifstmt> ::= if <cond> then <stmtseq>
          |  if <cond> then <stmtseq> else <stmtseq>

A natural way to resolve the ambiguity is to express our preference for one parse over the other. For example, suppose that we wanted to match each else with the closest previous unmatched then. The resulting solution, shown below, makes use of Definite Clause Grammars (DCGs) [69] and is far more succinct than rewriting the grammar to avoid ambiguity:

ifstmt(if(C,T)) --> [if], cond(C), [then], stmtseq(T).
ifstmt(if(C,T,E)) --> [if], cond(C), [then], stmtseq(T), [else], stmtseq(E).

Essentially, each DCG rule is derived from a production in the context-free grammar. The terminals are enclosed in square brackets, and the non-terminals take arguments (attributes) that can be defined by definite clauses; in this example the attribute of interest is the parse tree rooted at the nonterminal. DCGs can be converted into definite clauses that compute the attributed parse tree: each DCG rule is converted into a definite clause by adding two arguments to every non-terminal, corresponding to the list of terminals in the input before the non-terminal is parsed and the list of terminals remaining after the parse of the non-terminal is completed. In the example above, to specify the criterion that each else pairs up with the closest previous unpaired then, we simply add the following preference clause:

ifstmt(if(C,if(C1,T),E)) ≺ ifstmt(if(C,if(C1,T,E))).

Essentially, this states that if there are two parses for a sequence of terminals from the non-terminal ifstmt, one prefers the parse in which the else matches the closest unmatched then.
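The effect of the preference clause can be illustrated with a hedged Python sketch. The names and tuple encodings below are invented for illustration (a real PLG system would operate on DCG parse terms): `('if', C, T)` and `('if', C, T, E)` stand for the terms if(C,T) and if(C,T,E), `parses` enumerates both parses of a nested if, and `prefer` discards the parse the preference clause rates worse:

```python
def parses(toks):
    """Yield (tree, rest) for every ifstmt parse of the token list.
    Conditions and plain statements are single tokens, for brevity."""
    if toks[:1] != ['if']:
        if toks:                           # a plain statement
            yield toks[0], toks[1:]
        return
    cond, rest = toks[1], toks[3:]         # skip 'if' C 'then'
    for body, rest2 in parses(rest):
        yield ('if', cond, body), rest2    # if C then S
        if rest2[:1] == ['else']:
            for els, rest3 in parses(rest2[1:]):
                yield ('if', cond, body, els), rest3   # if C then S else S

def prefer(trees):
    """Keep parses not rated worse by the preference clause."""
    def worse(t, u):
        # encodes: if(C, if(C1,T), E)  is worse than  if(C, if(C1,T,E))
        return (len(t) == 4 and isinstance(t[2], tuple) and len(t[2]) == 3
                and u == ('if', t[1], t[2] + (t[3],)))
    return [t for t in trees if not any(worse(t, u) for u in trees)]

toks = ['if', 'c1', 'then', 'if', 'c2', 'then', 's1', 'else', 's2']
trees = [t for t, rest in parses(toks) if not rest]
print(prefer(trees))   # -> [('if', 'c1', ('if', 'c2', 's1', 's2'))]
```

Both parses of the ambiguous token sequence are generated, and only the one pairing else with the inner (closest) then survives.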

3.2.2 Semantic Ambiguity Resolution using Preference Ambiguity in context-free grammars can also be resolved by defining functions from the various (ambiguous) parses to a totally ordered domain and choosing the parse with the least cost as the intended parse of a string. This can be viewed as semantic ambiguity resolution, since the function from a parse to a totally ordered domain can be viewed as a semantic function. We also refer to this problem as the optimal parsing problem. Many applications, such as optimal layout of documents [11] and code generation [3], can be viewed as optimal parsing problems. We demonstrate the use of PLGs for optimal parsing problems by showing how the criteria for laying out paragraphs, which a formatter such as TeX or LaTeX would use [52], can be specified declaratively. Just as DCTGs can be translated into Constraint Logic Programs [36, 38, 62, 93], PLGs can be translated into Preference Logic Programs [30, 31]. DCTGs were introduced by Abramson [1], and can be viewed as a logical counterpart of attribute grammars [50]. In attribute grammars, syntax is specified by context-free rules, and semantics is specified by attributes attached to nonterminal nodes in the derivation trees and by function definitions that define the attributes. DCTG rules augment context-free rules in two ways: (1) they provide a mechanism for associating with nonterminals logical variables which represent subtrees rooted at the nonterminal; and (2) they provide a mechanism for specifying the computation of semantic values of properties of the trees. For example, we associate a logical variable N with a nonterminal nt by writing: nt^^N

The logical variable N will eventually be instantiated to the subtree corresponding to the nonterminal nt. In order to specify the computation of some semantic value X of a property prop of the tree N, we write: N^^prop(X)


The DCTG formalism differs from attribute grammars in the following ways:

1. The semantics is captured by associating a set of definite clauses with nonterminal nodes of a derivation tree.

2. No distinction is made between inherited and synthesized attributes in the notation; the use of logical variables that unify with parse trees eliminates the need for this distinction.

We now introduce PLGs and demonstrate their usefulness in specifying document structures, starting from the preferential attribute grammar specifications of [11]. Logically, a paragraph is a sequence of lines, where each line is a sequence of words. Knuth and Plass [52] describe how to lay out a sequence of words forming a paragraph by computing the badness of the paragraph and laying out the paragraph so that it has the least badness. This view of a paragraph can be captured by a context-free grammar with regular right-hand sides (as in Backus-Naur Form) augmented with attribute definitions as follows:

<para> ::= <line>+
    <para>.badness = SUM_i <line>_i.badness
    MIN : <para>.badness

<line> ::= [unit]+
    <line>.badness = |<line>.adjustment| * 3

where <line>.adjustment is defined in terms of other attributes of <line>; the reader is referred to [11, 29] for details. These grammars specifying document structures are extremely ambiguous. Given a description of a document and a sequence of words representing its content, we are interested in the parse of the sequence of words from <para> that has the least badness. The following PLG expresses this notion declaratively:

para ::= line^^Line
    badness(Badness) ::- Line^^badness(Badness).
para ::= line^^Line, para^^Para
    badness(Badness) ::- Line^^badness(Linebadness),
                         Para^^badness(Parabadness),
                         Badness = Linebadness + Parabadness.
para^^Para^^badness(B) ≺ para^^Para1^^badness(B1) ::- B1 < B.

line ::= unit^^Unit
    badness(Badness) ::- adjustment(Adjust), mod(Adjust,Adjust1),
                         Badness = 3 * Adjust1.
line ::= unit^^Unit, line^^Line
    badness(Badness) ::- adjustment(Adjust), mod(Adjust,Adjust1),
                         Badness = 3 * Adjust1.
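The least-badness criterion that the PLG specifies can be computed by a short dynamic program. The sketch below is a toy stand-in for the Knuth-Plass procedure: the badness formula 3 * |adjustment| follows the simplified grammar above rather than TeX's actual formula, and the function names are invented for illustration:

```python
def line_badness(words, width):
    """Badness of setting `words` on one line `width` characters wide:
    3 * |adjustment|, where the adjustment is the leftover space."""
    length = sum(len(w) for w in words) + len(words) - 1  # interword spaces
    return float('inf') if length > width else 3 * abs(width - length)

def min_badness(words, width):
    """Least total badness over all ways of breaking `words` into
    lines -- the cost of the preferred parse of the paragraph grammar."""
    best = {len(words): 0}           # suffix starting at i -> least badness
    for i in range(len(words) - 1, -1, -1):
        best[i] = min(line_badness(words[i:j], width) + best[j]
                      for j in range(i + 1, len(words) + 1))
    return best[0]

print(min_badness(['aa', 'bb', 'cc'], 5))   # -> 9
```

The recurrence considers every possible first line of a suffix, exactly as the two `para` rules consider a paragraph to be either a single line or a line followed by a paragraph, and the minimum plays the role of the arbiter clause.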

Apart from document layout, many other problems in computer science can be formulated as optimal parsing problems, for instance code generation [3], syntax error recovery [88], and the grammar problem [51]. Furthermore, the notion of preference enables us to express the criteria used in natural language processing to disambiguate sentences: criteria such as minimal attachment and right association [19, 68, 49] can be easily expressed in the framework of PLGs.

3.3 Preference datalog Programs One of the main applications of logic programming is in the area of database query languages. Commercial database systems are relational database systems [17], which represent data as flat relations. Queries on such databases are expressed in languages such as SQL. Though SQL is a declarative query language, its expressive power is seriously limited by the inability to express recursive queries. Deductive databases, on the other hand, allow general recursive queries, as they are based on logic programming formalisms. datalog [89] is a query language that is frequently used in deductive databases. datalog programs are essentially definite clause programs that do not have any uninterpreted function symbols. There is a mismatch of priorities between programming systems and database systems. Practical logic programming systems have adopted a top-down query evaluation mechanism, as this lends a clean procedural interpretation to the clauses making up the program. Furthermore, most logic programming systems deviate from the ideal of being logical theories in that the order in which the clauses are written and the order of goals in the body of a clause affect the outcome of the computation resulting from a query. In programming systems this behavior is acceptable because one is interested in programmability and efficiency. Many practical programming systems, such as Prolog, sacrifice not only completeness but also soundness for the sake of efficiency. In database systems, however, soundness, completeness, and declarativeness are more important. The query evaluation mechanism in deductive database systems should compute answers to queries in a manner that is independent of the order of facts and rules in the database. Bottom-up evaluation is the preferred method for computing answers to queries in deductive database systems. In fact, if the amount of extensional data in a database is large, bottom-up evaluation is more efficient than top-down evaluation [90].
Another advantage of bottom-up evaluation is that it is easy to avoid redundant computation that a top-down evaluation mechanism would perform in the absence of appropriate memoization. However, bottom-up evaluation may have to compute the least fixed point of the program to answer any query because, unlike top-down evaluation, it is not goal-directed. Techniques such as magic-template rewriting modify the program using the query so that a bottom-up evaluation of the rewritten program will generate facts that are relevant to the query. Preferences can be viewed as providing a general means for specifying extreme-value aggregate operations in deductive databases. Furthermore, just as the arbiter can be viewed as providing

control advice to the top-down evaluation mechanism, it can also be viewed as providing direction to the bottom-up evaluation mechanism. Govindarajan [28] proposed preference datalog programs, i.e., preference logic programs without uninterpreted function symbols, for database applications. This motivates our interest in bottom-up evaluation techniques for preference datalog programs (and also for preference logic programs). The O-predicates in a preference datalog program can be assigned ordinals so that if an O-predicate O1 is defined in terms of O2, then O1 is assigned a higher rank than O2. The number of different ordinals assigned to O-predicates determines the number of levels in the preference datalog program. Suppose the given program has k levels of O-predicates. The bottom-up evaluation of the program has k + 1 stages, as follows:

1. Compute the least fixed point of the core program; let this be denoted by M0.

2. Consider the definitions of the O-predicates at level 1. Starting from M0, construct the least fixed point of the program consisting of the clauses defining the O-predicates of level 1, by treating the → clauses just as ← clauses. Let this set be N1. Construct the set O1 by deleting from N1 the instances of O-predicates that get pruned by the arbiter. Now consider the definitions of the D-predicates of level 1: starting from O1, construct the least fixed point of the program whose clauses are the definitions of the D-predicates of level 1. Let this set be M1.

3. At the ith level (for i ≤ k), we start from M(i-1) and construct Mi as above.

The following is an example of a preference datalog program for the shortest path in a graph. This formulation is indicative of how general dynamic programming algorithms can be expressed in the framework.

sh_dist(X,X,N,0).
sh_dist(X,Y,1,C) → X ≠ Y, edge(X,Y,C).
sh_dist(X,Y,N,C) → N > 1, X ≠ Y, sh_dist(X,Z,1,C1),
                   N1 = N - 1, sh_dist(Z,Y,N1,C2), C = C1 + C2.
sh_dist(X,Y,N,C1) ≺ sh_dist(X,Y,N,C2) ← C2 < C1.

The magic-template rewriting of this program with respect to the query sh_dist(a,b,10,X) contains the rules:

sh_dist(X,Y,N,C) → N > 1, X ≠ Y, magic_sh_dist(X,Y,N,C), sh_dist(X,Z,1,C1),
                   N1 = N - 1, sh_dist(Z,Y,N1,C2), C = C1 + C2.
magic_sh_dist(X,Z,1,C1) → X ≠ Y, N > 1, magic_sh_dist(X,Y,N,C).
magic_sh_dist(Z,Y,N1,C2) → X ≠ Y, N > 1, magic_sh_dist(X,Y,N,C),
                           sh_dist(X,Z,1,C1), N1 = N - 1.
magic_sh_dist(a,b,10,X).
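The staged bottom-up procedure described above can be sketched in Python for this program. The sketch below is a simplification invented for illustration: the hop-count bookkeeping is folded into a naive fixed-point loop, the magic filtering is omitted, and the arbiter here compares costs across hop counts for a vertex pair rather than per hop count:

```python
def bottom_up_sh_dist(edges, max_hops):
    """Stage 1: least fixed point of the sh_dist clauses, treating the
    -> clauses as <- clauses. Stage 2: the arbiter deletes any
    sh_dist(X,Y,_,C1) fact dominated by some sh_dist(X,Y,_,C2), C2 < C1."""
    facts = {(x, y, 1, c) for (x, y), c in edges.items()}  # base clause
    changed = True
    while changed:                      # naive fixed-point iteration
        changed = False
        for (x, z, n1, c1) in list(facts):
            if n1 != 1:
                continue                # left factor must be a single edge
            for (z2, y, n2, c2) in list(facts):
                if z2 == z and x != y and n2 < max_hops:
                    fact = (x, y, n2 + 1, c1 + c2)
                    if fact not in facts:
                        facts.add(fact)
                        changed = True
    # arbiter pruning: keep only the non-dominated distance facts
    return {f for f in facts
            if not any(g[:2] == f[:2] and g[3] < f[3] for g in facts)}

edges = {('a', 'b'): 5, ('b', 'c'): 10, ('a', 'c'): 25}
print(sorted(bottom_up_sh_dist(edges, 3)))
# -> [('a', 'b', 1, 5), ('a', 'c', 2, 15), ('b', 'c', 1, 10)]
```

The fixed-point loop generates the fact sh_dist(a,c,2,15) alongside sh_dist(a,c,1,25), and the arbiter stage then deletes the dominated fact, leaving exactly the shortest distances.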