COMPLEXITY AND EXPRESSIVE POWER OF LOGIC PROGRAMMING

INFSYS RESEARCH REPORT

Institut für Informationssysteme, Abteilung Wissensbasierte Systeme

Complexity and Expressive Power of Logic Programming

Evgeny Dantsin, Thomas Eiter, Georg Gottlob, Andrei Voronkov

INFSYS Research Report 1843-99-05, February 1999

Institut für Informationssysteme, Abtg. Wissensbasierte Systeme
Technische Universität Wien
Treitlstraße 3, A-1040 Wien, Austria
Tel: +43-1-58801-18405
Fax: +43-1-58801-18493
[email protected]
www.kr.tuwien.ac.at


Complexity and Expressive Power of Logic Programming

Evgeny Dantsin,1 Thomas Eiter,2 Georg Gottlob,3 Andrei Voronkov4

Abstract. This paper surveys various complexity and expressiveness results on different forms of logic programming. The main focus is on decidable forms of logic programming, in particular, propositional logic programming and datalog, but we also mention general logic programming with function symbols. Next to classical results on plain logic programming (pure Horn clause programs), more recent results on various important extensions of logic programming are surveyed. These include logic programming with different forms of negation, disjunctive logic programming, logic programming with equality, and constraint logic programming.

1 Computing Science Department, Uppsala University, Box 311, S 751 05 Uppsala, Sweden. Email: [email protected]
2 Institut und Ludwig Wittgenstein Labor für Informationssysteme, Technische Universität Wien, Treitlstraße 3, A-1040 Wien, Austria. Email: [email protected]
3 Institut und Ludwig Wittgenstein Labor für Informationssysteme, Technische Universität Wien, Paniglgasse 16, A-1040 Wien, Austria. Email: [email protected]
4 Computing Science Department, Uppsala University, Box 311, S 751 05 Uppsala, Sweden. Email: [email protected]

Acknowledgements: Evgeny Dantsin was partially supported by grants from TFR and INTAS-RFBR. Andrei Voronkov was partially supported by TFR grants. This paper is an extended version of the preliminary abstract "Complexity and Expressiveness of Logic Programming," which appeared in: Proceedings 12th Annual IEEE Conference on Computational Complexity (CCC '97), Ulm, Germany, June 24–27, pp. 82–101, 1997.

Copyright © 1999 by the authors


Contents

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
  2.1 Syntax of logic programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
  2.2 Semantics of logic programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
  2.3 Datalog . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3 Complexity classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
  3.1 Turing machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
  3.2 Notation for complexity classes . . . . . . . . . . . . . . . . . . . . . . . . . . 9
  3.3 Reductions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
4 Complexity of plain logic programming . . . . . . . . . . . . . . . . . . . . . . . . 11
  4.1 Simulation of deterministic Turing machines by logic programs . . . . . . . . . 12
  4.2 Propositional logic programming . . . . . . . . . . . . . . . . . . . . . . . . . 13
  4.3 Complexity of datalog . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
  4.4 Logic programming with functions . . . . . . . . . . . . . . . . . . . . . . . . 17
  4.5 Further issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
5 Complexity of logic programming with negation . . . . . . . . . . . . . . . . . . . 21
  5.1 Stratified negation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
  5.2 Well-founded negation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
  5.3 Stable model semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
  5.4 Inflationary and noninflationary semantics . . . . . . . . . . . . . . . . . . . . 26
  5.5 Further semantics of negation . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
6 Disjunctive logic programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
7 Expressive power of logic programming . . . . . . . . . . . . . . . . . . . . . . . . 30
  7.1 The order mismatch and relational machines . . . . . . . . . . . . . . . . . . . 36
  7.2 Expressive power of logic programming with complex values . . . . . . . . . . 37
8 Unification and its complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
9 Logic programming with equality . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
  9.1 Equational theories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
  9.2 Complexity of E-unification . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
  9.3 Complexity of nonrecursive logic programming with equality . . . . . . . . . . 40
10 Constraint logic programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
  10.1 Complexity of constraint logic programming . . . . . . . . . . . . . . . . . . 41
  10.2 Expressiveness of Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58


1 Introduction

Logic programming is a well-known declarative method of knowledge representation and programming, based on the idea that the language of first-order logic is well-suited both for representing data and for describing desired outputs [Kowalski 1974]. Logic programming was developed in the early 1970s based on work in automated theorem proving [Green 1969, Kowalski & Kuehner 1971], in particular on Robinson's resolution principle [Robinson 1965].

A pure logic program consists of a set of rules, also called definite Horn clauses. Each such rule has the form head ← body, where head is a logical atom and body is a conjunction of logical atoms. The logical semantics of such a rule is given by the implication body ⇒ head (for a more precise account, see Section 2). Note that the semantics of a pure logic program is completely independent of the order in which its clauses are given, and of the order of the single atoms in each rule body.

With the advent of the programming language Prolog [Colmerauer, Kanoui, Roussel & Passero 1973], the paradigm of logic programming soon became ready for practical use. Many applications in different areas were and are successfully implemented in Prolog. Note that Prolog is, in a sense, only an approximation to fully declarative logic programming. In fact, the clause matching and backtracking algorithms at the core of Prolog are sensitive to the ordering of the clauses in a program and of the atoms in a rule body. While Prolog has become a popular programming language taught in many computer science curricula, research focuses more on pure logic programming and on extensions thereof. In some application areas, such as knowledge representation (a subfield of artificial intelligence) and databases, there is even a predominant need for full declarativeness, and hence for pure logic programming.
In knowledge representation, declarative extensions of pure logic programming, such as negation in rule bodies and disjunction in rule heads, are used to formalize common sense reasoning. In the database context, the query language datalog was designed and intensively studied (see [Ullman 1988, Ullman 1989, Ceri, Gottlob & Tanca 1990]).

There are many interesting complexity results on logic programming. These results are not limited to "classical" complexity theory but also comprise expressiveness results in the sense of descriptive complexity theory. For example, it was shown that (a slight extension of) datalog expresses not just some, but exactly the polynomially computable queries on ordered databases. Thus datalog precisely expresses, or captures, the complexity class P on ordered databases. Similar results were obtained for many variants and extensions of datalog. It turned out that all major variants of datalog can be characterized by suitable complexity classes. As a consequence, complexity theory has become a very important tool for comparing logic programming formalisms.

This paper surveys various complexity and expressiveness results on different forms of (purely declarative) logic programming. The aim of the paper is twofold. First, a broad survey and many pointers to the literature are given. Second, in order to give a flavor of complexity issues in logic programming, a few fundamental topics are explained in greater detail, in particular the basic results on plain logic programming (Section 4) and some fundamental issues related to descriptive complexity (Section 7). These two sections are written in a more tutorial style and contain several proofs, while the other sections are written in a rather succinct survey style. Note that the present paper is not an encyclopedic listing of all published complexity results on logic programming, but rather a more or less subjective choice of results.
There are many interesting results which we cannot mention for space reasons; such results may be found in other surveys, e.g., [Cadoli & Schaerf 1993, Schlipf 1995a]. Examples include results on abductive logic programming [Eiter, Gottlob & Leone 1997a, Inoue & Sakama 1993, Sakama & Inoue 1994b, Marek, Nerode & Remmel 1996], on intuitionistic logic programming [Bonner 1990, Bonner 1999], and on Prolog [Dikovsky 1993].


The paper is organized as follows. Section 2 defines the syntax and semantics of logic programs, describes datalog, and introduces complexity measures. Computational models and complexity notation are discussed in Section 3. Section 4 presents the main complexity results on plain logic programming and datalog. Section 5 discusses various semantics for logic programming with negation and the respective complexity results. Section 6 deals with disjunctive logic programming. Section 7 studies the expressive power of datalog and of logic programming with complex values. Section 8 characterizes the complexity of unification. Section 9 deals with logic programming extended by equality. Finally, Section 10 describes complexity results on constraint logic programming. This article is an extended version of [Dantsin, Eiter, Gottlob & Voronkov 1997].

2 Preliminaries

In this section, we introduce some basic concepts of logic programming. For space reasons, the presentation is necessarily succinct; for a more detailed treatment, see [Lloyd 1987, Apt 1990, Apt & Bol 1994, Baral & Gelfond 1994]. We use letters p, q, … for predicate symbols, X, Y, Z, … for variables, f, g, h, … for function symbols, and a, b, c, … for constants; a boldface version of a letter denotes a list of symbols of the respective type. In logic programs, we sometimes denote predicate and function symbols by arbitrary strings.

2.1 Syntax of logic programs

Logic programs are formulated in a language L of predicates and functions of nonnegative arity; 0-ary functions are constants. A language L is function-free if it contains no functions of arity greater than 0.

A term is inductively defined as follows: each variable X and each constant c is a term, and if f is an n-ary function symbol and t1, …, tn are terms, then f(t1, …, tn) is a term. A term is ground if no variable occurs in it. The Herbrand universe of L, denoted U_L, is the set of all ground terms which can be formed with the functions and constants in L.

An atom is a formula p(t1, …, tn), where p is a predicate symbol of arity n and each ti is a term. An atom is ground if all ti are ground. The Herbrand base of a language L, denoted B_L, is the set of all ground atoms that can be formed with predicates from L and terms from U_L. A Horn clause is a rule of the form

A0 ← A1, …, Am    (m ≥ 0),

where each Ai is an atom. The parts on the left and on the right of "←" are called the head and the body of the rule, respectively. A rule r of the form A0 ←, i.e., whose body is empty, is called a fact; if A0 is a ground atom, then r is called a ground fact. A logic program is a finite set of Horn clauses. A clause or logic program is ground if it contains no variables.

With each logic program P, we associate the language L(P) that consists of the predicates, functions, and constants occurring in P. If no constant occurs in P, we add some constant to L(P) for technical reasons. Unless stated otherwise, L(P) is the underlying language, and we use the simplified notation U_P and B_P for U_{L(P)} and B_{L(P)}, respectively.

A Herbrand interpretation of a logic program P is any subset I ⊆ B_P of its Herbrand base. Intuitively, the atoms in I are true, while all others are false. A Herbrand model of P is a Herbrand interpretation of P


such that for each rule A0 ← A1, …, Am in P, this interpretation satisfies the logical formula ∀X((A1 ∧ ⋯ ∧ Am) ⇒ A0), where X is a list of the variables in the rule.

Propositional logic programs are logic programs in which all predicates have arity 0, i.e., all atoms are propositional ones.

Example 2.1 Here is an example of a propositional logic program, which captures knowledge (in a simplified form) about a steam engine equipped with three signal gauges.

shut_down ← overheat
shut_down ← leak
leak ← valve_closed, pressure_loss
valve_closed ← signal_1
pressure_loss ← signal_2
overheat ← signal_3
signal_1 ←
signal_2 ←

Informally, the rules of the program tell that the system has to be shut down if it is in a dangerous state. Such states are connected to causes and signals by respective rules. The facts signal_1 and signal_2 state that signals #1 and #2, respectively, are being observed.

Note that if P is a propositional logic program then B_P is a set of propositional atoms. Any interpretation of P is a subset of B_P.

2.2 Semantics of logic programs

The notions of a Herbrand interpretation and model can be generalized to infinite sets of clauses in a natural way. Let P be a set (finite or infinite) of ground clauses. Such a set P defines an operator T_P : 2^{B_P} → 2^{B_P}, where 2^{B_P} denotes the set of all Herbrand interpretations of P, by

T_P(I) = {A0 ∈ B_P | P contains a rule A0 ← A1, …, Am such that {A1, …, Am} ⊆ I holds}.

This operator is called the immediate consequence operator; intuitively, it yields all atoms that can be derived by a single application of some rule in P given the atoms in I. Since T_P is monotone, by the Knaster-Tarski Theorem it has a least fixpoint, denoted by T_P^∞; since, moreover, T_P is also continuous, by Kleene's Theorem T_P^∞ is the limit of the sequence T_P^0 = ∅, T_P^{i+1} = T_P(T_P^i), i ≥ 0.

A ground atom A is called a consequence of a set P of clauses if A ∈ T_P^∞ (we write P ⊨ A). Also, by definition, a negated ground atom ¬A is a consequence of P, denoted P ⊨ ¬A, if A ∉ T_P^∞. The semantics of a set P of ground clauses, denoted M(P), is defined as the following set consisting of atoms and negated atoms:

M(P) = {A | P ⊨ A} ∪ {¬A | P ⊨ ¬A} = T_P^∞ ∪ {¬A | A ∈ B_P \ T_P^∞}.


Example 2.2 (See Example 2.1.) For the program P above, we have

T_P^0 = ∅
T_P^1 = {signal_1, signal_2}
T_P^2 = T_P^1 ∪ {valve_closed, pressure_loss}
T_P^3 = T_P^2 ∪ {leak}
T_P^4 = T_P^∞ = T_P^3 ∪ {shut_down}

Thus, the least fixpoint is reached in four steps; e.g., P ⊨ shut_down and P ⊨ ¬overheat.
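The fixpoint iteration of Example 2.2 is easy to mechanize. The following sketch computes T_P^∞ for a propositional program represented as (head, body) pairs; the atom names are our own transliteration of the steam-engine example.

```python
# Least-fixpoint evaluation of a propositional logic program via the
# immediate consequence operator T_P.  Rules are (head, body) pairs;
# facts have an empty body.

def tp(rules, interp):
    """One application of T_P: heads of rules whose bodies hold in interp."""
    return {head for head, body in rules if set(body) <= interp}

def least_fixpoint(rules):
    """Iterate T_P from the empty interpretation until nothing new is derived."""
    interp = set()
    while True:
        new = tp(rules, interp)
        if new <= interp:          # fixpoint reached
            return interp
        interp |= new

# The steam-engine program of Example 2.1 (atom names are our rendering).
steam_engine = [
    ("shut_down", ["overheat"]),
    ("shut_down", ["leak"]),
    ("leak", ["valve_closed", "pressure_loss"]),
    ("valve_closed", ["signal_1"]),
    ("pressure_loss", ["signal_2"]),
    ("overheat", ["signal_3"]),
    ("signal_1", []),              # facts
    ("signal_2", []),
]

model = least_fixpoint(steam_engine)
```

The computed set coincides with T_P^4 above: shut_down is derived, while overheat and signal_3 are not, so P ⊨ ¬overheat.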

For each set P of clauses, T_P^∞ coincides with the unique least Herbrand model of P, where a model M is smaller than a model N if M is a proper subset of N [van Emden & Kowalski 1976].

The semantics of nonpropositional logic programs is now defined as follows. Let the grounding of a clause r in a language L, denoted ground(r, L), be the set of all clauses obtained from r by all possible substitutions of elements of U_L for the variables in r. For any logic program P, we define

ground(P, L) = ⋃_{r ∈ P} ground(r, L),

and we write ground(P) for ground(P, L(P)). The operator T_P : 2^{B_P} → 2^{B_P} associated with P is defined by T_P = T_{ground(P)}. Accordingly, M(P) = M(ground(P)).

Example 2.3 Let P be the program

p(a) ←
p(f(X)) ← p(X)

Then U_P = {a, f(a), f(f(a)), …} and ground(P) contains the clauses p(a) ←, p(f(a)) ← p(a), p(f(f(a))) ← p(f(a)), …. The least fixpoint of T_P is

T_P^∞ = T_{ground(P)}^∞ = {p(f^n(a)) | n ≥ 0}.

Hence, e.g., P ⊨ p(f(f(a))).
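Since the Herbrand universe of Example 2.3 is infinite, ground(P) cannot be materialized in full. A sketch can nevertheless confirm P ⊨ p(f(f(a))) by grounding only up to a fixed term nesting depth; the depth bound is our own device, not part of the formal definition.

```python
# Depth-bounded grounding of the program  p(a) ← ;  p(f(X)) ← p(X)
# from Example 2.3.  Ground terms are "a" or nested tuples ("f", t).

def universe(depth):
    """Ground terms over constant a and unary f, up to the given nesting depth."""
    terms = ["a"]
    for _ in range(depth):
        terms = ["a"] + [("f", t) for t in terms]
    return terms

def ground_program(depth):
    """All ground instances of the two rules over the bounded universe."""
    rules = [(("p", "a"), [])]                         # p(a) ←
    for t in universe(depth):
        rules.append((("p", ("f", t)), [("p", t)]))    # p(f(t)) ← p(t)
    return rules

def least_fixpoint(rules):
    interp = set()
    changed = True
    while changed:
        new = {h for h, body in rules if all(b in interp for b in body)}
        changed = not new <= interp
        interp |= new
    return interp

model = least_fixpoint(ground_program(depth=3))
```

With depth 3, the fixpoint contains p(a), p(f(a)), …, p(f⁴(a)); in particular p(f(f(a))) is derived, matching the example.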

In practice, generating ground(P) is often cumbersome, since, even in the case of function-free languages, it is in general exponential in the size of P. Moreover, it is not always necessary to compute M(P) in order to determine whether P ⊨ A for some particular atom A. For these reasons, completely different strategies of deriving atoms from a logic program have been developed. These strategies are based on variants of the famous Resolution Principle of Robinson [1965]. The major variant is SLD-resolution [Kowalski & Kuehner 1971, Apt & van Emden 1982].

Roughly, SLD-resolution can be described as follows. A goal is a conjunction of atoms, and a substitution is a function ϑ that maps variables v1, …, vn to terms t1, …, tn. The result of simultaneous replacement of the variables vi by the terms ti in an expression E is denoted by Eϑ. For a given goal G and a program P, SLD-resolution tries to find a substitution ϑ such that Gϑ logically follows from P. The initial goal is repeatedly transformed until the empty goal is obtained. Each transformation step is based on the application of the resolution rule to a selected atom Bi from the goal B1, …, Bm and a clause A0 ← A1, …, An from P. SLD-resolution tries to unify Bi with the head A0, i.e., to find a substitution ϑ such that A0ϑ = Biϑ.


Such a substitution ϑ is called a unifier of A0 and Bi. If a unifier ϑ exists, a most general such ϑ (which is essentially unique) is chosen, and the goal is transformed into

(B1, …, B_{i−1}, A1, …, An, B_{i+1}, …, Bm)ϑ.

For a more precise account, see [Apt 1990, Lloyd 1987]; for resolution on general clauses, see e.g. [Leitsch 1997]. The complexity of unification will be dealt with in Section 8.

2.3 Datalog

The interest in using logic in databases gave rise to the field of deductive databases; see [Minker 1996] for a comprehensive overview of the development of this area. It appeared that logic programming is a suitable formalism for querying relational databases. In this context, the logic programming based query language datalog and various extensions thereof have been defined.

In the context of logic programming, relational databases are identified with sets of ground facts p(c1, …, cn). Intuitively, all ground facts with the same predicate symbol p represent a data relation. The set of all predicate symbols occurring in the database together with a possibly infinite domain for the argument constants is called the schema of the database. With each database D, we associate a finite universe U_D of constants which encompasses at least all constants appearing in D, but possibly more. In the classical database context, U_D is often identified with the set of all constants appearing in D. But in the datalog context, a larger universe U_D may be suitable in case one wants to derive assertions about items that do not explicitly occur in the database. To understand how datalog works, let us consider a clarifying example.

Example 2.4 Consider a database D containing the ground facts

father(john, mary)
father(joe, kurt)
mother(mary, joe)
mother(tina, kurt)

The schema of this database is the set of relation symbols {father, mother} together with the domain STRING of all alphanumeric strings. With this database, we associate the finite universe U_D = {john, mary, joe, tina, kurt, susan}. Note that susan does not appear in the database but is included in the universe U_D. The following datalog program (or query) P computes all ancestor relationships relative to this database:

parent(X, Y) ← father(X, Y)
parent(X, Y) ← mother(X, Y)
ancestor(X, Y) ← parent(X, Y)
ancestor(X, Y) ← parent(X, Z), ancestor(Z, Y)
person(X) ←

In the program P, father and mother are the input predicates, also called database predicates. Their interpretation is fixed by the given input database D. The predicates ancestor and person are output predicates, and the predicate parent is an auxiliary predicate. Intuitively, the output predicates are those which


are computed as the visible result of the query, while the auxiliary predicates are introduced for representing some intermediate results, which are not to be considered part of the final result.

The datalog program P on input database D computes a result database R with the schema {ancestor, person}, containing among others the following ground facts:

ancestor(mary, joe), ancestor(john, joe), person(john), person(susan).

The last fact is in R because susan is included as a constant in U_D. However, the fact person(harry) is not in R, because harry is not a constant in the finite universe U_D of the database D.
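Example 2.4 can be evaluated naively by grounding the rules over U_D and iterating to a fixpoint, a sketch under our own encoding conventions. Note how the bodiless rule person(X) ← is instantiated for every constant of the universe, which is exactly why person(susan) is derived while person(harry) is not.

```python
# Naive bottom-up evaluation of the datalog query of Example 2.4.
# Atoms are tuples (predicate, arg, ...); variables are capitalized strings.

from itertools import product

U_D = ["john", "mary", "joe", "tina", "kurt", "susan"]

database = {
    ("father", "john", "mary"), ("father", "joe", "kurt"),
    ("mother", "mary", "joe"), ("mother", "tina", "kurt"),
}

rules = [
    (("parent", "X", "Y"), [("father", "X", "Y")]),
    (("parent", "X", "Y"), [("mother", "X", "Y")]),
    (("ancestor", "X", "Y"), [("parent", "X", "Y")]),
    (("ancestor", "X", "Y"), [("parent", "X", "Z"), ("ancestor", "Z", "Y")]),
    (("person", "X"), []),      # grounded over all of U_D
]

def ground(rules, universe):
    """All ground instances of the rules over the universe."""
    def inst(atom, env):
        return (atom[0],) + tuple(env.get(a, a) for a in atom[1:])
    instances = []
    for head, body in rules:
        vars_ = {a for atom in [head] + body for a in atom[1:] if a.isupper()}
        for vals in product(universe, repeat=len(vars_)):
            env = dict(zip(sorted(vars_), vals))
            instances.append((inst(head, env), [inst(b, env) for b in body]))
    return instances

def fixpoint(ground_rules, facts):
    model = set(facts)
    changed = True
    while changed:
        new = {h for h, body in ground_rules if all(b in model for b in body)}
        changed = not new <= model
        model |= new
    return model

model = fixpoint(ground(rules, U_D), database)
```

The result contains ancestor(john, joe), ancestor(mary, joe), person(john), and person(susan), as stated above.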

Formally, a database schema D consists of a finite set Rels(D) of relation names with associated arities and a (possibly countably infinite) domain Dom(D). For each database schema D, we denote by HB(D) the Herbrand base corresponding to the function-free language whose predicate symbols are Rels(D) and whose constant symbols are Dom(D). A database (also, database instance) D over a schema D is given by a finite subset D ⊆ HB(D) of the Herbrand base together with an associated finite universe U_D ⊆ Dom(D) containing all constants actually appearing in D. By abuse of notation, we also write D instead of ⟨D, U_D⟩. We denote by D|p the extension of the relation p ∈ Rels(D) in D. Moreover, INST(D) denotes the set of all databases over D.

A datalog query or a datalog program is a function-free logic program P with three associated database schemas: the input schema D_in, the output schema D_out, and the complete schema D, such that the following is satisfied:

Dom(D_in) = Dom(D_out) = Dom(D);
Rels(D_in) ⊆ Rels(D);  Rels(D_out) ⊆ Rels(D);  and  Rels(D_in) ∩ Rels(D_out) = ∅.

Moreover, each predicate symbol appearing in P is contained in Rels(D), and no predicate symbol from D_in appears in a rule head of P (the latter means that the predicates of the input database are never redefined by a datalog program).

The formal semantics of a datalog program P over the input schema D_in, output schema D_out, and complete schema D is given by a partial mapping from instances of D_in to instances of D_out over the same universe. A result instance of D_out is regarded as the result of the query. More formally, M_P : INST(D_in) → INST(D_out) is defined for all instances D_in ∈ INST(D_in) such that all constants occurring in P appear in U_{D_in}, and maps every such D_in to the database D_out = M_P(D_in) such that U_{D_out} = U_{D_in} and, for every relation p ∈ Rels(D_out),

D_out|p = {a | p(a) ∈ M(ground(P ∪ D_in, L(P, D_in)))},

where M and ground are defined as in Section 2.2, and L(P, D_in) is the language of P extended by all constants in the universe U_{D_in}. For all ground atoms A ∈ HB(D_out), we write P ∪ D_in ⊨ A if A ∈ M_P(D_in), and P ∪ D_in ⊨ ¬A if A ∉ M_P(D_in).

The semantics of datalog is thus inherited from the semantics of logic programming. In a similar way, the semantics of various extensions of datalog is inherited from the corresponding extensions of logic programming.


There are three main kinds of complexity connected to plain datalog and its various extensions [Vardi 1982]:

- The data complexity is the complexity of checking whether D_in ∪ P ⊨ A when datalog programs P are fixed, while input databases D_in and ground atoms A are an input.
- The program complexity (also called expression complexity) is the complexity of checking whether D_in ∪ P ⊨ A when input databases D_in are fixed, while datalog programs P and ground atoms A are an input.
- The combined complexity is the complexity of checking whether D_in ∪ P ⊨ A when input databases D_in, datalog programs P, and ground atoms A are all an input.

Note that for plain datalog, as well as for all other versions of datalog considered in this paper, the combined complexity is equivalent to the program complexity with respect to polynomial-time reductions. This is due to the fact that, with respect to the derivation of ground atoms, each pair ⟨D_in, P⟩ can be easily reduced to the pair ⟨D_∅, P*⟩, where D_∅ is the empty database instance associated with a universe of two constants c1 and c2, and P* is obtained from P ∪ D_in by a straightforward encoding of the universe U_{D_in} using n-tuples over {c1, c2}, where n = ⌈log |U_{D_in}|⌉. For this reason, we mostly disregard the combined complexity in the material concerning datalog. We remark, however, that due to a fixed universe, program complexity may allow for slightly sharper upper bounds than the combined complexity (e.g., ETIME vs. EXPTIME).

Another approach to measuring the complexity of query languages is parametric complexity [Papadimitriou & Yannakakis 1997]. In this approach, the complexity is expressed as a function of some "reasonable" parameters. An example of such a parameter is the number of variables appearing in the query (interest in this parameter is motivated by [Vardi 1995], where it is shown that data and program complexity become close when the number of query variables is bounded).

As for logic programming in general, a generalization of the combined complexity may be regarded as the main complexity measure. Below, when we speak about the complexity of a fragment of logic programming, we mean the following kind of complexity:
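The tuple encoding used in the reduction above can be illustrated concretely: n binary positions over two constants suffice to give each universe element a distinct code, with n = ⌈log₂ |U|⌉. The function below is our own minimal sketch of such an encoding, not the paper's construction verbatim.

```python
# Encode each constant of a finite universe as an n-tuple over two fixed
# constants c1, c2, where n = ceil(log2 |U|) so that distinct constants
# receive distinct tuples.  Illustrative sketch of the reduction's encoding.

from math import ceil, log2

def encode_universe(universe):
    n = max(1, ceil(log2(len(universe))))
    codes = {}
    for i, const in enumerate(sorted(universe)):
        bits = format(i, f"0{n}b")                      # i in binary, n digits
        codes[const] = tuple("c2" if b == "1" else "c1" for b in bits)
    return codes

codes = encode_universe({"john", "mary", "joe", "tina", "kurt", "susan"})
```

For a six-element universe, n = 3, and the six codes are pairwise distinct 3-tuples over {c1, c2}.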

The complexity of (some fragment of) logic programming is the complexity of checking whether P ⊨ A, where both the program P and the ground atom A are part of the input.

3 Complexity classes This section contains definitions for most of the complexity classes which are encountered in this survey and provides other related definitions. A detailed exposition of most complexity notions can be found e.g. in [Papadimitriou 1994]. We follow the notation of [Johnson 1990], where definitions of all complexity classes used in this article can be found.

3.1 Turing machines

Deterministic Turing machines. Informally, we think of a Turing machine as a device able to read from and write on a semi-infinite tape, whose contents may be locally accessed and changed in a computation. Formally, a deterministic Turing machine (DTM) is defined as a quadruple (S, Σ, δ, s0), where S is a finite


set of states, Σ is a finite alphabet of symbols, δ is a transition function, and s0 ∈ S is the initial state. The alphabet Σ contains a special symbol ␣ called the blank. The transition function δ is a map

δ : S × Σ → (S ∪ {halt, yes, no}) × Σ × {-1, 0, +1},

where halt, yes, and no denote three additional states not occurring in S, and -1, 0, +1 denote motion directions. It is assumed here, without loss of generality, that the machine is well-behaved and never moves off the tape, i.e., d ≠ -1 whenever the cursor is on the leftmost cell; this can be ensured by proper design of δ.¹

Let T be a DTM (S, Σ, δ, s0). The tape of T is divided into cells containing symbols of Σ. There is a cursor that may move along the tape. At the start, T is in the initial state s0, and the cursor points to the leftmost cell of the tape. An input string I is written on the tape as follows: the first |I| cells c0, …, c_{|I|−1} of the tape, where |I| denotes the length of I, contain the symbols of I, and all other cells contain ␣. The machine takes successive steps of computation according to δ. Namely, assume that T is in a state s ∈ S and the cursor points to the symbol σ ∈ Σ on the tape. Let

δ(s, σ) = (s′, σ′, d).

Then T changes its current state to s′, overwrites σ with σ′, and moves the cursor according to d. Namely, if d = -1 or d = +1, then the cursor moves to the previous cell or the next one, respectively; if d = 0, then the cursor remains in the same position. When any of the states halt, yes, or no is reached, T halts. We say that T accepts the input I if T halts in yes. Similarly, we say that T rejects the input in the case of halting in no. If halt is reached, we say that the output of T on I is computed. This output, denoted by T(I), is defined as the string contained in the initial segment of the tape which ends before the first blank.

Nondeterministic Turing machines. Like a DTM, a nondeterministic Turing machine, or NDTM, is defined as a quadruple (S, Σ, δ, s0), where S, Σ, s0 are the same as before. Possible operations of the machine are described by δ, which is no longer a function. Instead, δ is a relation:

δ ⊆ (S × Σ) × ((S ∪ {halt, yes, no}) × Σ × {-1, 0, +1}).

A tuple whose first two members are s and σ, respectively, specifies the action of the NDTM when its current state is s and the symbol pointed at by its cursor is σ. If the number of such tuples is greater than one, the NDTM nondeterministically chooses any of them and operates accordingly.

Unlike the case of a DTM, the definition of acceptance and rejection by a NDTM is asymmetric. We say that a NDTM accepts an input if there is at least one sequence of choices leading to the state yes. A NDTM rejects an input if no sequence of choices can lead to yes.

Time and space bounds. The time expended by a DTM T on an input I is defined as the number of steps taken by T on I from the start to halting. If T does not halt on I, the time is considered to be infinite. For a NDTM T, we define the time expended by T on I as ∞ if T does not accept I (respectively, computes no output for I), and otherwise as the minimum over the number of steps in any accepting (respectively, output producing) computation of T.

¹ Some texts assume that Σ has a special symbol which marks the left end of the tape. This symbol can be eliminated by a proper redesign of the machine. For the purpose of this paper, the simpler model without a left end marker is convenient.


The space required by a DTM T on I is the number of cells visited by the cursor during the computation on I. In the case of a NDTM, the space is defined as ∞ if T does not accept I (respectively, computes no output for I), and otherwise as the minimum number of cells visited on the tape over all accepting (respectively, output producing) computations.

Let T be a DTM or a NDTM, and let f be a function from the positive integers to themselves. We say that T halts in time O(f(n)) if there exist positive integers c and n0 such that the time expended by T on any input of length n is not greater than c·f(n) for all n ≥ n0. Likewise, we say that T halts within space O(f(n)) if the space required by T on any input of length n is not greater than c·f(n) for all n ≥ n0, where c and n0 are positive integers.

Assume that a DTM (NDTM) T halts in time O(n^d), where d is a positive integer. Then we call T a polynomial-time DTM (NDTM) and we say that T halts in polynomial time. Similarly, if T halts within space O(n^d), we call T a polynomial-space DTM (NDTM).

3.2 Notation for complexity classes As above, let  be a finite alphabet containing . Let 0 =  n f g, and let L  0  be a language in 0 , i.e. a set of finite strings over 0 . Let T be a DTM or a NDTM such that (i) if x 2 L then T accepts x, and (ii) if x 62 L then T rejects x. Then we say that T decides L. In addition, if T halts in time O(f (n)), we say that T decides L in time O (f (n)). Similarly, if T halts within space O (f (n)), we say that T decides L within space O (f (n)). Observe that if f (n) is a sublinear function, then a Turing machine which halts within space f (n) can not read the whole input string, nor produce a large output. To remedy this problem, a Turing machine T is equipped with a read-only input-tape and a write-only output tape besides the work tape, which contain the input string and the output computed by T , respectively. The time and space requirement of T is defined as above, where only the space used on the work tape counts. In case T halts within sublinear time f (n), random access to the input symbols on the input-tape is provided using a further tape which serves as an index register. In the following, we assume that multi-tape machines as described may be used for deciding languages within sublinear bounds. Let f be a function on positive integers. We define the following sets of languages:

TIME(f(n))   = {L | L is decided by some DTM in time O(f(n))},
NTIME(f(n))  = {L | L is decided by some NDTM in time O(f(n))},
SPACE(f(n))  = {L | L is decided by some DTM within space O(f(n))},
NSPACE(f(n)) = {L | L is decided by some NDTM within space O(f(n))}.

All these sets are examples of complexity classes; other examples will be given below. Note that some functions f can lead to complexity classes with unnatural properties (see [Papadimitriou 1994] for details). However, for “normal” functions such as polynomials, exponentials, or logarithms, the corresponding complexity classes are “normal” too. The complexity classes of most interest are not classes corresponding to particular functions but their unions, such as, for example, the union ∪_{d>0} TIME(n^d) taken over all polynomials of the form n^d. The following abbreviations are used to denote the main complexity classes of this kind:

INFSYS RR 1843-99-05

P        = ∪_{d>0} TIME(n^d),
NP       = ∪_{d>0} NTIME(n^d),
EXPTIME  = ∪_{d>0} TIME(2^{n^d}),
NEXPTIME = ∪_{d>0} NTIME(2^{n^d}),
PSPACE   = ∪_{d>0} SPACE(n^d),
EXPSPACE = ∪_{d>0} SPACE(2^{n^d}),
L        = SPACE(log n),
NL       = NSPACE(log n).

The list contains no abbreviations for the nondeterministic counterparts of PSPACE and EXPSPACE because ∪_{d>0} NSPACE(n^d) coincides with the class PSPACE and ∪_{d>0} NSPACE(2^{n^d}) coincides with the class EXPSPACE [Savitch 1970].

Complementary classes. Any complexity class C has its complementary class, denoted by co-C and defined as follows. For every language L over Σ', let L̄ denote its complement, i.e., the set Σ'* \ L. Then co-C is {L̄ | L ∈ C}.

The polynomial hierarchy. To define the polynomial hierarchy classes, we need to introduce oracle Turing machines. Let A be a language. An oracle DTM T^A, also called a DTM with oracle A, can be thought of as an ordinary DTM augmented by an additional write-only query tape and three additional states query, ∈, and ∉. When T^A is not in the state query, the computation proceeds as usual (in addition, T^A can write on the query tape). When T^A is in the state query, T^A changes this state to ∈ or ∉ depending on whether the string written on the query tape belongs to A or not; furthermore, the query tape is instantaneously erased. As in the case of an ordinary DTM, the expended time is the number of steps and the required space is the number of cells used on the tape and the query tape. An oracle NDTM is defined as the same augmentation of an NDTM. Let C be a set of languages. We define the complexity classes P^C and NP^C as follows. For a language L, we have L ∈ P^C (respectively, L ∈ NP^C) if and only if there is some language A ∈ C and some polynomial-time oracle DTM (respectively, NDTM) T^A such that T^A decides L. The polynomial hierarchy consists of the classes Δ^p_i, Σ^p_i, and Π^p_i defined by the following equalities:

Δ^p_0 = Σ^p_0 = Π^p_0 = P,
Δ^p_{i+1} = P^{Σ^p_i},
Σ^p_{i+1} = NP^{Σ^p_i},
Π^p_{i+1} = co-Σ^p_{i+1},

for all i ≥ 0. The class PH is defined as PH = ∪_{i≥0} Σ^p_i.

Exponential time. Besides EXPTIME and NEXPTIME, we mention in this paper some other classes that characterize computation in exponential time. The classes ETIME and NETIME are defined as

ETIME = ∪_{d>0} TIME(2^{dn})   and   NETIME = ∪_{d>0} NTIME(2^{dn}),

respectively; they capture linear exponents instead of polynomial exponents. The class EXPTIME can be viewed as 1-EXPTIME, where 1 means the first level of exponentiation. Double exponents, triple exponents, etc. are captured by the classes 2-EXPTIME, 3-EXPTIME, etc., defined as

2-EXPTIME = ∪_{d>0} TIME(2^{2^{n^d}}),   3-EXPTIME = ∪_{d>0} TIME(2^{2^{2^{n^d}}}),   …

Their nondeterministic counterparts are defined in the same way but with TIME(f(n)) replaced by NTIME(f(n)).

3.3 Reductions

Let L1 and L2 be languages. Assume that there is a DTM R such that:

1. For all input strings x, we have x ∈ L1 if and only if R(x) ∈ L2, where R(x) denotes the output of R on input x.

2. R halts within space O(log n).

Then R is called a logarithmic-space reduction from L1 to L2, and we say that L1 is reducible to L2. Let C be a set of languages. A language L is called C-hard if any language L' in C is reducible to L. If L is C-hard and L ∈ C, then L is called C-complete, or complete for C. Besides the above notion of a reduction, complexity theory also considers many other kinds of reductions, for example, polynomial-time many-one reductions or polynomial-time Turing reductions (stronger kinds of reductions). In this paper, unless otherwise stated, a reduction means a logarithmic-space reduction. We note that in several cases, results that we shall review have been stated for polynomial-time many-one reductions, but the proofs establish that they hold under logarithmic-space reductions. Furthermore, many results hold under yet weaker reductions such as first-order reductions [see e.g. Immerman 1998]. In the case of weak reductions, as well as in the case of computation with sublinear resource constraints, the particular representation of the problem input as a string I may be a matter of concern. For most of the problems that we describe, and in particular those having complexity at least P, this is not an issue; any “reasonable” representation is appropriate [see e.g. Johnson 1990]. In the other cases, the reader is referred to the original sources for the details.

4 Complexity of plain logic programming

In this section, we survey some basic results on the complexity of plain logic programming. This section is written in a slightly more tutorial style than the following sections, in order to help both readers not familiar with logic programming and readers not too familiar with complexity theory to grasp some key issues relating complexity theory and logic programming.


4.1 Simulation of deterministic Turing machines by logic programs

Let T be a DTM. Consider the computation of T on an input string I. The purpose of this section is to describe a logic program L and a goal G such that L ⊨ G if and only if T accepts I in at most N steps. The transition function δ of a DTM with a single tape can be represented by a table whose rows are tuples t = ⟨s, σ, s', σ', d⟩. Such a tuple t expresses the following if-then rule: if at some time instant τ the DTM is in state s, the cursor points to cell number π, and this cell contains symbol σ, then at instant τ + 1 the DTM is in state s', cell number π contains symbol σ', and the cursor points to cell number π + d. It is possible to describe the complete evolution of a DTM T on input string I from its initial configuration at time instant 0 to the configuration at instant N by a propositional logic program L(T, I, N). To achieve this, we define the following classes of propositional atoms:

symbol_σ[τ, π] for 0 ≤ τ ≤ N, 0 ≤ π ≤ N, and σ ∈ Σ. Intuitive meaning: at instant τ of the computation, cell number π contains symbol σ.

cursor[τ, π] for 0 ≤ τ ≤ N and 0 ≤ π ≤ N. Intuitive meaning: at instant τ the cursor points to cell number π.

state_s[τ] for 0 ≤ τ ≤ N and s ∈ S. Intuitive meaning: at instant τ the DTM T is in state s.

accept. Intuitive meaning: T has reached state yes.

Let us denote by I_k the k-th symbol of the string I = I_0 … I_{|I|−1}. The initial configuration of T on input I is reflected by the following initialization facts in L(T, I, N):

symbol_σ[0, π] ←      for 0 ≤ π < |I|, where I_π = σ
symbol_␣[0, π] ←      for |I| ≤ π ≤ N
cursor[0, 0] ←
state_{s0}[0] ←

Each entry ⟨s, σ, s', σ', d⟩ of the transition table δ is translated into the following propositional Horn clauses, which we call the transition rules. The clauses are asserted for each value of τ and π such that 0 ≤ τ < N, 0 ≤ π < N, and 0 ≤ π + d:

symbol_{σ'}[τ+1, π] ← state_s[τ], symbol_σ[τ, π], cursor[τ, π]
cursor[τ+1, π+d]    ← state_s[τ], symbol_σ[τ, π], cursor[τ, π]
state_{s'}[τ+1]     ← state_s[τ], symbol_σ[τ, π], cursor[τ, π]

These clauses almost perfectly describe what happens during a state transition from an instant τ to an instant τ + 1. However, it should not be forgotten that those tape cells which are not changed during the transition keep their old values at instant τ + 1. This must be reflected by what we term inertia rules. These rules are asserted for each time instant τ and all tape cell numbers π, π', where 0 ≤ τ < N and 0 ≤ π < π' ≤ N, and have the following form:

symbol_σ[τ+1, π]  ← symbol_σ[τ, π], cursor[τ, π']
symbol_σ[τ+1, π'] ← symbol_σ[τ, π'], cursor[τ, π]

Finally, a group of clauses termed accept rules derives the propositional atom accept whenever an accepting configuration is reached:

accept ← state_yes[τ]      for 0 ≤ τ ≤ N.

Denote by L the logic program L(T, I, N). Note that T_L^0 = ∅ and that T_L^1 contains the initial configuration of T at time instant 0. By construction, the least fixpoint T_L^∞ of L is reached at T_L^{N+2}, and the ground atoms added at stage T_L^τ, 2 ≤ τ ≤ N+1, i.e., those in T_L^τ \ T_L^{τ−1}, describe the configuration of T on the input I at the time instant τ − 1. The fixpoint T_L^∞ contains accept if and only if an accepting configuration has been reached by T in at most N computation steps. We thus have:

Lemma 4.1 L(T, I, N) ⊨ accept if and only if the DTM T accepts the input string I within N steps.
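To make the construction concrete, the following Python sketch (our illustration, not part of the original text) builds the program L(T, I, N) for a toy single-tape DTM and evaluates it by naive fixpoint iteration; the toy machine, the tuple encoding of atoms, and the names `horn_encoding` and `fixpoint` are our own assumptions.

```python
from itertools import product

def fixpoint(facts, rules):
    """Naive bottom-up T_P iteration for a propositional Horn program."""
    true = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in true and all(b in true for b in body):
                true.add(head)
                changed = True
    return true

def horn_encoding(delta, s0, I, N, alphabet):
    """Construct L(T, I, N): initialization facts, transition rules,
    inertia rules, and accept rules, exactly as in the text."""
    facts, rules = [], []
    # initialization facts ('B' plays the role of the blank symbol)
    for p in range(N + 1):
        facts.append(('symbol', I[p] if p < len(I) else 'B', 0, p))
    facts.append(('cursor', 0, 0))
    facts.append(('state', s0, 0))
    # transition rules, one instance per time point t and tape cell p
    for (s, sym), (s2, sym2, d) in delta.items():
        for t in range(N):
            for p in range(N):
                if p + d < 0:
                    continue
                body = [('state', s, t), ('symbol', sym, t, p), ('cursor', t, p)]
                rules.append((('symbol', sym2, t + 1, p), body))
                rules.append((('cursor', t + 1, p + d), body))
                rules.append((('state', s2, t + 1), body))
    # inertia rules: unscanned cells keep their contents
    for t in range(N):
        for p, p2 in product(range(N + 1), repeat=2):
            if p != p2:
                for sym in alphabet:
                    rules.append((('symbol', sym, t + 1, p),
                                  [('symbol', sym, t, p), ('cursor', t, p2)]))
    # accept rules
    for t in range(N + 1):
        rules.append(('accept', [('state', 'yes', t)]))
    return facts, rules

# toy DTM: move right over 1s, switch to state yes on the first blank
delta = {('s0', '1'): ('s0', '1', 1), ('s0', 'B'): ('yes', 'B', 0)}
facts, rules = horn_encoding(delta, 's0', "11", N=4, alphabet=['0', '1', 'B'])
print('accept' in fixpoint(facts, rules))
```

On this toy machine and input "11" the atom accept is derived within N = 4 steps, matching Lemma 4.1; removing the ('s0', 'B') entry from delta makes accept underivable.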

A somewhat different simulation of deterministic multi-tape Turing machines by logic programs was given by Itai & Makowsky [1987]. These authors also note that simulating Turing machines by Horn clause theories, and, more generally, by logical deduction has a long history: “The idea of simulating Turing machines by logical deduction goes back to Turing’s original paper [Turing 1936-1937]. Turing introduced his abstract machine concept at a time when computations were considered to be something mechanical, and felt it was necessary to show that logical deduction can be reduced to such a mechanistic model of computation. However, this reduction uses full first-order logic. A reduction using only universal Horn formulas (with function symbols) appears buried in the exposition of Scholz & Hasenjaeger [1961]. It also forms the basis of the theory of formal systems, as presented by Smullyan in his thesis [Smullyan 1961]. The idea of coding Turing machines by Horn formulas appears explicitly in [Büchi 1962] and has been used since 1971 in a series of papers by Aandera, Börger, and Lewis [Aandera & Börger 1979, Börger 1971, Börger 1974, Börger 1984, Lewis 1979] to obtain undecidability and complexity results. Since then, various authors have rediscovered that such a reduction is possible and have used this observation to show that logic programming is computationally complete. The earliest reference we have found that states this result explicitly is [Andréka & Németi 1978]; a slightly weaker result appears in [Tärnlund 1977].” Yet another translation and further references can be found in the recent book [Börger, Grädel & Gurevich 1997].

4.2 Propositional logic programming

The simulation of a DTM by a propositional logic program, as described in Section 4.1, is almost all we need in order to determine the complexity of propositional logic programming, i.e., the complexity of deciding whether P ⊨ A holds for a given logic program P and ground atom A.

Theorem 4.2 (implicit in [Jones & Laaser 1977, Vardi 1982, Immerman 1986]) Propositional logic programming is P-complete.


Proof. 1. Membership. It is obvious that the least fixpoint T_P^∞ of the operator T_P, given program P, can be computed in polynomial time: the number of iterations (i.e., applications of T_P) is bounded by the number of rules plus one, and each iteration step is clearly feasible in polynomial time.

2. Hardness. Let A be a language in P. Thus A is decidable in q(n) steps by a DTM T for some polynomial q. Transform each instance I of A to the corresponding logic program L(T, I, q(|I|)) as described in Section 4.1. By Lemma 4.1, L(T, I, q(|I|)) ⊨ accept if and only if T has reached an accepting state within q(n) steps. The translation from I to L(T, I, q(|I|)) is very simple and is clearly feasible in logarithmic space, since all rules of L(T, I, q(|I|)) can be generated independently of each other and each has size logarithmic in |I|; note that the numbers τ and π have O(log |I|) bits, while all other syntactic constituents of a rule have constant size. We have thus shown that every language A in P is logspace reducible to propositional logic programming. Hence, propositional logic programming is P-hard. □

Obviously, this theorem can be proved by simpler reductions from other P-complete problems, for example from the monotone circuit value problem (see [Papadimitriou 1994]); however, our proof from first principles unveils the computational nature of logic programming and provides a basic framework from which further results will be derived by slight adaptations in the sequel. Notice that in a standard programming environment, propositional logic programming is feasible in linear time by using appropriate data structures, as follows from results about deciding Horn satisfiability [Dowling & Gallier 1984, Itai & Makowsky 1987].
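The linear-time evaluation alluded to above can be sketched in Python in the spirit of the counter-based algorithm of Dowling & Gallier (this is our illustration; the function name and data layout are assumptions): each rule keeps a counter of not-yet-derived body atoms, and each atom is processed only once.

```python
from collections import deque, defaultdict

def horn_fixpoint_linear(rules):
    """Bottom-up evaluation of a propositional Horn program, linear in
    the program size on a RAM: rules are pairs (head, body-list)."""
    count = [len(body) for _, body in rules]   # unsatisfied body atoms per rule
    watch = defaultdict(list)                  # atom -> indices of rules using it
    for i, (_, body) in enumerate(rules):
        for b in body:
            watch[b].append(i)
    derived = set()
    queue = deque(head for head, body in rules if not body)  # the facts
    while queue:
        a = queue.popleft()
        if a in derived:
            continue
        derived.add(a)
        for i in watch[a]:
            count[i] -= 1
            if count[i] == 0:                  # whole body derived: fire the rule
                queue.append(rules[i][0])
    return derived

# P = { p <- ;  q <- p ;  r <- p, q ;  s <- r, t }
rules = [('p', []), ('q', ['p']), ('r', ['p', 'q']), ('s', ['r', 't'])]
print(sorted(horn_fixpoint_linear(rules)))   # prints ['p', 'q', 'r']
```

On the sample program p, q, and r are derived but not s, since t is underivable; every atom is dequeued at most once and every rule counter reaches zero at most once, which is the source of the linear bound.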
This does not mean that all problems in P are solvable in linear time: first, the model of computation used in [Dowling & Gallier 1984] is the RAM machine, and second, logarithmic-space reductions may in general polynomially increase the input.

Theorem 4.2 holds under stronger reductions. In fact, it holds under the requirement that the logspace reduction is also a polylogtime reduction (PLT). Briefly, a map f: Π → Π' from a problem Π to a problem Π' is a PLT-reduction if there are polylogtime deterministic Turing machines N and M such that for all w, N(w) = |f(w)| and, for all w and n, M(w, n) = Bit(n, f(w)), i.e., the n-th bit of f(w) (see e.g. [Veith 1998] for details). (Recall that N and M have separate input tapes whose cells can be accessed by use of an index register tape.) Since the above encoding of a DTM into logic programming is highly regular, it is easily seen that it is a PLT-reduction.

Syntactical restrictions on programs lead to completeness for classes inside P. Let LP(k) denote logic programming where each clause has at most k atoms in the body. Then, by results in [Vardi 1982, Immerman 1987], one easily obtains:

Theorem 4.3 LP(1) is NL-complete.

Proof. (Sketch) 1. Membership. The membership part can be established by reducing this problem to graph reachability, i.e., given a directed graph G = (V, E) and vertices s, t ∈ V, decide whether t is reachable from s. Since graph reachability is in NL and NL is closed under logarithmic-space reductions (i.e., reducibility of a problem A to a problem B in NL implies that A is in NL), it follows that LP(1) is in NL. For a program P from LP(1), the question whether P ⊨ A is equivalent to whether the node true (representing truth) is reachable from the node A in the directed graph G = (V, E) defined as follows. The vertex set V


is the set of atoms in P plus true; the edge set E contains an edge (A, B) directed from A to B for every rule A ← B in P, and an edge (A, true) for every fact A ← in P. Clearly, the graph G is constructible from P in logarithmic space. Thus, the problem is in NL.

2. Hardness. Conversely, graph reachability is easily transformed into deciding P ⊨ A for a program in LP(1). Since graph reachability is NL-complete (thus NL-hard), the result is established. □

Observe that the above DTM encoding can be easily modified to programs in LP(2). Hence, LP(2) is P-complete.
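The translation used in the membership part is short enough to spell out in Python (our illustration; the choice of the reserved node name 'true' and the encoding of facts as rules with empty bodies are assumptions):

```python
from collections import defaultdict

def lp1_to_reachability(rules):
    """Translate an LP(1) program (each body has at most one atom) into a
    digraph: edge A -> B for each rule A <- B, and A -> true for each
    fact A <- (we reserve the node name 'true' for this purpose)."""
    edges = defaultdict(set)
    for head, body in rules:
        edges[head].add(body[0] if body else 'true')
    return edges

def entails(rules, goal):
    """P |= goal  iff  'true' is reachable from the goal node."""
    edges = lp1_to_reachability(rules)
    seen, stack = set(), [goal]
    while stack:
        a = stack.pop()
        if a == 'true':
            return True
        if a in seen:
            continue
        seen.add(a)
        stack.extend(edges.get(a, ()))
    return False

# P = { a <- ;  b <- a ;  c <- b ;  d <- e }
rules = [('a', []), ('b', ['a']), ('c', ['b']), ('d', ['e'])]
print(entails(rules, 'c'), entails(rules, 'd'))   # prints: True False
```

The depth-first search here uses linear space, of course; the point of the reduction is only that the *translation* is logspace, after which the NL reachability procedure of the target problem takes over.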

Further syntactical restrictions on LP(1) yield problems complete for L (of course, under reductions stronger than logspace reductions), which we omit here.

4.3 Complexity of datalog

Let us now turn to datalog, and let us first consider the data complexity. Grounding P on an input database D yields polynomially many clauses in the size of D; hence, the complexity of propositional logic programming is an upper bound for the data complexity. The same holds for the variants of datalog we shall consider in the sequel. The complexity of propositional logic programming is also a lower bound. Thus,

Theorem 4.4 (implicit in [Vardi 1982, Immerman 1986]) Datalog is data complete for P.

In fact, this result follows from the proof of Theorem 7.2 below. An alternative proof of P-hardness can be given by writing a simple datalog meta-interpreter for propositional LP(k), where k is a constant. Represent rules A0 ← A1, …, Ai, where 0 ≤ i ≤ k, by tuples ⟨A0, …, Ai⟩ in an (i+1)-ary relation Ri on the propositional atoms. Then, a program P in LP(k) which is stored this way in a database D(P) can be evaluated by a fixed datalog program PMI(k) which contains for each relation Ri, 0 ≤ i ≤ k, a rule

T(X0) ← T(X1), …, T(Xi), Ri(X0, …, Xi).

Here T(x) intuitively means that atom x is true. Then, P ⊨ A just if PMI(k) ∪ D(P) ⊨ T(A). P-hardness of the data complexity of datalog is then immediate from Theorem 4.2.

The program complexity is exponentially higher.

Theorem 4.5 (implicit in [Vardi 1982, Immerman 1986]) Datalog is program complete for EXPTIME.

Proof. (Sketch) 1. Membership. Grounding P on D leads to a propositional program P' whose size is exponential in the size of the fixed input database D. Hence, by Theorem 4.2, the program complexity is in EXPTIME.

2. Hardness. In order to prove EXPTIME-hardness, we show that if a DTM T halts in less than N = 2^{n^k} steps on a given input I where |I| = n, then T can be simulated by a datalog program over a fixed input database D. In fact, we use D_∅, i.e., the empty database with the universe U = {0, 1}.

We employ the scheme of the DTM encoding into logic programming from above, but use the predicates symbol_σ(X, Y), cursor(X, Y), and state_s(X) instead of the propositional letters symbol_σ[X, Y], cursor[X, Y], and state_s[X], respectively. The time points τ and tape positions π from 0 to 2^m − 1, m = n^k, are represented by m-ary tuples over U, on which the functions τ + 1 and π + d are realized by means of the successor Succ^m from a linear order ≤^m on U^m. For an inductive definition, suppose Succ^i(X, Y), First^i(X), and Last^i(X) tell the successor, the first, and the last element from a linear order ≤^i on U^i, where X and Y have arity i. Then, use rules

Succ^{i+1}(Z, X, Z, Y)  ← Succ^i(X, Y)
Succ^{i+1}(Z, X, Z', Y) ← Succ^1(Z, Z'), Last^i(X), First^i(Y)
First^{i+1}(Z, X)       ← First^1(Z), First^i(X)
Last^{i+1}(Z, X)        ← Last^1(Z), Last^i(X)

Here Succ^1(X, Y), First^1(X), and Last^1(X) on U^1 = U must be provided. For our reduction, we use the usual ordering 0 ≤^1 1 and provide those relations by the ground facts Succ^1(0, 1), First^1(0), and Last^1(1). The initialization facts symbol_σ[0, π] are readily translated into the datalog rules

symbol_σ(X, t_π) ← First^m(X)

where t_π represents the position π, and similarly the facts cursor[0, 0] and state_{s0}[0]. The remaining initialization facts symbol_␣[0, π], where |I| ≤ π ≤ N, are translated to the rule

symbol_␣(X, Y) ← First^m(X), ≤^m(t, Y)

where t represents the number |I|; the order ≤^m is easily defined from Succ^m by two clauses

≤^m(X, X) ←
≤^m(X, Y) ← Succ^m(X, Z), ≤^m(Z, Y)
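To see that these rules indeed implement an m-bit counter, here is a Python sketch (our illustration; the function name and the tuple representation are assumptions) that materializes Succ^i, First^i, and Last^i bottom-up over U = {0, 1} following the inductive rules, and then walks the resulting successor chain:

```python
def build_order(m):
    """Build Succ^i, First^i, Last^i over U^i for U = {0, 1}, i = 1..m,
    mirroring the four inductive datalog rules from the proof sketch."""
    succ = {1: {((0,), (1,))}}
    first, last = {1: (0,)}, {1: (1,)}
    for i in range(1, m):
        s = set()
        # Succ^{i+1}(Z,X,Z,Y) <- Succ^i(X,Y): same leading bit, recurse
        for x, y in succ[i]:
            for z in (0, 1):
                s.add(((z,) + x, (z,) + y))
        # Succ^{i+1}(Z,X,Z',Y) <- Succ^1(Z,Z'), Last^i(X), First^i(Y): the carry
        s.add(((0,) + last[i], (1,) + first[i]))
        succ[i + 1] = s
        first[i + 1] = (0,) + first[i]   # First^{i+1}(Z,X) <- First^1(Z), First^i(X)
        last[i + 1] = (1,) + last[i]     # Last^{i+1}(Z,X) <- Last^1(Z), Last^i(X)
    return succ[m], first[m], last[m]

succ, first, last = build_order(3)
nxt = dict(succ)
chain, x = [first], first
while x in nxt:                 # walk the chain: it enumerates 0..2^m - 1
    x = nxt[x]
    chain.append(x)
print(len(chain), chain[0], chain[-1])
```

For m = 3 the chain has length 8 and runs from (0, 0, 0) to (1, 1, 1), i.e., the successor relation counts in binary over U^m exactly as the reduction requires.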

The transition and inertia rules are easily translated into datalog rules. For realizing τ + 1 and π + d, use in the body atoms Succ^m(X, X'). For example, the clause

symbol_{σ'}[τ+1, π] ← state_s[τ], symbol_σ[τ, π], cursor[τ, π]

is translated into

symbol_{σ'}(X', Y) ← state_s(X), symbol_σ(X, Y), cursor(X, Y), Succ^m(X, X').

The translation of the accept rules is straightforward. For the resulting datalog program P', it holds that P' ∪ D_∅ ⊨ accept if and only if T accepts input I in at most N steps. It is easy to see that P' can be constructed from T and I in logarithmic space. Hence, datalog has EXPTIME-hard program complexity. Note that straightforward simplifications in the construction are possible, which we omit here, as part of it will be reused below. □

Instead of using a generic reduction, the hardness part of this theorem can also be obtained by applying complexity upgrading techniques [Papadimitriou & Yannakakis 1985, Balcázar, Lozano & Torán 1992]. We briefly outline this in the rest of this section. This technique utilizes a conversion lemma [Balcázar et al. 1992] of the form “If Π X-reduces to Π', then s(Π) Y-reduces to s(Π')”; here s(Π) is the succinct variant of Π, where the instances I of Π are given by a Boolean circuit C_I which computes the bits of I (see [Balcázar et al. 1992] for details).


The strongest form of the conversion lemma appears in [Veith 1998], where X is PLT and Y is monotone projection reducibility [Immerman 1987]. The conversion lemma gives rise to an upgrading theorem, which has been subsequently sharpened [Balcázar et al. 1992, Eiter, Gottlob & Mannila 1994, Gottlob, Leone & Veith 1995, Veith 1998] and is stated below in the strongest form of [Veith 1998]. For a complexity class C, denote long(C) = {long(L) | L ∈ C}, where long(L) = ∪_{bin(n) ∈ 1L} {0, 1}^n, i.e., long(L) contains all strings of length n such that n, in binary and with the leading 1 omitted, belongs to L.

Theorem 4.6 Let C1 and C2 be complexity classes such that long(C1) ⊆ C2. If Π is hard for C2 under PLT-reduction, then s(Π) is hard for C1 under monotone projection reduction.

We remark that since monotone projection reduction is very weak, a special encoding of succinct problems is necessary. From the observations in Section 4.2, we then obtain that s(LP(2)) is EXPTIME-hard under monotone projection reductions, where each program P is stored in the database D(P), which is represented by a binary string in the standard way. s(LP(2)) can be reduced to evaluating a datalog program P* over a fixed database as follows. From a succinct instance of LP(2), i.e., a Boolean circuit C_I for I = D(P), Boolean circuits C_i for computing R_i, 0 ≤ i ≤ 2, can be constructed which use negation merely on input gates. Each such circuit C_i(X) can be simulated by straightforward datalog rules. For example, an ∧-gate g_i with input from gates g_j and g_k is described by a rule g_i(X) ← g_j(X), g_k(X), and an ∨-gate g_i is described by the rules g_i(X) ← g_j(X) and g_i(X) ← g_k(X). Observe that Boolean circuits with arbitrary use of negation can be easily simulated in stratified datalog [Kolaitis & Papadimitriou 1991] or disjunctive datalog [Eiter, Gottlob & Mannila 1997]. The desired program P* comprises the rules for the Boolean circuits C_i and the rules of the meta-interpreter PMI(k), which are adapted for a binary encoding of the domain U_{D(P)} of the database D(P) by using binary tuples of arity ⌈log |U_{D(P)}|⌉. This construction is feasible in logarithmic space, from which EXPTIME-hard program complexity of datalog follows. We refer the reader to [Eiter et al. 1994, Eiter, Gottlob & Mannila 1997, Gottlob et al. 1995] for the technical details.
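The gate-by-gate simulation of a monotone circuit by Horn rules can be tried out directly; the following Python sketch (our illustration, with assumed names and a propositional rather than datalog rendering) compiles a circuit into Horn clauses and evaluates them by fixpoint iteration:

```python
def circuit_to_horn(gates, inputs_true):
    """Compile a monotone Boolean circuit into propositional Horn rules:
    an AND-gate g with inputs j, k becomes g <- j, k; an OR-gate becomes
    the two rules g <- j and g <- k; input gates set to 1 become facts."""
    rules = [(g, []) for g in inputs_true]
    for g, (op, j, k) in gates.items():
        if op == 'and':
            rules.append((g, [j, k]))
        else:                         # 'or'
            rules.append((g, [j]))
            rules.append((g, [k]))
    return rules

def fixpoint(rules):
    """Naive bottom-up evaluation of a propositional Horn program."""
    true, changed = set(), True
    while changed:
        changed = False
        for head, body in rules:
            if head not in true and all(b in true for b in body):
                true.add(head)
                changed = True
    return true

# g3 = x1 AND x2, g4 = g3 OR x3; inputs x1 and x3 are 1, x2 is 0
gates = {'g3': ('and', 'x1', 'x2'), 'g4': ('or', 'g3', 'x3')}
derived = fixpoint(circuit_to_horn(gates, ['x1', 'x3']))
print('g4' in derived, 'g3' in derived)   # prints: True False
```

A gate evaluates to 1 exactly if its atom is derivable, which is the correctness invariant behind the datalog rules for the circuits C_i in the construction above.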

4.4 Logic programming with functions

Let us see what happens if we allow function symbols in logic programs. In this case, entailment of an atom is no longer decidable. To prove it, we can, for example, reduce Hilbert’s Tenth Problem to query answering in full logic programming. Natural numbers can be represented using the constant 0 and the successor function s. Addition and multiplication are expressed by the following simple logic program:

X + 0 = X ←
X + s(Y) = s(Z) ← X + Y = Z
X × 0 = 0 ←
X × s(Y) = Z ← X × Y = U, U + X = Z
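The two clauses for + and the two for × translate directly into a recursive interpreter over successor terms; the following Python sketch (our illustration, with numerals encoded as nested tuples) mirrors them clause by clause:

```python
def num(n):
    """Represent the natural number n as the term s(s(...s(0)))."""
    return ('s', num(n - 1)) if n > 0 else '0'

def add(x, y):
    # X + 0 = X;   X + s(Y) = s(Z) <- X + Y = Z
    return x if y == '0' else ('s', add(x, y[1]))

def mul(x, y):
    # X * 0 = 0;   X * s(Y) = Z <- X * Y = U, U + X = Z
    return '0' if y == '0' else add(mul(x, y[1]), x)

def to_int(t):
    """Decode a successor term back into a Python integer."""
    return 0 if t == '0' else 1 + to_int(t[1])

print(to_int(add(num(2), num(3))), to_int(mul(num(2), num(3))))  # prints: 5 6
```

Undecidability of course does not come from this fragment alone, which only computes; it arises once arbitrary polynomial equations over these terms must be solved, as the reduction from Hilbert's Tenth Problem exploits.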

Now, undecidability of full logic programming follows from the undecidability of diophantine equations [Matiyasevič 1970]. More precisely, the reduction shows that full logic programming can express r.e.-complete languages. On the other hand, the least fixpoint T_P^∞ of any logic program P is clearly an r.e. set. This shows r.e.-completeness of logic programming.


Theorem 4.7 ([Andréka & Németi 1978, Tärnlund 1977]) Logic programming is r.e.-complete.

Of course, this theorem may as well be proved by a simple encoding of Turing machines similar to the encoding in the proof of Theorem 4.5 (use terms f^n(c), n ≥ 0, for representing cell positions and time instants). It is interesting to note that Smullyan [1956] asserted, quite some time before the first proposals of logic programming, a closely related result which essentially says that, in our terms, the minimal model semantics of logic programming over arithmetic yields the r.e. sets. Theorem 4.7 was generalized in [Voronkov 1995] to the more expressive S-semantics and C-semantics [Falaschi, Levi, Martelli & Palamidessi 1989]. On the other hand, it was sharpened to syntactical classes of logic programs. E.g., Tärnlund [1977] used binary Horn clause programs to simulate a universal Turing machine. By a transformation from binary Horn clause programs, Šebelík & Štěpánek [1982] showed that a class of logic programs called stratifiable (in a sense different from the one in Section 5.1) is r.e.-complete. Furthermore, [Štěpánek & Štěpánková 1986] proved that (an inessential variant of) PRIMLOG [see Markusz & Kaposi 1982] is r.e.-complete, which restricts considerably the size of AND- and OR-branching and allows recursion to be used explicitly in only a single clause of a particular type. The proof shows that all μ-recursive functions can be expressed within this fragment.

A natural decidable fragment of logic programming with functions are nonrecursive programs, in which intuitively no predicate depends syntactically on itself (see Section 5.1 for a definition). Their complexity is characterized by the following theorem.

Theorem 4.8 ([Dantsin & Voronkov 1997b]) Nonrecursive logic programming is NEXPTIME-complete.

The membership is established by applying SLD-resolution with constraints. The size of the derivation turns out to be exponential.
NEXPTIME-hardness is proved by reduction from the tiling problem for the square 2^n × 2^n.

Some other fragments of logic programming with function symbols are known to be decidable too. For example, the following result was established in [Shapiro 1984], by using a simulation of alternating Turing machines by logic programs and vice versa.

Theorem 4.9 ([Shapiro 1984]) Logic programming with function symbols is PSPACE-complete, if each rule is restricted as follows: the body contains only one atom, the size of the head is greater than or equal to that of the body, and the number of occurrences of any variable in the body is less than or equal to the number of its occurrences in the head.

The simulation assumed that the input to an alternating Turing machine is written on the work-tape. Extending the simulation by a distinguished input-tape, [Štěpánek & Štěpánková 1986] showed that the class of logic programs having logarithmic (respectively, polynomial) goal-size complexity is P-complete (respectively, EXPTIME-complete). Here, the goal-size complexity is the maximal size of any subgoal (in terms of symbols) occurring in the proof tree of a goal. Related notions of complexity and normal forms of programs, defined in terms of computation trees [Štěpánková & Štěpánek 1984], are studied in [Ochozka, Štěpánek & Štěpánková 1988].

We refer to [Blair 1982, Fitting 1987a, Fitting 1987b] for further material on recursion-theoretic issues related to logic programming. (In the context of recursion theory, reducibility of a language (or problem) L1 to L2 is understood in terms of a Turing reduction, i.e., L1 can be decided by a DTM with oracle L2, rather than a logarithmic-space reduction.)


4.5 Further issues

Besides data and combined complexity, many other complexity aspects of logic programs have been investigated, in particular in the context of datalog. We discuss here some of the issues that have received broad attention.

Sirups. A strongly restricted class of logic programs often considered in the literature is the class of single rule programs (sirups), or programs consisting of one recursive rule and some nonrecursive (initialization) rules or atoms. For a long time, the decidability of the following problem was open: Given an LP P (with function symbols) that consists of a unique recursive rule and a set of ground atoms, and given a ground goal G, does it hold that P ⊨ G? This problem is equivalent to the Horn clause implication problem, i.e., checking whether the universal closure of a Horn clause C1 logically implies the universal closure of a Horn clause C2. The problem was shown to be undecidable in [Marcinkowski & Pacholski 1992]. Some decidable special cases of this problem were studied in [Gottlob 1987, Leitsch & Gottlob 1990, Leitsch 1990]. Several undecidability results of inference and satisfiability problems for various restricted forms of sirups with non-ground atoms or with nonrecursive rules can be found in [Devienne 1990, Devienne, Lebègue & Routier 1993, Hanschke & Würtz 1993, Devienne, Lebègue, Parrain, Routier & Würtz 1996].

Datalog sirups are EXPTIME-complete with respect to program and combined complexity; this remains true even for datalog sirups consisting of a unique rule and no facts [Gottlob 1999]. It follows that deciding whether (the universal closure of) a datalog clause logically implies (the universal closure of) another datalog clause is EXPTIME-complete, too. The problem of evaluating a nonrecursive Horn clause (with or without function symbols) over a set of ground facts is NP-complete [Chandra & Merlin 1977] (even for a fixed set of ground facts).
(Here by “evaluation”, we mean determining whether a rule fires.) This problem is computationally equivalent to the problem of evaluating a Boolean conjunctive query over a database, i.e., a datalog clause whose body contains only input predicates, and also to the well-known NP-complete clause subsumption problem [Garey & Johnson 1979] (see below). The parametric complexity of conjunctive queries is studied in [Papadimitriou & Yannakakis 1997]. With respect to data complexity, datalog sirups are complete for P, and thus in general inherently sequential [cf. Kanellakis 1988]. There are, however, many interesting special cases in which sirup queries can be evaluated in parallel.

Inside P and parallelization issues. In [Ullman & van Gelder 1988] the polynomial fringe property is studied. Roughly, a datalog program P has the polynomial fringe property if it is guaranteed that for each database D and goal G such that P ∪ D ⊨ G, there is a derivation tree whose fringe (i.e., set of leaves) is of polynomial size. The data complexity of datalog programs with the polynomial fringe property is in LOGCFL, which is the class of all languages (that is, problems) that are reducible in logarithmic space to a context-free language. LOGCFL is a subclass of NC^2, and thus contains highly parallelizable problems [Johnson 1990]; furthermore, programs whose fringe is superpolynomial (i.e., O(2^{log^k n})) are in NC [Ullman & van Gelder 1988, Kanellakis 1988]. Here NC^2 is the second level of the NC-hierarchy of complexity classes NC^i. These classes are defined by families of uniform Boolean circuits of depth O(log^i n) [Johnson 1990]. An example of programs with the polynomial fringe property are linearly recursive sirups; however, there also exist nonlinear sirups that are not equivalent to any linear sirup and are still in NC [Afrati & Cosmadakis 1989].
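The NP upper bound for evaluating a Boolean conjunctive query is witnessed by guessing an assignment for the query's variables and checking each body atom; a brute-force Python sketch (our illustration; the convention that uppercase strings denote variables is an assumption) makes the guess-and-check structure explicit:

```python
from itertools import product

def eval_bcq(query, db, domain):
    """Evaluate a Boolean conjunctive query (a list of atoms (pred, args))
    over a database (a set of ground atoms) by trying every assignment of
    domain values to the variables -- exponential in the number of
    variables, matching the NP upper bound."""
    vars_ = sorted({t for _, args in query for t in args
                    if isinstance(t, str) and t.isupper()})
    for vals in product(domain, repeat=len(vars_)):
        env = dict(zip(vars_, vals))
        if all((pred, tuple(env.get(a, a) for a in args)) in db
               for pred, args in query):
            return True
    return False

# query:  <- e(X, Y), e(Y, Z)   over the edge relation e = {(1,2), (2,3)}
db = {('e', (1, 2)), ('e', (2, 3))}
query = [('e', ('X', 'Y')), ('e', ('Y', 'Z'))]
print(eval_bcq(query, db, domain=[1, 2, 3]))   # prints: True
```

The sample query asks whether the edge relation contains a path of length two; replacing it by the atom e(X, X) yields False, since the database has no self-loop.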


In [Kanellakis 1988], the polynomial (superpolynomial) tree-size property for width k is considered. Roughly, a datalog program has this property if every derivable atom can be obtained by a width-k derivation tree of polynomial (superpolynomial) size. A width-k derivation tree is a generalized derivation tree, where each node may represent up to k ground atoms. For width k = 1, the polynomial (resp., superpolynomial) tree-size property coincides with the polynomial (resp., superpolynomial) fringe property; however, for higher widths, the former properly generalizes the latter. Kanellakis [1988] shows that the data complexity of datalog programs having the polynomial (resp., superpolynomial) tree-size property for any fixed constant width is in LOGCFL (resp., in NC).

The hypergraph (V, E) associated with a Horn clause or conjunctive query has as its set V of vertices the set of variables occurring in the rule; its set E of hyperedges contains, for each atom A in the rule body, a hyperedge consisting of the variables occurring in A. If the hypergraph associated with a nonrecursive rule is acyclic, the evaluation problem is feasible in polynomial time [Yannakakis 1981] and is actually complete for LOGCFL and thus highly parallelizable [Gottlob, Leone & Scarcello 1998]. For generalizations of this result to various types of nearly acyclic hypergraphs, see [Gottlob, Leone & Scarcello 1999].

While determining whether a datalog program is parallelizable, i.e., has data complexity in NC, is in general undecidable [Ullman & van Gelder 1988, Gaifman, Mairson, Sagiv & Vardi 1987], the problem has been completely resolved by [Afrati & Papadimitriou 1993] for an interesting and relevant class of sirups called simple chain queries. These are logic programs with a single recursive rule whose right-hand side consists of binary relations forming a chain. An example of such a rule, involving a database predicate a, is

s(X, Y) ← a(X, Z1), s(Z1, Z2), s(Z2, Z3), a(Z3, Y).
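The polynomial data complexity of such sirups is witnessed by naive bottom-up evaluation. The following toy sketch (our own code, which additionally assumes a base rule s(X, Y) ← a(X, Y), since the survey shows only the recursive rule) evaluates the sirup over a finite edge relation a:

```python
def eval_chain_sirup(a):
    """Naive bottom-up evaluation of the chain sirup
         s(X, Y) <- a(X, Z1), s(Z1, Z2), s(Z2, Z3), a(Z3, Y)
    together with the assumed base rule s(X, Y) <- a(X, Y),
    over a finite set of pairs a."""
    s = set(a)  # assumed base rule
    while True:
        # One round of the recursive rule, joining body atoms left to right.
        new = {(x, y)
               for x, z1 in a
               for w1, z2 in s if w1 == z1
               for w2, z3 in s if w2 == z2
               for w3, y in a if w3 == z3}
        if new <= s:        # fixpoint reached
            return s
        s |= new
```

On a = {(1, 2), (2, 3), (3, 4), (4, 5)} this derives the single extra fact s(1, 5); the number of rounds is polynomial in the database size, in line with the data-complexity bound.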

Afrati & Papadimitriou [1993] show that (unless P = NC) simple chain queries are either complete for P or in NC. They give a precise characterization of the P-complete and the NC-computable simple chain queries.

Boundedness. Many papers have been devoted to the decidability of the boundedness problem for datalog programs. A datalog program P is bounded if there exists a constant k such that for all databases D, the number of iteration steps needed in order to compute the least fixed point M(ground(P ∪ D, L(P, D))) is bounded by k and is thus independent of D (it depends on P only). Boundedness is an interesting property because, as shown in [Ajtai & Gurevich 1994], a datalog program is bounded if and only if it is equivalent to a first-order query. For important related results on the equivalence of recursive and nonrecursive datalog queries, see [Chaudhuri & Vardi 1997]. The undecidability of boundedness for general datalog programs was shown in [Gaifman et al. 1987], for linear recursive queries in [Vardi 1988], and for sirups in [Abiteboul 1989]. There is a very large number of papers discussing the decidability of boundedness issues, both for syntactic restrictions of datalog programs or sirups and for variants of boundedness such as uniform boundedness. Good surveys of early work are given in [Kanellakis 1988] and in [Kanellakis 1990]. The following is an incomplete list of papers where important results and further relevant references on decidability issues of boundedness or uniform boundedness can be found: [Hillebrand, Kanellakis, Mairson & Vardi 1995, Marcinkowski 1996b, Marcinkowski 1996a]. Sufficient conditions for boundedness were given in [Minker & Nicolas 1982, Sagiv 1985, Ioannidis 1986, Vardi 1988, Naughton 1989, Cosmadakis 1989, Naughton & Sagiv 1987, Naughton & Sagiv 1991].

Containment, equivalence, and subsumption. Issues that have been studied repeatedly in the context of query optimization are query equivalence and containment.
Query containment is the problem, given two datalog programs P1 and P2 having the same input schema D_in and output schema D_out, of deciding whether for every input database D_in, the output of P1 is contained in the output of P2, i.e., M_{P1}(D_in)|_p ⊆ M_{P2}(D_in)|_p holds for every relation p ∈ D_out. As shown by Shmueli [1987], containment and equivalence are undecidable for datalog programs; however, a stronger form of uniform containment is decidable [Sagiv 1988]. In the case where P1 and P2 contain only conjunctive queries, containment and equivalence are NP-complete [Sagiv & Yannakakis 1981], and remain NP-complete even if P1 and P2 consist of single conjunctive queries [Chandra & Merlin 1977]. If the domain has a linear order ≤ and comparison literals t1 ≤ t2, t1 < t2, and t1 ≠ t2 may be used in rule bodies, then the containment problem for single conjunctive queries is Π^p_2-complete [van der Meyden 1997]; this result generalizes to sets of conjunctive queries. As shown in [van der Meyden 1997], conjunctive query containment is still co-NP-complete if the database relations are monadic, but polynomial if an additional sequentiality restriction is imposed on order literals. Containment of a nonrecursive datalog program P1 in a recursive datalog program P2 is decidable, since P1 can be rewritten to a set of conjunctive queries, and deciding whether a conjunctive query is contained in an arbitrary (recursive) datalog program is EXPTIME-complete [Cosmadakis & Kanellakis 1986, Chandra, Lewis & Makowsky 1981]. Chaudhuri & Vardi [1994] have investigated the converse problem, i.e., containment of a recursive datalog program P1 in a nonrecursive datalog program P2. They showed that the problem is 3-EXPTIME-complete in general and 2-EXPTIME-complete if P2 is a set of conjunctive queries. Furthermore, they showed that deciding equivalence of a recursive and a nonrecursive datalog program is 3-EXPTIME-complete. We observe that the containment problem for conjunctive queries is equivalent to the clause subsumption problem.
A clause C subsumes a clause D if there exists a substitution θ such that Cθ ⊆ D; subsumption algorithms are discussed in [Gottlob & Leitsch 1985b, Gottlob & Leitsch 1985a, Bachmair, Chen, Ramakrishnan & Ramakrishnan 1996]. This equivalence extends to sets of conjunctive queries, i.e., in essence to nonrecursive datalog programs [Sagiv & Yannakakis 1981]. For a discussion of subsumption-based and other notions of equivalence for logic programs, see [Maher 1988]. The clause subsumption problem plays a very important role in the field of inductive logic programming (ILP) [Muggleton 1992]; for complexity results on ILP, consult [Kietz & Džeroski 1994, Gottlob, Leone & Scarcello 1997]. A problem related to clause subsumption is clause condensation, i.e., removing redundancy from a clause. Complexity results and algorithms for clause condensation can be found in [Gottlob & Fermüller 1993]. The complexity of the clause evaluation problem and of other related problems on generalized Herbrand interpretations, which may contain nonground atoms, is studied in [Gottlob & Pichler 1998].
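The θ-subsumption test just defined can be sketched by a small backtracking search (a toy illustration in our own encoding, not one of the optimized algorithms of the cited papers); variables follow the Prolog convention of starting with an uppercase letter:

```python
def is_var(t):
    # Prolog convention: variables start with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()

def subsumes(c, d):
    """Does clause c subsume clause d, i.e., is there a substitution theta
    with c*theta a subset of d?  Literals are tuples (pred, arg1, ..., argn)."""
    def match(theta, lit_c, lit_d):
        # Try to extend theta so that lit_c*theta equals lit_d.
        if lit_c[0] != lit_d[0] or len(lit_c) != len(lit_d):
            return None
        theta = dict(theta)
        for s, t in zip(lit_c[1:], lit_d[1:]):
            if is_var(s):
                if theta.setdefault(s, t) != t:
                    return None      # variable already bound differently
            elif s != t:             # constants must agree
                return None
        return theta

    def search(lits, theta):
        if not lits:
            return True
        first, rest = lits[0], lits[1:]
        return any(th is not None and search(rest, th)
                   for th in (match(theta, first, lit_d) for lit_d in d))

    return search(list(c), {})
```

For instance, p(X, Y) subsumes p(a, b) via θ = {X → a, Y → b}, while p(X, X) does not subsume p(a, b), since both occurrences of X must be mapped consistently.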

5 Complexity of logic programming with negation

5.1 Stratified negation

A literal L is either an atom A (called a positive literal) or a negated atom ¬A (called a negative literal). Literals A and ¬A are complementary; for any literal L, we denote by ¬.L its complementary literal, and for any set Lit of literals, ¬.Lit = {¬.L | L ∈ Lit}. A normal clause is a rule of the form

A ← L1, …, Lm    (m ≥ 0)    (1)

where A is an atom and each Li is a literal. A normal logic program is a finite set of normal clauses. The semantics of normal logic programs is not straightforward, and numerous proposals exist [cf. Bidoit 1991, Apt & Bol 1994]. However, there is general consensus for stratified normal logic programs.


A normal logic program P is stratified [see Apt, Blair & Walker 1988], if there is an assignment str(·) of integers 0, 1, … to the predicates p in P, such that for each clause r in P the following holds: if p is the predicate in the head of r and q the predicate in some Li from the body, then str(p) ≥ str(q) if Li is positive, and str(p) > str(q) if Li is negative.

Example 5.1 Reconsider the steam turbine scenario in Example 2.1, and let us add the following rules to the program there:

check_sensors ← signal_error
signal_error ← valve_closed, ¬signal_1
signal_error ← pressure_loss, ¬signal_2
signal_error ← overheat, ¬signal_3

These rules express knowledge about potential signal errors, which must be handled by checking the sensors. The augmented program P is stratified: e.g., for the assignment str(A) = 1 for A ∈ {check_sensors, signal_error} and str(B) = 0 for any other atom B occurring in P, the condition of stratification is satisfied.

The reduct of a normal logic program P by a Herbrand interpretation I [Gelfond & Lifschitz 1988], denoted P^I, is the set of ground clauses obtained from ground(P) as follows: first remove every clause r with a negative literal L in the body such that ¬.L ∈ I, and then remove all negative literals from the remaining rules. Notice that P^I is a set of ground Horn clauses.

The semantics of a stratified normal program P is then defined as follows. Take an arbitrary stratification str. Denote by P_k the set of rules r such that str(p) = k, where p is the head predicate of r. Define a sequence of Herbrand interpretations: M0 = ∅, and M_{k+1} is the least Herbrand model of P_k^{M_k} ∪ M_k for k ≥ 0. Finally, let

M_str(P) = ⋃_i M_i ∪ {¬A | A ∉ ⋃_i M_i}.

The semantics M_str does not depend on the stratification str [Apt et al. 1988]. Note that in the propositional case M_str(P) is polynomially computable.

Example 5.2 We consider the program P in Example 5.1. For the stratification str(·) of P given there, P_0 contains the clauses listed in Example 2.1, and P_1 the clauses introduced in Example 5.1. Then,

M0 = ∅,   P_0^{M0} = P_0,
M1 = T_{P_0}^∞,   P_1^{M1} = { check_sensors ← signal_error,  signal_error ← overheat },
M2 = T_{P_0}^∞,

where T_{P_0}^∞ = {signal_1, signal_2, valve_closed, pressure_loss, leak, shutdown}. Thus, M_str(P) = T_{P_0}^∞ ∪ {¬signal_3, ¬overheat, ¬signal_error, ¬check_sensors}.
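In the propositional case the iterated least-model construction is directly executable. The following toy sketch (our own encoding, demonstrated on a small fragment inspired by Example 5.1 rather than the full turbine program) evaluates one stratum at a time via the reduct:

```python
def least_model(horn, facts=frozenset()):
    # Iterate T_P for ground Horn rules (head, body) up to the least fixpoint.
    model, changed = set(facts), True
    while changed:
        changed = False
        for head, body in horn:
            if head not in model and all(b in model for b in body):
                model.add(head)
                changed = True
    return model

def stratified_model(strata):
    """strata: list of rule sets, one per stratum, ordered by str();
    a rule is a triple (head, positive_body, negative_body)."""
    model = set()
    for rules in strata:
        # Reduct w.r.t. the lower strata: by stratification, all negative
        # literals refer to predicates that are already fully decided.
        reduct = [(h, pos) for h, pos, neg in rules
                  if not any(a in model for a in neg)]
        model = least_model(reduct, model)
    return model

# Stratum 0: facts; stratum 1: rules negating stratum-0 atoms.
P0 = [("valve_closed", (), ()), ("signal_1", (), ())]
P1 = [("signal_error", ("valve_closed",), ("signal_1",)),
      ("check_sensors", ("signal_error",), ())]
```

With signal_1 present, the negative literal blocks signal_error, exactly as the reduct construction prescribes; without it, both signal_error and check_sensors are derived.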

Theorem 5.3 (implicit in [Apt et al. 1988]) Stratified propositional logic programming with negation is P-complete. Stratified datalog with negation is data complete for P and program complete for EXPTIME.

For full logic programming, stratified negation yields the arithmetical hierarchy.

Theorem 5.4 ([Apt & Blair 1988]) Logic programming with n levels of stratified negation is Σ^0_{n+1}-complete.


signature      not range-restricted                            range-restricted
               no negation   with negation                     no negation   with negation
(≥2, 0, 0)     PSPACE        PSPACE                            PSPACE        PSPACE
(ω, 1, 0)      PSPACE        NEXPTIME                          PSPACE        PSPACE
(ω, ≥2, 0)     NEXPTIME      TA(2^{O(n/log n)}, O(n/log n))    PSPACE        PSPACE
(ω, ω, ≥1)     NEXPTIME      NONELEMENTARY(n)                  NEXPTIME      TA(2^{n/log n}, n/log n)

Table 1: Summary of results.

Recall here that Σ^0_{n+1} denotes the relations over the natural numbers that are definable in arithmetic by means of a first-order formula Φ(Ȳ) = ∃X̄0 ∀X̄1 ⋯ Q X̄n ψ(X̄0, …, X̄n, Ȳ) with free variables Ȳ, where the quantifiers alternate and ψ is quantifier-free; in particular, Σ^0_1 contains the r.e. sets. Further complexity results on stratification can be found in [Blair & Cholak 1994, Palopoli 1992].

A particular case of stratified negation are nonrecursive logic programs. A program is nonrecursive if and only if it has a stratification such that each predicate p occurs in its defining stratum P_{str(p)} only in the heads of rules.

Theorem 5.5 (implicit in [Immerman 1987, Vardi 1982]) Nonrecursive propositional logic programming with negation is P-complete. Nonrecursive datalog with negation is program complete for PSPACE, and its data complexity is in the class AC^0, which contains the languages recognized by unbounded fan-in circuits of polynomial size and constant depth [Johnson 1990].

Vorobyov & Voronkov [1998] classified the complexity of nonrecursive logic programming depending on the signature, the presence of negation, and range-restriction. A clause is called range-restricted if every variable occurring in the clause also occurs in a positive literal in the body. A program P is range-restricted if so is every clause in P. Range-restricted clauses have a number of good properties, for example domain independence. Before presenting the results of Vorobyov & Voronkov [1998], we explain the notation for signatures used in their paper. The tuple (k, l, m) denotes the signature with k constants, l unary function symbols and m function symbols of arity ≥ 2. The complexity of nonrecursive logic programming is summarized in Table 1. In this table, TA(f(n), g(n)) means the class of functions computable on alternating Turing machines [Chandra, Kozen & Stockmeyer 1981] using g(O(n)) alternations with time f(O(n)) on every branch.
Such classes are closed under polylin (and loglin) reductions, i.e., reductions running in polynomial time (respectively, logarithmic space) whose output is linearly bounded in the input. Such complexity classes arise in connection with the complexity characterization of logical theories [Berman 1977, Berman 1980]. In order to define NONELEMENTARY(n), define functions e_0(n) = n, e_{k+1}(n) = 2^{e_k(n)}, and e_∞(n) = e_n(0). Recall that a problem is called elementary recursive if it can be decided within time bounded by e_k(n) for some fixed natural number k. Then NONELEMENTARY(f(n)) is the class of problems with lower and upper time bounds of the form e_∞(f(cn)) and e_∞(f(dn)), respectively, for some c, d > 0. In all cases in the table we have completeness for the corresponding complexity class, except for NONELEMENTARY(n) (in this case both the lower and the upper bound are linearly growing towers of 2's). Thus, there is a huge difference between nonrecursive datalog with negation and nonrecursive logic programming with negation in their program complexity, namely PSPACE vs. NONELEMENTARY(n). At the same time, as [Vardi 1982] and the following result show, both languages have polynomial data


complexity. Theorem 5.6 ([Dantsin & Voronkov 1998]) Nonrecursive logic programming with negation has polynomial data complexity.

5.2 Well-founded negation

Roughly speaking, the well-founded semantics (WFS) [van Gelder, Ross & Schlipf 1991] assigns the value “unknown” to an atom A if it is defined by unstratified negation. Briefly, WFS can be defined as follows [Baral & Subrahmanian 1993]. Let F_P be the operator F_P(I) = T_{P^I}^∞. Since F_P is anti-monotone, F_P^2 is monotone, and thus has a least and a greatest fixpoint, denoted by F_P^2↑∞ and F_P^2↓∞, respectively. Then, the meaning of a program P under WFS, M_wfs(P), is

M_wfs(P) = F_P^2↑∞ ∪ {¬A | A ∉ F_P^2↓∞}.

Note that on stratified programs, WFS and stratified semantics coincide.

Theorem 5.7 (implicit in [van Gelder 1989, van Gelder et al. 1991]) Propositional logic programming with negation under WFS is P-complete. Datalog with negation under WFS is data complete for P and program complete for EXPTIME.

The question whether P ⊨_wfs A can be decided in linear time is open [Berman, Schlipf & Franco 1995]. A fragment of datalog with well-founded negation that has linear data complexity and, under certain restrictions, also linear combined complexity, was recently identified in [Gottlob, Grädel & Veith 1998]. This fragment, called datalog LITE, is well-suited for expressing temporal properties of a finite-state system represented as a Kripke structure. It is more expressive than CTL and some other well-known temporal logics used in automatic verification. For full logic programming, the following is known.

Theorem 5.8 ([Schlipf 1995b]) Logic programming with negation under WFS is Π^1_1-complete.

The class Π^1_1 belongs to the analytical hierarchy (in a relational form) and contains those relations which are definable by a second-order formula Φ(X̄) = ∀P̄ ψ(P̄, X̄), where P̄ is a tuple of predicate variables and ψ is a first-order formula with free variables X̄. For more details about this class in the context of logic programming, see e.g. [Schlipf 1995b, Eiter & Gottlob 1997].
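For propositional programs, the alternating fixpoint of F_P^2 underlying Theorem 5.7 is easy to run; the following is a toy sketch in our own encoding, iterating F_P from below and above simultaneously:

```python
def gl_operator(rules, interp):
    """F_P(I): least model of the reduct P^I, for propositional rules
    given as triples (head, positive_body, negative_body)."""
    reduct = [(h, pos) for h, pos, neg in rules
              if not any(a in interp for a in neg)]
    model, changed = set(), True
    while changed:
        changed = False
        for h, pos in reduct:
            if h not in model and all(b in model for b in pos):
                model.add(h)
                changed = True
    return frozenset(model)

def well_founded(rules):
    """Iterate the anti-monotone F_P from the empty and the full
    interpretation; returns (true atoms, false atoms), the rest is unknown."""
    atoms = ({h for h, _, _ in rules} |
             {a for _, pos, neg in rules for a in pos + neg})
    lo, hi = frozenset(), frozenset(atoms)
    while True:
        lo2, hi2 = gl_operator(rules, hi), gl_operator(rules, lo)
        if (lo2, hi2) == (lo, hi):
            return set(lo), atoms - hi
        lo, hi = lo2, hi2

# d is true, hence c is false; a and b remain unknown:
P = [("a", (), ("b",)), ("b", (), ("a",)),
     ("d", (), ()), ("c", (), ("d",))]
```

On the sample program the mutually recursive pair a/b stays undefined, while d (a fact) and c (blocked by d) are decided, matching M_wfs.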

5.3 Stable model semantics

An interpretation I of a normal logic program P is a stable model of P [Gelfond & Lifschitz 1988], if I = T_{P^I}^∞, i.e., I is the least Herbrand model of P^I. In general, a normal logic program P may have zero, one, or multiple stable models.

Example 5.9 Let P be the following program:

sleep ← ¬work
work ← ¬sleep

Then M1 = {sleep} and M2 = {work} are the stable models of P.
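The defining condition I = T_{P^I}^∞ can be checked naively over all candidate interpretations (exponentially many, in line with the NP-hardness shown below); a toy sketch in our own encoding:

```python
from itertools import chain, combinations

def reduct_least_model(rules, interp):
    # Least model of the Gelfond-Lifschitz reduct P^I.
    reduct = [(h, pos) for h, pos, neg in rules
              if not any(a in interp for a in neg)]
    model, changed = set(), True
    while changed:
        changed = False
        for h, pos in reduct:
            if h not in model and all(b in model for b in pos):
                model.add(h)
                changed = True
    return model

def stable_models(rules):
    # Enumerate all interpretations I and keep those with I = T^inf of P^I.
    atoms = sorted({h for h, _, _ in rules} |
                   {a for _, pos, neg in rules for a in pos + neg})
    subsets = chain.from_iterable(
        combinations(atoms, r) for r in range(len(atoms) + 1))
    return [i for i in map(set, subsets)
            if reduct_least_model(rules, i) == i]

# Example 5.9:  sleep <- not work.   work <- not sleep.
P = [("sleep", (), ("work",)), ("work", (), ("sleep",))]
```

The sketch reproduces the two stable models of Example 5.9, and confirms that a rule such as accept ← ¬accept admits no stable model at all.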


Denote by SM(P) the set of stable models of P. The meaning M_st of P under the stable model semantics (SMS) is

M_st(P) = ⋂_{M ∈ SM(P)} (M ∪ ¬.(B_P \ M)).

Note that every stratified P has a unique stable model, and its stratified and stable semantics coincide. Unstratified rules increase complexity.

Theorem 5.10 (Marek & Truszczyński [1991], Bidoit & Froidevaux [1991]) Given a propositional normal logic program P, deciding whether SM(P) ≠ ∅ is NP-complete.

Proof. 1. Membership. Clearly, P^I is polynomial-time computable from P and I. Hence, a stable model M of P can be guessed and checked in polynomial time.

2. Hardness. Modify the DTM encoding in Section 4 for a nondeterministic Turing machine T as follows. For each state s and symbol σ, introduce atoms B_{s,σ,1}[τ], …, B_{s,σ,k}[τ] for all 1 ≤ τ < N and transitions ⟨s, σ, s_i, σ_i′, d_i⟩, where 1 ≤ i ≤ k. Add B_{s,σ,i}[τ] in the bodies of the transition rules for ⟨s, σ, s_i, σ_i′, d_i⟩, and add the rule

B_{s,σ,i}[τ] ← ¬B_{s,σ,1}[τ], …, ¬B_{s,σ,i−1}[τ], ¬B_{s,σ,i+1}[τ], …, ¬B_{s,σ,k}[τ].

Intuitively, these rules nondeterministically select precisely one of the possible transitions for s and σ at time instant τ, whose transition rules are enabled via B_{s,σ,i}[τ]. Finally, add the rule

accept ← ¬accept.

It ensures that accept is true in every stable model. The stable models M of the resulting program correspond to the accepting runs of T. □

As an easy consequence, we obtain

Theorem 5.11 ([Marek & Truszczyński 1991, Schlipf 1995b]; cf. also [Kolaitis & Papadimitriou 1991]) Propositional logic programming with negation under SMS is co-NP-complete. Datalog with negation under SMS is data complete for co-NP and program complete for co-NEXPTIME.

For full logic programming, SMS has the same complexity as WFS.

Theorem 5.12 ([Schlipf 1995b, Marek, Nerode & Remmel 1994]) Logic programming with negation under SMS is Π^1_1-complete.

Further results on stable models of recursive (rather than only finite) logic programs can be found in [Marek, Nerode & Remmel 1992]. Beyond inference, further complexity aspects of stable models have been analyzed, including compact representations of stable models and the well-founded semantics of nonground logic programs [Gottlob, Marcus, Nerode, Salzer & Subrahmanian 1996, Eiter, Lu & Subrahmanian 1998], and optimization issues such as determining symmetries across stable models [Eiter, Gottlob & Leone 1997b].


5.4 Inflationary and noninflationary semantics

The inflationary semantics (INFS) [Abiteboul & Vianu 1991a, Abiteboul, Hull & Vianu 1995] is inspired by inflationary fixpoint logic [Gurevich & Shelah 1986]. In place of T_P^∞, it uses the limit T̃_P^∞ of the sequence

T̃_P^0 = ∅,   T̃_P^{i+1} = T̃_P(T̃_P^i) for i ≥ 0,

where T̃_P is the inflationary operator T̃_P(I) = I ∪ T_P(I). Clearly, T̃_P^∞ is computable in polynomial time for a propositional program P. Moreover, T̃_P^∞ coincides with T_P^∞ for Horn clause programs P. Therefore, by the above results,

Theorem 5.13 ([Abiteboul & Vianu 1991a]; implicit in [Gurevich & Shelah 1986]) Logic programming with negation under INFS is P-complete. Datalog with negation under INFS is data complete for P and program complete for EXPTIME.

The noninflationary semantics (NINFS) [Abiteboul & Vianu 1991a], in the version of Abiteboul & Vianu [1995, page 373], uses in place of T_P^∞ the limit T̂_P^∞ of the sequence

T̂_P^0 = ∅,   T̂_P^{i+1} = T̂_P(T̂_P^i) for i ≥ 0,

where T̂_P(I) = T_{P^I}(I); the limit of this sequence is taken if it exists, and otherwise T̂_P^∞ is undefined. Similar equivalent algebraic query languages have been described earlier in [Chandra & Harel 1982, Vardi 1982]. In particular, datalog under NINFS is equivalent to partial fixpoint logic [Abiteboul & Vianu 1991a, Abiteboul et al. 1995]. As easily seen, T̂_P^∞ is computable in polynomial space for a propositional program P; this bound is tight.

Theorem 5.14 ([Abiteboul & Vianu 1991a, Abiteboul et al. 1995]) Logic programming with negation under NINFS is PSPACE-complete. Datalog with negation under NINFS is data complete for PSPACE and program complete for EXPSPACE.
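The two iterations differ only in whether the previous stage is retained in the next one. A propositional toy sketch (our own encoding, with T_P evaluating negative body literals against the current stage):

```python
def t_p(rules, interp):
    # Immediate consequence operator for rules (head, pos_body, neg_body).
    return {h for h, pos, neg in rules
            if all(a in interp for a in pos)
            and not any(a in interp for a in neg)}

def inflationary(rules):
    # INFS: I := I | T_P(I); the stages grow, so a fixpoint is always reached.
    interp = set()
    while True:
        nxt = interp | t_p(rules, interp)
        if nxt == interp:
            return interp
        interp = nxt

def noninflationary(rules):
    # NINFS: I := T_P(I); if the sequence cycles, the limit is undefined.
    interp, seen = set(), []
    while True:
        nxt = t_p(rules, interp)
        if nxt == interp:
            return interp
        if nxt in seen:
            return None  # the sequence oscillates: no limit exists
        seen.append(interp)
        interp = nxt

# p <- not p: total under INFS, undefined under NINFS.
P = [("p", (), ("p",))]
```

The single rule p ← ¬p illustrates the difference: the inflationary sequence stops at {p}, while the noninflationary one oscillates between ∅ and {p}, so T̂_P^∞ is undefined.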

5.5 Further semantics of negation

A number of interesting further semantics for logic programming with negation have been defined, among them partial stable models, maximal partial stable models, regular models, perfect models, fixpoint models, the 2- and 3-valued completion semantics, and the tie-breaking semantics; see e.g. [Schlipf 1995b, You & Yuan 1995, Kolaitis & Papadimitriou 1991, Papadimitriou & Yannakakis 1997]. These semantics must remain undiscussed here; see e.g. [Schlipf 1995b, Saccà 1995, Kolaitis & Papadimitriou 1991, Papadimitriou & Yannakakis 1997] for more details and complexity results.

Extensions of logic programming with negation have been proposed which handle different kinds of negation, namely strong and default negation [see e.g. Gelfond & Lifschitz 1991, Pearce & Wagner 1991]. The semantics we have considered above use default negation as the single kind of negation. Different kinds of negation increase the suitability of logic programming as a knowledge representation formalism [Baral & Gelfond 1994]. In the approach of Gelfond & Lifschitz [1991], strong negation is interpreted as classical negation. E.g., the rule

flies(X) ← not ¬flies(X), bird(X)


naturally expresses that a bird flies by default; here, “not” is default negation and “¬” is classical negation. The language of extended logic programs treats literals with classical negation as atoms, on which default negation may be applied. The notion of an answer set for such a program is defined by a natural generalization of the concept of a stable model [see Gelfond & Lifschitz 1991]. As for the complexity, there is no increase for extended logic programs over normal logic programs under SMS.

Theorem 5.15 (Ben-Eliyahu & Dechter [1994]) Given a propositional extended logic program P, deciding whether P has an answer set is NP-complete, and extended logic programming is co-NP-complete.

Complexity results on extended logic programs with rule priorities can be found in [Brewka & Eiter 1998], and for an extension of logic programming using hierarchical modules in [Buccafurri, Leone & Rullo 1998].

6 Disjunctive logic programming

Informally, disjunctive logic programming (DLP) extends logic programming by adding disjunction in the rule heads, in order to allow more suitable knowledge representation and to increase expressiveness. For example,

male(X) ∨ female(X) ← person(X)

naturally represents that any person is either male or female. A disjunctive logic program is a set of clauses

A1 ∨ ⋯ ∨ Ak ← L1, …, Lm    (k ≥ 1, m ≥ 0)    (2)

where each Ai is an atom and each Lj is a literal. For a background, see [Lobo, Minker & Rajasekar 1992] and the more recent [Minker 1994]. The semantics of negation-free disjunctive logic programs is based on minimal Herbrand models, as the least (unique minimal) model does not exist in general.

Example 6.1 Let P consist of the single clause p ∨ q ←. Then, P has the two minimal models M1 = {p} and M2 = {q}.

Denote by MM(P) the set of minimal Herbrand models of P. The Generalized Closed World Assumption (GCWA) [Minker 1982] for negation-free P amounts to the meaning M_GCWA(P) = {L | MM(P) ⊨ L}.

Example 6.2 Consider the following propositional program P′, describing the behavior of a reviewer while reviewing a paper:

good ∨ bad ← paper
happy ← good
angry ← bad
smoke ← happy
smoke ← angry
paper ←


The following models of P′ are minimal:

M1 = {paper, good, happy, smoke} and M2 = {paper, bad, angry, smoke}.

Under GCWA, we have P′ ⊨_GCWA smoke, while P′ ⊭_GCWA good and P′ ⊭_GCWA ¬good.
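Minimal models and positive GCWA consequences of a propositional negation-free disjunctive program can be computed by brute force over all interpretations; a toy sketch (our own encoding) reproducing the reviewer example:

```python
from itertools import chain, combinations

def satisfies(interp, program):
    # A clause is (heads, body): if all body atoms hold, some head must hold.
    return all(any(h in interp for h in heads)
               for heads, body in program
               if all(b in interp for b in body))

def minimal_models(program):
    atoms = sorted({a for heads, body in program for a in chain(heads, body)})
    models = [frozenset(c)
              for r in range(len(atoms) + 1)
              for c in combinations(atoms, r)
              if satisfies(frozenset(c), program)]
    # Keep only models with no proper submodel.
    return [m for m in models if not any(m2 < m for m2 in models)]

def gcwa_consequence(program, atom):
    # A positive atom holds under GCWA iff it is true in every minimal model.
    return all(atom in m for m in minimal_models(program))

# The reviewer program P' of Example 6.2:
P = [(("good", "bad"), ("paper",)),
     (("happy",), ("good",)),
     (("angry",), ("bad",)),
     (("smoke",), ("happy",)),
     (("smoke",), ("angry",)),
     (("paper",), ())]
```

Both minimal models contain smoke, so smoke follows under GCWA, while good holds in only one of them, matching Example 6.2.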

Theorem 6.3 ([Eiter & Gottlob 1993, Eiter et al. 1994]) Let P be a propositional negation-free disjunctive logic program and A a propositional atom. (i) Deciding whether P ⊨_GCWA A is co-NP-complete. (ii) Deciding whether P ⊨_GCWA ¬A is Π^p_2-complete.

Proof. It is not hard to argue that for an atom A, we have P ⊨_GCWA A if and only if P ⊨_PC A, where ⊨_PC is the classical logical consequence relation. In addition, it is not hard to argue that any set of clauses can be represented by a suitable disjunctive logic program. Hence, by the well-known NP-completeness of SAT, part (i) is obvious. Let us thus consider part (ii).

1. Membership. We have P ⊭_GCWA ¬A if and only if there exists an M ∈ MM(P) such that M ⊭ ¬A, i.e., A ∈ M. Clearly, a guess for M can be verified with an oracle for NP in polynomial time; from this, membership of the problem in Π^p_2 follows.

2. Hardness. We show Π^p_2-hardness by an encoding of alternating Turing machines (ATMs) [Chandra, Kozen & Stockmeyer 1981]. Recall that an ATM T has its set of states partitioned into existential (∃) and universal (∀) states. If the machine reaches an ∃-state (respectively, ∀-state) s in a run, then the input is accepted if the computation continued in some (respectively, all) of the possible successor configurations is accepting. As in our simulations above, we assume that T has a single tape. The polynomial-time bounded ATMs which start in a ∀-state s0 and have one alternation, i.e., precisely one transition from a ∀-state to an ∃-state in each run (and no reverse transition), solve precisely the problems in Π^p_2 [Chandra, Kozen & Stockmeyer 1981].

By adapting the construction in the proof of Theorem 5.10, we show how any such machine T on input I can be simulated by a disjunctive logic program P under GCWA. Without loss of generality, we assume that each run of T is polynomial-time bounded [Balcázar, Díaz & Gabarró 1990]. We start from the clauses constructed for the NTM T on input I in the proof of Theorem 5.10, from which we drop the clause accept ← ¬accept, and replace the clauses

B_{s,σ,i}[τ] ← ¬B_{s,σ,1}[τ], …, ¬B_{s,σ,i−1}[τ], ¬B_{s,σ,i+1}[τ], …, ¬B_{s,σ,k}[τ]

for s and σ by the logically equivalent disjunctive clause

B_{s,σ,1}[τ] ∨ ⋯ ∨ B_{s,σ,k}[τ] ←.

Intuitively, in a minimal model precisely one of the atoms B_{s,σ,i}[τ] will be present, which means that one of the possible branchings is followed in a run. The current clauses constitute a propositional program which derives accept under GCWA if and only if T would accept I if all its states were


universal. We need to respect the ∃-states, however. For each ∃-state s and time point τ > 0, we set up the following clauses, where s′ is any ∃-state, σ is any symbol, τ ≤ τ′ ≤ N, 0 ≤ π ≤ N, and 1 ≤ i ≤ k:

state_{s′}[τ′] ← naccept, state_s[τ]
symbol_σ[τ′, π] ← naccept, state_s[τ]
cursor[τ′, π] ← naccept, state_s[τ]
B_{s′,σ,i}[τ′] ← naccept, state_s[τ]

Intuitively, these rules state that if a nonaccepting run enters an ∃-state, i.e., naccept is true, then all relevant facts involving a time point τ′ ≥ τ are true. This way, nonaccepting runs are eliminated. Finally, we set up for each nonaccepting terminal ∃-state s the clauses

naccept ← state_s[τ],   where 0 < τ ≤ N.

1. T computes a Boolean (0-ary) predicate.


The simulation is akin to the simulation used in the proofs of Theorems 4.2 and 4.5. Recall the framework of Section 4.1. In the spirit of this framework, it suffices to encode n^k time points τ and tape-cell numbers π within a fixed datalog program. This is achieved by considering k-tuples X̄ = ⟨X1, …, Xk⟩ of variables Xi ranging over U. Each such k-tuple encodes the integer int(X̄) = Σ_{i=1}^{k} Xi · n^{k−i}.

At time point 0 the tape of T contains an encoding of the input graph. Recall that in Section 4.1 this was reflected by the following initialization facts

symbol_{I_π}[0, π] ←    for 0 ≤ π < |I|, where I = I_0 ⋯ I_{|I|−1}.

Before translating these rules into appropriate datalog rules, let us briefly recall how input graphs are usually represented by binary strings. A graph ⟨U, G⟩ is encoded by a binary string enc(U, G) of length |U|²: if G(i, j) is true for i, j ∈ U = [0, n − 1], then bit number i·n + j of enc(U, G) is 1; otherwise this bit is 0. The bit positions of enc(U, G) are exactly the integers from 0 to n² − 1. These integers are represented by all k-tuples ⟨0^{k−2}, a, b⟩ such that a, b ∈ U. Moreover, the bit position int(⟨0^{k−2}, X, Y⟩) of enc(U, G) is 1 if and only if G(X, Y) is true in the input database, and 0 otherwise. The above initialization rules can therefore be translated into the datalog rules

symbol_1[0^k, 0^{k−2}, X, Y] ← G(X, Y)
symbol_0[0^k, 0^{k−2}, X, Y] ← ¬G(X, Y)

Intuitively, the first rule says that at time point 0 = int(0^k), bit number int(⟨0^{k−2}, X, Y⟩) on the tape is 1 if G(X, Y) is true. The second rule states that the same bit is 0 if G(X, Y) is false. Note that the second rule applies negation to an input predicate; this is the only rule in the entire datalog⁺ program that uses negation. Clearly, these two rules simulate that at time point 0, the cells c_0, …, c_{n²−1} contain precisely the string enc(U, G). The other initialization rules described in Section 4.1 are also easily translated into appropriate datalog rules.

Let us now see how the other rules are translated into datalog. From the linear order given by Succ(X, Y), First(X), and Last(X), it is easy to define by datalog clauses a linear order ≤_k on k-tuples, given by predicates Succ_k(X̄, Ȳ), First_k(X̄), Last_k(X̄) (see the proof of Theorem 4.5), by using Succ_1 = Succ, First_1 = First and Last_1 = Last. By using Succ_k, the transition rules, inertia rules and the accept rules are easily translated into datalog as in the proof of Theorem 4.5. The output schema of the resulting datalog program P⁺ is defined to be D_out = {accept}. It is clear that this program evaluates to true on input D = ⟨U, G⟩, i.e., P⁺ ∪ D ⊨ accept, if and only if T accepts enc(U, G).

The generalization to a setting where the simplifying assumptions 1–3 are not made is rather straightforward and is omitted. Assumption 4 can also be easily lifted to the computation of output predicates. We consider here the case where the output schema D_out contains a single binary relation R. Then, the output database D′ computed by T, which is a graph ⟨U, R⟩, can be encoded similarly to the input database as a binary string enc(U, R) of length |U|². We may suppose that when the machine enters the halt state, this string is contained in the first |U|² cells of the tape. To obtain the positive facts of the output relation R, we add the following rule:

R(X, Y) ← symbol_1[T̄, 0^{k−2}, X, Y], state_halt[T̄]

where T̄ is a k-tuple of variables matching the (halting) time point. □
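The k-tuple arithmetic used in this proof is easy to make concrete; a toy sketch (our own code, illustrating int(X̄) and the successor underlying Succ_k, not the datalog rules themselves):

```python
def int_of(tup, n):
    # int(<X1,...,Xk>) = sum of X_i * n^(k-i), i.e., base-n positional value.
    value = 0
    for x in tup:
        value = value * n + x
    return value

def succ_k(tup, n):
    """Successor of a k-tuple over U = {0,...,n-1} in the order induced by
    int_of; mirrors the inductive datalog definition of Succ_k from Succ."""
    t = list(tup)
    for i in reversed(range(len(t))):
        if t[i] < n - 1:     # increment the rightmost position that can grow
            t[i] += 1
            return tuple(t)
        t[i] = 0             # carry over
    return None              # input was the maximal tuple <n-1,...,n-1>
```

For n = 4 and k = 3, the k-tuples enumerate the n^k = 64 time points 0, …, 63, exactly the range needed for a polynomially time-bounded machine.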


We remark that a result similar to Theorem 7.2 was independently obtained by Livchak [1983]. Let us now state somewhat more succinctly further interesting results on datalog. A prominent query language is fixpoint logic (FPL), which is the extension of first-order logic by a least fixpoint operator lfp(X̄, φ, S), where S is a |X̄|-ary predicate occurring positively in the formula φ = φ(X̄, S), and X̄ is a tuple of free variables in φ; intuitively, it returns the least fixpoint of the operator Γ defined by Γ(S) = {ā | D ⊨ φ(ā, S)}. We refer to [Chandra & Harel 1982, Abiteboul et al. 1995, Ebbinghaus & Flum 1995] for details. As shown in [Chandra & Harel 1982], FPL expresses a proper subset of the queries in P. Datalog⁺ relates to FPL as follows.

Theorem 7.3 ([Chandra & Harel 1985]) Datalog⁺ = FPL⁺(∃), i.e., datalog⁺ coincides with the fragment of FPL having negation restricted to database relations and only existential quantifiers.

As for expressibility in first-order logic, Ajtai & Gurevich [1994] have shown that a datalog query is equivalent to a first-order formula if and only if it is bounded, and thus expressible in existential first-order logic. Adding stratified negation does not preserve the equivalence of datalog and fixpoint logic in Theorem 7.3.

Theorem 7.4 ([Kolaitis 1991]; implicit in [Dahlhaus 1987]) Stratified datalog ⫋ FPL.

This theorem is not obvious. In fact, for some time coincidence of the two languages was assumed, based on a respective statement in [Chandra & Harel 1985]. The nonrecursive fragment of datalog coincides with well-known database query languages.

Theorem 7.5 ([cf. Abiteboul et al. 1995]) Nonrecursive range-restricted datalog with negation = relational algebra = relational calculus. Nonrecursive datalog with negation = first-order logic (without function symbols).
The expressive power of relational algebra is equivalent to that of a fragment of the database query language SQL (essentially, SQL without grouping and aggregate functions). The expressive power of SQL is discussed in [Libkin & Wong 1994, Dong, Libkin & Wong 1997, Libkin 1997]. Unstratified negation yields higher expressive power.

Theorem 7.6 (i) Datalog under WFS = FPL ([van Gelder 1989]). (ii) Datalog under INFS = FPL ([Abiteboul & Vianu 1991a], using [Gurevich & Shelah 1986]).

As recently shown, the first result holds also for total WFS (i.e., where the well-founded model is always total) [Flum, Kubierschky & Ludäscher 1997]. We remark that the variants of datalog mentioned above can only define queries which are expressible in infinitary logic with finitely many variables (L^ω_{∞ω}) [Kolaitis & Vardi 1995]. It is known that L^ω_{∞ω} has a 0-1 law, i.e., every query definable in this language is either almost surely true or almost surely false as the size of the universe grows to infinity [Kolaitis & Vardi 1992]. It is easy to see that the Boolean Even-query q_E, which tells whether the domain of a given input database D_in (over a fixed schema) contains an even number of elements, is neither almost surely true nor almost surely false. Thus, a fortiori, this query, which is computable in polynomial time, is not expressible in the above variants of datalog. On ordered databases, Theorem 7.2 and the theorems in Section 5 imply

INFSYS RR 1843-99-05


Theorem 7.7 On ordered databases, the following query languages capture P: stratified datalog, datalog under INFS, and datalog under WFS.

Syntactic restrictions allow us to capture classes within P. Let datalog+(1) be the fragment of datalog+ where each rule has at most one nondatabase predicate in the body, and let datalog+(1,d) be the fragment of datalog+(1) where each predicate occurs in at most one rule head.

Theorem 7.8 ([Grädel 1992, Veith 1994]) On ordered databases, datalog+(1) captures NL and its restriction datalog+(1,d) captures L.

Due to its inherent nondeterminism, stable semantics is much more expressive.

Theorem 7.9 ([Schlipf 1995b]) Datalog under SMS captures co-NP.

Note that for this result an order on the input database is not needed. Informally, in each stable model such an ordering can be guessed and checked by the program. By Fagin's [1974] Theorem, this implies that datalog under SMS is equivalent to the existential fragment of second-order logic over finite structures.

Theorem 7.10 ([Abiteboul & Vianu 1991a]) On ordered databases, datalog under NINFS captures PSPACE.

Here ordering is needed. An interesting result in this context, formulated in terms of datalog, is the following [Abiteboul & Vianu 1991a]: datalog under INFS = datalog under NINFS on arbitrary finite databases if and only if P = PSPACE. While the "only if" direction is obvious, the proof of the "if" direction is involved. It is one of the rare examples that translate open relationships between deterministic complexity classes into corresponding relationships between query languages. We next briefly address the expressive power of disjunctive logic programs.

Theorem 7.11 ([Eiter et al. 1994, Eiter, Gottlob & Mannila 1997]) Disjunctive datalog under SMS captures Σ^p_2.
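The guess-and-check character of the stable model semantics behind these capturing results can be seen concretely for ground nondisjunctive programs: a set M of atoms is a stable model iff M equals the least model of the Gelfond–Lifschitz reduct P^M. A minimal sketch in Python (the rule representation is ours, not from the survey):

```python
def is_stable(program, m):
    """program: list of ground rules (head, pos_body, neg_body),
    where pos_body/neg_body are lists of atoms.
    m: candidate model, a set of atoms.
    Checks whether m is the least model of the reduct P^m."""
    # Gelfond-Lifschitz reduct: drop every rule whose negative body
    # intersects m, then delete the remaining negative literals.
    reduct = [(h, pos) for (h, pos, neg) in program if not (set(neg) & m)]
    # Least model of the negation-free reduct, by fixpoint iteration.
    lm = set()
    changed = True
    while changed:
        changed = False
        for h, pos in reduct:
            if set(pos) <= lm and h not in lm:
                lm.add(h)
                changed = True
    return lm == m

# The program  p <- not q.  q <- not p.  has the two stable models {p}, {q}.
prog = [("p", [], ["q"]), ("q", [], ["p"])]
```

The nondeterminism that makes SMS reach co-NP is visible here: checking one candidate M is polynomial, but a model may have to be guessed among exponentially many candidates.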

It appears that fragments of disjunctive datalog have interesting properties. While disjunctive datalog+,≠ expresses only a subset of the queries in co-NP (e.g., it cannot express the Even-query), it expresses all of Σ^p_2 under the credulous notion of consequence, i.e., P ⊨_c A if A is true in some stable model. Furthermore, under credulous consequence every query in nondisjunctive datalog+,≠ is expressible in disjunctive datalog+, even though the inequality predicate cannot be recognized. Finally, we consider full logic programs. In this case, the input databases are arbitrary (not necessarily recursive) relations on the genuine (infinite) Herbrand universe of the program.

Theorem 7.12 ([Schlipf 1995b, Eiter & Gottlob 1997]) Each of logic programming under WFS, logic programming under SMS, and DLP under SMS captures Σ^1_1.

Thus, different from the function-free case, adding disjunction does not increase the expressive power of normal logic programs. The reason is that disjunctive logic programs can be expressed in a weak fragment of the class Σ^1_2 of second-order logic, which in the case of an infinite Herbrand universe can be coded into the Σ^1_1 fragment. For further expressiveness results on logic programs see e.g. [Schlipf 1995b, Saccà 1995, Saccà 1997, Greco & Saccà 1997, Greco & Saccà 1996, Eiter, Leone & Saccà 1998, Cadoli & Palopoli 1998]. In particular, co-NP can be captured by a variant of circumscribed datalog [Cadoli & Palopoli 1998], and

further classes of the polynomial hierarchy can be captured by variants of stable models [Saccà 1995, Saccà 1997, Eiter, Leone & Saccà 1998, Buccafurri, Greco & Saccà 1997] as well as through modular logic programming [Eiter, Gottlob & Veith 1997, Buccafurri et al. 1998]. Results on the expressiveness of the stable model semantics over disjunctive databases, which are given by sets of ground clauses rather than facts, can be found in [Bonatti & Eiter 1996].

We conclude this subsection with a brief look at expressiveness results for nondeterministic queries. A nondeterministic query maps an input database to one from a set of possible output databases; it can be viewed as a multi-valued function. For example, a query which returns as output a Hamiltonian cycle of a given input graph is a nondeterministic query. The (deterministic) queries that we have considered above are a special case of nondeterministic queries. It has been shown that the class NDB-P of nondeterministic queries which are computable in polynomial time can be captured by suitable nondeterministic variants of datalog, e.g., by procedure-style variants [Abiteboul & Vianu 1991a], by datalog≠ (datalog with inequality) extended with a choice operator, or by datalog with stable models [Corciulo, Giannotti & Pedreschi 1997, Giannotti & Pedreschi 1998]. Also NDB-PSPACE, the class of nondeterministic queries computable in polynomial space, is captured by a nondeterministic variant of datalog [Abiteboul & Vianu 1991a]. For a tutorial survey of such and related nondeterministic languages, we recommend [Vianu 1997]. For further issues on nondeterministic queries, we refer to [Giannotti, Greco, Saccà & Zaniolo 1997, Grumbach & Lacroix 1997, Leone, Palopoli & Saccà 1998].

7.1 The order mismatch and relational machines

Many results on capturing complexity classes by logical languages suffer from the order mismatch. For example, the results by Immerman and Vardi (Theorems 7.7 and 7.10) show that P = PSPACE if and only if Datalog under INFS and Datalog under NINFS coincide on ordered databases. The order appears when we code the input for a standard computational device, like a Turing machine, while the semantics of Datalog and logic is defined directly in terms of logical structures, where no order on the elements is given. To overcome this mismatch, [Abiteboul & Vianu 1991b, Abiteboul & Vianu 1995] introduced relational complexity theory, where computations on unordered structures are modeled by relational machines. In [Abiteboul & Vianu 1991b, Abiteboul & Vianu 1995, Abiteboul, Vardi & Vianu 1997] several relational complexity classes are introduced, such as P_r (relational polynomial time), NP_r (relational nondeterministic polynomial time), PSPACE_r (relational polynomial space) and EXPTIME_r (relational exponential time). It follows that all separation results among the standard complexity classes translate into separation results among the relational complexity classes. For example, P = NP if and only if P_r = NP_r. It turns out that Datalog under various semantics captures the relational complexity classes on unordered databases. For example (cf. Theorems 7.7 and 7.10), we have

Theorem 7.13 Datalog under INFS captures P_r. Datalog under NINFS captures PSPACE_r.

Note that together with the correspondence of separation results between the standard and the relational complexity classes, this theorem implies that Datalog under INFS coincides with Datalog under NINFS if and only if P = PSPACE. Therefore, the results of [Abiteboul & Vianu 1991b, Abiteboul & Vianu 1995, Abiteboul et al. 1997] provide an order-free correspondence between questions in computational and descriptive complexity.

7.2 Expressive power of logic programming with complex values

The expressive power of datalog queries is defined in terms of input and output databases, i.e., finite sets of tuples. In order to extend the notion of expressive power to logic programming with complex values, we need to define what we mean by an input. For example, in the case of plain logic programming, an input may be a finite set of ground terms, i.e., a finite set of trees. In the case of logic programming with sets, an input may be a set whose elements may in turn be sets, and so on. Various models and languages for dealing with complex values in databases have been proposed [e.g. Abiteboul & Kanellakis 1989, Abiteboul & Grumbach 1988, Kifer & Wu 1993, Kifer, Lausen & Wu 1995, Abiteboul & Beeri 1995, Buneman, Naqvi, Tannen & Wong 1995, Suciu 1997, Greco, Palopoli & Spadafora 1995, Libkin, Machlin & Wong 1996, Abiteboul et al. 1995]. The functional approach to such languages dominates the logic programming one. To extend variants of nested relational algebra as in [Buneman et al. 1995] to datalog, bounded fixpoint constructs have been proposed [Suciu 1997], as well as deflationary fixpoint constructs [Colby & Libkin 1997]. The comparative expressive power of languages for complex values is studied in [e.g. Abiteboul & Grumbach 1988, Vadaparty 1991, Suciu 1997, Abiteboul & Beeri 1995, Dantsin & Voronkov 1998]. For example, Abiteboul & Beeri [1995] introduce a model for restricted combinations of tuples and sets and several corresponding query languages, including algebraic and logic programming ones. It is proved that all these languages define the same class of queries. Dantsin & Voronkov [1998] show that nonrecursive logic programming with negation has the same expressive power as nonrecursive datalog with negation (under a natural representation of inputs). Thus, the use of recursive data structures, namely trees, in nonrecursive datalog yields no gain in expressiveness.
It follows from this result and [Immerman 1987] that nonrecursive logic programming with negation is in AC^0. The absolute expressive power of languages for complex values is also studied in [Sazonov 1993, Suciu 1997, Lisitsa & Sazonov 1995, Grumbach & Vianu 1995, Gyssens, van Gucht & Suciu 1995, Lisitsa & Sazonov 1997]; further issues, such as the expressibility of particular queries or faithful extensions of datalog, are studied in [Libkin & Wong 1989, Wong 1996, Paredaens & van Gucht 1992]. Results on the expressive power of different forms of logic programming with constraints can be found e.g. in [Cosmadakis & Kuper 1994, Kanellakis, Kuper & Revesz 1995, Benedikt, Dong, Libkin & Wong 1996, Vandeurzen, Gyssens & van Gucht 1996]. Unlike research on the expressive power of datalog, there is no mainstream in research on the expressive power of logic programming with complex values. The extension of declarative query languages by complex values is more actively studied in database theory.

8 Unification and its complexity

What is the complexity of query answering for very simple logic programs consisting of one fact? This question leads us to the problem of solving equations over terms, known as the unification problem. Unification lies at the very heart of implementations of logic programming and automated reasoning systems. Atoms or terms s and t are called unifiable if there exists a substitution ϑ that makes them equal, i.e., the terms sϑ and tϑ coincide; such a substitution ϑ is called a unifier of s and t. The unification problem is the following decision problem: given terms s and t, are they unifiable? Robinson [1965] described an algorithm that solves this problem and, if the answer is positive, computes a most general unifier of the two given terms. His algorithm had exponential time and space complexity, mainly because of the representation of terms by strings of symbols. Using better representations (for example,

by directed acyclic graphs), Robinson’s algorithm was improved to linear time algorithms, e.g. [Martelli & Montanari 1976, Paterson & Wegman 1978]. Theorem 8.1 ([Dwork, Kanellakis & Mitchell 1984, Yasuura 1984, Dwork, Kanellakis & Stockmeyer 1988]) The unification problem is P-complete.
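The syntactic unification discussed above can be sketched as a naive Robinson-style procedure. The tuple-based term encoding and the convention that capitalized strings are variables are our choices; note that this naive version, which copies terms, can take exponential time in the worst case, unlike the DAG-based linear-time algorithms cited above:

```python
def unify(s, t, subst=None):
    """Syntactic unification with occurs check.
    Terms: variables are capitalized strings like 'X'; compound terms are
    tuples ('f', arg1, ..., argn); other values are constants.
    Returns a most general unifier as a dict, or None if not unifiable."""
    if subst is None:
        subst = {}

    def is_var(u):
        return isinstance(u, str) and u[:1].isupper()

    def walk(u):  # follow variable bindings already in subst
        while is_var(u) and u in subst:
            u = subst[u]
        return u

    def occurs(v, u):  # does variable v occur in term u?
        u = walk(u)
        if u == v:
            return True
        return isinstance(u, tuple) and any(occurs(v, a) for a in u[1:])

    s, t = walk(s), walk(t)
    if s == t:
        return subst
    if is_var(s):
        return None if occurs(s, t) else {**subst, s: t}
    if is_var(t):
        return None if occurs(t, s) else {**subst, t: s}
    if isinstance(s, tuple) and isinstance(t, tuple) \
            and s[0] == t[0] and len(s) == len(t):
        for a, b in zip(s[1:], t[1:]):  # unify arguments left to right
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None  # clash of constants or function symbols
```

For instance, unify(('f', 'X'), ('f', ('g', 'a'))) binds X to g(a), while unify('X', ('f', 'X')) fails on the occurs check.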

P-hardness of the unification problem was proved by reductions from versions of the circuit value problem in [Dwork et al. 1984, Yasuura 1984, Dwork et al. 1988]. (Note that [Lewis & Statman 1982] states that unifiability is complete for co-NL; however, [Dwork et al. 1984] gives a counterexample to the proof in [Lewis & Statman 1982].) Also, many quadratic time and almost linear time unification algorithms have been proposed, because these algorithms are often more suitable for applications and generalizations (see a survey of the main unification algorithms in [Baader & Siekmann 1994]). Here we mention only Martelli & Montanari's [1982] algorithm, based on ideas going back to Herbrand's [1972] famous work. Modifications of this algorithm are widely used for unification in equational theories and rewriting systems. The time complexity of Martelli and Montanari's algorithm is O(n A^{-1}(n)), where A^{-1} is the inverse of Ackermann's function and thus grows very slowly.

9 Logic programming with equality

The relational model of data deals with simple values, namely tuples consisting of atomic components. Various generalizations and formalisms have been proposed to handle more complex values like nested tuples, tuples of sets, etc.; see Section 7.2 and [Abiteboul & Beeri 1995]. Most of these formalisms can be expressed in terms of logic programming with equality [Gallier & Raatz 1986, Gallier & Raatz 1989, Hölldobler 1989, Hanus 1994, Degtyarev & Voronkov 1996] and constraint logic programming, considered in Section 10.

9.1 Equational theories

Let L be a language containing the equality predicate =. By an equation over L we mean an atom s = t where s and t are terms in L. An equational theory E over L is a set of equations closed under the logical consequence relation, i.e., a set satisfying the following conditions: (i) E contains the equation x = x; (ii) if E contains s = t then E contains t = s; (iii) if E contains r = s and s = t then E contains r = t; (iv) if E contains s_1 = t_1, …, s_n = t_n then E contains f(s_1, …, s_n) = f(t_1, …, t_n) for each n-ary function symbol f ∈ L; and (v) if E contains s = t then E contains sϑ = tϑ for all substitutions ϑ.

The syntax of logic programs over an equational theory E coincides with that of ordinary logic programs. Their semantics is defined as a generalization of the semantics of logic programming so that terms are identified if they are equal in E.

Example 9.1 We demonstrate logic programs with equality by a logic program processing finite sets. Finite sets are a typical example of complex values handled in databases. We represent finite sets by ground terms as follows: (i) the constant {} denotes the empty set, (ii) if s represents a set and t is a ground term then {t | s} represents the set {t} ∪ s (where {t} and s are not necessarily disjoint). However, the equality on sets is defined not as identity of terms but as equality in the equational theory in which terms are considered to be equal if and only if they represent equal sets (we omit the axiomatization of this theory).

Consider a very simple program that checks whether two given sets have a nonempty intersection. This program consists of one fact:

    non_empty_intersection({X | Y1}, {X | Y2}).

For example, to check that the sets {1, 3, 5} and {4, 1, 7} have a common member, we ask the query non_empty_intersection({1, 3, 5}, {4, 1, 7}). The answer will be positive. Indeed, the following system of equations has solutions in the equational theory of sets:

    {X | Y1} = {1, 3, 5},   {X | Y2} = {4, 1, 7}.

For example, set X = 1, Y1 = {3, 5}, Y2 = {7, 4, 1}. Note that if we represent sets by lists in plain logic programming without equality, any encoding of non_empty_intersection will require recursion.

The complexity of logic programs over E depends on the complexity of solving systems of term equations in E. The problem of whether a system of term equations is solvable in an equational theory E is known as the problem of simultaneous E-unification. A substitution ϑ is called an E-unifier of terms s and t if the equation sϑ = tϑ is a logical consequence of the theory E. By the E-unification problem we mean the problem of whether there exists an E-unifier of two given terms. Ordinary unification can be viewed as E-unification where E contains only the trivial equations t = t. It is natural to think of an E-unifier of s and t as a solution to the equation s = t in the theory E.
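At the ground level, the E-unification task behind the set example reduces to finding a common element of the two represented sets. A small Python illustration (the encoding ('ins', t, s) for the set term {t | s} is our own convention, not from the survey):

```python
def flatten(term):
    """Flatten a ground set term built from '{}' (the empty set) and
    ('ins', t, s) (representing {t | s}) into a Python frozenset.
    Duplicates collapse, mirroring the set equational theory."""
    elems = set()
    while term != '{}':
        _, t, term = term
        elems.add(t)
    return frozenset(elems)

def non_empty_intersection(s1, s2):
    """Ground analogue of solving {X|Y1} = s1, {X|Y2} = s2 in the
    theory of sets: X must be a common element; Y1, Y2 take the rest."""
    return bool(flatten(s1) & flatten(s2))

# {1, 3, 5} and {4, 1, 7} as set terms:
s1 = ('ins', 1, ('ins', 3, ('ins', 5, '{}')))
s2 = ('ins', 4, ('ins', 1, ('ins', 7, '{}')))
```

Of course, a real E-unification procedure must also handle non-ground terms, where the NP-hardness discussed below arises.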

9.2 Complexity of E-unification

Solving equations is a traditional subject of all mathematics. Since any result on solving equation systems can be viewed as a result on E-unification, it is practically impossible to survey all results on the complexity of E-unification. Therefore, we restrict this survey to only a few cases closely connected with logic programming. The general theory of E-unification may be found e.g. in [Baader & Siekmann 1994].

Let E be an equational theory over L and · be a binary function symbol in L (written in infix form). We call · an associative symbol if E contains the equation x · (y · z) = (x · y) · z, where x, y and z are variables. Similarly, · is called an AC-symbol (an abbreviation for an associative-commutative symbol) if · is associative and, in addition, E contains x · y = y · x. If · is an AC-symbol and E contains x · x = x, we call · an ACI-symbol (I stands for idempotence). Also, · is called an AC1-symbol (or an ACI1-symbol) if · is an AC-symbol (an ACI-symbol, respectively) and E contains the equation x · 1 = x, where 1 is a constant belonging to L.

Theorem 9.2 ([Makanin 1977, Baader & Schulz 1992, Benanav, Kapur & Narendran 1987, Kościelski & Pacholski 1996]) Let E be an equational theory defining a function symbol · in L as an associative symbol (E contains all logical consequences of x · (y · z) = (x · y) · z and no other equations). The following upper and lower bounds on the complexity of the E-unification problem hold: (i) this problem is in 3-NEXPTIME, (ii) this problem is NP-hard.

Basically, all algorithms for unification under associativity are based on Makanin's [1977] algorithm for word equations. The 3-NEXPTIME upper bound is obtained in [Kościelski & Pacholski 1996]. The following theorem characterizes other popular kinds of equational theories.

Theorem 9.3 ([Kapur & Narendran 1986, Kapur & Narendran 1992, Baader & Schulz 1996]) Let E be an equational theory defining some symbols as one of the following: AC-symbols, ACI-symbols, AC1-symbols, or ACI1-symbols (there can be one or more of these kinds of symbols). Suppose the theory E contains no other equations. Then the E-unification problem is NP-complete.

9.3 Complexity of nonrecursive logic programming with equality

In the case of ordinary unification, there is a simple way to reduce solvability of finite systems of equations to solvability of single equations. However, these two kinds of solvability are not equivalent for some theories: there exists an equational theory E such that the solvability problem for one equation is decidable, while solvability for (finite) systems of equations is undecidable [Narendran & Otto 1990]. Simultaneous E-unification determines decidability of nonrecursive logic programming over E.

Theorem 9.4 (implicit in [Dantsin & Voronkov 1997b]) Let E be an equational theory. Nonrecursive logic programming over E is decidable if and only if the problem of simultaneous E-unification is decidable.

An equational theory E is called NP-solvable if the problem of solvability of equation systems in E is in NP. For example, the equational theory of finite sets mentioned above, the equational theory of bags (i.e., finite multisets) and the equational theory of trees (containing only equations t = t) are NP-solvable [Dantsin & Voronkov 1999].

Theorem 9.5 ([Dantsin & Voronkov 1997a, Dantsin & Voronkov 1997b, Dantsin & Voronkov 1999]) Nonrecursive logic programming over an NP-solvable equational theory E is in NEXPTIME. Moreover, if E is a theory of trees, or bags, or finite sets, or any combination of them, then nonrecursive logic programming over E is NEXPTIME-complete.

10 Constraint logic programming

Informally, constraint logic programming (CLP) extends logic programming by imposing additional conditions on terms. These conditions are expressed by constraints, i.e., equations, disequations, inequations, etc. over terms. The semantics of such constraints is predefined and does not depend on logic programs.

Example 10.1 We illustrate CLP by the standard example. Suppose that we would like to solve the following puzzle:

      S E N D
    + M O R E
    ---------
    M O N E Y

All these letters are variables ranging over decimal digits 0, 1, …, 9. As usual, different letters denote different digits and S, M ≠ 0. This puzzle can be solved by a constraint logic program over the domain of integers (Z; =, ≠, ≤, +, ·, 0, 1, …). Informally, this program can be written as follows.

find(S, E, N, D, M, O, R, E, M, O, N, E, Y) ←
    1 ≤ S ≤ 9, …, 0 ≤ Y ≤ 9,
    S ≠ E, …, R ≠ Y,
    1000·S + 100·E + 10·N + D + 1000·M + 100·O + 10·R + E
      = 10000·M + 1000·O + 100·N + 10·E + Y.

The query find(S, E, N, D, M, O, R, E, M, O, N, E, Y) will be answered by the only solution

      9 5 6 7
    + 1 0 8 5
    ---------
    1 0 6 5 2

i.e., S = 9, E = 5, N = 6, D = 7, M = 1, O = 0, R = 8, Y = 2.
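The claimed unique solution can be checked without a constraint solver: a brute-force search over digit assignments in plain Python (not CLP, and so with none of the constraint propagation a CLP system would perform) exhausts the search space directly:

```python
from itertools import permutations

def send_more_money():
    """Brute-force search for SEND + MORE = MONEY with pairwise
    distinct digits and S, M != 0. Returns all solutions as
    tuples (S, E, N, D, M, O, R, Y)."""
    sols = []
    for S, E, N, D, M, O, R, Y in permutations(range(10), 8):
        if S == 0 or M == 0:
            continue  # leading digits must be nonzero
        send = 1000*S + 100*E + 10*N + D
        more = 1000*M + 100*O + 10*R + E
        money = 10000*M + 1000*O + 100*N + 10*E + Y
        if send + more == money:
            sols.append((S, E, N, D, M, O, R, Y))
    return sols
```

This enumerates all 10·9·…·3 = 1,814,400 injective assignments, which is instantaneous here but illustrates why CLP systems prune the search space with constraint propagation instead.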

A structure is defined by an interpretation I of a language L in a nonempty set D. For example, we shall consider the structure defined by the standard interpretation of the language consisting of the constant 0, the successor function symbol s and the equality predicate = on the set N of natural numbers. This structure is denoted by (N; =, s, 0). Other examples of structures are obtained by replacing N by the sets Z (the integers), Q (the rational numbers), R (the reals) or C (the complex numbers). Below we denote structures in a similar way, keeping in mind the standard interpretation of arithmetic function symbols in number sets. The symbols · and / stand for multiplication and division, respectively. We use k·x to denote unary functions of multiplication by particular numbers (of the corresponding domain); x/k is used similarly. All structures under consideration are assumed to contain the equality symbol.

Let S be a structure. An atom c(t_1, …, t_k), where t_1, …, t_k are terms in the language of S, is called a constraint. By a constraint logic program over S we mean a finite set of rules

    p(X) ← c_1, …, c_m, q_1(X_1), …, q_n(X_n)

where c_1, …, c_m are constraints, p, q_1, …, q_n are predicate symbols not occurring in the language of S, and X, X_1, …, X_n are lists of variables. The semantics of CLP is defined as a natural generalization of the semantics of logic programming [e.g. Jaffar & Maher 1994]. If S contains function symbols interpreted as tree constructors (i.e., equality of the corresponding terms is interpreted as ordinary unification) then CLP over S is an extension of logic programming. Otherwise, CLP over S can be regarded as an extension of Datalog by constraints.

10.1 Complexity of constraint logic programming

There are two sources of complexity in CLP: the complexity of solving systems of constraints and the complexity coming from the logic programming scheme. However, the interaction of these two components can lead to complexity much higher than merely the sum of their complexities. For example, Datalog (which is EXPTIME-complete) with linear arithmetic constraints (whose satisfiability problem is in NP for the integers and in P for the rational numbers and the reals) is undecidable.

Theorem 10.2 ([Cox, McAloon & Tretkoff 1990]) CLP over (N; =, s, 0) is r.e.-complete. The same holds for each of Z, Q, R, and C instead of N.

The proof uses the fact that CLP over (N; =, s, 0, 1) allows one to define addition and multiplication in terms of the successor function. Thus, diophantine equations can be expressed in this fragment of CLP. On the other hand, simpler constraints, namely constraints over ordered infinite domains (of some particular kind), do not increase the complexity of Datalog.

Theorem 10.3 ([Cox & McAloon 1993]) CLP over (Z; =,