The Impact of Logic Programming on Databases

John Grant and Jack Minker

Communications of the ACM / March 1992 / Vol. 35, No. 3

The purpose of this article is to demonstrate the significant impact that logic programming has had on databases. In particular, logic programming has contributed to the understanding of the semantics of a database, has extended the concept of relational databases, and has introduced new techniques that provide useful tools for database users. This section of the article contains some historical background, fundamental concepts from first-order logic and lattice theory, and an axiomatization for deductive databases. At present, the most commonly used deductive databases are definite. This class is studied in the section "Definite Databases," where the three major semantics (declarative, fixpoint, and procedural) are defined. The handling of negation and implementation issues are also covered in that section. The section entitled "Applications" contains three applications of logic in databases: semantic query optimization, the generation of cooperative answers, and update validation. Then, three types of deductive databases that are more general than definite databases are considered in the section "General Deductive Databases." Stratified databases allow negation in the body of a statement but not recursion via negation. The well-founded approach is more general and uses a three-valued logic. Disjunctive databases allow disjunctive facts and conclusions. The article concludes with the section "Summary."

Historical Background

The reader is referred to [30] for an account of the development of deductive databases starting in the 1950s. That paper also includes numerous references that are not included in this article. The history is divided there into three parts, each covering approximately 10 years.

During the first time period, 1957-1968, active research on precursors to deductive databases was conducted at several locations in the U.S., including the Rand Corporation, MIT, and Stanford. A system called RDF (relational data file) that had an inferential capability was implemented at the Rand Corporation. The resolution principle was discovered by J.A. Robinson [40] for theorem proving. Green and Raphael [18] at the Stanford Research Institute recognized that resolution could be applied to implement deduction for databases in a uniform manner.

During the second time period, 1969-1978, the concept of logic programming was proposed and then implemented in the language Prolog. Also, the foundations of logic programming were developed by van Emden and Kowalski [45]. Their major results concern definite logic programs and databases and will be sketched in a later section of this article. This work established the equivalence of declarative, fixpoint, and procedural semantics. By 1977 several researchers were applying logic to databases with significant results. Consequently, the Workshop on Logic and Databases was held in November 1977 in Toulouse, France. Some of the important early results about deductive databases appeared in the book of papers [14] presented at the Workshop, including the results about negation in definite databases that will be given in a later section of this article.

The third time period begins in 1979 and leads up to the present. In the early 1980s Reiter [39] proposed formal theories of databases and reinterpreted the conventional model-theoretic perspective on databases in purely proof-theoretic terms. The basic axioms that he defined will be given in a later subsection of this article. In the 1980s the efficient implementation of recursion became an important research topic [44]. The handling of negation in rules was clarified with the introduction of stratified databases, and the handling of disjunctive information was investigated for disjunctive databases. The Workshop on the Foundations of Deductive Databases and Logic Programming, held in Washington in July 1986, brought together some of the leading researchers in the field. The book [31] of papers described the state of the art in the late 1980s and directions for future research.


Deductive databases represent the convergence of databases and logic programming. Standard relational database systems are used successfully in business, government, and industry, but these systems do not have built-in reasoning capabilities. Logic programming languages typically handle files as standard programming languages do and provide few of the capabilities of database systems. A deductive database system combines the multiple file handling, concurrency, security, and recovery aspects of database systems with the logic-based reasoning, in terms of recursive rules, of logic programming. Thus logic programming has had a profound effect on databases both theoretically, by providing logical foundations for databases, and practically, by extending the power of relational database systems to incorporate logical deduction.

Fundamental Concepts

This subsection provides some key definitions for logic programming that are important in the formalization of deductive databases. See [16] and other articles in this issue for additional background. A first-order language $L$ (for the application of first-order logic) contains infinitely many variables, propositional connectives, quantifiers, punctuation symbols, constant symbols, function symbols, and predicate symbols. A specific language for a deductive database is characterized by its constant and predicate symbols; usually deductive databases contain no function symbols in order to avoid infinite domains that can lead to an infinite set of answers. We will assume for this article that there are no function symbols in the language and that the language contains at least one predicate and one constant symbol. However, the results in this article apply to the case where there are function symbols, but then a query may have an infinite set of answers.
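The effect of function symbols on the size of the domain can be seen in a small sketch. The Python code below is illustrative only and is not taken from the article: the constants `a` and `b`, the predicate `parent`, the function symbol `f`, and the helper names `ground_terms` and `ground_atoms` are all assumptions chosen for the example. It enumerates variable-free terms up to a given nesting depth and shows that the set stays fixed without function symbols but keeps growing once a single unary function symbol is added.

```python
from itertools import product


def ground_terms(constants, functions, depth):
    """Return all variable-free terms with nesting depth <= depth.

    constants: iterable of constant names (hypothetical example data)
    functions: dict mapping a function symbol to its arity
    """
    terms = set(constants)
    for _ in range(depth):
        new_terms = set(terms)
        for name, arity in functions.items():
            # Apply the function symbol to every tuple of existing terms.
            for args in product(sorted(terms), repeat=arity):
                new_terms.add(f"{name}({','.join(args)})")
        terms = new_terms
    return terms


def ground_atoms(predicates, terms):
    """Return all variable-free atoms p(t1,...,tk) over the given terms."""
    return {
        f"{pred}({','.join(args)})"
        for pred, arity in predicates.items()
        for args in product(sorted(terms), repeat=arity)
    }


constants = ["a", "b"]
predicates = {"parent": 2}

# Function-free language: the set of ground terms never grows with depth.
no_fn = [len(ground_terms(constants, {}, d)) for d in range(4)]

# One unary function symbol f: every extra level of nesting adds new terms.
with_fn = [len(ground_terms(constants, {"f": 1}, d)) for d in range(4)]

atoms = ground_atoms(predicates, ground_terms(constants, {}, 0))

print(no_fn)         # [2, 2, 2, 2]
print(with_fn)       # [2, 4, 6, 8]
print(sorted(atoms))
```

In the function-free case the four `parent` atoms are the only possible facts, so any query has finitely many answers; with `f` present, no depth bound exhausts the terms.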
In particular, function symbols may have to be included for some expert systems. Terms, atoms (or atomic formulas), and formulas (including sentences) are defined in the standard way. The Herbrand universe is the set of constant symbols of $L$; the Herbrand base ($HB$) is the set of all atoms that are ground, that is, contain no variables. $HB$ contains all possible facts about the database.

The semantics of first-order logic is defined by an interpretation that consists of a nonempty domain and assigns a meaning to each nonlogical symbol: an element of the domain to a constant symbol and a predicate on the domain to a predicate symbol. An interpretation $I$ is a model for a set of sentences $S$ if every element of $S$ is true in $I$. $S$ logically implies a sentence $w$ (written $S \models w$) if $w$ is true in all models of $S$. A Herbrand interpretation is one whose domain is the Herbrand universe; if it is a model for $S$, it is called a Herbrand model.

An inference system (proof theory) consists of a set of axiom schemas and rules of inference that are used to prove formulas. The sentence $w$ is provable from $S$ (written $S \vdash w$) if there is a proof of $w$ from $S$. An inference system is sound if $S \vdash w$ implies $S \models w$ and complete if $S \models w$ implies $S \vdash w$. The resolution method tries to derive the empty clause from the clausal forms of $S$ and $\neg w$ by using a simple rule called the resolution principle. The resolution method is sound and complete.

A clause has the form

$$\neg A_1 \lor \cdots \lor \neg A_n \lor B_1 \lor \cdots \lor B_m$$

where the $A_i$ and $B_j$ are atoms and all variables are universally quantified. The equivalent formulation

$$B_1, \ldots, B_m \leftarrow A_1, \ldots, A_n$$

is used with deductive databases. Such a clause may be interpreted in two ways: 1) if $A_1, \ldots, A_n$ are all true, then at least one of $B_1, \ldots, B_m$ is true; 2) to solve for $B_1 \lor \cdots \lor B_m$, solve all of $A_1, \ldots, A_n$. The atoms $A_1, \ldots, A_n$ form the body of the clause; $B_1, \ldots, B_m$ form the head. A clause is called range-restricted if every variable that appears in the head also appears in the body. A clause is recursive if the same predicate appears both in the head and the body. The empty clause, [...]

[...] $\{1, \tfrac{1}{2}, 0\}$. An interpretation is often written as $I = (T, F)$, where $T = \{A \in HB \mid I(A) = 1\}$ and $F = \{A \in HB \mid I(A) = 0\}$. Here, $T$ stands for the true atoms and $F$ for the false atoms. An interpretation is two-valued if $T \cup F = HB$. $I$ is extended to formulas by the following equations:

$$I(\neg A) = 1 - I(A)$$
$$I(A \mathbin{\&} B) = \min(I(A), I(B))$$
$$I(A \lor B) = \max(I(A), I(B))$$
$$I(B \leftarrow A) = \begin{cases} 1 & \text{if } I(B) \ge I(A) \\ 0 & \text{otherwise} \end{cases}$$

An interpretation $I$ is a model of a normal database $P$ if $I(S) = 1$ for every $S \in P$. Interpretations are ordered by using the numerical values of the truth-values: $I \le I'$ if $I(A) \le I'(A)$ for all $A \in HB$. A minimal model $I$ is one such that there is no model $I'$ with $I' < I$. Intuitively, a minimal model minimizes $T$ and maximizes $F$. This generalizes the concept of minimal models for definite databases, where the logic is two-valued and the set of true atoms is minimized. The well-founded approach provides a minimal three-valued model for every database. For stratified databases this model is two-valued and is the one described in the previous subsection.

Let $I = (T, F)$ be an interpretation, $T \subseteq HB$, $F \subseteq HB$. Define $T_I(T) = \{A \in HB \mid$ there is a ground instance of a rule in $P$, $A$