ON THE INTEGRITY OF DATABASES WITH INCOMPLETE INFORMATION
Extended Abstract
Moshe Y. Vardi*
IBM Almaden Research Center
* Address: IBM Research K55/801, 650 Harry Rd., San Jose, CA 95120-6099.
© 1986 ACM 0-89791-179-2/86/0300-0252 $00.75
Abstract
We consider the meaningfulness of databases with incomplete information. The basic idea is that such a database is meaningful if it can be completed to a database with complete information that satisfies the integrity constraints. We look at two approaches to defining the notion of completion. The open-world assumption requires that the database with complete information be an extension of the database with incomplete information. The closed-world assumption requires that the database with complete information be a conservative extension of the database with the incomplete information. We prove the somewhat surprising result that, from both aspects of computational complexity and logical axiomatizability, both notions of integrity are harder than integrity for databases with complete information, while integrity under the closed-world assumption is harder than under the open-world assumption.
1. Introduction
A database is a model of the real world. In practice, however, incompleteness of information is inherent. Since in any modelling process not all aspects of the real world are taken into account, incompleteness of information stems not only from our inability to take into account certain aspects of the real world, but also from simple lack of knowledge about the aspects that we did take into account.
Dealing with incomplete information is a central problem in Artificial Intelligence and database theory, and it is the weakest point of current database and knowledge base management systems [Co75, FUV83, IL84, Li81, Li83, McDD80, Re84]. There is a manifest need for systematic methods to model incomplete information, for algorithms that modify incomplete information in response to new facts, and for query answering algorithms that are consistent with our models and our update algorithms.
The fundamental problem that we address in this paper is the meaningfulness of incomplete information. In order to address the problem we need to define meaningfulness more precisely. This is quite straightforward for full information. For any given application, only a subset of all possible collections of data is usually of interest. This subset is defined by certain constraints, called integrity constraints. The data is considered to be meaningful if it satisfies the constraints. (An example of an integrity mechanism is that of keys, where there can be no two data items with the same key.) If, however, only incomplete information is given to us, then we might not be able to test whether the constraints are satisfied or not. The intuitive answer is that incomplete information is meaningful if it can be completed to meaningful full information, i.e., if we can complete it to a complete collection of data that satisfies the constraints. (This idea emerged first in the framework of the universal relation model [Ho82].)
We have of course to decide when complete information is considered to be a completion of some incomplete information. We consider two approaches. According to the open-world approach (OWA), what is required is that the complete information be an extension of the incomplete information. The closed-world approach (CWA) requires that the complete information be a conservative extension of the incomplete information.
To be more precise, we have to specify our model for information more concretely. We represent full information as a relation on some set U of attributes. Such relations are called complete relations. We focus here on a particular type of incompleteness, where data are missing in a uniform manner. Specifically, we take incomplete relations to be relations on V, where V is a proper subset of U. A complete relation p is an extension of an incomplete relation q if every tuple in q comes from some tuple in p, i.e., $q \subseteq \pi_V(p)$. p is a conservative extension of q if in addition every tuple in p is reflected by a tuple in q, i.e., $q = \pi_V(p)$. Thus the OWA assumes that our knowledge about V-tuples is possibly incomplete, while the CWA assumes that while our knowledge of U-tuples is incomplete, our knowledge of V-tuples is complete. That is, according to CWA, if a certain V-tuple is not in q, then this tuple does not represent a correct fact [Re78]. The closed-world assumption is an example of what is called in AI default reasoning [Re80].
We specify integrity constraints by dependencies. The class of dependencies is a class of first-order sentences that seem to be suitable for specifying database semantics [va83, Ul83]. A complete relation is meaningful if it satisfies the given dependencies. We say that an incomplete relation is OWA-consistent with the given dependencies if it has an extension that satisfies the dependencies. We say that an incomplete relation is CWA-consistent with the given dependencies if it has a conservative extension that satisfies the dependencies.
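The two notions of consistency can be made concrete with a small brute-force sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the universe U = {A, B}, the incomplete scheme V = {A}, the three total dependencies hard-coded in `satisfies`, the helper names, and the restriction of the search to a finite candidate domain. The dependencies say that the relation is symmetric, that A functionally determines B, and that a diagonal tuple (x, x) forces every A-value to equal x.

```python
from itertools import combinations, product

def project_A(p):
    """Projection of a relation on U = {A, B} onto V = {A}."""
    return {a for (a, _b) in p}

def is_extension(p, q):
    """OWA: every tuple of q comes from some tuple of p."""
    return q <= project_A(p)

def is_conservative_extension(p, q):
    """CWA: in addition, every tuple of p is reflected by a tuple of q."""
    return q == project_A(p)

def satisfies(p):
    """Check three illustrative total dependencies (assumed, not from the paper)."""
    # tgd: R(x, y) -> R(y, x)
    symmetric = all((y, x) in p for (x, y) in p)
    # egd: R(x, y) and R(x, y') -> y = y'
    functional = all(y1 == y2 for (x1, y1) in p for (x2, y2) in p if x1 == x2)
    # egd: R(x, x) and R(y, z) -> x = y
    diagonal = all(x == y for (x, x2) in p if x == x2 for (y, _z) in p)
    return symmetric and functional and diagonal

def find_completion(q, domain, conservative):
    """Search all relations over `domain` for a (conservative) extension of q
    that satisfies the dependencies; return one, or None if there is none."""
    test = is_conservative_extension if conservative else is_extension
    candidates = list(product(domain, repeat=2))
    for size in range(len(candidates) + 1):
        for rows in combinations(candidates, size):
            p = set(rows)
            if test(p, q) and satisfies(p):
                return p
    return None

q = {"a", "b", "c"}
owa_witness = find_completion(q, ["a", "b", "c", "d"], conservative=False)
cwa_witness = find_completion(q, ["a", "b", "c"], conservative=True)
print(owa_witness is not None, cwa_witness is None)
```

Under these assumed dependencies q = {a, b, c} is OWA-consistent, for instance via {(a,b), (b,a), (c,d), (d,c)}, but not CWA-consistent. Symmetry forces every B-value to occur in column A, so a conservative extension would have to be a fixed-point-free involution on three elements, which does not exist; this is also why searching over the values of q alone is enough in the closed-world case.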
We investigate here two questions related to consistency:
(1) What is the complexity of testing consistency?
(2) What is the logic required to axiomatize consistency?
More formally, we are given a set C of dependencies that the complete relation is supposed to satisfy. Let cons(C) be the class of incomplete relations that are consistent with C. We try to find out what is the complexity of recognizing incomplete relations in cons(C), and whether we can axiomatize it, that is, construct a set C' of sentences such that cons(C) is exactly the class of incomplete relations that satisfy C'.
Our investigation shows that CWA-consistency is harder than OWA-consistency from both aspects of complexity and axiomatizability. For example, according to one complexity measure, OWA-consistency can be checked in polynomial time while CWA-consistency is NP-complete. Interestingly, there are database aspects where CWA makes life easier. We discuss this at the end of the paper.
2. Basic Definitions
2.1. Tuples, Relations, and Databases
We have a finite set $U = \{A_1, \ldots, A_n\}$ of attributes, called the universe, which intuitively are column names. A relation scheme is a nonempty subset of U. A tuple on a relation scheme R is a mapping from R to a set of values called the domain. A relation on R is a finite set of tuples on R. An unrestricted relation on R is a finite or an infinite set of tuples on R. Our interest here is mainly in relations. If t is a tuple of X and $Y \subseteq X$, then t[Y] is a tuple of Y defined as the restriction of t to Y. If r is a relation on X and $Y \subseteq X$, then the projection of r onto Y is given by
$\pi_Y(r) = \{ t[Y] : t \in r \}$.
Our definition of relations is different in an inessential manner from the standard definition of relations in mathematical logic. That is, by fixing some linear ordering for the attributes of U we can consider a relation on R to be a finite subset of $D^m$, where $m = |R|$. The domain of this relation is the set of all elements that occur in some tuple, and for our purposes need not be mentioned explicitly.
2.2. Dependencies
For any given application only a subclass of all possible relations is of interest. This subclass is defined by semantic constraints that are to be satisfied by the relations of interest. A family of constraints that was extensively studied in the literature is the family of dependencies. (The reader who is interested in the relationship between the family of dependencies defined here and other families of dependencies is referred to [Fa82].)
The language will be a first-order language with
equality and without function symbols. When talking about relations over R, the language will contain one |R|-ary predicate symbol R. This language will be denoted as L(R). We call an atomic formula of the form $R(v_1, \ldots, v_m)$ a relational formula, and an atomic formula of the form $v_1 = v_2$ an equality formula. A dependency is a first-order sentence in the language L of the form
$\forall y_1 \cdots y_k \, \exists z_1 \cdots z_l \, (A_1 \wedge \cdots \wedge A_p \rightarrow B_1 \wedge \cdots \wedge B_q)$,
where:
(1) $k, p, q \geq 1$ and $l \geq 0$.
(2) The A's are relational formulas that use between themselves exactly all the variables $y_1, \ldots, y_k$.
(3) The B's use between themselves all the variables $z_1, \ldots, z_l$ and possibly some y's.
(4) Either all the B's are relational formulas, or l = 0 and they are all equality formulas.
If all the B's are relational formulas, the dependency is called a tuple-generating dependency (abbr. tgd). Intuitively, a tgd says that if some tuples, satisfying certain equalities, exist in the relation, then some other tuples, satisfying certain other equalities, must also exist in the relation. If all the B's are equality formulas, the dependency is called an equality-generating dependency (abbr. egd). Intuitively, an egd says that if some tuples, satisfying certain equalities, exist in the relation, then these tuples must also satisfy some other certain equalities.
Dependencies without existential quantifiers, i.e., in the syntax above l = 0, are called total (called full in [Fa82]). Observe that egd's are necessarily total. Observe also that every total dependency is equivalent to a conjunction of finitely many total dependencies with q = 1, i.e., dependencies with a single atomic formula on the right-hand side of the implication. Thus, we assume without loss of generality that all total dependencies are of this form. We will say "dependencies on R" instead of "dependencies in the language L(R)".
Dependencies can be typed or untyped [BV81, Fa82]. We mostly focus on total untyped dependencies in this paper; in §6 we shall consider non-total and typed dependencies.
2.3. Satisfaction and Consistency
If we are given a set C of dependencies on U, then it is quite obvious when a complete relation satisfies C. If p is a relation on U, then we just have to check that the relational structure $\langle D, p \rangle$, where D is the set of elements that occur in p, satisfies the dependencies in C. The situation is more complicated with incomplete relations.
Let V be a proper subset of U, and suppose that for some reason we lack information about the data entries for the attributes in U-V. We call relations on V incomplete relations, as opposed to complete relations, which are relations on U. There could be many possible reasons for the lack of information. For example, we might not be authorized to read this information, or possibly a physical sensor that is supposed to supply this information is broken.
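Checking that a complete relation satisfies a total dependency, as described at the start of this subsection, can be sketched directly from the syntax of Section 2.2. The encoding below is an assumption made for illustration: a dependency is a pair (premise atoms, single conclusion), variables are strings, a relation is a set of equal-length tuples, and the names `bindings`, `satisfies`, `fd_a_b` and `symmetry` are hypothetical. The check simply tries every way of matching the premise atoms against the rows.

```python
from itertools import product

def bindings(premise, rows):
    """Yield each assignment of variables that maps every premise atom to a row."""
    for choice in product(rows, repeat=len(premise)):
        binding = {}
        consistent = True
        for atom, row in zip(premise, choice):
            for var, value in zip(atom, row):
                # A variable must receive the same value at every occurrence.
                if binding.setdefault(var, value) != value:
                    consistent = False
        if consistent:
            yield binding

def satisfies(rows, dependency):
    """Check one total dependency with a single atom on the right-hand side."""
    premise, conclusion = dependency
    for b in bindings(premise, rows):
        if conclusion[0] == "eq":        # egd: the two variables must be equal
            _, x, y = conclusion
            if b[x] != b[y]:
                return False
        else:                            # total tgd: the implied tuple must exist
            _, atom = conclusion
            if tuple(b[v] for v in atom) not in rows:
                return False
    return True

# Functional dependency A -> B as an egd: R(x, y1) and R(x, y2) imply y1 = y2.
fd_a_b = ([("x", "y1"), ("x", "y2")], ("eq", "y1", "y2"))
# An untyped total tgd: R(x, y) implies R(y, x).
symmetry = ([("x", "y")], ("rel", ("y", "x")))

p_good = {("a", "b"), ("b", "a")}
p_bad = {("a", "b"), ("a", "c")}
print(satisfies(p_good, fd_a_b), satisfies(p_good, symmetry))
print(satisfies(p_bad, fd_a_b), satisfies(p_bad, symmetry))
```

For a fixed dependency the number of candidate matches is polynomial in the number of rows, while it grows exponentially with the number of premise atoms; the sketch makes no attempt to be efficient.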
At any rate, we now want to decide whether a given relation on V is semantically meaningful. Intuitively, an incomplete relation is semantically meaningful if it can be completed to a complete relation. To define this notion of completion more precisely, we have to make some more assumptions about the degree of our lack of information.
The open-world assumption (OWA) is in some sense a worst-case assumption. It assumes that not only do we lack information about the data entries for the attributes in U-V, but we may also lack information about the data entries for the attributes in V. This motivates the following definition. A relation p on U is an extension of a relation q on V if $q \subseteq \pi_V(p)$. q is said to be OWA-consistent with C (recall that C is a set of dependencies on U) if it has an extension that satisfies C. The open-world approach has been pursued in the context of the universal relation model (cf. [GMV86, MUV84]) and in the context of query processing (cf. [IL84]).
The closed-world assumption (CWA) assumes that though we may lack information about the data entries for the attributes in U-V, we have full information about the attributes in V. This motivates the following definition. A relation p on U is a conservative extension of a relation q on V if $q = \pi_V(p)$. q is said to be CWA-consistent with C if it has a conservative extension that satisfies C. Thus the CWA takes the default position that if a tuple t is not in q, then this tuple represents a fact that is known not to hold. This is an instance of what is called in AI default reasoning [Re80]. The closed-world approach has been pursued in the context of database modelling (cf. [Re84]) and in the context of query processing (cf. [Re78, Re83, Va85]).
3. Databases as Logical Theories
So far we have viewed databases as interpretations, i.e., they associate relations with relation names. This "databases as interpretations" approach is the intuitive one when information is complete and the database can be described as a model of the real world. When information is incomplete, it is more appropriate to describe the database as a model of our knowledge of the real world. In this case the database should be viewed as a theory rather than as an interpretation. This is the crux of the "databases as logical theories" paradigm [Ko81, NG78, Re84].
We now show how to represent incomplete relations as logical theories. The two approaches, OWA and CWA, give rise to different theories. In both approaches, however, consistency with the given dependencies is reduced to satisfiability of the resulting theories. A similar reduction, in the context of the universal relation model, is described in [GMV86].
Let C be a set of dependencies on U, let V be a subset of U, and let q be a relation on V. To describe q by a logical theory we first augment the language by individual constant symbols corresponding to the entries in q. The language will be L(U,V,D), where U is the relation name for relations on U, V is the relation name for relations on V, and D is the set of elements that occur in q.
D serves as the set of individual constant symbols. The OWA theory of q, denoted $Th_{OWA}(q,C)$, consists of four components:
(1) Integrity constraints: These are the dependencies in C written in the language L(U). These axioms say that the complete relation that we are considering has to satisfy the given dependencies.
(2) Uniqueness axioms: For every pair c, d of distinct constant symbols we have an axiom $\neg(c = d)$. This says that unique elements are indeed unique.
(3) Containment axiom: Assume that $V = \{A_1, \ldots, A_m\}$ and $U = \{A_1, \ldots, A_m, A_{m+1}, \ldots, A_n\}$. Then this axiom is [...]. This axiom says that we are considering a complete relation that is an extension of the incomplete relation.
(4) Atomic facts: For each tuple $\langle a_1, \ldots, a_m \rangle$ in q we have an axiom $V(a_1, \ldots, a_m)$.
The CWA theory of q, denoted $Th_{CWA}(q,C)$, consists, in addition to the above axioms, also of the CWA axiom. Let $a^1, \ldots, a^k$ be the list of all tuples in q. For an m-tuple $y = \langle y_1, \ldots, y_m \rangle$ of variables, let $y = a^i$ be a shorthand for $\bigwedge_{j=1}^{m} y_j = a^i_j$. The CWA axiom is
$\forall y \, (V(y) \rightarrow y = a^1 \vee \cdots \vee y = a^k)$.
That is, the CWA axiom says that all information about the attributes in V is already represented in q. If q is empty, then the CWA axiom is $\forall y \, (\neg V(y))$.
We can now state the connection between consistency and the above theories. Recall that a theory is finitely satisfiable if it has a finite model.
Theorem 3.1. Let U be a set of attributes, and let C be a set of dependencies on U. Let V be a subset of U, and let q be a relation on V. Then q is OWA-consistent with C if and only if $Th_{OWA}(q,C)$ is finitely satisfiable, and q is CWA-consistent with C if and only if $Th_{CWA}(q,C)$ is finitely satisfiable. []
4. Computational Complexity
We now want to analyze how hard it is to determine whether a given incomplete relation is consistent with a given set of dependencies. Note that this decision problem has two parameters: the given relation and the given set of dependencies. Accordingly, we analyze the complexity in two different ways: complexity with respect to the size of the given relation and complexity with respect to the size of the given dependencies. The former has been termed in [Va82] the data complexity, while the latter has been termed there expression complexity. (We do not consider in this abstract combined complexity, which is complexity with respect to the combined size of the given relation and the given dependencies.)
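To make the decision problem concrete, here is a minimal sketch of a chase-style consistency test, in the spirit of the procedure outlined informally later in this section. It is deliberately restricted to functional dependencies, which are a special case of egd's; it is not the paper's general procedure for arbitrary total dependencies and it omits the nondeterministic assignment of nulls needed for CWA-consistency in general. The function name `fd_consistent`, the encoding of marked nulls as tagged tuples, and the requirement that constants be plain strings are all assumptions made for illustration.

```python
class Inconsistent(Exception):
    """Raised when the chase is forced to equate two distinct constants."""

def is_null(value):
    # Assumed encoding: a marked null is a tuple ("null", row_index, attribute).
    return isinstance(value, tuple) and value[0] == "null"

def fd_consistent(u_attrs, v_attrs, q_rows, fds):
    """Return True iff the relation q on V can be extended to a relation on U
    satisfying the functional dependencies `fds`, given as (lhs_attrs, rhs_attr)."""
    parent = {}

    def find(value):
        while parent.get(value, value) != value:
            value = parent[value]
        return value

    def equate(x, y):
        x, y = find(x), find(y)
        if x == y:
            return False
        if not is_null(x) and not is_null(y):
            raise Inconsistent          # two distinct constants would be merged
        if is_null(x):
            parent[x] = y               # keep a constant as representative
        else:
            parent[y] = x
        return True

    # Extend every tuple of q with fresh marked nulls on the attributes in U - V.
    rows = []
    for i, row in enumerate(q_rows):
        known = dict(zip(v_attrs, row))
        rows.append({a: known.get(a, ("null", i, a)) for a in u_attrs})

    try:
        changed = True
        while changed:                  # repeat until no dependency fires
            changed = False
            for lhs, rhs in fds:
                seen = {}
                for r in rows:
                    key = tuple(find(r[a]) for a in lhs)
                    if key in seen:
                        changed |= equate(seen[key][rhs], r[rhs])
                    else:
                        seen[key] = r
    except Inconsistent:
        return False
    return True

U, V = ("A", "B", "C"), ("A", "B")
fds = [(("A",), "C"), (("C",), "B")]
print(fd_consistent(U, V, [("a", "b1"), ("a", "b2")], fds))
print(fd_consistent(U, V, [("a1", "b"), ("a2", "b")], fds))
```

In the first call the violation is only found through the hidden attribute C: the two rows agree on A, so their nulls for C are equated, which in turn forces b1 = b2. Because functional dependencies never add tuples, the surviving nulls can be replaced by distinct fresh values, so for this restricted class the same test answers both OWA-consistency and CWA-consistency.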
To demonstrate the difference between data complexity and expression complexity, let us first survey known results about the complexity of satisfaction. To study data complexity, we have to fix a given set C of dependencies on U and consider the set
$REL(C) = \{ p : p \text{ is a relation on } U \text{ that satisfies } C \}$.
To study expression complexity, we have to fix a given relation p on U and consider the set
$TDEP(p) = \{ C : C \text{ is a finite set of total dependencies on } U \text{ that are satisfied by } p \}$.
The following theorem follows from results in [CM77, Ch81].
Theorem 4.1.
(1) For every finite set C of total dependencies, the collection $REL(C)$ is in LOGSPACE.
(2) For every relation p, the collection $TDEP(p)$ is in co-NP.
(3) There is a relation p such that the collection $TDEP(p)$ is co-NP-complete. []
We now refer to OWA-consistency. To study data complexity, we have to fix a given set C of dependencies on U and consider the set
$REL_{OWA}(C) = \{ \langle V, q \rangle : V \subseteq U \text{ and } q \text{ is a relation on } V \text{ that is OWA-consistent with } C \}$.
Theorem 4.2.
(1) For every finite set C of total dependencies, the collection $REL_{OWA}(C)$ is in PTIME.
(2) There is a finite set C of total dependencies such that $REL_{OWA}(C)$ is PTIME-complete. []
Thus while the data complexity of satisfaction is in LOGSPACE, the data complexity of OWA-consistency is complete for PTIME. This strongly suggests that while we can check satisfaction fast by using parallel processing, we cannot do the same for OWA-consistency [Bo77].
To study expression complexity, we have to fix a given relation q on V and consider the set
$TDEP_{OWA}(q) = \{ \langle U, C \rangle : V \subseteq U \text{ and } C \text{ is a finite set of total dependencies on } U \text{ such that } q \text{ is OWA-consistent with } C \}$.
Theorem 4.3.
(1) For every relation q, the collection $TDEP_{OWA}(q)$ is in EXPTIME.
(2) There is a relation q such that $TDEP_{OWA}(q)$ is EXPTIME-complete. []
Thus the expression complexity of OWA-consistency is exponentially harder than its data complexity and it is provably intractable.
We now give the analogous definitions for CWA-consistency. To study data complexity, we have to fix a given set C of dependencies on U and consider the set
$REL_{CWA}(C) = \{ \langle V, q \rangle : V \subseteq U \text{ and } q \text{ is a relation on } V \text{ that is CWA-consistent with } C \}$.
To study expression complexity, we have to fix a given relation q on V and consider the set
$TDEP_{CWA}(q) = \{ \langle U, C \rangle : V \subseteq U \text{ and } C \text{ is a finite set of total dependencies on } U \text{ such that } q \text{ is CWA-consistent with } C \}$.
Theorem 4.4.
(1) For every finite set C of total dependencies, the collection $REL_{CWA}(C)$ is in NP.
(2) There is a finite set C of total dependencies such that $REL_{CWA}(C)$ is NP-complete. []
Theorem 4.4 is reminiscent of Theorem 2 in [CKS85], which proves an NP-completeness result in the context of the universal relation model [MUV84] and the domain-closure assumption [Re84].
Theorem 4.5.
(1) For every relation q, the collection $TDEP_{CWA}(q)$ is in NEXPTIME.
(2) There is a relation q such that $TDEP_{CWA}(q)$ is NEXPTIME-complete. []
According to the above result the gap between OWA-consistency and CWA-consistency is the gap between deterministic and nondeterministic time. Note, however, that practically speaking this is an exponential gap!
To explain why CWA-consistency is probably harder than OWA-consistency, we informally describe an algorithm to check consistency. The idea is that given a set C of total dependencies on U and a relation q on $V \subseteq U$, we try to construct a (conservative) extension p of q that satisfies C. This is done as follows. For every tuple t in q, we construct a tuple in p by extending t with marked nulls. We now chase p with the dependencies in C, possibly equating marked nulls and adding tuples to p. If we are forced to equate two non-null elements, then q is not OWA-consistent with C, otherwise it is OWA-consistent. This process is polynomial in the size of p and exponential in the size of C.
To check for CWA-consistency we also have to check at the end that $q = \pi_V(p)$. If this is not the case we have to guess an assignment of non-nulls to the nulls such that $q = \pi_V(p)$ will be satisfied. After such an assignment we may have to repeat the process of chasing and assigning until we reach convergence or until we are forced to equate non-nulls. Thus the added complexity is due to the nondeterministic assignment.
In the next section we shall see that CWA-consistency is harder than OWA-consistency not only from a computational point of view but also from a logical point of view.
5. Axiomatizability
A subject of great interest in mathematical logic is that of axiomatizability. Given a class $\Omega$ of structures, the logician tries to axiomatize it by defining a logic A, which consists of a language L and a satisfaction relationship between structures and sentences in L. $\Omega$ is axiomatizable by A if there exists a set C of sentences of A, such that a structure M is in $\Omega$ if and only if M satisfies all sentences in C. If C is finite, then $\Omega$ is finitely axiomatizable by A. This notion of axiomatizability enables us to classify the expressive power of logics according to the classes of structures that
they can axiomatize or finitely axiomatize. We show in this section that it is harder to axiomatize CWA-consistency than to axiomatize OWA-consistency.
We first try to axiomatize consistency by first-order logic. We have to bear in mind, however, that every class of relations that is closed under isomorphism is axiomatizable by first-order logic. Furthermore, it is even axiomatizable in a proper subset of first-order logic. This subset, which we call universal-existential logic, is the set of all first-order sentences whose prefix consists of a string of universal quantifiers followed by a string of existential quantifiers. Thus, axiomatizability results for first-order logic are not interesting, unless they talk about finite axiomatizability or about a proper subset of universal-existential logic.
To study axiomatizability of consistency we have to fix a set C of dependencies on U and a relation scheme $V \subseteq U$. Thus we define
$REL_{OWA}(V,C) = \{ q : q \text{ is a relation on } V \text{ that is OWA-consistent with } C \}$,
$REL_{CWA}(V,C) = \{ q : q \text{ is a relation on } V \text{ that is CWA-consistent with } C \}$.
Theorem 5.1. Let C be a set of total dependencies on U, and let V be a subset of U. Then $REL_{OWA}(V,C)$ is axiomatizable by egd's, and $REL_{CWA}(V,C)$ is axiomatizable by total dependencies. On the other hand, there are particular U and V and a particular finite set C of total dependencies such that $REL_{CWA}(V,C)$ is not axiomatizable by egd's. []
Theorem 5.1 suggests that CWA-consistency is logically harder than OWA-consistency, but it seems to be only "mildly" harder. To see that CWA-consistency is more than "mildly" harder than OWA-consistency, it is instructive to consider unrestricted relations (i.e., relations that can be either finite or infinite). It is easy to see that the definitions in §2 carry over to unrestricted relations with no modification. We also need the following definitions:
$UREL_{OWA}(V,C) = \{ q : q \text{ is an unrestricted relation on } V \text{ that is OWA-consistent with } C \}$,
$UREL_{CWA}(V,C) = \{ q : q \text{ is an unrestricted relation on } V \text{ that is CWA-consistent with } C \}$.
Theorem 5.2. Let C be a set of total dependencies on U, and let V be a subset of U. Then $UREL_{OWA}(V,C)$ is axiomatizable by egd's. On the other hand, there are particular U and V and a particular finite set C of total dependencies such that $UREL_{CWA}(V,C)$ is not axiomatizable by first-order logic. []
The above results are interesting theoretically, but do not really have practical significance because the set of dependencies promised by the theorem can be infinite. What we would like to have is finite axiomatizability by first-order logic. Since first-order satisfaction can be tested in logarithmic space [Ch81], finite axiomatizability of consistency by first-order
logic will entail, by Theorem 4.4, that NP=LOGSPACE! This suggests the following result.
Theorem 5.3. There are relation schemes U and V and a finite set C of total dependencies such that $REL_{OWA}(V,C)$ and $REL_{CWA}(V,C)$ are not finitely axiomatizable by first-order logic. []
Since we cannot finitely axiomatize consistency by first-order logic, we try to do it by higher-order logics. Studying the definition of consistency we observe that essentially it consists of existentially quantifying over complete relations, which are relations over a possibly extended domain. The logic of such definition is called in mathematical logic many-sorted projective logic [Fe74]. It is a very powerful logic, whose satisfaction relationship is not necessarily recursive [Ha76]. Our aim here is to axiomatize consistency by higher-order logic that does not use an extended domain.
We first consider the fixpoint logic of [AU79, CH82]. Let L be a language, let P be a new n-ary relation name, and let L' be the language obtained by adding P to L. The fixpoint sentences of L are of the form $\mu P(\phi)$, where $\phi$ is a first-order formula of L' with free variables $z_1, \ldots, z_n$, where P occurs positively. Let M be a structure of L with domain D. Let p be the minimal n-ary relation on the domain of M, such that the sentence $\forall z_1 \cdots z_n \, (P(z_1, \ldots, z_n) \equiv \phi)$ is satisfied in the structure (M,p) of the language L'. The relation p is the least fixpoint of $\phi$ in the structure M. We now define the satisfaction relationship: M satisfies $\mu P(\phi)$ if $p = D^n$.
The following theorem claims that OWA-consistency can be axiomatized by fixpoint logic, which is not the case for CWA-consistency.
Theorem 5.4. Let C be a finite set of total dependencies on U, and let V be a subset of U. Then $REL_{OWA}(V,C)$ is finitely axiomatizable by fixpoint logic. On the other hand, there are particular U and V and a particular finite set C of total dependencies such that $REL_{CWA}(V,C)$ is not axiomatizable by fixpoint logic. []
To axiomatize CWA-consistency we need an even more powerful logic: existential second-order logic (eso logic). Let L be a language, let P be a new n-ary relation name, and let L' be the language obtained by adding P to L. The eso sentences of L are of the form $\exists P(\phi)$, where $\phi$ is a first-order formula of L'. Let M be a structure of L with domain D. M satisfies the sentence $\exists P(\phi)$ if there is a relation p on the domain of M such that $\phi$ is satisfied in the structure (M,p) of the language L'.
Theorem 5.5. Let C be a finite set of total dependencies on U, and let V be a subset of U. Then $REL_{OWA}(V,C)$ and $REL_{CWA}(V,C)$ are finitely axiomatizable by eso logic. []
6. Non-Total Dependencies
So far we have considered only total dependencies. In this section we consider non-total dependencies. We also wish to distinguish between typed and non-typed dependencies. Intuitively, typed dependencies do not require interaction between
different columns of the relations. Formally, a dependency $\sigma$ is typed if it is subject to the following syntactic constraints:
(1) If a variable z occurs in the i-th argument position of R, then it does not occur in the j-th argument position of R for $j \neq i$.
(2) If a variable z occurs in the i-th argument position of R, and a variable y occurs in the j-th argument position of R, where $j \neq i$, then the equality z = y does not occur in $\sigma$.
Most dependencies studied in the literature, e.g., functional dependencies and multivalued dependencies, are typed. Fagin's embedded implicational dependencies are also typed [Fa82]. Inclusion dependencies, on the other hand, are an example of untyped dependencies.
Our first observation is that for typed total dependencies OWA-consistency and CWA-consistency coincide.
Theorem 6.1. Let U be a set of attributes, and let C be a set of typed total dependencies on U. Let V be a subset of U, and let q be a relation on V. Then q is OWA-consistent with C iff q is CWA-consistent with C. []
Theorem 6.1 explains why previous works on consistency with respect to typed total dependencies (e.g., [Fa82, GH83, GZ82, Hu84, Hu86]) did not distinguish between open-world and closed-world assumptions. As we shall see later, OWA-consistency and CWA-consistency do not coincide for typed non-total dependencies.
Let us consider now the complexity of consistency in the presence of non-total dependencies. We need first some definitions,
$DEP_{OWA}(q) = \{ \langle U, C \rangle : V \subseteq U \text{ and } C \text{ is a finite set of dependencies on } U \text{ such that } q \text{ is OWA-consistent with } C \}$,
$DEP_{CWA}(q) = \{ \langle U, C \rangle : V \subseteq U \text{ and } C \text{ is a finite set of dependencies on } U \text{ such that } q \text{ is CWA-consistent with } C \}$,
where q is a relation on V.
Theorem 6.2.
(1) For every finite set C of dependencies, the collections $REL_{OWA}(C)$ and $REL_{CWA}(C)$ are recursively enumerable.
(2) There is a finite set C of typed dependencies such that $REL_{OWA}(C)$ and $REL_{CWA}(C)$ are not recursive.
(3) For every relation q, the collections $DEP_{OWA}(q)$ and $DEP_{CWA}(q)$ are recursively enumerable.
(4) There is a relation q such that $DEP_{OWA}(q)$ and $DEP_{CWA}(q)$ are not recursive. []
According to Theorem 6.2 both notions of consistency are intractable in the presence of non-total dependencies. It follows that there is no point in trying to get finite axiomatizability results in the spirit of §5, since such results would imply decidability. We can still try to axiomatize consistency by infinite sets of sentences.
Theorem 6.3. Let C be a set of dependencies on U, and let V be a subset of U. Then $REL_{OWA}(V,C)$ and $UREL_{OWA}(V,C)$ are axiomatizable by egd's.
On the other hand, there are particular U and V and a particular finite set C of typed dependencies such that $REL_{CWA}(V,C)$ is not axiomatizable by dependencies, and $UREL_{CWA}(V,C)$ is not axiomatizable by first-order logic. []
The second claim of Theorem 6.3 answers in the negative a question posed by Fagin [Fa82]. We note that the counterexample is actually very simple: U is the set ABC, V is the set AB, and C consists of the functional dependency $AC \rightarrow B$ and the embedded multivalued dependency $\emptyset \twoheadrightarrow B \mid C$.
7. Static vs. Dynamic: A Trade-Off
Normally, the consistency of the database needs to be checked only in the process of updating the database. Our results indicate that the closed-world approach makes the dynamic manipulation of incomplete databases more difficult. Let us consider now the static manipulation of databases, i.e., query evaluation.
Let U be an attribute set, let C be a set of dependencies on U, and let V be a subset of U. Recall that V is the predicate symbol that refers to incomplete relations (i.e., relations on V). For simplicity we consider only Boolean queries, i.e., queries that have a yes/no answer. A first-order Boolean query on V is a first-order sentence in L(V). We now have to define the semantics of queries on incomplete relations. Our approach is analogous to the approach taken in the definition of consistency. That is, we apply the query to all completions of the incomplete relation. As with consistency we have two cases. Let q be a relation on V, and let $\phi$ be a query on V. We say that $\phi$ OWA-holds$_C$ in q if $\phi$ holds in $\pi_V(p)$ for all extensions p of q that satisfy C. We say that $\phi$ CWA-holds$_C$ in q if $\phi$ holds in $\pi_V(p)$ for all conservative extensions p of q that satisfy C. It follows from the definitions that if q is CWA-consistent with C, then $\phi$ CWA-holds$_C$ in q iff $\phi$ holds in q (i.e., $\phi$ holds in q where q is viewed as a complete relation).
To study the computational complexity of query evaluation we consider the following collections:
OWA-holds$_C$ = $\{ (V, q, \phi) : \phi$ OWA-holds$_C$ in $q \}$,
where V is a subset of U, q is a relation on V, and $\phi$ is a query on V, and
CWA-holds$_C$ = $\{ (V, q, \phi) : \phi$ CWA-holds$_C$ in $q \}$,
where V is a subset of U, q is a relation on V, and $\phi$ is a query on V.
Theorem 7.1.
(1) The collection CWA-holds$_C$ is PSPACE-complete for any finite set C of total dependencies.
(2) The collection OWA-holds$_C$ is co-r.e.-complete. []
Thus query evaluation under CWA is not harder than standard query evaluation [CM77], while query evaluation under OWA is intractable. Thus, there is a trade-off between the complexity of manipulating the database statically and the complexity of manipulating it dynamically.
the database statically
of manipnlating
it dynamically.
and the
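The two query semantics of this section can be illustrated by a small brute-force sketch. A Boolean query OWA-holds in q if it holds in the projection on V of every extension of q that satisfies the dependencies, and CWA-holds if the same is true when only conservative extensions are considered. The sketch below is not part of the paper: the function names, the encoding of attributes as column positions, and above all the restriction to a fixed finite `domain` are assumptions made for illustration. Because true open-world extensions may use arbitrary values, a bounded search of this kind can exhibit a counterexample to OWA-holds but cannot establish it in general, which is consistent with part (2) of Theorem 7.1.

```python
from itertools import combinations, product

def project(p, idx):
    """Project a set of tuples onto the column positions in idx."""
    return frozenset(tuple(t[i] for i in idx) for t in p)

def satisfies_fd(p, lhs, rhs):
    """Check a functional dependency lhs -> rhs (column positions) on p."""
    seen = {}
    for t in p:
        key = tuple(t[i] for i in lhs)
        val = tuple(t[i] for i in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True

def extensions(q, v_idx, arity, domain, conservative):
    """Enumerate relations p over `arity` columns with values in `domain`
    (an assumed finite bound) whose projection on v_idx contains q
    (extension) or equals q (conservative extension)."""
    all_tuples = list(product(domain, repeat=arity))
    for r in range(len(all_tuples) + 1):
        for p in combinations(all_tuples, r):
            proj = project(p, v_idx)
            if (proj == q) if conservative else (q <= proj):
                yield frozenset(p)

def holds(q, query, constraint, v_idx, arity, domain, conservative):
    """True iff `query` holds in the projection of every enumerated
    (conservative) extension of q that satisfies `constraint`."""
    q = frozenset(q)
    return all(query(project(p, v_idx))
               for p in extensions(q, v_idx, arity, domain, conservative)
               if constraint(p))

# Hypothetical example: U = ABC (columns 0, 1, 2), V = AB, C = {A -> C}.
Q = {(1, 2)}
fd_a_c = lambda p: satisfies_fd(p, lhs=[0], rhs=[2])
all_a_is_1 = lambda r: all(t[0] == 1 for t in r)  # Boolean query on V

cwa = holds(Q, all_a_is_1, fd_a_c, [0, 1], 3, [1, 2], conservative=True)
owa = holds(Q, all_a_is_1, fd_a_c, [0, 1], 3, [1, 2], conservative=False)
print(cwa, owa)
```

In this example the query holds under the closed-world reading, since every conservative extension projects back to q itself, but fails under the open-world reading, since an extension may add a tuple whose A-value is 2.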
Acknowledgements. I'd like to thank Ron Fagin and Shuky Sagiv for their comments on a previous draft of this paper.

References

[AU79] Aho, A.V., Ullman, J.D.: Universality of data retrieval languages. Proc. 6th ACM Symp. on Principles of Programming Languages, San Antonio, 1979, pp. 110-117.

[Bo77] Borodin, A.B.: On relating time and space to size and depth. SIAM J. Comput. 6(1977), pp. 733-744.

[BV81] Beeri, C., Vardi, M.Y.: The implication problem for data dependencies. Proc. 8th Int. Colloq. on Automata, Languages, and Programming, July 1981, Lecture Notes in Computer Science - Vol. 115, Springer-Verlag, 1981, pp. 73-85.

[Ch81] Chandra, A.K.: Programming primitives for database languages. Proc. 8th ACM Symp. on Principles of Programming Languages, 1981, pp. 50-62.

[CH82] Chandra, A.K., Harel, D.: Structure and complexity of relational queries. J. Computer and System Sciences 25(1982), pp. 99-128.

[CKS85] Cosmadakis, S.S., Kanellakis, P.C., Spyratos, N.: Partition semantics for relations. Proc. 4th ACM Symp. on Principles of Database Systems, Portland, March 1985, pp. 261-275.

[CM77] Chandra, A.K., Merlin, P.M.: Optimal implementation of conjunctive queries in relational databases. Proc. 9th ACM Symp. on Theory of Computing, 1977, pp. 77-90.

[Co75] Codd, E.F.: Understanding relations (Installment #7). FDT Bull. of ACM-SIGMOD 7, 3-4(1975), pp. 25-28.

[Fa82] Fagin, R.: Horn clauses and database dependencies. J. ACM 29(1982), pp. 952-985.

[Fe74] Feferman, S.: Two notes on abstract model theory - Properties invariant on the range of definable relations between structures. Fundamenta Math. 82(1974), pp. 153-165.

[FUV83] Fagin, R., Ullman, J.D., Vardi, M.Y.: On the semantics of updates in databases. Proc. 2nd ACM Symp. on Principles of Database Systems, Atlanta, March 1983, pp. 352-365.

[GH83] Ginsburg, S., Hull, R.: Characterizations for functional dependency and Boyce-Codd normal form families. Theoretical Computer Science 29(1983), pp. 243-284.

[GZ82] Ginsburg, S., Zaiddan, S.: Properties of functional dependency families. J. ACM 29(1982), pp. 678-698.

[GMV86] Graham, M.H., Mendelzon, A.O., Vardi, M.Y.: Notions of dependency satisfaction. To appear in J. ACM.

[GV84] Graham, M.H., Vardi, M.Y.: On the complexity and axiomatizability of consistent database states. Proc. 3rd ACM Symp. on Principles of Database Systems, Waterloo, April 1984, pp. 281-289.

[Ha76] Hajek, P.: Some remarks on observational model-theoretic languages. Proc. 2nd Conf. on Set Theory and Hierarchy Theory, 1975, Lecture Notes in Mathematics - Vol. 537, Springer-Verlag, 1976, pp. 335-345.

[Ho82] Honeyman, P.: Testing satisfaction of functional dependencies. J. ACM 29(1982), pp. 668-677.

[Hu84] Hull, R.: Finitely specifiable implicational dependency families. J. ACM 31(1984), pp. 210-226.

[Hu86] Hull, R.: Non-finite specifiability of projections of functional dependency families. To appear in Theoretical Computer Science.

[IL84] Imielinski, T., Lipski, W.: Incomplete information in relational databases. J. ACM 31(1984), pp. 761-791.

[Ko81] Kowalski, R.A.: Logic as a database language. Proc. Advanced Seminar on Theoretical Issues in Databases, Cetraro, 1981.

[Li81] Lipski, W.: On databases with incomplete information. J. ACM 28(1981), pp. 41-70.

[Li83] Lipski, W.: Logical problems related to incomplete information in databases. Research Report #138, Laboratoire de Recherche en Informatique, Universite de Paris-Sud, 1983.

[Ma83] Maier, D.: The Theory of Relational Databases. Computer Science Press, Rockville, Maryland, 1983.

[MUV84] Maier, D., Ullman, J.D., Vardi, M.Y.: On the foundations of the universal relation model. ACM Trans. on Database Systems 9(1984), pp. 283-308.

[McDD80] McDermott, D., Doyle, J.: Nonmonotonic logic I. Artificial Intelligence 13(1980), pp. 41-72.

[NG78] Nicolas, J.M., Gallaire, H.: Database - theory vs. interpretation. In Logic and Data Bases (H. Gallaire and J. Minker, eds.), Plenum Press, New York, 1978, pp. 33-54.

[Re78] Reiter, R.: On closed world data bases. In Logic and Data Bases (H. Gallaire and J. Minker, eds.), Plenum Press, New York, 1978, pp. 55-76.

[Re80] Reiter, R.: A logic for default reasoning. Artificial Intelligence 13(1980), pp. 81-132.

[Re83] Reiter, R.: A sound and sometimes complete query evaluation algorithm for relational databases with null values. Tech. Report 83-11, Dept. of Computer Science, Univ. of British Columbia, 1983.

[Re84] Reiter, R.: Towards a logical reconstruction of relational database theory. In On Conceptual Modelling (M.L. Brodie, J. Mylopoulos, and J.W. Schmidt, eds.), Springer-Verlag, New York, 1984, pp. 191-233.

[Ul83] Ullman, J.D.: Principles of Database Systems. Computer Science Press, Potomac, Maryland, 1983.

[Va82] Vardi, M.Y.: The complexity of relational query languages. Proc. 14th ACM Symp. on Theory of Computing, San Francisco, May 1982, pp. 137-146.

[Va85] Vardi, M.Y.: Querying logical databases. Proc. 4th ACM Symp. on Principles of Database Systems, March 1985, pp. 57-65. To appear in J. Computer and System Sciences.