Missing:
Simple Conditions for Guaranteeing Higher Normal Forms in Relational Databases C. J. DATE Independent
Consultant
and RONALD
FAGIN
IBM Almaden
Research Center
A key is simple if it consists of a single attribute. It is shown that if a relation schema is in third normal form and every key is simple, then it is in projection-join normal form (sometimes called fifth normal form), the ultimate normal form with respect to projections and joins. Furthermore, it is shown that if a relation schema is in Boyce-Codd normal form and some key is simple, then it is in fourth normal form (but not necessarily projection-join normal form). These results give the database designer simple sufficient conditions, defined in terms of functional dependencies alone, that guarantee that the schema being designed is automatically in higher normal forms. Categories forms General
and Subject
Terms: Design,
Descriptors:
H.2. l[Database
Management]:
Logical
Design—normal
Theory
Additional Key Words and Phrases: Boyce-Codd normal form, BCNF, database design, fifth normal form, 5NF, fourth normal form, 4NI?, functional dependency, join dependency, projection-join normal form, PJ/NF, multivalued dependency, normalization, relational database, simple key
1. INTRODUCTION In his first schemas
papers
on the
which
certain
in
undesirable
behavior.
able
behavior
form
that
an
“improved
(BCNF), convert
which relation
third
occur.
normal
The form”,
still. not
in
[4, 5] observed
strongest
normal
~orms
(most
that
relation
occur
where
restrictive)
this
exhibit
undesir-
such
normal
form
(3NF). Later Codd [6] defined called Boyce-Codd normal form
usually
He discussed a given
dependencies
normal
various
is third
is stronger
Codd
of functional
defined
then
schemas
model,
patterns
He
does not
he defined
relational
normal
ways form
to normalize, into
ones
that that
is, are,
to by
Authors’ addresses: C. J. Date, P.O.B. 1000, Healdsburg, CA 95448; R. Fagin, IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Aesocie.kion for Compd+ Machinmy. T& copy otherwise, or to republish, requiro~ a fee andjor specific permission. @ 1992 ACM 0362-5915/92/0900-0465 $01.50 ACM Transactions
on Database Systems, Vol. 17, No. 3, September 1992, Pages 465-476.
466
.
making
C. J. Date and R. Fagln of the
use
BCNF
relation
Codd’s
normal
projection
schemas
dependencies,
and join may
operators.
have
forms
were
Fagin
[8] defined
some
intended
Fagin
of
the
to prevent.
a new
By
fourth
[8]
making
normal
form
by making use of join normal form (PJ/NF),
fifth
is
normal
form
“ultimate”
(5NF),
normal
which
form,
when
stronger
only
the
than
even that
of multiualued
(4NF),
which
dependencies and
and
called
which
join
is
[10],
sometimes
4NF,
projection
that
behavior
use
stronger than BCNF. Later, Fagin [9] defined projection-join
even
observed
anomalous
is
the
operators
are
allowed. Nearly and
every
textbook
and
a number
4NF,
problem
with
both
in
with
the
and
in
the
very
opposite
trap
and
inaccurate
as to be virtually
The
is that
problem
dependencies
and
functional and
guarantee
higher
schema
is in
third
that
simple,
then form).
it
is
to understand Thereby, the
and
they
database
In and
Section
prove
job who
normal 2, we
give
proceed
that
if
is
easier.
give
can
schema In
Sections not
the
3, 4, and
However, and 5 can
some
is
we show
be skipped
are
of this paper include
ACM Transactions
their
is,
easy
guideline,
without
of a
the
practitioner
normal which
are
we
some key is
projection-join
for
higher
practical
to
a relation
consists
and
also
requiring
forms.
may
useful
situations
of our give
and 5, we
results.
definitions.
every
key
consider
is
the
make for
in
the
which
knowledge
of
by those
Sections
3,4,
Section
4, we
simple,
then
the
of assuming
euery key is simple. We show th~t key is simple, then the relation
by a counterexample
is simple,
In In effect
then
it
interested
that
is not
own presentations
only
in this regard!
on Database Systems, Vol. 17, No. 3, September 1992.
if a relation
necessarily
proofs.
1 The authors
if
Furthermore,
form
results
sufficient
that
(that
the
design
3, we 3NF
Section
key
dependencies
necessarily
not
that
than
easy-to-un-
are
form.
normal
description
Section
some
4NF.
is simple
(but
some
show
normal
class
than
is
fall and
dependencies.
an informal In
the
that some key is simple, rather is BCNF and a relation schema is BCNF,
key
be achieved,
or join
PJ/NF.
we
These
only
schema
higher they
understand
that
to guarantee
if
schema
a
of multivalued
to
of functional
particular,
database
can
formally.
terms situations,
form
a little
form
a relation
schema
the
imprecise
terms
is to give
in
every
sufficient
in
harder
of
conditions
a practical
dependencies
5, we
relation
are
paper
is in Boyce-Codd
give
instructor,
projection-join
and
normal
that
of this
projection-join
fourth
designer’s
multivalued
In
results
provide
database
is in
are
class
form
it
in
or else
defined
which
defined
forms.
schema
These
insight, so
be
PJ/NF,
concerned of
are
to
and
excessively
that
are
purpose
a wide
normal then
if a relation
normal
statements
4NF
definitions
of practical
PJ/NF
are
in
normal
attribute),
show
‘l’he
hold
are
precise
BCNF
tends
useless. and
which
that
way
dependencies,
conditions,
alone,
single
join
the
make
4NF
dependencies.
derstand
in
consider
There
namely
they
and
many
PJ/NF. forms,
:1 Either
accurate
little
3NF,
discuss
normal
seminars
giving
but
discusses
also
of higher
live
formalism, forms,
into
databases
of them
presentations
print
normal
on
in the
PJ/NF. results
but
Simple Conditions 2. INFORMAL Normal
DISCUSSION
forms
relational
for Guaranteeing
certain
As
anomalous
normalization. ization
along
Nearly
form
of 3NF.
textbook
the
if
attributes
(written
trivial),
(2)
A is an X
occur.
contains
a key
of a key.
into
either
(so
X
(1) A is in
functionally
If we eliminate
X
on
(so
of 3NF
some
the
determines
the third
third
definitions
definition
depends
set
X
of
dependence
then
is
attribute),
euery
clause,
the
form.
through
equivalent
a convenient
functionally
of
normal
normalization
of essentially
of
Normal-
structure
a higher
a
that
levels
it is.
(the
in
means
various
desirable
schema
discussion,
A) then
+
467
relations
“good
are
more
discusses
that
structure work,
There
the
relation
a number
of our
to
early
constraints)
attribute
X
or (3) A is part
the
are
his
form,
on databases
There
purposes
says that
not
integrity
ways
in
normal
convert
the
(3NF).
For
the
to
with
every
normal
explained does
higher
a process
is
relations,
of as “good”
Codd
behavior
The
.
OF RESULTS
be thought
can
database.
Higher Normal Forms in Databases
we obtain
the
(more desirable) Boyce-Codd normal form ( BCNF). Thus, BCNF stronger are the result of demands that the only nontrivial functional dependencies less demanding: It allows a nontrivial functional depenkeys. 3NF is a little
dency that
A is part
X + A if in
passing
a key,
from
which
Normal
may
though
that
example attributes
schema
Date
[7]
COURSE,
1. The
semantics
are
only if course a given
c can
course,
number
of
particular
offering since
BCNF,
However,
example,
the
for
intended
a schema
be reasonable would
is
and
TEXT. (c,
any
texts.
is
split
up
other;
is
that
is,
of the
given
course,
is
“all
key”
(no
it
is
still
not
structured
that
Professor
textbook.
that
matter
the
same
in
Green
teaches may
are
is in
Figure
relation if and x as a reference. For teachers
are
used. three This
physics lead
and
and
actually
way.
redundancy
an
There
in the
of the
a good
the
consider
4NF.
teachers
texts
same
instance
who
subset
even
of
us
not
of corresponding
proper
This
Let
but
A sample
assumed no
it
prevent.
BCNF
However, some
t,x) appears
number It
have
t and uses text
by teacher
exist
“goodness”.
still to
that
A tuple
can
physics
might process
to
may
TEACHER,
information
each
it
as follows.
each
is
this
normalization
correspond
BCNF, was
of
corresponding of
reason
the
to is
be taught
there
independent
key).
intended
normalization
from
The
to BCNF,
be desirable.
are
a relation
problems
once
not
forms
of a key.
3NF
any
texts
are
teaches This
any
example
attributes
is
is because, appears
a
for twice,
to inconsistencies.
Fourth normal form (4NF) is intended to remedy the problem, as we now discuss. on COURSE, i.e., COURSE + The functional dependency of TEACHER taught by multiple teachers. TEACHER, does not hold, since a course maybe However, (written
there
COURSE dency
+
must
schema one
is
COURSE
with
COURSE
is
-
a “multivalued ~
TEXT.
-
For
be a consequence not
4NF.
attributes and
TEXT,
dependency”
TEACHER), the
schema of keys.
However,
by
COURSE
and
as in
and
Figure
to be This
TEACHER,
ACM Transactions
obtain
TEACHER
on
a multivalued
4NF,
is not
decomposing 2, we
of
similarly
every
the it
case
into
and
COURSE
dependency
multivalued here,
two another
depen-
so the
relation with
relation schemas,
attributes
4NF.
on Database Systems, Vol. 17, No. 3, September 1992.
468
C. J. Date and R. Fagm
..
TEXT
TEA
CHER
Physics
Prof.
Green
Basic
Physics
Prof.
Green
Principles
COURSE
Mechanics of
Physics
Prof.
Brown
Basic
Physics
Prof.
Brown
Principles
Math
Prof.
Green
Basic
Math
Prof.
Green
Vector
Math
Prof.
Green
Trigonometry
Fig. 1.
Mechanics of
Analysis
BCNF but not 4NF.
m
COURSE
TEXT
Physics
Basic
Principles
Basic
Math
Although
it
of
Y
X
This
dencies We key
not
if
informally
for
each
X,
euery
pair
make
it
difficult
4NF
(which
and
now
ready
consisting
of
along
“counterexamples”
attribute.
noted
BCNF
but
database that in
are
but
schema case.
all This
that
not
We
ACM Transactions
can
all
being
simply
us
In
key,
fact,
modify
seem
sense, to 4NF
could
X
+
+
depen-
dependencies). paper.
We
of our not of
only
have
more and
that
the
4NF
functional,
to a
is
Thus,
but
consisting
refer
results
4NF.
BCNF
only
incorrectly
multivalued
implies are
if the
dependence
dependency
One
of
than not
the one
multi-
4NF.
always
not
one
of this
simple, that
Indeed,
of multivalued
involving
key.
some
lead but
then
simple.
as
documents
4Nl? In
might is BCNF
terms
a compound
similar
key.
Y’s”),
COURSE-TEACHER-TEXT
is
multivalued
dependencies.
contributions
guarantees
the
4NF, and
BCNF
particular
example.
the
earlier,
definition
is that
to understand
in
the
a condition
that
not
design
is
us
of
precise
is a multivalued
is a multivalued
practitioners
key
key
gives
a set
of schemas
every
This
is
attribute
some
dependencies,
As we is
with
problem
as, “there
Y, there
is defined
a single
possible
that
for
to discuss
BCNF,
property
X,
a formal,
the
as functional
there
Analysis
Trigonometry
to give
(such
of Optics
Mechanics
Vector
to obtain 4NF.
of 4NF),
as intuitive
too
for
are
valued
(and
that
conclude Y.
are
is stated on
possible
dependencies
dependencies concept
certainly
is
Decomposing
Mechanics
I
Physics
Math
multivalued
Optics
Mechanics
Math
Fig, 2.
Optics
the
believe
examination
reveals
that
to be of this examples that
is necessarily our
example
an
every all
examples same are
1992.
of
general all
really
example key.
COURSE-TEACHER-TEXT
on Database Systems, Vol. 17, No. 3, September
we gave,
that
of handbooks
However,
of
on
schemas form, the
and same
a relation this example
is not by
Simple Conditions adding than
a constraint one
{TEACHER,
TEXT}
+
4NF,
We
+
just
are
BCNF
one
key
TEXT} we
saw
an
example 4NF
not
4NF
in
Or,
key in
is a still
every
higher
fifth
is
the relation some
are
enough not
weaken 3NF,
enough
This
every
case, and
example,
but that
there
that is only
{TEACHER,
More
not
schema
of schemas
each
compound. 4NF,
generally,
every
if a schema
normal
forms
projection-join
key
is BCNF
which
is the
is and
and
joins
using
functional
There
( PJ/NF)
normal is
or
form
is concerned. PJ/NF
hard
simple
give
4NF.
form
“ultimate”
dependencies,
can
through
normal
are also somewhat
we
the
assumptions
is BCNF,
as
Just
as
defined
by
for the practitioner
to
conditions,
which
we
now
dependencies
alone,
that
are
that
and
4NF.
assuming
this
However,
guarantee
is enough
first
assumption
than
assuming
schema
as we
PJ/NF.
euery
that
by
(not
if
later,
we
PJ/NF.
we
key
In
only
Thus,
that
show
these
assumptions
strengthen
some)
just
assuming
BCNF.
show
But
to guarantee
is 3NF,
to guarantee
comfortable
formal
still
fact,
we
the
that
the
simple,
is
can
we
then
relation
the
second
then
even
schema
assumptions
is that
and
key is simple
(2) every
is simple)
is
PJ\NF.
to
(1) the relation
are
in
is simple
that the
in
are
to projections
which
to guarantee
rather
that
first
discussed
(5NF),
defined
are
by
show
can
schema
examples
keys
multivalued
again,
sufficient
assumption
and
dependence
the
in the
these
called
respect
schema
key
more
is 4NF. have
form
discussed,
(2)
are
it
using
that
(1)
TEXT} and
two
is that
show
with
to guarantee
common
we
form,
Once
have
in
way,
normal
sufficient we
in
is a key,
is false
our
it another
we
describe, As
469
dependency
TEXT}
conjecture
What
BCNF
then
by
the
is
dependencies,
understand.
text
multivalued
of keys,
that
normal
defined
join
using
same
functional
the
schema
section,
as decomposing
4NF
Since
a consequence
key.
do have
putting
this
sometimes
the
the
{TEACHER,
key.
that
is all
example),
is simple,
So far
not
TEACHER,
second
that
is
all
use
adding
Then
not
not
({COURSE,
compound.
far
by
but
show
COURSE). is
cannot
to
.
it is BCNF.
but
in the
some
+
TEACHER
is BCNF
teacher
corresponds
example
although
Higher Normal Forms in Databases
a given
(this
this
COURSE
that
that
course
particular not
for Guaranteeing
with which,
concludes
PJ/NF.
3NF,
this
when
combined
the
definitions
informal
and
For
gives
with part
proofs
those
them
can
database
an additional 3NF,
of this read
designers condition
automatically paper.
Those
who (that
are
guarantees who
are
most
every
key
PJ\NF.
interested
in
on.
3. DEFINITIONS We
are
given
a fixed
represent
column
we
write
may
finite
names XY
set
U of distinct
of a relation.
for
X
u Y.
If
ACM Transactions
If X
X
symbols, and
= {Al,
Y are
. . . . A.},
called
attributes,
sets
of attributes,
then
we
may
which then write
on Database Systems, Vol. 17, No. 3, September 1992.
470
C, J, Date and R. Fagln
.
. . . A.
Al
singleton
Let
for
X.
set
{A}.
In
T be a set
tuple,
particular,
we
of attributes
if T is understood)
may
(that
is,
with
restricting
the
Assume
that
respectively.
mapping
the
s to
relations
The
and
A functional X
and
if every
sets
pair
agrees
. . . . R.
the
or
X
Y,
and
that
in
there
+ (Y – X).
where
A join hold
for
The
(or
simply
a T-tuple
attribute
is
(over
a
T)
obtained
sets
exploits
TI,
which
attribute
the
the
is
. . . . T. written
set
TI u . . . u T.
fact
that
the
join
way,
( JD) are
of
Y
holds
Z -
be the -
Y
form
to hold
also
R if
= t[Y]. X
+
of attributes
is said
R
X
relation
s[Y]
of the set
Y, where
in
for
then
is a statement
replace
not
for
+
Y,
in
X
a relation
S[ X ] = t [ X ], then
an
MVD
there
is
is a statement
attributes.
by an equivalent
of the
The
MVD
disjoint.
side are
R if R = M{ RIXI],
a relation
-
G
a relation
attributes
= t[X],
t of R where
we can
[10]
sets
of the
X
X for
S[ XY]
side and right-hand
. . . . X.
FD
form
to hold
and U[ Z] = t[Z]. Later, we shall make the MVD X + + Y is equivalent to the MVD
=
[2] that
In this
each
s[X]
Let X
of the
Y is said
on
[8]
s and
]
+
the
MVD
tuples
ZL[ XY
dependency
Xl,
is,
of attributes.
are
the left-hand
where
agrees
Y. That
U — XY.
u of R where of the simple fact
use
X
(LTVD)
sets
a tuple
X -
FD
dependency
Z =
R if whenever
s over
of R where
Y are
is,
T-tuple
. . . , R ~,
notation
t are tuples
A multivalued where
over
RI,
[5] is a statement The
of R that
attributes
s and
are
relations
(FD)
of attributes.
of tuples
on the
whenever
A
T. Thus,
the
commutative.)
dependency
Y are
U).
represent
each attribute in T. A relation then S[ T ] denotes the T-tuple
R.}, is the set of all tuples M{ RI,..., such that S[ T, ] is in R, for each i. (Our is associative
of
domain
to
T.
Rl,
of
join
A
simply
a subset
is a function
mapping that associates a value with is a set of T-tuples. If s is a U-tuple, by
write
JD
form
M
M
{Xl,.
{Xl,.
. . . X.}
. . . , R[ X. ]}. We shall
. . . X.}, is
make
said
to
later
use
to the MVD of the simple fact [8] that the JD M {Xl, Xz} is equivalent Xln Xz-+Xz. and u is a single dependency, then S logically If Z is a set of dependencies that satisfies z implies w (or c is a logical consequence of Z) if every relation also
satisfies
relation”
a.
that
Thus,
Z
satisfies
of FD’s logically
implies
As
CT if an
the FD A + C (this
is A satisfies
attributes X
only
and only
+
Y
if either
Y = 0
schema
of dependencies
describes
this if
the
ACM TransactIons
attributes
only
is called
involving
U,
the
It
if
Y c X;
U Y = U; and
or X
is a pair
implies
dependency.
and
if Xi = U for some
A relation a set
is trivial
logically
is the
MVD
no
“counterexample
set
{A
+
X
+
+
B, B
+
C}
As another
transitivity).
if it is valid, that is, a logical consequence of the empty set. For since every relation where one of the the FD A + A is trivial,
and
Y
there
example,
the
FD
+
implies u.
trivial example, an
X
not
example, is
FD
logically
X but
Y. A dependency
is straightforward an
a JD
MVD
M
X
{Xl,
to verify +
+
Y
. . . . X.}
that
is trivial
if
is trivial
if
i. (U, only along
Z),
U is a set of attributes
where
these with
attributes. the
Thus,
dependencies,
on Database Systems, Vol. 17, No, 3, September 1992.
a relation where
and
Z is
schema they
are
Simple Conchtlons for Guaranteeing thought
of as constraints.
(U, Z) if cr involves
Higher Normal Forms in Databases
A dependency
only
attributes
v
is a dependency
.
471
of the schema
in U, and if u is a logical
consequence
of
Z. A key (sometimes called candidate key) of a schema is a set K of attributes such that (a) the FD K + U is an FD of the schema, and (b) if K‘ is a proper subset of K, then K’ + U is not an FD of the schema. A superkey is a superset
of a key. Thus,
FD of the schema. key dependency We
now
paper, Third
define
the
normal make
to
normal
form,
a key
attribute
of
(31W0
a single
was
we
the
need
MVD
[9] that
use
for
if the FD K +
we shall
In
a prime
Codd third
our
U is an
is called
consider
[5].
In
this
normal
a
in
this
paper,
we
form,
purposes.
a relation
attribute) A relation
A is a nontrivial
either
To
schema,
define
is
third
an attribute
if it is contained schema is in third
FD of the schema,
X is a superkey
which
is
in some normal
where
A is
or A is a key attribute.
is in Boyce-Codd normal form (BCNF) [6] if whenever FD of the schema, necessarily X is a superkey. It is
[9] that
schema
is a nontrivial
to
attribute.
X -
by of
definition. called
a relation
is a logical
schema. A relation
that
definition
easier
another
then
by Fagin
by Fagin
is
it is a nonkey
attribute,
forms defined
[11]
(sometimes
if whenever
schema
normal
originally
but
A relation schema X + Y is a nontrivial shown
precisely
K is a key of the schema,
of strength.
Zaniolo’s
Codd’s,
key. Otherwise, form
various
order
form
use
equivalent
U, where
(KD).
in increasing
shall
K is a superkey
An FD K +
schema
consequence
is in fourth of the
normal
schema,
a relation
is BCNF
of the form
(4NF)
necessarily
schema
if and only
set of key
is 4NF
if every
dependencies
[8] if whenever
X + -
X is a superkey.
if and
only
FD of of the Y
It is shown
if every
MVD
of the
schema is a logical consequence of the set of key dependencies of the schema. A relation schema is in projection-join normal form (PJ/NF) [91, sometimes
called
fifth
consequence
4. ASSUMING In this
section,
4.1.
THEOREM
Then
To prove
this
hypotheses show).
we prove and
that
one that
Specifically,
is simple.
form
4.2. Then
JD of the
schema
is a logical
of the schema.
in
condition,
a wide
involving
class
only
of situations,
functional guarantees
we show: Assume
that
the relation theorem, prove
there
Assume
are two
harder
that
schema
steps.
is BCNF
that
if a relation
result
a relation
We first
schema
it is PJ/NF. step in the proof
schema
is 3NF,
and
that
every key
is PJ/NF.
the relation
the
the relation
a relation
schema
of the theorem,
We then
LEMMA
if every
a simple
holds
and every key is simple, then The next lemma is the first
simple.
(5NF),
EVERY KEY IS SIMPLE
dependencies, PJ/NF.
normal
of the set of key dependencies
schema
show
and
under
is very
schema
of Theorem is 3NF,
that
(this
the
easy to
is BCNF,
4.1. that
every key is
is BCNF.
ACM Transactions
on Database Systems, Vol. 17, No. 3, September 1992.
472
.
C. J. Date and R. Fagin Let
PROOF.
attribute.
X + A be a nontrivial
Assume
sufficient
that
the
to show that
FD of the schema,
schema
is 3NF.
X is a superkey.
To show
Since
where
that
the schema
A is a single
it is BCNF, is 3NF,
it is
either
X is
a superkey, or A is a key attribute. If A is a key attribute, then A is a key itself, since by assumption, every key is simple. Since X + A is an FD of the it follows
that
was to be shown.
schema,
❑
If JD.
M{
Xl,...,
We
shall
X is a superkey.
Xn } is a JD, make
use
then
So in any case, X is a superkey.
we refer
of the
to the
following
This
XL’S as components
simple
lemma,
of the
whose
proof
is
straightforward. [3] Let JI two components
LEMMA
4.3.
replacing
As an example the JD
N
of Lemma
{ABC,
AD,
by their union BCE. We shall also make
be a JD, and of JI by their 4.3, the JD
BCE},
use of the
mining whether a given JD correctness of the algorithm results of Aho Theinputisa
where
et al. [1]. JD H
M
let Jz union.
be a JD obtained Then JI logically
{ABC,
AD,
we replace
following
the
from JI by implies Jz.
BC, CE} logically
implies
components
and
Membership
BC
Algorithm
for deter-
is a logical consequence of a set of KD’s. The is stated in Fagin [9]; it also follows from the
{Xl,..
., X~}andaset{KU,
U,..
., K~+U}of
rule until set&as {Xl,. . ., Xn}. Apply the following applied: if K1 c Y n Z for some key K1 (1 < i < s) and
it can be for some
KD’s. Initialize no longer
members Y and Z of ~, then remove the sets Y and Z from particular,
the
number
{Y,,..
., Y,} be the final of theset{Kl - U, ...,
of
CE
replace Y and Z in ~ by their union, that is, & and add to & the single member Y U Z. (In members
of
W
then
decreases
by
one.)
Let
result. Then M {XI,... , X~} is a logical consequence K, + U} of KD’S precisely if some ~ equals the set U
of all attributes. Example. Assume that the attributes are U = {A, B, C, D}, and that the input consists of the JD N {AB, AD, BC} and the KD’s A + U and B + U. We now show, by using the Membership Algorithm, that the JD is a logical consequence of the KD’s. Initialize and since AB and AD are in Y,
AD, BC}. Since A + U is a KD AB and AD in P by ABD. At
& as {AB, we replace
this stage, F is {ABD, BC}. Since B + U is a KD and since ABD and BC are in 9, we replace ABD and BC by ABCD. We are left with P = { ABCD} = {V}. w {All,
So the Membership AD, BC} is a logical
Algorithm consequence
tells us that, indeed, the JD of the set {A + U, B + U} of KD’s.
❑
The next LEMMA
simple.
is the second
Assume
4.4.
Then
PROOF.
simple.
lemma
the relation
Assume
We wish
ACM Transactions
that
to show
that
(and
final)
a relation
schema
step in the proof
schema
is BCNF,
of Theorem and
every
4.1. key is
is PJ/NF.
the
relation
that
the schema
schema
is
is PJ/NF.
BCNF,
and
Assume
on Database Systems, Vol. 17, No. 3, September 1992
every
not.
Then
key
is
there
Simple Conditions isa
JD~{Xl,
for Guaranteeing
. . ., Xn}
Higher Normal Forms In Databases
of the schema
that
is not a logical
.
473
consequence
of the
KD’s. Let rithm
{Yl,. with
. . . Yr} be the final result after applying the Membership Algothe JD w {Xl, . . ., X~} and the KD’s of the schema as the input. Since, by assumption, M {Xl,..., X~} is not a logical consequence of the KD’s, it follows from the Membership Algorithm that none of the YZ’S equals the set U of all attributes. of
Lemma
w
{Yl, . . ., Y,}. By
4.3
that
assumption,
least
Note the
every
one key, it follows
JD key
also that M
{Xl,
is simple.
in particular
it follows
from
. . . . X~}
logically
Since
that
every
there
repeated
applications
implies
relation
the
schema
is some simple
JD
has
key. Let
at
A be
an attribute that is a simple key. Then A is in exactly one Y,. This is because A is certainly in at least one Yi, since the union of the ~’s is the set U of all attributes. Further, A cannot be a member of both YZ and Y~, where because the procedure would have replaced Y, and Y~ by Y, u Yj, since key. Without
loss
of generality,
W=
Y2 U...
least
one such ~,
U Y, to be the
let YI be the union of all of the
since as we noted
above,
Yi that contains Y,’s except for Yl;
none of the ~’s
equals
i #j, A is a
A. Define there is at the set U of
all attributes, and in particular YI + U. By Lemma 4.3 applied repeatedly, follows that the JD M {Yl, W} is a logical consequence of M {Xl,. . . . Xn}. we noted
earlier,
Therefore, We now multivalued FD-MVD2
the JD
M {Yl, W} is equivalent
YI n W + +
the MVD
make use of dependencies, in
the
to the
W is an MVD
MVD
of the
YI
n W +
+
axiomatization
of Beeri
W.
schema.
the following inference rule for functional which is a special case of the inference
complete
it As
et al.
[2]
for
and rule
FD’s
and
MVD’S. If X + + Z and
(*) Let
X
apply
be YI n W, let the inference
Y be A, and
rule
Zisani’hlVD we showed
Y + Z, where
(*) to derive
(l)
X++ which
(2)
Y - Z is an FD of the schema: the schema, since A is a key.
let
Y and Z are disjoint, Z be W. We now
then X + Z. show
that
we can
an FD of the schema.
of the schema: Here X~-Zis above is an MVD of the schema. Here
Y -
Z is A +
Yln W, which
W~~W, is an FD of
(3) Y and Z are disjoint:
Here Y and Z are A and W, respectively. We saw that A is in precisely one of the Y,’s, namely YI. Since W = Yg u . . . u Y, (the union of all of the Y,’s except for Yl), it follows that
above
A G W. Thus It follows
from
A and W are indeed
inference
rule
(*) that
disjoint. X ~ Z, that
is, YI n W + W, is an FD
of the schema. We now show that the FD YI n W + W is nontrivial, that W is not a subset of YI n W. If W were a subset of YI n W, then W would
is, be
a subset of Y1. Since YI u W = YI u Y2 u “o. u Y. = U, it would follow that YI = U, which we have shown is not the case. So indeed, the FD YI n W + W is nontrivial. Since it is an FD of the schema, and since the schema is BCNF, ACM Transactions
on Database Systems, Vol. 17, No. 3, September 1992.
474 it
C. J. Date and R, Fagin
.
follows
simple,
that
YI n W
it follows
WGW=Y2U
is
a superkey.
Since
by
assumption,
of both also, B = YI n W c Y1. Thus, B is a member impossible, because the procedure would have replaced ❑ since B is a key. Theorem 4.1 now follows
5. ASSUMING The
SOME
assumption
that
some key is simple
from
Lemmas
KEY IS SIMPLE every
key is simple
is fairly
strong.
is 4NF.
However,
we show by a counterexample
schema is not 4NF, then there where V is not a superkey.
5.1. Assume W be a nontrivial
As we noted
W are disjoint
contain
that
there
earlier,
we can assume
(by replacing
both
a member
is a nontrivial
every
without
key
MVD
of the
Let Let
schema
loss of generality
W by W – V if necessary),
of W and a member
is
which gives us informais BCNF but not 4NF.
is in exactly one of V, W, or W‘. If the assumptions not the conclusion, then let K be a key of the schema
attribute hold but
we
that a relation schema is BCNF but not 4NF. MVD of the schema where V is not a superkey.
W‘ be the set of attributes not in V or W. Then contains a member of W and a member of W’. PROOF.
section,
relation schema where some key is simple that is not PJ/NF. to prove that every BCNF relation schema where some key
Recall that if a relation V ~ + W of the schema
V and
In this
assumption that some relation schema where
use of the following lemma, of a relation schema that
LEMMA
is
4.2 and 4.4.
simple is 4NF, we make tion about the structure
V + +
key
YI and Yj. But this is YI and Yj by YI U Y~,
show that we can sometimes get by with the weaker key is simple. In particular, we show that every BCNF is a BCNF In order
every
that YI n W contains some simple key B. Since B ● YI n with2