Simple conditions for guaranteeing higher normal forms in relational

Simple Conditions for Guaranteeing Higher Normal Forms in Relational Databases

C. J. DATE
Independent Consultant

and

RONALD FAGIN
IBM Almaden Research Center

A key is simple if it consists of a single attribute. It is shown that if a relation schema is in third normal form and every key is simple, then it is in projection-join normal form (sometimes called fifth normal form), the ultimate normal form with respect to projections and joins. Furthermore, it is shown that if a relation schema is in Boyce-Codd normal form and some key is simple, then it is in fourth normal form (but not necessarily projection-join normal form). These results give the database designer simple sufficient conditions, defined in terms of functional dependencies alone, that guarantee that the schema being designed is automatically in higher normal forms.

Categories and Subject Descriptors: H.2.1 [Database Management]: Logical Design - normal forms

General Terms: Design, Theory

Additional Key Words and Phrases: Boyce-Codd normal form, BCNF, database design, fifth normal form, 5NF, fourth normal form, 4NF, functional dependency, join dependency, projection-join normal form, PJ/NF, multivalued dependency, normalization, relational database, simple key

1. INTRODUCTION

In his first papers on the relational model, Codd [4, 5] observed that certain relation schemas exhibit undesirable behavior. He then defined various normal forms, in terms of patterns of functional dependencies, in which such undesirable behavior does not occur. The strongest (most restrictive) normal form that he defined in these papers is third normal form (3NF). Later Codd [6] defined an "improved third normal form", usually called Boyce-Codd normal form (BCNF), which is stronger still. He discussed ways to normalize, that is, to convert a given relation schema that is not in normal form into ones that are, by making use of the projection and join operators.

Fagin [8] observed that even BCNF relation schemas may have some of the anomalous behavior that Codd's normal forms were intended to prevent. By making use of multivalued dependencies, Fagin [8] defined a new fourth normal form (4NF), which is stronger than BCNF. Later, Fagin [9] defined projection-join normal form (PJ/NF), sometimes called fifth normal form (5NF), by making use of join dependencies [10]. PJ/NF is stronger even than 4NF, and is the "ultimate" normal form when only the projection and join operators are allowed.

Nearly every textbook on databases discusses normal forms, and many of them also discuss 4NF and PJ/NF. There is a problem with a number of these presentations, both in print and in live seminars (see footnote 1): Either they are very precise and accurate but excessively concerned with formalism, or else they fall into the opposite trap and make statements that are so imprecise and inaccurate as to be virtually useless. The problem is that the definitions of the higher normal forms, namely 4NF and PJ/NF, are a little harder to understand than those of 3NF and BCNF, which are defined in terms of functional dependencies alone. 4NF and PJ/NF are defined in terms of multivalued dependencies and join dependencies.

The purpose of this paper is to give simple sufficient conditions, defined in terms of functional dependencies alone, that guarantee higher normal forms. In particular, we show that if a relation schema is in third normal form and every key is simple (that is, consists of a single attribute), then it is in projection-join normal form. Furthermore, we show that if a relation schema is in Boyce-Codd normal form and some key is simple, then it is in fourth normal form (but not necessarily projection-join normal form). These conditions hold in a wide class of practical situations, in which the higher normal forms can therefore be achieved without requiring knowledge of multivalued or join dependencies. The results provide the practitioner with an easy-to-understand guideline, and they may also be useful to the database instructor, who can give some insight into the higher normal forms without defining them formally. Thereby, the database designer's job is a little easier.

In Section 2, we give an informal description of our results. In Section 3, we give definitions. In Section 4, we prove that if a relation schema is 3NF and every key is simple, then it is PJ/NF. In Section 5, we consider the effect of assuming that some key is simple, rather than that every key is simple. We show that if a relation schema is BCNF and some key is simple, then the relation schema is 4NF. However, we show by a counterexample that it is not necessarily PJ/NF. Sections 3, 4, and 5 can be skipped by those who are interested only in the results and not in the proofs.

Footnote 1: The authors of this paper include their own presentations in this regard!

Authors' addresses: C. J. Date, P.O.B. 1000, Healdsburg, CA 95448; R. Fagin, IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. © 1992 ACM 0362-5915/92/0900-0465 $01.50. ACM Transactions on Database Systems, Vol. 17, No. 3, September 1992, Pages 465-476.

2. INFORMAL DISCUSSION OF RESULTS

Normal forms can be thought of as "good" ways to structure relations in a relational database. There are a number of levels of normalization: the higher the normal form, the more desirable it is. Normalization is a process that converts relations into a higher normal form, so that certain anomalous behavior does not occur.

As explained, Codd in his early work on databases defined third normal form (3NF), and nearly every textbook on databases discusses it. There are various essentially equivalent definitions of 3NF. For the purposes of our discussion, a convenient definition of 3NF says that if some set $X$ of attributes functionally determines an attribute $A$ (written $X \to A$), then either (1) $A$ is in $X$ (so the dependence is trivial), (2) $X$ contains a key (so $A$ depends functionally on a key, through $X$), or (3) $A$ is part of a key. Here the functional dependence is understood as one of the integrity constraints of the schema. If we eliminate the third clause, then we obtain the stronger (more desirable) Boyce-Codd normal form (BCNF). Thus, BCNF demands that the only nontrivial functional dependencies are the result of keys. 3NF is a little less demanding: It allows a nontrivial functional dependency $X \to A$ if $A$ is part of a key.

Normal forms correspond to levels of "goodness". However, even once a relation schema is in BCNF, it may still have some of the problems that the normalization process was intended to prevent. Let us consider an example from Date [7] of a schema that is BCNF but not 4NF. The attributes are COURSE, TEACHER, and TEXT. A sample relation instance is in Figure 1. The intended semantics are as follows. A tuple $(c, t, x)$ appears in the relation if and only if course $c$ can be taught by teacher $t$ and uses text $x$ as a reference. For a given course, there may exist any number of corresponding teachers and any number of corresponding texts. It is assumed that teachers and texts are independent of each other; that is, no matter who actually teaches any particular offering of the given course, the same texts are used.

This schema is in BCNF, since it is "all key" (no proper subset of the attributes is a key). However, it is still not structured in a good way. This is because there is redundancy in the information: for example, the information that Professor Green teaches physics appears twice. This redundancy might lead to inconsistencies.

Fourth normal form (4NF) is intended to remedy the problem, as we now discuss. The functional dependency of TEACHER on COURSE, i.e., COURSE $\to$ TEACHER, does not hold, since a course may be taught by multiple teachers. However, there is a "multivalued dependency" of TEACHER on COURSE (written COURSE $\twoheadrightarrow$ TEACHER), and similarly another multivalued dependency COURSE $\twoheadrightarrow$ TEXT. For a schema to be 4NF, it must be the case that every multivalued dependency is a consequence of keys. This is not the case here, so the schema is not 4NF. However, by decomposing the relation schema into two relation schemas, one with attributes COURSE and TEACHER, and another with attributes COURSE and TEXT, as in Figure 2, we obtain 4NF.
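The decomposition argument for the COURSE-TEACHER-TEXT example can be checked mechanically. The following is a minimal Python sketch, not part of the original paper: the helper names (`project`, `fd_holds`, `join_on_course`) are hypothetical, and the rows are the sample instance of Figure 1. It confirms that COURSE $\to$ TEACHER fails on this instance, while joining the two projections on COURSE gives back exactly the original relation.

```python
# Rows of the sample instance as (COURSE, TEACHER, TEXT) tuples.
CTX = {
    ("Physics", "Prof. Green", "Basic Mechanics"),
    ("Physics", "Prof. Green", "Principles of Optics"),
    ("Physics", "Prof. Brown", "Basic Mechanics"),
    ("Physics", "Prof. Brown", "Principles of Optics"),
    ("Math", "Prof. Green", "Basic Mechanics"),
    ("Math", "Prof. Green", "Vector Analysis"),
    ("Math", "Prof. Green", "Trigonometry"),
}


def project(rows, idx):
    """Project a set of tuples onto the column positions in idx."""
    return {tuple(r[i] for i in idx) for r in rows}


def fd_holds(rows, lhs, rhs):
    """True if tuples agreeing on columns lhs also agree on columns rhs."""
    seen = {}
    for r in rows:
        key = tuple(r[i] for i in lhs)
        val = tuple(r[i] for i in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


def join_on_course(course_teacher, course_text):
    """Natural join of the two projections on their shared COURSE column."""
    return {
        (c, t, x)
        for (c, t) in course_teacher
        for (c2, x) in course_text
        if c == c2
    }


course_teacher = project(CTX, (0, 1))   # the COURSE, TEACHER projection
course_text = project(CTX, (0, 2))      # the COURSE, TEXT projection

# The functional dependency COURSE -> TEACHER does not hold ...
fd_course_teacher = fd_holds(CTX, (0,), (1,))
# ... but the join of the two projections reproduces the relation.
lossless = join_on_course(course_teacher, course_text) == CTX
```

On this instance `fd_course_teacher` is false and `lossless` is true, which is the instance-level behavior the multivalued dependency COURSE $\twoheadrightarrow$ TEACHER describes. A check on one instance only illustrates the dependency; it does not prove that it holds for the schema.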

COURSE    TEACHER       TEXT
Physics   Prof. Green   Basic Mechanics
Physics   Prof. Green   Principles of Optics
Physics   Prof. Brown   Basic Mechanics
Physics   Prof. Brown   Principles of Optics
Math      Prof. Green   Basic Mechanics
Math      Prof. Green   Vector Analysis
Math      Prof. Green   Trigonometry

Fig. 1. BCNF but not 4NF.

COURSE    TEACHER
Physics   Prof. Green
Physics   Prof. Brown
Math      Prof. Green

COURSE    TEXT
Physics   Basic Mechanics
Physics   Principles of Optics
Math      Basic Mechanics
Math      Vector Analysis
Math      Trigonometry

Fig. 2. Decomposing to obtain 4NF.

Although it is possible to give a formal, precise definition of multivalued dependencies (and of 4NF), it is certainly difficult for practitioners to understand such a definition. Multivalued dependencies are not as intuitive a concept as functional dependencies. Sometimes multivalued dependence is stated informally as, "there is a multivalued dependence of $Y$ on $X$ if for each $X$, there is a set of possible $Y$'s", but such a description is too imprecise and might lead one to conclude, incorrectly, that a multivalued dependency holds.

We are now ready to discuss one of the contributions of this paper, which is that it gives a condition, defined in terms of functional dependencies only, that guarantees 4NF. As we noted earlier, the COURSE-TEACHER-TEXT example is BCNF but not 4NF. An examination of documents on database design reveals that the "counterexamples" they give (that is, schemas that are BCNF but not 4NF) are all really of the same general form as this example. In particular, they are all key, which might lead one to believe that every schema that is BCNF but not 4NF is necessarily all key. This conjecture is false. We can modify our example by adding the functional dependency {TEACHER, TEXT} $\to$ COURSE (this corresponds to the constraint that a given teacher with a given text teaches only one particular course). Then {TEACHER, TEXT} is a key, so the schema is not all key ({COURSE, TEACHER, TEXT} is not a key). Although the schema is BCNF, the multivalued dependency COURSE $\twoheadrightarrow$ TEACHER is still not a consequence of keys, so the schema is not 4NF.

What do these examples have in common? In each case, every key is compound, that is, consists of more than one attribute. More generally, we show that if a schema is BCNF but not 4NF, then every key is compound. Or, putting it another way, if a schema is BCNF and some key is simple, then it is 4NF.

There is a still higher normal form, projection-join normal form (PJ/NF), sometimes called fifth normal form (5NF), which is the "ultimate" normal form as far as projections and joins are concerned. PJ/NF is defined by using join dependencies, which are also somewhat hard for the practitioner to understand. Just as we gave simple sufficient conditions, defined using functional dependencies alone, that guarantee 4NF, we now describe sufficient conditions, again defined using functional dependencies alone, that guarantee PJ/NF. So far we have discussed two assumptions, (1) the relation schema is BCNF and (2) some key is simple, which together are enough to guarantee 4NF. However, as we show later, these assumptions are not enough to guarantee PJ/NF. In fact, we show that if we strengthen the second assumption by assuming that every key is simple (not just some), then we can even weaken the first assumption, assuming 3NF rather than BCNF, and still guarantee PJ/NF. Thus, we show that if (1) the relation schema is 3NF and (2) every key is simple, then the schema is PJ/NF. For those database designers who are most comfortable with 3NF, this gives them an additional condition (that every key is simple) which, when combined with 3NF, automatically guarantees PJ/NF.

This concludes the informal part of this paper. Those who are interested in the formal definitions and proofs can read on.

3. DEFINITIONS

We are given a fixed finite set $U$ of distinct symbols, called attributes, which represent column names of a relation. If $X$ and $Y$ are sets of attributes, then we may write $XY$ for $X \cup Y$. If $X = \{A_1, \dots, A_n\}$, then we may write $A_1 \dots A_n$ for $X$. In particular, we may write simply $A$ to represent the singleton set $\{A\}$.

Let $T$ be a set of attributes (that is, a subset of $U$). A $T$-tuple (or simply a tuple, if $T$ is understood) is a function with domain $T$. Thus, a $T$-tuple is a mapping that associates a value with each attribute in $T$. A relation (over $T$) is a set of $T$-tuples. If $s$ is a $U$-tuple, then $s[T]$ denotes the $T$-tuple obtained by restricting the mapping $s$ to $T$. Assume that $R_1, \dots, R_n$ are relations over attribute sets $T_1, \dots, T_n$, respectively. The join of the relations $R_1, \dots, R_n$, which is written $\bowtie\{R_1, \dots, R_n\}$, is the set of all tuples $s$ over the attribute set $T_1 \cup \dots \cup T_n$ such that $s[T_i]$ is in $R_i$ for each $i$. (Our notation exploits the fact that the join is associative and commutative.)

A functional dependency (FD) [5] is a statement of the form $X \to Y$, where $X$ and $Y$ are sets of attributes. The FD $X \to Y$ is said to hold for a relation $R$ if every pair of tuples of $R$ that agrees on the attributes of $X$ also agrees on the attributes of $Y$. That is, whenever $s$ and $t$ are tuples of $R$ where $s[X] = t[X]$, then $s[Y] = t[Y]$.

A multivalued dependency (MVD) [8] is a statement of the form $X \twoheadrightarrow Y$, where $X$ and $Y$ are sets of attributes. Let $Z = U - XY$. The MVD $X \twoheadrightarrow Y$ is said to hold for a relation $R$ if whenever $s$ and $t$ are tuples of $R$ where $s[X] = t[X]$, then there is a tuple $u$ of $R$ where $u[XY] = s[XY]$ and $u[Z] = t[Z]$. Later, we shall make use of the simple fact [2] that the MVD $X \twoheadrightarrow Y$ is equivalent to the MVD $X \twoheadrightarrow (Y - X)$. In this way, we can replace an MVD by an equivalent MVD where the left-hand side and right-hand side are disjoint.

A join dependency (JD) [10] is a statement of the form $\bowtie\{X_1, \dots, X_n\}$, where $X_1, \dots, X_n$ are sets of attributes. The JD $\bowtie\{X_1, \dots, X_n\}$ is said to hold for a relation $R$ if $R = \bowtie\{R[X_1], \dots, R[X_n]\}$. We shall later make use of the simple fact [8] that the JD $\bowtie\{X_1, X_2\}$ is equivalent to the MVD $X_1 \cap X_2 \twoheadrightarrow X_2$.

If $\Sigma$ is a set of dependencies and $\sigma$ is a single dependency, then $\Sigma$ logically implies $\sigma$ (or $\sigma$ is a logical consequence of $\Sigma$) if every relation that satisfies $\Sigma$ also satisfies $\sigma$. Thus, $\Sigma$ logically implies $\sigma$ if and only if there is no "counterexample relation" that satisfies $\Sigma$ but not $\sigma$. As an example, the set $\{A \to B, B \to C\}$ of FD's logically implies the FD $A \to C$ (this is transitivity). A dependency is trivial if it is valid, that is, a logical consequence of the empty set. For example, the FD $A \to A$ is trivial, since every relation where $A$ is one of the attributes satisfies $A \to A$. It is straightforward to verify that an FD $X \to Y$ is trivial if and only if $Y \subseteq X$; an MVD $X \twoheadrightarrow Y$ is trivial if and only if either $Y \subseteq X$ or $X \cup Y = U$; and a JD $\bowtie\{X_1, \dots, X_n\}$ is trivial if and only if $X_i = U$ for some $i$.

A relation schema is a pair $(U, \Sigma)$, where $U$ is a set of attributes and $\Sigma$ is a set of dependencies involving only these attributes. Thus, a relation schema describes the attributes, along with the dependencies, where they are thought of as constraints. A dependency $\sigma$ is a dependency of the schema $(U, \Sigma)$ if $\sigma$ involves only attributes in $U$, and if $\sigma$ is a logical consequence of $\Sigma$.

A key (sometimes called candidate key) of a schema is a set $K$ of attributes such that (a) the FD $K \to U$ is an FD of the schema, and (b) if $K'$ is a proper subset of $K$, then $K' \to U$ is not an FD of the schema. A superkey is a superset of a key. Thus, $K$ is a superkey precisely if the FD $K \to U$ is an FD of the schema. An FD $K \to U$, where $K$ is a key of the schema, is called a key dependency (KD).

We now define the various normal forms we shall consider in this paper, in increasing order of strength. Third normal form (3NF) was originally defined by Codd [5]. In this paper, we shall use another definition, Zaniolo's [11], that is equivalent to Codd's, but which is easier to use for our purposes. To define third normal form, we need to make a definition. In a relation schema, an attribute is called a prime attribute (or key attribute) if it is contained in some key. Otherwise, it is a nonkey attribute. A relation schema is in third normal form if whenever $X \to A$ is a nontrivial FD of the schema, where $A$ is a single attribute, then either $X$ is a superkey or $A$ is a key attribute.

A relation schema is in Boyce-Codd normal form (BCNF) [6] if whenever $X \to Y$ is a nontrivial FD of the schema, necessarily $X$ is a superkey. It is shown by Fagin [9] that a relation schema is BCNF if and only if every FD of the schema is a logical consequence of the set of key dependencies of the schema.

A relation schema is in fourth normal form (4NF) [8] if whenever $X \twoheadrightarrow Y$ is a nontrivial MVD of the schema, necessarily $X$ is a superkey. It is shown by Fagin [9] that a relation schema is 4NF if and only if every MVD of the schema is a logical consequence of the set of key dependencies of the schema.

A relation schema is in projection-join normal form (PJ/NF) [9], sometimes called fifth normal form (5NF), if every JD of the schema is a logical consequence of the set of key dependencies of the schema.
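The definitions of key, 3NF, and BCNF above can be turned into a small checker. The following Python sketch is an illustration only, under a loudly stated assumption: the schema's dependency set is taken to consist of FD's only, so that the FD's of the schema are exactly those obtained by attribute closure. All function names are hypothetical, and the search over attribute subsets is brute force (exponential), so it is suitable only for small schemas. The last function reports whether the two sufficient conditions studied in this paper apply.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes determined by attrs, given fds as (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def keys(universe, fds):
    """All minimal nonempty attribute sets K with K -> U (brute force)."""
    found = []
    for size in range(1, len(universe) + 1):
        for cand in combinations(sorted(universe), size):
            cand = frozenset(cand)
            if any(k <= cand for k in found):
                continue  # contains a smaller key, so not minimal
            if closure(cand, fds) == set(universe):
                found.append(cand)
    return found


def normal_form_flags(universe, fds):
    """Return (keys, is_3nf, is_bcnf), assuming the schema has FD's only."""
    ks = keys(universe, fds)
    prime = set().union(*ks) if ks else set()
    is_3nf = is_bcnf = True
    items = sorted(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            x = frozenset(combo)
            extra = closure(x, fds) - x  # A with X -> A nontrivial
            if not extra or any(k <= x for k in ks):
                continue  # no nontrivial FD from X, or X is a superkey
            is_bcnf = False
            if not extra <= prime:
                is_3nf = False
    return ks, is_3nf, is_bcnf


def sufficient_conditions(universe, fds):
    """Which of the paper's two sufficient conditions hold for the schema."""
    ks, is_3nf, is_bcnf = normal_form_flags(universe, fds)
    every_simple = all(len(k) == 1 for k in ks)
    some_simple = any(len(k) == 1 for k in ks)
    return {
        "guarantees_pjnf": is_3nf and every_simple,   # 3NF + every key simple
        "guarantees_4nf": is_bcnf and some_simple,    # BCNF + some key simple
    }


# Hypothetical example: U = ABC with A -> BC and B -> A; keys are A and B.
example_fds = [(frozenset("A"), frozenset("BC")), (frozenset("B"), frozenset("A"))]
example_result = sufficient_conditions(set("ABC"), example_fds)
```

For the example schema both flags come out true, since both keys are simple and every nontrivial FD has a superkey on the left. The flags are sufficient conditions only: a false value does not mean the schema fails to be 4NF or PJ/NF.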

4. ASSUMING EVERY KEY IS SIMPLE

In this section, we show that in a wide class of situations, a simple condition, involving only functional dependencies, guarantees PJ/NF. Specifically, we show:

THEOREM 4.1. Assume that a relation schema is 3NF, and that every key is simple. Then the relation schema is PJ/NF.

To prove this theorem, there are two steps. We first show that under the hypotheses of the theorem, the relation schema is BCNF (this is very easy to show). We then prove the harder result that if a relation schema is BCNF and every key is simple, then it is PJ/NF. The next lemma is the first step in the proof of Theorem 4.1.

LEMMA 4.2. Assume that a relation schema is 3NF, and that every key is simple. Then the relation schema is BCNF.

PROOF. Assume that the schema is 3NF and that every key is simple. Let $X \to A$ be a nontrivial FD of the schema, where $A$ is a single attribute. To show that the schema is BCNF, it is sufficient to show that $X$ is a superkey. Since the schema is 3NF, either $X$ is a superkey, or $A$ is a key attribute. If $A$ is a key attribute, then $A$ is a key itself, since by assumption, every key is simple. Since $X \to A$ is an FD of the schema, it follows that $X$ is a superkey. So in any case, $X$ is a superkey. This was to be shown. $\Box$

If $\bowtie\{X_1, \dots, X_n\}$ is a JD, then we refer to the $X_i$'s as components of the JD. We shall make use of the following simple lemma, whose proof is straightforward.

LEMMA 4.3. [3] Let $J_1$ be a JD, and let $J_2$ be a JD obtained from $J_1$ by replacing two components of $J_1$ by their union. Then $J_1$ logically implies $J_2$.

As an example of Lemma 4.3, the JD $\bowtie\{ABC, AD, BC, CE\}$ logically implies the JD $\bowtie\{ABC, AD, BCE\}$, where we replace the components $BC$ and $CE$ by their union $BCE$.

We shall also make use of the following Membership Algorithm for determining whether a given JD is a logical consequence of a set of KD's. The correctness of the algorithm is stated in Fagin [9]; it also follows from the results of Aho et al. [1].

The input is a JD $\bowtie\{X_1, \dots, X_n\}$ and a set $\{K_1 \to U, \dots, K_s \to U\}$ of KD's. Initialize the set $\mathcal{Y}$ as $\{X_1, \dots, X_n\}$. Apply the following rule until it can no longer be applied: if $K_i \subseteq Y \cap Z$ for some key $K_i$ ($1 \le i \le s$) and for some members $Y$ and $Z$ of $\mathcal{Y}$, then replace $Y$ and $Z$ in $\mathcal{Y}$ by their union, that is, remove the sets $Y$ and $Z$ from $\mathcal{Y}$ and add to $\mathcal{Y}$ the single member $Y \cup Z$. (In particular, the number of members of $\mathcal{Y}$ then decreases by one.) Let $\{Y_1, \dots, Y_r\}$ be the final result. Then $\bowtie\{X_1, \dots, X_n\}$ is a logical consequence of the set $\{K_1 \to U, \dots, K_s \to U\}$ of KD's precisely if some $Y_i$ equals the set $U$ of all attributes.

Example. Assume that the attributes are $U = \{A, B, C, D\}$, and that the input consists of the JD $\bowtie\{AB, AD, BC\}$ and the KD's $A \to U$ and $B \to U$. We now show, by using the Membership Algorithm, that the JD is a logical consequence of the KD's. Initialize $\mathcal{Y}$ as $\{AB, AD, BC\}$. Since $A \to U$ is a KD and since $AB$ and $AD$ are in $\mathcal{Y}$, we replace $AB$ and $AD$ in $\mathcal{Y}$ by $ABD$. At this stage, $\mathcal{Y}$ is $\{ABD, BC\}$. Since $B \to U$ is a KD and since $ABD$ and $BC$ are in $\mathcal{Y}$, we replace $ABD$ and $BC$ by $ABCD$. We are left with $\mathcal{Y} = \{ABCD\} = \{U\}$. So the Membership Algorithm tells us that, indeed, the JD $\bowtie\{AB, AD, BC\}$ is a logical consequence of the set $\{A \to U, B \to U\}$ of KD's.
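The Membership Algorithm described above is short enough to sketch in code. The following Python version is a minimal illustration, not the authors' implementation: the function name `jd_follows_from_keys` and the representation of attribute sets as strings of single-letter attributes are assumptions made for this example. It repeatedly merges two components whose intersection contains a key, and reports whether some final component equals $U$.

```python
def jd_follows_from_keys(components, keys, universe):
    """Membership Algorithm sketch: is the JD a consequence of the KD's?

    components: the JD's components, each an iterable of attributes
    keys:       the left-hand sides K of the key dependencies K -> U
    universe:   the set U of all attributes
    Returns (answer, final_components).
    """
    comps = [frozenset(c) for c in components]
    keys = [frozenset(k) for k in keys]
    universe = frozenset(universe)

    merged = True
    while merged:
        merged = False
        for i in range(len(comps)):
            for j in range(i + 1, len(comps)):
                common = comps[i] & comps[j]
                # The rule applies if some key lies inside Y intersect Z.
                if any(k <= common for k in keys):
                    union = comps[i] | comps[j]
                    comps = [c for n, c in enumerate(comps) if n not in (i, j)]
                    comps.append(union)  # replace Y and Z by their union
                    merged = True
                    break
            if merged:
                break

    return any(c == universe for c in comps), comps


# The example from the text: U = ABCD, JD {AB, AD, BC}, keys A and B.
implied, final = jd_follows_from_keys(["AB", "AD", "BC"], ["A", "B"], "ABCD")

# With only the key A, the rule stops at {BC, ABD}, so the answer is negative.
implied_a_only, final_a_only = jd_follows_from_keys(["AB", "AD", "BC"], ["A"], "ABCD")
```

For the example in the text, the loop merges $AB$ and $AD$ into $ABD$, then $ABD$ and $BC$ into $ABCD$, so `implied` is true and `final` is the single component $ABCD$. Each application of the rule reduces the number of components by one, so the loop runs at most $n - 1$ times.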



The next lemma is the second (and final) step in the proof of Theorem 4.1.

LEMMA 4.4. Assume that a relation schema is BCNF, and every key is simple. Then the relation schema is PJ/NF.

PROOF. Assume that the relation schema is BCNF, and every key is simple. We wish to show that the schema is PJ/NF. Assume not. Then there is a JD $\bowtie\{X_1, \dots, X_n\}$ of the schema that is not a logical consequence of the KD's. Let $\{Y_1, \dots, Y_r\}$ be the final result after applying the Membership Algorithm with the JD $\bowtie\{X_1, \dots, X_n\}$ and the KD's of the schema as the input. Since, by assumption, $\bowtie\{X_1, \dots, X_n\}$ is not a logical consequence of the KD's, it follows from the Membership Algorithm that none of the $Y_i$'s equals the set $U$ of all attributes. Note also that it follows from repeated applications of Lemma 4.3 that the JD $\bowtie\{X_1, \dots, X_n\}$ logically implies the JD $\bowtie\{Y_1, \dots, Y_r\}$.

By assumption, every key is simple. Since every relation schema has at least one key, it follows in particular that there is some simple key. Let $A$ be an attribute that is a simple key. Then $A$ is in exactly one $Y_i$. This is because $A$ is certainly in at least one $Y_i$, since the union of the $Y_i$'s is the set $U$ of all attributes. Further, $A$ cannot be a member of both $Y_i$ and $Y_j$, where $i \ne j$, because the procedure would have replaced $Y_i$ and $Y_j$ by $Y_i \cup Y_j$, since $A$ is a key. Without loss of generality, let $Y_1$ be the $Y_i$ that contains $A$. Define $W = Y_2 \cup \dots \cup Y_r$ to be the union of all of the $Y_i$'s except for $Y_1$; there is at least one such $Y_i$, since as we noted above, none of the $Y_i$'s equals the set $U$ of all attributes, and in particular $Y_1 \ne U$. By Lemma 4.3 applied repeatedly, it follows that the JD $\bowtie\{Y_1, W\}$ is a logical consequence of $\bowtie\{X_1, \dots, X_n\}$. As we noted earlier, the JD $\bowtie\{Y_1, W\}$ is equivalent to the MVD $Y_1 \cap W \twoheadrightarrow W$. Therefore, the MVD $Y_1 \cap W \twoheadrightarrow W$ is an MVD of the schema.

We now make use of the following inference rule for functional and multivalued dependencies, which is a special case of the inference rule FD-MVD2 in the complete axiomatization of Beeri et al. [2] for FD's and MVD's.

(*) If $X \twoheadrightarrow Z$ and $Y \to Z$, where $Y$ and $Z$ are disjoint, then $X \to Z$.

Let $X$ be $Y_1 \cap W$, let $Y$ be $A$, and let $Z$ be $W$. We now show that we can apply the inference rule (*) to derive an FD of the schema.

(1) $X \twoheadrightarrow Z$ is an MVD of the schema: Here $X \twoheadrightarrow Z$ is $Y_1 \cap W \twoheadrightarrow W$, which we showed above is an MVD of the schema.

(2) $Y \to Z$ is an FD of the schema: Here $Y \to Z$ is $A \to W$, which is an FD of the schema, since $A$ is a key.

(3) $Y$ and $Z$ are disjoint: Here $Y$ and $Z$ are $A$ and $W$, respectively. We saw above that $A$ is in precisely one of the $Y_i$'s, namely $Y_1$. Since $W = Y_2 \cup \dots \cup Y_r$ (the union of all of the $Y_i$'s except for $Y_1$), it follows that $A \notin W$. Thus $A$ and $W$ are indeed disjoint.

It follows from inference rule (*) that $X \to Z$, that is, $Y_1 \cap W \to W$, is an FD of the schema. We now show that the FD $Y_1 \cap W \to W$ is nontrivial, that is, $W$ is not a subset of $Y_1 \cap W$. If $W$ were a subset of $Y_1 \cap W$, then $W$ would be a subset of $Y_1$. Since $Y_1 \cup W = Y_1 \cup Y_2 \cup \dots \cup Y_r = U$, it would follow that $Y_1 = U$, which we have shown is not the case. So indeed, the FD $Y_1 \cap W \to W$ is nontrivial. Since it is an FD of the schema, and since the schema is BCNF, it follows that $Y_1 \cap W$ is a superkey. Since by assumption, every key is simple, it follows that $Y_1 \cap W$ contains some simple key $B$. Since $B \in Y_1 \cap W \subseteq W = Y_2 \cup \dots \cup Y_r$, it follows that $B \in Y_j$ for some $j$ with $2 \le j \le r$; also, $B \in Y_1 \cap W \subseteq Y_1$. Thus, $B$ is a member of both $Y_1$ and $Y_j$. But this is impossible, because the procedure would have replaced $Y_1$ and $Y_j$ by $Y_1 \cup Y_j$, since $B$ is a key. $\Box$

Theorem 4.1 now follows from Lemmas 4.2 and 4.4.

5. ASSUMING SOME KEY IS SIMPLE

The assumption that every key is simple is fairly strong. In this section, we show that we can sometimes get by with the weaker assumption that some key is simple. In particular, we show that every BCNF relation schema where some key is simple is 4NF. However, we show by a counterexample that there is a BCNF relation schema where some key is simple that is not PJ/NF.

In order to prove that every BCNF relation schema where some key is simple is 4NF, we make use of the following lemma, which gives us information about the structure of a relation schema that is BCNF but not 4NF. Recall that if a relation schema is not 4NF, then there is a nontrivial MVD $V \twoheadrightarrow W$ of the schema where $V$ is not a superkey.

LEMMA 5.1. Assume that a relation schema is BCNF but not 4NF. Let $V \twoheadrightarrow W$ be a nontrivial MVD of the schema where $V$ is not a superkey. Let $W'$ be the set of attributes not in $V$ or $W$. Then every key contains both a member of $W$ and a member of $W'$.

PROOF. As we noted earlier, we can assume without loss of generality that $V$ and $W$ are disjoint (by replacing $W$ by $W - V$ if necessary). Thus, every attribute is in exactly one of $V$, $W$, or $W'$. If the assumptions hold but not the conclusion, then let $K$ be a key of the schema