Some Classes of Multilevel Relational Structures - Indiana CS

0 downloads 0 Views 844KB Size Report
result [FYI]?. This involved studying weak MVDs, which were first defined in [JS] and generalized in [Tho]. We then obtained the "top down" version of this result,.
SOME CLASSES OF MULTILEVEL RELATIONAL STRUCTURES (Extended

Abstract)

Dirk Van Gucht University at Bloomington

Indiana

Patrick C. Fischer Vanderbilt University

1.

Introduction. The

multilevel, relational

original

proposed

by Codd permitted

be entries

in

complex

a component

of

values

structures

only

to

first

normal

More

to data

for

information

too

came from

restrictive

[Mak].

While

his

he showed that could

be

also

began the study

FDs

and

as

(FD)

if

restriction

could

database later

[SP],

[KTT],

Ozsoyoglu,

Macleod Appelrath

faithfully

[Mac],

et.

[BRS],

to unnozmalized lished

only

over

single

whether

and

from 1NF

results. model

to

result, time

Pw.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.

structure, which

nesting

ACM-O-89791-179~2/86/0300-0060

$00.75

60

were

version

of to

Characterizations properties

able

to

the

(in

problems

Thus we can determine

a

given

some order underlying

[FV3]? able

and solve above

provide

the problem: exist.

structure

the

in

any order

transform we are

restriction

in

does there

will

paper,

of

unnested

for

to this

three

unnormalized determine

of MVDs and weak MVDs

In

this

down" version

set-theoretic

relation

one-level case.

6 1986

we

time algorithm

of

all

of

families

a one-level flat

the

the

weak MVDs,

an

one

structure?

terms of

Later,

polynomial

given

whether

of

and generalized

can be renested

in

terms

[JS]

sets

affects

the "top

can

the original

and in

allow

i.e.,

polynomial

were given

estabThomas

over

studying

in

the

in polynomial

of nesting

defined

structure,

Gucht

to solve

only

involved

of

obtain

Van

They on

and a one-level

permitted

We then obtained

structure

is is

and

relation

in [Tho]. this

relation

[FsTv].

order

one-level

no

unnesting

paper we were able a flat

This

Thomas and in [JS]

of nesting

Saxton

effort

the

Scholl

LIFO order.

can one determine

[FYI]?

this

considered they

later

which were first

al.

NEST and

relations While they

the

time

result

[CT].

the

attributes,

some were

(INF)

even if

of the effects

scheme (nests attributes),

results

than strict

to this given

nesting

[OO,OMO],

Freitag

and Tansel

algebraic

generalized

[KV],

introduced

to restructure

some important Fischer

and Matos

and Clifford

form [JS].

et.

MVDs;

basic

INF

Schek and

Kambayashi

Ozsoyoglu

and Schek

UNNEST operators nesting

al.

Kuper and Vardi

[FA],

model suggestions

in other

problem:

to consider the

nesting,

In a previous

(MVD)

Similar

Gonnet.[Gon],

Jaeschke

and

more

made by Bancilhon

Pistor

by multilevel

contributed

functional

relaxing

Thus,

applications.

1977

dependency set-valued

a

one were willing

relations.

in

informal,

and

model and showed that by a flat

1NF was

that

was fairly

a multivalued

unnormalized

carried

lost

Makinouchi

treatment

treated

dependency

suggestions

[FT,Tho,TF].

Schek

many of the

a more generalized

done later

One of the earliest

and

model [Sch,SS].

established

(1NF) has

gained wide acceptance.

nesting

Schek,

a similar

Fischer

model with form

multiattribute

recently,

described

[Cod].

atomic

The relational

be permitted. restriction

model

a tuple

Codd recommended that

However, this

database

to

remove

reverse for

the

in polynomial

the order)

general time:

1) whether a given multilevel structure is a (reachable by some nesting nested relation sequence): 2) whether a given multilevel structure is a permutable nested relation (reachable from its unneated form by any reneating sequence); and 3) given a flat relation and an arbitrary scheme, whether or not every nesting sequence for that scheme will produce the same instance when applied to the relation. We also show that the class of permutable nested relations is the largest subclass of the nested relations which is closed under unneating. class to the Finally, we compare this relations of Delobel hierarchically-organized [Del] and Lien [Lie] and to the hierarchical structures studied by Bancilhon et. al. [BRS], Abiteboul and Bidoit CAB], Ozsoyoglu and Yuan [CC], and Roth et. al. [RIGS].

S, we will uae the root node R(S); hence, C(S) means C(R(S)), etc. Finally, if Y is a set of nodes, we will label a node Y* only if C(Y*) = Y. A structure consists of a scheme S and an instance of the root R(S), where an instance is defined recursively as follows: 1. an instance of a leaf node A is a single atomic value from dam(A), the set of possible values of attribute A; 2. an instance of an interior node M of S is a finite set of ‘tuplea, where the number of components of a tuple is IC(M)l, and for each N 6 C(M), a tuple has an entry which is an instance of N, i.e., an atomic value from dam(N) if N 6 A(M) and a set of tuples over the children of N if N 6 H(M). If S is a flat scheme, a structure over S is called a flat relation or simply a relation. Example 1. Consider a database of class schedules for s given term with attributes C (Course), S (Section), T (Teacher), D (Day), H (Hour) and R (Room). This information is often printed out as a three-level nested structure (although settheoretic braces are usually omitted). We give a scheme S, and an instance 6, below. The "header" for the instance s, corresponds to the leaf nodes of the scheme S1'

2.

Basic Concepts. Our formalism for schemes is very similar to the normal form of Hull and Yap [HY]. However, we always do set formation (collection) whenever we do aggregation (composition), and we do not use Thus, our generalization (classification). schemes need not have different kinds of non-leaf nodes. Let U denote the universe of attributes. A scheme S is a directed rooted tree for which the leaf nodes are distinct elements of U. The root node is denoted by R(S). The set of leaf nodes (attributes) of S is denoted by at@), and the set of non-leaf (interior) nodes of S by int(S). The set of nodes which define proper subschemes of S is called sub(S); hence, sub(S) = int(S) The set of nodes which have only lesf {R(S)). nodes as children is called the set fin(S) of frontier nodes of S. If R(S) 6 fin(S) we will call S a --flat scheme. We extend the notions att, int, sub, and fin to subschemes of 6, e.g., given a node M 6 int(S), att(M) means att(T(M)) where T(M) is the subtree consisting of M and its descendants. For a node M 6 int(S), the set of children of M, denoted by C(M), ia partitioned into attributes A(M) = C(M)n att(S) and higher order node8 H(M) = If we apply C, A or H to a scheme C(M) fl int(S).

A COURSE-INFO

R Scheme S,

C

S

T

D

Jones Smith

1; {{M,W,F)

Instance 6,

61

H

1

R

An alternative way of labelling the nodes of Then S, would be to use the notation in [AB]. DAYSwould be I)*, SCHEDULE would be (D*HR)*, etc. We now define the NEST and UNNESToperators. First, let t be a tuple in a structure with scheme S and let X C C(S). The X-value of t, denoted t[X], is the restriction of the tuple t to the component8 associated with X. Now let 8 be a structure over scheme S and let Y C C(S). Then NESTy(a) is a new structure 8' with scheme S', where S' is obtained by inserting a new child Y* of R(S) into S and making all of the member8of Y * Hence C(S') = children of Y in S'. A tuple v 6 8' if and only if (c(s)-Y) u e*1. there exists a tuple t 6 8 such that: 1. v[C(S)-Y] = t[C(S)-Y] and 2.

v[Y*]

NESTY(NEST, . . . (NESTI (r)). . .). n R-l 1 Since the UNNETST operator COX=lUteS extension deal8 with [J%FT,TFl, its natural unnesting over a set of objects. Let S be a scheme with Q(S)Csub(S). Ne say that Q(S) is an unneating -8et of S if and only if whenever a node M 6 sub(S) is in Q(S), then all the ancestors of M in sub(S) are also in Q(S). Thus, we may not unnest an object unless highezclevel nestinga over that object have been unnested. Now let 8 be a structure over scheme S and Q(S) be an unnesting set of s. Then UNNESTQ(S) (8) 18 the structure UNNESTM(UNNESTM (...(UNNESTM (8))...)), where 1 k k-l the sequence (Ml,..., Mk) satisfies the Conditions: 1. {M1,...,Hk} = Q(S) and 2. for each pair of indices i and j, 1 5 i < j 5 k, either Mj is a descendant of Mi in S or Mi and M are incomparable 3 node8 in S. Example 2. The flat relation a88OCiated with the structure a, in Example 1 above is given a8 the scheme S2 below. One can structure s 2 with readily observe that a2 can be obtained from 8, using the unneating set sub@,) = {(ST(D*HR)*)*, (D*HR)*, D*}. Conversely, the nesting sequence (D, D*HR, ST(D*HR)*) will transform a2 into 8,. the etructure

=

nITy({t' 6 8 1 t'[c(s)-Y] = t[c(s)-Y]}). Thus, the tuples of 8 which agree outside of Y are merged into a eingle tuple of 8' with a set-valued entry in the Y* position. Let s be a structure with scheme S and let M 6 H(S). Then UNNESTM(8)is a structure 8" with scheme S", where Sv is obtained by deleting M from S and connecting the children of M to R(S), i.e.; C(M). A tuple t 6 8" if and C(Snt) = (C(S)-{M})U only if there exists a tuple v 6 8 such that: 1. t[C(S)-{M}] = v[C(S)-{M}] and 2.

t[C(M)]

6 v[M].

We now extend the definition of NEST to describe a sequence of NEST operations. Let S be a scheme. A sequence P(S) = (Y,,...,Y,) of sets of node8 of S ie called a nesting sequence for S if and only if the sequence Is a valid order of neeting to reach the scheme S, i.e.: 1. for each i, 1 5 1' n, Yi = C(Yr) for 8omeY; 6 sub(S); 2. 3.

i. Scheme S2

{Y;,...,Y;} = sub(S); for each pair of indices i and j, 1 5 i < j 5 n, either Y; is an ancestor

C

of Y: in S or Y: and Y* are incomparable 3 node8 in S, i.e., neither of them ia a descendant of the other. Let r be a relation over att(S) and let P(S) = ,,...,Yn) be a nesting eequence for S. The 0 8trUCtUl.e NESTP(s)(r) over scheme S is defined as

S

T

D M

CSIOO CSIOO CSlOO

01 01 01

Jones Jones Jones

CSlOO CSIOO CSIOO

02 02 02

Smith Smith Smith Instance

62

w F M

w F s2

H 10 10 2 1 1 1

R EN239 EN239 EN201 EN203 EN203 EN203

As in [Tho]

we extend

and MVDs in the natural are

non-atomic

are

We similarly

sets.

multivalued Let if

way:

entries

they

are

if

extend

the definition

whenever s contains

We begin

of weak in

[FYI].

Z

t3

x

Y’ Y Y

2 z 2’

The Classes Between

there

Thus, so = UNNESTsub(s)(*) and So is a scheme with C(So)= att(S). Then:

flat

s = NESTp(s)(~o) si satisfies

2'

for

normalization-lossless that

unnested

from this

relation

structures

is to

using

only

studied

in

This

[Tho,TF];

Our classes

will

all

we shall

denote

it

obtained

relation

from

operations,

i.e.,

using

by NL.

9 *

a is

scheme S if

and only

if

over att(S)

and a nesting

that

a nested there

legal

relation

exists

a relation

sequence P(S) for S such

relations

relations at

can be used at various

the

conceptual

better,

level

more readable

output

levels

to model

at the user interface

store

by NR.

nested

relation

a relation

a structure

r over att(S)

sequence P(S)

for

the

permutable

class

of

s is

such that

S, NESTP(S)(r)

nested

PNR is contained We now give given

for

structure

Furthermore,

relations

there

the

by PNR.

P(S)

izing

induction

satisfieskthe 1

. . . . To be

Q(Sk)

if FD

5 i'

k.

this

gives:

= NESTQ(S )(s,)

if

This

we have:

if

(C(Ti)-YI)

--> Y;

and only

if

FD (C(Si)-YI)

--> r;

k.

in

[TF]

(cf.

also

[FSTV])

character-

= 8 we have:

=

these two equivalences

s = NESTp(S) (s o ) if

Obviously,

we may

hence

and only

when NESTy(UNNESTy*(s))

Combining

of all

= = SJ for

tJ = sJ and T

NESTy (sk) if and only if k+l sk+, satisfies the FD (C(Sk+,)-Yl+,)

over

to t and

sequence

hypothesis;

NESTQ(T )(t,)

'k+l

in NR.

completes

and only

the

induction

-4

Yi+,.

yields: if

the k+l FDs hold. step;

hence

the

some of

I the

theorem is established.

characterizations belongs

to

NR(S)

we show that

the

classes

=

sequence for S. Then

3 Tk has k subschemes,

Since

By a result

exists

We denote

class

and

over S such let

the same definitions

si satisfieskthe

to

for any nesting relations

and

nesting Clearly,

the

'k

agermutable

= s.

nested

scheme S by PNR(S) and denote permutable

by applying

for 1 5 1'

over the scheme S if

k+l

to and the schemes Tk, Tk,,,

9

Restating

level

relation

Let s be a structure

truncated

for

in a DBMS,

to provide

s is a flat true.

) be a nesting



ti

data more efficiently. We say that

l

tk=

Nested

and at the physical

f

apply

applications

level

then

=

,‘k”k+,

(Y, 1,.. . J,). O 2 for

t,,

t2 6 s,

all

M

6

= t2[Z].

simply

W

.

an ordinary

l?D.

a GFD becomes a "SFD" as

in [FV3].

relations

64

Let v,,

We say that

ovp t,[N].

any two tuples

t, [VI = t,Cvl

To get an intuitive

s : 3

to

Then s satisfies

When W = @, a GFD is

PNR

~,[A(M)]

N 6 H(M), t,[N]

functional

ovp t2[M],

t,[M]

nested

t, 6 v,[M]

such that

s be a structure

and only

whenever

size

that

exist

C(S) and W CH(S).

generalized

data structures.

relations.

on M, denoted v,[M]

v, [Ml fl v2[Ml # #, or

V,Z f

FD.

M 6 H(S).

s

we need

either:

and for all

however,

in the

of a generalized

of s and let

t2 6 v2[n]

s

PNR(S)

2. with

polynomial

One can,

polynomial

appropriate consider

may

of s.

space complexity now

claim

when a structure

to

1.

relation

of the structure

relation

the size

time

unnested

We cannot to the

unnested in

of s by choosing We

s.

respect

in polynomial

of the fully

belongs

v, and v2 overlap and only if

else s is not in MI(S)

to the size

to characterize

the notion

v2 be tuples

a subscheme of S'

respect

order

scheme S

introduce

s 6 NR(S')

where S' is the same scheme as S but with M interpreted

S;

the FD

ITNNESTsub(S)-int(M) (att(S)-att(M)) --> M;

-end NR Algorithm

a member

PI

Let s be a structure

-if fin(S) = {R(S)}, then s is a relation, find

not

of PNR(S3).

position

occurs.

recognition

The reader

s3

I Theorem 1 and Lemma 1 can be used to produce

polynomial-time

s3

s3 = NEST(PmEm ,CHILD) ( "; 1, which 6 NR(S3). However, s3 f

that

NEST(,.HILD,PARENT)(~);

sequences

k is the earliest

to show that

Let

a

Instance

details.

since

{c,,c2)

IP,'P3)

Scheme S 3

first

where

unnested

the NSSTM operation

FD by

as its

of the nesting

which produce s from r, can then be invoked

the

there C(M)

NESTp(s)(r>

One assumes that

in which

fin(S),

then

UNNFSTsub(S)-int(H)(s)' nesting sequence P(S)

time

{P,tP*)

also:

Lemma 1.

a

CHILD

view of permutable

and GFDs, consider

the following

nested' example

(somewhat simplified) of zaniolo and Melkanoff [ZM] (also studied in 1~~1). s4 over Example 4. Suppose there is a dictionary scheme S of corresponding technical terms in two different4 languages, French (F) and English (E). Assume that an object might be described by more than one word in each language. No word, however, describes more than one object. I

0 DICTIONARY F F*

E*

F 6)

E

In the next theorem we can characterize the class PNR(S) in terms of satisfaction of FDs in a partially unnested version of that structure. Furthermore, we see the correspondence between satisfaction of these FDs and satisfaction of GFDs in the original structure. Theorem 2. Let s be a structure over scheme S. Then the following statements are equivalent: 8 6 PNR(S); 1. for all M 6 sub(S), 2. UNNESTsub(S)-int(M) (8) satisfies the FD (att(S)-att(M)) --> A; 3. for all A 6 H(S), a. s satisfies the GFD A(S)Ui(S)-M> --> M and b. for all tuples t 6 8, t[M] 6 PNR(T(M)). The implication (1) => (2) Sketch of Proof. follows directly from Theorem 1 by choosing, for each M 6 sub(S), a nesting sequence which nests up The to M before performing any other nests. implication (2) => (1) involves consideration of the effect of unnesting on FDs (cf. [FSTV]). The proof of the equivalende of (2) and (3) is quite technical and is omitted. I Part 3 of Theorem 2 permits the construction of an algorithm to check membership in PNR(S) by However, working only with the given structure. Part 2 immediately leads to an algorithm which is easier to understand, i.e., for each M 6 sub(S) perform the appropriate unnesting and check for satisfaction of the related FD. Both algorithms run in polynomial time with respect to the size of the fully unnested relation associated with 8. The comments with respect to Algorithm NR also apply here. It is obvious that NL is closed under the UNNESToperator, .and Thomas showed that NR is not The latter fact may closed under UNNEST[The]. also be verified by considering the nested of Example 3 above. It can easily be relation 5 shown by using Theorem 1 that the SkuCtU~ The UNNESTpARENT*(s3)is not a nested relation.

if,} {f,,f,l tf,}

SchemeS4

Instance

E te, ,e,l te3,e41 Ie51 s4

Let si = UNNEST~,*,E*,'s4>. The reader can verify that s4 = NEST(F,E>'s;) = N=T(E,F+*;), and therefore s4 6 PNR(S4). F

E

fl

el

fl

e2

f2

"3 "4 $ e4 "5

f2

5 4 f4

Instance 8; If we compare the structures in Examples 3 the and 4 we notice a subtle difference. In "J9 value p, appears in the PARENT*component of the (3) and sec+ond (t,) tuples, i.e., first . t,[PARENT ](1 t2[PARENT ] f $i. On the other hand, in s4 the analogous set intersections exe empty. It turns out that a recursive extension of this intersection property, namely the definition of a GFD, is the crucial difference between nested relations and permutable nested relations. As can be verified, s4 satisfies the GFDs --> F* and --> E*. The structure s3, however, violates the GFD --> CHILD* although it satisfies the ordinary FD pARENT* --> CHILD* and the GFD --> PARENT+.

class PNR is dealt with in the next theorem.

65

The class

Theorem 3.

of the nested

PNR is the largest

relations

subclass

which is closed

4.

Since

under the

We first

show that

UNNEST. Suppose's is

the

scheme

UNNESTy*(s), need

to

and let

Q(S')

closed

S' be the scheme of s'. For

6 PNR(S').

for

any

and that

relations,

UNNESTy* to both = NFSTQ(S,)(r)

8’

sides

of the

Now suppose some class S be its

Induction

hierarchical

yields

B CNR is closed B CPNR.

use induction 0 5 Isub(

5 k, if

that

equivalent x,

8 over

k = 0, then s is a relation

regardless

of whether

Induction Y* 6 H(S)

UNNESTy*(s)

under unnesting, 6 B.

induction

hypothesis

show that

NESTY(UNNESTy*(s))

B CNR,

we know that

sequence

for

such that

S,

Since

exists

some P(S)

of nested

notion

the

of

the

HNEST. over

scheme S and let Then:

8' with

the MVD X ->-> thus

YIZ,

with

producing

a new

scheme S, then doing NESTy(s').

Let s5 be the flat

relation:

A

B

C

=

a

bl

c1

each

e

b2

“2

by the

6 PNR.

= s.

there

i.e.,

for

Therefore,

UNNESTy*(s)

to study

end s 6 PNR

s 6 B and assume !sub(S)!

B is closed

In order

as above,

s 6 B.

Let

Step:

Since

k+l .

Y,

flat

N'ESTy(~xy(~) 1x1 Xx,(*)), where Performing an HNEST operation is

to enforcing

Example 5.

If

=

the

and Xn Y = $.

Z defined

instance

8 6 B then

s 6 PNR. Basis:

Y-C(S)

the structure

in the context

nest operator

HNESTyzX(s) z = c(s)-XY.

on isuh(S)l.

For any structure

Hypothesis:

scheme S such that

Let s 6 B

for

scheme form a key for

introduce

s be s structure

X CA(S),

under

We need to prove

scheme.

We will

s 6 PNR(S).

structures

Let

UNNEST. We have to show that and let

equation

as desired.

that

or substructure. we

by several

A hierarchical

substructures,

its

are

of hierarchical

intensively

of the associated

hierarchical

Applying

applications

[AB,OY,RKS]).

of

structure

nesting

S and s 6 PNR(S).

data the class

has the property

each

attributes

nesting

S',

for

(cf.

structure

We

and Nesting.

has been studied

researchers

s = NESTy(NESTQ(S,)(r)), Y) is a where r = UNNESTsUb(S) (s), since (Q(S'), sequence

in nature,

structures

under

6 PNR, then s 6 PNR(S), where S Let Y* 6 H(S), let s' = s.

of

show s'

sequence

PNR is

Structures

many real-world

hierarchical

UNNEST operator. Proof.

Hierarchical

Instance

We now

6 5

8 6 B and The structure

a nesting

shown below:

= (Y,,...,Y,),

s = NESTp(S)(UNNESTsub(S)(s)).

s,j = HNESTBzA(s5) over scheme S; is

Then Y

must appear for 6

m

in P(S) since Y* 6 H(S), i.e., Y = Ym By Theorem 1 we know that some 1 5 m 5 n.

A

= UNNEST~y~,~~~,y&,~(s)

a

{b,,b2}

c,

a

fb, ,b2}

c2

(c(sm)-Y*)

It

->

follows

nesting

a result

on a subset

and the

fact

participate FD

where Sm is

r*,

from

r*.

as in the proof

ing,

we have

sequence

operation follows

on a set that

for

(regarding

hand side of a FD) Y* does not

again

using

of Theorem 1.

S must Y such

C Instance

Scheme S,j

8;

(hence

for each Y* 6 H(S), 6 PNR and s = NESTy(UNNESTy*(s)). nesting

FD

the scheme of sm.

nesting) that s satisfies From this we conclude

s = NESTy(UNNESTy*(s)), [TF]

the

[FSTV]

Y* 6 H(S)

in later -->

in

of the left

that

(c(s)-Y*)

satisfies

C

B

that

the that

We can properties

a lemma of

1.

Recapitulat-

s 6 PNR(S); hence 8 6 PNR.

about

the

of HNESTytX (6): For any X, HNESTyEX(6) and NESTy(s)

have

HNESTyzX(s)

2.

since

a NEST

Y* 6 H(S),

some observations

the same scheme.

UNNESTY*(s) Since any

end with

make

the MVD x ->->

satisfies Y

the FD

is enforced

it

nest is performed (cf. [Mak,FSTV]). 3. UNNESTy*(HNESTy:X(s)) = s if

I

s satisfies

66

the MVD X ->-> Y.

X -->

Y*

before

the

and only

if

In order

to define

need some additional and let

ancestors

define

K(M)

be the

P(S)

Theorem

be the

set

of

A(N)

for

4.

sequence

att(S),

over

then

for

We

2.

for all

over

nesting att(S)

S if

S, and r

if

3.

=

for

all

a.

s satisfies

b.

hierarchical exists

of a hierarchical

can be created

HNEST operations I.

HNEST

2.

by the

a

The

scheme S

6

relation:

HNEST ' - ' FATHER-HOBBY:FATHER Note that the third HNEST could be performed must precede

of

for

the

in

the

analogue

characterizes approach

for

PNR(S).

to

that

of

that

I the

along

the

hierarchical

at

be

also

the second.

in Theorem 4 states attributes

node to any node forms unnested

corresponds

to

structures

given

used

to

algorithm

to determine

Corollary

1.

for

in

[RKS].

a

polynomial-time

It

HNR is contained

immediately that

The

definition

may

membership in HNR(S).

induction

The fact

a key

structure.

the

construct

The class

Follows

easily

flat

corresponding

item

by a simple

in PNR.

from Theorems 2 and 4

on Isub(S the

containment

is

fails

The structure

example: is clearly a7

proper

I is

shown by the following

Example 7.

to be in HNR because

the FD

in PNR but A -->

B"

is

violated.

SON-HOBBY FATHER-HOBBY

SON

is

which

the

from the root

Proof.

FATHER

t 6 s,

theorem

similar

path third

3.

first

is

concatenation

SON-HOBBY:FATHER,SOH

but the

tuples

The second item

of three flat

proof

s by the

for FDs ), and

Theorem 2 and is omitted.

HNESTSON S H**FATHER

any time

the FD A(S) --> M

6 HNR(T(M)).

This

Discussion.

r over

structure.

application

to the appropriate

for all t[M]

such that

below is an example

the

M 6 H(S),

HNR(S) of Theorem 2, the

satisfies

(hence A(S) is a key for

there

s = HNESTP(s)(r)' The structure s6 over

M 6 sub(S),

union r&e

sequence P(S) for S and a re'lation

Example 6. It

and only

S.

FD K(M) --> M, where K(M) is defined as above;

HNESTYn:Xn(HNESTY~,:Xn-,...(HNESTY,.X,(r))...),

structure

scheme

are equivalent:

UNNESTsub(s)-int(M)(s)

all

HNESTp(S)(r)

where for 1 5 i ( n, X = K(Y;). i A structure s over scheme S is's

statements

over

s 6 HNR(S);

1.

use of a nesting

the

s be a structure

Let

Then the following

of

a sequence of HNEST operations,

is a nesting

a relation

we

M) of M in S.

union

Now we extend

sequence to permit if

ant(M)

(excluding

to

N 6 ant(M).

is

Let

structures

Let S be a scheme

notions.

M 6 int(S).

proper

i.e.,

hierarchical

Scheme S6

FATHER 5

Al

SON-HOBBY FATHER-HOBBY

SON

B*

r$,h4}

"I

A

s2 f2

We

denote

structures

over

hierarchical

the

class

The class

s6

was first of

C

lb,)

{c,)

a

{b2}

{c,)

Instance HNR is

closed

s7

under UNNEST.

This

shown by Roth, Korth and Silberschatz The proof can be simplified using the IRKS]. HNEST characterizetion given above.

hierarchical

scheme S by HNR(S) and the class structures

B

a

C

Scheme S7

Instance

of all

B

A c*

by HNR.

67

6.

Open Problems. We conclude this paper by mentioning a few open problems. of NL. What is 1. Find a characterization the time complexity of a recognition problem for NL? Is it at least demonstrably in NP? In this paper we did not allow null 2. values or empty sets. How does the theory change if we allow these (cf. LAB])? Are there deeper semantics for GFDs, 3. i.e., can they also say something about the meaning of a database or do they only reflect the way the data has been restructured (cf. [FV3,V])? Do WMVDs have any role other than 4. indicating when NEST operations commute? Are they of any interest in a database design algorithm (cf. [FV1,V])?

Associated Flat Relations. ' The previous theorems have been "top down", i.e., by analyzing a structure some properties of the structure have been determined. One may also consider a "bottom up" approach by focussing attention on the flat relations associated with a Recall that any structure class of structures. unnests to a unique relation, but relations may or may not renest to the same structure.' The class PNR(S) induces a subclass PR(S) of the flat relations which are obtainable by unnesting structures in PNR(S), i.e., a relation r 6 PR(S) if for some s 6 PNR(S), r = UNNESTsub(s)(s). Hence, r may be renested to s We now using any legal nesting sequence. characterize PR(S). Theorem 5. Let S be a scheme and let r be a relation over att(S). Then the following statements are equivalent: 1. r 6 PR(S); 2. r satisfies the WMVDs ((att(S>-att(M))-att(N)) -w-> att(M) for each pair M, N of incomparable interior nodes of S; 3. r satisfies the WMVDs ((att(S)-att(M))-att(N)) -w-> att(M) for each pair M, N of interior nodes of S which are siblings. The relation si in Example 4 illustrates this theorem. In this simple case items 2 and 3.above both state that the WMVDs9 -w-> E and # -w-> F must hold in 8'. 4 We now define the class HR(S) of hierarchically organized relations (over scheme S), first studied by Delobel and Lien [Del,Lie]. We say a relation r 6 HR(S) if for some s 6 HNR(S), r = UNNESTsub(S)(s). We claim that our definition is equivalent to the one used by Delobel. Theorem 6. Let S be a scheme and let r be a relation over att(S). Then r 6 HR(S), if and only if 1. r satisfies the MVLk 2. (K(M) U A(M)) ->-> att(N) for all M 6 int(S) and all N 6 H(M), where K(M) is defined as before.

5.

References [AB] S. Abiteboul, N. Bidoit, "Non First Normal Hierarchically Form Relations to Represent Organized Data", Proc. Third ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, 1984, 191-200.

[BRS] F. Bancilhon, P. Richard, M. Scholl, "On Line Processing of Compacted Relations", Proc. 8th Int'l Conf. on Very 'Large Data Bases, 1982, 263-269. [BK] C. Beeri, M. Kifer, "Comprehensive Approach to the Design of Relational Database Schemes”, Proc. 10th In't'l Conf. on Very Large Data Bases, 1984, 196-207. [CT] J. Clifford, A. Tansel, "On an Algebra For Historical Relational Databases: Two Views", Proc. ACM SIGMOD Int'l Conf. on Management of Data, 1985, 247-265. [Cod] E.F. Codd, "A Relational Model for Large Comm. ACM 13,6 (June 19701, Shared Data Banks", ---

377-387.

[Del] C. Delobel, "Normalization and Hierarchical Dependencies in the Relational Data Model", ACM Trans. on Database Systems 3,3 (September 197q 201-222.

68

[FSTV] P.C. Fischer, L.V. Saxton, S.J. Thomas, D. "Interactions between Dependencies and Van Gucht, Nested Relational Structures", J- Computer System Sciences 1 (1985), to appear.

Z. Ozsoyoglu, V. Matos, Z.M. Ozsoyoglu, "Extending Relational Albebra and Relational Calculus for Set-Valued Attributes and Aggregate Functions", Tech. Report, Case Western Reserve [oao]

University,

[FT] P.C. Fischer, S.J. Thomas, "Operators for Form Relations", Proc. IEEE Non-Firs%Normal and Applications Conference, Computer Software 1983,

1983.

[00] Z.M. Ozsoyoglu, G. Ozsoyoglu, "An Extension of Relational Algebra for Summary Tables", Proc. Statistical Database Workshop, 1983, 202-211.

464-475.

LFtivaluepd.C.

Fischer, D. ;;tc Gucht, "W;;; Third Dependencies", . SIGACT-SIGMOD Symposium on Principles of Database Systems, 1984, 266-274.

[OY] Z.M. Ozsoyoglu, L.Y. Yuan, "A Normal Form for Nested Relations", Proc. ACM Fourth SIGACT-SIGMOD Symposium on Principles of Database Systems, 1985, 251-260.

[FV2] P.C. Fischer, D. Van Gucht, 'Structure of Relations Satisfying Certain Families of Deper+ dencies", in Proc. 2nd Symposium on Theoretical Aspects of Coar?%ience, K. Ehlhorn, Ed., Springer&&lag, Berlin, 1985, 132-142.

A. Silberschatz, M.A. Roth, H.F. Korth, [RKSI "Theory of Non-First-Normal-Form Relational Databases", Tech. Report TR-84-36, University of Texas at Austin, 1984. [Sch] H.J. Schek, "Towards a Basic Relational NF2 Algebra Processor", Proc. of the International Conference on Foundations of Data Organization, Kyoto, Japan, 1985, 173-182.

P.C. Fischer, D. Van Gucht, "Determining When A Structure is a Nested Relation", Proc. 11th Int'l Conf. on Very Large Data Bases, 1985, [FV3]

171-180.

[FA] J. Freitag, S-NF2 Relations", (North-Holland)

H.J.

[SP] H.-J. Schek, P. Pistor, "Data Structures for an Integrated Data Base Management and Information Retrieval System", Proc. 8th Int'l Conf. on Very Large Data Bases, Mexico, 1982, 197-207.

Appelrath, "Modelling IR by A Broad Computing 85: Developmenx, G. B-ucci and Science Publishers, B.V.

[SS] H.J. Schek, M.H. Scholl, "An Algebra for the Relational Model with Relation-Valued Attributes", University of TF DVSI-1984-Tl, Technical Darmstadt, West Germany, 1984.

1985.

Data Bases", Proc. [Gon] G. Gonnet, "Unstructured Second ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, 1983, 117-124.

Form [Tho] S.J. Thomas, "A Non-First-Normal Relational Database Model", Ph.D. Dissertation, Vanderbilt University, 1983.

[HY] R. Hull, C.K. Yap, "The Format Model", Journal of the ACM 31,3 (July 1984), 86-96. --em[JS] G. Jaeschke, H.-J. Schek, "Remarks on the Algebra on Non First Normal Form Relations", Proc. ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, 1982, 124-138.

[TF] S.J. Thomas, P.C. Fischer, "Nested Relational Structures", Theory of Databases, P.C. Kanellakis, Ed., JAI Press, 1985, to appear.

[KTT] Y. Kambayashi, K. Tanaka, K. Takeda, "Synthesis of Unnormalized Relations Incorporating Information Sciences 3 (1983)' More Meaning",

F;Y,latiotil Vanderbilt

[ZM] C. Zaniolo, M.A. Melkanoff, of Relational Database Schemata", &I (March 1981), Database Systems

201-247.

[KV] G.M. Kuper, M.Y. Verdi, "A New Approach to ACM SIGACT-SIGMOD Database Logic", Proc. Third Symposium on the Principles of Database Systems, 1984,

86-96.

[Lie] Y. Relational Systems 6,l

"Hierarchial E. Lien, Databases", ACM Trans. (March 1981), 48-69.

Schemata for on Database

Mscleod,"A Model for [Mac] I.A. Information Systems", Proc. Ninth Int'l Very Large Data Bases, 1983, 280-289.

'Theory of Unnormalized Van Gucht, Structures", Ph.D. Dissertation, University, 1985.

Integrated Conf. on

"A Consideration of Normal [Mak] A. Maklnouchi, Form of NoGNecessarily-Normalized Relations in the Relational Data Model", Proc. 5th Int'l Conf. on Very Large Data Bases, 1977, 447-453.

69

"On

the

Design

ACM Trans. I%.-

on -