Data structures and graph grammars - Springer Link

DATA STRUCTURES AND GRAPH GRAMMARS

P.L. Della Vigna, C. Ghezzi
Istituto di Elettrotecnica ed Elettronica, Politecnico di Milano, Piazza L. da Vinci 32, 20133 Milano, Italy

ABSTRACT

This paper is concerned with a formal model for data structure definition: data graph grammars (DGG's). The model is claimed to give a rigorous documentation of data structures and to be well suited to program design via stepwise refinement. Moreover, it is possible to verify data structure correctness with regard to their formal definition. Last, attribute context-free data graph grammars (A-CF-DGG's) are introduced. A-CF-DGG's not only give a complete and clean description of data structures and of algorithms running along data structures, but can also support an automatic synthesis of such algorithms.

KEY WORDS AND PHRASES

Data structure, correctness, abstraction, stepwise refinement, program synthesis, software reliability, context-free grammars, attribute grammars, parsing.

1. INTRODUCTION

Programming methodologies which can help in designing correct, easily modifiable, readable and portable software have become an important topic in computer science.

A widely accepted principle is that the quality of software can be considerably improved if the programmer can express his tasks in a free and natural way, without being concerned with details of the machine, which could force him to tailor his solution to some unnatural or unessential features.

Very high-level languages are an ambitious answer to these problems, but it has been argued they cannot exhaust all the needs of programmers. Moreover, the serious problems of optimization which arise have not yet received a solution which allows code of good quality to be obtained.

Another attractive attack on this problem consists in successively decomposing a solution through "levels of abstraction". This means that the solution is initially specified by using an abstract machine whose operations and data are tailored to the problem to be solved. Whenever an abstraction is not directly supported by the language, it is recursively detailed until a level is reached which is directly supported by the system.

We feel that programming through levels of abstraction should not only be considered as a general philosophy to be divulged to non-believers, but should also inspire the design of computer-aided program development systems which allow programs to be tested, measured and modified at each stage of their stepwise refinement. Our research effort is presently in this area.

Quoting Liskov /1/, two kinds of abstraction are recognized to be useful in writing programs: "abstract operations and abstract data types. Abstract operations are naturally represented by subroutines or procedures, which permits them to be used abstractly (without knowledge of details of implementation). However, a program representation for abstract data types is not so obvious; the ordinary representation, a description of the way the objects of the type will occupy storage, forces the user of the type to be aware of implementation information".

These principles have inspired the definition and implementation of the CLU programming language/system, which is one of the most interesting research efforts towards the definition of a programming system supporting structured programming /2/ and modularity /3/.

We present here another model for data structure definition, abstraction and refinement, which is based on graph grammars. In particular, we will show how the model can be used for clean documentation of the project and how it can support a computer assisted design of data structures, resulting in a considerable improvement of program reliability.

The reader who is interested in this topic is invited to read some related works which have appeared in the literature (/4/, /5/, /6/).

2. DATA GRAPH GRAMMARS

A data structure can be viewed abstractly as a set of objects connected by a network of access paths. Thus we can formally define a data graph over $\Sigma$ (the node alphabet) and $\Delta$ (the link alphabet) as a triplet $D = (N, \varphi, \psi)$, where $N$ is the set of nodes, and $\varphi : N \to \Sigma$ and $\psi \subseteq N \times \Delta \times N$ are the node and link labelling functions respectively.

Let $\mathcal{D} = \{D \mid D \text{ is a data graph over } \Sigma, \Delta\}$; a data graph language over $\Sigma$, $\Delta$ is a subset of $\mathcal{D}$. Two data graphs $D = (N_D, \varphi_D, \psi_D)$ and $F = (N_F, \varphi_F, \psi_F)$ are equivalent ($D \equiv F$) if a one-to-one equivalence function $e : N_D \to N_F$ can be found such that

1) $\varphi_D(n) = \varphi_F(e(n))$, $\forall n \in N_D$
2) $(n_1, a, n_2) \in \psi_D$ iff $(e(n_1), a, e(n_2)) \in \psi_F$

Languages of graphs, as an extension of the well-known languages of strings, have been studied by researchers in pattern recognition in a number of papers (/7/, /8/, /9/, /10/, /11/). Applications to language definition and translation are explained in /12/ by Pratt, from whom we borrow some formalism.

Even if it appears that the theory of graph grammars may be developed along lines similar to the theory of string grammars, many problems remain yet to be studied; for example:

a) connections with graph-automata;
b) parsing;
c) definition of meaningful restricted classes of grammars.

We shall consider here mainly context-free graph grammars, without trying to give any answers to the questions above, for which we refer to /10/, /11/. We shall rather restrict our attention to their use as a tool for data definition.

Let $D = (N, \varphi, \psi)$ and $n_i, n_j \in N$; the interpretation of $\varphi(n_i) = X_i$ and $\varphi(n_j) = X_j$ is that objects $n_i$ and $n_j$ are of type $X_i$ and $X_j$ respectively. $(n_i, Y, n_j) \in \psi$ means that object $n_j$ can be accessed from $n_i$ by following the access link $Y$.

Links should not be considered as pointers, just as nodes do not represent memory locations; rather, they are abstract ways of referencing objects, whose definition can be recursively given in terms of other links. In practice, for example, links could represent a simple reference or even a search algorithm.

A top-down design of a data structure should be considered as a set of operations which recursively detail the description of types and links. In particular, we shall concentrate here on data type refinements: link refinements could also be taken into account with minor changes to the model.

Node type refinements are represented here as production rules which describe the structure of a type in terms of lower level component data types.

Formally, a data graph grammar DGG is a 5-tuple $G = (\Sigma_n, \Sigma_t, \Delta, S, R)$, where the nonterminal node alphabet $\Sigma_n$, the terminal node alphabet $\Sigma_t$ ($\Sigma = \Sigma_t \cup \Sigma_n$ is the total alphabet) and the link alphabet $\Delta$ are finite non-empty mutually disjoint sets, $S \in \Sigma_n$ is the starter (the axiom) and $R$ is the set of production rules. Each element $r \in R$ is a 5-tuple $r = (A, D, I, O, W)$ such that

1) $A \in \Sigma_n$
2) $D = (N, \varphi, \psi)$ is a connected (o) graph over $\Sigma$ and $\Delta$
3) $I \in N$ is the input node
4) $O \in N$ is the output node
5) $W \subseteq N$.
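The definitions of a data graph and of graph equivalence given above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the class name `DataGraph`, its field names and the function `equivalent` are hypothetical and do not come from the paper, and the equivalence test is a brute-force search over all bijections, which is practical only for very small graphs.

```python
from dataclasses import dataclass, field
from itertools import permutations


@dataclass
class DataGraph:
    """A data graph D = (N, phi, psi); all names here are illustrative."""
    nodes: set                                # N
    label: dict                               # phi: node -> node label
    links: set = field(default_factory=set)   # psi: set of (n1, a, n2) triples


def equivalent(d, f):
    """Brute-force test of D = F up to renaming of nodes.

    Searches for a bijection e with phi_D(n) = phi_F(e(n)) and
    (n1, a, n2) in psi_D iff (e(n1), a, e(n2)) in psi_F.
    """
    if len(d.nodes) != len(f.nodes) or len(d.links) != len(f.links):
        return False
    d_nodes = sorted(d.nodes, key=repr)
    for image in permutations(sorted(f.nodes, key=repr)):
        e = dict(zip(d_nodes, image))
        # Condition 1: node labels are preserved by e.
        if any(d.label[n] != f.label[e[n]] for n in d_nodes):
            continue
        # Condition 2: e maps the link set of D exactly onto that of F.
        if {(e[n1], a, e[n2]) for (n1, a, n2) in d.links} == f.links:
            return True
    return False
```

Two graphs that differ only in the names chosen for their nodes are reported as equivalent, while reversing the direction of a link between differently labelled nodes breaks the equivalence.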

(o) Let $D' = (N, \varphi, \psi')$ be the undirected graph associated to $D$, such that $\psi' = \{(n_1, a, n_2) \mid (n_1, a, n_2) \in \psi \lor (n_2, a, n_1) \in \psi\}$: $D$ is connected if $D'$ is connected.

Before defining how productions are used to derive data graphs, we introduce the operation join which, applied to two graphs, gives a set of graphs as result. Let $D_1 = (N_1, \varphi_1, \psi_1)$, $D_2 = (N_2, \varphi_2, \psi_2)$ be graphs over $\Sigma$, $\Delta$ and $\Omega_2$ a (possibly empty) subset of $N_2$. $D' \in \mathrm{join}(D_1, D_2, \Omega_2)$ if $D'$ is equivalent to a graph $D = (N, \varphi, \psi)$ such that:

1) $N = M_1 \cup M_2 \cup M_3$, where $M_1$, $M_2$, $M_3$ and $M_4$ are mutually disjoint sets such that $N_1 = M_1 \cup M_2$, $N_2 = M_3 \cup M_4$, $\Omega_2 \subseteq M_3$, and $\rho : N_2 \to M_2 \cup M_3$ is a surjective application satisfying the conditions
   a) $\rho(n) = n$, $\forall n \in M_3$
   b) $\rho(n) \in M_2$ and $\varphi_2(n) = \varphi_1(\rho(n))$, $\forall n \in M_4$
2) a) $\varphi(n) = \varphi_1(n)$, $\forall n \in N_1$
   b) $\varphi(n) = \varphi_2(n)$, $\forall n \in M_3$
3) $(n, a, m) \in \psi$ iff $n, m \in N_1$ and $(n, a, m) \in \psi_1$, or $n', m' \in N_2$ can be found such that $n = \rho(n')$, $m = \rho(m')$ and $(n', a, m') \in \psi_2$.

Intuitively, each resulting graph of $\mathrm{join}(D_1, D_2, \Omega_2)$ can be viewed as a juxtaposition of $D_1$ and $D_2$ where nodes of $D_2$ not in $\Omega_2$ can be identified with nodes in $D_1$ with the same label.

If $e$ is the equivalence function $e : N \to N'$, the joint function $j : N_1 \cup N_2 \to N'$ is defined as follows:

1) $j(n) = e(n)$, $\forall n \in N_1$
2) $j(n) = e(\rho(n))$, $\forall n \in N_2$

The operation join is given a special name (the symbol is illegible in the source) if $\Omega_2 = N_2$. In such a case the result of the join operation is a single graph formed by the pair of graphs $D_1$ and $D_2$.

The derivation set $Y(G)$ defined by the data grammar $G$ is a set of graphs over $\Sigma$ and $\Delta$ which can be recursively defined as follows:

i. $Y(G)$ contains all the graphs equivalent to $D_0$ (the start graph) $= (\{n\}, \varphi, \epsilon)$, where $\varphi(n) = S$ and $\epsilon$ is the empty link labelling function.

ii. Let $D_1 = (N_1, \varphi_1, \psi_1) \in Y(G)$, $\bar{n} \in N_1$, $\varphi_1(\bar{n}) = A \in \Sigma_n$, $(A, D_2, I, O, W) \in R$, $D_2 = (N_2, \varphi_2, \psi_2)$. $Y(G)$ contains also the graphs $D'$ equivalent to $D = (N, \varphi, \psi)$ constructed as follows:

   1. let $D_1' = (N_1', \varphi_1', \psi_1')$, where
      a) $N_1' = (N_1 - \{\bar{n}\}) \cup \{n_I, n_O\}$, with $n_I, n_O \notin N_1$
      b) - $\varphi_1'(n) = \varphi_1(n)$, $\forall n \in N_1 - \{\bar{n}\}$
         - $\varphi_1'(n_I) = X$, $X \in \Sigma$
         - $\varphi_1'(n_O) = \ldots$ [label illegible in the source]
      c) - $(n_1, a, n_2) \in \psi_1'$, $\forall n_1, n_2 \in N_1 - \{\bar{n}\}$ such that $(n_1, a, n_2) \in \psi_1$
         - $(n_O, a, n_I) \in \psi_1'$ if $(\bar{n}, a, \bar{n}) \in \psi_1$
         - $(n, a, n_I) \in \psi_1'$, $\forall n \in N_1 - \{\bar{n}\}$ such that $(n, a, \bar{n}) \in \psi_1$
         - $(n_O, a, n) \in \psi_1'$, $\forall n \in N_1 - \{\bar{n}\}$ such that $(\bar{n}, a, n) \in \psi_1$
   2. let $D_1'' = (N_1'', \varphi_1'', \psi_1'') \in \mathrm{join}(D_1', D_2, W)$
   3. if $j$ is the joint function $j : N_1' \cup N_2 \to N_1''$, then
      a) $N = N_1'' - \{j(n_I), j(n_O)\}$
      b) $\varphi(n) = \varphi_1''(n)$, $\forall n \in N_1'' - \{j(n_I), j(n_O)\}$
      c) - $(n_1, a, n_2) \in \psi$, $\forall n_1, n_2 \in N_1'' - \{j(n_I), j(n_O)\}$ such that $(n_1, a, n_2) \in \psi_1''$
         - $(n, a, j(I)) \in \psi$, $\forall n \in N_1'' - \{j(n_I), j(n_O)\}$ such that $(n, a, j(n_I)) \in \psi_1''$
         - $(j(O), a, n) \in \psi$, $\forall n \in N_1'' - \{j(n_I), j(n_O)\}$ such that $(j(n_O), a, n) \in \psi_1''$
         - $(j(O), a, j(I)) \in \psi$ if $(j(n_O), a, j(n_I)) \in \psi_1''$

In general, the application of a rule to a graph $D$ in $Y(G)$ gives a result which depends on $D$, i.e. the operation is context dependent. A DGG is a context-free data graph grammar (CF-DGG) if all the rules $(A, D, I, O, W)$, where $D = (N, \varphi, \psi)$, are such that $W = N$.

The data graph language (DGL) defined by a grammar $G$ is:

$L(G) = \{H \mid H = (N_H, \varphi_H, \psi_H) \in Y(G) \land \varphi_H(n) \in \Sigma_t, \ \forall n \in N_H\}$
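For the context-free case ($W = N$) the derivation step reduces to replacing a nonterminal node by a disjoint copy of the right-hand side graph: links that entered the rewritten node are redirected to the input node $I$, and links that left it now leave the output node $O$. The following is a minimal sketch of that step only, under stated assumptions: the names `Graph`, `Rule`, `apply_rule` and `in_language` are hypothetical, fresh node names of the form `("n", k)` are an implementation choice assumed not to clash with existing nodes, and the general join with node identification is not modelled.

```python
from dataclasses import dataclass
from itertools import count

_fresh = count()  # source of fresh node identifiers (an implementation choice)


@dataclass
class Graph:
    nodes: set    # N
    label: dict   # phi: node -> label
    links: set    # psi: triples (n1, a, n2)


@dataclass
class Rule:
    lhs: str      # A, a nonterminal label
    rhs: Graph    # D, the right-hand side graph
    inp: object   # I, input node of D
    out: object   # O, output node of D


def apply_rule(g, node, rule):
    """Rewrite `node` of `g` with a context-free rule (W = N)."""
    assert g.label[node] == rule.lhs
    # Copy the right-hand side under fresh names so it is disjoint from g.
    rename = {n: ("n", next(_fresh)) for n in rule.rhs.nodes}
    nodes = (g.nodes - {node}) | set(rename.values())
    label = {n: l for n, l in g.label.items() if n != node}
    label.update({rename[n]: rule.rhs.label[n] for n in rule.rhs.nodes})
    links = {(rename[a], x, rename[b]) for (a, x, b) in rule.rhs.links}
    for (a, x, b) in g.links:
        # Links entering the rewritten node now enter I;
        # links leaving it now leave O (a self-loop becomes O -> I).
        src = rename[rule.out] if a == node else a
        dst = rename[rule.inp] if b == node else b
        links.add((src, x, dst))
    return Graph(nodes, label, links)


def in_language(g, terminals):
    """Terminal-label condition of L(G); derivability itself is not checked."""
    return all(g.label[n] in terminals for n in g.nodes)
```

As a usage example, with a hypothetical nonterminal `List` and terminal `cell`, applying a rule `List -> cell --next--> List` and then a rule `List -> cell` to the start graph yields a two-cell chain whose labels are all terminal.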

Example 1 (o)

The following grammar defines the set of binary directed acyclic graphs over $\Sigma = \{a\}$ and $\Delta = \{x_1, x_2\}$.

[Figure: two productions for the nonterminal BDAG, with links labelled x1 and x2 and the node set W shown in brackets; the graphs are not recoverable from the source.]

(o) The rule $(A, (N, \varphi, \psi), I, O, W)$ is represented as $A \to (N, \varphi, \psi)$, where the input node is marked by an arrow, and the output node by a double circle. The set of nodes $W$ is bracketed by [ and ]. In the sequel, if no set $W$ is listed, $W = N$ is assumed.

Example 2

The following grammar generates the data structure shown in fig. 1, representing the employee file of a firm. Employees are grouped according to their sex. If an employee is married, the name of the wife must be known; moreover, the system should record married couples of employees.

[Grammar productions: graphical, not recoverable from the source. Legible node and link labels: employee file, man list, woman list, man, woman, next man, husbandof, woman name.]

[Figure 1: an instance of the employee file. Legible labels: first man, first woman, next man, next woman, husbandof, woman name, end.]

3. THE PARSING PROBLEM FOR DATA STRUCTURES

The formalism of DGG's should be viewed as a tool for describing data structures in a clean and rigorous way. This is a very important property, because it is well known that a clean documentation can greatly increase software reliability, as it becomes much easier to prove program correctness and to maintain programs.

A number of questions naturally arise concerning the formal properties of DGG's. One of them regards the parsing problem for DGG's, i.e. the possibility of deciding whether a data structure is correct according to its formal definition. Several results on such problems for CF-DGG's are given in /11/, where suitable subclasses of CF-DGG's supporting efficient parsing algorithms are also studied.

As for the models described here it is possible to prove the following:

Proposition 1 - The parsing problem is decidable for DGG's having rules $(A, D, I, O, W)$, where $D = (N, \varphi, \psi)$, such that cardinality $(W) \geq 1$. In particular, it is decidable for CF-DGG's.

Proposition 2 - The parsing problem is undecidable for DGG's.

Moreover, given a CF-DGG, programs (data structure parsers) which test data structure correctness can be automatically constructed. In what follows we shall restrict our attention to CF-DGG's.

4. DATA GRAPH GRAMMARS AND TOP-DOWN PROGRAM DESIGN: AN EXAMPLE

In this section we give an example showing how DGG's can be used in the stepwise refinement of program construction.

Given a library organized in sections of different matters we develop an algorithm which computes $\varepsilon$, the set of empty sections. The data structure will be developed in parallel with the refinement of the search algorithm.

The program is written in an Algol-like language, with the following conventions for operations on the data structure: if A is an object of type $\alpha$ and a is a link exiting A, then:

1. B := a(A) means that the data structure control leaves object A following link a and the object reached by A under a is denoted by B;
2. is-link (A, a) is a boolean function which is true iff a link labelled a leaves A;
3. if A denotes an object at step i whose type $\alpha$ is detailed by the rule $\alpha \to D$ at step i+k (k ≥ 1), then A denotes the input node of graph D at step i+k.

Data structure:

[Graphical productions for the nonterminals Data structure and Library (two rules, with link next); the graphs are not recoverable from the source.]

Program:

    /initially the current object is START/
    Sect := init (START);
    /successive integer numbers are associated to successive sections/
    i ← 0; ε ← ∅; scanned ← false;
    repeat
        i ← i+1;
        if empty (Sect) then ε ← ε ∪ {i};
        if is-link (Sect, next)
            then Sect ← next (Sect)
            else scanned ← true
    until scanned

We detail empty (Sect).

Data structure:

[Graphical productions for the nonterminal Section (two rules, with links first and back); the graphs are not recoverable from the source.]

Program:

    Head ← Sect;
    if is-link (Head, first)
        then empty ← false
        else empty ← true
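The Algol-like program above translates almost line by line into an executable sketch. The representation is an assumption made only for illustration: each object is a Python dictionary mapping link names to the objects reached through them, so that is-link (A, a) becomes a key test and a(A) becomes a lookup. The function name `empty_sections` is hypothetical. As in the original repeat ... until loop, the library is assumed to contain at least one section.

```python
def empty_sections(start):
    """Return the set of (1-based) numbers of the empty sections.

    Objects are modelled as dicts from link names to target objects;
    this encoding is an assumption of the sketch, not part of the paper.
    """
    def is_link(obj, name):      # is-link (A, a)
        return name in obj

    def follow(obj, name):       # B := a(A)
        return obj[name]

    def empty(sect):             # refinement of empty (Sect)
        head = sect
        return not is_link(head, "first")

    sect = start                 # Sect := init (START)
    i, eps, scanned = 0, set(), False
    while not scanned:           # repeat ... until scanned
        i += 1
        if empty(sect):
            eps.add(i)
        if is_link(sect, "next"):
            sect = follow(sect, "next")
        else:
            scanned = True
    return eps
```

Treating every link as a dictionary lookup corresponds to the "simple reference" reading of links; replacing `follow` by a function that loads the next section from secondary storage would correspond to the reading of links as algorithms still to be detailed.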


The reader should note that further refinement is required to detail step 4. The refinement implies:

1) definition and possible refinement of links;
2) concrete implementation of the data structure.

If we consider each link as a simple reference, no further refinement is required and we must simply map the abstract data structure onto the structures supported by the programming language.

On the other hand, we could consider links as invocations of algorithms yet to be detailed. For example, link next could extract from a secondary storage the file containing the next section.

Moreover, even if the algorithm which computes $\varepsilon$ does not require further refinements of the data structure, other queries about the data structure, such as the list of books written by an author all over the library, would require detailing the nonterminal Vols by means of the following productions:

[Graphical productions for Vols (two rules, with link suc), Book, and Authlist (two rules); the graphs are not recoverable from the source.]

5. DATA GRAPH GRAMMARS AND PROGRAM SYNTHESIS

In this section we show how data graph grammars can be used for automatically synthesizing algorithms which perform computations running along the data structure.

We introduce here the formalism of Attribute-CF-DGG's, which can be considered as an extension of similar concepts of /13/, /14/. For each symbol $X \in \Sigma$ there is a set $I(X)$ of inherited attributes and a set $S(X)$ of synthesized attributes. The evaluation of the attributes is defined within the scope of a single production, by means of attribute rules. Attributes of the lefthand side nonterminal of the production are synthesized while attributes of the righthand side elements are inherited; attribute rules specify how a given attribute can be computed in terms of attributes of other elements in the same production.

As to the example described in section 4, we introduce the following synthesized attributes:

- $\varepsilon$, giving the set of empty sections;
- [symbol illegible in the source], giving the set of books written by a given AUTHOR;
- [symbol illegible in the source], which is true iff AUTHOR has at least one book in the library;
- in, which is true iff AUTHOR is in the authorlist of a book;

and the inherited attribute

- n, which numbers each section of the library.

The Attribute-CF-DGG which represents the example is shown in fig. 2. The indices which appear in the attribute rules relate attributes to the elements of the productions. Attributes can be evaluated by an algorithm which runs along the parse structure of the data structure; the values computed for the attributes of the starter of the grammar are the result of the computation on the data structure. In our example the evaluation of attribute $\varepsilon$ of the nonterminal "Data structure" gives the same result as the program described in section 4.

The reader should note that using the formalism of Attribute-CF-DGG we simply specify, for each rule, how to compute an attribute in terms of other attributes. In other words we do not give the formal specification of an algorithm, because the evaluation sequence of the attributes is not specified. The only constraint which must be satisfied by an effective algorithm is that an attribute can be evaluated only if the values of the attributes on which it depends are known. It is possible to design an algorithm which, given the Attribute-CF-DGG and a data structure satisfying the grammar, is able to find a suitable evaluation sequence (if it exists /13/) which allows:

a) to compute all the attributes in an interpretative scheme, or
b) to generate an object program which computes the attributes.

In both cases, data types and operators used in attribute rules must be directly supported by the interpreter or by the programming language in which the object program is written. In the example, we have supposed that the object language supports data of type integer and boolean.

Even if we do not have a computer aided program design system which is able to automatically construct a program which evaluates attributes, Attribute-CF-DGG's seem to play a useful role in giving a complete and clean documentation of data structures and of algorithms which run along data structures.

It must be emphasized that this model is not suitable to represent operations which dynamically change data structures. Therefore, whenever a data structure is modified it is necessary to re-parse the structure in order to obtain the new values of its attributes.

Attributes can also be used to impose restrictions on the class of data structures defined by a CF-DGG which cannot be specified by a CF-DGG, or could be specified only with a rather complicated grammar.

In the sequel we present an Attribute-CF-DGG for the example in Sec. 4.

[Figure 2: Attribute-CF-DGG for the library example, with one group of attribute rules attached to each production of Data structure, Library (two rules), Section (two rules), Vols (two rules), Book and Authlist (two rules). The graphs and most of the rules are not recoverable from the source. Legible attribute rules include:

    n2 ← n1
    n3 ← n1 + 1
    ε1 ← ε2 ∪ ε3
    ... ← if in3 then {val (Title)} (o) else ...
    in1 ← (if AUTHOR = val (Author) then true else false) ∨ in3
    in1 ← if AUTHOR = val (Author) then true else false
]

(o) val (a) gives the value of the terminal a.
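The only ordering constraint stated in this section, namely that an attribute can be evaluated only once the attributes it depends on are known, can be illustrated by a small demand-driven evaluator. This is a sketch under stated assumptions and not the evaluation algorithm of /13/ or of the authors: attribute instances are identified by hypothetical pairs such as `("eps", "lib")`, each instance is given explicitly with its dependencies and its rule, and a circular dependency (the case in which no evaluation sequence exists) is reported as an error.

```python
def evaluate(rules, goal):
    """Evaluate the attribute instance `goal`.

    `rules` maps each attribute instance to a pair (dependencies, function).
    An instance is computed only after all instances it depends on;
    a circular dependency raises ValueError.
    """
    values, in_progress = {}, set()

    def value_of(attr):
        if attr in values:
            return values[attr]
        if attr in in_progress:
            raise ValueError(f"circular dependency at {attr!r}")
        in_progress.add(attr)
        deps, fn = rules[attr]
        values[attr] = fn(*(value_of(d) for d in deps))
        in_progress.discard(attr)
        return values[attr]

    return value_of(goal)


# Hypothetical attribute instances for a library of two sections,
# the second of which is empty: n numbers the sections, eps collects
# the numbers of the empty ones.
rules = {
    ("n", "sec1"): ((), lambda: 1),
    ("n", "sec2"): ((("n", "sec1"),), lambda n1: n1 + 1),
    ("eps", "sec1"): ((), lambda: set()),
    ("eps", "sec2"): ((("n", "sec2"),), lambda n: {n}),
    ("eps", "lib"): ((("eps", "sec1"), ("eps", "sec2")), lambda a, b: a | b),
}
result = evaluate(rules, ("eps", "lib"))
```

The sketch corresponds to the interpretative scheme of point a); generating an object program, as in point b), would amount to emitting the rule applications in the order in which `value_of` completes them.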


6. CONCLUSION

In this paper we have given a formal definition of data graph grammars and we have discussed their relevance to data structure design. In particular, we have restricted our attention to context-free data graph grammars, and we have shown that:

1) they give a complete and rigorous documentation of a data structure;
2) they describe in a clean and natural way stepwise refinements of data structures;
3) it is possible to verify data structure correctness, with regard to their formal (syntactic) definition;
4) it is possible to associate attribute rules to each production, so that algorithms which walk along a data structure can be automatically synthesized.

Further investigations are currently going on with regard to the following points:

1) dynamic change of data structures;
2) data graph realization in a computer memory, with respect both to the automatic choice of efficient storage structures and to restrictions on CF-DGG's which derive graphs more easily implementable /6/.

These points and a deeper insight into the practical relevance of the model are worth studying to support our belief that attribute data graph grammars can play a useful role in computer assisted program design.

REFERENCES

/1/ Liskov, B. "An introduction to CLU", Computation Structures Group Memo 136, MIT Project MAC, 1976.
/2/ Dahl, O.J., Dijkstra, E.W., Hoare, C.A.R. "Structured programming", Academic Press, New York, 1972.
/3/ Parnas, D.L. "On the criteria to be used in decomposing systems into modules", CACM 15, 12, 1053-58, 1972.
/4/ Earley, J. "Toward an understanding of data structures", CACM 14, 617-626, 1971.
/5/ Shneiderman, B., Scheuermann, P. "Structured data structures", CACM 17, 10, 583-587, 1974.
/6/ Rosenberg, A.L. "Addressable data graphs", JACM 19, 2, 309-340, 1972.
/7/ Pfaltz, J.L., Rosenfeld, A. "Web grammars", Proc. 1st Intl. Joint Conference on Artificial Intelligence, Washington, 609-19, 1969.
/8/ Montanari, U.C. "Separable graphs, planar graphs and web grammars", Information and Control, 16, 243-67, 1970.
/9/ Pavlidis, T. "Linear and context-free graph grammars", JACM 19, 11-22, 1972.
/10/ Milgram, D.L. "Web automata", University of Maryland, Computer Science Center Technical rep. 271, 1973.
/11/ Della Vigna, P., Ghezzi, C. "Context-free graph grammars", Internal rep. 76-1, Istituto di Elettrotecnica ed Elettronica, Politecnico di Milano, IEEPM, 1976.
/12/ Pratt, T.W. "Pair grammars, graph languages and string to graph translations", JCSS 5, 580-595, 1971.
/13/ Knuth, D. "Semantics of context-free languages", Math. Systems Theory, 2, 127-145, 1968; Correction: Math. Systems Theory 5, 95-96, 1971.
/14/ Bochmann, G.V. "Semantic evaluation from left to right", CACM 19, 2, 55-63, 1976.