Specification of compilers as abstract data type representations
M.C. GAUDEL (INRIA, FRANCE)
Summary This paper presents a method for specifying and proving compilers. This method is based on the algebraic data types ideas. The main points are : -
-
-
to each language is associated an algebraic abstract data type, the semantic value of a program is given as a term of this data type, the translation of the semantic values of source programs into semantic values of target programs is specified and proved as the representation of an algebraic data type by another one.
A compiler generator, PERLUETTE, which accepts such specifications as input is described. The proof technic is discussed.
I - INTRODUCTION Several experiments have been or are currently performed in order to use formal semantic specifications in a compiler generator. Although automatic production of parsers from a B.N.F.-like specification of a grammar is now widely kno~nand used (see [I, 2] among many others), it still remains that the development of the other parts of a compiler is made, most of the time, in an
'ad hoc' manner. Tools like Knuthian attributes (W-grammars, Affix-
grammars) are useful to specify compilers but the generation of all the parts of a compiler is possible only if there exists a formal and well-suited way to describe languages semantics. Of course, this description must be directly acceptable by the compiler generation system.
141
Among the works in this area, one of the most interesting is the SIS system developped by P. Mosses [3, 4]. This system takes as input a denotational definition of the Source Language semantics. The resulting translator compiles every source program into an expression of the intermediate language LAMBDA. This expression can be then evaluated by the LAMBDA interpreter. Our approach [5, 6, 7, 8] is a bit different since our goal is to produce compilers which generate any given target code. Accordingly, a definition of both the Source and Target languages must be given to the system as well as the implementation choices :
Definition~!
Source Language D e f i n i t i o n ~ , ~
Target Language
Compiler I Generation .
Compiler
System
Implementation C h o i e e s ~
In his paper 'advice on structuring compilers and proving them correct' [9] F.L. Morris stood by a similar idea : as he pointed out, it is necessary to specify source and target semantics ; moreover this approach makes it possible to get a correctness proof of the specified compiler, proving that the following diagram commutes.
Target Language
Source Language y (compiler) (source
~(target
semantics)
semantics)
Source Meanings
Target Meanings E (representation)
142
Depending on the theoretical framework used, meanings of programs can be of various sort : continuous functions, predicate transformers, algebraic values,
... However we think that a well suited formalism to specify and
prove representations is given by algebraic data types [10, II, 12]. Thus, the semantic value of a program is considered to be a term of an algebraic data type and the bottom line (s) of the diagram above is nothing more than the representation of a 'Source data type' by a 'Target data type'. This representation
must be proved correct in the sense of [10] or [II] : the
commutativity of the diagram is, obviously, not sufficient since it is true in the trivial case where there is only one value in the Target semantic domain ! We are currently developping a system, PERLUETTE(I~" which generates a compiler from such specifications. The system is presented in part II of the paper. In part III an example of compiler specification is given. In part IV the proof method is outlined.
II - GENERAL PRESENTATION OF THE SYSTEM PERLUETTE takes into account only the 'syntatic' part of the abstract data types associated with Source and Target Languages. It is a compiler generator, not a prover. The axioms (the 'semantic' part) of the data types are needed for the correctness proofs which, at this time, are done by hand. It is definitely possible (and highly desirable) to mechanize these proofs using or modifying an existing system such as LCF [13], AFFIRM [14,15] or PROLOG. The first part of a source language definition is the presentation of the algebraic data type associated to this language. This data type describes in a formal way the properties of the operations (using the more general
meaning
of this word) of the language. The semantic value of a program is a composition of some of these operations, i.e a
constant term of the data type. Semantic
equations specify the semantic value of a ~hrase of the programming language as a composition of the semantic values of the components of this phrase.
(1)Production Elegante et RelLlement aUtomatiquE de TraducTEurs - 'Perluette' is the french name of the character '&'°
143
These equations define,
in a syntax directed way, a semantic function which has
programs as domain and
constant terms as co-domain.
The general scheme of the system is given below :
~ g r a m
Source Language Definition : Syntax + A.D.T
1
+ Semantic
equations term of A.D.T 1 Representation of A.D.T 1
by A.D.T 2
term~ of A.D°T 2 @bject Language
I
Definition : A.D,T + Code Generation
) 2
~ompiler Specification
PERLUETTE
Corn~i let
The resulting translators work in three steps : the first one produces an intermediate text, which is a term of the Source data type ; this term is rewritten by the second step as a term of the Target data type ; the last step performs the code generation starting from this second term. In the current version of the system, the semantic equations of the source language are programmed using synthesized attributes [16, 17]. The language used to write these attributes is LISP. Attributes evaluation (and parsing) is performed by the DELTA system previously developped by B. Lorho [18]. The two parts of DELTA, the constructor and the evaluator have been respectively introduced in the first part of the generator (&l) and the first step of the translators (STEPI). The reshlt of the first step is a LISP form, since the attributes
144
are specified in LISP. The STEP2 of the translators must evaluate
this form
using a LISP subsystem which is the result of &2. This subsystem is a set of LISP functions which are built from the representation
specification
: every
operation name of the Source data type which can occur in the first term is considered,
at this step, to be the name of a LISP function whose result is
the translation of this operation
into a composition of Target data type
operations. As it will be seen in the example of Part III, the second intermediate close to the target code. STEP3 of the translators into some code, performing statements
removing.
has to rewrite this text
tasks such as registers allocation or unecessary
This step works in the same way as STEP2, evaluating
the second term as a LISP form whose result is the generated This system is currently developped. specifications
text is
Several examples
code.
of compiler
[19,5,8] have been tested with this method.
To avoid inefficencies, of a source program,
STEP] in addition of evaluating
checks its correctness.
the semantic value
Thus, no intermediate
text is
generated if the program is not valid. This verification has to be specified in the same time as semantic equations, From a practical point of view, efficient
using attributes.
the resulting
in time but rather expensive
translators
seem, so far, to be
in space. The quality of the generated
code depends on...the quality of the given specification.
Let us consider
now an example of such a specification.
III - EXAMPLE OF A COMPILER SPECIFICATION The compiler presented here translates
a source language with Algol-like
blocks and arrays into a target language with addresses, hexadecimal
numbers,
registers,
etc. Only some points of the specification
are given
since the goal is to show the specification method, not the compiler. The implantation
is a very classical
information vectors are allocated allocated at run-time,
one. Simple variables
in a static memory area. Arrays are
in a dynamic memory area,
The complete specification
and arrays
is given in appendix.
at the end of the memory.
t45
III.I - Source Language Definition The syntax of the source language is given below
:
P + BLOCK BLOCK + begin DL ; SL end DL ÷ DL ; DID D ÷ integer
ID]array ID [E:E]
SL ÷ SL ; S I s
S + ID:=EIID[E]:=EIBLOCKlbegin
SL en_~dl
if C then S else S[while C do S E ÷ E+TIE-TIE*T]E/TIT T ÷ (E) IIDIID[E]INB C ÷ E=E]E~E NB and ID are considered
to be terminal
(their syntax will be dealt with
by the scanner constructor). III.l.l - Semantics As usually,
of expressions
the presentation
of a data type associated with this language
is made up of a list of names of type, a list of names of operation with their domains and co-domain and some axioms between operations.
In order to specify error cases,
given, which describe For the considered
(or laws) which express
the relationshin
some restrictions
are
the forbidden terms of the data type [ i ~ .
source language there are boolean
identifier of integer and array (Int-id) Array-id)
(Bool), integer
(Int),
among the names of type.
type Bool ;
o_£ ( ) ÷ Bool : true, false ; (Bool) ÷ Bool : ^ ; (BooI,Bool)
+ Bool
: and, or, impl, eq ;
axioms end Bool ; Bool and Int data types are well-known Moreover,
boolean operators
usual rules for priority.
and their axioms are not given.
are going to be used in a infixed way, with the
146
type Int ; op
(Int,lnt) * Int : add, sub, mult, div ; (Int) * Bool
: neg ;
(Int,lnt) ÷ Bool
: eq, neq ;
axioms
end Int ; Let us call V the semantic function which returns corresponding
to an arithmetic expression,
the term of Int type
and B the semantic function for
conditions which returns a term of Bool type. Among the semantic equations, there are: VIE+T ~ = plus (V~E~ , V~T~) V~(E)~ = V~E~ V~NB~ = Int'NB'
B~EI~E2~ = neq (W~EI~ , V~E2~) Note that constants
of a type are enclosed between quotes, preceded by the name
of their type. It is important
to point out that the operation names are not interpreted
have no more information about presentation
'plus' or 'neq' than what is specified
: we
in the
of 'Int'.
Semantics of identifiers
is less straightforward
to state since the properties
of block structure must be described.
A way to do that is to introduce some
types environment
(Vat) and to specify by means of axioms whic
variable
(Env) and variables
is designated by an identifier
noticed that an 'environment' is modified by declarations type Array-id
in a given environment.
and block's entries or exits.
;
op
(Array-id, Array-id)
÷ Bool:eq
;
axioms eq(Array-id end
Array-id
type Int-id ;
;
It must be
here, is no more than a stack of identifiers which
'C]', Array-id
'C2') = eq(C], C2) ;
147
end Int-id type
;
Id = union
(Array-id,
Int-id)
;
end Id ; type E n v
;
o_pp
( ) + Env
: current-env
( ) ÷ Env
: empty
(Env) ÷ Env
;
;
: newblock,
(Env, Array-id)
÷ Env
(Env,
Int-id)
+ Env
(Env,
Id) + Bool
eraseblock
;
: add-array
: add-int
: is-in.a,
;
;
is-in.i
... ;
axioms is-in.a(empty,
id) = false
is-in.a(newblock(e),
id)
;
= is-in.a(e,id)
is-in.a(add-array(e,idl),id2)
= if eq(idl, else
is-in.a(add-int(e,
id]),
;
id2)
= if eq(idl, else
id2)
is-in.a(e, id2)
is-in.a(e,
then true id2)
;
then false id2)
;
° . ,
eraseblock(newblock(e))
= e ;
eraseblock(add-array(e,id))
= erase(e)
;
. . °
end Env
;
The complete type Var
presentation
of Env data
type can be found
;
op
(Vat, Var) ÷ Bool (Int-id,
Env) ÷ Var
: des
(Array-id,
Env,
(Array-id,
Env) ÷ Int
(Var) ÷ Int axioms
: eq ;
Int) ÷ Var
: val
;
; : elt
: lwb,
upb
; ;
in the appendix
148
eq(des(id,
newblock(e)),des(id,e))
eq(des(idl,
add-int(e,
= true ;
id2)), des(id], e)) = if eq(id], id2) then false else true ;
end Var ; The scope rules are expressed by preconditions instance
: Pre(des,id,e)
specifies
on the term of type Var-For
= is-in.i(e,
id) ;
that the term 'des(id, e)' can be written only if 'id' belongs
environment
to the
'e'
Thus, the identifiers
semantics
is given by the following equations
VIID ~ = val(des(Int-id
'ID', current-env))
V~ID[E]~ = val(elt(Array-id (the preconditions,
:
;
'ID', current-env,
V~E~))
;
as it was said in Part II, had been checked using attributes)
111.1.2 - Notions of state and modification So as to describe the semantics of statements
(or declarations),
we introduce a
new name of type : type Modif
;
The semantic value of a statement is a term of this data type. Let $ be the corresponding
semantic function. Among the semantic equations
S~integer ID~ = int-decl(Int-id S~ID
:= E~ = int-assign(Int-id
S~SL ; S~ = concat(S[SL],
there are :
'ID') ; 'ID', V~E~) ;
S~S~)
;
S~begin DL ; SL end~ = concat3(enter-bloek,SIDL~,SISL~, where
'int-decl',
int-assign',
'concat',
'enter-block'
operations of the Modif data type : o_£p (Int-id) ÷ Modif
: int-decl
;
(Int-id, Int) ÷ Modif
: int-assign
(Modif, Modif) ÷ Modif
: eoncat ;
( ) ÷ Modif
: enter-block,
;
exit-block
;
exit-block)
and 'exit-block'
;
are some
149
The intuitive m e a n i n g of this data type,
is that a m o d i f i c a t i o n
state. Thus, we have to define what is a state. it must be pointed out that M o d i f
data type associated w i t h the programming Definition
Before giving
is the 'type of interest'
is a change of
this definition,
of the algebraic
language.
:
Let be the p r e s e n t a t i o n
of the algebraic
data type associated with a
language. A state is a set S of formulas type, w h i c h satisfies i) t = t s S
t=t', where t and t' are some terms of the data
the following properties
:
for all the terms t of the data type ;
ii) t = t' ~ S ~ t' = t c S
;
iii) t = t' E S, t' = t" e S ~ t = t" e S
;
iv) for all f of F such as the co-domain of f is not Modif (2), if t and t' belongs
to the domain of f : t = t' e S ~ f(t) = f(t') e S.
A state satisfies substituting,
the set of axioms A if it contains
the terms of of the relevant Given a current state, formulas
a modification
w e suggest
of the modifications to use some primitives
occuring
obtained
for instance)
removes
some
In order to make easier the
and the proofs of their representations, operations
on modifications
and states,
w h i c h are common to all the language definitions. The first operation
is the application
appl
of a m o d i f i c a t i o n
to a state
:
(m,S) = S'.
The second one is the adjonction of an axiom a to a state S : S+a is the smallest
state which contains
(2) It is necessary
S and all the formulas
to have this restriction
semantics w i t h o u t
introducing
obtained
in substituting
in order to deal with procedure
some paradoxes
[20].
in
in a, all
domains. (an assignment
from the state and add some new ones.
definitions
all the formulas
in each axiom a of A, to all the free variables
150
to all the free variables
in a, all the terms of the data type which are
re i evan t. The last one is a 'generalized
substitution'
define all kind of assignement
:
which makes it possible
to
subst (f (%) ,~) means,
intuitively,
that the value of function f, for all the terms equal
to % becomes p. subst is characterized but it insures
by a rather complicated
that no inconsistencies
definition given below
are introduced
in the resulting state,
even if there are some occurences of f in % or 7. Thus it is possible describe assignement
for array's
appl(suhst(f(%),ja),S)
= {FIF[f'/f]eS
+ f'(x) = if eq(x,%)
Using appl and subst, we can now give the definitions operations
of the considered
to
elements as well as for simple variables.
Source Language
then ~ else f(x)}
of the Modif data type
:
definitions int-decl(id)
= subst(current~env,add-int(current-env,id))
int-assign(id,i)
= subst(val(des(id,current-env)),i)
enter-block
= subst(current-env,newbloek(cu~rent-env))
exit-block
= subst(current-env,eraseblock(current-env))
appl(concat(m],m2),S)
= appl(m2,appl(m],S))
;
; ; ;
;
° . .
end Modif
;
111.2 - Example of a Target dat a typ e The complete
specification
As a consequence the name of types
of the Target data type is given in appendix.
of the kind of target language considered, : hexadecimal
condition codes (Cond-code).
numbers
(neg~branch)
there are among , registers
Among the name of modifications
compare, branch, branch under condition condition
(Hexa), addresses
(cond-branch),
and
: load, store,
branch under reverse
etc. We consider that the Hexa data type is predefined,
with the same axioms as the Int data type :
151
type Hexa
;
op (Hexa,Hexa)
÷ Hexa : plus~ minus, mult, div ;
(Hexa) ~ Bool (Hexa,Hexa)
: neg ;
÷ Bool
: eq
axioms
end Hexa It is important
to notice that the Bool data type is here,
auxiliary type. It is needed to write conditionnal type the result of a comparaison
in some way, an
axioms, but the Target data
is a condition code, not a boolean.
The Address data type contains an indexing operation.
From an address or a
register one can get a 'Content', which is either an hexadecimal an address.
The corresponding
operations
are named
The Cond-code data type is a bit more complicated in this type : 'It', Register
'ca' and 'cr'. : there are three constants
'eq'. These values are contained
'eond', which is modified only by the modification
the definition
in a specific register, 'compare'.
Thus,
of compare is :
compare(cl,c2) where
'gt',
number or
= subst(cc(Register'cond'),
test(c],c2))
'cc' returns the content of the register and test(cl,c2)
is the result
of the comparison.
111.3 - Specification of the implantation In this part of the compiler
specification,
a representation
is given for every
operation name wich occurs in the R.H.S of the Source semantic equations. Only these representations get a correctness a representation operations,
are needed to perform the translation~
proof of the implantation, of all the operations
it becomes necessary
If one want to to give
occuring in the axioms (of all the
if the Source data type does not contain unecessary
ones).
In this part of the paper, we are going to present only a part of the specification.
Let us consider
the Int data type first. It is going to be
represented by the Hexa data type. This is specified
in the following way
:
152
typ~ Int Hexa ; repr Int 'I' = Hexa 'CONVERT(1) I ; repr add(i,j) = plus(rear i, repr j) ; . ° .
end Int ; CONVERT is a meta-proceduye which returns the text of the hexadecimal constant corresponding to I. The data types Int~id, Array-id, Id and Env are not represented. The Var data type is represented by addresses, and the operation ~des' is represented by a constant address since the allocation of simple variables is static. type Var ; Address ; repr des (Int-id 'ID', current-env) = Address 'SEARCH (ID)' ; SEARCH is a meta-procedure.
It looks for the address of identifier ID in a
table TABLE, starting from the last added identifier. Its definition is given below : SEARCH (ID)
=
SEQ-SEARCH (ID, SIZE-I)
where SEQ-SEARCH (ID,i) = IF TABLE [i,l] = ID THEN TABLE [i,2] ELSE SEQ-SEARCH (ID,i-l) Before going on in the specification, we specify the way the couples are entered in TABLE. This is done by two functions : ALLOC! for simple variables and ALLOC3 for arrays (3 words are allocated). ALLOCI
(ID) = (TABLE [SIZE] : = ; INCR (SIZE) ; INCR (TOP (ALLOC-STACK)))
ALLOC3 (ID) = (TABLE [SIZE] : = ; INCR (SIZE) ; INCPJ (TOP (ALLOC-STACK))) The three words allocated to an array are used in the following way : in the first one is the address of the beginning of the array ; the last ones contain the lower bound and the upper bound. Thus the representation of 'elt' is (index underflow or overflow are not tested) :
153
repr elt (Array-id 'ID', eurrent-env, i) = indexing (ca (Address 'SEARCH (ID)'), minus (repr i, repr lwb (Array 'ID', current-env)))
;
where : repr lwb (Array-id 'ID', current~env) = ca (indexing (Address 'SEARCH (ID) ~ , Hexa 'I') ; Going on with the representation of ~ar data type : repr val(~) = ca (repr v) ; Among the modifications, we give the representation of the declarations, and of 'enter-block' and 'exit~block'. repr int-decl (Int-id 'ID') = #~ ALLOCI (JlD) ~=# n o p ; (the meta-statements, i.e compile-time computations, are noted between #@. 'nop' is the 'no operation' target statement). repr array-dec1 (Array 'ID', i, j) = #=~ ALLOC3 (ID) ; A : = SEARCH (ID) #~ seqcomp 3 (store (repr i, indexing (Address 'A', Hexa '|')), store (repr j, indexing (Address 'A', Hexa '2')), load (minus (cr (Register 'free'), plus (minus (repr j, repr i), Hexa 'I')), Register'free'), store (cr (Register 'free'), Address 'A')) ; (it should be verified that j-i e O). Register 'free' points on the first free address before the last allocated array Dynamic allocation is done in a decreasing way in the memory addresses. repr enter-block = ~
A : = TOP (ALLOC--STACK)
;
PUSH (ALLOC-STACK, A+I), PUSH (TAB-STACK, SIZE) #~
store (cr (Register 'free'), Address 'A') ; It is a bit confusing but not really complicated. Before entering a block, at compile-time, the address following the previous block allocations is push on a stack and the allocation of the new block will start from it (see ALLOCI) and the SIZE of TABLE is saved. At run-time, the pointer in the dynamic area is saved. repr exit-block = ~
POP (ALLOC-STACK) ; A : = TOP (ALLOC-STACK) ;
SIZE : = TOP (TAB-STACK) ; POP (TAB-STACK) ~
load (ca (Address 'A'), Register 'free') ;
I54
It may be seen that the specification of the implementation becomes very systematic and formal. But the main advantage is the ability to check up such implementations.
In several previous examples, some errors in the
specification come to light during the proof process.
IV - CORRECTNESS PROOF OF IMPLE~NTATIONS The correctness proof methodology is based on the distinction of 'modifiable' operations from classical operations. Modifiable operations are operations which appear in a subst, in some modification definition.
There must not be axioms on these operations since subst could introduce some inconsistencies with them. Resides, to be correct, the representation of such operations must verify some strong properties
: if two such
operations are represented by the same operation of the Target data type, then, there must be no overlapping between the domains of these representations (if it were the case, it would be possible to modify two Source operations by the same Target modification). The proof method is outlined below : first it must be verified that the implementation satisfies the above property for modifiable operations ; then, the Source data type, without the modifiable operations and the modifications,
is considered,and it must be proved that its representation
into the Target data type is correct in the sense of (10] (the axioms are kept)
; the last step is the correctness proof of the representation
of modifications. The representation of a modification is correct if : - it keeps unchanged the representation invariants which arised in the previous proof (Source data type without modifications) - if the modification is defined by a subst (f(%),~), then the representation embodies a modification which can be proved to be equivalent to subst ~rep.r f(~), repr D) -
the other modifications occuring in the representation are
'~ithout side-effect", i.e without effect on the Source data type. They do not modify the representation of any Source operation.
Such a proof is very long, but it is built systematically, and the various steps are not very difficult. A complete proof is given in [2Q].
155
ACKNOWLEDGMENTS Theoretical as well practical aspects of this work have been studied in collaboration with Ph. DESCHAMP, M. MAZAUD and C. PAIR, P. BOULLIER and B. LORHO's previous works made it possible the effective developpment of the system. P. DERANSART's knowledge of LISP was very useful.
REFERENCES Ill
P. BOULLIER. Le syst~me SYNTAX, Manuel d'utilisation, Groupe Langages et Traducteur, IRIA, 1977.
[~
P. ~OULLIER. Automatic Syntatic Error Recovery for LR-Parsers, 5 th Annual III Conference - Guidel, Mai 1977.
[3]
P.D. MOSSES. Mathematical Semantics and Compiler Generation. Ph.D. Thesis, Universit~ d'Oxford,
[~:]
]975.
P.Do MOSSES. SIS, a Compiler-Generator System using Denotational Semantics Dept. of Computer Science, Universit~ d'Aarhns, June 1978.
[5]
M.C. GAUDEL. A formal Approach to Translator Specification, IFIP Congress, Toronto,
[6]
]977.
M.C. GAUDEL, Ph. DESCAnt, M. MAZAUD. Semantics of Procedures as an Algebraic Abstract Data Type, Rapport Laboria n ° 334 - 1978.
[7]
M.C. GAUDEL, C. PAIR. Construction de Compilateurs bas~e sur une S~mantique Formelle. Acte des journ~es francophones sur la certification du logiciel, Gen~ve,
rS]
1979.
M.C GAUDEL, C. PAIR. The Use of a Formal Semantics to Produce and Prove Compilers. International Workshop on the Semantics of Programming Language. Bad Honnef. March 1979.
Eg]
F. Lockwood MORRIS. Advice on Structuring Compilers and Proving them Correct. P.O.P.L. 1973 . J.V. GUTTAG, E. HOROWITZ, D.R. MUSSER. Abstract Data Types and Software Validation. CACM. December 1978.
156
[Ii]
J.W. THATCHER, E. WAGNER, J.B. WRIGHT. An Initial Algebra approach to the Specification, Correctness, and Implementation of Abstract Data Type. In Current T~ends in Programming Methodology I~T (R. Yeh. Ed). Prentice Hall ]979.
[12]
M.C. GAUDEL. An introduction to Algebraic Abstract Data Type. Lecture notes of the 2nd advanced course on Computing System Reliability. IRIA, Sept ]979.
[13]
M. GORDON, R. MILNER, C. WADSWORTH - Edinburgh LCF. Internal Report CSR-11-77 University of Edinburgh, Sept 1977.
[14]
D.R. MUSSER. A Data Type Verification System based on Rewrite Rules. 6 th Texas Conference on Computing System, Austin, Nov. 1977.
[15]
D.R. MUSSER. Abstract Data Type Specification in the AFFIRM System. IEEE Transactions on Software Engineering (to appear).
[16]
D.E. KNUTH. Semantics of Context-free Languages, Math. Systems Theory, Vol. n°5, n°l, 1971.
[17]
B. LORHO. De la D~finition g la Traduction des Langages de Programmation: la mgthode des attributs s~mantiques. Thgse d'Etat - Toulouse,
[18]
1974.
B. LORHO. DELTA : manuel d'utilisation, Groupe Langages et Traducteurs, IRIA, Nov. ]977.
[19]
M.C. DENDIEN-GAUDEL. Applications des Structures de Donn~es ~ la description et ~ la preuve de Traducteurs. Congr~s AECET Informatique, Gif-sur-Yvette, Nov. 1976.
[20]
M.C. GAUDEL. Thesis, University of Nancy (France), March 1980.
157
APPENDIX
DEFINITION
OF THE S O U R C E
I
LANGUAGE.
(~ -> Bool : true, false : ( B o o l ) -> B o o t : ~ : **~*it is the 'not' f q o o l o B o o l ) -> B o o t ; arid, o r , i m p t , e q ; ax iotas end B o o l
:
~yp~ Int : o__9_ T n t , I n t ) -> t n t : a d d , I n t ) -> B o o t : neg ; ]nt, I n t ) -> B o o l ; eq, axioms end l n t
operator****
sub.
mulL~
div
:
neq ;
:
,~y2e A r r a y - i d : op r~rray-id, Array-id) -> B o o ! ax iota5 r,~(--~rray-id_~ C| , A r r a y - l d ' C 2 end A ~ r a y - i d ; ~ypq Int-id : QR. ~Tnt-id, Int-id) eq('lnt-id'Ct', end [ n t - i d ;
-> Bool
tnt-id'C2')
:eq
: eq ; ) = eq(Ct,
C2)
;
= eq(Ct,
C2)
~y~,~ Id = u n i o n ( A r r a y - i d , Int-ld) ; PF', r~d, Id) -> B o o ] :eq ; axioms eq(Int-id'Ct', Array-id'C2') = eq(Cl, eq(Array-id'Cl', lnt-id'C2') = eq(C1, ,~n___d Id :
;
C2) C2)
;
"
158
Ly_P_e Er, v ; (~ -> Env : empty : () -3 Env : c u r r e n t - e n v ; (c ,~.nv" ) -2, Env : n e w b l o c k , e r a s e b l o c k , (Env, Array-id) -> Env : a d d - a r r a y ; (Fnv, Int-id) -> Env : a d d - i n t : f E n v , I d ) -> 8 o o l : i s - i n . a , is-in.t, is-local ; 3x!oms is-tn.a(empty, id) = f a l s e ; is-in.a(neublock(e), id) = is-in.a(e, id) is-inoa(add-array(e, idl), id2) = i_L£ eq(idl, id2) then true else is-in.aie,id2) ; is-in.a(add-int(e, idl), i d 2 ) = i_(f e q ( i d l , id2) then false else is-in.a(e, id2) ; i:-in.i(empty, id) is-in.i(newblock(e), i~-Jn,i(add-array(e,
= false ; id) = ts-in.i(e, id : id|), id2) l~_Leq(idt, :d2) then false else is-in.i(e, id2) ; i~-~n.i(add-tnt(e, idl), i d 2 ) = j~£ e q ( i d l ~ id2) then true ~tse is-ln.i(e, i--d-2T ; t?.-Ioca1(empty, id) = false ; is-[ocal(newbtoc~(e), ld) = raise ; !~-local(add-array(e~ tdl), id2) = if eq(Jdl,id2) then true eTse is-local(e, td2) ; i~-]ocal(add-int(e, idl), i d 2 ) = L {e ~ E q T i d l , i d 2 ) t h e n t r u e else is-local(e, td2) ;
ecaseblock(empty) =empty ; erasebtock(newblock(e)) = e ; e?aseblock(add-array(e, t d ) ) = e r a s e b l o c k ( e ) : eraseblock(add-int(e, id)) = eraseblock(e) ;
?co*fictions
P~e(add-array, e, i d ) = " i s - l o c a l ( e , P~e(add-lnt, e, id) = is-total(e, end Env : Eyp~ Var : o~ :Jar. V a r ) -> B o o l : e q ; Tnt-ld, Env) -> Var : des ; ~rray-id, Env, I n t ) -> Var : e l t Array-id, Env) -> I n t : l w b , upb Uar) -> Int : val ;
id) id) ;
;
: ;
3XlOmS
4 + ~ * eq i s an e q u i v a l e n c e retatlon and b e s i d e s : * * * * c~(d~s(idl, e), des(id2, e)) = eq(idl,id2) : eq(des(id, newblock(e)), des(id, e)) = trbe ; eq(des(idl, add-int(e, id2)), des(idl, e)) = "eq(idl, ld2) ; eq(elt(idl, e. i), elt(id2, e. j ) ) = eq(idl, i d 2 ) and e q ( i , j) ; eq(e]t(id,neeblock(e), i), elt(id, e, i ) ) true ; eq(elt(idl, add-array(e, id2), i), elt(id, e, i ) ) = ^ e q ( i d l , :d2; eq(des(idl, el), elt(id2, e2, ~)) = false ; rectrictions P~.e(des, l d , e) = i s - i n . i (e, id) : Pr&(elt, id, e. i) = is-ln.a(e, id) : P r e ( u p b , i d , e) ts-ln.a(e, ld) Pre(lwb, i d , e) ls-in.a(e, td) : -~** arrays underflou and o v e r f l o w s p e c i f i c a t i o n s ~*** pea(sub(l, lwb(id, e ) ) ) => F a i l u r e ( e l l , i ~ , e. i ) ; neq(sub(upbiid, e ) , i ) ) => F a i t u r e ( e l t , Jd, e , i ) ; end Var :
;
159 t~D,p Modif Am ~rnt-id) -> Modlf : int-decl ; r A r r a y - i d , int. int) -> M o d i f : a r r a y - d e c l : ~Int-id, Int) -> M o d l f : i n t - a s s i g n : IArray-id, Int. Int) -> Modif : a r r a y - a s s J g n (~ -> M o d i f : i n i t , , e n t e r - b l o c k , e x i t - b l o c k (Bool, Modif, Modif) -> Modif : c o n d ; ~moe], Modi() -> M o d i f : loop ; (Hodif, Modif) -> M o d i f : c o n c a t : 3ef!nitlons
int-decltid) 2rray-decl(id.
= subst(current-env, add-int(current-env, id)) ; i, j) = concat(subst(current-env, add-array(current-env, subst( a p R . l ( c o n d ( b , m l , m2), o~) = a p p l ( m l . . S) ; b = f a [ s e @ S => a p p l ( c o n d ( b , m l . m2), S) = app..l(m2, S) ; b = t r u e @ S => a p p l ( l o o p ( b , m), S) : a p p ! ( I o o p ( b , m), a p p l ( m . S ) ) ; b = f a l s e @ S => a p p l ( t o o p ( b , m)~ S) = S ; ~ppl(concat(ml. m2), S) = a_Ep__l(m2, a p p t ( m ! , S) ; qe~ctions P r e ( i n t - a s s i g n , id. i) is-in.i(current-env, id) ; P~e.(array-assign, id, i, j) = is-in.a(current-env, id) ; neq(sub(i, teb(id, current-env))) => F _ a a i , , , l u r e ( a r r a y - a s s i g n , i d , i . j ) ; rmo(sub(upb(id,~ current-env), J ) ) => ~ a i i u r e ( a r r a y - a s s i g n , id, i, j) ; end Modif ; *~************SEHANTIC
EQUATIONS*******,********************************
H : P -> term of data
type M o d i f
H { b e g i n DL ; SL e n d } = c o n c a t . S ( i n i t , S{DL}. S{SL}) ; S : D . S . D L , S L -> t e r m of d a t a t y p e M o d i f f.~ ; S{DL;D} = concat(S{DL}. S.D,) S { i n t e g e r ID} = i n t - d e c l ( I n t - i d ' I D ' ) : b(array ID[EI:E21} = array-decl(Array-id'ID'.V1{El},V1{E2}) S f S L ; S } = c : o n c a t ( S { S L } . S { S ) ) ; * * * * s o r r y f o r t h e S{S) S{ID:=E} = int-assign(Int-td'ZD', V{E}) ;
; !****
S,~ID[E1]:=E2} : array-assign(Array-ld'ID'..g,V{E1},, V{E2}) S { b e g i n DL:SL e n d f = c o n c a t . 4 ( e n t e r - b l o c k , ~ D L s , S { S L } . e x i S { b e g i r , SL e n d ) = S ( S L ) : S { i f C t h e n S1 e l s e 5 2 } = c o n d ( B { C } . S{St}, S{$2}) ; S ( w h i t e C do S) = l o o p ( B { C ) , S { S } ) :
i -block)
:
V,V1 : E,I- -> t e r m o f I n t d a t a t y p e V{E+T} = a d d ( V { E } , V{I}) : * * * * * i t is the s a m e thing f o r the o t h e r o p e r a t i o n s * * * * * V{(E)} = VfE} ; r ! ) VflD) = v a l ( d e s ( I n t - i d ' I D ' , c u r r e n t - e n v ) ) : ~;'~ VflD[E]} = val(elt(Array-id'ID' ,, current-env, VfE})) V term o f B o o l d a t a t y p e
BfEI=E2} B{ElmE2}
= eq(V{E1}, = neq(VfE1},
V{E2}) : V{E2}) ;
id)),
160
APPENDIX
2
******************************* t Y P e He:..
:
eq
Re_~ister.'C2")
,~n,i,:,rl,
(He>::a,
(Address) -]:. C o n t e n t (Re.q.i s t e r ) -> Content r.estr-ictic n s pr-,~ ( c r . , r'i*' = ..... e ~ ( r . , t"," Pe
Bool
:
h2))
"
=
e~(C1,
C2)
;
Address)
ca : or, ; Re.qister.",_-.,:,nd")
;
Cond-code
() -> Cond-o:,de : It,et,e9 (Cor, tent~ Content) -> Cond-code (Re~ister) -]:- - : o n d - c o d e : cc (Cc, n d - c o d e ) -> Cond-o:,de : s'.rm a>:: i n m s s'.,.m(_qt) = It ~_~'.,-'m(It) = ~ t ; test(hl, Plus(h2,
He>ca"l"))
test(hi.,,
minus(h2,
test(el,
indexing(a2,
=
Hexa"l"))
C,:, r, d - c ,:, d e
:
test
i~: e q ( h l , h 2 ) t h e n It elscl i ~ e q ( h l , P l u s " ( h 2 ~ H e x m . " l " ) ) then eq else test(hl, h2) = i~t= e ~ ( h l , h2) then ~t e l s e iF e q ( h l , m i n u s ( h . . " : , H e x a " l " ) )
Hexa"l')
test(el, ,_-2) = s - t i n ( t e s t ( c 2 , cl ) r.e s t r . i , - t i ,:, r, ~ Pr.e (,_-,:, r.) = e q ( r - , Re_~ister.'cond") e r, ,~
Pli.J_~(hl,
=
else t e _ ~ t ( h l , h 2 ) i~ em(al, a2) then It else. i~ em(al,index!in~(a.2,Hexa"l')) t h e q , e~ test(a1~ a.2) ;
161
tYP__..~e L a b e l
;
(Label~
Label
)
-0.
Bool
"eq
;
a::< i orri~
eq(Labet'h.'.:l", end I_abet .~ tYPe
Modit=
Label"C2.")
=
e~(Ct~
C2)
;
;
0 P
(Content~ Register.) -> Modif : load ; (C:ontent~ Address) -> Modi~ : store ; (C,:,r, t e n t , Cor, ter, t) -:::. M , : , d i £ : ,-,:,rflPar.e (Labet~ Modit=) -> M,:,di£ : labelled (Modi~=~ Label ) - ' > B,:,,:,I : e x i t T e n t P ' r ; (Label) -> Modit= : br-anch (Label7 C:ond-cc, de) -.'::. M o d i 1 = : c o n d - b r . a n c h ~ ne_~-br-aru=h (M,:,di~ Modi~) - ' > Mc, d i £ : se~comp ; () -> M,:,di~ : n,:,P (M,:,di£, Content) -:::-C,:,ntent : ~ide-effect ; (Modi4, Cond-o:,de) -.> Cond-code : c.side-e~£, nc.side-et:~ a X i 0 rrl
entr'.,'(load(,:,
r.),
entr.'-,'(stor.e(c,
l)
a),
= 1)
£alse =
£atse
er~tr--r(comear-e(cl, c2), t) = ~:alse entP'-,.'(branch(lt), 12) = ~alse erltr-'-,'(c,:,r~d-br.arJch(ll, ,:cored), 12) = false er, t r . . , . ( r , e ~ - b r . a n c h ( l l , cc,:,nd), 12) =~alse entr-Y(labetled(ll, m), 12) = i~ e~(ll~ 12) entr"-,.(se~comp(mt, er~tr..,.(noP.~ 1) =
m2), ~alse
exit(toad(c, r-), e:: eiPP ]. ( s e a i _ - o m P . :9 ( , z o n d - b r a n c h ( 1 , c c o n d ) , m l , 1 a b e t t e d ( t , m2 ) ) , = a P P I ( S e c h Z C , ITIp(rFI~, l a b e l l e d ( l , m2)), E;) ~*rules For ne~-branch are the reverse ones *~.~**~*~**~**~ le, t,, m = l e . b e l l e d ( 1 1 , s e ~ c o m P . 5 ( m l , c o n d - b r a n c h ( 1 2 , c c o n d ) , m 2 = b r a n c h ( 1 1 ) , label led( 12,m3) ) ) For all 1 ~.~ L a b e l , e::-::it(ml,1)=+:alse and_ e>::it(m2,1)=~:alse then the two #ollc.~in_~ rules h,:,ld : cc(ReBister-'cor, d')=co:,nd @ amel,(ml, :_:;) = > a P P ! ( r [ t , S) = a P P ) ( s e ~ c o m P ( m l : tabelled(12, m3))~ S) ; "cc(Register"cond')=ccond @ aPP](ml~ S) =:>
aPPl(m,
S)
= apP],(rrl,
aeel(noP, S) = S ca(h.side-e{Eect(m,a))
=
apPI(se~C:OITIP(ITII:
1TI2)~ '..:~))
h.side-e4~ect(m,
ca(a))
cond-branch(l~c.side-e44(m, cond-branch(l,nc.side-ef4(m,
ccond)) ccond))
neg-branch(t,c.side-e44(m,ccond)) neg-branch(],nc.side-e44(m,
ccond))
*#****~*~***END
OF T H E
TARGET
= = =
DATA
=
S)
;
se~,::omP(m, cond-branch(l~ccond)) segcome(m, ne~-branch(l,ccond))
seqcome(m, neg-branch(l,ccond)) seacome(m, cond-branch(l,ccond))
TYPE#**~*****##*****#***~*~**#
; ; ;
163
:3
APPENDIX
SPECIFICATION
raF T H E
IMPLEMENTATION
#¢~#-~##~OnlY the operations ~ ~ are r.ePr-esented
and . ~
tYPes #
~
oo:ur, ~
ir~
in
~
the
~
~
semantic ~
e~ua.tions~##~
~
t-,,P___%e B o o 1 Cond-code en,._.~4~ B o o t -~
tYPe
Int
PePI~_ I r l t " I " = H e : : < a ' C : O N V E R T ( I ) ~' r'ePr" add(i, J) = PllJS(l~e~r ' i, r-~..J) ; "~£i~i£_r' s u b ( i , J) = minus(rapt i, y ~ J) ; mult(i, J ) = m u l t ( r . e e [ , i, FePr; J ) r e Pr~ d i v ( i , J) = div(ree~; i~ ~ j) rePr e~(i, J) = ~ide-e~:~ect.c( comPare(rapt i~ r:ePr J ) , e ~ ) ; r-e~,r, r , e ~ ( i , J) = side-effect.no( comPar.e(r.ePr :i.~ r - e P r . J ) , e~) Ir, t ;
~##~Int-id,
Ar.ra'...'-id,
tP......~ Y VarAddr.ess r'ePr des(Irft-id"ID", E.eP~r. d e s ( I r , t - i d " I D ' , r.ej.~r, e ] t ( A r . r . a Y - i d " I [ I
tee_E-
]wb(ArraY-id'ID",
reer~e]t(Arr.ay-id'ID",
rePr
]wb(Arra-,'-id"Ib",
F'~PF'
V,.-~I(v)
=
ca(r-apt,.
Id
and
Env
are
not
represented
;
. ###~#-m##~####~.~
,:urrer, t-env) = Address"E;EARL-:H(ID)" er.aseb]c,c~:(Eur.r.erlt-env)) = Address'SEARCHI(ID)" ~" , c u P P e r l t - e r l v , i) = irlde>::irl.~( ca(Address":-:EARr:H(ID)"), m i r i u s ( P e P LP i~ r;.ePr] 1 w b ( A r ra-,--i d " I D ~",
cur-rer.t--env) ) ) ; c.a(in,~e>::ir,_~(Address"'.E;EARF:H(ID).", He>::a "" 1 "" ) ) ; er.a_~eb]ock(CLIr+t~erit-eriv)~ i) = irldexin~( ca (Address"SEARr:Hl ( I D ) "'), minus(E~t~.£, i, r.ePr l w b ( A r r a Y - i d " ' I D " , er-aseb~ock(current-env)})) erasebiock(current-env)) = ca(inde::::a" 1 "" ) ) ; V) ,:urr.ent-env)
=
;
164
type Modi~ rePP int-decl(Int-id'ID") = # ALLOCI(ID) #noP ; aPra","-,-~ecl(Ar-ra',"-id'IB", i, J) = # ALLOC3(ID); A:=SEARCH(ID) # se~,:,:,lTiP.4(stoPe(PeeP i~ inde>.'.in_q(Addr, ess'A", He:;