Enhanced Control and Data Flow Graphs in Montages

Matthias Anlauff GMD FIRST, D-12489 Berlin, Germany [email protected]

Philipp W. Kutter ETH Zurich, CH-8092 Zurich, Switzerland [email protected]

Alfonso Pierantonio Universita di L'Aquila, I-67100 L'Aquila, Italy [email protected]

Abstract

Montages is a semi-visual framework for the specification of the syntax and semantics of imperative programming languages. The primary aim of Montages is to allow for specifications that can be produced and used during the various stages of the language life cycle. This paper describes an extension of Montages and its logical/algebraic characterization. The need for the extension arose during full-scale case studies on Oberon, Java, and a domain-specific language that has been designed in industry. The tool support for Montages (Gem-Mex) has been completely re-implemented in order to support the new features and to generate an enhanced programming environment.

1 Introduction

In this paper, we present an enhancement of Montages [22], a framework for the formal description of the syntax and semantics of imperative programming languages. The aim of Montages is to give formal descriptions of realistic programming languages in such a way that the descriptions can be produced and used during the language life cycle. Syntax, static semantics, and dynamic semantics are given in a uniform and coherent way by means of semi-visual descriptions. The static aspects of a language are described by diagrammatic means resembling control and data flow graphs, and the overall specifications are similar in structure, length, and complexity to those found in common language manuals. The Gem-Mex rapid prototyping tool allows the designer to maintain design decisions and generate a complete programming environment with associated documentation.

The original formulation of Montages [22] was strongly influenced by some case studies in which we specified the Oberon language [20, 21]. This was challenging, since we started with a carefully elaborated and complete specification of an object-oriented language¹ that is used, at least in academia, for the implementation of compilers, operating systems [33], various applications, and teaching [24]. Some peculiarities in the design of Oberon considerably affected the original Montages formulation. In particular, each semantic concept in the language is represented by a meaningful keyword in the concrete syntax. Therefore, we used the tokens representing these keywords in the concrete syntax tree as action nodes, i.e. nodes where the corresponding dynamic semantics undertakes an action. Although this is somewhat appealing, leading to ideas like "self-executing" program texts and nice animation concepts, it is not useful (if not counterproductive) for languages like C++ or Java. In these languages, concrete syntax is reduced to the minimum, and therefore action nodes must be based on the nodes of the abstract syntax tree.

¹ Most people do not consider Oberon to be object oriented due to a different terminology used in the language definition, but in fact it has classes, inheritance, and dynamic binding.

The first formulation of Montages has also been used in other case studies, such as the specification of the Java language [32] and the design and prototyping of a domain-specific language in an industrial context [23]. The experience showed the need for subsequent refinements of Montages and suggested a better tool design and implementation. In particular, the work on Java [32] allowed us to validate our ideas on how the control and data flow graphs can be made more expressive by labeling the control flow arrows with firing conditions. The resulting predicate nets confer on the formalism enhanced pragmatic qualities, such as writeability, extensibility, readability, and, in general, ease of maintenance. Moreover, specifications are considerably shortened: for instance, the specification of the Oberon language presented in [21] has been reduced by more than 30% [4], and the dynamic semantics of all control statements of Java can be given in a completely visual fashion.

This paper presents the new Montages proposal and its logical/algebraic characterization by means of simple first-order logic. Such a formulation of Montages is based on the abstract syntax and allows the specification of the firing conditions directly in the control and data flow. Additionally, we illustrate the tool Gem-Mex, which has been completely re-implemented according to the new formulation of the formalism and which presents substantial enhancements to the initial prototype [2].

The paper is organized as follows. In Sect. 2 we present the new Montages formalism and the control flow with firing conditions. Sect. 3 outlines the main features of the rapid prototyping tool Gem-Mex. Finally, in Sect. 4 we compare our work with other semantic frameworks.

While ::= "while" Expression "do" StmSequence "od"

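Figure 1: The While Montage. The control and data flow graph contains the boxes S-Expression and S-StmSequence and the action oval self. The I arrow enters S-Expression and the T arrow leaves self. A data flow arrow labeled guard links self with S-Expression, and the control flow arrow from self to S-StmSequence carries the firing condition guard.value. Condition: S-Expression.type = boolean.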

2 Montages

A departure point for our considerations has been the formal specification of the C language [15]², which showed how the state-based formalism Abstract State Machines (ASMs) [13, 14, 16], formerly called Evolving Algebras, is well suited for the formal description of the dynamic behavior of full-blown practical languages. In essence, ASMs constitute a formalism in which a state is updated in discrete time steps. Unlike most state-based systems, the state is given by an algebra, that is, a collection of functions and universes. The state transitions are given by rules that update functions pointwise and extend universes with new elements.

² Historically the C case study was preceded and paralleled by work on Modula-2, Prolog, and Occam; see [7] for a commented bibliography on ASM case studies.

The model presented in [15] describes the dynamic semantics of the C language by presuming an explicit representation of control and data flow as a graph (CDG). This represents a major limitation of such a model, since the control and data flow graph is a crucial part of the specification. Therefore, we extended the approach in [15] by introducing a mapping which describes how to obtain the control and data flow graph starting from the abstract syntax tree.

In our formalism a language specification consists of a collection of abstract data types, called Montages. The Montages are hierarchically arranged according to the rules of the corresponding context-free grammar. Each Montage is a "BNF-extension-to-semantics" in the sense that it specifies the context-free grammar rule, the (local) control and data flow graph, the static semantics, and the dynamic semantics of the construct. A Montage describes the properties of its instances, which are the nodes in the abstract syntax tree. Symbols on the right-hand side of the EBNF rule are called (direct) components of the Montage, and symbols which are reachable as components of components are called indirect components. In order to access descendants of a given node in the abstract syntax tree, statically defined attributes are provided. Such attributes are called selectors and they are unambiguously defined by the EBNF rule.

The control and data flow graph is obtained by decorating the abstract syntax tree according to the (local) control and data flow graphs given within each Montage. Such local graphs are described using the Montage Visual Language (MVL), a visual language which has been explicitly devised for this purpose. The vocabulary of MVL consists of elements and edges (or arrows). Elements are labeled ovals and boxes. Boxes represent components of a Montage. Ovals represent the dynamic semantics actions associated with the Montage and are labeled with the name of the action. In the case of one action, the generic label "self" is used. Finally, nested boxes are used to represent indirect components and a generic LIST-box is used for lists. Edges are used to connect elements in order to denote the control and data flow.

Fig. 1 illustrates the Montage specification of a while statement. The topmost part is the production rule defining the context-free syntax. The remaining parts define static and dynamic aspects of the construct, here of the While. In particular, in the control and data flow the unique action of a while-instance is denoted by the self-oval, whereas the components are represented by boxes labeled with selectors, e.g. the S-Expression box for the Expression component. The solid and dotted arrows denote the data and control flow, respectively. Two special control flow arrows, I (initial) and T (terminal), denote where the control flow initially enters and from where control finally exits the construct. In the example, the Expression component is the first element which receives control, whereas the self-action is where control can leave. The definition of entry and exit points by means of I and T arrows is recursive on components, e.g. to enter the Expression component means entering the Expression's entry point, which is defined by the I arrow in the Expression's Montage. The action nodes of the Montage are the atoms in the control flow and thus identical with their entry and exit point.

The entry point and the exit point are used to plug together data and control flow. For instance, given a sequence of statements $s_1; \ldots; s_i; s_{i+1}; \ldots; s_n$, the exit point of the statement $s_i$ will be connected with the entry point of the statement $s_{i+1}$. Dotted control arrows link the exit point of their source with the entry point of their target: the exit point of the Expression component is linked with the self action, this action is linked with the entry point of the StmSequence, and the exit point of the StmSequence is linked with the entry point of the Expression. Control follows these links unconditionally if no firing conditions are given. The labels of control flow arrows are boolean predicates defining these firing conditions: if the source of the link is active and the firing condition evaluates to true, control is passed to the target of the link. In the example the predicate guard.value is the firing condition for passing control from the self-action to the entry point of the sequence of statements StmSequence. In general there is an instantiation process to get from the control predicate of a control flow arrow to the concrete firing condition of a control link.

In the dynamic behavior of a construct, intermediate results are stored in attributes of the nodes. In the language containing the While we are supposed to know that each Expression stores its result in an attribute value. As an example one may look at Fig. 9, where the Montage for a Relation construct can be seen. This Montage has exactly the update of the discussed value attribute as its dynamic semantics rule. Data flow links from one Montage instance to another are used to retrieve such data.

A data flow arrow defines a data link from its source to its target. If the source is an action of some Montage instance, the link is attached to the instance. Thus the guard-labeled data flow arrow in the example links the current instance with the Expression. It can be used by actions of While to retrieve the value attribute at run time. This is, for instance, done in the discussed control predicate guard.value. If data arrows link an action oval of the Montage with a direct component, they are equivalent to one of the selectors, e.g. here the guard arrow is equivalent to the S-Expression selector.

The third part of the While Montage contains the static semantics. In the example, the type of the expression in the while must be boolean. One may make use of full first-order logic to express context-sensitive constraints. In general, Montages have a fourth part which contains an action that is triggered if control reaches the self node. This action can be given as an ASM transition rule.

Montages declarations are intelligible and concise and can make use of useful features such as an elaborate list representation and the possibility to connect indirect components of a Montage as well as non-local objects. The power of data arrows is only visible if used in combination with indirect components and lists. The semantics of MVL declarations may lead to a simple left-to-right, bottom-up traversal of the abstract syntax tree. This has proven to be sufficient for classes of languages such as Oberon [21]. However, there are languages, e.g. Sather, which require a less straightforward traversal; in that case, well-known techniques from attribute grammars [18, 31] and graph grammars [28] are adopted to infer the right traversal, as proposed for Montages in [12, 11].
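The wiring of the While Montage of Fig. 1 can be sketched in a few lines of Python. This is an illustration only, not the Gem-Mex implementation: the names `run`, `while_montage`, and `EXIT`, and the toy program `while x < 3 do x := x + 1 od`, are assumptions made here. For brevity the Expression and StmSequence components are collapsed into single actions, so their entry and exit points coincide.

```python
EXIT = None  # hypothetical marker: control leaves the construct via the T arrow

def run(actions, links, initial, state, max_steps=1000):
    """Pass control along links until the construct is left; return the trace."""
    current, trace = initial, []
    for _ in range(max_steps):
        actions[current](state)              # fire the action's dynamic semantics
        trace.append(current)
        outgoing = [l for l in links if l[0] == current]
        # A labeled link fires if its condition holds in the current state;
        # otherwise the unlabeled (default) link is followed.
        fired = [l for l in outgoing if l[2] is not None and l[2](state)]
        default = [l for l in outgoing if l[2] is None]
        _, target, _ = (fired or default)[0]
        if target is EXIT:
            return trace
        current = target
    raise RuntimeError("step limit reached")

def while_montage():
    """Actions and control links for: while x < 3 do x := x + 1 od."""
    actions = {
        "Expression": lambda s: s.update(value=s["x"] < 3),  # stores its result in value
        "self": lambda s: None,                              # no update of its own
        "StmSequence": lambda s: s.update(x=s["x"] + 1),
    }
    links = [  # (source, target, firing condition or None)
        ("Expression", "self", None),
        ("self", "StmSequence", lambda s: s["value"]),       # guard.value
        ("self", EXIT, None),                                # default: leave via T
        ("StmSequence", "Expression", None),
    ]
    return actions, links

state = {"x": 0}
actions, links = while_montage()
trace = run(actions, links, "Expression", state)
print(state["x"], trace.count("StmSequence"))
```

Control enters at the Expression (the I arrow), and the loop ends the first time guard.value is false, when the default link out of self is taken.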

2.1 Basic Terminology

The syntax of the specified language is given by the collection of all EBNF rules. Without loss of generality, we assume that the rules are given in one of the two following forms:

    A ::= B C D        (1)
    E  =  F | G | H    (2)

The first form defines that A has the components B, C, and D, whereas the second form defines that E is one of the alternatives F, G, or H. Rules of the first form are called characteristic productions and rules of the second form are called synonym productions. We guarantee that each non-terminal symbol appears in exactly one rule as the left-hand side. Non-terminal symbols appearing as the left-hand side of characteristic productions are called characteristic symbols and those appearing as the left-hand side of synonym productions are called synonym symbols.

Each characteristic symbol and certain terminal symbols define a Montage. A Montage is a class³ whose instances are the corresponding nodes in the abstract syntax tree. Such nodes have a descendant for each component of the symbol. A Montage has a set of predefined, immutable attributes, so-called selectors, that can be used to retrieve these descendants. The signature of the selectors is derived from the classes of the components, e.g. for the rule given above, the B, C, and D components of an A instance can be retrieved by the selectors S-B, S-C, and S-D. In Fig. 2 a possible representation of the A-Montage as a class and an abstract syntax tree (AST) with two instances of A and their components are depicted.

³ In this context we consider a class to be a special kind of abstract data type, having attributes and methods (actions) and, most important for us, where the notions of sub-typing and inheritance are predefined in the usual way.

Figure 2: Montage class A, instances in the AST, selectors S-B, S-C, S-D. The class is depicted as

    class A
      attributes
        S-B of type B
        S-C of type C
        S-D of type D
        ...
      methods
        static-semantics
        dynamic-semantics

and the AST for the rule A ::= B C D shows two instances of A: n1 ∈ A with descendants n2 ∈ B, n3 ∈ C, n4 ∈ D, and n5 ∈ A with descendants n6 ∈ B, n7 ∈ C, n8 ∈ D, each reached via S-B, S-C, and S-D.

Synonym rules introduce synonym classes and define subtype relations. The symbols on the right-hand side can be further synonym classes or Montage classes. Each class on the right-hand side is a subtype of the introduced synonym class. Thus each instance of one of the classes on the right-hand side is an instance of the class on the left-hand side, e.g. in the given example, all F, G, and H instances are E-instances as well. The instances of a synonym class consist of the instances of its subtypes. In future work we will allow non-empty synonym class definitions, for instance for the specification of binary expression as a super-type of relation, sum, and product. In principle, state-of-the-art OO methodology for reuse and combination of Montages is applicable, allowing for a powerful development method for new languages, e.g. for domain-specific languages.

In the AST, each node is an instance of arbitrarily many (possibly zero) synonym classes and of exactly one Montage. Variable terminals, e.g. identifiers or numbers, correspond to Montages without components, having an attribute Name containing the micro-syntax. Fixed terminals are not included in the AST by default. The treatment of terminal symbols together with the subtyping introduced by the synonym productions allows for an automatic generation of the AST from the concrete syntax given by EBNF (based on the work in [26]).

Inside a Montage definition, the term self denotes the current instance of the class. Using the selectors and knowledge about the AST, we can build paths denoting nodes at a fixed distance from self. For instance, the path self.S-B.S-H.S-J denotes a node of class J, which can be reached by following the selectors S-B, S-H, and then S-J (Fig. 3). The use of such a path in a Montage definition imposes a number of constraints on the other EBNF rules of the language. The example self.S-B.S-H.S-J requires that there is a B component in the Montage containing the path. Further, every subtype of B must have an H component, and every subtype of H must have a J component. More informally, the path self.S-B.S-H.S-J must exist in all possible ASTs.

Figure 3: Montage A using path self.S-B.S-H.S-J, situation in AST, and constraints on EBNF rules of B, H. In the AST, n1 ∈ A reaches n2 ∈ B via S-B, n9 ∈ H via S-H, and n10 ∈ J via S-J. Imposed constraints:

    B  =   B1 | B2 | ...
    B1 ::= ... H ...
    B2 ::= ... H ...
    ...
    H  =   H1 | H2 | ...
    H1 ::= ... J ...
    H2 ::= ... J ...
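The constraint that a path "must exist in all possible ASTs" can be checked mechanically on the grammar. The following Python sketch is only an illustration under assumed names (`CHARACTERISTIC`, `SYNONYM`, `montages_of`, `path_exists`, and `check` are hypothetical, and the concrete rules for B1, B2, H1, and H2 are invented to match the shape of the constraints in Fig. 3). Characteristic productions map a Montage class to its components, synonym productions map a synonym class to its subtypes, and a selector S-X retrieves the X component.

```python
# Characteristic productions: Montage class -> component classes.
CHARACTERISTIC = {
    "A": ["B", "C", "D"],
    "B1": ["H"], "B2": ["H"],
    "H1": ["J"], "H2": ["J"],
    "C": [], "D": [], "J": [],
}
# Synonym productions: synonym class -> alternatives (its subtypes).
SYNONYM = {"B": ["B1", "B2"], "H": ["H1", "H2"]}

def montages_of(symbol, synonym=SYNONYM):
    """All Montage classes whose instances are instances of symbol."""
    if symbol in synonym:
        return [m for alt in synonym[symbol] for m in montages_of(alt, synonym)]
    return [symbol]

def path_exists(montage, selectors, characteristic=CHARACTERISTIC, synonym=SYNONYM):
    """True if the selector path exists below every possible instance of montage."""
    if not selectors:
        return True
    component = selectors[0][len("S-"):]      # selector S-B retrieves component B
    if component not in characteristic[montage]:
        return False
    # Every subtype of the component must support the rest of the path.
    return all(path_exists(m, selectors[1:], characteristic, synonym)
               for m in montages_of(component, synonym))

def check(path, montage):
    """Check a path such as 'self.S-B.S-H.S-J' used inside the given Montage."""
    return path_exists(montage, path.split(".")[1:])

print(check("self.S-B.S-H.S-J", "A"))
```

If one alternative of H lacked a J component, the check would fail, which is exactly the situation the imposed constraints rule out.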

2.2 Basic Visualization

A Montage definition is partly textual, partly graphical. It consists of four horizontally layered boxes. The first box contains the EBNF rule. The second is used for giving the control and data flow graph. Instead of giving a textual definition using the path terms introduced above, we use the Montages Visual Language (MVL). The purpose of MVL is to specify the components containing entry and exit points of control flow, and to draw arrows between components and the Montage itself and its actions.

As a first basic component of MVL we introduce the visualization of paths. A path of the form self.S-B, denoting a direct component, is visualized by a box labeled with the selector "S-B". A more complex path Path.S-H, where Path is an arbitrarily complex path, is visualized by an "S-H" labeled box which is nested inside the visualization of Path. In Fig. 4, the example path self.S-B.S-H.S-J is visualized. Please note that the innermost box visualizes self.S-B.S-H.S-J, the second innermost box visualizes self.S-B.S-H, and the outermost box visualizes self.S-B.

Figure 4: Visualization of self.S-B.S-H.S-J (a box labeled S-J nested inside a box labeled S-H, which is nested inside a box labeled S-B).

The action ovals visualize themselves as far as control flow is concerned, and the corresponding self node in the data flow. If there is only one action node, we can identify this node with the self node; therefore we have chosen the label self for a unique action.

For the definition of the initial and terminal action in the control flow, we need to give the components I-PATH and T-PATH containing these actions, respectively. Graphically this is done by marking the visualization of I-PATH with an incoming arrow labeled I, and marking the visualization of T-PATH with an outgoing arrow labeled T. An example can be seen in the While Montage (Fig. 1), where I-PATH is self.S-Expression and T-PATH is self. An abstract description and the formulas are given here. Schematically, in a Montage A ::= ..., an arrow labeled I enters the box I-PATH and an arrow labeled T leaves the box T-PATH:

$$\forall\, \mathit{self} \in A:\quad \mathit{self}.\mathit{Initial} = \text{I-PATH}.\mathit{Initial} \;\wedge\; \mathit{self}.\mathit{Terminal} = \text{T-PATH}.\mathit{Terminal} \tag{3}$$

In the next sections we give the formal semantics of control and data arrows, their instantiation as links in the CDG, and their visual specification with MVL.

2.3 Control and Data Flow

After the static analysis phase, the dynamic semantics is executed. In the case of sequential languages, exactly one action of the CDG is activated, and control is then passed to the next node along the control links. Results of calculations performed by the actions are stored in additional attributes of the Montage. If one Montage needs to access the data of another one, it uses a data link. For instance, the self-action of the While needs the guard link to access the expression's value (which is set during execution of the expression's dynamic semantics). In that example the link is a simple attribute, but in general n-ary attributes are allowed.

Assume we are defining Montage A. A data flow arrow consists of a source path Src-PATH, a target path Trg-PATH, and a label t being an n-ary term. In the AST the arrow is instantiated for each instance of A with links from the node reached by the source path to the node reached by the target path. Schematically, in a Montage A ::= ..., a data arrow labeled t leads from Src-PATH to Trg-PATH:

$$\forall\, \mathit{self} \in A:\quad \text{Src-PATH}.t = \text{Trg-PATH} \tag{4}$$

With control arrows the situation is slightly more difficult, since the labels are predicates denoting firing conditions. We can still give predicate logic definitions, but these have to be reevaluated in each state of the dynamic semantics execution. Given a fixed state, we define a binary relation SwitchControl that is true for each pair of actions a, b if they are linked by a control arrow with a firing condition evaluating to true. For instance, in the While Montage, a control arrow with firing condition guard.value connects the self with the statement sequence. Thus control will switch from the self-action to the initial action of the statement sequence only if the value of the guard evaluates to true. Otherwise the non-labeled arrow leaving the construct is followed. In general, non-labeled arrows denote the default case. For ease of exposition we do not give the formalization of default cases. Informally, the firing condition of the default case is the negation of the disjunction of the firing conditions of all other cases, i.e. the default case has the same semantics as an else-clause in if-statements of imperative languages.

A control arrow links the terminal of its source with the initial of its target. Such an arrow consists of a source path Src-PATH, a target path Trg-PATH, and a predicate p as label. This arrow defines the binary relation SwitchControl, representing the activated control links among action nodes. Schematically, in a Montage A ::= ..., a control arrow labeled p leads from Src-PATH to Trg-PATH:

$$\forall\, \mathit{self} \in A:\quad p \;\Rightarrow\; \mathit{SwitchControl}(\text{Src-PATH}.\mathit{Terminal},\ \text{Trg-PATH}.\mathit{Initial}) \tag{5}$$

Please note that the name self of the bound variable is crucial, since the paths contain the free variable self. In this paper we do not further specify how the relation SwitchControl is used by the ASM giving the dynamic semantics. Of course it can be used both in sequential and concurrent situations. In the current implementation, one sequential process is supported. This process flows through the graph along edges whose firing condition evaluates to true, i.e. whose source and target are in the SwitchControl relation. This technique can be applied to all existing ASM case studies for dynamic semantics, in order to visualize part of their rules.

2.4 Additional machinery

Both in the case of data flow and control flow arrows, for the instantiation of the labels it is necessary to refer to the source and the target of the arrows. In the label of data flow arrows these references are used as arguments of n-ary attributes, and in the label of control flow arrows they are used in the firing condition. In the While example this was not needed. We introduce the terms src and trg that can be used in the label of an arrow to denote the source and target path of that arrow. This semantics is formalized by enclosing the formulas (4) and (5) in the following let-clause:

$$\mathbf{let}\ \mathit{src} = \text{Src-PATH}\ \mathbf{in}\ \ \mathbf{let}\ \mathit{trg} = \text{Trg-PATH}\ \mathbf{in} \tag{6}$$

The formulas (3), (4), (5), and (6) give the complete semantics of MVL. Some additional complexity is introduced in the next subsection about list processing. In particular, there will be sets of instances corresponding to list components. The terms src and trg introduced here are widely used in the specification of labels for arrows among such list components.

2.5 List Processing

Experience with other specification frameworks [19] made us believe that a considerable part of specifications deals with list processing. Therefore we tried to integrate powerful visual constructs to specify arrows from and to lists. In the mapping from concrete to abstract syntax the following patterns are recognized and treated as simple lists of Ks:

    {K}
    K {K}
    K {"t" K}
    [K {"t" K}]

where "t" is some terminal symbol. Such a list of Ks may be a component in a characteristic rule. The selector for the node representing the K-list is S-K and the single elements of the list are referred to as S-K[0], S-K[1], ..., S-K[n], where n is the list length minus one. The list length is accessible as S-K.ListLength. Moreover, all elements of the list have an attribute Position denoting their physical position within the list. The initial and terminal action of a list are defined to be the initial of the first element and the terminal of the last element in the list, respectively. If the list is empty, they point to the list node itself, which is a skip action.

Given a path Path leading to a Montage with a K-list component, we introduce the special path Path.S-K[ ]. This path can be used to speak about all elements in the list. In the formulas (3), (4), (5), and (6) we need to replace such a path with Path.S-K[i], where i is a fresh variable that is universally quantified over the range 0, ..., (Path.S-K.ListLength - 1). Visually, Path.S-K[ ] is represented by an "S-K" labeled box, inside a box with label LIST, which in turn is inside the box visualizing Path (Fig. 5). The box labeled LIST visualizes the list node itself and can be used to direct control flow to the initial action in the list.

Figure 5: Visualization of a list and its elements (a box labeled S-K inside a box labeled LIST, inside the box visualizing Path).

In the simplest case, sequential control flow is assumed to link the elements of the list in their order. For instance, in the following examples for the Java switch statement and the Pascal case statement, we use the sequential linking in the list for the fall-through semantics of the switch. We simplify the statements by leaving out the default clauses.

Figure 6: The Case Montage.
    Case ::= "case" "(" Expression ")" {LabeledStm} "end"
    Elements: I, S-Expression, self, LIST containing S-LabeledStm, T. Arrow labels: guard; cond.constValue = trg.S-Expression.value.

Figure 7: The Switch Montage.
    Switch ::= "switch" "(" Expression ")" {LabeledStm} "end"
    Elements: I, S-Expression, self, LIST containing S-LabeledStm, T. Arrow labels: guard; cond.value = trg.S-Expression.value.

Figure 8: The LabeledStm Montage.
    LabeledStm ::= "case" Expression ":" StmSequence ";"
    Elements: I, S-StmSequence, S-Expression, T. Condition: S-Expression.constValue ≠ undef.

Figure 9: The graphical editor of the Gem-Mex tool.

In Fig. 6 the case statement of Pascal is given. An expression is evaluated, and then in the control arrow of the self-action this value is compared with the constant value of the Expression component of each LabeledStm. If the two values are equal, control flows to that statement. After that, control leaves the construct directly, since the only outgoing control arrow of the LabeledStm is the T arrow. In the switch statement (Fig. 7) the LabeledStm to be executed is chosen in the same way. But after the execution of the LabeledStm, there is no outgoing T arrow, and control "falls through" along the list of LabeledStm. At the end of the list, the T arrow leaves the statement. Please note as well that in the Montage LabeledStm (Fig. 8) control is not passed to the expression, since the expression is expected to set its constValue attribute during analysis.
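The difference between the Case and Switch Montages can be made concrete with a small Python sketch. It is an illustration only: the function names and the encoding of a LabeledStm as a (constant, action) pair are assumptions made here, not part of MVL or Gem-Mex. The control arrow from self into the list is instantiated once per list element, mirroring the replacement of S-LabeledStm[ ] by S-LabeledStm[i] for every i, and the two constructs differ only in where control goes after the selected element.

```python
def switch_control(guard_value, labeled_stms):
    """Indices i for which the link self -> S-LabeledStm[i] is activated,
    i.e. whose firing condition (guard value equals the constant) holds."""
    return [i for i, (const, _) in enumerate(labeled_stms) if const == guard_value]

def run_case(guard_value, labeled_stms):
    """Pascal-like case: after the selected statement, control leaves via T."""
    activated = switch_control(guard_value, labeled_stms)
    if not activated:
        return []
    _, action = labeled_stms[activated[0]]
    return [action]

def run_switch(guard_value, labeled_stms):
    """Java-like switch: after entry, control falls through along the list
    and leaves via T only at the end of the list."""
    activated = switch_control(guard_value, labeled_stms)
    if not activated:
        return []
    return [action for _, action in labeled_stms[activated[0]:]]

stms = [(1, "a"), (2, "b"), (3, "c")]
print(run_case(2, stms))    # ['b']
print(run_switch(2, stms))  # ['b', 'c']
```

As in the simplified Montages, no default clause is modeled, so a guard value that matches no constant executes nothing.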

3 Gem-Mex: The Development Environment for Montages

The development environment for Montages is given by the Gem-Mex tool [2, 3]. It is a complex system which assists the designer in a number of activities related to the language life cycle, from early design to routine programmer usage. It consists of a number of interconnected components:

- the Graphical Editor for Montages (Gem) is a sophisticated graphical editor in which Montages can be entered; furthermore, documentation can be generated automatically; Fig. 9 shows the editor opened for a Relation Montage;
- the Montages executable generator (Mex) automatically generates correct and efficient implementations of the language; the main console is shown in Fig. 10;
- the generic animation and debugger tool visualizes the static and dynamic behavior of the specified language at a symbolic level; source programs written in the specified language can be animated and inspected in a visual environment; a snapshot of the debugging process of an example program using the earlier presented Switch statement is shown in Fig. 11.

Figure 10: The console of the Gem-Mex tool.

A broad range of professionals may find it interesting and convenient to use Gem-Mex. The whole development of a programming language can be supported, with an effective impact on the productivity and robustness of the design. The designer can enter the specification, browse it, and especially maintain it. Specifications may evolve over time even in a non-monotonic way, since modifications can be localized within very neat boundaries. By doing so, different experiments can take place with different versions of the syntax and semantics of the specified language in a very short time.

Besides the pure editing functionality, Gem can be used to generate documents suitable for presenting the specification. Experience suggests that a lack of documentation is a dangerous bottleneck for the consistency and coherence of a project. Both paper and online presentations of the language specification are automatically generated by Gem:

- LaTeX documents illustrate the Montages and the grammar; such documents are easily customizable for the non-specialist user; all Montages in this paper are generated by Gem-Mex;
- HTML versions of the language specification allow the user to browse the specification and retrieve pieces of it.

Moreover, intelligibility is enhanced by means of "literate specification" techniques directly supported by Gem. Formal parts of the specification can be substituted with textual elements by means of a "literate programming" tool integrated in the system. "Literate specification" means that the Montages text fields may contain references to other parts of the formalization specified outside of the Montages modules. Thus, the readability and comprehensibility of a Montages specification are very similar to those of language manuals.



4 Related Work Denotational semantics has been regarded as the most promising approach for the semantic description of programming languages. But its problems with the pragmatics have been discovered already in case studies of the scale of Pascal and C [29]. Imperative features are not easily covered because of the global visibility of the de nition of semantic domains throughout a denotation description. Moreover domain de nitions often need to be changed when extending the language with unforeseen constructs, for instance a change from the direct style to the continuation style when adding gotos [25]. To cite Abramsky \ : : : once languages with features beyond the purely functional are considered, the appropriateness of modeling programs by functions is increasingly open to question. Neither concurrency nor

Figure 11: The debugger generated by the Gem-Mex tool

8

`advanced' imperative features have been captured denotationally in a fully convincing fashion." [1] An approach with the same ambitious goal are Kahn's Natural Semantics [17] which are directly based on Natural Deduction. For somebody knowing mathematical logic, Natural Semantics are pretty intuitive and we used it for the dynamic semantics of Oberon [19]. Although we succeeded due to the excellent tool support by Centaur [10], the result was much longer and more complex then the Montages counterpart given in [21], since one has to carry around all the state information in the case of Natural Semantics. Action semantics [25] is an initial{algebra semantics based on Mosses' uni ed algebras. It retained some denotational semantics features, i.e. context{free grammars for de ning abstract{syntax trees, and the use of Horn clauses to give inductive de nition of compositional semantic functions. The main semantic entities are actions, which are speci ed by means of the action notation. To mention Mosses \ : : : the current structural operational semantics of action notation is not easy to modify; alternative forms of operational semantics, such as evolving{algebra semantics, might be preferable in that respect." [25] Another universal meta{language for specifying languages is ASF+SDF [30]. It is an initial{algebra approach and speci es the semantics by means of conditional equations. As all the initial{algebra based formalisms they are forced to remain under the expressiveness of the logic of Horn clauses, i.e. conditional equations, otherwise the existence of the initial model is not guaranteed and the syntax cannot be mapped in an unambiguous way to the semantics because of non{ existence of the universal homomorphism. Using ASMs for dynamic semantics, the work in [27] de nes a framework comparable to ours. 
For the static part, it proposes occurrence algebras, which integrate term algebras and context-free grammars by providing terms for all nodes of all possible derivation trees. This allows all static aspects of the language to be defined in a functional algebraic system, which is supported by the MAX tool.

None of the discussed approaches uses visual descriptions of control/data flow, and none of them supports structuring all specification aspects in a vertical way, e.g. in self-contained modules for each language construct. As far as we know, this way of structuring is novel with respect to existing frameworks [30, 10, 25]. In combination with refinements of the involved semantic functions and renaming of the vocabulary, it allows large parts of a language specification to be reused directly in other specifications. Programming language specifications can be presented as a series of sub-languages, each reusing its predecessor and extending it with new features. This specification structure has been used in ASM case studies [8, 15] and was adapted to the Montages case study of Oberon [21]. Work with the current implementation of Montages shows that such sub-languages are useful, working languages that can be executed, tested, and explained to the user in order to facilitate understanding of the whole language. The design and prototyping of a language is much more productive if such stepwise development and testing is possible. Furthermore, in [5, 6, 9] a series of sub-languages is used to structure correctness proofs of translation schemes.

The logical definition of static and dynamic aspects of Montages descriptions presented here may be a first step towards integrating visual flow specifications and vertical structuring with the approaches discussed above.
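The idea of a series of executable sub-languages, each reusing its predecessor, can be illustrated with a small sketch. It is not Montages notation and has nothing to do with the Gem-Mex implementation; it is only an assumed toy interpreter in Python in which every construct is a self-contained rule and every sub-language is a table of such rules. All names (EXPR_LANG, STMT_LANG, make_interpreter, and the rule functions) are hypothetical.

```python
# Toy illustration (hypothetical names, not Montages or Gem-Mex):
# one self-contained rule per language construct, and sub-languages
# built by extending the rule table of their predecessor.

def eval_num(node, env, run):
    return node[1]

def eval_var(node, env, run):
    return env[node[1]]

def eval_add(node, env, run):
    return run(node[1], env) + run(node[2], env)

def eval_less(node, env, run):
    return run(node[1], env) < run(node[2], env)

def exec_assign(node, env, run):
    env[node[1]] = run(node[2], env)

def exec_seq(node, env, run):
    for stmt in node[1:]:
        run(stmt, env)

def exec_while(node, env, run):
    while run(node[1], env):
        run(node[2], env)

def make_interpreter(rules):
    """Build an interpreter that dispatches on the node tag."""
    def run(node, env):
        return rules[node[0]](node, env, run)
    return run

# Sub-language 1: expressions only.
EXPR_LANG = {"num": eval_num, "var": eval_var,
             "add": eval_add, "less": eval_less}

# Sub-language 2: reuses sub-language 1 unchanged and adds statements.
STMT_LANG = {**EXPR_LANG, "assign": exec_assign,
             "seq": exec_seq, "while": exec_while}

run_expr = make_interpreter(EXPR_LANG)
run_stmt = make_interpreter(STMT_LANG)

# Each sub-language is a working language that can be tested on its own.
print(run_expr(("add", ("num", 2), ("var", "x")), {"x": 40}))

program = ("seq",
           ("assign", "i", ("num", 0)),
           ("while", ("less", ("var", "i"), ("num", 3)),
            ("assign", "i", ("add", ("var", "i"), ("num", 1)))))
env = {}
run_stmt(program, env)
print(env)
```

In this sketch the expression rules are written once and shared by both tables, so the smaller language stays executable and testable while the larger one is developed.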

Acknowledgments

We would like to thank S. Chakraborty, C. Denzler, B. Di Franco, W. Shen, L. Thiele, and C. Wallace for collaboration in the Montages project. Furthermore we thank G. Goos and W. Zimmermann for the helpful discussions on the topic.

References

[1] S. Abramsky. Semantics of Interaction. In Trees in Algebra and Programming – CAAP'96, 21st Int. Coll., volume 1059 of LNCS, page 1. Springer Verlag, 1996.
[2] M. Anlauff, P. Kutter, and A. Pierantonio. Formal Aspects of and Development Environments for Montages. In M. Sellink, editor, 2nd International Workshop on the Theory and Practice of Algebraic Specifications, Workshops in Computing, Amsterdam, 1997. Springer.
[3] M. Anlauff, P.W. Kutter, and A. Pierantonio. The Gem-Mex tool homepage. http://www.first.gmd.de/~ma/gem/, 1997.
[4] M. Anlauff, P.W. Kutter, and A. Pierantonio. The formal specification of Oberon revised. Technical Report 34/Sept, Dip. Matematica Pura ed Applicata, Università di L'Aquila, 1998.
[5] E. Börger and D. D. Rosenzweig. The WAM – Definition and Compiler Correctness. In C. Beierle and L. Plümer, editors, Logic Programming: Formal Methods and Practical Applications, Studies in Computer Science and Artificial Intelligence, chapter 2, pages 20–90. North-Holland, 1994.
[6] E. Börger and I. Durdanovic. Correctness of compiling Occam to Transputer code. Computer Journal, 39(1):52–92, 1996.


[7] E. Börger and J. Huggins. Abstract state machines 1988 – 1998: Commented ASM bibliography. In H. Ehrig, editor, EATCS Bulletin, Formal Specification Column, number 64, pages 105–127. EATCS, February 1998.
[8] E. Börger and D. Rosenzweig. A Mathematical Definition of Full Prolog. In Science of Computer Programming, volume 24, pages 249–286. North-Holland, 1994.
[9] E. Börger and W. Schulte. Defining the Java Virtual Machine as platform for provably correct Java compilation. In J. Gruska and J. Zlatuška, editors, Proc. MFCS'98, LNCS, 1998.
[10] P. Borra, D. Clement, T. Despeyroux, J. Incerpi, G. Kahn, B. Lang, and V. Pascual. CENTAUR: The System. Technical Report 777, INRIA, Sophia Antipolis, 1987.
[11] S. Glesner, A. Heberle, and W. Löwe. Static semantics with Montages. 1998.
[12] W. Goerigk, W. Zimmermann, T. Gaul, A. Heberle, and U. Hoffmann. Correct compilation of a while-language with parameterless recursive procedures. Unpublished working paper of the Verifix project, July 1998.
[13] Y. Gurevich. Logic and the Challenge of Computer Science. Theory and Practice of Software Engineering, pages 1–57, 1988.
[14] Y. Gurevich. Evolving Algebras 1993: Lipari Guide. In E. Börger, editor, Specification and Validation Methods. Oxford University Press, 1995.
[15] Y. Gurevich and J.K. Huggins. The Semantics of the C Programming Language, volume 702 of LNCS, pages 274–308. Springer, 1993.
[16] J. Huggins. Abstract State Machines Web Page.
[17] G. Kahn. Natural Semantics. In Proceedings of the Symp. on Theoretical Aspects of Computer Science, Passau, Germany, 1987.
[18] D.E. Knuth. Semantics of Context-Free Languages. Math. Systems Theory, 2(2):127–146, 1968.
[19] P.W. Kutter. Executable Specification of Oberon Using Natural Semantics. Term work, ETH Zurich, implementation on the Centaur System [10], 1996.
[20] P.W. Kutter and F. Haussmann. Dynamic semantics of the programming language Oberon. Term work, ETH Zurich, July 1995. A revised version appeared as technical report of Institut TIK, ETH, number 27, 1997.

[21] P.W. Kutter and A. Pierantonio. The formal specification of Oberon. J.UCS, Springer, 3(5):443–503, 1997.
[22] P.W. Kutter and A. Pierantonio. Montages: Specification of Realistic Programming Languages. J.UCS, Springer, 3(5):416–442, 1997.
[23] P.W. Kutter, D. Schweizer, and L. Thiele. Integrating formal domain specific language design in the software life cycle. To appear in proceedings of Current Trends in Applied Formal Methods, October 1998, Boppard, Germany.
[24] M. Reiser and N. Wirth. Programming in Oberon - Steps Beyond Pascal and Modula. Addison-Wesley, 1992.
[25] P. Mosses. Theory and Practice of Action Semantics. In MFCS'96, 21st International Symposium, volume 1113 of LNCS, pages 37–61. Springer Verlag, 1996.
[26] M. Odersky. A New Approach to Formal Language Definition and its Application to Oberon. PhD thesis, ETH Zurich, 1989.
[27] A. Poetzsch-Heffter. Prototyping realistic programming languages based on formal specifications. Acta Informatica, 34:737–772, 1997.
[28] G. Rozenberg, G. Engels, and H.J. Kreowski. Graph Grammar Handbook. ?, 1998.
[29] D.A. Schmidt. Denotational Semantics: A Methodology for Language Development. Allyn & Bacon, 1986.
[30] A. van Deursen, J. Heering, and P. Klint, editors. Language Prototyping – An Algebraic Approach, volume 5 of AMAST Series in Computing. World Scientific, 1996.
[31] W.M. Waite and G. Goos. Compiler Construction. Springer, 1984.
[32] C. Wallace. The semantics of the Java programming language: Preliminary version. Technical Report CSE-TR-355-97, University of Michigan EECS Department Technical Report, 1997.
[33] N. Wirth and J. Gutknecht. Project Oberon, The Design of an Operating System and Compiler. Addison-Wesley, 1992.
